CALCULUS - EARLY

TRANSCENDENTALS

LibreTexts
This text is disseminated via the Open Education Resource (OER) LibreTexts Project (https://LibreTexts.org) and like the hundreds
of other texts available within this powerful platform, it is freely available for reading, printing and "consuming." Most, but not all,
pages in the library have licenses that may allow individuals to make changes, save, and print this book. Carefully
consult the applicable license(s) before pursuing such effects.
Instructors can adopt existing LibreTexts texts or Remix them to quickly build course-specific resources to meet the needs of their
students. Unlike traditional textbooks, LibreTexts’ web based origins allow powerful integration of advanced features and new
technologies to support learning.

The LibreTexts mission is to unite students, faculty and scholars in a cooperative effort to develop an easy-to-use online platform
for the construction, customization, and dissemination of OER content to reduce the burdens of unreasonable textbook costs to our
students and society. The LibreTexts project is a multi-institutional collaborative venture to develop the next generation of open-
access texts to improve postsecondary education at all levels of higher learning by developing an Open Access Resource
environment. The project currently consists of 14 independently operating and interconnected libraries that are constantly being
optimized by students, faculty, and outside experts to supplant conventional paper-based books. These free textbook alternatives are
organized within a central environment that is both vertically (from advanced to basic level) and horizontally (across different fields)
integrated.
The LibreTexts libraries are Powered by NICE CXOne and are supported by the Department of Education Open Textbook Pilot
Project, the UC Davis Office of the Provost, the UC Davis Library, the California State University Affordable Learning Solutions
Program, and Merlot. This material is based upon work supported by the National Science Foundation under Grant No. 1246120,
1525057, and 1413739.
Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not
necessarily reflect the views of the National Science Foundation nor the US Department of Education.
Have questions or comments? For information about adoptions or adaptions contact [email protected]. More information on our
activities can be found via Facebook (https://facebook.com/Libretexts), Twitter (https://twitter.com/libretexts), or our blog
(http://Blog.Libretexts.org).
This text was compiled on 12/01/2023
TABLE OF CONTENTS
Licensing

1: Functions and Models


1.1: Four Ways to Represent a Function
1.2: Mathematical Models- A Catalog of Essential Functions
1.3: New Functions from Old Functions
1.4: Exponential Functions
1.5: Inverse Functions and Logarithms

2: Limits and Derivatives


2.1: The Tangent and Velocity Problems
2.2: The Limit of a Function
2.3: Calculating Limits Using the Limit Laws
2.4: The Precise Definition of a Limit
2.5: Continuity
2.6: Limits at Infinity; Horizontal Asymptotes
2.7: Derivatives and Rates of Change
2.8: The Derivative as a Function

3: Differentiation Rules
3.1: Derivatives of Polynomials and Exponential Functions
3.2: The Product and Quotient Rules
3.3: Derivatives of Trigonometric Functions
3.4: The Chain Rule
3.5: Implicit Differentiation
3.6: Derivatives of Logarithmic Functions
3.7: Rates of Change in the Natural and Social Sciences
3.8: Exponential Growth and Decay
3.9: Related Rates
3.10: Linear Approximations and Differentials
3.11: Hyperbolic Functions

4: Applications of Differentiation
4.1: Maximum and Minimum Values
4.2: The Mean Value Theorem
4.3: How Derivatives Affect the Shape of a Graph
4.4: Indeterminate Forms and l'Hospital's Rule
4.5: Summary of Curve Sketching
4.6: Graphing with Calculus and Calculators
4.7: Optimization Problems
4.8: Newton's Method
4.9: Antiderivatives

1 https://math.libretexts.org/@go/page/24026
5: Integrals
5.1: Areas and Distances
5.2: The Definite Integral
5.3: The Fundamental Theorem of Calculus
5.4: Indefinite Integrals and the Net Change Theorem
5.5: The Substitution Rule

6: Applications of Integration
6.1: Areas Between Curves
6.2: Volumes
6.3: Volumes by Cylindrical Shells
6.4: Work
6.5: Average Value of a Function

7: Techniques of Integration
7.1: Integration by Parts
7.2: Trigonometric Integrals
7.3: Trigonometric Substitution
7.4: Integration of Rational Functions by Partial Fractions
7.5: Strategy for Integration
7.6: Integration Using Tables and Computer Algebra Systems
7.7: Approximate Integration
7.8: Improper Integrals

8: Further Applications of Integration


8.1: Arc Length
8.2: Area of a Surface of Revolution
8.3: Applications to Physics and Engineering
8.4: Applications to Economics and Biology
8.5: Probability

9: Differential Equations
9.1: Modeling with Differential Equations
9.2: Direction Fields and Euler's Method
9.3: Separable Equations
9.4: Models for Population Growth
9.5: Linear Equations
9.6: Predator-Prey Systems

10: Parametric Equations And Polar Coordinates


10.1: Curves Defined by Parametric Equations
10.2: Calculus with Parametric Curves
10.3: Polar Coordinates
10.4: Areas and Lengths in Polar Coordinates
10.5: Conic Sections
10.6: Conic Sections in Polar Coordinates
Index

11: Infinite Sequences And Series
11.1: Sequences
11.2: Series
11.3: The Integral Test and Estimates of Sums
11.4: The Comparison Tests
11.5: Alternating Series
11.6: Absolute Convergence and the Ratio and Root Test
11.7: Strategy for Testing Series
11.8: Power Series
11.9: Representations of Functions as Power Series
11.10: Taylor and Maclaurin Series
11.11: Applications of Taylor Polynomials

12: Vectors and The Geometry of Space


12.1: Three-Dimensional Coordinate Systems
12.2: Vectors
12.3: The Dot Product
12.4: The Cross Product
12.5: Equations of Lines and Planes
12.6: Cylinders and Quadric Surfaces

13: Vector Functions


13.1: Vector Functions and Space Curves
13.2: Derivatives and Integrals of Vector Functions
13.3: Arc Length and Curvature
13.4: Motion in Space- Velocity and Acceleration

14: Partial Derivatives


14.1: Functions of Several Variables
14.2: Limits and Continuity
14.3: Partial Derivatives
14.4: Tangent Planes and Linear Approximations
14.5: The Chain Rule
14.6: Directional Derivatives and the Gradient Vector
14.7: Maximum and Minimum Values
14.8: Lagrange Multipliers

15: Multiple Integrals


15.1: Double Integrals over Rectangles
15.2: Double Integrals over General Regions
15.3: Double Integrals in Polar Coordinates
15.4: Applications of Double Integrals
15.5: Surface Area
15.6: Triple Integrals
15.7: Triple Integrals in Cylindrical Coordinates
15.8: Triple Integrals in Spherical Coordinates
15.9: Change of Variables in Multiple Integrals

16: Vector Calculus
16.1: Vector Fields
16.2: Line Integrals
16.3: The Fundamental Theorem for Line Integrals
16.4: Green's Theorem
16.5: Curl and Divergence
16.6: Parametric Surfaces and Their Areas
16.7: Surface Integrals
16.8: Stokes' Theorem
16.9: The Divergence Theorem

17: Second-Order Differential Equations


17.1: Second-Order Linear Equations
17.2: Nonhomogeneous Linear Equations
17.3: Applications of Second-Order Differential Equations
17.4: Series Solutions of Differential Equations

Index

Glossary
Detailed Licensing

Licensing
A detailed breakdown of this resource's licensing can be found in Back Matter/Detailed Licensing.

1 https://math.libretexts.org/@go/page/115431
CHAPTER OVERVIEW

1: Functions and Models


A general Calculus Textmap organized around the textbook

Calculus: Early Transcendentals


by James Stewart


This Textmap is currently under construction... please be patient with us.

Topic hierarchy
1.1: Four Ways to Represent a Function
1.2: Mathematical Models- A Catalog of Essential Functions
1.3: New Functions from Old Functions
1.4: Exponential Functions
1.5: Inverse Functions and Logarithms

1: Functions and Models is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.

1.1: Four Ways to Represent a Function
Learning Objectives
Determine whether a relation represents a function.
Find the value of a function.
Determine whether a function is one-to-one.
Use the vertical line test to identify functions.
Graph the functions listed in the library of functions.

A jetliner changes altitude as its distance from the starting point of a flight increases. The weight of a growing child increases with
time. In each case, one quantity depends on another. There is a relationship between the two quantities that we can describe,
analyze, and use to make predictions. In this section, we will analyze such relationships.

Determining Whether a Relation Represents a Function


A relation is a set of ordered pairs. The set of the first components of each ordered pair is called the domain and the set of the
second components of each ordered pair is called the range. Consider the following set of ordered pairs. The first numbers in each
pair are the first five natural numbers. The second number in each pair is twice that of the first.

{(1, 2), (2, 4), (3, 6), (4, 8), (5, 10)} (1.1.1)

The domain is {1, 2, 3, 4, 5}. The range is {2, 4, 6, 8, 10}.


Note that each value in the domain is also known as an input value, or independent variable, and is often labeled with the
lowercase letter x. Each value in the range is also known as an output value, or dependent variable, and is often labeled with the
lowercase letter y.
A function f is a relation that assigns a single value in the range to each value in the domain. In other words, no x-values are
repeated. For our example that relates the first five natural numbers to numbers double their values, this relation is a function
because each element in the domain, {1, 2, 3, 4, 5}, is paired with exactly one element in the range, {2, 4, 6, 8, 10}.
Now let’s consider the set of ordered pairs that relates the terms “even” and “odd” to the first five natural numbers. It would appear
as
{(odd, 1), (even, 2), (odd, 3), (even, 4), (odd, 5)} (1.1.2)

Notice that each element in the domain, {even, odd}, is not paired with exactly one element in the range, {1, 2, 3, 4, 5}. For
example, the term “odd” corresponds to three values from the range, {1, 3, 5}, and the term “even” corresponds to two values from
the range, {2, 4}. This violates the definition of a function, so this relation is not a function.
Figure 1.1.1 compares relations that are functions and not functions.

Figure 1.1.1 : (a) This relationship is a function because each input is associated with a single output. Note that input q and r both
give output n . (b) This relationship is also a function. In this case, each input is associated with a single output. (c) This
relationship is not a function because input q is associated with two different outputs.

1.1.1 https://math.libretexts.org/@go/page/4434
Function
A function is a relation in which each possible input value leads to exactly one output value. We say “the output is a function
of the input.”
The input values make up the domain, and the output values make up the range.

How To: Given a relationship between two quantities, determine whether the relationship is a function
1. Identify the input values.
2. Identify the output values.
3. If each input value leads to only one output value, classify the relationship as a function. If any input value leads to two or
more outputs, do not classify the relationship as a function.
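The three steps above can be sketched in Python. This is a minimal illustration rather than part of the original text; the helper name `is_function` and the choice to store a relation as a list of (input, output) pairs are assumptions made for the example.

```python
def is_function(pairs):
    """Return True if no input in `pairs` is paired with two different outputs."""
    seen = {}  # maps each input value to the first output seen for it
    for x, y in pairs:
        if x in seen and seen[x] != y:
            return False  # same input, different output: not a function
        seen[x] = y
    return True


# Relation (1.1.1): each natural number paired with its double.
doubles = [(1, 2), (2, 4), (3, 6), (4, 8), (5, 10)]
# Relation (1.1.2): "odd"/"even" paired with the first five natural numbers.
parity = [("odd", 1), ("even", 2), ("odd", 3), ("even", 4), ("odd", 5)]

print(is_function(doubles))  # True
print(is_function(parity))   # False
```

The dictionary records the first output seen for each input, so a second, different output for the same input is detected as soon as it appears.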

Example 1.1.1: Determining If Menu Price Lists Are Functions

The coffee shop menu, shown in Figure 1.1.2 consists of items and their prices.
a. Is price a function of the item?
b. Is the item a function of the price?

Figure 1.1.2 : A menu of donut prices from a coffee shop where a plain donut is $1.49 and a jelly donut and chocolate donut are
$1.99.
Solution
a. Let’s begin by considering the input as the items on the menu. The output values are then the prices. See Figure 1.1.3.

Figure 1.1.3 : A menu of donut prices from a coffee shop where a plain donut is $1.49 and a jelly donut and chocolate donut are
$1.99.
Each item on the menu has only one price, so the price is a function of the item.
b. Two items on the menu have the same price. If we consider the prices to be the input values and the items to be the output,
then the same input value could have more than one output associated with it. See Figure 1.1.4.

Figure 1.1.4 : Association of the prices to the donuts.
Therefore, the item is not a function of price.

Example 1.1.2: Determining If Class Grade Rules Are Functions

In a particular math class, the overall percent grade corresponds to a grade point average. Is grade point average a function of
the percent grade? Is the percent grade a function of the grade point average? Table 1.1.1 shows a possible rule for assigning
grade points.
Table 1.1.1 : Class grade points.
Percent grade 0–56 57–61 62–66 67–71 72–77 78–86 87–91 92–100

Grade point average 0.0 1.0 1.5 2.0 2.5 3.0 3.5 4.0

Solution
For any percent grade earned, there is an associated grade point average, so the grade point average is a function of the percent
grade. In other words, if we input the percent grade, the output is a specific grade point average.
In the grading system given, there is a range of percent grades that correspond to the same grade point average. For example,
students who receive a grade point average of 3.0 could have a variety of percent grades ranging from 78 all the way to 86.
Thus, percent grade is not a function of grade point average.
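The grading rule in Table 1.1.1 can also be written as a small Python sketch. The names `BANDS`, `grade_points`, and `percents_for` are hypothetical, and the sketch assumes whole-number percent grades, which the table suggests but does not state.

```python
# Lower bound of each percent band in Table 1.1.1 and the grade points it earns.
BANDS = [(0, 0.0), (57, 1.0), (62, 1.5), (67, 2.0),
         (72, 2.5), (78, 3.0), (87, 3.5), (92, 4.0)]


def grade_points(percent):
    """Return the grade point average for a whole-number percent from 0 to 100."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    result = None
    for lower, points in BANDS:
        if percent >= lower:
            result = points  # keep the highest band whose lower bound is reached
    return result


def percents_for(points):
    """Return every whole-number percent that earns the given grade points."""
    return [p for p in range(101) if grade_points(p) == points]


print(grade_points(80))        # 3.0
print(len(percents_for(3.0)))  # 9 (the percents 78 through 86)
```

`grade_points` returns exactly one value for each percent, which is what makes it a function. `percents_for(3.0)` returns nine percents, which is why the reverse relationship is not a function.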

Exercise 1.1.2

Table 1.1.2 lists the five greatest baseball players of all time in order of rank.
Table 1.1.2 : Five greatest baseball players.
Player Rank

Babe Ruth 1

Willie Mays 2

Ty Cobb 3

Walter Johnson 4

Hank Aaron 5

a. Is the rank a function of the player name?


b. Is the player name a function of the rank?

Answer a
Yes
Answer b
Yes. (Note: If two players had been tied for, say, 4th place, then the name would not have been a function of rank.)

Using Function Notation
Once we determine that a relationship is a function, we need to display and define the functional relationships so that we can
understand and use them, and sometimes also so that we can program them into computers. There are various ways of representing
functions. A standard function notation is one representation that facilitates working with functions.
To represent “height is a function of age,” we start by identifying the descriptive variables h for height and a for age. The letters f,
g, and h are often used to represent functions just as we use x, y, and z to represent numbers and A, B, and C to represent sets.

h is f of a We name the function f ; height is a function of age.

h = f (a) We use parentheses to indicate the function input. (1.1.1)

f (a) We name the function f ; the expression is read as “ f of a.”

Remember, we can use any letter to name the function; the notation h(a) shows us that h depends on a . The value a must be put
into the function h to get a result. The parentheses indicate that age is input into the function; they do not indicate multiplication.
We can also give an algebraic expression as the input to a function. For example f (a + b) means “first add a and b , and the result
is the input for the function f .” The operations must be performed in this order to obtain the correct result.

Function Notation

The notation y = f (x) defines a function named f . This is read as “y is a function of x.” The letter x represents the input
value, or independent variable. The letter y , or f (x), represents the output value, or dependent variable.

Example 1.1.3: Using Function Notation for Days in a Month

Use function notation to represent a function whose input is the name of a month and output is the number of days in that
month.
Solution
The number of days in a month is a function of the name of the month, so if we name the function f, we write
days = f(month) or d = f(m). The name of the month is the input to a “rule” that associates a specific number (the output)
with each input.

Figure 1.1.5 : The function 31 = f(January) where 31 is the output, f is the rule, and January is the input.
For example, f (March) = 31 , because March has 31 days. The notation d = f (m) reminds us that the number of days, d (the
output), is dependent on the name of the month, m (the input).
Analysis
Note that the inputs to a function do not have to be numbers; function inputs can be names of people, labels of geometric
objects, or any other element that determines some kind of output. However, most of the functions we will work with in this
book will have numbers as inputs and outputs.
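As a minimal sketch of a function whose inputs are not numbers, the days-in-a-month rule can be stored as a Python dictionary. The name `DAYS_IN_MONTH` is hypothetical, and the values assume a year that is not a leap year.

```python
# Days in each month for a non-leap year (an assumption); the month name is the input.
DAYS_IN_MONTH = {
    "January": 31, "February": 28, "March": 31, "April": 30,
    "May": 31, "June": 30, "July": 31, "August": 31,
    "September": 30, "October": 31, "November": 30, "December": 31,
}


def f(month):
    """Return d = f(m), the number of days in the named month."""
    return DAYS_IN_MONTH[month]


print(f("March"))    # 31
print(f("January"))  # 31
```

A dictionary holds at most one value per key, so it cannot assign two outputs to the same input.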

Example 1.1.3B: Interpreting Function Notation

A function N = f (y) gives the number of police officers, N , in a town in year y . What does f (2005) = 300 represent?
Solution
When we read f(2005) = 300, we see that the input year is 2005. The value for the output, the number of police officers (N),
is 300. Remember, N = f(y). The statement f(2005) = 300 tells us that in the year 2005 there were 300 police officers in the
town.

Exercise 1.1.3

Use function notation to express the weight of a pig in pounds as a function of its age in days d .

Answer
w = f (d)

Q&A

Instead of a notation such as y = f (x), could we use the same symbol for the output as for the function, such as y = y(x) ,
meaning “y is a function of x?”
Yes, this is often done, especially in applied subjects that use higher math, such as physics and engineering. However, in
exploring math itself we like to maintain a distinction between a function such as f , which is a rule or procedure, and the
output y we get by applying f to a particular input x. This is why we usually use notation such as y = f (x), P = W (d) , and
so on.

Representing Functions Using Tables


A common method of representing functions is in the form of a table. The table rows or columns display the corresponding input
and output values. In some cases, these values represent all we know about the relationship; other times, the table provides a few
select examples from a more complete relationship.
Table 1.1.3 lists the input number of each month (January = 1 , February = 2 , and so on) and the output value of the number of
days in that month. This information represents all we know about the months and days for a given year (that is not a leap year).
Note that, in this table, we define a days-in-a-month function f where D = f (m) identifies months by an integer rather than by
name.
Table 1.1.3 : Months and number of days per month.
Month number, m (input) 1 2 3 4 5 6 7 8 9 10 11 12

Days in month, D (output) 31 28 31 30 31 30 31 31 30 31 30 31

Table 1.1.4 defines a function Q = g(n). Remember, this notation tells us that g is the name of the function that takes the input n
and gives the output Q.
Table 1.1.4 : Function Q = g(n)
n 1 2 3 4 5

Q 8 6 7 6 8

Table 1.1.5 displays the age of children in years and their corresponding heights. This table displays just some of the data available
for the heights and ages of children. We can see right away that this table does not represent a function because the same input
value, 5 years, has two different output values, 40 in. and 42 in.
Table 1.1.5 : Age of children and their corresponding heights.
Age in years, a (input) 5 5 6 7 8 9 10

Height in inches, h (output) 40 42 44 47 50 52 54

How To: Given a table of input and output values, determine whether the table represents a function
1. Identify the input and output values.
2. Check to see if each input value is paired with only one output value. If so, the table represents a function.

Example 1.1.5: Identifying Tables that Represent Functions

Which table, Table 1.1.6, Table 1.1.7, or Table 1.1.8, represents a function (if any)?
Table 1.1.6
Input Output

2 1

5 3

8 6

Table 1.1.7
Input Output

-3 5

0 1

4 5

Table 1.1.8
Input Output

1 0

5 2

5 4

Solution
Table 1.1.6 and Table 1.1.7 define functions. In both, each input value corresponds to exactly one output value. Table 1.1.8
does not define a function because the input value of 5 corresponds to two different output values.
When a table represents a function, corresponding input and output values can also be specified using function notation.
The function represented by Table 1.1.6 can be represented by writing

f (2) = 1, f (5) = 3, and f (8) = 6

Similarly, the statements

g(−3) = 5, g(0) = 1, and g(4) = 5

represent the function in Table 1.1.7.


Table 1.1.8 cannot be expressed in a similar way because it does not represent a function.

Exercise 1.1.5

Does Table 1.1.9 represent a function?


Table 1.1.9
Input Output

1 10

2 100

3 1000

Answer
Yes

Finding Input and Output Values of a Function


When we know an input value and want to determine the corresponding output value for a function, we evaluate the function.
Evaluating will always produce one result because each input value of a function corresponds to exactly one output value.
When we know an output value and want to determine the input values that would produce that output value, we set the output
equal to the function’s formula and solve for the input. Solving can produce more than one solution because different input values
can produce the same output value.

Evaluation of Functions in Algebraic Forms


When we have a function in formula form, it is usually a simple matter to evaluate the function. For example, the function
\(f(x) = 5 - 3x^2\) can be evaluated by squaring the input value, multiplying by 3, and then subtracting the product from 5.

How To: Given the formula for a function, evaluate.


1. Replace the input variable in the formula with the value provided.
2. Calculate the result.

Example 1.1.6A: Evaluating Functions at Specific Values

1. Evaluate \(f(x) = x^2 + 3x - 4\) at
a. 2
b. a
c. a + h
d. Evaluate \(\dfrac{f(a+h) - f(a)}{h}\)

Solution
Replace the x in the function with each specified value.
a. Because the input value is a number, 2, we can use simple algebra to simplify.
\[
\begin{aligned}
f(2) &= 2^2 + 3(2) - 4 \\
&= 4 + 6 - 4 \\
&= 6
\end{aligned}
\]
b. In this case, the input value is a letter so we cannot simplify the answer any further.
\[ f(a) = a^2 + 3a - 4 \]
c. With an input value of a + h, we must use the distributive property.
\[
\begin{aligned}
f(a + h) &= (a + h)^2 + 3(a + h) - 4 \\
&= a^2 + 2ah + h^2 + 3a + 3h - 4
\end{aligned}
\]

d. In this case, we apply the input values to the function more than once, and then perform algebraic operations on the
result. We already found that
\[ f(a + h) = a^2 + 2ah + h^2 + 3a + 3h - 4 \]
and we know that
\[ f(a) = a^2 + 3a - 4 \]
Now we combine the results and simplify.
\[
\begin{aligned}
\frac{f(a + h) - f(a)}{h} &= \frac{(a^2 + 2ah + h^2 + 3a + 3h - 4) - (a^2 + 3a - 4)}{h} \\
&= \frac{2ah + h^2 + 3h}{h} \\
&= \frac{h(2a + h + 3)}{h} && \text{Factor out } h. \\
&= 2a + h + 3 && \text{Simplify.}
\end{aligned}
\]
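The simplification in part (d) of Example 1.1.6A can be spot-checked numerically. The sketch below is an illustration, not a proof: it compares the difference quotient with 2a + h + 3 at a few sample points chosen arbitrarily, using exact rational arithmetic from Python's standard `fractions` module. The helper name `difference_quotient` is an assumption.

```python
from fractions import Fraction


def f(x):
    """The function from Example 1.1.6A: f(x) = x^2 + 3x - 4."""
    return x**2 + 3 * x - 4


def difference_quotient(a, h):
    """Compute (f(a + h) - f(a)) / h exactly for integers a and h; h must be nonzero."""
    return Fraction(f(a + h) - f(a), h)


# Compare against the simplified form 2a + h + 3 at a few integer sample points.
for a, h in [(1, 2), (-3, 5), (0, -4), (7, 1)]:
    assert difference_quotient(a, h) == 2 * a + h + 3

print(f(2))                       # 6
print(difference_quotient(1, 2))  # 7
```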

Example 1.1.6B: Evaluating Functions

Given the function \(h(p) = p^2 + 2p\), evaluate h(4).
Solution
To evaluate h(4), we substitute the value 4 for the input variable p in the given function.
\[
\begin{aligned}
h(p) &= p^2 + 2p \\
h(4) &= (4)^2 + 2(4) \\
&= 16 + 8 \\
&= 24
\end{aligned}
\]

Therefore, for an input of 4, we have an output of 24.

Exercise 1.1.6
Given the function \(g(m) = \sqrt{m - 4}\), evaluate g(5).

Answer
g(5) = 1

Example 1.1.7: Solving Functions

Given the function \(h(p) = p^2 + 2p\), solve for h(p) = 3.
Solution
\[
\begin{aligned}
h(p) &= 3 \\
p^2 + 2p &= 3 && \text{Substitute the original function.} \\
p^2 + 2p - 3 &= 0 && \text{Subtract 3 from each side.} \\
(p + 3)(p - 1) &= 0 && \text{Factor.}
\end{aligned}
\]

If (p + 3)(p − 1) = 0 , either (p + 3) = 0 or (p − 1) = 0 (or both of them equal 0). We will set each factor equal to 0 and
solve for p in each case.

(p + 3) = 0, p = −3

(p − 1) = 0, p = 1

This gives us two solutions. The output h(p) = 3 when the input is either p = 1 or p = −3 . We can also verify by graphing as
in Figure 1.1.6. The graph verifies that h(1) = h(−3) = 3 and h(4) = 24 .
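The same solutions can be reproduced with a short Python sketch. Note that it uses the quadratic formula rather than the factoring shown in the example, and the helper name `solve_h` is hypothetical.

```python
import math


def h(p):
    """The function from Example 1.1.7: h(p) = p^2 + 2p."""
    return p**2 + 2 * p


def solve_h(target):
    """Return the real solutions of h(p) = target, smallest first."""
    # h(p) = target is p^2 + 2p - target = 0, so a = 1, b = 2, c = -target.
    disc = 4 + 4 * target
    if disc < 0:
        return []  # no real input produces this output
    root = math.sqrt(disc)
    # A set removes the duplicate when both roots coincide.
    return sorted({(-2 - root) / 2, (-2 + root) / 2})


print(solve_h(3))  # [-3.0, 1.0]
print(h(4))        # 24
```

Evaluating `h` always gives a single value, while `solve_h` may return two inputs, one input, or none. This matches the earlier remark that solving can produce more than one solution.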

Figure 1.1.6 : Graph of \(h(p) = p^2 + 2p\)

Exercise 1.1.7
Given the function \(g(m) = \sqrt{m - 4}\), solve g(m) = 2.

Answer
m = 8

Evaluating Functions Expressed in Formulas


Some functions are defined by mathematical rules or procedures expressed in equation form. If it is possible to express the
function output with a formula involving the input quantity, then we can define a function in algebraic form. For example, the
equation 2n + 6p = 12 expresses a functional relationship between n and p. We can rewrite it to decide if p is a function of n .

How to: Given a function in equation form, write its algebraic formula.
1. Solve the equation to isolate the output variable on one side of the equal sign, with the other side as an expression that
involves only the input variable.
2. Use all the usual algebraic methods for solving equations, such as adding or subtracting the same quantity to or from both
sides, or multiplying or dividing both sides of the equation by the same quantity.

Example 1.1.8A: Finding an Equation of a Function

Express the relationship 2n + 6p = 12 as a function p = f (n) , if possible.


Solution
To express the relationship in this form, we need to be able to write the relationship where p is a function of n , which means
writing it as p = [expression involving n] .
\[
\begin{aligned}
2n + 6p &= 12 \\
6p &= 12 - 2n && \text{Subtract } 2n \text{ from both sides.} \\
p &= \frac{12 - 2n}{6} && \text{Divide both sides by 6 and simplify.} \\
p &= \frac{12}{6} - \frac{2n}{6} \\
p &= 2 - \frac{1}{3}n
\end{aligned}
\]
Therefore, p as a function of n is written as
\[ p = f(n) = 2 - \frac{1}{3}n \]
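As a quick check of this result, the sketch below substitutes p = f(n) back into 2n + 6p = 12 for a handful of integer values of n. It uses exact fractions so that the division by 3 introduces no rounding; the sample range is an arbitrary choice.

```python
from fractions import Fraction


def f(n):
    """p as a function of n, from solving 2n + 6p = 12 for p."""
    return 2 - Fraction(n, 3)


# Every (n, f(n)) pair should satisfy the original equation exactly.
for n in range(-6, 7):
    assert 2 * n + 6 * f(n) == 12

print(f(0))  # 2
print(f(3))  # 1
```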

Analysis
It is important to note that not every relationship expressed by an equation can also be expressed as a function with a formula.

Example 1.1.8B: Expressing the Equation of a Circle as a Function

Does the equation \(x^2 + y^2 = 1\) represent a function with x as input and y as output? If so, express the relationship as a
function y = f(x).
Solution
First we subtract \(x^2\) from both sides.
\[ y^2 = 1 - x^2 \]
We now try to solve for y in this equation.
\[ y = \pm\sqrt{1 - x^2} \]
so \(y = \sqrt{1 - x^2}\) and \(y = -\sqrt{1 - x^2}\).

We get two outputs corresponding to the same input, so this relationship cannot be represented as a single function y = f (x).
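A small Python sketch makes the two outputs concrete. The helper name `circle_outputs` and the sample input x = 0.6 are assumptions chosen for illustration.

```python
import math


def circle_outputs(x):
    """Return every y with x^2 + y^2 = 1 for a given x (empty if |x| > 1)."""
    remainder = 1 - x**2
    if remainder < 0:
        return []  # x lies outside the circle, so no real y exists
    root = math.sqrt(remainder)
    return [root] if root == 0 else [-root, root]


ys = circle_outputs(0.6)
print(len(ys))  # 2 outputs for the single input x = 0.6
```

Only x = 1 and x = −1 give a single output, so the relation fails the definition of a function for every x strictly between −1 and 1.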

Exercise 1.1.8

If \(x - 8y^3 = 0\), express y as a function of x.

Answer
\(y = f(x) = \dfrac{\sqrt[3]{x}}{2}\)

Q&A

Are there relationships expressed by an equation that do represent a function but which still cannot be represented by an
algebraic formula?
Yes, this can happen. For example, given the equation \(x = y + 2^y\), if we want to express y as a function of x, there is no
simple algebraic formula involving only x that equals y. However, each x does determine a unique value for y, and there are
mathematical procedures by which y can be found to any desired accuracy. In this case, we say that the equation gives an
implicit (implied) rule for y as a function of x, even though the formula cannot be written explicitly.
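One such procedure is bisection, sketched below in Python. The text does not name a particular method, so this choice is an assumption, as are the function name `solve_for_y` and the default tolerance. It works because y + 2^y increases as y increases, so the bracket around the solution can be halved repeatedly.

```python
def solve_for_y(x, tol=1e-12):
    """Approximate the y that satisfies x = y + 2**y using bisection."""
    def g(y):
        return y + 2.0**y

    # g is strictly increasing, so widen [lo, hi] until g(lo) <= x <= g(hi).
    lo, hi = -1.0, 1.0
    while g(lo) > x:
        lo *= 2
    while g(hi) < x:
        hi *= 2
    # Halve the bracket until it is narrower than the tolerance.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(mid) < x:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


print(round(solve_for_y(3), 6))  # 1.0, since 1 + 2**1 = 3
```

Each call returns a single y for the given x, which is the sense in which the equation defines y implicitly as a function of x.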

Evaluating a Function Given in Tabular Form


As we saw above, we can represent functions in tables. Conversely, we can use information in tables to write functions, and we can
evaluate functions using the tables. For example, how well do our pets recall the fond memories we share with them? There is an
urban legend that a goldfish has a memory of 3 seconds, but this is just a myth. Goldfish can remember up to 3 months, while the
beta fish has a memory of up to 5 months. And while a puppy’s memory span is no longer than 30 seconds, the adult dog can
remember for 5 minutes. This is meager compared to a cat, whose memory span lasts for 16 hours.
The function that relates the type of pet to the duration of its memory span is more easily visualized with the use of a table (Table
1.1.10).

Table 1.1.10
Pet Memory span in hours

Puppy 0.008

Adult Dog 0.083

Cat 16

Goldfish 2160

Beta Fish 3600

At times, evaluating a function in table form may be more useful than using equations. Here let us call the function P . The domain
of the function is the type of pet and the range is a real number representing the number of hours the pet’s memory span lasts. We
can evaluate the function P at the input value of “goldfish.” We would write P(goldfish) = 2160. Notice that, to evaluate the
function in table form, we identify the input value and the corresponding output value from the pertinent row of the table. The
tabular form for function P seems ideally suited to this function, more so than writing it in paragraph or function form.

How To: Given a function represented by a table, identify specific output and input values

1. Find the given input in the row (or column) of input values.
2. Identify the corresponding output value paired with that input value.
3. Find the given output values in the row (or column) of output values, noting every time that output value appears.
4. Identify the input value(s) corresponding to the given output value.

Example 1.1.9: Evaluating and Solving a Tabular Function

Using Table 1.1.11,


a. Evaluate g(3) .
b. Solve g(n) = 6 .
Table 1.1.11
n 1 2 3 4 5

g(n) 8 6 7 6 8

Solution
a. Evaluating g(3) means determining the output value of the function g for the input value of n = 3 . The table output
value corresponding to n = 3 is 7, so g(3) = 7 .
b. Solving g(n) = 6 means identifying the input values, n, that produce an output value of 6. Table 1.1.12 shows two
solutions: 2 and 4.
Table 1.1.12
n 1 2 3 4 5

g(n) 8 6 7 6 8

When we input 2 into the function g , our output is 6. When we input 4 into the function g , our output is also 6.
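Both operations in this example can be sketched in Python by storing the table as a dictionary. The helper names `evaluate` and `solve` are hypothetical.

```python
# Table 1.1.11 stored as input -> output pairs.
g = {1: 8, 2: 6, 3: 7, 4: 6, 5: 8}


def evaluate(table, n):
    """Return the single output paired with input n."""
    return table[n]


def solve(table, output):
    """Return every input whose output equals the given value."""
    return [n for n, value in table.items() if value == output]


print(evaluate(g, 3))  # 7
print(solve(g, 6))     # [2, 4]
```

Evaluating is a single lookup, while solving has to scan every row because the same output may appear more than once.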

Exercise 1.1.1

Using Table 1.1.12, evaluate g(1) .

Answer
g(1) = 8

Finding Function Values from a Graph


Evaluating a function using a graph also requires finding the corresponding output value for a given input value, only in this case,
we find the output value by looking at the graph. Solving a function equation using a graph requires finding all instances of the
given output value on the graph and observing the corresponding input value(s).

Example 1.1.10: Reading Function Values from a Graph

Given the graph in Figure 1.1.7,


a. Evaluate f (2).
b. Solve f (x) = 4 .

Figure 1.1.7 : Graph of a positive parabola centered at (1, 0).


Solution
To evaluate f (2), locate the point on the curve where x = 2 , then read the y-coordinate of that point. The point has coordinates
(2, 1), so f (2) = 1 . See Figure 1.1.8.

Figure 1.1.8 : Graph of a positive parabola centered at (1, 0) with the labeled point (2, 1) where f (2) = 1.
To solve f (x) = 4 , we find the output value 4 on the vertical axis. Moving horizontally along the line y = 4 , we locate two
points of the curve with output value 4: (−1, 4) and (3, 4). These points represent the two solutions to f (x) = 4 : −1 or 3. This
means f (−1) = 4 and f (3) = 4 , or when the input is −1 or 3, the output is 4. See Figure 1.1.9.

Figure 1.1.9 : Graph of an upward-facing parabola with a vertex at (1, 0) and labeled points at (−1, 4) and (3, 4). A line at
y = 4 intersects the parabola at the labeled points.

Exercise 1.1.10

Given the graph in Figure 1.1.7, solve f (x) = 1 .

Answer
x = 0 or x = 2

Determining Whether a Function is One-to-One


Some functions have a given output value that corresponds to two or more input values. For example, in the stock chart shown in
the Figure at the beginning of this chapter, the stock price was $1000 on five different dates, meaning that there were five different
input values that all resulted in the same output value of $1000.
However, some functions have only one input value for each output value, as well as having only one output for each input. We call
these functions one-to-one functions. As an example, consider a school that uses only letter grades and decimal equivalents, as
listed in Table 1.1.13.
Table 1.1.13 : Letter grades and decimal equivalents.
Letter Grade Grade Point Average

A 4.0

B 3.0

C 2.0

D 1.0

This grading system represents a one-to-one function, because each letter input yields one particular grade point average output and
each grade point average corresponds to one input letter.
To visualize this concept, let’s look again at the two simple functions sketched in Figures 1.1.1a and 1.1.1b. The function in part
(a) shows a relationship that is not a one-to-one function because inputs q and r both give output n . The function in part (b) shows
a relationship that is a one-to-one function because each input is associated with a single output.

One-to-One Functions
A one-to-one function is a function in which each output value corresponds to exactly one input value.

Example 1.1.11: Determining Whether a Relationship Is a One-to-One Function

Is the area of a circle a function of its radius? If yes, is the function one-to-one?
Solution
A circle of radius r has a unique area measure given by A = πr^2, so for any input, r, there is only one output, A. The area is a
function of radius r.
If the function is one-to-one, the output value, the area, must correspond to a unique input value, the radius. Any area measure
A is given by the formula A = πr^2. Because areas and radii are positive numbers, there is exactly one solution: r = √(A/π). So the
area of a circle is a one-to-one function of the circle’s radius.

Exercise 1.1.11A
a. Is a balance a function of the bank account number?
b. Is a bank account number a function of the balance?
c. Is a balance a one-to-one function of the bank account number?

Answer
a. yes, because each bank account has a single balance at any given time;
b. no, because several bank account numbers may have the same balance;
c. no, because the same output may correspond to more than one input.

Exercise 1.1.11B

Evaluate the following:


a. If each percent grade earned in a course translates to one letter grade, is the letter grade a function of the percent grade?
b. If so, is the function one-to-one?

Answer
a. Yes, letter grade is a function of percent grade;
b. No, it is not one-to-one. There are 100 different percent numbers we could get but only about five possible letter grades,
so there cannot be only one percent number that corresponds to each letter grade.

Using the Vertical Line Test


As we have seen in some examples above, we can represent a function using a graph. Graphs display a great many input-output
pairs in a small space. The visual information they provide often makes relationships easier to understand. By convention, graphs
are typically constructed with the input values along the horizontal axis and the output values along the vertical axis.
The most common graphs name the input value x and the output y, and we say y is a function of x, or y = f (x) when the function
is named f. The graph of the function is the set of all points (x, y) in the plane that satisfy the equation y = f (x). If the function
is defined for only a few input values, then the graph of the function is only a few points, where the x-coordinate of each point is an
input value and the y-coordinate of each point is the corresponding output value. For example, the black dots on the graph in Figure
1.1.10 tell us that f (0) = 2 and f (6) = 1. However, the set of all points (x, y) satisfying y = f (x) is a curve. The curve shown
includes (0, 2) and (6, 1) because the curve passes through those points.

Figure 1.1.10 : Graph of a polynomial.
The vertical line test can be used to determine whether a graph represents a function. If we can draw any vertical line that intersects
a graph more than once, then the graph does not define a function because a function has only one output value for each input
value. See Figure 1.1.11.

Figure 1.1.11 : Three graphs visually showing what is and is not a function.

How To: Given a graph, use the vertical line test to determine if the graph represents a function
1. Inspect the graph to see if any vertical line drawn would intersect the curve more than once.
2. If there is any such line, determine that the graph does not represent a function.
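
For a graph given as a finite list of points, the vertical line test amounts to checking that no x-value is paired with two different y-values. The sketch below illustrates that idea under stated assumptions: the function name passes_vertical_line_test and the sample points are hypothetical, and a finite sample can only suggest, not prove, the behavior of a full curve.

```python
from collections import defaultdict

def passes_vertical_line_test(points):
    """Return True if no x-value is paired with two different y-values."""
    outputs = defaultdict(set)
    for x, y in points:
        outputs[x].add(y)
    return all(len(ys) == 1 for ys in outputs.values())

# Hypothetical sample: points on the downward-sloping line y = 2 - x.
line_points = [(x, 2 - x) for x in range(-3, 4)]
# Hypothetical sample: four points on the circle x^2 + y^2 = 25.
circle_points = [(3, 4), (3, -4), (-3, 4), (-3, -4)]

print(passes_vertical_line_test(line_points))    # True
print(passes_vertical_line_test(circle_points))  # False
```

The circle fails because the vertical line x = 3 meets it at both (3, 4) and (3, −4).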

Example 1.1.12: Applying the Vertical Line Test

Which of the graphs in Figure 1.1.12 represent(s) a function y = f (x)?

Figure 1.1.12 : Graph of a polynomial (a), a downward-sloping line (b), and a circle (c).
Solution
If any vertical line intersects a graph more than once, the relation represented by the graph is not a function. Notice that any
vertical line would pass through only one point of the two graphs shown in parts (a) and (b) of Figure 1.1.12. From this we can
conclude that these two graphs represent functions. The third graph does not represent a function because, at most x-values, a
vertical line would intersect the graph at more than one point, as shown in Figure 1.1.13.

Figure 1.1.13 : Graph of a circle.

Exercise 1.1.12

Does the graph in Figure 1.1.14 represent a function?

Figure 1.1.14 : Graph of absolute value function.

Answer
yes

Using the Horizontal Line Test


Once we have determined that a graph defines a function, an easy way to determine if it is a one-to-one function is to use the
horizontal line test. Draw horizontal lines through the graph. If any horizontal line intersects the graph more than once, then the
graph does not represent a one-to-one function.

How To: Given a graph of a function, use the horizontal line test to determine if the graph represents a one-to-one function
1. Inspect the graph to see if any horizontal line drawn would intersect the curve more than once.
2. If there is any such line, determine that the function is not one-to-one.
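
The horizontal line test can be sketched the same way for a function sampled at finitely many inputs: the function fails if some output value is produced by two different inputs. The name passes_horizontal_line_test and the sample functions y = x^2 and y = 2 − x are assumptions chosen for illustration only.

```python
def passes_horizontal_line_test(points):
    """Return True if no y-value is produced by two different x-values."""
    seen = {}  # maps each output y to the first input x that produced it
    for x, y in points:
        if y in seen and seen[y] != x:
            return False
        seen[y] = x
    return True

xs = range(-3, 4)
parabola = [(x, x ** 2) for x in xs]  # hypothetical sample of y = x^2
line = [(x, 2 - x) for x in xs]       # hypothetical sample of y = 2 - x

print(passes_horizontal_line_test(parabola))  # False
print(passes_horizontal_line_test(line))      # True
```

The parabola fails because, for example, the inputs −1 and 1 both give the output 1.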

Example 1.1.13: Applying the Horizontal Line Test

Consider the functions shown in Figure 1.1.12a and Figure 1.1.12b. Is either of the functions one-to-one?
Solution
The function in Figure 1.1.12a is not one-to-one. The horizontal line shown in Figure 1.1.15 intersects the graph of the
function at two points (and we can even find horizontal lines that intersect it at three points).

Figure 1.1.15 : Graph of a polynomial with a horizontal line crossing through 2 points
The function in Figure 1.1.12b is one-to-one. Any horizontal line will intersect a diagonal line at most once.

Exercise 1.1.13

Is the graph shown in Figure 1.1.13 one-to-one?

Answer
No, because it does not pass the horizontal line test.

Identifying Basic Toolkit Functions


In this text, we will be exploring functions—the shapes of their graphs, their unique characteristics, their algebraic formulas, and
how to solve problems with them. When learning to read, we start with the alphabet. When learning to do arithmetic, we start with
numbers. When working with functions, it is similarly helpful to have a base set of building-block elements. We call these our
“toolkit functions,” which form a set of basic named functions for which we know the graph, formula, and special properties. Some
of these functions are programmed to individual buttons on many calculators. For these definitions we will use x as the input
variable and y = f (x) as the output variable.
We will see these toolkit functions, combinations of toolkit functions, their graphs, and their transformations frequently throughout
this book. It will be very helpful if we can recognize these toolkit functions and their features quickly by name, formula, graph, and
basic table properties. The graphs and sample table values are included with each function shown in Table 1.1.14.
Table 1.1.14 : Toolkit Functions
Name                  Function                            Graph
Constant              f (x) = c, where c is a constant    [graph]
Identity              f (x) = x                           [graph]
Absolute value        f (x) = |x|                         [graph]
Quadratic             f (x) = x^2                         [graph]
Cubic                 f (x) = x^3                         [graph]
Reciprocal            f (x) = 1/x                         [graph]
Reciprocal squared    f (x) = 1/x^2                       [graph]
Square root           f (x) = √x                          [graph]
Cube root             f (x) = ∛x                          [graph]

Key Equations
Constant function             f (x) = c, where c is a constant
Identity function             f (x) = x
Absolute value function       f (x) = |x|
Quadratic function            f (x) = x^2
Cubic function                f (x) = x^3
Reciprocal function           f (x) = 1/x
Reciprocal squared function   f (x) = 1/x^2
Square root function          f (x) = √x
Cube root function            f (x) = ∛x
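
As a quick way to experiment with these formulas, the toolkit functions can be written as small Python functions. This is a sketch under stated assumptions: the dictionary name toolkit is hypothetical, the constant c = 2 is an arbitrary choice, and the cube root is computed with a sign-preserving workaround so that negative inputs return real values.

```python
import math

# Hypothetical collection of the toolkit functions, keyed by name.
toolkit = {
    "constant": lambda x, c=2: c,                # c = 2 is an arbitrary choice
    "identity": lambda x: x,
    "absolute value": lambda x: abs(x),
    "quadratic": lambda x: x ** 2,
    "cubic": lambda x: x ** 3,
    "reciprocal": lambda x: 1 / x,               # undefined at x = 0
    "reciprocal squared": lambda x: 1 / x ** 2,  # undefined at x = 0
    "square root": lambda x: math.sqrt(x),       # requires x >= 0
    # Real cube root: take the root of |x| and restore the sign of x.
    "cube root": lambda x: math.copysign(abs(x) ** (1 / 3), x),
}

# Evaluate every toolkit function at the same sample input.
for name, f in toolkit.items():
    print(f"{name:>18}: f(4) = {f(4)}")
```

Evaluating each function at a handful of inputs in this way produces the kind of sample table values the text refers to.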

Key Concepts
A relation is a set of ordered pairs. A function is a specific type of relation in which each domain value, or input, leads to
exactly one range value, or output.
Function notation is a shorthand method for relating the input to the output in the form y = f (x).
In tabular form, a function can be represented by rows or columns that relate to input and output values.
To evaluate a function, we determine an output value for a corresponding input value. Algebraic forms of a function can be
evaluated by replacing the input variable with a given value.
To solve for a specific function value, we determine the input values that yield the specific output value.
An algebraic form of a function can be written from an equation.
Input and output values of a function can be identified from a table.
Relating input values to output values on a graph is another way to evaluate a function.
A function is one-to-one if each output value corresponds to only one input value.
A graph represents a function if any vertical line drawn on the graph intersects the graph at no more than one point.
The graph of a one-to-one function passes the horizontal line test.

Footnotes
1 http://www.baseball-almanac.com/lege.../lisn100.shtml. Accessed 3/24/2014.
2 www.kgbanswers.com/how-long-i...y-span/4221590. Accessed 3/24/2014.

Glossary
dependent variable
an output variable
domain
the set of all possible input values for a relation
function
a relation in which each input value yields a unique output value
horizontal line test
a method of testing whether a function is one-to-one by determining whether any horizontal line intersects the graph more than
once
independent variable
an input variable
input
each object or value in a domain that relates to another object or value by a relationship known as a function
one-to-one function
a function for which each value of the output is associated with a unique input value
output
each object or value in the range that is produced when an input value is entered into a function
range
the set of output values that result from the input values in a relation
relation
a set of ordered pairs
vertical line test
a method of testing whether a graph represents a function by determining whether a vertical line intersects the graph no more than
once

1.1: Four Ways to Represent a Function is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
1.1: Functions and Function Notation by OpenStax is licensed CC BY 4.0. Original source: https://openstax.org/details/books/precalculus.

1.3: New Functions from Old Functions
Learning Objectives
Combine functions using algebraic operations.
Create a new function by composition of functions.
Evaluate composite functions.
Find the domain of a composite function.
Decompose a composite function into its component functions.

Suppose we want to calculate how much it costs to heat a house on a particular day of the year. The cost to heat a house will
depend on the average daily temperature, and in turn, the average daily temperature depends on the particular day of the year.
Notice how we have just defined two relationships: The cost depends on the temperature, and the temperature depends on the day.
Using descriptive variables, we can notate these two functions. The function C (T ) gives the cost C of heating a house for a given
average daily temperature in T degrees Celsius. The function T (d) gives the average daily temperature on day d of the year. For
any given day, Cost = C (T (d)) means that the cost depends on the temperature, which in turn depends on the day of the year.
Thus, we can evaluate the cost function at the temperature T (d) . For example, we could evaluate T (5) to determine the average
daily temperature on the 5th day of the year. Then, we could evaluate the cost function at that temperature. We would write
C (T (5)).

Figure 1.3.1 : Explanation of C (T (5)): T (5) is the temperature on day 5, and C (T (5)) is the cost of heating at that temperature.
By combining these two relationships into one function, we have performed function composition, which is the focus of this
section.
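
To make the idea concrete, here is a minimal Python sketch of the heating example. The formulas inside T and C are invented purely for illustration (the text gives no formulas for them); only the pattern of feeding the output of T into C reflects the discussion above.

```python
def T(d):
    """Hypothetical average daily temperature (degrees Celsius) on day d."""
    return 2 + 0.1 * d

def C(temp):
    """Hypothetical heating cost (dollars) for a given average temperature."""
    return max(0.0, 10 - 0.5 * temp)

temperature_day_5 = T(5)            # evaluate the inner function first
cost_day_5 = C(temperature_day_5)   # then feed its output to the outer function

print(temperature_day_5)            # 2.5
print(cost_day_5)                   # 8.75
print(C(T(5)) == cost_day_5)        # True
```

Writing C(T(5)) in one step gives the same value as the two-step calculation, which is exactly what the notation C (T (5)) expresses.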

Combining Functions Using Algebraic Operations


Function composition is only one way to combine existing functions. Another way is to carry out the usual algebraic operations on
functions, such as addition, subtraction, multiplication and division. We do this by performing the operations with the function
outputs, defining the result as the output of our new function.
Suppose we need to add two columns of numbers that represent a husband and wife’s separate annual incomes over a period of
years, with the result being their total household income. We want to do this for every year, adding only that year’s incomes and
then collecting all the data in a new column. If w(y) is the wife’s income and h(y) is the husband’s income in year y , and we want
T to represent the total income, then we can define a new function.

T (y) = h(y) + w(y)

If this holds true for every year, then we can focus on the relation between the functions without reference to a year and write

T = h + w

Just as for this sum of two functions, we can define difference, product, and ratio functions for any pair of functions that have the
same kinds of inputs (not necessarily numbers) and also the same kinds of outputs (which do have to be numbers so that the usual
operations of algebra can apply to them, and which also must have the same units or no units when we add and subtract). In this
way, we can think of adding, subtracting, multiplying, and dividing functions.
For two functions f (x) and g(x) with real number outputs, we define new functions f + g, f − g, f g, and f/g
by the relations

(f + g)(x) = f (x) + g(x)

(f − g)(x) = f (x) − g(x)

(f g)(x) = f (x)g(x)

(f/g)(x) = f (x)/g(x)
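
These definitions translate directly into functions that take two functions and return a new one. The sketch below is illustrative only: the helper names add, subtract, multiply, and divide are assumptions, and the sample functions f (x) = x − 1 and g(x) = x^2 − 1 are chosen simply to have something to evaluate.

```python
def add(f, g):
    """Return the function f + g."""
    return lambda x: f(x) + g(x)

def subtract(f, g):
    """Return the function f - g."""
    return lambda x: f(x) - g(x)

def multiply(f, g):
    """Return the product function fg."""
    return lambda x: f(x) * g(x)

def divide(f, g):
    """Return the quotient function f/g (raises ZeroDivisionError where g(x) == 0)."""
    return lambda x: f(x) / g(x)

f = lambda x: x - 1        # sample function chosen for illustration
g = lambda x: x ** 2 - 1   # sample function chosen for illustration

print(subtract(g, f)(3))   # (9 - 1) - (3 - 1) = 6
print(divide(g, f)(3))     # 8 / 2 = 4.0
```

Each helper performs the operation on the outputs f (x) and g(x), which is how the new functions are defined above.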

Example 1.3.1: Performing Algebraic Operations on Functions

Find and simplify the functions (g − f )(x) and (g/f )(x), given f (x) = x − 1 and g(x) = x^2 − 1. Are they the same
function?
Solution
Begin by writing the general form, and then substitute the given functions.
(g − f )(x) = g(x) − f (x)
(g − f )(x) = x^2 − 1 − (x − 1)
            = x^2 − x
            = x(x − 1)

(g/f )(x) = g(x)/f (x)
(g/f )(x) = (x^2 − 1)/(x − 1)
          = (x + 1)(x − 1)/(x − 1)
          = x + 1, where x ≠ 1

No, the functions are not the same.


Note: For (g/f )(x), the condition x ≠ 1 is necessary because when x = 1, the denominator is equal to 0, which makes the
function undefined.

Exercise 1.3.1

Find and simplify the functions (f g)(x) and (f − g)(x), given f (x) = x − 1 and g(x) = x^2 − 1.

Are they the same function?

Answer
(f g)(x) = f (x)g(x) = (x − 1)(x^2 − 1) = x^3 − x^2 − x + 1

(f − g)(x) = f (x) − g(x) = (x − 1) − (x^2 − 1) = x − x^2

No, the functions are not the same.

Create a Function by Composition of Functions
Performing algebraic operations on functions combines them into a new function, but we can also create functions by composing
functions. When we wanted to compute a heating cost from a day of the year, we created a new function that takes a day as input
and yields a cost as output. The process of combining functions so that the output of one function becomes the input of another is
known as a composition of functions. The resulting function is known as a composite function. We represent this combination by
the following notation:

(f ∘ g)(x) = f (g(x))    (1.3.1)

We read the left-hand side as “f composed with g at x,” and the right-hand side as “f of g of x.” The two sides of the equation have
the same mathematical meaning and are equal. The open circle symbol ∘ is called the composition operator. We use this operator
mainly when we wish to emphasize the relationship between the functions themselves without referring to any particular input
value. Composition is a binary operation that takes two functions and forms a new function, much as addition or multiplication
takes two numbers and gives a new number. However, it is important not to confuse function composition with multiplication
because, in most cases, f (g(x)) ≠ f (x)g(x).
It is also important to understand the order of operations in evaluating a composite function. We follow the usual convention with
parentheses by starting with the innermost parentheses first, and then working to the outside. In the equation above, the function g
takes the input x first and yields an output g(x). Then the function f takes g(x) as an input and yields an output f (g(x)).

Figure 1.3.2 : Explanation of the composite function.


In general, f ∘ g and g ∘ f are different functions. In other words, in many cases the equation f (g(x)) = g(f (x)) does not hold for
every x. We will also see that sometimes two functions can be composed only in one specific order.
For example, if f (x) = x^2 and g(x) = x + 2, then

f (g(x)) = f (x + 2)
         = (x + 2)^2
         = x^2 + 4x + 4

but

g(f (x)) = g(x^2)
         = x^2 + 2

These expressions are not equal for all values of x, so the two functions are not equal. It is irrelevant that the expressions happen to
be equal for the single input value x = −1/2.
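
The comparison above can be checked numerically with a small composition helper. The helper name compose is an assumption introduced for this sketch; f and g are the two functions from the example.

```python
def compose(outer, inner):
    """Return the function x -> outer(inner(x))."""
    return lambda x: outer(inner(x))

f = lambda x: x ** 2
g = lambda x: x + 2

f_of_g = compose(f, g)  # x -> (x + 2)^2
g_of_f = compose(g, f)  # x -> x^2 + 2

# The two compositions agree at x = -0.5 but differ at the other inputs.
for x in (-0.5, 0, 1, 3):
    print(x, f_of_g(x), g_of_f(x))
```

Agreement at a single input such as −0.5 does not make the two functions equal; they must agree at every input.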

Note that the range of the inside function (the first function to be evaluated) needs to be within the domain of the outside function.
Less formally, the composition has to make sense in terms of inputs and outputs.

Composition of Functions
When the output of one function is used as the input of another, we call the entire operation a composition of functions. For
any input x and functions f and g , this action defines a composite function, which we write as f ∘g such that
(f ∘g)(x) = f (g(x)) (1.3.2)

The domain of the composite function f ∘g is all x such that x is in the domain of g and g(x) is in the domain of f .
It is important to realize that the product of functions fg is not the same as the function composition f (g(x)) , because, in
general, f (x)g(x)≠f (g(x)).

Example 1.3.2: Determining whether Composition of Functions is Commutative

Using the functions provided, find f (g(x)) and g(f (x)). Determine whether the composition of the functions is commutative.

f (x) = 2x + 1 and g(x) = 3 − x

Solution
Let’s begin by substituting g(x) into f (x).
f (g(x)) = 2(3 − x) + 1

= 6 − 2x + 1

= 7 − 2x

Now we can substitute f (x) into g(x).

g(f (x)) = 3 − (2x + 1)

= 3 − 2x − 1

= 2 − 2x

We find that g(f (x))≠f (g(x)), so the operation of function composition is not commutative.

Example 1.3.3: Interpreting Composite Functions

The function c(s) gives the number of calories burned completing s sit-ups, and s(t) gives the number of sit-ups a person can
complete in t minutes. Interpret c(s(3)).
Solution
The inside expression in the composition is s(3) . Because the input to the s -function is time, t = 3 represents 3 minutes, and
s(3) is the number of sit-ups completed in 3 minutes.

Using s(3) as the input to the function c(s) gives us the number of calories burned during the number of sit-ups that can be
completed in 3 minutes, or simply the number of calories burned in 3 minutes (by doing sit-ups).

Example 1.3.4: Investigating the Order of Function Composition

Suppose f (x) gives miles that can be driven in x hours and g(y) gives the gallons of gas used in driving y miles. Which of
these expressions is meaningful: f (g(y)) or g(f (x))?
Solution
The function y = f (x) is a function whose output is the number of miles driven corresponding to the number of hours driven.

number of miles = f (number of hours)

The function g(y) is a function whose output is the number of gallons used corresponding to the number of miles driven. This
means:

number of gallons = g(number of miles)

The expression g(y) takes miles as the input and a number of gallons as the output. The function f (x) requires a number of
hours as the input. Trying to input a number of gallons does not make sense. The expression f (g(y)) is meaningless.
The expression f (x) takes hours as input and a number of miles driven as the output. The function g(y) requires a number of
miles as the input. Using f (x) (miles driven) as an input value for g(y) , where gallons of gas depends on miles driven, does
make sense. The expression g(f (x)) makes sense, and will yield the number of gallons of gas used, g , driving a certain
number of miles, f (x), in x hours.

Question/Answer
Are there any situations where f (g(y)) and g(f (x)) would both be meaningful or useful expressions?
Yes. For many pure mathematical functions, both compositions make sense, even though they usually produce different new
functions. In real-world problems, functions whose inputs and outputs have the same units also may give compositions that are
meaningful in either order.

Exercise 1.3.2

The gravitational force on a planet a distance r from the sun is given by the function G(r) . The acceleration of a planet
subjected to any force F is given by the function a(F ) . Form a meaningful composition of these two functions, and explain
what it means.

Answer
A gravitational force is still a force, so a(G(r)) makes sense as the acceleration of a planet at a distance r from the Sun
(due to gravity), but G(a(F )) does not make sense.

Evaluating Composite Functions


Once we compose a new function from two existing functions, we need to be able to evaluate it for any input in its domain. We will
do this with specific numerical inputs for functions expressed as tables, graphs, and formulas and with variables as inputs to
functions expressed as formulas. In each case, we evaluate the inner function using the starting input and then use the inner
function’s output as the input for the outer function.

Evaluating Composite Functions Using Tables


When working with functions given as tables, we read input and output values from the table entries and always work from the
inside to the outside. We evaluate the inside function first and then use the output of the inside function as the input to the outside
function.

Example 1.3.5: Using a Table to Evaluate a Composite Function

Using Table 1.3.1, evaluate f (g(3)) and g(f (3)).


Table 1.3.1
x f (x) g(x)

1 6 3

2 8 5

3 3 2

4 1 7

Solution
To evaluate f (g(3)), we start from the inside with the input value 3. We then evaluate the inside expression g(3) using the
table that defines the function g : g(3) = 2 . We can then use that result as the input to the function f , so g(3) is replaced by 2
and we get f (2). Then, using the table that defines the function f , we find that f (2) = 8 .

g(3) = 2

f (g(3)) = f (2) = 8

To evaluate g(f (3)), we first evaluate the inside expression f (3) using the first table: f (3) = 3 . Then, using the table for g , we
can evaluate

g(f (3)) = g(3) = 2

Table 1.3.2 shows the composite functions f ∘g and g∘f as tables.

Table 1.3.2
x g(x) f (g(x)) f (x) g(f (x))

3 2 8 3 2
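
The inside-out table lookups can be sketched in Python by storing each column of Table 1.3.1 as a dictionary. The variable names f, g, and defined are choices made for this illustration. The last step also shows that f (g(x)) can be computed only when g(x) is itself one of the inputs listed for f.

```python
# Columns of Table 1.3.1 stored as input -> output pairs.
f = {1: 6, 2: 8, 3: 3, 4: 1}
g = {1: 3, 2: 5, 3: 2, 4: 7}

print(f[g[3]])  # g(3) = 2, then f(2) = 8
print(g[f[3]])  # f(3) = 3, then g(3) = 2

# f(g(x)) is available only when g(x) appears among the inputs of f.
defined = {x: f[g[x]] for x in g if g[x] in f}
print(defined)  # {1: 3, 3: 8}
```

For x = 2 and x = 4 the outputs of g are 5 and 7, which are not inputs in the table for f, so the table gives no value of f (g(x)) there.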

Exercise 1.3.3

Using Table 1.3.1, evaluate f (g(1)) and g(f (4)).

Answer
f (g(1)) = f (3) = 3 and g(f (4)) = g(1) = 3

Evaluating Composite Functions Using Graphs


When we are given individual functions as graphs, the procedure for evaluating composite functions is similar to the process we
use for evaluating tables. We read the input and output values, but this time, from the x- and y-axes of the graphs.

How To ...

Given a composite function and graphs of its individual functions, evaluate it using the information provided by the graphs.
1. Locate the given input to the inner function on the x-axis of its graph.
2. Read off the output of the inner function from the y-axis of its graph.
3. Locate the inner function output on the x-axis of the graph of the outer function.
4. Read the output of the outer function from the y-axis of its graph. This is the output of the composite function.

Example 1.3.6: Using a Graph to Evaluate a Composite Function

Using Figure 1.3.3, evaluate f (g(1)).

Figure 1.3.3 : Two graphs of a positive and negative parabola.


Solution
To evaluate f (g(1)), we start with the inside evaluation. See Figure 1.3.4.

Figure 1.3.4 : Two graphs of a positive parabola g(x) and a negative parabola f (x). The following points are plotted: g(1) = 3
and f (3) = 6.
We evaluate g(1) using the graph of g(x), finding the input of 1 on the x-axis and finding the output value of the graph at that
input. Here, g(1) = 3 . We use this value as the input to the function f .

f (g(1)) = f (3)

We can then evaluate the composite function by looking to the graph of f (x), finding the input of 3 on the x-axis and reading
the output value of the graph at this input. Here, f (3) = 6 , so f (g(1)) = 6 .
Analysis
Figure 1.3.5 shows how we can mark the graphs with arrows to trace the path from the input value to the output value.

Figure 1.3.5 : Two graphs of a positive and negative parabola.

Exercise 1.3.4

Using Figure 1.3.3, evaluate g(f (2)).

Answer
g(f (2)) = g(5) = 3

Evaluating Composite Functions Using Formulas


When evaluating a composite function where we have either created or been given formulas, the rule of working from the inside
out remains the same. The input value to the outer function will be the output of the inner function, which may be a numerical
value, a variable name, or a more complicated expression.
While we can compose the functions for each individual input value, it is sometimes helpful to find a single formula that will
calculate the result of a composition f (g(x)). To do this, we will extend our idea of function evaluation. Recall that, when we
evaluate a function like f (t) = t^2 − t, we substitute the value inside the parentheses into the formula wherever we see the input
variable.

How To...
Given a formula for a composite function, evaluate the function.
1. Evaluate the inside function using the input value or variable provided.
2. Use the resulting output as the input to the outside function.

Example 1.3.7: Evaluating a Composition of Functions Expressed as Formulas with a Numerical Input

Given f (t) = t^2 − t and h(x) = 3x + 2, evaluate f (h(1)).
Solution
Because the inside expression is h(1), we start by evaluating h(x) at 1.

h(1) = 3(1) + 2

h(1) = 5

Then f (h(1)) = f (5), so we evaluate f (t) at an input of 5.


f (h(1)) = f (5)

f (h(1)) = 5^2 − 5

f (h(1)) = 20

Analysis
It makes no difference what the input variables t and x were called in this problem because we evaluated for specific
numerical values.
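
The same inside-out evaluation can be written in Python, using the two formulas from this example. The intermediate variable inner is introduced here only to make the order of evaluation visible.

```python
def f(t):
    return t ** 2 - t

def h(x):
    return 3 * x + 2

inner = h(1)      # evaluate the inside function first
print(inner)      # 5
print(f(inner))   # 20
print(f(h(1)))    # 20, the same calculation written as one expression
```

As the Analysis notes, the parameter names t and x play no role in the result; only the numerical value passed from h to f matters.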

Exercise 1.3.5

Given f (t) = t^2 − t and h(x) = 3x + 2, evaluate
a. h(f (2))
b. h(f (−2))

Answer a
8
Answer b
20

Finding the Domain of a Composite Function
As we discussed previously, the domain of a composite function such as f ∘ g depends on both the domain of g and the domain of
f. It is important to know when we can apply a composite function and when we cannot, that is, to know the domain of a function
such as f ∘ g. Let us assume we know the domains of the functions f and g separately. If we write the composite function for an
input x as f (g(x)), we can see right away that x must be a member of the domain of g in order for the expression to be
meaningful, because otherwise we cannot complete the inner function evaluation. However, we also see that g(x) must be a
member of the domain of f; otherwise the second function evaluation in f (g(x)) cannot be completed, and the expression is still
undefined. Thus the domain of f ∘ g consists of only those inputs in the domain of g that produce outputs from g belonging to the
domain of f. Note that the domain of f composed with g is the set of all x such that x is in the domain of g and g(x) is in the
domain of f.

Definition: Domain of a Composite Function


The domain of a composite function f (g(x)) is the set of those inputs x in the domain of g for which g(x) is in the domain
of f .

How To...
Given a function composition f (g(x)), determine its domain.
1. Find the domain of g .
2. Find the domain of f .
3. Find those inputs x in the domain of g for which g(x) is in the domain of f . That is, exclude those inputs x from the
domain of g for which g(x) is not in the domain of f . The resulting set is the domain of f ∘g.

Example 1.3.8A: Finding the Domain of a Composite Function

Find the domain of

(f ∘ g)(x) where f (x) = 5/(x − 1) and g(x) = 4/(3x − 2)

Solution
The domain of g(x) consists of all real numbers except \( x = \frac{2}{3} \), since that input value would cause us to divide by 0. Likewise,
the domain of f consists of all real numbers except 1. So we need to exclude from the domain of g(x) that value of x for
which g(x) = 1.
\[ \begin{align*} \frac{4}{3x - 2} &= 1 \\ 4 &= 3x - 2 \\ 6 &= 3x \\ x &= 2 \end{align*} \]

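So the domain of f ∘g is the set of all real numbers except \( \frac{2}{3} \) and 2. This means that
\[ x \neq \frac{2}{3} \text{ and } x \neq 2 \]
We can write this in interval notation as
\[ \left(-\infty, \frac{2}{3}\right) \cup \left(\frac{2}{3}, 2\right) \cup (2, \infty) \]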
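The two excluded inputs can be spot-checked by trying to evaluate the composite. The following is a minimal sketch, assuming exact rational arithmetic from Python's `fractions` module so that division by zero is detected reliably; the helper name `in_domain_of_composite` is hypothetical and not part of the text.

```python
from fractions import Fraction

def g(x):
    # Inner function g(x) = 4 / (3x - 2); undefined at x = 2/3.
    return 4 / (3 * x - 2)

def f(x):
    # Outer function f(x) = 5 / (x - 1); undefined at x = 1.
    return 5 / (x - 1)

def in_domain_of_composite(x):
    # x belongs to the domain of f(g(x)) only if both evaluations succeed.
    try:
        f(g(x))
    except ZeroDivisionError:
        return False
    return True

for x in (Fraction(2, 3), Fraction(2), Fraction(0), Fraction(5)):
    print(x, in_domain_of_composite(x))
```

A check like this only tests individual inputs; the algebra above is what shows that 2/3 and 2 are the only exclusions.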

Example 1.3.8B: Finding the Domain of a Composite Function Involving Radicals

Find the domain of
\[ (f \circ g)(x) \text{ where } f(x) = \sqrt{x + 2} \text{ and } g(x) = \sqrt{3 - x} \]

Solution
Because we cannot take the square root of a negative number, the domain of g is (−∞, 3]. Now we check the domain of the
composite function
\[ (f \circ g)(x) = \sqrt{\sqrt{3 - x} + 2} \]
For this expression to be defined, we need \( \sqrt{3 - x} + 2 \geq 0 \), since the radicand of a square root must be nonnegative. Because square roots
are nonnegative, \( \sqrt{3 - x} \geq 0 \) wherever it is defined, so \( \sqrt{3 - x} + 2 \geq 2 \) holds automatically. The only restriction is therefore \( 3 - x \geq 0 \), which gives a domain of (−∞, 3].
Analysis
This example shows that knowledge of the range of functions (specifically the inner function) can also be helpful in finding the
domain of a composite function. It also shows that the domain of f ∘g can contain values that are not in the domain of f ,
though they must be in the domain of g .
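The observation in the Analysis can be checked on a few inputs. This is a small sketch, assuming Python's `math.sqrt`, which raises `ValueError` for a negative argument; the helper names `composite_defined` and `f_defined` are hypothetical.

```python
import math

def g(x):
    # Inner function g(x) = sqrt(3 - x); defined for x <= 3.
    return math.sqrt(3 - x)

def f(x):
    # Outer function f(x) = sqrt(x + 2); defined for x >= -2.
    return math.sqrt(x + 2)

def composite_defined(x):
    # True when f(g(x)) can be evaluated over the real numbers.
    try:
        f(g(x))
    except ValueError:
        return False
    return True

def f_defined(x):
    # True when f(x) alone can be evaluated over the real numbers.
    try:
        f(x)
    except ValueError:
        return False
    return True

print(composite_defined(3))      # endpoint of the domain of g
print(composite_defined(3.5))    # outside the domain of g
print(composite_defined(-5), f_defined(-5))
```

The input −5 lies in the domain of f ∘g even though it is not in the domain of f, which is the point made in the Analysis.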

Exercise 1.3.6

Find the domain of
\[ (f \circ g)(x) \text{ where } f(x) = \frac{1}{x - 2} \text{ and } g(x) = \sqrt{x + 4} \]

Answer
[−4, 0) ∪ (0, ∞)

Decomposing a Composite Function into its Component Functions


In some cases, it is necessary to decompose a complicated function. In other words, we can write it as a composition of two simpler
functions. There may be more than one way to decompose a composite function, so we may choose the decomposition that
appears to be most expedient.

Example 1.3.9: Decomposing a Function

Write \( f(x) = \sqrt{5 - x^2} \) as the composition of two functions.

Solution
We are looking for two functions, g and h, so f(x) = g(h(x)). To do this, we look for a function inside a function in the
formula for f(x). As one possibility, we might notice that the expression \( 5 - x^2 \) is the inside of the square root. We could then
decompose the function as
\[ h(x) = 5 - x^2 \text{ and } g(x) = \sqrt{x} \]
We can check our answer by recomposing the functions.
\[ g(h(x)) = g(5 - x^2) = \sqrt{5 - x^2} \]
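The recomposition check can also be done numerically at a few sample points. This is only a sketch: the sample inputs are arbitrary choices inside the domain of f, and agreement at sample points supports the decomposition but does not prove it.

```python
import math

def h(x):
    # Inner function from the decomposition: h(x) = 5 - x^2.
    return 5 - x**2

def g(x):
    # Outer function from the decomposition: g(x) = sqrt(x).
    return math.sqrt(x)

def f(x):
    # Original function f(x) = sqrt(5 - x^2).
    return math.sqrt(5 - x**2)

# Sample inputs chosen so that 5 - x^2 is nonnegative.
samples = [-2.0, -1.0, 0.0, 0.5, 2.0]
matches = all(math.isclose(g(h(x)), f(x)) for x in samples)
print(matches)
```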

Exercise 1.3.7

Write \( f(x) = \frac{4}{3 - \sqrt{4 + x^2}} \) as the composition of two functions.

Answer
Possible answers:
\( g(x) = \sqrt{4 + x^2} \)
\( h(x) = \frac{4}{3 - x} \)
\( f = h \circ g \)

Access these online resources for additional instruction and practice with composite functions.
Composite Functions (http://openstaxcollege.org/l/compfunction)
Composite Function Notation Application (http://openstaxcollege.org/l/compfuncnot)
Composite Functions Using Graphs (http://openstaxcollege.org/l/compfuncgraph)
Decompose Functions (http://openstaxcollege.org/l/decompfunction)
Composite Function Values (http://openstaxcollege.org/l/compfuncvalue)

Key Equation
Composite function (f ∘g)(x) = f (g(x))

Key Concepts
We can perform algebraic operations on functions. See Example.
When functions are combined, the output of the first (inner) function becomes the input of the second (outer) function.
The function produced by combining two functions is a composite function. See Example and Example.
The order of function composition must be considered when interpreting the meaning of composite functions. See Example.
A composite function can be evaluated by evaluating the inner function using the given input value and then evaluating the
outer function taking as its input the output of the inner function.
A composite function can be evaluated from a table. See Example.
A composite function can be evaluated from a graph. See Example.
A composite function can be evaluated from a formula. See Example.
The domain of a composite function consists of those inputs in the domain of the inner function that correspond to outputs of
the inner function that are in the domain of the outer function. See Example and Example.
Just as functions can be combined to form a composite function, composite functions can be decomposed into simpler
functions.
Functions can often be decomposed in more than one way. See Example.

Glossary
composite function
the new function formed by function composition, when the output of one function is used as the input of another

1.3: New Functions from Old Functions is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
1.4: Composition of Functions by OpenStax is licensed CC BY 4.0. Original source: https://openstax.org/details/books/precalculus.

1.4: Exponential Functions
1.4: Exponential Functions is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.

1.4.1 https://math.libretexts.org/@go/page/4437
1.5: Inverse Functions and Logarithms
1.5: Inverse Functions and Logarithms is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.

1.5.1 https://math.libretexts.org/@go/page/4438
CHAPTER OVERVIEW

2: Limits and Derivatives


A general Calculus Textmap organized around the textbook

Calculus: Early Transcendentals


by James Stewart


Topic hierarchy
2.1: The Tangent and Velocity Problems
2.2: The Limit of a Function
2.3: Calculating Limits Using the Limit Laws
2.4: The Precise Definition of a Limit
2.5: Continuity
2.6: Limits at Infinity; Horizontal Asymptotes
2.7: Derivatives and Rates of Change
2.8: The Derivative as a Function

2: Limits and Derivatives is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.

2.1: The Tangent and Velocity Problems
2.1: The Tangent and Velocity Problems is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.

2.1.1 https://math.libretexts.org/@go/page/4440
2.2: The Limit of a Function
Learning Objectives
Using correct notation, describe the limit of a function.
Use a table of values to estimate the limit of a function or to identify when the limit does not exist.
Use a graph to estimate the limit of a function or to identify when the limit does not exist.
Define one-sided limits and provide examples.
Explain the relationship between one-sided and two-sided limits.
Using correct notation, describe an infinite limit.
Define a vertical asymptote.

The concept of a limit or limiting process, essential to the understanding of calculus, has been around for thousands of years. In
fact, early mathematicians used a limiting process to obtain better and better approximations of areas of circles. Yet, the formal
definition of a limit—as we know and understand it today—did not appear until the late 19th century. We therefore begin our quest
to understand limits, as our mathematical ancestors did, by using an intuitive approach. At the end of this chapter, armed with a
conceptual understanding of limits, we examine the formal definition of a limit.
We begin our exploration of limits by taking a look at the graphs of the functions
\[ f(x) = \frac{x^2 - 4}{x - 2}, \quad g(x) = \frac{|x - 2|}{x - 2}, \quad \text{and} \quad h(x) = \frac{1}{(x - 2)^2}, \]

which are shown in Figure 2.2.1. In particular, let’s focus our attention on the behavior of each graph at and around x = 2 .

Figure 2.2.1 : These graphs show the behavior of three different functions around x = 2 .
Each of the three functions is undefined at x = 2 , but if we make this statement and no other, we give a very incomplete picture of
how each function behaves in the vicinity of x = 2 . To express the behavior of each graph in the vicinity of 2 more completely, we
need to introduce the concept of a limit.

Intuitive Definition of a Limit


Let’s first take a closer look at how the function \( f(x) = (x^2 - 4)/(x - 2) \) behaves around x = 2 in Figure 2.2.1. As the values of
x approach 2 from either side of 2, the values of y = f(x) approach 4. Mathematically, we say that the limit of f(x) as x
approaches 2 is 4. Symbolically, we express this limit as

2.2.1 https://math.libretexts.org/@go/page/4441
\[ \lim_{x \to 2} f(x) = 4 \]

From this very brief informal look at one limit, let’s start to develop an intuitive definition of the limit. We can think of the limit of a
function at a number a as being the one real number L that the functional values approach as the x-values approach a , provided
such a real number L exists. Stated more carefully, we have the following definition:

Definition (Intuitive): Limit

Let f(x) be a function defined at all values in an open interval containing a, with the possible exception of a itself, and let L
be a real number. If all values of the function f(x) approach the real number L as the values of x (≠ a) approach the number
a, then we say that the limit of f(x) as x approaches a is L. (More succinctly, as x gets closer to a, f(x) gets closer and stays
close to L.) Symbolically, we express this idea as
\[ \lim_{x \to a} f(x) = L. \tag{2.2.1} \]

We can estimate limits by constructing tables of functional values and by looking at their graphs. This process is described in the
following Problem-Solving Strategy.

Problem-Solving Strategy: Evaluating a Limit Using a Table of Functional Values

1. To evaluate \( \lim_{x \to a} f(x) \), we begin by completing a table of functional values. We should choose two sets of x-values: one set
of values approaching a and less than a, and another set of values approaching a and greater than a. Table 2.2.1 demonstrates
what your tables might look like.
Table 2.2.1
x             f(x)             x             f(x)
a − 0.1       f(a − 0.1)       a + 0.1       f(a + 0.1)
a − 0.01      f(a − 0.01)      a + 0.01      f(a + 0.01)
a − 0.001     f(a − 0.001)     a + 0.001     f(a + 0.001)
a − 0.0001    f(a − 0.0001)    a + 0.0001    f(a + 0.0001)
Use additional values as necessary.          Use additional values as necessary.

2. Next, let’s look at the values in each of the f (x) columns and determine whether the values seem to be approaching a single
value as we move down each column. In our columns, we look at the sequence f (a − 0.1) , f (a − 0.01) , f (a − 0.001),
f (a − 0.0001), and so on, and f (a + 0.1), f (a + 0.01), f (a + 0.001), f (a + 0.0001), and so on. (Note: Although we have

chosen the x-values a ± 0.1, a ± 0.01, a ± 0.001, a ± 0.0001 , and so forth, and these values will probably work nearly
every time, on very rare occasions we may need to modify our choices.)
3. If both columns approach a common y-value L, we state \( \lim_{x \to a} f(x) = L \). We can use the following strategy to confirm the
result obtained from the table or as an alternative method for estimating a limit.
4. Using a graphing calculator or computer software that allows us to graph functions, we can plot the function f(x), making sure
the functional values of f(x) for x-values near a are in our window. We can use the trace feature to move along the graph of
the function and watch the y-value readout as the x-values approach a. If the y-values approach L as our x-values approach a
from both directions, then \( \lim_{x \to a} f(x) = L \). We may need to zoom in on our graph and repeat this process several times.
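The table-building steps of this strategy can be sketched in a few lines of code. The function name `limit_table` and its `rows` parameter are hypothetical, introduced only for illustration; the sample function is f(x) = (x² − 4)/(x − 2) from the start of this section, with a = 2.

```python
def limit_table(f, a, rows=4):
    # Build rows (a - step, f(a - step), a + step, f(a + step))
    # for step = 0.1, 0.01, 0.001, ... as in Table 2.2.1.
    table = []
    for k in range(1, rows + 1):
        step = 10.0 ** (-k)
        table.append((a - step, f(a - step), a + step, f(a + step)))
    return table

def f(x):
    # Sample function; it is undefined at x = 2 itself.
    return (x**2 - 4) / (x - 2)

for left_x, left_y, right_x, right_y in limit_table(f, 2.0):
    print(f"{left_x:<10.4f}{left_y:<14.8f}{right_x:<10.4f}{right_y:<14.8f}")
```

Both columns of output move toward 4, which matches the limit read from the graph earlier. As with any table, this is an estimate rather than a proof.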

We apply this Problem-Solving Strategy to compute a limit in Examples 2.2.1A and 2.2.1B.

Example 2.2.1A: Evaluating a Limit Using a Table of Functional Values


Evaluate \( \lim_{x \to 0} \frac{\sin x}{x} \) using a table of functional values.

Solution

We have calculated the values of \( f(x) = \frac{\sin x}{x} \) for the values of x listed in Table 2.2.2.

Table 2.2.2
x          (sin x)/x          x          (sin x)/x
-0.1       0.998334166468     0.1        0.998334166468
-0.01      0.999983333417     0.01       0.999983333417
-0.001     0.999999833333     0.001      0.999999833333
-0.0001    0.999999998333     0.0001     0.999999998333

Note: The values in this table were obtained using a calculator and using all the places given in the calculator output.
As we read down each column, we see that the values in each column appear to be approaching one. Thus, it is fairly
reasonable to conclude that \( \lim_{x \to 0} \frac{\sin x}{x} = 1 \). A calculator- or computer-generated graph of \( f(x) = \frac{\sin x}{x} \) would be similar to
that shown in Figure 2.2.2, and it confirms our estimate.

Figure 2.2.2 : The graph of f (x) = (sin x)/x confirms the estimate from Table 2.2.2 .
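The entries of Table 2.2.2 can be regenerated with a short script. This is a sketch using Python's `math.sin` (with x in radians); the name `sin_ratio` is a hypothetical helper, and the printed digits may differ from a calculator's in the last places.

```python
import math

def sin_ratio(x):
    # The function f(x) = (sin x)/x; undefined at x = 0.
    return math.sin(x) / x

steps = [0.1, 0.01, 0.001, 0.0001]
rows = [(-s, sin_ratio(-s), s, sin_ratio(s)) for s in steps]
for left_x, left_y, right_x, right_y in rows:
    print(f"{left_x:<10}{left_y:<18.12f}{right_x:<10}{right_y:<18.12f}")
```

Because (sin x)/x is an even function, the left and right columns agree, just as they do in the table.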

Example 2.2.1B: Evaluating a Limit Using a Table of Functional Values



Evaluate \( \lim_{x \to 4} \frac{\sqrt{x} - 2}{x - 4} \) using a table of functional values.

Solution
As before, we use a table—in this case, Table 2.2.3—to list the values of the function for the given values of x.
Table 2.2.3
x          (√x − 2)/(x − 4)     x          (√x − 2)/(x − 4)
3.9        0.251582341869       4.1        0.248456731317
3.99       0.25015644562        4.01       0.24984394501
3.999      0.250015627          4.001      0.249984377
3.9999     0.250001563          4.0001     0.249998438
3.99999    0.25000016           4.00001    0.24999984

After inspecting this table, we see that the functional values for x less than 4 appear to be decreasing toward 0.25, whereas the
functional values for x greater than 4 appear to be increasing toward 0.25. We conclude that \( \lim_{x \to 4} \frac{\sqrt{x} - 2}{x - 4} = 0.25 \). We confirm this
estimate using the graph of \( f(x) = \frac{\sqrt{x} - 2}{x - 4} \) shown in Figure 2.2.3.

Figure 2.2.3 : The graph of \( f(x) = \frac{\sqrt{x} - 2}{x - 4} \) confirms the estimate from Table 2.2.3 .
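The value 0.25 is also consistent with a short algebraic simplification. The step below is offered as a plausibility check, not as the method of this section; the limit laws that justify the final evaluation are developed in the next section. For x ≥ 0 with x ≠ 4, the denominator factors as a difference of squares:

```latex
\[
\frac{\sqrt{x} - 2}{x - 4}
  = \frac{\sqrt{x} - 2}{(\sqrt{x} - 2)(\sqrt{x} + 2)}
  = \frac{1}{\sqrt{x} + 2},
\qquad\text{and}\qquad
\frac{1}{\sqrt{4} + 2} = \frac{1}{4} = 0.25 .
\]
```

The simplified form is slightly larger than 0.25 for x < 4 and slightly smaller for x > 4, which matches the pattern in Table 2.2.3.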

Exercise 2.2.1
Estimate \( \lim_{x \to 1} \frac{\frac{1}{x} - 1}{x - 1} \) using a table of functional values. Use a graph to confirm your estimate.

Hint
Use 0.9, 0.99, 0.999, 0.9999, 0.99999 and 1.1, 1.01, 1.001, 1.0001, 1.00001 as your table values.
Answer
\( \lim_{x \to 1} \frac{\frac{1}{x} - 1}{x - 1} = -1 \)

At this point, we see from Examples 2.2.1A and 2.2.1B that it may be just as easy, if not easier, to estimate a limit of a function by
inspecting its graph as it is to estimate the limit by using a table of functional values. In Example 2.2.2, we evaluate a limit
exclusively by looking at a graph rather than by using a table of functional values.

Example 2.2.2: Evaluating a Limit Using a Graph


For g(x) shown in Figure 2.2.4, evaluate \( \lim_{x \to -1} g(x) \).

Figure 2.2.4 : The graph of g(x) includes one value not on a smooth curve.
Solution:
Despite the fact that g(−1) = 4, as the x-values approach −1 from either side, the g(x) values approach 3. Therefore,
\( \lim_{x \to -1} g(x) = 3 \). Note that we can determine this limit without even knowing the algebraic expression of the function.

Based on Example 2.2.2, we make the following observation: It is possible for the limit of a function to exist at a point, and for the
function to be defined at this point, but the limit of the function and the value of the function at the point may be different.

Exercise 2.2.2

Use the graph of h(x) in Figure 2.2.5 to evaluate \( \lim_{x \to 2} h(x) \), if possible.

Figure 2.2.5 :

Hint
What y-value does the function approach as the x -values approach 2?
Solution
\( \lim_{x \to 2} h(x) = -1 \).

Looking at a table of functional values or looking at the graph of a function provides us with useful insight into the value of the
limit of a function at a given point. However, these techniques rely too much on guesswork. We eventually need to develop
alternative methods of evaluating limits. These new methods are more algebraic in nature and we explore them in the next section;
however, at this point we introduce two special limits that are foundational to the techniques to come.

Two Important Limits

Let a be a real number and c be a constant.
i. \( \lim_{x \to a} x = a \)
ii. \( \lim_{x \to a} c = c \)

We can make the following observations about these two limits.


i. For the first limit, observe that as x approaches a, so does f(x), because f(x) = x. Consequently, \( \lim_{x \to a} x = a \).
ii. For the second limit, consider Table 2.2.4.

Table 2.2.4
x             f(x) = c     x             f(x) = c
a − 0.1       c            a + 0.1       c
a − 0.01      c            a + 0.01      c
a − 0.001     c            a + 0.001     c
a − 0.0001    c            a + 0.0001    c

Observe that for all values of x (regardless of whether they are approaching a), the values f(x) remain constant at c. We have
no choice but to conclude \( \lim_{x \to a} c = c \).

The Existence of a Limit


As we consider the limit in the next example, keep in mind that for the limit of a function to exist at a point, the functional values
must approach a single real-number value at that point. If the functional values do not approach a single value, then the limit does
not exist.

Example 2.2.3: Evaluating a Limit That Fails to Exist

Evaluate \( \lim_{x \to 0} \sin(1/x) \) using a table of values.

Solution
Table 2.2.5 lists values for the function sin(1/x) for the given values of x.
Table 2.2.5
x sin(1/x) x sin(1/x)

-0.1 0.544021110889 0.1 −0.544021110889

-0.01 0.50636564111 0.01 −0.50636564111

-0.001 −0.8268795405312 0.001 0.8268795405312

-0.0001 0.305614388888 0.0001 −0.305614388888

-0.00001 −0.035748797987 0.00001 0.035748797987

-0.000001 0.349993504187 0.000001 −0.349993504187

After examining the table of functional values, we can see that the y-values do not seem to approach any one single value. It
appears the limit does not exist. Before drawing this conclusion, let’s take a more systematic approach. Take the following
sequence of x-values approaching 0:
\[ \frac{2}{\pi}, \; \frac{2}{3\pi}, \; \frac{2}{5\pi}, \; \frac{2}{7\pi}, \; \frac{2}{9\pi}, \; \frac{2}{11\pi}, \; \ldots \]
The corresponding y-values are
\[ 1, \; -1, \; 1, \; -1, \; 1, \; -1, \; \ldots \]
At this point we can indeed conclude that \( \lim_{x \to 0} \sin(1/x) \) does not exist. (Mathematicians frequently abbreviate “does not exist”
as DNE. Thus, we would write \( \lim_{x \to 0} \sin(1/x) \) DNE.) The graph of f(x) = sin(1/x) is shown in Figure 2.2.6 and it gives a
clearer picture of the behavior of sin(1/x) as x approaches 0. You can see that sin(1/x) oscillates ever more wildly between
−1 and 1 as x approaches 0.

Figure 2.2.6 : The graph of f (x) = sin(1/x) oscillates rapidly between −1 and 1 as x approaches 0.
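The alternating pattern along the sequence 2/π, 2/(3π), 2/(5π), … can be reproduced directly. This is a minimal sketch using floating-point arithmetic, so the computed values are rounded before printing; the variable names are illustrative only.

```python
import math

def f(x):
    # The function f(x) = sin(1/x); undefined at x = 0.
    return math.sin(1 / x)

# x-values 2/pi, 2/(3 pi), 2/(5 pi), ... which shrink toward 0.
xs = [2 / ((2 * k + 1) * math.pi) for k in range(6)]
ys = [f(x) for x in xs]

for x, y in zip(xs, ys):
    print(f"{x:.6f}  {round(y, 6)}")
```

Since the outputs keep switching between 1 and −1 no matter how close x gets to 0, no single value can serve as the limit.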

Exercise 2.2.3
Use a table of functional values to evaluate \( \lim_{x \to 2} \frac{|x^2 - 4|}{x - 2} \), if possible.

Hint
Use x -values 1.9, 1.99, 1.999, 1.9999, 1.99999 and 2.1, 2.01, 2.001, 2.0001, 2.00001 in your table.
Answer
\( \lim_{x \to 2} \frac{|x^2 - 4|}{x - 2} \) does not exist.

One-Sided Limits
Sometimes indicating that the limit of a function fails to exist at a point does not provide us with enough information about the
behavior of the function at that particular point. To see this, we now revisit the function g(x) = |x − 2|/(x − 2) introduced at the
beginning of the section (see Figure 2.2.1(b)). As we pick values of x close to 2, g(x) does not approach a single value, so the
limit as x approaches 2 does not exist, that is, \( \lim_{x \to 2} g(x) \) DNE. However, this statement alone does not give us a complete picture
of the behavior of the function around the x-value 2. To provide a more accurate description, we introduce the idea of a one-sided
limit. For all values to the left of 2 (or the negative side of 2), g(x) = −1. Thus, as x approaches 2 from the left, g(x) approaches
−1. Mathematically, we say that the limit as x approaches 2 from the left is −1. Symbolically, we express this idea as
\[ \lim_{x \to 2^-} g(x) = -1. \]
Similarly, as x approaches 2 from the right (or from the positive side), g(x) approaches 1. Symbolically, we express this idea as
\[ \lim_{x \to 2^+} g(x) = 1. \]

We can now present an informal definition of one-sided limits.

Definition: One-sided Limits

We define two types of one-sided limits.


Limit from the left:
Let f(x) be a function defined at all values in an open interval of the form (z, a), and let L be a real number. If the values of
the function f(x) approach the real number L as the values of x (where x < a) approach the number a, then we say that L is
the limit of f(x) as x approaches a from the left. Symbolically, we express this idea as
\[ \lim_{x \to a^-} f(x) = L. \]
Limit from the right:
Let f(x) be a function defined at all values in an open interval of the form (a, c), and let L be a real number. If the values of
the function f(x) approach the real number L as the values of x (where x > a) approach the number a, then we say that L is
the limit of f(x) as x approaches a from the right. Symbolically, we express this idea as
\[ \lim_{x \to a^+} f(x) = L. \]

Example 2.2.4: Evaluating One-Sided Limits

For the function \( f(x) = \begin{cases} x + 1, & \text{if } x < 2 \\ x^2 - 4, & \text{if } x \geq 2 \end{cases} \), evaluate each of the following limits.
a. \( \lim_{x \to 2^-} f(x) \)
b. \( \lim_{x \to 2^+} f(x) \)

Solution
We can use tables of functional values again. Observe in Table 2.2.6 that for values of x less than 2, we use f(x) = x + 1 and
for values of x greater than 2, we use \( f(x) = x^2 - 4 \).

Table 2.2.6
x          f(x) = x + 1     x          f(x) = x² − 4
1.9        2.9              2.1        0.41
1.99       2.99             2.01       0.0401
1.999      2.999            2.001      0.004001
1.9999     2.9999           2.0001     0.00040001
1.99999    2.99999          2.00001    0.0000400001

Based on this table, we can conclude that a. \( \lim_{x \to 2^-} f(x) = 3 \) and b. \( \lim_{x \to 2^+} f(x) = 0 \). Therefore, the (two-sided) limit of f(x)
does not exist at x = 2. Figure 2.2.7 shows a graph of f(x) and reinforces our conclusion about these limits.

Figure 2.2.7 : The graph of \( f(x) = \begin{cases} x + 1, & \text{if } x < 2 \\ x^2 - 4, & \text{if } x \geq 2 \end{cases} \) has a break at x = 2.
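The two columns of Table 2.2.6 can be generated from the piecewise definition. This is a sketch under the assumption that sampling at a ± 0.1, a ± 0.01, and so on is adequate for this function; `one_sided_values` is a hypothetical helper name.

```python
def f(x):
    # Piecewise function from Example 2.2.4.
    return x + 1 if x < 2 else x**2 - 4

def one_sided_values(func, a, side, rows=5):
    # Sample func at a - 10^-k (side "left") or a + 10^-k (side "right").
    sign = -1 if side == "left" else 1
    return [func(a + sign * 10.0 ** (-k)) for k in range(1, rows + 1)]

left = one_sided_values(f, 2, "left")
right = one_sided_values(f, 2, "right")
print(left)
print(right)
```

The left-hand samples settle near 3 and the right-hand samples near 0, in agreement with parts a and b.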

Exercise 2.2.4
Use a table of functional values to estimate the following limits, if possible.
a. \( \lim_{x \to 2^-} \frac{|x^2 - 4|}{x - 2} \)
b. \( \lim_{x \to 2^+} \frac{|x^2 - 4|}{x - 2} \)

Hint
Use x-values 1.9, 1.99, 1.999, 1.9999, 1.99999 to estimate \( \lim_{x \to 2^-} \frac{|x^2 - 4|}{x - 2} \).
Use x-values 2.1, 2.01, 2.001, 2.0001, 2.00001 to estimate \( \lim_{x \to 2^+} \frac{|x^2 - 4|}{x - 2} \).
(These tables are available from a previous Checkpoint problem.)

Solution a
\( \lim_{x \to 2^-} \frac{|x^2 - 4|}{x - 2} = -4 \)

Solution b
\( \lim_{x \to 2^+} \frac{|x^2 - 4|}{x - 2} = 4 \)

Let us now consider the relationship between the limit of a function at a point and the limits from the right and left at that point. It
seems clear that if the limit from the right and the limit from the left have a common value, then that common value is the limit of
the function at that point. Similarly, if the limit from the left and the limit from the right take on different values, the limit of the
function does not exist. These conclusions are summarized in Note.

Relating One-Sided and Two-Sided Limits

Let f(x) be a function defined at all values in an open interval containing a, with the possible exception of a itself, and let L
be a real number. Then,
\[ \lim_{x \to a} f(x) = L \]
if and only if \( \lim_{x \to a^-} f(x) = L \) and \( \lim_{x \to a^+} f(x) = L \).
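The relationship between one-sided and two-sided limits suggests a rough numerical heuristic: sample the function just to the left and just to the right of a and compare. The sketch below is only that, a heuristic; the name `estimate_two_sided` and the `step` and `tol` values are assumptions chosen for illustration, and a single pair of samples can be misleading for functions such as sin(1/x).

```python
def estimate_two_sided(func, a, step=1e-6, tol=1e-3):
    # Compare one sample on each side of a. If they roughly agree,
    # return their average as an estimate; otherwise return None.
    left = func(a - step)
    right = func(a + step)
    if abs(left - right) <= tol:
        return (left + right) / 2
    return None

def g(x):
    # g(x) = |x - 2| / (x - 2): one-sided limits are -1 and 1 at x = 2.
    return abs(x - 2) / (x - 2)

def f(x):
    # f(x) = (x^2 - 4) / (x - 2): both one-sided limits are 4 at x = 2.
    return (x**2 - 4) / (x - 2)

print(estimate_two_sided(g, 2))
print(estimate_two_sided(f, 2))
```

For g the samples disagree, so no estimate is returned, mirroring the conclusion that the two-sided limit does not exist; for f the samples agree near 4.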

Infinite Limits
Evaluating the limit of a function at a point or evaluating the limit of a function from the right and left at a point helps us to
characterize the behavior of a function around a given value. As we shall see, we can also describe the behavior of functions that do
not have finite limits.
We now turn our attention to \( h(x) = 1/(x - 2)^2 \), the third and final function introduced at the beginning of this section (see Figure
2.2.1(c)). From its graph we see that as the values of x approach 2, the values of \( h(x) = 1/(x - 2)^2 \) become larger and larger and,
in fact, become infinite. Mathematically, we say that the limit of h(x) as x approaches 2 is positive infinity. Symbolically, we
express this idea as
\[ \lim_{x \to 2} h(x) = +\infty. \]

More generally, we define infinite limits as follows:

Definitions: Infinite Limits
We define three types of infinite limits.
Infinite limits from the left: Let f(x) be a function defined at all values in an open interval of the form (b, a).
i. If the values of f(x) increase without bound as the values of x (where x < a) approach the number a, then we say that
the limit as x approaches a from the left is positive infinity and we write
\[ \lim_{x \to a^-} f(x) = +\infty. \]
ii. If the values of f(x) decrease without bound as the values of x (where x < a) approach the number a, then we say
that the limit as x approaches a from the left is negative infinity and we write
\[ \lim_{x \to a^-} f(x) = -\infty. \]
Infinite limits from the right: Let f(x) be a function defined at all values in an open interval of the form (a, c).
i. If the values of f(x) increase without bound as the values of x (where x > a) approach the number a, then we say that
the limit as x approaches a from the right is positive infinity and we write
\[ \lim_{x \to a^+} f(x) = +\infty. \]
ii. If the values of f(x) decrease without bound as the values of x (where x > a) approach the number a, then we say
that the limit as x approaches a from the right is negative infinity and we write
\[ \lim_{x \to a^+} f(x) = -\infty. \]
Two-sided infinite limit: Let f(x) be defined for all x ≠ a in an open interval containing a.
i. If the values of f(x) increase without bound as the values of x (where x ≠ a) approach the number a, then we say that
the limit as x approaches a is positive infinity and we write
\[ \lim_{x \to a} f(x) = +\infty. \]
ii. If the values of f(x) decrease without bound as the values of x (where x ≠ a) approach the number a, then we say
that the limit as x approaches a is negative infinity and we write
\[ \lim_{x \to a} f(x) = -\infty. \]

It is important to understand that when we write statements such as \( \lim_{x \to a} f(x) = +\infty \) or \( \lim_{x \to a} f(x) = -\infty \) we are describing the
behavior of the function, as we have just defined it. We are not asserting that a limit exists. For the limit of a function f(x) to exist
at a, it must approach a real number L as x approaches a. That said, if, for example, \( \lim_{x \to a} f(x) = +\infty \), we always write
\( \lim_{x \to a} f(x) = +\infty \) rather than \( \lim_{x \to a} f(x) \) DNE.

Example 2.2.5: Recognizing an Infinite Limit

Evaluate each of the following limits, if possible. Use a table of functional values and graph f(x) = 1/x to confirm your
conclusion.
a. \( \lim_{x \to 0^-} \frac{1}{x} \)
b. \( \lim_{x \to 0^+} \frac{1}{x} \)
c. \( \lim_{x \to 0} \frac{1}{x} \)

Solution
Begin by constructing a table of functional values.

Table 2.2.7
x            1/x           x           1/x
-0.1         -10           0.1         10
-0.01        -100          0.01        100
-0.001       -1000         0.001       1000
-0.0001      -10,000       0.0001      10,000
-0.00001     -100,000      0.00001     100,000
-0.000001    -1,000,000    0.000001    1,000,000

a. The values of 1/x decrease without bound as x approaches 0 from the left. We conclude that
\[ \lim_{x \to 0^-} \frac{1}{x} = -\infty. \]
b. The values of 1/x increase without bound as x approaches 0 from the right. We conclude that
\[ \lim_{x \to 0^+} \frac{1}{x} = +\infty. \]
c. Since \( \lim_{x \to 0^-} \frac{1}{x} = -\infty \) and \( \lim_{x \to 0^+} \frac{1}{x} = +\infty \) have different values, we conclude that
\[ \lim_{x \to 0} \frac{1}{x} \text{ DNE.} \]

The graph of f (x) = 1/x in Figure 2.2.8 confirms these conclusions.

Figure 2.2.8 : The graph of f (x) = 1/x confirms that the limit as x approaches 0 does not exist.

Exercise 2.2.5
Evaluate each of the following limits, if possible. Use a table of functional values and graph \( f(x) = 1/x^2 \) to confirm your
conclusion.
a. \( \lim_{x \to 0^-} \frac{1}{x^2} \)
b. \( \lim_{x \to 0^+} \frac{1}{x^2} \)
c. \( \lim_{x \to 0} \frac{1}{x^2} \)

Hint
Follow the procedures from Example 2.2.5.
Answer
a. \( \lim_{x \to 0^-} \frac{1}{x^2} = +\infty \);
b. \( \lim_{x \to 0^+} \frac{1}{x^2} = +\infty \);
c. \( \lim_{x \to 0} \frac{1}{x^2} = +\infty \)

It is useful to point out that functions of the form \( f(x) = 1/(x - a)^n \), where n is a positive integer, have infinite limits as
x approaches a from either the left or right (Figure 2.2.9). These limits are summarized in Infinite Limits from Positive Integers below.

Figure 2.2.9 : The function \( f(x) = 1/(x - a)^n \) has infinite limits at a.

Infinite Limits from Positive Integers


If n is a positive even integer, then
\[ \lim_{x \to a} \frac{1}{(x - a)^n} = +\infty. \tag{2.2.2} \]
If n is a positive odd integer, then
\[ \lim_{x \to a^+} \frac{1}{(x - a)^n} = +\infty \tag{2.2.3} \]
and
\[ \lim_{x \to a^-} \frac{1}{(x - a)^n} = -\infty. \tag{2.2.4} \]
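The sign pattern in Equations 2.2.2 through 2.2.4 can be seen by evaluating 1/(x − a)ⁿ just to the left and right of a. This is an illustrative sketch only; the choices a = 2 and a step of 0.001, and the name `reciprocal_power`, are assumptions made for the example.

```python
def reciprocal_power(x, a, n):
    # The function 1 / (x - a)^n for a positive integer n.
    return 1 / (x - a) ** n

a = 2.0
step = 1e-3
for n in (1, 2, 3, 4):
    left = reciprocal_power(a - step, a, n)
    right = reciprocal_power(a + step, a, n)
    # Even n: both values are large and positive.
    # Odd n: the left value is large and negative, the right large and positive.
    print(n, left, right)
```

Shrinking `step` makes the magnitudes grow without changing the signs, which is the behavior the equations describe.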

We should also point out that in the graphs of \( f(x) = 1/(x - a)^n \), points on the graph having x-coordinates very near to a are
very close to the vertical line x = a. That is, as x approaches a, the points on the graph of f(x) are closer to the line x = a. The
line x = a is called a vertical asymptote of the graph. We formally define a vertical asymptote as follows:

Definition: Vertical Asymptotes

Let f(x) be a function. If any of the following conditions hold, then the line x = a is a vertical asymptote of f(x).
\[ \lim_{x \to a^-} f(x) = +\infty \]
\[ \lim_{x \to a^-} f(x) = -\infty \]
\[ \lim_{x \to a^+} f(x) = +\infty \]
\[ \lim_{x \to a^+} f(x) = -\infty \]
\[ \lim_{x \to a} f(x) = +\infty \]
\[ \lim_{x \to a} f(x) = -\infty \]

Example 2.2.6: Finding a Vertical Asymptote

Evaluate each of the following limits using Equations 2.2.2, 2.2.3, and 2.2.4 above. Identify any vertical asymptotes of the
function \( f(x) = 1/(x + 3)^4 \).
a. \( \lim_{x \to -3^-} \frac{1}{(x + 3)^4} \)
b. \( \lim_{x \to -3^+} \frac{1}{(x + 3)^4} \)
c. \( \lim_{x \to -3} \frac{1}{(x + 3)^4} \)

Solution
We can use the above equations directly.
a. \( \lim_{x \to -3^-} \frac{1}{(x + 3)^4} = +\infty \)
b. \( \lim_{x \to -3^+} \frac{1}{(x + 3)^4} = +\infty \)
c. \( \lim_{x \to -3} \frac{1}{(x + 3)^4} = +\infty \)
The function \( f(x) = 1/(x + 3)^4 \) has a vertical asymptote of x = −3.

Exercise 2.2.6
Evaluate each of the following limits. Identify any vertical asymptotes of the function \( f(x) = \frac{1}{(x - 2)^3} \).
a. \( \lim_{x \to 2^-} \frac{1}{(x - 2)^3} \)
b. \( \lim_{x \to 2^+} \frac{1}{(x - 2)^3} \)
c. \( \lim_{x \to 2} \frac{1}{(x - 2)^3} \)

Answer a
\( \lim_{x \to 2^-} \frac{1}{(x - 2)^3} = -\infty \)

Answer b
\( \lim_{x \to 2^+} \frac{1}{(x - 2)^3} = +\infty \)

Answer c
\( \lim_{x \to 2} \frac{1}{(x - 2)^3} \) DNE. The line x = 2 is the vertical asymptote of \( f(x) = 1/(x - 2)^3 \).

In the next example we put our knowledge of various types of limits to use to analyze the behavior of a function at several different
points.

Example 2.2.7: Behavior of a Function at Different Points

Use the graph of f(x) in Figure 2.2.10 to determine each of the following values:
a. \( \lim_{x \to -4^-} f(x) \); \( \lim_{x \to -4^+} f(x) \); \( \lim_{x \to -4} f(x) \); f(−4)
b. \( \lim_{x \to -2^-} f(x) \); \( \lim_{x \to -2^+} f(x) \); \( \lim_{x \to -2} f(x) \); f(−2)
c. \( \lim_{x \to 1^-} f(x) \); \( \lim_{x \to 1^+} f(x) \); \( \lim_{x \to 1} f(x) \); f(1)
d. \( \lim_{x \to 3^-} f(x) \); \( \lim_{x \to 3^+} f(x) \); \( \lim_{x \to 3} f(x) \); f(3)

Figure 2.2.10 : The graph shows f (x).


Solution
Using the definitions above and the graph for reference, we arrive at the following values:
a. \( \lim_{x \to -4^-} f(x) = 0 \); \( \lim_{x \to -4^+} f(x) = 0 \); \( \lim_{x \to -4} f(x) = 0 \); f(−4) = 0
b. \( \lim_{x \to -2^-} f(x) = 3 \); \( \lim_{x \to -2^+} f(x) = 3 \); \( \lim_{x \to -2} f(x) = 3 \); f(−2) is undefined
c. \( \lim_{x \to 1^-} f(x) = 6 \); \( \lim_{x \to 1^+} f(x) = 3 \); \( \lim_{x \to 1} f(x) \) DNE; f(1) = 6
d. \( \lim_{x \to 3^-} f(x) = -\infty \); \( \lim_{x \to 3^+} f(x) = -\infty \); \( \lim_{x \to 3} f(x) = -\infty \); f(3) is undefined

Exercise 2.2.7

Evaluate \( \lim_{x \to 1} f(x) \) for f(x) shown here:

Figure 2.2.11 . The graph of a piecewise function f .

Hint
Compare the limit from the right with the limit from the left.
Answer
\( \lim_{x \to 1} f(x) \) does not exist

Example 2.2.8: Einstein’s Equation

In the Chapter opener we mentioned briefly how Albert Einstein showed that a limit exists to how fast any object can travel.
Given Einstein’s equation for the mass of a moving object
\[ m = \frac{m_0}{\sqrt{1 - \frac{v^2}{c^2}}}, \]
what is the value of this bound?

Figure 2.2.12 . (Credit:NASA)

Solution
Our starting point is Einstein’s equation for the mass of a moving object,
\[ m = \frac{m_0}{\sqrt{1 - \frac{v^2}{c^2}}}, \]
where \( m_0 \) is the object’s mass at rest, v is its speed, and c is the speed of light. To see how the mass changes at high speeds, we
can graph the ratio of masses \( m/m_0 \) as a function of the ratio of speeds, v/c (Figure 2.2.13).

Figure 2.2.13 : This graph shows the ratio of masses as a function of the ratio of speeds in Einstein’s equation for the mass of a
moving object.
We can see that as the ratio of speeds approaches 1—that is, as the speed of the object approaches the speed of light—the ratio
of masses increases without bound. In other words, the function has a vertical asymptote at v/c = 1 . We can try a few values
of this ratio to test this idea.
Table 2.2.8
−−−−−−
v2
v/c √1 − m/mo
c2

0.99 0.1411 7.089

0.999 0.0447 22.37

0.9999 0.0141 70.7

Thus, according to Table 2.2.8, if an object with a rest mass of 100 kg is traveling at 0.9999c, its mass becomes about 7071 kg. Since no object can have an infinite mass, we conclude that no object can travel at or faster than the speed of light.
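
As a quick numerical check of Table 2.2.8, the sketch below recomputes the ratio \(m/m_0 = 1/\sqrt{1 - (v/c)^2}\) for a few speed ratios. The function name `mass_ratio` and the parameter name `beta` (standing for \(v/c\)) are our own choices for illustration, not notation from the text.

```python
import math


def mass_ratio(beta):
    """Return m/m0 = 1/sqrt(1 - beta**2), where beta = v/c and 0 <= beta < 1."""
    if not 0 <= beta < 1:
        raise ValueError("beta must satisfy 0 <= beta < 1")
    return 1.0 / math.sqrt(1.0 - beta * beta)


if __name__ == "__main__":
    # The ratio grows without bound as beta approaches 1 from below.
    for beta in (0.99, 0.999, 0.9999, 0.99999):
        print(f"v/c = {beta:<8} m/m0 = {mass_ratio(beta):.3f}")
```

Trying values of `beta` ever closer to 1 shows the ratio growing without bound, which is the numerical counterpart of the vertical asymptote at \(v/c = 1\).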

Key Concepts
A table of values or graph may be used to estimate a limit.
If the limit of a function at a point does not exist, it is still possible that the limits from the left and right at that point may exist.
If the limits of a function from the left and right exist and are equal, then the limit of the function is that common value.
We may use limits to describe infinite behavior of a function at a point.

Key Equations
Intuitive Definition of the Limit
\(\lim_{x \to a} f(x) = L\)

Two Important Limits
\(\lim_{x \to a} x = a\) and \(\lim_{x \to a} c = c\)

One-Sided Limits
\(\lim_{x \to a^-} f(x) = L\) and \(\lim_{x \to a^+} f(x) = L\)

Infinite Limits from the Left
\(\lim_{x \to a^-} f(x) = +\infty\) and \(\lim_{x \to a^-} f(x) = -\infty\)

Infinite Limits from the Right
\(\lim_{x \to a^+} f(x) = +\infty\) and \(\lim_{x \to a^+} f(x) = -\infty\)

Two-Sided Infinite Limits
\(\lim_{x \to a} f(x) = +\infty\): \(\lim_{x \to a^-} f(x) = +\infty\) and \(\lim_{x \to a^+} f(x) = +\infty\)
\(\lim_{x \to a} f(x) = -\infty\): \(\lim_{x \to a^-} f(x) = -\infty\) and \(\lim_{x \to a^+} f(x) = -\infty\)

Glossary
infinite limit
A function has an infinite limit at a point a if it either increases or decreases without bound as it approaches a

intuitive definition of the limit


If all values of the function \(f(x)\) approach the real number \(L\) as the values of \(x\) (with \(x \ne a\)) approach \(a\), then \(f(x)\) approaches \(L\); we write \(\lim_{x \to a} f(x) = L\)

one-sided limit
A one-sided limit of a function is a limit taken from either the left or the right

vertical asymptote
A function has a vertical asymptote at x = a if the limit as x approaches a from the right or left is infinite

2.2: The Limit of a Function is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
2.2: The Limit of a Function by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0. Original source:
https://openstax.org/details/books/calculus-volume-1.

2.3: Calculating Limits Using the Limit Laws
 Learning Objectives
Recognize the basic limit laws.
Use the limit laws to evaluate the limit of a function.
Evaluate the limit of a function by factoring.
Use the limit laws to evaluate the limit of a polynomial or rational function.
Evaluate the limit of a function by factoring or by using conjugates.
Evaluate the limit of a function by using the squeeze theorem.

In the previous section, we evaluated limits by looking at graphs or by constructing a table of values. In this section, we establish
laws for calculating limits and learn how to apply these laws. In the Student Project at the end of this section, you have the
opportunity to apply these limit laws to derive the formula for the area of a circle by adapting a method devised by the Greek
mathematician Archimedes. We begin by restating two useful limit results from the previous section. These two results, together
with the limit laws, serve as a foundation for calculating many limits.

Evaluating Limits with the Limit Laws


The first two limit laws were stated previously and we repeat them here. These basic results, together with the other limit laws,
allow us to evaluate limits of many algebraic functions.

 Basic Limit Results


For any real number \(a\) and any constant \(c\),
I. \(\lim_{x \to a} x = a\)
II. \(\lim_{x \to a} c = c\)

 Example 2.3.1: Evaluating a Basic Limit

Evaluate each of the following limits using "Basic Limit Results."


a. \(\lim_{x \to 2} x\)
b. \(\lim_{x \to 2} 5\)

Solution
a. The limit of \(x\) as \(x\) approaches \(a\) is \(a\): \(\lim_{x \to 2} x = 2\).
b. The limit of a constant is that constant: \(\lim_{x \to 2} 5 = 5\).

We now take a look at the limit laws, the individual properties of limits. The proofs that these laws hold are omitted here.

 Limit Laws

Let \(f(x)\) and \(g(x)\) be defined for all \(x \ne a\) over some open interval containing \(a\). Assume that \(L\) and \(M\) are real numbers such that \(\lim_{x \to a} f(x) = L\) and \(\lim_{x \to a} g(x) = M\). Let \(c\) be a constant. Then, each of the following statements holds:

Sum law for limits:
\[\lim_{x \to a} (f(x) + g(x)) = \lim_{x \to a} f(x) + \lim_{x \to a} g(x) = L + M\]

Difference law for limits:
\[\lim_{x \to a} (f(x) - g(x)) = \lim_{x \to a} f(x) - \lim_{x \to a} g(x) = L - M\]


2.3.1 https://math.libretexts.org/@go/page/4442
Constant multiple law for limits:
\[\lim_{x \to a} c f(x) = c \cdot \lim_{x \to a} f(x) = cL\]

Product law for limits:
\[\lim_{x \to a} (f(x) \cdot g(x)) = \lim_{x \to a} f(x) \cdot \lim_{x \to a} g(x) = L \cdot M\]

Quotient law for limits:
\[\lim_{x \to a} \frac{f(x)}{g(x)} = \frac{\lim_{x \to a} f(x)}{\lim_{x \to a} g(x)} = \frac{L}{M}\]
for \(M \ne 0\).

Power law for limits:
\[\lim_{x \to a} (f(x))^n = \left(\lim_{x \to a} f(x)\right)^n = L^n\]
for every positive integer \(n\).

Root law for limits:
\[\lim_{x \to a} \sqrt[n]{f(x)} = \sqrt[n]{\lim_{x \to a} f(x)} = \sqrt[n]{L}\]
for all \(L\) if \(n\) is odd and for \(L \ge 0\) if \(n\) is even.

We now practice applying these limit laws to evaluate a limit.

 Example 2.3.2A: Evaluating a Limit Using Limit Laws

Use the limit laws to evaluate
\[\lim_{x \to -3} (4x + 2).\]

Solution
Let’s apply the limit laws one step at a time to be sure we understand how they work. We need to keep in mind the requirement that, at each application of a limit law, the new limits must exist for the limit law to be applied.
\[\begin{aligned}
\lim_{x \to -3} (4x + 2) &= \lim_{x \to -3} 4x + \lim_{x \to -3} 2 && \text{Apply the sum law.} \\
&= 4 \cdot \lim_{x \to -3} x + \lim_{x \to -3} 2 && \text{Apply the constant multiple law.} \\
&= 4 \cdot (-3) + 2 = -10. && \text{Apply the basic limit results and simplify.}
\end{aligned}\]

 Example 2.3.2B: Using Limit Laws Repeatedly

Use the limit laws to evaluate
\[\lim_{x \to 2} \frac{2x^2 - 3x + 1}{x^3 + 4}.\]

Solution
To find this limit, we need to apply the limit laws several times. Again, we need to keep in mind that as we rewrite the limit in terms of other limits, each new limit must exist for the limit law to be applied.
\[\begin{aligned}
\lim_{x \to 2} \frac{2x^2 - 3x + 1}{x^3 + 4} &= \frac{\lim_{x \to 2} (2x^2 - 3x + 1)}{\lim_{x \to 2} (x^3 + 4)} && \text{Apply the quotient law, making sure that } (2)^3 + 4 \ne 0. \\
&= \frac{2 \cdot \lim_{x \to 2} x^2 - 3 \cdot \lim_{x \to 2} x + \lim_{x \to 2} 1}{\lim_{x \to 2} x^3 + \lim_{x \to 2} 4} && \text{Apply the sum law and constant multiple law.} \\
&= \frac{2 \cdot \left(\lim_{x \to 2} x\right)^2 - 3 \cdot \lim_{x \to 2} x + \lim_{x \to 2} 1}{\left(\lim_{x \to 2} x\right)^3 + \lim_{x \to 2} 4} && \text{Apply the power law.} \\
&= \frac{2(4) - 3(2) + 1}{(2)^3 + 4} = \frac{1}{4}. && \text{Apply the basic limit laws and simplify.}
\end{aligned}\]
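
The value obtained from the limit laws can be compared with a table of values, as in the previous section. The following is a minimal sketch; the helper name `approach` and its parameters are hypothetical names chosen for this illustration. It samples the function on both sides of \(x = 2\) and shows the values settling near \(1/4\).

```python
def f(x):
    """The rational function from Example 2.3.2B."""
    return (2 * x**2 - 3 * x + 1) / (x**3 + 4)


def approach(func, a, steps=6):
    """Return rows (h, func(a - h), func(a + h)) for h = 10^-1, ..., 10^-steps."""
    rows = []
    for k in range(1, steps + 1):
        h = 10.0 ** (-k)
        rows.append((h, func(a - h), func(a + h)))
    return rows


if __name__ == "__main__":
    for h, left, right in approach(f, 2):
        print(f"h = {h:<8g} f(2 - h) = {left:.8f}   f(2 + h) = {right:.8f}")
```

A table like this only suggests the value of the limit; the limit laws are what justify it.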

 Exercise 2.3.2

Use the limit laws to evaluate \(\lim_{x \to 6} (2x - 1)\sqrt{x + 4}\). In each step, indicate the limit law applied.

Hint
Begin by applying the product law.

Answer
\(11\sqrt{10}\)

Limits of Polynomial and Rational Functions


By now you have probably noticed that, in each of the previous examples, it has been the case that \(\lim_{x \to a} f(x) = f(a)\). This is not always true, but it does hold for all polynomials for any choice of \(a\) and for all rational functions at all values of \(a\) for which the rational function is defined.

 Limits of Polynomial and Rational Functions

Let \(p(x)\) and \(q(x)\) be polynomial functions. Let \(a\) be a real number. Then,
\[\lim_{x \to a} p(x) = p(a)\]
\[\lim_{x \to a} \frac{p(x)}{q(x)} = \frac{p(a)}{q(a)}\]
when \(q(a) \ne 0\).

To see that this theorem holds, consider the polynomial
\[p(x) = c_n x^n + c_{n-1} x^{n-1} + \cdots + c_1 x + c_0.\]
By applying the sum, constant multiple, and power laws, we end up with
\[\begin{aligned}
\lim_{x \to a} p(x) &= \lim_{x \to a} \left(c_n x^n + c_{n-1} x^{n-1} + \cdots + c_1 x + c_0\right) \\
&= c_n \left(\lim_{x \to a} x\right)^n + c_{n-1} \left(\lim_{x \to a} x\right)^{n-1} + \cdots + c_1 \left(\lim_{x \to a} x\right) + \lim_{x \to a} c_0 \\
&= c_n a^n + c_{n-1} a^{n-1} + \cdots + c_1 a + c_0 \\
&= p(a).
\end{aligned}\]
It now follows from the quotient law that if \(p(x)\) and \(q(x)\) are polynomials for which \(q(a) \ne 0\), then
\[\lim_{x \to a} \frac{p(x)}{q(x)} = \frac{p(a)}{q(a)}.\]

 Example 2.3.3: Evaluating a Limit of a Rational Function


Evaluate \(\lim_{x \to 3} \frac{2x^2 - 3x + 1}{5x + 4}\).

Solution
Since 3 is in the domain of the rational function \(f(x) = \frac{2x^2 - 3x + 1}{5x + 4}\), we can calculate the limit by substituting 3 for \(x\) into the function. Thus,
\[\lim_{x \to 3} \frac{2x^2 - 3x + 1}{5x + 4} = \frac{10}{19}.\]
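
Direct substitution is easy to mechanize. The sketch below is an illustration under our own naming (`poly` and `rational_limit` are hypothetical helpers, not from the text): it evaluates \(p(a)/q(a)\) exactly with Python's `fractions.Fraction` and refuses to proceed when \(q(a) = 0\), since the theorem does not apply in that case.

```python
from fractions import Fraction


def poly(coeffs, x):
    """Evaluate c_n x^n + ... + c_1 x + c_0 by Horner's rule; coeffs = [c_n, ..., c_0]."""
    result = Fraction(0)
    for c in coeffs:
        result = result * x + c
    return result


def rational_limit(num, den, a):
    """Return lim_{x -> a} p(x)/q(x) = p(a)/q(a), valid only when q(a) != 0."""
    a = Fraction(a)
    q_at_a = poly(den, a)
    if q_at_a == 0:
        raise ZeroDivisionError("q(a) = 0, so direct substitution does not apply")
    return poly(num, a) / q_at_a


if __name__ == "__main__":
    # (2x^2 - 3x + 1)/(5x + 4) as x -> 3
    print(rational_limit([2, -3, 1], [5, 4], 3))
```

Exact rational arithmetic is used here so that the result can be compared directly with the fraction found by hand.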

 Exercise 2.3.3

Evaluate \(\lim_{x \to -2} (3x^3 - 2x + 7)\).

Hint
Use the theorem on Limits of Polynomial and Rational Functions as reference.

Answer
−13

Additional Limit Evaluation Techniques


As we have seen, we may evaluate easily the limits of polynomials and limits of some (but not all) rational functions by direct substitution. However, as we saw in the introductory section on limits, it is certainly possible for \(\lim_{x \to a} f(x)\) to exist when \(f(a)\) is undefined. The following observation allows us to evaluate many limits of this type:

If for all \(x \ne a\), \(f(x) = g(x)\) over some open interval containing \(a\), then
\[\lim_{x \to a} f(x) = \lim_{x \to a} g(x).\]

To understand this idea better, consider the limit \(\lim_{x \to 1} \frac{x^2 - 1}{x - 1}\).

The function
\[f(x) = \frac{x^2 - 1}{x - 1} = \frac{(x - 1)(x + 1)}{x - 1}\]
and the function \(g(x) = x + 1\) are identical for all values of \(x \ne 1\). The graphs of these two functions are shown in Figure 2.3.1.

Figure 2.3.1 : The graphs of f (x) and g(x) are identical for all x ≠ 1 . Their limits at 1 are equal.
We see that
\[\lim_{x \to 1} \frac{x^2 - 1}{x - 1} = \lim_{x \to 1} \frac{(x - 1)(x + 1)}{x - 1} = \lim_{x \to 1} (x + 1) = 2.\]
The limit has the form \(\lim_{x \to a} f(x)/g(x)\), where \(\lim_{x \to a} f(x) = 0\) and \(\lim_{x \to a} g(x) = 0\). (In this case, we say that \(f(x)/g(x)\) has the indeterminate form \(0/0\).) The following Problem-Solving Strategy provides a general outline for evaluating limits of this type.

 Problem-Solving Strategy: Calculating a Limit When f (x)/g(x) has the Indeterminate Form 0/0
1. First, we need to make sure that our function has the appropriate form and cannot be evaluated immediately using the limit
laws.
2. We then need to find a function that is equal to h(x) = f (x)/g(x) for all x ≠ a over some interval containing a. To do
this, we may need to try one or more of the following steps:
a. If f (x) and g(x) are polynomials, we should factor each function and cancel out any common factors.
b. If the numerator or denominator contains a difference involving a square root, we should try multiplying the numerator
and denominator by the conjugate of the expression involving the square root.
c. If f (x)/g(x) is a complex fraction, we begin by simplifying it.
3. Last, we apply the limit laws.

The next examples demonstrate the use of this Problem-Solving Strategy. Example 2.3.4 illustrates the factor-and-cancel
technique; Example 2.3.5 shows multiplying by a conjugate. In Example 2.3.6, we look at simplifying a complex fraction.

 Example 2.3.4: Evaluating a Limit by Factoring and Canceling


Evaluate \(\lim_{x \to 3} \frac{x^2 - 3x}{2x^2 - 5x - 3}\).

Solution
Step 1. The function \(f(x) = \frac{x^2 - 3x}{2x^2 - 5x - 3}\) is undefined for \(x = 3\). In fact, if we substitute 3 into the function we get \(0/0\), which is undefined. Factoring and canceling is a good strategy:
\[\lim_{x \to 3} \frac{x^2 - 3x}{2x^2 - 5x - 3} = \lim_{x \to 3} \frac{x(x - 3)}{(x - 3)(2x + 1)}\]
Step 2. For all \(x \ne 3\), \(\frac{x^2 - 3x}{2x^2 - 5x - 3} = \frac{x}{2x + 1}\). Therefore,
\[\lim_{x \to 3} \frac{x(x - 3)}{(x - 3)(2x + 1)} = \lim_{x \to 3} \frac{x}{2x + 1}.\]
Step 3. Evaluate using the limit laws:
\[\lim_{x \to 3} \frac{x}{2x + 1} = \frac{3}{7}.\]
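
The key claim in Step 2 is that the original quotient and its simplified form agree at every \(x \ne 3\) near 3. As a sanity check (not a proof), the sketch below compares the two forms exactly at sample points on both sides of 3, using `fractions.Fraction` to avoid rounding. The names `f`, `g`, and `samples` are our own labels for this illustration.

```python
from fractions import Fraction


def f(x):
    """Original quotient; at x = 3 it has the form 0/0 and is undefined."""
    return (x**2 - 3 * x) / (2 * x**2 - 5 * x - 3)


def g(x):
    """Simplified form after canceling the common factor (x - 3)."""
    return x / (2 * x + 1)


# Points 3 - 10^-k and 3 + 10^-k for k = 1, ..., 5.
samples = [Fraction(3) + Fraction(s, 10**k) for k in range(1, 6) for s in (-1, 1)]

if __name__ == "__main__":
    print(all(f(x) == g(x) for x in samples))  # the two forms agree away from 3
    print(g(Fraction(3)))                      # value of the simplified form at 3
```

Because \(g\) is a rational function defined at 3, its limit there is found by direct substitution, which is what Step 3 does.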

 Exercise 2.3.4
Evaluate \(\lim_{x \to -3} \frac{x^2 + 4x + 3}{x^2 - 9}\).

Hint
Follow the steps in the Problem-Solving Strategy.

Answer
\(\frac{1}{3}\)

 Example 2.3.5: Evaluating a Limit by Multiplying by a Conjugate


Evaluate \(\lim_{x \to -1} \frac{\sqrt{x + 2} - 1}{x + 1}\).

Solution
Step 1. \(\frac{\sqrt{x + 2} - 1}{x + 1}\) has the form \(0/0\) at \(-1\). Let’s begin by multiplying by \(\sqrt{x + 2} + 1\), the conjugate of \(\sqrt{x + 2} - 1\), on the numerator and denominator:
\[\lim_{x \to -1} \frac{\sqrt{x + 2} - 1}{x + 1} = \lim_{x \to -1} \frac{\sqrt{x + 2} - 1}{x + 1} \cdot \frac{\sqrt{x + 2} + 1}{\sqrt{x + 2} + 1}.\]
Step 2. We then multiply out the numerator. We don’t multiply out the denominator because we are hoping that the \((x + 1)\) in the denominator cancels out in the end:
\[= \lim_{x \to -1} \frac{x + 1}{(x + 1)(\sqrt{x + 2} + 1)}.\]
Step 3. Then we cancel:
\[= \lim_{x \to -1} \frac{1}{\sqrt{x + 2} + 1}.\]
Step 4. Last, we apply the limit laws:
\[\lim_{x \to -1} \frac{1}{\sqrt{x + 2} + 1} = \frac{1}{2}.\]
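
The conjugate form is also the better one to use on a computer. In the sketch below (function names `original` and `rationalized` are ours), the original expression subtracts two nearly equal numbers when \(x\) is close to \(-1\), which loses precision in floating-point arithmetic, while the rationalized expression does not; it is even defined at \(x = -1\) itself.

```python
import math


def original(x):
    """(sqrt(x + 2) - 1)/(x + 1); has the form 0/0 at x = -1."""
    return (math.sqrt(x + 2) - 1) / (x + 1)


def rationalized(x):
    """1/(sqrt(x + 2) + 1); equal to original(x) for x != -1 (and x >= -2)."""
    return 1 / (math.sqrt(x + 2) + 1)


if __name__ == "__main__":
    for k in (1, 3, 6, 9, 12):
        x = -1 + 10.0 ** (-k)
        print(f"x = -1 + 1e-{k:<2}  original = {original(x):.12f}  "
              f"rationalized = {rationalized(x):.12f}")
```

Running the loop shows both columns agreeing for moderate distances from \(-1\); very close to \(-1\) the `original` column may drift because of rounding, while `rationalized` stays near \(1/2\).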

 Exercise 2.3.5
Evaluate \(\lim_{x \to 5} \frac{\sqrt{x - 1} - 2}{x - 5}\).

Hint
Follow the steps in the Problem-Solving Strategy.

Answer
\(\frac{1}{4}\)

 Example 2.3.6: Evaluating a Limit by Simplifying a Complex Fraction
Evaluate \(\lim_{x \to 1} \frac{\frac{1}{x + 1} - \frac{1}{2}}{x - 1}\).

Solution
Step 1. \(\frac{\frac{1}{x + 1} - \frac{1}{2}}{x - 1}\) has the form \(0/0\) at 1. We simplify the algebraic fraction by multiplying by \(\frac{2(x + 1)}{2(x + 1)}\):
\[\lim_{x \to 1} \frac{\frac{1}{x + 1} - \frac{1}{2}}{x - 1} = \lim_{x \to 1} \frac{\frac{1}{x + 1} - \frac{1}{2}}{x - 1} \cdot \frac{2(x + 1)}{2(x + 1)}.\]
Step 2. Next, we multiply through the numerators. Do not multiply the denominators because we want to be able to cancel the factor \((x - 1)\):
\[= \lim_{x \to 1} \frac{2 - (x + 1)}{2(x - 1)(x + 1)}.\]
Step 3. Then, we simplify the numerator:
\[= \lim_{x \to 1} \frac{-x + 1}{2(x - 1)(x + 1)}.\]
Step 4. Now we factor out \(-1\) from the numerator:
\[= \lim_{x \to 1} \frac{-(x - 1)}{2(x - 1)(x + 1)}.\]
Step 5. Then, we cancel the common factors of \((x - 1)\):
\[= \lim_{x \to 1} \frac{-1}{2(x + 1)}.\]
Step 6. Last, we evaluate using the limit laws:
\[\lim_{x \to 1} \frac{-1}{2(x + 1)} = -\frac{1}{4}.\]

 Exercise 2.3.6
Evaluate \(\lim_{x \to -3} \frac{\frac{1}{x + 2} + 1}{x + 3}\).

Hint
Follow the steps in the Problem-Solving Strategy

Answer
−1

Example 2.3.7 does not fall neatly into any of the patterns established in the previous examples. However, with a little creativity,
we can still use these same techniques.

 Example 2.3.7: Evaluating a Limit When the Limit Laws Do Not Apply
Evaluate \(\lim_{x \to 0} \left(\frac{1}{x} + \frac{5}{x(x - 5)}\right)\).

Solution
Both \(1/x\) and \(5/(x(x - 5))\) fail to have a limit at zero. Since neither of the two functions has a limit at zero, we cannot apply the sum law for limits; we must use a different strategy. In this case, we find the limit by performing addition and then applying one of our previous strategies. Observe that
\[\frac{1}{x} + \frac{5}{x(x - 5)} = \frac{x - 5 + 5}{x(x - 5)} = \frac{x}{x(x - 5)}.\]
Thus,
\[\lim_{x \to 0} \left(\frac{1}{x} + \frac{5}{x(x - 5)}\right) = \lim_{x \to 0} \frac{x}{x(x - 5)} = \lim_{x \to 0} \frac{1}{x - 5} = -\frac{1}{5}.\]
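
The example shows that a sum can have a limit even when neither term does. The sketch below, with function names of our own choosing, illustrates this exactly with rational arithmetic: near zero each term is huge, yet their sum equals \(1/(x - 5)\) and stays close to \(-1/5\).

```python
from fractions import Fraction


def first_term(x):
    return 1 / x


def second_term(x):
    return 5 / (x * (x - 5))


def combined(x):
    """The sum after adding the fractions and canceling x (valid for x != 0, 5)."""
    return 1 / (x - 5)


if __name__ == "__main__":
    for k in range(1, 6):
        x = Fraction(1, 10**k)
        total = first_term(x) + second_term(x)
        # Each term blows up as x -> 0, but the sum matches combined(x).
        print(float(first_term(x)), float(second_term(x)), float(total))
```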

 Exercise 2.3.7
Evaluate \(\lim_{x \to 3} \left(\frac{1}{x - 3} - \frac{4}{x^2 - 2x - 3}\right)\).

Hint
Use the same technique as Example 2.3.7. Don’t forget to factor \(x^2 - 2x - 3\) before getting a common denominator.

Answer
\(\frac{1}{4}\)

Let’s now revisit one-sided limits. Simple modifications in the limit laws allow us to apply them to one-sided limits. For example, to apply the limit laws to a limit of the form \(\lim_{x \to a^-} h(x)\), we require the function \(h(x)\) to be defined over an open interval of the form \((b, a)\); for a limit of the form \(\lim_{x \to a^+} h(x)\), we require the function \(h(x)\) to be defined over an open interval of the form \((a, c)\).

Example 2.3.8A illustrates this point.

 Example 2.3.8A: Evaluating a One-Sided Limit Using the Limit Laws

Evaluate each of the following limits, if possible.


a. \(\lim_{x \to 3^-} \sqrt{x - 3}\)
b. \(\lim_{x \to 3^+} \sqrt{x - 3}\)

Solution
Figure 2.3.2 illustrates the function \(f(x) = \sqrt{x - 3}\) and aids in our understanding of these limits.

Figure 2.3.2: The graph shows the function \(f(x) = \sqrt{x - 3}\).

a. The function \(f(x) = \sqrt{x - 3}\) is defined over the interval \([3, +\infty)\). Since this function is not defined to the left of 3, we cannot apply the limit laws to compute \(\lim_{x \to 3^-} \sqrt{x - 3}\). In fact, since \(f(x) = \sqrt{x - 3}\) is undefined to the left of 3, \(\lim_{x \to 3^-} \sqrt{x - 3}\) does not exist.
b. Since \(f(x) = \sqrt{x - 3}\) is defined to the right of 3, the limit laws do apply to \(\lim_{x \to 3^+} \sqrt{x - 3}\). By applying these limit laws we obtain \(\lim_{x \to 3^+} \sqrt{x - 3} = 0\).

In Example 2.3.8B we look at one-sided limits of a piecewise-defined function and use these limits to draw a conclusion about a
two-sided limit of the same function.

 Example 2.3.8B: Evaluating a Two-Sided Limit Using the Limit Laws

For \(f(x) = \begin{cases} 4x - 3, & \text{if } x < 2 \\ (x - 3)^2, & \text{if } x \ge 2 \end{cases}\), evaluate each of the following limits:

a. \(\lim_{x \to 2^-} f(x)\)
b. \(\lim_{x \to 2^+} f(x)\)
c. \(\lim_{x \to 2} f(x)\)

Solution
Figure 2.3.3 illustrates the function \(f(x)\) and aids in our understanding of these limits.

Figure 2.3.3: This graph shows a function \(f(x)\).

a. Since \(f(x) = 4x - 3\) for all \(x\) in \((-\infty, 2)\), replace \(f(x)\) in the limit with \(4x - 3\) and apply the limit laws:
\[\lim_{x \to 2^-} f(x) = \lim_{x \to 2^-} (4x - 3) = 5.\]
b. Since \(f(x) = (x - 3)^2\) for all \(x\) in \((2, +\infty)\), replace \(f(x)\) in the limit with \((x - 3)^2\) and apply the limit laws:
\[\lim_{x \to 2^+} f(x) = \lim_{x \to 2^+} (x - 3)^2 = 1.\]
c. Since \(\lim_{x \to 2^-} f(x) = 5\) and \(\lim_{x \to 2^+} f(x) = 1\), we conclude that \(\lim_{x \to 2} f(x)\) does not exist.
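
One-sided limits of a piecewise function can be explored numerically by sampling on one side only. The sketch below assumes a hypothetical helper `one_sided` (our own name and signature) and applies it to the function from this example; the left-hand samples settle near 5 and the right-hand samples near 1.

```python
def f(x):
    """Piecewise function from Example 2.3.8B."""
    return 4 * x - 3 if x < 2 else (x - 3) ** 2


def one_sided(func, a, side, steps=8):
    """Sample func at a - 10^-k (side='-') or a + 10^-k (side='+'), k = 1..steps."""
    if side not in ("-", "+"):
        raise ValueError("side must be '-' or '+'")
    sign = -1 if side == "-" else 1
    return [func(a + sign * 10.0 ** (-k)) for k in range(1, steps + 1)]


if __name__ == "__main__":
    print(one_sided(f, 2, "-"))  # values approaching the left-hand limit
    print(one_sided(f, 2, "+"))  # values approaching the right-hand limit
```

Since the two lists approach different numbers, the samples are consistent with the conclusion that the two-sided limit does not exist.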
x→2 x→2 x→2

 Exercise 2.3.8

Graph \(f(x) = \begin{cases} -x - 2, & \text{if } x < -1 \\ 2, & \text{if } x = -1 \\ x^3, & \text{if } x > -1 \end{cases}\) and evaluate \(\lim_{x \to -1^-} f(x)\).

Hint
Use the method in Example 2.3.8B to evaluate the limit.

Answer
\(\lim_{x \to -1^-} f(x) = -1\)

We now turn our attention to evaluating a limit of the form \(\lim_{x \to a} \frac{f(x)}{g(x)}\), where \(\lim_{x \to a} f(x) = K\) with \(K \ne 0\) and \(\lim_{x \to a} g(x) = 0\). That is, \(f(x)/g(x)\) has the form \(K/0\), \(K \ne 0\), at \(a\).

 Example 2.3.9: Evaluating a Limit of the Form K/0, K ≠ 0 Using the Limit Laws
Evaluate \(\lim_{x \to 2^-} \frac{x - 3}{x^2 - 2x}\).

Solution
Step 1. After substituting in \(x = 2\), we see that this limit has the form \(-1/0\). That is, as \(x\) approaches 2 from the left, the numerator approaches \(-1\) and the denominator approaches 0. Consequently, the magnitude of \(\frac{x - 3}{x(x - 2)}\) becomes infinite. To get a better idea of what the limit is, we need to factor the denominator:
\[\lim_{x \to 2^-} \frac{x - 3}{x^2 - 2x} = \lim_{x \to 2^-} \frac{x - 3}{x(x - 2)}\]
Step 2. Since \(x - 2\) is the only part of the denominator that is zero when 2 is substituted, we then separate \(1/(x - 2)\) from the rest of the function:
\[= \lim_{x \to 2^-} \frac{x - 3}{x} \cdot \frac{1}{x - 2}\]
Step 3. Using the Limit Laws, we can write:
\[= \left(\lim_{x \to 2^-} \frac{x - 3}{x}\right) \cdot \left(\lim_{x \to 2^-} \frac{1}{x - 2}\right).\]
Step 4. \(\lim_{x \to 2^-} \frac{x - 3}{x} = -\frac{1}{2}\) and \(\lim_{x \to 2^-} \frac{1}{x - 2} = -\infty\). Therefore, the product of \((x - 3)/x\) and \(1/(x - 2)\) has a limit of \(+\infty\):
\[\lim_{x \to 2^-} \frac{x - 3}{x^2 - 2x} = +\infty.\]
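
For a limit of the form \(K/0\), the sign analysis in Step 4 can be checked by sampling on the relevant side. This is a rough numerical sketch only; the list names `left_values` and `right_values` are ours. It evaluates the function at points approaching 2 from the left and, for contrast, from the right.

```python
def f(x):
    """(x - 3)/(x^2 - 2x), which has the form -1/0 at x = 2."""
    return (x - 3) / (x**2 - 2 * x)


# Sample at 2 - 10^-k and 2 + 10^-k for k = 1, ..., 6.
left_values = [f(2 - 10.0 ** (-k)) for k in range(1, 7)]
right_values = [f(2 + 10.0 ** (-k)) for k in range(1, 7)]

if __name__ == "__main__":
    print(left_values)   # positive and growing: consistent with +infinity
    print(right_values)  # negative and growing in magnitude
```

The left-hand samples are positive and increase without apparent bound, in agreement with the result above. The right-hand samples have the opposite sign, which is why the side of approach matters for limits of this form.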

 Exercise 2.3.9
Evaluate \(\lim_{x \to 1} \frac{x + 2}{(x - 1)^2}\).

Hint
Use the methods from Example 2.3.9.

Answer
\(+\infty\)

The Squeeze Theorem


The techniques we have developed thus far work very well for algebraic functions, but we are still unable to evaluate limits of very
basic trigonometric functions. The next theorem, called the squeeze theorem, proves very useful for establishing basic
trigonometric limits. This theorem allows us to calculate limits by “squeezing” a function, with a limit at a point a that is unknown,
between two functions having a common known limit at a . Figure 2.3.4 illustrates this idea.

Figure 2.3.4: The Squeeze Theorem applies when \(f(x) \le g(x) \le h(x)\) and \(\lim_{x \to a} f(x) = \lim_{x \to a} h(x)\).

 The Squeeze Theorem


Let \(f(x)\), \(g(x)\), and \(h(x)\) be defined for all \(x \ne a\) over an open interval containing \(a\). If
\[f(x) \le g(x) \le h(x)\]
for all \(x \ne a\) in an open interval containing \(a\) and
\[\lim_{x \to a} f(x) = L = \lim_{x \to a} h(x)\]
where \(L\) is a real number, then \(\lim_{x \to a} g(x) = L\).

 Example 2.3.10: Applying the Squeeze Theorem


Apply the squeeze theorem to evaluate \(\lim_{x \to 0} x \cos x\).

Solution
Because \(-1 \le \cos x \le 1\) for all \(x\), we have \(-x \le x \cos x \le x\) for \(x \ge 0\) and \(-x \ge x \cos x \ge x\) for \(x \le 0\) (if \(x\) is negative the direction of the inequalities changes when we multiply). Since \(\lim_{x \to 0} (-x) = 0 = \lim_{x \to 0} x\), from the squeeze theorem, we obtain \(\lim_{x \to 0} x \cos x = 0\). The graphs of \(f(x) = -x\), \(g(x) = x \cos x\), and \(h(x) = x\) are shown in Figure 2.3.5.
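
The two cases \(x \ge 0\) and \(x \le 0\) can be combined into the single bound \(-|x| \le x \cos x \le |x|\). As an illustration (a spot check at sample points, not a proof), the sketch below verifies this bound numerically on both sides of zero. The names `g`, `xs`, and `bounded` are our own.

```python
import math


def g(x):
    """The squeezed function x cos x."""
    return x * math.cos(x)


# Sample points -10^-k and 10^-k for k = 0, ..., 7.
xs = [s * 10.0 ** (-k) for k in range(0, 8) for s in (-1, 1)]

# The bounding functions -|x| and |x| both tend to 0 as x -> 0.
bounded = all(-abs(x) <= g(x) <= abs(x) for x in xs)

if __name__ == "__main__":
    print(bounded)
    for x in xs:
        print(f"x = {x:<10g} -|x| = {-abs(x):<11g} x cos x = {g(x):<14g} |x| = {abs(x):g}")
```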
x→0

Figure 2.3.5 : The graphs of f (x), g(x) , and h(x) are shown around the point x = 0 .

 Exercise 2.3.10
Use the squeeze theorem to evaluate \(\lim_{x \to 0} x^2 \sin \frac{1}{x}\).

Hint
Use the fact that \(-x^2 \le x^2 \sin(1/x) \le x^2\) to help you find two functions such that \(x^2 \sin(1/x)\) is squeezed between them.

Answer
0

We now use the squeeze theorem to tackle several very important limits. Although this discussion is somewhat lengthy, these limits prove invaluable for the development of the material in both the next section and the next chapter. The first of these limits is \(\lim_{\theta \to 0} \sin \theta\). Consider the unit circle shown in Figure 2.3.6. In the figure, we see that \(\sin \theta\) is the \(y\)-coordinate on the unit circle and it corresponds to the line segment shown in blue. The radian measure of angle \(\theta\) is the length of the arc it subtends on the unit circle. Therefore, we see that for \(0 < \theta < \frac{\pi}{2}\), we have \(0 < \sin \theta < \theta\).

Figure 2.3.6 : The sine function is shown as a line on the unit circle.
Because \(\lim_{\theta \to 0^+} 0 = 0\) and \(\lim_{\theta \to 0^+} \theta = 0\), by using the squeeze theorem we conclude that
\[\lim_{\theta \to 0^+} \sin \theta = 0.\]
To see that \(\lim_{\theta \to 0^-} \sin \theta = 0\) as well, observe that for \(-\frac{\pi}{2} < \theta < 0\), we have \(0 < -\theta < \frac{\pi}{2}\) and hence \(0 < \sin(-\theta) < -\theta\). Consequently, \(0 < -\sin \theta < -\theta\). It follows that \(0 > \sin \theta > \theta\). An application of the squeeze theorem produces the desired limit. Thus, since \(\lim_{\theta \to 0^+} \sin \theta = 0\) and \(\lim_{\theta \to 0^-} \sin \theta = 0\),
\[\lim_{\theta \to 0} \sin \theta = 0.\]
Next, using the identity \(\cos \theta = \sqrt{1 - \sin^2 \theta}\) for \(-\frac{\pi}{2} < \theta < \frac{\pi}{2}\), we see that
\[\lim_{\theta \to 0} \cos \theta = \lim_{\theta \to 0} \sqrt{1 - \sin^2 \theta} = 1.\]

We now take a look at a limit that plays an important role in later chapters, namely \(\lim_{\theta \to 0} \frac{\sin \theta}{\theta}\). To evaluate this limit, we use the unit circle in Figure 2.3.7. Notice that this figure adds one additional triangle to Figure 2.3.6. We see that the length of the side opposite angle \(\theta\) in this new triangle is \(\tan \theta\). Thus, we see that for \(0 < \theta < \frac{\pi}{2}\), we have \(\sin \theta < \theta < \tan \theta\).

Figure 2.3.7 : The sine and tangent functions are shown as lines on the unit circle.
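By dividing by \(\sin \theta\) in all parts of the inequality, we obtain
\[1 < \frac{\theta}{\sin \theta} < \frac{1}{\cos \theta}.\]
Equivalently, we have
\[1 > \frac{\sin \theta}{\theta} > \cos \theta.\]
Since \(\lim_{\theta \to 0^+} 1 = 1 = \lim_{\theta \to 0^+} \cos \theta\), we conclude that \(\lim_{\theta \to 0^+} \frac{\sin \theta}{\theta} = 1\), by the squeeze theorem. By applying a manipulation similar to that used in demonstrating that \(\lim_{\theta \to 0^-} \sin \theta = 0\), we can show that \(\lim_{\theta \to 0^-} \frac{\sin \theta}{\theta} = 1\). Thus,
\[\lim_{\theta \to 0} \frac{\sin \theta}{\theta} = 1.\]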
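
The squeeze \(\cos \theta < \frac{\sin \theta}{\theta} < 1\) can be observed numerically. The sketch below (with our own names `ratio`, `thetas`, and `table`) tabulates both sides of the inequality for a few small positive angles in radians; it is an illustration only and does not replace the geometric argument.

```python
import math


def ratio(theta):
    """sin(theta)/theta for theta != 0 (theta in radians)."""
    return math.sin(theta) / theta


# Angles 1, 0.1, ..., 1e-5 radians.
thetas = [10.0 ** (-k) for k in range(0, 6)]
table = [(t, math.cos(t), ratio(t)) for t in thetas]

if __name__ == "__main__":
    for t, lower, middle in table:
        print(f"theta = {t:<8g} cos(theta) = {lower:.12f}  sin(theta)/theta = {middle:.12f}")
```

For much smaller angles, floating-point rounding makes both quantities print as exactly 1, so the strict inequalities can no longer be seen in the output.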

In Example 2.3.11, we use this limit to establish \(\lim_{\theta \to 0} \frac{1 - \cos \theta}{\theta} = 0\). This limit also proves useful in later chapters.

 Example 2.3.11: Evaluating an Important Trigonometric Limit


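Evaluate \(\lim_{\theta \to 0} \frac{1 - \cos \theta}{\theta}\).

Solution
In the first step, we multiply by the conjugate so that we can use a trigonometric identity to convert the cosine in the numerator to a sine:
\[\begin{aligned}
\lim_{\theta \to 0} \frac{1 - \cos \theta}{\theta} &= \lim_{\theta \to 0} \frac{1 - \cos \theta}{\theta} \cdot \frac{1 + \cos \theta}{1 + \cos \theta} \\
&= \lim_{\theta \to 0} \frac{1 - \cos^2 \theta}{\theta (1 + \cos \theta)} \\
&= \lim_{\theta \to 0} \frac{\sin^2 \theta}{\theta (1 + \cos \theta)} \\
&= \lim_{\theta \to 0} \frac{\sin \theta}{\theta} \cdot \frac{\sin \theta}{1 + \cos \theta} \\
&= \left(\lim_{\theta \to 0} \frac{\sin \theta}{\theta}\right) \cdot \left(\lim_{\theta \to 0} \frac{\sin \theta}{1 + \cos \theta}\right) \\
&= 1 \cdot \frac{0}{2} = 0.
\end{aligned}\]
Therefore,
\[\lim_{\theta \to 0} \frac{1 - \cos \theta}{\theta} = 0.\]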

 Exercise 2.3.11
Evaluate \(\lim_{\theta \to 0} \frac{1 - \cos \theta}{\sin \theta}\).

Hint
Multiply numerator and denominator by \(1 + \cos \theta\).

Answer
0

 Deriving the Formula for the Area of a Circle

Some of the geometric formulas we take for granted today were first derived by methods that anticipate some of the methods of calculus. The Greek mathematician Archimedes (ca. 287–212 BCE) was particularly inventive, using polygons inscribed within circles to approximate the area of the circle as the number of sides of the polygon increased. He never came up with the idea of a limit, but we can use this idea to see what his geometric constructions could have predicted about the limit.
We can estimate the area of a circle by computing the area of an inscribed regular polygon. Think of the regular polygon as being made up of \(n\) triangles. By taking the limit as the vertex angle of these triangles goes to zero, you can obtain the area of the circle. To see this, carry out the following steps:
1. Express the height \(h\) and the base \(b\) of the isosceles triangle in Figure 2.3.8 in terms of \(\theta\) and \(r\).

Figure 2.3.8

2. Using the expressions that you obtained in step 1, express the area of the isosceles triangle in terms of \(\theta\) and \(r\). (Substitute \(\frac{1}{2} \sin \theta\) for \(\sin\left(\frac{\theta}{2}\right) \cos\left(\frac{\theta}{2}\right)\) in your expression.)
3. If an \(n\)-sided regular polygon is inscribed in a circle of radius \(r\), find a relationship between \(\theta\) and \(n\). Solve this for \(n\). Keep in mind there are \(2\pi\) radians in a circle. (Use radians, not degrees.)
4. Find an expression for the area of the \(n\)-sided polygon in terms of \(r\) and \(\theta\).
5. To find a formula for the area of the circle, find the limit of the expression in step 4 as \(\theta\) goes to zero. (Hint: \(\lim_{\theta \to 0} \frac{\sin \theta}{\theta} = 1\).)
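
Before or after working through the steps, it can be instructive to watch the polygon areas approach the area of the circle numerically. The sketch below is an illustration under a stated assumption: it uses the standard fact that a regular \(n\)-gon inscribed in a circle of radius \(r\) consists of \(n\) isosceles triangles, each with two sides of length \(r\) enclosing the angle \(2\pi/n\), so each has area \(\frac{1}{2} r^2 \sin(2\pi/n)\). The function name `inscribed_polygon_area` is our own. It is a numerical check, not a substitute for the derivation requested in the project.

```python
import math


def inscribed_polygon_area(n, r=1.0):
    """Area of a regular n-gon inscribed in a circle of radius r.

    Assumes the polygon is split into n triangles, each with two sides of
    length r and included angle 2*pi/n, so each has area (1/2) r^2 sin(2*pi/n).
    """
    if n < 3:
        raise ValueError("a polygon needs at least 3 sides")
    theta = 2 * math.pi / n
    return n * 0.5 * r * r * math.sin(theta)


if __name__ == "__main__":
    for n in (6, 12, 24, 48, 96, 1000, 10**6):
        print(f"n = {n:<8} area = {inscribed_polygon_area(n):.10f}")
    print(f"pi         = {math.pi:.10f}")
```

The doubling sequence 6, 12, 24, 48, 96 is chosen as a nod to the polygons traditionally associated with Archimedes' approximation of \(\pi\); any increasing sequence of \(n\) would do.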

The technique of estimating areas of regions by using polygons is revisited in Introduction to Integration.

Key Concepts
The limit laws allow us to evaluate limits of functions without having to go through step-by-step processes each time.
For polynomials and rational functions, \(\lim_{x \to a} f(x) = f(a)\) at every \(a\) in the domain of \(f\).
You can evaluate the limit of a function by factoring and canceling, by multiplying by a conjugate, or by simplifying a complex fraction.
The squeeze theorem allows you to find the limit of a function if the function is always greater than or equal to one function and less than or equal to another function, and those two functions have the same known limit.

Key Equations
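Basic Limit Results
\(\lim_{x \to a} x = a\) and \(\lim_{x \to a} c = c\)

Important Limits
\(\lim_{\theta \to 0} \sin \theta = 0\)
\(\lim_{\theta \to 0} \cos \theta = 1\)
\(\lim_{\theta \to 0} \frac{\sin \theta}{\theta} = 1\)
\(\lim_{\theta \to 0} \frac{1 - \cos \theta}{\theta} = 0\)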

Glossary
constant multiple law for limits
the limit law \(\lim_{x \to a} c f(x) = c \cdot \lim_{x \to a} f(x) = cL\)

difference law for limits
the limit law \(\lim_{x \to a} (f(x) - g(x)) = \lim_{x \to a} f(x) - \lim_{x \to a} g(x) = L - M\)

limit laws
the individual properties of limits; for each of the individual laws, let \(f(x)\) and \(g(x)\) be defined for all \(x \ne a\) over some open interval containing \(a\); assume that \(L\) and \(M\) are real numbers so that \(\lim_{x \to a} f(x) = L\) and \(\lim_{x \to a} g(x) = M\); let \(c\) be a constant

power law for limits
the limit law \(\lim_{x \to a} (f(x))^n = \left(\lim_{x \to a} f(x)\right)^n = L^n\) for every positive integer \(n\)

product law for limits
the limit law \(\lim_{x \to a} (f(x) \cdot g(x)) = \lim_{x \to a} f(x) \cdot \lim_{x \to a} g(x) = L \cdot M\)

quotient law for limits
the limit law \(\lim_{x \to a} \frac{f(x)}{g(x)} = \frac{\lim_{x \to a} f(x)}{\lim_{x \to a} g(x)} = \frac{L}{M}\) for \(M \ne 0\)

root law for limits
the limit law \(\lim_{x \to a} \sqrt[n]{f(x)} = \sqrt[n]{\lim_{x \to a} f(x)} = \sqrt[n]{L}\) for all \(L\) if \(n\) is odd and for \(L \ge 0\) if \(n\) is even

squeeze theorem
states that if \(f(x) \le g(x) \le h(x)\) for all \(x \ne a\) over an open interval containing \(a\) and \(\lim_{x \to a} f(x) = L = \lim_{x \to a} h(x)\) where \(L\) is a real number, then \(\lim_{x \to a} g(x) = L\)

sum law for limits
the limit law \(\lim_{x \to a} (f(x) + g(x)) = \lim_{x \to a} f(x) + \lim_{x \to a} g(x) = L + M\)

2.3: Calculating Limits Using the Limit Laws is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
2.3: The Limit Laws by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0. Original source:
https://openstax.org/details/books/calculus-volume-1.


1 https://math.libretexts.org/@go/page/4443
2.5: Continuity
 Learning Objectives
Explain the three conditions for continuity at a point.
Describe three kinds of discontinuities.
Define continuity on an interval.
State the theorem for limits of composite functions.
Provide an example of the intermediate value theorem.

Many functions have the property that their graphs can be traced with a pencil without lifting the pencil from the page. Such functions are
called continuous. Other functions have points at which a break in the graph occurs, but satisfy this property over intervals contained in
their domains. They are continuous on these intervals and are said to have a discontinuity at a point where a break occurs.
We begin our investigation of continuity by exploring what it means for a function to have continuity at a point. Intuitively, a function is
continuous at a particular point if there is no break in its graph at that point.

Continuity at a Point
Before we look at a formal definition of what it means for a function to be continuous at a point, let’s consider various functions that fail
to meet our intuitive notion of what it means to be continuous at a point. We then create a list of conditions that prevent such failures.
Our first function of interest is shown in Figure 2.5.1. We see that the graph of f (x) has a hole at a . In fact, f (a) is undefined. At the
very least, for f (x) to be continuous at a , we need the following condition:

i. f (a) is defined

Figure 2.5.1 : The function f (x) is not continuous at a because f (a) is undefined.
However, as we see in Figure 2.5.2, this condition alone is insufficient to guarantee continuity at the point \(a\). Although \(f(a)\) is defined, the function has a gap at \(a\). In this example, the gap exists because \(\lim_{x \to a} f(x)\) does not exist. We must add another condition for continuity at \(a\), namely,

ii. \(\lim_{x \to a} f(x)\) exists

Figure 2.5.2: The function \(f(x)\) is not continuous at \(a\) because \(\lim_{x \to a} f(x)\) does not exist.

However, as we see in Figure 2.5.3, these two conditions by themselves do not guarantee continuity at a point. The function in this figure
satisfies both of our first two conditions, but is still not continuous at a . We must add a third condition to our list:

2.5.1 https://math.libretexts.org/@go/page/4444
iii. \(\lim_{x \to a} f(x) = f(a)\)

Figure 2.5.3: The function \(f(x)\) is not continuous at \(a\) because \(\lim_{x \to a} f(x) \ne f(a)\).

Now we put our list of conditions together and form a definition of continuity at a point.

 Definition: Continuous at a Point

A function \(f(x)\) is continuous at a point \(a\) if and only if the following three conditions are satisfied:
i. \(f(a)\) is defined
ii. \(\lim_{x \to a} f(x)\) exists
iii. \(\lim_{x \to a} f(x) = f(a)\)

A function is discontinuous at a point a if it fails to be continuous at a .

The following procedure can be used to analyze the continuity of a function at a point using this definition.

 Problem-Solving Strategy: Determining Continuity at a Point


1. Check to see if \(f(a)\) is defined. If \(f(a)\) is undefined, we need go no further. The function is not continuous at \(a\). If \(f(a)\) is defined, continue to step 2.
2. Compute \(\lim_{x \to a} f(x)\). In some cases, we may need to do this by first computing \(\lim_{x \to a^-} f(x)\) and \(\lim_{x \to a^+} f(x)\). If \(\lim_{x \to a} f(x)\) does not exist (that is, it is not a real number), then the function is not continuous at \(a\) and the problem is solved. If \(\lim_{x \to a} f(x)\) exists, then continue to step 3.
3. Compare \(f(a)\) and \(\lim_{x \to a} f(x)\). If \(\lim_{x \to a} f(x) \ne f(a)\), then the function is not continuous at \(a\). If \(\lim_{x \to a} f(x) = f(a)\), then the function is continuous at \(a\).
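
The three steps above can be sketched as a rough numerical heuristic. The code below is an illustration only: the function name `check_continuity`, the step size `h`, and the tolerance `tol` are hypothetical choices of ours, and sampling a function at two nearby points can suggest, but never prove, what a limit is. It treats an arithmetic error at \(a\) as "undefined", estimates the one-sided limits by evaluating just to the left and right of \(a\), and then compares the estimate with \(f(a)\).

```python
import math


def check_continuity(func, a, h=1e-6, tol=1e-4):
    """Heuristic version of the three-step continuity check at x = a."""
    # Step 1: is f(a) defined?
    try:
        fa = func(a)
    except (ZeroDivisionError, ValueError):
        return "condition i fails: f(a) is undefined"
    # Step 2: estimate the one-sided limits and compare them.
    left, right = func(a - h), func(a + h)
    if abs(left - right) > tol:
        return "condition ii fails: one-sided limits appear to differ"
    # Step 3: compare the estimated limit with f(a).
    if abs(0.5 * (left + right) - fa) > tol:
        return "condition iii fails: limit appears to differ from f(a)"
    return "all three conditions appear to hold"


if __name__ == "__main__":
    # A removable-hole example: (x^2 - 4)/(x - 2) is undefined at x = 2.
    print(check_continuity(lambda x: (x**2 - 4) / (x - 2), 2))
    # sin(x)/x extended by the value 1 at x = 0.
    print(check_continuity(lambda x: math.sin(x) / x if x != 0 else 1.0, 0))
```

A function that oscillates rapidly or grows without bound near \(a\) can fool a check like this, so the analytic procedure remains the method to rely on.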

The next three examples demonstrate how to apply this definition to determine whether a function is continuous at a given point. These
examples illustrate situations in which each of the conditions for continuity in the definition succeed or fail.

 Example 2.5.1A: Determining Continuity at a Point, Condition 1


Using the definition, determine whether the function \(f(x) = \frac{x^2 - 4}{x - 2}\) is continuous at \(x = 2\). Justify the conclusion.

Solution
Let’s begin by trying to calculate \(f(2)\). We can see that \(f(2) = 0/0\), which is undefined. Therefore, \(f(x) = \frac{x^2 - 4}{x - 2}\) is discontinuous at 2 because \(f(2)\) is undefined. The graph of \(f(x)\) is shown in Figure 2.5.4.

Figure 2.5.4 : The function f (x) is discontinuous at 2 because f (2) is undefined.

 Example 2.5.1B: Determining Continuity at a Point, Condition 2


Using the definition, determine whether the function \(f(x) = \begin{cases} -x^2 + 4, & \text{if } x \le 3 \\ 4x - 8, & \text{if } x > 3 \end{cases}\) is continuous at \(x = 3\). Justify the conclusion.

Solution
Let’s begin by trying to calculate \(f(3)\).
\[f(3) = -(3^2) + 4 = -5.\]
Thus, \(f(3)\) is defined. Next, we calculate \(\lim_{x \to 3} f(x)\). To do this, we must compute \(\lim_{x \to 3^-} f(x)\) and \(\lim_{x \to 3^+} f(x)\):
\[\lim_{x \to 3^-} f(x) = -(3^2) + 4 = -5\]
and
\[\lim_{x \to 3^+} f(x) = 4(3) - 8 = 4.\]
Therefore, \(\lim_{x \to 3} f(x)\) does not exist. Thus, \(f(x)\) is not continuous at 3. The graph of \(f(x)\) is shown in Figure 2.5.5.

Figure 2.5.5: The function $f(x)$ is not continuous at 3 because $\lim_{x \to 3} f(x)$ does not exist.

 Example 2.5.1C : Determining Continuity at a Point, Condition 3


Using the definition, determine whether the function $f(x) = \begin{cases} \frac{\sin x}{x}, & \text{if } x \neq 0 \\ 1, & \text{if } x = 0 \end{cases}$ is continuous at $x = 0$.

Solution
First, observe that
$$f(0) = 1.$$
Next,
$$\lim_{x \to 0} f(x) = \lim_{x \to 0} \frac{\sin x}{x} = 1.$$
Last, compare $f(0)$ and $\lim_{x \to 0} f(x)$. We see that
$$f(0) = 1 = \lim_{x \to 0} f(x).$$

Since all three of the conditions in the definition of continuity are satisfied, f (x) is continuous at x = 0 .
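The three examples above can also be replayed numerically. The following Python sketch is only a heuristic illustration of the three-step strategy, not a proof: it estimates the one-sided limits by sampling the function at a small distance `h` on each side of the point. The helper name `check_continuity` and the parameters `h` and `tol` are assumptions introduced here for illustration.

```python
import math

def check_continuity(f, a, h=1e-6, tol=1e-4):
    """Heuristic version of the three-step strategy at x = a (illustrative only)."""
    # Step 1: is f(a) defined?
    try:
        fa = f(a)
    except (ZeroDivisionError, ValueError):
        return "condition 1 fails: f(a) is undefined"
    # Step 2: estimate the one-sided limits by sampling close to a.
    left, right = f(a - h), f(a + h)
    if abs(left - right) > tol:
        return "condition 2 fails: the one-sided limits differ"
    # Step 3: compare the estimated limit with f(a).
    if abs((left + right) / 2 - fa) > tol:
        return "condition 3 fails: the limit differs from f(a)"
    return "continuous"

def f_a(x):
    # Example 2.5.1A: undefined at x = 2
    return (x**2 - 4) / (x - 2)

def f_b(x):
    # Example 2.5.1B: piecewise, with different one-sided limits at x = 3
    return -x**2 + 4 if x <= 3 else 4 * x - 8

def f_c(x):
    # Example 2.5.1C: (sin x)/x, with the value 1 assigned at x = 0
    return math.sin(x) / x if x != 0 else 1.0

print(check_continuity(f_a, 2))
print(check_continuity(f_b, 3))
print(check_continuity(f_c, 0))
```

Sampling at a single distance `h` can be fooled by functions that oscillate or change rapidly near the point, so a result from this sketch should be read as a hint to confirm analytically.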

 Exercise 2.5.1

Using the definition, determine whether the function $f(x) = \begin{cases} 2x + 1, & \text{if } x < 1 \\ 2, & \text{if } x = 1 \\ -x + 4, & \text{if } x > 1 \end{cases}$ is continuous at $x = 1$. If the function is not continuous at 1, indicate the condition for continuity at a point that fails to hold.

Hint
Check each condition of the definition.

Answer
$f$ is not continuous at 1 because $f(1) = 2 \neq 3 = \lim_{x \to 1} f(x)$.

By applying the definition of continuity and previously established theorems concerning the evaluation of limits, we can state the
following theorem.

 Theorem 2.5.1: Continuity of Polynomials and Rational Functions

Polynomials and rational functions are continuous at every point in their domains.

 Proof
Previously, we showed that if $p(x)$ and $q(x)$ are polynomials, then $\lim_{x \to a} p(x) = p(a)$ and $\lim_{x \to a} \frac{p(x)}{q(x)} = \frac{p(a)}{q(a)}$ as long as $q(a) \neq 0$. Therefore, polynomials and rational functions are continuous on their domains.

We now apply Theorem 2.5.1 to determine the points at which a given rational function is continuous.

 Example 2.5.2:Continuity of a Rational Function


For what values of $x$ is $f(x) = \frac{x + 1}{x - 5}$ continuous?

Solution
The rational function $f(x) = \frac{x + 1}{x - 5}$ is continuous for every value of $x$ except $x = 5$, where the denominator is zero and $f$ is undefined.

 Exercise 2.5.2

For what values of $x$ is $f(x) = 3x^4 - 4x^2$ continuous?

Hint
Use the Continuity of Polynomials and Rational Functions stated above.

Answer
f (x) is continuous at every real number.

Types of Discontinuities
As we have seen in Example 2.5.1A and Example 2.5.1B, discontinuities take on several different appearances. We classify the types of
discontinuities we have seen thus far as removable discontinuities, infinite discontinuities, or jump discontinuities. Intuitively, a
removable discontinuity is a discontinuity for which there is a hole in the graph, a jump discontinuity is a noninfinite discontinuity for
which the sections of the function do not meet up, and an infinite discontinuity is a discontinuity located at a vertical asymptote. Figure
2.5.6 illustrates the differences in these types of discontinuities. Although these terms provide a handy way of describing three common

types of discontinuities, keep in mind that not all discontinuities fit neatly into these categories.

Figure 2.5.6 : Discontinuities are classified as (a) removable, (b) jump, or (c) infinite.
These three discontinuities are formally defined as follows:

 Definition
If $f(x)$ is discontinuous at $a$, then
1. $f$ has a removable discontinuity at $a$ if $\lim_{x \to a} f(x)$ exists. (Note: When we state that $\lim_{x \to a} f(x)$ exists, we mean that $\lim_{x \to a} f(x) = L$, where $L$ is a real number.)
2. $f$ has a jump discontinuity at $a$ if $\lim_{x \to a^-} f(x)$ and $\lim_{x \to a^+} f(x)$ both exist, but $\lim_{x \to a^-} f(x) \neq \lim_{x \to a^+} f(x)$. (Note: When we state that $\lim_{x \to a^-} f(x)$ and $\lim_{x \to a^+} f(x)$ both exist, we mean that both are real-valued and that neither takes on the values $\pm\infty$.)
3. $f$ has an infinite discontinuity at $a$ if $\lim_{x \to a^-} f(x) = \pm\infty$ or $\lim_{x \to a^+} f(x) = \pm\infty$.

 Example 2.5.3: Classifying a Discontinuity


In Example 2.5.1A, we showed that $f(x) = \frac{x^2 - 4}{x - 2}$ is discontinuous at $x = 2$. Classify this discontinuity as removable, jump, or infinite.
Solution
To classify the discontinuity at 2 we must evaluate $\lim_{x \to 2} f(x)$:
$$\begin{aligned}
\lim_{x \to 2} f(x) &= \lim_{x \to 2} \frac{x^2 - 4}{x - 2} \\
&= \lim_{x \to 2} \frac{(x - 2)(x + 2)}{x - 2} \\
&= \lim_{x \to 2} (x + 2) \\
&= 4.
\end{aligned}$$
Since $f$ is discontinuous at 2 and $\lim_{x \to 2} f(x)$ exists, $f$ has a removable discontinuity at $x = 2$.

 Example 2.5.4: Classifying a Discontinuity


In Example 2.5.1B, we showed that $f(x) = \begin{cases} -x^2 + 4, & \text{if } x \le 3 \\ 4x - 8, & \text{if } x > 3 \end{cases}$ is discontinuous at $x = 3$. Classify this discontinuity as removable, jump, or infinite.


Solution
Earlier, we showed that $f$ is discontinuous at 3 because $\lim_{x \to 3} f(x)$ does not exist. However, since $\lim_{x \to 3^-} f(x) = -5$ and $\lim_{x \to 3^+} f(x) = 4$ both exist, we conclude that the function has a jump discontinuity at 3.

 Example 2.5.5: Classifying a Discontinuity


Determine whether $f(x) = \frac{x + 2}{x + 1}$ is continuous at $-1$. If the function is discontinuous at $-1$, classify the discontinuity as removable, jump, or infinite.


Solution
The function value $f(-1)$ is undefined. Therefore, the function is not continuous at $-1$. To determine the type of discontinuity, we must determine the limit at $-1$. We see that $\lim_{x \to -1^-} \frac{x + 2}{x + 1} = -\infty$ and $\lim_{x \to -1^+} \frac{x + 2}{x + 1} = +\infty$. Therefore, the function has an infinite discontinuity at $-1$.
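The classifications in Examples 2.5.3 through 2.5.5 can be checked with a rough numerical experiment. The sketch below is an assumption-laden heuristic rather than a rigorous test: it samples the function at distance `h` on each side of the point and treats any sampled value larger in magnitude than `big` as evidence of an infinite one-sided limit. The function name `classify_discontinuity` and the thresholds `h`, `tol`, and `big` are hypothetical choices made for this illustration.

```python
def classify_discontinuity(f, a, h=1e-6, tol=1e-4, big=1e5):
    """Heuristically classify the behavior of f at x = a (illustrative only)."""
    left, right = f(a - h), f(a + h)
    # Very large sampled values suggest an infinite one-sided limit.
    if abs(left) > big or abs(right) > big:
        return "infinite"
    # Finite but different one-sided estimates suggest a jump.
    if abs(left - right) > tol:
        return "jump"
    # The two-sided limit appears to exist; compare it with f(a), if defined.
    try:
        fa = f(a)
    except ZeroDivisionError:
        return "removable"
    return "continuous" if abs(fa - (left + right) / 2) <= tol else "removable"

def f_253(x):
    # Example 2.5.3
    return (x**2 - 4) / (x - 2)

def f_254(x):
    # Example 2.5.4
    return -x**2 + 4 if x <= 3 else 4 * x - 8

def f_255(x):
    # Example 2.5.5
    return (x + 2) / (x + 1)

print(classify_discontinuity(f_253, 2))
print(classify_discontinuity(f_254, 3))
print(classify_discontinuity(f_255, -1))
```

A function with large but finite values near the point would be mislabeled as infinite by this sketch, which is one reason the formal definitions rely on limits rather than on sampled values.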

 Exercise 2.5.3
For $f(x) = \begin{cases} x^2, & \text{if } x \neq 1 \\ 3, & \text{if } x = 1 \end{cases}$, decide whether $f$ is continuous at 1. If $f$ is not continuous at 1, classify the discontinuity as removable, jump, or infinite.

Hint
Consider the definitions of the various kinds of discontinuity stated above. If the function is discontinuous at 1, look at $\lim_{x \to 1} f(x)$.

Answer
Discontinuous at 1; removable

Continuity over an Interval


Now that we have explored the concept of continuity at a point, we extend that idea to continuity over an interval. As we develop this
idea for different types of intervals, it may be useful to keep in mind the intuitive idea that a function is continuous over an interval if we
can use a pencil to trace the function between any two points in the interval without lifting the pencil from the paper. In preparation for
defining continuity on an interval, we begin by looking at the definition of what it means for a function to be continuous from the right at
a point and continuous from the left at a point.

 Definition: Continuity from the Right and from the Left

A function $f(x)$ is said to be continuous from the right at $a$ if $\lim_{x \to a^+} f(x) = f(a)$.

A function $f(x)$ is said to be continuous from the left at $a$ if $\lim_{x \to a^-} f(x) = f(a)$.

A function is continuous over an open interval if it is continuous at every point in the interval. A function f (x) is continuous over a
closed interval of the form [a, b] if it is continuous at every point in (a, b) and is continuous from the right at a and is continuous from the
left at b. Analogously, a function f (x) is continuous over an interval of the form (a, b] if it is continuous over (a, b) and is continuous
from the left at b. Continuity over other types of intervals is defined in a similar fashion.
Requiring that $\lim_{x \to a^+} f(x) = f(a)$ and $\lim_{x \to b^-} f(x) = f(b)$ ensures that we can trace the graph of the function from the point $(a, f(a))$ to the point $(b, f(b))$ without lifting the pencil. If, for example, $\lim_{x \to a^+} f(x) \neq f(a)$, we would need to lift our pencil to jump from $f(a)$ to the graph of the rest of the function over $(a, b]$.

 Example 2.5.6: Continuity on an Interval


State the interval(s) over which the function $f(x) = \frac{x - 1}{x^2 + 2x}$ is continuous.

Solution
Since $f(x) = \frac{x - 1}{x^2 + 2x}$ is a rational function, it is continuous at every point in its domain. The domain of $f(x)$ is the set $(-\infty, -2) \cup (-2, 0) \cup (0, +\infty)$. Thus, $f(x)$ is continuous over each of the intervals $(-\infty, -2)$, $(-2, 0)$, and $(0, +\infty)$.

 Example 2.5.7: Continuity over an Interval


State the interval(s) over which the function $f(x) = \sqrt{4 - x^2}$ is continuous.

Solution
From the limit laws, we know that $\lim_{x \to a} \sqrt{4 - x^2} = \sqrt{4 - a^2}$ for all values of $a$ in $(-2, 2)$. We also know that $\lim_{x \to -2^+} \sqrt{4 - x^2} = 0 = f(-2)$ and $\lim_{x \to 2^-} \sqrt{4 - x^2} = 0 = f(2)$, so $f$ is continuous from the right at $-2$ and from the left at $2$. Therefore, $f(x)$ is continuous over the interval $[-2, 2]$.

 Exercise 2.5.4
State the interval(s) over which the function $f(x) = \sqrt{x + 3}$ is continuous.

Hint
Use Example 2.5.7 as a guide.

Answer
[−3, +∞)

Theorem 2.5.2 allows us to expand our ability to compute limits. In particular, this theorem ultimately allows us to demonstrate that
trigonometric functions are continuous over their domains.

 Theorem 2.5.2: Composite Function Theorem

If $f(x)$ is continuous at $L$ and $\lim_{x \to a} g(x) = L$, then
$$\lim_{x \to a} f(g(x)) = f\left(\lim_{x \to a} g(x)\right) = f(L).$$

Before we move on to Example 2.5.8, recall that earlier, in the section on limit laws, we showed $\lim_{x \to 0} \cos x = 1 = \cos(0)$. Consequently, we know that $f(x) = \cos x$ is continuous at 0. In Example 2.5.8, we see how to combine this result with the composite function theorem.

 Example 2.5.8: Limit of a Composite Cosine Function


Evaluate $\lim_{x \to \pi/2} \cos\left(x - \frac{\pi}{2}\right)$.

Solution
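The given function is a composite of $\cos x$ and $x - \frac{\pi}{2}$. Since $\lim_{x \to \pi/2} \left(x - \frac{\pi}{2}\right) = 0$ and $\cos x$ is continuous at 0, we may apply the composite function theorem. Thus,
$$\lim_{x \to \pi/2} \cos\left(x - \frac{\pi}{2}\right) = \cos\left(\lim_{x \to \pi/2} \left(x - \frac{\pi}{2}\right)\right) = \cos(0) = 1.$$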

 Exercise 2.5.4:
Evaluate $\lim_{x \to \pi} \sin(x - \pi)$.

Hint
$f(x) = \sin x$ is continuous at 0. Use Example 2.5.8 as a guide.

Answer
0

The proof of the next theorem uses the composite function theorem as well as the continuity of f (x) = sin x and g(x) = cos x at the
point 0 to show that trigonometric functions are continuous over their entire domains.

 Theorem 2.5.3: Continuity of Trigonometric Functions

Trigonometric functions are continuous over their entire domains.

 Proof

We begin by demonstrating that $\cos x$ is continuous at every real number. To do this, we must show that $\lim_{x \to a} \cos x = \cos a$ for all values of $a$.

$$\begin{aligned}
\lim_{x \to a} \cos x &= \lim_{x \to a} \cos((x - a) + a) && \text{Rewrite } x = x - a + a. \\
&= \lim_{x \to a} \bigl(\cos(x - a)\cos a - \sin(x - a)\sin a\bigr) && \text{Apply the identity for the cosine of the sum of two angles.} \\
&= \cos\bigl(\lim_{x \to a}(x - a)\bigr)\cos a - \sin\bigl(\lim_{x \to a}(x - a)\bigr)\sin a && \text{Since } \lim_{x \to a}(x - a) = 0, \text{ and } \sin x \text{ and } \cos x \text{ are continuous at } 0. \\
&= \cos(0)\cos a - \sin(0)\sin a && \text{Evaluate } \cos(0) \text{ and } \sin(0) \text{ and simplify.} \\
&= 1 \cdot \cos a - 0 \cdot \sin a = \cos a.
\end{aligned}$$

The proof that sin x is continuous at every real number is analogous. Because the remaining trigonometric functions may be
expressed in terms of sin x and cos x, their continuity follows from the quotient limit law.

As you can see, the composite function theorem is invaluable in demonstrating the continuity of trigonometric functions. As we continue
our study of calculus, we revisit this theorem many times.

The Intermediate Value Theorem


Functions that are continuous over intervals of the form [a, b], where a and b are real numbers, exhibit many useful properties.
Throughout our study of calculus, we will encounter many powerful theorems concerning such functions. The first of these theorems is
the Intermediate Value Theorem.

 The Intermediate Value Theorem

Let $f$ be continuous over a closed, bounded interval $[a, b]$. If $z$ is any real number between $f(a)$ and $f(b)$, then there is a number $c$ in $[a, b]$ satisfying $f(c) = z$, as illustrated in Figure 2.5.7.

Figure 2.5.7 : There is a number c ∈ [a, b] that satisfies f (c) = z.

 Example 2.5.9: Application of the Intermediate Value Theorem

Show that f (x) = x − cos x has at least one zero.


Solution
Since f (x) = x − cos x is continuous over (−∞, +∞) , it is continuous over any closed interval of the form [a, b]. If you can find an
interval [a, b] such that f (a) and f (b) have opposite signs, you can use the Intermediate Value Theorem to conclude there must be a
real number c in (a, b) that satisfies f (c) = 0 . Note that
$$f(0) = 0 - \cos(0) = -1 < 0$$
and
$$f\left(\frac{\pi}{2}\right) = \frac{\pi}{2} - \cos\frac{\pi}{2} = \frac{\pi}{2} > 0.$$
Using the Intermediate Value Theorem, we can see that there must be a real number $c$ in $[0, \pi/2]$ that satisfies $f(c) = 0$. Therefore, $f(x) = x - \cos x$ has at least one zero.
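The Intermediate Value Theorem only asserts that a zero exists, but the same sign-change idea can be used to approximate one. The sketch below applies the bisection method, which is a standard technique and not something this section develops, to $f(x) = x - \cos x$ on $[0, \pi/2]$. The function name `bisect_zero` and the tolerance `tol` are assumptions made for this illustration.

```python
import math

def bisect_zero(f, a, b, tol=1e-10):
    """Approximate a zero of a continuous f on [a, b] by repeated halving."""
    fa, fb = f(a), f(b)
    if fa == 0:
        return a
    if fb == 0:
        return b
    if fa * fb > 0:
        # Without a sign change the Intermediate Value Theorem gives no guarantee.
        raise ValueError("f(a) and f(b) must have opposite signs")
    while b - a > tol:
        c = (a + b) / 2
        fc = f(c)
        if fc == 0:
            return c
        # Keep the half-interval on which the sign change persists.
        if fa * fc < 0:
            b, fb = c, fc
        else:
            a, fa = c, fc
    return (a + b) / 2

root = bisect_zero(lambda x: x - math.cos(x), 0, math.pi / 2)
print(root)  # approximately 0.739085
```

Each pass halves an interval on which $f$ changes sign, so the theorem guarantees a zero inside every interval the loop visits. The `ValueError` branch reflects the caution in the examples that follow: equal signs at the endpoints tell us nothing either way.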

 Example 2.5.10: When Can You Apply the Intermediate Value Theorem?

If f (x) is continuous over [0, 2], f (0) > 0 and f (2) > 0 , can we use the Intermediate Value Theorem to conclude that f (x) has no
zeros in the interval [0, 2]? Explain.
Solution
No. The Intermediate Value Theorem only guarantees that $f$ takes on every value between $f(0)$ and $f(2)$; it doesn’t allow us to conclude that $f$ takes on no other values. To see this more clearly, consider the function $f(x) = (x - 1)^2$. It satisfies $f(0) = 1 > 0$, $f(2) = 1 > 0$, and $f(1) = 0$.

 Example 2.5.11: When Can You Apply the Intermediate Value Theorem?

For f (x) = 1/x, f (−1) = −1 < 0 and f (1) = 1 > 0 . Can we conclude that f (x) has a zero in the interval [−1, 1]?
Solution
No. The function is not continuous over $[-1, 1]$ because it is undefined at $x = 0$. The Intermediate Value Theorem does not apply here.

 Exercise 2.5.5
Show that $f(x) = x^3 - x^2 - 3x + 1$ has a zero over the interval $[0, 1]$.

Hint
Find f (0) and f (1). Apply the Intermediate Value Theorem.

Answer
f (0) = 1 > 0, f (1) = −2 < 0; f (x) is continuous over [0, 1]. It must have a zero on this interval.

Key Concepts
For a function to be continuous at a point, it must be defined at that point, its limit must exist at the point, and the value of the function
at that point must equal the value of the limit at that point.
Discontinuities may be classified as removable, jump, or infinite.
A function is continuous over an open interval if it is continuous at every point in the interval. It is continuous over a closed interval if it is continuous at every point in its interior, continuous from the right at its left endpoint, and continuous from the left at its right endpoint.
The composite function theorem states: If $f(x)$ is continuous at $L$ and $\lim_{x \to a} g(x) = L$, then $\lim_{x \to a} f(g(x)) = f\left(\lim_{x \to a} g(x)\right) = f(L)$.

The Intermediate Value Theorem guarantees that if a function is continuous over a closed interval, then the function takes on every
value between the values at its endpoints.

Glossary
continuity at a point
A function $f(x)$ is continuous at a point $a$ if and only if the following three conditions are satisfied: (1) $f(a)$ is defined, (2) $\lim_{x \to a} f(x)$ exists, and (3) $\lim_{x \to a} f(x) = f(a)$

continuity from the left
A function is continuous from the left at $b$ if $\lim_{x \to b^-} f(x) = f(b)$

continuity from the right
A function is continuous from the right at $a$ if $\lim_{x \to a^+} f(x) = f(a)$

continuity over an interval
a function that can be traced with a pencil without lifting the pencil; a function is continuous over an open interval if it is continuous at every point in the interval; a function $f(x)$ is continuous over a closed interval of the form $[a, b]$ if it is continuous at every point in $(a, b)$, and it is continuous from the right at $a$ and from the left at $b$

discontinuity at a point
A function is discontinuous at a point or has a discontinuity at a point if it is not continuous at the point

infinite discontinuity
An infinite discontinuity occurs at a point $a$ if $\lim_{x \to a^-} f(x) = \pm\infty$ or $\lim_{x \to a^+} f(x) = \pm\infty$

Intermediate Value Theorem
Let $f$ be continuous over a closed bounded interval $[a, b]$. If $z$ is any real number between $f(a)$ and $f(b)$, then there is a number $c$ in $[a, b]$ satisfying $f(c) = z$

jump discontinuity
A jump discontinuity occurs at a point $a$ if $\lim_{x \to a^-} f(x)$ and $\lim_{x \to a^+} f(x)$ both exist, but $\lim_{x \to a^-} f(x) \neq \lim_{x \to a^+} f(x)$

removable discontinuity
A removable discontinuity occurs at a point $a$ if $f(x)$ is discontinuous at $a$, but $\lim_{x \to a} f(x)$ exists

2.5: Continuity is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
2.4: Continuity by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0. Original source: https://openstax.org/details/books/calculus-
volume-1.

2.6: Limits at Infinity; Horizontal Asymptotes
In Definition 1 we stated that in the equation $\lim_{x \to c} f(x) = L$, both $c$ and $L$ were numbers. In this section we relax that definition a bit by considering situations when it makes sense to let $c$ and/or $L$ be "infinity."

As a motivating example, consider $f(x) = 1/x^2$, as shown in Figure 1.30. Note how, as $x$ approaches 0, $f(x)$ grows very, very large. It seems appropriate, and descriptive, to state that
$$\lim_{x \to 0} \frac{1}{x^2} = \infty. \tag{2.6.1}$$
Also note that as $x$ gets very large, $f(x)$ gets very, very small. We could represent this concept with notation such as
$$\lim_{x \to \infty} \frac{1}{x^2} = 0. \tag{2.6.2}$$

FIGURE 1.30: Graphing $f(x) = 1/x^2$ for values of $x$ near 0.

We explore both types of use of ∞ in turn.

Definition 5: Limit of infinity


We say $\lim_{x \to c} f(x) = \infty$ if for every $M > 0$ there exists $\delta > 0$ such that for all $x \neq c$, if $|x - c| < \delta$, then $f(x) \ge M$.

This is just like the ϵ-δ definition from Section 1.2. In that definition, given any (small) value $\epsilon$, if we let $x$ get close enough to $c$ (within $\delta$ units of $c$) then $f(x)$ is guaranteed to be within $\epsilon$ of the limit $L$. Here, given any (large) value $M$, if we let $x$ get close enough to $c$ (within $\delta$ units of $c$), then $f(x)$ will be at least as large as $M$. In other words, if we get close enough to $c$, then we can make $f(x)$ as large as we want. We can define limits equal to $-\infty$ in a similar way.

It is important to note that by saying $\lim_{x \to c} f(x) = \infty$ we are implicitly stating that the limit of $f(x)$, as $x$ approaches $c$, does not exist. A limit only exists when $f(x)$ approaches an actual numeric value. We use the concept of limits that approach infinity because it is helpful and descriptive.

Example 26: Evaluating limits involving infinity


Find $\lim_{x \to 1} \frac{1}{(x - 1)^2}$ as shown in Figure 1.31.

FIGURE 1.31 : Observing infinite limit as x → 1 in Example 26.

2.6.1 https://math.libretexts.org/@go/page/4445
Solution
In Example 4 of Section 1.1, by inspecting values of $x$ close to 1 we concluded that this limit does not exist. That is, it cannot equal any real number. But the limit could be infinite. And in fact, we see that the function does appear to be growing larger and larger, as $f(.99) = 10^4$, $f(.999) = 10^6$, $f(.9999) = 10^8$. A similar thing happens on the other side of 1. In general, let a "large" value $M$ be given. Let $\delta = 1/\sqrt{M}$. If $x$ is within $\delta$ of 1, i.e., if $|x - 1| < 1/\sqrt{M}$, then:
$$\begin{aligned}
|x - 1| &< \frac{1}{\sqrt{M}} \\
(x - 1)^2 &< \frac{1}{M} \\
\frac{1}{(x - 1)^2} &> M,
\end{aligned}$$
which is what we wanted to show. So we may say $\lim_{x \to 1} 1/(x - 1)^2 = \infty$.
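The choice $\delta = 1/\sqrt{M}$ in the argument above can be spot-checked numerically. The following Python sketch is only an illustration under stated assumptions: it tests a handful of sample points inside the punctured interval $(1 - \delta, 1 + \delta)$, which supports the inequality but does not prove it. The helper name `delta_works` and the particular sample fractions are hypothetical.

```python
import math

def f(x):
    return 1 / (x - 1) ** 2

def delta_works(M, fractions=(0.9, 0.5, 0.1, 0.01)):
    """Check f(x) > M at sample points with 0 < |x - 1| < 1/sqrt(M)."""
    delta = 1 / math.sqrt(M)
    # Sample on both sides of 1, at several fractions of delta.
    samples = [1 + sign * t * delta for sign in (-1, 1) for t in fractions]
    return all(f(x) > M for x in samples)

for M in (1e2, 1e4, 1e6):
    print(M, delta_works(M))
```

Points farther than $\delta$ from 1 need not satisfy the bound. For instance, with $M = 10^4$ the point $x = 1.02$ lies at distance $2\delta$ and gives $f(x) = 2500$, which is less than $M$.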

Example 27: Evaluating limits involving infinity


Find $\lim_{x \to 0} \frac{1}{x}$, as shown in Figure 1.32.

FIGURE 1.32: Evaluating $\lim_{x \to 0} \frac{1}{x}$.

Solution
It is easy to see that the function grows without bound near 0, but it does so in different ways on different sides of 0. Since its behavior is not consistent, we cannot say that $\lim_{x \to 0} \frac{1}{x} = \infty$. However, we can make a statement about one-sided limits. We can state that $\lim_{x \to 0^+} \frac{1}{x} = \infty$ and $\lim_{x \to 0^-} \frac{1}{x} = -\infty$.

Vertical Asymptotes
If the limit of f (x) as x approaches c from either the left or right (or both) is ∞ or −∞ , we say the function has a vertical
asymptote at c .

Example 28: Finding vertical asymptotes


Find the vertical asymptotes of $f(x) = \frac{3x}{x^2 - 4}$.

FIGURE 1.33: Graphing $f(x) = \frac{3x}{x^2 - 4}$.

Solution
Vertical asymptotes occur where the function grows without bound; this can occur at values of c where the denominator is 0.
When x is near c , the denominator is small, which in turn can make the function take on large values. In the case of the given
function, the denominator is 0 at x = ±2 . Substituting in values of x close to 2 and −2 seems to indicate that the function
tends toward ∞ or −∞ at those points. We can graphically confirm this by looking at Figure 1.33. Thus the vertical
asymptotes are at x = ±2 .

When a rational function has a vertical asymptote at $x = c$, we can conclude that the denominator is 0 at $x = c$. However, just because the denominator is 0 at a certain point does not mean there is a vertical asymptote there. For instance, $f(x) = (x^2 - 1)/(x - 1)$ does not have a vertical asymptote at $x = 1$, as shown in Figure 1.34. While the denominator does get small near $x = 1$, the numerator gets small too, matching the denominator step for step. In fact, factoring the numerator, we get
$$f(x) = \frac{(x - 1)(x + 1)}{x - 1}. \tag{2.6.3}$$
Canceling the common term, we get that $f(x) = x + 1$ for $x \neq 1$. So there is clearly no asymptote; rather, a hole exists in the graph at $x = 1$.

FIGURE 1.34: Graphically showing that $f(x) = \frac{x^2 - 1}{x - 1}$ does not have an asymptote at $x = 1$.

The above example may seem a little contrived. Another example demonstrating this important concept is $f(x) = (\sin x)/x$. We have considered this function several times in the previous sections. We found that $\lim_{x \to 0} \frac{\sin x}{x} = 1$; i.e., there is no vertical asymptote. No simple algebraic cancellation makes this fact obvious; we used the Squeeze Theorem in Section 1.3 to prove this.
If the denominator is 0 at a certain point but the numerator is not, then there will usually be a vertical asymptote at that point. On
the other hand, if the numerator and denominator are both zero at that point, then there may or may not be a vertical asymptote at
that point. This case where the numerator and denominator are both zero returns us to an important topic.

Indeterminate Forms
We have seen how the limits
$$\lim_{x \to 0} \frac{\sin x}{x} \quad \text{and} \quad \lim_{x \to 1} \frac{x^2 - 1}{x - 1} \tag{2.6.4}$$
each return the indeterminate form "0/0" when we blindly plug in $x = 0$ and $x = 1$, respectively. However, 0/0 is not a valid arithmetical expression. It gives no indication that the respective limits are 1 and 2.
With a little cleverness, one can come up with 0/0 expressions which have a limit of $\infty$, 0, or any other real number. That is why this expression is called indeterminate.
A key concept to understand is that such limits do not really return 0/0. Rather, keep in mind that we are taking limits. What is
really happening is that the numerator is shrinking to 0 while the denominator is also shrinking to 0. The respective rates at which
they do this are very important and determine the actual value of the limit.
An indeterminate form indicates that one needs to do more work in order to compute the limit. That work may be algebraic (such as
factoring and canceling) or it may require a tool such as the Squeeze Theorem. In a later section we will learn a technique called
l'Hospital's Rule that provides another way to handle indeterminate forms.
Some other common indeterminate forms are $\infty - \infty$, $\infty \cdot 0$, $\infty/\infty$, $0^0$, $\infty^0$ and $1^\infty$. Again, keep in mind that these are the "blind" results of evaluating a limit, and each, in and of itself, has no meaning. The expression $\infty - \infty$ does not really mean "subtract infinity from infinity." Rather, it means "One quantity is subtracted from the other, but both are growing without bound." What is the result? It is possible to get every value between $-\infty$ and $\infty$.
Note that $1/0$ and $\infty/0$ are not indeterminate forms, though they are not exactly valid mathematical expressions, either. In each, the function is growing without bound, indicating that the limit will be $\infty$, $-\infty$, or simply not exist if the left- and right-hand limits do not match.

Limits at Infinity and Horizontal Asymptotes


At the beginning of this section we briefly considered what happens to $f(x) = 1/x^2$ as $x$ grew very large. Graphically, it concerns the behavior of the function to the "far right" of the graph. We make this notion more explicit in the following definition.

Definition 6: Limits at Infinity and Horizontal Asymptote


1. We say $\lim_{x \to \infty} f(x) = L$ if for every $\epsilon > 0$ there exists $M > 0$ such that if $x \ge M$, then $|f(x) - L| < \epsilon$.
2. We say $\lim_{x \to -\infty} f(x) = L$ if for every $\epsilon > 0$ there exists $M < 0$ such that if $x \le M$, then $|f(x) - L| < \epsilon$.
3. If $\lim_{x \to \infty} f(x) = L$ or $\lim_{x \to -\infty} f(x) = L$, we say that $y = L$ is a horizontal asymptote of $f$.

We can also define limits such as $\lim_{x \to \infty} f(x) = \infty$ by combining this definition with Definition 5.

Example 29: Approximating horizontal asymptotes


Approximate the horizontal asymptote(s) of $f(x) = \frac{x^2}{x^2 + 4}$.

Solution
We will approximate the horizontal asymptotes by approximating the limits
$$\lim_{x \to -\infty} \frac{x^2}{x^2 + 4} \quad \text{and} \quad \lim_{x \to \infty} \frac{x^2}{x^2 + 4}. \tag{2.6.5}$$

Figure 1.35(a) shows a sketch of f , and part (b) gives values of f (x) for large magnitude values of x. It seems reasonable to
conclude from both of these sources that f has a horizontal asymptote at y = 1 .
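A table of this kind can be regenerated with a few lines of Python. The sample inputs below are assumptions chosen for illustration and are not necessarily the values tabulated in Figure 1.35(b).

```python
def f(x):
    return x**2 / (x**2 + 4)

# Evaluate f at inputs of increasingly large magnitude, on both sides.
for x in (10, 100, 1000, 10000):
    print(f"{x:>6}  f(x) = {f(x):.8f}   f(-x) = {f(-x):.8f}")
```

Because $x$ appears only as $x^2$, the values at $x$ and $-x$ coincide, and both columns creep up toward 1 from below. This is consistent with a single horizontal asymptote at $y = 1$.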

FIGURE 1.35 : Using a graph and a table to approximate a horizontal asymptote in Example 29.
Later, we will show how to determine this analytically.

Horizontal asymptotes can take on a variety of forms. Figure 1.36(a) shows that $f(x) = x/(x^2 + 1)$ has a horizontal asymptote of $y = 0$, where 0 is approached from both above and below.

Figure 1.36(b) shows that $f(x) = x/\sqrt{x^2 + 1}$ has two horizontal asymptotes; one at $y = 1$ and the other at $y = -1$.

Figure 1.36(c) shows that $f(x) = (\sin x)/x$ has even more interesting behavior than at just $x = 0$; as $x$ approaches $\pm\infty$, $f(x)$ approaches 0, but oscillates as it does this.

FIGURE 1.36 : Considering different types of horizontal asymptotes.


We can analytically evaluate limits at infinity for rational functions once we understand $\lim_{x \to \infty} 1/x$. As $x$ gets larger and larger, $1/x$ gets smaller and smaller, approaching 0. We can, in fact, make $1/x$ as small as we want by choosing a large enough value of $x$. Given $\epsilon$, we can make $1/x < \epsilon$ by choosing $x > 1/\epsilon$. Thus we have $\lim_{x \to \infty} 1/x = 0$.

It is now not much of a jump to conclude the following:
$$\lim_{x \to \infty} \frac{1}{x^n} = 0 \quad \text{and} \quad \lim_{x \to -\infty} \frac{1}{x^n} = 0 \tag{2.6.6}$$

Now suppose we need to compute the following limit:


$$\lim_{x \to \infty} \frac{x^3 + 2x + 1}{4x^3 - 2x^2 + 9}. \tag{2.6.7}$$

A good way of approaching this is to divide through the numerator and denominator by $x^3$ (hence dividing by 1), which is the largest power of $x$ to appear in the function. Doing this, we get
$$\begin{aligned}
\lim_{x \to \infty} \frac{x^3 + 2x + 1}{4x^3 - 2x^2 + 9} &= \lim_{x \to \infty} \frac{1/x^3}{1/x^3} \cdot \frac{x^3 + 2x + 1}{4x^3 - 2x^2 + 9} \\
&= \lim_{x \to \infty} \frac{x^3/x^3 + 2x/x^3 + 1/x^3}{4x^3/x^3 - 2x^2/x^3 + 9/x^3} \\
&= \lim_{x \to \infty} \frac{1 + 2/x^2 + 1/x^3}{4 - 2/x + 9/x^3}.
\end{aligned}$$
Then using the rules for limits (which also hold for limits at infinity), as well as the fact about limits of $1/x^n$, we see that the limit becomes
$$\frac{1 + 0 + 0}{4 - 0 + 0} = \frac{1}{4}. \tag{2.6.8}$$

This procedure works for any rational function. In fact, it gives us the following theorem.

Theorem 11: Limits of Rational Functions at Infinity


Let $f(x)$ be a rational function of the following form:
$$f(x) = \frac{a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0}{b_m x^m + b_{m-1} x^{m-1} + \cdots + b_1 x + b_0}, \tag{2.6.9}$$
where any of the coefficients may be 0 except for $a_n$ and $b_m$.

1. If $n = m$, then $\lim_{x \to \infty} f(x) = \lim_{x \to -\infty} f(x) = \frac{a_n}{b_m}$.
2. If $n < m$, then $\lim_{x \to \infty} f(x) = \lim_{x \to -\infty} f(x) = 0$.
3. If $n > m$, then $\lim_{x \to \infty} f(x)$ and $\lim_{x \to -\infty} f(x)$ are both infinite.

We can see why this is true. If the highest power of $x$ is the same in both the numerator and denominator (i.e. $n = m$), we will be in a situation like the example above, where we will divide by $x^n$ and in the limit all the terms will approach 0 except for $a_n x^n / x^n$ and $b_m x^m / x^n$. Since $n = m$, this will leave us with the limit $a_n / b_m$. If $n < m$, then after dividing through by $x^m$, all the terms in the numerator will approach 0 in the limit, leaving us with $0 / b_m$ or 0. If $n > m$, and we try dividing through by $x^n$, we end up with all the terms in the denominator tending toward 0, while the $x^n$ term in the numerator does not approach 0. This is indicative of some sort of infinite limit.

Intuitively, as $x$ gets very large, all the terms in the numerator are small in comparison to $a_n x^n$, and likewise all the terms in the denominator are small compared to $b_m x^m$. If $n = m$, looking only at these two important terms, we have $(a_n x^n)/(b_m x^m)$. This reduces to $a_n / b_m$. If $n < m$, the function behaves like $a_n / (b_m x^{m-n})$, which tends toward 0. If $n > m$, the function behaves like $a_n x^{n-m} / b_m$, which will tend to either $\infty$ or $-\infty$ depending on the values of $n$, $m$, $a_n$, $b_m$ and whether you are looking for $\lim_{x \to \infty} f(x)$ or $\lim_{x \to -\infty} f(x)$.

With care, we can quickly evaluate limits at infinity for a large number of functions by considering the largest powers of $x$. For instance, consider again $\lim_{x \to \pm\infty} \frac{x}{\sqrt{x^2 + 1}}$, graphed in Figure 1.36(b). When $x$ is very large, $x^2 + 1 \approx x^2$. Thus
$$\sqrt{x^2 + 1} \approx \sqrt{x^2} = |x|, \quad \text{and} \quad \frac{x}{\sqrt{x^2 + 1}} \approx \frac{x}{|x|}. \tag{2.6.10}$$

This expression is 1 when x is positive and −1 when x is negative. Hence we get asymptotes of y = 1 and y = −1 , respectively.
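The approximation $x/\sqrt{x^2 + 1} \approx x/|x|$ can be seen in sampled values. The short Python sketch below is illustrative only. The name `g` and the sample inputs are assumptions, not part of the text.

```python
import math

def g(x):
    return x / math.sqrt(x**2 + 1)

# Compare values far to the right and far to the left.
for x in (10.0, 1e3, 1e6):
    print(x, g(x), g(-x))
```

The printed values approach 1 for positive inputs and $-1$ for negative inputs, in line with the two horizontal asymptotes.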

Example 30: Finding a limit of a rational function


Confirm analytically that $y = 1$ is the horizontal asymptote of $f(x) = \frac{x^2}{x^2 + 4}$, as approximated in Example 29.
Solution
Before using Theorem 11, let's use the technique of evaluating limits at infinity of rational functions that led to that theorem. The largest power of $x$ in $f$ is 2, so divide the numerator and denominator of $f$ by $x^2$, then take limits.
$$\begin{aligned}
\lim_{x \to \infty} \frac{x^2}{x^2 + 4} &= \lim_{x \to \infty} \frac{x^2/x^2}{x^2/x^2 + 4/x^2} \\
&= \lim_{x \to \infty} \frac{1}{1 + 4/x^2} \\
&= \frac{1}{1 + 0} \\
&= 1.
\end{aligned}$$
We can also use Theorem 11 directly; in this case $n = m$ so the limit is the ratio of the leading coefficients of the numerator and denominator, i.e., $1/1 = 1$.

Example 31: Finding limits of rational functions
Use Theorem 11 to evaluate each of the following limits.
1. $\lim_{x \to -\infty} \frac{x^2 + 2x - 1}{x^3 + 1}$
2. $\lim_{x \to \infty} \frac{x^2 + 2x - 1}{1 - x - 3x^2}$
3. $\lim_{x \to \infty} \frac{x^2 - 1}{3 - x}$

FIGURE 1.37 : Visualizing the functions in Example 31.


Solution
1. The highest power of $x$ is in the denominator. Therefore, the limit is 0; see Figure 1.37(a).
2. The highest power of $x$ is $x^2$, which occurs in both the numerator and denominator. The limit is therefore the ratio of the coefficients of $x^2$, which is $-1/3$. See Figure 1.37(b).
3. The highest power of $x$ is in the numerator so the limit will be $\infty$ or $-\infty$. To see which, consider only the dominant terms from the numerator and denominator, which are $x^2$ and $-x$. The expression in the limit will behave like $x^2/(-x) = -x$ for large values of $x$. Therefore, the limit is $-\infty$. See Figure 1.37(c).
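The case analysis of Theorem 11 is mechanical enough to write as a short function. The Python sketch below is one possible encoding, under the assumption that each polynomial is given as a list of integer coefficients with the highest power first and a nonzero leading coefficient. The function name `limit_at_infinity` and the `toward` parameter (positive for $x \to \infty$, negative for $x \to -\infty$) are hypothetical.

```python
from fractions import Fraction
import math

def limit_at_infinity(num, den, toward=+1):
    """Limit of a rational function at +/- infinity, following Theorem 11.

    num, den: integer coefficient lists, highest power first,
    with nonzero leading coefficients (an assumption of this sketch).
    """
    n, m = len(num) - 1, len(den) - 1
    a_n, b_m = num[0], den[0]
    if n == m:
        # Case 1: ratio of the leading coefficients.
        return Fraction(a_n, b_m)
    if n < m:
        # Case 2: the denominator dominates.
        return Fraction(0)
    # Case 3: f behaves like (a_n / b_m) * x**(n - m).
    sign = 1 if a_n * b_m > 0 else -1
    if toward < 0 and (n - m) % 2 == 1:
        sign = -sign  # an odd power of a negative x flips the sign
    return sign * math.inf

# The three limits of Example 31:
print(limit_at_infinity([1, 2, -1], [1, 0, 0, 1], toward=-1))
print(limit_at_infinity([1, 2, -1], [-3, -1, 1]))
print(limit_at_infinity([1, 0, -1], [-1, 3]))
```

Only the degrees and the leading coefficients enter the computation, which mirrors the observation that the lower-order terms become negligible for large $|x|$.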

Chapter Summary
In this chapter we:
defined the limit,
found accessible ways to approximate their values numerically and graphically,
developed a not-so-easy method of proving the value of a limit (ϵ-δ proofs),
explored when limits do not exist,
defined continuity and explored properties of continuous functions, and
considered limits that involved infinity.
Why? Mathematics is famous for building on itself and calculus proves to be no exception. In the next chapter we will be interested in "dividing by 0." That is, we will want to divide a quantity by a smaller and smaller number and see what value the quotient approaches. In other words, we will want to find a limit. These limits will enable us to, among other things, determine exactly how fast something is moving when we are only given position information.
Later, we will want to add up an infinite list of numbers. We will do so by first adding up a finite list of numbers, then taking a limit as the number of things we are adding approaches infinity. Surprisingly, this sum often is finite; that is, we can add up an infinite list of numbers and get, for instance, 42.
These are just two quick examples of why we are interested in limits. Many students dislike this topic when they are first
introduced to it, but over time an appreciation is often formed based on the scope of its applicability.

Contributors and Attributions


Gregory Hartman (Virginia Military Institute). Contributions were made by Troy Siemers and Dimplekumar Chalishajar of
VMI and Brian Heinold of Mount Saint Mary's University. This content is copyrighted by a Creative Commons Attribution -
Noncommercial (BY-NC) License. http://www.apexcalculus.com/

2.6: Limits at Infinity; Horizontal Asymptotes is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
1.6: Limits Involving Infinity by Gregory Hartman et al. is licensed CC BY-NC 3.0. Original source: http://www.apexcalculus.com/.

2.7: Derivatives and Rates of Change
2.7: Derivatives and Rates of Change is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.

2.8: The Derivative as a Function
2.8: The Derivative as a Function is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.

CHAPTER OVERVIEW

3: Differentiation Rules
A general Calculus Textmap organized around the textbook

Calculus: Early Transcendentals


by James Stewart


This Textmap is currently under construction... please be patient with us.

Topic hierarchy
3.1: Derivatives of Polynomials and Exponential Functions
3.2: The Product and Quotient Rules
3.3: Derivatives of Trigonometric Functions
3.4: The Chain Rule
3.5: Implicit Differentiation
3.6: Derivatives of Logarithmic Functions
3.7: Rates of Change in the Natural and Social Sciences
3.8: Exponential Growth and Decay
3.9: Related Rates
3.10: Linear Approximations and Differentials
3.11: Hyperbolic Functions

3: Differentiation Rules is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.

3.2: The Product and Quotient Rules
3.2: The Product and Quotient Rules is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.

3.3: Derivatives of Trigonometric Functions
4.2: The Derivative of sin x - I
4.4: The Derivative of Sin x Part II
4.5: Derivatives of the Trigonometric Functions

3.3: Derivatives of Trigonometric Functions is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.

3.4: The Chain Rule
We have covered almost all of the derivative rules that deal with combinations of two (or more) functions. The operations of addition,
subtraction, multiplication (including by a constant) and division led to the Sum and Difference rules, the Constant Multiple Rule, the
Power Rule, the Product Rule and the Quotient Rule. To complete the list of differentiation rules, we look at the last way two (or
more) functions can be combined: the process of composition (i.e., one function "inside" another).
One example of a composition of functions is $f(x) = \cos(x^2)$. We currently do not know how to compute this derivative. If forced to guess, one would likely guess $f'(x) = -\sin(2x)$, where we recognize $-\sin x$ as the derivative of $\cos x$ and $2x$ as the derivative of $x^2$. However, this is not the case; $f'(x) \neq -\sin(2x)$. In Example 62 we'll see the correct answer, which employs the new rule this section introduces, the Chain Rule.
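
One way to see that the guess fails, without yet knowing the correct rule, is to estimate the slope numerically. The following sketch assumes a symmetric difference quotient is an adequate estimate of the derivative; the helper name `central_difference` and the step size `h` are choices made here, not part of the text.

```python
import math

def central_difference(func, x, h=1e-6):
    """Approximate func'(x) with a symmetric difference quotient."""
    return (func(x + h) - func(x - h)) / (2 * h)

def f(x):
    return math.cos(x**2)

def naive_guess(x):
    # The tempting (incorrect) guess discussed in the text.
    return -math.sin(2 * x)

x0 = 1.0
print("numerical slope at x = 1:", central_difference(f, x0))
print("naive guess at x = 1:    ", naive_guess(x0))
```

The two printed values disagree noticeably at $x = 1$, which is consistent with the claim that $f'(x) \neq -\sin(2x)$.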


Before we define this new rule, recall the notation for composition of functions. We write $(f \circ g)(x)$ or $f(g(x))$, read as "$f$ of $g$ of $x$," to denote composing $f$ with $g$. In shorthand, we simply write $f \circ g$ or $f(g)$ and read it as "$f$ of $g$." Before giving the corresponding differentiation rule, we note that the rule extends to multiple compositions like $f(g(h(x)))$ or $f(g(h(j(x))))$, etc.
To motivate the rule, let's look at three derivatives we can already compute.

Example 59: Exploring similar derivatives


Find the derivatives of
a. $F_1(x) = (1-x)^2$,
b. $F_2(x) = (1-x)^3$, and
c. $F_3(x) = (1-x)^4$.

We'll see later why we are using subscripts for different functions and an uppercase $F$.
Solution
In order to use the rules we already have, we must first expand each function as
a. $F_1(x) = 1 - 2x + x^2$,
b. $F_2(x) = 1 - 3x + 3x^2 - x^3$, and
c. $F_3(x) = 1 - 4x + 6x^2 - 4x^3 + x^4$.
It is not hard to see that:
$$\begin{aligned}
F_1'(x) &= -2 + 2x, \\
F_2'(x) &= -3 + 6x - 3x^2, \\
F_3'(x) &= -4 + 12x - 12x^2 + 4x^3.
\end{aligned}$$

An interesting fact is that these can be rewritten as
$$F_1'(x) = -2(1-x), \quad F_2'(x) = -3(1-x)^2 \quad \text{and} \quad F_3'(x) = -4(1-x)^3. \tag{3.4.1}$$

A pattern might jump out at you. Recognize that each of these functions is a composition, letting $g(x) = 1 - x$:
$$\begin{aligned}
F_1(x) &= f_1(g(x)), \quad \text{where } f_1(x) = x^2, \\
F_2(x) &= f_2(g(x)), \quad \text{where } f_2(x) = x^3, \\
F_3(x) &= f_3(g(x)), \quad \text{where } f_3(x) = x^4.
\end{aligned}$$

We'll come back to this example after giving the formal statements of the Chain Rule; for now, we are just illustrating a pattern.

Theorem 18: The Chain Rule


Let $y = f(u)$ be a differentiable function of $u$ and let $u = g(x)$ be a differentiable function of $x$. Then $y = f(g(x))$ is a differentiable function of $x$, and
$$y' = f'(g(x)) \cdot g'(x). \tag{3.4.2}$$

To help understand the Chain Rule, we return to Example 59.

Example 60: Using the Chain Rule
Use the Chain Rule to find the derivatives of the following functions, as given in Example 59.
Solution
Example 59 ended with the recognition that each of the given functions was actually a composition of functions. To avoid
confusion, we ignore most of the subscripts here.
$F_1(x) = (1-x)^2$:
We found that
$$y = (1-x)^2 = f(g(x)), \quad \text{where } f(x) = x^2 \text{ and } g(x) = 1 - x. \tag{3.4.3}$$
To find $y'$, we apply the Chain Rule. We need $f'(x) = 2x$ and $g'(x) = -1$.
Part of the Chain Rule uses $f'(g(x))$. This means substitute $g(x)$ for $x$ in the equation for $f'(x)$. That is, $f'(g(x)) = 2(1-x)$.
Finishing out the Chain Rule we have
$$y' = f'(g(x)) \cdot g'(x) = 2(1-x) \cdot (-1) = -2(1-x) = 2x - 2. \tag{3.4.4}$$

$F_2(x) = (1-x)^3$:
Let $y = (1-x)^3 = f(g(x))$, where $f(x) = x^3$ and $g(x) = 1 - x$. We have $f'(x) = 3x^2$, so $f'(g(x)) = 3(1-x)^2$. The Chain Rule then states
$$y' = f'(g(x)) \cdot g'(x) = 3(1-x)^2 \cdot (-1) = -3(1-x)^2. \tag{3.4.5}$$

$F_3(x) = (1-x)^4$:
Finally, when $y = (1-x)^4$, we have $f(x) = x^4$ and $g(x) = 1 - x$. Thus $f'(x) = 4x^3$ and $f'(g(x)) = 4(1-x)^3$. Thus
$$y' = f'(g(x)) \cdot g'(x) = 4(1-x)^3 \cdot (-1) = -4(1-x)^3. \tag{3.4.6}$$

Example 60 demonstrated a particular pattern: when $f(x) = x^n$, then $y' = n \cdot (g(x))^{n-1} \cdot g'(x)$. This is called the Generalized Power Rule.

Theorem 19: Generalized Power Rule

Let $g(x)$ be a differentiable function and let $n \neq 0$ be an integer. Then
$$\frac{d}{dx}\Big(g(x)^n\Big) = n \cdot \big(g(x)\big)^{n-1} \cdot g'(x). \tag{3.4.7}$$

This allows us to quickly find the derivative of functions like $y = (3x^2 - 5x + 7 + \sin x)^{20}$. While it may look intimidating, the Generalized Power Rule states that
$$y' = 20(3x^2 - 5x + 7 + \sin x)^{19} \cdot (6x - 5 + \cos x). \tag{3.4.8}$$
Treat the derivative-taking process step-by-step. In the example just given, first multiply by 20, then rewrite the inside of the parentheses, raising it all to the 19th power. Then think about the derivative of the expression inside the parentheses, and multiply by that.
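
The Generalized Power Rule can be spot-checked numerically. The sketch below is a minimal check under stated assumptions: the sample point `x0 = 0.5`, the helper `central_difference`, and its step size are arbitrary choices made here for illustration.

```python
import math

def g(x):
    return 3*x**2 - 5*x + 7 + math.sin(x)

def g_prime(x):
    return 6*x - 5 + math.cos(x)

def y(x):
    return g(x)**20

def y_prime(x):
    # Generalized Power Rule: n * g(x)**(n-1) * g'(x) with n = 20.
    return 20 * g(x)**19 * g_prime(x)

def central_difference(func, x, h=1e-6):
    """Symmetric difference quotient, used only as a numerical cross-check."""
    return (func(x + h) - func(x - h)) / (2 * h)

x0 = 0.5
exact = y_prime(x0)
approx = central_difference(y, x0)
print("relative error:", abs(approx - exact) / abs(exact))
```

A small relative error suggests the formula and the numerical slope agree at that point; it does not replace the proof of the rule.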
We now consider more examples that employ the Chain Rule.

Example 61: Using the Chain Rule


Find the derivatives of the following functions:
a. $y = \sin 2x$
b. $y = \ln(4x^3 - 2x^2)$
c. $y = e^{-x^2}$

Solution
a. Consider $y = \sin 2x$. Recognize that this is a composition of functions, where $f(x) = \sin x$ and $g(x) = 2x$. Thus
$$y' = f'(g(x)) \cdot g'(x) = \cos(2x) \cdot 2 = 2\cos 2x. \tag{3.4.9}$$

b. Recognize that $y = \ln(4x^3 - 2x^2)$ is the composition of $f(x) = \ln x$ and $g(x) = 4x^3 - 2x^2$. Also, recall that
$$\frac{d}{dx}\big(\ln x\big) = \frac{1}{x}. \tag{3.4.10}$$
This leads us to:
$$y' = \frac{1}{4x^3 - 2x^2} \cdot (12x^2 - 4x) = \frac{12x^2 - 4x}{4x^3 - 2x^2} = \frac{4x(3x-1)}{2x(2x^2 - x)} = \frac{2(3x-1)}{2x^2 - x}. \tag{3.4.11}$$

c. Recognize that $y = e^{-x^2}$ is the composition of $f(x) = e^x$ and $g(x) = -x^2$. Remembering that $f'(x) = e^x$, we have
$$y' = e^{-x^2} \cdot (-2x) = (-2x)e^{-x^2}. \tag{3.4.12}$$

Example 62: Using the Chain Rule to find a tangent line


Let $f(x) = \cos x^2$. Find the equation of the line tangent to the graph of $f$ at $x = 1$.

Solution
The tangent line goes through the point $(1, f(1)) \approx (1, 0.54)$ with slope $f'(1)$. To find $f'$, we need the Chain Rule.
$f'(x) = -\sin(x^2) \cdot (2x) = -2x \sin x^2$. Evaluated at $x = 1$, we have $f'(1) = -2\sin 1 \approx -1.68$. Thus the equation of the tangent line is
$$y = -1.68(x - 1) + 0.54. \tag{3.4.13}$$

The tangent line is sketched along with f in Figure 2.17.

Figure 2.17: $f(x) = \cos x^2$ sketched along with its tangent line at $x = 1$.
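
The arithmetic in Example 62 can be reproduced in a few lines. This is a sketch only; the helper `tangent_line` and its (slope, intercept) return convention are introduced here for illustration.

```python
import math

def f(x):
    return math.cos(x**2)

def f_prime(x):
    # Chain Rule: -sin(x^2) * 2x
    return -2 * x * math.sin(x**2)

def tangent_line(x0):
    """Return (slope, intercept) of the line tangent to f at x0."""
    slope = f_prime(x0)
    intercept = f(x0) - slope * x0
    return slope, intercept

slope, intercept = tangent_line(1.0)
print(f"y = {slope:.2f}(x - 1) + {f(1.0):.2f}")
```

Rounded to two decimal places, the slope and the point of tangency match the values $-1.68$ and $0.54$ used in the example.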

The Chain Rule is used often in taking derivatives. Because of this, one can become familiar with the basic process and learn patterns
that facilitate finding derivatives quickly. For instance,

$$\frac{d}{dx}\Big(\ln(\text{anything})\Big) = \frac{1}{\text{anything}} \cdot (\text{anything})' = \frac{(\text{anything})'}{\text{anything}}. \tag{3.4.14}$$

A concrete example of this is


$$\frac{d}{dx}\Big(\ln(3x^{15} - \cos x + e^x)\Big) = \frac{45x^{14} + \sin x + e^x}{3x^{15} - \cos x + e^x}. \tag{3.4.15}$$

While the derivative may look intimidating at first, look for the pattern. The denominator is the same as what was inside the natural
log function; the numerator is simply its derivative.
This pattern recognition process can be applied to lots of functions. In general, instead of writing "anything", we use $u$ as a generic function of $x$. We then say
$$\frac{d}{dx}\big(\ln u\big) = \frac{u'}{u}. \tag{3.4.16}$$

The following is a short list of how the Chain Rule can be quickly applied to familiar functions.

Of course, the Chain Rule can be applied in conjunction with any of the other rules we have already learned. We practice this next.

Example 63: Using the Product, Quotient and Chain Rules


Find the derivatives of the following functions.
a. $f(x) = x^5 \sin 2x^3$
b. $f(x) = \dfrac{5x^3}{e^{-x^2}}$.

Solution
a. We must use the Product and Chain Rules. Do not think that you must be able to "see" the whole answer immediately; rather, just proceed step-by-step.
$$f'(x) = x^5(6x^2 \cos 2x^3) + 5x^4(\sin 2x^3) = 6x^7 \cos 2x^3 + 5x^4 \sin 2x^3. \tag{3.4.17}$$

b. We must employ the Quotient Rule along with the Chain Rule. Again, proceed step-by-step.
$$\begin{aligned}
f'(x) &= \frac{e^{-x^2}(15x^2) - 5x^3\big((-2x)e^{-x^2}\big)}{\big(e^{-x^2}\big)^2} = \frac{e^{-x^2}(10x^4 + 15x^2)}{e^{-2x^2}} \\
&= e^{x^2}(10x^4 + 15x^2).
\end{aligned}$$

A key to correctly working these problems is to break the problem down into smaller, more manageable pieces. For instance, when using the Product and Chain Rules together, just consider the first part of the Product Rule at first: $f(x)g'(x)$. Just rewrite $f(x)$, then find $g'(x)$. Then move on to the $f'(x)g(x)$ part. Don't attempt to figure out both parts at once.

Likewise, using the Quotient Rule, approach the numerator in two steps and handle the denominator after completing that. Only
simplify afterward.
We can also employ the Chain Rule itself several times, as shown in the next example.

Example 64: Using the Chain Rule multiple times


Find the derivative of $y = \tan^5(6x^3 - 7x)$.
Solution
Recognize that we have the $g(x) = \tan(6x^3 - 7x)$ function "inside" the $f(x) = x^5$ function; that is, we have $y = \big(\tan(6x^3 - 7x)\big)^5$. We begin using the Generalized Power Rule; in this first step, we do not fully compute the derivative. Rather, we are approaching this step-by-step.
$$y' = 5\big(\tan(6x^3 - 7x)\big)^4 \cdot g'(x). \tag{3.4.18}$$

We now find $g'(x)$. We again need the Chain Rule;
$$g'(x) = \sec^2(6x^3 - 7x) \cdot (18x^2 - 7). \tag{3.4.19}$$

Combine this with what we found above to give
$$\begin{aligned}
y' &= 5\big(\tan(6x^3 - 7x)\big)^4 \cdot \sec^2(6x^3 - 7x) \cdot (18x^2 - 7) \\
&= (90x^2 - 35)\sec^2(6x^3 - 7x)\tan^4(6x^3 - 7x).
\end{aligned}$$

This function is frankly a ridiculous function, possessing no real practical value. It is very difficult to graph, as the tangent function has many vertical asymptotes and $6x^3 - 7x$ grows so very fast. The important thing to learn from this is that the derivative can be found. In fact, it is not "hard"; one must take several simple steps and be careful to keep track of how to apply each of these steps.
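
The step-by-step process of Example 64 can be mirrored in code. The sketch below is only an illustration: the function `chain_derivative` and the list of (function, derivative) pairs are conventions made up here. Working from the innermost function outward, it multiplies the derivative of each layer evaluated at that layer's input.

```python
import math

def chain_derivative(layers, x):
    """Derivative at x of a composition whose layers are given innermost-first.

    `layers` is a list of (function, derivative) pairs, a representation
    chosen for this sketch. The result is the product of each layer's
    derivative evaluated at the value fed into that layer.
    """
    value, slope = x, 1.0
    for func, deriv in layers:
        slope *= deriv(value)   # derivative of this layer at its input
        value = func(value)     # pass the output on to the next layer
    return slope

# y = tan^5(6x^3 - 7x), split into three layers.
layers = [
    (lambda x: 6*x**3 - 7*x, lambda x: 18*x**2 - 7),
    (math.tan,               lambda u: 1 / math.cos(u)**2),  # sec^2 u
    (lambda v: v**5,         lambda v: 5 * v**4),
]

def closed_form(x):
    # (90x^2 - 35) sec^2(6x^3 - 7x) tan^4(6x^3 - 7x), from Example 64.
    u = 6*x**3 - 7*x
    return (90*x**2 - 35) * math.tan(u)**4 / math.cos(u)**2

x0 = 0.1
print(chain_derivative(layers, x0), closed_form(x0))
```

The two printed numbers should agree up to rounding. Adding another pair to `layers` corresponds to applying the Chain Rule one more time.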

It is a traditional mathematical exercise to find the derivatives of arbitrarily complicated functions just to demonstrate that it can be
done. Just break everything down into smaller pieces.

Example 65: Using the Product, Quotient and Chain Rules

Find the derivative of $f(x) = \dfrac{x\cos(x^{-2}) - \sin^2(e^{4x})}{\ln(x^2 + 5x^4)}$.

Solution
This function likely has no practical use outside of demonstrating derivative skills. The answer is given below without
simplification. It employs the Quotient Rule, the Product Rule, and the Chain Rule three times.
$$f'(x) = \frac{\ln(x^2 + 5x^4) \cdot \Big[\big(x \cdot (-\sin(x^{-2})) \cdot (-2x^{-3}) + 1 \cdot \cos(x^{-2})\big) - 2\sin(e^{4x}) \cdot \cos(e^{4x}) \cdot (4e^{4x})\Big] - \big(x\cos(x^{-2}) - \sin^2(e^{4x})\big) \cdot \dfrac{2x + 20x^3}{x^2 + 5x^4}}{\big(\ln(x^2 + 5x^4)\big)^2}. \tag{3.4.20}$$

The reader is highly encouraged to look at each term and recognize why it is there. (I.e., the Quotient Rule is used; in the numerator, identify the "LOdHI" term, etc.) This example demonstrates that derivatives can be computed systematically, no matter how arbitrarily complicated the function is.

The Chain Rule also has theoretical value. That is, it can be used to find the derivatives of functions whose derivatives we have not yet learned, as we do in the following example.

Example 66: The Chain Rule and exponential functions


Use the Chain Rule to find the derivative of $y = a^x$ where $a > 0$, $a \neq 1$ is constant.

Solution
We only know how to find the derivative of one exponential function: $y = e^x$; this problem is asking us to find the derivative of functions such as $y = 2^x$.
This can be accomplished by rewriting $a^x$ in terms of $e$. Recalling that $e^x$ and $\ln x$ are inverse functions, we can write
$$a = e^{\ln a} \quad \text{and so} \quad y = a^x = e^{\ln(a^x)}.$$
By the exponent property of logarithms, we can "bring down" the power to get
$$y = a^x = e^{x(\ln a)}.$$

The function is now the composition $y = f(g(x))$, with $f(x) = e^x$ and $g(x) = x(\ln a)$. Since $f'(x) = e^x$ and $g'(x) = \ln a$, the Chain Rule gives
$$y' = e^{x(\ln a)} \cdot \ln a.$$

Recall that the $e^{x(\ln a)}$ term on the right hand side is just $a^x$, our original function. Thus, the derivative contains the original function itself. We have
$$y' = y \cdot \ln a = a^x \cdot \ln a.$$
The Chain Rule, coupled with the derivative rule of $e^x$, allows us to find the derivatives of all exponential functions.

The previous example produced a result worthy of its own "box."

Theorem 20: Derivatives of Exponential Functions
Let $f(x) = a^x$, for $a > 0, a \neq 1$. Then $f$ is differentiable for all real numbers and
$$f'(x) = \ln a \cdot a^x.$$
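
Theorem 20 is easy to test numerically for a few bases. The sketch below is a sanity check rather than a proof; the function names, the sample bases, and the sample point `1.5` are all choices made here.

```python
import math

def exp_base_derivative(a, x):
    """Theorem 20: d/dx a**x = ln(a) * a**x, for a > 0 and a != 1."""
    return math.log(a) * a**x

def central_difference(func, x, h=1e-6):
    """Symmetric difference quotient, used only as a numerical cross-check."""
    return (func(x + h) - func(x - h)) / (2 * h)

for a in (2.0, 0.5, 10.0):
    approx = central_difference(lambda t: a**t, 1.5)
    print(a, exp_base_derivative(a, 1.5), approx)
```

For each base the formula and the numerical estimate should agree closely. Note that the derivative is negative when $0 < a < 1$, since $\ln a < 0$ there.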

Alternate Chain Rule Notation


It is instructive to understand what the Chain Rule "looks like" using "$\frac{dy}{dx}$" notation instead of $y'$ notation. Suppose that $y = f(u)$ is a function of $u$, where $u = g(x)$ is a function of $x$, as stated in Theorem 18. Then, through the composition $f \circ g$, we can think of $y$ as a function of $x$, as $y = f(g(x))$. Thus the derivative of $y$ with respect to $x$ makes sense; we can talk about $\frac{dy}{dx}$. This leads to an interesting progression of notation:
$$\begin{aligned}
y' &= f'(g(x)) \cdot g'(x) \\
\frac{dy}{dx} &= y'(u) \cdot u'(x) && \text{(since $y = f(u)$ and $u = g(x)$)} \\
\frac{dy}{dx} &= \frac{dy}{du} \cdot \frac{du}{dx} && \text{(using "fractional" notation for the derivative)}
\end{aligned}$$

Here the "fractional" aspect of the derivative notation stands out. On the right hand side, it seems as though the "$du$" terms cancel out, leaving
$$\frac{dy}{dx} = \frac{dy}{dx}. \tag{3.4.21}$$

It is important to realize that we are not canceling these terms; the derivative notation of $\frac{dy}{dx}$ is one symbol. It is equally important to realize that this notation was chosen precisely because of this behavior. It makes applying the Chain Rule easy with multiple variables. For instance,
$$\frac{dy}{dt} = \frac{dy}{d\bigcirc} \cdot \frac{d\bigcirc}{d\triangle} \cdot \frac{d\triangle}{dt}, \tag{3.4.22}$$
where $\bigcirc$ and $\triangle$ are any variables you'd like to use.


One of the most common ways of "visualizing" the Chain Rule is to consider a set of gears, as shown in Figure 2.18. The gears have 36, 18, and 6 teeth, respectively. That means for every revolution of the $x$ gear, the $u$ gear revolves twice. That is, the rate at which the $u$ gear makes a revolution is twice as fast as the rate at which the $x$ gear makes a revolution. Using the terminology of calculus, the rate of $u$-change, with respect to $x$, is $\frac{du}{dx} = 2$.

Figure 2.18: A series of gears to demonstrate the Chain Rule. Note how $\frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx}$.

Likewise, every revolution of $u$ causes 3 revolutions of $y$: $\frac{dy}{du} = 3$. How does $y$ change with respect to $x$? For each revolution of $x$, $y$ revolves 6 times; that is,
$$\frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx} = 3 \cdot 2 = 6. \tag{3.4.23}$$

We can then extend the Chain Rule with more variables by adding more gears to the picture.
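
The gear picture translates directly into arithmetic with tooth counts. The sketch below is an illustration under simplifying assumptions: like the text, it ignores the direction of rotation, and the helper names `gear_rate` and `train_rate` are invented here.

```python
from fractions import Fraction

def gear_rate(driver_teeth, driven_teeth):
    """Revolutions of the driven gear per revolution of the driver."""
    return Fraction(driver_teeth, driven_teeth)

def train_rate(teeth):
    """Rate of the last gear with respect to the first in a gear train.

    `teeth` lists tooth counts in order; the overall rate is the product
    of the rates of consecutive pairs, as in dy/dx = dy/du * du/dx.
    """
    rate = Fraction(1)
    for driver, driven in zip(teeth, teeth[1:]):
        rate *= gear_rate(driver, driven)
    return rate

du_dx = gear_rate(36, 18)        # u turns twice per turn of x
dy_du = gear_rate(18, 6)         # y turns three times per turn of u
dy_dx = train_rate([36, 18, 6])
print(du_dx, dy_du, dy_dx)       # 2 3 6
```

Appending another tooth count to the list, for example `train_rate([36, 18, 6, 3])`, adds one more factor to the product, just as adding a gear adds one more factor to the Chain Rule.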
It is difficult to overstate the importance of the Chain Rule. So often the functions that we deal with are compositions of two or more functions, requiring us to use this rule to compute derivatives. It is often used in practice when actual functions are unknown. Rather, through measurement, we can calculate $\frac{dy}{du}$ and $\frac{du}{dx}$. With our knowledge of the Chain Rule, finding $\frac{dy}{dx}$ is straightforward.

In the next section, we use the Chain Rule to justify another differentiation technique. There are many curves that we can draw in the plane that fail the "vertical line test." For instance, consider $x^2 + y^2 = 1$, which describes the unit circle. We may still be interested in finding slopes of tangent lines to the circle at various points. The next section shows how we can find $\frac{dy}{dx}$ without first "solving for $y$." While we can in this instance, in many other instances solving for $y$ is impossible. In these situations, implicit differentiation is indispensable.

3.4: The Chain Rule is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
2.5: The Chain Rule by Gregory Hartman et al. is licensed CC BY-NC 3.0. Original source: http://www.apexcalculus.com/.

3.5: Implicit Differentiation
In the previous sections we learned to find the derivative, $\frac{dy}{dx}$, or $y'$, when $y$ is given explicitly as a function of $x$. That is, if we know $y = f(x)$ for some function $f$, we can find $y'$. For example, given $y = 3x^2 - 7$, we can easily find $y' = 6x$. (Here we explicitly state how $x$ and $y$ are related. Knowing $x$, we can directly find $y$.)
Sometimes the relationship between $y$ and $x$ is not explicit; rather, it is implicit. For instance, we might know that $x^2 - y = 4$. This equality defines a relationship between $x$ and $y$; if we know $x$, we could figure out $y$. Can we still find $y'$? In this case, sure; we solve for $y$ to get $y = x^2 - 4$ (hence we now know $y$ explicitly) and then differentiate to get $y' = 2x$.
Sometimes the implicit relationship between $x$ and $y$ is complicated. Suppose we are given $\sin(y) + y^3 = 6 - x^3$. A graph of this implicit function is given in Figure 2.19. In this case there is absolutely no way to solve for $y$ in terms of elementary functions. The surprising thing is, however, that we can still find $y'$ via a process known as implicit differentiation.

Figure 2.19: A graph of the implicit function $\sin(y) + y^3 = 6 - x^3$.
Implicit differentiation is a technique based on the Chain Rule that is used to find a derivative when the relationship between the
variables is given implicitly rather than explicitly (solved for one variable in terms of the other).
We begin by reviewing the Chain Rule. Let $f$ and $g$ be functions of $x$. Then
$$\frac{d}{dx}\Big(f(g(x))\Big) = f'(g(x)) \cdot g'(x). \tag{3.5.1}$$
Suppose now that $y = g(x)$. We can rewrite the above as
$$\frac{d}{dx}\Big(f(y)\Big) = f'(y) \cdot y', \quad \text{or} \quad \frac{d}{dx}\Big(f(y)\Big) = f'(y) \cdot \frac{dy}{dx}. \tag{2.1}$$
These equations look strange; the key concept to learn here is that we can find $y'$ even if we don't exactly know how $y$ and $x$ relate.

We demonstrate this process in the following example.

Example 67: Using Implicit Differentiation


Find $y'$ given that $\sin(y) + y^3 = 6 - x^3$.
Solution
We start by taking the derivative of both sides (thus maintaining the equality). We have:
$$\frac{d}{dx}\Big(\sin(y) + y^3\Big) = \frac{d}{dx}\Big(6 - x^3\Big). \tag{3.5.2}$$
The right hand side is easy; it returns $-3x^2$.
The left hand side requires more consideration. We take the derivative term-by-term. Using the technique derived from Equation 2.1 above, we can see that
$$\frac{d}{dx}\Big(\sin y\Big) = \cos y \cdot y'. \tag{3.5.3}$$
We apply the same process to the $y^3$ term.
$$\frac{d}{dx}\Big(y^3\Big) = \frac{d}{dx}\Big((y)^3\Big) = 3(y)^2 \cdot y'. \tag{3.5.4}$$

Putting this together with the right hand side, we have
$$\cos(y)y' + 3y^2y' = -3x^2. \tag{3.5.5}$$
Now solve for $y'$.
$$\begin{aligned}
\cos(y)y' + 3y^2y' &= -3x^2 \\
(\cos y + 3y^2)y' &= -3x^2 \\
y' &= \frac{-3x^2}{\cos y + 3y^2}
\end{aligned}$$
This equation for $y'$ probably seems unusual for it contains both $x$ and $y$ terms. How is it to be used? We'll address that next.

Implicit functions are generally harder to deal with than explicit functions. With an explicit function, given an x value, we have an
explicit formula for computing the corresponding y value. With an implicit function, one often has to find x and y values at the
same time that satisfy the equation. It is much easier to demonstrate that a given point satisfies the equation than to actually find
such a point.

For instance, we can affirm easily that the point $(\sqrt[3]{6}, 0)$ lies on the graph of the implicit function $\sin y + y^3 = 6 - x^3$. Plugging in 0 for $y$, we see the left hand side is 0. Setting $x = \sqrt[3]{6}$, we see the right hand side is also 0; the equation is satisfied. The following example finds the equation of the tangent line to this function at this point.

Example 68: Using Implicit Differentiation to find a tangent line



Find the equation of the line tangent to the curve of the implicitly defined function $\sin y + y^3 = 6 - x^3$ at the point $(\sqrt[3]{6}, 0)$.

Solution
In Example 67 we found that
$$y' = \frac{-3x^2}{\cos y + 3y^2}. \tag{3.5.6}$$
We find the slope of the tangent line at the point $(\sqrt[3]{6}, 0)$ by substituting $\sqrt[3]{6}$ for $x$ and 0 for $y$. Thus at the point $(\sqrt[3]{6}, 0)$, we have the slope as
$$y' = \frac{-3(\sqrt[3]{6})^2}{\cos 0 + 3 \cdot 0^2} = \frac{-3\sqrt[3]{36}}{1} \approx -9.91. \tag{3.5.7}$$
Therefore the equation of the tangent line to the implicitly defined function $\sin y + y^3 = 6 - x^3$ at the point $(\sqrt[3]{6}, 0)$ is
$$y = -3\sqrt[3]{36}\,(x - \sqrt[3]{6}) + 0 \approx -9.91x + 18. \tag{3.5.8}$$

The curve and this tangent line are shown in Figure 2.20.


Figure 2.20: The function $\sin y + y^3 = 6 - x^3$ and its tangent line at the point $(\sqrt[3]{6}, 0)$.
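
Although $y$ cannot be written in terms of elementary functions, it can be approximated numerically, which gives an independent check on the slope found in Example 68. The sketch below assumes Newton's method (applied in $y$ with $x$ held fixed) converges from the starting guess $y = 0$ near this point; the helper names, step counts, and step size `h` are choices made here.

```python
import math

def slope(x, y):
    """y' from Example 67: -3x^2 / (cos y + 3y^2)."""
    return -3 * x**2 / (math.cos(y) + 3 * y**2)

def solve_y(x, y_guess=0.0, steps=50):
    """Newton's method in y for sin(y) + y^3 = 6 - x^3, with x held fixed."""
    y = y_guess
    for _ in range(steps):
        residual = math.sin(y) + y**3 - (6 - x**3)
        y -= residual / (math.cos(y) + 3 * y**2)
    return y

x0 = 6 ** (1 / 3)          # cube root of 6
m = slope(x0, 0.0)         # slope of the tangent line at (x0, 0)

h = 1e-5
numeric = (solve_y(x0 + h) - solve_y(x0 - h)) / (2 * h)
print(f"formula: {m:.4f}, numeric: {numeric:.4f}, intercept: {-m * x0:.4f}")
```

The formula and the numerical estimate should both be close to $-9.91$, and the intercept $-m\,x_0$ should be close to 18, matching the tangent line in the example.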

This suggests a general method for implicit differentiation. For the steps below assume $y$ is a function of $x$.
1. Take the derivative of each term in the equation. Treat the $x$ terms like normal. When taking the derivatives of $y$ terms, the usual rules apply except that, because of the Chain Rule, we need to multiply each term by $y'$.
2. Get all the $y'$ terms on one side of the equal sign and put the remaining terms on the other side.
3. Factor out $y'$; solve for $y'$ by dividing.

Practical Note: When working by hand, it may be beneficial to use the symbol $\frac{dy}{dx}$ instead of $y'$, as the latter can be easily confused for $y$ or $y^1$.

Example 69: Using Implicit Differentiation


Given the implicitly defined function $y^3 + x^2y^4 = 1 + 2x$, find $y'$.

Solution
We will take the implicit derivatives term by term. The derivative of $y^3$ is $3y^2y'$.
The second term, $x^2y^4$, is a little tricky. It requires the Product Rule as it is the product of two functions of $x$: $x^2$ and $y^4$. Its derivative is $x^2(4y^3y') + 2xy^4$. The first part of this expression requires a $y'$ because we are taking the derivative of a $y$ term. The second part does not require it because we are taking the derivative of $x^2$.
The derivative of the right hand side is easily found to be 2. In all, we get:
$$3y^2y' + 4x^2y^3y' + 2xy^4 = 2. \tag{3.5.9}$$
Move terms around so that the left side consists only of the $y'$ terms and the right side consists of all the other terms:
$$3y^2y' + 4x^2y^3y' = 2 - 2xy^4. \tag{3.5.10}$$
Factor out $y'$ from the left side and solve to get
$$y' = \frac{2 - 2xy^4}{3y^2 + 4x^2y^3}. \tag{3.5.11}$$
To confirm the validity of our work, let's find the equation of a tangent line to this function at a point. It is easy to confirm that the point $(0, 1)$ lies on the graph of this function. At this point, $y' = 2/3$. So the equation of the tangent line is $y = \frac{2}{3}(x - 0) + 1$. The function and its tangent line are graphed in Figure 2.21.

Figure 2.21: A graph of the implicitly defined function $y^3 + x^2y^4 = 1 + 2x$ along with its tangent line at the point $(0, 1)$.
Notice how our function looks much different than other functions we have seen. For one, it fails the vertical line test. Such
functions are important in many areas of mathematics, so developing tools to deal with them is also important.

Example 70: Using Implicit Differentiation

Given the implicitly defined function $\sin(x^2y^2) + y^3 = x + y$, find $y'$.

Solution
Differentiating term by term, we find the most difficulty in the first term. It requires both the Chain and Product Rules.
$$\begin{aligned}
\frac{d}{dx}\Big(\sin(x^2y^2)\Big) &= \cos(x^2y^2) \cdot \frac{d}{dx}\Big(x^2y^2\Big) \\
&= \cos(x^2y^2) \cdot \big(x^2(2yy') + 2xy^2\big) \\
&= 2(x^2yy' + xy^2)\cos(x^2y^2).
\end{aligned}$$
We leave the derivatives of the other terms to the reader. After taking the derivatives of both sides, we have
$$2(x^2yy' + xy^2)\cos(x^2y^2) + 3y^2y' = 1 + y'. \tag{3.5.12}$$
We now have to be careful to properly solve for $y'$, particularly because of the product on the left. It is best to multiply out the product. Doing this, we get
$$2x^2y\cos(x^2y^2)y' + 2xy^2\cos(x^2y^2) + 3y^2y' = 1 + y'. \tag{3.5.13}$$
From here we can safely move around terms to get the following:
$$2x^2y\cos(x^2y^2)y' + 3y^2y' - y' = 1 - 2xy^2\cos(x^2y^2). \tag{3.5.14}$$
Then we can solve for $y'$ to get
$$y' = \frac{1 - 2xy^2\cos(x^2y^2)}{2x^2y\cos(x^2y^2) + 3y^2 - 1}. \tag{3.5.15}$$

A graph of this implicit function is given in Figure 2.22. It is easy to verify that the points $(0, 0)$, $(0, 1)$ and $(0, -1)$ all lie on the graph. We can find the slopes of the tangent lines at each of these points using our formula for $y'$.

Figure 2.22: A graph of the implicitly defined function $\sin(x^2y^2) + y^3 = x + y$.
At $(0, 0)$, the slope is $-1$.
At $(0, 1)$, the slope is $1/2$.
At $(0, -1)$, the slope is also $1/2$.
The tangent lines have been added to the graph of the function in Figure 2.23.

Figure 2.23: A graph of the implicitly defined function $\sin(x^2y^2) + y^3 = x + y$ and certain tangent lines.
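
Both claims in Example 70, that the three points lie on the curve and that the slopes there are $-1$, $1/2$ and $1/2$, can be confirmed by direct evaluation. In this sketch the helper names `on_curve` and `slope` and the tolerance `tol` are choices made here.

```python
import math

def on_curve(x, y, tol=1e-12):
    """True if (x, y) satisfies sin(x^2 y^2) + y^3 = x + y within tol."""
    return abs(math.sin(x**2 * y**2) + y**3 - (x + y)) < tol

def slope(x, y):
    """y' from Equation 3.5.15."""
    c = math.cos(x**2 * y**2)
    return (1 - 2 * x * y**2 * c) / (2 * x**2 * y * c + 3 * y**2 - 1)

for point in [(0, 0), (0, 1), (0, -1)]:
    print(point, on_curve(*point), slope(*point))
```

The same two functions can be reused at any other point on the curve, provided the denominator of the slope formula is nonzero there.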

Quite a few "famous" curves have equations that are given implicitly. We can use implicit differentiation to find the slope at various
points on those curves. We investigate two such curves in the next examples.

Example 71: Finding slopes of tangent lines to a circle



Find the slope of the tangent line to the circle $x^2 + y^2 = 1$ at the point $(1/2, \sqrt{3}/2)$.
Solution
Taking derivatives, we get $2x + 2yy' = 0$. Solving for $y'$ gives:
$$y' = \frac{-x}{y}. \tag{3.5.16}$$
This is a clever formula. Recall that the slope of the line through the origin and the point $(x, y)$ on the circle will be $y/x$. We have found that the slope of the tangent line to the circle at that point is the opposite reciprocal of $y/x$, namely, $-x/y$. Hence these two lines are always perpendicular.
At the point $(1/2, \sqrt{3}/2)$, we have the tangent line's slope as
$$y' = \frac{-1/2}{\sqrt{3}/2} = \frac{-1}{\sqrt{3}} \approx -0.577. \tag{3.5.17}$$


A graph of the circle and its tangent line at (1/2, √3/2) is given in Figure 2.24, along with a thin dashed line from the origin
that is perpendicular to the tangent line. (It turns out that all normal lines to a circle pass through the center of the circle.)


Figure 2.24: The unit circle with its tangent line at (1/2, √3/2).
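
The perpendicularity observation in Example 71 can be checked at several points of the circle: two lines with nonzero slopes are perpendicular exactly when their slopes multiply to $-1$. The sample angles below are arbitrary choices that avoid the axes, where one of the slopes is undefined; the function names are introduced here.

```python
import math

def tangent_slope(x, y):
    """Slope of the tangent to x^2 + y^2 = 1 at (x, y), from y' = -x/y."""
    return -x / y

def radius_slope(x, y):
    """Slope of the line from the origin to (x, y)."""
    return y / x

for degrees in (20, 60, 130, 250):
    t = math.radians(degrees)
    x, y = math.cos(t), math.sin(t)
    product = tangent_slope(x, y) * radius_slope(x, y)
    print(degrees, round(tangent_slope(x, y), 3), round(product, 12))
```

The 60 degree row corresponds to the point $(1/2, \sqrt{3}/2)$ from the example, and every product should come out as $-1$ up to rounding.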

This section has shown how to find the derivatives of implicitly defined functions, whose graphs include a wide variety of interesting and unusual shapes. Implicit differentiation can also be used to further our understanding of "regular" differentiation.
One hole in our current understanding of derivatives is this: what is the derivative of the square root function? That is,
$$\frac{d}{dx}\big(\sqrt{x}\big) = \frac{d}{dx}\big(x^{1/2}\big) = \,? \tag{3.5.18}$$
We allude to a possible solution, as we can write the square root function as a power function with a rational (or, fractional) power. We are then tempted to apply the Power Rule and obtain
$$\frac{d}{dx}\big(x^{1/2}\big) = \frac{1}{2}x^{-1/2} = \frac{1}{2\sqrt{x}}. \tag{3.5.19}$$

The trouble with this is that the Power Rule was initially defined only for positive integer powers, n > 0 . While we did not justify
this at the time, generally the Power Rule is proved using something called the Binomial Theorem, which deals only with positive
integers. The Quotient Rule allowed us to extend the Power Rule to negative integer powers. Implicit Differentiation allows us to
extend the Power Rule to rational powers, as shown below.
Let $y = x^{m/n}$, where $m$ and $n$ are integers with no common factors (so $m = 2$ and $n = 5$ is fine, but $m = 2$ and $n = 4$ is not). We can rewrite this explicit function implicitly as $y^n = x^m$. Now apply implicit differentiation.

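$$\begin{aligned}
y &= x^{m/n} \\
y^n &= x^m \\
\frac{d}{dx}\big(y^n\big) &= \frac{d}{dx}\big(x^m\big) \\
n \cdot y^{n-1} \cdot y' &= m \cdot x^{m-1} \\
y' &= \frac{m}{n}\,\frac{x^{m-1}}{y^{n-1}} && \text{(now substitute $x^{m/n}$ for $y$)} \\
&= \frac{m}{n}\,\frac{x^{m-1}}{(x^{m/n})^{n-1}} && \text{(apply lots of algebra)} \\
&= \frac{m}{n}\,x^{(m-n)/n} \\
&= \frac{m}{n}\,x^{m/n-1}.
\end{aligned}$$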
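
The "lots of algebra" step can be sanity-checked by evaluating the expression before and after simplification at a sample point. The sketch below assumes $x > 0$ so that all fractional powers are real; the function names and the test values are choices made here.

```python
def rational_power_derivative(m, n, x):
    """(m/n) * x**(m/n - 1), the final line of the derivation (x > 0)."""
    return (m / n) * x ** (m / n - 1)

def implicit_form(m, n, x):
    """y' = (m/n) * x**(m-1) / y**(n-1) with y = x**(m/n), before simplifying."""
    y = x ** (m / n)
    return (m / n) * x ** (m - 1) / y ** (n - 1)

x0 = 4.0
print(rational_power_derivative(1, 2, x0))   # derivative of sqrt(x) at x = 4
print(implicit_form(2, 5, x0), rational_power_derivative(2, 5, x0))
```

With $m = 1$ and $n = 2$ the first line evaluates $\frac{1}{2\sqrt{x}}$ at $x = 4$, and the last two numbers should agree up to rounding.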

The above derivation is the key to the proof extending the Power Rule to rational powers. Using limits, we can extend this once
more to include all powers, including irrational (even transcendental!) powers, giving the following theorem.

Theorem 21: Power Rule for Differentiation

Let $f(x) = x^n$, where $n \neq 0$ is a real number. Then $f$ is a differentiable function, and $f'(x) = n \cdot x^{n-1}$.

This theorem allows us to say the derivative of $x^\pi$ is $\pi x^{\pi - 1}$.
We now apply this final version of the Power Rule in the next example, the second investigation of a "famous" curve.

Example 72: Using the Power Rule

Find the slope of $x^{2/3} + y^{2/3} = 8$ at the point $(8, 8)$.
Solution
This is a particularly interesting curve called an astroid. It is the shape traced out by a point on the edge of a circle that is
rolling around inside of a larger circle, as shown in Figure 2.25.

Figure 2.25: An astroid, traced out by a point on the smaller circle as it rolls inside the larger circle.
To find the slope of the astroid at the point (8, 8), we take the derivative implicitly.
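\[\begin{align*}
\frac{2}{3}x^{-1/3} + \frac{2}{3}y^{-1/3}y' &= 0 \\
\frac{2}{3}y^{-1/3}y' &= -\frac{2}{3}x^{-1/3} \\
y' &= -\frac{x^{-1/3}}{y^{-1/3}} \\
y' &= -\frac{y^{1/3}}{x^{1/3}} = -\sqrt[3]{\frac{y}{x}}.
\end{align*}\]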

Plugging in x = 8 and y = 8 , we get a slope of −1. The astroid, with its tangent line at (8, 8), is shown in Figure 2.26.
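As a rough check of this slope, the sketch below evaluates the implicit formula at \((8, 8)\) and compares it with a difference quotient on the upper branch of the curve. The function names `astroid_slope` and `upper_branch` are illustrative assumptions and do not come from the text.

```python
def astroid_slope(x, y):
    """Slope from implicit differentiation: y' = -(y/x)^(1/3), for x, y > 0."""
    return -((y / x) ** (1.0 / 3.0))

def upper_branch(x):
    """Solve x^(2/3) + y^(2/3) = 8 for y >= 0 (valid for 0 <= x <= 8^(3/2))."""
    return (8.0 - x ** (2.0 / 3.0)) ** 1.5

h = 1e-6
numeric_slope = (upper_branch(8 + h) - upper_branch(8 - h)) / (2 * h)
print(astroid_slope(8, 8), numeric_slope)
```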

Figure 2.26: An astroid with a tangent line.

Implicit Differentiation and the Second Derivative


We can use implicit differentiation to find higher order derivatives. In theory, this is simple: first find \(\frac{dy}{dx}\), then take its derivative with respect to \(x\). In practice, it is not hard, but it often requires a bit of algebra. We demonstrate this in an example.

Example 73: Finding the second derivative


Given \(x^2 + y^2 = 1\), find \(\frac{d^2y}{dx^2} = y''\).

Solution
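We found that \(y' = \frac{dy}{dx} = -x/y\) in Example 71. To find \(y''\), we apply implicit differentiation to \(y'\).

\[\begin{align*}
y'' &= \frac{d}{dx}\left(y'\right) \\
&= \frac{d}{dx}\left(-\frac{x}{y}\right) && \text{(Now use the Quotient Rule.)} \\
&= -\frac{y(1) - x(y')}{y^2}
\end{align*}\]

replace \(y'\) with \(-x/y\):

\[\begin{align*}
&= -\frac{y - x(-x/y)}{y^2} \\
&= -\frac{y + x^2/y}{y^2}.
\end{align*}\]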

While this is not a particularly simple expression, it is usable. Because \(y^2 > 0\) and \(y + x^2/y\) has the same sign as \(y\), we can see that \(y'' > 0\) when \(y < 0\) and \(y'' < 0\) when \(y > 0\). In Section 3.4, we will see how this relates to the shape of the graph.
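The sign claim and the formula itself can be spot-checked on the upper half of the circle, where \(y = \sqrt{1 - x^2}\) is available explicitly. This is a hedged sketch: the names `y_upper` and `second_derivative_formula` and the sample point \(x = 0.3\) are assumptions made for illustration.

```python
import math

def y_upper(x):
    """Upper half of the unit circle x^2 + y^2 = 1."""
    return math.sqrt(1.0 - x * x)

def second_derivative_formula(x, y):
    """y'' = -(y + x^2 / y) / y^2, the expression derived in the example."""
    return -(y + x * x / y) / (y * y)

x0, h = 0.3, 1e-4
# Second-order central difference for y''(x0) on the upper branch.
numeric = (y_upper(x0 + h) - 2 * y_upper(x0) + y_upper(x0 - h)) / (h * h)
formula = second_derivative_formula(x0, y_upper(x0))
print(numeric, formula)   # both negative, since y > 0 on this branch
```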

Logarithmic Differentiation
Consider the function \(y = x^x\); it is graphed in Figure 2.27. It is well-defined for \(x > 0\) and we might be interested in finding equations of lines tangent and normal to its graph. How do we take its derivative?

Figure 2.27: A plot of \(y = x^x\).

The function is not a power function: it has a "power'' of x, not a constant. It is not an exponential function: it has a "base'' of x, not
a constant.
A differentiation technique known as logarithmic differentiation becomes useful here. The basic principle is this: take the natural log of both sides of an equation \(y = f(x)\), then use implicit differentiation to find \(y'\). We demonstrate this in the following example.

Example 74: Using Logarithmic Differentiation


Given \(y = x^x\), use logarithmic differentiation to find \(y'\).

Solution
As suggested above, we start by taking the natural log of both sides then applying implicit differentiation.
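\[\begin{align*}
y &= x^x \\
\ln(y) &= \ln\left(x^x\right) && \text{(apply logarithm rule)} \\
\ln(y) &= x\ln x && \text{(now use implicit differentiation)} \\
\frac{d}{dx}\Big(\ln(y)\Big) &= \frac{d}{dx}\Big(x\ln x\Big) \\
\frac{y'}{y} &= \ln x + x\cdot\frac{1}{x} \\
\frac{y'}{y} &= \ln x + 1 \\
y' &= y\left(\ln x + 1\right) && \text{(substitute } y = x^x\text{)} \\
y' &= x^x\left(\ln x + 1\right).
\end{align*}\]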

To "test" our answer, let's use it to find the equation of the tangent line at \(x = 1.5\). The point on the graph our tangent line must pass through is \((1.5, 1.5^{1.5}) \approx (1.5, 1.837)\). Using the equation for \(y'\), we find the slope as

\[y' = 1.5^{1.5}\left(\ln 1.5 + 1\right) \approx 1.837(1.405) \approx 2.582. \tag{3.5.20}\]

Thus the equation of the tangent line is \(y = 2.582(x - 1.5) + 1.837\). Figure 2.28 graphs \(y = x^x\) along with this tangent line.

Figure 2.28: A graph of \(y = x^x\) and its tangent line at \(x = 1.5\).
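The numbers in this test can be reproduced with a few lines of code. The sketch below is illustrative only: the names `f`, `f_prime`, and `tangent` are assumptions, and the difference quotient serves as an independent check of the slope found by logarithmic differentiation.

```python
import math

def f(x):
    return x ** x

def f_prime(x):
    """Derivative found by logarithmic differentiation: x^x (ln x + 1)."""
    return x ** x * (math.log(x) + 1.0)

a = 1.5
slope = f_prime(a)

def tangent(x):
    """Tangent line to y = x^x at x = a."""
    return f(a) + slope * (x - a)

h = 1e-6
numeric = (f(a + h) - f(a - h)) / (2 * h)   # independent estimate of the slope
print(round(f(a), 3), round(slope, 3), numeric)
```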

Implicit differentiation proves to be useful as it allows us to find the instantaneous rates of change of a variety of functions. In
particular, it extended the Power Rule to rational exponents, which we then extended to all real numbers. In the next section,
implicit differentiation will be used to find the derivatives of inverse functions, such as \(y = \sin^{-1} x\).

3.5: Implicit Differentiation is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
2.6: Implicit Differentiation by Gregory Hartman et al. is licensed CC BY-NC 3.0. Original source: http://www.apexcalculus.com/.

3.6: Derivatives of Logarithmic Functions
As with the sine, we do not know anything about derivatives that allows us to compute the derivatives of the exponential and
logarithmic functions without going back to basics. Let's do a little work with the definition again:
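\[\begin{align*}
\frac{d}{dx}a^x &= \lim_{\Delta x\to 0}\frac{a^{x+\Delta x} - a^x}{\Delta x} \\
&= \lim_{\Delta x\to 0}\frac{a^x a^{\Delta x} - a^x}{\Delta x} \\
&= \lim_{\Delta x\to 0} a^x\,\frac{a^{\Delta x} - 1}{\Delta x} \\
&= a^x \lim_{\Delta x\to 0}\frac{a^{\Delta x} - 1}{\Delta x}. \tag{3.6.1}
\end{align*}\]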

There are two interesting things to note here: As in the case of the sine function we are left with a limit that involves \(\Delta x\) but not \(x\), which means that whatever \(\lim_{\Delta x\to 0}(a^{\Delta x} - 1)/\Delta x\) is, we know that it is a number, that is, a constant. This means that \(a^x\) has a remarkable property: its derivative is a constant times itself.


We earlier remarked that the hardest limit we would compute is \(\lim_{x\to 0}\sin x/x = 1\); we now have a limit that is just a bit too hard to include here. In fact the hard part is to see that \(\lim_{\Delta x\to 0}(a^{\Delta x} - 1)/\Delta x\) even exists: does this fraction really get closer and closer to some fixed value? Yes it does, but we will not prove this fact.
We can look at some examples. Consider \((2^x - 1)/x\) for some small values of \(x\): 1, 0.828427124, 0.756828460, 0.724061864, 0.70838051, 0.70070877 when \(x\) is 1, 1/2, 1/4, 1/8, 1/16, 1/32, respectively. It looks like this is settling in around 0.7, which turns out to be true (but the limit is not exactly 0.7). Consider next \((3^x - 1)/x\): 2, 1.464101616, 1.264296052, 1.177621520, 1.13720773, 1.11768854, at the same values of \(x\). It turns out to be true that in the limit this is about 1.1.
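The two lists of values above can be regenerated with a short loop. This is only a sketch; the function name `quotient` is an assumption introduced for illustration.

```python
def quotient(a, x):
    """The difference quotient (a^x - 1) / x studied above."""
    return (a ** x - 1.0) / x

for a in (2.0, 3.0):
    # x = 1, 1/2, 1/4, 1/8, 1/16, 1/32
    values = [quotient(a, 1.0 / 2 ** k) for k in range(6)]
    print(a, [round(v, 8) for v in values])
```

Extending `range(6)` to more terms shows the values continuing to settle, which is consistent with (but does not prove) the existence of the limit.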
Two examples don't establish a pattern, but if you do more examples you will find that the limit increases with the value of \(a\): bigger \(a\), bigger limit; smaller \(a\), smaller limit. As we can already see, some of these limits will be less than 1 and some larger than 1. Somewhere between \(a = 2\) and \(a = 3\) the limit will be exactly 1; the value at which this happens is called \(e\), so that

\[\lim_{\Delta x\to 0}\frac{e^{\Delta x} - 1}{\Delta x} = 1. \tag{3.6.2}\]

As you might guess from our two examples, \(e\) is closer to 3 than to 2, and in fact \(e \approx 2.718\).
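One way to see where the value \(e \approx 2.718\) comes from is to search between 2 and 3 for the base whose limit is 1. The sketch below is an illustration under stated assumptions, not the text's method: it replaces the true limit with the quotient at one small step (`h=1e-6`, an arbitrary choice) and uses bisection, so the result is only approximate.

```python
def limit_estimate(a, h=1e-6):
    """Approximate lim (a^h - 1)/h as h -> 0 by using one small h."""
    return (a ** h - 1.0) / h

lo, hi = 2.0, 3.0        # the limit is below 1 at a = 2 and above 1 at a = 3
for _ in range(50):      # bisection on limit_estimate(a) - 1
    mid = (lo + hi) / 2.0
    if limit_estimate(mid) < 1.0:
        lo = mid
    else:
        hi = mid

e_estimate = (lo + hi) / 2.0
print(e_estimate)
```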
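Now we see that the function \(e^x\) has a truly remarkable property:

\[\begin{align*}
\frac{d}{dx}e^x &= \lim_{\Delta x\to 0}\frac{e^{x+\Delta x} - e^x}{\Delta x} \\
&= \lim_{\Delta x\to 0}\frac{e^x e^{\Delta x} - e^x}{\Delta x} \\
&= \lim_{\Delta x\to 0} e^x\,\frac{e^{\Delta x} - 1}{\Delta x} \\
&= e^x \lim_{\Delta x\to 0}\frac{e^{\Delta x} - 1}{\Delta x} \\
&= e^x. \tag{3.6.3}
\end{align*}\]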

That is, \(e^x\) is its own derivative, or in other words the slope of \(e^x\) is the same as its height, or the same as its second coordinate: The function \(f(x) = e^x\) goes through the point \((z, e^z)\) and has slope \(e^z\) there, no matter what \(z\) is. It is sometimes convenient to express the function \(e^x\) without an exponent, since complicated exponents can be hard to read. In such cases we use \(\exp(x)\), e.g., \(\exp(1 + x^2)\) instead of \(e^{1+x^2}\).

What about the logarithm function? This too is hard, but as the cosine function was easier to do once the sine was done, so the logarithm is easier to do now that we know the derivative of the exponential function. Let's start with \(\log_e x\), which as you probably know is often abbreviated \(\ln x\) and called the "natural logarithm" function.

Consider the relationship between the two functions, namely, that they are inverses, that one "undoes'' the other. Graphically this
means that they have the same graph except that one is "flipped'' or "reflected'' through the line y = x , as shown in Figure 3.6.1.

Figure 3.6.1 : The exponential (green) and logarithmic (blue) functions. As inverses of each other, their graphs are reflections of
each other across the line y = x (dashed).
This means that the slopes of these two functions are closely related as well: For example, the slope of \(e^x\) is \(e\) at \(x = 1\); at the corresponding point on the \(\ln(x)\) curve, the slope must be \(1/e\), because the "rise" and the "run" have been interchanged. Since the slope of \(e^x\) is \(e\) at the point \((1, e)\), the slope of \(\ln(x)\) is \(1/e\) at the point \((e, 1)\).

Figure 3.6.2 : The exponential (green) and logarithmic (blue) functions. The dashed lines indicate the slope of the respective
functions at the points (1, e) and (e, 1) . It is interesting to note that these lines intersect at the origin.
More generally, we know that the slope of \(e^x\) is \(e^z\) at the point \((z, e^z)\), so the slope of \(\ln(x)\) is \(1/e^z\) at \((e^z, z)\), as indicated in Figure 3.6.2. In other words, the slope of \(\ln x\) is the reciprocal of the first coordinate at any point; this means that the slope of \(\ln x\) at \((x, \ln x)\) is \(1/x\). The upshot is:

\[\frac{d}{dx}\ln x = \frac{1}{x}.\]

We have discussed this from the point of view of the graphs, which is easy to understand but is not normally considered a rigorous proof: it is too easy to be led astray by pictures that seem reasonable but that miss some hard point. It is possible to do this derivation without resorting to pictures, and indeed we will see an alternate approach soon.
Note that ln x is defined only for x > 0 . It is sometimes useful to consider the function ln |x|, a function defined for x ≠ 0 . When
x < 0 , ln |x| = ln(−x) and

\[\frac{d}{dx}\ln|x| = \frac{d}{dx}\ln(-x) = \frac{1}{-x}(-1) = \frac{1}{x}. \tag{3.6.4}\]

Thus whether x is positive or negative, the derivative is the same.


What about the functions \(a^x\) and \(\log_a x\)? We know that the derivative of \(a^x\) is some constant times \(a^x\) itself, but what constant? Remember that "the logarithm is the exponent" and you will see that \(a = e^{\ln a}\). Then \(a^x = (e^{\ln a})^x = e^{x\ln a}\), and we can compute the derivative using the chain rule:

\[\frac{d}{dx}a^x = \frac{d}{dx}\left(e^{\ln a}\right)^x = \frac{d}{dx}e^{x\ln a} = (\ln a)e^{x\ln a} = (\ln a)a^x. \tag{3.6.5}\]

The constant is simply \(\ln a\). Likewise we can compute the derivative of the logarithm function \(\log_a x\). Since \(x = e^{\ln x}\) we can take the logarithm base \(a\) of both sides to get \(\log_a(x) = \log_a(e^{\ln x}) = \ln x \log_a e\). Then

\[\frac{d}{dx}\log_a x = \frac{1}{x}\log_a e. \tag{3.6.6}\]

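This is a perfectly good answer, but we can improve it slightly. Since

\[\begin{align*}
a &= e^{\ln a} \\
\log_a(a) &= \log_a\left(e^{\ln a}\right) = \ln a \log_a e \\
1 &= \ln a \log_a e \\
\frac{1}{\ln a} &= \log_a e, \tag{3.6.7}
\end{align*}\]

we can replace \(\log_a e\) to get \(\frac{d}{dx}\log_a x = \frac{1}{x\ln a}\).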
You may if you wish memorize the formulas

\[\frac{d}{dx}a^x = (\ln a)a^x \quad\text{and}\quad \frac{d}{dx}\log_a x = \frac{1}{x\ln a}. \tag{3.6.8}\]

Because the "trick" \(a = e^{\ln a}\) is often useful, and sometimes essential, it may be better to remember the trick, not the formula.
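Both formulas can be compared against difference quotients. The sketch below is a numerical sanity check under assumed names (`d_exp_base`, `d_log_base`, `central_difference`) and arbitrary sample values \(a = 2\), \(x = 1.3\); none of these come from the text.

```python
import math

def d_exp_base(a, x):
    """d/dx a^x = (ln a) a^x."""
    return math.log(a) * a ** x

def d_log_base(a, x):
    """d/dx log_a x = 1 / (x ln a)."""
    return 1.0 / (x * math.log(a))

def central_difference(f, x, h=1e-6):
    """Approximate f'(x) with a symmetric difference quotient."""
    return (f(x + h) - f(x - h)) / (2 * h)

a, x0 = 2.0, 1.3   # arbitrary sample base and point
num_exp = central_difference(lambda x: a ** x, x0)
num_log = central_difference(lambda x: math.log(x, a), x0)
print(num_exp, d_exp_base(a, x0))
print(num_log, d_log_base(a, x0))
```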

Example 3.6.1

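Compute the derivative of \(f(x) = 2^x\).

Solution
\[\begin{align*}
\frac{d}{dx}2^x &= \frac{d}{dx}\left(e^{\ln 2}\right)^x \\
&= \frac{d}{dx}e^{x\ln 2} \\
&= \left(\frac{d}{dx}x\ln 2\right)e^{x\ln 2} \\
&= (\ln 2)e^{x\ln 2} = 2^x\ln 2 \tag{3.6.9}
\end{align*}\]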

Example 3.6.2
Compute the derivative of \(f(x) = 2^{x^2} = 2^{(x^2)}\).

\[\begin{align*}
\frac{d}{dx}2^{x^2} &= \frac{d}{dx}e^{x^2\ln 2} \\
&= \left(\frac{d}{dx}x^2\ln 2\right)e^{x^2\ln 2} \\
&= (2\ln 2)xe^{x^2\ln 2} \\
&= (2\ln 2)x2^{x^2} \tag{3.6.10}
\end{align*}\]

Example 3.6.3

Compute the derivative of \(f(x) = x^x\). At first this appears to be a new kind of function: it is not a constant power of \(x\), and it does not seem to be an exponential function, since the base is not constant. But in fact it is no harder than the previous example.

\[\begin{align*}
\frac{d}{dx}x^x &= \frac{d}{dx}e^{x\ln x} \\
&= \left(\frac{d}{dx}x\ln x\right)e^{x\ln x} \\
&= \left(x\cdot\frac{1}{x} + \ln x\right)x^x \\
&= (1 + \ln x)x^x \tag{3.6.11}
\end{align*}\]

Example 3.6.4

Recall that we have not justified the power rule except when the exponent is a positive or negative integer. We can use the
exponential function to take care of other exponents.
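\[\begin{align*}
\frac{d}{dx}x^r &= \frac{d}{dx}e^{r\ln x} \\
&= \left(\frac{d}{dx}r\ln x\right)e^{r\ln x} \\
&= \left(r\cdot\frac{1}{x}\right)x^r \\
&= rx^{r-1} \tag{3.6.12}
\end{align*}\]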

Contributors
David Guichard (Whitman College)
Integrated by Justin Marshall.

3.6: Derivatives of Logarithmic Functions is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
4.7: Derivatives of the Exponential and Logarithmic Functions by David Guichard is licensed CC BY-NC-SA 4.0.

3.7: Rates of Change in the Natural and Social Sciences
 Learning Objectives
Determine a new value of a quantity from the old value and the amount of change.
Calculate the average rate of change and explain how it differs from the instantaneous rate of change.
Apply rates of change to displacement, velocity, and acceleration of an object moving along a straight line.
Predict the future population from the present value and the population growth rate.
Use derivatives to calculate marginal cost and revenue in a business situation.

In this section we look at some applications of the derivative by focusing on the interpretation of the derivative as the rate of
change of a function. These applications include acceleration and velocity in physics, population growth rates in biology, and
marginal functions in economics.

Amount of Change Formula


One application for derivatives is to estimate an unknown value of a function at a point by using a known value of a function at
some given point together with its rate of change at the given point. If f (x) is a function defined on an interval [a, a + h] , then the
amount of change of f (x) over the interval is the change in the y values of the function over that interval and is given by

f (a + h) − f (a).

The average rate of change of the function f over that same interval is the ratio of the amount of change over that interval to the
corresponding change in the x values. It is given by
f (a + h) − f (a)
.
h

As we already know, the instantaneous rate of change of f (x) at a is its derivative


f (a + h) − f (a)
f '(a) = lim .
h→0 h

For small enough values of \(h\), \(f'(a) \approx \frac{f(a+h)-f(a)}{h}\). We can then solve for \(f(a + h)\) to get the amount of change formula:

f (a + h) ≈ f (a) + f '(a)h. (3.7.1)

We can use this formula if we know only f(a) and f'(a) and wish to estimate the value of f(a + h). For example, we may use the current population of a city and the rate at which it is growing to estimate its population in the near future. As we can see in Figure 3.7.1, we are approximating f(a + h) by the y coordinate at a + h on the line tangent to f(x) at x = a. Observe that the accuracy of this estimate depends on the value of h as well as the value of f'(a).

Figure 3.7.1 : The new value of a changed quantity equals the original value plus the rate of change times the interval of change:
f (a + h) ≈ f (a) + f '(a)h.

3.7.1 https://math.libretexts.org/@go/page/4457
 Example 3.7.1: Estimating the Value of a Function

If f (3) = 2 and f '(3) = 5, estimate f (3.2).


Solution
Begin by finding h . We have h = 3.2 − 3 = 0.2. Thus,
f (3.2) = f (3 + 0.2) ≈ f (3) + (0.2)f '(3) = 2 + 0.2(5) = 3.
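The amount of change formula is a one-line computation. The following sketch uses a hypothetical helper name, `estimate_new_value`, to reproduce the estimate from this example.

```python
def estimate_new_value(f_a, fprime_a, h):
    """Amount of change formula: f(a + h) is roughly f(a) + f'(a) * h."""
    return f_a + fprime_a * h

# Example 3.7.1: f(3) = 2, f'(3) = 5, h = 3.2 - 3
print(estimate_new_value(2, 5, 3.2 - 3))
```

Because of floating-point rounding in `3.2 - 3`, the printed value may differ from 3 in the last few digits.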

 Exercise 3.7.1

Given f (10) = −5 and f '(10) = 6, estimate f (10.1).

Hint
Use the same process as in the preceding example.

Answer
−4.4

Motion along a Line


Another use for the derivative is to analyze motion along a line. We have described velocity as the rate of change of position. If we
take the derivative of the velocity, we can find the acceleration, or the rate of change of velocity. It is also important to introduce
the idea of speed, which is the magnitude of velocity. Thus, we can state the following mathematical definitions.

 Definition

Let s(t) be a function giving the position of an object at time t.


The velocity of the object at time t is given by v(t) = s'(t) .
The speed of the object at time t is given by |v(t)| .
The acceleration of the object at time t is given by a(t) = v'(t) = s''(t).

 Example 3.7.2: Comparing Instantaneous Velocity and Average Velocity

A ball is dropped from a height of 64 feet. Its height above ground (in feet) t seconds later is given by \(s(t) = -16t^2 + 64\).

a. What is the instantaneous velocity of the ball when it hits the ground?
b. What is the average velocity during its fall?
Solution
The first thing to do is determine how long it takes the ball to reach the ground. To do this, set s(t) = 0. Solving \(-16t^2 + 64 = 0\) and taking the positive root, we get t = 2, so it takes 2 seconds for the ball to reach the ground.

a. The instantaneous velocity of the ball as it strikes the ground is v(2). Since v(t) = s'(t) = −32t, we obtain v(2) = −64 ft/s.
b. The average velocity of the ball during its fall is
\[v_{ave} = \frac{s(2)-s(0)}{2-0} = \frac{0-64}{2} = -32 \text{ ft/s}.\]
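The three quantities in this example (time of impact, impact velocity, average velocity) can be computed directly. This is a minimal sketch; the variable names `t_ground`, `v_impact`, and `v_average` are illustrative assumptions.

```python
import math

def s(t):
    """Height in feet after t seconds."""
    return -16.0 * t ** 2 + 64.0

def v(t):
    """Velocity s'(t) in ft/s."""
    return -32.0 * t

t_ground = math.sqrt(64.0 / 16.0)        # positive root of s(t) = 0
v_impact = v(t_ground)                   # instantaneous velocity at impact
v_average = (s(t_ground) - s(0.0)) / (t_ground - 0.0)
print(t_ground, v_impact, v_average)
```

The average velocity is half the impact velocity here because the velocity changes linearly in time.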

 Example 3.7.3: Interpreting the Relationship between v(t) and a(t)

A particle moves along a coordinate axis whose positive direction is to the right. Its position at time t is given by \(s(t) = t^3 - 4t + 2\). Find v(1) and a(1) and use these values to answer the following questions.

a. Is the particle moving from left to right or from right to left at time t = 1 ?
b. Is the particle speeding up or slowing down at time t = 1 ?
Solution
Begin by finding v(t) and a(t) .

\(v(t) = s'(t) = 3t^2 - 4\) and \(a(t) = v'(t) = s''(t) = 6t\).
Evaluating these functions at t = 1 , we obtain v(1) = −1 and a(1) = 6 .
a. Because v(1) < 0 , the particle is moving from right to left.
b. Because v(1) < 0 and a(1) > 0 , velocity and acceleration are acting in opposite directions. In other words, the particle is
being accelerated in the direction opposite the direction in which it is traveling, causing |v(t)| to decrease. The particle is
slowing down.

 Example 3.7.4: Position and Velocity

The position of a particle moving along a coordinate axis is given by \(s(t) = t^3 - 9t^2 + 24t + 4,\ t \ge 0\).

a. Find v(t) .
b. At what time(s) is the particle at rest?
c. On what time intervals is the particle moving from left to right? From right to left?
d. Use the information obtained to sketch the path of the particle along a coordinate axis.
Solution
a. The velocity is the derivative of the position function:
\[v(t) = s'(t) = 3t^2 - 18t + 24.\]

b. The particle is at rest when v(t) = 0, so set \(3t^2 - 18t + 24 = 0\). Factoring the left-hand side of the equation produces 3(t − 2)(t − 4) = 0. Solving, we find that the particle is at rest at t = 2 and t = 4.

c. The particle is moving from left to right when v(t) > 0 and from right to left when v(t) < 0 . Figure 3.7.2 gives the analysis
of the sign of v(t) for t ≥ 0 , but it does not represent the axis along which the particle is moving.

Figure 3.7.2 :The sign of v(t) determines the direction of the particle.
Since \(3t^2 - 18t + 24 > 0\) on [0, 2) ∪ (4, +∞), the particle is moving from left to right on these intervals.
Since \(3t^2 - 18t + 24 < 0\) on (2, 4), the particle is moving from right to left on this interval.
d. Before we can sketch the graph of the particle, we need to know its position at the time it starts moving (t = 0) and at the
times that it changes direction (t = 2, 4) . We have s(0) = 4 , s(2) = 24 , and s(4) = 20 . This means that the particle begins on
the coordinate axis at 4 and changes direction at 24 and 20 on the coordinate axis. The path of the particle is shown on a
coordinate axis in Figure 3.7.3.
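The sign analysis and the turning-point positions can be confirmed with a few evaluations. In this sketch the integer search for rest times and the sample times 1, 3, and 5 (one inside each interval) are assumptions chosen for this particular polynomial, not a general method.

```python
def s(t):
    """Position of the particle at time t."""
    return t ** 3 - 9 * t ** 2 + 24 * t + 4

def v(t):
    """Velocity s'(t)."""
    return 3 * t ** 2 - 18 * t + 24

# The roots happen to be integers here, so an integer search suffices.
rest_times = [t for t in range(0, 11) if v(t) == 0]
print(rest_times, [s(t) for t in [0] + rest_times])

# Sample one time inside each interval to read off the direction of motion.
for t in (1, 3, 5):
    print(t, "left to right" if v(t) > 0 else "right to left")
```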

Figure 3.7.3 : The path of the particle can be determined by analyzing v(t).

 Exercise 3.7.2

A particle moves along a coordinate axis. Its position at time t is given by \(s(t) = t^2 - 5t + 1\). Is the particle moving from right to left or from left to right at time t = 3?

Hint
Find v(3) and look at the sign.

Answer
left to right

Population Change
In addition to analyzing velocity, speed, acceleration, and position, we can use derivatives to analyze various types of populations,
including those as diverse as bacteria colonies and cities. We can use a current population, together with a growth rate, to estimate
the size of a population in the future. The population growth rate is the rate of change of a population and consequently can be
represented by the derivative of the size of the population.

 Definition

If P (t) is the number of entities present in a population, then the population growth rate of P (t) is defined to be P '(t).

 Example 3.7.5: Estimating a Population


The population of a city is tripling every 5 years. If its current population is 10,000, what will be its approximate population 2
years from now?
Solution
Let P (t) be the population (in thousands) t years from now. Thus, we know that P (0) = 10 and based on the information, we
anticipate P (5) = 30. Now estimate P '(0), the current growth rate, using
\[P'(0) \approx \frac{P(5)-P(0)}{5-0} = \frac{30-10}{5} = 4.\]
By applying Equation 3.7.1 to P (t), we can estimate the population 2 years from now by writing
P (2) ≈ P (0) + (2)P '(0) ≈ 10 + 2(4) = 18 ;
thus, in 2 years the population will be 18,000.

 Exercise 3.7.3

The current population of a mosquito colony is known to be 3,000; that is, P(0) = 3,000. If P'(0) = 100, estimate the size of the population in 3 days, where t is measured in days.

Hint
Use P (3) ≈ P (0) + 3P '(0)

Answer
3,300

Changes in Cost and Revenue
In addition to analyzing motion along a line and population growth, derivatives are useful in analyzing changes in cost, revenue,
and profit. The concept of a marginal function is common in the fields of business and economics and implies the use of
derivatives. The marginal cost is the derivative of the cost function. The marginal revenue is the derivative of the revenue function.
The marginal profit is the derivative of the profit function, which is based on the cost function and the revenue function.

 Definition
If C (x) is the cost of producing x items, then the marginal cost M C (x) is M C (x) = C '(x).
If R(x) is the revenue obtained from selling x items, then the marginal revenue M R(x) is M R(x) = R'(x) .
If P (x) = R(x) − C (x) is the profit obtained from selling x items, then the marginal profit M P (x) is defined to be
M P (x) = P '(x) = M R(x) − M C (x) = R'(x) − C '(x) .

We can roughly approximate


\[MC(x) = C'(x) = \lim_{h\to 0}\frac{C(x+h) - C(x)}{h}\]

by choosing an appropriate value for h . Since x represents objects, a reasonable and small value for h is 1. Thus, by substituting
h = 1 , we get the approximation M C (x) = C '(x) ≈ C (x + 1) − C (x) . Consequently, C '(x) for a given value of x can be

thought of as the change in cost associated with producing one additional item. In a similar way, M R(x) = R'(x) approximates
the revenue obtained by selling one additional item, and M P (x) = P '(x) approximates the profit obtained by producing and
selling one additional item.

 Example 3.7.6: Applying Marginal Revenue

Assume that the number of barbeque dinners that can be sold, x, can be related to the price charged, p, by the equation
p(x) = 9 − 0.03x, 0 ≤ x ≤ 300 .

In this case, the revenue in dollars obtained by selling x barbeque dinners is given by
\(R(x) = xp(x) = x(9 - 0.03x) = -0.03x^2 + 9x\) for 0 ≤ x ≤ 300.
Use the marginal revenue function to estimate the revenue obtained from selling the 101st barbeque dinner. Compare this to the actual revenue obtained from the sale of this dinner.
Solution
First, find the marginal revenue function: M R(x) = R'(x) = −0.06x + 9.
Next, use R'(100) to approximate R(101) − R(100), the revenue obtained from the sale of the 101st dinner. Since R'(100) = 3, the revenue obtained from the sale of the 101st dinner is approximately $3.

The actual revenue obtained from the sale of the 101st dinner is

R(101) − R(100) = 602.97 − 600 = 2.97, or $2.97.


The marginal revenue is a fairly good estimate in this case and has the advantage of being easy to compute.
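The comparison between the marginal estimate and the actual change can be written out as follows. This is a small sketch with assumed function names `revenue` and `marginal_revenue`; rounding to cents is applied only for display.

```python
def revenue(x):
    """R(x) = x * p(x) with p(x) = 9 - 0.03x, for 0 <= x <= 300."""
    return x * (9 - 0.03 * x)

def marginal_revenue(x):
    """MR(x) = R'(x) = -0.06x + 9."""
    return -0.06 * x + 9

estimate = marginal_revenue(100)          # approximates R(101) - R(100)
actual = revenue(101) - revenue(100)
print(round(estimate, 2), round(actual, 2))
```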

 Exercise 3.7.4

Suppose that the profit obtained from the sale of x fish-fry dinners is given by \(P(x) = -0.03x^2 + 8x - 50\). Use the marginal profit function to estimate the profit from the sale of the 101st fish-fry dinner.

Hint
Use P '(100) to approximate P (101) − P (100).

Answer
$2

Key Concepts
Using f (a + h) ≈ f (a) + f '(a)h , it is possible to estimate f (a + h) given f '(a) and f (a).
The rate of change of position is velocity, and the rate of change of velocity is acceleration. Speed is the absolute value, or
magnitude, of velocity.
The population growth rate and the present population can be used to predict the size of a future population.
Marginal cost, marginal revenue, and marginal profit functions can be used to predict, respectively, the cost of producing one
more item, the revenue obtained by selling one more item, and the profit obtained by producing and selling one more item.

Glossary
acceleration
is the rate of change of the velocity, that is, the derivative of velocity

amount of change
of a function f(x) over an interval [x, x + h] is f(x + h) − f(x)

average rate of change


of a function f(x) over an interval [x, x + h] is the amount of change divided by the length of the interval, \(\frac{f(x+h)-f(x)}{h}\)

marginal cost
is the derivative of the cost function, or the approximate cost of producing one more item

marginal revenue
is the derivative of the revenue function, or the approximate revenue obtained by selling one more item

marginal profit
is the derivative of the profit function, or the approximate profit obtained by producing and selling one more item

population growth rate


is the derivative of the population with respect to time

speed
is the absolute value of velocity, that is, |v(t)| is the speed of an object at time t whose velocity is given by v(t)

3.7: Rates of Change in the Natural and Social Sciences is shared under a not declared license and was authored, remixed, and/or curated by
LibreTexts.
3.4: Derivatives as Rates of Change by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0. Original source:
https://openstax.org/details/books/calculus-volume-1.

3.9: Related Rates
3.9: Related Rates is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.

3.10: Linear Approximations and Differentials
 Learning Objectives
Describe the linear approximation to a function at a point.
Write the linearization of a given function.
Draw a graph that illustrates the use of differentials to approximate the change in a quantity.
Calculate the relative error and percentage error in using a differential approximation.

We have just seen how derivatives allow us to compare related quantities that are changing over time. In this section, we examine
another application of derivatives: the ability to approximate functions locally by linear functions. Linear functions are the easiest
functions with which to work, so they provide a useful tool for approximating function values. In addition, the ideas presented in
this section are generalized later in the text when we study how to approximate functions by higher-degree polynomials (see Introduction to Power Series and Functions).

Linear Approximation of a Function at a Point


Consider a function f that is differentiable at a point x = a. Recall that the tangent line to the graph of f at a is given by the equation

\[y = f(a) + f'(a)(x - a).\]

For example, consider the function \(f(x) = \frac{1}{x}\) at a = 2. Since f is differentiable at x = 2 and \(f'(x) = -\frac{1}{x^2}\), we see that \(f'(2) = -\frac{1}{4}\). Therefore, the tangent line to the graph of f at a = 2 is given by the equation

\[y = \frac{1}{2} - \frac{1}{4}(x - 2).\]

Figure 3.10.1a shows a graph of \(f(x) = \frac{1}{x}\) along with the tangent line to f at x = 2. Note that for x near 2, the graph of the tangent line is close to the graph of f. As a result, we can use the equation of the tangent line to approximate f(x) for x near 2. For example, if x = 2.1, the y value of the corresponding point on the tangent line is

\[y = \frac{1}{2} - \frac{1}{4}(2.1 - 2) = 0.475.\]

The actual value of f(2.1) is given by

\[f(2.1) = \frac{1}{2.1} \approx 0.47619.\]

Therefore, the tangent line gives us a fairly good approximation of f(2.1) (Figure 3.10.1b). However, note that for values of x far from 2, the equation of the tangent line does not give us a good approximation. For example, if x = 10, the y-value of the corresponding point on the tangent line is

\[y = \frac{1}{2} - \frac{1}{4}(10 - 2) = \frac{1}{2} - 2 = -1.5,\]

whereas the value of the function at x = 10 is f(10) = 0.1.
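The contrast between a nearby point and a faraway point can be tabulated with a small helper. The sketch below is illustrative: the function name `linearization` is an assumption, and the derivative is supplied by hand rather than computed.

```python
def linearization(f, fprime, a):
    """Return the function L(x) = f(a) + f'(a)(x - a)."""
    fa, slope = f(a), fprime(a)
    return lambda x: fa + slope * (x - a)

f = lambda x: 1.0 / x
fprime = lambda x: -1.0 / x ** 2
L = linearization(f, fprime, 2.0)

# Columns: x, tangent-line value, true value, absolute error
for x in (2.1, 10.0):
    print(x, L(x), f(x), abs(L(x) - f(x)))
```

The error column grows sharply as x moves away from 2, which matches the observation in the text.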

Figure 3.10.1 : (a) The tangent line to f (x) = 1/x at x = 2 provides a good approximation to f for x near 2. (b) At x = 2.1 , the
value of y on the tangent line to f (x) = 1/x is 0.475. The actual value of f (2.1) is 1/2.1, which is approximately 0.47619.
In general, for a differentiable function f, the equation of the tangent line to f at x = a can be used to approximate f(x) for x near a. Therefore, we can write

\(f(x) \approx f(a) + f'(a)(x - a)\) for x near a.

We call the linear function

\[L(x) = f(a) + f'(a)(x - a) \tag{3.10.1}\]

the linear approximation, or tangent line approximation, of f at x = a. This function L is also known as the linearization of f at x = a.

To show how useful the linear approximation can be, we look at how to find the linear approximation for \(f(x) = \sqrt{x}\) at x = 9.

 Example 3.10.1: Linear Approximation of \(\sqrt{x}\)

Find the linear approximation of \(f(x) = \sqrt{x}\) at x = 9 and use the approximation to estimate \(\sqrt{9.1}\).

Solution
Since we are looking for the linear approximation at x = 9, using Equation 3.10.1 we know the linear approximation is given by

\[L(x) = f(9) + f'(9)(x - 9).\]

We need to find f(9) and f'(9).

\[\begin{align*}
f(x) = \sqrt{x} &\Rightarrow f(9) = \sqrt{9} = 3 \\
f'(x) = \frac{1}{2\sqrt{x}} &\Rightarrow f'(9) = \frac{1}{2\sqrt{9}} = \frac{1}{6}
\end{align*}\]

Therefore, the linear approximation (shown in Figure 3.10.2) is

\[L(x) = 3 + \frac{1}{6}(x - 9).\]

Using the linear approximation, we can estimate \(\sqrt{9.1}\) by writing

\[\sqrt{9.1} = f(9.1) \approx L(9.1) = 3 + \frac{1}{6}(9.1 - 9) \approx 3.0167.\]

Figure 3.10.2: The local linear approximation to \(f(x) = \sqrt{x}\) at x = 9 provides an approximation to f for x near 9.

Analysis

Using a calculator, the value of \(\sqrt{9.1}\) to four decimal places is 3.0166. The value given by the linear approximation, 3.0167, is very close to the value obtained with a calculator, so it appears that using this linear approximation is a good way to estimate \(\sqrt{x}\), at least for x near 9. At the same time, it may seem odd to use a linear approximation when we can just push a few buttons on a calculator to evaluate \(\sqrt{9.1}\). However, how does the calculator evaluate \(\sqrt{9.1}\)? The calculator uses an approximation! In fact, calculators and computers use approximations all the time to evaluate mathematical expressions; they just use higher-degree approximations.

 Exercise 3.10.1

Find the local linear approximation to \(f(x) = \sqrt[3]{x}\) at x = 8. Use it to approximate \(\sqrt[3]{8.1}\) to five decimal places.

Hint
\(L(x) = f(a) + f'(a)(x - a)\)

Answer
\(L(x) = 2 + \frac{1}{12}(x - 8)\); 2.00833

 Example 3.10.2: Linear Approximation of sin x

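Find the linear approximation of \(f(x) = \sin x\) at \(x = \frac{\pi}{3}\) and use it to approximate sin(62°).
Solution
First we note that since \(\frac{\pi}{3}\) rad is equivalent to 60°, using the linear approximation at x = π/3 seems reasonable. The linear approximation is given by

\[L(x) = f\left(\frac{\pi}{3}\right) + f'\left(\frac{\pi}{3}\right)\left(x - \frac{\pi}{3}\right).\]

We see that

\[\begin{align*}
f(x) = \sin x &\Rightarrow f\left(\frac{\pi}{3}\right) = \sin\left(\frac{\pi}{3}\right) = \frac{\sqrt{3}}{2} \\
f'(x) = \cos x &\Rightarrow f'\left(\frac{\pi}{3}\right) = \cos\left(\frac{\pi}{3}\right) = \frac{1}{2}
\end{align*}\]

Therefore, the linear approximation of f at x = π/3 (shown in Figure 3.10.3) is

\[L(x) = \frac{\sqrt{3}}{2} + \frac{1}{2}\left(x - \frac{\pi}{3}\right).\]

To estimate sin(62°) using L, we must first convert 62° to radians. We have \(62° = \frac{62\pi}{180}\) radians, so the estimate for sin(62°) is given by

\[\sin(62°) = f\left(\frac{62\pi}{180}\right) \approx L\left(\frac{62\pi}{180}\right) = \frac{\sqrt{3}}{2} + \frac{1}{2}\left(\frac{62\pi}{180} - \frac{\pi}{3}\right) = \frac{\sqrt{3}}{2} + \frac{1}{2}\left(\frac{2\pi}{180}\right) = \frac{\sqrt{3}}{2} + \frac{\pi}{180} \approx 0.88348.\]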
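The estimate can be compared with a library value of the sine. The sketch below assumes nothing beyond Python's standard `math` module; the function name `L` mirrors the notation of the example, and the degree-to-radian conversion is the step that is easiest to forget.

```python
import math

a = math.pi / 3                      # 60 degrees, where sin and cos are known exactly

def L(x):
    """Linearization of sin x at x = pi/3 (x in radians)."""
    return math.sin(a) + math.cos(a) * (x - a)

x = math.radians(62)                 # convert 62 degrees to radians first
print(round(L(x), 5), round(math.sin(x), 5))
```

The two printed values differ slightly, since the tangent line lies above the sine curve near π/3.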

Figure 3.10.3 : The linear approximation to f (x) = sin x at x = π/3 provides an approximation to sin x for x near π/3.

 Exercise 3.10.2

Find the linear approximation for \(f(x) = \cos x\) at \(x = \frac{\pi}{2}\).

Hint
\(L(x) = f(a) + f'(a)(x - a)\)

Answer
\(L(x) = -x + \frac{\pi}{2}\)

Linear approximations may be used in estimating roots and powers. In the next example, we find the linear approximation for \(f(x) = (1 + x)^n\) at x = 0, which can be used to estimate roots and powers for real numbers near 1. The same idea can be extended to a function of the form \(f(x) = (m + x)^n\) to estimate roots and powers near a different number m.

 Example 3.10.3: Approximating Roots and Powers

Find the linear approximation of $f(x) = (1 + x)^n$ at $x = 0$. Use this approximation to estimate $(1.01)^3$.

Solution
The linear approximation at $x = 0$ is given by

$$L(x) = f(0) + f'(0)(x - 0).$$

Because

$$f(x) = (1 + x)^n \Rightarrow f(0) = 1$$

$$f'(x) = n(1 + x)^{n-1} \Rightarrow f'(0) = n,$$

the linear approximation is (Figure 3.10.4a)

$$L(x) = 1 + n(x - 0) = 1 + nx.$$

We can approximate $(1.01)^3$ by evaluating $L(0.01)$ when $n = 3$. We conclude that

$$(1.01)^3 = f(0.01) \approx L(0.01) = 1 + 3(0.01) = 1.03.$$

Figure 3.10.4 : (a) The linear approximation of $f(x)$ at $x = 0$ is $L(x)$. (b) The actual value of $1.01^3$ is 1.030301. The linear
approximation of $f(x)$ at $x = 0$ estimates $1.01^3$ to be 1.03.
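As an illustration, the approximation $(1 + x)^n \approx 1 + nx$ can be compared with the exact power for a few small values of $x$. This is a sketch only; the function name `power_linear_approx` and the sample inputs other than $n = 3$, $x = 0.01$ are assumptions made here, not values from the text.

```python
def power_linear_approx(n, x):
    """Linear approximation of (1 + x)**n at x = 0: L(x) = 1 + n*x."""
    return 1 + n * x

# (3, 0.01) reproduces the estimate of 1.01**3; the other pairs are illustrative.
for n, x in [(3, 0.01), (4, 0.01), (0.5, 0.02)]:
    approx = power_linear_approx(n, x)
    exact = (1 + x) ** n
    print(f"n={n}, x={x}: L(x)={approx:.6f}, exact={exact:.6f}")
```

The pair $n = 0.5$, $x = 0.02$ shows the same formula estimating a square root near 1.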

 Exercise 3.10.3

Find the linear approximation of $f(x) = (1 + x)^4$ at $x = 0$ without using the result from the preceding example.

Hint
$f'(x) = 4(1 + x)^3$

Answer
$L(x) = 1 + 4x$

Differentials
We have seen that linear approximations can be used to estimate function values. They can also be used to estimate the amount a
function value changes as a result of a small change in the input. To discuss this more formally, we define a related concept:
differentials. Differentials provide us with a way of estimating the amount a function changes as a result of a small change in input
values.
When we first looked at derivatives, we used the Leibniz notation dy/dx to represent the derivative of y with respect to x.
Although we used the expressions dy and dx in this notation, they did not have meaning on their own. Here we see a meaning to
the expressions dy and dx. Suppose y = f (x) is a differentiable function. Let dx be an independent variable that can be assigned
any nonzero real number, and define the dependent variable dy by

$$dy = f'(x)\,dx. \tag{3.10.2}$$

It is important to notice that $dy$ is a function of both $x$ and $dx$. The expressions $dy$ and $dx$ are called differentials. We can divide
both sides of Equation 3.10.2 by $dx$, which yields

$$\frac{dy}{dx} = f'(x). \tag{3.10.3}$$

This is the familiar expression we have used to denote a derivative. Equation 3.10.2 is known as the differential form of Equation
3.10.3.

 Example 3.10.4: Computing Differentials

For each of the following functions, find $dy$ and evaluate when $x = 3$ and $dx = 0.1$.
a. $y = x^2 + 2x$
b. $y = \cos x$
Solution

The key step is calculating the derivative. When we have that, we can obtain dy directly.
a. Since $f(x) = x^2 + 2x$, we know $f'(x) = 2x + 2$, and therefore

$$dy = (2x + 2)\,dx.$$

When $x = 3$ and $dx = 0.1$,

$$dy = (2 \cdot 3 + 2)(0.1) = 0.8.$$

b. Since $f(x) = \cos x$, $f'(x) = -\sin x$. This gives us

$$dy = -\sin x\,dx.$$

When $x = 3$ and $dx = 0.1$,

$$dy = -\sin(3)(0.1) = -0.1\sin(3).$$

 Exercise 3.10.4
For $y = e^{x^2}$, find $dy$.

Hint

$dy = f'(x)\,dx$

Answer

$dy = 2x e^{x^2}\,dx$

We now connect differentials to linear approximations. Differentials can be used to estimate the change in the value of a function
resulting from a small change in input values. Consider a function f that is differentiable at point a . Suppose the input x changes
by a small amount. We are interested in how much the output y changes. If x changes from a to a + dx , then the change in x is dx
(also denoted Δx), and the change in y is given by

Δy = f (a + dx) − f (a).

Instead of calculating the exact change in $y$, however, it is often easier to approximate the change in $y$ by using a linear
approximation. For $x$ near $a$, $f(x)$ can be approximated by the linear approximation (Equation 3.10.1)

$$L(x) = f(a) + f'(a)(x - a).$$

Therefore, if $dx$ is small,

$$f(a + dx) \approx L(a + dx) = f(a) + f'(a)(a + dx - a).$$

That is,

$$f(a + dx) - f(a) \approx L(a + dx) - f(a) = f'(a)\,dx.$$

In other words, the actual change in the function $f$ if $x$ increases from $a$ to $a + dx$ is approximately the difference between
$L(a + dx)$ and $f(a)$, where $L(x)$ is the linear approximation of $f$ at $a$. By definition of $L(x)$, this difference is equal to $f'(a)\,dx$.

In summary,

$$\Delta y = f(a + dx) - f(a) \approx L(a + dx) - f(a) = f'(a)\,dx = dy.$$

Therefore, we can use the differential $dy = f'(a)\,dx$ to approximate the change in $y$ if $x$ increases from $x = a$ to $x = a + dx$. We
can see this in the following graph.

Figure 3.10.5 : The differential $dy = f'(a)\,dx$ is used to approximate the actual change in $y$ if $x$ increases from $a$ to $a + dx$.
We now take a look at how to use differentials to approximate the change in the value of the function that results from a small
change in the value of the input. Note the calculation with differentials is much simpler than calculating actual values of functions
and the result is very close to what we would obtain with the more exact calculation.

 Example 3.10.5: Approximating Change with Differentials

Let $y = x^2 + 2x$. Compute $\Delta y$ and $dy$ at $x = 3$ if $dx = 0.1$.
Solution
The actual change in $y$ if $x$ changes from $x = 3$ to $x = 3.1$ is given by

$$\Delta y = f(3.1) - f(3) = [(3.1)^2 + 2(3.1)] - [3^2 + 2(3)] = 0.81.$$

The approximate change in $y$ is given by $dy = f'(3)\,dx$. Since $f'(x) = 2x + 2$, we have

$$dy = f'(3)\,dx = (2(3) + 2)(0.1) = 0.8.$$
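The comparison between $\Delta y$ and $dy$ can be reproduced for several step sizes. The sketch below hard-codes $f(x) = x^2 + 2x$ and its derivative from this example; the function names and the extra step sizes are illustrative assumptions.

```python
def f(x):
    return x**2 + 2*x

def f_prime(x):
    return 2*x + 2

def actual_change(a, dx):
    """Delta y = f(a + dx) - f(a)."""
    return f(a + dx) - f(a)

def differential(a, dx):
    """dy = f'(a) dx."""
    return f_prime(a) * dx

for dx in (0.2, 0.1, 0.01):
    delta_y = actual_change(3, dx)
    dy = differential(3, dx)
    print(f"dx={dx}: delta_y={delta_y:.4f}, dy={dy:.4f}, gap={delta_y - dy:.4f}")
```

For this particular quadratic the gap $\Delta y - dy$ equals $(dx)^2$, so it shrinks much faster than $dx$ itself.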

 Exercise 3.10.5

For $y = x^2 + 2x$, find $\Delta y$ and $dy$ at $x = 3$ if $dx = 0.2$.

Hint

$dy = f'(3)\,dx$, $\Delta y = f(3.2) - f(3)$

Answer
$dy = 1.6$, $\Delta y = 1.64$

Calculating the Amount of Error


Any type of measurement is prone to a certain amount of error. In many applications, certain quantities are calculated based on
measurements. For example, the area of a circle is calculated by measuring the radius of the circle. An error in the measurement of
the radius leads to an error in the computed value of the area. Here we examine this type of error and study how differentials can be
used to estimate the error.
Consider a function f with an input that is a measured quantity. Suppose the exact value of the measured quantity is a , but the
measured value is a + dx . We say the measurement error is dx (or Δx). As a result, an error occurs in the calculated quantity
f (x). This type of error is known as a propagated error and is given by

Δy = f (a + dx) − f (a).

Since all measurements are prone to some degree of error, we do not know the exact value of a measured quantity, so we cannot
calculate the propagated error exactly. However, given an estimate of the accuracy of a measurement, we can use differentials to
approximate the propagated error $\Delta y$. Specifically, if $f$ is a differentiable function at $a$, the propagated error is

$$\Delta y \approx dy = f'(a)\,dx.$$

Unfortunately, we do not know the exact value $a$. However, we can use the measured value $a + dx$, and estimate

$$\Delta y \approx dy \approx f'(a + dx)\,dx.$$

In the next example, we look at how differentials can be used to estimate the error in calculating the volume of a box if we assume
the measurement of the side length is made with a certain amount of accuracy.

 Example 3.10.6: Volume of a Cube

Suppose the side length of a cube is measured to be 5 cm with an accuracy of 0.1 cm.
a. Use differentials to estimate the error in the computed volume of the cube.
b. Compute the volume of the cube if the side length is (i) 4.9 cm and (ii) 5.1 cm to compare the estimated error with the
actual potential error.
Solution
a. The measurement of the side length is accurate to within $\pm 0.1$ cm. Therefore,

$$-0.1 \le dx \le 0.1.$$

The volume of a cube is given by $V = x^3$, which leads to

$$dV = 3x^2\,dx.$$

Using the measured side length of 5 cm, we can estimate that

$$-3(5)^2(0.1) \le dV \le 3(5)^2(0.1).$$

Therefore,

$$-7.5 \le dV \le 7.5.$$

b. If the side length is actually 4.9 cm, then the volume of the cube is

$$V(4.9) = (4.9)^3 = 117.649\ \text{cm}^3.$$

If the side length is actually 5.1 cm, then the volume of the cube is

$$V(5.1) = (5.1)^3 = 132.651\ \text{cm}^3.$$

Therefore, the actual volume of the cube is between 117.649 and 132.651. Since the side length is measured to be 5 cm,
the computed volume is $V(5) = 5^3 = 125$. Therefore, the error in the computed volume is

$$117.649 - 125 \le \Delta V \le 132.651 - 125.$$

That is,

$$-7.351 \le \Delta V \le 7.651.$$

We see the estimated error $dV$ is relatively close to the actual potential error in the computed volume.
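The two parts of this example can be placed side by side in a short computation. This is a sketch under the example's own numbers (side 5 cm, tolerance 0.1 cm); the function and variable names are hypothetical.

```python
def volume(x):
    """Volume of a cube with side length x."""
    return x**3

def volume_differential(x, dx):
    """dV = 3 x^2 dx, the differential estimate of the propagated error."""
    return 3 * x**2 * dx

side, tol = 5.0, 0.1
dV = volume_differential(side, tol)              # estimated error bound
low = volume(side - tol) - volume(side)          # actual error if side is 4.9
high = volume(side + tol) - volume(side)         # actual error if side is 5.1

print(f"estimated: -{dV:.3f} <= dV <= {dV:.3f}")
print(f"actual:    {low:.3f} <= delta V <= {high:.3f}")
```

The estimate is symmetric about zero while the actual error is not, because the cubic grows faster above 5 than it shrinks below it.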

 Exercise 3.10.6
Estimate the error in the computed volume of a cube if the side length is measured to be 6 cm with an accuracy of 0.2 cm.

Hint
$dV = 3x^2\,dx$

Answer
The volume measurement is accurate to within $21.6\ \text{cm}^3$.

The measurement error $dx$ ($= \Delta x$) and the propagated error $\Delta y$ are absolute errors. We are typically interested in the size of an
error relative to the size of the quantity being measured or calculated. Given an absolute error $\Delta q$ for a particular quantity, we
define the relative error as $\frac{\Delta q}{q}$, where $q$ is the actual value of the quantity. The percentage error is the relative error expressed as
a percentage. For example, if we measure the height of a ladder to be 63 in. when the actual height is 62 in., the absolute error is 1
in. but the relative error is $\frac{1}{62} \approx 0.016$, or 1.6%. By comparison, if we measure the width of a piece of cardboard to be 8.25 in.
when the actual width is 8 in., our absolute error is $\frac{1}{4}$ in., whereas the relative error is $\frac{0.25}{8} = \frac{1}{32}$, or 3.1%. Therefore, the
percentage error in the measurement of the cardboard is larger, even though 0.25 in. is less than 1 in.

 Example 3.10.7: Relative and Percentage Error

An astronaut using a camera measures the radius of Earth as 4000 mi with an error of ±80 mi. Let’s use differentials to
estimate the relative and percentage error of using this radius measurement to calculate the volume of Earth, assuming the
planet is a perfect sphere.
Solution: If the measurement of the radius is accurate to within $\pm 80$, we have

$$-80 \le dr \le 80.$$

Since the volume of a sphere is given by $V = \frac{4}{3}\pi r^3$, we have

$$dV = 4\pi r^2\,dr.$$

Using the measured radius of 4000 mi, we can estimate

$$-4\pi(4000)^2(80) \le dV \le 4\pi(4000)^2(80).$$

To estimate the relative error, consider $\frac{dV}{V}$. Since we do not know the exact value of the volume $V$, use the measured radius
$r = 4000$ mi to estimate $V$. We obtain $V \approx \frac{4}{3}\pi(4000)^3$. Therefore the relative error satisfies

$$\frac{-4\pi(4000)^2(80)}{4\pi(4000)^3/3} \le \frac{dV}{V} \le \frac{4\pi(4000)^2(80)}{4\pi(4000)^3/3},$$

which simplifies to

$$-0.06 \le \frac{dV}{V} \le 0.06.$$

The relative error is 0.06 and the percentage error is 6%.
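The relative-error calculation can be sketched in a few lines. The function names below are chosen for illustration; the formulas are the sphere volume and its differential from this example.

```python
import math

def sphere_volume(r):
    return 4 / 3 * math.pi * r**3

def relative_volume_error(r, dr):
    """dV / V with dV = 4*pi*r^2*dr; algebraically this equals 3*dr/r."""
    dV = 4 * math.pi * r**2 * dr
    return dV / sphere_volume(r)

rel = relative_volume_error(4000, 80)
print(f"relative error   = {rel:.4f}")
print(f"percentage error = {100 * rel:.1f}%")
```

Because $dV/V$ reduces to $3\,dr/r$, the relative error in the volume is three times the relative error in the radius, independent of the factor $\frac{4}{3}\pi$.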

 Exercise 3.10.7

Determine the percentage error if the radius of Earth is measured to be 3950 mi with an error of ±100 mi.

Hint
Use the fact that $dV = 4\pi r^2\,dr$ to find $dV/V$.

Answer
7.6%

Key Concepts
A differentiable function $y = f(x)$ can be approximated at $a$ by the linear function

$$L(x) = f(a) + f'(a)(x - a).$$

For a function $y = f(x)$, if $x$ changes from $a$ to $a + dx$, then

$$dy = f'(a)\,dx$$

is an approximation for the change in $y$. The actual change in $y$ is

$$\Delta y = f(a + dx) - f(a).$$

A measurement error $dx$ can lead to an error in a calculated quantity $f(x)$. The error in the calculated quantity is known as the
propagated error. The propagated error can be estimated by

$$dy \approx f'(x)\,dx.$$

To estimate the relative error of a particular quantity $q$, we estimate $\frac{\Delta q}{q}$.

Key Equations
Linear approximation

$L(x) = f(a) + f'(a)(x - a)$

A differential

$dy = f'(x)\,dx$

Glossary
differential
the differential $dx$ is an independent variable that can be assigned any nonzero real number; the differential $dy$ is defined to be
$dy = f'(x)\,dx$

differential form
given a differentiable function $y = f(x)$, the equation $dy = f'(x)\,dx$ is the differential form of the derivative of $y$ with
respect to $x$

linear approximation
the linear function $L(x) = f(a) + f'(a)(x - a)$ is the linear approximation of $f$ at $x = a$

percentage error
the relative error expressed as a percentage

propagated error
the error that results in a calculated quantity $f(x)$ resulting from a measurement error $dx$

relative error
given an absolute error $\Delta q$ for a particular quantity, $\frac{\Delta q}{q}$ is the relative error.

tangent line approximation (linearization)


since the linear approximation of f at x = a is defined using the equation of the tangent line, the linear approximation of f at
x = a is also known as the tangent line approximation to f at x = a

3.10: Linear Approximations and Differentials is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
4.2: Linear Approximations and Differentials by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0. Original source:
https://openstax.org/details/books/calculus-volume-1.

3.11: Hyperbolic Functions
The hyperbolic functions appear with some frequency in applications, and are quite similar in many respects to the trigonometric
functions. This is a bit surprising given our initial definitions.

Definition 4.11.1: Hyperbolic Cosines and Sines


The hyperbolic cosine is the function

$$\cosh x = \frac{e^x + e^{-x}}{2}, \tag{3.11.1}$$

and the hyperbolic sine is the function

$$\sinh x = \frac{e^x - e^{-x}}{2}. \tag{3.11.2}$$

Notice that $\cosh$ is even (that is, $\cosh(-x) = \cosh(x)$) while $\sinh$ is odd ($\sinh(-x) = -\sinh(x)$), and $\cosh x + \sinh x = e^x$.
Also, for all $x$, $\cosh x > 0$, while $\sinh x = 0$ if and only if $e^x - e^{-x} = 0$, which is true precisely when $x = 0$.

Lemma 4.11.2
The range of cosh x is [1, ∞).

Proof
Let $y = \cosh x$. We solve for $x$:

$$\begin{aligned}
y &= \frac{e^x + e^{-x}}{2} \\
2y &= e^x + e^{-x} \\
2y e^x &= e^{2x} + 1 \\
0 &= e^{2x} - 2y e^x + 1 \\
e^x &= \frac{2y \pm \sqrt{4y^2 - 4}}{2} \\
e^x &= y \pm \sqrt{y^2 - 1}
\end{aligned} \tag{3.11.3}$$

From the last equation, we see $y^2 \ge 1$, and since $y \ge 0$, it follows that $y \ge 1$.
Now suppose $y \ge 1$, so $y \pm \sqrt{y^2 - 1} > 0$. Then $x = \ln(y \pm \sqrt{y^2 - 1})$ is a real number, and $y = \cosh x$, so $y$ is in the range
of $\cosh(x)$.
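The second half of the proof can be spot-checked numerically: for a chosen $y \ge 1$, both values $x = \ln(y \pm \sqrt{y^2 - 1})$ should satisfy $\cosh x = y$. The sketch below uses Python's standard `math.cosh`; the function name `cosh_preimages` and the sample values of $y$ are assumptions made for illustration.

```python
import math

def cosh_preimages(y):
    """Return both solutions x of cosh(x) = y, valid only for y >= 1."""
    if y < 1:
        raise ValueError("cosh x = y has no real solution when y < 1")
    root = math.sqrt(y * y - 1)
    return math.log(y - root), math.log(y + root)

for y in (1.0, 1.5, 10.0):
    x_minus, x_plus = cosh_preimages(y)
    print(f"y={y}: x={x_minus:.6f} or x={x_plus:.6f}, "
          f"cosh values: {math.cosh(x_minus):.6f}, {math.cosh(x_plus):.6f}")
```

The two solutions are negatives of each other, which matches the fact that cosh is even.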

Definition 4.11.3: Hyperbolic Tangent and Cotangent


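The other hyperbolic functions are

$$\begin{aligned}
\tanh x &= \frac{\sinh x}{\cosh x} \\
\coth x &= \frac{\cosh x}{\sinh x} \\
\operatorname{sech} x &= \frac{1}{\cosh x} \\
\operatorname{csch} x &= \frac{1}{\sinh x}
\end{aligned} \tag{3.11.4}$$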
3.11.1 https://math.libretexts.org/@go/page/4450
The domain of coth and csch is $x \ne 0$, while the domain of the other hyperbolic functions is all real numbers. Graphs are
shown in Figure 3.11.1.

Figure 3.11.1 : The hyperbolic functions.


Certainly the hyperbolic functions do not closely resemble the trigonometric functions graphically. But they do have analogous
properties, beginning with the following identity.

Theorem 4.11.4

For all $x$ in $\mathbb{R}$, $\cosh^2 x - \sinh^2 x = 1$.

Proof

The proof is a straightforward computation:

$$\cosh^2 x - \sinh^2 x = \frac{(e^x + e^{-x})^2}{4} - \frac{(e^x - e^{-x})^2}{4} = \frac{e^{2x} + 2 + e^{-2x} - e^{2x} + 2 - e^{-2x}}{4} = \frac{4}{4} = 1. \tag{3.11.5}$$

This immediately gives two additional identities:


$$1 - \tanh^2 x = \operatorname{sech}^2 x \quad \text{and} \quad \coth^2 x - 1 = \operatorname{csch}^2 x. \tag{3.11.6}$$
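These identities can be checked numerically from the exponential definitions. The following is a sketch rather than a proof; the helper names (`cosh_def`, `sinh_def`, `identity_residuals`) and the sample points are hypothetical, and $x = 0$ is excluded because coth and csch are undefined there.

```python
import math

def cosh_def(x):
    """cosh x computed directly from its definition (e^x + e^-x)/2."""
    return (math.exp(x) + math.exp(-x)) / 2

def sinh_def(x):
    """sinh x computed directly from its definition (e^x - e^-x)/2."""
    return (math.exp(x) - math.exp(-x)) / 2

def identity_residuals(x):
    """Left side minus right side for the three identities; requires x != 0."""
    c, s = cosh_def(x), sinh_def(x)
    tanh, coth, sech, csch = s / c, c / s, 1 / c, 1 / s
    return (c**2 - s**2 - 1,
            (1 - tanh**2) - sech**2,
            (coth**2 - 1) - csch**2)

for x in (-2.0, -0.5, 1.0, 3.0):
    print(x, identity_residuals(x))
```

Each residual should be zero up to floating-point rounding.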

The identity of the theorem also helps to provide a geometric motivation. Recall that the graph of $x^2 - y^2 = 1$ is a hyperbola with
asymptotes $x = \pm y$ whose $x$-intercepts are $\pm 1$. If $(x, y)$ is a point on the right half of the hyperbola, and if we let $x = \cosh t$, then
$y = \pm\sqrt{x^2 - 1} = \pm\sqrt{\cosh^2 t - 1} = \pm\sinh t$. So for some suitable $t$, $\cosh t$ and $\sinh t$ are the coordinates of a typical point on
the hyperbola. In fact, it turns out that $t$ is twice the area shown in the first graph of Figure 3.11.2. Even this is analogous to
trigonometry; $\cos t$ and $\sin t$ are the coordinates of a typical point on the unit circle, and $t$ is twice the area shown in the second
graph of Figure 3.11.2.

Figure 3.11.2 : Geometric definitions of sin, cos, sinh, cosh: t is twice the shaded area in each figure.
Given the definitions of the hyperbolic functions, finding their derivatives is straightforward. Here again we see similarities to the
trigonometric functions.

Theorem 4.11.5
$\frac{d}{dx}\cosh x = \sinh x$ and $\frac{d}{dx}\sinh x = \cosh x$.

Proof

$$\frac{d}{dx}\cosh x = \frac{d}{dx}\,\frac{e^x + e^{-x}}{2} = \frac{e^x - e^{-x}}{2} = \sinh x, \tag{3.11.7}$$

and

$$\frac{d}{dx}\sinh x = \frac{d}{dx}\,\frac{e^x - e^{-x}}{2} = \frac{e^x + e^{-x}}{2} = \cosh x. \tag{3.11.8}$$

Since $\cosh x > 0$, $\sinh x$ is increasing and hence injective, so $\sinh x$ has an inverse, $\operatorname{arcsinh} x$. Also, $\sinh x > 0$ when $x > 0$, so
$\cosh x$ is injective on $[0, \infty)$ and has a (partial) inverse, $\operatorname{arccosh} x$. The other hyperbolic functions have inverses as well, though
$\operatorname{arcsech} x$ is only a partial inverse. We may compute the derivatives of these functions as we have other inverse functions.

Theorem 4.11.6
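$\frac{d}{dx}\operatorname{arcsinh} x = \frac{1}{\sqrt{1 + x^2}}$.

Proof

Let $y = \operatorname{arcsinh} x$, so $\sinh y = x$. Then

$$\frac{d}{dx}\sinh y = \cosh(y) \cdot y' = 1, \tag{3.11.9}$$

and so

$$y' = \frac{1}{\cosh y} = \frac{1}{\sqrt{1 + \sinh^2 y}} = \frac{1}{\sqrt{1 + x^2}}. \tag{3.11.10}$$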
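As a sanity check on this formula, a central difference quotient of Python's standard `math.asinh` can be compared with $1/\sqrt{1 + x^2}$. This is a numerical sketch only; the step size `h` and the sample points are arbitrary choices made here, not values from the text.

```python
import math

def arcsinh_derivative(x):
    """Closed form from the theorem: 1 / sqrt(1 + x^2)."""
    return 1 / math.sqrt(1 + x * x)

def central_difference(f, x, h=1e-6):
    """Symmetric difference quotient approximating f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

for x in (-2.0, 0.0, 0.5, 3.0):
    numeric = central_difference(math.asinh, x)
    exact = arcsinh_derivative(x)
    print(f"x={x}: numeric={numeric:.8f}, formula={exact:.8f}")
```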

The other derivatives are left to the exercises.

Contributors
David Guichard (Whitman College)
Integrated by Justin Marshall.

3.11: Hyperbolic Functions is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
4.11: Hyperbolic Functions by David Guichard is licensed CC BY-NC-SA 4.0.

CHAPTER OVERVIEW

4: Applications of Differentiation
A general Calculus Textmap organized around the textbook

Calculus: Early Transcendentals


by James Stewart


Topic hierarchy
4.1: Maximum and Minimum Values
4.2: The Mean Value Theorem
4.3: How Derivatives Affect the Shape of a Graph
4.4: Indeterminate Forms and l'Hospital's Rule
4.5: Summary of Curve Sketching
4.6: Graphing with Calculus and Calculators
4.7: Optimization Problems
4.8: Newton's Method
4.9: Antiderivatives

4: Applications of Differentiation is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.

4.1: Maximum and Minimum Values
 Learning Objectives
Define absolute extrema.
Define local extrema.
Explain how to find the critical points of a function over a closed interval.
Describe how to use critical points to locate absolute extrema over a closed interval.

Given a particular function, we are often interested in determining the largest and smallest values of the function. This information
is important in creating accurate graphs. Finding the maximum and minimum values of a function also has practical significance,
because we can use this method to solve optimization problems, such as maximizing profit, minimizing the amount of material
used in manufacturing an aluminum can, or finding the maximum height a rocket can reach. In this section, we look at how to use
derivatives to find the largest and smallest values for a function.

Absolute Extrema
Consider the function $f(x) = x^2 + 1$ over the interval $(-\infty, \infty)$. As $x \to \pm\infty$, $f(x) \to \infty$. Therefore, the function does not have
a largest value. However, since $x^2 + 1 \ge 1$ for all real numbers $x$ and $x^2 + 1 = 1$ when $x = 0$, the function has a smallest value,
1, when $x = 0$. We say that 1 is the absolute minimum of $f(x) = x^2 + 1$ and it occurs at $x = 0$. We say that $f(x) = x^2 + 1$ does
not have an absolute maximum (Figure 4.1.1).

Figure 4.1.1 : The given function has an absolute minimum of 1 at x = 0 . The function does not have an absolute maximum.

 Definition: Absolute Extrema


Let f be a function defined over an interval I and let c ∈ I . We say f has an absolute maximum on I at c if f (c) ≥ f (x) for
all x ∈ I . We say f has an absolute minimum on I at c if f (c) ≤ f (x) for all x ∈ I . If f has an absolute maximum on I at c or
an absolute minimum on I at c , we say f has an absolute extremum on I at c .

Before proceeding, let’s note two important issues regarding this definition. First, the term absolute here does not refer to absolute
value. An absolute extremum may be positive, negative, or zero. Second, if a function f has an absolute extremum over an interval
I at c , the absolute extremum is f (c). The real number c is a point in the domain at which the absolute extremum occurs. For

example, consider the function $f(x) = 1/(x^2 + 1)$ over the interval $(-\infty, \infty)$. Since

$$f(0) = 1 \ge \frac{1}{x^2 + 1} = f(x)$$

for all real numbers x, we say f has an absolute maximum over (−∞, ∞) at x = 0 . The absolute maximum is f (0) = 1 . It occurs
at x = 0 , as shown in Figure 4.1.2(b).
A function may have both an absolute maximum and an absolute minimum, just one extremum, or neither. Figure 4.1.2 shows
several functions and some of the different possibilities regarding absolute extrema. However, the following theorem, called the
Extreme Value Theorem, guarantees that a continuous function f over a closed, bounded interval [a, b] has both an absolute
maximum and an absolute minimum.

4.1.1 https://math.libretexts.org/@go/page/4461
Figure 4.1.2 : Graphs (a), (b), and (c) show several possibilities for absolute extrema for functions with a domain of (−∞, ∞).
Graphs (d), (e), and (f) show several possibilities for absolute extrema for functions with a domain that is a bounded interval.

 Theorem 4.1.1: Extreme Value Theorem

If f is a continuous function over the closed, bounded interval [a, b], then there is a point in [a, b] at which f has an absolute
maximum over [a, b] and there is a point in [a, b] at which f has an absolute minimum over [a, b].

The proof of the extreme value theorem is beyond the scope of this text. Typically, it is proved in a course on real analysis. There
are a couple of key points to note about the statement of this theorem. For the extreme value theorem to apply, the function must be
continuous over a closed, bounded interval. If the interval I is open or the function has even one point of discontinuity, the function
may not have an absolute maximum or absolute minimum over I . For example, consider the functions shown in Figure 4.1.2 (d),
(e), and (f). All three of these functions are defined over bounded intervals. However, the function in graph (e) is the only one that
has both an absolute maximum and an absolute minimum over its domain. The extreme value theorem cannot be applied to the
functions in graphs (d) and (f) because neither of these functions is continuous over a closed, bounded interval. Although the
function in graph (d) is defined over the closed interval [0, 4], the function is discontinuous at x = 2 . The function has an absolute
maximum over [0, 4] but does not have an absolute minimum. The function in graph (f) is continuous over the half-open interval
[0, 2), but is not defined at x = 2 , and therefore is not continuous over a closed, bounded interval. The function has an absolute

minimum over [0, 2), but does not have an absolute maximum over [0, 2). These two graphs illustrate why a function over a
bounded interval may fail to have an absolute maximum and/or absolute minimum.

Before looking at how to find absolute extrema, let’s examine the related concept of local extrema. This idea is useful in
determining where absolute extrema occur.

Local Extrema and Critical Points


Consider the function f shown in Figure 4.1.3. The graph can be described as two mountains with a valley in the middle. The
absolute maximum value of the function occurs at the higher peak, at x = 2 . However, x = 0 is also a point of interest. Although
f (0) is not the largest value of f , the value f (0) is larger than f (x) for all x near 0. We say f has a local maximum at x = 0 .

Similarly, the function f does not have an absolute minimum, but it does have a local minimum at x = 1 because f (1) is less than
f (x) for x near 1.

Figure 4.1.3 : This function f has two local maxima and one local minimum. The local maximum at x = 2 is also the absolute
maximum.

 Definition: Local Extrema

A function f has a local maximum at c if there exists an open interval I containing c such that I is contained in the domain of
f and f (c) ≥ f (x) for all x ∈ I . A function f has a local minimum at c if there exists an open interval I containing c such that

I is contained in the domain of f and f (c) ≤ f (x) for all x ∈ I . A function f has a local extremum at c if f has a local

maximum at c or f has a local minimum at c .

Note that if f has an absolute extremum at c and f is defined over an interval containing c , then f (c) is also considered a local
extremum. If an absolute extremum for a function f occurs at an endpoint, we do not consider that to be a local extremum, but
instead refer to that as an endpoint extremum.
Given the graph of a function f , it is sometimes easy to see where a local maximum or local minimum occurs. However, it is not
always easy to see, since the interesting features on the graph of a function may not be visible because they occur at a very small
scale. Also, we may not have a graph of the function. In these cases, how can we use a formula for a function to determine where
these extrema occur?
To answer this question, let’s look at Figure 4.1.3 again. The local extrema occur at $x = 0$, $x = 1$, and $x = 2$. Notice that at $x = 0$
and $x = 1$, the derivative $f'(x) = 0$. At $x = 2$, the derivative $f'(x)$ does not exist, since the function $f$ has a corner there. In fact,
if $f$ has a local extremum at a point $x = c$, the derivative $f'(c)$ must satisfy one of the following conditions: either $f'(c) = 0$ or
$f'(c)$ is undefined. Such a value $c$ is known as a critical point and it is important in finding extreme values for functions.

 Definition: Critical Points

Let $c$ be an interior point in the domain of $f$. We say that $c$ is a critical point of $f$ if $f'(c) = 0$ or $f'(c)$ is undefined.

As mentioned earlier, if $f$ has a local extremum at a point $x = c$, then $c$ must be a critical point of $f$. This fact is known as
Fermat’s theorem.

 Theorem 4.1.2: Fermat’s Theorem
If $f$ has a local extremum at $c$ and $f$ is differentiable at $c$, then $f'(c) = 0$.

 Proof

Suppose $f$ has a local extremum at $c$ and $f$ is differentiable at $c$. We need to show that $f'(c) = 0$. To do this, we will show that
$f'(c) \ge 0$ and $f'(c) \le 0$, and therefore $f'(c) = 0$. Since $f$ has a local extremum at $c$, $f$ has a local maximum or local
minimum at $c$. Suppose $f$ has a local maximum at $c$. The case in which $f$ has a local minimum at $c$ can be handled similarly.
There then exists an open interval $I$ such that $f(c) \ge f(x)$ for all $x \in I$. Since $f$ is differentiable at $c$, from the definition of the
derivative, we know that

$$f'(c) = \lim_{x \to c} \frac{f(x) - f(c)}{x - c}.$$

Since this limit exists, both one-sided limits also exist and equal $f'(c)$. Therefore,

$$f'(c) = \lim_{x \to c^+} \frac{f(x) - f(c)}{x - c} \tag{4.1.1}$$

and

$$f'(c) = \lim_{x \to c^-} \frac{f(x) - f(c)}{x - c}.$$

Since $f(c)$ is a local maximum, we see that $f(x) - f(c) \le 0$ for $x$ near $c$. Therefore, for $x$ near $c$, but $x > c$, we have
$\frac{f(x) - f(c)}{x - c} \le 0$. From Equation 4.1.1 we conclude that $f'(c) \le 0$. Similarly, for $x$ near $c$, but $x < c$, the numerator is still at most
zero while the denominator is negative, so the quotient is nonnegative and the left-hand limit gives $f'(c) \ge 0$. Therefore,
$f'(c) = 0$.

From Fermat’s theorem, we conclude that if $f$ has a local extremum at $c$, then either $f'(c) = 0$ or $f'(c)$ is undefined. In other
words, local extrema can only occur at critical points.
Note this theorem does not claim that a function $f$ must have a local extremum at a critical point. Rather, it states that critical
points are candidates for local extrema. For example, consider the function $f(x) = x^3$. We have $f'(x) = 3x^2 = 0$ when $x = 0$.
Therefore, $x = 0$ is a critical point. However, $f(x) = x^3$ is increasing over $(-\infty, \infty)$, and thus $f$ does not have a local extremum
at $x = 0$. In Figure 4.1.4, we see several different possibilities for critical points. In some of these cases, the functions have local
extrema at critical points, whereas in other cases the functions do not. Note that these graphs do not show all possibilities for the
behavior of a function at a critical point.

Figure 4.1.4 : (a–e) A function $f$ has a critical point at $c$ if $f'(c) = 0$ or $f'(c)$ is undefined. A function may or may not have a local
extremum at a critical point.
Later in this chapter we look at analytical methods for determining whether a function actually has a local extremum at a critical
point. For now, let’s turn our attention to finding critical points. We will use graphical observations to determine whether a critical
point is associated with a local extremum.

 Example 4.1.1: Locating Critical Points

For each of the following functions, find all critical points. Use a graphing utility to determine whether the function has a local
extremum at each of the critical points.
a. $f(x) = \frac{1}{3}x^3 - \frac{5}{2}x^2 + 4x$
b. $f(x) = (x^2 - 1)^3$
c. $f(x) = \frac{4x}{1 + x^2}$

Solution
a. The derivative $f'(x) = x^2 - 5x + 4$ is defined for all real numbers $x$. Therefore, we only need to find the values for $x$
where $f'(x) = 0$. Since $f'(x) = x^2 - 5x + 4 = (x - 4)(x - 1)$, the critical points are $x = 1$ and $x = 4$. From the graph of $f$
in Figure 4.1.5, we see that $f$ has a local maximum at $x = 1$ and a local minimum at $x = 4$.

Figure 4.1.5 : This function has a local maximum and a local minimum.
b. Using the chain rule, we see the derivative is

$$f'(x) = 3(x^2 - 1)^2(2x) = 6x(x^2 - 1)^2.$$

Therefore, $f$ has critical points when $x = 0$ and when $x^2 - 1 = 0$. We conclude that the critical points are $x = 0, \pm 1$. From
the graph of $f$ in Figure 4.1.6, we see that $f$ has a local (and absolute) minimum at $x = 0$, but does not have a local extremum
at $x = 1$ or $x = -1$.

Figure 4.1.6 : This function has three critical points: x = 0 , x = 1 , and x = −1 . The function has a local (and absolute)
minimum at x = 0 , but does not have extrema at the other two critical points.
c. By the quotient rule, we see that the derivative is

$$f'(x) = \frac{4(1 + x^2) - 4x(2x)}{(1 + x^2)^2} = \frac{4 - 4x^2}{(1 + x^2)^2}.$$

The derivative is defined everywhere. Therefore, we only need to find values for $x$ where $f'(x) = 0$. Solving $f'(x) = 0$, we
see that $4 - 4x^2 = 0$, which implies $x = \pm 1$. Therefore, the critical points are $x = \pm 1$. From the graph of $f$ in Figure 4.1.7,
we see that $f$ has an absolute maximum at $x = 1$ and an absolute minimum at $x = -1$. Hence, $f$ has a local maximum at
$x = 1$ and a local minimum at $x = -1$. (Note that if $f$ has an absolute extremum over an interval $I$ at a point $c$ that is not an
endpoint of $I$, then $f$ has a local extremum at $c$.)

Figure 4.1.7 : This function has an absolute maximum and an absolute minimum.
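The conclusions of this example can be spot-checked without a graph. The sketch below is an assumption-laden stand-in for the graphing utility: it looks at the sign of $f'$ slightly to the left and right of each critical point, a technique the text develops only later in the chapter. The derivative formulas are the ones computed above; the function names and the offset `h` are hypothetical.

```python
def classify(df, c, h=1e-3):
    """Classify a critical point c by the sign of f' just left and right of c."""
    left, right = df(c - h), df(c + h)
    if left > 0 and right < 0:
        return "local maximum"
    if left < 0 and right > 0:
        return "local minimum"
    return "no local extremum"

def df_a(x):                      # derivative from part a
    return x**2 - 5*x + 4

def df_b(x):                      # derivative from part b
    return 6 * x * (x**2 - 1)**2

def df_c(x):                      # derivative from part c
    return (4 - 4 * x**2) / (1 + x**2)**2

for name, df, points in [("a", df_a, (1, 4)),
                         ("b", df_b, (-1, 0, 1)),
                         ("c", df_c, (-1, 1))]:
    for c in points:
        print(f"part {name}, x = {c}: {classify(df, c)}")
```

A fixed offset such as `h = 1e-3` can miss features that occur on a smaller scale, so this check supplements the analysis rather than replacing it.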

 Exercise 4.1.1

Find all critical points for $f(x) = x^3 - \frac{1}{2}x^2 - 2x + 1$.

Hint
Calculate $f'(x)$.

Answer
$x = -\frac{2}{3}$, $x = 1$

Locating Absolute Extrema


The extreme value theorem states that a continuous function over a closed, bounded interval has an absolute maximum and an
absolute minimum. As shown in Figure 4.1.2, one or both of these absolute extrema could occur at an endpoint. If an absolute
extremum does not occur at an endpoint, however, it must occur at an interior point, in which case the absolute extremum is a local
extremum. Therefore, by Fermat's Theorem, the point c at which the local extremum occurs must be a critical point. We summarize
this result in the following theorem.

 Theorem 4.1.3: Location of Absolute Extrema
Let f be a continuous function over a closed, bounded interval I . The absolute maximum of f over I and the absolute
minimum of f over I must occur at endpoints of I or at critical points of f in I .

With this idea in mind, let’s examine a procedure for locating absolute extrema.

 Problem-Solving Strategy: Locating Absolute Extrema over a Closed Interval

Consider a continuous function f defined over the closed interval [a, b].
1. Evaluate f at the endpoints x = a and x = b.
2. Find all critical points of f that lie over the interval (a, b) and evaluate f at those critical points.
3. Compare all values found in (1) and (2). From "Location of Absolute Extrema," the absolute extrema must occur at
endpoints or critical points. Therefore, the largest of these values is the absolute maximum of f . The smallest of these
values is the absolute minimum of f .
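The three steps above can be sketched in a few lines of Python. This is only an illustration, not part of the original text: the function name `absolute_extrema` and its arguments are hypothetical, and the critical points are assumed to have been found by hand and passed in by the caller. The sample function is the quadratic \(f(x) = -x^2 + 3x - 2\) on \([1, 3]\), whose derivative \(-2x + 3\) vanishes at \(x = 3/2\).

```python
def absolute_extrema(f, a, b, critical_points):
    """Return ((x_max, f_max), (x_min, f_min)) for a continuous f over [a, b].

    critical_points is supplied by the caller (hypothetical helper input);
    points outside the open interval (a, b) are ignored.
    """
    # Steps 1 and 2: candidates are the endpoints plus interior critical points.
    candidates = [a, b] + [c for c in critical_points if a < c < b]
    values = [(x, f(x)) for x in candidates]
    # Step 3: compare the values and pick the largest and smallest.
    abs_max = max(values, key=lambda pair: pair[1])
    abs_min = min(values, key=lambda pair: pair[1])
    return abs_max, abs_min


# f(x) = -x^2 + 3x - 2 over [1, 3]; f'(x) = -2x + 3 is zero at x = 3/2.
f = lambda x: -x**2 + 3*x - 2
abs_max, abs_min = absolute_extrema(f, 1, 3, [1.5])
print(abs_max, abs_min)
```

The sketch only compares values at the candidates it is given, so a missed critical point would go unnoticed; finding every critical point remains the analytical part of the strategy.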

Now let’s look at how to use this strategy to find the absolute maximum and absolute minimum values for continuous functions.

 Example 4.1.2: Locating Absolute Extrema


For each of the following functions, find the absolute maximum and absolute minimum over the specified interval and state
where those values occur.
a. \(f(x) = -x^2 + 3x - 2\) over \([1, 3]\).
b. \(f(x) = x^2 - 3x^{2/3}\) over \([0, 2]\).

Solution
a. Step 1. Evaluate f at the endpoints x = 1 and x = 3 .
f (1) = 0 and f (3) = −2
Step 2. Since \(f'(x) = -2x + 3\), \(f'\) is defined for all real numbers \(x\). Therefore, there are no critical points where the derivative is undefined. It remains to check where \(f'(x) = 0\). Since \(f'(x) = -2x + 3 = 0\) at \(x = \frac{3}{2}\) and \(\frac{3}{2}\) is in the interval \([1, 3]\), \(f\left(\frac{3}{2}\right)\) is a candidate for an absolute extremum of \(f\) over \([1, 3]\). We evaluate \(f\left(\frac{3}{2}\right)\) and find
\[ f\left(\frac{3}{2}\right) = \frac{1}{4}. \]
Step 3. We set up the following table to compare the values found in steps 1 and 2.

x      f(x)    Conclusion
1      0
3/2    1/4     Absolute maximum
3      −2      Absolute minimum

From the table, we find that the absolute maximum of \(f\) over the interval \([1, 3]\) is \(\frac{1}{4}\), and it occurs at \(x = \frac{3}{2}\). The absolute minimum of \(f\) over the interval \([1, 3]\) is \(-2\), and it occurs at \(x = 3\), as shown in Figure 4.1.8.

Figure 4.1.8 : This function has both an absolute maximum and an absolute minimum.
b. Step 1. Evaluate \(f\) at the endpoints \(x = 0\) and \(x = 2\).
\(f(0) = 0\) and \(f(2) = 4 - 3(2)^{2/3} \approx -0.762\)

Step 2. The derivative of \(f\) is given by
\[ f'(x) = 2x - \frac{2}{x^{1/3}} = \frac{2x^{4/3} - 2}{x^{1/3}} \]
for \(x \neq 0\). The derivative is zero when \(2x^{4/3} - 2 = 0\), which implies \(x = \pm 1\). The derivative is undefined at \(x = 0\). Therefore, the critical points of \(f\) are \(x = 0, 1, -1\). The point \(x = 0\) is an endpoint, so we already evaluated \(f(0)\) in step 1. The point \(x = -1\) is not in the interval of interest, so we need only evaluate \(f(1)\). We find that
\(f(1) = -2.\)

Step 3. We compare the values found in steps 1 and 2, in the following table.

x      f(x)      Conclusion
0      0         Absolute maximum
1      −2        Absolute minimum
2      −0.762

We conclude that the absolute maximum of f over the interval [0, 2] is zero, and it occurs at x = 0 . The absolute minimum is
−2, and it occurs at x = 1 as shown in Figure 4.1.9.

Figure 4.1.9 : This function has an absolute maximum at an endpoint of the interval.

 Exercise 4.1.2
Find the absolute maximum and absolute minimum of \(f(x) = x^2 - 4x + 3\) over the interval \([1, 4]\).

Hint
Look for critical points. Evaluate f at all critical points and at the endpoints.

Answer
The absolute maximum is 3 and it occurs at x = 4 . The absolute minimum is −1 and it occurs at x = 2 .

At this point, we know how to locate absolute extrema for continuous functions over closed intervals. We have also defined local
extrema and determined that if a function f has a local extremum at a point c , then c must be a critical point of f . However, c
being a critical point is not a sufficient condition for f to have a local extremum at c . Later in this chapter, we show how to
determine whether a function actually has a local extremum at a critical point. First, however, we need to introduce the Mean Value
Theorem, which will help as we analyze the behavior of the graph of a function.

Key Concepts
A function may have both an absolute maximum and an absolute minimum, have just one absolute extremum, or have no
absolute maximum or absolute minimum.
If a function has a local extremum, the point at which it occurs must be a critical point. However, a function need not have a
local extremum at a critical point.
A continuous function over a closed, bounded interval has an absolute maximum and an absolute minimum. Each extremum
occurs at a critical point or an endpoint.

Glossary
absolute extremum
if f has an absolute maximum or absolute minimum at c, we say f has an absolute extremum at c

absolute maximum
if f (c) ≥ f (x) for all x in the domain of f , we say f has an absolute maximum at c

absolute minimum
if f (c) ≤ f (x) for all x in the domain of f , we say f has an absolute minimum at c

critical point
if \(f'(c) = 0\) or \(f'(c)\) is undefined, we say that \(c\) is a critical point of \(f\)

extreme value theorem


if f is a continuous function over a finite, closed interval, then f has an absolute maximum and an absolute minimum

Fermat’s theorem
if f has a local extremum at c, then c is a critical point of f

local extremum
if f has a local maximum or local minimum at c, we say f has a local extremum at c

local maximum
if there exists an open interval \(I\) containing \(c\) such that \(f(c) \ge f(x)\) for all \(x \in I\), we say \(f\) has a local maximum at \(c\)

local minimum
if there exists an open interval \(I\) containing \(c\) such that \(f(c) \le f(x)\) for all \(x \in I\), we say \(f\) has a local minimum at \(c\)

4.1: Maximum and Minimum Values is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.

4.3: Maxima and Minima by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0. Original source:
https://openstax.org/details/books/calculus-volume-1.

4.2: The Mean Value Theorem
Learning Objectives
Explain the meaning of Rolle’s theorem.
Describe the significance of the Mean Value Theorem.
State three important consequences of the Mean Value Theorem.

The Mean Value Theorem is one of the most important theorems in calculus. We look at some of its implications at the end of this
section. First, let’s start with a special case of the Mean Value Theorem, called Rolle’s theorem.

Rolle’s Theorem
Informally, Rolle’s theorem states that if the outputs of a differentiable function \(f\) are equal at the endpoints of an interval, then there must be an interior point \(c\) where \(f'(c) = 0\). Figure 4.2.1 illustrates this theorem.

Figure 4.2.1 : If a differentiable function f satisfies f (a) = f (b), then its derivative must be zero at some point(s) between a and b .

Rolle’s Theorem
Let \(f\) be a continuous function over the closed interval \([a, b]\) and differentiable over the open interval \((a, b)\) such that \(f(a) = f(b)\). Then there exists at least one \(c \in (a, b)\) such that \(f'(c) = 0\).

Proof
Let k = f (a) = f (b). We consider three cases:
1. f (x) = k for all x ∈ (a, b).
2. There exists x ∈ (a, b) such that f (x) > k.
3. There exists x ∈ (a, b) such that f (x) < k.
Case 1: If \(f(x) = k\) for all \(x \in (a, b)\), then \(f'(x) = 0\) for all \(x \in (a, b)\).
Case 2: Since f is a continuous function over the closed, bounded interval [a, b], by the extreme value theorem, it has an
absolute maximum. Also, since there is a point x ∈ (a, b) such that f (x) > k , the absolute maximum is greater than k .
Therefore, the absolute maximum does not occur at either endpoint. As a result, the absolute maximum must occur at an
interior point \(c \in (a, b)\). Because \(f\) has a maximum at an interior point \(c\), and \(f\) is differentiable at \(c\), by Fermat’s theorem, \(f'(c) = 0\).

Case 3: The case when there exists a point x ∈ (a, b) such that f (x) < k is analogous to case 2, with maximum replaced by
minimum.

An important point about Rolle’s theorem is that the differentiability of the function \(f\) is critical. If \(f\) is not differentiable, even at a single point, the result may not hold. For example, the function \(f(x) = |x| - 1\) is continuous over \([-1, 1]\) and \(f(-1) = 0 = f(1)\), but there is no \(c \in (-1, 1)\) with \(f'(c) = 0\), as shown in the following figure.

4.2.1 https://math.libretexts.org/@go/page/4462
Figure 4.2.2: Since \(f(x) = |x| - 1\) is not differentiable at \(x = 0\), the conditions of Rolle’s theorem are not satisfied. In fact, the conclusion does not hold here; there is no \(c \in (-1, 1)\) such that \(f'(c) = 0\).

Let’s now consider functions that satisfy the conditions of Rolle’s theorem and calculate explicitly the points \(c\) where \(f'(c) = 0\).

Example 4.2.1: Using Rolle’s Theorem

For each of the following functions, verify that the function satisfies the criteria stated in Rolle’s theorem and find all values \(c\) in the given interval where \(f'(c) = 0\).

a. \(f(x) = x^2 + 2x\) over \([-2, 0]\)
b. \(f(x) = x^3 - 4x\) over \([-2, 2]\)
Solution
a. Since \(f\) is a polynomial, it is continuous and differentiable everywhere. In addition, \(f(-2) = 0 = f(0)\). Therefore, \(f\) satisfies the criteria of Rolle’s theorem. We conclude that there exists at least one value \(c \in (-2, 0)\) such that \(f'(c) = 0\). Since \(f'(x) = 2x + 2 = 2(x + 1)\), we see that \(f'(c) = 2(c + 1) = 0\) implies \(c = -1\), as shown in the following graph.

Figure 4.2.3: This function is continuous and differentiable over \([-2, 0]\), and \(f'(c) = 0\) when \(c = -1\).
b. As in part a., \(f\) is a polynomial and therefore is continuous and differentiable everywhere. Also, \(f(-2) = 0 = f(2)\). Therefore, \(f\) satisfies the criteria of Rolle’s theorem. Differentiating, we find that \(f'(x) = 3x^2 - 4\). Therefore, \(f'(c) = 0\) when \(c = \pm\frac{2}{\sqrt{3}}\). Both points are in the interval \([-2, 2]\), and, therefore, both points satisfy the conclusion of Rolle’s theorem, as shown in the following graph.


Figure 4.2.4: For this polynomial over \([-2, 2]\), \(f'(c) = 0\) at \(c = \pm 2/\sqrt{3}\).
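A short numerical check of part b can make the two hypotheses and the conclusion concrete. This is a sketch rather than part of the original solution; the names `f`, `f_prime`, and `interior_roots` are illustrative, and the derivative is entered by hand instead of being computed symbolically.

```python
import math

def f(x):
    return x**3 - 4*x

def f_prime(x):
    # Derivative of x^3 - 4x, entered by hand.
    return 3*x**2 - 4

a, b = -2.0, 2.0

# Hypothesis of Rolle's theorem: equal values at the endpoints.
endpoints_equal = math.isclose(f(a), f(b), abs_tol=1e-12)

# Solving 3c^2 - 4 = 0 gives c = -2/sqrt(3) and c = 2/sqrt(3).
roots = [-2 / math.sqrt(3), 2 / math.sqrt(3)]

# Keep only the roots that lie strictly inside (a, b).
interior_roots = [c for c in roots if a < c < b]
print(endpoints_equal, interior_roots)
```

Both roots fall inside \((-2, 2)\), which matches the observation that Rolle’s theorem guarantees at least one such point but does not limit how many there are.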

Exercise 4.2.1

Verify that the function \(f(x) = 2x^2 - 8x + 6\) defined over the interval \([1, 3]\) satisfies the conditions of Rolle’s theorem. Find all points \(c\) guaranteed by Rolle’s theorem.

Hint
Find all values \(c\) where \(f'(c) = 0\).

Answer
c =2

The Mean Value Theorem and Its Meaning


Rolle’s theorem is a special case of the Mean Value Theorem. In Rolle’s theorem, we consider differentiable functions \(f\) that take equal values at the endpoints. The Mean Value Theorem generalizes Rolle’s theorem by considering functions whose values at the endpoints are not necessarily equal. Consequently, we can view the Mean Value Theorem as a slanted version of Rolle’s theorem (Figure 4.2.5). The Mean Value Theorem states that if \(f\) is continuous over the closed interval \([a, b]\) and differentiable over the open interval \((a, b)\), then there exists a point \(c \in (a, b)\) such that the tangent line to the graph of \(f\) at \(c\) is parallel to the secant line connecting \((a, f(a))\) and \((b, f(b))\).

Figure 4.2.5: The Mean Value Theorem says that for a function that meets its conditions, at some point the tangent line has the same slope as the secant line between the ends. For this function, there are two values \(c_1\) and \(c_2\) such that the tangent line to \(f\) at \(c_1\) and \(c_2\) has the same slope as the secant line.

Mean Value Theorem
Let \(f\) be continuous over the closed interval \([a, b]\) and differentiable over the open interval \((a, b)\). Then, there exists at least one point \(c \in (a, b)\) such that
\[ f'(c) = \frac{f(b) - f(a)}{b - a}. \]

Proof
The proof follows from Rolle’s theorem by introducing an appropriate function that satisfies the criteria of Rolle’s theorem. Consider the line connecting \((a, f(a))\) and \((b, f(b))\). Since the slope of that line is
\[ \frac{f(b) - f(a)}{b - a} \]
and the line passes through the point \((a, f(a))\), the equation of that line can be written as
\[ y = \frac{f(b) - f(a)}{b - a}(x - a) + f(a). \]
Let \(g(x)\) denote the vertical difference between the point \((x, f(x))\) and the point \((x, y)\) on that line. Therefore,
\[ g(x) = f(x) - \left[ \frac{f(b) - f(a)}{b - a}(x - a) + f(a) \right]. \]

Figure 4.2.6: The value \(g(x)\) is the vertical difference between the point \((x, f(x))\) and the point \((x, y)\) on the secant line connecting \((a, f(a))\) and \((b, f(b))\).
Since the graph of \(f\) intersects the secant line when \(x = a\) and \(x = b\), we see that \(g(a) = 0 = g(b)\). Since \(f\) is a differentiable function over \((a, b)\), \(g\) is also a differentiable function over \((a, b)\). Furthermore, since \(f\) is continuous over \([a, b]\), \(g\) is also continuous over \([a, b]\). Therefore, \(g\) satisfies the criteria of Rolle’s theorem. Consequently, there exists a point \(c \in (a, b)\) such that \(g'(c) = 0\). Since
\[ g'(x) = f'(x) - \frac{f(b) - f(a)}{b - a}, \]
we see that
\[ g'(c) = f'(c) - \frac{f(b) - f(a)}{b - a}. \]
Since \(g'(c) = 0\), we conclude that
\[ f'(c) = \frac{f(b) - f(a)}{b - a}. \]

In the next example, we show how the Mean Value Theorem can be applied to the function \(f(x) = \sqrt{x}\) over the interval \([0, 9]\). The method is the same for other functions, although sometimes with more interesting consequences.

Example 4.2.2: Verifying that the Mean Value Theorem Applies

For \(f(x) = \sqrt{x}\) over the interval \([0, 9]\), show that \(f\) satisfies the hypotheses of the Mean Value Theorem, and therefore there exists at least one value \(c \in (0, 9)\) such that \(f'(c)\) is equal to the slope of the line connecting \((0, f(0))\) and \((9, f(9))\). Find these values \(c\) guaranteed by the Mean Value Theorem.
Solution
We know that \(f(x) = \sqrt{x}\) is continuous over \([0, 9]\) and differentiable over \((0, 9)\). Therefore, \(f\) satisfies the hypotheses of the Mean Value Theorem, and there must exist at least one value \(c \in (0, 9)\) such that \(f'(c)\) is equal to the slope of the line connecting \((0, f(0))\) and \((9, f(9))\) (Figure 4.2.7). To determine which value(s) of \(c\) are guaranteed, first calculate the derivative of \(f\). The derivative is \(f'(x) = \frac{1}{2\sqrt{x}}\). The slope of the line connecting \((0, f(0))\) and \((9, f(9))\) is given by
\[ \frac{f(9) - f(0)}{9 - 0} = \frac{\sqrt{9} - \sqrt{0}}{9 - 0} = \frac{3}{9} = \frac{1}{3}. \]
We want to find \(c\) such that \(f'(c) = \frac{1}{3}\). That is, we want to find \(c\) such that
\[ \frac{1}{2\sqrt{c}} = \frac{1}{3}. \]
Solving this equation for \(c\), we obtain \(c = \frac{9}{4}\). At this point, the slope of the tangent line equals the slope of the line joining the endpoints.

Figure 4.2.7 : The slope of the tangent line at c = 9/4 is the same as the slope of the line segment connecting (0,0) and (9,3).
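When the equation \(f'(c) = \text{slope}\) cannot be solved by hand, the point promised by the Mean Value Theorem can be approximated numerically. The sketch below is one possible approach using bisection and is not taken from the text; the helper name `mvt_point` is hypothetical, and it assumes that \(f'(x)\) minus the secant slope changes sign exactly once on the bracket. It is applied here to \(f(x) = \sqrt{x}\) on \([0, 9]\), where the exact answer \(c = 9/4\) is known.

```python
import math

def mvt_point(f_prime, slope, lo, hi, tol=1e-12):
    """Bisection for a c in (lo, hi) with f_prime(c) equal to slope.

    Assumes f_prime(x) - slope changes sign exactly once between lo and hi.
    """
    g = lambda x: f_prime(x) - slope
    g_lo_positive = g(lo) > 0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if (g(mid) > 0) == g_lo_positive:
            lo = mid   # same sign as the left end: the root is to the right
        else:
            hi = mid   # sign changed: the root is to the left
    return (lo + hi) / 2

f = math.sqrt
f_prime = lambda x: 1 / (2 * math.sqrt(x))

a, b = 0.0, 9.0
secant_slope = (f(b) - f(a)) / (b - a)   # equals 1/3

# f' is undefined at x = 0, so the bracket starts slightly inside the interval.
c = mvt_point(f_prime, secant_slope, 1e-9, b)
print(c)
```

The small offset `1e-9` is an arbitrary choice made only to avoid dividing by zero at the left endpoint.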
One application that helps illustrate the Mean Value Theorem involves velocity. For example, suppose we drive a car for 1 h down a straight road with an average velocity of 45 mph. Let \(s(t)\) and \(v(t)\) denote the position and velocity of the car, respectively, for \(0 \le t \le 1\) h. Assuming that the position function \(s(t)\) is differentiable, we can apply the Mean Value Theorem to conclude that, at some time \(c \in (0, 1)\), the speed of the car was exactly
\[ v(c) = s'(c) = \frac{s(1) - s(0)}{1 - 0} = 45 \text{ mph}. \]

Example 4.2.3: Mean Value Theorem and Velocity

If a rock is dropped from a height of 100 ft, its position \(t\) seconds after it is dropped, until it hits the ground, is given by the function \(s(t) = -16t^2 + 100\).

a. Determine how long it takes before the rock hits the ground.
b. Find the average velocity \(v_{\text{avg}}\) of the rock from the time it is released until it hits the ground.
c. Find the time \(t\) guaranteed by the Mean Value Theorem when the instantaneous velocity of the rock is \(v_{\text{avg}}\).

Solution

a. When the rock hits the ground, its position is \(s(t) = 0\). Solving the equation \(-16t^2 + 100 = 0\) for \(t\), we find that \(t = \pm\frac{5}{2}\) sec. Since we are only considering \(t \ge 0\), the rock will hit the ground \(\frac{5}{2}\) sec after it is dropped.

b. The average velocity is given by
\[ v_{\text{avg}} = \frac{s(5/2) - s(0)}{5/2 - 0} = \frac{0 - 100}{5/2} = -40 \text{ ft/sec}. \]

c. The instantaneous velocity is given by the derivative of the position function. Therefore, we need to find a time \(t\) such that \(v(t) = s'(t) = v_{\text{avg}} = -40\) ft/sec. Since \(s(t)\) is continuous over the interval \([0, 5/2]\) and differentiable over the interval \((0, 5/2)\), by the Mean Value Theorem, there is guaranteed to be a point \(c \in (0, 5/2)\) such that
\[ s'(c) = \frac{s(5/2) - s(0)}{5/2 - 0} = -40. \]
Taking the derivative of the position function \(s(t)\), we find that \(s'(t) = -32t\). Therefore, the equation reduces to \(s'(c) = -32c = -40\). Solving this equation for \(c\), we have \(c = \frac{5}{4}\). Therefore, \(\frac{5}{4}\) sec after the rock is dropped, the instantaneous velocity equals the average velocity of the rock during its free fall: \(-40\) ft/sec.

Figure 4.2.8 : At time t = 5/4 sec, the velocity of the rock is equal to its average velocity from the time it is dropped until it
hits the ground.
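The three parts of this example follow the same arithmetic for any drop height, which the following sketch makes explicit. It is an illustration under stated assumptions: the function name `fall_summary` and the parameter `g_half` (the coefficient 16 in \(s(t) = -16t^2 + h\)) are hypothetical names introduced here, and the model is the same simple free-fall formula used in the example.

```python
import math

def fall_summary(height_ft, g_half=16.0):
    """Fall time, average velocity, and MVT time for s(t) = -g_half*t^2 + height_ft."""
    # Part a: solve s(t) = 0 for t >= 0.
    t_ground = math.sqrt(height_ft / g_half)
    # Part b: average velocity (s(t_ground) - s(0)) / (t_ground - 0).
    v_avg = (0.0 - height_ft) / (t_ground - 0.0)
    # Part c: solve s'(t) = -2*g_half*t = v_avg for t.
    t_mvt = v_avg / (-2.0 * g_half)
    return t_ground, v_avg, t_mvt

print(fall_summary(100.0))
```

For this position function the time in part c always works out to half the fall time, since \(-32t = -h/T\) with \(h = 16T^2\) gives \(t = T/2\); that agrees with \(5/4\) being half of \(5/2\).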

Exercise 4.2.2

Suppose a ball is dropped from a height of 200 ft. Its position at time \(t\) is \(s(t) = -16t^2 + 200\). Find the time \(t\) when the instantaneous velocity of the ball equals its average velocity.

Hint
First, determine how long it takes for the ball to hit the ground. Then, find the average velocity of the ball from the time it
is dropped until it hits the ground.

Answer
\(\frac{5}{2\sqrt{2}}\) sec

Corollaries of the Mean Value Theorem


Let’s now look at three corollaries of the Mean Value Theorem. These results have important consequences, which we use in
upcoming sections.
At this point, we know the derivative of any constant function is zero. The Mean Value Theorem allows us to conclude that the
converse is also true. In particular, if f '(x) = 0 for all x in some interval I , then f (x) is constant over that interval. This result may
seem intuitively obvious, but it has important implications that are not obvious, and we discuss them shortly.

Corollary 1: Functions with a Derivative of Zero
Let f be differentiable over an interval I . If f '(x) = 0 for all x ∈ I , then f (x) = constant for all x ∈ I .

Proof
Since f is differentiable over I , f must be continuous over I . Suppose f (x) is not constant for all x in I . Then there exist
\(a, b \in I\), where \(a \neq b\) and \(f(a) \neq f(b)\). Choose the notation so that \(a < b\). Therefore,
\[ \frac{f(b) - f(a)}{b - a} \neq 0. \]
Since \(f\) is a differentiable function, by the Mean Value Theorem, there exists \(c \in (a, b)\) such that
\[ f'(c) = \frac{f(b) - f(a)}{b - a}. \]

Therefore, there exists c ∈ I such that f '(c) ≠ 0 , which contradicts the assumption that f '(x) = 0 for all x ∈ I .

From "Corollary 1: Functions with a Derivative of Zero," it follows that if two functions have the same derivative, they differ by, at
most, a constant.

Corollary 2: Constant Difference Theorem

If f and g are differentiable over an interval I and f '(x) = g'(x) for all x ∈ I , then f (x) = g(x) + C for some constant C .

Proof

Let h(x) = f (x) − g(x). Then, h'(x) = f '(x) − g'(x) = 0 for all x ∈ I. By Corollary 1, there is a constant C such that
h(x) = C for all x ∈ I . Therefore, f (x) = g(x) + C for all x ∈ I .

The third corollary of the Mean Value Theorem discusses when a function is increasing and when it is decreasing. Recall that a function \(f\) is increasing over \(I\) if \(f(x_1) < f(x_2)\) whenever \(x_1 < x_2\), whereas \(f\) is decreasing over \(I\) if \(f(x_1) > f(x_2)\) whenever \(x_1 < x_2\). Using the Mean Value Theorem, we can show that if the derivative of a function is positive, then the function is increasing; if the derivative is negative, then the function is decreasing (Figure 4.2.9). We make use of this fact in the next section, where we show how to use the derivative of a function to locate local maximum and minimum values of the function, and how to determine the shape of the graph.
Corollary 2 is also important because it means that for a given function \(f\), if there exists a function \(F\) such that \(F'(x) = f(x)\), then the only other functions that have a derivative equal to \(f\) are \(F(x) + C\) for some constant \(C\). We discuss this result in more detail later in the chapter.

Figure 4.2.9 : If a function has a positive derivative over some interval I , then the function increases over that interval I ; if the
derivative is negative over some interval I , then the function decreases over that interval I .

Corollary 3: Increasing and Decreasing Functions
Let f be continuous over the closed interval [a, b] and differentiable over the open interval (a, b).
i. If f '(x) > 0 for all x ∈ (a, b), then f is an increasing function over [a, b].
ii. If f '(x) < 0 for all x ∈ (a, b), then f is a decreasing function over [a, b].

Proof

We will prove i.; the proof of ii. is similar. Write \(I = [a, b]\) and suppose \(f\) is not an increasing function over \(I\). Then there exist \(x_1\) and \(x_2\) in \(I\) such that \(x_1 < x_2\), but \(f(x_1) \ge f(x_2)\). Since \(f\) is continuous over \([x_1, x_2]\) and differentiable over \((x_1, x_2)\), by the Mean Value Theorem there exists \(c \in (x_1, x_2)\) such that
\[ f'(c) = \frac{f(x_2) - f(x_1)}{x_2 - x_1}. \]
Since \(f(x_1) \ge f(x_2)\), we know that \(f(x_2) - f(x_1) \le 0\). Also, \(x_1 < x_2\) tells us that \(x_2 - x_1 > 0\). We conclude that
\[ f'(c) = \frac{f(x_2) - f(x_1)}{x_2 - x_1} \le 0. \]
However, \(c \in (a, b)\) and \(f'(x) > 0\) for all \(x \in (a, b)\). This is a contradiction, and therefore \(f\) must be an increasing function over \(I\).

Key Concepts
If f is continuous over [a, b] and differentiable over (a, b) and f (a) = f (b) , then there exists a point c ∈ (a, b) such that
f '(c) = 0. This is Rolle’s theorem.

If \(f\) is continuous over \([a, b]\) and differentiable over \((a, b)\), then there exists a point \(c \in (a, b)\) such that
\[ f'(c) = \frac{f(b) - f(a)}{b - a}. \]
This is the Mean Value Theorem.


If \(f'(x) = 0\) over an interval \(I\), then \(f\) is constant over \(I\).

If two differentiable functions f and g satisfy f '(x) = g'(x) over I , then f (x) = g(x) + C for some constant C .
If f '(x) > 0 over an interval I , then f is increasing over I . If f '(x) < 0 over I , then f is decreasing over I .

Glossary
Mean Value Theorem
if \(f\) is continuous over \([a, b]\) and differentiable over \((a, b)\), then there exists \(c \in (a, b)\) such that \(f'(c) = \frac{f(b) - f(a)}{b - a}\)

Rolle’s theorem
if \(f\) is continuous over \([a, b]\) and differentiable over \((a, b)\), and if \(f(a) = f(b)\), then there exists \(c \in (a, b)\) such that \(f'(c) = 0\)

4.2: The Mean Value Theorem is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
4.4: The Mean Value Theorem by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0. Original source:
https://openstax.org/details/books/calculus-volume-1.

4.3: How Derivatives Affect the Shape of a Graph
 Learning Objectives
Explain how the sign of the first derivative affects the shape of a function’s graph.
State the first derivative test for critical points.
Use concavity and inflection points to explain how the sign of the second derivative affects the shape of a function’s graph.
Explain the concavity test for a function over an open interval.
Explain the relationship between a function and its first and second derivatives.
State the second derivative test for local extrema.

Earlier in this chapter we stated that if a function \(f\) has a local extremum at a point \(c\), then \(c\) must be a critical point of \(f\). However, a function is not guaranteed to have a local extremum at a critical point. For example, \(f(x) = x^3\) has a critical point at \(x = 0\) since \(f'(x) = 3x^2\) is zero at \(x = 0\), but \(f\) does not have a local extremum at \(x = 0\). Using the results from the previous section, we are now able to determine whether a critical point of a function actually corresponds to a local extreme value. In this section, we also see how the second derivative provides information about the shape of a graph by describing whether the graph of a function curves upward or curves downward.

The First Derivative Test


Corollary 3 of the Mean Value Theorem showed that if the derivative of a function is positive over an interval I then the function is
increasing over I . On the other hand, if the derivative of the function is negative over an interval I , then the function is decreasing
over I as shown in the following figure.

Figure 4.3.1: Both functions are increasing over the interval \((a, b)\). At each point \(x\), the derivative \(f'(x) > 0\). Both functions are decreasing over the interval \((a, b)\). At each point \(x\), the derivative \(f'(x) < 0\).

A continuous function \(f\) has a local maximum at point \(c\) if and only if \(f\) switches from increasing to decreasing at point \(c\). Similarly, \(f\) has a local minimum at \(c\) if and only if \(f\) switches from decreasing to increasing at \(c\). If \(f\) is a continuous function over an interval \(I\) containing \(c\) and differentiable over \(I\), except possibly at \(c\), the only way \(f\) can switch from increasing to decreasing (or vice versa) at point \(c\) is if \(f'\) changes sign as \(x\) increases through \(c\). If \(f\) is differentiable at \(c\), the only way that \(f'\) can change sign as \(x\) increases through \(c\) is if \(f'(c) = 0\). Therefore, for a function \(f\) that is continuous over an interval \(I\) containing \(c\) and differentiable over \(I\), except possibly at \(c\), the only way \(f\) can switch from increasing to decreasing (or vice versa) is if \(f'(c) = 0\) or \(f'(c)\) is undefined. Consequently, to locate local extrema for a function \(f\), we look for points \(c\) in the domain of \(f\) such that \(f'(c) = 0\) or \(f'(c)\) is undefined. Recall that such points are called critical points of \(f\).

4.3.1 https://math.libretexts.org/@go/page/4463
Note that \(f\) need not have a local extremum at a critical point. The critical points are candidates for local extrema only. In Figure 4.3.2, we show that if a continuous function \(f\) has a local extremum, it must occur at a critical point, but a function may not have a local extremum at a critical point. We show that if \(f\) has a local extremum at a critical point, then the sign of \(f'\) switches as \(x\) increases through that point.

Figure 4.3.2: The function \(f\) has four critical points: \(a\), \(b\), \(c\), and \(d\). The function \(f\) has local maxima at \(a\) and \(d\), and a local minimum at \(b\). The function \(f\) does not have a local extremum at \(c\). The sign of \(f'\) changes at all local extrema.

Using Figure 4.3.2, we summarize the main results regarding local extrema.
If a continuous function \(f\) has a local extremum, it must occur at a critical point \(c\).
The function has a local extremum at the critical point \(c\) if and only if the derivative \(f'\) switches sign as \(x\) increases through \(c\).
Therefore, to test whether a function has a local extremum at a critical point \(c\), we must determine the sign of \(f'(x)\) to the left and right of \(c\).
This result is known as the first derivative test.

 First Derivative Test

Suppose that \(f\) is a continuous function over an interval \(I\) containing a critical point \(c\). If \(f\) is differentiable over \(I\), except possibly at point \(c\), then \(f(c)\) satisfies one of the following descriptions:
i. If \(f'\) changes sign from positive when \(x < c\) to negative when \(x > c\), then \(f(c)\) is a local maximum of \(f\).
ii. If \(f'\) changes sign from negative when \(x < c\) to positive when \(x > c\), then \(f(c)\) is a local minimum of \(f\).
iii. If \(f'\) has the same sign for \(x < c\) and \(x > c\), then \(f(c)\) is neither a local maximum nor a local minimum of \(f\).

Now let’s look at how to use this strategy to locate all local extrema for particular functions.

 Example 4.3.1: Using the First Derivative Test to Find Local Extrema

Use the first derivative test to find the location of all local extrema for \(f(x) = x^3 - 3x^2 - 9x - 1\). Use a graphing utility to confirm your results.
Solution
Step 1. The derivative is \(f'(x) = 3x^2 - 6x - 9\). To find the critical points, we need to find where \(f'(x) = 0\). Factoring the polynomial, we conclude that the critical points must satisfy
\[ 3(x^2 - 2x - 3) = 3(x - 3)(x + 1) = 0. \]
Therefore, the critical points are \(x = 3, -1\). Now divide the interval \((-\infty, \infty)\) into the smaller intervals \((-\infty, -1)\), \((-1, 3)\), and \((3, \infty)\).

Step 2. Since \(f'\) is a continuous function, to determine the sign of \(f'(x)\) over each subinterval, it suffices to choose a point over each of the intervals \((-\infty, -1)\), \((-1, 3)\), and \((3, \infty)\) and determine the sign of \(f'\) at each of these points. For example, let’s choose \(x = -2\), \(x = 0\), and \(x = 4\) as test points.


Table 4.3.1: First Derivative Test for \(f(x) = x^3 - 3x^2 - 9x - 1\)

Interval     Test Point   Sign of f′(x) = 3(x − 3)(x + 1) at Test Point   Conclusion
(−∞, −1)     x = −2       (+)(−)(−) = +                                   f is increasing.
(−1, 3)      x = 0        (+)(−)(+) = −                                   f is decreasing.
(3, ∞)       x = 4        (+)(+)(+) = +                                   f is increasing.

Step 3. Since \(f'\) switches sign from positive to negative as \(x\) increases through \(-1\), \(f\) has a local maximum at \(x = -1\). Since \(f'\) switches sign from negative to positive as \(x\) increases through \(3\), \(f\) has a local minimum at \(x = 3\). These analytical results agree with the following graph.

Figure 4.3.3 : The function f has a maximum at x = −1 and a minimum at x = 3
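The sign comparison at the heart of the first derivative test can also be sketched in code. The snippet below is a rough illustration, not a replacement for the sign table: the function name `classify_critical_point` and the step size `h` are assumptions made here, and the check is only reliable when no other critical point lies within `h` of the point being tested.

```python
def classify_critical_point(f_prime, c, h=1e-3):
    """Classify c from the sign of f_prime just to the left and right of c.

    h is an assumed step size; it must be small enough that no other
    critical point lies within h of c.
    """
    left, right = f_prime(c - h), f_prime(c + h)
    if left > 0 and right < 0:
        return "local maximum"   # f' goes from positive to negative
    if left < 0 and right > 0:
        return "local minimum"   # f' goes from negative to positive
    return "neither"             # f' keeps the same sign on both sides


# Derivative of f(x) = x^3 - 3x^2 - 9x - 1, in factored form.
f_prime = lambda x: 3 * (x - 3) * (x + 1)

print(classify_critical_point(f_prime, -1))
print(classify_critical_point(f_prime, 3))

# f(x) = x^3 has a critical point at 0 but no local extremum there.
print(classify_critical_point(lambda x: 3 * x**2, 0))
```

Sampling two nearby points stands in for the interval-by-interval sign analysis; the analytical argument is still what justifies the conclusion.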

 Exercise 4.3.1

Use the first derivative test to locate all local extrema for \(f(x) = -x^3 + \frac{3}{2}x^2 + 18x\).

Hint
Find all critical points of \(f\) and determine the signs of \(f'(x)\) over particular intervals determined by the critical points.

Answer

f has a local minimum at −2 and a local maximum at 3.

 Example 4.3.2: Using the First Derivative Test

Use the first derivative test to find the location of all local extrema for \(f(x) = 5x^{1/3} - x^{5/3}\). Use a graphing utility to confirm your results.
Solution
Step 1. The derivative is
\[ f'(x) = \frac{5}{3}x^{-2/3} - \frac{5}{3}x^{2/3} = \frac{5}{3x^{2/3}} - \frac{5x^{2/3}}{3} = \frac{5 - 5x^{4/3}}{3x^{2/3}} = \frac{5(1 - x^{4/3})}{3x^{2/3}}. \]
The derivative \(f'(x) = 0\) when \(1 - x^{4/3} = 0\). Therefore, \(f'(x) = 0\) at \(x = \pm 1\). The derivative \(f'(x)\) is undefined at \(x = 0\). Therefore, we have three critical points: \(x = 0\), \(x = 1\), and \(x = -1\). Consequently, divide the interval \((-\infty, \infty)\) into the smaller intervals \((-\infty, -1)\), \((-1, 0)\), \((0, 1)\), and \((1, \infty)\).
Step 2: Since \(f'\) is continuous over each subinterval, it suffices to choose a test point \(x\) in each of the intervals from step 1 and determine the sign of \(f'\) at each of these points. The points \(x = -2\), \(x = -\frac{1}{2}\), \(x = \frac{1}{2}\), and \(x = 2\) are test points for these intervals.
Table 4.3.2: First Derivative Test for \(f(x) = 5x^{1/3} - x^{5/3}\)

Interval     Test Point   Sign of f′(x) = 5(1 − x^(4/3)) / (3x^(2/3)) at Test Point   Conclusion
(−∞, −1)     x = −2       (+)(−)/(+) = −                                              f is decreasing.
(−1, 0)      x = −1/2     (+)(+)/(+) = +                                              f is increasing.
(0, 1)       x = 1/2      (+)(+)/(+) = +                                              f is increasing.
(1, ∞)       x = 2        (+)(−)/(+) = −                                              f is decreasing.

Step 3: Since f is decreasing over the interval (−∞, −1) and increasing over the interval (−1, 0), f has a local minimum at
x = −1 . Since f is increasing over the interval (−1, 0) and the interval (0, 1), f does not have a local extremum at x = 0 .

Since f is increasing over the interval (0, 1) and decreasing over the interval (1, ∞), f has a local maximum at x = 1 . The
analytical results agree with the following graph.

Figure 4.3.4 : The function f has a local minimum at x = −1 and a local maximum at x = 1

 Exercise 4.3.2
Use the first derivative test to find all local extrema for \(f(x) = \sqrt[3]{x - 1}\).

Hint
The only critical point of f is x = 1.

Answer
\(f\) has no local extrema because \(f'\) does not change sign at \(x = 1\).

Concavity and Points of Inflection


We now know how to determine where a function is increasing or decreasing. However, there is another issue to consider regarding
the shape of the graph of a function. If the graph curves, does it curve upward or curve downward? This notion is called the
concavity of the function.
Figure 4.3.5a shows a function \(f\) with a graph that curves upward. As \(x\) increases, the slope of the tangent line increases. Thus, since the derivative increases as \(x\) increases, \(f'\) is an increasing function. We say this function \(f\) is concave up. Figure 4.3.5b shows a function \(f\) that curves downward. As \(x\) increases, the slope of the tangent line decreases. Since the derivative decreases as \(x\) increases, \(f'\) is a decreasing function. We say this function \(f\) is concave down.

 Definition: concavity test

Let \(f\) be a function that is differentiable over an open interval \(I\). If \(f'\) is increasing over \(I\), we say \(f\) is concave up over \(I\). If \(f'\) is decreasing over \(I\), we say \(f\) is concave down over \(I\).


Figure 4.3.5: (a), (c) Since \( f' \) is increasing over the interval \( (a, b) \), we say \( f \) is concave up over \( (a, b) \). (b), (d) Since \( f' \) is decreasing over the interval \( (a, b) \), we say \( f \) is concave down over \( (a, b) \).
In general, without having the graph of a function \( f \), how can we determine its concavity? By definition, a function \( f \) is concave up if \( f' \) is increasing. From Corollary 3, we know that if \( f' \) is a differentiable function, then \( f' \) is increasing if its derivative \( f''(x) > 0 \). Therefore, a function \( f \) that is twice differentiable is concave up when \( f''(x) > 0 \). Similarly, a function \( f \) is concave down if \( f' \) is decreasing. We know that a differentiable function \( f' \) is decreasing if its derivative \( f''(x) < 0 \). Therefore, a twice-differentiable function \( f \) is concave down when \( f''(x) < 0 \). Applying this logic is known as the concavity test.

 Test for Concavity

Let \( f \) be a function that is twice differentiable over an interval \( I \).

i. If \( f''(x) > 0 \) for all \( x \in I \), then \( f \) is concave up over \( I \).
ii. If \( f''(x) < 0 \) for all \( x \in I \), then \( f \) is concave down over \( I \).

We conclude that we can determine the concavity of a function \( f \) by looking at the second derivative of \( f \). In addition, we observe that a function \( f \) can switch concavity (Figure 4.3.6). However, a continuous function can switch concavity only at a point \( x \) where \( f''(x) = 0 \) or \( f''(x) \) is undefined. Consequently, to determine the intervals where a function \( f \) is concave up and concave down, we look for those values of \( x \) where \( f''(x) = 0 \) or \( f''(x) \) is undefined. When we have determined these points, we divide the domain of \( f \) into smaller intervals and determine the sign of \( f'' \) over each of these smaller intervals. If \( f'' \) changes sign as we pass through a point \( x \), then \( f \) changes concavity. It is important to remember that a function \( f \) may not change concavity at a point \( x \) even if \( f''(x) = 0 \) or \( f''(x) \) is undefined. If, however, \( f \) does change concavity at a point \( a \) and \( f \) is continuous at \( a \), we say the point \( (a, f(a)) \) is an inflection point of \( f \).

 Definition: inflection point

If f is continuous at a and f changes concavity at a , the point (a, f (a)) is an inflection point of f .

Figure 4.3.6: Since \( f''(x) > 0 \) for \( x < a \), the function \( f \) is concave up over the interval \( (-\infty, a) \). Since \( f''(x) < 0 \) for \( x > a \), the function \( f \) is concave down over the interval \( (a, \infty) \). The point \( (a, f(a)) \) is an inflection point of \( f \).

 Example 4.3.3: Testing for Concavity

For the function \( f(x) = x^3 - 6x^2 + 9x + 30 \), determine all intervals where \( f \) is concave up and all intervals where \( f \) is concave down. List all inflection points for \( f \). Use a graphing utility to confirm your results.
Solution
To determine concavity, we need to find the second derivative \( f''(x) \). The first derivative is \( f'(x) = 3x^2 - 12x + 9 \), so the second derivative is \( f''(x) = 6x - 12 \). If the function changes concavity, it occurs either when \( f''(x) = 0 \) or \( f''(x) \) is undefined. Since \( f'' \) is defined for all real numbers \( x \), we need only find where \( f''(x) = 0 \). Solving the equation \( 6x - 12 = 0 \), we see that \( x = 2 \) is the only place where \( f \) could change concavity. We now test points over the intervals \( (-\infty, 2) \) and \( (2, \infty) \) to determine the concavity of \( f \). The points \( x = 0 \) and \( x = 3 \) are test points for these intervals.
Table 4.3.3: Test for Concavity for \( f(x) = x^3 - 6x^2 + 9x + 30 \)

Interval    Test Point    Sign of f″(x) = 6x − 12 at Test Point    Conclusion
(−∞, 2)     x = 0         −                                        f is concave down
(2, ∞)      x = 3         +                                        f is concave up

We conclude that f is concave down over the interval (−∞, 2) and concave up over the interval (2, ∞). Since f changes
concavity at x = 2 , the point (2, f (2)) = (2, 32) is an inflection point. Figure 4.3.7 confirms the analytical results.

Figure 4.3.7 : The given function has a point of inflection at (2, 32) where the graph changes concavity.
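The concavity test in this example can also be checked numerically. The following is a minimal sketch, not part of the original text: it assumes a central-difference estimate of the second derivative, and the helper names `second_derivative` and `concavity` and the step size `h` are illustrative choices.

```python
def second_derivative(f, x, h=1e-3):
    """Central-difference estimate of f''(x); h is an assumed step size."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)


def concavity(f, x, h=1e-3):
    """Classify concavity at x from the sign of the estimated f''(x)."""
    return "up" if second_derivative(f, x, h) > 0 else "down"


def f(x):
    # The function from Example 4.3.3.
    return x**3 - 6 * x**2 + 9 * x + 30


# Test points x = 0 and x = 3 lie on either side of x = 2, where f''(x) = 0.
left = concavity(f, 0.0)
right = concavity(f, 3.0)
inflection_value = f(2)
print(left, right, inflection_value)
```

The estimate only samples the sign of the second derivative at the chosen test points, so it supports the analytical argument rather than replacing it.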

 Exercise 4.3.3

For \( f(x) = -x^3 + \frac{3}{2}x^2 + 18x \), find all intervals where \( f \) is concave up and all intervals where \( f \) is concave down.

Hint
Find where \( f''(x) = 0 \).

Answer
\( f \) is concave up over the interval \( \left(-\infty, \frac{1}{2}\right) \) and concave down over the interval \( \left(\frac{1}{2}, \infty\right) \).

We now summarize, in Table 4.3.4, the information that the first and second derivatives of a function f provide about the graph of
f , and illustrate this information in Figure 4.3.8.

Table 4.3.4: What Derivatives Tell Us about Graphs

Sign of f′    Sign of f″    Is f increasing or decreasing?    Concavity
Positive      Positive      Increasing                        Concave up
Positive      Negative      Increasing                        Concave down
Negative      Positive      Decreasing                        Concave up
Negative      Negative      Decreasing                        Concave down

Figure 4.3.8: Consider a twice-differentiable function \( f \) over an open interval \( I \). If \( f'(x) > 0 \) for all \( x \in I \), the function is increasing over \( I \). If \( f'(x) < 0 \) for all \( x \in I \), the function is decreasing over \( I \). If \( f''(x) > 0 \) for all \( x \in I \), the function is concave up. If \( f''(x) < 0 \) for all \( x \in I \), the function is concave down on \( I \).

The Second Derivative Test


The first derivative test provides an analytical tool for finding local extrema, but the second derivative can also be used to locate
extreme values. Using the second derivative can sometimes be a simpler method than using the first derivative.
We know that if a continuous function has a local extremum, it must occur at a critical point. However, a function need not have a
local extremum at a critical point. Here we examine how the second derivative test can be used to determine whether a function
has a local extremum at a critical point. Let \( f \) be a twice-differentiable function such that \( f'(a) = 0 \) and \( f'' \) is continuous over an open interval \( I \) containing \( a \). Suppose \( f''(a) < 0 \). Since \( f'' \) is continuous over \( I \), we have \( f''(x) < 0 \) for all \( x \in I \), shrinking \( I \) around \( a \) if necessary (Figure 4.3.9). Then, by Corollary 3, \( f' \) is a decreasing function over \( I \). Since \( f'(a) = 0 \), we conclude that for all \( x \in I \), \( f'(x) > 0 \) if \( x < a \) and \( f'(x) < 0 \) if \( x > a \). Therefore, by the first derivative test, \( f \) has a local maximum at \( x = a \).

On the other hand, suppose there exists a point \( b \) such that \( f'(b) = 0 \) but \( f''(b) > 0 \). Since \( f'' \) is continuous over an open interval \( I \) containing \( b \), we have \( f''(x) > 0 \) for all \( x \in I \), shrinking \( I \) around \( b \) if necessary (Figure 4.3.9). Then, by Corollary 3, \( f' \) is an increasing function over \( I \). Since \( f'(b) = 0 \), we conclude that for all \( x \in I \), \( f'(x) < 0 \) if \( x < b \) and \( f'(x) > 0 \) if \( x > b \). Therefore, by the first derivative test, \( f \) has a local minimum at \( x = b \).

Figure 4.3.9: Consider a twice-differentiable function \( f \) such that \( f'' \) is continuous. Since \( f'(a) = 0 \) and \( f''(a) < 0 \), there is an interval \( I \) containing \( a \) such that for all \( x \) in \( I \), \( f \) is increasing if \( x < a \) and \( f \) is decreasing if \( x > a \). As a result, \( f \) has a local maximum at \( x = a \). Since \( f'(b) = 0 \) and \( f''(b) > 0 \), there is an interval \( I \) containing \( b \) such that for all \( x \) in \( I \), \( f \) is decreasing if \( x < b \) and \( f \) is increasing if \( x > b \). As a result, \( f \) has a local minimum at \( x = b \).

 Second Derivative Test
Suppose \( f'(c) = 0 \) and \( f'' \) is continuous over an interval containing \( c \).

i. If \( f''(c) > 0 \), then \( f \) has a local minimum at \( c \).
ii. If \( f''(c) < 0 \), then \( f \) has a local maximum at \( c \).
iii. If \( f''(c) = 0 \), then the test is inconclusive.

Note that in case iii, when \( f''(c) = 0 \), \( f \) may have a local maximum, a local minimum, or neither at \( c \). For example, the functions \( f(x) = x^3 \), \( f(x) = x^4 \), and \( f(x) = -x^4 \) all have critical points at \( x = 0 \). In each case, the second derivative is zero at \( x = 0 \). However, the function \( f(x) = x^4 \) has a local minimum at \( x = 0 \), whereas the function \( f(x) = -x^4 \) has a local maximum at \( x = 0 \), and the function \( f(x) = x^3 \) does not have a local extremum at \( x = 0 \).

Let’s now look at how to use the second derivative test to determine whether \( f \) has a local maximum or local minimum at a critical point \( c \) where \( f'(c) = 0 \).

 Example 4.3.4: Using the Second Derivative Test

Use the second derivative to find the location of all local extrema for \( f(x) = x^5 - 5x^3 \).

Solution
To apply the second derivative test, we first need to find critical points \( c \) where \( f'(c) = 0 \). The derivative is \( f'(x) = 5x^4 - 15x^2 \). Therefore, \( f'(x) = 5x^4 - 15x^2 = 5x^2(x^2 - 3) = 0 \) when \( x = 0, \pm\sqrt{3} \).

To determine whether \( f \) has a local extremum at any of these points, we need to evaluate the sign of \( f'' \) at these points. The second derivative is
\[ f''(x) = 20x^3 - 30x = 10x(2x^2 - 3). \]

In the following table, we evaluate the second derivative at each of the critical points and use the second derivative test to
determine whether f has a local maximum or local minimum at any of these points.
Table 4.3.5: Second Derivative Test for \( f(x) = x^5 - 5x^3 \)

x       f″(x)      Conclusion
−√3     −30√3      Local maximum
0       0          Second derivative test is inconclusive
√3      30√3       Local minimum

By the second derivative test, we conclude that \( f \) has a local maximum at \( x = -\sqrt{3} \) and \( f \) has a local minimum at \( x = \sqrt{3} \). The second derivative test is inconclusive at \( x = 0 \). To determine whether \( f \) has a local extremum at \( x = 0 \), we apply the first derivative test. To evaluate the sign of \( f'(x) = 5x^2(x^2 - 3) \) for \( x \in (-\sqrt{3}, 0) \) and \( x \in (0, \sqrt{3}) \), let \( x = -1 \) and \( x = 1 \) be the two test points. Since \( f'(-1) < 0 \) and \( f'(1) < 0 \), we conclude that \( f \) is decreasing on both intervals and, therefore, \( f \) does not have a local extremum at \( x = 0 \), as shown in the following graph.

Figure 4.3.10: The function \( f \) has a local maximum at \( x = -\sqrt{3} \) and a local minimum at \( x = \sqrt{3} \).
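The procedure of Example 4.3.4 can be sketched in code: apply the second derivative test first and fall back to the first derivative test when \( f''(c) = 0 \). This is an illustrative sketch only. The function name `classify` and the offset `delta` are assumptions, and `delta` must be small enough that no other critical point lies within `delta` of `c`.

```python
import math


def fp(x):
    """f'(x) for f(x) = x^5 - 5x^3."""
    return 5 * x**4 - 15 * x**2


def fpp(x):
    """f''(x) for f(x) = x^5 - 5x^3."""
    return 20 * x**3 - 30 * x


def classify(c, delta=0.5):
    """Classify a critical point c (where f'(c) = 0).

    delta is an assumed offset for the first derivative test; it must not
    reach past a neighboring critical point.
    """
    s = fpp(c)
    if s > 0:
        return "local minimum"
    if s < 0:
        return "local maximum"
    # Second derivative test is inconclusive: compare signs of f' on each side.
    left, right = fp(c - delta), fp(c + delta)
    if left > 0 and right < 0:
        return "local maximum"
    if left < 0 and right > 0:
        return "local minimum"
    return "no local extremum"


for c in (-math.sqrt(3), 0.0, math.sqrt(3)):
    print(c, classify(c))
```

Here the critical points are \( 0 \) and \( \pm\sqrt{3} \approx \pm 1.73 \), so the default offset of 0.5 stays inside the intervals used in the example.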

 Exercise 4.3.4

Consider the function \( f(x) = x^3 - \frac{3}{2}x^2 - 18x \). The points \( c = 3, -2 \) satisfy \( f'(c) = 0 \). Use the second derivative test to determine whether \( f \) has a local maximum or local minimum at those points.

Hint
\( f''(x) = 6x - 3 \)

Answer
f has a local maximum at −2 and a local minimum at 3.

We have now developed the tools we need to determine where a function is increasing and decreasing, as well as acquired an
understanding of the basic shape of the graph. In the next section we discuss what happens to a function as x → ±∞. At that point,
we have enough tools to provide accurate graphs of a large variety of functions.

Key Concepts
If \( c \) is a critical point of \( f \) and \( f'(x) > 0 \) for \( x < c \) and \( f'(x) < 0 \) for \( x > c \), then \( f \) has a local maximum at \( c \).
If \( c \) is a critical point of \( f \) and \( f'(x) < 0 \) for \( x < c \) and \( f'(x) > 0 \) for \( x > c \), then \( f \) has a local minimum at \( c \).
If \( f''(x) > 0 \) over an interval \( I \), then \( f \) is concave up over \( I \).
If \( f''(x) < 0 \) over an interval \( I \), then \( f \) is concave down over \( I \).
If \( f'(c) = 0 \) and \( f''(c) > 0 \), then \( f \) has a local minimum at \( c \).
If \( f'(c) = 0 \) and \( f''(c) < 0 \), then \( f \) has a local maximum at \( c \).
If \( f'(c) = 0 \) and \( f''(c) = 0 \), then evaluate \( f'(x) \) at a test point \( x \) to the left of \( c \) and a test point \( x \) to the right of \( c \), to determine whether \( f \) has a local extremum at \( c \).

Glossary
concave down
if \( f \) is differentiable over an interval \( I \) and \( f' \) is decreasing over \( I \), then \( f \) is concave down over \( I \)

concave up
if \( f \) is differentiable over an interval \( I \) and \( f' \) is increasing over \( I \), then \( f \) is concave up over \( I \)

concavity
the upward or downward curve of the graph of a function

concavity test
suppose \( f \) is twice differentiable over an interval \( I \); if \( f'' > 0 \) over \( I \), then \( f \) is concave up over \( I \); if \( f'' < 0 \) over \( I \), then \( f \) is concave down over \( I \)

first derivative test


let \( f \) be a continuous function over an interval \( I \) containing a critical point \( c \) such that \( f \) is differentiable over \( I \) except possibly at \( c \); if \( f' \) changes sign from positive to negative as \( x \) increases through \( c \), then \( f \) has a local maximum at \( c \); if \( f' \) changes sign from negative to positive as \( x \) increases through \( c \), then \( f \) has a local minimum at \( c \); if \( f' \) does not change sign as \( x \) increases through \( c \), then \( f \) does not have a local extremum at \( c \)

inflection point
if f is continuous at c and f changes concavity at c, the point (c, f (c)) is an inflection point of f

second derivative test


suppose \( f'(c) = 0 \) and \( f'' \) is continuous over an interval containing \( c \); if \( f''(c) > 0 \), then \( f \) has a local minimum at \( c \); if \( f''(c) < 0 \), then \( f \) has a local maximum at \( c \); if \( f''(c) = 0 \), then the test is inconclusive

4.3: How Derivatives Affect the Shape of a Graph is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
4.5: Derivatives and the Shape of a Graph by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0. Original source:
https://openstax.org/details/books/calculus-volume-1.

4.4: Indeterminate Forms and l'Hospital's Rule
 Learning Objectives
Recognize when to apply L’Hôpital’s rule.
Identify indeterminate forms produced by quotients, products, subtractions, and powers, and apply L’Hôpital’s rule in each
case.
Describe the relative growth rates of functions.

In this section, we examine a powerful tool for evaluating limits. This tool, known as L’Hôpital’s rule, uses derivatives to calculate
limits. With this rule, we will be able to evaluate many limits we have not yet been able to determine. Instead of relying on
numerical evidence to conjecture that a limit exists, we will be able to show definitively that a limit exists and to determine its
exact value.

Applying L’Hôpital’s Rule


L’Hôpital’s rule can be used to evaluate limits involving the quotient of two functions. Consider
\[ \lim_{x \to a} \frac{f(x)}{g(x)}. \]

If \( \lim_{x \to a} f(x) = L_1 \) and \( \lim_{x \to a} g(x) = L_2 \neq 0 \), then
\[ \lim_{x \to a} \frac{f(x)}{g(x)} = \frac{L_1}{L_2}. \]

However, what happens if \( \lim_{x \to a} f(x) = 0 \) and \( \lim_{x \to a} g(x) = 0 \)? We call this one of the indeterminate forms, of type \( \frac{0}{0} \). This is considered an indeterminate form because we cannot determine the exact behavior of \( \frac{f(x)}{g(x)} \) as \( x \to a \) without further analysis. We have seen examples of this earlier in the text. For example, consider
\[ \lim_{x \to 2} \frac{x^2 - 4}{x - 2} \]
and
\[ \lim_{x \to 0} \frac{\sin x}{x}. \]

For the first of these examples, we can evaluate the limit by factoring the numerator and writing
\[ \lim_{x \to 2} \frac{x^2 - 4}{x - 2} = \lim_{x \to 2} \frac{(x + 2)(x - 2)}{x - 2} = \lim_{x \to 2} (x + 2) = 2 + 2 = 4. \]
For \( \lim_{x \to 0} \frac{\sin x}{x} \) we were able to show, using a geometric argument, that
\[ \lim_{x \to 0} \frac{\sin x}{x} = 1. \]

Here we use a different technique for evaluating limits such as these. Not only does this technique provide an easier way to
evaluate these limits, but also, and more importantly, it provides us with a way to evaluate many other limits that we could not
calculate previously.
The idea behind L’Hôpital’s rule can be explained using local linear approximations. Consider two differentiable functions \( f \) and \( g \) such that \( \lim_{x \to a} f(x) = 0 = \lim_{x \to a} g(x) \) and such that \( g'(a) \neq 0 \). For \( x \) near \( a \), we can write
\[ f(x) \approx f(a) + f'(a)(x - a) \]

and

4.4.1 https://math.libretexts.org/@go/page/4464
\[ g(x) \approx g(a) + g'(a)(x - a). \]
Therefore,
\[ \frac{f(x)}{g(x)} \approx \frac{f(a) + f'(a)(x - a)}{g(a) + g'(a)(x - a)}. \]

Figure 4.4.1: If \( \lim_{x \to a} f(x) = \lim_{x \to a} g(x) \), then the ratio \( f(x)/g(x) \) is approximately equal to the ratio of their linear approximations near \( a \).
Since \( f \) is differentiable at \( a \), \( f \) is continuous at \( a \), and therefore \( f(a) = \lim_{x \to a} f(x) = 0 \). Similarly, \( g(a) = \lim_{x \to a} g(x) = 0 \). If we also assume that \( f' \) and \( g' \) are continuous at \( x = a \), then \( f'(a) = \lim_{x \to a} f'(x) \) and \( g'(a) = \lim_{x \to a} g'(x) \). Using these ideas, we conclude that
\[ \lim_{x \to a} \frac{f(x)}{g(x)} = \lim_{x \to a} \frac{f'(x)(x - a)}{g'(x)(x - a)} = \lim_{x \to a} \frac{f'(x)}{g'(x)}. \]

Note that the assumption that \( f' \) and \( g' \) are continuous at \( a \) and \( g'(a) \neq 0 \) can be loosened. We state L’Hôpital’s rule formally for the indeterminate form \( \frac{0}{0} \). Also note that the notation \( \frac{0}{0} \) does not mean we are actually dividing zero by zero. Rather, we are using the notation \( \frac{0}{0} \) to represent a quotient of limits, each of which is zero.

 L’Hôpital’s Rule (0/0 Case)

Suppose \( f \) and \( g \) are differentiable functions over an open interval containing \( a \), except possibly at \( a \). If \( \lim_{x \to a} f(x) = 0 \) and \( \lim_{x \to a} g(x) = 0 \), then
\[ \lim_{x \to a} \frac{f(x)}{g(x)} = \lim_{x \to a} \frac{f'(x)}{g'(x)}, \]
assuming the limit on the right exists or is \( \infty \) or \( -\infty \). This result also holds if we are considering one-sided limits, or if \( a = \infty \) or \( a = -\infty \).

 Proof

We provide a proof of this theorem in the special case when \( f \), \( g \), \( f' \), and \( g' \) are all continuous over an open interval containing \( a \). In that case, since \( \lim_{x \to a} f(x) = 0 = \lim_{x \to a} g(x) \) and \( f \) and \( g \) are continuous at \( a \), it follows that \( f(a) = 0 = g(a) \). Therefore,
\[
\begin{aligned}
\lim_{x \to a} \frac{f(x)}{g(x)}
&= \lim_{x \to a} \frac{f(x) - f(a)}{g(x) - g(a)} && \text{Since } f(a) = 0 = g(a) \\
&= \lim_{x \to a} \frac{\dfrac{f(x) - f(a)}{x - a}}{\dfrac{g(x) - g(a)}{x - a}} && \text{Multiply numerator and denominator by } \frac{1}{x - a} \\
&= \frac{\lim\limits_{x \to a} \dfrac{f(x) - f(a)}{x - a}}{\lim\limits_{x \to a} \dfrac{g(x) - g(a)}{x - a}} && \text{The limit of a quotient is the quotient of the limits.} \\
&= \frac{f'(a)}{g'(a)} && \text{By the definition of the derivative} \\
&= \frac{\lim\limits_{x \to a} f'(x)}{\lim\limits_{x \to a} g'(x)} && \text{By the continuity of } f' \text{ and } g' \\
&= \lim_{x \to a} \frac{f'(x)}{g'(x)}. && \text{The limit of a quotient}
\end{aligned}
\]

Note that L’Hôpital’s rule states we can calculate the limit of a quotient \( \frac{f}{g} \) by considering the limit of the quotient of the derivatives \( \frac{f'}{g'} \). It is important to realize that we are not calculating the derivative of the quotient \( \frac{f}{g} \).

 Example 4.4.1: Applying L’Hôpital’s Rule (0/0 Case)

Evaluate each of the following limits by applying L’Hôpital’s rule.


a. \( \lim_{x \to 0} \frac{1 - \cos x}{x} \)
b. \( \lim_{x \to 1} \frac{\sin(\pi x)}{\ln x} \)
c. \( \lim_{x \to \infty} \frac{e^{1/x} - 1}{1/x} \)
d. \( \lim_{x \to 0} \frac{\sin x - x}{x^2} \)

Solution
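a. Since the numerator \( 1 - \cos x \to 0 \) and the denominator \( x \to 0 \), we can apply L’Hôpital’s rule to evaluate this limit. We have
\[ \lim_{x \to 0} \frac{1 - \cos x}{x} = \lim_{x \to 0} \frac{\frac{d}{dx}(1 - \cos x)}{\frac{d}{dx}(x)} = \lim_{x \to 0} \frac{\sin x}{1} = \frac{\lim_{x \to 0} \sin x}{\lim_{x \to 0} 1} = \frac{0}{1} = 0. \]

b. As \( x \to 1 \), the numerator \( \sin(\pi x) \to 0 \) and the denominator \( \ln(x) \to 0 \). Therefore, we can apply L’Hôpital’s rule. We obtain
\[
\begin{aligned}
\lim_{x \to 1} \frac{\sin(\pi x)}{\ln x} &= \lim_{x \to 1} \frac{\pi \cos(\pi x)}{1/x} \\
&= \lim_{x \to 1} (\pi x) \cos(\pi x) \\
&= (\pi \cdot 1)(-1) = -\pi.
\end{aligned}
\]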

c. As \( x \to \infty \), the numerator \( e^{1/x} - 1 \to 0 \) and the denominator \( \frac{1}{x} \to 0 \). Therefore, we can apply L’Hôpital’s rule. We obtain
\[ \lim_{x \to \infty} \frac{e^{1/x} - 1}{\frac{1}{x}} = \lim_{x \to \infty} \frac{e^{1/x}\left(\frac{-1}{x^2}\right)}{\left(\frac{-1}{x^2}\right)} = \lim_{x \to \infty} e^{1/x} = e^0 = 1. \]

d. As \( x \to 0 \), both the numerator and denominator approach zero. Therefore, we can apply L’Hôpital’s rule. We obtain
\[ \lim_{x \to 0} \frac{\sin x - x}{x^2} = \lim_{x \to 0} \frac{\cos x - 1}{2x}. \]
Since the numerator and denominator of this new quotient both approach zero as \( x \to 0 \), we apply L’Hôpital’s rule again. In doing so, we see that
\[ \lim_{x \to 0} \frac{\cos x - 1}{2x} = \lim_{x \to 0} \frac{-\sin x}{2} = 0. \]
Therefore, we conclude that
\[ \lim_{x \to 0} \frac{\sin x - x}{x^2} = 0. \]
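As a sanity check on the four limits in this example, one can evaluate each quotient at points approaching the limit point. The sketch below is an illustration, not part of the original solution. The function names `ratio_a` through `ratio_d` and the sample step sizes are assumptions, and numerical evidence of this kind suggests a limit without proving it.

```python
import math


def ratio_a(x):
    # (1 - cos x) / x, limit as x -> 0
    return (1 - math.cos(x)) / x


def ratio_b(x):
    # sin(pi x) / ln x, limit as x -> 1
    return math.sin(math.pi * x) / math.log(x)


def ratio_c(x):
    # (e^(1/x) - 1) / (1/x), limit as x -> infinity;
    # expm1 avoids cancellation when 1/x is tiny.
    return math.expm1(1 / x) / (1 / x)


def ratio_d(x):
    # (sin x - x) / x^2, limit as x -> 0
    return (math.sin(x) - x) / x**2


for h in (1e-2, 1e-3, 1e-4):
    print(h, ratio_a(h), ratio_b(1 + h), ratio_c(1 / h), ratio_d(h))
```

As `h` shrinks, the four columns should drift toward \( 0 \), \( -\pi \), \( 1 \), and \( 0 \), matching parts a through d.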

 Exercise 4.4.1

Evaluate
\[ \lim_{x \to 0} \frac{x}{\tan x}. \]

Hint
\( \frac{d}{dx}(\tan x) = \sec^2 x \)

Answer
1

We can also use L’Hôpital’s rule to evaluate limits of quotients \( \frac{f(x)}{g(x)} \) in which \( f(x) \to \pm\infty \) and \( g(x) \to \pm\infty \). Limits of this form are classified as indeterminate forms of type \( \infty/\infty \). Again, note that we are not actually dividing \( \infty \) by \( \infty \). Since \( \infty \) is not a real number, that is impossible; rather, \( \infty/\infty \) is used to represent a quotient of limits, each of which is \( \infty \) or \( -\infty \).

 L’Hôpital’s Rule (∞/∞ Case)

Suppose \( f \) and \( g \) are differentiable functions over an open interval containing \( a \), except possibly at \( a \). Suppose \( \lim_{x \to a} f(x) = \infty \) (or \( -\infty \)) and \( \lim_{x \to a} g(x) = \infty \) (or \( -\infty \)). Then,
\[ \lim_{x \to a} \frac{f(x)}{g(x)} = \lim_{x \to a} \frac{f'(x)}{g'(x)}, \]
assuming the limit on the right exists or is \( \infty \) or \( -\infty \). This result also holds if the limit is infinite, if \( a = \infty \) or \( -\infty \), or the limit is one-sided.

 Example 4.4.2: Applying L’Hôpital’s Rule (∞/∞) Case

Evaluate each of the following limits by applying L’Hôpital’s rule.


a. \( \lim_{x \to \infty} \frac{3x + 5}{2x + 1} \)
b. \( \lim_{x \to 0^+} \frac{\ln x}{\cot x} \)

Solution
a. Since \( 3x + 5 \) and \( 2x + 1 \) are first-degree polynomials with positive leading coefficients, \( \lim_{x \to \infty} (3x + 5) = \infty \) and \( \lim_{x \to \infty} (2x + 1) = \infty \). Therefore, we apply L’Hôpital’s rule and obtain
\[ \lim_{x \to \infty} \frac{3x + 5}{2x + 1} = \lim_{x \to \infty} \frac{3}{2} = \frac{3}{2}. \]
Note that this limit can also be calculated without invoking L’Hôpital’s rule. Earlier in the chapter we showed how to evaluate such a limit by dividing the numerator and denominator by the highest power of \( x \) in the denominator. In doing so, we saw that
\[ \lim_{x \to \infty} \frac{3x + 5}{2x + 1} = \lim_{x \to \infty} \frac{3 + 5/x}{2 + 1/x} = \frac{3}{2}. \]
L’Hôpital’s rule provides us with an alternative means of evaluating this type of limit.
b. Here, \( \lim_{x \to 0^+} \ln x = -\infty \) and \( \lim_{x \to 0^+} \cot x = \infty \). Therefore, we can apply L’Hôpital’s rule and obtain
\[ \lim_{x \to 0^+} \frac{\ln x}{\cot x} = \lim_{x \to 0^+} \frac{1/x}{-\csc^2 x} = \lim_{x \to 0^+} \frac{1}{-x \csc^2 x}. \]
Now as \( x \to 0^+ \), \( \csc^2 x \to \infty \). Therefore, the first factor in the denominator is approaching zero and the second factor is becoming arbitrarily large. In such a case, anything can happen with the product. Therefore, we cannot make any conclusion yet. To evaluate the limit, we use the definition of \( \csc x \) to write
\[ \lim_{x \to 0^+} \frac{1}{-x \csc^2 x} = \lim_{x \to 0^+} \frac{\sin^2 x}{-x}. \]
Now \( \lim_{x \to 0^+} \sin^2 x = 0 \) and \( \lim_{x \to 0^+} (-x) = 0 \), so we apply L’Hôpital’s rule again. We find
\[ \lim_{x \to 0^+} \frac{\sin^2 x}{-x} = \lim_{x \to 0^+} \frac{2 \sin x \cos x}{-1} = \frac{0}{-1} = 0. \]
We conclude that
\[ \lim_{x \to 0^+} \frac{\ln x}{\cot x} = 0. \]
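The two forms of the quotient in part b can be compared numerically as \( x \to 0^+ \). The following is a small sketch under stated assumptions: the names `quotient` and `after_one_step` are illustrative, and the sample points are arbitrary small positive values.

```python
import math


def quotient(x):
    """ln(x) / cot(x), the original quotient."""
    cot = math.cos(x) / math.sin(x)
    return math.log(x) / cot


def after_one_step(x):
    """sin^2(x) / (-x), the quotient after one application of the rule."""
    return math.sin(x) ** 2 / (-x)


for x in (1e-2, 1e-4, 1e-6):
    print(x, quotient(x), after_one_step(x))
```

Both columns shrink toward zero, which is consistent with the analytical result. The original quotient does so more slowly because of the logarithm, so the two expressions agree in the limit but not pointwise.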

 Exercise 4.4.2

Evaluate
\[ \lim_{x \to \infty} \frac{\ln x}{5x}. \]

Hint
\( \frac{d}{dx}(\ln x) = \frac{1}{x} \)

Answer
0

As mentioned, L’Hôpital’s rule is an extremely useful tool for evaluating limits. It is important to remember, however, that to apply L’Hôpital’s rule to a quotient \( \frac{f(x)}{g(x)} \), it is essential that the limit of \( \frac{f(x)}{g(x)} \) be of the form \( \frac{0}{0} \) or \( \infty/\infty \). Consider the following example.

 Example 4.4.3: When L’Hôpital’s Rule Does Not Apply
Consider \( \lim_{x \to 1} \frac{x^2 + 5}{3x + 4} \).

Show that the limit cannot be evaluated by applying L’Hôpital’s rule.


Solution
Because the limits of the numerator and denominator are not both zero and are not both infinite, we cannot apply L’Hôpital’s rule. If we try to do so, we get
\[ \frac{d}{dx}(x^2 + 5) = 2x \]
and
\[ \frac{d}{dx}(3x + 4) = 3, \]
at which point we would conclude erroneously that
\[ \lim_{x \to 1} \frac{x^2 + 5}{3x + 4} = \lim_{x \to 1} \frac{2x}{3} = \frac{2}{3}. \]
However, since \( \lim_{x \to 1} (x^2 + 5) = 6 \) and \( \lim_{x \to 1} (3x + 4) = 7 \), we actually have
\[ \lim_{x \to 1} \frac{x^2 + 5}{3x + 4} = \frac{6}{7}. \]
We can conclude that
\[ \lim_{x \to 1} \frac{x^2 + 5}{3x + 4} \neq \lim_{x \to 1} \frac{\frac{d}{dx}(x^2 + 5)}{\frac{d}{dx}(3x + 4)}. \]
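The mismatch in this example is easy to reproduce numerically. The sketch below is illustrative only. The helper `is_zero_over_zero` is a hypothetical guard that assumes the numerator and denominator are continuous at `a`, so that their limits equal their values there, and the tolerance `tol` is an arbitrary choice.

```python
def quotient(x):
    """(x^2 + 5) / (3x + 4), the original quotient."""
    return (x**2 + 5) / (3 * x + 4)


def derivative_quotient(x):
    """2x / 3, the quotient of the derivatives."""
    return (2 * x) / 3


def is_zero_over_zero(num, den, a, tol=1e-6):
    """Hypothetical check that num and den both vanish at a.

    Assumes num and den are continuous at a.
    """
    return abs(num(a)) < tol and abs(den(a)) < tol


near = quotient(1 + 1e-8)              # close to 6/7
wrong = derivative_quotient(1 + 1e-8)  # close to 2/3
applies = is_zero_over_zero(lambda x: x**2 + 5, lambda x: 3 * x + 4, 1.0)
print(near, wrong, applies)
```

Checking the form of the limit first, as the guard does, is what prevents the erroneous value \( \frac{2}{3} \).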

 Exercise 4.4.3
Explain why we cannot apply L’Hôpital’s rule to evaluate \( \lim_{x \to 0^+} \frac{\cos x}{x} \). Evaluate \( \lim_{x \to 0^+} \frac{\cos x}{x} \) by other means.

Hint
Determine the limits of the numerator and denominator separately.

Answer
\( \lim_{x \to 0^+} \cos x = 1 \). Therefore, we cannot apply L’Hôpital’s rule. The limit of the quotient is \( \infty \).

Other Indeterminate Forms


L’Hôpital’s rule is very useful for evaluating limits involving the indeterminate forms \( \frac{0}{0} \) and \( \infty/\infty \). However, we can also use L’Hôpital’s rule to help evaluate limits involving other indeterminate forms that arise when evaluating limits. The expressions \( 0 \cdot \infty \), \( \infty - \infty \), \( 1^\infty \), \( \infty^0 \), and \( 0^0 \) are all considered indeterminate forms. These expressions are not real numbers. Rather, they represent forms that arise when trying to evaluate certain limits. Next we see why these are indeterminate forms and then learn how to use L’Hôpital’s rule in these cases. The key idea is that we must rewrite the indeterminate forms in such a way that we arrive at the indeterminate form \( \frac{0}{0} \) or \( \infty/\infty \).

Indeterminate Form of Type 0⋅∞
Suppose we want to evaluate \( \lim_{x \to a} (f(x) \cdot g(x)) \), where \( f(x) \to 0 \) and \( g(x) \to \infty \) (or \( -\infty \)) as \( x \to a \). Since one factor in the product is approaching zero but the other factor is becoming arbitrarily large (in magnitude), anything can happen to the product. We use the notation \( 0 \cdot \infty \) to denote the form that arises in this situation. The expression \( 0 \cdot \infty \) is considered indeterminate because we cannot determine without further analysis the exact behavior of the product \( f(x)g(x) \) as \( x \to a \). For example, let \( n \) be a positive integer and consider
\[ f(x) = \frac{1}{(x + 1)^n} \quad \text{and} \quad g(x) = 3x^2. \]
As \( x \to \infty \), \( f(x) \to 0 \) and \( g(x) \to \infty \). However, the limit as \( x \to \infty \) of \( f(x)g(x) = \frac{3x^2}{(x + 1)^n} \) varies, depending on \( n \). If \( n = 2 \), then \( \lim_{x \to \infty} f(x)g(x) = 3 \). If \( n = 1 \), then \( \lim_{x \to \infty} f(x)g(x) = \infty \). If \( n = 3 \), then \( \lim_{x \to \infty} f(x)g(x) = 0 \). Here we consider another limit involving the indeterminate form \( 0 \cdot \infty \) and show how to rewrite the function as a quotient to use L’Hôpital’s rule.
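The dependence on \( n \) described above can be seen by tabulating the product \( \frac{1}{(x+1)^n} \cdot 3x^2 \) at increasingly large \( x \). This is a minimal sketch. The function name `product` and the sample values of \( x \) are assumptions made for illustration.

```python
def product(x, n):
    """f(x) * g(x) with f(x) = 1/(x+1)^n and g(x) = 3x^2."""
    return (1 / (x + 1) ** n) * (3 * x**2)


for n in (1, 2, 3):
    values = [product(10.0**k, n) for k in (2, 4, 6)]
    print(n, values)
```

For \( n = 1 \) the values keep growing, for \( n = 2 \) they settle near 3, and for \( n = 3 \) they shrink toward 0, even though in every case one factor tends to 0 and the other to \( \infty \).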

 Example 4.4.4: Indeterminate Form of Type 0 ⋅ ∞

Evaluate \( \lim_{x \to 0^+} x \ln x \).

Solution
First, rewrite the function \( x \ln x \) as a quotient to apply L’Hôpital’s rule. If we write
\[ x \ln x = \frac{\ln x}{1/x}, \]
we see that \( \ln x \to -\infty \) as \( x \to 0^+ \) and \( \frac{1}{x} \to \infty \) as \( x \to 0^+ \). Therefore, we can apply L’Hôpital’s rule and obtain
\[ \lim_{x \to 0^+} \frac{\ln x}{1/x} = \lim_{x \to 0^+} \frac{\frac{d}{dx}(\ln x)}{\frac{d}{dx}(1/x)} = \lim_{x \to 0^+} \frac{1/x}{-1/x^2} = \lim_{x \to 0^+} (-x) = 0. \]
We conclude that
\[ \lim_{x \to 0^+} x \ln x = 0. \]

Figure 4.4.2 : Finding the limit at x = 0 of the function f (x) = x ln x.
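A quick numerical look at \( x \ln x \) near zero complements the graph. The sketch below is illustrative. The names `x_ln_x` and `rewritten` are assumptions, and the second function simply evaluates the quotient form used in the solution.

```python
import math


def x_ln_x(x):
    """The product form x * ln(x)."""
    return x * math.log(x)


def rewritten(x):
    """The quotient form ln(x) / (1/x) used to apply the rule."""
    return math.log(x) / (1 / x)


for x in (1e-2, 1e-4, 1e-8):
    print(x, x_ln_x(x), rewritten(x))
```

The two forms agree up to rounding, and both approach 0 from below as \( x \to 0^+ \).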

 Exercise 4.4.4

Evaluate
\[ \lim_{x \to 0} x \cot x. \]

Hint
Write \( x \cot x = \frac{x \cos x}{\sin x} \).

Answer
1

Indeterminate Form of Type ∞ − ∞


Another type of indeterminate form is \( \infty - \infty \). Consider the following example. Let \( n \) be a positive integer and let \( f(x) = 3x^n \) and \( g(x) = 3x^2 + 5 \). As \( x \to \infty \), \( f(x) \to \infty \) and \( g(x) \to \infty \). We are interested in \( \lim_{x \to \infty} (f(x) - g(x)) \). Depending on whether \( f(x) \) grows faster, \( g(x) \) grows faster, or they grow at the same rate, as we see next, anything can happen in this limit. Since \( f(x) \to \infty \) and \( g(x) \to \infty \), we write \( \infty - \infty \) to denote the form of this limit. As with our other indeterminate forms, \( \infty - \infty \) has no meaning on its own and we must do more analysis to determine the value of the limit. For example, suppose the exponent \( n \) in the function \( f(x) = 3x^n \) is \( n = 3 \). Then
\[ \lim_{x \to \infty} (f(x) - g(x)) = \lim_{x \to \infty} (3x^3 - 3x^2 - 5) = \infty. \]
On the other hand, if \( n = 2 \), then
\[ \lim_{x \to \infty} (f(x) - g(x)) = \lim_{x \to \infty} (3x^2 - 3x^2 - 5) = -5. \]
However, if \( n = 1 \), then
\[ \lim_{x \to \infty} (f(x) - g(x)) = \lim_{x \to \infty} (3x - 3x^2 - 5) = -\infty. \]
Therefore, the limit cannot be determined by considering only \( \infty - \infty \). Next we see how to rewrite an expression involving the indeterminate form \( \infty - \infty \) as a fraction to apply L’Hôpital’s rule.
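The three cases above can be tabulated directly. This is a small sketch with an assumed helper name `difference` and arbitrarily chosen integer sample points. It illustrates the behavior and does not replace the limit computations.

```python
def difference(x, n):
    """f(x) - g(x) with f(x) = 3x^n and g(x) = 3x^2 + 5."""
    return 3 * x**n - (3 * x**2 + 5)


for n in (3, 2, 1):
    print(n, [difference(x, n) for x in (10, 1000, 100000)])
```

With \( n = 3 \) the differences grow without bound, with \( n = 2 \) every difference equals \( -5 \), and with \( n = 1 \) they decrease without bound.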

 Example 4.4.5: Indeterminate Form of Type ∞ − ∞

Evaluate
\[ \lim_{x \to 0^+} \left( \frac{1}{x^2} - \frac{1}{\tan x} \right). \]

Solution
By combining the fractions, we can write the function as a quotient. Since the least common denominator is \( x^2 \tan x \), we have
\[ \frac{1}{x^2} - \frac{1}{\tan x} = \frac{(\tan x) - x^2}{x^2 \tan x}. \]
As \( x \to 0^+ \), the numerator \( \tan x - x^2 \to 0 \) and the denominator \( x^2 \tan x \to 0 \). Therefore, we can apply L’Hôpital’s rule. Taking the derivatives of the numerator and the denominator, we have
\[ \lim_{x \to 0^+} \frac{(\tan x) - x^2}{x^2 \tan x} = \lim_{x \to 0^+} \frac{(\sec^2 x) - 2x}{x^2 \sec^2 x + 2x \tan x}. \]
As \( x \to 0^+ \), \( (\sec^2 x) - 2x \to 1 \) and \( x^2 \sec^2 x + 2x \tan x \to 0 \). Since the denominator is positive as \( x \) approaches zero from the right, we conclude that
\[ \lim_{x \to 0^+} \frac{(\sec^2 x) - 2x}{x^2 \sec^2 x + 2x \tan x} = \infty. \]
Therefore,
\[ \lim_{x \to 0^+} \left( \frac{1}{x^2} - \frac{1}{\tan x} \right) = \infty. \]

 Exercise 4.4.5
Evaluate \( \lim_{x \to 0^+} \left( \frac{1}{x} - \frac{1}{\sin x} \right) \).

Hint
Rewrite the difference of fractions as a single fraction.

Answer
0

Another type of indeterminate form that arises when evaluating limits involves exponents. The expressions \( 0^0 \), \( \infty^0 \), and \( 1^\infty \) are all indeterminate forms. On their own, these expressions are meaningless because we cannot actually evaluate these expressions as we would evaluate an expression involving real numbers. Rather, these expressions represent forms that arise when finding limits. Now we examine how L’Hôpital’s rule can be used to evaluate limits involving these indeterminate forms.
Since L’Hôpital’s rule applies to quotients, we use the natural logarithm function and its properties to reduce a problem evaluating a limit involving exponents to a related problem involving a limit of a quotient. For example, suppose we want to evaluate \( \lim_{x \to a} f(x)^{g(x)} \) and we arrive at the indeterminate form \( \infty^0 \). (The indeterminate forms \( 0^0 \) and \( 1^\infty \) can be handled similarly.) We proceed as follows. Let
\[ y = f(x)^{g(x)}. \]
Then,
\[ \ln y = \ln\left(f(x)^{g(x)}\right) = g(x) \ln(f(x)). \]
Therefore,
\[ \lim_{x \to a} [\ln(y)] = \lim_{x \to a} [g(x) \ln(f(x))]. \]
Since \( \lim_{x \to a} f(x) = \infty \), we know that \( \lim_{x \to a} \ln(f(x)) = \infty \). Therefore, \( \lim_{x \to a} g(x) \ln(f(x)) \) is of the indeterminate form \( 0 \cdot \infty \), and we can use the techniques discussed earlier to rewrite the expression \( g(x) \ln(f(x)) \) in a form so that we can apply L’Hôpital’s rule. Suppose \( \lim_{x \to a} g(x) \ln(f(x)) = L \), where \( L \) may be \( \infty \) or \( -\infty \). Then
\[ \lim_{x \to a} [\ln(y)] = L. \]
Since the natural logarithm function is continuous, we conclude that
\[ \ln\left( \lim_{x \to a} y \right) = L, \]
which gives us
\[ \lim_{x \to a} y = \lim_{x \to a} f(x)^{g(x)} = e^L. \]

 Example 4.4.6: Indeterminate Form of Type ∞ 0

Evaluate
\[ \lim_{x \to \infty} x^{1/x}. \]

Solution
Let \( y = x^{1/x} \). Then,
\[ \ln\left(x^{1/x}\right) = \frac{1}{x} \ln x = \frac{\ln x}{x}. \]
We need to evaluate \( \lim_{x \to \infty} \frac{\ln x}{x} \). Applying L’Hôpital’s rule, we obtain
\[ \lim_{x \to \infty} \ln y = \lim_{x \to \infty} \frac{\ln x}{x} = \lim_{x \to \infty} \frac{1/x}{1} = 0. \]
Therefore, \( \lim_{x \to \infty} \ln y = 0 \). Since the natural logarithm function is continuous, we conclude that
\[ \ln\left( \lim_{x \to \infty} y \right) = 0, \]
which leads to
\[ \lim_{x \to \infty} x^{1/x} = \lim_{x \to \infty} y = e^{\ln\left( \lim_{x \to \infty} y \right)} = e^0 = 1. \]
Hence,
\[ \lim_{x \to \infty} x^{1/x} = 1. \]
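The logarithm technique used here can be mirrored numerically: track \( \ln y = \frac{\ln x}{x} \) and exponentiate. The sketch below is an illustration under stated assumptions. The names `log_of_power` and `power` and the sample points are not from the text.

```python
import math


def log_of_power(x):
    """ln(y) for y = x^(1/x), that is, ln(x) / x."""
    return math.log(x) / x


def power(x):
    """y = x^(1/x), evaluated directly."""
    return x ** (1 / x)


for x in (1e1, 1e3, 1e6):
    ln_y = log_of_power(x)
    # exp(ln y) should agree with the direct evaluation of y.
    print(x, ln_y, math.exp(ln_y), power(x))
```

As \( x \) grows, \( \ln y \) approaches 0 and both evaluations of \( y \) approach 1, in line with the result above.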

 Exercise 4.4.6

Evaluate
\[ \lim_{x \to \infty} x^{1/\ln(x)}. \]

Hint
Let \( y = x^{1/\ln(x)} \) and apply the natural logarithm to both sides of the equation.

Answer
e

 Example 4.4.7: Indeterminate Form of Type 0 0

Evaluate
\[ \lim_{x \to 0^+} x^{\sin x}. \]

Solution
Let
\[ y = x^{\sin x}. \]
Therefore,
\[ \ln y = \ln\left(x^{\sin x}\right) = \sin x \ln x. \]
We now evaluate \( \lim_{x \to 0^+} \sin x \ln x \). Since \( \lim_{x \to 0^+} \sin x = 0 \) and \( \lim_{x \to 0^+} \ln x = -\infty \), we have the indeterminate form \( 0 \cdot \infty \). To apply L’Hôpital’s rule, we need to rewrite \( \sin x \ln x \) as a fraction. We could write
\[ \sin x \ln x = \frac{\sin x}{1/\ln x} \]
or
\[ \sin x \ln x = \frac{\ln x}{1/\sin x} = \frac{\ln x}{\csc x}. \]
Let’s consider the first option. In this case, applying L’Hôpital’s rule, we would obtain
\[ \lim_{x \to 0^+} \sin x \ln x = \lim_{x \to 0^+} \frac{\sin x}{1/\ln x} = \lim_{x \to 0^+} \frac{\cos x}{-1/(x(\ln x)^2)} = \lim_{x \to 0^+} \left(-x(\ln x)^2 \cos x\right). \]
Unfortunately, we not only have another expression involving the indeterminate form \( 0 \cdot \infty \), but the new limit is even more complicated to evaluate than the one with which we started. Instead, we try the second option. By writing
\[ \sin x \ln x = \frac{\ln x}{1/\sin x} = \frac{\ln x}{\csc x} \]
and applying L’Hôpital’s rule, we obtain
\[ \lim_{x \to 0^+} \sin x \ln x = \lim_{x \to 0^+} \frac{\ln x}{\csc x} = \lim_{x \to 0^+} \frac{1/x}{-\csc x \cot x} = \lim_{x \to 0^+} \frac{-1}{x \csc x \cot x}. \]
Using the fact that \( \csc x = \frac{1}{\sin x} \) and \( \cot x = \frac{\cos x}{\sin x} \), we can rewrite the expression on the right-hand side as
\[ \lim_{x \to 0^+} \frac{-\sin^2 x}{x \cos x} = \lim_{x \to 0^+} \left[ \frac{\sin x}{x} \cdot (-\tan x) \right] = \left( \lim_{x \to 0^+} \frac{\sin x}{x} \right) \cdot \left( \lim_{x \to 0^+} (-\tan x) \right) = 1 \cdot 0 = 0. \]
We conclude that \( \lim_{x \to 0^+} \ln y = 0 \). Therefore, \( \ln\left( \lim_{x \to 0^+} y \right) = 0 \) and we have
\[ \lim_{x \to 0^+} y = \lim_{x \to 0^+} x^{\sin x} = e^0 = 1. \]
Hence,
\[ \lim_{x \to 0^+} x^{\sin x} = 1. \]
x→0

 Exercise 4.4.7

Evaluate \( \lim_{x \to 0^+} x^x \).

Hint
Let \( y = x^x \) and take the natural logarithm of both sides of the equation.

Answer
1

Growth Rates of Functions


Suppose the functions \( f \) and \( g \) both approach infinity as \( x \to \infty \). Although the values of both functions become arbitrarily large as the values of \( x \) become sufficiently large, sometimes one function is growing more quickly than the other. For example, \( f(x) = x^2 \) and \( g(x) = x^3 \) both approach infinity as \( x \to \infty \). However, as Table 4.4.1 shows, the values of \( x^3 \) are growing much faster than the values of \( x^2 \).

Table 4.4.1: Comparing the Growth Rates of \( x^2 \) and \( x^3 \)

x              10      100          1000             10,000
f(x) = x^2     100     10,000       1,000,000        100,000,000
g(x) = x^3     1000    1,000,000    1,000,000,000    1,000,000,000,000

In fact,
$$\lim_{x \to \infty} \frac{x^3}{x^2} = \lim_{x \to \infty} x = \infty,$$

or, equivalently

$$\lim_{x \to \infty} \frac{x^2}{x^3} = \lim_{x \to \infty} \frac{1}{x} = 0.$$

As a result, we say $x^3$ is growing more rapidly than $x^2$ as x → ∞. On the other hand, for $f(x) = x^2$ and $g(x) = 3x^2 + 4x + 1$,
although the values of g(x) are always greater than the values of f(x) for x > 0, each value of g(x) is roughly three times the
corresponding value of f(x) as x → ∞, as shown in Table 4.4.2. In fact,

$$\lim_{x \to \infty} \frac{x^2}{3x^2 + 4x + 1} = \frac{1}{3}.$$
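
One way to confirm this value, sketched here as an added worked step, is to apply L’Hôpital’s rule twice, since the quotient has the indeterminate form ∞/∞ before each application:

$$\lim_{x \to \infty} \frac{x^2}{3x^2 + 4x + 1} = \lim_{x \to \infty} \frac{2x}{6x + 4} = \lim_{x \to \infty} \frac{2}{6} = \frac{1}{3}.$$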

Table 4.4.2: Comparing the Growth Rates of $x^2$ and $3x^2 + 4x + 1$

| $x$ | 10 | 100 | 1000 | 10,000 |
|---|---|---|---|---|
| $f(x) = x^2$ | 100 | 10,000 | 1,000,000 | 100,000,000 |
| $g(x) = 3x^2 + 4x + 1$ | 341 | 30,401 | 3,004,001 | 300,040,001 |

In this case, we say that $x^2$ and $3x^2 + 4x + 1$ are growing at the same rate as x → ∞.
More generally, suppose f and g are two functions that approach infinity as x → ∞ . We say g grows more rapidly than f as
x → ∞ if

$$\lim_{x \to \infty} \frac{g(x)}{f(x)} = \infty \quad \text{or, equivalently,} \quad \lim_{x \to \infty} \frac{f(x)}{g(x)} = 0.$$

On the other hand, if there exists a constant $M \neq 0$ such that

$$\lim_{x \to \infty} \frac{f(x)}{g(x)} = M,$$

we say f and g grow at the same rate as x → ∞ .
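
The two definitions can be explored numerically by sampling the ratio f(x)/g(x) at increasingly large x. The following is a minimal sketch added for illustration; the helper name `ratios` and the sample points are assumptions, and a finite set of samples only suggests a limit rather than proving it.

```python
def ratios(f, g, xs):
    """Return f(x) / g(x) for each sample point x."""
    return [f(x) / g(x) for x in xs]

xs = [10.0, 100.0, 1000.0, 10000.0]

# x^2 versus 3x^2 + 4x + 1: the ratios settle near the nonzero constant 1/3,
# which suggests the two functions grow at the same rate.
same_rate = ratios(lambda x: x**2, lambda x: 3 * x**2 + 4 * x + 1, xs)

# x^2 versus x^3: the ratios shrink toward 0, which suggests x^3 grows more rapidly.
faster_denominator = ratios(lambda x: x**2, lambda x: x**3, xs)

print(same_rate)
print(faster_denominator)
```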


Next we see how to use L’Hôpital’s rule to compare the growth rates of power, exponential, and logarithmic functions.

Example 4.4.8: Comparing the Growth Rates of $\ln(x)$, $x^2$, and $e^x$

For each of the following pairs of functions, use L’Hôpital’s rule to evaluate
$$\lim_{x \to \infty} \frac{f(x)}{g(x)}.$$

a. $f(x) = x^2$ and $g(x) = e^x$
b. $f(x) = \ln(x)$ and $g(x) = x^2$

Solution
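a. Since $\lim_{x \to \infty} x^2 = \infty$ and $\lim_{x \to \infty} e^x = \infty$, we can use L’Hôpital’s rule to evaluate $\lim_{x \to \infty} \left[\frac{x^2}{e^x}\right]$. We obtain

$$\lim_{x \to \infty} \frac{x^2}{e^x} = \lim_{x \to \infty} \frac{2x}{e^x}.$$

Since $\lim_{x \to \infty} 2x = \infty$ and $\lim_{x \to \infty} e^x = \infty$, we can apply L’Hôpital’s rule again. Since

$$\lim_{x \to \infty} \frac{2x}{e^x} = \lim_{x \to \infty} \frac{2}{e^x} = 0,$$

we conclude that

$$\lim_{x \to \infty} \frac{x^2}{e^x} = 0.$$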

Therefore, $e^x$ grows more rapidly than $x^2$ as x → ∞ (see Figure 4.4.3 and Table 4.4.3).

Figure 4.4.3 : An exponential function grows at a faster rate than a power function.
Table 4.4.3: Growth rates of a power function and an exponential function.

| $x$ | 5 | 10 | 15 | 20 |
|---|---|---|---|---|
| $x^2$ | 25 | 100 | 225 | 400 |
| $e^x$ | 148 | 22,026 | 3,269,017 | 485,165,195 |

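b. Since $\lim_{x \to \infty} \ln x = \infty$ and $\lim_{x \to \infty} x^2 = \infty$, we can use L’Hôpital’s rule to evaluate $\lim_{x \to \infty} \frac{\ln x}{x^2}$. We obtain

$$\lim_{x \to \infty} \frac{\ln x}{x^2} = \lim_{x \to \infty} \frac{1/x}{2x} = \lim_{x \to \infty} \frac{1}{2x^2} = 0.$$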

Thus, $x^2$ grows more rapidly than ln x as x → ∞ (see Figure 4.4.4 and Table 4.4.4).

Figure 4.4.4 : A power function grows at a faster rate than a logarithmic function.
Table 4.4.4: Growth rates of a power function and a logarithmic function

| $x$ | 10 | 100 | 1000 | 10,000 |
|---|---|---|---|---|
| $\ln(x)$ | 2.303 | 4.605 | 6.908 | 9.210 |
| $x^2$ | 100 | 10,000 | 1,000,000 | 100,000,000 |

Exercise 4.4.8

Compare the growth rates of $x^{100}$ and $2^x$.

Hint
Apply L’Hôpital’s rule to $x^{100}/2^x$.

Answer
The function $2^x$ grows faster than $x^{100}$.
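
A sketch of the reasoning behind this answer, added here as a worked step and using the fact that $\frac{d}{dx} 2^x = 2^x \ln 2$: each quotient $x^{100-k}/2^x$ with $0 \le k \le 99$ has the indeterminate form ∞/∞, so L’Hôpital’s rule can be applied 100 times in succession, leaving a constant in the numerator:

$$\lim_{x \to \infty} \frac{x^{100}}{2^x} = \lim_{x \to \infty} \frac{100\,x^{99}}{2^x \ln 2} = \cdots = \lim_{x \to \infty} \frac{100!}{2^x (\ln 2)^{100}} = 0.$$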

Using the same ideas as in Example 4.4.8a, it is not difficult to show that $e^x$ grows more rapidly than $x^p$ for any p > 0. In Figure
4.4.5 and Table 4.4.5, we compare $e^x$ with $x^3$ and $x^4$ as x → ∞.

Figure 4.4.5: The exponential function $e^x$ grows faster than $x^p$ for any p > 0. (a) A comparison of $e^x$ with $x^3$. (b) A comparison
of $e^x$ with $x^4$.

Table 4.4.5: An exponential function grows at a faster rate than any power function

| $x$ | 5 | 10 | 15 | 20 |
|---|---|---|---|---|
| $x^3$ | 125 | 1000 | 3375 | 8000 |
| $x^4$ | 625 | 10,000 | 50,625 | 160,000 |
| $e^x$ | 148 | 22,026 | 3,269,017 | 485,165,195 |
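
The entries of Table 4.4.5 can be regenerated with a few lines of code. This is an added sketch; the variable names are illustrative, and the values of $e^x$ are rounded to the nearest integer to match the table.

```python
import math

xs = [5, 10, 15, 20]

# Each row holds (x, x^3, x^4, e^x rounded to the nearest integer).
rows = [(x, x**3, x**4, round(math.exp(x))) for x in xs]

for x, cube, fourth, exp_x in rows:
    print(f"{x:>3} {cube:>8,} {fourth:>10,} {exp_x:>14,}")
```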

Similarly, it is not difficult to show that $x^p$ grows more rapidly than ln x for any p > 0. In Figure 4.4.6 and Table 4.4.6, we
compare ln x with $\sqrt[3]{x}$ and $\sqrt{x}$.

Figure 4.4.6: The function y = ln(x) grows more slowly than $x^p$ for any p > 0 as x → ∞.

Table 4.4.6: A logarithmic function grows at a slower rate than any root function

| $x$ | 10 | 100 | 1000 | 10,000 |
|---|---|---|---|---|
| $\ln(x)$ | 2.303 | 4.605 | 6.908 | 9.210 |
| $\sqrt[3]{x}$ | 2.154 | 4.642 | 10 | 21.544 |
| $\sqrt{x}$ | 3.162 | 10 | 31.623 | 100 |

Key Concepts
L’Hôpital’s rule can be used to evaluate the limit of a quotient when the indeterminate form $\frac{0}{0}$ or ∞/∞ arises.
L’Hôpital’s rule can also be applied to other indeterminate forms if they can be rewritten in terms of a limit involving a quotient that has the indeterminate form $\frac{0}{0}$ or ∞/∞.
The exponential function $e^x$ grows faster than any power function $x^p$, p > 0.
The logarithmic function ln x grows more slowly than any power function $x^p$, p > 0.

Glossary
indeterminate forms
When evaluating a limit, the forms $\frac{0}{0}$, $\infty/\infty$, $0 \cdot \infty$, $\infty - \infty$, $0^0$, $\infty^0$, and $1^\infty$ are considered indeterminate because further
analysis is required to determine whether the limit exists and, if so, what its value is.

L’Hôpital’s rule
If f and g are differentiable functions over an open interval containing a, except possibly at a, and $\lim_{x \to a} f(x) = 0 = \lim_{x \to a} g(x)$ or $\lim_{x \to a} f(x)$ and
$\lim_{x \to a} g(x)$ are infinite, then $\lim_{x \to a} \frac{f(x)}{g(x)} = \lim_{x \to a} \frac{f'(x)}{g'(x)}$, assuming the limit on the right exists or is ∞ or −∞.

4.4: Indeterminate Forms and l'Hospital's Rule is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
4.8: L’Hôpital’s Rule by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0. Original source:
https://openstax.org/details/books/calculus-volume-1.

4.5: Summary of Curve Sketching
Learning Objectives
Explain how the sign of the first derivative affects the shape of a function’s graph.
State the first derivative test for critical points.
Use concavity and inflection points to explain how the sign of the second derivative affects the shape of a function’s graph.
Explain the concavity test for a function over an open interval.
Explain the relationship between a function and its first and second derivatives.
State the second derivative test for local extrema.

Earlier in this chapter we stated that if a function f has a local extremum at a point c , then c must be a critical point of f . However,
a function is not guaranteed to have a local extremum at a critical point. For example, $f(x) = x^3$ has a critical point at x = 0 since
$f'(x) = 3x^2$ is zero at x = 0, but f does not have a local extremum at x = 0. Using the results from the previous section, we are
now able to determine whether a critical point of a function actually corresponds to a local extreme value. In this section, we also
see how the second derivative provides information about the shape of a graph by describing whether the graph of a function
curves upward or curves downward.

The First Derivative Test


Corollary 3 of the Mean Value Theorem showed that if the derivative of a function is positive over an interval I then the function is
increasing over I . On the other hand, if the derivative of the function is negative over an interval I , then the function is decreasing
over I as shown in the following figure.

Figure 4.5.1: Both functions are increasing over the interval (a, b). At each point x, the derivative $f'(x) > 0$. Both functions are
decreasing over the interval (a, b). At each point x, the derivative $f'(x) < 0$.

A continuous function f has a local maximum at point c if and only if f switches from increasing to decreasing at point c .
Similarly, f has a local minimum at c if and only if f switches from decreasing to increasing at c . If f is a continuous function
over an interval I containing c and differentiable over I , except possibly at c , the only way f can switch from increasing to
decreasing (or vice versa) at point c is if $f'$ changes sign as x increases through c. If f is differentiable at c, the only way that $f'$
can change sign as x increases through c is if $f'(c) = 0$. Therefore, for a function f that is continuous over an interval I containing
c and differentiable over I, except possibly at c, the only way f can switch from increasing to decreasing (or vice versa) is if
$f'(c) = 0$ or $f'(c)$ is undefined. Consequently, to locate local extrema for a function f, we look for points c in the domain of f
such that $f'(c) = 0$ or $f'(c)$ is undefined. Recall that such points are called critical points of f.

4.5.1 https://math.libretexts.org/@go/page/4465
Note that f need not have a local extremum at a critical point. The critical points are candidates for local extrema only. In Figure
4.5.2, we show that if a continuous function f has a local extremum, it must occur at a critical point, but a function may not have a
local extremum at a critical point. We show that if f has a local extremum at a critical point, then the sign of $f'$ switches as x
increases through that point.

Figure 4.5.2: The function f has four critical points: a, b, c, and d. The function f has local maxima at a and d, and a local
minimum at b. The function f does not have a local extremum at c. The sign of $f'$ changes at all local extrema.

Using Figure 4.5.2, we summarize the main results regarding local extrema.
If a continuous function f has a local extremum, it must occur at a critical point c .
The function has a local extremum at the critical point c if and only if the derivative $f'$ switches sign as x increases through c.
Therefore, to test whether a function has a local extremum at a critical point c, we must determine the sign of $f'(x)$ to the left
and right of c.
This result is known as the first derivative test.

First Derivative Test

Suppose that f is a continuous function over an interval I containing a critical point c. If f is differentiable over I, except
possibly at point c, then f(c) satisfies one of the following descriptions:
i. If $f'$ changes sign from positive when x < c to negative when x > c, then f(c) is a local maximum of f.
ii. If $f'$ changes sign from negative when x < c to positive when x > c, then f(c) is a local minimum of f.
iii. If $f'$ has the same sign for x < c and x > c, then f(c) is neither a local maximum nor a local minimum of f.

Now let’s look at how to use this strategy to locate all local extrema for particular functions.

Example 4.5.1: Using the First Derivative Test to Find Local Extrema

Use the first derivative test to find the location of all local extrema for $f(x) = x^3 - 3x^2 - 9x - 1$. Use a graphing utility to
confirm your results.
Solution
Step 1. The derivative is $f'(x) = 3x^2 - 6x - 9$. To find the critical points, we need to find where $f'(x) = 0$. Factoring the
polynomial, we conclude that the critical points must satisfy

$$3(x^2 - 2x - 3) = 3(x - 3)(x + 1) = 0.$$

Therefore, the critical points are x = 3, −1. Now divide the interval (−∞, ∞) into the smaller intervals (−∞, −1), (−1, 3),
and (3, ∞).

Step 2. Since $f'$ is a continuous function, to determine the sign of $f'(x)$ over each subinterval, it suffices to choose a point
over each of the intervals (−∞, −1), (−1, 3), and (3, ∞) and determine the sign of $f'$ at each of these points. For example,
let’s choose x = −2, x = 0, and x = 4 as test points.


Table 4.5.1: First Derivative Test for $f(x) = x^3 - 3x^2 - 9x - 1$.

| Interval | Test Point | Sign of $f'(x) = 3(x - 3)(x + 1)$ at Test Point | Conclusion |
|---|---|---|---|
| $(-\infty, -1)$ | $x = -2$ | $(+)(-)(-) = +$ | f is increasing. |
| $(-1, 3)$ | $x = 0$ | $(+)(-)(+) = -$ | f is decreasing. |
| $(3, \infty)$ | $x = 4$ | $(+)(+)(+) = +$ | f is increasing. |

Step 3. Since $f'$ switches sign from positive to negative as x increases through −1, f has a local maximum at x = −1. Since
$f'$ switches sign from negative to positive as x increases through 3, f has a local minimum at x = 3. These analytical results
agree with the following graph.

Figure 4.5.3 : The function f has a maximum at x = −1 and a minimum at x = 3
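
The sign check in Steps 2 and 3 can also be sketched in code. The snippet below is an added illustration, not the text's method verbatim: the function name `classify_critical_point` and the offset `h` are assumptions, and the check is only meaningful when `h` is small enough that no other critical point lies within `h` of `c`.

```python
def classify_critical_point(fprime, c, h=1e-3):
    """Apply the first derivative test by sampling f' just left and right of c."""
    left = fprime(c - h)
    right = fprime(c + h)
    if left > 0 and right < 0:
        return "local maximum"   # f' changes from positive to negative
    if left < 0 and right > 0:
        return "local minimum"   # f' changes from negative to positive
    return "neither"             # f' keeps the same sign on both sides

def fprime(x):
    # Derivative of f(x) = x^3 - 3x^2 - 9x - 1 from Example 4.5.1.
    return 3 * x**2 - 6 * x - 9

results = {c: classify_critical_point(fprime, c) for c in (-1.0, 3.0)}
print(results)
```

Applied to $f'(x) = 3x^2$ at c = 0, the same helper returns "neither", matching the earlier observation that $x^3$ has no local extremum at 0.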

Exercise 4.5.1

Use the first derivative test to locate all local extrema for $f(x) = -x^3 + \frac{3}{2}x^2 + 18x$.

Hint
Find all critical points of f and determine the signs of $f'(x)$ over particular intervals determined by the critical points.

Answer

f has a local minimum at −2 and a local maximum at 3.

Example 4.5.2: Using the First Derivative Test

Use the first derivative test to find the location of all local extrema for $f(x) = 5x^{1/3} - x^{5/3}$. Use a graphing utility to confirm
your results.
Solution
Step 1. The derivative is

$$f'(x) = \frac{5}{3}x^{-2/3} - \frac{5}{3}x^{2/3} = \frac{5}{3x^{2/3}} - \frac{5x^{2/3}}{3} = \frac{5 - 5x^{4/3}}{3x^{2/3}} = \frac{5(1 - x^{4/3})}{3x^{2/3}}.$$

The derivative $f'(x) = 0$ when $1 - x^{4/3} = 0$. Therefore, $f'(x) = 0$ at x = ±1. The derivative $f'(x)$ is undefined at x = 0.

Therefore, we have three critical points: x = 0 , x = 1 , and x = −1 . Consequently, divide the interval (−∞, ∞) into the
smaller intervals (−∞, −1), (−1, 0), (0, 1), and (1, ∞).
Step 2: Since $f'$ is continuous over each subinterval, it suffices to choose a test point x in each of the intervals from step 1 and
determine the sign of $f'$ at each of these points. The points $x = -2$, $x = -\frac{1}{2}$, $x = \frac{1}{2}$, and $x = 2$ are test points for these
intervals.
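Table 4.5.2: First Derivative Test for $f(x) = 5x^{1/3} - x^{5/3}$.

| Interval | Test Point | Sign of $f'(x) = \frac{5(1 - x^{4/3})}{3x^{2/3}}$ at Test Point | Conclusion |
|---|---|---|---|
| $(-\infty, -1)$ | $x = -2$ | $\frac{(+)(-)}{+} = -$ | f is decreasing. |
| $(-1, 0)$ | $x = -\frac{1}{2}$ | $\frac{(+)(+)}{+} = +$ | f is increasing. |
| $(0, 1)$ | $x = \frac{1}{2}$ | $\frac{(+)(+)}{+} = +$ | f is increasing. |
| $(1, \infty)$ | $x = 2$ | $\frac{(+)(-)}{+} = -$ | f is decreasing. |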

Step 3: Since f is decreasing over the interval (−∞, −1) and increasing over the interval (−1, 0), f has a local minimum at
x = −1 . Since f is increasing over the interval (−1, 0) and the interval (0, 1), f does not have a local extremum at x = 0 .

Since f is increasing over the interval (0, 1) and decreasing over the interval (1, ∞), f has a local maximum at x = 1 . The
analytical results agree with the following graph.

Figure 4.5.4 : The function f has a local minimum at x = −1 and a local maximum at x = 1
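
The signs in Table 4.5.2 can be checked numerically with a short sketch (added here for illustration; the names are assumptions). One practical caveat: in Python, raising a negative float to a fractional power such as `(-2.0) ** (2/3)` yields a complex number, so the code computes $x^{2/3}$ as $(x^2)^{1/3}$ to stay with real values.

```python
def fprime(x: float) -> float:
    """f'(x) = 5(1 - x^(4/3)) / (3 x^(2/3)); undefined at x = 0."""
    u = (x * x) ** (1.0 / 3.0)  # u = x^(2/3); x*x >= 0 keeps the result real
    return 5.0 * (1.0 - u * u) / (3.0 * u)

test_points = [-2.0, -0.5, 0.5, 2.0]
signs = ["+" if fprime(x) > 0 else "-" for x in test_points]
print(signs)
```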

Exercise 4.5.2
Use the first derivative test to find all local extrema for $f(x) = \frac{3}{x - 1}$.

Hint
The only critical point of f is x = 1.

Answer
f has no local extrema because $f'$ does not change sign at x = 1.

Concavity and Points of Inflection


We now know how to determine where a function is increasing or decreasing. However, there is another issue to consider regarding
the shape of the graph of a function. If the graph curves, does it curve upward or curve downward? This notion is called the
concavity of the function.
Figure 4.5.5a shows a function f with a graph that curves upward. As x increases, the slope of the tangent line increases. Thus,
since the derivative increases as x increases, $f'$ is an increasing function. We say this function f is concave up. Figure 4.5.5b
shows a function f that curves downward. As x increases, the slope of the tangent line decreases. Since the derivative decreases as
x increases, $f'$ is a decreasing function. We say this function f is concave down.

Definition: concavity test

Let f be a function that is differentiable over an open interval I. If $f'$ is increasing over I, we say f is concave up over I. If
$f'$ is decreasing over I, we say f is concave down over I.


Figure 4.5.5: (a), (c) Since $f'$ is increasing over the interval (a, b), we say f is concave up over (a, b). (b), (d) Since $f'$ is
decreasing over the interval (a, b), we say f is concave down over (a, b).
In general, without having the graph of a function f, how can we determine its concavity? By definition, a function f is
concave up if $f'$ is increasing. From Corollary 3, we know that if $f'$ is a differentiable function, then $f'$ is increasing if its
derivative $f''(x) > 0$. Therefore, a function f that is twice differentiable is concave up when $f''(x) > 0$. Similarly, a function
f is concave down if $f'$ is decreasing. We know that a differentiable function $f'$ is decreasing if its derivative $f''(x) < 0$.
Therefore, a twice-differentiable function f is concave down when $f''(x) < 0$. Applying this logic is known as the concavity
test.

Test for Concavity

Let f be a function that is twice differentiable over an interval I.

i. If $f''(x) > 0$ for all x ∈ I, then f is concave up over I.
ii. If $f''(x) < 0$ for all x ∈ I, then f is concave down over I.

We conclude that we can determine the concavity of a function f by looking at the second derivative of f . In addition, we observe
that a function f can switch concavity (Figure 4.5.6). However, a continuous function can switch concavity only at a point x if
$f''(x) = 0$ or $f''(x)$ is undefined. Consequently, to determine the intervals where a function f is concave up and concave down,
we look for those values of x where $f''(x) = 0$ or $f''(x)$ is undefined. When we have determined these points, we divide the
domain of f into smaller intervals and determine the sign of $f''$ over each of these smaller intervals. If $f''$ changes sign as we pass
through a point x, then f changes concavity. It is important to remember that a function f may not change concavity at a point x
even if $f''(x) = 0$ or $f''(x)$ is undefined. If, however, f does change concavity at a point a and f is continuous at a, we say the
point (a, f(a)) is an inflection point of f.

Definition: inflection point

If f is continuous at a and f changes concavity at a , the point (a, f (a)) is an inflection point of f .

Figure 4.5.6: Since $f''(x) > 0$ for x < a, the function f is concave up over the interval (−∞, a). Since $f''(x) < 0$ for x > a, the
function f is concave down over the interval (a, ∞). The point (a, f(a)) is an inflection point of f.

Example 4.5.3: Testing for Concavity

For the function $f(x) = x^3 - 6x^2 + 9x + 30$, determine all intervals where f is concave up and all intervals where f is
concave down. List all inflection points for f. Use a graphing utility to confirm your results.
Solution
To determine concavity, we need to find the second derivative $f''(x)$. The first derivative is $f'(x) = 3x^2 - 12x + 9$, so the
second derivative is $f''(x) = 6x - 12$. If the function changes concavity, it occurs either when $f''(x) = 0$ or $f''(x)$ is
undefined. Since $f''$ is defined for all real numbers x, we need only find where $f''(x) = 0$. Solving the equation 6x − 12 = 0,
we see that x = 2 is the only place where f could change concavity. We now test points over the intervals (−∞, 2) and (2, ∞)
to determine the concavity of f . The points x = 0 and x = 3 are test points for these intervals.
Table 4.5.3: Test for Concavity for $f(x) = x^3 - 6x^2 + 9x + 30$.

| Interval | Test Point | Sign of $f''(x) = 6x - 12$ at Test Point | Conclusion |
|---|---|---|---|
| $(-\infty, 2)$ | $x = 0$ | $-$ | f is concave down |
| $(2, \infty)$ | $x = 3$ | $+$ | f is concave up |

We conclude that f is concave down over the interval (−∞, 2) and concave up over the interval (2, ∞). Since f changes
concavity at x = 2 , the point (2, f (2)) = (2, 32) is an inflection point. Figure 4.5.7 confirms the analytical results.

Figure 4.5.7 : The given function has a point of inflection at (2, 32) where the graph changes concavity.
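
As a quick cross-check of Example 4.5.3, the sign of the second derivative at the two test points and the value of f at the candidate inflection point can be computed directly. This is a minimal added sketch; the function names are illustrative assumptions.

```python
def f(x):
    return x**3 - 6 * x**2 + 9 * x + 30

def f_second(x):
    # Second derivative of f, as computed in Example 4.5.3.
    return 6 * x - 12

def concavity_at(x):
    """Label the concavity at x from the sign of f''(x)."""
    s = f_second(x)
    if s > 0:
        return "concave up"
    if s < 0:
        return "concave down"
    return "undetermined"  # f''(x) = 0 gives no information by itself

labels = {x: concavity_at(x) for x in (0.0, 3.0)}
inflection = (2, f(2))

print(labels)
print(inflection)
```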

Exercise 4.5.3

For $f(x) = -x^3 + \frac{3}{2}x^2 + 18x$, find all intervals where f is concave up and all intervals where f is concave down.

Hint
Find where $f''(x) = 0$.

Answer
f is concave up over the interval $\left(-\infty, \frac{1}{2}\right)$ and concave down over the interval $\left(\frac{1}{2}, \infty\right)$.

We now summarize, in Table 4.5.4, the information that the first and second derivatives of a function f provide about the graph of
f , and illustrate this information in Figure 4.5.8.

Table 4.5.4: What Derivatives Tell Us about Graphs

| Sign of $f'$ | Sign of $f''$ | Is f increasing or decreasing? | Concavity |
|---|---|---|---|
| Positive | Positive | Increasing | Concave up |
| Positive | Negative | Increasing | Concave down |
| Negative | Positive | Decreasing | Concave up |
| Negative | Negative | Decreasing | Concave down |

Figure 4.5.8: Consider a twice-differentiable function f over an open interval I. If $f'(x) > 0$ for all x ∈ I, the function is
increasing over I. If $f'(x) < 0$ for all x ∈ I, the function is decreasing over I. If $f''(x) > 0$ for all x ∈ I, the function is concave
up. If $f''(x) < 0$ for all x ∈ I, the function is concave down on I.

The Second Derivative Test


The first derivative test provides an analytical tool for finding local extrema, but the second derivative can also be used to locate
extreme values. Using the second derivative can sometimes be a simpler method than using the first derivative.
We know that if a continuous function has a local extremum, it must occur at a critical point. However, a function need not have a
local extremum at a critical point. Here we examine how the second derivative test can be used to determine whether a function
has a local extremum at a critical point. Let f be a twice-differentiable function such that $f'(a) = 0$ and $f''$ is continuous over an
open interval I containing a. Suppose $f''(a) < 0$. Since $f''$ is continuous, $f''(x) < 0$ for all x sufficiently close to a; shrinking I if
necessary, we may assume $f''(x) < 0$ for all x ∈ I (Figure 4.5.9). Then, by Corollary 3, $f'$ is a decreasing function over I. Since
$f'(a) = 0$, we conclude that for all x ∈ I, $f'(x) > 0$ if x < a and $f'(x) < 0$ if x > a. Therefore, by the first derivative test, f has a
local maximum at x = a.

On the other hand, suppose there exists a point b such that $f'(b) = 0$ but $f''(b) > 0$. Since $f''$ is continuous over an open interval
I containing b, shrinking I if necessary gives $f''(x) > 0$ for all x ∈ I (Figure 4.5.9). Then, by Corollary 3, $f'$ is an increasing
function over I. Since $f'(b) = 0$, we conclude that for all x ∈ I, $f'(x) < 0$ if x < b and $f'(x) > 0$ if x > b. Therefore, by the first
derivative test, f has a local minimum at x = b.

Figure 4.5.9: Consider a twice-differentiable function f such that $f''$ is continuous. Since $f'(a) = 0$ and $f''(a) < 0$, there is an
interval I containing a such that for all x in I, f is increasing if x < a and f is decreasing if x > a. As a result, f has a local
maximum at x = a. Since $f'(b) = 0$ and $f''(b) > 0$, there is an interval I containing b such that for all x in I, f is decreasing if
x < b and f is increasing if x > b. As a result, f has a local minimum at x = b.

Second Derivative Test
Suppose $f'(c) = 0$ and $f''$ is continuous over an interval containing c.

i. If $f''(c) > 0$, then f has a local minimum at c.
ii. If $f''(c) < 0$, then f has a local maximum at c.
iii. If $f''(c) = 0$, then the test is inconclusive.

Note that for case iii, when $f''(c) = 0$, f may have a local maximum, local minimum, or neither at c. For example, the
functions $f(x) = x^3$, $f(x) = x^4$, and $f(x) = -x^4$ all have critical points at x = 0. In each case, the second derivative is zero at
x = 0. However, the function $f(x) = x^4$ has a local minimum at x = 0 whereas the function $f(x) = -x^4$ has a local maximum at
x = 0, and the function $f(x) = x^3$ does not have a local extremum at x = 0.

Let’s now look at how to use the second derivative test to determine whether f has a local maximum or local minimum at a critical
point c where $f'(c) = 0$.

Example 4.5.4: Using the Second Derivative Test

Use the second derivative to find the location of all local extrema for $f(x) = x^5 - 5x^3$.

Solution
To apply the second derivative test, we first need to find critical points c where $f'(c) = 0$. The derivative is
$f'(x) = 5x^4 - 15x^2$. Therefore, $f'(x) = 5x^4 - 15x^2 = 5x^2(x^2 - 3) = 0$ when $x = 0, \pm\sqrt{3}$.
To determine whether f has a local extremum at any of these points, we need to evaluate the sign of $f''$ at these points. The
second derivative is

$$f''(x) = 20x^3 - 30x = 10x(2x^2 - 3).$$

In the following table, we evaluate the second derivative at each of the critical points and use the second derivative test to
determine whether f has a local maximum or local minimum at any of these points.
Table 4.5.5: Second Derivative Test for $f(x) = x^5 - 5x^3$.

| $x$ | $f''(x)$ | Conclusion |
|---|---|---|
| $-\sqrt{3}$ | $-30\sqrt{3}$ | Local maximum |
| $0$ | $0$ | Second derivative test is inconclusive |
| $\sqrt{3}$ | $30\sqrt{3}$ | Local minimum |

By the second derivative test, we conclude that f has a local maximum at $x = -\sqrt{3}$ and f has a local minimum at $x = \sqrt{3}$.
The second derivative test is inconclusive at x = 0. To determine whether f has a local extremum at x = 0, we apply the first
derivative test. To evaluate the sign of $f'(x) = 5x^2(x^2 - 3)$ for $x \in (-\sqrt{3}, 0)$ and $x \in (0, \sqrt{3})$, let x = −1 and x = 1 be the
two test points. Since $f'(-1) < 0$ and $f'(1) < 0$, we conclude that f is decreasing on both intervals and, therefore, f does not
have a local extremum at x = 0, as shown in the following graph.

Figure 4.5.10: The function f has a local maximum at $x = -\sqrt{3}$ and a local minimum at $x = \sqrt{3}$.
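
The procedure used in Example 4.5.4 (try the second derivative test first, then fall back on the first derivative test where it is inconclusive) can be sketched as follows. This is an added illustration; the function names and the tolerance `tol` used to decide when $f''(c)$ counts as zero are assumptions.

```python
import math

def f_prime(x):
    # f'(x) for f(x) = x^5 - 5x^3
    return 5 * x**4 - 15 * x**2

def f_second(x):
    # f''(x) for f(x) = x^5 - 5x^3
    return 20 * x**3 - 30 * x

def second_derivative_test(c, tol=1e-9):
    """Classify a critical point c (where f'(c) = 0) from the sign of f''(c)."""
    s = f_second(c)
    if s > tol:
        return "local minimum"
    if s < -tol:
        return "local maximum"
    return "inconclusive"

r = math.sqrt(3.0)
verdicts = [second_derivative_test(c) for c in (-r, 0.0, r)]

# At x = 0 the test is inconclusive, so compare the sign of f' at the
# test points x = -1 and x = 1 used in the example.
same_sign_at_zero = f_prime(-1.0) < 0 and f_prime(1.0) < 0

print(verdicts)
print(same_sign_at_zero)
```

Because $f'$ is negative on both sides of 0, the first derivative test reports no local extremum there, in agreement with the example.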

Exercise 4.5.4

Consider the function $f(x) = x^3 - \left(\frac{3}{2}\right)x^2 - 18x$. The points c = 3, −2 satisfy $f'(c) = 0$. Use the second derivative test to
determine whether f has a local maximum or local minimum at those points.

Hint
$f''(x) = 6x - 3$

Answer
f has a local maximum at −2 and a local minimum at 3.

We have now developed the tools we need to determine where a function is increasing and decreasing, as well as acquired an
understanding of the basic shape of the graph. In the next section we discuss what happens to a function as x → ±∞. At that point,
we have enough tools to provide accurate graphs of a large variety of functions.

Key Concepts
If c is a critical point of f and $f'(x) > 0$ for x < c and $f'(x) < 0$ for x > c, then f has a local maximum at c.
If c is a critical point of f and $f'(x) < 0$ for x < c and $f'(x) > 0$ for x > c, then f has a local minimum at c.
If $f''(x) > 0$ over an interval I, then f is concave up over I.
If $f''(x) < 0$ over an interval I, then f is concave down over I.
If $f'(c) = 0$ and $f''(c) > 0$, then f has a local minimum at c.
If $f'(c) = 0$ and $f''(c) < 0$, then f has a local maximum at c.
If $f'(c) = 0$ and $f''(c) = 0$, then evaluate $f'(x)$ at a test point x to the left of c and a test point x to the right of c, to determine
whether f has a local extremum at c.

Glossary
concave down
if f is differentiable over an interval I and $f'$ is decreasing over I, then f is concave down over I

concave up
if f is differentiable over an interval I and $f'$ is increasing over I, then f is concave up over I

concavity
the upward or downward curve of the graph of a function

concavity test
suppose f is twice differentiable over an interval I; if $f'' > 0$ over I, then f is concave up over I; if $f'' < 0$ over I, then f is
concave down over I

first derivative test


let f be a continuous function over an interval I containing a critical point c such that f is differentiable over I except possibly
at c; if $f'$ changes sign from positive to negative as x increases through c, then f has a local maximum at c; if $f'$ changes sign
from negative to positive as x increases through c, then f has a local minimum at c; if $f'$ does not change sign as x increases
through c, then f does not have a local extremum at c

inflection point
if f is continuous at c and f changes concavity at c, the point (c, f (c)) is an inflection point of f

second derivative test


suppose $f'(c) = 0$ and $f''$ is continuous over an interval containing c; if $f''(c) > 0$, then f has a local minimum at c; if
$f''(c) < 0$, then f has a local maximum at c; if $f''(c) = 0$, then the test is inconclusive

4.5: Summary of Curve Sketching is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
4.5: Derivatives and the Shape of a Graph by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0. Original source:
https://openstax.org/details/books/calculus-volume-1.

4.7: Optimization Problems
Learning Objectives
Set up and solve optimization problems in several applied fields.

One common application of calculus is calculating the minimum or maximum value of a function. For example, companies often
want to minimize production costs or maximize revenue. In manufacturing, it is often desirable to minimize the amount of material
used to package a product with a certain volume. In this section, we show how to set up these types of minimization and
maximization problems and solve them by using the tools developed in this chapter.

Solving Optimization Problems over a Closed, Bounded Interval


The basic idea of the optimization problems that follow is the same. We have a particular quantity that we are interested in
maximizing or minimizing. However, we also have some auxiliary condition that needs to be satisfied. For example, in Example
4.7.1, we are interested in maximizing the area of a rectangular garden. Certainly, if we keep making the side lengths of the garden

larger, the area will continue to become larger. However, what if we have some restriction on how much fencing we can use for the
perimeter? In this case, we cannot make the garden as large as we like. Let’s look at how we can maximize the area of a rectangle
subject to some constraint on the perimeter.

Example 4.7.1: Maximizing the Area of a Garden


A rectangular garden is to be constructed using a rock wall as one side of the garden and wire fencing for the other three sides
(Figure 4.7.1). Given 100 ft of wire fencing, determine the dimensions that would create a garden of maximum area. What is
the maximum area?

Figure 4.7.1 : We want to determine the measurements x and y that will create a garden with a maximum area using 100 ft of
fencing.
Solution
Let x denote the length of the side of the garden perpendicular to the rock wall and y denote the length of the side parallel to
the rock wall. Then the area of the garden is
A = x ⋅ y.

We want to find the maximum possible area subject to the constraint that the total fencing is 100 ft . From Figure 4.7.1 , the
total amount of fencing used will be 2x + y. Therefore, the constraint equation is
2x + y = 100.

Solving this equation for y , we have y = 100 − 2x. Thus, we can write the area as
$$A(x) = x \cdot (100 - 2x) = 100x - 2x^2.$$

4.7.1 https://math.libretexts.org/@go/page/4467
Before trying to maximize the area function A(x) = 100x − 2x , we need to determine the domain under consideration. To
2

construct a rectangular garden, we certainly need the lengths of both sides to be positive. Therefore, we need x > 0 and y > 0 .
Since y = 100 − 2x , if y > 0 , then x < 50. Therefore, we are trying to determine the maximum value of A(x) for x over the
open interval (0, 50). We do not know that a function necessarily has a maximum value over an open interval. However, we do
know that a continuous function has an absolute maximum (and absolute minimum) over a closed interval. Therefore, let’s
consider the function A(x) = 100x − 2x over the closed interval [0, 50]. If the maximum value occurs at an interior point,
2

then we have found the value x in the open interval (0, 50) that maximizes the area of the garden.
Therefore, we consider the following problem:
Maximize \(A(x) = 100x - 2x^2\) over the interval [0, 50].

As mentioned earlier, since A is a continuous function on a closed, bounded interval, by the extreme value theorem, it has a
maximum and a minimum. These extreme values occur either at endpoints or critical points. At the endpoints, A(x) = 0 . Since
the area is positive for all x in the open interval (0, 50), the maximum must occur at a critical point. Differentiating the
function A(x), we obtain
A'(x) = 100 − 4x.

Therefore, the only critical point is x = 25 (Figure 4.7.2). We conclude that the maximum area must occur when x = 25.

Figure 4.7.2 : To maximize the area of the garden, we need to find the maximum value of the function \(A(x) = 100x - 2x^2\).

Then we have y = 100 − 2x = 100 − 2(25) = 50. To maximize the area of the garden, let x = 25 ft and y = 50 ft . The area
of this garden is \(1250 \text{ ft}^2\).
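As a quick numerical check of this result, the following Python sketch treats the amount of fencing as a parameter. The function name `max_garden_area` and the parameter `fencing` are illustrative assumptions, not part of the text; the formulas are the ones derived above with 100 replaced by a general length.

```python
def max_garden_area(fencing):
    """Return (x, y, area) maximizing A(x) = x * (fencing - 2x) on [0, fencing/2].

    x is the side perpendicular to the rock wall, y the side parallel to it.
    """
    def area(x):
        return x * (fencing - 2 * x)

    # A'(x) = fencing - 4x = 0 gives the only critical point.
    critical = fencing / 4
    # Compare the critical point with both endpoints of the closed interval.
    candidates = [0.0, critical, fencing / 2]
    best_x = max(candidates, key=area)
    return best_x, fencing - 2 * best_x, area(best_x)


x, y, a = max_garden_area(100)
print(x, y, a)  # prints: 25.0 50.0 1250.0
```

Calling the same function with 200 ft of fencing is one way to check an answer to the exercise that follows.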

Exercise 4.7.1

Determine the maximum area if we want to make the same rectangular garden as in Figure 4.7.2, but we have 200 ft of
fencing.

Hint
We need to maximize the function \(A(x) = 200x - 2x^2\) over the interval [0, 100].

Answer
The maximum area is \(5000 \text{ ft}^2\).

Now let’s look at a general strategy for solving optimization problems similar to Example 4.7.1.

Problem-Solving Strategy: Solving Optimization Problems


1. Introduce all variables. If applicable, draw a figure and label all variables.
2. Determine which quantity is to be maximized or minimized, and for what range of values of the other variables (if this can
be determined at this time).
3. Write a formula for the quantity to be maximized or minimized in terms of the variables. This formula may involve more
than one variable.

4. Write any equations relating the independent variables in the formula from step 3. Use these equations to write the quantity
to be maximized or minimized as a function of one variable.
5. Identify the domain of consideration for the function in step 4 based on the physical problem to be solved.
6. Locate the maximum or minimum value of the function from step 4. This step typically involves looking for critical points
and evaluating a function at endpoints.
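Step 6 of the strategy, when the domain is a closed, bounded interval, can be sketched in Python as follows. This is a minimal illustration, not a general-purpose solver: the function name `absolute_extrema` is hypothetical, and the critical points are assumed to have been found by hand from the derivative.

```python
def absolute_extrema(f, a, b, critical_points):
    """Closed-interval method for a continuous f on [a, b].

    critical_points: points where f'(x) = 0 or f' is undefined, found by hand.
    Returns ((x_min, f_min), (x_max, f_max)).
    """
    # Keep only critical points that lie inside the domain of consideration.
    candidates = [a, b] + [c for c in critical_points if a < c < b]
    values = [(x, f(x)) for x in candidates]
    lowest = min(values, key=lambda pair: pair[1])
    highest = max(values, key=lambda pair: pair[1])
    return lowest, highest


# Garden from Example 4.7.1: A(x) = 100x - 2x^2 on [0, 50], critical point x = 25.
low, high = absolute_extrema(lambda x: 100 * x - 2 * x**2, 0, 50, [25])
print(high)  # prints: (25, 1250)
```

The sketch relies on the extreme value theorem: on a closed, bounded interval, only the endpoints and the interior critical points need to be compared.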

Now let’s apply this strategy to maximize the volume of an open-top box given a constraint on the amount of material to be used.

Example 4.7.2: Maximizing the Volume of a Box

An open-top box is to be made from a 24 in. by 36 in. piece of cardboard by removing a square from each corner of the box
and folding up the flaps on each side. What size square should be cut out of each corner to get a box with the maximum
volume?
Solution
Step 1: Let x be the side length of the square to be removed from each corner (Figure 4.7.3). Then, the remaining four flaps
can be folded up to form an open-top box. Let V be the volume of the resulting box.

Figure 4.7.3 : A square with side length x inches is removed from each corner of the piece of cardboard. The remaining flaps
are folded to form an open-top box.
Step 2: We are trying to maximize the volume of a box. Therefore, the problem is to maximize V .
Step 3: As mentioned in step 2, we are trying to maximize the volume of a box. The volume of a box is

V = L ⋅ W ⋅ H,

where L, W, and H are the length, width, and height, respectively.


Step 4: From Figure 4.7.3, we see that the height of the box is x inches, the length is 36 − 2x inches, and the width is 24 − 2x
inches. Therefore, the volume of the box is
\[V(x) = (36 - 2x)(24 - 2x)x = 4x^3 - 120x^2 + 864x.\]

Step 5: To determine the domain of consideration, let’s examine Figure 4.7.3. Certainly, we need x > 0. Furthermore, the side
length of the square cannot be greater than or equal to half the length of the shorter side, 24 in.; otherwise, one of the flaps
would be completely cut off. Therefore, we are trying to determine whether there is a maximum volume of the box for x over
the open interval (0, 12). Since V is a continuous function over the closed interval [0, 12], we know V will have an absolute
maximum over the closed interval. Therefore, we consider V over the closed interval [0, 12] and check whether the absolute
maximum occurs at an interior point.
Step 6: Since V (x) is a continuous function over the closed, bounded interval [0, 12], V must have an absolute maximum (and
an absolute minimum). Since V (x) = 0 at the endpoints and V (x) > 0 for 0 < x < 12, the maximum must occur at a critical
point. The derivative is
\[V'(x) = 12x^2 - 240x + 864.\]

To find the critical points, we need to solve the equation


\[12x^2 - 240x + 864 = 0.\]

Dividing both sides of this equation by 12, the problem simplifies to solving the equation
\[x^2 - 20x + 72 = 0.\]

Using the quadratic formula, we find that the critical points are
\[x = \frac{20 \pm \sqrt{(-20)^2 - 4(1)(72)}}{2} = \frac{20 \pm \sqrt{112}}{2} = \frac{20 \pm 4\sqrt{7}}{2} = 10 \pm 2\sqrt{7}.\]

Since \(10 + 2\sqrt{7}\) is not in the domain of consideration, the only critical point we need to consider is \(10 - 2\sqrt{7}\). Therefore, the
volume is maximized if we let \(x = 10 - 2\sqrt{7}\) in. The maximum volume is
\[V(10 - 2\sqrt{7}) = 640 + 448\sqrt{7} \approx 1825 \text{ in}^3,\]

as shown in the following graph.

Figure 4.7.4 : Maximizing the volume of the box leads to finding the maximum value of a cubic polynomial.
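The computation in step 6 can be reproduced with a few lines of Python. This is only a sketch of the arithmetic above; the names `volume`, `roots`, and `admissible` are illustrative choices, and the quadratic is the one obtained by dividing V'(x) by 12.

```python
import math


def volume(x):
    """Volume of the open-top box when squares of side x are cut from a 24 x 36 sheet."""
    return (36 - 2 * x) * (24 - 2 * x) * x


# V'(x) = 12x^2 - 240x + 864 = 12(x^2 - 20x + 72); solve x^2 - 20x + 72 = 0.
a, b, c = 1, -20, 72
disc = b * b - 4 * a * c
roots = [(-b - math.sqrt(disc)) / (2 * a), (-b + math.sqrt(disc)) / (2 * a)]

# Only roots inside the domain of consideration (0, 12) are admissible.
admissible = [r for r in roots if 0 < r < 12]
best_x = max(admissible + [0, 12], key=volume)

print(round(best_x, 4), round(volume(best_x), 1))  # prints: 4.7085 1825.3
```

The larger root, about 15.29, is discarded because a square of that size cannot be cut from a sheet whose shorter side is 24 in.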

Exercise 4.7.2

Suppose the dimensions of the cardboard in Example 4.7.2 are 20 in. by 30 in. Let x be the side length of each square and
write the volume of the open-top box as a function of x. Determine the domain of consideration for x.

Hint
The volume of the box is L ⋅ W ⋅ H .

Answer
V (x) = x(20 − 2x)(30 − 2x). The domain is [0, 10].

Example 4.7.3: Minimizing Travel Time

An island is 2 mi due north of its closest point along a straight shoreline. A visitor is staying at a cabin on the shore that is 6 mi
west of that point. The visitor is planning to go from the cabin to the island. Suppose the visitor runs at a rate of 8 mph and
swims at a rate of 3 mph. How far should the visitor run before swimming to minimize the time it takes to reach the island?
Solution
Step 1: Let x be the distance running and let y be the distance swimming (Figure 4.7.5). Let T be the time it takes to get from
the cabin to the island.

Figure 4.7.5 : How can we choose x and y to minimize the travel time from the cabin to the island?
Step 2: The problem is to minimize T .
Step 3: To find the time spent traveling from the cabin to the island, add the time spent running and the time spent swimming.
Since Distance = Rate × Time (D = R × T ), the time spent running is
\[T_{\text{running}} = \frac{D_{\text{running}}}{R_{\text{running}}} = \frac{x}{8},\]

and the time spent swimming is


\[T_{\text{swimming}} = \frac{D_{\text{swimming}}}{R_{\text{swimming}}} = \frac{y}{3}.\]

Therefore, the total time spent traveling is


\[T = \frac{x}{8} + \frac{y}{3}.\]

Step 4: From Figure 4.7.5, the line segment of y miles forms the hypotenuse of a right triangle with legs of length 2 mi and
\(6 - x\) mi. Therefore, by the Pythagorean theorem, \(2^2 + (6 - x)^2 = y^2\), and we obtain \(y = \sqrt{(6 - x)^2 + 4}\). Thus, the total
time spent traveling is given by the function
\[T(x) = \frac{x}{8} + \frac{\sqrt{(6 - x)^2 + 4}}{3}.\]

Step 5: From Figure 4.7.5, we see that 0 ≤ x ≤ 6 . Therefore, [0, 6] is the domain of consideration.
Step 6: Since T (x) is a continuous function over a closed, bounded interval, it has a maximum and a minimum. Let’s begin by
looking for any critical points of T over the interval [0, 6]. The derivative is
\[T'(x) = \frac{1}{8} - \frac{1}{2} \cdot \frac{[(6 - x)^2 + 4]^{-1/2}}{3} \cdot 2(6 - x) = \frac{1}{8} - \frac{6 - x}{3\sqrt{(6 - x)^2 + 4}}.\]

If \(T'(x) = 0\), then
\[\frac{1}{8} = \frac{6 - x}{3\sqrt{(6 - x)^2 + 4}}. \tag{4.7.1}\]

Therefore,
\[3\sqrt{(6 - x)^2 + 4} = 8(6 - x). \tag{4.7.2}\]

Squaring both sides of this equation, we see that if x satisfies this equation, then x must satisfy
\[9[(6 - x)^2 + 4] = 64(6 - x)^2,\]

which implies
\[55(6 - x)^2 = 36.\]

We conclude that if x is a critical point, then x satisfies

\[(x - 6)^2 = \frac{36}{55}.\]

[Note that since we are squaring, \((x - 6)^2 = (6 - x)^2\).]
Therefore, the possibilities for critical points are
\[x = 6 \pm \frac{6}{\sqrt{55}}.\]

Since \(x = 6 + 6/\sqrt{55}\) is not in the domain, it is not a possibility for a critical point. On the other hand, \(x = 6 - 6/\sqrt{55}\) is in
the domain. Since we squared both sides of Equation 4.7.2 to arrive at the possible critical points, it remains to verify that
\(x = 6 - 6/\sqrt{55}\) satisfies Equation 4.7.1. Since \(x = 6 - 6/\sqrt{55}\) does satisfy that equation, we conclude that \(x = 6 - 6/\sqrt{55}\)
is a critical point, and it is the only one. To justify that the time is minimized for this value of x, we just need to check the
values of T(x) at the endpoints x = 0 and x = 6, and compare them with the value of T(x) at the critical point
\(x = 6 - 6/\sqrt{55}\). We find that T(0) ≈ 2.108 h and T(6) ≈ 1.417 h, whereas
\[T(6 - 6/\sqrt{55}) \approx 1.368 \text{ h}.\]

Therefore, we conclude that T has its absolute minimum over [0, 6] at x ≈ 5.19 mi, so the visitor should run about 5.19 mi before swimming.
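The three values compared above can be recomputed with a short Python sketch. The function name `travel_time` and its keyword parameters are assumptions made for illustration; their defaults are the speeds and distances given in this example.

```python
import math


def travel_time(x, run_speed=8.0, swim_speed=3.0, shore=6.0, offshore=2.0):
    """Hours to run x miles along the shore, then swim straight to the island."""
    swim_distance = math.sqrt((shore - x) ** 2 + offshore**2)
    return x / run_speed + swim_distance / swim_speed


# Critical point found in the text: x = 6 - 6/sqrt(55).
x_crit = 6 - 6 / math.sqrt(55)

for x in (0.0, x_crit, 6.0):
    print(f"x = {x:.3f} mi, T = {travel_time(x):.3f} h")
```

Because the speeds and distances are parameters, the same function can be reused to explore variations of the problem, although the critical point would then have to be recomputed.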

Exercise 4.7.3

Suppose the island is 1 mi from shore, and the distance from the cabin to the point on the shore closest to the island is 15 mi.
Suppose a visitor swims at the rate of 2.5 mph and runs at a rate of 6 mph. Let x denote the distance the visitor will run before
swimming, and find a function for the time it takes the visitor to get from the cabin to the island.

Hint
The time \(T = T_{\text{running}} + T_{\text{swimming}}\).

Answer
\[T(x) = \frac{x}{6} + \frac{\sqrt{(15 - x)^2 + 1}}{2.5}\]

In business, companies are interested in maximizing revenue. In the following example, we consider a scenario in which a
company has collected data on how many cars it is able to lease, depending on the price it charges its customers to rent a car. Let’s
use these data to determine the price the company should charge to maximize the amount of money it brings in.

Example 4.7.4: Maximizing Revenue

Owners of a car rental company have determined that if they charge customers p dollars per day to rent a car, where
50 ≤ p ≤ 200, the number of cars n they rent per day can be modeled by the linear function n(p) = 1000 − 5p. If they charge
$50 per day or less, they will rent all their cars. If they charge $200 per day or more, they will not rent any cars. Assuming the
owners plan to charge customers between $50 per day and $200 per day to rent a car, how much should they charge to
maximize their revenue?
Solution
Step 1: Let p be the price charged per car per day and let n be the number of cars rented per day. Let R be the revenue per day.
Step 2: The problem is to maximize R.
Step 3: The revenue (per day) is equal to the number of cars rented per day times the price charged per car per day—that is,
R = n × p.

Step 4: Since the number of cars rented per day is modeled by the linear function n(p) = 1000 − 5p, the revenue R can be
represented by the function
\[R(p) = n \times p = (1000 - 5p)p = -5p^2 + 1000p.\]

Step 5: Since the owners plan to charge between $50 per car per day and $200 per car per day, the problem is to find the
maximum revenue R(p) for p in the closed interval [50, 200].
Step 6: Since R is a continuous function over the closed, bounded interval [50, 200], it has an absolute maximum (and an
absolute minimum) in that interval. To find the maximum value, look for critical points. The derivative is
R'(p) = −10p + 1000. Therefore, the critical point is p = 100. When p = 100, R(100) = $50,000. When
p = 50, R(p) = $37,500. When p = 200, R(p) = $0.

Therefore, the absolute maximum occurs at p = $100. The car rental company should charge $100 per day per car to
maximize revenue as shown in the following figure.

Figure 4.7.6 : To maximize revenue, a car rental company has to balance the price of a rental against the number of cars people
will rent at that price.
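For a linear demand model, the revenue is a downward-opening parabola, so the search for the best price can be sketched in Python as below. The function name `best_price` and its parameters are hypothetical; `intercept` and `slope` stand for the coefficients of a demand function of the form n(p) = intercept − slope·p, with slope assumed positive.

```python
def best_price(intercept, slope, p_min, p_max):
    """Maximize R(p) = (intercept - slope * p) * p over [p_min, p_max].

    Assumes slope > 0, so R is a downward-opening parabola.
    """
    def revenue(p):
        return (intercept - slope * p) * p

    # R'(p) = intercept - 2 * slope * p = 0 at the vertex.
    vertex = intercept / (2 * slope)
    # If the vertex falls outside the interval, the maximum is at an endpoint.
    candidates = [p_min, p_max]
    if p_min < vertex < p_max:
        candidates.append(vertex)
    p_best = max(candidates, key=revenue)
    return p_best, revenue(p_best)


print(best_price(1000, 5, 50, 200))  # prints: (100.0, 50000.0)
```

The endpoint check matters whenever the allowed price range excludes the vertex; for instance, restricting prices to [120, 200] in this model would make $120 the best available price.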

Exercise 4.7.4

A car rental company charges its customers p dollars per day, where 60 ≤ p ≤ 150 . It has found that the number of cars rented
per day can be modeled by the linear function n(p) = 750 − 5p. How much should the company charge each customer to
maximize revenue?

Hint
R(p) = n × p, where n is the number of cars rented and p is the price charged per car.

Answer
The company should charge $75 per car per day.

Example 4.7.5: Maximizing the Area of an Inscribed Rectangle


A rectangle is to be inscribed in the ellipse
\[\frac{x^2}{4} + y^2 = 1.\]

What should the dimensions of the rectangle be to maximize its area? What is the maximum area?
Solution
Step 1: For a rectangle to be inscribed in the ellipse, the sides of the rectangle must be parallel to the axes. Let L be the length
of the rectangle and W be its width. Let A be the area of the rectangle.

Figure 4.7.7 : We want to maximize the area of a rectangle inscribed in an ellipse.
Step 2: The problem is to maximize A .
Step 3: The area of the rectangle is A = LW .
Step 4: Let (x, y) be the corner of the rectangle that lies in the first quadrant, as shown in Figure 4.7.7. We can write length
\(L = 2x\) and width \(W = 2y\). Since \(\frac{x^2}{4} + y^2 = 1\) and \(y > 0\), we have \(y = \sqrt{1 - \frac{x^2}{4}}\). Therefore, the area is
\[A = LW = (2x)(2y) = 4x\sqrt{1 - \frac{x^2}{4}} = 2x\sqrt{4 - x^2}.\]

Step 5: From Figure 4.7.7, we see that to inscribe a rectangle in the ellipse, the x-coordinate of the corner in the first quadrant
must satisfy 0 < x < 2 . Therefore, the problem reduces to looking for the maximum value of A(x) over the open interval
(0, 2). Since A(x) will have an absolute maximum (and absolute minimum) over the closed interval [0, 2], we consider
\(A(x) = 2x\sqrt{4 - x^2}\) over the interval [0, 2]. If the absolute maximum occurs at an interior point, then we have found an

absolute maximum in the open interval.


Step 6: As mentioned earlier, A(x) is a continuous function over the closed, bounded interval [0, 2]. Therefore, it has an
absolute maximum (and absolute minimum). At the endpoints x = 0 and x = 2 , A(x) = 0. For 0 < x < 2 , A(x) > 0 .
Therefore, the maximum must occur at a critical point. Taking the derivative of A(x), we obtain
\[A'(x) = 2\sqrt{4 - x^2} + 2x \cdot \frac{1}{2\sqrt{4 - x^2}}(-2x) = 2\sqrt{4 - x^2} - \frac{2x^2}{\sqrt{4 - x^2}} = \frac{8 - 4x^2}{\sqrt{4 - x^2}}.\]

To find critical points, we need to find where \(A'(x) = 0\). We can see that if x is a solution of
\[\frac{8 - 4x^2}{\sqrt{4 - x^2}} = 0, \tag{4.7.3}\]

then x must satisfy


\[8 - 4x^2 = 0.\]


Therefore, \(x^2 = 2\). Thus, \(x = \pm\sqrt{2}\) are the possible solutions of Equation 4.7.3. Since we are considering x over the interval
[0, 2], \(x = \sqrt{2}\) is a possibility for a critical point, but \(x = -\sqrt{2}\) is not. Therefore, we check whether \(\sqrt{2}\) is a solution of
Equation 4.7.3. Since \(x = \sqrt{2}\) is a solution of Equation 4.7.3, we conclude that \(\sqrt{2}\) is the only critical point of A(x) in the
interval [0, 2].

Therefore, A(x) must have an absolute maximum at the critical point \(x = \sqrt{2}\). To determine the dimensions of the rectangle,
we need to find the length L and the width W. If \(x = \sqrt{2}\), then

\[y = \sqrt{1 - \frac{(\sqrt{2})^2}{4}} = \sqrt{1 - \frac{1}{2}} = \frac{1}{\sqrt{2}}.\]

Therefore, the dimensions of the rectangle are \(L = 2x = 2\sqrt{2}\) and \(W = 2y = \frac{2}{\sqrt{2}} = \sqrt{2}\). The area of this rectangle is
\[A = LW = (2\sqrt{2})(\sqrt{2}) = 4.\]
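The result can also be checked numerically without using the derivative. The sketch below swaps in a different technique, golden-section search, which only assumes that the function rises to a single peak and then falls on the interval (true here, since A'(x) is positive to the left of the critical point and negative to the right). The function names and the tolerance are illustrative assumptions.

```python
import math


def golden_section_max(f, a, b, tol=1e-8):
    """Locate the maximizer of a unimodal function f on [a, b]."""
    inv_phi = (math.sqrt(5) - 1) / 2  # about 0.618
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while b - a > tol:
        if f(c) > f(d):
            # The maximum lies in [a, d].
            b = d
        else:
            # The maximum lies in [c, b].
            a = c
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    return (a + b) / 2


def area(x):
    """Area of the inscribed rectangle with first-quadrant corner (x, y)."""
    return 2 * x * math.sqrt(4 - x**2)


x_best = golden_section_max(area, 0.0, 2.0)
print(x_best, area(x_best))
```

The printed values should be close to the exact answers \(x = \sqrt{2}\) and \(A = 4\) found above; the search is approximate, so the last few digits depend on the tolerance.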

Exercise 4.7.5

Modify the area function A if the rectangle is to be inscribed in the unit circle \(x^2 + y^2 = 1\). What is the domain of
consideration?

Hint
If (x, y) is the vertex of the square that lies in the first quadrant, then the area of the square is A = (2x)(2y) = 4xy.

Answer
\(A(x) = 4x\sqrt{1 - x^2}\). The domain of consideration is [0, 1].

Solving Optimization Problems when the Interval Is Not Closed or Is Unbounded


In the previous examples, we considered functions on closed, bounded domains. Consequently, by the extreme value theorem, we
were guaranteed that the functions had absolute extrema. Let’s now consider functions for which the domain is neither closed nor
bounded.
Many functions still have at least one absolute extremum, even if the domain is not closed or the domain is unbounded. For example,
the function \(f(x) = x^2 + 4\) over (−∞, ∞) has an absolute minimum of 4 at x = 0. Therefore, we can still consider functions
over unbounded domains or open intervals and determine whether they have any absolute extrema. In the next example, we try to
minimize a function over an unbounded domain. We will see that, although the domain of consideration is (0, ∞), the function has
an absolute minimum.
In the following example, we look at constructing a box of least surface area with a prescribed volume. It is not difficult to show
that for a closed-top box, by symmetry, among all boxes with a specified volume, a cube will have the smallest surface area.
Consequently, we consider the modified problem of determining which open-topped box with a specified volume has the smallest
surface area.

Example 4.7.6: Minimizing Surface Area

A rectangular box with a square base, an open top, and a volume of \(216 \text{ in}^3\) is to be constructed. What should the dimensions
of the box be to minimize the surface area of the box? What is the minimum surface area?
Solution
Step 1: Draw a rectangular box and introduce the variable x to represent the length of each side of the square base; let y
represent the height of the box. Let S denote the surface area of the open-top box.

Figure 4.7.8 : We want to minimize the surface area of a square-based box with a given volume.
Step 2: We need to minimize the surface area. Therefore, we need to minimize S .
Step 3: Since the box has an open top, we need only determine the area of the four vertical sides and the base. The area of each
of the four vertical sides is x ⋅ y. The area of the base is \(x^2\). Therefore, the surface area of the box is
\[S = 4xy + x^2.\]
Step 4: Since the volume of this box is \(x^2 y\) and the volume is given as \(216 \text{ in}^3\), the constraint equation is
\[x^2 y = 216.\]
Solving the constraint equation for y, we have \(y = \frac{216}{x^2}\). Therefore, we can write the surface area as a function of x only:
\[S(x) = 4x\left(\frac{216}{x^2}\right) + x^2.\]
Therefore, \(S(x) = \frac{864}{x} + x^2\).

Step 5: Since we are requiring that \(x^2 y = 216\), we cannot have x = 0. Therefore, we need x > 0. On the other hand, x is
allowed to have any positive value. Note that as x becomes large, the height of the box y becomes correspondingly small so
that \(x^2 y = 216\). Similarly, as x becomes small, the height of the box becomes correspondingly large. We conclude that the
domain is the open, unbounded interval (0, ∞). Note that, unlike the previous examples, we cannot reduce our problem to
looking for an absolute maximum or absolute minimum over a closed, bounded interval. However, in the next step, we
discover why this function must have an absolute minimum over the interval (0, ∞).
Step 6: Note that as \(x \to 0^+\), \(S(x) \to \infty\). Also, as \(x \to \infty\), \(S(x) \to \infty\). Since S is a continuous function that approaches
infinity at the ends, it must have an absolute minimum at some x ∈ (0, ∞). This minimum must occur at a critical point of S.
The derivative is
\[S'(x) = -\frac{864}{x^2} + 2x.\]
Therefore, \(S'(x) = 0\) when \(2x = \frac{864}{x^2}\). Solving this equation for x, we obtain \(x^3 = 432\), so \(x = \sqrt[3]{432} = 6\sqrt[3]{2}\). Since this is
the only critical point of S, the absolute minimum must occur at \(x = 6\sqrt[3]{2}\) (see Figure 4.7.9).
When \(x = 6\sqrt[3]{2}\), \(y = \frac{216}{(6\sqrt[3]{2})^2} = 3\sqrt[3]{2}\) in. Therefore, the dimensions of the box should be \(x = 6\sqrt[3]{2}\) in. and \(y = 3\sqrt[3]{2}\) in. With
these dimensions, the surface area is
\[S(6\sqrt[3]{2}) = \frac{864}{6\sqrt[3]{2}} + (6\sqrt[3]{2})^2 = 108\sqrt[3]{4} \text{ in}^2.\]

Figure 4.7.9 : We can use a graph to determine the dimensions of a box of given volume and minimum surface area.
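Because the domain here is the open interval (0, ∞), there are no endpoints to test. A short Python sketch can still confirm the critical point and check that the derivative changes sign from negative to positive there. The function names and the default `volume` parameter are illustrative assumptions; the formulas are those derived in steps 4 and 6.

```python
def surface_area(x, volume=216.0):
    """Surface area of an open-top box with square base of side x and the given volume."""
    height = volume / x**2
    return 4 * x * height + x**2


def surface_area_derivative(x, volume=216.0):
    """S'(x) = -4 * volume / x^2 + 2x."""
    return -4 * volume / x**2 + 2 * x


# S'(x) = 0 when x^3 = 2 * volume, that is x = 432^(1/3) for volume = 216.
x_star = (2 * 216.0) ** (1 / 3)
y_star = 216.0 / x_star**2

print(x_star, y_star, surface_area(x_star))
# S' is negative just left of x_star and positive just right of it.
print(surface_area_derivative(x_star - 0.1) < 0 < surface_area_derivative(x_star + 0.1))
```

The first line of output should agree with \(6\sqrt[3]{2} \approx 7.56\), \(3\sqrt[3]{2} \approx 3.78\), and \(108\sqrt[3]{4} \approx 171.44\).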

Exercise 4.7.6

Consider the same open-top box, which is to have volume \(216 \text{ in}^3\). Suppose the cost of the material for the base is 20¢/in\(^2\) and
the cost of the material for the sides is 30¢/in\(^2\) and we are trying to minimize the cost of this box. Write the cost as a function
of the side lengths of the base. (Let x be the side length of the base and y be the height of the box.)

Hint
If the cost of one of the sides is 30¢/in\(^2\), the cost of that side is 0.30xy dollars.

Answer

\(c(x) = \frac{259.2}{x} + 0.2x^2\) dollars

Key Concepts
To solve an optimization problem, begin by drawing a picture and introducing variables.
Find an equation relating the variables.
Find a function of one variable to describe the quantity that is to be minimized or maximized.
Look for critical points to locate local extrema.

Glossary
optimization problems
problems that are solved by finding the maximum or minimum value of a function

4.7: Optimization Problems is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
4.7: Applied Optimization Problems by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0. Original source:
https://openstax.org/details/books/calculus-volume-1.

4.8: Newton's Method
Learning Objectives
Describe the steps of Newton’s method.
Explain what an iterative process means.
Recognize when Newton’s method does not work.
Apply iterative processes to various situations.

In many areas of pure and applied mathematics, we are interested in finding solutions to an equation of the form f (x) = 0. For
most functions, however, it is difficult—if not impossible—to calculate their zeroes explicitly. In this section, we take a look at a
technique that provides a very efficient way of approximating the zeroes of functions. This technique makes use of tangent line
approximations and is behind the method used often by calculators and computers to find zeroes.

Describing Newton’s Method


Consider the task of finding the solutions of f (x) = 0. If f is the first-degree polynomial f (x) = ax + b , then the solution of
f(x) = 0 is given by the formula \(x = -\frac{b}{a}\). If f is the second-degree polynomial \(f(x) = ax^2 + bx + c\), the solutions of f(x) = 0
can be found by using the quadratic formula. However, for polynomials of degree 3 or more, finding roots of f becomes more
complicated. Although formulas exist for third- and fourth-degree polynomials, they are quite complicated. Also, if f is a
polynomial of degree 5 or greater, it is known that no general formula of this kind, built from arithmetic operations and radicals, exists. For example, consider the function
\[f(x) = x^5 + 8x^4 + 4x^3 - 2x - 7.\]

No formula exists that allows us to find the solutions of f (x) = 0. Similar difficulties exist for nonpolynomial functions. For
example, consider the task of finding solutions of tan(x) − x = 0. No simple formula exists for the solutions of this equation. In
cases such as these, we can use Newton’s method to approximate the roots.
Newton’s method makes use of the following idea to approximate the solutions of f (x) = 0. By sketching a graph of f , we can
estimate a root of f(x) = 0. Let’s call this estimate \(x_0\). We then draw the tangent line to f at \(x_0\). If \(f'(x_0) \ne 0\), this tangent line
intersects the x-axis at some point \((x_1, 0)\). Now let \(x_1\) be the next approximation to the actual root. Typically, \(x_1\) is closer than \(x_0\)
to an actual root. Next we draw the tangent line to f at \(x_1\). If \(f'(x_1) \ne 0\), this tangent line also intersects the x-axis, producing
another approximation, \(x_2\). We continue in this way, deriving a list of approximations: \(x_0, x_1, x_2, \ldots\). Typically, the numbers
\(x_0, x_1, x_2, \ldots\) quickly approach an actual root \(x^*\), as shown in the following figure.

4.8.1 https://math.libretexts.org/@go/page/4468
Figure 4.8.1 : The approximations \(x_0, x_1, x_2, \ldots\) approach the actual root \(x^*\). The approximations are derived by looking at
tangent lines to the graph of f.
Now let’s look at how to calculate the approximations \(x_0, x_1, x_2, \ldots\). If \(x_0\) is our first approximation, the approximation \(x_1\) is
defined by letting \((x_1, 0)\) be the x-intercept of the tangent line to f at \(x_0\). The equation of this tangent line is given by
\[y = f(x_0) + f'(x_0)(x - x_0).\]
Therefore, \(x_1\) must satisfy
\[f(x_0) + f'(x_0)(x_1 - x_0) = 0.\]
Solving this equation for \(x_1\), we conclude that
\[x_1 = x_0 - \frac{f(x_0)}{f'(x_0)}.\]
Similarly, the point \((x_2, 0)\) is the x-intercept of the tangent line to f at \(x_1\). Therefore, \(x_2\) satisfies the equation
\[x_2 = x_1 - \frac{f(x_1)}{f'(x_1)}.\]
In general, for n > 0, \(x_n\) satisfies
\[x_n = x_{n-1} - \frac{f(x_{n-1})}{f'(x_{n-1})}. \tag{4.8.1}\]
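Equation 4.8.1 translates almost directly into code. The following Python sketch is one possible rendering; the function name `newton` and its parameters are illustrative assumptions, and the derivative is supplied by the caller rather than computed automatically.

```python
def newton(f, fprime, x0, steps):
    """Return [x_1, ..., x_steps] from x_n = x_{n-1} - f(x_{n-1}) / f'(x_{n-1})."""
    iterates = []
    x = x0
    for _ in range(steps):
        slope = fprime(x)
        if slope == 0:
            # Horizontal tangent line: it never meets the x-axis, so stop.
            raise ZeroDivisionError(f"f'({x}) = 0; choose a different starting value")
        x = x - f(x) / slope
        iterates.append(x)
    return iterates


# For a first-degree polynomial the tangent line is the graph itself,
# so a single step lands on the root x = -b/a.
print(newton(lambda x: 2 * x - 6, lambda x: 2.0, x0=10.0, steps=1))  # prints: [3.0]
```

The explicit check on the slope mirrors the condition \(f'(x_{n-1}) \ne 0\) needed for the tangent line to intersect the x-axis.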

Next we see how to make use of this technique to approximate the root of the polynomial \(f(x) = x^3 - 3x + 1\).

Example 4.8.1: Finding a Root of a Polynomial


Use Newton’s method to approximate a root of \(f(x) = x^3 - 3x + 1\) in the interval [1, 2]. Let \(x_0 = 2\) and find \(x_1, x_2, x_3, x_4,\)
and \(x_5\).

Solution
From Figure 4.8.2, we see that f has one root over the interval [1, 2]. Therefore \(x_0 = 2\) seems like a reasonable first
approximation. To find the next approximation, we use Equation 4.8.1. Since \(f(x) = x^3 - 3x + 1\), the derivative is
\(f'(x) = 3x^2 - 3\). Using Equation 4.8.1 with n = 1 (and a calculator that displays 10 digits), we obtain
\[x_1 = x_0 - \frac{f(x_0)}{f'(x_0)} = 2 - \frac{f(2)}{f'(2)} = 2 - \frac{3}{9} \approx 1.666666667.\]

To find the next approximation, \(x_2\), we use Equation 4.8.1 with n = 2 and the value of \(x_1\) stored on the calculator. We find
that
\[x_2 = x_1 - \frac{f(x_1)}{f'(x_1)} \approx 1.548611111.\]

Continuing in this way, we obtain the following results:


x1 ≈ 1.666666667

x2 ≈ 1.548611111

x3 ≈ 1.532390162

x4 ≈ 1.532088989

x5 ≈ 1.532088886

x6 ≈ 1.532088886.

We note that we obtained the same value for \(x_5\) and \(x_6\). Therefore, any subsequent application of Newton’s method will most
likely give the same value for \(x_n\).

Figure 4.8.2 : The function \(f(x) = x^3 - 3x + 1\) has one root over the interval [1, 2].
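The iterates in this example can be regenerated with a short Python loop. This is a sketch using ordinary double-precision floating point rather than a 10-digit calculator, so digits beyond those shown in the example may differ; the variable names are illustrative.

```python
def f(x):
    return x**3 - 3 * x + 1


def fprime(x):
    return 3 * x**2 - 3


x = 2.0  # initial approximation x_0
iterates = []
for n in range(1, 7):
    # One application of Equation 4.8.1.
    x = x - f(x) / fprime(x)
    iterates.append(x)
    print(f"x_{n} = {x:.9f}")
```

The printed values can be compared with the list of approximations \(x_1\) through \(x_6\) given in the solution.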

Exercise 4.8.1

Letting \(x_0 = 0\), let’s use Newton’s method to approximate the root of \(f(x) = x^3 - 3x + 1\) over the interval [0, 1] by
calculating \(x_1\) and \(x_2\).

Hint
Use Equation 4.8.1.

Answer
x1 ≈ 0.333333333

x2 ≈ 0.347222222


Newton’s method can also be used to approximate square roots. Here we show how to approximate √2 . This method can be
modified to approximate the square root of any positive number.

Example 4.8.2: Finding a Square Root



Use Newton’s method to approximate \(\sqrt{2}\) (Figure 4.8.3). Let \(f(x) = x^2 - 2\), let \(x_0 = 2\), and calculate \(x_1, x_2, x_3, x_4, x_5\).
(We note that since \(f(x) = x^2 - 2\) has a zero at \(\sqrt{2}\), the initial value \(x_0 = 2\) is a reasonable choice to approximate \(\sqrt{2}\).)


Figure 4.8.3 : We can use Newton’s method to find √2.
Solution
For \(f(x) = x^2 - 2\), \(f'(x) = 2x\). From Equation 4.8.1, we know that
\[x_n = x_{n-1} - \frac{f(x_{n-1})}{f'(x_{n-1})} = x_{n-1} - \frac{x_{n-1}^2 - 2}{2x_{n-1}} = \frac{1}{2}x_{n-1} + \frac{1}{x_{n-1}} = \frac{1}{2}\left(x_{n-1} + \frac{2}{x_{n-1}}\right).\]

Therefore,
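\[x_1 = \frac{1}{2}\left(x_0 + \frac{2}{x_0}\right) = \frac{1}{2}\left(2 + \frac{2}{2}\right) = 1.5\]
\[x_2 = \frac{1}{2}\left(x_1 + \frac{2}{x_1}\right) = \frac{1}{2}\left(1.5 + \frac{2}{1.5}\right) \approx 1.416666667.\]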

Continuing in this way, we find that


x1 = 1.5

x2 ≈ 1.416666667

x3 ≈ 1.414215686

x4 ≈ 1.414213562

x5 ≈ 1.414213562.

Since we obtained the same value for \(x_4\) and \(x_5\), it is unlikely that the value \(x_n\) will change on any subsequent application of
Newton’s method. We conclude that \(\sqrt{2} \approx 1.414213562\).
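The simplified recurrence derived in this example generalizes to the square root of any positive number a by replacing 2 with a. A minimal Python sketch, with the hypothetical function name `newton_sqrt`, is:

```python
def newton_sqrt(a, x0, steps):
    """Approximate sqrt(a) for a > 0 using x_n = (x_{n-1} + a / x_{n-1}) / 2."""
    x = x0
    iterates = []
    for _ in range(steps):
        x = 0.5 * (x + a / x)
        iterates.append(x)
    return iterates


for n, x in enumerate(newton_sqrt(2, 2.0, 5), start=1):
    print(f"x_{n} = {x:.9f}")
```

Each step averages the current estimate with a divided by that estimate; one of the two is too large and the other too small, so their average is a better approximation.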

Exercise 4.8.2

Use Newton’s method to approximate \(\sqrt{3}\) by letting \(f(x) = x^2 - 3\) and \(x_0 = 3\). Find \(x_1\) and \(x_2\).

Hint

For \(f(x) = x^2 - 3\), Equation 4.8.1 reduces to \(x_n = \frac{x_{n-1}}{2} + \frac{3}{2x_{n-1}}\).

Answer
x1 = 2

x2 = 1.75

When using Newton’s method, each approximation after the initial guess is defined in terms of the previous approximation by
using the same formula. In particular, by defining the function \(F(x) = x - \frac{f(x)}{f'(x)}\), we can rewrite Equation 4.8.1 as
\(x_n = F(x_{n-1})\). This type of process, where each \(x_n\) is defined in terms of \(x_{n-1}\) by repeating the same function, is an example of
an iterative process. Shortly, we examine other iterative processes. First, let’s look at the reasons why Newton’s method could fail
to find a root.

Failures of Newton’s Method


Typically, Newton’s method is used to find roots fairly quickly. However, things can go wrong. Some reasons why Newton’s
method might fail include the following:
1. At one of the approximations \(x_n\), the derivative f' is zero at \(x_n\), but \(f(x_n) \ne 0\). As a result, the tangent line of f at \(x_n\) does not
intersect the x-axis. Therefore, we cannot continue the iterative process.
2. The approximations \(x_0, x_1, x_2, \ldots\) may approach a different root. If the function f has more than one root, it is possible that
our approximations do not approach the one for which we are looking, but approach a different root (see Figure 4.8.4). This
event most often occurs when we do not choose the approximation \(x_0\) close enough to the desired root.
3. The approximations may fail to approach a root entirely. In Example 4.8.3, we provide an example of a function and an initial
guess \(x_0\) such that the successive approximations never approach a root because the successive approximations continue to
alternate back and forth between two values.

Figure 4.8.4 : If the initial guess \(x_0\) is too far from the root sought, it may lead to approximations that approach a different root.

Example 4.8.3: When Newton’s Method Fails

Consider the function \(f(x) = x^3 - 2x + 2\). Let \(x_0 = 0\). Show that the sequence \(x_1, x_2, \ldots\) fails to approach a root of f.
Solution
For \(f(x) = x^3 - 2x + 2\), the derivative is \(f'(x) = 3x^2 - 2\). Therefore,
\[x_1 = x_0 - \frac{f(x_0)}{f'(x_0)} = 0 - \frac{f(0)}{f'(0)} = -\frac{2}{-2} = 1.\]
In the next step,
\[x_2 = x_1 - \frac{f(x_1)}{f'(x_1)} = 1 - \frac{f(1)}{f'(1)} = 1 - \frac{1}{1} = 0.\]

Consequently, the numbers \(x_0, x_1, x_2, \ldots\) continue to bounce back and forth between 0 and 1 and never get closer to the root
of f, which is over the interval [−2, −1] (Figure 4.8.5). Fortunately, if we choose an initial approximation \(x_0\) closer to the
actual root, we can avoid this situation.

Figure 4.8.5 : The approximations continue to alternate between 0 and 1 and never approach the root of f .
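The cycling behavior is easy to observe in code. The sketch below applies Newton’s method to this function from two starting values; the helper name `newton_iterates` is an illustrative assumption.

```python
def f(x):
    return x**3 - 2 * x + 2


def fprime(x):
    return 3 * x**2 - 2


def newton_iterates(x0, steps):
    """Apply Newton's method to f and return [x_1, ..., x_steps]."""
    x, out = x0, []
    for _ in range(steps):
        x = x - f(x) / fprime(x)
        out.append(x)
    return out


print(newton_iterates(0.0, 6))   # prints: [1.0, 0.0, 1.0, 0.0, 1.0, 0.0]
print(newton_iterates(-1.5, 6))  # settles near the root in [-2, -1]
```

In practice, an implementation usually caps the number of iterations so that a cycle like this one cannot run forever.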

Exercise 4.8.3

For \(f(x) = x^3 - 2x + 2\), let \(x_0 = -1.5\) and find \(x_1\) and \(x_2\).

Hint
Use Equation 4.8.1.

Answer
x1 ≈ −1.842105263

x2 ≈ −1.772826920

From Example 4.8.3, we see that Newton’s method does not always work. However, when it does work, the sequence of
approximations approaches the root very quickly. Discussions of how quickly the sequence of approximations approaches a root
found using Newton’s method are included in texts on numerical analysis.

Other Iterative Processes


As mentioned earlier, Newton’s method is a type of iterative process. We now look at an example of a different type of iterative
process.
Consider a function F and an initial number \(x_0\). Define the subsequent numbers \(x_n\) by the formula \(x_n = F(x_{n-1})\). This process is
an iterative process that creates a list of numbers \(x_0, x_1, x_2, \ldots, x_n, \ldots\). This list of numbers may approach a finite number \(x^*\)
as n gets larger, or it may not. In Example 4.8.4, we see an example of a function F and an initial guess \(x_0\) such that the resulting
list of numbers approaches a finite value.

Example 4.8.4: Finding a Limit for an Iterative Process

Let \(F(x) = \frac{1}{2}x + 4\) and let \(x_0 = 0\). For all n ≥ 1, let \(x_n = F(x_{n-1})\). Find the values \(x_1, x_2, x_3, x_4, x_5\). Make a conjecture
about what happens to this list of numbers \(x_1, x_2, x_3, \ldots, x_n, \ldots\) as n → ∞. If the list of numbers \(x_1, x_2, x_3, \ldots\)
approaches a finite number \(x^*\), then \(x^*\) satisfies \(x^* = F(x^*)\), and \(x^*\) is called a fixed point of F.

Solution
If \(x_0 = 0\), then
\[x_1 = \frac{1}{2}(0) + 4 = 4\]
\[x_2 = \frac{1}{2}(4) + 4 = 6\]
\[x_3 = \frac{1}{2}(6) + 4 = 7\]
\[x_4 = \frac{1}{2}(7) + 4 = 7.5\]
\[x_5 = \frac{1}{2}(7.5) + 4 = 7.75\]
\[x_6 = \frac{1}{2}(7.75) + 4 = 7.875\]
\[x_7 = \frac{1}{2}(7.875) + 4 = 7.9375\]
\[x_8 = \frac{1}{2}(7.9375) + 4 = 7.96875\]
\[x_9 = \frac{1}{2}(7.96875) + 4 = 7.984375.\]

From this list, we conjecture that the values \(x_n\) approach 8.

Figure 4.8.6 provides a graphical argument that the values approach 8 as \(n \to \infty\). Starting at the point \((x_0, x_0)\), we draw a vertical line to the point \((x_0, F(x_0))\). The next number in our list is \(x_1 = F(x_0)\). We use \(x_1\) to calculate \(x_2\). Therefore, we draw a horizontal line connecting \((x_0, x_1)\) to the point \((x_1, x_1)\) on the line \(y = x\), and then draw a vertical line connecting \((x_1, x_1)\) to the point \((x_1, F(x_1))\). The output \(F(x_1)\) becomes \(x_2\). Continuing in this way, we could create an infinite number of line segments. These line segments are trapped between the lines \(F(x) = \frac{x}{2} + 4\) and \(y = x\). The line segments get closer to the intersection point of these two lines, which occurs when \(x = F(x)\). Solving the equation \(x = \frac{x}{2} + 4\), we conclude they intersect at \(x = 8\). Therefore, our graphical evidence agrees with our numerical evidence that the list of numbers \(x_0, x_1, x_2, \ldots\) approaches \(x^* = 8\) as \(n \to \infty\).

Figure 4.8.6: This iterative process approaches the value \(x^* = 8\).
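The numerical side of this example is easy to reproduce. The sketch below iterates \(F(x) = \frac{1}{2}x + 4\) from \(x_0 = 0\); the helper name `iterate` is an illustrative assumption rather than anything defined in the text.

```python
def iterate(F, x0, n):
    """Return [x0, x1, ..., xn] where x_k = F(x_{k-1})."""
    xs = [x0]
    for _ in range(n):
        xs.append(F(xs[-1]))
    return xs

F = lambda x: x / 2 + 4
xs = iterate(F, 0.0, 9)

print(xs[1:])           # 4.0, 6.0, 7.0, 7.5, ... as in the solution
print(abs(8 - xs[-1]))  # the gap to the fixed point halves at each step
print(F(8) == 8)        # 8 is a fixed point of F
```

Because \(8 - F(x) = \frac{1}{2}(8 - x)\), each step cuts the distance to 8 in half, which is consistent with the conjecture that the list approaches 8.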

 Exercise 4.8.4

Consider the function \(F(x) = \frac{1}{3}x + 6\). Let \(x_0 = 0\) and let \(x_n = F(x_{n-1})\) for \(n \ge 1\). Find \(x_1, x_2, x_3, x_4, x_5\). Make a conjecture about what happens to the list of numbers \(x_1, x_2, x_3, \ldots, x_n, \ldots\) as \(n \to \infty\).

Hint
Consider the point where the lines y = x and y = F (x) intersect.

Answer
\(x_1 = 6,\ x_2 = 8,\ x_3 = \frac{26}{3},\ x_4 = \frac{80}{9},\ x_5 = \frac{242}{27};\ x^* = 9\)

 Iterative Processes and Chaos

Iterative processes can yield some very interesting behavior. In this section, we have seen several examples of iterative
processes that converge to a fixed point. We also saw in Example 4.8.3 that the iterative process bounced back and forth
between two values. We call this kind of behavior a 2-cycle. Iterative processes can converge to cycles with various
periodicities, such as 2−cycles, 4−cycles (where the iterative process repeats a sequence of four values), 8-cycles, and so on.
Some iterative processes yield what mathematicians call chaos. In this case, the iterative process jumps from value to value in a
seemingly random fashion and never converges or settles into a cycle. Although a complete exploration of chaos is beyond the
scope of this text, in this project we look at one of the key properties of a chaotic iterative process: sensitive dependence on

initial conditions. This property refers to the concept that small changes in initial conditions can generate drastically different
behavior in the iterative process.
Probably the best-known example of chaos is the Mandelbrot set (see Figure 4.8.7), named after Benoit Mandelbrot (1924–
2010), who investigated its properties and helped popularize the field of chaos theory. The Mandelbrot set is usually generated
by computer and shows fascinating details on enlargement, including self-replication of the set. Several colorized versions of
the set have been shown in museums and can be found online and in popular books on the subject.

Figure 4.8.7 : The Mandelbrot set is a well-known example of a set of points generated by the iterative chaotic behavior of a
relatively simple function.
In this project we use the logistic map

f (x) = rx(1 − x)

where x ∈ [0, 1] and r > 0


as the function in our iterative process. The logistic map is a deceptively simple function; but, depending on the value of r, the
resulting iterative process displays some very interesting behavior. It can lead to fixed points, cycles, and even chaos.
To visualize the long-term behavior of the iterative process associated with the logistic map, we will use a tool called a cobweb
diagram. As we did with the iterative process we examined earlier in this section, we first draw a vertical line from the point
\((x_0, 0)\) to the point \((x_0, f(x_0)) = (x_0, x_1)\). We then draw a horizontal line from that point to the point \((x_1, x_1)\), then draw a vertical line to \((x_1, f(x_1)) = (x_1, x_2)\), and continue the process until the long-term behavior of the system becomes apparent. Figure 4.8.8 shows the long-term behavior of the logistic map when \(r = 3.55\) and \(x_0 = 0.2\). (The first 100 iterations are not plotted.) The long-term behavior of this iterative process is an 8-cycle.

Figure 4.8.8 : A cobweb diagram for f (x) = 3.55x(1 − x) is presented here. The sequence of values results in an 8-cycle.
1. Let \(r = 0.5\) and choose \(x_0 = 0.2\). Either by hand or by using a computer, calculate the first 10 values in the sequence. Does the sequence appear to converge? If so, to what value? Does it result in a cycle? If so, what kind of cycle (for example, 2-cycle, 4-cycle)?
2. What happens when r = 2 ?
3. For r = 3.2 and r = 3.5, calculate the first 100 sequence values. Generate a cobweb diagram for each iterative process.
(Several free applets are available online that generate cobweb diagrams for the logistic map.) What is the long-term
behavior in each of these cases?
4. Now let r = 4. Calculate the first 100 sequence values and generate a cobweb diagram. What is the long-term behavior in
this case?
5. Repeat the process for \(r = 4\), but let \(x_0 = 0.201\). How does this behavior compare with the behavior for \(x_0 = 0.2\)?
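For readers who want to explore these questions by computer, here is one possible starting point in Python. It is only a sketch: the function names `logistic_orbit` and `tail`, the number of discarded iterates, and the rounding to six decimal places are all assumptions made for illustration, and it prints values rather than drawing a cobweb diagram.

```python
def logistic_orbit(r, x0, n):
    """Return [x0, x1, ..., xn] for x_k = r * x_{k-1} * (1 - x_{k-1})."""
    xs = [x0]
    for _ in range(n):
        x = xs[-1]
        xs.append(r * x * (1 - x))
    return xs

def tail(r, x0, skip=1000, keep=8):
    """Discard `skip` early iterates, then return the next `keep` values (rounded)."""
    return [round(x, 6) for x in logistic_orbit(r, x0, skip + keep)[skip + 1:]]

# Distinct long-term values: one value suggests a fixed point, two a 2-cycle, etc.
for r in (0.5, 2, 3.2, 3.5):
    print(r, sorted(set(tail(r, 0.2))))

# Sensitive dependence for r = 4: compare two nearby starting values.
a = logistic_orbit(4, 0.2, 30)
b = logistic_orbit(4, 0.201, 30)
print(max(abs(p - q) for p, q in zip(a, b)))
```

Counting the distinct values that remain after the early iterates are discarded is a crude way to guess the period; it does not replace the cobweb diagrams requested in the project.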

Key Concepts
Newton’s method approximates roots of \(f(x) = 0\) by starting with an initial approximation \(x_0\), then uses tangent lines to the graph of \(f\) to create a sequence of approximations \(x_1, x_2, x_3, \ldots\).
Typically, Newton’s method is an efficient method for finding a particular root. In certain cases, Newton’s method fails to work because the list of numbers \(x_0, x_1, x_2, \ldots\) does not approach a finite value or it approaches a value other than the root sought.
Any process in which a list of numbers \(x_0, x_1, x_2, \ldots\) is generated by defining an initial number \(x_0\) and defining the subsequent numbers by the equation \(x_n = F(x_{n-1})\) for some function \(F\) is an iterative process. Newton’s method is an example of an iterative process, where the function \(F(x) = x - \frac{f(x)}{f'(x)}\) for a given function \(f\).

Glossary
iterative process
process in which a list of numbers \(x_0, x_1, x_2, x_3, \ldots\) is generated by starting with a number \(x_0\) and defining \(x_n = F(x_{n-1})\) for \(n \ge 1\)

Newton’s method
method for approximating roots of \(f(x) = 0\); using an initial guess \(x_0\), each subsequent approximation is defined by the equation \(x_n = x_{n-1} - \frac{f(x_{n-1})}{f'(x_{n-1})}\)

4.8: Newton's Method is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
4.9: Newton’s Method by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0. Original source:
https://openstax.org/details/books/calculus-volume-1.

4.9: Antiderivatives
At this point, we have seen how to calculate derivatives of many functions and have been introduced to a variety of their applications. We now
ask a question that turns this process around: Given a function f , how do we find a function with the derivative f and why would we be
interested in such a function?
We answer the first part of this question by defining antiderivatives. An antiderivative of a function f is a function whose derivative is f . Why are
we interested in antiderivatives? The need for antiderivatives arises in many situations, and we look at various examples throughout the
remainder of the text. Here we examine one specific example that involves rectilinear motion. In our examination in Derivatives of rectilinear
motion, we showed that given a position function s(t) of an object, then its velocity function v(t) is the derivative of s(t) —that is, v(t) = s'(t) .
Furthermore, the acceleration a(t) is the derivative of the velocity v(t); that is, a(t) = v'(t) = s''(t). Now suppose we are given an
acceleration function a , but not the velocity function v or the position function s . Since a(t) = v'(t) , determining the velocity function requires
us to find an antiderivative of the acceleration function. Then, since v(t) = s'(t), determining the position function requires us to find an
antiderivative of the velocity function. Rectilinear motion is just one case in which the need for antiderivatives arises. We will see many more
examples throughout the remainder of the text. For now, let’s look at the terminology and notation for antiderivatives, and determine the
antiderivatives for several types of functions. We examine various techniques for finding antiderivatives of more complicated functions later in
the text (Introduction to Techniques of Integration).

The Reverse of Differentiation


At this point, we know how to find derivatives of various functions. We now ask the opposite question. Given a function f , how can we find a
function with derivative f ? If we can find a function F with derivative f , we call F an antiderivative of f .
Definition: Antiderivative
A function F is an antiderivative of the function f if
F '(x) = f (x) (4.9.1)

for all x in the domain of f .

Consider the function \(f(x) = 2x\). Knowing the power rule of differentiation, we conclude that \(F(x) = x^2\) is an antiderivative of \(f\) since \(F'(x) = 2x\). Are there any other antiderivatives of \(f\)? Yes; since the derivative of any constant \(C\) is zero, \(x^2 + C\) is also an antiderivative of \(2x\). Therefore, \(x^2 + 5\) and \(x^2 - \sqrt{2}\) are also antiderivatives. Are there any others that are not of the form \(x^2 + C\) for some constant \(C\)? The answer is no. From Corollary 2 of the Mean Value Theorem, we know that if \(F\) and \(G\) are differentiable functions such that \(F'(x) = G'(x)\), then \(F(x) - G(x) = C\) for some constant \(C\). This fact leads to the following important theorem.

General Form of an Antiderivative


Let F be an antiderivative of f over an interval I . Then,
I. for each constant C , the function F (x) + C is also an antiderivative of f over I ;
II. if G is an antiderivative of f over I , there is a constant C for which G(x) = F (x) + C over I .
In other words, the most general form of the antiderivative of f over I is F (x) + C .

We use this fact and our knowledge of derivatives to find all the antiderivatives for several functions.

Example 4.9.1 : Finding Antiderivatives


For each of the following functions, find all antiderivatives.
a. \(f(x) = 3x^2\)
b. \(f(x) = \frac{1}{x}\)
c. \(f(x) = \cos x\)
d. \(f(x) = e^x\)

Solution:
a. Because
\[\frac{d}{dx}(x^3) = 3x^2,\]
\(F(x) = x^3\) is an antiderivative of \(3x^2\). Therefore, every antiderivative of \(3x^2\) is of the form \(x^3 + C\) for some constant \(C\), and every function of the form \(x^3 + C\) is an antiderivative of \(3x^2\).

b. Let f (x) = ln |x|. For x > 0, f (x) = ln(x) and

4.9.1 https://math.libretexts.org/@go/page/4469
\[\frac{d}{dx}(\ln x) = \frac{1}{x}.\]
For \(x < 0\), \(f(x) = \ln(-x)\) and
\[\frac{d}{dx}(\ln(-x)) = -\frac{1}{-x} = \frac{1}{x}.\]
Therefore,
\[\frac{d}{dx}(\ln|x|) = \frac{1}{x}.\]
Thus, \(F(x) = \ln|x|\) is an antiderivative of \(\frac{1}{x}\). Therefore, every antiderivative of \(\frac{1}{x}\) is of the form \(\ln|x| + C\) for some constant \(C\), and every function of the form \(\ln|x| + C\) is an antiderivative of \(\frac{1}{x}\).

c. We have
\[\frac{d}{dx}(\sin x) = \cos x,\]

so F (x) = sin x is an antiderivative of cos x. Therefore, every antiderivative of cos x is of the form sin x + C for some constant C and
every function of the form sin x + C is an antiderivative of cos x.
d. Since
\[\frac{d}{dx}(e^x) = e^x,\]
\(F(x) = e^x\) is an antiderivative of \(e^x\). Therefore, every antiderivative of \(e^x\) is of the form \(e^x + C\) for some constant \(C\), and every function of the form \(e^x + C\) is an antiderivative of \(e^x\).
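A claimed antiderivative can also be spot-checked numerically by comparing a difference quotient of \(F\) with \(f\) at a few points. The Python sketch below does this for the four functions in this example; the helper `derivative`, the step size, the sample points, and the particular constants added to each \(F\) are arbitrary illustrative choices, and such a check is evidence rather than a proof.

```python
import math

def derivative(F, x, h=1e-6):
    """Central-difference estimate of F'(x)."""
    return (F(x + h) - F(x - h)) / (2 * h)

# (antiderivative F with some constant C, integrand f, sample points in the domain)
pairs = [
    (lambda x: x**3 + 7,             lambda x: 3 * x**2, [-2.0, 0.5, 3.0]),
    (lambda x: math.log(abs(x)) - 1, lambda x: 1 / x,    [-2.0, 0.5, 3.0]),
    (lambda x: math.sin(x) + 2,      math.cos,           [-2.0, 0.5, 3.0]),
    (lambda x: math.exp(x) + 0.5,    math.exp,           [-2.0, 0.5, 3.0]),
]

for F, f, points in pairs:
    print([abs(derivative(F, x) - f(x)) < 1e-5 for x in points])
```

The added constants do not affect the result, which matches the statement that \(F(x) + C\) is an antiderivative for every constant \(C\).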

Exercise 4.9.1
Find all antiderivatives of f (x) = sin x .

Hint
What function has a derivative of sin x ?

Answer
− cos x + C

Indefinite Integrals
We now look at the formal notation used to represent antiderivatives and examine some of their properties. These properties allow us to find antiderivatives of more complicated functions. Given a function \(f\), we use the notation \(f'(x)\) or \(\frac{df}{dx}\) to denote the derivative of \(f\). Here we introduce notation for antiderivatives. If \(F\) is an antiderivative of \(f\), we say that \(F(x) + C\) is the most general antiderivative of \(f\) and write

∫ f (x)dx = F (x) + C . (4.9.2)

The symbol ∫ is called an integral sign, and ∫ f (x)dx is called the indefinite integral of f .
Definition: Indefinite Integrals
Given a function f , the indefinite integral of f , denoted

∫ f (x)dx, (4.9.3)

is the most general antiderivative of f . If F is an antiderivative of f , then

∫ f (x)dx = F (x) + C . (4.9.4)

The expression f (x) is called the integrand and the variable x is the variable of integration.

Given the terminology introduced in this definition, the act of finding the antiderivatives of a function f is usually referred to as integrating f .

For a function \(f\) and an antiderivative \(F\), the collection of functions \(F(x) + C\), where \(C\) is any real number, is often referred to as the family of antiderivatives of \(f\). For example, since \(x^2\) is an antiderivative of \(2x\) and any antiderivative of \(2x\) is of the form \(x^2 + C\), we write
\[\int 2x\,dx = x^2 + C. \tag{4.9.5}\]
The collection of all functions of the form \(x^2 + C\), where \(C\) is any real number, is known as the family of antiderivatives of \(2x\). Figure 4.9.1 shows a graph of this family of antiderivatives.

Figure 4.9.1: The family of antiderivatives of \(2x\) consists of all functions of the form \(x^2 + C\), where \(C\) is any real number.
For some functions, evaluating indefinite integrals follows directly from properties of derivatives. For example, for \(n \ne -1\),
\[\int x^n\,dx = \frac{x^{n+1}}{n+1} + C,\]
which comes directly from
\[\frac{d}{dx}\left(\frac{x^{n+1}}{n+1}\right) = (n+1)\frac{x^n}{n+1} = x^n.\]

This fact is known as the power rule for integrals.


Power Rule for Integrals
For \(n \ne -1\),
\[\int x^n\,dx = \frac{x^{n+1}}{n+1} + C. \tag{4.9.6}\]

Evaluating indefinite integrals for some other functions is also a straightforward calculation. The following table lists the indefinite integrals for
several common functions. A more complete list appears in Appendix B.
Table : Integration Formulas
Differentiation Formula | Indefinite Integral
\(\frac{d}{dx}(k) = 0\) | \(\int k\,dx = \int kx^0\,dx = kx + C\)
\(\frac{d}{dx}(x^n) = nx^{n-1}\) | \(\int x^n\,dx = \frac{x^{n+1}}{n+1} + C\) for \(n \ne -1\)
\(\frac{d}{dx}(\ln|x|) = \frac{1}{x}\) | \(\int \frac{1}{x}\,dx = \ln|x| + C\)
\(\frac{d}{dx}(e^x) = e^x\) | \(\int e^x\,dx = e^x + C\)
\(\frac{d}{dx}(\sin x) = \cos x\) | \(\int \cos x\,dx = \sin x + C\)
\(\frac{d}{dx}(\cos x) = -\sin x\) | \(\int \sin x\,dx = -\cos x + C\)
\(\frac{d}{dx}(\tan x) = \sec^2 x\) | \(\int \sec^2 x\,dx = \tan x + C\)
\(\frac{d}{dx}(\csc x) = -\csc x \cot x\) | \(\int \csc x \cot x\,dx = -\csc x + C\)
\(\frac{d}{dx}(\sec x) = \sec x \tan x\) | \(\int \sec x \tan x\,dx = \sec x + C\)
\(\frac{d}{dx}(\cot x) = -\csc^2 x\) | \(\int \csc^2 x\,dx = -\cot x + C\)
\(\frac{d}{dx}(\sin^{-1} x) = \frac{1}{\sqrt{1 - x^2}}\) | \(\int \frac{1}{\sqrt{1 - x^2}}\,dx = \sin^{-1} x + C\)
\(\frac{d}{dx}(\tan^{-1} x) = \frac{1}{1 + x^2}\) | \(\int \frac{1}{1 + x^2}\,dx = \tan^{-1} x + C\)
\(\frac{d}{dx}(\sec^{-1}|x|) = \frac{1}{x\sqrt{x^2 - 1}}\) | \(\int \frac{1}{x\sqrt{x^2 - 1}}\,dx = \sec^{-1}|x| + C\)

From the definition of indefinite integral of f , we know

∫ f (x)dx = F (x) + C (4.9.7)

if and only if F is an antiderivative of f . Therefore, when claiming that

∫ f (x)dx = F (x) + C (4.9.8)

it is important to check whether this statement is correct by verifying that F '(x) = f (x).

Example 4.9.2 : Verifying an Indefinite Integral


Each of the following statements is of the form \(\int f(x)\,dx = F(x) + C\). Verify that each statement is correct by showing that \(F'(x) = f(x)\).
a. \(\int (x + e^x)\,dx = \frac{x^2}{2} + e^x + C\)
b. \(\int xe^x\,dx = xe^x - e^x + C\)

Solution:
a. Since
\[\frac{d}{dx}\left(\frac{x^2}{2} + e^x + C\right) = x + e^x,\]
the statement
\[\int (x + e^x)\,dx = \frac{x^2}{2} + e^x + C\]
is correct.
Note that we are verifying an indefinite integral for a sum. Furthermore, \(\frac{x^2}{2}\) and \(e^x\) are antiderivatives of \(x\) and \(e^x\), respectively, and the sum of the antiderivatives is an antiderivative of the sum. We discuss this fact again later in this section.
b. Using the product rule, we see that
\[\frac{d}{dx}(xe^x - e^x + C) = e^x + xe^x - e^x = xe^x.\]
Therefore, the statement
\[\int xe^x\,dx = xe^x - e^x + C\]
is correct.

Note that we are verifying an indefinite integral for a product. The antiderivative \(xe^x - e^x\) is not a product of the antiderivatives. Furthermore, the product of antiderivatives, \(\frac{x^2e^x}{2}\), is not an antiderivative of \(xe^x\) since
\[\frac{d}{dx}\left(\frac{x^2e^x}{2}\right) = xe^x + \frac{x^2e^x}{2} \ne xe^x.\]
In general, the product of antiderivatives is not an antiderivative of a product.
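The contrast between the two candidates can be seen numerically at a single point. In the Python sketch below, the names `F_good` and `F_bad`, the step size, and the test point \(x = 1\) are illustrative assumptions; the difference quotient only approximates each derivative.

```python
import math

def derivative(F, x, h=1e-6):
    """Central-difference estimate of F'(x)."""
    return (F(x + h) - F(x - h)) / (2 * h)

f = lambda x: x * math.exp(x)                     # integrand
F_good = lambda x: x * math.exp(x) - math.exp(x)  # antiderivative verified above
F_bad = lambda x: x**2 * math.exp(x) / 2          # product of antiderivatives

x = 1.0
print(derivative(F_good, x) - f(x))  # close to 0
print(derivative(F_bad, x) - f(x))   # close to x**2 * e**x / 2, not 0
```

The leftover amount in the second line is the extra term \(\frac{x^2e^x}{2}\) from the product rule.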

Exercise 4.9.2
Verify that

∫ x cos x dx = x sin x + cos x + C .

Hint
Calculate \(\frac{d}{dx}(x \sin x + \cos x + C)\).

Answer
\(\frac{d}{dx}(x \sin x + \cos x + C) = \sin x + x \cos x - \sin x = x \cos x\)

In the table of integration formulas, we listed the indefinite integrals for many elementary functions. Let’s now turn our attention to evaluating indefinite integrals for more complicated functions. For example, consider finding an antiderivative of a sum \(f + g\). In Example 4.9.2a, we showed that an antiderivative of the sum \(x + e^x\) is given by the sum \(\frac{x^2}{2} + e^x\); that is, an antiderivative of a sum is given by a sum of antiderivatives. This result was not specific to this example. In general, if \(F\) and \(G\) are antiderivatives of any functions \(f\) and \(g\), respectively, then
\[\frac{d}{dx}(F(x) + G(x)) = F'(x) + G'(x) = f(x) + g(x).\]

Therefore, F (x) + G(x) is an antiderivative of f (x) + g(x) and we have

∫ (f (x) + g(x))dx = F (x) + G(x) + C .

Similarly,

∫ (f (x) − g(x))dx = F (x) − G(x) + C .

In addition, consider the task of finding an antiderivative of \(kf(x)\), where \(k\) is any real number. Since
\[\frac{d}{dx}(kF(x)) = k\frac{d}{dx}F(x) = kF'(x) = kf(x)\]

for any real number k , we conclude that

∫ kf (x)dx = kF (x) + C .

These properties are summarized next.


Properties of Indefinite Integrals
Let F and G be antiderivatives of f and g , respectively, and let k be any real number.
Sums and Differences
∫ (f (x) ± g(x))dx = F (x) ± G(x) + C

Constant Multiples
∫ kf (x)dx = kF (x) + C

From this theorem, we can evaluate any integral involving a sum, difference, or constant multiple of functions with antiderivatives that are
known. Evaluating integrals involving products, quotients, or compositions is more complicated (see Example 4.9.2b for an example involving an antiderivative of a product). We look at and address integrals involving these more complicated functions in Introduction to Integration. In the next example, we examine how to use this theorem to calculate the indefinite integrals of several functions.

Example 4.9.3 : Evaluating Indefinite Integrals


Evaluate each of the following indefinite integrals:
a. \(\int (5x^3 - 7x^2 + 3x + 4)\,dx\)
b. \(\int \frac{x^2 + 4\sqrt[3]{x}}{x}\,dx\)
c. \(\int \frac{4}{1 + x^2}\,dx\)
d. \(\int \tan x \cos x\,dx\)
Solution:
a. Using the sum and difference property of indefinite integrals, we can integrate each of the four terms in the integrand separately. We obtain
\[\int (5x^3 - 7x^2 + 3x + 4)\,dx = \int 5x^3\,dx - \int 7x^2\,dx + \int 3x\,dx + \int 4\,dx.\]
From the constant multiple property, each coefficient can be written in front of the integral sign, which gives
\[\int 5x^3\,dx - \int 7x^2\,dx + \int 3x\,dx + \int 4\,dx = 5\int x^3\,dx - 7\int x^2\,dx + 3\int x\,dx + 4\int 1\,dx.\]
Using the power rule for integrals, we conclude that
\[\int (5x^3 - 7x^2 + 3x + 4)\,dx = \frac{5}{4}x^4 - \frac{7}{3}x^3 + \frac{3}{2}x^2 + 4x + C.\]

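b. Rewrite the integrand as
\[\frac{x^2 + 4\sqrt[3]{x}}{x} = \frac{x^2}{x} + \frac{4\sqrt[3]{x}}{x} = x + \frac{4}{x^{2/3}}.\]
Then, to evaluate the integral, integrate each of these terms separately. Using the power rule, we have
\[
\begin{aligned}
\int \left(x + \frac{4}{x^{2/3}}\right)dx &= \int x\,dx + 4\int x^{-2/3}\,dx \\
&= \frac{1}{2}x^2 + 4 \cdot \frac{1}{\left(-\frac{2}{3}\right) + 1}\,x^{(-2/3)+1} + C \\
&= \frac{1}{2}x^2 + 12x^{1/3} + C.
\end{aligned}
\]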

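c. Using the constant multiple property of indefinite integrals, write the integral as
\[4\int \frac{1}{1 + x^2}\,dx.\]
Then, use the fact that \(\tan^{-1}(x)\) is an antiderivative of \(\frac{1}{1 + x^2}\) to conclude that
\[\int \frac{4}{1 + x^2}\,dx = 4\tan^{-1}(x) + C.\]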

d. Rewrite the integrand as
\[\tan x \cos x = \frac{\sin x}{\cos x}\cos x = \sin x.\]
Therefore,
\[\int \tan x \cos x\,dx = \int \sin x\,dx = -\cos x + C.\]

Exercise 4.9.3
Evaluate \(\int (4x^3 - 5x^2 + x - 7)\,dx\).

Hint
Integrate each term in the integrand separately, making use of the power rule.

Answer
\(x^4 - \frac{5}{3}x^3 + \frac{1}{2}x^2 - 7x + C\)

Initial-Value Problems
We look at techniques for integrating a large variety of functions involving products, quotients, and compositions later in the text. Here we turn
to one common use for antiderivatives that arises often in many applications: solving differential equations.
A differential equation is an equation that relates an unknown function and one or more of its derivatives. The equation
\[\frac{dy}{dx} = f(x)\]
is a simple example of a differential equation. Solving this equation means finding a function \(y\) with a derivative \(f\). Therefore, the solutions of this equation are the antiderivatives of \(f\). If \(F\) is one antiderivative of \(f\), every function of the form \(y = F(x) + C\) is a solution of that differential equation. For example, the solutions of
\[\frac{dy}{dx} = 6x^2\]
are given by
\[y = \int 6x^2\,dx = 2x^3 + C.\]
Sometimes we are interested in determining whether a particular solution curve passes through a certain point \((x_0, y_0)\); that is, \(y(x_0) = y_0\). The problem of finding a function \(y\) that satisfies a differential equation
\[\frac{dy}{dx} = f(x)\]
with the additional condition
\[y(x_0) = y_0\]
is an example of an initial-value problem. The condition \(y(x_0) = y_0\) is known as an initial condition. For example, looking for a function \(y\) that satisfies the differential equation
\[\frac{dy}{dx} = 6x^2\]
and the initial condition
\[y(1) = 5\]
is an example of an initial-value problem. Since the solutions of the differential equation are \(y = 2x^3 + C\), to find a function \(y\) that also satisfies the initial condition, we need to find \(C\) such that \(y(1) = 2(1)^3 + C = 5\). From this equation, we see that \(C = 3\), and we conclude that \(y = 2x^3 + 3\) is the solution of this initial-value problem as shown in the following graph.

Figure 4.9.2: Some of the solution curves of the differential equation \(\frac{dy}{dx} = 6x^2\) are displayed. The function \(y = 2x^3 + 3\) satisfies the differential equation and the initial condition \(y(1) = 5\).

Example 4.9.4 : Solving an Initial-Value Problem

Solve the initial-value problem
\[\frac{dy}{dx} = \sin x,\quad y(0) = 5. \tag{4.9.9}\]

Solution
First we need to solve the differential equation. If \(\frac{dy}{dx} = \sin x\), then
\[y = \int \sin(x)\,dx = -\cos x + C. \tag{4.9.10}\]
Next we need to look for a solution \(y\) that satisfies the initial condition. The initial condition \(y(0) = 5\) means we need a constant \(C\) such that \(-\cos(0) + C = 5\). Therefore,
\[C = 5 + \cos(0) = 6. \tag{4.9.11}\]
The solution of the initial-value problem is \(y = -\cos x + 6\).
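Once one antiderivative \(F\) is known, the initial condition \(y(x_0) = y_0\) fixes the constant as \(C = y_0 - F(x_0)\). The Python sketch below applies that one-line formula to two problems from this section; the function name `integration_constant` is a hypothetical helper introduced only for illustration.

```python
import math

def integration_constant(F, x0, y0):
    """Given one antiderivative F of f, return C so that y = F(x) + C has y(x0) = y0."""
    return y0 - F(x0)

C1 = integration_constant(lambda x: -math.cos(x), 0.0, 5.0)  # dy/dx = sin x, y(0) = 5
C2 = integration_constant(lambda x: 2 * x**3, 1.0, 5.0)      # dy/dx = 6x^2, y(1) = 5
print(C1, C2)
```

The two constants agree with the solutions \(y = -\cos x + 6\) and \(y = 2x^3 + 3\) found above.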

Exercise 4.9.4
Solve the initial value problem \(\frac{dy}{dx} = 3x^{-2},\ y(1) = 2\).

Hint
Find all antiderivatives of \(f(x) = 3x^{-2}\).

Answer
\(y = -\frac{3}{x} + 5\)

Initial-value problems arise in many applications. Next we consider a problem in which a driver applies the brakes in a car. We are interested in
how long it takes for the car to stop. Recall that the velocity function v(t) is the derivative of a position function s(t), and the acceleration a(t)
is the derivative of the velocity function. In earlier examples in the text, we could calculate the velocity from the position and then compute the
acceleration from the velocity. In the next example we work the other way around. Given an acceleration function, we calculate the velocity
function. We then use the velocity function to determine the position function.

Example 4.9.5 :
A car is traveling at the rate of 88 ft/sec (60 mph) when the brakes are applied. The car begins decelerating at a constant rate of 15 ft/sec².
a. How many seconds elapse before the car stops?
b. How far does the car travel during that time?
Solution
a. First we introduce variables for this problem. Let \(t\) be the time (in seconds) after the brakes are first applied. Let \(a(t)\) be the acceleration of the car (in feet per second squared) at time \(t\). Let \(v(t)\) be the velocity of the car (in feet per second) at time \(t\). Let \(s(t)\) be the car’s position (in feet) beyond the point where the brakes are applied at time \(t\).
The car is traveling at a rate of 88 ft/sec. Therefore, the initial velocity is \(v(0) = 88\) ft/sec. Since the car is decelerating, the acceleration is
\[a(t) = -15 \text{ ft/sec}^2.\]
The acceleration is the derivative of the velocity,
\[v'(t) = -15.\]

Therefore, we have an initial-value problem to solve:


v'(t) = −15, v(0) = 88.

Integrating, we find that


v(t) = −15t + C .

Since v(0) = 88, C = 88. Thus, the velocity function is


v(t) = −15t + 88.

To find how long it takes for the car to stop, we need to find the time \(t\) such that the velocity is zero. Solving \(-15t + 88 = 0\), we obtain \(t = \frac{88}{15}\) sec.
b. To find how far the car travels during this time, we need to find the position of the car after \(\frac{88}{15}\) sec. We know the velocity \(v(t)\) is the derivative of the position \(s(t)\). Consider the initial position to be \(s(0) = 0\). Therefore, we need to solve the initial-value problem
\[s'(t) = -15t + 88,\quad s(0) = 0.\]
Integrating, we have
\[s(t) = -\frac{15}{2}t^2 + 88t + C.\]
Since \(s(0) = 0\), the constant is \(C = 0\). Therefore, the position function is
\[s(t) = -\frac{15}{2}t^2 + 88t.\]
After \(t = \frac{88}{15}\) sec, the position is \(s\left(\frac{88}{15}\right) \approx 258.133\) ft.
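The same two steps (solve \(v(t) = 0\), then evaluate \(s\) at that time) can be packaged for any starting speed. The Python sketch below uses exact fractions; the function name `stopping` and its integer-valued arguments are assumptions introduced for illustration.

```python
from fractions import Fraction

def stopping(v0, decel):
    """Return (time, distance) for constant deceleration `decel` from speed v0.

    Uses v(t) = v0 - decel*t and s(t) = v0*t - decel*t**2/2 with s(0) = 0.
    Both arguments are expected to be integers so that Fraction stays exact.
    """
    t_stop = Fraction(v0, decel)  # solves v(t) = 0
    distance = v0 * t_stop - Fraction(decel, 2) * t_stop**2
    return t_stop, distance

t, d = stopping(88, 15)
print(t, float(t))  # stopping time in seconds
print(d, float(d))  # stopping distance in feet
```

Calling `stopping(44, 15)` handles the variation in which the car starts at 44 ft/sec.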

Exercise 4.9.5
Suppose the car is traveling at the rate of 44 ft/sec. How long does it take for the car to stop? How far will the car travel?

Hint
v(t) = −15t + 44.

Answer
2.93 sec, 64.5 ft

Key Concepts
If F is an antiderivative of f , then every antiderivative of f is of the form F (x) + C for some constant C .
Solving the initial-value problem
\[\frac{dy}{dx} = f(x),\quad y(x_0) = y_0\]
requires us first to find the set of antiderivatives of \(f\) and then to look for the particular antiderivative that also satisfies the initial condition.

Glossary
antiderivative
a function F such that F '(x) = f (x) for all x in the domain of f is an antiderivative of f

indefinite integral
the most general antiderivative of f (x) is the indefinite integral of f ; we use the notation ∫ f (x)dx to denote the indefinite integral of f

initial value problem
a problem that requires finding a function \(y\) that satisfies the differential equation \(\frac{dy}{dx} = f(x)\) together with the initial condition \(y(x_0) = y_0\)

Contributors
Gilbert Strang (MIT) and Edwin “Jed” Herman (Harvey Mudd) with many contributing authors. This content by OpenStax is licensed with a
CC-BY-SA-NC 4.0 license. Download for free at http://cnx.org.

4.9: Antiderivatives is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.

CHAPTER OVERVIEW

5: Integrals
A general Calculus Textmap organized around the textbook

Calculus: Early Transcendentals


by James Stewart


Topic hierarchy
5.1: Areas and Distances
5.2: The Definite Integral
5.3: The Fundamental Theorem of Calculus
5.4: Indefinite Integrals and the Net Change Theorem
5.5: The Substitution Rule

5: Integrals is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.

5.1: Areas and Distances
 Learning Objectives
Use sigma (summation) notation to calculate sums and powers of integers.
Use the sum of rectangular areas to approximate the area under a curve.
Use Riemann sums to approximate area.

Archimedes was fascinated with calculating the areas of various shapes—in other words, the amount of space enclosed by the
shape. He used a process that has come to be known as the method of exhaustion, which used smaller and smaller shapes, the
areas of which could be calculated exactly, to fill an irregular region and thereby obtain closer and closer approximations to the
total area. In this process, an area bounded by curves is filled with rectangles, triangles, and shapes with exact area formulas. These
areas are then summed to approximate the area of the curved region.
In this section, we develop techniques to approximate the area between a curve, defined by a function f (x), and the x-axis on a
closed interval [a, b]. Like Archimedes, we first approximate the area under the curve using shapes of known area (namely,
rectangles). By using smaller and smaller rectangles, we get closer and closer approximations to the area. Taking a limit allows us
to calculate the exact area under the curve.
Let’s start by introducing some notation to make the calculations easier. We then consider the case when f (x) is continuous and
nonnegative. Later in the chapter, we relax some of these restrictions and develop techniques that apply in more general cases.

Sigma (Summation) Notation


As mentioned, we will use shapes of known area to approximate the area of an irregular region bounded by curves. This process
often requires adding up long strings of numbers. To make it easier to write down these lengthy sums, we look at some new
notation here, called sigma notation (also known as summation notation). The Greek capital letter Σ, sigma, is used to express
long sums of values in a compact form. For example, if we want to add all the integers from 1 to 20 without sigma notation, we
have to write

1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10 + 11 + 12 + 13 + 14 + 15 + 16 + 17 + 18 + 19 + 20.

We could probably skip writing a couple of terms and write

1 + 2 + 3 + 4 + ⋯ + 19 + 20,

which is better, but still cumbersome. With sigma notation, we write this sum as
\[\sum_{i=1}^{20} i,\]
which is much more compact. Typically, sigma notation is presented in the form
\[\sum_{i=1}^{n} a_i\]
where \(a_i\) describes the terms to be added, and the \(i\) is called the index. Each term is evaluated, then we sum all the values, beginning with the value when \(i = 1\) and ending with the value when \(i = n\). For example, an expression like \(\sum_{i=2}^{7} s_i\) is interpreted as \(s_2 + s_3 + s_4 + s_5 + s_6 + s_7\). Note that the index is used only to keep track of the terms to be added; it does not factor into the calculation of the sum itself. The index is therefore called a dummy variable. We can use any letter we like for the index. Typically, mathematicians use \(i, j, k, m,\) and \(n\) for indices.
Let’s try a couple of examples of using sigma notation.

5.1.1 https://math.libretexts.org/@go/page/4471
 Example 5.1.1: Using Sigma Notation
a. Write in sigma notation and evaluate the sum of terms \(3^i\) for \(i = 1, 2, 3, 4, 5\).
b. Write the sum in sigma notation:
\[1 + \frac{1}{4} + \frac{1}{9} + \frac{1}{16} + \frac{1}{25}.\]

Solution
a. Write
\[\sum_{i=1}^{5} 3^i = 3 + 3^2 + 3^3 + 3^4 + 3^5 = 363.\]
b. The denominator of each term is a perfect square. Using sigma notation, this sum can be written as \(\sum_{i=1}^{5} \frac{1}{i^2}\).
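Sigma notation maps directly onto a loop over the index. As an illustration, the Python sketch below evaluates both sums from this example; the variable names are arbitrary, and note that `range(1, 6)` runs over \(i = 1, \ldots, 5\) because Python excludes the upper endpoint.

```python
from fractions import Fraction

# sum_{i=1}^{5} 3^i : range(1, 6) runs over i = 1, 2, 3, 4, 5
total_a = sum(3**i for i in range(1, 6))

# sum_{i=1}^{5} 1/i^2, kept exact with Fraction
total_b = sum(Fraction(1, i**2) for i in range(1, 6))

print(total_a)  # 363
print(total_b)  # 5269/3600
```

Renaming the loop variable from `i` to any other letter leaves both totals unchanged, which is what it means for the index to be a dummy variable.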

 Exercise 5.1.1

Write in sigma notation and evaluate the sum of terms \(2^i\) for \(i = 3, 4, 5, 6\).

Hint
Use the solving steps in Example 5.1.1 as a guide.

Answer
\(\sum_{i=3}^{6} 2^i = 2^3 + 2^4 + 2^5 + 2^6 = 120\)

The properties associated with the summation process are given in the following rule.

 Rule: Properties of Sigma Notation

Let \(a_1, a_2, \ldots, a_n\) and \(b_1, b_2, \ldots, b_n\) represent two sequences of terms and let \(c\) be a constant. The following properties hold for all positive integers \(n\) and for integers \(m\), with \(1 \le m \le n\).
i. \(\sum_{i=1}^{n} c = nc\)
ii. \(\sum_{i=1}^{n} ca_i = c\sum_{i=1}^{n} a_i\)
iii. \(\sum_{i=1}^{n} (a_i + b_i) = \sum_{i=1}^{n} a_i + \sum_{i=1}^{n} b_i\)
iv. \(\sum_{i=1}^{n} (a_i - b_i) = \sum_{i=1}^{n} a_i - \sum_{i=1}^{n} b_i\)
v. \(\sum_{i=1}^{n} a_i = \sum_{i=1}^{m} a_i + \sum_{i=m+1}^{n} a_i\)

 Proof
We prove properties (ii.) and (iii.) here, and leave proof of the other properties to the Exercises.
(ii.) We have
\[\sum_{i=1}^{n} ca_i = ca_1 + ca_2 + ca_3 + \cdots + ca_n = c(a_1 + a_2 + a_3 + \cdots + a_n) = c\sum_{i=1}^{n} a_i.\]
(iii.) We have
\begin{align}
\sum_{i=1}^{n} (a_i + b_i) &= (a_1 + b_1) + (a_2 + b_2) + (a_3 + b_3) + \cdots + (a_n + b_n) \tag{5.1.1} \\
&= (a_1 + a_2 + a_3 + \cdots + a_n) + (b_1 + b_2 + b_3 + \cdots + b_n) \tag{5.1.2} \\
&= \sum_{i=1}^{n} a_i + \sum_{i=1}^{n} b_i. \tag{5.1.3}
\end{align}

A few more formulas for frequently found functions simplify the summation process further. These are shown in the next rule, for
sums and powers of integers, and we use them in the next set of examples.

 Rule: Sums and Powers of Integers

1. The sum of the first \(n\) positive integers is given by

\[\sum_{i=1}^{n} i = 1 + 2 + \cdots + n = \frac{n(n+1)}{2}. \tag{5.1.4}\]

2. The sum of the squares of the first \(n\) positive integers is given by

\[\sum_{i=1}^{n} i^2 = 1^2 + 2^2 + \cdots + n^2 = \frac{n(n+1)(2n+1)}{6}. \tag{5.1.5}\]

3. The sum of the cubes of the first \(n\) positive integers is given by

\[\sum_{i=1}^{n} i^3 = 1^3 + 2^3 + \cdots + n^3 = \frac{n^2(n+1)^2}{4}. \tag{5.1.6}\]
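The three closed forms can be spot-checked against brute-force addition. This is a minimal sketch, not a proof; the function names `sum_first_powers` and `closed_form` are made up for the illustration.

```python
def sum_first_powers(n, p):
    """Brute-force sum 1^p + 2^p + ... + n^p."""
    return sum(i**p for i in range(1, n + 1))


def closed_form(n, p):
    """Closed forms from Equations 5.1.4 to 5.1.6 (p = 1, 2, 3 only)."""
    if p == 1:
        return n * (n + 1) // 2
    if p == 2:
        return n * (n + 1) * (2 * n + 1) // 6
    if p == 3:
        return n**2 * (n + 1)**2 // 4
    raise ValueError("only p = 1, 2, 3 are covered")


# Compare both computations for a few sample values of n.
for n in (1, 10, 200):
    for p in (1, 2, 3):
        assert sum_first_powers(n, p) == closed_form(n, p)
```

Integer division is safe here because each numerator is always divisible by its denominator.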

 Example 5.1.2: Evaluation Using Sigma Notation

Write using sigma notation and evaluate:

a. The sum of the terms \((i - 3)^2\) for \(i = 1, 2, \ldots, 200\).

b. The sum of the terms \((i^3 - i^2)\) for \(i = 1, 2, 3, 4, 5, 6\).

Solution
a. Multiplying out \((i - 3)^2\), we can break the expression into three terms.

\begin{align*}
\sum_{i=1}^{200} (i - 3)^2 &= \sum_{i=1}^{200} (i^2 - 6i + 9) \\
&= \sum_{i=1}^{200} i^2 - \sum_{i=1}^{200} 6i + \sum_{i=1}^{200} 9 \\
&= \sum_{i=1}^{200} i^2 - 6 \sum_{i=1}^{200} i + \sum_{i=1}^{200} 9 \\
&= \frac{200(200 + 1)(400 + 1)}{6} - 6\left[\frac{200(200 + 1)}{2}\right] + 9(200) \\
&= 2,686,700 - 120,600 + 1800 \\
&= 2,567,900
\end{align*}

b. Use sigma notation property iv. and the rules for the sum of squared terms and the sum of cubed terms.

\begin{align*}
\sum_{i=1}^{6} (i^3 - i^2) &= \sum_{i=1}^{6} i^3 - \sum_{i=1}^{6} i^2 \\
&= \frac{6^2 (6 + 1)^2}{4} - \frac{6(6 + 1)(2(6) + 1)}{6} \\
&= \frac{1764}{4} - \frac{546}{6} \\
&= 350
\end{align*}
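Both results in Example 5.1.2 can be confirmed by adding the terms directly. The following sketch (variable names are illustrative) also shows that splitting the sum with the sigma-notation properties gives the same total as the direct sum.

```python
# Part a: direct sum versus the split into three simpler sums.
direct_a = sum((i - 3)**2 for i in range(1, 201))
split_a = (sum(i**2 for i in range(1, 201))
           - 6 * sum(i for i in range(1, 201))
           + 9 * 200)

# Part b: sum of i^3 - i^2 for i = 1, ..., 6.
direct_b = sum(i**3 - i**2 for i in range(1, 7))

print(direct_a, split_a, direct_b)  # 2567900 2567900 350
```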

 Exercise 5.1.2

Find the sum of the values of 4 + 3i for i = 1, 2, … , 100.

Hint
Use the properties of sigma notation to solve the problem.

Answer
15,550

 Example 5.1.3: Finding the Sum of the Function Values

Find the sum of the values of \(f(x) = x^3\) over the integers \(1, 2, 3, \ldots, 10\).

Solution
Using Equation 5.1.6, we have

\[\sum_{i=1}^{10} i^3 = \frac{(10)^2 (10 + 1)^2}{4} = \frac{100(121)}{4} = 3025.\]

 Exercise 5.1.3
Evaluate the sum indicated by the notation \(\sum_{k=1}^{20} (2k + 1)\).

Hint
Use the rule on sum and powers of integers (Equations 5.1.4-5.1.6).

Answer
440

Approximating Area
Now that we have the necessary notation, we return to the problem at hand: approximating the area under a curve. Let f (x) be a
continuous, nonnegative function defined on the closed interval [a, b]. We want to approximate the area A bounded by f (x) above,
the x-axis below, the line x = a on the left, and the line x = b on the right (Figure 5.1.1).

Figure 5.1.1 : An area (shaded region) bounded by the curve f (x) at top, the x -axis at bottom, the line x = a to the left, and the
line x = b at right.
How do we approximate the area under this curve? The approach is a geometric one. By dividing a region into many small shapes
that have known area formulas, we can sum these areas and obtain a reasonable estimate of the true area. We begin by dividing the interval \([a, b]\) into \(n\) subintervals of equal width, \(\frac{b - a}{n}\). We do this by selecting equally spaced points \(x_0, x_1, x_2, \ldots, x_n\) with \(x_0 = a\), \(x_n = b\), and

\[x_i - x_{i-1} = \frac{b - a}{n}\]

for \(i = 1, 2, 3, \ldots, n\).
We denote the width of each subinterval with the notation \(\Delta x\), so \(\Delta x = \frac{b - a}{n}\) and

\[x_i = x_0 + i\,\Delta x\]

for \(i = 1, 2, 3, \ldots, n\). This notion of dividing an interval \([a, b]\) into subintervals by selecting points from within the interval is used quite often in approximating the area under a curve, so let’s define some relevant terminology.

 Definition: Partitions

A set of points \(P = \{x_i\}\) for \(i = 0, 1, 2, \ldots, n\) with \(a = x_0 < x_1 < x_2 < \cdots < x_n = b\), which divides the interval \([a, b]\) into subintervals of the form \([x_0, x_1], [x_1, x_2], \ldots, [x_{n-1}, x_n]\) is called a partition of \([a, b]\). If the subintervals all have the same width, the set of points forms a regular partition (or uniform partition) of the interval \([a, b]\).

We can use this regular partition as the basis of a method for estimating the area under the curve. We next examine two methods:
the left-endpoint approximation and the right-endpoint approximation.

 Rule: Left-Endpoint Approximation

On each subinterval \([x_{i-1}, x_i]\) (for \(i = 1, 2, 3, \ldots, n\)), construct a rectangle with width \(\Delta x\) and height equal to \(f(x_{i-1})\), which is the function value at the left endpoint of the subinterval. Then the area of this rectangle is \(f(x_{i-1})\Delta x\). Adding the areas of all these rectangles, we get an approximate value for \(A\) (Figure 5.1.2). We use the notation \(L_n\) to denote that this is a left-endpoint approximation of \(A\) using \(n\) subintervals.

\[A \approx L_n = f(x_0)\Delta x + f(x_1)\Delta x + \cdots + f(x_{n-1})\Delta x = \sum_{i=1}^{n} f(x_{i-1})\Delta x\]

Figure 5.1.2 : In the left-endpoint approximation of area under a curve, the height of each rectangle is determined by the
function value at the left of each subinterval.

The second method for approximating area under a curve is the right-endpoint approximation. It is almost the same as the left-
endpoint approximation, but now the heights of the rectangles are determined by the function values at the right of each
subinterval.

 Rule: Right-Endpoint Approximation

Construct a rectangle on each subinterval \([x_{i-1}, x_i]\), only this time the height of the rectangle is determined by the function value \(f(x_i)\) at the right endpoint of the subinterval. Then, the area of each rectangle is \(f(x_i)\Delta x\) and the approximation for \(A\) is given by

\[A \approx R_n = f(x_1)\Delta x + f(x_2)\Delta x + \cdots + f(x_n)\Delta x = \sum_{i=1}^{n} f(x_i)\Delta x.\]

The notation \(R_n\) indicates this is a right-endpoint approximation for \(A\) (Figure 5.1.3).

Figure 5.1.3 : In the right-endpoint approximation of area under a curve, the height of each rectangle is determined by the
function value at the right of each subinterval. Note that the right-endpoint approximation differs from the left-endpoint
approximation in Figure 5.1.2 .

The graphs in Figure 5.1.4 represent the curve \(f(x) = \frac{x^2}{2}\). In Figure 5.1.4a we divide the region represented by the interval \([0, 3]\) into six subintervals, each of width 0.5. Thus, \(\Delta x = 0.5\). We then form six rectangles by drawing vertical lines at \(x_{i-1}\), the left endpoint of each subinterval. We determine the height of each rectangle by calculating \(f(x_{i-1})\) for \(i = 1, 2, 3, 4, 5, 6\). The intervals are \([0, 0.5], [0.5, 1], [1, 1.5], [1.5, 2], [2, 2.5], [2.5, 3]\). We find the area of each rectangle by multiplying the height by the width. Then, the sum of the rectangular areas approximates the area between \(f(x)\) and the x-axis. When the left endpoints are used to calculate height, we have a left-endpoint approximation. Thus,

\begin{align*}
A \approx L_6 &= \sum_{i=1}^{6} f(x_{i-1})\Delta x = f(x_0)\Delta x + f(x_1)\Delta x + f(x_2)\Delta x + f(x_3)\Delta x + f(x_4)\Delta x + f(x_5)\Delta x \\
&= f(0)0.5 + f(0.5)0.5 + f(1)0.5 + f(1.5)0.5 + f(2)0.5 + f(2.5)0.5 \\
&= (0)0.5 + (0.125)0.5 + (0.5)0.5 + (1.125)0.5 + (2)0.5 + (3.125)0.5 \\
&= 0 + 0.0625 + 0.25 + 0.5625 + 1 + 1.5625 \\
&= 3.4375 \text{ units}^2
\end{align*}

Figure 5.1.4 : Methods of approximating the area under a curve by using (a) the left endpoints and (b) the right endpoints.
In Figure 5.1.4b, we draw vertical lines at \(x_i\), where \(x_i\) is the right endpoint of each subinterval, and calculate \(f(x_i)\) for \(i = 1, 2, 3, 4, 5, 6\). We multiply each \(f(x_i)\) by \(\Delta x\) to find the rectangular areas, and then add them. This is a right-endpoint approximation of the area under \(f(x)\). Thus,

\begin{align*}
A \approx R_6 &= \sum_{i=1}^{6} f(x_i)\Delta x = f(x_1)\Delta x + f(x_2)\Delta x + f(x_3)\Delta x + f(x_4)\Delta x + f(x_5)\Delta x + f(x_6)\Delta x \\
&= f(0.5)0.5 + f(1)0.5 + f(1.5)0.5 + f(2)0.5 + f(2.5)0.5 + f(3)0.5 \\
&= (0.125)0.5 + (0.5)0.5 + (1.125)0.5 + (2)0.5 + (3.125)0.5 + (4.5)0.5 \\
&= 0.0625 + 0.25 + 0.5625 + 1 + 1.5625 + 2.25 \\
&= 5.6875 \text{ units}^2.
\end{align*}
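The two endpoint rules translate into a few lines of code. The sketch below assumes a regular partition of \([a, b]\); the helper names `left_endpoint` and `right_endpoint` are illustrative and not taken from the text. It reproduces \(L_6\) and \(R_6\) for \(f(x) = x^2/2\) on \([0, 3]\).

```python
def left_endpoint(f, a, b, n):
    """L_n: sum of f(x_{i-1}) * dx for i = 1, ..., n on a regular partition."""
    dx = (b - a) / n
    return sum(f(a + (i - 1) * dx) for i in range(1, n + 1)) * dx


def right_endpoint(f, a, b, n):
    """R_n: sum of f(x_i) * dx for i = 1, ..., n on a regular partition."""
    dx = (b - a) / n
    return sum(f(a + i * dx) for i in range(1, n + 1)) * dx


def f(x):
    return x**2 / 2


print(left_endpoint(f, 0, 3, 6))   # 3.4375
print(right_endpoint(f, 0, 3, 6))  # 5.6875
```

The only difference between the two functions is which endpoint of each subinterval supplies the rectangle height.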

 Example 5.1.4: Approximating the Area Under a Curve

Use both left-endpoint and right-endpoint approximations to approximate the area under the curve of \(f(x) = x^2\) on the interval \([0, 2]\); use \(n = 4\).

Solution
First, divide the interval \([0, 2]\) into \(n\) equal subintervals. Using \(n = 4\), \(\Delta x = \frac{(2 - 0)}{4} = 0.5\). This is the width of each rectangle. The intervals \([0, 0.5], [0.5, 1], [1, 1.5], [1.5, 2]\) are shown in Figure 5.1.5. Using a left-endpoint approximation, the heights are \(f(0) = 0\), \(f(0.5) = 0.25\), \(f(1) = 1\), and \(f(1.5) = 2.25\). Then,

\begin{align*}
L_4 &= f(x_0)\Delta x + f(x_1)\Delta x + f(x_2)\Delta x + f(x_3)\Delta x \\
&= 0(0.5) + 0.25(0.5) + 1(0.5) + 2.25(0.5) \\
&= 1.75 \text{ units}^2
\end{align*}

Figure 5.1.5: The graph shows the left-endpoint approximation of the area under \(f(x) = x^2\) from 0 to 2.

The right-endpoint approximation is shown in Figure 5.1.6. The intervals are the same, \(\Delta x = 0.5\), but now use the right endpoint to calculate the height of the rectangles. We have

\begin{align*}
R_4 &= f(x_1)\Delta x + f(x_2)\Delta x + f(x_3)\Delta x + f(x_4)\Delta x \\
&= 0.25(0.5) + 1(0.5) + 2.25(0.5) + 4(0.5) \\
&= 3.75 \text{ units}^2
\end{align*}

Figure 5.1.6: The graph shows the right-endpoint approximation of the area under \(f(x) = x^2\) from 0 to 2.

The left-endpoint approximation is \(1.75 \text{ units}^2\); the right-endpoint approximation is \(3.75 \text{ units}^2\).

 Exercise 5.1.4
Sketch left-endpoint and right-endpoint approximations for \(f(x) = \frac{1}{x}\) on \([1, 2]\); use \(n = 4\). Approximate the area using both methods.

Hint
Follow the solving strategy in Example 5.1.4 step-by-step.

Answer
The left-endpoint approximation is \(0.7595 \text{ units}^2\). The right-endpoint approximation is \(0.6345 \text{ units}^2\).

Looking at Figure 5.1.4 and the graphs in Example 5.1.4, we can see that when we use a small number of intervals, neither the left-
endpoint approximation nor the right-endpoint approximation is a particularly accurate estimate of the area under the curve.
However, it seems logical that if we increase the number of points in our partition, our estimate of A will improve. We will have
more rectangles, but each rectangle will be thinner, so we will be able to fit the rectangles to the curve more precisely.
We can demonstrate the improved approximation obtained through smaller intervals with an example. Let’s explore the idea of
increasing n , first in a left-endpoint approximation with four rectangles, then eight rectangles, and finally 32 rectangles. Then, let’s
do the same thing in a right-endpoint approximation, using the same sets of intervals, of the same curved region. Figure 5.1.7
shows the area of the region under the curve \(f(x) = (x - 1)^3 + 4\) on the interval \([0, 2]\) using a left-endpoint approximation where \(n = 4\). The width of each rectangle is

\[\Delta x = \frac{2 - 0}{4} = \frac{1}{2}.\]

The area is approximated by the summed areas of the rectangles, or

\[L_4 = f(0)(0.5) + f(0.5)(0.5) + f(1)(0.5) + f(1.5)(0.5) = 7.5 \text{ units}^2\]

Figure 5.1.7 : With a left-endpoint approximation and dividing the region from a to b into four equal intervals, the area under the
curve is approximately equal to the sum of the areas of the rectangles.
Figure 5.1.8 shows the same curve divided into eight subintervals. Comparing the graph with four rectangles in Figure 5.1.7 with
this graph with eight rectangles, we can see there appears to be less white space under the curve when n = 8. This white space is
area under the curve we are unable to include using our approximation. The area of the rectangles is

\begin{align*}
L_8 &= f(0)(0.25) + f(0.25)(0.25) + f(0.5)(0.25) + f(0.75)(0.25) + f(1)(0.25) + f(1.25)(0.25) + f(1.5)(0.25) \\
&\quad + f(1.75)(0.25) = 7.75 \text{ units}^2
\end{align*}

Figure 5.1.8 : The region under the curve is divided into n = 8 rectangular areas of equal width for a left-endpoint approximation.
The graph in Figure 5.1.9 shows the same function with 32 rectangles inscribed under the curve. There appears to be little white
space left. The area occupied by the rectangles is

\[L_{32} = f(0)(0.0625) + f(0.0625)(0.0625) + f(0.125)(0.0625) + \cdots + f(1.9375)(0.0625) = 7.9375 \text{ units}^2.\]

Figure 5.1.9 : Here, 32 rectangles are inscribed under the curve for a left-endpoint approximation.
We can carry out a similar process for the right-endpoint approximation method. A right-endpoint approximation of the same
curve, using four rectangles (Figure 5.1.10), yields an area

\[R_4 = f(0.5)(0.5) + f(1)(0.5) + f(1.5)(0.5) + f(2)(0.5) = 8.5 \text{ units}^2.\]

Figure 5.1.10 : Now we divide the area under the curve into four equal subintervals for a right-endpoint approximation.

Dividing the region over the interval \([0, 2]\) into eight rectangles results in \(\Delta x = \frac{2 - 0}{8} = 0.25\). The graph is shown in Figure 5.1.11. The area is

\begin{align*}
R_8 &= f(0.25)(0.25) + f(0.5)(0.25) + f(0.75)(0.25) + f(1)(0.25) + f(1.25)(0.25) + f(1.5)(0.25) + f(1.75)(0.25) \\
&\quad + f(2)(0.25) = 8.25 \text{ units}^2
\end{align*}

Figure 5.1.11 : Here we use right-endpoint approximation for a region divided into eight equal subintervals.
Last, the right-endpoint approximation with \(n = 32\) is close to the actual area (Figure 5.1.12). The area is approximately

\[R_{32} = f(0.0625)(0.0625) + f(0.125)(0.0625) + f(0.1875)(0.0625) + \cdots + f(2)(0.0625) = 8.0625 \text{ units}^2\]

Figure 5.1.12 : The region is divided into 32 equal subintervals for a right-endpoint approximation.
Based on these figures and calculations, it appears we are on the right track; the rectangles appear to approximate the area under the
curve better as n gets larger. Furthermore, as n increases, both the left-endpoint and right-endpoint approximations appear to
approach an area of 8 square units. Table 5.1.15 shows a numerical comparison of the left- and right-endpoint methods. The idea
that the approximations of the area under the curve get better and better as n gets larger and larger is very important, and we now
explore this idea in more detail.
Table 5.1.15: Converging Values of Left- and Right-Endpoint Approximations as \(n\) Increases

Value of n    Approximate Area \(L_n\)    Approximate Area \(R_n\)
n = 4         7.5                         8.5
n = 8         7.75                        8.25
n = 32        7.94                        8.06
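The convergence in the table can be reproduced numerically. The sketch below assumes the same function \(f(x) = (x - 1)^3 + 4\) on \([0, 2]\); the helper `endpoint_sums` is an illustrative name, and the extra value \(n = 1000\) is added here only to show the trend continuing.

```python
def f(x):
    return (x - 1)**3 + 4


def endpoint_sums(f, a, b, n):
    """Return (L_n, R_n) on a regular partition of [a, b]."""
    dx = (b - a) / n
    xs = [a + i * dx for i in range(n + 1)]   # x_0, x_1, ..., x_n
    left = sum(f(x) for x in xs[:-1]) * dx    # heights at x_0 .. x_{n-1}
    right = sum(f(x) for x in xs[1:]) * dx    # heights at x_1 .. x_n
    return left, right


for n in (4, 8, 32, 1000):
    left, right = endpoint_sums(f, 0, 2, n)
    print(n, round(left, 4), round(right, 4))
```

For this increasing function the left sums rise toward 8 and the right sums fall toward 8, and the gap between them shrinks in proportion to \(1/n\).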

Forming Riemann Sums


So far we have been using rectangles to approximate the area under a curve. The heights of these rectangles have been determined
by evaluating the function at either the right or left endpoints of the subinterval \([x_{i-1}, x_i]\). In reality, there is no reason to restrict evaluation of the function to one of these two points only. We could evaluate the function at any point \(x_i^*\) in the subinterval \([x_{i-1}, x_i]\), and use \(f(x_i^*)\) as the height of our rectangle. This gives us an estimate for the area of the form

\[A \approx \sum_{i=1}^{n} f(x_i^*)\,\Delta x.\]

A sum of this form is called a Riemann sum, named for the 19th-century mathematician Bernhard Riemann, who developed the
idea.

 Definition: Riemann sum

Let \(f(x)\) be defined on a closed interval \([a, b]\) and let \(P\) be any partition of \([a, b]\). Let \(\Delta x_i\) be the width of each subinterval \([x_{i-1}, x_i]\) and for each \(i\), let \(x_i^*\) be any point in \([x_{i-1}, x_i]\). A Riemann sum is defined for \(f(x)\) as

\[\sum_{i=1}^{n} f(x_i^*)\,\Delta x_i.\]

At this point, we'll choose a regular partition \(P\), as we have in our examples above. This forces all \(\Delta x_i\) to be equal to \(\Delta x = \frac{b - a}{n}\) for any natural number of intervals \(n\).
Recall that with the left- and right-endpoint approximations, the estimates seem to get better and better as \(n\) gets larger and larger. The same thing happens with Riemann sums. Riemann sums give better approximations for larger values of \(n\). We are now ready to define the area under a curve in terms of Riemann sums.

 Definition: Area Under the Curve


Let \(f(x)\) be a continuous, nonnegative function on an interval \([a, b]\), and let \(\sum_{i=1}^{n} f(x_i^*)\,\Delta x\) be a Riemann sum for \(f(x)\) with a regular partition \(P\). Then, the area under the curve \(y = f(x)\) on \([a, b]\) is given by

\[A = \lim_{n \to \infty} \sum_{i=1}^{n} f(x_i^*)\,\Delta x.\]

See a graphical demonstration of the construction of a Riemann sum.


Some subtleties here are worth discussing. First, note that taking the limit of a sum is a little different from taking the limit of a
function f (x) as x goes to infinity. Limits of sums are discussed in detail in the chapter on Sequences and Series; however, for now
we can assume that the computational techniques we used to compute limits of functions can also be used to calculate limits of
sums.
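To see how such a limit can be computed in a concrete case, take \(f(x) = x^2\) on \([0, 2]\) (the function from Example 5.1.4) with right endpoints, so that \(x_i = \frac{2i}{n}\) and \(\Delta x = \frac{2}{n}\). The derivation below is an added illustration that uses Equation 5.1.5 and ordinary limit rules; it is a sketch for this one function, not a general method from the text.

```latex
\begin{align*}
R_n &= \sum_{i=1}^{n} \left(\frac{2i}{n}\right)^{2} \frac{2}{n}
     = \frac{8}{n^{3}} \sum_{i=1}^{n} i^{2}
     = \frac{8}{n^{3}} \cdot \frac{n(n+1)(2n+1)}{6} \\
    &= \frac{4}{3}\left(2 + \frac{3}{n} + \frac{1}{n^{2}}\right), \\
\lim_{n \to \infty} R_n &= \frac{4}{3}\cdot 2 = \frac{8}{3}.
\end{align*}
```

Setting \(n = 4\) in the second line gives \(\frac{4}{3}(2.8125) = 3.75\), which agrees with the value of \(R_4\) found in Example 5.1.4.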
Second, we must consider what to do if the expression converges to different limits for different choices of \(x_i^*\). Fortunately, this does not happen. Although the proof is beyond the scope of this text, it can be shown that if \(f(x)\) is continuous on the closed interval \([a, b]\), then \(\lim_{n \to \infty} \sum_{i=1}^{n} f(x_i^*)\,\Delta x\) exists and is unique (in other words, it does not depend on the choice of \(x_i^*\)).

We look at some examples shortly. But, before we do, let’s take a moment and talk about some specific choices for \(x_i^*\). Although any choice for \(x_i^*\) gives us an estimate of the area under the curve, we don’t necessarily know whether that estimate is too high (overestimate) or too low (underestimate). If it is important to know whether our estimate is high or low, we can select our value for \(x_i^*\) to guarantee one result or the other.

If we want an overestimate, for example, we can choose \(x_i^*\) such that for \(i = 1, 2, 3, \ldots, n\), \(f(x_i^*) \ge f(x)\) for all \(x \in [x_{i-1}, x_i]\). In other words, we choose \(x_i^*\) so that for \(i = 1, 2, 3, \ldots, n\), \(f(x_i^*)\) is the maximum function value on the interval \([x_{i-1}, x_i]\). If we select \(x_i^*\) in this way, then the Riemann sum \(\sum_{i=1}^{n} f(x_i^*)\,\Delta x\) is called an upper sum. Similarly, if we want an underestimate, we can choose \(x_i^*\) so that for \(i = 1, 2, 3, \ldots, n\), \(f(x_i^*)\) is the minimum function value on the interval \([x_{i-1}, x_i]\). In this case, the associated Riemann sum is called a lower sum. Note that if \(f(x)\) is either increasing or decreasing throughout the interval \([a, b]\), then the maximum and minimum values of the function occur at the endpoints of the subintervals, so the upper and lower sums are just the same as the left- and right-endpoint approximations: for an increasing function the lower sum is the left-endpoint approximation and the upper sum is the right-endpoint approximation, and for a decreasing function the roles are reversed.

 Example 5.1.5: Finding Lower and Upper Sums

Find a lower sum for \(f(x) = 10 - x^2\) on \([1, 2]\); let \(n = 4\) subintervals.

Solution
With \(n = 4\) over the interval \([1, 2]\), \(\Delta x = \frac{1}{4}\). We can list the intervals as \([1, 1.25], [1.25, 1.5], [1.5, 1.75]\), and \([1.75, 2]\). Because the function is decreasing over the interval \([1, 2]\), Figure 5.1.13 shows that a lower sum is obtained by using the right endpoints.

Figure 5.1.13: The graph of \(f(x) = 10 - x^2\) is set up for a right-endpoint approximation of the area bounded by the curve and the x-axis on \([1, 2]\), and it shows a lower sum.

The Riemann sum is

\begin{align*}
\sum_{k=1}^{4} (10 - x_k^2)(0.25) &= 0.25\left[10 - (1.25)^2 + 10 - (1.5)^2 + 10 - (1.75)^2 + 10 - (2)^2\right] \\
&= 0.25[8.4375 + 7.75 + 6.9375 + 6] \\
&\approx 7.28 \text{ units}^2.
\end{align*}

The area of \(7.28 \text{ units}^2\) is a lower sum and an underestimate.
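A general Riemann sum only needs a rule for picking \(x_i^*\) in each subinterval. The sketch below passes that rule in as a function; the names `riemann_sum` and `pick` are illustrative assumptions. Because \(f(x) = 10 - x^2\) is decreasing on \([1, 2]\), right endpoints give the lower sum and left endpoints give the upper sum.

```python
def riemann_sum(f, a, b, n, pick):
    """Riemann sum on a regular partition; pick(left, right) returns x_i^*."""
    dx = (b - a) / n
    total = 0.0
    for i in range(1, n + 1):
        left, right = a + (i - 1) * dx, a + i * dx
        total += f(pick(left, right)) * dx
    return total


def f(x):
    return 10 - x**2  # decreasing on [1, 2]


lower = riemann_sum(f, 1, 2, 4, lambda left, right: right)  # minimum at right endpoint
upper = riemann_sum(f, 1, 2, 4, lambda left, right: left)   # maximum at left endpoint
print(lower, upper)  # 7.28125 8.03125
```

Passing `lambda left, right: (left + right) / 2` instead would give a midpoint sum, which lies between the two values for this function.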

 Exercise 5.1.5
a. Find an upper sum for \(f(x) = 10 - x^2\) on \([1, 2]\); let \(n = 4\).

b. Sketch the approximation.

Hint
\(f(x)\) is decreasing on \([1, 2]\), so the maximum function values occur at the left endpoints of the subintervals.

Answer
a. Upper sum \(= 8.0313 \text{ units}^2\).

b. (Figure: sketch of the upper-sum approximation.)

 Example 5.1.6: Finding Lower and Upper Sums for f (x) = sin x

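Find a lower sum for \(f(x) = \sin x\) over the interval \([a, b] = \left[0, \frac{\pi}{2}\right]\); let \(n = 6\).

Solution
Let’s first look at the graph in Figure 5.1.14 to get a better idea of the area of interest.

Figure 5.1.14: The graph of \(y = \sin x\) is divided into six regions: \(\Delta x = \frac{\pi/2}{6} = \frac{\pi}{12}\).

The intervals are \(\left[0, \frac{\pi}{12}\right]\), \(\left[\frac{\pi}{12}, \frac{\pi}{6}\right]\), \(\left[\frac{\pi}{6}, \frac{\pi}{4}\right]\), \(\left[\frac{\pi}{4}, \frac{\pi}{3}\right]\), \(\left[\frac{\pi}{3}, \frac{5\pi}{12}\right]\), and \(\left[\frac{5\pi}{12}, \frac{\pi}{2}\right]\). Note that \(f(x) = \sin x\) is increasing on the interval \(\left[0, \frac{\pi}{2}\right]\), so a left-endpoint approximation gives us the lower sum. A left-endpoint approximation is the Riemann sum \(\sum_{i=0}^{5} \sin(x_i)\left(\frac{\pi}{12}\right)\). We have

\[A \approx \sin(0)\left(\frac{\pi}{12}\right) + \sin\left(\frac{\pi}{12}\right)\left(\frac{\pi}{12}\right) + \sin\left(\frac{\pi}{6}\right)\left(\frac{\pi}{12}\right) + \sin\left(\frac{\pi}{4}\right)\left(\frac{\pi}{12}\right) + \sin\left(\frac{\pi}{3}\right)\left(\frac{\pi}{12}\right) + \sin\left(\frac{5\pi}{12}\right)\left(\frac{\pi}{12}\right) \approx 0.863 \text{ units}^2.\]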
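The arithmetic in Example 5.1.6 is tedious by hand, so here is a short numerical check. It is a sketch using Python's standard `math` module; the variable names are illustrative.

```python
import math

n = 6
dx = (math.pi / 2) / n  # pi/12

# sin is increasing on [0, pi/2]: left endpoints give the lower sum,
# right endpoints give the upper sum.
lower = sum(math.sin(i * dx) for i in range(0, n)) * dx
upper = sum(math.sin(i * dx) for i in range(1, n + 1)) * dx

print(round(lower, 3), round(upper, 3))  # 0.863 1.125
```

The two sums differ by exactly \((\sin(\pi/2) - \sin 0)\,\Delta x = \pi/12\), the combined area of the strips between the two sets of rectangles.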

 Exercise 5.1.6

Using the function \(f(x) = \sin x\) over the interval \(\left[0, \frac{\pi}{2}\right]\), find an upper sum; let \(n = 6\).

Hint
Follow the steps from Example 5.1.6.

Answer
\(A \approx 1.125 \text{ units}^2\)

Key Concepts
The use of sigma (summation) notation of the form \(\sum_{i=1}^{n} a_i\) is useful for expressing long sums of values in compact form.

For a continuous function defined over an interval \([a, b]\), the process of dividing the interval into \(n\) equal parts, extending a rectangle to the graph of the function, calculating the areas of the series of rectangles, and then summing the areas yields an approximation of the area of that region.

When using a regular partition, the width of each rectangle is \(\Delta x = \frac{b - a}{n}\).

Riemann sums are expressions of the form \(\sum_{i=1}^{n} f(x_i^*)\,\Delta x\), and can be used to estimate the area under the curve \(y = f(x)\). Left- and right-endpoint approximations are special kinds of Riemann sums where the values of \(x_i^*\) are chosen to be the left or right endpoints of the subintervals, respectively.

Riemann sums allow for much flexibility in choosing the set of points \(x_i^*\) at which the function is evaluated, often with an eye to obtaining a lower sum or an upper sum.

Key Equations
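Properties of Sigma Notation

\[\sum_{i=1}^{n} c = nc\]

\[\sum_{i=1}^{n} c a_i = c \sum_{i=1}^{n} a_i\]

\[\sum_{i=1}^{n} (a_i + b_i) = \sum_{i=1}^{n} a_i + \sum_{i=1}^{n} b_i\]

\[\sum_{i=1}^{n} (a_i - b_i) = \sum_{i=1}^{n} a_i - \sum_{i=1}^{n} b_i\]

\[\sum_{i=1}^{n} a_i = \sum_{i=1}^{m} a_i + \sum_{i=m+1}^{n} a_i\]

Sums and Powers of Integers

\[\sum_{i=1}^{n} i = 1 + 2 + \cdots + n = \frac{n(n+1)}{2}\]

\[\sum_{i=1}^{n} i^2 = 1^2 + 2^2 + \cdots + n^2 = \frac{n(n+1)(2n+1)}{6}\]

\[\sum_{i=1}^{n} i^3 = 1^3 + 2^3 + \cdots + n^3 = \frac{n^2(n+1)^2}{4}\]

Left-Endpoint Approximation

\[A \approx L_n = f(x_0)\Delta x + f(x_1)\Delta x + \cdots + f(x_{n-1})\Delta x = \sum_{i=1}^{n} f(x_{i-1})\Delta x\]

Right-Endpoint Approximation

\[A \approx R_n = f(x_1)\Delta x + f(x_2)\Delta x + \cdots + f(x_n)\Delta x = \sum_{i=1}^{n} f(x_i)\Delta x\]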

Glossary
left-endpoint approximation
an approximation of the area under a curve computed by using the left endpoint of each subinterval to calculate the height of the
vertical sides of each rectangle

lower sum
a sum obtained by using the minimum value of f (x) on each subinterval

partition
a set of points that divides an interval into subintervals

regular partition
a partition in which the subintervals all have the same width

Riemann sum
an estimate of the area under the curve of the form \(A \approx \sum_{i=1}^{n} f(x_i^*)\,\Delta x\)

right-endpoint approximation
an approximation of the area under a curve computed by using the right endpoint of each subinterval to calculate the height of each rectangle

sigma notation
(also, summation notation) the Greek letter sigma (Σ ) indicates addition of the values; the values of the index above and
below the sigma indicate where to begin the summation and where to end it

upper sum
a sum obtained by using the maximum value of f (x) on each subinterval

5.1: Areas and Distances is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
5.1: Approximating Areas by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0. Original source:
https://openstax.org/details/books/calculus-volume-1.

5.2: The Definite Integral
5.2: The Definite Integral is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.

5.2.1 https://math.libretexts.org/@go/page/4472
5.3: The Fundamental Theorem of Calculus
5.3: The Fundamental Theorem of Calculus is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.

5.3.1 https://math.libretexts.org/@go/page/4473
5.4: Indefinite Integrals and the Net Change Theorem
 Learning Objectives
Apply the basic integration formulas.
Explain the significance of the net change theorem.
Use the net change theorem to solve applied problems.
Apply the integrals of odd and even functions.

In this section, we use some basic integration formulas studied previously to solve some key applied problems. It is important to
note that these formulas are presented in terms of indefinite integrals. Although definite and indefinite integrals are closely related,
there are some key differences to keep in mind. A definite integral is either a number (when the limits of integration are constants)
or a single function (when one or both of the limits of integration are variables). An indefinite integral represents a family of
functions, all of which differ by a constant. As you become more familiar with integration, you will get a feel for when to use
definite integrals and when to use indefinite integrals. You will naturally select the correct approach for a given problem without
thinking too much about it. However, until these concepts are cemented in your mind, think carefully about whether you need a
definite integral or an indefinite integral and make sure you are using the proper notation based on your choice.

Basic Integration Formulas


Recall the integration formulas given in the section on Antiderivatives and the properties of definite integrals. Let’s look at a few
examples of how to apply these formulas and properties.

 Example 5.4.1: Integrating a Function Using the Power Rule


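Use the power rule to integrate the function \(\int_1^4 \sqrt{t}\,(1 + t)\,dt\).

Solution
The first step is to rewrite the function and simplify it so we can apply the power rule:

\begin{align*}
\int_1^4 \sqrt{t}\,(1 + t)\,dt &= \int_1^4 t^{1/2}(1 + t)\,dt \\
&= \int_1^4 \left(t^{1/2} + t^{3/2}\right)dt.
\end{align*}

Now apply the power rule:

\begin{align*}
\int_1^4 \left(t^{1/2} + t^{3/2}\right)dt &= \left.\left(\frac{2}{3}t^{3/2} + \frac{2}{5}t^{5/2}\right)\right|_1^4 \\
&= \left[\frac{2}{3}(4)^{3/2} + \frac{2}{5}(4)^{5/2}\right] - \left[\frac{2}{3}(1)^{3/2} + \frac{2}{5}(1)^{5/2}\right] \\
&= \frac{256}{15}.
\end{align*}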
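The value \(\frac{256}{15}\) can be double-checked in two independent ways: exactly, by evaluating the antiderivative with rational arithmetic, and approximately, with a midpoint Riemann sum. The sketch below is illustrative; the function names and the choice of 100,000 subintervals are assumptions, not part of the text.

```python
from fractions import Fraction


def antiderivative(sqrt_t):
    """(2/3) t^(3/2) + (2/5) t^(5/2), written in terms of s = sqrt(t)."""
    s = Fraction(sqrt_t)
    return Fraction(2, 3) * s**3 + Fraction(2, 5) * s**5


# Limits t = 1 and t = 4 correspond to sqrt(t) = 1 and sqrt(t) = 2.
exact = antiderivative(2) - antiderivative(1)


def f(t):
    return t**0.5 * (1 + t)


# Midpoint Riemann sum on [1, 4] as an independent numerical check.
n = 100_000
dt = (4 - 1) / n
approx = sum(f(1 + (k + 0.5) * dt) for k in range(n)) * dt

print(exact, float(exact), approx)
```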

 Exercise 5.4.1
Find the definite integral of \(f(x) = x^2 - 3x\) over the interval \([1, 3]\).

Hint
Follow the process from Example 5.4.1 to solve the problem.

Answer

5.4.1 https://math.libretexts.org/@go/page/4474
\[\int_1^3 (x^2 - 3x)\,dx = -\frac{10}{3}\]

The Net Change Theorem


The net change theorem considers the integral of a rate of change. It says that when a quantity changes, the new value equals the
initial value plus the integral of the rate of change of that quantity. The formula can be expressed in two ways. The second is more
familiar; it is simply the definite integral.

 Net Change Theorem

The new value of a changing quantity equals the initial value plus the integral of the rate of change:
\[F(b) = F(a) + \int_a^b F'(x)\,dx \tag{5.4.1}\]

or

\[\int_a^b F'(x)\,dx = F(b) - F(a). \tag{5.4.2}\]

Subtracting \(F(a)\) from both sides of Equation 5.4.1 yields Equation 5.4.2. Since they are equivalent formulas, which one we use depends on the application.


The significance of the net change theorem lies in the range of quantities it applies to. Net change can be applied to area, distance, and volume, to name only a few applications. Because the integrand is a signed rate of change, net change accounts for negative quantities automatically without having to write more than one integral.
To illustrate, let’s apply the net change theorem to a velocity function in which the result is displacement.
We looked at a simple example of this in The Definite Integral section. Suppose a car is moving due north (the positive direction) at
40 mph between 2 p.m. and 4 p.m., then the car moves south at 30 mph between 4 p.m. and 5 p.m. We can graph this motion as
shown in Figure 5.4.1.

Figure 5.4.1 : The graph shows speed versus time for the given motion of a car.
Just as we did before, we can use definite integrals to calculate the net displacement as well as the total distance traveled. The net
displacement is given by
\[\int_2^5 v(t)\,dt = \int_2^4 40\,dt + \int_4^5 (-30)\,dt = 80 - 30 = 50.\]


Thus, at 5 p.m. the car is 50 mi north of its starting position. The total distance traveled is given by
\[\int_2^5 |v(t)|\,dt = \int_2^4 40\,dt + \int_4^5 30\,dt = 80 + 30 = 110.\]

Therefore, between 2 p.m. and 5 p.m., the car traveled a total of 110 mi.
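The difference between net displacement and total distance comes down to integrating \(v(t)\) versus \(|v(t)|\). The sketch below models the car's velocity as a piecewise-constant function and integrates it with a midpoint rule; the helper `midpoint_integral` and the step count are illustrative assumptions.

```python
def v(t):
    """Velocity in mph at clock time t (hours, p.m.); north is positive."""
    return 40.0 if t < 4 else -30.0


def midpoint_integral(g, a, b, n=3000):
    """Midpoint-rule estimate of the integral of g over [a, b]."""
    dt = (b - a) / n
    return sum(g(a + (k + 0.5) * dt) for k in range(n)) * dt


net = midpoint_integral(v, 2, 5)                      # signed: displacement
total = midpoint_integral(lambda t: abs(v(t)), 2, 5)  # unsigned: distance
print(round(net, 6), round(total, 6))  # 50.0 110.0
```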

To summarize, net displacement may include both positive and negative values. In other words, the velocity function accounts for
both forward distance and backward distance. To find net displacement, integrate the velocity function over the interval. Total
distance traveled, on the other hand, is always positive. To find the total distance traveled by an object, regardless of direction, we
need to integrate the absolute value of the velocity function.

 Example 5.4.2: Finding Net Displacement

Given a velocity function v(t) = 3t − 5 (in meters per second) for a particle in motion from time t = 0 to time t = 3, find the
net displacement of the particle.
Solution
Applying the net change theorem, we have
\[\int_0^3 (3t - 5)\,dt = \left.\left(\frac{3t^2}{2} - 5t\right)\right|_0^3 = \left[\frac{3(3)^2}{2} - 5(3)\right] - 0 = \frac{27}{2} - 15 = \frac{27}{2} - \frac{30}{2} = -\frac{3}{2}.\]

The net displacement is \(-\frac{3}{2}\) m (Figure 5.4.2).

Figure 5.4.2 : The graph shows velocity versus time for a particle moving with a linear velocity function.

 Example 5.4.3: Finding the Total Distance Traveled

Use Example 5.4.2 to find the total distance traveled by a particle according to the velocity function v(t) = 3t − 5 m/sec over
a time interval [0, 3].
Solution
The total distance traveled includes both the positive and the negative values. Therefore, we must integrate the absolute value
of the velocity function to find the total distance traveled.
To continue with the example, use two integrals to find the total distance. First, find the t -intercept of the function, since that is
where the division of the interval occurs. Set the equation equal to zero and solve for \(t\). Thus,

\begin{align*}
3t - 5 &= 0 \\
3t &= 5 \\
t &= \frac{5}{3}.
\end{align*}

The two subintervals are \(\left[0, \frac{5}{3}\right]\) and \(\left[\frac{5}{3}, 3\right]\). To find the total distance traveled, integrate the absolute value of the function. Since the function is negative over the interval \(\left[0, \frac{5}{3}\right]\), we have \(|v(t)| = -v(t)\) over that interval. Over \(\left[\frac{5}{3}, 3\right]\), the function is positive, so \(|v(t)| = v(t)\). Thus, we have

\begin{align*}
\int_0^3 |v(t)|\,dt &= \int_0^{5/3} -v(t)\,dt + \int_{5/3}^3 v(t)\,dt \\
&= \int_0^{5/3} (5 - 3t)\,dt + \int_{5/3}^3 (3t - 5)\,dt \\
&= \left.\left(5t - \frac{3t^2}{2}\right)\right|_0^{5/3} + \left.\left(\frac{3t^2}{2} - 5t\right)\right|_{5/3}^3 \\
&= \left[5\left(\frac{5}{3}\right) - \frac{3(5/3)^2}{2}\right] - 0 + \left[\frac{27}{2} - 15\right] - \left[\frac{3(5/3)^2}{2} - \frac{25}{3}\right] \\
&= \frac{25}{3} - \frac{25}{6} + \frac{27}{2} - 15 - \frac{25}{6} + \frac{25}{3} \\
&= \frac{41}{6}.
\end{align*}

So, the total distance traveled is \(\frac{41}{6}\) m.
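Examples 5.4.2 and 5.4.3 can be verified exactly with rational arithmetic. The sketch below evaluates an antiderivative of \(v(t) = 3t - 5\) at the relevant points; the function name `V` and the variable names are illustrative.

```python
from fractions import Fraction


def V(t):
    """An antiderivative of v(t) = 3t - 5, namely (3/2) t^2 - 5t."""
    t = Fraction(t)
    return Fraction(3, 2) * t**2 - 5 * t


zero = Fraction(5, 3)  # where v(t) changes sign

net = V(3) - V(0)
# v < 0 on [0, 5/3], so that piece is counted with the opposite sign.
total = -(V(zero) - V(0)) + (V(3) - V(zero))

print(net, total)  # -3/2 41/6
```

The total distance is larger than the magnitude of the net displacement because the backward and forward portions no longer cancel.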

 Exercise 5.4.2

Find the net displacement and total distance traveled in meters given the velocity function \(f(t) = \frac{1}{2}e^t - 2\) over the interval \([0, 2]\).

Hint
Follow the procedures from Examples 5.4.2 and 5.4.3. Note that \(f(t) \le 0\) for \(t \le \ln 4\) and \(f(t) \ge 0\) for \(t \ge \ln 4\).

Answer
Net displacement: \(\frac{e^2 - 9}{2} \approx -0.8055\) m; total distance traveled: \(4\ln 4 - 7.5 + \frac{e^2}{2} \approx 1.740\) m.

Applying the Net Change Theorem


The net change theorem can be applied to the flow and consumption of fluids, as shown in Example 5.4.4.

 Example 5.4.4: How Many Gallons of Gasoline Are Consumed?

If the motor on a motorboat is started at \(t = 0\) and the boat consumes gasoline at the rate of \(5 - t^3\) gal/hr, how much gasoline is used in the first 2 hours?

Solution
Express the problem as a definite integral, integrate, and evaluate using the Fundamental Theorem of Calculus. The limits of integration are the endpoints of the interval \([0, 2]\). We have

\begin{align*}
\int_0^2 (5 - t^3)\,dt &= \left.\left(5t - \frac{t^4}{4}\right)\right|_0^2 \\
&= \left[5(2) - \frac{(2)^4}{4}\right] - 0 \\
&= 10 - \frac{16}{4} \\
&= 6.
\end{align*}

Thus, the motorboat uses 6 gal of gas in 2 hours.

 Example 5.4.5: Chapter Opener: Iceboats

As we saw at the beginning of the chapter, top iceboat racers can attain speeds of up to five times the wind speed. Andrew is
an intermediate iceboater, though, so he attains speeds equal to only twice the wind speed.

Figure 5.4.3 : (credit: modification of work by Carter Brown, Flickr)


Suppose Andrew takes his iceboat out one morning when a light 5-mph breeze has been blowing all morning. As Andrew gets
his iceboat set up, though, the wind begins to pick up. During his first half hour of iceboating, the wind speed increases
according to the function v(t) = 20t + 5. For the second half hour of Andrew’s outing, the wind remains steady at 15 mph. In
other words, the wind speed is given by
\[
v(t) = \begin{cases} 20t + 5, & \text{for } 0 \le t \le \frac{1}{2} \\ 15, & \text{for } \frac{1}{2} \le t \le 1. \end{cases}
\]

Recalling that Andrew’s iceboat travels at twice the wind speed, and assuming he moves in a straight line away from his
starting point, how far is Andrew from his starting point after 1 hour?
Solution
To figure out how far Andrew has traveled, we need to integrate his velocity, which is twice the wind speed. Then
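\[
\text{Distance} = \int_0^1 2v(t)\,dt.
\]

Substituting the expressions we were given for \(v(t)\), we get

\[
\begin{aligned}
\int_0^1 2v(t)\,dt &= \int_0^{1/2} 2v(t)\,dt + \int_{1/2}^1 2v(t)\,dt \\
&= \int_0^{1/2} 2(20t + 5)\,dt + \int_{1/2}^1 2(15)\,dt \\
&= \int_0^{1/2} (40t + 10)\,dt + \int_{1/2}^1 30\,dt \\
&= \left[20t^2 + 10t\right]\Big|_0^{1/2} + \left[30t\right]\Big|_{1/2}^1 \\
&= \left(\frac{20}{4} + 5\right) - 0 + (30 - 15) \\
&= 25.
\end{aligned}
\]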

Andrew is 25 mi from his starting point after 1 hour.
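The piecewise integral above can also be approximated numerically. This is a minimal sketch under the example's assumptions (boat speed equal to twice the wind speed, straight-line motion); the function names are illustrative.

```python
def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def wind_speed(t):
    """Wind speed in mph at time t (hours), as given in the example."""
    return 20 * t + 5 if t <= 0.5 else 15.0


def boat_speed(t):
    """Andrew's iceboat travels at twice the wind speed."""
    return 2 * wind_speed(t)


# Integrate each piece separately, splitting at t = 1/2 where the formula changes.
distance = simpson(boat_speed, 0, 0.5) + simpson(boat_speed, 0.5, 1)

print(distance)  # close to 25 (miles)
```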

 Exercise 5.4.3

Suppose that, instead of remaining steady during the second half hour of Andrew’s outing, the wind starts to die down
according to the function v(t) = −10t + 15. In other words, the wind speed is given by
\[
v(t) = \begin{cases} 20t + 5, & \text{for } 0 \le t \le \frac{1}{2} \\ -10t + 15, & \text{for } \frac{1}{2} \le t \le 1. \end{cases}
\]

Under these conditions, how far from his starting point is Andrew after 1 hour?

Hint
Don’t forget that Andrew’s iceboat moves twice as fast as the wind.

Answer
17.5 mi

Integrating Even and Odd Functions


We saw in Functions and Graphs that an even function is a function in which f (−x) = f (x) for all x in the domain—that is, the
graph of the curve is unchanged when x is replaced with −x. The graphs of even functions are symmetric about the y -axis. An odd
function is one in which f (−x) = −f (x) for all x in the domain, and the graph of the function is symmetric about the origin.
Integrals of even functions, when the limits of integration are from −a to a , involve two equal areas, because they are symmetric
about the y -axis. Integrals of odd functions, when the limits of integration are similarly [−a, a], evaluate to zero because the areas
above and below the x-axis are equal.

 Integrals of Even and Odd Functions

For continuous even functions such that \(f(-x) = f(x)\),

\[
\int_{-a}^{a} f(x)\,dx = 2\int_0^a f(x)\,dx.
\]

For continuous odd functions such that \(f(-x) = -f(x)\),

\[
\int_{-a}^{a} f(x)\,dx = 0.
\]

 Example 5.4.6: Integrating an Even Function


Integrate the even function \(\int_{-2}^{2} (3x^8 - 2)\,dx\) and verify that the integration formula for even functions holds.

Solution
The symmetry appears in the graphs in Figure 5.4.4. Graph (a) shows the region below the curve and above the x-axis. We
have to zoom in to this graph by a huge amount to see the region. Graph (b) shows the region above the curve and below the x-
axis. The signed area of this region is negative. Both views illustrate the symmetry about the y -axis of an even function. We
have

\[
\begin{aligned}
\int_{-2}^{2} (3x^8 - 2)\,dx &= \left(\frac{x^9}{3} - 2x\right)\bigg|_{-2}^{2} \\
&= \left[\frac{(2)^9}{3} - 2(2)\right] - \left[\frac{(-2)^9}{3} - 2(-2)\right] \\
&= \left(\frac{512}{3} - 4\right) - \left(-\frac{512}{3} + 4\right) \\
&= \frac{1000}{3}.
\end{aligned}
\]

To verify the integration formula for even functions, we can calculate the integral from 0 to 2 and double it, then check to
make sure we get the same answer.
\[
\int_0^2 (3x^8 - 2)\,dx = \left(\frac{x^9}{3} - 2x\right)\bigg|_0^2 = \frac{512}{3} - 4 = \frac{500}{3}
\]

Since \(2 \cdot \frac{500}{3} = \frac{1000}{3}\), we have verified the formula for even functions in this particular example.

Figure 5.4.4 : Graph (a) shows the positive area between the curve and the x -axis, whereas graph (b) shows the negative area
between the curve and the x -axis. Both views show the symmetry about the y -axis.

 Example 5.4.7: Integrating an Odd Function

Evaluate the definite integral of the odd function −5 sin x over the interval [−π, π].
Solution
The graph is shown in Figure 5.4.5. We can see the symmetry about the origin by the positive area above the x-axis over
\([-\pi, 0]\), and the negative area below the x-axis over \([0, \pi]\). We have

\[
\begin{aligned}
\int_{-\pi}^{\pi} -5\sin x\,dx &= -5(-\cos x)\Big|_{-\pi}^{\pi} \\
&= 5\cos x\Big|_{-\pi}^{\pi} \\
&= [5\cos \pi] - [5\cos(-\pi)] \\
&= -5 - (-5) = 0.
\end{aligned}
\]
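Both symmetry rules can be checked numerically with the integrands from Examples 5.4.6 and 5.4.7. The following sketch uses an illustrative `simpson` helper (an assumption of this example, not something defined in the text), so the results agree with the exact values only up to a small numerical error.

```python
import math


def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def even_f(x):
    """Even integrand from Example 5.4.6."""
    return 3 * x**8 - 2


def odd_f(x):
    """Odd integrand from Example 5.4.7."""
    return -5 * math.sin(x)


full_even = simpson(even_f, -2, 2)             # integral over [-a, a]
half_even = simpson(even_f, 0, 2)              # integral over [0, a]
full_odd = simpson(odd_f, -math.pi, math.pi)   # integral over [-a, a]

print(full_even, 2 * half_even)  # both close to 1000/3
print(full_odd)                  # close to 0
```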

Figure 5.4.5: The graph shows areas between a curve and the x-axis for an odd function.

 Exercise 5.4.4
Integrate the function \(\int_{-2}^{2} x^4\,dx\).

Hint
Integrate an even function.

Answer
\(\frac{64}{5}\)

Key Concepts
The net change theorem states that when a quantity changes, the final value equals the initial value plus the integral of the rate
of change. Net change can be a positive number, a negative number, or zero.
The area under an even function over a symmetric interval can be calculated by doubling the area over the positive x-axis. For
an odd function, the integral over a symmetric interval equals zero, because half the area is negative.

Key Equations
Net Change Theorem
\[
F(b) = F(a) + \int_a^b F'(x)\,dx
\]

or

\[
\int_a^b F'(x)\,dx = F(b) - F(a)
\]

Glossary
net change theorem
if we know the rate of change of a quantity, the net change theorem says the future quantity is equal to the initial quantity plus
the integral of the rate of change of the quantity

5.4: Indefinite Integrals and the Net Change Theorem is shared under a not declared license and was authored, remixed, and/or curated by
LibreTexts.

5.4: Integration Formulas and the Net Change Theorem by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0. Original
source: https://openstax.org/details/books/calculus-volume-1.

5.5: The Substitution Rule
 Learning Objectives
Use substitution to evaluate indefinite integrals.
Use substitution to evaluate definite integrals.

The Fundamental Theorem of Calculus gave us a method to evaluate integrals without using Riemann sums. The drawback of this
method, though, is that we must be able to find an antiderivative, and this is not always easy. In this section we examine a
technique, called integration by substitution, to help us find antiderivatives. Specifically, this method helps us find antiderivatives
when the integrand is the result of a chain-rule derivative.
At first, the approach to the substitution procedure may not appear very obvious. However, it is primarily a visual task—that is, the
integrand shows you what to do; it is a matter of recognizing the form of the function. So, what are we supposed to see? We are
looking for an integrand of the form f [g(x)]g'(x) dx. For example, in the integral

\[
\int (x^2 - 3)^3\, 2x\,dx, \tag{5.5.1}
\]

we have

\[
f(x) = x^3
\]

and

\[
g(x) = x^2 - 3.
\]

Then

\[
g'(x) = 2x
\]

and

\[
f[g(x)]g'(x) = (x^2 - 3)^3 (2x),
\]

and we see that our integrand is in the correct form. The method is called substitution because we substitute part of the integrand
with the variable u and part of the integrand with du. It is also referred to as change of variables because we are changing variables
to obtain an expression that is easier to work with for applying the integration rules.

 Substitution with Indefinite Integrals


Let \(u = g(x)\), where \(g'(x)\) is continuous over an interval, let \(f(x)\) be continuous over the corresponding range of \(g\), and let \(F(x)\) be an antiderivative of \(f(x)\). Then,

\[
\begin{aligned}
\int f[g(x)]g'(x)\,dx &= \int f(u)\,du \\
&= F(u) + C \\
&= F(g(x)) + C.
\end{aligned}
\]

 Proof
Let f , g , u, and F be as specified in the theorem. Then
\[
\frac{d}{dx}\left[F(g(x))\right] = F'(g(x))g'(x) = f[g(x)]g'(x).
\]

Integrating both sides with respect to x, we see that

∫ f [g(x)]g'(x) dx = F (g(x)) + C .

If we now substitute \(u = g(x)\) and \(du = g'(x)\,dx\), we get

∫ f [g(x)]g'(x) dx = ∫ f (u) du = F (u) + C = F (g(x)) + C .

Returning to the problem we looked at originally, we let \(u = x^2 - 3\) and then \(du = 2x\,dx\).
Rewrite the integral (Equation 5.5.1) in terms of \(u\):

\[
\int (x^2 - 3)^3 (2x\,dx) = \int u^3\,du.
\]

Using the power rule for integrals, we have

\[
\int u^3\,du = \frac{u^4}{4} + C.
\]

Substitute the original expression for \(x\) back into the solution:

\[
\frac{u^4}{4} + C = \frac{(x^2 - 3)^4}{4} + C.
\]

We can generalize the procedure in the following Problem-Solving Strategy.

 Problem-Solving Strategy: Integration by Substitution


1. Look carefully at the integrand and select an expression g(x) within the integrand to set equal to u. Select g(x) such
that g'(x) is also part of the integrand.
2. Substitute u = g(x) and du = g'(x) dx into the integral.
3. We should now be able to evaluate the integral with respect to u. If the integral can’t be evaluated, we need to go back and
select a different expression to use as u.
4. Evaluate the integral in terms of u.
5. Write the result in terms of x and the expression g(x).

 Example 5.5.1: Using Substitution to Find an Antiderivative

Use substitution to find the antiderivative of \(\int 6x(3x^2 + 4)^4\,dx\).

Solution
The first step is to choose an expression for \(u\). We choose \(u = 3x^2 + 4\) because then \(du = 6x\,dx\) and we already have \(du\) in
the integrand. Write the integral in terms of \(u\):

\[
\int 6x(3x^2 + 4)^4\,dx = \int u^4\,du.
\]

Remember that \(du\) is the derivative of the expression chosen for \(u\), regardless of what is inside the integrand. Now we can
evaluate the integral with respect to \(u\):

\[
\int u^4\,du = \frac{u^5}{5} + C = \frac{(3x^2 + 4)^5}{5} + C.
\]

Analysis
We can check our answer by taking the derivative of the result of integration. We should obtain the integrand. Picking a value
for \(C\) of 1, we let \(y = \frac{1}{5}(3x^2 + 4)^5 + 1\). We have

\[
y = \frac{1}{5}(3x^2 + 4)^5 + 1,
\]

so

\[
\begin{aligned}
y' &= \left(\frac{1}{5}\right) 5(3x^2 + 4)^4\, 6x \\
&= 6x(3x^2 + 4)^4.
\end{aligned}
\]

This is exactly the expression we started with inside the integrand.
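The same check can be run numerically: differentiate the proposed antiderivative with a finite difference and compare it with the integrand at a few points. The sketch below is illustrative only; the sample points and the step size `h` are arbitrary choices, and a central difference gives an approximation, not a proof.

```python
def antiderivative(x):
    """Result of the substitution, with C = 1 as in the analysis above."""
    return (3 * x**2 + 4) ** 5 / 5 + 1


def integrand(x):
    """Original integrand 6x(3x^2 + 4)^4."""
    return 6 * x * (3 * x**2 + 4) ** 4


def central_difference(f, x, h=1e-5):
    """Approximate f'(x) with a symmetric difference quotient."""
    return (f(x + h) - f(x - h)) / (2 * h)


max_rel_error = 0.0
for x in (-1.0, 0.0, 0.5, 2.0):
    approx = central_difference(antiderivative, x)
    exact = integrand(x)
    # Scale by max(1, |exact|) so the point x = 0 (exact value 0) is handled.
    rel_error = abs(approx - exact) / max(1.0, abs(exact))
    max_rel_error = max(max_rel_error, rel_error)

print(max_rel_error)  # small if the antiderivative is correct
```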

 Exercise 5.5.1

Use substitution to find the antiderivative of \(\int 3x^2(x^3 - 3)^2\,dx\).

Hint
Let \(u = x^3 - 3\).

Answer
\[
\int 3x^2(x^3 - 3)^2\,dx = \frac{1}{3}(x^3 - 3)^3 + C
\]

Sometimes we need to adjust the constants in our integral if they don’t match up exactly with the expressions we are substituting.

 Example 5.5.2: Using Substitution with Alteration

Use substitution to find the antiderivative of

\[
\int z\sqrt{z^2 - 5}\,dz.
\]

Solution

Rewrite the integral as \(\int z(z^2 - 5)^{1/2}\,dz\). Let \(u = z^2 - 5\) and \(du = 2z\,dz\). Now we have a problem because \(du = 2z\,dz\)
and the original expression has only \(z\,dz\). We have to alter our expression for \(du\) or the integral in \(u\) will be twice as large as it
should be. If we multiply both sides of the \(du\) equation by \(\frac{1}{2}\), we can solve this problem. Thus,

\[
\begin{aligned}
u &= z^2 - 5 \\
du &= 2z\,dz \\
\frac{1}{2}\,du &= \frac{1}{2}(2z)\,dz = z\,dz.
\end{aligned}
\]

Write the integral in terms of \(u\), but pull the \(\frac{1}{2}\) outside the integration symbol:

\[
\int z(z^2 - 5)^{1/2}\,dz = \frac{1}{2}\int u^{1/2}\,du.
\]

Integrate the expression in u:

\[
\begin{aligned}
\frac{1}{2}\int u^{1/2}\,du &= \left(\frac{1}{2}\right)\frac{u^{3/2}}{3/2} + C \\
&= \left(\frac{1}{2}\right)\left(\frac{2}{3}\right)u^{3/2} + C \\
&= \frac{1}{3}u^{3/2} + C \\
&= \frac{1}{3}(z^2 - 5)^{3/2} + C
\end{aligned}
\]

 Exercise 5.5.2

Use substitution to find the antiderivative of \(\int x^2(x^3 + 5)^9\,dx\).

Hint
Multiply the \(du\) equation by \(\frac{1}{3}\).

Answer
\[
\int x^2(x^3 + 5)^9\,dx = \frac{(x^3 + 5)^{10}}{30} + C
\]

 Example 5.5.3: Using Substitution with Integrals of Trigonometric Functions


Use substitution to evaluate the integral \(\int \frac{\sin t}{\cos^3 t}\,dt\).

Solution
We know the derivative of cos t is − sin t , so we set u = cos t . Then du = − sin t dt.
Substituting into the integral, we have
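\[
\int \frac{\sin t}{\cos^3 t}\,dt = -\int \frac{du}{u^3}.
\]

Evaluating the integral, we get

\[
-\int \frac{du}{u^3} = -\int u^{-3}\,du = -\left(-\frac{1}{2}\right)u^{-2} + C.
\]

Putting the answer back in terms of \(t\), we get

\[
\int \frac{\sin t}{\cos^3 t}\,dt = \frac{1}{2u^2} + C = \frac{1}{2\cos^2 t} + C.
\]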

 Exercise 5.5.3
Use substitution to evaluate the integral \(\int \frac{\cos t}{\sin^2 t}\,dt\).

Hint
Use the process from Example 5.5.3 to solve the problem.

Answer
\[
\int \frac{\cos t}{\sin^2 t}\,dt = -\frac{1}{\sin t} + C
\]

 Exercise 5.5.4

Use substitution to evaluate the indefinite integral \(\int \cos^3 t \sin t\,dt\).

Hint
Use the process from Example 5.5.3 to solve the problem.

Answer
\[
\int \cos^3 t \sin t\,dt = -\frac{\cos^4 t}{4} + C
\]

Sometimes we need to manipulate an integral in ways that are more complicated than just multiplying or dividing by a constant.
We need to eliminate all the expressions within the integrand that are in terms of the original variable. When we are done, u should
be the only variable in the integrand. In some cases, this means solving for the original variable in terms of u. This technique
should become clear in the next example.

 Example 5.5.4: Finding an Antiderivative Using u-Substitution

Use substitution to find the antiderivative of

\[
\int \frac{x}{\sqrt{x - 1}}\,dx.
\]

Solution
If we let \(u = x - 1\), then \(du = dx\). But this does not account for the \(x\) in the numerator of the integrand. We need to express \(x\)
in terms of \(u\). If \(u = x - 1\), then \(x = u + 1\). Now we can rewrite the integral in terms of \(u\):

\[
\int \frac{x}{\sqrt{x - 1}}\,dx = \int \frac{u + 1}{\sqrt{u}}\,du = \int \left(\sqrt{u} + \frac{1}{\sqrt{u}}\right)du = \int \left(u^{1/2} + u^{-1/2}\right)du.
\]

Then we integrate in the usual way, replace \(u\) with the original expression, and factor and simplify the result. Thus,

\[
\begin{aligned}
\int \left(u^{1/2} + u^{-1/2}\right)du &= \frac{2}{3}u^{3/2} + 2u^{1/2} + C \\
&= \frac{2}{3}(x - 1)^{3/2} + 2(x - 1)^{1/2} + C \\
&= (x - 1)^{1/2}\left[\frac{2}{3}(x - 1) + 2\right] + C \\
&= (x - 1)^{1/2}\left(\frac{2}{3}x - \frac{2}{3} + \frac{6}{3}\right) + C \\
&= (x - 1)^{1/2}\left(\frac{2}{3}x + \frac{4}{3}\right) + C \\
&= \frac{2}{3}(x - 1)^{1/2}(x + 2) + C.
\end{aligned}
\]

Substitution for Definite Integrals


Substitution can be used with definite integrals, too. However, using substitution to evaluate a definite integral requires a change to
the limits of integration. If we change variables in the integrand, the limits of integration change as well.

 Substitution with Definite Integrals

Let \(u = g(x)\) and let \(g'\) be continuous over an interval \([a, b]\), and let \(f\) be continuous over the range of \(u = g(x)\). Then,

\[
\int_a^b f(g(x))g'(x)\,dx = \int_{g(a)}^{g(b)} f(u)\,du.
\]

Although we will not formally prove this theorem, we justify it with some calculations here. From the substitution rule for
indefinite integrals, if F (x) is an antiderivative of f (x), we have

\[
\int f(g(x))g'(x)\,dx = F(g(x)) + C.
\]

Then

\[
\begin{aligned}
\int_a^b f[g(x)]g'(x)\,dx &= F(g(x))\Big|_{x=a}^{x=b} \\
&= F(g(b)) - F(g(a)) \\
&= F(u)\Big|_{u=g(a)}^{u=g(b)} \\
&= \int_{g(a)}^{g(b)} f(u)\,du,
\end{aligned}
\]

and we have the desired result.

 Example 5.5.5: Using Substitution to Evaluate a Definite Integral

Use substitution to evaluate


\[
\int_0^1 x^2(1 + 2x^3)^5\,dx.
\]

Solution
Let \(u = 1 + 2x^3\), so \(du = 6x^2\,dx\). Since the original function includes one factor of \(x^2\) and \(du = 6x^2\,dx\), multiply both sides
of the \(du\) equation by \(1/6\). Then,

\[
du = 6x^2\,dx \quad \text{becomes} \quad \frac{1}{6}\,du = x^2\,dx.
\]

To adjust the limits of integration, note that when \(x = 0\), \(u = 1 + 2(0) = 1\), and when \(x = 1\), \(u = 1 + 2(1) = 3\).

Then

\[
\int_0^1 x^2(1 + 2x^3)^5\,dx = \frac{1}{6}\int_1^3 u^5\,du.
\]

Evaluating this expression, we get


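\[
\begin{aligned}
\frac{1}{6}\int_1^3 u^5\,du &= \left(\frac{1}{6}\right)\left(\frac{u^6}{6}\right)\bigg|_1^3 \\
&= \frac{1}{36}\left[(3)^6 - (1)^6\right] \\
&= \frac{182}{9}.
\end{aligned}
\]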

 Exercise 5.5.5
Use substitution to evaluate the definite integral \(\int_{-1}^{0} y(2y^2 - 3)^5\,dy\).

Hint
Use the steps from Example 5.5.5 to solve the problem.

Answer
\[
\int_{-1}^{0} y(2y^2 - 3)^5\,dy = \frac{91}{3}
\]

 Exercise 5.5.6
Use substitution to evaluate \(\int_0^1 x^2\cos\left(\frac{\pi}{2}x^3\right)dx\).

Hint
Use the process from Example 5.5.5 to solve the problem.

Answer
\[
\int_0^1 x^2\cos\left(\frac{\pi}{2}x^3\right)dx = \frac{2}{3\pi} \approx 0.2122
\]

 Example 5.5.6: Using Substitution with an Exponential Function


Use substitution to evaluate
\[
\int_0^1 x e^{4x^2 + 3}\,dx.
\]

Solution
Let \(u = 4x^2 + 3\). Then, \(du = 8x\,dx\). To adjust the limits of integration, we note that when \(x = 0\), \(u = 3\), and when
\(x = 1\), \(u = 7\). So our substitution gives

\[
\begin{aligned}
\int_0^1 x e^{4x^2 + 3}\,dx &= \frac{1}{8}\int_3^7 e^u\,du \\
&= \frac{1}{8}e^u\Big|_3^7 \\
&= \frac{e^7 - e^3}{8} \\
&\approx 134.568
\end{aligned}
\]

Substitution may be only one of the techniques needed to evaluate a definite integral. All of the properties and rules of integration
apply independently, and trigonometric functions may need to be rewritten using a trigonometric identity before we can apply
substitution. Also, we have the option of replacing the original expression for u after we find the antiderivative, which means that
we do not have to change the limits of integration. These two approaches are shown in Example 5.5.7.

 Example 5.5.7: Using Substitution to Evaluate a Trigonometric Integral
Use substitution to evaluate
\[
\int_0^{\pi/2} \cos^2\theta\,d\theta.
\]

Solution
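Let us first use a trigonometric identity to rewrite the integral. The trig identity \(\cos^2\theta = \frac{1 + \cos 2\theta}{2}\) allows us to rewrite the
integral as

\[
\int_0^{\pi/2} \cos^2\theta\,d\theta = \int_0^{\pi/2} \frac{1 + \cos 2\theta}{2}\,d\theta.
\]

Then,

\[
\begin{aligned}
\int_0^{\pi/2} \left(\frac{1 + \cos 2\theta}{2}\right)d\theta &= \int_0^{\pi/2} \left(\frac{1}{2} + \frac{1}{2}\cos 2\theta\right)d\theta \\
&= \frac{1}{2}\int_0^{\pi/2} d\theta + \frac{1}{2}\int_0^{\pi/2} \cos 2\theta\,d\theta.
\end{aligned}
\]

We can evaluate the first integral as it is, but we need to make a substitution to evaluate the second integral. Let \(u = 2\theta\). Then,
\(du = 2\,d\theta\), or \(\frac{1}{2}\,du = d\theta\). Also, when \(\theta = 0\), \(u = 0\), and when \(\theta = \pi/2\), \(u = \pi\). Expressing the second integral in terms of
\(u\), we have

\[
\begin{aligned}
\frac{1}{2}\int_0^{\pi/2} d\theta + \frac{1}{2}\int_0^{\pi/2} \cos 2\theta\,d\theta &= \frac{1}{2}\int_0^{\pi/2} d\theta + \frac{1}{2}\left(\frac{1}{2}\right)\int_0^{\pi} \cos u\,du \\
&= \frac{\theta}{2}\bigg|_{\theta=0}^{\theta=\pi/2} + \frac{1}{4}\sin u\bigg|_{u=0}^{u=\pi} \\
&= \left(\frac{\pi}{4} - 0\right) + (0 - 0) = \frac{\pi}{4}
\end{aligned}
\]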
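The two approaches mentioned before this example (changing the limits, or substituting back and keeping the original limits) give the same value. The sketch below illustrates the second approach and compares it with a direct numerical integral; `simpson` and `antiderivative` are names introduced here for illustration only.

```python
import math


def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def antiderivative(theta):
    """theta/2 + sin(2*theta)/4, i.e. u replaced by 2*theta again."""
    return theta / 2 + math.sin(2 * theta) / 4


# Back-substituted antiderivative evaluated at the ORIGINAL limits 0 and pi/2.
back_substituted = antiderivative(math.pi / 2) - antiderivative(0)

# Direct numerical integration of cos^2(theta) for comparison.
numeric = simpson(lambda th: math.cos(th) ** 2, 0, math.pi / 2)

print(back_substituted, numeric, math.pi / 4)  # all three agree closely
```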

Key Concepts
Substitution is a technique that simplifies the integration of functions that are the result of a chain-rule derivative. The term
‘substitution’ refers to changing variables or substituting the variable u and du for appropriate expressions in the integrand.
When using substitution for a definite integral, we also have to change the limits of integration.

Key Equations
Substitution with Indefinite Integrals

∫ f [g(x)]g'(x) dx = ∫ f (u) du = F (u) + C = F (g(x)) + C

Substitution with Definite Integrals


\[
\int_a^b f(g(x))g'(x)\,dx = \int_{g(a)}^{g(b)} f(u)\,du
\]

Glossary
change of variables
the substitution of a variable, such as u , for an expression in the integrand

integration by substitution
a technique for integration that allows integration of functions that are the result of a chain-rule derivative

5.5: The Substitution Rule is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
5.5: Substitution by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0. Original source:
https://openstax.org/details/books/calculus-volume-1.

CHAPTER OVERVIEW

6: Applications of Integration
A general Calculus Textmap organized around the textbook

Calculus: Early Transcendentals


by James Stewart



6.1: Areas Between Curves
6.2: Volumes
6.3: Volumes by Cylindrical Shells
6.4: Work
6.5: Average Value of a Function

6: Applications of Integration is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.

6.1: Areas Between Curves
6.1: Areas Between Curves is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.

6.2: Volumes
6.2: Volumes is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.

6.3: Volumes by Cylindrical Shells
 Learning Objectives
Calculate the volume of a solid of revolution by using the method of cylindrical shells.
Compare the different methods for calculating a volume of revolution.

In this section, we examine the method of cylindrical shells, the final method for finding the volume of a solid of revolution. We
can use this method on the same kinds of solids as the disk method or the washer method; however, with the disk and washer
methods, we integrate along the coordinate axis parallel to the axis of revolution. With the method of cylindrical shells, we
integrate along the coordinate axis perpendicular to the axis of revolution. The ability to choose which variable of integration we
want to use can be a significant advantage with more complicated functions. Also, the specific geometry of the solid sometimes
makes the method of using cylindrical shells more appealing than using the washer method. In the last part of this section, we
review all the methods for finding volume that we have studied and lay out some guidelines to help you determine which method to
use in a given situation.

The Method of Cylindrical Shells


Again, we are working with a solid of revolution. As before, we define a region R , bounded above by the graph of a function
y = f (x), below by the x -axis, and on the left and right by the lines x = a and x = b , respectively, as shown in Figure 6.3.1a. We

then revolve this region around the y -axis, as shown in Figure 6.3.1b. Note that this is different from what we have done before.
Previously, regions defined in terms of functions of x were revolved around the x-axis or a line parallel to it.

Figure 6.3.1 : (a) A region bounded by the graph of a function of x . (b) The solid of revolution formed when the region is revolved
around the y -axis.
As we have done many times before, partition the interval \([a, b]\) using a regular partition, \(P = \{x_0, x_1, \ldots, x_n\}\), and, for
\(i = 1, 2, \ldots, n\), choose a point \(x_i^* \in [x_{i-1}, x_i]\). Then, construct a rectangle over the interval \([x_{i-1}, x_i]\) of height \(f(x_i^*)\) and width
\(\Delta x\). A representative rectangle is shown in Figure 6.3.2a. When that rectangle is revolved around the y-axis, instead of a disk or a
washer, we get a cylindrical shell, as shown in Figure 6.3.2b.
Figure 6.3.2 : (a) A representative rectangle. (b) When this rectangle is revolved around the y -axis, the result is a cylindrical shell.
(c) When we put all the shells together, we get an approximation of the original solid.
To calculate the volume of this shell, consider Figure 6.3.3.

Figure 6.3.3 : Calculating the volume of the shell.


The shell is a cylinder, so its volume is the cross-sectional area multiplied by the height of the cylinder. The cross-sections are
annuli (ring-shaped regions; essentially, circles with a hole in the center), with outer radius \(x_i\) and inner radius \(x_{i-1}\). Thus, the
cross-sectional area is \(\pi x_i^2 - \pi x_{i-1}^2\). The height of the cylinder is \(f(x_i^*)\). Then the volume of the shell is

\[
\begin{aligned}
V_{\text{shell}} &= f(x_i^*)(\pi x_i^2 - \pi x_{i-1}^2) \\
&= \pi f(x_i^*)(x_i^2 - x_{i-1}^2) \\
&= \pi f(x_i^*)(x_i + x_{i-1})(x_i - x_{i-1}) \\
&= 2\pi f(x_i^*)\left(\frac{x_i + x_{i-1}}{2}\right)(x_i - x_{i-1}).
\end{aligned}
\]

Note that \(x_i - x_{i-1} = \Delta x\), so we have

\[
V_{\text{shell}} = 2\pi f(x_i^*)\left(\frac{x_i + x_{i-1}}{2}\right)\Delta x.
\]

Furthermore, \(\frac{x_i + x_{i-1}}{2}\) is both the midpoint of the interval \([x_{i-1}, x_i]\) and the average radius of the shell, and we can approximate
this by \(x_i^*\). We then have

\[
V_{\text{shell}} \approx 2\pi f(x_i^*)\,x_i^*\,\Delta x.
\]
Another way to think of this is to think of making a vertical cut in the shell and then opening it up to form a flat plate (Figure
6.3.4).

Figure 6.3.4 : (a) Make a vertical cut in a representative shell. (b) Open the shell up to form a flat plate.
In reality, the outer radius of the shell is greater than the inner radius, and hence the back edge of the plate would be slightly longer
than the front edge of the plate. However, we can approximate the flattened shell by a flat plate of height \(f(x_i^*)\), width \(2\pi x_i^*\), and
thickness \(\Delta x\) (Figure 6.3.4). The volume of the shell, then, is approximately the volume of the flat plate. Multiplying the height, width,
and depth of the plate, we get

\[
V_{\text{shell}} \approx f(x_i^*)(2\pi x_i^*)\,\Delta x,
\]

which is the same formula we had before.


To calculate the volume of the entire solid, we then add the volumes of all the shells and obtain
\[
V \approx \sum_{i=1}^{n} \left(2\pi x_i^* f(x_i^*)\,\Delta x\right).
\]

Here we have another Riemann sum, this time for the function \(2\pi x f(x)\). Taking the limit as \(n \to \infty\) gives us

\[
V = \lim_{n\to\infty} \sum_{i=1}^{n} \left(2\pi x_i^* f(x_i^*)\,\Delta x\right) = \int_a^b (2\pi x f(x))\,dx.
\]

This leads to the following rule for the method of cylindrical shells.

 Rule: The Method of Cylindrical Shells

Let f (x) be continuous and nonnegative. Define R as the region bounded above by the graph of f (x), below by the x-axis, on
the left by the line x = a , and on the right by the line x = b . Then the volume of the solid of revolution formed by revolving R
around the y -axis is given by
\[
V = \int_a^b (2\pi x f(x))\,dx.
\]
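The rule is the limit of a sum of shell volumes, and that limit can be observed numerically. The sketch below is a minimal illustration with two stated assumptions: the sample point x_i* is taken to be the midpoint of each subinterval, and the test function f(x) = x² on [0, 1] is a hypothetical choice (not one of the text's examples) whose exact shell volume is 2π∫x³ dx = π/2.

```python
import math


def shell_volume(f, a, b, n):
    """Sum of n shell volumes 2*pi*x*f(x)*dx for revolution about the y-axis."""
    dx = (b - a) / n
    total = 0.0
    for i in range(n):
        x_star = a + (i + 0.5) * dx  # sample point x_i^* (midpoint, an assumption)
        total += 2 * math.pi * x_star * f(x_star) * dx
    return total


def f(x):
    """Hypothetical test function, chosen only for illustration."""
    return x**2


exact = math.pi / 2  # 2*pi * integral of x^3 over [0, 1]
errors = [abs(shell_volume(f, 0, 1, n) - exact) for n in (10, 100, 1000)]

print(errors)  # the error shrinks as the number of shells grows
```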

Now let’s consider an example.

 Example 6.3.1: The Method of Cylindrical Shells I

Define \(R\) as the region bounded above by the graph of \(f(x) = 1/x\) and below by the x-axis over the interval \([1, 3]\). Find the
volume of the solid of revolution formed by revolving \(R\) around the y-axis.
Solution
First we must graph the region R and the associated solid of revolution, as shown in Figure 6.3.5.

Figure 6.3.5 : (a) The region R under the graph of f (x) = 1/x over the interval [1, 3] . (b) The solid of revolution generated by
revolving R about the y -axis.

Figure 6.3.5 (c) Visualizing the solid of revolution with CalcPlot3D.


Then the volume of the solid is given by

\[
\begin{aligned}
V &= \int_a^b (2\pi x f(x))\,dx \\
&= \int_1^3 \left(2\pi x\left(\frac{1}{x}\right)\right)dx \\
&= \int_1^3 2\pi\,dx \\
&= 2\pi x\Big|_1^3 = 4\pi \text{ units}^3.
\end{aligned}
\]

 Exercise 6.3.1

Define \(R\) as the region bounded above by the graph of \(f(x) = x^2\) and below by the x-axis over the interval \([1, 2]\). Find the
volume of the solid of revolution formed by revolving \(R\) around the y-axis.

Hint
Use the procedure from Example 6.3.1.

Answer
\(\frac{15\pi}{2} \text{ units}^3\)

 Example 6.3.2: The Method of Cylindrical Shells II

Define \(R\) as the region bounded above by the graph of \(f(x) = 2x - x^2\) and below by the x-axis over the interval \([0, 2]\). Find
the volume of the solid of revolution formed by revolving \(R\) around the y-axis.
Solution
First graph the region R and the associated solid of revolution, as shown in Figure 6.3.6.

Figure 6.3.6: (a) The region \(R\) under the graph of \(f(x) = 2x - x^2\) over the interval \([0, 2]\). (b) The volume of revolution
obtained by revolving \(R\) about the y-axis.
Then the volume of the solid is given by

\[
\begin{aligned}
V &= \int_a^b (2\pi x f(x))\,dx \\
&= \int_0^2 (2\pi x(2x - x^2))\,dx \\
&= 2\pi \int_0^2 (2x^2 - x^3)\,dx \\
&= 2\pi \left[\frac{2x^3}{3} - \frac{x^4}{4}\right]\bigg|_0^2 \\
&= \frac{8\pi}{3} \text{ units}^3.
\end{aligned}
\]

 Exercise 6.3.2

Define \(R\) as the region bounded above by the graph of \(f(x) = 3x - x^2\) and below by the x-axis over the interval \([0, 2]\). Find
the volume of the solid of revolution formed by revolving \(R\) around the y-axis.

Hint
Use the process from Example 6.3.2.

Answer
\(8\pi \text{ units}^3\)

As with the disk method and the washer method, we can use the method of cylindrical shells with solids of revolution, revolved
around the x-axis, when we want to integrate with respect to y . The analogous rule for this type of solid is given here.

 Rule: The Method of Cylindrical Shells for Solids of Revolution around the x-axis

Let g(y) be continuous and nonnegative. Define Q as the region bounded on the right by the graph of g(y) , on the left by the
y -axis, below by the line y = c , and above by the line y = d . Then, the volume of the solid of revolution formed by revolving

Q around the x-axis is given by

\[
V = \int_c^d (2\pi y\,g(y))\,dy.
\]

 Example 6.3.3: The Method of Cylindrical Shells for a Solid Revolved around the x-axis

Define Q as the region bounded on the right by the graph of g(y) = 2√y and on the left by the y -axis for y ∈ [0, 4]. Find the
volume of the solid of revolution formed by revolving Q around the x-axis.
Solution
First, we need to graph the region Q and the associated solid of revolution, as shown in Figure 6.3.7.

Figure 6.3.7 : (a) The region Q to the left of the function g(y) over the interval [0, 4] . (b) The solid of revolution generated by
revolving Q around the x -axis.
Label the shaded region Q. Then the volume of the solid is given by
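\[
\begin{aligned}
V &= \int_c^d (2\pi y\,g(y))\,dy \\
&= \int_0^4 (2\pi y(2\sqrt{y}))\,dy \\
&= 4\pi \int_0^4 y^{3/2}\,dy \\
&= 4\pi \left[\frac{2y^{5/2}}{5}\right]\bigg|_0^4 \\
&= \frac{256\pi}{5} \text{ units}^3.
\end{aligned}
\]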

 Exercise 6.3.3

Define Q as the region bounded on the right by the graph of g(y) = 3/y and on the left by the y -axis for y ∈ [1, 3]. Find the
volume of the solid of revolution formed by revolving Q around the x-axis.

Hint
Use the process from Example 6.3.3.

Answer
\(12\pi \text{ units}^3\)

For the next example, we look at a solid of revolution for which the graph of a function is revolved around a line other than one of
the two coordinate axes. To set this up, we need to revisit the development of the method of cylindrical shells. Recall that we found
the volume of one of the shells to be given by

\[
\begin{aligned}
V_{\text{shell}} &= f(x_i^*)(\pi x_i^2 - \pi x_{i-1}^2) \\
&= \pi f(x_i^*)(x_i^2 - x_{i-1}^2) \\
&= \pi f(x_i^*)(x_i + x_{i-1})(x_i - x_{i-1}) \\
&= 2\pi f(x_i^*)\left(\frac{x_i + x_{i-1}}{2}\right)(x_i - x_{i-1}).
\end{aligned}
\]

This was based on a shell with an outer radius of \(x_i\) and an inner radius of \(x_{i-1}\). If, however, we rotate the region around a line
other than the y-axis, we have a different outer and inner radius. Suppose, for example, that we rotate the region around the line
\(x = -k\), where \(k\) is some positive constant. Then, the outer radius of the shell is \(x_i + k\) and the inner radius of the shell is
\(x_{i-1} + k\). Substituting these terms into the expression for volume, we see that when a plane region is rotated around the line
\(x = -k\), the volume of a shell is given by

\[
\begin{aligned}
V_{\text{shell}} &= 2\pi f(x_i^*)\left(\frac{(x_i + k) + (x_{i-1} + k)}{2}\right)\left((x_i + k) - (x_{i-1} + k)\right) \\
&= 2\pi f(x_i^*)\left(\left(\frac{x_i + x_{i-1}}{2}\right) + k\right)\Delta x.
\end{aligned}
\]

As before, we notice that \(\frac{x_i + x_{i-1}}{2}\) is the midpoint of the interval \([x_{i-1}, x_i]\) and can be approximated by \(x_i^*\). Then, the
approximate volume of the shell is

\[
V_{\text{shell}} \approx 2\pi (x_i^* + k) f(x_i^*)\,\Delta x.
\]

The remainder of the development proceeds as before, and we see that


\[
V = \int_a^b (2\pi (x + k) f(x))\,dx.
\]

We could also rotate the region around other horizontal or vertical lines, such as a vertical line in the right half plane. In each case,
the volume formula must be adjusted accordingly. Specifically, the x-term in the integral must be replaced with an expression
representing the radius of a shell. To see how this works, consider the following example.

 Example 6.3.4: A Region of Revolution Revolved around a Line


Define R as the region bounded above by the graph of f (x) = x and below by the x-axis over the interval [1, 2] . Find the
volume of the solid of revolution formed by revolving R around the line x = −1.
Solution
First, graph the region R and the associated solid of revolution, as shown in Figure 6.3.8.

Figure 6.3.8 : (a) The region R between the graph of f (x) and the x -axis over the interval [1, 2] . (b) The solid of revolution
generated by revolving R around the line x = −1.
Note that the radius of a shell is given by x + 1 . Then the volume of the solid is given by
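\[
\begin{aligned}
V &= \int_1^2 2\pi (x + 1) f(x)\,dx \\
&= \int_1^2 2\pi (x + 1)x\,dx = 2\pi \int_1^2 (x^2 + x)\,dx \\
&= 2\pi \left[\frac{x^3}{3} + \frac{x^2}{2}\right]\bigg|_1^2 \\
&= \frac{23\pi}{3} \text{ units}^3.
\end{aligned}
\]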
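The shifted-radius formula can be sketched as a small function and checked against this example. The code below is illustrative only: `shell_volume_about` and its parameter `k` are names introduced here, and the integral is approximated with a hypothetical `simpson` helper. The only change from the y-axis case is that the radius x is replaced by x + k for revolution about the line x = −k.

```python
import math


def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def shell_volume_about(f, a, b, k=0.0, n=1000):
    """Shell volume when the region under f on [a, b] revolves about x = -k.

    The shell radius is x + k; k = 0 recovers revolution about the y-axis.
    """
    return simpson(lambda x: 2 * math.pi * (x + k) * f(x), a, b, n)


# Example 6.3.4: f(x) = x on [1, 2], revolved about the line x = -1 (k = 1).
volume = shell_volume_about(lambda x: x, 1, 2, k=1)

print(volume, 23 * math.pi / 3)  # the two values agree closely
```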

 Exercise 6.3.4

Define \(R\) as the region bounded above by the graph of \(f(x) = x^2\) and below by the x-axis over the interval \([0, 1]\). Find the
volume of the solid of revolution formed by revolving \(R\) around the line \(x = -2\).

Hint
Use the process from Example 6.3.4.

Answer
\(\frac{11\pi}{6} \text{ units}^3\)

For our final example in this section, let’s look at the volume of a solid of revolution for which the region of revolution is bounded
by the graphs of two functions.

 Example 6.3.5: A Region of Revolution Bounded by the Graphs of Two Functions

Define \(R\) as the region bounded above by the graph of the function \(f(x) = \sqrt{x}\) and below by the graph of the function
\(g(x) = 1/x\) over the interval \([1, 4]\). Find the volume of the solid of revolution generated by revolving \(R\) around the y-axis.

Solution
First, graph the region R and the associated solid of revolution, as shown in Figure 6.3.9.

Figure 6.3.9 : (a) The region R between the graph of f (x) and the graph of g(x) over the interval [1, 4] . (b) The solid of
revolution generated by revolving R around the y -axis.
Note that the axis of revolution is the y -axis, so the radius of a shell is given simply by x. We don’t need to make any
adjustments to the x-term of our integrand. The height of a shell, though, is given by f (x) − g(x) , so in this case we need to
adjust the f (x) term of the integrand. Then the volume of the solid is given by
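\[
\begin{aligned}
V &= \int_1^4 2\pi x\,\bigl(f(x)-g(x)\bigr)\,dx \\
  &= \int_1^4 2\pi x\left(\sqrt{x}-\frac{1}{x}\right)dx = 2\pi\int_1^4 \left(x^{3/2}-1\right)dx \\
  &= 2\pi\left[\frac{2x^{5/2}}{5}-x\right]_1^4 = \frac{94\pi}{5}\ \text{units}^3.
\end{aligned}
\]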
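The final evaluation is compressed into one step. Written out, using the fact that 4^{5/2} = 32, it reads:

```latex
\[
2\pi\left[\frac{2x^{5/2}}{5}-x\right]_1^4
  = 2\pi\left[\left(\frac{64}{5}-4\right)-\left(\frac{2}{5}-1\right)\right]
  = 2\pi\left(\frac{44}{5}+\frac{3}{5}\right)
  = \frac{94\pi}{5}.
\]
```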

 Exercise 6.3.5

Define R as the region bounded above by the graph of f(x) = x and below by the graph of g(x) = x² over the interval [0, 1].
Find the volume of the solid of revolution formed by revolving R around the y-axis.

Hint
Use the process from Example 6.3.5.

Answer
\[ \frac{\pi}{6}\ \text{units}^3 \]

Which Method Should We Use?


We have studied several methods for finding the volume of a solid of revolution, but how do we know which method to use? It
often comes down to a choice of which integral is easiest to evaluate. Figure 6.3.10 describes the different approaches for solids of
revolution around the x-axis. It’s up to you to develop the analogous table for solids of revolution around the y -axis.

Figure 6.3.10
Let’s take a look at a couple of additional problems and decide on the best approach to take for solving them.

 Example 6.3.6: Selecting the Best Method

For each of the following problems, select the best method to find the volume of a solid of revolution generated by revolving
the given region around the x-axis, and set up the integral to find the volume (do not evaluate the integral).
a. The region bounded by the graphs of y = x, y = 2 − x, and the x-axis.
b. The region bounded by the graphs of y = 4x − x² and the x-axis.

Solution
a.
First, sketch the region and the solid of revolution as shown.

Figure 6.3.11: (a) The region R bounded by two lines and the x-axis. (b) The solid of revolution generated by revolving R
about the x-axis.


Looking at the region, if we want to integrate with respect to x, we would have to break the integral into two pieces, because
we have different functions bounding the region over [0, 1] and [1, 2]. In this case, using the disk method, we would have
\[ V = \int_0^1 \pi x^2\,dx + \int_1^2 \pi(2-x)^2\,dx. \]

If we used the shell method instead, we would use functions of y to represent the curves, producing
\[ V = \int_0^1 2\pi y\,[(2-y)-y]\,dy = \int_0^1 2\pi y\,[2-2y]\,dy. \]

Neither of these integrals is particularly onerous, but since the shell method requires only one integral, and the integrand
requires less simplification, we should probably go with the shell method in this case.
b.
First, sketch the region and the solid of revolution as shown.

Figure 6.3.12 : (a) The region R between the curve and the x -axis. (b) The solid of revolution generated by revolving R about
the x -axis.

Looking at the region, it would be problematic to define a horizontal rectangle; the region is bounded on the left and right by
the same function. Therefore, we can dismiss the method of shells. The solid has no cavity in the middle, so we can use the
method of disks. Then
\[ V = \int_0^4 \pi\left(4x-x^2\right)^2 dx. \]
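The example asks only for the setup, but a short numerical check can confirm that the disk and shell integrals in part (a) describe the same solid. This is a sketch under the assumption that a composite Simpson's rule is accurate enough here; `simpson` is an illustrative helper, not something defined in the text.

```python
import math

def simpson(func, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

# Part (a), disk method: two integrals in x, split where the boundary changes.
disk = (simpson(lambda x: math.pi * x**2, 0, 1)
        + simpson(lambda x: math.pi * (2 - x)**2, 1, 2))

# Part (a), shell method: one integral in y, with shell height (2 - y) - y.
shell = simpson(lambda y: 2 * math.pi * y * ((2 - y) - y), 0, 1)

print(disk, shell)  # both approximate 2*pi/3
```

The two values agree, so the choice between the methods in part (a) is about convenience, not correctness.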

 Exercise 6.3.6
Select the best method to find the volume of a solid of revolution generated by revolving the given region around the x-axis,
and set up the integral to find the volume (do not evaluate the integral): the region bounded by the graphs of y = 2 − x² and
y = x².

Hint
Sketch the region and use Figure 6.3.10 to decide which integral is easiest to evaluate.

Answer
Use the method of washers;
\[ V = \int_{-1}^{1} \pi\left[\left(2-x^2\right)^2-\left(x^2\right)^2\right]dx \]

Key Concepts
The method of cylindrical shells is another method for using a definite integral to calculate the volume of a solid of revolution.
This method is sometimes preferable to either the method of disks or the method of washers because we integrate with respect
to the other variable. In some cases, one integral is substantially more complicated than the other.
The geometry of the functions and the difficulty of the integration are the main factors in deciding which integration method to
use.

Key Equations
Method of Cylindrical Shells
\[ V = \int_a^b 2\pi x f(x)\,dx \]

Glossary
method of cylindrical shells
a method of calculating the volume of a solid of revolution by dividing the solid into nested cylindrical shells; this method is
different from the methods of disks or washers in that we integrate with respect to the opposite variable

6.3: Volumes by Cylindrical Shells is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
6.3: Volumes of Revolution - Cylindrical Shells by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0. Original source:
https://openstax.org/details/books/calculus-volume-1.

6.4: Work
Learning Objectives
In this section, we strive to understand the ideas generated by the following important questions:
How do we measure the work accomplished by a varying force that moves an object a certain distance?
What is the total force exerted by water against a dam?
How are both of the above concepts and their corresponding use of definite integrals similar to problems we have
encountered in the past involving formulas such as “distance equals rate times time” and “mass equals density times
volume”?

In our work to date with the definite integral, we have seen several different circumstances where the integral enables us to measure
the accumulation of a quantity that varies, provided the quantity is approximately constant over small intervals. For instance, based
on the fact that the area of a rectangle is A = l ⋅ w, if we wish to find the area bounded by a nonnegative curve y = f(x) and the
x-axis on an interval [a, b], a representative slice of width Δx has area A_slice = f(x)Δx, and thus as we let the width of the
representative slice tend to zero, we find that the exact area of the region is
\[ A = \int_a^b f(x)\,dx. \tag{6.4.1} \]

In a similar way, if we know that the velocity of a moving object is given by the function y = v(t), and we wish to know the
distance the object travels on an interval [a, b] where v(t) is nonnegative, we can use a definite integral to generalize the fact that
d = r ⋅ t when the rate, r, is constant. More specifically, on a short time interval Δt, v(t) is roughly constant, and hence for a
small slice of time, d_slice = v(t)Δt, and so as the width of the time interval Δt tends to zero, the exact distance traveled is given
by the definite integral
\[ d = \int_a^b v(t)\,dt. \tag{6.4.2} \]
a

Finally, when we recently learned about the mass of an object of non-constant density, we saw that since M = D ⋅ V (mass equals
density times volume, provided that density is constant), if we can consider a small slice of an object on which the density is
approximately constant, a definite integral may be used to determine the exact mass of the object. For instance, if we have a thin
rod whose cross sections have constant density, but whose density is distributed along the x-axis according to the function
y = ρ(x), it follows that for a small slice of the rod that is Δx thick, M_slice = ρ(x)Δx. In the limit as Δx → 0, we then find that
the total mass is given by
\[ M = \int_a^b \rho(x)\,dx. \tag{6.4.3} \]
a

Note that all three of these situations are similar in that we have a basic rule (A = l ⋅ w, d = r ⋅ t, M = D ⋅ V ) where one of the
two quantities being multiplied is no longer constant; in each, we consider a small interval for the other variable in the formula,
calculate the approximate value of the desired quantity (area, distance, or mass) over the small interval, and then use a definite
integral to sum the results as the length of the small intervals is allowed to approach zero. It should be apparent that this approach
will work effectively for other situations where we have a quantity of interest that varies. We next turn to the notion of work: from
physics, a basic principle is that work is the product of force and distance. For example, if a person exerts a force of 20 pounds to
lift a 20-pound weight 4 feet off the ground, the total work accomplished is
\begin{align}
W &= F \cdot d \tag{6.4.4} \\
  &= 20 \cdot 4 \tag{6.4.5} \\
  &= 80\ \text{foot-pounds}. \tag{6.4.6}
\end{align}

If force and distance are measured in English units (pounds and feet), then the units on work are foot-pounds. If instead we work in
metric units, where forces are measured in Newtons and distances in meters, the units on work are Newton-meters.

6.4.1 https://math.libretexts.org/@go/page/4480
Figure 6.14: Three settings where we compute the accumulation of a varying quantity: the area under y = f (x), the distance
traveled by an object with velocity y = v(t) , and the mass of a bar with density function y = ρ(x).
Of course, the formula W = F ⋅ d only applies when the force is constant while it is exerted over the distance d. In Preview
Activity 6.4.1, we explore one way that we can use a definite integral to compute the total work accomplished when the force exerted
varies.

Preview Activity 6.4.1

A bucket is being lifted from the bottom of a 50-foot deep well; its weight (including the water), B , in pounds at a height h
feet above the water is given by the function B(h) . When the bucket leaves the water, the bucket and water together weigh
B(0) = 20 pounds, and when the bucket reaches the top of the well, B(50) = 12 pounds. Assume that the bucket loses water

at a constant rate (as a function of height, h ) throughout its journey from the bottom to the top of the well.
a. Find a formula for B(h) .
b. Compute the value of the product B(5)Δh , where Δh = 2 feet. Include units on your answer. Explain why this product
represents the approximate work it took to move the bucket of water from h = 5 to h = 7 .
c. Is the value in (b) an over- or under-estimate of the actual amount of work it took to move the bucket from h = 5 to h = 7 ?
Why?
d. Compute the value of the product B(22)Δh, where Δh = 0.25 feet. Include units on your answer. What is the meaning of
the value you found?
e. More generally, what does the quantity W_slice = B(h)Δh measure for a given value of h and a small positive value of
Δh?
f. Evaluate the definite integral ∫₀⁵⁰ B(h) dh. What is the meaning of the value you find? Why?

Work
Because work is calculated by the rule W = F ⋅ d whenever the force F is constant, it follows that we can use a definite integral
to compute the work accomplished by a varying force. For example, suppose that in a setting similar to the problem posed in
Preview Activity 6.4.1, we have a bucket being lifted in a 50-foot well whose weight at height h is given by
\[ B(h) = 12 + 8e^{-0.1h}. \tag{6.4.7} \]

In contrast to the problem in the preview activity, this bucket is not leaking at a constant rate; but because the weight of the bucket
and water is not constant, we have to use a definite integral to determine the total work that results from lifting the bucket. Observe
that at a height h above the water, the approximate work to move the bucket a small distance Δh is
\[ W_{\text{slice}} = B(h)\,\Delta h = \left(12 + 8e^{-0.1h}\right)\Delta h. \tag{6.4.8} \]

Hence, if we let Δh tend to 0 and take the sum of all of the slices of work accomplished on these small intervals, it follows that the
total work is given by
\[ W = \int_0^{50} B(h)\,dh = \int_0^{50} \left(12 + 8e^{-0.1h}\right)dh. \tag{6.4.9} \]

While it is a straightforward exercise to evaluate this integral exactly using the First Fundamental Theorem of Calculus, in applied
settings such as this one we will typically use computing technology to find accurate approximations of integrals that are of interest
to us. Here, it turns out that
\[ W = \int_0^{50} \left(12 + 8e^{-0.1h}\right)dh \approx 679.461\ \text{foot-pounds}. \tag{6.4.10} \]
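"Computing technology" here can be as modest as a few lines of code. The sketch below approximates the work integral with a composite Simpson's rule and compares it with the value obtained from the antiderivative 12h − 80e^{−0.1h}. The names `simpson` and `bucket_weight` are illustrative assumptions, not part of the text.

```python
import math

def simpson(func, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def bucket_weight(h):
    """Weight in pounds of the bucket and water at height h feet, B(h) = 12 + 8e^(-0.1h)."""
    return 12 + 8 * math.exp(-0.1 * h)

# Numerical approximation of the work integral over the 50-foot lift.
work_numeric = simpson(bucket_weight, 0, 50)

# Exact value: [12h - 80 e^(-0.1h)] evaluated from h = 0 to h = 50.
work_exact = 600 + 80 * (1 - math.exp(-5))

print(round(work_numeric, 3), round(work_exact, 3))
```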

Our work in Preview Activity 6.4.1 and in the most recent example above employs the following important general principle.
For an object being moved in the positive direction along an axis, x, by a force F(x), the total work to move the object from a to b
is given by
\[ W = \int_a^b F(x)\,dx. \tag{6.4.11} \]

Activity 6.4.1

Consider the following situations in which a varying force accomplishes work.


a. Suppose that a heavy rope hangs over the side of a cliff. The rope is 200 feet long and weighs 0.3 pounds per foot; initially
the rope is fully extended. How much work is required to haul in the entire length of the rope? (Hint: set up a function
F (h) whose value is the weight of the rope remaining over the cliff after h feet have been hauled in.)

b. A leaky bucket is being hauled up from a 100 foot deep well. When lifted from the water, the bucket and water together
weigh 40 pounds. As the bucket is being hauled upward at a constant rate, the bucket leaks water at a constant rate so that it
is losing weight at a rate of 0.1 pounds per foot. What function B(h) tells the weight of the bucket after the bucket has been
lifted h feet? What is the total amount of work accomplished in lifting the bucket to the top of the well?
c. Now suppose that the bucket in (b) does not leak at a constant rate, but rather that its weight at a height h feet above the
water is given by B(h) = 25 + 15e^{−0.05h}. What is the total work required to lift the bucket 100 feet? What is the average
force exerted on the bucket on the interval h = 0 to h = 100?
d. From physics, Hooke’s Law for springs states that the amount of force required to hold a spring that is compressed (or
extended) to a particular length is proportionate to the distance the spring is compressed (or extended) from its natural
length. That is, the force to compress (or extend) a spring x units from its natural length is F (x) = kx for some constant k
(which is called the spring constant.) For springs, we choose to measure the force in pounds and the distance the spring is
compressed in feet. Suppose that a force of 5 pounds extends a particular spring 4 inches (1/3 foot) beyond its natural
length.
i. Use the given fact that F (1/3) = 5 to find the spring constant k .
ii. Find the work done to extend the spring from its natural length to 1 foot beyond its natural length.
iii. Find the work required to extend the spring from 1 foot beyond its natural length to 1.5 feet beyond its natural length.

Work: Pumping Liquid from a Tank


In certain geographic locations where the water table is high, residential homes with basements have a peculiar feature: in the
basement, one finds a large hole in the floor, and in the hole, there is water. For example, Figure 6.15 shows a sump
crock.

Figure 6.15: A sump crock. Image credit to www.warreninspect.com/basement-moisture.
Essentially, a sump crock provides an outlet for water that may build up beneath the basement floor; of course, as that water rises, it
is imperative that the water not flood the basement. Hence, in the crock we see the presence of a floating pump that sits on the
surface of the water: this pump is activated by elevation, so when the water level reaches a particular height, the pump turns on and
pumps a certain portion of the water out of the crock, hence relieving the water buildup beneath the foundation. One of the
questions we’d like to answer is: how much work does a sump pump accomplish? To that end, let’s suppose that we have a sump
crock that has the shape of a frustum of a cone, as pictured in Figure 6.16. Assume that the crock has a diameter of 3 feet at its
surface, a diameter of 1.5 feet at its base, and a depth of 4 feet. In addition, suppose that the sump pump is set up so that it pumps
the water vertically up a pipe to a drain that is located at ground level just outside a basement window. To accomplish this, the
pump must send the water to a location 9 feet above the surface of the sump crock.

Figure 6.16: A sump crock with approximately cylindrical cross-sections that is 4 feet deep, 1.5 feet in diameter at its base, and 3
feet in diameter at its top.
It turns out to be advantageous to think of the depth below the surface of the crock as being the independent variable, so, in
problems such as this one we typically let the positive x-axis point down, and the positive y -axis to the right, as pictured in the
figure. As we think about the work that the pump does, we first realize that the pump sits on the surface of the water, so it makes
sense to think about the pump moving the water one “slice” at a time, where it takes a thin slice from the surface, pumps it out of
the tank, and then proceeds to pump the next slice below. For the sump crock described in this example, each slice of water is
cylindrical in shape. We see that the radius of each approximately cylindrical slice varies according to the linear function y = f (x)
that passes through the points (0, 1.5) and (4, 0.75), where x is the depth of the particular slice in the tank; it is a straightforward
exercise to find that f (x) = 1.5 − 0.1875x. Now we are prepared to think about the overall problem in several steps:
a. determining the volume of a typical slice;
b. finding the weight of a typical slice, and thus the force that must be exerted on it (we assume that the weight density of
water is 62.4 pounds per cubic foot);
c. deciding the distance that a typical slice moves; and
d. computing the work to move a representative slice.
Once we know the work it takes to move one slice, we use a definite integral over an appropriate interval to find the total work.
Consider a representative cylindrical slice that sits on the surface of the water at a depth of x feet below the top of the crock. It
follows that the approximate volume of that slice is given by
\[ V_{\text{slice}} = \pi f(x)^2\,\Delta x = \pi(1.5 - 0.1875x)^2\,\Delta x. \]
Since water weighs 62.4 lb/ft³, it follows that the approximate weight of a representative slice, which is also the approximate force
the pump must exert to move the slice, is
\[ F_{\text{slice}} = 62.4 \cdot V_{\text{slice}} = 62.4\pi(1.5 - 0.1875x)^2\,\Delta x. \]
Because the slice is located at a depth of x feet below the top of the crock, the slice being moved by the pump must move x feet to
get to the level of the basement floor, and then, as stated in the problem description, be moved another 9 feet to reach the drain at
ground level outside a basement window. Hence, the total distance a representative slice travels is
\[ d_{\text{slice}} = x + 9. \]

Finally, we note that the work to move a representative slice is given by
\[ W_{\text{slice}} = F_{\text{slice}} \cdot d_{\text{slice}} = 62.4\pi(1.5 - 0.1875x)^2\,\Delta x \cdot (x + 9), \]
since the force to move a particular slice is constant. We sum the work required to move slices throughout the tank (from x = 0 to
x = 4), let Δx → 0, and hence
\[ W = \int_0^4 62.4\pi(1.5 - 0.1875x)^2 (x + 9)\,dx, \]
which, when evaluated using appropriate technology, shows that the total work is W = 3463.2π ≈ 10,880 foot-pounds.
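The four steps (volume, weight, distance, work per slice) translate directly into code. The sketch below is one way to evaluate the resulting integral numerically; the constant and function names are illustrative assumptions, and Simpson's rule stands in for whatever technology is at hand. It prints the total work in foot-pounds and that value divided by π.

```python
import math

def simpson(func, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

WEIGHT_DENSITY = 62.4   # weight density of water, lb/ft^3
LIFT_ABOVE_TOP = 9.0    # feet from the top of the crock up to the drain
CROCK_DEPTH = 4.0       # feet

def radius(x):
    """Radius in feet of the slice at depth x feet below the top of the crock."""
    return 1.5 - 0.1875 * x

def work_per_unit_depth(x):
    """(weight density) * (cross-sectional area) * (distance the slice travels)."""
    return WEIGHT_DENSITY * math.pi * radius(x) ** 2 * (x + LIFT_ABOVE_TOP)

total_work = simpson(work_per_unit_depth, 0, CROCK_DEPTH)
print(total_work, total_work / math.pi)
```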
The preceding example demonstrates the standard approach to finding the work required to empty a tank filled with liquid. The
main task in each such problem is to determine the volume of a representative slice, followed by the force exerted on the slice, as
well as the distance such a slice moves. In the case where the units are metric, there is one key difference: in the metric setting,
rather than weight, we normally first find the mass of a slice. For instance, if distance is measured in meters, the mass density of
water is 1000 kg/m³. In that setting, we can find the mass of a typical slice (in kg). To determine the force required to move it, we
use F = ma, where m is the object’s mass and a is the acceleration due to gravity, 9.81 m/s² (equivalently, 9.81 N/kg). That is, in
metric units, the weight density of water is 9810 N/m³.
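Spelled out, the conversion from mass density to weight density is a single multiplication:

```latex
\[
1000\ \frac{\text{kg}}{\text{m}^3} \times 9.81\ \frac{\text{N}}{\text{kg}}
  = 9810\ \frac{\text{N}}{\text{m}^3}.
\]
```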

Activity 6.4.2

In each of the following problems, determine the total work required to accomplish the described task. In parts (b) and (c), a
key step is to find a formula for a function that describes the curve that forms the side boundary of the tank.

Figure 6.17: A trough with triangular ends, as described in Activity 6.4.2, part (c).
a. Consider a vertical cylindrical tank of radius 2 meters and depth 6 meters. Suppose the tank is filled with 4 meters of water
of mass density 1000 kg/m³, and the top 1 meter of water is pumped over the top of the tank.
b. Consider a hemispherical tank with a radius of 10 feet. Suppose that the tank is full to a depth of 7 feet with water of weight
density 62.4 pounds/ft³, and the top 5 feet of water are pumped out of the tank to a tanker truck whose height is 5 feet
above the top of the tank.
c. Consider a trough with triangular ends, as pictured in Figure 6.17, where the tank is 10 feet long, the top is 5 feet wide, and
the tank is 4 feet deep. Say that the trough is full to within 1 foot of the top with water of weight density 62.4 pounds/ft³,
and a pump is used to empty the tank until the water remaining in the tank is 1 foot deep.

Force due to Hydrostatic Pressure


When a dam is built, it is imperative for engineers to understand how much force water will exert against the face of the dam.
The first thing we realize is the force exerted by the fluid is related to the natural concept of pressure. The pressure a force exerts on
a region is measured in units of force per unit of area: for example, the air pressure in a tire is often measured in pounds per square
inch (PSI). Hence, we see that the general relationship is given by
\[ P = \frac{F}{A}, \quad \text{or} \quad F = P \cdot A, \]

where P represents pressure, F represents force, and A the area of the region being considered. Of course, in the equation F = PA,
we assume that the pressure is constant over the entire region A.
Most people know from experience that the deeper one dives underwater while swimming, the greater the pressure that is exerted
by the water. This is due to the fact that the deeper one dives, the more water there is right on top of the swimmer: it is the force
that “column” of water exerts that determines the pressure the swimmer experiences. To get water pressure measured in its standard
units (pounds per square foot), we say that the total water pressure is found by computing the total weight of the column of water
that lies above a region of area 1 square foot at a fixed depth. Such a rectangular column with a 1 × 1 base and a depth of d feet has
volume V = 1 · 1 · d ft³, and thus the corresponding weight of the water overhead is 62.4d pounds. Since this is also the amount of
force being exerted on a 1 square foot region at a depth d feet underwater, we see that P = 62.4d (lbs/ft²) is the pressure exerted by
water at depth d.
Knowing that P = 62.4d gives the pressure exerted by water at a depth of d, together with the fact that F = PA, will now
enable us to compute the total force that water exerts on a dam, as we see in the following example.

Example 6.4.3

Consider a trapezoid-shaped dam that is 60 feet wide at its base and 90 feet wide at its top, and assume the dam is 25 feet tall
with water that rises to within 5 feet of the top of its face. Water weighs 62.4 pounds per cubic foot. How much force does the
water exert against the dam?
Solution
First, we sketch a picture of the dam, as shown in Figure 6.18. Note that, as in problems involving the work to pump out a tank,
we let the positive x-axis point down.
It is essential to use the fact that pressure is constant at a fixed depth. Hence, we consider a slice of water at constant depth on
the face, such as the one shown in the figure. First, the approximate area of this slice is the area of the pictured rectangle. Since
the width of that rectangle depends on the variable x (which represents how far the slice lies from the top of the dam), we
find a formula for the function y = f(x) that determines one side of the face of the dam. Since f is linear, it is straightforward
to find that y = f(x) = 45 − (3/5)x. Hence, the approximate area of a representative slice is
\[ A_{\text{slice}} = 2f(x)\,\Delta x = 2\left(45 - \frac{3}{5}x\right)\Delta x. \]

At any point on this slice, the depth is approximately constant, and thus the pressure can be considered constant. In particular,
we note that since x measures the distance to the top of the dam, and because the water rises to within 5 feet of the top of the
dam, the depth of any point on the representative slice is approximately (x − 5). Now, since pressure is given by P = 62.4d, we
have that at any point on the representative slice
\[ P_{\text{slice}} = 62.4(x - 5). \]
Figure 6.18: A trapezoidal dam that is 25 feet tall, 60 feet wide at its base, 90 feet wide at its top, with the water line 5 feet
down from the top of its face.
Knowing both the pressure and area, we can find the force the water exerts on the slice. Using F = PA, it follows that
\[ F_{\text{slice}} = P_{\text{slice}} \cdot A_{\text{slice}} = 62.4(x - 5) \cdot 2\left(45 - \frac{3}{5}x\right)\Delta x. \]

Finally, we use a definite integral to sum the forces over the appropriate range of x-values. Since the water rises to within 5
feet of the top of the dam, we start at x = 5 and slice all the way to the bottom of the dam, where x = 25 (the dam is 25 feet
tall). Hence,
\[ F = \int_{x=5}^{x=25} 62.4(x - 5) \cdot 2\left(45 - \frac{3}{5}x\right)dx. \]
Using technology to evaluate the integral, we find F = 848,640 ≈ 8.49 × 10⁵ pounds.
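The slice-by-slice reasoning (pressure at a depth times the area of a thin horizontal strip) can be checked numerically. In the sketch below the limits come from the stated geometry: the water line sits at x = 5, and because the dam is 25 feet tall its base sits at x = 25. All names are illustrative assumptions, and Simpson's rule is just one convenient integrator.

```python
def simpson(func, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

WEIGHT_DENSITY = 62.4  # lb/ft^3, so pressure at depth d feet is 62.4 * d lb/ft^2
WATERLINE = 5.0        # water surface, in feet below the top of the dam
DAM_HEIGHT = 25.0      # the base of the dam lies at x = 25

def half_width(x):
    """Half the width of the dam face at x feet below the top: 45 at the top, 30 at the base."""
    return 45 - 0.6 * x

def force_per_unit_depth(x):
    """Pressure on the strip at x times the strip's full width."""
    pressure = WEIGHT_DENSITY * (x - WATERLINE)
    return pressure * 2 * half_width(x)

total_force = simpson(force_per_unit_depth, WATERLINE, DAM_HEIGHT)
print(total_force)  # total force in pounds
```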

Activity 6.4.4

In each of the following problems, determine the total force exerted by water against the surface that is described.
a. Consider a rectangular dam that is 100 feet wide and 50 feet tall, and suppose that water presses against the dam all the way
to the top.
b. Consider a semicircular dam with a radius of 30 feet. Suppose that the water rises to within 10 feet of the top of the dam.
c. Consider a trough with triangular ends, as pictured in Figure 6.17, where the tank is 10 feet long, the top is 5 feet wide, and
the tank is 4 feet deep. Say that the trough is full to within 1 foot of the top with water of weight density 62.4 pounds/ft³.
How much force does the water exert against one of the triangular ends?

While there are many different formulas that we use in solving problems involving work, force, and pressure, it is important to
understand that the fundamental ideas behind these problems are similar to several others that we’ve encountered in applications of
the definite integral. In particular, the basic idea is to take a difficult problem and somehow slice it into more manageable pieces
that we understand, and then use a definite integral to add up these simpler pieces.

Summary
In this section, we encountered the following important ideas:
To measure the work accomplished by a varying force that moves an object, we subdivide the problem into pieces on which we
can use the formula W = F · d, and then use a definite integral to sum the work accomplished on each piece.
To find the total force exerted by water against a dam, we use the formula F = P · A to measure the force exerted on a slice that
lies at a fixed depth, and then use a definite integral to sum the forces across the appropriate range of depths.
Because work is computed as the product of force and distance (provided force is constant), and the force water exerts on a dam
can be computed as the product of pressure and area (provided pressure is constant), problems involving these concepts are
similar to earlier problems we did using definite integrals to find distance (via “distance equals rate times time”) and mass
(“mass equals density times volume”).

6.4: Work is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
6.4: Physics Applications - Work, Force, and Pressure by Matthew Boelkins, David Austin & Steven Schlicker is licensed CC BY-SA 4.0.
Original source: https://activecalculus.org/single.

6.5: Average Value of a Function
The average of some finite set of values is a familiar concept. If, for example, the class scores on a quiz are 10, 9, 10, 8, 7, 5, 7, 6,
3, 2, 7, 8, then the average score is the sum of these numbers divided by the size of the class:
\[ \text{average score} = \frac{10 + 9 + 10 + 8 + 7 + 5 + 7 + 6 + 3 + 2 + 7 + 8}{12} = \frac{82}{12} \approx 6.83. \tag{6.5.1} \]

Suppose that between t = 0 and t = 1 the speed of an object is sin(πt). What is the average speed of the object over that time?
The question sounds as if it must make sense, yet we can't merely add up some number of speeds and divide, since the speed is
changing continuously over the time interval.
To make sense of "average" in this context, we fall back on the idea of approximation. Consider the speed of the object at tenth of a
second intervals: sin 0, sin(0.1π), sin(0.2π), sin(0.3π), …, sin(0.9π). The average speed "should" be fairly close to the average of
these ten speeds:
\[ \frac{1}{10}\sum_{i=0}^{9} \sin(\pi i/10) \approx \frac{1}{10}(6.3) = 0.63. \tag{6.5.2} \]

Of course, if we compute more speeds at more times, the average of these speeds should be closer to the "real" average. If we take
the average of n speeds at evenly spaced times, we get:
\[ \frac{1}{n}\sum_{i=0}^{n-1} \sin(\pi i/n). \tag{6.5.3} \]

Here the individual times are t_i = i/n, so rewriting slightly we have
\[ \frac{1}{n}\sum_{i=0}^{n-1} \sin(\pi t_i). \tag{6.5.4} \]

This is almost the sort of sum that we know turns into an integral; what's apparently missing is Δt, but in fact Δt = 1/n, the
length of each subinterval. So rewriting again:
\[ \frac{1}{n}\sum_{i=0}^{n-1} \sin(\pi t_i) = \sum_{i=0}^{n-1} \sin(\pi t_i)\,\Delta t. \tag{6.5.5} \]

Now this has exactly the right form, so that in the limit we get
\[ \text{average speed} = \int_0^1 \sin(\pi t)\,dt = \left.-\frac{\cos(\pi t)}{\pi}\right|_0^1 = -\frac{\cos(\pi)}{\pi} + \frac{\cos(0)}{\pi} = \frac{2}{\pi} \approx 0.6366 \approx 0.64. \]
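The convergence of the sample averages to the integral can be observed directly. The following sketch averages n evenly spaced samples of sin(πt) on [0, 1] for increasing n and prints the limiting value 2/π for comparison; `sample_average` is an illustrative name, not something from the text.

```python
import math

def sample_average(func, a, b, n):
    """Average of func at the n evenly spaced left endpoints t_i = a + i*(b - a)/n."""
    dt = (b - a) / n
    return sum(func(a + i * dt) for i in range(n)) / n

def speed(t):
    """Speed of the object at time t."""
    return math.sin(math.pi * t)

for n in (10, 100, 1000):
    print(n, sample_average(speed, 0, 1, n))

print("limit:", 2 / math.pi)
```

With n = 10 this reproduces the ten-sample estimate of about 0.63, and larger n moves the average toward 2/π.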
It's not entirely obvious from this one simple example how to compute such an average in general. Let's look at a somewhat more
complicated case. Suppose that the velocity of an object is 16t² + 5 feet per second. What is the average velocity between t = 1
and t = 3? Again we set up an approximation to the average:
\[ \frac{1}{n}\sum_{i=0}^{n-1} \left(16t_i^2 + 5\right), \tag{6.5.6} \]

where the values t_i are evenly spaced times between 1 and 3. Once again we are "missing" Δt, and this time 1/n is not the correct
value. What is Δt in general? It is the length of a subinterval; in this case we take the interval [1, 3] and divide it into n
subintervals, so each has length (3 − 1)/n = 2/n = Δt. Now with the usual "multiply and divide by the same thing" trick we can
rewrite the sum:
\[ \frac{1}{n}\sum_{i=0}^{n-1} \left(16t_i^2 + 5\right) = \frac{1}{3-1}\sum_{i=0}^{n-1} \left(16t_i^2 + 5\right)\frac{3-1}{n} = \frac{1}{2}\sum_{i=0}^{n-1} \left(16t_i^2 + 5\right)\frac{2}{n} = \frac{1}{2}\sum_{i=0}^{n-1} \left(16t_i^2 + 5\right)\Delta t. \tag{6.5.7} \]

In the limit this becomes


\[ \frac{1}{2}\int_1^3 \left(16t^2 + 5\right)dt = \frac{1}{2}\cdot\frac{446}{3} = \frac{223}{3}. \tag{6.5.8} \]

6.5.1 https://math.libretexts.org/@go/page/4481
Does this seem reasonable? Let's picture it: in figure 9.4.1 is the velocity function together with the horizontal line
y = 223/3 ≈ 74.3. Certainly the height of the horizontal line looks at least plausible for the average height of the curve.

Figure 9.4.1. Average velocity.


Here's another way to interpret "average" that may make our computation appear even more reasonable. The object of our example
goes a certain distance between t = 1 and t = 3. If instead the object were to travel at the average speed over the same time, it
should go the same distance. At an average speed of 223/3 feet per second for two seconds the object would go 446/3 feet. How
far does it actually go? We know how to compute this:
\[ \int_1^3 v(t)\,dt = \int_1^3 \left(16t^2 + 5\right)dt = \frac{446}{3}. \tag{6.5.9} \]

So now we see that another interpretation of the calculation


3
1 2
1 446 223
∫ 16 t + 5 dt = = (6.5.10)
2 1
2 3 3

is: total distance traveled divided by the time in transit, namely, the usual interpretation of average speed.
In the case of speed, or more properly velocity, we can always interpret "average" as total (net) distance divided by time. But in the case of a different sort of quantity this interpretation does not obviously apply, while the approximation approach always does. We might interpret the same problem geometrically: what is the average height of \(16x^2 + 5\) on the interval \([1, 3]\)? We approximate this in exactly the same way, by adding up many sample heights and dividing by the number of samples. In the limit we get the same result:

\[ \lim_{n\to\infty}\frac{1}{n}\sum_{i=0}^{n-1}\left(16x_i^2 + 5\right) = \frac{1}{2}\int_1^3 \left(16x^2 + 5\right) dx = \frac{1}{2}\cdot\frac{446}{3} = \frac{223}{3}. \tag{6.5.11} \]

We can interpret this result in a slightly different way. The area under \(y = 16x^2 + 5\) above \([1, 3]\) is

\[ \int_1^3 \left(16t^2 + 5\right) dt = \frac{446}{3}. \tag{6.5.12} \]

The area under y = 223/3 over the same interval [1, 3] is simply the area of a rectangle that is 2 by 223/3 with area 446/3. So the
average height of a function is the height of the horizontal line that produces the same area over the given interval.
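The "add up many sample heights and divide by the number of samples" idea can be tried numerically. The following is only a sketch: the helper name `average_by_sampling` and the sample counts are illustrative choices, not part of the text.

```python
def average_by_sampling(f, a, b, n):
    """Average of n evenly spaced samples of f on [a, b] (left endpoints)."""
    dt = (b - a) / n
    return sum(f(a + i * dt) for i in range(n)) / n


def velocity(t):
    # Velocity from the example, in feet per second.
    return 16 * t**2 + 5


exact = 223 / 3  # (1/2) * (446/3), the value of the integral formula

# The sampled averages should approach the exact value as n grows.
for n in (10, 100, 10_000):
    print(n, average_by_sampling(velocity, 1, 3, n))
print("exact:", exact)
```

Left-endpoint samples are used here to match the sum in the text, which runs from i = 0 to n − 1; any other evenly spaced sampling would give the same limit.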

Contributors and Attributions


David Guichard (Whitman College)
Integrated by Justin Marshall.

6.5: Average Value of a Function is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
9.4: Average Value of a Function by David Guichard is licensed CC BY-NC-SA 4.0.

CHAPTER OVERVIEW

7: Techniques of Integration
A general Calculus Textmap organized around the textbook

Calculus: Early Transcendentals


by James Stewart


This Textmap is currently under construction... please be patient with us.

Topic hierarchy
7.1: Integration by Parts
7.2: Trigonometric Integrals
7.3: Trigonometric Substitution
7.4: Integration of Rational Functions by Partial Fractions
7.5: Strategy for Integration
7.6: Integration Using Tables and Computer Algebra Systems
7.7: Approximate Integration
7.8: Improper Integrals

7: Techniques of Integration is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.

7.1: Integration by Parts
Learning Objectives
Recognize when to use integration by parts.
Use the integration-by-parts formula to solve integration problems.
Use the integration-by-parts formula for definite integrals.

By now we have a fairly thorough procedure for how to evaluate many basic integrals. However, although we can integrate \(\int x \sin(x^2)\,dx\) by using the substitution \(u = x^2\), something as simple looking as \(\int x \sin x\,dx\) defies us. Many students want to know whether there is a product rule for integration. There is not, but there is a technique based on the product rule for differentiation that allows us to exchange one integral for another. We call this technique integration by parts.

The Integration-by-Parts Formula


If h(x) = f (x)g(x), then by using the product rule, we obtain

h'(x) = f '(x)g(x) + g'(x)f (x). (7.1.1)

Although at first it may seem counterproductive, let’s now integrate both sides of Equation 7.1.1:

∫ h'(x) dx = ∫ (g(x)f '(x) + f (x)g'(x)) dx.

This gives us

h(x) = f (x)g(x) = ∫ g(x)f '(x) dx + ∫ f (x)g'(x) dx.

Now we solve for ∫ f (x)g'(x) dx :

∫ f (x)g'(x) dx = f (x)g(x) − ∫ g(x)f '(x) dx.

By making the substitutions u = f (x) and v = g(x) , which in turn make du = f '(x) dx and dv = g'(x) dx , we have the more
compact form

∫ u dv = uv − ∫ v du.

Integration by Parts

Let u = f (x) and v = g(x) be functions with continuous derivatives. Then, the integration-by-parts formula for the integral
involving these two functions is:

∫ u dv = uv − ∫ v du. (7.1.2)

The advantage of using the integration-by-parts formula is that we can use it to exchange one integral for another, possibly easier,
integral. The following example illustrates its use.

Example 7.1.1: Using Integration by Parts


Use integration by parts with u = x and dv = sin x dx to evaluate

∫ x sin x dx.

Solution
By choosing u = x , we have du = 1 dx . Since dv = sin x dx , we get

7.1.1 https://math.libretexts.org/@go/page/4483
v=∫ sin x dx = − cos x.

It is handy to keep track of these values as follows:


u =x

dv = sin x dx

du = 1 dx

v = ∫ sin x dx = − cos x.

Applying the integration-by-parts formula (Equation 7.1.2) results in

∫ x sin x dx = (x)(− cos x) − ∫ (− cos x)(1 dx) (Substitute)

= −x cos x + ∫ cos x dx (Simplify)

Then use

∫ cos x dx = sin x + C .

to obtain

∫ x sin x dx = −x cos x + sin x + C .

Analysis
At this point, there are probably a few items that need clarification. First of all, you may be curious about what would have happened if we had chosen \(u = \sin x\) and \(dv = x\,dx\). If we had done so, then we would have \(du = \cos x\,dx\) and \(v = \frac{1}{2}x^2\). Thus, after applying integration by parts (Equation 7.1.2), we have

\[ \int x \sin x\,dx = \frac{1}{2}x^2 \sin x - \int \frac{1}{2}x^2 \cos x\,dx. \]

Unfortunately, with the new integral, we are in no better position than before. It is important to keep in mind that when we
apply integration by parts, we may need to try several choices for u and dv before finding a choice that works.
Second, you may wonder why, when we find v = ∫ sin x dx = − cos x , we do not use v = − cos x + K. To see that it makes
no difference, we can rework the problem using v = − cos x + K :

∫ x sin x dx = (x)(− cos x + K) − ∫ (− cos x + K)(1 dx)

= −x cos x + Kx + ∫ cos x dx − ∫ K dx

= −x cos x + Kx + sin x − Kx + C

= −x cos x + sin x + C .

As you can see, it makes no difference in the final solution.


Last, we can check to make sure that our antiderivative is correct by differentiating −x cos x + sin x + C :

\[ \begin{align*} \frac{d}{dx}\left(-x \cos x + \sin x + C\right) &= (-1)\cos x + (-x)(-\sin x) + \cos x \\ &= x \sin x. \end{align*} \]

Therefore, the antiderivative checks out.
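The same check can be done numerically. The sketch below compares a central-difference derivative of the antiderivative with the integrand at a few sample points; the function names, the step size `h`, and the sample points are illustrative assumptions.

```python
import math


def antiderivative(x):
    # Result of Example 7.1.1 with the constant C set to 0.
    return -x * math.cos(x) + math.sin(x)


def integrand(x):
    return x * math.sin(x)


def central_difference(g, x, h=1e-5):
    """Approximate g'(x) with a symmetric difference quotient."""
    return (g(x + h) - g(x - h)) / (2 * h)


# Largest mismatch between the numerical derivative and the integrand.
max_error = max(
    abs(central_difference(antiderivative, x) - integrand(x))
    for x in (-2.0, 0.0, 0.5, 1.0, 3.0)
)
print(max_error)
```

A small mismatch at the sample points does not prove the formula, but a large one would immediately flag an algebra or sign error.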

Exercise 7.1.1

Evaluate \(\int x e^{2x}\,dx\) using the integration-by-parts formula (Equation 7.1.2) with \(u = x\) and \(dv = e^{2x}\,dx\).

Hint
Find du and v, and use the previous example as a guide.

Answer
\[ \int x e^{2x}\,dx = \frac{1}{2}x e^{2x} - \frac{1}{4}e^{2x} + C \]

The natural question to ask at this point is: How do we know how to choose u and dv ? Sometimes it is a matter of trial and error;
however, the acronym LIATE can often help to take some of the guesswork out of our choices. This acronym stands for
Logarithmic Functions, Inverse Trigonometric Functions, Algebraic Functions, Trigonometric Functions, and Exponential
Functions. This mnemonic serves as an aid in determining an appropriate choice for u. The type of function in the integral that
appears first in the list should be our first choice of u.
For example, if an integral contains a logarithmic function and an algebraic function, we should choose u to be the logarithmic
function, because L comes before A in LIATE. The integral in Example 7.1.1 has a trigonometric function (sin x) and an algebraic
function (x). Because A comes before T in LIATE, we chose u to be the algebraic function. When we have chosen u, dv is selected
to be the remaining part of the function to be integrated, together with dx.
Why does this mnemonic work? Remember that whatever we pick to be dv must be something we can integrate. Since we do not
have integration formulas that allow us to integrate simple logarithmic functions and inverse trigonometric functions, it makes
sense that they should not be chosen as values for dv . Consequently, they should be at the head of the list as choices for u. Thus,
we put LI at the beginning of the mnemonic. (We could just as easily have started with IL, since these two types of functions won’t
appear together in an integration-by-parts problem.) The exponential and trigonometric functions are at the end of our list because
they are fairly easy to integrate and make good choices for dv . Thus, we have TE at the end of our mnemonic. (We could just as
easily have used ET at the end, since when these types of functions appear together it usually doesn’t really matter which one is u
and which one is dv .) Algebraic functions are generally easy both to integrate and to differentiate, and they come in the middle of
the mnemonic.

Example 7.1.2: Using Integration by Parts

Evaluate
\[ \int \frac{\ln x}{x^3}\,dx. \]

Solution
Begin by rewriting the integral:
\[ \int \frac{\ln x}{x^3}\,dx = \int x^{-3} \ln x\,dx. \]

Since this integral contains the algebraic function \(x^{-3}\) and the logarithmic function \(\ln x\), choose \(u = \ln x\), since L comes before A in LIATE. After we have chosen \(u = \ln x\), we must choose \(dv = x^{-3}\,dx\).

Next, since \(u = \ln x\), we have \(du = \frac{1}{x}\,dx\). Also, \(v = \int x^{-3}\,dx = -\frac{1}{2}x^{-2}\). Summarizing,

\[ \begin{align*} u &= \ln x & dv &= x^{-3}\,dx \\ du &= \frac{1}{x}\,dx & v &= \int x^{-3}\,dx = -\frac{1}{2}x^{-2}. \end{align*} \]

Substituting into the integration-by-parts formula (Equation 7.1.2) gives

\[ \begin{align*} \int \frac{\ln x}{x^3}\,dx = \int x^{-3} \ln x\,dx &= (\ln x)\left(-\frac{1}{2}x^{-2}\right) - \int \left(-\frac{1}{2}x^{-2}\right)\left(\frac{1}{x}\,dx\right) \\ &= -\frac{1}{2}x^{-2} \ln x + \int \frac{1}{2}x^{-3}\,dx \\ &= -\frac{1}{2}x^{-2} \ln x - \frac{1}{4}x^{-2} + C \\ &= -\frac{1}{2x^2}\ln x - \frac{1}{4x^2} + C. \end{align*} \]

Exercise 7.1.2
Evaluate

\[ \int x \ln x\,dx. \]

Hint
Use \(u = \ln x\) and \(dv = x\,dx\).

Answer
\[ \int x \ln x\,dx = \frac{1}{2}x^2 \ln x - \frac{1}{4}x^2 + C \]

In some cases, as in the next two examples, it may be necessary to apply integration by parts more than once.

Example 7.1.3A: Applying Integration by Parts More Than Once

Evaluate

\[ \int x^2 e^{3x}\,dx. \]

Solution
Using LIATE, choose \(u = x^2\) and \(dv = e^{3x}\,dx\). Thus, \(du = 2x\,dx\) and \(v = \int e^{3x}\,dx = \frac{1}{3}e^{3x}\). Therefore,

\[ \begin{align*} u &= x^2 & dv &= e^{3x}\,dx \\ du &= 2x\,dx & v &= \int e^{3x}\,dx = \frac{1}{3}e^{3x}. \end{align*} \]

Substituting into Equation 7.1.2 produces

\[ \int x^2 e^{3x}\,dx = \frac{1}{3}x^2 e^{3x} - \int \frac{2}{3}x e^{3x}\,dx. \tag{7.1.3} \]

We still cannot integrate \(\int \frac{2}{3}x e^{3x}\,dx\) directly, but the integral now has a lower power on \(x\). We can evaluate this new integral by using integration by parts again. To do this, choose \(u = x\) and \(dv = \frac{2}{3}e^{3x}\,dx\). Thus, \(du = dx\) and \(v = \int \frac{2}{3}e^{3x}\,dx = \frac{2}{9}e^{3x}\). Now we have

\[ \begin{align*} u &= x & dv &= \frac{2}{3}e^{3x}\,dx \\ du &= dx & v &= \int \frac{2}{3}e^{3x}\,dx = \frac{2}{9}e^{3x}. \end{align*} \]

Substituting back into Equation 7.1.3 yields

\[ \int x^2 e^{3x}\,dx = \frac{1}{3}x^2 e^{3x} - \left(\frac{2}{9}x e^{3x} - \int \frac{2}{9}e^{3x}\,dx\right). \]

After evaluating the last integral and simplifying, we obtain

\[ \int x^2 e^{3x}\,dx = \frac{1}{3}x^2 e^{3x} - \frac{2}{9}x e^{3x} + \frac{2}{27}e^{3x} + C. \]
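For integrands of the form (polynomial) times \(e^{ax}\), as in Example 7.1.3A and Exercise 7.1.1, repeating integration by parts until the polynomial differentiates to zero gives \(\int p(x)e^{ax}\,dx = e^{ax}\left(\frac{p}{a} - \frac{p'}{a^2} + \frac{p''}{a^3} - \cdots\right) + C\). The sketch below automates that bookkeeping with exact fractions. The function names and the coefficient-list representation are illustrative assumptions, not notation from the text.

```python
from fractions import Fraction


def derivative(coeffs):
    """Derivative of a polynomial stored as [c0, c1, c2, ...] (c_k multiplies x**k)."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def integrate_poly_times_exp(coeffs, a):
    """Return q (as a coefficient list) with  ∫ p(x) e^(ax) dx = q(x) e^(ax) + C.

    Each pass of the loop is one application of integration by parts
    with u = (current polynomial) and dv = e^(ax) dx.
    """
    a = Fraction(a)
    p = [Fraction(c) for c in coeffs]
    result = [Fraction(0)] * len(coeffs)
    sign, power = 1, a
    while p:
        for k, c in enumerate(p):
            result[k] += sign * c / power
        p = derivative(p)          # u becomes du/dx for the next pass
        sign, power = -sign, power * a
    return result


q = integrate_poly_times_exp([0, 0, 1], 3)  # p(x) = x^2, a = 3
print(q)  # coefficients of 1, x, x^2: 2/27, -2/9, 1/3
```

The printed coefficients match the answer above: \(\frac{1}{3}x^2 - \frac{2}{9}x + \frac{2}{27}\), all multiplied by \(e^{3x}\).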

Example 7.1.3B: Applying Integration by Parts When LIATE Does not Quite Work

Evaluate
\[ \int t^3 e^{t^2}\,dt. \]

Solution
If we use a strict interpretation of the mnemonic LIATE to make our choice of \(u\), we end up with \(u = t^3\) and \(dv = e^{t^2}\,dt\). Unfortunately, this choice won't work because we are unable to evaluate \(\int e^{t^2}\,dt\). However, since we can evaluate \(\int t e^{t^2}\,dt\), we can try choosing \(u = t^2\) and \(dv = t e^{t^2}\,dt\). With these choices we have

\[ \begin{align*} u &= t^2 & dv &= t e^{t^2}\,dt \\ du &= 2t\,dt & v &= \int t e^{t^2}\,dt = \frac{1}{2}e^{t^2}. \end{align*} \]

Thus, we obtain

\[ \begin{align*} \int t^3 e^{t^2}\,dt &= \frac{1}{2}t^2 e^{t^2} - \int \frac{1}{2}e^{t^2}\,2t\,dt \\ &= \frac{1}{2}t^2 e^{t^2} - \frac{1}{2}e^{t^2} + C. \end{align*} \]

Example 7.1.3C: Applying Integration by Parts More Than Once

Evaluate

∫ sin(ln x) dx.

Solution

This integral appears to have only one function—namely, sin(ln x)—however, we can always use the constant function 1 as
the other function. In this example, let’s choose u = sin(ln x) and dv = 1 dx . (The decision to use u = sin(ln x) is easy. We
can’t choose dv = sin(ln x) dx because if we could integrate it, we wouldn’t be using integration by parts in the first place!)
Consequently, du = (1/x) cos(ln x) dx and v = ∫ 1 dx = x. After applying integration by parts to the integral and
simplifying, we have

∫ sin(ln x) dx = x sin(ln x) − ∫ cos(ln x) dx.

Unfortunately, this process leaves us with a new integral that is very similar to the original. However, let’s see what happens
when we apply integration by parts again. This time let’s choose u = cos(ln x) and dv = 1 dx, making
du = −(1/x) sin(ln x) dx and v = ∫ 1 dx = x.

Substituting, we have

∫ sin(ln x) dx = x sin(ln x) − (x cos(ln x) − ∫ − sin(ln x) dx).

After simplifying, we obtain

∫ sin(ln x) dx = x sin(ln x) − x cos(ln x) − ∫ sin(ln x) dx.

The last integral is now the same as the original. It may seem that we have simply gone in a circle, but now we can actually
evaluate the integral. To see how to do this more clearly, substitute I = ∫ sin(ln x) dx. Thus, the equation becomes

I = x sin(ln x) − x cos(ln x) − I .

First, add I to both sides of the equation to obtain

2I = x sin(ln x) − x cos(ln x).

Next, divide by 2:
\[ I = \frac{1}{2}x \sin(\ln x) - \frac{1}{2}x \cos(\ln x). \]

Substituting \(I = \int \sin(\ln x)\,dx\) again, we have

\[ \int \sin(\ln x)\,dx = \frac{1}{2}x \sin(\ln x) - \frac{1}{2}x \cos(\ln x). \]

From this we see that \(\frac{1}{2}x \sin(\ln x) - \frac{1}{2}x \cos(\ln x)\) is an antiderivative of \(\sin(\ln x)\). For the most general antiderivative, add \(+C\):
\[ \int \sin(\ln x)\,dx = \frac{1}{2}x \sin(\ln x) - \frac{1}{2}x \cos(\ln x) + C. \]

Analysis
If this method feels a little strange at first, we can check the answer by differentiation:
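\[ \begin{align*} \frac{d}{dx}\left(\frac{1}{2}x \sin(\ln x) - \frac{1}{2}x \cos(\ln x)\right) &= \frac{1}{2}\sin(\ln x) + \cos(\ln x)\cdot\frac{1}{x}\cdot\frac{1}{2}x - \left(\frac{1}{2}\cos(\ln x) - \sin(\ln x)\cdot\frac{1}{x}\cdot\frac{1}{2}x\right) \\ &= \sin(\ln x). \end{align*} \]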

Exercise 7.1.3
Evaluate

\[ \int x^2 \sin x\,dx. \]

Hint
This is similar to Examples 7.1.3A-7.1.3C.

Answer

\[ \int x^2 \sin x\,dx = -x^2 \cos x + 2x \sin x + 2\cos x + C \]

Integration by Parts for Definite Integrals


Now that we have used integration by parts successfully to evaluate indefinite integrals, we turn our attention to definite integrals.
The integration technique is really the same, only we add a step to evaluate the integral at the upper and lower limits of integration.

Integration by Parts for Definite Integrals

Let \(u = f(x)\) and \(v = g(x)\) be functions with continuous derivatives on \([a, b]\). Then
\[ \int_a^b u\,dv = uv\Big|_a^b - \int_a^b v\,du. \]

Example 7.1.4A: Finding the Area of a Region

Find the area of the region bounded above by the graph of \(y = \tan^{-1} x\) and below by the \(x\)-axis over the interval \([0, 1]\).
Solution
This region is shown in Figure 7.1.1. To find the area, we must evaluate
\[ \int_0^1 \tan^{-1} x\,dx. \]

Figure 7.1.1: To find the area of the shaded region, we have to use integration by parts.

For this integral, let's choose \(u = \tan^{-1} x\) and \(dv = dx\), thereby making \(du = \frac{1}{x^2+1}\,dx\) and \(v = x\). After applying the integration-by-parts formula (Equation 7.1.2) we obtain

\[ \text{Area} = x \tan^{-1} x\Big|_0^1 - \int_0^1 \frac{x}{x^2+1}\,dx. \]

Use \(u\)-substitution to obtain
\[ \int_0^1 \frac{x}{x^2+1}\,dx = \frac{1}{2}\ln(x^2+1)\Big|_0^1. \]

Thus,
\[ \text{Area} = x \tan^{-1} x\Big|_0^1 - \frac{1}{2}\ln(x^2+1)\Big|_0^1 = \left(\frac{\pi}{4} - \frac{1}{2}\ln 2\right) \text{units}^2. \]

At this point it might not be a bad idea to do a "reality check" on the reasonableness of our solution. Since \(\frac{\pi}{4} - \frac{1}{2}\ln 2 \approx 0.4388\ \text{units}^2\), and from Figure 7.1.1 we expect our area to be slightly less than \(0.5\ \text{units}^2\), this solution appears to be reasonable.
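A second reality check is to approximate the integral numerically and compare. The sketch below uses composite Simpson's rule, a standard approximation method that is swapped in here for illustration; the helper name `simpson` and the choice of 1000 subintervals are assumptions, not part of the example.

```python
import math


def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n subintervals (n must be even)."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # Interior points alternate weights 4, 2, 4, 2, ...
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


numeric = simpson(math.atan, 0.0, 1.0)
exact = math.pi / 4 - 0.5 * math.log(2)
print(numeric, exact)
```

The two printed values should agree to many decimal places, which supports the closed form obtained by integration by parts.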

Example 7.1.4B: Finding a Volume of Revolution

Find the volume of the solid obtained by revolving the region bounded by the graph of \(f(x) = e^{-x}\), the \(x\)-axis, the \(y\)-axis, and the line \(x = 1\) about the \(y\)-axis.
Solution
The best option for solving this problem is to use the shell method. Begin by sketching the region to be revolved, along with a typical rectangle (Figure 7.1.2).

Figure 7.1.2: We can use the shell method to find a volume of revolution.
To find the volume using shells, we must evaluate
\[ 2\pi \int_0^1 x e^{-x}\,dx. \tag{7.1.4} \]

To do this, let \(u = x\) and \(dv = e^{-x}\,dx\). These choices lead to \(du = dx\) and \(v = \int e^{-x}\,dx = -e^{-x}\). Using the shell method formula, we obtain
\[ \begin{align*} \text{Volume} &= 2\pi \int_0^1 x e^{-x}\,dx \\ &= 2\pi \left(-x e^{-x}\Big|_0^1 + \int_0^1 e^{-x}\,dx\right) & & \text{(Use integration by parts)} \\ &= 2\pi \left(-e^{-1} + 0 - e^{-x}\Big|_0^1\right) \\ &= 2\pi \left(-e^{-1} - e^{-1} + 1\right) \\ &= 2\pi \left(1 - \frac{2}{e}\right) \text{units}^3. & & \text{(Evaluate and simplify)} \end{align*} \]

Analysis
Again, it is a good idea to check the reasonableness of our solution. We observe that the solid has a volume slightly less than that of a cylinder of radius 1 and height \(1/e\) added to the volume of a cone of base radius 1 and height \(1 - \frac{1}{e}\). Consequently, the solid should have a volume a bit less than

\[ \pi (1)^2 \frac{1}{e} + \left(\frac{\pi}{3}\right)(1)^2 \left(1 - \frac{1}{e}\right) = \frac{2\pi}{3e} + \frac{\pi}{3} \approx 1.8177\ \text{units}^3. \]

Since \(2\pi - \frac{4\pi}{e} \approx 1.6603\), we see that our calculated volume is reasonable.
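The shell method itself suggests another check: add up the volumes of many thin cylindrical shells directly. The sketch below does this with midpoint radii; the function name `shell_volume` and the number of shells are illustrative assumptions.

```python
import math


def shell_volume(f, a, b, n=100_000):
    """Approximate the volume swept out by y = f(x), a <= x <= b, about the y-axis.

    Each shell has radius x, height f(x), and thickness dx, so its volume
    is approximately 2*pi*x*f(x)*dx.
    """
    dx = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * dx  # radius of the i-th shell (midpoint)
        total += 2 * math.pi * x * f(x) * dx
    return total


approx = shell_volume(lambda x: math.exp(-x), 0.0, 1.0)
exact = 2 * math.pi * (1 - 2 / math.e)
print(approx, exact)
```

Both values should be close to 1.6603, below the cylinder-plus-cone bound of about 1.8177 used in the analysis.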

Exercise 7.1.4

Evaluate
\[ \int_0^{\pi/2} x \cos x\,dx. \]

Hint
Use Equation 7.1.2 with \(u = x\) and \(dv = \cos x\,dx\).

Answer
\[ \int_0^{\pi/2} x \cos x\,dx = \frac{\pi}{2} - 1 \]

Key Concepts
The integration-by-parts formula (Equation 7.1.2) allows the exchange of one integral for another, possibly easier, integral.
Integration by parts applies to both definite and indefinite integrals.

Key Equations
Integration by parts formula

∫ u dv = uv − ∫ v du

Integration by parts for definite integrals


\[ \int_a^b u\,dv = uv\Big|_a^b - \int_a^b v\,du \]

Glossary
integration by parts

a technique of integration that allows the exchange of one integral for another using the formula ∫ u dv = uv − ∫ v du

7.1: Integration by Parts is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
7.1: Integration by Parts by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0. Original source:
https://openstax.org/details/books/calculus-volume-1.

7.2: Trigonometric Integrals
Learning Objectives
Solve integration problems involving products and powers of sin x and cos x.
Solve integration problems involving products and powers of tan x and sec x.
Use reduction formulas to solve trigonometric integrals.

In this section we look at how to integrate a variety of products of trigonometric functions. These integrals are called trigonometric integrals. They are
an important part of the integration technique called trigonometric substitution, which is featured in Trigonometric Substitution. This technique allows us
to convert algebraic expressions that we may not be able to integrate into expressions involving trigonometric functions, which we may be able to
integrate using the techniques described in this section. In addition, these types of integrals appear frequently when we study polar, cylindrical, and
spherical coordinate systems later. Let’s begin our study with products of sin x and cos x.

Integrating Products and Powers of sin x and cos x


A key idea behind the strategy used to integrate combinations of products and powers of \(\sin x\) and \(\cos x\) involves rewriting these expressions as sums and differences of integrals of the form \(\int \sin^j x \cos x\,dx\) or \(\int \cos^j x \sin x\,dx\). After rewriting these integrals, we evaluate them using \(u\)-substitution. Before describing the general process in detail, let's take a look at the following examples.

Example 7.2.1: Integrating \(\int \cos^j x \sin x\,dx\)

Evaluate \(\int \cos^3 x \sin x\,dx.\)

Solution
Use \(u\)-substitution and let \(u = \cos x\). In this case, \(du = -\sin x\,dx\).
Thus,
\[ \int \cos^3 x \sin x\,dx = -\int u^3\,du = -\frac{1}{4}u^4 + C = -\frac{1}{4}\cos^4 x + C. \]

Exercise 7.2.1

Evaluate \(\int \sin^4 x \cos x\,dx.\)

Hint
Let \(u = \sin x.\)

Answer
\[ \int \sin^4 x \cos x\,dx = \frac{1}{5}\sin^5 x + C \]

Example 7.2.2: A Preliminary Example: Integrating \(\int \cos^j x \sin^k x\,dx\) where \(k\) is Odd

Evaluate \(\int \cos^2 x \sin^3 x\,dx.\)

Solution

To convert this integral to integrals of the form \(\int \cos^j x \sin x\,dx\), rewrite \(\sin^3 x = \sin^2 x \sin x\) and make the substitution \(\sin^2 x = 1 - \cos^2 x.\)

Thus,

7.2.1 https://math.libretexts.org/@go/page/4484
\[ \begin{align*} \int \cos^2 x \sin^3 x\,dx &= \int \cos^2 x (1 - \cos^2 x) \sin x\,dx & & \text{Let } u = \cos x; \text{ then } du = -\sin x\,dx. \\ &= -\int u^2 (1 - u^2)\,du \\ &= \int (u^4 - u^2)\,du \\ &= \frac{1}{5}u^5 - \frac{1}{3}u^3 + C \\ &= \frac{1}{5}\cos^5 x - \frac{1}{3}\cos^3 x + C. \end{align*} \]

Exercise 7.2.2

Evaluate \(\int \cos^3 x \sin^2 x\,dx.\)

Hint
Write \(\cos^3 x = \cos^2 x \cos x = (1 - \sin^2 x)\cos x\) and let \(u = \sin x\).

Answer
\[ \int \cos^3 x \sin^2 x\,dx = \frac{1}{3}\sin^3 x - \frac{1}{5}\sin^5 x + C \]

In the next example, we see the strategy that must be applied when there are only even powers of \(\sin x\) and \(\cos x\). For integrals of this type, the identities

\[ \sin^2 x = \frac{1}{2} - \frac{1}{2}\cos(2x) = \frac{1 - \cos(2x)}{2} \]

and

\[ \cos^2 x = \frac{1}{2} + \frac{1}{2}\cos(2x) = \frac{1 + \cos(2x)}{2} \]

are invaluable. These identities are sometimes known as power-reducing identities and they may be derived from the double-angle identity \(\cos(2x) = \cos^2 x - \sin^2 x\) and the Pythagorean identity \(\cos^2 x + \sin^2 x = 1.\)
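The derivation mentioned above takes only a line for each identity. The following is a sketch of the standard argument: substitute the Pythagorean identity into the double-angle identity, once to eliminate \(\cos^2 x\) and once to eliminate \(\sin^2 x\), then solve for the remaining square.

```latex
\begin{align*}
\cos(2x) &= \cos^2 x - \sin^2 x = (1 - \sin^2 x) - \sin^2 x = 1 - 2\sin^2 x
  &&\Longrightarrow\quad \sin^2 x = \frac{1 - \cos(2x)}{2}, \\
\cos(2x) &= \cos^2 x - \sin^2 x = \cos^2 x - (1 - \cos^2 x) = 2\cos^2 x - 1
  &&\Longrightarrow\quad \cos^2 x = \frac{1 + \cos(2x)}{2}.
\end{align*}
```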

Example 7.2.3: Integrating an Even Power of sin x

Evaluate \(\int \sin^2 x\,dx\).

Solution
To evaluate this integral, let's use the trigonometric identity \(\sin^2 x = \frac{1}{2} - \frac{1}{2}\cos(2x)\). Thus,
\[ \int \sin^2 x\,dx = \int \left(\frac{1}{2} - \frac{1}{2}\cos(2x)\right) dx = \frac{1}{2}x - \frac{1}{4}\sin(2x) + C. \]

Exercise 7.2.3

Evaluate \(\int \cos^2 x\,dx.\)

Hint
\(\cos^2 x = \frac{1}{2} + \frac{1}{2}\cos(2x)\)

Answer
\[ \int \cos^2 x\,dx = \frac{1}{2}x + \frac{1}{4}\sin(2x) + C \]

The general process for integrating products of powers of sin x and cos x is summarized in the following set of guidelines.

Problem-Solving Strategy: Integrating Products and Powers of sin x and cos x

To integrate \(\int \cos^j x \sin^k x\,dx\) use the following strategies:

1. If \(k\) is odd, rewrite \(\sin^k x = \sin^{k-1} x \sin x\) and use the identity \(\sin^2 x = 1 - \cos^2 x\) to rewrite \(\sin^{k-1} x\) in terms of \(\cos x\). Integrate using the substitution \(u = \cos x\). This substitution makes \(du = -\sin x\,dx.\)
2. If \(j\) is odd, rewrite \(\cos^j x = \cos^{j-1} x \cos x\) and use the identity \(\cos^2 x = 1 - \sin^2 x\) to rewrite \(\cos^{j-1} x\) in terms of \(\sin x\). Integrate using the substitution \(u = \sin x\). This substitution makes \(du = \cos x\,dx.\) (Note: If both \(j\) and \(k\) are odd, either strategy 1 or strategy 2 may be used.)
3. If both \(j\) and \(k\) are even, use \(\sin^2 x = \frac{1 - \cos(2x)}{2}\) and \(\cos^2 x = \frac{1 + \cos(2x)}{2}\). After applying these formulas, simplify and reapply strategies 1 through 3 as appropriate.

Example 7.2.4: Integrating \(\int \cos^j x \sin^k x\,dx\) where \(k\) is Odd

Evaluate \(\int \cos^8 x \sin^5 x\,dx.\)

Solution
Since the power on \(\sin x\) is odd, use strategy 1. Thus,
\[ \begin{align*} \int \cos^8 x \sin^5 x\,dx &= \int \cos^8 x \sin^4 x \sin x\,dx & & \text{Break off } \sin x. \\ &= \int \cos^8 x (\sin^2 x)^2 \sin x\,dx & & \text{Rewrite } \sin^4 x = (\sin^2 x)^2. \\ &= \int \cos^8 x (1 - \cos^2 x)^2 \sin x\,dx & & \text{Substitute } \sin^2 x = 1 - \cos^2 x. \\ &= \int u^8 (1 - u^2)^2 (-du) & & \text{Let } u = \cos x \text{ and } du = -\sin x\,dx. \\ &= \int (-u^8 + 2u^{10} - u^{12})\,du & & \text{Expand.} \\ &= -\frac{1}{9}u^9 + \frac{2}{11}u^{11} - \frac{1}{13}u^{13} + C & & \text{Evaluate the integral.} \\ &= -\frac{1}{9}\cos^9 x + \frac{2}{11}\cos^{11} x - \frac{1}{13}\cos^{13} x + C & & \text{Substitute } u = \cos x. \end{align*} \]

Example 7.2.5: Integrating \(\int \cos^j x \sin^k x\,dx\) where \(k\) and \(j\) are Even

Evaluate \(\int \sin^4 x\,dx.\)

Solution
Since the power on \(\sin x\) is even \((k = 4)\) and the power on \(\cos x\) is even \((j = 0)\), we must use strategy 3. Thus,
\[ \begin{align*} \int \sin^4 x\,dx &= \int (\sin^2 x)^2\,dx & & \text{Rewrite } \sin^4 x = (\sin^2 x)^2. \\ &= \int \left(\frac{1}{2} - \frac{1}{2}\cos(2x)\right)^2 dx & & \text{Substitute } \sin^2 x = \frac{1}{2} - \frac{1}{2}\cos(2x). \\ &= \int \left(\frac{1}{4} - \frac{1}{2}\cos(2x) + \frac{1}{4}\cos^2(2x)\right) dx & & \text{Expand } \left(\frac{1}{2} - \frac{1}{2}\cos(2x)\right)^2. \\ &= \int \left(\frac{1}{4} - \frac{1}{2}\cos(2x) + \frac{1}{4}\left(\frac{1}{2} + \frac{1}{2}\cos(4x)\right)\right) dx & & \text{Since } \cos^2(2x) \text{ has an even power, substitute } \cos^2(2x) = \frac{1}{2} + \frac{1}{2}\cos(4x). \\ &= \int \left(\frac{3}{8} - \frac{1}{2}\cos(2x) + \frac{1}{8}\cos(4x)\right) dx & & \text{Simplify.} \\ &= \frac{3}{8}x - \frac{1}{4}\sin(2x) + \frac{1}{32}\sin(4x) + C & & \text{Evaluate the integral.} \end{align*} \]
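The decision made at the start of Examples 7.2.4 and 7.2.5 depends only on whether the exponents are odd or even, so it can be written as a short function. This is a sketch: the name `choose_strategy` is hypothetical, and it assumes nonnegative integer exponents for the integral of \(\cos^j x \sin^k x\).

```python
def choose_strategy(j, k):
    """Return 1, 2, or 3: the strategy number for the integral of cos^j(x) * sin^k(x)."""
    if j < 0 or k < 0:
        raise ValueError("this sketch assumes nonnegative integer exponents")
    if k % 2 == 1:
        return 1  # odd power of sin: keep one sin x, convert the rest, let u = cos x
    if j % 2 == 1:
        return 2  # odd power of cos: keep one cos x, convert the rest, let u = sin x
    return 3      # both powers even: apply the power-reducing identities


print(choose_strategy(8, 5))  # Example 7.2.4 -> 1
print(choose_strategy(0, 4))  # Example 7.2.5 -> 3
```

When both exponents are odd the function returns 1, but as the note in the strategy says, strategy 2 would work equally well in that case.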

Exercise 7.2.4

Evaluate \(\int \cos^3 x\,dx.\)

Hint
Use strategy 2. Write \(\cos^3 x = \cos^2 x \cos x\) and substitute \(\cos^2 x = 1 - \sin^2 x.\)

Answer
\[ \int \cos^3 x\,dx = \sin x - \frac{1}{3}\sin^3 x + C \]

Exercise 7.2.5

Evaluate \(\int \cos^2(3x)\,dx.\)

Hint
Use strategy 3. Substitute \(\cos^2(3x) = \frac{1}{2} + \frac{1}{2}\cos(6x).\)

Answer
\[ \int \cos^2(3x)\,dx = \frac{1}{2}x + \frac{1}{12}\sin(6x) + C \]

In some areas of physics, such as quantum mechanics, signal processing, and the computation of Fourier series, it is often necessary to integrate products
that include sin(ax), sin(bx), cos(ax), and cos(bx). These integrals are evaluated by applying trigonometric identities, as outlined in the following rule.

Rule: Integrating Products of Sines and Cosines of Different Angles

To integrate products involving sin(ax), sin(bx), cos(ax), and cos(bx), use the substitutions
\[ \begin{align*} \sin(ax)\sin(bx) &= \frac{1}{2}\cos((a - b)x) - \frac{1}{2}\cos((a + b)x) \\ \sin(ax)\cos(bx) &= \frac{1}{2}\sin((a - b)x) + \frac{1}{2}\sin((a + b)x) \\ \cos(ax)\cos(bx) &= \frac{1}{2}\cos((a - b)x) + \frac{1}{2}\cos((a + b)x) \end{align*} \]

These formulas may be derived from the sum-of-angle formulas for sine and cosine.
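Because signs are easy to mix up in these three identities, a numerical spot check can be useful before relying on them. In the sketch below, the function name, the values a = 5 and b = 3, and the sample points are arbitrary illustrative choices.

```python
import math


def product_to_sum_residuals(a, b, x):
    """Return (left side minus right side) for each of the three identities."""
    sin_sin = math.sin(a * x) * math.sin(b * x) - (
        0.5 * math.cos((a - b) * x) - 0.5 * math.cos((a + b) * x))
    sin_cos = math.sin(a * x) * math.cos(b * x) - (
        0.5 * math.sin((a - b) * x) + 0.5 * math.sin((a + b) * x))
    cos_cos = math.cos(a * x) * math.cos(b * x) - (
        0.5 * math.cos((a - b) * x) + 0.5 * math.cos((a + b) * x))
    return sin_sin, sin_cos, cos_cos


# Every residual should be zero up to floating-point rounding.
worst = max(
    abs(r)
    for x in (-1.3, 0.0, 0.4, 2.0)
    for r in product_to_sum_residuals(5, 3, x)
)
print(worst)
```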

Example 7.2.6: Evaluating ∫ sin(ax) cos(bx) dx

Evaluate \(\int \sin(5x)\cos(3x)\,dx.\)

Solution
Apply the identity \(\sin(5x)\cos(3x) = \frac{1}{2}\sin(2x) + \frac{1}{2}\sin(8x)\). Thus,
\[ \int \sin(5x)\cos(3x)\,dx = \int \left(\frac{1}{2}\sin(2x) + \frac{1}{2}\sin(8x)\right) dx = -\frac{1}{4}\cos(2x) - \frac{1}{16}\cos(8x) + C. \]

Exercise 7.2.6

Evaluate \(\int \cos(6x)\cos(5x)\,dx.\)

Hint
Substitute \(\cos(6x)\cos(5x) = \frac{1}{2}\cos x + \frac{1}{2}\cos(11x).\)

Answer
\[ \int \cos(6x)\cos(5x)\,dx = \frac{1}{2}\sin x + \frac{1}{22}\sin(11x) + C \]

Integrating Products and Powers of tan x and sec x


Before discussing the integration of products and powers of \(\tan x\) and \(\sec x\), it is useful to recall the integrals involving \(\tan x\) and \(\sec x\) we have already learned:

1. \(\int \sec^2 x\,dx = \tan x + C\)
2. \(\int \sec x \tan x\,dx = \sec x + C\)
3. \(\int \tan x\,dx = \ln|\sec x| + C\)
4. \(\int \sec x\,dx = \ln|\sec x + \tan x| + C.\)

For most integrals of products and powers of \(\tan x\) and \(\sec x\), we rewrite the expression we wish to integrate as the sum or difference of integrals of the form \(\int \tan^j x \sec^2 x\,dx\) or \(\int \sec^j x \tan x\,dx\). As we see in the following example, we can evaluate these new integrals by using \(u\)-substitution.

Example 7.2.7: Evaluating \(\int \sec^j x \tan x\,dx\)

Evaluate \(\int \sec^5 x \tan x\,dx.\)

Solution
Start by rewriting \(\sec^5 x \tan x\) as \(\sec^4 x \sec x \tan x.\)

\[ \begin{align*} \int \sec^5 x \tan x\,dx &= \int \sec^4 x \sec x \tan x\,dx \\ &= \int u^4\,du & & \text{Let } u = \sec x; \text{ then } du = \sec x \tan x\,dx. \\ &= \frac{1}{5}u^5 + C & & \text{Evaluate the integral.} \\ &= \frac{1}{5}\sec^5 x + C & & \text{Substitute } \sec x = u. \end{align*} \]

You can read some interesting information at this website to learn about a common integral involving the secant.

Exercise 7.2.7

Evaluate \(\int \tan^5 x \sec^2 x\,dx.\)

Hint
Let \(u = \tan x\) and \(du = \sec^2 x\,dx.\)

Answer
\[ \int \tan^5 x \sec^2 x\,dx = \frac{1}{6}\tan^6 x + C \]

We now take a look at the various strategies for integrating products and powers of \(\sec x\) and \(\tan x\).

Problem-Solving Strategy: Integrating \(\int \tan^k x \sec^j x\,dx\)

To integrate \(\int \tan^k x \sec^j x\,dx\), use the following strategies:

1. If \(j\) is even and \(j \ge 2\), rewrite \(\sec^j x = \sec^{j-2} x \sec^2 x\) and use \(\sec^2 x = \tan^2 x + 1\) to rewrite \(\sec^{j-2} x\) in terms of \(\tan x\). Let \(u = \tan x\) and \(du = \sec^2 x\,dx.\)
2. If \(k\) is odd and \(j \ge 1\), rewrite \(\tan^k x \sec^j x = \tan^{k-1} x \sec^{j-1} x \sec x \tan x\) and use \(\tan^2 x = \sec^2 x - 1\) to rewrite \(\tan^{k-1} x\) in terms of \(\sec x\). Let \(u = \sec x\) and \(du = \sec x \tan x\,dx\). (Note: If \(j\) is even and \(k\) is odd, then either strategy 1 or strategy 2 may be used.)
3. If \(k\) is odd where \(k \ge 3\) and \(j = 0\), rewrite \(\tan^k x = \tan^{k-2} x \tan^2 x = \tan^{k-2} x (\sec^2 x - 1) = \tan^{k-2} x \sec^2 x - \tan^{k-2} x\). It may be necessary to repeat this process on the \(\tan^{k-2} x\) term.
4. If \(k\) is even and \(j\) is odd, then use \(\tan^2 x = \sec^2 x - 1\) to express \(\tan^k x\) in terms of \(\sec x\). Use integration by parts to integrate odd powers of \(\sec x\).

Example 7.2.8: Integrating \(\int \tan^k x \sec^j x\,dx\) when \(j\) is Even

Evaluate \(\int \tan^6 x \sec^4 x\,dx.\)

Solution
Since the power on \(\sec x\) is even, rewrite \(\sec^4 x = \sec^2 x \sec^2 x\) and use \(\sec^2 x = \tan^2 x + 1\) to rewrite the first \(\sec^2 x\) in terms of \(\tan x\). Thus,

\[ \begin{align*} \int \tan^6 x \sec^4 x\,dx &= \int \tan^6 x (\tan^2 x + 1) \sec^2 x\,dx \\ &= \int u^6 (u^2 + 1)\,du & & \text{Let } u = \tan x \text{ and } du = \sec^2 x\,dx. \\ &= \int (u^8 + u^6)\,du & & \text{Expand.} \\ &= \frac{1}{9}u^9 + \frac{1}{7}u^7 + C & & \text{Evaluate the integral.} \\ &= \frac{1}{9}\tan^9 x + \frac{1}{7}\tan^7 x + C. & & \text{Substitute } \tan x = u. \end{align*} \]

Example 7.2.9: Integrating \(\int \tan^k x \sec^j x\,dx\) when \(k\) is Odd

Evaluate \(\int \tan^5 x \sec^3 x\,dx.\)

Solution
Since the power on \(\tan x\) is odd, begin by rewriting \(\tan^5 x \sec^3 x = \tan^4 x \sec^2 x \sec x \tan x\). Thus,
\[ \begin{align*} \int \tan^5 x \sec^3 x\,dx &= \int \tan^4 x \sec^2 x \sec x \tan x\,dx \\ &= \int (\tan^2 x)^2 \sec^2 x \sec x \tan x\,dx & & \text{Write } \tan^4 x = (\tan^2 x)^2. \\ &= \int (\sec^2 x - 1)^2 \sec^2 x \sec x \tan x\,dx & & \text{Use } \tan^2 x = \sec^2 x - 1. \\ &= \int (u^2 - 1)^2 u^2\,du & & \text{Let } u = \sec x \text{ and } du = \sec x \tan x\,dx. \\ &= \int (u^6 - 2u^4 + u^2)\,du & & \text{Expand.} \\ &= \frac{1}{7}u^7 - \frac{2}{5}u^5 + \frac{1}{3}u^3 + C & & \text{Integrate.} \\ &= \frac{1}{7}\sec^7 x - \frac{2}{5}\sec^5 x + \frac{1}{3}\sec^3 x + C & & \text{Substitute } \sec x = u. \end{align*} \]

Example 7.2.10: Integrating \(\int \tan^k x\,dx\) where \(k\) is Odd and \(k \ge 3\)

Evaluate \(\int \tan^3 x\,dx.\)

Solution
Begin by rewriting \(\tan^3 x = \tan x \tan^2 x = \tan x (\sec^2 x - 1) = \tan x \sec^2 x - \tan x\). Thus,
\[ \begin{align*} \int \tan^3 x\,dx &= \int (\tan x \sec^2 x - \tan x)\,dx \\ &= \int \tan x \sec^2 x\,dx - \int \tan x\,dx \\ &= \frac{1}{2}\tan^2 x - \ln|\sec x| + C. \end{align*} \]

For the first integral, use the substitution \(u = \tan x\). For the second integral, use the formula \(\int \tan x\,dx = \ln|\sec x| + C.\)

Example 7.2.11: Integrating \(\int \sec^3 x\,dx\)

Integrate \(\int \sec^3 x\,dx.\)

Solution
This integral requires integration by parts. To begin, let \(u = \sec x\) and \(dv = \sec^2 x\,dx\). These choices make \(du = \sec x \tan x\,dx\) and \(v = \tan x\). Thus,

\[ \begin{align*} \int \sec^3 x\,dx &= \sec x \tan x - \int \tan x \sec x \tan x\,dx \\ &= \sec x \tan x - \int \tan^2 x \sec x\,dx & & \text{Simplify.} \\ &= \sec x \tan x - \int (\sec^2 x - 1) \sec x\,dx & & \text{Substitute } \tan^2 x = \sec^2 x - 1. \\ &= \sec x \tan x + \int \sec x\,dx - \int \sec^3 x\,dx & & \text{Rewrite.} \\ &= \sec x \tan x + \ln|\sec x + \tan x| - \int \sec^3 x\,dx. & & \text{Evaluate } \int \sec x\,dx. \end{align*} \]

We now have

\[ \int \sec^3 x\,dx = \sec x \tan x + \ln|\sec x + \tan x| - \int \sec^3 x\,dx. \]

Since the integral \(\int \sec^3 x\,dx\) has reappeared on the right-hand side, we can solve for \(\int \sec^3 x\,dx\) by adding it to both sides. In doing so, we obtain

\[ 2\int \sec^3 x\,dx = \sec x \tan x + \ln|\sec x + \tan x|. \]

Dividing by 2, we arrive at

\[ \int \sec^3 x\,dx = \frac{1}{2}\sec x \tan x + \frac{1}{2}\ln|\sec x + \tan x| + C \]

Exercise 7.2.8

Evaluate \(\int \tan^3 x \sec^7 x\,dx.\)

Hint
Use Example 7.2.9 as a guide.

Answer
\[ \int \tan^3 x \sec^7 x\,dx = \frac{1}{9}\sec^9 x - \frac{1}{7}\sec^7 x + C \]

Reduction Formulas
Evaluating \(\int \sec^n x\,dx\) for values of \(n\) where \(n\) is odd requires integration by parts. In addition, we must also know the value of \(\int \sec^{n-2} x\,dx\) to evaluate \(\int \sec^n x\,dx\). The evaluation of \(\int \tan^n x\,dx\) also requires being able to integrate \(\int \tan^{n-2} x\,dx\). To make the process easier, we can derive and apply the following power reduction formulas. These rules allow us to replace the integral of a power of \(\sec x\) or \(\tan x\) with the integral of a lower power of \(\sec x\) or \(\tan x\).

Rule: Reduction Formulas for \(\int \sec^n x\,dx\) and \(\int \tan^n x\,dx\)

\[ \int \sec^n x\,dx = \frac{1}{n-1}\sec^{n-2} x \tan x + \frac{n-2}{n-1}\int \sec^{n-2} x\,dx \]

\[ \int \tan^n x\,dx = \frac{1}{n-1}\tan^{n-1} x - \int \tan^{n-2} x\,dx \]

The first power reduction rule may be verified by applying integration by parts. The second may be verified by following the strategy outlined for integrating odd powers of \(\tan x\).

Example 7.2.12: Revisiting \(\int \sec^3 x\,dx\)

Apply a reduction formula to evaluate \(\int \sec^3 x\,dx.\)

Solution
By applying the first reduction formula, we obtain
\[ \begin{align*} \int \sec^3 x\,dx &= \frac{1}{2}\sec x \tan x + \frac{1}{2}\int \sec x\,dx \\ &= \frac{1}{2}\sec x \tan x + \frac{1}{2}\ln|\sec x + \tan x| + C. \end{align*} \]

Example 7.2.13: Using a Reduction Formula

Evaluate \(\int \tan^4 x\,dx.\)

Solution
Applying the reduction formula for \(\int \tan^4 x\,dx\) we have
\[ \begin{align*} \int \tan^4 x\,dx &= \frac{1}{3}\tan^3 x - \int \tan^2 x\,dx \\ &= \frac{1}{3}\tan^3 x - \left(\tan x - \int \tan^0 x\,dx\right) & & \text{Apply the reduction formula to } \int \tan^2 x\,dx. \\ &= \frac{1}{3}\tan^3 x - \tan x + \int 1\,dx & & \text{Simplify.} \\ &= \frac{1}{3}\tan^3 x - \tan x + x + C & & \text{Evaluate } \int 1\,dx. \end{align*} \]
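A reduction formula is a recursion, so it translates directly into a recursive function. The sketch below applies the tangent reduction formula to the definite integral \(\int_0^b \tan^n x\,dx\). The function name, the restriction \(0 \le b < \pi/2\), and the base cases (\(\int 1\,dx = x\) and \(\int \tan x\,dx = \ln|\sec x|\)) are stated assumptions of this illustration.

```python
import math


def tan_power_integral(n, b):
    """Integral of tan^n(x) from 0 to b, for integer n >= 0 and 0 <= b < pi/2."""
    if n < 0:
        raise ValueError("this sketch assumes a nonnegative integer power")
    if n == 0:
        return b                          # integral of 1
    if n == 1:
        return math.log(1 / math.cos(b))  # ln|sec b| - ln|sec 0|
    # Reduction step: tan^(n-1)(x)/(n-1) evaluated from 0 to b,
    # minus the integral of tan^(n-2)(x).
    return math.tan(b) ** (n - 1) / (n - 1) - tan_power_integral(n - 2, b)


value = tan_power_integral(4, math.pi / 4)
print(value)
```

For n = 4 and b = π/4 this should agree with the antiderivative from Example 7.2.13 evaluated at the endpoints, namely \(\frac{1}{3} - 1 + \frac{\pi}{4}\).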

Exercise 7.2.9

Apply the reduction formula to \(\int \sec^5 x\,dx.\)

Hint
Use reduction formula 1 and let \(n = 5.\)

Answer
\[ \int \sec^5 x\,dx = \frac{1}{4}\sec^3 x \tan x + \frac{3}{4}\int \sec^3 x\,dx \]

Key Concepts
Integrals of trigonometric functions can be evaluated by the use of various strategies. These strategies include
1. Applying trigonometric identities to rewrite the integral so that it may be evaluated by u-substitution
2. Using integration by parts
3. Applying trigonometric identities to rewrite products of sines and cosines with different arguments as the sum of individual sine and cosine functions
4. Applying reduction formulas

Key Equations
To integrate products involving sin(ax), sin(bx), cos(ax), and cos(bx), use the substitutions.
Sine Products
\[ \sin(ax)\sin(bx) = \frac{1}{2}\cos((a - b)x) - \frac{1}{2}\cos((a + b)x) \]

Sine and Cosine Products
\[ \sin(ax)\cos(bx) = \frac{1}{2}\sin((a - b)x) + \frac{1}{2}\sin((a + b)x) \]

Cosine Products
\[ \cos(ax)\cos(bx) = \frac{1}{2}\cos((a - b)x) + \frac{1}{2}\cos((a + b)x) \]

Power Reduction Formula
\[ \int \sec^n x\,dx = \frac{1}{n-1}\sec^{n-2} x \tan x + \frac{n-2}{n-1}\int \sec^{n-2} x\,dx \]

Power Reduction Formula
\[ \int \tan^n x\,dx = \frac{1}{n-1}\tan^{n-1} x - \int \tan^{n-2} x\,dx \]

Glossary
power reduction formula
a rule that allows an integral of a power of a trigonometric function to be exchanged for an integral involving a lower power

trigonometric integral
an integral involving powers and products of trigonometric functions

7.2: Trigonometric Integrals is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
7.2: Trigonometric Integrals by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0. Original source: https://openstax.org/details/books/calculus-volume-1.

7.3: Trigonometric Substitution
Learning Objectives
Solve integration problems involving the square root of a sum or difference of two squares.

In this section, we explore integrals containing expressions of the form \(\sqrt{a^2-x^2}\), \(\sqrt{a^2+x^2}\), and \(\sqrt{x^2-a^2}\), where the values of
a are positive. We have already encountered and evaluated integrals containing some expressions of this type, but many still remain
inaccessible. The technique of trigonometric substitution comes in very handy when evaluating these integrals. This technique uses
substitution to rewrite these integrals as trigonometric integrals.

Integrals Involving \(\sqrt{a^2-x^2}\)

Before developing a general strategy for integrals containing \(\sqrt{a^2-x^2}\), consider the integral \(\int \sqrt{9-x^2}\,dx\). This integral cannot
be evaluated using any of the techniques we have discussed so far. However, if we make the substitution \(x = 3\sin\theta\), we have
\(dx = 3\cos\theta\,d\theta\). After substituting into the integral, we have

\[\int \sqrt{9-x^2}\,dx = \int \sqrt{9-(3\sin\theta)^2}\cdot 3\cos\theta\,d\theta.\]

After simplifying, we have

\[\int \sqrt{9-x^2}\,dx = \int 9\sqrt{1-\sin^2\theta}\cdot\cos\theta\,d\theta.\]

Letting \(1-\sin^2\theta = \cos^2\theta\), we now have

\[\int \sqrt{9-x^2}\,dx = \int 9\sqrt{\cos^2\theta}\,\cos\theta\,d\theta.\]

Assuming that \(\cos\theta \ge 0\), we have

\[\int \sqrt{9-x^2}\,dx = \int 9\cos^2\theta\,d\theta.\]

At this point, we can evaluate the integral using the techniques developed for integrating powers and products of trigonometric
functions. Before completing this example, let’s take a look at the general theory behind this idea.

To evaluate integrals involving \(\sqrt{a^2-x^2}\), we make the substitution \(x = a\sin\theta\) and \(dx = a\cos\theta\,d\theta\). To see that this actually makes
sense, consider the following argument: The domain of \(\sqrt{a^2-x^2}\) is \([-a, a]\). Thus,

\[-a \le x \le a.\]

Consequently,

\[-1 \le \frac{x}{a} \le 1.\]

Since the range of \(\sin\theta\) over \([-\pi/2, \pi/2]\) is \([-1, 1]\), there is a unique angle \(\theta\) satisfying \(-\pi/2 \le \theta \le \pi/2\) so that \(\sin\theta = x/a\),
or equivalently, so that \(x = a\sin\theta\). If we substitute \(x = a\sin\theta\) into \(\sqrt{a^2-x^2}\), we get

\begin{align*}
\sqrt{a^2-x^2} &= \sqrt{a^2-(a\sin\theta)^2} && \text{Let } x = a\sin\theta \text{ where } -\tfrac{\pi}{2} \le \theta \le \tfrac{\pi}{2}. \text{ Simplify.} \\
&= \sqrt{a^2-a^2\sin^2\theta} && \text{Factor out } a^2. \\
&= \sqrt{a^2(1-\sin^2\theta)} && \text{Substitute } 1-\sin^2\theta = \cos^2\theta. \\
&= \sqrt{a^2\cos^2\theta} && \text{Take the square root.} \\
&= |a\cos\theta| \\
&= a\cos\theta.
\end{align*}

Since \(\cos\theta \ge 0\) on \(-\frac{\pi}{2} \le \theta \le \frac{\pi}{2}\) and \(a > 0\), \(|a\cos\theta| = a\cos\theta\). We can see, from this discussion, that by making the substitution
\(x = a\sin\theta\), we are able to convert an integral involving a radical into an integral involving trigonometric functions. After we
evaluate the integral, we can convert the solution back to an expression involving x. To see how to do this, let’s begin by assuming
that \(0 < x < a\). In this case, \(0 < \theta < \frac{\pi}{2}\). Since \(\sin\theta = \frac{x}{a}\), we can draw the reference triangle in Figure 7.3.1 to assist in
expressing the values of \(\cos\theta\), \(\tan\theta\), and the remaining trigonometric functions in terms of x. It can be shown that this triangle
actually produces the correct values of the trigonometric functions evaluated at \(\theta\) for all \(\theta\) satisfying \(-\frac{\pi}{2} \le \theta \le \frac{\pi}{2}\). It is useful to
observe that the expression \(\sqrt{a^2-x^2}\) actually appears as the length of one side of the triangle. Last, should \(\theta\) appear by itself, we
use \(\theta = \sin^{-1}\left(\frac{x}{a}\right)\).

Figure 7.3.1: A reference triangle can help express the trigonometric functions evaluated at θ in terms of x.
The essential part of this discussion is summarized in the following problem-solving strategy.

Problem-Solving Strategy: Integrating Expressions Involving \(\sqrt{a^2-x^2}\)

1. It is a good idea to make sure the integral cannot be evaluated easily in another way. For example, although this method can
be applied to integrals of the form \(\int \frac{1}{\sqrt{a^2-x^2}}\,dx\), \(\int \frac{x}{\sqrt{a^2-x^2}}\,dx\), and \(\int x\sqrt{a^2-x^2}\,dx\), they can each be integrated
directly either by formula or by a simple u-substitution.
2. Make the substitution \(x = a\sin\theta\) and \(dx = a\cos\theta\,d\theta\). Note: This substitution yields \(\sqrt{a^2-x^2} = a\cos\theta\).
3. Simplify the expression.
4. Evaluate the integral using techniques from the section on trigonometric integrals.
5. Use the reference triangle from Figure 7.3.1 to rewrite the result in terms of x. You may also need to use some trigonometric
identities and the relationship \(\theta = \sin^{-1}\left(\frac{x}{a}\right)\).

The following example demonstrates the application of this problem-solving strategy.

Example 7.3.1: Integrating an Expression Involving \(\sqrt{a^2-x^2}\)

Evaluate
\[\int \sqrt{9-x^2}\,dx.\]

Solution
Begin by making the substitutions \(x = 3\sin\theta\) and \(dx = 3\cos\theta\,d\theta\). Since \(\sin\theta = \frac{x}{3}\), we can construct the reference triangle
shown in Figure 7.3.2.

Figure 7.3.2: A reference triangle can be constructed for Example 7.3.1.

Thus,
\begin{align*}
\int \sqrt{9-x^2}\,dx &= \int \sqrt{9-(3\sin\theta)^2}\,3\cos\theta\,d\theta && \text{Substitute } x = 3\sin\theta \text{ and } dx = 3\cos\theta\,d\theta. \\
&= \int \sqrt{9(1-\sin^2\theta)}\cdot 3\cos\theta\,d\theta && \text{Simplify.} \\
&= \int \sqrt{9\cos^2\theta}\cdot 3\cos\theta\,d\theta && \text{Substitute } \cos^2\theta = 1-\sin^2\theta. \\
&= \int 3|\cos\theta|\,3\cos\theta\,d\theta && \text{Take the square root.} \\
&= \int 9\cos^2\theta\,d\theta && \text{Simplify. Since } -\tfrac{\pi}{2} \le \theta \le \tfrac{\pi}{2},\ \cos\theta \ge 0 \text{ and } |\cos\theta| = \cos\theta. \\
&= \int 9\left(\frac{1}{2}+\frac{1}{2}\cos(2\theta)\right)d\theta && \text{Use the strategy for integrating an even power of } \cos\theta. \\
&= \frac{9}{2}\theta + \frac{9}{4}\sin(2\theta) + C && \text{Evaluate the integral.} \\
&= \frac{9}{2}\theta + \frac{9}{4}(2\sin\theta\cos\theta) + C && \text{Substitute } \sin(2\theta) = 2\sin\theta\cos\theta. \\
&= \frac{9}{2}\sin^{-1}\left(\frac{x}{3}\right) + \frac{9}{2}\cdot\frac{x}{3}\cdot\frac{\sqrt{9-x^2}}{3} + C && \text{Substitute } \sin^{-1}\left(\tfrac{x}{3}\right) = \theta \text{ and } \sin\theta = \tfrac{x}{3}. \\
&= \frac{9}{2}\sin^{-1}\left(\frac{x}{3}\right) + \frac{x\sqrt{9-x^2}}{2} + C. && \text{Simplify.}
\end{align*}

In the next-to-last step, the reference triangle shows that \(\cos\theta = \frac{\sqrt{9-x^2}}{3}\).
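As a quick check of the result of Example 7.3.1, the sketch below (an illustration added here, with hypothetical helper names `F`, `f`, and `derivative`) differentiates the antiderivative numerically and compares it with the integrand. It also evaluates the definite integral over [0, 3], which should equal the area of a quarter circle of radius 3.

```python
import math

def F(x):
    """Antiderivative found in Example 7.3.1, with the constant C taken as 0."""
    return 4.5 * math.asin(x / 3) + x * math.sqrt(9 - x * x) / 2

def f(x):
    """The integrand sqrt(9 - x^2)."""
    return math.sqrt(9 - x * x)

def derivative(g, x, h=1e-6):
    """Central-difference approximation of g'(x)."""
    return (g(x + h) - g(x - h)) / (2 * h)

for x in (-2.0, 0.0, 1.5, 2.5):
    print(x, derivative(F, x), f(x))  # the last two columns should match closely

quarter_circle = F(3) - F(0)
print(quarter_circle, 9 * math.pi / 4)
```

The geometric comparison works because the graph of \(y = \sqrt{9-x^2}\) is the upper half of the circle \(x^2 + y^2 = 9\).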

Example 7.3.2: Integrating an Expression Involving \(\sqrt{a^2-x^2}\)

Evaluate
\[\int \frac{\sqrt{4-x^2}}{x}\,dx.\]

Solution
First make the substitutions \(x = 2\sin\theta\) and \(dx = 2\cos\theta\,d\theta\). Since \(\sin\theta = \frac{x}{2}\), we can construct the reference triangle shown
in Figure 7.3.3.

Figure 7.3.3: A reference triangle can be constructed for Example 7.3.2.

Thus,
\begin{align*}
\int \frac{\sqrt{4-x^2}}{x}\,dx &= \int \frac{\sqrt{4-(2\sin\theta)^2}}{2\sin\theta}\,2\cos\theta\,d\theta && \text{Substitute } x = 2\sin\theta \text{ and } dx = 2\cos\theta\,d\theta. \\
&= \int \frac{2\cos^2\theta}{\sin\theta}\,d\theta && \text{Substitute } \cos^2\theta = 1-\sin^2\theta \text{ and simplify.} \\
&= \int \frac{2(1-\sin^2\theta)}{\sin\theta}\,d\theta && \text{Substitute } \cos^2\theta = 1-\sin^2\theta. \\
&= \int (2\csc\theta - 2\sin\theta)\,d\theta && \text{Separate the numerator, simplify, and use } \csc\theta = \frac{1}{\sin\theta}. \\
&= 2\ln|\csc\theta - \cot\theta| + 2\cos\theta + C && \text{Evaluate the integral.} \\
&= 2\ln\left|\frac{2}{x} - \frac{\sqrt{4-x^2}}{x}\right| + \sqrt{4-x^2} + C. && \text{Use the reference triangle to rewrite in terms of } x \text{ and simplify.}
\end{align*}

In the next example, we see that we sometimes have a choice of methods.

Example 7.3.3: Integrating an Expression Involving \(\sqrt{a^2-x^2}\) Two Ways

Evaluate \(\int x^3\sqrt{1-x^2}\,dx\) two ways: first by using the substitution \(u = 1-x^2\) and then by using a trigonometric substitution.

Method 1
Let \(u = 1-x^2\) and hence \(x^2 = 1-u\). Thus, \(du = -2x\,dx\). In this case, the integral becomes
\begin{align*}
\int x^3\sqrt{1-x^2}\,dx &= -\frac{1}{2}\int x^2\sqrt{1-x^2}\,(-2x\,dx) && \text{Make the substitution.} \\
&= -\frac{1}{2}\int (1-u)\sqrt{u}\,du && \text{Expand the expression.} \\
&= -\frac{1}{2}\int (u^{1/2}-u^{3/2})\,du && \text{Evaluate the integral.} \\
&= -\frac{1}{2}\left(\frac{2}{3}u^{3/2}-\frac{2}{5}u^{5/2}\right)+C && \text{Rewrite in terms of } x. \\
&= -\frac{1}{3}(1-x^2)^{3/2}+\frac{1}{5}(1-x^2)^{5/2}+C.
\end{align*}

Method 2
Let \(x = \sin\theta\). In this case, \(dx = \cos\theta\,d\theta\). Using this substitution, we have
\begin{align*}
\int x^3\sqrt{1-x^2}\,dx &= \int \sin^3\theta\cos^2\theta\,d\theta \\
&= \int (1-\cos^2\theta)\cos^2\theta\sin\theta\,d\theta && \text{Let } u = \cos\theta. \text{ Thus, } du = -\sin\theta\,d\theta. \\
&= \int (u^4-u^2)\,du \\
&= \frac{1}{5}u^5-\frac{1}{3}u^3+C && \text{Substitute } \cos\theta = u. \\
&= \frac{1}{5}\cos^5\theta-\frac{1}{3}\cos^3\theta+C && \text{Use a reference triangle to see that } \cos\theta = \sqrt{1-x^2}. \\
&= \frac{1}{5}(1-x^2)^{5/2}-\frac{1}{3}(1-x^2)^{3/2}+C.
\end{align*}
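The two methods of Example 7.3.3 should produce the same antiderivative. A minimal numerical sketch follows; the function names `method_1`, `method_2`, `integrand`, and `derivative` are illustrative choices, not from the text. It evaluates both results at a few points in (−1, 1) and checks one of them against the integrand by numerical differentiation.

```python
import math

def method_1(x):
    """Result of the u-substitution u = 1 - x^2 (constant C taken as 0)."""
    u = 1 - x * x
    return -u ** 1.5 / 3 + u ** 2.5 / 5

def method_2(x):
    """Result of the trigonometric substitution x = sin(theta) (C taken as 0)."""
    c = math.sqrt(1 - x * x)  # cos(theta), read off the reference triangle
    return c ** 5 / 5 - c ** 3 / 3

def integrand(x):
    return x ** 3 * math.sqrt(1 - x * x)

def derivative(g, x, h=1e-6):
    """Central-difference approximation of g'(x)."""
    return (g(x + h) - g(x - h)) / (2 * h)

for x in (-0.8, -0.3, 0.0, 0.5, 0.9):
    print(x, method_1(x), method_2(x), derivative(method_1, x), integrand(x))
```

Here the two expressions agree exactly, not just up to a constant, because they contain the same two terms in a different order.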

 Exercise 7.3.1
Rewrite the integral \(\int \frac{x^3}{\sqrt{25-x^2}}\,dx\) using the appropriate trigonometric substitution (do not evaluate the integral).

Hint
Substitute \(x = 5\sin\theta\) and \(dx = 5\cos\theta\,d\theta\).

Answer
\[\int 125\sin^3\theta\,d\theta\]

Integrating Expressions Involving \(\sqrt{a^2+x^2}\)

For integrals containing \(\sqrt{a^2+x^2}\), let’s first consider the domain of this expression. Since \(\sqrt{a^2+x^2}\) is defined for all real values
of x, we restrict our choice to those trigonometric functions that have a range of all real numbers. Thus, our choice is restricted to
selecting either \(x = a\tan\theta\) or \(x = a\cot\theta\). Either of these substitutions would actually work, but the standard substitution is
\(x = a\tan\theta\) or, equivalently, \(\tan\theta = x/a\). With this substitution, we make the assumption that \(-\pi/2 < \theta < \pi/2\), so that we
also have \(\theta = \tan^{-1}(x/a)\). The procedure for using this substitution is outlined in the following problem-solving strategy.

Problem-Solving Strategy: Integrating Expressions Involving \(\sqrt{a^2+x^2}\)

1. Check to see whether the integral can be evaluated easily by using another method. In some cases, it is more convenient to
use an alternative method.
2. Substitute \(x = a\tan\theta\) and \(dx = a\sec^2\theta\,d\theta\). This substitution yields
\[\sqrt{a^2+x^2} = \sqrt{a^2+(a\tan\theta)^2} = \sqrt{a^2(1+\tan^2\theta)} = \sqrt{a^2\sec^2\theta} = |a\sec\theta| = a\sec\theta.\]
(Since \(-\frac{\pi}{2} < \theta < \frac{\pi}{2}\) and \(\sec\theta > 0\) over this interval, \(|a\sec\theta| = a\sec\theta\).)
3. Simplify the expression.
4. Evaluate the integral using techniques from the section on trigonometric integrals.
5. Use the reference triangle from Figure 7.3.4 to rewrite the result in terms of x. You may also need to use some
trigonometric identities and the relationship \(\theta = \tan^{-1}\left(\frac{x}{a}\right)\). (Note: The reference triangle is based on the assumption that
\(x > 0\); however, the trigonometric ratios produced from the reference triangle are the same as the ratios for which \(x \le 0\).)

Figure 7.3.4: A reference triangle can be constructed to express the trigonometric functions evaluated at θ in terms of x.
Example 7.3.4: Integrating an Expression Involving \(\sqrt{a^2+x^2}\)

Evaluate \(\int \frac{dx}{\sqrt{1+x^2}}\) and check the solution by differentiating.

Solution
Begin with the substitution \(x = \tan\theta\) and \(dx = \sec^2\theta\,d\theta\). Since \(\tan\theta = x\), draw the reference triangle in Figure 7.3.5.

Figure 7.3.5: The reference triangle for Example 7.3.4.

Thus,
\begin{align*}
\int \frac{dx}{\sqrt{1+x^2}} &= \int \frac{\sec^2\theta}{\sec\theta}\,d\theta && \text{Substitute } x = \tan\theta \text{ and } dx = \sec^2\theta\,d\theta. \text{ This makes } \sqrt{1+x^2} = \sec\theta. \text{ Simplify.} \\
&= \int \sec\theta\,d\theta && \text{Evaluate the integral.} \\
&= \ln|\sec\theta+\tan\theta| + C && \text{Use the reference triangle to express the result in terms of } x. \\
&= \ln\left|\sqrt{1+x^2}+x\right| + C.
\end{align*}

To check the solution, differentiate:

\[\frac{d}{dx}\left(\ln\left|\sqrt{1+x^2}+x\right|\right) = \frac{1}{\sqrt{1+x^2}+x}\cdot\left(\frac{x}{\sqrt{1+x^2}}+1\right) = \frac{1}{\sqrt{1+x^2}+x}\cdot\frac{x+\sqrt{1+x^2}}{\sqrt{1+x^2}} = \frac{1}{\sqrt{1+x^2}}.\]

Since \(\sqrt{1+x^2}+x > 0\) for all values of x, we could rewrite \(\ln\left|\sqrt{1+x^2}+x\right| + C = \ln\left(\sqrt{1+x^2}+x\right) + C\), if desired.

Example 7.3.5: Evaluating \(\int \frac{dx}{\sqrt{1+x^2}}\) Using a Different Substitution

Use the substitution \(x = \sinh\theta\) to evaluate \(\int \frac{dx}{\sqrt{1+x^2}}\).

Solution
Because \(\sinh\theta\) has a range of all real numbers, and \(1+\sinh^2\theta = \cosh^2\theta\), we may also use the substitution \(x = \sinh\theta\) to
evaluate this integral. In this case, \(dx = \cosh\theta\,d\theta\). Consequently,
\begin{align*}
\int \frac{dx}{\sqrt{1+x^2}} &= \int \frac{\cosh\theta}{\sqrt{1+\sinh^2\theta}}\,d\theta && \text{Substitute } x = \sinh\theta \text{ and } dx = \cosh\theta\,d\theta. \\
&= \int \frac{\cosh\theta}{\sqrt{\cosh^2\theta}}\,d\theta && \text{Substitute } 1+\sinh^2\theta = \cosh^2\theta. \\
&= \int \frac{\cosh\theta}{|\cosh\theta|}\,d\theta && \text{Since } \sqrt{\cosh^2\theta} = |\cosh\theta|. \\
&= \int \frac{\cosh\theta}{\cosh\theta}\,d\theta && |\cosh\theta| = \cosh\theta \text{ since } \cosh\theta > 0 \text{ for all } \theta. \\
&= \int 1\,d\theta && \text{Simplify.} \\
&= \theta + C && \text{Evaluate the integral.} \\
&= \sinh^{-1}x + C. && \text{Since } x = \sinh\theta, \text{ we know } \theta = \sinh^{-1}x.
\end{align*}

Analysis
This answer looks quite different from the answer obtained using the substitution \(x = \tan\theta\). To see that the solutions are the
same, set \(y = \sinh^{-1}x\). Thus, \(\sinh y = x\). From this equation we obtain:

\[\frac{e^y-e^{-y}}{2} = x.\]

After multiplying both sides by \(2e^y\) and rewriting, this equation becomes:

\[e^{2y}-2xe^y-1 = 0.\]

Use the quadratic formula to solve for \(e^y\):

\[e^y = \frac{2x \pm \sqrt{4x^2+4}}{2}.\]

Simplifying, we have:

\[e^y = x \pm \sqrt{x^2+1}.\]

Since \(x-\sqrt{x^2+1} < 0\), it must be the case that \(e^y = x+\sqrt{x^2+1}\). Thus,

\[y = \ln\left(x+\sqrt{x^2+1}\right).\]

Last, we obtain

\[\sinh^{-1}x = \ln\left(x+\sqrt{x^2+1}\right).\]

After we make the final observation that, since \(x+\sqrt{x^2+1} > 0\),

\[\ln\left(x+\sqrt{x^2+1}\right) = \ln\left|\sqrt{1+x^2}+x\right|,\]

we see that the two different methods produced equivalent solutions.
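The identity derived in the analysis can be spot-checked numerically. This is a small sketch using Python's standard `math.asinh`; the helper name `log_form` and the sample points are arbitrary choices made for illustration.

```python
import math

def log_form(x):
    """ln(x + sqrt(x^2 + 1)), the form obtained from the tangent substitution."""
    return math.log(x + math.sqrt(x * x + 1))

samples = (-5.0, -1.0, 0.0, 0.5, 3.0)
for x in samples:
    print(x, math.asinh(x), log_form(x))  # the two columns should agree
```

For negative x of large magnitude, the logarithmic form subtracts two nearly equal numbers and loses floating-point accuracy, which is one practical reason to prefer the inverse hyperbolic sine when computing.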

 Example 7.3.6: Finding an Arc Length


Find the length of the curve \(y = x^2\) over the interval \([0, \frac{1}{2}]\).

Solution
Because \(\frac{dy}{dx} = 2x\), the arc length is given by

\[\int_0^{1/2}\sqrt{1+(2x)^2}\,dx = \int_0^{1/2}\sqrt{1+4x^2}\,dx.\]

To evaluate this integral, use the substitution \(x = \frac{1}{2}\tan\theta\) and \(dx = \frac{1}{2}\sec^2\theta\,d\theta\). We also need to change the limits of
integration. If \(x = 0\), then \(\theta = 0\) and if \(x = \frac{1}{2}\), then \(\theta = \frac{\pi}{4}\). Thus,

\begin{align*}
\int_0^{1/2}\sqrt{1+4x^2}\,dx &= \int_0^{\pi/4}\sqrt{1+\tan^2\theta}\cdot\frac{1}{2}\sec^2\theta\,d\theta && \text{After substitution, } \sqrt{1+4x^2} = \sec\theta. \text{ (Substitute } 1+\tan^2\theta = \sec^2\theta \text{ and simplify.)} \\
&= \frac{1}{2}\int_0^{\pi/4}\sec^3\theta\,d\theta && \text{We derived this integral in the previous section.} \\
&= \frac{1}{2}\left(\frac{1}{2}\sec\theta\tan\theta+\frac{1}{2}\ln|\sec\theta+\tan\theta|\right)\Bigg|_0^{\pi/4} && \text{Evaluate and simplify.} \\
&= \frac{1}{4}\left(\sqrt{2}+\ln(\sqrt{2}+1)\right).
\end{align*}
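The arc length found above can be compared with a direct numerical approximation of the original integral. The sketch below is an illustration only; the composite Simpson's rule helper `simpson` and the number of subintervals are assumptions made here, not part of the text.

```python
import math

def simpson(f, a, b, m=1000):
    """Composite Simpson's rule with m (even) subintervals."""
    h = (b - a) / m
    total = f(a) + f(b)
    for i in range(1, m):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# Arc length integrand for y = x^2: sqrt(1 + (dy/dx)^2) with dy/dx = 2x.
numeric = simpson(lambda x: math.sqrt(1 + 4 * x * x), 0.0, 0.5)
closed_form = (math.sqrt(2) + math.log(math.sqrt(2) + 1)) / 4
print(numeric, closed_form)
```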

 Exercise 7.3.2
Rewrite \(\int x^3\sqrt{x^2+4}\,dx\) by using a substitution involving \(\tan\theta\).

Hint
Use \(x = 2\tan\theta\) and \(dx = 2\sec^2\theta\,d\theta\).

Answer
\[\int 32\tan^3\theta\sec^3\theta\,d\theta\]

Integrating Expressions Involving \(\sqrt{x^2-a^2}\)

The domain of the expression \(\sqrt{x^2-a^2}\) is \((-\infty, -a] \cup [a, +\infty)\). Thus, either \(x \le -a\) or \(x \ge a\). Hence, \(\frac{x}{a} \le -1\) or \(\frac{x}{a} \ge 1\).
Since these intervals correspond to the range of \(\sec\theta\) on the set \([0, \frac{\pi}{2}) \cup (\frac{\pi}{2}, \pi]\), it makes sense to use the substitution \(\sec\theta = \frac{x}{a}\)
or, equivalently, \(x = a\sec\theta\), where \(0 \le \theta < \frac{\pi}{2}\) or \(\frac{\pi}{2} < \theta \le \pi\). The corresponding substitution for dx is \(dx = a\sec\theta\tan\theta\,d\theta\).
The procedure for using this substitution is outlined in the following problem-solving strategy.
Problem-Solving Strategy: Integrals Involving \(\sqrt{x^2-a^2}\)

1. Check to see whether the integral can be evaluated using another method. If so, we may wish to consider applying an
alternative technique.
2. Substitute \(x = a\sec\theta\) and \(dx = a\sec\theta\tan\theta\,d\theta\). This substitution yields
\[\sqrt{x^2-a^2} = \sqrt{(a\sec\theta)^2-a^2} = \sqrt{a^2(\sec^2\theta-1)} = \sqrt{a^2\tan^2\theta} = |a\tan\theta|.\]
For \(x \ge a\), \(|a\tan\theta| = a\tan\theta\) and for \(x \le -a\), \(|a\tan\theta| = -a\tan\theta\).
3. Simplify the expression.
4. Evaluate the integral using techniques from the section on trigonometric integrals.
5. Use the reference triangles from Figure 7.3.6 to rewrite the result in terms of x.
6. You may also need to use some trigonometric identities and the relationship \(\theta = \sec^{-1}\left(\frac{x}{a}\right)\). (Note: We need both
reference triangles, since the values of some of the trigonometric ratios are different depending on whether \(x > a\) or
\(x < -a\).)

Figure 7.3.6: Use the appropriate reference triangle to express the trigonometric functions evaluated at θ in terms of x.

 Example 7.3.7: Finding the Area of a Region


Find the area of the region between the graph of \(f(x) = \sqrt{x^2-9}\) and the x-axis over the interval [3, 5].
Solution
First, sketch a rough graph of the region described in the problem, as shown in the following figure.

Figure 7.3.7: Calculating the area of the shaded region requires evaluating an integral with a trigonometric substitution.

We can see that the area is \(A = \int_3^5 \sqrt{x^2-9}\,dx\). To evaluate this definite integral, substitute \(x = 3\sec\theta\) and
\(dx = 3\sec\theta\tan\theta\,d\theta\). We must also change the limits of integration. If \(x = 3\), then \(3 = 3\sec\theta\) and hence \(\theta = 0\). If \(x = 5\),
then \(\theta = \sec^{-1}\left(\frac{5}{3}\right)\). After making these substitutions and simplifying, we have

\begin{align*}
\text{Area} &= \int_3^5 \sqrt{x^2-9}\,dx \\
&= \int_0^{\sec^{-1}(5/3)} 9\tan^2\theta\sec\theta\,d\theta && \text{Use } \tan^2\theta = \sec^2\theta-1. \\
&= \int_0^{\sec^{-1}(5/3)} 9(\sec^2\theta-1)\sec\theta\,d\theta && \text{Expand.} \\
&= \int_0^{\sec^{-1}(5/3)} 9(\sec^3\theta-\sec\theta)\,d\theta && \text{Evaluate the integral.} \\
&= \left(\frac{9}{2}\ln|\sec\theta+\tan\theta|+\frac{9}{2}\sec\theta\tan\theta\right)-9\ln|\sec\theta+\tan\theta|\,\Bigg|_0^{\sec^{-1}(5/3)} && \text{Simplify.} \\
&= \frac{9}{2}\sec\theta\tan\theta-\frac{9}{2}\ln|\sec\theta+\tan\theta|\,\Bigg|_0^{\sec^{-1}(5/3)} && \text{Evaluate. Use } \sec\left(\sec^{-1}\tfrac{5}{3}\right) = \tfrac{5}{3} \text{ and } \tan\left(\sec^{-1}\tfrac{5}{3}\right) = \tfrac{4}{3}. \\
&= \frac{9}{2}\cdot\frac{5}{3}\cdot\frac{4}{3}-\frac{9}{2}\ln\left|\frac{5}{3}+\frac{4}{3}\right|-\left(\frac{9}{2}\cdot 1\cdot 0-\frac{9}{2}\ln|1+0|\right) \\
&= 10-\frac{9}{2}\ln 3.
\end{align*}
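The value of the area can be checked by integrating the transformed integrand numerically. In the sketch below, offered as an illustration with a hypothetical `simpson` helper, the upper limit \(\sec^{-1}(5/3)\) is computed as `math.acos(3 / 5)`, since \(\sec\theta = 5/3\) means \(\cos\theta = 3/5\).

```python
import math

def simpson(f, a, b, m=2000):
    """Composite Simpson's rule with m (even) subintervals."""
    h = (b - a) / m
    total = f(a) + f(b)
    for i in range(1, m):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def transformed(theta):
    """Integrand after substituting x = 3 sec(theta): 9 tan^2(theta) sec(theta)."""
    return 9 * math.tan(theta) ** 2 / math.cos(theta)

upper = math.acos(3 / 5)  # sec^{-1}(5/3)
numeric = simpson(transformed, 0.0, upper)
closed_form = 10 - 4.5 * math.log(3)
print(numeric, closed_form)
```

Working in θ is also convenient numerically: the original integrand \(\sqrt{x^2-9}\) has an unbounded derivative at x = 3, while the transformed integrand is smooth on the whole interval.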

 Exercise 7.3.3

Evaluate
\[\int \frac{dx}{\sqrt{x^2-4}}.\]

Assume that \(x > 2\).

Hint
Substitute \(x = 2\sec\theta\) and \(dx = 2\sec\theta\tan\theta\,d\theta\).

Answer
\[\ln\left|\frac{x}{2}+\frac{\sqrt{x^2-4}}{2}\right| + C\]

Key Concepts
For integrals involving \(\sqrt{a^2-x^2}\), use the substitution \(x = a\sin\theta\) and \(dx = a\cos\theta\,d\theta\).
For integrals involving \(\sqrt{a^2+x^2}\), use the substitution \(x = a\tan\theta\) and \(dx = a\sec^2\theta\,d\theta\).
For integrals involving \(\sqrt{x^2-a^2}\), substitute \(x = a\sec\theta\) and \(dx = a\sec\theta\tan\theta\,d\theta\).

Glossary
trigonometric substitution
an integration technique that converts an algebraic integral containing expressions of the form \(\sqrt{a^2-x^2}\), \(\sqrt{a^2+x^2}\), or
\(\sqrt{x^2-a^2}\) into a trigonometric integral

7.3: Trigonometric Substitution is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
7.3: Trigonometric Substitution by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0. Original source: https://openstax.org/details/books/calculus-volume-1.

7.4: Integration of Rational Functions by Partial Fractions
 Learning Objectives
Integrate a rational function using the method of partial fractions.
Recognize simple linear factors in a rational function.
Recognize repeated linear factors in a rational function.
Recognize quadratic factors in a rational function.

We have seen some techniques that allow us to integrate specific rational functions. For example, we know that

\[\int \frac{du}{u} = \ln|u| + C\]

and

\[\int \frac{du}{u^2+a^2} = \frac{1}{a}\tan^{-1}\left(\frac{u}{a}\right) + C.\]

However, we do not yet have a technique that allows us to tackle arbitrary quotients of this type. Thus, it is not immediately
obvious how to go about evaluating

\[\int \frac{3x}{x^2-x-2}\,dx.\]

However, we know from material previously developed that

\[\int \left(\frac{1}{x+1}+\frac{2}{x-2}\right)dx = \ln|x+1| + 2\ln|x-2| + C.\]

In fact, by getting a common denominator, we see that

\[\frac{1}{x+1}+\frac{2}{x-2} = \frac{3x}{x^2-x-2}.\]

Consequently,

\[\int \frac{3x}{x^2-x-2}\,dx = \int \left(\frac{1}{x+1}+\frac{2}{x-2}\right)dx.\]

In this section, we examine the method of partial fraction decomposition, which allows us to decompose rational functions into
sums of simpler, more easily integrated rational functions. Using this method, we can rewrite an expression such as

\[\frac{3x}{x^2-x-2}\]

as an expression such as

\[\frac{1}{x+1}+\frac{2}{x-2}.\]

The key to the method of partial fraction decomposition is being able to anticipate the form that the decomposition of a rational
function will take. As we shall see, this form is both predictable and highly dependent on the factorization of the denominator of
the rational function. It is also extremely important to keep in mind that partial fraction decomposition can be applied to a rational
function \(\frac{P(x)}{Q(x)}\) only if \(\deg(P(x)) < \deg(Q(x))\). In the case when \(\deg(P(x)) \ge \deg(Q(x))\), we must first perform long division
to rewrite the quotient \(\frac{P(x)}{Q(x)}\) in the form \(A(x)+\frac{R(x)}{Q(x)}\), where \(\deg(R(x)) < \deg(Q(x))\). We then do a partial fraction
decomposition on \(\frac{R(x)}{Q(x)}\). The following example, although not requiring partial fraction decomposition, illustrates our approach to
integrals of rational functions of the form \(\int \frac{P(x)}{Q(x)}\,dx\), where \(\deg(P(x)) \ge \deg(Q(x))\).

Example 7.4.1: Integrating \(\int \frac{P(x)}{Q(x)}\,dx\), where \(\deg(P(x)) \ge \deg(Q(x))\)

Evaluate
\[\int \frac{x^2+3x+5}{x+1}\,dx.\]

Solution
Since \(\deg(x^2+3x+5) \ge \deg(x+1)\), we perform long division to obtain

\[\frac{x^2+3x+5}{x+1} = x+2+\frac{3}{x+1}.\]

Thus,

\[\int \frac{x^2+3x+5}{x+1}\,dx = \int \left(x+2+\frac{3}{x+1}\right)dx = \frac{1}{2}x^2+2x+3\ln|x+1| + C.\]

Visit this website for a review of long division of polynomials.
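The long-division step can be sketched in code. The following is a minimal illustration, not a method from the text: the function name `poly_divmod` and the convention of storing a polynomial as a list of coefficients, highest degree first, are assumptions made for this example. Exact rational arithmetic from the standard `fractions` module avoids rounding.

```python
from fractions import Fraction

def poly_divmod(p, q):
    """Divide polynomial p by q (coefficient lists, highest degree first).

    Returns (quotient, remainder) with deg(remainder) < deg(q).
    Assumes the leading coefficient of q is nonzero.
    """
    p = [Fraction(c) for c in p]
    q = [Fraction(c) for c in q]
    quotient = []
    while len(p) >= len(q):
        factor = p[0] / q[0]
        quotient.append(factor)
        # Subtract factor * q, aligned with the leading term of p.
        for i, c in enumerate(q):
            p[i] -= factor * c
        p.pop(0)  # the leading coefficient is now zero
    return quotient, p

# (x^2 + 3x + 5) / (x + 1): quotient x + 2, remainder 3.
quotient, remainder = poly_divmod([1, 3, 5], [1, 1])
print(quotient, remainder)
```

The quotient integrates as a polynomial, and the remainder over the divisor is the proper rational function that is integrated separately (or decomposed further by partial fractions).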

 Exercise 7.4.1

Evaluate
\[\int \frac{x-3}{x+2}\,dx.\]

Hint
Use long division to obtain \(\frac{x-3}{x+2} = 1-\frac{5}{x+2}\).

Answer
\[x-5\ln|x+2| + C\]

To integrate \(\int \frac{P(x)}{Q(x)}\,dx\), where \(\deg(P(x)) < \deg(Q(x))\), we must begin by factoring \(Q(x)\).

Nonrepeated Linear Factors

If \(Q(x)\) can be factored as \((a_1x+b_1)(a_2x+b_2)\ldots(a_nx+b_n)\), where each linear factor is distinct, then it is possible to find
constants \(A_1, A_2, \ldots, A_n\) satisfying

\[\frac{P(x)}{Q(x)} = \frac{A_1}{a_1x+b_1}+\frac{A_2}{a_2x+b_2}+\cdots+\frac{A_n}{a_nx+b_n}. \tag{7.4.1}\]

The proof that such constants exist is beyond the scope of this course.
In this next example, we see how to use partial fractions to integrate a rational function of this type.

Example 7.4.2: Partial Fractions with Nonrepeated Linear Factors

Evaluate \(\int \frac{3x+2}{x^3-x^2-2x}\,dx\).

Solution
Since \(\deg(3x+2) < \deg(x^3-x^2-2x)\), we begin by factoring the denominator of \(\frac{3x+2}{x^3-x^2-2x}\). We can see that
\(x^3-x^2-2x = x(x-2)(x+1)\). Thus, there are constants A, B, and C satisfying Equation 7.4.1 such that

\[\frac{3x+2}{x(x-2)(x+1)} = \frac{A}{x}+\frac{B}{x-2}+\frac{C}{x+1}.\]

We must now find these constants. To do so, we begin by getting a common denominator on the right. Thus,

\[\frac{3x+2}{x(x-2)(x+1)} = \frac{A(x-2)(x+1)+Bx(x+1)+Cx(x-2)}{x(x-2)(x+1)}.\]

Now, we set the numerators equal to each other, obtaining

\[3x+2 = A(x-2)(x+1)+Bx(x+1)+Cx(x-2). \tag{7.4.2}\]

There are two different strategies for finding the coefficients A, B, and C. We refer to these as the method of equating
coefficients and the method of strategic substitution.

Strategy one: Method of Equating Coefficients
Rewrite Equation 7.4.2 in the form

\[3x+2 = (A+B+C)x^2+(-A+B-2C)x+(-2A).\]

Equating coefficients produces the system of equations

\begin{align*}
A+B+C &= 0 \\
-A+B-2C &= 3 \\
-2A &= 2.
\end{align*}

To solve this system, we first observe that \(-2A = 2 \Rightarrow A = -1\). Substituting this value into the first two equations gives us
the system

\begin{align*}
B+C &= 1 \\
B-2C &= 2.
\end{align*}

Multiplying the second equation by −1 and adding the resulting equation to the first produces

\[-3C = 1,\]

which in turn implies that \(C = -\frac{1}{3}\). Substituting this value into the equation \(B+C = 1\) yields \(B = \frac{4}{3}\). Thus, solving these
equations yields \(A = -1\), \(B = \frac{4}{3}\), and \(C = -\frac{1}{3}\).

It is important to note that the system produced by this method is consistent if and only if we have set up the decomposition
correctly. If the system is inconsistent, there is an error in our decomposition.

Strategy two: Method of Strategic Substitution
The method of strategic substitution is based on the assumption that we have set up the decomposition correctly. If the
decomposition is set up correctly, then there must be values of A, B, and C that satisfy Equation 7.4.2 for all values of x. That
is, this equation must be true for any value of x we care to substitute into it. Therefore, by choosing values of x carefully and
substituting them into the equation, we may find A, B, and C easily. For example, if we substitute \(x = 0\), the equation reduces
to \(2 = A(-2)(1)\). Solving for A yields \(A = -1\). Next, by substituting \(x = 2\), the equation reduces to \(8 = B(2)(3)\), or
equivalently \(B = 4/3\). Last, we substitute \(x = -1\) into the equation and obtain \(-1 = C(-1)(-3)\). Solving, we have
\(C = -\frac{1}{3}\).

It is important to keep in mind that if we attempt to use this method with a decomposition that has not been set up correctly, we
are still able to find values for the constants, but these constants are meaningless. If we do opt to use the method of strategic
substitution, then it is a good idea to check the result by recombining the terms algebraically.

Now that we have the values of A, B, and C, we rewrite the original integral:

\[\int \frac{3x+2}{x^3-x^2-2x}\,dx = \int \left(-\frac{1}{x}+\frac{4}{3}\cdot\frac{1}{x-2}-\frac{1}{3}\cdot\frac{1}{x+1}\right)dx.\]

Evaluating the integral gives us

\[\int \frac{3x+2}{x^3-x^2-2x}\,dx = -\ln|x|+\frac{4}{3}\ln|x-2|-\frac{1}{3}\ln|x+1| + C.\]
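The method of strategic substitution is easy to automate when the denominator is a product of distinct factors of the form (x − r). The sketch below is an illustration under that assumption; the function name `strategic_substitution` is hypothetical. Substituting x = r into the cleared-denominator equation leaves only the term belonging to the factor (x − r), so each constant is the numerator at r divided by the product of the remaining factors at r.

```python
from fractions import Fraction

def strategic_substitution(numerator, roots):
    """Constants A_i with P(x) / prod(x - r_i) = sum of A_i / (x - r_i).

    Assumes the roots are distinct, the denominator is the monic product
    of the factors (x - r_i), and deg P is less than the number of roots.
    """
    constants = []
    for i, r in enumerate(roots):
        r = Fraction(r)
        # Product of the remaining factors (r - s), evaluated at x = r.
        others = Fraction(1)
        for j, s in enumerate(roots):
            if j != i:
                others *= r - Fraction(s)
        constants.append(Fraction(numerator(r)) / others)
    return constants

# Example 7.4.2: (3x + 2) / (x (x - 2) (x + 1)), with roots 0, 2, and -1.
A, B, C = strategic_substitution(lambda x: 3 * x + 2, [0, 2, -1])
print(A, B, C)  # -1 4/3 -1/3
```

As the text cautions, values obtained this way are meaningful only if the decomposition was set up correctly, so recombining the terms remains a worthwhile check.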

In the next example, we integrate a rational function in which the degree of the numerator is not less than the degree of the
denominator.

Example 7.4.3: Dividing before Applying Partial Fractions

Evaluate \(\int \frac{x^2+3x+1}{x^2-4}\,dx\).

Solution
Since \(\deg(x^2+3x+1) \ge \deg(x^2-4)\), we must perform long division of polynomials. This results in

\[\frac{x^2+3x+1}{x^2-4} = 1+\frac{3x+5}{x^2-4}.\]

Next, we perform partial fraction decomposition on \(\frac{3x+5}{x^2-4} = \frac{3x+5}{(x+2)(x-2)}\). We have

\[\frac{3x+5}{(x-2)(x+2)} = \frac{A}{x-2}+\frac{B}{x+2}.\]

Thus,

\[3x+5 = A(x+2)+B(x-2).\]

Solving for A and B using either method, we obtain \(A = 11/4\) and \(B = 1/4\).
Rewriting the original integral, we have

\[\int \frac{x^2+3x+1}{x^2-4}\,dx = \int \left(1+\frac{11}{4}\cdot\frac{1}{x-2}+\frac{1}{4}\cdot\frac{1}{x+2}\right)dx.\]

Evaluating the integral produces

\[\int \frac{x^2+3x+1}{x^2-4}\,dx = x+\frac{11}{4}\ln|x-2|+\frac{1}{4}\ln|x+2| + C.\]

As we see in the next example, it may be possible to apply the technique of partial fraction decomposition to a nonrational function.
The trick is to convert the nonrational function to a rational function through a substitution.

Example 7.4.4: Applying Partial Fractions after a Substitution

Evaluate \(\int \frac{\cos x}{\sin^2 x-\sin x}\,dx\).

Solution
Let’s begin by letting \(u = \sin x\). Consequently, \(du = \cos x\,dx\). After making these substitutions, we have

\[\int \frac{\cos x}{\sin^2 x-\sin x}\,dx = \int \frac{du}{u^2-u} = \int \frac{du}{u(u-1)}.\]

Applying partial fraction decomposition to \(\frac{1}{u(u-1)}\) gives \(\frac{1}{u(u-1)} = -\frac{1}{u}+\frac{1}{u-1}\).

Thus,

\[\int \frac{\cos x}{\sin^2 x-\sin x}\,dx = -\ln|u|+\ln|u-1| + C = -\ln|\sin x|+\ln|\sin x-1| + C.\]

 Exercise 7.4.2
Evaluate \(\int \frac{x+1}{(x+3)(x-2)}\,dx\).

Hint
\[\frac{x+1}{(x+3)(x-2)} = \frac{A}{x+3}+\frac{B}{x-2}\]

Answer
\[\frac{2}{5}\ln|x+3|+\frac{3}{5}\ln|x-2| + C\]

Repeated Linear Factors

For some applications, we need to integrate rational expressions that have denominators with repeated linear factors—that is,
rational functions with at least one factor of the form \((ax+b)^n\), where n is a positive integer greater than or equal to 2. If the
denominator contains the repeated linear factor \((ax+b)^n\), then the decomposition must contain

\[\frac{A_1}{ax+b}+\frac{A_2}{(ax+b)^2}+\cdots+\frac{A_n}{(ax+b)^n}. \tag{7.4.3}\]

As we see in our next example, the basic technique used for solving for the coefficients is the same, but it requires more algebra to
determine the numerators of the partial fractions.

 Example 7.4.5: Partial Fractions with Repeated Linear Factors


Evaluate \(\int \frac{x-2}{(2x-1)^2(x-1)}\,dx\).

Solution
We have \(\deg(x-2) < \deg((2x-1)^2(x-1))\), so we can proceed with the decomposition. Since \((2x-1)^2\) is a repeated
linear factor, include

\[\frac{A}{2x-1}+\frac{B}{(2x-1)^2}\]

in the decomposition in Equation 7.4.3. Thus,

\[\frac{x-2}{(2x-1)^2(x-1)} = \frac{A}{2x-1}+\frac{B}{(2x-1)^2}+\frac{C}{x-1}.\]

After getting a common denominator and equating the numerators, we have

\[x-2 = A(2x-1)(x-1)+B(x-1)+C(2x-1)^2. \tag{7.4.4}\]

We then use the method of equating coefficients to find the values of A, B, and C.

\[x-2 = (2A+4C)x^2+(-3A+B-4C)x+(A-B+C).\]

Equating coefficients yields \(2A+4C = 0\), \(-3A+B-4C = 1\), and \(A-B+C = -2\). Solving this system yields
\(A = 2\), \(B = 3\), and \(C = -1\).

Alternatively, we can use the method of strategic substitution. In this case, substituting \(x = 1\) and \(x = 1/2\) into Equation 7.4.4
easily produces the values \(B = 3\) and \(C = -1\). At this point, it may seem that we have run out of good choices for x;
however, since we already have values for B and C, we can substitute in these values and choose any value for x not
previously used. The value \(x = 0\) is a good option. In this case, we obtain the equation
\(-2 = A(-1)(-1)+3(-1)+(-1)(-1)^2\) or, equivalently, \(A = 2\).

Now that we have the values for A, B, and C, we rewrite the original integral and evaluate it:

\begin{align*}
\int \frac{x-2}{(2x-1)^2(x-1)}\,dx &= \int \left(\frac{2}{2x-1}+\frac{3}{(2x-1)^2}-\frac{1}{x-1}\right)dx \\
&= \ln|2x-1|-\frac{3}{2(2x-1)}-\ln|x-1| + C.
\end{align*}
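Following the advice to check a decomposition by recombining the terms, the sketch below compares both sides of the decomposition from Example 7.4.5 at several rational sample points using exact arithmetic. The function names `original` and `decomposed` and the particular sample points are illustrative assumptions.

```python
from fractions import Fraction

def original(x):
    """The rational function (x - 2) / ((2x - 1)^2 (x - 1))."""
    return (x - 2) / ((2 * x - 1) ** 2 * (x - 1))

def decomposed(x):
    """The decomposition with A = 2, B = 3, and C = -1."""
    return 2 / (2 * x - 1) + 3 / (2 * x - 1) ** 2 - 1 / (x - 1)

# Thirds between -2 and 2, skipping x = 1, where the denominator vanishes
# (x = 1/2 is not a multiple of 1/3, so it never appears).
samples = [Fraction(n, 3) for n in range(-6, 7) if n != 3]
matches = [original(x) == decomposed(x) for x in samples]
print(all(matches))
```

After clearing denominators, both sides are polynomials of degree at most 2, so exact agreement at three or more points already forces the identity; the additional points are redundancy.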

 Exercise 7.4.3

Set up the partial fraction decomposition for

\[\int \frac{x+2}{(x+3)^3(x-4)^2}\,dx.\]

(Do not solve for the coefficients or complete the integration.)

Hint
Use the problem-solving method of Example 7.4.5 for guidance.

Answer
\[\frac{x+2}{(x+3)^3(x-4)^2} = \frac{A}{x+3}+\frac{B}{(x+3)^2}+\frac{C}{(x+3)^3}+\frac{D}{x-4}+\frac{E}{(x-4)^2}\]

The General Method


Now that we are beginning to get the idea of how the technique of partial fraction decomposition works, let’s outline the basic
method in the following problem-solving strategy.

 Problem-Solving Strategy: Partial Fraction Decomposition


To decompose the rational function \(P(x)/Q(x)\), use the following steps:
1. Make sure that \(\deg(P(x)) < \deg(Q(x))\). If not, perform long division of polynomials.
2. Factor \(Q(x)\) into the product of linear and irreducible quadratic factors. An irreducible quadratic is a quadratic that has no
real zeros.
3. Assuming that \(\deg(P(x)) < \deg(Q(x))\), the factors of \(Q(x)\) determine the form of the decomposition of \(P(x)/Q(x)\).
a. If \(Q(x)\) can be factored as \((a_1x+b_1)(a_2x+b_2)\ldots(a_nx+b_n)\), where each linear factor is distinct, then it is possible
to find constants \(A_1, A_2, \ldots, A_n\) satisfying

\[\frac{P(x)}{Q(x)} = \frac{A_1}{a_1x+b_1}+\frac{A_2}{a_2x+b_2}+\cdots+\frac{A_n}{a_nx+b_n}.\]

b. If \(Q(x)\) contains the repeated linear factor \((ax+b)^n\), then the decomposition must contain

\[\frac{A_1}{ax+b}+\frac{A_2}{(ax+b)^2}+\cdots+\frac{A_n}{(ax+b)^n}.\]

c. For each irreducible quadratic factor \(ax^2+bx+c\) that \(Q(x)\) contains, the decomposition must include

\[\frac{Ax+B}{ax^2+bx+c}.\]

d. For each repeated irreducible quadratic factor \((ax^2+bx+c)^n\), the decomposition must include

\[\frac{A_1x+B_1}{ax^2+bx+c}+\frac{A_2x+B_2}{(ax^2+bx+c)^2}+\cdots+\frac{A_nx+B_n}{(ax^2+bx+c)^n}.\]

e. After the appropriate decomposition is determined, solve for the constants.
f. Last, rewrite the integral in its decomposed form and evaluate it using previously developed techniques or integration
formulas.

Simple Quadratic Factors

Now let’s look at integrating a rational expression in which the denominator contains an irreducible quadratic factor. Recall that the
quadratic \(ax^2+bx+c\) is irreducible if \(ax^2+bx+c = 0\) has no real solutions—that is, if \(b^2-4ac < 0\).

 Example 7.4.6: Rational Expressions with an Irreducible Quadratic Factor

Evaluate
2x − 3
∫ dx.
3
x +x

Solution
Since \(\deg(2x - 3) < \deg(x^3 + x)\), factor the denominator and proceed with partial fraction decomposition. Since \(x^3 + x = x(x^2 + 1)\) contains the irreducible quadratic factor \(x^2 + 1\), include \(\dfrac{Ax + B}{x^2 + 1}\) as part of the decomposition, along with \(\dfrac{C}{x}\) for the linear term x. Thus, the decomposition has the form

\[\frac{2x - 3}{x(x^2 + 1)} = \frac{Ax + B}{x^2 + 1} + \frac{C}{x}.\]

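After getting a common denominator and equating the numerators, we obtain the equation

\[2x - 3 = (Ax + B)x + C(x^2 + 1).\]

Expanding the right-hand side gives \(2x - 3 = (A + C)x^2 + Bx + C\). Equating coefficients yields \(A + C = 0\), \(B = 2\), and \(C = -3\), so A = 3, B = 2, and C = −3.

Thus,

\[\frac{2x - 3}{x^3 + x} = \frac{3x + 2}{x^2 + 1} - \frac{3}{x}.\]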

Substituting back into the integral, we obtain

\[\begin{align*}
\int \frac{2x - 3}{x^3 + x}\,dx &= \int \left(\frac{3x + 2}{x^2 + 1} - \frac{3}{x}\right)dx \\
&= 3\int \frac{x}{x^2 + 1}\,dx + 2\int \frac{1}{x^2 + 1}\,dx - 3\int \frac{1}{x}\,dx & & \text{Split up the integral} \\
&= \frac{3}{2}\ln|x^2 + 1| + 2\tan^{-1}x - 3\ln|x| + C. & & \text{Evaluate each integral}
\end{align*}\]

Note: We may rewrite \(\ln|x^2 + 1| = \ln(x^2 + 1)\), if we wish to do so, since \(x^2 + 1 > 0.\)
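As a quick sanity check on the constants A = 3, B = 2, and C = −3, the original fraction and its decomposition can be compared at a few sample points. The sketch below uses exact rational arithmetic; the sample points are arbitrary choices, and agreement at finitely many points is supporting evidence rather than a proof.

```python
from fractions import Fraction

def original(x):
    # The integrand of Example 7.4.6
    return (2 * x - 3) / (x**3 + x)

def decomposed(x):
    # The decomposition found above: (3x + 2)/(x^2 + 1) - 3/x
    return (3 * x + 2) / (x**2 + 1) - 3 / x

# Arbitrary nonzero rational sample points (x = 0 is excluded from the domain).
samples = [Fraction(n, 3) for n in (-7, -2, 1, 4, 10)]
all_match = all(original(x) == decomposed(x) for x in samples)
print(all_match)  # True
```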

 Example 7.4.7: Partial Fractions with an Irreducible Quadratic Factor
Evaluate \(\displaystyle \int \frac{dx}{x^3 - 8}.\)

Solution: We can start by factoring \(x^3 - 8 = (x - 2)(x^2 + 2x + 4).\) We see that the quadratic factor \(x^2 + 2x + 4\) is irreducible since \(2^2 - 4(1)(4) = -12 < 0.\) Using the decomposition described in the problem-solving strategy, we get

\[\frac{1}{(x - 2)(x^2 + 2x + 4)} = \frac{A}{x - 2} + \frac{Bx + C}{x^2 + 2x + 4}.\]

After obtaining a common denominator and equating the numerators, this becomes

\[1 = A(x^2 + 2x + 4) + (Bx + C)(x - 2).\]

Applying either method (equating coefficients or substituting convenient values of x), we get \(A = \frac{1}{12}\), \(B = -\frac{1}{12}\), and \(C = -\frac{1}{3}.\)

Rewriting \(\displaystyle \int \frac{dx}{x^3 - 8}\), we have

\[\int \frac{dx}{x^3 - 8} = \frac{1}{12}\int \frac{1}{x - 2}\,dx - \frac{1}{12}\int \frac{x + 4}{x^2 + 2x + 4}\,dx.\]

We can see that

\[\int \frac{1}{x - 2}\,dx = \ln|x - 2| + C,\]

but

\[\int \frac{x + 4}{x^2 + 2x + 4}\,dx\]

requires a bit more effort. Let’s begin by completing the square on \(x^2 + 2x + 4\) to obtain

\[x^2 + 2x + 4 = (x + 1)^2 + 3.\]

By letting u = x + 1 and consequently du = dx, we see that

\[\begin{align*}
\int \frac{x + 4}{x^2 + 2x + 4}\,dx &= \int \frac{x + 4}{(x + 1)^2 + 3}\,dx & & \text{Complete the square on the denominator} \\
&= \int \frac{u + 3}{u^2 + 3}\,du & & \text{Substitute } u = x + 1,\ x = u - 1, \text{ and } du = dx \\
&= \int \frac{u}{u^2 + 3}\,du + \int \frac{3}{u^2 + 3}\,du & & \text{Split the numerator apart} \\
&= \frac{1}{2}\ln|u^2 + 3| + \frac{3}{\sqrt{3}}\tan^{-1}\frac{u}{\sqrt{3}} + C & & \text{Evaluate each integral} \\
&= \frac{1}{2}\ln|x^2 + 2x + 4| + \sqrt{3}\tan^{-1}\left(\frac{x + 1}{\sqrt{3}}\right) + C & & \text{Rewrite in terms of } x \text{ and simplify}
\end{align*}\]

Substituting back into the original integral and simplifying gives

\[\int \frac{dx}{x^3 - 8} = \frac{1}{12}\ln|x - 2| - \frac{1}{24}\ln|x^2 + 2x + 4| - \frac{\sqrt{3}}{12}\tan^{-1}\left(\frac{x + 1}{\sqrt{3}}\right) + C.\]

Here again, we can drop the absolute value if we wish to do so, since \(x^2 + 2x + 4 > 0\) for all x.
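One way to gain confidence in an antiderivative like this is to differentiate it numerically and compare with the integrand. The sketch below does so with a central difference; the step size and the test points are arbitrary choices made for illustration.

```python
import math

def F(x):
    # Antiderivative found in Example 7.4.7 (constant of integration omitted)
    return (math.log(abs(x - 2)) / 12
            - math.log(x**2 + 2 * x + 4) / 24
            - math.sqrt(3) / 12 * math.atan((x + 1) / math.sqrt(3)))

def integrand(x):
    return 1 / (x**3 - 8)

def central_difference(func, x, h=1e-5):
    # Symmetric difference quotient approximating func'(x)
    return (func(x + h) - func(x - h)) / (2 * h)

# Test points on both sides of the discontinuity at x = 2.
points = [-3.0, 0.0, 1.0, 3.0, 5.0]
max_gap = max(abs(central_difference(F, x) - integrand(x)) for x in points)
print(max_gap)  # a very small number if F'(x) matches the integrand
```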

 Example 7.4.8: Finding a Volume
Find the volume of the solid of revolution obtained by revolving the region enclosed by the graph of \(f(x) = \dfrac{x^2}{(x^2 + 1)^2}\) and the x-axis over the interval [0, 1] about the y-axis.


Solution
Let’s begin by sketching the region to be revolved (see Figure 7.4.1). From the sketch, we see that the shell method is a good
choice for solving this problem.

Figure 7.4.1 : We can use the shell method to find the volume of revolution obtained by revolving the region shown about the
y -axis.

The volume is given by

\[V = 2\pi\int_0^1 x \cdot \frac{x^2}{(x^2 + 1)^2}\,dx = 2\pi\int_0^1 \frac{x^3}{(x^2 + 1)^2}\,dx.\]

Since \(\deg((x^2 + 1)^2) = 4 > 3 = \deg(x^3)\), we can proceed with partial fraction decomposition. Note that \((x^2 + 1)^2\) is a repeated irreducible quadratic. Using the decomposition described in the problem-solving strategy, we get

\[\frac{x^3}{(x^2 + 1)^2} = \frac{Ax + B}{x^2 + 1} + \frac{Cx + D}{(x^2 + 1)^2}.\]

Finding a common denominator and equating the numerators gives

\[x^3 = (Ax + B)(x^2 + 1) + Cx + D.\]

Solving, we obtain A = 1, B = 0, C = −1, and D = 0. Substituting back into the integral, we have

\[V = 2\pi\int_0^1 \frac{x^3}{(x^2 + 1)^2}\,dx = 2\pi\int_0^1 \left(\frac{x}{x^2 + 1} - \frac{x}{(x^2 + 1)^2}\right)dx = 2\pi\left.\left(\frac{1}{2}\ln(x^2 + 1) + \frac{1}{2}\cdot\frac{1}{x^2 + 1}\right)\right|_0^1 = \pi\left(\ln 2 - \frac{1}{2}\right).\]
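The closed-form volume can be checked against a direct numerical estimate of the shell integral. The sketch below sums midpoint rectangles; the choice of 1000 subintervals is arbitrary, and the comparison is a plausibility check rather than a derivation.

```python
import math

def shell_integrand(x):
    # 2*pi * radius * height, with radius x and height x^2/(x^2 + 1)^2
    return 2 * math.pi * x * (x**2 / (x**2 + 1)**2)

n = 1000                      # number of subintervals (arbitrary)
dx = 1 / n
approx = sum(shell_integrand((i + 0.5) * dx) for i in range(n)) * dx
exact = math.pi * (math.log(2) - 0.5)
print(approx, exact)          # the two values should agree to several decimal places
```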
2

 Exercise 7.4.4
Set up the partial fraction decomposition for

\[\int \frac{x^2 + 3x + 1}{(x + 2)(x - 3)^2(x^2 + 4)^2}\,dx.\]

Hint
Use the problem-solving strategy.

Answer

\[\frac{x^2 + 3x + 1}{(x + 2)(x - 3)^2(x^2 + 4)^2} = \frac{A}{x + 2} + \frac{B}{x - 3} + \frac{C}{(x - 3)^2} + \frac{Dx + E}{x^2 + 4} + \frac{Fx + G}{(x^2 + 4)^2}\]

Key Concepts
Partial fraction decomposition is a technique used to break down a rational function into a sum of simple rational functions that
can be integrated using previously learned techniques.
When applying partial fraction decomposition, we must make sure that the degree of the numerator is less than the degree of the
denominator. If not, we need to perform long division before attempting partial fraction decomposition.
The form the decomposition takes depends on the type of factors in the denominator. The types of factors include nonrepeated
linear factors, repeated linear factors, nonrepeated irreducible quadratic factors, and repeated irreducible quadratic factors.

Glossary
partial fraction decomposition
a technique used to break down a rational function into the sum of simple rational functions

7.4: Integration of Rational Functions by Partial Fractions is shared under a not declared license and was authored, remixed, and/or curated by
LibreTexts.
7.4: Partial Fractions by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0. Original source:
https://openstax.org/details/books/calculus-volume-1.

7.5: Strategy for Integration
 Learning Objectives
Use a table of integrals to solve integration problems.
Use a computer algebra system (CAS) to solve integration problems.

In addition to the techniques of integration we have already seen, several other tools are widely available to assist with the process
of integration. Among these tools are integration tables, which are readily available in many books, including the appendices to
this one. Also widely available are computer algebra systems (CAS), which are found on calculators and in many campus
computer labs, and are free online.

Tables of Integrals
Integration tables, if used in the right manner, can be a handy way either to evaluate or check an integral quickly. Keep in mind that
when using a table to check an answer, it is possible for two completely correct solutions to look very different. For example, in
Trigonometric Substitution, we found that, by using the substitution x = tan θ, we can arrive at

\[\int \frac{dx}{\sqrt{1 + x^2}} = \ln\left|x + \sqrt{x^2 + 1}\right| + C.\]

However, using x = sinh θ, we obtained a different solution, namely,

\[\int \frac{dx}{\sqrt{1 + x^2}} = \sinh^{-1}x + C.\]

We later showed algebraically that the two solutions are equivalent. That is, we showed that \(\sinh^{-1}x = \ln\left|x + \sqrt{x^2 + 1}\right|\). In this case, the two antiderivatives that we found were actually equal. This need not be the case. However, as long as the difference in the two antiderivatives is a constant, they are equivalent.

 Example 7.5.1: Using a Formula from a Table to Evaluate an Integral


Use the table formula

\[\int \frac{\sqrt{a^2 - u^2}}{u^2}\,du = -\frac{\sqrt{a^2 - u^2}}{u} - \sin^{-1}\frac{u}{a} + C\]

to evaluate \(\displaystyle \int \frac{\sqrt{16 - e^{2x}}}{e^x}\,dx.\)

Solution
If we look at integration tables, we see that several formulas contain expressions of the form \(\sqrt{a^2 - u^2}\). This expression is actually similar to \(\sqrt{16 - e^{2x}}\), where a = 4 and \(u = e^x\). Keep in mind that we must also have \(du = e^x\,dx\). Multiplying the numerator and the denominator of the given integral by \(e^x\) should help to put this integral in a useful form. Thus, we now have

\[\int \frac{\sqrt{16 - e^{2x}}}{e^x}\,dx = \int \frac{\sqrt{16 - e^{2x}}}{e^{2x}}\,e^x\,dx.\]

Substituting \(u = e^x\) and \(du = e^x\,dx\) produces \(\displaystyle \int \frac{\sqrt{a^2 - u^2}}{u^2}\,du.\) From the integration table (#88 in Appendix A),

\[\int \frac{\sqrt{a^2 - u^2}}{u^2}\,du = -\frac{\sqrt{a^2 - u^2}}{u} - \sin^{-1}\frac{u}{a} + C.\]

Thus,

\[\begin{align*}
\int \frac{\sqrt{16 - e^{2x}}}{e^x}\,dx &= \int \frac{\sqrt{16 - e^{2x}}}{e^{2x}}\,e^x\,dx & & \text{Substitute } u = e^x \text{ and } du = e^x\,dx. \\
&= \int \frac{\sqrt{4^2 - u^2}}{u^2}\,du & & \text{Apply the formula using } a = 4. \\
&= -\frac{\sqrt{4^2 - u^2}}{u} - \sin^{-1}\frac{u}{4} + C & & \text{Substitute } u = e^x. \\
&= -\frac{\sqrt{16 - e^{2x}}}{e^x} - \sin^{-1}\left(\frac{e^x}{4}\right) + C
\end{align*}\]

Computer Algebra Systems


If available, a CAS is a faster alternative to a table for solving an integration problem. Many such systems are widely available and
are, in general, quite easy to use.

 Example 7.5.2: Using a Computer Algebra System to Evaluate an Integral


Use a computer algebra system to evaluate \(\displaystyle \int \frac{dx}{\sqrt{x^2 - 4}}\). Compare this result with \(\ln\left|\dfrac{\sqrt{x^2 - 4}}{2} + \dfrac{x}{2}\right| + C\), a result we might have obtained if we had used trigonometric substitution.


Solution
Using Wolfram Alpha, we obtain

\[\int \frac{dx}{\sqrt{x^2 - 4}} = \ln\left|\sqrt{x^2 - 4} + x\right| + C.\]

Notice that

\[\ln\left|\frac{\sqrt{x^2 - 4}}{2} + \frac{x}{2}\right| + C = \ln\left|\frac{\sqrt{x^2 - 4} + x}{2}\right| + C = \ln\left|\sqrt{x^2 - 4} + x\right| - \ln 2 + C.\]

Since these two antiderivatives differ by only a constant, the solutions are equivalent. We could have also demonstrated that
each of these antiderivatives is correct by differentiating them.
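The claim that the two forms differ only by a constant can also be checked numerically. The sketch below evaluates both expressions at a few points with |x| > 2 (arbitrary choices) and prints the differences, each of which should be close to ln 2.

```python
import math

def trig_sub_form(x):
    # ln| sqrt(x^2 - 4)/2 + x/2 |, the form from trigonometric substitution
    return math.log(abs(math.sqrt(x**2 - 4) / 2 + x / 2))

def cas_form(x):
    # ln| sqrt(x^2 - 4) + x |, the form reported by the CAS
    return math.log(abs(math.sqrt(x**2 - 4) + x))

# Sample points on both branches of the domain |x| > 2.
differences = [cas_form(x) - trig_sub_form(x) for x in (2.5, 3.0, 10.0, -2.5, -7.0)]
print(differences)
print(math.log(2))
```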

You can access an integral calculator for more examples.

 Example 7.5.3: Using a CAS to Evaluate an Integral


Evaluate \(\displaystyle \int \sin^3 x\,dx\) using a CAS. Compare the result to \(\frac{1}{3}\cos^3 x - \cos x + C\), the result we might have obtained using the technique for integrating odd powers of sin x discussed earlier in this chapter.
Solution
Using Wolfram Alpha, we obtain

\[\int \sin^3 x\,dx = \frac{1}{12}(\cos(3x) - 9\cos x) + C.\]

This looks quite different from \(\frac{1}{3}\cos^3 x - \cos x + C\). To see that these antiderivatives are equivalent, we can make use of a few trigonometric identities:

\[\begin{align*}
\frac{1}{12}(\cos(3x) - 9\cos x) &= \frac{1}{12}(\cos(x + 2x) - 9\cos x) \\
&= \frac{1}{12}(\cos(x)\cos(2x) - \sin(x)\sin(2x) - 9\cos x) \\
&= \frac{1}{12}(\cos x(2\cos^2 x - 1) - \sin x(2\sin x\cos x) - 9\cos x) \\
&= \frac{1}{12}(2\cos^3 x - \cos x - 2\cos x(1 - \cos^2 x) - 9\cos x) \\
&= \frac{1}{12}(4\cos^3 x - 12\cos x) \\
&= \frac{1}{3}\cos^3 x - \cos x.
\end{align*}\]

Thus, the two antiderivatives are identical.
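When an algebraic reconciliation is not obvious, a numerical spot check is a reasonable first step. The sketch below compares the two expressions on a grid of arbitrarily chosen points; a difference near machine precision is consistent with the identity shown above.

```python
import math

def by_hand(x):
    # (1/3) cos^3 x - cos x
    return math.cos(x)**3 / 3 - math.cos(x)

def from_cas(x):
    # (1/12)(cos 3x - 9 cos x)
    return (math.cos(3 * x) - 9 * math.cos(x)) / 12

points = [k * 0.37 for k in range(-10, 11)]   # arbitrary sample grid
largest_difference = max(abs(by_hand(x) - from_cas(x)) for x in points)
print(largest_difference)
```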


We may also use a CAS to compare the graphs of the two functions, as shown in the following figure.

Figure 7.5.1: The graphs of \(y = \frac{1}{3}\cos^3 x - \cos x\) and \(y = \frac{1}{12}(\cos(3x) - 9\cos x)\) are identical.

 Exercise 7.5.1
Use a CAS to evaluate \(\displaystyle \int \frac{dx}{\sqrt{x^2 + 4}}.\)

Hint
Answers may vary.

Answer
Possible solutions include \(\sinh^{-1}\left(\dfrac{x}{2}\right) + C\) and \(\ln\left|\sqrt{x^2 + 4} + x\right| + C.\)

Key Concepts
An integration table may be used to evaluate indefinite integrals.
A CAS (or computer algebra system) may be used to evaluate indefinite integrals.
It may require some effort to reconcile equivalent solutions obtained using different methods.

Glossary
computer algebra system (CAS)
technology used to perform many mathematical tasks, including integration

integration table
a table that lists integration formulas

7.5: Strategy for Integration is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
7.5: Other Strategies for Integration by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0. Original source:
https://openstax.org/details/books/calculus-volume-1.

7.6: Integration Using Tables and Computer Algebra Systems
7.6: Integration Using Tables and Computer Algebra Systems is shared under a not declared license and was authored, remixed, and/or curated by
LibreTexts.

7.7: Approximate Integration
 Learning Objectives
Approximate the value of a definite integral by using the midpoint and trapezoidal rules.
Determine the absolute and relative error in using a numerical integration technique.
Estimate the absolute and relative error using an error-bound formula.
Recognize when the midpoint and trapezoidal rules over- or underestimate the true value of an integral.
Use Simpson’s rule to approximate the value of a definite integral to a given accuracy.

The antiderivatives of many functions either cannot be expressed or cannot be expressed easily in closed form (that is, in terms of known functions). Consequently, rather than evaluate
definite integrals of these functions directly, we resort to various techniques of numerical integration to approximate their values. In this section, we explore several of these
techniques. In addition, we examine the process of estimating the error in using these techniques.

The Midpoint Rule


Earlier in this text we defined the definite integral of a function over an interval as the limit of Riemann sums. In general, any Riemann sum of a function f(x) over an interval [a, b] may be viewed as an estimate of \(\int_a^b f(x)\,dx\). Recall that a Riemann sum of a function f(x) over an interval [a, b] is obtained by selecting a partition

\[P = \{x_0, x_1, x_2, \ldots, x_n\}, \quad \text{where } a = x_0 < x_1 < x_2 < \cdots < x_n = b,\]

and a set

\[S = \{x_1^*, x_2^*, \ldots, x_n^*\}, \quad \text{where } x_{i-1} \le x_i^* \le x_i \text{ for all } i.\]

The Riemann sum corresponding to the partition P and the set S is given by \(\sum_{i=1}^n f(x_i^*)\Delta x_i\), where \(\Delta x_i = x_i - x_{i-1}\), the length of the \(i^{\text{th}}\) subinterval.

The midpoint rule for estimating a definite integral uses a Riemann sum with subintervals of equal width and the midpoints, \(m_i\), of each subinterval in place of \(x_i^*\). Formally, we state a theorem regarding the convergence of the midpoint rule as follows.

 The Midpoint Rule


Assume that f(x) is continuous on [a, b]. Let n be a positive integer and \(\Delta x = \dfrac{b - a}{n}\). If [a, b] is divided into n subintervals, each of length Δx, and \(m_i\) is the midpoint of the \(i^{\text{th}}\) subinterval, set

\[M_n = \sum_{i=1}^n f(m_i)\Delta x.\]

Then \(\displaystyle \lim_{n \to \infty} M_n = \int_a^b f(x)\,dx.\)

As we can see in Figure 7.7.1, if f(x) ≥ 0 over [a, b], then \(\sum_{i=1}^n f(m_i)\Delta x\) corresponds to the sum of the areas of rectangles approximating the area between the graph of f(x) and the x-axis over [a, b]. The graph shows the rectangles corresponding to \(M_4\) for a nonnegative function over a closed interval [a, b].

Figure 7.7.1: The midpoint rule approximates the area between the graph of f(x) and the x-axis by summing the areas of rectangles with midpoints that are points on f(x).

 Example 7.7.1: Using the Midpoint Rule with M 4

Use the midpoint rule to estimate \(\displaystyle \int_0^1 x^2\,dx\) using four subintervals. Compare the result with the actual value of this integral.

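Solution: Each subinterval has length \(\Delta x = \dfrac{1 - 0}{4} = \dfrac{1}{4}\). Therefore, the subintervals consist of

\[\left[0, \tfrac{1}{4}\right],\ \left[\tfrac{1}{4}, \tfrac{1}{2}\right],\ \left[\tfrac{1}{2}, \tfrac{3}{4}\right], \text{ and } \left[\tfrac{3}{4}, 1\right].\]

The midpoints of these subintervals are \(\left\{\tfrac{1}{8}, \tfrac{3}{8}, \tfrac{5}{8}, \tfrac{7}{8}\right\}\). Thus,

\[\begin{align*}
M_4 &= \frac{1}{4}\cdot f\left(\frac{1}{8}\right) + \frac{1}{4}\cdot f\left(\frac{3}{8}\right) + \frac{1}{4}\cdot f\left(\frac{5}{8}\right) + \frac{1}{4}\cdot f\left(\frac{7}{8}\right) \\
&= \frac{1}{4}\cdot\frac{1}{64} + \frac{1}{4}\cdot\frac{9}{64} + \frac{1}{4}\cdot\frac{25}{64} + \frac{1}{4}\cdot\frac{49}{64} \\
&= \frac{21}{64} = 0.328125.
\end{align*}\]

Since

\[\int_0^1 x^2\,dx = \frac{1}{3},\]

the absolute error in this approximation is

\[\left|\frac{1}{3} - \frac{21}{64}\right| = \frac{1}{192} \approx 0.0052,\]

and we see that the midpoint rule produces an estimate that is somewhat close to the actual value of the definite integral.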
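The computation in this example translates directly into a few lines of code. The following is a minimal sketch, not a library routine: the function name `midpoint_rule` is chosen here for illustration, and exact fractions are used so that the output can be compared with the hand calculation.

```python
from fractions import Fraction

def midpoint_rule(f, a, b, n):
    """Approximate the integral of f over [a, b] with n midpoint rectangles."""
    dx = (b - a) / n
    # The i-th midpoint is a + (i + 1/2) * dx for i = 0, ..., n - 1.
    return sum(f(a + (i + Fraction(1, 2)) * dx) for i in range(n)) * dx

m4 = midpoint_rule(lambda x: x**2, Fraction(0), Fraction(1), 4)
abs_error = abs(Fraction(1, 3) - m4)
print(m4, float(m4))   # 21/64 0.328125
print(abs_error)       # 1/192
```

Passing floats instead of `Fraction` values also works, at the cost of ordinary rounding error.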

 Example 7.7.2: Using the Midpoint Rule with M 6

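Use \(M_6\) to estimate the length of the curve \(y = \frac{1}{2}x^2\) on [1, 4].

Solution: The length of \(y = \frac{1}{2}x^2\) on [1, 4] is

\[s = \int_1^4 \sqrt{1 + \left(\frac{dy}{dx}\right)^2}\,dx.\]

Since \(\dfrac{dy}{dx} = x\), this integral becomes \(\displaystyle \int_1^4 \sqrt{1 + x^2}\,dx.\)

If [1, 4] is divided into six subintervals, then each subinterval has length \(\Delta x = \dfrac{4 - 1}{6} = \dfrac{1}{2}\) and the midpoints of the subintervals are \(\left\{\tfrac{5}{4}, \tfrac{7}{4}, \tfrac{9}{4}, \tfrac{11}{4}, \tfrac{13}{4}, \tfrac{15}{4}\right\}\). If we set \(f(x) = \sqrt{1 + x^2}\),

\[\begin{align*}
M_6 &= \frac{1}{2}\cdot f\left(\frac{5}{4}\right) + \frac{1}{2}\cdot f\left(\frac{7}{4}\right) + \frac{1}{2}\cdot f\left(\frac{9}{4}\right) + \frac{1}{2}\cdot f\left(\frac{11}{4}\right) + \frac{1}{2}\cdot f\left(\frac{13}{4}\right) + \frac{1}{2}\cdot f\left(\frac{15}{4}\right) \\
&\approx \frac{1}{2}(1.6008 + 2.0156 + 2.4622 + 2.9262 + 3.4004 + 3.8810) = 8.1431 \text{ units}.
\end{align*}\]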

 Exercise 7.7.1
Use the midpoint rule with n = 2 to estimate \(\displaystyle \int_1^2 \frac{1}{x}\,dx.\)

Hint
\(\Delta x = \frac{1}{2}\), \(m_1 = \frac{5}{4}\), and \(m_2 = \frac{7}{4}.\)

Answer
\(\frac{24}{35} \approx 0.685714\)

The Trapezoidal Rule


We can also approximate the value of a definite integral by using trapezoids rather than rectangles. In Figure 7.7.2, the area beneath the curve is approximated by trapezoids rather than
by rectangles.

Figure 7.7.2 : Trapezoids may be used to approximate the area under a curve, hence approximating the definite integral.
The trapezoidal rule for estimating definite integrals uses trapezoids rather than rectangles to approximate the area under a curve. To gain insight into the final form of the rule,
consider the trapezoids shown in Figure 7.7.2. We assume that the length of each subinterval is given by Δx. First, recall that the area of a trapezoid with a height of h and bases of length \(b_1\) and \(b_2\) is given by \(\text{Area} = \frac{1}{2}h(b_1 + b_2)\). We see that the first trapezoid has a height Δx and parallel bases of length \(f(x_0)\) and \(f(x_1)\). Thus, the area of the first trapezoid in Figure 7.7.2 is

\[\frac{1}{2}\Delta x\,(f(x_0) + f(x_1)).\]

The areas of the remaining three trapezoids are

\[\frac{1}{2}\Delta x\,(f(x_1) + f(x_2)),\quad \frac{1}{2}\Delta x\,(f(x_2) + f(x_3)),\quad \text{and} \quad \frac{1}{2}\Delta x\,(f(x_3) + f(x_4)).\]

Consequently,

\[\int_a^b f(x)\,dx \approx \frac{1}{2}\Delta x\,(f(x_0) + f(x_1)) + \frac{1}{2}\Delta x\,(f(x_1) + f(x_2)) + \frac{1}{2}\Delta x\,(f(x_2) + f(x_3)) + \frac{1}{2}\Delta x\,(f(x_3) + f(x_4)).\]

After taking out a common factor of \(\frac{1}{2}\Delta x\) and combining like terms, we have

\[\int_a^b f(x)\,dx \approx \frac{\Delta x}{2}\,[f(x_0) + 2f(x_1) + 2f(x_2) + 2f(x_3) + f(x_4)].\]
Generalizing, we formally state the following rule.

 The Trapezoidal Rule


Assume that f(x) is continuous over [a, b]. Let n be a positive integer and \(\Delta x = \dfrac{b - a}{n}\). Let [a, b] be divided into n subintervals, each of length Δx, with endpoints at \(P = \{x_0, x_1, x_2, \ldots, x_n\}.\)

Set

\[T_n = \frac{\Delta x}{2}\,[f(x_0) + 2f(x_1) + 2f(x_2) + \cdots + 2f(x_{n-1}) + f(x_n)].\]

Then, \(\displaystyle \lim_{n \to +\infty} T_n = \int_a^b f(x)\,dx.\)

Before continuing, let’s make a few observations about the trapezoidal rule. First of all, it is useful to note that

\[T_n = \frac{1}{2}(L_n + R_n), \quad \text{where } L_n = \sum_{i=1}^n f(x_{i-1})\Delta x \text{ and } R_n = \sum_{i=1}^n f(x_i)\Delta x.\]

That is, \(L_n\) and \(R_n\) approximate the integral using the left-hand and right-hand endpoints of each subinterval, respectively. In addition, a careful examination of Figure 7.7.3 leads us

to make the following observations about using the trapezoidal rules and midpoint rules to estimate the definite integral of a nonnegative function. The trapezoidal rule tends to
overestimate the value of a definite integral systematically over intervals where the function is concave up and to underestimate the value of a definite integral systematically over
intervals where the function is concave down. On the other hand, the midpoint rule tends to average out these errors somewhat by partially overestimating and partially underestimating
the value of the definite integral over these same types of intervals. This leads us to hypothesize that, in general, the midpoint rule tends to be more accurate than the trapezoidal rule.

Figure 7.7.3 :The trapezoidal rule tends to be less accurate than the midpoint rule.

 Example 7.7.3: Using the Trapezoidal Rule


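Use the trapezoidal rule to estimate \(\displaystyle \int_0^1 x^2\,dx\) using four subintervals.

Solution
The endpoints of the subintervals consist of elements of the set \(P = \left\{0, \tfrac{1}{4}, \tfrac{1}{2}, \tfrac{3}{4}, 1\right\}\) and \(\Delta x = \dfrac{1 - 0}{4} = \dfrac{1}{4}\). Thus,

\[\begin{align*}
\int_0^1 x^2\,dx &\approx \frac{1}{2}\cdot\frac{1}{4}\left[f(0) + 2f\left(\frac{1}{4}\right) + 2f\left(\frac{1}{2}\right) + 2f\left(\frac{3}{4}\right) + f(1)\right] \\
&= \frac{1}{8}\left(0 + 2\cdot\frac{1}{16} + 2\cdot\frac{1}{4} + 2\cdot\frac{9}{16} + 1\right) \\
&= \frac{11}{32} = 0.34375
\end{align*}\]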
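The trapezoidal rule is just as short to code as the midpoint rule. The sketch below (function names are illustrative choices, not from the text) reproduces \(T_4\) for this example and also confirms that it equals the average of the left-endpoint and right-endpoint sums, as noted before the example.

```python
from fractions import Fraction

def trapezoidal_rule(f, a, b, n):
    """Approximate the integral of f over [a, b] with n trapezoids."""
    dx = (b - a) / n
    interior = sum(f(a + i * dx) for i in range(1, n))   # f(x_1), ..., f(x_{n-1})
    return dx / 2 * (f(a) + 2 * interior + f(b))

def left_sum(f, a, b, n):
    dx = (b - a) / n
    return sum(f(a + i * dx) for i in range(n)) * dx          # left endpoints

def right_sum(f, a, b, n):
    dx = (b - a) / n
    return sum(f(a + i * dx) for i in range(1, n + 1)) * dx   # right endpoints

square = lambda x: x**2
a, b = Fraction(0), Fraction(1)
t4 = trapezoidal_rule(square, a, b, 4)
average = (left_sum(square, a, b, 4) + right_sum(square, a, b, 4)) / 2
print(t4, float(t4))   # 11/32 0.34375
print(average == t4)   # True
```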

 Exercise 7.7.2
Use the trapezoidal rule with n = 2 to estimate \(\displaystyle \int_1^2 \frac{1}{x}\,dx.\)

Hint
Set \(\Delta x = \frac{1}{2}\). The endpoints of the subintervals are the elements of the set \(P = \left\{1, \tfrac{3}{2}, 2\right\}.\)

Answer
\(\frac{17}{24} \approx 0.708333\)

Absolute and Relative Error


An important aspect of using these numerical approximation rules consists of calculating the error in using them for estimating the value of a definite integral. We first need to define
absolute error and relative error.

 Definition: absolute and relative error

If B is our estimate of some quantity having an actual value of A , then the absolute error is given by |A − B| .
The relative error is the error as a percentage of the actual value and is given by

\[\left|\frac{A - B}{A}\right| \cdot 100\%.\]

 Example 7.7.4: Calculating Error in the Midpoint Rule


Calculate the absolute and relative error in the estimate of \(\displaystyle \int_0^1 x^2\,dx\) using the midpoint rule, found in Example 7.7.1.

Solution: The exact value is \(\displaystyle \int_0^1 x^2\,dx = \frac{1}{3}\) and our estimate from the example is \(M_4 = \frac{21}{64}\). Thus, the absolute error is given by \(\left|\frac{1}{3} - \frac{21}{64}\right| = \frac{1}{192} \approx 0.0052.\)

The relative error is

\[\frac{1/192}{1/3} = \frac{1}{64} = 0.015625 \approx 1.6\%.\]

 Example 7.7.5: Calculating Error in the Trapezoidal Rule


Calculate the absolute and relative error in the estimate of \(\displaystyle \int_0^1 x^2\,dx\) using the trapezoidal rule, found in Example 7.7.3.

Solution: The exact value is \(\displaystyle \int_0^1 x^2\,dx = \frac{1}{3}\) and our estimate from the example is \(T_4 = \frac{11}{32}\). Thus, the absolute error is given by \(\left|\frac{1}{3} - \frac{11}{32}\right| = \frac{1}{96} \approx 0.0104.\)

The relative error is given by

\[\frac{1/96}{1/3} = 0.03125 \approx 3.1\%.\]
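The two error measures from the definition are simple enough to wrap in helper functions. The sketch below (the function names are illustrative) applies them to the estimates \(M_4 = \frac{21}{64}\) and \(T_4 = \frac{11}{32}\) from the preceding examples.

```python
from fractions import Fraction

def absolute_error(actual, estimate):
    return abs(actual - estimate)

def relative_error_percent(actual, estimate):
    # Error as a percentage of the actual value
    return abs((actual - estimate) / actual) * 100

actual = Fraction(1, 3)
results = {}
for name, estimate in [("M4", Fraction(21, 64)), ("T4", Fraction(11, 32))]:
    results[name] = (absolute_error(actual, estimate),
                     float(relative_error_percent(actual, estimate)))
    print(name, results[name])
# M4 (Fraction(1, 192), 1.5625)
# T4 (Fraction(1, 96), 3.125)
```

For this integrand the trapezoidal error is exactly twice the midpoint error, which matches the factor of 2 between the two error bounds stated in the next subsection.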

 Exercise 7.7.3
In an earlier checkpoint, we estimated \(\displaystyle \int_1^2 \frac{1}{x}\,dx\) to be \(\frac{24}{35}\) using \(M_2\). The actual value of this integral is ln 2. Using \(\frac{24}{35} \approx 0.6857\) and ln 2 ≈ 0.6931, calculate the absolute error and the relative error.

Hint
Use the previous examples as a guide.

Answer
absolute error ≈ 0.0074, and relative error ≈ 1.1%

Error Bounds on the Midpoint and Trapezoidal Rules


In the two previous examples, we were able to compare our estimate of an integral with the actual value of the integral; however, we do not typically have this luxury. In general, if we
are approximating an integral, we are doing so because we cannot compute the exact value of the integral itself easily. Therefore, it is often helpful to be able to determine an upper
bound for the error in an approximation of an integral. The following theorem provides error bounds for the midpoint and trapezoidal rules. The theorem is stated without proof.

 Error Bounds for the Midpoint and Trapezoidal Rules

Let f(x) be a continuous function over [a, b], having a second derivative \(f''(x)\) over this interval. If M is the maximum value of \(|f''(x)|\) over [a, b], then the upper bounds for the error in using \(M_n\) and \(T_n\) to estimate \(\int_a^b f(x)\,dx\) are

\[\text{Error in } M_n \le \frac{M(b - a)^3}{24n^2} \tag{7.7.1}\]

and

\[\text{Error in } T_n \le \frac{M(b - a)^3}{12n^2}.\]

We can use these bounds to determine the value of n necessary to guarantee that the error in an estimate is less than a specified value.

 Example 7.7.6: Determining the Number of Intervals to Use


What value of n should be used to guarantee that an estimate of \(\displaystyle \int_0^1 e^{x^2}\,dx\) is accurate to within 0.01 if we use the midpoint rule?

Solution
We begin by determining the value of M, the maximum value of \(|f''(x)|\) over [0, 1] for \(f(x) = e^{x^2}\). Since \(f'(x) = 2xe^{x^2}\), we have

\[f''(x) = 2e^{x^2} + 4x^2e^{x^2}.\]

Thus,

\[|f''(x)| = 2e^{x^2}(1 + 2x^2) \le 2 \cdot e \cdot 3 = 6e.\]

From the error-bound Equation 7.7.1, we have

\[\text{Error in } M_n \le \frac{M(b - a)^3}{24n^2} \le \frac{6e(1 - 0)^3}{24n^2} = \frac{6e}{24n^2}.\]

Now we solve the following inequality for n:

\[\frac{6e}{24n^2} \le 0.01.\]

Thus, \(n \ge \sqrt{\dfrac{600e}{24}} \approx 8.24\). Since n must be an integer satisfying this inequality, a choice of n = 9 would guarantee that

\[\left|\int_0^1 e^{x^2}\,dx - M_n\right| < 0.01.\]

Analysis
We might have been tempted to round 8.24 down and choose n = 8 , but this would be incorrect because we must have an integer greater than or equal to 8.24. We need to keep in
mind that the error estimates provide an upper bound only for the error. The actual estimate may, in fact, be a much better approximation than is indicated by the error bound.
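The search for n in this example can be automated. The sketch below solves the midpoint error bound for the smallest admissible integer; the function name is an illustrative choice, and the bound M = 6e is the one derived in the solution above.

```python
import math

def min_subintervals_midpoint(M, a, b, tolerance):
    """Smallest integer n with M (b - a)^3 / (24 n^2) <= tolerance."""
    n = math.ceil(math.sqrt(M * (b - a)**3 / (24 * tolerance)))
    # Guard against floating-point rounding in the square root.
    while M * (b - a)**3 / (24 * n**2) > tolerance:
        n += 1
    return n

n = min_subintervals_midpoint(6 * math.e, 0, 1, 0.01)
bound_at_n = 6 * math.e / (24 * n**2)
print(n)            # 9
print(bound_at_n)   # the guaranteed bound on the error, below 0.01
```

Rounding up with `math.ceil` mirrors the point made in the analysis: rounding 8.24 down to 8 would not satisfy the inequality.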

 Exercise 7.7.4
Use Equation 7.7.1 to find an upper bound for the error in using \(M_4\) to estimate \(\displaystyle \int_0^1 x^2\,dx.\)

Hint
\(f''(x) = 2\), so M = 2.

Answer
\(\frac{1}{192}\)

Simpson’s Rule
With the midpoint rule, we estimated areas of regions under curves by using rectangles. In a sense, we approximated the curve with piecewise constant functions. With the trapezoidal rule, we approximated the curve by using piecewise linear functions. What if we were, instead, to approximate a curve using piecewise quadratic functions? With Simpson’s rule, we do just this. We partition the interval into an even number of subintervals, each of equal width. Over the first pair of subintervals we approximate \(\int_{x_0}^{x_2} f(x)\,dx\) with \(\int_{x_0}^{x_2} p(x)\,dx\), where \(p(x) = Ax^2 + Bx + C\) is the quadratic function passing through \((x_0, f(x_0))\), \((x_1, f(x_1))\), and \((x_2, f(x_2))\) (Figure 7.7.4). Over the next pair of subintervals we approximate \(\int_{x_2}^{x_4} f(x)\,dx\) with the integral of another quadratic function passing through \((x_2, f(x_2))\), \((x_3, f(x_3))\), and \((x_4, f(x_4))\). This process is continued with each successive pair of subintervals.

Figure 7.7.4 : With Simpson’s rule, we approximate a definite integral by integrating a piecewise quadratic function.
To understand the formula that we obtain for Simpson’s rule, we begin by deriving a formula for this approximation over the first two subintervals. As we go through the derivation, we need to keep in mind the following relationships:

\[\begin{align*}
f(x_0) &= p(x_0) = Ax_0^2 + Bx_0 + C \\
f(x_1) &= p(x_1) = Ax_1^2 + Bx_1 + C \\
f(x_2) &= p(x_2) = Ax_2^2 + Bx_2 + C
\end{align*}\]

\(x_2 - x_0 = 2\Delta x\), where Δx is the length of a subinterval.

\(x_2 + x_0 = 2x_1\), since \(x_1 = \dfrac{x_2 + x_0}{2}\).

Thus,

\[\begin{align*}
\int_{x_0}^{x_2} f(x)\,dx &\approx \int_{x_0}^{x_2} p(x)\,dx \\
&= \int_{x_0}^{x_2} (Ax^2 + Bx + C)\,dx \\
&= \left.\left(\frac{A}{3}x^3 + \frac{B}{2}x^2 + Cx\right)\right|_{x_0}^{x_2} & & \text{Find the antiderivative.} \\
&= \frac{A}{3}(x_2^3 - x_0^3) + \frac{B}{2}(x_2^2 - x_0^2) + C(x_2 - x_0) & & \text{Evaluate the antiderivative.} \\
&= \frac{A}{3}(x_2 - x_0)(x_2^2 + x_2x_0 + x_0^2) + \frac{B}{2}(x_2 - x_0)(x_2 + x_0) + C(x_2 - x_0) \\
&= \frac{x_2 - x_0}{6}\left(2A(x_2^2 + x_2x_0 + x_0^2) + 3B(x_2 + x_0) + 6C\right) & & \text{Factor out } \tfrac{x_2 - x_0}{6}. \\
&= \frac{\Delta x}{3}\left((Ax_2^2 + Bx_2 + C) + (Ax_0^2 + Bx_0 + C) + A(x_2^2 + 2x_2x_0 + x_0^2) + 2B(x_2 + x_0) + 4C\right) & & \text{Rearrange the terms. Note: } \Delta x = \tfrac{x_2 - x_0}{2}. \\
&= \frac{\Delta x}{3}\left(f(x_2) + f(x_0) + A(x_2 + x_0)^2 + 2B(x_2 + x_0) + 4C\right) & & \text{Factor and substitute } f(x_2) = Ax_2^2 + Bx_2 + C \text{ and } f(x_0) = Ax_0^2 + Bx_0 + C. \\
&= \frac{\Delta x}{3}\left(f(x_2) + f(x_0) + A(2x_1)^2 + 2B(2x_1) + 4C\right) & & \text{Substitute } x_2 + x_0 = 2x_1. \text{ Note: } x_1 = \tfrac{x_2 + x_0}{2} \text{ is the midpoint.} \\
&= \frac{\Delta x}{3}\left(f(x_2) + 4f(x_1) + f(x_0)\right). & & \text{Expand and substitute } f(x_1) = Ax_1^2 + Bx_1 + C.
\end{align*}\]

If we approximate \(\int_{x_2}^{x_4} f(x)\,dx\) using the same method, we see that we have

\[\int_{x_2}^{x_4} f(x)\,dx \approx \frac{\Delta x}{3}\left(f(x_4) + 4f(x_3) + f(x_2)\right).\]

Combining these two approximations, we get

\[\int_{x_0}^{x_4} f(x)\,dx \approx \frac{\Delta x}{3}\left(f(x_0) + 4f(x_1) + 2f(x_2) + 4f(x_3) + f(x_4)\right).\]

The pattern continues as we add pairs of subintervals to our approximation. The general rule may be stated as follows.

 Simpson’s Rule
Assume that f(x) is continuous over [a, b]. Let n be a positive even integer and \(\Delta x = \dfrac{b - a}{n}\). Let [a, b] be divided into n subintervals, each of length Δx, with endpoints at \(P = \{x_0, x_1, x_2, \ldots, x_n\}.\) Set

\[S_n = \frac{\Delta x}{3}\,[f(x_0) + 4f(x_1) + 2f(x_2) + 4f(x_3) + 2f(x_4) + \cdots + 2f(x_{n-2}) + 4f(x_{n-1}) + f(x_n)].\]

Then,

\[\lim_{n \to +\infty} S_n = \int_a^b f(x)\,dx.\]

Just as the trapezoidal rule is the average of the left-hand and right-hand rules for estimating definite integrals, Simpson’s rule may be obtained from the midpoint and trapezoidal rules by using a weighted average. It can be shown that \(S_{2n} = \left(\frac{2}{3}\right)M_n + \left(\frac{1}{3}\right)T_n.\)
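The weighted-average relationship can be checked on a concrete integrand. The sketch below implements the three rules with exact fractions (the function names and the choice of \(f(x) = 1/x\) on [1, 2] are illustrative assumptions) and tests the identity for n = 1 through 5. A check on one function is an illustration, not a proof.

```python
from fractions import Fraction

def midpoint(f, a, b, n):
    dx = (b - a) / n
    return dx * sum(f(a + (i + Fraction(1, 2)) * dx) for i in range(n))

def trapezoid(f, a, b, n):
    dx = (b - a) / n
    return dx / 2 * (f(a) + 2 * sum(f(a + i * dx) for i in range(1, n)) + f(b))

def simpson(f, a, b, n):
    if n % 2:
        raise ValueError("Simpson's rule needs an even number of subintervals")
    dx = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # Odd-indexed points get weight 4, even-indexed interior points weight 2.
        total += (4 if i % 2 else 2) * f(a + i * dx)
    return dx / 3 * total

f = lambda x: 1 / x
a, b = Fraction(1), Fraction(2)
identity_holds = all(
    simpson(f, a, b, 2 * n)
    == Fraction(2, 3) * midpoint(f, a, b, n) + Fraction(1, 3) * trapezoid(f, a, b, n)
    for n in range(1, 6)
)
print(identity_holds)        # True
print(simpson(f, a, b, 2))   # 25/36
```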

It is also possible to put a bound on the error when using Simpson’s rule to approximate a definite integral. The bound in the error is given by the following rule:

 Rule: Error Bound for Simpson’s Rule

Let f(x) be a continuous function over [a, b] having a fourth derivative, \(f^{(4)}(x)\), over this interval. If M is the maximum value of \(\left|f^{(4)}(x)\right|\) over [a, b], then the upper bound for the error in using \(S_n\) to estimate \(\int_a^b f(x)\,dx\) is given by

\[\text{Error in } S_n \le \frac{M(b - a)^5}{180n^4}.\]

 Example 7.7.7: Applying Simpson’s Rule 1


Use \(S_2\) to approximate \(\displaystyle \int_0^1 x^3\,dx\). Estimate a bound for the error in \(S_2\).

Solution
Since [0, 1] is divided into two intervals, each subinterval has length \(\Delta x = \dfrac{1 - 0}{2} = \dfrac{1}{2}\). The endpoints of these subintervals are \(\left\{0, \tfrac{1}{2}, 1\right\}\). If we set \(f(x) = x^3\), then

\[S_2 = \frac{1}{3}\cdot\frac{1}{2}\left(f(0) + 4f\left(\frac{1}{2}\right) + f(1)\right) = \frac{1}{6}\left(0 + 4\cdot\frac{1}{8} + 1\right) = \frac{1}{4}.\]

Since \(f^{(4)}(x) = 0\) and consequently M = 0, we see that

\[\text{Error in } S_2 \le \frac{0(1)^5}{180 \cdot 2^4} = 0.\]

This bound indicates that the value obtained through Simpson’s rule is exact. A quick check will verify that, in fact, \(\displaystyle \int_0^1 x^3\,dx = \frac{1}{4}.\)

 Example 7.7.8: Applying Simpson’s Rule 2

Use \(S_6\) to estimate the length of the curve \(y = \frac{1}{2}x^2\) over [1, 4].

Solution
The length of \(y = \frac{1}{2}x^2\) over [1, 4] is \(\displaystyle \int_1^4 \sqrt{1 + x^2}\,dx\). If we divide [1, 4] into six subintervals, then each subinterval has length \(\Delta x = \dfrac{4 - 1}{6} = \dfrac{1}{2}\), and the endpoints of the subintervals are \(\left\{1, \tfrac{3}{2}, 2, \tfrac{5}{2}, 3, \tfrac{7}{2}, 4\right\}\). Setting \(f(x) = \sqrt{1 + x^2}\),

\[S_6 = \frac{1}{3}\cdot\frac{1}{2}\left(f(1) + 4f\left(\frac{3}{2}\right) + 2f(2) + 4f\left(\frac{5}{2}\right) + 2f(3) + 4f\left(\frac{7}{2}\right) + f(4)\right).\]

After substituting, we have

\[S_6 = \frac{1}{6}(1.4142 + 4 \cdot 1.80278 + 2 \cdot 2.23607 + 4 \cdot 2.69258 + 2 \cdot 3.16228 + 4 \cdot 3.64005 + 4.12311) \approx 8.14594 \text{ units}.\]
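Simpson’s rule can be coded in the same style as the other rules. The sketch below (the function name is an illustrative choice) reproduces \(S_6\) for this arc length. For comparison it also evaluates the closed-form antiderivative \(\frac{1}{2}\left(x\sqrt{1 + x^2} + \sinh^{-1}x\right)\), which is a standard result brought in here as an outside fact rather than something derived in this section.

```python
import math

def simpsons_rule(f, a, b, n):
    """Approximate the integral of f over [a, b] with n (even) subintervals."""
    if n % 2:
        raise ValueError("n must be even")
    dx = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * dx)   # weights 4, 2, 4, ..., 4
    return dx / 3 * total

def arc_integrand(x):
    return math.sqrt(1 + x**2)

def antiderivative(x):
    # Standard antiderivative of sqrt(1 + x^2); used only as a reference value.
    return 0.5 * (x * math.sqrt(1 + x**2) + math.asinh(x))

s6 = simpsons_rule(arc_integrand, 1, 4, 6)
reference = antiderivative(4) - antiderivative(1)
s2_cubic = simpsons_rule(lambda x: x**3, 0, 1, 2)
print(s6, reference)   # the two values should agree to about three decimal places
print(s2_cubic)        # approximately 0.25, matching Example 7.7.7
```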

 Exercise 7.7.5
Use \(S_2\) to estimate \(\displaystyle \int_1^2 \frac{1}{x}\,dx.\)

Hint
\(S_2 = \frac{1}{3}\Delta x\,(f(x_0) + 4f(x_1) + f(x_2))\)

Answer
\(\frac{25}{36} \approx 0.694444\)

Key Concepts
We can use numerical integration to estimate the values of definite integrals when a closed form of the integral is difficult to find or when an approximate value only of the definite
integral is needed.
The most commonly used techniques for numerical integration are the midpoint rule, trapezoidal rule, and Simpson’s rule.
The midpoint rule approximates the definite integral using rectangular regions whereas the trapezoidal rule approximates the definite integral using trapezoidal approximations.
Simpson’s rule approximates the definite integral by first approximating the original function using piecewise quadratic functions.

Key Equations
Midpoint rule
\[M_n = \sum_{i=1}^n f(m_i)\Delta x\]

Trapezoidal rule
\[T_n = \frac{\Delta x}{2}\,[f(x_0) + 2f(x_1) + 2f(x_2) + \cdots + 2f(x_{n-1}) + f(x_n)]\]

Simpson’s rule
\[S_n = \frac{\Delta x}{3}\,[f(x_0) + 4f(x_1) + 2f(x_2) + 4f(x_3) + 2f(x_4) + 4f(x_5) + \cdots + 2f(x_{n-2}) + 4f(x_{n-1}) + f(x_n)]\]

Error bound for midpoint rule
\(\text{Error in } M_n \le \dfrac{M(b - a)^3}{24n^2}\), where M is the maximum value of \(|f''(x)|\) over [a, b].

Error bound for trapezoidal rule
\(\text{Error in } T_n \le \dfrac{M(b - a)^3}{12n^2}\), where M is the maximum value of \(|f''(x)|\) over [a, b].

Error bound for Simpson’s rule
\(\text{Error in } S_n \le \dfrac{M(b - a)^5}{180n^4}\), where M is the maximum value of \(\left|f^{(4)}(x)\right|\) over [a, b].

Glossary
absolute error
if B is an estimate of some quantity having an actual value of A, then the absolute error is given by |A − B|

midpoint rule
a rule that uses a Riemann sum of the form \(M_n = \sum_{i=1}^n f(m_i)\Delta x\), where \(m_i\) is the midpoint of the \(i^{\text{th}}\) subinterval, to approximate \(\int_a^b f(x)\,dx\)

numerical integration
the variety of numerical methods used to estimate the value of a definite integral, including the midpoint rule, trapezoidal rule, and Simpson’s rule

relative error
error as a percentage of the actual value, given by
\[\text{relative error} = \left|\frac{A - B}{A}\right| \cdot 100\%\]

Simpson’s rule
a rule that approximates \(\int_a^b f(x)\,dx\) using the area under a piecewise quadratic function. The approximation \(S_n\) to \(\int_a^b f(x)\,dx\) is given by
\[S_n = \frac{\Delta x}{3}\left(f(x_0) + 4f(x_1) + 2f(x_2) + 4f(x_3) + 2f(x_4) + \cdots + 2f(x_{n-2}) + 4f(x_{n-1}) + f(x_n)\right).\]

trapezoidal rule
a rule that approximates \(\int_a^b f(x)\,dx\) using the area of trapezoids. The approximation \(T_n\) to \(\int_a^b f(x)\,dx\) is given by
\[T_n = \frac{\Delta x}{2}\left(f(x_0) + 2f(x_1) + 2f(x_2) + \cdots + 2f(x_{n-1}) + f(x_n)\right).\]

Contributors and Attributions


Gilbert Strang (MIT) and Edwin “Jed” Herman (Harvey Mudd) with many contributing authors. This content by OpenStax is licensed with a CC-BY-SA-NC 4.0 license. Download
for free at http://cnx.org.
Edited by Paul Seeburger (Monroe Community College). Notes added to development of area under a parabola and typos fixed in original text.

7.7: Approximate Integration is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
7.6: Numerical Integration by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0. Original source: https://openstax.org/details/books/calculus-volume-1.

7.8: Improper Integrals
 Learning Objectives
Evaluate an integral over an infinite interval.
Evaluate an integral over a closed interval with an infinite discontinuity within the interval.
Use the comparison theorem to determine whether a definite integral is convergent.

Is the area between the graph of \(f(x) = \frac{1}{x}\) and the x-axis over the interval \([1, +\infty)\) finite or infinite? If this same region is revolved about the x-axis, is the volume finite or infinite? Surprisingly, the area of the region described is infinite, but the volume of the solid obtained by revolving this region about the x-axis is finite.
In this section, we define integrals over an infinite interval as well as integrals of functions containing a discontinuity on the interval. Integrals of
these types are called improper integrals. We examine several techniques for evaluating improper integrals, all of which involve taking limits.

Integrating over an Infinite Interval


How should we go about defining an integral of the type \(\int_a^{+\infty} f(x)\,dx\)? We can integrate \(\int_a^t f(x)\,dx\) for any value of \(t\), so it is reasonable to look at the behavior of this integral as we substitute larger values of \(t\). Figure 7.8.1 shows that \(\int_a^t f(x)\,dx\) may be interpreted as area for various values of \(t\). In other words, we may define an improper integral as a limit, taken as one of the limits of integration increases or decreases without bound.

Figure 7.8.1 : To integrate a function over an infinite interval, we consider the limit of the integral as the upper limit increases without bound.

 Definition: Improper Integral


1. Let \(f(x)\) be continuous over an interval of the form \([a, +\infty)\). Then
\[\int_a^{+\infty} f(x)\,dx = \lim_{t\to+\infty}\int_a^t f(x)\,dx, \tag{7.8.1}\]
provided this limit exists.

2. Let \(f(x)\) be continuous over an interval of the form \((-\infty, b]\). Then
\[\int_{-\infty}^b f(x)\,dx = \lim_{t\to-\infty}\int_t^b f(x)\,dx, \tag{7.8.2}\]
provided this limit exists.

In each case, if the limit exists, then the improper integral is said to converge. If the limit does not exist, then the improper integral is said to diverge.

3. Let \(f(x)\) be continuous over \((-\infty, +\infty)\). Then
\[\int_{-\infty}^{+\infty} f(x)\,dx = \int_{-\infty}^0 f(x)\,dx + \int_0^{+\infty} f(x)\,dx, \tag{7.8.3}\]
provided that \(\int_{-\infty}^0 f(x)\,dx\) and \(\int_0^{+\infty} f(x)\,dx\) both converge. If either of these two integrals diverges, then \(\int_{-\infty}^{+\infty} f(x)\,dx\) diverges.

(It can be shown that, in fact, \(\int_{-\infty}^{+\infty} f(x)\,dx = \int_{-\infty}^a f(x)\,dx + \int_a^{+\infty} f(x)\,dx\) for any value of \(a\).)

In our first example, we return to the question we posed at the start of this section: Is the area between the graph of \(f(x) = \frac{1}{x}\) and the x-axis over the interval \([1, +\infty)\) finite or infinite?

 Example 7.8.1: Finding an Area
Determine whether the area between the graph of \(f(x) = \frac{1}{x}\) and the x-axis over the interval \([1, +\infty)\) is finite or infinite.

Solution
We first do a quick sketch of the region in question, as shown in Figure 7.8.2.

Figure 7.8.2 : We can find the area between the curve f (x) = 1/x and the x -axis on an infinite interval.
We can see that the area of this region is given by

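\[A = \int_1^{+\infty} \frac{1}{x}\,dx,\]
which can be evaluated using Equation 7.8.1:
\[\begin{aligned}
A &= \int_1^{+\infty} \frac{1}{x}\,dx \\
&= \lim_{t\to+\infty} \int_1^t \frac{1}{x}\,dx && \text{Rewrite the improper integral as a limit.} \\
&= \lim_{t\to+\infty} \ln|x|\,\Big|_1^t && \text{Find the antiderivative.} \\
&= \lim_{t\to+\infty} (\ln|t| - \ln 1) && \text{Evaluate the antiderivative.} \\
&= +\infty. && \text{Evaluate the limit.}
\end{aligned}\]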

Since the improper integral diverges to +∞, the area of the region is infinite.

 Example 7.8.2: Finding a Volume


Find the volume of the solid obtained by revolving the region bounded by the graph of \(f(x) = \frac{1}{x}\) and the x-axis over the interval \([1, +\infty)\) about the x-axis.
Solution
The solid is shown in Figure 7.8.3. Using the disk method, we see that the volume V is
\[V = \pi \int_1^{+\infty} \frac{1}{x^2}\,dx.\]

Figure 7.8.3: The solid of revolution can be generated by rotating an infinite area about the x-axis.
Then we have
\[\begin{aligned}
V &= \pi \int_1^{+\infty} \frac{1}{x^2}\,dx \\
&= \pi \lim_{t\to+\infty} \int_1^t \frac{1}{x^2}\,dx && \text{Rewrite as a limit.} \\
&= \pi \lim_{t\to+\infty} \left(-\frac{1}{x}\right)\Big|_1^t && \text{Find the antiderivative.} \\
&= \pi \lim_{t\to+\infty} \left(-\frac{1}{t} + 1\right) && \text{Evaluate the antiderivative.} \\
&= \pi. && \text{Evaluate the limit, since } \tfrac{1}{t} \to 0.
\end{aligned}\]

The improper integral converges to π. Therefore, the volume of the solid of revolution is π.
In conclusion, although the area of the region between the x-axis and the graph of f (x) = 1/x over the interval [1, +∞) is infinite, the
volume of the solid generated by revolving this region about the x-axis is finite. The solid generated is known as Gabriel’s Horn.
Note: Gabriel's horn (also called Torricelli's trumpet) is a geometric figure which has infinite surface area, but finite volume. The name refers
to the tradition identifying the Archangel Gabriel as the angel who blows the horn to announce Judgment Day, associating the divine, or
infinite, with the finite. The properties of this figure were first studied by Italian physicist and mathematician Evangelista Torricelli in the 17th
century.
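The contrast between the two examples is easy to see numerically. The sketch below is an illustration added here, not part of the original text; the function names `area_up_to` and `volume_up_to` are hypothetical. It simply evaluates the two antiderivatives found above at increasingly large values of \(t\).

```python
import math

def area_up_to(t):
    """Area under 1/x over [1, t], from the antiderivative ln|x|."""
    return math.log(t) - math.log(1)

def volume_up_to(t):
    """Volume of the solid over [1, t], from pi * (-1/x) evaluated at 1 and t."""
    return math.pi * (-1 / t + 1)

for t in (10, 1e3, 1e6, 1e9):
    print(f"t = {t:>13,.0f}   area = {area_up_to(t):8.3f}   volume = {volume_up_to(t):.9f}")
print(f"pi = {math.pi:.9f}")
```

The area column keeps growing (slowly, like \(\ln t\)), while the volume column settles toward \(\pi\).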

 Example 7.8.3: Traffic Accidents in a City

Suppose that at a busy intersection, traffic accidents occur at an average rate of one every three months. After residents complained, changes
were made to the traffic lights at the intersection. It has now been eight months since the changes were made and there have been no
accidents. Were the changes effective or is the 8-month interval without an accident a result of chance?

Figure 7.8.4 : Modification of work by David McKelvey, Flickr.


Probability theory tells us that if the average time between events is \(k\), the probability that \(X\), the time between events, is between \(a\) and \(b\) is given by
\[P(a \le X \le b) = \int_a^b f(x)\,dx,\]
where
\[f(x) = \begin{cases} 0, & \text{if } x < 0 \\ \frac{1}{k}e^{-x/k}, & \text{if } x \ge 0. \end{cases}\]
Thus, if accidents are occurring at a rate of one every 3 months, then \(k = 3\) (with time measured in months), and the probability that \(X\), the time between accidents, is between \(a\) and \(b\) is given by
\[P(a \le X \le b) = \int_a^b f(x)\,dx,\]
where
\[f(x) = \begin{cases} 0, & \text{if } x < 0 \\ \frac{1}{3}e^{-x/3}, & \text{if } x \ge 0. \end{cases}\]
To answer the question, we must compute \(P(X \ge 8) = \int_8^{+\infty} \frac{1}{3}e^{-x/3}\,dx\) and decide whether it is likely that 8 months could have passed without an accident if there had been no improvement in the traffic situation.

Solution
We need to calculate the probability as an improper integral:
\[\begin{aligned}
P(X \ge 8) &= \int_8^{+\infty} \tfrac{1}{3}e^{-x/3}\,dx \\
&= \lim_{t\to+\infty} \int_8^t \tfrac{1}{3}e^{-x/3}\,dx \\
&= \lim_{t\to+\infty} \left(-e^{-x/3}\right)\Big|_8^t \\
&= \lim_{t\to+\infty} \left(-e^{-t/3} + e^{-8/3}\right) \\
&= e^{-8/3} \approx 0.0695.
\end{aligned}\]
The value \(0.0695\) represents the probability of no accidents in 8 months under the initial conditions. Since this value is fairly small (about 7%), the accident-free period is some evidence that the changes were effective, although a gap of this length arising by chance cannot be ruled out.

 Example 7.8.4: Evaluating an Improper Integral over an Infinite Interval


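Evaluate \(\int_{-\infty}^0 \frac{1}{x^2+4}\,dx\). State whether the improper integral converges or diverges.

Solution
Begin by rewriting \(\int_{-\infty}^0 \frac{1}{x^2+4}\,dx\) as a limit using Equation 7.8.2 from the definition. Thus,
\[\begin{aligned}
\int_{-\infty}^0 \frac{1}{x^2+4}\,dx &= \lim_{t\to-\infty} \int_t^0 \frac{1}{x^2+4}\,dx && \text{Rewrite as a limit.} \\
&= \lim_{t\to-\infty} \frac{1}{2}\tan^{-1}\frac{x}{2}\,\Big|_t^0 && \text{Find the antiderivative.} \\
&= \lim_{t\to-\infty} \left(\frac{1}{2}\tan^{-1} 0 - \frac{1}{2}\tan^{-1}\frac{t}{2}\right) && \text{Evaluate the antiderivative.} \\
&= \frac{\pi}{4}. && \text{Evaluate the limit and simplify.}
\end{aligned}\]
The improper integral converges to \(\frac{\pi}{4}\).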
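As a quick sanity check on this limit, the sketch below (an added illustration; the name `partial_integral` is hypothetical) evaluates the antiderivative \(\frac{1}{2}\tan^{-1}\frac{x}{2}\) between \(t\) and \(0\) for increasingly negative \(t\) and compares the result with \(\pi/4\).

```python
import math

def partial_integral(t):
    """Integral of 1/(x^2 + 4) over [t, 0], using (1/2) * arctan(x / 2)."""
    return 0.5 * math.atan(0) - 0.5 * math.atan(t / 2)

for t in (-1, -10, -100, -10_000):
    print(f"t = {t:>8}   integral over [t, 0] = {partial_integral(t):.6f}")
print(f"pi / 4 = {math.pi / 4:.6f}")
```

The values increase toward \(\pi/4\) as \(t \to -\infty\), in agreement with the computation above.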

 Example 7.8.5: Evaluating an Improper Integral on (−∞, +∞)


Evaluate \(\int_{-\infty}^{+\infty} xe^x\,dx\). State whether the improper integral converges or diverges.

Solution
Start by splitting up the integral:
\[\int_{-\infty}^{+\infty} xe^x\,dx = \int_{-\infty}^0 xe^x\,dx + \int_0^{+\infty} xe^x\,dx.\]
If either \(\int_{-\infty}^0 xe^x\,dx\) or \(\int_0^{+\infty} xe^x\,dx\) diverges, then \(\int_{-\infty}^{+\infty} xe^x\,dx\) diverges. Compute each integral separately.

For the first integral,
\[\begin{aligned}
\int_{-\infty}^0 xe^x\,dx &= \lim_{t\to-\infty} \int_t^0 xe^x\,dx && \text{Rewrite as a limit.} \\
&= \lim_{t\to-\infty} (xe^x - e^x)\Big|_t^0 && \text{Use integration by parts with } u = x \text{ and } dv = e^x\,dx. \\
&= \lim_{t\to-\infty} (-1 - te^t + e^t) && \text{Evaluate the antiderivative.} \\
&= -1. && \text{Evaluate the limit.}
\end{aligned}\]
Note: \(\lim_{t\to-\infty} te^t\) is indeterminate of the form \(0 \cdot \infty\). Thus,
\[\lim_{t\to-\infty} te^t = \lim_{t\to-\infty} \frac{t}{e^{-t}} = \lim_{t\to-\infty} \frac{1}{-e^{-t}} = \lim_{t\to-\infty} \left(-e^t\right) = 0\]
by L’Hôpital’s Rule.

The first improper integral converges. For the second integral,
\[\begin{aligned}
\int_0^{+\infty} xe^x\,dx &= \lim_{t\to+\infty} \int_0^t xe^x\,dx && \text{Rewrite as a limit.} \\
&= \lim_{t\to+\infty} (xe^x - e^x)\Big|_0^t && \text{Find the antiderivative.} \\
&= \lim_{t\to+\infty} (te^t - e^t + 1) && \text{Evaluate the antiderivative.} \\
&= \lim_{t\to+\infty} \big((t-1)e^t + 1\big) && \text{Rewrite. (} te^t - e^t \text{ is indeterminate.)} \\
&= +\infty. && \text{Evaluate the limit.}
\end{aligned}\]
Thus, \(\int_0^{+\infty} xe^x\,dx\) diverges. Since this integral diverges, \(\int_{-\infty}^{+\infty} xe^x\,dx\) diverges as well.

 Exercise 7.8.1
Evaluate \(\int_{-3}^{+\infty} e^{-x}\,dx\). State whether the improper integral converges or diverges.

Hint
\[\int_{-3}^{+\infty} e^{-x}\,dx = \lim_{t\to+\infty} \int_{-3}^t e^{-x}\,dx\]

Answer
It converges to \(e^3\).

Integrating a Discontinuous Integrand


Now let’s examine integrals of functions containing an infinite discontinuity in the interval over which the integration occurs. Consider an integral of the form \(\int_a^b f(x)\,dx\), where \(f(x)\) is continuous over \([a, b)\) and discontinuous at \(b\). Since the function \(f(x)\) is continuous over \([a, t]\) for all values of \(t\) satisfying \(a \le t < b\), the integral \(\int_a^t f(x)\,dx\) is defined for all such values of \(t\). Thus, it makes sense to consider the values of \(\int_a^t f(x)\,dx\) as \(t\) approaches \(b\) for \(a \le t < b\). That is, we define \(\int_a^b f(x)\,dx = \lim_{t\to b^-} \int_a^t f(x)\,dx\), provided this limit exists. Figure 7.8.5 illustrates \(\int_a^t f(x)\,dx\) as areas of regions for values of \(t\) approaching \(b\).

Figure 7.8.5: As \(t\) approaches \(b\) from the left, the value of the area from \(a\) to \(t\) approaches the area from \(a\) to \(b\).

We use a similar approach to define \(\int_a^b f(x)\,dx\), where \(f(x)\) is continuous over \((a, b]\) and discontinuous at \(a\). We now proceed with a formal definition.

 Definition: Converging and Diverging Improper Integral


1. Let \(f(x)\) be continuous over \([a, b)\). Then,
\[\int_a^b f(x)\,dx = \lim_{t\to b^-} \int_a^t f(x)\,dx. \tag{7.8.4}\]

2. Let \(f(x)\) be continuous over \((a, b]\). Then,
\[\int_a^b f(x)\,dx = \lim_{t\to a^+} \int_t^b f(x)\,dx. \tag{7.8.5}\]

In each case, if the limit exists, then the improper integral is said to converge. If the limit does not exist, then the improper integral is said to diverge.

3. If \(f(x)\) is continuous over \([a, b]\) except at a point \(c\) in \((a, b)\), then
\[\int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx, \tag{7.8.6}\]
provided both \(\int_a^c f(x)\,dx\) and \(\int_c^b f(x)\,dx\) converge. If either of these integrals diverges, then \(\int_a^b f(x)\,dx\) diverges.

The following examples demonstrate the application of this definition.

 Example 7.8.6: Integrating a Discontinuous Integrand


Evaluate \(\int_0^4 \frac{1}{\sqrt{4-x}}\,dx\), if possible. State whether the integral converges or diverges.

Solution
The function \(f(x) = \frac{1}{\sqrt{4-x}}\) is continuous over \([0, 4)\) and discontinuous at 4. Using Equation 7.8.4 from the definition, rewrite \(\int_0^4 \frac{1}{\sqrt{4-x}}\,dx\) as a limit:
\[\begin{aligned}
\int_0^4 \frac{1}{\sqrt{4-x}}\,dx &= \lim_{t\to 4^-} \int_0^t \frac{1}{\sqrt{4-x}}\,dx && \text{Rewrite as a limit.} \\
&= \lim_{t\to 4^-} \left(-2\sqrt{4-x}\right)\Big|_0^t && \text{Find the antiderivative.} \\
&= \lim_{t\to 4^-} \left(-2\sqrt{4-t} + 4\right) && \text{Evaluate the antiderivative.} \\
&= 4. && \text{Evaluate the limit.}
\end{aligned}\]

The improper integral converges.
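To see the limit process at work, the sketch below (an added illustration with hypothetical helper names) approximates \(\int_0^t \frac{1}{\sqrt{4-x}}\,dx\) with a midpoint sum for values of \(t\) approaching 4 from the left, and prints the value \(-2\sqrt{4-t} + 4\) obtained from the antiderivative alongside it. The number of subintervals is an arbitrary choice.

```python
import math

def midpoint(f, a, b, n=20_000):
    """Midpoint rule with n equal subintervals (never evaluates f at an endpoint)."""
    dx = (b - a) / n
    return dx * sum(f(a + (i + 0.5) * dx) for i in range(n))

def integrand(x):
    return 1 / math.sqrt(4 - x)

def from_antiderivative(t):
    """Value of -2*sqrt(4 - x) evaluated from 0 to t."""
    return -2 * math.sqrt(4 - t) + 4

for t in (3.0, 3.9, 3.99, 3.999):
    numeric = midpoint(integrand, 0.0, t)
    print(f"t = {t:<6}  midpoint sum = {numeric:.5f}  antiderivative = {from_antiderivative(t):.5f}")
```

Both columns approach 4 as \(t \to 4^-\), even though the integrand itself grows without bound near \(x = 4\).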

 Example 7.8.7: Integrating a Discontinuous Integrand
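Evaluate \(\int_0^2 x\ln x\,dx\). State whether the integral converges or diverges.

Solution
Since \(f(x) = x\ln x\) is continuous over \((0, 2]\) and is discontinuous at zero, we can rewrite the integral in limit form using Equation 7.8.5:
\[\begin{aligned}
\int_0^2 x\ln x\,dx &= \lim_{t\to 0^+} \int_t^2 x\ln x\,dx && \text{Rewrite as a limit.} \\
&= \lim_{t\to 0^+} \left(\frac{1}{2}x^2\ln x - \frac{1}{4}x^2\right)\Big|_t^2 && \text{Integrate by parts with } u = \ln x \text{ and } dv = x\,dx. \\
&= \lim_{t\to 0^+} \left(2\ln 2 - 1 - \frac{1}{2}t^2\ln t + \frac{1}{4}t^2\right) && \text{Evaluate the antiderivative.} \\
&= 2\ln 2 - 1. && \text{Evaluate the limit.}
\end{aligned}\]
Note that \(\lim_{t\to 0^+} t^2\ln t\) is indeterminate. To evaluate it, rewrite it as a quotient and apply L’Hôpital’s rule:
\[\lim_{t\to 0^+} t^2\ln t = \lim_{t\to 0^+} \frac{\ln t}{t^{-2}} = \lim_{t\to 0^+} \frac{1/t}{-2t^{-3}} = \lim_{t\to 0^+} \left(-\frac{t^2}{2}\right) = 0.\]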

The improper integral converges.

 Example 7.8.8: Integrating a Discontinuous Integrand


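Evaluate \(\int_{-1}^1 \frac{1}{x^3}\,dx\). State whether the improper integral converges or diverges.

Solution
Since \(f(x) = 1/x^3\) is discontinuous at zero, using Equation 7.8.6, we can write
\[\int_{-1}^1 \frac{1}{x^3}\,dx = \int_{-1}^0 \frac{1}{x^3}\,dx + \int_0^1 \frac{1}{x^3}\,dx.\]
If either of the two integrals diverges, then the original integral diverges. Begin with \(\int_{-1}^0 \frac{1}{x^3}\,dx\):
\[\begin{aligned}
\int_{-1}^0 \frac{1}{x^3}\,dx &= \lim_{t\to 0^-} \int_{-1}^t \frac{1}{x^3}\,dx && \text{Rewrite as a limit.} \\
&= \lim_{t\to 0^-} \left(-\frac{1}{2x^2}\right)\Big|_{-1}^t && \text{Find the antiderivative.} \\
&= \lim_{t\to 0^-} \left(-\frac{1}{2t^2} + \frac{1}{2}\right) && \text{Evaluate the antiderivative.} \\
&= -\infty. && \text{Evaluate the limit.}
\end{aligned}\]
Therefore, \(\int_{-1}^0 \frac{1}{x^3}\,dx\) diverges.

Since \(\int_{-1}^0 \frac{1}{x^3}\,dx\) diverges, \(\int_{-1}^1 \frac{1}{x^3}\,dx\) also diverges.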

 Exercise 7.8.2
Evaluate \(\int_0^2 \frac{1}{x}\,dx\). State whether the integral converges or diverges.

Hint
Write \(\int_0^2 \frac{1}{x}\,dx\) in limit form using Equation 7.8.5.

Answer
\(+\infty\). It diverges.

A Comparison Theorem
It is not always easy or even possible to evaluate an improper integral directly; however, by comparing it with another carefully chosen integral, it
may be possible to determine its convergence or divergence. To see this, consider two continuous functions f (x) and g(x) satisfying
0 ≤ f (x) ≤ g(x) for x ≥ a (Figure 7.8.6). In this case, we may view integrals of these functions over intervals of the form [a, t] as areas, so we

have the relationship


\[0 \le \int_a^t f(x)\,dx \le \int_a^t g(x)\,dx\]
for \(t \ge a\).

Figure 7.8.6: If \(0 \le f(x) \le g(x)\) for \(x \ge a\), then for \(t \ge a\), \(\int_a^t f(x)\,dx \le \int_a^t g(x)\,dx\).

Thus, if
\[\int_a^{+\infty} f(x)\,dx = \lim_{t\to+\infty} \int_a^t f(x)\,dx = +\infty,\]
then
\[\int_a^{+\infty} g(x)\,dx = \lim_{t\to+\infty} \int_a^t g(x)\,dx = +\infty\]
as well. That is, if the area of the region between the graph of \(f(x)\) and the x-axis over \([a, +\infty)\) is infinite, then the area of the region between the graph of \(g(x)\) and the x-axis over \([a, +\infty)\) is infinite too.
On the other hand, if
\[\int_a^{+\infty} g(x)\,dx = \lim_{t\to+\infty} \int_a^t g(x)\,dx = L\]
for some real number \(L\), then
\[\int_a^{+\infty} f(x)\,dx = \lim_{t\to+\infty} \int_a^t f(x)\,dx\]
must converge to some value less than or equal to \(L\), since \(\int_a^t f(x)\,dx\) increases as \(t\) increases and \(\int_a^t f(x)\,dx \le L\) for all \(t \ge a\).

If the area of the region between the graph of g(x) and the x-axis over [a, +∞) is finite, then the area of the region between the graph of f (x) and
the x-axis over [a, +∞) is also finite.
These conclusions are summarized in the following theorem.

 A Comparison Theorem
Let \(f(x)\) and \(g(x)\) be continuous over \([a, +\infty)\). Assume that \(0 \le f(x) \le g(x)\) for \(x \ge a\).
i. If
\[\int_a^{+\infty} f(x)\,dx = \lim_{t\to+\infty} \int_a^t f(x)\,dx = +\infty,\]
then
\[\int_a^{+\infty} g(x)\,dx = \lim_{t\to+\infty} \int_a^t g(x)\,dx = +\infty.\]
ii. If
\[\int_a^{+\infty} g(x)\,dx = \lim_{t\to+\infty} \int_a^t g(x)\,dx = L,\]
where \(L\) is a real number, then
\[\int_a^{+\infty} f(x)\,dx = \lim_{t\to+\infty} \int_a^t f(x)\,dx = M\]
for some real number \(M \le L\).

 Example 7.8.9: Applying the Comparison Theorem


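Use a comparison to show that \(\int_1^{+\infty} \frac{1}{xe^x}\,dx\) converges.

Solution
We can see that
\[0 \le \frac{1}{xe^x} \le \frac{1}{e^x} = e^{-x} \quad \text{for } x \ge 1,\]
so if \(\int_1^{+\infty} e^{-x}\,dx\) converges, then so does \(\int_1^{+\infty} \frac{1}{xe^x}\,dx\).

To evaluate \(\int_1^{+\infty} e^{-x}\,dx\), first rewrite it as a limit:
\[\begin{aligned}
\int_1^{+\infty} e^{-x}\,dx &= \lim_{t\to+\infty} \int_1^t e^{-x}\,dx \\
&= \lim_{t\to+\infty} \left(-e^{-x}\right)\Big|_1^t \\
&= \lim_{t\to+\infty} \left(-e^{-t} + e^{-1}\right) \\
&= e^{-1}.
\end{aligned}\]
Since \(\int_1^{+\infty} e^{-x}\,dx\) converges, so does \(\int_1^{+\infty} \frac{1}{xe^x}\,dx\).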
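The comparison can also be observed numerically. The following sketch is an added illustration: the Simpson's rule helper, the number of subintervals, and the sample values of \(t\) are all assumptions. It approximates \(\int_1^t \frac{1}{xe^x}\,dx\) and \(\int_1^t e^{-x}\,dx\) for growing \(t\); the first should increase with \(t\) while staying below the second, which in turn stays below \(e^{-1}\).

```python
import math

def simpson(f, a, b, n=10_000):
    """Composite Simpson's rule; n must be even."""
    if n % 2 != 0:
        raise ValueError("n must be even")
    dx = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 == 1 else 2) * f(a + i * dx)
    return dx / 3 * total

def f(x):
    return 1 / (x * math.exp(x))   # the smaller function

def g(x):
    return math.exp(-x)            # the larger function

for t in (2, 5, 10, 50):
    F = simpson(f, 1, t)
    G = simpson(g, 1, t)
    print(f"t = {t:>2}   integral of f = {F:.6f}   integral of g = {G:.6f}")
print(f"e^(-1) = {math.exp(-1):.6f}")
```

The comparison theorem guarantees convergence and the bound \(M \le e^{-1}\); it does not give the value of the smaller integral, which here can only be estimated numerically.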

 Example 7.8.10: Applying the Comparison Theorem


Use the comparison theorem to show that \(\int_1^{+\infty} \frac{1}{x^p}\,dx\) diverges for all \(p < 1\).

Solution
For \(p < 1\), \(\frac{1}{x} \le \frac{1}{x^p}\) over \([1, +\infty)\).

In Example 7.8.1, we showed that \(\int_1^{+\infty} \frac{1}{x}\,dx = +\infty\).

Therefore, \(\int_1^{+\infty} \frac{1}{x^p}\,dx\) diverges for all \(p < 1\).

 Exercise 7.8.3
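Use a comparison to show that \(\int_e^{+\infty} \frac{\ln x}{x}\,dx\) diverges.

Hint
\(\frac{1}{x} \le \frac{\ln x}{x}\) on \([e, +\infty)\)

Answer
Since \(\int_e^{+\infty} \frac{1}{x}\,dx = +\infty\), \(\int_e^{+\infty} \frac{\ln x}{x}\,dx\) diverges.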

 Laplace Transforms
In the last few chapters, we have looked at several ways to use integration for solving real-world problems. For this next project, we are going
to explore a more advanced application of integration: integral transforms. Specifically, we describe the Laplace transform and some of its
properties. The Laplace transform is used in engineering and physics to simplify the computations needed to solve some problems. It takes
functions expressed in terms of time and transforms them to functions expressed in terms of frequency. It turns out that, in many cases, the
computations needed to solve problems in the frequency domain are much simpler than those required in the time domain.
The Laplace transform is defined in terms of an integral as
\[L(f(t)) = F(s) = \int_0^{\infty} e^{-st} f(t)\,dt.\]
Note that the input to a Laplace transform is a function of time, \(f(t)\), and the output is a function of frequency, \(F(s)\). Although many real-world examples require the use of complex numbers (involving the imaginary number \(i = \sqrt{-1}\)), in this project we limit ourselves to functions of real numbers.
Let’s start with a simple example. Here we calculate the Laplace transform of \(f(t) = t\). We have
\[L(t) = \int_0^{\infty} te^{-st}\,dt.\]
This is an improper integral, so we express it in terms of a limit, which gives
\[L(t) = \int_0^{\infty} te^{-st}\,dt = \lim_{z\to\infty} \int_0^z te^{-st}\,dt.\]
Now we use integration by parts to evaluate the integral. Note that we are integrating with respect to \(t\), so we treat the variable \(s\) as a constant. We have
\[\begin{aligned}
u &= t & dv &= e^{-st}\,dt \\
du &= dt & v &= -\frac{1}{s}e^{-st}.
\end{aligned}\]

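Then we obtain (assuming \(s > 0\), so that the limits below exist)
\[\begin{aligned}
\lim_{z\to\infty} \int_0^z te^{-st}\,dt &= \lim_{z\to\infty} \left[ -\frac{t}{s}e^{-st}\Big|_0^z + \frac{1}{s}\int_0^z e^{-st}\,dt \right] \\
&= \lim_{z\to\infty} \left[ \left(-\frac{z}{s}e^{-sz} + \frac{0}{s}e^{-0s}\right) + \frac{1}{s}\int_0^z e^{-st}\,dt \right] \\
&= \lim_{z\to\infty} \left[ \left(-\frac{z}{s}e^{-sz} + 0\right) - \frac{1}{s}\left(\frac{e^{-st}}{s}\right)\Big|_0^z \right] \\
&= \lim_{z\to\infty} \left[ -\frac{z}{s}e^{-sz} - \frac{1}{s^2}\left(e^{-sz} - 1\right) \right] \\
&= \lim_{z\to\infty} \left[-\frac{z}{se^{sz}}\right] - \lim_{z\to\infty} \frac{1}{s^2e^{sz}} + \lim_{z\to\infty} \frac{1}{s^2} \\
&= 0 - 0 + \frac{1}{s^2} \\
&= \frac{1}{s^2}.
\end{aligned}\]
Note that the first limit requires L'Hôpital's rule: \(\lim_{z\to\infty} \left[-\frac{z}{se^{sz}}\right]\) is indeterminate of the form \(-\frac{\infty}{\infty}\). Using L'Hôpital's rule,
\[\lim_{z\to\infty} \left[-\frac{z}{se^{sz}}\right] = \lim_{z\to\infty} \left[-\frac{1}{s^2e^{sz}}\right] = 0.\]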
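The result \(L(t) = 1/s^2\) can be checked numerically for a particular \(s\). The sketch below is an added illustration, not part of the project: the helper names `simpson` and `laplace_partial`, the sample value \(s = 2\), and the cutoffs \(z\) are all assumptions. It approximates \(\int_0^z e^{-st} f(t)\,dt\) with Simpson's rule for increasing \(z\).

```python
import math

def simpson(f, a, b, n):
    """Composite Simpson's rule; n must be even."""
    if n % 2 != 0:
        raise ValueError("n must be even")
    dx = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 == 1 else 2) * f(a + i * dx)
    return dx / 3 * total

def laplace_partial(f, s, z, n=4_000):
    """Approximate the integral of e^(-s t) f(t) over [0, z]."""
    return simpson(lambda t: math.exp(-s * t) * f(t), 0.0, z, n)

s = 2.0   # sample frequency value (an arbitrary choice with s > 0)
for z in (1, 5, 10, 40):
    approx = laplace_partial(lambda t: t, s, z)
    print(f"z = {z:>2}   integral over [0, z] = {approx:.8f}")
print(f"1 / s^2 = {1 / s**2:.8f}")
```

The same helper can be pointed at other functions, for instance `lambda t: 1.0`, as a way to check answers to the exercises that follow; it only gives a numerical value for one \(s\) at a time, not a formula in \(s\).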

1. Calculate the Laplace transform of \(f(t) = 1\).
2. Calculate the Laplace transform of \(f(t) = e^{-3t}\).
3. Calculate the Laplace transform of \(f(t) = t^2\). (Note, you will have to integrate by parts twice.)

Laplace transforms are often used to solve differential equations. Differential equations are not covered in detail until later in this book; but,
for now, let’s look at the relationship between the Laplace transform of a function and the Laplace transform of its derivative.
Let’s start with the definition of the Laplace transform. We have
\[L(f(t)) = \int_0^{\infty} e^{-st} f(t)\,dt = \lim_{z\to\infty} \int_0^z e^{-st} f(t)\,dt.\]
Use integration by parts to evaluate \(\lim_{z\to\infty} \int_0^z e^{-st} f(t)\,dt\). (Let \(u = f(t)\) and \(dv = e^{-st}\,dt\).)

After integrating by parts and evaluating the limit, you should see that
\[L(f(t)) = \frac{f(0)}{s} + \frac{1}{s}\left[L(f'(t))\right].\]
Then,
\[L(f'(t)) = sL(f(t)) - f(0).\]

Thus, differentiation in the time domain simplifies to multiplication by s in the frequency domain.
The final thing we look at in this project is how the Laplace transforms of \(f(t)\) and its antiderivative are related. Let \(g(t) = \int_0^t f(u)\,du\). Then,
\[L(g(t)) = \int_0^{\infty} e^{-st} g(t)\,dt = \lim_{z\to\infty} \int_0^z e^{-st} g(t)\,dt.\]
Use integration by parts to evaluate \(\lim_{z\to\infty} \int_0^z e^{-st} g(t)\,dt\). (Let \(u = g(t)\) and \(dv = e^{-st}\,dt\). Note, by the way that we have defined \(g(t)\), \(du = f(t)\,dt\).)

As you might expect, you should see that
\[L(g(t)) = \frac{1}{s} \cdot L(f(t)).\]

Integration in the time domain simplifies to division by s in the frequency domain.

Key Concepts
Integrals of functions over infinite intervals are defined in terms of limits.
Integrals of functions over an interval for which the function has a discontinuity at an endpoint may be defined in terms of limits.
The convergence or divergence of an improper integral may be determined by comparing it with the value of an improper integral for which
the convergence or divergence is known.

Key Equations
Improper integrals
\[\int_a^{+\infty} f(x)\,dx = \lim_{t\to+\infty} \int_a^t f(x)\,dx\]
\[\int_{-\infty}^b f(x)\,dx = \lim_{t\to-\infty} \int_t^b f(x)\,dx\]
\[\int_{-\infty}^{+\infty} f(x)\,dx = \int_{-\infty}^0 f(x)\,dx + \int_0^{+\infty} f(x)\,dx\]

Glossary
improper integral
an integral over an infinite interval or an integral of a function containing an infinite discontinuity on the interval; an improper integral is
defined in terms of a limit. The improper integral converges if this limit is a finite real number; otherwise, the improper integral diverges

7.8: Improper Integrals is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
7.7: Improper Integrals by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0. Original source: https://openstax.org/details/books/calculus-volume-1.

CHAPTER OVERVIEW

8: Further Applications of Integration


A general Calculus Textmap organized around the textbook

Calculus: Early Transcendentals


by James Stewart


Topic hierarchy
8.1: Arc Length
8.2: Area of a Surface of Revolution
8.3: Applications to Physics and Engineering
8.4: Applications to Economics and Biology
8.5: Probability

Thumbnail: Volume of a solid of revolution can be calculated via integration techniques. (GPLC, Matt Boelkins, David Austin, and
Steve Schlicker, Grand Valley State University).

8: Further Applications of Integration is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.

8.1: Arc Length
 Learning Objectives
Determine the length of a curve, y = f (x), between two points.
Determine the length of a curve, x = g(y) , between two points.
Find the surface area of a solid of revolution.

In this section, we use definite integrals to find the arc length of a curve. We can think of arc length as the distance you would
travel if you were walking along the path of the curve. Many real-world applications involve arc length. If a rocket is launched
along a parabolic path, we might want to know how far the rocket travels. Or, if a curve on a map represents a road, we might want
to know how far we have to drive to reach our destination.
We begin by calculating the arc length of curves defined as functions of x, then we examine the same process for curves defined as
functions of y . (The process is identical, with the roles of x and y reversed.) The techniques we use to find arc length can be
extended to find the surface area of a surface of revolution, and we close the section with an examination of this concept.

Arc Length of the Curve y = f(x)


In previous applications of integration, we required the function f (x) to be integrable, or at most continuous. However, for
calculating arc length we have a more stringent requirement for f (x). Here, we require f (x) to be differentiable, and furthermore
we require its derivative, f '(x), to be continuous. Functions like this, which have continuous derivatives, are called smooth. (This
property comes up again in later chapters.)
Let \(f(x)\) be a smooth function defined over \([a, b]\). We want to calculate the length of the curve from the point \((a, f(a))\) to the point \((b, f(b))\). We start by using line segments to approximate the length of the curve. For \(i = 0, 1, 2, \ldots, n\), let \(P = \{x_i\}\) be a regular partition of \([a, b]\). Then, for \(i = 1, 2, \ldots, n\), construct a line segment from the point \((x_{i-1}, f(x_{i-1}))\) to the point \((x_i, f(x_i))\). Although it might seem logical to use either horizontal or vertical line segments, we want our line segments to approximate the curve as closely as possible. Figure 8.1.1 depicts this construct for \(n = 5\).

Figure 8.1.1: We can approximate the length of a curve by adding line segments.
To help us find the length of each line segment, we look at the change in vertical distance as well as the change in horizontal distance over each interval. Because we have used a regular partition, the change in horizontal distance over each interval is given by \(\Delta x\). The change in vertical distance varies from interval to interval, though, so we use \(\Delta y_i = f(x_i) - f(x_{i-1})\) to represent the change in vertical distance over the interval \([x_{i-1}, x_i]\), as shown in Figure 8.1.2. Note that some (or all) \(\Delta y_i\) may be negative.

Figure 8.1.2: A representative line segment approximates the curve over the interval \([x_{i-1}, x_i]\).

By the Pythagorean theorem, the length of the line segment is
\[\sqrt{(\Delta x)^2 + (\Delta y_i)^2}.\]
We can also write this as
\[\Delta x \sqrt{1 + \left((\Delta y_i)/(\Delta x)\right)^2}.\]
Now, by the Mean Value Theorem, there is a point \(x_i^* \in [x_{i-1}, x_i]\) such that \(f'(x_i^*) = (\Delta y_i)/(\Delta x)\). Then the length of the line segment is given by
\[\Delta x \sqrt{1 + [f'(x_i^*)]^2}.\]
Adding up the lengths of all the line segments, we get
\[\text{Arc Length} \approx \sum_{i=1}^n \sqrt{1 + [f'(x_i^*)]^2}\,\Delta x.\]
This is a Riemann sum. Taking the limit as \(n \to \infty\), we have
\[\begin{aligned}
\text{Arc Length} &= \lim_{n\to\infty} \sum_{i=1}^n \sqrt{1 + [f'(x_i^*)]^2}\,\Delta x \\
&= \int_a^b \sqrt{1 + [f'(x)]^2}\,dx.
\end{aligned}\]

We summarize these findings in the following theorem.

 Arc Length for y = f (x)


Let \(f(x)\) be a smooth function over the interval \([a, b]\). Then the arc length of the portion of the graph of \(f(x)\) from the point \((a, f(a))\) to the point \((b, f(b))\) is given by
\[\text{Arc Length} = \int_a^b \sqrt{1 + [f'(x)]^2}\,dx.\]

Note that we are integrating an expression involving f '(x), so we need to be sure f '(x) is integrable. This is why we require f (x)
to be smooth. The following example shows how to apply the theorem.

 Example 8.1.1: Calculating the Arc Length of a Function of x

Let \(f(x) = 2x^{3/2}\). Calculate the arc length of the graph of \(f(x)\) over the interval \([0, 1]\). Round the answer to three decimal places.
Solution
We have \(f'(x) = 3x^{1/2}\), so \([f'(x)]^2 = 9x\). Then, the arc length is
\[\begin{aligned}
\text{Arc Length} &= \int_a^b \sqrt{1 + [f'(x)]^2}\,dx \\
&= \int_0^1 \sqrt{1 + 9x}\,dx.
\end{aligned}\]
Substitute \(u = 1 + 9x\). Then, \(du = 9\,dx\). When \(x = 0\), then \(u = 1\), and when \(x = 1\), then \(u = 10\). Thus,
\[\begin{aligned}
\text{Arc Length} &= \int_0^1 \sqrt{1 + 9x}\,dx \\
&= \frac{1}{9} \int_0^1 \sqrt{1 + 9x}\;9\,dx \\
&= \frac{1}{9} \int_1^{10} \sqrt{u}\,du \\
&= \frac{1}{9} \cdot \frac{2}{3} u^{3/2}\Big|_1^{10} = \frac{2}{27}\left[10\sqrt{10} - 1\right] \\
&\approx 2.268 \text{ units}.
\end{aligned}\]
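The line-segment construction that led to the arc length formula can be carried out directly. The sketch below is an added illustration (the function name `polygonal_length` is hypothetical): it sums the lengths \(\sqrt{(\Delta x)^2 + (\Delta y_i)^2}\) over a regular partition for \(f(x) = 2x^{3/2}\) on \([0, 1]\) and compares the total with the exact value \(\frac{2}{27}(10\sqrt{10} - 1)\) found above.

```python
import math

def polygonal_length(f, a, b, n):
    """Total length of n line segments joining points on the graph of f
    over a regular partition of [a, b]."""
    dx = (b - a) / n
    xs = [a + i * dx for i in range(n + 1)]
    return sum(
        math.hypot(xs[i] - xs[i - 1], f(xs[i]) - f(xs[i - 1]))
        for i in range(1, n + 1)
    )

def f(x):
    return 2 * x ** 1.5

exact = 2 / 27 * (10 * math.sqrt(10) - 1)
for n in (5, 50, 500, 5000):
    approx = polygonal_length(f, 0.0, 1.0, n)
    print(f"n = {n:>4}   polygonal length = {approx:.6f}   exact - approx = {exact - approx:.2e}")
print(f"exact = {exact:.6f}")
```

Because a straight segment is the shortest path between its endpoints, each polygonal total is slightly less than the true arc length, and the gap shrinks as \(n\) grows.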

 Exercise 8.1.1

Let \(f(x) = (4/3)x^{3/2}\). Calculate the arc length of the graph of \(f(x)\) over the interval \([0, 1]\). Round the answer to three decimal places.

Hint
Use the process from the previous example. Don’t forget to change the limits of integration.

Answer
\(\frac{1}{6}\left(5\sqrt{5} - 1\right) \approx 1.697\)

Although it is nice to have a formula for calculating arc length, this particular theorem can generate expressions that are difficult to
integrate. We study some techniques for integration in Introduction to Techniques of Integration. In some cases, we may have to
use a computer or calculator to approximate the value of the integral.

 Example 8.1.2: Using a Computer or Calculator to Determine the Arc Length of a Function of x

Let \(f(x) = x^2\). Calculate the arc length of the graph of \(f(x)\) over the interval \([1, 3]\).
Solution
We have \(f'(x) = 2x\), so \([f'(x)]^2 = 4x^2\). Then the arc length is given by
\[\begin{aligned}
\text{Arc Length} &= \int_a^b \sqrt{1 + [f'(x)]^2}\,dx \\
&= \int_1^3 \sqrt{1 + 4x^2}\,dx.
\end{aligned}\]
Using a computer to approximate the value of this integral, we get
\[\int_1^3 \sqrt{1 + 4x^2}\,dx \approx 8.26815.\]
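One way to "use a computer" here is Simpson's rule applied to the arc length integrand. The sketch below is an added illustration, not the method used by the original authors; the helper names `simpson` and `arc_length` and the number of subintervals are assumptions. It takes the derivative \(f'\) as input and approximates \(\int_a^b \sqrt{1 + [f'(x)]^2}\,dx\).

```python
import math

def simpson(f, a, b, n=1_000):
    """Composite Simpson's rule; n must be even."""
    if n % 2 != 0:
        raise ValueError("n must be even")
    dx = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 == 1 else 2) * f(a + i * dx)
    return dx / 3 * total

def arc_length(fprime, a, b, n=1_000):
    """Approximate the arc length integral, given the derivative fprime."""
    return simpson(lambda x: math.sqrt(1 + fprime(x) ** 2), a, b, n)

# f(x) = x^2 on [1, 3], so f'(x) = 2x; compare with 8.26815 in the text.
print(arc_length(lambda x: 2 * x, 1, 3))

# f(x) = 2x^(3/2) on [0, 1], so f'(x) = 3*sqrt(x); compare with 2.268.
print(arc_length(lambda x: 3 * math.sqrt(x), 0, 1))
```

The same function works for a curve \(x = g(y)\) if `fprime` is replaced by \(g'\) and the limits are the \(y\)-values.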

 Exercise 8.1.2
Let f (x) = sin x . Calculate the arc length of the graph of f (x) over the interval [0, π] . Use a computer or calculator to
approximate the value of the integral.

Hint
Use the process from the previous example.

Answer
Arc Length ≈ 3.8202

Arc Length of the Curve x = g(y)


We have just seen how to approximate the length of a curve with line segments. If we want to find the arc length of the graph of a
function of y , we can repeat the same process, except we partition the y-axis instead of the x-axis. Figure 8.1.3 shows a
representative line segment.

Figure 8.1.3: A representative line segment over the interval \([y_{i-1}, y_i]\).

Then the length of the line segment is
\[\sqrt{(\Delta y)^2 + (\Delta x_i)^2},\]
which can also be written as
\[\Delta y \sqrt{1 + \left(\frac{\Delta x_i}{\Delta y}\right)^2}.\]

If we now follow the same development we did earlier, we get a formula for arc length of a function x = g(y) .

 Arc Length for x = g(y)

Let \(g(y)\) be a smooth function over an interval \([c, d]\). Then, the arc length of the graph of \(g(y)\) from the point \((c, g(c))\) to the point \((d, g(d))\) is given by
\[\text{Arc Length} = \int_c^d \sqrt{1 + [g'(y)]^2}\,dy.\]

 Example 8.1.3: Calculating the Arc Length of a Function of y

Let \(g(y) = 3y^3\). Calculate the arc length of the graph of \(g(y)\) over the interval \([1, 2]\).
Solution
We have \(g'(y) = 9y^2\), so \([g'(y)]^2 = 81y^4\). Then the arc length is
\[\begin{aligned}
\text{Arc Length} &= \int_c^d \sqrt{1 + [g'(y)]^2}\,dy \\
&= \int_1^2 \sqrt{1 + 81y^4}\,dy.
\end{aligned}\]
Using a computer to approximate the value of this integral, we obtain
\[\int_1^2 \sqrt{1 + 81y^4}\,dy \approx 21.0277.\]

 Exercise 8.1.3

Let \(g(y) = 1/y\). Calculate the arc length of the graph of \(g(y)\) over the interval \([1, 4]\). Use a computer or calculator to approximate the value of the integral.

Hint
Use the process from the previous example.

Answer
Arc Length ≈ 3.15018

Area of a Surface of Revolution


The concepts we used to find the arc length of a curve can be extended to find the surface area of a surface of revolution. Surface
area is the total area of the outer layer of an object. For objects such as cubes or bricks, the surface area of the object is the sum of
the areas of all of its faces. For curved surfaces, the situation is a little more complex. Let f (x) be a nonnegative smooth function
over the interval [a, b]. We wish to find the surface area of the surface of revolution created by revolving the graph of y = f (x)
around the x-axis as shown in the following figure.

Figure 8.1.4: (a) A curve representing the function \(f(x)\). (b) The surface of revolution formed by revolving the graph of \(f(x)\) around the x-axis.

As we have done many times before, we are going to partition the interval \([a, b]\) and approximate the surface area by calculating the surface area of simpler shapes. We start by using line segments to approximate the curve, as we did earlier in this section. For \(i = 0, 1, 2, \ldots, n\), let \(P = \{x_i\}\) be a regular partition of \([a, b]\). Then, for \(i = 1, 2, \ldots, n\), construct a line segment from the point \((x_{i-1}, f(x_{i-1}))\) to the point \((x_i, f(x_i))\). Now, revolve these line segments around the x-axis to generate an approximation of the surface of revolution as shown in the following figure.

Figure 8.1.5: (a) Approximating \(f(x)\) with line segments. (b) The surface of revolution formed by revolving the line segments around the x-axis.
Notice that when each line segment is revolved around the axis, it produces a band. These bands are actually pieces of cones (think
of an ice cream cone with the pointy end cut off). A piece of a cone like this is called a frustum of a cone.
To find the surface area of the band, we need to find the lateral surface area, \(S\), of the frustum (the area of just the slanted outside surface of the frustum, not including the areas of the top or bottom faces). Let \(r_1\) and \(r_2\) be the radii of the wide end and the narrow end of the frustum, respectively, and let \(l\) be the slant height of the frustum as shown in the following figure.

Figure 8.1.6 : A frustum of a cone can approximate a small part of surface area.
We know the lateral surface area of a cone is given by

Lateral Surface Area = πrs,

where r is the radius of the base of the cone and s is the slant height (Figure 8.1.7).

Figure 8.1.7 : The lateral surface area of the cone is given by πrs .
Since a frustum can be thought of as a piece of a cone, the lateral surface area of the frustum is given by the lateral surface area of
the whole cone less the lateral surface area of the smaller cone (the pointy tip) that was cut off (Figure 8.1.8).

Figure 8.1.8 : Calculating the lateral surface area of a frustum of a cone.
The cross-sections of the small cone and the large cone are similar triangles, so we see that
\[ \frac{r_2}{r_1} = \frac{s - l}{s}. \]

Solving for \(s\), we get
\begin{align*}
\frac{r_2}{r_1} &= \frac{s - l}{s} \\
r_2 s &= r_1 (s - l) \\
r_2 s &= r_1 s - r_1 l \\
r_1 l &= r_1 s - r_2 s \\
r_1 l &= (r_1 - r_2) s \\
\frac{r_1 l}{r_1 - r_2} &= s.
\end{align*}

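Then the lateral surface area (SA) of the frustum is
\begin{align*}
S &= (\text{Lateral SA of large cone}) - (\text{Lateral SA of small cone}) \\
&= \pi r_1 s - \pi r_2 (s - l) \\
&= \pi r_1 \left(\frac{r_1 l}{r_1 - r_2}\right) - \pi r_2 \left(\frac{r_1 l}{r_1 - r_2} - l\right) \\
&= \frac{\pi r_1^2 l}{r_1 - r_2} - \frac{\pi r_1 r_2 l}{r_1 - r_2} + \pi r_2 l \\
&= \frac{\pi r_1^2 l}{r_1 - r_2} - \frac{\pi r_1 r_2 l}{r_1 - r_2} + \frac{\pi r_2 l (r_1 - r_2)}{r_1 - r_2} \\
&= \frac{\pi r_1^2 l}{r_1 - r_2} - \frac{\pi r_1 r_2 l}{r_1 - r_2} + \frac{\pi r_1 r_2 l}{r_1 - r_2} - \frac{\pi r_2^2 l}{r_1 - r_2} \\
&= \frac{\pi (r_1^2 - r_2^2) l}{r_1 - r_2} = \frac{\pi (r_1 - r_2)(r_1 + r_2) l}{r_1 - r_2} \\
&= \pi (r_1 + r_2) l.
\end{align*}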
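The algebra above can be spot-checked numerically. The following is a minimal sketch, not part of the original derivation: the function names and the sample values of the radii and slant height are our own choices for illustration. It computes the frustum area both as the difference of two cones and from the simplified formula.

```python
import math

def frustum_area_by_subtraction(r1, r2, l):
    """Lateral area as (large cone) - (small cone), following the derivation."""
    s = r1 * l / (r1 - r2)  # slant height of the full cone, from similar triangles
    return math.pi * r1 * s - math.pi * r2 * (s - l)

def frustum_area_closed_form(r1, r2, l):
    """Simplified result S = pi * (r1 + r2) * l."""
    return math.pi * (r1 + r2) * l

# Arbitrary sample values (assumed for illustration), with r1 > r2.
r1, r2, l = 3.0, 1.0, 2.5
print(frustum_area_by_subtraction(r1, r2, l))
print(frustum_area_closed_form(r1, r2, l))
```

For these sample values both expressions equal 10π, as the simplification predicts. Note that the subtraction form requires \(r_1 \neq r_2\), while the closed form also covers the cylinder case \(r_1 = r_2\).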

Let’s now use this formula to calculate the surface area of each of the bands formed by revolving the line segments around the
x-axis. A representative band is shown in the following figure.

Figure 8.1.9 : A representative band used for determining surface area.
Note that the slant height of this frustum is just the length of the line segment used to generate it. So, applying the surface area
formula, we have
\begin{align*}
S &= \pi (r_1 + r_2) l \\
&= \pi (f(x_{i-1}) + f(x_i)) \sqrt{(\Delta x)^2 + (\Delta y_i)^2} \\
&= \pi (f(x_{i-1}) + f(x_i)) \Delta x \sqrt{1 + \left(\frac{\Delta y_i}{\Delta x}\right)^2}.
\end{align*}

Now, as we did in the development of the arc length formula, we apply the Mean Value Theorem to select \(x_i^* \in [x_{i-1}, x_i]\) such that
\(f'(x_i^*) = (\Delta y_i)/\Delta x\). This gives us
\[ S = \pi (f(x_{i-1}) + f(x_i)) \Delta x \sqrt{1 + (f'(x_i^*))^2}. \]

Furthermore, since \(f(x)\) is continuous, by the Intermediate Value Theorem, there is a point \(x_i^{**} \in [x_{i-1}, x_i]\) such that
\(f(x_i^{**}) = (1/2)[f(x_{i-1}) + f(x_i)]\),
so we get
\[ S = 2\pi f(x_i^{**}) \Delta x \sqrt{1 + (f'(x_i^*))^2}. \]

Then the approximate surface area of the whole surface of revolution is given by
\[ \text{Surface Area} \approx \sum_{i=1}^{n} 2\pi f(x_i^{**}) \Delta x \sqrt{1 + (f'(x_i^*))^2}. \]

This almost looks like a Riemann sum, except we have functions evaluated at two different points, \(x_i^*\) and \(x_i^{**}\), over the interval
\([x_{i-1}, x_i]\). Although we do not examine the details here, it turns out that because \(f(x)\) is smooth, if we let \(n \to \infty\), the limit works
the same as a Riemann sum even with the two different evaluation points. This makes sense intuitively. Both \(x_i^*\) and \(x_i^{**}\) are
in the interval \([x_{i-1}, x_i]\), so it makes sense that as \(n \to \infty\), both \(x_i^*\) and \(x_i^{**}\) approach \(x\). Those of you who are interested in the
details should consult an advanced calculus text.


Taking the limit as \(n \to \infty\), we get
\begin{align*}
\text{Surface Area} &= \lim_{n \to \infty} \sum_{i=1}^{n} 2\pi f(x_i^{**}) \Delta x \sqrt{1 + (f'(x_i^*))^2} \\
&= \int_a^b \left(2\pi f(x) \sqrt{1 + (f'(x))^2}\right) dx.
\end{align*}

As with arc length, we can conduct a similar development for functions of y to get a formula for the surface area of surfaces of
revolution about the y-axis. These findings are summarized in the following theorem.

 Surface Area of a Surface of Revolution
Let \(f(x)\) be a nonnegative smooth function over the interval \([a, b]\). Then, the surface area of the surface of revolution formed
by revolving the graph of \(f(x)\) around the x-axis is given by
\[ \text{Surface Area} = \int_a^b \left(2\pi f(x) \sqrt{1 + (f'(x))^2}\right) dx. \]

Similarly, let \(g(y)\) be a nonnegative smooth function over the interval \([c, d]\). Then, the surface area of the surface of revolution
formed by revolving the graph of \(g(y)\) around the y-axis is given by
\[ \text{Surface Area} = \int_c^d \left(2\pi g(y) \sqrt{1 + (g'(y))^2}\right) dy. \]

 Example 8.1.4: Calculating the Surface Area of a Surface of Revolution 1.

Let \(f(x) = \sqrt{x}\) over the interval [1, 4]. Find the surface area of the surface generated by revolving the graph of \(f(x)\) around
the x-axis. Round the answer to three decimal places.
Solution
The graph of f (x) and the surface of rotation are shown in Figure 8.1.10.

Figure 8.1.10 : (a) The graph of f (x). (b) The surface of revolution.

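We have \(f(x) = \sqrt{x}\). Then, \(f'(x) = 1/(2\sqrt{x})\) and \((f'(x))^2 = 1/(4x)\). Then,
\begin{align*}
\text{Surface Area} &= \int_a^b \left(2\pi f(x) \sqrt{1 + (f'(x))^2}\right) dx \\
&= \int_1^4 \left(2\pi \sqrt{x} \sqrt{1 + \frac{1}{4x}}\right) dx \\
&= \int_1^4 \left(2\pi \sqrt{x + \frac{1}{4}}\right) dx.
\end{align*}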
1

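Let \(u = x + 1/4\). Then, \(du = dx\). When \(x = 1\), \(u = 5/4\), and when \(x = 4\), \(u = 17/4\). This gives us
\begin{align*}
\int_1^4 \left(2\pi \sqrt{x + \frac{1}{4}}\right) dx &= \int_{5/4}^{17/4} 2\pi \sqrt{u}\, du \\
&= 2\pi \left[\frac{2}{3} u^{3/2}\right]_{5/4}^{17/4} \\
&= \frac{\pi}{6}\left[17\sqrt{17} - 5\sqrt{5}\right] \approx 30.846.
\end{align*}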
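As noted in the Key Concepts, surface area integrals often have to be approximated numerically. The sketch below is one way to do that for this example, using only the Python standard library. The helper names `simpson` and `surface_area_x` are hypothetical, and composite Simpson's rule is our choice of method, not one prescribed by the text.

```python
import math

def simpson(func, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n subintervals (n made even)."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3

def surface_area_x(f, df, a, b):
    """Approximate area of the surface formed by revolving y = f(x) about the x-axis."""
    return simpson(lambda x: 2 * math.pi * f(x) * math.sqrt(1 + df(x) ** 2), a, b)

# f(x) = sqrt(x) on [1, 4], with f'(x) = 1 / (2 sqrt(x)).
area = surface_area_x(math.sqrt, lambda x: 1 / (2 * math.sqrt(x)), 1, 4)
exact = math.pi / 6 * (17 * math.sqrt(17) - 5 * math.sqrt(5))
print(round(area, 3), round(exact, 3))
```

The numerical estimate and the closed-form value agree to three decimal places, which supports the substitution carried out above.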

 Exercise 8.1.4
Let \(f(x) = \sqrt{1 - x}\) over the interval [0, 1/2]. Find the surface area of the surface generated by revolving the graph of \(f(x)\)
around the x-axis. Round the answer to three decimal places.

Hint
Use the process from the previous example.

Answer
\(\frac{\pi}{6}\left(5\sqrt{5} - 3\sqrt{3}\right) \approx 3.133\)

 Example 8.1.5: Calculating the Surface Area of a Surface of Revolution 2


Let \(f(x) = y = \sqrt[3]{3x}\). Consider the portion of the curve where \(0 \le y \le 2\). Find the surface area of the surface generated by
revolving the graph of \(f(x)\) around the y-axis.
Solution
Notice that we are revolving the curve around the y-axis, and the interval is in terms of y, so we want to rewrite the function as
a function of y. We get \(x = g(y) = (1/3)y^3\). The graph of \(g(y)\) and the surface of rotation are shown in the following figure.

Figure 8.1.11 : (a) The graph of g(y). (b) The surface of revolution.
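We have \(g(y) = (1/3)y^3\), so \(g'(y) = y^2\) and \((g'(y))^2 = y^4\). Then
\begin{align*}
\text{Surface Area} &= \int_c^d \left(2\pi g(y) \sqrt{1 + (g'(y))^2}\right) dy \\
&= \int_0^2 \left(2\pi \left(\frac{1}{3} y^3\right) \sqrt{1 + y^4}\right) dy \\
&= \frac{2\pi}{3} \int_0^2 \left(y^3 \sqrt{1 + y^4}\right) dy.
\end{align*}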

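Let \(u = y^4 + 1\). Then \(du = 4y^3\, dy\). When \(y = 0\), \(u = 1\), and when \(y = 2\), \(u = 17\). Then
\begin{align*}
\frac{2\pi}{3} \int_0^2 \left(y^3 \sqrt{1 + y^4}\right) dy &= \frac{2\pi}{3} \int_1^{17} \frac{1}{4} \sqrt{u}\, du \\
&= \frac{\pi}{6} \left[\frac{2}{3} u^{3/2}\right]_1^{17} = \frac{\pi}{9}\left[(17)^{3/2} - 1\right] \approx 24.118.
\end{align*}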

 Exercise 8.1.5
Let \(g(y) = \sqrt{9 - y^2}\) over the interval \(y \in [0, 2]\). Find the surface area of the surface generated by revolving the graph of \(g(y)\)
around the y-axis.

Hint
Use the process from the previous example.

Answer
12π

Key Concepts
The arc length of a curve can be calculated using a definite integral.
The arc length is first approximated using line segments, which generates a Riemann sum. Taking a limit then gives us the
definite integral formula. The same process can be applied to functions of y .
The concepts used to calculate the arc length can be generalized to find the surface area of a surface of revolution.
The integrals generated by both the arc length and surface area formulas are often difficult to evaluate. It may be necessary to
use a computer or calculator to approximate the values of the integrals.

Key Equations
Arc Length of a Function of x
\[ \text{Arc Length} = \int_a^b \sqrt{1 + [f'(x)]^2}\, dx \]

Arc Length of a Function of y
\[ \text{Arc Length} = \int_c^d \sqrt{1 + [g'(y)]^2}\, dy \]

Surface Area of a Function of x
\[ \text{Surface Area} = \int_a^b \left(2\pi f(x) \sqrt{1 + (f'(x))^2}\right) dx \]

Glossary
arc length
the arc length of a curve can be thought of as the distance a person would travel along the path of the curve

frustum
a portion of a cone; a frustum is constructed by cutting the cone with a plane parallel to the base

surface area
the surface area of a solid is the total area of the outer layer of the object; for objects such as cubes or bricks, the surface area of
the object is the sum of the areas of all of its faces

8.1: Arc Length is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
6.4: Arc Length of a Curve and Surface Area by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0. Original source:
https://openstax.org/details/books/calculus-volume-1.

8.2: Area of a Surface of Revolution
The concepts we used to find the arc length of a curve can be extended to find the surface area of a surface of revolution. Surface
area is the total area of the outer layer of an object. For objects such as cubes or bricks, the surface area of the object is the sum of
the areas of all of its faces. For curved surfaces, the situation is a little more complex. Let f (x) be a nonnegative smooth function
over the interval [a, b]. We wish to find the surface area of the surface of revolution created by revolving the graph of y = f (x)
around the x-axis as shown in the following figure.

Figure 8.2.4: (a) A curve representing the function \(f(x)\). (b) The surface of revolution formed by revolving the graph of \(f(x)\) around the x-axis.


As we have done many times before, we are going to partition the interval [a, b] and approximate the surface area by calculating
the surface area of simpler shapes. We start by using line segments to approximate the curve, as we did earlier in this section. For
\(i = 0, 1, 2, \ldots, n\), let \(P = \{x_i\}\) be a regular partition of \([a, b]\). Then, for \(i = 1, 2, \ldots, n\), construct a line segment from the point
\((x_{i-1}, f(x_{i-1}))\) to the point \((x_i, f(x_i))\). Now, revolve these line segments around the x-axis to generate an approximation of the
surface of revolution as shown in the following figure.

Figure 8.2.5: (a) Approximating \(f(x)\) with line segments. (b) The surface of revolution formed by revolving the line segments
around the x-axis.
Notice that when each line segment is revolved around the axis, it produces a band. These bands are actually pieces of cones (think
of an ice cream cone with the pointy end cut off). A piece of a cone like this is called a frustum of a cone.
To find the surface area of the band, we need to find the lateral surface area, \(S\), of the frustum (the area of just the slanted outside
surface of the frustum, not including the areas of the top or bottom faces). Let \(r_1\) and \(r_2\) be the radii of the wide end and the narrow
end of the frustum, respectively, and let \(l\) be the slant height of the frustum as shown in the following figure.

Figure 8.2.6 : A frustum of a cone can approximate a small part of surface area.
We know the lateral surface area of a cone is given by

Lateral Surface Area = πrs,

where r is the radius of the base of the cone and s is the slant height (Figure 8.2.7).

Figure 8.2.7 : The lateral surface area of the cone is given by πrs .
Since a frustum can be thought of as a piece of a cone, the lateral surface area of the frustum is given by the lateral surface area of
the whole cone less the lateral surface area of the smaller cone (the pointy tip) that was cut off (Figure 8.2.8).

Figure 8.2.8 : Calculating the lateral surface area of a frustum of a cone.


The cross-sections of the small cone and the large cone are similar triangles, so we see that
\[ \frac{r_2}{r_1} = \frac{s - l}{s}. \]

Solving for \(s\), we get
\begin{align*}
\frac{r_2}{r_1} &= \frac{s - l}{s} \\
r_2 s &= r_1 (s - l) \\
r_2 s &= r_1 s - r_1 l \\
r_1 l &= r_1 s - r_2 s \\
r_1 l &= (r_1 - r_2) s \\
\frac{r_1 l}{r_1 - r_2} &= s.
\end{align*}

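Then the lateral surface area (SA) of the frustum is
\begin{align*}
S &= (\text{Lateral SA of large cone}) - (\text{Lateral SA of small cone}) \\
&= \pi r_1 s - \pi r_2 (s - l) \\
&= \pi r_1 \left(\frac{r_1 l}{r_1 - r_2}\right) - \pi r_2 \left(\frac{r_1 l}{r_1 - r_2} - l\right) \\
&= \frac{\pi r_1^2 l}{r_1 - r_2} - \frac{\pi r_1 r_2 l}{r_1 - r_2} + \pi r_2 l \\
&= \frac{\pi r_1^2 l}{r_1 - r_2} - \frac{\pi r_1 r_2 l}{r_1 - r_2} + \frac{\pi r_2 l (r_1 - r_2)}{r_1 - r_2} \\
&= \frac{\pi r_1^2 l}{r_1 - r_2} - \frac{\pi r_1 r_2 l}{r_1 - r_2} + \frac{\pi r_1 r_2 l}{r_1 - r_2} - \frac{\pi r_2^2 l}{r_1 - r_2} \\
&= \frac{\pi (r_1^2 - r_2^2) l}{r_1 - r_2} = \frac{\pi (r_1 - r_2)(r_1 + r_2) l}{r_1 - r_2} \\
&= \pi (r_1 + r_2) l.
\end{align*}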

Let’s now use this formula to calculate the surface area of each of the bands formed by revolving the line segments around the
x-axis. A representative band is shown in the following figure.

Figure 8.2.9 : A representative band used for determining surface area.


Note that the slant height of this frustum is just the length of the line segment used to generate it. So, applying the surface area
formula, we have
\begin{align*}
S &= \pi (r_1 + r_2) l \\
&= \pi (f(x_{i-1}) + f(x_i)) \sqrt{(\Delta x)^2 + (\Delta y_i)^2} \\
&= \pi (f(x_{i-1}) + f(x_i)) \Delta x \sqrt{1 + \left(\frac{\Delta y_i}{\Delta x}\right)^2}.
\end{align*}

Now, as we did in the development of the arc length formula, we apply the Mean Value Theorem to select \(x_i^* \in [x_{i-1}, x_i]\) such that
\(f'(x_i^*) = (\Delta y_i)/\Delta x\). This gives us
\[ S = \pi (f(x_{i-1}) + f(x_i)) \Delta x \sqrt{1 + (f'(x_i^*))^2}. \]

Furthermore, since \(f(x)\) is continuous, by the Intermediate Value Theorem, there is a point \(x_i^{**} \in [x_{i-1}, x_i]\) such that
\(f(x_i^{**}) = (1/2)[f(x_{i-1}) + f(x_i)]\),
so we get
\[ S = 2\pi f(x_i^{**}) \Delta x \sqrt{1 + (f'(x_i^*))^2}. \]

Then the approximate surface area of the whole surface of revolution is given by
\[ \text{Surface Area} \approx \sum_{i=1}^{n} 2\pi f(x_i^{**}) \Delta x \sqrt{1 + (f'(x_i^*))^2}. \]

This almost looks like a Riemann sum, except we have functions evaluated at two different points, \(x_i^*\) and \(x_i^{**}\), over the interval
\([x_{i-1}, x_i]\). Although we do not examine the details here, it turns out that because \(f(x)\) is smooth, if we let \(n \to \infty\), the limit works
the same as a Riemann sum even with the two different evaluation points. This makes sense intuitively. Both \(x_i^*\) and \(x_i^{**}\) are
in the interval \([x_{i-1}, x_i]\), so it makes sense that as \(n \to \infty\), both \(x_i^*\) and \(x_i^{**}\) approach \(x\). Those of you who are interested in the
details should consult an advanced calculus text.
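The convergence described here can be observed numerically. The following sketch sums the areas of the frustum bands directly, using \(S = \pi(r_1 + r_2)l\) for each band. The function name `frustum_sum` and the sample curve \(f(x) = \sqrt{x}\) on [1, 4] are assumptions made for illustration only.

```python
import math

def frustum_sum(f, a, b, n):
    """Sum of the n frustum band areas pi * (f(x_{i-1}) + f(x_i)) * (segment length)."""
    dx = (b - a) / n
    total = 0.0
    for i in range(1, n + 1):
        x0, x1 = a + (i - 1) * dx, a + i * dx
        slant = math.hypot(dx, f(x1) - f(x0))  # length of the approximating line segment
        total += math.pi * (f(x0) + f(x1)) * slant
    return total

# Sample curve chosen for illustration: f(x) = sqrt(x) on [1, 4].
for n in (4, 16, 64, 256):
    print(n, frustum_sum(math.sqrt, 1, 4, n))
```

As n grows, the printed sums settle toward a single value, which is the behavior the limit argument relies on.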


Taking the limit as \(n \to \infty\), we get
\begin{align*}
\text{Surface Area} &= \lim_{n \to \infty} \sum_{i=1}^{n} 2\pi f(x_i^{**}) \Delta x \sqrt{1 + (f'(x_i^*))^2} \\
&= \int_a^b \left(2\pi f(x) \sqrt{1 + (f'(x))^2}\right) dx.
\end{align*}

As with arc length, we can conduct a similar development for functions of y to get a formula for the surface area of surfaces of
revolution about the y-axis. These findings are summarized in the following theorem.

 Surface Area of a Surface of Revolution


Let \(f(x)\) be a nonnegative smooth function over the interval \([a, b]\). Then, the surface area of the surface of revolution formed
by revolving the graph of \(f(x)\) around the x-axis is given by
\[ \text{Surface Area} = \int_a^b \left(2\pi f(x) \sqrt{1 + (f'(x))^2}\right) dx. \]

Similarly, let \(g(y)\) be a nonnegative smooth function over the interval \([c, d]\). Then, the surface area of the surface of revolution
formed by revolving the graph of \(g(y)\) around the y-axis is given by
\[ \text{Surface Area} = \int_c^d \left(2\pi g(y) \sqrt{1 + (g'(y))^2}\right) dy. \]

 Example 8.2.4: Calculating the Surface Area of a Surface of Revolution 1.

Let \(f(x) = \sqrt{x}\) over the interval [1, 4]. Find the surface area of the surface generated by revolving the graph of \(f(x)\) around
the x-axis. Round the answer to three decimal places.
Solution
The graph of f (x) and the surface of rotation are shown in Figure 8.2.10.

Figure 8.2.10 : (a) The graph of f (x). (b) The surface of revolution.

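We have \(f(x) = \sqrt{x}\). Then, \(f'(x) = 1/(2\sqrt{x})\) and \((f'(x))^2 = 1/(4x)\). Then,
\begin{align*}
\text{Surface Area} &= \int_a^b \left(2\pi f(x) \sqrt{1 + (f'(x))^2}\right) dx \\
&= \int_1^4 \left(2\pi \sqrt{x} \sqrt{1 + \frac{1}{4x}}\right) dx \\
&= \int_1^4 \left(2\pi \sqrt{x + \frac{1}{4}}\right) dx.
\end{align*}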

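Let \(u = x + 1/4\). Then, \(du = dx\). When \(x = 1\), \(u = 5/4\), and when \(x = 4\), \(u = 17/4\). This gives us
\begin{align*}
\int_1^4 \left(2\pi \sqrt{x + \frac{1}{4}}\right) dx &= \int_{5/4}^{17/4} 2\pi \sqrt{u}\, du \\
&= 2\pi \left[\frac{2}{3} u^{3/2}\right]_{5/4}^{17/4} \\
&= \frac{\pi}{6}\left[17\sqrt{17} - 5\sqrt{5}\right] \approx 30.846.
\end{align*}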

 Exercise 8.2.4
Let \(f(x) = \sqrt{1 - x}\) over the interval [0, 1/2]. Find the surface area of the surface generated by revolving the graph of \(f(x)\)
around the x-axis. Round the answer to three decimal places.

Hint
Use the process from the previous example.

Answer
\(\frac{\pi}{6}\left(5\sqrt{5} - 3\sqrt{3}\right) \approx 3.133\)

 Example 8.2.5: Calculating the Surface Area of a Surface of Revolution 2


Let \(f(x) = y = \sqrt[3]{3x}\). Consider the portion of the curve where \(0 \le y \le 2\). Find the surface area of the surface generated by
revolving the graph of \(f(x)\) around the y-axis.
Solution

Notice that we are revolving the curve around the y-axis, and the interval is in terms of y, so we want to rewrite the function as
a function of y. We get \(x = g(y) = (1/3)y^3\). The graph of \(g(y)\) and the surface of rotation are shown in the following figure.

Figure 8.2.11 : (a) The graph of g(y). (b) The surface of revolution.
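We have \(g(y) = (1/3)y^3\), so \(g'(y) = y^2\) and \((g'(y))^2 = y^4\). Then
\begin{align*}
\text{Surface Area} &= \int_c^d \left(2\pi g(y) \sqrt{1 + (g'(y))^2}\right) dy \\
&= \int_0^2 \left(2\pi \left(\frac{1}{3} y^3\right) \sqrt{1 + y^4}\right) dy \\
&= \frac{2\pi}{3} \int_0^2 \left(y^3 \sqrt{1 + y^4}\right) dy.
\end{align*}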

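Let \(u = y^4 + 1\). Then \(du = 4y^3\, dy\). When \(y = 0\), \(u = 1\), and when \(y = 2\), \(u = 17\). Then
\begin{align*}
\frac{2\pi}{3} \int_0^2 \left(y^3 \sqrt{1 + y^4}\right) dy &= \frac{2\pi}{3} \int_1^{17} \frac{1}{4} \sqrt{u}\, du \\
&= \frac{\pi}{6} \left[\frac{2}{3} u^{3/2}\right]_1^{17} = \frac{\pi}{9}\left[(17)^{3/2} - 1\right] \approx 24.118.
\end{align*}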
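The substitution in this example can be cross-checked by integrating the original y-axis integrand numerically. This is a minimal sketch using only the standard library. The helper `simpson` is a hypothetical name, and Simpson's rule is our choice of method rather than anything the example requires.

```python
import math

def simpson(func, a, b, n=2000):
    """Composite Simpson's rule on [a, b]; n must be even."""
    h = (b - a) / n
    odd = sum(func(a + k * h) for k in range(1, n, 2))
    even = sum(func(a + k * h) for k in range(2, n, 2))
    return (func(a) + 4 * odd + 2 * even + func(b)) * h / 3

def g(y):
    return y ** 3 / 3          # x = g(y) = (1/3) y^3

def dg(y):
    return y ** 2              # g'(y) = y^2

def integrand(y):
    # 2 * pi * g(y) * sqrt(1 + (g'(y))^2), the y-axis surface area integrand
    return 2 * math.pi * g(y) * math.sqrt(1 + dg(y) ** 2)

numeric = simpson(integrand, 0, 2)
closed_form = math.pi / 9 * (17 ** 1.5 - 1)
print(round(numeric, 3), round(closed_form, 3))
```

Both values round to the same three-decimal result reported in the example.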

 Exercise 8.2.5
Let \(g(y) = \sqrt{9 - y^2}\) over the interval \(y \in [0, 2]\). Find the surface area of the surface generated by revolving the graph of \(g(y)\)
around the y-axis.

Hint
Use the process from the previous example.

Answer
12π

8.2: Area of a Surface of Revolution is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
6.4: Arc Length of a Curve and Surface Area by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0. Original source:
https://openstax.org/details/books/calculus-volume-1.

8.3: Applications to Physics and Engineering
Learning Objectives
In this section, we strive to understand the ideas generated by the following important questions:
How do we measure the work accomplished by a varying force that moves an object a certain distance?
What is the total force exerted by water against a dam?
How are both of the above concepts and their corresponding use of definite integrals similar to problems we have
encountered in the past involving formulas such as “distance equals rate times time” and “mass equals density times
volume”?

In our work to date with the definite integral, we have seen several different circumstances where the integral enables us to measure
the accumulation of a quantity that varies, provided the quantity is approximately constant over small intervals. For instance, based
on the fact that the area of a rectangle is \(A = l \cdot w\), if we wish to find the area bounded by a nonnegative curve \(y = f(x)\) and the
x-axis on an interval \([a, b]\), a representative slice of width \(\Delta x\) has area \(A_{\text{slice}} = f(x)\Delta x\), and thus as we let the width of the
representative slice tend to zero, we find that the exact area of the region is
\[ A = \int_a^b f(x)\, dx. \tag{8.3.1} \]

In a similar way, if we know that the velocity of a moving object is given by the function \(y = v(t)\), and we wish to know the
distance the object travels on an interval \([a, b]\) where \(v(t)\) is nonnegative, we can use a definite integral to generalize the fact that
\(d = r \cdot t\) when the rate, \(r\), is constant. More specifically, on a short time interval \(\Delta t\), \(v(t)\) is roughly constant, and hence for a
small slice of time, \(d_{\text{slice}} = v(t)\Delta t\), and so as the width of the time interval \(\Delta t\) tends to zero, the exact distance traveled is given
by the definite integral
\[ d = \int_a^b v(t)\, dt. \tag{8.3.2} \]

Finally, when we recently learned about the mass of an object of non-constant density, we saw that since \(M = D \cdot V\) (mass equals
density times volume, provided that density is constant), if we can consider a small slice of an object on which the density is
approximately constant, a definite integral may be used to determine the exact mass of the object. For instance, if we have a thin
rod whose cross sections have constant density, but whose density is distributed along the x-axis according to the function
\(y = \rho(x)\), it follows that for a small slice of the rod that is \(\Delta x\) thick, \(M_{\text{slice}} = \rho(x)\Delta x\). In the limit as \(\Delta x \to 0\), we then find that
the total mass is given by
\[ M = \int_a^b \rho(x)\, dx. \tag{8.3.3} \]

Note that all three of these situations are similar in that we have a basic rule (A = l ⋅ w, d = r ⋅ t, M = D ⋅ V ) where one of the
two quantities being multiplied is no longer constant; in each, we consider a small interval for the other variable in the formula,
calculate the approximate value of the desired quantity (area, distance, or mass) over the small interval, and then use a definite
integral to sum the results as the length of the small intervals is allowed to approach zero. It should be apparent that this approach
will work effectively for other situations where we have a quantity of interest that varies. We next turn to the notion of work: from
physics, a basic principle is that work is the product of force and distance. For example, if a person exerts a force of 20 pounds to
lift a 20-pound weight 4 feet off the ground, the total work accomplished is
\begin{align}
W &= F \cdot d \tag{8.3.4} \\
&= 20 \cdot 4 \tag{8.3.5} \\
&= 80 \text{ foot-pounds}. \tag{8.3.6}
\end{align}

If force and distance are measured in English units (pounds and feet), then the units on work are foot-pounds. If instead we work in
metric units, where forces are measured in Newtons and distances in meters, the units on work are Newton-meters.

Figure 6.14: Three settings where we compute the accumulation of a varying quantity: the area under y = f (x), the distance
traveled by an object with velocity y = v(t) , and the mass of a bar with density function y = ρ(x).
Of course, the formula \(W = F \cdot d\) only applies when the force is constant while it is exerted over the distance \(d\). In Preview
Activity 8.3.1, we explore one way that we can use a definite integral to compute the total work accomplished when the force exerted
varies.

Preview Activity 8.3.1

A bucket is being lifted from the bottom of a 50-foot deep well; its weight (including the water), B , in pounds at a height h
feet above the water is given by the function B(h) . When the bucket leaves the water, the bucket and water together weigh
B(0) = 20 pounds, and when the bucket reaches the top of the well, B(50) = 12 pounds. Assume that the bucket loses water

at a constant rate (as a function of height, h ) throughout its journey from the bottom to the top of the well.
a. Find a formula for B(h) .
b. Compute the value of the product B(5)Δh , where Δh = 2 feet. Include units on your answer. Explain why this product
represents the approximate work it took to move the bucket of water from h = 5 to h = 7 .
c. Is the value in (b) an over- or under-estimate of the actual amount of work it took to move the bucket from h = 5 to h = 7 ?
Why?
d. Compute the value of the product B(22)Δh, where Δh = 0.25 feet. Include units on your answer. What is the meaning of
the value you found?
e. More generally, what does the quantity \(W_{\text{slice}} = B(h)\Delta h\) measure for a given value of \(h\) and a small positive value of
\(\Delta h\)?
f. Evaluate the definite integral \(\int_0^{50} B(h)\, dh\). What is the meaning of the value you find? Why?

Work
Because work is calculated by the rule \(W = F \cdot d\) whenever the force \(F\) is constant, it follows that we can use a definite integral
to compute the work accomplished by a varying force. For example, suppose that in a setting similar to the problem posed in
Preview Activity 8.3.1, we have a bucket being lifted in a 50-foot well whose weight at height \(h\) is given by
\[ B(h) = 12 + 8e^{-0.1h}. \tag{8.3.7} \]

In contrast to the problem in the preview activity, this bucket is not leaking at a constant rate; but because the weight of the bucket
and water is not constant, we have to use a definite integral to determine the total work that results from lifting the bucket. Observe
that at a height \(h\) above the water, the approximate work to move the bucket a small distance \(\Delta h\) is
\[ W_{\text{slice}} = B(h)\Delta h = (12 + 8e^{-0.1h})\Delta h. \tag{8.3.8} \]

Hence, if we let \(\Delta h\) tend to 0 and take the sum of all of the slices of work accomplished on these small intervals, it follows that the
total work is given by
\[ W = \int_0^{50} B(h)\, dh = \int_0^{50} (12 + 8e^{-0.1h})\, dh. \tag{8.3.9} \]

While it is a straightforward exercise to evaluate this integral exactly using the First Fundamental Theorem of Calculus, in applied
settings such as this one we will typically use computing technology to find accurate approximations of integrals that are of interest
to us. Here, it turns out that
\[ W = \int_0^{50} (12 + 8e^{-0.1h})\, dh \approx 679.461 \text{ foot-pounds}. \tag{8.3.10} \]
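To illustrate how such an approximation might be computed, the sketch below adds up the slice contributions \(B(h)\Delta h\) with a midpoint Riemann sum and compares the result with the exact value \(680 - 80e^{-5}\) obtained from the antiderivative \(12h - 80e^{-0.1h}\). The function names are hypothetical, and the midpoint rule is just one reasonable choice of method.

```python
import math

def bucket_weight(h):
    """Weight in pounds of the bucket and water at height h feet: B(h) = 12 + 8 e^(-0.1 h)."""
    return 12 + 8 * math.exp(-0.1 * h)

def riemann_work(force, a, b, n):
    """Midpoint Riemann sum of force(h) * dh over [a, b] using n slices."""
    dh = (b - a) / n
    return sum(force(a + (k + 0.5) * dh) for k in range(n)) * dh

# Exact value from the antiderivative 12h - 80 e^(-0.1 h), evaluated from 0 to 50.
exact = 680 - 80 * math.exp(-5)

for n in (5, 50, 500):
    print(n, riemann_work(bucket_weight, 0, 50, n))
print("exact:", exact)
```

With more slices, the sum approaches the exact value, which rounds to 679.461 foot-pounds.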

Our work in Preview Activity 8.3.1 and in the most recent example above employs the following important general principle.
For an object being moved in the positive direction along an axis, \(x\), by a force \(F(x)\), the total work to move the object from \(a\) to \(b\)
is given by
\[ W = \int_a^b F(x)\, dx. \tag{8.3.11} \]

Activity 8.3.1

Consider the following situations in which a varying force accomplishes work.


a. Suppose that a heavy rope hangs over the side of a cliff. The rope is 200 feet long and weighs 0.3 pounds per foot; initially
the rope is fully extended. How much work is required to haul in the entire length of the rope? (Hint: set up a function
F (h) whose value is the weight of the rope remaining over the cliff after h feet have been hauled in.)

b. A leaky bucket is being hauled up from a 100 foot deep well. When lifted from the water, the bucket and water together
weigh 40 pounds. As the bucket is being hauled upward at a constant rate, the bucket leaks water at a constant rate so that it
is losing weight at a rate of 0.1 pounds per foot. What function B(h) tells the weight of the bucket after the bucket has been
lifted h feet? What is the total amount of work accomplished in lifting the bucket to the top of the well?
c. Now suppose that the bucket in (b) does not leak at a constant rate, but rather that its weight at a height \(h\) feet above the
water is given by \(B(h) = 25 + 15e^{-0.05h}\). What is the total work required to lift the bucket 100 feet? What is the average
force exerted on the bucket on the interval \(h = 0\) to \(h = 100\)?
d. From physics, Hooke’s Law for springs states that the amount of force required to hold a spring that is compressed (or
extended) to a particular length is proportional to the distance the spring is compressed (or extended) from its natural
length. That is, the force to compress (or extend) a spring \(x\) units from its natural length is \(F(x) = kx\) for some constant \(k\)
(which is called the spring constant). For springs, we choose to measure the force in pounds and the distance the spring is
compressed in feet. Suppose that a force of 5 pounds extends a particular spring 4 inches (1/3 foot) beyond its natural
length.
i. Use the given fact that F (1/3) = 5 to find the spring constant k .
ii. Find the work done to extend the spring from its natural length to 1 foot beyond its natural length.
iii. Find the work required to extend the spring from 1 foot beyond its natural length to 1.5 feet beyond its natural length.

Work: Pumping Liquid from a Tank


In certain geographic locations where the water table is high, residential homes with basements have a peculiar feature: in the
basement, one finds a large hole in the floor, and in the hole, there is water. Figure 6.15 shows an example, known as a sump
crock.

Figure 6.15: A sump crock. Image credit to www.warreninspect.com/basement-moisture.
Essentially, a sump crock provides an outlet for water that may build up beneath the basement floor; of course, as that water rises, it
is imperative that the water not flood the basement. Hence, in the crock we see the presence of a floating pump that sits on the
surface of the water: this pump is activated by elevation, so when the water level reaches a particular height, the pump turns on and
pumps a certain portion of the water out of the crock, hence relieving the water buildup beneath the foundation. One of the
questions we’d like to answer is: how much work does a sump pump accomplish? To that end, let’s suppose that we have a sump
crock that has the shape of a frustum of a cone, as pictured in Figure 6.16. Assume that the crock has a diameter of 3 feet at its
surface, a diameter of 1.5 feet at its base, and a depth of 4 feet. In addition, suppose that the sump pump is set up so that it pumps
the water vertically up a pipe to a drain that is located at ground level just outside a basement window. To accomplish this, the
pump must send the water to a location 9 feet above the surface of the sump crock.

Figure 6.16: A sump crock with approximately cylindrical cross-sections that is 4 feet deep, 1.5 feet in diameter at its base, and 3
feet in diameter at its top.
It turns out to be advantageous to think of the depth below the surface of the crock as being the independent variable, so, in
problems such as this one we typically let the positive x-axis point down, and the positive y -axis to the right, as pictured in the
figure. As we think about the work that the pump does, we first realize that the pump sits on the surface of the water, so it makes
sense to think about the pump moving the water one “slice” at a time, where it takes a thin slice from the surface, pumps it out of
the tank, and then proceeds to pump the next slice below. For the sump crock described in this example, each slice of water is
cylindrical in shape. We see that the radius of each approximately cylindrical slice varies according to the linear function y = f (x)
that passes through the points (0, 1.5) and (4, 0.75), where x is the depth of the particular slice in the tank; it is a straightforward
exercise to find that f (x) = 1.5 − 0.1875x. Now we are prepared to think about the overall problem in several steps:
a. determining the volume of a typical slice;
b. finding the weight (We assume that the weight density of water is 62.4 pounds per cubic foot) of a typical slice (and thus the
force that must be exerted on it)
c. deciding the distance that a typical slice moves; and
d. computing the work to move a representative slice. Once we know the work it takes to move one slice, we use a definite
integral over an appropriate interval to find the total work.
Consider a representative cylindrical slice that sits on the surface of the water at a depth of \(x\) feet below the top of the crock. It
follows that the approximate volume of that slice is given by
\[ V_{\text{slice}} = \pi f(x)^2 \Delta x = \pi (1.5 - 0.1875x)^2 \Delta x. \]
Since water weighs 62.4 lb/ft³, it follows that the approximate weight of a representative slice, which is also the approximate force
the pump must exert to move the slice, is
\[ F_{\text{slice}} = 62.4 \cdot V_{\text{slice}} = 62.4\pi (1.5 - 0.1875x)^2 \Delta x. \]
Because the slice is located at a depth of \(x\) feet below the top of the crock, the slice being moved by the pump must move \(x\) feet to
get to the level of the basement floor, and then, as stated in the problem description, be moved another 9 feet to reach the drain at
ground level outside a basement window. Hence, the total distance a representative slice travels is
\[ d_{\text{slice}} = x + 9. \]

Finally, we note that the work to move a representative slice is given by
\[ W_{\text{slice}} = F_{\text{slice}} \cdot d_{\text{slice}} = 62.4\pi (1.5 - 0.1875x)^2 \Delta x \cdot (x + 9), \]
since the force to move a particular slice is constant. We sum the work required to move slices throughout the tank (from \(x = 0\) to
\(x = 4\)), let \(\Delta x \to 0\), and hence
\[ W = \int_0^4 62.4\pi (1.5 - 0.1875x)^2 (x + 9)\, dx, \]
which, when evaluated using appropriate technology, shows that the total work is \(W = 3463.2\pi \approx 10{,}880\) foot-pounds.
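As one example of "appropriate technology," the sketch below evaluates the sump crock work integral with Simpson's rule, which is exact (up to rounding) for a cubic integrand such as this one. The function and constant names are our own, introduced only for illustration. Expanding the integrand by hand gives \(\int_0^4 (1.5 - 0.1875x)^2 (x + 9)\, dx = 55.5\), so the integral equals \(62.4\pi \cdot 55.5 = 3463.2\pi\), roughly 10,880 foot-pounds, and the code reproduces that value.

```python
import math

WATER_WEIGHT_DENSITY = 62.4  # pounds per cubic foot, as assumed in the text

def radius(x):
    """Radius in feet of the slice at depth x feet: f(x) = 1.5 - 0.1875 x."""
    return 1.5 - 0.1875 * x

def work_integrand(x, lift=9.0):
    """Weight per unit depth of a slice times the distance (x + lift) it travels."""
    return WATER_WEIGHT_DENSITY * math.pi * radius(x) ** 2 * (x + lift)

def simpson(func, a, b, n=100):
    """Composite Simpson's rule on [a, b]; n must be even."""
    h = (b - a) / n
    odd = sum(func(a + k * h) for k in range(1, n, 2))
    even = sum(func(a + k * h) for k in range(2, n, 2))
    return (func(a) + 4 * odd + 2 * even + func(b)) * h / 3

work = simpson(work_integrand, 0, 4)
print(work / math.pi)  # coefficient of pi
print(work)            # total work in foot-pounds
```

Changing the `lift` argument or the limits of integration adapts the same sketch to a drain at a different height or to a partially emptied crock.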
The preceding example demonstrates the standard approach to finding the work required to empty a tank filled with liquid. The
main task in each such problem is to determine the volume of a representative slice, followed by the force exerted on the slice, as
well as the distance such a slice moves. In the case where the units are metric, there is one key difference: in the metric setting,
rather than weight, we normally first find the mass of a slice. For instance, if distance is measured in meters, the mass density of
water is 1000 kg/m³. In that setting, we can find the mass of a typical slice (in kg). To determine the force required to move it, we
use F = ma, where m is the object’s mass and a is the gravitational acceleration, 9.81 m/s² (equivalently, 9.81 N/kg). That is, in
metric units, the weight density of water is 9810 N/m³.

Activity 8.3.2

In each of the following problems, determine the total work required to accomplish the described task. In parts (b) and (c), a
key step is to find a formula for a function that describes the curve that forms the side boundary of the tank.

Figure 6.17: A trough with triangular ends, as described in Activity 6.11, part (c).
a. Consider a vertical cylindrical tank of radius 2 meters and depth 6 meters. Suppose the tank is filled with 4 meters of water
of mass density 1000 kg/m³, and the top 1 meter of water is pumped over the top of the tank.
b. Consider a hemispherical tank with a radius of 10 feet. Suppose that the tank is full to a depth of 7 feet with water of weight
density 62.4 pounds/ft³, and the top 5 feet of water are pumped out of the tank to a tanker truck whose height is 5 feet
above the top of the tank.
c. Consider a trough with triangular ends, as pictured in Figure 6.17, where the tank is 10 feet long, the top is 5 feet wide, and
the tank is 4 feet deep. Say that the trough is full to within 1 foot of the top with water of weight density 62.4 pounds/ft³,
and a pump is used to empty the tank until the water remaining in the tank is 1 foot deep.

Force due to Hydrostatic Pressure


When a dam is built, it is imperative for engineers to understand how much force water will exert against the face of the dam.
The first thing we realize is the force exerted by the fluid is related to the natural concept of pressure. The pressure a force exerts on
a region is measured in units of force per unit of area: for example, the air pressure in a tire is often measured in pounds per square
inch (PSI). Hence, we see that the general relationship is given by
\[ P = \frac{F}{A}, \quad \text{or} \quad F = P \cdot A, \]

where P represents pressure, F represents force, and A the area of the region being considered. Of course, in the equation F = PA,
we assume that the pressure is constant over the entire region A.
Most people know from experience that the deeper one dives underwater while swimming, the greater the pressure that is exerted
by the water. This is due to the fact that the deeper one dives, the more water there is right on top of the swimmer: it is the force
that “column” of water exerts that determines the pressure the swimmer experiences. To get water pressure measured in its standard
units (pounds per square foot), we say that the total water pressure is found by computing the total weight of the column of water
that lies above a region of area 1 square foot at a fixed depth. Such a rectangular column with a 1 × 1 base and a depth of d feet has
volume V = 1 · 1 · d ft³, and thus the corresponding weight of the water overhead is 62.4d pounds. Since this is also the amount of
force being exerted on a 1 square foot region at a depth d feet underwater, we see that P = 62.4d (lb/ft²) is the pressure exerted by
water at depth d.
The understanding that P = 62.4d will tell us the pressure exerted by water at a depth of d, along with the fact that F = PA, will now
enable us to compute the total force that water exerts on a dam, as we see in the following example.

Example 8.3.3

Consider a trapezoid-shaped dam that is 60 feet wide at its base and 90 feet wide at its top, and assume the dam is 25 feet tall
with water that rises to within 5 feet of the top of its face. Water weighs 62.4 pounds per cubic foot. How much force does the
water exert against the dam?
Solution
First, we sketch a picture of the dam, as shown in Figure 6.18. Note that, as in problems involving the work to pump out a tank,
we let the positive x-axis point down.
It is essential to use the fact that pressure is constant at a fixed depth. Hence, we consider a slice of water at constant depth on
the face, such as the one shown in the figure. First, the approximate area of this slice is the area of the pictured rectangle. Since
the width of that rectangle depends on the variable x (which represents how far the slice lies from the top of the dam), we
find a formula for the function y = f(x) that determines one side of the face of the dam. Since f is linear, with f(0) = 45 (half
the top width) and f(25) = 30 (half the base width), it is straightforward to find that \( y = f(x) = 45 - \frac{3}{5}x \). Hence,
the approximate area of a representative slice is
\[ A_{\text{slice}} = 2f(x) \, \Delta x = 2\left(45 - \frac{3}{5}x\right) \Delta x. \]

At any point on this slice, the depth is approximately constant, and thus the pressure can be considered constant. In particular,
we note that since x measures the distance to the top of the dam, and because the water rises to within 5 feet of the top of the
dam, the depth of any point on the representative slice is approximately (x − 5). Now, since pressure is given by P = 62.4d, we
have that at any point on the representative slice
\[ P_{\text{slice}} = 62.4(x - 5). \]

Figure 6.18: A trapezoidal dam that is 25 feet tall, 60 feet wide at its base, 90 feet wide at its top, with the water line 5 feet
down from the top of its face.

Knowing both the pressure and area, we can find the force the water exerts on the slice. Using F = PA, it follows that
\[ F_{\text{slice}} = P_{\text{slice}} \cdot A_{\text{slice}} = 62.4(x - 5) \cdot 2\left(45 - \frac{3}{5}x\right) \Delta x. \]

Finally, we use a definite integral to sum the forces over the appropriate range of x-values. Since the water rises to within 5
feet of the top of the dam, we start at x = 5 and slice all the way to the bottom of the 25-foot-tall dam, where x = 25. Hence,
\[ F = \int_{5}^{25} 62.4(x - 5) \cdot 2\left(45 - \frac{3}{5}x\right) dx. \]

Evaluating the integral, we find F = 848,640 ≈ 8.49 × 10^5 pounds.
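The slicing argument in this example can be mirrored directly by a Riemann sum. The following Python sketch is only an illustration: the function name `hydrostatic_force` and its parameters are assumptions made here. It adds up pressure times area over thin horizontal slices of the dam face, taking the bottom of the 25-foot-tall dam at x = 25, where the half-width 45 − (3/5)x equals 30 feet.

```python
def hydrostatic_force(water_line, bottom, n_slices=100_000, weight_density=62.4):
    """Midpoint Riemann sum of (pressure * area) over horizontal slices.

    x is measured in feet downward from the top of the dam.
    """
    dx = (bottom - water_line) / n_slices
    total = 0.0
    for i in range(n_slices):
        x = water_line + (i + 0.5) * dx              # depth coordinate of slice midpoint
        pressure = weight_density * (x - water_line)  # P = 62.4 * d, in lb/ft^2
        width = 2 * (45 - 0.6 * x)                    # full width 2 f(x) of the dam face
        total += pressure * width * dx                # F_slice = P_slice * A_slice
    return total


force = hydrostatic_force(water_line=5.0, bottom=25.0)
print(f"F = {force:,.0f} pounds")
```

Passing `bottom=30.0` instead shows how sensitive the total is to the lower limit: the sum then comes to about 1.248 × 10^6 pounds, which corresponds to a face extending 5 feet below the stated base.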

Activity 8.3.4

In each of the following problems, determine the total force exerted by water against the surface that is described.
a. Consider a rectangular dam that is 100 feet wide and 50 feet tall, and suppose that water presses against the dam all the way
to the top.
b. Consider a semicircular dam with a radius of 30 feet. Suppose that the water rises to within 10 feet of the top of the dam.
c. Consider a trough with triangular ends, as pictured in Figure 6.17, where the tank is 10 feet long, the top is 5 feet wide, and
the tank is 4 feet deep. Say that the trough is full to within 1 foot of the top with water of weight density 62.4 pounds/ft³.
How much force does the water exert against one of the triangular ends?

While there are many different formulas that we use in solving problems involving work, force, and pressure, it is important to
understand that the fundamental ideas behind these problems are similar to several others that we’ve encountered in applications of
the definite integral. In particular, the basic idea is to take a difficult problem and somehow slice it into more manageable pieces
that we understand, and then use a definite integral to add up these simpler pieces.

Summary
In this section, we encountered the following important ideas:
To measure the work accomplished by a varying force that moves an object, we subdivide the problem into pieces on which we
can use the formula W = F · d, and then use a definite integral to sum the work accomplished on each piece.
To find the total force exerted by water against a dam, we use the formula F = P · A to measure the force exerted on a slice that
lies at a fixed depth, and then use a definite integral to sum the forces across the appropriate range of depths.
Because work is computed as the product of force and distance (provided force is constant), and the force water exerts on a dam
can be computed as the product of pressure and area (provided pressure is constant), problems involving these concepts are
similar to earlier problems we did using definite integrals to find distance (via “distance equals rate times time”) and mass
(“mass equals density times volume”).

8.3: Applications to Physics and Engineering is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
6.4: Physics Applications - Work, Force, and Pressure by Matthew Boelkins, David Austin & Steven Schlicker is licensed CC BY-SA 4.0.
Original source: https://activecalculus.org/single.

8.4: Applications to Economics and Biology
 Learning Objectives
Use the exponential growth model in applications, including population growth and compound interest.
Explain the concept of doubling time.
Use the exponential decay model in applications, including radioactive decay and Newton’s law of cooling.
Explain the concept of half-life.

One of the most prevalent applications of exponential functions involves growth and decay models. Exponential growth and decay
show up in a host of natural applications. From population growth and continuously compounded interest to radioactive decay and
Newton’s law of cooling, exponential functions are ubiquitous in nature. In this section, we examine exponential growth and decay
in the context of some of these applications.

Exponential Growth Model


Many systems exhibit exponential growth. These systems follow a model of the form \( y = y_0 e^{kt} \), where \( y_0 \) represents
the initial state of the system and k is a positive constant, called the growth constant. Notice that in an exponential growth
model, we have
\[ y' = k y_0 e^{kt} = ky. \tag{8.4.1} \]

That is, the rate of growth is proportional to the current function value. This is a key feature of exponential growth. Equation 8.4.1
involves derivatives and is called a differential equation.

 Exponential Growth
Systems that exhibit exponential growth increase according to the mathematical model
\[ y = y_0 e^{kt}, \]

where \( y_0 \) represents the initial state of the system and k > 0 is a constant, called the growth constant.

Population growth is a common example of exponential growth. Consider a population of bacteria, for instance. It seems plausible
that the rate of population growth would be proportional to the size of the population. After all, the more bacteria there are to
reproduce, the faster the population grows. Figure 8.4.1 and Table 8.4.1 represent the growth of a population of bacteria with an
initial population of 200 bacteria and a growth constant of 0.02 per minute. Notice that after only 2 hours (120 minutes), the
population is about 11 times its original size!

Figure 8.4.1 : An example of exponential growth for bacteria.


Table 8.4.1: Exponential Growth of a Bacterial Population

Time (min)    Population Size (no. of bacteria)
10            244
20            298
30            364
40            445
50            544
60            664
70            811
80            991
90            1210
100           1478
110           1805
120           2205

Note that we are using a continuous function to model what is inherently discrete behavior. At any given time, the real-world
population contains a whole number of bacteria, although the model takes on noninteger values. When using exponential growth
models, we must always be careful to interpret the function values in the context of the phenomenon we are modeling.
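The rounding from the continuous model to whole bacteria can be made explicit with a short computation. This is a minimal Python sketch (the function name `population` is an illustrative choice) that evaluates 200e^{0.02t} every 10 minutes and rounds each value to the nearest integer, as in Table 8.4.1.

```python
import math


def population(t, y0=200, k=0.02):
    """Exponential growth model y = y0 * e^(k t), with t in minutes."""
    return y0 * math.exp(k * t)


# Evaluate the model every 10 minutes and round to a whole number of bacteria.
table = [(t, round(population(t))) for t in range(10, 121, 10)]

for t, size in table:
    print(f"{t:>4} min  {size:>5} bacteria")
```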

 Example 8.4.1: Population Growth

Consider the population of bacteria described earlier. This population grows according to the function \( f(t) = 200e^{0.02t} \),
where t is measured in minutes. How many bacteria are present in the population after 5 hours (300 minutes)? When does the
population reach 100,000 bacteria?
Solution
We have \( f(t) = 200e^{0.02t} \). Then
\[ f(300) = 200e^{0.02(300)} \approx 80{,}686. \]

There are 80,686 bacteria in the population after 5 hours.

To find when the population reaches 100,000 bacteria, we solve the equation
\[
\begin{aligned}
100{,}000 &= 200e^{0.02t} \\
500 &= e^{0.02t} \\
\ln 500 &= 0.02t \\
t &= \frac{\ln 500}{0.02} \approx 310.73.
\end{aligned}
\]

The population reaches 100,000 bacteria after 310.73 minutes.
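The algebra above amounts to the formula t = ln(target / y_0) / k. A small Python sketch of that formula follows; the name `time_to_reach` is hypothetical and chosen only for this illustration.

```python
import math


def time_to_reach(target, y0, k):
    """Solve target = y0 * e^(k t) for t."""
    return math.log(target / y0) / k


# Bacteria model from the example: y0 = 200, k = 0.02 per minute.
t_hit = time_to_reach(100_000, y0=200, k=0.02)
print(f"{t_hit:.2f} minutes")
```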

 Exercise 8.4.1
Consider a population of bacteria that grows according to the function \( f(t) = 500e^{0.05t} \), where t is measured in minutes.
How many bacteria are present in the population after 4 hours? When does the population reach 100 million bacteria?

Hint
Use the process from the previous example.

Answer
There are 81,377,396 bacteria in the population after 4 hours. The population reaches 100 million bacteria after 244.12 minutes.

8.4.2 https://math.libretexts.org/@go/page/4495
Let’s now turn our attention to a financial application: compound interest. Interest that is not compounded is called simple
interest. Simple interest is paid once, at the end of the specified time period (usually 1 year). So, if we put $1000 in a savings
account earning 2% simple interest per year, then at the end of the year we have

1000(1 + 0.02) = $1020.

Compound interest is paid multiple times per year, depending on the compounding period. Therefore, if the bank compounds the
interest every 6 months, it credits half of the year’s interest to the account after 6 months. During the second half of the year, the
account earns interest not only on the initial $1000, but also on the interest earned during the first half of the year. Mathematically
speaking, at the end of the year, we have
\[ 1000\left(1 + \frac{0.02}{2}\right)^2 = \$1020.10. \]

Similarly, if the interest is compounded every 4 months (three times per year), we have
\[ 1000\left(1 + \frac{0.02}{3}\right)^3 = \$1020.13, \]

and if the interest is compounded daily (365 times per year), we have $1020.20. If we extend this concept, so that the interest is
compounded continuously, after t years we have
\[ 1000 \lim_{n \to \infty} \left(1 + \frac{0.02}{n}\right)^{nt}. \]

Now let’s manipulate this expression so that we have an exponential growth function. Recall that the number e can be expressed as
a limit:
\[ e = \lim_{m \to \infty} \left(1 + \frac{1}{m}\right)^m. \]

Based on this, we want the expression inside the parentheses to have the form (1 + 1/m). Let n = 0.02m. Note that as
n → ∞, m → ∞ as well. Then we get
\[ 1000 \lim_{n \to \infty} \left(1 + \frac{0.02}{n}\right)^{nt} = 1000 \lim_{m \to \infty} \left(1 + \frac{0.02}{0.02m}\right)^{0.02mt} = 1000 \left[ \lim_{m \to \infty} \left(1 + \frac{1}{m}\right)^m \right]^{0.02t}. \]

We recognize the limit inside the brackets as the number e. So, the balance in our bank account after t years is given by
\( 1000e^{0.02t} \).

Generalizing this concept, we see that if a bank account with an initial balance of $P earns interest at a rate of r, compounded
continuously, then the balance of the account after t years is
\[ \text{Balance} = Pe^{rt}. \]
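To see the limit at work numerically, the following Python sketch compares the balance after one year for several compounding frequencies with the continuous formula. The function names are assumptions made for this illustration.

```python
import math


def compound_balance(principal, rate, periods_per_year, years=1.0):
    """Balance with interest compounded a fixed number of times per year."""
    return principal * (1 + rate / periods_per_year) ** (periods_per_year * years)


def continuous_balance(principal, rate, years=1.0):
    """Balance with continuously compounded interest: P * e^(r t)."""
    return principal * math.exp(rate * years)


for n in (1, 2, 3, 365, 1_000_000):
    print(f"n = {n:>9}: ${compound_balance(1000, 0.02, n):.4f}")
print(f"continuous : ${continuous_balance(1000, 0.02):.4f}")
```

As n grows, the compounded balances increase toward the continuous value 1000e^{0.02}, which is the behavior the limit describes.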

 Example 8.4.2: Compound Interest

A 25-year-old student is offered an opportunity to invest some money in a retirement account that pays 5% annual interest
compounded continuously. How much does the student need to invest today to have $1 million when she retires at age 65?
What if she could earn 6% annual interest compounded continuously instead?
Solution
We have
\[
\begin{aligned}
1{,}000{,}000 &= Pe^{0.05(40)} \\
P &= 1{,}000{,}000 \, e^{-2} \approx 135{,}335.28.
\end{aligned}
\]

She must invest $135,335.28 at 5% interest.

If, instead, she is able to earn 6%, then the equation becomes
\[
\begin{aligned}
1{,}000{,}000 &= Pe^{0.06(40)} \\
P &= 1{,}000{,}000 \, e^{-2.4} \approx 90{,}717.95.
\end{aligned}
\]

In this case, she needs to invest only $90,717.95. This is roughly two-thirds the amount she needs to invest at 5%. The fact that
the interest is compounded continuously greatly magnifies the effect of the 1% increase in interest rate.
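Solving Balance = Pe^{rt} for P gives P = Balance · e^{−rt}. A minimal Python sketch of that rearrangement is shown here; `required_investment` is a hypothetical name used only for illustration.

```python
import math


def required_investment(target, rate, years):
    """Principal needed today to reach `target` under continuous compounding."""
    return target * math.exp(-rate * years)


for rate in (0.05, 0.06):
    p = required_investment(1_000_000, rate, years=40)
    print(f"rate {rate:.0%}: invest ${p:,.2f}")
```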

 Exercise 8.4.2
Suppose instead of investing at age 25, the student waits until age 35. How much would she have to invest at 5%? At 6%?

Hint
Use the process from the previous example.

Answer
At 5% interest, she must invest $223,130.16. At 6% interest, she must invest $165,298.89.

If a quantity grows exponentially, the time it takes for the quantity to double remains constant. In other words, it takes the same
amount of time for a population of bacteria to grow from 100 to 200 bacteria as it does to grow from 10, 000 to 20, 000 bacteria.
This time is called the doubling time. To calculate the doubling time, we want to know when the quantity reaches twice its original
size. So we have
\[
\begin{aligned}
2y_0 &= y_0 e^{kt} \\
2 &= e^{kt} \\
\ln 2 &= kt \\
t &= \frac{\ln 2}{k}.
\end{aligned}
\]

 Definition: Doubling Time

If a quantity grows exponentially, the doubling time is the amount of time it takes the quantity to double. It is given by
\[ \text{Doubling time} = \frac{\ln 2}{k}. \]

 Example 8.4.3: Using the Doubling Time

Assume a population of fish grows exponentially. A pond is stocked initially with 500 fish. After 6 months, there are 1000 fish
in the pond. The owner will allow his friends and neighbors to fish on his pond after the fish population reaches 10, 000. When
will the owner’s friends be allowed to fish?
Solution
We know it takes the population of fish 6 months to double in size. So, if t represents time in months, by the doubling-time
formula, we have 6 = (ln 2)/k. Then, k = (ln 2)/6. Thus, the population is given by \( y = 500e^{((\ln 2)/6)t} \). To figure out
when the population reaches 10,000 fish, we must solve the following equation:
\[
\begin{aligned}
10{,}000 &= 500e^{((\ln 2)/6)t} \\
20 &= e^{((\ln 2)/6)t} \\
\ln 20 &= \left(\frac{\ln 2}{6}\right)t \\
t &= \frac{6(\ln 20)}{\ln 2} \approx 25.93.
\end{aligned}
\]

The owner’s friends have to wait 25.93 months (a little more than 2 years) to fish in the pond.
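The two steps of this solution (recover k from the doubling time, then solve for t) can be sketched in a few lines of Python. The function names below are illustrative assumptions, not notation from the text.

```python
import math


def growth_constant(doubling_time):
    """k = ln 2 / (doubling time)."""
    return math.log(2) / doubling_time


def wait_time(target, y0, doubling_time):
    """Time for y0 * e^(k t) to reach target, with k set by the doubling time."""
    return math.log(target / y0) / growth_constant(doubling_time)


months = wait_time(10_000, y0=500, doubling_time=6)
print(f"{months:.2f} months")
```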

 Exercise 8.4.3

Suppose it takes 9 months for the fish population in Example 8.4.3 to reach 1000 fish. Under these circumstances, how long
do the owner’s friends have to wait?

Hint
Use the process from the previous example.

Answer
38.90 months

Exponential Decay Model


Exponential functions can also be used to model populations that shrink (from disease, for example), or chemical compounds that
break down over time. We say that such systems exhibit exponential decay, rather than exponential growth. The model is nearly the
same, except there is a negative sign in the exponent. Thus, for some positive constant k, we have
\[ y = y_0 e^{-kt}. \]

As with exponential growth, there is a differential equation associated with exponential decay. We have
\[ y' = -k y_0 e^{-kt} = -ky. \]

 Exponential Decay

Systems that exhibit exponential decay behave according to the model
\[ y = y_0 e^{-kt}, \]

where \( y_0 \) represents the initial state of the system and k > 0 is a constant, called the decay constant.

Figure 8.4.2 shows a graph of a representative exponential decay function.

Figure 8.4.2 : An example of exponential decay.


Let’s look at a physical application of exponential decay. Newton’s law of cooling says that an object cools at a rate proportional to
the difference between the temperature of the object and the temperature of the surroundings. In other words, if T represents the
temperature of the object and \( T_a \) represents the ambient temperature in a room, then
\[ T' = -k(T - T_a). \]

Note that this is not quite the right model for exponential decay. We want the derivative to be proportional to the function, and this
expression has the additional \( T_a \) term. Fortunately, we can make a change of variables that resolves this issue. Let
\( y(t) = T(t) - T_a \). Then \( y'(t) = T'(t) - 0 = T'(t) \), and our equation becomes
\[ y' = -ky. \]

From our previous work, we know this relationship between y and its derivative leads to exponential decay. Thus,
\[ y = y_0 e^{-kt}, \]

and we see that
\[
\begin{aligned}
T - T_a &= (T_0 - T_a)e^{-kt} \\
T &= (T_0 - T_a)e^{-kt} + T_a,
\end{aligned}
\]

where \( T_0 \) represents the initial temperature. Let’s apply this formula in the following example.

 Example 8.4.4: Newton’s Law of Cooling

According to experienced baristas, the optimal temperature to serve coffee is between 155°F and 175°F. Suppose coffee is
poured at a temperature of 200°F, and after 2 minutes in a 70°F room it has cooled to 180°F. When is the coffee first cool
enough to serve? When is the coffee too cold to serve? Round answers to the nearest half minute.
Solution
We have
\[
\begin{aligned}
T &= (T_0 - T_a)e^{-kt} + T_a \\
180 &= (200 - 70)e^{-k(2)} + 70 \\
110 &= 130e^{-2k} \\
\frac{11}{13} &= e^{-2k} \\
\ln \frac{11}{13} &= -2k \\
\ln 11 - \ln 13 &= -2k \\
k &= \frac{\ln 13 - \ln 11}{2}.
\end{aligned}
\]

Then, the model is
\[ T = 130e^{((\ln 11 - \ln 13)/2)t} + 70. \]

The coffee reaches 175°F when
\[
\begin{aligned}
175 &= 130e^{((\ln 11 - \ln 13)/2)t} + 70 \\
105 &= 130e^{((\ln 11 - \ln 13)/2)t} \\
\frac{21}{26} &= e^{((\ln 11 - \ln 13)/2)t} \\
\ln \frac{21}{26} &= \frac{\ln 11 - \ln 13}{2} \, t \\
\ln 21 - \ln 26 &= \left(\frac{\ln 11 - \ln 13}{2}\right)t \\
t &= \frac{2(\ln 21 - \ln 26)}{\ln 11 - \ln 13} \approx 2.56.
\end{aligned}
\]

The coffee can be served about 2.5 minutes after it is poured. The coffee reaches 155°F at
\[
\begin{aligned}
155 &= 130e^{((\ln 11 - \ln 13)/2)t} + 70 \\
85 &= 130e^{((\ln 11 - \ln 13)/2)t} \\
\frac{17}{26} &= e^{((\ln 11 - \ln 13)/2)t} \\
\ln 17 - \ln 26 &= \left(\frac{\ln 11 - \ln 13}{2}\right)t \\
t &= \frac{2(\ln 17 - \ln 26)}{\ln 11 - \ln 13} \approx 5.09.
\end{aligned}
\]

The coffee is too cold to be served about 5 minutes after it is poured.
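The same computation can be organized as two reusable steps: fit k from one temperature reading, then solve the model for the time at which a target temperature is reached. The Python sketch below is an illustration under that reading of the example; the function names are assumptions.

```python
import math


def cooling_constant(T0, Ta, T1, t1):
    """Fit k in T = (T0 - Ta) e^(-k t) + Ta from one reading T(t1) = T1."""
    return math.log((T0 - Ta) / (T1 - Ta)) / t1


def time_to_cool(T_target, T0, Ta, k):
    """Solve T_target = (T0 - Ta) e^(-k t) + Ta for t."""
    return math.log((T0 - Ta) / (T_target - Ta)) / k


k = cooling_constant(T0=200, Ta=70, T1=180, t1=2)
first_servable = time_to_cool(175, T0=200, Ta=70, k=k)
too_cold = time_to_cool(155, T0=200, Ta=70, k=k)
print(f"k = {k:.4f} per minute")
print(f"servable from {first_servable:.2f} min until {too_cold:.2f} min")
```

Here k equals (ln 13 − ln 11)/2, so the two times agree with the closed-form expressions 2(ln 21 − ln 26)/(ln 11 − ln 13) and 2(ln 17 − ln 26)/(ln 11 − ln 13).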

 Exercise 8.4.4

Suppose the room is warmer (75°F ) and, after 2 minutes, the coffee has cooled only to 185°F . When is the coffee first cool
enough to serve? When is the coffee too cold to serve? Round answers to the nearest half minute.

Hint
Use the process from the previous example.

Answer
The coffee is first cool enough to serve about 3.5 minutes after it is poured. The coffee is too cold to serve about 7 minutes
after it is poured.

Just as systems exhibiting exponential growth have a constant doubling time, systems exhibiting exponential decay have a constant
half-life. To calculate the half-life, we want to know when the quantity reaches half its original size. Therefore, we have
\[
\begin{aligned}
\frac{y_0}{2} &= y_0 e^{-kt} \\
\frac{1}{2} &= e^{-kt} \\
-\ln 2 &= -kt \\
t &= \frac{\ln 2}{k}.
\end{aligned}
\]

Note: This is the same expression we came up with for doubling time.

 Definition: Half-Life
If a quantity decays exponentially, the half-life is the amount of time it takes the quantity to be reduced by half. It is given by
\[ \text{Half-life} = \frac{\ln 2}{k}. \]

 Example 8.4.5: Radiocarbon Dating


One of the most common applications of an exponential decay model is carbon dating. Carbon-14 decays (emits a radioactive
particle) at a regular and consistent exponential rate. Therefore, if we know how much carbon-14 was originally present in an
object and how much carbon-14 remains, we can determine the age of the object. The half-life of carbon-14 is approximately
5730 years—meaning, after that many years, half the material has converted from the original carbon-14 to the new
nonradioactive nitrogen-14. If we have 100 g carbon-14 today, how much is left in 50 years? If an artifact that originally
contained 100 g of carbon-14 now contains 10 g of carbon-14, how old is it? Round the answer to the nearest hundred years.
Solution

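We have
\[
\begin{aligned}
5730 &= \frac{\ln 2}{k} \\
k &= \frac{\ln 2}{5730}.
\end{aligned}
\]

So, the model says
\[ y = 100e^{-(\ln 2/5730)t}. \]

In 50 years, we have
\[ y = 100e^{-(\ln 2/5730)(50)} \approx 99.40. \]

Therefore, in 50 years, 99.40 g of carbon-14 remains.

To determine the age of the artifact, we must solve
\[
\begin{aligned}
10 &= 100e^{-(\ln 2/5730)t} \\
\frac{1}{10} &= e^{-(\ln 2/5730)t} \\
-\ln 10 &= -\frac{\ln 2}{5730} \, t \\
t &= \frac{5730 \ln 10}{\ln 2} \approx 19035.
\end{aligned}
\]

The artifact is about 19,000 years old.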
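Both questions in this example use the same model, once forward (amount remaining after a given time) and once inverted (time elapsed for a given remaining amount). A minimal Python sketch, with hypothetical function names and the half-life of 5730 years taken from the example:

```python
import math

HALF_LIFE_C14 = 5730.0  # years, the approximate value used in the example


def remaining(initial, years, half_life=HALF_LIFE_C14):
    """Amount left after `years`: y = y0 * e^(-(ln 2 / half_life) t)."""
    return initial * math.exp(-math.log(2) / half_life * years)


def age(initial, current, half_life=HALF_LIFE_C14):
    """Invert the decay model: t = half_life * ln(initial / current) / ln 2."""
    return half_life * math.log(initial / current) / math.log(2)


print(f"{remaining(100, 50):.2f} g left after 50 years")
print(f"artifact age: {age(100, 10):.0f} years")
```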

 Exercise 8.4.5: Carbon-14 Decay

If we have 100 g of carbon-14 , how much is left after 500 years? If an artifact that originally contained 100 g of carbon-14
now contains 20 g of carbon-14, how old is it? Round the answer to the nearest hundred years.

Hint
Use the process from the previous example.

Answer
A total of 94.13 g of carbon-14 remains after 500 years. The artifact is approximately 13,300 years old.

Key Concepts
Exponential growth and exponential decay are two of the most common applications of exponential functions.
Systems that exhibit exponential growth follow a model of the form \( y = y_0 e^{kt} \).
In exponential growth, the rate of growth is proportional to the quantity present. In other words, y' = ky.
Systems that exhibit exponential growth have a constant doubling time, which is given by (ln 2)/k.
Systems that exhibit exponential decay follow a model of the form \( y = y_0 e^{-kt} \).
Systems that exhibit exponential decay have a constant half-life, which is given by (ln 2)/k.

Glossary
doubling time
if a quantity grows exponentially, the doubling time is the amount of time it takes the quantity to double, and is given by
(ln 2)/k

exponential decay
systems that exhibit exponential decay follow a model of the form \( y = y_0 e^{-kt} \)

exponential growth
systems that exhibit exponential growth follow a model of the form \( y = y_0 e^{kt} \)

half-life
if a quantity decays exponentially, the half-life is the amount of time it takes the quantity to be reduced by half. It is given by
(ln 2)/k

8.4: Applications to Economics and Biology is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
6.8: Exponential Growth and Decay by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0. Original source:
https://openstax.org/details/books/calculus-volume-1.

8.5: Probability
In this section we will briefly discuss some applications of multiple integrals in the field of probability theory. In particular we will
see ways in which multiple integrals can be used to calculate probabilities and expected values.

Probability
Suppose that you have a standard six-sided (fair) die, and you let a variable X represent the value rolled. Then the probability of
rolling a 3, written as P(X = 3), is \( \frac{1}{6} \), since there are six sides on the die and each one is equally likely to be
rolled, and hence in particular the 3 has a one out of six chance of being rolled. Likewise the probability of rolling at most a 3,
written as P(X ≤ 3), is \( \frac{3}{6} = \frac{1}{2} \), since of the six numbers on the die, there are three equally likely numbers
(1, 2, and 3) that are less than or equal to 3. Note that:
\[ P(X \le 3) = P(X = 1) + P(X = 2) + P(X = 3). \tag{8.5.1} \]

We call X a discrete random variable on the sample space (or probability space) Ω consisting of all possible outcomes. In our
case, Ω = {1, 2, 3, 4, 5, 6}. An event A is a subset of the sample space. For example, in the case of the die, the event X ≤ 3 is
the set {1, 2, 3}.

Now let X be a variable representing a random real number in the interval (0, 1). Note that the set of all real numbers between 0
and 1 is not a discrete (or countable) set of values, i.e. it cannot be put into a one-to-one correspondence with the set of positive
integers. In this case, for any real number x in (0, 1), it makes no sense to consider P(X = x) since it must be 0 (why?). Instead,
we consider the probability P(X ≤ x), which is given by P(X ≤ x) = x. The reasoning is this: the interval (0, 1) has length 1,
and for x in (0, 1) the interval (0, x) has length x. So since X represents a random number in (0, 1), and hence is uniformly
distributed over (0, 1), then
\[ P(X \le x) = \frac{\text{length of } (0, x)}{\text{length of } (0, 1)} = \frac{x}{1} = x. \tag{8.5.2} \]

We call X a continuous random variable on the sample space Ω = (0, 1) . An event A is a subset of the sample space. For
example, in our case the event X ≤ x is the set (0, x).
In the case of a discrete random variable, we saw how the probability of an event was the sum of the probabilities of the individual
outcomes comprising that event (e.g. P (X ≤ 3) = P (X = 1) + P (X = 2) + P (X = 3) in the die example). For a continuous
random variable, the probability of an event will instead be the integral of a function, which we will now describe.
Let X be a continuous real-valued random variable on a sample space Ω in \( \mathbb{R} \). For simplicity, let Ω = (a, b). Define the
distribution function F of X as
\[ F(x) = P(X \le x), \quad \text{for } -\infty < x < \infty \tag{8.5.3} \]
\[ F(x) = \begin{cases} 1, & \text{for } x \ge b \\ P(X \le x), & \text{for } a < x < b \\ 0, & \text{for } x \le a. \end{cases} \tag{8.5.4} \]

Suppose that there is a nonnegative, continuous real-valued function f on \( \mathbb{R} \) such that
\[ F(x) = \int_{-\infty}^{x} f(y) \, dy, \quad \text{for } -\infty < x < \infty, \tag{8.5.5} \]

and
\[ \int_{-\infty}^{\infty} f(x) \, dx = 1. \tag{8.5.6} \]

Then we call f the probability density function (or p.d.f. for short) for X. We thus have
\[ P(X \le x) = \int_{a}^{x} f(y) \, dy, \quad \text{for } a < x < b. \tag{8.5.7} \]

Also, by the Fundamental Theorem of Calculus, we have

8.5.1 https://math.libretexts.org/@go/page/4496

\[ F'(x) = f(x), \quad \text{for } -\infty < x < \infty. \tag{8.5.8} \]

Example 8.5.1: Uniform Distribution

Let X represent a randomly selected real number in the interval (0, 1). We say that X has the uniform distribution on (0, 1),
with distribution function
\[ F(x) = P(X \le x) = \begin{cases} 1, & \text{for } x \ge 1 \\ x, & \text{for } 0 < x < 1 \\ 0, & \text{for } x \le 0, \end{cases} \tag{8.5.9} \]

and probability density function
\[ f(x) = F'(x) = \begin{cases} 1, & \text{for } 0 < x < 1 \\ 0, & \text{elsewhere.} \end{cases} \tag{8.5.10} \]

In general, if X represents a randomly selected real number in an interval (a, b), then X has the uniform distribution function
\[ F(x) = P(X \le x) = \begin{cases} 1, & \text{for } x \ge b \\ \dfrac{x - a}{b - a}, & \text{for } a < x < b \\ 0, & \text{for } x \le a \end{cases} \tag{8.5.11} \]

and probability density function
\[ f(x) = F'(x) = \begin{cases} \dfrac{1}{b - a}, & \text{for } a < x < b \\ 0, & \text{elsewhere.} \end{cases} \tag{8.5.12} \]
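A quick numerical sanity check of the general uniform distribution is sketched below in Python; the function names and the sample interval (2, 5) are illustrative assumptions. The distribution function must rise from 0 at x = a to 1 at x = b, which is why its middle branch is (x − a)/(b − a), and its slope on (a, b) should match the density 1/(b − a).

```python
def uniform_cdf(x, a, b):
    """Distribution function F(x) = P(X <= x) for X uniform on (a, b)."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)


def uniform_pdf(x, a, b):
    """Density f(x) = F'(x): constant 1/(b - a) on (a, b), zero elsewhere."""
    return 1.0 / (b - a) if a < x < b else 0.0


a, b = 2.0, 5.0
x, h = 3.0, 1e-6
# Central difference approximation of F'(x) at an interior point.
slope = (uniform_cdf(x + h, a, b) - uniform_cdf(x - h, a, b)) / (2 * h)
print(slope, uniform_pdf(x, a, b))
```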

Example 8.5.2: Standard Normal Distribution

A famous distribution function is given by the standard normal distribution, whose probability density function f is
\[ f(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}, \quad \text{for } -\infty < x < \infty. \tag{8.5.13} \]

This is often called a “bell curve”, and is used widely in statistics. Since we are claiming that f is a p.d.f., we should have
\[ \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-x^2/2} \, dx = 1 \tag{8.5.14} \]

by Equation 8.5.6, which is equivalent to
\[ \int_{-\infty}^{\infty} e^{-x^2/2} \, dx = \sqrt{2\pi}. \tag{8.5.15} \]

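We can use a double integral in polar coordinates to verify this integral. First,
\[
\begin{aligned}
\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} e^{-(x^2 + y^2)/2} \, dx \, dy
&= \int_{-\infty}^{\infty} e^{-y^2/2} \left( \int_{-\infty}^{\infty} e^{-x^2/2} \, dx \right) dy \\
&= \left( \int_{-\infty}^{\infty} e^{-x^2/2} \, dx \right) \left( \int_{-\infty}^{\infty} e^{-y^2/2} \, dy \right) \\
&= \left( \int_{-\infty}^{\infty} e^{-x^2/2} \, dx \right)^2
\end{aligned}
\]

since the same function is being integrated twice in the middle equation, just with different variables. But using polar
coordinates, we see that
\[
\begin{aligned}
\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} e^{-(x^2 + y^2)/2} \, dx \, dy
&= \int_{0}^{2\pi} \int_{0}^{\infty} e^{-r^2/2} \, r \, dr \, d\theta \\
&= \int_{0}^{2\pi} \left( -e^{-r^2/2} \, \Big|_{r=0}^{r=\infty} \right) d\theta \\
&= \int_{0}^{2\pi} \left( 0 - (-e^{0}) \right) d\theta = \int_{0}^{2\pi} 1 \, d\theta = 2\pi,
\end{aligned}
\]

and so
\[ \left( \int_{-\infty}^{\infty} e^{-x^2/2} \, dx \right)^2 = 2\pi, \quad \text{and hence} \quad \int_{-\infty}^{\infty} e^{-x^2/2} \, dx = \sqrt{2\pi}. \]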
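The value √(2π) can also be checked numerically. The Python sketch below is a rough check, not a proof: it truncates the integral to [−10, 10] (an assumption that is harmless here because the integrand is below e^{−50} outside that interval) and applies the trapezoid rule. The helper name `trapezoid` is illustrative.

```python
import math


def trapezoid(f, a, b, n):
    """Composite trapezoid rule on [a, b] with n equal subintervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h


# Truncate the infinite range; the tails beyond |x| = 10 are negligible.
gaussian_integral = trapezoid(lambda x: math.exp(-x * x / 2), -10.0, 10.0, 2000)
print(gaussian_integral, math.sqrt(2 * math.pi))
```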

In addition to individual random variables, we can consider jointly distributed random variables. For this, we will let X, Y and Z
be three real-valued continuous random variables defined on the same sample space Ω in \( \mathbb{R} \) (the discussion for two
random variables is similar). Then the joint distribution function F of X, Y and Z is given by
\[ F(x, y, z) = P(X \le x, Y \le y, Z \le z), \quad \text{for } -\infty < x, y, z < \infty. \tag{8.5.16} \]

If there is a nonnegative, continuous real-valued function f on \( \mathbb{R}^3 \) such that
\[ F(x, y, z) = \int_{-\infty}^{z} \int_{-\infty}^{y} \int_{-\infty}^{x} f(u, v, w) \, du \, dv \, dw, \quad \text{for } -\infty < x, y, z < \infty \tag{8.5.17} \]

and
\[ \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y, z) \, dx \, dy \, dz = 1, \tag{8.5.18} \]

then we call f the joint probability density function (or joint p.d.f. for short) for X, Y and Z. In general, for
\( a_1 < b_1 \), \( a_2 < b_2 \), \( a_3 < b_3 \), we have
\[ P(a_1 < X \le b_1, \, a_2 < Y \le b_2, \, a_3 < Z \le b_3) = \int_{a_3}^{b_3} \int_{a_2}^{b_2} \int_{a_1}^{b_1} f(x, y, z) \, dx \, dy \, dz, \tag{8.5.19} \]

with the ≤ and < symbols interchangeable in any combination. A triple integral, then, can be thought of as representing a
probability (for a function f which is a p.d.f.).

Example 8.5.3

Let a, b, and c be real numbers selected randomly from the interval (0, 1). What is the probability that the equation
\( ax^2 + bx + c = 0 \) has at least one real solution x?
Solution
We know by the quadratic formula that there is at least one real solution if \( b^2 - 4ac \ge 0 \). So we need to calculate
\( P(b^2 - 4ac \ge 0) \). We will use three jointly distributed random variables to do this. First, since 0 < a, b, c < 1, we have
\[ b^2 - 4ac \ge 0 \iff 0 < 4ac \le b^2 < 1 \iff 0 < 2\sqrt{a}\sqrt{c} \le b < 1, \]

where the last relation holds for all 0 < a, c < 1 such that
\[ 0 < 4ac < 1 \iff 0 < c < \frac{1}{4a}. \]

Considering a, b and c as real variables, the region R in the ac-plane where the above relation holds is given by
\( R = \{ (a, c) : 0 < a < 1, \ 0 < c < 1, \ 0 < c < \frac{1}{4a} \} \), which we can see is a union of two regions \( R_1 \) and
\( R_2 \), as in Figure 8.5.1.

Figure 8.5.1: Region \(R = R_1 \cup R_2\)

Now let X, Y and Z be continuous random variables, each representing a randomly selected real number from the interval (0, 1) (think of X, Y and Z representing a, b and c, respectively). Then, similar to how we showed that f(x) = 1 is the p.d.f. of the uniform distribution on (0, 1), it can be shown that f(x, y, z) = 1 for x, y, z in (0, 1) (0 elsewhere) is the joint p.d.f. of X, Y and Z. Now,

\[
P(b^2 - 4ac \ge 0) = P\big((a, c) \in R,\ 2\sqrt{a}\sqrt{c} \le b < 1\big),
\]

so this probability is the triple integral of f(a, b, c) = 1 as b varies from \(2\sqrt{a}\sqrt{c}\) to 1 and as (a, c) varies over the region R.

Since R can be divided into two regions \(R_1\) and \(R_2\), the required triple integral can be split into a sum of two triple integrals, using vertical slices in R:


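\[
\begin{aligned}
P(b^2 - 4ac \ge 0) &= \underbrace{\int_0^{1/4}\int_0^1\int_{2\sqrt{a}\sqrt{c}}^1 1\,db\,dc\,da}_{R_1} + \underbrace{\int_{1/4}^1\int_0^{1/4a}\int_{2\sqrt{a}\sqrt{c}}^1 1\,db\,dc\,da}_{R_2} \\
&= \int_0^{1/4}\int_0^1 \left(1 - 2\sqrt{a}\sqrt{c}\right)dc\,da + \int_{1/4}^1\int_0^{1/4a} \left(1 - 2\sqrt{a}\sqrt{c}\right)dc\,da \\
&= \int_0^{1/4}\left(c - \tfrac{4}{3}\sqrt{a}\,c^{3/2}\,\Big|_{c=0}^{c=1}\right)da + \int_{1/4}^1\left(c - \tfrac{4}{3}\sqrt{a}\,c^{3/2}\,\Big|_{c=0}^{c=1/4a}\right)da \\
&= \int_0^{1/4}\left(1 - \tfrac{4}{3}\sqrt{a}\right)da + \int_{1/4}^1 \frac{1}{12a}\,da \\
&= \left(a - \tfrac{8}{9}a^{3/2}\right)\Big|_0^{1/4} + \tfrac{1}{12}\ln a\,\Big|_{1/4}^{1} \\
&= \left(\tfrac{1}{4} - \tfrac{1}{9}\right) + \left(0 - \tfrac{1}{12}\ln\tfrac{1}{4}\right) = \tfrac{5}{36} + \tfrac{1}{12}\ln 4
\end{aligned}
\]

\[
P(b^2 - 4ac \ge 0) = \frac{5 + 3\ln 4}{36} \approx 0.2544
\]

In other words, the equation \(ax^2 + bx + c = 0\) has about a 25% chance of being solved!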
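
The result of Example 8.5.3 can be compared against a simulation. The sketch below is a Monte Carlo estimate under the same assumption as the example (a, b, c independent and uniform on (0, 1)); the function name, trial count, and seed are illustrative choices rather than anything prescribed by the text.

```python
import math
import random


def estimate_real_root_probability(trials=200_000, seed=0):
    """Fraction of random (a, b, c) in (0, 1)^3 with b^2 - 4ac >= 0."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        a, b, c = rng.random(), rng.random(), rng.random()
        if b * b - 4.0 * a * c >= 0.0:
            hits += 1
    return hits / trials


exact = (5.0 + 3.0 * math.log(4.0)) / 36.0  # value derived in the example
print(exact, estimate_real_root_probability())
```

The simulated fraction should land close to the exact value, with the gap shrinking roughly like one over the square root of the number of trials.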

Expectation Value
The expectation value (or expected value) EX of a random variable X can be thought of as the “average” value of X as it varies
over its sample space. If X is a discrete random variable, then

EX = ∑ xP (X = x), (8.5.20)

with the sum being taken over all elements x of the sample space. For example, if X represents the number rolled on a six-sided
die, then
\[
EX = \sum_{x=1}^{6} x\,P(X = x) = \sum_{x=1}^{6} x \cdot \frac{1}{6} = 3.5 \tag{8.5.21}
\]

is the expected value of X, which is the average of the integers 1−6.
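
The sum in (8.5.20) translates directly into code. This is a small sketch, assuming the distribution is stored as a dictionary mapping each value to its probability; the names `expected_value` and `die` are made up for illustration.

```python
from fractions import Fraction


def expected_value(pmf):
    """Sum of x * P(X = x) over all values x in the distribution."""
    return sum(x * p for x, p in pmf.items())


# Fair six-sided die: each face 1..6 has probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}
print(expected_value(die))  # exact rational arithmetic, equals 7/2 = 3.5
```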

If X is a real-valued continuous random variable with p.d.f. f, then

\[
EX = \int_{-\infty}^{\infty} x f(x)\,dx \tag{8.5.22}
\]

For example, if X has the uniform distribution on the interval (0, 1), then its p.d.f. is

\[
f(x) = \begin{cases} 1, & \text{for } 0 < x < 1 \\ 0, & \text{elsewhere} \end{cases} \tag{8.5.23}
\]

and so

\[
EX = \int_{-\infty}^{\infty} x f(x)\,dx = \int_0^1 x\,dx = \frac{1}{2} \tag{8.5.24}
\]

For a pair of jointly distributed, real-valued continuous random variables X and Y with joint p.d.f. f (x, y), the expected values of
X and Y are given by

\[
EX = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x f(x, y)\,dx\,dy \quad\text{and}\quad EY = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} y f(x, y)\,dx\,dy \tag{8.5.25}
\]

respectively.

Example 8.5.4

If you were to pick n > 2 random real numbers from the interval (0, 1) , what are the expected values for the smallest and
largest of those numbers?
Solution
Let \(U_1, \ldots, U_n\) be n continuous random variables, each representing a randomly selected real number from (0, 1), i.e. each has the uniform distribution on (0, 1). Define random variables X and Y by

\[
X = \min(U_1, \ldots, U_n) \quad\text{and}\quad Y = \max(U_1, \ldots, U_n).
\]

Then it can be shown that the joint p.d.f. of X and Y is

\[
f(x, y) = \begin{cases} n(n-1)(y - x)^{n-2}, & \text{for } 0 \le x \le y \le 1 \\ 0, & \text{elsewhere.} \end{cases} \tag{8.5.26}
\]

Thus, the expected value of X is

\[
\begin{aligned}
EX &= \int_0^1\int_x^1 n(n-1)\,x\,(y - x)^{n-2}\,dy\,dx \\
&= \int_0^1 \left(n x (y - x)^{n-1}\,\Big|_{y=x}^{y=1}\right)dx \\
&= \int_0^1 n x (1 - x)^{n-1}\,dx, \quad\text{so integration by parts yields} \\
&= \left(-x(1 - x)^{n} - \frac{1}{n+1}(1 - x)^{n+1}\right)\Big|_0^1 \\
EX &= \frac{1}{n+1},
\end{aligned}
\]

and similarly (see Exercise 3) it can be shown that

\[
EY = \int_0^1\int_0^y n(n-1)\,y\,(y - x)^{n-2}\,dx\,dy = \frac{n}{n+1}.
\]

So, for example, if you were to repeatedly take samples of n = 3 random real numbers from (0, 1), and each time store the minimum and maximum values in the sample, then the average of the minimums would approach \(\frac{1}{4}\) and the average of the maximums would approach \(\frac{3}{4}\) as the number of samples grows. It would be relatively simple (see Exercise 4) to write a computer program to test this.
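
One possible version of such a program is sketched below in Python. It is only an illustration: the function name, the number of samples, and the fixed seed are assumptions, and any language with a uniform random number generator would serve equally well.

```python
import random


def average_min_max(n=3, samples=100_000, seed=1):
    """Average the minimum and maximum of n uniform(0, 1) draws over many samples."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    min_total = 0.0
    max_total = 0.0
    for _ in range(samples):
        draws = [rng.random() for _ in range(n)]
        min_total += min(draws)
        max_total += max(draws)
    return min_total / samples, max_total / samples


avg_min, avg_max = average_min_max()
# For n = 3 these should be near 1/(n+1) = 0.25 and n/(n+1) = 0.75.
print(avg_min, avg_max)
```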

8.5: Probability is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
3.7: Application- Probability and Expectation Values by Michael Corral is licensed GNU FDL. Original source: http://www.mecmath.net/.

CHAPTER OVERVIEW

9: Differential Equations
A general Calculus Textmap organized around the textbook

Calculus: Early Transcendentals


by James Stewart



9.1: Modeling with Differential Equations
9.2: Direction Fields and Euler's Method
9.3: Separable Equations
9.4: Models for Population Growth
9.5: Linear Equations
9.6: Predator-Prey Systems

9: Differential Equations is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.

9.1: Modeling with Differential Equations
 Learning Objectives
Identify the order of a differential equation.
Explain what is meant by a solution to a differential equation.
Distinguish between the general solution and a particular solution of a differential equation.
Identify an initial-value problem.
Identify whether a given function is a solution to a differential equation or an initial-value problem.

Calculus is the mathematics of change, and rates of change are expressed by derivatives. Thus, one of the most common ways to
use calculus is to set up an equation containing an unknown function y = f (x) and its derivative, known as a differential equation.
Solving such equations often provides information about how quantities change and frequently provides insight into how and why
the changes occur.
Techniques for solving differential equations can take many different forms, including direct solution, use of graphs, or computer
calculations. We introduce the main ideas in this chapter and describe them in a little more detail later in the course. In this section
we study what differential equations are, how to verify their solutions, some methods that are used for solving them, and some
examples of common and useful equations.

General Differential Equations


Consider the equation \(y' = 3x^2\), which is an example of a differential equation because it includes a derivative. There is a relationship between the variables x and y: y is an unknown function of x. Furthermore, the left-hand side of the equation is the derivative of y. Therefore we can interpret this equation as follows: Start with some function y = f(x) and take its derivative. The answer must be equal to \(3x^2\). What function has a derivative that is equal to \(3x^2\)? One such function is \(y = x^3\), so this function is considered a solution to the differential equation.

 Definition: differential equation


A differential equation is an equation involving an unknown function y = f (x) and one or more of its derivatives. A solution
to a differential equation is a function y = f (x) that satisfies the differential equation when f and its derivatives are substituted
into the equation.

Some examples of differential equations and their solutions appear in Table 9.1.1.
Table 9.1.1: Examples of Differential Equations and Their Solutions
Equation                            Solution
\(y' = 2x\)                         \(y = x^2\)
\(y' + 3y = 6x + 11\)               \(y = e^{-3x} + 2x + 3\)
\(y'' - 3y' + 2y = 24e^{-2x}\)      \(y = 3e^{x} - 4e^{2x} + 2e^{-2x}\)

Note that a solution to a differential equation is not necessarily unique, primarily because the derivative of a constant is zero. For example, \(y = x^2 + 4\) is also a solution to the first differential equation in Table 9.1.1. We will return to this idea a little bit later in this section. For now, let’s focus on what it means for a function to be a solution to a differential equation.

 Example 9.1.1: Verifying Solutions of Differential Equations

Verify that the function \(y = e^{-3x} + 2x + 3\) is a solution to the differential equation \(y' + 3y = 6x + 11\).
Solution
To verify the solution, we first calculate y' using the chain rule for derivatives. This gives \(y' = -3e^{-3x} + 2\). Next we substitute y and y' into the left-hand side of the differential equation:

9.1.1 https://math.libretexts.org/@go/page/4498
\[
(-3e^{-3x} + 2) + 3(e^{-3x} + 2x + 3).
\]

The resulting expression can be simplified by first distributing to eliminate the parentheses, giving

\[
-3e^{-3x} + 2 + 3e^{-3x} + 6x + 9.
\]

Combining like terms leads to the expression 6x + 11, which is equal to the right-hand side of the differential equation. This result verifies that \(y = e^{-3x} + 2x + 3\) is a solution of the differential equation.
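
The same verification can be done numerically as a sanity check. The sketch below assumes a central-difference approximation for the derivative (so the residual is only approximately zero); the helper names, step size, and sample points are illustrative choices.

```python
import math


def y(x):
    """Candidate solution y = e^(-3x) + 2x + 3."""
    return math.exp(-3.0 * x) + 2.0 * x + 3.0


def derivative(f, x, h=1e-5):
    """Central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def residual(x):
    """Left-hand side minus right-hand side of y' + 3y = 6x + 11."""
    return derivative(y, x) + 3.0 * y(x) - (6.0 * x + 11.0)


for x in (-1.0, 0.0, 0.5, 2.0):
    print(x, residual(x))  # each residual should be very close to zero
```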

 Exercise 9.1.1

Verify that \(y = 2e^{3x} - 2x - 2\) is a solution to the differential equation \(y' - 3y = 6x + 4\).

Hint
First calculate y' then substitute both y' and y into the left-hand side.

It is convenient to define characteristics of differential equations that make it easier to talk about them and categorize them. The
most basic characteristic of a differential equation is its order.

 Definition: order of a differential equation

The order of a differential equation is the highest order of any derivative of the unknown function that appears in the equation.

 Example 9.1.2: Identifying the Order of a Differential Equation

What is the order of each of the following differential equations?
a. \(y' - 4y = x^2 - 3x + 4\)
b. \(x^2 y''' - 3x y'' + x y' - 3y = \sin x\)
c. \(\frac{4}{x} y^{(4)} - \frac{6}{x^2} y'' + \frac{12}{x^4} y = x^3 - 3x^2 + 4x - 12\)

Solution
a. The highest derivative in the equation is \(y'\), so the order is 1.
b. The highest derivative in the equation is \(y'''\), so the order is 3.
c. The highest derivative in the equation is \(y^{(4)}\), so the order is 4.

 Exercise 9.1.2

What is the order of the following differential equation?


\[
(x^4 - 3x)\,y^{(5)} - (3x^2 + 1)\,y' + 3y = \sin x \cos x
\]

Hint
What is the highest derivative in the equation?

Answer
5

General and Particular Solutions


We already noted that the differential equation y' = 2x has at least two solutions: \(y = x^2\) and \(y = x^2 + 4\). The only difference between these two solutions is the last term, which is a constant. What if the last term is a different constant? Will this expression still be a solution to the differential equation? In fact, any function of the form \(y = x^2 + C\), where C represents any constant, is a solution as well. The reason is that the derivative of \(x^2 + C\) is 2x, regardless of the value of C. It can be shown that any solution of this differential equation must be of the form \(y = x^2 + C\). This is an example of a general solution to a differential equation. A graph of some of these solutions is given in Figure 9.1.1. (Note: in this graph we used even integer values for C ranging between −4 and 4. In fact, there is no restriction on the value of C; it can be an integer or not.)

Figure 9.1.1 : Family of solutions to the differential equation y' = 2x.


In this example, we are free to choose any solution we wish; for example, \(y = x^2 - 3\) is a member of the family of solutions to this differential equation. This is called a particular solution to the differential equation. A particular solution can often be uniquely identified if we are given additional information about the problem.

 Example 9.1.3: Finding a Particular Solution

Find the particular solution to the differential equation y' = 2x passing through the point (2, 7).
Solution
Any function of the form \(y = x^2 + C\) is a solution to this differential equation. To determine the value of C, we substitute the values x = 2 and y = 7 into this equation and solve for C:

\[
\begin{aligned}
y &= x^2 + C \\
7 &= 2^2 + C \\
  &= 4 + C \\
C &= 3.
\end{aligned}
\]

Therefore the particular solution passing through the point (2, 7) is \(y = x^2 + 3\).

 Exercise 9.1.3

Find the particular solution to the differential equation

y' = 4x + 3

passing through the point (1, 7), given that \(y = 2x^2 + 3x + C\) is a general solution to the differential equation.

Hint
First substitute x = 1 and y = 7 into the equation, then solve for C .

Answer
\(y = 2x^2 + 3x + 2\)

Initial-Value Problems
Usually a given differential equation has an infinite number of solutions, so it is natural to ask which one we want to use. To choose
one solution, more information is needed. Some specific information that can be useful is an initial value, which is an ordered pair
that is used to find a particular solution.

A differential equation together with one or more initial values is called an initial-value problem. The general rule is that the number of initial values needed for an initial-value problem is equal to the order of the differential equation. For example, if we have the differential equation y' = 2x, then y(3) = 7 is an initial value, and when taken together, these equations form an initial-value problem. The differential equation \(y'' - 3y' + 2y = 4e^x\) is second order, so we need two initial values. With initial-value problems of order greater than one, the same value should be used for the independent variable. An example of initial values for this second-order equation would be y(0) = 2 and y'(0) = −1. These two initial values together with the differential equation form an initial-value problem. These problems are so named because often the independent variable in the unknown function is t, which represents time. Thus, a value of t = 0 represents the beginning of the problem.

 Example 9.1.4: Verifying a Solution to an Initial-Value Problem

Verify that the function \(y = 2e^{-2t} + e^t\) is a solution to the initial-value problem

\[
y' + 2y = 3e^t, \quad y(0) = 3.
\]

Solution
For a function to satisfy an initial-value problem, it must satisfy both the differential equation and the initial condition. To show that y satisfies the differential equation, we start by calculating y'. This gives \(y' = -4e^{-2t} + e^t\). Next we substitute both y and y' into the left-hand side of the differential equation and simplify:

\[
\begin{aligned}
y' + 2y &= (-4e^{-2t} + e^t) + 2(2e^{-2t} + e^t) \\
&= -4e^{-2t} + e^t + 4e^{-2t} + 2e^t = 3e^t.
\end{aligned}
\]

This is equal to the right-hand side of the differential equation, so \(y = 2e^{-2t} + e^t\) solves the differential equation. Next we calculate y(0):

\[
y(0) = 2e^{-2(0)} + e^0 = 2 + 1 = 3.
\]

This result verifies the initial value. Therefore the given function satisfies the initial-value problem.
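
Both halves of this check, the differential equation and the initial value, can be tested separately in code. The following Python sketch assumes a central-difference derivative and a small set of sample points; `other_family_member` is a hypothetical comparison function (not from the text) that solves the same differential equation but starts at a different value, which shows why both checks are needed.

```python
import math


def check_ivp(y, t_points=(-1.0, 0.0, 1.0, 2.0), h=1e-5, tol=1e-6):
    """Return (solves_ode, meets_initial_value) for y' + 2y = 3e^t, y(0) = 3."""
    def residual(t):
        dy = (y(t + h) - y(t - h)) / (2.0 * h)  # central-difference estimate of y'(t)
        return dy + 2.0 * y(t) - 3.0 * math.exp(t)

    solves_ode = all(abs(residual(t)) < tol for t in t_points)
    meets_initial_value = abs(y(0.0) - 3.0) < 1e-12
    return solves_ode, meets_initial_value


def example_solution(t):
    """The function from Example 9.1.4."""
    return 2.0 * math.exp(-2.0 * t) + math.exp(t)


def other_family_member(t):
    """Hypothetical comparison: solves the same equation, but y(0) = 2."""
    return math.exp(-2.0 * t) + math.exp(t)


print(check_ivp(example_solution))
print(check_ivp(other_family_member))
```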

 Exercise 9.1.4

Verify that \(y = 3e^{2t} + 4\sin t\) is a solution to the initial-value problem

y' − 2y = 4 cos t − 8 sin t, y(0) = 3.

Hint
First verify that y solves the differential equation. Then check the initial value.

In Example 9.1.4, the initial-value problem consisted of two parts. The first part was the differential equation \(y' + 2y = 3e^t\), and the second part was the initial value y(0) = 3. These two equations together formed the initial-value problem.
The same is true in general. An initial-value problem consists of two parts: the differential equation and the initial condition. The differential equation has a family of solutions, and the initial condition determines the value of C. The family of solutions to the differential equation in Example 9.1.4 is given by \(y = Ce^{-2t} + e^t\); substituting into the left-hand side gives \(-2Ce^{-2t} + e^t + 2Ce^{-2t} + 2e^t = 3e^t\) for every value of C. This family of solutions is shown in Figure 9.1.2, with the particular solution \(y = 2e^{-2t} + e^t\) labeled.

Figure 9.1.2: A family of solutions to the differential equation \(y' + 2y = 3e^t\). The particular solution \(y = 2e^{-2t} + e^t\) is labeled.

 Example 9.1.5: Solving an Initial-value Problem


Solve the following initial-value problem:
\[
y' = 3e^x + x^2 - 4, \quad y(0) = 5.
\]

Solution
The first step in solving this initial-value problem is to find a general family of solutions. To do this, we find an antiderivative of both sides of the differential equation

\[
\int y'\,dx = \int (3e^x + x^2 - 4)\,dx,
\]

namely,

\[
y + C_1 = 3e^x + \frac{1}{3}x^3 - 4x + C_2.
\]

We are able to integrate both sides because the y term appears by itself. Notice that there are two integration constants: \(C_1\) and \(C_2\). Solving this equation for y gives

\[
y = 3e^x + \frac{1}{3}x^3 - 4x + C_2 - C_1.
\]

Because \(C_1\) and \(C_2\) are both constants, \(C_2 - C_1\) is also a constant. We can therefore define \(C = C_2 - C_1\), which leads to the equation

\[
y = 3e^x + \frac{1}{3}x^3 - 4x + C.
\]

Next we determine the value of C. To do this, we substitute x = 0 and y = 5 into this equation and solve for C:

\[
\begin{aligned}
5 &= 3e^0 + \frac{1}{3}0^3 - 4(0) + C \\
5 &= 3 + C \\
C &= 2.
\end{aligned}
\]

Now we substitute the value C = 2 into the general equation. The solution to the initial-value problem is

\[
y = 3e^x + \frac{1}{3}x^3 - 4x + 2.
\]

Analysis
The difference between a general solution and a particular solution is that a general solution involves a family of functions,
either explicitly or implicitly defined, of the independent variable. The initial value or values determine which particular
solution in the family of solutions satisfies the desired conditions.
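
The role of the initial value can be seen numerically as well: the solution at any x is the initial value plus the accumulated integral of the right-hand side from 0 to x. The sketch below illustrates this for Example 9.1.5, assuming composite Simpson's rule as the integration method (a technique chosen here for illustration, not one used in the text); the function names and step count are likewise assumptions.

```python
import math


def rhs(x):
    """Right-hand side of y' = 3e^x + x^2 - 4."""
    return 3.0 * math.exp(x) + x * x - 4.0


def closed_form(x):
    """Particular solution found in Example 9.1.5."""
    return 3.0 * math.exp(x) + x ** 3 / 3.0 - 4.0 * x + 2.0


def solve_by_quadrature(x, y0=5.0, steps=1000):
    """y(x) = y0 + integral of rhs over [0, x], using Simpson's rule (steps must be even)."""
    h = x / steps
    total = rhs(0.0) + rhs(x)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * rhs(i * h)
    return y0 + total * h / 3.0


for x in (0.5, 1.0, 1.5):
    print(x, solve_by_quadrature(x), closed_form(x))  # the two columns should agree closely
```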

 Exercise 9.1.5

Solve the initial-value problem


\[
y' = x^2 - 4x + 3 - 6e^x, \quad y(0) = 8.
\]

Hint
First take the antiderivative of both sides of the differential equation. Then substitute x =0 and y =8 into the resulting
equation and solve for C .

Answer
\(y = \frac{1}{3}x^3 - 2x^2 + 3x - 6e^x + 14\)

In physics and engineering applications, we often consider the forces acting upon an object, and use this information to understand
the resulting motion that may occur. For example, if we start with an object at Earth’s surface, the primary force acting upon that
object is gravity. Physicists and engineers can use this information, along with Newton’s second law of motion (in equation form
F = ma , where F represents force, m represents mass, and a represents acceleration), to derive an equation that can be solved.

Figure 9.1.3 : For a baseball falling in air, the only force acting on it is gravity (neglecting air resistance).
In Figure 9.1.3 we assume that the only force acting on a baseball is the force of gravity. This assumption ignores air resistance. (The force due to air resistance is considered in a later discussion.) The acceleration due to gravity at Earth’s surface, g, is approximately \(9.8 \text{ m/s}^2\). We introduce a frame of reference, where Earth’s surface is at a height of 0 meters. Let v(t) represent the velocity of the object in meters per second. If v(t) > 0, the ball is rising, and if v(t) < 0, the ball is falling (Figure 9.1.4).

Figure 9.1.4 : Possible velocities for the rising/falling baseball.


Our goal is to solve for the velocity v(t) at any time t. To do this, we set up an initial-value problem. Suppose the mass of the ball is m, where m is measured in kilograms. We use Newton’s second law, which states that the force acting on an object is equal to its mass times its acceleration (F = ma). Acceleration is the derivative of velocity, so a(t) = v'(t). Therefore the force acting on the baseball is given by F = mv'(t). However, this force must be equal to the force of gravity acting on the object, which (again using Newton’s second law) is given by \(F_g = -mg\), since this force acts in a downward direction. Therefore we obtain the equation \(F = F_g\), which becomes mv'(t) = −mg. Dividing both sides of the equation by m gives the equation

\[
v'(t) = -g.
\]

Notice that this differential equation remains the same regardless of the mass of the object.
We now need an initial value. Because we are solving for velocity, it makes sense in the context of the problem to assume that we know the initial velocity, or the velocity at time t = 0. This is denoted by \(v(0) = v_0\).

 Example 9.1.6: Velocity of a Moving Baseball

A baseball is thrown upward from a height of 3 meters above Earth’s surface with an initial velocity of 10 m/s, and the only force acting on it is gravity. The ball has a mass of 0.15 kg at Earth’s surface.
a. Find the velocity v(t) of the baseball at time t.
b. What is its velocity after 2 seconds?
Solution
a. From the preceding discussion, the differential equation that applies in this situation is

\[
v'(t) = -g,
\]

where \(g = 9.8 \text{ m/s}^2\). The initial condition is \(v(0) = v_0\), where \(v_0 = 10\) m/s. Therefore the initial-value problem is

\[
v'(t) = -9.8 \text{ m/s}^2, \quad v(0) = 10 \text{ m/s}.
\]

The first step in solving this initial-value problem is to take the antiderivative of both sides of the differential equation. This
gives

∫ v'(t) dt = ∫ −9.8 dt

v(t) = −9.8t + C .

The next step is to solve for C . To do this, substitute t = 0 and v(0) = 10 :

v(t) = −9.8t + C

v(0) = −9.8(0) + C

10 = C .

Therefore C = 10 and the velocity function is given by v(t) = −9.8t + 10.


b. To find the velocity after 2 seconds, substitute t = 2 into v(t) .
v(t) = −9.8t + 10

v(2) = −9.8(2) + 10

v(2) = −9.6

The units of velocity are meters per second. Since the answer is negative, the object is falling at a speed of 9.6 m/s.
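
The velocity formula is easy to evaluate in code, and it can be cross-checked by accumulating the constant acceleration in small time steps. This is a minimal sketch; the function names and the number of steps are illustrative assumptions, and the stepping approach is a numerical check added here rather than a method from the text.

```python
G = 9.8  # acceleration due to gravity in m/s^2, as used in the text


def velocity(t, v0, g=G):
    """Solution of v'(t) = -g with v(0) = v0, namely v(t) = -g*t + v0."""
    return -g * t + v0


def velocity_by_stepping(t, v0, g=G, steps=1000):
    """Accumulate the change in velocity, -g*dt, over many small time steps."""
    dt = t / steps
    v = v0
    for _ in range(steps):
        v += -g * dt
    return v


# Baseball of Example 9.1.6 after 2 seconds, computed both ways.
print(velocity(2.0, v0=10.0), velocity_by_stepping(2.0, v0=10.0))
```

Because the mass never appears as a parameter, the sketch also reflects the observation that the velocity does not depend on the mass of the object.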

 Exercise 9.1.6

Suppose a rock falls from rest from a height of 100 meters and the only force acting on it is gravity. Find an equation for the
velocity v(t) as a function of time, measured in meters per second.

Hint
What is the initial velocity of the rock? Use this with the differential equation in Example 9.1.6 to form an initial-value
problem, then solve for v(t).

Answer
v(t) = −9.8t

A natural question to ask after solving this type of problem is how high the object will be above Earth’s surface at a given point in time. Let s(t) denote the height above Earth’s surface of the object, measured in meters. Because velocity is the derivative of position (in this case height), we have the equation s'(t) = v(t). An initial value is necessary; in this case the initial height of the object works well. Let the initial height be given by the equation \(s(0) = s_0\). Together these assumptions give the initial-value problem

\[
s'(t) = v(t), \quad s(0) = s_0.
\]

If the velocity function is known, then it is possible to solve for the position function as well.

 Example 9.1.7: Height of a Moving Baseball

A baseball is thrown upward from a height of 3 meters above Earth’s surface with an initial velocity of 10 m/s, and the only force acting on it is gravity. The ball has a mass of 0.15 kilogram at Earth’s surface.
a. Find the position s(t) of the baseball at time t.
b. What is its height after 2 seconds?
Solution
a. We already know the velocity function for this problem is v(t) = −9.8t + 10. The initial height of the baseball is 3 meters, so \(s_0 = 3\). Therefore the initial-value problem for this example is

\[
s'(t) = -9.8t + 10, \quad s(0) = 3.
\]

To solve the initial-value problem, we first find the antiderivatives:

\[
\begin{aligned}
\int s'(t)\,dt &= \int (-9.8t + 10)\,dt \\
s(t) &= -4.9t^2 + 10t + C.
\end{aligned}
\]

Next we substitute t = 0 and solve for C:

\[
\begin{aligned}
s(t) &= -4.9t^2 + 10t + C \\
s(0) &= -4.9(0)^2 + 10(0) + C \\
3 &= C.
\end{aligned}
\]

Therefore the position function is \(s(t) = -4.9t^2 + 10t + 3\).

b. The height of the baseball after 2 sec is given by s(2):

\[
s(2) = -4.9(2)^2 + 10(2) + 3 = -4.9(4) + 23 = 3.4.
\]

Therefore the baseball is 3.4 meters above Earth’s surface after 2 seconds. It is worth noting that the mass of the ball cancelled
out completely in the process of solving the problem.
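
The relationship s'(t) = v(t) can be illustrated in code by comparing the closed-form height with the initial height plus a numerical integral of the velocity. The sketch below uses the values from Example 9.1.7 as defaults; the function names and the midpoint rule are assumptions made for illustration.

```python
def velocity(t, v0=10.0, g=9.8):
    """Velocity from Example 9.1.6: v(t) = -g*t + v0."""
    return -g * t + v0


def height(t, s0=3.0, v0=10.0, g=9.8):
    """Closed-form solution of s'(t) = v(t) with s(0) = s0."""
    return -0.5 * g * t * t + v0 * t + s0


def height_by_midpoint(t, s0=3.0, steps=100):
    """Initial height plus a midpoint-rule integral of the velocity over [0, t]."""
    dt = t / steps
    return s0 + sum(velocity((i + 0.5) * dt) * dt for i in range(steps))


# Height after 2 seconds, computed both ways; both should be close to 3.4 meters.
print(height(2.0), height_by_midpoint(2.0))
```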

Key Concepts
A differential equation is an equation involving a function y = f (x) and one or more of its derivatives. A solution is a function
y = f (x) that satisfies the differential equation when f and its derivatives are substituted into the equation.

The order of a differential equation is the highest order of any derivative of the unknown function that appears in the equation.
A differential equation coupled with an initial value is called an initial-value problem. To solve an initial-value problem, first
find the general solution to the differential equation, then determine the value of the constant. Initial-value problems have many
applications in science and engineering.

Glossary
differential equation
an equation involving a function y = y(x) and one or more of its derivatives

general solution (or family of solutions)


the entire set of solutions to a given differential equation

initial value(s)
a value or set of values that a solution of a differential equation satisfies for a fixed value of the independent variable

initial velocity
the velocity at time t = 0

initial-value problem
a differential equation together with an initial value or values

order of a differential equation


the highest order of any derivative of the unknown function that appears in the equation

particular solution
member of a family of solutions to a differential equation that satisfies a particular initial condition

solution to a differential equation


a function y = f (x) that satisfies a given differential equation

9.1: Modeling with Differential Equations is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
8.1: Basics of Differential Equations by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0. Original source:
https://openstax.org/details/books/calculus-volume-1.

9.2: Direction Fields and Euler's Method
9.2: Direction Fields and Euler's Method is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.

9.2.1 https://math.libretexts.org/@go/page/4499
9.3: Separable Equations
9.3: Separable Equations is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.

9.3.1 https://math.libretexts.org/@go/page/4500
9.4: Models for Population Growth
Learning Objectives
In this section, we strive to understand the ideas generated by the following important questions:
How can we use differential equations to realistically model the growth of a population?
How can we assess the accuracy of our models?

The growth of the earth’s population is one of the pressing issues of our time. Will the population continue to grow? Or will it
perhaps level off at some point, and if so, when? In this section, we will look at two ways in which we may use differential
equations to help us address questions such as these. Before we begin, let’s consider again two important differential equations that
we have seen in earlier work in this chapter.

Preview Activity 9.4.1

Recall that one model for population growth states that a population grows at a rate proportional to its size.
a. We begin with the differential equation
\[
\frac{dP}{dt} = \frac{1}{2}P. \tag{9.4.1}
\]

Sketch a slope field below as well as a few typical solutions on the axes provided.
b. Find all equilibrium solutions of Equation 9.4.1 and classify them as stable or unstable.
c. If P (0) is positive, describe the long-term behavior of the solution to Equation 9.4.1.
d. Let’s now consider a modified differential equation given by
\[
\frac{dP}{dt} = \frac{1}{2}P(3 - P).
\]

As before, sketch a slope field as well as a few typical solutions on the following axes provided.
e. Find any equilibrium solutions and classify them as stable or unstable.
f. If P (0) is positive, describe the long-term behavior of the solution.

The Earth’s Population


We will now begin studying the earth’s population. To get started, here are some data for the earth’s population in recent years that
we will use in our investigations.

Year                      1998   1999   2000   2001   2002   2005   2006   2007   2008   2009   2010
Population (in billions)  5.932  6.008  6.084  6.159  6.234  6.456  6.531  6.606  6.681  6.756  6.831

9.4.1 https://math.libretexts.org/@go/page/4501
Activity 9.4.1: Growth Dynamics

Our first model will be based on the following assumption:


The rate of change of the population is proportional to the population.
On the face of it, this seems pretty reasonable. When there is a relatively small number of people, there will be fewer births and
deaths so the rate of change will be small. When there is a larger number of people, there will be more births and deaths so we
expect a larger rate of change. If P (t) is the population t years after the year 2000, we may express this assumption as
\[
\frac{dP}{dt} = kP \tag{9.4.2}
\]

where k is a constant of proportionality.


a. Use the data in the table to estimate the derivative P'(0) using a central difference. Assume that t = 0 corresponds to the year 2000.
b. What is the population P (0)?
c. Use these two facts to estimate the constant of proportionality k in the differential equation.
d. Now that we know the value of k , we have the initial value problem of Equation 9.4.2 with P (0) = 6.084. Find the
solution to this initial value problem.
e. What does your solution predict for the population in the year 2010? Is this close to the actual population given in the
table?
f. When does your solution predict that the population will reach 12 billion?
g. What does your solution predict for the population in the year 2500?
h. Do you think this is a reasonable model for the earth’s population? Why or why not? Explain your thinking using a couple
of complete sentences.

Our work in Activity 9.4.1 shows that the exponential model is fairly accurate for years relatively close to 2000. However, if we go too far into the future, the model predicts increasingly large rates of change, which causes the population to grow arbitrarily large. This does not make much sense since it is unrealistic to expect that the earth would be able to support such a large population.
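
The steps of Activity 9.4.1 can be sketched in a few lines of Python. This is one possible reading of the activity, not an official solution: it assumes the central difference uses the 1999 and 2001 table entries, and the variable and function names are made up for illustration.

```python
import math

# Population data from the table, in billions.
population = {1998: 5.932, 1999: 6.008, 2000: 6.084, 2001: 6.159, 2002: 6.234,
              2005: 6.456, 2006: 6.531, 2007: 6.606, 2008: 6.681, 2009: 6.756,
              2010: 6.831}

# Central-difference estimate of P'(0), with t = 0 corresponding to the year 2000.
dP0 = (population[2001] - population[1999]) / 2.0
P0 = population[2000]
k = dP0 / P0  # constant of proportionality in dP/dt = kP


def exponential_model(t):
    """Solution of dP/dt = kP with P(0) = P0, where t is years after 2000."""
    return P0 * math.exp(k * t)


years_to_12_billion = math.log(12.0 / P0) / k

print("k =", k)
print("model for 2010:", exponential_model(10), "table value:", population[2010])
print("years after 2000 to reach 12 billion:", years_to_12_billion)
print("model for 2500:", exponential_model(500))
```

Running it shows the pattern described above: the 2010 prediction is close to the table value, while the prediction for 2500 is far larger than any plausible population.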
The constant k in the differential equation has an important interpretation. Let’s rewrite the differential equation
\[
\frac{dP}{dt} = kP
\]

by solving for k, so that we have

\[
k = \frac{dP/dt}{P}.
\]

Viewed in this light, k is the ratio of the rate of change to the population; in other words, it is the contribution to the rate of change
from a single person. We call this the per capita growth rate.
In the exponential model we introduced in Activity 9.4.1, the per capita growth rate is constant. In particular, we are assuming that when the population is large, the per capita growth rate is the same as when the population is small. It is natural to think that the per capita growth rate should decrease when the population becomes large, since there will not be enough resources to support so many people. In other words, we expect that a more realistic model would hold if we assume that the per capita growth rate depends on the population P. In the previous activity, we computed the per capita growth rate in a single year by computing k, the quotient of \(\frac{dP}{dt}\) and P (which we did for t = 0). If we return to the data and compute the per capita growth rate over a range of years, we generate the data shown in Figure 9.4.1, which shows how the per capita growth rate is a function of the population, P.
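
The computation behind a plot like Figure 9.4.1 can be sketched as follows. The exact data used for the figure is not listed in the text, so this example assumes the population table from earlier in the section and a central difference for each year that has table entries one year before and one year after; the function name is an illustrative choice.

```python
# Population data from the table, in billions.
population = {1998: 5.932, 1999: 6.008, 2000: 6.084, 2001: 6.159, 2002: 6.234,
              2005: 6.456, 2006: 6.531, 2007: 6.606, 2008: 6.681, 2009: 6.756,
              2010: 6.831}


def per_capita_rates(data):
    """Central-difference estimate of (dP/dt)/P for each year with both neighbors present."""
    rates = []
    for year in sorted(data):
        if year - 1 in data and year + 1 in data:
            dP_dt = (data[year + 1] - data[year - 1]) / 2.0
            rates.append((year, data[year], dP_dt / data[year]))
    return rates


for year, P, rate in per_capita_rates(population):
    print(year, P, round(rate, 5))
```

With this table, the estimated per capita rate falls steadily as the population rises, which is the trend the figure is described as showing.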

Figure 9.4.1: A plot of per capita growth rate vs. population P.
From the data, we see that the per capita growth rate appears to decrease as the population increases. In fact, the points seem to lie
very close to a line, which is shown at two different scales in Figure 9.4.2.

Figure 9.4.2: The line that approximates per capita growth as a function of population, P.
Looking at this line carefully, we can find its equation to be

\[
\frac{dP/dt}{P} = 0.025 - 0.002P.
\]

If we multiply both sides by P, we arrive at the differential equation

\[
\frac{dP}{dt} = P(0.025 - 0.002P). \tag{9.4.3}
\]

Graphing the dependence of \(\frac{dP}{dt}\) on the population P, we see that this differential equation demonstrates a quadratic relationship between \(\frac{dP}{dt}\) and P, as shown in Figure 9.4.3.

Figure 9.4.3: A plot of \(\frac{dP}{dt}\) vs. \(P\) for Equation 9.4.3.
Equation 9.4.3 is an example of the logistic equation, and is the second model for population growth that we will consider. We
have reason to believe that it will be more realistic since the per capita growth rate is a decreasing function of the population.
Indeed, the graph in Figure 9.4.3 shows that there are two equilibrium solutions, P = 0 , which is unstable, and P = 12.5, which is
a stable equilibrium. The graph shows that any solution with P (0) > 0 will eventually stabilize around 12.5. In other words, our
model predicts the world’s population will eventually stabilize around 12.5 billion.
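As a quick numerical check of this equilibrium analysis, the following Python sketch finds the zeros of the right-hand side of Equation 9.4.3 and classifies each by the sign of the slope of \(\frac{dP}{dt}\) there (positive slope means unstable, negative slope means stable). The function names are illustrative and are not part of the original text.

```python
def rate(P):
    """Right-hand side of Equation 9.4.3: dP/dt as a function of P."""
    return P * (0.025 - 0.002 * P)

def slope(P, h=1e-6):
    """Central-difference estimate of the derivative of rate with respect to P."""
    return (rate(P + h) - rate(P - h)) / (2 * h)

# rate(P) = 0 when P = 0 or when 0.025 - 0.002 P = 0.
equilibria = [0.0, 0.025 / 0.002]

def classify(P):
    """Label an equilibrium by the sign of the slope of dP/dt at P."""
    return "stable" if slope(P) < 0 else "unstable"

for P_eq in equilibria:
    print(P_eq, classify(P_eq))
```

The slope test is the same reasoning read off the graph in Figure 9.4.3: to the left of a stable equilibrium the rate is positive, and to the right it is negative.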
A prediction for the long-term behavior of the population is a valuable conclusion to draw from our differential equation. We
would, however, like to answer some quantitative questions. For instance, how long will it take to reach a population of 10 billion?
To determine this, we need to find an explicit solution of the equation.
Solving the logistic differential equation
Since we would like to apply the logistic model in more general situations, we state the logistic equation in its more general form,
\[ \frac{dP}{dt} = kP(N - P). \tag{9.4.4} \]

The equilibrium solutions here are when \(P = 0\) and when \(N - P = 0\) (equivalently, \(1 - \frac{P}{N} = 0\)), which shows that \(P = N\). The equilibrium at \(P = N\) is called
the carrying capacity of the population, for it represents the stable population that can be sustained by the environment.
We now solve the logistic Equation 9.4.4, which is separable, so we separate the variables
\[ \frac{1}{P(N - P)} \frac{dP}{dt} = k, \]

and integrate to find that


\[ \int \frac{1}{P(N - P)} \, dP = \int k \, dt. \]

To find the antiderivative on the left, we use the partial fraction decomposition
\[ \frac{1}{P(N - P)} = \frac{1}{N}\left[\frac{1}{P} + \frac{1}{N - P}\right]. \]
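The decomposition can be checked by recombining the two fractions over a common denominator. This short verification is an added step, not part of the original derivation:

```latex
\frac{1}{N}\left[\frac{1}{P} + \frac{1}{N - P}\right]
  = \frac{1}{N} \cdot \frac{(N - P) + P}{P(N - P)}
  = \frac{1}{N} \cdot \frac{N}{P(N - P)}
  = \frac{1}{P(N - P)}.
```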

Now we are ready to integrate, with


\[ \int \frac{1}{N}\left[\frac{1}{P} + \frac{1}{N - P}\right] dP = \int k \, dt. \]

On the left, observe that \(N\) is constant, so we can remove the factor of \(\frac{1}{N}\) and antidifferentiate to find that
\[ \frac{1}{N}\left(\ln|P| - \ln|N - P|\right) = kt + C. \]

Multiplying both sides of this last equation by \(N\), using an important rule of logarithms, and writing \(C\) again for the new constant \(NC\), we next find that
\[ \ln\left|\frac{P}{N - P}\right| = kNt + C. \]

From the definition of the logarithm, replacing \(e^C\) with \(C\), and letting \(C\) absorb the absolute value signs, we now know that
\[ \frac{P}{N - P} = Ce^{kNt}. \]

At this point, all that remains is to determine C and solve algebraically for P .
If the initial population is \(P(0) = P_0\), then it follows that
\[ C = \frac{P_0}{N - P_0} \]
so
\[ \frac{P}{N - P} = \frac{P_0}{N - P_0} e^{kNt}. \]

We will solve this most recent equation for \(P\) by multiplying both sides by \((N - P)(N - P_0)\) to obtain
\begin{align}
P(N - P_0) &= P_0 (N - P) e^{kNt} \tag{9.4.5} \\
&= P_0 N e^{kNt} - P_0 P e^{kNt}. \tag{9.4.6}
\end{align}

Swapping the left and right sides, expanding, and factoring, it follows that
\begin{align}
P_0 N e^{kNt} &= P(N - P_0) + P_0 P e^{kNt} \tag{9.4.7} \\
&= P\left(N - P_0 + P_0 e^{kNt}\right). \tag{9.4.8}
\end{align}

Dividing to solve for P , we see that


\[ P = \frac{P_0 N e^{kNt}}{N - P_0 + P_0 e^{kNt}}. \]

Finally, we choose to multiply the numerator and denominator by \(\frac{1}{P_0} e^{-kNt}\) to obtain
\[ P(t) = \frac{N}{\left(\frac{N - P_0}{P_0}\right) e^{-kNt} + 1}. \tag{9.4.9} \]

While that was a lot of algebra, notice the result: we have found an explicit solution to the initial value problem
\[ \frac{dP}{dt} = kP(N - P), \quad P(0) = P_0, \]
and that solution is Equation 9.4.9.

For the logistic equation describing the earth’s population that we worked with earlier in this section, we have
\(k = 0.002\), \(N = 12.5\), and \(P_0 = 6.084\).
This gives the solution
\[ P(t) = \frac{12.5}{1.0546 e^{-0.025t} + 1}, \tag{9.4.10} \]

whose graph is shown in Figure 9.4.4. Notice that the graph shows the population leveling off at 12.5 billion, as we expected, and
that the population will be around 10 billion in the year 2050. These results, which we have found using a relatively simple
mathematical model, agree fairly well with predictions made using a much more sophisticated model developed by the United
Nations.
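The quantitative question raised earlier (how long until the population reaches 10 billion) can be answered directly from Equation 9.4.9 by solving for \(t\). The Python sketch below is one way to do that; the function names are illustrative, and \(t\) is measured in years from the moment when \(P = 6.084\), which this passage does not tie to a specific calendar year.

```python
import math

K_RATE = 0.002      # k in Equation 9.4.4
CAPACITY = 12.5     # N, in billions
P_INITIAL = 6.084   # P_0, in billions

def population(t, k=K_RATE, N=CAPACITY, P0=P_INITIAL):
    """Equation 9.4.9: P(t) = N / (((N - P0) / P0) * exp(-k N t) + 1)."""
    return N / (((N - P0) / P0) * math.exp(-k * N * t) + 1)

def time_to_reach(target, k=K_RATE, N=CAPACITY, P0=P_INITIAL):
    """Solve P(t) = target for t; requires 0 < target < N."""
    # From Equation 9.4.9: exp(k N t) = ((N - P0) / P0) / ((N - target) / target).
    ratio = ((N - P0) / P0) / ((N - target) / target)
    return math.log(ratio) / (k * N)

print(population(50))       # population 50 years after t = 0
print(time_to_reach(10.0))  # years after t = 0 at which P = 10
```

With these constants the model gives roughly 9.6 billion at \(t = 50\) and reaches 10 billion at roughly \(t = 57.6\), which is consistent with reading "around 10 billion" off the graph near \(t = 50\).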

Figure 9.4.4: The solution to the logistic equation modeling the earth’s population (Equation 9.4.10).
The logistic equation is useful in other situations, too, as it is good for modeling any situation in which limited growth is possible.
For instance, it could model the spread of a flu virus through a population contained on a cruise ship, the rate at which a rumor
spreads within a small town, or the behavior of an animal population on an island. Again, it is important to realize that through our
work in this section, we have completely solved the logistic equation, regardless of the values of the constants \(N\), \(k\), and \(P_0\).

Anytime we encounter a logistic equation, we can apply the formula we found in Equation 9.4.9.

Activity 9.4.2: Predicting Earth's Population

Consider the logistic equation


\[ \frac{dP}{dt} = kP(N - P) \]
with the graph of \(\frac{dP}{dt}\) vs. \(P\) shown below.

a. At what value of P is the rate of change greatest?


b. Consider the model for the earth’s population that we created. At what value of P is the rate of change greatest? How does
that compare to the population in recent years?
c. According to the model we developed, what will the population be in the year 2100?
d. According to the model we developed, when will the population reach 9 billion?
e. Now consider the general solution to the general logistic initial value problem that we found, given by Equation 9.4.9.
Verify algebraically that \(P(0) = P_0\) and that \(\lim_{t \to \infty} P(t) = N\).

Summary
In this section, we encountered the following important ideas:
If we assume that the rate of growth of a population is proportional to the population, we are led to a model in which the
population grows without bound and at a rate that grows without bound.
By assuming that the per capita growth rate decreases as the population grows, we are led to the logistic model of population
growth, which predicts that the population will eventually stabilize at the carrying capacity.

This page titled 9.4: Models for Population Growth is shared under a CC BY-SA license and was authored, remixed, and/or curated by Matthew
Boelkins, David Austin & Steven Schlicker (ScholarWorks @Grand Valley State University) .
7.6: Population Growth and the Logistic Equation by Matthew Boelkins, David Austin & Steven Schlicker is licensed CC BY-SA 4.0.
Original source: https://activecalculus.org/single.

9.5: Linear Equations
 Learning Objectives
Write a first-order linear differential equation in standard form.
Find an integrating factor and use it to solve a first-order linear differential equation.
Solve applied problems involving first-order linear differential equations.

Earlier, we studied an application of a first-order differential equation that involved solving for the velocity of an object. In
particular, if a ball is thrown upward with an initial velocity of \(v_0\) ft/s, then an initial-value problem that describes the velocity of
the ball after \(t\) seconds is given by
\[ \frac{dv}{dt} = -32 \]
with \(v(0) = v_0\).

This model assumes that the only force acting on the ball is gravity. Now we add to the problem by allowing for the possibility of
air resistance acting on the ball.
Air resistance always acts in the direction opposite to motion. Therefore if an object is rising, air resistance acts in a downward
direction. If the object is falling, air resistance acts in an upward direction (Figure 9.5.1). There is no exact relationship between
the velocity of an object and the air resistance acting on it. For very small objects, air resistance is proportional to velocity; that is,
the force due to air resistance is numerically equal to some constant k times v . For larger (e.g., baseball-sized) objects, depending
on the shape, air resistance can be approximately proportional to the square of the velocity. In fact, air resistance may be
proportional to \(v^{1.5}\), or \(v^{0.9}\), or some other power of \(v\).

Figure 9.5.1 : Forces acting on a moving baseball: gravity acts in a downward direction and air resistance acts in a direction
opposite to the direction of motion.
We will work with the linear approximation for air resistance. If we assume \(k > 0\), then the expression for the force \(F_A\) due to air
resistance is given by \(F_A = -kv\). Therefore the sum of the forces acting on the object is equal to the sum of the gravitational force
and the force due to air resistance. This, in turn, is equal to the mass of the object multiplied by its acceleration at time \(t\) (Newton’s
second law). This gives us the differential equation
\[ m\frac{dv}{dt} = -kv - mg. \]

Finally, we impose an initial condition \(v(0) = v_0\), where \(v_0\) is the initial velocity measured in meters per second. This makes
\(g = 9.8 \text{ m/s}^2\). The initial-value problem becomes
\[ m\frac{dv}{dt} = -kv - mg \]
with \(v(0) = v_0\).

9.5.1 https://math.libretexts.org/@go/page/4502
The differential equation in this initial-value problem is an example of a first-order linear differential equation. (Recall that a
differential equation is first-order if the highest-order derivative that appears in the equation is the first derivative.) In this section, we study
first-order linear equations and examine a method for finding a general solution to these types of equations, as well as solving
initial-value problems involving them.

 Definition: Linear first-order differential equation

A first-order differential equation is linear if it can be written in the form

a(x)y' + b(x)y = c(x),

where a(x), b(x), and c(x) are arbitrary functions of x.

Remember that the unknown function y depends on the variable x; that is, x is the independent variable and y is the dependent
variable. Some examples of first-order linear differential equations are
\[ (3x^2 - 4)y' + (x - 3)y = \sin x \]
\[ (\sin x)y' - (\cos x)y = \cot x \]
\[ 4xy' + (3 \ln x)y = x^3 - 4x. \]

Examples of first-order nonlinear differential equations include
\[ (y')^4 - (y')^3 = (3x - 2)(y + 4) \]
\[ 4y' + 3y^3 = 4x - 5 \]
\[ (y')^2 = \sin y + \cos x. \]

These equations are nonlinear because of terms like \((y')^4\), \(y^3\), etc. Due to these terms, it is impossible to put these equations into
the form \(a(x)y' + b(x)y = c(x)\) given in the definition.

Standard Form
Consider the differential equation
\[ (3x^2 - 4)y' + (x - 3)y = \sin x. \]
Our main goal in this section is to derive a solution method for equations of this form. It is useful to have the coefficient of \(y'\) be
equal to 1. To make this happen, we divide both sides by \(3x^2 - 4\):
\[ y' + \left(\frac{x - 3}{3x^2 - 4}\right) y = \frac{\sin x}{3x^2 - 4}. \]

This is called the standard form of the differential equation. We will use it later when finding the solution to a general first-order
linear differential equation. Returning to the general linear form \(a(x)y' + b(x)y = c(x)\), we can divide both sides of the equation by \(a(x)\). This leads to the equation
\[ y' + \frac{b(x)}{a(x)} y = \frac{c(x)}{a(x)}. \tag{9.5.1} \]

Now define
\[ p(x) = \frac{b(x)}{a(x)} \]
and
\[ q(x) = \frac{c(x)}{a(x)}. \]

Then Equation 9.5.1 becomes

y' + p(x)y = q(x).

We can write any first-order linear differential equation in this form, and this is referred to as the standard form for a first-order
linear differential equation.

 Example 9.5.1: Writing First-Order Linear Equations in Standard Form

Put each of the following first-order linear differential equations into standard form. Identify p(x) and q(x) for each equation.
a. \(y' = 3x - 4y\)
b. \(\frac{3xy'}{4y - 3} = 2\) (here \(x > 0\))
c. \(y = 3y' - 4x^2 + 5\)

Solution
a. Add 4y to both sides:

\[ y' + 4y = 3x. \]

In this equation, p(x) = 4 and q(x) = 3x.


b. Multiply both sides by \(4y - 3\), then subtract \(8y\) from each side:
\begin{align*}
\frac{3xy'}{4y - 3} &= 2 \\
3xy' &= 2(4y - 3) \\
3xy' &= 8y - 6 \\
3xy' - 8y &= -6.
\end{align*}
Finally, divide both sides by \(3x\) to make the coefficient of \(y'\) equal to 1:
\[ y' - \frac{8}{3x} y = -\frac{2}{x}. \]
This is allowable because in the original statement of this problem we assumed that \(x > 0\). (If \(x = 0\) then the original equation
becomes \(0 = 2\), which is clearly a false statement.)
In this equation, \(p(x) = -\frac{8}{3x}\) and \(q(x) = -\frac{2}{x}\), since \(-6/(3x) = -2/x\).

c. Subtract \(y\) from each side and add \(4x^2 - 5\):
\[ 3y' - y = 4x^2 - 5. \]
Next divide both sides by 3:
\[ y' - \frac{1}{3} y = \frac{4}{3}x^2 - \frac{5}{3}. \]
In this equation, \(p(x) = -\frac{1}{3}\) and \(q(x) = \frac{4}{3}x^2 - \frac{5}{3}\).

 Exercise 9.5.1

Put the equation \(\frac{(x + 3)y'}{2x - 3y - 4} = 5\) into standard form and identify \(p(x)\) and \(q(x)\).

Hint
Multiply both sides by the common denominator, then collect all terms involving y on one side.

Answer
\[ y' + \frac{15}{x + 3} y = \frac{10x - 20}{x + 3} \]
\[ p(x) = \frac{15}{x + 3} \]
and
\[ q(x) = \frac{10x - 20}{x + 3} \]

Integrating Factors
We now develop a solution technique for any first-order linear differential equation. We start with the standard form of a first-order
linear differential equation:

\[ y' + p(x)y = q(x). \tag{9.5.2} \]

The first term on the left-hand side of Equation 9.5.2 is the derivative of the unknown function, and the second term is the product of a
known function with the unknown function. This is somewhat reminiscent of the product rule. If we multiply Equation 9.5.2 by a
yet-to-be-determined function \(\mu(x)\), then the equation becomes
μ(x)y' + μ(x)p(x)y = μ(x)q(x). (9.5.3)

The left-hand side of Equation 9.5.3 can be matched perfectly to the product rule:
\[ \frac{d}{dx}[f(x)g(x)] = f'(x)g(x) + f(x)g'(x). \]

Matching term by term gives y = f (x), g(x) = μ(x) , and g'(x) = μ(x)p(x). Taking the derivative of g(x) = μ(x) and setting it
equal to the right-hand side of g'(x) = μ(x)p(x) leads to
μ'(x) = μ(x)p(x).

This is a first-order, separable differential equation for μ(x). We know p(x) because it appears in the differential equation we are
solving. Separating variables and integrating yields
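\begin{align}
\frac{\mu'(x)}{\mu(x)} &= p(x) \tag{9.5.4} \\
\int \frac{\mu'(x)}{\mu(x)} \, dx &= \int p(x) \, dx \tag{9.5.5} \\
\ln|\mu(x)| &= \int p(x) \, dx + C \tag{9.5.6} \\
e^{\ln|\mu(x)|} &= e^{\int p(x) \, dx + C} \tag{9.5.7} \\
|\mu(x)| &= C_1 e^{\int p(x) \, dx} \tag{9.5.8} \\
\mu(x) &= C_2 e^{\int p(x) \, dx}. \tag{9.5.9}
\end{align}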

Here \(C_2\) can be an arbitrary (positive or negative) constant. This leads to a general method for solving a first-order linear
differential equation. We first multiply both sides of Equation 9.5.2 by the integrating factor \(\mu(x)\). This gives
\[ \mu(x)y' + \mu(x)p(x)y = \mu(x)q(x). \tag{9.5.10} \]

The left-hand side of Equation 9.5.10 can be rewritten as \(\frac{d}{dx}(\mu(x)y)\), so
\[ \frac{d}{dx}(\mu(x)y) = \mu(x)q(x). \tag{9.5.11} \]
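The rewriting of the left-hand side relies on the defining property \(\mu'(x) = \mu(x)p(x)\) of the integrating factor. Written out as an added check using the product rule:

```latex
\frac{d}{dx}\bigl(\mu(x)\,y\bigr)
  = \mu(x)\,y' + \mu'(x)\,y
  = \mu(x)\,y' + \mu(x)\,p(x)\,y .
```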

Next integrate both sides of Equation 9.5.11 with respect to x.

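\begin{align}
\int \frac{d}{dx}(\mu(x)y) \, dx &= \int \mu(x)q(x) \, dx \tag{9.5.12} \\
\mu(x)y &= \int \mu(x)q(x) \, dx \tag{9.5.13}
\end{align}
Divide both sides of Equation 9.5.13 by \(\mu(x)\), writing the constant of integration explicitly:
\[ y = \frac{1}{\mu(x)}\left[\int \mu(x)q(x) \, dx + C\right]. \]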

Since \(\mu(x)\) was previously calculated, we are now finished. An important note about the integrating constant \(C\): It may seem that
we are inconsistent in the usage of the integrating constant. However, the integral involving \(p(x)\) is necessary in order to find an
integrating factor for Equation 9.5.2. Only one integrating factor is needed in order to solve the equation; therefore, it is safe to assign a
value for \(C\) for this integral. We chose \(C = 0\). When calculating the integral inside the brackets in the formula for \(y\), it is necessary to keep
our options open for the value of the integrating constant, because our goal is to find a general family of solutions to Equation 9.5.2. Keeping
this constant arbitrary guarantees just that.

 Problem-Solving Strategy: Solving a First-order Linear Differential Equation


1. Put the equation into standard form and identify p(x) and q(x).
2. Calculate the integrating factor \(\mu(x) = e^{\int p(x) \, dx}\).

3. Multiply both sides of the differential equation by μ(x).


4. Integrate both sides of the equation obtained in step 3, and divide both sides by μ(x).
5. If there is an initial condition, determine the value of C .
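The five steps above can also be carried out numerically when the integrals are awkward to do by hand. The following Python sketch is one possible approach, not a method from the text: it normalizes the integrating factor so that \(\mu(x_0) = 1\), which gives \(\mu(x)y(x) = y_0 + \int_{x_0}^{x} \mu(s)q(s)\,ds\), and approximates both integrals with the trapezoidal rule. The function name and the sample equation \(y' + y = 1\), \(y(0) = 0\) are hypothetical choices for illustration.

```python
import math

def solve_linear_ivp(p, q, x0, y0, x, n=2000):
    """Approximate y(x) for y' + p(x) y = q(x), y(x0) = y0.

    Uses mu(s) = exp(integral of p from x0 to s), so mu(x0) = 1 and
    mu(x) y(x) = y0 + integral from x0 to x of mu(s) q(s) ds.
    Both integrals use the trapezoidal rule with n subintervals.
    """
    h = (x - x0) / n
    big_p = 0.0        # running integral of p; mu = exp(big_p)
    integral = 0.0     # running integral of mu * q
    p_prev = p(x0)
    g_prev = q(x0)     # mu(x0) * q(x0) with mu(x0) = 1
    for i in range(1, n + 1):
        s = x0 + i * h
        p_cur = p(s)
        big_p += 0.5 * h * (p_prev + p_cur)
        g_cur = math.exp(big_p) * q(s)
        integral += 0.5 * h * (g_prev + g_cur)
        p_prev, g_prev = p_cur, g_cur
    return (y0 + integral) / math.exp(big_p)

# Illustrative equation: y' + y = 1, y(0) = 0, with exact solution 1 - e^(-x).
approx = solve_linear_ivp(lambda x: 1.0, lambda x: 1.0, 0.0, 0.0, 2.0)
print(approx, 1 - math.exp(-2.0))
```

Fixing \(\mu(x_0) = 1\) plays the same role as choosing one particular integrating factor in step 2, and the initial condition enters as the constant \(y_0\) in place of \(C\).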

 Example 9.5.2: Solving a First-order Linear Equation

Find a general solution for the differential equation \(xy' + 3y = 4x^2 - 3x\). Assume \(x > 0\).
Solution
1. To put this differential equation into standard form, divide both sides by \(x\):
\[ y' + \frac{3}{x} y = 4x - 3. \]
Therefore \(p(x) = \frac{3}{x}\) and \(q(x) = 4x - 3\).
2. The integrating factor is \(\mu(x) = e^{\int (3/x) \, dx} = e^{3 \ln x} = x^3\).
3. Multiplying both sides of the differential equation by \(\mu(x)\) gives us
\begin{align*}
x^3 y' + x^3 \left(\frac{3}{x}\right) y &= x^3(4x - 3) \\
x^3 y' + 3x^2 y &= 4x^4 - 3x^3 \\
\frac{d}{dx}(x^3 y) &= 4x^4 - 3x^3.
\end{align*}
4. Integrate both sides of the equation.
\begin{align*}
\int \frac{d}{dx}(x^3 y) \, dx &= \int (4x^4 - 3x^3) \, dx \\
x^3 y &= \frac{4x^5}{5} - \frac{3x^4}{4} + C \\
y &= \frac{4x^2}{5} - \frac{3x}{4} + Cx^{-3}.
\end{align*}

5. There is no initial value, so the problem is complete.
Analysis
You may have noticed the condition that was imposed on the differential equation; namely, x > 0 . For any nonzero value of C ,
the general solution is not defined at \(x = 0\). Furthermore, when \(x < 0\), the integrating factor changes. The integrating factor is
given by \(\mu(x) = e^{\int p(x) \, dx}\). For this \(p(x)\) we get
\[ e^{\int p(x) \, dx} = e^{\int (3/x) \, dx} = e^{3 \ln |x|} = |x|^3 = -x^3 \]
since \(x < 0\). The behavior of the general solution changes at \(x = 0\) largely due to the fact that \(p(x)\) is not defined there.
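One way to gain confidence in the general solution of Example 9.5.2 is to substitute it back into the original equation for several values of \(x > 0\) and \(C\). The Python sketch below does this with a hand-computed derivative; the function names and the sample points are illustrative choices.

```python
def y(x, C):
    """General solution from Example 9.5.2: 4x^2/5 - 3x/4 + C x^(-3)."""
    return 4 * x**2 / 5 - 3 * x / 4 + C * x**-3

def y_prime(x, C):
    """Derivative of y with respect to x, differentiated by hand."""
    return 8 * x / 5 - 3 / 4 - 3 * C * x**-4

def residual(x, C):
    """Left side minus right side of x y' + 3 y = 4 x^2 - 3 x."""
    return x * y_prime(x, C) + 3 * y(x, C) - (4 * x**2 - 3 * x)

# The residual should vanish (up to rounding) for every x > 0 and every C.
worst = max(abs(residual(x, C))
            for x in (0.5, 1.0, 2.0, 7.5)
            for C in (-3.0, 0.0, 1.0, 10.0))
print(worst)
```

The terms involving \(C\) cancel exactly in the residual, which is why the constant remains free in the general solution.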

 Exercise 9.5.2

Find the general solution to the differential equation \((x - 2)y' + y = 3x^2 + 2x\). Assume \(x > 2\).

Hint
Use the method outlined in the problem-solving strategy for first-order linear differential equations.

Answer
\[ y = \frac{x^3 + x^2 + C}{x - 2} \]

Now we use the same strategy to find the solution to an initial-value problem.

 Example 9.5.3: A First-order Linear Initial-Value Problem

Solve the initial-value problem

y' + 3y = 2x − 1, y(0) = 3.

Solution
1. This differential equation is already in standard form with p(x) = 3 and q(x) = 2x − 1 .
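2. The integrating factor is \(\mu(x) = e^{\int 3 \, dx} = e^{3x}\).
3. Multiplying both sides of the differential equation by \(\mu(x)\) gives
\begin{align*}
e^{3x} y' + 3e^{3x} y &= (2x - 1)e^{3x} \\
\frac{d}{dx}\left[y e^{3x}\right] &= (2x - 1)e^{3x}.
\end{align*}
Integrate both sides of the equation, using integration by parts on the right:
\begin{align*}
\int \frac{d}{dx}\left[y e^{3x}\right] dx &= \int (2x - 1)e^{3x} \, dx \\
y e^{3x} &= \frac{e^{3x}}{3}(2x - 1) - \int \frac{2}{3} e^{3x} \, dx \\
y e^{3x} &= \frac{e^{3x}(2x - 1)}{3} - \frac{2e^{3x}}{9} + C \\
y &= \frac{2x - 1}{3} - \frac{2}{9} + Ce^{-3x} \\
y &= \frac{2x}{3} - \frac{5}{9} + Ce^{-3x}.
\end{align*}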

4. Now substitute x = 0 and y = 3 into the general solution and solve for C :
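\begin{align*}
y &= \frac{2}{3}x - \frac{5}{9} + Ce^{-3x} \\
3 &= \frac{2}{3}(0) - \frac{5}{9} + Ce^{-3(0)} \\
3 &= -\frac{5}{9} + C \\
C &= \frac{32}{9}.
\end{align*}
Therefore the solution to the initial-value problem is
\[ y = \frac{2}{3}x - \frac{5}{9} + \frac{32}{9}e^{-3x}. \]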
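As an added check, the solution can be substituted back into the differential equation and the initial condition:

```latex
\begin{align*}
y' &= \frac{2}{3} - \frac{32}{3}e^{-3x}, \\
y' + 3y &= \frac{2}{3} - \frac{32}{3}e^{-3x}
          + 2x - \frac{5}{3} + \frac{32}{3}e^{-3x} = 2x - 1, \\
y(0) &= -\frac{5}{9} + \frac{32}{9} = 3 .
\end{align*}
```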

 Example 9.5.4:

Solve the initial-value problem



\[ y' - 2y = 4x + 3, \quad y(0) = -2. \]

Solution
\[ y = -2x - \frac{5}{2} + \frac{1}{2}e^{2x} \]

Applications of First-order Linear Differential Equations


We look at two different applications of first-order linear differential equations. The first involves air resistance as it relates to
objects that are rising or falling; the second involves an electrical circuit. Other applications are numerous, but most are solved in a
similar fashion.

Free fall with air resistance


We discussed air resistance at the beginning of this section. The next example shows how to apply this concept for a ball in vertical
motion. Other factors can affect the force of air resistance, such as the size and shape of the object, but we ignore them here.

 Example 9.5.5: A Ball with Air Resistance

A racquetball is hit straight upward with an initial velocity of 2 m/s. The mass of a racquetball is approximately 0.0427 kg. Air
resistance acts on the ball with a force numerically equal to 0.5v, where v represents the velocity of the ball at time t .
a. Find the velocity of the ball as a function of time.
b. How long does it take for the ball to reach its maximum height?
c. If the ball is hit from an initial height of 1 meter, how high will it reach?
Solution
a. The mass \(m = 0.0427\) kg, \(k = 0.5\), and \(g = 9.8 \text{ m/s}^2\). The initial velocity is \(v_0 = 2\) m/s. Therefore the initial-value
problem is
\[ 0.0427\frac{dv}{dt} = -0.5v - 0.0427(9.8), \quad v(0) = 2. \]
Dividing the differential equation by 0.0427 gives
\[ \frac{dv}{dt} = -11.7096v - 9.8, \quad v(0) = 2. \]

The differential equation is linear. Using the problem-solving strategy for linear differential equations:
Step 1. Rewrite the differential equation as \(\frac{dv}{dt} + 11.7096v = -9.8\). This gives \(p(t) = 11.7096\) and \(q(t) = -9.8\).

Step 2. The integrating factor is \(\mu(t) = e^{\int 11.7096 \, dt} = e^{11.7096t}\).
Step 3. Multiply the differential equation by \(\mu(t)\):
\begin{align*}
e^{11.7096t}\frac{dv}{dt} + 11.7096ve^{11.7096t} &= -9.8e^{11.7096t} \\
\frac{d}{dt}\left[ve^{11.7096t}\right] &= -9.8e^{11.7096t}.
\end{align*}
Step 4. Integrate both sides:
\begin{align*}
\int \frac{d}{dt}\left[ve^{11.7096t}\right] dt &= \int -9.8e^{11.7096t} \, dt \\
ve^{11.7096t} &= \frac{-9.8}{11.7096}e^{11.7096t} + C \\
v(t) &= -0.8369 + Ce^{-11.7096t}.
\end{align*}
Step 5. Solve for \(C\) using the initial condition \(v_0 = v(0) = 2\):
\begin{align*}
v(t) &= -0.8369 + Ce^{-11.7096t} \\
v(0) &= -0.8369 + Ce^{-11.7096(0)} \\
2 &= -0.8369 + C \\
C &= 2.8369.
\end{align*}
Therefore the solution to the initial-value problem is
\[ v(t) = 2.8369e^{-11.7096t} - 0.8369. \]

b. The ball reaches its maximum height when the velocity is equal to zero. The reason is that when the velocity is positive, it is
rising, and when it is negative, it is falling. Therefore when it is zero, it is neither rising nor falling, and is at its maximum
height:
\begin{align*}
2.8369e^{-11.7096t} - 0.8369 &= 0 \\
2.8369e^{-11.7096t} &= 0.8369 \\
e^{-11.7096t} &= \frac{0.8369}{2.8369} \approx 0.295 \\
\ln e^{-11.7096t} &= \ln 0.295 \approx -1.221 \\
-11.7096t &= -1.221 \\
t &\approx 0.104.
\end{align*}

Therefore it takes approximately 0.104 second to reach maximum height.


c. To find the height of the ball as a function of time, use the fact that the derivative of position is velocity, i.e., if h(t)
represents the height at time t , then h'(t) = v(t) . Because we know v(t) and the initial height, we can form an initial-value
problem:
\[ h'(t) = 2.8369e^{-11.7096t} - 0.8369, \quad h(0) = 1. \]
Integrating both sides of the differential equation with respect to \(t\) gives
\begin{align*}
\int h'(t) \, dt &= \int \left(2.8369e^{-11.7096t} - 0.8369\right) dt \\
h(t) &= -\frac{2.8369}{11.7096}e^{-11.7096t} - 0.8369t + C \\
h(t) &= -0.2423e^{-11.7096t} - 0.8369t + C.
\end{align*}
Solve for \(C\) by using the initial condition:
\begin{align*}
h(t) &= -0.2423e^{-11.7096t} - 0.8369t + C \\
h(0) &= -0.2423e^{-11.7096(0)} - 0.8369(0) + C \\
1 &= -0.2423 + C \\
C &= 1.2423.
\end{align*}
Therefore
\[ h(t) = -0.2423e^{-11.7096t} - 0.8369t + 1.2423. \]
After 0.104 second, the height is given by
\[ h(0.104) = -0.2423e^{-11.7096(0.104)} - 0.8369(0.104) + 1.2423 \approx 1.0836 \text{ meter}. \]
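The three parts of Example 9.5.5 can be reproduced without intermediate rounding by keeping \(m\), \(k\), \(g\), and \(v_0\) symbolic until the end. The Python sketch below uses the closed forms that follow from the same integrating-factor solution, with terminal velocity \(-mg/k\); the function names are illustrative.

```python
import math

def velocity(t, m, k, g, v0):
    """Solution of m v' = -k v - m g with v(0) = v0."""
    terminal = -m * g / k
    return (v0 - terminal) * math.exp(-(k / m) * t) + terminal

def time_of_max_height(m, k, g, v0):
    """Time at which v(t) = 0, assuming v0 > 0."""
    return (m / k) * math.log(1 + k * v0 / (m * g))

def height(t, m, k, g, v0, h0):
    """Integral of velocity from 0 to t, plus the initial height h0."""
    terminal = -m * g / k
    decay = 1 - math.exp(-(k / m) * t)
    return h0 + (m / k) * (v0 - terminal) * decay + terminal * t

m, k, g, v0, h0 = 0.0427, 0.5, 9.8, 2.0, 1.0
t_max = time_of_max_height(m, k, g, v0)
print(t_max, height(t_max, m, k, g, v0, h0))
```

With the values from the example, this gives a time of about 0.104 second and a maximum height of about 1.0836 meters, in agreement with the rounded hand computation.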

 Exercise 9.5.3

The weight of a penny is 2.5 grams (United States Mint, “Coin Specifications,” accessed April 9, 2015,
www.usmint.gov/about_the_mint...specifications), and the upper observation deck of the Empire State Building is 369 meters
above the street. Since the penny is a small and relatively smooth object, air resistance acting on the penny is actually quite
small. We assume the air resistance is numerically equal to 0.0025v. Furthermore, the penny is dropped with no initial velocity
imparted to it.
a. Set up an initial-value problem that represents the falling penny.
b. Solve the problem for v(t) .
c. What is the terminal velocity of the penny (i.e., calculate the limit of the velocity as t approaches infinity)?

Hint
Set up the differential equation the same way as in Example 9.5.5. Remember to convert from grams to kilograms.

Answer
a. \(\frac{dv}{dt} = -v - 9.8\), \(v(0) = 0\)
b. \(v(t) = 9.8(e^{-t} - 1)\)
c. \(\lim_{t \to \infty} v(t) = \lim_{t \to \infty} \left(9.8(e^{-t} - 1)\right) = -9.8 \text{ m/s} \approx -21.922 \text{ mph}\)

Electrical Circuits
A source of electromotive force (e.g., a battery or generator) produces a flow of current in a closed circuit, and this current
produces a voltage drop across each resistor, inductor, and capacitor in the circuit. Kirchhoff’s Loop Rule states that the sum of the
voltage drops across resistors, inductors, and capacitors is equal to the total electromotive force in a closed circuit. We have the
following three results:
1. The voltage drop across a resistor is given by
\[ E_R = Ri, \]
where \(R\) is a constant of proportionality called the resistance, and \(i\) is the current.
2. The voltage drop across an inductor is given by
\[ E_L = Li', \]
where \(L\) is a constant of proportionality called the inductance, and \(i\) again denotes the current.
3. The voltage drop across a capacitor is given by
\[ E_C = \frac{1}{C} q, \]
where \(C\) is a constant of proportionality called the capacitance, and \(q\) is the instantaneous charge on the capacitor. The relationship
between \(i\) and \(q\) is \(i = q'\).
We use units of volts (V) to measure voltage \(E\), amperes (A) to measure current \(i\), coulombs (C) to measure charge \(q\), ohms (Ω)
to measure resistance \(R\), henrys (H) to measure inductance \(L\), and farads (F) to measure capacitance \(C\). Consider the circuit in
Figure 9.5.2.

Figure 9.5.2: A typical electric circuit, containing a voltage generator (\(V_S\)), capacitor (\(C\)), inductor (\(L\)), and resistor (\(R\)).
Applying Kirchhoff’s Loop Rule to this circuit, we let \(E\) denote the electromotive force supplied by the voltage generator. Then
\[ E_L + E_R + E_C = E. \]
Substituting the expressions for \(E_L\), \(E_R\), and \(E_C\) into this equation, we obtain
\[ Li' + Ri + \frac{1}{C} q = E. \]

If there is no capacitor in the circuit, then the equation becomes


Li' + Ri = E.

This is a first-order differential equation in \(i\). The circuit is referred to as an LR circuit.
Next, suppose there is no inductor in the circuit, but there is a capacitor and a resistor, so \(L = 0\), \(R \neq 0\), and \(C \neq 0\). Then, since \(i = q'\), the equation \(Li' + Ri + \frac{1}{C}q = E\)
can be rewritten as
\[ Rq' + \frac{1}{C} q = E, \]

which is a first-order linear differential equation. This is referred to as an RC circuit. In either case, we can set up and solve an
initial-value problem.

 Electric Circuit

A circuit has in series an electromotive force given by \(E = 50 \sin 20t\) V, a resistor of 5 Ω, and an inductor of 0.4 H. If the
initial current is 0, find the current at time \(t > 0\).
Solution
We have a resistor and an inductor in the circuit, so we use the LR circuit equation \(Li' + Ri = E\). The voltage drop across the resistor is given by
\(E_R = Ri = 5i\). The voltage drop across the inductor is given by \(E_L = Li' = 0.4i'\). The electromotive force becomes the
right-hand side of the equation. Therefore the equation becomes
\[ 0.4i' + 5i = 50 \sin 20t. \]

Dividing both sides by 0.4 gives the equation

i' + 12.5i = 125 sin 20t.

Since the initial current is 0, this result gives an initial condition of i(0) = 0. We can solve this initial-value problem using the
five-step strategy for solving first-order differential equations.
Step 1. Rewrite the differential equation as i' + 12.5i = 125 sin 20t. This gives p(t) = 12.5 and q(t) = 125 sin 20t .
Step 2. The integrating factor is \(\mu(t) = e^{\int 12.5 \, dt} = e^{12.5t}\).
Step 3. Multiply the differential equation by \(\mu(t)\):
\begin{align*}
e^{12.5t} i' + 12.5e^{12.5t} i &= 125e^{12.5t} \sin 20t \\
\frac{d}{dt}\left[ie^{12.5t}\right] &= 125e^{12.5t} \sin 20t.
\end{align*}

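Step 4. Integrate both sides:
\begin{align*}
\int \frac{d}{dt}\left[ie^{12.5t}\right] dt &= \int 125e^{12.5t} \sin 20t \, dt \\
ie^{12.5t} &= \left(\frac{250 \sin 20t - 400 \cos 20t}{89}\right) e^{12.5t} + C \\
i(t) &= \frac{250 \sin 20t - 400 \cos 20t}{89} + Ce^{-12.5t}.
\end{align*}
Step 5. Solve for \(C\) using the initial condition \(i(0) = 0\):
\begin{align*}
i(t) &= \frac{250 \sin 20t - 400 \cos 20t}{89} + Ce^{-12.5t} \\
i(0) &= \frac{250 \sin 20(0) - 400 \cos 20(0)}{89} + Ce^{-12.5(0)} \\
0 &= -\frac{400}{89} + C \\
C &= \frac{400}{89}.
\end{align*}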

Therefore the solution to the initial-value problem is


−12.5t −12.5t
250 sin 20t − 400 cos 20t + 400e 250 sin 20t − 400 cos 20t 400e
i(t) = = + .
89 89 89

−−−−−−−−− −−
The first term can be rewritten as a single cosine function. First, multiply and divide by √250 2
+ 400
2
= 50 √89 :
−− −−
250 sin 20t − 400 cos 20t 50 √89 250 sin 20t − 400 cos 20t 50 √89 8 cos 20t 5 sin 20t
= ( −− ) =− ( −− − −− ) .
89 89 50 √89 89 √89 √89

8 5
Next, define φ to be an acute angle such that cos φ = −− . Then sin φ = −− and
√89 √89

−− −− −−
50 √89 8 cos 20t 5 sin 20t 50 √89 50 √89
− ( −− − −− ) =− (cos φ cos 20t − sin φ sin 20t) = − cos(20t + φ).
89 √89 √89 89 89

Therefore the solution can be written as
\[ i(t) = -\frac{50\sqrt{89}}{89} \cos(20t + \varphi) + \frac{400e^{-12.5t}}{89}. \]
The second term is called the attenuation term, because it disappears rapidly as \(t\) grows larger. The phase shift is given by \(\varphi\),
and the amplitude of the steady-state current is given by \(\frac{50\sqrt{89}}{89}\). The graph of this solution appears in Figure 9.5.3:

Figure 9.5.3 .
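The two forms of the current derived in this example can be compared numerically, and the first form can be substituted back into \(0.4i' + 5i = 50 \sin 20t\). The Python sketch below does both; the function names are illustrative, and the derivative is computed by hand from the formula for \(i(t)\).

```python
import math

def current(t):
    """Solution of 0.4 i' + 5 i = 50 sin(20 t) with i(0) = 0."""
    return (250 * math.sin(20 * t) - 400 * math.cos(20 * t)
            + 400 * math.exp(-12.5 * t)) / 89

def current_derivative(t):
    """Derivative of current(t), differentiated by hand."""
    return (5000 * math.cos(20 * t) + 8000 * math.sin(20 * t)
            - 5000 * math.exp(-12.5 * t)) / 89

def current_phase_form(t):
    """Amplitude-phase form; phi is the acute angle with cos(phi) = 8/sqrt(89)."""
    phi = math.atan2(5, 8)
    amplitude = 50 * math.sqrt(89) / 89
    return -amplitude * math.cos(20 * t + phi) + 400 * math.exp(-12.5 * t) / 89

def residual(t):
    """Left side minus right side of the circuit equation."""
    return 0.4 * current_derivative(t) + 5 * current(t) - 50 * math.sin(20 * t)

sample_times = [0.0, 0.05, 0.3, 1.0, 2.5]
print(max(abs(current(t) - current_phase_form(t)) for t in sample_times))
print(max(abs(residual(t)) for t in sample_times))
```

Both printed values should be at the level of floating-point rounding, which confirms that the single-cosine form is the same function and that it satisfies the differential equation.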

 Exercise 9.5.4

A circuit has in series an electromotive force given by \(E = 20 \sin 5t\) V, a capacitor with capacitance 0.02 F, and a resistor of
8 Ω. If the initial charge is 4 C, find the charge at time \(t > 0\).

Hint
Use the equation \(Rq' + \frac{1}{C}q = E\) for an RC circuit to set up an initial-value problem.

Answer
Initial-value problem:
\[ 8q' + \frac{1}{0.02} q = 20 \sin 5t, \quad q(0) = 4 \]
\[ q(t) = \frac{10 \sin 5t - 8 \cos 5t + 172e^{-6.25t}}{41} \]

Key Concepts
Any first-order linear differential equation can be written in the form \(y' + p(x)y = q(x)\).

We can use a five-step problem-solving strategy for solving a first-order linear differential equation that may or may not include
an initial value.
Applications of first-order linear differential equations include determining motion of a rising or falling object with air
resistance and finding current in an electrical circuit.

Key Equations
standard form

\[ y' + p(x)y = q(x) \]

integrating factor
\[ \mu(x) = e^{\int p(x) \, dx} \]

Glossary
integrating factor

any function f (x) that is multiplied on both sides of a differential equation to make the side involving the unknown function
equal to the derivative of a product of two functions

linear
description of a first-order differential equation that can be written in the form a(x)y' + b(x)y = c(x)

standard form
the form of a first-order linear differential equation obtained by writing the differential equation in the form \(y' + p(x)y = q(x)\)

9.5: Linear Equations is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
8.5: First-order Linear Equations by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0. Original source:
https://openstax.org/details/books/calculus-volume-1.

9.6: Predator-Prey Systems
 Learning Objectives
Describe the concept of environmental carrying capacity in the logistic model of population growth.
Draw a direction field for a logistic equation and interpret the solution curves.
Solve a logistic equation and interpret the results.

Differential equations can be used to represent the size of a population as it varies over time. We saw this in an earlier chapter in
the section on exponential growth and decay, which is the simplest model. A more realistic model includes other factors that affect
the growth of the population. In this section, we study the logistic differential equation and see how it applies to the study of
population dynamics in the context of biology.

Population Growth and Carrying Capacity


To model population growth using a differential equation, we first need to introduce some variables and relevant terms. The
variable \(t\) will represent time. The units of time can be hours, days, weeks, months, or even years. Any given problem must
specify the units used in that particular problem. The variable \(P\) will represent population. Since the population varies over time, it
is understood to be a function of time. Therefore we use the notation \(P(t)\) for the population as a function of time. If \(P(t)\) is a
differentiable function, then the first derivative \(\frac{dP}{dt}\) represents the instantaneous rate of change of the population as a function of
time.
In Exponential Growth and Decay, we studied the exponential growth and decay of populations and radioactive substances. An
example of an exponential growth function is \(P(t) = P_0 e^{rt}\). In this function, \(P(t)\) represents the population at time \(t\), \(P_0\)
represents the initial population (population at time \(t = 0\)), and the constant \(r > 0\) is called the growth rate. Figure 9.6.1 shows a
graph of \(P(t) = 100e^{0.03t}\). Here \(P_0 = 100\) and \(r = 0.03\).

Figure 9.6.1 : An exponential growth model of population.


We can verify that the function \(P(t) = P_0 e^{rt}\) satisfies the initial-value problem
\[ \frac{dP}{dt} = rP \]
with \(P(0) = P_0\).
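The verification takes one differentiation and one substitution; it is written out here as an added step:

```latex
\frac{dP}{dt} = \frac{d}{dt}\left(P_0 e^{rt}\right) = r P_0 e^{rt} = rP(t),
\qquad
P(0) = P_0 e^{r \cdot 0} = P_0 .
```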

This differential equation has an interesting interpretation. The left-hand side represents the rate at which the population increases
(or decreases). The right-hand side is equal to a positive constant multiplied by the current population. Therefore the differential
equation states that the rate at which the population increases is proportional to the population at that point in time. Furthermore, it
states that the constant of proportionality never changes.
One problem with this function is its prediction that as time goes on, the population grows without bound. This is unrealistic in a
real-world setting. Various factors limit the rate of growth of a particular population, including birth rate, death rate, food supply,
predators, and so on. The growth constant r usually takes into consideration the birth and death rates but none of the other factors,
and it can be interpreted as a net (birth minus death) percent growth rate per unit time. A natural question to ask is whether the
population growth rate stays constant, or whether it changes over time. Biologists have found that in many biological systems, the
population grows until a certain steady-state population is reached. This possibility is not taken into account with exponential
growth. However, the concept of carrying capacity allows for the possibility that in a given area, only a certain number of a given
organism or animal can thrive without running into resource issues.

Definition: Carrying Capacity

The carrying capacity of an organism in a given environment is defined to be the maximum population of that organism that
the environment can sustain indefinitely.

We use the variable K to denote the carrying capacity. The growth rate is represented by the variable r. Using these variables, we
can define the logistic differential equation.

 Definition: Logistic Differential Equation

Let $K$ represent the carrying capacity for a particular organism in a given environment, and let $r$ be a real number that
represents the growth rate. The function $P(t)$ represents the population of this organism as a function of time $t$, and the
constant $P_0$ represents the initial population (population of the organism at time $t = 0$). Then the logistic differential equation
is

$$\frac{dP}{dt} = rP\left(1 - \frac{P}{K}\right). \tag{9.6.1}$$

The logistic equation was first published by Pierre Verhulst in 1845. This differential equation can be coupled with the initial
condition $P(0) = P_0$ to form an initial-value problem for $P(t)$.

Suppose that the initial population is small relative to the carrying capacity. Then $\frac{P}{K}$ is small, possibly close to zero. Thus, the
quantity in parentheses on the right-hand side of Equation 9.6.1 is close to 1, and the right-hand side of this equation is close to $rP$.
If $r > 0$, then the population grows rapidly, resembling exponential growth.
However, as the population grows, the ratio $\frac{P}{K}$ also grows, because $K$ is constant. If the population remains below the carrying
capacity, then $\frac{P}{K}$ is less than 1, so $1 - \frac{P}{K} > 0$. Therefore the right-hand side of Equation 9.6.1 is still positive, but the quantity in
parentheses gets smaller, and the growth rate decreases as a result. If $P = K$ then the right-hand side is equal to zero, and the
population does not change.
Now suppose that the population starts at a value higher than the carrying capacity. Then $\frac{P}{K} > 1$, and $1 - \frac{P}{K} < 0$. Then the
right-hand side of Equation 9.6.1 is negative, and the population decreases. As long as $P > K$, the population decreases. It never
actually reaches $K$ because $\frac{dP}{dt}$ will get smaller and smaller, but the population approaches the carrying capacity as $t$ approaches
infinity. This analysis can be represented visually by way of a phase line. A phase line describes the general behavior of a solution
to an autonomous differential equation, depending on the initial condition. For the case of a carrying capacity in the logistic
equation, the phase line is as shown in Figure 9.6.2.

Figure 9.6.2: A phase line for the differential equation $\frac{dP}{dt} = rP\left(1 - \frac{P}{K}\right)$.

This phase line shows that when $P$ is less than zero or greater than $K$, the population decreases over time. When $P$ is between 0
and $K$, the population increases over time.
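The phase-line reasoning can be checked numerically by looking at the sign of the right-hand side of Equation 9.6.1. The following is a minimal sketch; the function names and the values $r = 0.5$, $K = 100$ are illustrative assumptions, not taken from the text.

```python
def logistic_rate(P, r, K):
    """Right-hand side of dP/dt = r*P*(1 - P/K)."""
    return r * P * (1 - P / K)


def phase_line_direction(P, r, K):
    """Classify the direction of motion on the phase line at population P."""
    rate = logistic_rate(P, r, K)
    if rate > 0:
        return "increasing"
    if rate < 0:
        return "decreasing"
    return "equilibrium"


if __name__ == "__main__":
    r, K = 0.5, 100.0  # illustrative values only (assumed, not from the text)
    for P in (-10.0, 0.0, 50.0, 100.0, 150.0):
        print(P, phase_line_direction(P, r, K))
```

Sampling one point in each region of the phase line, plus the two constant solutions $P = 0$ and $P = K$, is enough because the sign of the right-hand side can change only at its zeros.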

Example 9.6.1: Examining the Carrying Capacity of a Deer Population

Let’s consider the population of white-tailed deer (Odocoileus virginianus) in the state of Kentucky. The Kentucky Department
of Fish and Wildlife Resources (KDFWR) sets guidelines for hunting and fishing in the state. Before the hunting season of
2004, it estimated a population of 900,000 deer. Johnson notes: “A deer population that has plenty to eat and is not hunted by
humans or other predators will double every three years.” (George Johnson, “The Problem of Exploding Deer Populations Has
No Attractive Solutions,” January 12, 2001, accessed April 9, 2015)

Figure 9.6.3 : (credit: modification of work by Rachel Kramer, Flickr)


This observation corresponds to a rate of increase $r = \frac{\ln(2)}{3} \approx 0.2311$, so the approximate growth rate is 23.11% per year.
(This assumes that the population grows exponentially, which is reasonable, at least in the short term, with plentiful food
supply and no predators.) The KDFWR also reports deer population densities for 32 counties in Kentucky, the average of
which is approximately 27 deer per square mile. Suppose this is the deer density for the whole state (39,732 square miles). The
carrying capacity $K$ is 39,732 square miles times 27 deer per square mile, or 1,072,764 deer.
a. For this application, we have $P_0 = 900{,}000$, $K = 1{,}072{,}764$, and $r = 0.2311$. Substitute these values into Equation
9.6.1 and form the initial-value problem.
b. Solve the initial-value problem from part a.
c. According to this model, what will be the population in 3 years? Recall that the doubling time predicted by Johnson for the
deer population was 3 years. How do these values compare?
d. Suppose the population managed to reach 1,200,000. What does the logistic equation predict will happen to the population in
this scenario?
Solution
a. The initial-value problem is

$$\frac{dP}{dt} = 0.2311P\left(1 - \frac{P}{1{,}072{,}764}\right), \quad P(0) = 900{,}000.$$

b. The logistic equation is an autonomous differential equation, so we can use the method of separation of variables.
Step 1: Setting the right-hand side equal to zero gives P = 0 and P = 1, 072, 764. This means that if the population starts at
zero it will never change, and if it starts at the carrying capacity, it will never change.
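Step 2: Rewrite the differential equation, then multiply both sides by $dt$ and divide both sides by $P(1{,}072{,}764 - P)$:

$$\frac{dP}{dt} = 0.2311P\left(\frac{1{,}072{,}764 - P}{1{,}072{,}764}\right)$$

$$dP = 0.2311P\left(\frac{1{,}072{,}764 - P}{1{,}072{,}764}\right)dt$$

$$\frac{dP}{P(1{,}072{,}764 - P)} = \frac{0.2311}{1{,}072{,}764}\,dt.$$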
Step 3: Integrate both sides of the equation using partial fraction decomposition:

$$\int \frac{dP}{P(1{,}072{,}764 - P)} = \int \frac{0.2311}{1{,}072{,}764}\,dt$$

$$\frac{1}{1{,}072{,}764}\int \left(\frac{1}{P} + \frac{1}{1{,}072{,}764 - P}\right)dP = \frac{0.2311t}{1{,}072{,}764} + C$$

$$\frac{1}{1{,}072{,}764}\left(\ln|P| - \ln|1{,}072{,}764 - P|\right) = \frac{0.2311t}{1{,}072{,}764} + C.$$

Step 4: Multiply both sides by 1,072,764 and use the quotient rule for logarithms:

$$\ln\left|\frac{P}{1{,}072{,}764 - P}\right| = 0.2311t + C_1.$$

Here $C_1 = 1{,}072{,}764\,C$. Next exponentiate both sides and eliminate the absolute value:

$$e^{\ln\left|\frac{P}{1{,}072{,}764 - P}\right|} = e^{0.2311t + C_1}$$

$$\left|\frac{P}{1{,}072{,}764 - P}\right| = C_2e^{0.2311t}$$

$$\frac{P}{1{,}072{,}764 - P} = C_2e^{0.2311t}.$$

Here $C_2 = e^{C_1}$, but after eliminating the absolute value, it can be negative as well. Now solve for $P$:

$$P = C_2e^{0.2311t}(1{,}072{,}764 - P)$$

$$P = 1{,}072{,}764\,C_2e^{0.2311t} - C_2Pe^{0.2311t}$$

$$P + C_2Pe^{0.2311t} = 1{,}072{,}764\,C_2e^{0.2311t}$$

$$P\left(1 + C_2e^{0.2311t}\right) = 1{,}072{,}764\,C_2e^{0.2311t}$$

$$P(t) = \frac{1{,}072{,}764\,C_2e^{0.2311t}}{1 + C_2e^{0.2311t}}.$$

Step 5: To determine the value of $C_2$, it is actually easier to go back a couple of steps to where $C_2$ was defined. In particular,
use the equation

$$\frac{P}{1{,}072{,}764 - P} = C_2e^{0.2311t}.$$

The initial condition is $P(0) = 900{,}000$. Replace $P$ with 900,000 and $t$ with zero:

$$\frac{900{,}000}{1{,}072{,}764 - 900{,}000} = C_2e^{0.2311(0)}$$

$$\frac{900{,}000}{172{,}764} = C_2$$

$$C_2 = \frac{25{,}000}{4{,}799} \approx 5.209.$$

Therefore

$$P(t) = \frac{1{,}072{,}764\left(\frac{25{,}000}{4{,}799}\right)e^{0.2311t}}{1 + \left(\frac{25{,}000}{4{,}799}\right)e^{0.2311t}} = \frac{1{,}072{,}764(25{,}000)e^{0.2311t}}{4{,}799 + 25{,}000e^{0.2311t}}.$$

Dividing the numerator and denominator by 25,000 gives

$$P(t) = \frac{1{,}072{,}764\,e^{0.2311t}}{0.19196 + e^{0.2311t}}.$$

Figure 9.6.4 is a graph of this equation.

Figure 9.6.4 : Logistic curve for the deer population with an initial population of 900,000 deer.
c. Using this model we can predict the population in 3 years.

$$P(3) = \frac{1{,}072{,}764\,e^{0.2311(3)}}{0.19196 + e^{0.2311(3)}} \approx 978{,}830 \text{ deer}$$

This is far short of twice the initial population of 900,000. Remember that the doubling time is based on the assumption that
the growth rate never changes, but the logistic model takes this possibility into account.
d. If the population reached 1,200,000 deer, then the new initial-value problem would be

$$\frac{dP}{dt} = 0.2311P\left(1 - \frac{P}{1{,}072{,}764}\right), \quad P(0) = 1{,}200{,}000.$$

The general solution to the differential equation would remain the same:

$$P(t) = \frac{1{,}072{,}764\,C_2e^{0.2311t}}{1 + C_2e^{0.2311t}}.$$

To determine the value of the constant, return to the equation

$$\frac{P}{1{,}072{,}764 - P} = C_2e^{0.2311t}.$$

Substituting the values $t = 0$ and $P = 1{,}200{,}000$, you get

$$C_2e^{0.2311(0)} = \frac{1{,}200{,}000}{1{,}072{,}764 - 1{,}200{,}000}$$

$$C_2 = -\frac{100{,}000}{10{,}603} \approx -9.431.$$

Therefore

$$P(t) = \frac{1{,}072{,}764\,C_2e^{0.2311t}}{1 + C_2e^{0.2311t}}$$

$$= \frac{1{,}072{,}764\left(-\frac{100{,}000}{10{,}603}\right)e^{0.2311t}}{1 + \left(-\frac{100{,}000}{10{,}603}\right)e^{0.2311t}}$$

$$= \frac{107{,}276{,}400{,}000\,e^{0.2311t}}{100{,}000e^{0.2311t} - 10{,}603}$$

$$\approx \frac{10{,}117{,}551\,e^{0.2311t}}{9.43129e^{0.2311t} - 1}.$$

This equation is graphed in Figure 9.6.5.

This equation is graphed in Figure 9.6.5.

Figure 9.6.5 : Logistic curve for the deer population with an initial population of 1,200,000 deer.
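The arithmetic in Example 9.6.1 can be reproduced with a few lines of code. This is a minimal sketch, assuming the constants from the example; the function name `deer_population` is a hypothetical helper introduced here for illustration.

```python
import math


def deer_population(t, P0, r=0.2311, K=1_072_764):
    """Logistic solution P(t) = K*C2*e^(rt) / (1 + C2*e^(rt)), with C2 = P0 / (K - P0)."""
    C2 = P0 / (K - P0)  # negative when P0 > K, as in part d
    growth = C2 * math.exp(r * t)
    return K * growth / (1 + growth)


if __name__ == "__main__":
    print(round(math.log(2) / 3, 4))           # growth rate implied by a 3-year doubling time
    print(39_732 * 27)                         # carrying capacity: area times density
    print(round(deer_population(3, 900_000)))  # part c: population after 3 years
    for t in (0, 5, 10, 20):                   # part d: starting above carrying capacity
        print(t, round(deer_population(t, 1_200_000)))
```

With an initial population of 1,200,000 the printed values should decrease toward the carrying capacity, matching the behavior shown in Figure 9.6.5.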

Solving the Logistic Differential Equation


The logistic differential equation is an autonomous differential equation, so we can use separation of variables to find the general
solution, as we just did in Example 9.6.1.
Step 1: Setting the right-hand side equal to zero leads to P = 0 and P = K as constant solutions. The first solution indicates that
when there are no organisms present, the population will never grow. The second solution indicates that when the population starts
at the carrying capacity, it will never change.
Step 2: Rewrite the differential equation in the form

$$\frac{dP}{dt} = \frac{rP(K - P)}{K}.$$

Then multiply both sides by $dt$ and divide both sides by $P(K - P)$. This leads to

$$\frac{dP}{P(K - P)} = \frac{r}{K}\,dt.$$

Multiply both sides of the equation by $K$ and integrate:

$$\int \frac{K}{P(K - P)}\,dP = \int r\,dt. \tag{9.6.2}$$

The left-hand side of this equation can be integrated using partial fraction decomposition. We leave it to you to verify that

$$\frac{K}{P(K - P)} = \frac{1}{P} + \frac{1}{K - P}.$$

Then Equation 9.6.2 becomes

$$\int \left(\frac{1}{P} + \frac{1}{K - P}\right)dP = \int r\,dt$$

$$\ln|P| - \ln|K - P| = rt + C$$

$$\ln\left|\frac{P}{K - P}\right| = rt + C.$$

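Now exponentiate both sides of the equation to eliminate the natural logarithm:

$$e^{\ln\left|\frac{P}{K - P}\right|} = e^{rt + C}$$

$$\left|\frac{P}{K - P}\right| = e^{C}e^{rt}.$$

We define $C_1 = e^{C}$ (allowing $C_1$ to be negative once the absolute value is removed, as in Example 9.6.1) so that the equation becomes

$$\frac{P}{K - P} = C_1e^{rt}. \tag{9.6.3}$$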

To solve this equation for $P(t)$, first multiply both sides by $K - P$ and collect the terms containing $P$ on the left-hand side of the
equation:

$$P = C_1e^{rt}(K - P)$$

$$P = C_1Ke^{rt} - C_1Pe^{rt}$$

$$P + C_1Pe^{rt} = C_1Ke^{rt}.$$

Next, factor $P$ from the left-hand side and divide both sides by the other factor:

$$P\left(1 + C_1e^{rt}\right) = C_1Ke^{rt}$$

$$P(t) = \frac{C_1Ke^{rt}}{1 + C_1e^{rt}}.$$

The last step is to determine the value of $C_1$. The easiest way to do this is to substitute $t = 0$ and $P_0$ in place of $P$ in Equation 9.6.3 and
solve for $C_1$:

$$\frac{P}{K - P} = C_1e^{rt}$$

$$\frac{P_0}{K - P_0} = C_1e^{r(0)}$$

$$C_1 = \frac{P_0}{K - P_0}.$$

Finally, substitute the expression for $C_1$ into the formula for $P(t)$ obtained from Equation 9.6.3:

$$P(t) = \frac{C_1Ke^{rt}}{1 + C_1e^{rt}} = \frac{\frac{P_0}{K - P_0}Ke^{rt}}{1 + \frac{P_0}{K - P_0}e^{rt}}.$$

Now multiply the numerator and denominator of the right-hand side by $(K - P_0)$ and simplify:

$$P(t) = \frac{\frac{P_0}{K - P_0}Ke^{rt}}{1 + \frac{P_0}{K - P_0}e^{rt}} \cdot \frac{K - P_0}{K - P_0} = \frac{P_0Ke^{rt}}{(K - P_0) + P_0e^{rt}}.$$

We state this result as a theorem.

 Solution of the Logistic Differential Equation

Consider the logistic differential equation subject to an initial population of $P_0$ with carrying capacity $K$ and growth rate $r$.
The solution to the corresponding initial-value problem is given by

$$P(t) = \frac{P_0Ke^{rt}}{(K - P_0) + P_0e^{rt}}.$$
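One way to gain confidence in this formula is to check numerically that it satisfies both the differential equation and the initial condition. The sketch below does this with a central-difference estimate of the derivative; the function names and the step size `h` are assumptions made for illustration.

```python
import math


def logistic_solution(t, P0, r, K):
    """Closed-form solution P(t) = P0*K*e^(rt) / ((K - P0) + P0*e^(rt))."""
    e = math.exp(r * t)
    return P0 * K * e / ((K - P0) + P0 * e)


def ode_residual(t, P0, r, K, h=1e-5):
    """Central-difference estimate of dP/dt minus r*P*(1 - P/K); should be near zero."""
    dP_dt = (logistic_solution(t + h, P0, r, K)
             - logistic_solution(t - h, P0, r, K)) / (2 * h)
    P = logistic_solution(t, P0, r, K)
    return dP_dt - r * P * (1 - P / K)


if __name__ == "__main__":
    P0, r, K = 200.0, 0.04, 750.0  # sample values; any 0 < P0 < K with r > 0 would do
    print(logistic_solution(0.0, P0, r, K))  # should reproduce the initial population
    for t in (0.0, 10.0, 50.0):
        print(t, ode_residual(t, P0, r, K))
```

A residual close to zero at several times is evidence, not proof, that the formula solves the initial-value problem; the derivation above is the proof.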

Now that we have the solution to the initial-value problem, we can choose values for $P_0$, $r$, and $K$ and study the solution curve.
For example, in Example 9.6.1 we used the values $r = 0.2311$, $K = 1{,}072{,}764$, and an initial population of 900,000 deer. This leads
to the solution

$$P(t) = \frac{P_0Ke^{rt}}{(K - P_0) + P_0e^{rt}}$$

$$= \frac{900{,}000(1{,}072{,}764)e^{0.2311t}}{(1{,}072{,}764 - 900{,}000) + 900{,}000e^{0.2311t}}$$

$$= \frac{900{,}000(1{,}072{,}764)e^{0.2311t}}{172{,}764 + 900{,}000e^{0.2311t}}.$$

Dividing top and bottom by 900,000 gives

$$P(t) = \frac{1{,}072{,}764\,e^{0.2311t}}{0.19196 + e^{0.2311t}}.$$

This is the same as the original solution. The graph of this solution is shown again in blue in Figure 9.6.6, superimposed over the
graph of the exponential growth model with initial population 900, 000 and growth rate 0.2311 (appearing in green). The red
dashed line represents the carrying capacity, and is a horizontal asymptote for the solution to the logistic equation.

Figure 9.6.6: A comparison of exponential versus logistic growth for the same initial population of 900,000 organisms and growth
rate of 23.11%.

Working under the assumption that the population grows according to the logistic differential equation, this graph predicts that
approximately 20 years earlier (1984), the growth of the population was very close to exponential. The net growth rate at that time
would have been around 23.1% per year. As time goes on, the two graphs separate. This happens because the population increases,
and the logistic differential equation states that the growth rate decreases as the population increases. At the time the population
was measured (2004), it was close to carrying capacity, and the population was starting to level off.
The solution to the logistic differential equation has a point of inflection. To find this point, set the second derivative equal to zero:

$$P(t) = \frac{P_0Ke^{rt}}{(K - P_0) + P_0e^{rt}}$$

$$P'(t) = \frac{rP_0K(K - P_0)e^{rt}}{\left((K - P_0) + P_0e^{rt}\right)^2}$$

$$P''(t) = \frac{r^2P_0K(K - P_0)^2e^{rt} - r^2P_0^2K(K - P_0)e^{2rt}}{\left((K - P_0) + P_0e^{rt}\right)^3}$$

$$= \frac{r^2P_0K(K - P_0)e^{rt}\left((K - P_0) - P_0e^{rt}\right)}{\left((K - P_0) + P_0e^{rt}\right)^3}.$$

Setting the numerator equal to zero,

$$r^2P_0K(K - P_0)e^{rt}\left((K - P_0) - P_0e^{rt}\right) = 0.$$

As long as $P_0 \neq K$, the entire quantity before and including $e^{rt}$ is nonzero, so we can divide it out:

$$(K - P_0) - P_0e^{rt} = 0.$$

Solving for $t$,

$$P_0e^{rt} = K - P_0$$

$$e^{rt} = \frac{K - P_0}{P_0}$$

$$\ln e^{rt} = \ln\frac{K - P_0}{P_0}$$

$$rt = \ln\frac{K - P_0}{P_0}$$

$$t = \frac{1}{r}\ln\frac{K - P_0}{P_0}.$$

Notice that if $P_0 > K$, then this quantity is undefined, and the graph does not have a point of inflection. In the logistic graph, the
point of inflection can be seen as the point where the graph changes from concave up to concave down. This is where the “leveling
off” starts to occur, because the net growth rate becomes slower as the population starts to approach the carrying capacity.
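The inflection time is easy to evaluate directly. The sketch below computes it and, as a check, evaluates the solution at that time; substituting $e^{rt} = \frac{K - P_0}{P_0}$ into the solution formula shows that the population at the inflection point is $\frac{K}{2}$. The function names are hypothetical helpers introduced here for illustration.

```python
import math


def inflection_time(P0, r, K):
    """Return t = (1/r) * ln((K - P0)/P0), or None when the logarithm is undefined."""
    if not (0 < P0 < K):
        return None  # no inflection point when P0 >= K
    return math.log((K - P0) / P0) / r


def logistic_solution(t, P0, r, K):
    """Closed-form solution P(t) = P0*K*e^(rt) / ((K - P0) + P0*e^(rt))."""
    e = math.exp(r * t)
    return P0 * K * e / ((K - P0) + P0 * e)


if __name__ == "__main__":
    P0, r, K = 200.0, 0.04, 750.0  # sample values with 0 < P0 < K/2
    t_star = inflection_time(P0, r, K)
    print(t_star, logistic_solution(t_star, P0, r, K), K / 2)
    print(inflection_time(1_200_000, 0.2311, 1_072_764))  # initial population above K
```

When $\frac{K}{2} < P_0 < K$ the computed time is negative, which means the curve had already passed its inflection point before $t = 0$.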

 Exercise 9.6.1
A population of rabbits in a meadow is observed to be 200 rabbits at time $t = 0$. After a month, the rabbit population is
observed to have increased by 4%. Using an initial population of 200 and a growth rate of 0.04, with a carrying capacity of 750
rabbits,
a. Write the logistic differential equation and initial condition for this model.
b. Draw a slope field for this logistic differential equation, and sketch the solution corresponding to an initial population of
200 rabbits.

c. Solve the initial-value problem for P (t).


d. Use the solution to predict the population after 1 year.

Hint

First determine the values of $r$, $K$, and $P_0$. Then create the initial-value problem, draw the direction field, and solve
the problem.

Answer
a. $\frac{dP}{dt} = 0.04P\left(1 - \frac{P}{750}\right), \quad P(0) = 200$

b.

c. $P(t) = \frac{3000e^{0.04t}}{11 + 4e^{0.04t}}$

d. After 12 months, the population will be P (12) ≈ 278 rabbits.
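As a cross-check on answers c and d, the closed-form solution can be compared with a direct numerical integration of the differential equation. The sketch below uses Euler's method, which follows the slopes of the direction field in small steps; the function names and the step count are assumptions made for illustration.

```python
import math


def rabbits_closed_form(t):
    """Answer c: P(t) = 3000 e^(0.04 t) / (11 + 4 e^(0.04 t))."""
    e = math.exp(0.04 * t)
    return 3000 * e / (11 + 4 * e)


def rabbits_euler(t_end, steps=12_000, P0=200.0, r=0.04, K=750.0):
    """Euler's method for dP/dt = r*P*(1 - P/K), starting from P(0) = P0."""
    dt = t_end / steps
    P = P0
    for _ in range(steps):
        P += dt * r * P * (1 - P / K)  # step along the local slope
    return P


if __name__ == "__main__":
    print(rabbits_closed_form(12))  # population after 12 months from the formula
    print(rabbits_euler(12))        # numerical estimate for comparison
```

The two printed values should agree closely, and rounding either one gives the population reported in answer d.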

 Student Project: Logistic Equation with a Threshold Population

An improvement to the logistic model includes a threshold population. The threshold population is defined to be the
minimum population that is necessary for the species to survive. We use the variable T to represent the threshold population. A
differential equation that incorporates both the threshold population T and carrying capacity K is
$$\frac{dP}{dt} = -rP\left(1 - \frac{P}{K}\right)\left(1 - \frac{P}{T}\right)$$

where r represents the growth rate, as before.


1. The threshold population is useful to biologists and can be utilized to determine whether a given species should be placed
on the endangered list. A group of Australian researchers say they have determined the threshold population for any species
to survive: 5000 adults. (Catherine Clabby, “A Magic Number,” American Scientist 98(1): 24, doi:10.1511/2010.82.24.
accessed April 9, 2015, www.americanscientist.org/iss...a-magic-number). Therefore we use T = 5000 as the threshold
population in this project. Suppose that the environmental carrying capacity in Montana for elk is 25,000. Set up the threshold
differential equation above using the carrying capacity of 25,000 and threshold population of 5000. Assume an annual net growth rate of 18%.
2. Draw the direction field for the differential equation from step 1, along with several solutions for different initial
populations. What are the constant solutions of the differential equation? What do these solutions correspond to in the
original population model (i.e., in a biological context)?
3. What is the limiting population for each initial population you chose in step 2? (Hint: use the slope field to see what
happens for various initial populations, i.e., look for the horizontal asymptotes of your solutions.)
4. This equation can be solved using the method of separation of variables. However, it is very difficult to get the solution as
an explicit function of t . Using an initial population of 18, 000 elk, solve the initial-value problem and express the solution
as an implicit function of $t$, or solve the general initial-value problem, finding a solution in terms of $r$, $K$, $T$, and $P_0$.

Key Concepts
When studying population functions, different assumptions—such as exponential growth, logistic growth, or threshold
population—lead to different rates of growth.

The logistic differential equation incorporates the concept of a carrying capacity. This value is a limiting value on the
population for any given environment.
The logistic differential equation can be solved for any positive growth rate, initial population, and carrying capacity.

Key Equations
Logistic differential equation and initial-value problem

$$\frac{dP}{dt} = rP\left(1 - \frac{P}{K}\right), \quad P(0) = P_0$$

Solution to the logistic differential equation/initial-value problem

$$P(t) = \frac{P_0Ke^{rt}}{(K - P_0) + P_0e^{rt}}$$

Threshold population model

$$\frac{dP}{dt} = -rP\left(1 - \frac{P}{K}\right)\left(1 - \frac{P}{T}\right)$$

Glossary
carrying capacity
the maximum population of an organism that the environment can sustain indefinitely

growth rate
the constant $r > 0$ in the exponential growth function $P(t) = P_0e^{rt}$

initial population
the population at time t = 0

logistic differential equation


a differential equation that incorporates the carrying capacity $K$ and growth rate $r$ into a population model

phase line
a visual representation of the behavior of solutions to an autonomous differential equation subject to various initial conditions

threshold population
the minimum population that is necessary for a species to survive

9.6: Predator-Prey Systems is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
8.4: The Logistic Equation by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0. Original source:
https://openstax.org/details/books/calculus-volume-1.

CHAPTER OVERVIEW

10: Parametric Equations And Polar Coordinates


A general Calculus Textmap organized around the textbook

Calculus: Early Transcendentals


by James Stewart


Topic hierarchy
10.1: Curves Defined by Parametric Equations
10.2: Calculus with Parametric Curves
10.3: Polar Coordinates
10.4: Areas and Lengths in Polar Coordinates
10.5: Conic Sections
10.6: Conic Sections in Polar Coordinates
Index

10: Parametric Equations And Polar Coordinates is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.

This text was compiled on 12/01/2023
10.1: Curves Defined by Parametric Equations
 Learning Objectives
Plot a curve described by parametric equations.
Convert the parametric equations of a curve into the form y = f (x).
Recognize the parametric equations of basic curves, such as a line and a circle.
Recognize the parametric equations of a cycloid.

In this section we examine parametric equations and their graphs. In the two-dimensional coordinate system, parametric equations
are useful for describing curves that are not necessarily functions. The parameter is an independent variable that both x and y
depend on, and as the parameter increases, the values of x and y trace out a path along a plane curve. For example, if the parameter
is t (a common choice), then t might represent time. Then x and y are defined as functions of time, and (x(t), y(t)) can describe
the position in the plane of a given object as it moves along a curved path.

Parametric Equations and Their Graphs


Consider the orbit of Earth around the Sun. Our year lasts approximately 365.25 days, but for this discussion we will use 365 days.
On January 1 of each year, the physical location of Earth with respect to the Sun is nearly the same, except for leap years, when the
lag introduced by the extra $\frac{1}{4}$ day of orbiting time is built into the calendar. We call January 1 “day 1” of the year. Then, for
example, day 31 is January 31, day 59 is February 28, and so on.
The number of the day in a year can be considered a variable that determines Earth’s position in its orbit. As Earth revolves around
the Sun, its physical location changes relative to the Sun. After one full year, we are back where we started, and a new year begins.
According to Kepler’s laws of planetary motion, the shape of the orbit is elliptical, with the Sun at one focus of the ellipse. We
study this idea in more detail in Conic Sections.

Figure 10.1.1 : Earth’s orbit around the Sun in one year.


Figure 10.1.1 depicts Earth’s orbit around the Sun during one year. The point labeled $F_2$ is one of the foci of the ellipse; the other
focus is occupied by the Sun. If we superimpose coordinate axes over this graph, then we can assign ordered pairs to each point on
the ellipse (Figure 10.1.2). Then each $x$ value on the graph is a value of position as a function of time, and each $y$ value is also a
value of position as a function of time. Therefore, each point on the graph corresponds to a value of Earth’s position as a function
of time.

Figure 10.1.2 : Coordinate axes superimposed on the orbit of Earth.
We can determine the functions for x(t) and y(t), thereby parameterizing the orbit of Earth around the Sun. The variable t is called
an independent parameter and, in this context, represents time relative to the beginning of each year.
A curve in the (x, y) plane can be represented parametrically. The equations that are used to define the curve are called parametric
equations.

 Definition: Parametric Equations

If x and y are continuous functions of t on an interval I , then the equations

x = x(t)

and

y = y(t)

are called parametric equations and t is called the parameter. The set of points (x, y) obtained as t varies over the interval I is
called the graph of the parametric equations. The graph of parametric equations is called a parametric curve or plane curve,
and is denoted by C .

Notice in this definition that x and y are used in two ways. The first is as functions of the independent variable t . As t varies over
the interval I , the functions x(t) and y(t) generate a set of ordered pairs (x, y). This set of ordered pairs generates the graph of the
parametric equations. In this second usage, to designate the ordered pairs, x and y are variables. It is important to distinguish the
variables x and y from the functions x(t) and y(t).

 Example 10.1.1: Graphing a Parametrically Defined Curve


Sketch the curves described by the following parametric equations:
a. $x(t) = t - 1, \quad y(t) = 2t + 4, \quad \text{for } -3 \le t \le 2$

b. $x(t) = t^2 - 3, \quad y(t) = 2t + 1, \quad \text{for } -2 \le t \le 3$

c. $x(t) = 4\cos t, \quad y(t) = 4\sin t, \quad \text{for } 0 \le t \le 2\pi$

Solution
a. To create a graph of this curve, first set up a table of values. Since the independent variable in both x(t) and y(t) is t , let t
appear in the first column. Then x(t) and y(t) will appear in the second and third columns of the table.

t x(t) y(t)

−3 −4 −2

−2 −3 0


−1 −2 2

0 −1 4

1 0 6

2 1 8

The second and third columns in this table provide a set of points to be plotted. The graph of these points appears in Figure
10.1.3. The arrows on the graph indicate the orientation of the graph, that is, the direction that a point moves on the graph as t

varies from −3 to 2.

Figure 10.1.3 : Graph of the plane curve described by the parametric equations in part a.
b. To create a graph of this curve, again set up a table of values.

t x(t) y(t)

−2 1 −3

−1 −2 −1

0 −3 1

1 −2 3

2 1 5

3 6 7

The second and third columns in this table give a set of points to be plotted (Figure 10.1.4). The first point on the graph
(corresponding to t = −2 ) has coordinates (1, −3), and the last point (corresponding to t = 3 ) has coordinates (6, 7). As t
progresses from −2 to 3, the point on the curve travels along a parabola. The direction the point moves is again called the
orientation and is indicated on the graph.

Figure 10.1.4 : Graph of the plane curve described by the parametric equations in part b.
c. In this case, use multiples of π/6 for t and create another table of values:

t        x(t)             y(t)

0        4                0

π/6      2√3 ≈ 3.5        2

π/3      2                2√3 ≈ 3.5

π/2      0                4

2π/3     −2               2√3 ≈ 3.5

5π/6     −2√3 ≈ −3.5      2

π        −4               0

7π/6     −2√3 ≈ −3.5      −2

4π/3     −2               −2√3 ≈ −3.5

3π/2     0                −4

5π/3     2                −2√3 ≈ −3.5

11π/6    2√3 ≈ 3.5        −2

2π       4                0

The graph of this plane curve appears in the following graph.

Figure 10.1.5 : Graph of the plane curve described by the parametric equations in part c.
This is the graph of a circle with radius 4 centered at the origin, with a counterclockwise orientation. The starting point and
ending points of the curve both have coordinates (4, 0).
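The tables of values in this example can also be generated programmatically. The following is a minimal sketch; `sample_curve` is a hypothetical helper, and the parameter values are the same ones used in the tables above.

```python
import math


def sample_curve(x, y, t_values):
    """Return a list of (t, x(t), y(t)) rows for the given parameter values."""
    return [(t, x(t), y(t)) for t in t_values]


# Part a: a line segment, with t = -3, -2, ..., 2.
part_a = sample_curve(lambda t: t - 1, lambda t: 2 * t + 4, range(-3, 3))
# Part b: an arc of a parabola, with t = -2, -1, ..., 3.
part_b = sample_curve(lambda t: t**2 - 3, lambda t: 2 * t + 1, range(-2, 4))
# Part c: a circle of radius 4, with t = 0, pi/6, ..., 2*pi.
part_c = sample_curve(lambda t: 4 * math.cos(t), lambda t: 4 * math.sin(t),
                      [k * math.pi / 6 for k in range(13)])

if __name__ == "__main__":
    for t, xt, yt in part_c:
        print(f"{t:6.3f} {xt:6.2f} {yt:6.2f}")
```

Listing the rows in order of increasing $t$ also shows the orientation of each curve, since consecutive rows are consecutive positions of the moving point.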

Exercise 10.1.1

Sketch the curve described by the parametric equations

$$x(t) = 3t + 2, \quad y(t) = t^2 - 1, \quad \text{for } -3 \le t \le 2.$$

Hint
Make a table of values for x(t) and y(t) using t values from −3 to 2.

Answer

Eliminating the Parameter


To better understand the graph of a curve represented parametrically, it is useful to rewrite the two equations as a single equation
relating the variables x and y . Then we can apply any previous knowledge of equations of curves in the plane to identify the curve.
For example, the equations describing the plane curve in Example 10.1.1b are

$$x(t) = t^2 - 3 \tag{10.1.1}$$

$$y(t) = 2t + 1 \tag{10.1.2}$$

over the interval $-2 \le t \le 3$.


Solving Equation 10.1.2 for $t$ gives

$$t = \frac{y - 1}{2}.$$

This can be substituted into Equation 10.1.1:

$$x = \left(\frac{y - 1}{2}\right)^2 - 3 \tag{10.1.3}$$

$$= \frac{y^2 - 2y + 1}{4} - 3 \tag{10.1.4}$$

$$= \frac{y^2 - 2y - 11}{4}. \tag{10.1.5}$$

Equation 10.1.5 describes x as a function of y . These steps give an example of eliminating the parameter. The graph of this
function is a parabola opening to the right (Figure 10.1.4). Recall that the plane curve started at (1, −3) and ended at (6, 7). These
terminations were due to the restriction on the parameter t .
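A quick numerical check confirms that points generated by the parametric equations satisfy Equation 10.1.5. This is a minimal sketch; the sample spacing of 0.5 and the variable names are arbitrary choices made for illustration.

```python
def x_of_y(y):
    """Equation 10.1.5: x = (y^2 - 2y - 11) / 4."""
    return (y**2 - 2 * y - 11) / 4


# Sample the parametric curve x = t^2 - 3, y = 2t + 1 on -2 <= t <= 3.
ts = [-2 + 0.5 * k for k in range(11)]
points = [(t**2 - 3, 2 * t + 1) for t in ts]

# Largest difference between the parametric x and the x predicted from y.
max_gap = max(abs(x - x_of_y(y)) for x, y in points)

if __name__ == "__main__":
    print(points[0], points[-1])  # endpoints of the restricted curve
    print(max_gap)
```

The first and last sampled points are the endpoints imposed by the restriction on $t$; Equation 10.1.5 alone describes the whole parabola.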

Example 10.1.2: Eliminating the Parameter

Eliminate the parameter for each of the plane curves described by the following parametric equations and describe the resulting
graph.

a. $x(t) = \sqrt{2t + 4}, \quad y(t) = 2t + 1, \quad \text{for } -2 \le t \le 6$

b. $x(t) = 4\cos t, \quad y(t) = 3\sin t, \quad \text{for } 0 \le t \le 2\pi$

Solution
a. To eliminate the parameter, we can solve either of the equations for $t$. For example, solving the first equation for $t$ gives

$$x = \sqrt{2t + 4}$$

$$x^2 = 2t + 4$$

$$x^2 - 4 = 2t$$

$$t = \frac{x^2 - 4}{2}.$$

Note that when we square both sides it is important to observe that $x \ge 0$. Substituting $t = \frac{x^2 - 4}{2}$ into $y(t)$ yields

$$y(t) = 2t + 1$$

$$y = 2\left(\frac{x^2 - 4}{2}\right) + 1$$

$$y = x^2 - 4 + 1$$

$$y = x^2 - 3.$$

This is the equation of a parabola opening upward. There is, however, a domain restriction because of the limits on the
−−−−−−− − −− −− −−−
parameter t . When t = −2 , x = √2(−2) + 4 = 0 , and when t = 6 , x = √2(6) + 4 = 4 . The graph of this plane
curve follows.

Figure 10.1.6 : Graph of the plane curve described by the parametric equations in part a.
b. Sometimes it is necessary to be a bit creative in eliminating the parameter. The parametric equations for this example are

\[x(t) = 4\cos t\]

and

\[y(t) = 3\sin t.\]

Solving either equation for t directly is not advisable because sine and cosine are not one-to-one functions. However, dividing the first equation by 4 and the second equation by 3 (and suppressing the t) gives us

\[\cos t = \frac{x}{4}\]

and

\[\sin t = \frac{y}{3}.\]

Now use the Pythagorean identity \(\cos^2 t + \sin^2 t = 1\) and replace the expressions for sin t and cos t with the equivalent expressions in terms of x and y. This gives

\begin{align*}
\left(\frac{x}{4}\right)^2 + \left(\frac{y}{3}\right)^2 &= 1 \\
\frac{x^2}{16} + \frac{y^2}{9} &= 1.
\end{align*}

This is the equation of a horizontal ellipse centered at the origin, with semi-major axis 4 and semi-minor axis 3 as shown in the following graph.

Figure 10.1.7 : Graph of the plane curve described by the parametric equations in part b.
As t progresses from 0 to 2π, a point on the curve traverses the ellipse once, in a counterclockwise direction. Recall from
the section opener that the orbit of Earth around the Sun is also elliptical. This is a perfect example of using
parameterized curves to model a real-world phenomenon.
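As a rough numerical illustration of part b (a sketch only; the function names and the number of sample points are our own choices), we can confirm that sampled points satisfy the ellipse equation and that the curve is traversed counterclockwise. The orientation test uses the signed area of the polygon through the sampled points, which is positive for counterclockwise traversal.

```python
import math

def ellipse_point(t):
    # Parametric equations from part b.
    return 4 * math.cos(t), 3 * math.sin(t)

def ellipse_residual(x, y):
    # Left side minus right side of x^2/16 + y^2/9 = 1.
    return x**2 / 16 + y**2 / 9 - 1

# 201 sample values of t covering 0 <= t <= 2*pi.
ts = [2 * math.pi * k / 200 for k in range(201)]
pts = [ellipse_point(t) for t in ts]

max_residual = max(abs(ellipse_residual(x, y)) for x, y in pts)

# Shoelace sum over consecutive points; positive means counterclockwise.
signed_area = 0.5 * sum(x0 * y1 - x1 * y0
                        for (x0, y0), (x1, y1) in zip(pts, pts[1:]))

print(max_residual)
print(signed_area)
```

The signed area should be close to the area of the ellipse, which is π times the product of the semi-axes.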

 Exercise 10.1.2
Eliminate the parameter for the plane curve defined by the following parametric equations and describe the resulting graph.

\[x(t) = 2 + \frac{3}{t}, \quad y(t) = t - 1, \quad \text{for } 2 \le t \le 6\]

Hint
Solve one of the equations for t and substitute into the other equation.

Answer
\(x = 2 + \dfrac{3}{y + 1}\), or \(y = -1 + \dfrac{3}{x - 2}\). This equation describes a portion of a rectangular hyperbola centered at (2, −1).

So far we have seen the method of eliminating the parameter, assuming we know a set of parametric equations that describe a plane
curve. What if we would like to start with the equation of a curve and determine a pair of parametric equations for that curve? This
is certainly possible, and in fact it is possible to do so in many different ways for a given curve. The process is known as
parameterization of a curve.

 Example 10.1.3: Parameterizing a Curve

Find two different pairs of parametric equations to represent the graph of \(y = 2x^2 - 3\).
Solution
First, it is always possible to parameterize a curve by defining x(t) = t, then replacing x with t in the equation for y(t). This gives the parameterization

\[x(t) = t, \quad y(t) = 2t^2 - 3.\]

Since there is no restriction on the domain in the original graph, there is no restriction on the values of t .
We have complete freedom in the choice for the second parameterization. For example, we can choose x(t) = 3t − 2. The only thing we need to check is that there are no restrictions imposed on x; that is, the range of x(t) is all real numbers. This is the case for x(t) = 3t − 2. Now since \(y = 2x^2 - 3\), we can substitute x(t) = 3t − 2 for x. This gives

\[y(t) = 2(3t - 2)^2 - 3 = 2(9t^2 - 12t + 4) - 3 = 18t^2 - 24t + 8 - 3 = 18t^2 - 24t + 5.\]

Therefore, a second parameterization of the curve can be written as

\[x(t) = 3t - 2 \quad \text{and} \quad y(t) = 18t^2 - 24t + 5.\]
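To see concretely that two parameterizations can trace the same graph, here is a small sketch (the function names are hypothetical). The second parameterization is built by direct substitution of x(t) = 3t − 2 into the equation of the curve, so every point it produces lies on the curve by construction; the interesting observation is that the same point is reached at different parameter values.

```python
def curve(x):
    # The rectangular equation y = 2x^2 - 3.
    return 2 * x**2 - 3

def param_one(t):
    # First parameterization: x(t) = t.
    return t, curve(t)

def param_two(t):
    # Second parameterization: x(t) = 3t - 2, with y obtained by substitution.
    x = 3 * t - 2
    return x, curve(x)

# The point with x = 4 is reached at t = 4 by the first parameterization
# and at t = 2 by the second.
point_a = param_one(4)
point_b = param_two(2)

print(point_a, point_b)
```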

 Exercise 10.1.3

Find two different sets of parametric equations to represent the graph of \(y = x^2 + 2x\).

Hint
Follow the steps in Example 10.1.3. Remember we have freedom in choosing the parameterization for x(t).

Answer
One possibility is \(x(t) = t, \; y(t) = t^2 + 2t\). Another possibility is \(x(t) = 2t - 3, \; y(t) = (2t - 3)^2 + 2(2t - 3) = 4t^2 - 8t + 3\). There are, in fact, an infinite number of possibilities.

Cycloids and Other Parametric Curves


Imagine going on a bicycle ride through the country. The tires stay in contact with the road and rotate in a predictable pattern. Now
suppose a very determined ant is tired after a long day and wants to get home. So he hangs onto the side of the tire and gets a free
ride. The path that this ant travels down a straight road is called a cycloid (Figure 10.1.8). A cycloid generated by a circle (or
bicycle wheel) of radius a is given by the parametric equations

x(t) = a(t − sin t), y(t) = a(1 − cos t).

To see why this is true, consider the path that the center of the wheel takes. The center moves parallel to the x-axis at a constant height equal to the radius of the wheel. If the radius is a, then the coordinates of the center can be given by the equations

x(t) = at, y(t) = a

for any value of t . Next, consider the ant, which rotates around the center along a circular path. If the bicycle is moving from left to
right then the wheels are rotating in a clockwise direction. A possible parameterization of the circular motion of the ant (relative to
the center of the wheel) is given by
x(t) = −a sin t

y(t) = −a cos t.

(The negative sign is needed to reverse the orientation of the curve. If the negative sign were not there, we would have to imagine
the wheel rotating counterclockwise.) Adding these equations together gives the equations for the cycloid.

x(t) = a(t − sin t)

y(t) = a(1 − cos t)

Figure 10.1.8 : A wheel traveling along a road without slipping; the point on the edge of the wheel traces out a cycloid.
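The derivation above (center of the wheel plus the position of the ant relative to the center) can be written out directly. This is a minimal sketch under the same assumptions as the text; the function name and default radius are illustrative.

```python
import math

def cycloid_point(t, a=1.0):
    """Point on the cycloid generated by a wheel of radius a after rotating through angle t."""
    # Center of the wheel: moves horizontally at height a.
    center_x, center_y = a * t, a
    # Ant relative to the center: clockwise circular motion starting at the bottom.
    rel_x, rel_y = -a * math.sin(t), -a * math.cos(t)
    # Adding the two motions gives x = a(t - sin t), y = a(1 - cos t).
    return center_x + rel_x, center_y + rel_y

start = cycloid_point(0.0, a=2.0)             # ant on the ground
top = cycloid_point(math.pi, a=2.0)           # ant at the top of the wheel
one_turn = cycloid_point(2 * math.pi, a=2.0)  # ant back on the ground

print(start, top, one_turn)
```

After one full turn the ant is back on the ground, one circumference farther along the road.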
Now suppose that the bicycle wheel doesn’t travel along a straight road but instead moves along the inside of a larger wheel, as in
Figure 10.1.9. In this graph, the green circle is traveling around the blue circle in a counterclockwise direction. A point on the edge
of the green circle traces out the red graph, which is called a hypocycloid.

Figure 10.1.9 : Graph of the hypocycloid described by the parametric equations shown.
The general parametric equations for a hypocycloid are

\begin{align*}
x(t) &= (a - b)\cos t + b\cos\left(\frac{a - b}{b}\,t\right) \\
y(t) &= (a - b)\sin t - b\sin\left(\frac{a - b}{b}\,t\right).
\end{align*}

These equations are a bit more complicated, but the derivation is somewhat similar to the equations for the cycloid. In this case we
assume the radius of the larger circle is a and the radius of the smaller circle is b . Then the center of the wheel travels along a circle
of radius a − b. This fact explains the first term in each equation above. The period of the second trigonometric function in both x(t) and y(t) is equal to \(\dfrac{2\pi b}{a - b}\).

The ratio \(\dfrac{a}{b}\) is related to the number of cusps on the graph (cusps are the corners or pointed ends of the graph), as illustrated in Figure 10.1.10. This ratio can lead to some very interesting graphs, depending on whether or not the ratio is rational. Figure 10.1.9 corresponds to a = 4 and b = 1. The result is a hypocycloid with four cusps. Figure 10.1.10 shows some other possibilities. The last two hypocycloids have irrational values for \(\dfrac{a}{b}\). In these cases the hypocycloids have an infinite number of cusps, so they never return to their starting point. Curves of this kind are sometimes loosely described as space-filling curves, because they pass arbitrarily close to every point of a region of the plane.

Figure 10.1.10: Graph of various hypocycloids corresponding to different values of a/b.
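For a rational ratio such as a = 4, b = 1, the cusps can be located numerically. The sketch below relies on the geometric fact that a cusp occurs where the tracing point touches the larger circle, that is, where its distance from the origin equals a. The function name, sample count, and tolerance are assumptions made for illustration.

```python
import math

def hypocycloid_point(t, a, b):
    # General hypocycloid equations with outer radius a and inner radius b.
    k = (a - b) / b
    x = (a - b) * math.cos(t) + b * math.cos(k * t)
    y = (a - b) * math.sin(t) - b * math.sin(k * t)
    return x, y

a, b = 4, 1
n = 3600  # samples over one trip around the larger circle, 0 <= t < 2*pi

# Distance of each sampled point from the origin.
radii = [math.hypot(*hypocycloid_point(2 * math.pi * i / n, a, b))
         for i in range(n)]

# Count samples that lie on the larger circle (within a small tolerance).
cusp_count = sum(1 for r in radii if abs(r - a) < 1e-9)

print(cusp_count)
```

The sample count is chosen so that the parameter values t = 0, π/2, π, and 3π/2 fall exactly on sample points; with a different ratio a/b the grid would need to be adjusted accordingly.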

 The Witch of Agnesi


Many plane curves in mathematics are named after the people who first investigated them, like the folium of Descartes or the
spiral of Archimedes. However, perhaps the strangest name for a curve is the witch of Agnesi. Why a witch?
Maria Gaetana Agnesi (1718–1799) was one of the few recognized women mathematicians of eighteenth-century Italy. She
wrote a popular book on analytic geometry, published in 1748, which included an interesting curve that had been studied by
Fermat in 1630. The mathematician Guido Grandi showed in 1703 how to construct this curve, which he later called the
“versoria,” a Latin term for a rope used in sailing. Agnesi used the Italian term for this rope, “versiera,” but in Latin, this same
word means a “female goblin.” When Agnesi’s book was translated into English in 1801, the translator used the term “witch”
for the curve, instead of rope. The name “witch of Agnesi” has stuck ever since.
The witch of Agnesi is a curve defined as follows: Start with a circle of radius a so that the points (0, 0) and (0, 2a) are points
on the circle (Figure 10.1.11). Let O denote the origin. Choose any other point A on the circle, and draw the secant line OA.
Let B denote the point at which the line OA intersects the horizontal line through (0, 2a). The vertical line through B intersects
the horizontal line through A at the point P. As the point A varies, the path that the point P travels is the witch of Agnesi curve
for the given circle.

Witch of Agnesi curves have applications in physics, including modeling water waves and distributions of spectral lines. In
probability theory, the curve describes the probability density function of the Cauchy distribution. In this project you will
parameterize these curves.

Figure 10.1.11: As the point A moves around the circle, the point P traces out the witch of Agnesi curve for the given circle.
1. On the figure, label the following points, lengths, and angle:
a. C is the point on the x-axis with the same x-coordinate as A .
b. x is the x-coordinate of P , and y is the y -coordinate of P .
c. E is the point (0, a).
d. F is the point on the line segment OA such that the line segment EF is perpendicular to the line segment OA.
e. b is the distance from O to F .
f. c is the distance from F to A .
g. d is the distance from O to C .
h. θ is the measure of angle ∠COA.
The goal of this project is to parameterize the witch using θ as a parameter. To do this, write equations for x and y in terms of only θ.
2. Show that \(d = \dfrac{2a}{\sin\theta}\).
3. Note that \(x = d\cos\theta\). Show that \(x = 2a\cot\theta\). When you do this, you will have parameterized the x-coordinate of the curve with respect to θ. If you can get a similar equation for y, you will have parameterized the curve.
4. In terms of θ, what is the angle ∠EOA?
5. Show that \(b + c = 2a\cos\left(\dfrac{\pi}{2} - \theta\right)\).
6. Show that \(y = 2a\cos\left(\dfrac{\pi}{2} - \theta\right)\sin\theta\).
7. Show that \(y = 2a\sin^2\theta\). You have now parameterized the y-coordinate of the curve with respect to θ.
8. Conclude that a parameterization of the given witch curve is
\[x = 2a\cot\theta, \quad y = 2a\sin^2\theta, \quad \text{for } -\infty < \theta < \infty.\]
9. Use your parameterization to show that the given witch curve is the graph of the function \(f(x) = \dfrac{8a^3}{x^2 + 4a^2}\).
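Before attempting the algebra in step 9, it can be reassuring to check numerically that the parameterization in step 8 and the function in step 9 describe the same curve. The following sketch is only a sanity check, not a derivation; the function names, the value of a, and the sampled angles are arbitrary choices.

```python
import math

def witch_point(theta, a):
    x = 2 * a / math.tan(theta)        # x = 2a cot(theta)
    y = 2 * a * math.sin(theta) ** 2   # y = 2a sin^2(theta)
    return x, y

def witch_function(x, a):
    # The function from step 9.
    return 8 * a**3 / (x**2 + 4 * a**2)

a = 1.5
# Angles strictly between 0 and pi, where cot(theta) is defined.
thetas = [0.1 + 0.05 * k for k in range(59)]

max_gap = 0.0
for theta in thetas:
    x, y = witch_point(theta, a)
    max_gap = max(max_gap, abs(y - witch_function(x, a)))

print(max_gap)
```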

 Travels with My Ant: The Curtate and Prolate Cycloids


Earlier in this section, we looked at the parametric equations for a cycloid, which is the path a point on the edge of a wheel
traces as the wheel rolls along a straight path. In this project we look at two different variations of the cycloid, called the
curtate and prolate cycloids.
First, let’s revisit the derivation of the parametric equations for a cycloid. Recall that we considered a tenacious ant trying to
get home by hanging onto the edge of a bicycle tire. We have assumed the ant climbed onto the tire at the very edge, where the

tire touches the ground. As the wheel rolls, the ant moves with the edge of the tire (Figure 10.1.12).
As we have discussed, we have a lot of flexibility when parameterizing a curve. In this case we let our parameter t represent
the angle the tire has rotated through. Looking at Figure 10.1.12, we see that after the tire has rotated through an angle of t, the position of the center of the wheel, \(C = (x_C, y_C)\), is given by

\[x_C = at \quad \text{and} \quad y_C = a.\]

Furthermore, letting \(A = (x_A, y_A)\) denote the position of the ant, we note that

\[x_C - x_A = a\sin t \quad \text{and} \quad y_C - y_A = a\cos t.\]

Then

\begin{align*}
x_A &= x_C - a\sin t = at - a\sin t = a(t - \sin t) \\
y_A &= y_C - a\cos t = a - a\cos t = a(1 - \cos t).
\end{align*}

Figure 10.1.12: (a) The ant clings to the edge of the bicycle tire as the tire rolls along the ground. (b) Using geometry to
determine the position of the ant after the tire has rotated through an angle of t .
Note that these are the same parametric representations we had before, but we have now assigned a physical meaning to the
parametric variable t .
After a while the ant is getting dizzy from going round and round on the edge of the tire. So he climbs up one of the spokes
toward the center of the wheel. By climbing toward the center of the wheel, the ant has changed his path of motion. The new
path has less up-and-down motion and is called a curtate cycloid (Figure 10.1.13). As shown in the figure, we let b denote the
distance along the spoke from the center of the wheel to the ant. As before, we let t represent the angle the tire has rotated
through. Additionally, we let \(C = (x_C, y_C)\) represent the position of the center of the wheel and \(A = (x_A, y_A)\) represent the position of the ant.
Figure 10.1.13: (a) The ant climbs up one of the spokes toward the center of the wheel. (b) The ant’s path of motion after he
climbs closer to the center of the wheel. This is called a curtate cycloid. (c) The new setup, now that the ant has moved closer
to the center of the wheel.
1. What is the position of the center of the wheel after the tire has rotated through an angle of t ?
2. Use geometry to find expressions for \(x_C - x_A\) and for \(y_C - y_A\).
3. On the basis of your answers to parts 1 and 2, what are the parametric equations representing the curtate cycloid?
Once the ant’s head clears, he realizes that the bicyclist has made a turn, and is now traveling away from his home. So he
drops off the bicycle tire and looks around. Fortunately, there is a set of train tracks nearby, headed back in the right
direction. So the ant heads over to the train tracks to wait. After a while, a train goes by, heading in the right direction,
and he manages to jump up and just catch the edge of the train wheel (without getting squished!).
The ant is still worried about getting dizzy, but the train wheel is slippery and has no spokes to climb, so he decides to
just hang on to the edge of the wheel and hope for the best. Now, train wheels have a flange to keep the wheel running
on the tracks. So, in this case, since the ant is hanging on to the very edge of the flange, the distance from the center of
the wheel to the ant is actually greater than the radius of the wheel (Figure 10.1.14).
The setup here is essentially the same as when the ant climbed up the spoke on the bicycle wheel. We let b denote the
distance from the center of the wheel to the ant, and we let t represent the angle the tire has rotated through. Additionally,
we let \(C = (x_C, y_C)\) represent the position of the center of the wheel and \(A = (x_A, y_A)\) represent the position of the ant (Figure 10.1.14).
When the distance from the center of the wheel to the ant is greater than the radius of the wheel, his path of motion is
called a prolate cycloid. A graph of a prolate cycloid is shown in the figure.

Figure 10.1.14: (a) The ant is hanging onto the flange of the train wheel. (b) The new setup, now that the ant has jumped onto
the train wheel. (c) The ant travels along a prolate cycloid.
4. Using the same approach you used in parts 1– 3, find the parametric equations for the path of motion of the ant.
5. What do you notice about your answer to part 3 and your answer to part 4?
Notice that the ant is actually traveling backward at times (the “loops” in the graph), even though the train
continues to move forward. He is probably going to be really dizzy by the time he gets home!

Key Concepts
Parametric equations provide a convenient way to describe a curve. A parameter can represent time or some other meaningful
quantity.
It is often possible to eliminate the parameter in a parameterized curve to obtain a function or relation describing that curve.
There is always more than one way to parameterize a curve.
Parametric equations can describe complicated curves that are difficult or perhaps impossible to describe using rectangular
coordinates.

Glossary
cycloid
the curve traced by a point on the rim of a circular wheel as the wheel rolls along a straight line without slippage

cusp
a pointed end or part where two curves meet

orientation
the direction that a point moves on a graph as the parameter increases

parameter
an independent variable that both x and y depend on in a parametric curve; usually represented by the variable t

parametric curve
the graph of the parametric equations x(t) and y(t) over an interval a ≤ t ≤ b combined with the equations

parametric equations
the equations x = x(t) and y = y(t) that define a parametric curve

parameterization of a curve
rewriting the equation of a curve defined by a function y = f (x) as parametric equations

10.1: Curves Defined by Parametric Equations is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
11.1: Parametric Equations by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0. Original source:
https://openstax.org/details/books/calculus-volume-1.

10.2: Calculus with Parametric Curves
 Learning Objectives
Determine derivatives and equations of tangents for parametric curves.
Find the area under a parametric curve.
Use the equation for arc length of a parametric curve.
Apply the formula for surface area to a volume generated by a parametric curve.

Now that we have introduced the concept of a parameterized curve, our next step is to learn how to work with this concept in the context
of calculus. For example, if we know a parameterization of a given curve, is it possible to calculate the slope of a tangent line to the
curve? How about the arc length of the curve? Or the area under the curve?
Another scenario: Suppose we would like to represent the location of a baseball after the ball leaves a pitcher’s hand. If the position of
the baseball is represented by the plane curve (x(t), y(t)) then we should be able to use calculus to find the speed of the ball at any given
time. Furthermore, we should be able to calculate just how far that ball has traveled as a function of time.

Derivatives of Parametric Equations


We start by asking how to calculate the slope of a line tangent to a parametric curve at a point. Consider the plane curve defined by the
parametric equations
\[x(t) = 2t + 3 \tag{10.2.1}\]

\[y(t) = 3t - 4 \tag{10.2.2}\]

for −2 ≤ t ≤ 3.
The graph of this curve appears in Figure 10.2.1. It is a line segment starting at (−1, −10) and ending at (9, 5).

Figure 10.2.1 : Graph of the line segment described by the given parametric equations.
We can eliminate the parameter by first solving Equation 10.2.1 for t :
x(t) = 2t + 3

x − 3 = 2t

10.2.1 https://math.libretexts.org/@go/page/4506
\[t = \frac{x - 3}{2}.\]

Substituting this into y(t) (Equation 10.2.2), we obtain

\begin{align*}
y(t) &= 3t - 4 \\
y &= 3\left(\frac{x - 3}{2}\right) - 4 \\
y &= \frac{3x}{2} - \frac{9}{2} - 4 \\
y &= \frac{3x}{2} - \frac{17}{2}.
\end{align*}

The slope of this line is given by \(\dfrac{dy}{dx} = \dfrac{3}{2}\). Next we calculate x'(t) and y'(t). This gives x'(t) = 2 and y'(t) = 3. Notice that

\[\frac{dy}{dx} = \frac{dy/dt}{dx/dt} = \frac{3}{2}.\]

This is no coincidence, as outlined in the following theorem.

 Derivative of Parametric Equations


Consider the plane curve defined by the parametric equations x = x(t) and y = y(t). Suppose that x'(t) and y'(t) exist, and assume that x'(t) ≠ 0. Then the derivative \(\dfrac{dy}{dx}\) is given by

\[\frac{dy}{dx} = \frac{dy/dt}{dx/dt} = \frac{y'(t)}{x'(t)}. \tag{10.2.3}\]

 Proof
This theorem can be proven using the Chain Rule. In particular, assume that the parameter t can be eliminated, yielding a differentiable function y = F(x). Then y(t) = F(x(t)). Differentiating both sides of this equation using the Chain Rule yields

\[y'(t) = F'(x(t))\,x'(t),\]

so

\[F'(x(t)) = \frac{y'(t)}{x'(t)}.\]

But \(F'(x(t)) = \dfrac{dy}{dx}\), which proves the theorem.

Equation 10.2.3 can be used to calculate derivatives of plane curves, as well as critical points. Recall that a critical point of a differentiable function y = f(x) is any point \(x = x_0\) such that either \(f'(x_0) = 0\) or \(f'(x_0)\) does not exist. Equation 10.2.3 gives a formula for the slope of a tangent line to a curve defined parametrically regardless of whether the curve can be described by a function y = f(x) or not.
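Equation 10.2.3 translates directly into a short computation. The sketch below (function names and the step size h are our own assumptions) evaluates y'(t)/x'(t) for the line segment of Equations 10.2.1 and 10.2.2, and compares it with a secant slope computed from two nearby points on the curve.

```python
def parametric_slope(dx_dt, dy_dt, t):
    """Return dy/dx = y'(t) / x'(t), or None where x'(t) = 0 and the formula does not apply."""
    dx = dx_dt(t)
    if dx == 0:
        return None
    return dy_dt(t) / dx

def secant_slope(x, y, t, h=1e-6):
    # Slope of the secant line through the points at t - h and t + h.
    return (y(t + h) - y(t - h)) / (x(t + h) - x(t - h))

# Line segment x(t) = 2t + 3, y(t) = 3t - 4.
slope_exact = parametric_slope(lambda t: 2, lambda t: 3, 1.0)
slope_secant = secant_slope(lambda t: 2 * t + 3, lambda t: 3 * t - 4, 1.0)

# For x(t) = t^2 - 3 the formula breaks down at t = 0, where x'(t) = 0.
slope_at_vertex = parametric_slope(lambda t: 2 * t, lambda t: 2, 0)

print(slope_exact, slope_secant, slope_at_vertex)
```

Returning None where x'(t) = 0 is one possible design choice; it mirrors the hypothesis x'(t) ≠ 0 in the theorem.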

 Example 10.2.1: Finding the Derivative of a Parametric Curve


Calculate the derivative \(\dfrac{dy}{dx}\) for each of the following parametrically defined plane curves, and locate any critical points on their respective graphs.

a. \(x(t) = t^2 - 3, \quad y(t) = 2t - 1, \quad \text{for } -3 \le t \le 4\)

b. \(x(t) = 2t + 1, \quad y(t) = t^3 - 3t + 4, \quad \text{for } -2 \le t \le 2\)

c. \(x(t) = 5\cos t, \quad y(t) = 5\sin t, \quad \text{for } 0 \le t \le 2\pi\)

Solution

a. To apply Equation 10.2.3, first calculate x'(t) and y'(t):

\begin{align*}
x'(t) &= 2t \\
y'(t) &= 2.
\end{align*}

Next substitute these into the equation:

\begin{align*}
\frac{dy}{dx} &= \frac{dy/dt}{dx/dt} \\
\frac{dy}{dx} &= \frac{2}{2t} \\
\frac{dy}{dx} &= \frac{1}{t}.
\end{align*}

This derivative is undefined when t = 0. Calculating x(0) and y(0) gives \(x(0) = (0)^2 - 3 = -3\) and \(y(0) = 2(0) - 1 = -1\), which corresponds to the point (−3, −1) on the graph. The graph of this curve is a parabola opening to the right, and the point (−3, −1) is its vertex as shown.

Figure 10.2.2 : Graph of the parabola described by parametric equations in part a.


b. To apply Equation 10.2.3, first calculate x'(t) and y'(t):

\begin{align*}
x'(t) &= 2 \\
y'(t) &= 3t^2 - 3.
\end{align*}

Next substitute these into the equation:

\begin{align*}
\frac{dy}{dx} &= \frac{dy/dt}{dx/dt} \\
\frac{dy}{dx} &= \frac{3t^2 - 3}{2}.
\end{align*}

This derivative is zero when t = ±1. When t = −1 we have

\[x(-1) = 2(-1) + 1 = -1 \quad \text{and} \quad y(-1) = (-1)^3 - 3(-1) + 4 = -1 + 3 + 4 = 6,\]

which corresponds to the point (−1, 6) on the graph. When t = 1 we have

\[x(1) = 2(1) + 1 = 3 \quad \text{and} \quad y(1) = (1)^3 - 3(1) + 4 = 1 - 3 + 4 = 2,\]

which corresponds to the point (3, 2) on the graph. The point (3, 2) is a relative minimum and the point (−1, 6) is a relative maximum, as seen in the following graph.

Figure 10.2.3 : Graph of the curve described by parametric equations in part b.
c. To apply Equation 10.2.3, first calculate x'(t) and y'(t):

\begin{align*}
x'(t) &= -5\sin t \\
y'(t) &= 5\cos t.
\end{align*}

Next substitute these into the equation:

\begin{align*}
\frac{dy}{dx} &= \frac{dy/dt}{dx/dt} \\
\frac{dy}{dx} &= \frac{5\cos t}{-5\sin t} \\
\frac{dy}{dx} &= -\cot t.
\end{align*}

This derivative is zero when cos t = 0 and is undefined when sin t = 0. This gives \(t = 0, \dfrac{\pi}{2}, \pi, \dfrac{3\pi}{2},\) and \(2\pi\) as critical points for t. Substituting each of these into x(t) and y(t), we obtain

t        x(t)    y(t)
0        5       0
π/2      0       5
π        −5      0
3π/2     0       −5
2π       5       0

These points correspond to the sides, top, and bottom of the circle that is represented by the parametric equations (Figure 10.2.4). On the left and right edges of the circle, the derivative is undefined, and on the top and bottom, the derivative equals zero.

Figure 10.2.4 : Graph of the curve described by parametric equations in part c.
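The critical points found in part b of the example above can be confirmed with a few lines of code. This is a sketch with hypothetical function names; it evaluates the slope formula at t = ±1 and classifies each point by the sign of the slope slightly before and after it (the offset 0.1 is arbitrary).

```python
def x_b(t):
    return 2 * t + 1

def y_b(t):
    return t**3 - 3 * t + 4

def slope_b(t):
    # dy/dx = y'(t) / x'(t) = (3t^2 - 3) / 2
    return (3 * t**2 - 3) / 2

critical_points = [(x_b(t), y_b(t)) for t in (-1, 1) if slope_b(t) == 0]

def classify(t, eps=0.1):
    # Since x'(t) = 2 > 0, increasing t moves the point to the right,
    # so the sign change of dy/dx in t matches the sign change in x.
    before, after = slope_b(t - eps), slope_b(t + eps)
    if before > 0 > after:
        return "relative maximum"
    if before < 0 < after:
        return "relative minimum"
    return "neither"

labels = [classify(t) for t in (-1, 1)]

print(critical_points, labels)
```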

 Exercise 10.2.1

Calculate the derivative dy/dx for the plane curve defined by the equations

\[x(t) = t^2 - 4t, \quad y(t) = 2t^3 - 6t, \quad \text{for } -2 \le t \le 3\]

and locate any critical points on its graph.

Hint
Calculate x'(t) and y'(t) and use Equation 10.2.3.

Answer
\(x'(t) = 2t - 4\) and \(y'(t) = 6t^2 - 6\), so \(\dfrac{dy}{dx} = \dfrac{6t^2 - 6}{2t - 4} = \dfrac{3t^2 - 3}{t - 2}\).

This expression is undefined when t = 2 and equal to zero when t = ±1.

 Example 10.2.2: Finding a Tangent Line

Find the equation of the tangent line to the curve defined by the equations

\[x(t) = t^2 - 3, \quad y(t) = 2t - 1, \quad \text{for } -3 \le t \le 4\]

when t = 2.
Solution
First find the slope of the tangent line using Equation 10.2.3, which means calculating x'(t) and y'(t):

\begin{align*}
x'(t) &= 2t \\
y'(t) &= 2.
\end{align*}

Next substitute these into the equation:

\begin{align*}
\frac{dy}{dx} &= \frac{dy/dt}{dx/dt} \\
\frac{dy}{dx} &= \frac{2}{2t} \\
\frac{dy}{dx} &= \frac{1}{t}.
\end{align*}

When t = 2, \(\dfrac{dy}{dx} = \dfrac{1}{2}\), so this is the slope of the tangent line. Calculating x(2) and y(2) gives

\[x(2) = (2)^2 - 3 = 1 \quad \text{and} \quad y(2) = 2(2) - 1 = 3,\]

which corresponds to the point (1, 3) on the graph (Figure 10.2.5). Now use the point-slope form of the equation of a line to find the equation of the tangent line:

\begin{align*}
y - y_0 &= m(x - x_0) \\
y - 3 &= \frac{1}{2}(x - 1) \\
y - 3 &= \frac{1}{2}x - \frac{1}{2} \\
y &= \frac{1}{2}x + \frac{5}{2}.
\end{align*}

Figure 10.2.5 : Tangent line to the parabola described by the given parametric equations when t = 2 .
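The steps of this example (slope from Equation 10.2.3, point on the curve, then point-slope form) can be collected into one small routine. The function name and its signature are hypothetical, and the sketch assumes x'(t) is nonzero at the chosen parameter value.

```python
def tangent_line(x, y, dx_dt, dy_dt, t0):
    """Return (slope, intercept) of the tangent line y = slope * x + intercept at t = t0.

    Assumes dx_dt(t0) != 0, so that Equation 10.2.3 applies.
    """
    slope = dy_dt(t0) / dx_dt(t0)
    x0, y0 = x(t0), y(t0)
    # Rearranging y - y0 = slope * (x - x0) into slope-intercept form.
    intercept = y0 - slope * x0
    return slope, intercept

# Curve from the example: x(t) = t^2 - 3, y(t) = 2t - 1, at t = 2.
slope, intercept = tangent_line(
    lambda t: t**2 - 3,
    lambda t: 2 * t - 1,
    lambda t: 2 * t,
    lambda t: 2,
    2,
)

print(slope, intercept)
```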

 Exercise 10.2.2

Find the equation of the tangent line to the curve defined by the equations

\[x(t) = t^2 - 4t, \quad y(t) = 2t^3 - 6t, \quad \text{for } -2 \le t \le 6\]

when t = 5.

Hint
Calculate x'(t) and y'(t) and use Equation 10.2.3.

Answer
The equation of the tangent line is y = 24x + 100.

Second-Order Derivatives
Our next goal is to see how to take the second derivative of a function defined parametrically. The second derivative of a function y = f(x) is defined to be the derivative of the first derivative; that is,

\[\frac{d^2y}{dx^2} = \frac{d}{dx}\left[\frac{dy}{dx}\right]. \tag{10.2.4}\]

Since

\[\frac{dy}{dx} = \frac{dy/dt}{dx/dt},\]

we can replace the y on both sides of this equation with \(\dfrac{dy}{dx}\). This gives us

\[\frac{d^2y}{dx^2} = \frac{d}{dx}\left(\frac{dy}{dx}\right) = \frac{(d/dt)(dy/dx)}{dx/dt}. \tag{10.2.5}\]

If we know dy/dx as a function of t, then this formula is straightforward to apply.

 Example 10.2.3: Finding a Second Derivative

Calculate the second derivative \(d^2y/dx^2\) for the plane curve defined by the parametric equations

\[x(t) = t^2 - 3, \quad y(t) = 2t - 1, \quad \text{for } -3 \le t \le 4.\]

Solution
From Example 10.2.1 we know that \(\dfrac{dy}{dx} = \dfrac{2}{2t} = \dfrac{1}{t}\). Using Equation 10.2.5, we obtain

\[\frac{d^2y}{dx^2} = \frac{(d/dt)(dy/dx)}{dx/dt} = \frac{(d/dt)(1/t)}{2t} = \frac{-t^{-2}}{2t} = -\frac{1}{2t^3}.\]
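The result of this example can be checked numerically by applying Equation 10.2.5 with a finite-difference approximation for the derivative with respect to t. This is a sketch only; the function names and the step size h are assumptions.

```python
def dy_dx(t):
    # First derivative from Example 10.2.1: (dy/dt) / (dx/dt) = 2 / (2t).
    return 2 / (2 * t)

def dx_dt(t):
    return 2 * t

def second_derivative_numeric(t, h=1e-5):
    # Equation 10.2.5: differentiate dy/dx with respect to t (central difference),
    # then divide by dx/dt.
    d_dt = (dy_dx(t + h) - dy_dx(t - h)) / (2 * h)
    return d_dt / dx_dt(t)

def second_derivative_exact(t):
    return -1 / (2 * t**3)

t0 = 2.0
approx = second_derivative_numeric(t0)
exact = second_derivative_exact(t0)

print(approx, exact)
```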

 Exercise 10.2.3

Calculate the second derivative \(d^2y/dx^2\) for the plane curve defined by the equations

\[x(t) = t^2 - 4t, \quad y(t) = 2t^3 - 6t, \quad \text{for } -2 \le t \le 3\]

and locate any critical points on its graph.

Hint
Start with the solution from the previous exercise, and use Equation 10.2.5.

Answer
\(\dfrac{d^2y}{dx^2} = \dfrac{3t^2 - 12t + 3}{2(t - 2)^3}\). Critical points (5, 4), (−3, −4), and (−4, 4).

Integrals Involving Parametric Equations


Now that we have seen how to calculate the derivative of a plane curve, the next question is this: How do we find the area under a curve
defined parametrically? Recall the cycloid defined by these parametric equations

x(t) = t − sin t

y(t) = 1 − cos t.

Suppose we want to find the area of the shaded region in the following graph.

Figure 10.2.6 : Graph of a cycloid with the arch over [0, 2π] highlighted.
To derive a formula for the area under the curve defined by the functions

\[x = x(t), \quad y = y(t), \quad \text{where } a \le t \le b,\]

we assume that x(t) is differentiable and start with an equal partition of the interval a ≤ t ≤ b. Suppose \(t_0 = a < t_1 < t_2 < \cdots < t_n = b\) and consider the following graph.

Figure 10.2.7 : Approximating the area under a parametrically defined curve.


We use rectangles to approximate the area under the curve. The height of a typical rectangle in this parametrization is \(y(x(\bar{t}_i))\) for some value \(\bar{t}_i\) in the ith subinterval, and the width can be calculated as \(x(t_i) - x(t_{i-1})\). Thus the area of the ith rectangle is given by

\[A_i = y(x(\bar{t}_i))\,(x(t_i) - x(t_{i-1})).\]

Then a Riemann sum for the area is

\[A_n = \sum_{i=1}^{n} y(x(\bar{t}_i))\,(x(t_i) - x(t_{i-1})).\]

Multiplying and dividing each area by \(t_i - t_{i-1}\) gives

\begin{align*}
A_n &= \sum_{i=1}^{n} y(x(\bar{t}_i)) \left(\frac{x(t_i) - x(t_{i-1})}{t_i - t_{i-1}}\right)(t_i - t_{i-1}) \\
&= \sum_{i=1}^{n} y(x(\bar{t}_i)) \left(\frac{x(t_i) - x(t_{i-1})}{\Delta t}\right)\Delta t.
\end{align*}

Taking the limit as n approaches infinity gives

\[A = \lim_{n \to \infty} A_n = \int_a^b y(t)\,x'(t)\,dt.\]

This leads to the following theorem.

 Area under a Parametric Curve
Consider the non-self-intersecting plane curve defined by the parametric equations

\[x = x(t), \quad y = y(t), \quad \text{for } a \le t \le b\]

and assume that x(t) is differentiable. The area under this curve is given by

\[A = \int_a^b y(t)\,x'(t)\,dt. \tag{10.2.6}\]

 Example 10.2.4: Finding the Area under a Parametric Curve

Find the area under the curve of the cycloid defined by the equations

x(t) = t − sin t, y(t) = 1 − cos t, for 0 ≤ t ≤ 2π.

Solution
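Using Equation 10.2.6, we have

\begin{align*}
A &= \int_a^b y(t)\,x'(t)\,dt \\
&= \int_0^{2\pi} (1 - \cos t)(1 - \cos t)\,dt \\
&= \int_0^{2\pi} (1 - 2\cos t + \cos^2 t)\,dt \\
&= \int_0^{2\pi} \left(1 - 2\cos t + \frac{1 + \cos(2t)}{2}\right)dt \\
&= \int_0^{2\pi} \left(\frac{3}{2} - 2\cos t + \frac{\cos(2t)}{2}\right)dt \\
&= \left.\frac{3t}{2} - 2\sin t + \frac{\sin(2t)}{4}\,\right|_0^{2\pi} \\
&= 3\pi.
\end{align*}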
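The value 3π can also be approximated numerically from Equation 10.2.6. The following sketch uses the midpoint rule, which is our choice of quadrature rather than anything prescribed by the text; the function name and the number of subintervals are likewise assumptions.

```python
import math

def parametric_area(y, dx_dt, a, b, n=10_000):
    """Approximate the integral of y(t) * x'(t) over [a, b] using the midpoint rule."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t_mid = a + (i + 0.5) * h
        total += y(t_mid) * dx_dt(t_mid)
    return total * h

# One arch of the cycloid: y(t) = 1 - cos t and x'(t) = 1 - cos t on [0, 2*pi].
area = parametric_area(
    lambda t: 1 - math.cos(t),
    lambda t: 1 - math.cos(t),
    0.0,
    2 * math.pi,
)

print(area, 3 * math.pi)
```

Because the formula integrates y(t)x'(t), the result is a signed quantity: if x(t) decreases over the interval, the computed value is negative.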

 Exercise 10.2.4

Find the area under the curve of the hypocycloid defined by the equations

x(t) = 3 cos t + cos(3t), y(t) = 3 sin t − sin(3t), for 0 ≤ t ≤ π.

Hint
Use Equation 10.2.6, along with the identities \(\sin\alpha\sin\beta = \dfrac{1}{2}[\cos(\alpha - \beta) - \cos(\alpha + \beta)]\) and \(\sin^2 t = \dfrac{1 - \cos(2t)}{2}\).

Answer
A = 3π (Note that the integral formula actually yields a negative answer. This is due to the fact that x(t) is a decreasing function
over the interval [0, π]; that is, the curve is traced from right to left.)

Arc Length of a Parametric Curve


In addition to finding the area under a parametric curve, we sometimes need to find the arc length of a parametric curve. In the case of a
line segment, arc length is the same as the distance between the endpoints. If a particle travels from point A to point B along a curve,
then the distance that particle travels is the arc length. To develop a formula for arc length, we start with an approximation by line
segments as shown in the following graph.

Figure 10.2.7 : Approximation of a curve by line segments.
Given a plane curve defined by the functions x = x(t), y = y(t), for a ≤ t ≤ b, we start by partitioning the interval [a, b] into n equal subintervals: \(t_0 = a < t_1 < t_2 < \cdots < t_n = b\). The width of each subinterval is given by Δt = (b − a)/n. We can calculate the length of each line segment:

\begin{align*}
d_1 &= \sqrt{(x(t_1) - x(t_0))^2 + (y(t_1) - y(t_0))^2} \\
d_2 &= \sqrt{(x(t_2) - x(t_1))^2 + (y(t_2) - y(t_1))^2}
\end{align*}

etc.
Then add these up. We let s denote the exact arc length and \(s_n\) denote the approximation by n line segments:

\[s \approx \sum_{k=1}^{n} s_k = \sum_{k=1}^{n} \sqrt{(x(t_k) - x(t_{k-1}))^2 + (y(t_k) - y(t_{k-1}))^2}. \tag{10.2.7}\]

If we assume that x(t) and y(t) are differentiable functions of t, then the Mean Value Theorem applies, so in each subinterval \([t_{k-1}, t_k]\) there exist \(\hat{t}_k\) and \(\tilde{t}_k\) such that

\begin{align*}
x(t_k) - x(t_{k-1}) &= x'(\hat{t}_k)(t_k - t_{k-1}) = x'(\hat{t}_k)\,\Delta t \\
y(t_k) - y(t_{k-1}) &= y'(\tilde{t}_k)(t_k - t_{k-1}) = y'(\tilde{t}_k)\,\Delta t.
\end{align*}

Therefore Equation 10.2.7 becomes

\begin{align*}
s &\approx \sum_{k=1}^{n} s_k = \sum_{k=1}^{n} \sqrt{(x'(\hat{t}_k)\,\Delta t)^2 + (y'(\tilde{t}_k)\,\Delta t)^2} \\
&= \sum_{k=1}^{n} \sqrt{(x'(\hat{t}_k))^2 (\Delta t)^2 + (y'(\tilde{t}_k))^2 (\Delta t)^2} \\
&= \sum_{k=1}^{n} \sqrt{(x'(\hat{t}_k))^2 + (y'(\tilde{t}_k))^2}\;\Delta t.
\end{align*}

This is a Riemann sum that approximates the arc length over a partition of the interval [a, b]. If we further assume that the derivatives are continuous and let the number of points in the partition increase without bound, the approximation approaches the exact arc length. This gives

\begin{align*}
s &= \lim_{n \to \infty} \sum_{k=1}^{n} s_k \\
&= \lim_{n \to \infty} \sum_{k=1}^{n} \sqrt{(x'(\hat{t}_k))^2 + (y'(\tilde{t}_k))^2}\;\Delta t \\
&= \int_a^b \sqrt{(x'(t))^2 + (y'(t))^2}\,dt.
\end{align*}

When taking the limit, the values of \(\hat{t}_k\) and \(\tilde{t}_k\) are both contained within the same ever-shrinking interval of width Δt, so they must converge to the same value.
We can summarize this method in the following theorem.

 Arc Length of a Parametric Curve
Consider the plane curve defined by the parametric equations

\[ x = x(t), \quad y = y(t), \quad \text{for } t_1 \le t \le t_2 \]

and assume that \(x(t)\) and \(y(t)\) are differentiable functions of \(t\). Then the arc length of this curve is given by

\[ s = \int_{t_1}^{t_2} \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2}\,dt. \tag{10.2.8} \]

At this point a side derivation leads to a previous formula for arc length. In particular, suppose the parameter can be eliminated, leading
to a function y = F (x). Then y(t) = F (x(t)) and the Chain Rule gives

y'(t) = F '(x(t))x'(t).

Substituting this into Equation 10.2.8 gives


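\[ \begin{aligned}
s &= \int_{t_1}^{t_2} \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(F'(x)\,\frac{dx}{dt}\right)^2}\,dt \\
&= \int_{t_1}^{t_2} \sqrt{\left(\frac{dx}{dt}\right)^2 \left(1 + (F'(x))^2\right)}\,dt \\
&= \int_{t_1}^{t_2} x'(t)\sqrt{1 + \left(\frac{dy}{dx}\right)^2}\,dt.
\end{aligned} \]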

Here we have assumed that \(x'(t) > 0\), so that \(\sqrt{(x'(t))^2} = x'(t)\); this holds whenever the curve is traced from left to right. Substituting \(dx = x'(t)\,dt\) and letting \(a = x(t_1)\) and
\(b = x(t_2)\), we obtain the formula

\[ s = \int_a^b \sqrt{1 + \left(\frac{dy}{dx}\right)^2}\,dx, \]

which is the formula for arc length obtained in the Introduction to the Applications of Integration.

 Example 10.2.5: Finding the Arc Length of a Parametric Curve

Find the arc length of the semicircle defined by the equations


x(t) = 3 cos t, y(t) = 3 sin t, for 0 ≤ t ≤ π.

Solution
The values t = 0 to t = π trace out the blue curve in Figure 10.2.8. To determine its length, use Equation 10.2.8:
\[ \begin{aligned}
s &= \int_{t_1}^{t_2} \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2}\,dt \\
&= \int_0^{\pi} \sqrt{(-3\sin t)^2 + (3\cos t)^2}\,dt \\
&= \int_0^{\pi} \sqrt{9\sin^2 t + 9\cos^2 t}\,dt \\
&= \int_0^{\pi} \sqrt{9(\sin^2 t + \cos^2 t)}\,dt \\
&= \int_0^{\pi} 3\,dt = 3t\,\Big|_0^{\pi} \\
&= 3\pi \text{ units}.
\end{aligned} \]

Note that the formula for the arc length of a semicircle is πr and the radius of this circle is 3. This is a great example of using
calculus to derive a known formula of a geometric quantity.

Figure 10.2.8 : The arc length of the semicircle is equal to its radius times π.

 Exercise 10.2.5
Find the arc length of the curve defined by the equations
\[ x(t) = 3t^2, \quad y(t) = 2t^3, \quad \text{for } 1 \le t \le 3. \]

Hint
Use Equation 10.2.8.

Answer
\(s = 2\left(10^{3/2} - 2^{3/2}\right) \approx 57.589\) units
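One way to reach this answer, offered as a check on the stated result rather than as the text's own worked solution, is to apply Equation 10.2.8 directly and note that \(t > 0\) on \([1, 3]\):

```latex
\begin{aligned}
x'(t) &= 6t, \qquad y'(t) = 6t^2 \\
s &= \int_1^3 \sqrt{(6t)^2 + (6t^2)^2}\,dt
   = \int_1^3 6t\sqrt{1 + t^2}\,dt \\
  &= 2\left(1 + t^2\right)^{3/2}\Big|_1^3
   = 2\left(10^{3/2} - 2^{3/2}\right) \approx 57.589.
\end{aligned}
```

The antiderivative comes from the substitution \(u = 1 + t^2\), \(du = 2t\,dt\).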

We now return to the problem posed at the beginning of the section about a baseball leaving a pitcher’s hand. Ignoring the effect of air
resistance (unless it is a curve ball!), the ball travels a parabolic path. Assuming the pitcher’s hand is at the origin and the ball travels left
to right in the direction of the positive x-axis, the parametric equations for this curve can be written as
\[ x(t) = 140t, \quad y(t) = -16t^2 + 2t \]

where t represents time. We first calculate the distance the ball travels as a function of time. This distance is represented by the arc
length. We can modify the arc length formula slightly. First rewrite the functions x(t) and y(t) using v as an independent variable, so as
to eliminate any confusion with the parameter t :
\[ x(v) = 140v, \quad y(v) = -16v^2 + 2v. \]

Then we write the arc length formula as follows:


\[ \begin{aligned}
s(t) &= \int_0^t \sqrt{\left(\frac{dx}{dv}\right)^2 + \left(\frac{dy}{dv}\right)^2}\,dv \\
&= \int_0^t \sqrt{140^2 + (-32v + 2)^2}\,dv.
\end{aligned} \]

The variable v acts as a dummy variable that disappears after integration, leaving the arc length as a function of time t . To integrate this
expression we can use a formula from Appendix A,
\[ \int \sqrt{a^2 + u^2}\,du = \frac{u}{2}\sqrt{a^2 + u^2} + \frac{a^2}{2}\ln\left|u + \sqrt{a^2 + u^2}\right| + C. \]

We set \(a = 140\) and \(u = -32v + 2\). This gives \(du = -32\,dv\), so \(dv = -\frac{1}{32}\,du\). Therefore

\[ \begin{aligned}
\int \sqrt{140^2 + (-32v + 2)^2}\,dv &= -\frac{1}{32}\int \sqrt{a^2 + u^2}\,du \\
&= -\frac{1}{32}\left[\frac{(-32v + 2)}{2}\sqrt{140^2 + (-32v + 2)^2} + \frac{140^2}{2}\ln\left|(-32v + 2) + \sqrt{140^2 + (-32v + 2)^2}\right| + C\right]
\end{aligned} \]

and
\[ \begin{aligned}
s(t) &= -\frac{1}{32}\left[\frac{(-32t + 2)}{2}\sqrt{140^2 + (-32t + 2)^2} + \frac{140^2}{2}\ln\left|(-32t + 2) + \sqrt{140^2 + (-32t + 2)^2}\right|\right] \\
&\quad + \frac{1}{32}\left[\sqrt{140^2 + 2^2} + \frac{140^2}{2}\ln\left|2 + \sqrt{140^2 + 2^2}\right|\right] \\
&= \left(\frac{t}{2} - \frac{1}{32}\right)\sqrt{1024t^2 - 128t + 19604} - \frac{1225}{4}\ln\left|(-32t + 2) + \sqrt{1024t^2 - 128t + 19604}\right| \\
&\quad + \frac{\sqrt{19604}}{32} + \frac{1225}{4}\ln\left(2 + \sqrt{19604}\right).
\end{aligned} \]

This function represents the distance traveled by the ball as a function of time. To calculate the speed, take the derivative of this function
with respect to t . While this may seem like a daunting task, it is possible to obtain the answer directly from the Fundamental Theorem of
Calculus:
\[ \frac{d}{dx}\int_a^x f(u)\,du = f(x). \]

Therefore
\[ \begin{aligned}
s'(t) &= \frac{d}{dt}\left[s(t)\right] \\
&= \frac{d}{dt}\left[\int_0^t \sqrt{140^2 + (-32v + 2)^2}\,dv\right] \\
&= \sqrt{140^2 + (-32t + 2)^2} \\
&= \sqrt{1024t^2 - 128t + 19604} \\
&= 2\sqrt{256t^2 - 32t + 4901}.
\end{aligned} \]

One third of a second after the ball leaves the pitcher’s hand, the distance it travels is equal to
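\[ \begin{aligned}
s\left(\tfrac{1}{3}\right) &= \left(\frac{1/3}{2} - \frac{1}{32}\right)\sqrt{1024\left(\tfrac{1}{3}\right)^2 - 128\left(\tfrac{1}{3}\right) + 19604} \\
&\quad - \frac{1225}{4}\ln\left|\left(-32\left(\tfrac{1}{3}\right) + 2\right) + \sqrt{1024\left(\tfrac{1}{3}\right)^2 - 128\left(\tfrac{1}{3}\right) + 19604}\right| \\
&\quad + \frac{\sqrt{19604}}{32} + \frac{1225}{4}\ln\left(2 + \sqrt{19604}\right) \\
&\approx 46.69 \text{ feet}.
\end{aligned} \]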

This value is just over three quarters of the way to home plate. The speed of the ball is
\[ s'\left(\tfrac{1}{3}\right) = 2\sqrt{256\left(\tfrac{1}{3}\right)^2 - 32\left(\tfrac{1}{3}\right) + 4901} \approx 140.27 \text{ ft/s}. \]

This speed translates to approximately 95 mph—a major-league fastball.
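As a sanity check on the algebra, the distance can also be computed by integrating the speed numerically. The sketch below is illustrative only: the function names and the use of composite Simpson's rule are assumptions of this example, not part of the text. It compares the numerical integral with the closed-form expression for \(s(t)\) at \(t = 1/3\).

```python
import math

def speed(v):
    """Speed in ft/s at time v: sqrt(x'(v)^2 + y'(v)^2) with x' = 140, y' = -32v + 2."""
    return math.hypot(140.0, -32.0 * v + 2.0)

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def distance_closed_form(t):
    """Closed-form s(t) from the integration-table formula."""
    root = math.sqrt(1024 * t**2 - 128 * t + 19604)
    return ((t / 2 - 1 / 32) * root
            - 1225 / 4 * math.log(abs((-32 * t + 2) + root))
            + math.sqrt(19604) / 32
            + 1225 / 4 * math.log(2 + math.sqrt(19604)))

t = 1 / 3
s_numeric = simpson(speed, 0.0, t)    # arc length as the integral of speed
s_formula = distance_closed_form(t)   # same quantity from the antiderivative
speed_ft_s = speed(t)                 # s'(t) by the Fundamental Theorem of Calculus
speed_mph = speed_ft_s * 3600 / 5280  # convert ft/s to miles per hour
print(f"s(1/3) numeric = {s_numeric:.4f} ft, closed form = {s_formula:.4f} ft")
print(f"speed = {speed_ft_s:.2f} ft/s = {speed_mph:.1f} mph")
```

The two distance values should agree closely, which is a quick way to catch a sign or coefficient slip in a long antiderivative.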

Surface Area Generated by a Parametric Curve


Recall the problem of finding the surface area of a volume of revolution. In Curve Length and Surface Area, we derived a formula for
finding the surface area of a volume generated by a function y = f (x) from x = a to x = b, revolved around the x-axis:
\[ S = 2\pi \int_a^b f(x)\sqrt{1 + (f'(x))^2}\,dx. \]

We now consider a volume of revolution generated by revolving a parametrically defined curve x = x(t), y = y(t), for a ≤ t ≤ b

around the x-axis as shown in Figure 10.2.9.

Figure 10.2.9 : A surface of revolution generated by a parametrically defined curve.


The analogous formula for a parametrically defined curve is
\[ S = 2\pi \int_a^b y(t)\sqrt{(x'(t))^2 + (y'(t))^2}\,dt \tag{10.2.9} \]

provided that y(t) is not negative on [a, b].

 Example 10.2.6: Finding Surface Area

Find the surface area of a sphere of radius r centered at the origin.


Solution
We start with the curve defined by the equations

x(t) = r cos t, y(t) = r sin t, for 0 ≤ t ≤ π.

This generates an upper semicircle of radius r centered at the origin as shown in the following graph.

Figure 10.2.10: A semicircle generated by parametric equations.


When this curve is revolved around the x-axis, it generates a sphere of radius r. To calculate the surface area of the sphere, we use
Equation 10.2.9:

\[ \begin{aligned}
S &= 2\pi \int_a^b y(t)\sqrt{(x'(t))^2 + (y'(t))^2}\,dt \\
&= 2\pi \int_0^{\pi} r\sin t\,\sqrt{(-r\sin t)^2 + (r\cos t)^2}\,dt \\
&= 2\pi \int_0^{\pi} r\sin t\,\sqrt{r^2\sin^2 t + r^2\cos^2 t}\,dt \\
&= 2\pi \int_0^{\pi} r\sin t\,\sqrt{r^2(\sin^2 t + \cos^2 t)}\,dt \\
&= 2\pi \int_0^{\pi} r^2 \sin t\,dt \\
&= 2\pi r^2 (-\cos t)\,\Big|_0^{\pi} \\
&= 2\pi r^2(-\cos \pi + \cos 0) \\
&= 4\pi r^2 \text{ units}^2.
\end{aligned} \]

This is, in fact, the formula for the surface area of a sphere.
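Equation 10.2.9 can also be evaluated numerically. The following sketch is an illustration under stated assumptions: the helper names, the choice of Simpson's rule, and the sample radius of 2 are all made up for this example. It recovers the sphere's area from the parametric semicircle.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson's rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def surface_area_about_x_axis(y, dx, dy, a, b):
    """Approximate S = 2*pi * integral of y(t) * sqrt(x'(t)^2 + y'(t)^2) dt.

    Assumes y(t) >= 0 on [a, b]; dx and dy are the derivatives x'(t) and y'(t).
    """
    return 2 * math.pi * simpson(lambda t: y(t) * math.hypot(dx(t), dy(t)), a, b)

radius = 2.0  # illustrative value
sphere_area = surface_area_about_x_axis(
    y=lambda t: radius * math.sin(t),
    dx=lambda t: -radius * math.sin(t),
    dy=lambda t: radius * math.cos(t),
    a=0.0,
    b=math.pi,
)
print(sphere_area, 4 * math.pi * radius**2)
```

Passing the derivatives explicitly keeps the sketch free of numerical differentiation, at the cost of requiring the caller to supply them.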

 Exercise 10.2.6

Find the surface area generated when the plane curve defined by the equations
\[ x(t) = t^3, \quad y(t) = t^2, \quad \text{for } 0 \le t \le 1 \]

is revolved around the x-axis.

Hint
Use Equation 10.2.9. When evaluating the integral, use a u -substitution.

Answer
\[ A = \frac{\pi\left(494\sqrt{13} + 128\right)}{1215} \text{ units}^2 \]

Key Concepts
The derivative of the parametrically defined curve \(x = x(t)\) and \(y = y(t)\) can be calculated using the formula \(\frac{dy}{dx} = \frac{y'(t)}{x'(t)}\). Using
the derivative, we can find the equation of a tangent line to a parametric curve.

The area between a parametric curve and the x-axis can be determined by using the formula \(A = \int_{t_1}^{t_2} y(t)\,x'(t)\,dt\).

The arc length of a parametric curve can be calculated by using the formula

\[ s = \int_{t_1}^{t_2} \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2}\,dt. \]

The surface area of a volume of revolution revolved around the x-axis is given by

\[ S = 2\pi \int_a^b y(t)\sqrt{(x'(t))^2 + (y'(t))^2}\,dt. \]

If the curve is revolved around the y-axis, then the formula is

\[ S = 2\pi \int_a^b x(t)\sqrt{(x'(t))^2 + (y'(t))^2}\,dt. \]

Key Equations
Derivative of parametric equations
\[ \frac{dy}{dx} = \frac{dy/dt}{dx/dt} = \frac{y'(t)}{x'(t)} \]

Second-order derivative of parametric equations

\[ \frac{d^2 y}{dx^2} = \frac{d}{dx}\left(\frac{dy}{dx}\right) = \frac{(d/dt)(dy/dx)}{dx/dt} \]

Area under a parametric curve

\[ A = \int_a^b y(t)\,x'(t)\,dt \]

Arc length of a parametric curve

\[ s = \int_{t_1}^{t_2} \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2}\,dt \]

Surface area generated by a parametric curve

\[ S = 2\pi \int_a^b y(t)\sqrt{(x'(t))^2 + (y'(t))^2}\,dt \]

10.2: Calculus with Parametric Curves is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
11.2: Calculus of Parametric Curves by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0. Original source:
https://openstax.org/details/books/calculus-volume-1.

10.3: Polar Coordinates
 Learning Objectives
Locate points in a plane by using polar coordinates.
Convert points between rectangular and polar coordinates.
Sketch polar curves from given equations.
Convert equations between rectangular and polar coordinates.
Identify symmetry in polar curves and equations.

The rectangular coordinate system (or Cartesian plane) provides a means of mapping points to ordered pairs and ordered pairs to
points. This is called a one-to-one mapping from points in the plane to ordered pairs. The polar coordinate system provides an
alternative method of mapping points to ordered pairs. In this section we see that in some circumstances, polar coordinates can be
more useful than rectangular coordinates.

Defining Polar Coordinates


To find the coordinates of a point in the polar coordinate system, consider Figure 10.3.1. The point P has Cartesian coordinates
(x, y). The line segment connecting the origin to the point P measures the distance from the origin to P and has length r . The

angle between the positive x-axis and the line segment has measure θ . This observation suggests a natural correspondence between
the coordinate pair (x, y) and the values r and θ . This correspondence is the basis of the polar coordinate system. Note that every
point in the Cartesian plane has two values (hence the term ordered pair) associated with it. In the polar coordinate system, each
point also has two values associated with it: r and θ .

Figure 10.3.1 : An arbitrary point in the Cartesian plane.


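Using right-triangle trigonometry, the following equations are true for the point \(P\):

\[ \cos\theta = \frac{x}{r} \quad \text{so} \quad x = r\cos\theta \]

\[ \sin\theta = \frac{y}{r} \quad \text{so} \quad y = r\sin\theta. \]

Furthermore,

\[ r^2 = x^2 + y^2 \]

and

\[ \tan\theta = \frac{y}{x}. \]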

Each point (x, y) in the Cartesian coordinate system can therefore be represented as an ordered pair (r, θ) in the polar coordinate
system. The first coordinate is called the radial coordinate and the second coordinate is called the angular coordinate. Every
point in the plane can be represented in this form.
Note that the equation \(\tan\theta = y/x\) has an infinite number of solutions for any ordered pair \((x, y)\). However, if we restrict the
solutions to values between \(0\) and \(2\pi\), then we can assign a unique solution to the quadrant in which the original point \((x, y)\) is
located. Then the corresponding value of \(r\) is positive, so \(r^2 = x^2 + y^2\).

10.3.1 https://math.libretexts.org/@go/page/4507

 Converting Points between Coordinate Systems

Given a point \(P\) in the plane with Cartesian coordinates \((x, y)\) and polar coordinates \((r, \theta)\), the following conversion formulas
hold true:

\[ x = r\cos\theta \tag{10.3.1} \]

\[ y = r\sin\theta \tag{10.3.2} \]

and

\[ r^2 = x^2 + y^2 \tag{10.3.3} \]

\[ \tan\theta = \frac{y}{x}. \tag{10.3.4} \]

These formulas can be used to convert from rectangular to polar or from polar to rectangular coordinates. Notice that Equation
10.3.3 is the Pythagorean theorem (Figure 10.3.1).

 Example 10.3.1: Converting between Rectangular and Polar Coordinates

Convert each of the following points into polar coordinates.


a. (1, 1)
b. (−3, 4)
c. (0, 3)

d. (5√3, −5)
Convert each of the following points into rectangular coordinates.
e. (3, π/3)
f. (2, 3π/2)
g. (6, −5π/6)
Solution
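a. Use \(x = 1\) and \(y = 1\) in Equation 10.3.3:

\[ r^2 = x^2 + y^2 = 1^2 + 1^2 \quad \Rightarrow \quad r = \sqrt{2} \]

and via Equation 10.3.4

\[ \tan\theta = \frac{y}{x} = \frac{1}{1} = 1 \quad \Rightarrow \quad \theta = \frac{\pi}{4}. \]

Therefore this point can be represented as \(\left(\sqrt{2}, \frac{\pi}{4}\right)\) in polar coordinates.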

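b. Use \(x = -3\) and \(y = 4\) in Equation 10.3.3:

\[ r^2 = x^2 + y^2 = (-3)^2 + (4)^2 \quad \Rightarrow \quad r = 5 \]

and via Equation 10.3.4

\[ \tan\theta = \frac{y}{x} = -\frac{4}{3}. \]

The point lies in the second quadrant, whereas the arctangent returns an angle in the fourth quadrant, so add \(\pi\):

\[ \theta = \arctan\left(-\frac{4}{3}\right) + \pi \approx 2.21. \]

Therefore this point can be represented as \((5, 2.21)\) in polar coordinates.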


c. Use \(x = 0\) and \(y = 3\) in Equation 10.3.3:

\[ r^2 = x^2 + y^2 = (0)^2 + (3)^2 = 0 + 9 \quad \Rightarrow \quad r = 3 \]

and via Equation 10.3.4

\[ \tan\theta = \frac{y}{x} = \frac{3}{0}. \]

Direct application of Equation 10.3.4 leads to division by zero. Graphing the point \((0, 3)\) on the rectangular
coordinate system reveals that the point is located on the positive y-axis. The angle between the positive x-axis and the
positive y-axis is \(\frac{\pi}{2}\). Therefore this point can be represented as \(\left(3, \frac{\pi}{2}\right)\) in polar coordinates.

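d. Use \(x = 5\sqrt{3}\) and \(y = -5\) in Equation 10.3.3:

\[ r^2 = x^2 + y^2 = (5\sqrt{3})^2 + (-5)^2 = 75 + 25 \quad \Rightarrow \quad r = 10 \]

and via Equation 10.3.4

\[ \tan\theta = \frac{y}{x} = \frac{-5}{5\sqrt{3}} = -\frac{\sqrt{3}}{3} \quad \Rightarrow \quad \theta = -\frac{\pi}{6}. \]

The point lies in the fourth quadrant, which is consistent with this angle. Therefore this point can be represented as \(\left(10, -\frac{\pi}{6}\right)\) in polar coordinates.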

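e. Use \(r = 3\) and \(\theta = \frac{\pi}{3}\) in Equation 10.3.1:

\[ x = r\cos\theta = 3\cos\left(\frac{\pi}{3}\right) = 3\left(\frac{1}{2}\right) = \frac{3}{2} \]

and

\[ y = r\sin\theta = 3\sin\left(\frac{\pi}{3}\right) = 3\left(\frac{\sqrt{3}}{2}\right) = \frac{3\sqrt{3}}{2}. \]

Therefore this point can be represented as \(\left(\frac{3}{2}, \frac{3\sqrt{3}}{2}\right)\) in rectangular coordinates.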


f. Use \(r = 2\) and \(\theta = \frac{3\pi}{2}\) in Equation 10.3.1:

\[ x = r\cos\theta = 2\cos\left(\frac{3\pi}{2}\right) = 2(0) = 0 \]

and

\[ y = r\sin\theta = 2\sin\left(\frac{3\pi}{2}\right) = 2(-1) = -2. \]

Therefore this point can be represented as \((0, -2)\) in rectangular coordinates.



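g. Use \(r = 6\) and \(\theta = -\frac{5\pi}{6}\) in Equation 10.3.1:

\[ x = r\cos\theta = 6\cos\left(-\frac{5\pi}{6}\right) = 6\left(-\frac{\sqrt{3}}{2}\right) = -3\sqrt{3} \]

and

\[ y = r\sin\theta = 6\sin\left(-\frac{5\pi}{6}\right) = 6\left(-\frac{1}{2}\right) = -3. \]

Therefore this point can be represented as \((-3\sqrt{3}, -3)\) in rectangular coordinates.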
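The conversions in this example can be sketched in a few lines of code. The function names `to_polar` and `to_rectangular` are illustrative choices, not from the text. The sketch relies on the standard-library function `math.atan2`, which inspects the signs of both coordinates and therefore picks the correct quadrant (and handles \(x = 0\)) without the manual adjustment needed in parts b and c. Note that it returns angles in \((-\pi, \pi]\) rather than \([0, 2\pi)\).

```python
import math

def to_polar(x, y):
    """Return (r, theta) with r >= 0 and theta in (-pi, pi]."""
    return math.hypot(x, y), math.atan2(y, x)

def to_rectangular(r, theta):
    """Return (x, y) using x = r cos(theta), y = r sin(theta)."""
    return r * math.cos(theta), r * math.sin(theta)

polar_b = to_polar(-3, 4)                    # part b: second quadrant
polar_c = to_polar(0, 3)                     # part c: on the positive y-axis
polar_d = to_polar(5 * math.sqrt(3), -5)     # part d: fourth quadrant
rect_g = to_rectangular(6, -5 * math.pi / 6) # part g

for label, value in [("b", polar_b), ("c", polar_c), ("d", polar_d), ("g", rect_g)]:
    print(label, tuple(round(component, 4) for component in value))
```

If angles in \([0, 2\pi)\) are preferred, adding \(2\pi\) to any negative result gives an equivalent representation of the same point.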

 Exercise 10.3.1

Convert \((-8, -8)\) into polar coordinates and \(\left(4, \frac{2\pi}{3}\right)\) into rectangular coordinates.

Hint
Use Equation 10.3.3 and Equation 10.3.1. Make sure to check the quadrant when calculating \(\theta\).

Answer
\(\left(8\sqrt{2}, \frac{5\pi}{4}\right)\) and \((-2, 2\sqrt{3})\)

The polar representation of a point is not unique. For example, the polar coordinates \(\left(2, \frac{\pi}{3}\right)\) and \(\left(2, \frac{7\pi}{3}\right)\) both represent the point
\((1, \sqrt{3})\) in the rectangular system. Also, the value of \(r\) can be negative. Therefore, the point with polar coordinates \(\left(-2, \frac{4\pi}{3}\right)\) also
represents the point \((1, \sqrt{3})\) in the rectangular system, as we can see by using Equation 10.3.1:

\[ x = r\cos\theta = -2\cos\left(\frac{4\pi}{3}\right) = -2\left(-\frac{1}{2}\right) = 1 \]

and

\[ y = r\sin\theta = -2\sin\left(\frac{4\pi}{3}\right) = -2\left(-\frac{\sqrt{3}}{2}\right) = \sqrt{3}. \]

Every point in the plane has an infinite number of representations in polar coordinates. However, each point in the plane has only
one representation in the rectangular coordinate system.
Note that the polar representation of a point in the plane also has a visual interpretation. In particular, r is the directed distance that
the point lies from the origin, and θ measures the angle that the line segment from the origin to the point makes with the positive x-
axis. Positive angles are measured in a counterclockwise direction and negative angles are measured in a clockwise direction. The
polar coordinate system appears in Figure 10.3.2.

Figure 10.3.2 : The polar coordinate system.


The line segment starting from the center of the graph going to the right (called the positive x-axis in the Cartesian system) is the
polar axis. The center point is the pole, or origin, of the coordinate system, and corresponds to r = 0 . The innermost circle shown
in Figure 10.3.2 contains all points a distance of 1 unit from the pole, and is represented by the equation r = 1 . Then r = 2 is the

set of points 2 units from the pole, and so on. The line segments emanating from the pole correspond to fixed angles. To plot a
point in the polar coordinate system, start with the angle. If the angle is positive, then measure the angle from the polar axis in a
counterclockwise direction. If it is negative, then measure it clockwise. If the value of r is positive, move that distance along the
terminal ray of the angle. If it is negative, move along the ray that is opposite the terminal ray of the given angle.

 Example 10.3.2: Plotting Points in the Polar Plane

Plot each of the following points on the polar plane.


π
a. (2, )
4

b. (−3, )
3

c. (4, )
4

Solution
The three points are plotted in Figure 10.3.3.

Figure 10.3.3 : Three points plotted in the polar coordinate system.

 Exercise 10.3.2
Plot \(\left(4, \frac{5\pi}{3}\right)\) and \(\left(-3, -\frac{7\pi}{2}\right)\) on the polar plane.

Hint
Start with θ , then use r .

Answer

Polar Curves
Now that we know how to plot points in the polar coordinate system, we can discuss how to plot curves. In the rectangular
coordinate system, we can graph a function y = f (x) and create a curve in the Cartesian plane. In a similar fashion, we can graph a
curve that is generated by a function r = f (θ) .
The general idea behind graphing a function in polar coordinates is the same as graphing a function in rectangular coordinates.
Start with a list of values for the independent variable (θ in this case) and calculate the corresponding values of the dependent
variable r. This process generates a list of ordered pairs, which can be plotted in the polar coordinate system. Finally, connect the
points, and take advantage of any patterns that may appear. The function may be periodic, for example, which indicates that only a
limited number of values for the independent variable are needed.

 Problem-Solving Strategy: Plotting a Curve in Polar Coordinates


1. Create a table with two columns. The first column is for θ , and the second column is for r.
2. Create a list of values for θ .
3. Calculate the corresponding r values for each θ .
4. Plot each ordered pair (r, θ) on the coordinate axes.
5. Connect the points and look for a pattern.

 Example 10.3.3: Graphing a Function in Polar Coordinates

Graph the curve defined by the function r = 4 sin θ . Identify the curve and rewrite the equation in rectangular coordinates.
Solution
Because the function is a multiple of a sine function, it is periodic with period \(2\pi\), so use values for \(\theta\) between \(0\) and \(2\pi\). The
results of steps 1–3 appear in the following table. Figure 10.3.4 shows the graph based on this table.

θ        r = 4 sin θ        θ        r = 4 sin θ

0        0                  π        0
π/6      2                  7π/6     −2
π/4      2√2 ≈ 2.8          5π/4     −2√2 ≈ −2.8
π/3      2√3 ≈ 3.5          4π/3     −2√3 ≈ −3.5
π/2      4                  3π/2     −4
2π/3     2√3 ≈ 3.5          5π/3     −2√3 ≈ −3.5
3π/4     2√2 ≈ 2.8          7π/4     −2√2 ≈ −2.8
5π/6     2                  11π/6    −2
                            2π       0

Figure 10.3.4 : The graph of the function r = 4 sin θ is a circle.


This is the graph of a circle. The equation \(r = 4\sin\theta\) can be converted into rectangular coordinates by first multiplying both
sides by \(r\). This gives the equation \(r^2 = 4r\sin\theta\). Next use the facts that \(r^2 = x^2 + y^2\) and \(y = r\sin\theta\). This gives
\(x^2 + y^2 = 4y\). To put this equation into standard form, subtract \(4y\) from both sides of the equation and complete the square:

\[ \begin{aligned}
x^2 + y^2 - 4y &= 0 \\
x^2 + (y^2 - 4y) &= 0 \\
x^2 + (y^2 - 4y + 4) &= 0 + 4 \\
x^2 + (y - 2)^2 &= 4.
\end{aligned} \]

This is the equation of a circle with radius 2 and center (0, 2) in the rectangular coordinate system.
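Steps 1–3 of the problem-solving strategy, together with the conversion to rectangular coordinates, can be checked with a short script. This is a sketch under assumptions: the variable names and the sampling step of \(\pi/12\) are arbitrary choices for illustration. It tabulates \(r = 4\sin\theta\) and confirms that every sampled point satisfies \(x^2 + (y - 2)^2 = 4\).

```python
import math

# Steps 1-3: tabulate theta and the corresponding r = 4 sin(theta).
thetas = [k * math.pi / 12 for k in range(25)]  # 0, pi/12, ..., 2*pi
table = [(theta, 4 * math.sin(theta)) for theta in thetas]

def residual(theta, r):
    """Left side minus right side of x^2 + (y - 2)^2 = 4 at the polar point (r, theta)."""
    x, y = r * math.cos(theta), r * math.sin(theta)
    return x**2 + (y - 2)**2 - 4

max_residual = max(abs(residual(theta, r)) for theta, r in table)

for theta, r in table[:7]:
    print(f"theta = {theta:.4f}   r = {r:.4f}")
print("largest residual:", max_residual)
```

A residual at the level of floating-point rounding indicates that the sampled polar points all lie on the circle of radius 2 centered at \((0, 2)\).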

 Exercise 10.3.3

Create a graph of the curve defined by the function r = 4 + 4 cos θ .

Hint
Follow the problem-solving strategy for creating a graph in polar coordinates.

Answer
The name of this shape is a cardioid, which we will study further later in this section.

The graph in Example 10.3.3 was that of a circle. The equation of the circle can be transformed into rectangular coordinates using
the coordinate transformation formulas in Equation 10.3.1. Example 10.3.4 gives some more examples of functions for
transforming from polar to rectangular coordinates.

 Example 10.3.4: Transforming Polar Equations to Rectangular Coordinates

Rewrite each of the following equations in rectangular coordinates and identify the graph.
a. \(\theta = \frac{\pi}{3}\)
b. r = 3
c. r = 6 cos θ − 8 sin θ
Solution:

a. Take the tangent of both sides. This gives \(\tan\theta = \tan(\pi/3) = \sqrt{3}\). Since \(\tan\theta = y/x\), we can replace the left-hand side of
this equation by \(y/x\). This gives \(y/x = \sqrt{3}\), which can be rewritten as \(y = x\sqrt{3}\). This is the equation of a straight line passing
through the origin with slope \(\sqrt{3}\). In general, any polar equation of the form \(\theta = K\) represents a straight line through the pole
with slope equal to \(\tan K\).
b. First, square both sides of the equation. This gives \(r^2 = 9\). Next replace \(r^2\) with \(x^2 + y^2\). This gives the equation
\(x^2 + y^2 = 9\), which is the equation of a circle centered at the origin with radius 3. In general, any polar equation of the form
\(r = k\) where \(k\) is a positive constant represents a circle of radius \(k\) centered at the origin. (Note: when squaring both sides of an
equation it is possible to introduce new points unintentionally. This should always be taken into consideration. However, in this
case we do not introduce new points. For example, \(\left(-3, \frac{\pi}{3}\right)\) is the same point as \(\left(3, \frac{4\pi}{3}\right)\).)

c. Multiply both sides of the equation by \(r\). This leads to \(r^2 = 6r\cos\theta - 8r\sin\theta\). Next use the formulas

\[ r^2 = x^2 + y^2, \quad x = r\cos\theta, \quad y = r\sin\theta. \]

This gives

\[ \begin{aligned}
r^2 &= 6(r\cos\theta) - 8(r\sin\theta) \\
x^2 + y^2 &= 6x - 8y.
\end{aligned} \]

To put this equation into standard form, first move the variables from the right-hand side of the equation to the left-hand side,
then complete the square.

\[ \begin{aligned}
x^2 + y^2 &= 6x - 8y \\
x^2 - 6x + y^2 + 8y &= 0 \\
(x^2 - 6x) + (y^2 + 8y) &= 0 \\
(x^2 - 6x + 9) + (y^2 + 8y + 16) &= 9 + 16 \\
(x - 3)^2 + (y + 4)^2 &= 25.
\end{aligned} \]

This is the equation of a circle with center at \((3, -4)\) and radius 5. Notice that the circle passes through the origin, since the
center is 5 units from the origin, which equals the radius.

 Exercise 10.3.4

Rewrite the equation r = sec θ tan θ in rectangular coordinates and identify its graph.

Hint
Convert to sine and cosine, then multiply both sides by cosine.

Answer
\(y = x^2\), which is the equation of a parabola opening upward.
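A possible route to this answer, following the hint and offered only as a sketch, uses \(x = r\cos\theta\) and \(\tan\theta = y/x\) (valid where \(\cos\theta \neq 0\), which is where the original equation is defined):

```latex
\begin{aligned}
r &= \sec\theta\,\tan\theta = \frac{\sin\theta}{\cos^2\theta} \\
r\cos\theta &= \frac{\sin\theta}{\cos\theta} = \tan\theta \\
x &= \frac{y}{x} \\
y &= x^2.
\end{aligned}
```

The origin, where \(x = 0\), is covered separately: \(\theta = 0\) gives \(r = 0\), and \((0, 0)\) satisfies \(y = x^2\).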

We have now seen several examples of drawing graphs of curves defined by polar equations. A summary of some common curves
is given in the tables below. In each equation, a and b are arbitrary constants.

Figure 10.3.5

Figure 10.3.6
A cardioid is a special case of a limaçon (pronounced “lee-mah-son”), in which a = b or a = −b . The rose is a very interesting
curve. Notice that the graph of r = 3 sin 2θ has four petals. However, the graph of r = 3 sin 3θ has three petals as shown.

Figure 10.3.7 : Graph of r = 3 sin 3θ .
If the coefficient of θ is even, the graph has twice as many petals as the coefficient. If the coefficient of θ is odd, then the number
of petals equals the coefficient. You are encouraged to explore why this happens. Even more interesting graphs emerge when the
coefficient of θ is not an integer. For example, if it is rational, then the curve is closed; that is, it eventually ends where it started
(Figure 10.3.8a). However, if the coefficient is irrational, then the curve never closes (Figure 10.3.8b). Although it may appear that
the curve is closed, a closer examination reveals that the petals just above the positive x axis are slightly thicker. This is because the
petal does not quite match up with the starting point.

Figure 10.3.8 : Polar rose graphs of functions with (a) rational coefficient and (b) irrational coefficient. Note that the rose in part (b)
would actually fill the entire circle if plotted in full.
Since the curve defined by the graph of r = 3 sin(πθ) never closes, the curve depicted in Figure 10.3.8b is only a partial depiction.
In fact, this is an example of a space-filling curve. A space-filling curve is one that in fact occupies a two-dimensional subset of
the real plane. In this case the curve occupies the circle of radius 3 centered at the origin.

 Example 10.3.5: Describing a Spiral

Recall the chambered nautilus introduced in the chapter prelude. This creature displays a spiral when half the outer shell is
cut away. It is possible to describe a spiral using rectangular coordinates. Figure 10.3.9 shows a spiral in rectangular
coordinates. How can we describe this curve mathematically?

Figure 10.3.9 : How can we describe a spiral graph mathematically?
Solution
As the point P travels around the spiral in a counterclockwise direction, its distance d from the origin increases. Assume that
the distance d is a constant multiple k of the angle θ that the line segment OP makes with the positive x-axis. Therefore
d(P , O) = kθ , where O is the origin. Now use the distance formula and some trigonometry:

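\[ \begin{aligned}
d(P, O) &= k\theta \\
\sqrt{(x - 0)^2 + (y - 0)^2} &= k\arctan\left(\frac{y}{x}\right) \\
\sqrt{x^2 + y^2} &= k\arctan\left(\frac{y}{x}\right) \\
\arctan\left(\frac{y}{x}\right) &= \frac{\sqrt{x^2 + y^2}}{k} \\
y &= x\tan\left(\frac{\sqrt{x^2 + y^2}}{k}\right).
\end{aligned} \]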

Although this equation describes the spiral, it is not possible to solve it directly for either x or y. However, if we use polar
coordinates, the equation becomes much simpler. In particular, d(P , O) = r , and θ is the second coordinate. Therefore the
equation for the spiral becomes r = kθ . Note that when θ = 0 we also have r = 0 , so the spiral emanates from the origin. We
can remove this restriction by adding a constant to the equation. Then the equation for the spiral becomes r = a + kθ for
arbitrary constants a and k . This is referred to as an Archimedean spiral, after the Greek mathematician Archimedes.
Another type of spiral is the logarithmic spiral, described by the function \(r = a \cdot b^{\theta}\). A graph of the function \(r = 1.2(1.25^{\theta})\) is
given in Figure 10.3.10. This spiral describes the shell shape of the chambered nautilus.

Figure 10.3.10: A logarithmic spiral is similar to the shape of the chambered nautilus shell. (credit: modification of work by
Jitze Couperus, Flickr)

Suppose a curve is described in the polar coordinate system via the function \(r = f(\theta)\). Since we have conversion formulas from
polar to rectangular coordinates given by

\[ x = r\cos\theta \]

\[ y = r\sin\theta, \]

it is possible to rewrite these formulas using the function

\[ x = f(\theta)\cos\theta \]

\[ y = f(\theta)\sin\theta. \]

This step gives a parameterization of the curve in rectangular coordinates using \(\theta\) as the parameter. For example, the Archimedean spiral
of Example 10.3.5, written here as \(r = a + b\theta\), becomes

\[ x = (a + b\theta)\cos\theta \]

\[ y = (a + b\theta)\sin\theta. \]

Letting θ range from −∞ to ∞ generates the entire spiral.
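The parameterization translates directly into code. In this sketch the helper name `polar_to_parametric` and the constants \(a = 1\), \(b = 0.5\) are illustrative assumptions. Any polar function \(r = f(\theta)\) is wrapped into a function that returns rectangular points.

```python
import math

def polar_to_parametric(f):
    """Turn r = f(theta) into a function theta -> (x, y) = (f cos, f sin)."""
    def point(theta):
        r = f(theta)
        return r * math.cos(theta), r * math.sin(theta)
    return point

a, b = 1.0, 0.5  # illustrative spiral constants
spiral = polar_to_parametric(lambda theta: a + b * theta)

# Sample the spiral every quarter turn over one full revolution.
points = [spiral(k * math.pi / 2) for k in range(5)]
for k, (x, y) in enumerate(points):
    print(f"theta = {k}*pi/2   x = {x:.4f}   y = {y:.4f}")
```

A list of such points, sampled more finely, is what a plotting routine would need in order to draw the curve in the Cartesian plane.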

Symmetry in Polar Coordinates


When studying symmetry of functions in rectangular coordinates (i.e., in the form y = f (x)), we talk about symmetry with respect
to the y-axis and symmetry with respect to the origin. In particular, if f (−x) = f (x) for all x in the domain of f , then f is an even
function and its graph is symmetric with respect to the y-axis. If f (−x) = −f (x) for all x in the domain of f , then f is an odd
function and its graph is symmetric with respect to the origin. By determining which types of symmetry a graph exhibits, we can
learn more about the shape and appearance of the graph. Symmetry can also reveal other properties of the function that generates
the graph. Symmetry in polar curves works in a similar fashion.

 Symmetry in Polar Curves and Equations

Consider a curve generated by the function r = f (θ) in polar coordinates.


i. The curve is symmetric about the polar axis if for every point (r, θ) on the graph, the point (r, −θ) is also on the graph.
Similarly, the equation r = f (θ) is unchanged by replacing θ with −θ .
ii. The curve is symmetric about the pole if for every point (r, θ) on the graph, the point (r, π + θ) is also on the graph.
Similarly, the equation r = f (θ) is unchanged when replacing r with −r, or θ with π + θ.

iii. The curve is symmetric about the vertical line \(\theta = \frac{\pi}{2}\) if for every point \((r, \theta)\) on the graph, the point \((r, \pi - \theta)\) is also on
the graph. Similarly, the equation \(r = f(\theta)\) is unchanged when \(\theta\) is replaced by \(\pi - \theta\).

The following table shows examples of each type of symmetry.

 Example 10.3.6: Using Symmetry to Graph a Polar Equation

Find the symmetry of the rose defined by the equation r = 3 sin(2θ) and create a graph.
Solution
Suppose the point (r, θ) is on the graph of r = 3 sin(2θ).
i. To test for symmetry about the polar axis, first try replacing θ with −θ . This gives r = 3 sin(2(−θ)) = −3 sin(2θ) .
Since this changes the original equation, this test is not satisfied. However, returning to the original equation and
replacing r with −r and θ with π − θ yields

\[ \begin{aligned}
-r &= 3\sin(2(\pi - \theta)) \\
-r &= 3\sin(2\pi - 2\theta) \\
-r &= 3\sin(-2\theta) \\
-r &= -3\sin 2\theta.
\end{aligned} \]

Multiplying both sides of this equation by −1 gives r = 3 sin 2θ , which is the original equation. This demonstrates that
the graph is symmetric with respect to the polar axis.
ii. To test for symmetry with respect to the pole, first replace r with −r, which yields −r = 3 sin(2θ) . Multiplying both
sides by −1 gives r = −3 sin(2θ) , which does not agree with the original equation. Therefore the equation does not pass
the test for this symmetry. However, returning to the original equation and replacing θ with θ + π gives
\[ \begin{aligned}
r &= 3\sin(2(\theta + \pi)) \\
&= 3\sin(2\theta + 2\pi) \\
&= 3(\sin 2\theta\cos 2\pi + \cos 2\theta\sin 2\pi) \\
&= 3\sin 2\theta.
\end{aligned} \]

Since this agrees with the original equation, the graph is symmetric about the pole.
iii. To test for symmetry with respect to the vertical line \(\theta = \frac{\pi}{2}\), first replace both \(r\) with \(-r\) and \(\theta\) with \(-\theta\).

\[ \begin{aligned}
-r &= 3\sin(2(-\theta)) \\
-r &= 3\sin(-2\theta) \\
-r &= -3\sin 2\theta.
\end{aligned} \]

Multiplying both sides of this equation by \(-1\) gives \(r = 3\sin 2\theta\), which is the original equation. Therefore the graph is
symmetric about the vertical line \(\theta = \frac{\pi}{2}\).

This graph has symmetry with respect to the polar axis, the origin, and the vertical line going through the pole. To graph the
function, tabulate values of θ between 0 and π/2 and then reflect the resulting graph.

θ        r

0        0
π/6      3√3/2 ≈ 2.6
π/4      3
π/3      3√3/2 ≈ 2.6
π/2      0

This gives one petal of the rose, as shown in the following graph.

Figure 10.3.11: The graph of the equation between θ = 0 and θ = π/2.
Reflecting this image into the other three quadrants gives the entire graph as shown.

Figure 10.3.12: The entire graph of the equation is called a four-petaled rose.
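The substitution tests used in this example can be mimicked numerically. The sketch below is an illustration, not a proof: the function names are hypothetical, and the equations are compared only at sampled angles. Each test is sufficient but not necessary for the symmetry, and the "replace \(r\) with \(-r\)" variant of the pole test is omitted because it is not informative for an equation of the form \(r = f(\theta)\).

```python
import math

def _matches(g, h, samples=360, tol=1e-9):
    """True if g(theta) and h(theta) agree at evenly spaced sample angles."""
    return all(abs(g(t) - h(t)) < tol
               for t in (2 * math.pi * k / samples for k in range(samples)))

def symmetry_tests(f):
    """Apply the substitution tests for r = f(theta) at sampled angles."""
    # Polar axis: theta -> -theta, or (r, theta) -> (-r, pi - theta).
    polar_axis = (_matches(f, lambda t: f(-t))
                  or _matches(f, lambda t: -f(math.pi - t)))
    # Pole: theta -> theta + pi.
    pole = _matches(f, lambda t: f(t + math.pi))
    # Vertical line theta = pi/2: theta -> pi - theta, or (r, theta) -> (-r, -theta).
    vertical = (_matches(f, lambda t: f(math.pi - t))
                or _matches(f, lambda t: -f(-t)))
    return {"polar axis": polar_axis, "pole": pole, "vertical line": vertical}

rose = symmetry_tests(lambda t: 3 * math.sin(2 * t))
three_petal = symmetry_tests(lambda t: 2 * math.cos(3 * t))
print("r = 3 sin(2 theta):", rose)
print("r = 2 cos(3 theta):", three_petal)
```

For \(r = 3\sin(2\theta)\) all three tests pass, in agreement with the example, while \(r = 2\cos(3\theta)\) passes only the polar-axis test.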

 Exercise 10.3.5Symmetry

Determine the symmetry of the graph determined by the equation r = 2 cos(3θ) and create a graph.

Hint
Use the tests listed under Symmetry in Polar Curves and Equations.

Answer
Symmetric with respect to the polar axis.

Key Concepts
The polar coordinate system provides an alternative way to locate points in the plane.
Convert points between rectangular and polar coordinates using the formulas

x = r cos θ and y = r sin θ

and
\[ r = \sqrt{x^2 + y^2} \quad \text{and} \quad \tan\theta = \frac{y}{x}. \]

To sketch a polar curve from a given polar function, make a table of values and take advantage of periodic properties.
Use the conversion formulas to convert equations between rectangular and polar coordinates.
Identify symmetry in polar curves, which can occur through the pole, the horizontal axis, or the vertical axis.
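The conversion formulas listed above can be sketched in Python as follows. The function names are illustrative. Note one assumption that goes beyond the formulas: since tan θ = y/x alone does not determine the quadrant, this sketch uses `math.atan2`, which returns an angle in (−π, π] in the correct quadrant.

```python
import math

def polar_to_rect(r, theta):
    """Apply x = r cos θ and y = r sin θ."""
    return (r * math.cos(theta), r * math.sin(theta))

def rect_to_polar(x, y):
    """Apply r = sqrt(x^2 + y^2); the angle satisfies tan θ = y/x."""
    r = math.hypot(x, y)
    theta = math.atan2(y, x)  # chooses the quadrant from the signs of x and y
    return (r, theta)

x, y = polar_to_rect(2, math.pi / 3)
r, theta = rect_to_polar(-1, -1)
```

Other conventions for θ (for example, angles in [0, 2π)) describe the same point, because polar representations are not unique.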

Glossary
angular coordinate
θ the angle formed by a line segment connecting the origin to a point in the polar coordinate system with the positive radial (x) axis, measured counterclockwise

cardioid
a plane curve traced by a point on the perimeter of a circle that is rolling around a fixed circle of the same radius; the equation
of a cardioid is r = a(1 + sin θ) or r = a(1 + cos θ)

limaçon
the graph of the equation r = a + b sin θ or r = a + b cos θ. If a = b then the graph is a cardioid

polar axis
the horizontal axis in the polar coordinate system corresponding to r ≥ 0

polar coordinate system


a system for locating points in the plane. The coordinates are r , the radial coordinate, and θ , the angular coordinate

polar equation
an equation or function relating the radial coordinate to the angular coordinate in the polar coordinate system

pole
the central point of the polar coordinate system, equivalent to the origin of a Cartesian system

radial coordinate
r the coordinate in the polar coordinate system that measures the distance from a point in the plane to the pole

rose
graph of the polar equation r = a cos 2θ or r = a sin 2θ for a positive constant a

space-filling curve
a curve that completely occupies a two-dimensional subset of the real plane

10.3: Polar Coordinates is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
11.3: Polar Coordinates by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0. Original source:
https://openstax.org/details/books/calculus-volume-1.

10.4: Areas and Lengths in Polar Coordinates
 Learning Objectives
Apply the formula for area of a region in polar coordinates.
Determine the arc length of a polar curve.

In the rectangular coordinate system, the definite integral provides a way to calculate the area under a curve. In particular, if we
have a function y = f (x) defined from x = a to x = b where f (x) > 0 on this interval, the area between the curve and the x-axis
is given by
\[ A = \int_a^b f(x)\,dx. \]

This fact, along with the formula for evaluating this integral, is summarized in the Fundamental Theorem of Calculus. Similarly,
the arc length of this curve is given by
\[ L = \int_a^b \sqrt{1 + (f'(x))^2}\,dx. \]

In this section, we study analogous formulas for area and arc length in the polar coordinate system.

Areas of Regions Bounded by Polar Curves


We have studied the formulas for area under a curve defined in rectangular coordinates and parametrically defined curves. Now we
turn our attention to deriving a formula for the area of a region bounded by a polar curve. Recall that the proof of the Fundamental
Theorem of Calculus used the concept of a Riemann sum to approximate the area under a curve by using rectangles. For polar
curves we use the Riemann sum again, but the rectangles are replaced by sectors of a circle.
Consider a curve defined by the function r = f(θ), where α ≤ θ ≤ β. Our first step is to partition the interval [α, β] into n equal-width subintervals. The width of each subinterval is given by the formula Δθ = (β − α)/n, and the ith partition point \(\theta_i\) is given by the formula \(\theta_i = \alpha + i\,\Delta\theta\). Each partition point \(\theta = \theta_i\) defines a line with slope \(\tan\theta_i\) passing through the pole as shown in the following graph.

10.4.1 https://math.libretexts.org/@go/page/4508
Figure 10.4.1 : A partition of a typical curve in polar coordinates.
The line segments are connected by arcs of constant radius. This defines sectors whose areas can be calculated by using a
geometric formula. The area of each sector is then used to approximate the area between successive line segments. We then sum the
areas of the sectors to approximate the total area. This approach gives a Riemann sum approximation for the total area. The formula
for the area of a sector of a circle is illustrated in the following figure.

Figure 10.4.2: The area of a sector of a circle is given by \(A = \frac{1}{2}\theta r^2\).

Recall that the area of a circle is \(A = \pi r^2\). When measuring angles in radians, 360 degrees is equal to 2π radians. Therefore a fraction of a circle can be measured by the central angle θ. The fraction of the circle is given by \(\frac{\theta}{2\pi}\), so the area of the sector is this fraction multiplied by the total area:

\[ A = \left(\frac{\theta}{2\pi}\right)\pi r^2 = \frac{1}{2}\theta r^2. \]

Since the radius of a typical sector in Figure 10.4.1 is given by \(r_i = f(\theta_i)\), the area of the ith sector is given by

\[ A_i = \frac{1}{2}(\Delta\theta)(f(\theta_i))^2. \]

Therefore a Riemann sum that approximates the area is given by

\[ A_n = \sum_{i=1}^{n} A_i \approx \sum_{i=1}^{n} \frac{1}{2}(\Delta\theta)(f(\theta_i))^2. \]

We take the limit as n → ∞ to get the exact area:

\[ A = \lim_{n\to\infty} A_n = \frac{1}{2}\int_\alpha^\beta (f(\theta))^2\,d\theta. \]

This gives the following theorem.

 Area of a Region Bounded by a Polar Curve

Suppose f is continuous and nonnegative on the interval α ≤ θ ≤ β with 0 < β − α ≤ 2π . The area of the region bounded by
the graph of r = f (θ) between the radial lines θ = α and θ = β is
\begin{align}
A &= \frac{1}{2}\int_\alpha^\beta [f(\theta)]^2\,d\theta \tag{10.4.1} \\
&= \frac{1}{2}\int_\alpha^\beta r^2\,d\theta. \tag{10.4.2}
\end{align}
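The sector construction behind this theorem can be tried numerically. The sketch below sums n sector areas \(\frac{1}{2}(\Delta\theta)(f(\theta_i))^2\) with \(\theta_i = \alpha + i\,\Delta\theta\); the function name `polar_area` and the default n are assumptions made for illustration.

```python
import math

def polar_area(f, alpha, beta, n=10_000):
    """Approximate (1/2) ∫ f(θ)^2 dθ on [alpha, beta] by summing n sectors."""
    d_theta = (beta - alpha) / n
    total = 0.0
    for i in range(1, n + 1):
        theta_i = alpha + i * d_theta          # ith partition point
        total += 0.5 * d_theta * f(theta_i) ** 2  # area of the ith sector
    return total

# Sanity check: r = 2 for 0 ≤ θ ≤ 2π is a circle of radius 2, with area 4π.
circle_area = polar_area(lambda theta: 2.0, 0.0, 2 * math.pi)
```

For a constant radius every sector has the same area, so the sum reproduces the familiar \(\pi r^2\); for a general f the sum only approaches the integral as n grows.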

 Example 10.4.1: Finding an Area of a Polar Region

Find the area of one petal of the rose defined by the equation r = 3 sin(2θ).
Solution
The graph of r = 3 sin(2θ) follows.

Figure 10.4.3 : The graph of r = 3 sin(2θ).


When θ = 0 we have r = 3 sin(2(0)) = 0 . The next value for which r = 0 is θ = π/2 . This can be seen by solving the
equation 3 sin(2θ) = 0 for θ . Therefore the values θ = 0 to θ = π/2 trace out the first petal of the rose. To find the area inside
this petal, use Equation 10.4.2 with f (θ) = 3 sin(2θ), α = 0, and β = π/2:
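\[
\begin{aligned}
A &= \frac{1}{2}\int_\alpha^\beta [f(\theta)]^2\,d\theta \\
&= \frac{1}{2}\int_0^{\pi/2} [3\sin(2\theta)]^2\,d\theta \\
&= \frac{1}{2}\int_0^{\pi/2} 9\sin^2(2\theta)\,d\theta.
\end{aligned}
\]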

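To evaluate this integral, use the formula \(\sin^2\alpha = (1 - \cos(2\alpha))/2\) with \(\alpha = 2\theta\):

\[
\begin{aligned}
A &= \frac{1}{2}\int_0^{\pi/2} 9\sin^2(2\theta)\,d\theta \\
&= \frac{9}{2}\int_0^{\pi/2} \frac{1 - \cos(4\theta)}{2}\,d\theta \\
&= \frac{9}{4}\int_0^{\pi/2} (1 - \cos(4\theta))\,d\theta \\
&= \frac{9}{4}\left(\theta - \frac{\sin(4\theta)}{4}\right)\Bigg|_0^{\pi/2} \\
&= \frac{9}{4}\left(\frac{\pi}{2} - \frac{\sin 2\pi}{4}\right) - \frac{9}{4}\left(0 - \frac{\sin 4(0)}{4}\right) \\
&= \frac{9\pi}{8}.
\end{aligned}
\]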
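As a rough numerical cross-check of this result, the integral for one petal can be estimated with a midpoint rule. This is only a sketch; the function name `petal_area` and the number of subintervals are arbitrary choices.

```python
import math

def petal_area(n=100_000):
    """Midpoint-rule estimate of (1/2) ∫_0^{π/2} [3 sin(2θ)]^2 dθ."""
    a, b = 0.0, math.pi / 2
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        theta = a + (i + 0.5) * h              # midpoint of the ith subinterval
        total += 0.5 * (3 * math.sin(2 * theta)) ** 2 * h
    return total

estimate = petal_area()
exact = 9 * math.pi / 8   # value obtained from the antiderivative
```

The estimate and the closed-form value agree to many decimal places, which is a quick way to catch an algebra slip in the half-angle step.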

 Exercise 10.4.1

Find the area inside the cardioid defined by the equation r = 1 − cos θ .

Hint
Use Equation 10.4.2. Be sure to determine the correct limits of integration before evaluating.

Answer
A = 3π/2

Example 10.4.1 involved finding the area inside one curve. We can also use Equation 10.4.2 to find the area between two polar
curves. However, we often need to find the points of intersection of the curves and determine which function defines the outer
curve or the inner curve between these two points.

 Example 10.4.2: Finding the Area between Two Polar Curves

Find the area outside the cardioid r = 2 + 2 sin θ and inside the circle r = 6 sin θ .
Solution
First draw a graph containing both curves as shown.

Figure 10.4.4 : The region between the curves r = 2 + 2 sin θ and r = 6 sin θ.
To determine the limits of integration, first find the points of intersection by setting the two functions equal to each other and
solving for θ :

\[
\begin{aligned}
6\sin\theta &= 2 + 2\sin\theta \\
4\sin\theta &= 2 \\
\sin\theta &= \frac{1}{2}.
\end{aligned}
\]

This gives the solutions \(\theta = \frac{\pi}{6}\) and \(\theta = \frac{5\pi}{6}\), which are the limits of integration. The circle r = 6 sin θ is the red graph, which is the outer function, and the cardioid r = 2 + 2 sin θ is the blue graph, which is the inner function. To calculate the area between the curves, start with the area inside the circle between \(\theta = \frac{\pi}{6}\) and \(\theta = \frac{5\pi}{6}\), then subtract the area inside the cardioid between \(\theta = \frac{\pi}{6}\) and \(\theta = \frac{5\pi}{6}\):

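\[
\begin{aligned}
A &= \text{circle} - \text{cardioid} \\
&= \frac{1}{2}\int_{\pi/6}^{5\pi/6} [6\sin\theta]^2\,d\theta - \frac{1}{2}\int_{\pi/6}^{5\pi/6} [2 + 2\sin\theta]^2\,d\theta \\
&= \frac{1}{2}\int_{\pi/6}^{5\pi/6} 36\sin^2\theta\,d\theta - \frac{1}{2}\int_{\pi/6}^{5\pi/6} \left(4 + 8\sin\theta + 4\sin^2\theta\right)d\theta \\
&= 18\int_{\pi/6}^{5\pi/6} \frac{1 - \cos(2\theta)}{2}\,d\theta - 2\int_{\pi/6}^{5\pi/6} \left(1 + 2\sin\theta + \frac{1 - \cos(2\theta)}{2}\right)d\theta \\
&= 9\left[\theta - \frac{\sin(2\theta)}{2}\right]_{\pi/6}^{5\pi/6} - 2\left[\frac{3\theta}{2} - 2\cos\theta - \frac{\sin(2\theta)}{4}\right]_{\pi/6}^{5\pi/6} \\
&= 9\left(\frac{5\pi}{6} - \frac{\sin(10\pi/6)}{2}\right) - 9\left(\frac{\pi}{6} - \frac{\sin(2\pi/6)}{2}\right) - \left(3\left(\frac{5\pi}{6}\right) - 4\cos\frac{5\pi}{6} - \frac{\sin(10\pi/6)}{2}\right) \\
&\qquad + \left(3\left(\frac{\pi}{6}\right) - 4\cos\frac{\pi}{6} - \frac{\sin(2\pi/6)}{2}\right) \\
&= 4\pi.
\end{aligned}
\]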
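The subtraction of the two areas can be checked numerically by integrating half the difference of the squared radii between the intersection angles. The helper `integrate` below is a plain midpoint rule written for this sketch, not a library routine.

```python
import math

def integrate(g, a, b, n=200_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

def integrand(theta):
    outer = 6 * math.sin(theta)       # circle r = 6 sin θ
    inner = 2 + 2 * math.sin(theta)   # cardioid r = 2 + 2 sin θ
    return 0.5 * (outer ** 2 - inner ** 2)

area = integrate(integrand, math.pi / 6, 5 * math.pi / 6)
```

Combining the two integrals into one integrand is valid here because both curves are integrated over the same interval of θ.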

 Exercise 10.4.2

Find the area inside the circle r = 4 cos θ and outside the circle r = 2 .

Hint
Use Equation 10.4.2 and take advantage of symmetry.

Answer
\[ A = \frac{4\pi}{3} + 2\sqrt{3} \]

In Example 10.4.2 we found the area inside the circle and outside the cardioid by first finding their intersection points. Notice that solving the equation directly for θ yielded two solutions: \(\theta = \frac{\pi}{6}\) and \(\theta = \frac{5\pi}{6}\). However, in the graph there are three intersection points. The third intersection point is the origin. This point did not show up as a solution because the origin is on both graphs but for different values of θ. For example, for the cardioid we get

\[
\begin{aligned}
2 + 2\sin\theta &= 0 \\
\sin\theta &= -1,
\end{aligned}
\]

so the values for θ that solve this equation are \(\theta = \frac{3\pi}{2} + 2n\pi\), where n is any integer. For the circle we get

\[ 6\sin\theta = 0. \]

The solutions to this equation are of the form θ = nπ for any integer value of n. These two solution sets have no points in common. Nevertheless, the curves intersect at the origin. This case must always be taken into consideration.
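A short numerical illustration of this point, using the two curves from Example 10.4.2: each curve reaches r = 0, but at a different angle. The function names are illustrative.

```python
import math

def cardioid(theta):
    return 2 + 2 * math.sin(theta)

def circle(theta):
    return 6 * math.sin(theta)

# Each curve passes through the pole, but for a different value of θ.
r_cardioid_at_pole = cardioid(3 * math.pi / 2)   # sin θ = -1
r_circle_at_pole = circle(0.0)                   # sin θ = 0

# At the other curve's angle, neither curve is at the pole.
r_circle_elsewhere = circle(3 * math.pi / 2)
r_cardioid_elsewhere = cardioid(0.0)
```

This is why setting the two formulas equal misses the pole: the equation f(θ) = g(θ) only finds points that both curves reach at the same θ.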

Arc Length in Polar Curves
Here we derive a formula for the arc length of a curve defined in polar coordinates. In rectangular coordinates, the arc length of a
parameterized curve (x(t), y(t)) for a ≤ t ≤ b is given by
\[ L = \int_a^b \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2}\,dt. \]

In polar coordinates we define the curve by the equation r = f (θ) , where α ≤ θ ≤ β. In order to adapt the arc length formula for a
polar curve, we use the equations

x = r cos θ = f (θ) cos θ

and

y = r sin θ = f (θ) sin θ,

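and we replace the parameter t by θ. Then

\[
\begin{aligned}
\frac{dx}{d\theta} &= f'(\theta)\cos\theta - f(\theta)\sin\theta \\
\frac{dy}{d\theta} &= f'(\theta)\sin\theta + f(\theta)\cos\theta.
\end{aligned}
\]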

We replace dt by dθ , and the lower and upper limits of integration are α and β, respectively. Then the arc length formula becomes
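\[
\begin{aligned}
L &= \int_a^b \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2}\,dt \\
&= \int_\alpha^\beta \sqrt{\left(\frac{dx}{d\theta}\right)^2 + \left(\frac{dy}{d\theta}\right)^2}\,d\theta \\
&= \int_\alpha^\beta \sqrt{(f'(\theta)\cos\theta - f(\theta)\sin\theta)^2 + (f'(\theta)\sin\theta + f(\theta)\cos\theta)^2}\,d\theta \\
&= \int_\alpha^\beta \sqrt{(f'(\theta))^2(\cos^2\theta + \sin^2\theta) + (f(\theta))^2(\cos^2\theta + \sin^2\theta)}\,d\theta \\
&= \int_\alpha^\beta \sqrt{(f'(\theta))^2 + (f(\theta))^2}\,d\theta \\
&= \int_\alpha^\beta \sqrt{r^2 + \left(\frac{dr}{d\theta}\right)^2}\,d\theta.
\end{aligned}
\]

In passing from the third line to the fourth, the cross terms \(-2f(\theta)f'(\theta)\sin\theta\cos\theta\) and \(+2f(\theta)f'(\theta)\sin\theta\cos\theta\) cancel when the two squares are expanded.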

This gives us the following theorem.

 Arc Length of a Curve Defined by a Polar Function


Let f be a function whose derivative is continuous on an interval α ≤θ ≤β . The arc length of the graph of r = f (θ) from
θ = α to θ = β is

\begin{align}
L &= \int_\alpha^\beta \sqrt{[f(\theta)]^2 + [f'(\theta)]^2}\,d\theta \tag{10.4.3} \\
&= \int_\alpha^\beta \sqrt{r^2 + \left(\frac{dr}{d\theta}\right)^2}\,d\theta. \tag{10.4.4}
\end{align}
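The arc length formula can be approximated numerically when an antiderivative is hard to find. The sketch below applies a midpoint rule to \(\sqrt{[f(\theta)]^2 + [f'(\theta)]^2}\); the name `polar_arc_length` and its signature (the caller supplies both f and its derivative) are assumptions made for this illustration.

```python
import math

def polar_arc_length(f, df, alpha, beta, n=100_000):
    """Midpoint-rule estimate of ∫ sqrt(f(θ)^2 + f'(θ)^2) dθ on [alpha, beta]."""
    h = (beta - alpha) / n
    total = 0.0
    for i in range(n):
        theta = alpha + (i + 0.5) * h
        total += math.sqrt(f(theta) ** 2 + df(theta) ** 2) * h
    return total

# Sanity check: r = 2 (so dr/dθ = 0) on [0, 2π] is a circle of circumference 4π.
length = polar_arc_length(lambda t: 2.0, lambda t: 0.0, 0.0, 2 * math.pi)
```

Passing the derivative explicitly keeps the sketch free of numerical differentiation; a finite-difference approximation of f' would also work but adds its own error.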

 Example 10.4.3: Finding the Arc Length of a cardioid
Find the arc length of the cardioid r = 2 + 2 cos θ .
Solution
When θ = 0, r = 2 + 2 cos 0 = 4. Furthermore, as θ goes from 0 to 2π, the cardioid is traced out exactly once. Therefore
these are the limits of integration. Using f (θ) = 2 + 2 cos θ, α = 0, and β = 2π, Equation 10.4.3 becomes
\[
\begin{aligned}
L &= \int_\alpha^\beta \sqrt{[f(\theta)]^2 + [f'(\theta)]^2}\,d\theta \\
&= \int_0^{2\pi} \sqrt{[2 + 2\cos\theta]^2 + [-2\sin\theta]^2}\,d\theta \\
&= \int_0^{2\pi} \sqrt{4 + 8\cos\theta + 4\cos^2\theta + 4\sin^2\theta}\,d\theta \\
&= \int_0^{2\pi} \sqrt{4 + 8\cos\theta + 4(\cos^2\theta + \sin^2\theta)}\,d\theta \\
&= \int_0^{2\pi} \sqrt{8 + 8\cos\theta}\,d\theta \\
&= 2\int_0^{2\pi} \sqrt{2 + 2\cos\theta}\,d\theta.
\end{aligned}
\]

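Next, using the identity \(\cos(2\alpha) = 2\cos^2\alpha - 1\), add 1 to both sides and multiply by 2. This gives \(2 + 2\cos(2\alpha) = 4\cos^2\alpha\). Substituting \(\alpha = \theta/2\) gives \(2 + 2\cos\theta = 4\cos^2(\theta/2)\), so the integral becomes

\[
\begin{aligned}
L &= 2\int_0^{2\pi} \sqrt{2 + 2\cos\theta}\,d\theta \\
&= 2\int_0^{2\pi} \sqrt{4\cos^2\left(\frac{\theta}{2}\right)}\,d\theta \\
&= 4\int_0^{2\pi} \left|\cos\left(\frac{\theta}{2}\right)\right| d\theta.
\end{aligned}
\]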
2

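The absolute value is necessary because \(\cos(\theta/2)\) is negative for some values of θ between 0 and 2π. To resolve this issue, change the limits of integration to 0 and π and double the answer. This strategy works because the integrand is symmetric about θ = π, and \(\cos(\theta/2)\) is nonnegative for 0 ≤ θ ≤ π, where θ/2 lies between 0 and \(\frac{\pi}{2}\). Thus,

\[
\begin{aligned}
L &= 4\int_0^{2\pi} \left|\cos\left(\frac{\theta}{2}\right)\right| d\theta \\
&= 8\int_0^{\pi} \cos\left(\frac{\theta}{2}\right) d\theta \\
&= 8\left(2\sin\left(\frac{\theta}{2}\right)\right)\Bigg|_0^{\pi} \\
&= 16.
\end{aligned}
\]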
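The value 16 can be cross-checked by integrating the original integrand \(\sqrt{r^2 + (dr/d\theta)^2}\) numerically, without the half-angle simplification. This is a sketch; the function name and step count are arbitrary.

```python
import math

def cardioid_length(n=200_000):
    """Midpoint-rule estimate of the arc length of r = 2 + 2 cos θ on [0, 2π]."""
    a, b = 0.0, 2 * math.pi
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        theta = a + (i + 0.5) * h
        r = 2 + 2 * math.cos(theta)
        dr = -2 * math.sin(theta)      # derivative of r with respect to θ
        total += math.sqrt(r * r + dr * dr) * h
    return total

estimate = cardioid_length()
```

Because the numerical integrand is a square root, it is automatically nonnegative, so the sign issue handled by the absolute value above does not arise in the computation.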

 Exercise 10.4.3

Find the total arc length of the graph of r = 3 sin θ .

Hint
Use Equation 10.4.3. To determine the correct limits, make a table of values.

Answer

s = 3π

Key Concepts
The area of a region in polar coordinates defined by the equation r = f(θ) with α ≤ θ ≤ β is given by the integral \(A = \frac{1}{2}\int_\alpha^\beta [f(\theta)]^2\,d\theta\).
To find the area between two curves in the polar coordinate system, first find the points of intersection, then subtract the corresponding areas.
The arc length of a polar curve defined by the equation r = f(θ) with α ≤ θ ≤ β is given by the integral \(L = \int_\alpha^\beta \sqrt{[f(\theta)]^2 + [f'(\theta)]^2}\,d\theta = \int_\alpha^\beta \sqrt{r^2 + \left(\frac{dr}{d\theta}\right)^2}\,d\theta\).

Key Equations
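Area of a region bounded by a polar curve

\[ A = \frac{1}{2}\int_\alpha^\beta [f(\theta)]^2\,d\theta = \frac{1}{2}\int_\alpha^\beta r^2\,d\theta \]

Arc length of a polar curve

\[ L = \int_\alpha^\beta \sqrt{[f(\theta)]^2 + [f'(\theta)]^2}\,d\theta = \int_\alpha^\beta \sqrt{r^2 + \left(\frac{dr}{d\theta}\right)^2}\,d\theta \]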

10.4: Areas and Lengths in Polar Coordinates is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
11.4: Area and Arc Length in Polar Coordinates by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0. Original source:
https://openstax.org/details/books/calculus-volume-1.

10.5: Conic Sections
 Learning Objectives
Identify the equation of a parabola in standard form with given focus and directrix.
Identify the equation of an ellipse in standard form with given foci.
Identify the equation of a hyperbola in standard form with given foci.
Recognize a parabola, ellipse, or hyperbola from its eccentricity value.
Write the polar equation of a conic section with eccentricity e .
Identify when a general equation of degree two is a parabola, ellipse, or hyperbola.

Conic sections have been studied since the time of the ancient Greeks, who regarded them as an important mathematical concept. As early as 320 BCE, such Greek mathematicians as Menaechmus, Apollonius, and Archimedes were fascinated by these curves. Apollonius wrote an entire eight-volume treatise on conic sections in which he was, for example, able to derive a specific method for identifying a conic section through the use of geometry. Since then, important applications of conic sections have arisen (for example, in astronomy), and the properties of conic sections are used in radio telescopes, satellite dish receivers, and even architecture. In this section we discuss the three basic conic sections, some of their properties, and their equations.
Conic sections get their name because they can be generated by intersecting a plane with a cone. A cone has two identically shaped
parts called nappes. One nappe is what most people mean by “cone,” having the shape of a party hat. A right circular cone can be
generated by revolving a line passing through the origin around the y-axis as shown in Figure 10.5.1.

Figure 10.5.1 : A cone generated by revolving the line y = 3x around the y -axis.
Conic sections are generated by the intersection of a plane with a cone (Figure 10.5.2). If the plane is parallel to the axis of
revolution (the y-axis), then the conic section is a hyperbola. If the plane is parallel to the generating line, the conic section is a
parabola. If the plane is perpendicular to the axis of revolution, the conic section is a circle. If the plane intersects one nappe at an
angle to the axis (other than 90°), then the conic section is an ellipse.

10.5.1 https://math.libretexts.org/@go/page/4509
Figure 10.5.2 : The four conic sections. Each conic is determined by the angle the plane makes with the axis of the cone.

Parabolas
A parabola is generated when a plane intersects a cone parallel to the generating line. In this case, the plane intersects only one of
the nappes. A parabola can also be defined in terms of distances.

 Definitions: The Focus, Directrix and Vertex

A parabola is the set of all points whose distance from a fixed point, called the focus, is equal to the distance from a fixed line,
called the directrix. The point halfway between the focus and the directrix is called the vertex of the parabola.

Figure 10.5.3 : A typical parabola in which the distance from the focus to the vertex is represented by the variable p.
A graph of a typical parabola appears in Figure 10.5.3. Using this diagram in conjunction with the distance formula, we can derive an equation for a parabola. Recall the distance formula: Given point P with coordinates \((x_1, y_1)\) and point Q with coordinates \((x_2, y_2)\), the distance between them is given by the formula

\[ d(P, Q) = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}. \]

Then from the definition of a parabola and Figure 10.5.3, we get

\[ d(F, P) = d(P, Q) \]

\[ \sqrt{(0 - x)^2 + (p - y)^2} = \sqrt{(x - x)^2 + (-p - y)^2}. \]

Squaring both sides and simplifying yields

\begin{align}
x^2 + (p - y)^2 &= 0^2 + (-p - y)^2 \tag{10.5.1} \\
x^2 + p^2 - 2py + y^2 &= p^2 + 2py + y^2 \tag{10.5.2} \\
x^2 - 2py &= 2py \tag{10.5.3} \\
x^2 &= 4py. \tag{10.5.4}
\end{align}

Now suppose we want to relocate the vertex. We use the variables (h, k) to denote the coordinates of the vertex. Then if the focus is directly above the vertex, it has coordinates (h, k + p) and the directrix has the equation y = k − p. Going through the same derivation yields the formula \((x - h)^2 = 4p(y - k)\). Solving this equation for y leads to the following theorem.

 Equations for Parabolas: standard form

Given a parabola opening upward with vertex located at (h, k) and focus located at (h, k + p), where p is a constant, the equation for the parabola is given by

\[ y = \frac{1}{4p}(x - h)^2 + k. \]

This is the standard form of a parabola.
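The defining property can be checked against the standard form numerically: points on \(y = \frac{1}{4p}(x - h)^2 + k\) should be equidistant from the focus (h, k + p) and the directrix y = k − p. The sketch below uses sample values of h, k, and p chosen only for illustration, and the function names are hypothetical.

```python
import math

def parabola_y(x, h, k, p):
    """Standard form of an upward-opening parabola."""
    return (x - h) ** 2 / (4 * p) + k

def focus_and_directrix(h, k, p):
    """Focus (h, k + p) and directrix y = k - p."""
    return (h, k + p), k - p

h, k, p = 1.0, -2.0, 0.5          # sample values, not from the text
focus, directrix_y = focus_and_directrix(h, k, p)

gaps = []
for x in [-3.0, 0.0, 1.0, 5.5]:
    y = parabola_y(x, h, k, p)
    d_focus = math.hypot(x - focus[0], y - focus[1])
    d_directrix = abs(y - directrix_y)
    gaps.append(abs(d_focus - d_directrix))   # should be essentially zero
```

Each entry of `gaps` is zero up to rounding error, which mirrors the algebra in Equations 10.5.1 through 10.5.4 after shifting the vertex to (h, k).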

We can also study the cases when the parabola opens down or to the left or the right. The equation for each of these cases can also
be written in standard form as shown in the following graphs.

10.5.3 https://math.libretexts.org/@go/page/4509
Figure 10.5.4 : Four parabolas, opening in various directions, along with their equations in standard form.
In addition, the equation of a parabola can be written in the general form, though in this form the values of h, k, and p are not immediately recognizable. The general form of a parabola is written as

\[ ax^2 + bx + cy + d = 0 \tag{10.5.5} \]

or

\[ ay^2 + bx + cy + d = 0. \tag{10.5.6} \]

Equation 10.5.5 represents a parabola that opens either up or down. Equation 10.5.6 represents a parabola that opens either to the
left or to the right. To put the equation into standard form, use the method of completing the square.

 Example 10.5.1: Converting the Equation of a Parabola from General into Standard Form

Put the equation

\[ x^2 - 4x - 8y + 12 = 0 \]

into standard form and graph the resulting parabola.


Solution
Since y is not squared in this equation, we know that the parabola opens either upward or downward. Therefore we need to solve this equation for y, which will put the equation into standard form. To do that, first add 8y to both sides of the equation:

\[ 8y = x^2 - 4x + 12. \]

The next step is to complete the square on the right-hand side. Start by grouping the first two terms on the right-hand side using parentheses:

\[ 8y = (x^2 - 4x) + 12. \]

Next determine the constant that, when added inside the parentheses, makes the quantity inside the parentheses a perfect square trinomial. To do this, take half the coefficient of x and square it. This gives \(\left(\frac{-4}{2}\right)^2 = 4\). Add 4 inside the parentheses and subtract 4 outside the parentheses, so the value of the equation is not changed:

\[ 8y = (x^2 - 4x + 4) + 12 - 4. \]

Now combine like terms and factor the quantity inside the parentheses:

\[ 8y = (x - 2)^2 + 8. \]

Finally, divide by 8:

\[ y = \frac{1}{8}(x - 2)^2 + 1. \]

This equation is now in standard form. Comparing this to the standard form \(y = \frac{1}{4p}(x - h)^2 + k\) gives h = 2, k = 1, and p = 2. The parabola opens up, with vertex at (2, 1), focus at (2, 3), and directrix y = −1. The graph of this parabola appears as follows.

Figure 10.5.5 : The parabola in Example 10.5.1 .
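Completing the square for a parabola of the form \(ax^2 + bx + cy + d = 0\) can be condensed into closed-form expressions for h, k, and p. The following is a sketch under the assumption that a and c are both nonzero; the function name is illustrative.

```python
def vertical_parabola_standard_form(a, b, c, d):
    """Return (h, k, p) for a*x^2 + b*x + c*y + d = 0, with a != 0 and c != 0.

    Solving for y gives y = (-a/c) x^2 + (-b/c) x + (-d/c); completing the
    square in x then matches y = (1/(4p)) (x - h)^2 + k.
    """
    if a == 0 or c == 0:
        raise ValueError("need a != 0 and c != 0 for a vertical parabola")
    h = -b / (2 * a)                      # x-coordinate of the vertex
    k = -(a * h * h + b * h + d) / c      # y-value of the curve at x = h
    p = -c / (4 * a)                      # from 1/(4p) = -a/c
    return h, k, p

# The equation from this example: x^2 - 4x - 8y + 12 = 0.
h, k, p = vertical_parabola_standard_form(1, -4, -8, 12)
```

A negative p from this function corresponds to a parabola that opens downward.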

 Exercise 10.5.1
Put the equation \(2y^2 - x + 12y + 16 = 0\) into standard form and graph the resulting parabola.

Hint
Solve for x . Check which direction the parabola opens.

Answer
\[ x = 2(y + 3)^2 - 2 \]

The axis of symmetry of a vertical (opening up or down) parabola is a vertical line passing through the vertex. The parabola has an
interesting reflective property. Suppose we have a satellite dish with a parabolic cross section. If a beam of electromagnetic waves,
such as light or radio waves, comes into the dish in a straight line from a satellite (parallel to the axis of symmetry), then the waves
reflect off the dish and collect at the focus of the parabola as shown.

Consider a parabolic dish designed to collect signals from a satellite in space. The dish is aimed directly at the satellite, and a
receiver is located at the focus of the parabola. Radio waves coming in from the satellite are reflected off the surface of the
parabola to the receiver, which collects and decodes the digital signals. This allows a small receiver to gather signals from a wide
angle of sky. Flashlights and headlights in a car work on the same principle, but in reverse: the source of the light (that is, the light
bulb) is located at the focus and the reflecting surface on the parabolic mirror focuses the beam straight ahead. This allows a small
light bulb to illuminate a wide angle of space in front of the flashlight or car.

Ellipses
An ellipse can also be defined in terms of distances. In the case of an ellipse, there are two foci (plural of focus), and two
directrices (plural of directrix). We look at the directrices in more detail later in this section.

 Definition: Ellipse

An ellipse is the set of all points for which the sum of their distances from two fixed points (the foci) is constant.

A graph of a typical ellipse is shown in Figure 10.5.6. In this figure the foci are labeled as F and F '. Both are the same fixed
distance from the origin, and this distance is represented by the variable c . Therefore the coordinates of F are (c, 0) and the
coordinates of F ' are (−c, 0). The points P and P ' are located at the ends of the major axis of the ellipse, and have coordinates
(a, 0) and (−a, 0), respectively. The major axis is always the longest distance across the ellipse, and can be horizontal or vertical.

Thus, the length of the major axis in this ellipse is 2a. Furthermore, P and P ' are called the vertices of the ellipse. The points Q
and Q' are located at the ends of the minor axis of the ellipse, and have coordinates (0, b) and (0, −b), respectively. The minor
axis is the shortest distance across the ellipse. The minor axis is perpendicular to the major axis.

Figure 10.5.6 : A typical ellipse in which the sum of the distances from any point on the ellipse to the foci is constant.
According to the definition of the ellipse, we can choose any point on the ellipse and the sum of the distances from this point to the
two foci is constant. Suppose we choose the point P . Since the coordinates of point P are (a, 0), the sum of the distances is

d(P , F ) + d(P , F ') = (a − c) + (a + c) = 2a.

Therefore the sum of the distances from an arbitrary point A with coordinates (x, y) is also equal to 2a. Using the distance formula,
we get

\[ d(A, F) + d(A, F') = 2a \]

\[ \sqrt{(x - c)^2 + y^2} + \sqrt{(x + c)^2 + y^2} = 2a. \]

Subtract the second radical from both sides and square both sides:

\[
\begin{aligned}
\sqrt{(x - c)^2 + y^2} &= 2a - \sqrt{(x + c)^2 + y^2} \\
(x - c)^2 + y^2 &= 4a^2 - 4a\sqrt{(x + c)^2 + y^2} + (x + c)^2 + y^2 \\
x^2 - 2cx + c^2 + y^2 &= 4a^2 - 4a\sqrt{(x + c)^2 + y^2} + x^2 + 2cx + c^2 + y^2 \\
-2cx &= 4a^2 - 4a\sqrt{(x + c)^2 + y^2} + 2cx.
\end{aligned}
\]

Now isolate the radical on the right-hand side and square again:

\[
\begin{aligned}
-2cx &= 4a^2 - 4a\sqrt{(x + c)^2 + y^2} + 2cx \\
4a\sqrt{(x + c)^2 + y^2} &= 4a^2 + 4cx \\
\sqrt{(x + c)^2 + y^2} &= a + \frac{cx}{a} \\
(x + c)^2 + y^2 &= a^2 + 2cx + \frac{c^2x^2}{a^2} \\
x^2 + 2cx + c^2 + y^2 &= a^2 + 2cx + \frac{c^2x^2}{a^2} \\
x^2 + c^2 + y^2 &= a^2 + \frac{c^2x^2}{a^2}.
\end{aligned}
\]

Isolate the variables on the left-hand side of the equation and the constants on the right-hand side:

\[
\begin{aligned}
x^2 - \frac{c^2x^2}{a^2} + y^2 &= a^2 - c^2 \\
\frac{(a^2 - c^2)x^2}{a^2} + y^2 &= a^2 - c^2.
\end{aligned}
\]

Divide both sides by \(a^2 - c^2\). This gives the equation

\[ \frac{x^2}{a^2} + \frac{y^2}{a^2 - c^2} = 1. \]

If we refer back to Figure 10.5.6, then the length of each of the two green line segments is equal to a. This is true because the sum of the distances from the point Q to the foci F and F′ is equal to 2a, and the lengths of these two line segments are equal. Each of these line segments is the hypotenuse of a right triangle with hypotenuse length a and leg lengths b and c. From the Pythagorean theorem, \(b^2 + c^2 = a^2\) and \(b^2 = a^2 - c^2\). Therefore the equation of the ellipse becomes

\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} = 1. \]

Finally, if the center of the ellipse is moved from the origin to a point (h, k), we have the following standard form of an ellipse.

 Equation of an Ellipse in Standard Form


Consider the ellipse with center (h, k), a horizontal major axis with length 2a, and a vertical minor axis with length 2b. Then the equation of this ellipse in standard form is

\[ \frac{(x - h)^2}{a^2} + \frac{(y - k)^2}{b^2} = 1 \tag{10.5.7} \]

and the foci are located at (h ± c, k), where \(c^2 = a^2 - b^2\). The equations of the directrices are \(x = h \pm \frac{a^2}{c}\).

If the major axis is vertical, then the equation of the ellipse becomes

\[ \frac{(x - h)^2}{b^2} + \frac{(y - k)^2}{a^2} = 1 \tag{10.5.8} \]

and the foci are located at (h, k ± c), where \(c^2 = a^2 - b^2\). The equations of the directrices in this case are \(y = k \pm \frac{a^2}{c}\).
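The distance definition of the ellipse can be checked against the standard form numerically. The sketch below samples points on a horizontal ellipse using the parametrization x = h + a cos t, y = k + b sin t (a standard parametrization assumed here, not taken from the text) and sums the distances to the foci (h ± c, k). The sample values and function names are illustrative.

```python
import math

def ellipse_point(t, h, k, a, b):
    """A point on (x-h)^2/a^2 + (y-k)^2/b^2 = 1 for parameter t."""
    return h + a * math.cos(t), k + b * math.sin(t)

def focal_distance_sum(x, y, h, k, a, b):
    """Sum of the distances from (x, y) to the foci (h ± c, k), c^2 = a^2 - b^2."""
    c = math.sqrt(a * a - b * b)
    return math.hypot(x - (h + c), y - k) + math.hypot(x - (h - c), y - k)

h, k, a, b = 1.0, -2.0, 5.0, 3.0     # sample values, so c = 4
sums = [
    focal_distance_sum(*ellipse_point(t, h, k, a, b), h, k, a, b)
    for t in (0.0, 0.7, 2.0, 4.5)
]
```

Every entry of `sums` equals 2a = 10 up to rounding, in agreement with the derivation above.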

If the major axis is horizontal, then the ellipse is called horizontal, and if the major axis is vertical, then the ellipse is called vertical.
The equation of an ellipse is in general form if it is in the form
\[ Ax^2 + By^2 + Cx + Dy + E = 0, \]

where A and B are either both positive or both negative. To convert the equation from general to standard form, use the method of
completing the square.

 Example 10.5.2: Finding the Standard Form of an Ellipse

Put the equation

\[ 9x^2 + 4y^2 - 36x + 24y + 36 = 0 \]

into standard form and graph the resulting ellipse.


Solution
First subtract 36 from both sides of the equation:

\[ 9x^2 + 4y^2 - 36x + 24y = -36. \]

Next group the x terms together and the y terms together, and factor out the common factor:

\[
\begin{aligned}
(9x^2 - 36x) + (4y^2 + 24y) &= -36 \\
9(x^2 - 4x) + 4(y^2 + 6y) &= -36.
\end{aligned}
\]

We need to determine the constant that, when added inside each set of parentheses, results in a perfect square. In the first set of parentheses, take half the coefficient of x and square it. This gives \(\left(\frac{-4}{2}\right)^2 = 4\). In the second set of parentheses, take half the coefficient of y and square it. This gives \(\left(\frac{6}{2}\right)^2 = 9\). Add these inside each pair of parentheses. Since the first set of parentheses has a 9 in front, we are actually adding 9 · 4 = 36 to the left-hand side. Similarly, since the second set has a 4 in front, we are adding 4 · 9 = 36 for the second set as well. Therefore the equation becomes

\[
\begin{aligned}
9(x^2 - 4x + 4) + 4(y^2 + 6y + 9) &= -36 + 36 + 36 \\
9(x^2 - 4x + 4) + 4(y^2 + 6y + 9) &= 36.
\end{aligned}
\]

Now factor both sets of parentheses and divide by 36:

\[
\begin{aligned}
9(x - 2)^2 + 4(y + 3)^2 &= 36 \\
\frac{9(x - 2)^2}{36} + \frac{4(y + 3)^2}{36} &= 1 \\
\frac{(x - 2)^2}{4} + \frac{(y + 3)^2}{9} &= 1.
\end{aligned}
\]

The equation is now in standard form. Comparing this to Equation 10.5.8 gives h = 2, k = −3, a = 3, and b = 2. This is a vertical ellipse with center at (2, −3), major axis of length 6, and minor axis of length 4. The graph of this ellipse appears as follows.

Figure 10.5.7 : The ellipse in Example 10.5.2 .
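The completing-the-square steps for an ellipse in general form \(Ax^2 + By^2 + Cx + Dy + E = 0\) can be condensed into a few formulas. The following is a sketch, assuming A and B have the same sign; the function name and return convention (center plus the two denominators) are choices made for this illustration.

```python
def ellipse_standard_form(A, B, C, D, E):
    """Return (h, k, x_den, y_den) so that
    (x - h)^2 / x_den + (y - k)^2 / y_den = 1
    is equivalent to A x^2 + B y^2 + C x + D y + E = 0.
    """
    if A * B <= 0:
        raise ValueError("A and B must be nonzero with the same sign")
    h = -C / (2 * A)                  # completes the square in x
    k = -D / (2 * B)                  # completes the square in y
    rhs = A * h * h + B * k * k - E   # constant left on the right-hand side
    if rhs / A <= 0:
        raise ValueError("equation has no ellipse as its graph")
    return h, k, rhs / A, rhs / B

# The equation from this example: 9x^2 + 4y^2 - 36x + 24y + 36 = 0.
h, k, x_den, y_den = ellipse_standard_form(9, 4, -36, 24, 36)
```

The larger of the two denominators is \(a^2\); here it sits under the y term, which is what makes the ellipse vertical.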

 Exercise 10.5.2

Put the equation

\[ 9x^2 + 16y^2 + 18x - 64y - 71 = 0 \]

into standard form and graph the resulting ellipse.

Hint
Move the constant over and complete the square.

Answer
\[ \frac{(x + 1)^2}{16} + \frac{(y - 2)^2}{9} = 1 \]

According to Kepler’s first law of planetary motion, the orbit of a planet around the Sun is an ellipse with the Sun at one of the foci as shown in Figure 10.5.8(a). Because Earth’s orbit is an ellipse, the distance from the Sun varies throughout the year. A commonly held misconception is that Earth is closer to the Sun in the summer. In fact, in summer for the northern hemisphere, Earth is farther from the Sun than during winter. The difference in season is caused by the tilt of Earth’s axis relative to the orbital plane. Comets that orbit the Sun, such as Halley’s Comet, also have elliptical orbits, as do moons orbiting the planets and satellites orbiting Earth.
Ellipses also have interesting reflective properties: A light ray emanating from one focus passes through the other focus after mirror reflection in the ellipse. The same thing occurs with a sound wave as well. The National Statuary Hall in the U.S. Capitol in Washington, DC, is a famous room in an elliptical shape as shown in Figure 10.5.8(b). This hall served as the meeting place for the U.S. House of Representatives for almost fifty years. The locations of the two foci of this semi-elliptical room are clearly identified by marks on the floor, and even if the room is full of visitors, when two people stand on these spots and speak to each other, they can hear each other much more clearly than they can hear someone standing close by. Legend has it that John Quincy Adams had his desk located on one of the foci and was able to eavesdrop on everyone else in the House without ever needing to stand. Although this makes a good story, it is unlikely to be true, because the original ceiling produced so many echoes that the entire room had to be hung with carpets to dampen the noise. The ceiling was rebuilt in 1902 and only then did the now-famous whispering effect emerge. Another famous whispering gallery, the site of many marriage proposals, is in Grand Central Station in New York City.

Figure 10.5.8 : (a) Earth’s orbit around the Sun is an ellipse with the Sun at one focus. (b) Statuary Hall in the U.S. Capitol is a
whispering gallery with an elliptical cross section.

Hyperbolas
A hyperbola can also be defined in terms of distances. In the case of a hyperbola, there are two foci and two directrices. Hyperbolas
also have two asymptotes.

 Definition: hyperbola

A hyperbola is the set of all points where the difference between their distances from two fixed points (the foci) is constant.

A graph of a typical hyperbola appears as follows.

Figure 10.5.9 : A typical hyperbola in which the difference of the distances from any point on the hyperbola to the foci is constant.
The transverse axis is also called the major axis, and the conjugate axis is also called the minor axis.
The derivation of the equation of a hyperbola in standard form is virtually identical to that of an ellipse. One slight hitch lies in the definition: the difference between the two distances is taken to be positive, so it is written with an absolute value. Let P be a point on the hyperbola with coordinates (x, y). Then the definition of the hyperbola gives \(|d(P, F_1) - d(P, F_2)| = \text{constant}\). To simplify the derivation, assume that P is on the right branch of the hyperbola, so the absolute value bars drop. If it is on the left branch, then the subtraction is reversed. The vertex of the right branch has coordinates (a, 0), so

d(P , F1 ) − d(P , F2 ) = (c + a) − (c − a) = 2a.

This equation is therefore true for any point on the hyperbola. Returning to the coordinates (x, y) for P :

\[d(P, F_1) - d(P, F_2) = 2a\]

\[\sqrt{(x + c)^2 + y^2} - \sqrt{(x - c)^2 + y^2} = 2a.\]

Isolate the second radical and square both sides:


\[\begin{aligned}
\sqrt{(x - c)^2 + y^2} &= -2a + \sqrt{(x + c)^2 + y^2} \\
(x - c)^2 + y^2 &= 4a^2 - 4a\sqrt{(x + c)^2 + y^2} + (x + c)^2 + y^2 \\
x^2 - 2cx + c^2 + y^2 &= 4a^2 - 4a\sqrt{(x + c)^2 + y^2} + x^2 + 2cx + c^2 + y^2 \\
-2cx &= 4a^2 - 4a\sqrt{(x + c)^2 + y^2} + 2cx.
\end{aligned}\]

Now isolate the radical on the right-hand side and square again:
\[\begin{aligned}
-2cx &= 4a^2 - 4a\sqrt{(x + c)^2 + y^2} + 2cx \\
-4a\sqrt{(x + c)^2 + y^2} &= -4a^2 - 4cx \\
-\sqrt{(x + c)^2 + y^2} &= -a - \frac{cx}{a} \\
(x + c)^2 + y^2 &= a^2 + 2cx + \frac{c^2x^2}{a^2} \\
x^2 + 2cx + c^2 + y^2 &= a^2 + 2cx + \frac{c^2x^2}{a^2} \\
x^2 + c^2 + y^2 &= a^2 + \frac{c^2x^2}{a^2}.
\end{aligned}\]

Isolate the variables on the left-hand side of the equation and the constants on the right-hand side:

\[\begin{aligned}
x^2 - \frac{c^2x^2}{a^2} + y^2 &= a^2 - c^2 \\
\frac{(a^2 - c^2)x^2}{a^2} + y^2 &= a^2 - c^2.
\end{aligned}\]

Finally, divide both sides by \(a^2 - c^2\). This gives the equation

\[\frac{x^2}{a^2} + \frac{y^2}{a^2 - c^2} = 1.\]

We now define \(b\) so that \(b^2 = c^2 - a^2\). This is possible because \(c > a\). Therefore the equation of the hyperbola becomes

\[\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1.\]
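The result of the derivation can be checked numerically. The following is a minimal Python sketch, assuming a hyperbola centered at the origin with foci at (-c, 0) and (c, 0); the function names are illustrative and not part of the text. It picks points on the curve and confirms that the difference of their focal distances is close to 2a.

```python
import math

def point_on_hyperbola(x, a, b):
    """Return the upper point (x, y) on x^2/a^2 - y^2/b^2 = 1 for |x| >= a."""
    y = b * math.sqrt(x * x / (a * a) - 1.0)
    return x, y

def focal_distance_difference(x, y, a, b):
    """Return |d(P, F1) - d(P, F2)| for the point P = (x, y)."""
    c = math.sqrt(a * a + b * b)   # from b^2 = c^2 - a^2
    d1 = math.hypot(x + c, y)      # distance to F1 = (-c, 0)
    d2 = math.hypot(x - c, y)      # distance to F2 = (c, 0)
    return abs(d1 - d2)

a, b = 4.0, 3.0
for x in (4.0, 5.0, -7.5, 20.0):
    px, py = point_on_hyperbola(x, a, b)
    # Each printed difference should be close to 2a.
    print(x, focal_distance_difference(px, py, a, b))
```

The absolute value in the helper lets the same check work on the left branch, where the subtraction is reversed.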

Finally, if the center of the hyperbola is moved from the origin to the point (h, k), we have the following standard form of a
hyperbola.

 Equation of a Hyperbola in Standard Form

Consider the hyperbola with center \((h, k)\), a horizontal major axis, and a vertical minor axis. Then the equation of this
hyperbola is

\[\frac{(x - h)^2}{a^2} - \frac{(y - k)^2}{b^2} = 1 \tag{10.5.9}\]

and the foci are located at \((h \pm c, k)\), where \(c^2 = a^2 + b^2\). The equations of the asymptotes are given by
\(y = k \pm \frac{b}{a}(x - h)\). The equations of the directrices are

\[x = h \pm \frac{a^2}{\sqrt{a^2 + b^2}} = h \pm \frac{a^2}{c}.\]

If the major axis is vertical, then the equation of the hyperbola becomes

\[\frac{(y - k)^2}{a^2} - \frac{(x - h)^2}{b^2} = 1\]

and the foci are located at \((h, k \pm c)\), where \(c^2 = a^2 + b^2\). The equations of the asymptotes are given by
\(y = k \pm \frac{a}{b}(x - h)\). The equations of the directrices are

\[y = k \pm \frac{a^2}{\sqrt{a^2 + b^2}} = k \pm \frac{a^2}{c}.\]

If the major axis (transverse axis) is horizontal, then the hyperbola is called horizontal, and if the major axis is vertical then the
hyperbola is called vertical. The equation of a hyperbola is in general form if it is in the form
\[Ax^2 + By^2 + Cx + Dy + E = 0,\]

where A and B have opposite signs. In order to convert the equation from general to standard form, use the method of completing
the square.

 Example 10.5.3: Finding the Standard Form of a Hyperbola


Put the equation \(9x^2 - 16y^2 + 36x + 32y - 124 = 0\) into standard form and graph the resulting hyperbola. What are the
equations of the asymptotes?
Solution
First add 124 to both sides of the equation:

\[9x^2 - 16y^2 + 36x + 32y = 124.\]

Next group the x terms together and the y terms together, then factor out the common factors:

\[(9x^2 + 36x) - (16y^2 - 32y) = 124\]

\[9(x^2 + 4x) - 16(y^2 - 2y) = 124.\]

We need to determine the constant that, when added inside each set of parentheses, results in a perfect square. In the first set of
parentheses, take half the coefficient of x and square it. This gives \(\left(\frac{4}{2}\right)^2 = 4\). In the second set of
parentheses, take half the coefficient of y and square it. This gives \(\left(\frac{-2}{2}\right)^2 = 1\). Add these inside each pair
of parentheses. Since the first set of parentheses has a 9 in front, we are actually adding 36 to the left-hand side. Similarly, the
second set of parentheses has a -16 in front, so we are actually subtracting 16 from the left-hand side. Therefore the equation becomes

\[9(x^2 + 4x + 4) - 16(y^2 - 2y + 1) = 124 + 36 - 16\]

\[9(x^2 + 4x + 4) - 16(y^2 - 2y + 1) = 144.\]

Next factor both sets of parentheses and divide by 144:

\[\begin{aligned}
9(x + 2)^2 - 16(y - 1)^2 &= 144 \\
\frac{9(x + 2)^2}{144} - \frac{16(y - 1)^2}{144} &= 1 \\
\frac{(x + 2)^2}{16} - \frac{(y - 1)^2}{9} &= 1.
\end{aligned}\]

The equation is now in standard form. Comparing this to Equation 10.5.9 gives \(h = -2\), \(k = 1\), \(a = 4\), and \(b = 3\). This is a
horizontal hyperbola with center at \((-2, 1)\) and asymptotes given by the equations \(y = 1 \pm \frac{3}{4}(x + 2)\). The graph of this
hyperbola appears in Figure 10.5.10.

Figure 10.5.10: Graph of the hyperbola in Example 10.5.3 .
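The completing-the-square steps used in this example can be sketched in Python. This is a minimal illustration, not part of the original text: the function name `hyperbola_standard_form` and its return convention are assumptions, and exact fractions are used so the results match the hand computation.

```python
from fractions import Fraction

def hyperbola_standard_form(A, B, C, D, E):
    """Complete the square in A x^2 + B y^2 + C x + D y + E = 0.

    A and B must have opposite signs. Returns (h, k, x_den, y_den) such that
        (x - h)^2 / x_den + (y - k)^2 / y_den = 1,
    where exactly one of the two denominators is negative.
    """
    A, B, C, D, E = (Fraction(v) for v in (A, B, C, D, E))
    if A * B >= 0:
        raise ValueError("A and B must have opposite signs for a hyperbola")
    h = -C / (2 * A)                   # A x^2 + C x = A (x - h)^2 - A h^2
    k = -D / (2 * B)                   # B y^2 + D y = B (y - k)^2 - B k^2
    rhs = -E + A * h * h + B * k * k   # constant left on the right-hand side
    if rhs == 0:
        raise ValueError("degenerate case: the graph is a pair of lines")
    return h, k, rhs / A, rhs / B

# The equation from the example: 9x^2 - 16y^2 + 36x + 32y - 124 = 0.
print(hyperbola_standard_form(9, -16, 36, 32, -124))
```

A positive `x_den` with a negative `y_den` corresponds to a horizontal hyperbola, and the reverse to a vertical one.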

 Exercise 10.5.3

Put the equation \(4y^2 - 9x^2 + 16y + 18x - 29 = 0\) into standard form and graph the resulting hyperbola. What are the
equations of the asymptotes?

Hint
Move the constant over and complete the square. Check which direction the hyperbola opens.

Answer
\(\frac{(y + 2)^2}{9} - \frac{(x - 1)^2}{4} = 1\). This is a vertical hyperbola. Asymptotes: \(y = -2 \pm \frac{3}{2}(x - 1)\).

Hyperbolas also have interesting reflective properties. A ray directed toward one focus of a hyperbola is reflected by a hyperbolic
mirror toward the other focus. This concept is illustrated in Figure 10.5.11.

Figure 10.5.11: A hyperbolic mirror used to collect light from distant stars.
This property of the hyperbola has important applications. It is used in radio direction finding (since the difference in signals from
two towers is constant along hyperbolas), and in the construction of mirrors inside telescopes (to reflect light coming from the
parabolic mirror to the eyepiece). Another interesting fact about hyperbolas is that for a comet entering the solar system, if the
speed is great enough to escape the Sun’s gravitational pull, then the path that the comet takes as it passes through the solar system
is hyperbolic.

Eccentricity and Directrix


An alternative way to describe a conic section involves the directrices, the foci, and a new property called eccentricity. We will see
that the value of the eccentricity of a conic section can uniquely define that conic.

 Definition: Eccentricity and Directrices

The eccentricity e of a conic section is defined to be the distance from any point on the conic section to its focus, divided by
the perpendicular distance from that point to the nearest directrix. This value is constant for any conic section, and can define
the conic section as well:

1. If e = 1 , the conic is a parabola.
2. If e < 1 , it is an ellipse.
3. If e > 1, it is a hyperbola.
The eccentricity of a circle is zero. The directrix of a conic section is the line that, together with the point known as the focus,
serves to define a conic section. Hyperbolas and noncircular ellipses have two foci and two associated directrices. Parabolas
have one focus and one directrix.

The three conic sections with their directrices appear in Figure 10.5.12.

Figure 10.5.12: The three conic sections with their foci and directrices.
Recall from the definition of a parabola that the distance from any point on the parabola to the focus is equal to the distance from
that same point to the directrix. Therefore, by definition, the eccentricity of a parabola must be 1. The equations of the directrices of
a horizontal ellipse are \(x = \pm\frac{a^2}{c}\). The right vertex of the ellipse is located at \((a, 0)\) and the right focus is
\((c, 0)\). Therefore the distance from the vertex to the focus is \(a - c\) and the distance from the vertex to the right directrix is
\(\frac{a^2}{c} - a\). This gives the eccentricity as

\[e = \frac{a - c}{\frac{a^2}{c} - a} = \frac{c(a - c)}{a^2 - ac} = \frac{c(a - c)}{a(a - c)} = \frac{c}{a}.\]

Since \(c < a\), this step proves that the eccentricity of an ellipse is less than 1. The directrices of a horizontal hyperbola are also
located at \(x = \pm\frac{a^2}{c}\), and a similar calculation shows that the eccentricity of a hyperbola is also \(e = \frac{c}{a}\).
However, in this case we have \(c > a\), so the eccentricity of a hyperbola is greater than 1.

 Example 10.5.4: Determining Eccentricity of a Conic Section

Determine the eccentricity of the ellipse described by the equation

\[\frac{(x - 3)^2}{16} + \frac{(y + 2)^2}{25} = 1.\]

Solution
From the equation we see that \(a = 5\) and \(b = 4\). The value of \(c\) can be calculated using the equation \(a^2 = b^2 + c^2\) for an
ellipse. Substituting the values of \(a\) and \(b\) and solving for \(c\) gives \(c = 3\). Therefore the eccentricity of the ellipse is

\[e = \frac{c}{a} = \frac{3}{5} = 0.6.\]
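The computation of \(e = c/a\) from a standard-form equation can be sketched as follows. This is a small Python illustration with hypothetical function names; it assumes the denominators of the standard form are passed in directly.

```python
import math

def ellipse_eccentricity(den_x, den_y):
    """Eccentricity of (x-h)^2/den_x + (y-k)^2/den_y = 1, both denominators positive."""
    a_sq, b_sq = max(den_x, den_y), min(den_x, den_y)   # a^2 is the larger denominator
    return math.sqrt(a_sq - b_sq) / math.sqrt(a_sq)     # c/a with c^2 = a^2 - b^2

def hyperbola_eccentricity(a_sq, b_sq):
    """Eccentricity of a hyperbola; a_sq is the denominator under the positive term."""
    return math.sqrt(a_sq + b_sq) / math.sqrt(a_sq)     # c/a with c^2 = a^2 + b^2

print(ellipse_eccentricity(16, 25))     # the ellipse of this example
print(hyperbola_eccentricity(49, 25))   # a vertical hyperbola with a = 7, b = 5
```

Note the different relations for \(c\): an ellipse uses \(c^2 = a^2 - b^2\), while a hyperbola uses \(c^2 = a^2 + b^2\).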

 Exercise 10.5.4
Determine the eccentricity of the hyperbola described by the equation

\[\frac{(y - 3)^2}{49} - \frac{(x + 2)^2}{25} = 1.\]

Hint
First find the values of \(a\) and \(b\), then determine \(c\) using the equation \(c^2 = a^2 + b^2\).

Answer
\(e = \frac{c}{a} = \frac{\sqrt{74}}{7} \approx 1.229\)

Polar Equations of Conic Sections


Sometimes it is useful to write or identify the equation of a conic section in polar form. To do this, we need the concept of the focal
parameter. The focal parameter \(p\) of a conic section is defined as the distance from a focus to the nearest directrix. The following
table gives the focal parameters for the different types of conics, where \(a\) is the length of the semi-major axis (i.e., half the
length of the major axis), \(c\) is the distance from the center to the focus, and \(e\) is the eccentricity. In the case of a parabola,
\(a\) represents the distance from the vertex to the focus.
Table 10.5.1 : Eccentricities and Focal Parameters of the Conic Sections
Conic | \(e\) | \(p\)
Ellipse | \(0 < e < 1\) | \(\frac{a^2 - c^2}{c} = \frac{a(1 - e^2)}{e}\)
Parabola | \(e = 1\) | \(2a\)
Hyperbola | \(e > 1\) | \(\frac{c^2 - a^2}{c} = \frac{a(e^2 - 1)}{e}\)
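The ellipse and hyperbola entries can be checked directly against the definition of the focal parameter. The sketch below is a hedged Python illustration: it assumes a conic centered at the origin with a horizontal major axis, so the focus is at (c, 0) and the nearest directrix is the line x = a²/c. The function name is hypothetical.

```python
def focal_parameter(a, c):
    """Distance from the focus (c, 0) to the nearest directrix x = a^2 / c.

    Valid for an ellipse (c < a) and for a hyperbola (c > a).
    """
    if a <= 0 or c <= 0 or a == c:
        raise ValueError("need a > 0, c > 0, and a != c")
    return abs(a * a / c - c)

# Ellipse with a = 5, c = 3: compare with (a^2 - c^2) / c.
print(focal_parameter(5, 3), (5**2 - 3**2) / 3)
# Hyperbola with a = 4, c = 5: compare with (c^2 - a^2) / c.
print(focal_parameter(4, 5), (5**2 - 4**2) / 5)
```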

Using the definitions of the focal parameter and eccentricity of the conic section, we can derive an equation for any conic section in
polar coordinates. In particular, we assume that one of the foci of a given conic section lies at the pole. Then using the definition of
the various conic sections in terms of distances, it is possible to prove the following theorem.

 Polar Equation of Conic Sections


The polar equation of a conic section with focal parameter \(p\) is given by

\[r = \frac{ep}{1 \pm e\cos\theta} \quad \text{or} \quad r = \frac{ep}{1 \pm e\sin\theta}.\]

In the equation on the left, the major axis of the conic section is horizontal, and in the equation on the right, the major axis is
vertical. To work with a conic section written in polar form, first make the constant term in the denominator equal to 1. This can be
done by dividing both the numerator and the denominator of the fraction by the constant that appears in front of the plus or minus
in the denominator. Then the coefficient of the sine or cosine in the denominator is the eccentricity. This value identifies the conic.
If cosine appears in the denominator, then the conic is horizontal. If sine appears, then the conic is vertical. If both appear then the
axes are rotated. The center of the conic is not necessarily at the origin. The center is at the origin only if the conic is a circle (i.e.,
e = 0 ).

 Example 10.5.5: Graphing a Conic Section in Polar Coordinates


Identify and create a graph of the conic section described by the equation

\[r = \frac{3}{1 + 2\cos\theta}.\]

Solution
The constant term in the denominator is 1, so the eccentricity of the conic is 2. This is a hyperbola. The focal parameter \(p\) can
be calculated by using the equation \(ep = 3\). Since \(e = 2\), this gives \(p = \frac{3}{2}\). The cosine function appears in the
denominator, so the hyperbola is horizontal. Pick a few values for \(\theta\) and create a table of values. Then we can graph the
hyperbola (Figure 10.5.13).

\(\theta\) | \(r\) | \(\theta\) | \(r\)
\(0\) | \(1\) | \(\pi\) | \(-3\)
\(\frac{\pi}{4}\) | \(\frac{3}{1 + \sqrt{2}} \approx 1.2426\) | \(\frac{5\pi}{4}\) | \(\frac{3}{1 - \sqrt{2}} \approx -7.2426\)
\(\frac{\pi}{2}\) | \(3\) | \(\frac{3\pi}{2}\) | \(3\)
\(\frac{3\pi}{4}\) | \(\frac{3}{1 - \sqrt{2}} \approx -7.2426\) | \(\frac{7\pi}{4}\) | \(\frac{3}{1 + \sqrt{2}} \approx 1.2426\)

Figure 10.5.13: Graph of the hyperbola described in Example 10.5.5 .
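A table of values like the one in this example can be generated with a few lines of Python. This is a minimal sketch for the form r = ep/(1 + e cos θ) only; the function name `conic_r` is an assumption made for illustration.

```python
import math

def conic_r(theta, e, p):
    """Return r = e*p / (1 + e*cos(theta)), or None where the denominator is zero."""
    denom = 1 + e * math.cos(theta)
    if abs(denom) < 1e-12:
        return None
    return e * p / denom

e, p = 2, 1.5   # from ep = 3 with e = 2
for k in range(8):
    theta = k * math.pi / 4
    r = conic_r(theta, e, p)
    shown = "undefined" if r is None else f"{r:.4f}"
    print(f"theta = {k}*pi/4   r = {shown}")
```

Negative values of r are plotted in the direction opposite to θ, which is how the second branch of the hyperbola appears.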

 Exercise 10.5.5

Identify and create a graph of the conic section described by the equation

\[r = \frac{4}{1 - 0.8\sin\theta}.\]

Hint
First find the values of e and p, and then create a table of values.

Answer
Here e = 0.8 and p = 5 . This conic section is an ellipse.

General Equations of Degree Two
A general equation of degree two can be written in the form

\[Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0.\]

The graph of an equation of this form is a conic section. If \(B \neq 0\) then the coordinate axes are rotated. To identify the conic
section, we use the discriminant of the conic section \(4AC - B^2\).

 Identifying the Conic Section

One of the following cases must be true:


1. \(4AC - B^2 > 0\). If so, the graph is an ellipse.
2. \(4AC - B^2 = 0\). If so, the graph is a parabola.
3. \(4AC - B^2 < 0\). If so, the graph is a hyperbola.
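The three cases translate directly into code. The following Python sketch is illustrative only: the function name and the tolerance parameter `tol` are assumptions, and degenerate graphs (a point, a line, a pair of lines, or no graph) are not detected.

```python
def classify_conic(A, B, C, tol=1e-12):
    """Classify A x^2 + B xy + C y^2 + D x + E y + F = 0 by the sign of 4AC - B^2."""
    disc = 4 * A * C - B * B
    if disc > tol:
        return "ellipse"
    if disc < -tol:
        return "hyperbola"
    return "parabola"   # discriminant is zero up to the tolerance

print(classify_conic(0, 1, 0))   # the equation xy = 1
print(classify_conic(1, 2, 1))   # x^2 + 2xy + y^2 + ... = 0
```

The tolerance matters only when the coefficients are floating-point values, such as a coefficient involving a square root.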

The simplest example of a second-degree equation involving a cross term is \(xy = 1\). This equation can be solved for \(y\) to obtain
\(y = \frac{1}{x}\). The graph of this function is called a rectangular hyperbola as shown.

Figure 10.5.14: Graph of the equation xy = 1 ; The red lines indicate the rotated axes.
The asymptotes of this hyperbola are the x and y coordinate axes. To determine the angle \(\theta\) of rotation of the conic section, we
use the formula \(\cot 2\theta = \frac{A - C}{B}\). In this case \(A = C = 0\) and \(B = 1\), so \(\cot 2\theta = (0 - 0)/1 = 0\) and
\(\theta = 45°\). The method for graphing a conic section with rotated axes involves determining the coefficients of the conic in the
rotated coordinate system. The new coefficients are labeled \(A', B', C', D', E',\) and \(F'\), and are given by the formulas

\[A' = A\cos^2\theta + B\cos\theta\sin\theta + C\sin^2\theta \tag{10.5.10}\]

\[B' = 0 \tag{10.5.11}\]

\[C' = A\sin^2\theta - B\sin\theta\cos\theta + C\cos^2\theta \tag{10.5.12}\]

\[D' = D\cos\theta + E\sin\theta \tag{10.5.13}\]

\[E' = -D\sin\theta + E\cos\theta \tag{10.5.14}\]

\[F' = F. \tag{10.5.15}\]

 Procedure: graphing a rotated conic

The procedure for graphing a rotated conic is the following:


1. Identify the conic section using the discriminant \(4AC - B^2\).
2. Determine \(\theta\) using the formula

\[\cot 2\theta = \frac{A - C}{B}. \tag{10.5.16}\]

3. Calculate \(A', B', C', D', E',\) and \(F'\).
4. Rewrite the original equation using \(A', B', C', D', E',\) and \(F'\).
5. Draw a graph using the rotated equation.
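Steps 2 and 3 of this procedure can be sketched in Python. The function name `rotate_conic` is hypothetical, and the choice of \(2\theta\) between 0 and \(\pi\) is an assumption made so that the rotation angle is acute. The cross-term expression used for \(B'\) is the general one for a rotation by \(\theta\); it is computed here only to confirm that it vanishes for the chosen angle.

```python
import math

def rotate_conic(A, B, C, D, E, F):
    """Return (theta, (A', B', C', D', E', F')) with theta in radians."""
    if B == 0:
        return 0.0, (A, 0.0, C, D, E, F)   # no cross term, nothing to rotate
    # Choose 2*theta in (0, pi) so that cot(2*theta) = (A - C) / B.
    if B > 0:
        two_theta = math.atan2(B, A - C)
    else:
        two_theta = math.atan2(-B, C - A)
    theta = two_theta / 2
    s, c = math.sin(theta), math.cos(theta)
    A_p = A * c * c + B * c * s + C * s * s
    B_p = B * (c * c - s * s) + 2 * (C - A) * s * c   # vanishes up to rounding
    C_p = A * s * s - B * s * c + C * c * c
    D_p = D * c + E * s
    E_p = -D * s + E * c
    return theta, (A_p, B_p, C_p, D_p, E_p, F)

# The rectangular hyperbola xy - 1 = 0: A = C = 0, B = 1, F = -1.
theta, coeffs = rotate_conic(0, 1, 0, 0, 0, -1)
print(math.degrees(theta), coeffs)
```

For \(xy = 1\) the sketch returns an angle of 45 degrees with \(A' = 1/2\) and \(C' = -1/2\), that is, \(\frac{(x')^2}{2} - \frac{(y')^2}{2} = 1\) in the rotated coordinates.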

 Example 10.5.6: Identifying a Rotated Conic

Identify the conic and calculate the angle of rotation of axes for the curve described by the equation

\[13x^2 - 6\sqrt{3}xy + 7y^2 - 256 = 0.\]

Solution

In this equation, \(A = 13\), \(B = -6\sqrt{3}\), \(C = 7\), \(D = 0\), \(E = 0\), and \(F = -256\). The discriminant of this equation is

\[4AC - B^2 = 4(13)(7) - (-6\sqrt{3})^2 = 364 - 108 = 256.\]

Therefore this conic is an ellipse.

To calculate the angle of rotation of the axes, use Equation 10.5.16:

\[\cot 2\theta = \frac{A - C}{B}.\]

This gives

\[\cot 2\theta = \frac{A - C}{B} = \frac{13 - 7}{-6\sqrt{3}} = -\frac{\sqrt{3}}{3}.\]

Therefore \(2\theta = 120°\) and \(\theta = 60°\), which is the angle of the rotation of the axes.

To determine the rotated coefficients, use the formulas given above:

\[\begin{aligned}
A' &= A\cos^2\theta + B\cos\theta\sin\theta + C\sin^2\theta \\
&= 13\cos^2 60° + (-6\sqrt{3})\cos 60°\sin 60° + 7\sin^2 60° \\
&= 13\left(\frac{1}{2}\right)^2 - 6\sqrt{3}\left(\frac{1}{2}\right)\left(\frac{\sqrt{3}}{2}\right) + 7\left(\frac{\sqrt{3}}{2}\right)^2 \\
&= 4, \\
B' &= 0, \\
C' &= A\sin^2\theta - B\sin\theta\cos\theta + C\cos^2\theta \\
&= 13\sin^2 60° + (6\sqrt{3})\sin 60°\cos 60° + 7\cos^2 60° \\
&= 13\left(\frac{\sqrt{3}}{2}\right)^2 + 6\sqrt{3}\left(\frac{\sqrt{3}}{2}\right)\left(\frac{1}{2}\right) + 7\left(\frac{1}{2}\right)^2 \\
&= 16, \\
D' &= D\cos\theta + E\sin\theta = (0)\cos 60° + (0)\sin 60° = 0, \\
E' &= -D\sin\theta + E\cos\theta = -(0)\sin 60° + (0)\cos 60° = 0, \\
F' &= F = -256.
\end{aligned}\]

The equation of the conic in the rotated coordinate system becomes

\[4(x')^2 + 16(y')^2 = 256\]

\[\frac{(x')^2}{64} + \frac{(y')^2}{16} = 1.\]

A graph of this conic section appears as follows.


Figure 10.5.15: Graph of the ellipse described by the equation \(13x^2 - 6\sqrt{3}xy + 7y^2 - 256 = 0\). The axes are rotated 60°.
The red dashed lines indicate the rotated axes.

 Exercise 10.5.6

Identify the conic and calculate the angle of rotation of axes for the curve described by the equation
\[3x^2 + 5xy - 2y^2 - 125 = 0.\]

Hint
Follow steps 1 and 2 of the five-step method outlined above.

Answer
The conic is a hyperbola and the angle of rotation of the axes is θ = 22.5°.

Key Concepts
The equation of a vertical parabola in standard form with given focus and directrix is \(y = \frac{1}{4p}(x - h)^2 + k\), where \(p\) is
the distance from the vertex to the focus and \((h, k)\) are the coordinates of the vertex.
The equation of a horizontal ellipse in standard form is \(\frac{(x - h)^2}{a^2} + \frac{(y - k)^2}{b^2} = 1\), where the center has
coordinates \((h, k)\), the major axis has length \(2a\), the minor axis has length \(2b\), and the coordinates of the foci are
\((h \pm c, k)\), where \(c^2 = a^2 - b^2\).
The equation of a horizontal hyperbola in standard form is \(\frac{(x - h)^2}{a^2} - \frac{(y - k)^2}{b^2} = 1\), where the center has
coordinates \((h, k)\), the vertices are located at \((h \pm a, k)\), and the coordinates of the foci are \((h \pm c, k)\), where
\(c^2 = a^2 + b^2\).
The eccentricity of an ellipse is less than 1, the eccentricity of a parabola is equal to 1, and the eccentricity of a hyperbola is
greater than 1. The eccentricity of a circle is 0.
The polar equation of a conic section with eccentricity \(e\) is \(r = \frac{ep}{1 \pm e\cos\theta}\) or
\(r = \frac{ep}{1 \pm e\sin\theta}\), where \(p\) represents the focal parameter.
To identify a conic generated by the equation \(Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0\), first calculate the discriminant
\(4AC - B^2\). If \(4AC - B^2 > 0\) then the conic is an ellipse, if \(4AC - B^2 = 0\) then the conic is a parabola, and if
\(4AC - B^2 < 0\) then the conic is a hyperbola.

Glossary
conic section
a conic section is any curve formed by the intersection of a plane with a cone of two nappes

directrix
a directrix (plural: directrices) is a line used to construct and define a conic section; a parabola has one directrix; ellipses and
hyperbolas have two

discriminant
the value \(4AC - B^2\), which is used to identify a conic when the equation contains a term involving xy, is called a discriminant

focus
a focus (plural: foci) is a point used to construct and define a conic section; a parabola has one focus; an ellipse and a hyperbola
have two

eccentricity
the eccentricity is defined as the distance from any point on the conic section to its focus divided by the perpendicular distance
from that point to the nearest directrix

focal parameter
the focal parameter is the distance from a focus of a conic section to the nearest directrix

general form
an equation of a conic section written as a general second-degree equation

major axis
the major axis of a conic section passes through the vertex in the case of a parabola or through the two vertices in the case of an
ellipse or hyperbola; it is also an axis of symmetry of the conic; also called the transverse axis

minor axis
the minor axis is perpendicular to the major axis and intersects the major axis at the center of the conic, or at the vertex in the
case of the parabola; also called the conjugate axis

nappe
a nappe is one half of a double cone

standard form
an equation of a conic section showing its properties, such as location of the vertex or lengths of major and minor axes

vertex
a vertex is an extreme point on a conic section; a parabola has one vertex at its turning point. An ellipse has two vertices, one at
each end of the major axis; a hyperbola has two vertices, one at the turning point of each branch

10.5: Conic Sections is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
11.5: Conic Sections by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0. Original source:
https://openstax.org/details/books/calculus-volume-1.

10.6: Conic Sections in Polar Coordinates
 Learning Objectives
Identify a conic in polar form.
Graph the polar equations of conics.
Define conics in terms of a focus and a directrix.

Most of us are familiar with orbital motion, such as the motion of a planet around the sun or an electron around an atomic nucleus.
Within the planetary system, orbits of planets, asteroids, and comets around a larger celestial body are often elliptical. Comets,
however, may take on a parabolic or hyperbolic orbit instead. And, in reality, the characteristics of the planets’ orbits may vary over
time. Each orbit is tied to the location of the celestial body being orbited and the distance and direction of the planet or other object
from that body. As a result, we tend to use polar coordinates to represent these orbits.

Figure 10.6.1 : Planets orbiting the sun follow elliptical paths. (credit: NASA Blueshift, Flickr)
In an elliptical orbit, the periapsis is the point at which the two objects are closest, and the apoapsis is the point at which they are
farthest apart. Generally, the velocity of the orbiting body tends to increase as it approaches the periapsis and decrease as it
approaches the apoapsis. Some objects reach an escape velocity, which results in an infinite orbit. These bodies exhibit either a
parabolic or a hyperbolic orbit about a body; the orbiting body breaks free of the celestial body’s gravitational pull and fires off into
space. Each of these orbits can be modeled by a conic section in the polar coordinate system.

Identifying a Conic in Polar Form


Any conic may be determined by three characteristics: a single focus, a fixed line called the directrix, and the ratio of the distances
of each to a point on the graph. Consider the parabola \(x = 2 + y^2\) shown in Figure 10.6.2.

Figure 10.6.2
We previously learned how a parabola is defined by the focus (a fixed point) and the directrix (a fixed line). In this section, we will
learn how to define any conic in the polar coordinate system in terms of a fixed point, the focus P (r, θ) at the pole, and a line, the
directrix, which is perpendicular to the polar axis.

10.6.1 https://math.libretexts.org/@go/page/4510
If \(F\) is a fixed point, the focus, and \(D\) is a fixed line, the directrix, then we can let \(e\) be a fixed positive number, called
the eccentricity, which we can define as the ratio of the distance from a point on the graph to the focus to the distance from that point
to the directrix. Then the set of all points \(P\) such that \(e = \frac{PF}{PD}\) is a conic. In other words, we can define a conic as
the set of all points \(P\) with the property that the ratio of the distance from \(P\) to \(F\) to the distance from \(P\) to \(D\) is
equal to the constant \(e\).
For a conic with eccentricity \(e\),
if \(0 \leq e < 1\), the conic is an ellipse
if \(e = 1\), the conic is a parabola
if \(e > 1\), the conic is a hyperbola
With this definition, we may now define a conic in terms of the directrix, \(x = \pm p\), the eccentricity \(e\), and the angle
\(\theta\). Thus, each conic may be written as a polar equation, an equation written in terms of \(r\) and \(\theta\).

 THE POLAR EQUATION FOR A CONIC

For a conic with a focus at the origin, if the directrix is \(x = \pm p\), where \(p\) is a positive real number, and the eccentricity
is a positive real number \(e\), the conic has a polar equation

\[r = \frac{ep}{1 \pm e\cos\theta} \tag{10.6.1}\]

For a conic with a focus at the origin, if the directrix is \(y = \pm p\), where \(p\) is a positive real number, and the eccentricity
is a positive real number \(e\), the conic has a polar equation

\[r = \frac{ep}{1 \pm e\sin\theta} \tag{10.6.2}\]

 How to: Given the polar equation for a conic, identify the type of conic, the directrix, and the eccentricity.
1. Multiply the numerator and denominator by the reciprocal of the constant in the denominator to rewrite the equation in
standard form.
2. Identify the eccentricity \(e\) as the coefficient of the trigonometric function in the denominator.
3. Compare \(e\) with 1 to determine the shape of the conic.
4. Set \(ep\) equal to the numerator in standard form and solve for \(p\). The directrix is \(x = \pm p\) if cosine is in the
denominator and \(y = \pm p\) if sine is in the denominator, with the sign matching the sign in front of the trigonometric term.
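The four steps above can be sketched as a short Python function. This is an illustration under stated assumptions rather than a definitive implementation: the function name and argument order are hypothetical, the equation is taken in the form r = numerator / (constant + coefficient · trig(θ)) with a positive numerator and constant, and exact fractions are used so that values such as 2/3 are not rounded.

```python
from fractions import Fraction

def identify_polar_conic(numerator, constant, coefficient, trig):
    """Classify r = numerator / (constant + coefficient * trig(theta)).

    trig is "cos" or "sin". Returns (kind, e, axis, directrix_value), meaning
    the directrix is the line axis = directrix_value.
    """
    if trig not in ("cos", "sin"):
        raise ValueError('trig must be "cos" or "sin"')
    if numerator <= 0 or constant <= 0 or coefficient == 0:
        raise ValueError("expected numerator > 0, constant > 0, coefficient != 0")
    # Step 1: divide through by the constant so the denominator starts with 1.
    ep = Fraction(numerator) / Fraction(constant)
    signed_e = Fraction(coefficient) / Fraction(constant)
    # Step 2: the eccentricity is the size of the trig coefficient.
    e = abs(signed_e)
    # Step 3: compare e with 1.
    kind = "ellipse" if e < 1 else "parabola" if e == 1 else "hyperbola"
    # Step 4: solve ep = (numerator in standard form) for p; the sign of the
    # directrix follows the sign in front of the trig term.
    p = ep / e
    axis = "x" if trig == "cos" else "y"
    return kind, e, axis, (p if signed_e > 0 else -p)

print(identify_polar_conic(6, 3, 2, "sin"))   # r = 6 / (3 + 2 sin(theta))
```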

 Example 10.6.1: Identifying a Conic Given the Polar Form

For each of the following equations, identify the conic with focus at the origin, the directrix, and the eccentricity.
a. \(r = \frac{6}{3 + 2\sin\theta}\)
b. \(r = \frac{12}{4 + 5\cos\theta}\)
c. \(r = \frac{7}{2 - 2\sin\theta}\)

Solution
For each of the three conics, we will rewrite the equation in standard form. Standard form has a 1 as the constant in the
denominator. Therefore, in all three parts, the first step will be to multiply the numerator and denominator by the reciprocal of
the constant of the original equation, \(\frac{1}{c}\), where \(c\) is that constant.

a. Multiply the numerator and denominator by \(\frac{1}{3}\).

\[r = \frac{6}{3 + 2\sin\theta} \cdot \frac{\left(\frac{1}{3}\right)}{\left(\frac{1}{3}\right)} = \frac{6\left(\frac{1}{3}\right)}{3\left(\frac{1}{3}\right) + 2\left(\frac{1}{3}\right)\sin\theta} = \frac{2}{1 + \frac{2}{3}\sin\theta}\]

Because \(\sin\theta\) is in the denominator, the directrix is \(y = p\). Comparing to standard form, note that \(e = \frac{2}{3}\).
Therefore, from the numerator,

\[\begin{aligned}
2 &= ep \\
2 &= \frac{2}{3}p \\
\left(\frac{3}{2}\right)2 &= \left(\frac{3}{2}\right)\frac{2}{3}p \\
3 &= p
\end{aligned}\]

Since \(e < 1\), the conic is an ellipse. The eccentricity is \(e = \frac{2}{3}\) and the directrix is \(y = 3\).

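b. Multiply the numerator and denominator by \(\frac{1}{4}\).

\[\begin{aligned}
r &= \frac{12}{4 + 5\cos\theta} \cdot \frac{\left(\frac{1}{4}\right)}{\left(\frac{1}{4}\right)} \\
r &= \frac{12\left(\frac{1}{4}\right)}{4\left(\frac{1}{4}\right) + 5\left(\frac{1}{4}\right)\cos\theta} \\
r &= \frac{3}{1 + \frac{5}{4}\cos\theta}
\end{aligned}\]

Because \(\cos\theta\) is in the denominator, the directrix is \(x = p\). Comparing to standard form, \(e = \frac{5}{4}\). Therefore,
from the numerator,

\[\begin{aligned}
3 &= ep \\
3 &= \frac{5}{4}p \\
\left(\frac{4}{5}\right)3 &= \left(\frac{4}{5}\right)\frac{5}{4}p \\
\frac{12}{5} &= p
\end{aligned}\]

Since \(e > 1\), the conic is a hyperbola. The eccentricity is \(e = \frac{5}{4}\) and the directrix is \(x = \frac{12}{5} = 2.4\).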

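c. Multiply the numerator and denominator by \(\frac{1}{2}\).

\[\begin{aligned}
r &= \frac{7}{2 - 2\sin\theta} \cdot \frac{\left(\frac{1}{2}\right)}{\left(\frac{1}{2}\right)} \\
r &= \frac{7\left(\frac{1}{2}\right)}{2\left(\frac{1}{2}\right) - 2\left(\frac{1}{2}\right)\sin\theta} \\
r &= \frac{\frac{7}{2}}{1 - \sin\theta}
\end{aligned}\]

Because sine is in the denominator and is subtracted, the directrix is \(y = -p\). Comparing to standard form, \(e = 1\). Therefore,
from the numerator,

\[\begin{aligned}
\frac{7}{2} &= ep \\
\frac{7}{2} &= (1)p \\
\frac{7}{2} &= p
\end{aligned}\]

Because \(e = 1\), the conic is a parabola. The eccentricity is \(e = 1\) and the directrix is \(y = -\frac{7}{2} = -3.5\).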

 Exercise 10.6.1
Identify the conic with focus at the origin, the directrix, and the eccentricity for \(r = \frac{2}{3 - \cos\theta}\).

Answer
ellipse; \(e = \frac{1}{3}\); \(x = -2\)

Graphing the Polar Equations of Conics


When graphing in Cartesian coordinates, each conic section has a unique equation. This is not the case when graphing in polar
coordinates. We must use the eccentricity of a conic section to determine which type of curve to graph, and then determine its
specific characteristics. The first step is to rewrite the conic in standard form as we have done in the previous example. In other
words, we need to rewrite the equation so that the denominator begins with 1. This enables us to determine \(e\) and, therefore, the
shape of the curve. The next step is to substitute values for \(\theta\) and solve for \(r\) to plot a few key points. Setting
\(\theta\) equal to \(0\), \(\frac{\pi}{2}\), \(\pi\), and \(\frac{3\pi}{2}\) provides the vertices so we can create a rough sketch of
the graph.
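The key points at these four angles are easy to tabulate. The Python sketch below is illustrative: the function name `key_points` and its argument convention, r = numerator / (constant + coefficient · trig(θ)), are assumptions, and angles where the denominator vanishes are reported as `None`, meaning r is undefined there.

```python
import math

def key_points(numerator, constant, coefficient, trig):
    """Evaluate r = numerator / (constant + coefficient * trig(theta)) at 0, pi/2, pi, 3pi/2."""
    f = {"cos": math.cos, "sin": math.sin}[trig]
    angles = [("0", 0.0), ("pi/2", math.pi / 2), ("pi", math.pi), ("3pi/2", 3 * math.pi / 2)]
    points = []
    for label, theta in angles:
        denom = constant + coefficient * f(theta)
        # Treat a denominator that is zero up to rounding as "undefined".
        r = None if abs(denom) < 1e-12 else numerator / denom
        points.append((label, r))
    return points

# An ellipse: r = 10 / (5 - 4 cos(theta)).
for label, r in key_points(10, 5, -4, "cos"):
    print(label, r)
```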

 Example 10.6.2A: Graphing a Parabola in Polar Form


Graph \(r = \frac{5}{3 + 3\cos\theta}\).

Solution
First, we rewrite the conic in standard form by multiplying the numerator and denominator by the reciprocal of 3, which is
\(\frac{1}{3}\).

\[\begin{aligned}
r &= \frac{5}{3 + 3\cos\theta} = \frac{5\left(\frac{1}{3}\right)}{3\left(\frac{1}{3}\right) + 3\left(\frac{1}{3}\right)\cos\theta} \\
r &= \frac{\frac{5}{3}}{1 + \cos\theta}
\end{aligned}\]

Because \(e = 1\), we will graph a parabola with a focus at the origin. The function has a \(\cos\theta\), and there is an addition
sign in the denominator, so the directrix is \(x = p\).

\[\begin{aligned}
\frac{5}{3} &= ep \\
\frac{5}{3} &= (1)p \\
\frac{5}{3} &= p
\end{aligned}\]

The directrix is \(x = \frac{5}{3}\).

Plotting a few key points as in Table 10.6.1 will enable us to see the vertices. See Figure 10.6.3.
Table 10.6.1
 | A | B | C | D
\(\theta\) | \(0\) | \(\frac{\pi}{2}\) | \(\pi\) | \(\frac{3\pi}{2}\)
\(r = \frac{5}{3 + 3\cos\theta}\) | \(\frac{5}{6} \approx 0.83\) | \(\frac{5}{3} \approx 1.67\) | undefined | \(\frac{5}{3} \approx 1.67\)
Figure 10.6.3
We can check our result with a graphing utility. See Figure 10.6.4.

Figure 10.6.4

 Example 10.6.2B: Graphing a Hyperbola in Polar Form


Graph \(r = \frac{8}{2 - 3\sin\theta}\).

Solution
First, we rewrite the conic in standard form by multiplying the numerator and denominator by the reciprocal of 2, which is
\(\frac{1}{2}\).

\[\begin{aligned}
r &= \frac{8}{2 - 3\sin\theta} = \frac{8\left(\frac{1}{2}\right)}{2\left(\frac{1}{2}\right) - 3\left(\frac{1}{2}\right)\sin\theta} \\
r &= \frac{4}{1 - \frac{3}{2}\sin\theta}
\end{aligned}\]

Because \(e = \frac{3}{2}\), \(e > 1\), so we will graph a hyperbola with a focus at the origin. The function has a \(\sin\theta\)
term and there is a subtraction sign in the denominator, so the directrix is \(y = -p\).

\[\begin{aligned}
4 &= ep \\
4 &= \left(\frac{3}{2}\right)p \\
4\left(\frac{2}{3}\right) &= p \\
\frac{8}{3} &= p
\end{aligned}\]

The directrix is \(y = -\frac{8}{3}\).

Plotting a few key points as in Table 10.6.2 will enable us to see the vertices. See Figure 10.6.5.
Table 10.6.2
 | A | B | C | D
\(\theta\) | \(0\) | \(\frac{\pi}{2}\) | \(\pi\) | \(\frac{3\pi}{2}\)
\(r = \frac{8}{2 - 3\sin\theta}\) | \(4\) | \(-8\) | \(4\) | \(\frac{8}{5} = 1.6\)

Figure 10.6.5

 Example 10.6.2C: Graphing an Ellipse in Polar Form


Graph \(r = \frac{10}{5 - 4\cos\theta}\).

Solution
First, we rewrite the conic in standard form by multiplying the numerator and denominator by the reciprocal of 5, which is
\(\frac{1}{5}\).

\[\begin{aligned}
r &= \frac{10}{5 - 4\cos\theta} = \frac{10\left(\frac{1}{5}\right)}{5\left(\frac{1}{5}\right) - 4\left(\frac{1}{5}\right)\cos\theta} \\
r &= \frac{2}{1 - \frac{4}{5}\cos\theta}
\end{aligned}\]

Because \(e = \frac{4}{5}\), \(e < 1\), so we will graph an ellipse with a focus at the origin. The function has a \(\cos\theta\), and
there is a subtraction sign in the denominator, so the directrix is \(x = -p\).

\[\begin{aligned}
2 &= ep \\
2 &= \left(\frac{4}{5}\right)p \\
2\left(\frac{5}{4}\right) &= p \\
\frac{5}{2} &= p
\end{aligned}\]

The directrix is \(x = -\frac{5}{2}\).

Plotting a few key points as in Table 10.6.3 will enable us to see the vertices. See Figure 10.6.6.
Table 10.6.3
 | A | B | C | D
\(\theta\) | \(0\) | \(\frac{\pi}{2}\) | \(\pi\) | \(\frac{3\pi}{2}\)
\(r = \frac{10}{5 - 4\cos\theta}\) | \(10\) | \(2\) | \(\frac{10}{9} \approx 1.1\) | \(2\)

Figure 10.6.6
Analysis
We can check our result using a graphing utility. See Figure 10.6.7.

Figure 10.6.6 : \(r = \frac{10}{5 - 4\cos\theta}\) graphed on a viewing window of [-3, 12, 1] by [-4, 4, 1], \(\theta_{\min} = 0\) and
\(\theta_{\max} = 2\pi\).

 Exercise 10.6.2
Graph \(r = \frac{2}{4 - \cos\theta}\).

Answer

Figure 10.6.7

Defining Conics in Terms of a Focus and a Directrix


So far we have been using polar equations of conics to describe and graph the curve. Now we will work in reverse; we will use
information about the origin, eccentricity, and directrix to determine the polar equation.

 How to: Given the focus, eccentricity, and directrix of a conic, determine the polar equation
1. Determine whether the directrix is horizontal or vertical. If the directrix is given in terms of \(y\), we use the general polar
form in terms of sine. If the directrix is given in terms of \(x\), we use the general polar form in terms of cosine.
2. Determine the sign in the denominator. If the constant in the directrix equation is negative, use subtraction. If it is positive,
use addition.
3. Write the coefficient of the trigonometric function as the given eccentricity.
4. Take \(p\) to be the absolute value of the constant in the directrix equation, write the product \(ep\) in the numerator, and
simplify the equation.
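The steps above can be sketched in Python. The function name and return convention below are hypothetical: the result describes the standard form r = ep / (1 + sign · e · trig(θ)), with the focus assumed to be at the pole, and exact fractions keep an eccentricity such as 3/5 unrounded.

```python
from fractions import Fraction

def polar_conic_from_directrix(e, axis, value):
    """Return (ep, sign, e, trig) for r = ep / (1 + sign * e * trig(theta)).

    axis is "x" or "y", and the directrix is the line axis = value (value != 0).
    """
    if axis not in ("x", "y"):
        raise ValueError('axis must be "x" or "y"')
    e = Fraction(e)
    value = Fraction(value)
    if e <= 0 or value == 0:
        raise ValueError("expected e > 0 and a nonzero directrix constant")
    trig = "cos" if axis == "x" else "sin"   # step 1
    sign = 1 if value > 0 else -1            # step 2
    p = abs(value)                           # step 4: p is the absolute value
    return e * p, sign, e, trig              # step 3: e multiplies the trig term

# e = 3 with directrix y = -2 gives r = 6 / (1 - 3 sin(theta)).
print(polar_conic_from_directrix(3, "y", -2))
```

Multiplying the numerator and denominator by the denominator of e then clears the fractions, if a form with integer coefficients is preferred.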

 Example 10.6.3A: Finding the Polar Form of a Vertical Conic Given a Focus at the Origin and the
Eccentricity and Directrix
Find the polar form of the conic given a focus at the origin, \(e = 3\), and directrix \(y = -2\).
Solution
The directrix is \(y = -p\), so we know the trigonometric function in the denominator is sine.
Because \(y = -2\) and \(-2 < 0\), we know there is a subtraction sign in the denominator. We use the standard form of

\[r = \frac{ep}{1 - e\sin\theta}\]

with \(e = 3\) and \(|-2| = 2 = p\).
Therefore,

\[\begin{aligned}
r &= \frac{(3)(2)}{1 - 3\sin\theta} \\
r &= \frac{6}{1 - 3\sin\theta}
\end{aligned}\]

 Example 10.6.3B: Finding the Polar Form of a Horizontal Conic Given a Focus at the Origin and the
Eccentricity and Directrix
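Find the polar form of a conic given a focus at the origin, \(e = \frac{3}{5}\), and directrix \(x = 4\).

Solution
Because the directrix is \(x = p\), we know the function in the denominator is cosine. Because \(x = 4\) and \(4 > 0\), we know there
is an addition sign in the denominator. We use the standard form of

\[r = \frac{ep}{1 + e\cos\theta}\]

with \(e = \frac{3}{5}\) and \(|4| = 4 = p\).

Therefore,

\[\begin{aligned}
r &= \frac{\left(\frac{3}{5}\right)(4)}{1 + \frac{3}{5}\cos\theta} \\
r &= \frac{\frac{12}{5}}{1 + \frac{3}{5}\cos\theta} \\
r &= \frac{\frac{12}{5}}{1\left(\frac{5}{5}\right) + \frac{3}{5}\cos\theta} \\
r &= \frac{\frac{12}{5}}{\frac{5}{5} + \frac{3}{5}\cos\theta} \\
r &= \frac{12}{5} \cdot \frac{5}{5 + 3\cos\theta} \\
r &= \frac{12}{5 + 3\cos\theta}
\end{aligned}\]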

 Exercise 10.6.3
Find the polar form of the conic given a focus at the origin, e = 1 , and directrix x = −1 .

Answer
\(r = \frac{1}{1 - \cos\theta}\)

 Example 10.6.4: Converting a Conic in Polar Form to Rectangular Form


Convert the conic \(r = \frac{1}{5 - 5 \sin\theta}\) to rectangular form.

Solution
We will rearrange the formula to use the identities \(r = \sqrt{x^2 + y^2}\), \(x = r \cos\theta\), and \(y = r \sin\theta\).
\[\begin{aligned}
r &= \frac{1}{5 - 5 \sin\theta} \\
r \cdot (5 - 5 \sin\theta) &= \frac{1}{5 - 5 \sin\theta} \cdot (5 - 5 \sin\theta) && \text{Eliminate the fraction.} \\
5r - 5r \sin\theta &= 1 && \text{Distribute.} \\
5r &= 1 + 5r \sin\theta && \text{Isolate } 5r. \\
25r^2 &= (1 + 5r \sin\theta)^2 && \text{Square both sides.} \\
25(x^2 + y^2) &= (1 + 5y)^2 && \text{Substitute } r = \sqrt{x^2 + y^2} \text{ and } y = r \sin\theta. \\
25x^2 + 25y^2 &= 1 + 10y + 25y^2 && \text{Distribute and use FOIL.} \\
25x^2 - 10y &= 1 && \text{Rearrange terms and set equal to 1.}
\end{aligned}\]
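As a numerical sanity check of this conversion, the sketch below samples a few angles, converts the polar points to rectangular coordinates, and evaluates 25x² − 10y − 1, which should be zero up to rounding. The sample angles are arbitrary choices that avoid θ = π/2, where the denominator vanishes.

```python
import math


def polar_point(theta):
    """Point on r = 1 / (5 - 5 sin(theta)), returned as rectangular (x, y)."""
    r = 1.0 / (5.0 - 5.0 * math.sin(theta))
    return r * math.cos(theta), r * math.sin(theta)


def residual(theta):
    """Value of 25x^2 - 10y - 1 at the point for this angle (ideally 0)."""
    x, y = polar_point(theta)
    return 25.0 * x**2 - 10.0 * y - 1.0


sample_angles = [0.0, 1.0, 2.5, 4.0, 5.5]
max_residual = max(abs(residual(t)) for t in sample_angles)
print(max_residual)  # tiny: only floating-point rounding remains
```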

 Exercise 10.6.4
Convert the conic \(r = \frac{2}{1 + 2 \cos\theta}\) to rectangular form.

Answer
\[ 4 - 8x + 3x^2 - y^2 = 0 \]

 Media
Access these online resources for additional instruction and practice with conics in polar coordinates.
Polar Equations of Conic Sections
Graphing Polar Equations of Conics - 1
Graphing Polar Equations of Conics - 2

Visit this website for additional practice questions from Learningpod.

Key Concepts
Any conic may be determined by a single focus, the corresponding eccentricity, and the directrix. We can also define a conic in
terms of a fixed point, the focus P (r, θ) at the pole, and a line, the directrix, which is perpendicular to the polar axis.
A conic is the set of all points for which \(e = \frac{PF}{PD}\), where eccentricity \(e\) is a positive real number. Each conic may be written in terms of its polar equation. See Example 10.6.1.
The polar equations of conics can be graphed. See Example 10.6.2, Example 10.6.3, and Example 10.6.4.
Conics can be defined in terms of a focus, a directrix, and eccentricity. See Example 10.6.5 and Example 10.6.6.

We can use the identities \(r = \sqrt{x^2 + y^2}\), \(x = r \cos\theta\), and \(y = r \sin\theta\) to convert the equation for a conic from polar to rectangular form. See Example 10.6.7.

10.6: Conic Sections in Polar Coordinates is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
12.5: Conic Sections in Polar Coordinates by OpenStax is licensed CC BY 4.0. Original source:
https://openstax.org/details/books/precalculus.

CHAPTER OVERVIEW

11: Infinite Sequences And Series


A general Calculus Textmap organized around the textbook

Calculus: Early Transcendentals


by James Stewart



11.1: Sequences
11.2: Series
11.3: The Integral Test and Estimates of Sums
11.4: The Comparison Tests
11.5: Alternating Series
11.6: Absolute Convergence and the Ratio and Root Test
11.7: Strategy for Testing Series
11.8: Power Series
11.9: Representations of Functions as Power Series
11.10: Taylor and Maclaurin Series
11.11: Applications of Taylor Polynomials

11: Infinite Sequences And Series is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.

11.1: Sequences
While the idea of a sequence of numbers, \(a_1, a_2, a_3, \ldots\) is straightforward, it is useful to think of a sequence as a function. We have up until now dealt with functions whose domains are the real numbers, or a subset of the real numbers, like \(f(x) = \sin x\). A sequence is a function with domain the natural numbers \(\mathbb{N} = \{1, 2, 3, \ldots\}\) or the non-negative integers, \(\mathbb{Z}^{\ge 0} = \{0, 1, 2, 3, \ldots\}\).
The range of the function is still allowed to be the real numbers; in symbols, we say that a sequence is a function \(f: \mathbb{N} \to \mathbb{R}\).
Sequences are written in a few different ways, all equivalent; these all mean the same thing:
\[ a_1, a_2, a_3, \ldots \qquad \{a_n\}_{n=1}^{\infty} \qquad \{f(n)\}_{n=1}^{\infty} \tag{11.1.1} \]

As with functions on the real numbers, we will most often encounter sequences that can be expressed by a formula. We have already seen the sequence \(a_i = f(i) = 1 - 1/2^i\), and others are easy to come by:
\[\begin{aligned}
f(i) &= \frac{i}{i+1} \\
f(n) &= \frac{1}{2^n} \\
f(n) &= \sin(n\pi/6) \\
f(i) &= \frac{(i-1)(i+2)}{2^i}.
\end{aligned} \tag{11.1.2}\]

Frequently these formulas will make sense if thought of either as functions with domain \(\mathbb{R}\) or \(\mathbb{N}\), though occasionally one will make sense only for integer values.
Faced with a sequence we are interested in the limit \(\lim_{i\to\infty} f(i) = \lim_{i\to\infty} a_i\). We already understand \(\lim_{x\to\infty} f(x)\) when \(x\) is a real valued variable; now we simply want to restrict the "input" values to be integers. No real difference is required in the definition of limit, except that we specify, perhaps implicitly, that the variable is an integer. Compare this definition to definition 4.10.2.

Definition 11.1.1: Converging and Diverging Sequences



Suppose that \(\{a_n\}_{n=1}^{\infty}\) is a sequence. We say that \(\lim_{n\to\infty} a_n = L\) if for every \(\epsilon > 0\) there is an \(N > 0\) so that whenever \(n > N\), \(|a_n - L| < \epsilon\). If \(\lim_{n\to\infty} a_n = L\) we say that the sequence converges, otherwise it diverges.

If \(f(i)\) defines a sequence, and \(f(x)\) makes sense, and \(\lim_{x\to\infty} f(x) = L\), then it is clear that \(\lim_{i\to\infty} f(i) = L\) as well, but it is important to note that the converse of this statement is not true. For example, since \(\lim_{x\to\infty} (1/x) = 0\), it is clear that also \(\lim_{i\to\infty} (1/i) = 0\), that is, the numbers
\[ \frac{1}{1}, \frac{1}{2}, \frac{1}{3}, \frac{1}{4}, \frac{1}{5}, \frac{1}{6}, \ldots \tag{11.1.3} \]
get closer and closer to 0. Consider this, however: Let \(f(n) = \sin(n\pi)\).
This is the sequence
\[ \sin(0\pi), \sin(1\pi), \sin(2\pi), \sin(3\pi), \ldots = 0, 0, 0, 0, \ldots \tag{11.1.4} \]
since
\[ \sin(n\pi) = 0 \tag{11.1.5} \]
when \(n\) is an integer. Thus \(\lim_{n\to\infty} f(n) = 0\). But \(\lim_{x\to\infty} f(x)\), when \(x\) is real, does not exist: as \(x\) gets bigger and bigger, the values \(\sin(x\pi)\) do not get closer and closer to a single value, but take on all values between \(-1\) and \(1\) over and over. In general, whenever you want to know \(\lim_{n\to\infty} f(n)\) you should first attempt to compute \(\lim_{x\to\infty} f(x)\), since if the latter exists it is also equal to the first limit. But if for some reason \(\lim_{x\to\infty} f(x)\) does not exist, it may still be true that \(\lim_{n\to\infty} f(n)\) exists, but you'll have to figure out another way to compute it.
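The contrast between the sequence sin(nπ) and the real function sin(xπ) can be seen numerically. The sketch below evaluates the same formula at integers and at half-integers; the choice of half-integers and of 50 sample points is an assumption made only for illustration, and floating-point values at integers are tiny rather than exactly zero.

```python
import math


def f(x):
    """The formula sin(x * pi), usable for integer or real x."""
    return math.sin(x * math.pi)


# Restricting the input to integers gives (numerically) the zero sequence.
integer_values = [f(n) for n in range(50)]

# At half-integers the same formula alternates between 1 and -1,
# so the real-variable limit cannot exist.
half_integer_values = [f(n + 0.5) for n in range(50)]

print(max(abs(v) for v in integer_values))
print(min(half_integer_values), max(half_integer_values))
```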

It is occasionally useful to think of the graph of a sequence. Since the function is defined only for integer values, the graph is just a
sequence of dots. In figure 11.1.1 we see the graphs of two sequences and the graphs of the corresponding real functions.

Figure 11.1.1. Graphs of sequences and their corresponding real functions.


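Not surprisingly, the properties of limits of real functions translate into properties of sequences quite easily. Theorem 2.3.6 about limits becomes

Theorem 11.1.2
Suppose that \(\lim_{n\to\infty} a_n = L\) and \(\lim_{n\to\infty} b_n = M\) and \(k\) is some constant. Then
\[\begin{aligned}
\lim_{n\to\infty} k a_n &= k \lim_{n\to\infty} a_n = kL \\
\lim_{n\to\infty} (a_n + b_n) &= \lim_{n\to\infty} a_n + \lim_{n\to\infty} b_n = L + M \\
\lim_{n\to\infty} (a_n - b_n) &= \lim_{n\to\infty} a_n - \lim_{n\to\infty} b_n = L - M \\
\lim_{n\to\infty} (a_n b_n) &= \lim_{n\to\infty} a_n \cdot \lim_{n\to\infty} b_n = LM \\
\lim_{n\to\infty} \frac{a_n}{b_n} &= \frac{\lim_{n\to\infty} a_n}{\lim_{n\to\infty} b_n} = \frac{L}{M}, \text{ if } M \text{ is not } 0.
\end{aligned} \tag{11.1.6}\]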

Likewise the Squeeze Theorem (4.3.1) becomes

Theorem 11.1.3
Suppose that
\[ a_n \le b_n \le c_n \tag{11.1.7} \]
for all \(n > N\), for some \(N\). If
\[ \lim_{n\to\infty} a_n = \lim_{n\to\infty} c_n = L, \tag{11.1.8} \]
then
\[ \lim_{n\to\infty} b_n = L. \tag{11.1.9} \]

And a final useful fact:

Theorem 11.1.4
\[ \lim_{n\to\infty} |a_n| = 0 \tag{11.1.10} \]
if and only if
\[ \lim_{n\to\infty} a_n = 0. \tag{11.1.11} \]

This theorem says simply that the size of \(a_n\) gets close to zero if and only if \(a_n\) gets close to zero.

Example 11.1.5

Determine whether \(\left\{\frac{n}{n+1}\right\}_{n=0}^{\infty}\) converges or diverges. If it converges, compute the limit.

Solution
Since this makes sense for real numbers we consider
\[ \lim_{x\to\infty} \frac{x}{x+1} = \lim_{x\to\infty} 1 - \frac{1}{x+1} = 1 - 0 = 1. \tag{11.1.12} \]
Thus the sequence converges to 1.

Example 11.1.6

Determine whether \(\left\{\frac{\ln n}{n}\right\}_{n=1}^{\infty}\) converges or diverges. If it converges, compute the limit.

Solution
We compute \(\lim_{x\to\infty} \frac{\ln x}{x} = \lim_{x\to\infty} \frac{1/x}{1} = 0\), using L'Hôpital's Rule. Thus the sequence converges to 0.

Example 11.1.7

Determine whether \(\{(-1)^n\}_{n=0}^{\infty}\) converges or diverges. If it converges, compute the limit.
Solution
This does not make sense for all real exponents, but the sequence is easy to understand: it is \(1, -1, 1, -1, 1, \ldots\) and clearly diverges.

Example 11.1.8
Determine whether \(\{(-1/2)^n\}_{n=0}^{\infty}\) converges or diverges. If it converges, compute the limit.
Solution
We consider the sequence
\[ \{|(-1/2)^n|\}_{n=0}^{\infty} = \{(1/2)^n\}_{n=0}^{\infty}. \tag{11.1.13} \]
Then
\[ \lim_{x\to\infty} \left(\frac{1}{2}\right)^x = \lim_{x\to\infty} \frac{1}{2^x} = 0, \tag{11.1.14} \]
so by theorem 11.1.4 the sequence converges to 0.

Example 11.1.9
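Determine whether \(\{(\sin n)/\sqrt{n}\}_{n=1}^{\infty}\) converges or diverges. If it converges, compute the limit.
Solution
Since \(|\sin n| \le 1\), \(0 \le |\sin n/\sqrt{n}| \le 1/\sqrt{n}\) and we can use theorem 11.1.3 with \(a_n = 0\) and \(c_n = 1/\sqrt{n}\). Since \(\lim_{n\to\infty} a_n = \lim_{n\to\infty} c_n = 0\), we get \(\lim_{n\to\infty} |\sin n/\sqrt{n}| = 0\). By theorem 11.1.4, \(\lim_{n\to\infty} \sin n/\sqrt{n} = 0\) and the sequence converges to 0.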

Example 11.1.10
A particularly common and useful sequence is \(\{r^n\}_{n=0}^{\infty}\), for various values of \(r\). Some are quite easy to understand: If \(r = 1\) the sequence converges to 1 since every term is 1, and likewise if \(r = 0\) the sequence converges to 0. If \(r = -1\) this is the sequence of example 11.1.7 and diverges. If \(r > 1\) or \(r < -1\) the terms \(r^n\) get large without limit, so the sequence diverges. If \(0 < r < 1\) then the sequence converges to 0. If \(-1 < r < 0\) then \(|r^n| = |r|^n\) and \(0 < |r| < 1\), so the sequence \(\{|r|^n\}_{n=0}^{\infty}\) converges to 0, so also \(\{r^n\}_{n=0}^{\infty}\) converges to 0. In summary, \(\{r^n\}\) converges precisely when \(-1 < r \le 1\), in which case
\[ \lim_{n\to\infty} r^n = \begin{cases} 0 & \text{if } -1 < r < 1 \\ 1 & \text{if } r = 1. \end{cases} \]
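The cases in this example can be explored numerically. The sketch below lists the first terms of r^n for a few values of r; the helper name `powers` and the particular values of r are illustrative assumptions.

```python
def powers(r, count):
    """First `count` terms r^0, r^1, ..., r^(count-1) of the sequence {r^n}."""
    return [r**n for n in range(count)]


# -1 < r < 1: terms shrink toward 0 (alternating in sign when r < 0).
print(powers(0.5, 6))
print(powers(-0.5, 6))

# r = 1 is constant; r = -1 oscillates between 1 and -1 and diverges.
print(powers(1, 6))
print(powers(-1, 6))

# |r| > 1: terms grow without bound.
print(powers(1.1, 200)[-1])
```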

Sometimes we will not be able to determine the limit of a sequence, but we still would like to know whether it converges. In some cases we can determine this even without being able to compute the limit.
A sequence is called increasing or sometimes strictly increasing if \(a_i < a_{i+1}\) for all \(i\). It is called non-decreasing or sometimes (unfortunately) increasing if \(a_i \le a_{i+1}\) for all \(i\). Similarly a sequence is decreasing if \(a_i > a_{i+1}\) for all \(i\) and non-increasing if \(a_i \ge a_{i+1}\) for all \(i\). If a sequence has any of these properties it is called monotonic.
i+1

Example 11.1.11
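The sequence
\[ \left\{\frac{2^i - 1}{2^i}\right\}_{i=1}^{\infty} = \frac{1}{2}, \frac{3}{4}, \frac{7}{8}, \frac{15}{16}, \ldots \tag{11.1.15} \]
is increasing, and
\[ \left\{\frac{n+1}{n}\right\}_{n=1}^{\infty} = \frac{2}{1}, \frac{3}{2}, \frac{4}{3}, \frac{5}{4}, \ldots \tag{11.1.16} \]
is decreasing.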

A sequence is bounded above if there is some number \(N\) such that \(a_n \le N\) for every \(n\), and bounded below if there is some number \(N\) such that \(a_n \ge N\) for every \(n\). If a sequence is bounded above and bounded below it is bounded. If a sequence \(\{a_n\}_{n=0}^{\infty}\) is increasing or non-decreasing it is bounded below (by \(a_0\)), and if it is decreasing or non-increasing it is bounded above (by \(a_0\)). Finally, with all this new terminology we can state an important theorem.

Theorem 11.1.12
If a sequence is bounded and monotonic, then it converges.

We will not prove this; the proof appears in many calculus books. It is not hard to believe: suppose that a sequence is increasing and bounded, so each term is larger than the one before, yet never larger than some fixed value \(N\). The terms must then get closer and closer to some value between \(a_0\) and \(N\). It need not be \(N\), since \(N\) may be a "too-generous" upper bound; the limit will be the smallest number that is above all of the terms \(a_i\).

Example 11.1.13

All of the terms \((2^i - 1)/2^i\) are less than 2, and the sequence is increasing. As we have seen, the limit of the sequence is 1; 1 is the smallest number that is bigger than all the terms in the sequence. Similarly, all of the terms \((n + 1)/n\) are bigger than 1/2, and the limit is 1; 1 is the largest number that is smaller than the terms of the sequence.

We do not actually need to know that a sequence is monotonic to apply this theorem; it is enough to know that the sequence is "eventually" monotonic, that is, that at some point it becomes increasing or decreasing. For example, the sequence 10, 9, 8, 15, 3, 21, 4, 3/4, 7/8, 15/16, 31/32, … is not increasing, because among the first few terms it is not. But starting with the term 3/4 it is increasing, so the theorem tells us that the sequence 3/4, 7/8, 15/16, 31/32, … converges. Since convergence depends only on what happens as \(n\) gets large, adding a few terms at the beginning can't turn a convergent sequence into a divergent one.

Example 11.1.14

Show that \(\{n^{1/n}\}\) converges.
Solution
We first show that this sequence is decreasing, that is, that \(n^{1/n} > (n+1)^{1/(n+1)}\). Consider the real function \(f(x) = x^{1/x}\) when \(x \ge 1\). We can compute the derivative, \(f'(x) = x^{1/x}(1 - \ln x)/x^2\), and note that when \(x \ge 3\) this is negative. Since the function has negative slope, \(n^{1/n} > (n+1)^{1/(n+1)}\) when \(n \ge 3\). Since all terms of the sequence are positive, the sequence is decreasing and bounded when \(n \ge 3\), and so the sequence converges. (As it happens, we can compute the limit in this case, but we know it converges even without knowing the limit; see exercise 1.)

Example 11.1.15
Show that \(\{n!/n^n\}\) converges.
Solution
Again we show that the sequence is decreasing, and since each term is positive the sequence converges. We can't take the derivative this time, as \(x!\) doesn't make sense for \(x\) real. But we note that if \(a_{n+1}/a_n < 1\) then \(a_{n+1} < a_n\), which is what we want to know. So we look at \(a_{n+1}/a_n\):
\[ \frac{a_{n+1}}{a_n} = \frac{(n+1)!}{(n+1)^{n+1}} \cdot \frac{n^n}{n!} = \frac{(n+1)!}{n!} \cdot \frac{n^n}{(n+1)^{n+1}} = \frac{n+1}{n+1} \left(\frac{n}{n+1}\right)^n = \left(\frac{n}{n+1}\right)^n < 1. \tag{11.1.17} \]
(Again it is possible to compute the limit; see exercise 2.)
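The ratio computed in this example can be checked exactly with rational arithmetic. This is a minimal sketch; the helper names `a` and `ratio` are chosen here for illustration, and the index is assumed to start at n = 1 so that every term is defined.

```python
from fractions import Fraction
from math import factorial


def a(n):
    """Exact value of the term n!/n^n for n >= 1."""
    return Fraction(factorial(n), n**n)


def ratio(n):
    """Exact value of a(n+1)/a(n)."""
    return a(n + 1) / a(n)


for n in range(1, 6):
    # The ratio equals (n/(n+1))^n and stays below 1, so the terms decrease.
    print(n, a(n), ratio(n), Fraction(n, n + 1) ** n)
```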

Contributors
David Guichard (Whitman College)
Integrated by Justin Marshall.

11.1: Sequences is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
11.2: Sequences by David Guichard is licensed CC BY-NC-SA 4.0.

11.2: Series
While much more can be said about sequences, we now turn to our principal interest, series. Recall that a series, roughly speaking, is the sum of a sequence: if \(\{a_n\}_{n=0}^{\infty}\) is a sequence then the associated series is
\[ \sum_{n=0}^{\infty} a_n = a_0 + a_1 + a_2 + \cdots \tag{11.2.1} \]
Associated with a series is a second sequence, called the sequence of partial sums:
\[ \{s_n\}_{n=0}^{\infty} \tag{11.2.2} \]
with
\[ s_n = \sum_{i=0}^{n} a_i. \tag{11.2.3} \]
So
\[ s_0 = a_0, \quad s_1 = a_0 + a_1, \quad s_2 = a_0 + a_1 + a_2, \quad \ldots \tag{11.2.4} \]
A series converges if the sequence of partial sums converges, and otherwise the series diverges.

Example 11.2.1: Geometric Series


If \(a_n = kx^n\), then
\[ \sum_{n=0}^{\infty} a_n \tag{11.2.5} \]
is called a geometric series. A typical partial sum is
\[ s_n = k + kx + kx^2 + kx^3 + \cdots + kx^n = k(1 + x + x^2 + x^3 + \cdots + x^n). \tag{11.2.6} \]
We note that
\[\begin{aligned}
s_n(1 - x) &= k(1 + x + x^2 + x^3 + \cdots + x^n)(1 - x) \\
&= k(1 + x + x^2 + x^3 + \cdots + x^n) \cdot 1 - k(1 + x + x^2 + x^3 + \cdots + x^{n-1} + x^n)x \\
&= k(1 + x + x^2 + x^3 + \cdots + x^n - x - x^2 - x^3 - \cdots - x^n - x^{n+1}) \\
&= k(1 - x^{n+1})
\end{aligned} \tag{11.2.7}\]
so
\[\begin{aligned}
s_n(1 - x) &= k(1 - x^{n+1}) \\
s_n &= k\,\frac{1 - x^{n+1}}{1 - x}.
\end{aligned} \tag{11.2.8}\]
If \(|x| < 1\), \(\lim_{n\to\infty} x^n = 0\) so
\[ \lim_{n\to\infty} s_n = \lim_{n\to\infty} k\,\frac{1 - x^{n+1}}{1 - x} = k\,\frac{1}{1 - x}. \tag{11.2.9} \]
Thus, when \(|x| < 1\) the geometric series converges to \(k/(1 - x)\). When, for example, \(k = 1\) and \(x = 1/2\):
\[ s_n = \frac{1 - (1/2)^{n+1}}{1 - 1/2} = \frac{2^{n+1} - 1}{2^n} = 2 - \frac{1}{2^n} \quad \text{and} \quad \sum_{n=0}^{\infty} \frac{1}{2^n} = \frac{1}{1 - 1/2} = 2. \tag{11.2.10} \]
We began the chapter with the series \(\sum_{n=1}^{\infty} \frac{1}{2^n}\), namely, the geometric series without the first term 1. Each partial sum of this series is 1 less than the corresponding partial sum for the geometric series, so of course the limit is also one less than the value of the geometric series, that is,
\[ \sum_{n=1}^{\infty} \frac{1}{2^n} = 1. \tag{11.2.11} \]
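The closed form for the partial sums can be compared against a direct term-by-term sum. The sketch below uses exact fractions so that the comparison is not blurred by rounding; the function names are illustrative, and the closed form assumes x is not 1.

```python
from fractions import Fraction


def partial_sum(k, x, n):
    """Direct sum k + k*x + ... + k*x^n."""
    return sum(k * x**i for i in range(n + 1))


def closed_form(k, x, n):
    """Closed form k * (1 - x^(n+1)) / (1 - x), valid when x != 1."""
    return k * (1 - x ** (n + 1)) / (1 - x)


half = Fraction(1, 2)
for n in range(6):
    # With k = 1 and x = 1/2 both columns equal 2 - 1/2^n and approach 2.
    print(n, partial_sum(1, half, n), closed_form(1, half, n))
```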

It is not hard to see that the following theorem follows from theorem 11.1.2.

Theorem 11.2.2
Suppose that \(\sum a_n\) and \(\sum b_n\) are convergent series, and \(c\) is a constant. Then
1. \(\sum c a_n\) is convergent and \(\sum c a_n = c \sum a_n\)
2. \(\sum (a_n + b_n)\) is convergent and \(\sum (a_n + b_n) = \sum a_n + \sum b_n\).

The two parts of this theorem are subtly different. Suppose that \(\sum a_n\) diverges; does \(\sum c a_n\) also diverge if \(c\) is non-zero? Yes: suppose instead that \(\sum c a_n\) converges; then by the theorem, \(\sum (1/c) c a_n\) converges, but this is the same as \(\sum a_n\), which by assumption diverges. Hence \(\sum c a_n\) also diverges. Note that we are applying the theorem with \(a_n\) replaced by \(c a_n\) and \(c\) replaced by \((1/c)\).

Now suppose that \(\sum a_n\) and \(\sum b_n\) diverge; does \(\sum (a_n + b_n)\) also diverge? Now the answer is no: Let \(a_n = 1\) and \(b_n = -1\), so certainly \(\sum a_n\) and \(\sum b_n\) diverge. But
\[ \sum (a_n + b_n) = \sum (1 + -1) = \sum 0 = 0. \tag{11.2.12} \]
Of course, sometimes \(\sum (a_n + b_n)\) will also diverge, for example, if \(a_n = b_n = 1\), then
\[ \sum (a_n + b_n) = \sum (1 + 1) = \sum 2 \tag{11.2.13} \]
diverges.
In general, the sequence of partial sums \(s_n\) is harder to understand and analyze than the sequence of terms \(a_n\), and it is difficult to determine whether series converge and if so to what. Sometimes things are relatively simple, starting with the following.

Theorem 11.2.3
If
\[ \sum a_n \tag{11.2.14} \]
converges then
\[ \lim_{n\to\infty} a_n = 0. \tag{11.2.15} \]

Proof.
Since \(\sum a_n\) converges, \(\lim_{n\to\infty} s_n = L\) and \(\lim_{n\to\infty} s_{n-1} = L\), because this really says the same thing but "renumbers" the terms. By theorem 11.1.2,
\[ \lim_{n\to\infty} (s_n - s_{n-1}) = \lim_{n\to\infty} s_n - \lim_{n\to\infty} s_{n-1} = L - L = 0. \tag{11.2.16} \]
But
\[ s_n - s_{n-1} = (a_0 + a_1 + a_2 + \cdots + a_n) - (a_0 + a_1 + a_2 + \cdots + a_{n-1}) = a_n, \tag{11.2.17} \]
so as desired \(\lim_{n\to\infty} a_n = 0\).

This theorem presents an easy divergence test: if given a series \(\sum a_n\) the limit \(\lim_{n\to\infty} a_n\) does not exist or has a value other than zero, the series diverges. Note well that the converse is not true: If \(\lim_{n\to\infty} a_n = 0\) then the series does not necessarily converge.
n→∞ n

Example 11.2.4
Show that
\[ \sum_{n=1}^{\infty} \frac{n}{n+1} \tag{11.2.18} \]
diverges.
Solution
We compute the limit:
\[ \lim_{n\to\infty} \frac{n}{n+1} = 1 \ne 0. \tag{11.2.19} \]
Looking at the first few terms perhaps makes it clear that the series has no chance of converging:
\[ \frac{1}{2} + \frac{2}{3} + \frac{3}{4} + \frac{4}{5} + \cdots \tag{11.2.20} \]
will just get larger and larger; indeed, after a bit longer the series starts to look very much like \(\cdots + 1 + 1 + 1 + 1 + \cdots\), and of course if we add up enough 1's we can make the sum as large as we desire.

Example 11.2.5: Harmonic Series


Show that
\[ \sum_{n=1}^{\infty} \frac{1}{n} \tag{11.2.21} \]
diverges.
Solution
Here the theorem does not apply: \(\lim_{n\to\infty} 1/n = 0\), so it looks like perhaps the series converges. Indeed, if you have the fortitude (or the software) to add up the first 1000 terms you will find that
\[ \sum_{n=1}^{1000} \frac{1}{n} \approx 7.49, \tag{11.2.22} \]
so it might be reasonable to speculate that the series converges to something in the neighborhood of 10. But in fact the partial sums do go to infinity; they just get big very, very slowly. Consider the following:
\[ 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} > 1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{4} = 1 + \frac{1}{2} + \frac{1}{2} \tag{11.2.23} \]
\[ 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8} > 1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{4} + \frac{1}{8} + \frac{1}{8} + \frac{1}{8} + \frac{1}{8} = 1 + \frac{1}{2} + \frac{1}{2} + \frac{1}{2} \tag{11.2.24} \]
\[ 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{16} > 1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{4} + \frac{1}{8} + \cdots + \frac{1}{8} + \frac{1}{16} + \cdots + \frac{1}{16} = 1 + \frac{1}{2} + \frac{1}{2} + \frac{1}{2} + \frac{1}{2} \tag{11.2.25} \]
and so on. By swallowing up more and more terms we can always manage to add at least another 1/2 to the sum, and by adding enough of these we can make the partial sums as big as we like. In fact, it's not hard to see from this pattern that, for \(n \ge 2\),
\[ 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{2^n} > 1 + \frac{n}{2}, \tag{11.2.26} \]
so to make sure the sum is over 100, for example, we'd add up terms until we get to around \(1/2^{198}\), that is, about \(4 \cdot 10^{59}\) terms.
This series, \(\sum (1/n)\), is called the harmonic series.
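For readers with the software rather than the fortitude, the sketch below adds up harmonic partial sums and compares the sums at powers of two with the lower bound 1 + n/2 from the grouping argument. The function name `harmonic` and the range of exponents are illustrative choices.

```python
def harmonic(n):
    """Partial sum 1 + 1/2 + ... + 1/n of the harmonic series."""
    return sum(1.0 / k for k in range(1, n + 1))


# The first 1000 terms add up to roughly 7.49.
print(harmonic(1000))

# Doubling the number of terms adds at least another 1/2 each time.
for n in range(2, 11):
    print(n, harmonic(2**n), 1 + n / 2)
```

The partial sums grow without bound, but only about as fast as a logarithm, which is why the growth is so hard to see from a table of values.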

Contributors
David Guichard (Whitman College)
Integrated by Justin Marshall.

11.2: Series is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
11.3: Series by David Guichard is licensed CC BY-NC-SA 4.0.

11.3: The Integral Test and Estimates of Sums
It is generally quite difficult, often impossible, to determine the value of a series exactly. In many cases it is possible at least to
determine whether or not the series converges, and so we will spend most of our time on this problem.
If all of the terms \(a_n\) in a series are non-negative, then clearly the sequence of partial sums \(s_n\) is non-decreasing. This means that if we can show that the sequence of partial sums is bounded, the series must converge. We know that if the series converges, the terms \(a_n\) approach zero, but this does not mean that \(a_n \ge a_{n+1}\) for every \(n\). Many useful and interesting series do have this property, however, and they are among the easiest to understand. Let's look at an example.

Example 11.3.1
Show that
\[ \sum_{n=1}^{\infty} \frac{1}{n^2} \tag{11.3.1} \]
converges.
Solution
The terms \(1/n^2\) are positive and decreasing, and since \(\lim_{x\to\infty} 1/x^2 = 0\), the terms \(1/n^2\) approach zero. We seek an upper bound for all the partial sums, that is, we want to find a number \(N\) so that \(s_n \le N\) for every \(n\). The upper bound is provided courtesy of integration, and is inherent in figure 11.3.1.

The figure shows the graph of \(y = 1/x^2\) together with some rectangles that lie completely below the curve and that all have base length one. Because the heights of the rectangles are determined by the height of the curve, the areas of the rectangles are \(1/1^2\), \(1/2^2\), \(1/3^2\), and so on; in other words, exactly the terms of the series. The partial sum \(s_n\) is simply the sum of the areas of the first \(n\) rectangles. Because the rectangles all lie between the curve and the \(x\)-axis, any sum of rectangle areas is less than the corresponding area under the curve, and so of course any sum of rectangle areas is less than the area under the entire curve, that is, all the way to infinity. There is a bit of trouble at the left end, where there is an asymptote, but we can work around that easily. Here it is:
\[ s_n = \frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \cdots + \frac{1}{n^2} < 1 + \int_1^n \frac{1}{x^2}\,dx < 1 + \int_1^{\infty} \frac{1}{x^2}\,dx = 1 + 1 = 2, \tag{11.3.2} \]
recalling that we computed this improper integral in section 9.7. Since the sequence of partial sums \(s_n\) is increasing and bounded above by 2, we know that \(\lim_{n\to\infty} s_n = L < 2\), and so the series converges to some number less than 2. In fact, it is possible, though difficult, to show that \(L = \pi^2/6 \approx 1.6\).

We already know that \(\sum 1/n\) diverges. What goes wrong if we try to apply this technique to it? Here's the calculation:
\[ s_n = \frac{1}{1} + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n} < 1 + \int_1^n \frac{1}{x}\,dx < 1 + \int_1^{\infty} \frac{1}{x}\,dx = 1 + \infty. \tag{11.3.3} \]
The problem is that the improper integral doesn't converge. Note well that this does not prove that \(\sum 1/n\) diverges, just that this particular calculation fails to prove that it converges. A slight modification, however, allows us to prove in a second way that \(\sum 1/n\) diverges.

Example 11.3.2
Consider a slightly altered version of figure 11.3.1, shown in figure 11.3.2.
Solution
The rectangles this time are above the curve, that is, each rectangle completely contains the corresponding area under the curve. This means that
\[ s_n = {1\over 1}+{1\over 2}+{1\over 3}+\cdots+{1\over n} > \int_1^{n+1} {1\over x}\,dx = \ln x\Big|_1^{n+1}=\ln(n+1). \]
As \(n\) gets bigger, \(\ln(n + 1)\) goes to infinity, so the sequence of partial sums \(s_n\) must also go to infinity, so the harmonic series diverges.

The important fact that clinches this example is that \(\lim_{n\to\infty} \int_1^{n+1} \frac{1}{x}\,dx = \infty\), which we can rewrite as \(\int_1^{\infty} \frac{1}{x}\,dx = \infty\).
So these two examples taken together indicate that we can prove that a series converges or prove that it diverges with a single calculation of an improper integral. This is known as the integral test, which we state as a theorem.
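The two bounds from these examples can be checked numerically: partial sums of 1/n² stay below 2, while partial sums of 1/n stay above ln(n + 1). The helper names below are illustrative, and the sample values of n are arbitrary.

```python
import math


def partial_sum(term, n):
    """Sum term(1) + term(2) + ... + term(n)."""
    return sum(term(k) for k in range(1, n + 1))


def inverse_square(k):
    return 1.0 / (k * k)


def harmonic_term(k):
    return 1.0 / k


for n in (10, 100, 1000):
    # Columns: n, sum of 1/k^2 (bounded by 2), sum of 1/k, and its lower bound ln(n+1).
    print(n, partial_sum(inverse_square, n),
          partial_sum(harmonic_term, n), math.log(n + 1))
```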

Theorem 11.3.3: The Integral Test


Suppose that \(f(x) > 0\) and is decreasing on the infinite interval \([k, \infty)\) (for some \(k \ge 1\)) and that \(a_n = f(n)\). Then the series
\[ \sum_{n=1}^{\infty} a_n \tag{11.3.4} \]
converges if and only if the improper integral
\[ \int_1^{\infty} f(x)\,dx \tag{11.3.5} \]
converges.

The two examples we have seen are called p-series; a p-series is any series of the form \(\sum 1/n^p\). If \(p \le 0\), \(\lim_{n\to\infty} 1/n^p \ne 0\), so the series diverges. For positive values of \(p\) we can determine precisely which series converge.

Theorem 11.3.4

A p-series with \(p > 0\) converges if and only if \(p > 1\).

Proof
We use the integral test; we have already done \(p = 1\), so assume that \(p \ne 1\).
\[ \int_1^{\infty} \frac{1}{x^p}\,dx = \lim_{D\to\infty} \left.\frac{x^{1-p}}{1-p}\right|_1^D = \lim_{D\to\infty} \frac{D^{1-p}}{1-p} - \frac{1}{1-p}. \tag{11.3.6} \]
If \(p > 1\) then \(1 - p < 0\) and \(\lim_{D\to\infty} D^{1-p} = 0\), so the integral converges. If \(0 < p < 1\) then \(1 - p > 0\) and \(\lim_{D\to\infty} D^{1-p} = \infty\), so the integral diverges.
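The behavior of the expression in the proof can be illustrated by evaluating it for growing values of D. This is a sketch only: the function name `integral_1_to_D` and the sample exponents p = 2 and p = 1/2 are assumptions made for the illustration.

```python
def integral_1_to_D(p, D):
    """Value of the integral of 1/x^p from 1 to D, for p != 1,
    using the antiderivative x^(1-p)/(1-p)."""
    return (D ** (1 - p) - 1) / (1 - p)


for D in (10.0, 1e3, 1e6):
    # For p = 2 the values level off near 1/(p - 1) = 1;
    # for p = 1/2 they keep growing with D.
    print(D, integral_1_to_D(2, D), integral_1_to_D(0.5, D))
```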

Example 11.3.5
Show that
\[ \sum_{n=1}^{\infty} \frac{1}{n^3} \tag{11.3.7} \]
converges.
Solution
We could of course use the integral test, but now that we have the theorem we may simply note that this is a p-series with \(p > 1\).

Example 11.3.6
Show that
\[ \sum_{n=1}^{\infty} \frac{5}{n^4} \tag{11.3.8} \]
converges.
Solution
We know that if
\[ \sum_{n=1}^{\infty} 1/n^4 \tag{11.3.9} \]
converges then
\[ \sum_{n=1}^{\infty} 5/n^4 \tag{11.3.10} \]
also converges, by theorem 11.2.2. Since \(\sum_{n=1}^{\infty} 1/n^4\) is a convergent p-series, \(\sum_{n=1}^{\infty} 5/n^4\) converges also.

Example 11.3.7
Show that
\[ \sum_{n=1}^{\infty} \frac{5}{\sqrt{n}} \tag{11.3.11} \]
diverges.
Solution
This also follows from theorem 11.2.2: Since \(\sum_{n=1}^{\infty} \frac{1}{\sqrt{n}}\) is a p-series with \(p = 1/2 < 1\), it diverges, and so does \(\sum_{n=1}^{\infty} \frac{5}{\sqrt{n}}\).

Since it is typically difficult to compute the value of a series exactly, a good approximation is frequently required. In a real sense, a
good approximation is only as good as we know it is, that is, while an approximation may in fact be good, it is only valuable in
practice if we can guarantee its accuracy to some degree. This guarantee is usually easy to come by for series with decreasing
positive terms.

Example 11.3.8

Approximate
\[ \sum 1/n^2 \tag{11.3.12} \]
to two decimal places.

Solution
Referring to figure 11.3.1, if we approximate the sum by \(\sum_{n=1}^{N} 1/n^2\), the error we make is the total area of the remaining rectangles, all of which lie under the curve \(1/x^2\) from \(x = N\) out to infinity. So we know the true value of the series is larger than the approximation, and no bigger than the approximation plus the area under the curve from \(N\) to infinity. Roughly, then, we need to find \(N\) so that
\[ \int_N^{\infty} \frac{1}{x^2}\,dx < 1/100. \tag{11.3.13} \]
We can compute the integral: \(\int_N^{\infty} \frac{1}{x^2}\,dx = \frac{1}{N}\), so \(N = 100\) is a good starting point. Adding up the first 100 terms gives approximately 1.634983900, and that plus 1/100 is 1.644983900, so approximating the series by the value halfway between these will be at most 1/200 = 0.005 in error. The midpoint is 1.639983900, but while this is correct to ±0.005, we can't tell if the correct two-decimal approximation is 1.63 or 1.64.
We need to make \(N\) big enough to reduce the guaranteed error, perhaps to around 0.004 to be safe, so we would need \(1/N \approx 0.008\), or \(N = 125\). Now the sum of the first 125 terms is approximately 1.636965982, and that plus 0.008 is 1.644965982 and the point halfway between them is 1.640965982. The true value is then 1.640965982 ± 0.004, and all numbers in this range round to 1.64, so 1.64 is correct to two decimal places. We have mentioned that the true value of this series can be shown to be \(\pi^2/6 \approx 1.644934068\), which rounds down to 1.64 (just barely) and is indeed below the upper bound of 1.644965982, again just barely. Frequently approximations will be even better than the "guaranteed" accuracy, but not always, as this example demonstrates.
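The arithmetic in this example is easy to reproduce. The sketch below computes the partial sum, the upper bound obtained by adding 1/N, and the midpoint, for N = 100 and N = 125; the function names `partial_sum` and `bracket` are illustrative.

```python
import math


def partial_sum(N):
    """Sum of 1/n^2 for n = 1, ..., N."""
    return sum(1.0 / n**2 for n in range(1, N + 1))


def bracket(N):
    """Return (lower, upper, midpoint), where the true sum lies
    between lower = s_N and upper = s_N + 1/N."""
    lower = partial_sum(N)
    upper = lower + 1.0 / N
    return lower, upper, (lower + upper) / 2


for N in (100, 125):
    lower, upper, mid = bracket(N)
    print(N, lower, upper, mid)

# The value quoted in the text for comparison.
print(math.pi**2 / 6)
```

With N = 100 the two ends of the bracket round to different two-decimal values, while with N = 125 both ends round to 1.64, which matches the conclusion of the example.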


11.3: The Integral Test and Estimates of Sums is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
11.4: The Integral Test by David Guichard is licensed CC BY-NC-SA 4.0.

11.4: The Comparison Tests
As we begin to compile a list of convergent and divergent series, new ones can sometimes be analyzed by comparing them to ones
that we already understand.

Example 11.5.1: Identifying if a Sum Converges


Does the following sum converge?
\[ \sum_{n=2}^{\infty} \frac{1}{n^2 \ln n} \tag{11.4.1} \]
Solution
The obvious first approach, based on what we know, is the integral test. Unfortunately, we cannot compute the required antiderivative. But looking at the series, it would appear that it must converge, because the terms we are adding are smaller than the terms of a p-series, that is,
\[ \frac{1}{n^2 \ln n} < \frac{1}{n^2}, \tag{11.4.2} \]
when \(n \ge 3\). Since adding up the terms \(1/n^2\) doesn't get "too big", the new series "should" also converge. Let's make this more precise.
The series
\[ \sum_{n=2}^{\infty} \frac{1}{n^2 \ln n} \tag{11.4.3} \]
converges if and only if
\[ \sum_{n=3}^{\infty} \frac{1}{n^2 \ln n} \tag{11.4.4} \]
converges, since all we've done is drop the initial term. We know that
\[ \sum_{n=3}^{\infty} \frac{1}{n^2} \tag{11.4.5} \]
converges. Looking at two typical partial sums:
\[ s_n={1\over 3^2\ln 3}+{1\over 4^2\ln 4}+{1\over 5^2\ln 5}+\cdots+ {1\over n^2\ln n} < {1\over 3^2}+{1\over 4^2}+{1\over 5^2}+\cdots+{1\over n^2}=t_n. \]
Since the p-series converges, say to \(L\), and since the terms are positive, \(t_n < L\). Since the terms of the new series are positive, the \(s_n\) form an increasing sequence and \(s_n < t_n < L\) for all \(n\). Hence the sequence \(\{s_n\}\) is bounded and so converges.
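The comparison of partial sums in this example can be tabulated directly. In the sketch below, `s` and `t` are the partial sums starting at n = 3, as in the text; the bound used for comparison is π²/6 − 1 − 1/4, the value of the p-series with its first two terms removed (taking the value π²/6 as given).

```python
import math


def s(n):
    """Partial sum of 1/(k^2 ln k) for k = 3, ..., n."""
    return sum(1.0 / (k * k * math.log(k)) for k in range(3, n + 1))


def t(n):
    """Partial sum of 1/k^2 for k = 3, ..., n."""
    return sum(1.0 / (k * k) for k in range(3, n + 1))


# Sum of 1/k^2 over k >= 3, assuming the full series equals pi^2/6.
limit_of_t = math.pi**2 / 6 - 1 - 0.25

for n in (3, 10, 100, 1000):
    # Each row should satisfy s(n) < t(n) < limit_of_t.
    print(n, s(n), t(n), limit_of_t)
```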

Sometimes, even when the integral test applies, comparison to a known series is easier, so it's generally a good idea to think about
doing a comparison before doing the integral test.

Example 11.5.2: Identifying if a Sum Converges


Does the following sum converge?
\[ \sum_{n=2}^{\infty} \frac{|\sin n|}{n^2} \tag{11.4.6} \]
Solution
We cannot apply the integral test here, because the terms of this series are not decreasing. Just as in the previous example, however,
\[ \frac{|\sin n|}{n^2} \le \frac{1}{n^2}, \tag{11.4.7} \]
because \(|\sin n| \le 1\). Once again the partial sums are non-decreasing and bounded above by
\[ \sum 1/n^2 = L \tag{11.4.8} \]
so the new series converges.

Like the integral test, the comparison test can be used to show both convergence and divergence. In the case of the integral test, a
single calculation will confirm whichever is the case. To use the comparison test we must first have a good idea as to convergence
or divergence and pick the sequence for comparison accordingly.

Example 11.5.3: Identifying if a Sum Converges


Does the following sum converge?
\[ \sum_{n=2}^{\infty} \frac{1}{\sqrt{n^2 - 3}} \tag{11.4.9} \]
Solution
We observe that the \(-3\) should have little effect compared to the \(n^2\) inside the square root, and therefore guess that the terms are enough like \(1/\sqrt{n^2} = 1/n\) that the series should diverge. We attempt to show this by comparison to the harmonic series.
We note that
\[ \frac{1}{\sqrt{n^2 - 3}} > \frac{1}{\sqrt{n^2}} = \frac{1}{n}, \tag{11.4.10} \]
so that
\[ s_n = \frac{1}{\sqrt{2^2 - 3}} + \frac{1}{\sqrt{3^2 - 3}} + \cdots + \frac{1}{\sqrt{n^2 - 3}} > \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n} = t_n, \tag{11.4.11} \]
where \(t_n\) is 1 less than the corresponding partial sum of the harmonic series (because we start at \(n = 2\) instead of \(n = 1\)).
Since \(\lim_{n\to\infty} t_n = \infty\), \(\lim_{n\to\infty} s_n = \infty\) as well.

So the general approach is this: If you believe that a new series is convergent, attempt to find a convergent series whose terms are
larger than the terms of the new series; if you believe that a new series is divergent, attempt to find a divergent series whose terms
are smaller than the terms of the new series.

Example 11.5.4: Identifying if a Sum Converges

Does the following sum converge?



1
∑ (11.4.12)
− −−−−
√ n2 + 3
n=1

Solution
Just as in the last example, we guess that this is very much like the harmonic series and so diverges. Unfortunately,
$$\frac{1}{\sqrt{n^2+3}} < \frac{1}{n}, \tag{11.4.13}$$

so we cannot compare the series directly to the harmonic series. A little thought leads us to

$$\frac{1}{\sqrt{n^2+3}} \ge \frac{1}{\sqrt{n^2+3n^2}} = \frac{1}{2n}, \tag{11.4.14}$$

so if ∑ 1/(2n) diverges then the given series diverges. But since ∑ 1/(2n) = (1/2) ∑ 1/n , theorem 11.2.2 implies that it
does indeed diverge.
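The "little thought" behind (11.4.14) is one short chain of inequalities. The following is a sketch of that step for $n \ge 1$; nothing beyond the algebra is assumed.

$$\begin{aligned}
n \ge 1 &\implies 3 \le 3n^2 \\
&\implies n^2 + 3 \le n^2 + 3n^2 = 4n^2 \\
&\implies \sqrt{n^2+3} \le 2n \\
&\implies \frac{1}{\sqrt{n^2+3}} \ge \frac{1}{2n}.
\end{aligned}$$

Equality holds only at $n = 1$; for $n \ge 2$ the inequality is strict. Either way the comparison test applies.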

For reference we summarize the comparison test in a theorem.

Theorem 11.5.5: The Comparison Test

Suppose that $a_n$ and $b_n$ are non-negative for all $n$ and that $a_n \le b_n$ when $n \ge N$, for some $N$.

If $\sum_{n=0}^{\infty} b_n$ converges, so does $\sum_{n=0}^{\infty} a_n$. If $\sum_{n=0}^{\infty} a_n$ diverges, so does $\sum_{n=0}^{\infty} b_n$.

Contributors and Attributions


David Guichard (Whitman College)
Integrated by Justin Marshall.

11.4: The Comparison Tests is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
11.6: Comparison Test by David Guichard is licensed CC BY-NC-SA 4.0.

11.5: Alternating Series
Next we consider series with both positive and negative terms, but in a regular pattern: they alternate, as in the alternating
harmonic series for example:
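$$\sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n} = \frac{1}{1} + \frac{-1}{2} + \frac{1}{3} + \frac{-1}{4} + \cdots = \frac{1}{1} - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots. \tag{11.5.1}$$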

In this series the sizes of the terms decrease, that is, $|a_n|$ forms a decreasing sequence, but this is not required in an alternating series. As with positive term series, however, when the terms do have decreasing sizes it is easier to analyze the series, much easier, in fact, than positive term series. Consider pictorially what is going on in the alternating harmonic series, shown in Figure 11.4.1. Because the sizes of the terms $a_n$ are decreasing, the partial sums $s_1$, $s_3$, $s_5$, and so on, form a decreasing sequence that is bounded below by $s_2$, so this sequence must converge. Likewise, the partial sums $s_2$, $s_4$, $s_6$, and so on, form an increasing sequence that is bounded above by $s_1$, so this sequence also converges. Since all the even numbered partial sums are less than all the odd numbered ones, and since the "jumps" (that is, the $a_i$ terms) are getting smaller and smaller, the two sequences must converge to the same value, meaning the entire sequence of partial sums $s_1, s_2, s_3, \ldots$ converges as well.

Figure 11.4.1. The alternating harmonic series.


There's nothing special about the alternating harmonic series: the same argument works for any alternating series whose terms decrease in size. The alternating series test is worth calling a theorem.

Theorem 11.4.1: The Alternating Series Test


Suppose that $\{a_n\}_{n=1}^{\infty}$ is a non-increasing sequence of positive numbers and $\lim_{n\to\infty} a_n = 0$. Then the alternating series $\sum_{n=1}^{\infty} (-1)^{n-1} a_n$ converges.

Proof
The odd numbered partial sums, $s_1$, $s_3$, $s_5$, and so on, form a non-increasing sequence, because $s_{2k+3} = s_{2k+1} - a_{2k+2} + a_{2k+3} \le s_{2k+1}$, since $a_{2k+2} \ge a_{2k+3}$. This sequence is bounded below by $s_2$, so it must converge, say $\lim_{k\to\infty} s_{2k+1} = L$. Likewise, the partial sums $s_2$, $s_4$, $s_6$, and so on, form a non-decreasing sequence that is bounded above by $s_1$, so this sequence also converges, say $\lim_{k\to\infty} s_{2k} = M$. Since $\lim_{n\to\infty} a_n = 0$ and $s_{2k+1} = s_{2k} + a_{2k+1}$,

$$L = \lim_{k\to\infty} s_{2k+1} = \lim_{k\to\infty} (s_{2k} + a_{2k+1}) = \lim_{k\to\infty} s_{2k} + \lim_{k\to\infty} a_{2k+1} = M + 0 = M, \tag{11.5.2}$$

so $L = M$, the two sequences of partial sums converge to the same limit, and this means the entire sequence of partial sums also converges to $L$.

Another useful fact is implicit in this discussion. Suppose that $L = \sum_{n=1}^{\infty} (-1)^{n-1} a_n$ and that we approximate $L$ by a finite part of this sum, say $L \approx \sum_{n=1}^{N} (-1)^{n-1} a_n$. Because the terms are decreasing in size, we know that the true value of $L$ must be between this approximation and the next one, that is, between $\sum_{n=1}^{N} (-1)^{n-1} a_n$ and $\sum_{n=1}^{N+1} (-1)^{n-1} a_n$. Depending on whether $N$ is odd or even, the second will be larger or smaller than the first.

Example 11.4.2
Approximate the alternating harmonic series to one decimal place.
Solution
We need to go roughly to the point at which the next term to be added or subtracted is 1/10. Adding up the first nine and the
first ten terms we get approximately 0.746 and 0.646. These are 1/10 apart, but it is not clear how the correct value would be
rounded. It turns out that we are able to settle the question by computing the sums of the first eleven and twelve terms, which
give 0.737 and 0.653, so correct to one place the value is 0.7.
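The partial sums quoted in this example are easy to reproduce. The following is a small sketch (the function name `alternating_harmonic` is ours, not from the text) that sums the first $N$ terms of the alternating harmonic series and prints the four values used above.

```python
def alternating_harmonic(n_terms):
    """Sum of the first n_terms terms of 1 - 1/2 + 1/3 - 1/4 + ..."""
    return sum((-1) ** (n - 1) / n for n in range(1, n_terms + 1))

# Consecutive partial sums bracket the true value of the series.
for n_terms in (9, 10, 11, 12):
    print(n_terms, round(alternating_harmonic(n_terms), 3))
```

Since the true value lies between the eleventh and twelfth partial sums, it lies between 0.653 and 0.737, and every number in that range rounds to 0.7.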
We have considered alternating series with first index 1, and in which the first term is positive, but a little thought shows this is not crucial. The same test applies to any similar series, such as $\sum_{n=0}^{\infty} (-1)^n a_n$, $\sum_{n=1}^{\infty} (-1)^n a_n$, $\sum_{n=17}^{\infty} (-1)^n a_n$, etc.

Contributors and Attributions


David Guichard (Whitman College)
Integrated by Justin Marshall.

11.5: Alternating Series is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
11.5: Alternating Series by David Guichard is licensed CC BY-NC-SA 4.0.

11.6: Absolute Convergence and the Ratio and Root Test
Roughly speaking there are two ways for a series to converge: As in the case of $\sum 1/n^2$, the individual terms get small very quickly, so that the sum of all of them stays finite, or, as in the case of $\sum (-1)^{n-1}/n$, the terms do not get small fast enough ($\sum 1/n$ diverges), but a mixture of positive and negative terms provides enough cancellation to keep the sum finite. You might guess from what we've seen that if the terms get small fast enough to do the job, then whether or not some terms are negative and some positive the series converges.

Theorem 11.6.1
If $\sum_{n=0}^{\infty} |a_n|$ converges, then $\sum_{n=0}^{\infty} a_n$ converges.

Proof.

Note that $0 \le a_n + |a_n| \le 2|a_n|$ so by the comparison test $\sum_{n=0}^{\infty} (a_n + |a_n|)$ converges. Now

$$\sum_{n=0}^{\infty} (a_n + |a_n|) - \sum_{n=0}^{\infty} |a_n| = \sum_{n=0}^{\infty} \left(a_n + |a_n| - |a_n|\right) = \sum_{n=0}^{\infty} a_n \tag{11.6.1}$$

converges by theorem 11.2.2.


So given a series $\sum a_n$ with both positive and negative terms, you should first ask whether $\sum |a_n|$ converges. This may be an easier question to answer, because we have tests that apply specifically to series with non-negative terms. If $\sum |a_n|$ converges then you know that $\sum a_n$ converges as well. If $\sum |a_n|$ diverges then it still may be true that $\sum a_n$ converges; you will have to do more work to decide the question. Another way to think of this result is: it is (potentially) easier for $\sum a_n$ to converge than for $\sum |a_n|$ to converge, because the latter series cannot take advantage of cancellation.


If $\sum |a_n|$ converges we say that $\sum a_n$ converges absolutely; to say that $\sum a_n$ converges absolutely is to say that any cancellation that happens to come along is not really needed, as the terms already get small so fast that convergence is guaranteed by that alone. If $\sum a_n$ converges but $\sum |a_n|$ does not, we say that $\sum a_n$ converges conditionally. For example $\sum_{n=1}^{\infty} (-1)^{n-1} \frac{1}{n^2}$ converges absolutely, while $\sum_{n=1}^{\infty} (-1)^{n-1} \frac{1}{n}$ converges conditionally.

Example 11.6.2
Does

$$\sum_{n=2}^{\infty} \frac{\sin n}{n^2} \tag{11.6.2}$$

converge?
Solution
In example 11.5.2 we saw that

$$\sum_{n=2}^{\infty} \frac{|\sin n|}{n^2} \tag{11.6.3}$$

converges, so the given series converges absolutely.

Example 11.6.3
Does $\sum_{n=0}^{\infty} (-1)^n \frac{3n+4}{2n^2+3n+5}$ converge?

Solution
Taking the absolute value,

$$\sum_{n=0}^{\infty} \frac{3n+4}{2n^2+3n+5} \tag{11.6.4}$$

diverges by comparison to

$$\sum_{n=1}^{\infty} \frac{3}{10n}, \tag{11.6.5}$$

so if the series converges it does so conditionally. It is true that

$$\lim_{n\to\infty} (3n+4)/(2n^2+3n+5) = 0, \tag{11.6.6}$$

so to apply the alternating series test we need to know whether the terms are decreasing. If we let

$$f(x) = (3x+4)/(2x^2+3x+5) \tag{11.6.7}$$

then

$$f'(x) = -(6x^2+16x-3)/(2x^2+3x+5)^2, \tag{11.6.8}$$

and it is not hard to see that this is negative for $x \ge 1$, so the terms of the series are decreasing and by the alternating series test it converges.

Does the series $\sum_{n=0}^{\infty} \frac{n^5}{5^n}$ converge? It is possible, but a bit unpleasant, to approach this with the integral test or the comparison test, but there is an easier way. Consider what happens as we move from one term to the next in this series:

$$\cdots + \frac{n^5}{5^n} + \frac{(n+1)^5}{5^{n+1}} + \cdots \tag{11.6.9}$$

The denominator goes up by a factor of 5, $5^{n+1} = 5 \cdot 5^n$, but the numerator goes up by much less:

$$(n+1)^5 = n^5 + 5n^4 + 10n^3 + 10n^2 + 5n + 1, \tag{11.6.10}$$

which is much less than $5n^5$ when $n$ is large, because $5n^4$ is much less than $n^5$. So we might guess that in the long run it begins to look as if each term is $1/5$ of the previous term. We have seen series that behave like this:

$$\sum_{n=0}^{\infty} \frac{1}{5^n} = \frac{5}{4}, \tag{11.6.11}$$

a geometric series. So we might try comparing the given series to some variation of this geometric series. This is possible, but a bit
messy. We can in effect do the same thing, but bypass most of the unpleasant work.
The key is to notice that
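$$\lim_{n\to\infty} \frac{a_{n+1}}{a_n} = \lim_{n\to\infty} \frac{(n+1)^5}{5^{n+1}} \cdot \frac{5^n}{n^5} = \lim_{n\to\infty} \frac{(n+1)^5}{n^5} \cdot \frac{1}{5} = 1 \cdot \frac{1}{5} = \frac{1}{5}. \tag{11.6.12}$$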

This is really just what we noticed above, done a bit more officially: in the long run, each term is one fifth of the previous term.
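To see the ratio settle toward $1/5$, we can tabulate $a_{n+1}/a_n$ for $a_n = n^5/5^n$. This is an illustrative sketch only; the helper names `term` and `ratio` are ours, and exact fractions are used so that rounding plays no role in the comparison.

```python
from fractions import Fraction

def term(n):
    """a_n = n^5 / 5^n as an exact fraction."""
    return Fraction(n ** 5, 5 ** n)

def ratio(n):
    """a_{n+1} / a_n, defined for n >= 1 (a_0 = 0 would divide by zero)."""
    return term(n + 1) / term(n)

# The ratio equals ((n + 1) / n)^5 / 5, which decreases toward 1/5.
for n in (1, 4, 5, 10, 100, 1000):
    print(n, float(ratio(n)))
```

In this particular series the ratio is still above $1/2$ at $n = 4$ and falls below $1/2$ at $n = 5$, so any threshold $N \ge 5$ would serve when comparing against a geometric series with ratio $1/2$.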
Now pick some number between $1/5$ and $1$, say $1/2$. Because

$$\lim_{n\to\infty} \frac{a_{n+1}}{a_n} = \frac{1}{5}, \tag{11.6.13}$$

then when $n$ is big enough, say $n \ge N$ for some $N$,

$$\frac{a_{n+1}}{a_n} < \frac{1}{2} \quad\text{and}\quad a_{n+1} < \frac{a_n}{2}. \tag{11.6.14}$$

So $a_{N+1} < a_N/2$, $a_{N+2} < a_{N+1}/2 < a_N/4$, $a_{N+3} < a_{N+2}/2 < a_{N+1}/4 < a_N/8$, and so on. The general form is $a_{N+k} < a_N/2^k$. So if we look at the series


$$\sum_{k=0}^{\infty} a_{N+k} = a_N + a_{N+1} + a_{N+2} + a_{N+3} + \cdots + a_{N+k} + \cdots, \tag{11.6.15}$$

its terms are less than or equal to the terms of the series

$$a_N + \frac{a_N}{2} + \frac{a_N}{4} + \frac{a_N}{8} + \cdots + \frac{a_N}{2^k} + \cdots = \sum_{k=0}^{\infty} \frac{a_N}{2^k} = 2a_N. \tag{11.6.16}$$

So by the comparison test, $\sum_{k=0}^{\infty} a_{N+k}$ converges, and this means that $\sum_{n=0}^{\infty} a_n$ converges, since we've just added the fixed number $a_0 + a_1 + \cdots + a_{N-1}$.
Under what circumstances could we do this? What was crucial was that the limit of $a_{n+1}/a_n$, say $L$, was less than 1 so that we could pick a value $r$ so that $L < r < 1$. The fact that $L < r$ ($1/5 < 1/2$ in our example) means that we can compare the series $\sum a_n$ to $\sum r^n$, and the fact that $r < 1$ guarantees that $\sum r^n$ converges. That's really all that is required to make the argument work. We also made use of the fact that the terms of the series were positive; in general we simply consider the absolute values of the terms and we end up testing for absolute convergence.

Theorem 11.7.1: The Ratio Test

Suppose that

$$\lim_{n\to\infty} |a_{n+1}/a_n| = L. \tag{11.6.17}$$

If $L < 1$ the series $\sum a_n$ converges absolutely, if $L > 1$ the series diverges, and if $L = 1$ this test gives no information.

Proof.
The example above essentially proves the first part of this, if we simply replace $1/5$ by $L$ and $1/2$ by $r$. Suppose that $L > 1$, and pick $r$ so that $1 < r < L$. Then for $n \ge N$, for some $N$,

$$\frac{|a_{n+1}|}{|a_n|} > r \quad\text{and}\quad |a_{n+1}| > r|a_n|. \tag{11.6.19}$$

This implies that $|a_{N+k}| > r^k |a_N|$, but since $r > 1$ this means that $\lim_{k\to\infty} |a_{N+k}| \ne 0$, which means also that $\lim_{n\to\infty} a_n \ne 0$. By the divergence test, the series diverges.


To see that we get no information when $L = 1$, we need to exhibit two series with $L = 1$, one that converges and one that diverges. It is easy to see that $\sum 1/n^2$ and $\sum 1/n$ do the job.

Example 11.7.2
The ratio test is particularly useful for series involving the factorial function. Consider $\sum_{n=0}^{\infty} 5^n/n!$.

$$\lim_{n\to\infty} \frac{5^{n+1}}{(n+1)!} \cdot \frac{n!}{5^n} = \lim_{n\to\infty} \frac{5^{n+1}}{5^n} \cdot \frac{n!}{(n+1)!} = \lim_{n\to\infty} 5 \cdot \frac{1}{n+1} = 0. \tag{11.6.23}$$

Since $0 < 1$, the series converges.
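A quick numerical look supports this conclusion. The sketch below (the name `partial_sum` is ours) adds up the first few dozen terms of $\sum 5^n/n!$. The ratio test only tells us that the series converges; that the limit is $e^5$ is a standard fact about the exponential series which we assume here and which is not derived in this example.

```python
import math

def partial_sum(n_terms):
    """Sum of 5^n / n! for n = 0 .. n_terms - 1."""
    return sum(5 ** n / math.factorial(n) for n in range(n_terms))

# The partial sums stabilize quickly once n! outgrows 5^n.
for n_terms in (5, 10, 20, 40):
    print(n_terms, partial_sum(n_terms))
print("e^5 =", math.exp(5))
```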

A similar argument, which we will not do, justifies a similar test that is occasionally easier to apply.

Theorem 11.7.3: The Root Test

Suppose that

$$\lim_{n\to\infty} |a_n|^{1/n} = L. \tag{11.6.24}$$

If $L < 1$ the series $\sum a_n$ converges absolutely, if $L > 1$ the series diverges, and if $L = 1$ this test gives no information.

The proof of the root test is actually easier than that of the ratio test, and is a good exercise.

Example 11.7.4
Analyze $\sum_{n=0}^{\infty} \frac{5^n}{n^n}$.
Solution
The ratio test turns out to be a bit difficult on this series (try it). Using the root test:

$$\lim_{n\to\infty} \left(\frac{5^n}{n^n}\right)^{1/n} = \lim_{n\to\infty} \frac{(5^n)^{1/n}}{(n^n)^{1/n}} = \lim_{n\to\infty} \frac{5}{n} = 0. \tag{11.6.25}$$

Since $0 < 1$, the series converges.

The root test is frequently useful when n appears as an exponent in the general term of the series.

Contributors
David Guichard (Whitman College)
Integrated by Justin Marshall.

11.6: Absolute Convergence and the Ratio and Root Test is shared under a not declared license and was authored, remixed, and/or curated by
LibreTexts.
11.7: Absolute Convergence by David Guichard is licensed CC BY-NC-SA 4.0.
11.8: The Ratio and Root Tests by David Guichard is licensed CC BY-NC-SA 4.0.

11.8: Power Series
We have seen that some functions can be represented as series, which may give valuable information about the function. So far, we
have seen only those examples that result from manipulation of our one fundamental example, the geometric series. We would like
to start with a given function and produce a series to represent it, if possible.
Suppose that $f(x) = \sum_{n=0}^{\infty} a_n x^n$ on some interval of convergence. Then we know that we can compute derivatives of $f$ by taking derivatives of the terms of the series. Let's look at the first few in general:

$$\begin{aligned}
f'(x) &= \sum_{n=1}^{\infty} n a_n x^{n-1} = a_1 + 2a_2 x + 3a_3 x^2 + 4a_4 x^3 + \cdots \\
f''(x) &= \sum_{n=2}^{\infty} n(n-1) a_n x^{n-2} = 2a_2 + 3 \cdot 2\, a_3 x + 4 \cdot 3\, a_4 x^2 + \cdots \\
f'''(x) &= \sum_{n=3}^{\infty} n(n-1)(n-2) a_n x^{n-3} = 3 \cdot 2\, a_3 + 4 \cdot 3 \cdot 2\, a_4 x + \cdots
\end{aligned} \tag{11.8.1}$$

By examining these it's not hard to discern the general pattern. The $k$th derivative must be

$$\begin{aligned}
f^{(k)}(x) &= \sum_{n=k}^{\infty} n(n-1)(n-2)\cdots(n-k+1)\, a_n x^{n-k} \\
&= k(k-1)(k-2)\cdots(2)(1)\, a_k + (k+1)(k)\cdots(2)\, a_{k+1} x + (k+2)(k+1)\cdots(3)\, a_{k+2} x^2 + \cdots
\end{aligned} \tag{11.8.2}$$

We can shrink this quite a bit by using factorial notation:

$$f^{(k)}(x) = \sum_{n=k}^{\infty} \frac{n!}{(n-k)!} a_n x^{n-k} = k!\, a_k + (k+1)!\, a_{k+1} x + \frac{(k+2)!}{2!} a_{k+2} x^2 + \cdots \tag{11.8.3}$$

Now substitute $x = 0$:

$$f^{(k)}(0) = k!\, a_k + \sum_{n=k+1}^{\infty} \frac{n!}{(n-k)!} a_n 0^{n-k} = k!\, a_k, \tag{11.8.4}$$

and solve for $a_k$:

$$a_k = \frac{f^{(k)}(0)}{k!}. \tag{11.8.5}$$

Note the special case, obtained from the series for $f$ itself, that gives $f(0) = a_0$.

So if a function $f$ can be represented by a series, we know just what series it is. Given a function $f$, the series

$$\sum_{n=0}^{\infty} \frac{f^{(n)}(0)}{n!} x^n \tag{11.8.6}$$

is called the Maclaurin series for $f$.

Example 11.10.1: Maclaurin series


Find the Maclaurin series for f (x) = 1/(1 − x) .
Solution
We need to compute the derivatives of $f$ (and hope to spot a pattern).

$$\begin{aligned}
f(x) &= (1-x)^{-1} \\
f'(x) &= (1-x)^{-2} \\
f''(x) &= 2(1-x)^{-3} \\
f'''(x) &= 6(1-x)^{-4} \\
f^{(4)}(x) &= 4!(1-x)^{-5} \\
&\;\;\vdots \\
f^{(n)}(x) &= n!(1-x)^{-n-1}
\end{aligned} \tag{11.8.7}$$

So

$$\frac{f^{(n)}(0)}{n!} = \frac{n!(1-0)^{-n-1}}{n!} = 1 \tag{11.8.8}$$

and the Maclaurin series is

$$\sum_{n=0}^{\infty} 1 \cdot x^n = \sum_{n=0}^{\infty} x^n, \tag{11.8.9}$$

the geometric series.

A warning is in order here. Given a function f we may be able to compute the Maclaurin series, but that does not mean we have
found a series representation for f . We still need to know where the series converges, and if, where it converges, it converges to
f (x). While for most commonly encountered functions the Maclaurin series does indeed converge to f on some interval, this is not

true of all functions, so care is required.


As a practical matter, if we are interested in using a series to approximate a function, we will need some finite number of terms of
the series. Even for functions with messy derivatives we can compute these using computer software like Sage. If we want to know
the whole series, that is, a typical term in the series, we need a function whose derivatives fall into a pattern that we can discern. A
few of the most important functions are fortunately very easy.

Example 11.10.2: Maclaurin series


Find the Maclaurin series for sin x.
Solution
The derivatives are quite easy: $f'(x) = \cos x$, $f''(x) = -\sin x$, $f'''(x) = -\cos x$, $f^{(4)}(x) = \sin x$, and then the pattern repeats. We want to know the derivatives at zero: 1, 0, −1, 0, 1, 0, −1, 0, …, and so the Maclaurin series is

$$x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n+1}}{(2n+1)!}. \tag{11.8.10}$$

We should always determine the radius of convergence:

$$\lim_{n\to\infty} \frac{|x|^{2n+3}}{(2n+3)!} \cdot \frac{(2n+1)!}{|x|^{2n+1}} = \lim_{n\to\infty} \frac{|x|^2}{(2n+3)(2n+2)} = 0, \tag{11.8.11}$$

so the series converges for every $x$. Since it turns out that this series does indeed converge to $\sin x$ everywhere, we have a series representation for $\sin x$ for every $x$.
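As a sanity check on this series, the partial sums can be compared with a library sine. The sketch below is illustrative only; `sin_maclaurin` is a name we introduce, and `math.sin` is used purely as a reference value.

```python
import math

def sin_maclaurin(x, n_terms):
    """Partial sum of sum_{n>=0} (-1)^n x^(2n+1) / (2n+1)! using n_terms terms."""
    return sum((-1) ** n * x ** (2 * n + 1) / math.factorial(2 * n + 1)
               for n in range(n_terms))

# More terms are needed as |x| grows, but the sums do settle for every x tried.
for x in (0.5, 1.0, 3.0, 10.0):
    print(x, sin_maclaurin(x, 40), math.sin(x))
```

For large $|x|$ the early terms are large and nearly cancel, so floating-point partial sums lose some accuracy even though the series converges in exact arithmetic.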

Sometimes the formula for the n th derivative of a function f is difficult to discover, but a combination of a known Maclaurin series
and some algebraic manipulation leads easily to the Maclaurin series for f .

Example 11.10.3: Maclaurin series
Find the Maclaurin series for x sin(−x).
Solution
To get from $\sin x$ to $x\sin(-x)$ we substitute $-x$ for $x$ and then multiply by $x$. We can do the same thing to the series for $\sin x$:

$$x \sum_{n=0}^{\infty} (-1)^n \frac{(-x)^{2n+1}}{(2n+1)!} = x \sum_{n=0}^{\infty} (-1)^n (-1)^{2n+1} \frac{x^{2n+1}}{(2n+1)!} = \sum_{n=0}^{\infty} (-1)^{n+1} \frac{x^{2n+2}}{(2n+1)!}. \tag{11.8.12}$$

As we have seen, a general power series can be centered at a point other than zero, and the method that produces the Maclaurin
series can also produce such series.

Taylor series
Find a series centered at −2 for 1/(1 − x).
Solution
If the series is

$$\sum_{n=0}^{\infty} a_n (x+2)^n \tag{11.8.13}$$

then looking at the $k$th derivative:

$$k!(1-x)^{-k-1} = \sum_{n=k}^{\infty} \frac{n!}{(n-k)!} a_n (x+2)^{n-k} \tag{11.8.14}$$

and substituting $x = -2$ we get

$$k!\, 3^{-k-1} = k!\, a_k \tag{11.8.15}$$

and

$$a_k = 3^{-k-1} = 1/3^{k+1}, \tag{11.8.16}$$

so the series is

$$\sum_{n=0}^{\infty} \frac{(x+2)^n}{3^{n+1}}. \tag{11.8.17}$$

We've already seen this, in Section 11.8. Such a series is called the Taylor series for the function, and the general term has the form

$$\frac{f^{(n)}(a)}{n!} (x-a)^n. \tag{11.8.18}$$

A Maclaurin series is simply a Taylor series with $a = 0$.
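The series centered at $-2$ can be tested against $1/(1-x)$ directly. The following sketch is ours (including the name `taylor_about_minus_2`). It relies on the observation that the series is geometric with ratio $(x+2)/3$, so we only sample points with $|x + 2| < 3$.

```python
def taylor_about_minus_2(x, n_terms):
    """Partial sum of sum_{n>=0} (x + 2)^n / 3^(n+1) using n_terms terms."""
    return sum((x + 2) ** n / 3 ** (n + 1) for n in range(n_terms))

# Compare with 1/(1 - x) at points inside the interval -5 < x < 1.
for x in (-4.0, -2.0, -1.0, 0.0, 0.5):
    print(x, taylor_about_minus_2(x, 200), 1 / (1 - x))
```

Points near the ends of the interval need many more terms, since the ratio $(x+2)/3$ is then close to $\pm 1$.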

Contributors
David Guichard (Whitman College)
Integrated by Justin Marshall.

11.8: Power Series is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
11.11: Taylor Series by David Guichard is licensed CC BY-NC-SA 4.0.

11.9: Representations of Functions as Power Series
Now we know that some functions can be expressed as power series, which look like infinite polynomials. Since calculus, that is,
computation of derivatives and antiderivatives, is easy for polynomials, the obvious question is whether the same is true for infinite
series. The answer is yes:

Theorem 11.9.1
Suppose the power series

$$f(x) = \sum_{n=0}^{\infty} a_n (x-a)^n \tag{11.9.1}$$

has radius of convergence $R$. Then

$$\begin{aligned}
f'(x) &= \sum_{n=0}^{\infty} n a_n (x-a)^{n-1}, \\
\int f(x)\,dx &= C + \sum_{n=0}^{\infty} \frac{a_n}{n+1} (x-a)^{n+1},
\end{aligned} \tag{11.9.2}$$

and these two series have radius of convergence $R$ as well.

Example 11.9.2

$$\begin{aligned}
\frac{1}{1-x} &= \sum_{n=0}^{\infty} x^n \\
\int \frac{1}{1-x}\,dx = -\ln|1-x| &= \sum_{n=0}^{\infty} \frac{1}{n+1} x^{n+1} \\
\ln|1-x| &= \sum_{n=0}^{\infty} -\frac{1}{n+1} x^{n+1}
\end{aligned} \tag{11.9.3}$$

when $|x| < 1$. The series does not converge when $x = 1$ but does converge when $x = -1$ or $1 - x = 2$. The interval of convergence is $[-1, 1)$, or $0 < 1 - x \le 2$, so we can use the series to represent $\ln(x)$ when $0 < x \le 2$.
For example

$$\ln(3/2) = \ln\left(1 - (-1/2)\right) = \sum_{n=0}^{\infty} (-1)^n \frac{1}{n+1} \cdot \frac{1}{2^{n+1}} \tag{11.9.4}$$

and so

$$\ln(3/2) \approx \frac{1}{2} - \frac{1}{8} + \frac{1}{24} - \frac{1}{64} + \frac{1}{160} - \frac{1}{384} + \frac{1}{896} = \frac{909}{2240} \approx 0.406. \tag{11.9.5}$$

Because this is an alternating series with decreasing terms, we know that the true value is between $909/2240$ and $909/2240 - 1/2048 = 29053/71680 \approx 0.4053$, so correct to two decimal places the value is 0.41.
What about $\ln(9/4)$? Since $9/4$ is larger than 2 we cannot use the series directly, but $\ln(9/4) = \ln((3/2)^2) = 2\ln(3/2) \approx 0.82$, so in fact we get a lot more from this one calculation than first meets the eye. To estimate the true value accurately we actually need to be a bit more careful. When we multiply by two we know that the true value is between 0.8106 and 0.812, so rounded to two decimal places the true value is 0.81.
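The fractions in this example can be confirmed with exact arithmetic. The sketch below uses Python's `fractions` module; the function name `ln_three_halves_partial` is ours, and `math.log` appears only as a reference value.

```python
from fractions import Fraction
import math

def ln_three_halves_partial(n_terms):
    """Exact partial sum of sum_{n>=0} (-1)^n / ((n + 1) 2^(n + 1)) using n_terms terms."""
    return sum(Fraction((-1) ** n, (n + 1) * 2 ** (n + 1)) for n in range(n_terms))

seven = ln_three_halves_partial(7)   # the seven terms written out above
eight = ln_three_halves_partial(8)   # one more term, subtracting 1/2048
print(seven, float(seven))
print(eight, float(eight))
print(math.log(1.5))                 # reference value, lies between the two
```

Doubling both bounds gives the bracket for $\ln(9/4)$ used at the end of the example.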

Contributors

11.9: Representations of Functions as Power Series is shared under a not declared license and was authored, remixed, and/or curated by
LibreTexts.

11.10: Calculus with Power Series by David Guichard is licensed CC BY-NC-SA 4.0.

11.10: Taylor and Maclaurin Series
 Learning Objectives
Describe the procedure for finding a Taylor polynomial of a given order for a function.
Explain the meaning and significance of Taylor’s theorem with remainder.
Estimate the remainder for a Taylor series approximation of a given function.

In the previous two sections we discussed how to find power series representations for certain types of functions––specifically, functions related to
geometric series. Here we discuss power series representations for other types of functions. In particular, we address the following questions: Which
functions can be represented by power series and how do we find such representations? If we can find a power series representation for a particular
function f and the series converges on some interval, how do we prove that the series actually converges to f ?

Overview of Taylor/Maclaurin Series


Consider a function $f$ that has a power series representation at $x = a$. Then the series has the form

$$\sum_{n=0}^{\infty} c_n (x-a)^n = c_0 + c_1(x-a) + c_2(x-a)^2 + \cdots. \tag{11.10.1}$$

What should the coefficients be? For now, we ignore issues of convergence, but instead focus on what the series should be, if one exists. We return to discuss convergence later in this section. If the series Equation 11.10.1 is a representation for $f$ at $x = a$, we certainly want the series to equal $f(a)$ at $x = a$. Evaluating the series at $x = a$, we see that

$$\sum_{n=0}^{\infty} c_n (x-a)^n = c_0 + c_1(a-a) + c_2(a-a)^2 + \cdots = c_0. \tag{11.10.2}$$
Thus, the series equals $f(a)$ if the coefficient $c_0 = f(a)$. In addition, we would like the first derivative of the power series to equal $f'(a)$ at $x = a$. Differentiating Equation 11.10.1 term-by-term, we see that

$$\frac{d}{dx}\left(\sum_{n=0}^{\infty} c_n (x-a)^n\right) = c_1 + 2c_2(x-a) + 3c_3(x-a)^2 + \cdots. \tag{11.10.3}$$

Therefore, at $x = a$, the derivative is

$$\frac{d}{dx}\left(\sum_{n=0}^{\infty} c_n (x-a)^n\right) = c_1 + 2c_2(a-a) + 3c_3(a-a)^2 + \cdots = c_1. \tag{11.10.4}$$

Therefore, the derivative of the series equals $f'(a)$ if the coefficient $c_1 = f'(a)$. Continuing in this way, we look for coefficients $c_n$ such that all the derivatives of the power series Equation 11.10.1 will agree with all the corresponding derivatives of $f$ at $x = a$. The second and third derivatives of Equation 11.10.1 are given by

$$\frac{d^2}{dx^2}\left(\sum_{n=0}^{\infty} c_n (x-a)^n\right) = 2c_2 + 3 \cdot 2\, c_3(x-a) + 4 \cdot 3\, c_4(x-a)^2 + \cdots \tag{11.10.5}$$

and

$$\frac{d^3}{dx^3}\left(\sum_{n=0}^{\infty} c_n (x-a)^n\right) = 3 \cdot 2\, c_3 + 4 \cdot 3 \cdot 2\, c_4(x-a) + 5 \cdot 4 \cdot 3\, c_5(x-a)^2 + \cdots. \tag{11.10.6}$$

Therefore, at $x = a$, the second and third derivatives

$$\frac{d^2}{dx^2}\left(\sum_{n=0}^{\infty} c_n (x-a)^n\right) = 2c_2 + 3 \cdot 2\, c_3(a-a) + 4 \cdot 3\, c_4(a-a)^2 + \cdots = 2c_2 \tag{11.10.7}$$

and

$$\frac{d^3}{dx^3}\left(\sum_{n=0}^{\infty} c_n (x-a)^n\right) = 3 \cdot 2\, c_3 + 4 \cdot 3 \cdot 2\, c_4(a-a) + 5 \cdot 4 \cdot 3\, c_5(a-a)^2 + \cdots = 3 \cdot 2\, c_3 \tag{11.10.8}$$

equal $f''(a)$ and $f'''(a)$, respectively, if $c_2 = \frac{f''(a)}{2}$ and $c_3 = \frac{f'''(a)}{3 \cdot 2}$. More generally, we see that if $f$ has a power series representation at $x = a$, then the coefficients should be given by $c_n = \frac{f^{(n)}(a)}{n!}$. That is, the series should be

$$\sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!} (x-a)^n = f(a) + f'(a)(x-a) + \frac{f''(a)}{2!}(x-a)^2 + \frac{f'''(a)}{3!}(x-a)^3 + \cdots$$

This power series for $f$ is known as the Taylor series for $f$ at $a$. If $a = 0$, then this series is known as the Maclaurin series for $f$.

 Definition 11.10.1: Maclaurin and Taylor series

If $f$ has derivatives of all orders at $x = a$, then the Taylor series for the function $f$ at $a$ is

$$\sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!} (x-a)^n = f(a) + f'(a)(x-a) + \frac{f''(a)}{2!}(x-a)^2 + \cdots + \frac{f^{(n)}(a)}{n!}(x-a)^n + \cdots$$

The Taylor series for f at 0 is known as the Maclaurin series for f .

Later in this section, we will show examples of finding Taylor series and discuss conditions under which the Taylor series for a function will
converge to that function. Here, we state an important result. Recall that power series representations are unique. Therefore, if a function f has a
power series at a , then it must be the Taylor series for f at a .

 Uniqueness of Taylor Series

If a function f has a power series at a that converges to f on some open interval containing a , then that power series is the Taylor series for f at
a.

The proof follows directly from that discussed previously.


To determine if a Taylor series converges, we need to look at its sequence of partial sums. These partial sums are finite polynomials, known as
Taylor polynomials.

Taylor Polynomials
The $n^{\text{th}}$ partial sum of the Taylor series for a function $f$ at $a$ is known as the $n^{\text{th}}$-degree Taylor polynomial. For example, the 0th, 1st, 2nd, and 3rd partial sums of the Taylor series are given by

$$\begin{aligned}
p_0(x) &= f(a) \\
p_1(x) &= f(a) + f'(a)(x-a) \\
p_2(x) &= f(a) + f'(a)(x-a) + \frac{f''(a)}{2!}(x-a)^2 \\
p_3(x) &= f(a) + f'(a)(x-a) + \frac{f''(a)}{2!}(x-a)^2 + \frac{f'''(a)}{3!}(x-a)^3
\end{aligned}$$

respectively. These partial sums are known as the 0th, 1st, 2nd, and 3rd degree Taylor polynomials of $f$ at $a$, respectively. If $a = 0$, then these polynomials are known as Maclaurin polynomials for $f$. We now provide a formal definition of Taylor and Maclaurin polynomials for a function $f$.

 Definition 11.10.2: Maclaurin polynomial

If $f$ has $n$ derivatives at $x = a$, then the $n^{\text{th}}$-degree Taylor polynomial of $f$ at $a$ is

$$p_n(x) = f(a) + f'(a)(x-a) + \frac{f''(a)}{2!}(x-a)^2 + \frac{f'''(a)}{3!}(x-a)^3 + \cdots + \frac{f^{(n)}(a)}{n!}(x-a)^n.$$

The $n^{\text{th}}$-degree Taylor polynomial for $f$ at 0 is known as the $n^{\text{th}}$-degree Maclaurin polynomial for $f$.

We now show how to use this definition to find several Taylor polynomials for f (x) = ln x at x = 1 .

 Example 11.10.1: Finding Taylor Polynomials

Find the Taylor polynomials $p_0, p_1, p_2$ and $p_3$ for $f(x) = \ln x$ at $x = 1$. Use a graphing utility to compare the graph of $f$ with the graphs of $p_0, p_1, p_2$ and $p_3$.

Solution
To find these Taylor polynomials, we need to evaluate f and its first three derivatives at x = 1 .

$$\begin{aligned}
f(x) &= \ln x & f(1) &= 0 \\
f'(x) &= \frac{1}{x} & f'(1) &= 1 \\
f''(x) &= -\frac{1}{x^2} & f''(1) &= -1 \\
f'''(x) &= \frac{2}{x^3} & f'''(1) &= 2
\end{aligned}$$

Therefore,

$$\begin{aligned}
p_0(x) &= f(1) = 0, \\
p_1(x) &= f(1) + f'(1)(x-1) = x - 1, \\
p_2(x) &= f(1) + f'(1)(x-1) + \frac{f''(1)}{2}(x-1)^2 = (x-1) - \frac{1}{2}(x-1)^2, \\
p_3(x) &= f(1) + f'(1)(x-1) + \frac{f''(1)}{2}(x-1)^2 + \frac{f'''(1)}{3!}(x-1)^3 = (x-1) - \frac{1}{2}(x-1)^2 + \frac{1}{3}(x-1)^3.
\end{aligned}$$

The graphs of $y = f(x)$ and the first three Taylor polynomials are shown in Figure 11.10.1.

Figure 11.10.1: The function $y = \ln x$ and the Taylor polynomials $p_0, p_1, p_2$ and $p_3$ at $x = 1$ are plotted on this graph.
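In place of a graphing utility, the comparison can also be made numerically. The sketch below is ours, not part of the original example; `taylor_ln` is a hypothetical helper that hard-codes the coefficients $0, 1, -\tfrac{1}{2}, \tfrac{1}{3}$ found in the solution and evaluates $p_0$ through $p_3$ near $x = 1$.

```python
import math

def taylor_ln(x, degree):
    """Evaluate p_degree(x) for ln x at a = 1, for degree 0..3.

    The coefficients f^(k)(1)/k! = 0, 1, -1/2, 1/3 are taken from the worked solution.
    """
    coeffs = [0.0, 1.0, -1 / 2, 1 / 3]
    return sum(c * (x - 1) ** k for k, c in enumerate(coeffs[:degree + 1]))

# Near x = 1 each higher-degree polynomial tracks ln x more closely.
for x in (0.9, 1.1, 1.5):
    print(x, [round(taylor_ln(x, d), 5) for d in range(4)], round(math.log(x), 5))
```

The agreement degrades as $x$ moves away from the center $x = 1$, which is what the plot in Figure 11.10.1 shows as well.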

 Exercise 11.10.1
Find the Taylor polynomials $p_0, p_1, p_2$ and $p_3$ for $f(x) = \frac{1}{x^2}$ at $x = 1$.

Hint
Find the first three derivatives of f and evaluate them at x = 1.
Answer
$$\begin{aligned}
p_0(x) &= 1 \\
p_1(x) &= 1 - 2(x-1) \\
p_2(x) &= 1 - 2(x-1) + 3(x-1)^2 \\
p_3(x) &= 1 - 2(x-1) + 3(x-1)^2 - 4(x-1)^3
\end{aligned}$$

We now show how to find Maclaurin polynomials for $e^x$, $\sin x$, and $\cos x$. As stated above, Maclaurin polynomials are Taylor polynomials centered at zero.

 Example 11.10.2: Finding Maclaurin Polynomials


For each of the following functions, find formulas for the Maclaurin polynomials $p_0, p_1, p_2$ and $p_3$. Find a formula for the $n^{\text{th}}$-degree Maclaurin polynomial and write it using sigma notation. Use a graphing utility to compare the graphs of $p_0, p_1, p_2$ and $p_3$ with $f$.

a. $f(x) = e^x$
b. $f(x) = \sin x$
c. $f(x) = \cos x$
Solution

a. Since $f(x) = e^x$, we know that $f(x) = f'(x) = f''(x) = \cdots = f^{(n)}(x) = e^x$ for all positive integers $n$. Therefore,

$$f(0) = f'(0) = f''(0) = \cdots = f^{(n)}(0) = 1$$

for all positive integers $n$. Therefore, we have

$$\begin{aligned}
p_0(x) &= f(0) = 1, \\
p_1(x) &= f(0) + f'(0)x = 1 + x, \\
p_2(x) &= f(0) + f'(0)x + \frac{f''(0)}{2!}x^2 = 1 + x + \frac{1}{2}x^2, \\
p_3(x) &= f(0) + f'(0)x + \frac{f''(0)}{2}x^2 + \frac{f'''(0)}{3!}x^3 = 1 + x + \frac{1}{2}x^2 + \frac{1}{3!}x^3, \\
p_n(x) &= f(0) + f'(0)x + \frac{f''(0)}{2}x^2 + \frac{f'''(0)}{3!}x^3 + \cdots + \frac{f^{(n)}(0)}{n!}x^n \\
&= 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots + \frac{x^n}{n!} \\
&= \sum_{k=0}^{n} \frac{x^k}{k!}.
\end{aligned}$$

The function and the first three Maclaurin polynomials are shown in Figure 11.10.2.

Figure 11.10.2: The graph shows the function $y = e^x$ and the Maclaurin polynomials $p_0, p_1, p_2$ and $p_3$.
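The sigma-notation formula for the Maclaurin polynomials of $e^x$ translates directly into a few lines of code. The following is a minimal sketch under our own naming (`maclaurin_exp`), with `math.exp` used only as a reference value.

```python
import math

def maclaurin_exp(x, n):
    """n-th degree Maclaurin polynomial of e^x: sum_{k=0}^{n} x^k / k!."""
    return sum(x ** k / math.factorial(k) for k in range(n + 1))

# p_0 through p_3 at x = 1, then a higher degree for comparison with e.
for n in (0, 1, 2, 3, 10):
    print(n, maclaurin_exp(1.0, n))
print("e =", math.e)
```

At $x = 1$ the values $1, 2, 2.5, 2.666\ldots$ increase toward $e$, matching the way the graphs of $p_0$ through $p_3$ climb toward the graph of $e^x$ for $x > 0$.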

b. For $f(x) = \sin x$, the values of the function and its first four derivatives at $x = 0$ are given as follows:

$$\begin{aligned}
f(x) &= \sin x & f(0) &= 0 \\
f'(x) &= \cos x & f'(0) &= 1 \\
f''(x) &= -\sin x & f''(0) &= 0 \\
f'''(x) &= -\cos x & f'''(0) &= -1 \\
f^{(4)}(x) &= \sin x & f^{(4)}(0) &= 0.
\end{aligned}$$

Since the fourth derivative is $\sin x$, the pattern repeats. That is, $f^{(2m)}(0) = 0$ and $f^{(2m+1)}(0) = (-1)^m$ for $m \ge 0$. Thus, we have

$$\begin{aligned}
p_0(x) &= 0, \\
p_1(x) &= 0 + x = x, \\
p_2(x) &= 0 + x + 0 = x, \\
p_3(x) &= 0 + x + 0 - \frac{1}{3!}x^3 = x - \frac{x^3}{3!}, \\
p_4(x) &= 0 + x + 0 - \frac{1}{3!}x^3 + 0 = x - \frac{x^3}{3!}, \\
p_5(x) &= 0 + x + 0 - \frac{1}{3!}x^3 + 0 + \frac{1}{5!}x^5 = x - \frac{x^3}{3!} + \frac{x^5}{5!},
\end{aligned}$$

and for $m \ge 0$,

$$\begin{aligned}
p_{2m+1}(x) = p_{2m+2}(x) &= x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots + (-1)^m \frac{x^{2m+1}}{(2m+1)!} \\
&= \sum_{k=0}^{m} (-1)^k \frac{x^{2k+1}}{(2k+1)!}.
\end{aligned}$$

Graphs of the function and its Maclaurin polynomials are shown in Figure 11.10.3.

Figure 11.10.3: The graph shows the function $y = \sin x$ and the Maclaurin polynomials $p_1, p_3$ and $p_5$.

c. For $f(x) = \cos x$, the values of the function and its first four derivatives at $x = 0$ are given as follows:

$$\begin{aligned}
f(x) &= \cos x & f(0) &= 1 \\
f'(x) &= -\sin x & f'(0) &= 0 \\
f''(x) &= -\cos x & f''(0) &= -1 \\
f'''(x) &= \sin x & f'''(0) &= 0 \\
f^{(4)}(x) &= \cos x & f^{(4)}(0) &= 1.
\end{aligned}$$

Since the fourth derivative is $\cos x$, the pattern repeats. In other words, $f^{(2m)}(0) = (-1)^m$ and $f^{(2m+1)}(0) = 0$ for $m \ge 0$. Therefore,

$$\begin{aligned}
p_0(x) &= 1, \\
p_1(x) &= 1 + 0 = 1, \\
p_2(x) &= 1 + 0 - \frac{1}{2!}x^2 = 1 - \frac{x^2}{2!}, \\
p_3(x) &= 1 + 0 - \frac{1}{2!}x^2 + 0 = 1 - \frac{x^2}{2!}, \\
p_4(x) &= 1 + 0 - \frac{1}{2!}x^2 + 0 + \frac{1}{4!}x^4 = 1 - \frac{x^2}{2!} + \frac{x^4}{4!}, \\
p_5(x) &= 1 + 0 - \frac{1}{2!}x^2 + 0 + \frac{1}{4!}x^4 + 0 = 1 - \frac{x^2}{2!} + \frac{x^4}{4!},
\end{aligned}$$

and for $m \ge 0$,

$$\begin{aligned}
p_{2m}(x) = p_{2m+1}(x) &= 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \cdots + (-1)^m \frac{x^{2m}}{(2m)!} \\
&= \sum_{k=0}^{m} (-1)^k \frac{x^{2k}}{(2k)!}.
\end{aligned}$$

Graphs of the function and the Maclaurin polynomials appear in Figure 11.10.4.

Figure 11.10.4: The function $y = \cos x$ and the Maclaurin polynomials $p_0, p_2$ and $p_4$ are plotted on this graph.

 Exercise 11.10.2
Find formulas for the Maclaurin polynomials $p_0, p_1, p_2$ and $p_3$ for $f(x) = \frac{1}{1+x}$.

Find a formula for the $n^{\text{th}}$-degree Maclaurin polynomial. Write your answer using sigma notation.

Hint
Evaluate the first four derivatives of f and look for a pattern.
Answer
$p_0(x) = 1$; $p_1(x) = 1 - x$; $p_2(x) = 1 - x + x^2$; $p_3(x) = 1 - x + x^2 - x^3$; $p_n(x) = 1 - x + x^2 - x^3 + \cdots + (-1)^n x^n = \sum_{k=0}^{n} (-1)^k x^k$

Taylor’s Theorem with Remainder


Recall that the $n$th-degree Taylor polynomial for a function $f$ at $a$ is the $n$th partial sum of the Taylor series for $f$ at $a$. Therefore, to determine if the Taylor series converges, we need to determine whether the sequence of Taylor polynomials $\{p_n\}$ converges. However, not only do we want to know if the sequence of Taylor polynomials converges, we want to know if it converges to $f$. To answer this question, we define the remainder $R_n(x)$ as

$$R_n(x) = f(x) - p_n(x).$$

For the sequence of Taylor polynomials to converge to $f$, we need the remainder $R_n$ to converge to zero. To determine if $R_n$ converges to zero, we introduce Taylor's theorem with remainder. Not only is this theorem useful in proving that a Taylor series converges to its related function, but it will also allow us to quantify how well the $n$th-degree Taylor polynomial approximates the function.

Here we look for a bound on $|R_n|$. Consider the simplest case: $n = 0$. Let $p_0$ be the 0th Taylor polynomial at $a$ for a function $f$. The remainder $R_0$ satisfies

$$R_0(x) = f(x) - p_0(x) = f(x) - f(a).$$

If $f$ is differentiable on an interval $I$ containing $a$ and $x$, then by the Mean Value Theorem there exists a real number $c$ between $a$ and $x$ such that $f(x) - f(a) = f'(c)(x - a)$. Therefore,

$$R_0(x) = f'(c)(x - a).$$

Using the Mean Value Theorem in a similar argument, we can show that if $f$ is $n+1$ times differentiable on an interval $I$ containing $a$ and $x$, then the $n$th remainder $R_n$ satisfies

$$R_n(x) = \frac{f^{(n+1)}(c)}{(n+1)!}(x - a)^{n+1}$$

for some real number $c$ between $a$ and $x$. It is important to note that the value $c$ in the numerator above is not the center $a$, but rather an unknown value $c$ between $a$ and $x$. This formula allows us to get a bound on the remainder $R_n$. If we happen to know that $|f^{(n+1)}(x)|$ is bounded by some real number $M$ on this interval $I$, then

$$|R_n(x)| \le \frac{M}{(n+1)!}|x - a|^{n+1}$$

for all $x$ in the interval $I$.


We now state Taylor's theorem, which provides the formal relationship between a function $f$ and its $n$th-degree Taylor polynomial $p_n(x)$. This theorem allows us to bound the error when using a Taylor polynomial to approximate a function value, and will be important in proving that a Taylor series for $f$ converges to $f$.

 Taylor’s Theorem with Remainder

Let $f$ be a function that can be differentiated $n+1$ times on an interval $I$ containing the real number $a$. Let $p_n$ be the $n$th-degree Taylor polynomial of $f$ at $a$ and let

$$R_n(x) = f(x) - p_n(x)$$

be the $n$th remainder. Then for each $x$ in the interval $I$, there exists a real number $c$ between $a$ and $x$ such that

$$R_n(x) = \frac{f^{(n+1)}(c)}{(n+1)!}(x - a)^{n+1}.$$

If there exists a real number $M$ such that $|f^{(n+1)}(x)| \le M$ for all $x \in I$, then

$$|R_n(x)| \le \frac{M}{(n+1)!}|x - a|^{n+1}$$

for all $x$ in $I$.

Proof
Fix a point $x \in I$ and introduce the function $g$ such that

$$g(t) = f(x) - f(t) - f'(t)(x - t) - \frac{f''(t)}{2!}(x - t)^2 - \cdots - \frac{f^{(n)}(t)}{n!}(x - t)^n - R_n(x)\frac{(x - t)^{n+1}}{(x - a)^{n+1}}.$$

We claim that $g$ satisfies the criteria of Rolle's theorem. Since $f$ is $n+1$ times differentiable on $I$, $g$ is a differentiable function of $t$. Also, $g$ is zero at $t = a$ and $t = x$ because

$$\begin{aligned}
g(a) &= f(x) - f(a) - f'(a)(x - a) - \frac{f''(a)}{2!}(x - a)^2 - \cdots - \frac{f^{(n)}(a)}{n!}(x - a)^n - R_n(x)\\
&= f(x) - p_n(x) - R_n(x)\\
&= 0,\\
g(x) &= f(x) - f(x) - 0 - \cdots - 0\\
&= 0.
\end{aligned}$$

Therefore, $g$ satisfies Rolle's theorem, and consequently, there exists $c$ between $a$ and $x$ such that $g'(c) = 0$. We now calculate $g'$. Using the product rule, we note that

$$\frac{d}{dt}\left[\frac{f^{(n)}(t)}{n!}(x - t)^n\right] = -\frac{f^{(n)}(t)}{(n-1)!}(x - t)^{n-1} + \frac{f^{(n+1)}(t)}{n!}(x - t)^n.$$

Consequently,

$$\begin{aligned}
g'(t) = {} & -f'(t) + \left[f'(t) - f''(t)(x - t)\right] + \left[f''(t)(x - t) - \frac{f'''(t)}{2!}(x - t)^2\right] + \cdots\\
& + \left[\frac{f^{(n)}(t)}{(n-1)!}(x - t)^{n-1} - \frac{f^{(n+1)}(t)}{n!}(x - t)^n\right] + (n+1)R_n(x)\frac{(x - t)^n}{(x - a)^{n+1}}.
\end{aligned} \tag{11.10.9}$$

Notice that there is a telescoping effect. Therefore,

$$g'(t) = -\frac{f^{(n+1)}(t)}{n!}(x - t)^n + (n+1)R_n(x)\frac{(x - t)^n}{(x - a)^{n+1}}.$$

By Rolle's theorem, we conclude that there exists a number $c$ between $a$ and $x$ such that $g'(c) = 0$. Since

$$g'(c) = -\frac{f^{(n+1)}(c)}{n!}(x - c)^n + (n+1)R_n(x)\frac{(x - c)^n}{(x - a)^{n+1}},$$

we conclude that

$$-\frac{f^{(n+1)}(c)}{n!}(x - c)^n + (n+1)R_n(x)\frac{(x - c)^n}{(x - a)^{n+1}} = 0.$$

Adding the first term on the left-hand side to both sides of the equation, dividing both sides by $(n+1)(x - c)^n$ (which is nonzero because $c$ lies strictly between $a$ and $x$), and multiplying by $(x - a)^{n+1}$, we conclude that

$$R_n(x) = \frac{f^{(n+1)}(c)}{(n+1)!}(x - a)^{n+1}$$

as desired. From this fact, it follows that if there exists $M$ such that $|f^{(n+1)}(x)| \le M$ for all $x$ in $I$, then

$$|R_n(x)| \le \frac{M}{(n+1)!}|x - a|^{n+1}.$$

Not only does Taylor's theorem allow us to prove that a Taylor series converges to a function, but it also allows us to estimate the accuracy of Taylor polynomials in approximating function values. We begin by looking at linear and quadratic approximations of $f(x) = \sqrt[3]{x}$ at $x = 8$ and determine how accurate these approximations are at estimating $\sqrt[3]{11}$.

 Example 11.10.3: Using Linear and Quadratic Approximations to Estimate Function Values

Consider the function $f(x) = \sqrt[3]{x}$.

a. Find the first and second Taylor polynomials for $f$ at $x = 8$. Use a graphing utility to compare these polynomials with $f$ near $x = 8$.
b. Use these two polynomials to estimate $\sqrt[3]{11}$.
c. Use Taylor's theorem to bound the error.


Solution:
a. For $f(x) = \sqrt[3]{x}$, the values of the function and its first two derivatives at $x = 8$ are as follows:

$$\begin{aligned}
f(x) &= \sqrt[3]{x}, & f(8) &= 2,\\
f'(x) &= \frac{1}{3x^{2/3}}, & f'(8) &= \frac{1}{12},\\
f''(x) &= \frac{-2}{9x^{5/3}}, & f''(8) &= -\frac{1}{144}.
\end{aligned}$$

Thus, the first and second Taylor polynomials at $x = 8$ are given by

$$\begin{aligned}
p_1(x) &= f(8) + f'(8)(x - 8)\\
&= 2 + \frac{1}{12}(x - 8),\\
p_2(x) &= f(8) + f'(8)(x - 8) + \frac{f''(8)}{2!}(x - 8)^2\\
&= 2 + \frac{1}{12}(x - 8) - \frac{1}{288}(x - 8)^2.
\end{aligned}$$

The function and the Taylor polynomials are shown in Figure 11.10.5.

Figure 11.10.5: The graphs of $f(x) = \sqrt[3]{x}$ and the linear and quadratic approximations $p_1(x)$ and $p_2(x)$.

b. Using the first Taylor polynomial at $x = 8$, we can estimate

$$\sqrt[3]{11} \approx p_1(11) = 2 + \frac{1}{12}(11 - 8) = 2.25.$$

Using the second Taylor polynomial at $x = 8$, we obtain

$$\sqrt[3]{11} \approx p_2(11) = 2 + \frac{1}{12}(11 - 8) - \frac{1}{288}(11 - 8)^2 = 2.21875.$$

c. By Taylor's theorem with remainder, there exists a $c$ in the interval $(8, 11)$ such that the remainder when approximating $\sqrt[3]{11}$ by the first Taylor polynomial satisfies

$$R_1(11) = \frac{f''(c)}{2!}(11 - 8)^2.$$

We do not know the exact value of $c$, so we find an upper bound on $|R_1(11)|$ by determining the largest value of $|f''|$ on the interval $[8, 11]$. Since $f''(x) = -\dfrac{2}{9x^{5/3}}$, the largest value for $|f''(x)|$ on that interval occurs at $x = 8$. Using the fact that $f''(8) = -\dfrac{1}{144}$, we obtain

$$|R_1(11)| \le \frac{1}{144 \cdot 2!}(11 - 8)^2 = 0.03125.$$

Similarly, to estimate $R_2(11)$, we use the fact that

$$R_2(11) = \frac{f'''(c)}{3!}(11 - 8)^3.$$

Since $f'''(x) = \dfrac{10}{27x^{8/3}}$, the largest value of $f'''$ on the interval $[8, 11]$ is $f'''(8) \approx 0.0014468$. Therefore, we have

$$|R_2(11)| \le \frac{0.0014468}{3!}(11 - 8)^3 \approx 0.0065104.$$
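The estimates and error bounds in this example can be checked numerically. The short Python sketch below is illustrative only (the names `p1`, `p2`, `bound1`, and `bound2` are our own). It compares the actual errors at $x = 11$ with the bounds from Taylor's theorem.

```python
# Linear and quadratic Taylor polynomials of the cube root at x = 8.
def p1(x):
    return 2 + (x - 8) / 12

def p2(x):
    return p1(x) - (x - 8) ** 2 / 288

exact = 11 ** (1 / 3)

# Bounds M / (n+1)! * |x - a|^(n+1), with M taken at x = 8,
# where |f''| and |f'''| are largest on [8, 11].
bound1 = (2 / (9 * 8 ** (5 / 3))) / 2 * 3 ** 2
bound2 = (10 / (27 * 8 ** (8 / 3))) / 6 * 3 ** 3

print(p1(11), abs(exact - p1(11)), bound1)
print(p2(11), abs(exact - p2(11)), bound2)
```

In both cases the actual error comes out smaller than the bound, as the theorem guarantees.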

 Exercise 11.10.3:
Find the first and second Taylor polynomials for $f(x) = \sqrt{x}$ at $x = 4$. Use these polynomials to estimate $\sqrt{6}$. Use Taylor's theorem to bound the error.

Hint
Evaluate $f(4)$, $f'(4)$, and $f''(4)$.

Answer
$p_1(x) = 2 + \dfrac{1}{4}(x - 4)$; $p_2(x) = 2 + \dfrac{1}{4}(x - 4) - \dfrac{1}{64}(x - 4)^2$; $p_1(6) = 2.5$; $p_2(6) = 2.4375$;

$|R_1(6)| \le 0.0625$; $|R_2(6)| \le 0.015625$

 Example 11.10.4: Approximating sin x Using Maclaurin Polynomials

From Example 11.10.2b, the Maclaurin polynomials for $\sin x$ are given by

$$p_{2m+1}(x) = p_{2m+2}(x) = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots + (-1)^m \frac{x^{2m+1}}{(2m+1)!}$$

for $m = 0, 1, 2, \ldots$.
a. Use the fifth Maclaurin polynomial for $\sin x$ to approximate $\sin\left(\dfrac{\pi}{18}\right)$ and bound the error.
b. For what values of $x$ does the fifth Maclaurin polynomial approximate $\sin x$ to within 0.0001?
Solution
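a. The fifth Maclaurin polynomial is

$$p_5(x) = x - \frac{x^3}{3!} + \frac{x^5}{5!}.$$

Using this polynomial, we can estimate as follows:

$$\sin\left(\frac{\pi}{18}\right) \approx p_5\left(\frac{\pi}{18}\right) = \frac{\pi}{18} - \frac{1}{3!}\left(\frac{\pi}{18}\right)^3 + \frac{1}{5!}\left(\frac{\pi}{18}\right)^5 \approx 0.173648.$$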

To estimate the error, use the fact that the sixth Maclaurin polynomial is $p_6(x) = p_5(x)$ and calculate a bound on $R_6\left(\dfrac{\pi}{18}\right)$. By Taylor's theorem with remainder, the remainder is

$$R_6\left(\frac{\pi}{18}\right) = \frac{f^{(7)}(c)}{7!}\left(\frac{\pi}{18}\right)^7$$

for some $c$ between 0 and $\dfrac{\pi}{18}$. Using the fact that $|f^{(7)}(x)| \le 1$ for all $x$, we find that the magnitude of the error is at most

$$\frac{1}{7!}\cdot\left(\frac{\pi}{18}\right)^7 \le 9.8 \times 10^{-10}.$$

b. We need to find the values of $x$ such that

$$\frac{1}{7!}|x|^7 \le 0.0001.$$

Solving this inequality for $x$, we have that the fifth Maclaurin polynomial gives an estimate to within 0.0001 as long as $|x| < 0.907$.
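Both parts of this example are easy to reproduce numerically. The Python sketch below is an illustration (the variable names are our own). It evaluates $p_5(\pi/18)$, the remainder bound $\frac{1}{7!}(\pi/18)^7$, and the radius obtained by solving $\frac{1}{7!}|x|^7 \le 0.0001$ for $|x|$.

```python
import math

x = math.pi / 18

# Fifth Maclaurin polynomial of sin evaluated at pi/18.
p5 = x - x ** 3 / math.factorial(3) + x ** 5 / math.factorial(5)

# Remainder bound from Taylor's theorem, using |f^(7)| <= 1 and p6 = p5.
error_bound = x ** 7 / math.factorial(7)

# Largest |x| for which the bound |x|^7 / 7! stays within 0.0001.
radius = (1e-4 * math.factorial(7)) ** (1 / 7)

print(p5, abs(math.sin(x) - p5), error_bound)
print(radius)
```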

 Exercise 11.10.4
Use the fourth Maclaurin polynomial for $\cos x$ to approximate $\cos\left(\dfrac{\pi}{12}\right)$.

Hint
The fourth Maclaurin polynomial is $p_4(x) = 1 - \dfrac{x^2}{2!} + \dfrac{x^4}{4!}$.

Answer
0.96593

Now that we are able to bound the remainder $R_n(x)$, we can use this bound to prove that a Taylor series for $f$ at $a$ converges to $f$.

Representing Functions with Taylor and Maclaurin Series


We now discuss issues of convergence for Taylor series. We begin by showing how to find a Taylor series for a function, and how to find its interval
of convergence.

 Example 11.10.5: Finding a Taylor Series


Find the Taylor series for $f(x) = \dfrac{1}{x}$ at $x = 1$. Determine the interval of convergence.

Solution
For $f(x) = \dfrac{1}{x}$, the values of the function and its first four derivatives at $x = 1$ are

$$\begin{aligned}
f(x) &= \frac{1}{x}, & f(1) &= 1,\\
f'(x) &= -\frac{1}{x^2}, & f'(1) &= -1,\\
f''(x) &= \frac{2}{x^3}, & f''(1) &= 2!,\\
f'''(x) &= -\frac{3 \cdot 2}{x^4}, & f'''(1) &= -3!,\\
f^{(4)}(x) &= \frac{4 \cdot 3 \cdot 2}{x^5}, & f^{(4)}(1) &= 4!.
\end{aligned}$$

That is, we have $f^{(n)}(1) = (-1)^n n!$ for all $n \ge 0$. Therefore, the Taylor series for $f$ at $x = 1$ is given by

$$\sum_{n=0}^{\infty} \frac{f^{(n)}(1)}{n!}(x - 1)^n = \sum_{n=0}^{\infty} (-1)^n (x - 1)^n.$$

To find the interval of convergence, we use the ratio test. We find that

$$\frac{|a_{n+1}|}{|a_n|} = \frac{\left|(-1)^{n+1}(x - 1)^{n+1}\right|}{\left|(-1)^n (x - 1)^n\right|} = |x - 1|.$$

Thus, the series converges if $|x - 1| < 1$. That is, the series converges for $0 < x < 2$. Next, we need to check the endpoints. At $x = 2$, we see that

$$\sum_{n=0}^{\infty} (-1)^n (2 - 1)^n = \sum_{n=0}^{\infty} (-1)^n$$

diverges by the divergence test. Similarly, at $x = 0$,

$$\sum_{n=0}^{\infty} (-1)^n (0 - 1)^n = \sum_{n=0}^{\infty} (-1)^{2n} = \sum_{n=0}^{\infty} 1$$

diverges. Therefore, the interval of convergence is $(0, 2)$.
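The behavior inside the interval and at the endpoints can be observed directly from partial sums. The following Python sketch is illustrative (the function name `taylor_partial_sum` is our own). It sums the first terms of $\sum (-1)^n (x-1)^n$.

```python
def taylor_partial_sum(x, n_terms):
    """Sum the first n_terms terms of the Taylor series of 1/x at x = 1."""
    return sum((-1) ** n * (x - 1) ** n for n in range(n_terms))

# Inside (0, 2) the partial sums approach 1/x.
print(taylor_partial_sum(1.5, 60), 1 / 1.5)

# At x = 2 the partial sums oscillate between 0 and 1.
print([taylor_partial_sum(2, k) for k in range(1, 7)])

# At x = 0 every term equals 1, so the partial sums grow without bound.
print([taylor_partial_sum(0, k) for k in range(1, 7)])
```

A finite computation cannot prove divergence, but the oscillating and growing partial sums match the conclusions of the divergence test above.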

 Exercise 11.10.5
Find the Taylor series for $f(x) = \dfrac{1}{x}$ at $x = 2$ and determine its interval of convergence.

Hint
$f^{(n)}(2) = \dfrac{(-1)^n n!}{2^{n+1}}$

Answer
$\dfrac{1}{2}\displaystyle\sum_{n=0}^{\infty}\left(\dfrac{2 - x}{2}\right)^n$. The interval of convergence is $(0, 4)$.

We know that the Taylor series found in this example converges on the interval $(0, 2)$, but how do we know it actually converges to $f$? We consider this question in more generality in a moment, but for this example, we can answer this question by writing

$$f(x) = \frac{1}{x} = \frac{1}{1 - (1 - x)}.$$

That is, $f$ can be represented by the geometric series $\displaystyle\sum_{n=0}^{\infty} (1 - x)^n$. Since this is a geometric series, it converges to $\dfrac{1}{x}$ as long as $|1 - x| < 1$. Therefore, the Taylor series found in Example 11.10.5 does converge to $f(x) = \dfrac{1}{x}$ on $(0, 2)$.

We now consider the more general question: if a Taylor series for a function $f$ converges on some interval, how can we determine if it actually converges to $f$? To answer this question, recall that a series converges to a particular value if and only if its sequence of partial sums converges to that value. Given a Taylor series for $f$ at $a$, the $n$th partial sum is given by the $n$th-degree Taylor polynomial $p_n$. Therefore, to determine if the Taylor series converges to $f$, we need to determine whether

$$\lim_{n \to \infty} p_n(x) = f(x).$$

Since the remainder $R_n(x) = f(x) - p_n(x)$, the Taylor series converges to $f$ if and only if

$$\lim_{n \to \infty} R_n(x) = 0.$$

We now state this theorem formally.

 Convergence of Taylor Series

Suppose that $f$ has derivatives of all orders on an interval $I$ containing $a$. Then the Taylor series

$$\sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!}(x - a)^n$$

converges to $f(x)$ for all $x$ in $I$ if and only if

$$\lim_{n \to \infty} R_n(x) = 0$$

for all $x$ in $I$.

With this theorem, we can prove that a Taylor series for $f$ at $a$ converges to $f$ if we can prove that the remainder $R_n(x) \to 0$. To prove that $R_n(x) \to 0$, we typically use the bound

$$|R_n(x)| \le \frac{M}{(n+1)!}|x - a|^{n+1}$$

from Taylor's theorem with remainder.

In the next example, we find the Maclaurin series for $e^x$ and $\sin x$ and show that these series converge to the corresponding functions for all real numbers by proving that the remainders $R_n(x) \to 0$ for all real numbers $x$.

 Example 11.10.6: Finding Maclaurin Series

For each of the following functions, find the Maclaurin series and its interval of convergence. Use Taylor's theorem with remainder to prove that the Maclaurin series for $f$ converges to $f$ on that interval.
a. $e^x$
b. $\sin x$
Solution
a. Using the $n$th-degree Maclaurin polynomial for $e^x$ found in Example 11.10.2a, we find that the Maclaurin series for $e^x$ is given by

$$\sum_{n=0}^{\infty} \frac{x^n}{n!}.$$

To determine the interval of convergence, we use the ratio test. Since

$$\frac{|a_{n+1}|}{|a_n|} = \frac{|x|^{n+1}}{(n+1)!} \cdot \frac{n!}{|x|^n} = \frac{|x|}{n+1},$$

we have

$$\lim_{n \to \infty} \frac{|a_{n+1}|}{|a_n|} = \lim_{n \to \infty} \frac{|x|}{n+1} = 0$$

for all $x$. Therefore, the series converges absolutely for all $x$, and thus, the interval of convergence is $(-\infty, \infty)$. To show that the series converges to $e^x$ for all $x$, we use the fact that $f^{(n)}(x) = e^x$ for all $n \ge 0$ and $e^x$ is an increasing function on $(-\infty, \infty)$. Therefore, for any real number $b > 0$, the maximum value of $e^x$ for all $|x| \le b$ is $e^b$. Thus, for $|x| \le b$,

$$|R_n(x)| \le \frac{e^b}{(n+1)!}|x|^{n+1}.$$

Since we just showed that

$$\sum_{n=0}^{\infty} \frac{|x|^n}{n!}$$

converges for all $x$, by the divergence test, we know that

$$\lim_{n \to \infty} \frac{|x|^{n+1}}{(n+1)!} = 0$$

for any real number $x$. By combining this fact with the squeeze theorem, the result is $\displaystyle\lim_{n \to \infty} R_n(x) = 0$.
n→∞

b. Using the $n$th-degree Maclaurin polynomial for $\sin x$ found in Example 11.10.2b, we find that the Maclaurin series for $\sin x$ is given by

$$\sum_{n=0}^{\infty} (-1)^n \frac{x^{2n+1}}{(2n+1)!}.$$

In order to apply the ratio test, consider

$$\frac{|a_{n+1}|}{|a_n|} = \frac{|x|^{2n+3}}{(2n+3)!} \cdot \frac{(2n+1)!}{|x|^{2n+1}} = \frac{|x|^2}{(2n+3)(2n+2)}.$$

Since

$$\lim_{n \to \infty} \frac{|x|^2}{(2n+3)(2n+2)} = 0$$
for all $x$, we obtain the interval of convergence as $(-\infty, \infty)$. To show that the Maclaurin series converges to $\sin x$, look at $R_n(x)$. For each $x$ there exists a real number $c$ between 0 and $x$ such that

$$R_n(x) = \frac{f^{(n+1)}(c)}{(n+1)!}x^{n+1}.$$

Since $|f^{(n+1)}(c)| \le 1$ for all integers $n$ and all real numbers $c$, we have

$$|R_n(x)| \le \frac{|x|^{n+1}}{(n+1)!}$$

for all real numbers $x$. Using the same idea as in part a., the result is $\displaystyle\lim_{n \to \infty} R_n(x) = 0$ for all $x$, and therefore, the Maclaurin series for $\sin x$ converges to $\sin x$ for all real $x$.
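To see how quickly the remainder bound from part a. shrinks, one can tabulate it next to the actual error. The Python sketch below is an illustration under our own choices ($x = b = 3$, and the helper names `maclaurin_exp` and `remainder_bound` are hypothetical).

```python
import math

def maclaurin_exp(x, n):
    """Degree-n Maclaurin polynomial of e^x evaluated at x."""
    return sum(x ** k / math.factorial(k) for k in range(n + 1))

def remainder_bound(x, n, b):
    """Bound e^b |x|^(n+1) / (n+1)!, valid for |x| <= b."""
    return math.exp(b) * abs(x) ** (n + 1) / math.factorial(n + 1)

x = b = 3.0
for n in (5, 10, 20):
    actual = abs(math.exp(x) - maclaurin_exp(x, n))
    print(n, actual, remainder_bound(x, n, b))
```

The factorial in the denominator eventually dominates the power $|x|^{n+1}$, so both columns tend to zero as $n$ grows.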

 Exercise 11.10.6

Find the Maclaurin series for f (x) = cos x. Use the ratio test to show that the interval of convergence is (−∞, ∞) . Show that the Maclaurin
series converges to cos x for all real numbers x.

Hint
Use the Maclaurin polynomials for cos x.
Answer
$$\sum_{n=0}^{\infty} \frac{(-1)^n x^{2n}}{(2n)!}$$

By the ratio test, the interval of convergence is $(-\infty, \infty)$. Since $|R_n(x)| \le \dfrac{|x|^{n+1}}{(n+1)!}$, the series converges to $\cos x$ for all real $x$.

 Proving that e is Irrational

In this project, we use the Maclaurin polynomials for $e^x$ to prove that $e$ is irrational. The proof relies on supposing that $e$ is rational and arriving at a contradiction. Therefore, in the following steps, we suppose $e = r/s$ for some integers $r$ and $s$ where $s \ne 0$.
1. Write the Maclaurin polynomials $p_0(x), p_1(x), p_2(x), p_3(x), p_4(x)$ for $e^x$. Evaluate $p_0(1), p_1(1), p_2(1), p_3(1), p_4(1)$ to estimate $e$.
2. Let $R_n(x)$ denote the remainder when using $p_n(x)$ to estimate $e^x$. Therefore, $R_n(x) = e^x - p_n(x)$, and $R_n(1) = e - p_n(1)$. Assuming that $e = \dfrac{r}{s}$ for integers $r$ and $s$, evaluate $R_0(1), R_1(1), R_2(1), R_3(1), R_4(1)$.
3. Using the results from part 2, show that for each remainder $R_0(1), R_1(1), R_2(1), R_3(1), R_4(1)$, we can find an integer $k$ such that $kR_n(1)$ is an integer for $n = 0, 1, 2, 3, 4$.
4. Write down the formula for the $n$th-degree Maclaurin polynomial $p_n(x)$ for $e^x$ and the corresponding remainder $R_n(x)$. Show that $sn!R_n(1)$ is an integer.
5. Use Taylor's theorem to write down an explicit formula for $R_n(1)$. Conclude that $R_n(1) \ne 0$, and therefore, $sn!R_n(1) \ne 0$.
6. Use Taylor's theorem to find an estimate on $R_n(1)$. Use this estimate combined with the result from part 5 to show that $|sn!R_n(1)| < \dfrac{se}{n+1}$. Conclude that if $n$ is large enough, then $|sn!R_n(1)| < 1$. Therefore, $sn!R_n(1)$ is an integer with magnitude less than 1. Thus, $sn!R_n(1) = 0$. But from part 5, we know that $sn!R_n(1) \ne 0$. We have arrived at a contradiction, and consequently, the original supposition that $e$ is rational must be false.
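The two numerical facts that drive this argument can be observed for small $n$ before proving them. The Python sketch below is only an illustration (the helper name `p_n_at_1` is our own, and floating-point arithmetic is used for $e$, so this is evidence rather than proof). It checks that $n!\,p_n(1)$ is an integer and that $0 < n!\,R_n(1) < \dfrac{e}{n+1}$.

```python
from fractions import Fraction
import math

def p_n_at_1(n):
    """Exact value of the degree-n Maclaurin polynomial of e^x at x = 1."""
    return sum(Fraction(1, math.factorial(k)) for k in range(n + 1))

for n in range(0, 9):
    scaled_partial = math.factorial(n) * p_n_at_1(n)      # always an integer
    scaled_remainder = math.factorial(n) * (math.e - float(p_n_at_1(n)))
    print(n, scaled_partial, scaled_remainder, math.e / (n + 1))
```

Multiplying by $s$ as well is what would make $sn!R_n(1)$ an integer under the assumption $e = r/s$; the code cannot test that step, since no such $r$ and $s$ exist.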

Key Concepts
Taylor polynomials are used to approximate functions near a value $x = a$. Maclaurin polynomials are Taylor polynomials at $x = 0$.
The $n$th-degree Taylor polynomials for a function $f$ are the partial sums of the Taylor series for $f$.
If a function $f$ has a power series representation at $x = a$, then it is given by its Taylor series at $x = a$.
A Taylor series for $f$ converges to $f$ if and only if $\displaystyle\lim_{n \to \infty} R_n(x) = 0$ where $R_n(x) = f(x) - p_n(x)$.
The Taylor series for $e^x$, $\sin x$, and $\cos x$ converge to the respective functions for all real $x$.

Key Equations
Taylor series for the function $f$ at the point $x = a$

$$\sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!}(x - a)^n = f(a) + f'(a)(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \cdots + \frac{f^{(n)}(a)}{n!}(x - a)^n + \cdots$$

Glossary
Maclaurin polynomial
a Taylor polynomial centered at 0; the $n$th-degree Taylor polynomial for $f$ at 0 is the $n$th-degree Maclaurin polynomial for $f$

Maclaurin series
a Taylor series for a function f at x = 0 is known as a Maclaurin series for f

Taylor polynomials
the $n$th-degree Taylor polynomial for $f$ at $x = a$ is $p_n(x) = f(a) + f'(a)(x - a) + \dfrac{f''(a)}{2!}(x - a)^2 + \cdots + \dfrac{f^{(n)}(a)}{n!}(x - a)^n$

Taylor series
a power series at a that converges to a function f on some open interval containing a .

Taylor’s theorem with remainder


for a function $f$ and the $n$th-degree Taylor polynomial for $f$ at $x = a$, the remainder $R_n(x) = f(x) - p_n(x)$ satisfies

$$R_n(x) = \frac{f^{(n+1)}(c)}{(n+1)!}(x - a)^{n+1}$$

for some $c$ between $x$ and $a$; if there exists an interval $I$ containing $a$ and a real number $M$ such that $|f^{(n+1)}(x)| \le M$ for all $x$ in $I$, then

$$|R_n(x)| \le \frac{M}{(n+1)!}|x - a|^{n+1}$$

11.10: Taylor and Maclaurin Series is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
10.3: Taylor and Maclaurin Series by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0. Original source:
https://openstax.org/details/books/calculus-volume-1.

11.11: Applications of Taylor Polynomials
 Learning Objectives
Write the terms of the binomial series.
Recognize the Taylor series expansions of common functions.
Recognize and apply techniques to find the Taylor series for a function.
Use Taylor series to solve differential equations.
Use Taylor series to evaluate non-elementary integrals.

In the preceding section, we defined Taylor series and showed how to find the Taylor series for several common functions by
explicitly calculating the coefficients of the Taylor polynomials. In this section we show how to use those Taylor series to derive
Taylor series for other functions. We then present two common applications of power series. First, we show how power series can
be used to solve differential equations. Second, we show how power series can be used to evaluate integrals when the antiderivative of the integrand cannot be expressed in terms of elementary functions. In one example, we consider $\int e^{-x^2}\,dx$, an integral that arises frequently in probability theory.

The Binomial Series


Our first goal in this section is to determine the Maclaurin series for the function $f(x) = (1 + x)^r$ for all real numbers $r$. The Maclaurin series for this function is known as the binomial series. We begin by considering the simplest case: $r$ is a nonnegative integer. We recall that, for $r = 0, 1, 2, 3, 4$, $f(x) = (1 + x)^r$ can be written as

$$\begin{aligned}
f(x) &= (1 + x)^0 = 1,\\
f(x) &= (1 + x)^1 = 1 + x,\\
f(x) &= (1 + x)^2 = 1 + 2x + x^2,\\
f(x) &= (1 + x)^3 = 1 + 3x + 3x^2 + x^3,\\
f(x) &= (1 + x)^4 = 1 + 4x + 6x^2 + 4x^3 + x^4.
\end{aligned}$$

The expressions on the right-hand side are known as binomial expansions and the coefficients are known as binomial coefficients. More generally, for any nonnegative integer $r$, the binomial coefficient of $x^n$ in the binomial expansion of $(1 + x)^r$ is given by

$$\binom{r}{n} = \frac{r!}{n!(r - n)!} \tag{11.11.1}$$

and

$$\begin{aligned}
f(x) &= (1 + x)^r\\
&= \binom{r}{0} + \binom{r}{1}x + \binom{r}{2}x^2 + \binom{r}{3}x^3 + \cdots + \binom{r}{r-1}x^{r-1} + \binom{r}{r}x^r\\
&= \sum_{n=0}^{r} \binom{r}{n}x^n.
\end{aligned} \tag{11.11.2}$$

For example, using this formula for $r = 5$, we see that

$$\begin{aligned}
f(x) &= (1 + x)^5\\
&= \binom{5}{0}1 + \binom{5}{1}x + \binom{5}{2}x^2 + \binom{5}{3}x^3 + \binom{5}{4}x^4 + \binom{5}{5}x^5\\
&= \frac{5!}{0!5!}1 + \frac{5!}{1!4!}x + \frac{5!}{2!3!}x^2 + \frac{5!}{3!2!}x^3 + \frac{5!}{4!1!}x^4 + \frac{5!}{5!0!}x^5\\
&= 1 + 5x + 10x^2 + 10x^3 + 5x^4 + x^5.
\end{aligned}$$

11.11.1 https://math.libretexts.org/@go/page/4513
We now consider the case when the exponent $r$ is any real number, not necessarily a nonnegative integer. If $r$ is not a nonnegative integer, then $f(x) = (1 + x)^r$ cannot be written as a finite polynomial. However, we can find a power series for $f$. Specifically, we look for the Maclaurin series for $f$. To do this, we find the derivatives of $f$ and evaluate them at $x = 0$.

$$\begin{aligned}
f(x) &= (1 + x)^r, & f(0) &= 1,\\
f'(x) &= r(1 + x)^{r-1}, & f'(0) &= r,\\
f''(x) &= r(r - 1)(1 + x)^{r-2}, & f''(0) &= r(r - 1),\\
f'''(x) &= r(r - 1)(r - 2)(1 + x)^{r-3}, & f'''(0) &= r(r - 1)(r - 2),\\
f^{(n)}(x) &= r(r - 1)(r - 2) \cdots (r - n + 1)(1 + x)^{r-n}, & f^{(n)}(0) &= r(r - 1)(r - 2) \cdots (r - n + 1).
\end{aligned}$$

We conclude that the coefficients in the binomial series are given by


$$\frac{f^{(n)}(0)}{n!} = \frac{r(r - 1)(r - 2) \cdots (r - n + 1)}{n!}. \tag{11.11.3}$$

We note that if $r$ is a nonnegative integer, then the $(r + 1)$st derivative $f^{(r+1)}$ is the zero function, and the series terminates. In addition, if $r$ is a nonnegative integer, then Equation 11.11.3 for the coefficients agrees with Equation 11.11.1 for the coefficients, and the formula for the binomial series agrees with Equation 11.11.2 for the finite binomial expansion. More generally, to denote the binomial coefficients for any real number $r$, we define

$$\binom{r}{n} = \frac{r(r - 1)(r - 2) \cdots (r - n + 1)}{n!}.$$

With this notation, we can write the binomial series for $(1 + x)^r$ as

$$\sum_{n=0}^{\infty} \binom{r}{n} x^n = 1 + rx + \frac{r(r - 1)}{2!}x^2 + \cdots + \frac{r(r - 1) \cdots (r - n + 1)}{n!}x^n + \cdots. \tag{11.11.4}$$

We now need to determine the interval of convergence for the binomial series Equation 11.11.4. We apply the ratio test. Consequently, we consider

$$\frac{|a_{n+1}|}{|a_n|} = \frac{|r(r - 1)(r - 2) \cdots (r - n)||x|^{n+1}}{(n + 1)!} \cdot \frac{n!}{|r(r - 1)(r - 2) \cdots (r - n + 1)||x|^n} = \frac{|r - n||x|}{|n + 1|}.$$
Since

$$\lim_{n \to \infty} \frac{|a_{n+1}|}{|a_n|} = |x| < 1$$

if and only if $|x| < 1$, we conclude that the binomial series converges on $(-1, 1)$. The behavior at the endpoints depends on $r$. It can be shown that for $r \ge 0$ the series converges at both endpoints; for $-1 < r < 0$, the series converges at $x = 1$ and diverges at $x = -1$; and for $r \le -1$, the series diverges at both endpoints. The binomial series does converge to $(1 + x)^r$ in $(-1, 1)$ for all real numbers $r$, but proving this fact by showing that the remainder $R_n(x) \to 0$ is difficult.
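The generalized binomial coefficient is straightforward to compute from its product formula. The Python sketch below is illustrative (the function names `binom_general` and `binomial_series` are our own). It builds $\binom{r}{n}$ one factor at a time and sums the first terms of the binomial series.

```python
def binom_general(r, n):
    """Generalized binomial coefficient r(r-1)...(r-n+1)/n! for real r."""
    coeff = 1.0
    for k in range(n):
        coeff *= (r - k) / (k + 1)
    return coeff

def binomial_series(r, x, n_terms):
    """Partial sum of the binomial series for (1 + x)^r with n_terms terms."""
    return sum(binom_general(r, n) * x ** n for n in range(n_terms))

# For a nonnegative integer r the coefficients vanish once n > r.
print([binom_general(5, n) for n in range(8)])

# For other r the series is infinite; inside (-1, 1) it approaches (1 + x)^r.
print(binomial_series(-2, 0.3, 200), (1 + 0.3) ** -2)
```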

 Definition: binomial series

For any real number $r$, the Maclaurin series for $f(x) = (1 + x)^r$ is the binomial series. It converges to $f$ for $|x| < 1$, and we write

$$(1 + x)^r = \sum_{n=0}^{\infty} \binom{r}{n} x^n = 1 + rx + \frac{r(r - 1)}{2!}x^2 + \cdots + \frac{r(r - 1) \cdots (r - n + 1)}{n!}x^n + \cdots$$

for $|x| < 1$.

We can use this definition to find the binomial series for $f(x) = \sqrt{1 + x}$ and use the series to approximate $\sqrt{1.5}$.

 Example 11.11.1: Finding Binomial Series


a. Find the binomial series for $f(x) = \sqrt{1 + x}$.

b. Use the third-order Maclaurin polynomial $p_3(x)$ to estimate $\sqrt{1.5}$. Use Taylor's theorem to bound the error. Use a graphing utility to compare the graphs of $f$ and $p_3$.

Solution
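a. Here $r = \dfrac{1}{2}$. Using the definition for the binomial series, we obtain

$$\begin{aligned}
\sqrt{1 + x} &= 1 + \frac{1}{2}x + \frac{(1/2)(-1/2)}{2!}x^2 + \frac{(1/2)(-1/2)(-3/2)}{3!}x^3 + \cdots\\
&= 1 + \frac{1}{2}x - \frac{1}{2!}\frac{1}{2^2}x^2 + \frac{1}{3!}\frac{1 \cdot 3}{2^3}x^3 - \cdots + \frac{(-1)^{n+1}}{n!}\frac{1 \cdot 3 \cdot 5 \cdots (2n - 3)}{2^n}x^n + \cdots\\
&= 1 + \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n!}\frac{1 \cdot 3 \cdot 5 \cdots (2n - 3)}{2^n}x^n.
\end{aligned}$$

b. From the result in part a. the third-order Maclaurin polynomial is

$$p_3(x) = 1 + \frac{1}{2}x - \frac{1}{8}x^2 + \frac{1}{16}x^3.$$

Therefore,

$$\sqrt{1.5} = \sqrt{1 + 0.5} \approx 1 + \frac{1}{2}(0.5) - \frac{1}{8}(0.5)^2 + \frac{1}{16}(0.5)^3 \approx 1.2266.$$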

From Taylor's theorem, the error satisfies

$$R_3(0.5) = \frac{f^{(4)}(c)}{4!}(0.5)^4$$

for some $c$ between 0 and 0.5. Since $f^{(4)}(x) = -\dfrac{15}{2^4(1 + x)^{7/2}}$, and the maximum value of $|f^{(4)}(x)|$ on the interval $[0, 0.5]$ occurs at $x = 0$, we have

$$|R_3(0.5)| \le \frac{15}{4!\,2^4}(0.5)^4 \approx 0.00244.$$

The function and the Maclaurin polynomial $p_3$ are graphed in Figure 11.11.1.

Figure 11.11.1: The third-order Maclaurin polynomial $p_3(x)$ provides a good approximation for $f(x) = \sqrt{1 + x}$ for $x$ near zero.
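As a quick numerical check of part b., the Python sketch below (illustrative only; the names `p3`, `estimate`, `bound`, and `actual_error` are our own) evaluates the estimate, the error bound from Taylor's theorem, and the actual error.

```python
import math

def p3(x):
    """Third-order Maclaurin polynomial of sqrt(1 + x)."""
    return 1 + x / 2 - x ** 2 / 8 + x ** 3 / 16

estimate = p3(0.5)

# Bound |f^(4)(0)| / 4! * 0.5^4, with |f^(4)(0)| = 15 / 2^4.
bound = 15 / (math.factorial(4) * 2 ** 4) * 0.5 ** 4

actual_error = abs(math.sqrt(1.5) - estimate)
print(estimate, actual_error, bound)
```

The actual error is smaller than the bound, consistent with the estimate $|R_3(0.5)| \le 0.00244$.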

 Exercise 11.11.1
Find the binomial series for $f(x) = \dfrac{1}{(1 + x)^2}$.

Hint

Use the definition of binomial series for r = −2 .

Answer

$$\sum_{n=0}^{\infty} (-1)^n (n + 1)x^n$$

Common Functions Expressed as Taylor Series


At this point, we have derived Maclaurin series for exponential, trigonometric, and logarithmic functions, as well as functions of the form $f(x) = (1 + x)^r$. In Table 11.11.1, we summarize the results of these series. We remark that the convergence of the Maclaurin series for $f(x) = \ln(1 + x)$ at the endpoint $x = 1$ and the Maclaurin series for $f(x) = \tan^{-1} x$ at the endpoints $x = 1$ and $x = -1$ relies on a more advanced theorem than we present here. (Refer to Abel's theorem for a discussion of this more technical point.)
Table 11.11.1: Maclaurin Series for Common Functions
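| Function | Maclaurin Series | Interval of Convergence |
|---|---|---|
| $f(x) = \dfrac{1}{1 - x}$ | $\displaystyle\sum_{n=0}^{\infty} x^n$ | $-1 < x < 1$ |
| $f(x) = e^x$ | $\displaystyle\sum_{n=0}^{\infty} \dfrac{x^n}{n!}$ | $-\infty < x < \infty$ |
| $f(x) = \sin x$ | $\displaystyle\sum_{n=0}^{\infty} (-1)^n \dfrac{x^{2n+1}}{(2n+1)!}$ | $-\infty < x < \infty$ |
| $f(x) = \cos x$ | $\displaystyle\sum_{n=0}^{\infty} (-1)^n \dfrac{x^{2n}}{(2n)!}$ | $-\infty < x < \infty$ |
| $f(x) = \ln(1 + x)$ | $\displaystyle\sum_{n=1}^{\infty} (-1)^{n+1} \dfrac{x^n}{n}$ | $-1 < x \le 1$ |
| $f(x) = \tan^{-1} x$ | $\displaystyle\sum_{n=0}^{\infty} (-1)^n \dfrac{x^{2n+1}}{2n+1}$ | $-1 \le x \le 1$ |
| $f(x) = (1 + x)^r$ | $\displaystyle\sum_{n=0}^{\infty} \binom{r}{n} x^n$ | $-1 < x < 1$ |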

Earlier in the chapter, we showed how you could combine power series to create new power series. Here we use these properties,
combined with the Maclaurin series in Table 11.11.1, to create Maclaurin series for other functions.

 Example 11.11.2: Deriving Maclaurin Series from Known Series

Find the Maclaurin series of each of the following functions by using one of the series listed in Table 11.11.1.
a. $f(x) = \cos\sqrt{x}$

b. $f(x) = \sinh x$
Solution
a. Using the Maclaurin series for $\cos x$ we find that the Maclaurin series for $\cos\sqrt{x}$ is given by

$$\sum_{n=0}^{\infty} \frac{(-1)^n (\sqrt{x})^{2n}}{(2n)!} = \sum_{n=0}^{\infty} \frac{(-1)^n x^n}{(2n)!} = 1 - \frac{x}{2!} + \frac{x^2}{4!} - \frac{x^3}{6!} + \frac{x^4}{8!} - \cdots.$$

This series converges to $\cos\sqrt{x}$ for all $x$ in the domain of $\cos\sqrt{x}$; that is, for all $x \ge 0$.

b. To find the Maclaurin series for $\sinh x$, we use the fact that

$$\sinh x = \frac{e^x - e^{-x}}{2}.$$

Using the Maclaurin series for $e^x$, we see that the $n$th term in the Maclaurin series for $e^x - e^{-x}$ is given by

$$\frac{x^n}{n!} - \frac{(-x)^n}{n!}.$$

For $n$ even, this term is zero. For $n$ odd, this term is $\dfrac{2x^n}{n!}$. Dividing by 2, we see that the Maclaurin series for $\sinh x$ has only odd-order terms and is given by

$$\sum_{n=0}^{\infty} \frac{x^{2n+1}}{(2n+1)!} = x + \frac{x^3}{3!} + \frac{x^5}{5!} + \cdots.$$

 Exercise 11.11.2
Find the Maclaurin series for $\sin(x^2)$.

Hint
Use the Maclaurin series for sin x.

Answer
$$\sum_{n=0}^{\infty} \frac{(-1)^n x^{4n+2}}{(2n+1)!}$$

We also showed previously in this chapter how power series can be differentiated term by term to create a new power series. In Example 11.11.3, we differentiate the binomial series for $\sqrt{1 + x}$ term by term to find the binomial series for $\dfrac{1}{\sqrt{1 + x}}$. Note that we could construct the binomial series for $\dfrac{1}{\sqrt{1 + x}}$ directly from the definition, but differentiating the binomial series for $\sqrt{1 + x}$ is an easier calculation.

 Example 11.11.3: Differentiating a Series to Find a New Series


Use the binomial series for $\sqrt{1 + x}$ to find the binomial series for $\dfrac{1}{\sqrt{1 + x}}$.

Solution
The two functions are related by

$$\frac{d}{dx}\sqrt{1 + x} = \frac{1}{2\sqrt{1 + x}},$$

so the binomial series for $\dfrac{1}{\sqrt{1 + x}}$ is given by

$$\frac{1}{\sqrt{1 + x}} = 2\frac{d}{dx}\sqrt{1 + x} = 1 + \sum_{n=1}^{\infty} \frac{(-1)^n}{n!}\frac{1 \cdot 3 \cdot 5 \cdots (2n - 1)}{2^n}x^n.$$
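The term-by-term differentiation can be verified on the coefficients themselves. The Python sketch below is illustrative (the helper `binom_general` and the list names are our own). It uses exact rational arithmetic to differentiate the series for $\sqrt{1 + x}$, double the result, and compare it with the binomial coefficients for $r = -\tfrac{1}{2}$.

```python
from fractions import Fraction

def binom_general(r, n):
    """Generalized binomial coefficient r(r-1)...(r-n+1)/n! as an exact Fraction."""
    coeff = Fraction(1)
    for k in range(n):
        coeff *= (r - k) / Fraction(k + 1)
    return coeff

N = 10
half = Fraction(1, 2)

# Coefficients c_0, ..., c_N of the series for sqrt(1 + x).
sqrt_coeffs = [binom_general(half, n) for n in range(N + 1)]

# Differentiating sum c_n x^n gives sum n c_n x^(n-1); doubling yields 1/sqrt(1 + x).
inv_sqrt_coeffs = [2 * n * sqrt_coeffs[n] for n in range(1, N + 1)]

# Coefficients obtained directly from the definition with r = -1/2.
direct_coeffs = [binom_general(-half, n) for n in range(N)]

print(inv_sqrt_coeffs[:4])
print(inv_sqrt_coeffs == direct_coeffs)
```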

 Exercise 11.11.3
Find the binomial series for $f(x) = \dfrac{1}{(1 + x)^{3/2}}$.

Hint
Differentiate the series for $\dfrac{1}{\sqrt{1 + x}}$.

Answer

$$\sum_{n=0}^{\infty} \frac{(-1)^n}{n!}\frac{1 \cdot 3 \cdot 5 \cdots (2n + 1)}{2^n}x^n$$

In this example, we differentiated a known Taylor series to construct a Taylor series for another function. The ability to differentiate
power series term by term makes them a powerful tool for solving differential equations. We now show how this is accomplished.

Solving Differential Equations with Power Series


Consider the differential equation

y'(x) = y.

Recall that this is a first-order separable equation and its solution is $y = Ce^x$. This equation is easily solved using techniques discussed earlier in the text. For most differential equations, however, we do not yet have analytical tools to solve them. Power series are an extremely useful tool for solving many types of differential equations. In this technique, we look for a solution of the form $y = \displaystyle\sum_{n=0}^{\infty} c_n x^n$ and determine what the coefficients would need to be. In the next example, we consider an initial-value problem involving $y' = y$ to illustrate the technique.

 Example 11.11.4: Power Series Solution of a Differential Equation

Use power series to solve the initial-value problem y' = y, y(0) = 3.

Solution
Suppose that there exists a power series solution

$$y(x) = \sum_{n=0}^{\infty} c_n x^n = c_0 + c_1 x + c_2 x^2 + c_3 x^3 + c_4 x^4 + \cdots.$$

Differentiating this series term by term, we obtain

$$y' = c_1 + 2c_2 x + 3c_3 x^2 + 4c_4 x^3 + \cdots.$$

If $y$ satisfies the differential equation, then

$$c_0 + c_1 x + c_2 x^2 + c_3 x^3 + \cdots = c_1 + 2c_2 x + 3c_3 x^2 + 4c_4 x^3 + \cdots.$$

Using the uniqueness of power series representations, we know that these series can only be equal if their coefficients are equal. Therefore,

$$\begin{aligned}
c_0 &= c_1,\\
c_1 &= 2c_2,\\
c_2 &= 3c_3,\\
c_3 &= 4c_4,\\
&\;\;\vdots
\end{aligned}$$

Using the initial condition $y(0) = 3$ combined with the power series representation

$$y(x) = c_0 + c_1 x + c_2 x^2 + c_3 x^3 + \cdots,$$

we find that $c_0 = 3$. We are now ready to solve for the rest of the coefficients. Using the fact that $c_0 = 3$, we have

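$$\begin{aligned}
c_1 &= c_0 = 3 = \frac{3}{1!},\\
c_2 &= \frac{c_1}{2} = \frac{3}{2} = \frac{3}{2!},\\
c_3 &= \frac{c_2}{3} = \frac{3}{3 \cdot 2} = \frac{3}{3!},\\
c_4 &= \frac{c_3}{4} = \frac{3}{4 \cdot 3 \cdot 2} = \frac{3}{4!}.
\end{aligned}$$

Therefore,

$$y = 3\left[1 + \frac{1}{1!}x + \frac{1}{2!}x^2 + \frac{1}{3!}x^3 + \frac{1}{4!}x^4 + \cdots\right] = 3\sum_{n=0}^{\infty} \frac{x^n}{n!}.$$

You might recognize

$$\sum_{n=0}^{\infty} \frac{x^n}{n!}$$

as the Taylor series for $e^x$. Therefore, the solution is $y = 3e^x$.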
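The coefficient recurrence in this example, $(n + 1)c_{n+1} = c_n$ with $c_0 = 3$, can be run mechanically. The Python sketch below is an illustration (the function names `series_coefficients` and `evaluate` are our own). It generates the coefficients and compares the truncated series with $3e^x$.

```python
import math

def series_coefficients(c0, n_terms):
    """Coefficients of the power series solution of y' = y with y(0) = c0."""
    coeffs = [c0]
    for n in range(n_terms - 1):
        # Matching coefficients of x^n gives (n + 1) c_{n+1} = c_n.
        coeffs.append(coeffs[-1] / (n + 1))
    return coeffs

def evaluate(coeffs, x):
    """Evaluate the truncated power series sum c_n x^n."""
    return sum(c * x ** n for n, c in enumerate(coeffs))

coeffs = series_coefficients(3.0, 30)
print(coeffs[:5])
print(evaluate(coeffs, 1.0), 3 * math.e)
```

The same loop adapts to other first-order linear equations by changing the recurrence line.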

 Exercise 11.11.4

Use power series to solve y' = 2y, y(0) = 5.

Hint
The equations for the first several coefficients $c_n$ will satisfy $2c_0 = c_1$, $2c_1 = 2c_2$, $2c_2 = 3c_3$, $\ldots$. In general, for all $n \ge 0$, $2c_n = (n + 1)c_{n+1}$.

Answer
$y = 5e^{2x}$

We now consider an example involving a differential equation that we cannot solve using previously discussed methods. This
differential equation

\[ y'' - xy = 0 \]

is known as Airy’s equation. It has many applications in mathematical physics, such as modeling the diffraction of light. Here we
show how to solve it using power series.

 Example 11.11.5: Power Series Solution of Airy’s Equation

Use power series to solve \( y'' - xy = 0 \) with the initial conditions \( y(0) = a \) and \( y'(0) = b \).

Solution
We look for a solution of the form

\[ y = \sum_{n=0}^{\infty} c_n x^n = c_0 + c_1 x + c_2 x^2 + c_3 x^3 + c_4 x^4 + \cdots. \]

Differentiating this function term by term, we obtain

\[ \begin{aligned} y' &= c_1 + 2c_2 x + 3c_3 x^2 + 4c_4 x^3 + \cdots, \\ y'' &= 2 \cdot 1\, c_2 + 3 \cdot 2\, c_3 x + 4 \cdot 3\, c_4 x^2 + \cdots. \end{aligned} \]

If y satisfies the equation \( y'' = xy \), then

\[ 2 \cdot 1\, c_2 + 3 \cdot 2\, c_3 x + 4 \cdot 3\, c_4 x^2 + \cdots = x\left(c_0 + c_1 x + c_2 x^2 + c_3 x^3 + \cdots\right). \]

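Using the uniqueness of power series representations, we know that coefficients of the same degree must be equal.
Therefore,

\[ \begin{aligned} 2 \cdot 1\, c_2 &= 0, \\ 3 \cdot 2\, c_3 &= c_0, \\ 4 \cdot 3\, c_4 &= c_1, \\ 5 \cdot 4\, c_5 &= c_2, \\ &\;\;\vdots \end{aligned} \]

More generally, for \( n \ge 3 \), we have \( n(n - 1)c_n = c_{n-3} \). In fact, all coefficients can be written in terms of \( c_0 \) and \( c_1 \). To see
this, first note that \( c_2 = 0 \). Then

\[ c_3 = \frac{c_0}{3 \cdot 2}, \qquad c_4 = \frac{c_1}{4 \cdot 3}. \]

For \( c_5, c_6, c_7 \), we see that

\[ \begin{aligned} c_5 &= \frac{c_2}{5 \cdot 4} = 0, \\ c_6 &= \frac{c_3}{6 \cdot 5} = \frac{c_0}{6 \cdot 5 \cdot 3 \cdot 2}, \\ c_7 &= \frac{c_4}{7 \cdot 6} = \frac{c_1}{7 \cdot 6 \cdot 4 \cdot 3}. \end{aligned} \]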

Therefore, the series solution of the differential equation is given by


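\[ y = c_0 + c_1 x + 0 \cdot x^2 + \frac{c_0}{3 \cdot 2}x^3 + \frac{c_1}{4 \cdot 3}x^4 + 0 \cdot x^5 + \frac{c_0}{6 \cdot 5 \cdot 3 \cdot 2}x^6 + \frac{c_1}{7 \cdot 6 \cdot 4 \cdot 3}x^7 + \cdots. \]

The initial condition \( y(0) = a \) implies \( c_0 = a \). Differentiating this series term by term and using the fact that \( y'(0) = b \), we
conclude that \( c_1 = b \).

Therefore, the solution of this initial-value problem is

\[ y = a\left(1 + \frac{x^3}{3 \cdot 2} + \frac{x^6}{6 \cdot 5 \cdot 3 \cdot 2} + \cdots\right) + b\left(x + \frac{x^4}{4 \cdot 3} + \frac{x^7}{7 \cdot 6 \cdot 4 \cdot 3} + \cdots\right). \]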
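The recurrence n(n − 1)c_n = c_{n−3} is easy to run mechanically. The sketch below is an illustration only, with hypothetical helper names (`airy_series_coefficients`, `residual_coefficients`): it generates the coefficients from c_0 = a, c_1 = b, c_2 = 0 and then checks that the low-order coefficients of y'' − xy vanish for the truncated series.

```python
from fractions import Fraction


def airy_series_coefficients(a, b, num_terms):
    """Coefficients c_0..c_{num_terms-1} of the series solution of y'' - x*y = 0.

    Uses c_0 = a, c_1 = b, c_2 = 0 and n*(n - 1)*c_n = c_{n-3} for n >= 3.
    """
    coeffs = [Fraction(a), Fraction(b), Fraction(0)]
    for n in range(3, num_terms):
        coeffs.append(coeffs[n - 3] / (n * (n - 1)))
    return coeffs[:num_terms]


def residual_coefficients(coeffs):
    """Coefficients of x^m in y'' - x*y, for the degrees m not affected by truncation."""
    residuals = []
    for m in range(len(coeffs) - 2):
        second_derivative_part = (m + 2) * (m + 1) * coeffs[m + 2]
        xy_part = coeffs[m - 1] if m >= 1 else Fraction(0)
        residuals.append(second_derivative_part - xy_part)
    return residuals


# Taking a = 1, b = 0 isolates the series that multiplies a in the solution.
even_part = airy_series_coefficients(1, 0, 10)
# Taking a = 0, b = 1 isolates the series that multiplies b.
odd_part = airy_series_coefficients(0, 1, 10)
print(even_part[3], even_part[6])  # coefficients of x^3 and x^6
print(odd_part[4], odd_part[7])    # coefficients of x^4 and x^7
print(all(r == 0 for r in residual_coefficients(even_part)))
```

Because the equation is linear, the general solution is a times the first series plus b times the second, which is the form obtained in the example.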

 Exercise 11.11.5

Use power series to solve \( y'' + x^2 y = 0 \) with the initial conditions \( y(0) = a \) and \( y'(0) = b \).

Hint
The coefficients satisfy \( c_0 = a,\ c_1 = b,\ c_2 = 0,\ c_3 = 0, \) and for \( n \ge 4 \), \( n(n - 1)c_n = -c_{n-4} \).

Answer
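\[ y = a\left(1 - \frac{x^4}{3 \cdot 4} + \frac{x^8}{3 \cdot 4 \cdot 7 \cdot 8} - \cdots\right) + b\left(x - \frac{x^5}{4 \cdot 5} + \frac{x^9}{4 \cdot 5 \cdot 8 \cdot 9} - \cdots\right) \]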

Evaluating Non-elementary Integrals


Solving differential equations is one common application of power series. We now turn to a second application. We show how
power series can be used to evaluate integrals involving functions whose antiderivatives cannot be expressed using elementary
functions.

One integral that arises often in applications in probability theory is \( \int e^{-x^2}\,dx \). Unfortunately, the antiderivative of the integrand
\( e^{-x^2} \) is not an elementary function. By elementary function, we mean a function that can be written using a finite number of
algebraic combinations or compositions of exponential, logarithmic, trigonometric, or power functions. We remark that the term
“elementary function” is not synonymous with noncomplicated function. For example, the function
\( f(x) = \sqrt{x^2 - 3x} + e^{x^3} - \sin(5x + 4) \) is an elementary function, although not a particularly simple-looking function. Any
integral of the form \( \int f(x)\,dx \) where the antiderivative of f cannot be written as an elementary function is considered a non-elementary integral.
Non-elementary integrals cannot be evaluated using the basic integration techniques discussed earlier. One way to evaluate such
integrals is by expressing the integrand as a power series and integrating term by term. We demonstrate this technique by
considering \( \int e^{-x^2}\,dx \).

 Example 11.11.6: Using Taylor Series to Evaluate a Definite Integral


a. Express \( \int e^{-x^2}\,dx \) as an infinite series.
b. Evaluate \( \int_0^1 e^{-x^2}\,dx \) to within an error of 0.01.

Solution
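a. The Maclaurin series for \( e^{-x^2} \) is given by

\[ \begin{aligned} e^{-x^2} &= \sum_{n=0}^{\infty} \frac{(-x^2)^n}{n!} \\ &= 1 - x^2 + \frac{x^4}{2!} - \frac{x^6}{3!} + \cdots + (-1)^n \frac{x^{2n}}{n!} + \cdots \\ &= \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n}}{n!}. \end{aligned} \]

Therefore,

\[ \begin{aligned} \int e^{-x^2}\,dx &= \int \left(1 - x^2 + \frac{x^4}{2!} - \frac{x^6}{3!} + \cdots + (-1)^n \frac{x^{2n}}{n!} + \cdots\right) dx \\ &= C + x - \frac{x^3}{3} + \frac{x^5}{5 \cdot 2!} - \frac{x^7}{7 \cdot 3!} + \cdots + (-1)^n \frac{x^{2n+1}}{(2n + 1)\,n!} + \cdots. \end{aligned} \]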

b. Using the result from part a. we have

\[ \int_0^1 e^{-x^2}\,dx = 1 - \frac{1}{3} + \frac{1}{10} - \frac{1}{42} + \frac{1}{216} - \cdots. \]

The sum of the first four terms is approximately 0.74. By the alternating series test, this estimate is accurate to within an
error of less than \( \frac{1}{216} \approx 0.0046296 < 0.01 \).
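The stopping rule used in part b (add terms until the first omitted term is smaller than the tolerance) can be written as a short loop. This is a sketch under stated assumptions rather than a general-purpose routine: the name `integral_exp_neg_x2` is hypothetical, and the alternating series error bound is only justified here because for 0 < upper ≤ 1 the term magnitudes decrease steadily to zero. The comparison value uses the standard identity relating this integral to the error function, via `math.erf`.

```python
import math


def integral_exp_neg_x2(upper, tolerance):
    """Approximate the integral of e^{-x^2} from 0 to `upper` by term-by-term integration.

    Terms (-1)^n * upper^(2n+1) / ((2n+1) * n!) are added until the next term is
    smaller than `tolerance`; that first omitted term is returned as the error bound.
    Assumes 0 < upper <= 1 so that the term magnitudes are decreasing.
    """
    total = 0.0
    n = 0
    while True:
        term = upper ** (2 * n + 1) / ((2 * n + 1) * math.factorial(n))
        if term < tolerance:
            return total, n, term
        total += (-1) ** n * term
        n += 1


estimate, terms_used, error_bound = integral_exp_neg_x2(1.0, 0.01)
reference = math.sqrt(math.pi) / 2 * math.erf(1.0)
print(estimate, terms_used, error_bound)
print(abs(estimate - reference) < error_bound)
```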

 Exercise 11.11.6
Express \( \int \cos\sqrt{x}\,dx \) as an infinite series. Evaluate \( \int_0^1 \cos\sqrt{x}\,dx \) to within an error of 0.01.
0

Hint
Use the series found in Example 11.11.6.

Answer
\[ C + \sum_{n=1}^{\infty} (-1)^{n+1} \frac{x^n}{n(2n - 2)!} \]
The definite integral is approximately 0.764 to within an error of 0.01.

As mentioned above, the integral \( \int e^{-x^2}\,dx \) arises often in probability theory. Specifically, it is used when studying data sets that
are normally distributed, meaning the data values lie under a bell-shaped curve. For example, if a set of data values is normally
distributed with mean μ and standard deviation σ, then the probability that a randomly chosen value lies between x = a and x = b
is given by

\[ \frac{1}{\sigma\sqrt{2\pi}} \int_a^b e^{-(x - \mu)^2/(2\sigma^2)}\,dx. \tag{11.11.5} \]

(See Figure 11.11.2.)

Figure 11.11.2: If data values are normally distributed with mean μ and standard deviation σ, the probability that a randomly
selected data value is between a and b is the area under the curve \( y = \frac{1}{\sigma\sqrt{2\pi}} e^{-(x - \mu)^2/(2\sigma^2)} \) between x = a and x = b.

To simplify this integral, we typically let \( z = \frac{x - \mu}{\sigma} \). This quantity z is known as the z score of a data value. With this
simplification, integral Equation 11.11.5 becomes

\[ \frac{1}{\sqrt{2\pi}} \int_{(a - \mu)/\sigma}^{(b - \mu)/\sigma} e^{-z^2/2}\,dz. \]

In Example 11.11.7, we show how we can use this integral in calculating probabilities.

 Example 11.11.7: Using Maclaurin Series to Approximate a Probability

Suppose a set of standardized test scores are normally distributed with mean μ = 100 and standard deviation σ = 50. Use
Equation 11.11.5 and the first six terms in the Maclaurin series for \( e^{-x^2/2} \) to approximate the probability that a randomly
selected test score is between x = 100 and x = 200. Use the alternating series test to determine how accurate your
approximation is.
Solution
Since μ = 100, σ = 50, and we are trying to determine the area under the curve from a = 100 to b = 200, integral Equation
11.11.5 becomes

\[ \frac{1}{\sqrt{2\pi}} \int_0^2 e^{-z^2/2}\,dz. \]

The Maclaurin series for \( e^{-x^2/2} \) is given by

\[ \begin{aligned} e^{-x^2/2} &= \sum_{n=0}^{\infty} \frac{\left(-\frac{x^2}{2}\right)^n}{n!} \\ &= 1 - \frac{x^2}{2^1 \cdot 1!} + \frac{x^4}{2^2 \cdot 2!} - \frac{x^6}{2^3 \cdot 3!} + \cdots + (-1)^n \frac{x^{2n}}{2^n \cdot n!} + \cdots \\ &= \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n}}{2^n \cdot n!}. \end{aligned} \]

Therefore,
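\[ \begin{aligned} \frac{1}{\sqrt{2\pi}} \int e^{-z^2/2}\,dz &= \frac{1}{\sqrt{2\pi}} \int \left(1 - \frac{z^2}{2^1 \cdot 1!} + \frac{z^4}{2^2 \cdot 2!} - \frac{z^6}{2^3 \cdot 3!} + \cdots + (-1)^n \frac{z^{2n}}{2^n \cdot n!} + \cdots\right) dz \\ &= \frac{1}{\sqrt{2\pi}} \left(C + z - \frac{z^3}{3 \cdot 2^1 \cdot 1!} + \frac{z^5}{5 \cdot 2^2 \cdot 2!} - \frac{z^7}{7 \cdot 2^3 \cdot 3!} + \cdots + (-1)^n \frac{z^{2n+1}}{(2n + 1)\,2^n \cdot n!} + \cdots\right) \end{aligned} \]

\[ \frac{1}{\sqrt{2\pi}} \int_0^2 e^{-z^2/2}\,dz = \frac{1}{\sqrt{2\pi}} \left(2 - \frac{8}{6} + \frac{32}{40} - \frac{128}{336} + \frac{512}{3456} - \frac{2^{11}}{11 \cdot 2^5 \cdot 5!} + \cdots\right) \]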

Using the first six terms, we estimate that the probability is approximately 0.4729. By the alternating series test, we see
that this estimate is accurate to within

\[ \frac{1}{\sqrt{2\pi}} \cdot \frac{2^{13}}{13 \cdot 2^6 \cdot 6!} \approx 0.00546. \]

Analysis
If you are familiar with probability theory, you may know that the probability that a data value is within two standard
deviations of the mean is approximately 95%. Here we calculated the probability that a data value is between the mean and two
standard deviations above the mean, so the estimate should be around 47.5%. The estimate, combined with the bound on the
accuracy, falls within this range.
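The arithmetic in this example can be reproduced with a few lines of code. The sketch below is illustrative only: `normal_probability_series` is a hypothetical helper, and the error bound it returns (the first omitted term) relies on the assumption that the term magnitudes are decreasing, which holds for z = 2. It prints the partial sums for five and for six terms next to a reference value computed with `math.erf`, so the effect of the number of terms can be compared directly.

```python
import math


def normal_probability_series(z_upper, num_terms):
    """Estimate (1/sqrt(2*pi)) * integral from 0 to z_upper of e^{-z^2/2} dz.

    Uses the first `num_terms` terms of the integrated Maclaurin series and returns
    (estimate, bound), where bound is the magnitude of the first omitted term.
    """
    scale = 1.0 / math.sqrt(2.0 * math.pi)

    def term(n):
        # Magnitude of the n-th term of the integrated series evaluated at z_upper.
        return z_upper ** (2 * n + 1) / ((2 * n + 1) * 2 ** n * math.factorial(n))

    estimate = scale * sum((-1) ** n * term(n) for n in range(num_terms))
    bound = scale * term(num_terms)
    return estimate, bound


mu, sigma = 100.0, 50.0
z_upper = (200.0 - mu) / sigma  # z score of x = 200
reference = 0.5 * math.erf(z_upper / math.sqrt(2.0))

for num_terms in (5, 6):
    estimate, bound = normal_probability_series(z_upper, num_terms)
    print(num_terms, round(estimate, 4), round(bound, 5), round(abs(estimate - reference), 5))
```

In both cases the actual error stays below the alternating series bound, and the six-term estimate lands close to the 47.5% figure mentioned in the analysis.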

 Exercise 11.11.7
Use the first five terms of the Maclaurin series for \( e^{-x^2/2} \) to estimate the probability that a randomly selected test score is
between 100 and 150. Use the alternating series test to determine the accuracy of this estimate.

Hint
Evaluate \( \int_0^1 e^{-z^2/2}\,dz \) using the first five terms of the Maclaurin series for \( e^{-z^2/2} \).

Answer
The estimate is approximately 0.3414. This estimate is accurate to within 0.0000094.

Another application in which a non-elementary integral arises involves the period of a pendulum. The integral is

\[ \int_0^{\pi/2} \frac{d\theta}{\sqrt{1 - k^2 \sin^2\theta}}. \]
An integral of this form is known as an elliptic integral of the first kind. Elliptic integrals originally arose when trying to calculate
the arc length of an ellipse. We now show how to use power series to approximate this integral.

 Example 11.11.8: Period of a Pendulum
The period of a pendulum is the time it takes for a pendulum to make one complete back-and-forth swing. For a pendulum with
length L that makes a maximum angle \( \theta_{\max} \) with the vertical, its period T is given by

\[ T = 4\sqrt{\frac{L}{g}} \int_0^{\pi/2} \frac{d\theta}{\sqrt{1 - k^2 \sin^2\theta}} \]

where g is the acceleration due to gravity and \( k = \sin\left(\frac{\theta_{\max}}{2}\right) \) (see Figure 11.11.3). (We note that this formula for the period
arises from a non-linearized model of a pendulum. In some cases, for simplification, a linearized model is used and sin θ is
approximated by θ.)

Figure 11.11.3: This pendulum has length L and makes a maximum angle \( \theta_{\max} \) with the vertical.
Use the binomial series

\[ \frac{1}{\sqrt{1 + x}} = 1 + \sum_{n=1}^{\infty} \frac{(-1)^n}{n!} \cdot \frac{1 \cdot 3 \cdot 5 \cdots (2n - 1)}{2^n}\, x^n \]

to estimate the period of this pendulum. Specifically, approximate the period of the pendulum if
a. you use only the first term in the binomial series, and
b. you use the first two terms in the binomial series.
Solution
We use the binomial series, replacing x with \( -k^2 \sin^2\theta \). Then we can write the period as

\[ T = 4\sqrt{\frac{L}{g}} \int_0^{\pi/2} \left(1 + \frac{1}{2}k^2 \sin^2\theta + \frac{1 \cdot 3}{2!\,2^2}k^4 \sin^4\theta + \cdots\right) d\theta. \]

a. Using just the first term in the integrand, the first-order estimate is

\[ T \approx 4\sqrt{\frac{L}{g}} \int_0^{\pi/2} d\theta = 2\pi\sqrt{\frac{L}{g}}. \]

If \( \theta_{\max} \) is small, then \( k = \sin\left(\frac{\theta_{\max}}{2}\right) \) is small. We claim that when k is small, this is a good estimate. To justify
this claim, consider

\[ \int_0^{\pi/2} \left(1 + \frac{1}{2}k^2 \sin^2\theta + \frac{1 \cdot 3}{2!\,2^2}k^4 \sin^4\theta + \cdots\right) d\theta. \]

Since \( |\sin\theta| \le 1 \), the contribution of the terms after the leading 1 is bounded by

\[ \int_0^{\pi/2} \left(\frac{1}{2}k^2 + \frac{1 \cdot 3}{2!\,2^2}k^4 + \cdots\right) d\theta = \frac{\pi}{2}\left(\frac{1}{2}k^2 + \frac{1 \cdot 3}{2!\,2^2}k^4 + \cdots\right). \]

Furthermore, it can be shown that each coefficient on the right-hand side is less than 1 and, therefore, that this
expression is bounded by

\[ \frac{\pi k^2}{2}\left(1 + k^2 + k^4 + \cdots\right) = \frac{\pi k^2}{2} \cdot \frac{1}{1 - k^2}, \]

which is small for k small.


b. For larger values of \( \theta_{\max} \), we can approximate T by using more terms in the integrand. By using the first two terms in
the integral, we arrive at the estimate

\[ T \approx 4\sqrt{\frac{L}{g}} \int_0^{\pi/2} \left(1 + \frac{1}{2}k^2 \sin^2\theta\right) d\theta = 2\pi\sqrt{\frac{L}{g}}\left(1 + \frac{k^2}{4}\right). \]
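To see how the two estimates behave, they can be compared against a direct numerical evaluation of the elliptic integral. The following is a rough sketch, not a method from the text: the function names are hypothetical, the value g = 9.81 m/s² and the sample length and angles are assumptions chosen for illustration, and the integral is evaluated with a simple midpoint rule.

```python
import math


def period_numerical(length, theta_max, g=9.81, steps=10_000):
    """Period from the full formula, with the elliptic integral done by the midpoint rule."""
    k = math.sin(theta_max / 2.0)
    h = (math.pi / 2.0) / steps
    integral = sum(
        h / math.sqrt(1.0 - (k * math.sin((i + 0.5) * h)) ** 2) for i in range(steps)
    )
    return 4.0 * math.sqrt(length / g) * integral


def period_one_term(length, g=9.81):
    """Estimate from part a: only the first term of the binomial series."""
    return 2.0 * math.pi * math.sqrt(length / g)


def period_two_terms(length, theta_max, g=9.81):
    """Estimate from part b: the first two terms of the binomial series."""
    k = math.sin(theta_max / 2.0)
    return 2.0 * math.pi * math.sqrt(length / g) * (1.0 + k ** 2 / 4.0)


length = 1.0  # assumed pendulum length in meters
for theta_max in (0.1, math.pi / 3):  # a small and a fairly large swing, in radians
    print(
        theta_max,
        period_one_term(length),
        period_two_terms(length, theta_max),
        period_numerical(length, theta_max),
    )
```

For the small angle all three values nearly coincide, while for the larger angle the two-term estimate is noticeably closer to the numerical value than the one-term estimate, in line with the discussion in parts a and b.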

The applications of Taylor series in this section are intended to highlight their importance. In general, Taylor series are useful
because they allow us to represent known functions using polynomials, thus providing us a tool for approximating function values
and estimating complicated integrals. In addition, they allow us to define new functions as power series, thus providing us with a
powerful tool for solving differential equations.

Key Concepts
The binomial series is the Maclaurin series for \( f(x) = (1 + x)^r \). It converges for \( |x| < 1 \).

Taylor series for functions can often be derived by algebraic operations with a known Taylor series or by differentiating or
integrating a known Taylor series.
Power series can be used to solve differential equations.
Taylor series can be used to help approximate integrals that cannot be evaluated by other means.

Glossary
binomial series
the Maclaurin series for \( f(x) = (1 + x)^r \); it is given by

\[ (1 + x)^r = \sum_{n=0}^{\infty} \binom{r}{n} x^n = 1 + rx + \frac{r(r - 1)}{2!}x^2 + \cdots + \frac{r(r - 1) \cdots (r - n + 1)}{n!}x^n + \cdots \quad \text{for } |x| < 1 \]

non-elementary integral
an integral for which the antiderivative of the integrand cannot be expressed as an elementary function

11.11: Applications of Taylor Polynomials is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
10.4: Working with Taylor Series by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0. Original source:
https://openstax.org/details/books/calculus-volume-1.

CHAPTER OVERVIEW

12: Vectors and The Geometry of Space


A general Calculus Textmap organized around the textbook

Calculus: Early Transcendentals


by James Stewart


Topic hierarchy
12.1: Three-Dimensional Coordinate Systems
12.2: Vectors
12.3: The Dot Product
12.4: The Cross Product
12.5: Equations of Lines and Planes
12.6: Cylinders and Quadric Surfaces

12: Vectors and The Geometry of Space is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.

12.1: Three-Dimensional Coordinate Systems
In single-variable calculus, the functions that one encounters are functions of a variable (usually x or t) that varies over some
subset of the real number line (which we denote by \( \mathbb{R} \)). For such a function, say, y = f(x), the graph of the function f consists of
the points (x, y) = (x, f(x)). These points lie in the Euclidean plane, which, in the Cartesian or rectangular coordinate
system, consists of all ordered pairs of real numbers (a, b). We use the word "Euclidean" to denote a system in which all the usual
rules of Euclidean geometry hold. We denote the Euclidean plane by \( \mathbb{R}^2 \); the "2" represents the number of dimensions of the
plane. The Euclidean plane has two perpendicular coordinate axes: the x-axis and the y-axis.
In vector (or multivariable) calculus, we will deal with functions of two or three variables (usually x, y or x, y, z, respectively). The
graph of a function of two variables, say, z = f(x, y), lies in Euclidean space, which in the Cartesian coordinate system consists
of all ordered triples of real numbers (a, b, c). Since Euclidean space is 3-dimensional, we denote it by \( \mathbb{R}^3 \). The graph of f consists
of the points (x, y, z) = (x, y, f(x, y)). The 3-dimensional coordinate system of Euclidean space can be represented on a flat
surface, such as this page or a blackboard, only by giving the illusion of three dimensions, in the manner shown in Figure 12.1.1.
Euclidean space has three mutually perpendicular coordinate axes (x, y and z), and three mutually perpendicular coordinate
planes: the xy-plane, yz-plane and xz-plane (Figure 12.1.2).

Figure 12.1.1 Figure 12.1.2

The coordinate system shown in Figure 12.1.1 is known as a right-handed coordinate system, because it is possible,
using the right hand, to point the index finger in the positive direction of the x-axis, the middle finger in the positive direction of
the y-axis, and the thumb in the positive direction of the z-axis, as in Figure 12.1.3.

Fig 12.1.3 : Right-handed coordinate system.


An equivalent way of defining a right-handed system is that you can point your right thumb upwards in the positive z-axis direction while
using the remaining four fingers to rotate the x-axis towards the y-axis. Doing the same thing with the left hand is what defines a
left-handed coordinate system. Notice that switching the x- and y-axes in a right-handed system results in a left-handed
system, and that rotating either type of system does not change its "handedness". Throughout the book we will use a right-handed
system.
For functions of three variables, the graphs exist in 4-dimensional space (i.e. \( \mathbb{R}^4 \)), which we cannot see in our 3-dimensional space,
let alone simulate in 2-dimensional space. So we can only think of 4-dimensional space abstractly. For an entertaining discussion of
this subject, see the book by ABBOT.
So far, we have discussed the position of an object in 2-dimensional or 3-dimensional space. But what about something such as the
velocity of the object, or its acceleration? Or the gravitational force acting on the object? These phenomena all seem to involve
motion and direction in some way. This is where the idea of a vector comes in.

12.1.1 https://math.libretexts.org/@go/page/4524
You have already dealt with velocity and acceleration in single-variable calculus. For example, for motion along a straight line, if
y = f(t) gives the displacement of an object after time t, then \( dy/dt = f'(t) \) is the velocity of the object at time t. The derivative
\( f'(t) \) is just a number, which is positive if the object is moving in an agreed-upon "positive" direction, and negative if it moves in
the opposite of that direction. So you can think of that number, which was called the velocity of the object, as having two
components: a magnitude, indicated by a nonnegative number, preceded by a direction, indicated by a plus or minus symbol
(representing motion in the positive direction or the negative direction, respectively), i.e. \( f'(t) = \pm a \) for some number a ≥ 0.
Then a is the magnitude of the velocity (normally called the speed of the object), and the ± represents the direction of the velocity
(though the + is usually omitted for the positive direction).
For motion along a straight line, i.e. in a 1-dimensional space, the velocities are also contained in that 1-dimensional space, since
they are just numbers. For general motion along a curve in 2- or 3-dimensional space, however, velocity will need to be represented
by a multidimensional object which should have both a magnitude and a direction. A geometric object which has those features is
an arrow, which in elementary geometry is called a "directed line segment". This is the motivation for how we will define a vector.

Definition 12.1.1

A (nonzero) vector is a directed line segment drawn from a point P (called its initial point) to a point Q (called its
terminal point), with P and Q being distinct points. The vector is denoted by \( \overrightarrow{PQ} \). Its magnitude is the length of the
line segment, denoted by \( \|\overrightarrow{PQ}\| \), and its direction is the same as that of the directed line segment. The zero vector is just
a point, and it is denoted by \( \mathbf{0} \).

To indicate the direction of a vector, we draw an arrow from its initial point to its terminal point. We will often denote a vector by a
single bold-faced letter (e.g. v) and use the terms "magnitude" and "length" interchangeably. Note that our definition could apply
to systems with any number of dimensions (Figure 12.1.4 (a)-(c)).

Figure 12.1.4 Vectors in different dimensions

A few things need to be noted about the zero vector. Our motivation for what a vector is included the notions of magnitude and
direction. What is the magnitude of the zero vector? We define it to be zero, i.e. ∥0∥ = 0 . This agrees with the definition of the
zero vector as just a point, which has zero length. What about the direction of the zero vector? A single point really has no well-
defined direction. Notice that we were careful to only define the direction of a nonzero vector, which is well-defined since the
initial and terminal points are distinct. Not everyone agrees on the direction of the zero vector. Some contend that the zero vector
has arbitrary direction (i.e. can take any direction), some say that it has indeterminate direction (i.e. the direction cannot be
determined), while others say that it has no direction. Our definition of the zero vector, however, does not require it to have a
direction, and we will leave it at that.
Now that we know what a vector is, we need a way of determining when two vectors are equal. This leads us to the following
definition.

Definition 12.1.2

Two nonzero vectors are equal if they have the same magnitude and the same direction. Any vector with zero magnitude is
equal to the zero vector.

By this definition, vectors with the same magnitude and direction but with different initial points would be equal. For example, in
Figure 12.1.5 the vectors u, v and w all have the same magnitude \( \sqrt{5} \) (by the Pythagorean Theorem). And we see that u and w are
parallel, since they lie on lines having the same slope \( \tfrac{1}{2} \), and they point in the same direction. So u = w, even though they have
different initial points. We also see that v is parallel to u but points in the opposite direction. So u ≠ v.

Figure 12.1.5

So we can see that there are an infinite number of vectors for a given magnitude and direction, those vectors all being equal and
differing only by their initial and terminal points. Is there a single vector which we can choose to represent all those equal vectors?
The answer is yes, and is suggested by the vector w in Figure 12.1.5.

Unless otherwise indicated, when speaking of "the vector" with a given magnitude and direction, we will mean the one whose
initial point is at the origin of the coordinate system.

Thinking of vectors as starting from the origin provides a way of dealing with vectors in a standard way, since every coordinate
system has an origin. But there will be times when it is convenient to consider a different initial point for a vector (for example,
when adding vectors, which we will do in the next section). Another advantage of using the origin as the initial point is that it
provides an easy correspondence between a vector and its terminal point.

Example 12.1.1

Let v be the vector in \( \mathbb{R}^3 \) whose initial point is at the origin and whose terminal point is (3, 4, 5). Though the point (3, 4, 5)
and the vector v are different objects, it is convenient to write v = (3, 4, 5). When doing this, it is understood that the initial
point of v is at the origin (0, 0, 0) and the terminal point is (3, 4, 5).

Figure 12.1.6 Correspondence between points and vectors

Unless otherwise stated, when we refer to vectors as v = (a, b) in \( \mathbb{R}^2 \) or v = (a, b, c) in \( \mathbb{R}^3 \), we mean vectors in Cartesian
coordinates starting at the origin. Also, we will write the zero vector \( \mathbf{0} \) in \( \mathbb{R}^2 \) and \( \mathbb{R}^3 \) as (0, 0) and (0, 0, 0), respectively.

The point-vector correspondence provides an easy way to check if two vectors are equal, without having to determine their
magnitude and direction. Similar to seeing if two points are the same, you are now seeing if the terminal points of vectors starting
at the origin are the same. For each vector, find the (unique!) vector it equals whose initial point is the origin. Then compare the
coordinates of the terminal points of these "new" vectors: if those coordinates are the same, then the original vectors are equal. To
get the "new" vectors starting at the origin, you translate each vector to start at the origin by subtracting the coordinates of the
original initial point from the original terminal point. The resulting point will be the terminal point of the "new" vector whose
initial point is the origin. Do this for each original vector, then compare.

Example 12.1.2

Consider the vectors \( \overrightarrow{PQ} \) and \( \overrightarrow{RS} \) in \( \mathbb{R}^3 \), where P = (2, 1, 5), Q = (3, 5, 7), R = (1, −3, −2) and S = (2, 1, 0). Does
\( \overrightarrow{PQ} = \overrightarrow{RS} \)?

The vector \( \overrightarrow{PQ} \) is equal to the vector v with initial point (0, 0, 0) and terminal point
Q − P = (3, 5, 7) − (2, 1, 5) = (3 − 2, 5 − 1, 7 − 5) = (1, 4, 2).
Similarly, \( \overrightarrow{RS} \) is equal to the vector w with initial point (0, 0, 0) and terminal point
S − R = (2, 1, 0) − (1, −3, −2) = (2 − 1, 1 − (−3), 0 − (−2)) = (1, 4, 2).

So \( \overrightarrow{PQ} = \mathbf{v} = (1, 4, 2) \) and \( \overrightarrow{RS} = \mathbf{w} = (1, 4, 2) \).

\( \therefore \overrightarrow{PQ} = \overrightarrow{RS} \)

Figure 12.1.7
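The translate-then-compare procedure of Example 12.1.2 is short enough to express in code. This is only an illustrative sketch; the function names `translate_to_origin` and `vectors_equal` are hypothetical, and points are represented as plain tuples of coordinates.

```python
def translate_to_origin(initial, terminal):
    """Terminal point of the equal vector that starts at the origin (terminal - initial)."""
    return tuple(t - i for i, t in zip(initial, terminal))


def vectors_equal(p, q, r, s):
    """True if the vector from p to q equals the vector from r to s."""
    return translate_to_origin(p, q) == translate_to_origin(r, s)


P, Q = (2, 1, 5), (3, 5, 7)
R, S = (1, -3, -2), (2, 1, 0)

print(translate_to_origin(P, Q))  # (1, 4, 2)
print(translate_to_origin(R, S))  # (1, 4, 2)
print(vectors_equal(P, Q, R, S))  # True
```

Comparing the translated terminal points avoids computing magnitudes and directions separately, which is the point made in the paragraph before the example.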

Recall the distance formula for points in the Euclidean plane:

For points \( P = (x_1, y_1) \), \( Q = (x_2, y_2) \) in \( \mathbb{R}^2 \), the distance d between P and Q is:

\[ d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2} \tag{12.1.1} \]

By this formula, we have the following result:

Note

For a vector \( \overrightarrow{PQ} \) in \( \mathbb{R}^2 \) with initial point \( P = (x_1, y_1) \) and terminal point \( Q = (x_2, y_2) \), the magnitude of \( \overrightarrow{PQ} \) is:

\[ \|\overrightarrow{PQ}\| = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2} \tag{12.1.2} \]

Finding the magnitude of a vector v = (a, b) in \( \mathbb{R}^2 \) is a special case of the above formula with P = (0, 0) and Q = (a, b):

For a vector v = (a, b) in \( \mathbb{R}^2 \), the magnitude of v is:

\[ \|\mathbf{v}\| = \sqrt{a^2 + b^2} \tag{12.1.3} \]

To calculate the magnitude of vectors in \( \mathbb{R}^3 \), we need a distance formula for points in Euclidean space (we will postpone the proof
until the next section):

Theorem 12.1.1

The distance d between points \( P = (x_1, y_1, z_1) \) and \( Q = (x_2, y_2, z_2) \) in \( \mathbb{R}^3 \) is:

\[ d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2} \tag{12.1.4} \]

The proof will use the following result:

Theorem 12.1.2

For a vector v = (a, b, c) in \( \mathbb{R}^3 \), the magnitude of v is:

\[ \|\mathbf{v}\| = \sqrt{a^2 + b^2 + c^2} \tag{12.1.5} \]

Proof: There are four cases to consider:

Case 1: a = b = c = 0. Then \( \mathbf{v} = \mathbf{0} \), so \( \|\mathbf{v}\| = 0 = \sqrt{0^2 + 0^2 + 0^2} = \sqrt{a^2 + b^2 + c^2} \).

Case 2: exactly two of a, b, c are 0. Without loss of generality, we assume that a = b = 0 and c ≠ 0 (the other two possibilities
are handled in a similar manner). Then v = (0, 0, c), which is a vector of length |c| along the z-axis. So
\( \|\mathbf{v}\| = |c| = \sqrt{c^2} = \sqrt{0^2 + 0^2 + c^2} = \sqrt{a^2 + b^2 + c^2} \).

Case 3: exactly one of a, b, c is 0. Without loss of generality, we assume that a = 0, b ≠ 0 and c ≠ 0 (the other two possibilities
are handled in a similar manner). Then v = (0, b, c), which is a vector in the yz-plane, so by the Pythagorean Theorem we have
\( \|\mathbf{v}\| = \sqrt{b^2 + c^2} = \sqrt{0^2 + b^2 + c^2} = \sqrt{a^2 + b^2 + c^2} \).

Figure 12.1.8

Case 4: none of a, b, c are 0. Without loss of generality, we can assume that a, b, c are all positive (the other seven possibilities are
handled in a similar manner). Consider the points P = (0, 0, 0), Q = (a, b, c), R = (a, b, 0), and S = (a, 0, 0), as shown in
Figure 12.1.8. Applying the Pythagorean Theorem to the right triangle △PSR gives \( |PR|^2 = a^2 + b^2 \). A second application of the
Pythagorean Theorem, this time to the right triangle △PQR (whose leg QR has length c), gives
\( \|\mathbf{v}\| = |PQ| = \sqrt{|PR|^2 + |QR|^2} = \sqrt{a^2 + b^2 + c^2} \). This
proves the theorem.
(QED)

Example 12.1.3
Calculate the following:

1. The magnitude of the vector \( \overrightarrow{PQ} \) in \( \mathbb{R}^2 \) with P = (−1, 2) and Q = (5, 5).
Solution: By formula (12.1.2), \( \|\overrightarrow{PQ}\| = \sqrt{(5 - (-1))^2 + (5 - 2)^2} = \sqrt{36 + 9} = \sqrt{45} = 3\sqrt{5} \).

2. The magnitude of the vector v = (8, 3) in \( \mathbb{R}^2 \).
Solution: By formula (12.1.3), \( \|\mathbf{v}\| = \sqrt{8^2 + 3^2} = \sqrt{73} \).

3. The distance between the points P = (2, −1, 4) and Q = (4, 2, −3) in \( \mathbb{R}^3 \).
Solution: By formula (12.1.4), the distance \( d = \sqrt{(4 - 2)^2 + (2 - (-1))^2 + (-3 - 4)^2} = \sqrt{4 + 9 + 49} = \sqrt{62} \).

4. The magnitude of the vector v = (5, 8, −2) in \( \mathbb{R}^3 \).
Solution: By formula (12.1.5), \( \|\mathbf{v}\| = \sqrt{5^2 + 8^2 + (-2)^2} = \sqrt{25 + 64 + 4} = \sqrt{93} \).
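The four calculations in this example follow one pattern, which the sketch below makes explicit. It is an illustration with hypothetical function names (`magnitude`, `distance`); the same code handles both two and three dimensions because it simply sums the squared components.

```python
import math


def magnitude(v):
    """Magnitude of a vector given by its components: square root of the sum of squares."""
    return math.sqrt(sum(component * component for component in v))


def distance(p, q):
    """Distance between points p and q, i.e. the magnitude of the vector from p to q."""
    return magnitude(tuple(b - a for a, b in zip(p, q)))


print(distance((-1, 2), (5, 5)))          # equals 3*sqrt(5)
print(magnitude((8, 3)))                  # equals sqrt(73)
print(distance((2, -1, 4), (4, 2, -3)))   # equals sqrt(62)
print(magnitude((5, 8, -2)))              # equals sqrt(93)
```

Defining the distance as the magnitude of the difference mirrors how the magnitude formula for a vector starting at the origin is obtained as a special case of the distance formula.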

12.1: Three-Dimensional Coordinate Systems is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
1.1: Introduction by Michael Corral is licensed GNU FDL. Original source: http://www.mecmath.net/.

12.2: Vectors
 Learning Objectives
Describe three-dimensional space mathematically.
Locate points in space using coordinates.
Write the distance formula in three dimensions.
Write the equations for simple planes and spheres.
Perform vector operations in \( \mathbb{R}^3 \).

Vectors are useful tools for solving two-dimensional problems. Life, however, happens in three dimensions. To expand the use of
vectors to more realistic applications, it is necessary to create a framework for describing three-dimensional space. For example,
although a two-dimensional map is a useful tool for navigating from one place to another, in some cases the topography of the land
is important. Does your planned route go through the mountains? Do you have to cross a river? To appreciate fully the impact of
these geographic features, you must use three dimensions. This section presents a natural extension of the two-dimensional
Cartesian coordinate plane into three dimensions.

Three-Dimensional Coordinate Systems


As we have learned, the two-dimensional rectangular coordinate system contains two perpendicular axes: the horizontal x-axis and
the vertical y -axis. We can add a third dimension, the z -axis, which is perpendicular to both the x-axis and the y -axis. We call this
system the three-dimensional rectangular coordinate system. It represents the three dimensions we encounter in real life.

 Definition: Three-dimensional Rectangular Coordinate System


The three-dimensional rectangular coordinate system consists of three perpendicular axes: the x-axis, the y-axis, and the z-axis.
Because each axis is a number line representing all real numbers in \( \mathbb{R} \), the three-dimensional system is often denoted by
\( \mathbb{R}^3 \).

In Figure 12.2.1a, the positive z -axis is shown above the plane containing the x- and y -axes. The positive x-axis appears to the left
and the positive y -axis is to the right. A natural question to ask is: How was this arrangement determined? The system displayed
follows the right-hand rule. If we take our right hand and align the fingers with the positive x-axis, then curl the fingers so they
point in the direction of the positive y -axis, our thumb points in the direction of the positive z -axis (Figure 12.2.1b). In this text, we
always work with coordinate systems set up in accordance with the right-hand rule. Some systems do follow a left-hand rule, but
the right-hand rule is considered the standard representation.

Figure 12.2.1 : (a) We can extend the two-dimensional rectangular coordinate system by adding a third axis, the z -axis, that is
perpendicular to both the x -axis and the y -axis. (b) The right-hand rule is used to determine the placement of the coordinate axes in
the standard Cartesian plane.
In two dimensions, we describe a point in the plane with the coordinates (x, y). Each coordinate describes how the point aligns
with the corresponding axis. In three dimensions, a new coordinate, z , is appended to indicate alignment with the z -axis: (x, y, z).

12.2.1 https://math.libretexts.org/@go/page/4525
A point in space is identified by all three coordinates (Figure 12.2.2). To plot the point (x, y, z), go x units along the x-axis, then y

units in the direction of the y -axis, then z units in the direction of the z -axis.

Figure 12.2.2 : To plot the point (x, y, z) go x units along the x -axis, then y units in the direction of the y -axis, then z units in the
direction of the z -axis.

 Example 12.2.1: Locating Points in Space

Sketch the point (1, −2, 3) in three-dimensional space.


Solution
To sketch a point, start by sketching three sides of a rectangular prism along the coordinate axes: one unit in the positive x
direction, 2 units in the negative y direction, and 3 units in the positive z direction. Complete the prism to plot the point
(Figure 12.2.3).

Figure 12.2.3 : Sketching the point (1, −2, 3).

 Exercise 12.2.1

Sketch the point (−2, 3, −1) in three-dimensional space.

Hint
Start by sketching the coordinate axes (e.g., Figure 12.2.3). Then sketch a rectangular prism to help find the point in space.
Answer

In two-dimensional space, the coordinate plane is defined by a pair of perpendicular axes. These axes allow us to name any
location within the plane. In three dimensions, we define coordinate planes by the coordinate axes, just as in two dimensions.
There are three axes now, so there are three intersecting pairs of axes. Each pair of axes forms a coordinate plane: the xy-plane, the
xz-plane, and the yz-plane (Figure 12.2.4). We define the xy-plane formally as the following set: \(\{(x, y, 0) : x, y \in \mathbb{R}\}\). Similarly, the xz-plane and the yz-plane are defined as \(\{(x, 0, z) : x, z \in \mathbb{R}\}\) and \(\{(0, y, z) : y, z \in \mathbb{R}\}\), respectively.
To visualize this, imagine you’re building a house and are standing in a room with only two of the four walls finished. (Assume the
two finished walls are adjacent to each other.) If you stand with your back to the corner where the two finished walls meet, facing
out into the room, the floor is the xy-plane, the wall to your right is the xz-plane, and the wall to your left is the yz-plane.

Figure 12.2.4 : The plane containing the x - and y -axes is called the xy-plane. The plane containing the x - and z -axes is called the
xz -plane, and the y - and z -axes define the yz-plane.

In two dimensions, the coordinate axes partition the plane into four quadrants. Similarly, the coordinate planes divide space
between them into eight regions about the origin, called octants. The octants fill \(\mathbb{R}^3\) in the same way that quadrants fill \(\mathbb{R}^2\), as shown in Figure 12.2.5.

Figure 12.2.5 : Points that lie in octants have three nonzero coordinates.
Most work in three-dimensional space is a comfortable extension of the corresponding concepts in two dimensions. In this section,
we use our knowledge of circles to describe spheres, then we expand our understanding of vectors to three dimensions. To
accomplish these goals, we begin by adapting the distance formula to three-dimensional space.
If two points lie in the same coordinate plane, then it is straightforward to calculate the distance between them. We know that the
distance d between two points \((x_1, y_1)\) and \((x_2, y_2)\) in the xy-coordinate plane is given by the formula

\[ d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}. \]

The formula for the distance between two points in space is a natural extension of this formula.

 The Distance between Two Points in Space

The distance d between points \((x_1, y_1, z_1)\) and \((x_2, y_2, z_2)\) is given by the formula

\[ d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2}. \tag{12.2.1} \]

The proof of this theorem is left as an exercise. (Hint: First find the distance \(d_1\) between the points \((x_1, y_1, z_1)\) and \((x_2, y_2, z_1)\), as shown in Figure 12.2.6.)

Figure 12.2.6: The distance between \(P_1\) and \(P_2\) is the length of the diagonal of the rectangular prism having \(P_1\) and \(P_2\) as opposite corners.

 Example 12.2.2: Distance in Space


Find the distance between points \(P_1 = (3, -1, 5)\) and \(P_2 = (2, 1, -1)\).

Figure 12.2.7 : Find the distance between the two points.
Solution
Substitute values directly into the distance formula (Equation 12.2.1):

\[
\begin{aligned}
d(P_1, P_2) &= \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2} \\
&= \sqrt{(2 - 3)^2 + (1 - (-1))^2 + (-1 - 5)^2} \\
&= \sqrt{(-1)^2 + 2^2 + (-6)^2} \\
&= \sqrt{41}.
\end{aligned}
\]
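The distance formula translates directly into a few lines of code. The following is a minimal Python sketch, not part of the original text; the function name `distance3d` and the choice to represent points as tuples are assumptions made for illustration.

```python
import math

def distance3d(p, q):
    """Distance between points p and q, each given as an (x, y, z) tuple."""
    # Sum the squared coordinate differences, then take the square root.
    return math.sqrt(sum((b - a) ** 2 for a, b in zip(p, q)))

p1 = (3, -1, 5)
p2 = (2, 1, -1)
d = distance3d(p1, p2)

# Compare with the exact value sqrt(41) found by hand.
print(d, math.isclose(d, math.sqrt(41)))
```

The comparison uses `math.isclose` rather than `==` because the square root is computed in floating-point arithmetic.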

 Exercise 12.2.2

Find the distance between points \(P_1 = (1, -5, 4)\) and \(P_2 = (4, -1, -1)\).

Hint
\[ d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2} \]

Answer

\(5\sqrt{2}\)

Before moving on to the next section, let’s get a feel for how \(\mathbb{R}^3\) differs from \(\mathbb{R}^2\). For example, in \(\mathbb{R}^2\), lines that are not parallel must always intersect. This is not the case in \(\mathbb{R}^3\). For example, consider the lines shown in Figure 12.2.8. These two lines are not parallel, nor do they intersect.

Figure 12.2.8: These two lines are not parallel, but still do not intersect.
You can also have circles that are interconnected but have no points in common, as in Figure 12.2.9.

Figure 12.2.9: These circles are interconnected, but have no points in common.
We have a lot more flexibility working in three dimensions than we would if we restricted ourselves to two dimensions.

Writing Equations in \(\mathbb{R}^3\)

Now that we can represent points in space and find the distance between them, we can learn how to write equations of geometric
objects such as lines, planes, and curved surfaces in \(\mathbb{R}^3\). First, we start with a simple equation. Compare the graphs of the equation x = 0 in \(\mathbb{R}\), \(\mathbb{R}^2\), and \(\mathbb{R}^3\) (Figure 12.2.10). From these graphs, we can see the same equation can describe a point, a line, or a plane.
2 3

Figure 12.2.10: (a) In \(\mathbb{R}\), the equation x = 0 describes a single point. (b) In \(\mathbb{R}^2\), the equation x = 0 describes a line, the y-axis. (c) In \(\mathbb{R}^3\), the equation x = 0 describes a plane, the yz-plane.

In space, the equation x = 0 describes all points (0, y, z). This equation defines the yz-plane. Similarly, the xy-plane contains all
points of the form (x, y, 0). The equation z = 0 defines the xy-plane and the equation y = 0 describes the xz-plane (Figure
12.2.11).

Figure 12.2.11: (a) In space, the equation z = 0 describes the xy-plane. (b) All points in the xz -plane satisfy the equation y = 0 .
Understanding the equations of the coordinate planes allows us to write an equation for any plane that is parallel to one of the
coordinate planes. When a plane is parallel to the xy-plane, for example, the z -coordinate of each point in the plane has the same
constant value. Only the x- and y -coordinates of points in that plane vary from point to point.

 Equations of Planes Parallel to Coordinate Planes


1. The plane in space that is parallel to the xy-plane and contains point (a, b, c) can be represented by the equation z = c .
2. The plane in space that is parallel to the xz-plane and contains point (a, b, c) can be represented by the equation y = b .
3. The plane in space that is parallel to the yz-plane and contains point (a, b, c) can be represented by the equation x = a .

 Example 12.2.3: Writing Equations of Planes Parallel to Coordinate Planes


a. Write an equation of the plane passing through point (3, 11, 7) that is parallel to the yz-plane.
b. Find an equation of the plane passing through points (6, −2, 9), (0, −2, 4), and (1, −2, −3).
Solution
a. When a plane is parallel to the yz-plane, only the y - and z -coordinates may vary. The x-coordinate has the same constant
value for all points in this plane, so this plane can be represented by the equation x = 3 .
b. Each of the points (6, −2, 9), (0, −2, 4), and (1, −2, −3) has the same y-coordinate. This plane can be represented by the
equation y = −2 .

 Exercise 12.2.3

Write an equation of the plane passing through point (1, −6, −4) that is parallel to the xy-plane.

Hint
If a plane is parallel to the xy-plane, the z-coordinates of the points in that plane do not vary.
Answer
z = −4

As we have seen, in \(\mathbb{R}^2\) the equation x = 5 describes the vertical line passing through point (5, 0). This line is parallel to the y-axis. In a natural extension, the equation x = 5 in \(\mathbb{R}^3\) describes the plane passing through point (5, 0, 0), which is parallel to the yz-plane. Another natural extension of a familiar equation is found in the equation of a sphere.

 Definition: Sphere
A sphere is the set of all points in space equidistant from a fixed point, the center of the sphere (Figure 12.2.12), just as the set
of all points in a plane that are equidistant from the center represents a circle. In a sphere, as in a circle, the distance from the
center to a point on the sphere is called the radius.

Figure 12.2.12: Each point (x, y, z) on the surface of a sphere is r units away from the center (a, b, c) .
The equation of a circle is derived using the distance formula in two dimensions. In the same way, the equation of a sphere is based
on the three-dimensional formula for distance.

 Standard Equation of a Sphere

The sphere with center (a, b, c) and radius r can be represented by the equation

\[ (x - a)^2 + (y - b)^2 + (z - c)^2 = r^2. \]

This equation is known as the standard equation of a sphere.

 Example 12.2.4: Finding an Equation of a Sphere

Find the standard equation of the sphere with center (10, 7, 4) that contains point (−1, 3, −2), as shown in Figure 12.2.13.

Figure 12.2.13: The sphere centered at (10, 7, 4) containing point (−1, 3, −2).
Solution
Use the distance formula to find the radius r of the sphere:

\[
\begin{aligned}
r &= \sqrt{(-1 - 10)^2 + (3 - 7)^2 + (-2 - 4)^2} \\
&= \sqrt{(-11)^2 + (-4)^2 + (-6)^2} \\
&= \sqrt{173}.
\end{aligned}
\]

The standard equation of the sphere is

\[ (x - 10)^2 + (y - 7)^2 + (z - 4)^2 = 173. \]
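Because the standard equation only needs the square of the radius, the square root can be skipped when computing it. The short Python sketch below illustrates this; the helper name `sphere_through_point` is hypothetical and is introduced here only for illustration.

```python
def sphere_through_point(center, point):
    """Return (center, r_squared) for the sphere centered at `center`
    that passes through `point`; both are (x, y, z) tuples."""
    # r^2 is the squared distance from the center to the given point.
    r_squared = sum((p - c) ** 2 for c, p in zip(center, point))
    return center, r_squared

center, r_squared = sphere_through_point((10, 7, 4), (-1, 3, -2))

# The standard equation is (x - a)^2 + (y - b)^2 + (z - c)^2 = r_squared.
print(center, r_squared)
```

With integer inputs the result is exact, so it can be compared directly with the right-hand side of the equation found by hand.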

 Exercise 12.2.4

Find the standard equation of the sphere with center (−2, 4, −5) containing point (4, 4, −1).

Hint
First use the distance formula to find the radius of the sphere.
Answer
\((x + 2)^2 + (y - 4)^2 + (z + 5)^2 = 52\)

 Example 12.2.5: Finding the Equation of a Sphere
Let P = (−5, 2, 3) and Q = (3, 4, −1), and suppose line segment \(\overline{PQ}\) forms the diameter of a sphere (Figure 12.2.14). Find the equation of the sphere.

Figure 12.2.14: Line segment \(\overline{PQ}\).
Solution
Since \(\overline{PQ}\) is a diameter of the sphere, we know the center of the sphere is the midpoint of \(\overline{PQ}\). Then,

\[ C = \left( \frac{-5 + 3}{2}, \frac{2 + 4}{2}, \frac{3 + (-1)}{2} \right) = (-1, 3, 1). \]

Furthermore, we know the radius of the sphere is half the length of the diameter. This gives

\[
\begin{aligned}
r &= \frac{1}{2}\sqrt{(-5 - 3)^2 + (2 - 4)^2 + (3 - (-1))^2} \\
&= \frac{1}{2}\sqrt{64 + 4 + 16} \\
&= \sqrt{21}.
\end{aligned}
\]

Then, the equation of the sphere is \((x + 1)^2 + (y - 3)^2 + (z - 1)^2 = 21.\)
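The two steps used in this solution (midpoint for the center, half the diameter for the radius) can be sketched in Python as follows. The function name `sphere_from_diameter` is an assumption made for this illustration and does not appear in the text.

```python
def sphere_from_diameter(p, q):
    """Return (center, r_squared) for the sphere whose diameter is segment PQ."""
    # The center is the midpoint of the diameter.
    center = tuple((a + b) / 2 for a, b in zip(p, q))
    # r = (1/2) * |PQ|, so r^2 = |PQ|^2 / 4.
    diameter_squared = sum((b - a) ** 2 for a, b in zip(p, q))
    return center, diameter_squared / 4

center, r_squared = sphere_from_diameter((-5, 2, 3), (3, 4, -1))
print(center, r_squared)
```

Working with the squared length of the diameter avoids taking a square root that would only be squared again in the standard equation.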

 Exercise 12.2.5
Find the equation of the sphere with diameter \(\overline{PQ}\), where P = (2, −1, −3) and Q = (−2, 5, −1).

Hint
Find the midpoint of the diameter first.
Answer
\(x^2 + (y - 2)^2 + (z + 2)^2 = 14\)

 Example 12.2.6: Graphing Other Equations in Three Dimensions


Describe the set of points that satisfies (x − 4)(z − 2) = 0, and graph the set.
Solution

We must have either x − 4 = 0 or z − 2 = 0 , so the set of points forms the two planes x = 4 and z = 2 (Figure 12.2.15).

Figure 12.2.15: The set of points satisfying (x − 4)(z − 2) = 0 forms the two planes x = 4 and z = 2 .

 Exercise 12.2.6

Describe the set of points that satisfies (y + 2)(z − 3) = 0, and graph the set.

Hint
One of the factors must be zero.
Answer
The set of points forms the two planes y = −2 and z = 3 .

 Example 12.2.7: Graphing Other Equations in Three Dimensions

Describe the set of points in three-dimensional space that satisfies \((x - 2)^2 + (y - 1)^2 = 4\), and graph the set.
Solution
The x- and y -coordinates form a circle in the xy-plane of radius 2, centered at (2, 1). Since there is no restriction on the z -
coordinate, the three-dimensional result is a circular cylinder of radius 2 centered on the line with x = 2 and y = 1 . The
cylinder extends indefinitely in the z -direction (Figure 12.2.16).
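One way to see why the surface is a cylinder is to test a few points: changing z never affects whether the equation holds. The following Python sketch does this; the predicate name `on_cylinder` and the sample points are chosen here for illustration only.

```python
import math

def on_cylinder(point, tol=1e-9):
    """True if (x, y, z) satisfies (x - 2)^2 + (y - 1)^2 = 4.
    The z-coordinate is deliberately ignored: the equation does not mention it."""
    x, y, _z = point
    return math.isclose((x - 2) ** 2 + (y - 1) ** 2, 4, abs_tol=tol)

# Same x and y, very different z: both points lie on the surface.
print(on_cylinder((4, 1, 0)), on_cylinder((4, 1, 100)))

# A point on the central line x = 2, y = 1 is not on the surface.
print(on_cylinder((2, 1, 0)))
```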

Figure 12.2.16: The set of points satisfying \((x - 2)^2 + (y - 1)^2 = 4\). This is a cylinder of radius 2 centered on the line with x = 2 and y = 1.

 Exercise 12.2.7
Describe the set of points in three-dimensional space that satisfies \(x^2 + (z - 2)^2 = 16\), and graph the surface.

Hint
Think about what happens if you plot this equation in two dimensions in the xz-plane.
Answer
A cylinder of radius 4 centered on the line with x = 0 and z = 2 .

Working with Vectors in \(\mathbb{R}^3\)

Just like two-dimensional vectors, three-dimensional vectors are quantities with both magnitude and direction, and they are
represented by directed line segments (arrows). With a three-dimensional vector, we use a three-dimensional arrow.
Three-dimensional vectors can also be represented in component form. The notation \(\vec{v} = \langle x, y, z \rangle\) is a natural extension of the two-dimensional case, representing a vector with the initial point at the origin, (0, 0, 0), and terminal point (x, y, z). The zero vector is \(\vec{0} = \langle 0, 0, 0 \rangle\). So, for example, the three-dimensional vector \(\vec{v} = \langle 2, 4, 1 \rangle\) is represented by a directed line segment from point (0, 0, 0) to point (2, 4, 1) (Figure 12.2.17).

Figure 12.2.17: Vector \(\vec{v} = \langle 2, 4, 1 \rangle\) is represented by a directed line segment from point (0, 0, 0) to point (2, 4, 1).
Vector addition and scalar multiplication are defined analogously to the two-dimensional case. If \(\vec{v} = \langle x_1, y_1, z_1 \rangle\) and \(\vec{w} = \langle x_2, y_2, z_2 \rangle\) are vectors, and k is a scalar, then

\[ \vec{v} + \vec{w} = \langle x_1 + x_2, y_1 + y_2, z_1 + z_2 \rangle \]

and

\[ k\vec{v} = \langle kx_1, ky_1, kz_1 \rangle. \]

If k = −1, then \(k\vec{v} = (-1)\vec{v}\) is written as \(-\vec{v}\), and vector subtraction is defined by \(\vec{v} - \vec{w} = \vec{v} + (-\vec{w}) = \vec{v} + (-1)\vec{w}\).

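The standard unit vectors extend easily into three dimensions as well, \(\hat{\mathbf{i}} = \langle 1, 0, 0 \rangle\), \(\hat{\mathbf{j}} = \langle 0, 1, 0 \rangle\), and \(\hat{\mathbf{k}} = \langle 0, 0, 1 \rangle\), and we use them in the same way we used the standard unit vectors in two dimensions. Thus, we can represent a vector in \(\mathbb{R}^3\) in the following ways:

\[ \vec{v} = \langle x, y, z \rangle = x\hat{\mathbf{i}} + y\hat{\mathbf{j}} + z\hat{\mathbf{k}} \]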

 Example 12.2.8: Vector Representations


Let \(\overrightarrow{PQ}\) be the vector with initial point P = (3, 12, 6) and terminal point Q = (−4, −3, 2) as shown in Figure 12.2.18. Express \(\overrightarrow{PQ}\) in both component form and using standard unit vectors.

Figure 12.2.18: The vector with initial point P = (3, 12, 6) and terminal point Q = (−4, −3, 2) .
Solution
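In component form,

\[
\begin{aligned}
\overrightarrow{PQ} &= \langle x_2 - x_1, y_2 - y_1, z_2 - z_1 \rangle \\
&= \langle -4 - 3, -3 - 12, 2 - 6 \rangle \\
&= \langle -7, -15, -4 \rangle.
\end{aligned}
\]

In standard unit form,

\[ \overrightarrow{PQ} = -7\hat{\mathbf{i}} - 15\hat{\mathbf{j}} - 4\hat{\mathbf{k}}. \]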
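The rule "terminal point minus initial point" is easy to get backward, so it can help to encode it once. The Python sketch below does so; the function name `vector_between` is hypothetical and the tuple representation is an assumption made for illustration.

```python
def vector_between(initial, terminal):
    """Component form of the vector from `initial` to `terminal`.
    Each component is (terminal coordinate) - (initial coordinate)."""
    return tuple(t - i for i, t in zip(initial, terminal))

P = (3, 12, 6)
Q = (-4, -3, 2)

# The three numbers are the coefficients of i, j, and k in standard unit form.
print(vector_between(P, Q))
```

Swapping the arguments gives the vector from Q to P, which is the negative of the one computed above.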

 Exercise 12.2.8

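Let S = (3, 8, 2) and T = (2, −1, 3). Express \(\overrightarrow{ST}\) in component form and in standard unit form.

Hint
Write \(\overrightarrow{ST}\) in component form first. T is the terminal point of \(\overrightarrow{ST}\).
Answer
\(\overrightarrow{ST} = \langle -1, -9, 1 \rangle = -\hat{\mathbf{i}} - 9\hat{\mathbf{j}} + \hat{\mathbf{k}}\)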

As described earlier, vectors in three dimensions behave in the same way as vectors in a plane. The geometric interpretation of
vector addition, for example, is the same in both two- and three-dimensional space (Figure 12.2.19).

Figure 12.2.19: To add vectors in three dimensions, we follow the same procedures we learned for two dimensions.

We have already seen how some of the algebraic properties of vectors, such as vector addition and scalar multiplication, can be
extended to three dimensions. Other properties can be extended in similar fashion. They are summarized here for our reference.

 Properties of Vectors in Space

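Let \(\vec{v} = \langle x_1, y_1, z_1 \rangle\) and \(\vec{w} = \langle x_2, y_2, z_2 \rangle\) be vectors, and let k be a scalar.
Scalar multiplication:

\[ k\vec{v} = \langle kx_1, ky_1, kz_1 \rangle \]

Vector addition:

\[ \vec{v} + \vec{w} = \langle x_1, y_1, z_1 \rangle + \langle x_2, y_2, z_2 \rangle = \langle x_1 + x_2, y_1 + y_2, z_1 + z_2 \rangle \]

Vector subtraction:

\[ \vec{v} - \vec{w} = \langle x_1, y_1, z_1 \rangle - \langle x_2, y_2, z_2 \rangle = \langle x_1 - x_2, y_1 - y_2, z_1 - z_2 \rangle \]

Vector magnitude:

\[ \|\vec{v}\| = \sqrt{x_1^2 + y_1^2 + z_1^2} \]

Unit vector in the direction of \(\vec{v}\):

\[ \frac{1}{\|\vec{v}\|}\vec{v} = \frac{1}{\|\vec{v}\|}\langle x_1, y_1, z_1 \rangle = \left\langle \frac{x_1}{\|\vec{v}\|}, \frac{y_1}{\|\vec{v}\|}, \frac{z_1}{\|\vec{v}\|} \right\rangle, \quad \text{if } \vec{v} \neq \vec{0} \]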

We have seen that vector addition in two dimensions satisfies the commutative, associative, and additive inverse properties. These
properties of vector operations are valid for three-dimensional vectors as well. Scalar multiplication of vectors satisfies the
distributive property, and the zero vector acts as an additive identity. The proofs to verify these properties in three dimensions are
straightforward extensions of the proofs in two dimensions.

 Example 12.2.9: Vector Operations in Three Dimensions

Let \(\vec{v} = \langle -2, 9, 5 \rangle\) and \(\vec{w} = \langle 1, -1, 0 \rangle\) (Figure 12.2.20). Find the following vectors.

a. \(3\vec{v} - 2\vec{w}\)
b. \(5\|\vec{w}\|\)
c. \(\|5\vec{w}\|\)
d. A unit vector in the direction of \(\vec{v}\)

Figure 12.2.20: The vectors \(\vec{v} = \langle -2, 9, 5 \rangle\) and \(\vec{w} = \langle 1, -1, 0 \rangle\).

Solution
a. First, use scalar multiplication of each vector, then subtract:

\[
\begin{aligned}
3\vec{v} - 2\vec{w} &= 3\langle -2, 9, 5 \rangle - 2\langle 1, -1, 0 \rangle \\
&= \langle -6, 27, 15 \rangle - \langle 2, -2, 0 \rangle \\
&= \langle -6 - 2, 27 - (-2), 15 - 0 \rangle \\
&= \langle -8, 29, 15 \rangle.
\end{aligned}
\]

b. Write the equation for the magnitude of the vector, then use scalar multiplication:

\[ 5\|\vec{w}\| = 5\sqrt{1^2 + (-1)^2 + 0^2} = 5\sqrt{2}. \]

c. First, use scalar multiplication, then find the magnitude of the new vector. Note that the result is the same as for part b:

\[ \|5\vec{w}\| = \|\langle 5, -5, 0 \rangle\| = \sqrt{5^2 + (-5)^2 + 0^2} = \sqrt{50} = 5\sqrt{2}. \]

d. Recall that to find a unit vector in two dimensions, we divide a vector by its magnitude. The procedure is the same in three dimensions:

\[
\begin{aligned}
\frac{\vec{v}}{\|\vec{v}\|} &= \frac{1}{\|\vec{v}\|}\langle -2, 9, 5 \rangle \\
&= \frac{1}{\sqrt{(-2)^2 + 9^2 + 5^2}}\langle -2, 9, 5 \rangle \\
&= \frac{1}{\sqrt{110}}\langle -2, 9, 5 \rangle \\
&= \left\langle \frac{-2}{\sqrt{110}}, \frac{9}{\sqrt{110}}, \frac{5}{\sqrt{110}} \right\rangle.
\end{aligned}
\]
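The operations in this example can be sketched with plain Python tuples. The helper names below (`scale`, `subtract`, `magnitude`, `unit`) are assumptions introduced for illustration; a numerical library would offer equivalents, but none is needed here.

```python
import math

def scale(k, v):
    """Scalar multiple k*v, computed componentwise."""
    return tuple(k * c for c in v)

def subtract(v, w):
    """Difference v - w, computed componentwise."""
    return tuple(a - b for a, b in zip(v, w))

def magnitude(v):
    """Length of v: the square root of the sum of squared components."""
    return math.sqrt(sum(c * c for c in v))

def unit(v):
    """Unit vector in the direction of v; undefined for the zero vector."""
    m = magnitude(v)
    if m == 0:
        raise ValueError("the zero vector has no direction")
    return scale(1 / m, v)

v = (-2, 9, 5)
w = (1, -1, 0)

print(subtract(scale(3, v), scale(2, w)))     # part a
print(5 * magnitude(w), magnitude(scale(5, w)))  # parts b and c agree
print(unit(v), magnitude(unit(v)))            # part d; the length should be 1 up to rounding
```

Parts b and c agree because \(\|k\vec{w}\| = |k|\,\|\vec{w}\|\) for any scalar k, and here k = 5 is positive.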

 Exercise 12.2.9:

Let \(\vec{v} = \langle -1, -1, 1 \rangle\) and \(\vec{w} = \langle 2, 0, 1 \rangle\). Find a unit vector in the direction of \(5\vec{v} + 3\vec{w}\).

Hint
Start by writing \(5\vec{v} + 3\vec{w}\) in component form.

Answer

\[ \left\langle \frac{1}{3\sqrt{10}}, -\frac{5}{3\sqrt{10}}, \frac{8}{3\sqrt{10}} \right\rangle \]

 Example 12.2.10: Throwing a Forward Pass


A quarterback is standing on the football field preparing to throw a pass. His receiver is standing 20 yd down the field and 15
yd to the quarterback’s left. The quarterback throws the ball at a velocity of 60 mph toward the receiver at an upward angle of
30° (see the following figure). Write the initial velocity vector of the ball, v , in component form.

Solution
The first thing we want to do is find a vector in the same direction as the velocity vector of the ball. We then scale the vector appropriately so that it has the right magnitude. Consider the vector \(\vec{w}\) extending from the quarterback’s arm to a point directly above the receiver’s head at an angle of 30° (see the following figure). This vector would have the same direction as \(\vec{v}\), but it may not have the right magnitude.

The receiver is 20 yd down the field and 15 yd to the quarterback’s left. Therefore, the straight-line distance from the quarterback to the receiver is

\[ \text{Dist from QB to receiver} = \sqrt{15^2 + 20^2} = \sqrt{225 + 400} = \sqrt{625} = 25 \text{ yd}. \]

We have \(\dfrac{25}{\|\vec{w}\|} = \cos 30^\circ\). Then the magnitude of \(\vec{w}\) is given by

\[ \|\vec{w}\| = \frac{25}{\cos 30^\circ} = \frac{25 \cdot 2}{\sqrt{3}} = \frac{50}{\sqrt{3}} \text{ yd} \]

and the vertical distance from the receiver to the terminal point of \(\vec{w}\) is

\[ \text{Vert dist from receiver to terminal point of } \vec{w} = \|\vec{w}\| \sin 30^\circ = \frac{50}{\sqrt{3}} \cdot \frac{1}{2} = \frac{25}{\sqrt{3}} \text{ yd}. \]

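Then \(\vec{w} = \left\langle 20, 15, \frac{25}{\sqrt{3}} \right\rangle\), and has the same direction as \(\vec{v}\).

Recall, though, that we calculated the magnitude of \(\vec{w}\) to be \(\|\vec{w}\| = \frac{50}{\sqrt{3}}\) yd, and \(\vec{v}\) has magnitude 60 mph. So, we need to multiply vector \(\vec{w}\) by an appropriate constant, k. We want to find a value of k so that \(\|k\vec{w}\| = 60\) mph*. We have

\[ \|k\vec{w}\| = k\|\vec{w}\| = k\,\frac{50}{\sqrt{3}} \text{ yd}, \]

so we want

\[
\begin{aligned}
k\left(\frac{50}{\sqrt{3}} \text{ yd}\right) &= 60 \text{ mph} \\
k &= \frac{60\sqrt{3}}{50} \text{ mph/yd} \\
k &= \frac{6\sqrt{3}}{5} \text{ mph/yd}.
\end{aligned}
\]

Then

\[ \vec{v} = k\vec{w} = k\left\langle 20, 15, \frac{25}{\sqrt{3}} \right\rangle = \frac{6\sqrt{3}}{5}\left\langle 20, 15, \frac{25}{\sqrt{3}} \right\rangle = \langle 24\sqrt{3}, 18\sqrt{3}, 30 \rangle. \]

Let’s double-check that \(\|\vec{v}\| = 60\) mph. We have

\[ \|\vec{v}\| = \sqrt{(24\sqrt{3})^2 + (18\sqrt{3})^2 + (30)^2} = \sqrt{1728 + 972 + 900} = \sqrt{3600} = 60 \text{ mph}. \]

So, we have found the correct components for \(\vec{v}\).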
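The same find-a-direction-then-rescale strategy can be checked numerically. The Python sketch below follows the steps of the solution; the function name `initial_velocity` and its parameter names are hypothetical, chosen here for illustration, and the function assumes an angle strictly between 0° and 90°.

```python
import math

def initial_velocity(downfield, left, speed, angle_deg):
    """Velocity vector of a ball thrown toward a receiver.

    downfield, left: receiver position relative to the thrower (yards)
    speed: magnitude of the velocity (mph)
    angle_deg: upward angle of the throw, strictly between 0 and 90 degrees
    """
    angle = math.radians(angle_deg)
    # Straight-line ground distance from thrower to receiver.
    ground = math.hypot(downfield, left)
    # Height of the point above the receiver that the direction vector w reaches.
    height = ground * math.tan(angle)
    w = (downfield, left, height)
    # k rescales w (a position vector in yards) to the requested speed in mph.
    k = speed / math.sqrt(sum(c * c for c in w))
    return tuple(k * c for c in w)

v = initial_velocity(20, 15, 60, 30)
print(v)
print(math.sqrt(sum(c * c for c in v)))  # should be 60 up to rounding
```

The height is computed as ground distance times tan of the angle, which is equivalent to the \(\|\vec{w}\| \sin 30^\circ\) step in the solution because \(\|\vec{w}\| = 25 / \cos 30^\circ\).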

 Note *

Readers who have been watching the units of measurement may be wondering what exactly is going on at this point:
haven't we just mixed yards and miles per hour? We haven't, but the reason is subtle. One way to understand it is to realize
that there are really two parallel coordinate systems in this problem: one gives positions down the field, across the field,
and up into the air in units of yards; the other gives speeds down the field, across the field, and up into the air in units of
miles per hour. The vector \(\vec{w}\) is calculated in the position coordinate system; vector \(\vec{v}\) will be in the speed system. Because corresponding axes in each system are parallel, directions in the two systems are also parallel, so the claim that \(\vec{w}\) and \(\vec{v}\) point in the same direction is correct. The constant k that we're looking for is a conversion factor between the magnitudes
of these two vectors, converting from the position system to the speed one in the process. And as seen above, our
calculation of k produces the right units for such a conversion, namely miles per hour per yard.

 Exercise 12.2.10

Assume the quarterback and the receiver are in the same place as in the previous example. This time, however, the quarterback
throws the ball at a velocity of 40 mph and an angle of 45°. Write the initial velocity vector of the ball, \(\vec{v}\), in component form.

Hint
Follow the process used in the previous example.
Answer
\(\vec{v} = \langle 16\sqrt{2}, 12\sqrt{2}, 20\sqrt{2} \rangle\)

Key Concepts
The three-dimensional coordinate system is built around a set of three axes that intersect at right angles at a single point, the
origin. Ordered triples (x, y, z) are used to describe the location of a point in space.
The distance d between points \((x_1, y_1, z_1)\) and \((x_2, y_2, z_2)\) is given by the formula

\[ d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2}. \]

In three dimensions, the equations x = a, y = b, and z = c describe planes that are parallel to the coordinate planes.
The standard equation of a sphere with center (a, b, c) and radius r is
\[ (x - a)^2 + (y - b)^2 + (z - c)^2 = r^2. \]

In three dimensions, as in two, vectors are commonly expressed in component form, \(\vec{v} = \langle x, y, z \rangle\), or in terms of the standard unit vectors, \(\vec{v} = x\hat{\mathbf{i}} + y\hat{\mathbf{j}} + z\hat{\mathbf{k}}\).

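Properties of vectors in space are a natural extension of the properties for vectors in a plane. Let \(\vec{v} = \langle x_1, y_1, z_1 \rangle\) and \(\vec{w} = \langle x_2, y_2, z_2 \rangle\) be vectors, and let k be a scalar.

Scalar multiplication:

\[ k\vec{v} = \langle kx_1, ky_1, kz_1 \rangle \]

Vector addition:

\[ \vec{v} + \vec{w} = \langle x_1, y_1, z_1 \rangle + \langle x_2, y_2, z_2 \rangle = \langle x_1 + x_2, y_1 + y_2, z_1 + z_2 \rangle \]

Vector subtraction:

\[ \vec{v} - \vec{w} = \langle x_1, y_1, z_1 \rangle - \langle x_2, y_2, z_2 \rangle = \langle x_1 - x_2, y_1 - y_2, z_1 - z_2 \rangle \]

Vector magnitude:

\[ \|\vec{v}\| = \sqrt{x_1^2 + y_1^2 + z_1^2} \]

Unit vector in the direction of \(\vec{v}\):

\[ \frac{\vec{v}}{\|\vec{v}\|} = \frac{1}{\|\vec{v}\|}\langle x_1, y_1, z_1 \rangle = \left\langle \frac{x_1}{\|\vec{v}\|}, \frac{y_1}{\|\vec{v}\|}, \frac{z_1}{\|\vec{v}\|} \right\rangle, \quad \vec{v} \neq \vec{0} \]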

Key Equations
Distance between two points in space:

\[ d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2} \]

Sphere with center (a, b, c) and radius r:

\[ (x - a)^2 + (y - b)^2 + (z - c)^2 = r^2 \]

Glossary
coordinate plane
a plane containing two of the three coordinate axes in the three-dimensional coordinate system, named by the axes it contains:
the xy-plane, xz-plane, or the yz-plane

right-hand rule
a common way to define the orientation of the three-dimensional coordinate system; when the right hand is curved around the
z -axis in such a way that the fingers curl from the positive x -axis to the positive y -axis, the thumb points in the direction of the

positive z -axis

octants
the eight regions of space created by the coordinate planes

sphere
the set of all points equidistant from a given point known as the center

standard equation of a sphere

\((x - a)^2 + (y - b)^2 + (z - c)^2 = r^2\) describes a sphere with center (a, b, c) and radius r

three-dimensional rectangular coordinate system


a coordinate system defined by three lines that intersect at right angles; every point in space is described by an ordered triple
(x, y, z) that plots its location relative to the defining axes

Contributors
Gilbert Strang (MIT) and Edwin “Jed” Herman (Harvey Mudd) with many contributing authors. This content by OpenStax is
licensed with a CC-BY-SA-NC 4.0 license. Download for free at http://cnx.org.
Example 12.2.10 has been modified by Doug Baldwin and Paul Seeburger to clarify the units of measurement that it uses and how
it uses them.
Paul Seeburger also created dynamic versions of Figures 12.2.8, 12.2.9, and 12.2.13 using CalcPlot3D.

12.2: Vectors is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
12.2: Vectors in Three Dimensions by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0. Original source:
https://openstax.org/details/books/calculus-volume-1.

12.3: The Dot Product
 Learning Objectives
Calculate the dot product of two given vectors.
Determine whether two given vectors are perpendicular.
Find the direction cosines of a given vector.
Explain what is meant by the vector projection of one vector onto another vector, and describe how to compute it.
Calculate the work done by a given force.

If we apply a force to an object so that the object moves, we say that work is done by the force. Previously, we looked at a constant
force and we assumed the force was applied in the direction of motion of the object. Under those conditions, work can be expressed
as the product of the force acting on an object and the distance the object moves. In this chapter, however, we have seen that both
force and the motion of an object can be represented by vectors.
In this section, we develop an operation called the dot product, which allows us to calculate work in the case when the force vector
and the motion vector have different directions. The dot product essentially tells us how much of the force vector is applied in the
direction of the motion vector. The dot product can also help us measure the angle formed by a pair of vectors and the position of a
vector relative to the coordinate axes. It even provides a simple test to determine whether two vectors meet at a right angle.

The Dot Product and Its Properties


We have already learned how to add and subtract vectors. In this chapter, we investigate two types of vector multiplication. The
first type of vector multiplication is called the dot product, based on the notation we use for it, and it is defined as follows:

 Definition: dot product


The dot product of vectors \(\vec{u} = \langle u_1, u_2, u_3 \rangle\) and \(\vec{v} = \langle v_1, v_2, v_3 \rangle\) is given by the sum of the products of the components:

\[ \vec{u} \cdot \vec{v} = u_1 v_1 + u_2 v_2 + u_3 v_3. \]

Note that if \(\vec{u}\) and \(\vec{v}\) are two-dimensional vectors, we calculate the dot product in a similar fashion. Thus, if \(\vec{u} = \langle u_1, u_2 \rangle\) and \(\vec{v} = \langle v_1, v_2 \rangle\), then

\[ \vec{u} \cdot \vec{v} = u_1 v_1 + u_2 v_2. \]

When two vectors are combined under addition or subtraction, the result is a vector. When two vectors are combined using the dot
product, the result is a scalar. For this reason, the dot product is often called the scalar product. It may also be called the inner
product.

 Example 12.3.1: Calculating Dot Products


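a. Find the dot product of \(\vec{u} = \langle 3, 5, 2 \rangle\) and \(\vec{v} = \langle -1, 3, 0 \rangle\).
b. Find the scalar product of \(\vec{p} = 10\hat{\mathbf{i}} - 4\hat{\mathbf{j}} + 7\hat{\mathbf{k}}\) and \(\vec{q} = -2\hat{\mathbf{i}} + \hat{\mathbf{j}} + 6\hat{\mathbf{k}}\).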

Solution:
a. Substitute the vector components into the formula for the dot product:

\[
\begin{aligned}
\vec{u} \cdot \vec{v} &= u_1 v_1 + u_2 v_2 + u_3 v_3 \\
&= 3(-1) + 5(3) + 2(0) \\
&= -3 + 15 + 0 \\
&= 12.
\end{aligned}
\]

b. The calculation is the same if the vectors are written using standard unit vectors. We still have three components for each
vector to substitute into the formula for the dot product:

12.3.1 https://math.libretexts.org/@go/page/4526
\[
\begin{aligned}
\vec{p} \cdot \vec{q} &= p_1 q_1 + p_2 q_2 + p_3 q_3 \\
&= 10(-2) + (-4)(1) + (7)(6) \\
&= -20 - 4 + 42 \\
&= 18.
\end{aligned}
\]
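The definition of the dot product is a one-line computation. The following Python sketch is an illustration only; the function name `dot` and the length check are choices made here, not part of the text.

```python
def dot(u, v):
    """Dot product: the sum of the products of corresponding components."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of components")
    return sum(a * b for a, b in zip(u, v))

# Part a: vectors given in component form.
print(dot((3, 5, 2), (-1, 3, 0)))

# Part b: coefficients of i, j, k read off from the standard unit form.
print(dot((10, -4, 7), (-2, 1, 6)))
```

The result is a single number rather than a tuple, which mirrors the fact that the dot product of two vectors is a scalar. The same function works for two-dimensional vectors.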

 Exercise 12.3.1

Find \(\vec{u} \cdot \vec{v}\), where \(\vec{u} = \langle 2, 9, -1 \rangle\) and \(\vec{v} = \langle -3, 1, -4 \rangle\).

Hint
Multiply corresponding components and then add their products.
Answer
7

Like vector addition and subtraction, the dot product has several algebraic properties. We prove three of these properties and leave
the rest as exercises.

 Properties of the Dot Product

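Let \(\vec{u}\), \(\vec{v}\), and \(\vec{w}\) be vectors, and let c be a scalar.

i. Commutative property

\[ \vec{u} \cdot \vec{v} = \vec{v} \cdot \vec{u} \]

ii. Distributive property

\[ \vec{u} \cdot (\vec{v} + \vec{w}) = \vec{u} \cdot \vec{v} + \vec{u} \cdot \vec{w} \]

iii. Associative property

\[ c(\vec{u} \cdot \vec{v}) = (c\vec{u}) \cdot \vec{v} = \vec{u} \cdot (c\vec{v}) \]

iv. Property of magnitude

\[ \vec{v} \cdot \vec{v} = \|\vec{v}\|^2 \]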

 Proof

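Let \(\vec{u} = \langle u_1, u_2, u_3 \rangle\) and \(\vec{v} = \langle v_1, v_2, v_3 \rangle\). Then

\[
\begin{aligned}
\vec{u} \cdot \vec{v} &= \langle u_1, u_2, u_3 \rangle \cdot \langle v_1, v_2, v_3 \rangle \\
&= u_1 v_1 + u_2 v_2 + u_3 v_3 \\
&= v_1 u_1 + v_2 u_2 + v_3 u_3 \\
&= \langle v_1, v_2, v_3 \rangle \cdot \langle u_1, u_2, u_3 \rangle \\
&= \vec{v} \cdot \vec{u}.
\end{aligned}
\]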

The associative property looks like the associative property for real-number multiplication, but pay close attention to the
difference between scalar and vector objects:

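\[
\begin{aligned}
c(\vec{u} \cdot \vec{v}) &= c(u_1 v_1 + u_2 v_2 + u_3 v_3) \\
&= c(u_1 v_1) + c(u_2 v_2) + c(u_3 v_3) \\
&= (c u_1)v_1 + (c u_2)v_2 + (c u_3)v_3 \\
&= \langle c u_1, c u_2, c u_3 \rangle \cdot \langle v_1, v_2, v_3 \rangle \\
&= c\langle u_1, u_2, u_3 \rangle \cdot \langle v_1, v_2, v_3 \rangle \\
&= (c\vec{u}) \cdot \vec{v}.
\end{aligned}
\]

The proof that \(c(\vec{u} \cdot \vec{v}) = \vec{u} \cdot (c\vec{v})\) is similar.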

The fourth property shows the relationship between the magnitude of a vector and its dot product with itself:
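\[
\begin{aligned}
\vec{v} \cdot \vec{v} &= \langle v_1, v_2, v_3 \rangle \cdot \langle v_1, v_2, v_3 \rangle \\
&= (v_1)^2 + (v_2)^2 + (v_3)^2 \\
&= \left[ \sqrt{(v_1)^2 + (v_2)^2 + (v_3)^2} \right]^2 \\
&= \|\vec{v}\|^2.
\end{aligned}
\]

Note that the definition of the dot product yields \(\vec{0} \cdot \vec{v} = 0\). By property iv, if \(\vec{v} \cdot \vec{v} = 0\), then \(\vec{v} = \vec{0}\).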

 Example 12.3.2: Using Properties of the Dot Product



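Let \(\vec{a} = \langle 1, 2, -3 \rangle\), \(\vec{b} = \langle 0, 2, 4 \rangle\), and \(\vec{c} = \langle 5, -1, 3 \rangle\).
Find each of the following products.

a. \((\vec{a} \cdot \vec{b})\vec{c}\)
b. \(\vec{a} \cdot (2\vec{c})\)
c. \(\|\vec{b}\|^2\)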

Solution

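a. Note that this expression asks for the scalar multiple of \(\vec{c}\) by \(\vec{a} \cdot \vec{b}\):

\[
\begin{aligned}
(\vec{a} \cdot \vec{b})\vec{c} &= (\langle 1, 2, -3 \rangle \cdot \langle 0, 2, 4 \rangle)\langle 5, -1, 3 \rangle \\
&= (1(0) + 2(2) + (-3)(4))\langle 5, -1, 3 \rangle \\
&= -8\langle 5, -1, 3 \rangle \\
&= \langle -40, 8, -24 \rangle.
\end{aligned}
\]

b. This expression is a dot product of vector \(\vec{a}\) and scalar multiple \(2\vec{c}\):

\[
\begin{aligned}
\vec{a} \cdot (2\vec{c}) &= 2(\vec{a} \cdot \vec{c}) \\
&= 2(\langle 1, 2, -3 \rangle \cdot \langle 5, -1, 3 \rangle) \\
&= 2(1(5) + 2(-1) + (-3)(3)) \\
&= 2(-6) = -12.
\end{aligned}
\]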

c. Simplifying this expression is a straightforward application of the dot product:

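\[
\begin{aligned}
\|\vec{b}\|^2 &= \vec{b} \cdot \vec{b} \\
&= \langle 0, 2, 4 \rangle \cdot \langle 0, 2, 4 \rangle \\
&= 0^2 + 2^2 + 4^2 \\
&= 0 + 4 + 16 \\
&= 20.
\end{aligned}
\]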
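The three parts of this example, together with a spot check of properties i through iii, can be reproduced with a few small helpers. This Python sketch is illustrative; the helper names `dot`, `scale`, and `add` are assumptions, and a check on one set of vectors is evidence for the properties, not a proof.

```python
def dot(u, v):
    """Dot product of two vectors given as tuples."""
    return sum(a * b for a, b in zip(u, v))

def scale(k, v):
    """Scalar multiple k*v."""
    return tuple(k * c for c in v)

def add(u, v):
    """Vector sum u + v."""
    return tuple(a + b for a, b in zip(u, v))

a, b, c = (1, 2, -3), (0, 2, 4), (5, -1, 3)

print(scale(dot(a, b), c))   # part a: a vector, since (a . b) is a scalar
print(dot(a, scale(2, c)))   # part b: a scalar
print(dot(b, b))             # part c: ||b||^2, by property iv

# Spot checks of the commutative, distributive, and associative properties.
print(dot(a, b) == dot(b, a))
print(dot(a, add(b, c)) == dot(a, b) + dot(a, c))
print(2 * dot(a, c) == dot(scale(2, a), c) == dot(a, scale(2, c)))
```

All inputs are integers, so the equality checks are exact; with floating-point components a tolerance-based comparison would be the safer choice.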

 Exercise 12.3.2

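Find the following products for \(\vec{p} = \langle 7, 0, 2 \rangle\), \(\vec{q} = \langle -2, 2, -2 \rangle\), and \(\vec{r} = \langle 0, 2, -3 \rangle\).
a. \((\vec{r} \cdot \vec{p})\vec{q}\)
b. \(\|\vec{p}\|^2\)

Hint
\(\vec{r} \cdot \vec{p}\) is a scalar.
Answer
a. \((\vec{r} \cdot \vec{p})\vec{q} = \langle 12, -12, 12 \rangle\); b. \(\|\vec{p}\|^2 = 53\)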

Using the Dot Product to Find the Angle between Two Vectors
When two nonzero vectors are placed in standard position, whether in two dimensions or three dimensions, they form an angle
between them (Figure 12.3.1). The dot product provides a way to find the measure of this angle. This property is a result of the fact
that we can express the dot product in terms of the cosine of the angle formed by two vectors.

Figure 12.3.1: Let θ be the angle between two nonzero vectors \(\vec{u}\) and \(\vec{v}\) such that 0 ≤ θ ≤ π.

 Evaluating a Dot Product

The dot product of two vectors is the product of the magnitude of each vector and the cosine of the angle between them:
\[ \vec{u} \cdot \vec{v} = \|\vec{u}\|\|\vec{v}\| \cos\theta. \tag{12.3.1} \]

 Proof

Place vectors \(\vec{u}\) and \(\vec{v}\) in standard position and consider the vector \(\vec{v} - \vec{u}\) (Figure 12.3.2). These three vectors form a triangle with side lengths \(\|\vec{u}\|\), \(\|\vec{v}\|\), and \(\|\vec{v} - \vec{u}\|\).

Figure 12.3.2: The lengths of the sides of the triangle are given by the magnitudes of the vectors that form the triangle.
Recall from trigonometry that the law of cosines describes the relationship among the side lengths of the triangle and the angle θ. Applying the law of cosines here gives

\[ \|\vec{v} - \vec{u}\|^2 = \|\vec{u}\|^2 + \|\vec{v}\|^2 - 2\|\vec{u}\|\|\vec{v}\| \cos\theta. \tag{12.3.2} \]

The dot product provides a way to rewrite the left side of Equation 12.3.2:
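\[
\begin{aligned}
\|\vec{v} - \vec{u}\|^2 &= (\vec{v} - \vec{u}) \cdot (\vec{v} - \vec{u}) \\
&= (\vec{v} - \vec{u}) \cdot \vec{v} - (\vec{v} - \vec{u}) \cdot \vec{u} \\
&= \vec{v} \cdot \vec{v} - \vec{u} \cdot \vec{v} - \vec{v} \cdot \vec{u} + \vec{u} \cdot \vec{u} \\
&= \vec{v} \cdot \vec{v} - \vec{u} \cdot \vec{v} - \vec{u} \cdot \vec{v} + \vec{u} \cdot \vec{u} \\
&= \|\vec{v}\|^2 - 2\vec{u} \cdot \vec{v} + \|\vec{u}\|^2.
\end{aligned}
\]

Substituting into the law of cosines yields

\[
\begin{aligned}
\|\vec{v} - \vec{u}\|^2 &= \|\vec{u}\|^2 + \|\vec{v}\|^2 - 2\|\vec{u}\|\|\vec{v}\| \cos\theta \\
\|\vec{v}\|^2 - 2\vec{u} \cdot \vec{v} + \|\vec{u}\|^2 &= \|\vec{u}\|^2 + \|\vec{v}\|^2 - 2\|\vec{u}\|\|\vec{v}\| \cos\theta \\
-2\vec{u} \cdot \vec{v} &= -2\|\vec{u}\|\|\vec{v}\| \cos\theta \\
\vec{u} \cdot \vec{v} &= \|\vec{u}\|\|\vec{v}\| \cos\theta.
\end{aligned}
\]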

We can use the form of the dot product in Equation 12.3.1 to find the measure of the angle between two nonzero vectors by
rearranging Equation 12.3.1 to solve for the cosine of the angle:
\[ \cos\theta = \frac{\vec{u}\cdot\vec{v}}{\|\vec{u}\|\,\|\vec{v}\|}. \tag{12.3.3} \]

Using this equation, we can find the cosine of the angle between two nonzero vectors. Since we are considering the smallest angle
between the vectors, we assume 0° ≤ θ ≤ 180° (or 0 ≤ θ ≤ π if we are working in radians). The inverse cosine is unique over
this range, so we are then able to determine the measure of the angle θ .

 Example 12.3.3: Finding the Angle between Two Vectors

Find the measure of the angle between each pair of vectors.

a. \(\hat{\mathbf{i}} + \hat{\mathbf{j}} + \hat{\mathbf{k}}\) and \(2\hat{\mathbf{i}} - \hat{\mathbf{j}} - 3\hat{\mathbf{k}}\)

b. ⟨2, 5, 6⟩ and ⟨−2, −4, 4⟩


Solution
a. To find the cosine of the angle formed by the two vectors, substitute the components of the vectors into Equation 12.3.3:
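\[
\begin{aligned}
\cos\theta &= \frac{(\hat{\mathbf{i}} + \hat{\mathbf{j}} + \hat{\mathbf{k}})\cdot(2\hat{\mathbf{i}} - \hat{\mathbf{j}} - 3\hat{\mathbf{k}})}{\|\hat{\mathbf{i}} + \hat{\mathbf{j}} + \hat{\mathbf{k}}\|\cdot\|2\hat{\mathbf{i}} - \hat{\mathbf{j}} - 3\hat{\mathbf{k}}\|} \\
&= \frac{1(2) + (1)(-1) + (1)(-3)}{\sqrt{1^2 + 1^2 + 1^2}\,\sqrt{2^2 + (-1)^2 + (-3)^2}} \\
&= \frac{-2}{\sqrt{3}\,\sqrt{14}} = \frac{-2}{\sqrt{42}}.
\end{aligned}
\]
Therefore, \(\theta = \arccos\dfrac{-2}{\sqrt{42}}\) rad.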

b. Start by finding the value of the cosine of the angle between the vectors:

\[
\begin{aligned}
\cos\theta &= \frac{\langle 2, 5, 6\rangle\cdot\langle -2, -4, 4\rangle}{\|\langle 2, 5, 6\rangle\|\cdot\|\langle -2, -4, 4\rangle\|} \\
&= \frac{2(-2) + (5)(-4) + (6)(4)}{\sqrt{2^2 + 5^2 + 6^2}\,\sqrt{(-2)^2 + (-4)^2 + 4^2}} \\
&= \frac{0}{\sqrt{65}\,\sqrt{36}} = 0.
\end{aligned}
\]

Now, cos θ = 0 and 0 ≤ θ ≤ π , so θ = π/2 .
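
The calculation in Example 12.3.3 can be checked numerically. The following is a minimal Python sketch of Equation 12.3.3; the helper names `dot`, `norm`, and `angle_between` are illustrative choices, not part of the text, and the clamping step is an added safeguard against floating-point round-off.

```python
import math

def dot(u, v):
    """Dot product of two equal-length vectors given as sequences of numbers."""
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    """Magnitude (Euclidean length) of a vector."""
    return math.sqrt(dot(u, u))

def angle_between(u, v):
    """Angle in radians, in [0, pi], between two nonzero vectors (Equation 12.3.3)."""
    cos_theta = dot(u, v) / (norm(u) * norm(v))
    # Clamp to [-1, 1] so round-off cannot push the value outside acos's domain.
    return math.acos(max(-1.0, min(1.0, cos_theta)))

# Part a: i + j + k and 2i - j - 3k
theta_a = angle_between((1, 1, 1), (2, -1, -3))
# Part b: <2, 5, 6> and <-2, -4, 4>
theta_b = angle_between((2, 5, 6), (-2, -4, 4))
print(theta_a, theta_b)
```

Because the inverse cosine returns values in [0, π], the function always reports the smaller of the two angles between the vectors, matching the convention used in the text.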

 Exercise 12.3.3

Find the measure of the angle, in radians, formed by vectors \(\vec{a} = \langle 1, 2, 0\rangle\) and \(\vec{b} = \langle 2, 4, 1\rangle\). Round to the nearest hundredth.

Hint
Use Equation 12.3.3.
Answer
θ ≈ 0.22 rad

The angle between two vectors can be acute (0 < cos θ < 1), obtuse (−1 < cos θ < 0) , or straight (cos θ = −1) . If cos θ = 1 ,
then both vectors have the same direction. If cos θ = 0 , then the vectors, when placed in standard position, form a right angle
(Figure 12.3.3). We can formalize this result into a theorem regarding orthogonal (perpendicular) vectors.

Figure 12.3.3 : (a) An acute angle has 0 < cos θ < 1 . (b) An obtuse angle has −1 < cos θ < 0. (c) A straight line has cos θ = −1 .
(d) If the vectors have the same direction, cos θ = 1 . (e) If the vectors are orthogonal (perpendicular), cos θ = 0.

 Orthogonal Vectors
The nonzero vectors \(\vec{u}\) and \(\vec{v}\) are orthogonal vectors if and only if \(\vec{u}\cdot\vec{v} = 0\).

 Proof
Let \(\vec{u}\) and \(\vec{v}\) be nonzero vectors, and let θ denote the angle between them. First, assume \(\vec{u}\cdot\vec{v} = 0\). Then
\[ \|\vec{u}\|\,\|\vec{v}\|\cos\theta = 0. \]
However, \(\|\vec{u}\| \neq 0\) and \(\|\vec{v}\| \neq 0\), so we must have cos θ = 0. Hence, θ = 90°, and the vectors are orthogonal.

Now assume \(\vec{u}\) and \(\vec{v}\) are orthogonal. Then θ = 90° and we have
\[
\begin{aligned}
\vec{u}\cdot\vec{v} &= \|\vec{u}\|\,\|\vec{v}\|\cos\theta \\
&= \|\vec{u}\|\,\|\vec{v}\|\cos 90° \\
&= \|\vec{u}\|\,\|\vec{v}\|(0) \\
&= 0.
\end{aligned}
\]

The terms orthogonal, perpendicular, and normal each indicate that mathematical objects are intersecting at right angles. The use
of each term is determined mainly by its context. We say that vectors are orthogonal and lines are perpendicular. The term normal
is used most often when measuring the angle made with a plane or other surface.
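
The orthogonality theorem translates directly into a one-line test. The sketch below is an illustration, not part of the text; the function name `are_orthogonal` and the tolerance parameter `tol` are assumptions, with the tolerance included because dot products of floating-point vectors are rarely exactly zero.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def are_orthogonal(u, v, tol=1e-12):
    """True when u . v is zero to within the (assumed) tolerance tol."""
    return abs(dot(u, v)) <= tol

# The pair from Example 12.3.3(b) has dot product 0, so it is orthogonal.
print(are_orthogonal((2, 5, 6), (-2, -4, 4)))
# The pair from Example 12.3.3(a) has dot product -2, so it is not.
print(are_orthogonal((1, 1, 1), (2, -1, -3)))
```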

 Example 12.3.4: Identifying Orthogonal Vectors

Determine whether \(\vec{p} = \langle 1, 0, 5\rangle\) and \(\vec{q} = \langle 10, 3, -2\rangle\) are orthogonal vectors.
Solution
Using the definition, we need only check the dot product of the vectors:
\[ \vec{p}\cdot\vec{q} = 1(10) + (0)(3) + (5)(-2) = 10 + 0 - 10 = 0. \]
Because \(\vec{p}\cdot\vec{q} = 0\), the vectors are orthogonal (Figure 12.3.4).

Figure 12.3.4: Vectors \(\vec{p}\) and \(\vec{q}\) form a right angle when their initial points are aligned.

 Exercise 12.3.4

For which value of x is \(\vec{p} = \langle 2, 8, -1\rangle\) orthogonal to \(\vec{q} = \langle x, -1, 2\rangle\)?

Hint
Vectors \(\vec{p}\) and \(\vec{q}\) are orthogonal if and only if \(\vec{p}\cdot\vec{q} = 0\).
Answer
x = 5

 Example 12.3.5: Measuring the Angle Formed by Two Vectors

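Let \(\vec{v} = \langle 2, 3, 3\rangle\). Find the measures of the angles formed by the following vectors.
a. \(\vec{v}\) and \(\hat{\mathbf{i}}\)
b. \(\vec{v}\) and \(\hat{\mathbf{j}}\)
c. \(\vec{v}\) and \(\hat{\mathbf{k}}\)

Solution
a. Let α be the angle formed by \(\vec{v}\) and \(\hat{\mathbf{i}}\):
\[ \cos\alpha = \frac{\vec{v}\cdot\hat{\mathbf{i}}}{\|\vec{v}\|\cdot\|\hat{\mathbf{i}}\|} = \frac{\langle 2, 3, 3\rangle\cdot\langle 1, 0, 0\rangle}{\sqrt{2^2 + 3^2 + 3^2}\,\sqrt{1}} = \frac{2}{\sqrt{22}} \]
\[ \alpha = \arccos\frac{2}{\sqrt{22}} \approx 1.130 \text{ rad}. \]

b. Let β represent the angle formed by \(\vec{v}\) and \(\hat{\mathbf{j}}\):
\[ \cos\beta = \frac{\vec{v}\cdot\hat{\mathbf{j}}}{\|\vec{v}\|\cdot\|\hat{\mathbf{j}}\|} = \frac{\langle 2, 3, 3\rangle\cdot\langle 0, 1, 0\rangle}{\sqrt{2^2 + 3^2 + 3^2}\,\sqrt{1}} = \frac{3}{\sqrt{22}} \]
\[ \beta = \arccos\frac{3}{\sqrt{22}} \approx 0.877 \text{ rad}. \]

c. Let γ represent the angle formed by \(\vec{v}\) and \(\hat{\mathbf{k}}\):
\[ \cos\gamma = \frac{\vec{v}\cdot\hat{\mathbf{k}}}{\|\vec{v}\|\cdot\|\hat{\mathbf{k}}\|} = \frac{\langle 2, 3, 3\rangle\cdot\langle 0, 0, 1\rangle}{\sqrt{2^2 + 3^2 + 3^2}\,\sqrt{1}} = \frac{3}{\sqrt{22}} \]
\[ \gamma = \arccos\frac{3}{\sqrt{22}} \approx 0.877 \text{ rad}. \]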

 Exercise 12.3.5

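Let \(\vec{v} = \langle 3, -5, 1\rangle\). Find the measure of the angles formed by each pair of vectors.
a. \(\vec{v}\) and \(\hat{\mathbf{i}}\)
b. \(\vec{v}\) and \(\hat{\mathbf{j}}\)
c. \(\vec{v}\) and \(\hat{\mathbf{k}}\)

Hint
\(\hat{\mathbf{i}} = \langle 1, 0, 0\rangle\), \(\hat{\mathbf{j}} = \langle 0, 1, 0\rangle\), and \(\hat{\mathbf{k}} = \langle 0, 0, 1\rangle\)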

Answer
a. α ≈ 1.04 rad; b. β ≈ 2.58 rad; c. γ ≈ 1.40 rad

The angle a vector makes with each of the coordinate axes, called a direction angle, is very important in practical computations,
especially in a field such as engineering. For example, in astronautical engineering, the angle at which a rocket is launched must be
determined very precisely. A very small error in the angle can lead to the rocket going hundreds of miles off course. Direction
angles are often calculated by using the dot product and the cosines of the angles, called the direction cosines. Therefore, we define
both these angles and their cosines.

 Definition: direction angles
The angles formed by a nonzero vector and the coordinate axes are called the direction angles for the vector (Figure 12.3.5). The cosines for these angles are called the direction cosines.

Figure 12.3.5: Angle α is formed by vector \(\vec{v}\) and unit vector \(\hat{\mathbf{i}}\). Angle β is formed by vector \(\vec{v}\) and unit vector \(\hat{\mathbf{j}}\). Angle γ is formed by vector \(\vec{v}\) and unit vector \(\hat{\mathbf{k}}\).

In Example 12.3.5, the direction cosines of \(\vec{v} = \langle 2, 3, 3\rangle\) are \(\cos\alpha = \frac{2}{\sqrt{22}}\), \(\cos\beta = \frac{3}{\sqrt{22}}\), and \(\cos\gamma = \frac{3}{\sqrt{22}}\). The direction angles of \(\vec{v}\) are α ≈ 1.130 rad, β ≈ 0.877 rad, and γ ≈ 0.877 rad.
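
Since each standard unit vector picks out one component of the vector, the direction cosines are simply the components divided by the magnitude. The Python sketch below illustrates this for the vector of Example 12.3.5; the function names are illustrative assumptions, not notation from the text.

```python
import math

def direction_cosines(v):
    """Cosines of the angles a nonzero vector makes with the coordinate axes."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def direction_angles(v):
    """Direction angles in radians, each in [0, pi]."""
    return tuple(math.acos(c) for c in direction_cosines(v))

cosines = direction_cosines((2, 3, 3))
angles = direction_angles((2, 3, 3))
print([round(a, 3) for a in angles])
# Sanity check: the direction cosines are the components of a unit vector,
# so their squares sum to 1 (up to round-off).
print(sum(c * c for c in cosines))
```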
So far, we have focused mainly on vectors related to force, movement, and position in three-dimensional physical space. However, vectors are often used in more abstract ways. For example, suppose a fruit vendor sells apples, bananas, and oranges. On a given day, he sells 30 apples, 12 bananas, and 18 oranges. He might use a quantity vector, \(\vec{q} = \langle 30, 12, 18\rangle\), to represent the quantity of fruit he sold that day. Similarly, he might want to use a price vector, \(\vec{p} = \langle 0.50, 0.25, 1\rangle\), to indicate that he sells his apples for 50¢ each, bananas for 25¢ each, and oranges for $1 apiece. In this example, although we could still graph these vectors, we do not interpret them as literal representations of position in the physical world. We are simply using vectors to keep track of particular pieces of information about apples, bananas, and oranges.
This idea might seem a little strange, but if we simply regard vectors as a way to order and store data, we find they can be quite a powerful tool. Going back to the fruit vendor, let’s think about the dot product, \(\vec{q}\cdot\vec{p}\). We compute it by multiplying the number of apples sold (30) by the price per apple (50¢), the number of bananas sold by the price per banana, and the number of oranges sold by the price per orange. We then add all these values together. So, in this example, the dot product tells us how much money the fruit vendor had in sales on that particular day.
When we use vectors in this more general way, there is no reason to limit the number of components to three. What if the fruit
vendor decides to start selling grapefruit? In that case, he would want to use four-dimensional quantity and price vectors to
represent the number of apples, bananas, oranges, and grapefruit sold, and their unit prices. As you might expect, to calculate the
dot product of four-dimensional vectors, we simply add the products of the components as before, but the sum has four terms
instead of three.
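
The same component-by-component sum works for any number of components. The sketch below applies it to the fruit vendor; the grapefruit quantity and price in the four-component version are hypothetical values chosen only for illustration, since the text does not give them.

```python
def dot(u, v):
    """Dot product of two vectors with the same number of components."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of components")
    return sum(a * b for a, b in zip(u, v))

# Three kinds of fruit: apples, bananas, oranges (values from the text).
quantity = (30, 12, 18)
price = (0.50, 0.25, 1.00)
print(dot(quantity, price))  # 36.0, the day's sales in dollars

# Four kinds of fruit: the fourth entry (grapefruit) is a HYPOTHETICAL example.
quantity4 = (30, 12, 18, 10)
price4 = (0.50, 0.25, 1.00, 0.75)
print(dot(quantity4, price4))  # 43.5
```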

 Example 12.3.6: Using Vectors in an Economic Context

AAA Party Supply Store sells invitations, party favors, decorations, and food service items such as paper plates and napkins.
When AAA buys its inventory, it pays 25¢ per package for invitations and party favors. Decorations cost AAA 50¢ each, and
food service items cost 20¢ per package. AAA sells invitations for $2.50 per package and party favors for $1.50 per package.
Decorations sell for $4.50 each and food service items for $1.25 per package.
During the month of May, AAA Party Supply Store sells 1258 invitations, 342 party favors, 2426 decorations, and 1354 food
service items. Use vectors and dot products to calculate how much money AAA made in sales during the month of May. How
much did the store make in profit?
Solution
The cost, price, and quantity vectors are

\[
\begin{aligned}
\vec{c} &= \langle 0.25, 0.25, 0.50, 0.20\rangle \\
\vec{p} &= \langle 2.50, 1.50, 4.50, 1.25\rangle \\
\vec{q} &= \langle 1258, 342, 2426, 1354\rangle.
\end{aligned}
\]
AAA sales for the month of May can be calculated using the dot product \(\vec{p}\cdot\vec{q}\). We have
\[
\begin{aligned}
\vec{p}\cdot\vec{q} &= \langle 2.50, 1.50, 4.50, 1.25\rangle\cdot\langle 1258, 342, 2426, 1354\rangle \\
&= 3145 + 513 + 10917 + 1692.5 \\
&= 16267.5.
\end{aligned}
\]
So, AAA took in $16,267.50 during the month of May. To calculate the profit, we must first calculate how much AAA paid for the items sold. We use the dot product \(\vec{c}\cdot\vec{q}\) to get
\[
\begin{aligned}
\vec{c}\cdot\vec{q} &= \langle 0.25, 0.25, 0.50, 0.20\rangle\cdot\langle 1258, 342, 2426, 1354\rangle \\
&= 314.5 + 85.5 + 1213 + 270.8 \\
&= 1883.8.
\end{aligned}
\]
So, AAA paid $1,883.80 for the items they sold. Their profit, then, is given by
\[ \vec{p}\cdot\vec{q} - \vec{c}\cdot\vec{q} = 16267.5 - 1883.8 = 14383.7. \]

Therefore, AAA Party Supply Store made $14,383.70 in May.
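
The figures in Example 12.3.6 can be reproduced with a few lines of Python. As a sketch under the example's own data, the code below also computes the profit a second way, as the dot product of a per-item margin vector (price minus cost, a name introduced here for illustration) with the quantity vector; the two results agree because the dot product distributes over vector subtraction.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

cost = (0.25, 0.25, 0.50, 0.20)
price = (2.50, 1.50, 4.50, 1.25)
quantity = (1258, 342, 2426, 1354)

sales = dot(price, quantity)
expenses = dot(cost, quantity)
profit = sales - expenses

# Alternative: (p - c) . q equals p . q - c . q.
margin = tuple(p - c for p, c in zip(price, cost))
profit_from_margin = dot(margin, quantity)

print(round(sales, 2), round(expenses, 2), round(profit, 2))
```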

 Exercise 12.3.6

On June 1, AAA Party Supply Store decided to increase the price they charge for party favors to $2 per package. They also
changed suppliers for their invitations, and are now able to purchase invitations for only 10¢ per package. All their other costs
and prices remain the same. If AAA sells 1408 invitations, 147 party favors, 2112 decorations, and 1894 food service items in
the month of June, use vectors and dot products to calculate their total sales and profit for June.

Hint
Use four-dimensional vectors for cost, price, and quantity sold.
Answer
Sales = $15,685.50; profit = $14,073.15

Projections
As we have seen, addition combines two vectors to create a resultant vector. But what if we are given a vector and we need to find
its component parts? We use vector projections to perform the opposite process; they can break down a vector into its components.
The magnitude of a vector projection is a scalar projection. For example, if a child is pulling the handle of a wagon at a 55° angle,
we can use projections to determine how much of the force on the handle is actually moving the wagon forward (Figure 12.3.6). We return
to this example and learn how to solve it after we see how to calculate projections.

Figure 12.3.6 : When a child pulls a wagon, only the horizontal component of the force propels the wagon forward.

 Definition: Vector and Projection

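The vector projection of \(\vec{v}\) onto \(\vec{u}\) is the vector labeled \(\operatorname{proj}_{\vec{u}}\vec{v}\) in Figure 12.3.7. It has the same initial point as \(\vec{u}\) and \(\vec{v}\) and the same direction as \(\vec{u}\), and represents the component of \(\vec{v}\) that acts in the direction of \(\vec{u}\). If θ represents the angle between \(\vec{u}\) and \(\vec{v}\), then, by properties of triangles, we know the length of \(\operatorname{proj}_{\vec{u}}\vec{v}\) is \(\|\operatorname{proj}_{\vec{u}}\vec{v}\| = \|\vec{v}\|\cos\theta\). When expressing cos θ in terms of the dot product, this becomes
\[ \|\operatorname{proj}_{\vec{u}}\vec{v}\| = \|\vec{v}\|\cos\theta = \|\vec{v}\|\left(\frac{\vec{u}\cdot\vec{v}}{\|\vec{u}\|\,\|\vec{v}\|}\right) = \frac{\vec{u}\cdot\vec{v}}{\|\vec{u}\|}. \]
We now multiply by a unit vector in the direction of \(\vec{u}\) to get \(\operatorname{proj}_{\vec{u}}\vec{v}\):
\[ \operatorname{proj}_{\vec{u}}\vec{v} = \frac{\vec{u}\cdot\vec{v}}{\|\vec{u}\|}\left(\frac{1}{\|\vec{u}\|}\,\vec{u}\right) = \frac{\vec{u}\cdot\vec{v}}{\|\vec{u}\|^2}\,\vec{u}. \]
The length of this vector is also known as the scalar projection of \(\vec{v}\) onto \(\vec{u}\) and is denoted by
\[ \|\operatorname{proj}_{\vec{u}}\vec{v}\| = \operatorname{comp}_{\vec{u}}\vec{v} = \frac{\vec{u}\cdot\vec{v}}{\|\vec{u}\|}. \]

Figure 12.3.7: The projection of \(\vec{v}\) onto \(\vec{u}\) shows the component of vector \(\vec{v}\) in the direction of \(\vec{u}\).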

 Example 12.3.7: Finding Projections

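Find the projection of \(\vec{v}\) onto \(\vec{u}\).
a. \(\vec{v} = \langle 3, 5, 1\rangle\) and \(\vec{u} = \langle -1, 4, 3\rangle\)
b. \(\vec{v} = 3\hat{\mathbf{i}} - 2\hat{\mathbf{j}}\) and \(\vec{u} = \hat{\mathbf{i}} + 6\hat{\mathbf{j}}\)

Solution
a. Substitute the components of \(\vec{v}\) and \(\vec{u}\) into the formula for the projection:
\[
\begin{aligned}
\operatorname{proj}_{\vec{u}}\vec{v} &= \frac{\vec{u}\cdot\vec{v}}{\|\vec{u}\|^2}\,\vec{u} \\
&= \frac{\langle -1, 4, 3\rangle\cdot\langle 3, 5, 1\rangle}{\|\langle -1, 4, 3\rangle\|^2}\,\langle -1, 4, 3\rangle \\
&= \frac{-3 + 20 + 3}{(-1)^2 + 4^2 + 3^2}\,\langle -1, 4, 3\rangle \\
&= \frac{20}{26}\,\langle -1, 4, 3\rangle \\
&= \left\langle -\frac{10}{13}, \frac{40}{13}, \frac{30}{13}\right\rangle.
\end{aligned}
\]
b. To find the two-dimensional projection, simply adapt the formula to the two-dimensional case:
\[
\begin{aligned}
\operatorname{proj}_{\vec{u}}\vec{v} &= \frac{\vec{u}\cdot\vec{v}}{\|\vec{u}\|^2}\,\vec{u} \\
&= \frac{(\hat{\mathbf{i}} + 6\hat{\mathbf{j}})\cdot(3\hat{\mathbf{i}} - 2\hat{\mathbf{j}})}{\|\hat{\mathbf{i}} + 6\hat{\mathbf{j}}\|^2}\,(\hat{\mathbf{i}} + 6\hat{\mathbf{j}}) \\
&= \frac{1(3) + 6(-2)}{1^2 + 6^2}\,(\hat{\mathbf{i}} + 6\hat{\mathbf{j}}) \\
&= -\frac{9}{37}\,(\hat{\mathbf{i}} + 6\hat{\mathbf{j}}) \\
&= -\frac{9}{37}\,\hat{\mathbf{i}} - \frac{54}{37}\,\hat{\mathbf{j}}.
\end{aligned}
\]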
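
The two projection formulas can be sketched in Python as follows and checked against Example 12.3.7. The function names `vector_projection` and `scalar_projection` are illustrative assumptions; both functions assume the vector projected onto is nonzero.

```python
import math

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def vector_projection(v, u):
    """proj_u v = ((u . v) / ||u||^2) u, for nonzero u."""
    scale = dot(u, v) / dot(u, u)  # dot(u, u) equals ||u||^2
    return tuple(scale * c for c in u)

def scalar_projection(v, u):
    """comp_u v = (u . v) / ||u||, for nonzero u."""
    return dot(u, v) / math.sqrt(dot(u, u))

# Example 12.3.7(a): expect <-10/13, 40/13, 30/13>
proj_a = vector_projection((3, 5, 1), (-1, 4, 3))
# Example 12.3.7(b): expect -(9/37) i - (54/37) j
proj_b = vector_projection((3, -2), (1, 6))
print(proj_a, proj_b)
```

Note that the code works unchanged in two or three dimensions, mirroring the remark in part b of the example.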

Sometimes it is useful to decompose vectors—that is, to break a vector apart into a sum. This process is called the resolution of a
vector into components. Projections allow us to identify two orthogonal vectors having a desired sum. For example, let
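\(\vec{v} = \langle 6, -4\rangle\) and let \(\vec{u} = \langle 3, 1\rangle\). We want to decompose the vector \(\vec{v}\) into orthogonal components such that one of the component vectors has the same direction as \(\vec{u}\).

We first find the component that has the same direction as \(\vec{u}\) by projecting \(\vec{v}\) onto \(\vec{u}\). Let \(\vec{p} = \operatorname{proj}_{\vec{u}}\vec{v}\). Then, we have
\[
\begin{aligned}
\vec{p} &= \frac{\vec{u}\cdot\vec{v}}{\|\vec{u}\|^2}\,\vec{u} \\
&= \frac{18 - 4}{9 + 1}\,\vec{u} \\
&= \frac{7}{5}\,\vec{u} = \frac{7}{5}\,\langle 3, 1\rangle = \left\langle \frac{21}{5}, \frac{7}{5}\right\rangle.
\end{aligned}
\]
Now consider the vector \(\vec{q} = \vec{v} - \vec{p}\). We have
\[
\begin{aligned}
\vec{q} &= \vec{v} - \vec{p} \\
&= \langle 6, -4\rangle - \left\langle \frac{21}{5}, \frac{7}{5}\right\rangle \\
&= \left\langle \frac{9}{5}, -\frac{27}{5}\right\rangle.
\end{aligned}
\]
Clearly, by the way we defined \(\vec{q}\), we have \(\vec{v} = \vec{q} + \vec{p}\), and
\[
\begin{aligned}
\vec{q}\cdot\vec{p} &= \left\langle \frac{9}{5}, -\frac{27}{5}\right\rangle\cdot\left\langle \frac{21}{5}, \frac{7}{5}\right\rangle \\
&= \frac{9(21)}{25} - \frac{27(7)}{25} \\
&= \frac{189}{25} - \frac{189}{25} = 0.
\end{aligned}
\]
Therefore, \(\vec{q}\) and \(\vec{p}\) are orthogonal.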
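
The decomposition just carried out can be sketched in Python. To keep the fractions exact, this illustration uses the standard-library `Fraction` type and assumes the input components are integers (or `Fraction` values); the function name `decompose` is an assumption introduced here.

```python
from fractions import Fraction

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def decompose(v, u):
    """Split v into p (parallel to nonzero u) and q (orthogonal to u), with v = p + q.

    Assumes integer or Fraction components so the arithmetic stays exact.
    """
    scale = Fraction(dot(u, v), dot(u, u))  # (u . v) / ||u||^2
    p = tuple(scale * c for c in u)         # p = proj_u v
    q = tuple(a - b for a, b in zip(v, p))  # q = v - p
    return p, q

p, q = decompose((6, -4), (3, 1))
print(p)          # components 21/5 and 7/5
print(q)          # components 9/5 and -27/5
print(dot(p, q))  # 0, confirming orthogonality
```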

 Example 12.3.8: Resolving Vectors into Components


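Express \(\vec{v} = \langle 8, -3, -3\rangle\) as a sum of orthogonal vectors such that one of the vectors has the same direction as \(\vec{u} = \langle 2, 3, 2\rangle\).

Solution
Let \(\vec{p}\) represent the projection of \(\vec{v}\) onto \(\vec{u}\):
\[
\begin{aligned}
\vec{p} &= \operatorname{proj}_{\vec{u}}\vec{v} \\
&= \frac{\vec{u}\cdot\vec{v}}{\|\vec{u}\|^2}\,\vec{u} \\
&= \frac{\langle 2, 3, 2\rangle\cdot\langle 8, -3, -3\rangle}{\|\langle 2, 3, 2\rangle\|^2}\,\langle 2, 3, 2\rangle \\
&= \frac{16 - 9 - 6}{2^2 + 3^2 + 2^2}\,\langle 2, 3, 2\rangle \\
&= \frac{1}{17}\,\langle 2, 3, 2\rangle \\
&= \left\langle \frac{2}{17}, \frac{3}{17}, \frac{2}{17}\right\rangle.
\end{aligned}
\]
Then,
\[
\begin{aligned}
\vec{q} = \vec{v} - \vec{p} &= \langle 8, -3, -3\rangle - \left\langle \frac{2}{17}, \frac{3}{17}, \frac{2}{17}\right\rangle \\
&= \left\langle \frac{134}{17}, -\frac{54}{17}, -\frac{53}{17}\right\rangle.
\end{aligned}
\]
To check our work, we can use the dot product to verify that \(\vec{p}\) and \(\vec{q}\) are orthogonal vectors:
\[
\begin{aligned}
\vec{p}\cdot\vec{q} &= \left\langle \frac{2}{17}, \frac{3}{17}, \frac{2}{17}\right\rangle\cdot\left\langle \frac{134}{17}, -\frac{54}{17}, -\frac{53}{17}\right\rangle \\
&= \frac{268}{289} - \frac{162}{289} - \frac{106}{289} = 0.
\end{aligned}
\]
Then,
\[ \vec{v} = \vec{p} + \vec{q} = \left\langle \frac{2}{17}, \frac{3}{17}, \frac{2}{17}\right\rangle + \left\langle \frac{134}{17}, -\frac{54}{17}, -\frac{53}{17}\right\rangle. \]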

 Exercise 12.3.7

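Express \(\vec{v} = 5\hat{\mathbf{i}} - \hat{\mathbf{j}}\) as a sum of orthogonal vectors such that one of the vectors has the same direction as \(\vec{u} = 4\hat{\mathbf{i}} + 2\hat{\mathbf{j}}\).

Hint
Start by finding the projection of \(\vec{v}\) onto \(\vec{u}\).

Answer
\(\vec{v} = \vec{p} + \vec{q}\), where \(\vec{p} = \frac{18}{5}\hat{\mathbf{i}} + \frac{9}{5}\hat{\mathbf{j}}\) and \(\vec{q} = \frac{7}{5}\hat{\mathbf{i}} - \frac{14}{5}\hat{\mathbf{j}}\)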

 Example 12.3.9: Scalar Projection of Velocity

A container ship leaves port traveling 15° north of east. Its engine generates a speed of 20 knots along that path (see the
following figure). In addition, the ocean current moves the ship northeast at a speed of 2 knots. Considering both the engine
and the current, how fast is the ship moving in the direction 15° north of east? Round the answer to two decimal places.

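Solution
Let \(\vec{v}\) be the velocity vector generated by the engine, and let \(\vec{w}\) be the velocity vector of the current. We already know \(\|\vec{v}\| = 20\) along the desired route. We just need to add in the scalar projection of \(\vec{w}\) onto \(\vec{v}\). The current points northeast (45° north of east) and the route points 15° north of east, so the angle between \(\vec{w}\) and \(\vec{v}\) is 45° − 15° = 30°. We get
\[
\begin{aligned}
\operatorname{comp}_{\vec{v}}\vec{w} &= \frac{\vec{v}\cdot\vec{w}}{\|\vec{v}\|} \\
&= \frac{\|\vec{v}\|\,\|\vec{w}\|\cos(30°)}{\|\vec{v}\|} = \|\vec{w}\|\cos(30°) = 2\cdot\frac{\sqrt{3}}{2} = \sqrt{3} \approx 1.73 \text{ knots}.
\end{aligned}
\]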

The ship is moving at 21.73 knots in the direction 15° north of east.
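
The ship calculation can also be done from components. The sketch below assumes a coordinate system with east along the positive x-axis and north along the positive y-axis (a convention chosen here, not stated in the text) and uses a hypothetical helper `from_heading` to build each velocity vector.

```python
import math

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def from_heading(speed, degrees_north_of_east):
    """Planar velocity with the given speed, angle measured counterclockwise from east."""
    angle = math.radians(degrees_north_of_east)
    return (speed * math.cos(angle), speed * math.sin(angle))

def scalar_projection(w, v):
    """comp_v w = (v . w) / ||v||, for nonzero v."""
    return dot(v, w) / math.sqrt(dot(v, v))

engine = from_heading(20, 15)   # 20 knots, 15 degrees north of east
current = from_heading(2, 45)   # 2 knots, northeast

speed_along_route = 20 + scalar_projection(current, engine)
print(round(speed_along_route, 2))  # 21.73
```

Changing the current's heading to −45 (southeast) reproduces the setup of the exercise that follows.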

 Exercise 12.3.8
Repeat the previous example, but assume the ocean current is moving southeast instead of northeast, as shown in the following
figure.

Hint
Compute the scalar projection of \(\vec{w}\) onto \(\vec{v}\).
Answer
21 knots

Work
Now that we understand dot products, we can see how to apply them to real-life situations. The most common application of the
dot product of two vectors is in the calculation of work.

From physics, we know that work is done when an object is moved by a force. When the force is constant and applied in the same
direction the object moves, then we define the work done as the product of the force and the distance the object travels: W = F d .
We saw several examples of this type in earlier chapters. Now imagine the direction of the force is different from the direction of
motion, as with the example of a child pulling a wagon. To find the work done, we need to multiply the component of the force that
acts in the direction of the motion by the magnitude of the displacement. The dot product allows us to do just that. If we represent an applied force by a vector \(\vec{F}\) and the displacement of an object by a vector \(\vec{s}\), then the work done by the force is the dot product of \(\vec{F}\) and \(\vec{s}\).

 Definition: Constant Force

When a constant force is applied to an object so the object moves in a straight line from point P to point Q, the work W done by the force \(\vec{F}\), acting at an angle θ from the line of motion, is given by
\[ W = \vec{F}\cdot\overrightarrow{PQ} = \|\vec{F}\|\,\|\overrightarrow{PQ}\|\cos\theta. \]

Let’s revisit the problem of the child’s wagon introduced earlier. Suppose a child is pulling a wagon with a force having a
magnitude of 8 lb on the handle at an angle of 55°. If the child pulls the wagon 50 ft, find the work done by the force (Figure
12.3.8).


Figure 12.3.8: The horizontal component of the force is the projection of \(\vec{F}\) onto the positive x-axis.
We have
\[ W = \|\vec{F}\|\,\|\overrightarrow{PQ}\|\cos\theta = 8(50)(\cos(55°)) \approx 229 \text{ ft·lb}. \]
In U.S. standard units, we measure the magnitude of force \(\|\vec{F}\|\) in pounds. The magnitude of the displacement vector \(\|\overrightarrow{PQ}\|\) tells us how far the object moved, and it is measured in feet. The customary unit of measure for work, then, is the foot-pound. One foot-pound is the amount of work required to move an object weighing 1 lb a distance of 1 ft straight up. In the metric system, the unit of measure for force is the newton (N), and the unit of measure of magnitude for work is a newton-meter (N·m), or a joule (J).

 Example 12.3.10: Calculating Work



A conveyor belt generates a force \(\vec{F} = 5\hat{\mathbf{i}} - 3\hat{\mathbf{j}} + \hat{\mathbf{k}}\) that moves a suitcase from point (1, 1, 1) to point (9, 4, 7) along a straight line. Find the work done by the conveyor belt. The distance is measured in meters and the force is measured in newtons.
Solution
The displacement vector \(\overrightarrow{PQ}\) has initial point (1, 1, 1) and terminal point (9, 4, 7):
\[ \overrightarrow{PQ} = \langle 9 - 1, 4 - 1, 7 - 1\rangle = \langle 8, 3, 6\rangle = 8\hat{\mathbf{i}} + 3\hat{\mathbf{j}} + 6\hat{\mathbf{k}}. \]
Work is the dot product of force and displacement:
\[
\begin{aligned}
W &= \vec{F}\cdot\overrightarrow{PQ} \\
&= (5\hat{\mathbf{i}} - 3\hat{\mathbf{j}} + \hat{\mathbf{k}})\cdot(8\hat{\mathbf{i}} + 3\hat{\mathbf{j}} + 6\hat{\mathbf{k}}) \\
&= 5(8) + (-3)(3) + 1(6) \\
&= 37 \text{ N·m} \\
&= 37 \text{ J}.
\end{aligned}
\]
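
Both forms of the work formula can be sketched in a few lines of Python: one takes the force vector and the endpoints of the motion, the other takes magnitudes and the angle. The function names are illustrative assumptions, and the code does no unit conversion, so the result is in whatever units the inputs imply (N·m or ft·lb).

```python
import math

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def work_from_points(force, start, end):
    """W = F . PQ for a constant force moving an object from P (start) to Q (end)."""
    displacement = tuple(b - a for a, b in zip(start, end))
    return dot(force, displacement)

def work_from_angle(force_magnitude, distance, angle_degrees):
    """W = ||F|| ||PQ|| cos(theta), with theta given in degrees."""
    return force_magnitude * distance * math.cos(math.radians(angle_degrees))

# Conveyor belt (Example 12.3.10): force in newtons, positions in meters.
print(work_from_points((5, -3, 1), (1, 1, 1), (9, 4, 7)))  # 37
# Wagon: 8 lb at 55 degrees over 50 ft.
print(round(work_from_angle(8, 50, 55)))  # 229
```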

 Exercise 12.3.9

A constant force of 30 lb is applied at an angle of 60° to pull a handcart 10 ft across the ground. What is the work done by this
force?

Hint
Use the definition of work as the dot product of force and displacement.
Answer
150 ft-lb

Key Concepts
The dot product, or scalar product, of two vectors \(\vec{u} = \langle u_1, u_2, u_3\rangle\) and \(\vec{v} = \langle v_1, v_2, v_3\rangle\) is \(\vec{u}\cdot\vec{v} = u_1v_1 + u_2v_2 + u_3v_3\).
The dot product satisfies the following properties:
\(\vec{u}\cdot\vec{v} = \vec{v}\cdot\vec{u}\)
\(\vec{u}\cdot(\vec{v} + \vec{w}) = \vec{u}\cdot\vec{v} + \vec{u}\cdot\vec{w}\)
\(c(\vec{u}\cdot\vec{v}) = (c\vec{u})\cdot\vec{v} = \vec{u}\cdot(c\vec{v})\)
\(\vec{v}\cdot\vec{v} = \|\vec{v}\|^2\)
The dot product of two vectors can be expressed, alternatively, as \(\vec{u}\cdot\vec{v} = \|\vec{u}\|\,\|\vec{v}\|\cos\theta\). This form of the dot product is useful for finding the measure of the angle formed by two vectors.
Vectors \(\vec{u}\) and \(\vec{v}\) are orthogonal if \(\vec{u}\cdot\vec{v} = 0\).
The angles formed by a nonzero vector and the coordinate axes are called the direction angles for the vector. The cosines of these angles are known as the direction cosines.
The vector projection of \(\vec{v}\) onto \(\vec{u}\) is the vector \(\operatorname{proj}_{\vec{u}}\vec{v} = \dfrac{\vec{u}\cdot\vec{v}}{\|\vec{u}\|^2}\,\vec{u}\). The magnitude of this vector is known as the scalar projection of \(\vec{v}\) onto \(\vec{u}\), given by \(\operatorname{comp}_{\vec{u}}\vec{v} = \dfrac{\vec{u}\cdot\vec{v}}{\|\vec{u}\|}\).
Work is done when a force is applied to an object, causing displacement. When the force is represented by the vector \(\vec{F}\) and the displacement is represented by the vector \(\vec{s}\), then the work done W is given by the formula \(W = \vec{F}\cdot\vec{s} = \|\vec{F}\|\,\|\vec{s}\|\cos\theta\).

Key Equations
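Dot product of \(\vec{u}\) and \(\vec{v}\)
\[ \vec{u}\cdot\vec{v} = u_1v_1 + u_2v_2 + u_3v_3 = \|\vec{u}\|\,\|\vec{v}\|\cos\theta \]

Cosine of the angle formed by \(\vec{u}\) and \(\vec{v}\)
\[ \cos\theta = \frac{\vec{u}\cdot\vec{v}}{\|\vec{u}\|\,\|\vec{v}\|} \]

Vector projection of \(\vec{v}\) onto \(\vec{u}\)
\[ \operatorname{proj}_{\vec{u}}\vec{v} = \frac{\vec{u}\cdot\vec{v}}{\|\vec{u}\|^2}\,\vec{u} \]

Scalar projection of \(\vec{v}\) onto \(\vec{u}\)
\[ \operatorname{comp}_{\vec{u}}\vec{v} = \frac{\vec{u}\cdot\vec{v}}{\|\vec{u}\|} \]

Work done by a force \(\vec{F}\) to move an object through displacement vector \(\overrightarrow{PQ}\)
\[ W = \vec{F}\cdot\overrightarrow{PQ} = \|\vec{F}\|\,\|\overrightarrow{PQ}\|\cos\theta \]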

Glossary
direction angles
the angles formed by a nonzero vector and the coordinate axes

direction cosines
the cosines of the angles formed by a nonzero vector and the coordinate axes

dot product or scalar product
\(\vec{u}\cdot\vec{v} = u_1v_1 + u_2v_2 + u_3v_3\) where \(\vec{u} = \langle u_1, u_2, u_3\rangle\) and \(\vec{v} = \langle v_1, v_2, v_3\rangle\)

scalar projection
the magnitude of the vector projection of a vector

orthogonal vectors
vectors that form a right angle when placed in standard position

vector projection
the component of a vector that follows a given direction

work done by a force
work is generally thought of as the amount of energy it takes to move an object; if we represent an applied force by a vector \(\vec{F}\) and the displacement of an object by a vector \(\vec{s}\), then the work done by the force is the dot product of \(\vec{F}\) and \(\vec{s}\).

Contributors and Attributions


Gilbert Strang (MIT) and Edwin “Jed” Herman (Harvey Mudd) with many contributing authors. This content by OpenStax is
licensed with a CC-BY-SA-NC 4.0 license. Download for free at http://cnx.org.
edited for vector notation by Paul Seeburger

12.3: The Dot Product is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
12.3: The Dot Product by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0. Original source:
https://openstax.org/details/books/calculus-volume-1.

12.4: The Cross Product
 Learning Objectives
Calculate the cross product of two given vectors.
Use determinants to calculate a cross product.
Find a vector orthogonal to two given vectors.
Determine areas and volumes by using the cross product.
Calculate the torque of a given force and position vector.

Imagine a mechanic turning a wrench to tighten a bolt. The mechanic applies a force at the end of the wrench. This creates rotation,
or torque, which tightens the bolt. We can use vectors to represent the force applied by the mechanic, and the distance (radius) from
the bolt to the end of the wrench. Then, we can represent torque by a vector oriented along the axis of rotation. Note that the torque
vector is orthogonal to both the force vector and the radius vector.
In this section, we develop an operation called the cross product, which allows us to find a vector orthogonal to two given vectors.
Calculating torque is an important application of cross products, and we examine torque in more detail later in the section.

The Cross Product and Its Properties


The dot product is a multiplication of two vectors that results in a scalar. In this section, we introduce a product of two vectors that generates a third vector orthogonal to the first two. Consider how we might find such a vector. Let \(\vec{u} = \langle u_1, u_2, u_3\rangle\) and \(\vec{v} = \langle v_1, v_2, v_3\rangle\) be nonzero vectors. We want to find a vector \(\vec{w} = \langle w_1, w_2, w_3\rangle\) orthogonal to both \(\vec{u}\) and \(\vec{v}\); that is, we want to find \(\vec{w}\) such that \(\vec{u}\cdot\vec{w} = 0\) and \(\vec{v}\cdot\vec{w} = 0\). Therefore, \(w_1\), \(w_2\), and \(w_3\) must satisfy
\[ u_1w_1 + u_2w_2 + u_3w_3 = 0 \tag{12.4.1} \]
\[ v_1w_1 + v_2w_2 + v_3w_3 = 0. \tag{12.4.2} \]
If we multiply the top equation by \(v_3\) and the bottom equation by \(u_3\) and subtract, we can eliminate the variable \(w_3\), which gives
\[ (u_1v_3 - v_1u_3)w_1 + (u_2v_3 - v_2u_3)w_2 = 0. \]
If we select
\[
\begin{aligned}
w_1 &= u_2v_3 - u_3v_2 \\
w_2 &= -(u_1v_3 - u_3v_1),
\end{aligned}
\]
we get a possible solution vector. Substituting these values back into the original equations (Equations 12.4.1 and 12.4.2) gives
\[ w_3 = u_1v_2 - u_2v_1. \]
That is, vector
\[ \vec{w} = \langle u_2v_3 - u_3v_2,\; -(u_1v_3 - u_3v_1),\; u_1v_2 - u_2v_1\rangle \]
is orthogonal to both \(\vec{u}\) and \(\vec{v}\), which leads us to define the following operation, called the cross product.

 Definition: Cross Product

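Let \(\vec{u} = \langle u_1, u_2, u_3\rangle\) and \(\vec{v} = \langle v_1, v_2, v_3\rangle\). Then, the cross product \(\vec{u}\times\vec{v}\) is vector
\[
\begin{aligned}
\vec{u}\times\vec{v} &= (u_2v_3 - u_3v_2)\,\hat{\mathbf{i}} - (u_1v_3 - u_3v_1)\,\hat{\mathbf{j}} + (u_1v_2 - u_2v_1)\,\hat{\mathbf{k}} \\
&= \langle u_2v_3 - u_3v_2,\; -(u_1v_3 - u_3v_1),\; u_1v_2 - u_2v_1\rangle.
\end{aligned}
\tag{12.4.3}
\]

From the way we have developed \(\vec{u}\times\vec{v}\), it should be clear that the cross product is orthogonal to both \(\vec{u}\) and \(\vec{v}\). However, it never hurts to check. To show that \(\vec{u}\times\vec{v}\) is orthogonal to \(\vec{u}\), we calculate the dot product of \(\vec{u}\) and \(\vec{u}\times\vec{v}\).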

12.4.1 https://math.libretexts.org/@go/page/4527
\[
\begin{aligned}
\vec{u}\cdot(\vec{u}\times\vec{v}) &= \langle u_1, u_2, u_3\rangle\cdot\langle u_2v_3 - u_3v_2,\; -u_1v_3 + u_3v_1,\; u_1v_2 - u_2v_1\rangle \\
&= u_1(u_2v_3 - u_3v_2) + u_2(-u_1v_3 + u_3v_1) + u_3(u_1v_2 - u_2v_1) \\
&= u_1u_2v_3 - u_1u_3v_2 - u_1u_2v_3 + u_2u_3v_1 + u_1u_3v_2 - u_2u_3v_1 \\
&= (u_1u_2v_3 - u_1u_2v_3) + (-u_1u_3v_2 + u_1u_3v_2) + (u_2u_3v_1 - u_2u_3v_1) \\
&= 0
\end{aligned}
\]
In a similar manner, we can show that the cross product is also orthogonal to \(\vec{v}\).
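
Equation 12.4.3 and the orthogonality check can be sketched in Python as follows. The sample vectors are arbitrary values chosen for illustration; they do not come from the text.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    """Cross product of two three-component vectors (Equation 12.4.3)."""
    u1, u2, u3 = u
    v1, v2, v3 = v
    return (u2 * v3 - u3 * v2,
            -(u1 * v3 - u3 * v1),
            u1 * v2 - u2 * v1)

u = (1, 2, 3)  # arbitrary sample vectors
v = (4, 5, 6)
w = cross(u, v)
print(w)                     # (-3, 6, -3)
print(dot(u, w), dot(v, w))  # 0 0, so w is orthogonal to both u and v
```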

The cross product \(\vec{a}\times\vec{b}\) (vertical, in pink) changes as the angle between the vectors \(\vec{a}\) (blue) and \(\vec{b}\) (red) changes. The cross product (purple) is always perpendicular to both vectors, and has magnitude zero when the vectors are parallel and maximum magnitude \(\|\vec{a}\|\,\|\vec{b}\|\) when they are perpendicular. (Public Domain; LucasVB).

 Example 12.4.1: Finding a Cross Product

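Let \(\vec{p} = \langle -1, 2, 5\rangle\) and \(\vec{q} = \langle 4, 0, -3\rangle\) (Figure 12.4.1). Find \(\vec{p}\times\vec{q}\).

Figure 12.4.1: Finding the cross product of two given vectors.

Solution
Substitute the components of the vectors into Equation 12.4.3:
\[
\begin{aligned}
\vec{p}\times\vec{q} &= \langle -1, 2, 5\rangle\times\langle 4, 0, -3\rangle \\
&= \langle p_2q_3 - p_3q_2,\; -(p_1q_3 - p_3q_1),\; p_1q_2 - p_2q_1\rangle \\
&= \langle 2(-3) - 5(0),\; -(-1)(-3) + 5(4),\; (-1)(0) - 2(4)\rangle \\
&= \langle -6, 17, -8\rangle.
\end{aligned}
\]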

 Exercise 12.4.1

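Find \(\vec{p}\times\vec{q}\) for \(\vec{p} = \langle 5, 1, 2\rangle\) and \(\vec{q} = \langle -2, 0, 1\rangle\). Express the answer using standard unit vectors.

Hint
Use the formula \(\vec{u}\times\vec{v} = (u_2v_3 - u_3v_2)\,\hat{\mathbf{i}} - (u_1v_3 - u_3v_1)\,\hat{\mathbf{j}} + (u_1v_2 - u_2v_1)\,\hat{\mathbf{k}}\).

Answer
\(\vec{p}\times\vec{q} = \hat{\mathbf{i}} - 9\hat{\mathbf{j}} + 2\hat{\mathbf{k}}\)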

Although it may not be obvious from Equation 12.4.3, the direction of \(\vec{u}\times\vec{v}\) is given by the right-hand rule. If we hold the right hand out with the fingers pointing in the direction of \(\vec{u}\), then curl the fingers toward vector \(\vec{v}\), the thumb points in the direction of the cross product, as shown in Figure 12.4.2.

Figure 12.4.2: The direction of \(\vec{u}\times\vec{v}\) is determined by the right-hand rule.

Notice what this means for the direction of \(\vec{v}\times\vec{u}\). If we apply the right-hand rule to \(\vec{v}\times\vec{u}\), we start with our fingers pointed in the direction of \(\vec{v}\), then curl our fingers toward the vector \(\vec{u}\). In this case, the thumb points in the opposite direction of \(\vec{u}\times\vec{v}\). (Try it!)

 Example 12.4.2: Anticommutativity of the Cross Product

Let \(\vec{u} = \langle 0, 2, 1\rangle\) and \(\vec{v} = \langle 3, -1, 0\rangle\). Calculate \(\vec{u}\times\vec{v}\) and \(\vec{v}\times\vec{u}\) and graph them.

Figure 12.4.3: Are the cross products \(\vec{u}\times\vec{v}\) and \(\vec{v}\times\vec{u}\) in the same direction?

Solution
We have
\[ \vec{u}\times\vec{v} = \langle (0 + 1),\; -(0 - 3),\; (0 - 6)\rangle = \langle 1, 3, -6\rangle \]
\[ \vec{v}\times\vec{u} = \langle (-1 - 0),\; -(3 - 0),\; (6 - 0)\rangle = \langle -1, -3, 6\rangle. \]
We see that, in this case, \(\vec{u}\times\vec{v} = -(\vec{v}\times\vec{u})\) (Figure 12.4.4). We prove this in general later in this section.

Figure 12.4.4: The cross products \(\vec{u}\times\vec{v}\) and \(\vec{v}\times\vec{u}\) are both orthogonal to \(\vec{u}\) and \(\vec{v}\), but in opposite directions.

 Exercise 12.4.2

Suppose vectors \(\vec{u}\) and \(\vec{v}\) lie in the xy-plane (the z-component of each vector is zero). Now suppose the x- and y-components of \(\vec{u}\) and the y-component of \(\vec{v}\) are all positive, whereas the x-component of \(\vec{v}\) is negative. Assuming the coordinate axes are oriented in the usual positions, in which direction does \(\vec{u}\times\vec{v}\) point?

Hint
Remember the right-hand rule (Figure 12.4.2).
Answer
Up (the positive z -direction)

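The cross products of the standard unit vectors \(\hat{\mathbf{i}}\), \(\hat{\mathbf{j}}\), and \(\hat{\mathbf{k}}\) can be useful for simplifying some calculations, so let’s consider these cross products. A straightforward application of the definition shows that
\[ \hat{\mathbf{i}}\times\hat{\mathbf{i}} = \hat{\mathbf{j}}\times\hat{\mathbf{j}} = \hat{\mathbf{k}}\times\hat{\mathbf{k}} = \vec{0}. \]
(The cross product of two vectors is a vector, so each of these products results in the zero vector, not the scalar 0.) It’s up to you to verify the calculations on your own.

Furthermore, because the cross product of two vectors is orthogonal to each of these vectors, we know that the cross product of \(\hat{\mathbf{i}}\) and \(\hat{\mathbf{j}}\) is parallel to \(\hat{\mathbf{k}}\). Similarly, the vector product of \(\hat{\mathbf{i}}\) and \(\hat{\mathbf{k}}\) is parallel to \(\hat{\mathbf{j}}\), and the vector product of \(\hat{\mathbf{j}}\) and \(\hat{\mathbf{k}}\) is parallel to \(\hat{\mathbf{i}}\). We can use the right-hand rule to determine the direction of each product. Then we have
\[
\begin{aligned}
\hat{\mathbf{i}}\times\hat{\mathbf{j}} &= \hat{\mathbf{k}} & \hat{\mathbf{j}}\times\hat{\mathbf{i}} &= -\hat{\mathbf{k}} \\
\hat{\mathbf{j}}\times\hat{\mathbf{k}} &= \hat{\mathbf{i}} & \hat{\mathbf{k}}\times\hat{\mathbf{j}} &= -\hat{\mathbf{i}} \\
\hat{\mathbf{k}}\times\hat{\mathbf{i}} &= \hat{\mathbf{j}} & \hat{\mathbf{i}}\times\hat{\mathbf{k}} &= -\hat{\mathbf{j}}.
\end{aligned}
\]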

These formulas come in handy later.
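
All nine products of standard unit vectors can be generated from the component formula of Equation 12.4.3. The short Python sketch below does this; the helper `describe` is a hypothetical convenience that names a result as 0, a unit vector, or the negative of a unit vector.

```python
def cross(u, v):
    """Cross product of two three-component vectors (Equation 12.4.3)."""
    u1, u2, u3 = u
    v1, v2, v3 = v
    return (u2 * v3 - u3 * v2,
            -(u1 * v3 - u3 * v1),
            u1 * v2 - u2 * v1)

i, j, k = (1, 0, 0), (0, 1, 0), (0, 0, 1)
names = {i: "i", j: "j", k: "k"}

def describe(w):
    """Name a vector that is the zero vector, a standard unit vector, or its negative."""
    if w == (0, 0, 0):
        return "0"
    if w in names:
        return names[w]
    return "-" + names[tuple(-c for c in w)]

# Print the full multiplication table, e.g. "i x j = k" and "j x i = -k".
for a in (i, j, k):
    for b in (i, j, k):
        print(f"{names[a]} x {names[b]} = {describe(cross(a, b))}")
```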

 Example 12.4.3: Cross Product of Standard Unit Vectors

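Find \(\hat{\mathbf{i}}\times(\hat{\mathbf{j}}\times\hat{\mathbf{k}})\).

Solution

We know that \(\hat{\mathbf{j}}\times\hat{\mathbf{k}} = \hat{\mathbf{i}}\). Therefore, \(\hat{\mathbf{i}}\times(\hat{\mathbf{j}}\times\hat{\mathbf{k}}) = \hat{\mathbf{i}}\times\hat{\mathbf{i}} = \vec{0}\).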

 Exercise 12.4.3

Find \((\hat{\mathbf{i}}\times\hat{\mathbf{j}})\times(\hat{\mathbf{k}}\times\hat{\mathbf{i}})\).

Hint
Remember the right-hand rule (Figure 12.4.2).
Answer
\(-\hat{\mathbf{i}}\)

As we have seen, the dot product is often called the scalar product because it results in a scalar. The cross product results in a
vector, so it is sometimes called the vector product. These operations are both versions of vector multiplication, but they have very
different properties and applications. Let’s explore some properties of the cross product. We prove only a few of them. Proofs of
the other properties are left as exercises.

 Properties of the Cross Product

Let \(\vec{u}\), \(\vec{v}\), and \(\vec{w}\) be vectors in space, and let c be a scalar.

i. Anticommutative property:
\[ \vec{u}\times\vec{v} = -(\vec{v}\times\vec{u}) \]
ii. Distributive property:
\[ \vec{u}\times(\vec{v} + \vec{w}) = \vec{u}\times\vec{v} + \vec{u}\times\vec{w} \]
iii. Multiplication by a constant:
\[ c(\vec{u}\times\vec{v}) = (c\vec{u})\times\vec{v} = \vec{u}\times(c\vec{v}) \]
iv. Cross product of the zero vector:
\[ \vec{u}\times\vec{0} = \vec{0}\times\vec{u} = \vec{0} \]
v. Cross product of a vector with itself:
\[ \vec{v}\times\vec{v} = \vec{0} \]
vi. Triple scalar product:
\[ \vec{u}\cdot(\vec{v}\times\vec{w}) = (\vec{u}\times\vec{v})\cdot\vec{w} \]
vii. Triple cross product:
\[ \vec{u}\times(\vec{v}\times\vec{w}) = (\vec{u}\cdot\vec{w})\,\vec{v} - (\vec{u}\cdot\vec{v})\,\vec{w} \]

 Proof

For property i, we want to show $\vec{u} \times \vec{v} = -(\vec{v} \times \vec{u})$. We have

$$\begin{aligned}
\vec{u} \times \vec{v} &= \langle u_1, u_2, u_3 \rangle \times \langle v_1, v_2, v_3 \rangle \\
&= \langle u_2 v_3 - u_3 v_2,\ -u_1 v_3 + u_3 v_1,\ u_1 v_2 - u_2 v_1 \rangle \\
&= -\langle u_3 v_2 - u_2 v_3,\ -u_3 v_1 + u_1 v_3,\ u_2 v_1 - u_1 v_2 \rangle \\
&= -\langle v_1, v_2, v_3 \rangle \times \langle u_1, u_2, u_3 \rangle \\
&= -(\vec{v} \times \vec{u}).
\end{aligned}$$

Unlike most operations we’ve seen, the cross product is not commutative. This makes sense if we think about the right-hand rule.

Property iv follows directly from the definition of the cross product. We have

$$\vec{u} \times \vec{0} = \langle u_2(0) - u_3(0),\ -(u_1(0) - u_3(0)),\ u_1(0) - u_2(0) \rangle = \langle 0, 0, 0 \rangle = \vec{0}.$$

Then, by property i, $\vec{0} \times \vec{u} = \vec{0}$ as well. Remember that the dot product of a vector and the zero vector is the scalar 0, whereas the cross product of a vector with the zero vector is the vector $\vec{0}$.

Property vi looks like the associative property, but note the change in operations:

$$\begin{aligned}
\vec{u} \cdot (\vec{v} \times \vec{w}) &= \vec{u} \cdot \langle v_2 w_3 - v_3 w_2,\ -v_1 w_3 + v_3 w_1,\ v_1 w_2 - v_2 w_1 \rangle \\
&= u_1(v_2 w_3 - v_3 w_2) + u_2(-v_1 w_3 + v_3 w_1) + u_3(v_1 w_2 - v_2 w_1) \\
&= u_1 v_2 w_3 - u_1 v_3 w_2 - u_2 v_1 w_3 + u_2 v_3 w_1 + u_3 v_1 w_2 - u_3 v_2 w_1 \\
&= (u_2 v_3 - u_3 v_2)w_1 + (u_3 v_1 - u_1 v_3)w_2 + (u_1 v_2 - u_2 v_1)w_3 \\
&= \langle u_2 v_3 - u_3 v_2,\ u_3 v_1 - u_1 v_3,\ u_1 v_2 - u_2 v_1 \rangle \cdot \langle w_1, w_2, w_3 \rangle = (\vec{u} \times \vec{v}) \cdot \vec{w}.
\end{aligned}$$
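A numerical spot check can complement the proofs above, although it proves nothing in general. The sketch below tests properties i, vi, and vii on one arbitrary choice of integer vectors; the helper names (`dot`, `cross`, `scale`, `sub`) and the sample vectors are assumptions made for this illustration.

```python
def dot(u, v):
    """Dot product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    """Cross product of two 3-component tuples."""
    u1, u2, u3 = u
    v1, v2, v3 = v
    return (u2 * v3 - u3 * v2, -(u1 * v3 - u3 * v1), u1 * v2 - u2 * v1)


def scale(c, v):
    """Scalar multiple c*v."""
    return tuple(c * x for x in v)


def sub(u, v):
    """Vector difference u - v."""
    return tuple(a - b for a, b in zip(u, v))


# Arbitrary sample vectors (chosen for this sketch, not from the text).
u, v, w = (1, -2, 3), (4, 0, -1), (2, 5, 7)

# Property i: u x v = -(v x u)
anticommutative = cross(u, v) == scale(-1, cross(v, u))

# Property vi: u . (v x w) = (u x v) . w
triple_scalar = dot(u, cross(v, w)) == dot(cross(u, v), w)

# Property vii: u x (v x w) = (u . w) v - (u . v) w
triple_cross = cross(u, cross(v, w)) == sub(scale(dot(u, w), v),
                                            scale(dot(u, v), w))

print(anticommutative, triple_scalar, triple_cross)
```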

 Example 12.4.4: Using the Properties of the Cross Product

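Use the cross product properties to calculate $(2\hat{i} \times 3\hat{j}) \times \hat{j}$.

Solution

$$\begin{aligned}
(2\hat{i} \times 3\hat{j}) \times \hat{j} &= 2(\hat{i} \times 3\hat{j}) \times \hat{j} \\
&= 2(3)(\hat{i} \times \hat{j}) \times \hat{j} \\
&= (6\hat{k}) \times \hat{j} \\
&= 6(\hat{k} \times \hat{j}) \\
&= 6(-\hat{i}) = -6\hat{i}.
\end{aligned}$$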

 Exercise 12.4.4

Use the properties of the cross product to calculate $(\hat{i} \times \hat{k}) \times (\hat{k} \times \hat{j})$.

Hint
$\vec{u} \times \vec{v} = -(\vec{v} \times \vec{u})$

Answer
$-\hat{k}$

So far in this section, we have been concerned with the direction of the vector $\vec{u} \times \vec{v}$, but we have not discussed its magnitude. It turns out there is a simple expression for the magnitude of $\vec{u} \times \vec{v}$ involving the magnitudes of $\vec{u}$ and $\vec{v}$, and the sine of the angle between them.

 Magnitude of the Cross Product

Let $\vec{u}$ and $\vec{v}$ be vectors, and let $\theta$ be the angle between them. Then, $\|\vec{u} \times \vec{v}\| = \|\vec{u}\| \cdot \|\vec{v}\| \cdot \sin\theta$.

 Proof

Let $\vec{u} = \langle u_1, u_2, u_3 \rangle$ and $\vec{v} = \langle v_1, v_2, v_3 \rangle$ be vectors, and let $\theta$ denote the angle between them. Then

$$\begin{aligned}
\|\vec{u} \times \vec{v}\|^2 &= (u_2 v_3 - u_3 v_2)^2 + (u_3 v_1 - u_1 v_3)^2 + (u_1 v_2 - u_2 v_1)^2 \\
&= u_2^2 v_3^2 - 2u_2 u_3 v_2 v_3 + u_3^2 v_2^2 + u_3^2 v_1^2 - 2u_1 u_3 v_1 v_3 + u_1^2 v_3^2 + u_1^2 v_2^2 - 2u_1 u_2 v_1 v_2 + u_2^2 v_1^2 \\
&= u_1^2 v_1^2 + u_1^2 v_2^2 + u_1^2 v_3^2 + u_2^2 v_1^2 + u_2^2 v_2^2 + u_2^2 v_3^2 + u_3^2 v_1^2 + u_3^2 v_2^2 + u_3^2 v_3^2 \\
&\quad - (u_1^2 v_1^2 + u_2^2 v_2^2 + u_3^2 v_3^2 + 2u_1 u_2 v_1 v_2 + 2u_1 u_3 v_1 v_3 + 2u_2 u_3 v_2 v_3) \\
&= (u_1^2 + u_2^2 + u_3^2)(v_1^2 + v_2^2 + v_3^2) - (u_1 v_1 + u_2 v_2 + u_3 v_3)^2 \\
&= \|\vec{u}\|^2 \|\vec{v}\|^2 - (\vec{u} \cdot \vec{v})^2 \\
&= \|\vec{u}\|^2 \|\vec{v}\|^2 - \|\vec{u}\|^2 \|\vec{v}\|^2 \cos^2\theta \\
&= \|\vec{u}\|^2 \|\vec{v}\|^2 (1 - \cos^2\theta) \\
&= \|\vec{u}\|^2 \|\vec{v}\|^2 \sin^2\theta.
\end{aligned}$$

Taking square roots and noting that $\sqrt{\sin^2\theta} = \sin\theta$ for $0 \le \theta \le 180°$, we have the desired result:

$$\|\vec{u} \times \vec{v}\| = \|\vec{u}\| \|\vec{v}\| \sin\theta.$$
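As an illustration of the result just proved, the sketch below computes both sides of $\|\vec{u} \times \vec{v}\| = \|\vec{u}\| \|\vec{v}\| \sin\theta$ for one pair of vectors, with the angle recovered from the dot product. The function names and the sample vectors are assumptions of this sketch; the two sides agree only up to floating-point rounding.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    u1, u2, u3 = u
    v1, v2, v3 = v
    return (u2 * v3 - u3 * v2, -(u1 * v3 - u3 * v1), u1 * v2 - u2 * v1)


def norm(v):
    """Euclidean length of v."""
    return math.sqrt(dot(v, v))


def angle_between(u, v):
    """Angle in radians, in [0, pi], between two nonzero vectors."""
    c = dot(u, v) / (norm(u) * norm(v))
    # Clamp to [-1, 1] to guard against rounding slightly outside the domain of acos.
    return math.acos(max(-1.0, min(1.0, c)))


# Sample vectors chosen for this sketch.
u, v = (1, 2, 2), (3, 0, 4)
theta = angle_between(u, v)

lhs = norm(cross(u, v))                        # ||u x v||
rhs = norm(u) * norm(v) * math.sin(theta)      # ||u|| ||v|| sin(theta)
print(math.isclose(lhs, rhs))
```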

This definition of the cross product allows us to visualize or interpret the product geometrically. It is clear, for example, that the
cross product is defined only for vectors in three dimensions, not for vectors in two dimensions. In two dimensions, it is impossible
to generate a vector simultaneously orthogonal to two nonparallel vectors.

 Example 12.4.5: Calculating the Cross Product


Use "Magnitude of the Cross Product" to find the magnitude of the cross product of $\vec{u} = \langle 0, 4, 0 \rangle$ and $\vec{v} = \langle 0, 0, -3 \rangle$.
Solution
The vectors are orthogonal (their dot product is 0), so $\theta = \pi/2$. We have

$$\begin{aligned}
\|\vec{u} \times \vec{v}\| &= \|\vec{u}\| \cdot \|\vec{v}\| \cdot \sin\theta \\
&= \sqrt{0^2 + 4^2 + 0^2} \cdot \sqrt{0^2 + 0^2 + (-3)^2} \cdot \sin\frac{\pi}{2} \\
&= 4(3)(1) = 12.
\end{aligned}$$

 Exercise 12.4.5

Use "Magnitude of the Cross Product" to find the magnitude of $\vec{u} \times \vec{v}$, where $\vec{u} = \langle -8, 0, 0 \rangle$ and $\vec{v} = \langle 0, 2, 0 \rangle$.

Hint
Vectors $\vec{u}$ and $\vec{v}$ are orthogonal.
Answer
16

Determinants and the Cross Product


Using Equation 12.4.3 to find the cross product of two vectors is straightforward, and it presents the cross product in the useful
component form. The formula, however, is complicated and difficult to remember. Fortunately, we have an alternative. We can
calculate the cross product of two vectors using determinant notation.
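A $2 \times 2$ determinant is defined by

$$\begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix} = a_1 b_2 - b_1 a_2.$$

For example,

$$\begin{vmatrix} 3 & -2 \\ 5 & 1 \end{vmatrix} = 3(1) - 5(-2) = 3 + 10 = 13.$$

A $3 \times 3$ determinant is defined in terms of $2 \times 2$ determinants as follows:

$$\begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix} = a_1 \begin{vmatrix} b_2 & b_3 \\ c_2 & c_3 \end{vmatrix} - a_2 \begin{vmatrix} b_1 & b_3 \\ c_1 & c_3 \end{vmatrix} + a_3 \begin{vmatrix} b_1 & b_2 \\ c_1 & c_2 \end{vmatrix}. \tag{12.4.4}$$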

Equation 12.4.4 is referred to as the expansion of the determinant along the first row. Notice that the multipliers of each of the $2 \times 2$ determinants on the right side of this expression are the entries in the first row of the $3 \times 3$ determinant. Furthermore, each of the $2 \times 2$ determinants contains the entries from the $3 \times 3$ determinant that would remain if you crossed out the row and column containing the multiplier. Thus, for the first term on the right, $a_1$ is the multiplier, and the $2 \times 2$ determinant contains the entries that remain if you cross out the first row and first column of the $3 \times 3$ determinant. Similarly, for the second term, the multiplier is $a_2$, and the $2 \times 2$ determinant contains the entries that remain if you cross out the first row and second column of the $3 \times 3$ determinant. Notice, however, that the coefficient of the second term is negative. The third term can be calculated in similar fashion.
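The expansion along the first row translates directly into code. The following is a minimal sketch under the assumption that a matrix is given as a tuple of rows; the function names `det2` and `det3` are chosen here for illustration.

```python
def det2(m):
    """2x2 determinant of ((a1, b1), (a2, b2)): a1*b2 - b1*a2."""
    (a1, b1), (a2, b2) = m
    return a1 * b2 - b1 * a2


def det3(m):
    """3x3 determinant by expansion along the first row.

    Each first-row entry multiplies the 2x2 determinant left after crossing
    out its row and column; the middle term carries a minus sign.
    """
    (a1, a2, a3), (b1, b2, b3), (c1, c2, c3) = m
    return (a1 * det2(((b2, b3), (c2, c3)))
            - a2 * det2(((b1, b3), (c1, c3)))
            + a3 * det2(((b1, b2), (c1, c2))))


# The 2x2 example from the text: 3(1) - 5(-2)
print(det2(((3, -2), (5, 1))))  # 13
```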

 Example 12.4.6: Using Expansion Along the First Row to Compute a 3 × 3 Determinant
Evaluate the determinant $\begin{vmatrix} 2 & 5 & -1 \\ -1 & 1 & 3 \\ -2 & 3 & 4 \end{vmatrix}$.

Solution
We have

$$\begin{aligned}
\begin{vmatrix} 2 & 5 & -1 \\ -1 & 1 & 3 \\ -2 & 3 & 4 \end{vmatrix} &= 2 \begin{vmatrix} 1 & 3 \\ 3 & 4 \end{vmatrix} - 5 \begin{vmatrix} -1 & 3 \\ -2 & 4 \end{vmatrix} - 1 \begin{vmatrix} -1 & 1 \\ -2 & 3 \end{vmatrix} \\
&= 2(4 - 9) - 5(-4 + 6) - 1(-3 + 2) \\
&= 2(-5) - 5(2) - 1(-1) = -10 - 10 + 1 \\
&= -19.
\end{aligned}$$

 Exercise 12.4.6

Evaluate the determinant $\begin{vmatrix} 1 & -2 & -1 \\ 3 & 2 & -3 \\ 1 & 5 & 4 \end{vmatrix}$.

Hint
Expand along the first row. Don’t forget the second term is negative!
Answer
40

Technically, determinants are defined only in terms of arrays of real numbers. However, the determinant notation provides a useful
mnemonic device for the cross product formula.

 Rule: Cross Product Calculated by a Determinant

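Let $\vec{u} = \langle u_1, u_2, u_3 \rangle$ and $\vec{v} = \langle v_1, v_2, v_3 \rangle$ be vectors. Then the cross product $\vec{u} \times \vec{v}$ is given by

$$\vec{u} \times \vec{v} = \begin{vmatrix} \hat{i} & \hat{j} & \hat{k} \\ u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \end{vmatrix} = \begin{vmatrix} u_2 & u_3 \\ v_2 & v_3 \end{vmatrix} \hat{i} - \begin{vmatrix} u_1 & u_3 \\ v_1 & v_3 \end{vmatrix} \hat{j} + \begin{vmatrix} u_1 & u_2 \\ v_1 & v_2 \end{vmatrix} \hat{k}.$$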

 Example 12.4.7: Using Determinant Notation to Find $\vec{p} \times \vec{q}$

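Let $\vec{p} = \langle -1, 2, 5 \rangle$ and $\vec{q} = \langle 4, 0, -3 \rangle$. Find $\vec{p} \times \vec{q}$.

Solution
We set up our determinant by putting the standard unit vectors across the first row, the components of $\vec{p}$ in the second row, and the components of $\vec{q}$ in the third row. Then, we have

$$\begin{aligned}
\vec{p} \times \vec{q} &= \begin{vmatrix} \hat{i} & \hat{j} & \hat{k} \\ -1 & 2 & 5 \\ 4 & 0 & -3 \end{vmatrix} = \begin{vmatrix} 2 & 5 \\ 0 & -3 \end{vmatrix} \hat{i} - \begin{vmatrix} -1 & 5 \\ 4 & -3 \end{vmatrix} \hat{j} + \begin{vmatrix} -1 & 2 \\ 4 & 0 \end{vmatrix} \hat{k} \\
&= (-6 - 0)\hat{i} - (3 - 20)\hat{j} + (0 - 8)\hat{k} \\
&= -6\hat{i} + 17\hat{j} - 8\hat{k}.
\end{aligned}$$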

Notice that this answer confirms the calculation of the cross product in Example 12.4.1.
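The determinant rule can be sketched in code by computing the three $2 \times 2$ determinants that multiply $\hat{i}$, $\hat{j}$, and $\hat{k}$. The names `det2` and `cross` are chosen for this illustration, and vectors are assumed to be 3-component tuples.

```python
def det2(a, b, c, d):
    """Determinant of the 2x2 array with rows (a, b) and (c, d)."""
    return a * d - b * c


def cross(u, v):
    """Cross product via first-row expansion of the symbolic determinant."""
    u1, u2, u3 = u
    v1, v2, v3 = v
    i_part = det2(u2, u3, v2, v3)   # cross out the i column
    j_part = det2(u1, u3, v1, v3)   # cross out the j column
    k_part = det2(u1, u2, v1, v2)   # cross out the k column
    # The middle term of the expansion is subtracted.
    return (i_part, -j_part, k_part)


p, q = (-1, 2, 5), (4, 0, -3)
print(cross(p, q))  # (-6, 17, -8)
```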

 Exercise 12.4.7
Use determinant notation to find $\vec{a} \times \vec{b}$, where $\vec{a} = \langle 8, 2, 3 \rangle$ and $\vec{b} = \langle -1, 0, 4 \rangle$.

Hint
Calculate the determinant $\begin{vmatrix} \hat{i} & \hat{j} & \hat{k} \\ 8 & 2 & 3 \\ -1 & 0 & 4 \end{vmatrix}$.

Answer
$\vec{a} \times \vec{b} = 8\hat{i} - 35\hat{j} + 2\hat{k}$

Using the Cross Product


The cross product is very useful for several types of calculations, including finding a vector orthogonal to two given vectors,
computing areas of triangles and parallelograms, and even determining the volume of the three-dimensional geometric shape made
of parallelograms known as a parallelepiped. The following examples illustrate these calculations.

 Example 12.4.8: Finding a Unit Vector Orthogonal to Two Given Vectors


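Let $\vec{a} = \langle 5, 2, -1 \rangle$ and $\vec{b} = \langle 0, -1, 4 \rangle$. Find a unit vector orthogonal to both $\vec{a}$ and $\vec{b}$.
Solution
The cross product $\vec{a} \times \vec{b}$ is orthogonal to both vectors $\vec{a}$ and $\vec{b}$. We can calculate it with a determinant:

$$\begin{aligned}
\vec{a} \times \vec{b} &= \begin{vmatrix} \hat{i} & \hat{j} & \hat{k} \\ 5 & 2 & -1 \\ 0 & -1 & 4 \end{vmatrix} = \begin{vmatrix} 2 & -1 \\ -1 & 4 \end{vmatrix} \hat{i} - \begin{vmatrix} 5 & -1 \\ 0 & 4 \end{vmatrix} \hat{j} + \begin{vmatrix} 5 & 2 \\ 0 & -1 \end{vmatrix} \hat{k} \\
&= (8 - 1)\hat{i} - (20 - 0)\hat{j} + (-5 - 0)\hat{k} \\
&= 7\hat{i} - 20\hat{j} - 5\hat{k}.
\end{aligned}$$

Normalize this vector to find a unit vector in the same direction:

$$\|\vec{a} \times \vec{b}\| = \sqrt{(7)^2 + (-20)^2 + (-5)^2} = \sqrt{474}.$$

Thus, $\left\langle \dfrac{7}{\sqrt{474}}, \dfrac{-20}{\sqrt{474}}, \dfrac{-5}{\sqrt{474}} \right\rangle$ is a unit vector orthogonal to $\vec{a}$ and $\vec{b}$.

Simplified, this vector becomes $\left\langle \dfrac{7\sqrt{474}}{474}, \dfrac{-10\sqrt{474}}{237}, \dfrac{-5\sqrt{474}}{474} \right\rangle$.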
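The normalize-the-cross-product procedure can be sketched as follows. The function name `unit_normal` and the choice to raise an error for parallel inputs are assumptions of this sketch; the result is a floating-point approximation rather than the exact radical form.

```python
import math


def cross(u, v):
    u1, u2, u3 = u
    v1, v2, v3 = v
    return (u2 * v3 - u3 * v2, -(u1 * v3 - u3 * v1), u1 * v2 - u2 * v1)


def unit_normal(a, b):
    """Return a unit vector orthogonal to both a and b.

    Raises ValueError when a x b is the zero vector (a and b parallel),
    since no single orthogonal direction is determined in that case.
    """
    n = cross(a, b)
    length = math.sqrt(sum(x * x for x in n))
    if length == 0:
        raise ValueError("a and b are parallel; a x b is the zero vector")
    return tuple(x / length for x in n)


n = unit_normal((5, 2, -1), (0, -1, 4))
print(n)  # components approximate 7/sqrt(474), -20/sqrt(474), -5/sqrt(474)
```

Note that the negative of the returned vector is also a unit vector orthogonal to both inputs.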

 Exercise 12.4.8
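Find a unit vector orthogonal to both $\vec{a}$ and $\vec{b}$, where $\vec{a} = \langle 4, 0, 3 \rangle$ and $\vec{b} = \langle 1, 1, 4 \rangle$.

Hint
Normalize the cross product.
Answer
$\left\langle \dfrac{-3}{\sqrt{194}}, \dfrac{-13}{\sqrt{194}}, \dfrac{4}{\sqrt{194}} \right\rangle$ or, simplified, $\left\langle \dfrac{-3\sqrt{194}}{194}, \dfrac{-13\sqrt{194}}{194}, \dfrac{2\sqrt{194}}{97} \right\rangle$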

To use the cross product for calculating areas, we state and prove the following theorem.

 Area of a Parallelogram

If we locate vectors $\vec{u}$ and $\vec{v}$ such that they form adjacent sides of a parallelogram, then the area of the parallelogram is given by $\|\vec{u} \times \vec{v}\|$ (Figure 12.4.5).

Figure 12.4.5: The parallelogram with adjacent sides $\vec{u}$ and $\vec{v}$ has base $\|\vec{u}\|$ and height $\|\vec{v}\| \sin\theta$.

 Proof

We show that the magnitude of the cross product is equal to the base times height of the parallelogram.

$$\begin{aligned}
\text{Area of a parallelogram} &= \text{base} \times \text{height} \\
&= \|\vec{u}\| (\|\vec{v}\| \sin\theta) \\
&= \|\vec{u} \times \vec{v}\|
\end{aligned}$$

 Example 12.4.9: Finding the Area of a Triangle

Let P = (1, 0, 0), Q = (0, 1, 0), and R = (0, 0, 1) be the vertices of a triangle (Figure 12.4.6). Find its area.

Figure 12.4.6 : Finding the area of a triangle by using the cross product.
Solution
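We have $\overrightarrow{PQ} = \langle 0 - 1, 1 - 0, 0 - 0 \rangle = \langle -1, 1, 0 \rangle$ and $\overrightarrow{PR} = \langle 0 - 1, 0 - 0, 1 - 0 \rangle = \langle -1, 0, 1 \rangle$. The area of the parallelogram with adjacent sides $\overrightarrow{PQ}$ and $\overrightarrow{PR}$ is given by $\|\overrightarrow{PQ} \times \overrightarrow{PR}\|$:

$$\begin{aligned}
\overrightarrow{PQ} \times \overrightarrow{PR} &= \begin{vmatrix} \hat{i} & \hat{j} & \hat{k} \\ -1 & 1 & 0 \\ -1 & 0 & 1 \end{vmatrix} \\
&= (1 - 0)\hat{i} - (-1 - 0)\hat{j} + (0 - (-1))\hat{k} \\
&= \hat{i} + \hat{j} + \hat{k} \\
\|\overrightarrow{PQ} \times \overrightarrow{PR}\| &= \|\langle 1, 1, 1 \rangle\| \\
&= \sqrt{1^2 + 1^2 + 1^2} \\
&= \sqrt{3}.
\end{aligned}$$

The area of $\triangle PQR$ is half the area of the parallelogram, or $\sqrt{3}/2$ units$^2$.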
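The area computation above generalizes to any three points in space. The sketch below is an illustration only; the function names `parallelogram_area` and `triangle_area` are chosen here, and points are assumed to be 3-component tuples.

```python
import math


def sub(p, q):
    """Vector from q to p, that is, p - q."""
    return tuple(a - b for a, b in zip(p, q))


def cross(u, v):
    u1, u2, u3 = u
    v1, v2, v3 = v
    return (u2 * v3 - u3 * v2, -(u1 * v3 - u3 * v1), u1 * v2 - u2 * v1)


def norm(v):
    return math.sqrt(sum(x * x for x in v))


def parallelogram_area(u, v):
    """Area of the parallelogram with adjacent sides u and v."""
    return norm(cross(u, v))


def triangle_area(P, Q, R):
    """Half the area of the parallelogram spanned by PQ and PR."""
    return 0.5 * parallelogram_area(sub(Q, P), sub(R, P))


area = triangle_area((1, 0, 0), (0, 1, 0), (0, 0, 1))
print(math.isclose(area, math.sqrt(3) / 2))
```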

 Exercise 12.4.9

Find the area of the parallelogram P QRS with vertices P (1, 1, 0), Q(7, 1, 0), R(9, 4, 2), and S(3, 4, 2).

Hint
Sketch the parallelogram and identify two vectors that form adjacent sides of the parallelogram.
Answer
$6\sqrt{13}$ units$^2$

The Triple Scalar Product


Because the cross product of two vectors is a vector, it is possible to combine the dot product and the cross product. The dot
product of a vector with the cross product of two other vectors is called the triple scalar product because the result is a scalar.

 Definition: Triple Scalar Product


The triple scalar product of vectors $\vec{u}$, $\vec{v}$, and $\vec{w}$ is

$$\vec{u} \cdot (\vec{v} \times \vec{w}).$$

 Calculating a Triple Scalar Product

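The triple scalar product of vectors

$$\vec{u} = u_1\hat{i} + u_2\hat{j} + u_3\hat{k}, \quad \vec{v} = v_1\hat{i} + v_2\hat{j} + v_3\hat{k}, \quad \text{and} \quad \vec{w} = w_1\hat{i} + w_2\hat{j} + w_3\hat{k}$$

is the determinant of the $3 \times 3$ matrix formed by the components of the vectors:

$$\vec{u} \cdot (\vec{v} \times \vec{w}) = \begin{vmatrix} u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{vmatrix}. \tag{12.4.5}$$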

 Proof

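The calculation is straightforward.

$$\begin{aligned}
\vec{u} \cdot (\vec{v} \times \vec{w}) &= \langle u_1, u_2, u_3 \rangle \cdot \langle v_2 w_3 - v_3 w_2,\ -v_1 w_3 + v_3 w_1,\ v_1 w_2 - v_2 w_1 \rangle \\
&= u_1(v_2 w_3 - v_3 w_2) + u_2(-v_1 w_3 + v_3 w_1) + u_3(v_1 w_2 - v_2 w_1) \\
&= u_1(v_2 w_3 - v_3 w_2) - u_2(v_1 w_3 - v_3 w_1) + u_3(v_1 w_2 - v_2 w_1) \\
&= \begin{vmatrix} u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{vmatrix}.
\end{aligned}$$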

 Example 12.4.10: Calculating the Triple Scalar Product

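Let $\vec{u} = \langle 1, 3, 5 \rangle$, $\vec{v} = \langle 2, -1, 0 \rangle$, and $\vec{w} = \langle -3, 0, -1 \rangle$. Calculate the triple scalar product $\vec{u} \cdot (\vec{v} \times \vec{w})$.

Solution
Apply Equation 12.4.5 directly:

$$\begin{aligned}
\vec{u} \cdot (\vec{v} \times \vec{w}) &= \begin{vmatrix} 1 & 3 & 5 \\ 2 & -1 & 0 \\ -3 & 0 & -1 \end{vmatrix} \\
&= 1 \begin{vmatrix} -1 & 0 \\ 0 & -1 \end{vmatrix} - 3 \begin{vmatrix} 2 & 0 \\ -3 & -1 \end{vmatrix} + 5 \begin{vmatrix} 2 & -1 \\ -3 & 0 \end{vmatrix} \\
&= (1 - 0) - 3(-2 - 0) + 5(0 - 3) \\
&= 1 + 6 - 15 = -8.
\end{aligned}$$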

 Exercise 12.4.10
Calculate the triple scalar product $\vec{a} \cdot (\vec{b} \times \vec{c})$, where $\vec{a} = \langle 2, -4, 1 \rangle$, $\vec{b} = \langle 0, 3, -1 \rangle$, and $\vec{c} = \langle 5, -3, 3 \rangle$.

Hint
Place the vectors as the rows of a 3 × 3 matrix, then calculate the determinant.
Answer
17

When we create a matrix from three vectors, we must be careful about the order in which we list the vectors. If we list them in a matrix in one order and then rearrange the rows, the absolute value of the determinant remains unchanged. However, each time two rows switch places, the determinant changes sign:

$$\begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix} = d, \quad \begin{vmatrix} b_1 & b_2 & b_3 \\ a_1 & a_2 & a_3 \\ c_1 & c_2 & c_3 \end{vmatrix} = -d, \quad \begin{vmatrix} b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \\ a_1 & a_2 & a_3 \end{vmatrix} = d, \quad \begin{vmatrix} c_1 & c_2 & c_3 \\ b_1 & b_2 & b_3 \\ a_1 & a_2 & a_3 \end{vmatrix} = -d.$$

Verifying this fact is straightforward, but rather messy. Let’s take a look at this with an example:

$$\begin{aligned}
\begin{vmatrix} 1 & 2 & 1 \\ -2 & 0 & 3 \\ 4 & 1 & -1 \end{vmatrix} &= \begin{vmatrix} 0 & 3 \\ 1 & -1 \end{vmatrix} - 2 \begin{vmatrix} -2 & 3 \\ 4 & -1 \end{vmatrix} + \begin{vmatrix} -2 & 0 \\ 4 & 1 \end{vmatrix} \\
&= (0 - 3) - 2(2 - 12) + (-2 - 0) \\
&= -3 + 20 - 2 = 15.
\end{aligned}$$

Switching the top two rows, we have

$$\begin{aligned}
\begin{vmatrix} -2 & 0 & 3 \\ 1 & 2 & 1 \\ 4 & 1 & -1 \end{vmatrix} &= -2 \begin{vmatrix} 2 & 1 \\ 1 & -1 \end{vmatrix} + 3 \begin{vmatrix} 1 & 2 \\ 4 & 1 \end{vmatrix} \\
&= -2(-2 - 1) + 3(1 - 8) \\
&= 6 - 21 = -15.
\end{aligned}$$

Rearranging vectors in the triple products is equivalent to reordering the rows in the matrix of the determinant. Let $\vec{u} = u_1\hat{i} + u_2\hat{j} + u_3\hat{k}$, $\vec{v} = v_1\hat{i} + v_2\hat{j} + v_3\hat{k}$, and $\vec{w} = w_1\hat{i} + w_2\hat{j} + w_3\hat{k}$. Applying Calculating a Triple Scalar Product, we have

$$\vec{u} \cdot (\vec{v} \times \vec{w}) = \begin{vmatrix} u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{vmatrix}$$

and

$$\vec{u} \cdot (\vec{w} \times \vec{v}) = \begin{vmatrix} u_1 & u_2 & u_3 \\ w_1 & w_2 & w_3 \\ v_1 & v_2 & v_3 \end{vmatrix}.$$

We can obtain the determinant for calculating $\vec{u} \cdot (\vec{w} \times \vec{v})$ by switching the bottom two rows of $\vec{u} \cdot (\vec{v} \times \vec{w})$. Therefore, $\vec{u} \cdot (\vec{v} \times \vec{w}) = -\vec{u} \cdot (\vec{w} \times \vec{v})$.

Following this reasoning and exploring the different ways we can interchange variables in the triple scalar product lead to the following identities:

$$\vec{u} \cdot (\vec{v} \times \vec{w}) = -\vec{u} \cdot (\vec{w} \times \vec{v}) \tag{12.4.6}$$

$$\vec{u} \cdot (\vec{v} \times \vec{w}) = \vec{v} \cdot (\vec{w} \times \vec{u}) = \vec{w} \cdot (\vec{u} \times \vec{v}). \tag{12.4.7}$$
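Identities 12.4.6 and 12.4.7 can be observed on concrete vectors. The sketch below uses the vectors of Example 12.4.10; the function name `triple_scalar` is an assumption of this illustration, and a check on one set of vectors is not a proof.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    u1, u2, u3 = u
    v1, v2, v3 = v
    return (u2 * v3 - u3 * v2, -(u1 * v3 - u3 * v1), u1 * v2 - u2 * v1)


def triple_scalar(u, v, w):
    """Triple scalar product u . (v x w)."""
    return dot(u, cross(v, w))


u, v, w = (1, 3, 5), (2, -1, 0), (-3, 0, -1)

base = triple_scalar(u, v, w)
# Swapping two of the vectors (one row swap) flips the sign, as in 12.4.6.
swapped = triple_scalar(u, w, v)
# Cyclic shifts of (u, v, w) leave the value unchanged, as in 12.4.7.
cyclic = (triple_scalar(v, w, u), triple_scalar(w, u, v))

print(base, swapped, cyclic)
```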

Let $\vec{u}$ and $\vec{v}$ be two vectors in standard position. If $\vec{u}$ and $\vec{v}$ are not scalar multiples of each other, then these vectors form adjacent sides of a parallelogram. We saw in Area of a Parallelogram that the area of this parallelogram is $\|\vec{u} \times \vec{v}\|$. Now suppose we add a third vector $\vec{w}$ that does not lie in the same plane as $\vec{u}$ and $\vec{v}$ but still shares the same initial point. Then these vectors form three edges of a parallelepiped, a three-dimensional prism with six faces that are each parallelograms, as shown in Figure 12.4.7. The volume of this prism is the product of the figure’s height and the area of its base. The triple scalar product of $\vec{u}$, $\vec{v}$, and $\vec{w}$ provides a simple method for calculating the volume of the parallelepiped defined by these vectors.

 Volume of a Parallelepiped

The volume of a parallelepiped with adjacent edges given by the vectors $\vec{u}$, $\vec{v}$, and $\vec{w}$ is the absolute value of the triple scalar product (Figure 12.4.7):

$$V = |\vec{u} \cdot (\vec{v} \times \vec{w})|.$$

Note that, as the name indicates, the triple scalar product produces a scalar. The volume formula just presented uses the absolute value of a scalar quantity.

Figure 12.4.7: The height of the parallelepiped is given by $\|\operatorname{proj}_{\vec{v} \times \vec{w}} \vec{u}\|$.

 Proof

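The area of the base of the parallelepiped is given by $\|\vec{v} \times \vec{w}\|$. The height of the figure is given by $\|\operatorname{proj}_{\vec{v} \times \vec{w}} \vec{u}\|$. The volume of the parallelepiped is the product of the height and the area of the base, so we have

$$\begin{aligned}
V &= \|\operatorname{proj}_{\vec{v} \times \vec{w}} \vec{u}\| \, \|\vec{v} \times \vec{w}\| \\
&= \left| \frac{\vec{u} \cdot (\vec{v} \times \vec{w})}{\|\vec{v} \times \vec{w}\|} \right| \|\vec{v} \times \vec{w}\| \\
&= |\vec{u} \cdot (\vec{v} \times \vec{w})|.
\end{aligned}$$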

 Example 12.4.11: Calculating the Volume of a Parallelepiped

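Let $\vec{u} = \langle -1, -2, 1 \rangle$, $\vec{v} = \langle 4, 3, 2 \rangle$, and $\vec{w} = \langle 0, -5, -2 \rangle$. Find the volume of the parallelepiped with adjacent edges $\vec{u}$, $\vec{v}$, and $\vec{w}$ (Figure 12.4.8).

Figure 12.4.8
Solution
We have

$$\begin{aligned}
\vec{u} \cdot (\vec{v} \times \vec{w}) &= \begin{vmatrix} -1 & -2 & 1 \\ 4 & 3 & 2 \\ 0 & -5 & -2 \end{vmatrix} \\
&= (-1) \begin{vmatrix} 3 & 2 \\ -5 & -2 \end{vmatrix} + 2 \begin{vmatrix} 4 & 2 \\ 0 & -2 \end{vmatrix} + \begin{vmatrix} 4 & 3 \\ 0 & -5 \end{vmatrix} \\
&= (-1)(-6 + 10) + 2(-8 - 0) + (-20 - 0) \\
&= -4 - 16 - 20 \\
&= -40.
\end{aligned}$$

Thus, the volume of the parallelepiped is $|-40| = 40$ units$^3$.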
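The volume formula $V = |\vec{u} \cdot (\vec{v} \times \vec{w})|$ is short to express in code. This is a minimal sketch with a function name, `parallelepiped_volume`, chosen for illustration; edge vectors are assumed to be 3-component tuples.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    u1, u2, u3 = u
    v1, v2, v3 = v
    return (u2 * v3 - u3 * v2, -(u1 * v3 - u3 * v1), u1 * v2 - u2 * v1)


def parallelepiped_volume(u, v, w):
    """Absolute value of the triple scalar product u . (v x w)."""
    return abs(dot(u, cross(v, w)))


# Edge vectors from the example above.
print(parallelepiped_volume((-1, -2, 1), (4, 3, 2), (0, -5, -2)))  # 40
```

Because of the absolute value, the order in which the three edges are listed does not affect the result.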

 Exercise 12.4.11

Find the volume of the parallelepiped formed by the vectors $\vec{a} = 3\hat{i} + 4\hat{j} - \hat{k}$, $\vec{b} = 2\hat{i} - \hat{j} - \hat{k}$, and $\vec{c} = 3\hat{j} + \hat{k}$.

Hint
Calculate the triple scalar product by finding a determinant.
Answer
8 units$^3$

Applications of the Cross Product


The cross product appears in many practical applications in mathematics, physics, and engineering. Let’s examine some of these
applications here, including the idea of torque, with which we began this section. Other applications show up in later chapters,
particularly in our study of vector fields such as gravitational and electromagnetic fields (Introduction to Vector Calculus).

 Example 12.4.12: Using the Triple Scalar Product

Use the triple scalar product to show that vectors $\vec{u} = \langle 2, 0, 5 \rangle$, $\vec{v} = \langle 2, 2, 4 \rangle$, and $\vec{w} = \langle 1, -1, 3 \rangle$ are coplanar, that is, show that these vectors lie in the same plane.

Solution
Start by calculating the triple scalar product to find the volume of the parallelepiped defined by $\vec{u}$, $\vec{v}$, and $\vec{w}$:

$$\begin{aligned}
\vec{u} \cdot (\vec{v} \times \vec{w}) &= \begin{vmatrix} 2 & 0 & 5 \\ 2 & 2 & 4 \\ 1 & -1 & 3 \end{vmatrix} \\
&= [2(2)(3) + (0)(4)(1) + 5(2)(-1)] - [5(2)(1) + (2)(4)(-1) + (0)(2)(3)] \\
&= 2 - 2 = 0.
\end{aligned}$$

The volume of the parallelepiped is 0 units$^3$, so one of the dimensions must be zero. Therefore, the three vectors all lie in the same plane.
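The coplanarity test used above can be sketched as a small predicate. The name `are_coplanar` and the tolerance parameter `tol` are assumptions of this sketch; the tolerance matters only for floating-point inputs, since integer inputs give an exact triple scalar product.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    u1, u2, u3 = u
    v1, v2, v3 = v
    return (u2 * v3 - u3 * v2, -(u1 * v3 - u3 * v1), u1 * v2 - u2 * v1)


def are_coplanar(u, v, w, tol=1e-12):
    """True when the triple scalar product u . (v x w) is zero (within tol)."""
    return abs(dot(u, cross(v, w))) <= tol


print(are_coplanar((2, 0, 5), (2, 2, 4), (1, -1, 3)))  # True
```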

 Exercise 12.4.12

Are the vectors $\vec{a} = \hat{i} + \hat{j} - \hat{k}$, $\vec{b} = \hat{i} - \hat{j} + \hat{k}$, and $\vec{c} = \hat{i} + \hat{j} + \hat{k}$ coplanar?

Hint
Calculate the triple scalar product.
Answer
No, the triple scalar product is −4 ≠ 0, so the three vectors form the adjacent edges of a parallelepiped. They are not
coplanar.

 Example 12.4.13: Finding an Orthogonal Vector

Only a single plane can pass through any set of three noncollinear points. Find a vector orthogonal to the plane containing points $P = (9, -3, -2)$, $Q = (1, 3, 0)$, and $R = (-2, 5, 0)$.
Solution
The plane must contain vectors $\overrightarrow{PQ}$ and $\overrightarrow{QR}$:

$$\overrightarrow{PQ} = \langle 1 - 9, 3 - (-3), 0 - (-2) \rangle = \langle -8, 6, 2 \rangle$$

$$\overrightarrow{QR} = \langle -2 - 1, 5 - 3, 0 - 0 \rangle = \langle -3, 2, 0 \rangle.$$

The cross product $\overrightarrow{PQ} \times \overrightarrow{QR}$ produces a vector orthogonal to both $\overrightarrow{PQ}$ and $\overrightarrow{QR}$. Therefore, the cross product is orthogonal to the plane that contains these two vectors:

$$\begin{aligned}
\overrightarrow{PQ} \times \overrightarrow{QR} &= \begin{vmatrix} \hat{i} & \hat{j} & \hat{k} \\ -8 & 6 & 2 \\ -3 & 2 & 0 \end{vmatrix} \\
&= 0\hat{i} - 6\hat{j} - 16\hat{k} - (-18\hat{k} + 4\hat{i} + 0\hat{j}) \\
&= -4\hat{i} - 6\hat{j} + 2\hat{k}.
\end{aligned}$$
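The same steps work for any three noncollinear points. In the sketch below, the function name `plane_normal` and the decision to raise an error for collinear points are assumptions made for illustration; it follows the example in using the vectors $\overrightarrow{PQ}$ and $\overrightarrow{QR}$.

```python
def sub(p, q):
    """Vector p - q."""
    return tuple(a - b for a, b in zip(p, q))


def cross(u, v):
    u1, u2, u3 = u
    v1, v2, v3 = v
    return (u2 * v3 - u3 * v2, -(u1 * v3 - u3 * v1), u1 * v2 - u2 * v1)


def plane_normal(P, Q, R):
    """Vector orthogonal to the plane through P, Q, R, computed as PQ x QR."""
    n = cross(sub(Q, P), sub(R, Q))
    if not any(n):
        # A zero cross product means PQ and QR are parallel.
        raise ValueError("points are collinear; they do not determine a plane")
    return n


print(plane_normal((9, -3, -2), (1, 3, 0), (-2, 5, 0)))  # (-4, -6, 2)
```

Any nonzero scalar multiple of the returned vector is also orthogonal to the plane.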

We have seen how to use the triple scalar product and how to find a vector orthogonal to a plane. Now we apply the cross product
to real-world situations.
Sometimes a force causes an object to rotate. For example, turning a screwdriver or a wrench creates this kind of rotational effect,
called torque.

 Definition: Torque

Torque, $\vec{\tau}$ (the Greek letter tau), measures the tendency of a force to produce rotation about an axis of rotation. Let $\vec{r}$ be a vector with an initial point located on the axis of rotation and with a terminal point located at the point where the force is applied, and let vector $\vec{F}$ represent the force. Then torque is equal to the cross product of $\vec{r}$ and $\vec{F}$:

$$\vec{\tau} = \vec{r} \times \vec{F}.$$

See Figure 12.4.9.

Figure 12.4.9 : Torque measures how a force causes an object to rotate.

Think about using a wrench to tighten a bolt. The torque τ applied to the bolt depends on how hard we push the wrench (force) and
how far up the handle we apply the force (distance). The torque increases with a greater force on the wrench at a greater distance
from the bolt. Common units of torque are the newton-meter or foot-pound. Although torque is dimensionally equivalent to work
(it has the same units), the two concepts are distinct. Torque is used specifically in the context of rotation, whereas work typically
involves motion along a line.

 Example 12.4.14: Evaluating Torque

A bolt is tightened by applying a force of 6 N to a 0.15-m wrench (Figure 12.4.10). The angle between the wrench and the
force vector is 40°. Find the magnitude of the torque about the center of the bolt. Round the answer to two decimal places.

Figure 12.4.10: Torque describes the twisting action of the wrench.


Solution:
Substitute the given information into the equation defining torque:

$$\begin{aligned}
\|\vec{\tau}\| &= \|\vec{r} \times \vec{F}\| \\
&= \|\vec{r}\| \|\vec{F}\| \sin\theta \\
&= (0.15 \text{ m})(6 \text{ N}) \sin 40° \\
&\approx 0.58 \text{ N⋅m}.
\end{aligned}$$
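The magnitude formula $\|\vec{\tau}\| = \|\vec{r}\| \|\vec{F}\| \sin\theta$ can be evaluated, or solved for the force, with a few lines of code. The function names and the choice of degrees for the angle argument are assumptions of this sketch; lengths are taken in meters and forces in newtons.

```python
import math


def torque_magnitude(r_len, force, angle_deg):
    """||tau|| = ||r|| ||F|| sin(theta), with theta given in degrees."""
    return r_len * force * math.sin(math.radians(angle_deg))


def force_for_torque(torque, r_len, angle_deg):
    """Solve ||tau|| = ||r|| ||F|| sin(theta) for ||F||."""
    s = math.sin(math.radians(angle_deg))
    if s == 0 or r_len == 0:
        raise ValueError("no finite force produces torque in this configuration")
    return torque / (r_len * s)


# 6 N applied to a 0.15-m wrench at 40 degrees, rounded to two decimal places.
print(round(torque_magnitude(0.15, 6, 40), 2))  # 0.58
```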

 Exercise 12.4.14

Calculate the force required to produce 15 N⋅m torque at an angle of 30° from a 150-cm rod.

Hint

∥ τ ∥ = 15 N⋅m and ∥ r ∥ = 1.5 m

Answer
20 N

Key Concepts
The cross product $\vec{u} \times \vec{v}$ of two vectors $\vec{u} = \langle u_1, u_2, u_3 \rangle$ and $\vec{v} = \langle v_1, v_2, v_3 \rangle$ is a vector orthogonal to both $\vec{u}$ and $\vec{v}$. Its length is given by $\|\vec{u} \times \vec{v}\| = \|\vec{u}\| \cdot \|\vec{v}\| \cdot \sin\theta$, where $\theta$ is the angle between $\vec{u}$ and $\vec{v}$. Its direction is given by the right-hand rule.
The algebraic formula for calculating the cross product of two vectors, $\vec{u} = \langle u_1, u_2, u_3 \rangle$ and $\vec{v} = \langle v_1, v_2, v_3 \rangle$, is
$$\vec{u} \times \vec{v} = (u_2 v_3 - u_3 v_2)\hat{i} - (u_1 v_3 - u_3 v_1)\hat{j} + (u_1 v_2 - u_2 v_1)\hat{k}.$$
The cross product satisfies the following properties for vectors $\vec{u}$, $\vec{v}$, and $\vec{w}$, and scalar $c$:
$\vec{u} \times \vec{v} = -(\vec{v} \times \vec{u})$
$\vec{u} \times (\vec{v} + \vec{w}) = \vec{u} \times \vec{v} + \vec{u} \times \vec{w}$
$c(\vec{u} \times \vec{v}) = (c\vec{u}) \times \vec{v} = \vec{u} \times (c\vec{v})$
$\vec{u} \times \vec{0} = \vec{0} \times \vec{u} = \vec{0}$
$\vec{v} \times \vec{v} = \vec{0}$
$\vec{u} \cdot (\vec{v} \times \vec{w}) = (\vec{u} \times \vec{v}) \cdot \vec{w}$
The cross product of vectors $\vec{u} = \langle u_1, u_2, u_3 \rangle$ and $\vec{v} = \langle v_1, v_2, v_3 \rangle$ is the determinant $\begin{vmatrix} \hat{i} & \hat{j} & \hat{k} \\ u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \end{vmatrix}$.
If vectors $\vec{u}$ and $\vec{v}$ form adjacent sides of a parallelogram, then the area of the parallelogram is given by $\|\vec{u} \times \vec{v}\|$.
The triple scalar product of vectors $\vec{u}$, $\vec{v}$, and $\vec{w}$ is $\vec{u} \cdot (\vec{v} \times \vec{w})$.
The volume of a parallelepiped with adjacent edges given by vectors $\vec{u}$, $\vec{v}$, and $\vec{w}$ is $V = |\vec{u} \cdot (\vec{v} \times \vec{w})|$.
If the triple scalar product of vectors $\vec{u}$, $\vec{v}$, and $\vec{w}$ is zero, then the vectors are coplanar. The converse is also true: If the vectors are coplanar, then their triple scalar product is zero.
The cross product can be used to identify a vector orthogonal to two given vectors or to a plane.
Torque $\vec{\tau}$ measures the tendency of a force to produce rotation about an axis of rotation. If force $\vec{F}$ is acting at a distance (displacement) $\vec{r}$ from the axis, then torque is equal to the cross product of $\vec{r}$ and $\vec{F}$: $\vec{\tau} = \vec{r} \times \vec{F}$.

Key Equations
The cross product of two vectors in terms of the unit vectors
$$\vec{u} \times \vec{v} = (u_2 v_3 - u_3 v_2)\hat{i} - (u_1 v_3 - u_3 v_1)\hat{j} + (u_1 v_2 - u_2 v_1)\hat{k}$$

Glossary
cross product
$\vec{u} \times \vec{v} = (u_2 v_3 - u_3 v_2)\hat{i} - (u_1 v_3 - u_3 v_1)\hat{j} + (u_1 v_2 - u_2 v_1)\hat{k}$, where $\vec{u} = \langle u_1, u_2, u_3 \rangle$ and $\vec{v} = \langle v_1, v_2, v_3 \rangle$

determinant
a real number associated with a square matrix
parallelepiped
a three-dimensional prism with six faces that are parallelograms
torque
the effect of a force that causes an object to rotate
triple scalar product
the dot product of a vector with the cross product of two other vectors: $\vec{u} \cdot (\vec{v} \times \vec{w})$

vector product
the cross product of two vectors

12.4: The Cross Product is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
12.4: The Cross Product by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0. Original source:
https://openstax.org/details/books/calculus-volume-1.

12.5: Equations of Lines and Planes
 Learning Objectives
Write the vector, parametric, and symmetric equations of a line through a given point in a given direction, and a line
through two given points.
Find the distance from a point to a given line.
Write the vector and scalar equations of a plane through a given point with a given normal.
Find the distance from a point to a given plane.
Find the angle between two planes.

By now, we are familiar with writing equations that describe a line in two dimensions. To write an equation for a line, we must
know two points on the line, or we must know the direction of the line and at least one point through which the line passes. In two
dimensions, we use the concept of slope to describe the orientation, or direction, of a line. In three dimensions, we describe the
direction of a line using a vector parallel to the line. In this section, we examine how to use equations to describe lines and planes
in space.

Equations for a Line in Space


Let’s first explore what it means for two vectors to be parallel. Recall that parallel vectors must have the same or opposite directions. If two nonzero vectors, $\vec{u}$ and $\vec{v}$, are parallel, we claim there must be a scalar, $k$, such that $\vec{u} = k\vec{v}$. If $\vec{u}$ and $\vec{v}$ have the same direction, simply choose

$$k = \frac{\|\vec{u}\|}{\|\vec{v}\|}.$$

If $\vec{u}$ and $\vec{v}$ have opposite directions, choose

$$k = -\frac{\|\vec{u}\|}{\|\vec{v}\|}.$$

Note that the converse holds as well. If $\vec{u} = k\vec{v}$ for some scalar $k$, then either $\vec{u}$ and $\vec{v}$ have the same direction $(k > 0)$ or opposite directions $(k < 0)$, so $\vec{u}$ and $\vec{v}$ are parallel. Therefore, two nonzero vectors $\vec{u}$ and $\vec{v}$ are parallel if and only if $\vec{u} = k\vec{v}$ for some scalar $k$. By convention, the zero vector $\vec{0}$ is considered to be parallel to all vectors.

12.5.1 https://math.libretexts.org/@go/page/4528
Figure 12.5.1: Vector $\vec{v}$ is the direction vector for $\overrightarrow{PQ}$.
As in two dimensions, we can describe a line in space using a point on the line and the direction of the line, or a parallel vector, which we call the direction vector (Figure 12.5.1). Let $L$ be a line in space passing through point $P(x_0, y_0, z_0)$. Let $\vec{v} = \langle a, b, c \rangle$ be a vector parallel to $L$. Then, for any point $Q(x, y, z)$ on line $L$, we know that $\overrightarrow{PQ}$ is parallel to $\vec{v}$. Thus, as we just discussed, there is a scalar, $t$, such that $\overrightarrow{PQ} = t\vec{v}$, which gives

$$\begin{aligned}
\overrightarrow{PQ} &= t\vec{v} \\
\langle x - x_0, y - y_0, z - z_0 \rangle &= t\langle a, b, c \rangle \\
\langle x - x_0, y - y_0, z - z_0 \rangle &= \langle ta, tb, tc \rangle.
\end{aligned} \tag{12.5.1}$$

Using vector operations, we can rewrite Equation 12.5.1:

$$\begin{aligned}
\langle x - x_0, y - y_0, z - z_0 \rangle &= \langle ta, tb, tc \rangle \\
\langle x, y, z \rangle - \langle x_0, y_0, z_0 \rangle &= t\langle a, b, c \rangle \\
\underbrace{\langle x, y, z \rangle}_{\vec{r}} &= \underbrace{\langle x_0, y_0, z_0 \rangle}_{\vec{r}_0} + t\underbrace{\langle a, b, c \rangle}_{\vec{v}}.
\end{aligned}$$

Setting $\vec{r} = \langle x, y, z \rangle$ and $\vec{r}_0 = \langle x_0, y_0, z_0 \rangle$, we now have the vector equation of a line:

$$\vec{r} = \vec{r}_0 + t\vec{v}. \tag{12.5.2}$$

Equating components, Equation 12.5.2 shows that the following equations are simultaneously true: $x - x_0 = ta$, $y - y_0 = tb$, and $z - z_0 = tc$. If we solve each of these equations for the component variables $x$, $y$, and $z$, we get a set of equations in which each variable is defined in terms of the parameter $t$ and that, together, describe the line. This set of three equations forms a set of parametric equations of a line:

$$x = x_0 + ta, \quad y = y_0 + tb, \quad z = z_0 + tc.$$

If we solve each of the equations for $t$ assuming $a$, $b$, and $c$ are nonzero, we get a different description of the same line:

$$\frac{x - x_0}{a} = t, \quad \frac{y - y_0}{b} = t, \quad \frac{z - z_0}{c} = t.$$

Because each expression equals $t$, they all have the same value. We can set them equal to each other to create symmetric equations of a line:

$$\frac{x - x_0}{a} = \frac{y - y_0}{b} = \frac{z - z_0}{c}.$$

We summarize the results in the following theorem.

 Theorem: Parametric and Symmetric Equations of a Line

A line $L$ parallel to vector $\vec{v} = \langle a, b, c \rangle$ and passing through point $P(x_0, y_0, z_0)$ can be described by the following parametric equations:

$$x = x_0 + ta, \quad y = y_0 + tb, \quad \text{and} \quad z = z_0 + tc.$$

If the constants $a$, $b$, and $c$ are all nonzero, then $L$ can be described by the symmetric equation of the line:

$$\frac{x - x_0}{a} = \frac{y - y_0}{b} = \frac{z - z_0}{c}.$$

The parametric equations of a line are not unique. Using a different parallel vector or a different point on the line leads to a
different, equivalent representation. Each set of parametric equations leads to a related set of symmetric equations, so it follows that
a symmetric equation of a line is not unique either.

 Example 12.5.1: Equations of a Line in Space

Find parametric and symmetric equations of the line passing through points (1, 4, −2) and (−3, 5, 0).
Solution
First, identify a vector parallel to the line:

$$\vec{v} = \langle -3 - 1, 5 - 4, 0 - (-2) \rangle = \langle -4, 1, 2 \rangle.$$

Use either of the given points on the line to complete the parametric equations:

$$x = 1 - 4t, \quad y = 4 + t, \quad \text{and} \quad z = -2 + 2t.$$

Solve each equation for $t$ to create the symmetric equation of the line:

$$\frac{x - 1}{-4} = y - 4 = \frac{z + 2}{2}.$$
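The steps in this example (find a direction vector, then write the parametric equations) can be sketched in code. The function names below are chosen for illustration. `symmetric_ratios` returns the three quotients from the symmetric equations, so it assumes every component of the direction vector is nonzero.

```python
def direction(p, q):
    """Direction vector q - p of the line through points p and q."""
    return tuple(b - a for a, b in zip(p, q))


def point_on_line(p, v, t):
    """Point with parameter t on the line r = p + t v."""
    return tuple(p_i + t * v_i for p_i, v_i in zip(p, v))


def symmetric_ratios(point, p, v):
    """The quotients (x - x0)/a, (y - y0)/b, (z - z0)/c; requires a, b, c nonzero."""
    return tuple((c - p_i) / v_i for c, p_i, v_i in zip(point, p, v))


p, q = (1, 4, -2), (-3, 5, 0)
v = direction(p, q)                  # (-4, 1, 2)
print(point_on_line(p, v, 1) == q)   # t = 1 recovers the second point

# For a point on the line, all three quotients equal the parameter t.
print(symmetric_ratios(point_on_line(p, v, 3), p, v))
```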

 Exercise 12.5.1
Find parametric and symmetric equations of the line passing through points (1, −3, 2) and (5, −2, 8).

Hint:
Start by finding a vector parallel to the line.

Answer
Possible set of parametric equations: $x = 1 + 4t$, $y = -3 + t$, $z = 2 + 6t$; related set of symmetric equations: $\dfrac{x - 1}{4} = y + 3 = \dfrac{z - 2}{6}$

Sometimes we don’t want the equation of a whole line, just a line segment. In this case, we limit the values of our parameter $t$. For example, let $P(x_0, y_0, z_0)$ and $Q(x_1, y_1, z_1)$ be points on a line, and let $\vec{p} = \langle x_0, y_0, z_0 \rangle$ and $\vec{q} = \langle x_1, y_1, z_1 \rangle$ be the associated position vectors. In addition, let $\vec{r} = \langle x, y, z \rangle$. We want to find a vector equation for the line segment between $P$ and $Q$. Using $P$ as our known point on the line, and $\overrightarrow{PQ} = \langle x_1 - x_0, y_1 - y_0, z_1 - z_0 \rangle$ as the direction vector, Equation 12.5.2 gives

$$\vec{r} = \vec{p} + t(\overrightarrow{PQ}). \tag{12.5.3}$$

Equation 12.5.3 can be expanded using properties of vectors:

$$\begin{aligned}
\vec{r} &= \vec{p} + t(\overrightarrow{PQ}) \\
&= \langle x_0, y_0, z_0 \rangle + t\langle x_1 - x_0, y_1 - y_0, z_1 - z_0 \rangle \\
&= \langle x_0, y_0, z_0 \rangle + t(\langle x_1, y_1, z_1 \rangle - \langle x_0, y_0, z_0 \rangle) \\
&= \langle x_0, y_0, z_0 \rangle + t\langle x_1, y_1, z_1 \rangle - t\langle x_0, y_0, z_0 \rangle \\
&= (1 - t)\langle x_0, y_0, z_0 \rangle + t\langle x_1, y_1, z_1 \rangle \\
&= (1 - t)\vec{p} + t\vec{q}.
\end{aligned}$$

Thus, the vector equation of the line passing through $P$ and $Q$ is

$$\vec{r} = (1 - t)\vec{p} + t\vec{q}.$$

Remember that we did not want the equation of the whole line, just the line segment between $P$ and $Q$. Notice that when $t = 0$, we have $\vec{r} = \vec{p}$, and when $t = 1$, we have $\vec{r} = \vec{q}$. Therefore, the vector equation of the line segment between $P$ and $Q$ is

$$\vec{r} = (1 - t)\vec{p} + t\vec{q}, \quad 0 \le t \le 1.$$

Going back to Equation 12.5.2, we can also find parametric equations for this line segment. We have

$$\begin{aligned}
\vec{r} &= \vec{p} + t(\overrightarrow{PQ}) \\
\langle x, y, z \rangle &= \langle x_0, y_0, z_0 \rangle + t\langle x_1 - x_0, y_1 - y_0, z_1 - z_0 \rangle \\
&= \langle x_0 + t(x_1 - x_0),\ y_0 + t(y_1 - y_0),\ z_0 + t(z_1 - z_0) \rangle.
\end{aligned}$$

Then, the parametric equations are

$$x = x_0 + t(x_1 - x_0), \quad y = y_0 + t(y_1 - y_0), \quad z = z_0 + t(z_1 - z_0), \quad 0 \le t \le 1. \tag{12.5.4}$$

 Example 12.5.2: Parametric Equations of a Line Segment

Find parametric equations of the line segment between the points P (2, 1, 4) and Q(3, −1, 3).
Solution
Start with the parametric equations for a line (Equations 12.5.4) and work with each component separately:

x = x0 + t(x1 − x0 )

= 2 + t(3 − 2)

= 2 + t,

y = y0 + t(y1 − y0 )

= 1 + t(−1 − 1)

= 1 − 2t,

and
z = z0 + t(z1 − z0 )

= 4 + t(3 − 4)

= 4 − t.

Therefore, the parametric equations for the line segment are


x = 2 +t

y = 1 − 2t

z = 4 − t, 0 ≤ t ≤ 1.
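The segment equation $\vec{r} = (1 - t)\vec{p} + t\vec{q}$ with $0 \le t \le 1$ can be sketched as a function of the two endpoints and the parameter. The name `segment_point` and the decision to reject values of `t` outside the interval are assumptions of this sketch.

```python
def segment_point(p, q, t):
    """Point (1 - t) p + t q on the segment from p to q, for 0 <= t <= 1."""
    if not 0 <= t <= 1:
        raise ValueError("t must satisfy 0 <= t <= 1 to stay on the segment")
    return tuple((1 - t) * a + t * b for a, b in zip(p, q))


P, Q = (2, 1, 4), (3, -1, 3)
print(segment_point(P, Q, 0))    # the endpoint P
print(segment_point(P, Q, 1))    # the endpoint Q
print(segment_point(P, Q, 0.5))  # the midpoint of the segment
```

Expanding $(1 - t)\vec{p} + t\vec{q}$ component by component gives the same parametric equations found in the example.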

 Exercise 12.5.2

Find parametric equations of the line segment between points P (−1, 3, 6) and Q(−8, 2, 4).

Answer
x = −1 − 7t, y = 3 − t, z = 6 − 2t, 0 ≤t ≤1

Distance between a Point and a Line


We already know how to calculate the distance between two points in space. We now expand this definition to describe the distance between a point and a line in space. There are several real-world contexts in which it is important to be able to calculate these distances. When building a home, for example, builders must consider “setback” requirements, under which structures or fixtures have to be a certain distance from the property line. Air travel offers another example. Airlines are concerned about the distances between populated areas and proposed flight paths.
Let L be a line in the plane and let M be any point not on the line. Then, we define distance d from M to L as the length of line segment \(\overline{MP}\), where P is a point on L such that \(\overline{MP}\) is perpendicular to L (Figure 12.5.2).

Figure 12.5.2: The distance from point M to line L is the length of \(\overline{MP}\).

When we’re looking for the distance between a line and a point in space, Figure 12.5.2 still applies. We still define the distance as
the length of the perpendicular line segment connecting the point to the line. In space, however, there is no clear way to know
which point on the line creates such a perpendicular line segment, so we select an arbitrary point on the line and use properties of
vectors to calculate the distance. Therefore, let P be an arbitrary point on line L and let \(\vec{v}\) be a direction vector for L (Figure 12.5.3).

Figure 12.5.3: Vectors \(\overrightarrow{PM}\) and \(\vec{v}\) form two sides of a parallelogram with base \(\|\vec{v}\|\) and height d, which is the distance between a line and a point in space.

Vectors \(\overrightarrow{PM}\) and \(\vec{v}\) form two sides of a parallelogram with area \(\|\overrightarrow{PM} \times \vec{v}\|\). Using a formula from geometry, the area of this parallelogram can also be calculated as the product of its base and height:

\[ \|\overrightarrow{PM} \times \vec{v}\| = \|\vec{v}\|\,d. \]

We can use this formula to find a general formula for the distance between a line in space and any point not on the line.

Distance from a Point to a Line

Let L be a line in space passing through point P with direction vector \(\vec{v}\). If M is any point not on L, then the distance from M to L is

\[ d = \frac{\|\overrightarrow{PM} \times \vec{v}\|}{\|\vec{v}\|}. \]

Example 12.5.3: Calculating the Distance from a Point to a Line

Find the distance between the point M = (1, 1, 3) and line \(\dfrac{x - 3}{4} = \dfrac{y + 1}{2} = z - 3\).

Solution:
From the symmetric equations of the line, we know that vector \(\vec{v} = \langle 4, 2, 1\rangle\) is a direction vector for the line. Setting the symmetric equations of the line equal to zero, we see that point P(3, −1, 3) lies on the line. Then,

\[ \overrightarrow{PM} = \langle 1 - 3,\; 1 - (-1),\; 3 - 3\rangle = \langle -2, 2, 0\rangle. \]

To calculate the distance, we need to find \(\overrightarrow{PM} \times \vec{v}\):

\[
\begin{aligned}
\overrightarrow{PM} \times \vec{v} &=
\begin{vmatrix} \hat{i} & \hat{j} & \hat{k} \\ -2 & 2 & 0 \\ 4 & 2 & 1 \end{vmatrix} \\
&= (2 - 0)\hat{i} - (-2 - 0)\hat{j} + (-4 - 8)\hat{k} \\
&= 2\hat{i} + 2\hat{j} - 12\hat{k}.
\end{aligned}
\]

Therefore, the distance between the point and the line is (Figure 12.5.4)

\[
d = \frac{\|\overrightarrow{PM} \times \vec{v}\|}{\|\vec{v}\|}
= \frac{\sqrt{2^2 + 2^2 + 12^2}}{\sqrt{4^2 + 2^2 + 1^2}}
= \frac{2\sqrt{38}}{\sqrt{21}}
= \frac{2\sqrt{798}}{21} \text{ units.}
\]

Figure 12.5.4: Point (1, 1, 3) is approximately 2.7 units from the line with symmetric equations \(\dfrac{x - 3}{4} = \dfrac{y + 1}{2} = z - 3\).

Exercise 12.5.3

Find the distance between point (0, 3, 6) and the line with parametric equations x = 1 − t, y = 1 + 2t, z = 5 + 3t.

Hint
Find a vector with initial point (0, 3, 6) and a terminal point on the line, and then find a direction vector for the line.

Answer
\( \sqrt{\dfrac{10}{7}} = \dfrac{\sqrt{70}}{7} \) units
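The formula \(d = \|\overrightarrow{PM} \times \vec{v}\| / \|\vec{v}\|\) translates directly into a few lines of code. The sketch below is an illustration only; the helper names `cross`, `norm`, and `point_line_distance` are assumptions made for this example, and it uses plain tuples rather than any particular vector library. It reproduces the results of Example 12.5.3 and Exercise 12.5.3.

```python
import math


def cross(u, v):
    """Cross product of two 3-vectors given as tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def norm(u):
    """Euclidean length of a vector."""
    return math.sqrt(sum(c * c for c in u))


def point_line_distance(m, p, v):
    """Distance from point m to the line through p with direction v."""
    pm = tuple(mi - pi for mi, pi in zip(m, p))  # vector from P to M
    return norm(cross(pm, v)) / norm(v)


# Example 12.5.3: M = (1, 1, 3), P = (3, -1, 3), v = <4, 2, 1>
d_example = point_line_distance((1, 1, 3), (3, -1, 3), (4, 2, 1))
# Exercise 12.5.3: point (0, 3, 6), line through (1, 1, 5) with v = <-1, 2, 3>
d_exercise = point_line_distance((0, 3, 6), (1, 1, 5), (-1, 2, 3))
print(d_example)   # equals 2*sqrt(798)/21, about 2.69
print(d_exercise)  # equals sqrt(70)/7, about 1.20
```

Any point on the line can serve as P; the cross product changes, but its length divided by \(\|\vec{v}\|\) does not.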

Relationships between Lines


Given two lines in the two-dimensional plane, the lines are equal, they are parallel but not equal, or they intersect in a single point.
In three dimensions, a fourth case is possible. If two lines in space are not parallel, but do not intersect, then the lines are said to be
skew lines (Figure 12.5.5).

Figure 12.5.5: In three dimensions, it is possible that two lines do not cross, even when they have different directions.
To classify lines as parallel but not equal, equal, intersecting, or skew, we need to know two things: whether the direction vectors
are parallel and whether the lines share a point (Figure 12.5.6).

Figure 12.5.6 : Determine the relationship between two lines based on whether their direction vectors are parallel and whether they
share a point.

Example 12.5.4: Classifying Lines in Space

For each pair of lines, determine whether the lines are equal, parallel but not equal, skew, or intersecting.
a. \(L_1: x = 2s - 1,\; y = s - 1,\; z = s - 4\)
   \(L_2: x = t - 3,\; y = 3t + 8,\; z = 5 - 2t\)

b. \(L_1: x = -y = z\)
   \(L_2: \dfrac{x - 3}{2} = y = z - 2\)

c. \(L_1: x = 6s - 1,\; y = -2s,\; z = 3s + 1\)
   \(L_2: \dfrac{x - 4}{6} = \dfrac{y + 3}{-2} = \dfrac{z - 1}{3}\)

Solution
a. Line \(L_1\) has direction vector \(\vec{v}_1 = \langle 2, 1, 1\rangle\); line \(L_2\) has direction vector \(\vec{v}_2 = \langle 1, 3, -2\rangle\). Because the direction vectors are not parallel vectors, the lines are either intersecting or skew. To determine whether the lines intersect, we see if there is a point, (x, y, z), that lies on both lines. To find this point, we use the parametric equations to create a system of equalities:

\[ 2s - 1 = t - 3; \qquad s - 1 = 3t + 8; \qquad s - 4 = 5 - 2t. \]

By the first equation, t = 2s + 2. Substituting into the second equation yields

\[
\begin{aligned}
s - 1 &= 3(2s + 2) + 8 \\
s - 1 &= 6s + 6 + 8 \\
5s &= -15 \\
s &= -3.
\end{aligned}
\]

Substitution into the third equation, however, yields a contradiction:

\[
\begin{aligned}
s - 4 &= 5 - 2(2s + 2) \\
s - 4 &= 5 - 4s - 4 \\
5s &= 5 \\
s &= 1.
\end{aligned}
\]

There is no single point that satisfies the parametric equations for \(L_1\) and \(L_2\) simultaneously. These lines do not intersect, so they are skew (see the following figure).

b. Line \(L_1\) has direction vector \(\vec{v}_1 = \langle 1, -1, 1\rangle\) and passes through the origin, (0, 0, 0). Line \(L_2\) has a different direction vector, \(\vec{v}_2 = \langle 2, 1, 1\rangle\), so these lines are not parallel or equal. Let r represent the parameter for line \(L_1\) and let s represent the parameter for \(L_2\):

\[
\begin{array}{ll}
\text{Line } L_1: & \text{Line } L_2: \\
x = r & x = 2s + 3 \\
y = -r & y = s \\
z = r & z = s + 2
\end{array}
\]

Solve the system of equations to find r = 1 and s = −1. If we need to find the point of intersection, we can substitute these parameters into the original equations to get (1, −1, 1) (see the following figure).

c. Lines \(L_1\) and \(L_2\) have equivalent direction vectors: \(\vec{v} = \langle 6, -2, 3\rangle\), so they are either equal or parallel but not equal. Point (−1, 0, 1) lies on \(L_1\) (take s = 0) but does not satisfy the symmetric equations of \(L_2\), since \(\frac{-1 - 4}{6} \neq \frac{0 + 3}{-2}\). These two lines are therefore parallel but not equal (see the following figure).

Exercise 12.5.4
Describe the relationship between the lines with the following parametric equations:

x = 1 − 4t, y = 3 + t, z = 8 − 6t

x = 2 + 3s, y = 2s, z = −1 − 3s.

Hint

Start by identifying direction vectors for each line. Is one a multiple of the other?

Answer
These lines are skew because their direction vectors are not parallel and there is no point (x, y, z) that lies on both lines.
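The two questions used to classify a pair of lines (are the direction vectors parallel, and do the lines share a point?) can be answered with cross and dot products. The sketch below is one possible approach, not the method used in the worked solutions above: instead of solving the system of parametric equations, it tests whether the vector joining the two given points is orthogonal to \(\vec{v}_1 \times \vec{v}_2\), which holds exactly when two nonparallel lines lie in a common plane and therefore intersect. The function names and the tolerance `tol` are assumptions made for illustration.

```python
def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def is_zero(u, tol=1e-9):
    return all(abs(c) <= tol for c in u)


def classify_lines(p1, v1, p2, v2, tol=1e-9):
    """Classify the lines p1 + s*v1 and p2 + t*v2 in space."""
    w = sub(p2, p1)    # vector joining a point on each line
    n = cross(v1, v2)  # zero exactly when the directions are parallel
    if is_zero(n, tol):
        # Parallel directions: the lines coincide if w is also parallel to v1.
        return "equal" if is_zero(cross(w, v1), tol) else "parallel but not equal"
    # Nonparallel directions: the lines meet only if they are coplanar.
    return "intersecting" if abs(dot(w, n)) <= tol else "skew"


# The three pairs from Example 12.5.4
print(classify_lines((-1, -1, -4), (2, 1, 1), (-3, 8, 5), (1, 3, -2)))  # skew
print(classify_lines((0, 0, 0), (1, -1, 1), (3, 0, 2), (2, 1, 1)))      # intersecting
print(classify_lines((-1, 0, 1), (6, -2, 3), (4, -3, 1), (6, -2, 3)))   # parallel but not equal
```

The coplanarity test says only whether the lines intersect; finding the point of intersection still requires solving for the parameters as in part b of the example.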

Equations for a Plane


We know that a line is determined by two points. In other words, for any two distinct points, there is exactly one line that passes
through those points, whether in two dimensions or three. Similarly, given any three points that do not all lie on the same line, there
is a unique plane that passes through these points. Just as a line is determined by two points, a plane is determined by three.
This may be the simplest way to characterize a plane, but we can use other descriptions as well. For example, given two distinct,
intersecting lines, there is exactly one plane containing both lines. A plane is also determined by a line and any point that does not
lie on the line. These characterizations arise naturally from the idea that a plane is determined by three points. Perhaps the most
surprising characterization of a plane is actually the most useful.
Imagine a pair of orthogonal vectors that share an initial point. Visualize grabbing one of the vectors and twisting it. As you twist, the other vector spins around and sweeps out a plane. Here, we describe that concept mathematically. Let \(\vec{n} = \langle a, b, c\rangle\) be a vector and \(P = (x_0, y_0, z_0)\) be a point. Then the set of all points \(Q = (x, y, z)\) such that \(\overrightarrow{PQ}\) is orthogonal to \(\vec{n}\) forms a plane (Figure 12.5.7). We say that \(\vec{n}\) is a normal vector, or perpendicular to the plane. Remember, the dot product of orthogonal vectors is zero. This fact generates the vector equation of a plane:

\[ \vec{n} \cdot \overrightarrow{PQ} = 0. \]

Rewriting this equation provides additional ways to describe the plane:

\[
\begin{aligned}
\vec{n} \cdot \overrightarrow{PQ} &= 0 \\
\langle a, b, c\rangle \cdot \langle x - x_0,\; y - y_0,\; z - z_0\rangle &= 0 \\
a(x - x_0) + b(y - y_0) + c(z - z_0) &= 0.
\end{aligned}
\]

Figure 12.5.7: Given a point P and vector \(\vec{n}\), the set of all points Q with \(\overrightarrow{PQ}\) orthogonal to \(\vec{n}\) forms a plane.

Definition: Scalar Equation of a Plane

Given a point P and vector \(\vec{n}\), the set of all points Q satisfying the equation \(\vec{n} \cdot \overrightarrow{PQ} = 0\) forms a plane. The equation

\[ \vec{n} \cdot \overrightarrow{PQ} = 0 \]

is known as the vector equation of a plane.

The scalar equation of a plane (sometimes also called the standard equation of a plane) containing point \(P = (x_0, y_0, z_0)\) with normal vector \(\vec{n} = \langle a, b, c\rangle\) is

\[ a(x - x_0) + b(y - y_0) + c(z - z_0) = 0. \]

This equation can be expressed as \(ax + by + cz + d = 0\), where \(d = -ax_0 - by_0 - cz_0\). This form of the equation is sometimes called the general form of the equation of a plane.

As described earlier in this section, any three points that do not all lie on the same line determine a plane. Given three such points,
we can find an equation for the plane containing these points.

Example 12.5.5: Writing an Equation of a Plane Given Three Points in the Plane

Write an equation for the plane containing points P = (1, 1, −2), Q = (0, 2, 1), and R = (−1, −1, 0) in both standard and general forms.
Solution
To write an equation for a plane, we must find a normal vector for the plane. We start by identifying two vectors in the plane:

\[
\begin{aligned}
\overrightarrow{PQ} &= \langle 0 - 1,\; 2 - 1,\; 1 - (-2)\rangle = \langle -1, 1, 3\rangle \\
\overrightarrow{QR} &= \langle -1 - 0,\; -1 - 2,\; 0 - 1\rangle = \langle -1, -3, -1\rangle.
\end{aligned}
\]

The cross product \(\overrightarrow{PQ} \times \overrightarrow{QR}\) is orthogonal to both \(\overrightarrow{PQ}\) and \(\overrightarrow{QR}\), so it is normal to the plane that contains these two vectors:

\[
\begin{aligned}
\vec{n} &= \overrightarrow{PQ} \times \overrightarrow{QR} \\
&= \begin{vmatrix} \hat{i} & \hat{j} & \hat{k} \\ -1 & 1 & 3 \\ -1 & -3 & -1 \end{vmatrix} \\
&= (-1 + 9)\hat{i} - (1 + 3)\hat{j} + (3 + 1)\hat{k} \\
&= 8\hat{i} - 4\hat{j} + 4\hat{k}.
\end{aligned}
\]

Thus, \(\vec{n} = \langle 8, -4, 4\rangle\), and we can choose any of the three given points to write an equation of the plane:

\[
\begin{aligned}
8(x - 1) - 4(y - 1) + 4(z + 2) &= 0 \\
8x - 4y + 4z + 4 &= 0.
\end{aligned}
\]

The scalar equations of a plane vary depending on the normal vector and point chosen.
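The procedure in Example 12.5.5 (form two vectors in the plane, take their cross product, then substitute a point) can be sketched in code. This is a minimal illustration under the assumption that points are given as integer or float tuples; the function name `plane_through_points` is hypothetical. It returns the coefficients (a, b, c, d) of the general form ax + by + cz + d = 0.

```python
def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def plane_through_points(p, q, r):
    """Return (a, b, c, d) with a*x + b*y + c*z + d = 0 through p, q, r."""
    n = cross(sub(q, p), sub(r, q))  # normal vector PQ x QR
    if n == (0, 0, 0):
        raise ValueError("the three points lie on one line")
    a, b, c = n
    d = -(a * p[0] + b * p[1] + c * p[2])
    return a, b, c, d


# Example 12.5.5: P = (1, 1, -2), Q = (0, 2, 1), R = (-1, -1, 0)
print(plane_through_points((1, 1, -2), (0, 2, 1), (-1, -1, 0)))  # (8, -4, 4, 4)
```

Listing the points in a different order may flip the sign of the normal vector or change which point is substituted, but every result describes the same plane, which is the sense in which the scalar equations vary.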

Example 12.5.6: Writing an Equation for a Plane Given a Point and a Line

Find an equation of the plane that passes through point (1, 4, 3) and contains the line given by \(x = \dfrac{y - 1}{2} = z + 1\).

Solution
Symmetric equations describe the line that passes through point (0, 1, −1) parallel to vector \(\vec{v}_1 = \langle 1, 2, 1\rangle\) (see the following figure). Use this point and the given point, (1, 4, 3), to identify a second vector parallel to the plane:

\[ \vec{v}_2 = \langle 1 - 0,\; 4 - 1,\; 3 - (-1)\rangle = \langle 1, 3, 4\rangle. \]

Use the cross product of these vectors to identify a normal vector for the plane:

\[
\begin{aligned}
\vec{n} &= \vec{v}_1 \times \vec{v}_2 \\
&= \begin{vmatrix} \hat{i} & \hat{j} & \hat{k} \\ 1 & 2 & 1 \\ 1 & 3 & 4 \end{vmatrix} \\
&= (8 - 3)\hat{i} - (4 - 1)\hat{j} + (3 - 2)\hat{k} \\
&= 5\hat{i} - 3\hat{j} + \hat{k}.
\end{aligned}
\]

The scalar equations for the plane are \(5x - 3(y - 1) + (z + 1) = 0\) and \(5x - 3y + z + 4 = 0\).

Exercise 12.5.6

Find an equation of the plane containing the lines \(L_1\) and \(L_2\):

\(L_1: x = -y = z\)

\(L_2: \dfrac{x - 3}{2} = y = z - 2.\)

Hint
The cross product of the lines’ direction vectors gives a normal vector for the plane.
Answer
\( -2(x - 1) + (y + 1) + 3(z - 1) = 0 \)

or

\( -2x + y + 3z = 0 \)

Now that we can write an equation for a plane, we can use the equation to find the distance d between a point P and the plane. It is
defined as the shortest possible distance from P to a point on the plane.

Figure 12.5.8: We want to find the shortest distance from point P to the plane. Let point R be the point in the plane such that, for any other point in the plane Q, \(\|\overrightarrow{RP}\| < \|\overrightarrow{QP}\|\).
Just as we find the two-dimensional distance between a point and a line by calculating the length of a line segment perpendicular to the line, we find the three-dimensional distance between a point and a plane by calculating the length of a line segment perpendicular to the plane. Let R be the point in the plane such that \(\overrightarrow{RP}\) is orthogonal to the plane, and let Q be an arbitrary point in the plane. Then the projection of vector \(\overrightarrow{QP}\) onto the normal vector describes vector \(\overrightarrow{RP}\), as shown in Figure 12.5.8.

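The Distance between a Plane and a Point

Suppose a plane with normal vector \(\vec{n}\) passes through point Q. The distance d from the plane to a point P not in the plane is given by

\[
d = \left\|\operatorname{proj}_{\vec{n}} \overrightarrow{QP}\right\|
= \left|\operatorname{comp}_{\vec{n}} \overrightarrow{QP}\right|
= \frac{\left|\overrightarrow{QP} \cdot \vec{n}\right|}{\|\vec{n}\|}. \tag{12.5.5}
\]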

Example 12.5.7: Distance between a Point and a Plane

Find the distance between point P = (3, 1, 2) and the plane given by x − 2y + z = 5 (see the following figure).

Solution
The coefficients of the plane’s equation provide a normal vector for the plane: \(\vec{n} = \langle 1, -2, 1\rangle\). To find vector \(\overrightarrow{QP}\), we need a point in the plane. Any point will work, so set y = z = 0 to see that point Q = (5, 0, 0) lies in the plane. Find the component form of the vector from Q to P:

\[ \overrightarrow{QP} = \langle 3 - 5,\; 1 - 0,\; 2 - 0\rangle = \langle -2, 1, 2\rangle. \]

Apply the distance formula from Equation 12.5.5:

\[
\begin{aligned}
d &= \frac{\left|\overrightarrow{QP} \cdot \vec{n}\right|}{\|\vec{n}\|} \\
&= \frac{|\langle -2, 1, 2\rangle \cdot \langle 1, -2, 1\rangle|}{\sqrt{1^2 + (-2)^2 + 1^2}} \\
&= \frac{|-2 - 2 + 2|}{\sqrt{6}} \\
&= \frac{2}{\sqrt{6}} = \frac{\sqrt{6}}{3} \text{ units.}
\end{aligned}
\]

Exercise 12.5.7

Find the distance between point P = (5, −1, 0) and the plane given by 4x + 2y − z = 3.

Hint
Point (0, 0, −3) lies on the plane.
Answer
\( \dfrac{15}{\sqrt{21}} = \dfrac{5\sqrt{21}}{7} \) units

Parallel and Intersecting Planes


We have discussed the various possible relationships between two lines in two dimensions and three dimensions. When we
describe the relationship between two planes in space, we have only two possibilities: the two distinct planes are parallel or they
intersect. When two planes are parallel, their normal vectors are parallel. When two planes intersect, the intersection is a line
(Figure 12.5.9).

Figure 12.5.9 : The intersection of two nonparallel planes is always a line.


We can use the equations of the two planes to find parametric equations for the line of intersection.

Example 12.5.8: Finding the Line of Intersection for Two Planes

Find parametric and symmetric equations for the line formed by the intersection of the planes given by x + y + z = 0 and 2x − y + z = 0 (see the following figure).

Solution
Note that the two planes have nonparallel normals, so the planes intersect. Further, the origin satisfies each equation, so we know the line of intersection passes through the origin. Add the plane equations so we can eliminate one of the variables, in this case, y:

\[
\begin{array}{rcl}
x + y + z &=& 0 \\
2x - y + z &=& 0 \\
\hline
3x + 2z &=& 0.
\end{array}
\]

This gives us \(x = -\frac{2}{3}z\). We substitute this value into the first equation to express y in terms of z:

\[
\begin{aligned}
x + y + z &= 0 \\
-\tfrac{2}{3}z + y + z &= 0 \\
y + \tfrac{1}{3}z &= 0 \\
y &= -\tfrac{1}{3}z.
\end{aligned}
\]

We now have the first two variables, x and y, in terms of the third variable, z. Now we define z in terms of t. To eliminate the need for fractions, we choose to define the parameter t as \(t = -\frac{1}{3}z\). Then, z = −3t. Substituting the parametric representation of z back into the other two equations, we see that the parametric equations for the line of intersection are x = 2t, y = t, z = −3t. The symmetric equations for the line are \(\dfrac{x}{2} = y = \dfrac{z}{-3}\).

Exercise 12.5.8

Find parametric equations for the line formed by the intersection of planes x + y − z = 3 and 3x − y + 3z = 5.

Hint
Add the two equations, then express z in terms of x . Then, express y in terms of x .
Answer
x = t, y = 7 − 3t, z = 4 − 2t
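The line of intersection can also be found with vector tools rather than elimination. The sketch below uses a different technique from the worked example: the direction of the line is taken as the cross product of the two normal vectors (a vector orthogonal to both normals lies in both planes), and a point on the line is found by setting one coordinate to zero and solving the remaining 2 × 2 system with Cramer's rule. The function name and the convention that a plane is the tuple (a, b, c, d) meaning ax + by + cz = d are assumptions made for this illustration.

```python
def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def plane_intersection_line(plane1, plane2):
    """Return (point, direction) for the line where two planes meet.

    Each plane is (a, b, c, d), meaning a*x + b*y + c*z = d.
    Returns None when the normal vectors are parallel.
    """
    n1, d1 = plane1[:3], plane1[3]
    n2, d2 = plane2[:3], plane2[3]
    direction = cross(n1, n2)
    if all(abs(c) < 1e-12 for c in direction):
        return None
    # Zero out the coordinate along which the line moves fastest; the
    # remaining 2x2 system is then guaranteed to have a unique solution.
    k = max(range(3), key=lambda idx: abs(direction[idx]))
    i, j = [idx for idx in range(3) if idx != k]
    det = n1[i] * n2[j] - n1[j] * n2[i]
    point = [0.0, 0.0, 0.0]
    point[i] = (d1 * n2[j] - n1[j] * d2) / det
    point[j] = (n1[i] * d2 - n2[i] * d1) / det
    return tuple(point), direction


# Example 12.5.8: x + y + z = 0 and 2x - y + z = 0
print(plane_intersection_line((1, 1, 1, 0), (2, -1, 1, 0)))
# Exercise 12.5.8: x + y - z = 3 and 3x - y + 3z = 5
print(plane_intersection_line((1, 1, -1, 3), (3, -1, 3, 5)))
```

For Example 12.5.8 the direction is ⟨2, 1, −3⟩, matching x = 2t, y = t, z = −3t. For Exercise 12.5.8 the direction ⟨2, −6, −4⟩ is a multiple of ⟨1, −3, −2⟩, and the point returned lies on the line x = t, y = 7 − 3t, z = 4 − 2t at a different parameter value, so the parametrization differs from the stated answer while describing the same line.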

In addition to finding the equation of the line of intersection between two planes, we may need to find the angle formed by the
intersection of two planes. For example, builders constructing a house need to know the angle where different sections of the roof
meet to know whether the roof will look good and drain properly. We can use normal vectors to calculate the angle between the two
planes. We can do this because the angle between the normal vectors is the same as the angle between the planes. Figure 12.5.10
shows why this is true.

Figure 12.5.10: The angle between two planes has the same measure as the angle between the normal vectors for the planes.
We can find the measure of the angle θ between two intersecting planes by first finding the cosine of the angle, using the following equation:

\[ \cos\theta = \frac{|\vec{n}_1 \cdot \vec{n}_2|}{\|\vec{n}_1\|\,\|\vec{n}_2\|}. \]

We can then use the angle to determine whether two planes are parallel or orthogonal or if they intersect at some other angle.

Example 12.5.9: Finding the Angle between Two Planes

Determine whether each pair of planes is parallel, orthogonal, or neither. If the planes are intersecting, but not orthogonal, find the measure of the angle between them. Give the answer in radians and round to two decimal places.
a. x + 2y − z = 8 and 2x + 4y − 2z = 10
b. 2x − 3y + 2z = 3 and 6x + 2y − 3z = 1
c. x + y + z = 4 and x − 3y + 5z = 1
Solution:
a. The normal vectors for these planes are \(\vec{n}_1 = \langle 1, 2, -1\rangle\) and \(\vec{n}_2 = \langle 2, 4, -2\rangle\). These two vectors are scalar multiples of each other. The normal vectors are parallel, so the planes are parallel.
b. The normal vectors for these planes are \(\vec{n}_1 = \langle 2, -3, 2\rangle\) and \(\vec{n}_2 = \langle 6, 2, -3\rangle\). Taking the dot product of these vectors, we have

\[
\begin{aligned}
\vec{n}_1 \cdot \vec{n}_2 &= \langle 2, -3, 2\rangle \cdot \langle 6, 2, -3\rangle \\
&= 2(6) - 3(2) + 2(-3) = 0.
\end{aligned}
\]

The normal vectors are orthogonal, so the corresponding planes are orthogonal as well.
c. The normal vectors for these planes are \(\vec{n}_1 = \langle 1, 1, 1\rangle\) and \(\vec{n}_2 = \langle 1, -3, 5\rangle\):

\[
\begin{aligned}
\cos\theta &= \frac{|\vec{n}_1 \cdot \vec{n}_2|}{\|\vec{n}_1\|\,\|\vec{n}_2\|} \\
&= \frac{|\langle 1, 1, 1\rangle \cdot \langle 1, -3, 5\rangle|}{\sqrt{1^2 + 1^2 + 1^2}\,\sqrt{1^2 + (-3)^2 + 5^2}} \\
&= \frac{3}{\sqrt{105}}.
\end{aligned}
\]

Then \(\theta = \arccos\dfrac{3}{\sqrt{105}} \approx 1.27\) rad.

Thus the angle between the two planes is about 1.27 rad, or approximately 73°.

Exercise 12.5.9

Find the measure of the angle between planes x + y − z = 3 and 3x − y + 3z = 5. Give the answer in radians and round to
two decimal places.

Hint
Use the coefficients of the variables in each equation to find a normal vector for each plane.

Answer
1.44 rad
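The angle formula lends itself to a quick numerical check. The following sketch is an illustration only; the function name `angle_between_planes` is an assumption. It takes the two normal vectors, applies \(\cos\theta = |\vec{n}_1 \cdot \vec{n}_2| / (\|\vec{n}_1\|\|\vec{n}_2\|)\), and clamps the cosine at 1 so that floating-point rounding for parallel normals cannot push the value outside the domain of arccosine.

```python
import math


def angle_between_planes(n1, n2):
    """Angle in radians (between 0 and pi/2) between planes with normals n1, n2."""
    dot = sum(a * b for a, b in zip(n1, n2))
    norms = math.sqrt(sum(a * a for a in n1)) * math.sqrt(sum(b * b for b in n2))
    cos_theta = min(1.0, abs(dot) / norms)  # guard against rounding above 1
    return math.acos(cos_theta)


# Example 12.5.9, part c, and Exercise 12.5.9
print(round(angle_between_planes((1, 1, 1), (1, -3, 5)), 2))   # 1.27
print(round(angle_between_planes((1, 1, -1), (3, -1, 3)), 2))  # 1.44
```

A result of 0 (up to rounding) indicates parallel planes, and a result of π/2 indicates orthogonal planes, which covers parts a and b of the example.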

When we find that two planes are parallel, we may need to find the distance between them. To find this distance, we simply select a
point in one of the planes. The distance from this point to the other plane is the distance between the planes.
Previously, we introduced the formula for calculating this distance in Equation 12.5.5:

\[ d = \frac{\left|\overrightarrow{QP} \cdot \vec{n}\right|}{\|\vec{n}\|}, \]

where Q is a point on the plane, P is a point not on the plane, and \(\vec{n}\) is a normal vector to the plane. Consider the distance from point \((x_0, y_0, z_0)\) to plane \(ax + by + cz + k = 0\). Let \((x_1, y_1, z_1)\) be any point in the plane, so that \(ax_1 + by_1 + cz_1 = -k\). Substituting into the formula yields

\[
\begin{aligned}
d &= \frac{|a(x_0 - x_1) + b(y_0 - y_1) + c(z_0 - z_1)|}{\sqrt{a^2 + b^2 + c^2}} \\
&= \frac{|ax_0 + by_0 + cz_0 + k|}{\sqrt{a^2 + b^2 + c^2}}.
\end{aligned}
\]

We state this result formally in the following theorem.

Distance from a Point to a Plane

Let \(P(x_0, y_0, z_0)\) be a point. The distance from P to plane \(ax + by + cz + k = 0\) is given by

\[ d = \frac{|ax_0 + by_0 + cz_0 + k|}{\sqrt{a^2 + b^2 + c^2}}. \]

Example 12.5.10: Finding the Distance between Parallel Planes
Find the distance between the two parallel planes given by 2x + y − z = 2 and 2x + y − z = 8.
Solution
Point (1, 0, 0) lies in the first plane. The desired distance, then, is

\[
\begin{aligned}
d &= \frac{|ax_0 + by_0 + cz_0 + k|}{\sqrt{a^2 + b^2 + c^2}} \\
&= \frac{|2(1) + 1(0) + (-1)(0) + (-8)|}{\sqrt{2^2 + 1^2 + (-1)^2}} \\
&= \frac{6}{\sqrt{6}} = \sqrt{6} \text{ units.}
\end{aligned}
\]

Exercise 12.5.10
Find the distance between parallel planes 5x − 2y + z = 6 and 5x − 2y + z = −3.

Hint
Set x = y = 0 to find a point on the first plane.

Answer
\( \dfrac{9}{\sqrt{30}} = \dfrac{3\sqrt{30}}{10} \) units
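The point-to-plane formula is short enough to verify by direct computation. The sketch below is illustrative; the function name and the convention of passing the plane as the tuple (a, b, c, k) for ax + by + cz + k = 0 are assumptions. Note that a plane written as 2x + y − z = 8 must first be rearranged to 2x + y − z − 8 = 0, so k = −8.

```python
import math


def point_plane_distance(point, plane):
    """Distance from point (x0, y0, z0) to the plane a*x + b*y + c*z + k = 0."""
    a, b, c, k = plane
    x0, y0, z0 = point
    return abs(a * x0 + b * y0 + c * z0 + k) / math.sqrt(a * a + b * b + c * c)


# Example 12.5.7: P = (3, 1, 2) and x - 2y + z = 5
print(point_plane_distance((3, 1, 2), (1, -2, 1, -5)))   # equals sqrt(6)/3
# Example 12.5.10: (1, 0, 0) lies in 2x + y - z = 2; measure to 2x + y - z = 8
print(point_plane_distance((1, 0, 0), (2, 1, -1, -8)))   # equals sqrt(6)
```

For parallel planes, any point of the first plane gives the same value, because moving within the first plane does not change ax + by + cz.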

Distance between Two Skew Lines


Finding the distance from a point to a line or from a line to a plane seems like a pretty abstract procedure. But, if the lines
represent pipes in a chemical plant or tubes in an oil refinery or roads at an intersection of highways, confirming that the
distance between them meets specifications can be both important and awkward to measure. One way is to model the two pipes
as lines, using the techniques in this chapter, and then calculate the distance between them. The calculation involves forming
vectors along the directions of the lines and using both the cross product and the dot product.

Figure 12.5.11: Industrial pipe installations often feature pipes running in different directions. How can we find the distance between two skew pipes?
The symmetric forms of two lines, \(L_1\) and \(L_2\), are

\[ L_1: \frac{x - x_1}{a_1} = \frac{y - y_1}{b_1} = \frac{z - z_1}{c_1} \]

\[ L_2: \frac{x - x_2}{a_2} = \frac{y - y_2}{b_2} = \frac{z - z_2}{c_2}. \]

You are to develop a formula for the distance d between these two lines, in terms of the values \(a_1, b_1, c_1\); \(a_2, b_2, c_2\); \(x_1, y_1, z_1\); and \(x_2, y_2, z_2\). The distance between two lines is usually taken to mean the minimum distance, so this is the length of a line segment or the length of a vector that is perpendicular to both lines and intersects both lines.
1. First, write down two vectors, \(\vec{v}_1\) and \(\vec{v}_2\), that lie along \(L_1\) and \(L_2\), respectively.
2. Find the cross product of these two vectors and call it \(\vec{N}\). This vector is perpendicular to \(\vec{v}_1\) and \(\vec{v}_2\), and hence is perpendicular to both lines.
3. From vector \(\vec{N}\), form a unit vector \(\vec{n}\) in the same direction.
4. Use symmetric equations to find a convenient vector \(\vec{v}_{12}\) that lies between any two points, one on each line. Again, this can be done directly from the symmetric equations.
5. The dot product of a vector with a unit vector gives the scalar projection of the first vector onto the second, because \(\vec{A} \cdot \vec{B} = \|\vec{A}\|\|\vec{B}\|\cos\theta\), where θ is the angle between the vectors. Using the dot product, find the projection of vector \(\vec{v}_{12}\) found in step 4 onto unit vector \(\vec{n}\) found in step 3. This projection is perpendicular to both lines, and hence its length must be the perpendicular distance d between them. Note that the value of d may be negative, depending on your choice of vector \(\vec{v}_{12}\) or the order of the cross product, so use absolute value signs around the numerator.
6. Check that your formula gives the correct distance of \(|-25|/\sqrt{198} \approx 1.78\) between the following two lines:

\[ L_1: \frac{x - 5}{2} = \frac{y - 3}{4} = \frac{z - 1}{3} \]

\[ L_2: \frac{x - 6}{3} = \frac{y - 1}{5} = \frac{z}{7}. \]

7. Is your general expression valid when the lines are parallel? If not, why not? (Hint: What do you know about the value of the cross product of two parallel vectors? Where would that result show up in your expression for d?)
8. Demonstrate that your expression for the distance is zero when the lines intersect. Recall that two lines intersect if they are not parallel and they are in the same plane. Hence, consider the direction of \(\vec{n}\) and \(\vec{v}_{12}\). What is the result of their dot product?


9. Consider the following application. Engineers at a refinery have determined they need to install support struts between
many of the gas pipes to reduce damaging vibrations. To minimize cost, they plan to install these struts at the closest
points between adjacent skewed pipes. Because they have detailed schematics of the structure, they are able to determine
the correct lengths of the struts needed, and hence manufacture and distribute them to the installation crews without
spending valuable time making measurements.
The rectangular frame structure has the dimensions 4.0 × 15.0 × 10.0 m (height, width, and depth). One sector has a pipe entering the lower corner of the standard frame unit and exiting at the diametrically opposed corner (the one farthest away at the top); call this \(L_1\). A second pipe enters and exits at the two different opposite lower corners; call this \(L_2\) (Figure 12.5.12).

Figure 12.5.12: Two pipes cross through a standard frame unit.

Write down the vectors along the lines representing those pipes, find the cross product between them from which to create the unit vector \(\vec{n}\), define a vector that spans two points, one on each line, and finally determine the minimum distance between the lines. (Take the origin to be at the lower corner of the first pipe.) Similarly, you may also develop the symmetric equations for each line and substitute directly into your formula.
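Steps 1 through 5 of the project can be carried out numerically as a way to check a hand-derived formula. The sketch below is one possible implementation, not the project's official solution; the function names are assumptions, and each line is supplied as a point and a direction vector read off from its symmetric equations. It reproduces the check value from step 6 and raises an error for parallel lines, where the cross product in step 2 is the zero vector and no unit vector can be formed.

```python
import math


def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def skew_line_distance(p1, v1, p2, v2):
    """Minimum distance between the lines p1 + s*v1 and p2 + t*v2."""
    big_n = cross(v1, v2)                       # step 2: N = v1 x v2
    length = math.sqrt(dot(big_n, big_n))
    if length == 0:
        raise ValueError("lines are parallel; v1 x v2 is the zero vector")
    unit_n = tuple(c / length for c in big_n)   # step 3: unit vector n
    v12 = sub(p2, p1)                           # step 4: vector joining the lines
    return abs(dot(v12, unit_n))                # step 5: |projection onto n|


# Step 6: L1 through (5, 3, 1) along <2, 4, 3>; L2 through (6, 1, 0) along <3, 5, 7>
d = skew_line_distance((5, 3, 1), (2, 4, 3), (6, 1, 0), (3, 5, 7))
print(round(d, 2))  # 1.78
```

Here \(\vec{N} = \langle 13, -5, -2\rangle\) with \(\|\vec{N}\| = \sqrt{198}\), and \(\vec{v}_{12} \cdot \vec{N} = 25\) for the choice \(\vec{v}_{12} = \langle 1, -2, -1\rangle\); reversing \(\vec{v}_{12}\) gives −25, which is why the absolute value is needed.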

Key Concepts
In three dimensions, the direction of a line is described by a direction vector. The vector equation of a line with direction vector \(\vec{v} = \langle a, b, c\rangle\) passing through point \(P = (x_0, y_0, z_0)\) is \(\vec{r} = \vec{r}_0 + t\vec{v}\), where \(\vec{r}_0 = \langle x_0, y_0, z_0\rangle\) is the position vector of point P. This equation can be rewritten to form the parametric equations of the line: \(x = x_0 + ta\), \(y = y_0 + tb\), and \(z = z_0 + tc\). The line can also be described with the symmetric equations \(\dfrac{x - x_0}{a} = \dfrac{y - y_0}{b} = \dfrac{z - z_0}{c}\).
Let L be a line in space passing through point P with direction vector \(\vec{v}\). If Q is any point not on L, then the distance from Q to L is \(d = \dfrac{\|\overrightarrow{PQ} \times \vec{v}\|}{\|\vec{v}\|}\).
In three dimensions, two lines may be parallel but not equal, equal, intersecting, or skew.
Given a point P and vector \(\vec{n}\), the set of all points Q satisfying equation \(\vec{n} \cdot \overrightarrow{PQ} = 0\) forms a plane. Equation \(\vec{n} \cdot \overrightarrow{PQ} = 0\) is known as the vector equation of a plane.
The scalar equation of a plane containing point \(P = (x_0, y_0, z_0)\) with normal vector \(\vec{n} = \langle a, b, c\rangle\) is \(a(x - x_0) + b(y - y_0) + c(z - z_0) = 0\). This equation can be expressed as \(ax + by + cz + d = 0\), where \(d = -ax_0 - by_0 - cz_0\). This form of the equation is sometimes called the general form of the equation of a plane.
Suppose a plane with normal vector \(\vec{n}\) passes through point Q. The distance D from the plane to point P not in the plane is given by

\[
D = \left\|\operatorname{proj}_{\vec{n}} \overrightarrow{QP}\right\|
= \left|\operatorname{comp}_{\vec{n}} \overrightarrow{QP}\right|
= \frac{\left|\overrightarrow{QP} \cdot \vec{n}\right|}{\|\vec{n}\|}.
\]

The normal vectors of parallel planes are parallel. When two planes intersect, they form a line.
The measure of the angle θ between two intersecting planes can be found using the equation \(\cos\theta = \dfrac{|\vec{n}_1 \cdot \vec{n}_2|}{\|\vec{n}_1\|\,\|\vec{n}_2\|}\), where \(\vec{n}_1\) and \(\vec{n}_2\) are normal vectors to the planes.
The distance D from point \((x_0, y_0, z_0)\) to plane \(ax + by + cz + d = 0\) is given by

\[
D = \frac{|a(x_0 - x_1) + b(y_0 - y_1) + c(z_0 - z_1)|}{\sqrt{a^2 + b^2 + c^2}}
= \frac{|ax_0 + by_0 + cz_0 + d|}{\sqrt{a^2 + b^2 + c^2}},
\]

where \((x_1, y_1, z_1)\) is any point in the plane.

Key Equations
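Vector Equation of a Line

\[ \vec{r} = \vec{r}_0 + t\vec{v} \]

Parametric Equations of a Line

\[ x = x_0 + ta, \quad y = y_0 + tb, \quad \text{and} \quad z = z_0 + tc \]

Symmetric Equations of a Line

\[ \frac{x - x_0}{a} = \frac{y - y_0}{b} = \frac{z - z_0}{c} \]

Vector Equation of a Plane

\[ \vec{n} \cdot \overrightarrow{PQ} = 0 \]

Scalar Equation of a Plane

\[ a(x - x_0) + b(y - y_0) + c(z - z_0) = 0 \]

Distance between a Plane and a Point

\[
d = \left\|\operatorname{proj}_{\vec{n}} \overrightarrow{QP}\right\|
= \left|\operatorname{comp}_{\vec{n}} \overrightarrow{QP}\right|
= \frac{\left|\overrightarrow{QP} \cdot \vec{n}\right|}{\|\vec{n}\|}
\]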

Glossary
direction vector
a vector parallel to a line that is used to describe the direction, or orientation, of the line in space

general form of the equation of a plane
an equation in the form \(ax + by + cz + d = 0\), where \(\vec{n} = \langle a, b, c\rangle\) is a normal vector of the plane, \(P = (x_0, y_0, z_0)\) is a point on the plane, and \(d = -ax_0 - by_0 - cz_0\)

normal vector
a vector perpendicular to a plane

parametric equations of a line
the set of equations \(x = x_0 + ta\), \(y = y_0 + tb\), and \(z = z_0 + tc\) describing the line with direction vector \(\vec{v} = \langle a, b, c\rangle\) passing through point \((x_0, y_0, z_0)\)

scalar equation of a plane
the equation \(a(x - x_0) + b(y - y_0) + c(z - z_0) = 0\) used to describe a plane containing point \(P = (x_0, y_0, z_0)\) with normal vector \(\vec{n} = \langle a, b, c\rangle\) or its alternate form \(ax + by + cz + d = 0\), where \(d = -ax_0 - by_0 - cz_0\)

skew lines
two lines that are not parallel but do not intersect

symmetric equations of a line
the equations \(\dfrac{x - x_0}{a} = \dfrac{y - y_0}{b} = \dfrac{z - z_0}{c}\) describing the line with direction vector \(\vec{v} = \langle a, b, c\rangle\) passing through point \((x_0, y_0, z_0)\)

vector equation of a line
the equation \(\vec{r} = \vec{r}_0 + t\vec{v}\) used to describe a line with direction vector \(\vec{v} = \langle a, b, c\rangle\) passing through point \(P = (x_0, y_0, z_0)\), where \(\vec{r}_0 = \langle x_0, y_0, z_0\rangle\) is the position vector of point P

vector equation of a plane
the equation \(\vec{n} \cdot \overrightarrow{PQ} = 0\), where P is a given point in the plane, Q is any point in the plane, and \(\vec{n}\) is a normal vector of the plane

12.5: Equations of Lines and Planes is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
12.5: Equations of Lines and Planes in Space by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0. Original source:
https://openstax.org/details/books/calculus-volume-1.

12.6: Cylinders and Quadric Surfaces
Learning Objectives
Convert from cylindrical to rectangular coordinates.
Convert from rectangular to cylindrical coordinates.
Convert from spherical to rectangular coordinates.
Convert from rectangular to spherical coordinates.

The Cartesian coordinate system provides a straightforward way to describe the location of points in space. Some surfaces,
however, can be difficult to model with equations based on the Cartesian system. This is a familiar problem; recall that in two
dimensions, polar coordinates often provide a useful alternative system for describing the location of a point in the plane,
particularly in cases involving circles. In this section, we look at two different ways of describing the location of points in space,
both of them based on extensions of polar coordinates. As the name suggests, cylindrical coordinates are useful for dealing with
problems involving cylinders, such as calculating the volume of a round water tank or the amount of oil flowing through a pipe.
Similarly, spherical coordinates are useful for dealing with problems involving spheres, such as finding the volume of domed
structures.

Cylindrical Coordinates
When we expanded the traditional Cartesian coordinate system from two dimensions to three, we simply added a new axis to
model the third dimension. Starting with polar coordinates, we can follow this same process to create a new three-dimensional
coordinate system, called the cylindrical coordinate system. In this way, cylindrical coordinates provide a natural extension of polar
coordinates to three dimensions.

Definition: The Cylindrical Coordinate System


In the cylindrical coordinate system, a point in space (Figure 12.6.1) is represented by the ordered triple (r, θ, z) , where
(r, θ) are the polar coordinates of the point’s projection in the xy-plane
z is the usual z -coordinate in the Cartesian coordinate system

Figure 12.6.1 : The right triangle lies in the xy-plane. The length of the hypotenuse is r and θ is the measure of the angle
formed by the positive x -axis and the hypotenuse. The z -coordinate describes the location of the point above or below the xy-
plane.

In the xy-plane, the right triangle shown in Figure 12.6.1 provides the key to transformation between cylindrical and Cartesian, or
rectangular, coordinates.

Conversion between Cylindrical and Cartesian Coordinates


The rectangular coordinates (x, y, z) and the cylindrical coordinates (r, θ, z) of a point are related as follows:
These equations are used to convert from cylindrical coordinates to rectangular coordinates.

12.6.1 https://math.libretexts.org/@go/page/4529
x = r cos θ
y = r sin θ
z = z

These equations are used to convert from rectangular coordinates to cylindrical coordinates:
1. r² = x² + y²
2. tan θ = y/x
3. z = z

As when we discussed conversion from rectangular coordinates to polar coordinates in two dimensions, it should be noted that the
equation tan θ = y/x has an infinite number of solutions. However, if we restrict θ to values between 0 and 2π, then we can find a
unique solution based on the quadrant of the xy-plane in which the original point (x, y, z) is located. Note that if x = 0, then the
value of θ is either π/2, 3π/2, or 0, depending on the value of y.

Notice that these equations are derived from properties of right triangles. To make this easy to see, consider point P in the xy-plane
with rectangular coordinates (x, y, 0) and with cylindrical coordinates (r, θ, 0), as shown in Figure 12.6.2.

Figure 12.6.2: The Pythagorean theorem provides equation r² = x² + y². Right-triangle relationships tell us that
x = r cos θ, y = r sin θ, and tan θ = y/x.
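The conversion formulas above translate directly into code. The following Python sketch is illustrative only: the function names are hypothetical, angles are assumed to be in radians, and the quadrant is chosen with the standard library's `math.atan2` rather than by solving tan θ = y/x by hand.

```python
import math

def cylindrical_to_rectangular(r, theta, z):
    """Return (x, y, z) for the cylindrical point (r, theta, z); theta in radians."""
    return (r * math.cos(theta), r * math.sin(theta), z)

def rectangular_to_cylindrical(x, y, z):
    """Return (r, theta, z) with r >= 0 and theta reduced to the interval [0, 2*pi)."""
    r = math.hypot(x, y)  # sqrt(x^2 + y^2)
    # atan2 looks at the signs of both x and y, so it selects the correct
    # quadrant and also handles x = 0. The modulo maps negative angles
    # into [0, 2*pi), up to floating-point rounding.
    theta = math.atan2(y, x) % (2 * math.pi)
    return (r, theta, z)
```

For example, `rectangular_to_cylindrical(0, 2, 1)` returns approximately (2, π/2, 1), which matches the x = 0 case described above.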

Let’s consider the differences between rectangular and cylindrical coordinates by looking at the surfaces generated when each of
the coordinates is held constant. If c is a constant, then in rectangular coordinates, surfaces of the form x = c, y = c, or z = c are
all planes. Planes of these forms are parallel to the yz-plane, the xz-plane, and the xy-plane, respectively. When we convert to
cylindrical coordinates, the z -coordinate does not change. Therefore, in cylindrical coordinates, surfaces of the form z = c are
planes parallel to the xy-plane. Now, let’s think about surfaces of the form r = c . The points on these surfaces are at a fixed
distance from the z -axis. In other words, these surfaces are vertical circular cylinders. Last, what about θ = c ? The points on a
surface of the form θ = c are at a fixed angle from the x-axis, which gives us a half-plane that starts at the z -axis (Figures 12.6.3
and 12.6.4).

Figure 12.6.3 : In rectangular coordinates, (a) surfaces of the form x = c are planes parallel to the yz-plane, (b) surfaces of the
form y = c are planes parallel to the xz -plane, and (c) surfaces of the form z = c are planes parallel to the xy-plane.

Figure 12.6.4 : In cylindrical coordinates, (a) surfaces of the form r = c are vertical cylinders of radius r , (b) surfaces of the form
θ = c are half-planes at angle θ from the x -axis, and (c) surfaces of the form z = c are planes parallel to the xy -plane.

 Example 12.6.1: Converting from Cylindrical to Rectangular Coordinates



Plot the point with cylindrical coordinates (4, 2π/3, −2) and express its location in rectangular coordinates.

Solution
Conversion from cylindrical to rectangular coordinates requires a simple application of the equations listed in Conversion
between Cylindrical and Cartesian Coordinates:

x = r cos θ = 4 cos(2π/3) = −2
y = r sin θ = 4 sin(2π/3) = 2√3
z = −2

The point with cylindrical coordinates (4, 2π/3, −2) has rectangular coordinates (−2, 2√3, −2) (Figure 12.6.5).

Figure 12.6.5: The projection of the point in the xy-plane is 4 units from the origin. The line from the origin to the point’s
projection forms an angle of 2π/3 with the positive x-axis. The point lies 2 units below the xy-plane.

 Exercise 12.6.1

Point R has cylindrical coordinates (5, π/6, 4). Plot R and describe its location in space using rectangular, or Cartesian,
coordinates.

Hint
The first two components match the polar coordinates of the point in the xy-plane.
Answer
The rectangular coordinates of the point are (5√3/2, 5/2, 4).

If this process seems familiar, it is with good reason. This is exactly the same process that we followed in Introduction to
Parametric Equations and Polar Coordinates to convert from polar coordinates to two-dimensional rectangular coordinates.

 Example 12.6.2: Converting from Rectangular to Cylindrical Coordinates

Convert the rectangular coordinates (1, −3, 5) to cylindrical coordinates.


Solution
Use the second set of equations from Conversion between Cylindrical and Cartesian Coordinates to translate from rectangular
to cylindrical coordinates:
r² = x² + y²

r = ±√(1² + (−3)²) = ±√10.

We choose the positive square root, so r = √10. Now, we apply the formula to find θ. In this case, y is negative and x is
positive, so the point lies in the fourth quadrant and we must select the value of θ between 3π/2 and 2π:

tan θ = y/x = −3/1

θ = arctan(−3) + 2π ≈ 5.03 rad.

(The arctangent function itself returns arctan(−3) ≈ −1.25 rad; adding 2π moves this angle into the required interval.)

In this case, the z-coordinates are the same in both rectangular and cylindrical coordinates:

z = 5.

The point with rectangular coordinates (1, −3, 5) has cylindrical coordinates approximately equal to (√10, 5.03, 5).
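The quadrant adjustment in this example can be checked numerically. The short Python sketch below is an illustration, not part of the original solution; it compares the principal arctangent value with the angle in the interval from 3π/2 to 2π, and with the value obtained from the standard library's `math.atan2`.

```python
import math

x, y = 1.0, -3.0

# math.atan returns the principal value, which lies in (-pi/2, pi/2).
principal = math.atan(y / x)            # about -1.249 rad

# The point (1, -3) is in the fourth quadrant, so shift by 2*pi to land
# between 3*pi/2 and 2*pi.
theta = principal + 2 * math.pi         # about 5.034 rad

# math.atan2 uses the signs of both arguments; reducing modulo 2*pi gives
# the same angle without a case-by-case quadrant check.
theta_atan2 = math.atan2(y, x) % (2 * math.pi)

r = math.hypot(x, y)                    # sqrt(10), about 3.162
print(round(r, 3), round(theta, 3), round(theta_atan2, 3))
# prints: 3.162 5.034 5.034
```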

 Exercise 12.6.2

Convert point (−8, 8, −7) from Cartesian coordinates to cylindrical coordinates.

Hint
r² = x² + y² and tan θ = y/x

Answer
(8√2, 3π/4, −7)

The use of cylindrical coordinates is common in fields such as physics. Physicists studying electrical charges and the capacitors
used to store these charges have discovered that these systems sometimes have a cylindrical symmetry. These systems have
complicated modeling equations in the Cartesian coordinate system, which make them difficult to describe and analyze. The
equations can often be expressed in simpler terms using cylindrical coordinates. For example, the cylinder described by
equation x² + y² = 25 in the Cartesian system can be represented by cylindrical equation r = 5.

 Example 12.6.3: Identifying Surfaces in the Cylindrical Coordinate System

Describe the surfaces with the given cylindrical equations.


a. θ = π/4
b. r² + z² = 9
c. z = r
Solution
a. When the angle θ is held constant while r and z are allowed to vary, the result is a half-plane (Figure 12.6.6).

Figure 12.6.6 : In polar coordinates, the equation θ = π/4 describes the ray extending diagonally through the first quadrant. In
three dimensions, this same equation describes a half-plane.
b. Substitute r² = x² + y² into equation r² + z² = 9 to express the rectangular form of the equation: x² + y² + z² = 9. This
equation describes a sphere centered at the origin with radius 3 (Figure 12.6.7).

Figure 12.6.7: The sphere centered at the origin with radius 3 can be described by the cylindrical equation r² + z² = 9.

c. To describe the surface defined by equation z = r, it is useful to examine traces parallel to the xy-plane. For example, the
trace in plane z = 1 is circle r = 1 , the trace in plane z = 3 is circle r = 3 , and so on. Each trace is a circle. As the value of z
increases, the radius of the circle also increases. The resulting surface is a cone (Figure 12.6.8).

Figure 12.6.8 : The traces in planes parallel to the xy-plane are circles. The radius of the circles increases as z increases.
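The identifications in parts b and c can be spot-checked by sampling points. The Python sketch below is only a numerical sanity check under assumed sample values (the particular heights and angles are arbitrary choices, and the helper names are hypothetical); it does not replace the algebraic argument.

```python
import math

def cylindrical_to_rectangular(r, theta, z):
    return (r * math.cos(theta), r * math.sin(theta), z)

def on_sphere_radius_3(r, theta, z):
    """For a cylindrical point, test whether x^2 + y^2 + z^2 = 9."""
    x, y, z = cylindrical_to_rectangular(r, theta, z)
    return math.isclose(x * x + y * y + z * z, 9.0)

# Points on r^2 + z^2 = 9: pick z, then r = sqrt(9 - z^2); theta is free.
sphere_checks = [
    on_sphere_radius_3(math.sqrt(9 - z * z), theta, z)
    for z in (-3.0, -1.5, 0.0, 2.0, 3.0)
    for theta in (0.0, 1.0, 2.5, 4.0)
]

# Points on z = r: the trace at height z is a circle whose radius equals z.
cone_radii = []
for z in (1.0, 3.0):
    x, y, _ = cylindrical_to_rectangular(z, 0.7, z)
    cone_radii.append(math.hypot(x, y))

print(all(sphere_checks), [round(radius, 6) for radius in cone_radii])
# prints: True [1.0, 3.0]
```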

 Exercise 12.6.3

Describe the surface with cylindrical equation r = 6 .

Hint
The θ and z components of points on the surface can take any value.
Answer
This surface is a cylinder with radius 6.

Spherical Coordinates
In the Cartesian coordinate system, the location of a point in space is described using an ordered triple in which each coordinate
represents a distance. In the cylindrical coordinate system, the location of a point in space is described using two distances (r and
z) and an angle measure (θ). In the spherical coordinate system, we again use an ordered triple to describe the location of a point in
space. In this case, the triple describes one distance and two angles. Spherical coordinates make it simple to describe a sphere, just
as cylindrical coordinates make it easy to describe a cylinder. Grid lines for spherical coordinates are based on angle measures, like
those for polar coordinates.

 Definition: spherical coordinate system
In the spherical coordinate system, a point P in space (Figure 12.6.9) is represented by the ordered triple (ρ, θ, φ) where
ρ (the Greek letter rho) is the distance between P and the origin (ρ ≥ 0);
θ is the same angle used to describe the location in cylindrical coordinates;
φ (the Greek letter phi) is the angle formed by the positive z-axis and line segment OP, where O is the origin and 0 ≤ φ ≤ π.

Figure 12.6.9 : The relationship among spherical, rectangular, and cylindrical coordinates.
By convention, the origin is represented as (0, 0, 0) in spherical coordinates.

 HOWTO: Converting among Spherical, Cylindrical, and Rectangular Coordinates


Rectangular coordinates (x, y, z), cylindrical coordinates (r, θ, z), and spherical coordinates (ρ, θ, φ) of a point are related as
follows:
Convert from spherical coordinates to rectangular coordinates
These equations are used to convert from spherical coordinates to rectangular coordinates.
x = ρ sin φ cos θ

y = ρ sin φ sin θ

z = ρ cos φ

Convert from rectangular coordinates to spherical coordinates


These equations are used to convert from rectangular coordinates to spherical coordinates.
ρ² = x² + y² + z²
tan θ = y/x
φ = arccos(z/√(x² + y² + z²)).

Convert from spherical coordinates to cylindrical coordinates


These equations are used to convert from spherical coordinates to cylindrical coordinates.
r = ρ sin φ

θ =θ

z = ρ cos φ

Convert from cylindrical coordinates to spherical coordinates


These equations are used to convert from cylindrical coordinates to spherical coordinates.
ρ = √(r² + z²)
θ = θ
φ = arccos(z/√(r² + z²))

The formulas to convert from spherical coordinates to rectangular coordinates may seem complex, but they are straightforward
applications of trigonometry. Looking at Figure 12.6.10, it is easy to see that r = ρ sin φ. Then, looking at the triangle in the
xy-plane with r as its hypotenuse, we have x = r cos θ = ρ sin φ cos θ. The derivation of the formula for y is similar. Figure
12.6.10 also shows that ρ² = r² + z² = x² + y² + z² and z = ρ cos φ. Solving this last equation for φ and then substituting
ρ = √(r² + z²) (from the first equation) yields φ = arccos(z/√(r² + z²)). Also, note that, as before, we must be careful when using
the formula tan θ = y/x to choose the correct value of θ.

Figure 12.6.10: The equations that convert from one system to another are derived from right-triangle relationships.
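The four sets of conversion equations can be collected into a small group of functions. The Python sketch below is one possible arrangement under stated assumptions: the function names are hypothetical, angles are in radians, θ is reduced to [0, 2π) with `math.atan2`, and the origin is returned as (0, 0, 0) following the convention stated in the definition.

```python
import math

def spherical_to_rectangular(rho, theta, phi):
    x = rho * math.sin(phi) * math.cos(theta)
    y = rho * math.sin(phi) * math.sin(theta)
    z = rho * math.cos(phi)
    return (x, y, z)

def rectangular_to_spherical(x, y, z):
    rho = math.sqrt(x * x + y * y + z * z)
    if rho == 0.0:
        return (0.0, 0.0, 0.0)  # convention for the origin
    theta = math.atan2(y, x) % (2 * math.pi)  # quadrant-aware choice of theta
    # Clamp the ratio to guard against rounding slightly outside [-1, 1].
    phi = math.acos(max(-1.0, min(1.0, z / rho)))
    return (rho, theta, phi)

def spherical_to_cylindrical(rho, theta, phi):
    return (rho * math.sin(phi), theta, rho * math.cos(phi))

def cylindrical_to_spherical(r, theta, z):
    rho = math.hypot(r, z)  # sqrt(r^2 + z^2)
    if rho == 0.0:
        return (0.0, 0.0, 0.0)  # convention for the origin
    phi = math.acos(max(-1.0, min(1.0, z / rho)))
    return (rho, theta, phi)
```

As a quick check, `spherical_to_rectangular(2, math.pi / 2, math.pi / 2)` returns approximately (0, 2, 0): the point lies in the xy-plane (φ = π/2) on the positive y-axis (θ = π/2).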
As we did with cylindrical coordinates, let’s consider the surfaces that are generated when each of the coordinates is held constant.
Let c be a constant, and consider surfaces of the form ρ = c . Points on these surfaces are at a fixed distance from the origin and
form a sphere. The coordinate θ in the spherical coordinate system is the same as in the cylindrical coordinate system, so surfaces
of the form θ = c are half-planes, as before. Last, consider surfaces of the form φ = c . The points on these surfaces are at a fixed
angle from the z -axis and form a half-cone (Figure 12.6.11).

Figure 12.6.11: In spherical coordinates, surfaces of the form ρ = c are spheres of radius ρ (a), surfaces of the form θ = c are half-
planes at an angle θ from the x -axis (b), and surfaces of the form ϕ = c are half-cones at an angle ϕ from the z -axis (c).

 Example 12.6.4: Converting from Spherical Coordinates


Plot the point with spherical coordinates (8, π/3, π/6) and express its location in both rectangular and cylindrical coordinates.

Solution
Use the equations in Converting among Spherical, Cylindrical, and Rectangular Coordinates to translate from spherical to
rectangular coordinates (Figure 12.6.12):

x = ρ sin φ cos θ = 8 sin(π/6) cos(π/3) = 8(1/2)(1/2) = 2

y = ρ sin φ sin θ = 8 sin(π/6) sin(π/3) = 8(1/2)(√3/2) = 2√3

z = ρ cos φ = 8 cos(π/6) = 8(√3/2) = 4√3

Figure 12.6.12: The projection of the point in the xy-plane is 4 units from the origin. The line from the origin to the point’s
projection forms an angle of π/3 with the positive x-axis. The point lies 4√3 units above the xy-plane.
The point with spherical coordinates (8, π/3, π/6) has rectangular coordinates (2, 2√3, 4√3).

Finding the values in cylindrical coordinates is equally straightforward:

r = ρ sin φ = 8 sin(π/6) = 4

θ = θ

z = ρ cos φ = 8 cos(π/6) = 4√3.

Thus, cylindrical coordinates for the point are (4, π/3, 4√3).

 Exercise 12.6.4

Plot the point with spherical coordinates (2, −5π/6, π/6) and describe its location in both rectangular and cylindrical coordinates.

Hint
Converting the coordinates first may help to find the location of the point in space more easily.
Answer
Cartesian: (−√3/2, −1/2, √3), cylindrical: (1, −5π/6, √3)

 Example 12.6.5: Converting from Rectangular Coordinates



Convert the rectangular coordinates (−1, 1, √6) to both spherical and cylindrical coordinates.
Solution
Start by converting from rectangular to spherical coordinates:

ρ² = x² + y² + z² = (−1)² + 1² + (√6)² = 8

tan θ = 1/(−1) = −1

ρ = 2√2 and θ = arctan(−1) + π = 3π/4.

The arctangent function returns arctan(−1) = −π/4, which corresponds to the fourth quadrant. Because (x, y) = (−1, 1) lies in the
second quadrant, the correct choice for θ is 3π/4.

There are actually two ways to identify φ. We can use the equation φ = arccos(z/√(x² + y² + z²)). A simpler approach,
however, is to use equation z = ρ cos φ. We know that z = √6 and ρ = 2√2, so

√6 = 2√2 cos φ, so cos φ = √6/(2√2) = √3/2

and therefore φ = π/6. The spherical coordinates of the point are (2√2, 3π/4, π/6).

To find the cylindrical coordinates for the point, we need only find r:

r = ρ sin φ = 2√2 sin(π/6) = √2.

The cylindrical coordinates for the point are (√2, 3π/4, √6).
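Because ρ² = x² + y² + z² = r² + z², the spherical coordinates of this point can be reached either directly from the rectangular coordinates or by way of the cylindrical coordinates. The Python sketch below, with variable names chosen only for illustration, computes both routes for the point (−1, 1, √6) and shows that they agree.

```python
import math

x, y, z = -1.0, 1.0, math.sqrt(6)

# Route 1: rectangular -> spherical directly.
rho_direct = math.sqrt(x * x + y * y + z * z)
phi_direct = math.acos(z / rho_direct)

# Route 2: rectangular -> cylindrical -> spherical.
r = math.hypot(x, y)                 # sqrt(2)
theta = math.atan2(y, x)             # second quadrant, so this is 3*pi/4
rho_via_cyl = math.hypot(r, z)       # sqrt(r^2 + z^2)
phi_via_cyl = math.acos(z / rho_via_cyl)

routes_agree = (
    math.isclose(rho_direct, rho_via_cyl)
    and math.isclose(phi_direct, phi_via_cyl)
)
print(routes_agree, round(rho_direct, 4), round(theta, 4), round(phi_direct, 4))
```

The rounded values should match 2√2 ≈ 2.8284, 3π/4 ≈ 2.3562, and π/6 ≈ 0.5236 from the worked solution.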

 Example 12.6.6: Identifying Surfaces in the Spherical Coordinate System

Describe the surfaces with the given spherical equations.


a. θ = π/3
b. φ = 5π/6
c. ρ = 6
d. ρ = sin θ sin φ
Solution
a. The variable θ represents the measure of the same angle in both the cylindrical and spherical coordinate systems. Points with
coordinates (ρ, π/3, φ) lie on the plane that forms angle θ = π/3 with the positive x-axis. Because ρ > 0, the surface described by
equation θ = π/3 is the half-plane shown in Figure 12.6.13.

Figure 12.6.13: The surface described by equation θ = π/3 is a half-plane.

b. Equation φ = 5π/6 describes all points in the spherical coordinate system that lie on a line from the origin forming an angle
measuring 5π/6 rad with the positive z-axis. These points form a half-cone (Figure 12.6.14). Because there is only one value for
φ that is measured from the positive z-axis, we do not get the full cone (with two pieces).

Figure 12.6.14: The equation φ = 5π/6 describes a cone.

To find the equation in rectangular coordinates, use equation φ = arccos(z/√(x² + y² + z²)):

5π/6 = arccos(z/√(x² + y² + z²))

cos(5π/6) = z/√(x² + y² + z²)

−√3/2 = z/√(x² + y² + z²)

3/4 = z²/(x² + y² + z²)

3x²/4 + 3y²/4 + 3z²/4 = z²

3x²/4 + 3y²/4 − z²/4 = 0.

This is the equation of a cone centered on the z-axis. Squaring both sides admits points with z > 0 as well; the surface φ = 5π/6
itself is only the lower half of this cone, where z ≤ 0.


c. Equation ρ = 6 describes the set of all points 6 units away from the origin—a sphere with radius 6 (Figure 12.6.15).

Figure 12.6.15: Equation ρ = 6 describes a sphere with radius 6.


d. To identify this surface, convert the equation from spherical to rectangular coordinates, using equations y = ρ sin φ sin θ
and ρ² = x² + y² + z²:

ρ = sin θ sin φ
ρ² = ρ sin θ sin φ                   Multiply both sides of the equation by ρ.
x² + y² + z² = y                     Substitute rectangular variables using the equations above.
x² + y² − y + z² = 0                 Subtract y from both sides of the equation.
x² + y² − y + 1/4 + z² = 1/4         Complete the square.
x² + (y − 1/2)² + z² = 1/4.          Rewrite the middle terms as a perfect square.

The equation describes a sphere centered at point (0, 1/2, 0) with radius 1/2.
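The rectangular equations obtained in parts b and d can be spot-checked numerically. The Python sketch below samples points on each spherical surface and measures how far they are from satisfying the corresponding rectangular equation. The sample grids and the helper name are arbitrary choices made for illustration; for part d, θ is restricted to values between 0 and π so that ρ = sin θ sin φ stays positive.

```python
import math

def spherical_to_rectangular(rho, theta, phi):
    x = rho * math.sin(phi) * math.cos(theta)
    y = rho * math.sin(phi) * math.sin(theta)
    z = rho * math.cos(phi)
    return (x, y, z)

# Part d: rho = sin(theta) * sin(phi) should satisfy
# x^2 + (y - 1/2)^2 + z^2 = 1/4.
max_sphere_error = 0.0
for i in range(1, 20):
    theta = i * math.pi / 20
    for j in range(1, 20):
        phi = j * math.pi / 20
        rho = math.sin(theta) * math.sin(phi)
        x, y, z = spherical_to_rectangular(rho, theta, phi)
        residual = x * x + (y - 0.5) ** 2 + z * z - 0.25
        max_sphere_error = max(max_sphere_error, abs(residual))

# Part b: phi = 5*pi/6 should satisfy 3x^2/4 + 3y^2/4 - z^2/4 = 0,
# with every sampled point below the xy-plane.
max_cone_error = 0.0
all_below_xy_plane = True
for rho in (0.5, 1.0, 2.0, 7.0):
    for k in range(8):
        theta = k * math.pi / 4
        x, y, z = spherical_to_rectangular(rho, theta, 5 * math.pi / 6)
        residual = 3 * x * x / 4 + 3 * y * y / 4 - z * z / 4
        max_cone_error = max(max_cone_error, abs(residual))
        all_below_xy_plane = all_below_xy_plane and z < 0

print(max_sphere_error < 1e-12, max_cone_error < 1e-9, all_below_xy_plane)
```

Both residuals should be zero up to floating-point rounding, and every sampled point with φ = 5π/6 has a negative z-coordinate, consistent with a half-cone opening downward.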

 Exercise 12.6.5

Describe the surfaces defined by the following equations.


a. ρ = 13
b. θ = 2π/3
c. φ = π/4

Hint
Think about what each component represents and what it means to hold that component constant.
Answer a
This is the set of all points 13 units from the origin. This set forms a sphere with radius 13.
Answer b
This set of points forms a half plane. The angle between the half plane and the positive x-axis is θ = 2π/3.

Answer c
Let P be a point on this surface. The position vector of this point forms an angle of φ = π/4 with the positive z-axis, which
means that points closer to the origin are closer to the axis. These points form a half-cone.

Spherical coordinates are useful in analyzing systems that have some degree of symmetry about a point, such as the volume of the
space inside a domed stadium or wind speeds in a planet’s atmosphere. A sphere that has Cartesian equation x² + y² + z² = c²
has the simple equation ρ = c in spherical coordinates.


In geography, latitude and longitude are used to describe locations on Earth’s surface, as shown in Figure 12.6.16. Although the
shape of Earth is not a perfect sphere, we use spherical coordinates to communicate the locations of points on Earth. Let’s assume
Earth has the shape of a sphere with radius 4000 mi. We express angle measures in degrees rather than radians because latitude and
longitude are measured in degrees.

Figure 12.6.16: In the latitude–longitude system, angles describe the location of a point on Earth relative to the equator and the
prime meridian.
Let the center of Earth be the center of the sphere, with the ray from the center through the North Pole representing the positive z -
axis. The prime meridian represents the trace of the surface as it intersects the xz-plane. The equator is the trace of the sphere
intersecting the xy-plane.

 Example 12.6.7: Converting Latitude and Longitude to Spherical Coordinates


The latitude of Columbus, Ohio, is 40° N and the longitude is 83° W, which means that Columbus is 40° north of the equator.
Imagine a ray from the center of Earth through Columbus and a ray from the center of Earth through the equator directly south
of Columbus. The measure of the angle formed by the rays is 40°. In the same way, measuring from the prime meridian,
Columbus lies 83° to the west. Express the location of Columbus in spherical coordinates.
Solution
The radius of Earth is 4000 mi, so ρ = 4000. The intersection of the prime meridian and the equator lies on the positive x-axis.
Movement to the west is then described with negative angle measures, which shows that θ = −83°. Because Columbus lies
40° north of the equator, it lies 50° south of the North Pole, so φ = 50°. In spherical coordinates, Columbus lies at point

(4000, −83°, 50°).
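The reasoning in this example follows a fixed rule: θ is the longitude (negative for west) and φ is 90° minus the latitude (with southern latitudes counted as negative). The Python sketch below encodes that rule. The function name, the sign conventions for its arguments, and the constant name are assumptions made for illustration; the 4000 mi radius is the rounded value used in the text.

```python
EARTH_RADIUS_MI = 4000  # rounded radius assumed in the text

def latlon_to_spherical(lat_deg, lon_deg, radius=EARTH_RADIUS_MI):
    """Convert latitude/longitude in degrees to (rho, theta, phi) in degrees.

    lat_deg is positive north of the equator and negative south of it;
    lon_deg is positive east of the prime meridian and negative west of it.
    """
    if not -90 <= lat_deg <= 90:
        raise ValueError("latitude must lie between -90 and 90 degrees")
    theta = lon_deg          # angle from the positive x-axis (prime meridian)
    phi = 90 - lat_deg       # angle measured down from the North Pole
    return (radius, theta, phi)

columbus = latlon_to_spherical(40, -83)
print(columbus)  # prints: (4000, -83, 50)
```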

 Exercise 12.6.6

Sydney, Australia is at 34°S and 151°E. Express Sydney’s location in spherical coordinates.

Hint
Because Sydney lies south of the equator, we need to add 90° to find the angle measured from the positive z -axis.
Answer
(4000, 151°, 124°)

Cylindrical and spherical coordinates give us the flexibility to select a coordinate system appropriate to the problem at hand. A
thoughtful choice of coordinate system can make a problem much easier to solve, whereas a poor choice can lead to unnecessarily
complex calculations. In the following example, we examine several different problems and discuss how to select the best
coordinate system for each one.

 Example 12.6.8: Choosing the Best Coordinate System

In each of the following situations, we determine which coordinate system is most appropriate and describe how we would
orient the coordinate axes. There could be more than one right answer for how the axes should be oriented, but we select an
orientation that makes sense in the context of the problem. Note: There is not enough information to set up or solve these
problems; we simply select the coordinate system (Figure 12.6.17).
a. Find the center of gravity of a bowling ball.
b. Determine the velocity of a submarine subjected to an ocean current.
c. Calculate the pressure in a conical water tank.
d. Find the volume of oil flowing through a pipeline.
e. Determine the amount of leather required to make a football.

Figure 12.6.17: (credit: (a) modification of work by scl hua, Wikimedia, (b) modification of work by DVIDSHUB, Flickr, (c)
modification of work by Michael Malak, Wikimedia, (d) modification of work by Sean Mack, Wikimedia, (e) modification of
work by Elvert Barnes, Flickr)
Solution
a. Clearly, a bowling ball is a sphere, so spherical coordinates would probably work best here. The origin should be located at
the physical center of the ball. There is no obvious choice for how the x-, y - and z -axes should be oriented. Bowling balls
normally have a weight block in the center. One possible choice is to align the z-axis with the axis of symmetry of the
weight block.
b. A submarine generally moves in a straight line. There is no rotational or spherical symmetry that applies in this situation, so
rectangular coordinates are a good choice. The z -axis should probably point upward. The x- and y -axes could be aligned to
point east and north, respectively. The origin should be some convenient physical location, such as the starting position of
the submarine or the location of a particular port.
c. A cone has several kinds of symmetry. In cylindrical coordinates, a cone can be represented by equation z = kr, where k is
a constant. In spherical coordinates, we have seen that surfaces of the form φ = c are half-cones. Last, in rectangular
coordinates, elliptic cones are quadric surfaces and can be represented by equations of the form z² = x²/a² + y²/b². In this
case, we could choose any of the three. However, the equation for the surface is more complicated in rectangular
coordinates than in the other two systems, so we might want to avoid that choice. In addition, we are talking about a water
tank, and the depth of the water might come into play at some point in our calculations, so it might be nice to have a
component that represents height and depth directly. Based on this reasoning, cylindrical coordinates might be the best
choice. Choose the z -axis to align with the axis of the cone. The orientation of the other two axes is arbitrary. The origin
should be the bottom point of the cone.
d. A pipeline is a cylinder, so cylindrical coordinates would be the best choice. In this case, however, we would likely
choose to orient our z -axis with the center axis of the pipeline. The x-axis could be chosen to point straight downward or to
some other logical direction. The origin should be chosen based on the problem statement. Note that this puts the z -axis in
a horizontal orientation, which is a little different from what we usually do. It may make sense to choose an unusual
orientation for the axes if it makes sense for the problem.
e. A football has rotational symmetry about a central axis, so cylindrical coordinates would work best. The z -axis should align
with the axis of the ball. The origin could be the center of the ball or perhaps one of the ends. The position of the x-axis is
arbitrary.

 Exercise 12.6.7

Which coordinate system is most appropriate for creating a star map, as viewed from Earth (see the following figure)?

How should we orient the coordinate axes?

Hint
What kinds of symmetry are present in this situation?
Answer

Spherical coordinates with the origin located at the center of the earth, the z -axis aligned with the North Pole, and the x -
axis aligned with the prime meridian

Key Concepts
In the cylindrical coordinate system, a point in space is represented by the ordered triple (r, θ, z), where (r, θ) represents the
polar coordinates of the point’s projection in the xy-plane and z represents the point’s projection onto the z -axis.
To convert a point from cylindrical coordinates to Cartesian coordinates, use equations x = r cos θ, y = r sin θ, and z = z.
To convert a point from Cartesian coordinates to cylindrical coordinates, use equations r² = x² + y², tan θ = y/x, and z = z.
In the spherical coordinate system, a point P in space is represented by the ordered triple (ρ, θ, φ), where ρ is the distance
between P and the origin (ρ ≥ 0), θ is the same angle used to describe the location in cylindrical coordinates, and φ is the
angle formed by the positive z-axis and line segment OP, where O is the origin and 0 ≤ φ ≤ π.
To convert a point from spherical coordinates to Cartesian coordinates, use equations x = ρ sin φ cos θ, y = ρ sin φ sin θ, and
z = ρ cos φ.
To convert a point from Cartesian coordinates to spherical coordinates, use equations ρ² = x² + y² + z², tan θ = y/x, and
φ = arccos(z/√(x² + y² + z²)).
To convert a point from spherical coordinates to cylindrical coordinates, use equations r = ρ sin φ, θ = θ, and z = ρ cos φ.
To convert a point from cylindrical coordinates to spherical coordinates, use equations ρ = √(r² + z²), θ = θ, and
φ = arccos(z/√(r² + z²)).

Glossary
cylindrical coordinate system
a way to describe a location in space with an ordered triple (r, θ, z), where (r, θ) represents the polar coordinates of the point’s
projection in the xy-plane, and z represents the point’s projection onto the z -axis

spherical coordinate system


a way to describe a location in space with an ordered triple (ρ, θ, φ), where ρ is the distance between P and the origin
(ρ ≥ 0), θ is the same angle used to describe the location in cylindrical coordinates, and φ is the angle formed by the positive
z-axis and line segment OP, where O is the origin and 0 ≤ φ ≤ π

Contributors and Attributions


Gilbert Strang (MIT) and Edwin “Jed” Herman (Harvey Mudd) with many contributing authors. This content by OpenStax is
licensed with a CC-BY-SA-NC 4.0 license. Download for free at http://cnx.org.
Paul Seeburger edited the LaTeX on the page

12.6: Cylinders and Quadric Surfaces is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
12.7: Cylindrical and Spherical Coordinates by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0. Original source:
https://openstax.org/details/books/calculus-volume-1.

CHAPTER OVERVIEW

13: Vector Functions


A general Calculus Textmap organized around the textbook

Calculus: Early Transcendentals


by James Stewart


This Textmap is currently under construction... please be patient with us.

Topic hierarchy
13.1: Vector Functions and Space Curves
13.2: Derivatives and Integrals of Vector Functions
13.3: Arc Length and Curvature
13.4: Motion in Space- Velocity and Acceleration

13: Vector Functions is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.

13.1: Vector Functions and Space Curves
 Learning Objectives
Write the general equation of a vector-valued function in component form and unit-vector form.
Recognize parametric equations for a space curve.
Describe the shape of a helix and write its equation.
Define the limit of a vector-valued function.

Our study of vector-valued functions combines ideas from our earlier examination of single-variable calculus with our description
of vectors in three dimensions from the preceding chapter. In this section, we extend concepts from earlier chapters and also
examine new ideas concerning curves in three-dimensional space. These definitions and theorems support the presentation of
material in the rest of this chapter and also in the remaining chapters of the text.

Definition of a Vector-Valued Function


Our first step in studying the calculus of vector-valued functions is to define what exactly a vector-valued function is. We can then
look at graphs of vector-valued functions and see how they define curves in both two and three dimensions.

 Definition: Vector-valued Functions


A vector-valued function is a function of the form

r⃗(t) = f(t) î + g(t) ĵ or r⃗(t) = f(t) î + g(t) ĵ + h(t) k̂,

where the component functions f, g, and h are real-valued functions of the parameter t. Vector-valued functions are also
written in the form

r⃗(t) = ⟨f(t), g(t)⟩ or r⃗(t) = ⟨f(t), g(t), h(t)⟩.

In both cases, the first form of the function defines a two-dimensional vector-valued function; the second form describes a three-
dimensional vector-valued function.
The parameter t can lie between two real numbers: a ≤ t ≤ b . Another possibility is that the value of t might take on all real
numbers. Last, the component functions themselves may have domain restrictions that enforce restrictions on the value of t . We
often use t as a parameter because t can represent time.

 Example 13.1.1: Evaluating Vector-Valued Functions and Determining Domains

For each of the following vector-valued functions, evaluate r⃗(0), r⃗(π/2), and r⃗(2π/3). Do any of these functions have domain
restrictions?
1. r⃗(t) = 4 cos t î + 3 sin t ĵ
2. r⃗(t) = 3 tan t î + 4 sec t ĵ + 5t k̂

Solution
1. To calculate each of the function values, substitute the appropriate value of t into the function:

13.1.1 https://math.libretexts.org/@go/page/4531
r⃗(0) = 4 cos(0) î + 3 sin(0) ĵ = 4 î + 0 ĵ = 4 î

r⃗(π/2) = 4 cos(π/2) î + 3 sin(π/2) ĵ = 0 î + 3 ĵ = 3 ĵ

r⃗(2π/3) = 4 cos(2π/3) î + 3 sin(2π/3) ĵ = 4(−1/2) î + 3(√3/2) ĵ = −2 î + (3√3/2) ĵ

To determine whether this function has any domain restrictions, consider the component functions separately. The first
component function is f(t) = 4 cos t and the second component function is g(t) = 3 sin t. Neither of these functions has a
domain restriction, so the domain of r⃗(t) = 4 cos t î + 3 sin t ĵ is all real numbers.

2. To calculate each of the function values, substitute the appropriate value of t into the function:

r⃗(0) = 3 tan(0) î + 4 sec(0) ĵ + 5(0) k̂ = 0 î + 4 ĵ + 0 k̂ = 4 ĵ

r⃗(π/2) = 3 tan(π/2) î + 4 sec(π/2) ĵ + 5(π/2) k̂, which does not exist

r⃗(2π/3) = 3 tan(2π/3) î + 4 sec(2π/3) ĵ + 5(2π/3) k̂
         = 3(−√3) î + 4(−2) ĵ + (10π/3) k̂
         = −3√3 î − 8 ĵ + (10π/3) k̂

To determine whether this function has any domain restrictions, consider the component functions separately. The first
component function is f(t) = 3 tan t, the second component function is g(t) = 4 sec t, and the third component function is
h(t) = 5t. The first two functions are not defined for odd multiples of π/2, so the function is not defined for odd multiples of
π/2. Therefore, the domain of r⃗ is

D = {t | t ≠ (2n + 1)π/2},

where n is any integer.
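The evaluations and the domain restriction in this example can be mirrored in a few lines of Python. The sketch below is illustrative: the function names are hypothetical, vectors are returned as tuples of components, and the tolerance in the domain test is an assumption needed because `math.cos(math.pi / 2)` is a tiny nonzero number in floating-point arithmetic rather than exactly zero.

```python
import math

def r1(t):
    """r(t) = 4 cos t i + 3 sin t j as an (x, y) tuple; defined for all real t."""
    return (4 * math.cos(t), 3 * math.sin(t))

def r2_is_defined(t, tol=1e-9):
    """tan t and sec t are undefined where cos t = 0 (odd multiples of pi/2)."""
    return abs(math.cos(t)) > tol

def r2(t):
    """r(t) = 3 tan t i + 4 sec t j + 5t k as an (x, y, z) tuple."""
    if not r2_is_defined(t):
        raise ValueError("r2(t) does not exist at odd multiples of pi/2")
    return (3 * math.tan(t), 4 / math.cos(t), 5 * t)
```

With these definitions, `r2(2 * math.pi / 3)` is approximately (−5.196, −8, 10.472), in agreement with −3√3 î − 8 ĵ + (10π/3) k̂, while `r2(math.pi / 2)` raises an error.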

 Exercise 13.1.1

For the vector-valued function r⃗(t) = (t² − 3t) î + (4t + 1) ĵ, evaluate r⃗(0), r⃗(1), and r⃗(−4). Does this function have
any domain restrictions?

Hint
Substitute the appropriate values of t into the function.
Answer
r⃗(0) = ĵ, r⃗(1) = −2 î + 5 ĵ, r⃗(−4) = 28 î − 15 ĵ

The domain of r⃗(t) = (t² − 3t) î + (4t + 1) ĵ is all real numbers.

Example 13.1.1 illustrates an important concept. The domain of a vector-valued function consists of real numbers. The domain can
be all real numbers or a subset of the real numbers. The range of a vector-valued function consists of vectors. Each real number in
the domain of a vector-valued function is mapped to either a two- or a three-dimensional vector.

Graphing Vector-Valued Functions
Recall that a plane vector consists of two quantities: direction and magnitude. Given any point in the plane (the initial point), if we
move in a specific direction for a specific distance, we arrive at a second point. This represents the terminal point of the vector. We
calculate the components of the vector by subtracting the coordinates of the initial point from the coordinates of the terminal point.
A vector is considered to be in standard position if the initial point is located at the origin. When graphing a vector-valued function,
we typically graph the vectors in the range of the function in standard position, because doing so guarantees the uniqueness of the
graph. This convention applies to the graphs of three-dimensional vector-valued functions as well. The graph of a vector-valued
function of the form
\[ \vec{r}(t) = f(t)\,\hat{i} + g(t)\,\hat{j} \]
consists of the set of all points \((f(t), g(t))\), and the path it traces is called a plane curve. The graph of a vector-valued function of
the form
\[ \vec{r}(t) = f(t)\,\hat{i} + g(t)\,\hat{j} + h(t)\,\hat{k} \]
consists of the set of all points \((f(t), g(t), h(t))\), and the path it traces is called a space curve. Any representation of a plane curve
or space curve using a vector-valued function is called a vector parameterization of the curve.
Each plane curve and space curve has an orientation, indicated by arrows drawn in on the curve, that shows the direction of
motion along the curve as the value of the parameter t increases.

 Example 13.1.2 : Graphing a Vector-Valued Function

Create a graph of each of the following vector-valued functions:


1. The plane curve represented by \(\vec{r}(t) = 4\cos t\,\hat{i} + 3\sin t\,\hat{j}\), \(0 \le t \le 2\pi\)
2. The plane curve represented by \(\vec{r}(t) = 4\cos(t^3)\,\hat{i} + 3\sin(t^3)\,\hat{j}\), \(0 \le t \le \sqrt[3]{2\pi}\)
3. The space curve represented by \(\vec{r}(t) = 4\cos t\,\hat{i} + 4\sin t\,\hat{j} + t\,\hat{k}\), \(0 \le t \le 4\pi\)


Solution
1. As with any graph, we start with a table of values. We then graph each of the vectors in the second column of the table in
standard position and connect the terminal points of each vector to form a curve (Figure 13.1.1). This curve turns out to be an
ellipse centered at the origin.
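Table 13.1.1: Table of Values for \(\vec{r}(t) = 4\cos t\,\hat{i} + 3\sin t\,\hat{j}\), \(0 \le t \le 2\pi\)

| \(t\) | \(\vec{r}(t)\) | \(t\) | \(\vec{r}(t)\) |
| \(0\) | \(4\hat{i}\) | \(\pi\) | \(-4\hat{i}\) |
| \(\frac{\pi}{4}\) | \(2\sqrt{2}\,\hat{i} + \frac{3\sqrt{2}}{2}\hat{j}\) | \(\frac{5\pi}{4}\) | \(-2\sqrt{2}\,\hat{i} - \frac{3\sqrt{2}}{2}\hat{j}\) |
| \(\frac{\pi}{2}\) | \(3\hat{j}\) | \(\frac{3\pi}{2}\) | \(-3\hat{j}\) |
| \(\frac{3\pi}{4}\) | \(-2\sqrt{2}\,\hat{i} + \frac{3\sqrt{2}}{2}\hat{j}\) | \(\frac{7\pi}{4}\) | \(2\sqrt{2}\,\hat{i} - \frac{3\sqrt{2}}{2}\hat{j}\) |
| \(2\pi\) | \(4\hat{i}\) | | |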

Figure 13.1.1 : The graph of the first vector-valued function is an ellipse.
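2. The table of values for \(\vec{r}(t) = 4\cos(t^3)\,\hat{i} + 3\sin(t^3)\,\hat{j}\), \(0 \le t \le \sqrt[3]{2\pi}\), is as follows:

Table of Values for \(\vec{r}(t) = 4\cos(t^3)\,\hat{i} + 3\sin(t^3)\,\hat{j}\), \(0 \le t \le \sqrt[3]{2\pi}\)

| \(t\) | \(\vec{r}(t)\) | \(t\) | \(\vec{r}(t)\) |
| \(0\) | \(4\hat{i}\) | \(\sqrt[3]{\pi}\) | \(-4\hat{i}\) |
| \(\sqrt[3]{\frac{\pi}{4}}\) | \(2\sqrt{2}\,\hat{i} + \frac{3\sqrt{2}}{2}\hat{j}\) | \(\sqrt[3]{\frac{5\pi}{4}}\) | \(-2\sqrt{2}\,\hat{i} - \frac{3\sqrt{2}}{2}\hat{j}\) |
| \(\sqrt[3]{\frac{\pi}{2}}\) | \(3\hat{j}\) | \(\sqrt[3]{\frac{3\pi}{2}}\) | \(-3\hat{j}\) |
| \(\sqrt[3]{\frac{3\pi}{4}}\) | \(-2\sqrt{2}\,\hat{i} + \frac{3\sqrt{2}}{2}\hat{j}\) | \(\sqrt[3]{\frac{7\pi}{4}}\) | \(2\sqrt{2}\,\hat{i} - \frac{3\sqrt{2}}{2}\hat{j}\) |
| \(\sqrt[3]{2\pi}\) | \(4\hat{i}\) | | |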

The graph of this curve is also an ellipse centered at the origin.

Figure 13.1.2 : The graph of the second vector-valued function is also an ellipse.
3. We go through the same procedure for a three-dimensional vector function.
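Table of Values for \(\vec{r}(t) = 4\cos t\,\hat{i} + 4\sin t\,\hat{j} + t\,\hat{k}\), \(0 \le t \le 4\pi\)

| \(t\) | \(\vec{r}(t)\) | \(t\) | \(\vec{r}(t)\) |
| \(0\) | \(4\hat{i}\) | \(\pi\) | \(-4\hat{i} + \pi\hat{k}\) |
| \(\frac{\pi}{4}\) | \(2\sqrt{2}\,\hat{i} + 2\sqrt{2}\,\hat{j} + \frac{\pi}{4}\hat{k}\) | \(\frac{5\pi}{4}\) | \(-2\sqrt{2}\,\hat{i} - 2\sqrt{2}\,\hat{j} + \frac{5\pi}{4}\hat{k}\) |
| \(\frac{\pi}{2}\) | \(4\hat{j} + \frac{\pi}{2}\hat{k}\) | \(\frac{3\pi}{2}\) | \(-4\hat{j} + \frac{3\pi}{2}\hat{k}\) |
| \(\frac{3\pi}{4}\) | \(-2\sqrt{2}\,\hat{i} + 2\sqrt{2}\,\hat{j} + \frac{3\pi}{4}\hat{k}\) | \(\frac{7\pi}{4}\) | \(2\sqrt{2}\,\hat{i} - 2\sqrt{2}\,\hat{j} + \frac{7\pi}{4}\hat{k}\) |
| \(2\pi\) | \(4\hat{i} + 2\pi\hat{k}\) | | |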

The values then repeat themselves, except for the fact that the coefficient of \(\hat{k}\) is always increasing (Figure 13.1.3). This curve is
called a helix. Notice that if the \(\hat{k}\) component is eliminated, then the function becomes \(\vec{r}(t) = 4\cos t\,\hat{i} + 4\sin t\,\hat{j}\), which is a
circle of radius 4 centered at the origin.

Figure 13.1.3 : The graph of the third vector-valued function is a helix.

You may notice that the graphs in parts 1 and 2 are identical. This happens because the function describing curve 2 is a so-called
reparameterization of the function describing curve 1. In fact, any curve has an infinite number of reparameterizations; for example,
we can replace \(t\) with \(2t\) in any of the three previous curves without changing the shape of the curve. The interval over which \(t\) is
defined may change, but that is all. We return to this idea later in this chapter when we study arc-length parameterization. As
mentioned, the curve graphed in Figure 13.1.3 is called a helix. The curve resembles a spring, with a circular cross-section
looking down along the \(z\)-axis. It is possible for a helix to be elliptical in cross-section as well. For example, the vector-valued
function \(\vec{r}(t) = 4\cos t\,\hat{i} + 3\sin t\,\hat{j} + t\,\hat{k}\) describes an elliptical helix. The projection of this helix into the \(xy\)-plane is an
ellipse. Last, the arrows in the graph of this helix indicate the orientation of the curve as \(t\) progresses from 0 to \(4\pi\).
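
The claim that the first two curves of Example 13.1.2 trace the same ellipse can be spot-checked numerically. The sketch below is only an illustration under stated assumptions: the names `r_a`, `r_b`, and `on_ellipse`, the sample count, and the tolerance are choices made here, and the ellipse equation \(x^2/16 + y^2/9 = 1\) is the one implied by the components \(4\cos t\) and \(3\sin t\).

```python
import math

def r_a(t):
    """r(t) = 4 cos t i + 3 sin t j, for 0 <= t <= 2*pi."""
    return (4 * math.cos(t), 3 * math.sin(t))

def r_b(t):
    """Reparameterization r(t) = 4 cos(t^3) i + 3 sin(t^3) j."""
    return r_a(t ** 3)

def on_ellipse(point, tol=1e-9):
    """True if the point satisfies x^2/16 + y^2/9 = 1 within tol."""
    x, y = point
    return abs(x ** 2 / 16 + y ** 2 / 9 - 1) < tol

# Sample the second curve over 0 <= t <= cube root of 2*pi.
t_max_b = (2 * math.pi) ** (1 / 3)
samples_b = [t_max_b * k / 100 for k in range(101)]
all_on_ellipse = all(on_ellipse(r_b(t)) for t in samples_b)
print("every sampled point of curve 2 lies on the ellipse:", all_on_ellipse)

# The same point is reached at different parameter values.
print(r_a(math.pi / 2), r_b((math.pi / 2) ** (1 / 3)))
```

The two parameterizations visit the same points; only the parameter value at which each point is reached differs.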

 Exercise 13.1.2

Create a graph of the vector-valued function \(\vec{r}(t) = (t^2 - 1)\,\hat{i} + (2t - 3)\,\hat{j}\), \(0 \le t \le 3\).

Hint
Start by making a table of values, then graph the vectors for each value of t.

Answer

At this point, you may notice a similarity between vector-valued functions and parameterized curves. Indeed, given a vector-valued
function \(\vec{r}(t) = f(t)\,\hat{i} + g(t)\,\hat{j}\), we can define \(x = f(t)\) and \(y = g(t)\). If a restriction exists on the values of \(t\) (for example, \(t\) is
restricted to the interval \([a, b]\) for some constants \(a < b\)), then this restriction is enforced on the parameter. The graph of the
parameterized function would then agree with the graph of the vector-valued function, except that the vector-valued graph would
represent vectors rather than points. Since we can parameterize a curve defined by a function \(y = f(x)\), it is also possible to
represent an arbitrary plane curve by a vector-valued function.

Limits and Continuity of a Vector-Valued Function


We now take a look at the limit of a vector-valued function. Understanding this limit is essential for studying the calculus of
vector-valued functions.

 Definition: limit of a vector-valued function



A vector-valued function \(\vec{r}\) approaches the limit \(\vec{L}\) as \(t\) approaches \(a\), written
\[ \lim_{t \to a} \vec{r}(t) = \vec{L}, \]
provided
\[ \lim_{t \to a} \left\| \vec{r}(t) - \vec{L} \right\| = 0. \]

This is a rigorous definition of the limit of a vector-valued function. In practice, we use the following theorem:

 Theorem: Limit of a vector-valued function

Let \(f\), \(g\), and \(h\) be functions of \(t\). Then the limit of the vector-valued function \(\vec{r}(t) = f(t)\,\hat{i} + g(t)\,\hat{j}\) as \(t\) approaches \(a\) is given
by
\[ \lim_{t \to a} \vec{r}(t) = \left[\lim_{t \to a} f(t)\right]\hat{i} + \left[\lim_{t \to a} g(t)\right]\hat{j}, \tag{13.1.1} \]
provided the limits \(\lim_{t \to a} f(t)\) and \(\lim_{t \to a} g(t)\) exist.

Similarly, the limit of the vector-valued function \(\vec{r}(t) = f(t)\,\hat{i} + g(t)\,\hat{j} + h(t)\,\hat{k}\) as \(t\) approaches \(a\) is given by
\[ \lim_{t \to a} \vec{r}(t) = \left[\lim_{t \to a} f(t)\right]\hat{i} + \left[\lim_{t \to a} g(t)\right]\hat{j} + \left[\lim_{t \to a} h(t)\right]\hat{k}, \tag{13.1.2} \]
provided the limits \(\lim_{t \to a} f(t)\), \(\lim_{t \to a} g(t)\), and \(\lim_{t \to a} h(t)\) exist.

In the following example, we show how to calculate the limit of a vector-valued function.

 Example 13.1.3: Evaluating the Limit of a Vector-Valued Function
For each of the following vector-valued functions, calculate \(\lim_{t \to 3} \vec{r}(t)\).

a. \(\vec{r}(t) = (t^2 - 3t + 4)\,\hat{i} + (4t + 3)\,\hat{j}\)

b. \(\vec{r}(t) = \dfrac{2t - 4}{t + 1}\,\hat{i} + \dfrac{t}{t^2 + 1}\,\hat{j} + (4t - 3)\,\hat{k}\)

Solution
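a. Use Equation 13.1.1 and substitute the value \(t = 3\) into the two component expressions:
\[
\begin{aligned}
\lim_{t \to 3} \vec{r}(t) &= \lim_{t \to 3} \left[(t^2 - 3t + 4)\,\hat{i} + (4t + 3)\,\hat{j}\right] \\
&= \left[\lim_{t \to 3} (t^2 - 3t + 4)\right]\hat{i} + \left[\lim_{t \to 3} (4t + 3)\right]\hat{j} \\
&= 4\hat{i} + 15\hat{j}
\end{aligned}
\]

b. Use Equation 13.1.2 and substitute the value \(t = 3\) into the three component expressions:
\[
\begin{aligned}
\lim_{t \to 3} \vec{r}(t) &= \lim_{t \to 3} \left(\frac{2t - 4}{t + 1}\,\hat{i} + \frac{t}{t^2 + 1}\,\hat{j} + (4t - 3)\,\hat{k}\right) \\
&= \left[\lim_{t \to 3} \left(\frac{2t - 4}{t + 1}\right)\right]\hat{i} + \left[\lim_{t \to 3} \left(\frac{t}{t^2 + 1}\right)\right]\hat{j} + \left[\lim_{t \to 3} (4t - 3)\right]\hat{k} \\
&= \frac{1}{2}\hat{i} + \frac{3}{10}\hat{j} + 9\hat{k}
\end{aligned}
\]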
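
The component-wise limit in part b can be illustrated numerically by evaluating the function at parameter values approaching 3 from both sides. This is a rough sketch rather than a proof: the function name `r` and the step sizes are arbitrary choices made for this example.

```python
def r(t):
    """Components of r(t) = (2t - 4)/(t + 1) i + t/(t^2 + 1) j + (4t - 3) k."""
    return ((2 * t - 4) / (t + 1), t / (t ** 2 + 1), 4 * t - 3)

# Approach t = 3 from the left and from the right with shrinking steps.
for h in (1e-1, 1e-3, 1e-5):
    left = r(3 - h)
    right = r(3 + h)
    print(f"h = {h:g}")
    print("  from the left: ", tuple(round(c, 6) for c in left))
    print("  from the right:", tuple(round(c, 6) for c in right))
```

As the step shrinks, each component settles toward the corresponding entry of \(\frac{1}{2}\hat{i} + \frac{3}{10}\hat{j} + 9\hat{k}\).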

 Exercise 13.1.3
Calculate \(\lim_{t \to 2} \vec{r}(t)\) for the function
\[ \vec{r}(t) = \sqrt{t^2 + 3t - 1}\,\hat{i} - (4t - 3)\,\hat{j} - \sin\left(\frac{(t + 1)\pi}{2}\right)\hat{k}. \]

Hint
Use Equation 13.1.2 from the preceding theorem.

Answer
\[ \lim_{t \to 2} \vec{r}(t) = 3\hat{i} - 5\hat{j} + \hat{k} \]

Now that we know how to calculate the limit of a vector-valued function, we can define continuity at a point for such a function.

 Definitions

Let \(f\), \(g\), and \(h\) be functions of \(t\). Then, the vector-valued function \(\vec{r}(t) = f(t)\,\hat{i} + g(t)\,\hat{j}\) is continuous at point \(t = a\) if the
following three conditions hold:
1. \(\vec{r}(a)\) exists
2. \(\lim_{t \to a} \vec{r}(t)\) exists
3. \(\lim_{t \to a} \vec{r}(t) = \vec{r}(a)\)

Similarly, the vector-valued function \(\vec{r}(t) = f(t)\,\hat{i} + g(t)\,\hat{j} + h(t)\,\hat{k}\) is continuous at point \(t = a\) if the following three
conditions hold:
1. \(\vec{r}(a)\) exists
2. \(\lim_{t \to a} \vec{r}(t)\) exists
3. \(\lim_{t \to a} \vec{r}(t) = \vec{r}(a)\)

Summary
A vector-valued function is a function of the form \(\vec{r}(t) = f(t)\,\hat{i} + g(t)\,\hat{j}\) or \(\vec{r}(t) = f(t)\,\hat{i} + g(t)\,\hat{j} + h(t)\,\hat{k}\), where the
component functions \(f\), \(g\), and \(h\) are real-valued functions of the parameter \(t\).
The graph of a vector-valued function of the form \(\vec{r}(t) = f(t)\,\hat{i} + g(t)\,\hat{j}\) is called a plane curve. The graph of a vector-valued
function of the form \(\vec{r}(t) = f(t)\,\hat{i} + g(t)\,\hat{j} + h(t)\,\hat{k}\) is called a space curve.
It is possible to represent an arbitrary plane curve by a vector-valued function.
To calculate the limit of a vector-valued function, calculate the limits of the component functions separately.
Key Equations
Vector-valued function
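\(\vec{r}(t) = f(t)\,\hat{i} + g(t)\,\hat{j}\) or \(\vec{r}(t) = f(t)\,\hat{i} + g(t)\,\hat{j} + h(t)\,\hat{k}\), or \(\vec{r}(t) = \langle f(t), g(t) \rangle\) or \(\vec{r}(t) = \langle f(t), g(t), h(t) \rangle\)

Limit of a vector-valued function
\(\lim_{t \to a} \vec{r}(t) = \left[\lim_{t \to a} f(t)\right]\hat{i} + \left[\lim_{t \to a} g(t)\right]\hat{j}\) or \(\lim_{t \to a} \vec{r}(t) = \left[\lim_{t \to a} f(t)\right]\hat{i} + \left[\lim_{t \to a} g(t)\right]\hat{j} + \left[\lim_{t \to a} h(t)\right]\hat{k}\)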

Glossary
component functions
the component functions of the vector-valued function \(\vec{r}(t) = f(t)\,\hat{i} + g(t)\,\hat{j}\) are \(f(t)\) and \(g(t)\), and the component functions of
the vector-valued function \(\vec{r}(t) = f(t)\,\hat{i} + g(t)\,\hat{j} + h(t)\,\hat{k}\) are \(f(t)\), \(g(t)\), and \(h(t)\)

helix
a three-dimensional curve in the shape of a spiral

limit of a vector-valued function


a vector-valued function \(\vec{r}(t)\) has a limit \(\vec{L}\) as \(t\) approaches \(a\) if \(\lim_{t \to a} \left\| \vec{r}(t) - \vec{L} \right\| = 0\)

plane curve
the set of ordered pairs (f (t), g(t)) together with their defining parametric equations x = f (t) and y = g(t)

reparameterization
an alternative parameterization of a given vector-valued function

space curve
the set of ordered triples (f (t), g(t), h(t)) together with their defining parametric equations x = f (t) , y = g(t) and z = h(t)

vector parameterization
any representation of a plane or space curve using a vector-valued function

vector-valued function
a function of the form \(\vec{r}(t) = f(t)\,\hat{i} + g(t)\,\hat{j}\) or \(\vec{r}(t) = f(t)\,\hat{i} + g(t)\,\hat{j} + h(t)\,\hat{k}\), where the component functions \(f\), \(g\), and \(h\) are
real-valued functions of the parameter \(t\)

Contributors and Attributions


Gilbert Strang (MIT) and Edwin “Jed” Herman (Harvey Mudd) with many contributing authors. This content by OpenStax is
licensed with a CC-BY-SA-NC 4.0 license. Download for free at http://cnx.org.
Edited by Paul Seeburger (Monroe Community College)

13.1: Vector Functions and Space Curves is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
13.1: Vector-Valued Functions and Space Curves by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0. Original source:
https://openstax.org/details/books/calculus-volume-1.

13.2: Derivatives and Integrals of Vector Functions
 Learning Objectives
Write an expression for the derivative of a vector-valued function.
Find the tangent vector at a point for a given position vector.
Find the unit tangent vector at a point for a given position vector and explain its significance.
Calculate the definite integral of a vector-valued function.

To study the calculus of vector-valued functions, we follow a similar path to the one we took in studying real-valued functions. First,
we define the derivative, then we examine applications of the derivative, then we move on to defining integrals. However, we will find
some interesting new ideas along the way as a result of the vector nature of these functions and the properties of space curves.

Derivatives of Vector-Valued Functions


Now that we have seen what a vector-valued function is and how to take its limit, the next step is to learn how to differentiate a
vector-valued function. The definition of the derivative of a vector-valued function is nearly identical to the definition of the derivative
of a real-valued function of one variable. However, because the range of a vector-valued function consists of vectors, the same is true
for the range of the derivative of a vector-valued function.

 Definition: Derivative of Vector-Valued Functions


The derivative of a vector-valued function \(\vec{r}(t)\) is
\[ \vec{r}\,'(t) = \lim_{\Delta t \to 0} \frac{\vec{r}(t + \Delta t) - \vec{r}(t)}{\Delta t} \tag{13.2.1} \]
provided the limit exists. If \(\vec{r}\,'(t)\) exists, then \(\vec{r}\) is differentiable at \(t\). If \(\vec{r}\,'(t)\) exists for all \(t\) in an open interval \((a, b)\), then
\(\vec{r}\) is differentiable over the interval \((a, b)\). For the function to be differentiable over the closed interval \([a, b]\), the following two
one-sided limits must exist as well:
\[ \vec{r}\,'(a) = \lim_{\Delta t \to 0^+} \frac{\vec{r}(a + \Delta t) - \vec{r}(a)}{\Delta t} \]
and
\[ \vec{r}\,'(b) = \lim_{\Delta t \to 0^-} \frac{\vec{r}(b + \Delta t) - \vec{r}(b)}{\Delta t}. \]

Many of the rules for calculating derivatives of real-valued functions can be applied to calculating the derivatives of vector-valued
functions as well. Recall that the derivative of a real-valued function can be interpreted as the slope of a tangent line or the
instantaneous rate of change of the function. The derivative of a vector-valued function can be understood to be an instantaneous rate
of change as well; for example, when the function represents the position of an object at a given point in time, the derivative
represents its velocity at that same point in time.
We now demonstrate taking the derivative of a vector-valued function.

 Example 13.2.1: Finding the Derivative of a Vector-Valued Function


Use the definition to calculate the derivative of the function
\[ \vec{r}(t) = (3t + 4)\,\hat{i} + (t^2 - 4t + 3)\,\hat{j}. \]

Solution
Let’s use Equation 13.2.1:

13.2.1 https://math.libretexts.org/@go/page/4532
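\[
\begin{aligned}
\vec{r}\,'(t) &= \lim_{\Delta t \to 0} \frac{\vec{r}(t + \Delta t) - \vec{r}(t)}{\Delta t} \\
&= \lim_{\Delta t \to 0} \frac{\left[(3(t + \Delta t) + 4)\,\hat{i} + ((t + \Delta t)^2 - 4(t + \Delta t) + 3)\,\hat{j}\right] - \left[(3t + 4)\,\hat{i} + (t^2 - 4t + 3)\,\hat{j}\right]}{\Delta t} \\
&= \lim_{\Delta t \to 0} \frac{(3t + 3\Delta t + 4)\,\hat{i} - (3t + 4)\,\hat{i} + (t^2 + 2t\Delta t + (\Delta t)^2 - 4t - 4\Delta t + 3)\,\hat{j} - (t^2 - 4t + 3)\,\hat{j}}{\Delta t} \\
&= \lim_{\Delta t \to 0} \frac{(3\Delta t)\,\hat{i} + (2t\Delta t + (\Delta t)^2 - 4\Delta t)\,\hat{j}}{\Delta t} \\
&= \lim_{\Delta t \to 0} \left(3\hat{i} + (2t + \Delta t - 4)\,\hat{j}\right) \\
&= 3\hat{i} + (2t - 4)\,\hat{j}
\end{aligned}
\]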

 Exercise 13.2.1

Use the definition to calculate the derivative of the function \(\vec{r}(t) = (2t^2 + 3)\,\hat{i} + (5t - 6)\,\hat{j}\).

Hint
Use Equation 13.2.1.

Answer
\[ \vec{r}\,'(t) = 4t\,\hat{i} + 5\hat{j} \]

Notice that in the calculations in Example 13.2.1, we could also obtain the answer by first calculating the derivative of each
component function, then putting these derivatives back into the vector-valued function. This is always true for calculating the
derivative of a vector-valued function, whether it is in two or three dimensions. We state this in the following theorem. The proof of
this theorem follows directly from the definitions of the limit of a vector-valued function and the derivative of a vector-valued
function.

 Theorem 13.2.1: Differentiation of Vector-Valued Functions


Let f , g , and h be differentiable functions of t .

1. If \(\vec{r}(t) = f(t)\,\hat{i} + g(t)\,\hat{j}\), then
\[ \vec{r}\,'(t) = f'(t)\,\hat{i} + g'(t)\,\hat{j}. \]

2. If \(\vec{r}(t) = f(t)\,\hat{i} + g(t)\,\hat{j} + h(t)\,\hat{k}\), then
\[ \vec{r}\,'(t) = f'(t)\,\hat{i} + g'(t)\,\hat{j} + h'(t)\,\hat{k}. \]
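
As a numerical illustration of Theorem 13.2.1, the sketch below compares a difference quotient of \(\vec{r}(t) = (3t + 4)\,\hat{i} + (t^2 - 4t + 3)\,\hat{j}\) from Example 13.2.1 with the component-wise derivative \(3\hat{i} + (2t - 4)\,\hat{j}\). The helper names are hypothetical, and the symmetric (central) difference quotient is a numerical convenience assumed here; it is not the one-sided quotient used in the definition.

```python
def r(t):
    """Components of r(t) = (3t + 4) i + (t^2 - 4t + 3) j."""
    return (3 * t + 4, t ** 2 - 4 * t + 3)

def r_prime_exact(t):
    """Component-wise derivative from Theorem 13.2.1: 3 i + (2t - 4) j."""
    return (3.0, 2 * t - 4)

def difference_quotient(func, t, dt=1e-6):
    """Central difference quotient applied to each component of func."""
    forward = func(t + dt)
    backward = func(t - dt)
    return tuple((f - b) / (2 * dt) for f, b in zip(forward, backward))

for t in (-1.0, 0.0, 1.5):
    print(t, difference_quotient(r, t), r_prime_exact(t))
```

The two tuples printed for each \(t\) should agree to several decimal places, which is what differentiating component by component predicts.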

 Example 13.2.2: Calculating the Derivative of Vector-Valued Functions

Use Theorem 13.2.1 to calculate the derivative of each of the following functions.

a. \(\vec{r}(t) = (6t + 8)\,\hat{i} + (4t^2 + 2t - 3)\,\hat{j}\)

b. \(\vec{r}(t) = 3\cos t\,\hat{i} + 4\sin t\,\hat{j}\)

c. \(\vec{r}(t) = e^t \sin t\,\hat{i} + e^t \cos t\,\hat{j} - e^{2t}\,\hat{k}\)

Solution
We use Theorem 13.2.1 and what we know about differentiating functions of one variable.
a. The first component of \(\vec{r}(t) = (6t + 8)\,\hat{i} + (4t^2 + 2t - 3)\,\hat{j}\) is \(f(t) = 6t + 8\). The second component is \(g(t) = 4t^2 + 2t - 3\).
We have \(f'(t) = 6\) and \(g'(t) = 8t + 2\), so Theorem 13.2.1 gives \(\vec{r}\,'(t) = 6\hat{i} + (8t + 2)\,\hat{j}\).

b. The first component is \(f(t) = 3\cos t\) and the second component is \(g(t) = 4\sin t\). We have \(f'(t) = -3\sin t\) and
\(g'(t) = 4\cos t\), so we obtain \(\vec{r}\,'(t) = -3\sin t\,\hat{i} + 4\cos t\,\hat{j}\).

c. The first component of \(\vec{r}(t) = e^t \sin t\,\hat{i} + e^t \cos t\,\hat{j} - e^{2t}\,\hat{k}\) is \(f(t) = e^t \sin t\), the second component is \(g(t) = e^t \cos t\), and
the third component is \(h(t) = -e^{2t}\). We have \(f'(t) = e^t(\sin t + \cos t)\), \(g'(t) = e^t(\cos t - \sin t)\), and \(h'(t) = -2e^{2t}\), so the
theorem gives \(\vec{r}\,'(t) = e^t(\sin t + \cos t)\,\hat{i} + e^t(\cos t - \sin t)\,\hat{j} - 2e^{2t}\,\hat{k}\).

 Exercise 13.2.2

Calculate the derivative of the function


\[ \vec{r}(t) = (t \ln t)\,\hat{i} + (5e^t)\,\hat{j} + (\cos t - \sin t)\,\hat{k}. \]

Hint
Identify the component functions and use Theorem 13.2.1.

Answer
\[ \vec{r}\,'(t) = (1 + \ln t)\,\hat{i} + 5e^t\,\hat{j} - (\sin t + \cos t)\,\hat{k} \]

We can extend to vector-valued functions the properties of the derivative that we presented previously. In particular, the constant
multiple rule, the sum and difference rules, the product rule, and the chain rule all extend to vector-valued functions. However, in the
case of the product rule, there are actually three extensions:
1. for a real-valued function multiplied by a vector-valued function,
2. for the dot product of two vector-valued functions, and
3. for the cross product of two vector-valued functions.

 Theorem: Properties of the Derivative of Vector-Valued Functions


Let \(\vec{r}\) and \(\vec{u}\) be differentiable vector-valued functions of \(t\), let \(f\) be a differentiable real-valued function of \(t\), and let \(c\) be a scalar.

i. \(\dfrac{d}{dt}[c\,\vec{r}(t)] = c\,\vec{r}\,'(t)\) (Scalar multiple)

ii. \(\dfrac{d}{dt}[\vec{r}(t) \pm \vec{u}(t)] = \vec{r}\,'(t) \pm \vec{u}\,'(t)\) (Sum and difference)

iii. \(\dfrac{d}{dt}[f(t)\,\vec{u}(t)] = f'(t)\,\vec{u}(t) + f(t)\,\vec{u}\,'(t)\) (Scalar product)

iv. \(\dfrac{d}{dt}[\vec{r}(t) \cdot \vec{u}(t)] = \vec{r}\,'(t) \cdot \vec{u}(t) + \vec{r}(t) \cdot \vec{u}\,'(t)\) (Dot product)

v. \(\dfrac{d}{dt}[\vec{r}(t) \times \vec{u}(t)] = \vec{r}\,'(t) \times \vec{u}(t) + \vec{r}(t) \times \vec{u}\,'(t)\) (Cross product)

vi. \(\dfrac{d}{dt}[\vec{r}(f(t))] = \vec{r}\,'(f(t)) \cdot f'(t)\) (Chain rule)

vii. If \(\vec{r}(t) \cdot \vec{r}(t) = c\), then \(\vec{r}(t) \cdot \vec{r}\,'(t) = 0\).

 Proof

The proofs of the first two properties follow directly from the definition of the derivative of a vector-valued function. The third
property can be derived from the first two properties, along with the product rule. Let \(\vec{u}(t) = g(t)\,\hat{i} + h(t)\,\hat{j}\). Then
\[
\begin{aligned}
\frac{d}{dt}[f(t)\,\vec{u}(t)] &= \frac{d}{dt}\left[f(t)\left(g(t)\,\hat{i} + h(t)\,\hat{j}\right)\right] \\
&= \frac{d}{dt}\left[f(t)g(t)\,\hat{i} + f(t)h(t)\,\hat{j}\right] \\
&= \frac{d}{dt}[f(t)g(t)]\,\hat{i} + \frac{d}{dt}[f(t)h(t)]\,\hat{j} \\
&= (f'(t)g(t) + f(t)g'(t))\,\hat{i} + (f'(t)h(t) + f(t)h'(t))\,\hat{j} \\
&= f'(t)\,\vec{u}(t) + f(t)\,\vec{u}\,'(t).
\end{aligned}
\]

To prove property iv, let \(\vec{r}(t) = f_1(t)\,\hat{i} + g_1(t)\,\hat{j}\) and \(\vec{u}(t) = f_2(t)\,\hat{i} + g_2(t)\,\hat{j}\). Then
\[
\begin{aligned}
\frac{d}{dt}[\vec{r}(t) \cdot \vec{u}(t)] &= \frac{d}{dt}[f_1(t)f_2(t) + g_1(t)g_2(t)] \\
&= f_1'(t)f_2(t) + f_1(t)f_2'(t) + g_1'(t)g_2(t) + g_1(t)g_2'(t) \\
&= f_1'(t)f_2(t) + g_1'(t)g_2(t) + f_1(t)f_2'(t) + g_1(t)g_2'(t) \\
&= (f_1'\,\hat{i} + g_1'\,\hat{j}) \cdot (f_2\,\hat{i} + g_2\,\hat{j}) + (f_1\,\hat{i} + g_1\,\hat{j}) \cdot (f_2'\,\hat{i} + g_2'\,\hat{j}) \\
&= \vec{r}\,'(t) \cdot \vec{u}(t) + \vec{r}(t) \cdot \vec{u}\,'(t).
\end{aligned}
\]

The proof of property v is similar to that of property iv. Property vi can be proved using the chain rule. Last, property vii follows
from property iv:
\[
\begin{aligned}
\frac{d}{dt}[\vec{r}(t) \cdot \vec{r}(t)] &= \frac{d}{dt}[c] \\
\vec{r}\,'(t) \cdot \vec{r}(t) + \vec{r}(t) \cdot \vec{r}\,'(t) &= 0 \\
2\,\vec{r}(t) \cdot \vec{r}\,'(t) &= 0 \\
\vec{r}(t) \cdot \vec{r}\,'(t) &= 0.
\end{aligned}
\]

Now for some examples using these properties.

 Example 13.2.3: Using the Properties of Derivatives of Vector-Valued Functions

Given the vector-valued functions


\[ \vec{r}(t) = (6t + 8)\,\hat{i} + (4t^2 + 2t - 3)\,\hat{j} + 5t\,\hat{k} \]
and
\[ \vec{u}(t) = (t^2 - 3)\,\hat{i} + (2t + 4)\,\hat{j} + (t^3 - 3t)\,\hat{k}, \]
calculate each of the following derivatives using the properties of the derivative of vector-valued functions.

a. \(\dfrac{d}{dt}[\vec{r}(t) \cdot \vec{u}(t)]\)

b. \(\dfrac{d}{dt}[\vec{u}(t) \times \vec{u}\,'(t)]\)

Solution

a. We have \(\vec{r}\,'(t) = 6\hat{i} + (8t + 2)\,\hat{j} + 5\hat{k}\) and \(\vec{u}\,'(t) = 2t\,\hat{i} + 2\hat{j} + (3t^2 - 3)\,\hat{k}\). Therefore, according to property iv:
\[
\begin{aligned}
\frac{d}{dt}[\vec{r}(t) \cdot \vec{u}(t)] &= \vec{r}\,'(t) \cdot \vec{u}(t) + \vec{r}(t) \cdot \vec{u}\,'(t) \\
&= \left(6\hat{i} + (8t + 2)\,\hat{j} + 5\hat{k}\right) \cdot \left((t^2 - 3)\,\hat{i} + (2t + 4)\,\hat{j} + (t^3 - 3t)\,\hat{k}\right) \\
&\quad + \left((6t + 8)\,\hat{i} + (4t^2 + 2t - 3)\,\hat{j} + 5t\,\hat{k}\right) \cdot \left(2t\,\hat{i} + 2\hat{j} + (3t^2 - 3)\,\hat{k}\right) \\
&= 6(t^2 - 3) + (8t + 2)(2t + 4) + 5(t^3 - 3t) \\
&\quad + 2t(6t + 8) + 2(4t^2 + 2t - 3) + 5t(3t^2 - 3) \\
&= 20t^3 + 42t^2 + 26t - 16.
\end{aligned}
\]

b. First, we need to adapt property v for this problem:
\[ \frac{d}{dt}[\vec{u}(t) \times \vec{u}\,'(t)] = \vec{u}\,'(t) \times \vec{u}\,'(t) + \vec{u}(t) \times \vec{u}\,''(t). \]
Recall that the cross product of any vector with itself is zero. Furthermore, \(\vec{u}\,''(t)\) represents the second derivative of \(\vec{u}(t)\):
\[ \vec{u}\,''(t) = \frac{d}{dt}[\vec{u}\,'(t)] = \frac{d}{dt}\left[2t\,\hat{i} + 2\hat{j} + (3t^2 - 3)\,\hat{k}\right] = 2\hat{i} + 6t\,\hat{k}. \]
Therefore,
\[
\begin{aligned}
\frac{d}{dt}[\vec{u}(t) \times \vec{u}\,'(t)] &= \vec{0} + \left((t^2 - 3)\,\hat{i} + (2t + 4)\,\hat{j} + (t^3 - 3t)\,\hat{k}\right) \times \left(2\hat{i} + 6t\,\hat{k}\right) \\
&= \begin{vmatrix} \hat{i} & \hat{j} & \hat{k} \\ t^2 - 3 & 2t + 4 & t^3 - 3t \\ 2 & 0 & 6t \end{vmatrix} \\
&= 6t(2t + 4)\,\hat{i} - \left(6t(t^2 - 3) - 2(t^3 - 3t)\right)\hat{j} - 2(2t + 4)\,\hat{k} \\
&= (12t^2 + 24t)\,\hat{i} + (12t - 4t^3)\,\hat{j} - (4t + 8)\,\hat{k}.
\end{aligned}
\]
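
Both results of Example 13.2.3 can be spot-checked by evaluating each side at a few values of \(t\). The sketch below is illustrative only: the helper names (`dot`, `cross`, `product_rule`, and so on) are introduced here, and the derivatives are typed in by hand from the solution rather than computed symbolically.

```python
def dot(a, b):
    """Dot product of two 3-component tuples."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Cross product a x b of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def r(t):
    return (6 * t + 8, 4 * t ** 2 + 2 * t - 3, 5 * t)

def u(t):
    return (t ** 2 - 3, 2 * t + 4, t ** 3 - 3 * t)

def r_prime(t):
    return (6, 8 * t + 2, 5)

def u_prime(t):
    return (2 * t, 2, 3 * t ** 2 - 3)

def u_double_prime(t):
    return (2, 0, 6 * t)

def product_rule(t):
    """Property iv: r'(t) . u(t) + r(t) . u'(t)."""
    return dot(r_prime(t), u(t)) + dot(r(t), u_prime(t))

def closed_form(t):
    """Simplified answer from the solution."""
    return 20 * t ** 3 + 42 * t ** 2 + 26 * t - 16

for t in (-2, 0, 1, 2):
    print(t, product_rule(t), closed_form(t))
    # u'(t) x u'(t) vanishes, so only u(t) x u''(t) remains.
    print(t, cross(u(t), u_double_prime(t)),
          (12 * t ** 2 + 24 * t, 12 * t - 4 * t ** 3, -(4 * t + 8)))
```

With integer inputs the arithmetic is exact, so each pair of printed values should match exactly.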

 Exercise 13.2.3
Calculate \(\dfrac{d}{dt}[\vec{r}(t) \cdot \vec{r}\,'(t)]\) and \(\dfrac{d}{dt}[\vec{u}(t) \times \vec{r}(t)]\) for the vector-valued functions:

\(\vec{r}(t) = \cos t\,\hat{i} + \sin t\,\hat{j} - e^{2t}\,\hat{k}\)
\(\vec{u}(t) = t\,\hat{i} + \sin t\,\hat{j} + \cos t\,\hat{k}\)

Hint
Follow the same steps as in Example 13.2.3.

Answer
\[ \frac{d}{dt}[\vec{r}(t) \cdot \vec{r}\,'(t)] = 8e^{4t} \]

\[ \frac{d}{dt}[\vec{u}(t) \times \vec{r}(t)] = -\left(e^{2t}(\cos t + 2\sin t) + \cos 2t\right)\hat{i} + \left(e^{2t}(2t + 1) - \sin 2t\right)\hat{j} + (t\cos t + \sin t - \cos 2t)\,\hat{k} \]

Tangent Vectors and Unit Tangent Vectors


Recall that the derivative at a point can be interpreted as the slope of the tangent line to the graph at that point. In the case of a vector-
valued function, the derivative provides a tangent vector to the curve represented by the function. Consider the vector-valued function
\[ \vec{r}(t) = \cos t\,\hat{i} + \sin t\,\hat{j}. \tag{13.2.2} \]

The derivative of this function is
\[ \vec{r}\,'(t) = -\sin t\,\hat{i} + \cos t\,\hat{j}. \]

If we substitute the value \(t = \pi/6\) into both functions we get
\[ \vec{r}\left(\frac{\pi}{6}\right) = \frac{\sqrt{3}}{2}\hat{i} + \frac{1}{2}\hat{j} \]
and
\[ \vec{r}\,'\left(\frac{\pi}{6}\right) = -\frac{1}{2}\hat{i} + \frac{\sqrt{3}}{2}\hat{j}. \]

The graph of this function appears in Figure 13.2.1, along with the vectors \(\vec{r}\left(\frac{\pi}{6}\right)\) and \(\vec{r}\,'\left(\frac{\pi}{6}\right)\).

Figure 13.2.1: The tangent line at a point is calculated from the derivative of the vector-valued function \(\vec{r}(t)\).

Notice that the vector \(\vec{r}\,'\left(\frac{\pi}{6}\right)\) is tangent to the circle at the point corresponding to \(t = \frac{\pi}{6}\). This is an example of a tangent vector to
the plane curve defined by Equation 13.2.2.

 Definition: principal unit tangent vector

Let \(C\) be a curve defined by a vector-valued function \(\vec{r}\), and assume that \(\vec{r}\,'(t)\) exists when \(t = t_0\). A tangent vector \(\vec{v}\) at \(t = t_0\) is
any vector such that, when the tail of the vector is placed at point \(\vec{r}(t_0)\) on the graph, vector \(\vec{v}\) is tangent to curve \(C\). Vector
\(\vec{r}\,'(t_0)\) is an example of a tangent vector at point \(t = t_0\). Furthermore, assume that \(\vec{r}\,'(t) \neq \vec{0}\). The principal unit tangent vector at
\(t\) is defined to be
\[ \vec{T}(t) = \frac{\vec{r}\,'(t)}{\|\vec{r}\,'(t)\|}, \]
provided \(\|\vec{r}\,'(t)\| \neq 0\).

The unit tangent vector is exactly what it sounds like: a unit vector that is tangent to the curve. To calculate a unit tangent vector, first
find the derivative r '(t). Second, calculate the magnitude of the derivative. The third step is to divide the derivative by its magnitude.

 Example 13.2.4: Finding a Unit Tangent Vector

Find the unit tangent vector for each of the following vector-valued functions:
a. \(\vec{r}(t) = \cos t\,\hat{i} + \sin t\,\hat{j}\)

b. \(\vec{u}(t) = (3t^2 + 2t)\,\hat{i} + (2 - 4t^3)\,\hat{j} + (6t + 5)\,\hat{k}\)

Solution
a. First step: \(\vec{r}\,'(t) = -\sin t\,\hat{i} + \cos t\,\hat{j}\)

Second step: \(\|\vec{r}\,'(t)\| = \sqrt{(-\sin t)^2 + (\cos t)^2} = 1\)

Third step: \(\vec{T}(t) = \dfrac{\vec{r}\,'(t)}{\|\vec{r}\,'(t)\|} = \dfrac{-\sin t\,\hat{i} + \cos t\,\hat{j}}{1} = -\sin t\,\hat{i} + \cos t\,\hat{j}\)

b. First step: \(\vec{u}\,'(t) = (6t + 2)\,\hat{i} - 12t^2\,\hat{j} + 6\hat{k}\)

Second step:
\[
\begin{aligned}
\|\vec{u}\,'(t)\| &= \sqrt{(6t + 2)^2 + (-12t^2)^2 + 6^2} \\
&= \sqrt{144t^4 + 36t^2 + 24t + 40} \\
&= 2\sqrt{36t^4 + 9t^2 + 6t + 10}
\end{aligned}
\]

Third step:
\[
\begin{aligned}
\vec{T}(t) &= \frac{\vec{u}\,'(t)}{\|\vec{u}\,'(t)\|} = \frac{(6t + 2)\,\hat{i} - 12t^2\,\hat{j} + 6\hat{k}}{2\sqrt{36t^4 + 9t^2 + 6t + 10}} \\
&= \frac{3t + 1}{\sqrt{36t^4 + 9t^2 + 6t + 10}}\,\hat{i} - \frac{6t^2}{\sqrt{36t^4 + 9t^2 + 6t + 10}}\,\hat{j} + \frac{3}{\sqrt{36t^4 + 9t^2 + 6t + 10}}\,\hat{k}
\end{aligned}
\]
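
The three-step procedure (differentiate, take the magnitude, divide) translates directly into a short computation. The Python sketch below applies it to part b; the names `unit_tangent` and `u_prime` are hypothetical, and the derivative is entered by hand from the first step of the solution.

```python
import math

def unit_tangent(derivative):
    """Divide a derivative vector by its magnitude (steps two and three)."""
    norm = math.sqrt(sum(c * c for c in derivative))
    if norm == 0:
        raise ValueError("unit tangent is undefined where the derivative is zero")
    return tuple(c / norm for c in derivative)

def u_prime(t):
    """Step one for part b: u'(t) = (6t + 2) i - 12 t^2 j + 6 k."""
    return (6 * t + 2, -12 * t ** 2, 6.0)

for t in (0.0, 1.0):
    T = unit_tangent(u_prime(t))
    length = math.sqrt(sum(c * c for c in T))
    print(f"t = {t}: T = {T}, |T| = {length}")
```

At each sampled \(t\) the resulting vector has length 1 up to rounding, and its components match the closed-form expression above, for example \(\frac{1}{\sqrt{10}}\hat{i} + \frac{3}{\sqrt{10}}\hat{k}\) at \(t = 0\).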

 Exercise 13.2.4

Find the unit tangent vector for the vector-valued function


\[ \vec{r}(t) = (t^2 - 3)\,\hat{i} + (2t + 1)\,\hat{j} + (t - 2)\,\hat{k}. \]

Hint
Follow the same steps as in Example 13.2.4.

Answer
\[ \vec{T}(t) = \frac{2t}{\sqrt{4t^2 + 5}}\,\hat{i} + \frac{2}{\sqrt{4t^2 + 5}}\,\hat{j} + \frac{1}{\sqrt{4t^2 + 5}}\,\hat{k} \]

Integrals of Vector-Valued Functions


We introduced antiderivatives of real-valued functions in Antiderivatives and definite integrals of real-valued functions in The
Definite Integral. Each of these concepts can be extended to vector-valued functions. Also, just as we can calculate the derivative of a
vector-valued function by differentiating the component functions separately, we can calculate the antiderivative in the same manner.
Furthermore, the Fundamental Theorem of Calculus applies to vector-valued functions as well.
The antiderivative of a vector-valued function appears in applications. For example, if a vector-valued function represents the velocity
of an object at time t, then its antiderivative represents position. Or, if the function represents the acceleration of the object at a given
time, then the antiderivative represents its velocity.

 Definition: Definite and Indefinite Integrals of Vector-Valued Functions

Let f , g , and h be integrable real-valued functions over the closed interval [a, b].
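1. The indefinite integral of a vector-valued function \(\vec{r}(t) = f(t)\,\hat{i} + g(t)\,\hat{j}\) is
\[ \int \left[f(t)\,\hat{i} + g(t)\,\hat{j}\right] dt = \left[\int f(t)\,dt\right]\hat{i} + \left[\int g(t)\,dt\right]\hat{j}. \]
The definite integral of a vector-valued function is
\[ \int_a^b \left[f(t)\,\hat{i} + g(t)\,\hat{j}\right] dt = \left[\int_a^b f(t)\,dt\right]\hat{i} + \left[\int_a^b g(t)\,dt\right]\hat{j}. \]

2. The indefinite integral of a vector-valued function \(\vec{r}(t) = f(t)\,\hat{i} + g(t)\,\hat{j} + h(t)\,\hat{k}\) is
\[ \int \left[f(t)\,\hat{i} + g(t)\,\hat{j} + h(t)\,\hat{k}\right] dt = \left[\int f(t)\,dt\right]\hat{i} + \left[\int g(t)\,dt\right]\hat{j} + \left[\int h(t)\,dt\right]\hat{k}. \]
The definite integral of the vector-valued function is
\[ \int_a^b \left[f(t)\,\hat{i} + g(t)\,\hat{j} + h(t)\,\hat{k}\right] dt = \left[\int_a^b f(t)\,dt\right]\hat{i} + \left[\int_a^b g(t)\,dt\right]\hat{j} + \left[\int_a^b h(t)\,dt\right]\hat{k}. \]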

Since the indefinite integral of a vector-valued function involves indefinite integrals of the component functions, each of these
component integrals contains an integration constant. They can all be different. For example, in the two-dimensional case, we can
have

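\[ \int f(t)\,dt = F(t) + C_1 \quad \text{and} \quad \int g(t)\,dt = G(t) + C_2, \]
where \(F\) and \(G\) are antiderivatives of \(f\) and \(g\), respectively. Then
\[
\begin{aligned}
\int \left[f(t)\,\hat{i} + g(t)\,\hat{j}\right] dt &= \left[\int f(t)\,dt\right]\hat{i} + \left[\int g(t)\,dt\right]\hat{j} \\
&= (F(t) + C_1)\,\hat{i} + (G(t) + C_2)\,\hat{j} \\
&= F(t)\,\hat{i} + G(t)\,\hat{j} + C_1\hat{i} + C_2\hat{j} \\
&= F(t)\,\hat{i} + G(t)\,\hat{j} + \vec{C},
\end{aligned}
\]
where \(\vec{C} = C_1\hat{i} + C_2\hat{j}\). Therefore, the integration constants become a constant vector.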

 Example 13.2.5: Integrating Vector-Valued Functions

Calculate each of the following integrals:

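a. \(\displaystyle\int \left[(3t^2 + 2t)\,\hat{i} + (3t - 6)\,\hat{j} + (6t^3 + 5t^2 - 4)\,\hat{k}\right] dt\)

b. \(\displaystyle\int \left[\langle t, t^2, t^3 \rangle \times \langle t^3, t^2, t \rangle\right] dt\)

c. \(\displaystyle\int_0^{\pi/3} \left[\sin 2t\,\hat{i} + \tan t\,\hat{j} + e^{-2t}\,\hat{k}\right] dt\)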

Solution
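a. We use the first part of the definition of the integral of a space curve:
\[
\begin{aligned}
\int \left[(3t^2 + 2t)\,\hat{i} + (3t - 6)\,\hat{j} + (6t^3 + 5t^2 - 4)\,\hat{k}\right] dt &= \left[\int (3t^2 + 2t)\,dt\right]\hat{i} + \left[\int (3t - 6)\,dt\right]\hat{j} + \left[\int (6t^3 + 5t^2 - 4)\,dt\right]\hat{k} \\
&= (t^3 + t^2)\,\hat{i} + \left(\frac{3}{2}t^2 - 6t\right)\hat{j} + \left(\frac{3}{2}t^4 + \frac{5}{3}t^3 - 4t\right)\hat{k} + \vec{C}.
\end{aligned}
\]

b. First calculate \(\langle t, t^2, t^3 \rangle \times \langle t^3, t^2, t \rangle\):
\[
\begin{aligned}
\langle t, t^2, t^3 \rangle \times \langle t^3, t^2, t \rangle &= \begin{vmatrix} \hat{i} & \hat{j} & \hat{k} \\ t & t^2 & t^3 \\ t^3 & t^2 & t \end{vmatrix} \\
&= \left(t^2(t) - t^3(t^2)\right)\hat{i} - \left(t^2 - t^3(t^3)\right)\hat{j} + \left(t(t^2) - t^2(t^3)\right)\hat{k} \\
&= (t^3 - t^5)\,\hat{i} + (t^6 - t^2)\,\hat{j} + (t^3 - t^5)\,\hat{k}.
\end{aligned}
\]
Next, substitute this back into the integral and integrate:
\[
\begin{aligned}
\int \left[\langle t, t^2, t^3 \rangle \times \langle t^3, t^2, t \rangle\right] dt &= \int \left[(t^3 - t^5)\,\hat{i} + (t^6 - t^2)\,\hat{j} + (t^3 - t^5)\,\hat{k}\right] dt \\
&= \left(\frac{t^4}{4} - \frac{t^6}{6}\right)\hat{i} + \left(\frac{t^7}{7} - \frac{t^3}{3}\right)\hat{j} + \left(\frac{t^4}{4} - \frac{t^6}{6}\right)\hat{k} + \vec{C}.
\end{aligned}
\]

c. Use the second part of the definition of the integral of a space curve:
\[
\begin{aligned}
\int_0^{\pi/3} \left[\sin 2t\,\hat{i} + \tan t\,\hat{j} + e^{-2t}\,\hat{k}\right] dt &= \left[\int_0^{\pi/3} \sin 2t\,dt\right]\hat{i} + \left[\int_0^{\pi/3} \tan t\,dt\right]\hat{j} + \left[\int_0^{\pi/3} e^{-2t}\,dt\right]\hat{k} \\
&= \left(-\frac{1}{2}\cos 2t\right)\Big|_0^{\pi/3}\,\hat{i} - \left(\ln|\cos t|\right)\Big|_0^{\pi/3}\,\hat{j} - \left(\frac{1}{2}e^{-2t}\right)\Big|_0^{\pi/3}\,\hat{k} \\
&= \left(-\frac{1}{2}\cos\frac{2\pi}{3} + \frac{1}{2}\cos 0\right)\hat{i} - \left(\ln\left(\cos\frac{\pi}{3}\right) - \ln(\cos 0)\right)\hat{j} - \left(\frac{1}{2}e^{-2\pi/3} - \frac{1}{2}e^{-2(0)}\right)\hat{k} \\
&= \left(\frac{1}{4} + \frac{1}{2}\right)\hat{i} - (-\ln 2)\,\hat{j} - \left(\frac{1}{2}e^{-2\pi/3} - \frac{1}{2}\right)\hat{k} \\
&= \frac{3}{4}\hat{i} + (\ln 2)\,\hat{j} + \left(\frac{1}{2} - \frac{1}{2}e^{-2\pi/3}\right)\hat{k}.
\end{aligned}
\]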
4 2 2

 Exercise 13.2.5
Calculate the following integral:
\[ \int_1^3 \left[(2t + 4)\,\hat{i} + (3t^2 - 4t)\,\hat{j}\right] dt \]

Hint
Use the definition of the definite integral of a plane curve.

Answer
\[ \int_1^3 \left[(2t + 4)\,\hat{i} + (3t^2 - 4t)\,\hat{j}\right] dt = 16\hat{i} + 10\hat{j} \]

Summary
To calculate the derivative of a vector-valued function, calculate the derivatives of the component functions, then put them back
into a new vector-valued function.
Many of the properties of differentiation of scalar functions also apply to vector-valued functions.

The derivative of a vector-valued function \(\vec{r}(t)\) is also a tangent vector to the curve. The unit tangent vector \(\vec{T}(t)\) is calculated by
dividing the derivative of a vector-valued function by its magnitude.


The antiderivative of a vector-valued function is found by finding the antiderivatives of the component functions, then putting
them back together in a vector-valued function.
The definite integral of a vector-valued function is found by finding the definite integrals of the component functions, then putting
them back together in a vector-valued function.

Key Equations
Derivative of a vector-valued function
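\[ \vec{r}\,'(t) = \lim_{\Delta t \to 0} \frac{\vec{r}(t + \Delta t) - \vec{r}(t)}{\Delta t} \]

Principal unit tangent vector
\[ \vec{T}(t) = \frac{\vec{r}\,'(t)}{\|\vec{r}\,'(t)\|} \]

Indefinite integral of a vector-valued function
\[ \int \left[f(t)\,\hat{i} + g(t)\,\hat{j} + h(t)\,\hat{k}\right] dt = \left[\int f(t)\,dt\right]\hat{i} + \left[\int g(t)\,dt\right]\hat{j} + \left[\int h(t)\,dt\right]\hat{k} \]

Definite integral of a vector-valued function
\[ \int_a^b \left[f(t)\,\hat{i} + g(t)\,\hat{j} + h(t)\,\hat{k}\right] dt = \left[\int_a^b f(t)\,dt\right]\hat{i} + \left[\int_a^b g(t)\,dt\right]\hat{j} + \left[\int_a^b h(t)\,dt\right]\hat{k} \]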

Glossary
definite integral of a vector-valued function
the vector obtained by calculating the definite integral of each of the component functions of a given vector-valued function, then
using the results as the components of the resulting function

derivative of a vector-valued function


the derivative of a vector-valued function \(\vec{r}(t)\) is \(\vec{r}\,'(t) = \lim_{\Delta t \to 0} \dfrac{\vec{r}(t + \Delta t) - \vec{r}(t)}{\Delta t}\), provided the limit exists

indefinite integral of a vector-valued function


a vector-valued function with a derivative that is equal to a given vector-valued function

principal unit tangent vector


a unit vector tangent to a curve C

tangent vector
to \( \vec{r}(t) \) at \( t = t_0 \): any vector \( \vec{v} \) such that, when the tail of the vector is placed at point \( \vec{r}(t_0) \) on the graph, vector \( \vec{v} \) is tangent to curve C

13.2: Derivatives and Integrals of Vector Functions is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
13.2: Calculus of Vector-Valued Functions by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0. Original source:
https://openstax.org/details/books/calculus-volume-1.

13.3: Arc Length and Curvature
 Learning Objectives
Determine the length of a particle’s path in space by using the arc-length function.
Explain the meaning of the curvature of a curve in space and state its formula.
Describe the meaning of the normal and binormal vectors of a curve in space.

In this section, we study formulas related to curves in both two and three dimensions, and see how they are related to various
properties of the same curve. For example, suppose a vector-valued function describes the motion of a particle in space. We would
like to determine how far the particle has traveled over a given time interval, which can be described by the arc length of the path it
follows. Or, suppose that the vector-valued function describes a road we are building and we want to determine how sharply the
road curves at a given point. This is described by the curvature of the function at that point. We explore each of these concepts in
this section.

Arc Length for Vector Functions


We have seen how a vector-valued function describes a curve in either two or three dimensions. Recall that the formula for the arc length of a curve defined by the parametric functions \( x = x(t),\ y = y(t),\ t_1 \le t \le t_2 \) is given by
\[ s = \int_{t_1}^{t_2} \sqrt{(x'(t))^2 + (y'(t))^2}\, dt. \]

In a similar fashion, if we define a smooth curve using a vector-valued function \( \vec{r}(t) = f(t)\,\hat{\mathbf{i}} + g(t)\,\hat{\mathbf{j}} \), where \( a \le t \le b \), the arc length is given by the formula
\[ s = \int_a^b \sqrt{(f'(t))^2 + (g'(t))^2}\, dt. \]

In three dimensions, if the vector-valued function is described by \( \vec{r}(t) = f(t)\,\hat{\mathbf{i}} + g(t)\,\hat{\mathbf{j}} + h(t)\,\hat{\mathbf{k}} \) over the same interval \( a \le t \le b \), the arc length is given by
\[ s = \int_a^b \sqrt{(f'(t))^2 + (g'(t))^2 + (h'(t))^2}\, dt. \]

 Theorem: Arc-Length Formulas for Plane and Space curves

Plane curve: Given a smooth curve C defined by the function \( \vec{r}(t) = f(t)\,\hat{\mathbf{i}} + g(t)\,\hat{\mathbf{j}} \), where t lies within the interval [a, b], the arc length of C over the interval is
\[ s = \int_a^b \sqrt{[f'(t)]^2 + [g'(t)]^2}\, dt \tag{13.3.1} \]
\[ \phantom{s} = \int_a^b \|\vec{r}\,'(t)\|\, dt. \tag{13.3.2} \]

Space curve: Given a smooth curve C defined by the function \( \vec{r}(t) = f(t)\,\hat{\mathbf{i}} + g(t)\,\hat{\mathbf{j}} + h(t)\,\hat{\mathbf{k}} \), where t lies within the interval [a, b], the arc length of C over the interval is
\[ s = \int_a^b \sqrt{[f'(t)]^2 + [g'(t)]^2 + [h'(t)]^2}\, dt \tag{13.3.3} \]
\[ \phantom{s} = \int_a^b \|\vec{r}\,'(t)\|\, dt. \tag{13.3.4} \]

The two formulas are very similar; they differ only in the fact that a space curve has three component functions instead of two. Note that the formulas are defined for smooth curves: curves where the vector-valued function \( \vec{r}(t) \) is differentiable with a non-zero derivative. The smoothness condition guarantees that the curve has no cusps (or corners) that could make the formula problematic.

Access for free at OpenStax 13.3.1 https://math.libretexts.org/@go/page/4533

 Example 13.3.1: Finding the Arc Length

Calculate the arc length for each of the following vector-valued functions:
a. \( \vec{r}(t) = (3t - 2)\,\hat{\mathbf{i}} + (4t + 5)\,\hat{\mathbf{j}},\quad 1 \le t \le 5 \)
b. \( \vec{r}(t) = \langle t\cos t,\ t\sin t,\ 2t \rangle,\quad 0 \le t \le 2\pi \)

Solution

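a. Using Equation 13.3.2, \( \vec{r}\,'(t) = 3\,\hat{\mathbf{i}} + 4\,\hat{\mathbf{j}} \), so
\[ \begin{aligned}
s &= \int_a^b \|\vec{r}\,'(t)\|\, dt \\
&= \int_1^5 \sqrt{3^2 + 4^2}\, dt \\
&= \int_1^5 5\, dt = 5t \Big|_1^5 = 20.
\end{aligned} \]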

b. Using Equation 13.3.4, \( \vec{r}\,'(t) = \langle \cos t - t\sin t,\ \sin t + t\cos t,\ 2 \rangle \), so
\[ \begin{aligned}
s &= \int_a^b \|\vec{r}\,'(t)\|\, dt \\
&= \int_0^{2\pi} \sqrt{(\cos t - t\sin t)^2 + (\sin t + t\cos t)^2 + 2^2}\, dt \\
&= \int_0^{2\pi} \sqrt{(\cos^2 t - 2t\sin t\cos t + t^2\sin^2 t) + (\sin^2 t + 2t\sin t\cos t + t^2\cos^2 t) + 4}\, dt \\
&= \int_0^{2\pi} \sqrt{\cos^2 t + \sin^2 t + t^2(\cos^2 t + \sin^2 t) + 4}\, dt \\
&= \int_0^{2\pi} \sqrt{t^2 + 5}\, dt.
\end{aligned} \]

Here we can use a table integration formula
\[ \int \sqrt{u^2 + a^2}\, du = \frac{u}{2}\sqrt{u^2 + a^2} + \frac{a^2}{2}\ln\left|u + \sqrt{u^2 + a^2}\right| + C, \]

so we obtain
\[ \begin{aligned}
\int_0^{2\pi} \sqrt{t^2 + 5}\, dt &= \frac{1}{2}\left(t\sqrt{t^2 + 5} + 5\ln\left|t + \sqrt{t^2 + 5}\right|\right)\Big|_0^{2\pi} \\
&= \frac{1}{2}\left(2\pi\sqrt{4\pi^2 + 5} + 5\ln\left(2\pi + \sqrt{4\pi^2 + 5}\right)\right) - \frac{5}{2}\ln\sqrt{5} \\
&\approx 25.343 \text{ units.}
\end{aligned} \]
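
Both parts of this example can be cross-checked by integrating the speed numerically. The sketch below is only an illustration: `simpson` and `arc_length` are hypothetical helper names, and the derivatives are entered by hand from the solution above.

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson's rule for a scalar function f on [a, b] (n even)."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3


def arc_length(r_prime, a, b):
    """Arc length as the integral of the speed ||r'(t)|| over [a, b]."""
    def speed(t):
        return math.sqrt(sum(c * c for c in r_prime(t)))
    return simpson(speed, a, b)


# Part a: r'(t) = (3, 4) on [1, 5].
length_a = arc_length(lambda t: (3.0, 4.0), 1.0, 5.0)

# Part b: r'(t) = (cos t - t sin t, sin t + t cos t, 2) on [0, 2*pi].
length_b = arc_length(
    lambda t: (math.cos(t) - t * math.sin(t), math.sin(t) + t * math.cos(t), 2.0),
    0.0,
    2 * math.pi,
)

# Closed form obtained from the table integral, for comparison with part b.
u = 2 * math.pi
root = math.sqrt(u * u + 5)
closed_b = 0.5 * (u * root + 5 * math.log(u + root)) - 2.5 * math.log(math.sqrt(5))

print(length_a, length_b, closed_b)
```

The numerical value for part b and the closed-form value should agree to many decimal places, which is a useful check on the algebra under the square root.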

 Exercise 13.3.1
Calculate the arc length of the parameterized curve
\[ \vec{r}(t) = \langle 2t^2 + 1,\ 2t^2 - 1,\ t^3 \rangle,\quad 0 \le t \le 3. \]

Hint
Use Equation 13.3.4.

Answer
\( \vec{r}\,'(t) = \langle 4t,\ 4t,\ 3t^2 \rangle \), so \( s = \frac{1}{27}\left(113^{3/2} - 32^{3/2}\right) \approx 37.785 \) units

We now return to the helix introduced earlier in this chapter. A vector-valued function that describes a helix can be written in the form
\[ \vec{r}(t) = R\cos\left(\frac{2\pi N t}{h}\right)\hat{\mathbf{i}} + R\sin\left(\frac{2\pi N t}{h}\right)\hat{\mathbf{j}} + t\,\hat{\mathbf{k}},\quad 0 \le t \le h, \]

where R represents the radius of the helix, h represents the total height of the helix (so consecutive turns are a distance h/N apart), and the helix completes N turns. Let’s derive a formula for the arc length of this helix using Equation 13.3.4. First of all,
\[ \vec{r}\,'(t) = -\frac{2\pi N R}{h}\sin\left(\frac{2\pi N t}{h}\right)\hat{\mathbf{i}} + \frac{2\pi N R}{h}\cos\left(\frac{2\pi N t}{h}\right)\hat{\mathbf{j}} + \hat{\mathbf{k}}. \]

Therefore,
\[ \begin{aligned}
s &= \int_a^b \|\vec{r}\,'(t)\|\, dt \\
&= \int_0^h \sqrt{\left(-\frac{2\pi N R}{h}\sin\left(\frac{2\pi N t}{h}\right)\right)^2 + \left(\frac{2\pi N R}{h}\cos\left(\frac{2\pi N t}{h}\right)\right)^2 + 1^2}\, dt \\
&= \int_0^h \sqrt{\frac{4\pi^2 N^2 R^2}{h^2}\left(\sin^2\left(\frac{2\pi N t}{h}\right) + \cos^2\left(\frac{2\pi N t}{h}\right)\right) + 1}\, dt \\
&= \int_0^h \sqrt{\frac{4\pi^2 N^2 R^2}{h^2} + 1}\, dt \\
&= \left[t\sqrt{\frac{4\pi^2 N^2 R^2}{h^2} + 1}\right]_0^h \\
&= h\sqrt{\frac{4\pi^2 N^2 R^2 + h^2}{h^2}} \\
&= \sqrt{4\pi^2 N^2 R^2 + h^2}.
\end{aligned} \]

This gives a formula for the length of a wire needed to form a helix with N turns that has radius R and height h.
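
One way to sanity-check this formula is to approximate the helix by many short straight segments and add up their lengths. The sketch below does this in Python; the function names and the sample values of R, h, and N are assumptions chosen only for illustration.

```python
import math


def helix_point(t, R, h, N):
    """Point on the helix r(t) = (R cos(2 pi N t / h), R sin(2 pi N t / h), t)."""
    angle = 2 * math.pi * N * t / h
    return (R * math.cos(angle), R * math.sin(angle), t)


def polyline_length(R, h, N, segments=200_000):
    """Sum the lengths of straight segments joining sample points on the helix."""
    total = 0.0
    previous = helix_point(0.0, R, h, N)
    for k in range(1, segments + 1):
        current = helix_point(h * k / segments, R, h, N)
        total += math.dist(previous, current)
        previous = current
    return total


def helix_length(R, h, N):
    """Closed-form length derived above: sqrt(4 pi^2 N^2 R^2 + h^2)."""
    return math.sqrt(4 * math.pi**2 * N**2 * R**2 + h**2)


R, h, N = 2.0, 10.0, 5  # sample values, not taken from the text
approx_length = polyline_length(R, h, N)
exact_length = helix_length(R, h, N)
print(approx_length, exact_length)
```

The polygonal sum always slightly underestimates the true length, and the gap shrinks as the number of segments grows.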

Arc-Length Parameterization
We now have a formula for the arc length of a curve defined by a vector-valued function. Let’s take this one step further and
examine what an arc-length function is.
If a vector-valued function represents the position of a particle in space as a function of time, then the arc-length function measures how far that particle travels as a function of time. The formula for the arc-length function follows directly from the formula for arc length:
\[ s(t) = \int_a^t \sqrt{(f'(u))^2 + (g'(u))^2 + (h'(u))^2}\, du. \tag{13.3.5} \]

If the curve is in two dimensions, then only two terms appear under the square root inside the integral. The reason for using the independent variable \( u \) is to distinguish between time and the variable of integration. Since \( s(t) \) measures distance traveled as a function of time, \( s'(t) \) measures the speed of the particle at any given time. Since we have a formula for \( s(t) \) in Equation 13.3.5, we can differentiate both sides of the equation:
\[ \begin{aligned}
s'(t) &= \frac{d}{dt}\left[\int_a^t \sqrt{(f'(u))^2 + (g'(u))^2 + (h'(u))^2}\, du\right] \\
&= \frac{d}{dt}\left[\int_a^t \|\vec{r}\,'(u)\|\, du\right] \\
&= \|\vec{r}\,'(t)\|.
\end{aligned} \]

If we assume that \( \vec{r}(t) \) defines a smooth curve, then the arc length is always increasing, so \( s'(t) > 0 \) for \( t > a \). Last, if \( \vec{r}(t) \) is a curve on which \( \|\vec{r}\,'(t)\| = 1 \) for all \( t \), then
\[ s(t) = \int_a^t \|\vec{r}\,'(u)\|\, du = \int_a^t 1\, du = t - a, \]

which means that \( t \) represents the arc length as long as \( a = 0 \).

 Theorem: Arc-Length Function

Let \( \vec{r}(t) \) describe a smooth curve for \( t \ge a \). Then the arc-length function is given by
\[ s(t) = \int_a^t \|\vec{r}\,'(u)\|\, du. \]

Furthermore,
\[ \frac{ds}{dt} = \|\vec{r}\,'(t)\| > 0. \]

If \( \|\vec{r}\,'(t)\| = 1 \) for all \( t \ge a \), then the parameter \( t \) represents the arc length from the starting point at \( t = a \).
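
The relation ds/dt = ∥r'(t)∥ can be observed numerically. In the following sketch the plane curve r(t) = ⟨t, t²⟩ is an assumed sample curve (it does not come from the text), and the helper names are hypothetical; the arc-length function is evaluated with Simpson's rule and then differentiated with a central difference.

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson's rule for a scalar function f on [a, b] (n even)."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3


def speed(t):
    # Sample curve r(t) = (t, t^2), so r'(t) = (1, 2t).
    return math.hypot(1.0, 2.0 * t)


def arc_length_function(t, a=0.0):
    """s(t) = integral from a to t of ||r'(u)|| du."""
    return simpson(speed, a, t)


t, step = 1.0, 1e-4
# Central-difference approximation of ds/dt at t.
ds_dt = (arc_length_function(t + step) - arc_length_function(t - step)) / (2 * step)
print(ds_dt, speed(t))
```

The two printed numbers should agree closely, reflecting that the derivative of the arc-length function is the speed.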

A useful application of this theorem is to find an alternative parameterization of a given curve, called an arc-length parameterization. Recall that any vector-valued function can be reparameterized via a change of variables. For example, if we have a function \( \vec{r}(t) = \langle 3\cos t,\ 3\sin t \rangle,\ 0 \le t \le 2\pi \) that parameterizes a circle of radius 3, we can change the parameter from \( t \) to \( 4t \), obtaining a new parameterization \( \vec{r}(t) = \langle 3\cos 4t,\ 3\sin 4t \rangle \). The new parameterization still defines a circle of radius 3, but now we need only use the values \( 0 \le t \le \pi/2 \) to traverse the circle once.
Suppose that we find the arc-length function \( s(t) \) and are able to solve this function for \( t \) as a function of \( s \). We can then reparameterize the original function \( \vec{r}(t) \) by substituting the expression for \( t \) back into \( \vec{r}(t) \). The vector-valued function is now written in terms of the parameter \( s \). Since the variable \( s \) represents the arc length, we call this an arc-length parameterization of the original function \( \vec{r}(t) \). One advantage of finding the arc-length parameterization is that the distance traveled along the curve starting from \( s = 0 \) is now equal to the parameter \( s \). The arc-length parameterization also appears in the context of curvature (which we examine later in this section) and line integrals.

 Example 13.3.2: Finding an Arc-Length Parameterization

Find the arc-length parameterization for each of the following curves:
a. \( \vec{r}(t) = 4\cos t\,\hat{\mathbf{i}} + 4\sin t\,\hat{\mathbf{j}},\quad t \ge 0 \)
b. \( \vec{r}(t) = \langle t + 3,\ 2t - 4,\ 2t \rangle,\quad t \ge 3 \)

Solution
a. First we find the arc-length function using Equation 13.3.5:
\[ \begin{aligned}
s(t) &= \int_a^t \|\vec{r}\,'(u)\|\, du \\
&= \int_0^t \|\langle -4\sin u,\ 4\cos u \rangle\|\, du \\
&= \int_0^t \sqrt{(-4\sin u)^2 + (4\cos u)^2}\, du \\
&= \int_0^t \sqrt{16\sin^2 u + 16\cos^2 u}\, du \\
&= \int_0^t 4\, du = 4t,
\end{aligned} \]

which gives the relationship between the arc length \( s \) and the parameter \( t \) as \( s = 4t \); so, \( t = s/4 \). Next we replace the variable \( t \) in the original function \( \vec{r}(t) = 4\cos t\,\hat{\mathbf{i}} + 4\sin t\,\hat{\mathbf{j}} \) with the expression \( s/4 \) to obtain
\[ \vec{r}(s) = 4\cos\left(\frac{s}{4}\right)\hat{\mathbf{i}} + 4\sin\left(\frac{s}{4}\right)\hat{\mathbf{j}}. \]

This is the arc-length parameterization of \( \vec{r}(t) \). Since the original restriction on \( t \) was given by \( t \ge 0 \), the restriction on \( s \) becomes \( s/4 \ge 0 \), or \( s \ge 0 \).

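b. The arc-length function is given by Equation 13.3.5:
\[ \begin{aligned}
s(t) &= \int_a^t \|\vec{r}\,'(u)\|\, du \\
&= \int_3^t \|\langle 1,\ 2,\ 2 \rangle\|\, du \\
&= \int_3^t \sqrt{1^2 + 2^2 + 2^2}\, du \\
&= \int_3^t 3\, du \\
&= 3t - 9.
\end{aligned} \]

Therefore, the relationship between the arc length \( s \) and the parameter \( t \) is \( s = 3t - 9 \), so \( t = \frac{s}{3} + 3 \). Substituting this into the original function \( \vec{r}(t) = \langle t + 3,\ 2t - 4,\ 2t \rangle \) yields
\[ \vec{r}(s) = \left\langle \left(\frac{s}{3} + 3\right) + 3,\ 2\left(\frac{s}{3} + 3\right) - 4,\ 2\left(\frac{s}{3} + 3\right) \right\rangle = \left\langle \frac{s}{3} + 6,\ \frac{2s}{3} + 2,\ \frac{2s}{3} + 6 \right\rangle. \]

This is an arc-length parameterization of \( \vec{r}(t) \). The original restriction on the parameter \( t \) was \( t \ge 3 \), so the restriction on \( s \) is \( (s/3) + 3 \ge 3 \), or \( s \ge 0 \).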
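
When s(t) cannot be solved for t by hand, the same reparameterization can be carried out numerically. The sketch below is one possible approach, not the method of the text: it inverts s(t) by bisection, which works because s(t) is increasing for a smooth curve, and it is tested on part b, where the exact answer is known. The helper names are hypothetical.

```python
import math


def simpson(f, a, b, n=200):
    """Composite Simpson's rule for a scalar function f on [a, b] (n even)."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3


def r(t):
    """Curve from part b: r(t) = (t + 3, 2t - 4, 2t)."""
    return (t + 3, 2 * t - 4, 2 * t)


def speed(t):
    # r'(t) = (1, 2, 2) for every t, so the speed is constant.
    return math.sqrt(1**2 + 2**2 + 2**2)


def s_of_t(t, a=3.0):
    """Arc-length function measured from the starting parameter a."""
    return simpson(speed, a, t)


def t_of_s(s, a=3.0, tol=1e-12):
    """Invert s(t) by bisection; valid because s(t) is increasing."""
    lo, hi = a, a + 1.0
    while s_of_t(hi, a) < s:
        hi = a + 2 * (hi - a)  # widen the bracket until it contains the target
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if s_of_t(mid, a) < s:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


s = 6.0
point_numeric = r(t_of_s(s))
point_formula = (s / 3 + 6, 2 * s / 3 + 2, 2 * s / 3 + 6)
print(point_numeric, point_formula)
```

For this straight line the numerical point and the point from the formula for r(s) should coincide up to rounding; for a general curve only `r` and `speed` would need to change.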

 Exercise 13.3.2

Find the arc-length function for the helix
\[ \vec{r}(t) = \langle 3\cos t,\ 3\sin t,\ 4t \rangle,\quad t \ge 0. \]

Then, use the relationship between the arc length and the parameter \( t \) to find an arc-length parameterization of \( \vec{r}(t) \).

Hint
Start by finding the arc-length function.

Answer
\( s = 5t \), or \( t = s/5 \). Substituting this into \( \vec{r}(t) = \langle 3\cos t,\ 3\sin t,\ 4t \rangle \) gives
\[ \vec{r}(s) = \left\langle 3\cos\left(\frac{s}{5}\right),\ 3\sin\left(\frac{s}{5}\right),\ \frac{4s}{5} \right\rangle,\quad s \ge 0 \]

Curvature
An important topic related to arc length is curvature. The concept of curvature provides a way to measure how sharply a smooth
curve turns. A circle has constant curvature. The smaller the radius of the circle, the greater the curvature.
Think of driving down a road. Suppose the road lies on an arc of a large circle. In this case you would barely have to turn the wheel
to stay on the road. Now suppose the radius is smaller. In this case you would need to turn more sharply to stay on the road. In the
case of a curve other than a circle, it is often useful first to inscribe a circle to the curve at a given point so that it is tangent to the
curve at that point and “hugs” the curve as closely as possible in a neighborhood of the point (Figure 13.3.1). The curvature of the
graph at that point is then defined to be the same as the curvature of the inscribed circle.

Figure 13.3.1 : The graph represents the curvature of a function y = f (x). The sharper the turn in the graph, the greater the
curvature, and the smaller the radius of the inscribed circle.

Definition: Curvature

Let C be a smooth curve in the plane or in space given by \( \vec{r}(s) \), where \( s \) is the arc-length parameter. The curvature \( \kappa \) at \( s \) is
\[ \kappa = \left\| \frac{d\vec{T}}{ds} \right\| = \|\vec{T}\,'(s)\|. \]

Visit this video for more information about the curvature of a space curve.


The formula in the definition of curvature is not very useful in terms of calculation. In particular, recall that \( \vec{T}(t) \) represents the unit tangent vector to a given vector-valued function \( \vec{r}(t) \), and the formula for \( \vec{T}(t) \) is
\[ \vec{T}(t) = \frac{\vec{r}\,'(t)}{\|\vec{r}\,'(t)\|}. \]

To use the formula for curvature, it is first necessary to express \( \vec{r}(t) \) in terms of the arc-length parameter \( s \), then find the unit tangent vector \( \vec{T}(s) \) for the function \( \vec{r}(s) \), then take the derivative of \( \vec{T}(s) \) with respect to \( s \). This is a tedious process. Fortunately, there are equivalent formulas for curvature.

 Theorem: Alternate Formulas of Curvature


If C is a smooth curve given by \( \vec{r}(t) \), then the curvature \( \kappa \) of C at \( t \) is given by
\[ \kappa = \frac{\|\vec{T}\,'(t)\|}{\|\vec{r}\,'(t)\|}. \tag{13.3.6} \]

If C is a three-dimensional curve, then the curvature can be given by the formula
\[ \kappa = \frac{\|\vec{r}\,'(t) \times \vec{r}\,''(t)\|}{\|\vec{r}\,'(t)\|^3}. \tag{13.3.7} \]

If C is the graph of a function \( y = f(x) \) and both \( y' \) and \( y'' \) exist, then the curvature \( \kappa \) at point \( (x, y) \) is given by
\[ \kappa = \frac{|y''|}{[1 + (y')^2]^{3/2}}. \tag{13.3.8} \]

 Proof

The first formula follows directly from the chain rule:
\[ \frac{d\vec{T}}{dt} = \frac{d\vec{T}}{ds}\,\frac{ds}{dt}, \]

where \( s \) is the arc length along the curve C. Dividing both sides by \( ds/dt \), and taking the magnitude of both sides gives
\[ \left\| \frac{d\vec{T}}{ds} \right\| = \left\| \frac{\vec{T}\,'(t)}{\dfrac{ds}{dt}} \right\|. \]

Since \( ds/dt = \|\vec{r}\,'(t)\| \), this gives the formula for the curvature \( \kappa \) of a curve C in terms of any parameterization of C:
\[ \kappa = \frac{\|\vec{T}\,'(t)\|}{\|\vec{r}\,'(t)\|}. \]


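In the case of a three-dimensional curve, we start with the formulas \( \vec{T}(t) = \vec{r}\,'(t)/\|\vec{r}\,'(t)\| \) and \( ds/dt = \|\vec{r}\,'(t)\| \). Therefore, \( \vec{r}\,'(t) = (ds/dt)\,\vec{T}(t) \). We can take the derivative of this function using the product rule for a scalar function times a vector-valued function:
\[ \vec{r}\,''(t) = \frac{d^2 s}{dt^2}\,\vec{T}(t) + \frac{ds}{dt}\,\vec{T}\,'(t). \]

Using these last two equations we get
\[ \begin{aligned}
\vec{r}\,'(t) \times \vec{r}\,''(t) &= \frac{ds}{dt}\,\vec{T}(t) \times \left(\frac{d^2 s}{dt^2}\,\vec{T}(t) + \frac{ds}{dt}\,\vec{T}\,'(t)\right) \\
&= \frac{ds}{dt}\,\frac{d^2 s}{dt^2}\,\vec{T}(t) \times \vec{T}(t) + \left(\frac{ds}{dt}\right)^2 \vec{T}(t) \times \vec{T}\,'(t).
\end{aligned} \]

Since \( \vec{T}(t) \times \vec{T}(t) = \vec{0} \), this reduces to
\[ \vec{r}\,'(t) \times \vec{r}\,''(t) = \left(\frac{ds}{dt}\right)^2 \vec{T}(t) \times \vec{T}\,'(t). \]

Since \( \vec{T}\,' \) is parallel to \( \vec{N} \), and \( \vec{T} \) is orthogonal to \( \vec{N} \), it follows that \( \vec{T} \) and \( \vec{T}\,' \) are orthogonal. This means that \( \|\vec{T} \times \vec{T}\,'\| = \|\vec{T}\|\,\|\vec{T}\,'\| \sin(\pi/2) = \|\vec{T}\,'\| \), so
\[ \|\vec{r}\,'(t) \times \vec{r}\,''(t)\| = \left(\frac{ds}{dt}\right)^2 \|\vec{T}\,'(t)\|. \]

Now we solve this equation for \( \|\vec{T}\,'(t)\| \) and use the fact that \( ds/dt = \|\vec{r}\,'(t)\| \):
\[ \|\vec{T}\,'(t)\| = \frac{\|\vec{r}\,'(t) \times \vec{r}\,''(t)\|}{\|\vec{r}\,'(t)\|^2}. \]

Then, we divide both sides by \( \|\vec{r}\,'(t)\| \). This gives
\[ \kappa = \frac{\|\vec{T}\,'(t)\|}{\|\vec{r}\,'(t)\|} = \frac{\|\vec{r}\,'(t) \times \vec{r}\,''(t)\|}{\|\vec{r}\,'(t)\|^3}. \]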

This proves 13.3.7. To prove 13.3.8, we start with the assumption that curve C is defined by the function \( y = f(x) \). Then, we can define \( \vec{r}(x) = x\,\hat{\mathbf{i}} + f(x)\,\hat{\mathbf{j}} + 0\,\hat{\mathbf{k}} \), using \( x \) as the parameter. Using the previous formula for curvature:
\[ \vec{r}\,'(x) = \hat{\mathbf{i}} + f'(x)\,\hat{\mathbf{j}} \]
\[ \vec{r}\,''(x) = f''(x)\,\hat{\mathbf{j}} \]
\[ \vec{r}\,'(x) \times \vec{r}\,''(x) = \begin{vmatrix} \hat{\mathbf{i}} & \hat{\mathbf{j}} & \hat{\mathbf{k}} \\ 1 & f'(x) & 0 \\ 0 & f''(x) & 0 \end{vmatrix} = f''(x)\,\hat{\mathbf{k}}. \]

Therefore,
\[ \kappa = \frac{\|\vec{r}\,'(x) \times \vec{r}\,''(x)\|}{\|\vec{r}\,'(x)\|^3} = \frac{|f''(x)|}{\left(1 + [f'(x)]^2\right)^{3/2}}. \]

 Example 13.3.3: Finding Curvature

Find the curvature for each of the following curves at the given point:

a. \( \vec{r}(t) = 4\cos t\,\hat{\mathbf{i}} + 4\sin t\,\hat{\mathbf{j}} + 3t\,\hat{\mathbf{k}},\quad t = \dfrac{4\pi}{3} \)
b. \( f(x) = \sqrt{4x - x^2},\quad x = 2 \)

Solution
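a. This function describes a helix.

The curvature of the helix at \( t = (4\pi)/3 \) can be found by using 13.3.6. First, calculate \( \vec{T}(t) \):
\[ \begin{aligned}
\vec{T}(t) &= \frac{\vec{r}\,'(t)}{\|\vec{r}\,'(t)\|} \\
&= \frac{\langle -4\sin t,\ 4\cos t,\ 3 \rangle}{\sqrt{(-4\sin t)^2 + (4\cos t)^2 + 3^2}} \\
&= \left\langle -\frac{4}{5}\sin t,\ \frac{4}{5}\cos t,\ \frac{3}{5} \right\rangle.
\end{aligned} \]

Next, calculate \( \vec{T}\,'(t) \):
\[ \vec{T}\,'(t) = \left\langle -\frac{4}{5}\cos t,\ -\frac{4}{5}\sin t,\ 0 \right\rangle. \]

Last, apply 13.3.6:
\[ \begin{aligned}
\kappa &= \frac{\|\vec{T}\,'(t)\|}{\|\vec{r}\,'(t)\|} = \frac{\left\|\left\langle -\frac{4}{5}\cos t,\ -\frac{4}{5}\sin t,\ 0 \right\rangle\right\|}{\|\langle -4\sin t,\ 4\cos t,\ 3 \rangle\|} \\
&= \frac{\sqrt{\left(-\frac{4}{5}\cos t\right)^2 + \left(-\frac{4}{5}\sin t\right)^2 + 0^2}}{\sqrt{(-4\sin t)^2 + (4\cos t)^2 + 3^2}} \\
&= \frac{4/5}{5} = \frac{4}{25}.
\end{aligned} \]

The curvature of this helix is constant at all points on the helix.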


b. This function describes a semicircle.

To find the curvature of this graph, we must use 13.3.8. First, we calculate \( y' \) and \( y'' \):
\[ \begin{aligned}
y &= \sqrt{4x - x^2} = (4x - x^2)^{1/2} \\
y' &= \frac{1}{2}(4x - x^2)^{-1/2}(4 - 2x) = (2 - x)(4x - x^2)^{-1/2} \\
y'' &= -(4x - x^2)^{-1/2} + (2 - x)\left(-\frac{1}{2}\right)(4x - x^2)^{-3/2}(4 - 2x) \\
&= -\frac{4x - x^2}{(4x - x^2)^{3/2}} - \frac{(2 - x)^2}{(4x - x^2)^{3/2}} \\
&= \frac{x^2 - 4x - (4 - 4x + x^2)}{(4x - x^2)^{3/2}} \\
&= -\frac{4}{(4x - x^2)^{3/2}}.
\end{aligned} \]

Then, we apply 13.3.8:
\[ \begin{aligned}
\kappa &= \frac{|y''|}{[1 + (y')^2]^{3/2}} \\
&= \frac{\left|-\dfrac{4}{(4x - x^2)^{3/2}}\right|}{\left[1 + \left((2 - x)(4x - x^2)^{-1/2}\right)^2\right]^{3/2}} = \frac{\left|\dfrac{4}{(4x - x^2)^{3/2}}\right|}{\left[1 + \dfrac{(2 - x)^2}{4x - x^2}\right]^{3/2}} \\
&= \frac{\left|\dfrac{4}{(4x - x^2)^{3/2}}\right|}{\left[\dfrac{4x - x^2 + x^2 - 4x + 4}{4x - x^2}\right]^{3/2}} = \left|\frac{4}{(4x - x^2)^{3/2}}\right| \cdot \frac{(4x - x^2)^{3/2}}{8} \\
&= \frac{1}{2}.
\end{aligned} \]

The curvature of this circle is equal to the reciprocal of its radius. There is a minor issue with the absolute value in 13.3.8; however, a closer look at the calculation reveals that the denominator is positive for any value of \( x \) in the domain \( 0 < x < 4 \).
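
Both results of this example can be confirmed with a few lines of Python. This is a hedged sketch: the function names are hypothetical, and the helix is checked with the cross-product formula 13.3.7 rather than 13.3.6, so it also serves as an independent check of part a.

```python
import math


def cross(u, v):
    """Cross product of two 3-vectors given as tuples."""
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


def norm(v):
    return math.sqrt(sum(c * c for c in v))


def curvature_space(r1, r2):
    """Equation 13.3.7: ||r' x r''|| / ||r'||^3, given r'(t) and r''(t)."""
    return norm(cross(r1, r2)) / norm(r1) ** 3


def curvature_graph(y1, y2):
    """Equation 13.3.8: |y''| / (1 + (y')^2)^(3/2), given y' and y''."""
    return abs(y2) / (1 + y1 * y1) ** 1.5


# Part a: helix r(t) = (4 cos t, 4 sin t, 3t) at t = 4*pi/3.
t = 4 * math.pi / 3
kappa_helix = curvature_space(
    (-4 * math.sin(t), 4 * math.cos(t), 3.0),   # r'(t)
    (-4 * math.cos(t), -4 * math.sin(t), 0.0),  # r''(t)
)

# Part b: y = sqrt(4x - x^2) at x = 2, using y' and y'' from the solution.
x = 2.0
q = 4 * x - x * x
kappa_semicircle = curvature_graph((2 - x) / math.sqrt(q), -4 / q**1.5)

print(kappa_helix, kappa_semicircle)
```

Changing `t` in part a should leave the helix curvature unchanged, consistent with the remark that it is constant along the helix.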

 Exercise 13.3.3
Find the curvature of the curve defined by the function
\[ y = 3x^2 - 2x + 4 \]

at the point \( x = 2 \).

Hint
Use 13.3.8.

Answer
\[ \kappa = \frac{6}{101^{3/2}} \approx 0.0059 \]

The Normal and Binormal Vectors


We have seen that the derivative \( \vec{r}\,'(t) \) of a vector-valued function is a tangent vector to the curve defined by \( \vec{r}(t) \), and the unit tangent vector \( \vec{T}(t) \) can be calculated by dividing \( \vec{r}\,'(t) \) by its magnitude. When studying motion in three dimensions, two other vectors are useful in describing the motion of a particle along a path in space: the principal unit normal vector and the binormal vector.

Definition: Binormal Vectors

Let C be a three-dimensional smooth curve represented by \( \vec{r} \) over an open interval I. If \( \vec{T}\,'(t) \ne \vec{0} \), then the principal unit normal vector at \( t \) is defined to be
\[ \vec{N}(t) = \frac{\vec{T}\,'(t)}{\|\vec{T}\,'(t)\|}. \tag{13.3.9} \]

The binormal vector at \( t \) is defined as
\[ \vec{B}(t) = \vec{T}(t) \times \vec{N}(t), \tag{13.3.10} \]

where \( \vec{T}(t) \) is the unit tangent vector.


Note that, by definition, the binormal vector is orthogonal to both the unit tangent vector and the normal vector. Furthermore, \( \vec{B}(t) \) is always a unit vector. This can be shown using the formula for the magnitude of a cross product:
\[ \|\vec{B}(t)\| = \|\vec{T}(t) \times \vec{N}(t)\| = \|\vec{T}(t)\|\,\|\vec{N}(t)\| \sin\theta, \]

where \( \theta \) is the angle between \( \vec{T}(t) \) and \( \vec{N}(t) \). Since \( \vec{N}(t) \) is a scalar multiple of \( \vec{T}\,'(t) \), the derivative of a unit vector, property (vii) of the derivative of a vector-valued function tells us that \( \vec{T}(t) \) and \( \vec{N}(t) \) are orthogonal to each other, so \( \theta = \pi/2 \). Furthermore, they are both unit vectors, so their magnitude is 1. Therefore, \( \|\vec{T}(t)\|\,\|\vec{N}(t)\| \sin\theta = (1)(1)\sin(\pi/2) = 1 \) and \( \vec{B}(t) \) is a unit vector.
The principal unit normal vector can be challenging to calculate because the unit tangent vector involves a quotient, and this
quotient often has a square root in the denominator. In the three-dimensional case, finding the cross product of the unit tangent
vector and the unit normal vector can be even more cumbersome. Fortunately, we have alternative formulas for finding these two
vectors, and they are presented in Motion in Space.
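
When the exact formulas become cumbersome, the definitions 13.3.9 and 13.3.10 can also be evaluated numerically. The sketch below is an illustration under stated assumptions: it uses the helix r(t) = ⟨4 cos t, 4 sin t, 3t⟩ as a sample curve, approximates T'(t) by a central difference, and uses hypothetical function names.

```python
import math


def dot(u, v):
    return sum(p * q for p, q in zip(u, v))


def cross(u, v):
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


def normalize(v):
    length = math.sqrt(dot(v, v))
    return tuple(c / length for c in v)


def unit_tangent(t):
    # Sample helix r(t) = (4 cos t, 4 sin t, 3t), so r'(t) = (-4 sin t, 4 cos t, 3).
    return normalize((-4 * math.sin(t), 4 * math.cos(t), 3.0))


def frenet_frame(t, step=1e-5):
    """Return (T, N, B) at t, with T'(t) approximated by a central difference."""
    T = unit_tangent(t)
    ahead, behind = unit_tangent(t + step), unit_tangent(t - step)
    dT = tuple((p - q) / (2 * step) for p, q in zip(ahead, behind))
    N = normalize(dT)   # Equation 13.3.9
    B = cross(T, N)     # Equation 13.3.10
    return T, N, B


T, N, B = frenet_frame(1.0)
print(T, N, B)
print(dot(T, N), dot(B, B))
```

The last line prints T · N, which should be close to 0, and ∥B∥², which should be close to 1, in agreement with the argument above that B is a unit vector.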

 Example 13.3.4: Finding the Principal Unit Normal Vector and Binormal Vector

For each of the following vector-valued functions, find the principal unit normal vector. Then, if possible, find the binormal vector.

1. \( \vec{r}(t) = 4\cos t\,\hat{\mathbf{i}} - 4\sin t\,\hat{\mathbf{j}} \)
2. \( \vec{r}(t) = (6t + 2)\,\hat{\mathbf{i}} + 5t^2\,\hat{\mathbf{j}} - 8t\,\hat{\mathbf{k}} \)

Solution
1. This function describes a circle.

To find the principal unit normal vector, we first must find the unit tangent vector \( \vec{T}(t) \):
\[ \begin{aligned}
\vec{T}(t) &= \frac{\vec{r}\,'(t)}{\|\vec{r}\,'(t)\|} \\
&= \frac{-4\sin t\,\hat{\mathbf{i}} - 4\cos t\,\hat{\mathbf{j}}}{\sqrt{(-4\sin t)^2 + (-4\cos t)^2}} \\
&= \frac{-4\sin t\,\hat{\mathbf{i}} - 4\cos t\,\hat{\mathbf{j}}}{\sqrt{16\sin^2 t + 16\cos^2 t}} \\
&= \frac{-4\sin t\,\hat{\mathbf{i}} - 4\cos t\,\hat{\mathbf{j}}}{\sqrt{16(\sin^2 t + \cos^2 t)}} \\
&= \frac{-4\sin t\,\hat{\mathbf{i}} - 4\cos t\,\hat{\mathbf{j}}}{4} \\
&= -\sin t\,\hat{\mathbf{i}} - \cos t\,\hat{\mathbf{j}}.
\end{aligned} \]

Next, we use 13.3.9:
\[ \begin{aligned}
\vec{N}(t) &= \frac{\vec{T}\,'(t)}{\|\vec{T}\,'(t)\|} \\
&= \frac{-\cos t\,\hat{\mathbf{i}} + \sin t\,\hat{\mathbf{j}}}{\sqrt{(-\cos t)^2 + (\sin t)^2}} \\
&= \frac{-\cos t\,\hat{\mathbf{i}} + \sin t\,\hat{\mathbf{j}}}{\sqrt{\cos^2 t + \sin^2 t}} \\
&= -\cos t\,\hat{\mathbf{i}} + \sin t\,\hat{\mathbf{j}}.
\end{aligned} \]

Notice that the unit tangent vector and the principal unit normal vector are orthogonal to each other for all values of \( t \):
\[ \begin{aligned}
\vec{T}(t) \cdot \vec{N}(t) &= \langle -\sin t,\ -\cos t \rangle \cdot \langle -\cos t,\ \sin t \rangle \\
&= \sin t\cos t - \cos t\sin t \\
&= 0.
\end{aligned} \]

Furthermore, the principal unit normal vector points toward the center of the circle from every point on the circle. Since \( \vec{r}(t) \) defines a curve in two dimensions, we cannot calculate the binormal vector.



2. This function looks like this:


To find the principal unit normal vector, we first find the unit tangent vector \( \vec{T}(t) \):
\[ \begin{aligned}
\vec{T}(t) &= \frac{\vec{r}\,'(t)}{\|\vec{r}\,'(t)\|} \\
&= \frac{6\,\hat{\mathbf{i}} + 10t\,\hat{\mathbf{j}} - 8\,\hat{\mathbf{k}}}{\sqrt{6^2 + (10t)^2 + (-8)^2}} \\
&= \frac{6\,\hat{\mathbf{i}} + 10t\,\hat{\mathbf{j}} - 8\,\hat{\mathbf{k}}}{\sqrt{36 + 100t^2 + 64}} \\
&= \frac{6\,\hat{\mathbf{i}} + 10t\,\hat{\mathbf{j}} - 8\,\hat{\mathbf{k}}}{\sqrt{100(t^2 + 1)}} \\
&= \frac{3\,\hat{\mathbf{i}} + 5t\,\hat{\mathbf{j}} - 4\,\hat{\mathbf{k}}}{5\sqrt{t^2 + 1}} \\
&= \frac{3}{5}(t^2 + 1)^{-1/2}\,\hat{\mathbf{i}} + t(t^2 + 1)^{-1/2}\,\hat{\mathbf{j}} - \frac{4}{5}(t^2 + 1)^{-1/2}\,\hat{\mathbf{k}}.
\end{aligned} \]

Next, we calculate \( \vec{T}\,'(t) \) and \( \|\vec{T}\,'(t)\| \):
\[ \begin{aligned}
\vec{T}\,'(t) &= \frac{3}{5}\left(-\frac{1}{2}\right)(t^2 + 1)^{-3/2}(2t)\,\hat{\mathbf{i}} + \left((t^2 + 1)^{-1/2} - t\left(\frac{1}{2}\right)(t^2 + 1)^{-3/2}(2t)\right)\hat{\mathbf{j}} - \frac{4}{5}\left(-\frac{1}{2}\right)(t^2 + 1)^{-3/2}(2t)\,\hat{\mathbf{k}} \\
&= -\frac{3t}{5(t^2 + 1)^{3/2}}\,\hat{\mathbf{i}} + \frac{1}{(t^2 + 1)^{3/2}}\,\hat{\mathbf{j}} + \frac{4t}{5(t^2 + 1)^{3/2}}\,\hat{\mathbf{k}}
\end{aligned} \]
\[ \begin{aligned}
\|\vec{T}\,'(t)\| &= \sqrt{\left(-\frac{3t}{5(t^2 + 1)^{3/2}}\right)^2 + \left(\frac{1}{(t^2 + 1)^{3/2}}\right)^2 + \left(\frac{4t}{5(t^2 + 1)^{3/2}}\right)^2} \\
&= \sqrt{\frac{9t^2}{25(t^2 + 1)^3} + \frac{1}{(t^2 + 1)^3} + \frac{16t^2}{25(t^2 + 1)^3}} \\
&= \sqrt{\frac{25t^2 + 25}{25(t^2 + 1)^3}} \\
&= \sqrt{\frac{1}{(t^2 + 1)^2}} \\
&= \frac{1}{t^2 + 1}.
\end{aligned} \]

Therefore, according to 13.3.9:
\[ \begin{aligned}
\vec{N}(t) &= \frac{\vec{T}\,'(t)}{\|\vec{T}\,'(t)\|} \\
&= \left(-\frac{3t}{5(t^2 + 1)^{3/2}}\,\hat{\mathbf{i}} + \frac{1}{(t^2 + 1)^{3/2}}\,\hat{\mathbf{j}} + \frac{4t}{5(t^2 + 1)^{3/2}}\,\hat{\mathbf{k}}\right)(t^2 + 1) \\
&= -\frac{3t}{5(t^2 + 1)^{1/2}}\,\hat{\mathbf{i}} + \frac{5}{5(t^2 + 1)^{1/2}}\,\hat{\mathbf{j}} + \frac{4t}{5(t^2 + 1)^{1/2}}\,\hat{\mathbf{k}} \\
&= \frac{-3t\,\hat{\mathbf{i}} + 5\,\hat{\mathbf{j}} + 4t\,\hat{\mathbf{k}}}{5\sqrt{t^2 + 1}}.
\end{aligned} \]

Once again, the unit tangent vector and the principal unit normal vector are orthogonal to each other for all values of \( t \):
\[ \begin{aligned}
\vec{T}(t) \cdot \vec{N}(t) &= \left(\frac{3\,\hat{\mathbf{i}} + 5t\,\hat{\mathbf{j}} - 4\,\hat{\mathbf{k}}}{5\sqrt{t^2 + 1}}\right) \cdot \left(\frac{-3t\,\hat{\mathbf{i}} + 5\,\hat{\mathbf{j}} + 4t\,\hat{\mathbf{k}}}{5\sqrt{t^2 + 1}}\right) \\
&= \frac{3(-3t) + 5t(5) - 4(4t)}{25(t^2 + 1)} \\
&= \frac{-9t + 25t - 16t}{25(t^2 + 1)} \\
&= 0.
\end{aligned} \]

Last, since \( \vec{r}(t) \) represents a three-dimensional curve, we can calculate the binormal vector using 13.3.10:
\[ \begin{aligned}
\vec{B}(t) &= \vec{T}(t) \times \vec{N}(t) \\
&= \begin{vmatrix} \hat{\mathbf{i}} & \hat{\mathbf{j}} & \hat{\mathbf{k}} \\[4pt] \dfrac{3}{5\sqrt{t^2 + 1}} & \dfrac{5t}{5\sqrt{t^2 + 1}} & -\dfrac{4}{5\sqrt{t^2 + 1}} \\[8pt] -\dfrac{3t}{5\sqrt{t^2 + 1}} & \dfrac{5}{5\sqrt{t^2 + 1}} & \dfrac{4t}{5\sqrt{t^2 + 1}} \end{vmatrix} \\
&= \frac{(5t)(4t) - (-4)(5)}{25(t^2 + 1)}\,\hat{\mathbf{i}} - \frac{(3)(4t) - (-4)(-3t)}{25(t^2 + 1)}\,\hat{\mathbf{j}} + \frac{(3)(5) - (5t)(-3t)}{25(t^2 + 1)}\,\hat{\mathbf{k}} \\
&= \left(\frac{20t^2 + 20}{25(t^2 + 1)}\right)\hat{\mathbf{i}} + \left(\frac{15 + 15t^2}{25(t^2 + 1)}\right)\hat{\mathbf{k}} \\
&= 20\left(\frac{t^2 + 1}{25(t^2 + 1)}\right)\hat{\mathbf{i}} + 15\left(\frac{t^2 + 1}{25(t^2 + 1)}\right)\hat{\mathbf{k}} \\
&= \frac{4}{5}\,\hat{\mathbf{i}} + \frac{3}{5}\,\hat{\mathbf{k}}.
\end{aligned} \]

 Exercise 13.3.4

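Find the unit normal vector for the vector-valued function \( \vec{r}(t) = (t^2 - 3t)\,\hat{\mathbf{i}} + (4t + 1)\,\hat{\mathbf{j}} \) and evaluate it at \( t = 2 \).

Hint
First, find \( \vec{T}(t) \), then use 13.3.9.

Answer
\( \vec{N}(t) = \dfrac{4\,\hat{\mathbf{i}} - (2t - 3)\,\hat{\mathbf{j}}}{\sqrt{(2t - 3)^2 + 16}} \), so
\[ \vec{N}(2) = \frac{1}{\sqrt{17}}\left(4\,\hat{\mathbf{i}} - \hat{\mathbf{j}}\right) \]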

For any smooth curve in three dimensions that is defined by a vector-valued function, we now have formulas for the unit tangent vector \( \vec{T} \), the unit normal vector \( \vec{N} \), and the binormal vector \( \vec{B} \). The unit normal vector and the binormal vector form a plane that is perpendicular to the curve at any point on the curve, called the normal plane. In addition, these three vectors form a frame of reference in three-dimensional space called the Frenet frame of reference (also called the TNB frame) (Figure 13.3.2). Last, the plane determined by the vectors \( \vec{T} \) and \( \vec{N} \) forms the osculating plane of C at any point P on the curve.

Figure 13.3.2 : This figure depicts a Frenet frame of reference. At every point P on a three-dimensional curve, the unit tangent,
unit normal, and binormal vectors form a three-dimensional frame of reference.
Suppose we form a circle in the osculating plane of C at point P on the curve. Assume that the circle has the same curvature as the curve does at point P and let the circle have radius \( r \). Then, the curvature of the circle is given by \( \frac{1}{r} \). We call \( r \) the radius of curvature of the curve, and it is equal to the reciprocal of the curvature. If this circle lies on the concave side of the curve and is tangent to the curve at point P, then this circle is called the osculating circle of C at P, as shown in Figure 13.3.3.

Figure 13.3.3 : In this osculating circle, the circle is tangent to curve C at point P and shares the same curvature.
For more information on osculating circles, see this demonstration on curvature and torsion, this article on osculating circles, and
this discussion of Serret formulas.
To find the equation of an osculating circle in two dimensions, we need find only the center and radius of the circle.

 Example 13.3.5: Finding the Equation of an Osculating Circle

Find the equation of the osculating circle of the curve defined by the function \( y = x^3 - 3x + 1 \) at \( x = 1 \).
Solution
Figure 13.3.4 shows the graph of \( y = x^3 - 3x + 1 \).

Figure 13.3.4 : We want to find the osculating circle of this graph at the point where x = 1 .
First, let’s calculate the curvature at \( x = 1 \):
\[ \kappa = \frac{|f''(x)|}{\left(1 + [f'(x)]^2\right)^{3/2}} = \frac{|6x|}{\left(1 + [3x^2 - 3]^2\right)^{3/2}}. \]

This gives \( \kappa = 6 \). Therefore, the radius of the osculating circle is given by \( R = \frac{1}{\kappa} = \frac{1}{6} \). Next, we calculate the coordinates of the center of the circle. When \( x = 1 \), the slope of the tangent line is zero. Therefore, the center of the osculating circle is directly above the point on the graph with coordinates \( (1, -1) \). The center is located at \( \left(1, -\frac{5}{6}\right) \). The formula for a circle with radius \( r \) and center \( (h, k) \) is given by \( (x - h)^2 + (y - k)^2 = r^2 \). Therefore, the equation of the osculating circle is \( (x - 1)^2 + \left(y + \frac{5}{6}\right)^2 = \frac{1}{36} \). The graph and its osculating circle appear in the following figure.

Figure 13.3.5 : The osculating circle has radius \( R = \frac{1}{6} \).
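
The center and radius found above can be reproduced with the standard center-of-curvature formulas for a graph y = f(x), namely center (x − y'(1 + y'²)/y'', y + (1 + y'²)/y'') and radius (1 + y'²)^{3/2}/|y''|. These formulas are not stated in this section, so the Python sketch below, with its hypothetical function name, should be read as an assumption-labeled illustration.

```python
def osculating_circle(x, y, dy, d2y):
    """Center and radius of the osculating circle of a graph at the point (x, y).

    dy and d2y are the first and second derivatives of the function at x.
    """
    if d2y == 0:
        raise ValueError("curvature is zero; no osculating circle at this point")
    w = 1 + dy * dy
    radius = w**1.5 / abs(d2y)                 # reciprocal of the curvature 13.3.8
    center = (x - dy * w / d2y, y + w / d2y)   # center of curvature
    return center, radius


def f(x):
    return x**3 - 3 * x + 1


def df(x):
    return 3 * x**2 - 3


def d2f(x):
    return 6 * x


center, radius = osculating_circle(1.0, f(1.0), df(1.0), d2f(1.0))
print(center, radius)
```

At a point where the tangent line is horizontal, as here, the center lies directly above or below the point on the graph, depending on the sign of the second derivative.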

 Exercise 13.3.5

Find the equation of the osculating circle of the curve defined by the function \( y = 2x^2 - 4x + 5 \) at \( x = 1 \).

Hint
Use 13.3.8 to find the curvature of the graph, then draw a graph of the function around \( x = 1 \) to help visualize the circle in relation to the graph.

Answer
\[ \kappa = \frac{4}{\left[1 + (4x - 4)^2\right]^{3/2}} \]

At the point \( x = 1 \), the curvature is equal to 4. Therefore, the radius of the osculating circle is \( \frac{1}{4} \).
A graph of this function appears next:

The vertex of this parabola is located at the point \( (1, 3) \). Furthermore, the center of the osculating circle is directly above the vertex. Therefore, the coordinates of the center are \( \left(1, \frac{13}{4}\right) \). The equation of the osculating circle is \( (x - 1)^2 + \left(y - \frac{13}{4}\right)^2 = \frac{1}{16} \).

Key Concepts
The arc-length function for a vector-valued function is calculated using the integral formula \( s(t) = \int_a^t \|\vec{r}\,'(u)\|\, du \). This formula is valid in both two and three dimensions.
The curvature of a curve at a point in either two or three dimensions is defined to be the curvature of the inscribed circle at that point. The arc-length parameterization is used in the definition of curvature.
There are several different formulas for curvature. The curvature of a circle is equal to the reciprocal of its radius.
The principal unit normal vector at \( t \) is defined to be
\[ \vec{N}(t) = \frac{\vec{T}\,'(t)}{\|\vec{T}\,'(t)\|}. \]
The binormal vector at \( t \) is defined as \( \vec{B}(t) = \vec{T}(t) \times \vec{N}(t) \), where \( \vec{T}(t) \) is the unit tangent vector.
The Frenet frame of reference is formed by the unit tangent vector, the principal unit normal vector, and the binormal vector.
The osculating circle is tangent to a curve at a point and has the same curvature as the curve at that point.

Key Equations
Arc length of space curve
\[ s = \int_a^b \sqrt{[f'(t)]^2 + [g'(t)]^2 + [h'(t)]^2}\, dt = \int_a^b \|\vec{r}\,'(t)\|\, dt \]

Arc-length function
\[ s(t) = \int_a^t \sqrt{(f'(u))^2 + (g'(u))^2 + (h'(u))^2}\, du \quad \text{or} \quad s(t) = \int_a^t \|\vec{r}\,'(u)\|\, du \]

Curvature
\[ \kappa = \frac{\|\vec{T}\,'(t)\|}{\|\vec{r}\,'(t)\|} \quad \text{or} \quad \kappa = \frac{\|\vec{r}\,'(t) \times \vec{r}\,''(t)\|}{\|\vec{r}\,'(t)\|^3} \quad \text{or} \quad \kappa = \frac{|y''|}{[1 + (y')^2]^{3/2}} \]

Principal unit normal vector
\[ \vec{N}(t) = \frac{\vec{T}\,'(t)}{\|\vec{T}\,'(t)\|} \]

Binormal vector
\[ \vec{B}(t) = \vec{T}(t) \times \vec{N}(t) \]

Glossary
arc-length function
a function s(t) that describes the arc length of curve C as a function of t

arc-length parameterization
a reparameterization of a vector-valued function in which the parameter is equal to the arc length

binormal vector
a unit vector orthogonal to the unit tangent vector and the unit normal vector

curvature
the magnitude of the derivative of the unit tangent vector with respect to the arc-length parameter

Frenet frame of reference


(TNB frame) a frame of reference in three-dimensional space formed by the unit tangent vector, the unit normal vector, and the
binormal vector

normal plane
a plane that is perpendicular to a curve at any point on the curve

osculating circle
a circle that is tangent to a curve C at a point P and that shares the same curvature

osculating plane
the plane determined by the unit tangent and the unit normal vector

principal unit normal vector
a vector orthogonal to the unit tangent vector, given by the formula \( \dfrac{\vec{T}\,'(t)}{\|\vec{T}\,'(t)\|} \)


radius of curvature
the reciprocal of the curvature

smooth
curves where the vector-valued function \( \vec{r}(t) \) is differentiable with a non-zero derivative

This page titled 13.3: Arc Length and Curvature is shared under a not declared license and was authored, remixed, and/or curated by OpenStax.
13.3: Arc Length and Curvature by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0. Original source:
https://openstax.org/details/books/calculus-volume-1.



13.4: Motion in Space- Velocity and Acceleration
 Learning Objectives
Describe the velocity and acceleration vectors of a particle moving in space.
Explain the tangential and normal components of acceleration.
State Kepler’s laws of planetary motion.

We have now seen how to describe curves in the plane and in space, and how to determine their properties, such as arc length and
curvature. All of this leads to the main goal of this chapter, which is the description of motion along plane curves and space curves.
We now have all the tools we need; in this section, we put these ideas together and look at how to use them.

Motion Vectors in the Plane and in Space


Our starting point is using vector-valued functions to represent the position of an object as a function of time. All of the following
material can be applied either to curves in the plane or to space curves. For example, when we look at the orbit of the planets, the
curves defining these orbits all lie in a plane because they are elliptical. However, a particle traveling along a helix moves on a
curve in three dimensions.

Definition: Speed, Velocity, and Acceleration


Let \(\vec{r}(t)\) be a twice-differentiable vector-valued function of the parameter \(t\) that represents the position of an object as a function of time.
The velocity vector \(\vec{v}(t)\) of the object is given by
\[ \text{Velocity} = \vec{v}(t) = \vec{r}'(t). \tag{13.4.1} \]
The acceleration vector \(\vec{a}(t)\) is defined to be
\[ \text{Acceleration} = \vec{a}(t) = \vec{v}'(t) = \vec{r}''(t). \tag{13.4.2} \]
The speed is defined to be
\[ \text{Speed} = v(t) = \|\vec{v}(t)\| = \|\vec{r}'(t)\| = \frac{ds}{dt}. \tag{13.4.3} \]

Since \(\vec{r}(t)\) can be in either two or three dimensions, these vector-valued functions can have either two or three components. In two dimensions, we define \(\vec{r}(t) = x(t)\,\hat{i} + y(t)\,\hat{j}\) and in three dimensions \(\vec{r}(t) = x(t)\,\hat{i} + y(t)\,\hat{j} + z(t)\,\hat{k}\). Then the velocity, acceleration, and speed can be written as shown in the following table.
Table 13.4.1 : Formulas for Position, Velocity, Acceleration, and Speed
| Quantity | Two Dimensions | Three Dimensions |
|---|---|---|
| Position | \(\vec{r}(t) = x(t)\,\hat{i} + y(t)\,\hat{j}\) | \(\vec{r}(t) = x(t)\,\hat{i} + y(t)\,\hat{j} + z(t)\,\hat{k}\) |
| Velocity | \(\vec{v}(t) = x'(t)\,\hat{i} + y'(t)\,\hat{j}\) | \(\vec{v}(t) = x'(t)\,\hat{i} + y'(t)\,\hat{j} + z'(t)\,\hat{k}\) |
| Acceleration | \(\vec{a}(t) = x''(t)\,\hat{i} + y''(t)\,\hat{j}\) | \(\vec{a}(t) = x''(t)\,\hat{i} + y''(t)\,\hat{j} + z''(t)\,\hat{k}\) |
| Speed | \(\lVert\vec{v}(t)\rVert = \sqrt{(x'(t))^2 + (y'(t))^2}\) | \(\lVert\vec{v}(t)\rVert = \sqrt{(x'(t))^2 + (y'(t))^2 + (z'(t))^2}\) |

Example 13.4.1: Studying Motion Along a Parabola


A particle moves in a parabolic path defined by the vector-valued function \(\vec{r}(t) = t^2\,\hat{i} + \sqrt{5 - t^2}\,\hat{j}\), where \(t\) measures time in seconds.
1. Find the velocity, acceleration, and speed as functions of time.
2. Sketch the curve along with the velocity vector at time \(t = 1\).
Solution

13.4.1 https://math.libretexts.org/@go/page/4534
1. We use Equations 13.4.1, 13.4.2, and 13.4.3:
\[ \vec{v}(t) = \vec{r}'(t) = 2t\,\hat{i} - \frac{t}{\sqrt{5 - t^2}}\,\hat{j} \]
\[ \vec{a}(t) = \vec{v}'(t) = 2\,\hat{i} - 5(5 - t^2)^{-3/2}\,\hat{j} \]
\[ \begin{aligned}
\|\vec{v}(t)\| &= \|\vec{r}'(t)\| \\
&= \sqrt{(2t)^2 + \left(-\frac{t}{\sqrt{5 - t^2}}\right)^2} \\
&= \sqrt{4t^2 + \frac{t^2}{5 - t^2}} \\
&= \sqrt{\frac{21t^2 - 4t^4}{5 - t^2}}.
\end{aligned} \]

2. The graph of \(\vec{r}(t) = t^2\,\hat{i} + \sqrt{5 - t^2}\,\hat{j}\) is a portion of a parabola (Figure 13.4.1).

When \(t = 1\), \(\vec{r}(1) = (1)^2\,\hat{i} + \sqrt{5 - (1)^2}\,\hat{j} = \hat{i} + \sqrt{4}\,\hat{j} = \hat{i} + 2\,\hat{j}\).

Thus the particle is located at the point \((1, 2)\) when \(t = 1\).

The velocity vector at \(t = 1\) is
\[ \vec{v}(1) = \vec{r}'(1) = 2(1)\,\hat{i} - \frac{1}{\sqrt{5 - 1^2}}\,\hat{j} = 2\,\hat{i} - \frac{1}{2}\,\hat{j} \]
and the acceleration vector at \(t = 1\) is
\[ \vec{a}(1) = \vec{v}'(1) = 2\,\hat{i} - 5(5 - 1^2)^{-3/2}\,\hat{j} = 2\,\hat{i} - \frac{5}{8}\,\hat{j}. \]

Notice that the velocity vector is tangent to the path, as is always the case.

Figure 13.4.1 : This graph depicts the velocity vector at time t = 1 for a particle moving in a parabolic path.
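
The hand computations in Example 13.4.1 can be checked numerically. The sketch below is only an illustration: the function names and the step size `h` are assumptions made here, and it compares the velocity and acceleration formulas from the example with central-difference approximations of the derivatives at \(t = 1\).

```python
import math

def r(t):
    """Position from Example 13.4.1: r(t) = t^2 i + sqrt(5 - t^2) j."""
    return (t**2, math.sqrt(5 - t**2))

def v(t):
    """Velocity r'(t) as computed by hand in the example."""
    return (2 * t, -t / math.sqrt(5 - t**2))

def a(t):
    """Acceleration r''(t) as computed by hand in the example."""
    return (2.0, -5 * (5 - t**2) ** -1.5)

def speed(t):
    """Speed is the magnitude of the velocity vector."""
    return math.hypot(*v(t))

def central_difference(f, t, h=1e-5):
    """Approximate f'(t) componentwise with a symmetric difference quotient."""
    f_plus, f_minus = f(t + h), f(t - h)
    return tuple((p - m) / (2 * h) for p, m in zip(f_plus, f_minus))

print("v(1):", v(1.0), "numerical:", central_difference(r, 1.0))
print("a(1):", a(1.0), "numerical:", central_difference(v, 1.0))
print("speed at t = 1:", speed(1.0))
```

The numerical derivatives should agree with \(2\,\hat{i} - \tfrac{1}{2}\,\hat{j}\) and \(2\,\hat{i} - \tfrac{5}{8}\,\hat{j}\) to several decimal places.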

Exercise 13.4.1

A particle moves in a path defined by the vector-valued function \(\vec{r}(t) = (t^2 - 3t)\,\hat{i} + (2t - 4)\,\hat{j} + (t + 2)\,\hat{k}\), where \(t\) measures time in seconds and where distance is measured in feet. Find the velocity, acceleration, and speed as functions of time.

Hint
Use Equations 13.4.1, 13.4.2, and 13.4.3.

Answer
\[ \vec{v}(t) = \vec{r}'(t) = (2t - 3)\,\hat{i} + 2\,\hat{j} + \hat{k} \]
\[ \vec{a}(t) = \vec{v}'(t) = 2\,\hat{i} \]
\[ \|\vec{r}'(t)\| = \sqrt{(2t - 3)^2 + 2^2 + 1^2} = \sqrt{4t^2 - 12t + 14} \]

The units for velocity and speed are feet per second, and the units for acceleration are feet per second squared.

To gain a better understanding of the velocity and acceleration vectors, imagine you are driving along a curvy road. If you did not turn the steering wheel, you would continue in a straight line and run off the road. The speed at which you are traveling when you run off the road, coupled with the direction, gives a vector representing your velocity, as illustrated in Figure 13.4.2.

Figure 13.4.2 : At each point along a road traveled by a car, the velocity vector of the car is tangent to the path traveled by the car.
However, the fact that you must turn the steering wheel to stay on the road indicates that your velocity is always changing (even if
your speed is not) because your direction is constantly changing to keep you on the road. As you turn to the right, your acceleration
vector also points to the right. As you turn to the left, your acceleration vector points to the left. This indicates that your velocity
and acceleration vectors are constantly changing, regardless of whether your actual speed varies (Figure 13.4.3).

Figure 13.4.3 : The dashed line represents the trajectory of an object (a car, for example). The acceleration vector points toward the
inside of the turn at all times.

Components of the Acceleration Vector


We can combine some of the concepts discussed in Arc Length and Curvature with the acceleration vector to gain a deeper understanding of how this vector relates to motion in the plane and in space. Recall that the unit tangent vector \(\vec{T}\) and the unit normal vector \(\vec{N}\) form an osculating plane at any point \(P\) on the curve defined by a vector-valued function \(\vec{r}(t)\). The following theorem shows that the acceleration vector \(\vec{a}(t)\) lies in the osculating plane and can be written as a linear combination of the unit tangent and the unit normal vectors.

Theorem 13.4.1: The Plane of the Acceleration Vector

The acceleration vector \(\vec{a}(t)\) of an object moving along a curve \(C\) traced out by a twice-differentiable function \(\vec{r}(t)\) lies in the plane formed by the unit tangent vector \(\vec{T}(t)\) and the principal unit normal vector \(\vec{N}(t)\) to \(C\). Furthermore,
\[ \vec{a}(t) = v'(t)\,\vec{T}(t) + [v(t)]^2 \kappa\,\vec{N}(t). \]
Here, \(v(t) = \|\vec{v}(t)\|\) is the speed of the object and \(\kappa\) is the curvature of \(C\) traced out by \(\vec{r}(t)\).

Proof
Because \(\vec{v}(t) = \vec{r}'(t)\) and \(\vec{T}(t) = \dfrac{\vec{r}'(t)}{\|\vec{r}'(t)\|}\), we have \(\vec{v}(t) = \|\vec{r}'(t)\|\,\vec{T}(t) = v(t)\,\vec{T}(t)\).

Now we differentiate this equation:
\[ \vec{a}(t) = \vec{v}'(t) = \frac{d}{dt}\left(v(t)\,\vec{T}(t)\right) = v'(t)\,\vec{T}(t) + v(t)\,\vec{T}'(t). \]

Since \(\vec{N}(t) = \dfrac{\vec{T}'(t)}{\|\vec{T}'(t)\|}\), we know \(\vec{T}'(t) = \|\vec{T}'(t)\|\,\vec{N}(t)\), so
\[ \vec{a}(t) = v'(t)\,\vec{T}(t) + v(t)\,\|\vec{T}'(t)\|\,\vec{N}(t). \]

A formula for curvature is \(\kappa = \dfrac{\|\vec{T}'(t)\|}{\|\vec{r}'(t)\|}\), so \(\|\vec{T}'(t)\| = \kappa\,\|\vec{r}'(t)\| = \kappa\,v(t)\).

This gives \(\vec{a}(t) = v'(t)\,\vec{T}(t) + \kappa\,(v(t))^2\,\vec{N}(t).\)
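
As a sanity check on Theorem 13.4.1, the decomposition can be evaluated numerically for a sample curve. The sketch below rests on assumptions made only for illustration: it uses the curve \(\vec{r}(t) = \langle t^2, 2t - 3, 3t^2 - 3t \rangle\), approximates \(v'(t)\) and \(\vec{T}'(t)\) with central differences, and takes the curvature from the cross-product formula \(\kappa = \|\vec{r}' \times \vec{r}''\| / \|\vec{r}'\|^3\) from Arc Length and Curvature. All function names are hypothetical.

```python
import math

def vel(t):
    """r'(t) for the assumed sample curve r(t) = <t^2, 2t - 3, 3t^2 - 3t>."""
    return (2 * t, 2.0, 6 * t - 3)

def acc(t):
    """r''(t) for the same curve (constant for this particular curve)."""
    return (2.0, 0.0, 6.0)

def norm(u):
    return math.sqrt(sum(c * c for c in u))

def cross(u, w):
    return (u[1] * w[2] - u[2] * w[1],
            u[2] * w[0] - u[0] * w[2],
            u[0] * w[1] - u[1] * w[0])

def unit_tangent(t):
    v = vel(t)
    s = norm(v)
    return tuple(c / s for c in v)

def rebuilt_acceleration(t, h=1e-5):
    """Evaluate v'(t) T(t) + v(t)^2 kappa N(t) using central differences."""
    speed = norm(vel(t))
    dspeed = (norm(vel(t + h)) - norm(vel(t - h))) / (2 * h)    # v'(t)
    T = unit_tangent(t)
    dT = tuple((p - m) / (2 * h)
               for p, m in zip(unit_tangent(t + h), unit_tangent(t - h)))
    dT_len = norm(dT)
    N = tuple(c / dT_len for c in dT)                            # T'/||T'||
    kappa = norm(cross(vel(t), acc(t))) / speed ** 3             # curvature
    return tuple(dspeed * Tc + speed ** 2 * kappa * Nc
                 for Tc, Nc in zip(T, N))

print(rebuilt_acceleration(2.0))  # should be close to acc(2.0) = (2, 0, 6)
```

If the theorem holds, the rebuilt vector should match \(\vec{r}''(t)\) up to the small error introduced by the finite differences.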

The coefficients of \(\vec{T}(t)\) and \(\vec{N}(t)\) are referred to as the tangential component of acceleration and the normal component of acceleration, respectively. We write \(a_{\vec{T}}\) to denote the tangential component and \(a_{\vec{N}}\) to denote the normal component.

Theorem 13.4.2: Tangential and Normal Components of Acceleration

Let \(\vec{r}(t)\) be a vector-valued function that denotes the position of an object as a function of time. Then \(\vec{a}(t) = \vec{r}''(t)\) is the acceleration vector. The tangential and normal components of acceleration \(a_{\vec{T}}\) and \(a_{\vec{N}}\) are given by the formulas
\[ a_{\vec{T}} = \vec{a} \cdot \vec{T} = \frac{\vec{v} \cdot \vec{a}}{\|\vec{v}\|} \tag{13.4.4} \]
and
\[ a_{\vec{N}} = \vec{a} \cdot \vec{N} = \frac{\|\vec{v} \times \vec{a}\|}{\|\vec{v}\|} = \sqrt{\|\vec{a}\|^2 - (a_{\vec{T}})^2}. \tag{13.4.5} \]
These components are related by the formula
\[ \vec{a}(t) = a_{\vec{T}}\,\vec{T}(t) + a_{\vec{N}}\,\vec{N}(t). \tag{13.4.6} \]
Here \(\vec{T}(t)\) is the unit tangent vector to the curve defined by \(\vec{r}(t)\), and \(\vec{N}(t)\) is the unit normal vector to the curve defined by \(\vec{r}(t)\).

The normal component of acceleration is also called the centripetal component of acceleration or sometimes the radial component
of acceleration. To understand centripetal acceleration, suppose you are traveling in a car on a circular track at a constant speed.
Then, as we saw earlier, the acceleration vector points toward the center of the track at all times. As a rider in the car, you feel a
pull toward the outside of the track because you are constantly turning. This sensation acts in the opposite direction of centripetal
acceleration. The same holds true for noncircular paths. The reason is that your body tends to travel in a straight line and resists the force resulting from acceleration that pushes it toward the side. Note that at point B in Figure 13.4.4 the acceleration vector is
pointing backward. This is because the car is decelerating as it goes into the curve.

Figure 13.4.4 : The tangential and normal components of acceleration can be used to describe the acceleration vector.
The tangential and normal unit vectors at any given point on the curve provide a frame of reference at that point. The tangential and normal components of acceleration are the projections of the acceleration vector onto \(\vec{T}\) and \(\vec{N}\), respectively.

Example 13.4.2: Finding Components of Acceleration

A particle moves in a path defined by the vector-valued function \(\vec{r}(t) = t^2\,\hat{i} + (2t - 3)\,\hat{j} + (3t^2 - 3t)\,\hat{k}\), where \(t\) measures time in seconds and distance is measured in feet.
a. Find \(a_{\vec{T}}\) and \(a_{\vec{N}}\) as functions of \(t\).
b. Find \(a_{\vec{T}}\) and \(a_{\vec{N}}\) at time \(t = 2\).

Solution
a. Let’s start by deriving the velocity and acceleration functions:
\[ \begin{aligned}
\vec{v}(t) &= \vec{r}'(t) = 2t\,\hat{i} + 2\,\hat{j} + (6t - 3)\,\hat{k} \\
\vec{a}(t) &= \vec{v}'(t) = 2\,\hat{i} + 6\,\hat{k}
\end{aligned} \]

Now we apply Equation 13.4.4:
\[ \begin{aligned}
a_{\vec{T}} &= \frac{\vec{v} \cdot \vec{a}}{\|\vec{v}\|} \\
&= \frac{(2t\,\hat{i} + 2\,\hat{j} + (6t - 3)\,\hat{k}) \cdot (2\,\hat{i} + 6\,\hat{k})}{\|2t\,\hat{i} + 2\,\hat{j} + (6t - 3)\,\hat{k}\|} \\
&= \frac{4t + 6(6t - 3)}{\sqrt{(2t)^2 + 2^2 + (6t - 3)^2}} \\
&= \frac{40t - 18}{\sqrt{40t^2 - 36t + 13}}
\end{aligned} \]

Now we can apply Equation 13.4.5:
\[ \begin{aligned}
a_{\vec{N}} &= \sqrt{\|\vec{a}\|^2 - (a_{\vec{T}})^2} \\
&= \sqrt{\|2\,\hat{i} + 6\,\hat{k}\|^2 - \left(\frac{40t - 18}{\sqrt{40t^2 - 36t + 13}}\right)^2} \\
&= \sqrt{4 + 36 - \frac{(40t - 18)^2}{40t^2 - 36t + 13}} \\
&= \sqrt{\frac{40(40t^2 - 36t + 13) - (1600t^2 - 1440t + 324)}{40t^2 - 36t + 13}} \\
&= \sqrt{\frac{196}{40t^2 - 36t + 13}} \\
&= \frac{14}{\sqrt{40t^2 - 36t + 13}}
\end{aligned} \]

b. We must evaluate each of the answers from part a at \(t = 2\):
\[ \begin{aligned}
a_{\vec{T}}(2) &= \frac{40(2) - 18}{\sqrt{40(2)^2 - 36(2) + 13}} = \frac{80 - 18}{\sqrt{160 - 72 + 13}} = \frac{62}{\sqrt{101}} \\
a_{\vec{N}}(2) &= \frac{14}{\sqrt{40(2)^2 - 36(2) + 13}} = \frac{14}{\sqrt{160 - 72 + 13}} = \frac{14}{\sqrt{101}}.
\end{aligned} \]

The units of acceleration are feet per second squared, as are the units of the normal and tangential components of
acceleration.
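
Equations 13.4.4 and 13.4.5 translate directly into a few lines of code. The following is a minimal sketch, assuming vectors are stored as 3-tuples (a plane vector gets a zero third component); the helper names are chosen here for illustration. It reproduces the values of Example 13.4.2 at \(t = 2\).

```python
import math

def dot(u, w):
    return sum(x * y for x, y in zip(u, w))

def cross(u, w):
    return (u[1] * w[2] - u[2] * w[1],
            u[2] * w[0] - u[0] * w[2],
            u[0] * w[1] - u[1] * w[0])

def norm(u):
    return math.sqrt(dot(u, u))

def acceleration_components(v, a):
    """Return (a_T, a_N) using Equations 13.4.4 and 13.4.5."""
    speed = norm(v)
    a_T = dot(v, a) / speed            # tangential component
    a_N = norm(cross(v, a)) / speed    # normal component
    return a_T, a_N

# Example 13.4.2 at t = 2: v(t) = <2t, 2, 6t - 3>, a(t) = <2, 0, 6>
t = 2.0
v = (2 * t, 2.0, 6 * t - 3)
a = (2.0, 0.0, 6.0)
a_T, a_N = acceleration_components(v, a)

print(a_T, 62 / math.sqrt(101))
print(a_N, 14 / math.sqrt(101))
# The two components always satisfy a_T^2 + a_N^2 = ||a||^2.
print(math.isclose(a_T**2 + a_N**2, dot(a, a)))
```

Using the cross-product form for \(a_{\vec{N}}\) avoids subtracting two nearly equal numbers under a square root, which can matter when the acceleration is almost entirely tangential.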

Exercise 13.4.2

An object moves in a path defined by the vector-valued function \(\vec{r}(t) = 4t\,\hat{i} + t^2\,\hat{j}\), where \(t\) measures time in seconds.
a. Find \(a_{\vec{T}}\) and \(a_{\vec{N}}\) as functions of \(t\).
b. Find \(a_{\vec{T}}\) and \(a_{\vec{N}}\) at time \(t = -3\).

Hint
Use Equations 13.4.4 and 13.4.5.

Answer
a.
\[ \begin{aligned}
a_{\vec{T}} &= \frac{\vec{v}(t) \cdot \vec{a}(t)}{\|\vec{v}(t)\|} = \frac{\vec{r}'(t) \cdot \vec{r}''(t)}{\|\vec{r}'(t)\|} \\
&= \frac{(4\,\hat{i} + 2t\,\hat{j}) \cdot (2\,\hat{j})}{\|4\,\hat{i} + 2t\,\hat{j}\|} \\
&= \frac{4t}{\sqrt{4^2 + (2t)^2}} \\
&= \frac{2t}{\sqrt{4 + t^2}}
\end{aligned} \]
\[ \begin{aligned}
a_{\vec{N}} &= \sqrt{\|\vec{a}\|^2 - (a_{\vec{T}})^2} \\
&= \sqrt{\|2\,\hat{j}\|^2 - \left(\frac{2t}{\sqrt{4 + t^2}}\right)^2} \\
&= \sqrt{4 - \frac{4t^2}{4 + t^2}} \\
&= \frac{4}{\sqrt{4 + t^2}}
\end{aligned} \]

b.
\[ a_{\vec{T}}(-3) = \frac{2(-3)}{\sqrt{4 + (-3)^2}} = \frac{-6}{\sqrt{13}} \]
\[ a_{\vec{N}}(-3) = \sqrt{4 - \frac{4(-3)^2}{4 + (-3)^2}} = \sqrt{4 - \frac{36}{13}} = \sqrt{\frac{16}{13}} = \frac{4}{\sqrt{13}} \]

Projectile Motion
Now let’s look at an application of vector functions. In particular, let’s consider the effect of gravity on the motion of an object as it
travels through the air, and how it determines the resulting trajectory of that object. In the following, we ignore the effect of air
resistance. This situation, with an object moving with an initial velocity but with no forces acting on it other than gravity, is known
as projectile motion. It describes the motion of objects from golf balls to baseballs, and from arrows to cannonballs.
First we need to choose a coordinate system. If we are standing at the origin of this coordinate system, then we choose the positive
\(y\)-axis to be up, the negative \(y\)-axis to be down, and the positive \(x\)-axis to be forward (i.e., away from the thrower of the object).

The effect of gravity is in a downward direction, so Newton’s second law tells us that the force on the object resulting from gravity is equal to the mass of the object times the acceleration resulting from gravity, or \(\vec{F}_g = m\vec{a}\), where \(\vec{F}_g\) represents the force from gravity and \(\vec{a} = -g\,\hat{j}\) represents the acceleration resulting from gravity at Earth’s surface. The value of \(g\) in the English system of measurement is approximately 32 ft/sec\(^2\) and it is approximately 9.8 m/sec\(^2\) in the metric system. This is the only force acting on the object. Since gravity acts in a downward direction, we can write the force resulting from gravity in the form \(\vec{F}_g = -mg\,\hat{j}\), as shown in Figure 13.4.5.

Figure 13.4.5 : An object is falling under the influence of gravity.
Newton’s second law also tells us that \(\vec{F} = m\vec{a}\), where \(\vec{a}\) represents the acceleration vector of the object. This force must be equal to the force of gravity at all times, so we therefore know that
\[ \begin{aligned}
\vec{F} &= \vec{F}_g \\
m\vec{a} &= -mg\,\hat{j} \\
\vec{a} &= -g\,\hat{j}.
\end{aligned} \]

Now we use the fact that the acceleration vector is the first derivative of the velocity vector. Therefore, we can rewrite the last
equation in the form
\[ \vec{v}'(t) = -g\,\hat{j}. \]

By taking the antiderivative of each side of this equation we obtain
\[ \vec{v}(t) = \int -g\,\hat{j}\,dt = -gt\,\hat{j} + \vec{C}_1 \]
for some constant vector \(\vec{C}_1\). To determine the value of this vector, we can use the velocity of the object at a fixed time, say at time \(t = 0\). We call this velocity the initial velocity: \(\vec{v}(0) = \vec{v}_0\). Therefore, \(\vec{v}(0) = -g(0)\,\hat{j} + \vec{C}_1 = \vec{v}_0\) and \(\vec{C}_1 = \vec{v}_0\). This gives the velocity vector as \(\vec{v}(t) = -gt\,\hat{j} + \vec{v}_0\).

Next we use the fact that velocity \(\vec{v}(t)\) is the derivative of position \(\vec{s}(t)\). This gives the equation
\[ \vec{s}'(t) = -gt\,\hat{j} + \vec{v}_0. \]

Taking the antiderivative of both sides of this equation leads to
\[ \vec{s}(t) = \int \left(-gt\,\hat{j} + \vec{v}_0\right) dt = -\frac{1}{2}gt^2\,\hat{j} + \vec{v}_0 t + \vec{C}_2 \]
with another unknown constant vector \(\vec{C}_2\). To determine the value of \(\vec{C}_2\), we can use the position of the object at a given time, say at time \(t = 0\). We call this position the initial position: \(\vec{s}(0) = \vec{s}_0\). Therefore, \(\vec{s}(0) = -(1/2)g(0)^2\,\hat{j} + \vec{v}_0(0) + \vec{C}_2 = \vec{s}_0\), so \(\vec{C}_2 = \vec{s}_0\). This gives the position of the object at any time as
\[ \vec{s}(t) = -\frac{1}{2}gt^2\,\hat{j} + \vec{v}_0 t + \vec{s}_0. \]

Let’s take a closer look at the initial velocity and initial position. In particular, suppose the object is thrown upward from the origin
at an angle \(\theta\) to the horizontal, with initial speed \(v_0\). How can we modify the previous result to reflect this scenario? First, we can assume it is thrown from the origin. If not, then we can move the origin to the point from where it is thrown. Therefore, \(\vec{s}_0 = \vec{0}\), as shown in Figure 13.4.6.

Figure 13.4.6 : Projectile motion when the object is thrown upward at an angle θ. The horizontal motion is at constant velocity and
the vertical motion is at constant acceleration.

We can rewrite the initial velocity vector in the form \(\vec{v}_0 = v_0 \cos\theta\,\hat{i} + v_0 \sin\theta\,\hat{j}\). Then the equation for the position function \(\vec{s}(t)\) becomes
\[ \begin{aligned}
\vec{s}(t) &= -\frac{1}{2}gt^2\,\hat{j} + v_0 t \cos\theta\,\hat{i} + v_0 t \sin\theta\,\hat{j} \\
&= v_0 t \cos\theta\,\hat{i} + v_0 t \sin\theta\,\hat{j} - \frac{1}{2}gt^2\,\hat{j} \\
&= v_0 t \cos\theta\,\hat{i} + \left(v_0 t \sin\theta - \frac{1}{2}gt^2\right)\hat{j}.
\end{aligned} \]

The coefficient of \(\hat{i}\) represents the horizontal component of \(\vec{s}(t)\) and is the horizontal distance of the object from the origin at time \(t\). The maximum value of the horizontal distance (measured at the same initial and final altitude) is called the range \(R\). The coefficient of \(\hat{j}\) represents the vertical component of \(\vec{s}(t)\) and is the altitude of the object at time \(t\). The maximum value of the vertical distance is the height \(H\).
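
The two antiderivative steps above can be mirrored by stepping the motion forward in small time increments and comparing the result with the closed-form position function. This is only a sketch: the function names, the number of steps, and the sample launch values are assumptions made for illustration.

```python
import math

def closed_form(t, v0, theta_deg, g):
    """Position s(t) = v0 t cos(theta) i + (v0 t sin(theta) - g t^2 / 2) j."""
    theta = math.radians(theta_deg)
    return (v0 * t * math.cos(theta),
            v0 * t * math.sin(theta) - 0.5 * g * t * t)

def integrate(t_end, v0, theta_deg, g, steps=1000):
    """Step the motion forward from the origin under a = -g j."""
    theta = math.radians(theta_deg)
    vx, vy = v0 * math.cos(theta), v0 * math.sin(theta)
    x = y = 0.0
    dt = t_end / steps
    for _ in range(steps):
        x += vx * dt                       # horizontal velocity never changes
        y += vy * dt - 0.5 * g * dt * dt   # constant-acceleration position update
        vy -= g * dt                       # velocity update: v' = -g j
    return x, y

print(integrate(2.0, 600.0, 30.0, 32.0))
print(closed_form(2.0, 600.0, 30.0, 32.0))
```

Because the acceleration is constant, this particular update rule reproduces the closed form up to rounding error; with a force that varies in time or position, the stepped result would only approximate the true path.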

Example 13.4.3: Motion of a Cannonball

During an Independence Day celebration, a cannonball is fired from a cannon on a cliff toward the water. The cannon is aimed
at an angle of 30° above horizontal and the initial speed of the cannonball is 600 ft/sec. The cliff is 100 ft above the water
(Figure 13.4.7).
a. Find the maximum height of the cannonball.
b. How long will it take for the cannonball to splash into the sea?
c. How far out to sea will the cannonball hit the water?

Figure 13.4.7 : The flight of a cannonball (ignoring air resistance) is projectile motion.
Solution
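We use the equation
\[ \vec{s}(t) = v_0 t \cos\theta\,\hat{i} + \left(v_0 t \sin\theta - \frac{1}{2}gt^2\right)\hat{j} \]
with \(\theta = 30^\circ\), \(g = 32\ \text{ft/sec}^2\), and \(v_0 = 600\ \text{ft/sec}\). Then the position equation becomes
\[ \begin{aligned}
\vec{s}(t) &= 600t(\cos 30^\circ)\,\hat{i} + \left(600t \sin 30^\circ - \frac{1}{2}(32)t^2\right)\hat{j} \\
&= 300t\sqrt{3}\,\hat{i} + (300t - 16t^2)\,\hat{j}
\end{aligned} \]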

a. The cannonball reaches its maximum height when the vertical component of its velocity is zero, because the cannonball is
neither rising nor falling at that point. The velocity vector is
\[ \vec{v}(t) = \vec{s}'(t) = 300\sqrt{3}\,\hat{i} + (300 - 32t)\,\hat{j}. \]

Therefore, the vertical component of velocity is given by the expression \(300 - 32t\). Setting this expression equal to zero and solving for \(t\) gives \(t = 9.375\) sec. The height of the cannonball at this time is given by the vertical component of the position vector, evaluated at \(t = 9.375\):
\[ \begin{aligned}
\vec{s}(9.375) &= 300(9.375)\sqrt{3}\,\hat{i} + \left(300(9.375) - 16(9.375)^2\right)\hat{j} \\
&= 4871.39\,\hat{i} + 1406.25\,\hat{j}
\end{aligned} \]

Therefore, the maximum height of the cannonball is 1406.25 ft above the cannon, or 1506.25 ft above sea level.
b. When the cannonball lands in the water, it is 100 ft below the cannon. Therefore, the vertical component of the position
vector is equal to −100. Setting the vertical component of \(\vec{s}(t)\) equal to −100 and solving, we obtain
\[ \begin{aligned}
300t - 16t^2 &= -100 \\
16t^2 - 300t - 100 &= 0 \\
4t^2 - 75t - 25 &= 0 \\
t &= \frac{75 \pm \sqrt{(-75)^2 - 4(4)(-25)}}{2(4)} \\
&= \frac{75 \pm \sqrt{6025}}{8} \\
&= \frac{75 \pm 5\sqrt{241}}{8}
\end{aligned} \]

The positive value of t that solves this equation is approximately 19.08. Therefore, the cannonball hits the water after
approximately 19.08 sec.
c. To find the distance out to sea, we simply substitute the answer from part (b) into \(\vec{s}(t)\):
\[ \begin{aligned}
\vec{s}(19.08) &= 300(19.08)\sqrt{3}\,\hat{i} + \left(300(19.08) - 16(19.08)^2\right)\hat{j} \\
&= 9914.26\,\hat{i} - 100.7424\,\hat{j}
\end{aligned} \]

Therefore, the ball hits the water about 9914.26 ft away from the base of the cliff. Notice that the vertical component of the
position vector is very close to −100, which tells us that the ball just hit the water. Note that 9914.26 feet is not the true
range of the cannon since the cannonball lands in the ocean at a location below the cannon. The range of the cannon would
be determined by finding how far out the cannonball is when its height is 100 ft above the water (the same as the altitude of
the cannon).
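
The three parts of Example 13.4.3 can also be reproduced in code. The sketch below assumes the same data (600 ft/sec, 30°, a 100 ft cliff, \(g = 32\) ft/sec²); the variable names are chosen here for illustration. It keeps the unrounded splash time, so its horizontal distance comes out roughly a foot shorter than the value obtained above with \(t\) rounded to 19.08.

```python
import math

g = 32.0                    # ft/sec^2
v0 = 600.0                  # initial speed in ft/sec
theta = math.radians(30)    # launch angle
cliff = 100.0               # height of the cannon above the water in ft

vx = v0 * math.cos(theta)   # constant horizontal velocity
vy = v0 * math.sin(theta)   # initial vertical velocity

# (a) Maximum height: the vertical velocity vy - g t is zero.
t_peak = vy / g
h_peak = vy * t_peak - 0.5 * g * t_peak**2

# (b) Splash time: positive root of vy t - (g/2) t^2 = -cliff.
t_splash = (vy + math.sqrt(vy**2 + 2 * g * cliff)) / g

# (c) Horizontal distance travelled at the moment of the splash.
x_splash = vx * t_splash

print(f"peak at t = {t_peak:.3f} sec, {h_peak:.2f} ft above the cannon")
print(f"splash at t = {t_splash:.2f} sec, {x_splash:.0f} ft from the cliff")
```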

Exercise 13.4.3

An archer fires an arrow at an angle of 40° above the horizontal with an initial speed of 98 m/sec. The height of the archer is
171.5 cm. Find the horizontal distance the arrow travels before it hits the ground.

Hint
The equation for the position vector needs to account for the height of the archer in meters.

Answer
967.15 m

One final question remains: In general, what is the maximum distance a projectile can travel, given its initial speed? To determine
this distance, we assume the projectile is fired from ground level and we wish it to return to ground level. In other words, we want
to determine an equation for the range. In this case, the equation of projectile motion is
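\[ \vec{s}(t) = v_0 t \cos\theta\,\hat{i} + \left(v_0 t \sin\theta - \frac{1}{2}gt^2\right)\hat{j}. \]

Setting the second component equal to zero and solving for \(t\) yields
\[ \begin{aligned}
v_0 t \sin\theta - \frac{1}{2}gt^2 &= 0 \\
t\left(v_0 \sin\theta - \frac{1}{2}gt\right) &= 0
\end{aligned} \]

Therefore, either \(t = 0\) or \(t = \dfrac{2v_0 \sin\theta}{g}\). We are interested in the second value of \(t\), so we substitute this into \(\vec{s}(t)\), which gives
\[ \begin{aligned}
\vec{s}\left(\frac{2v_0 \sin\theta}{g}\right) &= v_0\left(\frac{2v_0 \sin\theta}{g}\right)\cos\theta\,\hat{i} + \left(v_0\left(\frac{2v_0 \sin\theta}{g}\right)\sin\theta - \frac{1}{2}g\left(\frac{2v_0 \sin\theta}{g}\right)^2\right)\hat{j} \\
&= \left(\frac{2v_0^2 \sin\theta \cos\theta}{g}\right)\hat{i} \\
&= \frac{v_0^2 \sin 2\theta}{g}\,\hat{i}.
\end{aligned} \]

Thus, the expression for the range of a projectile fired at an angle \(\theta\) is
\[ R = \frac{v_0^2 \sin 2\theta}{g}\,\hat{i}. \]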

The only variable in this expression is \(\theta\). To maximize the distance traveled, take the derivative of the coefficient of \(\hat{i}\) with respect to \(\theta\) and set it equal to zero:
\[ \begin{aligned}
\frac{d}{d\theta}\left(\frac{v_0^2 \sin 2\theta}{g}\right) &= 0 \\
\frac{2v_0^2 \cos 2\theta}{g} &= 0 \\
\theta &= 45^\circ
\end{aligned} \]

This value of \(\theta\) is the smallest positive value that makes the derivative equal to zero. Therefore, in the absence of air resistance, the best angle to fire a projectile (to maximize the range) is at a 45° angle. The distance it travels is given by
\[ \vec{s}\left(\frac{2v_0 \sin 45^\circ}{g}\right) = \frac{v_0^2 \sin 90^\circ}{g}\,\hat{i} = \frac{v_0^2}{g}\,\hat{i}. \]

Therefore, the range for an angle of 45° is \(\dfrac{v_0^2}{g}\) units.
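
The conclusion that 45° maximizes the range can be checked by brute force. The short sketch below assumes an initial speed of 98 m/sec and \(g = 9.8\) m/sec² purely as sample values, and scans whole-degree launch angles using the range formula \(v_0^2 \sin 2\theta / g\) derived above.

```python
import math

def projectile_range(v0, theta_deg, g=9.8):
    """Range over level ground: v0^2 sin(2 theta) / g."""
    return v0**2 * math.sin(2 * math.radians(theta_deg)) / g

v0 = 98.0  # sample initial speed in m/sec (an assumed value)
best_angle = max(range(1, 90), key=lambda angle: projectile_range(v0, angle))

print(best_angle)                      # angle with the largest range
print(projectile_range(v0, 45))        # compare with v0^2 / g
print(projectile_range(v0, 30), projectile_range(v0, 60))
```

The last line illustrates a consequence of the \(\sin 2\theta\) factor: complementary launch angles such as 30° and 60° give the same range.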

Kepler’s Laws
During the early 1600s, Johannes Kepler was able to use the amazingly accurate data from his mentor Tycho Brahe to formulate his
three laws of planetary motion, now known as Kepler’s laws of planetary motion. These laws also apply to other objects in the
solar system in orbit around the Sun, such as comets (e.g., Halley’s comet) and asteroids. Variations of these laws apply to satellites
in orbit around Earth.

Theorem 13.4.2: Kepler's Laws of Planetary Motion


1. The path of any planet about the Sun is elliptical in shape, with the center of the Sun located at one focus of the ellipse (the
law of ellipses).
2. A line drawn from the center of the Sun to the center of a planet sweeps out equal areas in equal time intervals (the law of
equal areas) (Figure 13.4.8).
3. The ratio of the squares of the periods of any two planets is equal to the ratio of the cubes of the lengths of their semimajor
orbital axes (the Law of Harmonies).

Figure 13.4.8 : Kepler’s first and second laws are pictured here. The Sun is located at a focus of the elliptical orbit of any
planet. Furthermore, the shaded areas are all equal, assuming that the amount of time measured as the planet moves is the same
for each region.

Kepler’s third law is especially useful when using appropriate units. In particular, 1 astronomical unit is defined to be the average
distance from Earth to the Sun, and is now recognized to be 149,597,870,700 m or approximately 93,000,000 mi. We therefore write 1 A.U. = 93,000,000 mi. Since the time it takes for Earth to orbit the Sun is 1 year, we use Earth years for units of time. Then, substituting 1 year for the period of Earth and 1 A.U. for the average distance to the Sun, Kepler’s third law can be written as
\[ T_P^2 = D_P^3 \]
for any planet in the solar system, where \(T_P\) is the period of that planet measured in Earth years and \(D_P\) is the average distance

from that planet to the Sun measured in astronomical units. Therefore, if we know the average distance from a planet to the Sun (in
astronomical units), we can then calculate the length of its year (in Earth years), and vice versa.
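
In these units the conversion between distance and period is a one-line computation each way. The sketch below uses hypothetical function names, and the body at 4 A.U. is an assumed sample value rather than data from the text.

```python
def period_from_distance(distance_au):
    """Orbital period in Earth years from average distance in A.U. (T^2 = D^3)."""
    return distance_au ** 1.5

def distance_from_period(period_years):
    """Average distance in A.U. from orbital period in Earth years (T^2 = D^3)."""
    return period_years ** (2 / 3)

print(period_from_distance(1.0))   # Earth: 1 A.U. corresponds to 1 year
print(period_from_distance(4.0))   # a body at 4 A.U. would take about 8 years
print(distance_from_period(8.0))   # and an 8-year period corresponds to about 4 A.U.
```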
Kepler’s laws were formulated based on observations from Brahe; however, they were not proved formally until Sir Isaac Newton
was able to apply calculus. Furthermore, Newton was able to generalize Kepler’s third law to other orbital systems, such as a moon
orbiting around a planet. Kepler’s original third law only applies to objects orbiting the Sun.

Proof

Let’s now prove Kepler’s first law using the calculus of vector-valued functions. First we need a coordinate system. Let’s place
the Sun at the origin of the coordinate system and let the vector-valued function \(\vec{r}(t)\) represent the location of a planet as a function of time. Newton proved Kepler’s law using his second law of motion and his law of universal gravitation. Newton’s second law of motion can be written as \(\vec{F} = m\vec{a}\), where \(\vec{F}\) represents the net force acting on the planet. His law of universal gravitation can be written in the form \(\vec{F} = -\dfrac{GmM}{\|\vec{r}\|^2} \cdot \dfrac{\vec{r}}{\|\vec{r}\|}\), which indicates that the force resulting from the gravitational attraction of the Sun points back toward the Sun, and has magnitude \(\dfrac{GmM}{\|\vec{r}\|^2}\) (Figure 13.4.9).

Figure 13.4.9 : The gravitational force between Earth and the Sun is equal to the mass of the earth times its acceleration.
Setting these two forces equal to each other, and using the fact that \(\vec{a}(t) = \vec{v}'(t)\), we obtain

\[ m\vec{v}'(t) = -\frac{GmM}{\|\vec{r}\|^2} \cdot \frac{\vec{r}}{\|\vec{r}\|}, \]
which can be rewritten as
\[ \frac{d\vec{v}}{dt} = -\frac{GM}{\|\vec{r}\|^3}\,\vec{r}. \]

This equation shows that the vectors \(d\vec{v}/dt\) and \(\vec{r}\) are parallel to each other, so \(d\vec{v}/dt \times \vec{r} = \vec{0}\). Next, let’s differentiate \(\vec{r} \times \vec{v}\) with respect to time:
\[ \frac{d}{dt}(\vec{r} \times \vec{v}) = \frac{d\vec{r}}{dt} \times \vec{v} + \vec{r} \times \frac{d\vec{v}}{dt} = \vec{v} \times \vec{v} + \vec{0} = \vec{0}. \tag{13.4.7} \]

This proves that \(\vec{r} \times \vec{v}\) is a constant vector, which we call \(\vec{C}\). Since \(\vec{r}\) and \(\vec{v}\) are both perpendicular to \(\vec{C}\) for all values of \(t\), they must lie in a plane perpendicular to \(\vec{C}\). Therefore, the motion of the planet lies in a plane.

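Next we calculate the expression \(d\vec{v}/dt \times \vec{C}\):
\[ \frac{d\vec{v}}{dt} \times \vec{C} = -\frac{GM}{\|\vec{r}\|^3}\,\vec{r} \times (\vec{r} \times \vec{v}) = -\frac{GM}{\|\vec{r}\|^3}\left[(\vec{r} \cdot \vec{v})\,\vec{r} - (\vec{r} \cdot \vec{r})\,\vec{v}\right]. \tag{13.4.8} \]

The last equality in Equation 13.4.8 is from the triple cross product formula (see the cross product section in Introduction to Vectors in Space). We need an expression for \(\vec{r} \cdot \vec{v}\). To calculate this, we differentiate \(\vec{r} \cdot \vec{r}\) with respect to time:
\[ \frac{d}{dt}(\vec{r} \cdot \vec{r}) = \frac{d\vec{r}}{dt} \cdot \vec{r} + \vec{r} \cdot \frac{d\vec{r}}{dt} = 2\vec{r} \cdot \frac{d\vec{r}}{dt} = 2\vec{r} \cdot \vec{v}. \tag{13.4.9} \]

Since \(\vec{r} \cdot \vec{r} = \|\vec{r}\|^2\), we also have
\[ \frac{d}{dt}(\vec{r} \cdot \vec{r}) = \frac{d}{dt}\|\vec{r}\|^2 = 2\|\vec{r}\|\frac{d}{dt}\|\vec{r}\|. \tag{13.4.10} \]

Combining Equation 13.4.9 and Equation 13.4.10, we get
\[ \begin{aligned}
2\vec{r} \cdot \vec{v} &= 2\|\vec{r}\|\frac{d}{dt}\|\vec{r}\| \\
\vec{r} \cdot \vec{v} &= \|\vec{r}\|\frac{d}{dt}\|\vec{r}\|.
\end{aligned} \]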

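Substituting this into Equation 13.4.8 gives us
\[ \begin{aligned}
\frac{d\vec{v}}{dt} \times \vec{C} &= -\frac{GM}{\|\vec{r}\|^3}\left[(\vec{r} \cdot \vec{v})\,\vec{r} - (\vec{r} \cdot \vec{r})\,\vec{v}\right] \\
&= -\frac{GM}{\|\vec{r}\|^3}\left[\|\vec{r}\|\left(\frac{d}{dt}\|\vec{r}\|\right)\vec{r} - \|\vec{r}\|^2\,\vec{v}\right] \\
&= -GM\left[\frac{1}{\|\vec{r}\|^2}\left(\frac{d}{dt}\|\vec{r}\|\right)\vec{r} - \frac{1}{\|\vec{r}\|}\,\vec{v}\right] \\
&= GM\left[\frac{\vec{v}}{\|\vec{r}\|} - \frac{\vec{r}}{\|\vec{r}\|^2}\left(\frac{d}{dt}\|\vec{r}\|\right)\right].
\end{aligned} \tag{13.4.11} \]

However,
\[ \begin{aligned}
\frac{d}{dt}\left(\frac{\vec{r}}{\|\vec{r}\|}\right) &= \frac{\frac{d}{dt}(\vec{r})\,\|\vec{r}\| - \vec{r}\,\frac{d}{dt}\|\vec{r}\|}{\|\vec{r}\|^2} \\
&= \frac{\frac{d\vec{r}}{dt}}{\|\vec{r}\|} - \frac{\vec{r}}{\|\vec{r}\|^2}\frac{d}{dt}\|\vec{r}\| \\
&= \frac{\vec{v}}{\|\vec{r}\|} - \frac{\vec{r}}{\|\vec{r}\|^2}\frac{d}{dt}\|\vec{r}\|.
\end{aligned} \]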
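Therefore, Equation 13.4.11 becomes
\[ \frac{d\vec{v}}{dt} \times \vec{C} = GM\,\frac{d}{dt}\left(\frac{\vec{r}}{\|\vec{r}\|}\right). \]

Since \(\vec{C}\) is a constant vector, we can integrate both sides and obtain
\[ \vec{v} \times \vec{C} = GM\,\frac{\vec{r}}{\|\vec{r}\|} + \vec{D}, \]
where \(\vec{D}\) is a constant vector. Our goal is to solve for \(\|\vec{r}\|\). Let’s start by calculating \(\vec{r} \cdot (\vec{v} \times \vec{C})\):
\[ \vec{r} \cdot (\vec{v} \times \vec{C}) = GM\,\frac{\|\vec{r}\|^2}{\|\vec{r}\|} + \vec{r} \cdot \vec{D} = GM\|\vec{r}\| + \vec{r} \cdot \vec{D}. \]

However, \(\vec{r} \cdot (\vec{v} \times \vec{C}) = (\vec{r} \times \vec{v}) \cdot \vec{C}\), so
\[ (\vec{r} \times \vec{v}) \cdot \vec{C} = GM\|\vec{r}\| + \vec{r} \cdot \vec{D}. \]

Since \(\vec{r} \times \vec{v} = \vec{C}\), we have
\[ \|\vec{C}\|^2 = GM\|\vec{r}\| + \vec{r} \cdot \vec{D}. \]

Note that \(\vec{r} \cdot \vec{D} = \|\vec{r}\|\,\|\vec{D}\| \cos\theta\), where \(\theta\) is the angle between \(\vec{r}\) and \(\vec{D}\). Therefore,
\[ \|\vec{C}\|^2 = GM\|\vec{r}\| + \|\vec{r}\|\,\|\vec{D}\| \cos\theta. \]

Solving for \(\|\vec{r}\|\),
\[ \|\vec{r}\| = \frac{\|\vec{C}\|^2}{GM + \|\vec{D}\| \cos\theta} = \frac{\|\vec{C}\|^2}{GM}\left(\frac{1}{1 + e\cos\theta}\right), \]
where \(e = \|\vec{D}\|/GM\). This is the polar equation of a conic with a focus at the origin, which we set up to be the Sun. It is a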
hyperbola if e > 1 , a parabola if e = 1 , or an ellipse if e < 1 . Since planets have closed orbits, the only possibility is an
ellipse. However, at this point it should be mentioned that hyperbolic comets do exist. These are objects that are merely passing
through the solar system at speeds too great to be trapped into orbit around the Sun. As they pass close enough to the Sun, the
gravitational field of the Sun deflects the trajectory enough so the path becomes hyperbolic.
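
The two constant vectors in the proof can be observed numerically. The sketch below is an illustration under stated assumptions, not part of the proof: it works in arbitrary units with \(GM = 1\), restricts the motion to the orbital plane, starts from a hypothetical initial position and velocity, and advances \(d\vec{v}/dt = -GM\,\vec{r}/\|\vec{r}\|^3\) with a velocity-Verlet scheme (a standard numerical integrator that is not discussed in the text). It then compares \(\vec{C}\) and \(\vec{D}\) before and after, and checks the final distance against the polar equation.

```python
import math

GM = 1.0  # gravitational parameter in arbitrary units (an assumed value)

def accel(x, y):
    """Acceleration -GM r / ||r||^3 at the point (x, y)."""
    r3 = (x * x + y * y) ** 1.5
    return -GM * x / r3, -GM * y / r3

def simulate(x, y, vx, vy, dt=1e-3, steps=5000):
    """Advance the planar orbit with velocity-Verlet time steps."""
    ax, ay = accel(x, y)
    for _ in range(steps):
        x += vx * dt + 0.5 * ax * dt * dt
        y += vy * dt + 0.5 * ay * dt * dt
        ax_new, ay_new = accel(x, y)
        vx += 0.5 * (ax + ax_new) * dt
        vy += 0.5 * (ay + ay_new) * dt
        ax, ay = ax_new, ay_new
    return x, y, vx, vy

def invariants(x, y, vx, vy):
    """Return the k-component of C = r x v and D = v x C - GM r/||r||."""
    c = x * vy - y * vx
    r = math.hypot(x, y)
    return c, (vy * c - GM * x / r, -vx * c - GM * y / r)

start = (1.0, 0.0, 0.0, 1.2)      # hypothetical initial position and velocity
c0, d0 = invariants(*start)
end = simulate(*start)
c1, d1 = invariants(*end)

e = math.hypot(*d0) / GM          # eccentricity e = ||D|| / GM
r_end = math.hypot(end[0], end[1])
cos_theta = (end[0] * d0[0] + end[1] * d0[1]) / (r_end * math.hypot(*d0))
r_polar = (c0**2 / GM) / (1 + e * cos_theta)

print("C before and after:", c0, c1)
print("D before and after:", d0, d1)
print("eccentricity:", e)
print("distance from simulation and from polar equation:", r_end, r_polar)
```

With these starting values the eccentricity is below 1, so the simulated path should be an ellipse; increasing the initial speed enough would push \(e\) past 1 and produce a hyperbolic path instead.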

Kepler’s third law of planetary motion can be modified to the case of one object in orbit around an object other than the Sun, such
as the Moon around the Earth. In this case, Kepler’s third law becomes
\[ P^2 = \frac{4\pi^2 a^3}{G(m + M)}, \tag{13.4.12} \]

where \(m\) is the mass of the Moon and \(M\) is the mass of Earth, \(a\) represents the length of the semimajor axis of the elliptical orbit, and \(P\) represents the period.

Example 13.4.4: Using Kepler’s Third Law for Nonheliocentric Orbits

Given that the mass of the Moon is \(7.35 \times 10^{22}\) kg, the mass of Earth is \(5.97 \times 10^{24}\) kg, \(G = 6.67 \times 10^{-11}\ \text{m}^3/(\text{kg} \cdot \text{sec}^2)\), and the period of the Moon is 27.3 days, let’s find the length of the semimajor axis of the orbit of the Moon around Earth.
Solution
It is important to be consistent with units. Since the universal gravitational constant contains seconds in the units, we need to
use seconds for the period of the Moon as well:
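\[ 27.3\ \text{days} \times \frac{24\ \text{hr}}{1\ \text{day}} \times \frac{3600\ \text{sec}}{1\ \text{hr}} = 2{,}358{,}720\ \text{sec} \]

Substitute all the data into Equation 13.4.12 and solve for \(a\):
\[ \begin{aligned}
(2{,}358{,}720\ \text{sec})^2 &= \frac{4\pi^2 a^3}{\left(6.67 \times 10^{-11}\ \frac{\text{m}^3}{\text{kg} \cdot \text{sec}^2}\right)\left(7.35 \times 10^{22}\ \text{kg} + 5.97 \times 10^{24}\ \text{kg}\right)} \\
5.563 \times 10^{12} &= \frac{4\pi^2 a^3}{(6.67 \times 10^{-11}\ \text{m}^3)(6.04 \times 10^{24})} \\
(5.563 \times 10^{12})(6.67 \times 10^{-11}\ \text{m}^3)(6.04 \times 10^{24}) &= 4\pi^2 a^3 \\
a^3 &= \frac{2.241 \times 10^{27}}{4\pi^2}\ \text{m}^3 \\
a &= 3.84 \times 10^{8}\ \text{m} \\
&\approx 384{,}000\ \text{km}.
\end{aligned} \]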

Analysis
According to solarsystem.nasa.gov, the actual average distance from the Moon to Earth is 384,400 km. This is calculated using
reflectors left on the Moon by Apollo astronauts back in the 1960s.
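
Solving Equation 13.4.12 for \(a\) gives \(a = \left(G(m + M)P^2 / (4\pi^2)\right)^{1/3}\), which is easy to evaluate in code. The sketch below uses the rounded constants from this example; the function name is hypothetical, and the units are assumed to be kilograms, meters, and seconds throughout.

```python
import math

G = 6.67e-11  # universal gravitational constant in m^3 / (kg sec^2)

def semimajor_axis(period_sec, m, M):
    """Solve P^2 = 4 pi^2 a^3 / (G (m + M)) for a, in meters."""
    return (G * (m + M) * period_sec**2 / (4 * math.pi**2)) ** (1 / 3)

moon_period = 27.3 * 24 * 3600   # 27.3 days converted to seconds
a_moon = semimajor_axis(moon_period, 7.35e22, 5.97e24)

print(f"{a_moon:.3e} m")          # roughly 3.84e8 m
print(f"{a_moon / 1000:,.0f} km")
```

The same function handles any two-body system once the masses and period are expressed in consistent units, for instance a moon of another planet.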

Exercise 13.4.4

Titan is the largest moon of Saturn. The mass of Titan is approximately \(1.35 \times 10^{23}\) kg. The mass of Saturn is approximately \(5.68 \times 10^{26}\) kg. Titan takes approximately 16 days to orbit Saturn. Use this information, along with the universal gravitation constant \(G = 6.67 \times 10^{-11}\ \text{m}^3/(\text{kg} \cdot \text{sec}^2)\), to estimate the distance from Titan to Saturn.

Hint
Make sure your units agree, then use Equation 13.4.12.

Answer
\(a \approx 1.224 \times 10^{9}\ \text{m} = 1{,}224{,}000\ \text{km}\)

Example 13.4.5: Halley’s Comet


We now return to the chapter opener, which discusses the motion of Halley’s comet around the Sun. Kepler’s first law states
that Halley’s comet follows an elliptical path around the Sun, with the Sun as one focus of the ellipse. The period of Halley’s
comet is approximately 76.1 years, depending on how closely it passes by Jupiter and Saturn as it passes through the outer
solar system. Let’s use T = 76.1 years. What is the average distance of Halley’s comet from the Sun?

Solution
Using the equation \(T^2 = D^3\) with \(T = 76.1\), we obtain \(D^3 = 5791.21\), so \(D \approx 17.96\) A.U. This comes out to approximately \(1.67 \times 10^{9}\) mi.

A natural question to ask is: What are the maximum (aphelion) and minimum (perihelion) distances from Halley’s Comet to
the Sun? The eccentricity of the orbit of Halley’s Comet is 0.967 (Source:
http://nssdc.gsfc.nasa.gov/planetary...cometfact.html). Recall that the formula for the eccentricity of an ellipse is e = c/a ,
where a is the length of the semimajor axis and c is the distance from the center to either focus. Therefore, 0.967 = c/17.96
and c ≈ 17.37 A.U. Subtracting this from a gives the perihelion distance p = a − c = 17.96 − 17.37 = 0.59 A.U. According
to the National Space Science Data Center (Source: http://nssdc.gsfc.nasa.gov/planetary...cometfact.html), the perihelion
distance for Halley’s comet is 0.587 A.U. To calculate the aphelion distance, we add

P = a + c = 17.96 + 17.37 = 35.33 A.U.

This is approximately \(3.3 \times 10^{9}\) mi. The average distance from Pluto to the Sun is 39.5 A.U. (Source: http://www.oarval.org/furthest.htm), so it would appear that Halley’s Comet stays just within the orbit of Pluto.
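
The steps of this example can be reproduced in a few lines. This is only a sketch: the function name `halley_distances` is hypothetical, and because it does not round \(c\) to two decimal places before subtracting, its results may differ from the text in the last digit.

```python
ECCENTRICITY = 0.967   # value quoted in the example
PERIOD_YEARS = 76.1    # value quoted in the example


def halley_distances(period_years, eccentricity):
    """Return (a, perihelion, aphelion) in A.U. using T^2 = D^3 and e = c/a."""
    a = period_years ** (2 / 3)   # T in years, D in A.U.
    c = eccentricity * a          # distance from the center to a focus
    return a, a - c, a + c


a, perihelion, aphelion = halley_distances(PERIOD_YEARS, ECCENTRICITY)
print(f"a = {a:.2f} A.U., perihelion = {perihelion:.2f} A.U., aphelion = {aphelion:.2f} A.U.")
```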

 NAVIGATING A BANKED TURN

How fast can a racecar travel through a circular turn without skidding and hitting the wall? The answer could depend on
several factors:
The weight of the car;
The friction between the tires and the road;
The radius of the circle;
The “steepness” of the turn.
In this project we investigate this question for NASCAR racecars at the Bristol Motor Speedway in Tennessee. Before
considering this track in particular, we use vector functions to develop the mathematics and physics necessary for answering
questions such as this.
A car of mass m moves with constant angular speed ω around a circular curve of radius R (Figure 13.4.9). The curve is
banked at an angle θ . If the height of the car off the ground is h , then the position of the car at time t is given by the function
\(\vec{r}(t) = \langle R\cos(\omega t), R\sin(\omega t), h \rangle\).

Figure 13.4.9 : Views of a race car moving around a track.

1. Find the velocity function \(\vec{v}(t)\) of the car. Show that \(\vec{v}\) is tangent to the circular curve. This means that, without a force to keep the car on the curve, the car will shoot off of it.
2. Show that the speed of the car is \(\omega R\). Use this to show that \((2\pi R)/\|\vec{v}\| = (2\pi)/\omega\).
3. Find the acceleration \(\vec{a}\). Show that this vector points toward the center of the circle and that \(\|\vec{a}\| = R\omega^2\).
4. The force required to produce this circular motion is called the centripetal force, and it is denoted \(\vec{F}_{cent}\). This force points toward the center of the circle (not toward the ground). Show that \(\|\vec{F}_{cent}\| = (m\|\vec{v}\|^2)/R\).
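
Before proving the claims in questions 1 through 3, it can help to check them numerically at a sample time. The sketch below is not a proof: the velocity and acceleration are obtained by differentiating the position function componentwise by hand, and the values of `R`, `omega`, `h`, and `t` are arbitrary illustrative choices.

```python
import math


def position(t, R, omega, h):
    return (R * math.cos(omega * t), R * math.sin(omega * t), h)


def velocity(t, R, omega):
    # Componentwise derivative of position with respect to t
    return (-R * omega * math.sin(omega * t), R * omega * math.cos(omega * t), 0.0)


def acceleration(t, R, omega):
    # Componentwise derivative of velocity with respect to t
    return (-R * omega ** 2 * math.cos(omega * t), -R * omega ** 2 * math.sin(omega * t), 0.0)


def dot(u, w):
    return sum(p * q for p, q in zip(u, w))


def norm(u):
    return math.sqrt(dot(u, u))


R, omega, h, t = 211.0, 0.5, 3.0, 1.3   # illustrative values only
x, y, _ = position(t, R, omega, h)
radial = (x, y, 0.0)                    # points from the center of the circle to the car
v = velocity(t, R, omega)
a = acceleration(t, R, omega)

print("v . radial =", dot(v, radial))   # close to 0, so v is tangent to the circle
print("speed vs omega*R:", norm(v), omega * R)
print("|a| vs R*omega^2:", norm(a), R * omega ** 2)
```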

As the car moves around the curve, three forces act on it: gravity, the force exerted by the road (this force is perpendicular to the ground), and the friction force (Figure 13.4.10). Because describing the frictional force generated by the tires and the road is complex, we use a standard approximation for the frictional force. Assume that \(\vec{f} = \mu\vec{N}\) for some positive constant \(\mu\). The constant \(\mu\) is called the coefficient of friction.


Figure 13.4.10: The car has three forces acting on it: gravity (denoted by \(m\vec{g}\)), the friction force \(\vec{f}\), and the force exerted by the road \(\vec{N}\).
Let \(v_{max}\) denote the maximum speed the car can attain through the curve without skidding. In other words, \(v_{max}\) is the fastest speed at which the car can navigate the turn. When the car is traveling at this speed, the magnitude of the centripetal force is
\[\|\vec{F}_{cent}\| = \frac{m(v_{max})^2}{R}.\]

The next three questions deal with developing a formula that relates the speed \(v_{max}\) to the banking angle \(\theta\).
5. Show that \(\vec{N}\cos\theta = m\vec{g} + \vec{f}\sin\theta\). Conclude that \(\vec{N} = (m\vec{g})/(\cos\theta - \mu\sin\theta)\).
⇀ ⇀

6. The centripetal force is the sum of the forces in the horizontal direction, since the centripetal force points toward the center
of the circular curve. Show that
\[\vec{F}_{cent} = \vec{N}\sin\theta + \vec{f}\cos\theta.\]

Conclude that
\[\vec{F}_{cent} = \frac{\sin\theta + \mu\cos\theta}{\cos\theta - \mu\sin\theta}\, m\vec{g}.\]

7. Show that \((v_{max})^2 = ((\sin\theta + \mu\cos\theta)/(\cos\theta - \mu\sin\theta))gR\). Conclude that the maximum speed does not actually depend on the mass of the car.
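
The formula in question 7 is easy to evaluate numerically. The following is a minimal sketch, assuming \(g = 32\ \text{ft/sec}^2\) (a rounded value that the project does not specify) and a hypothetical function name `v_max`. The formula only makes sense when \(\cos\theta - \mu\sin\theta > 0\), so the function rejects other inputs.

```python
import math

G_FT = 32.0                      # ft/sec^2, assumed rounded value of g
FT_PER_S_TO_MPH = 3600 / 5280    # 1 ft/sec expressed in miles per hour


def v_max(theta_deg, mu, radius_ft, g=G_FT):
    """Maximum no-skid speed in ft/sec from (v_max)^2 = ratio * g * R."""
    theta = math.radians(theta_deg)
    denominator = math.cos(theta) - mu * math.sin(theta)
    if denominator <= 0:
        raise ValueError("formula requires cos(theta) - mu*sin(theta) > 0")
    ratio = (math.sin(theta) + mu * math.cos(theta)) / denominator
    return math.sqrt(ratio * g * radius_ft)


# Example call with a 24 degree bank, mu = 0.98, and a 211 ft radius
speed = v_max(24.0, 0.98, 211.0)
print(f"{speed:.1f} ft/sec, {speed * FT_PER_S_TO_MPH:.1f} mph")
```

The mass of the car is not a parameter of `v_max`, which mirrors the conclusion of question 7. Working in feet and seconds and converting to miles per hour only at the end keeps the units consistent.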


Now that we have a formula relating the maximum speed of the car and the banking angle, we are in a position to answer
questions like the one posed at the beginning of the project.
The Bristol Motor Speedway is a NASCAR short track in Bristol, Tennessee. The track has the approximate shape shown
in Figure 13.4.11. Each end of the track is approximately semicircular, so when cars make turns they are traveling along an
approximately circular curve. If a car takes the inside track and speeds along the bottom of turn 1, the car travels along a
semicircle of radius approximately 211 ft with a banking angle of 24°. If the car decides to take the outside track and speeds
along the top of turn 1, then the car travels along a semicircle with a banking angle of 28°. (The track has variable angle
banking.)

Figure 13.4.11: At the Bristol Motor Speedway, Bristol, Tennessee (a), the turns have an inner radius of about 211 ft and a
width of 40 ft (b). (credit: part (a) photo by Raniel Diaz, Flickr)
The coefficient of friction for a normal tire in dry conditions is approximately 0.7. Therefore, we assume the coefficient for a
NASCAR tire in dry conditions is approximately 0.98.
Before answering the following questions, note that it is easier to do computations in terms of feet and seconds, and then
convert the answers to miles per hour as a final step.
8. In dry conditions, how fast can the car travel through the bottom of the turn without skidding?
9. In dry conditions, how fast can the car travel through the top of the turn without skidding?
10. In wet conditions, the coefficient of friction can become as low as 0.1. If this is the case, how fast can the car travel through
the bottom of the turn without skidding?
11. Suppose the measured speed of a car going along the outside edge of the turn is 105 mph. Estimate the coefficient of
friction for the car’s tires.

Key Concepts
If \(\vec{r}(t)\) represents the position of an object at time \(t\), then \(\vec{r}\,'(t)\) represents the velocity and \(\vec{r}\,''(t)\) represents the acceleration of the object at time \(t\). The magnitude of the velocity vector is speed.
The acceleration vector always points toward the concave side of the curve defined by \(\vec{r}(t)\). The tangential and normal components of acceleration \(a_{\vec{T}}\) and \(a_{\vec{N}}\) are the projections of the acceleration vector onto the unit tangent and unit normal vectors to the curve.


Kepler’s three laws of planetary motion describe the motion of objects in orbit around the Sun. His third law can be modified to
describe motion of objects in orbit around other celestial objects as well.
Newton was able to use his law of universal gravitation in conjunction with his second law of motion and calculus to prove
Kepler’s three laws.

Key Equations
Velocity
\[\vec{v}(t) = \vec{r}\,'(t)\]

Acceleration
\[\vec{a}(t) = \vec{v}\,'(t) = \vec{r}\,''(t)\]

Speed
\[v(t) = \|\vec{v}(t)\| = \|\vec{r}\,'(t)\| = \frac{ds}{dt}\]

Tangential component of acceleration
\[a_{\vec{T}} = \vec{a} \cdot \vec{T} = \frac{\vec{v} \cdot \vec{a}}{\|\vec{v}\|}\]

Normal component of acceleration
\[a_{\vec{N}} = \vec{a} \cdot \vec{N} = \frac{\|\vec{v} \times \vec{a}\|}{\|\vec{v}\|} = \sqrt{\|\vec{a}\|^2 - a_{\vec{T}}^2}\]
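
As a quick illustration of the last two formulas, the sketch below computes both components for one velocity and acceleration pair and checks that the two expressions for the normal component agree. The sample vectors and the helper names are hypothetical choices made for this illustration.

```python
import math


def dot(u, w):
    return sum(p * q for p, q in zip(u, w))


def cross(u, w):
    return (u[1] * w[2] - u[2] * w[1],
            u[2] * w[0] - u[0] * w[2],
            u[0] * w[1] - u[1] * w[0])


def norm(u):
    return math.sqrt(dot(u, u))


def tangential_component(v, a):
    return dot(v, a) / norm(v)


def normal_component(v, a):
    return norm(cross(v, a)) / norm(v)


v = (1.0, 2.0, 2.0)   # sample velocity vector
a = (0.0, 2.0, 1.0)   # sample acceleration vector
a_t = tangential_component(v, a)
a_n = normal_component(v, a)

# The second formula for the normal component should give the same number
a_n_alt = math.sqrt(norm(a) ** 2 - a_t ** 2)
print(a_t, a_n, a_n_alt)
```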

Glossary
acceleration vector
the second derivative of the position vector

Kepler’s laws of planetary motion


three laws governing the motion of planets, asteroids, and comets in orbit around the Sun

normal component of acceleration
the coefficient of the unit normal vector \(\vec{N}\) when the acceleration vector is written as a linear combination of \(\vec{T}\) and \(\vec{N}\)

projectile motion
motion of an object with an initial velocity but no force acting on it other than gravity

tangential component of acceleration
the coefficient of the unit tangent vector \(\vec{T}\) when the acceleration vector is written as a linear combination of \(\vec{T}\) and \(\vec{N}\)

velocity vector
the derivative of the position vector

Contributors and Attributions


Gilbert Strang (MIT) and Edwin “Jed” Herman (Harvey Mudd) with many contributing authors. This content by OpenStax is
licensed with a CC-BY-SA-NC 4.0 license. Download for free at http://cnx.org.
Edited by Paul Seeburger
Paul Seeburger added finding point (1, 2) when t = 1 in Example 13.4.1.
He also created Figure 13.4.1.

13.4: Motion in Space- Velocity and Acceleration is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
13.4: Motion in Space by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0. Original source:
https://openstax.org/details/books/calculus-volume-1.

CHAPTER OVERVIEW

14: Partial Derivatives


A general Calculus Textmap organized around the textbook

Calculus: Early Transcendentals


by James Stewart


This Textmap is currently under construction... please be patient with us.

Topic hierarchy
14.1: Functions of Several Variables
14.2: Limits and Continuity
14.3: Partial Derivatives
14.4: Tangent Planes and Linear Approximations
14.5: The Chain Rule
14.6: Directional Derivatives and the Gradient Vector
14.7: Maximum and Minimum Values
14.8: Lagrange Multipliers

14: Partial Derivatives is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.

14.1: Functions of Several Variables
 Learning Objectives
Recognize a function of two variables and identify its domain and range.
Sketch a graph of a function of two variables.
Sketch several traces or level curves of a function of two variables.
Recognize a function of three or more variables and identify its level surfaces.

Our first step is to explain what a function of more than one variable is, starting with functions of two independent variables. This
step includes identifying the domain and range of such functions and learning how to graph them. We also examine ways to relate
the graphs of functions in three dimensions to graphs of more familiar planar functions.

Functions of Two Variables


The definition of a function of two variables is very similar to the definition for a function of one variable. The main difference is
that, instead of mapping values of one variable to values of another variable, we map ordered pairs of variables to another variable.

 Definition: function of two variables


A function of two variables \(z = f(x, y)\) maps each ordered pair \((x, y)\) in a subset \(D\) of the real plane \(\mathbb{R}^2\) to a unique real number \(z\). The set \(D\) is called the domain of the function. The range of \(f\) is the set of all real numbers \(z\) for which there is at least one ordered pair \((x, y) \in D\) such that \(f(x, y) = z\), as shown in Figure 14.1.1.

Figure 14.1.1 : The domain of a function of two variables consists of ordered pairs (x, y).

Determining the domain of a function of two variables involves taking into account any domain restrictions that may exist. Let’s
take a look.

 Example 14.1.1: Domains and Ranges for Functions of Two Variables


Find the domain and range of each of the following functions:
a. \(f(x, y) = 3x + 5y + 2\)
b. \(g(x, y) = \sqrt{9 - x^2 - y^2}\)

Solution
a. This is an example of a linear function in two variables. There are no values or combinations of \(x\) and \(y\) that cause \(f(x, y)\) to be undefined, so the domain of \(f\) is \(\mathbb{R}^2\). To determine the range, first pick a value for \(z\). We need to find a solution to the equation \(f(x, y) = z\), or \(3x + 5y + 2 = z\). One such solution can be obtained by first setting \(y = 0\), which yields the equation \(3x + 2 = z\). The solution to this equation is \(x = \frac{z - 2}{3}\), which gives the ordered pair \(\left(\frac{z - 2}{3}, 0\right)\) as a solution to the equation \(f(x, y) = z\) for any value of \(z\). Therefore, the range of the function is all real numbers, or \(\mathbb{R}\).
b. For the function \(g(x, y)\) to have a real value, the quantity under the square root must be nonnegative:
\[9 - x^2 - y^2 \ge 0.\]

This inequality can be written in the form

14.1.1 https://math.libretexts.org/@go/page/4536
\[x^2 + y^2 \le 9.\]

Therefore, the domain of \(g(x, y)\) is \(\{(x, y) \in \mathbb{R}^2 \mid x^2 + y^2 \le 9\}\). The graph of this set of points can be described as a disk of radius 3 centered at the origin. The domain includes the boundary circle as shown in the following graph.

Figure 14.1.2: The domain of the function \(g(x, y) = \sqrt{9 - x^2 - y^2}\) is a closed disk of radius 3.
To determine the range of \(g(x, y) = \sqrt{9 - x^2 - y^2}\), we start with a point \((x_0, y_0)\) on the boundary of the domain, which is defined by the relation \(x^2 + y^2 = 9\). It follows that \(x_0^2 + y_0^2 = 9\) and
\[g(x_0, y_0) = \sqrt{9 - x_0^2 - y_0^2} = \sqrt{9 - (x_0^2 + y_0^2)} = \sqrt{9 - 9} = 0.\]

If \(x_0^2 + y_0^2 = 0\) (in other words, \(x_0 = y_0 = 0\)), then
\[g(x_0, y_0) = \sqrt{9 - x_0^2 - y_0^2} = \sqrt{9 - (x_0^2 + y_0^2)} = \sqrt{9 - 0} = 3.\]

This is the maximum value of the function. Given any value \(c\) between 0 and 3, we can find an entire set of points inside the domain of \(g\) such that \(g(x, y) = c\):
\[\sqrt{9 - x^2 - y^2} = c\]
\[9 - x^2 - y^2 = c^2\]
\[x^2 + y^2 = 9 - c^2.\]

Since \(9 - c^2 > 0\), this describes a circle of radius \(\sqrt{9 - c^2}\) centered at the origin. Any point on this circle satisfies the equation \(g(x, y) = c\). Therefore, the range of this function can be written in interval notation as \([0, 3]\).
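
A numerical sample is consistent with this range. The sketch below evaluates \(g\) on a grid of points in the closed disk and records the smallest and largest outputs. It is a sanity check under an arbitrary grid resolution (`STEPS` is a hypothetical choice), not a substitute for the argument above.

```python
import math


def g(x, y):
    """g(x, y) = sqrt(9 - x^2 - y^2), defined only on the closed disk of radius 3."""
    radicand = 9 - (x * x + y * y)
    if radicand < 0:
        raise ValueError("point lies outside the domain")
    return math.sqrt(radicand)


STEPS = 60   # grid resolution, chosen arbitrarily
values = []
for i in range(-STEPS, STEPS + 1):
    for j in range(-STEPS, STEPS + 1):
        x, y = 3 * i / STEPS, 3 * j / STEPS
        if x * x + y * y <= 9:        # keep only points in the domain
            values.append(g(x, y))

print("smallest sampled value:", min(values))
print("largest sampled value:", max(values))
```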

 Exercise 14.1.1
Find the domain and range of the function \(f(x, y) = \sqrt{36 - 9x^2 - 9y^2}\).

Hint

Determine the set of ordered pairs that do not make the radicand negative.
Solution
The domain is \(\{(x, y) \mid x^2 + y^2 \le 4\}\), the closed disk defined by the inequality \(x^2 + y^2 \le 4\), which has a circle of radius 2 as its boundary. The range is \([0, 6]\).

Graphing Functions of Two Variables


Suppose we wish to graph the function z = f (x, y). This function has two independent variables (x and y ) and one dependent
variable (z) . When graphing a function y = f (x) of one variable, we use the Cartesian plane. We are able to graph any ordered
pair (x, y) in the plane, and every point in the plane has an ordered pair (x, y) associated with it. With a function of two variables,
each ordered pair (x, y) in the domain of the function is mapped to a real number z . Therefore, the graph of the function f consists
of ordered triples (x, y, z). The graph of a function z = f (x, y) of two variables is called a surface.
To understand more completely the concept of plotting a set of ordered triples to obtain a surface in three-dimensional space,
imagine the (x, y) coordinate system lying flat. Then, every point in the domain of the function f has a unique z-value associated
with it. If z is positive, then the graphed point is located above the xy-plane; if z is negative, then the graphed point is located
below the xy-plane. The set of all the graphed points becomes the two-dimensional surface that is the graph of the function f.

 Example 14.1.2: Graphing Functions of Two Variables

Create a graph of each of the following functions:


a. \(g(x, y) = \sqrt{9 - x^2 - y^2}\)
b. \(f(x, y) = x^2 + y^2\)

Solution
a. In Example 14.1.1, we determined that the domain of \(g(x, y) = \sqrt{9 - x^2 - y^2}\) is \(\{(x, y) \in \mathbb{R}^2 \mid x^2 + y^2 \le 9\}\) and the range is \(\{z \in \mathbb{R} \mid 0 \le z \le 3\}\). When \(x^2 + y^2 = 9\) we have \(g(x, y) = 0\). Therefore any point on the circle of radius 3 centered at the origin in the xy-plane maps to \(z = 0\) in \(\mathbb{R}^3\). If \(x^2 + y^2 = 8\), then \(g(x, y) = 1\), so any point on the circle of radius \(2\sqrt{2}\) centered at the origin in the xy-plane maps to \(z = 1\) in \(\mathbb{R}^3\). As \(x^2 + y^2\) gets closer to zero, the value of \(z\) approaches 3. When \(x^2 + y^2 = 0\), then \(g(x, y) = 3\). This is the origin in the xy-plane. If \(x^2 + y^2\) is equal to any other value between 0 and 9, then \(g(x, y)\) equals some other constant between 0 and 3. The surface described by this function is a hemisphere centered at the origin with radius 3 as shown in the following graph.

Figure 14.1.3 : Graph of the hemisphere represented by the given function of two variables.
b. This function also contains the expression \(x^2 + y^2\). Setting this expression equal to various values starting at zero, we obtain circles of increasing radius. The minimum value of \(f(x, y) = x^2 + y^2\) is zero (attained when \(x = y = 0\)). When \(x = 0\), the function becomes \(z = y^2\), and when \(y = 0\), the function becomes \(z = x^2\). These are cross-sections of the graph, and are parabolas. Recall from Introduction to Vectors in Space that the name of the graph of \(f(x, y) = x^2 + y^2\) is a paraboloid. The graph of \(f\) appears in the following graph.

Figure 14.1.4 : A paraboloid is the graph of the given function of two variables.

 Example 14.1.3: Nuts and Bolts

A profit function for a hardware manufacturer is given by


\[f(x, y) = 16 - (x - 3)^2 - (y - 2)^2,\]

where x is the number of nuts sold per month (measured in thousands) and y represents the number of bolts sold per month
(measured in thousands). Profit is measured in thousands of dollars. Sketch a graph of this function.
Solution
This function is a polynomial function in two variables. The domain of \(f\) consists of \((x, y)\) coordinate pairs that yield a nonnegative profit:
\[16 - (x - 3)^2 - (y - 2)^2 \ge 0\]
\[(x - 3)^2 + (y - 2)^2 \le 16.\]

This is a disk of radius 4 centered at \((3, 2)\). A further restriction is that both \(x\) and \(y\) must be nonnegative. When \(x = 3\) and \(y = 2\), \(f(x, y) = 16\). Note that it is possible for either value to be a noninteger; for example, it is possible to sell 2.5 thousand nuts in a month. The domain, therefore, contains thousands of points, so we can consider all points within the disk. For any \(z < 16\), we can solve the equation \(f(x, y) = z\):
\[16 - (x - 3)^2 - (y - 2)^2 = z\]
\[(x - 3)^2 + (y - 2)^2 = 16 - z.\]

Since \(z < 16\), we know that \(16 - z > 0\), so the previous equation describes a circle with radius \(\sqrt{16 - z}\) centered at the point \((3, 2)\). Therefore, the range of \(f(x, y)\) is \(\{z \in \mathbb{R} \mid z \le 16\}\). The graph of \(f(x, y)\) is also a paraboloid, and this paraboloid points downward as shown.

Figure 14.1.5 : The graph of the given function of two variables is also a paraboloid.

Level Curves
If hikers walk along rugged trails, they might use a topographical map that shows how steeply the trails change. A topographical
map contains curved lines called contour lines. Each contour line corresponds to the points on the map that have equal elevation
(Figure 14.1.6). A level curve of a function of two variables f (x, y) is completely analogous to a contour line on a topographical
map.

Figure 14.1.6 : (a) A topographical map of Devil’s Tower, Wyoming. Lines that are close together indicate very steep terrain. (b) A
perspective photo of Devil’s Tower shows just how steep its sides are. Notice the top of the tower has the same shape as the center
of the topographical map.

 Definition: level curves

Given a function f (x, y) and a number c in the range of f , a level curve of a function of two variables for the value c is
defined to be the set of points satisfying the equation f (x, y) = c.

Returning to the function \(g(x, y) = \sqrt{9 - x^2 - y^2}\), we can determine the level curves of this function. The range of \(g\) is the closed interval \([0, 3]\). First, we choose any number in this closed interval, say, \(c = 2\). The level curve corresponding to \(c = 2\) is described by the equation
\[\sqrt{9 - x^2 - y^2} = 2.\]

To simplify, square both sides of this equation:
\[9 - x^2 - y^2 = 4.\]

Now, multiply both sides of the equation by \(-1\) and add 9 to each side:
\[x^2 + y^2 = 5.\]

This equation describes a circle centered at the origin with radius \(\sqrt{5}\). Using values of \(c\) between 0 and 3 yields other circles also centered at the origin. If \(c = 3\), then the circle has radius 0, so it consists solely of the origin. Figure 14.1.7 is a graph of the level curves of this function corresponding to \(c = 0, 1, 2,\) and \(3\). Note that in the previous derivation it may be possible that we introduced extra solutions by squaring both sides. This is not the case here because the range of the square root function is nonnegative.
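
The same steps work for every \(c\) in \([0, 3]\): squaring and rearranging gives \(x^2 + y^2 = 9 - c^2\), a circle of radius \(\sqrt{9 - c^2}\). The sketch below tabulates that radius for the four values used in the contour map and confirms that a point on each circle returns the expected value of \(g\). The function names and the sample angle are hypothetical.

```python
import math


def g(x, y):
    # Clamp tiny negative round-off so points on the boundary circle stay in the domain
    return math.sqrt(max(0.0, 9 - x * x - y * y))


def level_radius(c):
    """Radius of the level curve g(x, y) = c, valid for 0 <= c <= 3."""
    if not 0 <= c <= 3:
        raise ValueError("c must lie in the range [0, 3]")
    return math.sqrt(9 - c * c)


ANGLE = 0.7   # any angle works; the level curve is a full circle
for c in (0, 1, 2, 3):
    r = level_radius(c)
    x, y = r * math.cos(ANGLE), r * math.sin(ANGLE)
    print(f"c = {c}: radius = {r:.4f}, g at a point on the circle = {g(x, y):.4f}")
```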

Figure 14.1.7: Level curves of the function \(g(x, y) = \sqrt{9 - x^2 - y^2}\), using \(c = 0, 1, 2,\) and \(3\) (\(c = 3\) corresponds to the origin).
A graph of the various level curves of a function is called a contour map.

 Example 14.1.4: Making a Contour Map


Given the function \(f(x, y) = \sqrt{8 + 8x - 4y - 4x^2 - y^2}\), find the level curve corresponding to \(c = 0\). Then create a contour map for this function. What are the domain and range of \(f\)?
Solution
To find the level curve for \(c = 0\), we set \(f(x, y) = 0\) and solve. This gives
\[0 = \sqrt{8 + 8x - 4y - 4x^2 - y^2}.\]

We then square both sides and multiply both sides of the equation by \(-1\):
\[4x^2 + y^2 - 8x + 4y - 8 = 0.\]

Now, we rearrange the terms, putting the \(x\) terms together and the \(y\) terms together, and add 8 to each side:
\[4x^2 - 8x + y^2 + 4y = 8.\]

Next, we group the pairs of terms containing the same variable in parentheses, and factor 4 from the first pair:
\[4(x^2 - 2x) + (y^2 + 4y) = 8.\]

Then we complete the square in each pair of parentheses and add the correct value to the right-hand side:
\[4(x^2 - 2x + 1) + (y^2 + 4y + 4) = 8 + 4(1) + 4.\]

Next, we factor the left-hand side and simplify the right-hand side:
\[4(x - 1)^2 + (y + 2)^2 = 16.\]

Last, we divide both sides by 16:
\[\frac{(x - 1)^2}{4} + \frac{(y + 2)^2}{16} = 1.\]

This equation describes an ellipse centered at \((1, -2)\). The graph of this ellipse appears in the following graph.

Figure 14.1.8: Level curve of the function \(f(x, y) = \sqrt{8 + 8x - 4y - 4x^2 - y^2}\) corresponding to \(c = 0\).
We can repeat the same derivation for values of \(c\) with \(0 \le c < 4\). Then, the equation of the level curve becomes
\[\frac{4(x - 1)^2}{16 - c^2} + \frac{(y + 2)^2}{16 - c^2} = 1\]

for an arbitrary value of \(c\). Figure 14.1.9 shows a contour map for \(f(x, y)\) using the values \(c = 0, 1, 2,\) and \(3\). When \(c = 4\), the level curve is the point \((1, -2)\).

Figure 14.1.9: Contour map for the function \(f(x, y) = \sqrt{8 + 8x - 4y - 4x^2 - y^2}\) using the values \(c = 0, 1, 2, 3,\) and \(4\).
Finding the Domain & Range
Since this is a square root function, the radicand must not be negative. So we have
\[8 + 8x - 4y - 4x^2 - y^2 \ge 0.\]

Recognizing that the boundary of the domain is an ellipse, we repeat the steps we showed above to obtain
\[\frac{(x - 1)^2}{4} + \frac{(y + 2)^2}{16} \le 1.\]

So the domain of \(f\) can be written: \(\left\{(x, y) \,\middle|\, \frac{(x - 1)^2}{4} + \frac{(y + 2)^2}{16} \le 1\right\}\).

To find the range of \(f\), we need to consider the possible outputs of this square root function. We know the output cannot be negative, so we need to next check if its output is ever 0. From the work we completed above to find the level curve for \(c = 0\), we know the value of \(f\) is 0 for any point on that level curve (on the ellipse \(\frac{(x - 1)^2}{4} + \frac{(y + 2)^2}{16} = 1\)). So we know the lower bound of the range of this function is 0.
To determine the upper bound for the range of the function in this problem, it's easier if we first complete the square under the radical.
\[f(x, y) = \sqrt{8 + 8x - 4y - 4x^2 - y^2}\]
\[= \sqrt{8 - 4(x^2 - 2x) - (y^2 + 4y)}\]
\[= \sqrt{8 - 4(x^2 - 2x + 1 - 1) - (y^2 + 4y + 4 - 4)}\]
\[= \sqrt{8 - 4(x^2 - 2x + 1) + 4 - (y^2 + 4y + 4) + 4}\]
\[= \sqrt{16 - 4(x - 1)^2 - (y + 2)^2}\]

Now that we have \(f\) in this form, we can see how large the radicand can be. Since we are subtracting two perfect squares from 16, we know that the value of the radicand cannot be greater than 16. At the point \((1, -2)\), we can see the radicand will be 16 (since we will be subtracting 0 from 16 at this point). This gives us the maximum value of \(f\), that is, \(f(1, -2) = \sqrt{16} = 4\).
So the range of this function is \([0, 4]\).
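
The completed-square form also gives the size of each level curve directly. For \(0 \le c < 4\), the level curve \(4(x - 1)^2 + (y + 2)^2 = 16 - c^2\) is an ellipse centered at \((1, -2)\) with semi-axes \(\sqrt{16 - c^2}/2\) in the \(x\)-direction and \(\sqrt{16 - c^2}\) in the \(y\)-direction. The following sketch tabulates them; the function name `semi_axes` is hypothetical.

```python
import math


def f(x, y):
    # Clamp tiny negative round-off so boundary points stay in the domain
    return math.sqrt(max(0.0, 8 + 8 * x - 4 * y - 4 * x * x - y * y))


def semi_axes(c):
    """Semi-axes (x-direction, y-direction) of the level curve f(x, y) = c."""
    if not 0 <= c <= 4:
        raise ValueError("c must lie in the range [0, 4]")
    s = math.sqrt(16 - c * c)
    return s / 2, s


for c in (0, 1, 2, 3, 4):
    ax, ay = semi_axes(c)
    # The rightmost point of the ellipse, (1 + ax, -2), should satisfy f = c
    print(f"c = {c}: semi-axes = ({ax:.4f}, {ay:.4f}), f at rightmost point = {f(1 + ax, -2):.4f}")
```

At \(c = 4\) both semi-axes are 0, which matches the level curve collapsing to the single point \((1, -2)\) where \(f\) attains its maximum.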

 Exercise 14.1.2

Find and graph the level curve of the function \(g(x, y) = x^2 + y^2 - 6x + 2y\) corresponding to \(c = 15\).

Hint
First, set g(x, y) = 15 and then complete the square.
Solution
The equation of the level curve can be written as \((x - 3)^2 + (y + 1)^2 = 25\), which is a circle with radius 5 centered at \((3, -1)\).

Another useful tool for understanding the graph of a function of two variables is called a vertical trace. Level curves are always
graphed in the xy-plane, but as their name implies, vertical traces are graphed in the xz- or yz-planes.

 Definition: vertical traces

Consider a function \(z = f(x, y)\) with domain \(D \subseteq \mathbb{R}^2\). A vertical trace of the function can be either the set of points that solves the equation \(f(a, y) = z\) for a given constant \(x = a\) or \(f(x, b) = z\) for a given constant \(y = b\).

 Example 14.1.5: Finding Vertical Traces


Find vertical traces for the function \(f(x, y) = \sin x \cos y\) corresponding to \(x = -\frac{\pi}{4}, 0,\) and \(\frac{\pi}{4}\), and \(y = -\frac{\pi}{4}, 0,\) and \(\frac{\pi}{4}\).

Solution
First set \(x = -\frac{\pi}{4}\) in the equation \(z = \sin x \cos y\):
\[z = \sin\left(-\frac{\pi}{4}\right)\cos y = -\frac{\sqrt{2}\cos y}{2} \approx -0.7071\cos y.\]

This describes a cosine graph in the plane \(x = -\frac{\pi}{4}\). The other traces appear in the following table.

Vertical Traces Parallel to the yz-Plane for the Function \(f(x, y) = \sin x \cos y\)
c                        Vertical Trace for x = c
\(-\frac{\pi}{4}\)       \(z = -\frac{\sqrt{2}\cos y}{2}\)
\(0\)                    \(z = 0\)
\(\frac{\pi}{4}\)        \(z = \frac{\sqrt{2}\cos y}{2}\)

In a similar fashion, we can substitute the y-values in the equation \(z = f(x, y)\) to obtain the traces parallel to the xz-plane, as listed in the following table.

Vertical Traces Parallel to the xz-Plane for the Function \(f(x, y) = \sin x \cos y\)
d                        Vertical Trace for y = d
\(-\frac{\pi}{4}\)       \(z = \frac{\sqrt{2}\sin x}{2}\)
\(0\)                    \(z = \sin x\)
\(\frac{\pi}{4}\)        \(z = \frac{\sqrt{2}\sin x}{2}\)

The three traces for constant \(x\), which lie in planes parallel to the yz-plane, are cosine functions; the three traces for constant \(y\), which lie in planes parallel to the xz-plane, are sine functions. These curves appear in the intersections of the surface with the planes \(x = -\frac{\pi}{4}\), \(x = 0\), \(x = \frac{\pi}{4}\) and \(y = -\frac{\pi}{4}\), \(y = 0\), \(y = \frac{\pi}{4}\) as shown in the following figure.

Figure 14.1.10: Vertical traces of the function \(f(x, y)\) are cosine curves in planes parallel to the yz-plane (a) and sine curves in planes parallel to the xz-plane (b).
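
A vertical trace is what remains of the function after one variable is frozen. The sketch below builds the trace for a fixed \(x = c\) as a one-variable function of \(y\) and compares it with the closed form found in this example for \(c = -\frac{\pi}{4}\). The helper name `trace_at_x` and the sample values of \(y\) are hypothetical.

```python
import math


def f(x, y):
    return math.sin(x) * math.cos(y)


def trace_at_x(c):
    """Return the vertical trace z = f(c, y) as a function of y alone."""
    return lambda y: f(c, y)


trace = trace_at_x(-math.pi / 4)
for y in (0.0, 0.5, 1.0, 2.0):
    closed_form = -math.sqrt(2) * math.cos(y) / 2
    print(f"y = {y}: trace = {trace(y):.6f}, closed form = {closed_form:.6f}")
```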

 Exercise 14.1.3

Determine the equation of the vertical trace of the function \(g(x, y) = -x^2 - y^2 + 2x + 4y - 1\) corresponding to \(y = 3\), and describe its graph.

Hint
Set \(y = 3\) in the equation \(z = -x^2 - y^2 + 2x + 4y - 1\) and complete the square.
Solution
\(z = 3 - (x - 1)^2\). This function describes a parabola opening downward in the plane \(y = 3\).

Functions of two variables can produce some striking-looking surfaces. Figure 14.1.11 shows two examples.

Figure 14.1.11: Examples of surfaces representing functions of two variables: (a) a combination of a power function and a sine
function and (b) a combination of trigonometric, exponential, and logarithmic functions.

Functions of More Than Two Variables


So far, we have examined only functions of two variables. However, it is useful to take a brief look at functions of more than two
variables. Two such examples are

\[f(x, y, z) = x^2 - 2xy + y^2 + 3yz - z^2 + 4x - 2y + 3x - 6\]

(a polynomial in three variables) and
\[g(x, y, t) = (x^2 - 4xy + y^2)\sin t - (3x + 5y)\cos t.\]

In the first function, (x, y, z) represents a point in space, and the function f maps each point in space to a fourth quantity, such as
temperature or wind speed. In the second function, (x, y) can represent a point in the plane, and t can represent time. The function
might map a point in the plane to a third quantity (for example, pressure) at a given time t . The method for finding the domain of a
function of more than two variables is analogous to the method for functions of one or two variables.

 Example 14.1.6: Domains for Functions of Three Variables

Find the domain of each of the following functions:


a. \(f(x, y, z) = \dfrac{3x - 4y + 2z}{\sqrt{9 - x^2 - y^2 - z^2}}\)
b. \(g(x, y, t) = \dfrac{\sqrt{2t - 4}}{x^2 - y^2}\)

Solution:
a. For the function \(f(x, y, z) = \dfrac{3x - 4y + 2z}{\sqrt{9 - x^2 - y^2 - z^2}}\) to be defined (and be a real value), two conditions must hold:
1. The denominator cannot be zero.
2. The radicand cannot be negative.
Combining these conditions leads to the inequality
\[9 - x^2 - y^2 - z^2 > 0.\]

Moving the variables to the other side and reversing the inequality gives the domain as
\[\text{domain}(f) = \{(x, y, z) \in \mathbb{R}^3 \mid x^2 + y^2 + z^2 < 9\},\]

which describes a ball of radius 3 centered at the origin. (Note: The surface of the ball is not included in this domain.)
b. For the function \(g(x, y, t) = \dfrac{\sqrt{2t - 4}}{x^2 - y^2}\) to be defined (and be a real value), two conditions must hold:
1. The radicand cannot be negative.
2. The denominator cannot be zero.
Since the radicand cannot be negative, \(2t - 4 \ge 0\), and therefore \(t \ge 2\). Since the denominator cannot be zero, \(x^2 - y^2 \ne 0\), or \(x^2 \ne y^2\), which can be rewritten as \(y \ne \pm x\). The excluded points satisfy \(y = \pm x\), which are the equations of two lines passing through the origin. Therefore, the domain of \(g\) is
\[\text{domain}(g) = \{(x, y, t) \mid y \ne \pm x,\ t \ge 2\}.\]
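
The domain conditions found in this example translate directly into membership tests. The sketch below encodes them as two predicates with hypothetical names and tries a few sample points. Exact floating-point comparisons are used for simplicity, so it is only meant for inputs that are represented exactly.

```python
def in_domain_f(x, y, z):
    """Part a: the radicand must be strictly positive, so the point lies inside the open ball."""
    return 9 - x * x - y * y - z * z > 0


def in_domain_g(x, y, t):
    """Part b: the radicand 2t - 4 must be nonnegative and the denominator x^2 - y^2 nonzero."""
    return 2 * t - 4 >= 0 and x * x != y * y


print(in_domain_f(1, 1, 1))    # inside the ball of radius 3
print(in_domain_f(3, 0, 0))    # on the surface of the ball, so excluded
print(in_domain_g(2, 1, 2))    # t = 2 is allowed and y is not equal to x or -x
print(in_domain_g(2, -2, 5))   # y = -x makes the denominator zero
```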

 Exercise 14.1.4
Find the domain of the function \(h(x, y, t) = (3t - 6)\sqrt{y - 4x^2 + 4}\).

Hint
Check for values that make radicands negative or denominators equal to zero.
Solution
\[\text{domain}(h) = \{(x, y, t) \in \mathbb{R}^3 \mid y \ge 4x^2 - 4\}\]

Functions of two variables have level curves, which are shown as curves in the xy-plane. However, when the function has three
variables, the curves become surfaces, so we can define level surfaces for functions of three variables.

 Definition: level surface of a function of three variables

Given a function f (x, y, z) and a number c in the range of f , a level surface of a function of three variables is defined to be the
set of points satisfying the equation f (x, y, z) = c.

 Example 14.1.7: Finding a Level Surface

Find the level surface for the function \(f(x, y, z) = 4x^2 + 9y^2 - z^2\) corresponding to \(c = 1\).
Solution
The level surface is defined by the equation \(4x^2 + 9y^2 - z^2 = 1\). This equation describes a hyperboloid of one sheet as shown in Figure 14.1.12.

Figure 14.1.12: A hyperboloid of one sheet with some of its level surfaces.

 Exercise 14.1.5
Find the equation of the level surface of the function
\[g(x, y, z) = x^2 + y^2 + z^2 - 2x + 4y - 6z\]

corresponding to c = 2, and describe the surface, if possible.

Hint
Set g(x, y, z) = c and complete the square.
Solution
\((x - 1)^2 + (y + 2)^2 + (z - 3)^2 = 16\) describes a sphere of radius 4 centered at the point \((1, -2, 3)\).
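
Completing the square in each variable gives \((x - 1)^2 + (y + 2)^2 + (z - 3)^2 = c + 14\) for a general level \(c\), so the level surface is a sphere whenever \(c > -14\). The sketch below encodes that result in a hypothetical helper, `sphere_from_level`, and checks one point on the \(c = 2\) sphere against the original function.

```python
import math


def g(x, y, z):
    return x * x + y * y + z * z - 2 * x + 4 * y - 6 * z


def sphere_from_level(c):
    """Center and radius of the level surface g(x, y, z) = c, valid for c > -14."""
    radius_squared = c + 1 + 4 + 9   # constants added when completing the three squares
    if radius_squared <= 0:
        raise ValueError("no sphere for this level")
    return (1.0, -2.0, 3.0), math.sqrt(radius_squared)


center, radius = sphere_from_level(2)
# A point one radius from the center in the x-direction should lie on the level surface
point = (center[0] + radius, center[1], center[2])
print(center, radius, g(*point))
```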

Summary
The graph of a function of two variables is a surface in \(\mathbb{R}^3\) and can be studied using level curves and vertical traces.
A set of level curves is called a contour map.

Key Equations
Vertical trace
f (a, y) = z for x = a or f (x, b) = z for y = b
Level surface of a function of three variables
f (x, y, z) = c

Glossary
contour map
a plot of the various level curves of a given function f (x, y)

function of two variables


a function \(z = f(x, y)\) that maps each ordered pair \((x, y)\) in a subset \(D\) of \(\mathbb{R}^2\) to a unique real number \(z\)

graph of a function of two variables


a set of ordered triples (x, y, z) that satisfies the equation z = f (x, y) plotted in three-dimensional Cartesian space

level curve of a function of two variables


the set of points satisfying the equation f (x, y) = c for some real number c in the range of f

level surface of a function of three variables


the set of points satisfying the equation f (x, y, z) = c for some real number c in the range of f

surface
the graph of a function of two variables, z = f (x, y)

vertical trace
the set of ordered triples (c, y, z) that solves the equation f (c, y) = z for a given constant x = c or the set of ordered triples
(x, d, z) that solves the equation f (x, d) = z for a given constant y = d

14.1: Functions of Several Variables is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
14.1: Functions of Several Variables by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0. Original source:
https://openstax.org/details/books/calculus-volume-1.

14.2: Limits and Continuity
Learning Objectives
Calculate the limit of a function of two variables.
Learn how a function of two variables can approach different values at a boundary point, depending on the path of
approach.
State the conditions for continuity of a function of two variables.
Verify the continuity of a function of two variables at a point.
Calculate the limit of a function of three or more variables and verify the continuity of the function at a point.

We have now examined functions of more than one variable and seen how to graph them. In this section, we see how to take the
limit of a function of more than one variable, and what it means for a function of more than one variable to be continuous at a point
in its domain. It turns out these concepts have aspects that just don’t occur with functions of one variable.

Limit of a Function of Two Variables


Recall from Section 2.5 the definition of a limit of a function of one variable:
Let \(f(x)\) be defined for all \(x \neq a\) in an open interval containing \(a\). Let \(L\) be a real number. Then

\[\lim_{x \to a} f(x) = L\]

if for every \(\varepsilon > 0\) there exists a \(\delta > 0\) such that, for all \(x\) in the domain of \(f\) with \(0 < |x - a| < \delta\),

\[|f(x) - L| < \varepsilon.\]

Before we can adapt this definition to define a limit of a function of two variables, we first need to see how to extend the idea of an open interval in one variable to an analogous open set in two variables.

Definition: δ Disks

Consider a point \((a, b) \in \mathbb{R}^2\). A δ disk centered at point \((a, b)\) is defined to be an open disk of radius δ centered at point \((a, b)\), that is,

\[\{(x, y) \in \mathbb{R}^2 \mid (x - a)^2 + (y - b)^2 < \delta^2\}\]

as shown in Figure 14.2.1.

Figure 14.2.1 : A δ disk centered around the point (2, 1).

The idea of a δ disk appears in the definition of the limit of a function of two variables. If δ is small, then all the points (x, y) in the
δ disk are close to (a, b) . This is completely analogous to x being close to a in the definition of a limit of a function of one variable.

In one dimension, we express this restriction as

a − δ < x < a + δ.

In more than one dimension, we use a δ disk.

14.2.1 https://math.libretexts.org/@go/page/4537
Definition: limit of a function of two variables

Let \(f\) be a function of two variables, \(x\) and \(y\). The limit of \(f(x, y)\) as \((x, y)\) approaches \((a, b)\) is \(L\), written

\[\lim_{(x,y) \to (a,b)} f(x, y) = L\]

if for each \(\varepsilon > 0\) there exists a small enough \(\delta > 0\) such that for all points \((x, y)\) in a δ disk around \((a, b)\), except possibly for \((a, b)\) itself, the value of \(f(x, y)\) is less than ε away from \(L\) (Figure 14.2.2).

Using symbols, we write the following: For any \(\varepsilon > 0\), there exists a number \(\delta > 0\) such that

\[|f(x, y) - L| < \varepsilon\]

whenever

\[0 < \sqrt{(x - a)^2 + (y - b)^2} < \delta.\]

Figure 14.2.2 : The limit of a function involving two variables requires that f (x, y) be within ε of L whenever (x, y) is within δ of
(a, b) . The smaller the value of ε , the smaller the value of δ .

Proving that a limit exists using the definition of a limit of a function of two variables can be challenging. Instead, we use the
following theorem, which gives us shortcuts to finding limits. The formulas in this theorem are an extension of the formulas in the
limit laws theorem in The Limit Laws.

Limit laws for functions of two variables

Let \(f(x, y)\) and \(g(x, y)\) be defined for all \((x, y) \neq (a, b)\) in a neighborhood around \((a, b)\), and assume the neighborhood is contained completely inside the domain of \(f\). Assume that \(L\) and \(M\) are real numbers such that

\[\lim_{(x,y) \to (a,b)} f(x, y) = L\]

and

\[\lim_{(x,y) \to (a,b)} g(x, y) = M,\]

and let \(c\) be a constant. Then each of the following statements holds:

Constant Law:
\[\lim_{(x,y) \to (a,b)} c = c\]

Identity Laws:
\[\lim_{(x,y) \to (a,b)} x = a\]
\[\lim_{(x,y) \to (a,b)} y = b\]

Sum Law:
\[\lim_{(x,y) \to (a,b)} (f(x, y) + g(x, y)) = L + M\]

Difference Law:
\[\lim_{(x,y) \to (a,b)} (f(x, y) - g(x, y)) = L - M\]

Constant Multiple Law:
\[\lim_{(x,y) \to (a,b)} (c f(x, y)) = cL\]

Product Law:
\[\lim_{(x,y) \to (a,b)} (f(x, y) g(x, y)) = LM\]

Quotient Law:
\[\lim_{(x,y) \to (a,b)} \frac{f(x, y)}{g(x, y)} = \frac{L}{M} \quad \text{for } M \neq 0\]

Power Law:
\[\lim_{(x,y) \to (a,b)} (f(x, y))^n = L^n\]
for any positive integer \(n\).

Root Law:
\[\lim_{(x,y) \to (a,b)} \sqrt[n]{f(x, y)} = \sqrt[n]{L}\]
for all \(L\) if \(n\) is odd and positive, and for \(L \ge 0\) if \(n\) is even and positive.

The proofs of these properties are similar to those for the limits of functions of one variable. We can apply these laws to finding
limits of various functions.

Example 14.2.1: Finding the Limit of a Function of Two Variables

Find each of the following limits:

a. \(\displaystyle \lim_{(x,y) \to (2,-1)} (x^2 - 2xy + 3y^2 - 4x + 3y - 6)\)

b. \(\displaystyle \lim_{(x,y) \to (2,-1)} \frac{2x + 3y}{4x - 3y}\)

Solution
a. First use the sum and difference laws to separate the terms:
\begin{align*}
\lim_{(x,y) \to (2,-1)} (x^2 - 2xy + 3y^2 - 4x + 3y - 6)
&= \left(\lim_{(x,y) \to (2,-1)} x^2\right) - \left(\lim_{(x,y) \to (2,-1)} 2xy\right) + \left(\lim_{(x,y) \to (2,-1)} 3y^2\right) - \left(\lim_{(x,y) \to (2,-1)} 4x\right) \\
&\quad + \left(\lim_{(x,y) \to (2,-1)} 3y\right) - \left(\lim_{(x,y) \to (2,-1)} 6\right).
\end{align*}

Next, use the constant multiple law on the second, third, fourth, and fifth limits:
\begin{align*}
&= \left(\lim_{(x,y) \to (2,-1)} x^2\right) - 2\left(\lim_{(x,y) \to (2,-1)} xy\right) + 3\left(\lim_{(x,y) \to (2,-1)} y^2\right) - 4\left(\lim_{(x,y) \to (2,-1)} x\right) \\
&\quad + 3\left(\lim_{(x,y) \to (2,-1)} y\right) - \lim_{(x,y) \to (2,-1)} 6.
\end{align*}

Now, use the power law on the first and third limits, and the product law on the second limit:
\begin{align*}
&= \left(\lim_{(x,y) \to (2,-1)} x\right)^2 - 2\left(\lim_{(x,y) \to (2,-1)} x\right)\left(\lim_{(x,y) \to (2,-1)} y\right) + 3\left(\lim_{(x,y) \to (2,-1)} y\right)^2 \\
&\quad - 4\left(\lim_{(x,y) \to (2,-1)} x\right) + 3\left(\lim_{(x,y) \to (2,-1)} y\right) - \lim_{(x,y) \to (2,-1)} 6.
\end{align*}

Last, use the identity laws on the first six limits and the constant law on the last limit:
\begin{align*}
\lim_{(x,y) \to (2,-1)} (x^2 - 2xy + 3y^2 - 4x + 3y - 6) &= (2)^2 - 2(2)(-1) + 3(-1)^2 - 4(2) + 3(-1) - 6 \\
&= -6.
\end{align*}

b. Before applying the quotient law, we need to verify that the limit of the denominator is nonzero. Using the difference law, constant multiple law, and identity law,
\begin{align*}
\lim_{(x,y) \to (2,-1)} (4x - 3y) &= \lim_{(x,y) \to (2,-1)} 4x - \lim_{(x,y) \to (2,-1)} 3y \\
&= 4\left(\lim_{(x,y) \to (2,-1)} x\right) - 3\left(\lim_{(x,y) \to (2,-1)} y\right) \\
&= 4(2) - 3(-1) = 11.
\end{align*}

Since the limit of the denominator is nonzero, the quotient law applies. We now calculate the limit of the numerator using the sum law, constant multiple law, and identity law:
\begin{align*}
\lim_{(x,y) \to (2,-1)} (2x + 3y) &= \lim_{(x,y) \to (2,-1)} 2x + \lim_{(x,y) \to (2,-1)} 3y \\
&= 2\left(\lim_{(x,y) \to (2,-1)} x\right) + 3\left(\lim_{(x,y) \to (2,-1)} y\right) \\
&= 2(2) + 3(-1) = 1.
\end{align*}

Therefore, according to the quotient law we have
\begin{align*}
\lim_{(x,y) \to (2,-1)} \frac{2x + 3y}{4x - 3y} &= \frac{\lim_{(x,y) \to (2,-1)} (2x + 3y)}{\lim_{(x,y) \to (2,-1)} (4x - 3y)} \\
&= \frac{1}{11}.
\end{align*}
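
The two results can be sanity-checked numerically. The sketch below is not part of the original solution: it evaluates both functions exactly at \((2, -1)\) with rational arithmetic and then at nearby points along one arbitrarily chosen diagonal direction. The names `f`, `q`, and the step sizes are illustrative assumptions, and a numerical check along one path does not prove that a limit exists.

```python
from fractions import Fraction

def f(x, y):
    """Polynomial from part a."""
    return x**2 - 2*x*y + 3*y**2 - 4*x + 3*y - 6

def q(x, y):
    """Rational function from part b."""
    return (2*x + 3*y) / (4*x - 3*y)

# Direct substitution with exact rational arithmetic.
exact_a = f(Fraction(2), Fraction(-1))
exact_b = q(Fraction(2), Fraction(-1))

# Values at points approaching (2, -1) along the direction (1, 1).
steps = [10.0 ** (-k) for k in range(1, 7)]
near_a = [f(2 + t, -1 + t) for t in steps]
near_b = [q(2 + t, -1 + t) for t in steps]

print(exact_a, exact_b)
print(near_a[-1], near_b[-1])
```

The exact values agree with the limits found using the limit laws, and the nearby values settle toward them as the step shrinks.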

Exercise 14.2.1

Evaluate the following limit:

\[\lim_{(x,y) \to (5,-2)} \sqrt[3]{\frac{x^2 - y}{y^2 + x - 1}}.\]

Hint
Use the limit laws.
Answer
\[\lim_{(x,y) \to (5,-2)} \sqrt[3]{\frac{x^2 - y}{y^2 + x - 1}} = \frac{3}{2}\]

Since we are taking the limit of a function of two variables, the point \((a, b)\) is in \(\mathbb{R}^2\), and it is possible to approach this point from an infinite number of directions. Sometimes when calculating a limit, the answer varies depending on the path taken toward \((a, b)\). If this is the case, then the limit fails to exist. In other words, the limit must be unique, regardless of path taken.

Example 14.2.2: Limits That Fail to Exist

Show that neither of the following limits exists:

a. \(\displaystyle \lim_{(x,y) \to (0,0)} \frac{2xy}{3x^2 + y^2}\)

b. \(\displaystyle \lim_{(x,y) \to (0,0)} \frac{4xy^2}{x^2 + 3y^4}\)

Solution
a. The domain of the function \(f(x, y) = \frac{2xy}{3x^2 + y^2}\) consists of all points in the \(xy\)-plane except for the point \((0, 0)\) (Figure 14.2.3). To show that the limit does not exist as \((x, y)\) approaches \((0, 0)\), we note that it is impossible to satisfy the definition of a limit of a function of two variables because the function takes different values along different lines passing through point \((0, 0)\). First, consider the line \(y = 0\) in the \(xy\)-plane. Substituting \(y = 0\) into \(f(x, y)\) gives
\[f(x, 0) = \frac{2x(0)}{3x^2 + 0^2} = 0\]
for any value of \(x \neq 0\). Therefore the value of \(f\) remains constant for any point on the \(x\)-axis, and as \(x\) approaches zero along this line, the function remains fixed at zero.
Next, consider the line \(y = x\). Substituting \(y = x\) into \(f(x, y)\) gives
\[f(x, x) = \frac{2x(x)}{3x^2 + x^2} = \frac{2x^2}{4x^2} = \frac{1}{2}.\]
This is true for any point on the line \(y = x\) other than the origin. If we let \(x\) approach zero while staying on this line, the value of the function remains fixed at \(\frac{1}{2}\), regardless of how small \(x\) is.
Choose a small value for ε, say \(\varepsilon = \frac{1}{4}\). Then, no matter how small a δ disk we draw around \((0, 0)\), the values of \(f(x, y)\) for points inside that δ disk will include both 0 and \(\frac{1}{2}\), and no single number \(L\) is within \(\frac{1}{4}\) of both. Therefore, the definition of limit at a point is never satisfied and the limit fails to exist.
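
The two lines used in part a are special cases of a single computation. As a sketch, with \(m\) introduced here (it does not appear in the original solution) for the slope of an arbitrary line \(y = mx\) through the origin:

```latex
\[
f(x, mx) = \frac{2x(mx)}{3x^2 + (mx)^2} = \frac{2m\,x^2}{(3 + m^2)\,x^2} = \frac{2m}{3 + m^2}, \qquad x \neq 0.
\]
```

Setting \(m = 0\) and \(m = 1\) recovers the values \(0\) and \(\frac{1}{2}\). Because the constant value along the line depends on \(m\), no single number can serve as the limit.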

Figure 14.2.3: Graph of the function \(f(x, y) = \frac{2xy}{3x^2 + y^2}\). Along the line \(y = 0\), the function is equal to zero; along the line \(y = x\), the function is equal to \(\frac{1}{2}\).

b. In a similar fashion to a., we can approach the origin along any straight line passing through the origin. If we try the \(x\)-axis (i.e., \(y = 0\)), then the function remains fixed at zero. The same is true for the \(y\)-axis. Suppose we approach the origin along a straight line of slope \(k\). The equation of this line is \(y = kx\). Then the limit becomes
\begin{align*}
\lim_{(x,y) \to (0,0)} \frac{4xy^2}{x^2 + 3y^4} &= \lim_{(x,y) \to (0,0)} \frac{4x(kx)^2}{x^2 + 3(kx)^4} \\
&= \lim_{(x,y) \to (0,0)} \frac{4k^2x^3}{x^2 + 3k^4x^4} \\
&= \lim_{(x,y) \to (0,0)} \frac{4k^2x}{1 + 3k^4x^2} \\
&= \frac{\lim_{(x,y) \to (0,0)} (4k^2x)}{\lim_{(x,y) \to (0,0)} (1 + 3k^4x^2)} \\
&= 0
\end{align*}
regardless of the value of \(k\). It would seem that the limit is equal to zero. What if we chose a curve passing through the origin instead? For example, we can consider the parabola given by the equation \(x = y^2\). Substituting \(y^2\) in place of \(x\) in \(f(x, y)\) gives
\begin{align*}
\lim_{(x,y) \to (0,0)} \frac{4xy^2}{x^2 + 3y^4} &= \lim_{(x,y) \to (0,0)} \frac{4(y^2)y^2}{(y^2)^2 + 3y^4} \\
&= \lim_{(x,y) \to (0,0)} \frac{4y^4}{y^4 + 3y^4} \\
&= \lim_{(x,y) \to (0,0)} 1 \\
&= 1.
\end{align*}
By the same logic as in part a, every δ disk around the origin contains points where the function equals 0 and points where it equals 1, so the definition of the limit cannot be satisfied for any value of \(\varepsilon \le \frac{1}{2}\). Therefore,
\[\lim_{(x,y) \to (0,0)} \frac{4xy^2}{x^2 + 3y^4}\]
does not exist.
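
The contrast between straight-line paths and the parabolic path in part b can be seen numerically. The following is a minimal sketch, not part of the original solution; the helper names `along_line` and `along_parabola`, the sample slopes, and the parameter value `t` are illustrative assumptions.

```python
def f(x, y):
    """f(x, y) = 4xy^2 / (x^2 + 3y^4), defined away from the origin."""
    return 4 * x * y**2 / (x**2 + 3 * y**4)

def along_line(k, t):
    """Value of f at the point (t, k*t) on the line y = kx."""
    return f(t, k * t)

def along_parabola(t):
    """Value of f at the point (t^2, t) on the parabola x = y^2."""
    return f(t**2, t)

t = 1e-4  # small parameter, so both sample points lie close to the origin
line_values = [along_line(k, t) for k in (-2.0, 0.5, 1.0, 3.0)]
parabola_value = along_parabola(t)

print(line_values)      # all close to 0
print(parabola_value)   # close to 1
```

Points equally close to the origin give values near 0 on every sampled line but a value near 1 on the parabola, which matches the conclusion that the limit does not exist.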

Exercise 14.2.2
Show that
\[\lim_{(x,y) \to (2,1)} \frac{(x - 2)(y - 1)}{(x - 2)^2 + (y - 1)^2}\]
does not exist.

Hint
Pick a line with slope \(k\) passing through point \((2, 1)\).
Answer
If \(y = k(x - 2) + 1\), then \(\displaystyle \lim_{(x,y) \to (2,1)} \frac{(x - 2)(y - 1)}{(x - 2)^2 + (y - 1)^2} = \frac{k}{1 + k^2}\). Since the answer depends on \(k\), the limit fails to exist.

Interior Points and Boundary Points


To study continuity and differentiability of a function of two or more variables, we first need to learn some new terminology.

Definition: interior and boundary points

Let \(S\) be a subset of \(\mathbb{R}^2\) (Figure 14.2.4).

1. A point \(P_0\) is called an interior point of \(S\) if there is a δ disk centered around \(P_0\) contained completely in \(S\).
2. A point \(P_0\) is called a boundary point of \(S\) if every δ disk centered around \(P_0\) contains points both inside and outside \(S\).

Figure 14.2.4: In the set \(S\) shown, \((-1, 1)\) is an interior point and \((2, 3)\) is a boundary point.

Definition: Open and closed sets

Let \(S\) be a subset of \(\mathbb{R}^2\) (Figure 14.2.4).

1. \(S\) is called an open set if every point of \(S\) is an interior point.
2. \(S\) is called a closed set if it contains all its boundary points.

An example of an open set is a δ disk. If we include the boundary of the disk, then it becomes a closed set. A set that contains
some, but not all, of its boundary points is neither open nor closed. For example if we include half the boundary of a δ disk but not
the other half, then the set is neither open nor closed.

Definition: connected sets and Regions

Let \(S\) be a subset of \(\mathbb{R}^2\) (Figure 14.2.4).

1. An open set \(S\) is a connected set if it cannot be represented as the union of two or more disjoint, nonempty open subsets.
2. A set \(S\) is a region if it is open, connected, and nonempty.
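
The definitions of interior and boundary points can be tried out on a concrete set. The sketch below assumes a hypothetical example set, the closed disk of radius 5 centered at the origin. The function name `classify` and the label "exterior" (for points outside the set) are illustrative choices rather than terms from the text, and boundary membership is tested with a floating-point tolerance.

```python
import math

def classify(point, radius=5.0):
    """Classify a point relative to the closed disk x^2 + y^2 <= radius^2."""
    dist = math.hypot(*point)
    if math.isclose(dist, radius):
        # Every δ disk around the point reaches both inside and outside the set.
        return "boundary"
    # If dist < radius, the δ disk with δ = radius - dist fits inside the set.
    return "interior" if dist < radius else "exterior"

print(classify((1.0, 2.0)))   # strictly inside the circle
print(classify((3.0, 4.0)))   # on the circle, since 3^2 + 4^2 = 25
print(classify((6.0, 0.0)))   # outside the disk
```

Because this disk contains all of its boundary points, it is a closed set; dropping the boundary points would leave an open set.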

The definition of a limit of a function of two variables requires the δ disk to be contained inside the domain of the function.
However, if we wish to find the limit of a function at a boundary point of the domain, the δ disk is not contained inside the domain.
By definition, some of the points of the δ disk are inside the domain and some are outside. Therefore, we need only consider points
that are inside both the δ disk and the domain of the function. This leads to the definition of the limit of a function at a boundary
point.

Definition

Let \(f\) be a function of two variables, \(x\) and \(y\), and suppose \((a, b)\) is on the boundary of the domain of \(f\). Then, the limit of \(f(x, y)\) as \((x, y)\) approaches \((a, b)\) is \(L\), written

\[\lim_{(x,y) \to (a,b)} f(x, y) = L,\]

if for any \(\varepsilon > 0\), there exists a number \(\delta > 0\) such that for any point \((x, y)\) inside the domain of \(f\) and within a distance δ of \((a, b)\), other than \((a, b)\) itself, the value of \(f(x, y)\) is less than ε away from \(L\) (Figure 14.2.2). Using symbols, we can write: For any \(\varepsilon > 0\), there exists a number \(\delta > 0\) such that
\[|f(x, y) - L| < \varepsilon \quad \text{whenever } (x, y) \text{ is in the domain of } f \text{ and } 0 < \sqrt{(x - a)^2 + (y - b)^2} < \delta.\]

Example 14.2.3: Limit of a Function at a Boundary Point
Prove
\[\lim_{(x,y) \to (4,3)} \sqrt{25 - x^2 - y^2} = 0.\]

Solution
The domain of the function \(f(x, y) = \sqrt{25 - x^2 - y^2}\) is \(\{(x, y) \in \mathbb{R}^2 \mid x^2 + y^2 \le 25\}\), which is a circle of radius 5 centered at the origin, along with its interior as shown in Figure 14.2.5.

Figure 14.2.5: Domain of the function \(f(x, y) = \sqrt{25 - x^2 - y^2}\).
We can use the limit laws, which apply to limits at the boundary of domains as well as interior points:
\begin{align*}
\lim_{(x,y) \to (4,3)} \sqrt{25 - x^2 - y^2} &= \sqrt{\lim_{(x,y) \to (4,3)} (25 - x^2 - y^2)} \\
&= \sqrt{\lim_{(x,y) \to (4,3)} 25 - \lim_{(x,y) \to (4,3)} x^2 - \lim_{(x,y) \to (4,3)} y^2} \\
&= \sqrt{25 - 4^2 - 3^2} \\
&= 0
\end{align*}

See the following graph.

Figure 14.2.6: Graph of the function \(f(x, y) = \sqrt{25 - x^2 - y^2}\).
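
The boundary-point definition only considers points of the δ disk that also lie in the domain. The sketch below illustrates this for the function in this example; the sampling scheme (360 points on a circle of radius δ/2 around \((4, 3)\)) and the helper names are assumptions made for illustration, and sampling is a numerical check rather than a proof.

```python
import math

def in_domain(x, y):
    """True where 25 - x^2 - y^2 is nonnegative, so the square root is defined."""
    return 25 - x**2 - y**2 >= 0

def f(x, y):
    return math.sqrt(25 - x**2 - y**2)

def max_value_near(delta, samples=360):
    """Largest f value among sampled domain points at distance delta/2 from (4, 3)."""
    best = 0.0
    for i in range(samples):
        angle = 2 * math.pi * i / samples
        x = 4 + 0.5 * delta * math.cos(angle)
        y = 3 + 0.5 * delta * math.sin(angle)
        if in_domain(x, y):  # skip points of the disk that fall outside the domain
            best = max(best, f(x, y))
    return best

for delta in (1e-1, 1e-2, 1e-3):
    print(delta, max_value_near(delta))
```

The largest sampled value shrinks as δ shrinks, consistent with the limit 0 obtained from the limit laws.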

Exercise 14.2.3
Evaluate the following limit:
\[\lim_{(x,y) \to (5,-2)} \sqrt{29 - x^2 - y^2}.\]

Hint
Determine the domain of \(f(x, y) = \sqrt{29 - x^2 - y^2}\).
Answer
\[\lim_{(x,y) \to (5,-2)} \sqrt{29 - x^2 - y^2} = 0\]

Continuity of Functions of Two Variables


In Continuity, we defined the continuity of a function of one variable and saw how it relied on the limit of a function of one variable. In particular, three conditions are necessary for \(f(x)\) to be continuous at point \(x = a\):
1. \(f(a)\) exists.
2. \(\displaystyle \lim_{x \to a} f(x)\) exists.
3. \(\displaystyle \lim_{x \to a} f(x) = f(a).\)

These three conditions are necessary for continuity of a function of two variables as well.

Definition: continuous Functions

A function \(f(x, y)\) is continuous at a point \((a, b)\) in its domain if the following conditions are satisfied:
1. \(f(a, b)\) exists.
2. \(\displaystyle \lim_{(x,y) \to (a,b)} f(x, y)\) exists.
3. \(\displaystyle \lim_{(x,y) \to (a,b)} f(x, y) = f(a, b).\)

Example 14.2.4: Demonstrating Continuity for a Function of Two Variables

Show that the function
\[f(x, y) = \frac{3x + 2y}{x + y + 1}\]
is continuous at point \((5, -3)\).

Solution
There are three conditions to be satisfied, per the definition of continuity. In this example, \(a = 5\) and \(b = -3\).
1. \(f(a, b)\) exists. This is true because the domain of the function \(f\) consists of those ordered pairs for which the denominator is nonzero (i.e., \(x + y + 1 \neq 0\)). Point \((5, -3)\) satisfies this condition. Furthermore,
\[f(a, b) = f(5, -3) = \frac{3(5) + 2(-3)}{5 + (-3) + 1} = \frac{15 - 6}{2 + 1} = 3.\]
2. \(\displaystyle \lim_{(x,y) \to (a,b)} f(x, y)\) exists. This is also true:
\begin{align*}
\lim_{(x,y) \to (a,b)} f(x, y) &= \lim_{(x,y) \to (5,-3)} \frac{3x + 2y}{x + y + 1} \\
&= \frac{\lim_{(x,y) \to (5,-3)} (3x + 2y)}{\lim_{(x,y) \to (5,-3)} (x + y + 1)} \\
&= \frac{15 - 6}{5 - 3 + 1} \\
&= 3.
\end{align*}
3. \(\displaystyle \lim_{(x,y) \to (a,b)} f(x, y) = f(a, b)\). This is true because we have just shown that both sides of this equation equal three.

Exercise 14.2.4

Show that the function
\[f(x, y) = \sqrt{26 - 2x^2 - y^2}\]
is continuous at point \((2, -3)\).

Hint
Use the three-part definition of continuity.
Answer
1. The domain of \(f\) contains the ordered pair \((2, -3)\) because \(f(a, b) = f(2, -3) = \sqrt{26 - 2(2)^2 - (-3)^2} = 3\).
2. \(\displaystyle \lim_{(x,y) \to (a,b)} f(x, y) = 3\)
3. \(\displaystyle \lim_{(x,y) \to (a,b)} f(x, y) = f(a, b) = 3\)

Continuity of a function of any number of variables can also be defined in terms of delta and epsilon. A function of two variables is continuous at a point \((x_0, y_0)\) in its domain if for every \(\varepsilon > 0\) there exists a \(\delta > 0\) such that, whenever \(\sqrt{(x - x_0)^2 + (y - y_0)^2} < \delta\), it is true that \(|f(x, y) - f(x_0, y_0)| < \varepsilon\). This definition can be combined with the formal definition (that is, the epsilon–delta definition) of continuity of a function of one variable to prove the following theorems:

The Sum of Continuous Functions Is Continuous

If \(f(x, y)\) is continuous at \((x_0, y_0)\), and \(g(x, y)\) is continuous at \((x_0, y_0)\), then \(f(x, y) + g(x, y)\) is continuous at \((x_0, y_0)\).

The Product of Continuous Functions Is Continuous

If \(g(x)\) is continuous at \(x_0\) and \(h(y)\) is continuous at \(y_0\), then \(f(x, y) = g(x)h(y)\) is continuous at \((x_0, y_0)\).

The Composition of Continuous Functions Is Continuous

Let \(g\) be a function of two variables from a domain \(D \subseteq \mathbb{R}^2\) to a range \(R \subseteq \mathbb{R}\). Suppose \(g\) is continuous at some point \((x_0, y_0) \in D\) and define \(z_0 = g(x_0, y_0)\). Let \(f\) be a function that maps \(\mathbb{R}\) to \(\mathbb{R}\) such that \(z_0\) is in the domain of \(f\). Last, assume \(f\) is continuous at \(z_0\). Then \(f \circ g\) is continuous at \((x_0, y_0)\) as shown in Figure 14.2.7.

Figure 14.2.7: The composition of two continuous functions is continuous.

Let’s now use the previous theorems to show continuity of functions in the following examples.

Example 14.2.5: More Examples of Continuity of a Function of Two Variables

Show that the functions \(f(x, y) = 4x^3y^2\) and \(g(x, y) = \cos(4x^3y^2)\) are continuous everywhere.
Solution
The polynomials \(g(x) = 4x^3\) and \(h(y) = y^2\) are continuous at every real number, and therefore by the product of continuous functions theorem, \(f(x, y) = 4x^3y^2\) is continuous at every point \((x, y)\) in the \(xy\)-plane. Since \(f(x, y) = 4x^3y^2\) is continuous at every point \((x, y)\) in the \(xy\)-plane and \(g(x) = \cos x\) is continuous at every real number \(x\), the continuity of the composition of functions tells us that \(g(x, y) = \cos(4x^3y^2)\) is continuous at every point \((x, y)\) in the \(xy\)-plane.

Exercise 14.2.5

Show that the functions \(f(x, y) = 2x^2y^3 + 3\) and \(g(x, y) = (2x^2y^3 + 3)^4\) are continuous everywhere.

Hint
Use the continuity of the sum, product, and composition of two functions.
Answer
The polynomials \(g(x) = 2x^2\) and \(h(y) = y^3\) are continuous at every real number; therefore, by the product of continuous functions theorem, \(f(x, y) = 2x^2y^3\) is continuous at every point \((x, y)\) in the \(xy\)-plane. Furthermore, any constant function is continuous everywhere, so \(g(x, y) = 3\) is continuous at every point \((x, y)\) in the \(xy\)-plane. Therefore, \(f(x, y) = 2x^2y^3 + 3\) is continuous at every point \((x, y)\) in the \(xy\)-plane. Last, \(h(x) = x^4\) is continuous at every real number \(x\), so by the continuity of composite functions theorem \(g(x, y) = (2x^2y^3 + 3)^4\) is continuous at every point \((x, y)\) in the \(xy\)-plane.

Functions of Three or More Variables


The limit of a function of three or more variables occurs readily in applications. For example, suppose we have a function \(f(x, y, z)\) that gives the temperature at a physical location \((x, y, z)\) in three dimensions. Or perhaps a function \(g(x, y, z, t)\) can indicate air pressure at a location \((x, y, z)\) at time \(t\). How can we take a limit at a point in \(\mathbb{R}^3\)? What does it mean to be continuous at a point in four dimensions?

The answers to these questions rely on extending the concept of a δ disk into more than two dimensions. Then, the ideas of the limit of a function of three or more variables and the continuity of a function of three or more variables are very similar to the definitions given earlier for a function of two variables.

Definition: δ-balls

Let \((x_0, y_0, z_0)\) be a point in \(\mathbb{R}^3\). Then, a δ-ball in three dimensions consists of all points in \(\mathbb{R}^3\) lying at a distance of less than δ from \((x_0, y_0, z_0)\), that is,

\[\{(x, y, z) \in \mathbb{R}^3 \mid \sqrt{(x - x_0)^2 + (y - y_0)^2 + (z - z_0)^2} < \delta\}.\]

To define a δ-ball in higher dimensions, add additional terms under the radical to correspond to each additional dimension. For example, given a point \(P = (w_0, x_0, y_0, z_0)\) in \(\mathbb{R}^4\), a δ ball around \(P\) can be described by

\[\{(w, x, y, z) \in \mathbb{R}^4 \mid \sqrt{(w - w_0)^2 + (x - x_0)^2 + (y - y_0)^2 + (z - z_0)^2} < \delta\}.\]
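
Membership in a δ ball is the same distance comparison in every dimension. A minimal sketch follows, assuming Python 3.8 or later for `math.dist`; the function name `in_delta_ball` and the sample points are hypothetical.

```python
import math

def in_delta_ball(point, center, delta):
    """Return True if point lies in the open δ ball around center (any dimension)."""
    if len(point) != len(center):
        raise ValueError("point and center must have the same dimension")
    # math.dist computes the square root of the sum of squared coordinate differences.
    return math.dist(point, center) < delta

# A point in R^3 at distance about 0.1 from the center, tested with δ = 0.5.
print(in_delta_ball((1.0, 2.0, 2.9), (1.0, 2.0, 3.0), 0.5))
# A point in R^4 at distance exactly 1 from the center, tested with δ = 1.
print(in_delta_ball((0.0, 0.0, 0.0, 1.0), (0.0, 0.0, 0.0, 0.0), 1.0))
```

The second point is excluded because the inequality is strict: a δ ball is open and does not contain the points at distance exactly δ.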

To show that a limit of a function of three variables exists at a point \((x_0, y_0, z_0)\), it suffices to show that for any point in a δ ball centered at \((x_0, y_0, z_0)\), the value of the function at that point is arbitrarily close to a fixed value (the limit value). All the limit laws for functions of two variables hold for functions of more than two variables as well.

Example 14.2.6: Finding the Limit of a Function of Three Variables

Find
\[\lim_{(x,y,z) \to (4,1,-3)} \frac{x^2y - 3z}{2x + 5y - z}.\]

Solution
Before we can apply the quotient law, we need to verify that the limit of the denominator is nonzero. Using the sum and difference laws, the constant multiple law, and the identity law,
\begin{align*}
\lim_{(x,y,z) \to (4,1,-3)} (2x + 5y - z) &= 2\left(\lim_{(x,y,z) \to (4,1,-3)} x\right) + 5\left(\lim_{(x,y,z) \to (4,1,-3)} y\right) - \left(\lim_{(x,y,z) \to (4,1,-3)} z\right) \\
&= 2(4) + 5(1) - (-3) \\
&= 16.
\end{align*}

Since this is nonzero, we next find the limit of the numerator. Using the product law, power law, difference law, constant multiple law, and identity law,
\begin{align*}
\lim_{(x,y,z) \to (4,1,-3)} (x^2y - 3z) &= \left(\lim_{(x,y,z) \to (4,1,-3)} x\right)^2 \left(\lim_{(x,y,z) \to (4,1,-3)} y\right) - 3 \lim_{(x,y,z) \to (4,1,-3)} z \\
&= (4^2)(1) - 3(-3) \\
&= 16 + 9 \\
&= 25.
\end{align*}

Last, applying the quotient law:
\[\lim_{(x,y,z) \to (4,1,-3)} \frac{x^2y - 3z}{2x + 5y - z} = \frac{\lim_{(x,y,z) \to (4,1,-3)} (x^2y - 3z)}{\lim_{(x,y,z) \to (4,1,-3)} (2x + 5y - z)} = \frac{25}{16}.\]
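
The order of the steps in this solution matters: the denominator limit is checked before the quotient law is used. The sketch below mirrors that order with exact rational arithmetic; the helper name `quotient_law` is hypothetical, and the two limits are computed by direct substitution, which the limit laws justify for polynomials.

```python
from fractions import Fraction

def quotient_law(numerator_limit, denominator_limit):
    """Apply the quotient law, which requires a nonzero denominator limit."""
    if denominator_limit == 0:
        raise ValueError("quotient law does not apply: denominator limit is zero")
    return Fraction(numerator_limit) / Fraction(denominator_limit)

x, y, z = 4, 1, -3
denominator_limit = 2 * x + 5 * y - z   # limit of the denominator at (4, 1, -3)
numerator_limit = x**2 * y - 3 * z      # limit of the numerator at (4, 1, -3)

print(denominator_limit, numerator_limit)
print(quotient_law(numerator_limit, denominator_limit))
```

Using `Fraction` keeps the result as an exact ratio instead of a rounded decimal.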

Exercise 14.2.6

Find
\[\lim_{(x,y,z) \to (4,-1,3)} \sqrt{13 - x^2 - 2y^2 + z^2}\]

Hint
Use the limit laws and the continuity of the composition of functions.
Answer
\[\lim_{(x,y,z) \to (4,-1,3)} \sqrt{13 - x^2 - 2y^2 + z^2} = 2\]

Key Concepts
To study limits and continuity for functions of two variables, we use a δ disk centered around a given point.
A function of several variables has a limit if for any point in a δ ball centered at a point P , the value of the function at that point
is arbitrarily close to a fixed value (the limit value).
The limit laws established for a function of one variable have natural extensions to functions of more than one variable.
A function of two variables is continuous at a point if the limit exists at that point, the function exists at that point, and the limit
and function are equal at that point.

Glossary
boundary point
a point \(P_0\) of \(R\) is a boundary point if every δ disk centered around \(P_0\) contains points both inside and outside \(R\)

closed set
a set \(S\) that contains all its boundary points

connected set
an open set \(S\) that cannot be represented as the union of two or more disjoint, nonempty open subsets

δ disk
an open disk of radius δ centered at point \((a, b)\)

δ ball
all points in \(\mathbb{R}^3\) lying at a distance of less than δ from \((x_0, y_0, z_0)\)

interior point
a point \(P_0\) of \(R\) is an interior point if there is a δ disk centered around \(P_0\) contained completely in \(R\)

open set
a set \(S\) that contains none of its boundary points

region
an open, connected, nonempty subset of \(\mathbb{R}^2\)

14.2: Limits and Continuity is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
14.2: Limits and Continuity by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0. Original source:
https://openstax.org/details/books/calculus-volume-1.

14.3: Partial Derivatives
Learning Objectives
Calculate the partial derivatives of a function of two variables.
Calculate the partial derivatives of a function of more than two variables.
Determine the higher-order derivatives of a function of two variables.
Explain the meaning of a partial differential equation and give an example.

Now that we have examined limits and continuity of functions of two variables, we can proceed to study derivatives. Finding
derivatives of functions of two variables is the key concept in this chapter, with as many applications in mathematics, science, and
engineering as differentiation of single-variable functions. However, we have already seen that limits and continuity of multivariable
functions have new issues and require new terminology and ideas to deal with them. This carries over into differentiation as well.

Derivatives of a Function of Two Variables


When studying derivatives of functions of one variable, we found that one interpretation of the derivative is an instantaneous rate of
change of y as a function of x. Leibniz notation for the derivative is dy/dx, which implies that y is the dependent variable and x is the
independent variable. For a function z = f (x, y) of two variables, x and y are the independent variables and z is the dependent
variable. This raises two questions right away: How do we adapt Leibniz notation for functions of two variables? Also, what is an
interpretation of the derivative? The answer lies in partial derivatives.

Definition: Partial Derivatives

Let \(f(x, y)\) be a function of two variables. Then the partial derivative of \(f\) with respect to \(x\), written as \(\partial f/\partial x\), or \(f_x\), is defined as
\[\frac{\partial f}{\partial x} = f_x(x, y) = \lim_{h \to 0} \frac{f(x + h, y) - f(x, y)}{h}. \tag{14.3.1}\]

The partial derivative of \(f\) with respect to \(y\), written as \(\partial f/\partial y\), or \(f_y\), is defined as
\[\frac{\partial f}{\partial y} = f_y(x, y) = \lim_{k \to 0} \frac{f(x, y + k) - f(x, y)}{k}. \tag{14.3.2}\]

This definition shows two differences already. First, the notation changes, in the sense that we still use a version of Leibniz notation, but
the d in the original notation is replaced with the symbol ∂ . (This rounded “d” is usually called “partial,” so ∂f /∂x is spoken as the
“partial of f with respect to x.”) This is the first hint that we are dealing with partial derivatives. Second, we now have two different
derivatives we can take, since there are two different independent variables. Depending on which variable we choose, we can come up
with different partial derivatives altogether, and often do.

Example 14.3.1: Calculating Partial Derivatives from the Definition

Use the definition of the partial derivative as a limit to calculate \(\partial f/\partial x\) and \(\partial f/\partial y\) for the function
\[f(x, y) = x^2 - 3xy + 2y^2 - 4x + 5y - 12.\]

Solution
First, calculate \(f(x + h, y)\).
\begin{align*}
f(x + h, y) &= (x + h)^2 - 3(x + h)y + 2y^2 - 4(x + h) + 5y - 12 \\
&= x^2 + 2xh + h^2 - 3xy - 3hy + 2y^2 - 4x - 4h + 5y - 12.
\end{align*}

Next, substitute this into Equation 14.3.1 and simplify:

14.3.1 https://math.libretexts.org/@go/page/4538
\begin{align*}
\frac{\partial f}{\partial x} &= \lim_{h \to 0} \frac{f(x + h, y) - f(x, y)}{h} \\
&= \lim_{h \to 0} \frac{(x^2 + 2xh + h^2 - 3xy - 3hy + 2y^2 - 4x - 4h + 5y - 12) - (x^2 - 3xy + 2y^2 - 4x + 5y - 12)}{h} \\
&= \lim_{h \to 0} \frac{x^2 + 2xh + h^2 - 3xy - 3hy + 2y^2 - 4x - 4h + 5y - 12 - x^2 + 3xy - 2y^2 + 4x - 5y + 12}{h} \\
&= \lim_{h \to 0} \frac{2xh + h^2 - 3hy - 4h}{h} \\
&= \lim_{h \to 0} \frac{h(2x + h - 3y - 4)}{h} \\
&= \lim_{h \to 0} (2x + h - 3y - 4) \\
&= 2x - 3y - 4.
\end{align*}

To calculate \(\dfrac{\partial f}{\partial y}\), first calculate \(f(x, y + h)\):
\begin{align*}
f(x, y + h) &= x^2 - 3x(y + h) + 2(y + h)^2 - 4x + 5(y + h) - 12 \\
&= x^2 - 3xy - 3xh + 2y^2 + 4yh + 2h^2 - 4x + 5y + 5h - 12.
\end{align*}

Next, substitute this into Equation 14.3.2 and simplify:
\begin{align*}
\frac{\partial f}{\partial y} &= \lim_{h \to 0} \frac{f(x, y + h) - f(x, y)}{h} \\
&= \lim_{h \to 0} \frac{(x^2 - 3xy - 3xh + 2y^2 + 4yh + 2h^2 - 4x + 5y + 5h - 12) - (x^2 - 3xy + 2y^2 - 4x + 5y - 12)}{h} \\
&= \lim_{h \to 0} \frac{x^2 - 3xy - 3xh + 2y^2 + 4yh + 2h^2 - 4x + 5y + 5h - 12 - x^2 + 3xy - 2y^2 + 4x - 5y + 12}{h} \\
&= \lim_{h \to 0} \frac{-3xh + 4yh + 2h^2 + 5h}{h} \\
&= \lim_{h \to 0} \frac{h(-3x + 4y + 2h + 5)}{h} \\
&= \lim_{h \to 0} (-3x + 4y + 2h + 5) \\
&= -3x + 4y + 5.
\end{align*}
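
The limits in this example can be watched converging numerically. The sketch below evaluates the difference quotients from Equations 14.3.1 and 14.3.2 for shrinking step sizes at the sample point \((1, 2)\), which is chosen here only for illustration; the helper names `quotient_x` and `quotient_y` are hypothetical.

```python
def f(x, y):
    return x**2 - 3*x*y + 2*y**2 - 4*x + 5*y - 12

def quotient_x(x, y, h):
    """Difference quotient of Equation 14.3.1, before the limit is taken."""
    return (f(x + h, y) - f(x, y)) / h

def quotient_y(x, y, k):
    """Difference quotient of Equation 14.3.2, before the limit is taken."""
    return (f(x, y + k) - f(x, y)) / k

x0, y0 = 1.0, 2.0
fx_exact = 2*x0 - 3*y0 - 4      # formula derived in the example
fy_exact = -3*x0 + 4*y0 + 5

for step in (1e-1, 1e-2, 1e-3):
    print(step, quotient_x(x0, y0, step), quotient_y(x0, y0, step))
print(fx_exact, fy_exact)
```

For this polynomial the simplification in the solution shows that the quotients equal \(2x + h - 3y - 4\) and \(-3x + 4y + 2k + 5\), so the printed values differ from the exact partial derivatives by \(h\) and \(2k\) respectively, up to rounding.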

Exercise 14.3.1
Use the definition of the partial derivative as a limit to calculate \(\partial f/\partial x\) and \(\partial f/\partial y\) for the function
\[f(x, y) = 4x^2 + 2xy - y^2 + 3x - 2y + 5.\]

Hint
Use Equations 14.3.1 and 14.3.2 from the definition of partial derivatives.
Answer
\[\frac{\partial f}{\partial x} = 8x + 2y + 3\]
\[\frac{\partial f}{\partial y} = 2x - 2y - 2\]

The idea to keep in mind when calculating partial derivatives is to treat all independent variables, other than the variable with respect to which we are differentiating, as constants. Then proceed to differentiate as with a function of a single variable. To see why this is true, first fix \(y\) and define \(g(x) = f(x, y)\) as a function of \(x\). Then
\begin{align*}
g'(x) &= \lim_{h \to 0} \frac{g(x + h) - g(x)}{h} \\
&= \lim_{h \to 0} \frac{f(x + h, y) - f(x, y)}{h} \\
&= \frac{\partial f}{\partial x}.
\end{align*}

The same is true for calculating the partial derivative of \(f\) with respect to \(y\). This time, fix \(x\) and define \(h(y) = f(x, y)\) as a function of \(y\). Then
\begin{align*}
h'(y) &= \lim_{k \to 0} \frac{h(y + k) - h(y)}{k} \\
&= \lim_{k \to 0} \frac{f(x, y + k) - f(x, y)}{k} \\
&= \frac{\partial f}{\partial y}.
\end{align*}

All differentiation rules apply.

Example 14.3.2: Calculating Partial Derivatives

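Calculate \(\partial f/\partial x\) and \(\partial f/\partial y\) for the following functions by holding the opposite variable constant, then differentiating:
a. \(f(x, y) = x^2 - 3xy + 2y^2 - 4x + 5y - 12\)
b. \(g(x, y) = \sin(x^2y - 2x + 4)\)

Solution
a. To calculate \(\partial f/\partial x\), treat the variable \(y\) as a constant. Then differentiate \(f(x, y)\) with respect to \(x\) using the sum, difference, and power rules:
\begin{align*}
\frac{\partial f}{\partial x} &= \frac{\partial}{\partial x}\left[x^2 - 3xy + 2y^2 - 4x + 5y - 12\right] \\
&= \frac{\partial}{\partial x}[x^2] - \frac{\partial}{\partial x}[3xy] + \frac{\partial}{\partial x}[2y^2] - \frac{\partial}{\partial x}[4x] + \frac{\partial}{\partial x}[5y] - \frac{\partial}{\partial x}[12] \\
&= 2x - 3y + 0 - 4 + 0 - 0 \\
&= 2x - 3y - 4.
\end{align*}

The derivatives of the third, fifth, and sixth terms are all zero because they do not contain the variable \(x\), so they are treated as constant terms. The derivative of the second term is equal to the coefficient of \(x\), which is \(-3y\). Calculating \(\partial f/\partial y\):
\begin{align*}
\frac{\partial f}{\partial y} &= \frac{\partial}{\partial y}\left[x^2 - 3xy + 2y^2 - 4x + 5y - 12\right] \\
&= \frac{\partial}{\partial y}[x^2] - \frac{\partial}{\partial y}[3xy] + \frac{\partial}{\partial y}[2y^2] - \frac{\partial}{\partial y}[4x] + \frac{\partial}{\partial y}[5y] - \frac{\partial}{\partial y}[12] \\
&= 0 - 3x + 4y - 0 + 5 - 0 \\
&= -3x + 4y + 5.
\end{align*}

These are the same answers obtained in Example 14.3.1.

b. To calculate \(\partial g/\partial x\), treat the variable \(y\) as a constant. Then differentiate \(g(x, y)\) with respect to \(x\) using the chain rule and power rule:
\begin{align*}
\frac{\partial g}{\partial x} &= \frac{\partial}{\partial x}\left[\sin(x^2y - 2x + 4)\right] \\
&= \cos(x^2y - 2x + 4)\,\frac{\partial}{\partial x}\left[x^2y - 2x + 4\right] \\
&= (2xy - 2)\cos(x^2y - 2x + 4).
\end{align*}

To calculate \(\partial g/\partial y\), treat the variable \(x\) as a constant. Then differentiate \(g(x, y)\) with respect to \(y\) using the chain rule and power rule:
\begin{align*}
\frac{\partial g}{\partial y} &= \frac{\partial}{\partial y}\left[\sin(x^2y - 2x + 4)\right] \\
&= \cos(x^2y - 2x + 4)\,\frac{\partial}{\partial y}\left[x^2y - 2x + 4\right] \\
&= x^2\cos(x^2y - 2x + 4).
\end{align*}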
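
Formulas obtained with the chain rule are easy to get wrong by a factor, so a numerical cross-check can be useful. The sketch below compares the two results from part b with a symmetric (central) difference quotient. That quotient is a substitute for the one-sided quotient in the definition, chosen here because it is usually more accurate for a given step; the sample point and helper names are illustrative assumptions.

```python
import math

def g(x, y):
    return math.sin(x**2 * y - 2*x + 4)

def g_x(x, y):
    """Partial derivative with respect to x, as found in part b."""
    return (2*x*y - 2) * math.cos(x**2 * y - 2*x + 4)

def g_y(x, y):
    """Partial derivative with respect to y, as found in part b."""
    return x**2 * math.cos(x**2 * y - 2*x + 4)

def central_difference(func, x, y, wrt, h=1e-5):
    """Symmetric difference quotient in x (wrt='x') or in y (wrt='y')."""
    if wrt == "x":
        return (func(x + h, y) - func(x - h, y)) / (2 * h)
    return (func(x, y + h) - func(x, y - h)) / (2 * h)

x0, y0 = 0.7, -1.3
print(g_x(x0, y0), central_difference(g, x0, y0, "x"))
print(g_y(x0, y0), central_difference(g, x0, y0, "y"))
```

Each printed pair should agree to several decimal places; a mismatch would point to an algebra slip in the hand computation.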

Exercise 14.3.2

Calculate \(\partial f/\partial x\) and \(\partial f/\partial y\) for the function
\[f(x, y) = \tan(x^3 - 3x^2y^2 + 2y^4)\]
by holding the opposite variable constant, then differentiating.

Hint
Use Equations 14.3.1 and 14.3.2 from the definition of partial derivatives.
Answer
\[\frac{\partial f}{\partial x} = (3x^2 - 6xy^2)\sec^2(x^3 - 3x^2y^2 + 2y^4)\]
\[\frac{\partial f}{\partial y} = (-6x^2y + 8y^3)\sec^2(x^3 - 3x^2y^2 + 2y^4)\]
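
One way to reach these answers is the chain rule, as in Example 14.3.2. In the sketch below, the symbol \(u\) is introduced here as shorthand for the inner function and is not part of the original exercise:

```latex
\begin{align*}
u &= x^3 - 3x^2y^2 + 2y^4, \qquad f = \tan u, \\
\frac{\partial f}{\partial x} &= \sec^2(u)\,\frac{\partial u}{\partial x} = (3x^2 - 6xy^2)\sec^2(x^3 - 3x^2y^2 + 2y^4), \\
\frac{\partial f}{\partial y} &= \sec^2(u)\,\frac{\partial u}{\partial y} = (-6x^2y + 8y^3)\sec^2(x^3 - 3x^2y^2 + 2y^4).
\end{align*}
```

In each line the other variable is held constant while \(u\) is differentiated, so the term \(x^3\) contributes nothing to \(\partial u/\partial y\) and the term \(2y^4\) contributes nothing to \(\partial u/\partial x\).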

How can we interpret these partial derivatives? Recall that the graph of a function of two variables is a surface in \(\mathbb{R}^3\). If we remove the limit from the definition of the partial derivative with respect to \(x\), the difference quotient remains:
\[\frac{f(x + h, y) - f(x, y)}{h}.\]

This resembles the difference quotient for the derivative of a function of one variable, except for the presence of the \(y\) variable. Figure 14.3.1 illustrates a surface described by an arbitrary function \(z = f(x, y)\).

Figure 14.3.1 : Secant line passing through the points (x, y, f (x, y)) and (x + h, y, f (x + h, y)).

In Figure 14.3.1, the value of h is positive. If we graph f (x, y) and f (x + h, y) for an arbitrary point (x, y), then the slope of the secant
line passing through these two points is given by
\[ \frac{f(x + h, y) - f(x, y)}{h}. \]

This line is parallel to the x-axis. Therefore, the slope of the secant line represents an average rate of change of the function f as we
travel parallel to the x-axis. As h approaches zero, the slope of the secant line approaches the slope of the tangent line.
If we choose to change y instead of x by the same incremental value h, then the secant line is parallel to the y-axis and so is the tangent line. Therefore, ∂f/∂x represents the slope of the tangent line passing through the point (x, y, f(x, y)) parallel to the x-axis, and ∂f/∂y represents the slope of the tangent line passing through the point (x, y, f(x, y)) parallel to the y-axis. If we wish to find the slope of a tangent line passing through the same point in any other direction, then we need what are called directional derivatives.
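To see the secant slope approach the tangent slope numerically, the sketch below reuses the polynomial f(x, y) = x² − 3xy + 2y² − 4x + 5y − 12 from the earlier example. Differentiating term by term gives ∂f/∂x = 2x − 3y − 4. The sample point and the list of step sizes are assumptions chosen for illustration.

```python
def f(x, y):
    return x**2 - 3*x*y + 2*y**2 - 4*x + 5*y - 12

def secant_slope(x, y, h):
    """Slope of the secant line through (x, y, f(x, y)) and (x + h, y, f(x + h, y))."""
    return (f(x + h, y) - f(x, y)) / h

x0, y0 = 2.0, 1.0             # sample point (an assumption)
tangent_slope = 2*x0 - 3*y0 - 4  # partial derivative with respect to x at (x0, y0)

for h in (1.0, 0.1, 0.01, 0.001):
    print(h, secant_slope(x0, y0, h), tangent_slope)
```

For this particular polynomial the difference quotient simplifies to 2x − 3y − 4 + h, so the gap between the secant slope and the tangent slope is exactly h and vanishes as h approaches zero.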
We now return to the idea of contour maps, which we introduced in Functions of Several Variables. We can use a contour map to
estimate partial derivatives of a function g(x, y).

 Example 14.3.3: Partial Derivatives from a Contour Map



Use a contour map to estimate ∂g/∂x at the point \((\sqrt{5}, 0)\) for the function

\[ g(x, y) = \sqrt{9 - x^2 - y^2}. \]

Solution
Figure 14.3.2 represents a contour map for the function g(x, y).

Figure 14.3.2: Contour map for the function \(g(x, y) = \sqrt{9 - x^2 - y^2}\), using c = 0, 1, 2, and 3 (c = 3 corresponds to the origin).
The inner circle on the contour map corresponds to c = 2 and the next circle out corresponds to c = 1. The first circle is given by the equation \(2 = \sqrt{9 - x^2 - y^2}\); the second circle is given by the equation \(1 = \sqrt{9 - x^2 - y^2}\). The first equation simplifies to \(x^2 + y^2 = 5\) and the second equation simplifies to \(x^2 + y^2 = 8\). The x-intercept of the first circle is \((\sqrt{5}, 0)\) and the x-intercept of the second circle is \((2\sqrt{2}, 0)\). We can estimate the value of ∂g/∂x evaluated at the point \((\sqrt{5}, 0)\) using the slope formula:

\[
\begin{aligned}
\left.\frac{\partial g}{\partial x}\right|_{(x,y)=(\sqrt{5},0)} &\approx \frac{g(\sqrt{5}, 0) - g(2\sqrt{2}, 0)}{\sqrt{5} - 2\sqrt{2}} \\
&= \frac{2 - 1}{\sqrt{5} - 2\sqrt{2}} \\
&= \frac{1}{\sqrt{5} - 2\sqrt{2}} \approx -1.688.
\end{aligned}
\]


To calculate the exact value of ∂g/∂x evaluated at the point \((\sqrt{5}, 0)\), we start by finding ∂g/∂x using the chain rule. First, we rewrite the function as

\[ g(x, y) = \sqrt{9 - x^2 - y^2} = (9 - x^2 - y^2)^{1/2} \]

and then differentiate with respect to x while holding y constant:

\[
\begin{aligned}
\frac{\partial g}{\partial x} &= \frac{1}{2}(9 - x^2 - y^2)^{-1/2}(-2x) \\
&= -\frac{x}{\sqrt{9 - x^2 - y^2}}.
\end{aligned}
\]

Next, we evaluate this expression using \(x = \sqrt{5}\) and y = 0:

\[
\begin{aligned}
\left.\frac{\partial g}{\partial x}\right|_{(x,y)=(\sqrt{5},0)} &= -\frac{\sqrt{5}}{\sqrt{9 - (\sqrt{5})^2 - (0)^2}} \\
&= -\frac{\sqrt{5}}{\sqrt{4}} \\
&= -\frac{\sqrt{5}}{2} \approx -1.118.
\end{aligned}
\]

The estimate for the partial derivative corresponds to the slope of the secant line passing through the points \((\sqrt{5}, 0, g(\sqrt{5}, 0))\) and \((2\sqrt{2}, 0, g(2\sqrt{2}, 0))\). It represents an approximation to the slope of the tangent line to the surface through the point \((\sqrt{5}, 0, g(\sqrt{5}, 0))\), which is parallel to the x-axis.
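The two numbers in this example can be reproduced in a few lines. The sketch below computes the secant-slope estimate from the two contour intercepts and the exact value from the formula for ∂g/∂x. The variable names are chosen here for illustration.

```python
import math

def g(x, y):
    return math.sqrt(9 - x**2 - y**2)

x_inner = math.sqrt(5)       # x-intercept of the contour c = 2
x_outer = 2 * math.sqrt(2)   # x-intercept of the contour c = 1

# Secant slope between the two contour intercepts along the x-axis.
estimate = (g(x_inner, 0) - g(x_outer, 0)) / (x_inner - x_outer)

# Exact partial derivative -x / sqrt(9 - x^2 - y^2) at (sqrt(5), 0).
exact = -x_inner / math.sqrt(9 - x_inner**2 - 0**2)

print(round(estimate, 3))  # -1.688
print(round(exact, 3))     # -1.118
```

The estimate is fairly rough because neighboring contours are the only data used. Contours drawn at closer values of c would give a secant slope nearer to the exact value.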

 Exercise 14.3.3

Use a contour map to estimate ∂f/∂y at point \((0, \sqrt{2})\) for the function

\[ f(x, y) = x^2 - y^2. \]

Compare this with the exact answer.

Hint

Create a contour map for f using values of c from −3 to 3. Which of these curves passes through point (0, √2)?
Answer
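Using the curves corresponding to c = −2 and c = −3, we obtain

\[
\begin{aligned}
\left.\frac{\partial f}{\partial y}\right|_{(x,y)=(0,\sqrt{2})} &\approx \frac{f(0, \sqrt{3}) - f(0, \sqrt{2})}{\sqrt{3} - \sqrt{2}} \\
&= \frac{-3 - (-2)}{\sqrt{3} - \sqrt{2}} \cdot \frac{\sqrt{3} + \sqrt{2}}{\sqrt{3} + \sqrt{2}} \\
&= -\sqrt{3} - \sqrt{2} \approx -3.146.
\end{aligned}
\]

The exact answer is

\[ \left.\frac{\partial f}{\partial y}\right|_{(x,y)=(0,\sqrt{2})} = \left.(-2y)\right|_{(x,y)=(0,\sqrt{2})} = -2\sqrt{2} \approx -2.828. \]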

Functions of More Than Two Variables


Suppose we have a function of three variables, such as w = f (x, y, z). We can calculate partial derivatives of w with respect to any of
the independent variables, simply as extensions of the definitions for partial derivatives of functions of two variables.

 Definition: Partial Derivatives

Let f(x, y, z) be a function of three variables. Then, the partial derivative of f with respect to x, written as ∂f/∂x, or \(f_x\), is defined to be

\[ \frac{\partial f}{\partial x} = f_x(x, y, z) = \lim_{h \to 0} \frac{f(x + h, y, z) - f(x, y, z)}{h}. \tag{14.3.3} \]

The partial derivative of f with respect to y, written as ∂f/∂y, or \(f_y\), is defined to be

\[ \frac{\partial f}{\partial y} = f_y(x, y, z) = \lim_{k \to 0} \frac{f(x, y + k, z) - f(x, y, z)}{k}. \tag{14.3.4} \]

The partial derivative of f with respect to z, written as ∂f/∂z, or \(f_z\), is defined to be

\[ \frac{\partial f}{\partial z} = f_z(x, y, z) = \lim_{m \to 0} \frac{f(x, y, z + m) - f(x, y, z)}{m}. \tag{14.3.5} \]

We can calculate a partial derivative of a function of three variables using the same idea we used for a function of two variables. For
example, if we have a function f of x, y, and z , and we wish to calculate ∂f /∂x, then we treat the other two independent variables as if
they are constants, then differentiate with respect to x.

 Example 14.3.4: Calculating Partial Derivatives for a Function of Three Variables


Use the limit definition of partial derivatives to calculate ∂f/∂x for the function

\[ f(x, y, z) = x^2 - 3xy + 2y^2 - 4xz + 5yz^2 - 12x + 4y - 3z. \]

Then, find ∂f /∂y and ∂f /∂z by setting the other two variables constant and differentiating accordingly.
Solution:
We first calculate ∂f/∂x using Equation 14.3.3, then we calculate the other two partial derivatives by holding the remaining variables constant. To use the equation to find ∂f/∂x, we first need to calculate f(x + h, y, z):

\[
\begin{aligned}
f(x + h, y, z) &= (x + h)^2 - 3(x + h)y + 2y^2 - 4(x + h)z + 5yz^2 - 12(x + h) + 4y - 3z \\
&= x^2 + 2xh + h^2 - 3xy - 3hy + 2y^2 - 4xz - 4hz + 5yz^2 - 12x - 12h + 4y - 3z
\end{aligned}
\]

and recall that \(f(x, y, z) = x^2 - 3xy + 2y^2 - 4xz + 5yz^2 - 12x + 4y - 3z\). Next, we substitute these two expressions into the equation, subtracting all of f(x, y, z):

\[
\begin{aligned}
\frac{\partial f}{\partial x} &= \lim_{h \to 0}\left[\frac{\left(x^2 + 2xh + h^2 - 3xy - 3hy + 2y^2 - 4xz - 4hz + 5yz^2 - 12x - 12h + 4y - 3z\right) - \left(x^2 - 3xy + 2y^2 - 4xz + 5yz^2 - 12x + 4y - 3z\right)}{h}\right] \\
&= \lim_{h \to 0}\left[\frac{2xh + h^2 - 3hy - 4hz - 12h}{h}\right] \\
&= \lim_{h \to 0}\left[\frac{h(2x + h - 3y - 4z - 12)}{h}\right] \\
&= \lim_{h \to 0}(2x + h - 3y - 4z - 12) \\
&= 2x - 3y - 4z - 12.
\end{aligned}
\]

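Then we find ∂f/∂y by holding x and z constant. Therefore, any term that does not include the variable y is constant, and its derivative is zero. We can apply the sum, difference, and power rules for functions of one variable:

\[
\begin{aligned}
\frac{\partial f}{\partial y} &= \frac{\partial}{\partial y}\left[x^2 - 3xy + 2y^2 - 4xz + 5yz^2 - 12x + 4y - 3z\right] \\
&= \frac{\partial}{\partial y}[x^2] - \frac{\partial}{\partial y}[3xy] + \frac{\partial}{\partial y}[2y^2] - \frac{\partial}{\partial y}[4xz] + \frac{\partial}{\partial y}[5yz^2] - \frac{\partial}{\partial y}[12x] + \frac{\partial}{\partial y}[4y] - \frac{\partial}{\partial y}[3z] \\
&= 0 - 3x + 4y - 0 + 5z^2 - 0 + 4 - 0 \\
&= -3x + 4y + 5z^2 + 4.
\end{aligned}
\]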

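To calculate ∂f/∂z, we hold x and y constant and apply the sum, difference, and power rules for functions of one variable:

\[
\begin{aligned}
\frac{\partial f}{\partial z} &= \frac{\partial}{\partial z}\left[x^2 - 3xy + 2y^2 - 4xz + 5yz^2 - 12x + 4y - 3z\right] \\
&= \frac{\partial}{\partial z}[x^2] - \frac{\partial}{\partial z}[3xy] + \frac{\partial}{\partial z}[2y^2] - \frac{\partial}{\partial z}[4xz] + \frac{\partial}{\partial z}[5yz^2] - \frac{\partial}{\partial z}[12x] + \frac{\partial}{\partial z}[4y] - \frac{\partial}{\partial z}[3z] \\
&= 0 - 0 + 0 - 4x + 10yz - 0 + 0 - 3 \\
&= -4x + 10yz - 3
\end{aligned}
\]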

 Exercise 14.3.4

Use the limit definition of partial derivatives to calculate ∂f/∂x for the function

\[ f(x, y, z) = 2x^2 - 4x^2y + 2y^2 + 5xz^2 - 6x + 3z - 8. \]

Then find ∂f /∂y and ∂f /∂z by setting the other two variables constant and differentiating accordingly.

Hint
Use the strategy in the preceding example.
Answer
\[ \frac{\partial f}{\partial x} = 4x - 8xy + 5z^2 - 6, \qquad \frac{\partial f}{\partial y} = -4x^2 + 4y, \qquad \frac{\partial f}{\partial z} = 10xz + 3 \]

 Example 14.3.5: Calculating Partial Derivatives for a Function of Three Variables

Calculate the three partial derivatives of the following functions.

a. \( f(x, y, z) = \dfrac{x^2y - 4xz + y^2}{x - 3yz} \)

b. \( g(x, y, z) = \sin(x^2y - z) + \cos(x^2 - yz) \)

Solution
In each case, treat all variables as constants except the one whose partial derivative you are calculating.
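a.
\[
\begin{aligned}
\frac{\partial f}{\partial x} &= \frac{\partial}{\partial x}\left[\frac{x^2y - 4xz + y^2}{x - 3yz}\right] \\
&= \frac{\frac{\partial}{\partial x}(x^2y - 4xz + y^2)\,(x - 3yz) - (x^2y - 4xz + y^2)\,\frac{\partial}{\partial x}(x - 3yz)}{(x - 3yz)^2} \\
&= \frac{(2xy - 4z)(x - 3yz) - (x^2y - 4xz + y^2)(1)}{(x - 3yz)^2} \\
&= \frac{2x^2y - 6xy^2z - 4xz + 12yz^2 - x^2y + 4xz - y^2}{(x - 3yz)^2} \\
&= \frac{x^2y - 6xy^2z + 12yz^2 - y^2}{(x - 3yz)^2}
\end{aligned}
\]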

\[
\begin{aligned}
\frac{\partial f}{\partial y} &= \frac{\partial}{\partial y}\left[\frac{x^2y - 4xz + y^2}{x - 3yz}\right] \\
&= \frac{\frac{\partial}{\partial y}(x^2y - 4xz + y^2)\,(x - 3yz) - (x^2y - 4xz + y^2)\,\frac{\partial}{\partial y}(x - 3yz)}{(x - 3yz)^2} \\
&= \frac{(x^2 + 2y)(x - 3yz) - (x^2y - 4xz + y^2)(-3z)}{(x - 3yz)^2} \\
&= \frac{x^3 - 3x^2yz + 2xy - 6y^2z + 3x^2yz - 12xz^2 + 3y^2z}{(x - 3yz)^2} \\
&= \frac{x^3 + 2xy - 3y^2z - 12xz^2}{(x - 3yz)^2}
\end{aligned}
\]

\[
\begin{aligned}
\frac{\partial f}{\partial z} &= \frac{\partial}{\partial z}\left[\frac{x^2y - 4xz + y^2}{x - 3yz}\right] \\
&= \frac{\frac{\partial}{\partial z}(x^2y - 4xz + y^2)\,(x - 3yz) - (x^2y - 4xz + y^2)\,\frac{\partial}{\partial z}(x - 3yz)}{(x - 3yz)^2} \\
&= \frac{(-4x)(x - 3yz) - (x^2y - 4xz + y^2)(-3y)}{(x - 3yz)^2} \\
&= \frac{-4x^2 + 12xyz + 3x^2y^2 - 12xyz + 3y^3}{(x - 3yz)^2} \\
&= \frac{-4x^2 + 3x^2y^2 + 3y^3}{(x - 3yz)^2}
\end{aligned}
\]

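b.
\[
\begin{aligned}
\frac{\partial g}{\partial x} &= \frac{\partial}{\partial x}\left[\sin(x^2y - z) + \cos(x^2 - yz)\right] \\
&= \cos(x^2y - z)\,\frac{\partial}{\partial x}(x^2y - z) - \sin(x^2 - yz)\,\frac{\partial}{\partial x}(x^2 - yz) \\
&= 2xy\cos(x^2y - z) - 2x\sin(x^2 - yz)
\end{aligned}
\]

\[
\begin{aligned}
\frac{\partial g}{\partial y} &= \frac{\partial}{\partial y}\left[\sin(x^2y - z) + \cos(x^2 - yz)\right] \\
&= \cos(x^2y - z)\,\frac{\partial}{\partial y}(x^2y - z) - \sin(x^2 - yz)\,\frac{\partial}{\partial y}(x^2 - yz) \\
&= x^2\cos(x^2y - z) + z\sin(x^2 - yz)
\end{aligned}
\]

\[
\begin{aligned}
\frac{\partial g}{\partial z} &= \frac{\partial}{\partial z}\left[\sin(x^2y - z) + \cos(x^2 - yz)\right] \\
&= \cos(x^2y - z)\,\frac{\partial}{\partial z}(x^2y - z) - \sin(x^2 - yz)\,\frac{\partial}{\partial z}(x^2 - yz) \\
&= -\cos(x^2y - z) + y\sin(x^2 - yz)
\end{aligned}
\]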

 Exercise 14.3.5

Calculate ∂f/∂x, ∂f/∂y, and ∂f/∂z for the function

\[ f(x, y, z) = \sec(x^2y) - \tan(x^3yz^2). \]

Hint
Use the strategy in the preceding example.
Answer
\[ \frac{\partial f}{\partial x} = 2xy\sec(x^2y)\tan(x^2y) - 3x^2yz^2\sec^2(x^3yz^2) \]

\[ \frac{\partial f}{\partial y} = x^2\sec(x^2y)\tan(x^2y) - x^3z^2\sec^2(x^3yz^2) \]

\[ \frac{\partial f}{\partial z} = -2x^3yz\sec^2(x^3yz^2) \]

Higher-Order Partial Derivatives


Consider the function

\[ f(x, y) = 2x^3 - 4xy^2 + 5y^3 - 6xy + 5x - 4y + 12. \]

Its partial derivatives are

\[ \frac{\partial f}{\partial x} = 6x^2 - 4y^2 - 6y + 5 \]

and

\[ \frac{\partial f}{\partial y} = -8xy + 15y^2 - 6x - 4. \]

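Each of these partial derivatives is a function of two variables, so we can calculate partial derivatives of these functions. Just as with derivatives of single-variable functions, we can call these second-order derivatives, third-order derivatives, and so on. In general, they are referred to as higher-order partial derivatives. There are four second-order partial derivatives for any function of two variables (provided they all exist):

\[
\begin{aligned}
\frac{\partial^2 f}{\partial x^2} &= \frac{\partial}{\partial x}\left[\frac{\partial f}{\partial x}\right] \\
\frac{\partial^2 f}{\partial y\,\partial x} &= \frac{\partial}{\partial y}\left[\frac{\partial f}{\partial x}\right] \\
\frac{\partial^2 f}{\partial x\,\partial y} &= \frac{\partial}{\partial x}\left[\frac{\partial f}{\partial y}\right] \\
\frac{\partial^2 f}{\partial y^2} &= \frac{\partial}{\partial y}\left[\frac{\partial f}{\partial y}\right].
\end{aligned}
\]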

An alternative notation for each is \(f_{xx}\), \(f_{xy}\), \(f_{yx}\), and \(f_{yy}\), respectively. Higher-order partial derivatives calculated with respect to different variables, such as \(f_{xy}\) and \(f_{yx}\), are commonly called mixed partial derivatives.

 Example 14.3.6: Calculating Second Partial Derivatives

Calculate all four second partial derivatives for the function

\[ f(x, y) = xe^{-3y} + \sin(2x - 5y). \tag{14.3.6} \]

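Solution:
To calculate \(\frac{\partial^2 f}{\partial x^2}\) and \(\frac{\partial^2 f}{\partial y\,\partial x}\), we first calculate ∂f/∂x:

\[ \frac{\partial f}{\partial x} = e^{-3y} + 2\cos(2x - 5y). \tag{14.3.7} \]

To calculate \(\frac{\partial^2 f}{\partial x^2}\), differentiate ∂f/∂x (Equation 14.3.7) with respect to x:

\[
\begin{aligned}
\frac{\partial^2 f}{\partial x^2} &= \frac{\partial}{\partial x}\left[\frac{\partial f}{\partial x}\right] \\
&= \frac{\partial}{\partial x}\left[e^{-3y} + 2\cos(2x - 5y)\right] \\
&= -4\sin(2x - 5y).
\end{aligned}
\]

To calculate \(\frac{\partial^2 f}{\partial y\,\partial x}\), differentiate ∂f/∂x (Equation 14.3.7) with respect to y:

\[
\begin{aligned}
\frac{\partial^2 f}{\partial y\,\partial x} &= \frac{\partial}{\partial y}\left[\frac{\partial f}{\partial x}\right] \\
&= \frac{\partial}{\partial y}\left[e^{-3y} + 2\cos(2x - 5y)\right] \\
&= -3e^{-3y} + 10\sin(2x - 5y).
\end{aligned}
\]

To calculate \(\frac{\partial^2 f}{\partial x\,\partial y}\) and \(\frac{\partial^2 f}{\partial y^2}\), first calculate ∂f/∂y:

\[ \frac{\partial f}{\partial y} = -3xe^{-3y} - 5\cos(2x - 5y). \tag{14.3.8} \]

To calculate \(\frac{\partial^2 f}{\partial x\,\partial y}\), differentiate ∂f/∂y (Equation 14.3.8) with respect to x:

\[
\begin{aligned}
\frac{\partial^2 f}{\partial x\,\partial y} &= \frac{\partial}{\partial x}\left[\frac{\partial f}{\partial y}\right] \\
&= \frac{\partial}{\partial x}\left[-3xe^{-3y} - 5\cos(2x - 5y)\right] \\
&= -3e^{-3y} + 10\sin(2x - 5y).
\end{aligned}
\]

To calculate \(\frac{\partial^2 f}{\partial y^2}\), differentiate ∂f/∂y (Equation 14.3.8) with respect to y:

\[
\begin{aligned}
\frac{\partial^2 f}{\partial y^2} &= \frac{\partial}{\partial y}\left[\frac{\partial f}{\partial y}\right] \\
&= \frac{\partial}{\partial y}\left[-3xe^{-3y} - 5\cos(2x - 5y)\right] \\
&= 9xe^{-3y} - 25\sin(2x - 5y).
\end{aligned}
\]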
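As a sanity check on the four results, we can start from the first partial derivatives in Equations 14.3.7 and 14.3.8 and differentiate each one numerically in both variables. In this sketch, the helper names `d_dx` and `d_dy`, the step size, and the sample point are assumptions chosen for illustration.

```python
import math

def f_x(x, y):  # Equation 14.3.7
    return math.exp(-3*y) + 2*math.cos(2*x - 5*y)

def f_y(x, y):  # Equation 14.3.8
    return -3*x*math.exp(-3*y) - 5*math.cos(2*x - 5*y)

def d_dx(func, x, y, h=1e-5):
    """Central difference in x with y held constant."""
    return (func(x + h, y) - func(x - h, y)) / (2 * h)

def d_dy(func, x, y, h=1e-5):
    """Central difference in y with x held constant."""
    return (func(x, y + h) - func(x, y - h)) / (2 * h)

x0, y0 = 0.4, 0.2  # arbitrary sample point (an assumption)

f_xx = d_dx(f_x, x0, y0)
f_xy = d_dy(f_x, x0, y0)  # differentiate with respect to x first, then y
f_yx = d_dx(f_y, x0, y0)  # differentiate with respect to y first, then x
f_yy = d_dy(f_y, x0, y0)

s = math.sin(2*x0 - 5*y0)
e = math.exp(-3*y0)
print(f_xx, -4*s)
print(f_xy, f_yx, -3*e + 10*s)
print(f_yy, 9*x0*e - 25*s)
```

The two mixed partial derivatives are computed along different routes here, so their agreement is a genuine check rather than a consequence of using one symmetric formula.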

 Exercise 14.3.6

Calculate all four second partial derivatives for the function

f (x, y) = sin(3x − 2y) + cos(x + 4y).

Hint
Follow the same steps as in the previous example.
Answer
\[ \frac{\partial^2 f}{\partial x^2} = -9\sin(3x - 2y) - \cos(x + 4y) \]

\[ \frac{\partial^2 f}{\partial y\,\partial x} = 6\sin(3x - 2y) - 4\cos(x + 4y) \]

\[ \frac{\partial^2 f}{\partial x\,\partial y} = 6\sin(3x - 2y) - 4\cos(x + 4y) \]

\[ \frac{\partial^2 f}{\partial y^2} = -4\sin(3x - 2y) - 16\cos(x + 4y) \]

At this point we should notice that, in both Example 14.3.6 and Exercise 14.3.6, it was true that \(\frac{\partial^2 f}{\partial y\,\partial x} = \frac{\partial^2 f}{\partial x\,\partial y}\). Under certain conditions, this is always true. In fact, it is a direct consequence of the following theorem.

 Equality of Mixed Partial Derivatives (Clairaut’s Theorem)

Suppose that f(x, y) is defined on an open disk D that contains the point (a, b). If the functions \(f_{xy}\) and \(f_{yx}\) are continuous on D, then \(f_{xy}(a, b) = f_{yx}(a, b)\).

Clairaut’s theorem guarantees that as long as mixed second-order derivatives are continuous, the order in which we choose to
differentiate the functions (i.e., which variable goes first, then second, and so on) does not matter. It can be extended to higher-order
derivatives as well. The proof of Clairaut’s theorem can be found in most advanced calculus books.
Two other second-order partial derivatives can be calculated for any function f(x, y). The partial derivative \(f_{xx}\) is equal to the partial derivative of \(f_x\) with respect to x, and \(f_{yy}\) is equal to the partial derivative of \(f_y\) with respect to y.

Partial Differential Equations


Previously, we studied differential equations in which the unknown function had one independent variable. A partial differential equation is an equation that involves an unknown function of more than one independent variable and one or more of its partial derivatives. Examples of partial differential equations are

\[
\begin{aligned}
u_t &= c^2(u_{xx} + u_{yy}) && \text{(heat equation in two dimensions)} \\
u_{tt} &= c^2(u_{xx} + u_{yy}) && \text{(wave equation in two dimensions)} \\
u_{xx} + u_{yy} &= 0 && \text{(Laplace's equation in two dimensions)}
\end{aligned}
\]

In the heat and wave equations, the unknown function u has three independent variables: t, x, and y, and c is an arbitrary constant. The independent variables x and y are considered to be spatial variables, and the variable t represents time. In Laplace’s equation, the unknown function u has two independent variables x and y.

 Example 14.3.7: A Solution to the Wave Equation

Verify that

u(x, y, t) = 5 sin(3πx) sin(4πy) cos(10πt)

is a solution to the wave equation

utt = 4(uxx + uyy ). (14.3.9)

Solution
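First, we calculate \(u_{tt}\), \(u_{xx}\), and \(u_{yy}\):

\[
\begin{aligned}
u_{tt}(x, y, t) &= \frac{\partial}{\partial t}\left[\frac{\partial u}{\partial t}\right] \\
&= \frac{\partial}{\partial t}\left[5\sin(3\pi x)\sin(4\pi y)(-10\pi\sin(10\pi t))\right] \\
&= \frac{\partial}{\partial t}\left[-50\pi\sin(3\pi x)\sin(4\pi y)\sin(10\pi t)\right] \\
&= -500\pi^2\sin(3\pi x)\sin(4\pi y)\cos(10\pi t)
\end{aligned}
\]

\[
\begin{aligned}
u_{xx}(x, y, t) &= \frac{\partial}{\partial x}\left[\frac{\partial u}{\partial x}\right] \\
&= \frac{\partial}{\partial x}\left[15\pi\cos(3\pi x)\sin(4\pi y)\cos(10\pi t)\right] \\
&= -45\pi^2\sin(3\pi x)\sin(4\pi y)\cos(10\pi t)
\end{aligned}
\]

\[
\begin{aligned}
u_{yy}(x, y, t) &= \frac{\partial}{\partial y}\left[\frac{\partial u}{\partial y}\right] \\
&= \frac{\partial}{\partial y}\left[5\sin(3\pi x)(4\pi\cos(4\pi y))\cos(10\pi t)\right] \\
&= \frac{\partial}{\partial y}\left[20\pi\sin(3\pi x)\cos(4\pi y)\cos(10\pi t)\right] \\
&= -80\pi^2\sin(3\pi x)\sin(4\pi y)\cos(10\pi t).
\end{aligned}
\]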

Next, we substitute each of these into the right-hand side of Equation 14.3.9 and simplify:

\[
\begin{aligned}
4(u_{xx} + u_{yy}) &= 4\left(-45\pi^2\sin(3\pi x)\sin(4\pi y)\cos(10\pi t) - 80\pi^2\sin(3\pi x)\sin(4\pi y)\cos(10\pi t)\right) \\
&= 4\left(-125\pi^2\sin(3\pi x)\sin(4\pi y)\cos(10\pi t)\right) \\
&= -500\pi^2\sin(3\pi x)\sin(4\pi y)\cos(10\pi t) \\
&= u_{tt}.
\end{aligned}
\]

This verifies the solution.
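The same verification can be sketched numerically: approximate each second partial derivative with a second-order central difference and confirm that the residual of the wave equation is negligible compared with the size of its terms. The helper `second_partial`, the step size, and the sample point are assumptions chosen for illustration.

```python
import math

def u(x, y, t):
    return 5 * math.sin(3*math.pi*x) * math.sin(4*math.pi*y) * math.cos(10*math.pi*t)

def second_partial(func, point, index, h=1e-4):
    """Central-difference estimate of the second partial derivative
    with respect to the variable in position `index`."""
    plus = list(point)
    minus = list(point)
    plus[index] += h
    minus[index] -= h
    return (func(*plus) - 2*func(*point) + func(*minus)) / h**2

point = (0.1, 0.2, 0.03)  # arbitrary (x, y, t), an assumption
u_xx = second_partial(u, point, 0)
u_yy = second_partial(u, point, 1)
u_tt = second_partial(u, point, 2)

residual = u_tt - 4 * (u_xx + u_yy)
print("u_tt     =", u_tt)
print("residual =", residual)
```

The residual is not exactly zero because finite differences carry a small truncation error, but it should be many orders of magnitude smaller than u_tt itself.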

 Exercise 14.3.7: A Solution to the Heat Equation

Verify that

\[ u(x, y, t) = 2\sin\left(\frac{x}{3}\right)\sin\left(\frac{y}{4}\right)e^{-25t/16} \]

is a solution to the heat equation

ut = 9(uxx + uyy ).

Hint
Calculate the partial derivatives and substitute into the right-hand side.
Answer
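Differentiating gives

\[ u_t = -\frac{25}{8}\sin\left(\frac{x}{3}\right)\sin\left(\frac{y}{4}\right)e^{-25t/16}, \quad u_{xx} = -\frac{2}{9}\sin\left(\frac{x}{3}\right)\sin\left(\frac{y}{4}\right)e^{-25t/16}, \quad u_{yy} = -\frac{1}{8}\sin\left(\frac{x}{3}\right)\sin\left(\frac{y}{4}\right)e^{-25t/16}. \]

Therefore

\[ 9(u_{xx} + u_{yy}) = 9\left(-\frac{2}{9} - \frac{1}{8}\right)\sin\left(\frac{x}{3}\right)\sin\left(\frac{y}{4}\right)e^{-25t/16} = -\frac{25}{8}\sin\left(\frac{x}{3}\right)\sin\left(\frac{y}{4}\right)e^{-25t/16} = u_t. \]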

Since the solution to the two-dimensional heat equation is a function of three variables, it is not easy to create a visual representation of
the solution. We can graph the solution for fixed values of t, which amounts to snapshots of the heat distributions at fixed times. These
snapshots show how the heat is distributed over a two-dimensional surface as time progresses. The graph of the preceding solution at
time t = 0 appears in Figure 14.3.3. As time progresses, the extremes level out, approaching zero as t approaches infinity.

Figure 14.3.3
If we consider the heat equation in one dimension, then it is possible to graph the solution over time. The heat equation in one dimension becomes

\[ u_t = c^2u_{xx}, \]

where \(c^2\) represents the thermal diffusivity of the material in question. A solution of this differential equation can be written in the form

\[ u_m(x, t) = e^{-\pi^2m^2c^2t}\sin(m\pi x) \]

where m is any positive integer. A graph of this solution using m = 1 appears in Figure 14.3.4, where the initial temperature
distribution over a wire of length 1 is given by u(x, 0) = sin πx. Notice that as time progresses, the wire cools off. This is seen because,
from left to right, the highest temperature (which occurs in the middle of the wire) decreases and changes color from red to blue.
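The cooling behavior described here can be sketched numerically. The code below checks that \(u_m\) satisfies \(u_t = c^2u_{xx}\) at a sample point and prints the temperature at the middle of the wire at several times. The value of c, the step size, and the sample points are assumptions chosen for illustration; the text does not specify them.

```python
import math

C = 0.5  # assumed value of c, so the diffusivity c^2 is 0.25

def u(x, t, m=1, c=C):
    """The solution u_m(x, t) = exp(-pi^2 m^2 c^2 t) * sin(m pi x)."""
    return math.exp(-(math.pi * m * c)**2 * t) * math.sin(m * math.pi * x)

def heat_residual(x, t, m=1, c=C, h=1e-4):
    """u_t - c^2 u_xx, with both derivatives estimated by central differences."""
    u_t = (u(x, t + h, m, c) - u(x, t - h, m, c)) / (2 * h)
    u_xx = (u(x + h, t, m, c) - 2*u(x, t, m, c) + u(x - h, t, m, c)) / h**2
    return u_t - c**2 * u_xx

print("residual:", heat_residual(0.3, 0.2))

# Temperature at the middle of the wire (x = 0.5) as time progresses.
for t in (0.0, 0.5, 1.0, 2.0):
    print(t, u(0.5, t))
```

With m = 1 the midpoint temperature starts at 1 and decays exponentially, which matches the cooling described above.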

Figure 14.3.4 : Graph of a solution of the heat equation in one dimension over time.

 Lord Kelvin and the Age of Earth

During the late 1800s, the scientists of the new field of geology were coming to the conclusion that Earth must be “millions and
millions” of years old. At about the same time, Charles Darwin had published his treatise on evolution. Darwin’s view was that
evolution needed many millions of years to take place, and he made a bold claim that the Weald chalk fields, where important
fossils were found, were the result of 300 million years of erosion.

Figure 14.3.5 : (a) William Thomson (Lord Kelvin), 1824-1907, was a British physicist and electrical engineer; (b) Kelvin used the
heat diffusion equation to estimate the age of Earth (credit: modification of work by NASA).
At that time, eminent physicist William Thomson (Lord Kelvin) used an important partial differential equation, known as the heat
diffusion equation, to estimate the age of Earth by determining how long it would take Earth to cool from molten rock to what we
had at that time. His conclusion was a range of 20 to 400 million years, but most likely about 50 million years. For many decades,
the proclamations of this irrefutable icon of science did not sit well with geologists or with Darwin.
Read Kelvin’s paper on estimating the age of the Earth.
Kelvin made reasonable assumptions based on what was known in his time, but he also made several assumptions that turned out to
be wrong. One incorrect assumption was that Earth is solid and that the cooling was therefore via conduction only, hence justifying
the use of the diffusion equation. But the most serious error was a forgivable one—omission of the fact that Earth contains
radioactive elements that continually supply heat beneath Earth’s mantle. The discovery of radioactivity came near the end of
Kelvin’s life and he acknowledged that his calculation would have to be modified.
Kelvin used the simple one-dimensional model applied only to Earth’s outer shell, and derived the age from graphs and the roughly known temperature gradient near Earth’s surface. Let’s take a look at a more appropriate version of the diffusion equation in radial coordinates, which has the form

\[ \frac{\partial T}{\partial t} = K\left[\frac{\partial^2 T}{\partial r^2} + \frac{2}{r}\frac{\partial T}{\partial r}\right]. \tag{14.3.10} \]
Here, T (r, t) is temperature as a function of r (measured from the center of Earth) and time t. K is the heat conductivity—for
molten rock, in this case. The standard method of solving such a partial differential equation is by separation of variables, where we
express the solution as the product of functions containing each variable separately. In this case, we would write the temperature as

T (r, t) = R(r)f (t).

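1. Substitute this form into Equation 14.3.10 and, noting that f(t) is constant with respect to distance (r) and R(r) is constant with respect to time (t), show that

\[ \frac{1}{f}\frac{\partial f}{\partial t} = \frac{K}{R}\left[\frac{\partial^2 R}{\partial r^2} + \frac{2}{r}\frac{\partial R}{\partial r}\right]. \]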

2. This equation represents the separation of variables we want. The left-hand side is only a function of t and the right-hand side is only a function of r, and they must be equal for all values of r and t. Therefore, they both must be equal to a constant. Let’s call that constant \(-\lambda^2\). (The convenience of this choice is seen on substitution.) So, we have

\[ \frac{1}{f}\frac{\partial f}{\partial t} = -\lambda^2 \quad \text{and} \quad \frac{K}{R}\left[\frac{\partial^2 R}{\partial r^2} + \frac{2}{r}\frac{\partial R}{\partial r}\right] = -\lambda^2. \]

3. Now, we can verify through direct substitution for each equation that the solutions are \(f(t) = Ae^{-\lambda^2t}\) and \(R(r) = B\left(\frac{\sin \alpha r}{r}\right) + C\left(\frac{\cos \alpha r}{r}\right)\), where \(\alpha = \lambda/\sqrt{K}\). Note that \(f(t) = Ae^{+\lambda^2t}\) is also a valid solution, so we could have chosen \(+\lambda^2\) for our constant. Can you see why it would not be valid for this case as time increases?

4. Let’s now apply boundary conditions.


a. The temperature must be finite at the center of Earth, r = 0. Which of the two constants, B or C, must therefore be zero to keep R finite at r = 0? (Recall that \(\sin(\alpha r)/r \to \alpha\) as \(r \to 0\), but \(\cos(\alpha r)/r\) behaves very differently.)
b. Kelvin argued that when magma reaches Earth’s surface, it cools very rapidly. A person can often touch the surface within weeks of the flow. Therefore, the surface reached a moderate temperature very early and remained nearly constant at a surface temperature \(T_s\). For simplicity, let’s set T = 0 at \(r = R_E\) and find α such that this is the temperature there for all time t. (Kelvin took the value to be 300 K ≈ 80°F. We can add this 300 K constant to our solution later.) For this to be true, the sine term must vanish at \(r = R_E\), so \(\alpha R_E\) must be an integer multiple of π. Note that α has an infinite series of values that satisfies this condition. Each value of α represents a valid solution (each with its own value for A). The total or general solution is the sum of all these solutions.
c. At t = 0, we assume that all of Earth was at an initial hot temperature \(T_0\) (Kelvin took this to be about 7000 K). The application of this boundary condition involves the more advanced application of Fourier coefficients. As noted in part b, each value of \(\alpha_n\) represents a valid solution, and the general solution is a sum of all these solutions. This results in a series solution:

\[ T(r, t) = \left(\frac{T_0R_E}{\pi}\right)\sum_n \frac{(-1)^{n-1}}{n}e^{-\lambda_n^2t}\frac{\sin(\alpha_nr)}{r}, \quad \text{where } \alpha_n = n\pi/R_E. \]

Note how the values of \(\alpha_n\) come from the boundary condition applied in part b. The term \(\frac{(-1)^{n-1}}{n}\) is the constant \(A_n\) for each term in the series, determined from applying the Fourier method. Letting \(\beta = \frac{\pi}{R_E}\), so that \(\alpha_n = n\beta\) and \(\lambda_n^2 = K\alpha_n^2 = Kn^2\beta^2\), examine the first few terms of this solution shown here and note how \(\lambda_n^2\) in the exponential causes the higher terms to decrease quickly as time progresses:

\[ T(r, t) = \frac{T_0R_E}{\pi r}\left(e^{-K\beta^2t}\sin(\beta r) - \frac{1}{2}e^{-4K\beta^2t}\sin(2\beta r) + \frac{1}{3}e^{-9K\beta^2t}\sin(3\beta r) - \frac{1}{4}e^{-16K\beta^2t}\sin(4\beta r) + \frac{1}{5}e^{-25K\beta^2t}\sin(5\beta r) - \cdots\right). \]

Near time t = 0, many terms of the solution are needed for accuracy. Inserting values for the conductivity K and \(\beta = \pi/R_E\) for time approaching merely thousands of years, only the first few terms make a significant contribution. Kelvin only needed to look at the solution near Earth’s surface (Figure 14.3.6) and, after a long time, determine what time best yielded the estimated temperature gradient known during his era (1°F increase per 50 ft). He simply chose a range of times with a gradient close to this value. In Figure 14.3.6, the solutions are plotted and scaled, with the 300-K surface temperature added. Note that the center of Earth would be relatively cool. At the time, it was thought Earth must be solid.
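How quickly the higher terms die out can be seen by tabulating the size of each term's coefficient. In the sketch below, the n-th coefficient has magnitude \(\frac{1}{n}e^{-n^2s}\), where the dimensionless quantity \(s = K\beta^2t\) is a name introduced here for illustration only. No physical values for K or \(R_E\) are assumed.

```python
import math

def term_amplitude(n, s):
    """Magnitude of the n-th coefficient, (1/n) * exp(-n^2 * s),
    where s stands for the dimensionless product K * beta^2 * t."""
    return math.exp(-n**2 * s) / n

for s in (0.0, 0.1, 1.0):
    amplitudes = [term_amplitude(n, s) for n in range(1, 6)]
    print(s, ["%.2e" % a for a in amplitudes])
```

At s = 0 the coefficients shrink only like 1/n, which is why many terms are needed near t = 0. Once s is of order 1, the fifth coefficient is already smaller than the first by a factor of roughly \(e^{-24}/5\).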

Figure 14.3.6 : Temperature versus radial distance from the center of Earth. (a) Kelvin’s results, plotted to scale. (b) A close-up of
the results at a depth of 4.0 miles below Earth’s surface.
Epilog
On May 20, 1904, physicist Ernest Rutherford spoke at the Royal Institution to announce a revised calculation that included the
contribution of radioactivity as a source of Earth’s heat. In Rutherford’s own words:
“I came into the room, which was half-dark, and presently spotted Lord Kelvin in the audience, and realized that I was in for
trouble at the last part of my speech dealing with the age of the Earth, where my views conflicted with his. To my relief,
Kelvin fell fast asleep, but as I came to the important point, I saw the old bird sit up, open an eye and cock a baleful glance at
me.
Then a sudden inspiration came, and I said Lord Kelvin had limited the age of the Earth, provided no new source [of heat]
was discovered. That prophetic utterance referred to what we are now considering tonight, radium! Behold! The old boy
beamed upon me.”
Rutherford calculated an age for Earth of about 500 million years. Today’s accepted value of Earth’s age is about 4.6 billion years.

Key Concepts
A partial derivative is a derivative involving a function of more than one independent variable.
To calculate a partial derivative with respect to a given variable, treat all the other variables as constants and use the usual
differentiation rules.
Higher-order partial derivatives can be calculated in the same way as higher-order derivatives.

Key Equations
Partial derivative of f with respect to x

\[ \frac{\partial f}{\partial x} = \lim_{h \to 0}\frac{f(x + h, y) - f(x, y)}{h} \]

Partial derivative of f with respect to y

\[ \frac{\partial f}{\partial y} = \lim_{k \to 0}\frac{f(x, y + k) - f(x, y)}{k} \]

Glossary
higher-order partial derivatives
second-order or higher partial derivatives, regardless of whether they are mixed partial derivatives

mixed partial derivatives


second-order or higher partial derivatives, in which at least two of the differentiations are with respect to different variables

partial derivative
a derivative of a function of more than one independent variable in which all the variables but one are held constant

partial differential equation


an equation that involves an unknown function of more than one independent variable and one or more of its partial derivatives

14.3: Partial Derivatives is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
14.3: Partial Derivatives by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0. Original source:
https://openstax.org/details/books/calculus-volume-1.

14.4: Tangent Planes and Linear Approximations
 Learning Objectives
Determine the equation of a plane tangent to a given surface at a point.
Use the tangent plane to approximate a function of two variables at a point.
Explain when a function of two variables is differentiable.
Use the total differential to approximate the change in a function of two variables.

In this section, we consider the problem of finding the tangent plane to a surface, which is analogous to finding the equation of a
tangent line to a curve when the curve is defined by the graph of a function of one variable, y = f (x). The slope of the tangent line
at the point x = a is given by m = f '(a) ; what is the slope of a tangent plane? We learned about the equation of a plane in
Equations of Lines and Planes in Space; in this section, we see how it can be applied to the problem at hand.

Tangent Planes
Intuitively, it seems clear that, in a plane, only one line can be tangent to a curve at a point. However, in three-dimensional space,
many lines can be tangent to a given point. If these lines lie in the same plane, they determine the tangent plane at that point. A
more intuitive way to think of a tangent plane is to assume the surface is smooth at that point (no corners). Then, a tangent line to
the surface at that point in any direction does not have any abrupt changes in slope because the direction changes smoothly.
Therefore, in a small-enough neighborhood around the point, a tangent plane touches the surface at that point only.

 Definition: tangent lines


Let \(P_0 = (x_0, y_0, z_0)\) be a point on a surface S, and let C be any curve passing through \(P_0\) and lying entirely in S. If the tangent lines to all such curves C at \(P_0\) lie in the same plane, then this plane is called the tangent plane to S at \(P_0\) (Figure 14.4.1).

Figure 14.4.1: The tangent plane to a surface S at a point \(P_0\) contains all the tangent lines to curves in S that pass through \(P_0\).

For a tangent plane to a surface to exist at a point on that surface, it is sufficient for the function that defines the surface to be
differentiable at that point. We define the term tangent plane here and then explore the idea intuitively.

14.4.1 https://math.libretexts.org/@go/page/4539
 Definition: tangent planes

Let S be a surface defined by a differentiable function z = f(x, y), and let \(P_0 = (x_0, y_0)\) be a point in the domain of f. Then, the equation of the tangent plane to S at \(P_0\) is given by

\[ z = f(x_0, y_0) + f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0). \tag{14.4.1} \]

To see why this formula is correct, let’s first find two tangent lines to the surface S. The equation of the tangent line to the curve that is represented by the intersection of S with the vertical trace given by \(x = x_0\) is \(z = f(x_0, y_0) + f_y(x_0, y_0)(y - y_0)\). Similarly, the equation of the tangent line to the curve that is represented by the intersection of S with the vertical trace given by \(y = y_0\) is \(z = f(x_0, y_0) + f_x(x_0, y_0)(x - x_0)\). A parallel vector to the first tangent line is \(\vec{a} = \hat{\mathbf{j}} + f_y(x_0, y_0)\,\hat{\mathbf{k}}\); a parallel vector to the second tangent line is \(\vec{b} = \hat{\mathbf{i}} + f_x(x_0, y_0)\,\hat{\mathbf{k}}\). We can take the cross product of these two vectors:

\[
\begin{aligned}
\vec{a} \times \vec{b} &= \left(\hat{\mathbf{j}} + f_y(x_0, y_0)\,\hat{\mathbf{k}}\right) \times \left(\hat{\mathbf{i}} + f_x(x_0, y_0)\,\hat{\mathbf{k}}\right) \\
&= \begin{vmatrix} \hat{\mathbf{i}} & \hat{\mathbf{j}} & \hat{\mathbf{k}} \\ 0 & 1 & f_y(x_0, y_0) \\ 1 & 0 & f_x(x_0, y_0) \end{vmatrix} \\
&= f_x(x_0, y_0)\,\hat{\mathbf{i}} + f_y(x_0, y_0)\,\hat{\mathbf{j}} - \hat{\mathbf{k}}.
\end{aligned}
\]

This vector is perpendicular to both lines and is therefore perpendicular to the tangent plane. We can use this vector as a normal vector to the tangent plane, along with the point \(P_0 = (x_0, y_0, f(x_0, y_0))\) in the equation for a plane:

\[
\begin{aligned}
\vec{n} \cdot \left((x - x_0)\,\hat{\mathbf{i}} + (y - y_0)\,\hat{\mathbf{j}} + (z - f(x_0, y_0))\,\hat{\mathbf{k}}\right) &= 0 \\
\left(f_x(x_0, y_0)\,\hat{\mathbf{i}} + f_y(x_0, y_0)\,\hat{\mathbf{j}} - \hat{\mathbf{k}}\right) \cdot \left((x - x_0)\,\hat{\mathbf{i}} + (y - y_0)\,\hat{\mathbf{j}} + (z - f(x_0, y_0))\,\hat{\mathbf{k}}\right) &= 0 \\
f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0) - (z - f(x_0, y_0)) &= 0.
\end{aligned}
\]

Solving this equation for z gives Equation 14.4.1.
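The cross product in this derivation is easy to confirm for concrete slopes. The sketch below uses hypothetical values for \(f_x(x_0, y_0)\) and \(f_y(x_0, y_0)\), chosen only for illustration, and checks that the result is \((f_x, f_y, -1)\) and that it is perpendicular to both direction vectors.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as (x, y, z) tuples."""
    ax, ay, az = a
    bx, by, bz = b
    return (ay*bz - az*by, az*bx - ax*bz, ax*by - ay*bx)

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

fx, fy = 2.0, -0.5    # hypothetical slopes f_x(x0, y0) and f_y(x0, y0)

a = (0.0, 1.0, fy)    # direction of the tangent line in the trace x = x0
b = (1.0, 0.0, fx)    # direction of the tangent line in the trace y = y0

n = cross(a, b)
print(n)                     # components (fx, fy, -1)
print(dot(n, a), dot(n, b))  # both zero: n is normal to the tangent plane
```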

 Example 14.4.1: Finding a Tangent Plane

Find the equation of the tangent plane to the surface defined by the function \(f(x, y) = 2x^2 - 3xy + 8y^2 + 2x - 4y + 4\) at point (2, −1).
Solution
First, we must calculate \(f_x(x, y)\) and \(f_y(x, y)\), then use Equation 14.4.1 with \(x_0 = 2\) and \(y_0 = -1\):

\[
\begin{aligned}
f_x(x, y) &= 4x - 3y + 2 \\
f_y(x, y) &= -3x + 16y - 4 \\
f(2, -1) &= 2(2)^2 - 3(2)(-1) + 8(-1)^2 + 2(2) - 4(-1) + 4 = 34 \\
f_x(2, -1) &= 4(2) - 3(-1) + 2 = 13 \\
f_y(2, -1) &= -3(2) + 16(-1) - 4 = -26.
\end{aligned}
\]

Then Equation 14.4.1 becomes

\[
\begin{aligned}
z &= f(x_0, y_0) + f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0) \\
z &= 34 + 13(x - 2) - 26(y - (-1)) \\
z &= 34 + 13x - 26 - 26y - 26 \\
z &= 13x - 26y - 18.
\end{aligned}
\]

(See the following figure).

Figure 14.4.2 : Calculating the equation of a tangent plane to a given surface at a given point.
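Equation 14.4.1 can be turned into a small routine that returns the coefficients of the tangent plane. The sketch below estimates the two partial derivatives by central differences and rewrites the plane as z = ax + by + c. The function name `tangent_plane` and the step size are assumptions chosen for illustration.

```python
def f(x, y):
    return 2*x**2 - 3*x*y + 8*y**2 + 2*x - 4*y + 4

def tangent_plane(func, x0, y0, h=1e-5):
    """Return (a, b, c) so that the tangent plane at (x0, y0) is z = a*x + b*y + c.

    Uses Equation 14.4.1 with central-difference estimates of f_x and f_y."""
    fx = (func(x0 + h, y0) - func(x0 - h, y0)) / (2 * h)
    fy = (func(x0, y0 + h) - func(x0, y0 - h)) / (2 * h)
    c = func(x0, y0) - fx * x0 - fy * y0
    return fx, fy, c

a, b, c = tangent_plane(f, 2.0, -1.0)
print(a, b, c)  # close to 13, -26, -18
```

Up to floating-point error, the printed coefficients match the plane z = 13x − 26y − 18 found in the example.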

 Exercise 14.4.1

Find the equation of the tangent plane to the surface defined by the function \(f(x, y) = x^3 - x^2y + y^2 - 2x + 3y - 2\) at point (−1, 3).

Hint
First, calculate \(f_x(x, y)\) and \(f_y(x, y)\), then use Equation 14.4.1.

Answer
z = 7x + 8y − 3

 Example 14.4.2: Finding Another Tangent Plane

Find the equation of the tangent plane to the surface defined by the function f (x, y) = sin(2x) cos(3y) at the point
(π/3, π/4).

Solution
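First, calculate \(f_x(x, y)\) and \(f_y(x, y)\), then use Equation 14.4.1 with \(x_0 = \pi/3\) and \(y_0 = \pi/4\):

\[\begin{aligned}
f_x(x, y) &= 2\cos(2x)\cos(3y) \\
f_y(x, y) &= -3\sin(2x)\sin(3y) \\
f\left(\frac{\pi}{3}, \frac{\pi}{4}\right) &= \sin\left(2\left(\frac{\pi}{3}\right)\right)\cos\left(3\left(\frac{\pi}{4}\right)\right) = \left(\frac{\sqrt{3}}{2}\right)\left(-\frac{\sqrt{2}}{2}\right) = -\frac{\sqrt{6}}{4} \\
f_x\left(\frac{\pi}{3}, \frac{\pi}{4}\right) &= 2\cos\left(2\left(\frac{\pi}{3}\right)\right)\cos\left(3\left(\frac{\pi}{4}\right)\right) = 2\left(-\frac{1}{2}\right)\left(-\frac{\sqrt{2}}{2}\right) = \frac{\sqrt{2}}{2} \\
f_y\left(\frac{\pi}{3}, \frac{\pi}{4}\right) &= -3\sin\left(2\left(\frac{\pi}{3}\right)\right)\sin\left(3\left(\frac{\pi}{4}\right)\right) = -3\left(\frac{\sqrt{3}}{2}\right)\left(\frac{\sqrt{2}}{2}\right) = -\frac{3\sqrt{6}}{4}.
\end{aligned}\]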

Then Equation 14.4.1 becomes

\[\begin{aligned}
z &= f(x_0, y_0) + f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0) \\
&= -\frac{\sqrt{6}}{4} + \frac{\sqrt{2}}{2}\left(x - \frac{\pi}{3}\right) - \frac{3\sqrt{6}}{4}\left(y - \frac{\pi}{4}\right) \\
&= \frac{\sqrt{2}}{2}x - \frac{3\sqrt{6}}{4}y - \frac{\sqrt{6}}{4} - \frac{\pi\sqrt{2}}{6} + \frac{3\pi\sqrt{6}}{16}.
\end{aligned}\]

A tangent plane to a surface does not always exist at every point on the surface. Consider the piecewise function
\[f(x, y) = \begin{cases} \dfrac{xy}{\sqrt{x^2 + y^2}}, & (x, y) \neq (0, 0) \\ 0, & (x, y) = (0, 0). \end{cases} \tag{14.4.2}\]

The graph of this function follows.

Figure 14.4.3 : Graph of a function that does not have a tangent plane at the origin. Dynamic figure powered by CalcPlot3D.
If either \(x = 0\) or \(y = 0\), then \(f(x, y) = 0\), so the value of the function does not change on either the x- or y-axis. Therefore, \(f_x(x, 0) = f_y(0, y) = 0\), so as either \(x\) or \(y\) approaches zero, these partial derivatives stay equal to zero. Substituting them into Equation 14.4.1 gives \(z = 0\) as the equation of the tangent plane. However, if we approach the origin from a different direction, we get a different story. For example, suppose we approach the origin along the line \(y = x\). If we put \(y = x\) into the original function, it becomes

\[f(x, x) = \frac{x(x)}{\sqrt{x^2 + (x)^2}} = \frac{x^2}{\sqrt{2x^2}} = \frac{|x|}{\sqrt{2}}.\]

When \(x > 0\), the slope of this curve is equal to \(\sqrt{2}/2\); when \(x < 0\), the slope of this curve is equal to \(-\sqrt{2}/2\). This presents a problem. In the definition of tangent plane, we presumed that all tangent lines through point \(P\) (in this case, the origin) lay in the same plane. This is clearly not the case here. When we study differentiable functions, we will see that this function is not differentiable at the origin.

Linear Approximations
Recall from Linear Approximations and Differentials that the formula for the linear approximation of a function f (x) at the point
x = a is given by


\[y \approx f(a) + f'(a)(x - a).\]

The diagram for the linear approximation of a function of one variable appears in the following graph.

Figure 14.4.4 : Linear approximation of a function in one variable.


The tangent line can be used as an approximation to the function f (x) for values of x reasonably close to x = a . When working
with a function of two variables, the tangent line is replaced by a tangent plane, but the approximation idea is much the same.

 Definition: Linear Approximation

Given a function \(z = f(x, y)\) with continuous partial derivatives that exist at the point \((x_0, y_0)\), the linear approximation of \(f\) at the point \((x_0, y_0)\) is given by the equation

\[L(x, y) = f(x_0, y_0) + f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0). \tag{14.4.3}\]

Notice that this equation also represents the tangent plane to the surface defined by \(z = f(x, y)\) at the point \((x_0, y_0)\). The idea behind using a linear approximation is that, if there is a point \((x_0, y_0)\) at which the precise value of \(f(x, y)\) is known, then for values of \((x, y)\) reasonably close to \((x_0, y_0)\), the linear approximation (i.e., the tangent plane) yields a value that is also reasonably close to the exact value of \(f(x, y)\) (Figure 14.4.5).

Figure 14.4.5 : Using a tangent plane for linear approximation at a point.

 Example 14.4.3: Using a Tangent Plane Approximation


Given the function \(f(x, y) = \sqrt{41 - 4x^2 - y^2}\), approximate \(f(2.1, 2.9)\) using point \((2, 3)\) for \((x_0, y_0)\). What is the approximate value of \(f(2.1, 2.9)\) to four decimal places?
Solution
To apply Equation 14.4.3, we first must calculate \(f(x_0, y_0)\), \(f_x(x_0, y_0)\), and \(f_y(x_0, y_0)\) using \(x_0 = 2\) and \(y_0 = 3\):

\[\begin{aligned}
f(x_0, y_0) &= f(2, 3) = \sqrt{41 - 4(2)^2 - (3)^2} = \sqrt{41 - 16 - 9} = \sqrt{16} = 4 \\
f_x(x, y) &= -\frac{4x}{\sqrt{41 - 4x^2 - y^2}} \quad \text{so} \quad f_x(x_0, y_0) = -\frac{4(2)}{\sqrt{41 - 4(2)^2 - (3)^2}} = -2 \\
f_y(x, y) &= -\frac{y}{\sqrt{41 - 4x^2 - y^2}} \quad \text{so} \quad f_y(x_0, y_0) = -\frac{3}{\sqrt{41 - 4(2)^2 - (3)^2}} = -\frac{3}{4}.
\end{aligned}\]

Now we substitute these values into Equation 14.4.3:


\[\begin{aligned}
L(x, y) &= f(x_0, y_0) + f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0) \\
&= 4 - 2(x - 2) - \frac{3}{4}(y - 3) \\
&= \frac{41}{4} - 2x - \frac{3}{4}y.
\end{aligned}\]

Last, we substitute \(x = 2.1\) and \(y = 2.9\) into \(L(x, y)\):

\[L(2.1, 2.9) = \frac{41}{4} - 2(2.1) - \frac{3}{4}(2.9) = 10.25 - 4.2 - 2.175 = 3.875.\]
4 4

The exact value of \(f(2.1, 2.9)\), rounded to four decimal places, is

\[f(2.1, 2.9) = \sqrt{41 - 4(2.1)^2 - (2.9)^2} = \sqrt{14.95} \approx 3.8665,\]

so the linear approximation \(3.875\) is off by about \(0.0085\), which corresponds to a relative error of roughly 0.2%.
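The comparison between the linear approximation and the exact value can be reproduced with a few lines of code. This is a small sketch under stated assumptions: the function name `linearization` is hypothetical, and the values \(f(2, 3) = 4\), \(f_x(2, 3) = -2\), and \(f_y(2, 3) = -3/4\) are taken from the worked solution rather than recomputed.

```python
import math


def f(x, y):
    # The function from Example 14.4.3
    return math.sqrt(41 - 4 * x**2 - y**2)


def linearization(x, y, x0=2.0, y0=3.0):
    """L(x, y) from Equation 14.4.3, using the values found in the example."""
    f0, fx0, fy0 = 4.0, -2.0, -0.75
    return f0 + fx0 * (x - x0) + fy0 * (y - y0)


approx = linearization(2.1, 2.9)
exact = f(2.1, 2.9)
rel_error = abs(approx - exact) / exact
print(f"L = {approx:.4f}, f = {exact:.4f}, relative error = {rel_error:.2%}")
```

Moving the evaluation point farther from \((2, 3)\) should make the relative error grow, which illustrates why the approximation is only trusted near the base point.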

 Exercise 14.4.2
Given the function \(f(x, y) = e^{5 - 2x + 3y}\), approximate \(f(4.1, 0.9)\) using point \((4, 1)\) for \((x_0, y_0)\). What is the approximate value of \(f(4.1, 0.9)\) to four decimal places?

Hint
First calculate \(f(x_0, y_0)\), \(f_x(x_0, y_0)\), and \(f_y(x_0, y_0)\) using \(x_0 = 4\) and \(y_0 = 1\), then use Equation 14.4.3.
Answer
\(L(x, y) = 6 - 2x + 3y\), so \(L(4.1, 0.9) = 6 - 2(4.1) + 3(0.9) = 0.5\). The exact value is \(f(4.1, 0.9) = e^{5 - 2(4.1) + 3(0.9)} = e^{-0.5} \approx 0.6065\).

Differentiability
When working with a function y = f (x) of one variable, the function is said to be differentiable at a point x = a if f '(a) exists.
Furthermore, if a function of one variable is differentiable at a point, the graph is “smooth” at that point (i.e., no corners exist) and
a tangent line is well-defined at that point.
The idea behind differentiability of a function of two variables is connected to the idea of smoothness at that point. In this case, a
surface is considered to be smooth at point P if a tangent plane to the surface exists at that point. If a function is differentiable at a
point, then a tangent plane to the surface exists at that point. Recall that the formula (Equation 14.4.1) for a tangent plane at a point \((x_0, y_0)\) is given by

\[z = f(x_0, y_0) + f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0).\]

For a tangent plane to exist at the point \((x_0, y_0)\), the partial derivatives must therefore exist at that point. However, this is not a sufficient condition for smoothness, as was illustrated in Figure 14.4.3. In that case, the partial derivatives existed at the origin, but the function also had a corner on the graph at the origin.

 Definition: differentiable Functions

A function \(f(x, y)\) is differentiable at a point \(P(x_0, y_0)\) if, for all points \((x, y)\) in a \(\delta\) disk around \(P\), we can write

\[f(x, y) = f(x_0, y_0) + f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0) + E(x, y), \tag{14.4.4}\]

where the error term \(E\) satisfies

\[\lim_{(x, y) \to (x_0, y_0)} \frac{E(x, y)}{\sqrt{(x - x_0)^2 + (y - y_0)^2}} = 0. \tag{14.4.5}\]

The last term in Equation 14.4.4 is referred to as the error term, and it represents how closely the tangent plane comes to the surface in a small neighborhood (\(\delta\) disk) of point \(P\). For the function \(f\) to be differentiable at \(P\), the function must be smooth; that is, the graph of \(f\) must be close to the tangent plane for points near \(P\).

 Example 14.4.4: Demonstrating Differentiability


Show that the function \(f(x, y) = 2x^2 - 4y\) is differentiable at point \((2, -3)\).
Solution
First, we calculate \(f(x_0, y_0)\), \(f_x(x_0, y_0)\), and \(f_y(x_0, y_0)\) using \(x_0 = 2\) and \(y_0 = -3\), then we use Equation 14.4.4:

\[\begin{aligned}
f(2, -3) &= 2(2)^2 - 4(-3) = 8 + 12 = 20 \\
f_x(2, -3) &= 4(2) = 8 \\
f_y(2, -3) &= -4.
\end{aligned}\]

Therefore, with \(f_x(2, -3) = 8\) and \(f_y(2, -3) = -4\), Equation 14.4.4 becomes

\[\begin{aligned}
f(x, y) &= f(2, -3) + f_x(2, -3)(x - 2) + f_y(2, -3)(y + 3) + E(x, y) \\
2x^2 - 4y &= 20 + 8(x - 2) - 4(y + 3) + E(x, y) \\
2x^2 - 4y &= 20 + 8x - 16 - 4y - 12 + E(x, y) \\
2x^2 - 4y &= 8x - 4y - 8 + E(x, y) \\
E(x, y) &= 2x^2 - 8x + 8.
\end{aligned}\]

Next, we calculate the limit in Equation 14.4.5:


\[\begin{aligned}
\lim_{(x, y) \to (x_0, y_0)} \frac{E(x, y)}{\sqrt{(x - x_0)^2 + (y - y_0)^2}} &= \lim_{(x, y) \to (2, -3)} \frac{2x^2 - 8x + 8}{\sqrt{(x - 2)^2 + (y + 3)^2}} \\
&= \lim_{(x, y) \to (2, -3)} \frac{2(x^2 - 4x + 4)}{\sqrt{(x - 2)^2 + (y + 3)^2}} \\
&= \lim_{(x, y) \to (2, -3)} \frac{2(x - 2)^2}{\sqrt{(x - 2)^2 + (y + 3)^2}} \\
&\leq \lim_{(x, y) \to (2, -3)} \frac{2\left((x - 2)^2 + (y + 3)^2\right)}{\sqrt{(x - 2)^2 + (y + 3)^2}} \\
&= \lim_{(x, y) \to (2, -3)} 2\sqrt{(x - 2)^2 + (y + 3)^2} \\
&= 0.
\end{aligned}\]

Since \(E(x, y) \geq 0\) for any value of \(x\) or \(y\), the original limit must be equal to zero. Therefore, \(f(x, y) = 2x^2 - 4y\) is differentiable at point \((2, -3)\).
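The limit in Equation 14.4.5 can also be explored numerically for this example. The sketch below is illustrative only: the function names are hypothetical, and sampling along a single diagonal direction suggests, but does not prove, that the ratio tends to zero.

```python
import math


def error_term(x, y):
    """E(x, y) = f(x, y) minus the tangent plane at (2, -3), for f = 2x^2 - 4y."""
    f_value = 2 * x**2 - 4 * y
    plane = 20 + 8 * (x - 2) - 4 * (y + 3)
    return f_value - plane


def scaled_error(x, y):
    """The quotient inside the limit of Equation 14.4.5 at the base point (2, -3)."""
    return error_term(x, y) / math.hypot(x - 2, y + 3)


# Approach (2, -3) along the diagonal direction (1, 1) with shrinking step r.
ratios = [scaled_error(2 + r, -3 + r) for r in (1e-1, 1e-2, 1e-3)]
print(ratios)
```

Along this direction \(E = 2r^2\) and the distance is \(r\sqrt{2}\), so each ratio is approximately \(\sqrt{2}\,r\) and shrinks with \(r\), consistent with the squeeze argument above.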

 Exercise 14.4.3

Show that the function \(f(x, y) = 3x - 4y^2\) is differentiable at point \((-1, 2)\).

Hint
First, calculate \(f(x_0, y_0)\), \(f_x(x_0, y_0)\), and \(f_y(x_0, y_0)\) using \(x_0 = -1\) and \(y_0 = 2\), then use Equation 14.4.4 to find \(E(x, y)\). Last, calculate the limit.
Answer
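\(f(-1, 2) = -19\), \(f_x(-1, 2) = 3\), \(f_y(-1, 2) = -16\), \(E(x, y) = -4(y - 2)^2\).

\[\begin{aligned}
\lim_{(x, y) \to (x_0, y_0)} \frac{E(x, y)}{\sqrt{(x - x_0)^2 + (y - y_0)^2}} &= \lim_{(x, y) \to (-1, 2)} \frac{-4(y - 2)^2}{\sqrt{(x + 1)^2 + (y - 2)^2}} \\
&\geq \lim_{(x, y) \to (-1, 2)} \frac{-4\left((x + 1)^2 + (y - 2)^2\right)}{\sqrt{(x + 1)^2 + (y - 2)^2}} \\
&= \lim_{(x, y) \to (-1, 2)} -4\sqrt{(x + 1)^2 + (y - 2)^2} \\
&= 0.
\end{aligned}\]

Since \(E(x, y) \leq 0\) for every \((x, y)\), the original quotient lies between this lower bound and zero, so its limit must equal zero.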

The function from Equation 14.4.2,

\[f(x, y) = \begin{cases} \dfrac{xy}{\sqrt{x^2 + y^2}}, & (x, y) \neq (0, 0) \\ 0, & (x, y) = (0, 0), \end{cases}\]

is not differentiable at the origin (Figure 14.4.3). We can see this by calculating the partial derivatives. This function appeared earlier in the section, where we showed that \(f_x(0, 0) = f_y(0, 0) = 0\). Substituting this information into Equations 14.4.4 and 14.4.5 using \(x_0 = 0\) and \(y_0 = 0\), we get

\[\begin{aligned}
f(x, y) &= f(0, 0) + f_x(0, 0)(x - 0) + f_y(0, 0)(y - 0) + E(x, y) \\
E(x, y) &= \frac{xy}{\sqrt{x^2 + y^2}}.
\end{aligned}\]

Calculating

\[\lim_{(x, y) \to (x_0, y_0)} \frac{E(x, y)}{\sqrt{(x - x_0)^2 + (y - y_0)^2}}\]

gives

\[\begin{aligned}
\lim_{(x, y) \to (x_0, y_0)} \frac{E(x, y)}{\sqrt{(x - x_0)^2 + (y - y_0)^2}} &= \lim_{(x, y) \to (0, 0)} \frac{\dfrac{xy}{\sqrt{x^2 + y^2}}}{\sqrt{x^2 + y^2}} \\
&= \lim_{(x, y) \to (0, 0)} \frac{xy}{x^2 + y^2}.
\end{aligned}\]

Depending on the path taken toward the origin, this limit takes different values. Therefore, the limit does not exist and the function
f is not differentiable at the origin as shown in the following figure.
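The path dependence of this limit is easy to observe numerically. The following sketch is an illustration with hypothetical function names: it evaluates \(xy/(x^2 + y^2)\) at points approaching the origin along several lines \(y = mx\), where the slopes are chosen arbitrarily.

```python
def error_ratio(x, y):
    """E(x, y) / sqrt(x^2 + y^2) for the function in Equation 14.4.2 at the origin."""
    return (x * y) / (x**2 + y**2)


def ratio_along_line(slope, t):
    """Evaluate the ratio at the point (t, slope * t) on the line y = slope * x."""
    return error_ratio(t, slope * t)


for slope in (0.0, 1.0, -1.0, 2.0):
    values = [ratio_along_line(slope, t) for t in (1e-1, 1e-3, 1e-6)]
    print(f"slope {slope:+.1f}: {values}")
```

On the line \(y = mx\) the ratio simplifies to \(m/(1 + m^2)\), which is independent of how close the point is to the origin but different for each slope; this is exactly why the limit fails to exist.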

Figure 14.4.6 : This function f (x, y) (Equation 14.4.2 ) is not differentiable at the origin.
Differentiability and continuity for functions of two or more variables are connected, the same as for functions of one variable. In
fact, with some adjustments of notation, the basic theorem is the same.

 THEOREM: Differentiability Implies Continuity

Let \(z = f(x, y)\) be a function of two variables with \((x_0, y_0)\) in the domain of \(f\). If \(f(x, y)\) is differentiable at \((x_0, y_0)\), then \(f(x, y)\) is continuous at \((x_0, y_0)\).

This theorem shows that if a function is differentiable at a point, then it is continuous there. However, if a function is continuous at a point, then it is not necessarily differentiable at that point. For example, the function discussed above (Equation 14.4.2)

\[f(x, y) = \begin{cases} \dfrac{xy}{\sqrt{x^2 + y^2}}, & (x, y) \neq (0, 0) \\ 0, & (x, y) = (0, 0) \end{cases}\]

is continuous at the origin, but it is not differentiable at the origin. This observation is also similar to the situation in single-variable calculus.
We can further explore the connection between continuity and differentiability at a point. This next theorem says that if the function and its partial derivatives are continuous at a point, the function is differentiable.

 Theorem: Continuity of First Partials Implies Differentiability

Let \(z = f(x, y)\) be a function of two variables with \((x_0, y_0)\) in the domain of \(f\). If \(f(x, y)\), \(f_x(x, y)\), and \(f_y(x, y)\) all exist in a neighborhood of \((x_0, y_0)\) and are continuous at \((x_0, y_0)\), then \(f(x, y)\) is differentiable there.

Recall that earlier we showed that the function in Equation 14.4.2 was not differentiable at the origin. Let’s calculate the partial derivatives \(f_x\) and \(f_y\):

\[\frac{\partial f}{\partial x} = \frac{y^3}{(x^2 + y^2)^{3/2}}\]

and

\[\frac{\partial f}{\partial y} = \frac{x^3}{(x^2 + y^2)^{3/2}}.\]

The contrapositive of the preceding theorem states that if a function is not differentiable, then at least one of the hypotheses must be false. Let’s explore the condition that \(f_x\) must be continuous at \((0, 0)\). For this to be true, it must be true that

\[\lim_{(x, y) \to (0, 0)} f_x(x, y) = f_x(0, 0),\]

where

\[\lim_{(x, y) \to (0, 0)} f_x(x, y) = \lim_{(x, y) \to (0, 0)} \frac{y^3}{(x^2 + y^2)^{3/2}}.\]

Let \(x = ky\). Then

\[\begin{aligned}
\lim_{(x, y) \to (0, 0)} \frac{y^3}{(x^2 + y^2)^{3/2}} &= \lim_{y \to 0} \frac{y^3}{((ky)^2 + y^2)^{3/2}} \\
&= \lim_{y \to 0} \frac{y^3}{(k^2 y^2 + y^2)^{3/2}} \\
&= \lim_{y \to 0} \frac{y^3}{|y|^3 (k^2 + 1)^{3/2}} \\
&= \frac{1}{(k^2 + 1)^{3/2}} \lim_{y \to 0} \frac{|y|}{y}.
\end{aligned}\]

If \(y > 0\), then this expression equals \(1/(k^2 + 1)^{3/2}\); if \(y < 0\), then it equals \(-1/(k^2 + 1)^{3/2}\). In either case, the value depends on \(k\), so the limit fails to exist.

Differentials
In Linear Approximations and Differentials we first studied the concept of differentials. The differential of y , written dy , is defined
as f '(x)dx. The differential is used to approximate Δy = f (x + Δx) − f (x) , where Δx = dx . Extending this idea to the linear
approximation of a function of two variables at the point \((x_0, y_0)\) yields the formula for the total differential for a function of two variables.

 Definition: Total Differential

Let \(z = f(x, y)\) be a function of two variables with \((x_0, y_0)\) in the domain of \(f\), and let \(\Delta x\) and \(\Delta y\) be chosen so that \((x_0 + \Delta x, y_0 + \Delta y)\) is also in the domain of \(f\). If \(f\) is differentiable at the point \((x_0, y_0)\), then the differentials \(dx\) and \(dy\) are defined as

\[dx = \Delta x\]

and

\[dy = \Delta y.\]

The differential \(dz\), also called the total differential of \(z = f(x, y)\) at \((x_0, y_0)\), is defined as

dz = fx (x0 , y0 )dx + fy (x0 , y0 )dy. (14.4.6)

Notice that the symbol \(\partial\) is not used to denote the total differential; rather, \(d\) appears in front of \(z\). Now, let’s define \(\Delta z = f(x_0 + \Delta x, y_0 + \Delta y) - f(x_0, y_0)\). We use \(dz\) to approximate \(\Delta z\), so

\[\Delta z \approx dz = f_x(x_0, y_0)\,dx + f_y(x_0, y_0)\,dy.\]

Therefore, the differential is used to approximate the change in the function \(z = f(x, y)\) at the point \((x_0, y_0)\) for given values of \(\Delta x\) and \(\Delta y\). Since \(\Delta z = f(x_0 + \Delta x, y_0 + \Delta y) - f(x_0, y_0)\), this can be used further to approximate \(f(x_0 + \Delta x, y_0 + \Delta y)\):

\[f(x_0 + \Delta x, y_0 + \Delta y) = f(x_0, y_0) + \Delta z \approx f(x_0, y_0) + f_x(x_0, y_0)\Delta x + f_y(x_0, y_0)\Delta y.\]

See the following figure.

Figure 14.4.7: The linear approximation is calculated via the formula \(f(x_0 + \Delta x, y_0 + \Delta y) \approx f(x_0, y_0) + f_x(x_0, y_0)\Delta x + f_y(x_0, y_0)\Delta y\).

One such application of this idea is to determine error propagation. For example, if we are manufacturing a gadget and are off by a
certain amount in measuring a given quantity, the differential can be used to estimate the error in the total volume of the gadget.

 Example 14.4.5: Approximation by Differentials

Find the differential \(dz\) of the function \(f(x, y) = 3x^2 - 2xy + y^2\) and use it to approximate \(\Delta z\) at point \((2, -3)\). Use \(\Delta x = 0.1\) and \(\Delta y = -0.05\). What is the exact value of \(\Delta z\)?

Solution
First, we must calculate \(f(x_0, y_0)\), \(f_x(x_0, y_0)\), and \(f_y(x_0, y_0)\) using \(x_0 = 2\) and \(y_0 = -3\):

\[\begin{aligned}
f(x_0, y_0) &= f(2, -3) = 3(2)^2 - 2(2)(-3) + (-3)^2 = 12 + 12 + 9 = 33 \\
f_x(x, y) &= 6x - 2y \\
f_y(x, y) &= -2x + 2y \\
f_x(x_0, y_0) &= f_x(2, -3) = 6(2) - 2(-3) = 12 + 6 = 18 \\
f_y(x_0, y_0) &= f_y(2, -3) = -2(2) + 2(-3) = -4 - 6 = -10.
\end{aligned}\]

Then, we substitute these quantities into Equation 14.4.6:


dz = fx (x0 , y0 )dx + fy (x0 , y0 )dy

dz = 18(0.1) − 10(−0.05) = 1.8 + 0.5 = 2.3.

This is the approximation to Δz = f (x 0 + Δx, y0 + Δy) − f (x0 , y0 ). The exact value of Δz is given by

Δz = f (x0 + Δx, y0 + Δy) − f (x0 , y0 )

= f (2 + 0.1, −3 − 0.05) − f (2, −3)

= f (2.1, −3.05) − f (2, −3)

= 2.3425.
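The two numbers in this example, the differential \(dz\) and the exact change \(\Delta z\), can be reproduced directly. This is a minimal sketch with a hypothetical helper name `total_differential`; the partial derivatives are hard-coded from the solution above.

```python
def f(x, y):
    # The function from Example 14.4.5
    return 3 * x**2 - 2 * x * y + y**2


def total_differential(x0, y0, dx, dy):
    """dz = f_x(x0, y0) dx + f_y(x0, y0) dy, with f_x = 6x - 2y and f_y = -2x + 2y."""
    fx = 6 * x0 - 2 * y0
    fy = -2 * x0 + 2 * y0
    return fx * dx + fy * dy


x0, y0, dx, dy = 2.0, -3.0, 0.1, -0.05
dz = total_differential(x0, y0, dx, dy)
delta_z = f(x0 + dx, y0 + dy) - f(x0, y0)
print(f"dz = {dz:.4f}, exact change = {delta_z:.4f}, gap = {delta_z - dz:.4f}")
```

The gap between the two values is the error term \(E\) from the definition of differentiability; it is quadratic in \(\Delta x\) and \(\Delta y\) for this polynomial, so halving both increments should roughly quarter it.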

 Exercise 14.4.4

Find the differential \(dz\) of the function \(f(x, y) = 4y^2 + x^2 y - 2xy\) and use it to approximate \(\Delta z\) at point \((1, -1)\). Use \(\Delta x = 0.03\) and \(\Delta y = -0.02\). What is the exact value of \(\Delta z\)?

Hint
First, calculate \(f_x(x_0, y_0)\) and \(f_y(x_0, y_0)\) using \(x_0 = 1\) and \(y_0 = -1\), then use Equation 14.4.6.
Answer
dz = 0.18

Δz = f (1.03, −1.02) − f (1, −1) = 0.180682

Differentiability of a Function of Three Variables


All of the preceding results for differentiability of functions of two variables can be generalized to functions of three variables.
First, the definition:

 Definition: Differentiability at a point

A function \(f(x, y, z)\) is differentiable at a point \(P(x_0, y_0, z_0)\) if, for all points \((x, y, z)\) in a \(\delta\) ball around \(P\), we can write

\[f(x, y, z) = f(x_0, y_0, z_0) + f_x(x_0, y_0, z_0)(x - x_0) + f_y(x_0, y_0, z_0)(y - y_0) + f_z(x_0, y_0, z_0)(z - z_0) + E(x, y, z),\]

where the error term \(E\) satisfies

\[\lim_{(x, y, z) \to (x_0, y_0, z_0)} \frac{E(x, y, z)}{\sqrt{(x - x_0)^2 + (y - y_0)^2 + (z - z_0)^2}} = 0.\]

If a function of three variables is differentiable at a point \((x_0, y_0, z_0)\), then it is continuous there. Furthermore, continuity of first partial derivatives at that point guarantees differentiability.

Key Concepts
The analog of a tangent line to a curve is a tangent plane to a surface for functions of two variables.
Tangent planes can be used to approximate values of functions near known values.
A function is differentiable at a point if it is “smooth” at that point (i.e., no corners or discontinuities exist at that point).
The total differential can be used to approximate the change in a function \(z = f(x, y)\) at the point \((x_0, y_0)\) for given values of \(\Delta x\) and \(\Delta y\).

Key Equations
Tangent plane
z = f (x0 , y0 ) + fx (x0 , y0 )(x − x0 ) + fy (x0 , y0 )(y − y0 )

Linear approximation
L(x, y) = f (x0 , y0 ) + fx (x0 , y0 )(x − x0 ) + fy (x0 , y0 )(y − y0 )

Total differential
dz = fx (x0 , y0 )dx + fy (x0 , y0 )dy .
Differentiability (two variables)
\[f(x, y) = f(x_0, y_0) + f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0) + E(x, y),\]

where the error term \(E\) satisfies

\[\lim_{(x, y) \to (x_0, y_0)} \frac{E(x, y)}{\sqrt{(x - x_0)^2 + (y - y_0)^2}} = 0.\]

Differentiability (three variables)
\[f(x, y, z) = f(x_0, y_0, z_0) + f_x(x_0, y_0, z_0)(x - x_0) + f_y(x_0, y_0, z_0)(y - y_0) + f_z(x_0, y_0, z_0)(z - z_0) + E(x, y, z),\]

where the error term \(E\) satisfies

\[\lim_{(x, y, z) \to (x_0, y_0, z_0)} \frac{E(x, y, z)}{\sqrt{(x - x_0)^2 + (y - y_0)^2 + (z - z_0)^2}} = 0.\]

Glossary
differentiable
a function \(f(x, y)\) is differentiable at \((x_0, y_0)\) if \(f(x, y)\) can be expressed in the form \(f(x, y) = f(x_0, y_0) + f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0) + E(x, y)\), where the error term \(E(x, y)\) satisfies \(\lim_{(x, y) \to (x_0, y_0)} \dfrac{E(x, y)}{\sqrt{(x - x_0)^2 + (y - y_0)^2}} = 0\)

linear approximation
given a function \(f(x, y)\) and a tangent plane to the function at a point \((x_0, y_0)\), we can approximate \(f(x, y)\) for points near \((x_0, y_0)\) using the tangent plane formula

tangent plane
given a function \(f(x, y)\) that is differentiable at a point \((x_0, y_0)\), the equation of the tangent plane to the surface \(z = f(x, y)\) is given by \(z = f(x_0, y_0) + f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0)\)

total differential
the total differential of the function \(f(x, y)\) at \((x_0, y_0)\) is given by the formula \(dz = f_x(x_0, y_0)\,dx + f_y(x_0, y_0)\,dy\)

14.4: Tangent Planes and Linear Approximations is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
14.4: Tangent Planes and Linear Approximations by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0. Original source:
https://openstax.org/details/books/calculus-volume-1.

14.5: The Chain Rule
 Learning Objectives
State the chain rules for one or two independent variables.
Use tree diagrams as an aid to understanding the chain rule for several independent and intermediate variables.
Perform implicit differentiation of a function of two or more variables.

In single-variable calculus, we found that one of the most useful differentiation rules is the chain rule, which allows us to find the
derivative of the composition of two functions. The same thing is true for multivariable calculus, but this time we have to deal with
more than one form of the chain rule. In this section, we study extensions of the chain rule and learn how to take derivatives of
compositions of functions of more than one variable.

Chain Rules for One or Two Independent Variables


Recall that the chain rule for the derivative of a composite of two functions can be written in the form

\[\frac{d}{dx}\left(f(g(x))\right) = f'(g(x))\,g'(x).\]

In this equation, both f (x) and g(x) are functions of one variable. Now suppose that f is a function of two variables and g is a
function of one variable. Or perhaps they are both functions of two variables, or even more. How would we calculate the derivative
in these cases? The following theorem gives us the answer for the case of one independent variable.

 Chain Rule for One Independent Variable

Suppose that \(x = g(t)\) and \(y = h(t)\) are differentiable functions of \(t\) and \(z = f(x, y)\) is a differentiable function of \(x\) and \(y\). Then \(z = f(x(t), y(t))\) is a differentiable function of \(t\) and

\[\frac{dz}{dt} = \frac{\partial z}{\partial x} \cdot \frac{dx}{dt} + \frac{\partial z}{\partial y} \cdot \frac{dy}{dt}, \tag{14.5.1}\]

where the ordinary derivatives are evaluated at \(t\) and the partial derivatives are evaluated at \((x, y)\).

 Proof

The proof of this theorem uses the definition of differentiability of a function of two variables. Suppose that \(f\) is differentiable at the point \(P(x_0, y_0)\), where \(x_0 = g(t_0)\) and \(y_0 = h(t_0)\) for a fixed value of \(t_0\). We wish to prove that \(z = f(x(t), y(t))\) is differentiable at \(t = t_0\) and that Equation 14.5.1 holds at that point as well.

Since \(f\) is differentiable at \(P\), we know that

\[z(t) = f(x, y) = f(x_0, y_0) + f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0) + E(x, y),\]

where

\[\lim_{(x, y) \to (x_0, y_0)} \frac{E(x, y)}{\sqrt{(x - x_0)^2 + (y - y_0)^2}} = 0.\]

We then subtract \(z_0 = f(x_0, y_0)\) from both sides of this equation:

\[\begin{aligned}
z(t) - z(t_0) &= f(x(t), y(t)) - f(x(t_0), y(t_0)) \\
&= f_x(x_0, y_0)(x(t) - x(t_0)) + f_y(x_0, y_0)(y(t) - y(t_0)) + E(x(t), y(t)).
\end{aligned}\]

Next, we divide both sides by \(t - t_0\):

\[\frac{z(t) - z(t_0)}{t - t_0} = f_x(x_0, y_0)\,\frac{x(t) - x(t_0)}{t - t_0} + f_y(x_0, y_0)\,\frac{y(t) - y(t_0)}{t - t_0} + \frac{E(x(t), y(t))}{t - t_0}.\]

Then we take the limit as \(t\) approaches \(t_0\):

14.5.1 https://math.libretexts.org/@go/page/4540
\[\lim_{t \to t_0} \frac{z(t) - z(t_0)}{t - t_0} = f_x(x_0, y_0) \lim_{t \to t_0} \left(\frac{x(t) - x(t_0)}{t - t_0}\right) + f_y(x_0, y_0) \lim_{t \to t_0} \left(\frac{y(t) - y(t_0)}{t - t_0}\right) + \lim_{t \to t_0} \frac{E(x(t), y(t))}{t - t_0}.\]

The left-hand side of this equation is equal to \(dz/dt\), which leads to

\[\frac{dz}{dt} = f_x(x_0, y_0)\,\frac{dx}{dt} + f_y(x_0, y_0)\,\frac{dy}{dt} + \lim_{t \to t_0} \frac{E(x(t), y(t))}{t - t_0}.\]

The last term can be rewritten as

\[\begin{aligned}
\lim_{t \to t_0} \frac{E(x(t), y(t))}{t - t_0} &= \lim_{t \to t_0} \left(\frac{E(x, y)}{\sqrt{(x - x_0)^2 + (y - y_0)^2}} \cdot \frac{\sqrt{(x - x_0)^2 + (y - y_0)^2}}{t - t_0}\right) \\
&= \lim_{t \to t_0} \left(\frac{E(x, y)}{\sqrt{(x - x_0)^2 + (y - y_0)^2}}\right) \lim_{t \to t_0} \left(\frac{\sqrt{(x - x_0)^2 + (y - y_0)^2}}{t - t_0}\right).
\end{aligned}\]

As \(t\) approaches \(t_0\), \((x(t), y(t))\) approaches \((x(t_0), y(t_0))\), so we can rewrite the last product as

\[\lim_{(x, y) \to (x_0, y_0)} \frac{E(x, y)}{\sqrt{(x - x_0)^2 + (y - y_0)^2}} \; \lim_{(x, y) \to (x_0, y_0)} \left(\frac{\sqrt{(x - x_0)^2 + (y - y_0)^2}}{t - t_0}\right).\]

Since the first limit is equal to zero, we need only show that the second limit is finite:

\[\begin{aligned}
\lim_{(x, y) \to (x_0, y_0)} \frac{\sqrt{(x - x_0)^2 + (y - y_0)^2}}{t - t_0} &= \lim_{(x, y) \to (x_0, y_0)} \sqrt{\frac{(x - x_0)^2 + (y - y_0)^2}{(t - t_0)^2}} \\
&= \lim_{(x, y) \to (x_0, y_0)} \sqrt{\left(\frac{x - x_0}{t - t_0}\right)^2 + \left(\frac{y - y_0}{t - t_0}\right)^2} \\
&= \sqrt{\left[\lim_{(x, y) \to (x_0, y_0)} \left(\frac{x - x_0}{t - t_0}\right)\right]^2 + \left[\lim_{(x, y) \to (x_0, y_0)} \left(\frac{y - y_0}{t - t_0}\right)\right]^2}.
\end{aligned}\]

Since \(x(t)\) and \(y(t)\) are both differentiable functions of \(t\), both limits inside the last radical exist. Therefore, this value is finite. This proves the chain rule at \(t = t_0\); the rest of the theorem follows from the assumption that all functions are differentiable over their entire domains.


Closer examination of Equation 14.5.1 reveals an interesting pattern. The first term in the equation is \(\dfrac{\partial f}{\partial x} \cdot \dfrac{dx}{dt}\) and the second term is \(\dfrac{\partial f}{\partial y} \cdot \dfrac{dy}{dt}\). Recall that when multiplying fractions, cancelation can be used. If we treat these derivatives as fractions, then each product “simplifies” to something resembling \(\partial f/dt\). The variables \(x\) and \(y\) that disappear in this simplification are often called intermediate variables: they are independent variables for the function \(f\), but are dependent variables for the variable \(t\). Two terms appear on the right-hand side of the formula, and \(f\) is a function of two variables. This pattern works with functions of more than two variables as well, as we see later in this section.

 Example 14.5.1: Using the Chain Rule


Calculate \(dz/dt\) for each of the following functions:
a. \(z = f(x, y) = 4x^2 + 3y^2\), \(x = x(t) = \sin t\), \(y = y(t) = \cos t\)
b. \(z = f(x, y) = \sqrt{x^2 - y^2}\), \(x = x(t) = e^{2t}\), \(y = y(t) = e^{-t}\)

Solution
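a. To use the chain rule, we need four quantities: \(\partial z/\partial x\), \(\partial z/\partial y\), \(dx/dt\), and \(dy/dt\):

\[\frac{\partial z}{\partial x} = 8x, \qquad \frac{dx}{dt} = \cos t, \qquad \frac{\partial z}{\partial y} = 6y, \qquad \frac{dy}{dt} = -\sin t.\]

Now, we substitute each of these into Equation 14.5.1:

\[\begin{aligned}
\frac{dz}{dt} &= \frac{\partial z}{\partial x} \cdot \frac{dx}{dt} + \frac{\partial z}{\partial y} \cdot \frac{dy}{dt} \\
&= (8x)(\cos t) + (6y)(-\sin t) \\
&= 8x\cos t - 6y\sin t.
\end{aligned}\]

This answer has three variables in it. To reduce it to one variable, use the fact that \(x(t) = \sin t\) and \(y(t) = \cos t\). We obtain

\[\begin{aligned}
\frac{dz}{dt} &= 8x\cos t - 6y\sin t \\
&= 8(\sin t)\cos t - 6(\cos t)\sin t \\
&= 2\sin t\cos t.
\end{aligned}\]

This derivative can also be calculated by first substituting \(x(t)\) and \(y(t)\) into \(f(x, y)\), then differentiating with respect to \(t\):

\[\begin{aligned}
z = f(x, y) &= f(x(t), y(t)) \\
&= 4(x(t))^2 + 3(y(t))^2 \\
&= 4\sin^2 t + 3\cos^2 t.
\end{aligned}\]

Then

\[\begin{aligned}
\frac{dz}{dt} &= 2(4\sin t)(\cos t) + 2(3\cos t)(-\sin t) \\
&= 8\sin t\cos t - 6\sin t\cos t \\
&= 2\sin t\cos t,
\end{aligned}\]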

which is the same solution. However, it may not always be this easy to differentiate in this form.
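b. To use the chain rule, we again need four quantities: \(\partial z/\partial x\), \(\partial z/\partial y\), \(dx/dt\), and \(dy/dt\):

\[\frac{\partial z}{\partial x} = \frac{x}{\sqrt{x^2 - y^2}}, \qquad \frac{dx}{dt} = 2e^{2t}, \qquad \frac{\partial z}{\partial y} = \frac{-y}{\sqrt{x^2 - y^2}}, \qquad \frac{dy}{dt} = -e^{-t}.\]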

We substitute each of these into Equation 14.5.1:

\[\begin{aligned}
\frac{dz}{dt} &= \frac{\partial z}{\partial x} \cdot \frac{dx}{dt} + \frac{\partial z}{\partial y} \cdot \frac{dy}{dt} \\
&= \left(\frac{x}{\sqrt{x^2 - y^2}}\right)(2e^{2t}) + \left(\frac{-y}{\sqrt{x^2 - y^2}}\right)(-e^{-t}) \\
&= \frac{2xe^{2t} + ye^{-t}}{\sqrt{x^2 - y^2}}.
\end{aligned}\]

To reduce this to one variable, we use the fact that \(x(t) = e^{2t}\) and \(y(t) = e^{-t}\). Therefore,

\[\begin{aligned}
\frac{dz}{dt} &= \frac{2xe^{2t} + ye^{-t}}{\sqrt{x^2 - y^2}} \\
&= \frac{2(e^{2t})e^{2t} + (e^{-t})e^{-t}}{\sqrt{e^{4t} - e^{-2t}}} \\
&= \frac{2e^{4t} + e^{-2t}}{\sqrt{e^{4t} - e^{-2t}}}.
\end{aligned}\]

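To eliminate negative exponents, we multiply the top by \(e^{2t}\) and the bottom by \(\sqrt{e^{4t}}\):

\[\begin{aligned}
\frac{dz}{dt} &= \frac{2e^{4t} + e^{-2t}}{\sqrt{e^{4t} - e^{-2t}}} \cdot \frac{e^{2t}}{\sqrt{e^{4t}}} \\
&= \frac{2e^{6t} + 1}{\sqrt{e^{8t} - e^{2t}}} \\
&= \frac{2e^{6t} + 1}{\sqrt{e^{2t}(e^{6t} - 1)}} \\
&= \frac{2e^{6t} + 1}{e^{t}\sqrt{e^{6t} - 1}}.
\end{aligned}\]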

Again, this derivative can also be calculated by first substituting x(t) and y(t) into f (x, y), then differentiating with respect to
t:

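\[\begin{aligned}
z &= f(x, y) \\
&= f(x(t), y(t)) \\
&= \sqrt{(x(t))^2 - (y(t))^2} \\
&= \sqrt{e^{4t} - e^{-2t}} \\
&= (e^{4t} - e^{-2t})^{1/2}.
\end{aligned}\]

Then

\[\begin{aligned}
\frac{dz}{dt} &= \frac{1}{2}(e^{4t} - e^{-2t})^{-1/2}(4e^{4t} + 2e^{-2t}) \\
&= \frac{2e^{4t} + e^{-2t}}{\sqrt{e^{4t} - e^{-2t}}}.
\end{aligned}\]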

This is the same solution.
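The agreement between the chain-rule computation, the simplified closed form, and direct differentiation in part (b) can be spot-checked at a sample value of \(t\). This is an illustrative sketch, not part of the original solution: the function names are hypothetical, the sample point \(t = 0.5\) is arbitrary, and the direct derivative is estimated with a central difference rather than computed exactly.

```python
import math


def f(x, y):
    return math.sqrt(x**2 - y**2)


def x_of(t):
    return math.exp(2 * t)


def y_of(t):
    return math.exp(-t)


def dz_dt_chain_rule(t):
    """Equation 14.5.1 with the four derivatives from part (b)."""
    x, y = x_of(t), y_of(t)
    root = math.sqrt(x**2 - y**2)
    dz_dx, dz_dy = x / root, -y / root
    dx_dt, dy_dt = 2 * math.exp(2 * t), -math.exp(-t)
    return dz_dx * dx_dt + dz_dy * dy_dt


def dz_dt_closed_form(t):
    """The simplified answer (2e^{6t} + 1) / (e^t sqrt(e^{6t} - 1))."""
    return (2 * math.exp(6 * t) + 1) / (math.exp(t) * math.sqrt(math.exp(6 * t) - 1))


def dz_dt_numeric(t, h=1e-6):
    """Central difference of t -> f(x(t), y(t)); h is an arbitrary small step."""
    return (f(x_of(t + h), y_of(t + h)) - f(x_of(t - h), y_of(t - h))) / (2 * h)


t = 0.5  # any t > 0 keeps x^2 - y^2 positive
print(dz_dt_chain_rule(t), dz_dt_closed_form(t), dz_dt_numeric(t))
```

The three printed values should agree to several decimal places; a mismatch would point to a sign or substitution slip somewhere in the algebra.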

 Exercise 14.5.1

Calculate dz/dt given the following functions. Express the final answer in terms of t .
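\[z = f(x, y) = x^2 - 3xy + 2y^2, \qquad x = x(t) = 3\sin 2t, \qquad y = y(t) = 4\cos 2t\]

Hint
Calculate \(\partial z/\partial x\), \(\partial z/\partial y\), \(dx/dt\), and \(dy/dt\), then use Equation 14.5.1.
Answer
\[\begin{aligned}
\frac{dz}{dt} &= \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt} \\
&= (2x - 3y)(6\cos 2t) + (-3x + 4y)(-8\sin 2t) \\
&= -92\sin 2t\cos 2t - 72(\cos^2 2t - \sin^2 2t) \\
&= -46\sin 4t - 72\cos 4t.
\end{aligned}\]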

It is often useful to create a visual representation of Equation 14.5.1 for the chain rule. This is called a tree diagram for the chain
rule for functions of one variable and it provides a way to remember the formula (Figure 14.5.1). This diagram can be expanded for
functions of more than one variable, as we shall see very shortly.

Figure 14.5.1: Tree diagram for the case \(\dfrac{dz}{dt} = \dfrac{\partial z}{\partial x} \cdot \dfrac{dx}{dt} + \dfrac{\partial z}{\partial y} \cdot \dfrac{dy}{dt}\).

In this diagram, the leftmost corner corresponds to z = f (x, y) . Since f has two independent variables, there are two lines
coming from this corner. The upper branch corresponds to the variable x and the lower branch corresponds to the variable y . Since
each of these variables is then dependent on one variable t , one branch then comes from x and one branch comes from y . Last,
each of the branches on the far right has a label that represents the path traveled to reach that branch. The top branch is reached by
following the x branch, then the t branch; therefore, it is labeled (∂z/∂x) × (dx/dt). The bottom branch is similar: first the y
branch, then the t branch. This branch is labeled (∂z/∂y) × (dy/dt) . To get the formula for dz/dt, add all the terms that appear
on the rightmost side of the diagram. This gives us Equation 14.5.1.
In Chain Rule for Two Independent Variables, z = f (x, y) is a function of x and y , and both x = g(u, v) and y = h(u, v) are
functions of the independent variables u and v .

 Chain Rule for Two Independent Variables

Suppose x = g(u, v) and y = h(u, v) are differentiable functions of u and v , and z = f (x, y) is a differentiable function of x
and y . Then, z = f (g(u, v), h(u, v)) is a differentiable function of u and v , and
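\[\frac{\partial z}{\partial u} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial u} \tag{14.5.2}\]

and

\[\frac{\partial z}{\partial v} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial v} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial v}. \tag{14.5.3}\]

We can draw a tree diagram for each of these formulas as well, as follows.

Figure 14.5.2: Tree diagram for \(\dfrac{\partial z}{\partial u} = \dfrac{\partial z}{\partial x} \cdot \dfrac{\partial x}{\partial u} + \dfrac{\partial z}{\partial y} \cdot \dfrac{\partial y}{\partial u}\) and \(\dfrac{\partial z}{\partial v} = \dfrac{\partial z}{\partial x} \cdot \dfrac{\partial x}{\partial v} + \dfrac{\partial z}{\partial y} \cdot \dfrac{\partial y}{\partial v}\).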

To derive the formula for ∂z/∂u, start from the left side of the diagram, then follow only the branches that end with u and add the
terms that appear at the end of those branches. For the formula for ∂z/∂v , follow only the branches that end with v and add the
terms that appear at the end of those branches.
There is an important difference between these two chain rule theorems. In Chain Rule for One Independent Variable, the left-hand
side of the formula for the derivative is not a partial derivative, but in Chain Rule for Two Independent Variables it is. The reason is
that, in Chain Rule for One Independent Variable, z is ultimately a function of t alone, whereas in Chain Rule for Two Independent
Variables, z is a function of both u and v .

 Example 14.5.2: Using the Chain Rule for Two Variables


Calculate ∂z/∂u and ∂z/∂v using the following functions:
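\[z = f(x, y) = 3x^2 - 2xy + y^2, \qquad x = x(u, v) = 3u + 2v, \qquad y = y(u, v) = 4u - v.\]

Solution
To implement the chain rule for two variables, we need six partial derivatives: \(\partial z/\partial x\), \(\partial z/\partial y\), \(\partial x/\partial u\), \(\partial x/\partial v\), \(\partial y/\partial u\), and \(\partial y/\partial v\):

\[\begin{aligned}
\frac{\partial z}{\partial x} &= 6x - 2y & \frac{\partial z}{\partial y} &= -2x + 2y \\
\frac{\partial x}{\partial u} &= 3 & \frac{\partial x}{\partial v} &= 2 \\
\frac{\partial y}{\partial u} &= 4 & \frac{\partial y}{\partial v} &= -1.
\end{aligned}\]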

To find ∂z/∂u, we use Equation 14.5.2:


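\[\begin{aligned}
\frac{\partial z}{\partial u} &= \frac{\partial z}{\partial x} \cdot \frac{\partial x}{\partial u} + \frac{\partial z}{\partial y} \cdot \frac{\partial y}{\partial u} \\
&= 3(6x - 2y) + 4(-2x + 2y) \\
&= 10x + 2y.
\end{aligned}\]

Next, we substitute \(x(u, v) = 3u + 2v\) and \(y(u, v) = 4u - v\):

\[\begin{aligned}
\frac{\partial z}{\partial u} &= 10x + 2y \\
&= 10(3u + 2v) + 2(4u - v) \\
&= 38u + 18v.
\end{aligned}\]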

To find ∂z/∂v, we use Equation 14.5.3:


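\[\begin{aligned}
\frac{\partial z}{\partial v} &= \frac{\partial z}{\partial x}\frac{\partial x}{\partial v} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial v} \\
&= 2(6x - 2y) + (-1)(-2x + 2y) \\
&= 14x - 6y.
\end{aligned}\]

Then we substitute \(x(u, v) = 3u + 2v\) and \(y(u, v) = 4u - v\):

\[\begin{aligned}
\frac{\partial z}{\partial v} &= 14x - 6y \\
&= 14(3u + 2v) - 6(4u - v) \\
&= 18u + 34v.
\end{aligned}\]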
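Equations 14.5.2 and 14.5.3 translate almost line for line into code. The sketch below is illustrative: the function name `dz_du_dv` and the sample point \((u, v) = (1.5, -2)\) are arbitrary choices, and the six partial derivatives are hard-coded from the solution above.

```python
def dz_du_dv(u, v):
    """Return (dz/du, dz/dv) for Example 14.5.2 via the two-variable chain rule."""
    # Intermediate variables
    x, y = 3 * u + 2 * v, 4 * u - v
    # Partials of z = 3x^2 - 2xy + y^2 with respect to x and y
    zx, zy = 6 * x - 2 * y, -2 * x + 2 * y
    # Partials of x and y with respect to u and v
    xu, xv, yu, yv = 3, 2, 4, -1
    return zx * xu + zy * yu, zx * xv + zy * yv


u, v = 1.5, -2.0
dz_du, dz_dv = dz_du_dv(u, v)
print(dz_du, 38 * u + 18 * v)   # chain rule vs. simplified formula
print(dz_dv, 18 * u + 34 * v)
```

Each printed pair should match, confirming that the simplified expressions \(38u + 18v\) and \(18u + 34v\) agree with the unsimplified chain-rule sums at this point.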

 Exercise 14.5.2

Calculate ∂z/∂u and ∂z/∂v given the following functions:


\[ z = f(x, y) = \frac{2x - y}{x + 3y}, \quad x(u, v) = e^{2u}\cos 3v, \quad y(u, v) = e^{2u}\sin 3v. \]

Hint
Calculate ∂z/∂x, ∂z/∂y, ∂x/∂u, ∂x/∂v, ∂y/∂u, and ∂y/∂v, then use Equation 14.5.2 and Equation 14.5.3.
Answer
\[ \frac{\partial z}{\partial u} = 0, \qquad \frac{\partial z}{\partial v} = \frac{-21}{(3\sin 3v + \cos 3v)^2} \]

The Generalized Chain Rule


Now that we’ve seen how to extend the original chain rule to functions of two variables, it is natural to ask: Can we extend the rule
to more than two variables? The answer is yes, as the generalized chain rule states.

 Generalized Chain Rule


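Let \(w = f(x_1, x_2, \ldots, x_m)\) be a differentiable function of m independent variables, and for each \(i \in \{1, \ldots, m\}\), let
\(x_i = x_i(t_1, t_2, \ldots, t_n)\) be a differentiable function of n independent variables. Then
\[ \frac{\partial w}{\partial t_j} = \frac{\partial w}{\partial x_1}\frac{\partial x_1}{\partial t_j} + \frac{\partial w}{\partial x_2}\frac{\partial x_2}{\partial t_j} + \cdots + \frac{\partial w}{\partial x_m}\frac{\partial x_m}{\partial t_j} \]

for any \(j \in \{1, 2, \ldots, n\}\).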
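The sum in the generalized chain rule translates directly into a loop over the intermediate variables. The following is a minimal numerical sketch, assuming central differences in place of exact partial derivatives. The helper names `partial` and `chain_rule_partial`, and the sample functions with m = 3 and n = 2, are hypothetical choices for illustration.

```python
def partial(func, args, index, h=1e-6):
    """Central-difference estimate of d(func)/d(args[index])."""
    up, down = list(args), list(args)
    up[index] += h
    down[index] -= h
    return (func(*up) - func(*down)) / (2 * h)


def chain_rule_partial(f, xs, ts, j):
    """Estimate dw/dt_j as the sum over i of (dw/dx_i) * (dx_i/dt_j).

    f takes m arguments; xs is a list of m functions, each of n arguments.
    """
    x_values = [x_i(*ts) for x_i in xs]
    return sum(
        partial(f, x_values, i) * partial(xs[i], ts, j)
        for i in range(len(xs))
    )


# Hypothetical example: m = 3 intermediate variables, n = 2 independent ones.
def f(x1, x2, x3):
    return x1 * x2 + x3**2


xs = [
    lambda t1, t2: t1 + t2,
    lambda t1, t2: t1 * t2,
    lambda t1, t2: t1 - t2,
]


def composite(t1, t2):
    """w written directly as a function of t1 and t2."""
    return f(*(x_i(t1, t2) for x_i in xs))


ts = [1.0, 2.0]
for j in range(len(ts)):
    # Chain-rule sum vs. differentiating the composite function directly.
    print(j, chain_rule_partial(f, xs, ts, j), partial(composite, ts, j))
```

Both columns should agree up to finite-difference error, which is the content of the theorem: differentiating the composite directly and summing over the branches give the same value.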

In the next example we calculate the derivative of a function of three independent variables in which each of the three variables is
dependent on two other variables.

 Example 14.5.3: Using the Generalized Chain Rule


Calculate ∂w/∂u and ∂w/∂v using the following functions:

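\[ w = f(x, y, z) = 3x^2 - 2xy + 4z^2 \]
\[ x = x(u, v) = e^u \sin v \]
\[ y = y(u, v) = e^u \cos v \]
\[ z = z(u, v) = e^u. \]

Solution
The formulas for ∂w/∂u and ∂w/∂v are
\[ \frac{\partial w}{\partial u} = \frac{\partial w}{\partial x}\cdot\frac{\partial x}{\partial u} + \frac{\partial w}{\partial y}\cdot\frac{\partial y}{\partial u} + \frac{\partial w}{\partial z}\cdot\frac{\partial z}{\partial u} \]
\[ \frac{\partial w}{\partial v} = \frac{\partial w}{\partial x}\cdot\frac{\partial x}{\partial v} + \frac{\partial w}{\partial y}\cdot\frac{\partial y}{\partial v} + \frac{\partial w}{\partial z}\cdot\frac{\partial z}{\partial v}. \]

Therefore, nine partial derivatives need to be calculated and substituted:
\[ \frac{\partial w}{\partial x} = 6x - 2y, \qquad \frac{\partial w}{\partial y} = -2x, \qquad \frac{\partial w}{\partial z} = 8z \]
\[ \frac{\partial x}{\partial u} = e^u \sin v, \qquad \frac{\partial y}{\partial u} = e^u \cos v, \qquad \frac{\partial z}{\partial u} = e^u \]
\[ \frac{\partial x}{\partial v} = e^u \cos v, \qquad \frac{\partial y}{\partial v} = -e^u \sin v, \qquad \frac{\partial z}{\partial v} = 0. \]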

Now, we substitute each of them into the first formula to calculate ∂w/∂u:
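\[ \begin{aligned}
\frac{\partial w}{\partial u} &= \frac{\partial w}{\partial x}\cdot\frac{\partial x}{\partial u} + \frac{\partial w}{\partial y}\cdot\frac{\partial y}{\partial u} + \frac{\partial w}{\partial z}\cdot\frac{\partial z}{\partial u} \\
&= (6x - 2y)e^u \sin v - 2x e^u \cos v + 8z e^u,
\end{aligned} \]

then substitute \(x(u, v) = e^u \sin v\), \(y(u, v) = e^u \cos v\), and \(z(u, v) = e^u\) into this equation:

\[ \begin{aligned}
\frac{\partial w}{\partial u} &= (6x - 2y)e^u \sin v - 2x e^u \cos v + 8z e^u \\
&= (6e^u \sin v - 2e^u \cos v)e^u \sin v - 2(e^u \sin v)e^u \cos v + 8e^{2u} \\
&= 6e^{2u}\sin^2 v - 4e^{2u}\sin v \cos v + 8e^{2u} \\
&= 2e^{2u}(3\sin^2 v - 2\sin v \cos v + 4).
\end{aligned} \]

Next, we calculate ∂w/∂v:

\[ \begin{aligned}
\frac{\partial w}{\partial v} &= \frac{\partial w}{\partial x}\cdot\frac{\partial x}{\partial v} + \frac{\partial w}{\partial y}\cdot\frac{\partial y}{\partial v} + \frac{\partial w}{\partial z}\cdot\frac{\partial z}{\partial v} \\
&= (6x - 2y)e^u \cos v - 2x(-e^u \sin v) + 8z(0),
\end{aligned} \]

then we substitute \(x(u, v) = e^u \sin v\), \(y(u, v) = e^u \cos v\), and \(z(u, v) = e^u\) into this equation:

\[ \begin{aligned}
\frac{\partial w}{\partial v} &= (6x - 2y)e^u \cos v - 2x(-e^u \sin v) \\
&= (6e^u \sin v - 2e^u \cos v)e^u \cos v + 2(e^u \sin v)(e^u \sin v) \\
&= 2e^{2u}\sin^2 v + 6e^{2u}\sin v \cos v - 2e^{2u}\cos^2 v \\
&= 2e^{2u}(\sin^2 v + 3\sin v \cos v - \cos^2 v).
\end{aligned} \]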
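A quick numerical cross-check of Example 14.5.3 is sketched below. It builds ∂w/∂u and ∂w/∂v from the nine partial derivatives and compares them with central-difference estimates of w written directly in terms of u and v. The function names and the test point (u, v) = (0.3, 0.7) are assumptions made for this illustration.

```python
import math


def w_of(u, v):
    """w = 3x^2 - 2xy + 4z^2 with x = e^u sin v, y = e^u cos v, z = e^u."""
    x = math.exp(u) * math.sin(v)
    y = math.exp(u) * math.cos(v)
    z = math.exp(u)
    return 3 * x**2 - 2 * x * y + 4 * z**2


def chain_rule(u, v):
    """Return (dw/du, dw/dv) assembled from the nine partial derivatives."""
    eu, s, c = math.exp(u), math.sin(v), math.cos(v)
    x, y, z = eu * s, eu * c, eu
    dw_dx, dw_dy, dw_dz = 6 * x - 2 * y, -2 * x, 8 * z
    dw_du = dw_dx * (eu * s) + dw_dy * (eu * c) + dw_dz * eu
    dw_dv = dw_dx * (eu * c) + dw_dy * (-eu * s) + dw_dz * 0.0
    return dw_du, dw_dv


def central_difference(g, t, h=1e-6):
    """Symmetric finite-difference estimate of g'(t)."""
    return (g(t + h) - g(t - h)) / (2 * h)


u0, v0 = 0.3, 0.7  # arbitrary test point
dw_du, dw_dv = chain_rule(u0, v0)
num_du = central_difference(lambda u: w_of(u, v0), u0)
num_dv = central_difference(lambda v: w_of(u0, v), v0)
print(dw_du, num_du)
print(dw_dv, num_dv)
```

Agreement between each pair of printed values is a useful guard against sign or coefficient slips when expanding and factoring by hand.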

 Exercise 14.5.3
Calculate ∂w/∂u and ∂w/∂v given the following functions:
\[ w = f(x, y, z) = \frac{x + 2y - 4z}{2x - y + 3z} \]
\[ x = x(u, v) = e^{2u}\cos 3v \]
\[ y = y(u, v) = e^{2u}\sin 3v \]
\[ z = z(u, v) = e^{2u}. \]

Hint
Calculate nine partial derivatives, then use the same formulas from Example 14.5.3.
Answer
\[ \frac{\partial w}{\partial u} = 0 \]
\[ \frac{\partial w}{\partial v} = \frac{15 - 33\sin 3v + 6\cos 3v}{(3 + 2\cos 3v - \sin 3v)^2} \]

 Example 14.5.4: Drawing a Tree Diagram

Create a tree diagram for the case when

w = f (x, y, z), x = x(t, u, v), y = y(t, u, v), z = z(t, u, v)

and write out the formulas for the three partial derivatives of w.
Solution
Starting from the left, the function f has three independent variables: x, y, and z . Therefore, three branches must be emanating
from the first node. Each of these three branches also has three branches, for each of the variables t, u, and v .

Figure 14.5.3 : Tree diagram for a function of three variables, each of which is a function of three independent variables.
The three formulas are

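\[ \frac{\partial w}{\partial t} = \frac{\partial w}{\partial x}\frac{\partial x}{\partial t} + \frac{\partial w}{\partial y}\frac{\partial y}{\partial t} + \frac{\partial w}{\partial z}\frac{\partial z}{\partial t} \]
\[ \frac{\partial w}{\partial u} = \frac{\partial w}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial w}{\partial y}\frac{\partial y}{\partial u} + \frac{\partial w}{\partial z}\frac{\partial z}{\partial u} \]
\[ \frac{\partial w}{\partial v} = \frac{\partial w}{\partial x}\frac{\partial x}{\partial v} + \frac{\partial w}{\partial y}\frac{\partial y}{\partial v} + \frac{\partial w}{\partial z}\frac{\partial z}{\partial v}. \]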

 Exercise 14.5.4

Create a tree diagram for the case when

w = f (x, y), x = x(t, u, v), y = y(t, u, v)

and write out the formulas for the three partial derivatives of w.

Hint
Determine the number of branches that emanate from each node in the tree.
Answer
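\[ \frac{\partial w}{\partial t} = \frac{\partial w}{\partial x}\frac{\partial x}{\partial t} + \frac{\partial w}{\partial y}\frac{\partial y}{\partial t} \]
\[ \frac{\partial w}{\partial u} = \frac{\partial w}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial w}{\partial y}\frac{\partial y}{\partial u} \]
\[ \frac{\partial w}{\partial v} = \frac{\partial w}{\partial x}\frac{\partial x}{\partial v} + \frac{\partial w}{\partial y}\frac{\partial y}{\partial v} \]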

Implicit Differentiation
Recall that implicit differentiation provides a method for finding dy/dx when y is defined implicitly as a function of x. The
method involves differentiating both sides of the equation defining the function with respect to x, then solving for dy/dx. Partial
derivatives provide an alternative to this method.
Consider the ellipse defined by the equation \(x^2 + 3y^2 + 4y - 4 = 0\) as follows.

Figure 14.5.4: Graph of the ellipse defined by \(x^2 + 3y^2 + 4y - 4 = 0\).
This equation implicitly defines y as a function of x. As such, we can find the derivative dy/dx using the method of implicit
differentiation:
\[ \begin{aligned}
\frac{d}{dx}\left(x^2 + 3y^2 + 4y - 4\right) &= \frac{d}{dx}(0) \\
2x + 6y\frac{dy}{dx} + 4\frac{dy}{dx} &= 0 \\
(6y + 4)\frac{dy}{dx} &= -2x \\
\frac{dy}{dx} &= -\frac{x}{3y + 2}.
\end{aligned} \]

We can also define a function z = f(x, y) by using the left-hand side of the equation defining the ellipse. Then
\(f(x, y) = x^2 + 3y^2 + 4y - 4\). The ellipse \(x^2 + 3y^2 + 4y - 4 = 0\) can then be described by the equation f(x, y) = 0. Using this
function and the following theorem gives us an alternative approach to calculating dy/dx.

 Theorem: Implicit Differentiation of a Function of Two or More Variables

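Suppose the function z = f(x, y) defines y implicitly as a function y = g(x) of x via the equation f(x, y) = 0. Then
\[ \frac{dy}{dx} = -\frac{\partial f/\partial x}{\partial f/\partial y} \tag{14.5.4} \]

provided \(f_y(x, y) \neq 0\).

If the equation f(x, y, z) = 0 defines z implicitly as a differentiable function of x and y, then

\[ \frac{\partial z}{\partial x} = -\frac{\partial f/\partial x}{\partial f/\partial z} \quad \text{and} \quad \frac{\partial z}{\partial y} = -\frac{\partial f/\partial y}{\partial f/\partial z} \tag{14.5.5} \]

as long as \(f_z(x, y, z) \neq 0\).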

Equation 14.5.4 is a direct consequence of Equation 14.5.2. In particular, if we assume that y is defined implicitly as a function of
x via the equation f (x, y) = 0, we can apply the chain rule to find dy/dx :

\[ \begin{aligned}
\frac{d}{dx} f(x, y) &= \frac{d}{dx}(0) \\
\frac{\partial f}{\partial x}\cdot\frac{dx}{dx} + \frac{\partial f}{\partial y}\cdot\frac{dy}{dx} &= 0 \\
\frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\cdot\frac{dy}{dx} &= 0.
\end{aligned} \]

Solving this equation for dy/dx gives Equation 14.5.4. Equation 14.5.5 can be derived in a similar fashion.
Let’s now return to the problem that we started before the previous theorem. Using Equation 14.5.4 and the function
\(f(x, y) = x^2 + 3y^2 + 4y - 4\), we obtain
\[ \frac{\partial f}{\partial x} = 2x \]
\[ \frac{\partial f}{\partial y} = 6y + 4. \]

Then Equation 14.5.4 gives

\[ \frac{dy}{dx} = -\frac{\partial f/\partial x}{\partial f/\partial y} = -\frac{2x}{6y + 4} = -\frac{x}{3y + 2}, \]

which is the same result obtained by the earlier use of implicit differentiation.
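The agreement between the two approaches can also be checked numerically. In the sketch below, the upper branch of the ellipse is solved explicitly for y with the quadratic formula, and its finite-difference slope is compared with −f_x/f_y. The names `y_upper` and `dy_dx_implicit` and the sample point x = 1 are assumptions chosen for this illustration.

```python
import math


def y_upper(x):
    """Upper branch of x^2 + 3y^2 + 4y - 4 = 0, from the quadratic formula in y."""
    return (-4 + math.sqrt(64 - 12 * x**2)) / 6


def dy_dx_implicit(x, y):
    """dy/dx = -(df/dx)/(df/dy) for f(x, y) = x^2 + 3y^2 + 4y - 4."""
    f_x = 2 * x
    f_y = 6 * y + 4
    return -f_x / f_y


x0 = 1.0
y0 = y_upper(x0)
h = 1e-6
numeric_slope = (y_upper(x0 + h) - y_upper(x0 - h)) / (2 * h)
print(dy_dx_implicit(x0, y0), numeric_slope)
```

The formula only needs the point (x, y) on the curve, not an explicit expression for y, which is why it remains usable when the equation cannot be solved for y in closed form.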

 Example 14.5.5: Implicit Differentiation by Partial Derivatives


a. Calculate dy/dx if y is defined implicitly as a function of x via the equation \(3x^2 - 2xy + y^2 + 4x - 6y - 11 = 0\). What
is the equation of the tangent line to the graph of this curve at point (2, 1)?
b. Calculate ∂z/∂x and ∂z/∂y, given \(x^2 e^y - yze^x = 0\).

Solution
a. Set \(f(x, y) = 3x^2 - 2xy + y^2 + 4x - 6y - 11\), then calculate \(f_x\) and \(f_y\): \(f_x(x, y) = 6x - 2y + 4\) and
\(f_y(x, y) = -2x + 2y - 6\).

The derivative is given by

\[ \frac{dy}{dx} = -\frac{\partial f/\partial x}{\partial f/\partial y} = -\frac{6x - 2y + 4}{-2x + 2y - 6} = \frac{3x - y + 2}{x - y + 3}. \]

The slope of the tangent line at point (2, 1) is given by

\[ \left.\frac{dy}{dx}\right|_{(x, y) = (2, 1)} = \frac{3(2) - 1 + 2}{2 - 1 + 3} = \frac{7}{4}. \]

To find the equation of the tangent line, we use the point-slope form (Figure 14.5.5):
\[ \begin{aligned}
y - y_0 &= m(x - x_0) \\
y - 1 &= \frac{7}{4}(x - 2) \\
y &= \frac{7}{4}x - \frac{7}{2} + 1 \\
y &= \frac{7}{4}x - \frac{5}{2}.
\end{aligned} \]

Figure 14.5.5: Graph of the rotated ellipse defined by \(3x^2 - 2xy + y^2 + 4x - 6y - 11 = 0\).

b. We have \(f(x, y, z) = x^2 e^y - yze^x\). Therefore,
\[ \frac{\partial f}{\partial x} = 2xe^y - yze^x \]
\[ \frac{\partial f}{\partial y} = x^2 e^y - ze^x \]
\[ \frac{\partial f}{\partial z} = -ye^x. \]

Using Equation 14.5.5,

\[ \frac{\partial z}{\partial x} = -\frac{\partial f/\partial x}{\partial f/\partial z} = -\frac{2xe^y - yze^x}{-ye^x} = \frac{2xe^y - yze^x}{ye^x} \]

and

\[ \frac{\partial z}{\partial y} = -\frac{\partial f/\partial y}{\partial f/\partial z} = -\frac{x^2 e^y - ze^x}{-ye^x} = \frac{x^2 e^y - ze^x}{ye^x}. \]
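For part b, the equation happens to be solvable for z explicitly (when y ≠ 0), which allows a numerical check of the two partial derivatives. The sketch below is an illustration under that assumption. The function names and the test point (x, y) = (1.5, 2) are hypothetical.

```python
import math


def z_explicit(x, y):
    """Solve x^2 e^y - y z e^x = 0 for z (requires y != 0)."""
    return x**2 * math.exp(y - x) / y


def dz_dx(x, y, z):
    """dz/dx = (2x e^y - y z e^x) / (y e^x), from the implicit formula."""
    return (2 * x * math.exp(y) - y * z * math.exp(x)) / (y * math.exp(x))


def dz_dy(x, y, z):
    """dz/dy = (x^2 e^y - z e^x) / (y e^x), from the implicit formula."""
    return (x**2 * math.exp(y) - z * math.exp(x)) / (y * math.exp(x))


x0, y0 = 1.5, 2.0  # arbitrary test point with y != 0
z0 = z_explicit(x0, y0)
h = 1e-6
num_dz_dx = (z_explicit(x0 + h, y0) - z_explicit(x0 - h, y0)) / (2 * h)
num_dz_dy = (z_explicit(x0, y0 + h) - z_explicit(x0, y0 - h)) / (2 * h)
print(dz_dx(x0, y0, z0), num_dz_dx)
print(dz_dy(x0, y0, z0), num_dz_dy)
```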

 Exercise 14.5.5

Find dy/dx if y is defined implicitly as a function of x by the equation \(x^2 + xy - y^2 + 7x - 3y - 26 = 0\). What is the
equation of the tangent line to the graph of this curve at point (3, −2)?

Hint
Calculate ∂f/∂x and ∂f/∂y, then use Equation 14.5.4.
Solution
\[ \frac{dy}{dx} = \left.\frac{2x + y + 7}{2y - x + 3}\right|_{(3, -2)} = \frac{2(3) + (-2) + 7}{2(-2) - (3) + 3} = -\frac{11}{4} \]

Equation of the tangent line: \(y = -\frac{11}{4}x + \frac{25}{4}\)

Key Concepts
The chain rule for functions of more than one variable involves the partial derivatives with respect to all the independent
variables.
Tree diagrams are useful for deriving formulas for the chain rule for functions of more than one variable, where each
independent variable also depends on other variables.

Key Equations
Chain rule, one independent variable
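\[ \frac{dz}{dt} = \frac{\partial z}{\partial x}\cdot\frac{dx}{dt} + \frac{\partial z}{\partial y}\cdot\frac{dy}{dt} \]

Chain rule, two independent variables

\[ \frac{\partial z}{\partial u} = \frac{\partial z}{\partial x}\cdot\frac{\partial x}{\partial u} + \frac{\partial z}{\partial y}\cdot\frac{\partial y}{\partial u}, \qquad \frac{\partial z}{\partial v} = \frac{\partial z}{\partial x}\cdot\frac{\partial x}{\partial v} + \frac{\partial z}{\partial y}\cdot\frac{\partial y}{\partial v} \]

Generalized chain rule

\[ \frac{\partial w}{\partial t_j} = \frac{\partial w}{\partial x_1}\frac{\partial x_1}{\partial t_j} + \frac{\partial w}{\partial x_2}\frac{\partial x_2}{\partial t_j} + \cdots + \frac{\partial w}{\partial x_m}\frac{\partial x_m}{\partial t_j} \]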

Glossary
generalized chain rule
the chain rule extended to functions of more than one independent variable, in which each independent variable may depend on
one or more other variables

intermediate variable
given a composition of functions (e.g., f (x(t), y(t))), the intermediate variables are the variables that are independent in the
outer function but dependent on other variables as well; in the function f (x(t), y(t)), the variables x and y are examples of
intermediate variables

tree diagram
illustrates and derives formulas for the generalized chain rule, in which each independent variable is accounted for

14.5: The Chain Rule is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
14.5: The Chain Rule for Multivariable Functions by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0. Original source:
https://openstax.org/details/books/calculus-volume-1.

14.6: Directional Derivatives and the Gradient Vector
 Learning Objectives
Determine the directional derivative in a given direction for a function of two variables.
Determine the gradient vector of a given real-valued function.
Explain the significance of the gradient vector with regard to direction of change along a surface.
Use the gradient to find the tangent to a level curve of a given function.
Calculate directional derivatives and gradients in three dimensions.

A function z = f (x, y) has two partial derivatives: ∂z/∂x and ∂z/∂y . These derivatives correspond to each of the independent variables
and can be interpreted as instantaneous rates of change (that is, as slopes of a tangent line). For example, ∂z/∂x represents the slope of a
tangent line passing through a given point on the surface defined by z = f (x, y), assuming the tangent line is parallel to the x-axis.
Similarly, ∂z/∂y represents the slope of the tangent line parallel to the y -axis. Now we consider the possibility of a tangent line parallel to
neither axis.

Directional Derivatives
We start with the graph of a surface defined by the equation z = f(x, y). Given a point (a, b) in the domain of f, we choose a direction to
travel from that point. We measure the direction using an angle θ, which is measured counterclockwise in the xy-plane, starting at zero
from the positive x-axis (Figure 14.6.1). The distance we travel is h and the direction we travel is given by the unit vector
\(\vec{u} = (\cos\theta)\,\hat{i} + (\sin\theta)\,\hat{j}\). Therefore, the z-coordinate of the second point on the graph is given by \(z = f(a + h\cos\theta, b + h\sin\theta)\).

Figure 14.6.1: Finding the directional derivative at a point on the graph of z = f(x, y). The slope of the blue arrow on the graph indicates
the value of the directional derivative at that point.

We can calculate the slope of the secant line by dividing the difference in z-values by the length of the line segment connecting the two
points in the domain. The length of the line segment is h. Therefore, the slope of the secant line is
\[ m_{\text{sec}} = \frac{f(a + h\cos\theta, b + h\sin\theta) - f(a, b)}{h}. \]

To find the slope of the tangent line in the same direction, we take the limit as h approaches zero.

 Definition: Directional Derivatives

Suppose z = f(x, y) is a function of two variables with a domain of D. Let (a, b) ∈ D and define \(\vec{u} = (\cos\theta)\,\hat{i} + (\sin\theta)\,\hat{j}\). Then
the directional derivative of f in the direction of \(\vec{u}\) is given by

\[ D_{\vec{u}} f(a, b) = \lim_{h \to 0} \frac{f(a + h\cos\theta, b + h\sin\theta) - f(a, b)}{h} \tag{14.6.1} \]

provided the limit exists.

14.6.1 https://math.libretexts.org/@go/page/4541
Equation 14.6.1 provides a formal definition of the directional derivative that can be used in many cases to calculate a directional
derivative.
Note that since the point (a, b) is an arbitrary point in the domain D of the function f, we can use this definition to find the directional
derivative as a function of x and y.
That is,
\[ D_{\vec{u}} f(x, y) = \lim_{h \to 0} \frac{f(x + h\cos\theta, y + h\sin\theta) - f(x, y)}{h} \tag{14.6.2} \]

 Example 14.6.1: Finding a Directional Derivative from the Definition


Let θ = arccos(3/5). Find the directional derivative \(D_{\vec{u}} f(x, y)\) of \(f(x, y) = x^2 - xy + 3y^2\) in the direction of
\(\vec{u} = (\cos\theta)\,\hat{i} + (\sin\theta)\,\hat{j}\).
Then determine \(D_{\vec{u}} f(-1, 2)\).
Solution
First of all, since cos θ = 3/5 and θ is acute, this implies
\[ \sin\theta = \sqrt{1 - \left(\frac{3}{5}\right)^2} = \sqrt{\frac{16}{25}} = \frac{4}{5}. \]

Using \(f(x, y) = x^2 - xy + 3y^2\), we first calculate \(f(x + h\cos\theta, y + h\sin\theta)\):
\[ \begin{aligned}
f(x + h\cos\theta, y + h\sin\theta) &= (x + h\cos\theta)^2 - (x + h\cos\theta)(y + h\sin\theta) + 3(y + h\sin\theta)^2 \\
&= x^2 + 2xh\cos\theta + h^2\cos^2\theta - xy - xh\sin\theta - yh\cos\theta - h^2\sin\theta\cos\theta + 3y^2 + 6yh\sin\theta + 3h^2\sin^2\theta \\
&= x^2 + 2xh\left(\frac{3}{5}\right) + \frac{9h^2}{25} - xy - \frac{4xh}{5} - \frac{3yh}{5} - \frac{12h^2}{25} + 3y^2 + 6yh\left(\frac{4}{5}\right) + 3h^2\left(\frac{16}{25}\right) \\
&= x^2 - xy + 3y^2 + \frac{2xh}{5} + \frac{9h^2}{5} + \frac{21yh}{5}.
\end{aligned} \]

We substitute this expression into Equation 14.6.1 with a = x and b = y:

\[ \begin{aligned}
D_{\vec{u}} f(x, y) &= \lim_{h \to 0} \frac{f(x + h\cos\theta, y + h\sin\theta) - f(x, y)}{h} \\
&= \lim_{h \to 0} \frac{\left(x^2 - xy + 3y^2 + \frac{2xh}{5} + \frac{9h^2}{5} + \frac{21yh}{5}\right) - \left(x^2 - xy + 3y^2\right)}{h} \\
&= \lim_{h \to 0} \frac{\frac{2xh}{5} + \frac{9h^2}{5} + \frac{21yh}{5}}{h} \\
&= \lim_{h \to 0} \left(\frac{2x}{5} + \frac{9h}{5} + \frac{21y}{5}\right) \\
&= \frac{2x + 21y}{5}.
\end{aligned} \]

To calculate \(D_{\vec{u}} f(-1, 2)\), we substitute x = −1 and y = 2 into this answer (Figure 14.6.2):
\[ D_{\vec{u}} f(-1, 2) = \frac{2(-1) + 21(2)}{5} = \frac{-2 + 42}{5} = 8. \]

Figure 14.6.2: Finding the directional derivative in a given direction \(\vec{u}\) at a given point on a surface. The plane is tangent to the
surface at the given point (−1, 2, 15).
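The limit in Example 14.6.1 can be watched converging numerically. The sketch below evaluates the difference quotient at (−1, 2) for shrinking values of h. The function name `difference_quotient` and the particular step sizes are assumptions made for this illustration.

```python
def f(x, y):
    return x**2 - x * y + 3 * y**2


def difference_quotient(x, y, cos_t, sin_t, h):
    """Slope of the secant line over a step of length h in direction (cos_t, sin_t)."""
    return (f(x + h * cos_t, y + h * sin_t) - f(x, y)) / h


cos_t, sin_t = 3 / 5, 4 / 5  # theta = arccos(3/5)
for h in (0.1, 0.01, 0.001, 1e-6):
    print(h, difference_quotient(-1.0, 2.0, cos_t, sin_t, h))
```

From the algebra in the example, the quotient at this point equals 8 + 9h/5, so the printed values should approach 8 roughly linearly in h.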

An easier approach to calculating directional derivatives that involves partial derivatives is outlined in the following theorem.

 Directional Derivative of a Function of Two Variables


Let z = f(x, y) be a function of two variables x and y, and assume that \(f_x\) and \(f_y\) exist. Then the directional derivative of f in the
direction of \(\vec{u} = (\cos\theta)\,\hat{i} + (\sin\theta)\,\hat{j}\) is given by

\[ D_{\vec{u}} f(x, y) = f_x(x, y)\cos\theta + f_y(x, y)\sin\theta. \tag{14.6.3} \]

 Proof

Applying the definition of a directional derivative stated above in Equation 14.6.1, the directional derivative of f in the direction of
\(\vec{u} = (\cos\theta)\,\hat{i} + (\sin\theta)\,\hat{j}\) at a point \((x_0, y_0)\) in the domain of f can be written

\[ D_{\vec{u}} f(x_0, y_0) = \lim_{t \to 0} \frac{f(x_0 + t\cos\theta, y_0 + t\sin\theta) - f(x_0, y_0)}{t}. \]

Let \(x = x_0 + t\cos\theta\) and \(y = y_0 + t\sin\theta\), and define g(t) = f(x, y). Since \(f_x\) and \(f_y\) both exist, we can use the chain rule for
functions of two variables to calculate g'(t):
\[ g'(t) = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt} = f_x(x, y)\cos\theta + f_y(x, y)\sin\theta. \]

If t = 0, then \(x = x_0\) and \(y = y_0\), so

\[ g'(0) = f_x(x_0, y_0)\cos\theta + f_y(x_0, y_0)\sin\theta. \]

By the definition of g'(t), it is also true that

\[ g'(0) = \lim_{t \to 0} \frac{g(t) - g(0)}{t} = \lim_{t \to 0} \frac{f(x_0 + t\cos\theta, y_0 + t\sin\theta) - f(x_0, y_0)}{t}. \]

Therefore, \(D_{\vec{u}} f(x_0, y_0) = f_x(x_0, y_0)\cos\theta + f_y(x_0, y_0)\sin\theta\).
Since the point \((x_0, y_0)\) is an arbitrary point from the domain of f, this result holds for all points in the domain of f for which the
partials \(f_x\) and \(f_y\) exist.

Therefore,

\[ D_{\vec{u}} f(x, y) = f_x(x, y)\cos\theta + f_y(x, y)\sin\theta. \]

 Example 14.6.2: Finding a Directional Derivative: Alternative Method
Let θ = arccos(3/5). Find the directional derivative \(D_{\vec{u}} f(x, y)\) of \(f(x, y) = x^2 - xy + 3y^2\) in the direction of
\(\vec{u} = (\cos\theta)\,\hat{i} + (\sin\theta)\,\hat{j}\).
Then determine \(D_{\vec{u}} f(-1, 2)\).
Solution
First, we must calculate the partial derivatives of f :
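\[ f_x(x, y) = 2x - y \]
\[ f_y(x, y) = -x + 6y. \]

Then we use Equation 14.6.3 with θ = arccos(3/5):

\[ \begin{aligned}
D_{\vec{u}} f(x, y) &= f_x(x, y)\cos\theta + f_y(x, y)\sin\theta \\
&= \frac{3}{5}(2x - y) + \frac{4}{5}(-x + 6y) \\
&= \frac{6x}{5} - \frac{3y}{5} - \frac{4x}{5} + \frac{24y}{5} \\
&= \frac{2x + 21y}{5}.
\end{aligned} \]

To calculate \(D_{\vec{u}} f(-1, 2)\), let x = −1 and y = 2:
\[ D_{\vec{u}} f(-1, 2) = \frac{2(-1) + 21(2)}{5} = \frac{-2 + 42}{5} = 8. \]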

This is the same answer obtained in Example 14.6.1.
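Equation 14.6.3 is short enough to express as a one-line function. The sketch below applies it to the function from this example. The names `partials` and `directional_derivative` are assumptions for illustration, and the partial derivatives are entered by hand rather than computed symbolically.

```python
import math


def partials(x, y):
    """(f_x, f_y) for f(x, y) = x^2 - xy + 3y^2."""
    return (2 * x - y, -x + 6 * y)


def directional_derivative(x, y, theta):
    """D_u f = f_x cos(theta) + f_y sin(theta), as in Equation 14.6.3."""
    f_x, f_y = partials(x, y)
    return f_x * math.cos(theta) + f_y * math.sin(theta)


theta = math.acos(3 / 5)
print(directional_derivative(-1.0, 2.0, theta))  # approximately 8
```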

 Exercise 14.6.1:
Find the directional derivative \(D_{\vec{u}} f(x, y)\) of \(f(x, y) = 3x^2 y - 4xy^3 + 3y^2 - 4x\) in the direction of \(\vec{u} = \left(\cos\frac{\pi}{3}\right)\hat{i} + \left(\sin\frac{\pi}{3}\right)\hat{j}\)
using Equation 14.6.3.
What is \(D_{\vec{u}} f(3, 4)\)?

Hint
Calculate the partial derivatives and determine the value of θ.
Answer
\[ D_{\vec{u}} f(x, y) = \frac{(6xy - 4y^3 - 4)(1)}{2} + \frac{(3x^2 - 12xy^2 + 6y)\sqrt{3}}{2} \]
\[ D_{\vec{u}} f(3, 4) = \frac{72 - 256 - 4}{2} + \frac{(27 - 576 + 24)\sqrt{3}}{2} = -94 - \frac{525\sqrt{3}}{2} \]

If the vector that is given for the direction of the derivative is not a unit vector, then we first divide the vector by its norm to obtain a
unit vector in the same direction. For example, if we wished to find the directional derivative of the function in Example 14.6.2 in the direction of the vector
⟨−5, 12⟩, we would first divide by its magnitude to get \(\vec{u}\). This gives us \(\vec{u} = \left\langle -\frac{5}{13}, \frac{12}{13} \right\rangle\).

Then
\[ \begin{aligned}
D_{\vec{u}} f(x, y) &= f_x(x, y)\cos\theta + f_y(x, y)\sin\theta \\
&= -\frac{5}{13}(2x - y) + \frac{12}{13}(-x + 6y) \\
&= -\frac{22}{13}x + \frac{77}{13}y.
\end{aligned} \]
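The normalization step is easy to forget, so it helps to fold it into the computation. The following sketch normalizes an arbitrary nonzero direction vector before applying Equation 14.6.3, then compares the result with a difference quotient taken along the unit vector. The function names and the test point (1, 1) are assumptions for this illustration.

```python
import math


def f(x, y):
    return x**2 - x * y + 3 * y**2


def partials(x, y):
    """(f_x, f_y) for f(x, y) = x^2 - xy + 3y^2."""
    return (2 * x - y, -x + 6 * y)


def directional_derivative(x, y, direction):
    """Normalize `direction` (nonzero), then apply D_u f = f_x u1 + f_y u2."""
    norm = math.hypot(direction[0], direction[1])
    u1, u2 = direction[0] / norm, direction[1] / norm
    f_x, f_y = partials(x, y)
    return f_x * u1 + f_y * u2


x0, y0 = 1.0, 1.0  # arbitrary test point
exact = directional_derivative(x0, y0, (-5.0, 12.0))

# Difference quotient along the unit vector <-5/13, 12/13>.
h = 1e-6
quotient = (f(x0 - 5 * h / 13, y0 + 12 * h / 13) - f(x0, y0)) / h
print(exact, quotient)
```

Skipping the division by the norm would scale the answer by 13 here, which is the rate of change per unit of a parameter along ⟨−5, 12⟩ rather than per unit of distance.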

Gradient
The right-hand side of Equation 14.6.3 is equal to \(f_x(x, y)\cos\theta + f_y(x, y)\sin\theta\), which can be written as the dot product of two vectors.
Define the first vector as \(\nabla f(x, y) = f_x(x, y)\,\hat{i} + f_y(x, y)\,\hat{j}\) and the second vector as \(\vec{u} = (\cos\theta)\,\hat{i} + (\sin\theta)\,\hat{j}\). Then the right-hand
side of the equation can be written as the dot product of these two vectors:

\[ D_{\vec{u}} f(x, y) = \nabla f(x, y) \cdot \vec{u}. \tag{14.6.4} \]
u


The first vector in Equation 14.6.4 has a special name: the gradient of the function f . The symbol ∇ is called nabla and the vector ∇f is
read “del f .”

 Definition: The Gradient



Let z = f(x, y) be a function of x and y such that \(f_x\) and \(f_y\) exist. The vector \(\nabla f(x, y)\) is called the gradient of f and is defined as

\[ \nabla f(x, y) = f_x(x, y)\,\hat{i} + f_y(x, y)\,\hat{j}. \tag{14.6.5} \]


The vector ∇f (x, y) is also written as “grad f .”

 Example 14.6.3: Finding Gradients



Find the gradient \(\nabla f(x, y)\) of each of the following functions:
a. \(f(x, y) = x^2 - xy + 3y^2\)
b. \(f(x, y) = \sin 3x \cos 3y\)

Solution
For both parts a. and b., we first calculate the partial derivatives \(f_x\) and \(f_y\), then use Equation 14.6.5.

a. \(f_x(x, y) = 2x - y\) and \(f_y(x, y) = -x + 6y\), so

\[ \begin{aligned}
\nabla f(x, y) &= f_x(x, y)\,\hat{i} + f_y(x, y)\,\hat{j} \\
&= (2x - y)\,\hat{i} + (-x + 6y)\,\hat{j}.
\end{aligned} \]

b. \(f_x(x, y) = 3\cos 3x \cos 3y\) and \(f_y(x, y) = -3\sin 3x \sin 3y\), so

\[ \begin{aligned}
\nabla f(x, y) &= f_x(x, y)\,\hat{i} + f_y(x, y)\,\hat{j} \\
&= (3\cos 3x \cos 3y)\,\hat{i} - (3\sin 3x \sin 3y)\,\hat{j}.
\end{aligned} \]

 Exercise 14.6.2
Find the gradient \(\nabla f(x, y)\) of \(f(x, y) = \dfrac{x^2 - 3y^2}{2x + y}\).

Hint
Calculate the partial derivatives, then use Equation 14.6.5.
Answer
\[ \nabla f(x, y) = \frac{2x^2 + 2xy + 6y^2}{(2x + y)^2}\,\hat{i} - \frac{x^2 + 12xy + 3y^2}{(2x + y)^2}\,\hat{j} \]

The gradient has some important properties. We have already seen one formula that uses the gradient: the formula for the directional
derivative. Recall from The Dot Product that if the angle between two vectors \(\vec{a}\) and \(\vec{b}\) is φ, then \(\vec{a} \cdot \vec{b} = \|\vec{a}\|\|\vec{b}\|\cos\varphi\). Therefore, if
the angle between \(\nabla f(x_0, y_0)\) and \(\vec{u} = (\cos\theta)\,\hat{i} + (\sin\theta)\,\hat{j}\) is φ, we have

\[ D_{\vec{u}} f(x_0, y_0) = \nabla f(x_0, y_0) \cdot \vec{u} = \|\nabla f(x_0, y_0)\|\|\vec{u}\|\cos\varphi = \|\nabla f(x_0, y_0)\|\cos\varphi. \]

The \(\|\vec{u}\|\) disappears because \(\vec{u}\) is a unit vector. Therefore, the directional derivative is equal to the magnitude of the gradient evaluated at
\((x_0, y_0)\) multiplied by cos φ. Recall that cos φ ranges from −1 to 1.

If φ = 0, then cos φ = 1 and \(\nabla f(x_0, y_0)\) and \(\vec{u}\) both point in the same direction.

If φ = π, then cos φ = −1 and \(\nabla f(x_0, y_0)\) and \(\vec{u}\) point in opposite directions.

In the first case, the value of \(D_{\vec{u}} f(x_0, y_0)\) is maximized; in the second case, the value of \(D_{\vec{u}} f(x_0, y_0)\) is minimized.

We can also see that if \(\nabla f(x_0, y_0) = \vec{0}\), then

\[ D_{\vec{u}} f(x_0, y_0) = \nabla f(x_0, y_0) \cdot \vec{u} = 0 \]

for any vector \(\vec{u}\). These three cases are outlined in the following theorem.

 Properties of the Gradient


Suppose the function z = f(x, y) is differentiable at \((x_0, y_0)\) (Figure 14.6.3).
i. If \(\nabla f(x_0, y_0) = \vec{0}\), then \(D_{\vec{u}} f(x_0, y_0) = 0\) for any unit vector \(\vec{u}\).
ii. If \(\nabla f(x_0, y_0) \neq \vec{0}\), then \(D_{\vec{u}} f(x_0, y_0)\) is maximized when \(\vec{u}\) points in the same direction as \(\nabla f(x_0, y_0)\). The maximum value
of \(D_{\vec{u}} f(x_0, y_0)\) is \(\|\nabla f(x_0, y_0)\|\).
iii. If \(\nabla f(x_0, y_0) \neq \vec{0}\), then \(D_{\vec{u}} f(x_0, y_0)\) is minimized when \(\vec{u}\) points in the opposite direction from \(\nabla f(x_0, y_0)\). The minimum
value of \(D_{\vec{u}} f(x_0, y_0)\) is \(-\|\nabla f(x_0, y_0)\|\).

Figure 14.6.3 : The gradient indicates the maximum and minimum values of the directional derivative at a point.

 Example 14.6.4: Finding a Maximum Directional Derivative

Find the direction for which the directional derivative of \(f(x, y) = 3x^2 - 4xy + 2y^2\) at (−2, 3) is a maximum. What is the maximum
value?
Solution

The maximum value of the directional derivative occurs when \(\nabla f\) and the unit vector point in the same direction. Therefore, we start
by calculating \(\nabla f(x, y)\):

\[ f_x(x, y) = 6x - 4y \quad \text{and} \quad f_y(x, y) = -4x + 4y \]

so

\[ \nabla f(x, y) = f_x(x, y)\,\hat{i} + f_y(x, y)\,\hat{j} = (6x - 4y)\,\hat{i} + (-4x + 4y)\,\hat{j}. \]

Next, we evaluate the gradient at (−2, 3):

\[ \nabla f(-2, 3) = (6(-2) - 4(3))\,\hat{i} + (-4(-2) + 4(3))\,\hat{j} = -24\,\hat{i} + 20\,\hat{j}. \]

We need to find a unit vector that points in the same direction as \(\nabla f(-2, 3)\), so the next step is to divide \(\nabla f(-2, 3)\) by its
magnitude, which is \(\sqrt{(-24)^2 + (20)^2} = \sqrt{976} = 4\sqrt{61}\). Therefore,

\[ \frac{\nabla f(-2, 3)}{\|\nabla f(-2, 3)\|} = \frac{-24}{4\sqrt{61}}\,\hat{i} + \frac{20}{4\sqrt{61}}\,\hat{j} = -\frac{6\sqrt{61}}{61}\,\hat{i} + \frac{5\sqrt{61}}{61}\,\hat{j}. \]

This is the unit vector that points in the same direction as \(\nabla f(-2, 3)\). To find the angle corresponding to this unit vector, we solve the
equations
\[ \cos\theta = \frac{-6\sqrt{61}}{61} \quad \text{and} \quad \sin\theta = \frac{5\sqrt{61}}{61} \]

for θ. Since cosine is negative and sine is positive, the angle must be in the second quadrant. Therefore,
\(\theta = \pi - \arcsin\left((5\sqrt{61})/61\right) \approx 2.45\) rad.
The maximum value of the directional derivative at (−2, 3) is \(\|\nabla f(-2, 3)\| = 4\sqrt{61}\) (Figure 14.6.4).

Figure 14.6.4 : The maximum value of the directional derivative at (−2, 3) is in the direction of the gradient.
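The claim that no direction beats the gradient direction can be probed by brute force. The sketch below computes the gradient of Example 14.6.4 at (−2, 3), its norm, and its angle, then samples the directional derivative over many angles. Using `math.atan2` to recover the angle and sampling 3600 directions are choices made for this illustration, not part of the example.

```python
import math


def gradient(x, y):
    """Gradient of f(x, y) = 3x^2 - 4xy + 2y^2."""
    return (6 * x - 4 * y, -4 * x + 4 * y)


gx, gy = gradient(-2.0, 3.0)
max_rate = math.hypot(gx, gy)    # norm of the gradient
theta_max = math.atan2(gy, gx)   # angle of the gradient, in radians


def rate(theta):
    """Directional derivative at (-2, 3) in the direction of angle theta."""
    return gx * math.cos(theta) + gy * math.sin(theta)


# Sample directions every 0.1 degree; none should exceed the gradient norm.
best_sampled = max(rate(2 * math.pi * k / 3600) for k in range(3600))
print(max_rate, theta_max, best_sampled)
```

`atan2` handles the quadrant automatically, which avoids the separate "cosine negative, sine positive" step needed when inverting sine or cosine alone.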

 Exercise 14.6.3

Find the direction for which the directional derivative of \(g(x, y) = 4x - xy + 2y^2\) at (−2, 3) is a maximum. What is the maximum
value?

Hint
Evaluate the gradient of g at point (−2, 3).
Answer
The gradient of g at (−2, 3) is \(\nabla g(-2, 3) = \hat{i} + 14\,\hat{j}\). The unit vector that points in the same direction as \(\nabla g(-2, 3)\) is

\[ \frac{\nabla g(-2, 3)}{\|\nabla g(-2, 3)\|} = \frac{1}{\sqrt{197}}\,\hat{i} + \frac{14}{\sqrt{197}}\,\hat{j} = \frac{\sqrt{197}}{197}\,\hat{i} + \frac{14\sqrt{197}}{197}\,\hat{j}, \]

which gives an angle of \(\theta = \arcsin\left((14\sqrt{197})/197\right) \approx 1.499\) rad.
The maximum value of the directional derivative is \(\|\nabla g(-2, 3)\| = \sqrt{197}\).

Figure 14.6.5 shows a portion of the graph of the function f (x, y) = 3 + sin x sin y . Given a point (a, b) in the domain of f , the

maximum value of the directional derivative at that point is given by ∥∇f (a, b)∥. This would equal the rate of greatest ascent if the
surface represented a topographical map. If we went in the opposite direction, it would be the rate of greatest descent.

Figure 14.6.5: A typical surface in \(\mathbb{R}^3\). Given a point on the surface, the directional derivative can be calculated using the gradient.

When using a topographical map, the steepest slope is always in the direction where the contour lines are closest together (Figure 14.6.6).
This is analogous to the contour map of a function, assuming the level curves are obtained for equally spaced values throughout the range
of that function.

Figure 14.6.6: Contour map for the function \(f(x, y) = x^2 - y^2\) using level values between −5 and 5.

Gradients and Level Curves


Recall that if a curve is defined parametrically by the function pair (x(t), y(t)), then the vector \(x'(t)\,\hat{i} + y'(t)\,\hat{j}\) is tangent to the curve for
every value of t in the domain. Now let’s assume z = f(x, y) is a differentiable function of x and y, and \((x_0, y_0)\) is in its domain. Let’s
suppose further that \(x_0 = x(t_0)\) and \(y_0 = y(t_0)\) for some value of \(t_0\), and consider the level curve f(x, y) = k. Define
g(t) = f(x(t), y(t)) and calculate g'(t) on the level curve. By the chain rule,

g'(t) = fx (x(t), y(t))x'(t) + fy (x(t), y(t))y'(t).

But g'(t) = 0 because g(t) = k for all t . Therefore, on the one hand,

fx (x(t), y(t))x'(t) + fy (x(t), y(t))y'(t) = 0;

on the other hand,



fx (x(t), y(t))x'(t) + fy (x(t), y(t))y'(t) = ∇f (x, y) ⋅ ⟨x'(t), y'(t)⟩.

Therefore,

∇f (x, y) ⋅ ⟨x'(t), y'(t)⟩ = 0.

Thus, the dot product of these vectors is equal to zero, which implies they are orthogonal. However, the second vector is tangent to the
level curve, which implies the gradient must be normal to the level curve, which gives rise to the following theorem.

 Gradient Is Normal to the Level Curve

Suppose the function z = f(x, y) has continuous first-order partial derivatives in an open disk centered at a point \((x_0, y_0)\). If
\(\nabla f(x_0, y_0) \neq \vec{0}\), then \(\nabla f(x_0, y_0)\) is normal to the level curve of f at \((x_0, y_0)\).

We can use this theorem to find tangent and normal vectors to level curves of a function.

 Example 14.6.5: Finding Tangents to Level Curves

For the function \(f(x, y) = 2x^2 - 3xy + 8y^2 + 2x - 4y + 4\), find a tangent vector to the level curve at point (−2, 1). Graph the level
curve corresponding to f(x, y) = 18 and draw in \(\nabla f(-2, 1)\) and a tangent vector.
Solution

First, we must calculate \(\nabla f(x, y)\):

\[ f_x(x, y) = 4x - 3y + 2 \quad \text{and} \quad f_y(x, y) = -3x + 16y - 4, \quad \text{so} \quad \nabla f(x, y) = (4x - 3y + 2)\,\hat{i} + (-3x + 16y - 4)\,\hat{j}. \]

Next, we evaluate \(\nabla f(x, y)\) at (−2, 1):

\[ \nabla f(-2, 1) = (4(-2) - 3(1) + 2)\,\hat{i} + (-3(-2) + 16(1) - 4)\,\hat{j} = -9\,\hat{i} + 18\,\hat{j}. \]

This vector is orthogonal to the curve at point (−2, 1). We can obtain a tangent vector by reversing the components and multiplying
either one by −1. Thus, for example, \(-18\,\hat{i} - 9\,\hat{j}\) is a tangent vector (Figure 14.6.7).

Figure 14.6.7: Tangent and normal vectors to \(2x^2 - 3xy + 8y^2 + 2x - 4y + 4 = 18\) at point (−2, 1).
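The "reverse the components and negate one" recipe is a 90-degree rotation, and its effect can be verified numerically. The sketch below rotates the gradient at (−2, 1), confirms the dot product is zero, and compares how much f changes for a small step along the tangent versus along the gradient. The helper name `tangent_from_gradient` and the step size are assumptions for this illustration.

```python
def f(x, y):
    return 2 * x**2 - 3 * x * y + 8 * y**2 + 2 * x - 4 * y + 4


def gradient(x, y):
    return (4 * x - 3 * y + 2, -3 * x + 16 * y - 4)


def tangent_from_gradient(g):
    """Rotate the gradient by 90 degrees: (a, b) -> (-b, a)."""
    return (-g[1], g[0])


x0, y0 = -2.0, 1.0
g = gradient(x0, y0)
t = tangent_from_gradient(g)
dot = g[0] * t[0] + g[1] * t[1]

s = 1e-4  # small step size
change_along_tangent = f(x0 + s * t[0], y0 + s * t[1]) - f(x0, y0)
change_along_gradient = f(x0 + s * g[0], y0 + s * g[1]) - f(x0, y0)
print(g, t, dot)
print(change_along_tangent, change_along_gradient)
```

The change along the tangent is second order in the step size, while the change along the gradient is first order, which is the numerical signature of moving along a level curve versus across it.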

 Exercise 14.6.4
For the function \(f(x, y) = x^2 - 2xy + 5y^2 + 3x - 2y + 3\), find the tangent to the level curve at point (1, 1). Draw the graph of the
level curve corresponding to f(x, y) = 8 and draw \(\nabla f(1, 1)\) and a tangent vector.

Hint
Calculate the gradient at point (1, 1).
Answer
\[ \nabla f(x, y) = (2x - 2y + 3)\,\hat{i} + (-2x + 10y - 2)\,\hat{j} \]
\[ \nabla f(1, 1) = 3\,\hat{i} + 6\,\hat{j} \]

Tangent vector: \(6\,\hat{i} - 3\,\hat{j}\) or \(-6\,\hat{i} + 3\,\hat{j}\)

Three-Dimensional Gradients and Directional Derivatives
The definition of a gradient can be extended to functions of more than two variables.

 Definition: Gradients in 3D

Let w = f(x, y, z) be a function of three variables such that \(f_x\), \(f_y\), and \(f_z\) exist. The vector \(\nabla f(x, y, z)\) is called the gradient of f
and is defined as

\[ \nabla f(x, y, z) = f_x(x, y, z)\,\hat{i} + f_y(x, y, z)\,\hat{j} + f_z(x, y, z)\,\hat{k}. \tag{14.6.6} \]

\(\nabla f(x, y, z)\) can also be written as grad f(x, y, z).

Calculating the gradient of a function in three variables is very similar to calculating the gradient of a function in two variables. First, we
calculate the partial derivatives \(f_x\), \(f_y\), and \(f_z\), and then we use Equation 14.6.6.

 Example 14.6.6: Finding Gradients in Three Dimensions



Find the gradient \(\nabla f(x, y, z)\) of each of the following functions:
a. \(f(x, y, z) = 5x^2 - 2xy + y^2 - 4yz + z^2 + 3xz\)
b. \(f(x, y, z) = e^{-2z}\sin 2x \cos 2y\)

Solution
For both parts a. and b., we first calculate the partial derivatives \(f_x\), \(f_y\), and \(f_z\), then use Equation 14.6.6.

a. \(f_x(x, y, z) = 10x - 2y + 3z\), \(f_y(x, y, z) = -2x + 2y - 4z\), and \(f_z(x, y, z) = 3x - 4y + 2z\), so

\[ \begin{aligned}
\nabla f(x, y, z) &= f_x(x, y, z)\,\hat{i} + f_y(x, y, z)\,\hat{j} + f_z(x, y, z)\,\hat{k} \\
&= (10x - 2y + 3z)\,\hat{i} + (-2x + 2y - 4z)\,\hat{j} + (3x - 4y + 2z)\,\hat{k}.
\end{aligned} \]

b. \(f_x(x, y, z) = 2e^{-2z}\cos 2x \cos 2y\), \(f_y(x, y, z) = -2e^{-2z}\sin 2x \sin 2y\), and \(f_z(x, y, z) = -2e^{-2z}\sin 2x \cos 2y\), so

\[ \begin{aligned}
\nabla f(x, y, z) &= f_x(x, y, z)\,\hat{i} + f_y(x, y, z)\,\hat{j} + f_z(x, y, z)\,\hat{k} \\
&= (2e^{-2z}\cos 2x \cos 2y)\,\hat{i} + (-2e^{-2z}\sin 2x \sin 2y)\,\hat{j} + (-2e^{-2z}\sin 2x \cos 2y)\,\hat{k} \\
&= 2e^{-2z}\left(\cos 2x \cos 2y\,\hat{i} - \sin 2x \sin 2y\,\hat{j} - \sin 2x \cos 2y\,\hat{k}\right).
\end{aligned} \]

 Exercise 14.6.5:
Find the gradient \(\nabla f(x, y, z)\) of \(f(x, y, z) = \dfrac{x^2 - 3y^2 + z^2}{2x + y - 4z}\).

Answer
\[ \nabla f(x, y, z) = \frac{2x^2 + 2xy + 6y^2 - 8xz - 2z^2}{(2x + y - 4z)^2}\,\hat{i} - \frac{x^2 + 12xy + 3y^2 - 24yz + z^2}{(2x + y - 4z)^2}\,\hat{j} + \frac{4x^2 - 12y^2 - 4z^2 + 4xz + 2yz}{(2x + y - 4z)^2}\,\hat{k} \]

The directional derivative can also be generalized to functions of three variables. To determine a direction in three dimensions, a vector
with three components is needed. This vector is a unit vector, and the components of the unit vector are called directional cosines. Given a
three-dimensional unit vector \(\vec{u}\) in standard form (i.e., the initial point is at the origin), this vector forms three different angles with the
positive x-, y-, and z-axes. Let’s call these angles α, β, and γ. Then the directional cosines are given by cos α, cos β, and cos γ. These are
the components of the unit vector \(\vec{u}\); since \(\vec{u}\) is a unit vector, it is true that \(\cos^2\alpha + \cos^2\beta + \cos^2\gamma = 1\).

 Definition: Directional Derivative of a Function of Three variables

Suppose w = f(x, y, z) is a function of three variables with a domain of D. Let \((x_0, y_0, z_0) \in D\) and let
\(\vec{u} = \cos\alpha\,\hat{i} + \cos\beta\,\hat{j} + \cos\gamma\,\hat{k}\) be a unit vector. Then, the directional derivative of f in the direction of \(\vec{u}\) is given by
\[ D_{\vec{u}} f(x_0, y_0, z_0) = \lim_{t \to 0} \frac{f(x_0 + t\cos\alpha, y_0 + t\cos\beta, z_0 + t\cos\gamma) - f(x_0, y_0, z_0)}{t} \]

provided the limit exists.

We can calculate the directional derivative of a function of three variables by using the gradient, leading to a formula that is analogous to
Equation 14.6.3.

 Directional Derivative of a Function of Three Variables

Let f(x, y, z) be a differentiable function of three variables and let \(\vec{u} = \cos\alpha\,\hat{i} + \cos\beta\,\hat{j} + \cos\gamma\,\hat{k}\) be a unit vector. Then, the
directional derivative of f in the direction of \(\vec{u}\) is given by

\[ D_{\vec{u}} f(x, y, z) = \nabla f(x, y, z) \cdot \vec{u} = f_x(x, y, z)\cos\alpha + f_y(x, y, z)\cos\beta + f_z(x, y, z)\cos\gamma. \tag{14.6.7} \]

The three angles α, β, and γ determine the unit vector \(\vec{u}\). In practice, we can use an arbitrary (nonunit) vector, then divide by its
magnitude to obtain a unit vector in the desired direction.

 Example 14.6.7: Finding a Directional Derivative in Three Dimensions

Calculate D ⇀ f (1,
v
−2, 3) in the direction of ⇀ ^ ^ ^
v = − i +2 j +2 k for the function
2 2 2
f (x, y, z) = 5 x − 2xy + y − 4yz + z + 3xz.

Solution:
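First, we find the magnitude of \( \vec{v} \):

\[ \|\vec{v}\| = \sqrt{(-1)^2 + (2)^2 + (2)^2} = \sqrt{9} = 3. \]

Therefore, \( \dfrac{\vec{v}}{\|\vec{v}\|} = \dfrac{-\hat{i} + 2\hat{j} + 2\hat{k}}{3} = -\dfrac{1}{3}\hat{i} + \dfrac{2}{3}\hat{j} + \dfrac{2}{3}\hat{k} \) is a unit vector in the direction of \( \vec{v} \), so \( \cos\alpha = -\frac{1}{3} \), \( \cos\beta = \frac{2}{3} \), and
\( \cos\gamma = \frac{2}{3} \). Next, we calculate the partial derivatives of \(f\):

\[ \begin{align*}
f_x(x, y, z) &= 10x - 2y + 3z \\
f_y(x, y, z) &= -2x + 2y - 4z \\
f_z(x, y, z) &= -4y + 2z + 3x,
\end{align*} \]

then substitute them into Equation 14.6.7:

\[ \begin{align*}
D_{\vec{v}} f(x, y, z) &= f_x(x, y, z)\cos\alpha + f_y(x, y, z)\cos\beta + f_z(x, y, z)\cos\gamma \\
&= (10x - 2y + 3z)\left(-\frac{1}{3}\right) + (-2x + 2y - 4z)\left(\frac{2}{3}\right) + (-4y + 2z + 3x)\left(\frac{2}{3}\right) \\
&= -\frac{10x}{3} + \frac{2y}{3} - \frac{3z}{3} - \frac{4x}{3} + \frac{4y}{3} - \frac{8z}{3} - \frac{8y}{3} + \frac{4z}{3} + \frac{6x}{3} \\
&= -\frac{8x}{3} - \frac{2y}{3} - \frac{7z}{3}.
\end{align*} \]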

Last, to find \( D_{\vec{v}} f(1, -2, 3) \), we substitute \( x = 1 \), \( y = -2 \), and \( z = 3 \):

\[ \begin{align*}
D_{\vec{v}} f(1, -2, 3) &= -\frac{8(1)}{3} - \frac{2(-2)}{3} - \frac{7(3)}{3} \\
&= -\frac{8}{3} + \frac{4}{3} - \frac{21}{3} \\
&= -\frac{25}{3}.
\end{align*} \]
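The result of this example can be reproduced numerically. The sketch below assumes the gradient is coded by hand from the partial derivatives computed above; the function names `grad_f` and `directional_derivative` are illustrative choices, not notation from the text.

```python
import math

def grad_f(x, y, z):
    """Gradient of f(x, y, z) = 5x^2 - 2xy + y^2 - 4yz + z^2 + 3xz."""
    return (10 * x - 2 * y + 3 * z,
            -2 * x + 2 * y - 4 * z,
            -4 * y + 2 * z + 3 * x)

def directional_derivative(grad, point, v):
    """Dot the gradient at `point` with the unit vector in the direction of v."""
    norm = math.sqrt(sum(c * c for c in v))
    unit = [c / norm for c in v]          # normalize the (nonunit) direction
    g = grad(*point)
    return sum(gi * ui for gi, ui in zip(g, unit))

value = directional_derivative(grad_f, (1, -2, 3), (-1, 2, 2))
print(value)  # about -8.3333, i.e. -25/3
```

Normalizing inside the function mirrors the remark that an arbitrary nonunit vector can be used as long as it is divided by its magnitude first.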

 Exercise 14.6.6:

Calculate \( D_{\vec{v}} f(x, y, z) \) and \( D_{\vec{v}} f(0, -2, 5) \) in the direction of \( \vec{v} = -3\hat{i} + 12\hat{j} - 4\hat{k} \) for the function

\[ f(x, y, z) = 3x^2 + xy - 2y^2 + 4yz - z^2 + 2xz. \]

Hint
First, divide \( \vec{v} \) by its magnitude, calculate the partial derivatives of \(f\), then use Equation 14.6.7.
Answer

\[ D_{\vec{v}} f(x, y, z) = -\frac{3}{13}(6x + y + 2z) + \frac{12}{13}(x - 4y + 4z) - \frac{4}{13}(2x + 4y - 2z) \]

\[ D_{\vec{v}} f(0, -2, 5) = \frac{384}{13} \]
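As a quick check of the second value, the substitution can be written out term by term. This worked step is added for illustration and is not part of the original answer; it evaluates the three partial derivatives at \( (0, -2, 5) \) before combining them with the components of the unit vector.

```latex
\begin{align*}
f_x(0,-2,5) &= 6(0) + (-2) + 2(5) = 8,\\
f_y(0,-2,5) &= 0 - 4(-2) + 4(5) = 28,\\
f_z(0,-2,5) &= 2(0) + 4(-2) - 2(5) = -18,\\
D_{\vec{v}} f(0,-2,5) &= -\tfrac{3}{13}(8) + \tfrac{12}{13}(28) - \tfrac{4}{13}(-18)
  = \frac{-24 + 336 + 72}{13} = \frac{384}{13}.
\end{align*}
```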

Summary
A directional derivative represents a rate of change of a function in any given direction.
The gradient can be used in a formula to calculate the directional derivative.
The gradient indicates the direction of greatest rate of increase of a function of more than one variable.

Key Equations
directional derivative (two dimensions)

\[ D_{\vec{u}} f(a, b) = \lim_{h \to 0} \frac{f(a + h\cos\theta,\; b + h\sin\theta) - f(a, b)}{h} \]

or

\[ D_{\vec{u}} f(x, y) = f_x(x, y)\cos\theta + f_y(x, y)\sin\theta \]

gradient (two dimensions)

\[ \nabla f(x, y) = f_x(x, y)\,\hat{i} + f_y(x, y)\,\hat{j} \]

gradient (three dimensions)

\[ \nabla f(x, y, z) = f_x(x, y, z)\,\hat{i} + f_y(x, y, z)\,\hat{j} + f_z(x, y, z)\,\hat{k} \]

directional derivative (three dimensions)

\[ D_{\vec{u}} f(x, y, z) = \nabla f(x, y, z) \cdot \vec{u} = f_x(x, y, z)\cos\alpha + f_y(x, y, z)\cos\beta + f_z(x, y, z)\cos\gamma \]

Glossary
directional derivative
the derivative of a function in the direction of a given unit vector
gradient

the gradient of the function \( f(x, y) \) is defined to be \( \nabla f(x, y) = (\partial f/\partial x)\,\hat{i} + (\partial f/\partial y)\,\hat{j} \), which can be generalized to a function of
any number of independent variables

14.6: Directional Derivatives and the Gradient Vector is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.

14.6: Directional Derivatives and the Gradient by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0. Original source:
https://openstax.org/details/books/calculus-volume-1.

14.7: Maximum and Minimum Values
 Learning Objectives
Use partial derivatives to locate critical points for a function of two variables.
Apply a second derivative test to identify a critical point as a local maximum, local minimum, or saddle point for a function
of two variables.
Examine critical points and boundary points to find absolute maximum and minimum values for a function of two
variables.

One of the most useful applications for derivatives of a function of one variable is the determination of maximum and/or minimum
values. This application is also important for functions of two or more variables, but as we have seen in earlier sections of this
chapter, the introduction of more independent variables leads to more possible outcomes for the calculations. The main ideas of
finding critical points and using derivative tests are still valid, but new wrinkles appear when assessing the results.

Critical Points
For functions of a single variable, we defined critical points as the values of the variable at which the function's derivative equals
zero or does not exist. For functions of two or more variables, the concept is essentially the same, except for the fact that we are
now working with partial derivatives.

 Definition: Critical Points


Let \( z = f(x, y) \) be a function of two variables that is defined on an open set containing the point \( (x_0, y_0) \). The point
\( (x_0, y_0) \) is called a critical point of a function of two variables \(f\) if one of the two following conditions holds:
1. \( f_x(x_0, y_0) = f_y(x_0, y_0) = 0 \)
2. Either \( f_x(x_0, y_0) \) or \( f_y(x_0, y_0) \) does not exist.

 Example 14.7.1: Finding Critical Points


Find the critical points of each of the following functions:
a. \( f(x, y) = \sqrt{4y^2 - 9x^2 + 24y + 36x + 36} \)
b. \( g(x, y) = x^2 + 2xy - 4y^2 + 4x - 6y + 4 \)

Solution
a. First, we calculate \( f_x(x, y) \) and \( f_y(x, y) \):

\[ \begin{align*}
f_x(x, y) &= \frac{1}{2}(-18x + 36)(4y^2 - 9x^2 + 24y + 36x + 36)^{-1/2} \\
&= \frac{-9x + 18}{\sqrt{4y^2 - 9x^2 + 24y + 36x + 36}} \\
f_y(x, y) &= \frac{1}{2}(8y + 24)(4y^2 - 9x^2 + 24y + 36x + 36)^{-1/2} \\
&= \frac{4y + 12}{\sqrt{4y^2 - 9x^2 + 24y + 36x + 36}}.
\end{align*} \]

Next, we set each of these expressions equal to zero:

\[ \begin{align*}
\frac{-9x + 18}{\sqrt{4y^2 - 9x^2 + 24y + 36x + 36}} &= 0 \\
\frac{4y + 12}{\sqrt{4y^2 - 9x^2 + 24y + 36x + 36}} &= 0.
\end{align*} \]

Then, multiply each equation by its common denominator:

\[ \begin{align*}
-9x + 18 &= 0 \\
4y + 12 &= 0.
\end{align*} \]

Therefore, \( x = 2 \) and \( y = -3 \), so \( (2, -3) \) is a critical point of \(f\).


We must also check for the possibility that the denominator of each partial derivative can equal zero, thus causing the partial
derivative not to exist. Since the denominator is the same in each partial derivative, we need only do this once:

\[ 4y^2 - 9x^2 + 24y + 36x + 36 = 0. \tag{14.7.1} \]

Equation 14.7.1 represents a hyperbola. We should also note that the domain of \(f\) consists of points satisfying the inequality

\[ 4y^2 - 9x^2 + 24y + 36x + 36 \geq 0. \]

Therefore, any points on the hyperbola are not only critical points, they are also on the boundary of the domain. To put the
hyperbola in standard form, we use the method of completing the square:

\[ \begin{align*}
4y^2 - 9x^2 + 24y + 36x + 36 &= 0 \\
4y^2 - 9x^2 + 24y + 36x &= -36 \\
4y^2 + 24y - 9x^2 + 36x &= -36 \\
4(y^2 + 6y) - 9(x^2 - 4x) &= -36 \\
4(y^2 + 6y + 9) - 9(x^2 - 4x + 4) &= -36 + 36 - 36 \\
4(y + 3)^2 - 9(x - 2)^2 &= -36.
\end{align*} \]

Dividing both sides by \(-36\) puts the equation in standard form:

\[ \begin{align*}
\frac{4(y + 3)^2}{-36} - \frac{9(x - 2)^2}{-36} &= 1 \\
\frac{(x - 2)^2}{4} - \frac{(y + 3)^2}{9} &= 1.
\end{align*} \]

Notice that point \( (2, -3) \) is the center of the hyperbola.

Thus, the critical points of the function \(f\) are \( (2, -3) \) and all points on the hyperbola \( \dfrac{(x - 2)^2}{4} - \dfrac{(y + 3)^2}{9} = 1 \).

b. First, we calculate \( g_x(x, y) \) and \( g_y(x, y) \):

\[ \begin{align*}
g_x(x, y) &= 2x + 2y + 4 \\
g_y(x, y) &= 2x - 8y - 6.
\end{align*} \]

Next, we set each of these expressions equal to zero, which gives a system of equations in \(x\) and \(y\):

\[ \begin{align*}
2x + 2y + 4 &= 0 \\
2x - 8y - 6 &= 0.
\end{align*} \]

Subtracting the second equation from the first gives \( 10y + 10 = 0 \), so \( y = -1 \). Substituting this into the first equation gives
\( 2x + 2(-1) + 4 = 0 \), so \( x = -1 \).

Therefore \( (-1, -1) \) is a critical point of \(g\). There are no points in \( \mathbb{R}^2 \) that make either partial derivative not exist.

Figure 14.7.1 shows the behavior of the surface at the critical point.

Figure 14.7.1 : The function g(x, y) has a critical point at (−1, −1, 5) .
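For a quadratic function such as \(g\), setting both partial derivatives to zero yields a linear system, so the critical point can be found mechanically. The following is a minimal sketch using Cramer's rule; the helper `solve_2x2` is a hypothetical name introduced here, and the coefficients are read off from \( g_x = 0 \) and \( g_y = 0 \) after moving the constants to the right-hand side.

```python
def solve_2x2(a11, a12, b1, a21, a22, b2):
    """Solve a11*x + a12*y = b1 and a21*x + a22*y = b2 by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("the system has no unique solution")
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - b1 * a21) / det
    return x, y

def g(x, y):
    return x**2 + 2*x*y - 4*y**2 + 4*x - 6*y + 4

# g_x = 2x + 2y + 4 = 0  ->  2x + 2y = -4
# g_y = 2x - 8y - 6 = 0  ->  2x - 8y =  6
x0, y0 = solve_2x2(2, 2, -4, 2, -8, 6)
print((x0, y0), g(x0, y0))
```

The value of \(g\) at the solution gives the height of the surface at the critical point, which should agree with the third coordinate quoted in the figure caption.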

 Exercise 14.7.1

Find the critical point of the function \( f(x, y) = x^3 + 2xy - 2x - 4y \).

Hint
Calculate \( f_x(x, y) \) and \( f_y(x, y) \), then set them equal to zero.

Answer
The only critical point of f is (2, −5).

The main purpose for determining critical points is to locate relative maxima and minima, as in single-variable calculus. When
working with a function of one variable, the definition of a local extremum involves finding an interval around the critical point
such that the function value is either greater than or less than all the other function values in that interval. When working with a
function of two or more variables, we work with an open disk around the point.

 Definition: Global and Local Extrema

Let \( z = f(x, y) \) be a function of two variables that is defined and continuous on an open set containing the point \( (x_0, y_0) \).
Then \(f\) has a local maximum at \( (x_0, y_0) \) if

\[ f(x_0, y_0) \geq f(x, y) \]

for all points \( (x, y) \) within some disk centered at \( (x_0, y_0) \). The number \( f(x_0, y_0) \) is called a local maximum value. If the
preceding inequality holds for every point \( (x, y) \) in the domain of \(f\), then \(f\) has a global maximum (also called an absolute
maximum) at \( (x_0, y_0) \).

The function \(f\) has a local minimum at \( (x_0, y_0) \) if

\[ f(x_0, y_0) \leq f(x, y) \]

for all points \( (x, y) \) within some disk centered at \( (x_0, y_0) \). The number \( f(x_0, y_0) \) is called a local minimum value. If the
preceding inequality holds for every point \( (x, y) \) in the domain of \(f\), then \(f\) has a global minimum (also called an absolute
minimum) at \( (x_0, y_0) \).

If \( f(x_0, y_0) \) is either a local maximum or local minimum value, then it is called a local extremum (see the following figure).

Figure 14.7.2: The graph of \( z = \sqrt{16 - x^2 - y^2} \) has a maximum value when \( (x, y) = (0, 0) \). It attains its minimum value at the
boundary of its domain, which is the circle \( x^2 + y^2 = 16 \).

In Calculus 1, we showed that extrema of functions of one variable occur at critical points. The same is true for functions of more
than one variable, as stated in the following theorem.

 Fermat’s Theorem for Functions of Two Variables

Let \( z = f(x, y) \) be a function of two variables that is defined and continuous on an open set containing the point \( (x_0, y_0) \).
Suppose \( f_x \) and \( f_y \) each exist at \( (x_0, y_0) \). If \(f\) has a local extremum at \( (x_0, y_0) \), then \( (x_0, y_0) \) is a critical point of \(f\).

Second Derivative Test


Consider the function \( f(x) = x^3 \). This function has a critical point at \( x = 0 \), since \( f'(0) = 3(0)^2 = 0 \). However, \(f\) does not have
an extreme value at \( x = 0 \). Therefore, the existence of a critical value at \( x = x_0 \) does not guarantee a local extremum at \( x = x_0 \).

The same is true for a function of two or more variables. One way this can happen is at a saddle point. An example of a saddle
point appears in the following figure.
Figure 14.7.3: Graph of the function \( z = x^2 - y^2 \). This graph has a saddle point at the origin.

In this graph, the origin is a saddle point. This is because the first partial derivatives of \( f(x, y) = x^2 - y^2 \) are both equal to zero at
this point, but it is neither a maximum nor a minimum for the function. Furthermore, the vertical trace corresponding to \( y = 0 \) is
\( z = x^2 \) (a parabola opening upward), but the vertical trace corresponding to \( x = 0 \) is \( z = -y^2 \) (a parabola opening downward).
Therefore, the origin is a global minimum for one trace and a global maximum for the other.

 Definition: Saddle Point


Given the function \( z = f(x, y) \), the point \( (x_0, y_0, f(x_0, y_0)) \) is a saddle point if both \( f_x(x_0, y_0) = 0 \) and \( f_y(x_0, y_0) = 0 \), but
\(f\) does not have a local extremum at \( (x_0, y_0) \).

The second derivative test for a function of one variable provides a method for determining whether an extremum occurs at a
critical point of a function. When extending this result to a function of two variables, an issue arises related to the fact that there
are, in fact, four different second-order partial derivatives, although equality of mixed partials reduces this to three. The second
derivative test for a function of two variables, stated in the following theorem, uses a discriminant \(D\) that replaces \( f''(x_0) \) in the
second derivative test for a function of one variable.

 Second Derivative Test


Let \( z = f(x, y) \) be a function of two variables for which the first- and second-order partial derivatives are continuous on some
disk containing the point \( (x_0, y_0) \). Suppose \( f_x(x_0, y_0) = 0 \) and \( f_y(x_0, y_0) = 0 \). Define the quantity

\[ D = f_{xx}(x_0, y_0)\, f_{yy}(x_0, y_0) - \big(f_{xy}(x_0, y_0)\big)^2. \]

Then:
i. If \( D > 0 \) and \( f_{xx}(x_0, y_0) > 0 \), then \(f\) has a local minimum at \( (x_0, y_0) \).
ii. If \( D > 0 \) and \( f_{xx}(x_0, y_0) < 0 \), then \(f\) has a local maximum at \( (x_0, y_0) \).
iii. If \( D < 0 \), then \(f\) has a saddle point at \( (x_0, y_0) \).
iv. If \( D = 0 \), then the test is inconclusive.


See Figure 14.7.4.

Figure 14.7.4 : The second derivative test can often determine whether a function of two variables has local minima (a), local
maxima (b), or a saddle point (c).

To apply the second derivative test, it is necessary that we first find the critical points of the function. There are several steps
involved in the entire procedure, which are outlined in a problem-solving strategy.

 Problem-Solving Strategy: Using the Second Derivative Test for Functions of Two Variables

Let \( z = f(x, y) \) be a function of two variables for which the first- and second-order partial derivatives are continuous on some
disk containing the point \( (x_0, y_0) \). To apply the second derivative test to find local extrema, use the following steps:
1. Determine the critical points \( (x_0, y_0) \) of the function \(f\) where \( f_x(x_0, y_0) = f_y(x_0, y_0) = 0 \). Discard any points where at
least one of the partial derivatives does not exist.
2. Calculate the discriminant \( D = f_{xx}(x_0, y_0)\, f_{yy}(x_0, y_0) - \big(f_{xy}(x_0, y_0)\big)^2 \) for each critical point of \(f\).
3. Apply the four cases of the test to determine whether each critical point is a local maximum, local minimum, or saddle
point, or whether the theorem is inconclusive.
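Steps 2 and 3 of the strategy can be sketched as a small function. This is an illustrative sketch only: the name `classify_critical_point` is hypothetical, and the caller is assumed to have already found a critical point and evaluated the three second-order partial derivatives there.

```python
def classify_critical_point(fxx, fyy, fxy):
    """Apply the four cases of the second derivative test.

    fxx, fyy, fxy are the second partial derivatives evaluated at a point
    where both first partial derivatives are already known to be zero.
    """
    d = fxx * fyy - fxy**2          # the discriminant D
    if d > 0 and fxx > 0:
        return "local minimum"
    if d > 0 and fxx < 0:
        return "local maximum"
    if d < 0:
        return "saddle point"
    return "inconclusive"           # D = 0

# z = x^2 - y^2 at the origin: f_xx = 2, f_yy = -2, f_xy = 0.
print(classify_critical_point(2, -2, 0))
# z = x^2 + y^2 at the origin: f_xx = 2, f_yy = 2, f_xy = 0.
print(classify_critical_point(2, 2, 0))
```

Note that when \( D > 0 \) the product \( f_{xx} f_{yy} \) is positive, so \( f_{xx} \) cannot be zero and the first two branches cover every case with positive discriminant.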

 Example 14.7.2: Using the Second Derivative Test

Find the critical points for each of the following functions, and use the second derivative test to find the local extrema:
a. \( f(x, y) = 4x^2 + 9y^2 + 8x - 36y + 24 \)
b. \( g(x, y) = \frac{1}{3}x^3 + y^2 + 2xy - 6x - 3y + 4 \)

Solution
a. Step 1 of the problem-solving strategy involves finding the critical points of \(f\). To do this, we first calculate \( f_x(x, y) \) and
\( f_y(x, y) \), then set each of them equal to zero:

\[ \begin{align*}
f_x(x, y) &= 8x + 8 \\
f_y(x, y) &= 18y - 36.
\end{align*} \]

Setting them equal to zero yields the system of equations

\[ \begin{align*}
8x + 8 &= 0 \\
18y - 36 &= 0.
\end{align*} \]

The solution to this system is \( x = -1 \) and \( y = 2 \). Therefore \( (-1, 2) \) is a critical point of \(f\).

Step 2 of the problem-solving strategy involves calculating \(D\). To do this, we first calculate the second partial derivatives of
\(f\):

\[ \begin{align*}
f_{xx}(x, y) &= 8 \\
f_{xy}(x, y) &= 0 \\
f_{yy}(x, y) &= 18.
\end{align*} \]

Therefore, \( D = f_{xx}(-1, 2)\, f_{yy}(-1, 2) - \big(f_{xy}(-1, 2)\big)^2 = (8)(18) - (0)^2 = 144 \).

Step 3 states to apply the four cases of the test to classify the function's behavior at this critical point.
Since \( D > 0 \) and \( f_{xx}(-1, 2) > 0 \), this corresponds to case i. Therefore, \(f\) has a local minimum at \( (-1, 2) \) as shown in the
following figure.

Figure 14.7.5: The function \( f(x, y) \) has a local minimum at \( (-1, 2, -16) \). Note the scale on the \(y\)-axis in this plot is in
thousands.
b. For step 1, we first calculate \( g_x(x, y) \) and \( g_y(x, y) \), then set each of them equal to zero:

\[ \begin{align*}
g_x(x, y) &= x^2 + 2y - 6 \\
g_y(x, y) &= 2y + 2x - 3.
\end{align*} \]

Setting them equal to zero yields the system of equations

\[ \begin{align*}
x^2 + 2y - 6 &= 0 \\
2y + 2x - 3 &= 0.
\end{align*} \]

To solve this system, first solve the second equation for \(y\). This gives \( y = \dfrac{3 - 2x}{2} \). Substituting this into the first equation
gives

\[ \begin{align*}
x^2 + 3 - 2x - 6 &= 0 \\
x^2 - 2x - 3 &= 0 \\
(x - 3)(x + 1) &= 0.
\end{align*} \]

Therefore, \( x = -1 \) or \( x = 3 \). Substituting these values into the equation \( y = \dfrac{3 - 2x}{2} \) yields the critical points \( \left(-1, \frac{5}{2}\right) \) and
\( \left(3, -\frac{3}{2}\right) \).

Step 2 involves calculating the second partial derivatives of \(g\):

\[ \begin{align*}
g_{xx}(x, y) &= 2x \\
g_{xy}(x, y) &= 2 \\
g_{yy}(x, y) &= 2.
\end{align*} \]

Then, we find a general formula for \(D\):

\[ \begin{align*}
D(x_0, y_0) &= g_{xx}(x_0, y_0)\, g_{yy}(x_0, y_0) - \big(g_{xy}(x_0, y_0)\big)^2 \\
&= (2x_0)(2) - 2^2 \\
&= 4x_0 - 4.
\end{align*} \]

Next, we substitute each critical point into this formula:

\[ \begin{align*}
D\left(-1, \tfrac{5}{2}\right) &= (2(-1))(2) - (2)^2 = -4 - 4 = -8 \\
D\left(3, -\tfrac{3}{2}\right) &= (2(3))(2) - (2)^2 = 12 - 4 = 8.
\end{align*} \]

In step 3, we note that applying the second derivative test to point \( \left(-1, \frac{5}{2}\right) \) leads to case iii, which means that \( \left(-1, \frac{5}{2}\right) \) is a saddle point. Applying
the theorem to point \( \left(3, -\frac{3}{2}\right) \), where \( D > 0 \) and \( g_{xx} = 6 > 0 \), leads to case i, which means that \( \left(3, -\frac{3}{2}\right) \) corresponds to a local minimum as shown in the
following figure.

Figure 14.7.6 : The function g(x, y) has a local minimum and a saddle point.

 Exercise 14.7.2

Use the second derivative test to find the local extrema of the function
\[ f(x, y) = x^3 + 2xy - 6x - 4y^2. \]

Hint
Follow the problem-solving strategy for applying the second derivative test.
Answer

\( \left(\frac{4}{3}, \frac{1}{3}\right) \) is a saddle point, \( \left(-\frac{3}{2}, -\frac{3}{8}\right) \) is a local maximum.

Absolute Maxima and Minima


When finding global extrema of functions of one variable on a closed interval, we start by checking the critical values over that
interval and then evaluate the function at the endpoints of the interval. When working with a function of two variables, the closed
interval is replaced by a closed, bounded set. A set is bounded if all the points in that set can be contained within a ball (or disk) of
finite radius. First, we need to find the critical points inside the set and calculate the corresponding critical values. Then, it is
necessary to find the maximum and minimum value of the function on the boundary of the set. When we have all these values, the
largest function value corresponds to the global maximum and the smallest function value corresponds to the absolute minimum.
First, however, we need to be assured that such values exist. The following theorem does this.

 Extreme Value Theorem

A continuous function f (x, y) on a closed and bounded set D in the plane attains an absolute maximum value at some point of
D and an absolute minimum value at some point of D.

Now that we know any continuous function f defined on a closed, bounded set attains its extreme values, we need to know how to
find them.

 Finding Extreme Values of a Function of Two Variables

Assume z = f (x, y) is a differentiable function of two variables defined on a closed, bounded set D. Then f will attain the
absolute maximum value and the absolute minimum value, which are, respectively, the largest and smallest values found
among the following:
1. The values of f at the critical points of f in D.
2. The values of f on the boundary of D.

The proof of this theorem is a direct consequence of the extreme value theorem and Fermat’s theorem. In particular, if either
extremum is not located on the boundary of \(D\), then it is located at an interior point of \(D\). But an interior point \( (x_0, y_0) \) of \(D\) that’s
an absolute extremum is also a local extremum; hence, \( (x_0, y_0) \) is a critical point of \(f\) by Fermat’s theorem. Therefore the only
possible values for the global extrema of \(f\) on \(D\) are the extreme values of \(f\) on the interior or boundary of \(D\).

 Problem-Solving Strategy: Finding Absolute Maximum and Minimum Values

Let z = f (x, y) be a continuous function of two variables defined on a closed, bounded set D, and assume f is differentiable
on D. To find the absolute maximum and minimum values of f on D, do the following:
1. Determine the critical points of f in D.
2. Calculate f at each of these critical points.
3. Determine the maximum and minimum values of f on the boundary of its domain.
4. The maximum and minimum values of f will occur at one of the values obtained in steps 2 and 3.

Finding the maximum and minimum values of \(f\) on the boundary of \(D\) can be challenging. If the boundary is a rectangle or set of
straight lines, then it is possible to parameterize the line segments and determine the extrema on each of these segments, as seen in
Example 14.7.3. The same approach can be used for other shapes such as circles and ellipses.
If the boundary of the set \(D\) is a more complicated curve defined by an equation \( g(x, y) = c \) for some constant \(c\), and the first-order
partial derivatives of \(g\) exist, then the method of Lagrange multipliers, which is introduced in Lagrange Multipliers, can prove useful for
determining the extrema of \(f\) on the boundary.

 Example 14.7.3: Finding Absolute Extrema


Use the problem-solving strategy for finding absolute extrema of a function to determine the absolute extrema of each of the
following functions:
a. \( f(x, y) = x^2 - 2xy + 4y^2 - 4x - 2y + 24 \) on the domain defined by \( 0 \leq x \leq 4 \) and \( 0 \leq y \leq 2 \)
b. \( g(x, y) = x^2 + y^2 + 4x - 6y \) on the domain defined by \( x^2 + y^2 \leq 16 \)

Solution
a. Using the problem-solving strategy, step 1 involves finding the critical points of \(f\) on its domain. Therefore, we first
calculate \( f_x(x, y) \) and \( f_y(x, y) \), then set them each equal to zero:

\[ \begin{align*}
f_x(x, y) &= 2x - 2y - 4 \\
f_y(x, y) &= -2x + 8y - 2.
\end{align*} \]

Setting them equal to zero yields the system of equations

\[ \begin{align*}
2x - 2y - 4 &= 0 \\
-2x + 8y - 2 &= 0.
\end{align*} \]

The solution to this system is x = 3 and y = 1 . Therefore (3, 1) is a critical point of f . Calculating f (3, 1) gives f (3, 1) = 17.
The next step involves finding the extrema of f on the boundary of its domain. The boundary of its domain consists of four
line segments as shown in the following graph:

Figure 14.7.7: Graph of the domain of the function \( f(x, y) = x^2 - 2xy + 4y^2 - 4x - 2y + 24 \).

\(L_1\) is the line segment connecting \( (0, 0) \) and \( (4, 0) \), and it can be parameterized by the equations \( x(t) = t,\ y(t) = 0 \) for
\( 0 \leq t \leq 4 \). Define \( g(t) = f(x(t), y(t)) \). This gives \( g(t) = t^2 - 4t + 24 \). Differentiating \(g\) leads to \( g'(t) = 2t - 4 \). Therefore,
\(g\) has a critical value at \( t = 2 \), which corresponds to the point \( (2, 0) \). Calculating \( f(2, 0) \) gives the \(z\)-value 20.

\(L_2\) is the line segment connecting \( (4, 0) \) and \( (4, 2) \), and it can be parameterized by the equations \( x(t) = 4,\ y(t) = t \) for
\( 0 \leq t \leq 2 \). Again, define \( g(t) = f(x(t), y(t)) \). This gives \( g(t) = 4t^2 - 10t + 24 \). Then, \( g'(t) = 8t - 10 \), so \(g\) has a critical
value at \( t = \frac{5}{4} \), which corresponds to the point \( \left(4, \frac{5}{4}\right) \). Calculating \( f\left(4, \frac{5}{4}\right) \) gives the \(z\)-value 17.75.

\(L_3\) is the line segment connecting \( (0, 2) \) and \( (4, 2) \), and it can be parameterized by the equations \( x(t) = t,\ y(t) = 2 \) for
\( 0 \leq t \leq 4 \). Again, define \( g(t) = f(x(t), y(t)) \). This gives \( g(t) = t^2 - 8t + 36 \). Then \( g'(t) = 2t - 8 \), so the critical value \( t = 4 \) corresponds to the point
\( (4, 2) \). So, calculating \( f(4, 2) \) gives the \(z\)-value 20.

\(L_4\) is the line segment connecting \( (0, 0) \) and \( (0, 2) \), and it can be parameterized by the equations \( x(t) = 0,\ y(t) = t \) for
\( 0 \leq t \leq 2 \). This time, \( g(t) = 4t^2 - 2t + 24 \) and the critical value \( t = \frac{1}{4} \) corresponds to the point \( \left(0, \frac{1}{4}\right) \). Calculating \( f\left(0, \frac{1}{4}\right) \)
gives the \(z\)-value 23.75.

We also need to find the values of \( f(x, y) \) at the corners of its domain. These corners are located at \( (0, 0) \), \( (4, 0) \), \( (4, 2) \), and
\( (0, 2) \):

\[ \begin{align*}
f(0, 0) &= (0)^2 - 2(0)(0) + 4(0)^2 - 4(0) - 2(0) + 24 = 24 \\
f(4, 0) &= (4)^2 - 2(4)(0) + 4(0)^2 - 4(4) - 2(0) + 24 = 24 \\
f(4, 2) &= (4)^2 - 2(4)(2) + 4(2)^2 - 4(4) - 2(2) + 24 = 20 \\
f(0, 2) &= (0)^2 - 2(0)(2) + 4(2)^2 - 4(0) - 2(2) + 24 = 36.
\end{align*} \]

The absolute maximum value is 36, which occurs at \( (0, 2) \). The absolute minimum value is 17, which occurs at the interior critical point \( (3, 1) \): every
boundary value found above, including 20 at \( (2, 0) \) and \( (4, 2) \) and 17.75 at \( \left(4, \frac{5}{4}\right) \), is larger than \( f(3, 1) = 17 \). See the following figure.

Figure 14.7.8: The function \( f(x, y) \) has one global minimum and one global maximum over its domain.
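The last step of the strategy, comparing the function values at every candidate point, is easy to mechanize. The sketch below is an illustration under the assumption that the candidate list is exactly the interior critical point, the critical points found on the four edges, and the four corners; the variable names are hypothetical.

```python
def f(x, y):
    return x**2 - 2*x*y + 4*y**2 - 4*x - 2*y + 24

# Interior critical point, edge critical points, then the corners.
candidates = [
    (3, 1),                  # interior critical point
    (2, 0), (4, 1.25),       # critical points on L1 and L2
    (4, 2), (0, 0.25),       # critical points on L3 and L4
    (0, 0), (4, 0), (0, 2),  # remaining corners
]

values = {p: f(*p) for p in candidates}
min_point = min(values, key=values.get)
max_point = max(values, key=values.get)

for p, v in sorted(values.items(), key=lambda item: item[1]):
    print(p, v)
print("minimum:", min_point, values[min_point])
print("maximum:", max_point, values[max_point])
```

Sorting the candidates by value makes it harder to overlook the interior critical point when reading off the absolute extrema.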
b. Using the problem-solving strategy, step 1 involves finding the critical points of \(g\) on its domain. Therefore, we first
calculate \( g_x(x, y) \) and \( g_y(x, y) \), then set them each equal to zero:

\[ \begin{align*}
g_x(x, y) &= 2x + 4 \\
g_y(x, y) &= 2y - 6.
\end{align*} \]

Setting them equal to zero yields the system of equations

\[ \begin{align*}
2x + 4 &= 0 \\
2y - 6 &= 0.
\end{align*} \]

The solution to this system is \( x = -2 \) and \( y = 3 \). Therefore, \( (-2, 3) \) is a critical point of \(g\). Calculating \( g(-2, 3) \), we get

\[ g(-2, 3) = (-2)^2 + 3^2 + 4(-2) - 6(3) = 4 + 9 - 8 - 18 = -13. \]

The next step involves finding the extrema of g on the boundary of its domain. The boundary of its domain consists of a circle
of radius 4 centered at the origin as shown in the following graph.

Figure 14.7.9: Graph of the restricted domain of the function \( g(x, y) = x^2 + y^2 + 4x - 6y \).
The boundary of the domain of \(g\) can be parameterized using the functions \( x(t) = 4\cos t,\ y(t) = 4\sin t \) for \( 0 \leq t \leq 2\pi \).
Define \( h(t) = g(x(t), y(t)) \):

\[ \begin{align*}
h(t) &= g(x(t), y(t)) \\
&= (4\cos t)^2 + (4\sin t)^2 + 4(4\cos t) - 6(4\sin t) \\
&= 16\cos^2 t + 16\sin^2 t + 16\cos t - 24\sin t \\
&= 16 + 16\cos t - 24\sin t.
\end{align*} \]

Setting \( h'(t) = 0 \) leads to

\[ \begin{align*}
-16\sin t - 24\cos t &= 0 \\
-16\sin t &= 24\cos t \\
\frac{-16\sin t}{-16\cos t} &= \frac{24\cos t}{-16\cos t} \\
\tan t &= -\frac{3}{2}.
\end{align*} \]

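This equation has two solutions over the interval \( 0 \leq t \leq 2\pi \). One is \( t = \pi - \arctan\left(\frac{3}{2}\right) \) and the other is
\( t = 2\pi - \arctan\left(\frac{3}{2}\right) \). For the first angle,

\[ \begin{align*}
\sin t &= \sin\left(\pi - \arctan\left(\tfrac{3}{2}\right)\right) = \sin\left(\arctan\left(\tfrac{3}{2}\right)\right) = \frac{3\sqrt{13}}{13} \\
\cos t &= \cos\left(\pi - \arctan\left(\tfrac{3}{2}\right)\right) = -\cos\left(\arctan\left(\tfrac{3}{2}\right)\right) = -\frac{2\sqrt{13}}{13}.
\end{align*} \]

Therefore, \( x(t) = 4\cos t = -\frac{8\sqrt{13}}{13} \) and \( y(t) = 4\sin t = \frac{12\sqrt{13}}{13} \), so \( \left(-\frac{8\sqrt{13}}{13}, \frac{12\sqrt{13}}{13}\right) \) is a critical point on the boundary and

\[ \begin{align*}
g\left(-\frac{8\sqrt{13}}{13}, \frac{12\sqrt{13}}{13}\right) &= \left(-\frac{8\sqrt{13}}{13}\right)^2 + \left(\frac{12\sqrt{13}}{13}\right)^2 + 4\left(-\frac{8\sqrt{13}}{13}\right) - 6\left(\frac{12\sqrt{13}}{13}\right) \\
&= \frac{144}{13} + \frac{64}{13} - \frac{32\sqrt{13}}{13} - \frac{72\sqrt{13}}{13} \\
&= \frac{208 - 104\sqrt{13}}{13} \approx -12.844.
\end{align*} \]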

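For the second angle,

\[ \begin{align*}
\sin t &= \sin\left(2\pi - \arctan\left(\tfrac{3}{2}\right)\right) = -\sin\left(\arctan\left(\tfrac{3}{2}\right)\right) = -\frac{3\sqrt{13}}{13} \\
\cos t &= \cos\left(2\pi - \arctan\left(\tfrac{3}{2}\right)\right) = \cos\left(\arctan\left(\tfrac{3}{2}\right)\right) = \frac{2\sqrt{13}}{13}.
\end{align*} \]

Therefore, \( x(t) = 4\cos t = \frac{8\sqrt{13}}{13} \) and \( y(t) = 4\sin t = -\frac{12\sqrt{13}}{13} \), so \( \left(\frac{8\sqrt{13}}{13}, -\frac{12\sqrt{13}}{13}\right) \) is a critical point on the boundary and

\[ \begin{align*}
g\left(\frac{8\sqrt{13}}{13}, -\frac{12\sqrt{13}}{13}\right) &= \left(\frac{8\sqrt{13}}{13}\right)^2 + \left(-\frac{12\sqrt{13}}{13}\right)^2 + 4\left(\frac{8\sqrt{13}}{13}\right) - 6\left(-\frac{12\sqrt{13}}{13}\right) \\
&= \frac{144}{13} + \frac{64}{13} + \frac{32\sqrt{13}}{13} + \frac{72\sqrt{13}}{13} \\
&= \frac{208 + 104\sqrt{13}}{13} \approx 44.844.
\end{align*} \]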

The absolute minimum of \(g\) is \(-13\), which is attained at the point \( (-2, 3) \), which is an interior point of \(D\). The absolute
maximum of \(g\) is approximately equal to 44.844, which is attained at the boundary point \( \left(\frac{8\sqrt{13}}{13}, -\frac{12\sqrt{13}}{13}\right) \). These are the
absolute extrema of \(g\) on \(D\) as shown in the following figure.

Figure 14.7.10: The function \( g(x, y) \) has a local minimum and a local maximum.
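The boundary analysis for \(g\) can be cross-checked by evaluating \( h(t) = 16 + 16\cos t - 24\sin t \) directly. The following is a rough sketch, assuming a dense uniform sample of the circle is an adequate sanity check; the sample size and variable names are arbitrary choices made for illustration.

```python
import math

def h(t):
    """g restricted to the circle x = 4 cos t, y = 4 sin t."""
    return 16 + 16 * math.cos(t) - 24 * math.sin(t)

# The two solutions of tan t = -3/2 on [0, 2*pi].
t_first = math.pi - math.atan(1.5)
t_second = 2 * math.pi - math.atan(1.5)
boundary_min = h(t_first)
boundary_max = h(t_second)

# Independent check: sample the whole boundary densely.
n = 100_000
samples = [h(2 * math.pi * k / n) for k in range(n + 1)]

print(boundary_min, boundary_max)
print(min(samples), max(samples))
```

The closed forms \( 16 - 8\sqrt{13} \) and \( 16 + 8\sqrt{13} \) are the simplified versions of the two boundary values, so the sampled extremes should sit just inside them.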

 Exercise 14.7.3:

Use the problem-solving strategy for finding absolute extrema of a function to find the absolute extrema of the function
\[ f(x, y) = 4x^2 - 2xy + 6y^2 - 8x + 2y + 3 \]

on the domain defined by 0 ≤ x ≤ 2 and −1 ≤ y ≤ 3.

Hint
Calculate \( f_x(x, y) \) and \( f_y(x, y) \), and set them equal to zero. Then, calculate \(f\) for each critical point and find the extrema of
\(f\) on the boundary of \(D\).

Answer
The absolute minimum occurs at (1, 0) : f (1, 0) = −1.
The absolute maximum occurs at (0, 3) : f (0, 3) = 63.

 Example 14.7.4: Profitable Golf Balls

Pro-T company has developed a profit model that depends on the number \(x\) of golf balls sold per month (measured in
thousands), and the number of hours per month of advertising \(y\), according to the function

\[ z = f(x, y) = 48x + 96y - x^2 - 2xy - 9y^2, \]

where \(z\) is measured in thousands of dollars. The maximum number of golf balls that can be produced and sold is 50,000, and
the maximum number of hours of advertising that can be purchased is 25. Find the values of \(x\) and \(y\) that maximize profit, and
find the maximum profit.

Figure 14.7.11: (credit: modification of work by oatsy40, Flickr)
Solution
Using the problem-solving strategy, step 1 involves finding the critical points of \(f\) on its domain. Therefore, we first calculate
\( f_x(x, y) \) and \( f_y(x, y) \), then set them each equal to zero:

\[ \begin{align*}
f_x(x, y) &= 48 - 2x - 2y \\
f_y(x, y) &= 96 - 2x - 18y.
\end{align*} \]

Setting them equal to zero yields the system of equations

\[ \begin{align*}
48 - 2x - 2y &= 0 \\
96 - 2x - 18y &= 0.
\end{align*} \]

The solution to this system is \( x = 21 \) and \( y = 3 \). Therefore \( (21, 3) \) is a critical point of \(f\). Calculating \( f(21, 3) \) gives

\[ f(21, 3) = 48(21) + 96(3) - 21^2 - 2(21)(3) - 9(3)^2 = 648. \]

The domain of this function is 0 ≤ x ≤ 50 and 0 ≤ y ≤ 25 as shown in the following graph.

Figure 14.7.12: Graph of the domain of the function \( f(x, y) = 48x + 96y - x^2 - 2xy - 9y^2 \).

\(L_1\) is the line segment connecting \( (0, 0) \) and \( (50, 0) \), and it can be parameterized by the equations \( x(t) = t,\ y(t) = 0 \) for
\( 0 \leq t \leq 50 \). We then define \( g(t) = f(x(t), y(t)) \):

\[ \begin{align*}
g(t) &= f(x(t), y(t)) \\
&= f(t, 0) \\
&= 48t + 96(0) - t^2 - 2(t)(0) - 9(0)^2 \\
&= 48t - t^2.
\end{align*} \]

Setting \( g'(t) = 0 \) yields the critical point \( t = 24 \), which corresponds to the point \( (24, 0) \) in the domain of \(f\). Calculating
\( f(24, 0) \) gives 576.
\(L_2\) is the line segment connecting \( (50, 0) \) and \( (50, 25) \), and it can be parameterized by the equations \( x(t) = 50,\ y(t) = t \) for
\( 0 \leq t \leq 25 \). Once again, we define \( g(t) = f(x(t), y(t)) \):

\[ \begin{align*}
g(t) &= f(x(t), y(t)) \\
&= f(50, t) \\
&= 48(50) + 96t - 50^2 - 2(50)t - 9t^2 \\
&= -9t^2 - 4t - 100.
\end{align*} \]

This function has a critical point at \( t = -\frac{2}{9} \), which corresponds to the point \( \left(50, -\frac{2}{9}\right) \). This point is not in the domain of \(f\).

\(L_3\) is the line segment connecting \( (0, 25) \) and \( (50, 25) \), and it can be parameterized by the equations \( x(t) = t,\ y(t) = 25 \) for
\( 0 \leq t \leq 50 \). We define \( g(t) = f(x(t), y(t)) \):

\[ \begin{align*}
g(t) &= f(x(t), y(t)) \\
&= f(t, 25) \\
&= 48t + 96(25) - t^2 - 2t(25) - 9(25)^2 \\
&= -t^2 - 2t - 3225.
\end{align*} \]

This function has a critical point at \( t = -1 \), which corresponds to the point \( (-1, 25) \), which is not in the domain.
\(L_4\) is the line segment connecting \( (0, 0) \) to \( (0, 25) \), and it can be parameterized by the equations \( x(t) = 0,\ y(t) = t \) for
\( 0 \leq t \leq 25 \). We define \( g(t) = f(x(t), y(t)) \):

\[ \begin{align*}
g(t) &= f(x(t), y(t)) \\
&= f(0, t) \\
&= 48(0) + 96t - (0)^2 - 2(0)t - 9t^2 \\
&= 96t - 9t^2.
\end{align*} \]

This function has a critical point at \( t = \frac{16}{3} \), which corresponds to the point \( \left(0, \frac{16}{3}\right) \), which is on the boundary of the domain.
Calculating \( f\left(0, \frac{16}{3}\right) \) gives 256.

We also need to find the values of \( f(x, y) \) at the corners of its domain. These corners are located at \( (0, 0) \), \( (50, 0) \), \( (50, 25) \), and
\( (0, 25) \):

\[ \begin{align*}
f(0, 0) &= 48(0) + 96(0) - (0)^2 - 2(0)(0) - 9(0)^2 = 0 \\
f(50, 0) &= 48(50) + 96(0) - (50)^2 - 2(50)(0) - 9(0)^2 = -100 \\
f(50, 25) &= 48(50) + 96(25) - (50)^2 - 2(50)(25) - 9(25)^2 = -5825 \\
f(0, 25) &= 48(0) + 96(25) - (0)^2 - 2(0)(25) - 9(25)^2 = -3225.
\end{align*} \]

The maximum value is 648, which occurs at \( (21, 3) \). Therefore, a maximum profit of $648,000 is realized when 21,000 golf
balls are sold and 3 hours of advertising are purchased per month as shown in the following figure.

Figure 14.7.13: The profit function \( f(x, y) \) has a maximum at \( (21, 3, 648) \).
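The comparison of candidate values in this example can be reproduced in a few lines. The sketch below assumes the candidates are the interior critical point, the edge critical points that lie inside the domain, and the four corners; the function name `profit` and the list layout are illustrative.

```python
def profit(x, y):
    """Profit in thousands of dollars; x in thousands of balls, y in hours."""
    return 48*x + 96*y - x**2 - 2*x*y - 9*y**2

candidates = [
    (21, 3),                              # interior critical point
    (24, 0), (0, 16 / 3),                 # edge critical points inside the domain
    (0, 0), (50, 0), (50, 25), (0, 25),   # corners
]

best = max(candidates, key=lambda p: profit(*p))
for p in candidates:
    print(p, profit(*p))
print("maximum profit at", best, "=", profit(*best))
```

The edge critical points at \( t = -\frac{2}{9} \) and \( t = -1 \) are left out because they fall outside the rectangle \( 0 \leq x \leq 50 \), \( 0 \leq y \leq 25 \).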

Key Concepts
A critical point of the function \( f(x, y) \) is any point \( (x_0, y_0) \) where either \( f_x(x_0, y_0) = f_y(x_0, y_0) = 0 \), or at least one of
\( f_x(x_0, y_0) \) and \( f_y(x_0, y_0) \) does not exist.
A saddle point is a point \( (x_0, y_0) \) where \( f_x(x_0, y_0) = f_y(x_0, y_0) = 0 \), but \( f(x_0, y_0) \) is neither a maximum nor a minimum at
that point.
To find extrema of functions of two variables, first find the critical points, then calculate the discriminant and apply the second
derivative test.

Key Equations
Discriminant
\[ D = f_{xx}(x_0, y_0)\, f_{yy}(x_0, y_0) - \big(f_{xy}(x_0, y_0)\big)^2 \]

Glossary
critical point of a function of two variables
the point \( (x_0, y_0) \) is called a critical point of \( f(x, y) \) if one of the two following conditions holds:
1. \( f_x(x_0, y_0) = f_y(x_0, y_0) = 0 \)
2. At least one of \( f_x(x_0, y_0) \) and \( f_y(x_0, y_0) \) does not exist

discriminant
the discriminant of the function \( f(x, y) \) is given by the formula \( D = f_{xx}(x_0, y_0)\, f_{yy}(x_0, y_0) - \big(f_{xy}(x_0, y_0)\big)^2 \)

saddle point
given the function \( z = f(x, y) \), the point \( (x_0, y_0, f(x_0, y_0)) \) is a saddle point if both \( f_x(x_0, y_0) = 0 \) and \( f_y(x_0, y_0) = 0 \), but \(f\)
does not have a local extremum at \( (x_0, y_0) \)

14.7: Maximum and Minimum Values is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
14.7: Maxima/Minima Problems by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0. Original source:
https://openstax.org/details/books/calculus-volume-1.

14.8: Lagrange Multipliers
Learning Objectives
Use the method of Lagrange multipliers to solve optimization problems with one constraint.
Use the method of Lagrange multipliers to solve optimization problems with two constraints.

Solving optimization problems for functions of two or more variables can be similar to solving such problems in single-variable
calculus. However, techniques for dealing with multiple variables allow us to solve more varied optimization problems for which
we need to deal with additional conditions or constraints. In this section, we examine one of the more common and useful methods
for solving optimization problems with constraints.

Lagrange Multipliers
In the previous section, an applied situation was explored involving maximizing a profit function, subject to certain constraints. In
that example, the constraints involved a maximum number of golf balls that could be produced and sold in 1 month (x), and a
maximum number of advertising hours that could be purchased per month (y). Suppose these were combined into a single
budgetary constraint, such as 20x + 4y ≤ 216, that took into account both the cost of producing the golf balls and the number of
advertising hours purchased per month. The goal is still to maximize profit, but now there is a different type of constraint on the
values of \(x\) and \(y\). This constraint, together with the corresponding profit function
\[f(x, y) = 48x + 96y - x^2 - 2xy - 9y^2,\]
is an example of an optimization problem, and the function \(f(x, y)\) is called the objective function. A graph of various level
curves of the function \(f(x, y)\) follows.

Figure 14.8.1: Graph showing level curves of the function \(f(x, y) = 48x + 96y - x^2 - 2xy - 9y^2\) corresponding to \(c = 150, 250, 350,\) and \(400\).
In Figure 14.8.1, the value c represents different profit levels (i.e., values of the function f ). As the value of c increases, the curve
shifts to the right. Since our goal is to maximize profit, we want to choose a curve as far to the right as possible. If there were no
restrictions on the number of golf balls the company could produce or the number of units of advertising available, then we could
produce as many golf balls as we want, and advertise as much as we want, and there would not be a maximum profit for the
company. Unfortunately, we have a budgetary constraint that is modeled by the inequality 20x + 4y ≤ 216. To see how this
constraint interacts with the profit function, Figure 14.8.2 shows the graph of the line 20x + 4y = 216 superimposed on the
previous graph.

14.8.1 https://math.libretexts.org/@go/page/4543
Figure 14.8.2: Graph of level curves of the function \(f(x, y) = 48x + 96y - x^2 - 2xy - 9y^2\) corresponding to \(c = 150, 250, 350,\) and \(395\). The red graph is the constraint function.
As mentioned previously, the maximum profit occurs when the level curve is as far to the right as possible. However, the level of
production corresponding to this maximum profit must also satisfy the budgetary constraint, so the point at which this profit occurs
must also lie on (or to the left of) the red line in Figure 14.8.2. Inspection of this graph reveals that this point exists where the line
is tangent to the level curve of f . Trial and error reveals that this profit level seems to be around 395, when x and y are both just
less than 5. We return to the solution of this problem later in this section. From a theoretical standpoint, at the point where the profit
curve is tangent to the constraint line, the gradient of both of the functions evaluated at that point must point in the same (or
opposite) direction. Recall that the gradient of a function of more than one variable is a vector. If two vectors point in the same (or
opposite) directions, then one must be a constant multiple of the other. This idea is the basis of the method of Lagrange
multipliers.

Method of Lagrange Multipliers: One Constraint

Theorem 14.8.1: Let \(f\) and \(g\) be functions of two variables with continuous partial derivatives at every point of some open set
containing the smooth curve \(g(x, y) = 0\). Suppose that \(f\), when restricted to points on the curve \(g(x, y) = 0\), has a local
extremum at the point \((x_0, y_0)\) and that \(\vec{\nabla} g(x_0, y_0) \neq \vec{0}\). Then there is a number \(\lambda\), called a Lagrange multiplier, for which
\[\vec{\nabla} f(x_0, y_0) = \lambda \vec{\nabla} g(x_0, y_0).\]

Proof

Assume that a constrained extremum occurs at the point \((x_0, y_0)\). Furthermore, we assume that the equation \(g(x, y) = 0\) can
be smoothly parameterized as
\[x = x(s) \quad \text{and} \quad y = y(s),\]
where \(s\) is an arc length parameter with reference point \((x_0, y_0)\) at \(s = 0\). Therefore, the quantity \(z = f(x(s), y(s))\) has a
relative maximum or relative minimum at \(s = 0\), and this implies that \(\dfrac{dz}{ds} = 0\) at that point. From the chain rule,
\[\begin{aligned}
\frac{dz}{ds} &= \frac{\partial f}{\partial x} \cdot \frac{\partial x}{\partial s} + \frac{\partial f}{\partial y} \cdot \frac{\partial y}{\partial s} \\
&= \left( \frac{\partial f}{\partial x} \hat{\mathbf{i}} + \frac{\partial f}{\partial y} \hat{\mathbf{j}} \right) \cdot \left( \frac{\partial x}{\partial s} \hat{\mathbf{i}} + \frac{\partial y}{\partial s} \hat{\mathbf{j}} \right) \\
&= 0,
\end{aligned}\]
where the derivatives are all evaluated at \(s = 0\). However, the first factor in the dot product is the gradient of \(f\), and the second
factor is the unit tangent vector \(\vec{T}(0)\) to the constraint curve. Since the point \((x_0, y_0)\) corresponds to \(s = 0\), it follows from
this equation that
\[\vec{\nabla} f(x_0, y_0) \cdot \vec{T}(0) = 0,\]
which implies that the gradient is either the zero vector \(\vec{0}\) or it is normal to the constraint curve at a constrained relative
extremum. However, the constraint curve \(g(x, y) = 0\) is a level curve for the function \(g(x, y)\), so that if \(\vec{\nabla} g(x_0, y_0) \neq \vec{0}\), then
\(\vec{\nabla} g(x_0, y_0)\) is normal to this curve at \((x_0, y_0)\). It follows, then, that there is some scalar \(\lambda\) such that
\[\vec{\nabla} f(x_0, y_0) = \lambda \vec{\nabla} g(x_0, y_0).\]

To apply Theorem 14.8.1 to an optimization problem similar to that for the golf ball manufacturer, we need a problem-solving
strategy.

Problem-Solving Strategy: Steps for Using Lagrange Multipliers

1. Determine the objective function \(f(x, y)\) and the constraint function \(g(x, y)\). Does the optimization problem involve
maximizing or minimizing the objective function?
2. Set up a system of equations using the following template:
\[\vec{\nabla} f(x_0, y_0) = \lambda \vec{\nabla} g(x_0, y_0) \tag{14.8.1}\]
\[g(x_0, y_0) = 0. \tag{14.8.2}\]
3. Solve for \(x_0\) and \(y_0\).
4. The largest of the values of \(f\) at the solutions found in step 3 maximizes \(f\); the smallest of those values minimizes \(f\).

Example 14.8.1: Using Lagrange Multipliers

Use the method of Lagrange multipliers to find the minimum value of \(f(x, y) = x^2 + 4y^2 - 2x + 8y\) subject to the constraint
\(x + 2y = 7\).

Solution
Let’s follow the problem-solving strategy:
1. The objective function is \(f(x, y) = x^2 + 4y^2 - 2x + 8y\). To determine the constraint function, we must first subtract 7
from both sides of the constraint. This gives \(x + 2y - 7 = 0\). The constraint function is equal to the left-hand side, so
\(g(x, y) = x + 2y - 7\). The problem asks us to solve for the minimum value of \(f\), subject to the constraint (Figure 14.8.3).

Figure 14.8.3: Graph of level curves of the function \(f(x, y) = x^2 + 4y^2 - 2x + 8y\) corresponding to \(c = 10\) and \(26\). The red
graph is the constraint function.
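2. We then must calculate the gradients of both \(f\) and \(g\):
\[\begin{aligned}
\vec{\nabla} f(x, y) &= (2x - 2)\hat{\mathbf{i}} + (8y + 8)\hat{\mathbf{j}} \\
\vec{\nabla} g(x, y) &= \hat{\mathbf{i}} + 2\hat{\mathbf{j}}.
\end{aligned}\]
The equation \(\vec{\nabla} f(x_0, y_0) = \lambda \vec{\nabla} g(x_0, y_0)\) becomes
\[(2x_0 - 2)\hat{\mathbf{i}} + (8y_0 + 8)\hat{\mathbf{j}} = \lambda \left( \hat{\mathbf{i}} + 2\hat{\mathbf{j}} \right),\]
which can be rewritten as
\[(2x_0 - 2)\hat{\mathbf{i}} + (8y_0 + 8)\hat{\mathbf{j}} = \lambda \hat{\mathbf{i}} + 2\lambda \hat{\mathbf{j}}.\]
Next, we set the coefficients of \(\hat{\mathbf{i}}\) and \(\hat{\mathbf{j}}\) equal to each other:
\[\begin{aligned}
2x_0 - 2 &= \lambda \\
8y_0 + 8 &= 2\lambda.
\end{aligned}\]
The equation \(g(x_0, y_0) = 0\) becomes \(x_0 + 2y_0 - 7 = 0\). Therefore, the system of equations that needs to be solved is
\[\begin{aligned}
2x_0 - 2 &= \lambda \\
8y_0 + 8 &= 2\lambda \\
x_0 + 2y_0 - 7 &= 0.
\end{aligned}\]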

3. This is a linear system of three equations in three variables. We start by solving the second equation for \(\lambda\) and substituting it
into the first equation. This gives \(\lambda = 4y_0 + 4\), so substituting this into the first equation gives
\[2x_0 - 2 = 4y_0 + 4.\]
Solving this equation for \(x_0\) gives \(x_0 = 2y_0 + 3\). We then substitute this into the third equation:
\[\begin{aligned}
(2y_0 + 3) + 2y_0 - 7 &= 0 \\
4y_0 - 4 &= 0 \\
y_0 &= 1.
\end{aligned}\]
Since \(x_0 = 2y_0 + 3\), this gives \(x_0 = 5\).

4. Next, we evaluate \(f(x, y) = x^2 + 4y^2 - 2x + 8y\) at the point \((5, 1)\):
\[f(5, 1) = 5^2 + 4(1)^2 - 2(5) + 8(1) = 27.\]
To ensure this corresponds to a minimum value on the constraint function, let’s try some other points on the constraint from
either side of the point \((5, 1)\), such as the intercepts of \(g(x, y) = 0\), which are \((7, 0)\) and \((0, 3.5)\).
We get \(f(7, 0) = 35 > 27\) and \(f(0, 3.5) = 77 > 27\).
So it appears that \(f\) has a relative minimum of 27 at \((5, 1)\), subject to the given constraint.
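
The conclusion of this example can be double-checked numerically. The minimal Python sketch below verifies that the gradient condition and the constraint both hold at (5, 1), then scans sample points along the line \(x + 2y = 7\). The names `grad_f`, `GRAD_G`, and `samples`, and the scanning range \(-5 \le y \le 5\), are assumptions made for illustration; exact fractions are used so that no rounding enters the comparison.

```python
from fractions import Fraction


def f(x, y):
    return x**2 + 4 * y**2 - 2 * x + 8 * y


def grad_f(x, y):
    """Gradient of f as the pair (f_x, f_y)."""
    return (2 * x - 2, 8 * y + 8)


GRAD_G = (1, 2)  # gradient of g(x, y) = x + 2y - 7 (constant)

x0, y0 = 5, 1
# The first component equation, 2*x0 - 2 = lambda, fixes the multiplier.
lam = Fraction(grad_f(x0, y0)[0], GRAD_G[0])
gradient_condition = grad_f(x0, y0) == (lam * GRAD_G[0], lam * GRAD_G[1])
on_constraint = x0 + 2 * y0 - 7 == 0

# Sample the constraint line by choosing y and solving for x = 7 - 2y.
samples = [f(7 - 2 * y, y) for y in (Fraction(k, 10) for k in range(-50, 51))]
min_sample = min(samples)

print(lam, gradient_condition, on_constraint, min_sample)
```

A scan like this only suggests that 27 is the smallest value on the sampled portion of the line; it is a sanity check, not a proof.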

Exercise 14.8.1

Use the method of Lagrange multipliers to find the maximum value of
\[f(x, y) = 9x^2 + 36xy - 4y^2 - 18x - 8y\]
subject to the constraint \(3x + 4y = 32\).

Hint
Use the problem-solving strategy for the method of Lagrange multipliers.
Answer
Subject to the given constraint, f has a maximum value of 976 at the point (8, 2).

Let’s now return to the problem posed at the beginning of the section.

Example 14.8.2: Golf Balls and Lagrange Multipliers

The golf ball manufacturer, Pro-T, has developed a profit model that depends on the number \(x\) of golf balls sold per month
(measured in thousands), and the number of hours per month of advertising \(y\), according to the function
\[z = f(x, y) = 48x + 96y - x^2 - 2xy - 9y^2,\]
where \(z\) is measured in thousands of dollars. The budgetary constraint function relating the cost of the production of thousands
of golf balls and advertising units is given by \(20x + 4y = 216\). Find the values of \(x\) and \(y\) that maximize profit, and find the
maximum profit.
Solution:
Again, we follow the problem-solving strategy:
1. The objective function is \(f(x, y) = 48x + 96y - x^2 - 2xy - 9y^2\). To determine the constraint function, we first subtract
216 from both sides of the constraint, then divide both sides by 4, which gives \(5x + y - 54 = 0\). The constraint function is
equal to the left-hand side, so \(g(x, y) = 5x + y - 54\). The problem asks us to solve for the maximum value of \(f\), subject to
this constraint.
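2. So, we calculate the gradients of both \(f\) and \(g\):
\[\begin{aligned}
\vec{\nabla} f(x, y) &= (48 - 2x - 2y)\hat{\mathbf{i}} + (96 - 2x - 18y)\hat{\mathbf{j}} \\
\vec{\nabla} g(x, y) &= 5\hat{\mathbf{i}} + \hat{\mathbf{j}}.
\end{aligned}\]
The equation \(\vec{\nabla} f(x_0, y_0) = \lambda \vec{\nabla} g(x_0, y_0)\) becomes
\[(48 - 2x_0 - 2y_0)\hat{\mathbf{i}} + (96 - 2x_0 - 18y_0)\hat{\mathbf{j}} = \lambda \left( 5\hat{\mathbf{i}} + \hat{\mathbf{j}} \right),\]
which can be rewritten as
\[(48 - 2x_0 - 2y_0)\hat{\mathbf{i}} + (96 - 2x_0 - 18y_0)\hat{\mathbf{j}} = 5\lambda \hat{\mathbf{i}} + \lambda \hat{\mathbf{j}}.\]
We then set the coefficients of \(\hat{\mathbf{i}}\) and \(\hat{\mathbf{j}}\) equal to each other:
\[\begin{aligned}
48 - 2x_0 - 2y_0 &= 5\lambda \\
96 - 2x_0 - 18y_0 &= \lambda.
\end{aligned}\]
The equation \(g(x_0, y_0) = 0\) becomes \(5x_0 + y_0 - 54 = 0\). Therefore, the system of equations that needs to be solved is
\[\begin{aligned}
48 - 2x_0 - 2y_0 &= 5\lambda \\
96 - 2x_0 - 18y_0 &= \lambda \\
5x_0 + y_0 - 54 &= 0.
\end{aligned}\]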

3. We use the left-hand side of the second equation to replace \(\lambda\) in the first equation:
\[\begin{aligned}
48 - 2x_0 - 2y_0 &= 5(96 - 2x_0 - 18y_0) \\
48 - 2x_0 - 2y_0 &= 480 - 10x_0 - 90y_0 \\
8x_0 &= 432 - 88y_0 \\
x_0 &= 54 - 11y_0.
\end{aligned}\]
Then we substitute this into the third equation:
\[\begin{aligned}
5(54 - 11y_0) + y_0 - 54 &= 0 \\
270 - 55y_0 + y_0 - 54 &= 0 \\
216 - 54y_0 &= 0 \\
y_0 &= 4.
\end{aligned}\]
Since \(x_0 = 54 - 11y_0\), this gives \(x_0 = 10\).

4. We then substitute \((10, 4)\) into \(f(x, y) = 48x + 96y - x^2 - 2xy - 9y^2\), which gives
\[\begin{aligned}
f(10, 4) &= 48(10) + 96(4) - (10)^2 - 2(10)(4) - 9(4)^2 \\
&= 480 + 384 - 100 - 80 - 144 \\
&= 540.
\end{aligned}\]
Therefore the maximum profit that can be attained, subject to budgetary constraints, is $540,000 with a production level of
10,000 golf balls and 4 hours of advertising bought per month. Let’s check to make sure this truly is a maximum. The
endpoints of the line that defines the constraint are \((10.8, 0)\) and \((0, 54)\). Let’s evaluate \(f\) at both of these points:
\[\begin{aligned}
f(10.8, 0) &= 48(10.8) + 96(0) - (10.8)^2 - 2(10.8)(0) - 9(0)^2 = 401.76 \\
f(0, 54) &= 48(0) + 96(54) - (0)^2 - 2(0)(54) - 9(54)^2 = -21{,}060.
\end{aligned}\]
The second value represents a loss, since no golf balls are produced. Neither of these values exceeds 540, so it seems that
our extremum is a maximum value of \(f\), subject to the given constraint.
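
Because the system in step 2 is linear in \(x_0\), \(y_0\), and \(\lambda\), it can also be solved mechanically. The following Python sketch is one way to do that, using Gauss-Jordan elimination with exact fractions rather than the substitution used in the text; the helper name `solve_linear` and the rearrangement of each equation into the form "coefficients times unknowns equals constant" are choices made here for illustration.

```python
from fractions import Fraction


def solve_linear(coeffs, rhs):
    """Solve a square linear system by Gauss-Jordan elimination.

    Assumes the system has a unique solution; a singular system makes
    next() raise StopIteration when no nonzero pivot can be found.
    """
    n = len(rhs)
    rows = [[Fraction(v) for v in row] + [Fraction(r)]
            for row, r in zip(coeffs, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        pivot_value = rows[col][col]
        rows[col] = [v / pivot_value for v in rows[col]]
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]


def profit(x, y):
    return 48 * x + 96 * y - x**2 - 2 * x * y - 9 * y**2


# Unknowns in order (x0, y0, lambda):
#   2*x0 +  2*y0 + 5*lambda = 48
#   2*x0 + 18*y0 +   lambda = 96
#   5*x0 +    y0            = 54
x0, y0, lam = solve_linear([[2, 2, 5], [2, 18, 1], [5, 1, 0]], [48, 96, 54])
max_profit = profit(x0, y0)
print(x0, y0, lam, max_profit)
```

The solver reproduces \(x_0 = 10\) and \(y_0 = 4\) and also reports the multiplier value, which the text does not need in order to finish the problem.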

Exercise 14.8.2: Optimizing the Cobb-Douglas function

A company has determined that its production level is given by the Cobb-Douglas function \(f(x, y) = 2.5x^{0.45}y^{0.55}\), where \(x\)
represents the total number of labor hours in 1 year and \(y\) represents the total capital input for the company. Suppose 1 unit of
labor costs $40 and 1 unit of capital costs $50. Use the method of Lagrange multipliers to find the maximum value of
\(f(x, y) = 2.5x^{0.45}y^{0.55}\) subject to a budgetary constraint of $500,000 per year.

Hint
Use the problem-solving strategy for the method of Lagrange multipliers.
Answer
Subject to the given constraint, a maximum production level of 13890 occurs with 5625 labor hours and $5500 of total
capital input.
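
The stated answer can be checked with a few lines of Python. The sketch below relies on a consequence of the Lagrange equations for this particular problem: dividing the two gradient equations gives \(\dfrac{0.45\,y}{0.55\,x} = \dfrac{40}{50}\), so 45% of the budget goes to labor and 55% to capital. That shortcut, and the names `production`, `capital_on_budget`, `x_star`, and `y_star`, are introduced here for illustration and are not part of the text.

```python
def production(x, y):
    """Cobb-Douglas production level for x labor hours and y units of capital."""
    return 2.5 * x**0.45 * y**0.55


BUDGET = 500_000
LABOR_COST = 40
CAPITAL_COST = 50


def capital_on_budget(x):
    """Capital that exhausts the budget when x labor hours are bought."""
    return (BUDGET - LABOR_COST * x) / CAPITAL_COST


# Spending shares equal the exponents because 0.45 + 0.55 = 1.
x_star = 0.45 * BUDGET / LABOR_COST
y_star = 0.55 * BUDGET / CAPITAL_COST
best = production(x_star, y_star)

# Nearby points on the same budget line should produce less.
nearby = [production(x, capital_on_budget(x))
          for x in (x_star - 100, x_star + 100)]

print(x_star, y_star, round(best), nearby)
```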

In the case of an objective function with three variables and a single constraint function, it is possible to use the method of
Lagrange multipliers to solve an optimization problem as well. An example of an objective function with three variables could be
a Cobb-Douglas function similar to the one in Exercise 14.8.2: \(f(x, y, z) = x^{0.2}y^{0.4}z^{0.4}\), where \(x\) represents the cost of labor, \(y\) represents capital
input, and \(z\) represents the cost of advertising. The method is the same as for a function of two variables; the
equations to be solved are
\[\begin{aligned}
\vec{\nabla} f(x, y, z) &= \lambda \vec{\nabla} g(x, y, z) \\
g(x, y, z) &= 0.
\end{aligned}\]

Example 14.8.3: Lagrange Multipliers with a Three-Variable Objective Function

Find the minimum value of the function \(f(x, y, z) = x^2 + y^2 + z^2\) subject to the constraint \(x + y + z = 1\).
Solution
1. The objective function is \(f(x, y, z) = x^2 + y^2 + z^2\). To determine the constraint function, we subtract 1 from each side of
the constraint: \(x + y + z - 1 = 0\), which gives the constraint function as \(g(x, y, z) = x + y + z - 1\).
2. Next, we calculate \(\vec{\nabla} f(x, y, z)\) and \(\vec{\nabla} g(x, y, z)\):
\[\begin{aligned}
\vec{\nabla} f(x, y, z) &= \langle 2x, 2y, 2z \rangle \\
\vec{\nabla} g(x, y, z) &= \langle 1, 1, 1 \rangle.
\end{aligned}\]
This leads to the equations
\[\begin{aligned}
\langle 2x_0, 2y_0, 2z_0 \rangle &= \lambda \langle 1, 1, 1 \rangle \\
x_0 + y_0 + z_0 - 1 &= 0,
\end{aligned}\]
which can be rewritten in the following form:
\[\begin{aligned}
2x_0 &= \lambda \\
2y_0 &= \lambda \\
2z_0 &= \lambda \\
x_0 + y_0 + z_0 - 1 &= 0.
\end{aligned}\]

3. Since each of the first three equations has \(\lambda\) on the right-hand side, we know that \(2x_0 = 2y_0 = 2z_0\) and all three variables
are equal to each other. Substituting \(y_0 = x_0\) and \(z_0 = x_0\) into the last equation yields \(3x_0 - 1 = 0\), so \(x_0 = \frac{1}{3}\), \(y_0 = \frac{1}{3}\),
and \(z_0 = \frac{1}{3}\), which corresponds to a critical point on the constraint plane.
4. Then, we evaluate \(f\) at the point \(\left( \frac{1}{3}, \frac{1}{3}, \frac{1}{3} \right)\):
\[f\left( \frac{1}{3}, \frac{1}{3}, \frac{1}{3} \right) = \left( \frac{1}{3} \right)^2 + \left( \frac{1}{3} \right)^2 + \left( \frac{1}{3} \right)^2 = \frac{3}{9} = \frac{1}{3}.\]
Therefore, a possible extremum of the function is \(\frac{1}{3}\). To verify it is a minimum, choose other points that satisfy the constraint
from either side of the point we obtained above and calculate \(f\) at those points. For example,
\[\begin{aligned}
f(1, 0, 0) &= 1^2 + 0^2 + 0^2 = 1 \\
f(0, -2, 3) &= 0^2 + (-2)^2 + 3^2 = 13.
\end{aligned}\]
Both of these values are greater than \(\frac{1}{3}\), leading us to believe the extremum is a minimum, subject to the given constraint.
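
The spot check in step 4 can be extended with a short Python sketch. It compares \(f\) at a handful of points on the plane \(x + y + z = 1\) with the candidate value \(\frac{1}{3}\), and it also tests the identity \(f(x, y, z) - \frac{1}{3} = \left(x - \frac{1}{3}\right)^2 + \left(y - \frac{1}{3}\right)^2 + \left(z - \frac{1}{3}\right)^2\), which holds whenever \(x + y + z = 1\) and explains why no point on the plane can give a smaller value. The extra test points and variable names are arbitrary choices made for illustration.

```python
from fractions import Fraction


def f(x, y, z):
    return x**2 + y**2 + z**2


third = Fraction(1, 3)
candidate = f(third, third, third)

# Points chosen by hand; each satisfies x + y + z = 1.
others = [
    (1, 0, 0),
    (0, -2, 3),
    (Fraction(1, 2), Fraction(1, 2), 0),
    (2, 2, -3),
]
all_on_plane = all(sum(p) == 1 for p in others)

# Gap between f at each point and the candidate minimum value.
gaps = [f(*p) - candidate for p in others]

# On the plane, the gap equals the squared distance to (1/3, 1/3, 1/3).
identity_holds = all(
    f(*p) - candidate == sum((c - third) ** 2 for c in p) for p in others
)

print(candidate, gaps, identity_holds)
```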

Exercise 14.8.3

Use the method of Lagrange multipliers to find the minimum value of the function
\[f(x, y, z) = x + y + z\]
subject to the constraint \(x^2 + y^2 + z^2 = 1\).

Hint
Use the problem-solving strategy for the method of Lagrange multipliers with an objective function of three variables.
Answer
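Evaluating \(f\) at both points we obtained gives us
\[\begin{aligned}
f\left( \frac{\sqrt{3}}{3}, \frac{\sqrt{3}}{3}, \frac{\sqrt{3}}{3} \right) &= \frac{\sqrt{3}}{3} + \frac{\sqrt{3}}{3} + \frac{\sqrt{3}}{3} = \sqrt{3} \\
f\left( -\frac{\sqrt{3}}{3}, -\frac{\sqrt{3}}{3}, -\frac{\sqrt{3}}{3} \right) &= -\frac{\sqrt{3}}{3} - \frac{\sqrt{3}}{3} - \frac{\sqrt{3}}{3} = -\sqrt{3}.
\end{aligned}\]
Since the constraint is continuous, we compare these values and conclude that \(f\) has a relative minimum of \(-\sqrt{3}\) at the
point \(\left( -\frac{\sqrt{3}}{3}, -\frac{\sqrt{3}}{3}, -\frac{\sqrt{3}}{3} \right)\), subject to the given constraint.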
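
A rough numerical scan supports this answer. The Python sketch below evaluates \(f(x, y, z) = x + y + z\) on a grid of points of the unit sphere, parameterized with spherical angles, and compares the smallest and largest sampled values with \(\mp\sqrt{3}\). The grid resolution and the helper name `sphere_point` are assumptions made for illustration; a grid can only approximate the extreme values, so the comparison is approximate as well.

```python
import math


def f(x, y, z):
    return x + y + z


def sphere_point(theta, phi):
    """Point on the unit sphere for azimuth theta and polar angle phi."""
    return (
        math.sin(phi) * math.cos(theta),
        math.sin(phi) * math.sin(theta),
        math.cos(phi),
    )


s = math.sqrt(3) / 3
candidate_values = [f(s, s, s), f(-s, -s, -s)]

# 60 azimuth steps by 31 polar steps, covering the whole sphere.
samples = [
    f(*sphere_point(2 * math.pi * i / 60, math.pi * j / 30))
    for i in range(60)
    for j in range(31)
]

print(candidate_values, min(samples), max(samples))
```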

Problems with Two Constraints


The method of Lagrange multipliers can be applied to problems with more than one constraint. In this case the objective function,
\(w\), is a function of three variables:
\[w = f(x, y, z)\]
and it is subject to two constraints:
\[g(x, y, z) = 0 \quad \text{and} \quad h(x, y, z) = 0.\]
There are two Lagrange multipliers, \(\lambda_1\) and \(\lambda_2\), and the system of equations becomes
\[\begin{aligned}
\vec{\nabla} f(x_0, y_0, z_0) &= \lambda_1 \vec{\nabla} g(x_0, y_0, z_0) + \lambda_2 \vec{\nabla} h(x_0, y_0, z_0) \\
g(x_0, y_0, z_0) &= 0 \\
h(x_0, y_0, z_0) &= 0.
\end{aligned}\]

Example 14.8.4: Lagrange Multipliers with Two Constraints

Find the maximum and minimum values of the function
\[f(x, y, z) = x^2 + y^2 + z^2\]
subject to the constraints \(z^2 = x^2 + y^2\) and \(x + y - z + 1 = 0\).
Solution
Let’s follow the problem-solving strategy:
1. The objective function is \(f(x, y, z) = x^2 + y^2 + z^2\). To determine the constraint functions, we first subtract \(z^2\) from both
sides of the first constraint, which gives \(x^2 + y^2 - z^2 = 0\), so \(g(x, y, z) = x^2 + y^2 - z^2\). The second constraint function
is \(h(x, y, z) = x + y - z + 1\).
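2. We then calculate the gradients of \(f\), \(g\), and \(h\):
\[\begin{aligned}
\vec{\nabla} f(x, y, z) &= 2x\hat{\mathbf{i}} + 2y\hat{\mathbf{j}} + 2z\hat{\mathbf{k}} \\
\vec{\nabla} g(x, y, z) &= 2x\hat{\mathbf{i}} + 2y\hat{\mathbf{j}} - 2z\hat{\mathbf{k}} \\
\vec{\nabla} h(x, y, z) &= \hat{\mathbf{i}} + \hat{\mathbf{j}} - \hat{\mathbf{k}}.
\end{aligned}\]
The equation \(\vec{\nabla} f(x_0, y_0, z_0) = \lambda_1 \vec{\nabla} g(x_0, y_0, z_0) + \lambda_2 \vec{\nabla} h(x_0, y_0, z_0)\) becomes
\[2x_0\hat{\mathbf{i}} + 2y_0\hat{\mathbf{j}} + 2z_0\hat{\mathbf{k}} = \lambda_1 \left( 2x_0\hat{\mathbf{i}} + 2y_0\hat{\mathbf{j}} - 2z_0\hat{\mathbf{k}} \right) + \lambda_2 \left( \hat{\mathbf{i}} + \hat{\mathbf{j}} - \hat{\mathbf{k}} \right),\]
which can be rewritten as
\[2x_0\hat{\mathbf{i}} + 2y_0\hat{\mathbf{j}} + 2z_0\hat{\mathbf{k}} = (2\lambda_1 x_0 + \lambda_2)\hat{\mathbf{i}} + (2\lambda_1 y_0 + \lambda_2)\hat{\mathbf{j}} - (2\lambda_1 z_0 + \lambda_2)\hat{\mathbf{k}}.\]
Next, we set the coefficients of \(\hat{\mathbf{i}}\), \(\hat{\mathbf{j}}\), and \(\hat{\mathbf{k}}\) equal to each other:
\[\begin{aligned}
2x_0 &= 2\lambda_1 x_0 + \lambda_2 \\
2y_0 &= 2\lambda_1 y_0 + \lambda_2 \\
2z_0 &= -2\lambda_1 z_0 - \lambda_2.
\end{aligned}\]
The two equations that arise from the constraints are \(z_0^2 = x_0^2 + y_0^2\) and \(x_0 + y_0 - z_0 + 1 = 0\). Combining these equations
with the previous three equations gives
\[\begin{aligned}
2x_0 &= 2\lambda_1 x_0 + \lambda_2 \\
2y_0 &= 2\lambda_1 y_0 + \lambda_2 \\
2z_0 &= -2\lambda_1 z_0 - \lambda_2 \\
z_0^2 &= x_0^2 + y_0^2 \\
x_0 + y_0 - z_0 + 1 &= 0.
\end{aligned}\]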

3. The first three equations contain the variable \(\lambda_2\). Solving the third equation for \(\lambda_2\) and replacing into the first and second
equations reduces the number of equations to four:
\[\begin{aligned}
2x_0 &= 2\lambda_1 x_0 - 2\lambda_1 z_0 - 2z_0 \\
2y_0 &= 2\lambda_1 y_0 - 2\lambda_1 z_0 - 2z_0 \\
z_0^2 &= x_0^2 + y_0^2 \\
x_0 + y_0 - z_0 + 1 &= 0.
\end{aligned}\]
Next, we solve the first and second equation for \(\lambda_1\). The first equation gives \(\lambda_1 = \dfrac{x_0 + z_0}{x_0 - z_0}\), the second equation gives
\(\lambda_1 = \dfrac{y_0 + z_0}{y_0 - z_0}\). We set the right-hand side of each equation equal to each other and cross-multiply:
\[\begin{aligned}
\frac{x_0 + z_0}{x_0 - z_0} &= \frac{y_0 + z_0}{y_0 - z_0} \\
(x_0 + z_0)(y_0 - z_0) &= (x_0 - z_0)(y_0 + z_0) \\
x_0 y_0 - x_0 z_0 + y_0 z_0 - z_0^2 &= x_0 y_0 + x_0 z_0 - y_0 z_0 - z_0^2 \\
2y_0 z_0 - 2x_0 z_0 &= 0 \\
2z_0(y_0 - x_0) &= 0.
\end{aligned}\]

Therefore, either \(z_0 = 0\) or \(y_0 = x_0\). If \(z_0 = 0\), then the first constraint becomes \(0 = x_0^2 + y_0^2\). The only real solution to
this equation is \(x_0 = 0\) and \(y_0 = 0\), which gives the ordered triple \((0, 0, 0)\). This point does not satisfy the second
constraint, so it is not a solution. Next, we consider \(y_0 = x_0\), which reduces the number of equations to three:
\[\begin{aligned}
y_0 &= x_0 \\
z_0^2 &= x_0^2 + y_0^2 \\
x_0 + y_0 - z_0 + 1 &= 0.
\end{aligned}\]
We substitute the first equation into the second and third equations:
\[\begin{aligned}
z_0^2 &= x_0^2 + x_0^2 \\
x_0 + x_0 - z_0 + 1 &= 0.
\end{aligned}\]

Then, we solve the second equation for \(z_0\), which gives \(z_0 = 2x_0 + 1\). We then substitute this into the first equation,
\[\begin{aligned}
z_0^2 &= 2x_0^2 \\
(2x_0 + 1)^2 &= 2x_0^2 \\
4x_0^2 + 4x_0 + 1 &= 2x_0^2 \\
2x_0^2 + 4x_0 + 1 &= 0,
\end{aligned}\]
and use the quadratic formula to solve for \(x_0\):
\[x_0 = \frac{-4 \pm \sqrt{4^2 - 4(2)(1)}}{2(2)} = \frac{-4 \pm \sqrt{8}}{4} = \frac{-4 \pm 2\sqrt{2}}{4} = -1 \pm \frac{\sqrt{2}}{2}.\]
Recall \(y_0 = x_0\), so this solves for \(y_0\) as well. Then, \(z_0 = 2x_0 + 1\), so
\[z_0 = 2x_0 + 1 = 2\left( -1 \pm \frac{\sqrt{2}}{2} \right) + 1 = -2 + 1 \pm \sqrt{2} = -1 \pm \sqrt{2}.\]
Therefore, there are two ordered triplet solutions:
\[\left( -1 + \frac{\sqrt{2}}{2}, -1 + \frac{\sqrt{2}}{2}, -1 + \sqrt{2} \right) \quad \text{and} \quad \left( -1 - \frac{\sqrt{2}}{2}, -1 - \frac{\sqrt{2}}{2}, -1 - \sqrt{2} \right).\]

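4. We substitute \(\left( -1 + \frac{\sqrt{2}}{2}, -1 + \frac{\sqrt{2}}{2}, -1 + \sqrt{2} \right)\) into \(f(x, y, z) = x^2 + y^2 + z^2\), which gives
\[\begin{aligned}
f\left( -1 + \frac{\sqrt{2}}{2}, -1 + \frac{\sqrt{2}}{2}, -1 + \sqrt{2} \right) &= \left( -1 + \frac{\sqrt{2}}{2} \right)^2 + \left( -1 + \frac{\sqrt{2}}{2} \right)^2 + \left( -1 + \sqrt{2} \right)^2 \\
&= \left( 1 - \sqrt{2} + \frac{1}{2} \right) + \left( 1 - \sqrt{2} + \frac{1}{2} \right) + \left( 1 - 2\sqrt{2} + 2 \right) \\
&= 6 - 4\sqrt{2}.
\end{aligned}\]
Then, we substitute \(\left( -1 - \frac{\sqrt{2}}{2}, -1 - \frac{\sqrt{2}}{2}, -1 - \sqrt{2} \right)\) into \(f(x, y, z) = x^2 + y^2 + z^2\), which gives
\[\begin{aligned}
f\left( -1 - \frac{\sqrt{2}}{2}, -1 - \frac{\sqrt{2}}{2}, -1 - \sqrt{2} \right) &= \left( -1 - \frac{\sqrt{2}}{2} \right)^2 + \left( -1 - \frac{\sqrt{2}}{2} \right)^2 + \left( -1 - \sqrt{2} \right)^2 \\
&= \left( 1 + \sqrt{2} + \frac{1}{2} \right) + \left( 1 + \sqrt{2} + \frac{1}{2} \right) + \left( 1 + 2\sqrt{2} + 2 \right) \\
&= 6 + 4\sqrt{2}.
\end{aligned}\]
\(6 + 4\sqrt{2}\) is the maximum value and \(6 - 4\sqrt{2}\) is the minimum value of \(f(x, y, z)\), subject to the given constraints.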
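
Since this example involves several algebraic steps, a numerical check is reassuring. The Python sketch below plugs both solution triples into the two constraints and into \(f\), and recovers the multipliers from the formulas derived in step 3 (\(\lambda_1 = \frac{x_0 + z_0}{x_0 - z_0}\), then \(\lambda_2 = 2x_0 - 2\lambda_1 x_0\) from the first gradient equation) to confirm that the third gradient equation also holds. The function `multipliers` and the `report` list are names introduced here for illustration, and all comparisons are made in floating point.

```python
import math


def f(x, y, z):
    return x**2 + y**2 + z**2


def g(x, y, z):
    return x**2 + y**2 - z**2


def h(x, y, z):
    return x + y - z + 1


def multipliers(x, z):
    """Recover (lambda_1, lambda_2) at a solution with y = x and x != z."""
    lam1 = (x + z) / (x - z)
    lam2 = 2 * x - 2 * lam1 * x
    return lam1, lam2


root2 = math.sqrt(2)
solutions = [
    (-1 + root2 / 2, -1 + root2 / 2, -1 + root2),
    (-1 - root2 / 2, -1 - root2 / 2, -1 - root2),
]

report = []
for x, y, z in solutions:
    lam1, lam2 = multipliers(x, z)
    # Residual of the third gradient equation 2z = -2*lam1*z - lam2.
    third_residual = 2 * z - (-2 * lam1 * z - lam2)
    report.append((f(x, y, z), g(x, y, z), h(x, y, z), third_residual))

for row in report:
    print(row)
```

Each row should show a value of \(f\) close to \(6 \mp 4\sqrt{2}\) followed by three residuals that are zero up to rounding error.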

Exercise 14.8.4

Use the method of Lagrange multipliers to find the minimum value of the function
\[f(x, y, z) = x^2 + y^2 + z^2\]
subject to the constraints \(2x + y + 2z = 9\) and \(5x + 5y + 7z = 29\).

Hint
Use the problem-solving strategy for the method of Lagrange multipliers with two constraints.
Answer
f (2, 1, 2) = 9 is a minimum value of f , subject to the given constraints.
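
The stated answer can be verified against the two-constraint system directly. In the Python sketch below, the multiplier values \(\lambda_1 = 2\) and \(\lambda_2 = 0\) are not given in the text; they are worked out here from \(\vec{\nabla} f(2, 1, 2) = \langle 4, 2, 4 \rangle = 2\langle 2, 1, 2 \rangle\) and should be read as an assumption of this sketch.

```python
point = (2, 1, 2)

# Gradients at the point: f = x^2 + y^2 + z^2, g = 2x + y + 2z - 9,
# h = 5x + 5y + 7z - 29 (g and h have constant gradients).
grad_f = tuple(2 * c for c in point)
grad_g = (2, 1, 2)
grad_h = (5, 5, 7)

lam1, lam2 = 2, 0  # assumed multipliers, checked below
combination = tuple(lam1 * a + lam2 * b for a, b in zip(grad_g, grad_h))

x, y, z = point
constraints_hold = (2 * x + y + 2 * z == 9) and (5 * x + 5 * y + 7 * z == 29)
value = x**2 + y**2 + z**2

print(grad_f == combination, constraints_hold, value)
```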

Key Concepts
An objective function combined with one or more constraints is an example of an optimization problem.
To solve optimization problems, we apply the method of Lagrange multipliers using a four-step problem-solving strategy.

Key Equations
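Method of Lagrange multipliers, one constraint
\[\begin{aligned}
\vec{\nabla} f(x_0, y_0) &= \lambda \vec{\nabla} g(x_0, y_0) \\
g(x_0, y_0) &= 0
\end{aligned}\]
Method of Lagrange multipliers, two constraints
\[\begin{aligned}
\vec{\nabla} f(x_0, y_0, z_0) &= \lambda_1 \vec{\nabla} g(x_0, y_0, z_0) + \lambda_2 \vec{\nabla} h(x_0, y_0, z_0) \\
g(x_0, y_0, z_0) &= 0 \\
h(x_0, y_0, z_0) &= 0
\end{aligned}\]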

Glossary
constraint
an inequality or equation involving one or more variables that is used in an optimization problem; the constraint enforces a limit
on the possible solutions for the problem

Lagrange multiplier
the constant (or constants) used in the method of Lagrange multipliers; in the case of one constant, it is represented by the
variable \(\lambda\)

method of Lagrange multipliers
a method of solving an optimization problem subject to one or more constraints

objective function
the function that is to be maximized or minimized in an optimization problem

optimization problem
calculation of a maximum or minimum value of a function of several variables, often using Lagrange multipliers

14.8: Lagrange Multipliers is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
14.8: Lagrange Multipliers by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0. Original source:
https://openstax.org/details/books/calculus-volume-1.

CHAPTER OVERVIEW

15: Multiple Integrals


A general Calculus Textmap organized around the textbook Calculus: Early Transcendentals by James Stewart


Topic hierarchy
15.1: Double Integrals over Rectangles
15.2: Double Integrals over General Regions
15.3: Double Integrals in Polar Coordinates
15.4: Applications of Double Integrals
15.5: Surface Area
15.6: Triple Integrals
15.7: Triple Integrals in Cylindrical Coordinates
15.8: Triple Integrals in Spherical Coordinates
15.9: Change of Variables in Multiple Integrals

15: Multiple Integrals is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.

15.1: Double Integrals over Rectangles
Learning Objectives
Recognize when a function of two variables is integrable over a rectangular region.
Recognize and use some of the properties of double integrals.
Evaluate a double integral over a rectangular region by writing it as an iterated integral.
Use a double integral to calculate the area of a region, volume under a surface, or average value of a function over a plane region.

In this section we investigate double integrals and show how we can use them to find the volume of a solid over a rectangular region in the xy-plane. Many of
the properties of double integrals are similar to those we have already discussed for single integrals.

Volumes and Double Integrals


We begin by considering the space above a rectangular region \(R\). Consider a continuous function \(f(x, y) \geq 0\) of two variables defined on the closed rectangle \(R\):
\[R = [a, b] \times [c, d] = \{(x, y) \in \mathbb{R}^2 \mid a \leq x \leq b,\ c \leq y \leq d\}.\]
Here \([a, b] \times [c, d]\) denotes the Cartesian product of the two closed intervals \([a, b]\) and \([c, d]\). It consists of ordered pairs \((x, y)\) such that \(a \leq x \leq b\) and
\(c \leq y \leq d\). The graph of \(f\) represents a surface above the \(xy\)-plane with equation \(z = f(x, y)\), where \(z\) is the height of the surface at the point \((x, y)\). Let \(S\) be
the solid that lies above \(R\) and under the graph of \(f\) (Figure 15.1.1). The base of the solid is the rectangle \(R\) in the \(xy\)-plane. We want to find the volume \(V\) of
the solid \(S\).

Figure 15.1.1 : The graph of f (x, y) over the rectangle R in the xy-plane is a curved surface.
We divide the region \(R\) into small rectangles \(R_{ij}\), each with area \(\Delta A\) and with sides \(\Delta x\) and \(\Delta y\) (Figure 15.1.2). We do this by dividing the interval \([a, b]\) into
\(m\) subintervals and dividing the interval \([c, d]\) into \(n\) subintervals. Hence \(\Delta x = \dfrac{b - a}{m}\), \(\Delta y = \dfrac{d - c}{n}\), and \(\Delta A = \Delta x \Delta y\).

Figure 15.1.2: Rectangle \(R\) is divided into small rectangles \(R_{ij}\), each with area \(\Delta A\).

The volume of a thin rectangular box above \(R_{ij}\) is \(f(x_{ij}^*, y_{ij}^*)\,\Delta A\), where \((x_{ij}^*, y_{ij}^*)\) is an arbitrary sample point in each \(R_{ij}\) as shown in the following figure,
\(f(x_{ij}^*, y_{ij}^*)\) is the height of the corresponding thin rectangular box, and \(\Delta A\) is the area of each rectangle \(R_{ij}\).

15.1.1 https://math.libretexts.org/@go/page/4545
Figure 15.1.3: A thin rectangular box above \(R_{ij}\) with height \(f(x_{ij}^*, y_{ij}^*)\).

Using the same idea for all the subrectangles, we obtain an approximate volume of the solid \(S\) as
\[V \approx \sum_{i=1}^{m} \sum_{j=1}^{n} f(x_{ij}^*, y_{ij}^*)\,\Delta A.\]
This sum is known as a double Riemann sum and can be used to approximate the value of the volume of the solid. Here the double sum means that for each
subrectangle we evaluate the function at the chosen point, multiply by the area of each rectangle, and then add all the results.
As we have seen in the single-variable case, we obtain a better approximation to the actual volume if \(m\) and \(n\) become larger.
\[V = \lim_{m, n \to \infty} \sum_{i=1}^{m} \sum_{j=1}^{n} f(x_{ij}^*, y_{ij}^*)\,\Delta A\]
or
\[V = \lim_{\Delta x,\, \Delta y \to 0} \sum_{i=1}^{m} \sum_{j=1}^{n} f(x_{ij}^*, y_{ij}^*)\,\Delta A.\]
Note that the sum approaches a limit in either case and the limit is the volume of the solid with the base \(R\). Now we are ready to define the double integral.

Definition: Double Integral over a Rectangular Region R

The double integral of the function \(f(x, y)\) over the rectangular region \(R\) in the \(xy\)-plane is defined as
\[\iint_R f(x, y)\,dA = \lim_{m, n \to \infty} \sum_{i=1}^{m} \sum_{j=1}^{n} f(x_{ij}^*, y_{ij}^*)\,\Delta A.\]
If \(f(x, y) \geq 0\), then the volume \(V\) of the solid \(S\), which lies above \(R\) in the \(xy\)-plane and under the graph of \(f\), is the double integral of the function \(f(x, y)\)
over the rectangle \(R\). If the function is ever negative, then the double integral can be considered a “signed” volume in a manner similar to the way we defined
net signed area in The Definite Integral.

Example 15.1.1: Setting up a Double Integral and Approximating It by Double Sums

Consider the function \(z = f(x, y) = 3x^2 - y\) over the rectangular region \(R = [0, 2] \times [0, 2]\) (Figure 15.1.4).
a. Set up a double integral for finding the value of the signed volume of the solid S that lies above R and “under” the graph of f .
b. Divide R into four squares with m = n = 2 , and choose the sample point as the upper right corner point of each square (1,1),(2,1),(1,2), and (2,2)
(Figure 15.1.4) to approximate the signed volume of the solid S that lies above R and “under” the graph of f .
c. Divide R into four squares with m = n = 2 , and choose the sample point as the midpoint of each square: (1/2, 1/2), (3/2, 1/2), (1/2,3/2), and (3/2, 3/2)
to approximate the signed volume.

Figure 15.1.4 : The function z = f (x, y) graphed over the rectangular region R = [0, 2] × [0, 2] .
Solution
a. As we can see, the function \(z = f(x, y) = 3x^2 - y\) is above the plane. To find the signed volume of \(S\), we need to divide the region \(R\) into small
rectangles \(R_{ij}\), each with area \(\Delta A\) and with sides \(\Delta x\) and \(\Delta y\), and choose \((x_{ij}^*, y_{ij}^*)\) as sample points in each \(R_{ij}\). Hence, a double integral is set up as
\[V = \iint_R (3x^2 - y)\,dA = \lim_{m, n \to \infty} \sum_{i=1}^{m} \sum_{j=1}^{n} \left[ 3(x_{ij}^*)^2 - y_{ij}^* \right] \Delta A.\]
b. Approximating the signed volume using a Riemann sum with \(m = n = 2\), we have \(\Delta A = \Delta x \Delta y = 1 \times 1 = 1\). Also, the sample points are \((1, 1)\), \((2, 1)\), \((1, 2)\), and \((2, 2)\) as shown in the following figure.

Figure 15.1.5: Subrectangles for the rectangular region \(R = [0, 2] \times [0, 2]\).

Hence,
\[\begin{aligned}
V &\approx \sum_{i=1}^{2} \sum_{j=1}^{2} f(x_{ij}^*, y_{ij}^*)\,\Delta A \\
&= \sum_{i=1}^{2} \left( f(x_{i1}^*, y_{i1}^*) + f(x_{i2}^*, y_{i2}^*) \right) \Delta A \\
&= f(x_{11}^*, y_{11}^*)\,\Delta A + f(x_{21}^*, y_{21}^*)\,\Delta A + f(x_{12}^*, y_{12}^*)\,\Delta A + f(x_{22}^*, y_{22}^*)\,\Delta A \\
&= f(1, 1)(1) + f(2, 1)(1) + f(1, 2)(1) + f(2, 2)(1) \\
&= (3 - 1)(1) + (12 - 1)(1) + (3 - 2)(1) + (12 - 2)(1) \\
&= 2 + 11 + 1 + 10 = 24.
\end{aligned}\]

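c. Approximating the signed volume using a Riemann sum with \(m = n = 2\), we have \(\Delta A = \Delta x \Delta y = 1 \times 1 = 1\). In this case the sample points are \((1/2, 1/2)\), \((3/2, 1/2)\), \((1/2, 3/2)\), and \((3/2, 3/2)\).
Hence,
\[\begin{aligned}
V &\approx \sum_{i=1}^{2} \sum_{j=1}^{2} f(x_{ij}^*, y_{ij}^*)\,\Delta A \\
&= f(x_{11}^*, y_{11}^*)\,\Delta A + f(x_{21}^*, y_{21}^*)\,\Delta A + f(x_{12}^*, y_{12}^*)\,\Delta A + f(x_{22}^*, y_{22}^*)\,\Delta A \\
&= f(1/2, 1/2)(1) + f(3/2, 1/2)(1) + f(1/2, 3/2)(1) + f(3/2, 3/2)(1) \\
&= \left( \frac{3}{4} - \frac{1}{2} \right)(1) + \left( \frac{27}{4} - \frac{1}{2} \right)(1) + \left( \frac{3}{4} - \frac{3}{2} \right)(1) + \left( \frac{27}{4} - \frac{3}{2} \right)(1) \\
&= \frac{1}{4} + \frac{25}{4} + \left( -\frac{3}{4} \right) + \frac{21}{4} = \frac{44}{4} = 11.
\end{aligned}\]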

Analysis
Notice that the approximate answers differ due to the choices of the sample points. In either case, we are introducing some error because we are using only
a few sample points. Thus, we need to investigate how we can achieve an accurate answer.
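
To see how the choice of sample point and the number of subrectangles affect the approximation, here is a minimal Python sketch of a double Riemann sum. The function `double_riemann_sum` and its parameters `sx` and `sy` (the relative position of the sample point inside each subrectangle, from 0 for the left or lower edge to 1 for the right or upper edge) are hypothetical names chosen for this illustration. Exact fractions are used so that the small cases reproduce the hand computations in parts b and c.

```python
from fractions import Fraction


def f(x, y):
    return 3 * x**2 - y


def double_riemann_sum(func, a, b, c, d, m, n, sx, sy):
    """Double Riemann sum of func over [a, b] x [c, d] with m x n subrectangles.

    a, b, c, d, m, n are integers here; sx and sy in [0, 1] pick the sample
    point inside each subrectangle (1/2, 1/2 gives the midpoint).
    """
    dx = Fraction(b - a, m)
    dy = Fraction(d - c, n)
    total = Fraction(0)
    for i in range(m):
        for j in range(n):
            x = a + (i + sx) * dx
            y = c + (j + sy) * dy
            total += func(x, y) * dx * dy
    return total


half = Fraction(1, 2)
upper_right = double_riemann_sum(f, 0, 2, 0, 2, 2, 2, 1, 1)
midpoint = double_riemann_sum(f, 0, 2, 0, 2, 2, 2, half, half)
finer_midpoint = double_riemann_sum(f, 0, 2, 0, 2, 100, 100, half, half)

print(upper_right, midpoint, float(finer_midpoint))
```

With \(m = n = 2\) the sketch returns 24 for the upper right corners and 11 for the midpoints. With \(m = n = 100\) the midpoint sum is very close to 12, which is the value of \(\iint_R (3x^2 - y)\,dA\) over this rectangle.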

Exercise 15.1.1

Use the same function \(z = f(x, y) = 3x^2 - y\) over the rectangular region \(R = [0, 2] \times [0, 2]\).
Divide R into the same four squares with m = n = 2 , and choose the sample points as the upper left corner point of each square (0,1), (1,1), (0,2), and
(1,2) (Figure 15.1.5) to approximate the signed volume of the solid S that lies above R and “under” the graph of f .

Hint
Follow the steps of the previous example.

Answer
\[V \approx \sum_{i=1}^{2} \sum_{j=1}^{2} f(x_{ij}^*, y_{ij}^*)\,\Delta A = 0\]

Note that we developed the concept of double integral using a rectangular region R . This concept can be extended to any general region. However, when a
region is not rectangular, the subrectangles may not all fit perfectly into R , particularly if the base area is curved. We examine this situation in more detail in the
next section, where we study regions that are not always rectangular and subrectangles may not fit perfectly in the region R . Also, the heights may not be exact
if the surface z = f (x, y) is curved. However, the errors on the sides and the height where the pieces may not fit perfectly within the solid S approach 0 as m
and \(n\) approach infinity. Also, the double integral of the function \(z = f(x, y)\) exists provided that the function \(f\) is not too discontinuous. If the function is
bounded and continuous over \(R\) except on a finite number of smooth curves, then the double integral exists and we say that \(f\) is integrable over \(R\).
Since \(\Delta A = \Delta x \Delta y = \Delta y \Delta x\), we can express \(dA\) as \(dx\,dy\) or \(dy\,dx\). This means that, when we are using rectangular coordinates, the double integral over a
region \(R\) denoted by
\[\iint_R f(x, y)\,dA\]
can be written as
\[\iint_R f(x, y)\,dx\,dy\]
or
\[\iint_R f(x, y)\,dy\,dx.\]

Now let’s list some of the properties that can be helpful to compute double integrals.

Properties of Double Integrals


The properties of double integrals are very helpful when computing them or otherwise working with them. We list here six properties of double integrals.
Properties 1 and 2 are referred to as the linearity of the integral, property 3 is the additivity of the integral, property 4 is the monotonicity of the integral, and
property 5 is used to find the bounds of the integral. Property 6 is used if f (x, y) is a product of two functions g(x) and h(y).

 Theorem: Properties of Double Integrals

Assume that the functions f (x, y) and g(x, y) are integrable over the rectangular region R ; S and T are subregions of R ; and assume that m and M are
real numbers.
i. The sum f (x, y) + g(x, y) is integrable and

∬ [f (x, y) + g(x, y)] dA = ∬ f (x, y) dA + ∬ g(x, y) dA.


R R R

ii. If c is a constant, then cf (x, y) is integrable and

∬ cf (x, y) dA = c ∬ f (x, y) dA.


R R

iii. If R = S ∪ T and S ∩ T =∅ except an overlap on the boundaries, then

∬ f (x, y) dA = ∬ f (x, y) dA + ∬ f (x, y) dA.


R S T

iv. If f (x, y) ≥ g(x, y) for (x, y) in R , then

∬ f (x, y) dA ≥ ∬ g(x, y) dA.


R R

v. If m ≤ f (x, y) ≤ M and A(R) = the area of R , then

m ⋅ A(R) ≤ ∬ f (x, y) dA ≤ M ⋅ A(R).


R

vi. In the case where f (x, y) can be factored as a product of a function g(x) of x only and a function h(y) of y only, then over the region
R = {(x, y) | a ≤ x ≤ b, c ≤ y ≤ d} , the double integral can be written as

\[ \iint_R f(x, y)\,dA = \left( \int_a^b g(x)\,dx \right) \left( \int_c^d h(y)\,dy \right). \]

These properties are used in the evaluation of double integrals, as we will see later. We will become skilled in using these properties once we become familiar
with the computational tools of double integrals. So let’s get to that now.

Iterated Integrals
So far, we have seen how to set up a double integral and how to obtain an approximate value for it. We can also imagine that evaluating double integrals by
using the definition can be a very lengthy process if we choose larger values for m and n . Therefore, we need a practical and convenient technique for
computing double integrals. In other words, we need to learn how to compute double integrals without employing the definition that uses limits and double
sums.
The basic idea is that the evaluation becomes easier if we can break a double integral into single integrals by integrating first with respect to one variable and
then with respect to the other. The key tool we need is called an iterated integral.

 Definitions: Iterated Integrals


Assume a , b , c , and d are real numbers. We define an iterated integral for a function f (x, y) over the rectangular region R = [a, b] × [c, d] as
\[ \int_a^b \int_c^d f(x, y)\,dy\,dx = \int_a^b \left[ \int_c^d f(x, y)\,dy \right] dx \]

or

\[ \int_c^d \int_a^b f(x, y)\,dx\,dy = \int_c^d \left[ \int_a^b f(x, y)\,dx \right] dy. \]

The notation \( \int_a^b \left[ \int_c^d f(x, y)\,dy \right] dx \) means that we integrate f (x, y) with respect to y while holding x constant. Similarly, the notation \( \int_c^d \left[ \int_a^b f(x, y)\,dx \right] dy \) means that we integrate f (x, y) with respect to x while holding y constant. The fact that double integrals can be split into iterated integrals is expressed in Fubini’s theorem. Think of this theorem as an essential tool for evaluating double integrals.

 Theorem: Fubini's Theorem

Suppose that f (x, y) is a function of two variables that is continuous over a rectangular region \( R = \{(x, y) \in \mathbb{R}^2 \mid a \le x \le b,\ c \le y \le d\} \). Then we see from Figure 15.1.6 that the double integral of f over the region equals an iterated integral,

\[ \iint_R f(x, y)\,dA = \iint_R f(x, y)\,dx\,dy = \int_a^b \int_c^d f(x, y)\,dy\,dx = \int_c^d \int_a^b f(x, y)\,dx\,dy. \]

More generally, Fubini’s theorem is true if f is bounded on R and f is discontinuous only on a finite number of continuous curves. In other words, f has to
be integrable over R .

Figure 15.1.6 : (a) Integrating first with respect to y and then with respect to x to find the area A(x) and then the volume V ; (b) integrating first with respect to
x and then with respect to y to find the area A(y) and then the volume V .

 Example 15.1.2: Using Fubini’s Theorem

Use Fubini’s theorem to compute the double integral \( \iint_R f(x, y)\,dA \), where f (x, y) = x and R = [0, 2] × [0, 1] .

Solution
Fubini’s theorem offers an easier way to evaluate the double integral by the use of an iterated integral. Note how the boundary values of the region R

become the upper and lower limits of integration.

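\[
\begin{aligned}
\iint_R f(x, y)\,dA &= \iint_R f(x, y)\,dx\,dy \\
&= \int_{y=0}^{y=1} \int_{x=0}^{x=2} x\,dx\,dy \\
&= \int_{y=0}^{y=1} \left[ \frac{x^2}{2} \right]_{x=0}^{x=2} dy \\
&= \int_{y=0}^{y=1} 2\,dy = 2y \,\Big|_{y=0}^{y=1} = 2.
\end{aligned}
\]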

The double integration in this example is simple enough to use Fubini’s theorem directly, allowing us to convert a double integral into an iterated integral.
Consequently, we are now ready to convert all double integrals to iterated integrals and demonstrate how the properties listed earlier can help us evaluate
double integrals when the function f (x, y) is more complex. Note that the order of integration can be changed (see Example 15.1.7).

 Example 15.1.3: Illustrating Properties i and ii


Evaluate the double integral

\[ \iint_R (xy - 3xy^2)\,dA, \quad \text{where } R = \{(x, y) \mid 0 \le x \le 2,\ 1 \le y \le 2\}. \]

Solution
This function has two pieces: one piece is \( xy \) and the other is \( 3xy^2 \). Also, the second piece has a constant 3. Notice how we use properties i and ii to help evaluate the double integral.

\[
\begin{aligned}
\iint_R (xy - 3xy^2)\,dA &= \iint_R xy\,dA + \iint_R (-3xy^2)\,dA && \text{Property i: Integral of a sum is the sum of the integrals.} \\
&= \int_{y=1}^{y=2} \int_{x=0}^{x=2} xy\,dx\,dy - \int_{y=1}^{y=2} \int_{x=0}^{x=2} 3xy^2\,dx\,dy && \text{Convert double integrals to iterated integrals.} \\
&= \int_{y=1}^{y=2} \left( \frac{x^2}{2}\,y \right) \Big|_{x=0}^{x=2} dy - 3 \int_{y=1}^{y=2} \left( \frac{x^2}{2}\,y^2 \right) \Big|_{x=0}^{x=2} dy && \text{Integrate with respect to } x \text{, holding } y \text{ constant.} \\
&= \int_{y=1}^{y=2} 2y\,dy - \int_{y=1}^{y=2} 6y^2\,dy && \text{Property ii: Placing the constant before the integral.} \\
&= 2 \int_1^2 y\,dy - 6 \int_1^2 y^2\,dy && \text{Integrate with respect to } y. \\
&= 2\,\frac{y^2}{2} \,\Big|_1^2 - 6\,\frac{y^3}{3} \,\Big|_1^2 \\
&= y^2 \,\Big|_1^2 - 2y^3 \,\Big|_1^2 \\
&= (4 - 1) - 2(8 - 1) = 3 - 2(7) = 3 - 14 = -11.
\end{aligned}
\]

 Example 15.1.4: Illustrating Property v.

Over the region R = {(x, y) | 1 ≤ x ≤ 3, 1 ≤ y ≤ 2} , we have \( 2 \le x^2 + y^2 \le 13 \). Find a lower and an upper bound for the integral \( \iint_R (x^2 + y^2)\,dA \).

Solution
For a lower bound, integrate the constant function 2 over the region R . For an upper bound, integrate the constant function 13 over the region R .
\[ \int_1^2 \int_1^3 2\,dx\,dy = \int_1^2 \left[ 2x \,\Big|_1^3 \right] dy = \int_1^2 2(2)\,dy = 4y \,\Big|_1^2 = 4(2 - 1) = 4 \]

\[ \int_1^2 \int_1^3 13\,dx\,dy = \int_1^2 \left[ 13x \,\Big|_1^3 \right] dy = \int_1^2 13(2)\,dy = 26y \,\Big|_1^2 = 26(2 - 1) = 26. \]

Hence, we obtain \( 4 \le \iint_R (x^2 + y^2)\,dA \le 26 \).
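As a rough numerical illustration of property v, the sketch below approximates the integral with a midpoint double Riemann sum and confirms that the estimate falls between the two bounds. The helper `midpoint_double_sum` and the grid size of 200 × 200 are assumptions made for this illustration; they are not part of the text.

```python
def midpoint_double_sum(f, a, b, c, d, m=200, n=200):
    """Midpoint double Riemann sum of f over the rectangle [a, b] x [c, d]."""
    dx = (b - a) / m
    dy = (d - c) / n
    total = 0.0
    for i in range(m):
        x = a + (i + 0.5) * dx
        for j in range(n):
            y = c + (j + 0.5) * dy
            total += f(x, y)
    return total * dx * dy

def f(x, y):
    return x**2 + y**2

area = (3 - 1) * (2 - 1)   # A(R) for R = [1, 3] x [1, 2]
lower_bound = 2 * area     # m * A(R) with m = 2
upper_bound = 13 * area    # M * A(R) with M = 13
estimate = midpoint_double_sum(f, 1, 3, 1, 2)
print(lower_bound, estimate, upper_bound)
```

The estimate is close to 40/3, the value obtained by evaluating the iterated integral exactly, so the bounds 4 and 26 are valid but far from tight.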

 Example 15.1.5: Illustrating Property vi

Evaluate the integral \( \iint_R e^y \cos x\,dA \) over the region \( R = \{(x, y) \mid 0 \le x \le \frac{\pi}{2},\ 0 \le y \le 1\} \).

Solution

This is a great example for property vi because the function f (x, y) is clearly the product of two single-variable functions \( e^y \) and cos x. Thus we can split the integral into two parts and then integrate each one as a single-variable integration problem.

\[
\begin{aligned}
\iint_R e^y \cos x\,dA &= \int_0^1 \int_0^{\pi/2} e^y \cos x\,dx\,dy \\
&= \left( \int_0^1 e^y\,dy \right) \left( \int_0^{\pi/2} \cos x\,dx \right) \\
&= \left( e^y \,\Big|_0^1 \right) \left( \sin x \,\Big|_0^{\pi/2} \right) \\
&= e - 1.
\end{aligned}
\]
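The factorization in property vi can also be checked numerically. This minimal sketch approximates each single-variable integral with the midpoint rule and multiplies the results; `midpoint_integral` is a hypothetical helper written for this illustration.

```python
import math

def midpoint_integral(g, lo, hi, n=400):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))

x_factor = midpoint_integral(math.cos, 0, math.pi / 2)  # integral of cos x, close to 1
y_factor = midpoint_integral(math.exp, 0, 1)            # integral of e^y, close to e - 1
product = x_factor * y_factor

# Compare the product of the two single integrals with the exact value e - 1.
print(product, math.e - 1)
```

Two one-dimensional sums with 400 points each replace a two-dimensional grid of 160,000 points, which is the practical benefit of property vi.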

 Exercise 15.1.2

a. Use the properties of the double integral and Fubini’s theorem to evaluate the integral
\[ \int_0^1 \int_{-1}^{3} (3 - x + 4y)\,dy\,dx. \]

b. Show that \( 0 \le \iint_R \sin \pi x \cos \pi y\,dA \le \frac{1}{32} \), where \( R = \left( 0, \frac{1}{4} \right) \times \left( \frac{1}{4}, \frac{1}{2} \right) \).

Hint
Use properties i. and ii. and evaluate the iterated integral, and then use property v.

Answer
a. 26
b. Answers may vary.

As we mentioned before, when we are using rectangular coordinates, the double integral over a region R denoted by \( \iint_R f(x, y)\,dA \) can be written as \( \iint_R f(x, y)\,dx\,dy \) or \( \iint_R f(x, y)\,dy\,dx \). The next example shows that the results are the same regardless of which order of integration we choose.

 Example 15.1.6: Evaluating an Iterated Integral in Two Ways

Let’s return to the function \( f(x, y) = 3x^2 - y \) from Example 15.1.1, this time over the rectangular region R = [0, 2] × [0, 3] . Use Fubini’s theorem to evaluate \( \iint_R f(x, y)\,dA \) in two different ways:

a. First integrate with respect to y and then with respect to x;


b. First integrate with respect to x and then with respect to y .
Solution
Figure 15.1.6 shows how the calculation works in two different ways.
a. First integrate with respect to y and then integrate with respect to x:
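\[
\begin{aligned}
\iint_R f(x, y)\,dA &= \int_{x=0}^{x=2} \int_{y=0}^{y=3} (3x^2 - y)\,dy\,dx \\
&= \int_{x=0}^{x=2} \left( \int_{y=0}^{y=3} (3x^2 - y)\,dy \right) dx = \int_{x=0}^{x=2} \left[ 3x^2 y - \frac{y^2}{2} \right]_{y=0}^{y=3} dx \\
&= \int_{x=0}^{x=2} \left( 9x^2 - \frac{9}{2} \right) dx = \left( 3x^3 - \frac{9}{2}x \right) \Big|_{x=0}^{x=2} = 15.
\end{aligned}
\]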

b. First integrate with respect to x and then integrate with respect to y :


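\[
\begin{aligned}
\iint_R f(x, y)\,dA &= \int_{y=0}^{y=3} \int_{x=0}^{x=2} (3x^2 - y)\,dx\,dy \\
&= \int_{y=0}^{y=3} \left( \int_{x=0}^{x=2} (3x^2 - y)\,dx \right) dy \\
&= \int_{y=0}^{y=3} \left[ x^3 - xy \right]_{x=0}^{x=2} dy \\
&= \int_{y=0}^{y=3} (8 - 2y)\,dy = \left( 8y - y^2 \right) \Big|_{y=0}^{y=3} = 15.
\end{aligned}
\]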
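The two orders of integration above can be mirrored in code by nesting one single-variable integration inside another. The sketch below uses a midpoint rule for both the inner and outer integrals; the helper `midpoint_integral` and the variable names are assumptions for illustration only.

```python
def midpoint_integral(g, lo, hi, n=400):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))

def f(x, y):
    return 3 * x**2 - y

# Order (a): inner integral in y over [0, 3], outer integral in x over [0, 2].
dy_then_dx = midpoint_integral(
    lambda x: midpoint_integral(lambda y: f(x, y), 0, 3), 0, 2)

# Order (b): inner integral in x over [0, 2], outer integral in y over [0, 3].
dx_then_dy = midpoint_integral(
    lambda y: midpoint_integral(lambda x: f(x, y), 0, 2), 0, 3)

print(dy_then_dx, dx_then_dy)  # both close to 15
```

Both orders sample the same grid of points, so the two approximations agree up to rounding, just as Fubini’s theorem says the exact iterated integrals agree.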

Analysis

With either order of integration, the double integral gives us an answer of 15. We might wish to interpret this answer as a volume in cubic units of the solid S below the function \( f(x, y) = 3x^2 - y \) over the region R = [0, 2] × [0, 3] . However, remember that the interpretation of a double integral as a (non-signed) volume works only when the integrand f is a nonnegative function over the base region R .

 Exercise 15.1.3

Evaluate
\[ \int_{y=-3}^{y=2} \int_{x=3}^{x=5} (2 - 3x^2 + y^2)\,dx\,dy. \]

Hint
Use Fubini’s theorem.

Answer
\( -\frac{1340}{3} \)

In the next example we see that it can actually be beneficial to switch the order of integration to make the computation easier. We will come back to this idea
several times in this chapter.

 Example 15.1.7: Switching the Order of Integration

Consider the double integral \( \iint_R x \sin(xy)\,dA \) over the region R = {(x, y) | 0 ≤ x ≤ π, 1 ≤ y ≤ 2} (Figure 15.1.7).

a. Express the double integral in two different ways.


b. Analyze whether evaluating the double integral in one way is easier than the other and why.
c. Evaluate the integral.

Figure 15.1.7 : The function z = f (x, y) = x sin(xy) over the rectangular region R = [0, π] × [1, 2].
a. We can express \( \iint_R x \sin(xy)\,dA \) in the following two ways: first by integrating with respect to y and then with respect to x; second by integrating with respect to x and then with respect to y .

\[
\begin{aligned}
\iint_R x \sin(xy)\,dA &= \int_{x=0}^{x=\pi} \int_{y=1}^{y=2} x \sin(xy)\,dy\,dx && \text{Integrate first with respect to } y. \\
&= \int_{y=1}^{y=2} \int_{x=0}^{x=\pi} x \sin(xy)\,dx\,dy && \text{Integrate first with respect to } x.
\end{aligned}
\]


b. If we want to integrate with respect to y first and then integrate with respect to x, we see that we can use the substitution u = xy, which gives du = x dy . Hence the inner integral is simply \( \int \sin u\,du \) and we can change the limits to be functions of x,

\[ \iint_R x \sin(xy)\,dA = \int_{x=0}^{x=\pi} \int_{y=1}^{y=2} x \sin(xy)\,dy\,dx = \int_{x=0}^{x=\pi} \left[ \int_{u=x}^{u=2x} \sin(u)\,du \right] dx. \]

However, integrating with respect to x first and then integrating with respect to y requires integration by parts for the inner integral, with u = x and dv = sin(xy) dx. Then du = dx and \( v = -\frac{\cos(xy)}{y} \), so

\[ \iint_R x \sin(xy)\,dA = \int_{y=1}^{y=2} \int_{x=0}^{x=\pi} x \sin(xy)\,dx\,dy = \int_{y=1}^{y=2} \left[ -\frac{x \cos(xy)}{y} \,\Big|_{x=0}^{x=\pi} + \frac{1}{y} \int_{x=0}^{x=\pi} \cos(xy)\,dx \right] dy. \]

Since the evaluation is getting complicated, we will only do the computation that is easier to do, which is clearly the first method.
c. Evaluate the double integral using the easier way.
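\[
\begin{aligned}
\iint_R x \sin(xy)\,dA &= \int_{x=0}^{x=\pi} \int_{y=1}^{y=2} x \sin(xy)\,dy\,dx \\
&= \int_{x=0}^{x=\pi} \left[ \int_{u=x}^{u=2x} \sin(u)\,du \right] dx = \int_{x=0}^{x=\pi} \left[ -\cos u \,\Big|_{u=x}^{u=2x} \right] dx \\
&= \int_{x=0}^{x=\pi} (-\cos 2x + \cos x)\,dx \\
&= \left( -\frac{1}{2} \sin 2x + \sin x \right) \Big|_{x=0}^{x=\pi} = 0.
\end{aligned}
\]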
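The substitution step in this example can be sanity-checked numerically. The sketch below compares the closed form of the inner integral, −cos 2x + cos x, with a direct midpoint approximation at one sample value of x, and then integrates the closed form over [0, π]. The helper `midpoint_integral` and the sample value `x0 = 1.3` are arbitrary choices made for this illustration.

```python
import math

def midpoint_integral(g, lo, hi, n=400):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))

def inner_closed_form(x):
    """Inner integral after u = xy: [-cos u] evaluated from u = x to u = 2x."""
    return -math.cos(2 * x) + math.cos(x)

# Check the substitution at one fixed x by integrating x*sin(xy) over y in [1, 2].
x0 = 1.3
inner_numeric = midpoint_integral(lambda y: x0 * math.sin(x0 * y), 1, 2)

# Outer integral of the closed form over x in [0, pi].
value = midpoint_integral(inner_closed_form, 0, math.pi)
print(inner_numeric, inner_closed_form(x0), value)
```

The outer integral comes out essentially equal to 0, in agreement with part c.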

 Exercise 15.1.4

Evaluate the integral \( \iint_R x e^{xy}\,dA \), where R = [0, 1] × [0, ln 5] .

Hint
Integrate with respect to y first.

Answer
\( \frac{4 - \ln 5}{\ln 5} \)

Applications of Double Integrals


Double integrals are very useful for finding the area of a region bounded by curves of functions. We describe this situation in more detail in the next section.
However, if the region is a rectangular shape, we can find its area by integrating the constant function f (x, y) = 1 over the region R .

 Definition: Area of a Region R

The area of the region R is given by

A(R) = ∬ 1 dA.
R

This definition makes sense because integrating f (x, y) = 1 over a rectangle gives the product of its length and width. Let’s check this formula with an example and see how this works.

 Example 15.1.8: Finding Area Using a Double Integral

Find the area of the region R = { (x, y) | 0 ≤ x ≤ 3, 0 ≤ y ≤ 2} by using a double integral, that is, by integrating 1 over the region R .
Solution
The region is rectangular with length 3 and width 2, so we know that the area is 6. We get the same answer when we use a double integral:
\[ A(R) = \int_0^2 \int_0^3 1\,dx\,dy = \int_0^2 \left[ x \,\Big|_0^3 \right] dy = \int_0^2 3\,dy = 3 \int_0^2 dy = 3y \,\Big|_0^2 = 3(2) = 6 \text{ units}^2. \]

We have already seen how double integrals can be used to find the volume of a solid bounded above by a function f (x, y) over a region R , provided f (x, y) ≥ 0 for all (x, y) in R . Here is another example to illustrate this concept.

 Example 15.1.9: Volume of an Elliptic Paraboloid

Find the volume V of the solid S that is bounded by the elliptic paraboloid \( 2x^2 + y^2 + z = 27 \), the planes x = 3 and y = 3 , and the three coordinate planes.
Solution
First notice the graph of the surface \( z = 27 - 2x^2 - y^2 \) in Figure 15.1.8(a) above the square region \( R_1 = [-3, 3] \times [-3, 3] \). However, we need the volume of the solid bounded by the elliptic paraboloid \( 2x^2 + y^2 + z = 27 \), the planes x = 3 and y = 3 , and the three coordinate planes.

Figure 15.1.8 : (a) The surface \( z = 27 - 2x^2 - y^2 \) above the square region \( R_1 = [-3, 3] \times [-3, 3] \). (b) The solid S lies under the surface \( z = 27 - 2x^2 - y^2 \) above the square region \( R_2 = [0, 3] \times [0, 3] \).
Now let’s look at the graph of the surface in Figure 15.1.8(b). We determine the volume V by evaluating the double integral over \( R_2 \):

\[
\begin{aligned}
V &= \iint_{R_2} z\,dA = \iint_{R_2} (27 - 2x^2 - y^2)\,dA \\
&= \int_{y=0}^{y=3} \int_{x=0}^{x=3} (27 - 2x^2 - y^2)\,dx\,dy && \text{Convert to an iterated integral.} \\
&= \int_{y=0}^{y=3} \left[ 27x - \frac{2}{3}x^3 - y^2 x \right] \Big|_{x=0}^{x=3} dy && \text{Integrate with respect to } x. \\
&= \int_{y=0}^{y=3} (63 - 3y^2)\,dy = \left( 63y - y^3 \right) \Big|_{y=0}^{y=3} = 162.
\end{aligned}
\]

 Exercise 15.1.5

Find the volume of the solid bounded above by the graph of \( f(x, y) = xy \sin(x^2 y) \) and below by the xy -plane on the rectangular region R = [0, 1] × [0, π] .

Hint
Graph the function, set up the integral, and use an iterated integral.

Answer
\( \frac{\pi}{2} \)

Recall that we defined the average value of a function of one variable on an interval [a, b] as

\[ f_{ave} = \frac{1}{b - a} \int_a^b f(x)\,dx. \]

Similarly, we can define the average value of a function of two variables over a region R . The main difference is that we divide by an area instead of the width
of an interval.

 Definition: Average Value of a Function


The average value of a function of two variables over a region R is

\[ f_{ave} = \frac{1}{\text{Area of } R} \iint_R f(x, y)\,dx\,dy. \]

In the next example we find the average value of a function over a rectangular region. This is a good example of estimating an integral from individual measurements taken over a grid, rather than from an algebraic expression for the function.

 Example 15.1.10: Calculating Average Storm Rainfall

The weather map in Figure 15.1.9 shows an unusually moist storm system associated with the remnants of Hurricane Karl, which dumped 4–8 inches
(100–200 mm) of rain in some parts of the Midwest on September 22–23, 2010. The area of rainfall measured 300 miles east to west and 250 miles north to
south. Estimate the average rainfall over the entire area in those two days.

Figure 15.1.9 : Effects of Hurricane Karl, which dumped 4–8 inches (100–200 mm) of rain in some parts of southwest Wisconsin, southern Minnesota, and
southeast South Dakota over a span of 300 miles east to west and 250 miles north to south.
Solution
Place the origin at the southwest corner of the map so that all the values can be considered as being in the first quadrant and hence all are positive. Now
divide the entire map into six rectangles (m = 2 and n = 3) , as shown in Figure 15.1.9. Assume f (x, y) denotes the storm rainfall in inches at a point
approximately x miles to the east of the origin and y miles to the north of the origin. Let R represent the entire area of 250 × 300 = 75000 square miles.
Then the area of each subrectangle is
1
ΔA = (75000) = 12500.
6

Assume \( (x_{ij}^*, y_{ij}^*) \) are approximately the midpoints of each subrectangle \( R_{ij} \). Note the color-coded region at each of these points, and estimate the rainfall. The rainfall at each of these points can be estimated as:
At \( (x_{11}^*, y_{11}^*) \), the rainfall is 0.08.
At \( (x_{12}^*, y_{12}^*) \), the rainfall is 0.08.
At \( (x_{13}^*, y_{13}^*) \), the rainfall is 0.01.
At \( (x_{21}^*, y_{21}^*) \), the rainfall is 1.70.
At \( (x_{22}^*, y_{22}^*) \), the rainfall is 1.74.
At \( (x_{23}^*, y_{23}^*) \), the rainfall is 3.00.

Figure 15.1.10: Storm rainfall with rectangular axes and showing the midpoints of each subrectangle.
According to our definition, the average storm rainfall in the entire area during those two days was

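\[
\begin{aligned}
f_{ave} &= \frac{1}{\text{Area } R} \iint_R f(x, y)\,dx\,dy = \frac{1}{75000} \iint_R f(x, y)\,dx\,dy \\
&\approx \frac{1}{75000} \sum_{i=1}^{2} \sum_{j=1}^{3} f(x_{ij}^*, y_{ij}^*)\,\Delta A \\
&= \frac{1}{75000} \left[ f(x_{11}^*, y_{11}^*)\Delta A + f(x_{12}^*, y_{12}^*)\Delta A + f(x_{13}^*, y_{13}^*)\Delta A + f(x_{21}^*, y_{21}^*)\Delta A + f(x_{22}^*, y_{22}^*)\Delta A + f(x_{23}^*, y_{23}^*)\Delta A \right] \\
&\approx \frac{1}{75000} \left[ 0.08 + 0.08 + 0.01 + 1.70 + 1.74 + 3.00 \right] \Delta A \\
&= \frac{1}{75000} \left[ 0.08 + 0.08 + 0.01 + 1.70 + 1.74 + 3.00 \right] 12500 \\
&= \frac{1}{6} \left[ 0.08 + 0.08 + 0.01 + 1.70 + 1.74 + 3.00 \right] \\
&\approx 1.10 \text{ in.}
\end{aligned}
\]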

During September 22–23, 2010 this area had an average storm rainfall of approximately 1.10 inches.
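The arithmetic in this estimate is easy to reproduce. In the sketch below, the six rainfall values are the estimates read from the map in the example; the variable names are illustrative assumptions.

```python
# Estimated rainfall (inches) at the midpoints of the six subrectangles,
# listed in the order (1,1), (1,2), (1,3), (2,1), (2,2), (2,3).
rainfall_inches = [0.08, 0.08, 0.01, 1.70, 1.74, 3.00]

total_area = 250 * 300                        # square miles
delta_A = total_area / len(rainfall_inches)   # area of each subrectangle

# Double Riemann sum divided by the area of the region.
riemann_sum = sum(r * delta_A for r in rainfall_inches)
average = riemann_sum / total_area
print(round(average, 2))  # 1.1
```

Because every subrectangle has the same area, the result reduces to the ordinary mean of the six sampled values.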

 Exercise 15.1.6

A contour map is shown for a function f (x, y) on the rectangle R = [−3, 6] × [−1, 4] .

a. Use the midpoint rule with m = 3 and n = 2 to estimate the value of \( \iint_R f(x, y)\,dA \).

b. Estimate the average value of the function f (x, y).

Hint
Divide the region into six rectangles, and use the contour lines to estimate the values for f (x, y).

Answer
Answers to both parts a. and b. may vary.

Key Concepts
We can use a double Riemann sum to approximate the volume of a solid bounded above by a function of two variables over a rectangular region. By taking
the limit, this becomes a double integral representing the volume of the solid.
Properties of double integrals are useful to simplify computation and find bounds on their values.
We can use Fubini’s theorem to write and evaluate a double integral as an iterated integral.
Double integrals are used to calculate the area of a region, the volume under a surface, and the average value of a function of two variables over a
rectangular region.

Key Equations
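\[ \iint_R f(x, y)\,dA = \lim_{m,n \to \infty} \sum_{i=1}^{m} \sum_{j=1}^{n} f(x_{ij}^*, y_{ij}^*)\,\Delta A \]

\[ \int_a^b \int_c^d f(x, y)\,dy\,dx = \int_a^b \left[ \int_c^d f(x, y)\,dy \right] dx \]

or

\[ \int_c^d \int_a^b f(x, y)\,dx\,dy = \int_c^d \left[ \int_a^b f(x, y)\,dx \right] dy \]

\[ f_{ave} = \frac{1}{\text{Area of } R} \iint_R f(x, y)\,dx\,dy \]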

Glossary
double integral
of the function f (x, y) over the region R in the xy-plane is defined as the limit of a double Riemann sum,
\[ \iint_R f(x, y)\,dA = \lim_{m,n \to \infty} \sum_{i=1}^{m} \sum_{j=1}^{n} f(x_{ij}^*, y_{ij}^*)\,\Delta A. \]

double Riemann sum


of the function f (x, y) over a rectangular region R is
\[ \sum_{i=1}^{m} \sum_{j=1}^{n} f(x_{ij}^*, y_{ij}^*)\,\Delta A, \]

where R is divided into smaller subrectangles \( R_{ij} \) and \( (x_{ij}^*, y_{ij}^*) \) is an arbitrary point in \( R_{ij} \)

Fubini’s theorem
if f (x, y) is a function of two variables that is continuous over a rectangular region \( R = \{(x, y) \in \mathbb{R}^2 \mid a \le x \le b,\ c \le y \le d\} \), then the double integral of f over the region equals an iterated integral,

\[ \iint_R f(x, y)\,dA = \int_a^b \int_c^d f(x, y)\,dy\,dx = \int_c^d \int_a^b f(x, y)\,dx\,dy \]

iterated integral
for a function f (x, y) over the region R is
a. \( \int_a^b \int_c^d f(x, y)\,dy\,dx = \int_a^b \left[ \int_c^d f(x, y)\,dy \right] dx, \)

b. \( \int_c^d \int_a^b f(x, y)\,dx\,dy = \int_c^d \left[ \int_a^b f(x, y)\,dx \right] dy, \)

where a, b, c, and d are any real numbers and R = [a, b] × [c, d]

15.1: Double Integrals over Rectangles is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
15.1: Double Integrals over Rectangular Regions by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0. Original source:
https://openstax.org/details/books/calculus-volume-1.

15.2: Double Integrals over General Regions
 Learning Objectives
Recognize when a function of two variables is integrable over a general region.
Evaluate a double integral by computing an iterated integral over a region bounded by two vertical lines and two functions of x, or
two horizontal lines and two functions of y .
Simplify the calculation of an iterated integral by changing the order of integration.
Use double integrals to calculate the volume of a region between two surfaces or the area of a plane region.
Solve problems involving double improper integrals.

Previously, we studied the concept of double integrals and examined the tools needed to compute them. We learned techniques and
properties to integrate functions of two variables over rectangular regions. We also discussed several applications, such as finding the
volume bounded above by a function over a rectangular region, finding area by integration, and calculating the average value of a function
of two variables.
In this section we consider double integrals of functions defined over a general bounded region D on the plane. Most of the previous results
hold in this situation as well, but some techniques need to be extended to cover this more general case.

General Regions of Integration


An example of a general bounded region D on a plane is shown in Figure 15.2.1. Since D is bounded on the plane, there must exist a rectangular region R on the same plane that encloses the region D ; that is, a rectangular region R exists such that D is a subset of R (D ⊆ R) .

Figure 15.2.1 : For a region D that is a subset of R , we can define a function g(x, y) to equal f (x, y) at every point in D and 0 at every
point of R not in D .
Suppose z = f (x, y) is defined on a general planar bounded region D as in Figure 15.2.1. In order to develop double integrals of f over D
we extend the definition of the function to include all points on the rectangular region R and then use the concepts and tools from the
preceding section. But how do we extend the definition of f to include all the points on R ? We do this by defining a new function g(x, y) on
R as follows:

\[ g(x, y) = \begin{cases} f(x, y), & \text{if } (x, y) \text{ is in } D \\ 0, & \text{if } (x, y) \text{ is in } R \text{ but not in } D \end{cases} \]

Note that we might have some technical difficulties if the boundary of D is complicated. So we assume the boundary to be a piecewise
smooth and continuous simple closed curve. Also, since all the results developed in the section on Double Integrals over Rectangular
Regions used an integrable function f (x, y) we must be careful about g(x, y) and verify that g(x, y) is an integrable function over the
rectangular region R . This happens as long as the region D is bounded by simple closed curves. For now we will concentrate on the
descriptions of the regions rather than the function and extend our theory appropriately for integration.
We consider two types of planar bounded regions.

 Definition: Type I and Type II regions


A region D in the xy-plane is of Type I if it lies between two vertical lines and the graphs of two continuous functions \( g_1(x) \) and \( g_2(x) \). That is (Figure 15.2.2),

\[ D = \{(x, y) \mid a \le x \le b,\ g_1(x) \le y \le g_2(x)\}. \]

15.2.1 https://math.libretexts.org/@go/page/4546
A region D in the xy-plane is of Type II if it lies between two horizontal lines and the graphs of two continuous functions \( h_1(y) \) and \( h_2(y) \). That is (Figure 15.2.3),

\[ D = \{(x, y) \mid c \le y \le d,\ h_1(y) \le x \le h_2(y)\}. \]

Figure 15.2.2 : A Type I region lies between two vertical lines and the graphs of two functions of x .

Figure 15.2.3 : A Type II region lies between two horizontal lines and the graphs of two functions of y .

 Example 15.2.1: Describing a Region as Type I and Also as Type II

Consider the region in the first quadrant between the functions \( y = \sqrt{x} \) and \( y = x^3 \) (Figure 15.2.4). Describe the region first as Type I and then as Type II.

Figure 15.2.4 : Region D can be described as Type I or as Type II.


When describing a region as Type I, we need to identify the function that lies above the region and the function that lies below the region. Here, region D is bounded above by \( y = \sqrt{x} \) and below by \( y = x^3 \) for x in the interval [0, 1]. Hence, as Type I, D is described as the set \( \{(x, y) \mid 0 \le x \le 1,\ x^3 \le y \le \sqrt{x}\} \).

However, when describing a region as Type II, we need to identify the function that lies on the left of the region and the function that lies on the right of the region. Here, the region D is bounded on the left by \( x = y^2 \) and on the right by \( x = \sqrt[3]{y} \) for y in the interval [0, 1]. Hence, as Type II, D is described as the set \( \{(x, y) \mid 0 \le y \le 1,\ y^2 \le x \le \sqrt[3]{y}\} \).

 Exercise 15.2.1
Consider the region in the first quadrant between the functions \( y = 2x \) and \( y = x^2 \). Describe the region first as Type I and then as Type II.

Hint
Graph the functions, and draw vertical and horizontal lines.

Answer
Type I and Type II are expressed as \( \{(x, y) \mid 0 \le x \le 2,\ x^2 \le y \le 2x\} \) and \( \{(x, y) \mid 0 \le y \le 4,\ \frac{1}{2} y \le x \le \sqrt{y}\} \), respectively.

Double Integrals over Non-rectangular Regions


To develop the concept and tools for evaluation of a double integral over a general, nonrectangular region, we need to first understand the
region and be able to express it as Type I or Type II or a combination of both. Without understanding the regions, we will not be able to
decide the limits of integrations in double integrals. As a first step, let us look at the following theorem.

 Theorem: Double Integrals over Nonrectangular Regions

Suppose g(x, y) is the extension to the rectangle R of the function f (x, y) defined on the region D , where D lies inside R as shown in Figure 15.2.1. Then g(x, y) is integrable and we define the double integral of f (x, y) over D by

\[ \iint_D f(x, y)\,dA = \iint_R g(x, y)\,dA. \]

The right-hand side of this equation is what we have seen before, so this theorem is reasonable because R is a rectangle and \( \iint_R g(x, y)\,dA \) has been discussed in the preceding section. Also, the equality works because the values of g(x, y) are 0 for any point (x, y) that lies outside D and hence these points do not add anything to the integral. However, it is important that the rectangle R contains the region D.
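The idea of extending f by zero outside D translates directly into code. The sketch below is an illustration under stated assumptions: the region D (between y = x² and y = 2x for 0 ≤ x ≤ 2), the enclosing rectangle R = [0, 2] × [0, 4], the choice f = 1, and the helper names are all hypothetical examples rather than anything defined by the theorem.

```python
def midpoint_double_sum(func, a, b, c, d, m=400, n=400):
    """Midpoint double Riemann sum of func over the rectangle [a, b] x [c, d]."""
    dx = (b - a) / m
    dy = (d - c) / n
    total = 0.0
    for i in range(m):
        x = a + (i + 0.5) * dx
        for j in range(n):
            y = c + (j + 0.5) * dy
            total += func(x, y)
    return total * dx * dy

def in_D(x, y):
    """Hypothetical region D: between y = x^2 and y = 2x for 0 <= x <= 2."""
    return 0 <= x <= 2 and x**2 <= y <= 2 * x

def f(x, y):
    return 1.0  # integrating 1 over D gives the area of D

def g(x, y):
    """Extension of f to R: equal to f on D and 0 at points of R outside D."""
    return f(x, y) if in_D(x, y) else 0.0

# R = [0, 2] x [0, 4] encloses D, so integrating g over R integrates f over D.
area_estimate = midpoint_double_sum(g, 0, 2, 0, 4)
print(area_estimate)  # roughly 4/3, the exact area of D
```

Because g jumps from f to 0 across the boundary of D, this kind of sum converges more slowly than it would for a continuous integrand, which is one practical reason to prefer the iterated integrals developed next.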

As a matter of fact, if the region D is bounded by smooth curves on a plane and we are able to describe it as Type I or Type II or a mix of
both, then we can use the following theorem and not have to find a rectangle R containing the region.

 Theorem: Fubini’s Theorem (Strong Form)

For a function f (x, y) that is continuous on a region D of Type I, we have

\[ \iint_D f(x, y)\,dA = \iint_D f(x, y)\,dy\,dx = \int_a^b \left[ \int_{g_1(x)}^{g_2(x)} f(x, y)\,dy \right] dx. \]

Similarly, for a function f (x, y) that is continuous on a region D of Type II, we have

\[ \iint_D f(x, y)\,dA = \iint_D f(x, y)\,dx\,dy = \int_c^d \left[ \int_{h_1(y)}^{h_2(y)} f(x, y)\,dx \right] dy. \]

The integral in each of these expressions is an iterated integral, similar to those we have seen before. Notice that, in the inner integral in the first expression, we integrate f (x, y) with x being held constant and the limits of integration being \( g_1(x) \) and \( g_2(x) \). In the inner integral in the second expression, we integrate f (x, y) with y being held constant and the limits of integration being \( h_1(y) \) and \( h_2(y) \).

 Example 15.2.2: Evaluating an Iterated Integral over a Type I Region

Evaluate the integral \( \iint_D x^2 e^{xy}\,dA \), where D is shown in Figure 15.2.5.

Solution
First construct the region as a Type I region (Figure 15.2.5). Here \( D = \{(x, y) \mid 0 \le x \le 2,\ \frac{1}{2} x \le y \le 1\} \). Then we have

\[ \iint_D x^2 e^{xy}\,dA = \int_{x=0}^{x=2} \int_{y=\frac{1}{2}x}^{y=1} x^2 e^{xy}\,dy\,dx. \]

Figure 15.2.5 : We can express region D as a Type I region and integrate from \( y = \frac{1}{2} x \) to y = 1 between the lines x = 0 and x = 2 .

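Therefore, we have

\[
\begin{aligned}
\int_{x=0}^{x=2} \int_{y=\frac{1}{2}x}^{y=1} x^2 e^{xy}\,dy\,dx &= \int_{x=0}^{x=2} \left[ \int_{y=\frac{1}{2}x}^{y=1} x^2 e^{xy}\,dy \right] dx && \text{Iterated integral for a Type I region.} \\
&= \int_{x=0}^{x=2} \left[ x^2\,\frac{e^{xy}}{x} \right] \Big|_{y=\frac{1}{2}x}^{y=1} dx && \text{Integrate with respect to } y. \\
&= \int_{x=0}^{x=2} \left[ x e^{x} - x e^{x^2/2} \right] dx && \text{Integrate with respect to } x. \\
&= \left[ x e^{x} - e^{x} - e^{x^2/2} \right] \Big|_{x=0}^{x=2} = 2.
\end{aligned}
\]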
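For a Type I region the only change from the rectangular case is that the inner limits depend on x. The following sketch approximates this example numerically with nested midpoint rules; the helper `midpoint_integral` and the function names are assumptions for illustration.

```python
import math

def midpoint_integral(g, lo, hi, n=400):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))

def f(x, y):
    return x**2 * math.exp(x * y)

def inner(x):
    # Type I region: for fixed x, y runs from the lower curve y = x/2 up to y = 1.
    return midpoint_integral(lambda y: f(x, y), x / 2, 1)

# Outer integral over the x-range 0 <= x <= 2.
value = midpoint_integral(inner, 0, 2)
print(value)  # close to the exact answer 2
```

A Type II region would be handled the same way with the roles of x and y exchanged, so that the inner limits become functions of y.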

In Example 15.2.2, we could have looked at the region in another way, such as D = {(x, y) | 0 ≤ y ≤ 1, 0 ≤ x ≤ 2y} (Figure 15.2.6).

Figure 15.2.6 .
This is a Type II region and the integral would then look like
\[ \iint_D x^2 e^{xy}\,dA = \int_{y=0}^{y=1} \int_{x=0}^{x=2y} x^2 e^{xy}\,dx\,dy. \]

However, if we integrate first with respect to x this integral is lengthy to compute because we have to use integration by parts twice.

 Example 15.2.3: Evaluating an Iterated Integral over a Type II Region

Evaluate the integral

\[ \iint_D (3x^2 + y^2)\,dA \]

where \( D = \{(x, y) \mid -2 \le y \le 3,\ y^2 - 3 \le x \le y + 3\} \).
Solution
Notice that D can be seen as either a Type I or a Type II region, as shown in Figure 15.2.7. However, in this case describing D as Type I
is more complicated than describing it as Type II. Therefore, we use D as a Type II region for the integration.

Figure 15.2.7 : The region D in this example can be either (a) Type I or (b) Type II.
Choosing this order of integration, we have

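\[
\begin{aligned}
\iint_D (3x^2 + y^2)\,dA &= \int_{y=-2}^{y=3} \int_{x=y^2-3}^{x=y+3} (3x^2 + y^2)\,dx\,dy && \text{Iterated integral, Type II region.} \\
&= \int_{y=-2}^{y=3} \left( x^3 + x y^2 \right) \Big|_{x=y^2-3}^{x=y+3} dy && \text{Integrate with respect to } x. \\
&= \int_{y=-2}^{y=3} \left( (y + 3)^3 + (y + 3) y^2 - (y^2 - 3)^3 - (y^2 - 3) y^2 \right) dy \\
&= \int_{-2}^{3} \left( 54 + 27y - 12y^2 + 2y^3 + 8y^4 - y^6 \right) dy \\
&= \left[ 54y + \frac{27y^2}{2} - 4y^3 + \frac{y^4}{2} + \frac{8y^5}{5} - \frac{y^7}{7} \right]_{-2}^{3} \\
&= \frac{2375}{7}.
\end{aligned}
\]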
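The last step of this computation involves many fractions, so it is a natural place for an exact arithmetic check. The sketch below evaluates the antiderivative at the two limits with Python's `fractions.Fraction`; the function name `antiderivative` is an illustrative choice.

```python
from fractions import Fraction

def antiderivative(y):
    """Antiderivative of 54 + 27y - 12y^2 + 2y^3 + 8y^4 - y^6, in exact arithmetic."""
    y = Fraction(y)
    return (54 * y
            + Fraction(27, 2) * y**2
            - 4 * y**3
            + Fraction(1, 2) * y**4
            + Fraction(8, 5) * y**5
            - Fraction(1, 7) * y**7)

# Evaluate between the outer limits y = -2 and y = 3.
value = antiderivative(3) - antiderivative(-2)
print(value)  # 2375/7
```

Working with exact fractions avoids the rounding that a floating-point check would introduce and reproduces the stated answer exactly.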

 Exercise 15.2.2

Sketch the region D and evaluate the iterated integral

\[ \iint_D xy\,dy\,dx \]

where D is the region bounded by the curves y = cos x and y = sin x in the interval [−3π/4, π/4] .

Hint
Express D as a Type I region, and integrate with respect to y first.

Answer
\( \frac{\pi}{4} \)

Recall from Double Integrals over Rectangular Regions the properties of double integrals. As we have seen from the examples here, all these properties are also valid for a function defined on a non-rectangular bounded region on a plane. In particular, property iii states:
If R = S ∪ T and S ∩ T = ∅ except at their boundaries, then

\[ \iint_R f(x, y)\,dA = \iint_S f(x, y)\,dA + \iint_T f(x, y)\,dA. \]

Similarly, we have the following property of double integrals over a non-rectangular bounded region on a plane.

 Theorem: Decomposing Regions into Smaller Regions

Suppose the region D can be expressed as \( D = D_1 \cup D_2 \), where \( D_1 \) and \( D_2 \) do not overlap except at their boundaries. Then

\[ \iint_D f(x, y)\,dA = \iint_{D_1} f(x, y)\,dA + \iint_{D_2} f(x, y)\,dA. \]

This theorem is particularly useful for non-rectangular regions because it allows us to split a region into a union of regions of Type I and
Type II. Then we can compute the double integral on each piece in a convenient way, as in the next example.

 Example 15.2.4: Decomposing Regions

Express the region D shown in Figure 15.2.8 as a union of regions of Type I or Type II, and evaluate the integral

\[ \iint_D (2x + 5y)\,dA. \]

Figure 15.2.8 : This region can be decomposed into a union of three regions of Type I or Type II.
Solution
The region D is not easy to decompose into any one type; it is actually a combination of different types. So we can write it as a union of three regions \(D_1\), \(D_2\), and \(D_3\), where

\[
\begin{aligned}
D_1 &= \{(x, y) \mid -2 \le x \le 0,\ 0 \le y \le (x + 2)^2\}, \\
D_2 &= \{(x, y) \mid 0 \le y \le 4,\ 0 \le x \le y - \tfrac{1}{16} y^3\}, \\
D_3 &= \{(x, y) \mid -4 \le y \le 0,\ -2 \le x \le y - \tfrac{1}{16} y^3\}.
\end{aligned}
\]

These regions are illustrated more clearly in Figure 15.2.9.

Figure 15.2.9 : Breaking the region into three subregions makes it easier to set up the integration.
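Here \(D_1\) is Type I and \(D_2\) and \(D_3\) are both of Type II. Hence,

\[
\begin{aligned}
\iint_D (2x + 5y)\,dA
&= \iint_{D_1} (2x + 5y)\,dA + \iint_{D_2} (2x + 5y)\,dA + \iint_{D_3} (2x + 5y)\,dA \\
&= \int_{x=-2}^{x=0} \int_{y=0}^{y=(x+2)^2} (2x + 5y)\,dy\,dx
 + \int_{y=0}^{y=4} \int_{x=0}^{x=y-(1/16)y^3} (2x + 5y)\,dx\,dy \\
&\quad + \int_{y=-4}^{y=0} \int_{x=-2}^{x=y-(1/16)y^3} (2x + 5y)\,dx\,dy \\
&= \int_{x=-2}^{x=0} \left[ \tfrac{1}{2}(2 + x)^2 (20 + 24x + 5x^2) \right] dx
 + \int_{y=0}^{y=4} \left[ \tfrac{1}{256} y^6 - \tfrac{7}{16} y^4 + 6y^2 \right] dy \\
&\quad + \int_{y=-4}^{y=0} \left[ \tfrac{1}{256} y^6 - \tfrac{7}{16} y^4 + 6y^2 + 10y - 4 \right] dy \\
&= \frac{40}{3} + \frac{1664}{35} - \frac{1696}{35} = \frac{1304}{105}.
\end{aligned}
\]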
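The decomposition can be checked numerically by integrating over each piece separately and adding the results. The sketch below is illustrative only: the helpers `simpson` and `iterated` are hypothetical names, and composite Simpson's rule is simply one convenient way to approximate each iterated integral.

```python
def simpson(f, a, b, n=200):
    """Composite Simpson rule on [a, b] using n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def iterated(f, outer_lo, outer_hi, inner_lo, inner_hi):
    """Integrate f(outer, inner); the inner limits depend on the outer variable."""
    return simpson(
        lambda u: simpson(lambda v: f(u, v), inner_lo(u), inner_hi(u)),
        outer_lo,
        outer_hi,
    )


# D1 is Type I: x is the outer variable, y the inner one.
d1 = iterated(lambda x, y: 2 * x + 5 * y, -2, 0, lambda x: 0, lambda x: (x + 2) ** 2)
# D2 and D3 are Type II: y is the outer variable, x the inner one.
d2 = iterated(lambda y, x: 2 * x + 5 * y, 0, 4, lambda y: 0, lambda y: y - y**3 / 16)
d3 = iterated(lambda y, x: 2 * x + 5 * y, -4, 0, lambda y: -2, lambda y: y - y**3 / 16)

total = d1 + d2 + d3
print(d1, d2, d3, total)  # total should be close to 1304/105
```

The three partial values should be close to 40/3, 1664/35, and −1696/35, matching the terms in the final line of the example.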

Now we could redo this example using a union of two Type II regions (see the Checkpoint).

 Exercise 15.2.3
Consider the region bounded by the curves y = ln x and \(y = e^x\) in the interval [1, 2]. Decompose the region into smaller regions of Type II.

Hint
Sketch the region, and split it into three regions to set it up.

Answer
\[ \{(x, y) \mid 0 \le y \le \ln 2,\ 1 \le x \le e^y\} \cup \{(x, y) \mid \ln 2 \le y \le e,\ 1 \le x \le 2\} \cup \{(x, y) \mid e \le y \le e^2,\ \ln y \le x \le 2\} \]

 Exercise 15.2.4

Redo Example 15.2.4 using a union of two Type II regions.

Hint
\[ \{(x, y) \mid 0 \le y \le 4,\ -2 + \sqrt{y} \le x \le y - \tfrac{1}{16} y^3\} \cup \{(x, y) \mid -4 \le y \le 0,\ -2 \le x \le y - \tfrac{1}{16} y^3\} \]

Answer
Same as in the example shown.

Changing the Order of Integration
As we have already seen when we evaluate an iterated integral, sometimes one order of integration leads to a computation that is
significantly simpler than the other order of integration. Sometimes the order of integration does not matter, but it is important to learn to
recognize when a change in order will simplify our work.

 Example 15.2.5: Changing the Order of Integration

Reverse the order of integration in the iterated integral


\[ \int_{x=0}^{x=\sqrt{2}} \int_{y=0}^{y=2-x^2} x e^{x^2}\,dy\,dx. \]

Then evaluate the new iterated integral.


Solution
The region as presented is of Type I. To reverse the order of integration, we must first express the region as Type II. Refer to Figure
15.2.10.

Figure 15.2.10: Converting a region from Type I to Type II.


We can see from the limits of integration that the region is bounded above by \(y = 2 - x^2\) and below by \(y = 0\), where \(x\) is in the interval \([0, \sqrt{2}]\). By reversing the order, we have the region bounded on the left by \(x = 0\) and on the right by \(x = \sqrt{2 - y}\), where \(y\) is in the interval \([0, 2]\). We solved \(y = 2 - x^2\) in terms of \(x\) to obtain \(x = \sqrt{2 - y}\).

Hence

\[
\begin{aligned}
\int_0^{\sqrt{2}} \int_0^{2-x^2} x e^{x^2}\,dy\,dx
&= \int_0^2 \int_0^{\sqrt{2-y}} x e^{x^2}\,dx\,dy && \text{Reverse the order of integration, then use substitution.} \\
&= \int_0^2 \left[ \frac{1}{2} e^{x^2} \right]_0^{\sqrt{2-y}} dy = \int_0^2 \frac{1}{2} \left( e^{2-y} - 1 \right) dy \\
&= -\frac{1}{2} \left( e^{2-y} + y \right) \Big|_0^2 = \frac{1}{2} (e^2 - 3).
\end{aligned}
\]
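A numerical check can confirm that both orders of integration give the same value. This is a rough sketch rather than part of the original solution; `simpson` is a hypothetical helper implementing composite Simpson's rule, and the limits are exactly those of the Type I and Type II descriptions above.

```python
import math


def simpson(f, a, b, n=200):
    """Composite Simpson rule on [a, b] using n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def integrand(x, y):
    # x * e^(x^2); the integrand happens not to depend on y.
    return x * math.exp(x * x)


# Original order (Type I): 0 <= y <= 2 - x^2 for 0 <= x <= sqrt(2).
dy_dx = simpson(
    lambda x: simpson(lambda y: integrand(x, y), 0.0, 2.0 - x * x),
    0.0,
    math.sqrt(2.0),
)
# Reversed order (Type II): 0 <= x <= sqrt(2 - y) for 0 <= y <= 2.
dx_dy = simpson(
    lambda y: simpson(lambda x: integrand(x, y), 0.0, math.sqrt(2.0 - y)),
    0.0,
    2.0,
)

exact = (math.e ** 2 - 3) / 2
print(dy_dx, dx_dy, exact)
```

Both approximations should agree with \((e^2 - 3)/2\) to several decimal places, although only the reversed order leads to an elementary antiderivative by hand.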

 Example 15.2.6: Evaluating an Iterated Integral by Reversing the Order of Integration

Consider the iterated integral

\[ \iint_R f(x, y)\,dx\,dy \]

where z = f(x, y) = x − 2y over a triangular region R that has sides on x = 0, y = 0, and the line x + y = 1. Sketch the region, and then evaluate the iterated integral by
a. integrating first with respect to y and then
b. integrating first with respect to x.
Solution
A sketch of the region appears in Figure 15.2.11.

Figure 15.2.11: A triangular region R for integrating in two ways.
We can complete this integration in two different ways.
a. One way to look at it is by first integrating y from y = 0 to y = 1 − x vertically and then integrating x from x = 0 to x = 1:

\[
\begin{aligned}
\iint_R f(x, y)\,dx\,dy
&= \int_{x=0}^{x=1} \int_{y=0}^{y=1-x} (x - 2y)\,dy\,dx
 = \int_{x=0}^{x=1} \left[ xy - y^2 \right]_{y=0}^{y=1-x} dx \\
&= \int_{x=0}^{x=1} \left[ x(1 - x) - (1 - x)^2 \right] dx
 = \int_{x=0}^{x=1} \left[ -1 + 3x - 2x^2 \right] dx \\
&= \left[ -x + \frac{3}{2} x^2 - \frac{2}{3} x^3 \right]_{x=0}^{x=1} = -\frac{1}{6}.
\end{aligned}
\]

b. The other way to do this problem is by first integrating x from x = 0 to x = 1 − y horizontally and then integrating y from y = 0 to y = 1:

\[
\begin{aligned}
\iint_R f(x, y)\,dx\,dy
&= \int_{y=0}^{y=1} \int_{x=0}^{x=1-y} (x - 2y)\,dx\,dy
 = \int_{y=0}^{y=1} \left[ \frac{1}{2} x^2 - 2xy \right]_{x=0}^{x=1-y} dy \\
&= \int_{y=0}^{y=1} \left[ \frac{1}{2} (1 - y)^2 - 2y(1 - y) \right] dy
 = \int_{y=0}^{y=1} \left[ \frac{1}{2} - 3y + \frac{5}{2} y^2 \right] dy \\
&= \left[ \frac{1}{2} y - \frac{3}{2} y^2 + \frac{5}{6} y^3 \right]_{y=0}^{y=1} = -\frac{1}{6}.
\end{aligned}
\]

 Exercise 15.2.5

Evaluate the iterated integral \(\iint_D (x^2 + y^2)\,dA\) over the region D in the first quadrant between the functions y = 2x and \(y = x^2\).

Evaluate the iterated integral by integrating first with respect to y and then integrating first with respect to x.

Hint
Sketch the region and follow Example 15.2.6.

Answer
216/35

Calculating Volumes, Areas, and Average Values


We can use double integrals over general regions to compute volumes, areas, and average values. The methods are the same as those in
Double Integrals over Rectangular Regions, but without the restriction to a rectangular region, we can now solve a wider variety of
problems.

 Example 15.2.7: Finding the Volume of a Tetrahedron
Find the volume of the solid bounded by the planes x = 0, y = 0, z = 0 , and 2x + 3y + z = 6 .
Solution
The solid is a tetrahedron with the base on the xy-plane and a height z = 6 − 2x − 3y . The base is the region D bounded by the lines,
x = 0 , y = 0 and 2x + 3y = 6 where z = 0 (Figure 15.2.12). Note that we can consider the region D as Type I or as Type II, and we

can integrate in both ways.


Figure 15.2.12 : A tetrahedron consisting of the three coordinate planes and the plane z = 6 − 2x − 3y , with the base bound by
x = 0, y = 0 , and 2x + 3y = 6 .
First, consider D as a Type I region, and hence \(D = \{(x, y) \mid 0 \le x \le 3,\ 0 \le y \le 2 - \tfrac{2}{3} x\}\).
Therefore, the volume is

\[
\begin{aligned}
V &= \int_{x=0}^{x=3} \int_{y=0}^{y=2-(2x/3)} (6 - 2x - 3y)\,dy\,dx
 = \int_{x=0}^{x=3} \left[ 6y - 2xy - \frac{3}{2} y^2 \right]_{y=0}^{y=2-(2x/3)} dx \\
&= \int_{x=0}^{x=3} \frac{2}{3} (x - 3)^2\,dx = 6.
\end{aligned}
\]

Now consider D as a Type II region, so \(D = \{(x, y) \mid 0 \le y \le 2,\ 0 \le x \le 3 - \tfrac{3}{2} y\}\). In this calculation, the volume is

\[
\begin{aligned}
V &= \int_{y=0}^{y=2} \int_{x=0}^{x=3-(3y/2)} (6 - 2x - 3y)\,dx\,dy
 = \int_{y=0}^{y=2} \left[ 6x - x^2 - 3xy \right]_{x=0}^{x=3-(3y/2)} dy \\
&= \int_{y=0}^{y=2} \frac{9}{4} (y - 2)^2\,dy = 6.
\end{aligned}
\]

Therefore, the volume is 6 cubic units.
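As a quick cross-check, the two descriptions of the base region can be integrated numerically. The sketch below is only an illustration; `simpson` and `height` are hypothetical names, and Simpson's rule happens to be exact here up to rounding because the integrands are low-degree polynomials.

```python
def simpson(f, a, b, n=200):
    """Composite Simpson rule on [a, b] using n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def height(x, y):
    """Height of the plane 2x + 3y + z = 6 above the point (x, y)."""
    return 6 - 2 * x - 3 * y


# Type I: 0 <= x <= 3, 0 <= y <= 2 - 2x/3.
volume_type1 = simpson(
    lambda x: simpson(lambda y: height(x, y), 0.0, 2.0 - 2.0 * x / 3.0), 0.0, 3.0
)
# Type II: 0 <= y <= 2, 0 <= x <= 3 - 3y/2.
volume_type2 = simpson(
    lambda y: simpson(lambda x: height(x, y), 0.0, 3.0 - 1.5 * y), 0.0, 2.0
)

print(volume_type1, volume_type2)  # both should be very close to 6
```

The value also agrees with the elementary formula for a tetrahedron, one third of the base area (3) times the height (6).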

 Exercise 15.2.6

Find the volume of the solid bounded above by f(x, y) = 10 − 2x + y over the region enclosed by the curves y = 0 and \(y = e^x\), where x is in the interval [0, 1].

Hint
Sketch the region, and describe it as Type I.
Answer
\(\frac{e^2}{4} + 10e - \frac{49}{4}\) cubic units

Finding the area of a rectangular region is easy, but finding the area of a non-rectangular region is not so easy. As we have seen, we can use
double integrals to find a rectangular area. As a matter of fact, this comes in very handy for finding the area of a general non-rectangular
region, as stated in the next definition.

 Definition: Double Integrals

The area of a plane-bounded region D is defined as the double integral

\[ \iint_D 1\,dA. \]

We have already seen how to find areas in terms of single integration. Here we are seeing another way of finding areas by using double
integrals, which can be very useful, as we will see in the later sections of this chapter.

 Example 15.2.8: Finding the Area of a Region
Find the area of the region bounded below by the curve \(y = x^2\) and above by the line y = 2x in the first quadrant (Figure 15.2.13).

Figure 15.2.13: The region bounded by \(y = x^2\) and y = 2x.

Solution
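We just have to integrate the constant function f(x, y) = 1 over the region. Thus, the area A of the bounded region is \(\int_{x=0}^{x=2} \int_{y=x^2}^{y=2x} dy\,dx\) or \(\int_{y=0}^{y=4} \int_{x=y/2}^{x=\sqrt{y}} dx\,dy\):

\[
\begin{aligned}
A &= \iint_D 1\,dx\,dy = \int_{x=0}^{x=2} \int_{y=x^2}^{y=2x} 1\,dy\,dx \\
&= \int_{x=0}^{x=2} \left[ y \right]_{y=x^2}^{y=2x} dx = \int_{x=0}^{x=2} (2x - x^2)\,dx \\
&= \left[ x^2 - \frac{x^3}{3} \right]_0^2 = \frac{4}{3}.
\end{aligned}
\]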

 Exercise 15.2.7

Find the area of a region bounded above by the curve \(y = x^3\) and below by y = 0 over the interval [0, 3].

Hint
Sketch the region.

Answer
81/4 square units

We can also use a double integral to find the average value of a function over a general region. The definition is a direct extension of the
earlier formula.

 Definition: The Average Value of a Function

If f(x, y) is integrable over a plane-bounded region D with positive area A(D), then the average value of the function is

\[ f_{ave} = \frac{1}{A(D)} \iint_D f(x, y)\,dA. \]

Note that the area is \(A(D) = \iint_D 1\,dA\).

 Example 15.2.9: Finding an Average Value
Find the average value of the function \(f(x, y) = 7xy^2\) on the region bounded by the line x = y and the curve \(x = \sqrt{y}\) (Figure 15.2.14).

Figure 15.2.14: The region bounded by x = y and \(x = \sqrt{y}\).


Solution
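First find the area A(D), where the region D is given by the figure. We have

\[
A(D) = \iint_D 1\,dA = \int_{y=0}^{y=1} \int_{x=y}^{x=\sqrt{y}} 1\,dx\,dy
= \int_{y=0}^{y=1} \left[ x \right]_{x=y}^{x=\sqrt{y}} dy
= \int_{y=0}^{y=1} (\sqrt{y} - y)\,dy
= \left[ \frac{2}{3} y^{3/2} - \frac{y^2}{2} \right]_0^1 = \frac{1}{6}.
\]

Then the average value of the given function over this region is

\[
\begin{aligned}
f_{ave} &= \frac{1}{A(D)} \iint_D f(x, y)\,dA
 = \frac{1}{A(D)} \int_{y=0}^{y=1} \int_{x=y}^{x=\sqrt{y}} 7xy^2\,dx\,dy
 = \frac{1}{1/6} \int_{y=0}^{y=1} \left[ \frac{7}{2} x^2 y^2 \right]_{x=y}^{x=\sqrt{y}} dy \\
&= 6 \int_{y=0}^{y=1} \frac{7}{2} y^2 (y - y^2)\,dy
 = 6 \int_{y=0}^{y=1} \frac{7}{2} (y^3 - y^4)\,dy
 = \frac{42}{2} \left[ \frac{y^4}{4} - \frac{y^5}{5} \right]_0^1 = \frac{42}{40} = \frac{21}{20}.
\end{aligned}
\]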
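The average value can also be approximated numerically as the ratio of two double integrals. In the sketch below (illustrative names only), the same region is rewritten as a Type I region, \(0 \le x \le 1\), \(x^2 \le y \le x\); that rewriting is an assumption made here so that every integrand is a polynomial, and it is not the setup used in the solution above.

```python
def simpson(f, a, b, n=200):
    """Composite Simpson rule on [a, b] using n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def over_region(f):
    """Integrate f(x, y) over 0 <= x <= 1, x^2 <= y <= x (Type I form of D)."""
    return simpson(lambda x: simpson(lambda y: f(x, y), x * x, x), 0.0, 1.0)


area = over_region(lambda x, y: 1.0)              # A(D)
integral = over_region(lambda x, y: 7 * x * y * y)  # double integral of 7xy^2
average = integral / area

print(area, integral, average)  # close to 1/6, 7/40, and 21/20
```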

 Exercise 15.2.8

Find the average value of the function f (x, y) = xy over the triangle with vertices (0, 0), (1, 0) and (1, 3).

Hint
Express the line joining (0, 0) and (1, 3) as a function y = g(x) .

Answer
3/4

Improper Double Integrals


An improper double integral is an integral \(\iint_D f\,dA\) where either D is an unbounded region or f is an unbounded function. For example, \(D = \{(x, y) \mid |x - y| \ge 2\}\) is an unbounded region, and the function \(f(x, y) = 1/(1 - x^2 - 2y^2)\) over the ellipse \(x^2 + 3y^2 \le 1\) is an unbounded function. Hence, both of the following integrals are improper integrals:

i. \(\iint_D xy\,dA\) where \(D = \{(x, y) \mid |x - y| \ge 2\}\);

ii. \(\iint_D \frac{1}{1 - x^2 - 2y^2}\,dA\) where \(D = \{(x, y) \mid x^2 + 3y^2 \le 1\}\).

In this section we would like to deal with improper integrals of functions over rectangles or simple regions such that f has only finitely many
discontinuities. Not all such improper integrals can be evaluated; however, a form of Fubini’s theorem does apply for some types of
improper integrals.

 Theorem: Fubini’s Theorem for Improper Integrals
If D is a bounded rectangle or simple region in the plane defined by
\(\{(x, y) : a \le x \le b,\ g(x) \le y \le h(x)\}\) and also by
\(\{(x, y) : c \le y \le d,\ j(y) \le x \le k(y)\}\), and f is a nonnegative function on D with finitely many discontinuities in the interior of D, then

\[ \iint_D f\,dA = \int_{x=a}^{x=b} \int_{y=g(x)}^{y=h(x)} f(x, y)\,dy\,dx = \int_{y=c}^{y=d} \int_{x=j(y)}^{x=k(y)} f(x, y)\,dx\,dy. \]

It is very important to note that we required that the function be nonnegative on D for the theorem to work. We consider only the case where
the function has finitely many discontinuities inside D.

 Example 15.2.10: Evaluating a Double Improper Integral


Consider the function \(f(x, y) = \dfrac{e^y}{y}\) over the region \(D = \{(x, y) : 0 \le x \le 1,\ x \le y \le \sqrt{x}\}\).

Notice that the function is nonnegative and continuous at all points on D except (0, 0). Use Fubini’s theorem to evaluate the improper
integral.
Solution
First we plot the region D (Figure 15.2.15); then we express it in another way.

Figure 15.2.15: The function f is continuous at all points of the region D except (0, 0).
The other way to express the same region D is

\[ D = \{(x, y) : 0 \le y \le 1,\ y^2 \le x \le y\}. \]

Thus we can use Fubini’s theorem for improper integrals and evaluate the integral as

\[ \int_{y=0}^{y=1} \int_{x=y^2}^{x=y} \frac{e^y}{y}\,dx\,dy. \]

Therefore, we have

\[
\int_{y=0}^{y=1} \int_{x=y^2}^{x=y} \frac{e^y}{y}\,dx\,dy
= \int_{y=0}^{y=1} \frac{e^y}{y}\,x \Big|_{x=y^2}^{x=y}\,dy
= \int_{y=0}^{y=1} \frac{e^y}{y} (y - y^2)\,dy
= \int_0^1 (e^y - y e^y)\,dy = e - 2.
\]
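The value e − 2 can be sanity-checked numerically. The sketch below uses a midpoint rule in y, chosen here (as an assumption of this illustration, not a method from the text) because midpoints never land on y = 0, where \(e^y/y\) is undefined; `midpoint` and `inner_integral` are hypothetical helper names.

```python
import math


def midpoint(f, a, b, n=2000):
    """Composite midpoint rule on [a, b]; never evaluates f at the endpoints."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def inner_integral(y):
    # e^y / y is constant in x, so integrating over y^2 <= x <= y
    # multiplies it by the length of that interval, y - y^2.
    return math.exp(y) / y * (y - y * y)


approx = midpoint(inner_integral, 0.0, 1.0)
print(approx, math.e - 2)
```

The agreement reflects the fact that, after the inner integration, the factor 1/y cancels and the remaining integrand \(e^y(1 - y)\) is bounded on [0, 1].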

As mentioned before, we also have an improper integral if the region of integration is unbounded. Suppose now that the function f is
continuous in an unbounded rectangle R .

 Theorem: Improper Integrals on an Unbounded Region

If R is an unbounded rectangle such as \(R = \{(x, y) : a \le x < \infty,\ c \le y < \infty\}\), then when the limit exists, we have

\[
\iint_R f(x, y)\,dA
= \lim_{(b,d) \to (\infty,\infty)} \int_a^b \left( \int_c^d f(x, y)\,dy \right) dx
= \lim_{(b,d) \to (\infty,\infty)} \int_c^d \left( \int_a^b f(x, y)\,dx \right) dy.
\]

The following example shows how this theorem can be used in certain cases of improper integrals.

 Example 15.2.11

Evaluate the integral \(\iint_R xy\,e^{-x^2 - y^2}\,dA\) where R is the first quadrant of the plane.

Solution
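The region R is the first quadrant of the plane, which is unbounded. So

\[
\begin{aligned}
\iint_R xy\,e^{-x^2 - y^2}\,dA
&= \lim_{(b,d) \to (\infty,\infty)} \int_{x=0}^{x=b} \left( \int_{y=0}^{y=d} xy\,e^{-x^2 - y^2}\,dy \right) dx \\
&= \lim_{(b,d) \to (\infty,\infty)} \left( \int_{x=0}^{x=b} x e^{-x^2}\,dx \right) \left( \int_{y=0}^{y=d} y e^{-y^2}\,dy \right) \\
&= \lim_{(b,d) \to (\infty,\infty)} \frac{1}{4} \left( 1 - e^{-b^2} \right) \left( 1 - e^{-d^2} \right) = \frac{1}{4}.
\end{aligned}
\]

Thus, \(\iint_R xy\,e^{-x^2 - y^2}\,dA\) is convergent and the value is \(\frac{1}{4}\).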
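To see the limit at work, one can evaluate the truncated integral over \([0, b] \times [0, d]\) for growing b and d. The sketch below is illustrative; `truncated` and `closed_form` are hypothetical names, and the numerical version relies on the integrand factoring into a function of x times a function of y.

```python
import math


def simpson(f, a, b, n=1000):
    """Composite Simpson rule on [a, b] using n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def truncated(b, d):
    """Integral of x*y*exp(-x^2 - y^2) over [0, b] x [0, d], using separability."""
    ix = simpson(lambda x: x * math.exp(-x * x), 0.0, b)
    iy = simpson(lambda y: y * math.exp(-y * y), 0.0, d)
    return ix * iy


def closed_form(b, d):
    """The expression (1/4)(1 - e^{-b^2})(1 - e^{-d^2}) from the example."""
    return 0.25 * (1 - math.exp(-b * b)) * (1 - math.exp(-d * d))


for size in (1, 2, 4, 8):
    print(size, truncated(size, size), closed_form(size, size))
```

The printed values approach 1/4 quickly because the factors \(e^{-b^2}\) and \(e^{-d^2}\) decay very fast.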

 Exercise 15.2.9

Evaluate the improper integral

\[ \iint_D \frac{y}{\sqrt{1 - x^2 - y^2}}\,dA \]

where \(D = \{(x, y) : x \ge 0,\ y \ge 0,\ x^2 + y^2 \le 1\}\).

Hint
Notice that the integrand is nonnegative and discontinuous on \(x^2 + y^2 = 1\). Express the region D as \(D = \{(x, y) : 0 \le x \le 1,\ 0 \le y \le \sqrt{1 - x^2}\}\) and integrate using the method of substitution.

Answer
π/4

In some situations in probability theory, we can gain insight into a problem when we are able to use double integrals over general regions.
Before we go over an example with a double integral, we need to set a few definitions and become familiar with some important properties.

 Definition: Joint Density Function

Consider a pair of continuous random variables X and Y such as the birthdays of two people or the number of sunny and rainy days in a month. The joint density function f of X and Y gives the probability that (X, Y) lies in a certain region D:

\[ P((X, Y) \in D) = \iint_D f(x, y)\,dA. \]

Since probabilities can never be negative and must lie between 0 and 1, the joint density function satisfies the following inequality and equation:

\[ f(x, y) \ge 0 \quad \text{and} \quad \iint_{\mathbb{R}^2} f(x, y)\,dA = 1. \]

 Definition: Independent Random Variables

The variables X and Y are said to be independent random variables if their joint density function is the product of their individual
density functions:

\[ f(x, y) = f_1(x) f_2(y). \]

 Example 15.2.12: Application to Probability

At Sydney’s Restaurant, customers must wait an average of 15 minutes for a table. From the time they are seated until they have finished
their meal requires an additional 40 minutes, on average. What is the probability that a customer spends less than an hour and a half at
the diner, assuming that waiting for a table and completing the meal are independent events?
Solution
Waiting times are mathematically modeled by exponential density functions, with m being the average waiting time, as

\[
f(t) = \begin{cases} 0, & \text{if } t < 0 \\ \dfrac{1}{m} e^{-t/m}, & \text{if } t \ge 0. \end{cases}
\]

If X and Y are random variables for ‘waiting for a table’ and ‘completing the meal,’ then the probability density functions are, respectively,

\[
f_1(x) = \begin{cases} 0, & \text{if } x < 0 \\ \dfrac{1}{15} e^{-x/15}, & \text{if } x \ge 0 \end{cases}
\quad \text{and} \quad
f_2(y) = \begin{cases} 0, & \text{if } y < 0 \\ \dfrac{1}{40} e^{-y/40}, & \text{if } y \ge 0. \end{cases}
\]

Since the events are assumed to be independent, the joint density function is the product of the individual functions:

\[
f(x, y) = f_1(x) f_2(y) = \begin{cases} 0, & \text{if } x < 0 \text{ or } y < 0, \\ \dfrac{1}{600} e^{-x/15} e^{-y/40}, & \text{if } x, y \ge 0. \end{cases}
\]

We want to find the probability that the combined time X + Y is less than 90 minutes. In terms of geometry, it means that the region D
is in the first quadrant bounded by the line x + y = 90 (Figure 15.2.16).

Figure 15.2.16: The region of integration for a joint probability density function.
Hence, the probability that (X, Y) is in the region D is

\[ P(X + Y \le 90) = P((X, Y) \in D) = \iint_D f(x, y)\,dA = \iint_D \frac{1}{600} e^{-x/15} e^{-y/40}\,dA. \]

Since x + y = 90 is the same as y = 90 − x, we have a region of Type I, so

\[ D = \{(x, y) \mid 0 \le x \le 90,\ 0 \le y \le 90 - x\}, \]

\[
\begin{aligned}
P(X + Y \le 90) &= \frac{1}{600} \int_{x=0}^{x=90} \int_{y=0}^{y=90-x} e^{-x/15} e^{-y/40}\,dy\,dx \\
&= \frac{1}{600} \int_{x=0}^{x=90} \int_{y=0}^{y=90-x} e^{-(x/15 + y/40)}\,dy\,dx \approx 0.8328.
\end{aligned}
\]

Thus, there is about an 83.3% chance that a customer spends less than an hour and a half at the restaurant.
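The decimal value quoted above comes from evaluating the iterated integral, which can be reproduced numerically. This is a minimal sketch with hypothetical helper names (`simpson`, `joint_density`); it integrates the joint density over the triangle \(0 \le x \le 90\), \(0 \le y \le 90 - x\) in the order dy dx.

```python
import math


def simpson(f, a, b, n=200):
    """Composite Simpson rule on [a, b] using n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def joint_density(x, y):
    """Product of exponential densities with means 15 and 40 minutes."""
    if x < 0 or y < 0:
        return 0.0
    return math.exp(-x / 15) * math.exp(-y / 40) / 600


# Type I region: for each x in [0, 90], y runs from 0 to 90 - x.
probability = simpson(
    lambda x: simpson(lambda y: joint_density(x, y), 0.0, 90.0 - x), 0.0, 90.0
)
print(round(probability, 4))
```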

Another important application in probability that can involve improper double integrals is the calculation of expected values. First we define
this concept and then show an example of a calculation.

 Definition: Expected Values

In probability theory, we denote the expected values E(X) and E(Y), respectively, as the average (mean) outcomes of the random variables. The expected values E(X) and E(Y) are given by

\[ E(X) = \iint_S x\,f(x, y)\,dA \quad \text{and} \quad E(Y) = \iint_S y\,f(x, y)\,dA, \]

where S is the sample space of the random variables X and Y .

 Example 15.2.13: Finding Expected Value

Find the expected time for the events ‘waiting for a table’ and ‘completing the meal’ in Example 15.2.12.
Solution
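Using the first quadrant of the rectangular coordinate plane as the sample space, we have improper integrals for E(X) and E(Y). The expected time for a table is

\[
\begin{aligned}
E(X) &= \iint_S x\,\frac{1}{600} e^{-x/15} e^{-y/40}\,dA \\
&= \frac{1}{600} \int_{x=0}^{x=\infty} \int_{y=0}^{y=\infty} x e^{-x/15} e^{-y/40}\,dy\,dx \\
&= \frac{1}{600} \lim_{(a,b) \to (\infty,\infty)} \int_{x=0}^{x=a} \int_{y=0}^{y=b} x e^{-x/15} e^{-y/40}\,dy\,dx \\
&= \frac{1}{600} \left( \lim_{a \to \infty} \int_{x=0}^{x=a} x e^{-x/15}\,dx \right) \left( \lim_{b \to \infty} \int_{y=0}^{y=b} e^{-y/40}\,dy \right) \\
&= \frac{1}{600} \left( \lim_{a \to \infty} \left[ -15 e^{-x/15} (x + 15) \right]_{x=0}^{x=a} \right) \left( \lim_{b \to \infty} \left[ -40 e^{-y/40} \right]_{y=0}^{y=b} \right) \\
&= \frac{1}{600} \left( \lim_{a \to \infty} \left( -15 e^{-a/15} (a + 15) + 225 \right) \right) \left( \lim_{b \to \infty} \left( -40 e^{-b/40} + 40 \right) \right) \\
&= \frac{1}{600} (225)(40) = 15.
\end{aligned}
\]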

A similar calculation shows that E(Y ) = 40 . This means that the expected values of the two random events are the average waiting
time and the average dining time, respectively.
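The limiting process in this example can be illustrated by evaluating the truncated expression for increasing cutoffs. The sketch below is illustrative only; the function name `truncated_expected_wait` is hypothetical, and it simply codes the evaluated antiderivatives \(-15e^{-a/15}(a + 15) + 225\) and \(-40e^{-b/40} + 40\) from the solution.

```python
import math


def truncated_expected_wait(a, b):
    """(1/600) * int_0^a x e^{-x/15} dx * int_0^b e^{-y/40} dy, in closed form."""
    x_part = -15 * math.exp(-a / 15) * (a + 15) + 225
    y_part = -40 * math.exp(-b / 40) + 40
    return x_part * y_part / 600


# The truncated values increase toward the expected wait of 15 minutes.
for size in (50, 100, 200, 400, 800):
    print(size, truncated_expected_wait(size, size))
```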

 Exercise 15.2.10

The joint density function for two random variables X and Y is given by

\[
f(x, y) = \begin{cases} \dfrac{1}{6000} (x^2 + y^2), & \text{if } 0 \le x \le 15,\ 0 \le y \le 10 \\ 0, & \text{otherwise.} \end{cases}
\]

Find the probability that X is at most 10 and Y is at least 5.

Hint
Compute the probability

\[ P(X \le 10, Y \ge 5) = \int_{x=0}^{x=10} \int_{y=5}^{y=10} \frac{1}{6000} (x^2 + y^2)\,dy\,dx. \]

Answer
55/72 ≈ 0.7638

Key Concepts
A general bounded region D on the plane is a region that can be enclosed inside a rectangular region. We can use this idea to define a
double integral over a general bounded region.
To evaluate an iterated integral of a function over a general nonrectangular region, we sketch the region and express it as a Type I or as a
Type II region or as a union of several Type I or Type II regions that overlap only on their boundaries.
We can use double integrals to find volumes, areas, and average values of a function over general regions, similarly to calculations over
rectangular regions.
We can use Fubini’s theorem for improper integrals to evaluate some types of improper integrals.

Key Equations
Iterated integral over a Type I region

\[ \iint_D f(x, y)\,dA = \iint_D f(x, y)\,dy\,dx = \int_a^b \left[ \int_{g_1(x)}^{g_2(x)} f(x, y)\,dy \right] dx \]

Iterated integral over a Type II region

\[ \iint_D f(x, y)\,dA = \iint_D f(x, y)\,dx\,dy = \int_c^d \left[ \int_{h_1(y)}^{h_2(y)} f(x, y)\,dx \right] dy \]

Glossary
improper double integral
a double integral over an unbounded region or of an unbounded function

Type I
a region D in the xy-plane is Type I if it lies between two vertical lines and the graphs of two continuous functions \(g_1(x)\) and \(g_2(x)\)

Type II
a region D in the xy-plane is Type II if it lies between two horizontal lines and the graphs of two continuous functions \(h_1(y)\) and \(h_2(y)\)

15.2: Double Integrals over General Regions is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
15.2: Double Integrals over General Regions by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0. Original source:
https://openstax.org/details/books/calculus-volume-1.

15.3: Double Integrals in Polar Coordinates
 Learning Objectives
Recognize the format of a double integral over a polar rectangular region.
Evaluate a double integral in polar coordinates by using an iterated integral.
Recognize the format of a double integral over a general polar region.
Use double integrals in polar coordinates to calculate areas and volumes.

Double integrals are sometimes much easier to evaluate if we change rectangular coordinates to polar coordinates. However, before
we describe how to make this change, we need to establish the concept of a double integral in a polar rectangular region.

Polar Rectangular Regions of Integration


When we defined the double integral for a continuous function in rectangular coordinates—say, g over a region R in the xy-plane
—we divided R into subrectangles with sides parallel to the coordinate axes. These sides have either constant x-values and/or
constant y -values. In polar coordinates, the shape we work with is a polar rectangle, whose sides have constant r-values and/or
constant θ -values. This means we can describe a polar rectangle as in Figure 15.3.1a, with R = {(r, θ) | a ≤ r ≤ b, α ≤ θ ≤ β} .

Figure 15.3.1: (a) A polar rectangle R (b) divided into subrectangles \(R_{ij}\) (c) Close-up of a subrectangle.

In this section, we are looking to integrate over polar rectangles. Consider a function f(r, θ) over a polar rectangle R. We divide the interval [a, b] into m subintervals \([r_{i-1}, r_i]\) of length Δr = (b − a)/m and divide the interval [α, β] into n subintervals \([\theta_{j-1}, \theta_j]\) of width Δθ = (β − α)/n. This means that the circles \(r = r_i\) and rays \(\theta = \theta_j\) for 1 ≤ i ≤ m and 1 ≤ j ≤ n divide the polar rectangle R into smaller polar subrectangles \(R_{ij}\) (Figure 15.3.1b).

As before, we need to find the area ΔA of the polar subrectangle \(R_{ij}\) and the “polar” volume of the thin box above \(R_{ij}\). Recall that, in a circle of radius r, the length s of an arc subtended by a central angle of θ radians is s = rθ. Notice that the polar rectangle \(R_{ij}\) looks a lot like a trapezoid with parallel sides \(r_{i-1} \Delta\theta\) and \(r_i \Delta\theta\) and with a width Δr. Hence the area of the polar subrectangle \(R_{ij}\) is

\[ \Delta A = \frac{1}{2} \Delta r \left( r_{i-1} \Delta\theta + r_i \Delta\theta \right). \]

Simplifying and letting

\[ r_{ij}^* = \frac{1}{2} (r_{i-1} + r_i), \]

we have \(\Delta A = r_{ij}^* \,\Delta r\, \Delta\theta\).
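The trapezoid heuristic in fact gives the exact area of a polar subrectangle: the region is the difference of two circular sectors, with area \(\tfrac{1}{2}(r_i^2 - r_{i-1}^2)\,\Delta\theta\), and this factors as \(r^* \Delta r\, \Delta\theta\). The short sketch below, with hypothetical function names and arbitrarily chosen sample radii, compares the two expressions.

```python
import math


def annular_sector_area(r_inner, r_outer, dtheta):
    """Exact area: difference of two circular sectors with central angle dtheta."""
    return 0.5 * dtheta * (r_outer**2 - r_inner**2)


def midradius_area(r_inner, r_outer, dtheta):
    """Area written as r* dr dtheta, with r* the average of the two radii."""
    r_star = 0.5 * (r_inner + r_outer)
    return r_star * (r_outer - r_inner) * dtheta


exact = annular_sector_area(1.0, 1.5, math.pi / 6)
via_midradius = midradius_area(1.0, 1.5, math.pi / 6)
print(exact, via_midradius)  # the two values agree up to rounding
```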
Therefore, the polar volume of the thin box above \(R_{ij}\) (Figure 15.3.2) is

\[ f(r_{ij}^*, \theta_{ij}^*)\,\Delta A = f(r_{ij}^*, \theta_{ij}^*)\, r_{ij}^* \,\Delta r\, \Delta\theta. \]

15.3.1 https://math.libretexts.org/@go/page/4547
Figure 15.3.2: Finding the volume of the thin box above polar rectangle \(R_{ij}\).

Using the same idea for all the subrectangles and summing the volumes of the rectangular boxes, we obtain a double Riemann sum as

\[ \sum_{i=1}^{m} \sum_{j=1}^{n} f(r_{ij}^*, \theta_{ij}^*)\, r_{ij}^* \,\Delta r\, \Delta\theta. \]

As we have seen before, we obtain a better approximation to the polar volume of the solid above the region R when we let m and n become larger. Hence, we define the polar volume as the limit of the double Riemann sum,

\[ V = \lim_{m,n \to \infty} \sum_{i=1}^{m} \sum_{j=1}^{n} f(r_{ij}^*, \theta_{ij}^*)\, r_{ij}^* \,\Delta r\, \Delta\theta. \]
This becomes the expression for the double integral.

 Definition: The double integral in polar coordinates

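The double integral of the function f(r, θ) over the polar rectangular region R in the rθ-plane is defined as

\[
\begin{aligned}
\iint_R f(r, \theta)\,dA &= \lim_{m,n \to \infty} \sum_{i=1}^{m} \sum_{j=1}^{n} f(r_{ij}^*, \theta_{ij}^*)\,\Delta A && (15.3.1) \\
&= \lim_{m,n \to \infty} \sum_{i=1}^{m} \sum_{j=1}^{n} f(r_{ij}^*, \theta_{ij}^*)\, r_{ij}^* \,\Delta r\, \Delta\theta. && (15.3.2)
\end{aligned}
\]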

Again, just as in the section on Double Integrals over Rectangular Regions, the double integral over a polar rectangular region can be expressed as an iterated integral in polar coordinates. Hence,

\[ \iint_R f(r, \theta)\,dA = \iint_R f(r, \theta)\,r\,dr\,d\theta = \int_{\theta=\alpha}^{\theta=\beta} \int_{r=a}^{r=b} f(r, \theta)\,r\,dr\,d\theta. \]
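The double Riemann sum that defines the polar integral translates directly into a pair of nested loops. The following sketch is an illustration under stated assumptions: the function name `polar_riemann_sum` is hypothetical, sample points are taken at the midpoints of each subrectangle, and the test region \(1 \le r \le 3\), \(0 \le \theta \le \pi\) is chosen arbitrarily.

```python
import math


def polar_riemann_sum(f, a, b, alpha, beta, m=100, n=100):
    """Sum f(r*, theta*) r* dr dtheta over an m-by-n grid of polar subrectangles."""
    dr = (b - a) / m
    dtheta = (beta - alpha) / n
    total = 0.0
    for i in range(m):
        r = a + (i + 0.5) * dr  # midpoint radius r*
        for j in range(n):
            theta = alpha + (j + 0.5) * dtheta  # midpoint angle theta*
            total += f(r, theta) * r * dr * dtheta
    return total


# With f = 1 the sum is the area of the half annulus: pi * (3^2 - 1^2) / 2 = 4 pi.
area = polar_riemann_sum(lambda r, t: 1.0, 1.0, 3.0, 0.0, math.pi)
# With f = r^2 (that is, x^2 + y^2) the exact integral is pi * (3^4 - 1^4) / 4 = 20 pi.
volume = polar_riemann_sum(lambda r, t: r * r, 1.0, 3.0, 0.0, math.pi)
print(area, 4 * math.pi)
print(volume, 20 * math.pi)
```

Note the extra factor `r` inside the sum; dropping it would compute an integral over a rectangle in the rθ-plane rather than over the polar region.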

Notice that the expression for dA is replaced by r dr dθ when working in polar coordinates. Another way to look at the polar double integral is to change the double integral in rectangular coordinates by substitution. When the function f is given in terms of x and y, using x = r cos θ, y = r sin θ, and dA = r dr dθ changes it to

\[ \iint_R f(x, y)\,dA = \iint_R f(r \cos\theta, r \sin\theta)\,r\,dr\,d\theta. \]

Note that all the properties listed in section on Double Integrals over Rectangular Regions for the double integral in rectangular
coordinates hold true for the double integral in polar coordinates as well, so we can use them without hesitation.

 Example 15.3.1A: Sketching a Polar Rectangular Region


Sketch the polar rectangular region

R = {(r, θ) | 1 ≤ r ≤ 3, 0 ≤ θ ≤ π}.

Solution
As we can see from Figure 15.3.3, r = 1 and r = 3 are circles of radius 1 and 3 and 0 ≤θ ≤π covers the entire top half of
the plane. Hence the region R looks like a semicircular band.

Figure 15.3.3 : The polar region R lies between two semicircles.


Now that we have sketched a polar rectangular region, let us demonstrate how to evaluate a double integral over this region by
using polar coordinates.

 Example 15.3.1B: Evaluating a Double Integral over a Polar Rectangular Region

Evaluate the integral \(\iint_R 3x\,dA\) over the region R = {(r, θ) | 1 ≤ r ≤ 2, 0 ≤ θ ≤ π}.

Solution
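First we sketch a figure similar to Figure 15.3.3, but with outer radius r = 2. From the figure we can see that we have

\[
\begin{aligned}
\iint_R 3x\,dA &= \int_{\theta=0}^{\theta=\pi} \int_{r=1}^{r=2} 3r \cos\theta\; r\,dr\,d\theta && \text{Use an integral with correct limits of integration.} \\
&= \int_{\theta=0}^{\theta=\pi} \cos\theta \left[ r^3 \right]_{r=1}^{r=2} d\theta && \text{Integrate first with respect to } r. \\
&= \int_{\theta=0}^{\theta=\pi} 7 \cos\theta\,d\theta \\
&= 7 \sin\theta \Big|_{\theta=0}^{\theta=\pi} = 0.
\end{aligned}
\]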

 Exercise 15.3.1

Sketch the region \(D = \{(r, \theta) \mid 1 \le r \le 2,\ -\frac{\pi}{2} \le \theta \le \frac{\pi}{2}\}\), and evaluate \(\iint_D x\,dA\).

Hint
Follow the steps in Example 15.3.1A.

Answer
14/3

 Example 15.3.2A: Evaluating a Double Integral by Converting from Rectangular Coordinates

Evaluate the integral

\[ \iint_R (1 - x^2 - y^2)\,dA \]

where R is the unit circle on the xy-plane.


Solution

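The region R is a unit circle, so we can describe it as R = {(r, θ) | 0 ≤ r ≤ 1, 0 ≤ θ ≤ 2π}.
Using the conversion x = r cos θ, y = r sin θ, and dA = r dr dθ, we have

\[
\begin{aligned}
\iint_R (1 - x^2 - y^2)\,dA &= \int_0^{2\pi} \int_0^1 (1 - r^2)\,r\,dr\,d\theta \\
&= \int_0^{2\pi} \int_0^1 (r - r^3)\,dr\,d\theta \\
&= \int_0^{2\pi} \left[ \frac{r^2}{2} - \frac{r^4}{4} \right]_0^1 d\theta \\
&= \int_0^{2\pi} \frac{1}{4}\,d\theta = \frac{\pi}{2}.
\end{aligned}
\]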
4 2

 Example 15.3.2B: Evaluating a Double Integral by Converting from Rectangular Coordinates

Evaluate the integral

\[ \iint_R (x + y)\,dA \]

where \(R = \{(x, y) \mid 1 \le x^2 + y^2 \le 4,\ x \le 0\}\).

Solution
We can see that R is an annular region that can be converted to polar coordinates and described as \(R = \{(r, \theta) \mid 1 \le r \le 2,\ \frac{\pi}{2} \le \theta \le \frac{3\pi}{2}\}\) (see the following graph).

Figure 15.3.4 : The annular region of integration R .


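Hence, using the conversion x = r cos θ, y = r sin θ, and dA = r dr dθ, we have

\[
\begin{aligned}
\iint_R (x + y)\,dA &= \int_{\theta=\pi/2}^{\theta=3\pi/2} \int_{r=1}^{r=2} (r \cos\theta + r \sin\theta)\,r\,dr\,d\theta \\
&= \left( \int_{r=1}^{r=2} r^2\,dr \right) \left( \int_{\pi/2}^{3\pi/2} (\cos\theta + \sin\theta)\,d\theta \right) \\
&= \left[ \frac{r^3}{3} \right]_1^2 \left[ \sin\theta - \cos\theta \right]_{\pi/2}^{3\pi/2} \\
&= -\frac{14}{3}.
\end{aligned}
\]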

 Exercise 15.3.2
Evaluate the integral

\[ \iint_R (4 - x^2 - y^2)\,dA \]

where R is the circle of radius 2 on the xy-plane.

Hint
Follow the steps in the previous example.

Answer
8π

General Polar Regions of Integration


To evaluate the double integral of a continuous function by iterated integrals over general polar regions, we consider two types of regions, analogous to Type I and Type II as discussed for rectangular coordinates in the section on Double Integrals over General Regions. It is more common to write polar equations as r = f(θ) than θ = f(r), so we describe a general polar region as \(R = \{(r, \theta) \mid \alpha \le \theta \le \beta,\ h_1(\theta) \le r \le h_2(\theta)\}\) (Figure 15.3.5).

Figure 15.3.5: A general polar region between \(\alpha \le \theta \le \beta\) and \(h_1(\theta) \le r \le h_2(\theta)\).

 Theorem: Double Integrals over General Polar Regions

If f(r, θ) is continuous on a general polar region D as described above, then

\[ \iint_D f(r, \theta)\,r\,dr\,d\theta = \int_{\theta=\alpha}^{\theta=\beta} \int_{r=h_1(\theta)}^{r=h_2(\theta)} f(r, \theta)\,r\,dr\,d\theta. \]

 Example 15.3.3: Evaluating a Double Integral over a General Polar Region

Evaluate the integral

\[ \iint_D r^2 \sin\theta\; r\,dr\,d\theta \]

where D is the region bounded by the polar axis and the upper half of the cardioid r = 1 + cos θ .
Solution
We can describe the region D as {(r, θ) | 0 ≤ θ ≤ π, 0 ≤ r ≤ 1 + cos θ} as shown in Figure 15.3.6.

Figure 15.3.6 : The region D is the top half of a cardioid.
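Hence, we have

\[
\begin{aligned}
\iint_D r^2 \sin\theta\; r\,dr\,d\theta &= \int_{\theta=0}^{\theta=\pi} \int_{r=0}^{r=1+\cos\theta} (r^2 \sin\theta)\,r\,dr\,d\theta \\
&= \frac{1}{4} \int_{\theta=0}^{\theta=\pi} \left[ r^4 \right]_{r=0}^{r=1+\cos\theta} \sin\theta\,d\theta \\
&= \frac{1}{4} \int_{\theta=0}^{\theta=\pi} (1 + \cos\theta)^4 \sin\theta\,d\theta \\
&= -\frac{1}{4} \left[ \frac{(1 + \cos\theta)^5}{5} \right]_0^{\pi} = \frac{8}{5}.
\end{aligned}
\]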
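A general polar region can be handled numerically in the same way as a Type I region, with θ as the outer variable and r as the inner one. The sketch below is illustrative; `simpson` and `polar_integral` are hypothetical helpers, and the extra factor r is supplied inside `polar_integral` so that the caller passes only f(r, θ).

```python
import math


def simpson(f, a, b, n=400):
    """Composite Simpson rule on [a, b] using n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def polar_integral(f, alpha, beta, r_lo, r_hi):
    """Integrate f(r, theta) * r over alpha <= theta <= beta, r_lo(theta) <= r <= r_hi(theta)."""
    return simpson(
        lambda t: simpson(lambda r: f(r, t) * r, r_lo(t), r_hi(t)), alpha, beta
    )


# Upper half of the cardioid r = 1 + cos(theta), with f(r, theta) = r^2 sin(theta).
value = polar_integral(
    lambda r, t: r * r * math.sin(t),
    0.0,
    math.pi,
    lambda t: 0.0,
    lambda t: 1.0 + math.cos(t),
)
print(value)  # should be close to 8/5
```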

 Exercise 15.3.3

Evaluate the integral

\[ \iint_D r^2 \sin^2 2\theta\; r\,dr\,d\theta \]

where \(D = \{(r, \theta) \mid 0 \le \theta \le \pi,\ 0 \le r \le 2\sqrt{\cos 2\theta}\}\).

Hint
Graph the region and follow the steps in the previous example.

Answer
π

Polar Areas and Volumes


As in rectangular coordinates, if a solid S is bounded by the surface z = f(r, θ), as well as by the surfaces r = a, r = b, θ = α, and θ = β, we can find the volume V of S by double integration, as

\[ V = \iint_R f(r, \theta)\,r\,dr\,d\theta = \int_{\theta=\alpha}^{\theta=\beta} \int_{r=a}^{r=b} f(r, \theta)\,r\,dr\,d\theta. \]

If the base of the solid can be described as \(D = \{(r, \theta) \mid \alpha \le \theta \le \beta,\ h_1(\theta) \le r \le h_2(\theta)\}\), then the double integral for the volume becomes

\[ V = \iint_D f(r, \theta)\,r\,dr\,d\theta = \int_{\theta=\alpha}^{\theta=\beta} \int_{r=h_1(\theta)}^{r=h_2(\theta)} f(r, \theta)\,r\,dr\,d\theta. \]

We illustrate this idea with some examples.

 Example 15.3.4A: Finding a Volume Using a Double Integral

Find the volume of the solid that lies under the paraboloid \(z = 1 - x^2 - y^2\) and above the unit circle on the xy-plane (Figure 15.3.7).

Figure 15.3.7 : Finding the volume of a solid under a paraboloid and above the unit circle.
Solution
By the method of double integration, we can see that the volume is the iterated integral of the form

\[ \iint_R (1 - x^2 - y^2)\,dA \]

where R = {(r, θ) | 0 ≤ r ≤ 1, 0 ≤ θ ≤ 2π}.

This integration was shown before in Example 15.3.2A, so the volume is \(\frac{\pi}{2}\) cubic units.

 Example 15.3.4B: Finding a Volume Using Double Integration


Find the volume of the solid that lies under the paraboloid \(z = 4 - x^2 - y^2\) and above the disk \((x - 1)^2 + y^2 = 1\) on the xy-plane. See the paraboloid in Figure 15.3.8 intersecting the cylinder \((x - 1)^2 + y^2 = 1\) above the xy-plane.

Figure 15.3.8 : Finding the volume of a solid with a paraboloid cap and a circular base.

Solution
First change the disk \((x - 1)^2 + y^2 = 1\) to polar coordinates. Expanding the square term, we have \(x^2 - 2x + 1 + y^2 = 1\). Then simplify to get \(x^2 + y^2 = 2x\), which in polar coordinates becomes \(r^2 = 2r \cos\theta\) and then either r = 0 or r = 2 cos θ. Similarly, the equation of the paraboloid changes to \(z = 4 - r^2\). Therefore we can describe the disk \((x - 1)^2 + y^2 = 1\) on the xy-plane as the region

\[ D = \{(r, \theta) \mid 0 \le \theta \le \pi,\ 0 \le r \le 2 \cos\theta\}. \]

Hence the volume of the solid bounded above by the paraboloid \(z = 4 - x^2 - y^2\) and below by r = 2 cos θ is

\[
\begin{aligned}
V &= \iint_D f(r, \theta)\,r\,dr\,d\theta \\
&= \int_{\theta=0}^{\theta=\pi} \int_{r=0}^{r=2\cos\theta} (4 - r^2)\,r\,dr\,d\theta \\
&= \int_{\theta=0}^{\theta=\pi} \left[ 4\,\frac{r^2}{2} - \frac{r^4}{4} \right]_0^{2\cos\theta} d\theta \\
&= \int_0^{\pi} \left[ 8 \cos^2\theta - 4 \cos^4\theta \right] d\theta \\
&= \left[ \frac{5}{2}\theta + \frac{5}{2} \sin\theta \cos\theta - \sin\theta \cos^3\theta \right]_0^{\pi} = \frac{5}{2}\pi \text{ units}^3.
\end{aligned}
\]

Notice in the next example that integration is not always easy with polar coordinates. Complexity of integration depends on the
function and also on the region over which we need to perform the integration. If the region has a more natural expression in polar
coordinates or if f has a simpler antiderivative in polar coordinates, then the change to polar coordinates is appropriate; otherwise,
use rectangular coordinates.

 Example 15.3.5A: Finding a Volume Using a Double Integral

Find the volume of the region that lies under the paraboloid \(z = x^2 + y^2\) and above the triangle enclosed by the lines y = x, x = 0, and x + y = 2 in the xy-plane.

Solution
First examine the region over which we need to set up the double integral and the accompanying paraboloid.

Figure 15.3.9 : Finding the volume of a solid under a paraboloid and above a given triangle.
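The region \(D\) is \(\{(x, y) \mid 0 \leq x \leq 1,\ x \leq y \leq 2 - x\}\). Converting the lines \(y = x\), \(x = 0\), and \(x + y = 2\) in the \(xy\)-plane to functions of \(r\) and \(\theta\), we have \(\theta = \pi/4\), \(\theta = \pi/2\), and \(r = 2/(\cos\theta + \sin\theta)\), respectively. Graphing the region on the \(xy\)-plane, we see that it looks like \(D = \{(r, \theta) \mid \pi/4 \leq \theta \leq \pi/2,\ 0 \leq r \leq 2/(\cos\theta + \sin\theta)\}\).
Now converting the equation of the surface gives \(z = x^2 + y^2 = r^2\). Therefore, the volume of the solid is given by the double integral

\[\begin{align*}
V &= \iint_D f(r, \theta)\, r\, dr\, d\theta \\
&= \int_{\theta=\pi/4}^{\theta=\pi/2} \int_{r=0}^{r=2/(\cos\theta+\sin\theta)} r^2\, r\, dr\, d\theta \\
&= \int_{\pi/4}^{\pi/2} \left[ \frac{r^4}{4} \right]_0^{2/(\cos\theta+\sin\theta)} d\theta \\
&= \frac{1}{4} \int_{\pi/4}^{\pi/2} \left( \frac{2}{\cos\theta + \sin\theta} \right)^4 d\theta \\
&= \frac{16}{4} \int_{\pi/4}^{\pi/2} \left( \frac{1}{\cos\theta + \sin\theta} \right)^4 d\theta \\
&= 4 \int_{\pi/4}^{\pi/2} \left( \frac{1}{\cos\theta + \sin\theta} \right)^4 d\theta.
\end{align*}\]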

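As you can see, this integral is very complicated. So, we can instead evaluate this double integral in rectangular coordinates as

\[V = \int_0^1 \int_x^{2-x} (x^2 + y^2)\, dy\, dx.\]

Evaluating gives

\[\begin{align*}
V &= \int_0^1 \int_x^{2-x} (x^2 + y^2)\, dy\, dx \\
&= \int_0^1 \left[ x^2 y + \frac{y^3}{3} \right]_x^{2-x} dx \\
&= \int_0^1 \left( \frac{8}{3} - 4x + 4x^2 - \frac{8x^3}{3} \right) dx \\
&= \left[ \frac{8x}{3} - 2x^2 + \frac{4x^3}{3} - \frac{2x^4}{3} \right]_0^1 \\
&= \frac{4}{3} \text{ units}^3.
\end{align*}\]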
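The polar and rectangular setups describe the same solid, so they must give the same number. The following sketch checks this numerically; the helper `midpoint` and the step count are assumptions made for illustration, and the rectangular integrand is the one-variable expression already obtained in the evaluation above.

```python
import math

def midpoint(g, a, b, n=2000):
    """Midpoint-rule estimate of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

# Polar form: 4 * integral of (cos t + sin t)^(-4) for t from pi/4 to pi/2
polar = 4 * midpoint(lambda t: (math.cos(t) + math.sin(t)) ** -4,
                     math.pi / 4, math.pi / 2)

# Rectangular form after the inner y-integration: 8/3 - 4x + 4x^2 - 8x^3/3 on [0, 1]
rect = midpoint(lambda x: 8 / 3 - 4 * x + 4 * x**2 - 8 * x**3 / 3, 0.0, 1.0)

print(polar, rect, 4 / 3)
```

Both estimates should land close to \(4/3\), which suggests the polar integral is awkward to evaluate by hand rather than different in value.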

To answer the question of how the formulas for the volumes of different standard solids such as a sphere, a cone, or a cylinder are
found, we want to demonstrate an example and find the volume of an arbitrary cone.

 Example 15.3.5B: Finding a Volume Using a Double Integral


Use polar coordinates to find the volume inside the cone \(z = 2 - \sqrt{x^2 + y^2}\) and above the \(xy\)-plane.
Solution
The region D for the integration is the base of the cone, which appears to be a circle on the xy-plane (Figure 15.3.10).

Figure 15.3.10: Finding the volume of a solid inside the cone and above the xy-plane.
We find the equation of the circle by setting \(z = 0\):

\[\begin{align*}
0 &= 2 - \sqrt{x^2 + y^2} \\
2 &= \sqrt{x^2 + y^2} \\
x^2 + y^2 &= 4.
\end{align*}\]

This means the radius of the circle is 2, so for the integration we have \(0 \leq \theta \leq 2\pi\) and \(0 \leq r \leq 2\). Substituting \(x = r\cos\theta\) and \(y = r\sin\theta\) in the equation \(z = 2 - \sqrt{x^2 + y^2}\), we have \(z = 2 - r\). Therefore, the volume of the cone is

\[\int_{\theta=0}^{\theta=2\pi} \int_{r=0}^{r=2} (2 - r)\, r\, dr\, d\theta = 2\pi\, \frac{4}{3} = \frac{8\pi}{3} \text{ cubic units.}\]

Analysis

Note that if we were to find the volume of an arbitrary cone with radius \(a\) units and height \(h\) units, then the equation of the cone would be \(z = h - \frac{h}{a}\sqrt{x^2 + y^2}\).

We can still use Figure 15.3.10 and set up the integral as

\[\int_{\theta=0}^{\theta=2\pi} \int_{r=0}^{r=a} \left( h - \frac{h}{a}\, r \right) r\, dr\, d\theta.\]

Evaluating the integral, we get \(\frac{1}{3}\pi a^2 h\).
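For readers who want the intermediate step, the evaluation of the general cone integral can be written out as follows. This is a worked sketch using only the integrand and limits stated in the Analysis:

\[\begin{align*}
\int_{\theta=0}^{\theta=2\pi} \int_{r=0}^{r=a} \left( h - \frac{h}{a}\, r \right) r\, dr\, d\theta
&= 2\pi \left[ \frac{h r^2}{2} - \frac{h r^3}{3a} \right]_0^a \\
&= 2\pi \left( \frac{h a^2}{2} - \frac{h a^2}{3} \right) \\
&= \frac{1}{3}\pi a^2 h.
\end{align*}\]

Setting \(a = 2\) and \(h = 2\) recovers the value \(\frac{8\pi}{3}\) found in the example.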

 Exercise 15.3.5

Use polar coordinates to find an iterated integral for finding the volume of the solid enclosed by the paraboloids \(z = x^2 + y^2\) and \(z = 16 - x^2 - y^2\).

Hint
Sketching the graphs can help.

Answer
\[V = \int_0^{2\pi} \int_0^{2\sqrt{2}} (16 - 2r^2)\, r\, dr\, d\theta = 64\pi \text{ cubic units.}\]

As with rectangular coordinates, we can also use polar coordinates to find areas of certain regions using a double integral. As
before, we need to understand the region whose area we want to compute. Sketching a graph and identifying the region can be
helpful to realize the limits of integration. Generally, the area formula in double integration will look like

\[\text{Area of } A = \int_{\alpha}^{\beta} \int_{h_1(\theta)}^{h_2(\theta)} 1\, r\, dr\, d\theta.\]

 Example 15.3.6A: Finding an Area Using a Double Integral in Polar Coordinates

Evaluate the area bounded by the curve r = cos 4θ .


Solution
Sketching the graph of the function r = cos 4θ reveals that it is a polar rose with eight petals (see the following figure).

Figure 15.3.11: Finding the area of a polar rose with eight petals.
Using symmetry, we can see that we need to find the area of one petal and then multiply it by 8. Notice that the values of θ for
which the graph passes through the origin are the zeros of the function cos 4θ, and these are odd multiples of π/8. Thus, one of
the petals corresponds to the values of θ in the interval [−π/8, π/8]. Therefore, the area bounded by the curve r = cos 4θ is
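\[\begin{align*}
A &= 8 \int_{\theta=-\pi/8}^{\theta=\pi/8} \int_{r=0}^{r=\cos 4\theta} 1\, r\, dr\, d\theta \\
&= 8 \int_{\theta=-\pi/8}^{\theta=\pi/8} \left[ \frac{1}{2} r^2 \right]_0^{\cos 4\theta} d\theta \\
&= 8 \int_{-\pi/8}^{\pi/8} \frac{1}{2} \cos^2 4\theta\, d\theta \\
&= 8 \left[ \frac{1}{4}\theta + \frac{1}{16} \sin 4\theta \cos 4\theta \right]_{-\pi/8}^{\pi/8} \\
&= 8 \left[ \frac{\pi}{16} \right] = \frac{\pi}{2} \text{ units}^2.
\end{align*}\]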

 Example 15.3.6B: Finding Area Between Two Polar Curves


Find the area enclosed by the circle r = 3 cos θ and the cardioid r = 1 + cos θ .
Solution
First and foremost, sketch the graphs of the region (Figure 15.3.12).

Figure 15.3.12: Finding the area enclosed by both a circle and a cardioid.
We can see from the symmetry of the graph that we need to find the points of intersection. Setting the two equations equal to each other gives

3 cos θ = 1 + cos θ.

One of the points of intersection is θ = π/3 . The area above the polar axis consists of two parts, with one part defined by the
cardioid from θ = 0 to θ = π/3 and the other part defined by the circle from θ = π/3 to θ = π/2 . By symmetry, the total area
is twice the area above the polar axis. Thus, we have
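\[A = 2 \left[ \int_{\theta=0}^{\theta=\pi/3} \int_{r=0}^{r=1+\cos\theta} 1\, r\, dr\, d\theta + \int_{\theta=\pi/3}^{\theta=\pi/2} \int_{r=0}^{r=3\cos\theta} 1\, r\, dr\, d\theta \right].\]

Evaluating each piece separately, we find that the area is

\[A = 2 \left( \frac{1}{4}\pi + \frac{9}{16}\sqrt{3} + \frac{3}{8}\pi - \frac{9}{16}\sqrt{3} \right) = 2 \left( \frac{5}{8}\pi \right) = \frac{5}{4}\pi \text{ square units.}\]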
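The two pieces can be evaluated as follows. This is a worked sketch of the omitted steps, using the half-angle identity \(\cos^2\theta = \frac{1}{2}(1 + \cos 2\theta)\):

\[\begin{align*}
\int_0^{\pi/3} \int_0^{1+\cos\theta} r\, dr\, d\theta
&= \frac{1}{2} \int_0^{\pi/3} \left( 1 + 2\cos\theta + \cos^2\theta \right) d\theta
= \frac{1}{2} \left[ \frac{3}{2}\theta + 2\sin\theta + \frac{1}{4}\sin 2\theta \right]_0^{\pi/3}
= \frac{1}{4}\pi + \frac{9}{16}\sqrt{3}, \\
\int_{\pi/3}^{\pi/2} \int_0^{3\cos\theta} r\, dr\, d\theta
&= \frac{9}{2} \int_{\pi/3}^{\pi/2} \cos^2\theta\, d\theta
= \frac{9}{2} \left[ \frac{1}{2}\theta + \frac{1}{4}\sin 2\theta \right]_{\pi/3}^{\pi/2}
= \frac{3}{8}\pi - \frac{9}{16}\sqrt{3}.
\end{align*}\]

The \(\sqrt{3}\) terms cancel when the pieces are added, which is why the final area is a rational multiple of \(\pi\).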

 Exercise 15.3.6
Find the area enclosed inside the cardioid r = 3 − 3 sin θ and outside the cardioid r = 1 + sin θ .

Hint
Sketch the graph, and solve for the points of intersection.

Answer
\[A = 2 \int_{-\pi/2}^{\pi/6} \int_{1+\sin\theta}^{3-3\sin\theta} r\, dr\, d\theta = \left( 8\pi + 9\sqrt{3} \right) \text{ units}^2\]

 Example 15.3.7: Evaluating an Improper Double Integral in Polar Coordinates


Evaluate the integral
\[\iint_{\mathbb{R}^2} e^{-10(x^2 + y^2)}\, dx\, dy.\]

Solution
This is an improper integral because we are integrating over an unbounded region \(\mathbb{R}^2\). In polar coordinates, the entire plane \(\mathbb{R}^2\) can be seen as \(0 \leq \theta \leq 2\pi,\ 0 \leq r < \infty\).


Using the changes of variables from rectangular coordinates to polar coordinates, we have
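\[\begin{align*}
\iint_{\mathbb{R}^2} e^{-10(x^2 + y^2)}\, dx\, dy
&= \int_{\theta=0}^{\theta=2\pi} \int_{r=0}^{r=\infty} e^{-10r^2}\, r\, dr\, d\theta
= \int_{\theta=0}^{\theta=2\pi} \left( \lim_{a\to\infty} \int_{r=0}^{r=a} e^{-10r^2}\, r\, dr \right) d\theta \\
&= \left( \int_{\theta=0}^{\theta=2\pi} d\theta \right) \left( \lim_{a\to\infty} \int_{r=0}^{r=a} e^{-10r^2}\, r\, dr \right) \\
&= 2\pi \left( \lim_{a\to\infty} \int_{r=0}^{r=a} e^{-10r^2}\, r\, dr \right) \\
&= 2\pi \lim_{a\to\infty} \left( -\frac{1}{20} \right) \left( e^{-10r^2} \Big|_0^a \right) \\
&= 2\pi \left( -\frac{1}{20} \right) \lim_{a\to\infty} \left( e^{-10a^2} - 1 \right) \\
&= \frac{\pi}{10}.
\end{align*}\]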
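The limit in this computation can be watched numerically. The sketch below is illustrative: the function name `truncated_integral` and its parameter `k` (the coefficient in the exponent, 10 in this example) are assumptions. It evaluates the integral over a disk of radius \(a\) using the antiderivative found above and prints the values as \(a\) grows.

```python
import math

def truncated_integral(a, k=10.0):
    """Integral of exp(-k (x^2 + y^2)) over the disk of radius a.

    Uses the polar antiderivative: the r-integral of exp(-k r^2) r from 0 to a
    is (1 - exp(-k a^2)) / (2k), and the theta-integral contributes 2*pi.
    """
    return 2 * math.pi * (1 - math.exp(-k * a * a)) / (2 * k)

for a in (0.25, 0.5, 1.0, 2.0):
    print(a, truncated_integral(a))
print("limit:", math.pi / 10)
```

Because the integrand decays so quickly, the truncated values are already very close to \(\pi/10\) by \(a = 1\).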

 Exercise 15.3.7

Evaluate the integral


\[\iint_{\mathbb{R}^2} e^{-4(x^2 + y^2)}\, dx\, dy.\]

Hint
Convert to the polar coordinate system.

Answer
\(\frac{\pi}{4}\)

Key Concepts
To apply a double integral to a situation with circular symmetry, it is often convenient to use a double integral in polar
coordinates. We can apply these double integrals over a polar rectangular region or a general polar region, using an iterated
integral similar to those used with rectangular double integrals.
The area dA in polar coordinates becomes r dr dθ.
Use x = r cos θ, y = r sin θ , and dA = r dr dθ to convert an integral in rectangular coordinates to an integral in polar
coordinates.
Use \(r^2 = x^2 + y^2\) and \(\theta = \tan^{-1}\left(\frac{y}{x}\right)\) to convert an integral in polar coordinates to an integral in rectangular coordinates, if needed.
To find the volume in polar coordinates bounded above by a surface z = f (r, θ) over a region on the xy-plane, use a double
integral in polar coordinates.

Key Equations
Double integral over a polar rectangular region R
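\[\iint_R f(r, \theta)\, dA = \lim_{m,n\to\infty} \sum_{i=1}^{m} \sum_{j=1}^{n} f(r_{ij}^*, \theta_{ij}^*)\, \Delta A = \lim_{m,n\to\infty} \sum_{i=1}^{m} \sum_{j=1}^{n} f(r_{ij}^*, \theta_{ij}^*)\, r_{ij}^*\, \Delta r\, \Delta\theta\]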

Double integral over a general polar region

\[\iint_D f(r, \theta)\, r\, dr\, d\theta = \int_{\theta=\alpha}^{\theta=\beta} \int_{r=h_1(\theta)}^{r=h_2(\theta)} f(r, \theta)\, r\, dr\, d\theta\]

Glossary
polar rectangle
the region enclosed between the circles r = a and r = b and the angles θ = α and θ = β ; it is described as
R = {(r, θ) | a ≤ r ≤ b, α ≤ θ ≤ β}

15.3: Double Integrals in Polar Coordinates is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
15.3: Double Integrals in Polar Coordinates by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0. Original source:
https://openstax.org/details/books/calculus-volume-1.

15.4: Applications of Double Integrals
 Learning Objectives
Recognize when a function of two variables is integrable over a rectangular region.
Recognize and use some of the properties of double integrals.
Evaluate a double integral over a rectangular region by writing it as an iterated integral.
Use a double integral to calculate the area of a region, volume under a surface, or average value of a function over a plane region.

In this section we investigate double integrals and show how we can use them to find the volume of a solid over a rectangular region in the xy-plane. Many of
the properties of double integrals are similar to those we have already discussed for single integrals.

Volumes and Double Integrals


We begin by considering the space above a rectangular region \(R\). Consider a continuous function \(f(x, y) \geq 0\) of two variables defined on the closed rectangle \(R\):

\[R = [a, b] \times [c, d] = \{(x, y) \in \mathbb{R}^2 \mid a \leq x \leq b,\ c \leq y \leq d\}\]

Here [a, b] × [c, d] denotes the Cartesian product of the two closed intervals [a, b] and [c, d]. It consists of ordered pairs (x, y) such that a ≤ x ≤ b and
c ≤ y ≤ d . The graph of f represents a surface above the xy-plane with equation z = f (x, y) where z is the height of the surface at the point (x, y). Let S be

the solid that lies above R and under the graph of f (Figure 15.4.1). The base of the solid is the rectangle R in the xy-plane. We want to find the volume V of
the solid S .

Figure 15.4.1 : The graph of f (x, y) over the rectangle R in the xy-plane is a curved surface.
We divide the region \(R\) into small rectangles \(R_{ij}\), each with area \(\Delta A\) and with sides \(\Delta x\) and \(\Delta y\) (Figure 15.4.2). We do this by dividing the interval \([a, b]\) into \(m\) subintervals and dividing the interval \([c, d]\) into \(n\) subintervals. Hence \(\Delta x = \frac{b - a}{m}\), \(\Delta y = \frac{d - c}{n}\), and \(\Delta A = \Delta x \Delta y\).

Figure 15.4.2 : Rectangle \(R\) is divided into small rectangles \(R_{ij}\), each with area \(\Delta A\).

The volume of a thin rectangular box above \(R_{ij}\) is \(f(x_{ij}^*, y_{ij}^*)\, \Delta A\), where \((x_{ij}^*, y_{ij}^*)\) is an arbitrary sample point in each \(R_{ij}\) as shown in the following figure, \(f(x_{ij}^*, y_{ij}^*)\) is the height of the corresponding thin rectangular box, and \(\Delta A\) is the area of each rectangle \(R_{ij}\).

15.4.1 https://math.libretexts.org/@go/page/4548
Figure 15.4.3 : A thin rectangular box above \(R_{ij}\) with height \(f(x_{ij}^*, y_{ij}^*)\).

Using the same idea for all the subrectangles, we obtain an approximate volume of the solid S as
\[V \approx \sum_{i=1}^{m} \sum_{j=1}^{n} f(x_{ij}^*, y_{ij}^*)\, \Delta A.\]

This sum is known as a double Riemann sum and can be used to approximate the value of the volume of the solid. Here the double sum means that for each
subrectangle we evaluate the function at the chosen point, multiply by the area of each rectangle, and then add all the results.
As we have seen in the single-variable case, we obtain a better approximation to the actual volume if m and n become larger.
\[V = \lim_{m,n\to\infty} \sum_{i=1}^{m} \sum_{j=1}^{n} f(x_{ij}^*, y_{ij}^*)\, \Delta A\]

or

\[V = \lim_{\Delta x,\, \Delta y \to 0} \sum_{i=1}^{m} \sum_{j=1}^{n} f(x_{ij}^*, y_{ij}^*)\, \Delta A.\]

Note that the sum approaches a limit in either case and the limit is the volume of the solid with the base R . Now we are ready to define the double integral.

 Definition: Double Integral over a Rectangular Region R

The double integral of the function f (x, y) over the rectangular region R in the xy-plane is defined as
\[\iint_R f(x, y)\, dA = \lim_{m,n\to\infty} \sum_{i=1}^{m} \sum_{j=1}^{n} f(x_{ij}^*, y_{ij}^*)\, \Delta A.\]

If f (x, y) ≥ 0, then the volume V of the solid S , which lies above R in the xy-plane and under the graph of f , is the double integral of the function f (x, y)
over the rectangle R . If the function is ever negative, then the double integral can be considered a “signed” volume in a manner similar to the way we defined
net signed area in The Definite Integral.

 Example 15.4.1: Setting up a Double Integral and Approximating It by Double Sums


Consider the function \(z = f(x, y) = 3x^2 - y\) over the rectangular region \(R = [0, 2] \times [0, 2]\) (Figure 15.4.4).
a. Set up a double integral for finding the value of the signed volume of the solid S that lies above R and “under” the graph of f .
b. Divide R into four squares with m = n = 2 , and choose the sample point as the upper right corner point of each square (1,1),(2,1),(1,2), and (2,2)
(Figure 15.4.4) to approximate the signed volume of the solid S that lies above R and “under” the graph of f .
c. Divide R into four squares with m = n = 2 , and choose the sample point as the midpoint of each square: (1/2, 1/2), (3/2, 1/2), (1/2,3/2), and (3/2, 3/2)
to approximate the signed volume.

Figure 15.4.4 : The function z = f (x, y) graphed over the rectangular region R = [0, 2] × [0, 2] .
Solution
a. As we can see, the function \(z = f(x, y) = 3x^2 - y\) is above the plane. To find the signed volume of \(S\), we need to divide the region \(R\) into small rectangles \(R_{ij}\), each with area \(\Delta A\) and with sides \(\Delta x\) and \(\Delta y\), and choose \((x_{ij}^*, y_{ij}^*)\) as sample points in each \(R_{ij}\). Hence, a double integral is set up as

\[V = \iint_R (3x^2 - y)\, dA = \lim_{m,n\to\infty} \sum_{i=1}^{m} \sum_{j=1}^{n} \left[ 3(x_{ij}^*)^2 - y_{ij}^* \right] \Delta A.\]

b. Approximating the signed volume using a Riemann sum with m = n = 2 we have ΔA = ΔxΔy = 1 × 1 = 1 . Also, the sample points are (1, 1), (2,
1), (1, 2), and (2, 2) as shown in the following figure.

Figure 15.4.5 : Subrectangles for the rectangular region R = [0, 2] × [0, 2] .


Hence,
\[\begin{align*}
V &\approx \sum_{i=1}^{2} \sum_{j=1}^{2} f(x_{ij}^*, y_{ij}^*)\, \Delta A \\
&= \sum_{i=1}^{2} \left( f(x_{i1}^*, y_{i1}^*) + f(x_{i2}^*, y_{i2}^*) \right) \Delta A \\
&= f(x_{11}^*, y_{11}^*)\, \Delta A + f(x_{21}^*, y_{21}^*)\, \Delta A + f(x_{12}^*, y_{12}^*)\, \Delta A + f(x_{22}^*, y_{22}^*)\, \Delta A \\
&= f(1, 1)(1) + f(2, 1)(1) + f(1, 2)(1) + f(2, 2)(1) \\
&= (3 - 1)(1) + (12 - 1)(1) + (3 - 2)(1) + (12 - 2)(1) \\
&= 2 + 11 + 1 + 10 = 24.
\end{align*}\]

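c. Approximating the signed volume using a Riemann sum with \(m = n = 2\), we have \(\Delta A = \Delta x \Delta y = 1 \times 1 = 1\). In this case the sample points are (1/2, 1/2), (3/2, 1/2), (1/2, 3/2), and (3/2, 3/2).
Hence,

\[\begin{align*}
V &\approx \sum_{i=1}^{2} \sum_{j=1}^{2} f(x_{ij}^*, y_{ij}^*)\, \Delta A \\
&= f(x_{11}^*, y_{11}^*)\, \Delta A + f(x_{21}^*, y_{21}^*)\, \Delta A + f(x_{12}^*, y_{12}^*)\, \Delta A + f(x_{22}^*, y_{22}^*)\, \Delta A \\
&= f(1/2, 1/2)(1) + f(3/2, 1/2)(1) + f(1/2, 3/2)(1) + f(3/2, 3/2)(1) \\
&= \left( \frac{3}{4} - \frac{1}{2} \right)(1) + \left( \frac{27}{4} - \frac{1}{2} \right)(1) + \left( \frac{3}{4} - \frac{3}{2} \right)(1) + \left( \frac{27}{4} - \frac{3}{2} \right)(1) \\
&= \frac{1}{4} + \frac{25}{4} + \left( -\frac{3}{4} \right) + \frac{21}{4} = \frac{44}{4} = 11.
\end{align*}\]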

Analysis
Notice that the approximate answers differ due to the choices of the sample points. In either case, we are introducing some error because we are using only
a few sample points. Thus, we need to investigate how we can achieve an accurate answer.
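One way to see how the choice of sample points and the number of subrectangles affect the approximation is to compute the double Riemann sum directly. The following is a minimal sketch; the function name `riemann_sum` and the offset parameters `sx`, `sy` (which place the sample point inside each subrectangle) are assumptions introduced for illustration.

```python
def riemann_sum(f, a, b, c, d, m, n, sx=0.5, sy=0.5):
    """Double Riemann sum of f over [a, b] x [c, d] with m x n equal subrectangles.

    (sx, sy) in [0, 1] x [0, 1] chooses the sample point in each subrectangle:
    (0.5, 0.5) is the midpoint, (1, 1) the upper right corner, (0, 1) the upper left.
    """
    dx, dy = (b - a) / m, (d - c) / n
    total = 0.0
    for i in range(m):
        for j in range(n):
            total += f(a + (i + sx) * dx, c + (j + sy) * dy)
    return total * dx * dy

f = lambda x, y: 3 * x**2 - y

print(riemann_sum(f, 0, 2, 0, 2, 2, 2, 1.0, 1.0))      # upper right corners, as in part b
print(riemann_sum(f, 0, 2, 0, 2, 2, 2))                # midpoints, as in part c
print(riemann_sum(f, 0, 2, 0, 2, 200, 200, 1.0, 1.0))  # finer grid, upper right corners
print(riemann_sum(f, 0, 2, 0, 2, 200, 200))            # finer grid, midpoints
```

With only four subrectangles the two sampling rules disagree noticeably, while on the finer grid both estimates settle near 12, the value obtained by evaluating \(\int_0^2 \int_0^2 (3x^2 - y)\, dy\, dx\) exactly.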

 Exercise 15.4.1

Use the same function \(z = f(x, y) = 3x^2 - y\) over the rectangular region \(R = [0, 2] \times [0, 2]\).
Divide \(R\) into the same four squares with \(m = n = 2\), and choose the sample points as the upper left corner point of each square (0,1), (1,1), (0,2), and (1,2) (Figure 15.4.5) to approximate the signed volume of the solid \(S\) that lies above \(R\) and “under” the graph of \(f\).

Hint
Follow the steps of the previous example.

Answer
\[V \approx \sum_{i=1}^{2} \sum_{j=1}^{2} f(x_{ij}^*, y_{ij}^*)\, \Delta A = 0\]

Note that we developed the concept of double integral using a rectangular region R . This concept can be extended to any general region. However, when a
region is not rectangular, the subrectangles may not all fit perfectly into R , particularly if the base area is curved. We examine this situation in more detail in the
next section, where we study regions that are not always rectangular and subrectangles may not fit perfectly in the region R . Also, the heights may not be exact
if the surface z = f (x, y) is curved. However, the errors on the sides and the height where the pieces may not fit perfectly within the solid S approach 0 as m
and n approach infinity. Also, the double integral of the function z = f (x, y) exists provided that the function f is not too discontinuous. If the function is
bounded and continuous over R except on a finite number of smooth curves, then the double integral exists and we say that f is integrable over R .
Since ΔA = ΔxΔy = ΔyΔx , we can express dA as dx dy or dy dx. This means that, when we are using rectangular coordinates, the double integral over a
region R denoted by

\[\iint_R f(x, y)\, dA\]

can be written as

\[\iint_R f(x, y)\, dx\, dy\]

or

\[\iint_R f(x, y)\, dy\, dx.\]

Now let’s list some of the properties that can be helpful to compute double integrals.

Properties of Double Integrals


The properties of double integrals are very helpful when computing them or otherwise working with them. We list here six properties of double integrals.
Properties 1 and 2 are referred to as the linearity of the integral, property 3 is the additivity of the integral, property 4 is the monotonicity of the integral, and
property 5 is used to find the bounds of the integral. Property 6 is used if f (x, y) is a product of two functions g(x) and h(y).

 Theorem: Properties of Double Integrals

Assume that the functions f (x, y) and g(x, y) are integrable over the rectangular region R ; S and T are subregions of R ; and assume that m and M are
real numbers.
i. The sum \(f(x, y) + g(x, y)\) is integrable and

\[\iint_R [f(x, y) + g(x, y)]\, dA = \iint_R f(x, y)\, dA + \iint_R g(x, y)\, dA.\]

ii. If \(c\) is a constant, then \(cf(x, y)\) is integrable and

\[\iint_R cf(x, y)\, dA = c \iint_R f(x, y)\, dA.\]

iii. If \(R = S \cup T\) and \(S \cap T = \emptyset\) except an overlap on the boundaries, then

\[\iint_R f(x, y)\, dA = \iint_S f(x, y)\, dA + \iint_T f(x, y)\, dA.\]

iv. If \(f(x, y) \geq g(x, y)\) for \((x, y)\) in \(R\), then

\[\iint_R f(x, y)\, dA \geq \iint_R g(x, y)\, dA.\]

v. If \(m \leq f(x, y) \leq M\) and \(A(R) =\) the area of \(R\), then

\[m \cdot A(R) \leq \iint_R f(x, y)\, dA \leq M \cdot A(R).\]

vi. In the case where \(f(x, y)\) can be factored as a product of a function \(g(x)\) of \(x\) only and a function \(h(y)\) of \(y\) only, then over the region \(R = \{(x, y) \mid a \leq x \leq b,\ c \leq y \leq d\}\), the double integral can be written as

\[\iint_R f(x, y)\, dA = \left( \int_a^b g(x)\, dx \right) \left( \int_c^d h(y)\, dy \right).\]

These properties are used in the evaluation of double integrals, as we will see later. We will become skilled in using these properties once we become familiar
with the computational tools of double integrals. So let’s get to that now.

Iterated Integrals
So far, we have seen how to set up a double integral and how to obtain an approximate value for it. We can also imagine that evaluating double integrals by
using the definition can be a very lengthy process if we choose larger values for m and n. Therefore, we need a practical and convenient technique for
computing double integrals. In other words, we need to learn how to compute double integrals without employing the definition that uses limits and double
sums.
The basic idea is that the evaluation becomes easier if we can break a double integral into single integrals by integrating first with respect to one variable and
then with respect to the other. The key tool we need is called an iterated integral.

 Definitions: Iterated Integrals


Assume \(a\), \(b\), \(c\), and \(d\) are real numbers. We define an iterated integral for a function \(f(x, y)\) over the rectangular region \(R = [a, b] \times [c, d]\) as

\[\int_a^b \int_c^d f(x, y)\, dy\, dx = \int_a^b \left[ \int_c^d f(x, y)\, dy \right] dx\]

or

\[\int_c^d \int_a^b f(x, y)\, dx\, dy = \int_c^d \left[ \int_a^b f(x, y)\, dx \right] dy.\]

The notation \(\int_a^b \left[ \int_c^d f(x, y)\, dy \right] dx\) means that we integrate \(f(x, y)\) with respect to \(y\) while holding \(x\) constant. Similarly, the notation \(\int_c^d \left[ \int_a^b f(x, y)\, dx \right] dy\)

means that we integrate f (x, y) with respect to x while holding y constant. The fact that double integrals can be split into iterated integrals is expressed in
Fubini’s theorem. Think of this theorem as an essential tool for evaluating double integrals.

 Theorem: Fubini's Theorem

Suppose that \(f(x, y)\) is a function of two variables that is continuous over a rectangular region \(R = \{(x, y) \in \mathbb{R}^2 \mid a \leq x \leq b,\ c \leq y \leq d\}\). Then we see from Figure 15.4.6 that the double integral of \(f\) over the region equals an iterated integral,

\[\iint_R f(x, y)\, dA = \iint_R f(x, y)\, dx\, dy = \int_a^b \int_c^d f(x, y)\, dy\, dx = \int_c^d \int_a^b f(x, y)\, dx\, dy.\]

More generally, Fubini’s theorem is true if f is bounded on R and f is discontinuous only on a finite number of continuous curves. In other words, f has to
be integrable over R .

Figure 15.4.6 : (a) Integrating first with respect to y and then with respect to x to find the area A(x) and then the volume V ; (b) integrating first with respect to
x and then with respect to y to find the area A(y) and then the volume V .

 Example 15.4.2: Using Fubini’s Theorem

Use Fubini’s theorem to compute the double integral \(\iint_R f(x, y)\, dA\) where \(f(x, y) = x\) and \(R = [0, 2] \times [0, 1]\).

Solution
Fubini’s theorem offers an easier way to evaluate the double integral by the use of an iterated integral. Note how the boundary values of the region R

become the upper and lower limits of integration.

\[\begin{align*}
\iint_R f(x, y)\, dA &= \iint_R f(x, y)\, dx\, dy \\
&= \int_{y=0}^{y=1} \int_{x=0}^{x=2} x\, dx\, dy \\
&= \int_{y=0}^{y=1} \left[ \frac{x^2}{2} \Big|_{x=0}^{x=2} \right] dy \\
&= \int_{y=0}^{y=1} 2\, dy = 2y \Big|_{y=0}^{y=1} = 2
\end{align*}\]

The double integration in this example is simple enough to use Fubini’s theorem directly, allowing us to convert a double integral into an iterated integral.
Consequently, we are now ready to convert all double integrals to iterated integrals and demonstrate how the properties listed earlier can help us evaluate
double integrals when the function f (x, y) is more complex. Note that the order of integration can be changed (see Example 15.4.7).

 Example 15.4.3: Illustrating Properties i and ii


Evaluate the double integral

\[\iint_R (xy - 3xy^2)\, dA, \text{ where } R = \{(x, y) \mid 0 \leq x \leq 2,\ 1 \leq y \leq 2\}.\]

Solution
This function has two pieces: one piece is \(xy\) and the other is \(3xy^2\). Also, the second piece has a constant 3. Notice how we use properties i and ii to help evaluate the double integral.

\[\begin{align*}
\iint_R (xy - 3xy^2)\, dA &= \iint_R xy\, dA + \iint_R (-3xy^2)\, dA & & \text{Property i: Integral of a sum is the sum of the integrals.} \\
&= \int_{y=1}^{y=2} \int_{x=0}^{x=2} xy\, dx\, dy - \int_{y=1}^{y=2} \int_{x=0}^{x=2} 3xy^2\, dx\, dy & & \text{Convert double integrals to iterated integrals.} \\
&= \int_{y=1}^{y=2} \left( \frac{x^2}{2} y \right) \Big|_{x=0}^{x=2} dy - 3 \int_{y=1}^{y=2} \left( \frac{x^2}{2} y^2 \right) \Big|_{x=0}^{x=2} dy & & \text{Integrate with respect to } x \text{, holding } y \text{ constant.} \\
&= \int_{y=1}^{y=2} 2y\, dy - \int_{y=1}^{y=2} 6y^2\, dy & & \text{Property ii: Placing the constant before the integral.} \\
&= 2 \int_1^2 y\, dy - 6 \int_1^2 y^2\, dy & & \text{Integrate with respect to } y. \\
&= 2\, \frac{y^2}{2} \Big|_1^2 - 6\, \frac{y^3}{3} \Big|_1^2 \\
&= y^2 \Big|_1^2 - 2y^3 \Big|_1^2 \\
&= (4 - 1) - 2(8 - 1) = 3 - 2(7) = 3 - 14 = -11.
\end{align*}\]

 Example 15.4.4: Illustrating Property v.

Over the region \(R = \{(x, y) \mid 1 \leq x \leq 3,\ 1 \leq y \leq 2\}\), we have \(2 \leq x^2 + y^2 \leq 13\). Find a lower and an upper bound for the integral \(\iint_R (x^2 + y^2)\, dA\).

Solution
For a lower bound, integrate the constant function 2 over the region R . For an upper bound, integrate the constant function 13 over the region R .
\[\begin{align*}
\int_1^2 \int_1^3 2\, dx\, dy &= \int_1^2 \left[ 2x \Big|_1^3 \right] dy = \int_1^2 2(2)\, dy = 4y \Big|_1^2 = 4(2 - 1) = 4 \\
\int_1^2 \int_1^3 13\, dx\, dy &= \int_1^2 \left[ 13x \Big|_1^3 \right] dy = \int_1^2 13(2)\, dy = 26y \Big|_1^2 = 26(2 - 1) = 26.
\end{align*}\]

Hence, we obtain \(4 \leq \iint_R (x^2 + y^2)\, dA \leq 26\).

 Example 15.4.5: Illustrating Property vi

Evaluate the integral \(\iint_R e^y \cos x\, dA\) over the region \(R = \{(x, y) \mid 0 \leq x \leq \frac{\pi}{2},\ 0 \leq y \leq 1\}\).

Solution

This is a great example for property vi because the function \(f(x, y)\) is clearly the product of two single-variable functions \(e^y\) and \(\cos x\). Thus we can split the integral into two parts and then integrate each one as a single-variable integration problem.

\[\begin{align*}
\iint_R e^y \cos x\, dA &= \int_0^1 \int_0^{\pi/2} e^y \cos x\, dx\, dy \\
&= \left( \int_0^1 e^y\, dy \right) \left( \int_0^{\pi/2} \cos x\, dx \right) \\
&= \left( e^y \Big|_0^1 \right) \left( \sin x \Big|_0^{\pi/2} \right) \\
&= e - 1.
\end{align*}\]

 Exercise 15.4.2

a. Use the properties of the double integral and Fubini’s theorem to evaluate the integral
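\[\int_0^1 \int_{-1}^3 (3 - x + 4y)\, dy\, dx.\]

b. Show that \(0 \leq \iint_R \sin \pi x \cos \pi y\, dA \leq \frac{1}{32}\) where \(R = \left( 0, \frac{1}{4} \right) \times \left( \frac{1}{4}, \frac{1}{2} \right)\).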

Hint
Use properties i. and ii. and evaluate the iterated integral, and then use property v.

Answer
a. 26
b. Answers may vary.

As we mentioned before, when we are using rectangular coordinates, the double integral over a region \(R\) denoted by \(\iint_R f(x, y)\, dA\) can be written as \(\iint_R f(x, y)\, dx\, dy\) or \(\iint_R f(x, y)\, dy\, dx\). The next example shows that the results are the same regardless of which order of integration we choose.

 Example 15.4.6: Evaluating an Iterated Integral in Two Ways

Let’s return to the function \(f(x, y) = 3x^2 - y\) from Example 15.4.1, this time over the rectangular region \(R = [0, 2] \times [0, 3]\). Use Fubini’s theorem to evaluate \(\iint_R f(x, y)\, dA\) in two different ways:

a. First integrate with respect to y and then with respect to x;


b. First integrate with respect to x and then with respect to y .
Solution
Figure 15.4.6 shows how the calculation works in two different ways.
a. First integrate with respect to y and then integrate with respect to x:
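\[\begin{align*}
\iint_R f(x, y)\, dA &= \int_{x=0}^{x=2} \int_{y=0}^{y=3} (3x^2 - y)\, dy\, dx \\
&= \int_{x=0}^{x=2} \left( \int_{y=0}^{y=3} (3x^2 - y)\, dy \right) dx = \int_{x=0}^{x=2} \left[ 3x^2 y - \frac{y^2}{2} \Big|_{y=0}^{y=3} \right] dx \\
&= \int_{x=0}^{x=2} \left( 9x^2 - \frac{9}{2} \right) dx = 3x^3 - \frac{9}{2}x \Big|_{x=0}^{x=2} = 15.
\end{align*}\]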

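b. First integrate with respect to \(x\) and then integrate with respect to \(y\):

\[\begin{align*}
\iint_R f(x, y)\, dA &= \int_{y=0}^{y=3} \int_{x=0}^{x=2} (3x^2 - y)\, dx\, dy \\
&= \int_{y=0}^{y=3} \left( \int_{x=0}^{x=2} (3x^2 - y)\, dx \right) dy \\
&= \int_{y=0}^{y=3} \left[ x^3 - xy \Big|_{x=0}^{x=2} \right] dy \\
&= \int_{y=0}^{y=3} (8 - 2y)\, dy = 8y - y^2 \Big|_{y=0}^{y=3} = 15.
\end{align*}\]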
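Fubini's theorem can also be illustrated numerically by nesting a one-variable rule in both orders. The sketch below is only an illustration: the helper `midpoint` and the variable names `dy_dx` and `dx_dy` are assumptions, and a midpoint rule stands in for the exact antiderivatives used above.

```python
def midpoint(g, a, b, n=200):
    """Midpoint-rule estimate of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

f = lambda x, y: 3 * x**2 - y

# Order dy dx: for each fixed x, integrate over y in [0, 3]; then integrate over x in [0, 2]
dy_dx = midpoint(lambda x: midpoint(lambda y: f(x, y), 0, 3), 0, 2)

# Order dx dy: for each fixed y, integrate over x in [0, 2]; then integrate over y in [0, 3]
dx_dy = midpoint(lambda y: midpoint(lambda x: f(x, y), 0, 2), 0, 3)

print(dy_dx, dx_dy)
```

Both orders should print values very close to 15, differing from each other only by floating-point rounding.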

Analysis

With either order of integration, the double integral gives us an answer of 15. We might wish to interpret this answer as a volume in cubic units of the solid \(S\) below the function \(f(x, y) = 3x^2 - y\) over the region \(R = [0, 2] \times [0, 3]\). However, remember that the interpretation of a double integral as a (non-signed) volume works only when the integrand \(f\) is a nonnegative function over the base region \(R\).

 Exercise 15.4.3

Evaluate
\[\int_{y=-3}^{y=2} \int_{x=3}^{x=5} (2 - 3x^2 + y^2)\, dx\, dy.\]

Hint
Use Fubini’s theorem.

Answer
\(-\frac{1340}{3}\)

In the next example we see that it can actually be beneficial to switch the order of integration to make the computation easier. We will come back to this idea
several times in this chapter.

 Example 15.4.7: Switching the Order of Integration

Consider the double integral \(\iint_R x \sin(xy)\, dA\) over the region \(R = \{(x, y) \mid 0 \leq x \leq \pi,\ 1 \leq y \leq 2\}\) (Figure 15.4.7).

a. Express the double integral in two different ways.


b. Analyze whether evaluating the double integral in one way is easier than the other and why.
c. Evaluate the integral.

Figure 15.4.7 : The function z = f (x, y) = x sin(xy) over the rectangular region R = [0, π] × [1, 2].
a. We can express \(\iint_R x \sin(xy)\, dA\) in the following two ways: first by integrating with respect to \(y\) and then with respect to \(x\); second by integrating with respect to \(x\) and then with respect to \(y\).

\[\iint_R x \sin(xy)\, dA = \int_{x=0}^{x=\pi} \int_{y=1}^{y=2} x \sin(xy)\, dy\, dx \qquad \text{Integrate first with respect to } y.\]

\[= \int_{y=1}^{y=2} \int_{x=0}^{x=\pi} x \sin(xy)\, dx\, dy \qquad \text{Integrate first with respect to } x.\]

b. If we want to integrate with respect to \(y\) first and then integrate with respect to \(x\), we see that we can use the substitution \(u = xy\), which gives \(du = x\, dy\). Hence the inner integral is simply \(\int \sin u\, du\) and we can change the limits to be functions of \(x\),

\[\iint_R x \sin(xy)\, dA = \int_{x=0}^{x=\pi} \int_{y=1}^{y=2} x \sin(xy)\, dy\, dx = \int_{x=0}^{x=\pi} \left[ \int_{u=x}^{u=2x} \sin(u)\, du \right] dx.\]

However, integrating with respect to \(x\) first and then integrating with respect to \(y\) requires integration by parts for the inner integral, with \(u = x\) and \(dv = \sin(xy)\, dx\). Then \(du = dx\) and \(v = -\frac{\cos(xy)}{y}\), so

\[\iint_R x \sin(xy)\, dA = \int_{y=1}^{y=2} \int_{x=0}^{x=\pi} x \sin(xy)\, dx\, dy = \int_{y=1}^{y=2} \left[ -\frac{x \cos(xy)}{y} \Big|_{x=0}^{x=\pi} + \frac{1}{y} \int_{x=0}^{x=\pi} \cos(xy)\, dx \right] dy.\]

Since the evaluation is getting complicated, we will only do the computation that is easier to do, which is clearly the first method.
c. Evaluate the double integral using the easier way.
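\[\begin{align*}
\iint_R x \sin(xy)\, dA &= \int_{x=0}^{x=\pi} \int_{y=1}^{y=2} x \sin(xy)\, dy\, dx \\
&= \int_{x=0}^{x=\pi} \left[ \int_{u=x}^{u=2x} \sin(u)\, du \right] dx = \int_{x=0}^{x=\pi} \left[ -\cos u \Big|_{u=x}^{u=2x} \right] dx \\
&= \int_{x=0}^{x=\pi} (-\cos 2x + \cos x)\, dx \\
&= \left( -\frac{1}{2} \sin 2x + \sin x \right) \Big|_{x=0}^{x=\pi} = 0.
\end{align*}\]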
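For completeness, the harder order of integration from part b can also be carried through. The following sketch is not part of the original solution; it continues from the integration-by-parts expression in part b and confirms that the value is the same:

\[\begin{align*}
\int_{y=1}^{y=2} \left[ -\frac{x \cos(xy)}{y} \Big|_{x=0}^{x=\pi} + \frac{1}{y} \int_{x=0}^{x=\pi} \cos(xy)\, dx \right] dy
&= \int_1^2 \left( -\frac{\pi \cos(\pi y)}{y} + \frac{\sin(\pi y)}{y^2} \right) dy \\
&= \int_1^2 \frac{d}{dy} \left( -\frac{\sin(\pi y)}{y} \right) dy \\
&= -\frac{\sin(2\pi)}{2} + \frac{\sin(\pi)}{1} = 0.
\end{align*}\]

The integrand in \(y\) happens to be an exact derivative here, but recognizing that takes more effort than the substitution \(u = xy\) used in the first method.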

 Exercise 15.4.4

Evaluate the integral \(\iint_R x e^{xy}\, dA\) where \(R = [0, 1] \times [0, \ln 5]\).

Hint
Integrate with respect to \(y\) first.

Answer
\(\frac{4 - \ln 5}{\ln 5}\)

Applications of Double Integrals


Double integrals are very useful for finding the area of a region bounded by curves of functions. We describe this situation in more detail in the next section.
However, if the region is a rectangular shape, we can find its area by integrating the constant function f (x, y) = 1 over the region R .

 Definition: Area of a Region R

The area of the region R is given by

\[A(R) = \iint_R 1\, dA.\]

This definition makes sense because using f (x, y) = 1 and evaluating the integral gives the product of length and width. Let’s check this formula with an
example and see how this works.

 Example 15.4.8: Finding Area Using a Double Integral

Find the area of the region R = { (x, y) | 0 ≤ x ≤ 3, 0 ≤ y ≤ 2} by using a double integral, that is, by integrating 1 over the region R .
Solution
The region is rectangular with length 3 and width 2, so we know that the area is 6. We get the same answer when we use a double integral:
\[A(R) = \int_0^2 \int_0^3 1\, dx\, dy = \int_0^2 \left[ x \Big|_0^3 \right] dy = \int_0^2 3\, dy = 3 \int_0^2 dy = 3y \Big|_0^2 = 3(2) = 6 \text{ units}^2.\]

We have already seen how double integrals can be used to find the volume of a solid bounded above by a function \(f(x, y)\) over a region \(R\), provided \(f(x, y) \geq 0\) for all \((x, y)\) in \(R\). Here is another example to illustrate this concept.

 Example 15.4.9: Volume of an Elliptic Paraboloid

Find the volume \(V\) of the solid \(S\) that is bounded by the elliptic paraboloid \(2x^2 + y^2 + z = 27\), the planes \(x = 3\) and \(y = 3\), and the three coordinate planes.
Solution
First notice the graph of the surface \(z = 27 - 2x^2 - y^2\) in Figure 15.4.8(a) and above the square region \(R_1 = [-3, 3] \times [-3, 3]\). However, we need the volume of the solid bounded by the elliptic paraboloid \(2x^2 + y^2 + z = 27\), the planes \(x = 3\) and \(y = 3\), and the three coordinate planes.

Figure 15.4.8 : (a) The surface \(z = 27 - 2x^2 - y^2\) above the square region \(R_1 = [-3, 3] \times [-3, 3]\). (b) The solid \(S\) lies under the surface \(z = 27 - 2x^2 - y^2\) above the square region \(R_2 = [0, 3] \times [0, 3]\).
Now let’s look at the graph of the surface in Figure 15.4.8(b). We determine the volume \(V\) by evaluating the double integral over \(R_2\):

\[\begin{align*}
V &= \iint_{R_2} z\, dA = \iint_{R_2} (27 - 2x^2 - y^2)\, dA \\
&= \int_{y=0}^{y=3} \int_{x=0}^{x=3} (27 - 2x^2 - y^2)\, dx\, dy & & \text{Convert to iterated integral.} \\
&= \int_{y=0}^{y=3} \left[ 27x - \frac{2}{3}x^3 - y^2 x \right] \Big|_{x=0}^{x=3} dy & & \text{Integrate with respect to } x. \\
&= \int_{y=0}^{y=3} (63 - 3y^2)\, dy = 63y - y^3 \Big|_{y=0}^{y=3} = 162.
\end{align*}\]
y=0 y=0

 Exercise 15.4.5

Find the volume of the solid bounded above by the graph of \(f(x, y) = xy \sin(x^2 y)\) and below by the \(xy\)-plane on the rectangular region \(R = [0, 1] \times [0, \pi]\).

Hint
Graph the function, set up the integral, and use an iterated integral.

Answer
\(\frac{\pi}{2}\)

Recall that we defined the average value of a function of one variable on an interval [a, b] as
\[f_{ave} = \frac{1}{b - a} \int_a^b f(x)\, dx.\]

Similarly, we can define the average value of a function of two variables over a region R . The main difference is that we divide by an area instead of the width
of an interval.

 Definition: Average Value of a Function


The average value of a function of two variables over a region \(R\) is

\[f_{ave} = \frac{1}{\text{Area of } R} \iint_R f(x, y) \, dx \, dy.\]

In the next example we find the average value of a function over a rectangular region. This is a good example of obtaining useful information for an integration
by making individual measurements over a grid, instead of trying to find an algebraic expression for a function.

 Example 15.4.10: Calculating Average Storm Rainfall

The weather map in Figure 15.4.9 shows an unusually moist storm system associated with the remnants of Hurricane Karl, which dumped 4–8 inches
(100–200 mm) of rain in some parts of the Midwest on September 22–23, 2010. The area of rainfall measured 300 miles east to west and 250 miles north to
south. Estimate the average rainfall over the entire area in those two days.

Figure 15.4.9 : Effects of Hurricane Karl, which dumped 4–8 inches (100–200 mm) of rain in some parts of southwest Wisconsin, southern Minnesota, and
southeast South Dakota over a span of 300 miles east to west and 250 miles north to south.
Solution
Place the origin at the southwest corner of the map so that all the values can be considered as being in the first quadrant and hence all are positive. Now divide the entire map into six rectangles (\(m = 2\) and \(n = 3\)), as shown in Figure 15.4.9. Assume \(f(x, y)\) denotes the storm rainfall in inches at a point approximately \(x\) miles to the east of the origin and \(y\) miles to the north of the origin. Let \(R\) represent the entire area of \(250 \times 300 = 75000\) square miles. Then the area of each subrectangle is

\[\Delta A = \frac{1}{6}(75000) = 12500.\]

Assume \((x_{ij}^*, y_{ij}^*)\) are approximately the midpoints of each subrectangle \(R_{ij}\). Note the color-coded region at each of these points, and estimate the rainfall. The rainfall at each of these points can be estimated as:
At \((x_{11}^*, y_{11}^*)\), the rainfall is 0.08.
At \((x_{12}^*, y_{12}^*)\), the rainfall is 0.08.
At \((x_{13}^*, y_{13}^*)\), the rainfall is 0.01.
At \((x_{21}^*, y_{21}^*)\), the rainfall is 1.70.
At \((x_{22}^*, y_{22}^*)\), the rainfall is 1.74.
At \((x_{23}^*, y_{23}^*)\), the rainfall is 3.00.

Figure 15.4.10: Storm rainfall with rectangular axes and showing the midpoints of each subrectangle.
According to our definition, the average storm rainfall in the entire area during those two days was

\[\begin{align*}
f_{ave} &= \frac{1}{\text{Area } R} \iint_R f(x, y) \, dx \, dy = \frac{1}{75000} \iint_R f(x, y) \, dx \, dy \\
&\approx \frac{1}{75000} \sum_{i=1}^{2} \sum_{j=1}^{3} f(x_{ij}^*, y_{ij}^*) \, \Delta A \\
&= \frac{1}{75000} \left[ f(x_{11}^*, y_{11}^*) + f(x_{12}^*, y_{12}^*) + f(x_{13}^*, y_{13}^*) + f(x_{21}^*, y_{21}^*) + f(x_{22}^*, y_{22}^*) + f(x_{23}^*, y_{23}^*) \right] \Delta A \\
&\approx \frac{1}{75000} [0.08 + 0.08 + 0.01 + 1.70 + 1.74 + 3.00] \, \Delta A \\
&= \frac{1}{75000} [0.08 + 0.08 + 0.01 + 1.70 + 1.74 + 3.00] \, 12500 \\
&= \frac{1}{6} [0.08 + 0.08 + 0.01 + 1.70 + 1.74 + 3.00] \\
&\approx 1.10 \text{ in.}
\end{align*}\]

During September 22–23, 2010 this area had an average storm rainfall of approximately 1.10 inches.
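The arithmetic in this estimate is short enough to reproduce directly. The following sketch uses the six midpoint readings listed in the solution; the variable names are chosen for this illustration only.

```python
# Estimated rainfall (inches) at the six subrectangle midpoints, as read from the map.
samples = [0.08, 0.08, 0.01, 1.70, 1.74, 3.00]

total_area = 250 * 300                   # square miles
delta_a = total_area / len(samples)      # area of each subrectangle

# Midpoint-rule estimate of the double integral of rainfall over R.
integral_estimate = sum(samples) * delta_a

# Average value: integral divided by the area of R.
average = integral_estimate / total_area
print(round(average, 2))
```

Because every subrectangle has the same area, the average reduces to the plain mean of the six samples, which is why the factor \(\frac{1}{6}\) appears in the last step of the computation.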

 Exercise 15.4.6

A contour map is shown for a function f (x, y) on the rectangle R = [−3, 6] × [−1, 4] .

a. Use the midpoint rule with \(m = 3\) and \(n = 2\) to estimate the value of \(\iint_R f(x, y) \, dA\).

b. Estimate the average value of the function f (x, y).

Hint
Divide the region into six rectangles, and use the contour lines to estimate the values for f (x, y).

Answer
Answers to both parts a. and b. may vary.

Key Concepts
We can use a double Riemann sum to approximate the volume of a solid bounded above by a function of two variables over a rectangular region. By taking
the limit, this becomes a double integral representing the volume of the solid.
Properties of double integrals are useful for simplifying computations and finding bounds on their values.
We can use Fubini’s theorem to write and evaluate a double integral as an iterated integral.
Double integrals are used to calculate the area of a region, the volume under a surface, and the average value of a function of two variables over a
rectangular region.

Key Equations
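\[\iint_R f(x, y) \, dA = \lim_{m,n \to \infty} \sum_{i=1}^{m} \sum_{j=1}^{n} f(x_{ij}^*, y_{ij}^*) \, \Delta A\]

\[\int_a^b \int_c^d f(x, y) \, dy \, dx = \int_a^b \left[ \int_c^d f(x, y) \, dy \right] dx\]

or

\[\int_c^d \int_a^b f(x, y) \, dx \, dy = \int_c^d \left[ \int_a^b f(x, y) \, dx \right] dy\]

\[f_{ave} = \frac{1}{\text{Area of } R} \iint_R f(x, y) \, dx \, dy\]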

Glossary
double integral
of the function \(f(x, y)\) over the region \(R\) in the \(xy\)-plane is defined as the limit of a double Riemann sum,

\[\iint_R f(x, y) \, dA = \lim_{m,n \to \infty} \sum_{i=1}^{m} \sum_{j=1}^{n} f(x_{ij}^*, y_{ij}^*) \, \Delta A.\]

double Riemann sum
of the function \(f(x, y)\) over a rectangular region \(R\) is

\[\sum_{i=1}^{m} \sum_{j=1}^{n} f(x_{ij}^*, y_{ij}^*) \, \Delta A,\]

where \(R\) is divided into smaller subrectangles \(R_{ij}\) and \((x_{ij}^*, y_{ij}^*)\) is an arbitrary point in \(R_{ij}\)

Fubini’s theorem
if \(f(x, y)\) is a function of two variables that is continuous over a rectangular region \(R = \{(x, y) \in \mathbb{R}^2 \,|\, a \leq x \leq b, \; c \leq y \leq d\}\), then the double integral of \(f\) over the region equals an iterated integral,

\[\iint_R f(x, y) \, dA = \int_a^b \int_c^d f(x, y) \, dy \, dx = \int_c^d \int_a^b f(x, y) \, dx \, dy\]

iterated integral
for a function \(f(x, y)\) over the region \(R\) is

a. \(\displaystyle \int_a^b \int_c^d f(x, y) \, dy \, dx = \int_a^b \left[ \int_c^d f(x, y) \, dy \right] dx,\)

b. \(\displaystyle \int_c^d \int_a^b f(x, y) \, dx \, dy = \int_c^d \left[ \int_a^b f(x, y) \, dx \right] dy,\)

where \(a, b, c\), and \(d\) are any real numbers and \(R = [a, b] \times [c, d]\)

15.4: Applications of Double Integrals is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
15.1: Double Integrals over Rectangular Regions by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0. Original source:
https://openstax.org/details/books/calculus-volume-1.


1 https://math.libretexts.org/@go/page/4549
15.6: Triple Integrals
 Learning Objectives
Recognize when a function of three variables is integrable over a rectangular box.
Evaluate a triple integral by expressing it as an iterated integral.
Recognize when a function of three variables is integrable over a closed and bounded region.
Simplify a calculation by changing the order of integration of a triple integral.
Calculate the average value of a function of three variables.

Previously, we discussed the double integral of a function \(f(x, y)\) of two variables over a rectangular region in the plane. In this section we define the triple integral of a function \(f(x, y, z)\) of three variables over a rectangular solid box in space, \(\mathbb{R}^3\). Later in this section we extend the definition to more general regions in \(\mathbb{R}^3\).

Integrable Functions of Three Variables


We can define a rectangular box \(B\) in \(\mathbb{R}^3\) as

\[B = \{(x, y, z) \,|\, a \leq x \leq b, \; c \leq y \leq d, \; e \leq z \leq f\}.\]

We follow a procedure similar to the one we used for double integrals. We divide the interval \([a, b]\) into \(l\) subintervals \([x_{i-1}, x_i]\) of equal length \(\Delta x\) with

\[\Delta x = x_i - x_{i-1} = \frac{b - a}{l},\]

divide the interval \([c, d]\) into \(m\) subintervals \([y_{j-1}, y_j]\) of equal length \(\Delta y\) with

\[\Delta y = y_j - y_{j-1} = \frac{d - c}{m},\]

and divide the interval \([e, f]\) into \(n\) subintervals \([z_{k-1}, z_k]\) of equal length \(\Delta z\) with

\[\Delta z = z_k - z_{k-1} = \frac{f - e}{n}.\]

Then the rectangular box \(B\) is subdivided into \(lmn\) subboxes

\[B_{ijk} = [x_{i-1}, x_i] \times [y_{j-1}, y_j] \times [z_{k-1}, z_k],\]

as shown in Figure 15.6.1.

Figure 15.6.1: A rectangular box in \(\mathbb{R}^3\) divided into subboxes by planes parallel to the coordinate planes.

For each \(i, j\), and \(k\), consider a sample point \((x_{ijk}^*, y_{ijk}^*, z_{ijk}^*)\) in each sub-box \(B_{ijk}\). We see that its volume is \(\Delta V = \Delta x \Delta y \Delta z\). Form the triple Riemann sum

\[\sum_{i=1}^{l} \sum_{j=1}^{m} \sum_{k=1}^{n} f(x_{ijk}^*, y_{ijk}^*, z_{ijk}^*) \, \Delta x \Delta y \Delta z.\]

15.6.1 https://math.libretexts.org/@go/page/4550
We define the triple integral in terms of the limit of a triple Riemann sum, as we did for the double integral in terms of a double Riemann sum.

 Definition: The triple integral

The triple integral of a function \(f(x, y, z)\) over a rectangular box \(B\) is defined as

\[\lim_{l,m,n \to \infty} \sum_{i=1}^{l} \sum_{j=1}^{m} \sum_{k=1}^{n} f(x_{ijk}^*, y_{ijk}^*, z_{ijk}^*) \, \Delta x \Delta y \Delta z = \iiint_B f(x, y, z) \, dV\]

if this limit exists.

When the triple integral exists on \(B\), the function \(f(x, y, z)\) is said to be integrable on \(B\). The triple integral exists if \(f(x, y, z)\) is continuous on \(B\), so we will use continuous functions for our examples. However, continuity is sufficient but not necessary: \(f\) is also integrable if it is bounded on \(B\) and continuous except possibly on the boundary of \(B\). The sample point \((x_{ijk}^*, y_{ijk}^*, z_{ijk}^*)\) can be any point in the rectangular sub-box \(B_{ijk}\), and all the properties of a double integral apply to a triple integral. Just as the double integral has many practical applications, the triple integral also has many applications, which we discuss in later sections.
Now that we have developed the concept of the triple integral, we need to know how to compute it. Just as in the case of the double integral, we can have an
iterated triple integral, and consequently, a version of Fubini’s theorem for triple integrals exists.

 Fubini’s Theorem for Triple Integrals

If \(f(x, y, z)\) is continuous on a rectangular box \(B = [a, b] \times [c, d] \times [e, f]\), then

\[\iiint_B f(x, y, z) \, dV = \int_e^f \int_c^d \int_a^b f(x, y, z) \, dx \, dy \, dz.\]

This integral is also equal to any of the other five possible orderings for the iterated triple integral.

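For real numbers \(a, b, c, d, e\), and \(f\), the iterated triple integral can be expressed in six different orderings:

\[\begin{align}
\int_e^f \int_c^d \int_a^b f(x, y, z) \, dx \, dy \, dz &= \int_e^f \left( \int_c^d \left( \int_a^b f(x, y, z) \, dx \right) dy \right) dz \tag{15.6.1} \\
&= \int_c^d \left( \int_e^f \left( \int_a^b f(x, y, z) \, dx \right) dz \right) dy \tag{15.6.2} \\
&= \int_a^b \left( \int_e^f \left( \int_c^d f(x, y, z) \, dy \right) dz \right) dx \tag{15.6.3} \\
&= \int_e^f \left( \int_a^b \left( \int_c^d f(x, y, z) \, dy \right) dx \right) dz \tag{15.6.4} \\
&= \int_c^d \left( \int_a^b \left( \int_e^f f(x, y, z) \, dz \right) dx \right) dy \tag{15.6.5} \\
&= \int_a^b \left( \int_c^d \left( \int_e^f f(x, y, z) \, dz \right) dy \right) dx \tag{15.6.6}
\end{align}\]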

For a rectangular box, the order of integration does not make any significant difference in the level of difficulty in computation. We compute triple integrals
using Fubini’s Theorem rather than using the Riemann sum definition. We follow the order of integration in the same way as we did for double integrals
(that is, from inside to outside).

 Example 15.6.1: Evaluating a Triple Integral


Evaluate the triple integral
\[\int_{z=0}^{z=1} \int_{y=2}^{y=4} \int_{x=-1}^{x=5} (x + yz^2) \, dx \, dy \, dz.\]

Solution
The order of integration is specified in the problem, so integrate with respect to x first, then y, and then z .

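\[\begin{align*}
&\int_{z=0}^{z=1} \int_{y=2}^{y=4} \int_{x=-1}^{x=5} (x + yz^2) \, dx \, dy \, dz \\
&\quad = \int_{z=0}^{z=1} \int_{y=2}^{y=4} \left[ \frac{x^2}{2} + xyz^2 \right] \Bigg|_{x=-1}^{x=5} dy \, dz && \text{Integrate with respect to } x. \\
&\quad = \int_{z=0}^{z=1} \int_{y=2}^{y=4} \left[ 12 + 6yz^2 \right] dy \, dz && \text{Evaluate.} \\
&\quad = \int_{z=0}^{z=1} \left[ 12y + 6 \frac{y^2}{2} z^2 \right] \Bigg|_{y=2}^{y=4} dz && \text{Integrate with respect to } y. \\
&\quad = \int_{z=0}^{z=1} \left[ 24 + 36z^2 \right] dz && \text{Evaluate.} \\
&\quad = \left[ 24z + 36 \frac{z^3}{3} \right] \Bigg|_{z=0}^{z=1} && \text{Integrate with respect to } z. \\
&\quad = 36. && \text{Evaluate.}
\end{align*}\]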
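The Riemann-sum definition from earlier in the section can be used to sanity-check this result numerically. The sketch below forms a triple midpoint sum over the box \([-1, 5] \times [2, 4] \times [0, 1]\); the function name `triple_midpoint_sum` and the choice of 30 subdivisions per axis are assumptions for illustration.

```python
def triple_midpoint_sum(f, x_range, y_range, z_range, l, m, n):
    """Triple Riemann sum of f over a box, sampling each subbox at its midpoint."""
    (a, b), (c, d), (e, g) = x_range, y_range, z_range
    dx, dy, dz = (b - a) / l, (d - c) / m, (g - e) / n
    total = 0.0
    for i in range(l):
        x = a + (i + 0.5) * dx
        for j in range(m):
            y = c + (j + 0.5) * dy
            for k in range(n):
                z = e + (k + 0.5) * dz
                total += f(x, y, z)
    return total * dx * dy * dz      # each subbox has volume dx * dy * dz


estimate = triple_midpoint_sum(
    lambda x, y, z: x + y * z**2,    # integrand of Example 15.6.1
    (-1, 5), (2, 4), (0, 1),
    30, 30, 30,                      # hypothetical numbers of subdivisions
)
print(estimate)  # should be close to the exact value 36
```

The sum only approximates the integral; Fubini’s theorem is what makes the exact value 36 easy to obtain by hand.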

 Example 15.6.2: Evaluating a Triple Integral

Evaluate the triple integral

\[\iiint_B x^2 yz \, dV\]

where \(B = \{(x, y, z) \,|\, -2 \leq x \leq 1, \; 0 \leq y \leq 3, \; 1 \leq z \leq 5\}\), as shown in Figure 15.6.2.

Figure 15.6.2 : Evaluating a triple integral over a given rectangular box.


Solution
The order is not specified, but we can use the iterated integral in any order without changing the level of difficulty. Choose, say, to integrate y first, then
x, and then z .

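\[\begin{align*}
\iiint_B x^2 yz \, dV &= \int_1^5 \int_{-2}^1 \int_0^3 \left[ x^2 yz \right] dy \, dx \, dz \\
&= \int_1^5 \int_{-2}^1 \left[ x^2 \frac{y^2}{2} z \right] \Bigg|_0^3 dx \, dz \\
&= \int_1^5 \int_{-2}^1 \frac{9}{2} x^2 z \, dx \, dz \\
&= \int_1^5 \left[ \frac{9}{2} \frac{x^3}{3} z \right] \Bigg|_{-2}^1 dz \\
&= \int_1^5 \frac{27}{2} z \, dz \\
&= \frac{27}{2} \frac{z^2}{2} \Bigg|_1^5 = 162.
\end{align*}\]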

Now try to integrate in a different order just to see that we get the same answer. Choose to integrate with respect to \(x\) first, then \(z\), and then \(y\).

\[\begin{align*}
\iiint_B x^2 yz \, dV &= \int_0^3 \int_1^5 \int_{-2}^1 \left[ x^2 yz \right] dx \, dz \, dy \\
&= \int_0^3 \int_1^5 \left[ \frac{x^3}{3} yz \right] \Bigg|_{-2}^1 dz \, dy \\
&= \int_0^3 \int_1^5 3yz \, dz \, dy \\
&= \int_0^3 \left[ 3y \frac{z^2}{2} \right] \Bigg|_1^5 dy \\
&= \int_0^3 36y \, dy \\
&= 36 \frac{y^2}{2} \Bigg|_0^3 = 18(9 - 0) = 162.
\end{align*}\]
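To see the order independence for all six orderings at once, one can evaluate each iterated integral numerically. The sketch below uses composite Simpson's rule, a technique swapped in here for convenience (it is not the method of the text); it is exact for polynomials of degree at most three in each variable, so it reproduces 162 up to rounding. The names `simpson`, `LIMITS`, and `iterated_integral` are hypothetical.

```python
from itertools import permutations


def simpson(g, a, b, n=10):
    """Composite Simpson's rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3


# Limits of the box B from Example 15.6.2.
LIMITS = {"x": (-2, 1), "y": (0, 3), "z": (1, 5)}


def integrand(x, y, z):
    return x**2 * y * z


def iterated_integral(order, point=None):
    """Iterated integral over B; order[0] is the innermost variable."""
    point = point or {}
    if not order:
        return integrand(**point)            # all three variables are fixed
    *inner, outer = order                    # peel off the outermost variable
    lo, hi = LIMITS[outer]
    return simpson(lambda t: iterated_integral(inner, {**point, outer: t}), lo, hi)


# Key "yxz" means: integrate y first, then x, then z (dy dx dz).
results = {"".join(order): iterated_integral(order) for order in permutations("xyz")}
for order, value in results.items():
    print(order, round(value, 9))
```

All six entries agree, as Fubini’s theorem for triple integrals predicts for a continuous integrand on a box.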

 Exercise 15.6.1

Evaluate the triple integral

\[\iiint_B z \sin x \cos y \, dV\]

where \(B = \left\{(x, y, z) \,\middle|\, 0 \leq x \leq \pi, \; \frac{3\pi}{2} \leq y \leq 2\pi, \; 1 \leq z \leq 3\right\}\).

Hint
Follow the steps in the previous example.
Answer

\[\iiint_B z \sin x \cos y \, dV = 8\]

Triple Integrals over a General Bounded Region


We now expand the definition of the triple integral to compute a triple integral over a more general bounded region \(E\) in \(\mathbb{R}^3\). The general bounded regions we will consider are of three types. First, let \(D\) be the bounded region that is a projection of \(E\) onto the \(xy\)-plane. Suppose the region \(E\) in \(\mathbb{R}^3\) has the form

\[E = \{(x, y, z) \,|\, (x, y) \in D, \; u_1(x, y) \leq z \leq u_2(x, y)\}\]

for two functions \(z = u_1(x, y)\) and \(z = u_2(x, y)\) such that \(u_1(x, y) \leq u_2(x, y)\) for all \((x, y)\) in \(D\), as shown in the following figure.

Figure 15.6.3: We can describe region \(E\) as the space between \(u_1(x, y)\) and \(u_2(x, y)\) above the projection \(D\) of \(E\) onto the \(xy\)-plane.

 Triple Integral over a General Region

The triple integral of a continuous function \(f(x, y, z)\) over a general three-dimensional region

\[E = \{(x, y, z) \,|\, (x, y) \in D, \; u_1(x, y) \leq z \leq u_2(x, y)\}\]

in \(\mathbb{R}^3\), where \(D\) is the projection of \(E\) onto the \(xy\)-plane, is

\[\iiint_E f(x, y, z) \, dV = \iint_D \left[ \int_{u_1(x,y)}^{u_2(x,y)} f(x, y, z) \, dz \right] dA.\]

Similarly, we can consider a general bounded region \(D\) in the \(xz\)-plane and two functions \(y = u_1(x, z)\) and \(y = u_2(x, z)\) such that \(u_1(x, z) \leq u_2(x, z)\) for all \((x, z)\) in \(D\). Then we can describe the solid region \(E\) in \(\mathbb{R}^3\) as

\[E = \{(x, y, z) \,|\, (x, z) \in D, \; u_1(x, z) \leq y \leq u_2(x, z)\},\]

where \(D\) is the projection of \(E\) onto the \(xz\)-plane, and the triple integral is

\[\iiint_E f(x, y, z) \, dV = \iint_D \left[ \int_{u_1(x,z)}^{u_2(x,z)} f(x, y, z) \, dy \right] dA.\]

Finally, if \(D\) is a general bounded region in the \(yz\)-plane and we have two functions \(x = u_1(y, z)\) and \(x = u_2(y, z)\) such that \(u_1(y, z) \leq u_2(y, z)\) for all \((y, z)\) in \(D\), then the solid region \(E\) in \(\mathbb{R}^3\) can be described as

\[E = \{(x, y, z) \,|\, (y, z) \in D, \; u_1(y, z) \leq x \leq u_2(y, z)\},\]

where \(D\) is the projection of \(E\) onto the \(yz\)-plane, and the triple integral is

\[\iiint_E f(x, y, z) \, dV = \iint_D \left[ \int_{u_1(y,z)}^{u_2(y,z)} f(x, y, z) \, dx \right] dA.\]

Note that the region \(D\) in any of the planes may be of Type I or Type II, as described previously. If \(D\) in the \(xy\)-plane is of Type I (Figure 15.6.4), then

\[E = \{(x, y, z) \,|\, a \leq x \leq b, \; g_1(x) \leq y \leq g_2(x), \; u_1(x, y) \leq z \leq u_2(x, y)\}.\]

Figure 15.6.4: A box \(E\) where the projection \(D\) in the \(xy\)-plane is of Type I.

Then the triple integral becomes

\[\iiint_E f(x, y, z) \, dV = \int_a^b \int_{g_1(x)}^{g_2(x)} \int_{u_1(x,y)}^{u_2(x,y)} f(x, y, z) \, dz \, dy \, dx.\]

If \(D\) in the \(xy\)-plane is of Type II (Figure 15.6.5), then

\[E = \{(x, y, z) \,|\, c \leq y \leq d, \; h_1(y) \leq x \leq h_2(y), \; u_1(x, y) \leq z \leq u_2(x, y)\}.\]

Figure 15.6.5: A box \(E\) where the projection \(D\) in the \(xy\)-plane is of Type II.

Then the triple integral becomes

\[\iiint_E f(x, y, z) \, dV = \int_{y=c}^{y=d} \int_{x=h_1(y)}^{x=h_2(y)} \int_{z=u_1(x,y)}^{z=u_2(x,y)} f(x, y, z) \, dz \, dx \, dy.\]

 Example 15.6.3A: Evaluating a Triple Integral over a General Bounded Region

Evaluate the triple integral of the function \(f(x, y, z) = 5x - 3y\) over the solid tetrahedron bounded by the planes \(x = 0\), \(y = 0\), \(z = 0\), and \(x + y + z = 1\).

Solution
Figure 15.6.6 shows the solid tetrahedron E and its projection D on the xy-plane.

Figure 15.6.6 : The solid E has a projection D on the xy-plane of Type I.


We can describe the solid region tetrahedron as

\[E = \{(x, y, z) \,|\, 0 \leq x \leq 1, \; 0 \leq y \leq 1 - x, \; 0 \leq z \leq 1 - x - y\}.\]

Hence, the triple integral is

\[\iiint_E f(x, y, z) \, dV = \int_{x=0}^{x=1} \int_{y=0}^{y=1-x} \int_{z=0}^{z=1-x-y} (5x - 3y) \, dz \, dy \, dx.\]

To simplify the calculation, first evaluate the integral \(\displaystyle \int_{z=0}^{z=1-x-y} (5x - 3y) \, dz\). We have

\[\int_{z=0}^{z=1-x-y} (5x - 3y) \, dz = (5x - 3y) z \Big|_{z=0}^{z=1-x-y} = (5x - 3y)(1 - x - y).\]

Now evaluate the integral

\[\int_{y=0}^{y=1-x} (5x - 3y)(1 - x - y) \, dy,\]

obtaining

\[\int_{y=0}^{y=1-x} (5x - 3y)(1 - x - y) \, dy = \frac{1}{2}(x - 1)^2 (6x - 1).\]

Finally evaluate

\[\int_{x=0}^{x=1} \frac{1}{2}(x - 1)^2 (6x - 1) \, dx = \frac{1}{12}.\]

Putting it all together, we have

\[\iiint_E f(x, y, z) \, dV = \int_{x=0}^{x=1} \int_{y=0}^{y=1-x} \int_{z=0}^{z=1-x-y} (5x - 3y) \, dz \, dy \, dx = \frac{1}{12}.\]
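The same nested structure, with inner limits that depend on the outer variables, can be mirrored in code. This is a minimal sketch that uses composite Simpson's rule as a stand-in for the exact antiderivatives (an assumption of this illustration); since each stage of this particular integral is a polynomial of degree at most three, the rule is exact up to rounding. The helper names are hypothetical.

```python
def simpson(g, a, b, n=10):
    """Composite Simpson's rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3


def integrand(x, y, z):
    return 5 * x - 3 * y


def inner_z(x, y):
    # Innermost integral: z runs from 0 to 1 - x - y.
    return simpson(lambda z: integrand(x, y, z), 0, 1 - x - y)


def middle_y(x):
    # Middle integral: y runs from 0 to 1 - x.
    return simpson(lambda y: inner_z(x, y), 0, 1 - x)


# Outermost integral: x runs from 0 to 1.
value = simpson(middle_y, 0, 1)
print(value)  # should agree with 1/12 up to rounding
```

Each Python function corresponds to one layer of the iterated integral, which makes the dependence of the inner limits on the outer variables explicit.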

Just as we used the double integral \(\displaystyle \iint_D 1 \, dA\) to find the area of a general bounded region \(D\), we can use \(\displaystyle \iiint_E 1 \, dV\) to find the volume of a general solid bounded region \(E\). The next example illustrates the method.

 Example 15.6.3B: Finding a Volume by Evaluating a Triple Integral

Find the volume of a right pyramid that has the square base in the xy-plane [−1, 1] × [−1, 1] and vertex at the point (0, 0, 1) as shown in the following
figure.

Figure 15.6.7 : Finding the volume of a pyramid with a square base.


Solution
In this pyramid the value of \(z\) changes from 0 to 1, and at each height \(z\) the cross section of the pyramid is the square

\[[-1 + z, 1 - z] \times [-1 + z, 1 - z].\]

Hence, the volume of the pyramid is

\[\iiint_E 1 \, dV\]

where

\[E = \{(x, y, z) \,|\, 0 \leq z \leq 1, \; -1 + z \leq y \leq 1 - z, \; -1 + z \leq x \leq 1 - z\}.\]

Thus, we have

\[\begin{align*}
\iiint_E 1 \, dV &= \int_{z=0}^{z=1} \int_{y=-1+z}^{y=1-z} \int_{x=-1+z}^{x=1-z} 1 \, dx \, dy \, dz \\
&= \int_{z=0}^{z=1} \int_{y=-1+z}^{y=1-z} (2 - 2z) \, dy \, dz \\
&= \int_{z=0}^{z=1} (2 - 2z)^2 \, dz = \frac{4}{3}.
\end{align*}\]

Hence, the volume of the pyramid is \(\frac{4}{3}\) cubic units.
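The last step of this computation adds up the areas of square cross sections. A small sketch of that idea, summing slice areas at the midpoint of each of a hypothetical 1000 horizontal slices, is given below; the function name `pyramid_volume` is an assumption for illustration.

```python
def pyramid_volume(slices):
    """Approximate the pyramid's volume by summing square cross sections."""
    dz = 1.0 / slices
    total = 0.0
    for k in range(slices):
        z = (k + 0.5) * dz               # height at the middle of the k-th slice
        side = (1 - z) - (-1 + z)        # x and y both run from -1 + z to 1 - z
        total += side * side * dz        # slice area times slice thickness
    return total


volume = pyramid_volume(1000)
print(volume)  # should be close to 4/3
```

This matches the familiar formula for a pyramid, one third of the base area times the height, which here is \(\frac{1}{3}(4)(1) = \frac{4}{3}\).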

 Exercise 15.6.3

Consider the solid sphere \(E = \{(x, y, z) \,|\, x^2 + y^2 + z^2 \leq 9\}\). Write the triple integral

\[\iiint_E f(x, y, z) \, dV\]

for an arbitrary function \(f\) as an iterated integral. Then evaluate this triple integral with \(f(x, y, z) = 1\). Notice that this gives the volume of a sphere using a triple integral.

Hint
Follow the steps in the previous example. Use symmetry.
Answer
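\[\begin{align*}
\iiint_E 1 \, dV &= \int_{x=-3}^{x=3} \int_{y=-\sqrt{9-x^2}}^{y=\sqrt{9-x^2}} \int_{z=-\sqrt{9-x^2-y^2}}^{z=\sqrt{9-x^2-y^2}} 1 \, dz \, dy \, dx \\
&= 8 \int_{x=0}^{x=3} \int_{y=0}^{y=\sqrt{9-x^2}} \int_{z=0}^{z=\sqrt{9-x^2-y^2}} 1 \, dz \, dy \, dx \\
&= 36\pi \text{ cubic units.}
\end{align*}\]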

Changing the Order of Integration
As we have already seen in double integrals over general bounded regions, changing the order of the integration is done quite often to simplify the
computation. With a triple integral over a rectangular box, the order of integration does not change the level of difficulty of the calculation. However, with a
triple integral over a general bounded region, choosing an appropriate order of integration can simplify the computation quite a bit. Sometimes making the
change to polar coordinates can also be very helpful. We demonstrate two examples here.

 Example 15.6.4: Changing the Order of Integration

Consider the iterated integral

\[\int_{x=0}^{x=1} \int_{y=0}^{y=x^2} \int_{z=0}^{z=y^2} f(x, y, z) \, dz \, dy \, dx.\]

The order of integration here is first with respect to z, then y, and then x. Express this integral by changing the order of integration to be first with
respect to x, then z , and then y . Verify that the value of the integral is the same if we let f (x, y, z) = xyz.
Solution
The best way to do this is to sketch the region \(E\) and its projections onto each of the three coordinate planes. Thus, let

\[E = \{(x, y, z) \,|\, 0 \leq x \leq 1, \; 0 \leq y \leq x^2, \; 0 \leq z \leq y^2\}\]

and

\[\int_{x=0}^{x=1} \int_{y=0}^{y=x^2} \int_{z=0}^{z=y^2} f(x, y, z) \, dz \, dy \, dx = \iiint_E f(x, y, z) \, dV.\]

We need to express this triple integral as

\[\int_{y=c}^{y=d} \int_{z=v_1(y)}^{z=v_2(y)} \int_{x=u_1(y,z)}^{x=u_2(y,z)} f(x, y, z) \, dx \, dz \, dy.\]

Knowing the region \(E\), we can draw the following projections (Figure 15.6.8):
on the \(xy\)-plane is \(D_1 = \{(x, y) \,|\, 0 \leq x \leq 1, \; 0 \leq y \leq x^2\} = \{(x, y) \,|\, 0 \leq y \leq 1, \; \sqrt{y} \leq x \leq 1\}\),
on the \(yz\)-plane is \(D_2 = \{(y, z) \,|\, 0 \leq y \leq 1, \; 0 \leq z \leq y^2\}\), and
on the \(xz\)-plane is \(D_3 = \{(x, z) \,|\, 0 \leq x \leq 1, \; 0 \leq z \leq x^4\}\).

Figure 15.6.8: The three cross sections of \(E\) on the three coordinate planes.

Now we can describe the same region \(E\) as \(\{(x, y, z) \,|\, 0 \leq y \leq 1, \; 0 \leq z \leq y^2, \; \sqrt{y} \leq x \leq 1\}\), and consequently, the triple integral becomes

\[\int_{y=c}^{y=d} \int_{z=v_1(y)}^{z=v_2(y)} \int_{x=u_1(y,z)}^{x=u_2(y,z)} f(x, y, z) \, dx \, dz \, dy = \int_{y=0}^{y=1} \int_{z=0}^{z=y^2} \int_{x=\sqrt{y}}^{x=1} f(x, y, z) \, dx \, dz \, dy.\]

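Now assume that \(f(x, y, z) = xyz\) in each of the integrals. Then we have

\[\begin{align*}
\int_{x=0}^{x=1} \int_{y=0}^{y=x^2} \int_{z=0}^{z=y^2} xyz \, dz \, dy \, dx &= \int_{x=0}^{x=1} \int_{y=0}^{y=x^2} \left[ xy \frac{z^2}{2} \right] \Bigg|_{z=0}^{z=y^2} dy \, dx \\
&= \int_{x=0}^{x=1} \int_{y=0}^{y=x^2} \left( x \frac{y^5}{2} \right) dy \, dx \\
&= \int_{x=0}^{x=1} \left[ x \frac{y^6}{12} \right] \Bigg|_{y=0}^{y=x^2} dx \\
&= \int_{x=0}^{x=1} \frac{x^{13}}{12} \, dx = \frac{x^{14}}{168} \Bigg|_{x=0}^{x=1} \\
&= \frac{1}{168},
\end{align*}\]

\[\begin{align*}
\int_{y=0}^{y=1} \int_{z=0}^{z=y^2} \int_{x=\sqrt{y}}^{x=1} xyz \, dx \, dz \, dy &= \int_{y=0}^{y=1} \int_{z=0}^{z=y^2} \left[ yz \frac{x^2}{2} \right] \Bigg|_{x=\sqrt{y}}^{x=1} dz \, dy \\
&= \int_{y=0}^{y=1} \int_{z=0}^{z=y^2} \left( \frac{yz}{2} - \frac{y^2 z}{2} \right) dz \, dy \\
&= \int_{y=0}^{y=1} \left[ \frac{yz^2}{4} - \frac{y^2 z^2}{4} \right] \Bigg|_{z=0}^{z=y^2} dy \\
&= \int_{y=0}^{y=1} \left( \frac{y^5}{4} - \frac{y^6}{4} \right) dy \\
&= \left( \frac{y^6}{24} - \frac{y^7}{28} \right) \Bigg|_{y=0}^{y=1} \\
&= \frac{1}{168}.
\end{align*}\]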

The answers match.
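Both orderings can also be compared numerically. The following sketch evaluates the two iterated integrals of \(xyz\) with composite Simpson's rule (a numerical technique chosen for this illustration, not the method of the text), using the limits \(0 \leq z \leq y^2\), \(0 \leq y \leq x^2\), \(0 \leq x \leq 1\) for the first order and \(\sqrt{y} \leq x \leq 1\), \(0 \leq z \leq y^2\), \(0 \leq y \leq 1\) for the second. The variable names are hypothetical.

```python
import math


def simpson(g, a, b, n=40):
    """Composite Simpson's rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3


def f(x, y, z):
    return x * y * z


# Order dz dy dx: z innermost, then y, then x.
dz_dy_dx = simpson(
    lambda x: simpson(
        lambda y: simpson(lambda z: f(x, y, z), 0, y**2),
        0, x**2),
    0, 1)

# Order dx dz dy: x innermost, then z, then y.
dx_dz_dy = simpson(
    lambda y: simpson(
        lambda z: simpson(lambda x: f(x, y, z), math.sqrt(y), 1),
        0, y**2),
    0, 1)

print(dz_dy_dx, dx_dz_dy, 1 / 168)
```

The two numerical values agree with each other and with \(\frac{1}{168}\) to within the accuracy of the quadrature rule.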

 Exercise 15.6.4
Write five different iterated integrals equal to the given integral
\[\int_{z=0}^{z=4} \int_{y=0}^{y=4-z} \int_{x=0}^{x=\sqrt{y}} f(x, y, z) \, dx \, dy \, dz.\]

Hint
Follow the steps in the previous example, using the region \(E\) as \(\{(x, y, z) \,|\, 0 \leq z \leq 4, \; 0 \leq y \leq 4 - z, \; 0 \leq x \leq \sqrt{y}\}\), and describe and sketch the projections onto each of the three planes, five different times.
Answer
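(i) \(\displaystyle \int_{z=0}^{z=4} \int_{x=0}^{x=\sqrt{4-z}} \int_{y=x^2}^{y=4-z} f(x, y, z) \, dy \, dx \, dz\),

(ii) \(\displaystyle \int_{y=0}^{y=4} \int_{z=0}^{z=4-y} \int_{x=0}^{x=\sqrt{y}} f(x, y, z) \, dx \, dz \, dy\),

(iii) \(\displaystyle \int_{y=0}^{y=4} \int_{x=0}^{x=\sqrt{y}} \int_{z=0}^{z=4-y} f(x, y, z) \, dz \, dx \, dy\),

(iv) \(\displaystyle \int_{x=0}^{x=2} \int_{y=x^2}^{y=4} \int_{z=0}^{z=4-y} f(x, y, z) \, dz \, dy \, dx\),

(v) \(\displaystyle \int_{x=0}^{x=2} \int_{z=0}^{z=4-x^2} \int_{y=x^2}^{y=4-z} f(x, y, z) \, dy \, dz \, dx\)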

 Example 15.6.5: Changing Integration Order and Coordinate Systems


Evaluate the triple integral

\[\iiint_E \sqrt{x^2 + z^2} \, dV,\]

where \(E\) is the region bounded by the paraboloid \(y = x^2 + z^2\) (Figure 15.6.9) and the plane \(y = 4\).

Figure 15.6.9 . Integrating a triple integral over a paraboloid.
Solution
The projection of the solid region \(E\) onto the \(xy\)-plane is the region bounded above by \(y = 4\) and below by the parabola \(y = x^2\), as shown.

Figure 15.6.10. Cross section in the xy-plane of the paraboloid in Figure 15.6.9 .
Thus, we have

\[E = \{(x, y, z) \,|\, -2 \leq x \leq 2, \; x^2 \leq y \leq 4, \; -\sqrt{y - x^2} \leq z \leq \sqrt{y - x^2}\}.\]

The triple integral becomes

\[\iiint_E \sqrt{x^2 + z^2} \, dV = \int_{x=-2}^{x=2} \int_{y=x^2}^{y=4} \int_{z=-\sqrt{y-x^2}}^{z=\sqrt{y-x^2}} \sqrt{x^2 + z^2} \, dz \, dy \, dx.\]

This expression is difficult to compute, so consider the projection of \(E\) onto the \(xz\)-plane. This is a circular disc \(x^2 + z^2 \leq 4\). So we obtain

\[\iiint_E \sqrt{x^2 + z^2} \, dV = \int_{x=-2}^{x=2} \int_{z=-\sqrt{4-x^2}}^{z=\sqrt{4-x^2}} \int_{y=x^2+z^2}^{y=4} \sqrt{x^2 + z^2} \, dy \, dz \, dx.\]

Here the order of integration changes from being first with respect to \(z\), then \(y\), and then \(x\) to being first with respect to \(y\), then \(z\), and then \(x\). It will soon be clear how this change can be beneficial for computation. We have

\[\int_{x=-2}^{x=2} \int_{z=-\sqrt{4-x^2}}^{z=\sqrt{4-x^2}} \int_{y=x^2+z^2}^{y=4} \sqrt{x^2 + z^2} \, dy \, dz \, dx = \int_{x=-2}^{x=2} \int_{z=-\sqrt{4-x^2}}^{z=\sqrt{4-x^2}} (4 - x^2 - z^2) \sqrt{x^2 + z^2} \, dz \, dx.\]

Now use the polar substitution \(x = r \cos \theta\), \(z = r \sin \theta\), and \(dz \, dx = r \, dr \, d\theta\) in the \(xz\)-plane. This is essentially the same thing as when we used polar coordinates in the \(xy\)-plane, except we are replacing \(y\) by \(z\). Consequently the limits of integration change and we have, by using \(r^2 = x^2 + z^2\),

\[\begin{align*}
\int_{x=-2}^{x=2} \int_{z=-\sqrt{4-x^2}}^{z=\sqrt{4-x^2}} (4 - x^2 - z^2) \sqrt{x^2 + z^2} \, dz \, dx &= \int_{\theta=0}^{\theta=2\pi} \int_{r=0}^{r=2} (4 - r^2) \, r \cdot r \, dr \, d\theta \\
&= \int_0^{2\pi} \left[ \frac{4r^3}{3} - \frac{r^5}{5} \right] \Bigg|_0^2 d\theta \\
&= \int_0^{2\pi} \frac{64}{15} \, d\theta = \frac{128\pi}{15}.
\end{align*}\]
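To see that the polar form and the rectangular form describe the same quantity, both can be approximated with midpoint sums. In the sketch below the grid sizes and the function names `polar_value` and `cartesian_value` are assumptions for illustration; the rectangular version sums over the square \([-2, 2] \times [-2, 2]\) and skips grid points outside the disc \(x^2 + z^2 \leq 4\).

```python
import math


def polar_value(steps=200):
    """Midpoint sum of (4 - r^2) * r * r over 0 <= r <= 2, times the theta-range 2*pi."""
    dr = 2.0 / steps
    radial = 0.0
    for k in range(steps):
        r = (k + 0.5) * dr
        radial += (4 - r**2) * r * r * dr   # integrand r, times the Jacobian factor r
    return 2 * math.pi * radial


def cartesian_value(steps=300):
    """Midpoint sum of (4 - x^2 - z^2) * sqrt(x^2 + z^2) over the disc x^2 + z^2 <= 4."""
    h = 4.0 / steps
    total = 0.0
    for i in range(steps):
        x = -2 + (i + 0.5) * h
        for k in range(steps):
            z = -2 + (k + 0.5) * h
            height = 4 - x**2 - z**2        # length of the y-interval [x^2 + z^2, 4]
            if height > 0:                  # only points inside the disc contribute
                total += height * math.sqrt(x**2 + z**2)
    return total * h * h


print(polar_value(), cartesian_value(), 128 * math.pi / 15)
```

The polar version needs only a one-dimensional sum because the integrand does not depend on \(\theta\), which is the computational benefit the example points to.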

Average Value of a Function of Three Variables


Recall that we found the average value of a function of two variables by evaluating the double integral over a region on the plane and then dividing by the
area of the region. Similarly, we can find the average value of a function in three variables by evaluating the triple integral over a solid region and then
dividing by the volume of the solid.

 Average Value of a Function of Three Variables

If \(f(x, y, z)\) is integrable over a solid bounded region \(E\) with positive volume \(V(E)\), then the average value of the function is

\[f_{ave} = \frac{1}{V(E)} \iiint_E f(x, y, z) \, dV.\]

Note that the volume is

\[V(E) = \iiint_E 1 \, dV.\]

 Example 15.6.6: Finding an Average Temperature

The temperature at a point \((x, y, z)\) of a solid \(E\) bounded by the coordinate planes and the plane \(x + y + z = 1\) is \(T(x, y, z) = (xy + 8z + 20)\) °C. Find the average temperature over the solid.
Solution
Use the theorem given above and the triple integral to find the numerator and the denominator. Then do the division. Notice that the plane \(x + y + z = 1\) has intercepts \((1, 0, 0)\), \((0, 1, 0)\), and \((0, 0, 1)\). The region \(E\) looks like

\[E = \{(x, y, z) \,|\, 0 \leq x \leq 1, \; 0 \leq y \leq 1 - x, \; 0 \leq z \leq 1 - x - y\}.\]

Hence the triple integral of the temperature is

\[\iiint_E f(x, y, z) \, dV = \int_{x=0}^{x=1} \int_{y=0}^{y=1-x} \int_{z=0}^{z=1-x-y} (xy + 8z + 20) \, dz \, dy \, dx = \frac{147}{40}.\]

The volume evaluation is

\[V(E) = \iiint_E 1 \, dV = \int_{x=0}^{x=1} \int_{y=0}^{y=1-x} \int_{z=0}^{z=1-x-y} 1 \, dz \, dy \, dx = \frac{1}{6}.\]

Hence the average value is

\[T_{ave} = \frac{147/40}{1/6} = \frac{6(147)}{40} = \frac{441}{20} \text{ °C}.\]
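The two triple integrals and their quotient can be checked numerically. The sketch below integrates over the tetrahedron with nested composite Simpson's rule (a stand-in numerical method assumed for this illustration); the helper `integrate_over_tetrahedron` is hypothetical.

```python
def simpson(g, a, b, n=20):
    """Composite Simpson's rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3


def integrate_over_tetrahedron(f, n=20):
    """Integrate f over 0 <= x <= 1, 0 <= y <= 1 - x, 0 <= z <= 1 - x - y."""
    return simpson(
        lambda x: simpson(
            lambda y: simpson(lambda z: f(x, y, z), 0, 1 - x - y, n),
            0, 1 - x, n),
        0, 1, n)


temperature_integral = integrate_over_tetrahedron(lambda x, y, z: x * y + 8 * z + 20)
volume = integrate_over_tetrahedron(lambda x, y, z: 1.0)
average_temperature = temperature_integral / volume
print(temperature_integral, volume, average_temperature)
```

The printed values should be close to \(\frac{147}{40} = 3.675\), \(\frac{1}{6}\), and \(\frac{441}{20} = 22.05\), respectively.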

 Exercise 15.6.6

Find the average value of the function f (x, y, z) = xyz over the cube with sides of length 4 units in the first octant with one vertex at the origin and
edges parallel to the coordinate axes.

Hint
Follow the steps in the previous example.
Answer
fave = 8

Key Concepts
To compute a triple integral we use Fubini’s theorem, which states that if \(f(x, y, z)\) is continuous on a rectangular box \(B = [a, b] \times [c, d] \times [e, f]\), then

\[\iiint_B f(x, y, z) \, dV = \int_e^f \int_c^d \int_a^b f(x, y, z) \, dx \, dy \, dz\]

and is also equal to any of the other five possible orderings for the iterated triple integral.
To compute the volume of a general solid bounded region \(E\) we use the triple integral

\[V(E) = \iiint_E 1 \, dV.\]

Interchanging the order of the iterated integrals does not change the answer. As a matter of fact, interchanging the order of integration can help simplify the computation.
To compute the average value of a function over a general three-dimensional region, we use

\[f_{ave} = \frac{1}{V(E)} \iiint_E f(x, y, z) \, dV.\]

Key Equations
Triple integral

\[\lim_{l,m,n \to \infty} \sum_{i=1}^{l} \sum_{j=1}^{m} \sum_{k=1}^{n} f(x_{ijk}^*, y_{ijk}^*, z_{ijk}^*) \, \Delta x \Delta y \Delta z = \iiint_B f(x, y, z) \, dV\]

Glossary
triple integral
the triple integral of a continuous function f (x, y, z) over a rectangular solid box B is the limit of a Riemann sum for a function of three variables, if this
limit exists

15.6: Triple Integrals is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
15.4: Triple Integrals by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0. Original source: https://openstax.org/details/books/calculus-volume-1.

15.7: Triple Integrals in Cylindrical Coordinates
 Learning Objectives
Evaluate a triple integral by changing to cylindrical coordinates.
Evaluate a triple integral by changing to spherical coordinates.

Earlier in this chapter we showed how to convert a double integral in rectangular coordinates into a double integral in polar
coordinates in order to deal more conveniently with problems involving circular symmetry. A similar situation occurs with triple
integrals, but here we need to distinguish between cylindrical symmetry and spherical symmetry. In this section we convert triple
integrals in rectangular coordinates into a triple integral in either cylindrical or spherical coordinates.
Also recall the chapter prelude, which showed the opera house l’Hemisphèric in Valencia, Spain. It has four sections with one of
the sections being a theater in a five-story-high sphere (ball) under an oval roof as long as a football field. Inside is an IMAX
screen that changes the sphere into a planetarium with a sky full of 9000 twinkling stars. Using triple integrals in spherical
coordinates, we can find the volumes of different geometric shapes like these.

Review of Cylindrical Coordinates


As we have seen earlier, in two-dimensional space \(\mathbb{R}^2\) a point with rectangular coordinates \((x, y)\) can be identified with \((r, \theta)\) in polar coordinates and vice versa, where \(x = r \cos \theta\), \(y = r \sin \theta\), \(r^2 = x^2 + y^2\), and \(\tan \theta = \frac{y}{x}\) are the relationships between the variables.
In three-dimensional space \(\mathbb{R}^3\) a point with rectangular coordinates \((x, y, z)\) can be identified with cylindrical coordinates \((r, \theta, z)\) and vice versa. We can use these same conversion relationships, adding \(z\) as the vertical distance to the point from the \(xy\)-plane, as shown in Figure 15.7.1.

Figure 15.7.1 : Cylindrical coordinates are similar to polar coordinates with a vertical z coordinate added.
To convert from cylindrical to rectangular coordinates, we use

\[x = r \cos \theta, \quad y = r \sin \theta, \quad z = z.\]

To convert from rectangular to cylindrical coordinates, we use

\[r^2 = x^2 + y^2, \quad \theta = \tan^{-1}\left(\frac{y}{x}\right), \quad z = z.\]

Note that the \(z\)-coordinate remains the same in both cases.
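The conversion formulas translate directly into code. In the sketch below the function names are hypothetical, and `math.atan2(y, x)` is used in place of \(\tan^{-1}(y/x)\): this is a deliberate substitution, because `atan2` selects the correct quadrant and handles \(x = 0\), whereas the plain inverse tangent only determines \(\theta\) up to a multiple of \(\pi\).

```python
import math


def cylindrical_to_rectangular(r, theta, z):
    """(r, theta, z) -> (x, y, z) using x = r cos(theta), y = r sin(theta)."""
    return (r * math.cos(theta), r * math.sin(theta), z)


def rectangular_to_cylindrical(x, y, z):
    """(x, y, z) -> (r, theta, z) with r >= 0 and theta in (-pi, pi]."""
    r = math.hypot(x, y)            # r^2 = x^2 + y^2
    theta = math.atan2(y, x)        # quadrant-aware version of arctan(y / x)
    return (r, theta, z)


point = rectangular_to_cylindrical(-1.0, 1.0, 2.0)
print(point)
print(cylindrical_to_rectangular(*point))
```

In both directions the third coordinate is passed through unchanged, matching the observation that \(z\) stays the same.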


In the two-dimensional plane with a rectangular coordinate system, when we say x = k (constant) we mean an unbounded vertical
line parallel to the y -axis and when y = l (constant) we mean an unbounded horizontal line parallel to the x-axis. With the polar

15.7.1 https://math.libretexts.org/@go/page/4551
coordinate system, when we say r = c (constant), we mean a circle of radius c units and when θ =α (constant) we mean an
infinite ray making an angle α with the positive x-axis.
Similarly, in three-dimensional space with rectangular coordinates \((x, y, z)\), the equations \(x = k\), \(y = l\), and \(z = m\), where \(k, l\), and \(m\) are constants, represent unbounded planes parallel to the \(yz\)-plane, \(xz\)-plane, and \(xy\)-plane, respectively. With cylindrical coordinates \((r, \theta, z)\), by \(r = c\), \(\theta = \alpha\), and \(z = m\), where \(c, \alpha\), and \(m\) are constants, we mean an unbounded vertical cylinder with the \(z\)-axis as its radial axis; a vertical plane through the \(z\)-axis making a constant angle \(\alpha\) with the \(xz\)-plane; and an unbounded horizontal plane parallel to the \(xy\)-plane, respectively. This means that the circular cylinder \(x^2 + y^2 = c^2\) in rectangular coordinates can be represented simply as \(r = c\) in cylindrical coordinates. (Refer to Cylindrical and Spherical Coordinates for more review.)

Integration in Cylindrical Coordinates


Triple integrals can often be more readily evaluated by using cylindrical coordinates instead of rectangular coordinates. Some
common equations of surfaces in rectangular coordinates along with corresponding equations in cylindrical coordinates are listed in
Table 15.7.1. These equations will become handy as we proceed with solving problems using triple integrals.
Table 15.7.1: Equations of Some Common Shapes

| | Circular cylinder | Circular cone | Sphere | Paraboloid |
|---|---|---|---|---|
| Rectangular | \(x^2 + y^2 = c^2\) | \(z^2 = c^2(x^2 + y^2)\) | \(x^2 + y^2 + z^2 = c^2\) | \(z = c(x^2 + y^2)\) |
| Cylindrical | \(r = c\) | \(z = cr\) | \(r^2 + z^2 = c^2\) | \(z = cr^2\) |

As before, we start with the simplest bounded region \(B\) in \(\mathbb{R}^3\) to describe in cylindrical coordinates, in the form of a cylindrical box, \(B = \{(r, \theta, z) \,|\, a \leq r \leq b, \; \alpha \leq \theta \leq \beta, \; c \leq z \leq d\}\) (Figure 15.7.2). Suppose we divide each interval into \(l, m\), and \(n\) subdivisions such that \(\Delta r = \frac{b - a}{l}\), \(\Delta \theta = \frac{\beta - \alpha}{m}\), and \(\Delta z = \frac{d - c}{n}\). Then we can state the following definition for a triple integral in cylindrical coordinates.

Figure 15.7.2 : A cylindrical box B described by cylindrical coordinates.

 DEFINITION: triple integral in cylindrical coordinates

Consider the cylindrical box (expressed in cylindrical coordinates)

\[B = \{(r, \theta, z) \,|\, a \leq r \leq b, \; \alpha \leq \theta \leq \beta, \; c \leq z \leq d\}.\]

If the function \(f(r, \theta, z)\) is continuous on \(B\) and if \((r_{ijk}^*, \theta_{ijk}^*, z_{ijk}^*)\) is any sample point in the cylindrical subbox \(B_{ijk} = [r_{i-1}, r_i] \times [\theta_{j-1}, \theta_j] \times [z_{k-1}, z_k]\) (Figure 15.7.2), then we can define the triple integral in cylindrical coordinates as the limit of a triple Riemann sum, provided the following limit exists:

\[\lim_{l,m,n \to \infty} \sum_{i=1}^{l} \sum_{j=1}^{m} \sum_{k=1}^{n} f(r_{ijk}^*, \theta_{ijk}^*, z_{ijk}^*) \, r_{ijk}^* \, \Delta r \Delta \theta \Delta z.\]

Note that if g(x, y, z) is the function in rectangular coordinates and the box B is expressed in rectangular coordinates, then the
triple integral

\[ \iiint_B g(x, y, z)\, dV \]

is equal to the triple integral

\[ \iiint_B g(r\cos\theta, r\sin\theta, z)\, r\, dr\, d\theta\, dz \]

and we have

\[ \iiint_B g(x, y, z)\, dV = \iiint_B g(r\cos\theta, r\sin\theta, z)\, r\, dr\, d\theta\, dz = \iiint_B f(r, \theta, z)\, r\, dr\, d\theta\, dz. \]

As mentioned in the preceding section, all the properties of a double integral work well in triple integrals, whether in rectangular
coordinates or cylindrical coordinates. They also hold for iterated integrals. To reiterate, in cylindrical coordinates, Fubini’s
theorem takes the following form:

 Theorem: Fubini’s Theorem in Cylindrical Coordinates

Suppose that g(x, y, z) is continuous on a region B which, when described in cylindrical coordinates, is the box
B = {(r, θ, z) | a ≤ r ≤ b, α ≤ θ ≤ β, c ≤ z ≤ d}.
Then g(x, y, z) = g(r cos θ, r sin θ, z) = f(r, θ, z) and

\[ \iiint_B g(x, y, z)\, dV = \int_{c}^{d} \int_{\alpha}^{\beta} \int_{a}^{b} f(r, \theta, z)\, r\, dr\, d\theta\, dz. \]

The iterated integral may be replaced equivalently by any one of the other five iterated integrals obtained by integrating with
respect to the three variables in other orders.
Cylindrical coordinate systems work well for solids that are symmetric around an axis, such as cylinders and cones. Let us look at
some examples before we define the triple integral in cylindrical coordinates on general cylindrical regions.

 Example 15.7.1: Evaluating a Triple Integral over a Cylindrical Box

Evaluate the triple integral

\[ \iiint_B (zr\sin\theta)\, r\, dr\, d\theta\, dz \]

where the cylindrical box B is B = {(r, θ, z) | 0 ≤ r ≤ 2, 0 ≤ θ ≤ π/2, 0 ≤ z ≤ 4}.

Solution
As stated in Fubini’s theorem, we can write the triple integral as the iterated integral

\[ \iiint_B (zr\sin\theta)\, r\, dr\, d\theta\, dz = \int_{\theta=0}^{\theta=\pi/2} \int_{r=0}^{r=2} \int_{z=0}^{z=4} (zr\sin\theta)\, r\, dz\, dr\, d\theta. \]

The evaluation of the iterated integral is straightforward. The integrand is a product of a function of θ, a function of r, and a function of z, and all the limits are constants, so we can
integrate with respect to each variable separately and multiply the results together. This makes the computation much easier:

\[ \begin{aligned}
\int_{\theta=0}^{\theta=\pi/2} \int_{r=0}^{r=2} \int_{z=0}^{z=4} (zr\sin\theta)\, r\, dz\, dr\, d\theta
&= \left( \int_{0}^{\pi/2} \sin\theta\, d\theta \right) \left( \int_{0}^{2} r^2\, dr \right) \left( \int_{0}^{4} z\, dz \right) \\
&= \left( -\cos\theta \Big|_{0}^{\pi/2} \right) \left( \frac{r^3}{3} \Big|_{0}^{2} \right) \left( \frac{z^2}{2} \Big|_{0}^{4} \right) = (1)\left(\frac{8}{3}\right)(8) = \frac{64}{3}.
\end{aligned} \]
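As a numerical cross-check of the example above, the Riemann-sum definition can be evaluated directly. The following is only a sketch: the function name `cylindrical_riemann_sum`, the choice of midpoints as sample points, and the 40 subdivisions per variable are assumptions made for illustration, not part of the text.

```python
import math

def cylindrical_riemann_sum(f, r_range, theta_range, z_range, l, m, n):
    """Midpoint Riemann sum of f(r, theta, z) * r over a cylindrical box."""
    (a, b), (alpha, beta), (c, d) = r_range, theta_range, z_range
    dr, dtheta, dz = (b - a) / l, (beta - alpha) / m, (d - c) / n
    total = 0.0
    for i in range(l):
        r = a + (i + 0.5) * dr                  # sample value of r in subbox i
        for j in range(m):
            theta = alpha + (j + 0.5) * dtheta  # sample value of theta in subbox j
            for k in range(n):
                z = c + (k + 0.5) * dz          # sample value of z in subbox k
                total += f(r, theta, z) * r     # extra factor r from the volume element
    return total * dr * dtheta * dz

# Integrand z * r * sin(theta) on the box 0 <= r <= 2, 0 <= theta <= pi/2, 0 <= z <= 4.
approx = cylindrical_riemann_sum(
    lambda r, theta, z: z * r * math.sin(theta),
    (0.0, 2.0), (0.0, math.pi / 2), (0.0, 4.0),
    40, 40, 40,
)
exact = 64 / 3
print(approx, exact)
```

With finer subdivisions the sum should move closer to 64/3; dropping the factor `r` in the loop gives a different number, which is a quick way to see why the volume element matters.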
 Exercise 15.7.1:
Evaluate the triple integral

\[ \int_{\theta=0}^{\theta=\pi} \int_{r=0}^{r=1} \int_{z=0}^{z=4} (rz\sin\theta)\, r\, dz\, dr\, d\theta. \]

Hint
Follow the same steps as in the previous example.
Answer
16/3

If the cylindrical region over which we have to integrate is a general solid, we look at the projections onto the coordinate planes.
Hence the triple integral of a continuous function f(r, θ, z) over a general solid region

\[ E = \{(r, \theta, z) \mid (r, \theta) \in D,\ u_1(r, \theta) \le z \le u_2(r, \theta)\} \]

in R³, where D is the projection of E onto the rθ-plane, is

\[ \iiint_E f(r, \theta, z)\, r\, dr\, d\theta\, dz = \iint_D \left[ \int_{u_1(r,\theta)}^{u_2(r,\theta)} f(r, \theta, z)\, dz \right] r\, dr\, d\theta. \]

In particular, if \(D = \{(r, \theta) \mid g_1(\theta) \le r \le g_2(\theta),\ \alpha \le \theta \le \beta\}\), then we have

\[ \iiint_E f(r, \theta, z)\, r\, dr\, d\theta\, dz = \int_{\theta=\alpha}^{\theta=\beta} \int_{r=g_1(\theta)}^{r=g_2(\theta)} \int_{z=u_1(r,\theta)}^{z=u_2(r,\theta)} f(r, \theta, z)\, r\, dz\, dr\, d\theta. \]

Similar formulas exist for projections onto the other coordinate planes. We can use polar coordinates in those planes if necessary.

 Example 15.7.2: Setting up a Triple Integral in Cylindrical Coordinates over a General Region

Consider the region E inside the right circular cylinder with equation r = 2 sin θ, bounded below by the rθ-plane (z = 0) and
bounded above by the sphere with radius 4 centered at the origin (Figure 15.7.3). Set up a triple integral over this region with a
function f(r, θ, z) in cylindrical coordinates.

Figure 15.7.3 : Setting up a triple integral in cylindrical coordinates over a cylindrical region.
Solution
First, identify that the equation for the sphere is r² + z² = 16. We can see that the limits for z are from 0 to z = √(16 − r²).
Then the limits for r are from 0 to r = 2 sin θ. Finally, the limits for θ are from 0 to π. Hence the region is

\[ E = \{(r, \theta, z) \mid 0 \le \theta \le \pi,\ 0 \le r \le 2\sin\theta,\ 0 \le z \le \sqrt{16 - r^2}\}. \]

Therefore, the triple integral is

\[ \iiint_E f(r, \theta, z)\, r\, dz\, dr\, d\theta = \int_{\theta=0}^{\theta=\pi} \int_{r=0}^{r=2\sin\theta} \int_{z=0}^{z=\sqrt{16 - r^2}} f(r, \theta, z)\, r\, dz\, dr\, d\theta. \]
 Exercise 15.7.2:

Consider the region inside the right circular cylinder with equation r = 2 sin θ bounded below by the rθ -plane and bounded
above by z = 4 − y . Set up a triple integral with a function f (r, θ, z) in cylindrical coordinates.

Hint
Analyze the region, and draw a sketch.
Answer

\[ \iiint_E f(r, \theta, z)\, r\, dz\, dr\, d\theta = \int_{\theta=0}^{\theta=\pi} \int_{r=0}^{r=2\sin\theta} \int_{z=0}^{z=4 - r\sin\theta} f(r, \theta, z)\, r\, dz\, dr\, d\theta. \]

 Example 15.7.3: Setting up a Triple Integral in Two Ways


Let E be the region bounded below by the cone z = √(x² + y²) and above by the paraboloid z = 2 − x² − y² (Figure
15.7.4). Set up a triple integral in cylindrical coordinates to find the volume of the region, using the following orders of
integration:
a. dz dr dθ
b. dr dz dθ

Figure 15.7.4 : Setting up a triple integral in cylindrical coordinates over a conical region.
Solution
a. The cone is of radius 1 where it meets the paraboloid. Since z = 2 − x² − y² = 2 − r² and z = √(x² + y²) = r (assuming
r is nonnegative), we have 2 − r² = r. Solving, we have r² + r − 2 = (r + 2)(r − 1) = 0. Since r ≥ 0, we have r = 1.
Therefore z = 1. So the intersection of these two surfaces is a circle of radius 1 in the plane z = 1. The cone is the lower
bound for z and the paraboloid is the upper bound. The projection of the region onto the xy-plane is the disk of radius 1
centered at the origin.
Thus, we can describe the region as E = {(r, θ, z) | 0 ≤ θ ≤ 2π, 0 ≤ r ≤ 1, r ≤ z ≤ 2 − r²}.
Hence the integral for the volume is

\[ V = \int_{\theta=0}^{\theta=2\pi} \int_{r=0}^{r=1} \int_{z=r}^{z=2-r^2} r\, dz\, dr\, d\theta. \]

b. We can also write the cone surface as r = z and the paraboloid as r² = 2 − z. The lower bound for r is zero, but the upper
bound is sometimes the cone and at other times the paraboloid. The plane z = 1 divides the region into two regions. Then
the region can be described as

\[ E = \{(r, \theta, z) \mid 0 \le \theta \le 2\pi,\ 0 \le z \le 1,\ 0 \le r \le z\} \cup \{(r, \theta, z) \mid 0 \le \theta \le 2\pi,\ 1 \le z \le 2,\ 0 \le r \le \sqrt{2 - z}\}. \]

Now the integral for the volume becomes

\[ V = \int_{\theta=0}^{\theta=2\pi} \int_{z=0}^{z=1} \int_{r=0}^{r=z} r\, dr\, dz\, d\theta + \int_{\theta=0}^{\theta=2\pi} \int_{z=1}^{z=2} \int_{r=0}^{r=\sqrt{2-z}} r\, dr\, dz\, d\theta. \]
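The two setups in this example should give the same volume. A rough numerical check is sketched below; the helper `midpoint` and the subdivision counts are illustrative assumptions, and the reference value 5π/6 comes from evaluating the part a. integral by hand, which the example itself does not carry out.

```python
import math

def midpoint(g, lo, hi, n=400):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))

# Neither integrand depends on theta, so the theta-integral is a factor of 2*pi.

# Part a: order dz dr dtheta, with r <= z <= 2 - r^2 and 0 <= r <= 1.
v_a = 2 * math.pi * midpoint(
    lambda r: midpoint(lambda z: r, r, 2 - r * r, 50), 0.0, 1.0)

# Part b: order dr dz dtheta, split at the plane z = 1.
def inner_r(upper):
    """Integral of r dr from 0 to the given upper limit."""
    return midpoint(lambda r: r, 0.0, upper, 50)

v_b = 2 * math.pi * (
    midpoint(lambda z: inner_r(z), 0.0, 1.0)                    # 0 <= r <= z (cone)
    + midpoint(lambda z: inner_r(math.sqrt(2 - z)), 1.0, 2.0))  # 0 <= r <= sqrt(2 - z)

print(v_a, v_b, 5 * math.pi / 6)
```

Agreement between `v_a` and `v_b` is a useful sanity test whenever a region has to be split to change the order of integration.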

 Exercise 15.7.3:

Redo the previous example with the order of integration dθ dz dr .

Hint
Note that θ is independent of r and z .
Answer
E = {(r, θ, z) | 0 ≤ θ ≤ 2π, 0 ≤ r ≤ 1, r ≤ z ≤ 2 − r²}
and

\[ V = \int_{r=0}^{r=1} \int_{z=r}^{z=2-r^2} \int_{\theta=0}^{\theta=2\pi} r\, d\theta\, dz\, dr. \]

 Example 15.7.4: Finding a Volume with Triple Integrals in Two Ways

Let E be the region bounded below by the rθ-plane (z = 0), above by the sphere x² + y² + z² = 4, and on the sides by the cylinder
x² + y² = 1 (Figure 15.7.5). Set up a triple integral in cylindrical coordinates to find the volume of the region using the
following orders of integration, and in each case find the volume and check that the answers are the same:
a. dz dr dθ
b. dr dz dθ .

Figure 15.7.5 : Finding a cylindrical volume with a triple integral in cylindrical coordinates.
Solution
a. Note that the equation for the sphere is

\[ x^2 + y^2 + z^2 = 4 \quad \text{or} \quad r^2 + z^2 = 4 \]

and the equation for the cylinder is

\[ x^2 + y^2 = 1 \quad \text{or} \quad r = 1. \]

Thus, we have for the region E

\[ E = \{(r, \theta, z) \mid 0 \le z \le \sqrt{4 - r^2},\ 0 \le r \le 1,\ 0 \le \theta \le 2\pi\}. \]

Hence the integral for the volume is

\[ \begin{aligned}
V(E) &= \int_{\theta=0}^{\theta=2\pi} \int_{r=0}^{r=1} \int_{z=0}^{z=\sqrt{4-r^2}} r\, dz\, dr\, d\theta \\
&= \int_{\theta=0}^{\theta=2\pi} \int_{r=0}^{r=1} \Big[ rz \Big]_{z=0}^{z=\sqrt{4-r^2}}\, dr\, d\theta = \int_{\theta=0}^{\theta=2\pi} \int_{r=0}^{r=1} r\sqrt{4 - r^2}\, dr\, d\theta \\
&= \int_{0}^{2\pi} \left( \frac{8}{3} - \sqrt{3} \right) d\theta = 2\pi \left( \frac{8}{3} - \sqrt{3} \right) \text{ cubic units.}
\end{aligned} \]

b. Since the sphere is x² + y² + z² = 4, which is r² + z² = 4, and the cylinder is x² + y² = 1, which is r = 1, we have
1 + z² = 4, that is, z² = 3. Thus we have two regions, since the sphere and the cylinder intersect at (1, √3) in the rz-plane:

\[ E_1 = \{(r, \theta, z) \mid 0 \le r \le \sqrt{4 - z^2},\ \sqrt{3} \le z \le 2,\ 0 \le \theta \le 2\pi\} \]

and

\[ E_2 = \{(r, \theta, z) \mid 0 \le r \le 1,\ 0 \le z \le \sqrt{3},\ 0 \le \theta \le 2\pi\}. \]

Hence the integral for the volume is

\[ \begin{aligned}
V(E) &= \int_{\theta=0}^{\theta=2\pi} \int_{z=\sqrt{3}}^{z=2} \int_{r=0}^{r=\sqrt{4-z^2}} r\, dr\, dz\, d\theta + \int_{\theta=0}^{\theta=2\pi} \int_{z=0}^{z=\sqrt{3}} \int_{r=0}^{r=1} r\, dr\, dz\, d\theta \\
&= \left( \frac{16}{3} - 3\sqrt{3} \right)\pi + \sqrt{3}\,\pi = 2\pi \left( \frac{8}{3} - \sqrt{3} \right) \text{ cubic units.}
\end{aligned} \]
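The solution to part a. passes from the r-integral to 8/3 − √3 without showing the intermediate step. One way to fill it in, as a sketch that assumes the substitution u = 4 − r² (a choice made here, not stated in the text), is:

```latex
\[
\int_{0}^{1} r\sqrt{4-r^{2}}\,dr
  = \frac{1}{2}\int_{3}^{4} u^{1/2}\,du
  = \frac{1}{3}\Bigl[u^{3/2}\Bigr]_{3}^{4}
  = \frac{1}{3}\bigl(8-3\sqrt{3}\bigr)
  = \frac{8}{3}-\sqrt{3}.
\]
```

Here du = −2r dr, and the limits r = 0 and r = 1 become u = 4 and u = 3; reversing them absorbs the minus sign.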

 Exercise 15.7.4
Redo the previous example with the order of integration dθ dz dr .

Hint
A figure can be helpful. Note that θ is independent of r and z .
Answer
E = {(r, θ, z) | 0 ≤ θ ≤ 2π, 0 ≤ r ≤ 1, 0 ≤ z ≤ √(4 − r²)} and

\[ V = \int_{r=0}^{r=1} \int_{z=0}^{z=\sqrt{4-r^2}} \int_{\theta=0}^{\theta=2\pi} r\, d\theta\, dz\, dr. \]

Review of Spherical Coordinates


In three-dimensional space R³ in the spherical coordinate system, we specify a point P by its distance ρ from the origin, the polar
angle θ from the positive x-axis (same as in the cylindrical coordinate system), and the angle φ between the positive z-axis and the
line OP (Figure 15.7.6). Note that ρ ≥ 0 and 0 ≤ φ ≤ π. (Refer to Cylindrical and Spherical Coordinates for a review.) Spherical
coordinates are useful for triple integrals over regions that are symmetric with respect to the origin.

Figure 15.7.6 : The spherical coordinate system locates points with two angles and a distance from the origin.
Recall the relationships that connect rectangular coordinates with spherical coordinates.
From spherical coordinates to rectangular coordinates:

x = ρ sin φ cos θ, y = ρ sin φ sin θ, and z = ρ cos φ.

From rectangular coordinates to spherical coordinates:

\[ \rho^2 = x^2 + y^2 + z^2, \quad \tan\theta = \frac{y}{x}, \quad \varphi = \arccos\left( \frac{z}{\sqrt{x^2 + y^2 + z^2}} \right). \]

Other relationships that are important to know for conversions are


\[ r = \rho\sin\varphi, \quad \theta = \theta, \quad z = \rho\cos\varphi \]

These equations are used to convert from spherical coordinates to cylindrical coordinates.

and

\[ \rho = \sqrt{r^2 + z^2}, \quad \theta = \theta, \quad \varphi = \arccos\left( \frac{z}{\sqrt{r^2 + z^2}} \right) \]

These equations are used to convert from cylindrical coordinates to spherical coordinates.
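The conversion formulas above translate directly into code. The sketch below uses hypothetical function names and makes two assumptions that go slightly beyond the text: `math.atan2` is used so that θ lands in the correct quadrant (tan θ = y/x alone does not determine it), and φ is set to 0 at the origin, where it is undefined.

```python
import math

def spherical_to_rectangular(rho, theta, phi):
    """x = rho sin(phi) cos(theta), y = rho sin(phi) sin(theta), z = rho cos(phi)."""
    x = rho * math.sin(phi) * math.cos(theta)
    y = rho * math.sin(phi) * math.sin(theta)
    z = rho * math.cos(phi)
    return x, y, z

def rectangular_to_spherical(x, y, z):
    """Inverse conversion; theta is returned in (-pi, pi], phi in [0, pi]."""
    rho = math.sqrt(x * x + y * y + z * z)
    theta = math.atan2(y, x)                       # quadrant-aware tan(theta) = y / x
    phi = math.acos(z / rho) if rho > 0 else 0.0   # phi is arbitrary at the origin
    return rho, theta, phi

def spherical_to_cylindrical(rho, theta, phi):
    """r = rho sin(phi), theta unchanged, z = rho cos(phi)."""
    return rho * math.sin(phi), theta, rho * math.cos(phi)

start = (2.0, 3 * math.pi / 4, math.pi / 3)        # (rho, theta, phi)
xyz = spherical_to_rectangular(*start)
round_trip = rectangular_to_spherical(*xyz)
cyl = spherical_to_cylindrical(*start)
print(xyz)
print(round_trip)
print(cyl)
```

The round trip should reproduce the starting triple up to floating-point error, which is a simple way to check that the two sets of formulas are consistent.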

Figure 15.7.7 shows a few solid regions that are convenient to express in spherical coordinates.

Figure 15.7.7 : Spherical coordinates are especially convenient for working with solids bounded by these types of surfaces. (The
letter c indicates a constant.)

Integration in Spherical Coordinates


We now establish a triple integral in the spherical coordinate system, as we did before in the cylindrical coordinate system. Let the
function f(ρ, θ, φ) be continuous in a bounded spherical box, B = {(ρ, θ, φ) | a ≤ ρ ≤ b, α ≤ θ ≤ β, γ ≤ φ ≤ ψ}. We then
divide each interval into l, m, and n subdivisions such that Δρ = (b − a)/l, Δθ = (β − α)/m, and Δφ = (ψ − γ)/n. Now we can state the
following definition for triple integrals in spherical coordinates, with \((\rho^*_{ijk}, \theta^*_{ijk}, \varphi^*_{ijk})\) being any sample point in the spherical subbox
\(B_{ijk}\). For the volume element of the subbox ΔV in spherical coordinates, we have ΔV = (Δρ)(ρ Δφ)(ρ sin φ Δθ), as shown in
the following figure.

Figure 15.7.8 : The volume element of a box in spherical coordinates.

 Definition: triple integral in spherical coordinates

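The triple integral in spherical coordinates is the limit of a triple Riemann sum,

\[ \lim_{l,m,n\to\infty} \sum_{i=1}^{l} \sum_{j=1}^{m} \sum_{k=1}^{n} f(\rho^*_{ijk}, \theta^*_{ijk}, \varphi^*_{ijk})\, (\rho^*_{ijk})^2 \sin\varphi^*_{ijk}\, \Delta\rho\, \Delta\theta\, \Delta\varphi \]

provided the limit exists.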

As with the other multiple integrals we have examined, all the properties work similarly for a triple integral in the spherical
coordinate system, and so do the iterated integrals. Fubini’s theorem takes the following form.

 Theorem: Fubini’s Theorem for Spherical Coordinates

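If f(ρ, θ, φ) is continuous on a spherical solid box B = [a, b] × [α, β] × [γ, ψ], then

\[ \iiint_B f(\rho, \theta, \varphi)\, \rho^2 \sin\varphi\, d\rho\, d\varphi\, d\theta = \int_{\theta=\alpha}^{\theta=\beta} \int_{\varphi=\gamma}^{\varphi=\psi} \int_{\rho=a}^{\rho=b} f(\rho, \theta, \varphi)\, \rho^2 \sin\varphi\, d\rho\, d\varphi\, d\theta. \]

This iterated integral may be replaced by other iterated integrals by integrating with respect to the three variables in other
orders.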

As stated before, spherical coordinate systems work well for solids that are symmetric around a point, such as spheres and cones.
Let us look at some examples before we consider triple integrals in spherical coordinates on general spherical regions.

 Example 15.7.5: Evaluating a Triple Integral in Spherical Coordinates

Evaluate the iterated triple integral

\[ \int_{\theta=0}^{\theta=2\pi} \int_{\varphi=0}^{\varphi=\pi/2} \int_{\rho=0}^{\rho=1} \rho^2 \sin\varphi\, d\rho\, d\varphi\, d\theta. \]

Solution
As before, the integrand is a product of single-variable functions and the limits are constants, so we can integrate
each piece and multiply:

\[ \int_{0}^{2\pi} \int_{0}^{\pi/2} \int_{0}^{1} \rho^2 \sin\varphi\, d\rho\, d\varphi\, d\theta = \int_{0}^{2\pi} d\theta \int_{0}^{\pi/2} \sin\varphi\, d\varphi \int_{0}^{1} \rho^2\, d\rho = (2\pi)(1)\left( \frac{1}{3} \right) = \frac{2\pi}{3}. \]
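The value 2π/3 can also be approached from the Riemann-sum definition of the triple integral in spherical coordinates. The sketch below is illustrative only: `spherical_riemann_sum`, the midpoint sample points, and the 30 subdivisions per variable are assumptions, not something prescribed by the text.

```python
import math

def spherical_riemann_sum(f, rho_range, theta_range, phi_range, l, m, n):
    """Midpoint Riemann sum of f(rho, theta, phi) * rho^2 * sin(phi) over a spherical box."""
    (a, b), (alpha, beta), (gamma, psi) = rho_range, theta_range, phi_range
    d_rho, d_theta, d_phi = (b - a) / l, (beta - alpha) / m, (psi - gamma) / n
    total = 0.0
    for i in range(l):
        rho = a + (i + 0.5) * d_rho
        for j in range(m):
            theta = alpha + (j + 0.5) * d_theta
            for k in range(n):
                phi = gamma + (k + 0.5) * d_phi
                # rho^2 sin(phi) is the spherical volume factor
                total += f(rho, theta, phi) * rho ** 2 * math.sin(phi)
    return total * d_rho * d_theta * d_phi

# f = 1 on 0 <= rho <= 1, 0 <= theta <= 2*pi, 0 <= phi <= pi/2 (upper half of the unit ball).
approx = spherical_riemann_sum(
    lambda rho, theta, phi: 1.0,
    (0.0, 1.0), (0.0, 2 * math.pi), (0.0, math.pi / 2),
    30, 30, 30,
)
print(approx, 2 * math.pi / 3)
```

Since the integrand is 1, the sum approximates the volume of the upper half of the unit ball, which is consistent with half of 4π/3.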
The concept of triple integration in spherical coordinates can be extended to integration over a general solid, using the
projections onto the coordinate planes. Note that dV and dA mean the increments in volume and area, respectively. The
variables V and A are used as the variables for integration to express the integrals.
The triple integral of a continuous function f(ρ, θ, φ) over a general solid region

\[ E = \{(\rho, \theta, \varphi) \mid (\rho, \theta) \in D,\ u_1(\rho, \theta) \le \varphi \le u_2(\rho, \theta)\} \]

in R³, where D is the projection of E onto the ρθ-plane, is

\[ \iiint_E f(\rho, \theta, \varphi)\, dV = \iint_D \left[ \int_{u_1(\rho,\theta)}^{u_2(\rho,\theta)} f(\rho, \theta, \varphi)\, d\varphi \right] dA. \]

In particular, if \(D = \{(\rho, \theta) \mid g_1(\theta) \le \rho \le g_2(\theta),\ \alpha \le \theta \le \beta\}\), then we have

\[ \iiint_E f(\rho, \theta, \varphi)\, dV = \int_{\alpha}^{\beta} \int_{g_1(\theta)}^{g_2(\theta)} \int_{u_1(\rho,\theta)}^{u_2(\rho,\theta)} f(\rho, \theta, \varphi)\, \rho^2 \sin\varphi\, d\varphi\, d\rho\, d\theta. \]

Similar formulas occur for projections onto the other coordinate planes.

 Example 15.7.6: Setting up a Triple Integral in Spherical Coordinates


Set up an integral for the volume of the region bounded by the cone z = √(3(x² + y²)) and the hemisphere
z = √(4 − x² − y²) (see the figure below).

Figure 15.7.9 : A region bounded below by a cone and above by a hemisphere.

Solution
Using the conversion formulas from rectangular coordinates to spherical coordinates, we have:
For the cone: z = √(3(x² + y²)) or ρ cos φ = √3 ρ sin φ or tan φ = 1/√3 or φ = π/6.
For the sphere: z = √(4 − x² − y²) or z² + x² + y² = 4 or ρ² = 4 or ρ = 2.
Thus, the triple integral for the volume is

\[ V(E) = \int_{\theta=0}^{\theta=2\pi} \int_{\varphi=0}^{\varphi=\pi/6} \int_{\rho=0}^{\rho=2} \rho^2 \sin\varphi\, d\rho\, d\varphi\, d\theta. \]

 Exercise 15.7.5
Set up a triple integral for the volume of the solid region bounded above by the sphere ρ = 2 and bounded below by the cone
φ = π/3.

Hint
Follow the steps of the previous example.

Answer

\[ V(E) = \int_{\theta=0}^{\theta=2\pi} \int_{\varphi=0}^{\varphi=\pi/3} \int_{\rho=0}^{\rho=2} \rho^2 \sin\varphi\, d\rho\, d\varphi\, d\theta \]

 Example 15.7.7: Interchanging Order of Integration in Spherical Coordinates


−−−−−−
Let E be the region bounded below by the cone z = √(x² + y²) and above by the sphere z = x² + y² + z² (Figure 15.7.10).
Set up a triple integral in spherical coordinates and find the volume of the region using the following orders of integration:
a. dρ dφ dθ
b. dφ dρ dθ

Figure 15.7.10: A region bounded below by a cone and above by a sphere.


Solution
a. Use the conversion formulas to write the equations of the sphere and cone in spherical coordinates.
For the sphere:

\[ \begin{aligned}
x^2 + y^2 + z^2 &= z \\
\rho^2 &= \rho\cos\varphi \\
\rho &= \cos\varphi.
\end{aligned} \]

For the cone:

\[ \begin{aligned}
z &= \sqrt{x^2 + y^2} \\
\rho\cos\varphi &= \sqrt{\rho^2 \sin^2\varphi \cos^2\theta + \rho^2 \sin^2\varphi \sin^2\theta} \\
\rho\cos\varphi &= \sqrt{\rho^2 \sin^2\varphi\, (\cos^2\theta + \sin^2\theta)} \\
\rho\cos\varphi &= \rho\sin\varphi \\
\cos\varphi &= \sin\varphi \\
\varphi &= \pi/4.
\end{aligned} \]

Hence the integral for the volume of the solid region E becomes

\[ V(E) = \int_{\theta=0}^{\theta=2\pi} \int_{\varphi=0}^{\varphi=\pi/4} \int_{\rho=0}^{\rho=\cos\varphi} \rho^2 \sin\varphi\, d\rho\, d\varphi\, d\theta. \]

b. Consider the φρ-plane. Note that the ranges for φ and ρ (from part a.) are

\[ 0 \le \varphi \le \pi/4, \quad 0 \le \rho \le \cos\varphi. \]

The curve ρ = cos φ meets the line φ = π/4 at the point (π/4, √2/2). Thus, to change the order of integration, we need to
split the range of ρ into 0 ≤ ρ ≤ √2/2 and √2/2 ≤ ρ ≤ 1 and use two pieces:

\[ 0 \le \rho \le \sqrt{2}/2, \quad 0 \le \varphi \le \pi/4 \]

and

\[ \sqrt{2}/2 \le \rho \le 1, \quad 0 \le \varphi \le \cos^{-1}\rho. \]

Hence the integral for the volume of the solid region E becomes

\[ V(E) = \int_{\theta=0}^{\theta=2\pi} \int_{\rho=0}^{\rho=\sqrt{2}/2} \int_{\varphi=0}^{\varphi=\pi/4} \rho^2 \sin\varphi\, d\varphi\, d\rho\, d\theta + \int_{\theta=0}^{\theta=2\pi} \int_{\rho=\sqrt{2}/2}^{\rho=1} \int_{\varphi=0}^{\varphi=\cos^{-1}\rho} \rho^2 \sin\varphi\, d\varphi\, d\rho\, d\theta. \]

In each case, the integration results in V(E) = π/8.
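The claim that both orders of integration give π/8 can be checked numerically. The following sketch assumes a simple nested midpoint rule (the helper `midpoint` and the subdivision count are illustrative choices) and uses the fact that neither integrand depends on θ, so the θ-integral contributes a factor of 2π.

```python
import math

def midpoint(g, lo, hi, n=200):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))

# Part a: order d(rho) d(phi) d(theta), with 0 <= rho <= cos(phi), 0 <= phi <= pi/4.
def inner_rho(phi):
    return midpoint(lambda rho: rho ** 2 * math.sin(phi), 0.0, math.cos(phi))

v_a = 2 * math.pi * midpoint(inner_rho, 0.0, math.pi / 4)

# Part b: order d(phi) d(rho) d(theta), split at rho = sqrt(2)/2.
def inner_phi(rho, phi_max):
    return midpoint(lambda phi: rho ** 2 * math.sin(phi), 0.0, phi_max)

split = math.sqrt(2) / 2
v_b = 2 * math.pi * (
    midpoint(lambda rho: inner_phi(rho, math.pi / 4), 0.0, split)        # phi up to pi/4
    + midpoint(lambda rho: inner_phi(rho, math.acos(rho)), split, 1.0))  # phi up to arccos(rho)

print(v_a, v_b, math.pi / 8)
```

The second piece in part b uses `math.acos(rho)` as the upper limit for φ, which is the curve ρ = cos φ solved for φ.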

Before we end this section, we present a couple of examples that can illustrate the conversion from rectangular coordinates to
cylindrical coordinates and from rectangular coordinates to spherical coordinates.

 Example 15.7.8: Converting from Rectangular Coordinates to Cylindrical Coordinates

Convert the following integral into cylindrical coordinates:


\[ \int_{y=-1}^{y=1} \int_{x=0}^{x=\sqrt{1-y^2}} \int_{z=x^2+y^2}^{z=\sqrt{x^2+y^2}} xyz\, dz\, dx\, dy. \]

Solution
The ranges of the variables are
\[ \begin{aligned}
-1 &\le y \le 1 \\
0 &\le x \le \sqrt{1 - y^2} \\
x^2 + y^2 &\le z \le \sqrt{x^2 + y^2}.
\end{aligned} \]

The first two inequalities describe the right half of a disk of radius 1. Therefore, the ranges for θ and r are

\[ -\frac{\pi}{2} \le \theta \le \frac{\pi}{2} \quad \text{and} \quad 0 \le r \le 1. \]

The limits of z are r² ≤ z ≤ r, hence

\[ \int_{y=-1}^{y=1} \int_{x=0}^{x=\sqrt{1-y^2}} \int_{z=x^2+y^2}^{z=\sqrt{x^2+y^2}} xyz\, dz\, dx\, dy = \int_{\theta=-\pi/2}^{\theta=\pi/2} \int_{r=0}^{r=1} \int_{z=r^2}^{z=r} r\,(r\cos\theta)(r\sin\theta)\, z\, dz\, dr\, d\theta. \]

 Example 15.7.9: Converting from Rectangular Coordinates to Spherical Coordinates


Convert the following integral into spherical coordinates:
\[ \int_{y=0}^{y=3} \int_{x=0}^{x=\sqrt{9-y^2}} \int_{z=\sqrt{x^2+y^2}}^{z=\sqrt{18-x^2-y^2}} (x^2 + y^2 + z^2)\, dz\, dx\, dy. \]

Solution
The ranges of the variables are
\[ \begin{aligned}
0 &\le y \le 3 \\
0 &\le x \le \sqrt{9 - y^2} \\
\sqrt{x^2 + y^2} &\le z \le \sqrt{18 - x^2 - y^2}.
\end{aligned} \]

The first two ranges of variables describe a quarter disk in the first quadrant of the xy-plane. Hence the range for θ is
0 ≤ θ ≤ π/2.
The lower bound z = √(x² + y²) is the upper half of a cone and the upper bound z = √(18 − x² − y²) is the upper half of a
sphere. Therefore, we have 0 ≤ ρ ≤ √18, which is 0 ≤ ρ ≤ 3√2.
For the ranges of φ we need to find where the cone and the sphere intersect, so solve the equation

\[ \begin{aligned}
r^2 + z^2 &= 18 \\
\left( \sqrt{x^2 + y^2} \right)^2 + z^2 &= 18 \\
z^2 + z^2 &= 18 \\
2z^2 &= 18 \\
z^2 &= 9 \\
z &= 3.
\end{aligned} \]

This gives

\[ \begin{aligned}
3\sqrt{2}\cos\varphi &= 3 \\
\cos\varphi &= \frac{1}{\sqrt{2}} \\
\varphi &= \frac{\pi}{4}.
\end{aligned} \]

Putting this together, we obtain

\[ \int_{y=0}^{y=3} \int_{x=0}^{x=\sqrt{9-y^2}} \int_{z=\sqrt{x^2+y^2}}^{z=\sqrt{18-x^2-y^2}} (x^2 + y^2 + z^2)\, dz\, dx\, dy = \int_{\varphi=0}^{\varphi=\pi/4} \int_{\theta=0}^{\theta=\pi/2} \int_{\rho=0}^{\rho=3\sqrt{2}} \rho^4 \sin\varphi\, d\rho\, d\theta\, d\varphi. \]
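The example stops at the setup. As a sketch of why the spherical form is convenient, the converted integral separates into three one-variable integrals; the evaluation below is not part of the original example and is offered only as a check:

```latex
\[
\int_{0}^{\pi/4}\sin\varphi\,d\varphi\;
\int_{0}^{\pi/2}d\theta\;
\int_{0}^{3\sqrt{2}}\rho^{4}\,d\rho
= \Bigl(1-\frac{\sqrt{2}}{2}\Bigr)\Bigl(\frac{\pi}{2}\Bigr)\Bigl(\frac{972\sqrt{2}}{5}\Bigr)
= \frac{486\pi\,(\sqrt{2}-1)}{5}.
\]
```

The factor ρ⁴ arises because the integrand x² + y² + z² = ρ² is multiplied by the volume factor ρ² sin φ, and (3√2)⁵ = 972√2.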

 Exercise 15.7.6:

Use rectangular, cylindrical, and spherical coordinates to set up triple integrals for finding the volume of the region inside the
sphere x² + y² + z² = 4 but outside the cylinder x² + y² = 1.

Answer: Rectangular

\[ \int_{x=-2}^{x=2} \int_{y=-\sqrt{4-x^2}}^{y=\sqrt{4-x^2}} \int_{z=-\sqrt{4-x^2-y^2}}^{z=\sqrt{4-x^2-y^2}} dz\, dy\, dx - \int_{x=-1}^{x=1} \int_{y=-\sqrt{1-x^2}}^{y=\sqrt{1-x^2}} \int_{z=-\sqrt{4-x^2-y^2}}^{z=\sqrt{4-x^2-y^2}} dz\, dy\, dx. \]

Answer: Cylindrical

\[ \int_{\theta=0}^{\theta=2\pi} \int_{r=1}^{r=2} \int_{z=-\sqrt{4-r^2}}^{z=\sqrt{4-r^2}} r\, dz\, dr\, d\theta. \]

Answer: Spherical

\[ \int_{\varphi=\pi/6}^{\varphi=5\pi/6} \int_{\theta=0}^{\theta=2\pi} \int_{\rho=\csc\varphi}^{\rho=2} \rho^2 \sin\varphi\, d\rho\, d\theta\, d\varphi. \]

Now that we are familiar with the spherical coordinate system, let’s find the volume of some known geometric figures, such as
spheres and ellipsoids.

 Example 15.7.10: Chapter Opener: Finding the Volume of l’Hemisphèric

Find the volume of the spherical planetarium in l’Hemisphèric in Valencia, Spain, which is five stories tall and has a radius of
approximately 50 ft, using the equation x² + y² + z² = r².
Figure 15.7.11: (credit: modification of work by Javier Yaya Tur, Wikimedia Commons)
Solution
We calculate the volume of the ball in the first octant, where x ≥ 0, y ≥ 0, and z ≥ 0, using spherical coordinates, and then
multiply the result by 8 for symmetry. Since we consider the region D as the first octant in the integral, the ranges of the
variables are

\[ 0 \le \varphi \le \frac{\pi}{2}, \quad 0 \le \rho \le r, \quad 0 \le \theta \le \frac{\pi}{2}. \]

Therefore,

\[ \begin{aligned}
V = 8 \iiint_D dx\, dy\, dz &= 8 \int_{\theta=0}^{\theta=\pi/2} \int_{\rho=0}^{\rho=r} \int_{\varphi=0}^{\varphi=\pi/2} \rho^2 \sin\varphi\, d\varphi\, d\rho\, d\theta \\
&= 8 \int_{\theta=0}^{\theta=\pi/2} d\theta \int_{\rho=0}^{\rho=r} \rho^2\, d\rho \int_{\varphi=0}^{\varphi=\pi/2} \sin\varphi\, d\varphi \\
&= 8 \left( \frac{\pi}{2} \right) \left( \frac{r^3}{3} \right) (1) \\
&= \frac{4}{3}\pi r^3.
\end{aligned} \]

This matches the known formula for the volume of a sphere. So for a sphere with a radius of approximately 50 ft, the volume is
(4/3)π(50)³ ≈ 523,600 ft³.

For the next example we find the volume of an ellipsoid.

 Example 15.7.11: Finding the Volume of an Ellipsoid


Find the volume of the ellipsoid \(\dfrac{x^2}{a^2} + \dfrac{y^2}{b^2} + \dfrac{z^2}{c^2} = 1\).
Solution
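We again use symmetry and evaluate the volume of the ellipsoid using spherical coordinates. As before, we use the first octant
x ≥ 0, y ≥ 0, and z ≥ 0 and then multiply the result by 8.

In this case the ranges of the variables are

\[ 0 \le \varphi \le \frac{\pi}{2}, \quad 0 \le \rho \le 1, \quad \text{and} \quad 0 \le \theta \le \frac{\pi}{2}. \]

Also, we need to change the rectangular to spherical coordinates in this way:

\[ x = a\rho\sin\varphi\cos\theta, \quad y = b\rho\sin\varphi\sin\theta, \quad \text{and} \quad z = c\rho\cos\varphi. \]

Under this substitution each coordinate is scaled by a, b, or c, so the volume element becomes dx dy dz = abc ρ² sin φ dρ dφ dθ.
Then the volume of the ellipsoid becomes

\[ \begin{aligned}
V = 8 \iiint_D dx\, dy\, dz &= 8 \int_{\theta=0}^{\theta=\pi/2} \int_{\rho=0}^{\rho=1} \int_{\varphi=0}^{\varphi=\pi/2} abc\, \rho^2 \sin\varphi\, d\varphi\, d\rho\, d\theta \\
&= 8abc \int_{\theta=0}^{\theta=\pi/2} d\theta \int_{\rho=0}^{\rho=1} \rho^2\, d\rho \int_{\varphi=0}^{\varphi=\pi/2} \sin\varphi\, d\varphi \\
&= 8abc \left( \frac{\pi}{2} \right) \left( \frac{1}{3} \right) (1) \\
&= \frac{4}{3}\pi abc.
\end{aligned} \]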

 Example 15.7.12: Finding the Volume of the Space Inside an Ellipsoid and Outside a Sphere
Find the volume of the space inside the ellipsoid \(\dfrac{x^2}{75^2} + \dfrac{y^2}{80^2} + \dfrac{z^2}{90^2} = 1\) and outside the sphere x² + y² + z² = 50².

Solution
This problem is directly related to the l’Hemisphèric structure. The volume of space inside the ellipsoid and outside the sphere
might be useful to find the expense of heating or cooling that space. We can use the preceding two examples for the volume of
the sphere and ellipsoid and then subtract.
First we find the volume of the ellipsoid using a = 75 ft, b = 80 ft, and c = 90 ft in the result from Example 15.7.11. Hence the
volume of the ellipsoid is

\[ V_{\text{ellipsoid}} = \frac{4}{3}\pi(75)(80)(90) \approx 2{,}262{,}000 \text{ ft}^3. \]

From Example 15.7.10, the volume of the sphere is

\[ V_{\text{sphere}} \approx 523{,}600 \text{ ft}^3. \]

Therefore, the volume of the space inside the ellipsoid \(\dfrac{x^2}{75^2} + \dfrac{y^2}{80^2} + \dfrac{z^2}{90^2} = 1\) and outside the sphere x² + y² + z² = 50² is
approximately

\[ V_{\text{Hemispheric}} = V_{\text{ellipsoid}} - V_{\text{sphere}} \approx 1{,}738{,}400 \text{ ft}^3. \]
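The arithmetic in this example is easy to reproduce. The short sketch below uses a hypothetical helper `ellipsoid_volume` built on the formula V = (4/3)πabc from the previous example, treating the sphere as the special case a = b = c = 50.

```python
import math

def ellipsoid_volume(a, b, c):
    """Volume of the ellipsoid x^2/a^2 + y^2/b^2 + z^2/c^2 = 1, i.e. (4/3) * pi * a * b * c."""
    return 4.0 / 3.0 * math.pi * a * b * c

v_ellipsoid = ellipsoid_volume(75, 80, 90)   # semi-axes in feet
v_sphere = ellipsoid_volume(50, 50, 50)      # sphere of radius 50 ft
v_between = v_ellipsoid - v_sphere

print(round(v_ellipsoid), round(v_sphere), round(v_between))
```

Working with unrounded values gives a difference slightly below the figure obtained by subtracting the two rounded volumes, which is why the result is best read as an approximation.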

 Student Project: Hot air balloons

Hot air ballooning is a relaxing, peaceful pastime that many people enjoy. Many balloonist gatherings take place around the
world, such as the Albuquerque International Balloon Fiesta. The Albuquerque event is the largest hot air balloon festival in
the world, with over 500 balloons participating each year.

Figure 15.7.12: Balloons lift off at the 2001 Albuquerque International Balloon Fiesta. (credit: David Herrera, Flickr)
As the name implies, hot air balloons use hot air to generate lift. (Hot air is less dense than cooler air, so the balloon floats as
long as the hot air stays hot.) The heat is generated by a propane burner suspended below the opening of the basket. Once the
balloon takes off, the pilot controls the altitude of the balloon, either by using the burner to heat the air and ascend or by using
a vent near the top of the balloon to release heated air and descend. The pilot has very little control over where the balloon

goes, however—balloons are at the mercy of the winds. The uncertainty over where we will end up is one of the reasons
balloonists are attracted to the sport.
In this project we use triple integrals to learn more about hot air balloons. We model the balloon in two pieces. The top of the
balloon is modeled by a half sphere of radius 28 feet. The bottom of the balloon is modeled by a frustum of a cone (think of an ice cream cone with the pointy end cut off). The
radius of the large end of the frustum is 28 feet and the radius of the small end of the frustum is 6 feet. A graph of our balloon
model and a cross-sectional diagram showing the dimensions are shown in the following figure.

Figure 15.7.13: (a) Use a half sphere to model the top part of the balloon and a frustum of a cone to model the bottom part of
the balloon. (b) A cross section of the balloon showing its dimensions.
We first want to find the volume of the balloon. If we look at the top part and the bottom part of the balloon separately, we see
that they are geometric solids with known volume formulas. However, it is still worthwhile to set up and evaluate the integrals
we would need to find the volume. If we calculate the volume using integration, we can use the known volume formulas to
check our answers. This will help ensure that we have the integrals set up correctly for the later, more complicated stages of the
project.
1. Find the volume of the balloon in two ways.
a. Use triple integrals to calculate the volume. Consider each part of the balloon separately. (Consider using spherical
coordinates for the top part and cylindrical coordinates for the bottom part.)
b. Verify the answer using the formulas for the volume of a sphere, V = (4/3)πr³, and for the volume of a cone, V = (1/3)πr²h.
In reality, calculating the temperature at a point inside the balloon is a tremendously complicated endeavor. In fact, an entire
branch of physics (thermodynamics) is devoted to studying heat and temperature. For the purposes of this project, however, we
are going to make some simplifying assumptions about how temperature varies from point to point within the balloon. Assume
that just prior to liftoff, the temperature (in degrees Fahrenheit) of the air inside the balloon varies according to the function
\[ T_0(r, \theta, z) = \frac{z - r}{10} + 210. \]

2. What is the average temperature of the air in the balloon just prior to liftoff? (Again, look at each part of the balloon
separately, and do not forget to convert the function into spherical coordinates when looking at the top part of the balloon.)
Now the pilot activates the burner for 10 seconds. This action affects the temperature in a 12-foot-wide column 20 feet high,
directly above the burner. A cross section of the balloon depicting this column is shown in the following figure.
Figure 15.7.14: Activating the burner heats the air in a 20-foot-high, 12-foot-wide column directly above the burner.
Assume that after the pilot activates the burner for 10 seconds, the temperature of the air in the column described above
increases according to the formula

H (r, θ, z) = −2z − 48.

Then the temperature of the air in the column is given by


\[ T_1(r, \theta, z) = \frac{z - r}{10} + 210 + (-2z - 48), \]

while the temperature in the remainder of the balloon is still given by


\[ T_0(r, \theta, z) = \frac{z - r}{10} + 210. \]

3. Find the average temperature of the air in the balloon after the pilot has activated the burner for 10 seconds.

Key Concepts
To evaluate a triple integral in cylindrical coordinates, use the iterated integral
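
\[ \int_{\theta=\alpha}^{\theta=\beta} \int_{r=g_1(\theta)}^{r=g_2(\theta)} \int_{z=u_1(r,\theta)}^{z=u_2(r,\theta)} f(r, \theta, z)\, r\, dz\, dr\, d\theta. \]

To evaluate a triple integral in spherical coordinates, use the iterated integral

\[ \int_{\theta=\alpha}^{\theta=\beta} \int_{\rho=g_1(\theta)}^{\rho=g_2(\theta)} \int_{\varphi=u_1(\rho,\theta)}^{\varphi=u_2(\rho,\theta)} f(\rho, \theta, \varphi)\, \rho^2 \sin\varphi\, d\varphi\, d\rho\, d\theta. \]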

Key Equations
Triple integral in cylindrical coordinates

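\[ \iiint_B g(x, y, z)\, dV = \iiint_B g(r\cos\theta, r\sin\theta, z)\, r\, dr\, d\theta\, dz = \iiint_B f(r, \theta, z)\, r\, dr\, d\theta\, dz \]

Triple integral in spherical coordinates

\[ \iiint_B f(\rho, \theta, \varphi)\, \rho^2 \sin\varphi\, d\rho\, d\varphi\, d\theta = \int_{\theta=\alpha}^{\theta=\beta} \int_{\varphi=\gamma}^{\varphi=\psi} \int_{\rho=a}^{\rho=b} f(\rho, \theta, \varphi)\, \rho^2 \sin\varphi\, d\rho\, d\varphi\, d\theta \]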

Glossary
triple integral in cylindrical coordinates
the limit of a triple Riemann sum, provided the following limit exists:

\[ \lim_{l,m,n\to\infty} \sum_{i=1}^{l} \sum_{j=1}^{m} \sum_{k=1}^{n} f(r^*_{ijk}, \theta^*_{ijk}, z^*_{ijk})\, r^*_{ijk}\, \Delta r\, \Delta\theta\, \Delta z \]

triple integral in spherical coordinates
the limit of a triple Riemann sum, provided the following limit exists:

\[ \lim_{l,m,n\to\infty} \sum_{i=1}^{l} \sum_{j=1}^{m} \sum_{k=1}^{n} f(\rho^*_{ijk}, \theta^*_{ijk}, \varphi^*_{ijk})\, (\rho^*_{ijk})^2 \sin\varphi^*_{ijk}\, \Delta\rho\, \Delta\theta\, \Delta\varphi \]

15.7: Triple Integrals in Cylindrical Coordinates is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
15.5: Triple Integrals in Cylindrical and Spherical Coordinates by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0.
Original source: https://openstax.org/details/books/calculus-volume-1.

15.8: Triple Integrals in Spherical Coordinates
 Learning Objectives
Evaluate a triple integral by changing to cylindrical coordinates.
Evaluate a triple integral by changing to spherical coordinates.

Earlier in this chapter we showed how to convert a double integral in rectangular coordinates into a double integral in polar
coordinates in order to deal more conveniently with problems involving circular symmetry. A similar situation occurs with triple
integrals, but here we need to distinguish between cylindrical symmetry and spherical symmetry. In this section we convert triple
integrals in rectangular coordinates into a triple integral in either cylindrical or spherical coordinates.
Also recall the chapter prelude, which showed the opera house l’Hemisphèric in Valencia, Spain. It has four sections with one of
the sections being a theater in a five-story-high sphere (ball) under an oval roof as long as a football field. Inside is an IMAX
screen that changes the sphere into a planetarium with a sky full of 9000 twinkling stars. Using triple integrals in spherical
coordinates, we can find the volumes of different geometric shapes like these.

Review of Cylindrical Coordinates


As we have seen earlier, in two-dimensional space R² a point with rectangular coordinates (x, y) can be identified with (r, θ) in
polar coordinates and vice versa, where x = r cos θ, y = r sin θ, r² = x² + y², and tan θ = y/x are the relationships between
the variables.
In three-dimensional space R³ a point with rectangular coordinates (x, y, z) can be identified with cylindrical coordinates (r, θ, z)
and vice versa. We can use these same conversion relationships, adding z as the vertical distance to the point from the xy-plane, as
shown in Figure 15.8.1.

Figure 15.8.1 : Cylindrical coordinates are similar to polar coordinates with a vertical z coordinate added.
To convert from cylindrical to rectangular coordinates, we use

\[ x = r\cos\theta, \quad y = r\sin\theta, \quad z = z. \]

To convert from rectangular to cylindrical coordinates, we use

\[ r^2 = x^2 + y^2, \quad \theta = \tan^{-1}\left( \frac{y}{x} \right), \quad z = z. \]

Note that the z-coordinate remains the same in both cases.
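These conversions are short enough to sketch in code. The function names below are hypothetical, and one assumption goes beyond the formulas as written: `math.atan2(y, x)` is used in place of tan⁻¹(y/x) so that θ falls in the correct quadrant and x = 0 causes no division error.

```python
import math

def cylindrical_to_rectangular(r, theta, z):
    """x = r cos(theta), y = r sin(theta), z unchanged."""
    return r * math.cos(theta), r * math.sin(theta), z

def rectangular_to_cylindrical(x, y, z):
    """r^2 = x^2 + y^2 with r >= 0; theta from the quadrant-aware arctangent."""
    r = math.hypot(x, y)
    theta = math.atan2(y, x)
    return r, theta, z

# A point over the second quadrant, where atan(y / x) alone would return -pi/4.
point = (-1.0, 1.0, 2.0)
r, theta, z = rectangular_to_cylindrical(*point)
back = cylindrical_to_rectangular(r, theta, z)
print(r, theta, z)
print(back)
```

For this point the conversion should give r = √2 and θ = 3π/4, and converting back should recover the original coordinates up to rounding.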


In the two-dimensional plane with a rectangular coordinate system, when we say x = k (constant) we mean an unbounded vertical
line parallel to the y -axis and when y = l (constant) we mean an unbounded horizontal line parallel to the x-axis. With the polar

15.8.1 https://math.libretexts.org/@go/page/4552
coordinate system, when we say r = c (constant), we mean a circle of radius c units and when θ =α (constant) we mean an
infinite ray making an angle α with the positive x-axis.
Similarly, in three-dimensional space with rectangular coordinates (x, y, z) the equations x = k, y = l and z = m where k, l and
m are constants, represent unbounded planes parallel to the yz-plane, xz-plane and xy-plane, respectively. With cylindrical

coordinates (r, θ, z), by r = c, θ = α, and z = m, where c, α, and m are constants, we mean an unbounded vertical cylinder with
the z-axis as its axis of symmetry; a vertical plane through the z-axis making a constant angle α with the xz-plane; and an unbounded horizontal plane parallel to the
xy-plane, respectively. This means that the circular cylinder x² + y² = c² in rectangular coordinates can be represented simply as
r = c in cylindrical coordinates. (Refer to Cylindrical and Spherical Coordinates for more review.)

Integration in Cylindrical Coordinates


Triple integrals can often be more readily evaluated by using cylindrical coordinates instead of rectangular coordinates. Some
common equations of surfaces in rectangular coordinates along with corresponding equations in cylindrical coordinates are listed in
Table 15.8.1. These equations will become handy as we proceed with solving problems using triple integrals.
Table 15.8.1: Equations of Some Common Shapes

| | Circular cylinder | Circular cone | Sphere | Paraboloid |
|---|---|---|---|---|
| Rectangular | \(x^2 + y^2 = c^2\) | \(z^2 = c^2(x^2 + y^2)\) | \(x^2 + y^2 + z^2 = c^2\) | \(z = c(x^2 + y^2)\) |
| Cylindrical | \(r = c\) | \(z = cr\) | \(r^2 + z^2 = c^2\) | \(z = cr^2\) |

As before, we start with the simplest bounded region \(B\) in \(\mathbb{R}^3\) to describe in cylindrical coordinates, in the form of a cylindrical box, \(B = \{(r, \theta, z) \mid a \leq r \leq b, \ \alpha \leq \theta \leq \beta, \ c \leq z \leq d\}\) (Figure 15.8.2). Suppose we divide each interval into \(l\), \(m\), and \(n\) subdivisions such that \(\Delta r = \frac{b - a}{l}\), \(\Delta \theta = \frac{\beta - \alpha}{m}\), and \(\Delta z = \frac{d - c}{n}\). Then we can state the following definition for a triple integral in cylindrical coordinates.

Figure 15.8.2 : A cylindrical box B described by cylindrical coordinates.

 DEFINITION: triple integral in cylindrical coordinates

Consider the cylindrical box (expressed in cylindrical coordinates)

B = {(r, θ, z)|a ≤ r ≤ b, α ≤ θ ≤ β, c ≤ z ≤ d}.

If the function \(f(r, \theta, z)\) is continuous on \(B\) and if \((r_{ijk}^*, \theta_{ijk}^*, z_{ijk}^*)\) is any sample point in the cylindrical subbox \(B_{ijk} = [r_{i-1}, r_i] \times [\theta_{j-1}, \theta_j] \times [z_{k-1}, z_k]\) (Figure 15.8.2), then we can define the triple integral in cylindrical coordinates as the limit of a triple Riemann sum, provided the following limit exists:
\[\lim_{l,m,n \to \infty} \sum_{i=1}^{l} \sum_{j=1}^{m} \sum_{k=1}^{n} f(r_{ijk}^*, \theta_{ijk}^*, z_{ijk}^*) \, r_{ijk}^* \, \Delta r \, \Delta \theta \, \Delta z.\]

Note that if \(g(x, y, z)\) is the function in rectangular coordinates and the box \(B\) is expressed in rectangular coordinates, then the triple integral
\[\iiint_B g(x, y, z) \, dV\]
is equal to the triple integral
\[\iiint_B g(r \cos \theta, r \sin \theta, z) \, r \, dr \, d\theta \, dz\]
and we have
\[\iiint_B g(x, y, z) \, dV = \iiint_B g(r \cos \theta, r \sin \theta, z) \, r \, dr \, d\theta \, dz = \iiint_B f(r, \theta, z) \, r \, dr \, d\theta \, dz.\]

As mentioned in the preceding section, all the properties of a double integral work well in triple integrals, whether in rectangular
coordinates or cylindrical coordinates. They also hold for iterated integrals. To reiterate, in cylindrical coordinates, Fubini’s
theorem takes the following form:

 Theorem: Fubini’s Theorem in Cylindrical Coordinates

Suppose that \(g(x, y, z)\) is continuous on a solid region \(B\) which, when described in cylindrical coordinates, has the form \(B = \{(r, \theta, z) \mid a \leq r \leq b, \ \alpha \leq \theta \leq \beta, \ c \leq z \leq d\}\).
Then \(g(x, y, z) = g(r \cos \theta, r \sin \theta, z) = f(r, \theta, z)\) and
\[\iiint_B g(x, y, z) \, dV = \int_c^d \int_\alpha^\beta \int_a^b f(r, \theta, z) \, r \, dr \, d\theta \, dz.\]

The iterated integral may be replaced equivalently by any one of the other five iterated integrals obtained by integrating with
respect to the three variables in other orders.
Cylindrical coordinate systems work well for solids that are symmetric around an axis, such as cylinders and cones. Let us look at
some examples before we define the triple integral in cylindrical coordinates on general cylindrical regions.

 Example 15.8.1: Evaluating a Triple Integral over a Cylindrical Box

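Evaluate the triple integral
\[\iiint_B (zr \sin \theta) \, r \, dr \, d\theta \, dz\]
where the cylindrical box \(B\) is \(B = \{(r, \theta, z) \mid 0 \leq r \leq 2, \ 0 \leq \theta \leq \pi/2, \ 0 \leq z \leq 4\}\).

Solution
As stated in Fubini’s theorem, we can write the triple integral as the iterated integral
\[\iiint_B (zr \sin \theta) \, r \, dr \, d\theta \, dz = \int_{\theta=0}^{\theta=\pi/2} \int_{r=0}^{r=2} \int_{z=0}^{z=4} (zr \sin \theta) \, r \, dz \, dr \, d\theta.\]
The evaluation of the iterated integral is straightforward. The integrand is a product of single-variable functions and all the limits are constants, so we can integrate each variable separately and multiply the results together. This makes the computation much easier:
\[\begin{align*} \int_{\theta=0}^{\theta=\pi/2} \int_{r=0}^{r=2} \int_{z=0}^{z=4} (zr \sin \theta) \, r \, dz \, dr \, d\theta &= \left( \int_0^{\pi/2} \sin \theta \, d\theta \right) \left( \int_0^2 r^2 \, dr \right) \left( \int_0^4 z \, dz \right) \\ &= \left( -\cos \theta \Big|_0^{\pi/2} \right) \left( \frac{r^3}{3} \Big|_0^2 \right) \left( \frac{z^2}{2} \Big|_0^4 \right) = (1) \left( \frac{8}{3} \right) (8) = \frac{64}{3}. \end{align*}\]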
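The definition of the triple integral as a limit of Riemann sums can be checked numerically against this example. The following is a rough sketch, not part of the original text: the function name `cylindrical_riemann_sum` and the choice of midpoints as sample points are assumptions made for illustration.

```python
import math

def cylindrical_riemann_sum(f, r_range, theta_range, z_range, n=40):
    """Midpoint Riemann sum of f(r, theta, z) * r over a cylindrical box."""
    (a, b), (alpha, beta), (c, d) = r_range, theta_range, z_range
    dr, dtheta, dz = (b - a) / n, (beta - alpha) / n, (d - c) / n
    total = 0.0
    for i in range(n):
        r = a + (i + 0.5) * dr
        for j in range(n):
            theta = alpha + (j + 0.5) * dtheta
            for k in range(n):
                z = c + (k + 0.5) * dz
                # the extra factor r comes from dV = r dr dtheta dz
                total += f(r, theta, z) * r * dr * dtheta * dz
    return total

approx = cylindrical_riemann_sum(
    lambda r, theta, z: z * r * math.sin(theta),
    (0.0, 2.0), (0.0, math.pi / 2), (0.0, 4.0),
)
exact = 64 / 3
print(approx, exact)
```

With 40 subdivisions per variable the sum should already agree with \(64/3\) to a couple of decimal places; increasing `n` tightens the agreement at the cost of more function evaluations.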

 Exercise 15.8.1:
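Evaluate the triple integral
\[\int_{\theta=0}^{\theta=\pi} \int_{r=0}^{r=1} \int_{z=0}^{z=4} rz \sin \theta \, r \, dz \, dr \, d\theta.\]

Hint
Follow the same steps as in the previous example.
Answer
\(\left( \int_0^{\pi} \sin \theta \, d\theta \right) \left( \int_0^1 r^2 \, dr \right) \left( \int_0^4 z \, dz \right) = (2) \left( \frac{1}{3} \right) (8) = \frac{16}{3}\)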

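If the cylindrical region over which we have to integrate is a general solid, we look at the projections onto the coordinate planes. Hence the triple integral of a continuous function \(f(r, \theta, z)\) over a general solid region \(E = \{(r, \theta, z) \mid (r, \theta) \in D, \ u_1(r, \theta) \leq z \leq u_2(r, \theta)\}\) in \(\mathbb{R}^3\), where \(D\) is the projection of \(E\) onto the \(r\theta\)-plane, is
\[\iiint_E f(r, \theta, z) \, r \, dr \, d\theta \, dz = \iint_D \left[ \int_{u_1(r,\theta)}^{u_2(r,\theta)} f(r, \theta, z) \, dz \right] r \, dr \, d\theta.\]
In particular, if \(D = \{(r, \theta) \mid g_1(\theta) \leq r \leq g_2(\theta), \ \alpha \leq \theta \leq \beta\}\), then we have
\[\iiint_E f(r, \theta, z) \, r \, dr \, d\theta \, dz = \int_{\theta=\alpha}^{\theta=\beta} \int_{r=g_1(\theta)}^{r=g_2(\theta)} \int_{z=u_1(r,\theta)}^{z=u_2(r,\theta)} f(r, \theta, z) \, r \, dz \, dr \, d\theta.\]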

Similar formulas exist for projections onto the other coordinate planes. We can use polar coordinates in those planes if necessary.

 Example 15.8.2: Setting up a Triple Integral in Cylindrical Coordinates over a General Region

Consider the region \(E\) inside the right circular cylinder with equation \(r = 2 \sin \theta\), bounded below by the \(r\theta\)-plane and bounded above by the sphere with radius 4 centered at the origin (Figure 15.8.3). Set up a triple integral over this region with a function \(f(r, \theta, z)\) in cylindrical coordinates.

Figure 15.8.3 : Setting up a triple integral in cylindrical coordinates over a cylindrical region.
Solution
First, identify that the equation for the sphere is \(r^2 + z^2 = 16\). We can see that the limits for \(z\) are from \(0\) to \(z = \sqrt{16 - r^2}\). Then the limits for \(r\) are from \(0\) to \(r = 2 \sin \theta\). Finally, the limits for \(\theta\) are from \(0\) to \(\pi\). Hence the region is \(E = \{(r, \theta, z) \mid 0 \leq \theta \leq \pi, \ 0 \leq r \leq 2 \sin \theta, \ 0 \leq z \leq \sqrt{16 - r^2}\}\). Therefore, the triple integral is
\[\iiint_E f(r, \theta, z) \, r \, dz \, dr \, d\theta = \int_{\theta=0}^{\theta=\pi} \int_{r=0}^{r=2 \sin \theta} \int_{z=0}^{z=\sqrt{16-r^2}} f(r, \theta, z) \, r \, dz \, dr \, d\theta.\]

 Exercise 15.8.2:

Consider the region inside the right circular cylinder with equation r = 2 sin θ bounded below by the rθ -plane and bounded
above by z = 4 − y . Set up a triple integral with a function f (r, θ, z) in cylindrical coordinates.

Hint
Analyze the region, and draw a sketch.
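Answer
\[\iiint_E f(r, \theta, z) \, r \, dz \, dr \, d\theta = \int_{\theta=0}^{\theta=\pi} \int_{r=0}^{r=2 \sin \theta} \int_{z=0}^{z=4 - r \sin \theta} f(r, \theta, z) \, r \, dz \, dr \, d\theta.\]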

 Example 15.8.3: Setting up a Triple Integral in Two Ways


Let \(E\) be the region bounded below by the cone \(z = \sqrt{x^2 + y^2}\) and above by the paraboloid \(z = 2 - x^2 - y^2\) (Figure 15.8.4). Set up a triple integral in cylindrical coordinates to find the volume of the region, using the following orders of integration:
a. dz dr dθ
b. dr dz dθ

Figure 15.8.4 : Setting up a triple integral in cylindrical coordinates over a conical region.
Solution
a. The cone is of radius 1 where it meets the paraboloid. Since \(z = 2 - x^2 - y^2 = 2 - r^2\) and \(z = \sqrt{x^2 + y^2} = r\) (assuming \(r\) is nonnegative), we have \(2 - r^2 = r\). Solving, we have \(r^2 + r - 2 = (r + 2)(r - 1) = 0\). Since \(r \geq 0\), we have \(r = 1\). Therefore \(z = 1\). So the intersection of these two surfaces is a circle of radius 1 in the plane \(z = 1\). The cone is the lower bound for \(z\) and the paraboloid is the upper bound. The projection of the region onto the \(xy\)-plane is the disk of radius 1 centered at the origin.
Thus, we can describe the region as \(E = \{(r, \theta, z) \mid 0 \leq \theta \leq 2\pi, \ 0 \leq r \leq 1, \ r \leq z \leq 2 - r^2\}\).
Hence the integral for the volume is
\[V = \int_{\theta=0}^{\theta=2\pi} \int_{r=0}^{r=1} \int_{z=r}^{z=2-r^2} r \, dz \, dr \, d\theta.\]

b. We can also write the cone surface as \(r = z\) and the paraboloid as \(r^2 = 2 - z\). The lower bound for \(r\) is zero, but the upper bound is the cone for some values of \(z\) and the paraboloid for others. The plane \(z = 1\) divides the region into two regions. Then the region can be described as
\[E = \{(r, \theta, z) \mid 0 \leq \theta \leq 2\pi, \ 0 \leq z \leq 1, \ 0 \leq r \leq z\} \cup \{(r, \theta, z) \mid 0 \leq \theta \leq 2\pi, \ 1 \leq z \leq 2, \ 0 \leq r \leq \sqrt{2 - z}\}.\]
Now the integral for the volume becomes
\[V = \int_{\theta=0}^{\theta=2\pi} \int_{z=0}^{z=1} \int_{r=0}^{r=z} r \, dr \, dz \, d\theta + \int_{\theta=0}^{\theta=2\pi} \int_{z=1}^{z=2} \int_{r=0}^{r=\sqrt{2-z}} r \, dr \, dz \, d\theta.\]
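The example only sets up the two integrals, so it can be reassuring to confirm that both orders describe the same solid. The sketch below approximates each iterated integral with a simple midpoint rule (the helper `midpoint` is a hypothetical name, not something from the text). Evaluating part a by hand gives \(2\pi \int_0^1 (2r - r^3 - r^2) \, dr = 5\pi/6\), a value the text does not state, which is used here as the reference.

```python
import math

def midpoint(f, lo, hi, n=200):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

# (a) order dz dr dtheta: z runs from the cone z = r to the paraboloid z = 2 - r^2
inner_a = lambda r: midpoint(lambda z: r, r, 2 - r**2)
volume_a = 2 * math.pi * midpoint(inner_a, 0.0, 1.0)

# (b) order dr dz dtheta: two pieces, split by the plane z = 1
lower = midpoint(lambda z: midpoint(lambda r: r, 0.0, z), 0.0, 1.0)
upper = midpoint(lambda z: midpoint(lambda r: r, 0.0, math.sqrt(2 - z)), 1.0, 2.0)
volume_b = 2 * math.pi * (lower + upper)

reference = 5 * math.pi / 6
print(volume_a, volume_b, reference)
```

The factor \(2\pi\) appears because nothing in the integrand or the limits depends on \(\theta\), so the outer integral contributes only the length of the interval \([0, 2\pi]\).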

 Exercise 15.8.3:

Redo the previous example with the order of integration dθ dz dr .

Hint
Note that θ is independent of r and z .
Answer
\(E = \{(r, \theta, z) \mid 0 \leq r \leq 1, \ r \leq z \leq 2 - r^2, \ 0 \leq \theta \leq 2\pi\}\) and
\[V = \int_{r=0}^{r=1} \int_{z=r}^{z=2-r^2} \int_{\theta=0}^{\theta=2\pi} r \, d\theta \, dz \, dr.\]

 Example 15.8.4: Finding a Volume with Triple Integrals in Two Ways

Let \(E\) be the region bounded below by the \(r\theta\)-plane, above by the sphere \(x^2 + y^2 + z^2 = 4\), and on the sides by the cylinder \(x^2 + y^2 = 1\) (Figure 15.8.5). Set up a triple integral in cylindrical coordinates to find the volume of the region using the following orders of integration, and in each case find the volume and check that the answers are the same:
a. dz dr dθ
b. dr dz dθ .

Figure 15.8.5 : Finding a cylindrical volume with a triple integral in cylindrical coordinates.
Solution
a. Note that the equation for the sphere is
\[x^2 + y^2 + z^2 = 4 \quad \text{or} \quad r^2 + z^2 = 4\]
and the equation for the cylinder is
\[x^2 + y^2 = 1 \quad \text{or} \quad r^2 = 1.\]
Thus, we have for the region \(E\)
\[E = \{(r, \theta, z) \mid 0 \leq z \leq \sqrt{4 - r^2}, \ 0 \leq r \leq 1, \ 0 \leq \theta \leq 2\pi\}.\]
Hence the integral for the volume is
\[\begin{align*} V(E) &= \int_{\theta=0}^{\theta=2\pi} \int_{r=0}^{r=1} \int_{z=0}^{z=\sqrt{4-r^2}} r \, dz \, dr \, d\theta \\ &= \int_{\theta=0}^{\theta=2\pi} \int_{r=0}^{r=1} \left[ rz \Big|_{z=0}^{z=\sqrt{4-r^2}} \right] dr \, d\theta = \int_{\theta=0}^{\theta=2\pi} \int_{r=0}^{r=1} \left( r\sqrt{4 - r^2} \right) dr \, d\theta \\ &= \int_0^{2\pi} \left( \frac{8}{3} - \sqrt{3} \right) d\theta = 2\pi \left( \frac{8}{3} - \sqrt{3} \right) \text{ cubic units.} \end{align*}\]

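b. Since the sphere is \(x^2 + y^2 + z^2 = 4\), which is \(r^2 + z^2 = 4\), and the cylinder is \(x^2 + y^2 = 1\), which is \(r^2 = 1\), we have \(1 + z^2 = 4\), that is, \(z^2 = 3\). Thus we have two regions, since the sphere and the cylinder intersect at \((1, \sqrt{3})\) in the \(rz\)-plane:
\[E_1 = \{(r, \theta, z) \mid 0 \leq r \leq \sqrt{4 - z^2}, \ \sqrt{3} \leq z \leq 2, \ 0 \leq \theta \leq 2\pi\}\]
and
\[E_2 = \{(r, \theta, z) \mid 0 \leq r \leq 1, \ 0 \leq z \leq \sqrt{3}, \ 0 \leq \theta \leq 2\pi\}.\]
Hence the integral for the volume is
\[\begin{align*} V(E) &= \int_{\theta=0}^{\theta=2\pi} \int_{z=\sqrt{3}}^{z=2} \int_{r=0}^{r=\sqrt{4-z^2}} r \, dr \, dz \, d\theta + \int_{\theta=0}^{\theta=2\pi} \int_{z=0}^{z=\sqrt{3}} \int_{r=0}^{r=1} r \, dr \, dz \, d\theta \\ &= \left( \frac{16}{3} - 3\sqrt{3} \right) \pi + \sqrt{3}\pi = 2\pi \left( \frac{8}{3} - \sqrt{3} \right) \text{ cubic units.} \end{align*}\]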
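As a sanity check on the two setups, the one-variable integrals that remain after the innermost integration can be approximated numerically. This is a minimal sketch under the assumption that the inner integrals are done by hand first (\(\int r \, dz = r\sqrt{4 - r^2}\) in part a and \(\int r \, dr = \tfrac{1}{2}(\text{upper limit})^2\) in part b); `midpoint` is a hypothetical helper name.

```python
import math

def midpoint(f, lo, hi, n=200):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

# (a) dz dr dtheta: after integrating in z the integrand is r * sqrt(4 - r^2)
volume_a = 2 * math.pi * midpoint(lambda r: r * math.sqrt(4 - r**2), 0.0, 1.0)

# (b) dr dz dtheta: after integrating in r the integrand is (upper r-limit)^2 / 2
cap = midpoint(lambda z: (4 - z**2) / 2, math.sqrt(3), 2.0)  # region E1
drum = midpoint(lambda z: 1.0 / 2, 0.0, math.sqrt(3))        # region E2
volume_b = 2 * math.pi * (cap + drum)

exact = 2 * math.pi * (8 / 3 - math.sqrt(3))
print(volume_a, volume_b, exact)
```

Both approximations should land close to \(2\pi\left(\frac{8}{3} - \sqrt{3}\right)\), matching the closed-form result of the example.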

 Exercise 15.8.4
Redo the previous example with the order of integration dθ dz dr .

Hint
A figure can be helpful. Note that θ is independent of r and z .
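Answer
\(E = \{(r, \theta, z) \mid 0 \leq \theta \leq 2\pi, \ 0 \leq r \leq 1, \ 0 \leq z \leq \sqrt{4 - r^2}\}\) and
\[V = \int_{r=0}^{r=1} \int_{z=0}^{z=\sqrt{4-r^2}} \int_{\theta=0}^{\theta=2\pi} r \, d\theta \, dz \, dr.\]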

Review of Spherical Coordinates


In three-dimensional space \(\mathbb{R}^3\) in the spherical coordinate system, we specify a point \(P\) by its distance \(\rho\) from the origin, the polar angle \(\theta\) from the positive \(x\)-axis (same as in the cylindrical coordinate system), and the angle \(\varphi\) between the positive \(z\)-axis and the line \(OP\) (Figure 15.8.6). Note that \(\rho \geq 0\) and \(0 \leq \varphi \leq \pi\). (Refer to Cylindrical and Spherical Coordinates for a review.) Spherical coordinates are useful for triple integrals over regions that are symmetric with respect to the origin.

Figure 15.8.6 : The spherical coordinate system locates points with two angles and a distance from the origin.
Recall the relationships that connect rectangular coordinates with spherical coordinates.
From spherical coordinates to rectangular coordinates:
\[x = \rho \sin \varphi \cos \theta, \quad y = \rho \sin \varphi \sin \theta, \quad \text{and} \quad z = \rho \cos \varphi.\]
From rectangular coordinates to spherical coordinates:
\[\rho^2 = x^2 + y^2 + z^2, \quad \tan \theta = \frac{y}{x}, \quad \varphi = \arccos \left( \frac{z}{\sqrt{x^2 + y^2 + z^2}} \right).\]
Other relationships that are important to know for conversions are
\[r = \rho \sin \varphi, \quad \theta = \theta, \quad z = \rho \cos \varphi,\]
which are used to convert from spherical coordinates to cylindrical coordinates, and
\[\rho = \sqrt{r^2 + z^2}, \quad \theta = \theta, \quad \varphi = \arccos \left( \frac{z}{\sqrt{r^2 + z^2}} \right),\]
which are used to convert from cylindrical coordinates to spherical coordinates.
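These relationships translate directly into code. The sketch below is illustrative only; the function names are hypothetical, `math.atan2` is assumed as the way to pick \(\theta\) in the correct quadrant, and the origin (where \(\varphi\) is undefined) is arbitrarily assigned \(\varphi = 0\).

```python
import math

def spherical_to_rectangular(rho, theta, phi):
    # x = rho sin(phi) cos(theta), y = rho sin(phi) sin(theta), z = rho cos(phi)
    return (rho * math.sin(phi) * math.cos(theta),
            rho * math.sin(phi) * math.sin(theta),
            rho * math.cos(phi))

def rectangular_to_spherical(x, y, z):
    rho = math.sqrt(x * x + y * y + z * z)
    theta = math.atan2(y, x)                      # quadrant-aware version of tan(theta) = y/x
    phi = math.acos(z / rho) if rho > 0 else 0.0  # phi is undefined at the origin; 0 is a convention
    return (rho, theta, phi)

def spherical_to_cylindrical(rho, theta, phi):
    # r = rho sin(phi), theta unchanged, z = rho cos(phi)
    return (rho * math.sin(phi), theta, rho * math.cos(phi))

rho, theta, phi = rectangular_to_spherical(1.0, 1.0, math.sqrt(2))
r_cyl, theta_cyl, z_cyl = spherical_to_cylindrical(rho, theta, phi)
print(rho, theta, phi)
print(r_cyl, theta_cyl, z_cyl)
```

For the point \((1, 1, \sqrt{2})\) this gives \(\rho = 2\), \(\theta = \pi/4\), and \(\varphi = \pi/4\), and the cylindrical form is \((\sqrt{2}, \pi/4, \sqrt{2})\).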

Figure 15.8.7 shows a few solid regions that are convenient to express in spherical coordinates.

Figure 15.8.7 : Spherical coordinates are especially convenient for working with solids bounded by these types of surfaces. (The
letter c indicates a constant.)

Integration in Spherical Coordinates


We now establish a triple integral in the spherical coordinate system, as we did before in the cylindrical coordinate system. Let the function \(f(\rho, \theta, \varphi)\) be continuous in a bounded spherical box, \(B = \{(\rho, \theta, \varphi) \mid a \leq \rho \leq b, \ \alpha \leq \theta \leq \beta, \ \gamma \leq \varphi \leq \psi\}\). We then divide each interval into \(l\), \(m\), and \(n\) subdivisions such that \(\Delta \rho = \frac{b - a}{l}\), \(\Delta \theta = \frac{\beta - \alpha}{m}\), and \(\Delta \varphi = \frac{\psi - \gamma}{n}\). Now we can state the following definition for triple integrals in spherical coordinates, with \((\rho_{ijk}^*, \theta_{ijk}^*, \varphi_{ijk}^*)\) being any sample point in the spherical subbox \(B_{ijk}\). For the volume element of the subbox \(\Delta V\) in spherical coordinates, we have \(\Delta V = (\Delta \rho)(\rho \, \Delta \varphi)(\rho \sin \varphi \, \Delta \theta)\), as shown in the following figure.

Figure 15.8.8 : The volume element of a box in spherical coordinates.

 Definition: triple integral in spherical coordinates

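The triple integral in spherical coordinates is the limit of a triple Riemann sum,
\[\lim_{l,m,n \to \infty} \sum_{i=1}^{l} \sum_{j=1}^{m} \sum_{k=1}^{n} f(\rho_{ijk}^*, \theta_{ijk}^*, \varphi_{ijk}^*) \, (\rho_{ijk}^*)^2 \sin \varphi_{ijk}^* \, \Delta \rho \, \Delta \theta \, \Delta \varphi\]
provided the limit exists.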

As with the other multiple integrals we have examined, all the properties work similarly for a triple integral in the spherical
coordinate system, and so do the iterated integrals. Fubini’s theorem takes the following form.

 Theorem: Fubini’s Theorem for Spherical Coordinates

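If \(f(\rho, \theta, \varphi)\) is continuous on a spherical solid box \(B = [a, b] \times [\alpha, \beta] \times [\gamma, \psi]\), then
\[\iiint_B f(\rho, \theta, \varphi) \, \rho^2 \sin \varphi \, d\rho \, d\theta \, d\varphi = \int_{\varphi=\gamma}^{\varphi=\psi} \int_{\theta=\alpha}^{\theta=\beta} \int_{\rho=a}^{\rho=b} f(\rho, \theta, \varphi) \, \rho^2 \sin \varphi \, d\rho \, d\theta \, d\varphi.\]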

This iterated integral may be replaced by other iterated integrals by integrating with respect to the three variables in other
orders.

As stated before, spherical coordinate systems work well for solids that are symmetric around a point, such as spheres and cones.
Let us look at some examples before we consider triple integrals in spherical coordinates on general spherical regions.

 Example 15.8.5: Evaluating a Triple Integral in Spherical Coordinates

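Evaluate the iterated triple integral
\[\int_{\theta=0}^{\theta=2\pi} \int_{\varphi=0}^{\varphi=\pi/2} \int_{\rho=0}^{\rho=1} \rho^2 \sin \varphi \, d\rho \, d\varphi \, d\theta.\]

Solution
As before, the integrand is a product of single-variable functions and the limits are constants, so we can integrate each piece and multiply:
\[\int_0^{2\pi} \int_0^{\pi/2} \int_0^1 \rho^2 \sin \varphi \, d\rho \, d\varphi \, d\theta = \int_0^{2\pi} d\theta \int_0^{\pi/2} \sin \varphi \, d\varphi \int_0^1 \rho^2 \, d\rho = (2\pi)(1) \left( \frac{1}{3} \right) = \frac{2\pi}{3}.\]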
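The Riemann-sum definition of the spherical triple integral can be tested against this result. The following sketch uses midpoints as sample points; the function name `spherical_riemann_sum` is an assumption made for illustration and does not come from the text.

```python
import math

def spherical_riemann_sum(f, rho_range, theta_range, phi_range, n=40):
    """Midpoint Riemann sum of f(rho, theta, phi) * rho^2 * sin(phi) over a spherical box."""
    (a, b), (alpha, beta), (gamma, psi) = rho_range, theta_range, phi_range
    drho, dtheta, dphi = (b - a) / n, (beta - alpha) / n, (psi - gamma) / n
    total = 0.0
    for i in range(n):
        rho = a + (i + 0.5) * drho
        for j in range(n):
            theta = alpha + (j + 0.5) * dtheta
            for k in range(n):
                phi = gamma + (k + 0.5) * dphi
                # volume element: rho^2 sin(phi) d rho d theta d phi
                total += f(rho, theta, phi) * rho**2 * math.sin(phi) * drho * dtheta * dphi
    return total

approx = spherical_riemann_sum(
    lambda rho, theta, phi: 1.0,
    (0.0, 1.0), (0.0, 2 * math.pi), (0.0, math.pi / 2),
)
exact = 2 * math.pi / 3
print(approx, exact)
```

Because the integrand is 1, the sum approximates the volume of the upper half of the unit ball, which is \(2\pi/3\).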

The concept of triple integration in spherical coordinates can be extended to integration over a general solid, using the
projections onto the coordinate planes. Note that dV and dA mean the increments in volume and area, respectively. The
variables V and A are used as the variables for integration to express the integrals.
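The triple integral of a continuous function \(f(\rho, \theta, \varphi)\) over a general solid region
\[E = \{(\rho, \theta, \varphi) \mid (\rho, \theta) \in D, \ u_1(\rho, \theta) \leq \varphi \leq u_2(\rho, \theta)\}\]
in \(\mathbb{R}^3\), where \(D\) is the projection of \(E\) onto the \(\rho\theta\)-plane, is
\[\iiint_E f(\rho, \theta, \varphi) \, dV = \iint_D \left[ \int_{u_1(\rho,\theta)}^{u_2(\rho,\theta)} f(\rho, \theta, \varphi) \, d\varphi \right] dA.\]
In particular, if \(D = \{(\rho, \theta) \mid g_1(\theta) \leq \rho \leq g_2(\theta), \ \alpha \leq \theta \leq \beta\}\), then we have
\[\iiint_E f(\rho, \theta, \varphi) \, dV = \int_\alpha^\beta \int_{g_1(\theta)}^{g_2(\theta)} \int_{u_1(\rho,\theta)}^{u_2(\rho,\theta)} f(\rho, \theta, \varphi) \, \rho^2 \sin \varphi \, d\varphi \, d\rho \, d\theta.\]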

Similar formulas occur for projections onto the other coordinate planes.

 Example 15.8.6: Setting up a Triple Integral in Spherical Coordinates


Set up an integral for the volume of the region bounded by the cone \(z = \sqrt{3(x^2 + y^2)}\) and the hemisphere \(z = \sqrt{4 - x^2 - y^2}\) (see the figure below).

Figure 15.8.9 : A region bounded below by a cone and above by a hemisphere.


Solution
Using the conversion formulas from rectangular coordinates to spherical coordinates, we have:
For the cone: \(z = \sqrt{3(x^2 + y^2)}\) or \(\rho \cos \varphi = \sqrt{3} \rho \sin \varphi\) or \(\tan \varphi = \frac{1}{\sqrt{3}}\) or \(\varphi = \frac{\pi}{6}\).
For the sphere: \(z = \sqrt{4 - x^2 - y^2}\) or \(z^2 + x^2 + y^2 = 4\) or \(\rho^2 = 4\) or \(\rho = 2\).
Thus, the triple integral for the volume is
\[V(E) = \int_{\theta=0}^{\theta=2\pi} \int_{\varphi=0}^{\varphi=\pi/6} \int_{\rho=0}^{\rho=2} \rho^2 \sin \varphi \, d\rho \, d\varphi \, d\theta.\]

 Exercise 15.8.5
Set up a triple integral for the volume of the solid region bounded above by the sphere ρ = 2 and bounded below by the cone
φ = π/3.

Hint
Follow the steps of the previous example.

Answer
\[V(E) = \int_{\theta=0}^{\theta=2\pi} \int_{\varphi=0}^{\varphi=\pi/3} \int_{\rho=0}^{\rho=2} \rho^2 \sin \varphi \, d\rho \, d\varphi \, d\theta\]

 Example 15.8.7: Interchanging Order of Integration in Spherical Coordinates


Let \(E\) be the region bounded below by the cone \(z = \sqrt{x^2 + y^2}\) and above by the sphere \(z = x^2 + y^2 + z^2\) (Figure 15.8.10). Set up a triple integral in spherical coordinates and find the volume of the region using the following orders of integration:
a. \(d\rho \, d\varphi \, d\theta\)
b. \(d\varphi \, d\rho \, d\theta\)

Figure 15.8.10: A region bounded below by a cone and above by a sphere.


Solution
a. Use the conversion formulas to write the equations of the sphere and cone in spherical coordinates.
For the sphere:
\[\begin{align*} x^2 + y^2 + z^2 &= z \\ \rho^2 &= \rho \cos \varphi \\ \rho &= \cos \varphi. \end{align*}\]
For the cone:
\[\begin{align*} z &= \sqrt{x^2 + y^2} \\ \rho \cos \varphi &= \sqrt{\rho^2 \sin^2 \varphi \cos^2 \theta + \rho^2 \sin^2 \varphi \sin^2 \theta} \\ \rho \cos \varphi &= \sqrt{\rho^2 \sin^2 \varphi \, (\cos^2 \theta + \sin^2 \theta)} \\ \rho \cos \varphi &= \rho \sin \varphi \\ \cos \varphi &= \sin \varphi \\ \varphi &= \pi/4. \end{align*}\]
Hence the integral for the volume of the solid region \(E\) becomes
\[V(E) = \int_{\theta=0}^{\theta=2\pi} \int_{\varphi=0}^{\varphi=\pi/4} \int_{\rho=0}^{\rho=\cos \varphi} \rho^2 \sin \varphi \, d\rho \, d\varphi \, d\theta.\]

b. Consider the \(\varphi\rho\)-plane. Note that the ranges for \(\varphi\) and \(\rho\) (from part a.) are
\[0 \leq \varphi \leq \pi/4, \quad 0 \leq \rho \leq \cos \varphi.\]
The curve \(\rho = \cos \varphi\) meets the line \(\varphi = \pi/4\) at the point \((\pi/4, \sqrt{2}/2)\). Thus, to change the order of integration, we need to use two pieces:
\[0 \leq \rho \leq \sqrt{2}/2, \quad 0 \leq \varphi \leq \pi/4\]
and
\[\sqrt{2}/2 \leq \rho \leq 1, \quad 0 \leq \varphi \leq \cos^{-1} \rho.\]
Hence the integral for the volume of the solid region \(E\) becomes
\[V(E) = \int_{\theta=0}^{\theta=2\pi} \int_{\rho=0}^{\rho=\sqrt{2}/2} \int_{\varphi=0}^{\varphi=\pi/4} \rho^2 \sin \varphi \, d\varphi \, d\rho \, d\theta + \int_{\theta=0}^{\theta=2\pi} \int_{\rho=\sqrt{2}/2}^{\rho=1} \int_{\varphi=0}^{\varphi=\cos^{-1} \rho} \rho^2 \sin \varphi \, d\varphi \, d\rho \, d\theta.\]
In each case, the integration results in \(V(E) = \frac{\pi}{8}\).
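The claim that both orders give \(\pi/8\) can be checked numerically. This is a hedged sketch using a plain midpoint rule for each one-variable integral; `midpoint` is a hypothetical helper, and the two-piece split in part b follows the description in the text.

```python
import math

def midpoint(f, lo, hi, n=200):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

# (a) order d rho d phi d theta: rho from 0 to cos(phi), phi from 0 to pi/4
inner_a = lambda phi: midpoint(lambda rho: rho**2 * math.sin(phi), 0.0, math.cos(phi))
volume_a = 2 * math.pi * midpoint(inner_a, 0.0, math.pi / 4)

# (b) order d phi d rho d theta: two pieces, split at rho = sqrt(2)/2
split = math.sqrt(2) / 2
piece1 = midpoint(
    lambda rho: midpoint(lambda phi: rho**2 * math.sin(phi), 0.0, math.pi / 4),
    0.0, split)
piece2 = midpoint(
    lambda rho: midpoint(lambda phi: rho**2 * math.sin(phi), 0.0, math.acos(rho)),
    split, 1.0)
volume_b = 2 * math.pi * (piece1 + piece2)

print(volume_a, volume_b, math.pi / 8)
```

Splitting part b at \(\rho = \sqrt{2}/2\) matters numerically as well as analytically: the upper limit for \(\varphi\) changes formula there, so integrating each piece separately avoids sampling across the kink.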

Before we end this section, we present a couple of examples that can illustrate the conversion from rectangular coordinates to
cylindrical coordinates and from rectangular coordinates to spherical coordinates.

 Example 15.8.8: Converting from Rectangular Coordinates to Cylindrical Coordinates

Convert the following integral into cylindrical coordinates:
\[\int_{y=-1}^{y=1} \int_{x=0}^{x=\sqrt{1-y^2}} \int_{z=x^2+y^2}^{z=\sqrt{x^2+y^2}} xyz \, dz \, dx \, dy.\]

Solution
The ranges of the variables are
\[\begin{align*} -1 &\leq y \leq 1 \\ 0 &\leq x \leq \sqrt{1 - y^2} \\ x^2 + y^2 &\leq z \leq \sqrt{x^2 + y^2}. \end{align*}\]
The first two inequalities describe the right half of a disk of radius 1. Therefore, the ranges for \(\theta\) and \(r\) are
\[-\frac{\pi}{2} \leq \theta \leq \frac{\pi}{2} \quad \text{and} \quad 0 \leq r \leq 1.\]
The limits of \(z\) are \(r^2 \leq z \leq r\), hence
\[\int_{y=-1}^{y=1} \int_{x=0}^{x=\sqrt{1-y^2}} \int_{z=x^2+y^2}^{z=\sqrt{x^2+y^2}} xyz \, dz \, dx \, dy = \int_{\theta=-\pi/2}^{\theta=\pi/2} \int_{r=0}^{r=1} \int_{z=r^2}^{z=r} r (r \cos \theta)(r \sin \theta) z \, dz \, dr \, d\theta.\]

 Example 15.8.9: Converting from Rectangular Coordinates to Spherical Coordinates


Convert the following integral into spherical coordinates:
\[\int_{y=0}^{y=3} \int_{x=0}^{x=\sqrt{9-y^2}} \int_{z=\sqrt{x^2+y^2}}^{z=\sqrt{18-x^2-y^2}} (x^2 + y^2 + z^2) \, dz \, dx \, dy.\]

Solution
The ranges of the variables are
\[\begin{align*} 0 &\leq y \leq 3 \\ 0 &\leq x \leq \sqrt{9 - y^2} \\ \sqrt{x^2 + y^2} &\leq z \leq \sqrt{18 - x^2 - y^2}. \end{align*}\]
The first two ranges of variables describe a quarter disk in the first quadrant of the \(xy\)-plane. Hence the range for \(\theta\) is \(0 \leq \theta \leq \frac{\pi}{2}\).
The lower bound \(z = \sqrt{x^2 + y^2}\) is the upper half of a cone and the upper bound \(z = \sqrt{18 - x^2 - y^2}\) is the upper half of a sphere. Therefore, we have \(0 \leq \rho \leq \sqrt{18}\), which is \(0 \leq \rho \leq 3\sqrt{2}\).
For the ranges of \(\varphi\) we need to find where the cone and the sphere intersect, so solve the equation
\[\begin{align*} r^2 + z^2 &= 18 \\ \left( \sqrt{x^2 + y^2} \right)^2 + z^2 &= 18 \\ z^2 + z^2 &= 18 \\ 2z^2 &= 18 \\ z^2 &= 9 \\ z &= 3. \end{align*}\]
This gives
\[\begin{align*} 3\sqrt{2} \cos \varphi &= 3 \\ \cos \varphi &= \frac{1}{\sqrt{2}} \\ \varphi &= \frac{\pi}{4}. \end{align*}\]
Putting this together, we obtain
\[\int_{y=0}^{y=3} \int_{x=0}^{x=\sqrt{9-y^2}} \int_{z=\sqrt{x^2+y^2}}^{z=\sqrt{18-x^2-y^2}} (x^2 + y^2 + z^2) \, dz \, dx \, dy = \int_{\varphi=0}^{\varphi=\pi/4} \int_{\theta=0}^{\theta=\pi/2} \int_{\rho=0}^{\rho=3\sqrt{2}} \rho^4 \sin \varphi \, d\rho \, d\theta \, d\varphi.\]

 Exercise 15.8.6:

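Use rectangular, cylindrical, and spherical coordinates to set up triple integrals for finding the volume of the region inside the sphere \(x^2 + y^2 + z^2 = 4\) but outside the cylinder \(x^2 + y^2 = 1\).

Answer: Rectangular
\[\int_{x=-2}^{x=2} \int_{y=-\sqrt{4-x^2}}^{y=\sqrt{4-x^2}} \int_{z=-\sqrt{4-x^2-y^2}}^{z=\sqrt{4-x^2-y^2}} dz \, dy \, dx - \int_{x=-1}^{x=1} \int_{y=-\sqrt{1-x^2}}^{y=\sqrt{1-x^2}} \int_{z=-\sqrt{4-x^2-y^2}}^{z=\sqrt{4-x^2-y^2}} dz \, dy \, dx.\]

Answer: Cylindrical
\[\int_{\theta=0}^{\theta=2\pi} \int_{r=1}^{r=2} \int_{z=-\sqrt{4-r^2}}^{z=\sqrt{4-r^2}} r \, dz \, dr \, d\theta.\]

Answer: Spherical
\[\int_{\varphi=\pi/6}^{\varphi=5\pi/6} \int_{\theta=0}^{\theta=2\pi} \int_{\rho=\csc \varphi}^{\rho=2} \rho^2 \sin \varphi \, d\rho \, d\theta \, d\varphi.\]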

Now that we are familiar with the spherical coordinate system, let’s find the volume of some known geometric figures, such as
spheres and ellipsoids.

 Example 15.8.10: Chapter Opener: Finding the Volume of l’Hemisphèric

Find the volume of the spherical planetarium in l’Hemisphèric in Valencia, Spain, which is five stories tall and has a radius of approximately 50 ft, using the equation \(x^2 + y^2 + z^2 = r^2\).

Figure 15.8.11: (credit: modification of work by Javier Yaya Tur, Wikimedia Commons)
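Solution
We calculate the volume of the ball in the first octant, where \(x \geq 0\), \(y \geq 0\), and \(z \geq 0\), using spherical coordinates, and then multiply the result by 8 for symmetry. Since we consider the region \(D\) as the first octant in the integral, the ranges of the variables are
\[0 \leq \varphi \leq \frac{\pi}{2}, \quad 0 \leq \rho \leq r, \quad 0 \leq \theta \leq \frac{\pi}{2}.\]
Therefore,
\[\begin{align*} V = \iiint_D dx \, dy \, dz &= 8 \int_{\theta=0}^{\theta=\pi/2} \int_{\rho=0}^{\rho=r} \int_{\varphi=0}^{\varphi=\pi/2} \rho^2 \sin \varphi \, d\varphi \, d\rho \, d\theta \\ &= 8 \int_{\theta=0}^{\theta=\pi/2} d\theta \int_{\rho=0}^{\rho=r} \rho^2 \, d\rho \int_{\varphi=0}^{\varphi=\pi/2} \sin \varphi \, d\varphi \\ &= 8 \left( \frac{\pi}{2} \right) \left( \frac{r^3}{3} \right) (1) \\ &= \frac{4}{3} \pi r^3. \end{align*}\]
This exactly matches the known formula for the volume of a ball. So for a sphere with a radius of approximately 50 ft, the volume is \(\frac{4}{3} \pi (50)^3 \approx 523{,}600 \text{ ft}^3\).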

For the next example we find the volume of an ellipsoid.

 Example 15.8.11: Finding the Volume of an Ellipsoid


Find the volume of the ellipsoid \(\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1\).

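Solution
We again use symmetry and evaluate the volume of the ellipsoid using spherical coordinates. As before, we use the first octant \(x \geq 0\), \(y \geq 0\), and \(z \geq 0\) and then multiply the result by 8.
In this case the ranges of the variables are
\[0 \leq \varphi \leq \frac{\pi}{2}, \quad 0 \leq \rho \leq 1, \quad \text{and} \quad 0 \leq \theta \leq \frac{\pi}{2}.\]
Also, we need to change the rectangular to spherical coordinates in this way:
\[x = a\rho \sin \varphi \cos \theta, \quad y = b\rho \sin \varphi \sin \theta, \quad \text{and} \quad z = c\rho \cos \varphi.\]
Under this scaled change of variables the volume element \(dx \, dy \, dz\) becomes \(abc \, \rho^2 \sin \varphi \, d\varphi \, d\rho \, d\theta\). Then the volume of the ellipsoid becomes
\[\begin{align*} V &= \iiint_D dx \, dy \, dz \\ &= 8 \int_{\theta=0}^{\theta=\pi/2} \int_{\rho=0}^{\rho=1} \int_{\varphi=0}^{\varphi=\pi/2} abc \, \rho^2 \sin \varphi \, d\varphi \, d\rho \, d\theta \\ &= 8abc \int_{\theta=0}^{\theta=\pi/2} d\theta \int_{\rho=0}^{\rho=1} \rho^2 \, d\rho \int_{\varphi=0}^{\varphi=\pi/2} \sin \varphi \, d\varphi \\ &= 8abc \left( \frac{\pi}{2} \right) \left( \frac{1}{3} \right) (1) \\ &= \frac{4}{3} \pi abc. \end{align*}\]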

 Example 15.8.12: Finding the Volume of the Space Inside an Ellipsoid and Outside a Sphere
Find the volume of the space inside the ellipsoid \(\frac{x^2}{75^2} + \frac{y^2}{80^2} + \frac{z^2}{90^2} = 1\) and outside the sphere \(x^2 + y^2 + z^2 = 50^2\).

Solution
This problem is directly related to the l’Hemisphèric structure. The volume of space inside the ellipsoid and outside the sphere might be useful to find the expense of heating or cooling that space. We can use the preceding two examples for the volume of the sphere and ellipsoid and then subtract.
First we find the volume of the ellipsoid using \(a = 75\) ft, \(b = 80\) ft, and \(c = 90\) ft in the result from Example 15.8.11. Hence the volume of the ellipsoid is
\[V_{\text{ellipsoid}} = \frac{4}{3} \pi (75)(80)(90) \approx 2{,}262{,}000 \text{ ft}^3.\]
From Example 15.8.10, the volume of the sphere is
\[V_{\text{sphere}} \approx 523{,}600 \text{ ft}^3.\]
Therefore, the volume of the space inside the ellipsoid \(\frac{x^2}{75^2} + \frac{y^2}{80^2} + \frac{z^2}{90^2} = 1\) and outside the sphere \(x^2 + y^2 + z^2 = 50^2\) is approximately
\[V_{\text{Hemispheric}} = V_{\text{ellipsoid}} - V_{\text{sphere}} = 1{,}738{,}400 \text{ ft}^3.\]
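The arithmetic in this example is easy to reproduce. The short sketch below applies the formula \(V = \frac{4}{3}\pi abc\) from the ellipsoid example (a sphere is the special case \(a = b = c\)); the function name `ellipsoid_volume` is a hypothetical label chosen for illustration.

```python
import math

def ellipsoid_volume(a, b, c):
    """Volume of the ellipsoid x^2/a^2 + y^2/b^2 + z^2/c^2 = 1."""
    return 4.0 / 3.0 * math.pi * a * b * c

v_ellipsoid = ellipsoid_volume(75, 80, 90)  # semi-axes in feet
v_sphere = ellipsoid_volume(50, 50, 50)     # a sphere is an ellipsoid with equal semi-axes
v_between = v_ellipsoid - v_sphere

print(round(v_ellipsoid), round(v_sphere), round(v_between))
```

The unrounded difference comes out slightly below the 1,738,400 ft³ quoted above, because the text subtracts the two volumes after each has been rounded.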

 Student Project: Hot air balloons

Hot air ballooning is a relaxing, peaceful pastime that many people enjoy. Many balloonist gatherings take place around the
world, such as the Albuquerque International Balloon Fiesta. The Albuquerque event is the largest hot air balloon festival in
the world, with over 500 balloons participating each year.

Figure 15.8.12: Balloons lift off at the 2001 Albuquerque International Balloon Fiesta. (credit: David Herrera, Flickr)
As the name implies, hot air balloons use hot air to generate lift. (Hot air is less dense than cooler air, so the balloon floats as
long as the hot air stays hot.) The heat is generated by a propane burner suspended below the opening of the basket. Once the
balloon takes off, the pilot controls the altitude of the balloon, either by using the burner to heat the air and ascend or by using
a vent near the top of the balloon to release heated air and descend. The pilot has very little control over where the balloon

goes, however—balloons are at the mercy of the winds. The uncertainty over where we will end up is one of the reasons
balloonists are attracted to the sport.
In this project we use triple integrals to learn more about hot air balloons. We model the balloon in two pieces. The top of the balloon is modeled by a half sphere of radius 28 feet. The bottom of the balloon is modeled by a frustum of a cone (think of an ice cream cone with the pointy end cut off). The radius of the large end of the frustum is 28 feet and the radius of the small end of the frustum is 6 feet. A graph of our balloon model and a cross-sectional diagram showing the dimensions are shown in the following figure.

Figure 15.8.13: (a) Use a half sphere to model the top part of the balloon and a frustum of a cone to model the bottom part of
the balloon. (b) A cross section of the balloon showing its dimensions.
We first want to find the volume of the balloon. If we look at the top part and the bottom part of the balloon separately, we see
that they are geometric solids with known volume formulas. However, it is still worthwhile to set up and evaluate the integrals
we would need to find the volume. If we calculate the volume using integration, we can use the known volume formulas to
check our answers. This will help ensure that we have the integrals set up correctly for the later, more complicated stages of the
project.
1. Find the volume of the balloon in two ways.
a. Use triple integrals to calculate the volume. Consider each part of the balloon separately. (Consider using spherical
coordinates for the top part and cylindrical coordinates for the bottom part.)
b. Verify the answer using the formulas for the volume of a sphere, \(V = \frac{4}{3} \pi r^3\), and for the volume of a cone, \(V = \frac{1}{3} \pi r^2 h\).
In reality, calculating the temperature at a point inside the balloon is a tremendously complicated endeavor. In fact, an entire
branch of physics (thermodynamics) is devoted to studying heat and temperature. For the purposes of this project, however, we
are going to make some simplifying assumptions about how temperature varies from point to point within the balloon. Assume
that just prior to liftoff, the temperature (in degrees Fahrenheit) of the air inside the balloon varies according to the function
\[T_0(r, \theta, z) = \frac{z - r}{10} + 210.\]

2. What is the average temperature of the air in the balloon just prior to liftoff? (Again, look at each part of the balloon
separately, and do not forget to convert the function into spherical coordinates when looking at the top part of the balloon.)
Now the pilot activates the burner for 10 seconds. This action affects the temperature in a 12-foot-wide column 20 feet high,
directly above the burner. A cross section of the balloon depicting this column is shown in the following figure.

Figure 15.8.14: Activating the burner heats the air in a 20-foot-high, 12-foot-wide column directly above the burner.
Assume that after the pilot activates the burner for 10 seconds, the temperature of the air in the column described above
increases according to the formula

H (r, θ, z) = −2z − 48.

Then the temperature of the air in the column is given by


\[T_1(r, \theta, z) = \frac{z - r}{10} + 210 + (-2z - 48),\]

while the temperature in the remainder of the balloon is still given by


\[T_0(r, \theta, z) = \frac{z - r}{10} + 210.\]

3. Find the average temperature of the air in the balloon after the pilot has activated the burner for 10 seconds.

Key Concepts
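To evaluate a triple integral in cylindrical coordinates, use the iterated integral
\[\int_{\theta=\alpha}^{\theta=\beta} \int_{r=g_1(\theta)}^{r=g_2(\theta)} \int_{z=u_1(r,\theta)}^{z=u_2(r,\theta)} f(r, \theta, z) \, r \, dz \, dr \, d\theta.\]
To evaluate a triple integral in spherical coordinates, use the iterated integral
\[\int_{\theta=\alpha}^{\theta=\beta} \int_{\rho=g_1(\theta)}^{\rho=g_2(\theta)} \int_{\varphi=u_1(\rho,\theta)}^{\varphi=u_2(\rho,\theta)} f(\rho, \theta, \varphi) \, \rho^2 \sin \varphi \, d\varphi \, d\rho \, d\theta.\]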

Key Equations
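Triple integral in cylindrical coordinates
\[\iiint_B g(x, y, z) \, dV = \iiint_B g(r \cos \theta, r \sin \theta, z) \, r \, dr \, d\theta \, dz = \iiint_B f(r, \theta, z) \, r \, dr \, d\theta \, dz\]
Triple integral in spherical coordinates
\[\iiint_B f(\rho, \theta, \varphi) \, \rho^2 \sin \varphi \, d\rho \, d\theta \, d\varphi = \int_{\varphi=\gamma}^{\varphi=\psi} \int_{\theta=\alpha}^{\theta=\beta} \int_{\rho=a}^{\rho=b} f(\rho, \theta, \varphi) \, \rho^2 \sin \varphi \, d\rho \, d\theta \, d\varphi\]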

Glossary
triple integral in cylindrical coordinates
the limit of a triple Riemann sum, provided the following limit exists:
\[\lim_{l,m,n \to \infty} \sum_{i=1}^{l} \sum_{j=1}^{m} \sum_{k=1}^{n} f(r_{ijk}^*, \theta_{ijk}^*, z_{ijk}^*) \, r_{ijk}^* \, \Delta r \, \Delta \theta \, \Delta z\]
triple integral in spherical coordinates
the limit of a triple Riemann sum, provided the following limit exists:
\[\lim_{l,m,n \to \infty} \sum_{i=1}^{l} \sum_{j=1}^{m} \sum_{k=1}^{n} f(\rho_{ijk}^*, \theta_{ijk}^*, \varphi_{ijk}^*) \, (\rho_{ijk}^*)^2 \sin \varphi_{ijk}^* \, \Delta \rho \, \Delta \theta \, \Delta \varphi\]

15.8: Triple Integrals in Spherical Coordinates is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
15.5: Triple Integrals in Cylindrical and Spherical Coordinates by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0.
Original source: https://openstax.org/details/books/calculus-volume-1.

15.9: Change of Variables in Multiple Integrals
Given the difficulty of evaluating multiple integrals, the reader may be wondering if it is possible to simplify those integrals using a
suitable substitution for the variables. The answer is yes, though it is a bit more complicated than the substitution method which
you learned in single-variable calculus.
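Recall that if you are given, for example, the definite integral
\[\int_1^2 x^3 \sqrt{x^2 - 1} \, dx \tag{15.9.1}\]
then you would make the substitution
\[u = x^2 - 1 \ \Rightarrow \ x^2 = u + 1, \qquad du = 2x \, dx \tag{15.9.2}\]
which changes the limits of integration
\[x = 1 \ \Rightarrow \ u = 0, \qquad x = 2 \ \Rightarrow \ u = 3 \tag{15.9.3}\]
so that we get
\[\begin{align*} \int_1^2 x^3 \sqrt{x^2 - 1} \, dx &= \int_1^2 \frac{1}{2} x^2 \cdot 2x \sqrt{x^2 - 1} \, dx \\ &= \int_0^3 \frac{1}{2} (u + 1) \sqrt{u} \, du \\ &= \frac{1}{2} \int_0^3 \left( u^{3/2} + u^{1/2} \right) du, \text{ which can be easily integrated to give} \\ &= \frac{14\sqrt{3}}{5}. \end{align*}\]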
5

Let us take a different look at what happened when we did that substitution, which will give some motivation for how substitution works in multiple integrals. First, we let \(u = x^2 - 1\). On the interval of integration \([1, 2]\), the function \(x \mapsto x^2 - 1\) is strictly increasing (and maps \([1, 2]\) onto \([0, 3]\)) and hence has an inverse function (defined on the interval \([0, 3]\)). That is, on \([0, 3]\) we can define \(x\) as a function of \(u\), namely
\[x = g(u) = \sqrt{u + 1} \tag{15.9.4}\]
Then substituting that expression for \(x\) into the function \(f(x) = x^3 \sqrt{x^2 - 1}\) gives
\[f(x) = f(g(u)) = (u + 1)^{3/2} \sqrt{u} \tag{15.9.5}\]
and we see that
\[\frac{dx}{du} = g'(u) \ \Rightarrow \ dx = g'(u) \, du, \qquad dx = \frac{1}{2} (u + 1)^{-1/2} \, du \tag{15.9.6}\]
so since
\[g(0) = 1 \ \Rightarrow \ 0 = g^{-1}(1), \qquad g(3) = 2 \ \Rightarrow \ 3 = g^{-1}(2) \tag{15.9.7}\]
then performing the substitution as we did earlier gives

15.9.1 https://math.libretexts.org/@go/page/4553
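\[
\begin{aligned}
\int_1^2 f(x)\,dx &= \int_1^2 x^3\sqrt{x^2-1}\,dx \\
&= \int_0^3 \tfrac{1}{2}(u+1)\sqrt{u}\,du, \quad\text{which can be written as} \\
&= \int_0^3 (u+1)^{3/2}\sqrt{u}\cdot\tfrac{1}{2}(u+1)^{-1/2}\,du, \quad\text{which means} \\
\int_1^2 f(x)\,dx &= \int_{g^{-1}(1)}^{g^{-1}(2)} f(g(u))\,g'(u)\,du.
\end{aligned}
\tag{15.9.8}
\]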

In general, if x = g(u) is a one-to-one, differentiable function from an interval [c, d] (which you can think of as being on the “u-axis”)
onto an interval [a, b] (on the x-axis), with g'(u) ≠ 0 on the interval (c, d) and with a = g(c) and b = g(d), then
\(c = g^{-1}(a)\) and \(d = g^{-1}(b)\), and

\[ \int_a^b f(x)\,dx = \int_{g^{-1}(a)}^{g^{-1}(b)} f(g(u))\,g'(u)\,du \tag{15.9.9} \]

This is called the change of variable formula for integrals of single-variable functions, and it is what you were implicitly using
when doing integration by substitution. This formula turns out to be a special case of a more general formula which can be used to
evaluate multiple integrals. We will state the formulas for double and triple integrals involving real-valued functions of two and
three variables, respectively. We will assume that all the functions involved are continuously differentiable and that the regions and
solids involved all have “reasonable” boundaries. The proof of the following theorem is beyond the scope of the text.

Theorem 15.9.1: Change of Variables Formula for Multiple Integrals

Let x = x(u, v) and y = y(u, v) define a one-to-one mapping of a region R' in the uv-plane onto a region R in the xy-plane
such that the determinant
\[ J(u,v) = \begin{vmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\[2ex] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{vmatrix} \tag{15.9.10} \]
is never 0 in R'. Then
\[ \iint_R f(x,y)\,dA(x,y) = \iint_{R'} f(x(u,v),y(u,v))\,|J(u,v)|\,dA(u,v) \tag{15.9.11} \]

We use the notation dA(x, y) and dA(u, v) to denote the area element in the (x, y) and (u, v) coordinates, respectively.
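Similarly, if x = x(u, v, w), y = y(u, v, w) and z = z(u, v, w) define a one-to-one mapping of a solid S' in uvw-space onto a
solid S in xyz-space such that the determinant
\[ J(u,v,w) = \begin{vmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} & \dfrac{\partial x}{\partial w} \\[2ex] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} & \dfrac{\partial y}{\partial w} \\[2ex] \dfrac{\partial z}{\partial u} & \dfrac{\partial z}{\partial v} & \dfrac{\partial z}{\partial w} \end{vmatrix} \tag{15.9.12} \]
is never 0 in S', then
\[ \iiint_S f(x,y,z)\,dV(x,y,z) = \iiint_{S'} f(x(u,v,w),y(u,v,w),z(u,v,w))\,|J(u,v,w)|\,dV(u,v,w) \tag{15.9.13} \]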

The determinant J(u, v) in Equation 15.9.10 is called the Jacobian of x and y with respect to u and v, and is sometimes written
as
\[ J(u,v) = \frac{\partial(x,y)}{\partial(u,v)} \tag{15.9.14} \]
Similarly, the Jacobian J(u, v, w) of three variables is sometimes written as
\[ J(u,v,w) = \frac{\partial(x,y,z)}{\partial(u,v,w)} \tag{15.9.15} \]

Notice that Equation 15.9.11 is saying that dA(x, y) = |J(u, v)|dA(u, v) , which you can think of as a two-variable version of the
relation dx = g'(u) du in the single-variable case.
The following example shows how the change of variables formula is used.

Example 15.9.1

Evaluate
\[ \iint_R e^{\frac{x-y}{x+y}}\,dA \]
where \(R = \{(x,y) : x \ge 0,\ y \ge 0,\ x+y \le 1\}\).
Solution
First, note that evaluating this double integral without using substitution is probably impossible, at least in a closed form. By
looking at the numerator and denominator of the exponent of e, we will try the substitution u = x − y and v = x + y. To use
the change of variables Formula 15.9.11, we need to write both x and y in terms of u and v. So solving for x and y gives
\(x = \tfrac{1}{2}(u+v)\) and \(y = \tfrac{1}{2}(v-u)\). In Figure 15.9.1 below, we see how the mapping
\(x = x(u,v) = \tfrac{1}{2}(u+v),\ y = y(u,v) = \tfrac{1}{2}(v-u)\) maps the region R' onto R in a one-to-one manner.
Here R' is the triangle \(\{(u,v) : 0 \le v \le 1,\ -v \le u \le v\}\), since x ≥ 0 and y ≥ 0 become u ≥ −v and u ≤ v, and x + y ≤ 1 becomes v ≤ 1.

Figure 15.9.1: The regions R and R'


Now we see that
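\[ J(u,v) = \begin{vmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\[2ex] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{vmatrix} = \begin{vmatrix} \tfrac{1}{2} & \tfrac{1}{2} \\[1ex] -\tfrac{1}{2} & \tfrac{1}{2} \end{vmatrix} = \frac{1}{2} \;\Rightarrow\; |J(u,v)| = \left|\frac{1}{2}\right| = \frac{1}{2} \]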

so using horizontal slices in R', we have
\[
\begin{aligned}
\iint_R e^{\frac{x-y}{x+y}}\,dA &= \iint_{R'} f(x(u,v),y(u,v))\,|J(u,v)|\,dA \\
&= \int_0^1\int_{-v}^{v} e^{u/v}\,\tfrac{1}{2}\,du\,dv \\
&= \int_0^1 \left( \tfrac{v}{2}\,e^{u/v}\,\Big|_{u=-v}^{u=v} \right) dv \\
&= \int_0^1 \tfrac{v}{2}\left(e - e^{-1}\right) dv \\
&= \tfrac{v^2}{4}\left(e - e^{-1}\right)\Big|_0^1 = \frac{1}{4}\left(e - \frac{1}{e}\right) = \frac{e^2-1}{4e}.
\end{aligned}
\]
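
The result of Example 15.9.1 can be checked numerically. The sketch below applies a midpoint rule to the transformed integral over R' (taken here to be 0 ≤ v ≤ 1, −v ≤ u ≤ v, as in the limits of integration above) and compares it with (e² − 1)/(4e). The function name `integral_uv` and the grid size are assumptions made for illustration.

```python
import math

def integral_uv(n=400):
    """Midpoint rule for the integral of e^(u/v) * |J| over R', with |J| = 1/2."""
    total = 0.0
    dv = 1.0 / n
    for j in range(n):
        v = (j + 0.5) * dv          # midpoint of the j-th horizontal slice
        du = 2.0 * v / n            # the slice runs from u = -v to u = v
        for i in range(n):
            u = -v + (i + 0.5) * du
            total += math.exp(u / v) * 0.5 * du * dv
    return total

approx = integral_uv()
exact = (math.e**2 - 1) / (4 * math.e)
print(approx, exact)
```

Using midpoints avoids evaluating the integrand at v = 0, where the exponent u/v is undefined.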

The change of variables formula can be used to evaluate double integrals in polar coordinates. Letting
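\[ x = x(r,\theta) = r\cos\theta \quad\text{and}\quad y = y(r,\theta) = r\sin\theta, \tag{15.9.16} \]
we have
\[ J(r,\theta) = \begin{vmatrix} \dfrac{\partial x}{\partial r} & \dfrac{\partial x}{\partial \theta} \\[2ex] \dfrac{\partial y}{\partial r} & \dfrac{\partial y}{\partial \theta} \end{vmatrix} = \begin{vmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{vmatrix} = r\cos^2\theta + r\sin^2\theta = r \;\Rightarrow\; |J(r,\theta)| = |r| = r \tag{15.9.17} \]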

so we have the following formula:

Double Integral in Polar Coordinates

\[ \iint_R f(x,y)\,dx\,dy = \iint_{R'} f(r\cos\theta, r\sin\theta)\,r\,dr\,d\theta \tag{15.9.18} \]

where the mapping x = r cos θ, y = r sin θ maps the region R' in the rθ-plane onto the region R in the xy-plane in a one-to-
one manner.
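
The polar Jacobian can also be estimated numerically. The following sketch approximates the four partial derivatives of the polar mapping with central differences and forms the 2×2 determinant, which should be close to r. The function names and the step size `h` are illustrative assumptions.

```python
import math

def polar(r, theta):
    """The mapping (r, theta) -> (x, y)."""
    return r * math.cos(theta), r * math.sin(theta)

def jacobian_det(r, theta, h=1e-6):
    """Central-difference estimate of the determinant of d(x, y)/d(r, theta)."""
    xp, yp = polar(r + h, theta)
    xm, ym = polar(r - h, theta)
    x_r, y_r = (xp - xm) / (2 * h), (yp - ym) / (2 * h)
    xp, yp = polar(r, theta + h)
    xm, ym = polar(r, theta - h)
    x_t, y_t = (xp - xm) / (2 * h), (yp - ym) / (2 * h)
    return x_r * y_t - x_t * y_r

print(jacobian_det(2.0, 0.7))  # should be close to r = 2
```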

Example 15.9.2: Volume of Paraboloid

Find the volume V inside the paraboloid \(z = x^2 + y^2\) for 0 ≤ z ≤ 1.

Solution
Using vertical slices, we see that

\[ V = \iint_R (1-z)\,dA = \iint_R \left(1 - (x^2+y^2)\right)dA \]

Figure 15.9.2: \(z = x^2 + y^2\)

where \(R = \{(x,y) : x^2+y^2 \le 1\}\) is the unit disk in \(\mathbb{R}^2\) (see Figure 15.9.2). In polar coordinates (r, θ) we know that
\(x^2+y^2 = r^2\) and that the unit disk R is the set \(R' = \{(r,\theta) : 0 \le r \le 1,\ 0 \le \theta \le 2\pi\}\). Thus,
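\[
\begin{aligned}
V &= \int_0^{2\pi}\int_0^1 (1-r^2)\,r\,dr\,d\theta \\
&= \int_0^{2\pi}\int_0^1 (r-r^3)\,dr\,d\theta \\
&= \int_0^{2\pi} \left( \frac{r^2}{2} - \frac{r^4}{4}\,\Big|_{r=0}^{r=1} \right) d\theta \\
&= \int_0^{2\pi} \frac{1}{4}\,d\theta \\
&= \frac{\pi}{2}
\end{aligned}
\]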

Example 15.9.3: Volume of Cone


Find the volume V inside the cone \(z = \sqrt{x^2+y^2}\) for 0 ≤ z ≤ 1.
Solution
Using vertical slices, we see that
\[ V = \iint_R (1-z)\,dA = \iint_R \left(1 - \sqrt{x^2+y^2}\right)dA \]

Figure 15.9.3: \(z = \sqrt{x^2+y^2}\)

where \(R = \{(x,y) : x^2+y^2 \le 1\}\) is the unit disk in \(\mathbb{R}^2\) (see Figure 15.9.3). In polar coordinates (r, θ) we know that
\(\sqrt{x^2+y^2} = r\) and that the unit disk R is the set \(R' = \{(r,\theta) : 0 \le r \le 1,\ 0 \le \theta \le 2\pi\}\). Thus,
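\[
\begin{aligned}
V &= \int_0^{2\pi}\int_0^1 (1-r)\,r\,dr\,d\theta \\
&= \int_0^{2\pi}\int_0^1 (r-r^2)\,dr\,d\theta \\
&= \int_0^{2\pi} \left( \frac{r^2}{2} - \frac{r^3}{3}\,\Big|_{r=0}^{r=1} \right) d\theta \\
&= \int_0^{2\pi} \frac{1}{6}\,d\theta \\
&= \frac{\pi}{3}
\end{aligned}
\]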
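
Both volumes in Examples 15.9.2 and 15.9.3 have the form of a polar double integral whose integrand depends only on r, so they are easy to approximate. The sketch below is a minimal midpoint-rule check; the helper name `polar_volume` and its `height` argument are hypothetical names introduced only for this illustration.

```python
import math

def polar_volume(height, n=1000):
    """Midpoint rule for the integral of height(r) * r over 0 <= r <= 1, 0 <= theta <= 2*pi.

    Assumes the height depends on r only, so the theta integral contributes a factor 2*pi.
    """
    dr = 1.0 / n
    radial = 0.0
    for k in range(n):
        r = (k + 0.5) * dr
        radial += height(r) * r * dr   # the extra factor r is the polar Jacobian
    return 2 * math.pi * radial

paraboloid = polar_volume(lambda r: 1 - r**2)   # compare with pi/2
cone = polar_volume(lambda r: 1 - r)            # compare with pi/3
print(paraboloid, math.pi / 2)
print(cone, math.pi / 3)
```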

In a similar fashion, it can be shown (see Exercises 5-6) that triple integrals in cylindrical and spherical coordinates take the
following forms:

Triple Integral in Cylindrical Coordinates

\[ \iiint_S f(x,y,z)\,dx\,dy\,dz = \iiint_{S'} f(r\cos\theta, r\sin\theta, z)\,r\,dr\,d\theta\,dz \tag{15.9.19} \]

where the mapping x = r cos θ, y = r sin θ, z = z maps the solid S' in rθz -space onto the solid S in xyz -space in a one-to-
one manner.

Triple Integral in Spherical Coordinates

\[ \iiint_S f(x,y,z)\,dx\,dy\,dz = \iiint_{S'} f(\rho\sin\varphi\cos\theta, \rho\sin\varphi\sin\theta, \rho\cos\varphi)\,\rho^2\sin\varphi\,d\rho\,d\varphi\,d\theta \tag{15.9.20} \]

where the mapping x = ρ sin φ cos θ, y = ρ sin φ sin θ, z = ρ cos φ maps the solid S' in ρφθ-space onto the solid S in
xyz-space in a one-to-one manner.
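
The factor ρ² sin φ in the spherical formula is the Jacobian of the spherical mapping with respect to (ρ, φ, θ). As a hedged numerical illustration (not a proof), the sketch below estimates the nine partial derivatives by central differences and evaluates the 3×3 determinant. The function names and the step size `h` are assumptions.

```python
import math

def spherical(rho, phi, theta):
    """The mapping (rho, phi, theta) -> (x, y, z)."""
    return (rho * math.sin(phi) * math.cos(theta),
            rho * math.sin(phi) * math.sin(theta),
            rho * math.cos(phi))

def jacobian_det(rho, phi, theta, h=1e-6):
    """Central-difference estimate of the determinant of d(x, y, z)/d(rho, phi, theta)."""
    point = [rho, phi, theta]
    rows = []
    for k in range(3):
        plus, minus = point[:], point[:]
        plus[k] += h
        minus[k] -= h
        fp, fm = spherical(*plus), spherical(*minus)
        # rows[k] holds the derivatives of (x, y, z) with respect to variable k.
        rows.append([(fp[i] - fm[i]) / (2 * h) for i in range(3)])
    # This is the transpose of the Jacobian matrix, which has the same determinant.
    a, b, c = rows
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))

rho, phi, theta = 2.0, 0.9, 0.4
print(jacobian_det(rho, phi, theta), rho**2 * math.sin(phi))
```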

Example 15.9.4

For a > 0, find the volume V inside the sphere S given by \(x^2+y^2+z^2 = a^2\).
Solution
We see that S is the set ρ = a in spherical coordinates, so
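\[
\begin{aligned}
V &= \iiint_S 1\,dV = \int_0^{2\pi}\int_0^{\pi}\int_0^{a} 1\cdot\rho^2\sin\varphi\,d\rho\,d\varphi\,d\theta \\
&= \int_0^{2\pi}\int_0^{\pi} \left( \frac{\rho^3}{3}\,\Big|_{\rho=0}^{\rho=a} \right)\sin\varphi\,d\varphi\,d\theta = \int_0^{2\pi}\int_0^{\pi} \frac{a^3}{3}\sin\varphi\,d\varphi\,d\theta \\
&= \int_0^{2\pi} \left( -\frac{a^3}{3}\cos\varphi\,\Big|_{\varphi=0}^{\varphi=\pi} \right) d\theta = \int_0^{2\pi} \frac{2a^3}{3}\,d\theta = \frac{4\pi a^3}{3}
\end{aligned}
\]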
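
A short numerical check of Example 15.9.4: the sketch below applies a midpoint rule in ρ and φ to the integrand ρ² sin φ and multiplies by 2π for the θ integral, since the integrand does not depend on θ. The function name `sphere_volume` and the sample radius are illustrative assumptions.

```python
import math

def sphere_volume(a, n=200):
    """Midpoint-rule estimate of the spherical-coordinate integral of rho^2 sin(phi)."""
    d_rho, d_phi = a / n, math.pi / n
    total = 0.0
    for i in range(n):
        rho = (i + 0.5) * d_rho
        for j in range(n):
            phi = (j + 0.5) * d_phi
            total += rho**2 * math.sin(phi) * d_rho * d_phi
    return 2 * math.pi * total   # the theta integral contributes a factor 2*pi

a = 2.0  # sample radius (hypothetical value)
print(sphere_volume(a), 4 * math.pi * a**3 / 3)
```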

15.9: Change of Variables in Multiple Integrals is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
3.5: Change of Variables in Multiple Integrals by Michael Corral is licensed GNU FDL. Original source: http://www.mecmath.net/.

CHAPTER OVERVIEW

16: Vector Calculus


A general Calculus Textmap organized around the textbook Calculus: Early Transcendentals by James Stewart


This Textmap is currently under construction... please be patient with us.


16.1: Vector Fields
16.2: Line Integrals
16.3: The Fundamental Theorem for Line Integrals
16.4: Green's Theorem
16.5: Curl and Divergence
16.6: Parametric Surfaces and Their Areas
16.7: Surface Integrals
16.8: Stokes' Theorem
16.9: The Divergence Theorem

16: Vector Calculus is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.

16.1: Vector Fields
We have already seen that a convenient way to describe a line in three dimensions is to provide a vector that "points to" every point
on the line as a parameter t varies, like
⟨1, 2, 3⟩ + t⟨1, −2, 2⟩ = ⟨1 + t, 2 − 2t, 3 + 2t⟩. (16.1.1)

Except that this gives a particularly simple geometric object, there is nothing special about the individual functions of t that make
up the coordinates of this vector; any vector with a parameter, like ⟨f(t), g(t), h(t)⟩, will describe some curve in three dimensions
as t varies through all possible values.

Example 13.1.1
Describe the curves ⟨cos t, sin t, 0⟩, ⟨cos t, sin t, t⟩, and ⟨cos t, sin t, 2t⟩.
Solution
As t varies, the first two coordinates in all three functions trace out the points on the unit circle, starting with (1, 0) when
t = 0 and proceeding counter-clockwise around the circle as t increases. In the first case, the z coordinate is always 0, so this
describes precisely the unit circle in the x-y plane. In the second case, the x and y coordinates still describe a circle, but now
the z coordinate varies, so that the height of the curve matches the value of t. When t = π, for example, the resulting vector is
⟨−1, 0, π⟩. A bit of thought should convince you that the result is a helix. In the third vector, the z coordinate varies twice as
fast as the parameter t, so we get a stretched out helix. Both are shown in figure 13.1.1. On the left is the first helix, shown for
t between 0 and 4π; on the right is the second helix, shown for t between 0 and 2π. Both start and end at the same point, but
the first helix takes two full "turns" to get there, because its z coordinate grows more slowly.

Figure 13.1.1. Two helixes.
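
To make the comparison of the two helixes concrete, the sketch below evaluates both vector functions at their starting and ending parameter values. The function name `helix` and its `pitch` parameter (the factor multiplying t in the z coordinate) are hypothetical names used only here.

```python
import math

def helix(t, pitch=1.0):
    """Point on the curve <cos t, sin t, pitch * t>."""
    return (math.cos(t), math.sin(t), pitch * t)

first_start, first_end = helix(0.0), helix(4 * math.pi)               # <cos t, sin t, t>
second_start, second_end = helix(0.0, 2.0), helix(2 * math.pi, 2.0)   # <cos t, sin t, 2t>

print(first_end)
print(second_end)
```

Up to rounding, both curves start at (1, 0, 0) and end at (1, 0, 4π); the first needs two full turns to get there and the second needs one.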

A vector expression of the form ⟨f (t), g(t), h(t)⟩ is called a vector function; it is a function from the real numbers R to the set of
all three-dimensional vectors.
We can alternately think of it as three separate functions, x = f(t), y = g(t), and z = h(t), that describe points in space. In this
case we usually refer to the set of equations as parametric equations for the curve, just as for a line. While the parameter t in a
vector function might represent any one of a number of physical quantities, or be simply a "pure number", it is often convenient and
useful to think of t as representing time. The vector function then tells you where in space a particular object is at any time.
Vector functions can be difficult to understand, that is, difficult to picture. When available, computer software can be very helpful.
When working by hand, one useful approach is to consider the "projections" of the curve onto the three standard coordinate planes.
We have already done this in part: in example 13.1.1 we noted that all three curves project to a circle in the x-y plane, since
⟨cos t, sin t⟩ is a two dimensional vector function for the unit circle.

Example 13.1.2
Graph the projections of ⟨cos t, sin t, 2t⟩ onto the x-z plane and the y -z plane.
Solution

16.1.1 https://math.libretexts.org/@go/page/4555
The two dimensional vector function for the projection onto the x-z plane is ⟨cos t, 2t⟩, or in parametric form, x = cos t,
z = 2t. By eliminating t we get the equation x = cos(z/2), the familiar curve shown on the left in figure 13.1.2.
For the projection onto the y-z plane, we start with the vector function ⟨sin t, 2t⟩, which is the same as y = sin t,
z = 2t. Eliminating t gives y = sin(z/2), as shown on the right in figure 13.1.2.

Figure 13.1.2. The projections of ⟨cos t, sin t, 2t⟩ onto the x-z and y -z planes.

Contributors

16.1: Vector Fields is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
13.1: Space Curves by David Guichard is licensed CC BY-NC-SA 4.0.

16.2: Line Integrals
In single-variable calculus you learned how to integrate a real-valued function f(x) over an interval [a, b] in \(\mathbb{R}^1\). This integral
(usually called a Riemann integral) can be thought of as an integral over a path in \(\mathbb{R}^1\), since an interval (or collection of intervals)
is really the only kind of “path” in \(\mathbb{R}^1\). You may also recall that if f(x) represented the force applied along the x-axis to an object
at position x in [a, b], then the work W done in moving that object from position x = a to x = b was defined as the integral:
\[ W = \int_a^b f(x)\,dx \tag{16.2.1} \]

In this section, we will see how to define the integral of a function (either real-valued or vector-valued) of two variables over a
general path (i.e. a curve) in \(\mathbb{R}^2\). This definition will be motivated by the physical notion of work. We will begin with real-valued
functions of two variables.


In physics, the intuitive idea of work is that
Work = Force × Distance (16.2.2)

Suppose that we want to find the total amount W of work done in moving an object along a curve C in \(\mathbb{R}^2\) with a smooth
parametrization x = x(t), y = y(t), a ≤ t ≤ b, with a force f(x, y) which varies with the position (x, y) of the object and is
applied in the direction of motion along C (see Figure 16.2.1 below).

Figure 16.2.1 Curve C : x = x(t), y = y(t) for t in [a, b]

We will assume for now that the function f (x, y) is continuous and real-valued, so we only consider the magnitude of the force.
Partition the interval [a, b] as follows:

\[ a = t_0 < t_1 < t_2 < \cdots < t_{n-1} < t_n = b, \quad\text{for some integer } n \ge 2 \tag{16.2.3} \]

As we can see from Figure 16.2.1, over a typical subinterval \([t_i, t_{i+1}]\) the distance \(\Delta s_i\) traveled along the curve is approximately
\(\sqrt{\Delta x_i^2 + \Delta y_i^2}\), by the Pythagorean Theorem. Thus, if the subinterval is small enough then the work done in moving the object
along that piece of the curve is approximately
\[ \text{Force} \times \text{Distance} \approx f(x_{i*}, y_{i*})\sqrt{\Delta x_i^2 + \Delta y_i^2} \tag{16.2.4} \]
where \((x_{i*}, y_{i*}) = (x(t_{i*}), y(t_{i*}))\) for some \(t_{i*}\) in \([t_i, t_{i+1}]\), and so


\[ W \approx \sum_{i=0}^{n-1} f(x_{i*}, y_{i*})\sqrt{\Delta x_i^2 + \Delta y_i^2} \tag{16.2.5} \]

is approximately the total amount of work done over the entire curve. But since
\[ \sqrt{\Delta x_i^2 + \Delta y_i^2} = \sqrt{\left(\frac{\Delta x_i}{\Delta t_i}\right)^2 + \left(\frac{\Delta y_i}{\Delta t_i}\right)^2}\;\Delta t_i \tag{16.2.6} \]
where \(\Delta t_i = t_{i+1} - t_i\), then


\[ W \approx \sum_{i=0}^{n-1} f(x_{i*}, y_{i*})\sqrt{\left(\frac{\Delta x_i}{\Delta t_i}\right)^2 + \left(\frac{\Delta y_i}{\Delta t_i}\right)^2}\;\Delta t_i \tag{16.2.7} \]

Taking the limit of that sum as the length of the largest subinterval goes to 0, the sum over all subintervals becomes the integral
from t = a to t = b, \(\Delta x_i/\Delta t_i\) and \(\Delta y_i/\Delta t_i\) become x'(t) and y'(t), respectively, and \(f(x_{i*}, y_{i*})\) becomes f(x(t), y(t)), so
that
\[ W = \int_a^b f(x(t),y(t))\sqrt{x'(t)^2 + y'(t)^2}\,dt \tag{16.2.8} \]

The integral on the right side of the above equation gives us our idea of how to define, for any real-valued function f (x, y), the
integral of f (x, y) along the curve C , called a line integral:

Definition 16.2.1: Line Integral of a scalar Field

For a real-valued function f(x, y) and a curve C in \(\mathbb{R}^2\), parametrized by x = x(t), y = y(t), a ≤ t ≤ b, the line integral of
f(x, y) along C with respect to arc length s is
\[ \int_C f(x,y)\,ds = \int_a^b f(x(t),y(t))\sqrt{x'(t)^2 + y'(t)^2}\,dt \tag{16.2.9} \]

The symbol ds is the differential of the arc length function


\[ s = s(t) = \int_a^t \sqrt{x'(u)^2 + y'(u)^2}\,du \tag{16.2.10} \]

which you may recognize from Section 1.9 as the length of the curve C over the interval [a, t], for all t in [a, b]. That is,
\[ ds = s'(t)\,dt = \sqrt{x'(t)^2 + y'(t)^2}\,dt, \tag{16.2.11} \]

by the Fundamental Theorem of Calculus.


For a general real-valued function f(x, y), what does the line integral \(\int_C f(x,y)\,ds\) represent? The preceding discussion of ds
gives us a clue. You can think of differentials as infinitesimal lengths. So if you think of f(x, y) as the height of a picket fence
along C, then f(x, y) ds can be thought of as approximately the area of a section of that fence over some infinitesimally small
section of the curve, and thus the line integral \(\int_C f(x,y)\,ds\) is the total area of that picket fence (see Figure 16.2.2).

Figure 16.2.2 : Area of shaded rectangle = height × width ≈ f (x, y) ds

Example 16.2.1

Use a line integral to show that the lateral surface area A of a right circular cylinder of radius r and height h is 2πrh.
Solution
We will use the right circular cylinder with base circle C given by \(x^2 + y^2 = r^2\) and with height h in the positive z direction
(see Figure 16.2.3). Parametrize C as follows:
\[ x = x(t) = r\cos t, \quad y = y(t) = r\sin t, \quad 0 \le t \le 2\pi \tag{16.2.12} \]

Figure 16.2.3
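Taking f(x, y) = h, the constant height of the cylinder, at every point of C, we get
\[
\begin{aligned}
A &= \int_C f(x,y)\,ds = \int_a^b f(x(t),y(t))\sqrt{x'(t)^2 + y'(t)^2}\,dt \\
&= \int_0^{2\pi} h\sqrt{(-r\sin t)^2 + (r\cos t)^2}\,dt \\
&= h\int_0^{2\pi} r\sqrt{\sin^2 t + \cos^2 t}\,dt \\
&= rh\int_0^{2\pi} 1\,dt = 2\pi rh
\end{aligned}
\]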

Note in Example 16.2.1 that if we had traversed the circle C twice, i.e. let t vary from 0 to 4π then we would have gotten an area
of 4πrh, i.e. twice the desired area, even though the curve itself is still the same (namely, a circle of radius r). Also, notice that we
traversed the circle in the counter-clockwise direction. If we had gone in the clockwise direction, using the parametrization
x = x(t) = r cos(2π − t), y = y(t) = r sin(2π − t), 0 ≤ t ≤ 2π, (16.2.13)

then it is easy to verify (Exercise 12) that the value of the line integral is unchanged.
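
The remarks above can be explored numerically. The following sketch mirrors the Riemann-sum derivation of the line integral: it sums f at a sample point times the chord length of each small piece of the curve. It then evaluates the cylinder example once around, twice around, and once around clockwise. The function name `scalar_line_integral` and the sample values of r and h are assumptions chosen for illustration.

```python
import math

def scalar_line_integral(f, x, y, a, b, n=20000):
    """Riemann-sum estimate of the line integral of f with respect to arc length."""
    dt = (b - a) / n
    total = 0.0
    for k in range(n):
        t0 = a + k * dt
        t1 = t0 + dt
        tm = t0 + 0.5 * dt
        ds = math.hypot(x(t1) - x(t0), y(t1) - y(t0))  # chord length of this piece
        total += f(x(tm), y(tm)) * ds
    return total

r, h = 3.0, 2.0                 # sample radius and height (hypothetical values)
height = lambda x, y: h         # constant height of the "fence"

once = scalar_line_integral(height, lambda t: r * math.cos(t),
                            lambda t: r * math.sin(t), 0.0, 2 * math.pi)
twice = scalar_line_integral(height, lambda t: r * math.cos(t),
                             lambda t: r * math.sin(t), 0.0, 4 * math.pi)
clockwise = scalar_line_integral(height, lambda t: r * math.cos(2 * math.pi - t),
                                 lambda t: r * math.sin(2 * math.pi - t), 0.0, 2 * math.pi)

print(once, twice, clockwise, 2 * math.pi * r * h)
```

The first and third values should be close to 2πrh, and the second close to 4πrh.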
In general, it can be shown (Exercise 15) that reversing the direction in which a curve C is traversed leaves \(\int_C f(x,y)\,ds\)
unchanged, for any f(x, y). If a curve C has a parametrization x = x(t), y = y(t), a ≤ t ≤ b, then denote by −C the same curve
as C but traversed in the opposite direction. Then −C is parametrized by
x = x(a + b − t), y = y(a + b − t), a ≤ t ≤ b, (16.2.14)

and we have

\[ \int_C f(x,y)\,ds = \int_{-C} f(x,y)\,ds. \tag{16.2.15} \]

Notice that our definition of the line integral was with respect to the arc length parameter s . We can also define
\[ \int_C f(x,y)\,dx = \int_a^b f(x(t),y(t))\,x'(t)\,dt \tag{16.2.16} \]

as the line integral of f (x, y) along C with respect to x, and


\[ \int_C f(x,y)\,dy = \int_a^b f(x(t),y(t))\,y'(t)\,dt \tag{16.2.17} \]

as the line integral of f (x, y) along C with respect to y .


In the derivation of the formula for a line integral, we used the idea of work as force multiplied by distance. However, we know
that force is actually a vector. So it would be helpful to develop a vector form for a line integral. For this, suppose that we have a
function \(\mathbf{f}(x,y)\) defined on \(\mathbb{R}^2\) by
\[ \mathbf{f}(x,y) = P(x,y)\,\mathbf{i} + Q(x,y)\,\mathbf{j} \]
for some continuous real-valued functions P(x, y) and Q(x, y) on \(\mathbb{R}^2\). Such a function \(\mathbf{f}\) is called a vector field on \(\mathbb{R}^2\). It is
defined at points in \(\mathbb{R}^2\), and its values are vectors in \(\mathbb{R}^2\). For a curve C with a smooth parametrization
x = x(t), y = y(t), a ≤ t ≤ b, let

r(t) = x(t)i + y(t)j

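be the position vector for a point (x(t), y(t)) on C. Then \(\mathbf{r}'(t) = x'(t)\,\mathbf{i} + y'(t)\,\mathbf{j}\) and so
\[
\begin{aligned}
\int_C P(x,y)\,dx + \int_C Q(x,y)\,dy &= \int_a^b P(x(t),y(t))\,x'(t)\,dt + \int_a^b Q(x(t),y(t))\,y'(t)\,dt \\
&= \int_a^b \left( P(x(t),y(t))\,x'(t) + Q(x(t),y(t))\,y'(t) \right) dt \\
&= \int_a^b \mathbf{f}(x(t),y(t)) \cdot \mathbf{r}'(t)\,dt
\end{aligned}
\]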

by definition of f (x, y). Notice that the function f (x(t), y(t)) ⋅ r'(t) is a real-valued function on [a, b], so the last integral on the
right looks somewhat similar to our earlier definition of a line integral. This leads us to the following definition:

Definition 16.2.2: Line Integral of a vector Field

For a vector field f(x, y) = P (x, y)i + Q(x, y)j and a curve C with a smooth parametrization x = x(t), y = y(t), a ≤ t ≤ b ,
the line integral of f along C is

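\[ \int_C \mathbf{f}\cdot d\mathbf{r} = \int_C P(x,y)\,dx + \int_C Q(x,y)\,dy \tag{16.2.18} \]
\[ \phantom{\int_C \mathbf{f}\cdot d\mathbf{r}} = \int_a^b \mathbf{f}(x(t),y(t)) \cdot \mathbf{r}'(t)\,dt \tag{16.2.19} \]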

where r(t) = x(t)i + y(t)j is the position vector for points on C .

We use the notation dr = r'(t) dt = dxi + dyj to denote the differential of the vector-valued function r. The line integral in
Definition 16.2.2 is often called a line integral of a vector field to distinguish it from the line integral in Definition 16.2.1 which is
called a line integral of a scalar field. For convenience we will often write

\[ \int_C P(x,y)\,dx + \int_C Q(x,y)\,dy = \int_C P(x,y)\,dx + Q(x,y)\,dy, \]

where it is understood that the line integral along C is being applied to both P and Q . The quantity P (x, y) dx + Q(x, y) dy is
known as a differential form. For a real-valued function F (x, y), the differential of F is
\[ dF = \frac{\partial F}{\partial x}\,dx + \frac{\partial F}{\partial y}\,dy. \tag{16.2.20} \]

A differential form P (x, y) dx + Q(x, y) dy is called exact if it equals dF for some function F (x, y).
Recall that if the points on a curve C have position vector r(t) = x(t)i + y(t)j , then r'(t) is a tangent vector to C at the point
(x(t), y(t)) in the direction of increasing t (which we call the direction of C ). Since C is a smooth curve, then r'(t) ≠ 0 on [a, b]

and hence

\[ \mathbf{T}(t) = \frac{\mathbf{r}'(t)}{\lVert \mathbf{r}'(t) \rVert} \]

is the unit tangent vector to C at (x(t), y(t)). Putting Definitions 16.2.1 and 16.2.2 together we get the following theorem:

Theorem 16.2.1

For a vector field f(x, y) = P (x, y)i + Q(x, y)j and a curve C with a smooth parametrization x = x(t), y = y(t), a ≤ t ≤ b
and position vector r(t) = x(t)i + y(t)j ,

\[ \int_C \mathbf{f}\cdot d\mathbf{r} = \int_C \mathbf{f}\cdot\mathbf{T}\,ds, \tag{16.2.21} \]
where \(\mathbf{T}(t) = \dfrac{\mathbf{r}'(t)}{\lVert \mathbf{r}'(t) \rVert}\) is the unit tangent vector to C at (x(t), y(t)).

If the vector field f(x, y) represents the force moving an object along a curve C , then the work W done by this force is

\[ W = \int_C \mathbf{f}\cdot\mathbf{T}\,ds = \int_C \mathbf{f}\cdot d\mathbf{r} \tag{16.2.22} \]
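
As an illustration of Definition 16.2.2, the sketch below approximates the integral of f(x(t), y(t)) · r'(t) with a midpoint rule. The field f = −y i + x j and the unit circle are a hypothetical example not taken from the text; for them the integrand is sin² t + cos² t = 1, so the work is 2π. The function name `vector_line_integral` is likewise an assumption.

```python
import math

def vector_line_integral(P, Q, x, y, dx, dy, a, b, n=20000):
    """Midpoint-rule estimate of the integral of (P x'(t) + Q y'(t)) dt over [a, b]."""
    dt = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * dt
        px, py = x(t), y(t)
        total += (P(px, py) * dx(t) + Q(px, py) * dy(t)) * dt
    return total

# Hypothetical example: f = -y i + x j along the counterclockwise unit circle.
work = vector_line_integral(
    lambda x, y: -y, lambda x, y: x,      # components P and Q
    math.cos, math.sin,                   # x(t), y(t)
    lambda t: -math.sin(t), math.cos,     # x'(t), y'(t)
    0.0, 2 * math.pi)

print(work, 2 * math.pi)
```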

16.2: Line Integrals is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
4.1: Line Integrals by Michael Corral is licensed GNU FDL. Original source: http://www.mecmath.net/.

16.3: The Fundamental Theorem for Line Integrals
One way to write the Fundamental Theorem of Calculus is:
\[\int_a^b f'(x)\,dx = f(b)-f(a).\]
That is, to compute the integral of a derivative \(f'\) we need only compute the values of f at the endpoints. Something similar is true
for line integrals of a certain form.


Theorem: Fundamental Theorem of Line Integrals
Suppose a curve C is given by the vector function \(\mathbf{r}(t)\), with \(\mathbf{a} = \mathbf{r}(a)\) and \(\mathbf{b} = \mathbf{r}(b)\). Then
\[\int_C \nabla f\cdot d\mathbf{r} = f(\mathbf{b})-f(\mathbf{a}),\]
provided that r is sufficiently nice.

Proof
We write \(\mathbf{r} = \langle x(t), y(t), z(t)\rangle\), so that \(\mathbf{r}' = \langle x'(t), y'(t), z'(t)\rangle\). Also, we know that \(\nabla f = \langle f_x, f_y, f_z\rangle\). Then
\[\int_C \nabla f\cdot d\mathbf{r} = \int_a^b \langle f_x,f_y,f_z\rangle\cdot\langle x'(t),y'(t),z'(t)\rangle \,dt=\int_a^b \left(f_x x'+f_y y'+f_z z'\right)dt.\]
By the chain rule (see section 14.4) \(f_x x' + f_y y' + f_z z' = df/dt\), where f in this context means f(x(t), y(t), z(t)), a
function of t. In other words, all we have is
\[\int_a^b f'(t)\,dt=f(b)-f(a).\]
In this context, f(a) = f(x(a), y(a), z(a)). Since \(\mathbf{a} = \mathbf{r}(a) = \langle x(a), y(a), z(a)\rangle\), we can write \(f(a) = f(\mathbf{a})\). This is a bit of
a cheat, since we are simultaneously using f to mean f(t) and f(x, y, z), and since f(x(a), y(a), z(a)) is not technically the
same as f(⟨x(a), y(a), z(a)⟩), but the concepts are clear and the different uses are compatible. Doing the same for b, we get
\[\int_C \nabla f\cdot d\mathbf{r} = \int_a^b f'(t)\,dt=f(b)-f(a)=f(\mathbf{b})-f(\mathbf{a}).\]

This theorem, like the Fundamental Theorem of Calculus, says roughly that if we integrate a "derivative-like function" (\(f'\) or \(\nabla f\))
the result depends only on the values of the original function (f) at the endpoints.
If a vector field F is the gradient of a function,

F = ∇f (16.3.1)

then we say that F is a conservative vector field. If F is a conservative force field, then the integral for work, \(\int_C \mathbf{F}\cdot d\mathbf{r}\), is in the
form required by the Fundamental Theorem of Line Integrals. This means that in a conservative force field, the amount of work
required to move an object from point a to point b depends only on those points, not on the path taken between them. In physics,
forces that can be ascribed to a conservative vector field are called conservative forces and are important for many applications.

Example 16.3.2:

An object moves in the force field


\[ \mathbf{F} = \left\langle \frac{-x}{(x^2+y^2+z^2)^{3/2}}, \frac{-y}{(x^2+y^2+z^2)^{3/2}}, \frac{-z}{(x^2+y^2+z^2)^{3/2}} \right\rangle, \tag{16.3.2} \]
along the curve \(\mathbf{r} = \langle 1+t,\ t^3,\ t\cos(\pi t)\rangle\) as t ranges from 0 to 1. Find the work done by the force on the object.
Solution
The straightforward way to do this involves substituting the components of \(\mathbf{r}\) into \(\mathbf{F}\), forming the dot product \(\mathbf{F}\cdot\mathbf{r}'\), and then
trying to compute the integral, but this integral is extraordinarily messy, perhaps impossible to compute. But since
\(\mathbf{F} = \nabla\left(1/\sqrt{x^2+y^2+z^2}\right)\) we need only substitute:

16.3.1 https://math.libretexts.org/@go/page/4557
\[ \int_C \mathbf{F}\cdot d\mathbf{r} = \left.\frac{1}{\sqrt{x^2+y^2+z^2}}\right|_{(1,0,0)}^{(2,1,-1)} = \frac{1}{\sqrt{6}} - 1. \tag{16.3.3} \]
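
Although the direct integral is messy by hand, it is easy to approximate numerically, which gives an independent check of Example 16.3.2. The sketch below integrates F(r(t)) · r'(t) over 0 ≤ t ≤ 1 with a midpoint rule and compares the result with 1/√6 − 1. The function names and the number of subintervals are illustrative assumptions.

```python
import math

def F(x, y, z):
    """The force field of Example 16.3.2."""
    d = (x * x + y * y + z * z) ** 1.5
    return (-x / d, -y / d, -z / d)

def r(t):
    return (1 + t, t**3, t * math.cos(math.pi * t))

def r_prime(t):
    return (1.0, 3 * t**2, math.cos(math.pi * t) - math.pi * t * math.sin(math.pi * t))

def work(n=20000):
    """Midpoint-rule estimate of the integral of F(r(t)) . r'(t) over [0, 1]."""
    dt = 1.0 / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        Fx, Fy, Fz = F(*r(t))
        dx, dy, dz = r_prime(t)
        total += (Fx * dx + Fy * dy + Fz * dz) * dt
    return total

numeric = work()
exact = 1 / math.sqrt(6) - 1
print(numeric, exact)
```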

Another immediate consequence of the Fundamental Theorem involves closed paths. A path C is closed if it forms a loop, so that
traveling along the curve C brings you back to the starting point. If C is a closed path, we can integrate around it starting at any
point a; since the starting and ending points are the same,
\[\int_C \nabla f\cdot d\mathbf{r}=f(\mathbf{a})-f(\mathbf{a})=0.\]
For example, in a gravitational field (an inverse square law field) the amount of work required to move an object around a closed
path is zero. Of course, it's only the net amount of work that is zero. It may well take a great deal of work to get from point a to
point b, but then the return trip will "produce" work. For example, it takes work to pump water from a lower to a higher elevation,
but if you then let gravity pull the water back down, you can recover work by running a water wheel or generator. (In the real world
you won't recover all the work because of various losses along the way.)
To make use of the Fundamental Theorem of Line Integrals, we need to be able to spot conservative vector fields F and to compute
f so that \(\mathbf{F} = \nabla f\). Suppose that \(\mathbf{F} = \langle P, Q\rangle = \nabla f\). Then \(P = f_x\) and \(Q = f_y\), and provided that f is sufficiently nice, we know
from Clairaut's Theorem that \(P_y = f_{xy} = f_{yx} = Q_x\). If we compute \(P_y\) and \(Q_x\) and find that they are not equal, then F is not
conservative. If \(P_y = Q_x\), then, again provided that F is sufficiently nice, we can be assured that F is conservative. Ultimately,
what's important is that we be able to find f; as this amounts to finding anti-derivatives, we may not always succeed.

Example 16.3.3

Find an f so that \(\langle 3 + 2xy,\ x^2 - 3y^2\rangle = \nabla f\).
Solution
First, note that
\[ \frac{\partial}{\partial y}(3 + 2xy) = 2x \quad\text{and}\quad \frac{\partial}{\partial x}(x^2 - 3y^2) = 2x, \tag{16.3.4} \]

so the desired f does exist. This means that \(f_x = 3 + 2xy\), so that \(f = 3x + x^2y + g(y)\); the first two terms are needed to get
3 + 2xy, and the g(y) could be any function of y, as it would disappear upon taking a derivative with respect to x. Likewise,
since \(f_y = x^2 - 3y^2\), \(f = x^2y - y^3 + h(x)\). The question now becomes, is it possible to find g(y) and h(x) so that
\[ 3x + x^2y + g(y) = x^2y - y^3 + h(x), \tag{16.3.5} \]
and of course the answer is yes: \(g(y) = -y^3\), h(x) = 3x. Thus, \(f = 3x + x^2y - y^3\).
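
One way to double-check a potential function is to differentiate it numerically and compare with the vector field. The sketch below does this for the f found in Example 16.3.3, using central differences at a few sample points. The function names, the step size `h`, and the sample points are assumptions made for illustration.

```python
def f(x, y):
    """Candidate potential from Example 16.3.3."""
    return 3 * x + x**2 * y - y**3

def field(x, y):
    """The vector field <3 + 2xy, x^2 - 3y^2>."""
    return (3 + 2 * x * y, x**2 - 3 * y**2)

def grad_f(x, y, h=1e-6):
    """Central-difference estimate of the gradient of f."""
    fx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    fy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return (fx, fy)

for point in [(1.5, -0.5), (0.0, 2.0), (-1.0, 1.0)]:
    print(point, grad_f(*point), field(*point))
```

At each sample point the two printed pairs should agree up to the accuracy of the finite differences.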

We can test a vector field \(\mathbf{F} = \langle P, Q, R\rangle\) in a similar way. Suppose that \(\langle P, Q, R\rangle = \langle f_x, f_y, f_z\rangle\). If we temporarily hold z
constant, then f(x, y, z) is a function of x and y, and by Clairaut's Theorem \(P_y = f_{xy} = f_{yx} = Q_x\). Likewise, holding y constant
implies \(P_z = f_{xz} = f_{zx} = R_x\), and with x constant we get \(Q_z = f_{yz} = f_{zy} = R_y\). Conversely, if we find that \(P_y = Q_x\),
\(P_z = R_x\), and \(Q_z = R_y\) then F is conservative.

Contributors
David Guichard (Whitman College)
Integrated by Justin Marshall.

16.3: The Fundamental Theorem for Line Integrals is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by
LibreTexts.
16.3: The Fundamental Theorem of Line Integrals by David Guichard is licensed CC BY-NC-SA 4.0.

16.4: Green's Theorem
We will now see a way of evaluating the line integral of a smooth vector field around a simple closed curve. A vector field
f(x, y) = P (x, y)i + Q(x, y)j is smooth if its component functions P (x, y) and Q(x, y) are smooth. We will use Green’s Theorem

(sometimes called Green’s Theorem in the plane) to relate the line integral around a closed curve with a double integral over the
region inside the curve:

Theorem 4.7: Green's Theorem


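Let R be a region in \(\mathbb{R}^2\) whose boundary is a simple closed curve C which is piecewise smooth. Let
\(\mathbf{f}(x,y) = P(x,y)\,\mathbf{i} + Q(x,y)\,\mathbf{j}\) be a smooth vector field defined on both R and C. Then
\[ \oint_C \mathbf{f}\cdot d\mathbf{r} = \iint_R \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA, \tag{16.4.1} \]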

where C is traversed so that R is always on the left side of C .

Proof: We will prove the theorem in the case for a simple region R, that is, where the boundary curve C can be written as
\(C = C_1 \cup C_2\) in two distinct ways:
\[ C_1 = \text{the curve } y = y_1(x) \text{ from the point } X_1 \text{ to the point } X_2 \tag{16.4.2} \]
\[ C_2 = \text{the curve } y = y_2(x) \text{ from the point } X_2 \text{ to the point } X_1, \tag{16.4.3} \]
where \(X_1\) and \(X_2\) are the points on C farthest to the left and right, respectively; and
\[ C_1 = \text{the curve } x = x_1(y) \text{ from the point } Y_2 \text{ to the point } Y_1 \tag{16.4.4} \]
\[ C_2 = \text{the curve } x = x_2(y) \text{ from the point } Y_1 \text{ to the point } Y_2, \tag{16.4.5} \]
where \(Y_1\) and \(Y_2\) are the lowest and highest points, respectively, on C. See Figure 4.3.1.

Figure 4.3.1
Integrate P(x, y) around C using the representation \(C = C_1 \cup C_2\) given by Equation 16.4.2 and Equation 16.4.3.
Since \(y = y_1(x)\) along \(C_1\) (as x goes from a to b) and \(y = y_2(x)\) along \(C_2\) (as x goes from b to a), as we see from Figure 4.3.1,
then we have

16.4.1 https://math.libretexts.org/@go/page/4558
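\[
\begin{aligned}
\oint_C P(x,y)\,dx &= \int_{C_1} P(x,y)\,dx + \int_{C_2} P(x,y)\,dx \\
&= \int_a^b P(x,y_1(x))\,dx + \int_b^a P(x,y_2(x))\,dx \\
&= \int_a^b P(x,y_1(x))\,dx - \int_a^b P(x,y_2(x))\,dx \\
&= -\int_a^b \left( P(x,y_2(x)) - P(x,y_1(x)) \right) dx \\
&= -\int_a^b \left( P(x,y)\,\Big|_{y=y_1(x)}^{y=y_2(x)} \right) dx \\
&= -\int_a^b\int_{y_1(x)}^{y_2(x)} \frac{\partial P(x,y)}{\partial y}\,dy\,dx \quad\text{(by the Fundamental Theorem of Calculus)} \\
&= -\iint_R \frac{\partial P}{\partial y}\,dA.
\end{aligned}
\tag{16.4.6}
\]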

Likewise, integrate Q(x, y) around C using the representation \(C = C_1 \cup C_2\) given by Equation 16.4.4 and Equation 16.4.5. Since
\(x = x_1(y)\) along \(C_1\) (as y goes from d to c) and \(x = x_2(y)\) along \(C_2\) (as y goes from c to d), as we see from Figure 4.3.1, then
we have

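\[
\begin{aligned}
\oint_C Q(x,y)\,dy &= \int_{C_1} Q(x,y)\,dy + \int_{C_2} Q(x,y)\,dy \\
&= \int_d^c Q(x_1(y),y)\,dy + \int_c^d Q(x_2(y),y)\,dy \\
&= -\int_c^d Q(x_1(y),y)\,dy + \int_c^d Q(x_2(y),y)\,dy \\
&= \int_c^d \left( Q(x_2(y),y) - Q(x_1(y),y) \right) dy \\
&= \int_c^d \left( Q(x,y)\,\Big|_{x=x_1(y)}^{x=x_2(y)} \right) dy \\
&= \int_c^d\int_{x_1(y)}^{x_2(y)} \frac{\partial Q(x,y)}{\partial x}\,dx\,dy \quad\text{(by the Fundamental Theorem of Calculus)} \\
&= \iint_R \frac{\partial Q}{\partial x}\,dA, \quad\text{and so}
\end{aligned}
\]
\[
\begin{aligned}
\oint_C \mathbf{f}\cdot d\mathbf{r} &= \oint_C P(x,y)\,dx + \oint_C Q(x,y)\,dy \\
&= -\iint_R \frac{\partial P}{\partial y}\,dA + \iint_R \frac{\partial Q}{\partial x}\,dA \\
&= \iint_R \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA.
\end{aligned}
\]
(QED)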

Though we proved Green’s Theorem only for a simple region R , the theorem can also be proved for more general regions (say, a
union of simple regions).

Example 4.7
Evaluate \(\oint_C (x^2+y^2)\,dx + 2xy\,dy\), where C is the boundary (traversed counterclockwise) of the region
\(R = \{(x,y) : 0 \le x \le 1,\ 2x^2 \le y \le 2x\}\).
R is the shaded region in Figure 4.3.2. By Green’s Theorem, for \(P(x,y) = x^2+y^2\) and \(Q(x,y) = 2xy\), we have
\[
\begin{aligned}
\oint_C (x^2+y^2)\,dx + 2xy\,dy &= \iint_R \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA \\
&= \iint_R (2y - 2y)\,dA = \iint_R 0\,dA = 0.
\end{aligned}
\]
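
The zero obtained from Green's Theorem can be compared with a direct evaluation of the line integral. The sketch below parametrizes the two pieces of the boundary (the parabola y = 2x² from left to right, then the line y = 2x from right to left) and applies a midpoint rule. The parametrizations and the helper name `line_integral` are choices made here for illustration.

```python
def line_integral(P, Q, x, y, dx, dy, n=2000):
    """Midpoint-rule estimate of the integral of P dx + Q dy for t in [0, 1]."""
    dt = 1.0 / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        px, py = x(t), y(t)
        total += (P(px, py) * dx(t) + Q(px, py) * dy(t)) * dt
    return total

P = lambda x, y: x**2 + y**2
Q = lambda x, y: 2 * x * y

# Lower boundary y = 2x^2, traversed from (0, 0) to (1, 2).
lower = line_integral(P, Q, lambda t: t, lambda t: 2 * t**2,
                      lambda t: 1.0, lambda t: 4 * t)
# Upper boundary y = 2x, traversed from (1, 2) back to (0, 0).
upper = line_integral(P, Q, lambda t: 1 - t, lambda t: 2 * (1 - t),
                      lambda t: -1.0, lambda t: -2.0)

circulation = lower + upper
print(lower, upper, circulation)
```

The two pieces evaluate to about 13/3 and −13/3, so they cancel, in agreement with the double integral.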

Figure 4.3.2
We actually already knew that the answer was zero. Recall from Example 4.5 in Section 4.2 that the vector field
\(\mathbf{f}(x,y) = (x^2+y^2)\,\mathbf{i} + 2xy\,\mathbf{j}\) has a potential function \(F(x,y) = \tfrac{1}{3}x^3 + xy^2\), and so \(\oint_C \mathbf{f}\cdot d\mathbf{r} = 0\) by Corollary 4.6.

Example 4.8

Let \(\mathbf{f}(x,y) = P(x,y)\,\mathbf{i} + Q(x,y)\,\mathbf{j}\), where
\[ P(x,y) = \frac{-y}{x^2+y^2} \quad\text{and}\quad Q(x,y) = \frac{x}{x^2+y^2}, \]
and let \(R = \{(x,y) : 0 < x^2+y^2 \le 1\}\). For the boundary curve \(C : x^2+y^2 = 1\), traversed counterclockwise, it was shown in
Exercise 9(b) in Section 4.2 that \(\oint_C \mathbf{f}\cdot d\mathbf{r} = 2\pi\). But
\[ \frac{\partial Q}{\partial x} = \frac{y^2 - x^2}{(x^2+y^2)^2} = \frac{\partial P}{\partial y} \;\Rightarrow\; \iint_R \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA = \iint_R 0\,dA = 0 \]
This would seem to contradict Green’s Theorem. However, note that R is not the entire region enclosed by C, since the point
(0, 0) is not contained in R. That is, R has a “hole” at the origin, so Green’s Theorem does not apply.

If we modify the region \(R\) to be the annulus \(R = \{(x,y) : 1/4 \le x^2 + y^2 \le 1\}\) (see Figure 4.3.3), and take the “boundary” \(C\) of \(R\) to be \(C = C_1 \cup C_2\), where \(C_1\) is the unit circle \(x^2 + y^2 = 1\) traversed counterclockwise and \(C_2\) is the circle \(x^2 + y^2 = 1/4\) traversed clockwise, then it can be shown (see Exercise 8) that

\[
\oint_C \mathbf{f}\cdot d\mathbf{r} = 0.
\]

Figure 4.3.3 The annulus R

We would still have \(\iint_R \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA = 0\), so for this \(R\) we would have

\[
\oint_C \mathbf{f}\cdot d\mathbf{r} = \iint_R \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA,
\]

which shows that Green’s Theorem holds for the annular region \(R\).
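The two line integrals in Example 4.8 are easy to estimate numerically. The following sketch assumes a midpoint-rule discretization and a hypothetical helper `circulation`; neither comes from the text. It integrates \(P\,dx + Q\,dy\) around the circle of radius 1 counterclockwise and around the circle of radius \(1/2\) (that is, \(x^2 + y^2 = 1/4\)) clockwise.

```python
import math

def circulation(P, Q, radius, n=20000, clockwise=False):
    """Midpoint-rule estimate of the integral of P dx + Q dy around x^2 + y^2 = radius^2."""
    h = 2 * math.pi / n
    sign = -1.0 if clockwise else 1.0
    total = 0.0
    for k in range(n):
        t = sign * (k + 0.5) * h  # the angle runs backwards for a clockwise traversal
        x, y = radius * math.cos(t), radius * math.sin(t)
        dx = -radius * math.sin(t) * sign * h
        dy = radius * math.cos(t) * sign * h
        total += P(x, y) * dx + Q(x, y) * dy
    return total

P = lambda x, y: -y / (x**2 + y**2)
Q = lambda x, y: x / (x**2 + y**2)

outer = circulation(P, Q, 1.0)                  # C1, counterclockwise
inner = circulation(P, Q, 0.5, clockwise=True)  # C2, clockwise
print(outer, inner, outer + inner)
```

The outer integral comes out as \(2\pi\) and the inner one as \(-2\pi\), so their sum over \(C = C_1 \cup C_2\) vanishes, matching the double integral over the annulus.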

It turns out that Green’s Theorem can be extended to multiply connected regions, that is, regions like the annulus in Example 4.8,
which have one or more regions cut out from the interior, as opposed to discrete points being cut out. For such regions, the “outer”
boundary and the “inner” boundaries are traversed so that R is always on the left side.

Figure 4.3.4 Multiply connected regions


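The intuitive idea for why Green’s Theorem holds for multiply connected regions is shown in Figure 4.3.4 above. The idea is to cut “slits” between the boundaries of a multiply connected region \(R\) so that \(R\) is divided into subregions which do not have any “holes”. For example, in Figure 4.3.4(a) the region \(R\) is the union of the regions \(R_1\) and \(R_2\), which are divided by the slits indicated by the dashed lines. Those slits are part of the boundary of both \(R_1\) and \(R_2\), and we traverse them in the manner indicated by the arrows. Notice that along each slit the boundary of \(R_1\) is traversed in the opposite direction to that of \(R_2\), which means that the line integrals of \(\mathbf{f}\) along those slits cancel each other out. Since \(R_1\) and \(R_2\) do not have holes in them, Green’s Theorem holds in each subregion, so that

\[
\oint_{\text{bdy of } R_1} \mathbf{f}\cdot d\mathbf{r} = \iint_{R_1} \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA
\quad \text{and} \quad
\oint_{\text{bdy of } R_2} \mathbf{f}\cdot d\mathbf{r} = \iint_{R_2} \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA.
\]

But since the line integrals along the slits cancel out, we have

\[
\oint_{C_1 \cup C_2} \mathbf{f}\cdot d\mathbf{r} = \oint_{\text{bdy of } R_1} \mathbf{f}\cdot d\mathbf{r} + \oint_{\text{bdy of } R_2} \mathbf{f}\cdot d\mathbf{r},
\]

and so

\[
\oint_{C_1 \cup C_2} \mathbf{f}\cdot d\mathbf{r}
= \iint_{R_1} \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA
+ \iint_{R_2} \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA
= \iint_{R} \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA,
\]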

which shows that Green’s Theorem holds in the region R . A similar argument shows that the theorem holds in the region with two
holes shown in Figure 4.3.4(b).
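We know from Corollary 4.6 that when a smooth vector field \(\mathbf{f}(x,y) = P(x,y)\,\mathbf{i} + Q(x,y)\,\mathbf{j}\) on a region \(R\) (whose boundary is a piecewise smooth, simple closed curve \(C\)) has a potential in \(R\), then \(\oint_C \mathbf{f}\cdot d\mathbf{r} = 0\). And if the potential \(F(x,y)\) is smooth in \(R\), then \(\frac{\partial F}{\partial x} = P\) and \(\frac{\partial F}{\partial y} = Q\), and so we know that

\[
\frac{\partial^2 F}{\partial y\,\partial x} = \frac{\partial^2 F}{\partial x\,\partial y}
\quad \Rightarrow \quad
\frac{\partial P}{\partial y} = \frac{\partial Q}{\partial x} \ \text{ in } R.
\]

Conversely, if \(\frac{\partial P}{\partial y} = \frac{\partial Q}{\partial x}\) in \(R\), then

\[
\oint_C \mathbf{f}\cdot d\mathbf{r} = \iint_R \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA = \iint_R 0\,dA = 0.
\]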

For a simply connected region \(R\) (i.e. a region with no holes), the following can be shown:

The following statements are equivalent for a simply connected region \(R\) in \(\mathbb{R}^2\):

a. \(\mathbf{f}(x,y) = P(x,y)\,\mathbf{i} + Q(x,y)\,\mathbf{j}\) has a smooth potential \(F(x,y)\) in \(R\)

b. \(\int_C \mathbf{f}\cdot d\mathbf{r}\) is independent of the path for any curve \(C\) in \(R\)

c. \(\oint_C \mathbf{f}\cdot d\mathbf{r} = 0\) for every simple closed curve \(C\) in \(R\)

d. \(\frac{\partial P}{\partial y} = \frac{\partial Q}{\partial x}\) in \(R\) (in this case, the differential form \(P\,dx + Q\,dy\) is exact)

16.4: Green's Theorem is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
4.3: Green’s Theorem by Michael Corral is licensed GNU FDL. Original source: http://www.mecmath.net/.

16.5: Curl and Divergence
In this final section we will establish some relationships between the gradient, divergence and curl, and we will also introduce a
new quantity called the Laplacian. We will then show how to write these quantities in cylindrical and spherical coordinates.

Gradient
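For a real-valued function \(f(x,y,z)\) on \(\mathbb{R}^3\), the gradient \(\nabla f(x,y,z)\) is a vector-valued function on \(\mathbb{R}^3\), that is, its value at a point \((x,y,z)\) is the vector

\[
\nabla f(x,y,z) = \left( \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z} \right)
= \frac{\partial f}{\partial x}\,\mathbf{i} + \frac{\partial f}{\partial y}\,\mathbf{j} + \frac{\partial f}{\partial z}\,\mathbf{k}
\]

in \(\mathbb{R}^3\), where each of the partial derivatives is evaluated at the point \((x,y,z)\). So in this way, you can think of the symbol \(\nabla\) as being “applied” to a real-valued function \(f\) to produce a vector \(\nabla f\).

It turns out that the divergence and curl can also be expressed in terms of the symbol \(\nabla\). This is done by thinking of \(\nabla\) as a vector in \(\mathbb{R}^3\), namely

\[
\nabla = \frac{\partial}{\partial x}\,\mathbf{i} + \frac{\partial}{\partial y}\,\mathbf{j} + \frac{\partial}{\partial z}\,\mathbf{k}. \tag{16.5.1}
\]

Here, the symbols \(\frac{\partial}{\partial x}\), \(\frac{\partial}{\partial y}\) and \(\frac{\partial}{\partial z}\) are to be thought of as “partial derivative operators” that will get “applied” to a real-valued function, say \(f(x,y,z)\), to produce the partial derivatives \(\frac{\partial f}{\partial x}\), \(\frac{\partial f}{\partial y}\) and \(\frac{\partial f}{\partial z}\). For instance, \(\frac{\partial}{\partial x}\) “applied” to \(f(x,y,z)\) produces \(\frac{\partial f}{\partial x}\).

Is \(\nabla\) really a vector? Strictly speaking, no, since \(\frac{\partial}{\partial x}\), \(\frac{\partial}{\partial y}\) and \(\frac{\partial}{\partial z}\) are not actual numbers. But it helps to think of \(\nabla\) as a vector, especially with the divergence and curl, as we will soon see. The process of “applying” \(\frac{\partial}{\partial x}\), \(\frac{\partial}{\partial y}\), \(\frac{\partial}{\partial z}\) to a real-valued function \(f(x,y,z)\) is normally thought of as multiplying the quantities:

\[
\left( \frac{\partial}{\partial x} \right)(f) = \frac{\partial f}{\partial x}, \quad
\left( \frac{\partial}{\partial y} \right)(f) = \frac{\partial f}{\partial y}, \quad
\left( \frac{\partial}{\partial z} \right)(f) = \frac{\partial f}{\partial z}
\]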

For this reason, ∇ is often referred to as the “del operator”, since it “operates” on functions.

Divergence
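For example, it is often convenient to write the divergence \(\operatorname{div} \mathbf{f}\) as \(\nabla \cdot \mathbf{f}\), since for a vector field \(\mathbf{f}(x,y,z) = f_1(x,y,z)\,\mathbf{i} + f_2(x,y,z)\,\mathbf{j} + f_3(x,y,z)\,\mathbf{k}\), the dot product of \(\mathbf{f}\) with \(\nabla\) (thought of as a vector) makes sense:

\[
\begin{aligned}
\nabla \cdot \mathbf{f} &= \left( \frac{\partial}{\partial x}\,\mathbf{i} + \frac{\partial}{\partial y}\,\mathbf{j} + \frac{\partial}{\partial z}\,\mathbf{k} \right) \cdot \big( f_1(x,y,z)\,\mathbf{i} + f_2(x,y,z)\,\mathbf{j} + f_3(x,y,z)\,\mathbf{k} \big) \\
&= \left( \frac{\partial}{\partial x} \right)(f_1) + \left( \frac{\partial}{\partial y} \right)(f_2) + \left( \frac{\partial}{\partial z} \right)(f_3) \\
&= \frac{\partial f_1}{\partial x} + \frac{\partial f_2}{\partial y} + \frac{\partial f_3}{\partial z} \\
&= \operatorname{div} \mathbf{f}
\end{aligned}
\]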

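We can also write \(\operatorname{curl} \mathbf{f}\) in terms of \(\nabla\), namely as \(\nabla \times \mathbf{f}\), since for a vector field \(\mathbf{f}(x,y,z) = P(x,y,z)\,\mathbf{i} + Q(x,y,z)\,\mathbf{j} + R(x,y,z)\,\mathbf{k}\), we have:

\[
\begin{aligned}
\nabla \times \mathbf{f} &=
\begin{vmatrix}
\mathbf{i} & \mathbf{j} & \mathbf{k} \\
\dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\
P(x,y,z) & Q(x,y,z) & R(x,y,z)
\end{vmatrix} \\
&= \left( \frac{\partial R}{\partial y} - \frac{\partial Q}{\partial z} \right) \mathbf{i}
- \left( \frac{\partial R}{\partial x} - \frac{\partial P}{\partial z} \right) \mathbf{j}
+ \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) \mathbf{k} \\
&= \left( \frac{\partial R}{\partial y} - \frac{\partial Q}{\partial z} \right) \mathbf{i}
+ \left( \frac{\partial P}{\partial z} - \frac{\partial R}{\partial x} \right) \mathbf{j}
+ \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) \mathbf{k} \\
&= \operatorname{curl} \mathbf{f}
\end{aligned}
\]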

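For a real-valued function \(f(x,y,z)\), the gradient \(\nabla f(x,y,z) = \frac{\partial f}{\partial x}\,\mathbf{i} + \frac{\partial f}{\partial y}\,\mathbf{j} + \frac{\partial f}{\partial z}\,\mathbf{k}\) is a vector field, so we can take its divergence:

\[
\begin{aligned}
\operatorname{div} \nabla f &= \nabla \cdot \nabla f \\
&= \left( \frac{\partial}{\partial x}\,\mathbf{i} + \frac{\partial}{\partial y}\,\mathbf{j} + \frac{\partial}{\partial z}\,\mathbf{k} \right) \cdot \left( \frac{\partial f}{\partial x}\,\mathbf{i} + \frac{\partial f}{\partial y}\,\mathbf{j} + \frac{\partial f}{\partial z}\,\mathbf{k} \right) \\
&= \frac{\partial}{\partial x}\left( \frac{\partial f}{\partial x} \right) + \frac{\partial}{\partial y}\left( \frac{\partial f}{\partial y} \right) + \frac{\partial}{\partial z}\left( \frac{\partial f}{\partial z} \right) \\
&= \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2}
\end{aligned}
\]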

Note that this is a real-valued function, to which we will give a special name:

Definition 4.7: Laplacian

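For a real-valued function \(f(x,y,z)\), the Laplacian of \(f\), denoted by \(\Delta f\), is given by

\[
\Delta f(x,y,z) = \nabla \cdot \nabla f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2}. \tag{16.5.2}
\]

Often the notation \(\nabla^2 f\) is used for the Laplacian instead of \(\Delta f\), using the convention \(\nabla^2 = \nabla \cdot \nabla\).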

Example 4.17

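Let \(\mathbf{r}(x,y,z) = x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}\) be the position vector field on \(\mathbb{R}^3\). Then \(\|\mathbf{r}(x,y,z)\|^2 = \mathbf{r}\cdot\mathbf{r} = x^2 + y^2 + z^2\) is a real-valued function. Find
a. the gradient of \(\|\mathbf{r}\|^2\)
b. the divergence of \(\mathbf{r}\)
c. the curl of \(\mathbf{r}\)
d. the Laplacian of \(\|\mathbf{r}\|^2\)

Solution:
(a) \(\nabla \|\mathbf{r}\|^2 = 2x\,\mathbf{i} + 2y\,\mathbf{j} + 2z\,\mathbf{k} = 2\mathbf{r}\)

(b) \(\nabla \cdot \mathbf{r} = \frac{\partial}{\partial x}(x) + \frac{\partial}{\partial y}(y) + \frac{\partial}{\partial z}(z) = 1 + 1 + 1 = 3\)

(c)
\[
\nabla \times \mathbf{r} =
\begin{vmatrix}
\mathbf{i} & \mathbf{j} & \mathbf{k} \\
\dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\
x & y & z
\end{vmatrix}
= (0 - 0)\,\mathbf{i} - (0 - 0)\,\mathbf{j} + (0 - 0)\,\mathbf{k} = \mathbf{0}
\]

(d) \(\Delta \|\mathbf{r}\|^2 = \frac{\partial^2}{\partial x^2}(x^2 + y^2 + z^2) + \frac{\partial^2}{\partial y^2}(x^2 + y^2 + z^2) + \frac{\partial^2}{\partial z^2}(x^2 + y^2 + z^2) = 2 + 2 + 2 = 6\)

Note that we could have calculated \(\Delta \|\mathbf{r}\|^2\) another way, using the \(\nabla\) notation along with parts (a) and (b):

\[
\Delta \|\mathbf{r}\|^2 = \nabla \cdot \nabla \|\mathbf{r}\|^2 = \nabla \cdot 2\mathbf{r} = 2\,\nabla \cdot \mathbf{r} = 2(3) = 6
\]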
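The four answers in Example 4.17 can be cross-checked with central finite differences. The sketch below is an illustration only: the helper names (`partial`, `gradient`, `divergence`, `curl`, `laplacian`), the step size, and the sample point are all assumptions introduced here.

```python
def partial(g, p, i, h=1e-4):
    """Central-difference estimate of the partial derivative of g in coordinate i at the point p."""
    a, b = list(p), list(p)
    a[i] += h
    b[i] -= h
    return (g(*a) - g(*b)) / (2 * h)

def gradient(F, p):
    return [partial(F, p, i) for i in range(3)]

def divergence(f, p):
    # Sum of d(f_i)/d(x_i) over the three coordinates.
    return sum(partial(lambda *q, i=i: f(*q)[i], p, i) for i in range(3))

def curl(f, p):
    d = lambda comp, var: partial(lambda *q: f(*q)[comp], p, var)
    return [d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1)]

def laplacian(F, p):
    # Divergence of the gradient, both estimated numerically.
    return divergence(lambda *q: gradient(F, q), p)

r = lambda x, y, z: (x, y, z)            # position vector field
norm2 = lambda x, y, z: x*x + y*y + z*z  # ||r||^2

p = (1.0, -2.0, 0.5)                     # arbitrary sample point
grad_norm2 = gradient(norm2, p)
div_r = divergence(r, p)
curl_r = curl(r, p)
lap_norm2 = laplacian(norm2, p)
print(grad_norm2, div_r, curl_r, lap_norm2)
```

At the sample point the estimates agree with \(2\mathbf{r} = (2, -4, 1)\), \(3\), \(\mathbf{0}\), and \(6\) up to rounding.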

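Notice that in Example 4.17 if we take the curl of the gradient of \(\|\mathbf{r}\|^2\) we get

\[
\nabla \times (\nabla \|\mathbf{r}\|^2) = \nabla \times 2\mathbf{r} = 2\,\nabla \times \mathbf{r} = 2\,\mathbf{0} = \mathbf{0}.
\]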

The following theorem shows that this will be the case in general:

Theorem 4.15.
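For any smooth real-valued function \(f(x,y,z)\), \(\nabla \times (\nabla f) = \mathbf{0}\).

Proof
We see by the smoothness of \(f\) that

\[
\nabla \times (\nabla f) =
\begin{vmatrix}
\mathbf{i} & \mathbf{j} & \mathbf{k} \\
\dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\
\dfrac{\partial f}{\partial x} & \dfrac{\partial f}{\partial y} & \dfrac{\partial f}{\partial z}
\end{vmatrix}
\tag{16.5.3}
\]

\[
= \left( \frac{\partial^2 f}{\partial y\,\partial z} - \frac{\partial^2 f}{\partial z\,\partial y} \right) \mathbf{i}
- \left( \frac{\partial^2 f}{\partial x\,\partial z} - \frac{\partial^2 f}{\partial z\,\partial x} \right) \mathbf{j}
+ \left( \frac{\partial^2 f}{\partial x\,\partial y} - \frac{\partial^2 f}{\partial y\,\partial x} \right) \mathbf{k} = \mathbf{0},
\tag{16.5.4}
\]

since the mixed partial derivatives in each component are equal.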


Corollary 4.16

If a vector field f (x, y, z) has a potential, then curl f = 0 .

Another way of stating Theorem 4.15 is that gradients are irrotational. Also, notice that in Example 4.17 if we take the divergence
of the curl of r we trivially get
∇ ⋅ (∇ × r) = ∇ ⋅ 0 = 0. (16.5.5)

The following theorem shows that this will be the case in general:

Theorem 4.17.

For any smooth vector field f(x, y, z), ∇ ⋅ (∇ × f) = 0.

The proof is straightforward and left as an exercise for the reader.

Corollary 4.18
The flux of the curl of a smooth vector field f (x, y, z) through any closed surface is zero.

Proof: Let \(\Sigma\) be a closed surface which bounds a solid \(S\). The flux of \(\nabla \times \mathbf{f}\) through \(\Sigma\) is

\[
\begin{aligned}
\iint_\Sigma (\nabla \times \mathbf{f}) \cdot d\boldsymbol{\sigma} &= \iiint_S \nabla \cdot (\nabla \times \mathbf{f})\,dV \quad \text{(by the Divergence Theorem)} \\
&= \iiint_S 0\,dV \quad \text{(by Theorem 4.17)} \\
&= 0.
\end{aligned}
\]

(QED)

There is another method for proving Theorem 4.15 which can be useful, and is often used in physics. Namely, if the surface integral \(\iint_\Sigma f(x,y,z)\,d\sigma = 0\) for all surfaces \(\Sigma\) in some solid region (usually all of \(\mathbb{R}^3\)), then we must have \(f(x,y,z) = 0\) throughout that region. The proof is not trivial, and physicists do not usually bother to prove it. But the result is true, and can also be applied to double and triple integrals.

For instance, to prove Theorem 4.15, assume that \(f(x,y,z)\) is a smooth real-valued function on \(\mathbb{R}^3\). Let \(C\) be a simple closed curve in \(\mathbb{R}^3\) and let \(\Sigma\) be any capping surface for \(C\) (i.e. \(\Sigma\) is orientable and its boundary is \(C\)). Since \(\nabla f\) is a vector field, then

\[
\begin{aligned}
\iint_\Sigma (\nabla \times (\nabla f)) \cdot \mathbf{n}\,d\sigma &= \oint_C \nabla f \cdot d\mathbf{r} \quad \text{by Stokes’ Theorem, so} \\
&= 0 \quad \text{by Corollary 4.13.}
\end{aligned}
\]

Since the choice of \(\Sigma\) was arbitrary, we must have \((\nabla \times (\nabla f)) \cdot \mathbf{n} = 0\) throughout \(\mathbb{R}^3\), where \(\mathbf{n}\) is any unit vector. Using \(\mathbf{i}\), \(\mathbf{j}\) and \(\mathbf{k}\) in place of \(\mathbf{n}\), we see that we must have \(\nabla \times (\nabla f) = \mathbf{0}\) in \(\mathbb{R}^3\), which completes the proof.
3

Example 4.18
A system of electric charges has a charge density ρ(x, y, z) and produces an electrostatic field E(x, y, z) at points (x, y, z) in
space. Gauss’ Law states that

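\[
\iint_\Sigma \mathbf{E} \cdot d\boldsymbol{\sigma} = 4\pi \iiint_S \rho\,dV
\]

for any closed surface \(\Sigma\) which encloses the charges, with \(S\) being the solid region enclosed by \(\Sigma\). Show that \(\nabla \cdot \mathbf{E} = 4\pi\rho\). This is one of Maxwell’s Equations.

Solution
By the Divergence Theorem, we have

\[
\begin{aligned}
\iiint_S \nabla \cdot \mathbf{E}\,dV &= \iint_\Sigma \mathbf{E} \cdot d\boldsymbol{\sigma} \\
&= 4\pi \iiint_S \rho\,dV \quad \text{by Gauss’ Law, so combining the integrals gives} \\
\iiint_S (\nabla \cdot \mathbf{E} - 4\pi\rho)\,dV &= 0, \ \text{so} \\
\nabla \cdot \mathbf{E} - 4\pi\rho &= 0 \quad \text{since } \Sigma \text{ and hence } S \text{ was arbitrary, so} \\
\nabla \cdot \mathbf{E} &= 4\pi\rho.
\end{aligned}
\]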

Often (especially in physics) it is convenient to use other coordinate systems when dealing with quantities such as the gradient,
divergence, curl and Laplacian. We will present the formulas for these in cylindrical and spherical coordinates.
Recall from Section 1.7 that a point \((x,y,z)\) can be represented in cylindrical coordinates \((r,\theta,z)\), where \(x = r\cos\theta\), \(y = r\sin\theta\), \(z = z\). At each point \((r,\theta,z)\), let \(\mathbf{e}_r, \mathbf{e}_\theta, \mathbf{e}_z\) be unit vectors in the direction of increasing \(r, \theta, z\), respectively (see Figure 4.6.1). Then \(\mathbf{e}_r, \mathbf{e}_\theta, \mathbf{e}_z\) form an orthonormal set of vectors. Note, by the right-hand rule, that \(\mathbf{e}_z \times \mathbf{e}_r = \mathbf{e}_\theta\).

Figure 4.6.1 Orthonormal vectors \(\mathbf{e}_r, \mathbf{e}_\theta, \mathbf{e}_z\) in cylindrical coordinates (left) and spherical coordinates (right).

Similarly, a point \((x,y,z)\) can be represented in spherical coordinates \((\rho,\theta,\varphi)\), where \(x = \rho\sin\varphi\cos\theta\), \(y = \rho\sin\varphi\sin\theta\), \(z = \rho\cos\varphi\). At each point \((\rho,\theta,\varphi)\), let \(\mathbf{e}_\rho, \mathbf{e}_\theta, \mathbf{e}_\varphi\) be unit vectors in the direction of increasing \(\rho, \theta, \varphi\), respectively (see Figure 4.6.2). Then the vectors \(\mathbf{e}_\rho, \mathbf{e}_\theta, \mathbf{e}_\varphi\) are orthonormal. By the right-hand rule, we see that \(\mathbf{e}_\theta \times \mathbf{e}_\rho = \mathbf{e}_\varphi\).

We can now summarize the expressions for the gradient, divergence, curl and Laplacian in Cartesian, cylindrical and spherical
coordinates in the following tables:

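Cartesian \((x,y,z)\): Scalar function \(F\); Vector field \(\mathbf{f} = f_1\,\mathbf{i} + f_2\,\mathbf{j} + f_3\,\mathbf{k}\)

gradient: \(\nabla F = \dfrac{\partial F}{\partial x}\,\mathbf{i} + \dfrac{\partial F}{\partial y}\,\mathbf{j} + \dfrac{\partial F}{\partial z}\,\mathbf{k}\)

divergence: \(\nabla \cdot \mathbf{f} = \dfrac{\partial f_1}{\partial x} + \dfrac{\partial f_2}{\partial y} + \dfrac{\partial f_3}{\partial z}\)

curl: \(\nabla \times \mathbf{f} = \left( \dfrac{\partial f_3}{\partial y} - \dfrac{\partial f_2}{\partial z} \right) \mathbf{i} + \left( \dfrac{\partial f_1}{\partial z} - \dfrac{\partial f_3}{\partial x} \right) \mathbf{j} + \left( \dfrac{\partial f_2}{\partial x} - \dfrac{\partial f_1}{\partial y} \right) \mathbf{k}\)

Laplacian: \(\Delta F = \dfrac{\partial^2 F}{\partial x^2} + \dfrac{\partial^2 F}{\partial y^2} + \dfrac{\partial^2 F}{\partial z^2}\)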

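Cylindrical \((r,\theta,z)\): Scalar function \(F\); Vector field \(\mathbf{f} = f_r\,\mathbf{e}_r + f_\theta\,\mathbf{e}_\theta + f_z\,\mathbf{e}_z\)

gradient: \(\nabla F = \dfrac{\partial F}{\partial r}\,\mathbf{e}_r + \dfrac{1}{r}\dfrac{\partial F}{\partial \theta}\,\mathbf{e}_\theta + \dfrac{\partial F}{\partial z}\,\mathbf{e}_z\)

divergence: \(\nabla \cdot \mathbf{f} = \dfrac{1}{r}\dfrac{\partial}{\partial r}(r f_r) + \dfrac{1}{r}\dfrac{\partial f_\theta}{\partial \theta} + \dfrac{\partial f_z}{\partial z}\)

curl: \(\nabla \times \mathbf{f} = \left( \dfrac{1}{r}\dfrac{\partial f_z}{\partial \theta} - \dfrac{\partial f_\theta}{\partial z} \right) \mathbf{e}_r + \left( \dfrac{\partial f_r}{\partial z} - \dfrac{\partial f_z}{\partial r} \right) \mathbf{e}_\theta + \dfrac{1}{r}\left( \dfrac{\partial}{\partial r}(r f_\theta) - \dfrac{\partial f_r}{\partial \theta} \right) \mathbf{e}_z\)

Laplacian: \(\Delta F = \dfrac{1}{r}\dfrac{\partial}{\partial r}\left( r\,\dfrac{\partial F}{\partial r} \right) + \dfrac{1}{r^2}\dfrac{\partial^2 F}{\partial \theta^2} + \dfrac{\partial^2 F}{\partial z^2}\)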

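Spherical \((\rho,\theta,\varphi)\): Scalar function \(F\); Vector field \(\mathbf{f} = f_\rho\,\mathbf{e}_\rho + f_\theta\,\mathbf{e}_\theta + f_\varphi\,\mathbf{e}_\varphi\)

gradient: \(\nabla F = \dfrac{\partial F}{\partial \rho}\,\mathbf{e}_\rho + \dfrac{1}{\rho\sin\varphi}\dfrac{\partial F}{\partial \theta}\,\mathbf{e}_\theta + \dfrac{1}{\rho}\dfrac{\partial F}{\partial \varphi}\,\mathbf{e}_\varphi\)

divergence: \(\nabla \cdot \mathbf{f} = \dfrac{1}{\rho^2}\dfrac{\partial}{\partial \rho}(\rho^2 f_\rho) + \dfrac{1}{\rho\sin\varphi}\dfrac{\partial f_\theta}{\partial \theta} + \dfrac{1}{\rho\sin\varphi}\dfrac{\partial}{\partial \varphi}(\sin\varphi\, f_\varphi)\)

curl: \(\nabla \times \mathbf{f} = \dfrac{1}{\rho\sin\varphi}\left( \dfrac{\partial}{\partial \varphi}(\sin\varphi\, f_\theta) - \dfrac{\partial f_\varphi}{\partial \theta} \right) \mathbf{e}_\rho + \dfrac{1}{\rho}\left( \dfrac{\partial}{\partial \rho}(\rho f_\varphi) - \dfrac{\partial f_\rho}{\partial \varphi} \right) \mathbf{e}_\theta + \left( \dfrac{1}{\rho\sin\varphi}\dfrac{\partial f_\rho}{\partial \theta} - \dfrac{1}{\rho}\dfrac{\partial}{\partial \rho}(\rho f_\theta) \right) \mathbf{e}_\varphi\)

Laplacian: \(\Delta F = \dfrac{1}{\rho^2}\dfrac{\partial}{\partial \rho}\left( \rho^2\,\dfrac{\partial F}{\partial \rho} \right) + \dfrac{1}{\rho^2\sin^2\varphi}\dfrac{\partial^2 F}{\partial \theta^2} + \dfrac{1}{\rho^2\sin\varphi}\dfrac{\partial}{\partial \varphi}\left( \sin\varphi\,\dfrac{\partial F}{\partial \varphi} \right)\)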

The derivation of the above formulas for cylindrical and spherical coordinates is straightforward but extremely tedious. The basic
idea is to take the Cartesian equivalent of the quantity in question and to substitute into that formula using the appropriate
coordinate transformation. As an example, we will derive the formula for the gradient in spherical coordinates.
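Goal: Show that the gradient of a real-valued function \(F(\rho,\theta,\varphi)\) in spherical coordinates is:

\[
\nabla F = \frac{\partial F}{\partial \rho}\,\mathbf{e}_\rho + \frac{1}{\rho\sin\varphi}\frac{\partial F}{\partial \theta}\,\mathbf{e}_\theta + \frac{1}{\rho}\frac{\partial F}{\partial \varphi}\,\mathbf{e}_\varphi
\]

Idea: In the Cartesian gradient formula \(\nabla F(x,y,z) = \frac{\partial F}{\partial x}\,\mathbf{i} + \frac{\partial F}{\partial y}\,\mathbf{j} + \frac{\partial F}{\partial z}\,\mathbf{k}\), put the Cartesian basis vectors \(\mathbf{i}, \mathbf{j}, \mathbf{k}\) in terms of the spherical coordinate basis vectors \(\mathbf{e}_\rho, \mathbf{e}_\theta, \mathbf{e}_\varphi\) and functions of \(\rho\), \(\theta\) and \(\varphi\). Then put the partial derivatives \(\frac{\partial F}{\partial x}, \frac{\partial F}{\partial y}, \frac{\partial F}{\partial z}\) in terms of \(\frac{\partial F}{\partial \rho}, \frac{\partial F}{\partial \theta}, \frac{\partial F}{\partial \varphi}\) and functions of \(\rho\), \(\theta\) and \(\varphi\).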

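Step 1: Get formulas for \(\mathbf{e}_\rho, \mathbf{e}_\theta, \mathbf{e}_\varphi\) in terms of \(\mathbf{i}, \mathbf{j}, \mathbf{k}\).

We can see from Figure 4.6.2 that the unit vector \(\mathbf{e}_\rho\) in the \(\rho\) direction at a general point \((\rho,\theta,\varphi)\) is \(\mathbf{e}_\rho = \frac{\mathbf{r}}{\|\mathbf{r}\|}\), where \(\mathbf{r} = x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}\) is the position vector of the point in Cartesian coordinates. Thus,

\[
\mathbf{e}_\rho = \frac{\mathbf{r}}{\|\mathbf{r}\|} = \frac{x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}}{\sqrt{x^2 + y^2 + z^2}},
\]

so using \(x = \rho\sin\varphi\cos\theta\), \(y = \rho\sin\varphi\sin\theta\), \(z = \rho\cos\varphi\), and \(\rho = \sqrt{x^2 + y^2 + z^2}\), we get:

\[
\mathbf{e}_\rho = \sin\varphi\cos\theta\,\mathbf{i} + \sin\varphi\sin\theta\,\mathbf{j} + \cos\varphi\,\mathbf{k}
\]

Now, since the angle \(\theta\) is measured in the \(xy\)-plane, the unit vector \(\mathbf{e}_\theta\) in the \(\theta\) direction must be parallel to the \(xy\)-plane. That is, \(\mathbf{e}_\theta\) is of the form \(a\,\mathbf{i} + b\,\mathbf{j} + 0\,\mathbf{k}\). To figure out what \(a\) and \(b\) are, note that since \(\mathbf{e}_\theta \perp \mathbf{e}_\rho\), then in particular \(\mathbf{e}_\theta \perp \mathbf{e}_\rho\) when \(\mathbf{e}_\rho\) is in the \(xy\)-plane. That occurs when the angle \(\varphi\) is \(\pi/2\). Putting \(\varphi = \pi/2\) into the formula for \(\mathbf{e}_\rho\) gives \(\mathbf{e}_\rho = \cos\theta\,\mathbf{i} + \sin\theta\,\mathbf{j} + 0\,\mathbf{k}\), and we see that a vector perpendicular to that is \(-\sin\theta\,\mathbf{i} + \cos\theta\,\mathbf{j} + 0\,\mathbf{k}\). Since this vector is also a unit vector and points in the (positive) \(\theta\) direction, it must be \(\mathbf{e}_\theta\):

\[
\mathbf{e}_\theta = -\sin\theta\,\mathbf{i} + \cos\theta\,\mathbf{j} + 0\,\mathbf{k}
\]

Lastly, since \(\mathbf{e}_\varphi = \mathbf{e}_\theta \times \mathbf{e}_\rho\), we get:

\[
\mathbf{e}_\varphi = \cos\varphi\cos\theta\,\mathbf{i} + \cos\varphi\sin\theta\,\mathbf{j} - \sin\varphi\,\mathbf{k}
\]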
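The three formulas from Step 1 can be sanity-checked numerically. In the sketch below, the helper names `spherical_basis`, `dot`, and `cross` and the sample angles are assumptions made for illustration; the script checks that the vectors are orthonormal and that \(\mathbf{e}_\theta \times \mathbf{e}_\rho\) reproduces \(\mathbf{e}_\varphi\).

```python
import math

def spherical_basis(theta, phi):
    """Return (e_rho, e_theta, e_phi) as Cartesian triples, using the formulas from Step 1."""
    e_rho = (math.sin(phi) * math.cos(theta), math.sin(phi) * math.sin(theta), math.cos(phi))
    e_theta = (-math.sin(theta), math.cos(theta), 0.0)
    e_phi = (math.cos(phi) * math.cos(theta), math.cos(phi) * math.sin(theta), -math.sin(phi))
    return e_rho, e_theta, e_phi

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

theta, phi = 0.7, 1.1  # arbitrary sample angles
e_rho, e_theta, e_phi = spherical_basis(theta, phi)

# Gram matrix: should be the 3x3 identity if the vectors are orthonormal.
basis = (e_rho, e_theta, e_phi)
gram = [[dot(a, b) for b in basis] for a in basis]
cross_check = cross(e_theta, e_rho)  # should equal e_phi
print(gram, cross_check, e_phi)
```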

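Step 2: Use the three formulas from Step 1 to solve for \(\mathbf{i}, \mathbf{j}, \mathbf{k}\) in terms of \(\mathbf{e}_\rho, \mathbf{e}_\theta, \mathbf{e}_\varphi\).

This comes down to solving a system of three equations in three unknowns. There are many ways of doing this, but we will do it by combining the formulas for \(\mathbf{e}_\rho\) and \(\mathbf{e}_\varphi\) to eliminate \(\mathbf{k}\), which will give us an equation involving just \(\mathbf{i}\) and \(\mathbf{j}\). This, with the formula for \(\mathbf{e}_\theta\), will then leave us with a system of two equations in two unknowns (\(\mathbf{i}\) and \(\mathbf{j}\)), which we will use to solve first for \(\mathbf{j}\) then for \(\mathbf{i}\). Lastly, we will solve for \(\mathbf{k}\).

First, note that

\[
\sin\varphi\,\mathbf{e}_\rho + \cos\varphi\,\mathbf{e}_\varphi = \cos\theta\,\mathbf{i} + \sin\theta\,\mathbf{j}
\]

so that

\[
\sin\theta\,(\sin\varphi\,\mathbf{e}_\rho + \cos\varphi\,\mathbf{e}_\varphi) + \cos\theta\,\mathbf{e}_\theta = (\sin^2\theta + \cos^2\theta)\,\mathbf{j} = \mathbf{j},
\]

and so:

\[
\mathbf{j} = \sin\varphi\sin\theta\,\mathbf{e}_\rho + \cos\theta\,\mathbf{e}_\theta + \cos\varphi\sin\theta\,\mathbf{e}_\varphi
\]

Likewise, we see that

\[
\cos\theta\,(\sin\varphi\,\mathbf{e}_\rho + \cos\varphi\,\mathbf{e}_\varphi) - \sin\theta\,\mathbf{e}_\theta = (\cos^2\theta + \sin^2\theta)\,\mathbf{i} = \mathbf{i},
\]

and so:

\[
\mathbf{i} = \sin\varphi\cos\theta\,\mathbf{e}_\rho - \sin\theta\,\mathbf{e}_\theta + \cos\varphi\cos\theta\,\mathbf{e}_\varphi
\]

Lastly, we see that:

\[
\mathbf{k} = \cos\varphi\,\mathbf{e}_\rho - \sin\varphi\,\mathbf{e}_\varphi
\]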

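Step 3: Get formulas for \(\frac{\partial F}{\partial \rho}, \frac{\partial F}{\partial \theta}, \frac{\partial F}{\partial \varphi}\) in terms of \(\frac{\partial F}{\partial x}, \frac{\partial F}{\partial y}, \frac{\partial F}{\partial z}\).

By the Chain Rule, we have

\[
\begin{aligned}
\frac{\partial F}{\partial \rho} &= \frac{\partial F}{\partial x}\frac{\partial x}{\partial \rho} + \frac{\partial F}{\partial y}\frac{\partial y}{\partial \rho} + \frac{\partial F}{\partial z}\frac{\partial z}{\partial \rho}, \\
\frac{\partial F}{\partial \theta} &= \frac{\partial F}{\partial x}\frac{\partial x}{\partial \theta} + \frac{\partial F}{\partial y}\frac{\partial y}{\partial \theta} + \frac{\partial F}{\partial z}\frac{\partial z}{\partial \theta}, \\
\frac{\partial F}{\partial \varphi} &= \frac{\partial F}{\partial x}\frac{\partial x}{\partial \varphi} + \frac{\partial F}{\partial y}\frac{\partial y}{\partial \varphi} + \frac{\partial F}{\partial z}\frac{\partial z}{\partial \varphi},
\end{aligned}
\]

which yields:

\[
\begin{aligned}
\frac{\partial F}{\partial \rho} &= \sin\varphi\cos\theta\,\frac{\partial F}{\partial x} + \sin\varphi\sin\theta\,\frac{\partial F}{\partial y} + \cos\varphi\,\frac{\partial F}{\partial z} \\
\frac{\partial F}{\partial \theta} &= -\rho\sin\varphi\sin\theta\,\frac{\partial F}{\partial x} + \rho\sin\varphi\cos\theta\,\frac{\partial F}{\partial y} \\
\frac{\partial F}{\partial \varphi} &= \rho\cos\varphi\cos\theta\,\frac{\partial F}{\partial x} + \rho\cos\varphi\sin\theta\,\frac{\partial F}{\partial y} - \rho\sin\varphi\,\frac{\partial F}{\partial z}
\end{aligned}
\tag{16.5.9}
\]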

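Step 4: Use the three formulas from Step 3 to solve for \(\frac{\partial F}{\partial x}, \frac{\partial F}{\partial y}, \frac{\partial F}{\partial z}\) in terms of \(\frac{\partial F}{\partial \rho}, \frac{\partial F}{\partial \theta}, \frac{\partial F}{\partial \varphi}\).

Again, this involves solving a system of three equations in three unknowns. Using a similar process of elimination as in Step 2, we get:

\[
\begin{aligned}
\frac{\partial F}{\partial x} &= \frac{1}{\rho\sin\varphi}\left( \rho\sin^2\varphi\cos\theta\,\frac{\partial F}{\partial \rho} - \sin\theta\,\frac{\partial F}{\partial \theta} + \sin\varphi\cos\varphi\cos\theta\,\frac{\partial F}{\partial \varphi} \right) \\
\frac{\partial F}{\partial y} &= \frac{1}{\rho\sin\varphi}\left( \rho\sin^2\varphi\sin\theta\,\frac{\partial F}{\partial \rho} + \cos\theta\,\frac{\partial F}{\partial \theta} + \sin\varphi\cos\varphi\sin\theta\,\frac{\partial F}{\partial \varphi} \right) \\
\frac{\partial F}{\partial z} &= \frac{1}{\rho}\left( \rho\cos\varphi\,\frac{\partial F}{\partial \rho} - \sin\varphi\,\frac{\partial F}{\partial \varphi} \right)
\end{aligned}
\tag{16.5.10}
\]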

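Step 5: Substitute the formulas for \(\mathbf{i}, \mathbf{j}, \mathbf{k}\) from Step 2 and the formulas for \(\frac{\partial F}{\partial x}, \frac{\partial F}{\partial y}, \frac{\partial F}{\partial z}\) from Step 4 into the Cartesian gradient formula \(\nabla F(x,y,z) = \frac{\partial F}{\partial x}\,\mathbf{i} + \frac{\partial F}{\partial y}\,\mathbf{j} + \frac{\partial F}{\partial z}\,\mathbf{k}\).

Doing this last step is perhaps the most tedious, since it involves simplifying \(3 \times 3 + 3 \times 3 + 2 \times 2 = 22\) terms! Namely,

\[
\begin{aligned}
\nabla F ={}& \frac{1}{\rho\sin\varphi}\left( \rho\sin^2\varphi\cos\theta\,\frac{\partial F}{\partial \rho} - \sin\theta\,\frac{\partial F}{\partial \theta} + \sin\varphi\cos\varphi\cos\theta\,\frac{\partial F}{\partial \varphi} \right)(\sin\varphi\cos\theta\,\mathbf{e}_\rho - \sin\theta\,\mathbf{e}_\theta + \cos\varphi\cos\theta\,\mathbf{e}_\varphi) \\
&+ \frac{1}{\rho\sin\varphi}\left( \rho\sin^2\varphi\sin\theta\,\frac{\partial F}{\partial \rho} + \cos\theta\,\frac{\partial F}{\partial \theta} + \sin\varphi\cos\varphi\sin\theta\,\frac{\partial F}{\partial \varphi} \right)(\sin\varphi\sin\theta\,\mathbf{e}_\rho + \cos\theta\,\mathbf{e}_\theta + \cos\varphi\sin\theta\,\mathbf{e}_\varphi) \\
&+ \frac{1}{\rho}\left( \rho\cos\varphi\,\frac{\partial F}{\partial \rho} - \sin\varphi\,\frac{\partial F}{\partial \varphi} \right)(\cos\varphi\,\mathbf{e}_\rho - \sin\varphi\,\mathbf{e}_\varphi),
\end{aligned}
\]

which we see has 8 terms involving \(\mathbf{e}_\rho\), 6 terms involving \(\mathbf{e}_\theta\), and 8 terms involving \(\mathbf{e}_\varphi\). But the algebra is straightforward and yields the desired result:

\[
\nabla F = \frac{\partial F}{\partial \rho}\,\mathbf{e}_\rho + \frac{1}{\rho\sin\varphi}\frac{\partial F}{\partial \theta}\,\mathbf{e}_\theta + \frac{1}{\rho}\frac{\partial F}{\partial \varphi}\,\mathbf{e}_\varphi \quad ✓ \tag{16.5.11}
\]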
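Since the algebra in Step 5 is long, a numerical spot check of the final formula can be reassuring. The sketch below is not part of the derivation: the test function \(F(x,y,z) = xy + z^2\), the helper names, the step size, and the sample point are all assumptions chosen here. It evaluates the spherical gradient formula with central differences and compares the result with the Cartesian gradient \((y, x, 2z)\).

```python
import math

def to_cartesian(rho, theta, phi):
    s = math.sin(phi)
    return (rho * s * math.cos(theta), rho * s * math.sin(theta), rho * math.cos(phi))

def F(x, y, z):
    # Hypothetical test function; its Cartesian gradient is (y, x, 2z).
    return x * y + z**2

def G(rho, theta, phi):
    # The same function expressed in spherical coordinates.
    return F(*to_cartesian(rho, theta, phi))

def partial(g, p, i, h=1e-5):
    """Central-difference estimate of the partial derivative of g in coordinate i at p."""
    a, b = list(p), list(p)
    a[i] += h
    b[i] -= h
    return (g(*a) - g(*b)) / (2 * h)

def spherical_gradient(p):
    rho, theta, phi = p
    e_rho = (math.sin(phi) * math.cos(theta), math.sin(phi) * math.sin(theta), math.cos(phi))
    e_theta = (-math.sin(theta), math.cos(theta), 0.0)
    e_phi = (math.cos(phi) * math.cos(theta), math.cos(phi) * math.sin(theta), -math.sin(phi))
    c_rho = partial(G, p, 0)
    c_theta = partial(G, p, 1) / (rho * math.sin(phi))
    c_phi = partial(G, p, 2) / rho
    # Assemble the gradient as a Cartesian triple so it can be compared directly.
    return tuple(c_rho * a + c_theta * b + c_phi * c for a, b, c in zip(e_rho, e_theta, e_phi))

p = (2.0, 0.7, 1.1)  # arbitrary sample point (rho, theta, phi)
x, y, z = to_cartesian(*p)
from_formula = spherical_gradient(p)
expected = (y, x, 2 * z)
error = max(abs(a - b) for a, b in zip(from_formula, expected))
print(from_formula, expected, error)
```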

Example 4.19

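In Example 4.17 we showed that \(\nabla \|\mathbf{r}\|^2 = 2\mathbf{r}\) and \(\Delta \|\mathbf{r}\|^2 = 6\), where \(\mathbf{r}(x,y,z) = x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}\) in Cartesian coordinates. Verify that we get the same answers if we switch to spherical coordinates.

Solution
Since \(\|\mathbf{r}\|^2 = x^2 + y^2 + z^2 = \rho^2\) in spherical coordinates, let \(F(\rho,\theta,\varphi) = \rho^2\) (so that \(F(\rho,\theta,\varphi) = \|\mathbf{r}\|^2\)). The gradient of \(F\) in spherical coordinates is

\[
\begin{aligned}
\nabla F &= \frac{\partial F}{\partial \rho}\,\mathbf{e}_\rho + \frac{1}{\rho\sin\varphi}\frac{\partial F}{\partial \theta}\,\mathbf{e}_\theta + \frac{1}{\rho}\frac{\partial F}{\partial \varphi}\,\mathbf{e}_\varphi \\
&= 2\rho\,\mathbf{e}_\rho + \frac{1}{\rho\sin\varphi}(0)\,\mathbf{e}_\theta + \frac{1}{\rho}(0)\,\mathbf{e}_\varphi \\
&= 2\rho\,\mathbf{e}_\rho = 2\rho\,\frac{\mathbf{r}}{\|\mathbf{r}\|}, \ \text{as we showed earlier, so} \\
&= 2\rho\,\frac{\mathbf{r}}{\rho} = 2\mathbf{r}, \ \text{as expected.}
\end{aligned}
\]

And the Laplacian is

\[
\begin{aligned}
\Delta F &= \frac{1}{\rho^2}\frac{\partial}{\partial \rho}\left( \rho^2\,\frac{\partial F}{\partial \rho} \right) + \frac{1}{\rho^2\sin^2\varphi}\frac{\partial^2 F}{\partial \theta^2} + \frac{1}{\rho^2\sin\varphi}\frac{\partial}{\partial \varphi}\left( \sin\varphi\,\frac{\partial F}{\partial \varphi} \right) \\
&= \frac{1}{\rho^2}\frac{\partial}{\partial \rho}(\rho^2 \cdot 2\rho) + \frac{1}{\rho^2\sin^2\varphi}(0) + \frac{1}{\rho^2\sin\varphi}\frac{\partial}{\partial \varphi}(\sin\varphi\,(0)) \\
&= \frac{1}{\rho^2}\frac{\partial}{\partial \rho}(2\rho^3) + 0 + 0 \\
&= \frac{1}{\rho^2}(6\rho^2) = 6, \ \text{as expected.}
\end{aligned}
\]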
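Example 4.19 uses a function that depends on \(\rho\) alone, so only the first term of the spherical Laplacian is exercised. The sketch below tries a function for which all three terms contribute. Everything in it is an assumption made for illustration: the test function \(F(x,y,z) = x^2 y + z^3\) (whose Cartesian Laplacian is \(2y + 6z\)), the nested central-difference helper `d`, the step size, and the sample point.

```python
import math

def F(x, y, z):
    # Hypothetical test function; its Cartesian Laplacian is 2y + 6z.
    return x * x * y + z**3

def G(rho, theta, phi):
    s = math.sin(phi)
    return F(rho * s * math.cos(theta), rho * s * math.sin(theta), rho * math.cos(phi))

def d(g, i, h=1e-3):
    """Return a function approximating the partial derivative of g in argument i (central difference)."""
    def dg(*p):
        a, b = list(p), list(p)
        a[i] += h
        b[i] -= h
        return (g(*a) - g(*b)) / (2 * h)
    return dg

def spherical_laplacian(g, rho, theta, phi):
    # The three terms of the spherical Laplacian formula, in order.
    radial = d(lambda r, t, p: r * r * d(g, 0)(r, t, p), 0)(rho, theta, phi) / rho**2
    azimuthal = d(d(g, 1), 1)(rho, theta, phi) / (rho**2 * math.sin(phi)**2)
    polar = d(lambda r, t, p: math.sin(p) * d(g, 2)(r, t, p), 2)(rho, theta, phi) / (rho**2 * math.sin(phi))
    return radial + azimuthal + polar

rho, theta, phi = 2.0, 0.7, 1.1  # arbitrary sample point
y = rho * math.sin(phi) * math.sin(theta)
z = rho * math.cos(phi)
numeric = spherical_laplacian(G, rho, theta, phi)
cartesian = 2 * y + 6 * z
print(numeric, cartesian)
```

The two values agree up to discretization error, which supports the formula but of course does not prove it.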

16.5: Curl and Divergence is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
4.6: Gradient, Divergence, Curl, and Laplacian by Michael Corral is licensed GNU FDL. Original source: http://www.mecmath.net/.

16.6: Parametric Surfaces and Their Areas
We have now seen many kinds of functions. When we talked about parametric curves, we defined them as functions from \(\mathbb{R}\) to \(\mathbb{R}^2\) (plane curves) or \(\mathbb{R}\) to \(\mathbb{R}^3\) (space curves). Because each of these has its domain \(\mathbb{R}\), they are one dimensional (you can only go forward or backward). In this section, we investigate how to parameterize two dimensional surfaces. Below is the definition.

Definition: Parametric Surfaces


A parametric surface is a function with domain \(\mathbb{R}^2\) and range \(\mathbb{R}^3\).

We typically use the variables u and v for the domain and x, y , and z for the range. We often use vector notation to exhibit
parametric surfaces.

Example 16.6.1

A sphere of radius 7 can be parameterized by

\[
\mathbf{r}(u,v) = 7\cos u \sin v\,\hat{\mathbf{i}} + 7\sin u \sin v\,\hat{\mathbf{j}} + 7\cos v\,\hat{\mathbf{k}} \tag{16.6.1}
\]

Notice that we have just used spherical coordinates with the radius held at 7.

We can use a computer to graph a parametric surface. Below is the graph of the surface

\[
\mathbf{r}(u,v) = \sin u\,\hat{\mathbf{i}} + \cos v\,\hat{\mathbf{j}} + \exp\!\left( 2u^{1/3} + 2v^{1/3} \right)\hat{\mathbf{k}}. \tag{16.6.2}
\]

Example 16.6.2

Represent the surface

\[
z = e^{x}\cos(x - y) \tag{16.6.3}
\]

parametrically.

Solution
The idea is similar to parametric curves. We just let \(x = u\) and \(y = v\), to get

\[
\mathbf{r}(u,v) = u\,\hat{\mathbf{i}} + v\,\hat{\mathbf{j}} + e^{u}\cos(u - v)\,\hat{\mathbf{k}}. \tag{16.6.4}
\]

Example 16.6.3

A surface is created by revolving the curve

\[
y = \cos x \tag{16.6.5}
\]

about the \(x\)-axis. Find parametric equations for this surface.

Solution
For a fixed value of \(x\), we get a circle of radius \(\cos x\). Now use polar coordinates (in the \(yz\)-plane) to get

\[
\mathbf{r}(u,v) = u\,\hat{\mathbf{i}} + r\cos v\,\hat{\mathbf{j}} + r\sin v\,\hat{\mathbf{k}}. \tag{16.6.6}
\]

Since \(u = x\) and \(r = \cos x\), we can substitute \(\cos u\) for \(r\) in the above equation to get

\[
\mathbf{r}(u,v) = u\,\hat{\mathbf{i}} + \cos u \cos v\,\hat{\mathbf{j}} + \cos u \sin v\,\hat{\mathbf{k}}. \tag{16.6.7}
\]
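The construction in Example 16.6.3 works for any profile curve \(y = g(x)\), not just \(\cos x\). As a rough sketch (the helper `revolve_about_x` and the sampling grid are hypothetical, introduced only for this illustration), the code below builds the parametrization and checks on sample points that each point lies at distance \(|\cos x|\) from the \(x\)-axis, as a surface of revolution should.

```python
import math

def revolve_about_x(profile):
    """Return r(u, v) for the surface obtained by revolving y = profile(x) about the x-axis."""
    def r(u, v):
        radius = profile(u)
        return (u, radius * math.cos(v), radius * math.sin(v))
    return r

surface = revolve_about_x(math.cos)

# Sample a grid of (u, v) values and measure how far y^2 + z^2 is from cos^2(x).
samples = [surface(0.1 * i, 0.3 * j) for i in range(10) for j in range(21)]
max_error = max(abs(y * y + z * z - math.cos(x) ** 2) for x, y, z in samples)
print(max_error)
```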

Normal Vectors and Tangent Planes


We have already learned how to find a normal vector to a surface that is presented as a function of two variables: take the gradient vector. To find a normal vector to a surface \(\mathbf{r}(u,v)\) that is defined parametrically, we proceed as follows.

The partial derivatives

\[
\mathbf{r}_u(u_0, v_0) \quad \text{and} \quad \mathbf{r}_v(u_0, v_0) \tag{16.6.8}
\]

lie in the tangent plane to the surface at the point \(\mathbf{r}(u_0, v_0)\). This is true because holding one variable constant and letting the other vary produces a curve on the surface through that point. For example, \(\mathbf{r}_u(u_0, v_0)\) is tangent to the curve obtained by fixing \(v = v_0\). The tangent plane contains all vectors tangent to curves passing through the point.


To find a normal vector, we just cross the two tangent vectors.

Example 16.6.4

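Find the equation of the tangent plane to the surface

\[
\mathbf{r}(u,v) = (u^2 - v^2)\,\hat{\mathbf{i}} + (u + v)\,\hat{\mathbf{j}} + (uv)\,\hat{\mathbf{k}} \tag{16.6.9}
\]

at the point \((u,v) = (1, 2)\).

Solution
We have

\[
\mathbf{r}_u(u,v) = (2u)\,\hat{\mathbf{i}} + \hat{\mathbf{j}} + v\,\hat{\mathbf{k}} \tag{16.6.10}
\]

\[
\mathbf{r}_v(u,v) = (-2v)\,\hat{\mathbf{i}} + \hat{\mathbf{j}} + u\,\hat{\mathbf{k}} \tag{16.6.11}
\]

so that

\[
\mathbf{r}_u(1,2) = 2\,\hat{\mathbf{i}} + \hat{\mathbf{j}} + 2\,\hat{\mathbf{k}} \tag{16.6.12}
\]

\[
\mathbf{r}_v(1,2) = -4\,\hat{\mathbf{i}} + \hat{\mathbf{j}} + \hat{\mathbf{k}} \tag{16.6.13}
\]

\[
\mathbf{r}(1,2) = -3\,\hat{\mathbf{i}} + 3\,\hat{\mathbf{j}} + 2\,\hat{\mathbf{k}}. \tag{16.6.14}
\]

Now cross these vectors together to get

\[
\mathbf{r}_u \times \mathbf{r}_v =
\begin{vmatrix}
\hat{\mathbf{i}} & \hat{\mathbf{j}} & \hat{\mathbf{k}} \\
2 & 1 & 2 \\
-4 & 1 & 1
\end{vmatrix}
\tag{16.6.15}
\]

\[
= -\hat{\mathbf{i}} - 10\,\hat{\mathbf{j}} + 6\,\hat{\mathbf{k}}. \tag{16.6.16}
\]

We now have the normal vector and a point \((-3, 3, 2)\). We use the normal vector-point equation for a plane

\[
-1(x + 3) - 10(y - 3) + 6(z - 2) = 0 \tag{16.6.17}
\]

\[
-x - 10y + 6z = -15 \quad \text{or} \quad x + 10y - 6z = 15. \tag{16.6.18}
\]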
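The arithmetic in Example 16.6.4 is mechanical enough to script. The following is a minimal sketch in which the function names and the representation of the plane as a pair (normal, constant) are choices made here. It recomputes the point \(\mathbf{r}(1,2)\), the normal \(\mathbf{r}_u \times \mathbf{r}_v\), and the constant \(d\) in the plane equation \(\mathbf{n} \cdot (x, y, z) = d\).

```python
def r(u, v):
    return (u * u - v * v, u + v, u * v)

def r_u(u, v):
    return (2 * u, 1.0, v)

def r_v(u, v):
    return (-2 * v, 1.0, u)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def tangent_plane(u0, v0):
    """Return (n, d) such that the tangent plane at r(u0, v0) is n . (x, y, z) = d."""
    n = cross(r_u(u0, v0), r_v(u0, v0))
    return n, dot(n, r(u0, v0))

point = r(1.0, 2.0)
normal, d = tangent_plane(1.0, 2.0)
print(point, normal, d)
```

The output reproduces the point \((-3, 3, 2)\), the normal \((-1, -10, 6)\), and the right-hand side \(-15\) of Equation 16.6.18.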

Surface Area
To find the surface area of a parametrically defined surface, we proceed in much the same way as for a surface defined by a function. Instead of projecting down to a region in the \(xy\)-plane, we project back to a region in the \(uv\)-plane. We cut the region into small rectangles of size \(\Delta u \times \Delta v\), which map approximately to small parallelograms with adjacent sides \(\mathbf{r}_u\,\Delta u\) and \(\mathbf{r}_v\,\Delta v\). The area of each parallelogram is approximately the magnitude of the cross product of \(\mathbf{r}_u\) and \(\mathbf{r}_v\) times \(\Delta u\,\Delta v\). Finally, add the areas up and take the limit as the rectangles get small. This produces a double integral.

Definition: Area of a Parametric Surface


Let \(S\) be a smooth surface defined parametrically by

\[
\mathbf{r}(u,v) = x(u,v)\,\hat{\mathbf{i}} + y(u,v)\,\hat{\mathbf{j}} + z(u,v)\,\hat{\mathbf{k}} \tag{16.6.19}
\]

where \(u\) and \(v\) are contained in a region \(R\). Then the surface area of \(S\) is given by

\[
SA = \iint_R \|\mathbf{r}_u \times \mathbf{r}_v\|\,du\,dv. \tag{16.6.20}
\]

Since the magnitude of a cross product involves a square root, the integral in the surface area formula usually cannot be evaluated in closed form. In practice it is handled with power series or numerical approximation techniques.

Example 16.6.5

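Find the surface area of the surface given by

\[
\mathbf{r}(u,v) = (v^2)\,\hat{\mathbf{i}} + (u - v)\,\hat{\mathbf{j}} + (u^2)\,\hat{\mathbf{k}}, \qquad 0 \le u \le 2, \quad 1 \le v \le 4. \tag{16.6.21}
\]

Solution
We calculate

\[
\mathbf{r}_u(u,v) = \hat{\mathbf{j}} + 2u\,\hat{\mathbf{k}} \tag{16.6.22}
\]

\[
\mathbf{r}_v(u,v) = (2v)\,\hat{\mathbf{i}} - \hat{\mathbf{j}}. \tag{16.6.23}
\]

The cross product is

\[
\mathbf{r}_u \times \mathbf{r}_v =
\begin{vmatrix}
\hat{\mathbf{i}} & \hat{\mathbf{j}} & \hat{\mathbf{k}} \\
0 & 1 & 2u \\
2v & -1 & 0
\end{vmatrix}
= 2u\,\hat{\mathbf{i}} + 4uv\,\hat{\mathbf{j}} - 2v\,\hat{\mathbf{k}}, \tag{16.6.24}
\]

so its magnitude is

\[
\|\mathbf{r}_u \times \mathbf{r}_v\| = \|2u\,\hat{\mathbf{i}} + 4uv\,\hat{\mathbf{j}} - 2v\,\hat{\mathbf{k}}\| \tag{16.6.25}
\]

\[
= 2\sqrt{u^2 + 4u^2v^2 + v^2}. \tag{16.6.26}
\]

The surface area formula gives

\[
SA = \int_0^2 \int_1^4 2\sqrt{u^2 + 4u^2v^2 + v^2}\,dv\,du. \tag{16.6.27}
\]

This integral is difficult to compute exactly. Instead, a calculator can be used to obtain a surface area of approximately 70.9.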
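The numerical value in Example 16.6.5 can be reproduced with a simple two-dimensional midpoint rule. The sketch below is one possible way to do it; the helper name `surface_area`, the grid resolution, and the use of the midpoint rule are assumptions made here rather than the method used to obtain the value quoted in the example. The integrand is \(\|\mathbf{r}_u \times \mathbf{r}_v\| = 2\sqrt{u^2 + 4u^2v^2 + v^2}\), the magnitude of the cross product of the partial derivatives of \(\mathbf{r}(u,v) = (v^2, u - v, u^2)\).

```python
import math

def surface_area(norm_cross, u0, u1, v0, v1, n=400):
    """Midpoint-rule estimate of the double integral of norm_cross(u, v) over [u0, u1] x [v0, v1]."""
    hu, hv = (u1 - u0) / n, (v1 - v0) / n
    total = 0.0
    for i in range(n):
        u = u0 + (i + 0.5) * hu
        for j in range(n):
            v = v0 + (j + 0.5) * hv
            total += norm_cross(u, v)
    return total * hu * hv

def norm_cross(u, v):
    # ||r_u x r_v|| for r(u, v) = (v^2, u - v, u^2)
    return 2.0 * math.sqrt(u * u + 4 * u * u * v * v + v * v)

area = surface_area(norm_cross, 0.0, 2.0, 1.0, 4.0)
print(round(area, 1))
```

The estimate is consistent with the value of about 70.9 quoted in the example.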

Larry Green (Lake Tahoe Community College)


Integrated by Justin Marshall.

16.6: Parametric Surfaces and Their Areas is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
2.7: Parametric Surfaces has no license indicated.

16.8: Stokes' Theorem
So far the only types of line integrals which we have discussed are those along curves in \(\mathbb{R}^2\). But the definitions and properties which were covered in Sections 4.1 and 4.2 can easily be extended to include functions of three variables, so that we can now discuss line integrals along curves in \(\mathbb{R}^3\).

Definition 16.8.1: Line Integrals

For a real-valued function \(f(x,y,z)\) and a curve \(C\) in \(\mathbb{R}^3\), parametrized by \(x = x(t)\), \(y = y(t)\), \(z = z(t)\), \(a \le t \le b\), the line integral of \(f(x,y,z)\) along \(C\) with respect to arc length \(s\) is

\[
\int_C f(x,y,z)\,ds = \int_a^b f(x(t),y(t),z(t))\,\sqrt{x'(t)^2 + y'(t)^2 + z'(t)^2}\,dt. \tag{16.8.1}
\]

The line integral of \(f(x,y,z)\) along \(C\) with respect to \(x\) is

\[
\int_C f(x,y,z)\,dx = \int_a^b f(x(t),y(t),z(t))\,x'(t)\,dt. \tag{16.8.2}
\]

The line integral of \(f(x,y,z)\) along \(C\) with respect to \(y\) is

\[
\int_C f(x,y,z)\,dy = \int_a^b f(x(t),y(t),z(t))\,y'(t)\,dt. \tag{16.8.3}
\]

The line integral of \(f(x,y,z)\) along \(C\) with respect to \(z\) is

\[
\int_C f(x,y,z)\,dz = \int_a^b f(x(t),y(t),z(t))\,z'(t)\,dt. \tag{16.8.4}
\]

Similar to the two-variable case, if \(f(x,y,z) \ge 0\) then the line integral \(\int_C f(x,y,z)\,ds\) can be thought of as the total area of the “picket fence” of height \(f(x,y,z)\) at each point along the curve \(C\) in \(\mathbb{R}^3\).

Vector fields in \(\mathbb{R}^3\) are defined in a similar fashion to those in \(\mathbb{R}^2\), which allows us to define the line integral of a vector field along a curve in \(\mathbb{R}^3\).

Definition 16.8.2

For a vector field \(\mathbf{f}(x,y,z) = P(x,y,z)\,\mathbf{i} + Q(x,y,z)\,\mathbf{j} + R(x,y,z)\,\mathbf{k}\) and a curve \(C\) in \(\mathbb{R}^3\) with a smooth parametrization \(x = x(t)\), \(y = y(t)\), \(z = z(t)\), \(a \le t \le b\), the line integral of \(\mathbf{f}\) along \(C\) is

\[
\int_C \mathbf{f}\cdot d\mathbf{r} = \int_C P(x,y,z)\,dx + \int_C Q(x,y,z)\,dy + \int_C R(x,y,z)\,dz \tag{16.8.5}
\]

\[
= \int_a^b \mathbf{f}(x(t),y(t),z(t)) \cdot \mathbf{r}'(t)\,dt, \tag{16.8.6}
\]

where \(\mathbf{r}(t) = x(t)\,\mathbf{i} + y(t)\,\mathbf{j} + z(t)\,\mathbf{k}\) is the position vector for points on \(C\).

Similar to the two-variable case, if \(\mathbf{f}(x,y,z)\) represents the force applied to an object at a point \((x,y,z)\) then the line integral \(\int_C \mathbf{f}\cdot d\mathbf{r}\) represents the work done by that force in moving the object along the curve \(C\) in \(\mathbb{R}^3\).

Some of the most important results we will need for line integrals in \(\mathbb{R}^3\) are stated below without proof (the proofs are similar to their two-variable equivalents).

Theorem 16.8.1

For a vector field \(\mathbf{f}(x,y,z) = P(x,y,z)\,\mathbf{i} + Q(x,y,z)\,\mathbf{j} + R(x,y,z)\,\mathbf{k}\) and a curve \(C\) with a smooth parametrization \(x = x(t)\), \(y = y(t)\), \(z = z(t)\), \(a \le t \le b\) and position vector \(\mathbf{r}(t) = x(t)\,\mathbf{i} + y(t)\,\mathbf{j} + z(t)\,\mathbf{k}\),

\[
\int_C \mathbf{f}\cdot d\mathbf{r} = \int_C \mathbf{f}\cdot \mathbf{T}\,ds, \tag{16.8.7}
\]

where \(\mathbf{T}(t) = \dfrac{\mathbf{r}'(t)}{\|\mathbf{r}'(t)\|}\) is the unit tangent vector to \(C\) at \((x(t), y(t), z(t))\).

Theorem 16.8.2: Chain Rule

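If \(w = f(x,y,z)\) is a continuously differentiable function of \(x\), \(y\), and \(z\), and \(x = x(t)\), \(y = y(t)\) and \(z = z(t)\) are differentiable functions of \(t\), then \(w\) is a differentiable function of \(t\), and

\[
\frac{dw}{dt} = \frac{\partial w}{\partial x}\frac{dx}{dt} + \frac{\partial w}{\partial y}\frac{dy}{dt} + \frac{\partial w}{\partial z}\frac{dz}{dt}. \tag{16.8.8}
\]

Also, if \(x = x(t_1,t_2)\), \(y = y(t_1,t_2)\) and \(z = z(t_1,t_2)\) are continuously differentiable functions of \((t_1,t_2)\), then

\[
\frac{\partial w}{\partial t_1} = \frac{\partial w}{\partial x}\frac{\partial x}{\partial t_1} + \frac{\partial w}{\partial y}\frac{\partial y}{\partial t_1} + \frac{\partial w}{\partial z}\frac{\partial z}{\partial t_1} \tag{16.8.9}
\]

and

\[
\frac{\partial w}{\partial t_2} = \frac{\partial w}{\partial x}\frac{\partial x}{\partial t_2} + \frac{\partial w}{\partial y}\frac{\partial y}{\partial t_2} + \frac{\partial w}{\partial z}\frac{\partial z}{\partial t_2} \tag{16.8.10}
\]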

Theorem 16.8.3: Potential

Let \(\mathbf{f}(x,y,z) = P(x,y,z)\,\mathbf{i} + Q(x,y,z)\,\mathbf{j} + R(x,y,z)\,\mathbf{k}\) be a vector field in some solid \(S\), with \(P\), \(Q\) and \(R\) continuously differentiable functions on \(S\). Let \(C\) be a smooth curve in \(S\) parametrized by \(x = x(t)\), \(y = y(t)\), \(z = z(t)\), \(a \le t \le b\). Suppose that there is a real-valued function \(F(x,y,z)\) such that \(\nabla F = \mathbf{f}\) on \(S\). Then

\[
\int_C \mathbf{f}\cdot d\mathbf{r} = F(B) - F(A), \tag{16.8.11}
\]

where \(A = (x(a), y(a), z(a))\) and \(B = (x(b), y(b), z(b))\) are the endpoints of \(C\).

Corollary

If a vector field \(\mathbf{f}\) has a potential in a solid \(S\), then \(\oint_C \mathbf{f}\cdot d\mathbf{r} = 0\) for any closed curve \(C\) in \(S\) (i.e. \(\oint_C \nabla F \cdot d\mathbf{r} = 0\) for any real-valued function \(F(x,y,z)\)).

Example 16.8.1

Let \(f(x,y,z) = z\) and let \(C\) be the curve in \(\mathbb{R}^3\) parametrized by

\[
x = t\sin t, \quad y = t\cos t, \quad z = t, \quad 0 \le t \le 8\pi.
\]

Evaluate \(\int_C f(x,y,z)\,ds\). (Note: \(C\) is called a conical helix. See Figure 4.5.1.)

Solution
Since \(x'(t) = \sin t + t\cos t\), \(y'(t) = \cos t - t\sin t\), and \(z'(t) = 1\), we have

\[
\begin{aligned}
x'(t)^2 + y'(t)^2 + z'(t)^2 &= (\sin^2 t + 2t\sin t\cos t + t^2\cos^2 t) + (\cos^2 t - 2t\sin t\cos t + t^2\sin^2 t) + 1 \\
&= t^2(\sin^2 t + \cos^2 t) + \sin^2 t + \cos^2 t + 1 \\
&= t^2 + 2,
\end{aligned}
\]

so since \(f(x(t),y(t),z(t)) = z(t) = t\) along the curve \(C\), then

\[
\begin{aligned}
\int_C f(x,y,z)\,ds &= \int_0^{8\pi} f(x(t),y(t),z(t))\,\sqrt{x'(t)^2 + y'(t)^2 + z'(t)^2}\,dt \\
&= \int_0^{8\pi} t\sqrt{t^2 + 2}\,dt \\
&= \left( \frac{1}{3}(t^2 + 2)^{3/2} \right)\Bigg|_0^{8\pi} = \frac{1}{3}\left( (64\pi^2 + 2)^{3/2} - 2\sqrt{2} \right).
\end{aligned}
\]

Figure 4.5.1 Conical helix C
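The closed-form answer for the conical helix can be compared against a direct numerical evaluation of Equation 16.8.1. The sketch below is illustrative only: the helper `scalar_line_integral`, the midpoint rule, and the step count are assumptions introduced here.

```python
import math

def scalar_line_integral(f, r, dr, a, b, n=100000):
    """Midpoint-rule estimate of the integral of f ds along r(t), a <= t <= b, with derivative dr(t)."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * h
        speed = math.sqrt(sum(c * c for c in dr(t)))  # ||r'(t)||
        total += f(*r(t)) * speed
    return total * h

# Conical helix and its derivative.
r = lambda t: (t * math.sin(t), t * math.cos(t), t)
dr = lambda t: (math.sin(t) + t * math.cos(t), math.cos(t) - t * math.sin(t), 1.0)

numeric = scalar_line_integral(lambda x, y, z: z, r, dr, 0.0, 8 * math.pi)
exact = ((64 * math.pi**2 + 2) ** 1.5 - 2 * math.sqrt(2)) / 3
print(numeric, exact)
```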

Example 16.8.2

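Let \(\mathbf{f}(x,y,z) = x\,\mathbf{i} + y\,\mathbf{j} + 2z\,\mathbf{k}\) be a vector field in \(\mathbb{R}^3\). Using the same curve \(C\) from Example 4.12, evaluate \(\int_C \mathbf{f}\cdot d\mathbf{r}\).

Solution:
It is easy to see that \(F(x,y,z) = \dfrac{x^2}{2} + \dfrac{y^2}{2} + z^2\) is a potential for \(\mathbf{f}(x,y,z)\) (i.e. \(\nabla F = \mathbf{f}\)).

So by Theorem 4.12 we know that

\[
\begin{aligned}
\int_C \mathbf{f}\cdot d\mathbf{r} &= F(B) - F(A), \ \text{where } A = (x(0), y(0), z(0)) \text{ and } B = (x(8\pi), y(8\pi), z(8\pi)), \text{ so} \\
&= F(8\pi\sin 8\pi,\ 8\pi\cos 8\pi,\ 8\pi) - F(0\sin 0,\ 0\cos 0,\ 0) \\
&= F(0, 8\pi, 8\pi) - F(0, 0, 0) \\
&= 0 + \frac{(8\pi)^2}{2} + (8\pi)^2 - (0 + 0 + 0) = 96\pi^2.
\end{aligned}
\]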
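The potential-function shortcut in this example can be checked against a direct evaluation of \(\int_a^b \mathbf{f}(\mathbf{r}(t)) \cdot \mathbf{r}'(t)\,dt\) along the conical helix. The following sketch uses a midpoint rule; the helper `vector_line_integral` and the step count are hypothetical choices made for illustration.

```python
import math

def vector_line_integral(f, r, dr, a, b, n=20000):
    """Midpoint-rule estimate of the integral of f . dr along r(t), a <= t <= b, with derivative dr(t)."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * h
        total += sum(fc * dc for fc, dc in zip(f(*r(t)), dr(t)))
    return total * h

f = lambda x, y, z: (x, y, 2 * z)
F = lambda x, y, z: x * x / 2 + y * y / 2 + z * z  # potential: grad F = f

r = lambda t: (t * math.sin(t), t * math.cos(t), t)
dr = lambda t: (math.sin(t) + t * math.cos(t), math.cos(t) - t * math.sin(t), 1.0)

a, b = 0.0, 8 * math.pi
numeric = vector_line_integral(f, r, dr, a, b)
from_potential = F(*r(b)) - F(*r(a))
print(numeric, from_potential, 96 * math.pi**2)
```

Along this curve the integrand \(\mathbf{f} \cdot \mathbf{r}'\) simplifies to \(3t\), so all three printed values agree with \(96\pi^2\) up to rounding.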

We will now discuss a generalization of Green’s Theorem in \(\mathbb{R}^2\) to orientable surfaces in \(\mathbb{R}^3\), called Stokes’ Theorem. A surface \(\Sigma\) in \(\mathbb{R}^3\) is orientable if there is a continuous vector field \(\mathbf{N}\) in \(\mathbb{R}^3\) such that \(\mathbf{N}\) is nonzero and normal to \(\Sigma\) (i.e. perpendicular to the tangent plane) at each point of \(\Sigma\). We say that such an \(\mathbf{N}\) is a normal vector field.

For example, the unit sphere \(x^2 + y^2 + z^2 = 1\) is orientable, since the continuous vector field \(\mathbf{N}(x,y,z) = x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}\) is nonzero and normal to the sphere at each point. In fact, \(-\mathbf{N}(x,y,z)\) is another normal vector field (see Figure 4.5.2). We see in this case that \(\mathbf{N}(x,y,z)\) is what we have called an outward normal vector, and \(-\mathbf{N}(x,y,z)\) is an inward normal vector. These “outward” and “inward” normal vector fields on the sphere correspond to an “outer” and “inner” side, respectively, of the sphere. That is, we say that the sphere is a two-sided surface. Roughly, “two-sided” means “orientable”. Other examples of two-sided, and hence orientable, surfaces are cylinders, paraboloids, ellipsoids, and planes.

Figure 4.5.2
You may be wondering what kind of surface would not have two sides. An example is the Möbius strip, which is constructed by
taking a thin rectangle and connecting its ends at the opposite corners, resulting in a “twisted” strip (see Figure 4.5.3).

Figure 4.5.3: Möbius strip


If you imagine walking along a line down the center of the Möbius strip, as in Figure 4.5.3(b), then you arrive back at the same
place from which you started but upside down! That is, your orientation changed even though your motion was continuous along
that center line. Informally, thinking of your vertical direction as a normal vector field along the strip, there is a discontinuity at
your starting point (and, in fact, at every point) since your vertical direction takes two different values there. The Möbius strip has
only one side, and hence is nonorientable.
For an orientable surface Σ which has a boundary curve C , pick a unit normal vector n such that if you walked along C with your
head pointing in the direction of n, then the surface would be on your left. We say in this situation that n is a positive unit normal
vector and that C is traversed n-positively. We can now state Stokes’ Theorem:

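Theorem 16.8.4: Stokes’ Theorem

Let $\Sigma$ be an orientable surface in $\mathbb{R}^3$ whose boundary is a simple closed curve $C$, and let $\mathbf{f}(x, y, z) = P(x, y, z)\mathbf{i} + Q(x, y, z)\mathbf{j} + R(x, y, z)\mathbf{k}$ be a smooth vector field defined on some subset of $\mathbb{R}^3$ that contains $\Sigma$. Then

\[
\oint_C \mathbf{f} \cdot d\mathbf{r} = \iint_\Sigma (\operatorname{curl} \mathbf{f}) \cdot \mathbf{n}\, d\sigma, \tag{16.8.12}
\]

where

\[
\operatorname{curl} \mathbf{f} = \left( \frac{\partial R}{\partial y} - \frac{\partial Q}{\partial z} \right) \mathbf{i} + \left( \frac{\partial P}{\partial z} - \frac{\partial R}{\partial x} \right) \mathbf{j} + \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) \mathbf{k}, \tag{16.8.13}
\]

$\mathbf{n}$ is a positive unit normal vector over $\Sigma$, and $C$ is traversed $\mathbf{n}$-positively.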

Proof: As the general case is beyond the scope of this text, we will prove the theorem only for the special case where $\Sigma$ is the graph of $z = z(x, y)$ for some smooth real-valued function $z(x, y)$, with $(x, y)$ varying over a region $D$ in $\mathbb{R}^2$.

Figure 4.5.4

Projecting $\Sigma$ onto the $xy$-plane, we see that the closed curve $C$ (the boundary curve of $\Sigma$) projects onto a closed curve $C_D$ which is the boundary curve of $D$ (see Figure 4.5.4). Assuming that $C$ has a smooth parametrization, its projection $C_D$ in the $xy$-plane also has a smooth parametrization, say

\[
C_D : x = x(t),\ y = y(t),\ a \le t \le b,
\]

and so $C$ can be parametrized (in $\mathbb{R}^3$) as

\[
C : x = x(t),\ y = y(t),\ z = z(x(t), y(t)),\ a \le t \le b,
\]

since the curve $C$ is part of the surface $z = z(x, y)$. Now, by the Chain Rule (Theorem 4.4 in Section 4.2), for $z = z(x(t), y(t))$ as a function of $t$, we know that

\[
z'(t) = \frac{\partial z}{\partial x} x'(t) + \frac{\partial z}{\partial y} y'(t),
\]

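and so

\[
\begin{aligned}
\oint_C \mathbf{f} \cdot d\mathbf{r} &= \int_C P(x, y, z)\, dx + Q(x, y, z)\, dy + R(x, y, z)\, dz \\
&= \int_a^b \left( P x'(t) + Q y'(t) + R \left( \frac{\partial z}{\partial x} x'(t) + \frac{\partial z}{\partial y} y'(t) \right) \right) dt \\
&= \int_a^b \left( \left( P + R \frac{\partial z}{\partial x} \right) x'(t) + \left( Q + R \frac{\partial z}{\partial y} \right) y'(t) \right) dt \\
&= \int_{C_D} \tilde{P}(x, y)\, dx + \tilde{Q}(x, y)\, dy,
\end{aligned}
\]

where

\[
\begin{aligned}
\tilde{P}(x, y) &= P(x, y, z(x, y)) + R(x, y, z(x, y)) \frac{\partial z}{\partial x}(x, y), \text{ and} \\
\tilde{Q}(x, y) &= Q(x, y, z(x, y)) + R(x, y, z(x, y)) \frac{\partial z}{\partial y}(x, y)
\end{aligned}
\]

for $(x, y)$ in $D$. Thus, by Green’s Theorem applied to the region $D$, we have

\[
\oint_C \mathbf{f} \cdot d\mathbf{r} = \iint_D \left( \frac{\partial \tilde{Q}}{\partial x} - \frac{\partial \tilde{P}}{\partial y} \right) dA. \tag{16.8.14}
\]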

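To evaluate this integrand, note that

\[
\begin{aligned}
\frac{\partial \tilde{Q}}{\partial x} &= \frac{\partial}{\partial x} \left( Q(x, y, z(x, y)) + R(x, y, z(x, y)) \frac{\partial z}{\partial y}(x, y) \right), \text{ so by the Product Rule we get} \\
&= \frac{\partial}{\partial x} \big( Q(x, y, z(x, y)) \big) + \left( \frac{\partial}{\partial x} R(x, y, z(x, y)) \right) \frac{\partial z}{\partial y}(x, y) + R(x, y, z(x, y)) \frac{\partial}{\partial x} \left( \frac{\partial z}{\partial y}(x, y) \right).
\end{aligned}
\]

Now, by Equation 16.8.9 in Theorem 4.11, we have

\[
\begin{aligned}
\frac{\partial}{\partial x} \big( Q(x, y, z(x, y)) \big) &= \frac{\partial Q}{\partial x} \frac{\partial x}{\partial x} + \frac{\partial Q}{\partial y} \frac{\partial y}{\partial x} + \frac{\partial Q}{\partial z} \frac{\partial z}{\partial x} \\
&= \frac{\partial Q}{\partial x} \cdot 1 + \frac{\partial Q}{\partial y} \cdot 0 + \frac{\partial Q}{\partial z} \frac{\partial z}{\partial x} \\
&= \frac{\partial Q}{\partial x} + \frac{\partial Q}{\partial z} \frac{\partial z}{\partial x}.
\end{aligned}
\]

Similarly,

\[
\frac{\partial}{\partial x} \big( R(x, y, z(x, y)) \big) = \frac{\partial R}{\partial x} + \frac{\partial R}{\partial z} \frac{\partial z}{\partial x}.
\]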

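Thus,

\[
\begin{aligned}
\frac{\partial \tilde{Q}}{\partial x} &= \frac{\partial Q}{\partial x} + \frac{\partial Q}{\partial z} \frac{\partial z}{\partial x} + \left( \frac{\partial R}{\partial x} + \frac{\partial R}{\partial z} \frac{\partial z}{\partial x} \right) \frac{\partial z}{\partial y} + R(x, y, z(x, y)) \frac{\partial^2 z}{\partial x \partial y} \\
&= \frac{\partial Q}{\partial x} + \frac{\partial Q}{\partial z} \frac{\partial z}{\partial x} + \frac{\partial R}{\partial x} \frac{\partial z}{\partial y} + \frac{\partial R}{\partial z} \frac{\partial z}{\partial x} \frac{\partial z}{\partial y} + R \frac{\partial^2 z}{\partial x \partial y}.
\end{aligned}
\]

In a similar fashion, we can calculate

\[
\frac{\partial \tilde{P}}{\partial y} = \frac{\partial P}{\partial y} + \frac{\partial P}{\partial z} \frac{\partial z}{\partial y} + \frac{\partial R}{\partial y} \frac{\partial z}{\partial x} + \frac{\partial R}{\partial z} \frac{\partial z}{\partial y} \frac{\partial z}{\partial x} + R \frac{\partial^2 z}{\partial y \partial x}.
\]

So subtracting gives

\[
\frac{\partial \tilde{Q}}{\partial x} - \frac{\partial \tilde{P}}{\partial y} = \left( \frac{\partial Q}{\partial z} - \frac{\partial R}{\partial y} \right) \frac{\partial z}{\partial x} + \left( \frac{\partial R}{\partial x} - \frac{\partial P}{\partial z} \right) \frac{\partial z}{\partial y} + \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) \tag{16.8.15}
\]

since $\frac{\partial^2 z}{\partial x \partial y} = \frac{\partial^2 z}{\partial y \partial x}$ by the smoothness of $z = z(x, y)$. Hence, by Equation 16.8.14,

\[
\oint_C \mathbf{f} \cdot d\mathbf{r} = \iint_D \left( - \left( \frac{\partial R}{\partial y} - \frac{\partial Q}{\partial z} \right) \frac{\partial z}{\partial x} - \left( \frac{\partial P}{\partial z} - \frac{\partial R}{\partial x} \right) \frac{\partial z}{\partial y} + \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) \right) dA \tag{16.8.16}
\]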

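after factoring out a $-1$ from the terms in the first two products in Equation 16.8.15.
Now, recall from Section 2.3 (see p.76) that the vector $\mathbf{N} = -\frac{\partial z}{\partial x}\mathbf{i} - \frac{\partial z}{\partial y}\mathbf{j} + \mathbf{k}$ is normal to the tangent plane to the surface $z = z(x, y)$ at each point of $\Sigma$. Thus,

\[
\mathbf{n} = \frac{\mathbf{N}}{\|\mathbf{N}\|} = \frac{-\frac{\partial z}{\partial x}\mathbf{i} - \frac{\partial z}{\partial y}\mathbf{j} + \mathbf{k}}{\sqrt{1 + \left( \frac{\partial z}{\partial x} \right)^2 + \left( \frac{\partial z}{\partial y} \right)^2}}
\]

is in fact a positive unit normal vector to $\Sigma$ (see Figure 4.5.4). Hence, using the parametrization $\mathbf{r}(x, y) = x\mathbf{i} + y\mathbf{j} + z(x, y)\mathbf{k}$, for $(x, y)$ in $D$, of the surface $\Sigma$, we have $\frac{\partial \mathbf{r}}{\partial x} = \mathbf{i} + \frac{\partial z}{\partial x}\mathbf{k}$ and $\frac{\partial \mathbf{r}}{\partial y} = \mathbf{j} + \frac{\partial z}{\partial y}\mathbf{k}$, and so $\left\| \frac{\partial \mathbf{r}}{\partial x} \times \frac{\partial \mathbf{r}}{\partial y} \right\| = \sqrt{1 + \left( \frac{\partial z}{\partial x} \right)^2 + \left( \frac{\partial z}{\partial y} \right)^2}$. So we see that using Equation 16.8.13 for $\operatorname{curl} \mathbf{f}$, we have

\[
\begin{aligned}
\iint_\Sigma (\operatorname{curl} \mathbf{f}) \cdot \mathbf{n}\, d\sigma &= \iint_D (\operatorname{curl} \mathbf{f}) \cdot \mathbf{n}\, \left\| \frac{\partial \mathbf{r}}{\partial x} \times \frac{\partial \mathbf{r}}{\partial y} \right\| dA \\
&= \iint_D \left( \left( \frac{\partial R}{\partial y} - \frac{\partial Q}{\partial z} \right) \mathbf{i} + \left( \frac{\partial P}{\partial z} - \frac{\partial R}{\partial x} \right) \mathbf{j} + \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) \mathbf{k} \right) \cdot \left( -\frac{\partial z}{\partial x}\mathbf{i} - \frac{\partial z}{\partial y}\mathbf{j} + \mathbf{k} \right) dA \\
&= \iint_D \left( - \left( \frac{\partial R}{\partial y} - \frac{\partial Q}{\partial z} \right) \frac{\partial z}{\partial x} - \left( \frac{\partial P}{\partial z} - \frac{\partial R}{\partial x} \right) \frac{\partial z}{\partial y} + \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) \right) dA,
\end{aligned}
\]

which, upon comparing to Equation 16.8.16, proves the Theorem.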


(QED)

Note: The condition in Stokes’ Theorem that the surface Σ have a (continuously varying) positive unit normal vector n and a
boundary curve C traversed n-positively can be expressed more precisely as follows: if r(t) is the position vector for C and
T(t) = r'(t)/∥r'(t)∥ is the unit tangent vector to C , then the vectors T, n, T × n form a right-handed system.

Also, it should be noted that Stokes’ Theorem holds even when the boundary curve C is piecewise smooth.

Example 16.8.3

Verify Stokes’ Theorem for $\mathbf{f}(x, y, z) = z\mathbf{i} + x\mathbf{j} + y\mathbf{k}$ when $\Sigma$ is the paraboloid $z = x^2 + y^2$ such that $z \le 1$ (see Figure 4.5.5).

Figure 4.5.5 $z = x^2 + y^2$

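Solution:
The positive unit normal vector to the surface $z = z(x, y) = x^2 + y^2$ is

\[
\mathbf{n} = \frac{-\frac{\partial z}{\partial x}\mathbf{i} - \frac{\partial z}{\partial y}\mathbf{j} + \mathbf{k}}{\sqrt{1 + \left( \frac{\partial z}{\partial x} \right)^2 + \left( \frac{\partial z}{\partial y} \right)^2}} = \frac{-2x\mathbf{i} - 2y\mathbf{j} + \mathbf{k}}{\sqrt{1 + 4x^2 + 4y^2}},
\]

and $\operatorname{curl} \mathbf{f} = (1 - 0)\mathbf{i} + (1 - 0)\mathbf{j} + (1 - 0)\mathbf{k} = \mathbf{i} + \mathbf{j} + \mathbf{k}$, so

\[
(\operatorname{curl} \mathbf{f}) \cdot \mathbf{n} = \frac{-2x - 2y + 1}{\sqrt{1 + 4x^2 + 4y^2}}.
\]

Since $\Sigma$ can be parametrized as $\mathbf{r}(x, y) = x\mathbf{i} + y\mathbf{j} + (x^2 + y^2)\mathbf{k}$ for $(x, y)$ in the region $D = \{(x, y) : x^2 + y^2 \le 1\}$, then

\[
\begin{aligned}
\iint_\Sigma (\operatorname{curl} \mathbf{f}) \cdot \mathbf{n}\, d\sigma &= \iint_D (\operatorname{curl} \mathbf{f}) \cdot \mathbf{n}\, \left\| \frac{\partial \mathbf{r}}{\partial x} \times \frac{\partial \mathbf{r}}{\partial y} \right\| dA \\
&= \iint_D \frac{-2x - 2y + 1}{\sqrt{1 + 4x^2 + 4y^2}} \sqrt{1 + 4x^2 + 4y^2}\, dA \\
&= \iint_D (-2x - 2y + 1)\, dA, \text{ so switching to polar coordinates gives} \\
&= \int_0^{2\pi} \int_0^1 (-2r \cos\theta - 2r \sin\theta + 1)\, r\, dr\, d\theta \\
&= \int_0^{2\pi} \int_0^1 (-2r^2 \cos\theta - 2r^2 \sin\theta + r)\, dr\, d\theta \\
&= \int_0^{2\pi} \left( -\frac{2r^3}{3} \cos\theta - \frac{2r^3}{3} \sin\theta + \frac{r^2}{2} \right) \Bigg|_{r=0}^{r=1} d\theta \\
&= \int_0^{2\pi} \left( -\frac{2}{3} \cos\theta - \frac{2}{3} \sin\theta + \frac{1}{2} \right) d\theta \\
&= \left( -\frac{2}{3} \sin\theta + \frac{2}{3} \cos\theta + \frac{1}{2}\theta \right) \Bigg|_0^{2\pi} = \pi.
\end{aligned}
\]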

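The boundary curve $C$ is the unit circle $x^2 + y^2 = 1$ lying in the plane $z = 1$ (see Figure 4.5.5), which can be parametrized as $x = \cos t$, $y = \sin t$, $z = 1$ for $0 \le t \le 2\pi$. So

\[
\begin{aligned}
\oint_C \mathbf{f} \cdot d\mathbf{r} &= \int_0^{2\pi} \big( (1)(-\sin t) + (\cos t)(\cos t) + (\sin t)(0) \big)\, dt \\
&= \int_0^{2\pi} \left( -\sin t + \frac{1 + \cos 2t}{2} \right) dt \quad \left( \text{here we used } \cos^2 t = \frac{1 + \cos 2t}{2} \right) \\
&= \left( \cos t + \frac{t}{2} + \frac{\sin 2t}{4} \right) \Bigg|_0^{2\pi} = \pi.
\end{aligned}
\]

So we see that $\oint_C \mathbf{f} \cdot d\mathbf{r} = \iint_\Sigma (\operatorname{curl} \mathbf{f}) \cdot \mathbf{n}\, d\sigma$, as predicted by Stokes’ Theorem.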
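The agreement of the two sides can also be checked numerically. The sketch below uses a simple midpoint rule for both integrals; the function names and grid sizes are illustrative assumptions, not part of the text.

```python
import math

def line_integral(n=2000):
    """Midpoint rule for the circulation of f = (z, x, y) around the unit circle at z = 1."""
    h = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        x, y, z = math.cos(t), math.sin(t), 1.0
        dx, dy, dz = -math.sin(t), math.cos(t), 0.0
        total += (z * dx + x * dy + y * dz) * h
    return total

def surface_integral(nr=200, nth=200):
    """Midpoint rule in polar coordinates for the integral of (-2x - 2y + 1) over the unit disk."""
    hr, hth = 1.0 / nr, 2 * math.pi / nth
    total = 0.0
    for i in range(nr):
        r = (i + 0.5) * hr
        for j in range(nth):
            th = (j + 0.5) * hth
            x, y = r * math.cos(th), r * math.sin(th)
            total += (-2 * x - 2 * y + 1) * r * hr * hth
    return total

print(line_integral(), surface_integral(), math.pi)
```

Both approximations should come out very close to $\pi$, matching the two hand computations.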

The line integral in the preceding example was far simpler to calculate than the surface integral, but this will not always be the
case.

Example 16.8.4
Let $\Sigma$ be the elliptic paraboloid $z = \frac{x^2}{4} + \frac{y^2}{9}$ for $z \le 1$, and let $C$ be its boundary curve. Calculate $\oint_C \mathbf{f} \cdot d\mathbf{r}$ for $\mathbf{f}(x, y, z) = (9xz + 2y)\mathbf{i} + (2x + y^2)\mathbf{j} + (-2y^2 + 2z)\mathbf{k}$, where $C$ is traversed counterclockwise.
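Solution
The surface is similar to the one in Example 16.8.3, except now the boundary curve $C$ is the ellipse $\frac{x^2}{4} + \frac{y^2}{9} = 1$ lying in the plane $z = 1$. In this case, using Stokes’ Theorem is easier than computing the line integral directly. As in Example 4.14, at each point $(x, y, z(x, y))$ on the surface $z = z(x, y) = \frac{x^2}{4} + \frac{y^2}{9}$ the vector

\[
\mathbf{n} = \frac{-\frac{\partial z}{\partial x}\mathbf{i} - \frac{\partial z}{\partial y}\mathbf{j} + \mathbf{k}}{\sqrt{1 + \left( \frac{\partial z}{\partial x} \right)^2 + \left( \frac{\partial z}{\partial y} \right)^2}} = \frac{-\frac{x}{2}\mathbf{i} - \frac{2y}{9}\mathbf{j} + \mathbf{k}}{\sqrt{1 + \frac{x^2}{4} + \frac{4y^2}{81}}}
\]

is a positive unit normal vector to $\Sigma$. And calculating the curl of $\mathbf{f}$ gives

\[
\operatorname{curl} \mathbf{f} = (-4y - 0)\mathbf{i} + (9x - 0)\mathbf{j} + (2 - 2)\mathbf{k} = -4y\mathbf{i} + 9x\mathbf{j} + 0\mathbf{k},
\]

so

\[
(\operatorname{curl} \mathbf{f}) \cdot \mathbf{n} = \frac{(-4y)\left( -\frac{x}{2} \right) + (9x)\left( -\frac{2y}{9} \right) + (0)(1)}{\sqrt{1 + \frac{x^2}{4} + \frac{4y^2}{81}}} = \frac{2xy - 2xy + 0}{\sqrt{1 + \frac{x^2}{4} + \frac{4y^2}{81}}} = 0,
\]

and so by Stokes’ Theorem

\[
\oint_C \mathbf{f} \cdot d\mathbf{r} = \iint_\Sigma (\operatorname{curl} \mathbf{f}) \cdot \mathbf{n}\, d\sigma = \iint_\Sigma 0\, d\sigma = 0.
\]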
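For comparison, the line integral can be approximated directly. The sketch below assumes the counterclockwise parametrization $x = 2\cos t$, $y = 3\sin t$, $z = 1$ of the ellipse (an illustrative choice not written out in the text) and applies a midpoint rule.

```python
import math

def f(x, y, z):
    # the vector field from this example
    return (9 * x * z + 2 * y, 2 * x + y * y, -2 * y * y + 2 * z)

def circulation(n=4000):
    """Midpoint-rule approximation of the line integral of f around the ellipse at z = 1."""
    h = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        x, y, z = 2 * math.cos(t), 3 * math.sin(t), 1.0
        dx, dy, dz = -2 * math.sin(t), 3 * math.cos(t), 0.0
        P, Q, R = f(x, y, z)
        total += (P * dx + Q * dy + R * dz) * h
    return total

print(circulation())
```

The printed value should be zero up to rounding error, consistent with the result from Stokes’ Theorem.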

In physical applications, for a simple closed curve $C$ the line integral $\oint_C \mathbf{f} \cdot d\mathbf{r}$ is often called the circulation of $\mathbf{f}$ around $C$. For example, if $\mathbf{E}$ represents the electrostatic field due to a point charge, then it turns out that $\operatorname{curl} \mathbf{E} = \mathbf{0}$, which means that the circulation $\oint_C \mathbf{E} \cdot d\mathbf{r} = 0$ by Stokes’ Theorem. Vector fields which have zero curl are often called irrotational fields.

In fact, the term curl was created by the 19th-century Scottish physicist James Clerk Maxwell in his study of electromagnetism, where it is used extensively. In physics, the curl is interpreted as a measure of circulation density. This is best seen by using another definition of $\operatorname{curl} \mathbf{f}$ which is equivalent to the definition given by Equation 16.8.13. Namely, for a point $(x, y, z)$ in $\mathbb{R}^3$,

\[
\mathbf{n} \cdot (\operatorname{curl} \mathbf{f})(x, y, z) = \lim_{S \to 0} \frac{1}{S} \oint_C \mathbf{f} \cdot d\mathbf{r}, \tag{16.8.17}
\]

where $S$ is the surface area of a surface $\Sigma$ containing the point $(x, y, z)$ and with a simple closed boundary curve $C$ and positive unit normal vector $\mathbf{n}$ at $(x, y, z)$. In the limit, think of the curve $C$ shrinking to the point $(x, y, z)$, which causes $\Sigma$, the surface it bounds, to have smaller and smaller surface area. That ratio of circulation to surface area in the limit is what makes the curl a rough measure of circulation density (i.e. circulation per unit area).

Figure 4.5.6 Curl and rotation


An idea of how the curl of a vector field is related to rotation is shown in Figure 4.5.6. Suppose we have a vector field $\mathbf{f}(x, y, z)$ which is always parallel to the $xy$-plane at each point $(x, y, z)$ and that the vectors grow larger the farther the point $(x, y, z)$ is from the $y$-axis. For example, $\mathbf{f}(x, y, z) = (1 + x^2)\mathbf{j}$. Think of the vector field as representing the flow of water, and imagine dropping two wheels with paddles into that water flow, as in Figure 4.5.6. Since the flow is stronger (i.e. the magnitude of $\mathbf{f}$ is larger) as you move away from the $y$-axis, such a wheel would rotate counterclockwise if it were dropped to the right of the $y$-axis, and it would rotate clockwise if it were dropped to the left of the $y$-axis. In both cases the curl would be nonzero ($\operatorname{curl} \mathbf{f}(x, y, z) = 2x\mathbf{k}$ in our example) and would obey the right-hand rule, that is, $\operatorname{curl} \mathbf{f}(x, y, z)$ points in the direction of your thumb as you cup your right hand in the direction of the rotation of the wheel. So the curl points outward (in the positive $z$-direction) if $x > 0$ and points inward (in the negative $z$-direction) if $x < 0$. Notice that if all the vectors had the same direction and the same magnitude, then the wheels would not rotate and hence there would be no curl (which is why such fields are called irrotational, meaning no rotation).
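The claim $\operatorname{curl} \mathbf{f} = 2x\mathbf{k}$ for $\mathbf{f} = (1 + x^2)\mathbf{j}$ can be checked with finite differences. The helper `curl` below is a hypothetical central-difference approximation of Equation 16.8.13, written only for illustration.

```python
def curl(f, p, h=1e-5):
    """Central-difference approximation of curl f at the point p = (x, y, z)."""
    def partial(component, axis):
        plus, minus = list(p), list(p)
        plus[axis] += h
        minus[axis] -= h
        return (f(*plus)[component] - f(*minus)[component]) / (2 * h)
    P, Q, R = 0, 1, 2   # component indices
    X, Y, Z = 0, 1, 2   # axis indices
    return (partial(R, Y) - partial(Q, Z),
            partial(P, Z) - partial(R, X),
            partial(Q, X) - partial(P, Y))

def flow(x, y, z):
    # f(x, y, z) = (1 + x^2) j
    return (0.0, 1.0 + x * x, 0.0)

right = curl(flow, (1.5, 0.0, 0.0))    # k-component approximately 3 (counterclockwise)
left = curl(flow, (-1.5, 0.0, 0.0))    # k-component approximately -3 (clockwise)
print(right, left)
```

The sign of the $\mathbf{k}$-component flips across the $y$-axis, which matches the opposite rotation directions of the two paddle wheels.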

Finally, by Stokes’ Theorem, we know that if $C$ is a simple closed curve in some solid region $S$ in $\mathbb{R}^3$ and if $\mathbf{f}(x, y, z)$ is a smooth vector field such that $\operatorname{curl} \mathbf{f} = \mathbf{0}$ in $S$, then

\[
\oint_C \mathbf{f} \cdot d\mathbf{r} = \iint_\Sigma (\operatorname{curl} \mathbf{f}) \cdot \mathbf{n}\, d\sigma = \iint_\Sigma \mathbf{0} \cdot \mathbf{n}\, d\sigma = \iint_\Sigma 0\, d\sigma = 0,
\]

where $\Sigma$ is any orientable surface inside $S$ whose boundary is $C$ (such a surface is sometimes called a capping surface for $C$). So similar to the two-variable case, we have a three-dimensional version of a result from Section 4.3, for solid regions in $\mathbb{R}^3$ which are simply connected (i.e. regions having no holes):

The following statements are equivalent for a simply connected solid region $S$ in $\mathbb{R}^3$:

a. $\mathbf{f}(x, y, z) = P(x, y, z)\mathbf{i} + Q(x, y, z)\mathbf{j} + R(x, y, z)\mathbf{k}$ has a smooth potential $F(x, y, z)$ in $S$
b. $\int_C \mathbf{f} \cdot d\mathbf{r}$ is independent of the path for any curve $C$ in $S$
c. $\oint_C \mathbf{f} \cdot d\mathbf{r} = 0$ for every simple closed curve $C$ in $S$
d. $\frac{\partial R}{\partial y} = \frac{\partial Q}{\partial z}$, $\frac{\partial P}{\partial z} = \frac{\partial R}{\partial x}$, and $\frac{\partial Q}{\partial x} = \frac{\partial P}{\partial y}$ in $S$ (i.e. $\operatorname{curl} \mathbf{f} = \mathbf{0}$ in $S$)

Part (d) is also a way of saying that the differential form $P\, dx + Q\, dy + R\, dz$ is exact.

Example 16.8.5

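Determine if the vector field $\mathbf{f}(x, y, z) = xyz\mathbf{i} + xz\mathbf{j} + xy\mathbf{k}$ has a potential in $\mathbb{R}^3$.

Solution
Since $\mathbb{R}^3$ is simply connected, we just need to check whether $\operatorname{curl} \mathbf{f} = \mathbf{0}$ throughout $\mathbb{R}^3$, that is,

\[
\frac{\partial R}{\partial y} = \frac{\partial Q}{\partial z}, \quad \frac{\partial P}{\partial z} = \frac{\partial R}{\partial x}, \quad \text{and} \quad \frac{\partial Q}{\partial x} = \frac{\partial P}{\partial y}
\]

throughout $\mathbb{R}^3$, where $P(x, y, z) = xyz$, $Q(x, y, z) = xz$, and $R(x, y, z) = xy$. But we see that

\[
\frac{\partial P}{\partial z} = xy, \quad \frac{\partial R}{\partial x} = y \quad \Rightarrow \quad \frac{\partial P}{\partial z} \ne \frac{\partial R}{\partial x} \text{ for some } (x, y, z) \text{ in } \mathbb{R}^3.
\]

Thus, $\mathbf{f}(x, y, z)$ does not have a potential in $\mathbb{R}^3$.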
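For reference, here is the full curl of this field, computed from Equation 16.8.13 with $P = xyz$, $Q = xz$, $R = xy$. It is a worked check added for illustration rather than part of the original solution:

```latex
\begin{aligned}
\operatorname{curl} \mathbf{f}
  &= \left( \frac{\partial (xy)}{\partial y} - \frac{\partial (xz)}{\partial z} \right) \mathbf{i}
   + \left( \frac{\partial (xyz)}{\partial z} - \frac{\partial (xy)}{\partial x} \right) \mathbf{j}
   + \left( \frac{\partial (xz)}{\partial x} - \frac{\partial (xyz)}{\partial y} \right) \mathbf{k} \\
  &= (x - x)\,\mathbf{i} + (xy - y)\,\mathbf{j} + (z - xz)\,\mathbf{k}
   = y(x - 1)\,\mathbf{j} + z(1 - x)\,\mathbf{k}.
\end{aligned}
```

The curl vanishes only where $x = 1$ or $y = z = 0$, so it is not identically zero, in agreement with the conclusion above.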

16.8: Stokes' Theorem is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
4.5: Stokes’ Theorem by Michael Corral is licensed GNU FDL. Original source: http://www.mecmath.net/.

16.9: The Divergence Theorem
In this final section we will establish some relationships between the gradient, divergence and curl, and we will also introduce a
new quantity called the Laplacian. We will then show how to write these quantities in cylindrical and spherical coordinates.

Gradient
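For a real-valued function $f(x, y, z)$ on $\mathbb{R}^3$, the gradient $\nabla f(x, y, z)$ is a vector-valued function on $\mathbb{R}^3$, that is, its value at a point $(x, y, z)$ is the vector

\[
\nabla f(x, y, z) = \left( \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z} \right) = \frac{\partial f}{\partial x}\mathbf{i} + \frac{\partial f}{\partial y}\mathbf{j} + \frac{\partial f}{\partial z}\mathbf{k}
\]

in $\mathbb{R}^3$, where each of the partial derivatives is evaluated at the point $(x, y, z)$. So in this way, you can think of the symbol $\nabla$ as being “applied” to a real-valued function $f$ to produce a vector $\nabla f$.
It turns out that the divergence and curl can also be expressed in terms of the symbol $\nabla$. This is done by thinking of $\nabla$ as a vector in $\mathbb{R}^3$, namely

\[
\nabla = \frac{\partial}{\partial x}\mathbf{i} + \frac{\partial}{\partial y}\mathbf{j} + \frac{\partial}{\partial z}\mathbf{k}. \tag{16.9.1}
\]

Here, the symbols $\frac{\partial}{\partial x}$, $\frac{\partial}{\partial y}$ and $\frac{\partial}{\partial z}$ are to be thought of as “partial derivative operators” that will get “applied” to a real-valued function, say $f(x, y, z)$, to produce the partial derivatives $\frac{\partial f}{\partial x}$, $\frac{\partial f}{\partial y}$ and $\frac{\partial f}{\partial z}$. For instance, $\frac{\partial}{\partial x}$ “applied” to $f(x, y, z)$ produces $\frac{\partial f}{\partial x}$.

Is $\nabla$ really a vector? Strictly speaking, no, since $\frac{\partial}{\partial x}$, $\frac{\partial}{\partial y}$ and $\frac{\partial}{\partial z}$ are not actual numbers. But it helps to think of $\nabla$ as a vector, especially with the divergence and curl, as we will soon see. The process of “applying” $\frac{\partial}{\partial x}$, $\frac{\partial}{\partial y}$, $\frac{\partial}{\partial z}$ to a real-valued function $f(x, y, z)$ is normally thought of as multiplying the quantities:

\[
\left( \frac{\partial}{\partial x} \right)(f) = \frac{\partial f}{\partial x}, \quad \left( \frac{\partial}{\partial y} \right)(f) = \frac{\partial f}{\partial y}, \quad \left( \frac{\partial}{\partial z} \right)(f) = \frac{\partial f}{\partial z}
\]

For this reason, $\nabla$ is often referred to as the “del operator”, since it “operates” on functions.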

Divergence
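For example, it is often convenient to write the divergence $\operatorname{div} \mathbf{f}$ as $\nabla \cdot \mathbf{f}$, since for a vector field $\mathbf{f}(x, y, z) = f_1(x, y, z)\mathbf{i} + f_2(x, y, z)\mathbf{j} + f_3(x, y, z)\mathbf{k}$, the dot product of $\mathbf{f}$ with $\nabla$ (thought of as a vector) makes sense:

\[
\begin{aligned}
\nabla \cdot \mathbf{f} &= \left( \frac{\partial}{\partial x}\mathbf{i} + \frac{\partial}{\partial y}\mathbf{j} + \frac{\partial}{\partial z}\mathbf{k} \right) \cdot \big( f_1(x, y, z)\mathbf{i} + f_2(x, y, z)\mathbf{j} + f_3(x, y, z)\mathbf{k} \big) \\
&= \left( \frac{\partial}{\partial x} \right)(f_1) + \left( \frac{\partial}{\partial y} \right)(f_2) + \left( \frac{\partial}{\partial z} \right)(f_3) \\
&= \frac{\partial f_1}{\partial x} + \frac{\partial f_2}{\partial y} + \frac{\partial f_3}{\partial z} \\
&= \operatorname{div} \mathbf{f}
\end{aligned}
\]

We can also write $\operatorname{curl} \mathbf{f}$ in terms of $\nabla$, namely as $\nabla \times \mathbf{f}$, since for a vector field $\mathbf{f}(x, y, z) = P(x, y, z)\mathbf{i} + Q(x, y, z)\mathbf{j} + R(x, y, z)\mathbf{k}$, we have:

\[
\begin{aligned}
\nabla \times \mathbf{f} &= \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ P(x, y, z) & Q(x, y, z) & R(x, y, z) \end{vmatrix} \\
&= \left( \frac{\partial R}{\partial y} - \frac{\partial Q}{\partial z} \right) \mathbf{i} - \left( \frac{\partial R}{\partial x} - \frac{\partial P}{\partial z} \right) \mathbf{j} + \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) \mathbf{k} \\
&= \left( \frac{\partial R}{\partial y} - \frac{\partial Q}{\partial z} \right) \mathbf{i} + \left( \frac{\partial P}{\partial z} - \frac{\partial R}{\partial x} \right) \mathbf{j} + \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) \mathbf{k} \\
&= \operatorname{curl} \mathbf{f}
\end{aligned}
\]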

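For a real-valued function $f(x, y, z)$, the gradient $\nabla f(x, y, z) = \frac{\partial f}{\partial x}\mathbf{i} + \frac{\partial f}{\partial y}\mathbf{j} + \frac{\partial f}{\partial z}\mathbf{k}$ is a vector field, so we can take its divergence:

\[
\begin{aligned}
\operatorname{div} \nabla f &= \nabla \cdot \nabla f \\
&= \left( \frac{\partial}{\partial x}\mathbf{i} + \frac{\partial}{\partial y}\mathbf{j} + \frac{\partial}{\partial z}\mathbf{k} \right) \cdot \left( \frac{\partial f}{\partial x}\mathbf{i} + \frac{\partial f}{\partial y}\mathbf{j} + \frac{\partial f}{\partial z}\mathbf{k} \right) \\
&= \frac{\partial}{\partial x} \left( \frac{\partial f}{\partial x} \right) + \frac{\partial}{\partial y} \left( \frac{\partial f}{\partial y} \right) + \frac{\partial}{\partial z} \left( \frac{\partial f}{\partial z} \right) \\
&= \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2}
\end{aligned}
\]

Note that this is a real-valued function, to which we will give a special name:

Definition 4.7: Laplacian

For a real-valued function $f(x, y, z)$, the Laplacian of $f$, denoted by $\Delta f$, is given by

\[
\Delta f(x, y, z) = \nabla \cdot \nabla f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2}. \tag{16.9.2}
\]

Often the notation $\nabla^2 f$ is used for the Laplacian instead of $\Delta f$, using the convention $\nabla^2 = \nabla \cdot \nabla$.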

Example 4.17

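Let $\mathbf{r}(x, y, z) = x\mathbf{i} + y\mathbf{j} + z\mathbf{k}$ be the position vector field on $\mathbb{R}^3$. Then $\|\mathbf{r}(x, y, z)\|^2 = \mathbf{r} \cdot \mathbf{r} = x^2 + y^2 + z^2$ is a real-valued function. Find
a. the gradient of $\|\mathbf{r}\|^2$
b. the divergence of $\mathbf{r}$
c. the curl of $\mathbf{r}$
d. the Laplacian of $\|\mathbf{r}\|^2$

Solution:
(a) $\nabla \|\mathbf{r}\|^2 = 2x\mathbf{i} + 2y\mathbf{j} + 2z\mathbf{k} = 2\mathbf{r}$

(b) $\nabla \cdot \mathbf{r} = \frac{\partial}{\partial x}(x) + \frac{\partial}{\partial y}(y) + \frac{\partial}{\partial z}(z) = 1 + 1 + 1 = 3$

(c)
\[
\nabla \times \mathbf{r} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ x & y & z \end{vmatrix} = (0 - 0)\mathbf{i} - (0 - 0)\mathbf{j} + (0 - 0)\mathbf{k} = \mathbf{0}
\]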

(d) $\Delta \|\mathbf{r}\|^2 = \frac{\partial^2}{\partial x^2}(x^2 + y^2 + z^2) + \frac{\partial^2}{\partial y^2}(x^2 + y^2 + z^2) + \frac{\partial^2}{\partial z^2}(x^2 + y^2 + z^2) = 2 + 2 + 2 = 6$

Note that we could have calculated $\Delta \|\mathbf{r}\|^2$ another way, using the $\nabla$ notation along with parts (a) and (b):

\[
\Delta \|\mathbf{r}\|^2 = \nabla \cdot \nabla \|\mathbf{r}\|^2 = \nabla \cdot 2\mathbf{r} = 2 \nabla \cdot \mathbf{r} = 2(3) = 6
\]
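Parts (a) and (d) can be spot-checked numerically with central differences. The helpers `gradient` and `laplacian` below, and the sample point, are illustrative assumptions rather than anything defined in the text.

```python
def F(x, y, z):
    # F = ||r||^2 = x^2 + y^2 + z^2
    return x * x + y * y + z * z

def gradient(F, p, h=1e-4):
    """Central-difference approximation of the gradient of F at p."""
    g = []
    for axis in range(3):
        plus, minus = list(p), list(p)
        plus[axis] += h
        minus[axis] -= h
        g.append((F(*plus) - F(*minus)) / (2 * h))
    return tuple(g)

def laplacian(F, p, h=1e-3):
    """Sum of second central differences in x, y and z."""
    total = 0.0
    for axis in range(3):
        plus, minus = list(p), list(p)
        plus[axis] += h
        minus[axis] -= h
        total += (F(*plus) - 2 * F(*p) + F(*minus)) / (h * h)
    return total

p = (1.0, -2.0, 3.0)
grad_at_p = gradient(F, p)    # approximately (2, -4, 6), i.e. 2r
lap_at_p = laplacian(F, p)    # approximately 6
print(grad_at_p, lap_at_p)
```

At any sample point the gradient should come out as twice the position vector and the Laplacian as 6, matching parts (a) and (d).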

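Notice that in Example 4.17 if we take the curl of the gradient of $\|\mathbf{r}\|^2$ we get

\[
\nabla \times (\nabla \|\mathbf{r}\|^2) = \nabla \times 2\mathbf{r} = 2 \nabla \times \mathbf{r} = 2 \cdot \mathbf{0} = \mathbf{0}.
\]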

The following theorem shows that this will be the case in general:

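Theorem 4.15.
For any smooth real-valued function $f(x, y, z)$, $\nabla \times (\nabla f) = \mathbf{0}$.

Proof
We see by the smoothness of $f$ that

\[
\nabla \times (\nabla f) = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ \dfrac{\partial f}{\partial x} & \dfrac{\partial f}{\partial y} & \dfrac{\partial f}{\partial z} \end{vmatrix} \tag{16.9.3}
\]

\[
= \left( \frac{\partial^2 f}{\partial y \partial z} - \frac{\partial^2 f}{\partial z \partial y} \right) \mathbf{i} - \left( \frac{\partial^2 f}{\partial x \partial z} - \frac{\partial^2 f}{\partial z \partial x} \right) \mathbf{j} + \left( \frac{\partial^2 f}{\partial x \partial y} - \frac{\partial^2 f}{\partial y \partial x} \right) \mathbf{k} = \mathbf{0}, \tag{16.9.4}
\]

since the mixed partial derivatives in each component are equal.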


Corollary 4.16

If a vector field f (x, y, z) has a potential, then curl f = 0 .

Another way of stating Theorem 4.15 is that gradients are irrotational. Also, notice that in Example 4.17 if we take the divergence
of the curl of r we trivially get
∇ ⋅ (∇ × r) = ∇ ⋅ 0 = 0. (16.9.5)

The following theorem shows that this will be the case in general:

Theorem 4.17.

For any smooth vector field f(x, y, z), ∇ ⋅ (∇ × f) = 0.

The proof is straightforward and left as an exercise for the reader.
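For readers who want to compare against their own work, one possible outline of that exercise is sketched here. It assumes the component form $\mathbf{f} = P\mathbf{i} + Q\mathbf{j} + R\mathbf{k}$ used for the curl earlier and relies only on the equality of mixed partial derivatives of a smooth field:

```latex
\begin{aligned}
\nabla \cdot (\nabla \times \mathbf{f})
  &= \frac{\partial}{\partial x}\left( \frac{\partial R}{\partial y} - \frac{\partial Q}{\partial z} \right)
   + \frac{\partial}{\partial y}\left( \frac{\partial P}{\partial z} - \frac{\partial R}{\partial x} \right)
   + \frac{\partial}{\partial z}\left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) \\
  &= \left( \frac{\partial^2 R}{\partial x \partial y} - \frac{\partial^2 R}{\partial y \partial x} \right)
   + \left( \frac{\partial^2 P}{\partial y \partial z} - \frac{\partial^2 P}{\partial z \partial y} \right)
   + \left( \frac{\partial^2 Q}{\partial z \partial x} - \frac{\partial^2 Q}{\partial x \partial z} \right)
   = 0 .
\end{aligned}
```

Each pair of mixed partials cancels, in the same way as in the proof of Theorem 4.15.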

Corollary 4.18
The flux of the curl of a smooth vector field f (x, y, z) through any closed surface is zero.

Proof: Let $\Sigma$ be a closed surface which bounds a solid $S$. The flux of $\nabla \times \mathbf{f}$ through $\Sigma$ is

\[
\iint_\Sigma (\nabla \times \mathbf{f}) \cdot d\boldsymbol{\sigma} = \iiint_S \nabla \cdot (\nabla \times \mathbf{f})\, dV \quad \text{(by the Divergence Theorem)} \tag{16.9.6}
\]

\[
= \iiint_S 0\, dV \quad \text{(by Theorem 4.17)} \tag{16.9.7}
\]

\[
= 0 \tag{16.9.8}
\]

(QED)

There is another method for proving Theorem 4.15 which can be useful, and is often used in physics. Namely, if the surface integral $\iint_\Sigma f(x, y, z)\, d\sigma = 0$ for all surfaces $\Sigma$ in some solid region (usually all of $\mathbb{R}^3$), then we must have $f(x, y, z) = 0$ throughout that region. The proof is not trivial, and physicists do not usually bother to prove it. But the result is true, and can also be applied to double and triple integrals.
For instance, to prove Theorem 4.15, assume that $f(x, y, z)$ is a smooth real-valued function on $\mathbb{R}^3$. Let $C$ be a simple closed curve in $\mathbb{R}^3$ and let $\Sigma$ be any capping surface for $C$ (i.e. $\Sigma$ is orientable and its boundary is $C$). Since $\nabla f$ is a vector field, then

\[
\begin{aligned}
\iint_\Sigma (\nabla \times (\nabla f)) \cdot \mathbf{n}\, d\sigma &= \oint_C \nabla f \cdot d\mathbf{r} \quad \text{by Stokes’ Theorem, so} \\
&= 0 \quad \text{by Corollary 4.13.}
\end{aligned}
\]

Since the choice of $\Sigma$ was arbitrary, then we must have $(\nabla \times (\nabla f)) \cdot \mathbf{n} = 0$ throughout $\mathbb{R}^3$, where $\mathbf{n}$ is any unit vector. Using $\mathbf{i}$, $\mathbf{j}$ and $\mathbf{k}$ in place of $\mathbf{n}$, we see that we must have $\nabla \times (\nabla f) = \mathbf{0}$ in $\mathbb{R}^3$, which completes the proof.

Example 4.18
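A system of electric charges has a charge density $\rho(x, y, z)$ and produces an electrostatic field $\mathbf{E}(x, y, z)$ at points $(x, y, z)$ in space. Gauss’ Law states that

\[
\iint_\Sigma \mathbf{E} \cdot d\boldsymbol{\sigma} = 4\pi \iiint_S \rho\, dV
\]

for any closed surface $\Sigma$ which encloses the charges, with $S$ being the solid region enclosed by $\Sigma$. Show that $\nabla \cdot \mathbf{E} = 4\pi\rho$. This is one of Maxwell’s Equations.
Solution
By the Divergence Theorem, we have

\[
\begin{aligned}
\iiint_S \nabla \cdot \mathbf{E}\, dV &= \iint_\Sigma \mathbf{E} \cdot d\boldsymbol{\sigma} \\
&= 4\pi \iiint_S \rho\, dV \quad \text{by Gauss’ Law, so combining the integrals gives}
\end{aligned}
\]

\[
\iiint_S (\nabla \cdot \mathbf{E} - 4\pi\rho)\, dV = 0, \text{ so}
\]

\[
\nabla \cdot \mathbf{E} - 4\pi\rho = 0 \quad \text{since } \Sigma \text{ and hence } S \text{ was arbitrary, so}
\]

\[
\nabla \cdot \mathbf{E} = 4\pi\rho.
\]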

Often (especially in physics) it is convenient to use other coordinate systems when dealing with quantities such as the gradient, divergence, curl and Laplacian. We will present the formulas for these in cylindrical and spherical coordinates.
Recall from Section 1.7 that a point $(x, y, z)$ can be represented in cylindrical coordinates $(r, \theta, z)$, where $x = r \cos\theta$, $y = r \sin\theta$, $z = z$. At each point $(r, \theta, z)$, let $\mathbf{e}_r, \mathbf{e}_\theta, \mathbf{e}_z$ be unit vectors in the direction of increasing $r, \theta, z$, respectively (see Figure 4.6.1). Then $\mathbf{e}_r, \mathbf{e}_\theta, \mathbf{e}_z$ form an orthonormal set of vectors. Note, by the right-hand rule, that $\mathbf{e}_z \times \mathbf{e}_r = \mathbf{e}_\theta$.

Figure 4.6.1 Orthonormal vectors $\mathbf{e}_r, \mathbf{e}_\theta, \mathbf{e}_z$ in cylindrical coordinates (left) and spherical coordinates (right).
Similarly, a point $(x, y, z)$ can be represented in spherical coordinates $(\rho, \theta, \varphi)$, where $x = \rho \sin\varphi \cos\theta$, $y = \rho \sin\varphi \sin\theta$, $z = \rho \cos\varphi$. At each point $(\rho, \theta, \varphi)$, let $\mathbf{e}_\rho, \mathbf{e}_\theta, \mathbf{e}_\varphi$ be unit vectors in the direction of increasing $\rho, \theta, \varphi$, respectively (see Figure 4.6.2). Then the vectors $\mathbf{e}_\rho, \mathbf{e}_\theta, \mathbf{e}_\varphi$ are orthonormal. By the right-hand rule, we see that $\mathbf{e}_\theta \times \mathbf{e}_\rho = \mathbf{e}_\varphi$.

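We can now summarize the expressions for the gradient, divergence, curl and Laplacian in Cartesian, cylindrical and spherical coordinates in the following tables:

Cartesian
$(x, y, z)$ : Scalar function $F$; Vector field $\mathbf{f} = f_1\mathbf{i} + f_2\mathbf{j} + f_3\mathbf{k}$

gradient : $\nabla F = \dfrac{\partial F}{\partial x}\mathbf{i} + \dfrac{\partial F}{\partial y}\mathbf{j} + \dfrac{\partial F}{\partial z}\mathbf{k}$

divergence : $\nabla \cdot \mathbf{f} = \dfrac{\partial f_1}{\partial x} + \dfrac{\partial f_2}{\partial y} + \dfrac{\partial f_3}{\partial z}$

curl : $\nabla \times \mathbf{f} = \left( \dfrac{\partial f_3}{\partial y} - \dfrac{\partial f_2}{\partial z} \right) \mathbf{i} + \left( \dfrac{\partial f_1}{\partial z} - \dfrac{\partial f_3}{\partial x} \right) \mathbf{j} + \left( \dfrac{\partial f_2}{\partial x} - \dfrac{\partial f_1}{\partial y} \right) \mathbf{k}$

Laplacian : $\Delta F = \dfrac{\partial^2 F}{\partial x^2} + \dfrac{\partial^2 F}{\partial y^2} + \dfrac{\partial^2 F}{\partial z^2}$

Cylindrical
$(r, \theta, z)$ : Scalar function $F$; Vector field $\mathbf{f} = f_r\mathbf{e}_r + f_\theta\mathbf{e}_\theta + f_z\mathbf{e}_z$

gradient : $\nabla F = \dfrac{\partial F}{\partial r}\mathbf{e}_r + \dfrac{1}{r}\dfrac{\partial F}{\partial \theta}\mathbf{e}_\theta + \dfrac{\partial F}{\partial z}\mathbf{e}_z$

divergence : $\nabla \cdot \mathbf{f} = \dfrac{1}{r}\dfrac{\partial}{\partial r}(r f_r) + \dfrac{1}{r}\dfrac{\partial f_\theta}{\partial \theta} + \dfrac{\partial f_z}{\partial z}$

curl : $\nabla \times \mathbf{f} = \left( \dfrac{1}{r}\dfrac{\partial f_z}{\partial \theta} - \dfrac{\partial f_\theta}{\partial z} \right) \mathbf{e}_r + \left( \dfrac{\partial f_r}{\partial z} - \dfrac{\partial f_z}{\partial r} \right) \mathbf{e}_\theta + \dfrac{1}{r}\left( \dfrac{\partial}{\partial r}(r f_\theta) - \dfrac{\partial f_r}{\partial \theta} \right) \mathbf{e}_z$

Laplacian : $\Delta F = \dfrac{1}{r}\dfrac{\partial}{\partial r}\left( r \dfrac{\partial F}{\partial r} \right) + \dfrac{1}{r^2}\dfrac{\partial^2 F}{\partial \theta^2} + \dfrac{\partial^2 F}{\partial z^2}$

Spherical
$(\rho, \theta, \varphi)$ : Scalar function $F$; Vector field $\mathbf{f} = f_\rho\mathbf{e}_\rho + f_\theta\mathbf{e}_\theta + f_\varphi\mathbf{e}_\varphi$

gradient : $\nabla F = \dfrac{\partial F}{\partial \rho}\mathbf{e}_\rho + \dfrac{1}{\rho \sin\varphi}\dfrac{\partial F}{\partial \theta}\mathbf{e}_\theta + \dfrac{1}{\rho}\dfrac{\partial F}{\partial \varphi}\mathbf{e}_\varphi$

divergence : $\nabla \cdot \mathbf{f} = \dfrac{1}{\rho^2}\dfrac{\partial}{\partial \rho}(\rho^2 f_\rho) + \dfrac{1}{\rho \sin\varphi}\dfrac{\partial f_\theta}{\partial \theta} + \dfrac{1}{\rho \sin\varphi}\dfrac{\partial}{\partial \varphi}(\sin\varphi\, f_\varphi)$

curl : $\nabla \times \mathbf{f} = \dfrac{1}{\rho \sin\varphi}\left( \dfrac{\partial}{\partial \varphi}(\sin\varphi\, f_\theta) - \dfrac{\partial f_\varphi}{\partial \theta} \right) \mathbf{e}_\rho + \dfrac{1}{\rho}\left( \dfrac{\partial}{\partial \rho}(\rho f_\varphi) - \dfrac{\partial f_\rho}{\partial \varphi} \right) \mathbf{e}_\theta + \left( \dfrac{1}{\rho \sin\varphi}\dfrac{\partial f_\rho}{\partial \theta} - \dfrac{1}{\rho}\dfrac{\partial}{\partial \rho}(\rho f_\theta) \right) \mathbf{e}_\varphi$

Laplacian : $\Delta F = \dfrac{1}{\rho^2}\dfrac{\partial}{\partial \rho}\left( \rho^2 \dfrac{\partial F}{\partial \rho} \right) + \dfrac{1}{\rho^2 \sin^2\varphi}\dfrac{\partial^2 F}{\partial \theta^2} + \dfrac{1}{\rho^2 \sin\varphi}\dfrac{\partial}{\partial \varphi}\left( \sin\varphi \dfrac{\partial F}{\partial \varphi} \right)$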

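The derivation of the above formulas for cylindrical and spherical coordinates is straightforward but extremely tedious. The basic idea is to take the Cartesian equivalent of the quantity in question and to substitute into that formula using the appropriate coordinate transformation. As an example, we will derive the formula for the gradient in spherical coordinates.
Goal: Show that the gradient of a real-valued function $F(\rho, \theta, \varphi)$ in spherical coordinates is:

\[
\nabla F = \frac{\partial F}{\partial \rho}\mathbf{e}_\rho + \frac{1}{\rho \sin\varphi}\frac{\partial F}{\partial \theta}\mathbf{e}_\theta + \frac{1}{\rho}\frac{\partial F}{\partial \varphi}\mathbf{e}_\varphi
\]

Idea: In the Cartesian gradient formula $\nabla F(x, y, z) = \frac{\partial F}{\partial x}\mathbf{i} + \frac{\partial F}{\partial y}\mathbf{j} + \frac{\partial F}{\partial z}\mathbf{k}$, put the Cartesian basis vectors $\mathbf{i}, \mathbf{j}, \mathbf{k}$ in terms of the spherical coordinate basis vectors $\mathbf{e}_\rho, \mathbf{e}_\theta, \mathbf{e}_\varphi$ and functions of $\rho$, $\theta$ and $\varphi$. Then put the partial derivatives $\frac{\partial F}{\partial x}, \frac{\partial F}{\partial y}, \frac{\partial F}{\partial z}$ in terms of $\frac{\partial F}{\partial \rho}, \frac{\partial F}{\partial \theta}, \frac{\partial F}{\partial \varphi}$ and functions of $\rho$, $\theta$ and $\varphi$.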

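Step 1: Get formulas for $\mathbf{e}_\rho, \mathbf{e}_\theta, \mathbf{e}_\varphi$ in terms of $\mathbf{i}, \mathbf{j}, \mathbf{k}$.

We can see from Figure 4.6.2 that the unit vector $\mathbf{e}_\rho$ in the $\rho$ direction at a general point $(\rho, \theta, \varphi)$ is $\mathbf{e}_\rho = \frac{\mathbf{r}}{\|\mathbf{r}\|}$, where $\mathbf{r} = x\mathbf{i} + y\mathbf{j} + z\mathbf{k}$ is the position vector of the point in Cartesian coordinates. Thus,

\[
\mathbf{e}_\rho = \frac{\mathbf{r}}{\|\mathbf{r}\|} = \frac{x\mathbf{i} + y\mathbf{j} + z\mathbf{k}}{\sqrt{x^2 + y^2 + z^2}},
\]

so using $x = \rho \sin\varphi \cos\theta$, $y = \rho \sin\varphi \sin\theta$, $z = \rho \cos\varphi$, and $\rho = \sqrt{x^2 + y^2 + z^2}$, we get:

\[
\mathbf{e}_\rho = \sin\varphi \cos\theta\, \mathbf{i} + \sin\varphi \sin\theta\, \mathbf{j} + \cos\varphi\, \mathbf{k}
\]

Now, since the angle $\theta$ is measured in the $xy$-plane, the unit vector $\mathbf{e}_\theta$ in the $\theta$ direction must be parallel to the $xy$-plane. That is, $\mathbf{e}_\theta$ is of the form $a\mathbf{i} + b\mathbf{j} + 0\mathbf{k}$. To figure out what $a$ and $b$ are, note that since $\mathbf{e}_\theta \perp \mathbf{e}_\rho$, then in particular $\mathbf{e}_\theta \perp \mathbf{e}_\rho$ when $\mathbf{e}_\rho$ is in the $xy$-plane. That occurs when the angle $\varphi$ is $\pi/2$. Putting $\varphi = \pi/2$ into the formula for $\mathbf{e}_\rho$ gives $\mathbf{e}_\rho = \cos\theta\, \mathbf{i} + \sin\theta\, \mathbf{j} + 0\mathbf{k}$, and we see that a vector perpendicular to that is $-\sin\theta\, \mathbf{i} + \cos\theta\, \mathbf{j} + 0\mathbf{k}$. Since this vector is also a unit vector and points in the (positive) $\theta$ direction, it must be $\mathbf{e}_\theta$:

\[
\mathbf{e}_\theta = -\sin\theta\, \mathbf{i} + \cos\theta\, \mathbf{j} + 0\mathbf{k}
\]

Lastly, since $\mathbf{e}_\varphi = \mathbf{e}_\theta \times \mathbf{e}_\rho$, we get:

\[
\mathbf{e}_\varphi = \cos\varphi \cos\theta\, \mathbf{i} + \cos\varphi \sin\theta\, \mathbf{j} - \sin\varphi\, \mathbf{k}
\]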

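Step 2: Use the three formulas from Step 1 to solve for $\mathbf{i}, \mathbf{j}, \mathbf{k}$ in terms of $\mathbf{e}_\rho, \mathbf{e}_\theta, \mathbf{e}_\varphi$.
This comes down to solving a system of three equations in three unknowns. There are many ways of doing this, but we will do it by combining the formulas for $\mathbf{e}_\rho$ and $\mathbf{e}_\varphi$ to eliminate $\mathbf{k}$, which will give us an equation involving just $\mathbf{i}$ and $\mathbf{j}$. This, with the formula for $\mathbf{e}_\theta$, will then leave us with a system of two equations in two unknowns ($\mathbf{i}$ and $\mathbf{j}$), which we will use to solve first for $\mathbf{j}$ then for $\mathbf{i}$. Lastly, we will solve for $\mathbf{k}$.

First, note that

\[
\sin\varphi\, \mathbf{e}_\rho + \cos\varphi\, \mathbf{e}_\varphi = \cos\theta\, \mathbf{i} + \sin\theta\, \mathbf{j}
\]

so that

\[
\sin\theta (\sin\varphi\, \mathbf{e}_\rho + \cos\varphi\, \mathbf{e}_\varphi) + \cos\theta\, \mathbf{e}_\theta = (\sin^2\theta + \cos^2\theta)\mathbf{j} = \mathbf{j},
\]

and so:

\[
\mathbf{j} = \sin\varphi \sin\theta\, \mathbf{e}_\rho + \cos\theta\, \mathbf{e}_\theta + \cos\varphi \sin\theta\, \mathbf{e}_\varphi
\]

Likewise, we see that

\[
\cos\theta (\sin\varphi\, \mathbf{e}_\rho + \cos\varphi\, \mathbf{e}_\varphi) - \sin\theta\, \mathbf{e}_\theta = (\cos^2\theta + \sin^2\theta)\mathbf{i} = \mathbf{i},
\]

and so:

\[
\mathbf{i} = \sin\varphi \cos\theta\, \mathbf{e}_\rho - \sin\theta\, \mathbf{e}_\theta + \cos\varphi \cos\theta\, \mathbf{e}_\varphi
\]

Lastly, we see that:

\[
\mathbf{k} = \cos\varphi\, \mathbf{e}_\rho - \sin\varphi\, \mathbf{e}_\varphi
\]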
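The formulas from Steps 1 and 2 can be checked against each other numerically: building the spherical basis vectors from Step 1 and recombining them as in Step 2 should reproduce the Cartesian basis. The function names and the sample angles below are hypothetical, chosen only for this sketch.

```python
import math

def spherical_basis(theta, phi):
    """Step 1: e_rho, e_theta, e_phi written in Cartesian components."""
    s, c = math.sin, math.cos
    e_rho = (s(phi) * c(theta), s(phi) * s(theta), c(phi))
    e_theta = (-s(theta), c(theta), 0.0)
    e_phi = (c(phi) * c(theta), c(phi) * s(theta), -s(phi))
    return e_rho, e_theta, e_phi

def combo(a, b, c, basis):
    """Return a*e_rho + b*e_theta + c*e_phi in Cartesian components."""
    e_rho, e_theta, e_phi = basis
    return tuple(a * u + b * v + c * w for u, v, w in zip(e_rho, e_theta, e_phi))

def cartesian_from_spherical(theta, phi):
    """Step 2: rebuild i, j, k from the spherical basis vectors."""
    basis = spherical_basis(theta, phi)
    s, c = math.sin, math.cos
    i = combo(s(phi) * c(theta), -s(theta), c(phi) * c(theta), basis)
    j = combo(s(phi) * s(theta), c(theta), c(phi) * s(theta), basis)
    k = combo(c(phi), 0.0, -s(phi), basis)
    return i, j, k

i, j, k = cartesian_from_spherical(0.7, 1.1)
print(i, j, k)
```

Up to rounding, the output should be the standard basis vectors $(1, 0, 0)$, $(0, 1, 0)$ and $(0, 0, 1)$ for any choice of angles.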

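Step 3: Get formulas for $\frac{\partial F}{\partial \rho}, \frac{\partial F}{\partial \theta}, \frac{\partial F}{\partial \varphi}$ in terms of $\frac{\partial F}{\partial x}, \frac{\partial F}{\partial y}, \frac{\partial F}{\partial z}$.

By the Chain Rule, we have

\[
\begin{aligned}
\frac{\partial F}{\partial \rho} &= \frac{\partial F}{\partial x}\frac{\partial x}{\partial \rho} + \frac{\partial F}{\partial y}\frac{\partial y}{\partial \rho} + \frac{\partial F}{\partial z}\frac{\partial z}{\partial \rho}, \\
\frac{\partial F}{\partial \theta} &= \frac{\partial F}{\partial x}\frac{\partial x}{\partial \theta} + \frac{\partial F}{\partial y}\frac{\partial y}{\partial \theta} + \frac{\partial F}{\partial z}\frac{\partial z}{\partial \theta}, \\
\frac{\partial F}{\partial \varphi} &= \frac{\partial F}{\partial x}\frac{\partial x}{\partial \varphi} + \frac{\partial F}{\partial y}\frac{\partial y}{\partial \varphi} + \frac{\partial F}{\partial z}\frac{\partial z}{\partial \varphi},
\end{aligned}
\]

which yields:

\[
\begin{aligned}
\frac{\partial F}{\partial \rho} &= \sin\varphi \cos\theta \frac{\partial F}{\partial x} + \sin\varphi \sin\theta \frac{\partial F}{\partial y} + \cos\varphi \frac{\partial F}{\partial z} \\
\frac{\partial F}{\partial \theta} &= -\rho \sin\varphi \sin\theta \frac{\partial F}{\partial x} + \rho \sin\varphi \cos\theta \frac{\partial F}{\partial y} \\
\frac{\partial F}{\partial \varphi} &= \rho \cos\varphi \cos\theta \frac{\partial F}{\partial x} + \rho \cos\varphi \sin\theta \frac{\partial F}{\partial y} - \rho \sin\varphi \frac{\partial F}{\partial z}
\end{aligned}
\tag{16.9.9}
\]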

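Step 4: Use the three formulas from Step 3 to solve for $\frac{\partial F}{\partial x}, \frac{\partial F}{\partial y}, \frac{\partial F}{\partial z}$ in terms of $\frac{\partial F}{\partial \rho}, \frac{\partial F}{\partial \theta}, \frac{\partial F}{\partial \varphi}$.

Again, this involves solving a system of three equations in three unknowns. Using a similar process of elimination as in Step 2, we get:

\[
\begin{aligned}
\frac{\partial F}{\partial x} &= \frac{1}{\rho \sin\varphi} \left( \rho \sin^2\varphi \cos\theta \frac{\partial F}{\partial \rho} - \sin\theta \frac{\partial F}{\partial \theta} + \sin\varphi \cos\varphi \cos\theta \frac{\partial F}{\partial \varphi} \right) \\
\frac{\partial F}{\partial y} &= \frac{1}{\rho \sin\varphi} \left( \rho \sin^2\varphi \sin\theta \frac{\partial F}{\partial \rho} + \cos\theta \frac{\partial F}{\partial \theta} + \sin\varphi \cos\varphi \sin\theta \frac{\partial F}{\partial \varphi} \right) \\
\frac{\partial F}{\partial z} &= \frac{1}{\rho} \left( \rho \cos\varphi \frac{\partial F}{\partial \rho} - \sin\varphi \frac{\partial F}{\partial \varphi} \right)
\end{aligned}
\tag{16.9.10}
\]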
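Since Equations 16.9.9 and 16.9.10 are meant to be inverses of each other, a quick round-trip test is a useful sanity check. The sketch below starts from arbitrary made-up values of the Cartesian partial derivatives, applies 16.9.9, then applies 16.9.10; the function names and sample numbers are assumptions for illustration only.

```python
import math

def spherical_partials(Fx, Fy, Fz, rho, theta, phi):
    """Chain-rule formulas (16.9.9): Cartesian partials -> spherical partials."""
    s, c = math.sin, math.cos
    F_rho = s(phi) * c(theta) * Fx + s(phi) * s(theta) * Fy + c(phi) * Fz
    F_theta = -rho * s(phi) * s(theta) * Fx + rho * s(phi) * c(theta) * Fy
    F_phi = rho * c(phi) * c(theta) * Fx + rho * c(phi) * s(theta) * Fy - rho * s(phi) * Fz
    return F_rho, F_theta, F_phi

def cartesian_partials(F_rho, F_theta, F_phi, rho, theta, phi):
    """Inverse formulas (16.9.10): spherical partials -> Cartesian partials."""
    s, c = math.sin, math.cos
    Fx = (rho * s(phi) ** 2 * c(theta) * F_rho - s(theta) * F_theta
          + s(phi) * c(phi) * c(theta) * F_phi) / (rho * s(phi))
    Fy = (rho * s(phi) ** 2 * s(theta) * F_rho + c(theta) * F_theta
          + s(phi) * c(phi) * s(theta) * F_phi) / (rho * s(phi))
    Fz = (rho * c(phi) * F_rho - s(phi) * F_phi) / rho
    return Fx, Fy, Fz

original = (0.3, -1.2, 2.5)            # made-up values of (F_x, F_y, F_z)
rho, theta, phi = 2.0, 0.7, 1.1        # sample point with sin(phi) != 0
recovered = cartesian_partials(*spherical_partials(*original, rho, theta, phi),
                               rho, theta, phi)
print(original, recovered)
```

The recovered triple should match the original one up to rounding, provided $\rho \ne 0$ and $\sin\varphi \ne 0$ so that the divisions in 16.9.10 are defined.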

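Step 5: Substitute the formulas for $\mathbf{i}, \mathbf{j}, \mathbf{k}$ from Step 2 and the formulas for $\frac{\partial F}{\partial x}, \frac{\partial F}{\partial y}, \frac{\partial F}{\partial z}$ from Step 4 into the Cartesian gradient formula $\nabla F(x, y, z) = \frac{\partial F}{\partial x}\mathbf{i} + \frac{\partial F}{\partial y}\mathbf{j} + \frac{\partial F}{\partial z}\mathbf{k}$.

Doing this last step is perhaps the most tedious, since it involves simplifying $3 \times 3 + 3 \times 3 + 2 \times 2 = 22$ terms! Namely,

\[
\begin{aligned}
\nabla F ={}& \frac{1}{\rho \sin\varphi} \left( \rho \sin^2\varphi \cos\theta \frac{\partial F}{\partial \rho} - \sin\theta \frac{\partial F}{\partial \theta} + \sin\varphi \cos\varphi \cos\theta \frac{\partial F}{\partial \varphi} \right) (\sin\varphi \cos\theta\, \mathbf{e}_\rho - \sin\theta\, \mathbf{e}_\theta + \cos\varphi \cos\theta\, \mathbf{e}_\varphi) \\
&+ \frac{1}{\rho \sin\varphi} \left( \rho \sin^2\varphi \sin\theta \frac{\partial F}{\partial \rho} + \cos\theta \frac{\partial F}{\partial \theta} + \sin\varphi \cos\varphi \sin\theta \frac{\partial F}{\partial \varphi} \right) (\sin\varphi \sin\theta\, \mathbf{e}_\rho + \cos\theta\, \mathbf{e}_\theta + \cos\varphi \sin\theta\, \mathbf{e}_\varphi) \\
&+ \frac{1}{\rho} \left( \rho \cos\varphi \frac{\partial F}{\partial \rho} - \sin\varphi \frac{\partial F}{\partial \varphi} \right) (\cos\varphi\, \mathbf{e}_\rho - \sin\varphi\, \mathbf{e}_\varphi),
\end{aligned}
\]

which we see has 8 terms involving $\mathbf{e}_\rho$, 6 terms involving $\mathbf{e}_\theta$, and 8 terms involving $\mathbf{e}_\varphi$. But the algebra is straightforward and yields the desired result:

\[
\nabla F = \frac{\partial F}{\partial \rho}\mathbf{e}_\rho + \frac{1}{\rho \sin\varphi}\frac{\partial F}{\partial \theta}\mathbf{e}_\theta + \frac{1}{\rho}\frac{\partial F}{\partial \varphi}\mathbf{e}_\varphi \tag{16.9.11}
\]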

Example 4.19

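In Example 4.17 we showed that $\nabla \|\mathbf{r}\|^2 = 2\mathbf{r}$ and $\Delta \|\mathbf{r}\|^2 = 6$, where $\mathbf{r}(x, y, z) = x\mathbf{i} + y\mathbf{j} + z\mathbf{k}$ in Cartesian coordinates. Verify that we get the same answers if we switch to spherical coordinates.
Solution
Since $\|\mathbf{r}\|^2 = x^2 + y^2 + z^2 = \rho^2$ in spherical coordinates, let $F(\rho, \theta, \varphi) = \rho^2$ (so that $F(\rho, \theta, \varphi) = \|\mathbf{r}\|^2$). The gradient of $F$ in spherical coordinates is

\[
\begin{aligned}
\nabla F &= \frac{\partial F}{\partial \rho}\mathbf{e}_\rho + \frac{1}{\rho \sin\varphi}\frac{\partial F}{\partial \theta}\mathbf{e}_\theta + \frac{1}{\rho}\frac{\partial F}{\partial \varphi}\mathbf{e}_\varphi \\
&= 2\rho\, \mathbf{e}_\rho + \frac{1}{\rho \sin\varphi}(0)\mathbf{e}_\theta + \frac{1}{\rho}(0)\mathbf{e}_\varphi \\
&= 2\rho\, \mathbf{e}_\rho = 2\rho \frac{\mathbf{r}}{\|\mathbf{r}\|}, \text{ as we showed earlier, so} \\
&= 2\rho \frac{\mathbf{r}}{\rho} = 2\mathbf{r}, \text{ as expected.}
\end{aligned}
\]

And the Laplacian is

\[
\begin{aligned}
\Delta F &= \frac{1}{\rho^2}\frac{\partial}{\partial \rho}\left( \rho^2 \frac{\partial F}{\partial \rho} \right) + \frac{1}{\rho^2 \sin^2\varphi}\frac{\partial^2 F}{\partial \theta^2} + \frac{1}{\rho^2 \sin\varphi}\frac{\partial}{\partial \varphi}\left( \sin\varphi \frac{\partial F}{\partial \varphi} \right) \\
&= \frac{1}{\rho^2}\frac{\partial}{\partial \rho}(\rho^2 \cdot 2\rho) + \frac{1}{\rho^2 \sin^2\varphi}(0) + \frac{1}{\rho^2 \sin\varphi}\frac{\partial}{\partial \varphi}(\sin\varphi\, (0)) \\
&= \frac{1}{\rho^2}\frac{\partial}{\partial \rho}(2\rho^3) + 0 + 0 \\
&= \frac{1}{\rho^2}(6\rho^2) = 6, \text{ as expected.}
\end{aligned}
\]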

16.9: The Divergence Theorem is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
4.6: Gradient, Divergence, Curl, and Laplacian by Michael Corral is licensed GNU FDL. Original source: http://www.mecmath.net/.

CHAPTER OVERVIEW

17: Second-Order Differential Equations


A general Calculus Textmap organized around the textbook

Calculus: Early Transcendentals


by James Stewart


Topic hierarchy
17.1: Second-Order Linear Equations
17.2: Nonhomogeneous Linear Equations
17.3: Applications of Second-Order Differential Equations
17.4: Series Solutions of Differential Equations

17: Second-Order Differential Equations is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.

17.1: Second-Order Linear Equations
Learning Objectives
Recognize homogeneous and nonhomogeneous linear differential equations.
Determine the characteristic equation of a homogeneous linear equation.
Use the roots of the characteristic equation to find the solution to a homogeneous linear equation.
Solve initial-value and boundary-value problems involving linear differential equations.

When working with differential equations, usually the goal is to find a solution. In other words, we want to find a function (or
functions) that satisfies the differential equation. The technique we use to find these solutions varies, depending on the form of the
differential equation with which we are working. Second-order differential equations have several important characteristics that can
help us determine which solution method to use. In this section, we examine some of these characteristics and the associated
terminology.

Homogeneous Linear Equations


Consider the second-order differential equation

\[
x y'' + 2x^2 y' + 5x^3 y = 0.
\]

Notice that $y$ and its derivatives appear in a relatively simple form. They are multiplied by functions of $x$, but are not raised to any powers themselves, nor are they multiplied together. As discussed previously, first-order equations with similar characteristics are said to be linear. The same is true of second-order equations. Also note that all the terms in this differential equation involve either $y$ or one of its derivatives. There are no terms involving only functions of $x$. Equations like this, in which every term contains $y$ or one of its derivatives, are called homogeneous.
Not all differential equations are homogeneous. Consider the differential equation

\[
x y'' + 2x^2 y' + 5x^3 y = x^2.
\]

The $x^2$ term on the right side of the equal sign does not contain $y$ or any of its derivatives. Therefore, this differential equation is nonhomogeneous.

Definition: Homogeneous and Nonhomogeneous Linear Equations


A second-order differential equation is linear if it can be written in the form

\[
a_2(x) y'' + a_1(x) y' + a_0(x) y = r(x), \tag{17.1.1}
\]

where $a_2(x)$, $a_1(x)$, $a_0(x)$, and $r(x)$ are real-valued functions and $a_2(x)$ is not identically zero. If $r(x) \equiv 0$ (in other words, if $r(x) = 0$ for every value of $x$), the equation is said to be a homogeneous linear equation. If $r(x) \ne 0$ for some value of $x$, the equation is said to be a nonhomogeneous linear equation.

In linear differential equations, \(y\) and its derivatives can be raised only to the first power and they may not be multiplied by one another. Terms involving \(y^2\) or \(\sqrt{y'}\) make the equation nonlinear. Functions of \(y\) and its derivatives, such as \(\sin y\) or \(e^{y'}\), are similarly prohibited in linear differential equations.


Note that equations may not always be given in standard form (the form shown in the definition). It can be helpful to rewrite them
in that form to decide whether they are linear, or whether a linear equation is homogeneous.

 Example 17.1.1: Classifying Second-Order Equations

Classify each of the following equations as linear or nonlinear. If the equation is linear, determine further whether it is
homogeneous or nonhomogeneous.
a. \(y'' + 3x^4 y' + x^2 y^2 = x^3\)
b. \((\sin x)y'' + (\cos x)y' + 3y = 0\)
c. \(4t^2 x'' + 3txx' + 4x = 0\)
d. \(5y'' + y = 4x^5\)
e. \((\cos x)y'' - \sin y' + (\sin x)y - \cos x = 0\)
f. \(8ty'' - 6t^2 y' + 4ty - 3t^2 = 0\)
g. \(\sin(x^2)y'' - (\cos x)y' + x^2 y = y' - 3\)
h. \(y'' + 5xy' - 3y = \cos y\)

Solution
a. This equation is nonlinear because of the \(y^2\) term.
b. This equation is linear. There is no term involving a power or function of \(y\), and the coefficients are all functions of \(x\). The equation is already written in standard form, and \(r(x)\) is identically zero, so the equation is homogeneous.
c. This equation is nonlinear. Note that, in this case, \(x\) is the dependent variable and \(t\) is the independent variable. The second term involves the product of \(x\) and \(x'\), so the equation is nonlinear.
d. This equation is linear. Since \(r(x) = 4x^5\), the equation is nonhomogeneous.
e. This equation is nonlinear because of the \(\sin y'\) term.
f. This equation is linear. Rewriting it in standard form gives
\[ 8ty'' - 6t^2 y' + 4ty = 3t^2. \]

With the equation in standard form, we can see that \(r(t) = 3t^2\), so the equation is nonhomogeneous.
g. This equation looks like it is linear, but we should rewrite it in standard form to be sure. We get
\[ \sin(x^2)y'' - (\cos x + 1)y' + x^2 y = -3. \]

This equation is, indeed, linear. With \(r(x) = -3\), it is nonhomogeneous.
h. This equation is nonlinear because of the \(\cos y\) term.

 Exercise 17.1.1

Classify each of the following equations as linear or nonlinear. If the equation is linear, determine further whether it is
homogeneous or nonhomogeneous.
a. \((y'')^2 - y' + 8x^3 y = 0\)
b. \((\sin t)y'' + \cos t - 3ty' = 0\)

Hint
Write the equation in standard form (Equation 17.1.1) if necessary. Check for powers or functions of \(y\) and its derivatives.
Answer a
Nonlinear
Answer b
Linear, nonhomogeneous

Later in this section, we will see some techniques for solving specific types of differential equations. Before we get to that,
however, let’s get a feel for how solutions to linear differential equations behave. In many cases, solving differential equations
depends on making educated guesses about what the solution might look like. Knowing how various types of solutions behave will
be helpful.

 Example 17.1.2: Verifying a Solution


Consider the linear, homogeneous differential equation
\[ x^2 y'' - xy' - 3y = 0. \]

Looking at this equation, notice that the coefficient functions are polynomials, with higher powers of \(x\) associated with higher-order derivatives of \(y\). Show that \(y = x^3\) is a solution to this differential equation.

Solution
Let \(y = x^3\). Then \(y' = 3x^2\) and \(y'' = 6x\). Substituting into the differential equation, we see that
\[ \begin{aligned} x^2 y'' - xy' - 3y &= x^2(6x) - x(3x^2) - 3(x^3) \\ &= 6x^3 - 3x^3 - 3x^3 \\ &= 0. \end{aligned} \]
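
The substitution above can also be spot-checked numerically. The following is a minimal sketch, not part of the original text: the function name `residual_cubic` and the sample points are illustrative choices, and the derivatives are the hand-computed ones from the solution. A zero residual at a handful of points is evidence, not a proof, that \(y = x^3\) satisfies the equation.

```python
def residual_cubic(x):
    """Left side of x^2 y'' - x y' - 3y for the candidate y = x^3."""
    y = x**3          # candidate solution
    dy = 3 * x**2     # first derivative, computed by hand
    d2y = 6 * x       # second derivative, computed by hand
    return x**2 * d2y - x * dy - 3 * y


# Hypothetical sample points chosen for illustration.
sample_points = [-2.0, -0.5, 0.0, 1.0, 3.5]
residuals = [residual_cubic(x) for x in sample_points]
print(residuals)  # every entry should be zero up to rounding
```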

 Exercise 17.1.2

Show that \(y = 2x^2\) is a solution to the differential equation
\[ \frac{1}{2}x^2 y'' - xy' + y = 0. \tag{17.1.2} \]

Hint
Calculate the derivatives and substitute them into the differential equation.
Answer
This requires calculating \(y'\) and \(y''\):
\[ y' = \frac{dy}{dx} = 4x \]

and
\[ y'' = \frac{d^2 y}{dx^2} = 4. \]

Inserting these derivatives along with \(y = 2x^2\) into Equation 17.1.2 gives
\[ \begin{aligned} \frac{1}{2}x^2 y'' - xy' + y &= \frac{1}{2}x^2(4) - x(4x) + 2x^2 \\ &= 2x^2 - 4x^2 + 2x^2 \\ &= 0. \end{aligned} \]

Yes, this is a solution to the differential equation in Equation 17.1.2.

Although simply finding any solution to a differential equation is important, mathematicians and engineers often want to go beyond
finding one solution to a differential equation to finding all solutions to a differential equation. In other words, we want to find a
general solution. Just as with first-order differential equations, a general solution (or family of solutions) gives the entire set of
solutions to a differential equation. An important difference between first-order and second-order equations is that, with second-
order equations, we typically need to find two different solutions to the equation to find the general solution. If we find two
solutions, then any linear combination of these solutions is also a solution. We state this fact as the following theorem.

 Theorem: SUPERPOSITION PRINCIPLE

If \(y_1(x)\) and \(y_2(x)\) are solutions to a linear homogeneous differential equation, then the function
\[ y(x) = c_1 y_1(x) + c_2 y_2(x), \tag{17.1.3} \]

where \(c_1\) and \(c_2\) are constants, is also a solution.

The proof of this superposition principle theorem is left as an exercise.

 Example 17.1.3: Verifying the Superposition Principle
Consider the differential equation
\[ y'' - 4y' - 5y = 0. \]

Given that \(e^{-x}\) and \(e^{5x}\) are solutions to this differential equation, show that \(4e^{-x} + e^{5x}\) is a solution.
Solution
Although this follows from a simple application of the superposition principle (Equation 17.1.3), we can also confirm it directly, as in Example 17.1.2. We have
\[ \begin{aligned} y(x) &= 4e^{-x} + e^{5x} \\ y'(x) &= -4e^{-x} + 5e^{5x} \\ y''(x) &= 4e^{-x} + 25e^{5x}. \end{aligned} \]

Then
\[ \begin{aligned} y'' - 4y' - 5y &= (4e^{-x} + 25e^{5x}) - 4(-4e^{-x} + 5e^{5x}) - 5(4e^{-x} + e^{5x}) \\ &= 4e^{-x} + 25e^{5x} + 16e^{-x} - 20e^{5x} - 20e^{-x} - 5e^{5x} \\ &= 0. \end{aligned} \]

Thus, \(y(x) = 4e^{-x} + e^{5x}\) is a solution.

 Exercise 17.1.3

Consider the differential equation
\[ y'' + 5y' + 6y = 0. \]

Given that \(e^{-2x}\) and \(e^{-3x}\) are solutions to this differential equation, show that \(3e^{-2x} + 6e^{-3x}\) is a solution.

Hint
Differentiate the function and substitute into the differential equation.
Answer
Although this follows from a simple application of the superposition principle (Equation 17.1.3), we can also step through it as in Example 17.1.2. We have
\[ \begin{aligned} y(x) &= 3e^{-2x} + 6e^{-3x} \\ y'(x) &= -6e^{-2x} - 18e^{-3x} \\ y''(x) &= 12e^{-2x} + 54e^{-3x}. \end{aligned} \]

Then
\[ \begin{aligned} y'' + 5y' + 6y &= (12e^{-2x} + 54e^{-3x}) + 5(-6e^{-2x} - 18e^{-3x}) + 6(3e^{-2x} + 6e^{-3x}) \\ &= 12e^{-2x} + 54e^{-3x} - 30e^{-2x} - 90e^{-3x} + 18e^{-2x} + 36e^{-3x} \\ &= 0. \end{aligned} \]

Thus, \(3e^{-2x} + 6e^{-3x}\) is a solution to the differential equation.

Unfortunately, to find the general solution to a second-order differential equation, it is not enough to find any two solutions and then combine them. Consider the differential equation
\[ x'' + 7x' + 12x = 0. \]

Both \(e^{-3t}\) and \(2e^{-3t}\) are solutions (you can check this). However,
\[ x(t) = c_1 e^{-3t} + c_2 (2e^{-3t}) \]

is not the general solution. This expression does not account for all solutions to the differential equation. In particular, it fails to account for the function \(e^{-4t}\), which is also a solution to the differential equation. It turns out that to find the general solution to a second-order differential equation, we must find two linearly independent solutions. We define that terminology here.

 Definition: Linearly Dependent functions

A set of functions \(f_1(x), f_2(x), \ldots, f_n(x)\) is said to be linearly dependent if there are constants \(c_1, c_2, \ldots, c_n\), not all zero, such that
\[ c_1 f_1(x) + c_2 f_2(x) + \cdots + c_n f_n(x) = 0 \]

for all \(x\) over the interval of interest. A set of functions that is not linearly dependent is said to be linearly independent.

In this chapter, we usually test sets of only two functions for linear independence, which allows us to simplify this definition. From
a practical perspective, we see that two functions are linearly dependent if either one of them is identically zero or if they are
constant multiples of each other.
First we show that if the functions meet the conditions given previously, then they are linearly dependent. If one of the functions is identically zero, say \(f_2(x) \equiv 0\), then choose \(c_1 = 0\) and \(c_2 = 1\), and the condition for linear dependence is satisfied. If, on the other hand, neither \(f_1(x)\) nor \(f_2(x)\) is identically zero, but \(f_1(x) = Cf_2(x)\) for some constant \(C\), then \(C \neq 0\) (otherwise \(f_1\) would be identically zero). Choose \(c_1 = \frac{1}{C}\) and \(c_2 = -1\), and again, the condition is satisfied.

Next, we show that if two functions are linearly dependent, then either one is identically zero or they are constant multiples of one another. Assume \(f_1(x)\) and \(f_2(x)\) are linearly dependent. Then, there are constants \(c_1\) and \(c_2\), not both zero, such that
\[ c_1 f_1(x) + c_2 f_2(x) = 0 \]

for all \(x\) over the interval of interest. Then,
\[ c_1 f_1(x) = -c_2 f_2(x). \]

Now, since we stated that \(c_1\) and \(c_2\) can't both be zero, assume \(c_2 \neq 0\). Then, there are two cases: either \(c_1 = 0\) or \(c_1 \neq 0\). If \(c_1 = 0\), then
\[ \begin{aligned} 0 &= -c_2 f_2(x) \\ 0 &= f_2(x), \end{aligned} \]

so one of the functions is identically zero. Now suppose \(c_1 \neq 0\). Then,
\[ f_1(x) = \left(-\frac{c_2}{c_1}\right) f_2(x) \]

and we see that the functions are constant multiples of one another.

 Theorem: Linear Dependence of Two Functions

Two functions, \(f_1(x)\) and \(f_2(x)\), are said to be linearly dependent if either one of them is identically zero or if \(f_1(x) = Cf_2(x)\) for some constant \(C\) and for all \(x\) over the interval of interest. Functions that are not linearly dependent are said to be linearly independent.
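
The constant-multiple criterion can be explored numerically. The sketch below is an illustration only: the helper `looks_linearly_dependent`, the sample points, and the tolerance are all assumptions made here, and sampling at finitely many points can suggest dependence but cannot prove it.

```python
import math


def looks_linearly_dependent(f1, f2, points, tol=1e-9):
    """Heuristic: is one function zero, or f1 = C*f2, at every sample point?"""
    v1 = [f1(x) for x in points]
    v2 = [f2(x) for x in points]
    # A function that vanishes at every sample is treated as identically zero.
    if all(abs(a) < tol for a in v1) or all(abs(b) < tol for b in v2):
        return True
    # Estimate C where |f2| is largest, then test f1 = C*f2 at all samples.
    k = max(range(len(points)), key=lambda i: abs(v2[i]))
    C = v1[k] / v2[k]
    return all(abs(a - C * b) < tol for a, b in zip(v1, v2))


points = [0.5, 1.0, 1.5, 2.0]  # hypothetical sample points
dep_a = looks_linearly_dependent(lambda x: x**2, lambda x: 5 * x**2, points)
dep_b = looks_linearly_dependent(math.sin, math.cos, points)
dep_c = looks_linearly_dependent(lambda x: math.exp(3 * x),
                                 lambda x: math.exp(-3 * x), points)
dep_d = looks_linearly_dependent(lambda x: 3 * x, lambda x: 3 * x + 1, points)
print(dep_a, dep_b, dep_c, dep_d)
```

A `True` result only says that the two functions are proportional on the chosen samples; a `False` result does show that no single constant works.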

 Example 17.1.4: Testing for Linear Dependence

Determine whether the following pairs of functions are linearly dependent or linearly independent.
a. \(f_1(x) = x^2\) and \(f_2(x) = 5x^2\)
b. \(f_1(x) = \sin x\) and \(f_2(x) = \cos x\)
c. \(f_1(x) = e^{3x}\) and \(f_2(x) = e^{-3x}\)
d. \(f_1(x) = 3x\) and \(f_2(x) = 3x + 1\)

Solution
a. \(f_2(x) = 5f_1(x)\), so the functions are linearly dependent.
b. There is no constant \(C\) such that \(f_1(x) = Cf_2(x)\), so the functions are linearly independent.
c. There is no constant \(C\) such that \(f_1(x) = Cf_2(x)\), so the functions are linearly independent. Don't get confused by the fact that the exponents are constant multiples of each other. With two exponential functions, unless the exponents are equal, the functions are linearly independent.
d. There is no constant \(C\) such that \(f_1(x) = Cf_2(x)\), so the functions are linearly independent.

 Exercise 17.1.4

Determine whether the following pair of functions is linearly dependent or linearly independent: \(f_1(x) = e^x\) and \(f_2(x) = 3e^{3x}\).

Hint
Are the functions constant multiples of one another?
Answer
Linearly independent

If we are able to find two linearly independent solutions to a second-order differential equation, then we can combine them to find
the general solution. This result is formally stated in the following theorem.

 Theorem: General Solution to a Homogeneous Equation

If \(y_1(x)\) and \(y_2(x)\) are linearly independent solutions to a second-order, linear, homogeneous differential equation, then the general solution is given by
\[ y(x) = c_1 y_1(x) + c_2 y_2(x), \]

where \(c_1\) and \(c_2\) are constants.

When we say a family of functions is the general solution to a differential equation, we mean that
1. every expression of that form is a solution and
2. every solution to the differential equation can be written in that form, which makes this theorem extremely powerful.
If we can find two linearly independent solutions to a second order differential equation, we have, effectively, found all solutions to
the second order differential equation—quite a remarkable statement. The proof of this theorem is beyond the scope of this text.

 Example 17.1.5: Writing the General Solution


If \(y_1(t) = e^{3t}\) and \(y_2(t) = e^{-3t}\) are solutions to \(y'' - 9y = 0\), what is the general solution?
Solution
Note that \(y_1\) and \(y_2\) are not constant multiples of one another, so they are linearly independent. Then, the general solution to the differential equation is
\[ y(t) = c_1 e^{3t} + c_2 e^{-3t}. \]

 Exercise 17.1.5
If \(y_1(x) = e^{3x}\) and \(y_2(x) = xe^{3x}\) are solutions to \(y'' - 6y' + 9y = 0\), what is the general solution?

Hint
Check for linear independence first.
Answer
\[ y(x) = c_1 e^{3x} + c_2 xe^{3x} \]

Second-Order Equations with Constant Coefficients


Now that we have a better feel for linear differential equations, we are going to concentrate on solving second-order equations of the form
\[ ay'' + by' + cy = 0, \tag{17.2} \]

where \(a\), \(b\), and \(c\) are constants.

Since all the coefficients are constants, the solutions are probably going to be functions with derivatives that are constant multiples of themselves. We need all the terms to cancel out, and if taking a derivative introduces a term that is not a constant multiple of the original function, it is difficult to see how that term cancels out. Exponential functions have derivatives that are constant multiples of the original function, so let's see what happens when we try a solution of the form \(y(x) = e^{\lambda x}\), where \(\lambda\) (the lowercase Greek letter lambda) is some constant.

If \(y(x) = e^{\lambda x}\), then \(y'(x) = \lambda e^{\lambda x}\) and \(y''(x) = \lambda^2 e^{\lambda x}\). Substituting these expressions into Equation 17.2, we get
\[ \begin{aligned} ay'' + by' + cy &= a(\lambda^2 e^{\lambda x}) + b(\lambda e^{\lambda x}) + ce^{\lambda x} \\ &= e^{\lambda x}(a\lambda^2 + b\lambda + c). \end{aligned} \]

Since \(e^{\lambda x}\) is never zero, this expression can be equal to zero for all \(x\) only if
\[ a\lambda^2 + b\lambda + c = 0. \]

We call this the characteristic equation of the differential equation.

 Definition: characteristic equation

The characteristic equation of the second-order differential equation \(ay'' + by' + cy = 0\) is
\[ a\lambda^2 + b\lambda + c = 0. \]

The characteristic equation is very important in finding solutions to differential equations of this form. We can solve the characteristic equation either by factoring or by using the quadratic formula
\[ \lambda = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}. \]

This gives three cases. The characteristic equation has


1. distinct real roots;
2. a single, repeated real root; or
3. complex conjugate roots.
We consider each of these cases separately.
Case 1: Distinct Real Roots
If the characteristic equation has distinct real roots \(\lambda_1\) and \(\lambda_2\), then \(e^{\lambda_1 x}\) and \(e^{\lambda_2 x}\) are linearly independent solutions to Equation 17.2, and the general solution is given by
\[ y(x) = c_1 e^{\lambda_1 x} + c_2 e^{\lambda_2 x}, \]

where \(c_1\) and \(c_2\) are constants.

For example, the differential equation \(y'' + 9y' + 14y = 0\) has the associated characteristic equation \(\lambda^2 + 9\lambda + 14 = 0\). This factors into \((\lambda + 2)(\lambda + 7) = 0\), which has roots \(\lambda_1 = -2\) and \(\lambda_2 = -7\). Therefore, the general solution to this differential equation is
\[ y(x) = c_1 e^{-2x} + c_2 e^{-7x}. \]

Case 2: Single Repeated Real Root


Things are a little more complicated if the characteristic equation has a repeated real root, \(\lambda\). In this case, we know \(e^{\lambda x}\) is a solution to Equation 17.2, but it is only one solution and we need two linearly independent solutions to determine the general solution. We might be tempted to try a function of the form \(ke^{\lambda x}\), where \(k\) is some constant, but it would not be linearly independent of \(e^{\lambda x}\). Therefore, let's try \(xe^{\lambda x}\) as the second solution. First, note that by the quadratic formula,
\[ \lambda = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}. \]

But \(\lambda\) is a repeated root, so the discriminant \(b^2 - 4ac\) is zero and \(\lambda = \frac{-b}{2a}\). Thus, if \(y = xe^{\lambda x}\), we have
\[ \begin{aligned} y' &= e^{\lambda x} + \lambda xe^{\lambda x} \\ y'' &= 2\lambda e^{\lambda x} + \lambda^2 xe^{\lambda x}. \end{aligned} \]

Substituting both expressions into Equation 17.2, we see that
\[ \begin{aligned} ay'' + by' + cy &= a(2\lambda e^{\lambda x} + \lambda^2 xe^{\lambda x}) + b(e^{\lambda x} + \lambda xe^{\lambda x}) + cxe^{\lambda x} \\ &= xe^{\lambda x}(a\lambda^2 + b\lambda + c) + e^{\lambda x}(2a\lambda + b) \\ &= xe^{\lambda x}(0) + e^{\lambda x}\left(2a\left(\frac{-b}{2a}\right) + b\right) \\ &= 0 + e^{\lambda x}(0) \\ &= 0. \end{aligned} \]

This shows that \(xe^{\lambda x}\) is a solution to Equation 17.2. Since \(e^{\lambda x}\) and \(xe^{\lambda x}\) are linearly independent, when the characteristic equation has a repeated root \(\lambda\), the general solution to Equation 17.2 is given by
\[ y(x) = c_1 e^{\lambda x} + c_2 xe^{\lambda x}, \]

where \(c_1\) and \(c_2\) are constants.

For example, the differential equation \(y'' + 12y' + 36y = 0\) has the associated characteristic equation
\[ \lambda^2 + 12\lambda + 36 = 0. \]

This factors into \((\lambda + 6)^2 = 0\), which has a repeated root \(\lambda = -6\). Therefore, the general solution to this differential equation is
\[ y(x) = c_1 e^{-6x} + c_2 xe^{-6x}. \]
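
As a sanity check on the repeated-root case, the sketch below evaluates \(y'' + 12y' + 36y\) for the second solution \(y = xe^{-6x}\), using the derivative formulas for \(xe^{\lambda x}\) derived in this case. The function name `residual_repeated` and the sample points are illustrative assumptions; because of floating-point rounding, the residuals are expected to be tiny rather than exactly zero.

```python
import math

lam = -6.0  # repeated root of lam**2 + 12*lam + 36 = 0


def residual_repeated(x):
    """Evaluate y'' + 12 y' + 36 y for y = x * exp(lam * x)."""
    e = math.exp(lam * x)
    y = x * e
    dy = e + lam * x * e                  # y'  = e^{lam x} + lam x e^{lam x}
    d2y = 2 * lam * e + lam**2 * x * e    # y'' = 2 lam e^{lam x} + lam^2 x e^{lam x}
    return d2y + 12 * dy + 36 * y


repeated_residuals = [residual_repeated(x) for x in (-0.5, 0.0, 0.25, 1.0)]
print(repeated_residuals)
```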

Case 3: Complex Conjugate Roots


The third case we must consider is when \(b^2 - 4ac < 0\). In this case, when we apply the quadratic formula, we are taking the square root of a negative number. We must use the imaginary number \(i = \sqrt{-1}\) to find the roots, which take the form \(\lambda_1 = \alpha + \beta i\) and \(\lambda_2 = \alpha - \beta i\). The complex number \(\alpha + \beta i\) is called the conjugate of \(\alpha - \beta i\). Thus, we see that when the discriminant \(b^2 - 4ac\) is negative, the roots of our characteristic equation are always complex conjugates.
This creates a little bit of a problem for us. If we follow the same process we used for distinct real roots, using the roots of the characteristic equation as the coefficients in the exponents of exponential functions, we get the functions \(e^{(\alpha + \beta i)x}\) and \(e^{(\alpha - \beta i)x}\) as our solutions. However, there are problems with this approach. First, these functions take on complex (imaginary) values, and a complete discussion of such functions is beyond the scope of this text. Second, even if we were comfortable with complex-valued functions, in this course we do not address the idea of a derivative for such functions. So, if possible, we'd like to find two linearly independent real-valued solutions to the differential equation. For purposes of this development, we are going to manipulate and differentiate the functions \(e^{(\alpha + \beta i)x}\) and \(e^{(\alpha - \beta i)x}\) as if they were real-valued functions. For these particular functions, this approach is valid mathematically, but be aware that there are other instances when complex-valued functions do not follow the same rules as real-valued functions. Those of you interested in a more in-depth discussion of complex-valued functions should consult a complex analysis text.
Based on the roots \(\alpha \pm \beta i\) of the characteristic equation, the functions \(e^{(\alpha + \beta i)x}\) and \(e^{(\alpha - \beta i)x}\) are linearly independent solutions to the differential equation and the general solution is given by
\[ y(x) = c_1 e^{(\alpha + \beta i)x} + c_2 e^{(\alpha - \beta i)x}. \]

Using some smart choices for \(c_1\) and \(c_2\), and a little bit of algebraic manipulation, we can find two linearly independent, real-valued solutions to Equation 17.2 and express our general solution in those terms.
We encountered exponential functions with complex exponents earlier. One of the key tools we used to express these exponential functions in terms of sines and cosines was Euler's formula, which tells us that
\[ e^{i\theta} = \cos\theta + i\sin\theta \tag{17.1.4} \]

for all real numbers \(\theta\).


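Going back to the general solution, we have
\[ \begin{aligned} y(x) &= c_1 e^{(\alpha + \beta i)x} + c_2 e^{(\alpha - \beta i)x} \\ &= c_1 e^{\alpha x} e^{\beta ix} + c_2 e^{\alpha x} e^{-\beta ix} \\ &= e^{\alpha x}(c_1 e^{\beta ix} + c_2 e^{-\beta ix}). \end{aligned} \]

Applying Euler's formula (Equation 17.1.4) together with the identities \(\cos(-x) = \cos x\) and \(\sin(-x) = -\sin x\), we get
\[ \begin{aligned} y(x) &= e^{\alpha x}[c_1(\cos\beta x + i\sin\beta x) + c_2(\cos(-\beta x) + i\sin(-\beta x))] \\ &= e^{\alpha x}[(c_1 + c_2)\cos\beta x + (c_1 - c_2)i\sin\beta x]. \end{aligned} \tag{17.1.5} \]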

Now, if we choose \(c_1 = c_2 = \frac{1}{2}\), the second term is zero and we get
\[ y(x) = e^{\alpha x}\cos\beta x \]

as a real-valued solution to Equation 17.2. Similarly, if we choose \(c_1 = -\frac{i}{2}\) and \(c_2 = \frac{i}{2}\), the first term of Equation 17.1.5 is zero and we get
\[ y(x) = e^{\alpha x}\sin\beta x \]

as a second, linearly independent, real-valued solution to Equation 17.2.
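
The two choices of constants can be checked numerically with Python's standard `cmath` module. This is a small sketch under stated assumptions: the values \(\alpha = 1\), \(\beta = 2\), and \(x = 0.7\) are arbitrary test values, and `complex_combo` is a helper name introduced here for illustration.

```python
import cmath
import math

alpha, beta = 1.0, 2.0  # illustrative roots alpha ± beta*i (assumed values)


def complex_combo(c1, c2, x):
    """c1*e^{(alpha+beta i)x} + c2*e^{(alpha-beta i)x}, as a complex number."""
    return (c1 * cmath.exp(complex(alpha, beta) * x)
            + c2 * cmath.exp(complex(alpha, -beta) * x))


x = 0.7  # arbitrary test point
cos_part = complex_combo(0.5, 0.5, x)      # c1 = c2 = 1/2
sin_part = complex_combo(-0.5j, 0.5j, x)   # c1 = -i/2, c2 = i/2

# Compare with the real-valued solutions e^{alpha x} cos(beta x) and e^{alpha x} sin(beta x).
cos_error = abs(cos_part - math.exp(alpha * x) * math.cos(beta * x))
sin_error = abs(sin_part - math.exp(alpha * x) * math.sin(beta * x))
print(cos_error, sin_error)
```

Both errors should be at the level of floating-point rounding, which is consistent with the imaginary parts cancelling for these choices of \(c_1\) and \(c_2\).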


Based on this, we see that if the characteristic equation has complex conjugate roots \(\alpha \pm \beta i\), then the general solution to Equation 17.2 is given by
\[ \begin{aligned} y(x) &= c_1 e^{\alpha x}\cos\beta x + c_2 e^{\alpha x}\sin\beta x \\ &= e^{\alpha x}(c_1\cos\beta x + c_2\sin\beta x), \end{aligned} \]

where \(c_1\) and \(c_2\) are constants.

For example, the differential equation \(y'' - 2y' + 5y = 0\) has the associated characteristic equation \(\lambda^2 - 2\lambda + 5 = 0\). By the quadratic formula, the roots of the characteristic equation are \(1 \pm 2i\). Therefore, the general solution to this differential equation is
\[ y(x) = e^x(c_1\cos 2x + c_2\sin 2x). \]

Summary of Results
We can solve second-order, linear, homogeneous differential equations with constant coefficients by finding the roots of the
associated characteristic equation. The form of the general solution varies, depending on whether the characteristic equation has
distinct, real roots; a single, repeated real root; or complex conjugate roots. The three cases are summarized in Table 17.1.1.
Table 17.1.1 : Summary of Characteristic Equation Cases

Characteristic Equation Roots                         General Solution to the Differential Equation
Distinct real roots, \(\lambda_1\) and \(\lambda_2\)  \(y(x) = c_1 e^{\lambda_1 x} + c_2 e^{\lambda_2 x}\)
A repeated real root, \(\lambda\)                     \(y(x) = c_1 e^{\lambda x} + c_2 xe^{\lambda x}\)
Complex conjugate roots \(\alpha \pm \beta i\)        \(y(x) = e^{\alpha x}(c_1\cos\beta x + c_2\sin\beta x)\)

 PROBLEM-SOLVING STRATEGY: USING THE CHARACTERISTIC EQUATION TO SOLVE SECOND-


ORDER DIFFERENTIAL EQUATIONS WITH CONSTANT COEFFICIENTS
1. Write the differential equation in the form \(ay'' + by' + cy = 0\).
2. Find the corresponding characteristic equation \(a\lambda^2 + b\lambda + c = 0\).

3. Either factor the characteristic equation or use the quadratic formula to find the roots.
4. Determine the form of the general solution based on whether the characteristic equation has distinct, real roots; a single,
repeated real root; or complex conjugate roots.
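
The strategy above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a definitive implementation: `classify_roots` is a hypothetical helper, it assumes \(a \neq 0\), and the exact comparison of the discriminant with zero is only reliable for coefficients (such as small integers) whose discriminant is computed without rounding error.

```python
import math


def classify_roots(a, b, c):
    """Classify the roots of a*lam**2 + b*lam + c = 0 (a != 0).

    Returns (case, roots). For complex roots alpha ± beta*i, the tuple
    holds (alpha, beta) with beta > 0.
    """
    disc = b * b - 4 * a * c  # discriminant decides the case (step 3)
    if disc > 0:
        r = math.sqrt(disc)
        return "distinct real", ((-b + r) / (2 * a), (-b - r) / (2 * a))
    if disc == 0:
        return "repeated real", (-b / (2 * a),)
    alpha = -b / (2 * a)
    beta = abs(math.sqrt(-disc) / (2 * a))
    return "complex conjugate", (alpha, beta)


case_1 = classify_roots(1, 3, -4)    # y'' + 3y' - 4y = 0
case_2 = classify_roots(1, 2, 1)     # y'' + 2y' + y = 0
case_3 = classify_roots(1, 6, 13)    # y'' + 6y' + 13y = 0
print(case_1, case_2, case_3)
```

Once the case is known, step 4 reads the form of the general solution from Table 17.1.1.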

 Example 17.1.6: Solving Second-Order Equations with Constant Coefficients

Find the general solution to the following differential equations. Give your answers as functions of x.
a. \(y'' + 3y' - 4y = 0\)
b. \(y'' + 6y' + 13y = 0\)
c. \(y'' + 2y' + y = 0\)
d. \(y'' - 5y' = 0\)
e. \(y'' - 16y = 0\)
f. \(y'' + 16y = 0\)

Solution
Note that all these equations are already given in standard form (step 1).
a. The characteristic equation is \(\lambda^2 + 3\lambda - 4 = 0\) (step 2). This factors into \((\lambda + 4)(\lambda - 1) = 0\), so the roots of the characteristic equation are \(\lambda_1 = -4\) and \(\lambda_2 = 1\) (step 3). Then the general solution to the differential equation is
\[ y(x) = c_1 e^{-4x} + c_2 e^x. \quad \text{(step 4)} \]

b. The characteristic equation is \(\lambda^2 + 6\lambda + 13 = 0\) (step 2). Applying the quadratic formula, we see this equation has complex conjugate roots \(-3 \pm 2i\) (step 3). Then the general solution to the differential equation is
\[ y(x) = e^{-3x}(c_1\cos 2x + c_2\sin 2x). \quad \text{(step 4)} \]

c. The characteristic equation is \(\lambda^2 + 2\lambda + 1 = 0\) (step 2). This factors into \((\lambda + 1)^2 = 0\), so the characteristic equation has a repeated real root \(\lambda = -1\) (step 3). Then the general solution to the differential equation is
\[ y(x) = c_1 e^{-x} + c_2 xe^{-x}. \quad \text{(step 4)} \]

d. The characteristic equation is \(\lambda^2 - 5\lambda = 0\) (step 2). This factors into \(\lambda(\lambda - 5) = 0\), so the roots of the characteristic equation are \(\lambda_1 = 0\) and \(\lambda_2 = 5\) (step 3). Note that \(e^{0x} = e^0 = 1\), so our first solution is just a constant. Then the general solution to the differential equation is
\[ y(x) = c_1 + c_2 e^{5x}. \quad \text{(step 4)} \]

e. The characteristic equation is \(\lambda^2 - 16 = 0\) (step 2). This factors into \((\lambda + 4)(\lambda - 4) = 0\), so the roots of the characteristic equation are \(\lambda_1 = 4\) and \(\lambda_2 = -4\) (step 3). Then the general solution to the differential equation is
\[ y(x) = c_1 e^{4x} + c_2 e^{-4x}. \quad \text{(step 4)} \]

f. The characteristic equation is \(\lambda^2 + 16 = 0\) (step 2). This has complex conjugate roots \(\pm 4i\) (step 3). Note that \(e^{0x} = e^0 = 1\), so the exponential term in our solution is just a constant. Then the general solution to the differential equation is
\[ y(x) = c_1\cos 4x + c_2\sin 4x. \quad \text{(step 4)} \]

 Exercise 17.1.6

Find the general solution to the following differential equations:


a. \(y'' - 2y' + 10y = 0\)
b. \(y'' + 14y' + 49y = 0\)

Hint
Find the roots of the characteristic equation.
Answer a
\[ y(x) = e^x(c_1\cos 3x + c_2\sin 3x) \]

Answer b
\[ y(x) = c_1 e^{-7x} + c_2 xe^{-7x} \]

Initial-Value Problems and Boundary-Value Problems


So far, we have been finding general solutions to differential equations. However, differential equations are often used to describe
physical systems, and the person studying that physical system usually knows something about the state of that system at one or
more points in time. For example, if a constant-coefficient differential equation is representing how far a motorcycle shock
absorber is compressed, we might know that the rider is sitting still on his motorcycle at the start of a race, time \(t = t_0\). This means the system is at equilibrium, so \(y(t_0) = 0\), and the compression of the shock absorber is not changing, so \(y'(t_0) = 0\). With these two initial conditions and the general solution to the differential equation, we can find the specific solution to the differential equation that satisfies both initial conditions. This process is known as solving an initial-value problem. (Recall that we discussed initial-value problems in Introduction to Differential Equations.) Note that second-order equations have two arbitrary constants in the general solution, and therefore we require two initial conditions to find the solution to the initial-value problem.
Sometimes we know the condition of the system at two different times. For example, we might know \(y(t_0) = y_0\) and \(y(t_1) = y_1\). These conditions are called boundary conditions, and finding the solution to the differential equation that satisfies the boundary conditions is called solving a boundary-value problem.


Mathematicians, scientists, and engineers are interested in understanding the conditions under which an initial-value problem or a
boundary-value problem has a unique solution. Although a complete treatment of this topic is beyond the scope of this text, it is
useful to know that, within the context of constant-coefficient, second-order equations, initial-value problems are guaranteed to
have a unique solution as long as two initial conditions are provided. Boundary-value problems, however, are not as well behaved.
Even when two boundary conditions are known, we may encounter boundary-value problems with unique solutions, many
solutions, or no solution at all.

 Example 17.1.7: Solving an Initial-Value Problem

Solve the following initial-value problem: \(y'' + 3y' - 4y = 0\), \(y(0) = 1\), \(y'(0) = -9\).

Solution
We already solved this differential equation in Example 17.1.6a and found the general solution to be
\[ y(x) = c_1 e^{-4x} + c_2 e^x. \]

Then
\[ y'(x) = -4c_1 e^{-4x} + c_2 e^x. \]

When \(x = 0\), we have \(y(0) = c_1 + c_2\) and \(y'(0) = -4c_1 + c_2\). Applying the initial conditions, we have
\[ \begin{aligned} c_1 + c_2 &= 1 \\ -4c_1 + c_2 &= -9. \end{aligned} \]

Then \(c_1 = 1 - c_2\). Substituting this expression into the second equation, we see that
\[ \begin{aligned} -4(1 - c_2) + c_2 &= -9 \\ -4 + 4c_2 + c_2 &= -9 \\ 5c_2 &= -5 \\ c_2 &= -1. \end{aligned} \]

So, \(c_1 = 2\) and the solution to the initial-value problem is
\[ y(x) = 2e^{-4x} - e^x. \]
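
For the distinct-real-roots case, finding \(c_1\) and \(c_2\) from two initial conditions amounts to solving a 2-by-2 linear system. The sketch below does this with Cramer's rule; the function `ivp_constants` and its parameter names are assumptions introduced for illustration, and it applies only when the two roots are real and different.

```python
import math


def ivp_constants(lam1, lam2, x0, y0, dy0):
    """Solve for c1, c2 in y = c1*e^{lam1 x} + c2*e^{lam2 x}
    given y(x0) = y0 and y'(x0) = dy0, assuming lam1 != lam2."""
    e1, e2 = math.exp(lam1 * x0), math.exp(lam2 * x0)
    # System:  c1*e1      + c2*e2      = y0
    #          c1*lam1*e1 + c2*lam2*e2 = dy0
    det = e1 * lam2 * e2 - e2 * lam1 * e1  # nonzero because lam1 != lam2
    c1 = (y0 * lam2 * e2 - e2 * dy0) / det
    c2 = (e1 * dy0 - lam1 * e1 * y0) / det
    return c1, c2


# Data from the example: roots -4 and 1, y(0) = 1, y'(0) = -9.
c1, c2 = ivp_constants(-4.0, 1.0, 0.0, 1.0, -9.0)
print(c1, c2)
```

With the example's data this reproduces \(c_1 = 2\) and \(c_2 = -1\).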

 Exercise 17.1.7

Solve the initial-value problem \(y'' - 3y' - 10y = 0\), \(y(0) = 0\), \(y'(0) = 7\).

Hint
Use the initial conditions to determine values for \(c_1\) and \(c_2\).

Answer
\[ y(x) = -e^{-2x} + e^{5x} \]

 Example 17.1.8: Solving an Initial-Value Problem and Graphing the Solution

Solve the following initial-value problem and graph the solution:
\[ y'' + 6y' + 13y = 0, \quad y(0) = 0, \quad y'(0) = 2 \]

Solution
We already solved this differential equation in Example 17.1.6b and found the general solution to be
\[ y(x) = e^{-3x}(c_1\cos 2x + c_2\sin 2x). \]

Then
\[ y'(x) = e^{-3x}(-2c_1\sin 2x + 2c_2\cos 2x) - 3e^{-3x}(c_1\cos 2x + c_2\sin 2x). \]

When \(x = 0\), we have \(y(0) = c_1\) and \(y'(0) = 2c_2 - 3c_1\). Applying the initial conditions, we obtain
\[ \begin{aligned} c_1 &= 0 \\ -3c_1 + 2c_2 &= 2. \end{aligned} \]

Therefore, \(c_1 = 0\), \(c_2 = 1\), and the solution to the initial-value problem, shown in the following graph, is
\[ y = e^{-3x}\sin 2x. \]

[Figure: graph of \(y = e^{-3x}\sin 2x\)]
 Exercise 17.1.8

Solve the following initial-value problem and graph the solution: \(y'' - 2y' + 10y = 0\), \(y(0) = 2\), \(y'(0) = -1\).

Hint
Use the initial conditions to determine values for \(c_1\) and \(c_2\).

Answer
\[ y(x) = e^x(2\cos 3x - \sin 3x) \]

 Example 17.1.9: Initial-Value Problem Representing a Spring-Mass System

The following initial-value problem models the position of an object with mass attached to a spring. Spring-mass systems are
examined in detail in Applications. The solution to the differential equation gives the position of the mass with respect to a
neutral (equilibrium) position (in meters) at any given time. (Note that for spring-mass systems of this type, it is customary to
define the downward direction as positive.)
\[ y'' + 2y' + y = 0, \quad y(0) = 1, \quad y'(0) = 0 \]

Solve the initial-value problem and graph the solution. What is the position of the mass at time \(t = 2\) sec? How fast is the mass moving at time \(t = 1\) sec? In what direction?
Solution
In Example 17.1.6c, we found the general solution to this differential equation. With \(t\) as the independent variable, it is
\[ y(t) = c_1 e^{-t} + c_2 te^{-t}. \]

Then
\[ y'(t) = -c_1 e^{-t} + c_2(-te^{-t} + e^{-t}). \]

When \(t = 0\), we have \(y(0) = c_1\) and \(y'(0) = -c_1 + c_2\). Applying the initial conditions, we obtain
\[ \begin{aligned} c_1 &= 1 \\ -c_1 + c_2 &= 0. \end{aligned} \]

Thus, \(c_1 = 1\), \(c_2 = 1\), and the solution to the initial-value problem is
\[ y(t) = e^{-t} + te^{-t}. \]

This solution is represented in the following graph. At time \(t = 2\), the mass is at position \(y(2) = e^{-2} + 2e^{-2} = 3e^{-2} \approx 0.406\) m below equilibrium.

[Figure: graph of \(y(t) = e^{-t} + te^{-t}\)]

To calculate the velocity at time \(t = 1\), we need to find the derivative. We have \(y(t) = e^{-t} + te^{-t}\), so
\[ y'(t) = -e^{-t} + e^{-t} - te^{-t} = -te^{-t}. \]

Then \(y'(1) = -e^{-1} \approx -0.3679\). At time \(t = 1\), the mass is moving upward at 0.3679 m/sec.
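
The numerical values quoted in this example can be reproduced directly from the formulas for \(y(t)\) and \(y'(t)\). In this minimal sketch the function names `position` and `velocity` are illustrative choices; the sign convention (downward positive) is the one stated in the example.

```python
import math


def position(t):
    """y(t) = e^{-t} + t e^{-t}, in meters, downward positive."""
    return math.exp(-t) + t * math.exp(-t)


def velocity(t):
    """y'(t) = -t e^{-t}, in m/sec; a negative value means upward motion."""
    return -t * math.exp(-t)


y_at_2 = position(2.0)
v_at_1 = velocity(1.0)
direction = "upward" if v_at_1 < 0 else "downward"
print(round(y_at_2, 3), round(v_at_1, 4), direction)
```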

 Exercise 17.1.9
Suppose the following initial-value problem models the position (in feet) of a mass in a spring-mass system at any given time. Solve the initial-value problem and graph the solution. What is the position of the mass at time \(t = 0.3\) sec? How fast is it moving at time \(t = 0.1\) sec? In what direction?
\[ y'' + 14y' + 49y = 0, \quad y(0) = 0, \quad y'(0) = 1 \]

Hint
Use the initial conditions to determine values for \(c_1\) and \(c_2\).

Answer
\[ y(t) = te^{-7t} \]

At time \(t = 0.3\), \(y(0.3) = 0.3e^{(-7 \cdot 0.3)} = 0.3e^{-2.1} \approx 0.0367\). The mass is 0.0367 ft below equilibrium. At time \(t = 0.1\), \(y'(0.1) = 0.3e^{-0.7} \approx 0.1490\). The mass is moving downward at a speed of 0.1490 ft/sec.

 Example 17.1.10: Solving a Boundary-Value Problem

In Example 17.6f. we solved the differential equation y + 16y = 0 and found the general solution to be
′′

y(t) = c cos 4t + c sin 4t. If possible, solve the boundary-value problem if the boundary conditions are the following:
1 2

a. y(0) = 0, y( ) = 0 π

b. y(0) = 1, y(0) = 1, y( π

8
) =0

c. y( ) = 0, y( ) = 2
π

8

Solution
We have

\[ y(t) = c_1 \cos 4t + c_2 \sin 4t. \]

a. Applying the first boundary condition given here, we get \(y(0) = c_1 = 0\). So the solution is of the form \(y(t) = c_2 \sin 4t\). When we apply the second boundary condition, though, we get \(y\left(\frac{\pi}{4}\right) = c_2 \sin\left(4\left(\frac{\pi}{4}\right)\right) = c_2 \sin \pi = 0\) for all values of \(c_2\). The boundary conditions are not sufficient to determine a value for \(c_2\), so this boundary-value problem has infinitely many solutions. Thus, \(y(t) = c_2 \sin 4t\) is a solution for any value of \(c_2\).

b. Applying the first boundary condition given here, we get \(y(0) = c_1 = 1\). Applying the second boundary condition gives \(y\left(\frac{\pi}{8}\right) = c_2 = 0\), so \(c_2 = 0\). In this case, we have a unique solution: \(y(t) = \cos 4t\).

c. Applying the first boundary condition given here, we get \(y\left(\frac{\pi}{8}\right) = c_2 = 0\). However, applying the second boundary condition gives \(y\left(\frac{3\pi}{8}\right) = -c_2 = 2\), so \(c_2 = -2\). We cannot have \(c_2 = 0 = -2\), so this boundary-value problem has no solution.
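
The three outcomes (infinitely many solutions, a unique solution, no solution) correspond to the three possibilities for a 2×2 linear system in \(c_1\) and \(c_2\). The Python sketch below illustrates this under stated assumptions: the helper name `classify_bvp` is hypothetical, the tolerance is an arbitrary choice to absorb floating-point rounding, and the second boundary point in part c is taken to be \(3\pi/8\), which is the value that produces \(y = -c_2\) in the solution.

```python
import math

def classify_bvp(t1, v1, t2, v2, tol=1e-9):
    """Classify y'' + 16y = 0 with y(t1) = v1 and y(t2) = v2.

    With y(t) = c1 cos 4t + c2 sin 4t, the boundary conditions are
        c1 cos 4t1 + c2 sin 4t1 = v1
        c1 cos 4t2 + c2 sin 4t2 = v2
    """
    a, b = math.cos(4 * t1), math.sin(4 * t1)
    c, d = math.cos(4 * t2), math.sin(4 * t2)
    det = a * d - b * c
    if abs(det) > tol:
        # Nonsingular system: Cramer's rule gives the unique coefficients.
        c1 = (v1 * d - b * v2) / det
        c2 = (a * v2 - c * v1) / det
        return "unique", (c1, c2)
    # Singular system. Both rows are unit vectors, so the second row is
    # s times the first with s = +1 or -1; the system is consistent only
    # when the right-hand sides are related in the same way.
    s = a * c + b * d
    if abs(v2 - s * v1) <= tol:
        return "infinitely many", None
    return "none", None

print(classify_bvp(0, 0, math.pi / 4, 0))                # part a
print(classify_bvp(0, 1, math.pi / 8, 0))                # part b
print(classify_bvp(math.pi / 8, 0, 3 * math.pi / 8, 2))  # part c
```

In part b the returned coefficients are approximately \(c_1 = 1\) and \(c_2 = 0\), matching \(y(t) = \cos 4t\).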

Key Concepts
Second-order differential equations can be classified as linear or nonlinear, homogeneous or nonhomogeneous.
To find a general solution for a homogeneous second-order differential equation, we must find two linearly independent solutions. If \(y_1(x)\) and \(y_2(x)\) are linearly independent solutions to a second-order, linear, homogeneous differential equation, then the general solution is given by

\[ y(x) = c_1 y_1(x) + c_2 y_2(x). \]

To solve homogeneous second-order differential equations with constant coefficients, find the roots of the characteristic
equation. The form of the general solution varies depending on whether the characteristic equation has distinct, real roots; a
single, repeated real root; or complex conjugate roots.
Initial conditions or boundary conditions can then be used to find the specific solution to a differential equation that satisfies
those conditions, except when there is no solution or infinitely many solutions.

Key Equations
Linear second-order differential equation
\[ a_2(x)y'' + a_1(x)y' + a_0(x)y = r(x) \]

Second-order equation with constant coefficients
\[ ay'' + by' + cy = 0 \]

Glossary
boundary conditions
the conditions that give the state of a system at different times, such as the position of a spring-mass system at two different
times

boundary-value problem
a differential equation with associated boundary conditions

characteristic equation
the equation \(a\lambda^2 + b\lambda + c = 0\) for the differential equation \(ay'' + by' + cy = 0\)

homogeneous linear equation
a second-order differential equation that can be written in the form \(a_2(x)y'' + a_1(x)y' + a_0(x)y = r(x)\), where \(r(x) = 0\) for every value of \(x\)

nonhomogeneous linear equation
a second-order differential equation that can be written in the form \(a_2(x)y'' + a_1(x)y' + a_0(x)y = r(x)\), where \(r(x) \neq 0\) for some value of \(x\)

linearly dependent
a set of functions \(f_1(x), f_2(x), \ldots, f_n(x)\) for which there are constants \(c_1, c_2, \ldots, c_n\), not all zero, such that \(c_1 f_1(x) + c_2 f_2(x) + \cdots + c_n f_n(x) = 0\) for all \(x\) in the interval of interest

linearly independent
a set of functions \(f_1(x), f_2(x), \ldots, f_n(x)\) for which there are no constants \(c_1, c_2, \ldots, c_n\), not all zero, such that \(c_1 f_1(x) + c_2 f_2(x) + \cdots + c_n f_n(x) = 0\) for all \(x\) in the interval of interest

17.1: Second-Order Linear Equations is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
17.1: Second-Order Linear Equations by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0. Original source:
https://openstax.org/details/books/calculus-volume-1.

17.2: Nonhomogeneous Linear Equations
Now we consider second order equations of the form \(a\ddot{y} + b\dot{y} + cy = f(t)\), with \(a\), \(b\), and \(c\) constant. Of course, if \(a = 0\) this is really a first order equation, so we assume \(a \neq 0\). Also, much as in exercise 20 of section 17.5, if \(c = 0\) we can solve the related first order equation \(a\dot{h} + bh = f(t)\), and then solve \(h = \dot{y}\) for \(y\). So we will only examine examples in which \(c \neq 0\).

Suppose that \(y_1(t)\) and \(y_2(t)\) are solutions to \(a\ddot{y} + b\dot{y} + cy = f(t)\), and consider the function \(h = y_1 - y_2\). We substitute this function into the left hand side of the differential equation and simplify:

\[ a(y_1 - y_2)'' + b(y_1 - y_2)' + c(y_1 - y_2) = ay_1'' + by_1' + cy_1 - (ay_2'' + by_2' + cy_2) = f(t) - f(t) = 0. \tag{17.2.1} \]

So \(h\) is a solution to the homogeneous equation \(a\ddot{y} + b\dot{y} + cy = 0\). Since we know how to find all such \(h\), then with just one particular solution \(y_2\) we can express all possible solutions \(y_1\), namely, \(y_1 = h + y_2\), where now \(h\) is the general solution to the homogeneous equation. Of course, this is exactly how we approached the first order linear equation.
To make use of this observation we need a method to find a single solution \(y_2\). This turns out to be somewhat more difficult than the first order case, but if \(f(t)\) is of a certain simple form, we can find a solution using the method of undetermined coefficients, sometimes more whimsically called the method of judicious guessing.

Example 17.2.1:

Solve the differential equation \(\ddot{y} - \dot{y} - 6y = 18t^2 + 5\).
Solution
The general solution of the homogeneous equation is \(Ae^{3t} + Be^{-2t}\). We guess that a solution to the non-homogeneous equation might look like \(f(t)\) itself, namely, a quadratic \(y = at^2 + bt + c\). Substituting this guess into the differential equation we get

\[ \ddot{y} - \dot{y} - 6y = 2a - (2at + b) - 6(at^2 + bt + c) = -6at^2 + (-2a - 6b)t + (2a - b - 6c). \tag{17.2.2} \]

We want this to equal \(18t^2 + 5\), so we need

\[ \begin{aligned} -6a &= 18 \\ -2a - 6b &= 0 \\ 2a - b - 6c &= 5 \end{aligned} \tag{17.2.3} \]

This is a system of three equations in three unknowns and is not hard to solve: \(a = -3\), \(b = 1\), \(c = -2\). Thus the general solution to the differential equation is \(Ae^{3t} + Be^{-2t} - 3t^2 + t - 2\).
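
The coefficient matching in this example can be reproduced mechanically. The following Python sketch is an illustration only (the variable and function names are ours): it solves the triangular system with exact rational arithmetic and then checks that the resulting quadratic leaves no residual in the differential equation.

```python
from fractions import Fraction

# Match coefficients of t^2, t, and 1 in
#   -6a t^2 + (-2a - 6b) t + (2a - b - 6c) = 18 t^2 + 0 t + 5.
# The system is triangular, so it can be solved from top to bottom.
a = Fraction(18, -6)      # from -6a = 18
b = (2 * a) / -6          # from -2a - 6b = 0
c = (2 * a - b - 5) / 6   # from 2a - b - 6c = 5

def residual(t):
    """Return y'' - y' - 6y - (18t^2 + 5) for y = a t^2 + b t + c."""
    y = a * t**2 + b * t + c
    dy = 2 * a * t + b
    d2y = 2 * a
    return d2y - dy - 6 * y - (18 * t**2 + 5)

print(a, b, c)
print([residual(Fraction(k, 3)) for k in range(-2, 3)])
```

Because the arithmetic is exact, the residual is exactly zero at every sample point, not merely small.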

So the "judicious guess" is a function with the same form as \(f(t)\) but with undetermined (or better, yet to be determined) coefficients. This works whenever \(f(t)\) is a polynomial.

Example 17.2.2:

Consider the initial value problem \(m\ddot{y} + ky = -mg\), \(y(0) = 2\), \(\dot{y}(0) = 50\). The left hand side represents a mass-spring system with no damping, i.e., \(b = 0\). Unlike the homogeneous case, we now consider the force due to gravity, \(-mg\), assuming the spring is vertical at the surface of the earth, so that \(g = 980\). To be specific, let us take \(m = 1\) and \(k = 100\). The general solution to the homogeneous equation is \(A\cos(10t) + B\sin(10t)\). For the solution to the non-homogeneous equation we guess simply a constant \(y = a\), since \(-mg = -980\) is a constant. Then \(\ddot{y} + 100y = 100a\) so \(a = -980/100 = -9.8\). The desired general solution is then \(A\cos(10t) + B\sin(10t) - 9.8\). Substituting the initial conditions we get

\[ \begin{aligned} 2 &= A - 9.8 \\ 50 &= 10B \end{aligned} \tag{17.2.4} \]

so \(A = 11.8\) and \(B = 5\) and the solution is \(11.8\cos(10t) + 5\sin(10t) - 9.8\).

More generally, this method can be used when a function similar to \(f(t)\) has derivatives that are also similar to \(f(t)\); in the examples so far, since \(f(t)\) was a polynomial, so were its derivatives. The method will work if \(f(t)\) has the form \(p(t)e^{\alpha t}\cos(\beta t) + q(t)e^{\alpha t}\sin(\beta t)\), where \(p(t)\) and \(q(t)\) are polynomials; when \(\alpha = \beta = 0\) this is simply \(p(t)\), a polynomial. In the most general form it is not simple to describe the appropriate judicious guess; we content ourselves with some examples to illustrate the process.

Example 17.2.3:

Find the general solution to \(\ddot{y} + 7\dot{y} + 10y = e^{3t}\). The characteristic equation is \(r^2 + 7r + 10 = (r + 5)(r + 2)\), so the solution to the homogeneous equation is \(Ae^{-5t} + Be^{-2t}\). For a particular solution to the inhomogeneous equation we guess \(Ce^{3t}\). Substituting we get

\[ 9Ce^{3t} + 21Ce^{3t} + 10Ce^{3t} = e^{3t}\,40C. \tag{17.2.5} \]

When \(C = 1/40\) this is equal to \(f(t) = e^{3t}\), so the solution is \(Ae^{-5t} + Be^{-2t} + (1/40)e^{3t}\).

Example 17.2.4:

Find the general solution to \(\ddot{y} + 7\dot{y} + 10y = e^{-2t}\). Following the last example we might guess \(Ce^{-2t}\), but since this is a solution to the homogeneous equation it cannot work. Instead we guess \(Cte^{-2t}\). Then

\[ (-2Ce^{-2t} - 2Ce^{-2t} + 4Cte^{-2t}) + 7(Ce^{-2t} - 2Cte^{-2t}) + 10Cte^{-2t} = e^{-2t}(3C). \tag{17.2.6} \]

Then \(C = 1/3\) and the solution is \(Ae^{-5t} + Be^{-2t} + (1/3)te^{-2t}\).

In general, if \(f(t) = e^{kt}\) and \(k\) is one of the roots of the characteristic equation, then we guess \(Cte^{kt}\) instead of \(Ce^{kt}\). If \(k\) is the only root of the characteristic equation, then \(Cte^{kt}\) will not work, and we must guess \(Ct^2e^{kt}\).

Example 17.2.5:

Find the general solution to \(\ddot{y} - 6\dot{y} + 9y = e^{3t}\). The characteristic equation is \(r^2 - 6r + 9 = (r - 3)^2\), so the general solution to the homogeneous equation is \(Ae^{3t} + Bte^{3t}\). Guessing \(Ct^2e^{3t}\) for the particular solution, we get

\[ (9Ct^2e^{3t} + 6Cte^{3t} + 6Cte^{3t} + 2Ce^{3t}) - 6(3Ct^2e^{3t} + 2Cte^{3t}) + 9Ct^2e^{3t} = e^{3t}\,2C. \tag{17.2.7} \]

The solution is thus \(Ae^{3t} + Bte^{3t} + (1/2)t^2e^{3t}\).

It is common in various physical systems to encounter an f (t) of the form a cos(ωt) + b sin(ωt) .

Example 17.2.6:

Find the general solution to \(\ddot{y} + 6\dot{y} + 25y = \cos(4t)\). The roots of the characteristic equation are \(-3 \pm 4i\), so the solution to the homogeneous equation is \(e^{-3t}(A\cos(4t) + B\sin(4t))\). For a particular solution, we guess \(C\cos(4t) + D\sin(4t)\). Substituting as usual:

\[ \begin{aligned} &(-16C\cos(4t) - 16D\sin(4t)) + 6(-4C\sin(4t) + 4D\cos(4t)) + 25(C\cos(4t) + D\sin(4t)) \\ &\quad = (24D + 9C)\cos(4t) + (-24C + 9D)\sin(4t). \end{aligned} \tag{17.2.8} \]

To make this equal to \(\cos(4t)\) we need

\[ \begin{aligned} 24D + 9C &= 1 \\ 9D - 24C &= 0 \end{aligned} \tag{17.2.9} \]

which gives \(C = 1/73\) and \(D = 8/219\). The full solution is then

\[ e^{-3t}(A\cos(4t) + B\sin(4t)) + (1/73)\cos(4t) + (8/219)\sin(4t). \tag{17.2.10} \]

The function \(e^{-3t}(A\cos(4t) + B\sin(4t))\) is a damped oscillation as in example 17.5.3, while \((1/73)\cos(4t) + (8/219)\sin(4t)\) is a simple undamped oscillation. As \(t\) increases, the term \(e^{-3t}(A\cos(4t) + B\sin(4t))\) approaches zero, so the solution \(e^{-3t}(A\cos(4t) + B\sin(4t)) + (1/73)\cos(4t) + (8/219)\sin(4t)\) becomes more and more like the simple oscillation \((1/73)\cos(4t) + (8/219)\sin(4t)\); notice that the initial conditions don't matter to this long term behavior. The damped portion is called the transient part of the solution, and the simple oscillation is called the steady state part of the solution. A physical example is a mass-spring system. If the only force on the mass is due to the spring, then the behavior of the system is a damped oscillation. If in addition an external force is applied to the mass, and if the force varies according to a function of the form \(a\cos(\omega t) + b\sin(\omega t)\), then the long term behavior will be a simple oscillation determined by the steady state part of the general solution; the initial position of the mass will not matter.
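
The claim that the initial conditions stop mattering can be illustrated numerically. The Python sketch below is a minimal illustration, not part of the original text: the function names and the two sample choices of \((A, B)\) are arbitrary assumptions. It recomputes \(C\) and \(D\) from the system \(9C + 24D = 1\), \(-24C + 9D = 0\) by Cramer's rule, then compares two solutions with very different transient parts at \(t = 5\).

```python
import math
from fractions import Fraction

# Steady state coefficients from 9C + 24D = 1, -24C + 9D = 0 (Cramer's rule).
det = Fraction(9 * 9 - 24 * (-24))
C = Fraction(1 * 9 - 24 * 0) / det
D = Fraction(9 * 0 - 1 * (-24)) / det

def steady(t):
    """Steady state part C cos(4t) + D sin(4t)."""
    return float(C) * math.cos(4 * t) + float(D) * math.sin(4 * t)

def solution(t, A, B):
    """Full solution: transient part plus steady state part."""
    transient = math.exp(-3 * t) * (A * math.cos(4 * t) + B * math.sin(4 * t))
    return transient + steady(t)

# Two very different choices of (A, B), that is, of initial conditions.
t = 5.0
gap = abs(solution(t, 10.0, -7.0) - solution(t, -3.0, 2.0))

print(C, D)
print(gap)   # difference between the two solutions at t = 5
```

The difference between the two solutions is of the order of \(e^{-15}\) by \(t = 5\), so both are already indistinguishable from the steady state part for practical purposes.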

As with the exponential form, such a simple guess may not work.

Example 17.2.7:

Find the general solution to \(\ddot{y} + 16y = -\sin(4t)\). The roots of the characteristic equation are \(\pm 4i\), so the solution to the homogeneous equation is \(A\cos(4t) + B\sin(4t)\). Since both \(\cos(4t)\) and \(\sin(4t)\) are solutions to the homogeneous equation, \(C\cos(4t) + D\sin(4t)\) is also, so it cannot be a solution to the non-homogeneous equation. Instead, we guess \(Ct\cos(4t) + Dt\sin(4t)\). Then substituting:

\[ \begin{aligned} &(-16Ct\cos(4t) - 16Dt\sin(4t) + 8D\cos(4t) - 8C\sin(4t)) + 16(Ct\cos(4t) + Dt\sin(4t)) \\ &\quad = 8D\cos(4t) - 8C\sin(4t). \end{aligned} \tag{17.2.11} \]

Thus \(C = 1/8\), \(D = 0\), and the solution is \(A\cos(4t) + B\sin(4t) + (1/8)t\cos(4t)\).

In general, if f (t) = a cos(ωt) + b sin(ωt) , and ±ωi are the roots of the characteristic equation, then instead of
C cos(ωt) + D sin(ωt) we guess C t cos(ωt) + Dt sin(ωt) .

Contributors

17.2: Nonhomogeneous Linear Equations is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
17.6: Second Order Linear Equations by David Guichard is licensed CC BY-NC-SA 4.0.

17.3: Applications of Second-Order Differential Equations
Let us return to the example of a mass on a spring. We now examine the case of forced oscillations, which we did not yet handle. That is, we consider the equation

\[ mx'' + cx' + kx = F(t) \]

for some nonzero \(F(t)\). The setup is again: \(m\) is mass, \(c\) is friction, \(k\) is the spring constant, and \(F(t)\) is an external force acting on the mass.

Figure 17.3.1
What we are interested in is periodic forcing, such as noncentered rotating parts, or perhaps loud sounds, or other sources of periodic force. Once we learn about Fourier series in Chapter 4, we will see that we cover all periodic functions by simply considering \(F(t) = F_0\cos(\omega t)\) (or sine instead of cosine, the calculations are essentially the same).

Undamped Forced Motion and Resonance


First let us consider undamped (\(c = 0\)) motion for simplicity. We have the equation

\[ mx'' + kx = F_0\cos(\omega t) \]

This equation has the complementary solution (solution to the associated homogeneous equation)

\[ x_c = C_1\cos(\omega_0 t) + C_2\sin(\omega_0 t) \]

where \(\omega_0 = \sqrt{k/m}\) is the natural frequency (angular), which is the frequency at which the system “wants to oscillate” without external interference.
Let us suppose that \(\omega_0 \neq \omega\). We try the solution \(x_p = A\cos(\omega t)\) and solve for \(A\). Note that we need not have sine in our trial solution as on the left hand side we will only get cosines anyway. If you include a sine it is fine; you will find that its coefficient will be zero.
We solve using the method of undetermined coefficients. We find that

\[ x_p = \frac{F_0}{m(\omega_0^2 - \omega^2)}\cos(\omega t) \]

We leave it as an exercise to do the algebra required.
The general solution is

\[ x = C_1\cos(\omega_0 t) + C_2\sin(\omega_0 t) + \frac{F_0}{m(\omega_0^2 - \omega^2)}\cos(\omega t) \]

or written another way

\[ x = C\cos(\omega_0 t - \gamma) + \frac{F_0}{m(\omega_0^2 - \omega^2)}\cos(\omega t) \]

Hence it is a superposition of two cosine waves at different frequencies.

 Example 17.3.1

Take

\[ 0.5x'' + 8x = 10\cos(\pi t), \quad x(0) = 0, \quad x'(0) = 0 \]

Let us compute. First we read off the parameters: \(\omega = \pi\), \(\omega_0 = \sqrt{8/0.5} = 4\), \(F_0 = 10\), \(m = 0.5\). The general solution is

\[ x = C_1\cos(4t) + C_2\sin(4t) + \frac{20}{16 - \pi^2}\cos(\pi t) \]

Solve for \(C_1\) and \(C_2\) using the initial conditions. It is easy to see that \(C_1 = \frac{-20}{16 - \pi^2}\) and \(C_2 = 0\). Hence

\[ x = \frac{20}{16 - \pi^2}(\cos(\pi t) - \cos(4t)) \]

Figure 17.3.2: Graph of \(\frac{20}{16 - \pi^2}(\cos(\pi t) - \cos(4t))\).

Notice the “beating” behavior in Figure 17.3.2. First use the trigonometric identity

\[ 2\sin\left(\frac{A - B}{2}\right)\sin\left(\frac{A + B}{2}\right) = \cos B - \cos A \]

to get that

\[ x = \frac{20}{16 - \pi^2}\left(2\sin\left(\frac{4 - \pi}{2}t\right)\sin\left(\frac{4 + \pi}{2}t\right)\right) \]

Notice that \(x\) is a high frequency wave modulated by a low frequency wave.
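
Both forms of the solution, and the fact that it satisfies the differential equation, can be checked numerically. The Python sketch below is an illustration only; the function names and the sampling grid are our own choices. It compares the difference-of-cosines form with the product-of-sines form and evaluates the residual of \(0.5x'' + 8x - 10\cos(\pi t)\) using a hand-computed second derivative.

```python
import math

# K = F0 / (m (w0^2 - w^2)) with F0 = 10, m = 0.5, w0 = 4, w = pi.
K = 20 / (16 - math.pi**2)

def x(t):
    """Solution written as a difference of cosines."""
    return K * (math.cos(math.pi * t) - math.cos(4 * t))

def x_beats(t):
    """Same function written as a slow sine times a fast sine."""
    slow = math.sin((4 - math.pi) / 2 * t)
    fast = math.sin((4 + math.pi) / 2 * t)
    return K * 2 * slow * fast

def x_second(t):
    """Second derivative of x, differentiated by hand."""
    return K * (-math.pi**2 * math.cos(math.pi * t) + 16 * math.cos(4 * t))

ts = [0.1 * k for k in range(200)]
max_identity_gap = max(abs(x(t) - x_beats(t)) for t in ts)
max_ode_residual = max(
    abs(0.5 * x_second(t) + 8 * x(t) - 10 * math.cos(math.pi * t)) for t in ts
)

print(max_identity_gap, max_ode_residual)
```

The slow factor has angular frequency \((4 - \pi)/2\), which is why the envelope in Figure 17.3.2 varies so much more slowly than the oscillation inside it.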


Now suppose that \(\omega_0 = \omega\). Obviously, we cannot try the solution \(A\cos(\omega t)\) and then use the method of undetermined coefficients. We notice that \(\cos(\omega t)\) solves the associated homogeneous equation. Therefore, we need to try \(x_p = At\cos(\omega t) + Bt\sin(\omega t)\). This time we do need the sine term since the second derivative of \(t\cos(\omega t)\) does contain sines. We write the equation

\[ x'' + \omega^2 x = \frac{F_0}{m}\cos(\omega t) \]

Plugging \(x_p\) into the left hand side we get

\[ 2B\omega\cos(\omega t) - 2A\omega\sin(\omega t) = \frac{F_0}{m}\cos(\omega t) \]

Hence \(A = 0\) and \(B = \frac{F_0}{2m\omega}\). Our particular solution is \(\frac{F_0}{2m\omega}t\sin(\omega t)\) and our general solution is

\[ x = C_1\cos(\omega t) + C_2\sin(\omega t) + \frac{F_0}{2m\omega}t\sin(\omega t) \]

The important term is the last one (the particular solution we found). We can see that this term grows without bound as \(t \to \infty\). In fact it oscillates between \(\frac{F_0 t}{2m\omega}\) and \(\frac{-F_0 t}{2m\omega}\). The first two terms only oscillate between \(\pm\sqrt{C_1^2 + C_2^2}\), which becomes smaller and smaller in proportion to the oscillations of the last term as \(t\) gets larger. In Figure 17.3.3 we see the graph with \(C_1 = C_2 = 0\), \(F_0 = 2\), \(m = 1\), \(\omega = \pi\).

Figure 17.3.3: Graph of \(\frac{1}{\pi}t\sin(\pi t)\).
By forcing the system at just the right frequency we produce very wild oscillations. This kind of behavior is called resonance or perhaps pure resonance. Sometimes resonance is desired. For example, remember when as a kid you could start swinging by just moving back and forth on the swing seat at the “correct frequency”? You were trying to achieve resonance. The force of each one of your moves was small, but after a while it produced large swings.
On the other hand resonance can be destructive. In an earthquake some buildings collapse while others may be relatively
undamaged. This is due to different buildings having different resonance frequencies. So figuring out the resonance frequency
can be very important.
A common (but wrong) example of destructive force of resonance is the Tacoma Narrows bridge failure. It turns out there was
a different phenomenon at play. 1

Damped Forced Motion and Practical Resonance


In real life things are not as simple as they were above. There is, of course, some damping. Our equation becomes

\[ mx'' + cx' + kx = F_0\cos(\omega t), \tag{17.3.1} \]

for some \(c > 0\). We have solved the homogeneous problem before. We let

\[ p = \frac{c}{2m}, \qquad \omega_0 = \sqrt{\frac{k}{m}} \]

We replace equation (17.3.1) with

\[ x'' + 2px' + \omega_0^2 x = \frac{F_0}{m}\cos(\omega t) \]

The roots of the characteristic equation of the associated homogeneous problem are \(r_1, r_2 = -p \pm \sqrt{p^2 - \omega_0^2}\). The form of the general solution of the associated homogeneous equation depends on the sign of \(p^2 - \omega_0^2\), or equivalently on the sign of \(c^2 - 4km\), as we have seen before. That is,

\[ x_c = \begin{cases} C_1e^{r_1 t} + C_2e^{r_2 t}, & \text{if } c^2 > 4km, \\ C_1e^{-pt} + C_2te^{-pt}, & \text{if } c^2 = 4km, \\ e^{-pt}(C_1\cos(\omega_1 t) + C_2\sin(\omega_1 t)), & \text{if } c^2 < 4km, \end{cases} \]

where \(\omega_1 = \sqrt{\omega_0^2 - p^2}\). In any case, we can see that \(x_c(t) \to 0\) as \(t \to \infty\). Furthermore, there can be no conflicts when trying to solve for the undetermined coefficients by trying \(x_p = A\cos(\omega t) + B\sin(\omega t)\). Let us plug in and solve for \(A\) and \(B\). We get (the tedious details are left to the reader)

\[ ((\omega_0^2 - \omega^2)B - 2\omega pA)\sin(\omega t) + ((\omega_0^2 - \omega^2)A + 2\omega pB)\cos(\omega t) = \frac{F_0}{m}\cos(\omega t) \]

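We get that

\[ A = \frac{(\omega_0^2 - \omega^2)F_0}{m(2\omega p)^2 + m(\omega_0^2 - \omega^2)^2} \]

\[ B = \frac{2\omega pF_0}{m(2\omega p)^2 + m(\omega_0^2 - \omega^2)^2} \]

We also compute \(C = \sqrt{A^2 + B^2}\) to be

\[ C = \frac{F_0}{m\sqrt{(2\omega p)^2 + (\omega_0^2 - \omega^2)^2}} \]

Thus our particular solution is

\[ x_p = \frac{(\omega_0^2 - \omega^2)F_0}{m(2\omega p)^2 + m(\omega_0^2 - \omega^2)^2}\cos(\omega t) + \frac{2\omega pF_0}{m(2\omega p)^2 + m(\omega_0^2 - \omega^2)^2}\sin(\omega t) \]

Or in the alternative notation we have amplitude \(C\) and phase shift \(\gamma\) where (if \(\omega \neq \omega_0\))

\[ \tan\gamma = \frac{B}{A} = \frac{2\omega p}{\omega_0^2 - \omega^2} \]

Hence we have

\[ x_p = \frac{F_0}{m\sqrt{(2\omega p)^2 + (\omega_0^2 - \omega^2)^2}}\cos(\omega t - \gamma) \]

If \(\omega = \omega_0\) we see that \(A = 0\), \(B = C = \frac{F_0}{2m\omega p}\), and \(\gamma = \frac{\pi}{2}\).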
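
The formulas for \(A\), \(B\), \(C\), and \(\gamma\) can be checked numerically. The Python sketch below is illustrative only: the function names `steady_periodic` and `ode_residual` are hypothetical, and the sample values \(k = 1\), \(m = 1\), \(F_0 = 1\), \(c = 0.7\), \(\omega = 1.1\) are simply one convenient choice. It computes the coefficients and then evaluates the residual of \(mx'' + cx' + kx - F_0\cos(\omega t)\) for \(x = A\cos(\omega t) + B\sin(\omega t)\).

```python
import math

def steady_periodic(m, c, k, F0, w):
    """Return (A, B, C, gamma) with A cos(wt) + B sin(wt) = C cos(wt - gamma)."""
    p = c / (2 * m)
    w0_sq = k / m
    denom = m * ((2 * w * p) ** 2 + (w0_sq - w**2) ** 2)
    A = (w0_sq - w**2) * F0 / denom
    B = 2 * w * p * F0 / denom
    C = math.hypot(A, B)
    # atan2 also covers w = w0, where A = 0 and tan(gamma) is undefined.
    gamma = math.atan2(B, A)
    return A, B, C, gamma

def ode_residual(m, c, k, F0, w, t):
    """m x'' + c x' + k x - F0 cos(wt) for x = A cos(wt) + B sin(wt)."""
    A, B, _, _ = steady_periodic(m, c, k, F0, w)
    x = A * math.cos(w * t) + B * math.sin(w * t)
    dx = -A * w * math.sin(w * t) + B * w * math.cos(w * t)
    d2x = -w**2 * x
    return m * d2x + c * dx + k * x - F0 * math.cos(w * t)

A, B, C, gamma = steady_periodic(m=1.0, c=0.7, k=1.0, F0=1.0, w=1.1)
print(A, B, C, gamma)
print(max(abs(ode_residual(1.0, 0.7, 1.0, 1.0, 1.1, 0.5 * j)) for j in range(20)))
```

Calling `steady_periodic` with \(\omega = \omega_0\) returns \(A = 0\) and \(\gamma = \pi/2\), in agreement with the special case stated above.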

The exact formula is not as important as the idea. Do not memorize the above formula; you should instead remember the ideas involved. For a different forcing function \(F\), you will get a different formula for \(x_p\). So there is no point in memorizing this specific formula. You can always recompute it later or look it up if you really need it.
For reasons we will explain in a moment, we call \(x_c\) the transient solution and denote it by \(x_{tr}\). We call the \(x_p\) we found above the steady periodic solution and denote it by \(x_{sp}\). The general solution to our problem is

\[ x = x_c + x_p = x_{tr} + x_{sp} \]

We note that \(x_c = x_{tr}\) goes to zero as \(t \to \infty\), as all the terms involve an exponential with a negative exponent. Hence for large \(t\), the effect of \(x_{tr}\) is negligible and we will essentially only see \(x_{sp}\). Hence the name transient. Notice that \(x_{sp}\) involves no arbitrary constants, and the initial conditions will only affect \(x_{tr}\). This means that the effect of the initial conditions will be negligible after some period of time. Because of this behavior, we might as well focus on the steady periodic solution and ignore the transient solution. See Figure 17.3.4 for a graph of different initial conditions.

Figure 17.3.4: Solutions with different initial conditions for parameters \(k = 1\), \(m = 1\), \(F_0 = 1\), \(c = 0.7\), and \(\omega = 1.1\).
Notice that the speed at which \(x_{tr}\) goes to zero depends on \(p\) (and hence \(c\)). The bigger \(p\) is (the bigger \(c\) is), the “faster” \(x_{tr}\) becomes negligible. So the smaller the damping, the longer the “transient region.” This agrees with the observation that when \(c = 0\), the initial conditions affect the behavior for all time (i.e. an infinite “transient region”).

Let us describe what we mean by resonance when damping is present. Since there were no conflicts when solving with undetermined coefficients, there is no term that goes to infinity. What we will look at however is the maximum value of the amplitude of the steady periodic solution. Let \(C\) be the amplitude of \(x_{sp}\). If we plot \(C\) as a function of \(\omega\) (with all other parameters fixed) we can find its maximum. We call the \(\omega\) that achieves this maximum the practical resonance frequency. We call the maximal amplitude \(C(\omega)\) the practical resonance amplitude. Thus when damping is present we talk of practical resonance rather than pure resonance. A sample plot for three different values of \(c\) is given in Figure 17.3.5. As you can see the practical resonance amplitude grows as damping gets smaller, and any practical resonance can disappear when damping is large.

Figure 17.3.5: Graph of \(C(\omega)\) showing practical resonance with parameters \(k = 1\), \(m = 1\), \(F_0 = 1\). The top line is with \(c = 0.4\), the middle line with \(c = 0.8\), and the bottom line with \(c = 1.6\).
To find the maximum we need to find the derivative \(C'(\omega)\). Computation shows

\[ C'(\omega) = \frac{-2\omega(2p^2 + \omega^2 - \omega_0^2)F_0}{m\left((2\omega p)^2 + (\omega_0^2 - \omega^2)^2\right)^{3/2}} \]

This is zero either when \(\omega = 0\) or when \(2p^2 + \omega^2 - \omega_0^2 = 0\). In other words, \(C'(\omega) = 0\) when

\[ \omega = \sqrt{\omega_0^2 - 2p^2} \quad \text{or} \quad \omega = 0 \]

It can be shown that if \(\omega_0^2 - 2p^2\) is positive, then \(\sqrt{\omega_0^2 - 2p^2}\) is the practical resonance frequency (that is, the point where \(C(\omega)\) is maximal; note that in this case \(C'(\omega) > 0\) for small \(\omega\)). If \(\omega = 0\) is the maximum, then essentially there is no practical resonance since we assume that \(\omega > 0\) in our system. In this case the amplitude gets larger as the forcing frequency gets smaller.
If practical resonance occurs, the frequency is smaller than \(\omega_0\). As the damping \(c\) (and hence \(p\)) becomes smaller, the practical resonance frequency goes to \(\omega_0\). So when damping is very small, \(\omega_0\) is a good estimate of the resonance frequency. This behavior agrees with the observation that when \(c = 0\), then \(\omega_0\) is the resonance frequency.
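
The practical resonance frequency can be compared against a brute-force search over the amplitude curve. The Python sketch below is only an illustration: the function names, the search interval \((0, 3]\), and the grid size are arbitrary assumptions, and the damping values are the three used in Figure 17.3.5 with \(k = m = F_0 = 1\).

```python
import math

def amplitude(w, m, c, k, F0):
    """Amplitude C(w) of the steady periodic solution."""
    p = c / (2 * m)
    w0_sq = k / m
    return F0 / (m * math.sqrt((2 * w * p) ** 2 + (w0_sq - w**2) ** 2))

def practical_resonance(m, c, k):
    """Return sqrt(w0^2 - 2p^2) if that quantity is positive, else None."""
    p = c / (2 * m)
    radicand = k / m - 2 * p**2
    return math.sqrt(radicand) if radicand > 0 else None

def grid_argmax(m, c, k, F0, w_max=3.0, steps=30000):
    """Forcing frequency in (0, w_max] with the largest amplitude on a grid."""
    ws = [w_max * i / steps for i in range(1, steps + 1)]
    return max(ws, key=lambda w: amplitude(w, m, c, k, F0))

for c in (0.4, 0.8, 1.6):
    print(c, practical_resonance(1.0, c, 1.0), grid_argmax(1.0, c, 1.0, 1.0))
```

For \(c = 1.6\) the quantity \(\omega_0^2 - 2p^2\) is negative, so the formula reports no practical resonance, and the grid search returns the smallest frequency on the grid because the amplitude is decreasing for all \(\omega > 0\).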

Another interesting observation to make is that when \(\omega \to \infty\), then \(C \to 0\). This means that if the forcing frequency gets too high it does not manage to get the mass moving in the mass-spring system. This is quite reasonable intuitively. If we wiggle back and forth really fast while sitting on a swing, we will not get it moving at all, no matter how forceful. Fast vibrations just cancel each other out before the mass has any chance of responding by moving one way or the other.
The behavior is more complicated if the forcing function is not an exact cosine wave, but for example a square wave. A general
periodic function will be the sum (superposition) of many cosine waves of different frequencies. The reader is encouraged to come
back to this section once we have learned about the Fourier series.

Footnotes
1 K. Billah and R. Scanlan, Resonance, Tacoma Narrows Bridge Failure, and Undergraduate Physics Textbooks, American Journal of Physics, 59(2), 1991, 118–124, http://www.ketchum.org/billah/Billah-Scanlan.pdf

17.3: Applications of Second-Order Differential Equations is shared under a not declared license and was authored, remixed, and/or curated by
LibreTexts.
2.6: Forced Oscillations and Resonance by Jiří Lebl is licensed CC BY-SA 4.0. Original source: https://www.jirka.org/diffyqs.

17.4: Series Solutions of Differential Equations
 Learning Objectives
Use power series to solve first-order and second-order differential equations.

Previously, we studied how functions can be represented as power series, \(y(x) = \sum_{n=0}^{\infty} a_n x^n\). We also saw that we can find series representations of the derivatives of such functions by differentiating the power series term by term. This gives

\[ y'(x) = \sum_{n=1}^{\infty} na_n x^{n-1} \]

and

\[ y''(x) = \sum_{n=2}^{\infty} n(n-1)a_n x^{n-2}. \]

In some cases, these power series representations can be used to find solutions to differential equations.
The examples and exercises in this section were chosen so that power series solutions exist. However, it is not always the case that power series solutions exist. Those of you interested in a more rigorous treatment of this topic should review the differential equations section of the LibreTexts.

 Problem-Solving Strategy: Finding Power Series Solutions to Differential Equations


1. Assume the differential equation has a solution of the form
\[ y(x) = \sum_{n=0}^{\infty} a_n x^n. \]
2. Differentiate the power series term by term to get
\[ y'(x) = \sum_{n=1}^{\infty} na_n x^{n-1} \]
and
\[ y''(x) = \sum_{n=2}^{\infty} n(n-1)a_n x^{n-2}. \]
3. Substitute the power series expressions into the differential equation.
4. Re-index sums as necessary to combine terms and simplify the expression.
5. Equate coefficients of like powers of \(x\) to determine values for the coefficients \(a_n\) in the power series.
6. Substitute the coefficients back into the power series and write the solution.

 Example 17.4.1: Series Solutions to Differential Equations


Find a power series solution for the following differential equations.
a. \(y'' - y = 0\)
b. \((x^2 - 1)y'' + 6xy' + 4y = -4\)

Solution
Part a
Assume

\[ y(x) = \sum_{n=0}^{\infty} a_n x^n \tag{step 1} \]

Then,

\[ y'(x) = \sum_{n=1}^{\infty} na_n x^{n-1} \tag{step 2A} \]

and

\[ y''(x) = \sum_{n=2}^{\infty} n(n-1)a_n x^{n-2} \tag{step 2B} \]

We want to find values for the coefficients \(a_n\) such that

\[ \begin{aligned} y'' - y &= 0 \\ \sum_{n=2}^{\infty} n(n-1)a_n x^{n-2} - \sum_{n=0}^{\infty} a_n x^n &= 0. \end{aligned} \tag{step 3} \]

We want the indices on our sums to match so that we can express them using a single summation. That is, we want to rewrite the first summation so that it starts with \(n = 0\).
To re-index the first term, replace \(n\) with \(n + 2\) inside the sum, and change the lower summation limit to \(n = 0\). We get

\[ \sum_{n=2}^{\infty} n(n-1)a_n x^{n-2} = \sum_{n=0}^{\infty} (n+2)(n+1)a_{n+2} x^n. \]

This gives

\[ \begin{aligned} \sum_{n=0}^{\infty} (n+2)(n+1)a_{n+2} x^n - \sum_{n=0}^{\infty} a_n x^n &= 0 \\ \sum_{n=0}^{\infty} [(n+2)(n+1)a_{n+2} - a_n] x^n &= 0. \end{aligned} \tag{step 4} \]

Because power series expansions of functions are unique, this equation can be true only if the coefficients of each power of \(x\) are zero. So we have

\[ (n+2)(n+1)a_{n+2} - a_n = 0 \quad \text{for } n = 0, 1, 2, \ldots. \]

This recurrence relationship allows us to express each coefficient \(a_n\) in terms of the coefficient two terms earlier. This yields one expression for even values of \(n\) and another expression for odd values of \(n\). Looking first at the equations involving even values of \(n\), we see that

\[ \begin{aligned} a_2 &= \frac{a_0}{2} \\ a_4 &= \frac{a_2}{4 \cdot 3} = \frac{a_0}{4!} \\ a_6 &= \frac{a_4}{6 \cdot 5} = \frac{a_0}{6!} \end{aligned} \]

Thus, in general, when \(n\) is even,

\[ a_n = \frac{a_0}{n!}. \tag{step 5} \]

For the equations involving odd values of \(n\), we see that

\[ \begin{aligned} a_3 &= \frac{a_1}{3 \cdot 2} = \frac{a_1}{3!} \\ a_5 &= \frac{a_3}{5 \cdot 4} = \frac{a_1}{5!} \\ a_7 &= \frac{a_5}{7 \cdot 6} = \frac{a_1}{7!} \end{aligned} \]

Therefore, in general, when \(n\) is odd,

\[ a_n = \frac{a_1}{n!}. \tag{step 5} \]

Putting this together, we have

\[ \begin{aligned} y(x) &= \sum_{n=0}^{\infty} a_n x^n \\ &= a_0 + a_1 x + \frac{a_0}{2}x^2 + \frac{a_1}{3!}x^3 + \frac{a_0}{4!}x^4 + \frac{a_1}{5!}x^5 + \cdots. \end{aligned} \]

Re-indexing the sums to account for the even and odd values of \(n\) separately, we obtain

\[ y(x) = a_0 \sum_{k=0}^{\infty} \frac{1}{(2k)!}x^{2k} + a_1 \sum_{k=0}^{\infty} \frac{1}{(2k+1)!}x^{2k+1}. \tag{step 6} \]
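
The recurrence can be run directly to see the series take shape. The Python sketch below is an illustration under stated assumptions: the function names and the sample values \(a_0 = 2\), \(a_1 = -1\), \(x = 0.8\) are arbitrary. It uses the standard facts that \(\sum x^{2k}/(2k)! = \cosh x\) and \(\sum x^{2k+1}/(2k+1)! = \sinh x\) to compare a partial sum with the closed form \(a_0\cosh x + a_1\sinh x\).

```python
import math

def series_coefficients(a0, a1, n_terms):
    """Coefficients a_0, ..., a_{n_terms-1} from (n+2)(n+1) a_{n+2} = a_n."""
    a = [a0, a1]
    for n in range(n_terms - 2):
        a.append(a[n] / ((n + 2) * (n + 1)))
    return a[:n_terms]

def evaluate(coeffs, x):
    """Evaluate sum a_n x^n by Horner's rule."""
    total = 0.0
    for coeff in reversed(coeffs):
        total = total * x + coeff
    return total

a0, a1 = 2.0, -1.0   # arbitrary sample values for the two free constants
coeffs = series_coefficients(a0, a1, 30)
x = 0.8
partial_sum = evaluate(coeffs, x)
closed_form = a0 * math.cosh(x) + a1 * math.sinh(x)

print(partial_sum, closed_form)
```

Since \(\cosh x\) and \(\sinh x\) are themselves combinations of \(e^x\) and \(e^{-x}\), this agreement is consistent with the exponential form of the general solution.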

Analysis for part a.


As expected for a second-order differential equation, this solution depends on two arbitrary constants. However, note that our differential equation is a constant-coefficient differential equation, yet the power series solution does not appear to have the familiar form (containing exponential functions) that we are used to seeing. Furthermore, since \(y(x) = c_0e^x + c_1e^{-x}\) is the general solution to this equation, we must be able to write any solution in this form, and it is not clear whether the power series solution we just found can, in fact, be written in that form.
Fortunately, after writing the power series representations of \(e^x\) and \(e^{-x}\), and doing some algebra, we find that if we choose

\[ c_0 = \frac{a_0 + a_1}{2}, \quad c_1 = \frac{a_0 - a_1}{2}, \]

we then have \(a_0 = c_0 + c_1\) and \(a_1 = c_0 - c_1\), and

\[ \begin{aligned} y(x) &= a_0 + a_1 x + \frac{a_0}{2}x^2 + \frac{a_1}{3!}x^3 + \frac{a_0}{4!}x^4 + \frac{a_1}{5!}x^5 + \cdots \\ &= (c_0 + c_1) + (c_0 - c_1)x + \frac{(c_0 + c_1)}{2}x^2 + \frac{(c_0 - c_1)}{3!}x^3 + \frac{(c_0 + c_1)}{4!}x^4 + \frac{(c_0 - c_1)}{5!}x^5 + \cdots \\ &= c_0 \sum_{n=0}^{\infty} \frac{x^n}{n!} + c_1 \sum_{n=0}^{\infty} \frac{(-x)^n}{n!} \\ &= c_0e^x + c_1e^{-x}. \end{aligned} \]

So we have, in fact, found the same general solution. Note that this choice of \(c_0\) and \(c_1\) is not obvious. This is a case when we know what the answer should be, and have essentially “reverse-engineered” our choice of coefficients.
Part b
Assume

\[ y(x) = \sum_{n=0}^{\infty} a_n x^n \tag{step 1} \]

Then,

\[ y'(x) = \sum_{n=1}^{\infty} na_n x^{n-1} \tag{step 2} \]

and

\[ y''(x) = \sum_{n=2}^{\infty} n(n-1)a_n x^{n-2} \tag{step 2} \]

We want to find values for the coefficients \(a_n\) such that

\[ \begin{aligned} (x^2 - 1)y'' + 6xy' + 4y &= -4 \\ (x^2 - 1)\sum_{n=2}^{\infty} n(n-1)a_n x^{n-2} + 6x\sum_{n=1}^{\infty} na_n x^{n-1} + 4\sum_{n=0}^{\infty} a_n x^n &= -4 \\ x^2\sum_{n=2}^{\infty} n(n-1)a_n x^{n-2} - \sum_{n=2}^{\infty} n(n-1)a_n x^{n-2} + 6x\sum_{n=1}^{\infty} na_n x^{n-1} + 4\sum_{n=0}^{\infty} a_n x^n &= -4. \end{aligned} \]

Taking the external factors inside the summations, we get

\[ \sum_{n=2}^{\infty} n(n-1)a_n x^n - \sum_{n=2}^{\infty} n(n-1)a_n x^{n-2} + \sum_{n=1}^{\infty} 6na_n x^n + \sum_{n=0}^{\infty} 4a_n x^n = -4. \tag{step 3} \]
n=2 n=2 n=1 n=0

Now, in the first summation, we see that when \(n = 0\) or \(n = 1\), the term evaluates to zero, so we can add these terms back into our sum to get

\[ \sum_{n=2}^{\infty} n(n-1)a_n x^n = \sum_{n=0}^{\infty} n(n-1)a_n x^n. \]

Similarly, in the third term, we see that when \(n = 0\), the expression evaluates to zero, so we can add that term back in as well. We have

\[ \sum_{n=1}^{\infty} 6na_n x^n = \sum_{n=0}^{\infty} 6na_n x^n. \]

Then, we need only shift the indices in our second term. We get

\[ \sum_{n=2}^{\infty} n(n-1)a_n x^{n-2} = \sum_{n=0}^{\infty} (n+2)(n+1)a_{n+2} x^n. \]

Thus, we have

\[ \begin{aligned} \sum_{n=0}^{\infty} n(n-1)a_n x^n - \sum_{n=0}^{\infty} (n+2)(n+1)a_{n+2} x^n + \sum_{n=0}^{\infty} 6na_n x^n + \sum_{n=0}^{\infty} 4a_n x^n &= -4 \quad \text{(step 4)} \\ \sum_{n=0}^{\infty} [n(n-1)a_n - (n+2)(n+1)a_{n+2} + 6na_n + 4a_n] x^n &= -4 \\ \sum_{n=0}^{\infty} [(n^2 - n)a_n + 6na_n + 4a_n - (n+2)(n+1)a_{n+2}] x^n &= -4 \\ \sum_{n=0}^{\infty} [n^2a_n + 5na_n + 4a_n - (n+2)(n+1)a_{n+2}] x^n &= -4 \\ \sum_{n=0}^{\infty} [(n^2 + 5n + 4)a_n - (n+2)(n+1)a_{n+2}] x^n &= -4 \\ \sum_{n=0}^{\infty} [(n+4)(n+1)a_n - (n+2)(n+1)a_{n+2}] x^n &= -4 \end{aligned} \]

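Looking at the coefficients of each power of \(x\), we see that the constant term must be equal to \(-4\), and the coefficients of all other powers of \(x\) must be zero. Then, looking first at the constant term,

\[ \begin{aligned} 4a_0 - 2a_2 &= -4 \\ a_2 &= 2a_0 + 2 \end{aligned} \tag{step 5} \]

For \(n \geq 1\), we have

\[ \begin{aligned} (n+4)(n+1)a_n - (n+2)(n+1)a_{n+2} &= 0 \\ (n+1)[(n+4)a_n - (n+2)a_{n+2}] &= 0. \end{aligned} \]

Since \(n \geq 1\), \(n + 1 \neq 0\), we see that

\[ (n+4)a_n - (n+2)a_{n+2} = 0 \]

and thus

\[ a_{n+2} = \frac{n+4}{n+2}a_n. \]

For even values of \(n\), we have

\[ \begin{aligned} a_4 &= \frac{6}{4}(2a_0 + 2) = 3a_0 + 3 \\ a_6 &= \frac{8}{6}(3a_0 + 3) = 4a_0 + 4 \end{aligned} \]

In general, for \(k \geq 1\),

\[ a_{2k} = (k+1)(a_0 + 1). \tag{step 5} \]

Note that this formula does not apply when \(k = 0\): the constant term of the series is \(a_0\) itself, not \(a_0 + 1\).

For odd values of \(n\), we have

\[ \begin{aligned} a_3 &= \frac{5}{3}a_1 \\ a_5 &= \frac{7}{5}a_3 = \frac{7}{3}a_1 \\ a_7 &= \frac{9}{7}a_5 = \frac{9}{3}a_1 = 3a_1 \end{aligned} \]

In general,

\[ a_{2k+1} = \frac{2k+3}{3}a_1. \tag{step 5 continued} \]

Putting this together, we have

\[ y(x) = a_0 + \sum_{k=1}^{\infty} (k+1)(a_0 + 1)x^{2k} + \sum_{k=0}^{\infty} \left(\frac{2k+3}{3}\right)a_1 x^{2k+1}. \tag{step 6} \]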
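
A series solution like this one is easy to test by machine. The Python sketch below is an illustration only: the function names, the sample constants \(a_0 = 1/2\) and \(a_1 = -2\), and the evaluation point \(x = 3/10\) are arbitrary assumptions. It builds the coefficients from \(a_2 = 2a_0 + 2\) and \(a_{n+2} = \frac{n+4}{n+2}a_n\) for \(n \geq 1\), then substitutes a 60-term partial sum into \((x^2 - 1)y'' + 6xy' + 4y + 4\).

```python
from fractions import Fraction

def coefficients(a0, a1, n_terms):
    """a_n for (x^2 - 1) y'' + 6x y' + 4y = -4, built from the recurrence."""
    a = [a0, a1, 2 * a0 + 2]               # constant term: 4 a0 - 2 a2 = -4
    for n in range(1, n_terms - 2):
        a.append(Fraction(n + 4, n + 2) * a[n])   # valid for n >= 1
    return a[:n_terms]

def residual(a, x):
    """(x^2 - 1) y'' + 6x y' + 4y + 4 for the partial sum with coefficients a."""
    y = sum(c * x**n for n, c in enumerate(a))
    dy = sum(n * c * x**(n - 1) for n, c in enumerate(a) if n >= 1)
    d2y = sum(n * (n - 1) * c * x**(n - 2) for n, c in enumerate(a) if n >= 2)
    return (x**2 - 1) * d2y + 6 * x * dy + 4 * y + 4

a0, a1 = Fraction(1, 2), Fraction(-2)      # arbitrary sample constants
a = coefficients(a0, a1, 60)
r = residual(a, Fraction(3, 10))

print(a[:8])
print(float(r))   # truncation error only; tiny for |x| well below 1
```

The printed coefficients show that \(a_{2k} = (k+1)(a_0 + 1)\) holds from \(k = 1\) onward, while the \(k = 0\) coefficient is \(a_0\) itself. The residual is not exactly zero only because the series is truncated.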

 Exercise 17.4.1

Find a power series solution for the following differential equations.

a. \(y' + 2xy = 0\)
b. \((x + 1)y' = 3y\)

Hint
Follow the problem-solving strategy.

Answer a
\[ y(x) = a_0 \sum_{n=0}^{\infty} \frac{(-1)^n}{n!}x^{2n} = a_0e^{-x^2} \]

Answer b
\[ y(x) = a_0(x + 1)^3 \]
Bessel functions
We close this section with a brief introduction to Bessel functions. Complete treatment of Bessel functions is well beyond the scope
of this course, but we get a little taste of the topic here so we can see how series solutions to differential equations are used in real-
world applications. The Bessel equation of order n is given by
\[ x^2y'' + xy' + (x^2 - n^2)y = 0. \]

This equation arises in many physical applications, particularly those involving cylindrical coordinates, such as the vibration of a
circular drum head and transient heating or cooling of a cylinder. In the next example, we find a power series solution to the Bessel
equation of order 0.

 Example 17.4.2: Power Series Solution to the Bessel Equation

Find a power series solution to the Bessel equation of order 0 and graph the solution.
Solution
The Bessel equation of order 0 is given by

\[ x^2y'' + xy' + x^2y = 0. \]

We assume a solution of the form \(y = \sum_{n=0}^{\infty} a_n x^n\). Then \(y'(x) = \sum_{n=1}^{\infty} na_n x^{n-1}\) and \(y''(x) = \sum_{n=2}^{\infty} n(n-1)a_n x^{n-2}\). Substituting this into the differential equation, we get

\[ \begin{aligned} x^2\sum_{n=2}^{\infty} n(n-1)a_n x^{n-2} + x\sum_{n=1}^{\infty} na_n x^{n-1} + x^2\sum_{n=0}^{\infty} a_n x^n &= 0 && \text{Substitution.} \\ \sum_{n=2}^{\infty} n(n-1)a_n x^n + \sum_{n=1}^{\infty} na_n x^n + \sum_{n=0}^{\infty} a_n x^{n+2} &= 0 && \text{Bring external factors within sums.} \\ \sum_{n=2}^{\infty} n(n-1)a_n x^n + \sum_{n=1}^{\infty} na_n x^n + \sum_{n=2}^{\infty} a_{n-2} x^n &= 0 && \text{Re-index third sum.} \\ \sum_{n=2}^{\infty} n(n-1)a_n x^n + a_1 x + \sum_{n=2}^{\infty} na_n x^n + \sum_{n=2}^{\infty} a_{n-2} x^n &= 0 && \text{Separate } n = 1 \text{ term from second sum.} \\ a_1 x + \sum_{n=2}^{\infty} [n(n-1)a_n + na_n + a_{n-2}] x^n &= 0 && \text{Collect summation terms.} \\ a_1 x + \sum_{n=2}^{\infty} [(n^2 - n)a_n + na_n + a_{n-2}] x^n &= 0 && \text{Multiply through in first term.} \\ a_1 x + \sum_{n=2}^{\infty} [n^2a_n + a_{n-2}] x^n &= 0. && \text{Simplify.} \end{aligned} \]

Then \(a_1 = 0\), and for \(n \ge 2\),

\[n^2 a_n + a_{n-2} = 0\]

\[a_n = -\frac{1}{n^2}\,a_{n-2}.\]

Because \(a_1 = 0\), all odd terms are zero. Then, for even values of \(n\), we have

\[a_2 = -\frac{1}{2^2}\,a_0\]

\[a_4 = -\frac{1}{4^2}\,a_2 = \frac{1}{4^2 \cdot 2^2}\,a_0\]

\[a_6 = -\frac{1}{6^2}\,a_4 = -\frac{1}{6^2 \cdot 4^2 \cdot 2^2}\,a_0.\]

In general,

\[a_{2k} = \frac{(-1)^k}{2^{2k}(k!)^2}\,a_0.\]

Thus, we have

\[y(x) = a_0 \sum_{k=0}^{\infty} \frac{(-1)^k}{2^{2k}(k!)^2}\,x^{2k}.\]

The graph appears below.
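
The values behind such a graph can be tabulated numerically. The sketch below is an illustration under stated assumptions rather than the text's own method: it builds the even coefficients from the recurrence \(a_n = -a_{n-2}/n^2\), confirms they match the closed form, and evaluates a partial sum. The function names, the choice \(a_0 = 1\), and the truncation at 30 terms are all assumptions made here; with \(a_0 = 1\) the series is the Bessel function of the first kind of order 0.

```python
from fractions import Fraction
from math import factorial


def bessel0_coefficients(k_max, a0=1):
    """Even coefficients a_0, a_2, ..., a_{2*k_max} from a_n = -a_{n-2} / n**2."""
    coeffs = [Fraction(a0)]
    for k in range(1, k_max + 1):
        n = 2 * k
        coeffs.append(-coeffs[-1] / n**2)
    return coeffs


def bessel0_partial_sum(x, k_max=30, a0=1):
    """Partial sum of a0 * sum_k (-1)^k x^(2k) / (2^(2k) (k!)^2), k = 0..k_max."""
    coeffs = bessel0_coefficients(k_max, a0)
    return sum(float(c) * x ** (2 * k) for k, c in enumerate(coeffs))


# The recurrence reproduces the closed form a_{2k} = (-1)^k / (2^(2k) (k!)^2) * a_0.
closed_form_ok = all(
    c == Fraction((-1) ** k, 2 ** (2 * k) * factorial(k) ** 2)
    for k, c in enumerate(bessel0_coefficients(10))
)

# Sample points on [0, 10] that could be passed to any plotting tool.
table = [(x / 2, bessel0_partial_sum(x / 2)) for x in range(0, 21)]
print(closed_form_ok)
```

Thirty terms are far more than needed on this interval, since the factor \((k!)^2\) in the denominator makes the terms shrink very quickly.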

Exercise 17.4.2

Verify that the expression found in Example 17.4.2 is a solution to the Bessel equation of order 0.

Hint
Differentiate the power series term by term and substitute it into the differential equation.

Key Concepts
Power series representations of functions can sometimes be used to find solutions to differential equations.
Differentiate the power series term by term and substitute into the differential equation to find relationships between the power
series coefficients.

17.4: Series Solutions of Differential Equations is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
17.4: Series Solutions of Differential Equations by Edwin “Jed” Herman, Gilbert Strang is licensed CC BY-NC-SA 4.0. Original source:
https://openstax.org/details/books/calculus-volume-1.

17.4.7 https://math.libretexts.org/@go/page/4568
Index
A
arc length
    8.1: Arc Length
    13.3: Arc Length and Curvature

C
carrying capacity
    9.6: Predator-Prey Systems

D
Directional Derivatives
    14.6: Directional Derivatives and the Gradient Vector
Divergence Theorem
    16.9: The Divergence Theorem

G
Gradient Vector
    14.6: Directional Derivatives and the Gradient Vector
Green's theorem
    16.4: Green's Theorem

I
indeterminate forms
    4.4: Indeterminate Forms and l'Hospital's Rule

L
L’Hôpital’s rule
    4.4: Indeterminate Forms and l'Hospital's Rule

S
Stokes’ Theorem
    16.8: Stokes' Theorem

V
Volume by Shells
    6.3: Volumes by Cylindrical Shells
1 https://math.libretexts.org/@go/page/38026
Glossary
absolute convergence | if the series \(\displaystyle \sum^{\infty}_{n=1}|a_n|\) converges, the series \(\displaystyle \sum^{\infty}_{n=1}a_n\) is said to converge absolutely
absolute error | if B is an estimate of some quantity having an actual value of A, then the absolute error is given by \(|A-B|\)
absolute extremum | if f has an absolute maximum or absolute minimum at c, we say f has an absolute extremum at c
absolute maximum | if \(f(c) \ge f(x)\) for all x in the domain of f, we say f has an absolute maximum at c
absolute minimum | if \(f(c) \le f(x)\) for all x in the domain of f, we say f has an absolute minimum at c
absolute value function | \(f(x)=\begin{cases}-x, & \text{if } x<0\\ x, & \text{if } x \ge 0\end{cases}\)
acceleration | the rate of change of the velocity, that is, the derivative of velocity
acceleration vector | the second derivative of the position vector
algebraic function | a function involving any combination of only the basic operations of addition, subtraction, multiplication, division, powers, and roots applied to an input variable x
alternating series | a series of the form \(\displaystyle \sum^{\infty}_{n=1}(-1)^{n+1}b_n\) or \(\displaystyle \sum^{\infty}_{n=1}(-1)^{n}b_n\), where \(b_n \ge 0\), is called an alternating series
alternating series test | for an alternating series of either form, if \(b_{n+1} \le b_n\) for all integers \(n \ge 1\) and \(b_n \to 0\), then the alternating series converges
amount of change | the amount of change of a function \(f(x)\) over an interval \([x,x+h]\) is \(f(x+h)-f(x)\)
angular coordinate | \(\theta\), the angle formed by a line segment connecting the origin to a point in the polar coordinate system with the positive radial (x) axis, measured counterclockwise
antiderivative | a function F such that \(F'(x)=f(x)\) for all x in the domain of f is an antiderivative of f
arc length | the arc length of a curve can be thought of as the distance a person would travel along the path of the curve
arc-length function | a function \(s(t)\) that describes the arc length of curve C as a function of t
arc-length parameterization | a reparameterization of a vector-valued function in which the parameter is equal to the arc length
arithmetic sequence | a sequence in which the difference between every pair of consecutive terms is the same is called an arithmetic sequence
asymptotically semi-stable solution | \(y=k\) if it is neither asymptotically stable nor asymptotically unstable
asymptotically stable solution | \(y=k\) if there exists \(\varepsilon>0\) such that for any value \(c \in (k-\varepsilon,k+\varepsilon)\) the solution to the initial-value problem \(y'=f(x,y),\ y(x_0)=c\) approaches k as x approaches infinity
asymptotically unstable solution | \(y=k\) if there exists \(\varepsilon>0\) such that for any value \(c \in (k-\varepsilon,k+\varepsilon)\) the solution to the initial-value problem \(y'=f(x,y),\ y(x_0)=c\) never approaches k as x approaches infinity
autonomous differential equation | an equation in which the right-hand side is a function of y alone
average rate of change | of a function \(f(x)\) over an interval \([x,x+h]\) is \(\dfrac{f(x+h)-f(x)}{h}\)
average value of a function | (or \(f_{ave}\)) the average value of a function on an interval can be found by calculating the definite integral of the function and dividing that value by the length of the interval
average velocity | the change in an object’s position divided by the length of a time period; the average velocity of an object over a time interval \([t,a]\) (if \(t<a\)) or \([a,t]\) (if \(t>a\)), with a position given by \(s(t)\), is \(v_{ave}=\dfrac{s(t)-s(a)}{t-a}\)
base | the number b in the exponential function \(f(x)=b^x\) and the logarithmic function \(f(x)=\log_b x\)
binomial series | the Maclaurin series for \(f(x)=(1+x)^r\); it is given by \((1+x)^r=\displaystyle\sum_{n=0}^{\infty}\binom{r}{n}x^n=1+rx+\dfrac{r(r-1)}{2!}x^2+\cdots+\dfrac{r(r-1)\cdots(r-n+1)}{n!}x^n+\cdots\) for \(|x|<1\)
binormal vector | a unit vector orthogonal to the unit tangent vector and the unit normal vector
boundary conditions | the conditions that give the state of a system at different times, such as the position of a spring-mass system at two different times
boundary point | a point \(P_0\) of R is a boundary point if every \(\delta\) disk centered around \(P_0\) contains points both inside and outside R
boundary-value problem | a differential equation with associated boundary conditions
bounded above | a sequence \(\{a_n\}\) is bounded above if there exists a constant M such that \(a_n \le M\) for all positive integers n
bounded below | a sequence \(\{a_n\}\) is bounded below if there exists a constant M such that \(M \le a_n\) for all positive integers n
bounded sequence | a sequence \(\{a_n\}\) is bounded if there exists a constant M such that \(|a_n| \le M\) for all positive integers n
cardioid | a plane curve traced by a point on the perimeter of a circle that is rolling around a fixed circle of the same radius; the equation of a cardioid is \(r=a(1+\sin\theta)\) or \(r=a(1+\cos\theta)\)
carrying capacity | the maximum population of an organism that the environment can sustain indefinitely
catenary | a curve in the shape of the function \(y=a\cdot\cosh(x/a)\) is a catenary; a cable of uniform density suspended between two supports assumes the shape of a catenary
center of mass | the point at which the total mass of the system could be concentrated without changing the moment
centroid | the centroid of a region is the geometric center of the region; laminas are often represented by regions in the plane; if the lamina has a constant density, the center of mass of the lamina depends only on the shape of the corresponding planar region; in this case, the center of mass of the lamina corresponds to the centroid of the representative region
chain rule | the chain rule defines the derivative of a composite function as the derivative of the outer function evaluated at the inner function times the derivative of the inner function
change of variables | the substitution of a variable, such as u, for an expression in the integrand
characteristic equation | the equation \(a\lambda^2+b\lambda+c=0\) for the differential equation \(ay''+by'+cy=0\)
circulation | the tendency of a fluid to move in the direction of curve C. If C is a closed curve, then the circulation of \(\vec{F}\) along C is the line integral \(\int_C \vec{F}\cdot\vec{T}\,ds\), which we also denote \(\oint_C \vec{F}\cdot\vec{T}\,ds\).
closed curve | a curve for which there exists a parameterization \(\vec{r}(t),\ a \le t \le b\), such that \(\vec{r}(a)=\vec{r}(b)\), and the curve is traversed exactly once
closed curve | a curve that begins and ends at the same point
closed set | a set S that contains all its boundary points
comparison test | if \(0 \le a_n \le b_n\) for all \(n \ge N\) and \(\displaystyle \sum^{\infty}_{n=1}b_n\) converges, then \(\displaystyle \sum^{\infty}_{n=1}a_n\) converges; if \(a_n \ge b_n \ge 0\) for all \(n \ge N\) and \(\displaystyle \sum^{\infty}_{n=1}b_n\) diverges, then \(\displaystyle \sum^{\infty}_{n=1}a_n\) diverges
complementary equation | for the nonhomogeneous linear differential equation \(a_2(x)y''+a_1(x)y'+a_0(x)y=r(x)\), the associated homogeneous equation, called the complementary equation, is \(a_2(x)y''+a_1(x)y'+a_0(x)y=0\)
component | a scalar that describes either the vertical or horizontal direction of a vector
component functions | the component functions of the vector-valued function \(\vec{r}(t)=f(t)\hat{\mathbf{i}}+g(t)\hat{\mathbf{j}}\) are \(f(t)\) and \(g(t)\), and the component functions of the vector-valued function \(\vec{r}(t)=f(t)\hat{\mathbf{i}}+g(t)\hat{\mathbf{j}}+h(t)\hat{\mathbf{k}}\) are \(f(t)\), \(g(t)\), and \(h(t)\)
composite function | given two functions f and g, a new function, denoted \(g\circ f\), such that \((g\circ f)(x)=g(f(x))\)
computer algebra system (CAS) | technology used to perform many mathematical tasks, including integration
concave down | if f is differentiable over an interval I and \(f'\) is decreasing over I, then f is concave down over I
concave up | if f is differentiable over an interval I and \(f'\) is increasing over I, then f is concave up over I
concavity | the upward or downward curve of the graph of a function
concavity test | suppose f is twice differentiable over an interval I; if \(f''>0\) over I, then f is concave up over I; if \(f''<0\) over I, then f is concave down over I
conditional convergence | if the series \(\displaystyle \sum^{\infty}_{n=1}a_n\) converges, but the series \(\displaystyle \sum^{\infty}_{n=1}|a_n|\) diverges, the series \(\displaystyle \sum^{\infty}_{n=1}a_n\) is said to converge conditionally
conic section | a conic section is any curve formed by the intersection of a plane with a cone of two nappes
connected region | a region in which any two points can be connected by a path with a trace contained entirely inside the region
connected set | an open set S that cannot be represented as the union of two or more disjoint, nonempty open subsets
conservative field | a vector field for which there exists a scalar function f such that \(\vec{\nabla} f=\vec{F}\)

1 https://math.libretexts.org/@go/page/51389
constant multiple law for limits | the limit law \(\lim_{x\to a}cf(x)=c\cdot\lim_{x\to a}f(x)=cL\)
constant multiple rule | the derivative of a constant c multiplied by a function f is the same as the constant multiplied by the derivative: \(\dfrac{d}{dx}\big(cf(x)\big)=cf'(x)\)
constant rule | the derivative of a constant function is zero: \(\dfrac{d}{dx}(c)=0\), where c is a constant
constraint | an inequality or equation involving one or more variables that is used in an optimization problem; the constraint enforces a limit on the possible solutions for the problem
continuity at a point | a function \(f(x)\) is continuous at a point a if and only if the following three conditions are satisfied: (1) \(f(a)\) is defined, (2) \(\displaystyle \lim_{x\to a}f(x)\) exists, and (3) \(\displaystyle \lim_{x\to a}f(x)=f(a)\)
continuity from the left | a function is continuous from the left at b if \(\displaystyle \lim_{x\to b^-}f(x)=f(b)\)
continuity from the right | a function is continuous from the right at a if \(\displaystyle \lim_{x\to a^+}f(x)=f(a)\)
continuity over an interval | a function that can be traced with a pencil without lifting the pencil; a function is continuous over an open interval if it is continuous at every point in the interval; a function \(f(x)\) is continuous over a closed interval of the form \([a,b]\) if it is continuous at every point in \((a,b)\), and it is continuous from the right at a and from the left at b
contour map | a plot of the various level curves of a given function \(f(x,y)\)
convergence of a series | a series converges if the sequence of partial sums for that series converges
convergent sequence | a convergent sequence is a sequence \(\{a_n\}\) for which there exists a real number L such that \(a_n\) is arbitrarily close to L as long as n is sufficiently large
coordinate plane | a plane containing two of the three coordinate axes in the three-dimensional coordinate system, named by the axes it contains: the xy-plane, xz-plane, or the yz-plane
critical point | if \(f'(c)=0\) or \(f'(c)\) is undefined, we say that c is a critical point of f
critical point of a function of two variables | the point \((x_0,y_0)\) is called a critical point of \(f(x,y)\) if one of the two following conditions holds: 1. \(f_x(x_0,y_0)=f_y(x_0,y_0)=0\); 2. at least one of \(f_x(x_0,y_0)\) and \(f_y(x_0,y_0)\) does not exist
cross product | \(\vec{u}\times\vec{v}=(u_2v_3-u_3v_2)\mathbf{\hat i}-(u_1v_3-u_3v_1)\mathbf{\hat j}+(u_1v_2-u_2v_1)\mathbf{\hat k}\), where \(\vec{u}=\langle u_1,u_2,u_3\rangle\) and \(\vec{v}=\langle v_1,v_2,v_3\rangle\). Related terms: determinant, a real number associated with a square matrix; parallelepiped, a three-dimensional prism with six faces that are parallelograms; torque, the effect of a force that causes an object to rotate; triple scalar product, the dot product of a vector with the cross product of two other vectors: \(\vec{u}\cdot(\vec{v}\times\vec{w})\); vector product, the cross product of two vectors
cross-section | the intersection of a plane and a solid object
cubic function | a polynomial of degree 3; that is, a function of the form \(f(x)=ax^3+bx^2+cx+d\), where \(a\ne 0\)
curl | the curl of vector field \(\vec{F}=\langle P,Q,R\rangle\), denoted \(\vec{\nabla}\times\vec{F}\), is the “determinant” of the matrix \(\begin{vmatrix} \mathbf{\hat i} & \mathbf{\hat j} & \mathbf{\hat k} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ P & Q & R \end{vmatrix}\) and is given by the expression \((R_y-Q_z)\,\mathbf{\hat i}+(P_z-R_x)\,\mathbf{\hat j}+(Q_x-P_y)\,\mathbf{\hat k}\); it measures the tendency of particles at a point to rotate about the axis that points in the direction of the curl at the point
curvature | the derivative of the unit tangent vector with respect to the arc-length parameter
cusp | a pointed end or part where two curves meet
cycloid | the curve traced by a point on the rim of a circular wheel as the wheel rolls along a straight line without slippage
cylinder | a set of lines parallel to a given line passing through a given curve
cylindrical coordinate system | a way to describe a location in space with an ordered triple \((r,\theta,z)\), where \((r,\theta)\) represents the polar coordinates of the point’s projection in the xy-plane, and z represents the point’s projection onto the z-axis
decreasing on the interval I | a function is decreasing on the interval I if, for all \(x_1,\,x_2\in I\), \(f(x_1)\ge f(x_2)\) if \(x_1<x_2\)
definite integral | a primary operation of calculus; the area between the curve and the x-axis over a given interval is a definite integral
definite integral of a vector-valued function | the vector obtained by calculating the definite integral of each of the component functions of a given vector-valued function, then using the results as the components of the resulting function
degree | for a polynomial function, the value of the largest exponent of any term
density function | a density function describes how mass is distributed throughout an object; it can be a linear density, expressed in terms of mass per unit length; an area density, expressed in terms of mass per unit area; or a volume density, expressed in terms of mass per unit volume; weight-density is also used to describe weight (rather than mass) per unit volume
dependent variable | the output variable for a function
derivative | the slope of the tangent line to a function at a point, calculated by taking the limit of the difference quotient, is the derivative
derivative function | gives the derivative of a function at each point in the domain of the original function for which the derivative is defined
derivative of a vector-valued function | the derivative of a vector-valued function \(\vec{r}(t)\) is \(\vec{r}\,'(t)=\lim\limits_{\Delta t\to 0}\dfrac{\vec{r}(t+\Delta t)-\vec{r}(t)}{\Delta t}\), provided the limit exists
difference law for limits | the limit law \(\lim_{x\to a}(f(x)-g(x))=\lim_{x\to a}f(x)-\lim_{x\to a}g(x)=L-M\)
difference quotient | of a function \(f(x)\) at a is given by \(\dfrac{f(a+h)-f(a)}{h}\) or \(\dfrac{f(x)-f(a)}{x-a}\)
difference rule | the derivative of the difference of a function f and a function g is the same as the difference of the derivative of f and the derivative of g: \(\dfrac{d}{dx}\big(f(x)-g(x)\big)=f'(x)-g'(x)\)
differentiable | a function \(f(x,y)\) is differentiable at \((x_0,y_0)\) if \(f(x,y)\) can be expressed in the form \(f(x,y)=f(x_0,y_0)+f_x(x_0,y_0)(x-x_0)+f_y(x_0,y_0)(y-y_0)+E(x,y)\), where the error term \(E(x,y)\) satisfies \(\lim_{(x,y)\to(x_0,y_0)}\dfrac{E(x,y)}{\sqrt{(x-x_0)^2+(y-y_0)^2}}=0\)
differentiable at a | a function for which \(f'(a)\) exists is differentiable at a
differentiable function | a function for which \(f'(x)\) exists is a differentiable function
differentiable on S | a function for which \(f'(x)\) exists for each x in the open set S is differentiable on S
differential | the differential dx is an independent variable that can be assigned any nonzero real number; the differential dy is defined to be \(dy=f'(x)\,dx\)
differential calculus | the field of calculus concerned with the study of derivatives and their applications
differential equation | an equation involving a function \(y=y(x)\) and one or more of its derivatives
differential form | given a differentiable function \(y=f(x)\), the equation \(dy=f'(x)\,dx\) is the differential form of the derivative of y with respect to x
differentiation | the process of taking a derivative
direction angles | the angles formed by a nonzero vector and the coordinate axes
direction cosines | the cosines of the angles formed by a nonzero vector and the coordinate axes
direction field (slope field) | a mathematical object used to graphically represent solutions to a first-order differential equation; at each point in a direction field, a line segment appears whose slope is equal to the slope of a solution to the differential equation passing through that point
direction vector | a vector parallel to a line that is used to describe the direction, or orientation, of the line in space
directional derivative | the derivative of a function in the direction of a given unit vector
directrix | a directrix (plural: directrices) is a line used to construct and define a conic section; a parabola has one directrix; ellipses and hyperbolas have two
discontinuity at a point | a function is discontinuous at a point or has a discontinuity at a point if it is not continuous at the point
discriminant | the value \(4AC-B^2\), which is used to identify a conic when the equation contains a term involving xy, is called a discriminant
discriminant | the discriminant of the function \(f(x,y)\) is given by the formula \(D=f_{xx}(x_0,y_0)f_{yy}(x_0,y_0)-(f_{xy}(x_0,y_0))^2\)
disk method | a special case of the slicing method used with solids of revolution when the slices are disks
divergence | the divergence of a vector field \(\vec{F}=\langle P,Q,R\rangle\), denoted \(\vec{\nabla}\cdot\vec{F}\), is \(P_x+Q_y+R_z\); it measures the “outflowing-ness” of a vector field
divergence of a series | a series diverges if the sequence of partial sums for that series diverges
divergence test | if \(\displaystyle \lim_{n\to\infty}a_n\ne 0\), then the series \(\displaystyle \sum^{\infty}_{n=1}a_n\) diverges
divergent sequence | a sequence that is not convergent is divergent
domain | the set of inputs for a function

dot product or scalar product | \(\vec{u}\cdot\vec{v}=u_1v_1+u_2v_2+u_3v_3\), where \(\vec{u}=\langle u_1,u_2,u_3\rangle\) and \(\vec{v}=\langle v_1,v_2,v_3\rangle\)
double integral | of the function \(f(x,y)\) over the region R in the xy-plane is defined as the limit of a double Riemann sum, \(\displaystyle\iint_R f(x,y)\,dA=\lim_{m,n\to\infty}\sum_{i=1}^m\sum_{j=1}^n f(x_{ij}^*,y_{ij}^*)\,\Delta A\)
double Riemann sum | of the function \(f(x,y)\) over a rectangular region R is \(\displaystyle\sum_{i=1}^m\sum_{j=1}^n f(x_{ij}^*,y_{ij}^*)\,\Delta A\), where R is divided into smaller subrectangles \(R_{ij}\) and \((x_{ij}^*,y_{ij}^*)\) is an arbitrary point in \(R_{ij}\)
doubling time | if a quantity grows exponentially, the doubling time is the amount of time it takes the quantity to double, and is given by \((\ln 2)/k\)
eccentricity | the eccentricity is defined as the distance from any point on the conic section to its focus divided by the perpendicular distance from that point to the nearest directrix
ellipsoid | a three-dimensional surface described by an equation of the form \(\dfrac{x^2}{a^2}+\dfrac{y^2}{b^2}+\dfrac{z^2}{c^2}=1\); all traces of this surface are ellipses
elliptic cone | a three-dimensional surface described by an equation of the form \(\dfrac{x^2}{a^2}+\dfrac{y^2}{b^2}-\dfrac{z^2}{c^2}=0\); traces of this surface include ellipses and intersecting lines
elliptic paraboloid | a three-dimensional surface described by an equation of the form \(z=\dfrac{x^2}{a^2}+\dfrac{y^2}{b^2}\); traces of this surface include ellipses and parabolas
end behavior | the behavior of a function as \(x\to\infty\) and \(x\to-\infty\)
epsilon-delta definition of the limit | \(\displaystyle \lim_{x\to a}f(x)=L\) if for every \(\varepsilon>0\), there exists a \(\delta>0\) such that if \(0<|x-a|<\delta\), then \(|f(x)-L|<\varepsilon\)
equilibrium solution | any solution to the differential equation of the form \(y=c\), where c is a constant
equivalent vectors | vectors that have the same magnitude and the same direction
Euler’s Method | a numerical technique used to approximate solutions to an initial-value problem
even function | a function is even if \(f(-x)=f(x)\) for all x in the domain of f
explicit formula | a sequence may be defined by an explicit formula such that \(a_n=f(n)\)
exponent | the value x in the expression \(b^x\)
exponential decay | systems that exhibit exponential decay follow a model of the form \(y=y_0e^{-kt}\)
exponential growth | systems that exhibit exponential growth follow a model of the form \(y=y_0e^{kt}\)
extreme value theorem | if f is a continuous function over a finite, closed interval, then f has an absolute maximum and an absolute minimum
Fermat’s theorem | if f has a local extremum at c, then c is a critical point of f
first derivative test | let f be a continuous function over an interval I containing a critical point c such that f is differentiable over I except possibly at c; if \(f'\) changes sign from positive to negative as x increases through c, then f has a local maximum at c; if \(f'\) changes sign from negative to positive as x increases through c, then f has a local minimum at c; if \(f'\) does not change sign as x increases through c, then f does not have a local extremum at c
flux | the rate of a fluid flowing across a curve in a vector field; the flux of vector field \(\vec{F}\) across plane curve C is the line integral \(\displaystyle\int_C \vec{F}\cdot\frac{\vec{n}(t)}{\|\vec{n}(t)\|}\,ds\)
flux integral | another name for a surface integral of a vector field; the preferred term in physics and engineering
focal parameter | the focal parameter is the distance from a focus of a conic section to the nearest directrix
focus | a focus (plural: foci) is a point used to construct and define a conic section; a parabola has one focus; an ellipse and a hyperbola have two
formal definition of an infinite limit | \(\displaystyle \lim_{x\to a}f(x)=\infty\) if for every \(M>0\), there exists a \(\delta>0\) such that if \(0<|x-a|<\delta\), then \(f(x)>M\); \(\displaystyle \lim_{x\to a}f(x)=-\infty\) if for every \(M>0\), there exists a \(\delta>0\) such that if \(0<|x-a|<\delta\), then \(f(x)<-M\)
Frenet frame of reference | (TNB frame) a frame of reference in three-dimensional space formed by the unit tangent vector, the unit normal vector, and the binormal vector
frustum | a portion of a cone; a frustum is constructed by cutting the cone with a plane parallel to the base
Fubini’s theorem | if \(f(x,y)\) is a function of two variables that is continuous over a rectangular region \(R=\big\{(x,y)\in\mathbb{R}^2 \,|\, a\le x\le b,\ c\le y\le d\big\}\), then the double integral of f over the region equals an iterated integral, \(\displaystyle\iint_R f(x,y)\,dA=\int_a^b\int_c^d f(x,y)\,dy\,dx=\int_c^d\int_a^b f(x,y)\,dx\,dy\)
function | a set of inputs, a set of outputs, and a rule for mapping each input to exactly one output
function of two variables | a function \(z=f(x,y)\) that maps each ordered pair \((x,y)\) in a subset D of \(\mathbb{R}^2\) to a unique real number z
Fundamental Theorem for Line Integrals | the value of line integral \(\displaystyle \int_C\vec{\nabla} f\cdot d\vec{r}\) depends only on the value of f at the endpoints of C: \(\displaystyle \int_C\vec{\nabla} f\cdot d\vec{r}=f(\vec{r}(b))-f(\vec{r}(a))\)
fundamental theorem of calculus | (also, evaluation theorem) we can evaluate a definite integral by evaluating the antiderivative of the integrand at the endpoints of the interval and subtracting
fundamental theorem of calculus | uses a definite integral to define an antiderivative of a function
fundamental theorem of calculus | the theorem, central to the entire development of calculus, that establishes the relationship between differentiation and integration
general form | an equation of a conic section written as a general second-degree equation
general form of the equation of a plane | an equation in the form \(ax+by+cz+d=0\), where \(\vec{n}=\langle a,b,c\rangle\) is a normal vector of the plane, \(P=(x_0,y_0,z_0)\) is a point on the plane, and \(d=-ax_0-by_0-cz_0\)
general solution (or family of solutions) | the entire set of solutions to a given differential equation
generalized chain rule | the chain rule extended to functions of more than one independent variable, in which each independent variable may depend on one or more other variables
geometric sequence | a sequence \(\{a_n\}\) in which the ratio \(a_{n+1}/a_n\) is the same for all positive integers n is called a geometric sequence
geometric series | a geometric series is a series that can be written in the form \(\displaystyle \sum_{n=1}^{\infty}ar^{n-1}=a+ar+ar^2+ar^3+\cdots\)
gradient | the gradient of the function \(f(x,y)\) is defined to be \(\vec{\nabla} f(x,y)=(\partial f/\partial x)\,\hat{\mathbf i}+(\partial f/\partial y)\,\hat{\mathbf j}\), which can be generalized to a function of any number of independent variables
gradient field | a vector field \(\vec{F}\) for which there exists a scalar function f such that \(\vec{\nabla} f=\vec{F}\); in other words, a vector field that is the gradient of a function; such vector fields are also called conservative
graph of a function | the set of points \((x,y)\) such that x is in the domain of f and \(y=f(x)\)
graph of a function of two variables | a set of ordered triples \((x,y,z)\) that satisfy the equation \(z=f(x,y)\), plotted in three-dimensional Cartesian space
Green’s theorem | relates the integral over a connected region to an integral over the boundary of the region
grid curves | curves on a surface that are parallel to grid lines in a coordinate plane
growth rate | the constant \(r>0\) in the exponential growth function \(P(t)=P_0e^{rt}\)
half-life | if a quantity decays exponentially, the half-life is the amount of time it takes the quantity to be reduced by half. It is given by \((\ln 2)/k\)
harmonic series | the harmonic series takes the form \(\displaystyle \sum_{n=1}^{\infty}\frac{1}{n}=1+\frac{1}{2}+\frac{1}{3}+\cdots\)
heat flow | a vector field proportional to the negative temperature gradient in an object
helix | a three-dimensional curve in the shape of a spiral
higher-order derivative | a derivative of a derivative, from the second derivative to the \(n^{\text{th}}\) derivative, is called a higher-order derivative
higher-order partial derivatives | second-order or higher partial derivatives, regardless of whether they are mixed partial derivatives
homogeneous linear equation | a second-order differential equation that can be written in the form \(a_2(x)y''+a_1(x)y'+a_0(x)y=r(x)\), but \(r(x)=0\) for every value of x
Hooke’s law | this law states that the force required to compress (or elongate) a spring is proportional to the distance the spring has been compressed (or stretched) from equilibrium; in other words, \(F=kx\), where k is a constant
horizontal asymptote | if \(\displaystyle \lim_{x\to\infty}f(x)=L\) or \(\displaystyle \lim_{x\to-\infty}f(x)=L\), then \(y=L\) is a horizontal asymptote of f
horizontal line test | a function f is one-to-one if and only if every horizontal line intersects the graph of f, at most, once
hydrostatic pressure | the pressure exerted by water on a submerged object
hyperbolic functions | the functions denoted \(\sinh,\,\cosh,\,\tanh,\,\operatorname{csch},\,\operatorname{sech},\) and \(\coth\), which involve certain combinations of \(e^x\) and \(e^{-x}\)
hyperboloid of one sheet | a three-dimensional surface described by an equation of the form \(\dfrac{x^2}{a^2}+\dfrac{y^2}{b^2}-\dfrac{z^2}{c^2}=1\); traces of this surface include ellipses and hyperbolas

hyperboloid of two sheets | a three-dimensional initial-value problem | a differential equation iterated integral | for a function f(x,y) over the
surface described by an equation of the form together with an initial value or values region R is a. \displaystyle \int_a^b \int_c^d f(x,y) \,dx
\dfrac{z^2}{c^2}−\dfrac{x^2}{a^2}−\dfrac{y^2} \, dy = \int_a^b \left[\int_c^d f(x,y) \, dy\right] \, dx, b.
{b^2}=1; traces of this surface include ellipses and
instantaneous rate of change | the rate of \displaystyle \int_c^d \int_a^b f(x,y) \, dx \, dy =
change of a function at any point along the function a,
hyperbolas \int_c^d \left[\int_a^b f(x,y) \, dx\right] \, dy, where
also called f′(a), or the derivative of the function at a
a,b,c, and d are any real numbers and R = [a,b] \times
implicit differentiation | is a technique for [c,d]
computing \dfrac{dy}{dx} for a function defined by instantaneous velocity | The instantaneous
an equation, accomplished by differentiating both sides velocity of an object with a position function that is iterative process | process in which a list of
of the equation (remembering to treat the variable y as given by s(t) is the value that the average velocities on numbers x_0,x_1,x_2,x_3… is generated by starting
a function) and solving for \dfrac{dy}{dx} intervals of the form [t,a] and [a,t] approach as the with a number x_0 and defining x_n=F(x_{n−1}) for
values of t move closer to a, provided such a value n≥1
exists
improper double integral | a double integral over an unbounded region or of an unbounded function
improper integral | an integral over an infinite interval or an integral of a function containing an infinite discontinuity on the interval; an improper integral is defined in terms of a limit. The improper integral converges if this limit is a finite real number; otherwise, the improper integral diverges
increasing on the interval I | a function is increasing on the interval I if for all x_1,\,x_2∈I,\;f(x_1)≤f(x_2) if x_1<x_2
integrable function | a function is integrable if the limit defining the integral exists; in other words, if the limit of the Riemann sums as n goes to infinity exists
integral calculus | the study of integrals and their applications
integral test | for a series \displaystyle \sum^∞_{n=1}a_n with positive terms a_n, if there exists a continuous, decreasing function f such that f(n)=a_n for all positive integers n, then \sum_{n=1}^∞a_n and ∫^∞_1f(x)\,dx either both converge or both diverge
Jacobian | the Jacobian J(u,v) in two variables is a 2 \times 2 determinant: J(u,v) = \begin{vmatrix} \frac{\partial x}{\partial u} & \frac{\partial y}{\partial u} \\ \frac{\partial x}{\partial v} & \frac{\partial y}{\partial v} \end{vmatrix}; the Jacobian J(u,v,w) in three variables is a 3 \times 3 determinant: J(u,v,w) = \begin{vmatrix} \frac{\partial x}{\partial u} & \frac{\partial y}{\partial u} & \frac{\partial z}{\partial u} \\ \frac{\partial x}{\partial v} & \frac{\partial y}{\partial v} & \frac{\partial z}{\partial v} \\ \frac{\partial x}{\partial w} & \frac{\partial y}{\partial w} & \frac{\partial z}{\partial w} \end{vmatrix}
indefinite integral | the most general
antiderivative of f(x) is the indefinite integral of f; we
use the notation \displaystyle \int f(x)\,dx to denote the
integrand | the function to the right of the jump discontinuity | A jump discontinuity occurs
integration symbol; the integrand includes the function at a point a if \displaystyle \lim_{x→a^−}f(x) and
indefinite integral of f
being integrated \displaystyle \lim_{x→a^+}f(x) both exist, but
indefinite integral of a vector-valued \displaystyle \lim_{x→a^−}f(x)≠\lim_{x→a^+}f(x)
function | a vector-valued function with a derivative integrating factor | any function f(x) that is
that is equal to a given vector-valued function multiplied on both sides of a differential equation to Kepler’s laws of planetary motion | three laws
make the side involving the unknown function equal to governing the motion of planets, asteroids, and comets
independence of path | a vector field \vecs{F} the derivative of a product of two functions in orbit around the Sun
has path independence if \displaystyle \int_{C_1}
\vecs F⋅d\vecs r=\displaystyle \int_{C_2} \vecs integration by parts | a technique of integration Lagrange multiplier | the constant (or constants)
F⋅d\vecs r for any curves C_1 and C_2 in the domain that allows the exchange of one integral for another used in the method of Lagrange multipliers; in the case
of \vecs{F} with the same initial points and terminal using the formula \displaystyle ∫u\,dv=uv−∫v\,du of one constant, it is represented by the variable λ
points integration by substitution | a technique for lamina | a thin sheet of material; laminas are thin
independent variable | the input variable for a integration that allows integration of functions that are enough that, for mathematical purposes, they can be
function the result of a chain-rule derivative treated as if they are two-dimensional
indeterminate forms | When evaluating a limit, integration table | a table that lists integration left-endpoint approximation | an approximation
the forms \dfrac{0}{0},∞/∞, 0⋅∞, ∞−∞, 0^0, ∞^0, and formulas of the area under a curve computed by using the left
1^∞ are considered indeterminate because further interior point | a point P_0 of \mathbb{R} is an
analysis is required to determine whether the limit interior point if there is a δ disk centered around P_0
exists and, if so, what its value is. contained completely in \mathbb{R} level curve of a function of two variables |
index variable | the subscript used to define the the set of points satisfying the equation f(x,y)=c for
Intermediate Value Theorem | Let f be
terms in a sequence is called the index some real number c in the range of f
continuous over a closed bounded interval [a,b]; if z is
infinite discontinuity | An infinite discontinuity any real number between f(a) and f(b), then there is a level surface of a function of three variables
occurs at a point a if \displaystyle number c in [a,b] satisfying f(c)=z | the set of points satisfying the equation f(x,y,z)=c for
\lim_{x→a^−}f(x)=±∞ or \displaystyle some real number c in the range of f
intermediate variable | given a composition of
\lim_{x→a^+}f(x)=±∞ functions (e.g., \displaystyle f(x(t),y(t))), the limaçon | the graph of the equation r=a+b\sin θ or
infinite limit | A function has an infinite limit at a intermediate variables are the variables that are r=a+b\cos θ. If a=b then the graph is a cardioid
point a if it either increases or decreases without bound independent in the outer function but dependent on
other variables as well; in the function \displaystyle limit | the process of letting x or t approach a in an
as it approaches a expression; the limit of a function f(x) as x approaches
f(x(t),y(t)), the variables \displaystyle x and
infinite limit at infinity | a function that becomes \displaystyle y are examples of intermediate variables a is the value that f(x) approaches as x approaches a
arbitrarily large as x becomes large limit at infinity | a function that approaches a limit
interval of convergence | the set of real numbers
infinite series | an infinite series is an expression of x for which a power series converges value L as x becomes large
the form \displaystyle a_1+a_2+a_3+⋯ limit comparison test | Suppose a_n,b_n≥0 for all
=\sum_{n=1}^∞a_n intuitive definition of the limit | If all values of
the function f(x) approach the real number L as the n≥1. If \displaystyle \lim_{n→∞}a_n/b_n=L≠0, then
inflection point | if f is continuous at c and f values of x(≠a) approach a, f(x) approaches L \displaystyle \sum^∞_{n=1}a_n and \displaystyle
changes concavity at c, the point (c,f(c)) is an \sum^∞_{n=1}b_n both converge or both diverge; if
inflection point of f inverse function | for a function f, the inverse \displaystyle \lim_{n→∞}a_n/b_n=0 and
function f^{−1} satisfies f^{−1}(y)=x if f(x)=y \displaystyle \sum^∞_{n=1}b_n converges, then
initial point | the starting point of a vector \displaystyle \sum^∞_{n=1}a_n converges. If
inverse hyperbolic functions | the inverses of \displaystyle \lim_{n→∞}a_n/b_n=∞, and
initial population | the population at time t=0 the hyperbolic functions where \cosh and
\displaystyle \sum^∞_{n=1}b_n diverges, then
\operatorname{sech} are restricted to the domain
initial value problem | a problem that requires \displaystyle \sum^∞_{n=1}a_n diverges.
[0,∞); each of these functions can be expressed in terms
finding a function y that satisfies the differential
equation \dfrac{dy}{dx}=f(x) together with the initial
of a composition of the natural logarithm function and limit laws | the individual properties of limits; for
an algebraic function each of the individual laws, let f(x) and g(x) be defined
condition y(x_0)=y_0
for all x≠a over some open interval containing a;
initial value(s) | a value or set of values that a inverse trigonometric functions | the inverses assume that L and M are real numbers so that
of the trigonometric functions are defined on restricted
solution of a differential equation satisfies for a fixed \lim_{x→a}f(x)=L and \lim_{x→a}g(x)=M; let c be a
domains where they are one-to-one functions
value of the independent variable constant
initial velocity | the velocity at time t=0 limit of a sequence | the real number L to which
a sequence converges is called the limit of the
sequence

4 https://math.libretexts.org/@go/page/51389
limit of a vector-valued function | a vector- major axis | the major axis of a conic section passes natural logarithm | the function \ln x=\log_ex
valued function \vecs r(t) has a limit \vecs L as t through the vertex in the case of a parabola or through
approaches a if \lim\limits_{t \to a} \left| \vecs r(t) - the two vertices in the case of an ellipse or hyperbola;
net change theorem | if we know the rate of
change of a quantity, the net change theorem says the
\vecs L \right| = 0 it is also an axis of symmetry of the conic; also called
future quantity is equal to the initial quantity plus the
the transverse axis
limits of integration | these values appear near the integral of the rate of change of the quantity
top and bottom of the integral sign and define the marginal cost | is the derivative of the cost
interval over which the function should be integrated function, or the approximate cost of producing one net signed area | the area between a function and
more item the x-axis such that the area below the x-axis is
line integral | the integral of a function along a subtracted from the area above the x-axis; the result is
curve in a plane or in space marginal profit | is the derivative of the profit the same as the definite integral of the function
function, or the approximate profit obtained by
linear | description of a first-order differential producing and selling one more item
Newton’s method | method for approximating
equation that can be written in the form a(x)y′ roots of f(x)=0 using an initial guess x_0; each
+b(x)y=c(x) marginal revenue | is the derivative of the revenue subsequent approximation is defined by the equation
function, or the approximate revenue obtained by x_n=x_{n−1}−\frac{f(x_{n−1})}{f'(x_{n−1})}
linear approximation | the linear function selling one more item
L(x)=f(a)+f'(a)(x−a) is the linear approximation of f at nonelementary integral | an integral for which
x=a mass flux | the rate of mass flow of a fluid per unit the antiderivative of the integrand cannot be expressed
area, measured in mass per unit time per unit area as an elementary function
linear approximation | given a function f(x,y)
and a tangent plane to the function at a point mathematical model | A method of simulating nonhomogeneous linear equation | a second-
(x_0,y_0), we can approximate f(x,y) for points near real-life situations with mathematical equations order differential equation that can be written in the
(x_0,y_0) using the tangent plane formula form a_2(x)y″+a_1(x)y′+a_0(x)y=r(x), but r(x)≠0 for
mean value theorem | if f is continuous over [a,b] some value of x
linear function | a function that can be written in and differentiable over (a,b), then there exists c∈(a,b)
the form f(x)=mx+b such that f′(c)=\frac{f(b)−f(a)}{b−a} normal component of acceleration | the
coefficient of the unit normal vector \vecs N when the
linearly dependent | a set of functions mean value theorem for integrals | guarantees acceleration vector is written as a linear combination
f_1(x),f_2(x),…,f_n(x) for which there are constants that a point c exists such that f(c) is equal to the
c_1,c_2,…,c_n, not all zero, such that average value of the function
c_1f_1(x)+c_2f_2(x)+⋯+c_nf_n(x)=0 for all x in normal plane | a plane that is perpendicular to a
the interval of interest
method of cylindrical shells | a method of curve at any point on the curve
calculating the volume of a solid of revolution by
linearly independent | a set of functions dividing the solid into nested cylindrical shells; this normal vector | a vector perpendicular to a plane
f_1(x),f_2(x),…,f_n(x) for which there are no method is different from the methods of disks or
constants c_1,c_2,…,c_n, not all zero, such that washers in that we integrate with respect to the
c_1f_1(x)+c_2f_2(x)+⋯+c_nf_n(x)=0 for all x in opposite variable unit vector with a given direction
the interval of interest number e | as m gets larger, the quantity (1+
method of Lagrange multipliers | a method of
(1/m))^m gets closer to some real number; we define
local extremum | if f has a local maximum or local solving an optimization problem subject to one or
that real number to be e; the value of e is
minimum at c, we say f has a local extremum at c more constraints
approximately 2.718282
local maximum | if there exists an interval I such method of undetermined coefficients | a
that f(c)≥f(x) for all x∈I, we say f has a local method that involves making a guess about the form of
numerical integration | the variety of numerical
methods used to estimate the value of a definite
maximum at c the particular solution, then solving for the coefficients
integral, including the midpoint rule, trapezoidal rule,
in the guess
local minimum | if there exists an interval I such and Simpson’s rule
that f(c)≤f(x) for all x∈I, we say f has a local method of variation of parameters | a method
minimum at c that involves looking for particular solutions in the objective function | the function that is to be
form y_p(x)=u(x)y_1(x)+v(x)y_2(x), where y_1 and maximized or minimized in an optimization problem
logarithmic differentiation | is a technique that y_2 are linearly independent solutions to the
allows us to differentiate a function by first taking the
oblique asymptote | the line y=mx+b if f(x)
complementary equation, and then solving a system approaches it as x→∞ or x→−∞
natural logarithm of both sides of an equation, of equations to find u(x) and v(x)
applying properties of logarithms to simplify the octants | the eight regions of space created by the
equation, and differentiating implicitly midpoint rule | a rule that uses a Riemann sum of coordinate planes
the form \displaystyle M_n=\sum^n_{i=1}f(m_i)Δx,
logarithmic function | a function of the form where m_i is the midpoint of the i^{\text{th}} odd function | a function is odd if f(−x)=−f(x) for
f(x)=\log_b(x) for some base b>0,\,b≠1 such that all x in the domain of f
subinterval to approximate \displaystyle ∫^b_af(x)\,dx
y=\log_b(x) if and only if b^y=x
minor axis | the minor axis is perpendicular to the one-sided limit | A one-sided limit of a function is
logistic differential equation | a differential major axis and intersects the major axis at the center of a limit taken from either the left or the right
equation that incorporates the carrying capacity K and
the conic, or at the vertex in the case of the parabola; one-to-one function | a function f is one-to-one if
growth rate r into a population model
also called the conjugate axis f(x_1)≠f(x_2) if x_1≠x_2
lower sum | a sum obtained by using the minimum mixed partial derivatives | second-order or
value of f(x) on each subinterval one-to-one transformation | a transformation T :
higher partial derivatives, in which at least two of the G \rightarrow R defined as T(u,v) = (x,y) is said to be
L’Hôpital’s rule | If f and g are differentiable differentiations are with respect to different variables one-to-one if no two points map to the same image
functions over an open interval containing a, except possibly at a, and point
moment | if n masses are arranged on a number line,
\displaystyle \lim_{x→a}f(x)=0=\lim_{x→a}g(x) or
the moment of the system with respect to the origin is open set | a set S that contains none of its boundary
\displaystyle \lim_{x→a}f(x) and \displaystyle
given by \displaystyle M=\sum^n_{i=1}m_ix_i; if, points
\lim_{x→a}g(x) are infinite, then \displaystyle
instead, we consider a region in the plane, bounded
\lim_{x→a}\dfrac{f(x)}{g(x)}=\lim_{x→a}\dfrac{f′ optimization problem | calculation of a
above by a function f(x) over an interval [a,b], then the
(x)}{g′(x)}, assuming the limit on the right exists or is maximum or minimum value of a function of several
moments of the region with respect to the x- and y-
∞ or −∞. variables, often using Lagrange multipliers
axes are given by \displaystyle
Maclaurin polynomial | a Taylor polynomial M_x=ρ∫^b_a\dfrac{[f(x)]^2}{2}\,dx and \displaystyle
optimization problems | problems that are solved
centered at 0; the n^{\text{th}}-degree Taylor M_y=ρ∫^b_axf(x)\,dx, respectively
by finding the maximum or minimum value of a
polynomial for f at 0 is the n^{\text{th}}-degree monotone sequence | an increasing or decreasing function
Maclaurin polynomial for f sequence
order of a differential equation | the highest
Maclaurin series | a Taylor series for a function f multivariable calculus | the study of the calculus order of any derivative of the unknown function that
at x=0 is known as a Maclaurin series for f appears in the equation
of functions of two or more variables
magnitude | the length of a vector nappe | a nappe is one half of a double cone orientation | the direction that a point moves on a
graph as the parameter increases
natural exponential function | the function
f(x)=e^x
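As an illustration of the Newton’s method entry on this page, the update x_n=x_{n−1}−\frac{f(x_{n−1})}{f'(x_{n−1})} can be sketched in Python. The function name newton, the tolerance tol, the iteration cap max_iter, and the sample equation x^2−2=0 are assumptions made for this sketch; they do not come from the glossary.

```python
def newton(f, fprime, x0, tol=1e-12, max_iter=50):
    """Approximate a root of f(x) = 0 starting from the initial guess x0.

    tol and max_iter are illustrative stopping controls (assumptions).
    """
    x = x0
    for _ in range(max_iter):
        slope = fprime(x)
        if slope == 0:
            # The update divides by f'(x), so a zero slope stops the method.
            raise ZeroDivisionError("f'(x) is zero; the update is undefined")
        x_next = x - f(x) / slope  # x_n = x_{n-1} - f(x_{n-1}) / f'(x_{n-1})
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    return x  # best estimate after max_iter steps


# Sample use: approximate the positive root of x^2 - 2 = 0 from x_0 = 1.
root = newton(lambda x: x**2 - 2, lambda x: 2 * x, x0=1.0)
print(root)
```

The loop stops when two successive approximations agree to within tol. Newton’s method is not guaranteed to converge for every initial guess, so the iteration cap keeps the sketch from looping indefinitely.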

orientation of a curve | the orientation of a curve periodic function | a function is periodic if it has a product rule | the derivative of a product of two
C is a specified direction of C repeating pattern as the values of x move from left to functions is the derivative of the first function times
right the second function plus the derivative of the second
orientation of a surface | if a surface has an function times the first function: \dfrac{d}
“inner” side and an “outer” side, then an orientation is phase line | a visual representation of the behavior {dx}\big(f(x)g(x)\big)=f′(x)g(x)+g′(x)f(x)
a choice of the inner or the outer side; the surface of solutions to an autonomous differential equation
could also have “upward” and “downward” subject to various initial conditions projectile motion | motion of an object with an
orientations initial velocity but no force acting on it other than
piecewise smooth curve | an oriented curve that gravity
orthogonal vectors | vectors that form a right is not smooth, but can be written as the union of
angle when placed in standard position finitely many smooth curves propagated error | the error that results in a
calculated quantity f(x) resulting from a measurement
osculating circle | a circle that is tangent to a curve piecewise-defined function | a function that is error dx
C at a point P and that shares the same curvature defined differently on different parts of its domain
quadratic function | a polynomial of degree 2;
osculating plane | the plane determined by the unit planar transformation | a function T that that is, a function of the form f(x)=ax^2+bx+c where
tangent and the unit normal vector transforms a region G in one plane into a region R in a≠0
another plane by a change of variables
p-series | a series of the form \displaystyle quadric surfaces | surfaces in three dimensions
\sum^∞_{n=1}1/n^p plane curve | the set of ordered pairs (f(t),g(t)) having the property that the traces of the surface are
together with their defining parametric equations
parallelogram method | a method for finding the x=f(t) and y=g(t)
conic sections (ellipses, hyperbolas, and parabolas)
sum of two vectors; position the vectors so they share
quotient law for limits | the limit law
the same initial point; the vectors then form two point-slope equation | equation of a linear \lim_{x→a}\dfrac{f(x)}
adjacent sides of a parallelogram; the sum of the function indicating its slope and a point on the graph
{g(x)}=\dfrac{\lim_{x→a}f(x)}
vectors is the diagonal of that parallelogram of the function
{\lim_{x→a}g(x)}=\dfrac{L}{M} for M≠0
parameter | an independent variable that both x and polar axis | the horizontal axis in the polar quotient rule | the derivative of the quotient of two
y depend on in a parametric curve; usually represented coordinate system corresponding to r≥0 functions is the derivative of the first function times
by the variable t the second function minus the derivative of the second
polar coordinate system | a system for locating
parameter domain (parameter space) | the points in the plane. The coordinates are r, the radial function times the first function, all divided by the
region of the uv-plane over which the parameters u and coordinate, and θ, the angular coordinate square of the second function: \dfrac{d}
v vary for parameterization \vecs r(u,v) = \langle {dx}\left(\dfrac{f(x)}{g(x)}\right)=\dfrac{f′(x)g(x)−g′
x(u,v), \, y(u,v), \, z(u,v)\rangle polar equation | an equation or function relating (x)f(x)}{\big(g(x)\big)^2}
the radial coordinate to the angular coordinate in the
parameterization of a curve | rewriting the polar coordinate system radial coordinate | r the coordinate in the polar
equation of a curve defined by a function y=f(x) as coordinate system that measures the distance from a
parametric equations
polar rectangle | the region enclosed between the point in the plane to the pole
circles r = a and r = b and the angles \theta = \alpha
parameterized surface (parametric surface) and \theta = \beta; it is described as R = \{(r, radial field | a vector field in which all vectors
| a surface given by a description of the form \vecs \theta)\,|\,a \leq r \leq b, \, \alpha \leq \theta \leq \beta\} either point directly toward or directly away from the
r(u,v) = \langle x(u,v), \, y(u,v), \, z(u,v)\rangle, where origin; the magnitude of any vector depends only on its
the parameters u and v vary over a parameter domain pole | the central point of the polar coordinate system, distance from the origin
in the uv-plane equivalent to the origin of a Cartesian system
radians | for a circular arc of length s on a circle of
parametric curve | the graph of the parametric polynomial function | a function of the form radius 1, the radian measure of the associated angle θ
equations x(t) and y(t) over an interval a≤t≤b f(x)=a_nx^n+a_{n−1}x^{n−1}+…+a_1x+a_0 is s
combined with the equations population growth rate | is the derivative of the radius of convergence | if there exists a real
parametric equations | the equations x=x(t) and population with respect to time number R>0 such that a power series centered at x=a
y=y(t) that define a parametric curve potential function | a scalar function f such that converges for |x−a|<R and diverges for |x−a|>R, then R
\vecs ∇f=\vecs{F} is the radius of convergence; if the power series only
parametric equations of a line | the set of converges at x=a, the radius of convergence is R=0; if
equations x=x_0+ta, y=y_0+tb, and z=z_0+tc power function | a function of the form f(x)=x^n the power series converges for all real numbers x, the
describing the line with direction vector v=⟨a,b,c⟩ for any positive integer n≥1 radius of convergence is R=∞
passing through point (x_0,y_0,z_0)
power law for limits | the limit law \lim_{x→a} radius of curvature | the reciprocal of the
partial derivative | a derivative of a function of (f(x))^n=(\lim_{x→a}f(x))^n=L^n for
more than one independent variable in which all the every positive integer n
variables but one are held constant radius of gyration | the distance from an object’s
power reduction formula | a rule that allows an center of mass to its axis of rotation
partial differential equation | an equation that integral of a power of a trigonometric function to be
involves an unknown function of more than one exchanged for an integral involving a lower power range | the set of outputs for a function
independent variable and one or more of its partial
power rule | the derivative of a power function is a ratio test | for a series \displaystyle
derivatives
function in which the power on x becomes the \sum^∞_{n=1}a_n with nonzero terms, let
partial fraction decomposition | a technique coefficient of the term and the power on x in the \displaystyle ρ=\lim_{n→∞}|a_{n+1}/a_n|; if 0≤ρ<1,
used to break down a rational function into the sum of derivative decreases by 1: If n is an integer, then the series converges absolutely; if ρ>1, the series
simple rational functions \dfrac{d}{dx}\left(x^n\right)=nx^{n−1} diverges; if ρ=1, the test is inconclusive
partial sum | the kth partial sum of the infinite power series | a series of the form rational function | a function of the form
series \displaystyle \sum^∞_{n=1}a_n is the finite sum \sum_{n=0}^∞c_nx^n is a power series centered at f(x)=p(x)/q(x), where p(x) and q(x) are polynomials
\displaystyle x=0; a series of the form \sum_{n=0}^∞c_n(x−a)^n is recurrence relation | a recurrence relation is a
S_k=\sum_{n=1}^ka_n=a_1+a_2+a_3+⋯+a_k a power series centered at x=a relationship in which a term a_n in a sequence is
particular solution | member of a family of principal unit normal vector | a vector defined in terms of earlier terms in the sequence
solutions to a differential equation that satisfies a orthogonal to the unit tangent vector, given by the
particular initial condition
region | an open, connected, nonempty subset of
formula \frac{\vecs T′(t)}{‖\vecs T′(t)‖} \mathbb{R}^2
particular solution | a solution y_p(x) of a principal unit tangent vector | a unit vector
differential equation that contains no arbitrary
regular parameterization | parameterization
tangent to a curve C \vecs r(u,v) = \langle x(u,v), \, y(u,v), \, z(u,v)\rangle
constants
product law for limits | the limit law \lim_{x→a} such that r_u \times r_v is not zero for any point (u,v) in
partition | a set of points that divides an interval into (f(x)⋅g(x))=\lim_{x→a}f(x)⋅\lim_{x→a}g(x)=L⋅M the parameter domain
subintervals regular partition | a partition in which the
percentage error | the relative error expressed as a subintervals all have the same width
percentage

related rates | are rates of change associated with scalar equation of a plane | the equation solution curve | a curve graphed in a direction field
two or more related quantities that are changing over a(x−x_0)+b(y−y_0)+c(z−z_0)=0 used to describe a that corresponds to the solution to the initial-value
time plane containing point P=(x_0,y_0,z_0) with normal problem passing through a given point in the direction
vector n=⟨a,b,c⟩ or its alternate form ax+by+cz+d=0, field
relative error | given an absolute error Δq for a where d=−ax_0−by_0−cz_0
particular quantity, \frac{Δq}{q} is the relative error. solution to a differential equation | a function
scalar line integral | the scalar line integral of a y=f(x) that satisfies a given differential equation
relative error | error as a percentage of the actual function f along a curve C with respect to arc length is
value, given by \text{relative error}=\left|\frac{A−B} the integral \displaystyle \int_C f\,ds, it is the integral space curve | the set of ordered triples (f(t),g(t),h(t))
{A}\right|⋅100\% of a scalar function f along a curve in a plane or in
space; such an integral is defined in terms of a x=f(t), y=g(t) and z=h(t)
remainder estimate | for a series \displaystyle
\sum^∞_{n=1}a_n with positive terms a_n and a Riemann sum, as is a single-variable integral space-filling curve | a curve that completely
continuous, decreasing function f such that f(n)=a_n occupies a two-dimensional subset of the real plane
scalar multiplication | a vector operation that
for all positive integers n, the remainder \displaystyle
defines the product of a scalar and a vector speed | is the absolute value of velocity, that is, |v(t)|
R_N=\sum^∞_{n=1}a_n−\sum^N_{n=1}a_n satisfies
the following estimate: scalar projection | the magnitude of the vector is the speed of an object at time t whose velocity is
∫^∞_{N+1}f(x)\,dx<R_N<∫^∞_Nf(x)\,dx projection of a vector given by v(t)

removable discontinuity | A removable secant | A secant line to a function f(x) at a is a line sphere | the set of all points equidistant from a given
point known as the center
discontinuity occurs at a point a if f(x) is discontinuous through the point (a,f(a)) and another point on the
at a, but \displaystyle \lim_{x→a}f(x) exists function; the slope of the secant line is given by spherical coordinate system | a way to describe
m_{sec}=\dfrac{f(x)−f(a)}{x−a} a location in space with an ordered triple (ρ,θ,φ),
reparameterization | an alternative
where ρ is the distance between P and the origin (ρ≠0),
parameterization of a given vector-valued function second derivative test | suppose f'(c)=0 and f'' is
θ is the same angle used to describe the location in
continuous over an interval containing c; if f''(c)>0,
restricted domain | a subset of the domain of a then f has a local minimum at c; if f''(c)<0, then f has a
cylindrical coordinates, and φ is the angle formed by
function f the positive z-axis and line segment \overline{OP}, where
local maximum at c; if f''(c)=0, then the test is
O is the origin and 0≤φ≤π
Riemann sum | an estimate of the area under the inconclusive
curve of the form A≈\displaystyle separable differential equation | any equation squeeze theorem | states that if f(x)≤g(x)≤h(x) for
\sum_{i=1}^nf(x^∗_i)Δx that can be written in the form y'=f(x)g(y) all x≠a over an open interval containing a and
\lim_{x→a}f(x)=L=\lim_ {x→a}h(x) where L is a real
right-endpoint approximation | the right- separation of variables | a method used to solve a number, then \lim_{x→a}g(x)=L
endpoint approximation is an approximation of the
separable differential equation
area of the rectangles under a curve using the right standard equation of a sphere | (x−a)^2+
endpoint of each subinterval to construct the vertical sequence | an ordered list of numbers of the form (y−b)^2+(z−c)^2=r^2 describes a sphere with center
sides of each rectangle \displaystyle a_1,a_2,a_3,… is a sequence (a,b,c) and radius r
right-hand rule | a common way to define the sigma notation | (also, summation notation) the standard form | the form of a first-order linear
orientation of the three-dimensional coordinate system; Greek letter sigma (Σ) indicates addition of the values; differential equation obtained by writing the
when the right hand is curved around the z-axis in such the values of the index above and below the sigma differential equation in the form y'+p(x)y=q(x)
a way that the fingers curl from the positive x-axis to indicate where to begin the summation and where to
the positive y-axis, the thumb points in the direction of end it
standard form | an equation of a conic section
showing its properties, such as location of the vertex or
the positive z-axis
simple curve | a curve that does not cross itself lengths of major and minor axes
RLC series circuit | a complete electrical path
consisting of a resistor, an inductor, and a capacitor; a
simple harmonic motion | motion described by standard unit vectors | unit vectors along the
the equation x(t)=c_1 \cos (ωt)+c_2 \sin (ωt), as coordinate axes: \hat{\mathbf i}=⟨1,0⟩,\, \hat{\mathbf
second-order, constant-coefficient differential equation
exhibited by an undamped spring-mass system in j}=⟨0,1⟩
can be used to model the charge on the capacitor in an
which the mass continues to oscillate indefinitely
RLC series circuit standard-position vector | a vector with initial
Rolle’s theorem | if f is continuous over [a,b] and simply connected region | a region that is point (0,0)
differentiable over (a,b), and if f(a)=f(b), then there connected and has the property that any closed curve
that lies entirely inside the region encompasses points
steady-state solution | a solution to a
exists c∈(a,b) such that f′(c)=0 nonhomogeneous differential equation related to the
that are entirely inside the region
forcing function; in the long term, the solution
root function | a function of the form f(x)=x^{1/n}
for any integer n≥2
Simpson’s rule | a rule that approximates approaches the steady-state solution
\displaystyle ∫^b_af(x)\,dx using the area under a
root law for limits | the limit law piecewise quadratic function. The approximation S_n
step size | the increment h that is added to the x
value at each step in Euler’s Method
\lim_{x→a}\sqrt[n]{f(x)}=\sqrt[n] to \displaystyle ∫^b_af(x)\,dx is given by
{\lim_{x→a}f(x)}=\sqrt[n]{L} for all L if n is odd and S_n=\frac{Δx} Stokes’ theorem | relates the flux integral over a
for L≥0 if n is even {3}\big(f(x_0)+4\,f(x_1)+2\,f(x_2)+4\,f(x_3)+2\,f(x_4 surface S to a line integral around the boundary C of
)+⋯+2\,f(x_{n−2})+4\,f(x_{n−1})+f(x_n)\big). the surface S
root test | for a series
\displaystyle
\sum^∞_{n=1}a_n, let \displaystyle stream function | if \vecs F=⟨P,Q⟩ is a source-free
ρ=\lim_{n→∞}\sqrt[n]{|a_n|}; if 0≤ρ<1, the series skew lines | two lines that are not parallel but do not vector field, then stream function g is a function such
converges absolutely; if ρ>1, the series diverges; if intersect that P=g_y and Q=−g_x
ρ=1, the test is inconclusive
slicing method | a method of calculating the sum law for limits | The limit law \lim_{x→a}
rose | graph of the polar equation r=a\cos 2θ or volume of a solid that involves cutting the solid into (f(x)+g(x))=\lim_{x→a}f(x)+\lim_{x→a}g(x)=L+M
r=a\sin 2θfor a positive constant a pieces, estimating the volume of each piece, then
adding these estimates to arrive at an estimate of the sum rule | the derivative of the sum of a function f
rotational field | a vector field in which the vector total volume; as the number of slices goes to infinity, and a function g is the same as the sum of the
at point (x,y) is tangent to a circle with radius derivative of f and the derivative of g: \dfrac{d}
this estimate becomes an integral that gives the exact
r=\sqrt{x^2+y^2}; in a rotational field, all vectors flow {dx}\big(f(x)+g(x)\big)=f′(x)+g′(x)
value of the volume
either clockwise or counterclockwise, and the
magnitude of a vector depends only on its distance slope | the change in y for each unit change in x surface | the graph of a function of two variables,
from the origin z=f(x,y)
slope-intercept form | equation of a linear
rulings | parallel lines that make up a cylindrical function indicating its slope and y-intercept surface area | the surface area of a solid is the total
surface area of the outer layer of the object; for objects such as
smooth | curves where the vector-valued function cubes or bricks, the surface area of the object is the
saddle point | given the function z=f(x,y), the point \vecs r(t) is differentiable with a non-zero derivative sum of the areas of all of its faces
(x_0,y_0,f(x_0,y_0)) is a saddle point if both
f_x(x_0,y_0)=0 and f_y(x_0,y_0)=0, but f does not
solid of revolution | a solid generated by revolving surface area | the area of surface S given by the
a region in a plane around a line in that plane surface integral \iint_S \,dS \nonumber
have a local extremum at (x_0,y_0)
scalar | a real number
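
As a rough illustration of the Simpson’s rule entry above, the following Python sketch evaluates S_n for an even number of subintervals n. The function name simpson and the sample integrand sin x on [0, π] are choices made here for illustration; they do not come from the text.

```python
import math


def simpson(f, a, b, n):
    """Approximate the integral of f over [a, b] with Simpson's rule S_n.

    n is the number of subintervals and must be a positive even integer.
    """
    if n <= 0 or n % 2 != 0:
        raise ValueError("n must be a positive even integer")
    dx = (b - a) / n
    # Endpoints x_0 and x_n carry weight 1.
    total = f(a) + f(b)
    # Interior points alternate weights 4, 2, 4, ..., 4.
    for i in range(1, n):
        weight = 4 if i % 2 == 1 else 2
        total += weight * f(a + i * dx)
    return dx / 3 * total


# Illustrative check: the exact value of the integral of sin x over [0, pi] is 2.
approx = simpson(math.sin, 0.0, math.pi, 10)
print(approx)
```

Because the rule integrates a piecewise quadratic interpolant, it reproduces the integral of any cubic exactly (up to rounding), which is a convenient sanity check for an implementation.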

7 https://math.libretexts.org/@go/page/51389
surface independent | flux integrals of curl vector fields are surface independent if their evaluation does not depend on the surface but only on the boundary of the surface
surface integral | an integral of a function over a surface
surface integral of a scalar-valued function | a surface integral in which the integrand is a scalar function
surface integral of a vector field | a surface integral in which the integrand is a vector field
symmetric equations of a line | the equations \dfrac{x−x_0}{a}=\dfrac{y−y_0}{b}=\dfrac{z−z_0}{c} describing the line with direction vector v=⟨a,b,c⟩ passing through point (x_0,y_0,z_0)
symmetry about the origin | the graph of a function f is symmetric about the origin if (−x,−y) is on the graph of f whenever (x,y) is on the graph
symmetry about the y-axis | the graph of a function f is symmetric about the y-axis if (−x,y) is on the graph of f whenever (x,y) is on the graph
symmetry principle | the symmetry principle states that if a region R is symmetric about a line I, then the centroid of R lies on I
table of values | a table containing a list of inputs and their corresponding outputs
tangent | a tangent line to the graph of a function at a point (a,f(a)) is the line that secant lines through (a,f(a)) approach as they are taken through points on the function with x-values that approach a; the slope of the tangent line to a graph at a measures the rate of change of the function at a
tangent line approximation (linearization) | since the linear approximation of f at x=a is defined using the equation of the tangent line, the linear approximation of f at x=a is also known as the tangent line approximation to f at x=a
tangent plane | given a function f(x,y) that is differentiable at a point (x_0,y_0), the equation of the tangent plane to the surface z=f(x,y) is given by z=f(x_0,y_0)+f_x(x_0,y_0)(x−x_0)+f_y(x_0,y_0)(y−y_0)
tangent vector | to \vecs{r}(t) at t=t_0, any vector \vecs v such that, when the tail of the vector is placed at point \vecs r(t_0) on the graph, vector \vecs{v} is tangent to curve C
tangential component of acceleration | the coefficient of the unit tangent vector \vecs T when the acceleration vector is written as a linear combination of \vecs T and \vecs N
Taylor polynomials | the n^{\text{th}}-degree Taylor polynomial for f at x=a is p_n(x)=f(a)+f′(a)(x−a)+\dfrac{f''(a)}{2!}(x−a)^2+⋯+\dfrac{f^{(n)}(a)}{n!}(x−a)^n
Taylor series | a power series at a that converges to a function f on some open interval containing a
Taylor’s theorem with remainder | for a function f and the n^{\text{th}}-degree Taylor polynomial for f at x=a, the remainder R_n(x)=f(x)−p_n(x) satisfies R_n(x)=\dfrac{f^{(n+1)}(c)}{(n+1)!}(x−a)^{n+1} for some c between x and a; if there exists an interval I containing a and a real number M such that |f^{(n+1)}(x)|≤M for all x in I, then |R_n(x)|≤\dfrac{M}{(n+1)!}|x−a|^{n+1}
telescoping series | a telescoping series is one in which most of the terms cancel in each of the partial sums
term | the number a_n in the sequence {a_n} is called the nth term of the sequence
term-by-term differentiation of a power series | a technique for evaluating the derivative of a power series \displaystyle \sum_{n=0}^∞c_n(x−a)^n by evaluating the derivative of each term separately to create the new power series \displaystyle \sum_{n=1}^∞nc_n(x−a)^{n−1}
term-by-term integration of a power series | a technique for integrating a power series \displaystyle \sum_{n=0}^∞c_n(x−a)^n by integrating each term separately to create the new power series \displaystyle C+\sum_{n=0}^∞c_n\dfrac{(x−a)^{n+1}}{n+1}
terminal point | the endpoint of a vector
theorem of Pappus for volume | this theorem states that the volume of a solid of revolution formed by revolving a region around an external axis is equal to the area of the region multiplied by the distance traveled by the centroid of the region
three-dimensional rectangular coordinate system | a coordinate system defined by three lines that intersect at right angles; every point in space is described by an ordered triple (x,y,z) that plots its location relative to the defining axes
threshold population | the minimum population that is necessary for a species to survive
total area | total area between a function and the x-axis is calculated by adding the area above the x-axis and the area below the x-axis; the result is the same as the definite integral of the absolute value of the function
total differential | the total differential of the function f(x,y) at (x_0,y_0) is given by the formula dz=f_x(x_0,y_0)dx+f_y(x_0,y_0)dy
trace | the intersection of a three-dimensional surface with a coordinate plane
transcendental function | a function that cannot be expressed by a combination of basic arithmetic operations
transformation | a function that transforms a region G in one plane into a region R in another plane by a change of variables
transformation of a function | a shift, scaling, or reflection of a function
trapezoidal rule | a rule that approximates \displaystyle ∫^b_af(x)\,dx using the area of trapezoids. The approximation T_n to \displaystyle ∫^b_af(x)\,dx is given by T_n=\frac{Δx}{2}\big(f(x_0)+2\, f(x_1)+2\, f(x_2)+⋯+2\, f(x_{n−1})+f(x_n)\big).
tree diagram | illustrates and derives formulas for the generalized chain rule, in which each independent variable is accounted for
triangle inequality | if a and b are any real numbers, then |a+b|≤|a|+|b|
triangle inequality | the length of any side of a triangle is less than the sum of the lengths of the other two sides
triangle method | a method for finding the sum of two vectors; position the vectors so the terminal point of one vector is the initial point of the other; these vectors then form two sides of a triangle; the sum of the vectors is the vector that forms the third side; the initial point of the sum is the initial point of the first vector; the terminal point of the sum is the terminal point of the second vector
trigonometric functions | functions of an angle defined as ratios of the lengths of the sides of a right triangle
trigonometric identity | an equation involving trigonometric functions that is true for all angles θ for which the functions in the equation are defined
trigonometric integral | an integral involving powers and products of trigonometric functions
trigonometric substitution | an integration technique that converts an algebraic integral containing expressions of the form \sqrt{a^2−x^2}, \sqrt{a^2+x^2}, or \sqrt{x^2−a^2} into a trigonometric integral
triple integral | the triple integral of a continuous function f(x,y,z) over a rectangular solid box B is the limit of a Riemann sum for a function of three variables, if this limit exists
triple integral in cylindrical coordinates | the limit of a triple Riemann sum, provided the following limit exists: \lim_{l,m,n\rightarrow\infty} \sum_{i=1}^l \sum_{j=1}^m \sum_{k=1}^n f(r_{ijk}^*, \theta_{ijk}^*, z_{ijk}^*)\, r_{ijk}^* \Delta r \Delta \theta \Delta z
triple integral in spherical coordinates | the limit of a triple Riemann sum, provided the following limit exists: \lim_{l,m,n\rightarrow\infty} \sum_{i=1}^l \sum_{j=1}^m \sum_{k=1}^n f(\rho_{ijk}^*, \theta_{ijk}^*, \varphi_{ijk}^*) (\rho_{ijk}^*)^2 \sin \, \varphi \Delta \rho \Delta \theta \Delta \varphi
Type I | a region D in the xy-plane is Type I if it lies between two vertical lines and the graphs of two continuous functions g_1(x) and g_2(x)
Type II | a region D in the xy-plane is Type II if it lies between two horizontal lines and the graphs of two continuous functions h_1(y) and h_2(y)
unbounded sequence | a sequence that is not bounded is called unbounded
unit vector | a vector with magnitude 1
unit vector field | a vector field in which the magnitude of every vector is 1
upper sum | a sum obtained by using the maximum value of f(x) on each subinterval
variable of integration | indicates which variable you are integrating with respect to; if it is x, then the function in the integrand is followed by dx
vector | a mathematical object that has both magnitude and direction
vector addition | a vector operation that defines the sum of two vectors
vector difference | the vector difference \vecs{v}−\vecs{w} is defined as \vecs{v}+(−\vecs{w})=\vecs{v}+(−1)\vecs{w}
vector equation of a line | the equation \vecs r=\vecs r_0+t\vecs v used to describe a line with direction vector \vecs v=⟨a,b,c⟩ passing through point P=(x_0,y_0,z_0), where \vecs r_0=⟨x_0,y_0,z_0⟩ is the position vector of point P
vector equation of a plane | the equation \vecs n⋅\vecd{PQ}=0, where P is a given point in the plane, Q is any point in the plane, and \vecs n is a normal vector of the plane
vector field | measured in ℝ^2, an assignment of a vector \vecs{F}(x,y) to each point (x,y) of a subset D of ℝ^2; in ℝ^3, an assignment of a vector \vecs{F}(x,y,z) to each point (x,y,z) of a subset D of ℝ^3
vector line integral | the vector line integral of vector field \vecs F along curve C is the integral of the dot product of \vecs F with unit tangent vector \vecs T of C with respect to arc length, ∫_C \vecs F·\vecs T\,ds; such an integral is defined in terms of a Riemann sum, similar to a single-variable integral
vector parameterization | any representation of a plane or space curve using a vector-valued function
vector projection | the component of a vector that follows a given direction
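
The Taylor polynomial and remainder entries above can be checked numerically. The sketch below is only an illustration under assumptions chosen here: f(x)=e^x, a=0, n=4, x=0.5, and the interval I=[0,0.5], on which every derivative of e^x is bounded by M=e^{0.5}. The helper name taylor_exp is hypothetical.

```python
import math


def taylor_exp(x, n):
    """Evaluate the nth-degree Taylor polynomial of e^x at a = 0.

    Every derivative of e^x equals 1 at 0, so p_n(x) = sum of x^k / k!.
    """
    return sum(x**k / math.factorial(k) for k in range(n + 1))


x, n = 0.5, 4
p = taylor_exp(x, n)

# Actual remainder |R_n(x)| = |f(x) - p_n(x)|.
remainder = abs(math.exp(x) - p)

# Bound from Taylor's theorem: M / (n+1)! * |x - a|^(n+1), with a = 0
# and M = e^0.5 bounding the (n+1)th derivative on [0, 0.5].
M = math.exp(0.5)
bound = M / math.factorial(n + 1) * abs(x) ** (n + 1)

print(p, remainder, bound)
```

The printed remainder should not exceed the printed bound; a sharper choice of M or a larger n tightens the estimate.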

vector sum | the sum of two vectors, \vecs{v} and \vecs{w}, can be constructed graphically by placing the initial point of \vecs{w} at the terminal point of \vecs{v}; then the vector sum \vecs{v}+\vecs{w} is the vector with an initial point that coincides with the initial point of \vecs{v}, and with a terminal point that coincides with the terminal point of \vecs{w}
vector-valued function | a function of the form \vecs r(t)=f(t)\hat{\mathbf{i}}+g(t)\hat{\mathbf{j}} or \vecs r(t)=f(t)\hat{\mathbf{i}}+g(t)\hat{\mathbf{j}}+h(t)\hat{\mathbf{k}}, where the component functions f, g, and h are real-valued functions of the parameter t
velocity vector | the derivative of the position vector
vertex | a vertex is an extreme point on a conic section; a parabola has one vertex at its turning point. An ellipse has two vertices, one at each end of the major axis; a hyperbola has two vertices, one at the turning point of each branch
vertical asymptote | a function has a vertical asymptote at x=a if the limit as x approaches a from the right or left is infinite
vertical line test | given the graph of a function, every vertical line intersects the graph, at most, once
vertical trace | the set of ordered triples (c,y,z) that solves the equation f(c,y)=z for a given constant x=c or the set of ordered triples (x,d,z) that solves the equation f(x,d)=z for a given constant y=d
washer method | a special case of the slicing method used with solids of revolution when the slices are washers
work | the amount of energy it takes to move an object; in physics, when a force is constant, work is expressed as the product of force and distance
work done by a force | work is generally thought of as the amount of energy it takes to move an object; if we represent an applied force by a vector \vecs F and the displacement of an object by a vector \vecs s, then the work done by the force is the dot product of \vecs F and \vecs s
zero vector | the vector with both initial point and terminal point (0, 0)
zeros of a function | when a real number x is a zero of a function f, f(x) = 0
δ ball | all points in \mathbb{R}^3 lying at a distance of less than δ from (x_0,y_0,z_0)
δ disk | an open disk of radius δ centered at point (a,b)
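
To make the "work done by a force" entry above concrete, here is a minimal Python sketch that computes work as the dot product of a constant force vector and a displacement vector. The helper dot and the sample vectors (with units of newtons and meters assumed) are hypothetical values picked for illustration.

```python
def dot(u, v):
    """Return the dot product of two vectors given as equal-length sequences."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    return sum(a * b for a, b in zip(u, v))


# Hypothetical constant force (newtons) and displacement (meters).
force = (3.0, 4.0, 0.0)
displacement = (2.0, 0.0, 5.0)

# Work = F . s = 3*2 + 4*0 + 0*5
work = dot(force, displacement)
print(work)
```

Only the component of the force along the displacement contributes; a force perpendicular to the displacement gives a dot product of zero and therefore does no work.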

Detailed Licensing
Overview
Title: Map: Calculus - Early Transcendentals (Stewart)
Webpages: 151
Applicable Restrictions: Noncommercial
All licenses found:
Undeclared: 98.7% (149 pages)
CC BY-SA 4.0: 0.7% (1 page)
CC BY-NC-SA 4.0: 0.7% (1 page)

By Page
Map: Calculus - Early Transcendentals (Stewart) - Undeclared
  Front Matter - Undeclared
    TitlePage - Undeclared
    InfoPage - Undeclared
    Table of Contents - Undeclared
    Licensing - Undeclared
  1: Functions and Models - Undeclared
    1.1: Four Ways to Represent a Function - Undeclared
    1.2: Mathematical Models- A Catalog of Essential Functions - Undeclared
    1.3: New Functions from Old Functions - Undeclared
    1.4: Exponential Functions - Undeclared
    1.5: Inverse Functions and Logarithms - Undeclared
  2: Limits and Derivatives - Undeclared
    2.1: The Tangent and Velocity Problems - Undeclared
    2.2: The Limit of a Function - Undeclared
    2.3: Calculating Limits Using the Limit Laws - Undeclared
    2.4: The Precise Definition of a Limit - Undeclared
    2.5: Continuity - Undeclared
    2.6: Limits at Infinity; Horizontal Asymptotes - Undeclared
    2.7: Derivatives and Rates of Change - Undeclared
    2.8: The Derivative as a Function - Undeclared
  3: Differentiation Rules - Undeclared
    3.1: Derivatives of Polynomials and Exponential Functions - Undeclared
    3.2: The Product and Quotient Rules - Undeclared
    3.3: Derivatives of Trigonometric Functions - Undeclared
    3.4: The Chain Rule - Undeclared
    3.5: Implicit Differentiation - Undeclared
    3.6: Derivatives of Logarithmic Functions - Undeclared
    3.7: Rates of Change in the Natural and Social Sciences - Undeclared
    3.8: Exponential Growth and Decay - Undeclared
    3.9: Related Rates - Undeclared
    3.10: Linear Approximations and Differentials - Undeclared
    3.11: Hyperbolic Functions - Undeclared
  4: Applications of Differentiation - Undeclared
    4.1: Maximum and Minimum Values - Undeclared
    4.2: The Mean Value Theorem - Undeclared
    4.3: How Derivatives Affect the Shape of a Graph - Undeclared
    4.4: Indeterminate Forms and l'Hospital's Rule - Undeclared
    4.5: Summary of Curve Sketching - Undeclared
    4.6: Graphing with Calculus and Calculators - Undeclared
    4.7: Optimization Problems - Undeclared
    4.8: Newton's Method - Undeclared
    4.9: Antiderivatives - Undeclared
  5: Integrals - Undeclared
    5.1: Areas and Distances - Undeclared
    5.2: The Definite Integral - Undeclared
    5.3: The Fundamental Theorem of Calculus - Undeclared
    5.4: Indefinite Integrals and the Net Change Theorem - Undeclared
    5.5: The Substitution Rule - Undeclared
  6: Applications of Integration - Undeclared
    6.1: Areas Between Curves - Undeclared
    6.2: Volumes - Undeclared
    6.3: Volumes by Cylindrical Shells - Undeclared
    6.4: Work - Undeclared
    6.5: Average Value of a Function - Undeclared
  7: Techniques of Integration - Undeclared
    7.1: Integration by Parts - Undeclared

1 https://math.libretexts.org/@go/page/115432
    7.2: Trigonometric Integrals - Undeclared
    7.3: Trigonometric Substitution - Undeclared
    7.4: Integration of Rational Functions by Partial Fractions - Undeclared
    7.5: Strategy for Integration - Undeclared
    7.6: Integration Using Tables and Computer Algebra Systems - Undeclared
    7.7: Approximate Integration - Undeclared
    7.8: Improper Integrals - Undeclared
  8: Further Applications of Integration - Undeclared
    8.1: Arc Length - Undeclared
    8.2: Area of a Surface of Revolution - Undeclared
    8.3: Applications to Physics and Engineering - Undeclared
    8.4: Applications to Economics and Biology - Undeclared
    8.5: Probability - Undeclared
  9: Differential Equations - Undeclared
    9.1: Modeling with Differential Equations - Undeclared
    9.2: Direction Fields and Euler's Method - Undeclared
    9.3: Separable Equations - Undeclared
    9.4: Models for Population Growth - CC BY-SA 4.0
    9.5: Linear Equations - Undeclared
    9.6: Predator-Prey Systems - Undeclared
  10: Parametric Equations And Polar Coordinates - Undeclared
    Front Matter - Undeclared
      TitlePage - Undeclared
      InfoPage - Undeclared
    10.1: Curves Defined by Parametric Equations - Undeclared
    10.2: Calculus with Parametric Curves - Undeclared
    10.3: Polar Coordinates - Undeclared
    10.4: Areas and Lengths in Polar Coordinates - Undeclared
    10.5: Conic Sections - Undeclared
    10.6: Conic Sections in Polar Coordinates - Undeclared
    Back Matter - Undeclared
      Index - Undeclared
  11: Infinite Sequences And Series - Undeclared
    11.1: Sequences - Undeclared
    11.2: Series - Undeclared
    11.3: The Integral Test and Estimates of Sums - Undeclared
    11.4: The Comparison Tests - Undeclared
    11.5: Alternating Series - Undeclared
    11.6: Absolute Convergence and the Ratio and Root Test - Undeclared
    11.7: Strategy for Testing Series - Undeclared
    11.8: Power Series - Undeclared
    11.9: Representations of Functions as Power Series - Undeclared
    11.10: Taylor and Maclaurin Series - Undeclared
    11.11: Applications of Taylor Polynomials - Undeclared
  12: Vectors and The Geometry of Space - Undeclared
    12.1: Three-Dimensional Coordinate Systems - Undeclared
    12.2: Vectors - Undeclared
    12.3: The Dot Product - Undeclared
    12.4: The Cross Product - Undeclared
    12.5: Equations of Lines and Planes - Undeclared
    12.6: Cylinders and Quadric Surfaces - Undeclared
  13: Vector Functions - Undeclared
    13.1: Vector Functions and Space Curves - Undeclared
    13.2: Derivatives and Integrals of Vector Functions - Undeclared
    13.3: Arc Length and Curvature - Undeclared
    13.4: Motion in Space- Velocity and Acceleration - Undeclared
  14: Partial Derivatives - Undeclared
    14.1: Functions of Several Variables - Undeclared
    14.2: Limits and Continuity - Undeclared
    14.3: Partial Derivatives - Undeclared
    14.4: Tangent Planes and Linear Approximations - Undeclared
    14.5: The Chain Rule - Undeclared
    14.6: Directional Derivatives and the Gradient Vector - Undeclared
    14.7: Maximum and Minimum Values - Undeclared
    14.8: Lagrange Multipliers - Undeclared
  15: Multiple Integrals - Undeclared
    15.1: Double Integrals over Rectangles - Undeclared
    15.2: Double Integrals over General Regions - Undeclared
    15.3: Double Integrals in Polar Coordinates - Undeclared
    15.4: Applications of Double Integrals - Undeclared
    15.5: Surface Area - Undeclared
    15.6: Triple Integrals - Undeclared
    15.7: Triple Integrals in Cylindrical Coordinates - Undeclared
    15.8: Triple Integrals in Spherical Coordinates - Undeclared
    15.9: Change of Variables in Multiple Integrals - Undeclared

  16: Vector Calculus - Undeclared
    16.1: Vector Fields - Undeclared
    16.2: Line Integrals - Undeclared
    16.3: The Fundamental Theorem for Line Integrals - CC BY-NC-SA 4.0
    16.4: Green's Theorem - Undeclared
    16.5: Curl and Divergence - Undeclared
    16.6: Parametric Surfaces and Their Areas - Undeclared
    16.7: Surface Integrals - Undeclared
    16.8: Stokes' Theorem - Undeclared
    16.9: The Divergence Theorem - Undeclared
  17: Second-Order Differential Equations - Undeclared
    17.1: Second-Order Linear Equations - Undeclared
    17.2: Nonhomogeneous Linear Equations - Undeclared
    17.3: Applications of Second-Order Differential Equations - Undeclared
    17.4: Series Solutions of Differential Equations - Undeclared
  Back Matter - Undeclared
    Index - Undeclared
    Glossary - Undeclared
    Detailed Licensing - Undeclared

