Math117 Course Notes
Lecture Notes
David Harmsworth
University of Waterloo
2014, 2018
Part I
Functions
Since calculus is primarily concerned with the study of functions, we begin this course with
a review of some of the basic concepts. Since most of these ideas should already be familiar
to you, we’ll move quite quickly, with a focus on addressing some common misconceptions.
However, we will also introduce some concepts which are usually not discussed until later on
in the calculus sequence, so there should be something new for everyone each week.
A function is simply a rule which assigns a single output value to each input value. You are
probably familiar with the “vertical line test”; for a true function there can be only a single
output for each input, and this corresponds to the fact that its graph cannot pass through
any vertical line more than once. Since we customarily use the name “x” for the independent
(input) variable and the name “y” for the dependent (output) variable, we can describe the
action of a function f by writing y = f (x). You may occasionally also see the notation
Note that there doesn’t have to be a formula for a function. For instance, the temperature
at a given location can be regarded as a function of time, but there’s no explicit formula for
it! Of course, to use calculus, we might try to invent a formula which approximates the real
function.
The domain of a function is the set of allowable values for the independent variable, while
the range is the set of possible values for the dependent variable. For example, for the function
$f(x) = \sqrt{x - 1}$, the domain (unless otherwise specified) is the set of values of x such that x ≥ 1,
Comment: We’ll often describe such sets in interval notation: an interval such as
{x | 1 < x < 2} can be expressed simply as the interval (1, 2). If we wish to include the
endpoints we use square brackets: the interval {x | 1 ≤ x ≤ 2} can be written as [1, 2]. These
two types of intervals are referred to as “open” and “closed”, respectively. The two types of
parentheses can be combined as needed, so for example the interval [1, 2) is closed on the left
and open on the right (that is, the number 1 is included, but the number 2 is not). We use the
symbol ∞ for unbounded intervals; it will always be accompanied by a round bracket (because
∞ is not a real number, so it can’t be included in an interval). With this notation we could
write the domain and range for $\sqrt{x - 1}$ as [1, ∞) and [0, ∞), respectively.
Comment #2: Of course, we could also impose a restriction on the domain for a given
function. For example, we could define a function as $g(x) = \sqrt{x - 1}$ with domain x ∈ [1, 5),
Comment #3: You’ll probably notice that textbooks alternate between two different
notations for introducing the functions they want you to work on in their exercises. Is there
a difference in meaning between writing, for example, $f(x) = x^2$, and writing $y = x^2$? Well,
obviously there isn’t much difference in the amount of information given; the difference is
really one of emphasis. In the former notation the emphasis is on the rule; we are given a name
for the function, and told that it is the one which squares the input. In the latter notation we
are given a name for the output, and the emphasis is on the relationship between variables.
Comment #4: Sometimes other relationships between variables may also be of interest.
For example, the equation $x^2 + y^2 = a^2$ should be familiar as the equation of a circle of radius
a, but its graph clearly fails the vertical line test! So, why do we make such a fuss about which
relationships are functions and which aren’t? Well, it does make a difference for the theory of
calculus, so for example to perform certain calculations we might have to break the equation
of our circle into the two functions $y = \sqrt{a^2 - x^2}$ and $y = -\sqrt{a^2 - x^2}$.
2 Composition of Functions
If y = f (x) and x = g(t), then we can view y as a function of t: y = f (g(t)). There is a second
notation for this, too; we may write it as y = (f ◦ g)(t). This notation is convenient if we wish
It’s easy to show that composition is not generally commutative. That is, f ◦ g is not
mathematics, there will still occasionally be some ambiguity in our notation. For example, it
may have occurred to you that the interval notation we introduced above could cause some
confusion, since the expression (1, 2) could be interpreted either as an interval or as a point
in the xy-plane. The intent is usually clear from the context, though, and if it isn’t we can
You’ll notice a similar problem if you consider the expression h(2 − y 2 ). This could be
letters a, b, c, and d to represent parameters (quantities which are constant for the purposes
of the calculation, but can be altered), s, t, x, y, and z to represent variables, and the letters
f and g as names for functions1 . However, this is not a firm rule (and h is often used in both
3 Inverse Functions
We say that a function g is an inverse of a function f if g(f (x)) = x, for any x in the domain
of f . Actually, inverses are unique, so we may say that g is the inverse of f . In words we
might say that the inverse “undoes” the action of the original function.
Suppose y = f (x), and that g is the inverse of f . Then, by our definition, we know that
g(y) = x. Applying f to both sides gives
f(g(y)) = f(x),
and so
f(g(y)) = y.
Notation: Unfortunately, the problems with notation do not end with the comments of
the previous section. The standard notation for the inverse of f(x) is $f^{-1}(x)$, which is arguably
the worst piece of notation in all of calculus. The reason should be clear; the inverse of f is not
the same thing as the reciprocal of f! That is, we may write the reciprocal, $\frac{1}{f(x)}$, as $[f(x)]^{-1}$,
but we must not confuse this with $f^{-1}(x)$, which means something completely different. We’ll
see this problem again a little bit later with the so-called³ inverse trigonometric functions:
Many authors prefer to use the name arcsin x for the inverse, but even if you decide to
use this consistently you must never write $\sin^{-1} x$ to represent the reciprocal, because you
will be misunderstood!
³ You’ll see the reason for the adjective “so-called” when we get to that topic.
Finding Inverses: In simple cases we can find the inverse function simply by solving the
Solution: It will help to give a name to the output, so let’s write $y = \frac{1}{2}(x - 1)$. Then we
have
2y = x − 1,
and then
x = 2y + 1.
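The result can be checked numerically. The following is a minimal sketch, not part of the original notes: the names `f` and `f_inv` are illustrative, and `Fraction` is used only so that the arithmetic is exact.

```python
from fractions import Fraction

def f(x):
    """The function from the example: f(x) = (x - 1) / 2."""
    return (x - 1) / 2

def f_inv(y):
    """The inverse obtained by solving y = (x - 1) / 2 for x."""
    return 2 * y + 1

# Each function should undo the other, whatever we call the input variable.
samples = [Fraction(n, 3) for n in range(-6, 7)]
assert all(f_inv(f(x)) == x for x in samples)
assert all(f(f_inv(y)) == y for y in samples)
```

Note that nothing in the code depends on whether the argument of `f_inv` is called `x` or `y`; the function is just the rule.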
Note: You’ve been taught to switch the variables, so that in this example you would obtain
the expression y = 2x + 1, but this is not at all necessary. Writing $f^{-1}(y) = 2y + 1$ gives exactly
the same information as writing $f^{-1}(x) = 2x + 1$. As we discussed at the very beginning, the
function is just the rule: multiply by 2 and then add 1. It doesn’t matter what name we give
to the independent variable; the function remains the same. It is traditional to interchange
In fact, in applications, interchanging the variables is a terrible idea, since the variables
will usually be associated with specific quantities. For example, suppose that the distance
travelled by a moving object can be calculated as $x = f(t) = \sqrt{2t - 1}$, where t represents
time. Then the inverse function gives the time required for the object to move a given distance:
Tradition also dictates that when graphing a function y = f (x), we should use the hori-
zontal axis for the independent variable. If we do this (and also interchange the names of the
variables so that the horizontal axis corresponds to the variable x), then it follows that the
graph of the inverse will be the reflection of the original graph across the line y = x. Why?
Invertibility: Not every function possesses an inverse. The problem, of course, is that for
the inverse to be a true function, it must have a single output for every input. This requires
that the original function must have a single input for every output, in which case we say that
it is one-to-one. We can often spot this from the graph; if f is one-to-one it will pass the
“horizontal line test”. This ties in with our discussion above; if f is invertible then its graph
will pass the vertical line test after we interchange the axes!
Note that even if f is not one-to-one, and therefore not invertible, we may be able to
restrict its domain to some interval on which it is one-to-one, and then we can define an
Example: The function $f(x) = x^2$ has no inverse. However, the restriction of f to the
interval [0, ∞) does have an inverse: let’s call it $g_+(x) = \sqrt{x}$. Alternatively, we could consider
the restriction of f to the interval (−∞, 0], which has the inverse $g_-(x) = -\sqrt{x}$.
4 Symmetry
Of course, the graph of an even function is symmetric about the y-axis. A famous example is
[Figure 1]
The prototypical examples are even powers: $x^2$, $x^4$, $x^6$, etc. (these are probably the reason
for the use of the word “even”). Reciprocal even powers are also even: $x^{-2}$, $x^{-4}$, etc., and so
is the absolute value function, |x|. Note that we can obtain an even function whose graph lies
Definition: We say a function f (x) is odd if f (−x) = −f (x).
The graph of an odd function is said to be symmetric “about the origin”, or “antisymmetric”.
of successive reflections, across both axes. The sine function is odd, of course:
[Figure 2: graph of the sine function]
So are odd powers and reciprocal odd powers: $x$, $x^3$, $x^5$, etc., $x^{-1}$, $x^{-3}$, etc. Here’s a little
challenge: can you think of a way to construct an odd function whose graph lies in between x
⁵ To prove the first one, let f(x) and g(x) be even functions, and let h(x) = f(x) g(x). Then h(−x) = f(−x) g(−x) = f(x) g(x) = h(x), and so h is even.
To remember the rules, just think of the simplest examples: $x^2 \cdot x^4 = x^6$, $x \cdot x^3 = x^4$, and $x \cdot x^2 = x^3$. From these you can see why the rules for products of even and odd functions correspond to the rules for sums of even and odd integers!
• Most functions are neither even nor odd. However, any function whose domain is
symmetric about x = 0 (so that f(−x) is defined whenever f(x) is defined) can be
expressed as the sum of an even component and an odd component. Here’s how:
\[
\begin{aligned}
f(x) &= f(x) + \tfrac{1}{2} f(-x) - \tfrac{1}{2} f(-x) \\
     &= \tfrac{1}{2} f(x) + \tfrac{1}{2} f(x) + \tfrac{1}{2} f(-x) - \tfrac{1}{2} f(-x) \\
     &= \tfrac{1}{2}\,[f(x) + f(-x)] + \tfrac{1}{2}\,[f(x) - f(-x)].
\end{aligned}
\]
Now let $g(x) = \tfrac{1}{2}[f(x) + f(-x)]$ and let $h(x) = \tfrac{1}{2}[f(x) - f(-x)]$. Observe that
f(x) = g(x) + h(x), and when we examine these two functions we find that
\[ g(-x) = \tfrac{1}{2}\,[f(-x) + f(x)] = g(x), \]
so g is even, and
\[ h(-x) = \tfrac{1}{2}\,[f(-x) - f(x)] = -\tfrac{1}{2}\,[f(x) - f(-x)] = -h(x), \]
so h is odd. Thus we have a formula for the even and odd
components of f. We don’t often use this formula, but it is the basis for two useful
functions; if we break the exponential function up into its even and odd components we
obtain
\[ e^x = \tfrac{1}{2}\left(e^x + e^{-x}\right) + \tfrac{1}{2}\left(e^x - e^{-x}\right), \]
so we define
\[ \cosh(x) := \tfrac{1}{2}\left(e^x + e^{-x}\right), \qquad \sinh(x) := \tfrac{1}{2}\left(e^x - e^{-x}\right). \]
These are the even and odd components of the natural exponential function! They are
known as the hyperbolic cosine and hyperbolic sine functions, respectively, and we’ll
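The decomposition into even and odd components is easy to try out numerically. The sketch below is illustrative only: the helper names `even_part` and `odd_part` are our own, and the comparison against `math.cosh` and `math.sinh` assumes the standard library definitions of those functions.

```python
import math

def even_part(f):
    """Return the even component g(x) = [f(x) + f(-x)] / 2."""
    return lambda x: 0.5 * (f(x) + f(-x))

def odd_part(f):
    """Return the odd component h(x) = [f(x) - f(-x)] / 2."""
    return lambda x: 0.5 * (f(x) - f(-x))

g = even_part(math.exp)
h = odd_part(math.exp)

for x in [-2.0, -0.5, 0.0, 1.0, 3.0]:
    # The two components add back up to the original function...
    assert math.isclose(g(x) + h(x), math.exp(x))
    # ...they have the claimed symmetry...
    assert math.isclose(g(-x), g(x))
    assert math.isclose(h(-x), -h(x), abs_tol=1e-12)
    # ...and for the exponential they are cosh and sinh.
    assert math.isclose(g(x), math.cosh(x))
    assert math.isclose(h(x), math.sinh(x), abs_tol=1e-12)
```

The same two helpers can be applied to any function whose domain is symmetric about x = 0.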
As you learn more and more calculus you’ll discover several reasons why symmetry can be
important. For the time being, just observe that noticing symmetry can make sketching
easier:
Solution: First note that f is even, so we can start by sketching the part of the graph
\[ f(x) = x^2 - x = \left(x - \tfrac{1}{2}\right)^2 - \tfrac{1}{4} \]
This tells us that this part of the graph is part of an upward-opening parabola with its vertex
\[ x^2 - x = 0 \implies x(x - 1) = 0 \implies x = 0 \text{ or } x = 1. \]
[Figure 3]
Answer to “little challenge” involving odd functions: To use just powers of x and
the absolute value function, we can let $f(x) = \frac{x^3}{|x|}$ (or $f(x) = \frac{|x^3|}{x}$, which is exactly the same
function).
5 Piecewise-Defined Functions
We will sometimes encounter functions which are defined by different formulas on different
\[
f(x) = \begin{cases} -x, & \text{if } x < 0 \\ x^2, & \text{if } 0 \le x < 2 \\ 4, & \text{if } x \ge 2 \end{cases}
\]
[Figure: graph of f]
Such functions might seem artificial, but there are many physical phenomena which need
There are several simple piecewise-defined functions which are used quite commonly. One
of them should already be familiar to you; the absolute value function is of great importance
in calculus.
Alternatively, we could define it as $|x| = \sqrt{x^2}$ (this works because the symbol “√” always
It will be useful to think of |x| as the distance between x and zero on the number line.
Similarly, we can think of |x − a| as being the distance between x and a specific number a.
Example: The inequality |x − 5| < 2 is satisfied by the set of points within 2 units of 5:
⁶ Also, if we’re working with real numbers, then the notation $x^{1/2}$ means exactly the same thing as $\sqrt{x}$, so we could also write $|x| = (x^2)^{1/2}$. This means that $(x^2)^{1/2}$ is not always equal to x!
[Figure 4: number line marked from 2 to 8]
If x < 5 instead, then |x − 5| < 2 means − (x − 5) < 2. That is, x − 5 > −2, so x > 3
(recall that if we multiply an inequality by a negative number, then the inequality reverses
direction). Hence we may have x ∈ [5, 7) or x ∈ (3, 5). Combining these, we know that
x ∈ (3, 7).
Although we will often be able to find shortcuts, it is essential that you be able to use
the definition of |x|. It allows us to break difficult problems down into cases, so that we can
Solution: The expressions |x + 3| and |2x + 1| each have two possible meanings, so it
seems as though we should have four cases to consider. However, if you try to identify the
four cases you’ll realize that there are only three, since it is impossible to have x + 3 < 0 and
2x + 1 ≥ 0 at the same time (this would mean that x < −3 and x ≥ −1/2). The easiest way
to see this is to realize that the meaning of our inequality changes at two values of x : −3 and
−1/2. That means that there are just three intervals to consider!
Case I: Suppose x < −3. Then both x + 3 and 2x + 1 are negative, so the inequality
reads
−x − 3 ≤ −2x − 1
=⇒ x ≤ 2.
Think about what that tells us for a moment: if we assume that x < −3, then the inequality
requires that x ≤ 2, which is guaranteed anyway! So, the inequality is solved by any number
Case II: Suppose $x \in \left[-3, -\tfrac{1}{2}\right)$. Then x + 3 is non-negative, but 2x + 1 is still negative, so
\[ x + 3 \le -2x - 1 \implies 3x \le -4 \implies x \le -\tfrac{4}{3}. \]
That is, of the values of x in the interval $\left[-3, -\tfrac{1}{2}\right)$, only the values less than or equal to $-\tfrac{4}{3}$ satisfy the
inequality (so the values in $\left(-\tfrac{4}{3}, -\tfrac{1}{2}\right)$ are excluded from our solution set).
Case III: Suppose $x \ge -\tfrac{1}{2}$. Then both x + 3 and 2x + 1 are non-negative, so the inequality is
\[ x + 3 \le 2x + 1 \implies x \ge 2. \]
Combining the results from the three cases, we can conclude that the inequality |x + 3| ≤
|2x + 1| is satisfied by
\[ x \in \left(-\infty, -\tfrac{4}{3}\right] \cup [2, \infty). \]
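A case analysis like this is easy to get wrong, so it is worth a quick sanity check. The sketch below is only a spot check on a finite grid, not a proof; the function names are illustrative, and the grid spacing of 1/12 is chosen so that the boundary points −3, −4/3, −1/2 and 2 are all tested exactly.

```python
from fractions import Fraction

def satisfies(x):
    """Test the original inequality |x + 3| <= |2x + 1| directly."""
    return abs(x + 3) <= abs(2 * x + 1)

def in_solution_set(x):
    """Membership in the claimed answer (-inf, -4/3] U [2, inf)."""
    return x <= Fraction(-4, 3) or x >= 2

# Exact rational grid on [-10, 10] with spacing 1/12.
grid = [Fraction(n, 12) for n in range(-120, 121)]
assert all(satisfies(x) == in_solution_set(x) for x in grid)
```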
You’re unlikely to see this one often, but we’ll mention it briefly. The signum function simply
5.3 Ramp Functions
\[
r(t) = \begin{cases} 0, & \text{if } t < 0 \\ ct, & \text{if } t \ge 0 \end{cases}
\]
(where c is a constant)
[Figure: graph of the ramp function r(t)]
The floor function has infinitely many pieces, but it can be defined simply in words:
Example 1: $\lfloor 4.17 \rfloor = 4$, $\lfloor 7 \rfloor = 7$, and $\lfloor -2.32 \rfloor = -3$ (not −2; we just said the function
rounds down!).
Answer: Observe that f(2.1) = 2, f(2.5) = 3, f(2.7) = 3, f(3) = 3, etc. This function
Example 3: What does the function $g(x) = \frac{1}{10} \lfloor 10x \rfloor$ do?
Answer: Notice that we have g(2.1) = 2.1, g(2.13) = 2.1, g(2.1313...) = 2.1, and so on.
(this rounds up instead of down). We don’t really need both of these, since $\lceil x \rceil = -\lfloor -x \rfloor$.
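These examples can be reproduced with Python's built-in `math.floor` and `math.ceil`. This is a sketch for experimentation only; the helper names `g` and `ceil_via_floor` are our own, and `Fraction` is an implementation choice made to avoid floating-point surprises in Example 3.

```python
import math
from fractions import Fraction

# Example 1: the floor function rounds down, even for negative inputs.
assert math.floor(4.17) == 4
assert math.floor(7) == 7
assert math.floor(-2.32) == -3

def g(x):
    """Example 3: g(x) = (1/10) * floor(10 x), computed exactly."""
    return Fraction(math.floor(10 * x), 10)

def ceil_via_floor(x):
    """The ceiling written in terms of the floor: ceil(x) = -floor(-x)."""
    return -math.floor(-x)

# g discards everything after the first decimal place.
for text in ["2.1", "2.13", "2.1313"]:
    assert g(Fraction(text)) == Fraction("2.1")

# The identity relating ceiling and floor.
for x in [-2.32, -1, 0, 0.5, 4.17, 7]:
    assert ceil_via_floor(x) == math.ceil(x)
```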
5.5 The Fractional-Part Function
Here’s another simple idea, which can most easily be defined this way:
Note that FRACPT(x) is periodic! This gives it some interesting applications. For
example, if we are given an angle θ ∈ (−∞, ∞) in radians, we can obtain the corresponding
angle in the interval [0, 2π) by using the function $f(\theta) = 2\pi \, \mathrm{FRACPT}\!\left(\frac{\theta}{2\pi}\right)$. Try it!
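Here is one way to "try it". The sketch assumes that FRACPT(x) = x − ⌊x⌋, which is the usual definition and is consistent with the periodicity claimed above, but it is an assumption about the definition intended in these notes; the names `fracpt` and `wrap_angle` are illustrative.

```python
import math

def fracpt(x):
    """Fractional part, assuming the definition FRACPT(x) = x - floor(x)."""
    return x - math.floor(x)

def wrap_angle(theta):
    """Map an angle in radians to the equivalent angle in [0, 2*pi)."""
    return 2 * math.pi * fracpt(theta / (2 * math.pi))

# FRACPT has period 1: shifting the input by an integer changes nothing.
assert fracpt(2.75) == 0.75
assert fracpt(-0.25) == 0.75

# A quarter turn clockwise is the same as three quarters counter-clockwise.
assert math.isclose(wrap_angle(-math.pi / 2), 3 * math.pi / 2)
# Two and a half turns leave us half a turn from the start.
assert math.isclose(wrap_angle(5 * math.pi), math.pi)
```

In floating-point arithmetic a very small negative input can round up to the right-hand endpoint, so a production implementation would need to treat that edge case separately.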
This is perhaps the simplest piecewise-defined function we can imagine, and for that reason it’s
also one of the most important (along with the absolute value function). This is all it is:
\[
H(t) = \begin{cases} 0, & \text{if } t < 0 \\ 1, & \text{if } t \ge 0 \end{cases}
\]
[Figure: graph of H(t)]
This can be used to write any piecewise-defined function in single-line form. In fact, you’ll
need to be able to do this in your third calculus course, so we’ll try to get you used to the idea
now.
so if we think of moving from left to right (increasing time), then multiplication by H (t) can
[Figure 5]
Also, we can shift this effect, since
\[
H(t - a) = \begin{cases} 0, & \text{if } t < a \\ 1, & \text{if } t \ge a. \end{cases}
\]
Example: Consider $f(t) = t^2 H(t - 1)$. We can express this as
\[
f(t) = \begin{cases} 0, & \text{if } t < 1 \\ t^2, & \text{if } t \ge 1, \end{cases}
\]
so the graph looks like this:
[Figure 6: graph of f(t)]
That’s really all we need to know; with this tool we can produce an infinite variety of
Solution: First write the function in piecewise-defined form (just apply the definition of
H (t)):
If t ∈ (−∞, −2), then $f(t) = t$.
If t ∈ [−2, 0), then $f(t) = t + (2 - t) = 2$.
If t ∈ [0, 1), then $f(t) = t + (2 - t) + (t^2 - 1) = t^2 + 1$.
If t ∈ [1, ∞), then $f(t) = t + (2 - t) + (t^2 - 1) + (2 - t - t^2) = 3 - t$.
[Figure 7: graph of f(t)]
Of course, for this idea to be useful we must be able to do these problems in reverse. That
is, given a function defined in piecewise form, we need to be able to express it in terms of
the Heaviside function. There are many ways to do this, but there’s one simple strategy that
turns out to be the most useful. The idea is to begin with whatever we have on the left-most
portion of the graph, and use the Heaviside function to impose changes at each point where
the definition changes. We need the Heaviside function only in the form H (t − a); all we need
to do is set the values of a and work out what H (t − a) needs to be multiplied by.
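\[
f(t) = \begin{cases} -t, & \text{for } t < 0 \\ t^2, & \text{for } 0 \le t < 2 \\ 4, & \text{for } t \ge 2. \end{cases}
\]
simply add $t + t^2$. At time t = 2, we need to replace $t^2$ with 4, so that’s exactly what we do;
we add $4 - t^2$!
\[ f(t) = -t + \left(t + t^2\right) H(t) + \left(4 - t^2\right) H(t - 2). \]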
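To confirm that the single-line form agrees with the piecewise definition, we can evaluate both on a grid of points. This is a minimal sketch using the convention H(0) = 1 adopted in these notes; the function names are illustrative.

```python
def H(t):
    """Heaviside step function with the convention H(0) = 1."""
    return 1 if t >= 0 else 0

def f_piecewise(t):
    """The function as originally given, one formula per interval."""
    if t < 0:
        return -t
    if t < 2:
        return t ** 2
    return 4

def f_single_line(t):
    """Start with -t, then switch on a correction at each breakpoint."""
    return -t + (t + t ** 2) * H(t) + (4 - t ** 2) * H(t - 2)

# Quarter-integer steps are exact in floating point, so we can compare with ==.
ts = [n / 4 for n in range(-20, 21)]
assert all(f_piecewise(t) == f_single_line(t) for t in ts)
```

Each Heaviside term adds "new formula minus old formula" at its breakpoint, which is why the old formula cancels for all later times.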
The “Shortcuts”
When the given functions are more complicated it can be harder to do these calculations in
our heads, so some people prefer to use a couple of tricks to write down an expression quickly,
and then simplify it afterwards. This actually ends up requiring more work, but it might help
us to avoid mistakes.
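Consider that
\[
1 - H(t - a) = \begin{cases} 1, & \text{if } t < a \\ 0, & \text{if } t \ge a, \end{cases}
\]
so its graph looks like this:
[Figure: graph of 1 − H(t − a)]
\[
f(t) = H(t - a) - H(t - b) = \begin{cases} 0, & \text{for } t < a \\ 1, & \text{for } a \le t < b \\ 0, & \text{for } t \ge b, \end{cases}
\]
for which the graph looks like this:
[Figure: graph of H(t − a) − H(t − b)]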
With these, we can proceed formulaically. That is, given a function with, say, four “pieces”
\[
f(t) = \begin{cases} g_1(t), & \text{for } t < a \\ g_2(t), & \text{for } a \le t < b \\ g_3(t), & \text{for } b \le t < c \\ g_4(t), & \text{for } t \ge c, \end{cases}
\]
\[
f(t) = \begin{cases} t, & \text{if } t \le 1 \\ -t, & \text{if } t > 1 \end{cases}
\]
[Figure: graph of f(t)]
in which there appears a “≤” instead of a “≥” (so our intervals are closed on the right instead of
on the left). It is possible to deal with these; for this example we’d need to use the expression
H (1 − t). However, in applications we will not usually worry about this detail. For example,
if you imagine using mathematics to model what happens when you turn a lamp on, it doesn’t
really matter whether we consider the light to be on or off at the precise moment when we
In fact, various textbooks define the Heaviside function differently; you may see it defined
as
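\[
H(t) = \begin{cases} 0, & \text{if } t \le 0 \\ 1, & \text{if } t > 0 \end{cases}
\]
or even as
\[
H(t) = \begin{cases} 0, & \text{if } t < 0 \\ 1, & \text{if } t > 0. \end{cases}
\]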
We simply won’t need to worry about the distinction once we start using the function in
applications.
6 Periodicity
The graph of a periodic function consists of infinitely many segments which are replicas of
each other:
[Figure 8]
There are a few terms used to describe periodic functions that you should know:
• Frequency = $\frac{1}{\text{Period}}$. If the period is measured in seconds, then the frequency is in
“Hertz”: 1 Hz = 1 s⁻¹. This is the number of complete cycles per second. In engineering,
it’s common to use the letter f for frequency, while physicists tend to use the Greek
• Angular Frequency = $\frac{2\pi}{\text{Period}} = 2\pi f$. This has units of radians per second (we’ll review
what radians are shortly). We will actually refer to the angular frequency much more
often than to the frequency; so much so that we’ll often get tired of saying “angular”!
When we say “frequency”, we often mean “angular frequency” - watch out for this. Our
notation will help; it is customary to denote the angular frequency by the Greek letter
Periodic Extensions
a certain time interval, and we’ll need to create a function which matches the given one over
the given interval, but is defined over the entire real line, and is periodic. The same task may
be required in other applications as well. For example, we might be given this as our original
function:
\[
f(x) = \begin{cases} 2x, & \text{if } x \in \left[0, \tfrac{1}{2}\right) \\ 2 - 2x, & \text{if } x \in \left[\tfrac{1}{2}, 1\right] \\ 0, & \text{elsewhere} \end{cases}
\]
[Figure: graph of f]
We could extend this as an even periodic function (of period 1), like this:
[Figure 9]
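The period-1 extension can be sketched in code by shifting any input back into the base interval before applying the original formula. This is an illustrative construction under our own naming (`f`, `periodic_extension`), not necessarily the one used later in the course.

```python
import math

def f(x):
    """The original function: a 'tent' on [0, 1] and zero elsewhere."""
    if 0 <= x < 0.5:
        return 2 * x
    if 0.5 <= x <= 1:
        return 2 - 2 * x
    return 0.0

def periodic_extension(x, period=1.0):
    """Shift x into [0, period) and evaluate the original function there."""
    return f(x - period * math.floor(x / period))

# The extension matches f on the original interval...
assert periodic_extension(0.25) == f(0.25) == 0.5
# ...repeats with period 1...
assert periodic_extension(1.25) == 0.5
assert periodic_extension(2.5) == 1.0
# ...and is even, because the tent is symmetric about x = 1/2.
for x in [0.125, 0.25, 0.75, 1.5, 2.25]:
    assert periodic_extension(-x) == periodic_extension(x)
```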
[Figure 10]
Notice that both of these reproduce f (x) over the original interval (0, 1). You may see some
otherwise⁷.
You should be quite comfortable with the idea of combining rational functions into one
through finding a common denominator, but in calculus we will often find it necessary to
reverse this procedure. Fortunately, this can always be done... provided that we can manage
Fact: Any proper rational function can be expressed as the sum of simpler rational functions,
Example:
\[ \frac{5x^2 - 5x + 4}{x^3 - x^2 - x - 2} = \frac{3x - 1}{x^2 + x + 1} + \frac{2}{x - 2} \]
How do we do this? The Method of Partial Fractions essentially consists of guessing the
form of the decomposition on the right by taking advantage of the experience we have in
working in the other direction. If we think about all of the various things that can happen
We begin by factoring the denominator as far as possible, into linear and irreducibly
quadratic factors (the Fundamental Theorem of Algebra guarantees that this is always possible
in theory, although it can be difficult in practice). We then predict the form of the partial
fraction decomposition using three rules (which are admittedly difficult to explain clearly, but
⁷ Some authors use the term marginally proper if the numerator and denominator are of the same degree, in which case our term proper needs to be replaced by strictly proper.
1. For any linear factor $(c_1 x + c_0)$ in the denominator, the decomposition will contain a
term of the form $\frac{A}{c_1 x + c_0}$, for some constant A.
2. For any irreducible quadratic factor $(c_2 x^2 + c_1 x + c_0)$ in the denominator, the decomposition
will contain a term of the form $\frac{Ax + B}{c_2 x^2 + c_1 x + c_0}$, for some constants A and B.
3. For any factor which is repeated, n times, we need n terms of the forms given by Rules
If you’re not sure why these rules work the way they do, just try doing some of the following
examples in reverse (that is, go through the procedure of putting the results over the common
denominator), and you should begin to see the logic behind them.
Examples:
1. Consider $\frac{x + 2}{x^2 + 5x + 4}$. Factoring the denominator gives $\frac{x + 2}{(x + 4)(x + 1)}$, so we only need
Rule 1: $\frac{x + 2}{x^2 + 5x + 4} = \frac{A}{x + 4} + \frac{B}{x + 1}$. Now, to determine the values of A and B, the
idea is to put these expressions over a common denominator again and match up the
coefficients:
\[ \frac{x + 2}{x^2 + 5x + 4} = \frac{A}{x + 4} + \frac{B}{x + 1} = \frac{A(x + 1) + B(x + 4)}{x^2 + 5x + 4} \]
We can cancel the denominators, and this leaves us with x + 2 = A(x + 1) + B(x + 4) (*) =
(A + B)x + (A + 4B). The only way these two polynomials can be equal is if the
coefficients are equal, so we have the pair of equations
\[ 1 = A + B, \qquad 2 = A + 4B. \]
Solving these, we find A = 2/3, and B = 1/3, and so we have our result:
\[ \frac{x + 2}{x^2 + 5x + 4} = \frac{1}{3} \left[ \frac{2}{x + 4} + \frac{1}{x + 1} \right]. \]
Note: Once we get to the point marked (*), we could find A & B more quickly by substituting
values for x. This “cover-up” trick doesn’t work so well when we have quadratic
factors, so you’ll still need the concept of matching coefficients, but it does give us a useful
shortcut when the factors are linear:⁸
set x = −1 to get 1 = 3B (so B = 1/3)
set x = −4 to get −2 = −3A (so A = 2/3).
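The arithmetic in this example can be double-checked with exact rational numbers. The sketch below solves the pair of equations by elimination and then compares both sides of the decomposition at several sample points; the helper names `lhs` and `rhs` are illustrative, and agreement at sample points is a check, not a proof.

```python
from fractions import Fraction

# Matching coefficients in x + 2 = (A + B) x + (A + 4B) gives
#   A +  B = 1
#   A + 4B = 2
# Subtracting the first equation from the second leaves 3B = 1.
B = Fraction(2 - 1, 4 - 1)
A = 1 - B
assert (A, B) == (Fraction(2, 3), Fraction(1, 3))

def lhs(x):
    """The original rational function (x + 2) / (x^2 + 5x + 4)."""
    return Fraction(x + 2) / (x ** 2 + 5 * x + 4)

def rhs(x):
    """The partial fraction decomposition A/(x + 4) + B/(x + 1)."""
    return A / (x + 4) + B / (x + 1)

# Avoid x = -1 and x = -4, where both sides are undefined.
for x in [-3, -2, 0, 1, 2, 5, 10]:
    assert lhs(x) == rhs(x)
```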
2. Now consider $\frac{x}{(x + 1)(x^2 + x + 1)}$. We need Rules 1 and 2 here:
\[ \frac{x}{(x + 1)(x^2 + x + 1)} = \frac{A}{x + 1} + \frac{Bx + C}{x^2 + x + 1} \]
Since we have one quadratic factor, the cover-up method can be used to get only one of
the constants: setting x = −1 gives A = −1. For
the remaining two values, though, it’s quickest to match the coefficients:⁹ Comparing $x^2$
terms, we see that we must have 0 = A + B, while comparing constant terms, we find 0 = A + C.
Therefore B = 1 and C = 1, so $\frac{x}{(x + 1)(x^2 + x + 1)} = \frac{x + 1}{x^2 + x + 1} - \frac{1}{x + 1}$.
3. Consider $\frac{2x^5 - x^3}{(x + 2)(x^2 + 1)^3}$. We need all three rules here:
\[ \frac{2x^5 - x^3}{(x + 2)(x^2 + 1)^3} = \frac{A}{x + 2} + \frac{Bx + C}{x^2 + 1} + \frac{Dx + E}{(x^2 + 1)^2} + \frac{Fx + G}{(x^2 + 1)^3} \]
These problems get very tedious, and software packages can handle them for us, so we
Fact: If f(x) is an improper rational function, then it can always be written as the sum of
How? One option is long division (you may have seen synthetic division, but this only
works for linear denominators).
⁸ It might occur to you that these are precisely the values of x at which our original function is undefined, so we shouldn’t be allowed to do this! However, we could get exactly the same results by taking limits as x approaches −1 and −4, so the problem isn’t really a problem after all (to use some terminology we’ll introduce properly later on, these points are removable discontinuities).
⁹ To see why this is more efficient than relying exclusively on the cover-up method, consider trying to complete this example that way. Setting x = 0 does at least look helpful; it gives us 0 = A + C, but this is entirely equivalent to comparing the constant terms. After that, though, there are really no more useful values of x; the best we can do is pick a nice round number like x = 1. This gives us the equation 1 = 3A + 2(B + C). This is certainly sufficient for us to determine the values of all three constants, but this last equation is definitely more complicated than the equation we obtained from comparing the coefficients of $x^2$. The most efficient way to proceed is to use a sensible combination of the two techniques.
Example: Rewrite $f(x) = \frac{x^3 - 1}{x^2 + 2x + 1}$.
Solution: The long division calculation should look something like this:

                    x − 2
    x² + 2x + 1 ) x³              − 1
                  x³ + 2x² +  x
                  ───────────────────
                      −2x² −  x − 1
                      −2x² − 4x − 2
                      ─────────────
                             3x + 1

This tells us that $\frac{x^3 - 1}{x^2 + 2x + 1} = x - 2 + \frac{3x + 1}{x^2 + 2x + 1}$, and now we are in a position to
factor the denominator and proceed with the partial fraction decomposition.
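The long division procedure is mechanical enough to sketch in a few lines of code. The function below is an illustrative implementation, not taken from the notes: the name `poly_divmod` and the choice to represent a polynomial as a list of coefficients (highest power first) are our own, and the divisor is assumed to have a nonzero leading coefficient.

```python
from fractions import Fraction

def poly_divmod(num, den):
    """Divide two polynomials given as coefficient lists, highest power first.

    Returns (quotient, remainder) as lists of Fractions.
    Assumes den[0] is nonzero.
    """
    den = [Fraction(c) for c in den]
    remainder = [Fraction(c) for c in num]
    quotient = []
    while len(remainder) >= len(den):
        # Choose the factor that cancels the current leading term.
        factor = remainder[0] / den[0]
        quotient.append(factor)
        for i, c in enumerate(den):
            remainder[i] -= factor * c
        remainder.pop(0)  # the leading coefficient is now zero
    return quotient, remainder

# (x^3 - 1) / (x^2 + 2x + 1): quotient x - 2, remainder 3x + 1.
q, r = poly_divmod([1, 0, 0, -1], [1, 2, 1])
assert q == [1, -2]
assert r == [3, 1]
```

Each pass through the loop corresponds to one "subtract and bring down" step in the hand calculation.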
Another option is to include the polynomial terms in our partial fraction decomposition
procedure directly:
\[ \frac{x^3 - 1}{x^2 + 2x + 1} = Ax + B + \frac{C}{x + 1} + \frac{D}{(x + 1)^2} \]
\[ \implies \frac{x^3 - 1}{(x + 1)^2} = \frac{Ax(x + 1)^2 + B(x + 1)^2 + C(x + 1) + D}{(x + 1)^2} \]
\[ \implies x^3 - 1 = A\left(x^3 + 2x^2 + x\right) + B\left(x^2 + 2x + 1\right) + C(x + 1) + D. \]
Setting x = −1 gives −2 = D.
Note that we might have realized at the beginning that A = 1, by doing the first step of long
Application to Curve Sketching (if time permits)
Suppose we wish to investigate the graph of the function in the example above. We’ve dis-
covered that for large values of x (positive or negative), f (x) ≈ x − 2. This means that the
line y = x − 2 is an oblique asymptote (also called a slant asymptote). We also have a vertical
• the graph of f lies above the line y = x − 2 when x ≫ −1 (because $f(x) \approx x - 2 + \frac{3}{x + 1}$)
• the graph of f lies below the line y = x − 2 when x ≪ −1 (for the same reason)
• as x → −1, f → −∞, since the $\frac{-2}{(x + 1)^2}$ term dominates.
This is all we need to determine that the graph looks like this:
[Figure 11: graph of f, with the vertical asymptote x = −1 and the oblique asymptote y = x − 2]
Example: Sketch the graph of $f(x) = \frac{x^3 - x^2 - x + 3}{x - 1}$.

              x²      − 1
    x − 1 ) x³ − x² − x + 3
            x³ − x²
            ───────────────
                    − x + 3
                    − x + 1
                    ───────
                          2

\[ \implies f(x) = x^2 - 1 + \frac{2}{x - 1}. \]
From this we can see that f has a vertical asymptote at x = 1, and the graph of f approaches
the parabola y = x2 − 1 “asymptotically” as x → ±∞. The y-intercept is (0, −3). The other
part of the graph doesn’t have any intercepts, so to anchor the graph let’s just find one point:
[Figure 12]
Radian Measure
Definition: One radian is the angle for which the length of the arc of a circle matches the
[Figure: circle of radius r in which an arc of length s = r subtends the angle θ = 1]
The radian measure of an angle is the ratio of arc length to radius (in any circle drawn with
the vertex of the angle at its center): $\theta = \frac{s}{r}$.
1. Our definition of radian measure immediately gives us a formula for the length of an arc:
in a circle of radius r, the arc length corresponding to an angle θ (in radians) is given
by s = rθ.
2. At this point you’re probably more familiar with degrees than radians, so we should
know how they are related. Well, we know that the circumference of a circle (that is,
the arc length for a full circle) is s = 2πr. Matching this to the formula s = rθ, we
discover that in a full circle, θ = 2π radians. Hence 2π radians = 360°, which allows us
to conclude that 1 rad = $\frac{180^\circ}{\pi}$, and 1° = $\frac{\pi}{180}$ rad. To make you a bit more comfortable
with radian measure, it may help to note that 1 radian ≈ 57.3◦ . An easy way to make
sense of this is to compare the diagram above to an equilateral triangle; each angle in
an equilateral triangle is 60◦ , but if you imagine bending one side into an arc of a circle,
3. There is one way in which radians (and degrees) differ from other kinds of units. Since
they are defined by a ratio of lengths, they are dimensionless. That is,
In calculus we will use radians only. You’ll see why when we discuss derivatives, but
to put it simply it’s because they work better! Degrees are useful because the number
integers. Radians can be said to be a more natural measure, because they are defined
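The conversion factors and the arc length formula above translate directly into code. The helper names below are illustrative (Python's standard library also provides `math.degrees` and `math.radians`, which the sketch uses as a cross-check).

```python
import math

def rad_to_deg(theta):
    """Convert radians to degrees using 1 rad = 180/pi degrees."""
    return theta * 180 / math.pi

def deg_to_rad(angle):
    """Convert degrees to radians using 1 degree = pi/180 rad."""
    return angle * math.pi / 180

def arc_length(r, theta):
    """Arc length s = r * theta, with theta in radians."""
    return r * theta

# A full circle is 2*pi radians, i.e. 360 degrees.
assert math.isclose(rad_to_deg(2 * math.pi), 360)
# One radian is a little less than 60 degrees.
assert math.isclose(rad_to_deg(1), 57.3, abs_tol=0.01)
# The helpers agree with the standard library conversions.
assert math.isclose(deg_to_rad(60), math.radians(60))
# With theta = 2*pi, s = r*theta recovers the circumference 2*pi*r.
assert math.isclose(arc_length(3, 2 * math.pi), 2 * math.pi * 3)
```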
The Trigonometric Functions
You’ve probably seen the sine and cosine functions defined as ratios of lengths of sides of
triangles, but we’ll use a slightly different definition. Consider the unit circle, $x^2 + y^2 = 1$,
and an angle θ made between a ray from its center and the x-axis. We define the cosine and
sine of θ as the coordinates of the point of intersection of that ray and the circle itself:
[Figure: the unit circle, with the point P(cos θ, sin θ) where the ray at angle θ meets the circle]
x = cos θ
y = sin θ
These are “parametric equations” for the unit circle; if we imagine θ running through the
values from 0 to 2π, the point (x, y) traces out the circle. Many of the properties of the sine
and cosine functions should now appear to be immediate consequences of our definition:
• sin(0) = 0, cos(0) = 1, $\sin\left(\frac{\pi}{2}\right) = 1$, $\cos\left(\frac{\pi}{2}\right) = 0$, $\sin\left(\frac{3\pi}{2}\right) = -1$, etc.
• sin (θ + 2πk) = sin (θ), cos (θ + 2πk) = cos (θ), for all integers k (so the functions are
• Imagine moving clockwise from the origin instead; compare the values we obtain with
[Figure: the unit circle, showing the points (cos θ, sin θ) and (cos(−θ), sin(−θ)) at the angles θ and −θ]
From this it is clear that cos (−θ) = cos (θ), and sin (−θ) = − sin (θ), so the cosine function
• The sine of θ is positive when P is above the x-axis (in the 1st & 2nd quadrants), while
the cosine is positive when P is to the right of the y-axis (in the 1st & 4th quadrants).
If you know the “right” definition of the trigonometric functions, there’s no need for the
“CAST” rule!
• Where does the tangent function come from? We draw a line which is tangent to the
circle at the point (1, 0), and observe where it intersects our ray. The y-coordinate of
that intersection point is defined to be the tangent of θ, tan(θ). By similar triangles, we
can see that $\frac{\tan\theta}{1} = \frac{\sin\theta}{\cos\theta}$, which gives us a more practical definition.
[Figure 15: the unit circle, with the points (cos θ, sin θ) and (1, tan θ) on the ray at angle θ]
• There are another three functions in fairly common usage. They can be defined simply
as reciprocals of the three we have already named, but they all have geometric origins.
The secant function, for example, is the length of the part of our original ray which lies
between the origin and the above-mentioned tangent line (a secant line is a line which
Figure 16: The segment of the ray between the origin and the point (1, tan(θ)) has length sec(θ).
There are also a cosecant and a cotangent, abbreviated csc θ and cot θ. For simplicity, it’s
enough to remember our original definition of sine and cosine, and to remember that
$$\tan\theta = \frac{\sin\theta}{\cos\theta}, \quad \sec\theta = \frac{1}{\cos\theta}, \quad \csc\theta = \frac{1}{\sin\theta}, \quad \cot\theta = \frac{1}{\tan\theta}.$$
We will rarely use the cosecant and cotangent, and even the secant function is of limited
Trigonometric Identities
Most textbooks have long lists of identities, but there are only a few that you really
have to know. We’ve placed them in boxes in the following discussion. The other identities
are either less commonly needed, or can be derived quickly from this short list. We emphasize,
though, that you MUST KNOW the important ones! The trigonometric identities are our tools
for performing algebra with trigonometric functions, so this is as important as knowing the
rules for manipulating exponentials and logarithms, for example, or knowing how to multiply
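• Consider the angles θ and $\frac{\pi}{2} - \theta$:
Figure 17: [diagram showing the angles θ and $\frac{\pi}{2} - \theta$, in which $\sin\theta$ and $\cos\left(\frac{\pi}{2} - \theta\right)$ label the same length]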
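We can see that $\cos\left(\frac{\pi}{2} - \theta\right) = \sin\theta$. Similarly, $\sin\left(\frac{\pi}{2} - \theta\right) = \cos\theta$. Furthermore, since cosine
is even and sine is odd, we also have
$$\cos\left(\theta - \frac{\pi}{2}\right) = \sin\theta$$
and
$$\sin\left(\theta - \frac{\pi}{2}\right) = -\cos\theta.$$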
• The “sum-of-angle” identities are more difficult to establish, and we’ll state them here
without proof:
If you know these, and you know that cosine is even and sine is odd, then you can
Furthermore, the first of these can be combined with the Pythagorean identity to give
$$\cos 2\theta = 2\cos^2\theta - 1$$
or
$$\cos 2\theta = 1 - 2\sin^2\theta,$$
$$\cos^2\theta = \frac{1}{2}(1 + \cos 2\theta)$$
and
$$\sin^2\theta = \frac{1}{2}(1 - \cos 2\theta).$$
Example: Solve for θ, if $\sin 2\theta = \cos\theta$ and $\theta \in [0, 2\pi]$.
Solution: One option is to rewrite the equation as 2 sin θ cos θ = cos θ. This allows us
to see that either 2 sin θ = 1 or cos θ = 0 (don’t overlook the second possibility - we can only
remember that sin θ corresponds to the y-coordinate, and observe that there are two angles
which give the same value. If one is θ, then the other is π − θ (see the figure below).
[Figure: the angles θ and π − θ on the unit circle, which have the same sine value.]
Case 2: If $\cos\theta = 0$, then $\theta = \frac{\pi}{2}$ or $\theta = \frac{3\pi}{2}$.
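As a quick numerical sanity check of this example, the Python sketch below substitutes candidate angles into sin 2θ = cos θ. The candidate list is our own assumption: π/6 and 5π/6 are the two angles in [0, 2π] with sin θ = 1/2 (an angle and π minus that angle), and π/2 and 3π/2 come from Case 2. The helper name `residual` is hypothetical.

```python
import math

def residual(theta):
    """Return sin(2*theta) - cos(theta); this is zero exactly at solutions."""
    return math.sin(2 * theta) - math.cos(theta)

# Assumed candidates: sin(theta) = 1/2 gives pi/6 and pi - pi/6;
# cos(theta) = 0 gives pi/2 and 3*pi/2.
candidates = [math.pi / 6, 5 * math.pi / 6, math.pi / 2, 3 * math.pi / 2]

for theta in candidates:
    print(f"theta = {theta:.4f}   residual = {residual(theta):.2e}")
```

Each residual should be zero up to floating-point rounding, which is consistent with the two cases identified in the solution.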
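Example: Rewrite $\cos^4\theta$ in terms of cos 2θ and cos 4θ (this is a skill we’ll need later, when
we discuss integration).
$$\cos^4\theta = \left(\cos^2\theta\right)^2 = \left(\frac{1 + \cos 2\theta}{2}\right)^2$$
$$= \frac{1}{4}\left(1 + 2\cos 2\theta + \cos^2 2\theta\right)$$
$$= \frac{1}{4}\left(1 + 2\cos 2\theta + \frac{1}{2}(1 + \cos 4\theta)\right)$$
$$= \frac{1}{8}\left(3 + 4\cos 2\theta + \cos 4\theta\right).$$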
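A result like this is easy to spot-check numerically. The sketch below compares the two sides of the identity at a few arbitrarily chosen angles; the function names `lhs` and `rhs` and the sample angles are ours, chosen only for illustration.

```python
import math

def lhs(theta):
    """Left-hand side: cos^4(theta)."""
    return math.cos(theta) ** 4

def rhs(theta):
    """Right-hand side: (3 + 4 cos 2θ + cos 4θ) / 8."""
    return (3 + 4 * math.cos(2 * theta) + math.cos(4 * theta)) / 8

sample_angles = [0.0, 0.3, 1.0, 2.5, -4.0]
for theta in sample_angles:
    print(f"theta = {theta:5.2f}   lhs = {lhs(theta):.12f}   rhs = {rhs(theta):.12f}")
```

Agreement at a handful of points is not a proof, but a mismatch would immediately reveal an algebra slip.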
We’re now in a position to see some of the reasons for the names of the hyperbolic functions,
$\cosh x = \frac{1}{2}\left(e^x + e^{-x}\right)$ and $\sinh x = \frac{1}{2}\left(e^x - e^{-x}\right)$. You can easily show from their definitions
that
$$\cosh^2 x - \sinh^2 x = 1$$
(try it). Therefore, if we set x = cosh θ and y = sinh θ, then we obtain parametric equations
This analogy does have one peculiar twist; the variable θ here is NOT the angle! Never-
theless, there is a connection. It turns out that θ is twice the area enclosed by the x-axis, the
ray, and the curve... for both hyperbolic and circular functions!
9 The “Inverse” Trigonometric Functions
The first thing to know about the inverse trigonometric functions is that ... there aren’t any!
In fact, no periodic function can have an inverse (because periodic functions can’t be one-to-
one). The functions which are commonly referred to as the inverse trigonometric functions
are inverses only of versions of the trigonometric functions which have restrictions imposed on
their domains. We’ll start with the so-called inverse sine function.
We need to identify an interval on which the sine function covers its full range [−1, 1] and is
one-to-one. There are infinitely many choices, but we tend to like staying close to the origin
when given a choice, so we use the interval $\left[-\frac{\pi}{2}, \frac{\pi}{2}\right]$.
Figure 18: [graph of y = sin x]
We can say that the function f (x) = sin x, with domain $\left[-\frac{\pi}{2}, \frac{\pi}{2}\right]$, is invertible. This
allows us to define a new function as follows:
Definition: $y = \sin^{-1} x$ means that $x = \sin y$, where $y \in \left[-\frac{\pi}{2}, \frac{\pi}{2}\right]$.
Note that the domain of sin−1 x is [−1, 1] (and this would have been the case no matter
which domain restriction we had chosen). Also remember that y is the angle here. It may
Figure 19: [graph of $y = \sin^{-1} x$]
Since sin−1 x is NOT really the inverse of sin x, some confusion is natural, and some care
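• the statement $\sin\left(\sin^{-1} x\right) = x$ is true for all $x \in [-1, 1]$ (and makes no sense elsewhere),
• but the statement $\sin^{-1}(\sin x) = x$ is true only if $x \in \left[-\frac{\pi}{2}, \frac{\pi}{2}\right]$, even though this function
Examples:
• $\sin^{-1}\left(\sin\frac{\pi}{3}\right) = \frac{\pi}{3}$
• $\sin^{-1}\left(\sin\frac{5\pi}{4}\right) = -\frac{\pi}{4}$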
(How did we get the second result? We use the same idea as for Example 1 of the previous
section; we need the angle in the 1st or 4th quadrant which gives the same sine value as 5π/4,
which just requires a reflection across the y-axis.)
[Figure: the angles θ = 5π/4 and θ = −π/4 on the unit circle.]
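The two examples above can be reproduced with Python's standard `math.asin`, which returns values in [−π/2, π/2], the same range used in our definition. This is only an illustrative sketch; the helper name `arcsin_of_sin` is ours.

```python
import math

def arcsin_of_sin(x):
    """Evaluate arcsin(sin(x)); the result always lies in [-pi/2, pi/2]."""
    return math.asin(math.sin(x))

# pi/3 is already inside [-pi/2, pi/2], so we get it back unchanged.
print(arcsin_of_sin(math.pi / 3), math.pi / 3)

# 5*pi/4 lies outside that interval, so we get a different angle
# with the same sine value.
print(arcsin_of_sin(5 * math.pi / 4), -math.pi / 4)
```

In each printed pair the two numbers should agree up to rounding, matching the hand calculations.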
Since the name “inverse sine” is a misnomer, many people prefer the name arcsine for our
new function. This name is a reminder of the connection of the trigonometric functions to the
unit circle; if y is the angle (in radians), then it’s also the length of the associated arc!
Figure 20: [unit circle (radius 1) in which the angle y subtends an arc of length y]
A Tougher Question: What does the graph of y = sin−1 (sin x) look like?
Well, we know that if $x \in \left[-\frac{\pi}{2}, \frac{\pi}{2}\right]$, then y = x:
Figure 21: [graph of y = x for $x \in \left[-\frac{\pi}{2}, \frac{\pi}{2}\right]$]
You might be able to guess at the rest if you observe that f (−π) = 0 and f (π) = 0. To
confirm that guess, we need two more properties of trigonometric functions: identities, and
symmetry!
We know that $\sin x = \cos\left(x - \frac{\pi}{2}\right)$ (the sine function IS the cosine function, shifted to the
right!). Therefore we can write our function as $y = \sin^{-1}\left(\cos\left(x - \frac{\pi}{2}\right)\right)$. Finally, since the
cosine function is even, we realize that our own function is essentially an even function shifted
to the right. In other words, the graph must be symmetric about the line x = π/2:
[Figure: graph of $y = \sin^{-1}(\sin x)$, symmetric about the line x = π/2.]
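The symmetry claim can be checked numerically: if the graph is symmetric about x = π/2, then the function should take equal values at π/2 − t and π/2 + t. The sketch below tests this for a few offsets t of our own choosing, and also confirms the values at ±π mentioned earlier.

```python
import math

def f(x):
    """The function y = arcsin(sin(x)) discussed above."""
    return math.asin(math.sin(x))

offsets = [0.25, 1.0, 2.0, 3.0]
for t in offsets:
    left = f(math.pi / 2 - t)
    right = f(math.pi / 2 + t)
    print(f"t = {t:4.2f}   f(pi/2 - t) = {left:.6f}   f(pi/2 + t) = {right:.6f}")

# The text also observes that f(-pi) = 0 and f(pi) = 0.
print(f(-math.pi), f(math.pi))
```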
We define the inverse cosine function in a similar way, but we have to make one adjustment:
the cosine function is not one-to-one on the interval $\left[-\frac{\pi}{2}, \frac{\pi}{2}\right]$, so we have to use a different
interval for our original domain restriction. We use [0, π] instead:
Figure 24: [graph of y = cos x, restricted to the interval [0, π]]
Figure 25: [graph omitted]
Since the sine and cosine functions are so closely related, we can often avoid using the
What interval should be used for the arctangent? That’s easy to answer:
Figure 26: [graph of y = tan x]
The tangent function is one-to-one on the interval $\left(-\frac{\pi}{2}, \frac{\pi}{2}\right)$. This is almost the same as
for the sine function, except that the tangent function is undefined at the endpoints of the
One other difference to note is that the domain of the arctangent is the entire real line
(x ∈ R). For related reasons, it ends up being the most important member of this family of
Figure 27: [graph of $y = \tan^{-1} x$]
Of course, there are still three more to be defined, but we will rarely use them. There is one
Consider the secant function, y = sec x (it’s the solid curve in the figure below; the dashed
curve is the graph of cos x, to illustrate the relationship between the two).
Figure 28: [graphs of y = sec x (solid) and y = cos x (dashed)]
Picking an interval on which sec x is one-to-one is awkward; there’s no way to cover the full
range of the function without crossing a discontinuity. The most natural choice might be
$\left[0, \frac{\pi}{2}\right) \cup \left(\frac{\pi}{2}, \pi\right]$, and some authors do use this interval to define the inverse. However, it turns
out that this results in two different formulas for the derivative, depending on which part of
the domain x lies in! For this reason, other authors prefer to use this definition:
Definition: $y = \sec^{-1} x$ means that $x = \sec y$, where $y \in \left[-\pi, -\frac{\pi}{2}\right) \cup \left[0, \frac{\pi}{2}\right)$.
Also note that no matter what we do, the domain will not be contiguous; it must be
(−∞, −1] ∪ [1, ∞). With the definition above, the graph looks like this:
Figure 29: [graph of $y = \sec^{-1} x$ with the definition above]
The domain and range restrictions make this a pretty strange function. Fortunately, we
can usually avoid using it, if we wish! For example, suppose x is in the 1st quadrant. If
$$y = \sec x \implies x = \sec^{-1} y$$
OR
$$y = \sec x \implies y = \frac{1}{\cos x} \implies \cos x = \frac{1}{y} \implies x = \cos^{-1}\frac{1}{y}.$$
If the angle x is not in the first quadrant, then we’ll have to think a bit harder. There will
be different “rules” for avoiding the inverse secant, cosecant, and cotangent functions in the
various quadrants, but they will all involve just the addition or subtraction of some angle.
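For first-quadrant angles, the workaround above is exactly how one would compute an inverse secant in a language whose standard library has no such function. The sketch below does this in Python; the function name `arcsec_first_quadrant` and the restriction to y ≥ 1 are our own assumptions for illustration, and the other quadrants are deliberately not handled.

```python
import math

def arcsec_first_quadrant(y):
    """Return the angle x in [0, pi/2) with sec(x) = y, using x = arccos(1/y).

    Only y >= 1 is accepted, which corresponds to first-quadrant angles.
    """
    if y < 1:
        raise ValueError("this sketch only handles y >= 1 (first-quadrant angles)")
    return math.acos(1.0 / y)

x = arcsec_first_quadrant(2.0)
print(x, 1 / math.cos(x))  # the second number recovers sec(x)
```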
10 Working with Sines and Cosines
The Pythagorean identity and the double-angle formulas are of critical importance in calculus,
for reasons we’ll explore later (they are essential tools for the evaluation of some common types
of integrals). The sum-of-angle formulas are also extremely important, but for a different
In some applications of calculus to physics and engineering, we’ll have input information
of the form
f (t) = B sin ωt
(here B is the amplitude and ω is the angular frequency), and we’ll find that our mathematical
This is, in fact, still a sine wave. It has the same angular frequency as the input, but a different
How do we accomplish this? The key is the sum-of-angle formula, sin(θ1 + θ2) = sin θ1 cos θ2 +
and we can work backwards from this to determine what the values of A and α must be.
11
Note: if we graph g(t) versus t, we find that, in comparison to an unshifted sine wave such as f (t), g(t) is
shifted to the left by the quantity α/ω, since $A\sin(\omega t + \alpha) = A\sin\left(\omega\left(t + \frac{\alpha}{\omega}\right)\right)$. However, it is also possible
to graph g(t) versus the quantity ωt, in which case the shift is simply α, and this practice is quite common.
For this reason when we speak of the phase (or the phase shift), we are referring to α, rather than α/ω.
Solution: Since A sin(2t + α) = A sin 2t cos α + A cos 2t sin α = (A sin α) cos 2t + (A cos α) sin 2t,
and we want this to be equal to 5 cos 2t − 3 sin 2t, we match up the coefficients. That is, looking
at the coefficients of cos 2t in the two expressions, we conclude that
A sin α = 5. (1)
So, we’ve arrived at a system of two equations in two unknowns, which we hope to be able to
Now, if we square both sides of both equations and add the results, we discover that
equations). Should we take the positive root or the negative one? This is entirely our choice;
each will simply require a different value of α to accompany it. However, since we want to be
able to refer to A as the amplitude of the combined sine wave, we’ll choose the positive one:
$$A = \sqrt{34}.$$
Next, we can easily eliminate A from the system of equations by dividing one by the other.
This tells us that tan α = −5/3. Reaching for a calculator, we find that tan−1 (−5/3) ≈ −1.030
radians. We have to be careful, here, though; this is not necessarily the correct value for α!
In fact, in this particular example it isn’t; since we’ve selected A to be positive, equations 1
and 2 tell us that sin α is positive, and cos α is negative, so α must be in the 2nd quadrant,
Figure 30: [the angle α, in the 2nd quadrant, and the angle arctan(−5/3), in the 4th quadrant]
The angle in that quadrant whose tangent is −5/3 is α = tan−1 (−5/3) + π ≈ 2.111 radians
(although we could also add any multiple of 2π to this, so for example we might choose to set
In general we can conclude that $a\sin\omega t + b\cos\omega t = \sqrt{a^2 + b^2}\,\sin(\omega t + \alpha)$, where $\tan\alpha = \frac{b}{a}$.
It’s safe to remember the formula $A = \sqrt{a^2 + b^2}$ for the amplitude, but determining the phase
shift takes more care; we need to use the signs of a and b to determine which quadrant α
lies in before we can decide which formula applies. If α is in the 1st or 4th quadrants then
Note: If you prefer, you could calculate α more directly from equation 1, using $\sin\alpha = 5/\sqrt{34}$,
or from equation 2, using $\cos\alpha = -3/\sqrt{34}$. However, the corrections for angles in the
“wrong” quadrants take a bit more thought if we use the arcsine or arccosine functions.
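In software, the quadrant bookkeeping is usually delegated to a two-argument arctangent. The sketch below is not the method used in these notes; it swaps in Python's `math.atan2`, which takes the sine-like and cosine-like components separately and returns an angle in the correct quadrant. The function name `amplitude_phase` is ours, and we assume the convention a sin ωt + b cos ωt = A sin(ωt + α) with A > 0, so that A cos α = a and A sin α = b.

```python
import math

def amplitude_phase(a, b):
    """Return (A, alpha) such that a*sin(wt) + b*cos(wt) = A*sin(wt + alpha), A > 0."""
    A = math.hypot(a, b)        # sqrt(a^2 + b^2)
    alpha = math.atan2(b, a)    # quadrant chosen from the signs of b and a
    return A, alpha

# The worked example: 5 cos 2t - 3 sin 2t, i.e. a = -3 and b = 5.
A, alpha = amplitude_phase(-3.0, 5.0)
print(A, alpha)

# Spot-check that the two forms agree at a few times t.
for t in [0.0, 0.7, 2.0]:
    original = 5 * math.cos(2 * t) - 3 * math.sin(2 * t)
    combined = A * math.sin(2 * t + alpha)
    print(t, original, combined)
```

For this example the computed phase should agree with tan⁻¹(−5/3) + π, the second-quadrant angle found above.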
Part II
Limits
11 Sequences
Introduction
etc.), or possibly N plus the number 0. To denote the entire sequence, we write $\{a_n\}_{n=1}^{\infty}$, or
just $\{a_n\}$.
A sequence may have a formula (for example, $\left\{\frac{(-1)^n (n+1)}{3^n}\right\}_{n=0}^{\infty}$ is the sequence
$1, -\frac{2}{3}, \frac{3}{9}, -\frac{4}{27}, \frac{5}{81}, \ldots$), or it might not (for example, the sequence $\{a_n\}$, where $a_n$ is the
population of the world on January 1st of year n).
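When a sequence does have a formula, listing its terms is a one-line computation. The sketch below generates the first five terms of the example above using exact rational arithmetic; the function name `a` simply mirrors the notation $a_n$. Note that Python's `Fraction` reduces to lowest terms, so the third term is displayed as 1/3 rather than 3/9.

```python
from fractions import Fraction

def a(n):
    """The n-th term of the sequence (-1)^n (n + 1) / 3^n, for n = 0, 1, 2, ..."""
    return Fraction((-1) ** n * (n + 1), 3 ** n)

terms = [a(n) for n in range(5)]
print(terms)
```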
Most of calculus deals with continuous functions of real variables, but we do encounter
If the string is plucked, it will vibrate, as a sine wave. Since the ends are fixed, though,
At each moment, the waves have the form of some combination of curves $C_1 \sin\frac{\pi x}{L}$,
We may even encounter erratic-looking sequences without formulas, and the problems we’re
studying don’t have to be complicated for this to happen. For example, consider the digits in
the number π (which is just the ratio of circumference to diameter in any circle).
Of course, if there is a pattern, then ideally we would like to work out what the formula
Examples:
Limits of Sequences
The most interesting question about a list of numbers is whether it approaches a limit. Consider
the sequence $\{a_n\} = \left\{\frac{n}{n+1}\right\}_{n=1}^{\infty}$, which is $\frac{1}{2}, \frac{2}{3}, \frac{3}{4}, \frac{4}{5}, \ldots$. We can see that the terms
get closer and closer to 1 as n increases, and so we say that $\left\{\frac{n}{n+1}\right\}$ converges to 1 (and we
say that 1 is the limit of the sequence). We have two common notations for this; we may write
$$\lim_{n\to\infty} a_n = 1$$
or
$$a_n \to 1 \text{ as } n \to \infty.$$
Notice that the terms in our sequence never actually reach 1. This is probably the most
dangerous misconception about limits; when we say that $\lim_{n\to\infty} a_n = L$, all we are saying is
that the difference between $a_n$ and L becomes arbitrarily small as n increases. Since n can
Although the concept of convergence is a fairly simple one, stating a precise mathematical
definition is surprisingly difficult. Saying that “an → L as n → ∞ means that the numbers an
get closer and closer to L as n approaches infinity” is not really sufficient. What does “close”
mean? What does it mean for n to “approach” infinity? The real definition of convergence of
a sequence is this:
Definition: A sequence $\{a_n\}$ converges to the limit L if for any positive number ε, there
exists a number N such that
$$n > N \implies |a_n - L| < \varepsilon.$$
This probably requires some explanation. The idea is that if the limit exists, then we should
can do that (for any distance ε, no matter how small), then the limit must exist.
Throughout Math 117 and Math 119 we’ll spend very little time on proofs. However, since
what we are dealing with here is the definition of the most fundamental concept in calculus,
we want you to understand it, and so we are going to give you some practice in proving the
existence of limits.
Example: Use the definition to prove that $\frac{1}{\sqrt{n}} \to 0$ as $n \to \infty$.
Solution: We need to show that for any ε > 0, we can find an N such that
$$n > N \implies \left|\frac{1}{\sqrt{n}} - 0\right| < \varepsilon.$$
Well, if we simplify the expression on the right, we see that what we need to end up with is
$\frac{1}{\sqrt{n}} < \varepsilon$. We can work backwards from this (solving for n) to see that what we need is $n > \frac{1}{\varepsilon^2}$.
It might help to think of this as a little game. I give you a value for ε; let’s say ε = 0.1.
Your challenge is to tell me how large n has to be to make $\frac{1}{\sqrt{n}} < 0.1$. So, you tell me that n
just needs to be larger than 100 (this is the N in the definition). I might then say, “what if
ε = 0.01?”, and you would tell me that in that case n needs to be larger than 10000. If you
can always win this game, no matter what value of ε I pick, then the sequence does indeed
Let ε be any positive number. If $n > \frac{1}{\varepsilon^2}$, then $\frac{1}{\sqrt{n}} < \varepsilon$. Therefore $\frac{1}{\sqrt{n}} \to 0$ as $n \to \infty$.
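The "game" described above can be played by a short program: given ε, it answers with N = 1/ε² and then checks a run of integers beyond N. This is only a sketch; the function names `threshold` and `wins_game` and the choice to test 1000 consecutive values are our own. To avoid floating-point trouble right at the boundary, ε is passed as an exact `Fraction` and the condition 1/√n < ε is tested in the equivalent form n·ε² > 1.

```python
from fractions import Fraction
import math

def threshold(eps):
    """Return N = 1/eps^2; every n > N should satisfy 1/sqrt(n) < eps."""
    return 1 / eps ** 2

def wins_game(eps, how_many=1000):
    """Check 1/sqrt(n) < eps for the first `how_many` integers n > N."""
    N = threshold(eps)
    first_n = math.floor(N) + 1
    # For positive n and eps, 1/sqrt(n) < eps is equivalent to n * eps^2 > 1.
    return all(n * eps ** 2 > 1 for n in range(first_n, first_n + how_many))

for eps in [Fraction(1, 10), Fraction(1, 100), Fraction(3, 1000)]:
    print(f"eps = {float(eps)}   N = {float(threshold(eps))}   wins: {wins_game(eps)}")
```

A finite check like this can never replace the proof (which covers every ε and every n > N at once), but it illustrates what the definition is asking for.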
Example: Prove that $\frac{n}{n+1} \to 1$ as $n \to \infty$.
Solution: Plugging an and L into our definition, we must show that for any ε > 0, we
As we did in the first example, we’ll try to start with the right hand side and work backwards.
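$\left|\frac{n}{n+1} - 1\right| < \varepsilon$ will be true if
$$\left|\frac{n - (n+1)}{n+1}\right| < \varepsilon$$
$$\Longleftarrow \left|\frac{-1}{n+1}\right| < \varepsilon$$
$$\Longleftarrow \frac{1}{n+1} < \varepsilon$$
$$\Longleftarrow n + 1 > \frac{1}{\varepsilon}$$
$$\Longleftarrow n > \frac{1}{\varepsilon} - 1.$$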
since we already have them on the page, directly above, we’ll omit them from the proof itself.)
Calculation of Limits
You’ll probably agree that our definition of convergence is not all that easy to use. For one
thing, we have to be able to guess at the limit before we can use the definition to prove that
it is correct! So, how do we go about finding limits, in practice? We use theorems instead.
Below are the tools we need. All of these results can be proved using the definition.
1 Some basics:
• $\lim_{n\to\infty} n = \infty$.
• $\lim_{n\to\infty} \frac{1}{n} = 0$.
2 The limit of a sum is equal to the sum of the limits, provided that both limits exist.
That is,
if an → a and bn → b as n → ∞,
then an ± bn → a ± b as n → ∞.
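Example: Consider $\left\{\frac{8 - 2n}{5n}\right\}_{n=1}^{\infty}$.
Since $\frac{8 - 2n}{5n} = \frac{8}{5n} - \frac{2}{5}$, and since $\frac{8}{5n} \to 0$ and $\frac{2}{5} \to \frac{2}{5}$ as $n \to \infty$, we know that
$\frac{8 - 2n}{5n} \to -\frac{2}{5}$ as $n \to \infty$.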
3 The limit of a product is equal to the product of the limits, provided that both limits
exist. That is,
if an → a and bn → b as n → ∞,
then an bn → ab as n → ∞.
Similarly, the limit of a quotient is equal to the quotient of the limits, provided that the
limits exist and the limit of the sequence in the denominator is not zero.
Note that as a special case, we also have $\lim_{n\to\infty} C a_n = Ca$, for any constant C.
Example: Consider $\left\{\frac{3+n}{2n+1}\right\}_{n=1}^{\infty}$.
Our theorems above only apply when all of the limits involved exist, so we have to be a
little bit careful in how we deal with this. The usual practice for rational expressions is
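$$\frac{3+n}{2n+1} = \frac{\frac{3}{n} + 1}{2 + \frac{1}{n}}.$$
Now, $\frac{3}{n} + 1 \to 1$ and $2 + \frac{1}{n} \to 2$ as $n \to \infty$,
so we can conclude that $\frac{3+n}{2n+1} \to \frac{1}{2}$ as $n \to \infty$.
4 $\lim_{n\to\infty} \frac{1}{n^p} = 0$, for any p > 0.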
Note also that if r = 1, then rn → 1, while for other values of r the limit doesn’t exist.
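Example: Consider $\left\{\sin\left(\frac{\pi n}{2n+1}\right)\right\}_{n=1}^{\infty}$.
Since $\frac{\pi n}{2n+1} = \frac{\pi}{2 + \frac{1}{n}} \to \frac{\pi}{2}$ as $n \to \infty$, and since sin x is a continuous function (everywhere),
it follows that $\sin\left(\frac{\pi n}{2n+1}\right) \to \sin\frac{\pi}{2} = 1$ as $n \to \infty$.
$$\lim_{n\to\infty} f(a_n) = f\left(\lim_{n\to\infty} a_n\right) \text{ if } f \text{ is continuous and } \lim_{n\to\infty} a_n \text{ exists.}$$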
This looks like a bit of magic; we can interchange the order of the two operations (if the
If an → L and cn → L as n → ∞, and if an ≤ bn ≤ cn ,
then bn → L as n → ∞ as well.
12
We’re cheating here - we haven’t defined continuity yet! Unfortunately, because the “real” definitions of
limits and continuity are so difficult to work with, the discussion gets a bit long and complicated if we do
everything in a completely logical order.
The reason why this is true is easiest to demonstrate using continuous functions (the
Squeeze Theorem applies to them too). See below; if all we know about the function
h (x) is that its values always lie in between the values of f (x) and g (x), but we know
that f and g share the same limit, then h must have that limit as well.
Figure 31: [graphs of f(x), h(x), and g(x), with h(x) lying between f(x) and g(x)]
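Example: Consider $\left\{\frac{\sin n}{n}\right\}_{n=1}^{\infty}$.
Note that $\lim_{n\to\infty} \sin n$ does not exist, and there’s no useful way to rewrite this sequence.
However, we can observe that
$$-1 \le \sin n \le 1,$$
and therefore
$$-\frac{1}{n} \le \frac{\sin n}{n} \le \frac{1}{n}.$$
Finally, since $-\frac{1}{n} \to 0$ and $\frac{1}{n} \to 0$ as $n \to \infty$, we can conclude that $\frac{\sin n}{n} \to 0$ as $n \to \infty$
as well.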
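The squeeze in this example is easy to watch numerically. The sketch below evaluates the lower bound −1/n, the sequence sin(n)/n, and the upper bound 1/n for a few values of n; the function name `squeeze_check` and the sample values are ours.

```python
import math

def squeeze_check(n):
    """Return (is_squeezed, value) for the term sin(n)/n between -1/n and 1/n."""
    lower, middle, upper = -1 / n, math.sin(n) / n, 1 / n
    return lower <= middle <= upper, middle

for n in [1, 10, 100, 1000, 10**6]:
    ok, value = squeeze_check(n)
    print(f"n = {n:>7}   sin(n)/n = {value: .3e}   squeezed: {ok}")
```

The bounds shrink toward zero, and the middle term has nowhere else to go.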
• $\lim_{n\to\infty} e^n = \infty$.
• $\lim_{n\to\infty} \ln(n) = \infty$.
• $\lim_{n\to\infty} \tan(n)$ does not exist.
More Examples
Many limits will be obvious, and we may state them “by inspection”. The only real difficulties
arise when inspection leads to an “indeterminate form”. For example, if we look at the
expression $\frac{\ln\left(n^2 + 1\right)}{n + 5}$, and try simply “plugging in” n = ∞, we get the result $\frac{\infty}{\infty}$, which is
meaningless. Similarly, we might end up with $\frac{0}{0}$, or ∞ − ∞, which are also indeterminate (see footnote 13).
In these cases we can attempt to rewrite the sequences in forms which are not indeterminate.
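1. Find $\lim_{n\to\infty} \frac{3n^2 - n + 2}{2n^2 + 4}$.
$$\frac{3n^2 - n + 2}{2n^2 + 4} = \frac{3 - \frac{1}{n} + \frac{2}{n^2}}{2 + \frac{4}{n^2}} \longrightarrow \frac{3}{2} \text{ as } n \to \infty.$$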
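2. Evaluate $\lim_{n\to\infty} \frac{6^n + 3^n}{6^{n-1} - 4}$.
Solution: We try a variation on the same theme; divide through by the exponential
$$\frac{6^n + 3^n}{6^{n-1} - 4} = \frac{\frac{6^n}{6^n} + \frac{3^n}{6^n}}{\frac{6^{n-1}}{6^n} - \frac{4}{6^n}} = \frac{1 + \left(\frac{1}{2}\right)^n}{\frac{1}{6} - 4\left(\frac{1}{6}\right)^n} \longrightarrow 6 \text{ as } n \to \infty.$$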
3. Evaluate $\lim_{n\to\infty} \frac{\sqrt{3n^2 + 2}}{n + 4}$.
13
There are also a handful of exponential forms which are indeterminate. A particularly famous example is
the sequence $\left\{\left(1 + \frac{1}{n}\right)^n\right\}$. Trying to “evaluate” this “at infinity” gives us the form $1^\infty$, which you might think
should be 1. However, the base is never actually equal to 1! What we have in the base is a sequence of numbers
which are all larger than one (even though they are shrinking)! This limit is, in fact, the definition of the
number e, as we’ll discuss in more detail later on.
Solution: Again, we can adapt the idea of dividing by the highest power. The numer-
ator isn’t a polynomial, but we might say that the highest “effective” power there is n (we
have n2 inside a square root). Since this matches the highest power in the denominator,
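we can write
$$\frac{\sqrt{3n^2 + 2}}{n + 4} = \frac{\frac{1}{n}\sqrt{3n^2 + 2}}{\frac{1}{n}(n + 4)} = \frac{\sqrt{\frac{1}{n^2}}\sqrt{3n^2 + 2}}{1 + \frac{4}{n}} = \frac{\sqrt{3 + \frac{2}{n^2}}}{1 + \frac{4}{n}} \longrightarrow \sqrt{3} \text{ as } n \to \infty.$$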
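4. Evaluate $\lim_{n\to\infty} \left(\sqrt{n^2 + 2n} - n\right)$.
Solution: This has the form ∞ − ∞. We can use our standard trick again, but to do
so we must first convert our sequence into a ratio (we “rationalize” it):
$$\sqrt{n^2 + 2n} - n = \left(\sqrt{n^2 + 2n} - n\right)\left(\frac{\sqrt{n^2 + 2n} + n}{\sqrt{n^2 + 2n} + n}\right) = \frac{2n}{\sqrt{n^2 + 2n} + n} = \frac{2}{\sqrt{1 + \frac{2}{n}} + 1} \longrightarrow \frac{2}{2} = 1 \text{ as } n \to \infty.$$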
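The four limits above can be sanity-checked by evaluating each sequence at a large value of n. This does not prove anything, but it catches algebra errors quickly. In the sketch below the function names `ex1` to `ex4` are ours, and the particular values of n are arbitrary choices.

```python
import math

def ex1(n):
    """Example 1: (3n^2 - n + 2) / (2n^2 + 4), expected to approach 3/2."""
    return (3 * n**2 - n + 2) / (2 * n**2 + 4)

def ex2(n):
    """Example 2: (6^n + 3^n) / (6^(n-1) - 4), expected to approach 6."""
    return (6**n + 3**n) / (6**(n - 1) - 4)

def ex3(n):
    """Example 3: sqrt(3n^2 + 2) / (n + 4), expected to approach sqrt(3)."""
    return math.sqrt(3 * n**2 + 2) / (n + 4)

def ex4(n):
    """Example 4: sqrt(n^2 + 2n) - n, expected to approach 1."""
    return math.sqrt(n**2 + 2 * n) - n

print(ex1(10**6), 3 / 2)
print(ex2(60), 6)
print(ex3(10**6), math.sqrt(3))
print(ex4(10**6), 1)
```

A smaller n is used for Example 2 only because the powers of 6 grow so quickly; Python's integers handle them exactly before the final division.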
If a sequence does not have a limit, we say that it diverges. This can happen in a few different
ways.
• The terms may grow without bound. For example, the terms in the sequence $\{n^2\}$
continue to get larger and larger. We have a special notation for this; we write
$$\lim_{n\to\infty} a_n = \infty$$
or
$$a_n \to \infty \text{ as } n \to \infty.$$
There is a precise definition of what this notation means, similar to the “ε-N ” definition
introduced in this section. You may see it discussed in the assignments. There is a
similar definition for the case $\lim_{n\to\infty} a_n = -\infty$, as well.
• There may be no pattern at all to the terms (for example, consider the sequence of digits
in the number π: {3, 1, 4, 1, 5, 9, 2, 6, . . .}). We still say that this sequence diverges, since
it has no limit.
• Some sequences oscillate between two numbers. For example, the sequence $\{(-1)^n\}$
bounces back and forth between 1 and −1. In this case as well, we say that the sequence
diverges (we use the word divergent to mean that the limit does not exist, no matter
Such a sequence is still said to diverge (our definition doesn’t allow a sequence to have
more than one limit). For example, consider $\left\{\sin\left(\frac{2\pi n}{3}\right) + \frac{1}{10n + 1}\right\}$ (we’ll let you
investigate this one on your own; there are three convergent subsequences).
Answers to Examples:
1. $\{a_n\} = \{2 + 5n\}_{n=0}^{\infty}$, or $\{5n - 3\}_{n=1}^{\infty}$, etc.
2. $\{a_n\} = \left\{\frac{(-1)^n n}{(n+1)^2}\right\}_{n=1}^{\infty}$ or $\left\{\frac{(-1)^{n+1}(n+1)}{(n+2)^2}\right\}_{n=0}^{\infty}$, etc.
The definitions and theorems for limits of sequences can easily be extended to limits of func-
tions as x → ∞; we can simply replace the integer variable n with the real variable x, and
everything that we have discussed so far works exactly the same way. We can even adapt it
all for limits as x → −∞; nothing else changes. However, given a function f of a real variable
x, we can also consider a second type of limit: given a constant a, what happens to f (x) as
Definition:
The statement f (x) → L as x → a or $\lim_{x\to a} f(x) = L$ means that for any positive number
ε, there exists a number δ such that
$$0 < |x - a| < \delta \implies |f(x) - L| < \varepsilon.$$
Whereas for sequences we needed to make n large, we’re now thinking of making the distance
between x and a small; you should think of δ as being a small positive number. If you found
the idea of the game from the previous section helpful, then the modification is this: if I
suggest a small value for ε, your challenge is now to tell me how close x must be to a in order
for |f (x) − L| to be less than ε. The number you provide is what we’ve labelled as δ. If you
There’s one other small modification: we’ve stipulated that 0 < |x − a| because we’re not
The definition could be paraphrased as “$\lim_{x\to a} f(x) = L$ means that f (x) can be made
arbitrarily close to L by making x sufficiently close (but not equal) to a”.
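Solution: We must show that for any ε > 0, there exists a δ > 0 such that |x − 2| <
δ =⇒ |(3x − 2) − 4| < ε.
$$|(3x - 2) - 4| < \varepsilon$$
$$\Longleftarrow |3x - 6| < \varepsilon$$
$$\Longleftarrow 3|x - 2| < \varepsilon$$
$$\Longleftarrow |x - 2| < \frac{\varepsilon}{3}.$$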
We can now see that δ = ε/3 is the value required for the proof:
(Condensed) Proof:
Let ε be any positive number. If |x − 2| < ε/3, then |(3x − 2) − 4| < ε. Therefore
$\lim_{x\to 2} (3x - 2) = 4$.
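The ε-δ version of the "game" can also be illustrated in code. Given ε, the sketch below answers with δ = ε/3 (the value found in the proof) and then samples points x with 0 < |x − 2| < δ to confirm that |(3x − 2) − 4| < ε at each one. The names `delta_for` and `check`, and the sampling scheme, are our own; a finite sample is of course no substitute for the proof.

```python
def f(x):
    """The function from the example, f(x) = 3x - 2."""
    return 3 * x - 2

def delta_for(eps):
    """The delta found in the proof above: delta = eps / 3."""
    return eps / 3

def check(eps, samples=1000):
    """Sample x with 0 < |x - 2| < delta and test |f(x) - 4| < eps at each point."""
    delta = delta_for(eps)
    for k in range(1, samples):
        offset = delta * k / samples          # strictly between 0 and delta
        for x in (2 - offset, 2 + offset):
            if not abs(f(x) - 4) < eps:
                return False
    return True

for eps in [0.1, 0.01, 1e-6]:
    print(f"eps = {eps}   delta = {delta_for(eps)}   all samples within eps: {check(eps)}")
```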
Note that limits of functions can also fail to exist in the same ways as limits of sequences
can fail to exist. Also note that we need a different definition for every one of the following
statements:
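• $\lim_{x\to a} f(x) = L$
• $\lim_{x\to a} f(x) = \infty$
• $\lim_{x\to a} f(x) = -\infty$
• $\lim_{x\to\infty} f(x) = L$
• $\lim_{x\to\infty} f(x) = \infty$
• $\lim_{x\to\infty} f(x) = -\infty$
• $\lim_{x\to-\infty} f(x) = L$
• $\lim_{x\to-\infty} f(x) = \infty$
• $\lim_{x\to-\infty} f(x) = -\infty$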
You may get some practice with some of these in the assignments. Actually, there are six
Fortunately, the definition we’ve just introduced leads to the same set of theorems as govern
limits of sequences as n → ∞ (and functions as x → ±∞). We also get one more extremely
important one:
If you prefer the other notation, this says that f is continuous at a if and only if $\lim_{x\to a} f(x) = f(a)$.
This theorem makes the calculation of most limits trivial in practice. For instance, consider
the example above. It seems silly to have to prove that 3x − 2 approaches 4 as x approaches 2,
because we all know that the function 3x − 2 is continuous (not just at 2, but everywhere), so
all we have to do is evaluate it at x = 2! In fact, we know that all polynomials are continuous,
and all of our other familiar functions are continuous on their domains as well. Therefore
So, what kinds of discontinuities might we encounter? The most familiar is the division-
by-zero kind. If both numerator and denominator approach zero as x → a, then we have an
indeterminate form, and as we saw with sequences the most useful strategy is to try to rewrite
the function in a form which is not indeterminate. In some cases, though, it isn’t possible
to rewrite the function in any useful way, so we might ask ourselves if the Squeeze Theorem
We may also encounter discontinuities when working with piecewise-defined functions. For
Here the notations $x \to a^+$ and $x \to a^-$ refer to the limits of f as we approach a from the
right and left sides, respectively (and again, all of our other limit theorems extend to these
types of limits as well). If we find different limits from the two directions of approach, then f
simply does not have a limit as x approaches a. So, for piecewise-defined functions, we simply
Examples:
a) $\lim_{x\to 2} \frac{2x^2 + 1}{x^2 + 6x - 4}$
14
Technically, we have a serious logical problem here. We have not yet defined continuity! What we should
be doing is proceeding to a discussion of continuity (which we’ll discuss in the next lecture), verifying this
claim that all of our elementary functions are continuous on their domains, and then returning to the present
discussion of limits. However, we all have a basic understanding of what continuity means, and this departure
from the logical progression allows us to proceed to some examples of actual calculations. The difficulty is this:
an understanding of limits is required for a rigorous discussion of continuity, but in practice we rely on an
understanding of continuity to evaluate limits! We’ve dealt with this in a different way than most textbooks,
but the textbooks still have a logical problem. They use theorem (3) as the definition of continuity... but then
they assume that their functions are continuous almost everywhere in order to determine if f (x) → f (a)!
This function is continuous at x = 2, so we can conclude immediately that
$$\lim_{x\to 2} \frac{2x^2 + 1}{x^2 + 6x - 4} = \left.\frac{2x^2 + 1}{x^2 + 6x - 4}\right|_{x=2} = \frac{9}{12} = \frac{3}{4}.$$
b) $\lim_{x\to 2} \frac{x+5}{x-2}$
This time we have a discontinuity in the denominator, but we do not have an indeterminate
form, since the numerator is not zero. A little bit of thought is all that is required
here: if we approach 2 from the right we obtain large positive values so $\lim_{x\to 2^+} \frac{x+5}{x-2} = \infty$,
while if we approach from the left we obtain large negative values so $\lim_{x\to 2^-} \frac{x+5}{x-2} = -\infty$.
Therefore the limit simply doesn’t exist.
c) $\lim_{x\to 2} \frac{x^2 + x - 6}{x - 2}$
Here we do have an indeterminate form; both numerator and denominator approach
zero as x → 2, so we try rewriting the function: $\frac{x^2 + x - 6}{x - 2} = \frac{(x+3)(x-2)}{x-2} = x + 3$
(for x ≠ 2). Clearly x + 3 → 5 as x → 2, so the limit is 5.
d) $\lim_{x\to 0} \frac{(2 + x)^3 - 8}{x}$
Again, a little bit of algebra is all that’s required:
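One way to carry out the algebra (this is our own reconstruction, not necessarily the working intended in these notes) is to expand (2 + x)³ = 8 + 12x + 6x² + x³, so that for x ≠ 0 the quotient simplifies to 12 + 6x + x², which suggests a limit of 12. The sketch below compares the original expression with this simplified form near x = 0; the function names `g` and `simplified` are hypothetical.

```python
def g(x):
    """The original expression ((2 + x)^3 - 8) / x, defined for x != 0."""
    return ((2 + x) ** 3 - 8) / x

def simplified(x):
    """The expanded form 12 + 6x + x^2 (our reconstruction of the algebra)."""
    return 12 + 6 * x + x ** 2

for x in [0.1, 0.01, 0.001, -0.001]:
    print(f"x = {x:7.3f}   g(x) = {g(x):.8f}   12 + 6x + x^2 = {simplified(x):.8f}")
```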
$$-1 \le \cos\frac{1}{x} \le 1.$$
f) $\lim_{x\to 1.5} \frac{2x^2 - 3x}{|2x - 3|}$
This is a piecewise-defined function, with the change in the definition of the function
occurring precisely at the discontinuity, so we need to check the two one-sided limits:
For x > 1.5, $\frac{2x^2 - 3x}{|2x - 3|} = \frac{x(2x - 3)}{2x - 3} = x \to 1.5$ as $x \to 1.5^+$.
For x < 1.5, $\frac{2x^2 - 3x}{|2x - 3|} = \frac{x(2x - 3)}{-(2x - 3)} = -x \to -1.5$ as $x \to 1.5^-$.
Since the left- and right-sided limits don’t match, the limit does not exist.
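g) $\lim_{x\to\infty} \left(\sqrt{9x^2 + x} - 3x\right)$
Limits as x → ∞ work exactly as limits of sequences; we’ve seen problems like this
before!
$$\sqrt{9x^2 + x} - 3x = \left(\frac{\sqrt{9x^2 + x} + 3x}{\sqrt{9x^2 + x} + 3x}\right)\left(\sqrt{9x^2 + x} - 3x\right)$$
$$= \frac{x}{\sqrt{9x^2 + x} + 3x}$$
$$= \frac{1}{\sqrt{9 + \frac{1}{x}} + 3}$$
$$\longrightarrow \frac{1}{6} \text{ as } x \to \infty.$$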
h) $\lim_{x\to-\infty} \frac{x}{\sqrt{4x^2 + 3}}$
Limits as x → −∞ also work the same way... except that occasionally the fact that we
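$$\frac{x}{\sqrt{4x^2 + 3}} = \frac{1}{\frac{1}{x}\sqrt{4x^2 + 3}}$$
$$= \frac{1}{\frac{-1}{\sqrt{x^2}}\sqrt{4x^2 + 3}} \quad \text{since } x = -\sqrt{x^2} \text{ when } x < 0 \text{ (!!!)}$$
$$= \frac{-1}{\sqrt{4 + \frac{3}{x^2}}}, \text{ which} \to -\frac{1}{2} \text{ as } x \to -\infty.$$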
A Special Limit: Once in a while we do encounter limits which we cannot evaluate with
these simple algebraic manipulations. Later on we’ll introduce a couple of other techniques
which will help (Taylor Series approximations, and L’Hôpital’s Rule). For now, we’ll introduce
just one special limit which is of particular importance:
$$\lim_{\theta\to 0} \frac{\sin\theta}{\theta} = 1.$$
The proof is somewhat long, so we’ll omit it, but we can make a quick intuitive argument as
to why this should be true. Recall that θ is not just the angle, but also the length of the arc
subtended by the angle θ in a unit circle. If we consider an extremely small angle, we can see
that this arc becomes virtually indistinguishable from the vertical distance sin θ (see Figure
32).
Figure 32: [a small angle θ in the unit circle, showing the arc length θ and the vertical distance sin θ]
You may also encounter the limit $\lim_{\theta\to\infty} \frac{\sin\theta}{\theta}$. You should be able to see that this is different;
it’s zero, by the Squeeze Theorem. As a final note, we point out that the function
$$f(x) = \begin{cases} \frac{\sin x}{x} & \text{if } x \ne 0 \\ 1 & \text{if } x = 0 \end{cases}$$
occurs so often in digital signal processing that it gets a special name: it is called the “sinc”
function (short for “sinus cardinalis”, or cardinal sine), and it is written as sinc (x). You can
[Figure: graph of sinc (x)]
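The piecewise definition of the sinc function translates directly into code, and evaluating it near zero and far from zero illustrates both limits discussed above. This is a minimal sketch of the unnormalized form given in the text; some signal-processing libraries use a differently scaled convention, so the function here is written out by hand.

```python
import math

def sinc(x):
    """sinc(x) = sin(x)/x for x != 0, and 1 at x = 0, as defined in the text."""
    return 1.0 if x == 0 else math.sin(x) / x

# Values near 0 approach 1, matching the special limit.
for x in [0.1, 0.01, 0.001, 0.0]:
    print(f"sinc({x}) = {sinc(x):.10f}")

# Values for large x approach 0, matching the Squeeze Theorem argument.
for x in [10.0, 1000.0, 1e6]:
    print(f"sinc({x}) = {sinc(x):.3e}")
```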
13 Continuity
There is a precise definition of continuity which is often omitted from textbooks. You’ll notice
that it looks like another limit definition, except that it involves values of the function at two
A function f (x) is continuous on an interval if for any x and y in that interval, and for any
To paraphrase this, when we say that a function is continuous, we mean that we can always
force f (x) to be close to f (y) by making sure that x is close to y. Unfortunately, it turns out
that it is hard to use this definition successfully, so we’ll stick with one example:
Solution: We’ll rely on a geometric argument. What we need to show is that for any
positive number ε, we can make |sin x − sin y| < ε, by making |x − y| small enough. Well,
Figure 34: [part of the unit circle, showing the angles x and y, the arc lengths x, y and x − y, and the vertical distances sin(x), sin(y) and sin(x) − sin(y)]
We can see that the vertical distance sin (x) − sin (y) is smaller than the arc length x − y.
If we consider other quadrants of the unit circle, the signs may change, but we’ll always have
|sin (x) − sin (y)| ≤ |x − y|. Therefore, in order to ensure that |sin (x) − sin (y)| < ε, all we
have to do is make sure that |x − y| < ε. This implies that sin (x) is a continuous function,
As usual with definitions, we won’t want to use it very often. Instead, we’ll accept that all
$\sin x$, $\cos x$, $x$, $\sqrt{x}$, $\frac{1}{x}$, $e^x$, $\ln x$, etc.
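The inequality |sin x − sin y| ≤ |x − y| from the geometric argument for the continuity of sin x can be spot-checked on random inputs. The sketch below does this with a fixed random seed so that it is reproducible; the function name `violates_bound`, the sampling range, and the small tolerance for rounding error are our own choices.

```python
import math
import random

def violates_bound(x, y, tol=1e-12):
    """Return True if |sin x - sin y| exceeds |x - y| by more than rounding error."""
    return abs(math.sin(x) - math.sin(y)) > abs(x - y) + tol

rng = random.Random(0)  # fixed seed for reproducibility
pairs = [(rng.uniform(-10, 10), rng.uniform(-10, 10)) for _ in range(10_000)]

print("any violations found:", any(violates_bound(x, y) for x, y in pairs))
```

Because the bound holds for every pair, choosing |x − y| < ε is always enough to force |sin x − sin y| < ε, which is exactly what the argument uses.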
Furthermore, the Continuity Criterion can be used to prove the Continuity Theorems:
Theorem:
• (f ◦ g) (x) is continuous on I
• (f g) (x) is continuous on I
• $\frac{1}{f(x)}$ is continuous at any point in I where f (x) ≠ 0.
Putting all of this together, we can summarize everything there is to know about continuity
1 Most of the functions we’ll encounter will obviously be continuous on their domains.
This means that for most functions, we can usually tell where we have continuity just
Examples:
2 Continuity will usually only be in question when we are dealing with piecewise-defined
functions!
For these, the theorem of the previous section is the tool we need:
In fact most textbooks present this as the definition of continuity! (Why didn’t we
do this? Simply because we felt like being honest. This “definition” only applies to
15 It probably makes sense to you to say that $\frac{1}{x}$ has a discontinuity at x = 0. Note, though, that we can also correctly say that $\frac{1}{x}$ is continuous on its domain!
individual points, so it would be extremely hard to prove that a function like sin x is
continuous everywhere!)
Examples:
• Consider the function
$$f(x) = \begin{cases} x^2, & \text{for } x \le 2 \\ 6 - x, & \text{for } x > 2. \end{cases}$$
We know that this is continuous for x < 2 and x > 2, but is it continuous at x = 2? Well, we can see that f (2) = 4,
We know that $\frac{\sin x}{x}$ is continuous for all $x \neq 0$, so the only question is at that special point x = 0. However, we’ve established that $\lim_{x \to 0} \frac{\sin x}{x} = 1$, so the sinc function is indeed continuous everywhere.
Types of Discontinuities
We have three informal names for the various ways in which a function can fail to be continuous
at a single point:
• The discontinuity in $\frac{\sin x}{x}$ is called a “removable discontinuity”, because we can obtain a continuous function (the sinc function) simply by defining f (0) = 1.
• The discontinuities we see in the functions $f(x) = \dfrac{1}{x-1}$ and $g(x) = \begin{cases} \dfrac{1}{x-1} & \text{for } x \neq 1 \\ 0 & \text{for } x = 1 \end{cases}$ are called “infinite discontinuities”, for obvious reasons.
• The discontinuity in the function $f(x) = \begin{cases} x & \text{for } x \in [0, 10) \\ 8 & \text{for } x \in [10, \infty) \end{cases}$ is called a “jump discontinuity”. Again, you should be able to see why if you make a rough sketch of the graph.
There are many theorems which apply only to continuous functions. Here are two of the most
important:
Theorem:
Suppose f (x) is continuous on a closed interval [a, b]. If c is a number between f (a) and
f (b) (that is, if f (a) < c < f (b) or f (b) < c < f (a)), then there exists at least one number
It might seem odd to state this as a theorem, since it seems so obvious, but it will be
convenient to have a name for the argument. Instead of saying “we know this because f is
continuous and c is between f (a) and f (b)” we can now just say “by the IVT”! Also, it is
surprisingly difficult to prove the IVT; we won’t be able to do so here.
Application: Root Finding Suppose we wish to solve the equation $e^x - 2 = \cos x$. A quick
sketch tells us that there must be at least one solution, but we have no way of calculating
it exactly. However, with repeated use of the IVT we are able to calculate it precisely; that
is, we can find an approximate solution to whatever level of accuracy we desire. The idea is
simple: we rewrite the equation as $e^x - 2 - \cos x = 0$, let $f(x) = e^x - 2 - \cos x$, and observe
Since f (0) = −2 < 0 and f (1) ≈ 0.18 > 0, we can conclude that the root must lie between
Next, check the midpoint of this interval: f (0.5) ≈ −1.23 < 0, so $x_0$ must lie between
x = 0.5 and x = 1.
Repeat using the midpoint of this new interval: f (0.75) ≈ −0.61 < 0, so $x_0 \in (0.75, 1)$.
Continue. Let’s see if we can find the root to two decimal places:
We can now say that to two decimal places, the root is 0.95. This is referred to as the
Bisection Method for root finding. It’s slow, but if we’re working by hand we can speed it up
a bit by using some common sense; since f (0) = −2 and f (1) ≈ 0.18 we might have guessed
right away that the root would be closer to 1 than to 0, and jumped straight to 0.9 instead of
using the midpoint. We’ll see some faster methods in Math 119.
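The repeated halving described above is easy to automate. Here is a minimal Python sketch of the Bisection Method applied to f(x) = e^x − 2 − cos x on [0, 1]. The function name `bisect` and the tolerance argument `tol` are our own illustrative choices, not part of the course material.

```python
import math

def f(x):
    # f(x) = e^x - 2 - cos(x); a root of f solves e^x - 2 = cos(x)
    return math.exp(x) - 2 - math.cos(x)

def bisect(func, a, b, tol=1e-6):
    """Halve [a, b] until its half-width is below tol, always keeping
    the sign change (and hence, by the IVT, a root) inside the interval."""
    fa, fb = func(a), func(b)
    if fa * fb > 0:
        raise ValueError("func(a) and func(b) must have opposite signs")
    while (b - a) / 2 > tol:
        m = (a + b) / 2
        fm = func(m)
        if fm == 0:
            return m              # landed exactly on the root
        if fa * fm < 0:
            b, fb = m, fm         # sign change is in the left half
        else:
            a, fa = m, fm         # sign change is in the right half
    return (a + b) / 2

root = bisect(f, 0.0, 1.0)
print(f"root = {root:.6f}, rounded to two decimals: {round(root, 2)}")
```

Each pass through the loop halves the interval, so about ten iterations are enough for three-decimal accuracy when we start from an interval of width 1.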
Application: Curve Sketching Understanding the IVT can help us determine what a
and −2. Since f is continuous, it cannot change sign within the intervals (−∞, −2), (−2, 0),
(0, 2), or (2, ∞) (otherwise the IVT would guarantee the existence of another zero). So, all we
f (−3) = −15
f (−1) = 3
f (1) = −3
f (3) = 15
Therefore f is positive on (−2, 0) and on (2, ∞), and negative on (−∞, −2) and (0, 2).
Of course, this isn’t an awful lot of information, but it can be a useful idea if we’re having
Theorem:
If f is continuous on a closed interval [a, b], then f attains a maximum value and a minimum
Perhaps the best way to appreciate this is to consider how a function can fail to have
If f (x) is not continuous, then its graph could do something like this:
[Figure: graph of a discontinuous function.] In this case f (x) isn’t even bounded on the given interval.
If we have a removable discontinuity, we could have a situation like this instead:
If f (x) is continuous, but the interval under consideration isn’t closed, then we might have
this situation:
So, that explains what can happen with continuous functions on open intervals, or functions
example of the remaining case; demonstrate that a function with a jump discontinuity can
Part III
Differential Calculus
Calculus is usually thought of as having two “halves”: differential calculus and integral calculus
(although the concept of limit underlies both of them). You should already be familiar with
most of the formulas we’re about to review, so we’ll move quickly here. First, though, we
14 The Derivative
$$\frac{f(x_1) - f(x_0)}{x_1 - x_0}.$$
This expression is often referred to as the Newton difference quotient. Its meaning should be
clear. As an example, suppose your car’s odometer reads 31,040 km at 10am, and 31,373 km
at 1pm. Then the average rate of change of the distance you have travelled along your path
The derivative of a function f (x) at a point is simply the limit of the difference quotient
Definition:
$$\lim_{x_1 \to x_0} \frac{f(x_1) - f(x_0)}{x_1 - x_0},$$
assuming that this limit exists. If the limit does not exist, we say that f is not differentiable
at x0 .
There are several notations in common usage, all inherited from the great minds of the
• Lagrange’s notation: $f'(x_0)$. This will be our preferred notation for simple statements
x0 ”.
• Leibniz’s notation: $\frac{df}{dx}(x_0)$. This will be our preferred notation for more sophisticated calculations. It is useful because its form is so similar to the Newton quotient; $\frac{df}{dx}$ stands for $\lim_{x_1 \to x_0} \frac{\Delta f}{\Delta x}$ (where the Greek letter ∆ is used to denote a change in a quantity; $\Delta f = f_1 - f_0$ and $\Delta x = x_1 - x_0$). Since the derivative is the limit of a quotient, there are
and the Leibniz notation enables us to do so efficiently (that is, we may occasionally treat the expression $\frac{df}{dx}$ as a fraction). It also allows us to treat the derivative as an operator (a function which acts on functions instead of numbers); we can write $\frac{d}{dx}[f(x)]$ to denote the action of the differentiation process upon the function f (more on this in a moment). Also, note that if y = f (x), we’ll frequently write $\frac{dy}{dx}$ instead of $\frac{df}{dx}$.
• Newton’s notation: $\dot{f}(x_0)$. This is rarely used in purely mathematical texts, but is the
• Euler’s notation: $Df$. This is used when we are treating the derivative exclusively as an operator; it’s a more concise way of writing $\frac{d}{dx}[f(x)]$, which also allows us to discuss the derivative function without specifying a name for the independent variable. You will
Geometric Interpretation The average rate of change of a function f (x) over an interval
[x0 , x1 ] can be interpreted as the slope of the secant line joining the points (x0 , f (x0 )) and
(x1 , f (x1 )). If we let x1 approach x0 this secant line spans a shorter and shorter segment of
the curve, and in the limit we obtain a line which touches the curve at only one point (at
least, only one in the immediate vicinity). This, of course, is how we define a tangent line to
[Figure 35: secant lines through (x0, f (x0)) and (x1, f (x1)) as x1 approaches x0.]
To obtain the equation of the tangent line, recall the “point-slope” form of the equation of
a line: a line with slope m passing through a point (a, b) has equation
(y − b) = m (x − a) .
Using this we can say that the tangent line to f (x) at x0 has equation
$$y - f(x_0) = f'(x_0)(x - x_0).$$
That is,
$$y = f(x_0) + f'(x_0)(x - x_0).$$
The definition of the derivative can be written in a second way. If we write the difference
$$f'(x_0) = \lim_{x_1 \to x_0} \frac{f(x_1) - f(x_0)}{x_1 - x_0}$$
can be rewritten as
$$f'(x_0) = \lim_{\Delta x \to 0} \frac{f(x_0 + \Delta x) - f(x_0)}{\Delta x}.$$
The advantage of this is that we’ve eliminated $x_1$ from the definition, which allows us to let $x_0$ vary (we’ll just relabel it as x) and hence define the derivative as a function:
$$f'(x) = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}.$$
Higher-order Derivatives
Since the derivative of a function f is itself a new function, it can be differentiated as well.
We’ll call the result the second derivative of f (and we can then calculate a third derivative,
and so on).
In Lagrange’s notation, we add more strokes: the second and third derivatives are $f''(x)$
and $f'''(x)$. For the fourth derivative and beyond, it’s common to write superscript numbers
to tell how many dots there are, which may be one reason why this notation has fallen out of
favour with mathematicians, but is still popular with physicists, who rarely need to discuss
Physical Interpretation
The meaning of a derivative will depend on the context. Remembering the difference quotient
will help!
Example: If s (t) describes the displacement of an object as a function of time, then the derivative $s'(t)$ has units of $\dfrac{\text{length } (\Delta s)}{\text{time } (\Delta t)}$, so it must be velocity! Since velocity is interesting in its own right, let’s give it a new label: let $v(t) = s'(t)$. Then $v'(t) = \lim_{\Delta t \to 0} \frac{\Delta v}{\Delta t}$ has units of $\dfrac{\text{length/time } (\Delta s/\Delta t)}{\text{time } (\Delta t)}$, or $\dfrac{\text{length}}{\text{time}^2}$; it is acceleration ($a(t) = v'(t) = s''(t)$).
Question: If m (x) gives the mass of the part of a metal rod that lies between its left end
Answer: Since $m'(x) = \lim_{\Delta x \to 0} \frac{\Delta m}{\Delta x}$, it has units of $\dfrac{\text{mass}}{\text{length}}$. We may refer to this as the “linear density” of the rod.
15 Differentiation Formulas
We want you to know and understand the definition of the derivative so that you can properly
interpret derivatives, but we’ll rarely have to actually use it. Instead, we have a set of theorems
1 $\dfrac{d}{dx}(k) = 0$, for any constant k.
2 $\dfrac{d}{dx}(x^n) = n x^{n-1}$, for any constant n (although of course if n < 1 then we must exclude the point x = 0).
3 $\dfrac{d}{dx}[k f(x)] = k \dfrac{df}{dx}$, for any constant k (assuming that $\dfrac{df}{dx}$ exists).
4 $\dfrac{d}{dx}[f(x) + g(x)] = \dfrac{df}{dx} + \dfrac{dg}{dx}$ (assuming that these both exist).
Note that rules 3 and 4 are the defining properties of linearity; we say that the derivative
is a linear operator.
Also, rules 1 through 4 are all we need in order to differentiate any polynomial.
Example: You should have no problem calculating the derivative of $f(x) = 3x^3 + 5x^2 - 2x + 7$.
5 $\dfrac{d}{dx}(\sin x) = \cos x$
It may be worth seeing the proof of this particular rule:
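$$\begin{aligned}
\frac{d}{dx}(\sin x) &= \lim_{\Delta x \to 0} \frac{\sin(x + \Delta x) - \sin x}{\Delta x} \\
&= \lim_{\Delta x \to 0} \frac{\sin x \cos \Delta x + \cos x \sin \Delta x - \sin x}{\Delta x} && \text{(using the sum-of-angle identity)} \\
&= \lim_{\Delta x \to 0} \sin x \, \frac{\cos \Delta x - 1}{\Delta x} + \lim_{\Delta x \to 0} \cos x \, \frac{\sin \Delta x}{\Delta x} && \text{(assuming that both limits exist)} \\
&= \sin x \lim_{\Delta x \to 0} \frac{\cos \Delta x - 1}{\Delta x} + \cos x \lim_{\Delta x \to 0} \frac{\sin \Delta x}{\Delta x}
\end{aligned}$$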
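Now, if we are working in radians, then $\frac{\sin \Delta x}{\Delta x} \to 1$ as $\Delta x \to 0$, while
$$\begin{aligned}
\frac{\cos \Delta x - 1}{\Delta x} &= \frac{\cos \Delta x - 1}{\Delta x} \cdot \frac{\cos \Delta x + 1}{\cos \Delta x + 1} \\
&= \frac{\cos^2 \Delta x - 1}{\Delta x (\cos \Delta x + 1)} \\
&= \frac{-\sin^2 \Delta x}{\Delta x (\cos \Delta x + 1)} \\
&= -\frac{\sin \Delta x}{\Delta x} \cdot \frac{\sin \Delta x}{\cos \Delta x + 1} \\
&\longrightarrow -1 \cdot 0 = 0 \quad \text{as } \Delta x \longrightarrow 0.
\end{aligned}$$
Hence $\dfrac{d}{dx}(\sin x) = \cos x$, provided that x is in radians.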
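6 $\dfrac{d}{dx}(\cos x) = -\sin x$, by a similar calculation.
7 $\dfrac{d}{dx}(e^x) = e^x$
This rule requires some explanation. If we apply the definition of the derivative, this is
what we find:
$$\begin{aligned}
\frac{d}{dx}(e^x) &= \lim_{\Delta x \to 0} \frac{e^{x + \Delta x} - e^x}{\Delta x} \\
&= \lim_{\Delta x \to 0} e^x \, \frac{e^{\Delta x} - 1}{\Delta x} \\
&= e^x \lim_{\Delta x \to 0} \frac{e^{\Delta x} - 1}{\Delta x}.
\end{aligned}$$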
This is a very special limit. For simplicity let’s write it as $\lim_{h \to 0} \frac{e^h - 1}{h}$. There’s no obvious way to simplify the expression, and at the moment we have no techniques for evaluating it. However, you can see that in order to get the result we want, the limit needs to be equal to 1. In fact, the number e is defined to be the number which makes this particular limit 1, so that $\frac{d}{dx}(e^x) = e^x$. That’s an implicit definition; to find an explicit one observe that
$$\text{if } \frac{e^h - 1}{h} \approx 1, \quad \text{then } e^h - 1 \approx h, \quad \text{so } e^h \approx 1 + h, \quad \text{and so } e \approx (1 + h)^{1/h}.$$
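To see these approximations at work, here is a small numerical sketch (our own illustration; the function names `limit_quotient` and `e_estimate` are hypothetical). It tabulates both the quotient (e^h − 1)/h and the estimate (1 + h)^(1/h) for shrinking values of h.

```python
import math

def limit_quotient(h):
    """The quotient (e^h - 1)/h, which should approach 1 as h -> 0."""
    return (math.exp(h) - 1) / h

def e_estimate(h):
    """The explicit approximation (1 + h)^(1/h), which should approach e as h -> 0."""
    return (1 + h) ** (1 / h)

for h in (0.1, 0.01, 0.001, 0.0001):
    print(f"h = {h:<6}  (e^h - 1)/h = {limit_quotient(h):.6f}  "
          f"(1 + h)^(1/h) = {e_estimate(h):.6f}")

print(f"math.e = {math.e:.6f}")
```

Both columns settle down slowly (the error shrinks roughly in proportion to h), which is a hint that this is not an efficient way to compute e, only a way to see where the number comes from.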
Next we have the big three rules for differentiating combinations of functions:
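8 The Product Rule: $\dfrac{d}{dx}[f(x) g(x)] = \dfrac{df}{dx} \cdot g + f \cdot \dfrac{dg}{dx}$
9 The Quotient Rule: $\dfrac{d}{dx}\left[\dfrac{f(x)}{g(x)}\right] = \dfrac{1}{[g(x)]^2}\left(\dfrac{df}{dx} \cdot g - f \cdot \dfrac{dg}{dx}\right)$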
This, of course, is what the rule looks like in Lagrange’s notation, which should be
sufficient for our needs for now. As we proceed, though, you will need to become equally
$$\frac{dy}{dx} = \frac{dy}{du} \frac{du}{dx}.$$
Note that $\frac{dy}{du}$ is $f'(u)$, which is $f'(g(x))$, while $\frac{du}{dx}$ is $g'(x)$, so this is indeed the same rule!
You can see that in the Leibniz notation it looks as though if we work backwards the
differential du cancels out, as if our derivatives were fractions. This is, in fact, exactly
what’s happening, except that the cancellation occurs within a limit! That apparent
cancellation is a reflection of an actual cancellation used in the proof of the chain rule:
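$$\begin{aligned}
\frac{dy}{dx} &= \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} \\
&= \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta u} \frac{\Delta u}{\Delta x}, \quad \text{(as long as } \Delta u \neq 0\text{)}.
\end{aligned}$$
Now, if $\frac{du}{dx}$ exists (that is, if $\lim_{\Delta x \to 0} \frac{\Delta u}{\Delta x}$ exists), then this can be expressed as
$$\lim_{\Delta x \to 0} \frac{\Delta y}{\Delta u} \; \lim_{\Delta x \to 0} \frac{\Delta u}{\Delta x}$$
and ∆u must approach 0 as ∆x → 0 (again, assuming that $\frac{du}{dx}$ exists). Therefore this is
$$\lim_{\Delta u \to 0} \frac{\Delta y}{\Delta u} \; \lim_{\Delta x \to 0} \frac{\Delta u}{\Delta x} = \frac{dy}{du} \frac{du}{dx},$$
as expected.$^{16}$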
A second advantage of the Leibniz notation is that it allows us to write the rules for
$$\frac{dy}{dz} = \frac{dy}{dx} \frac{dx}{ds} \frac{ds}{dt} \frac{dt}{dz}.$$
With these ten rules, we can calculate derivatives for almost any other functions we need.
11 $\dfrac{d}{dx}(\tan x) = \dfrac{d}{dx}\left(\dfrac{\sin x}{\cos x}\right) = \dfrac{\cos^2 x + \sin^2 x}{\cos^2 x} = \dfrac{1}{\cos^2 x} = \sec^2 x$, so $\dfrac{d}{dx}(\tan x) = \sec^2 x$.
12 Similarly, $\dfrac{d}{dx}(\cot x) = -\csc^2 x$.
16 This “proof” is not quite complete. The assumption “as long as $\Delta u \neq 0$” needs to be explored.
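13 $\dfrac{d}{dx}(\sec x) = \dfrac{d}{dx}\left(\dfrac{1}{\cos x}\right) = \dfrac{d}{dx}\left[(\cos x)^{-1}\right] = -(\cos x)^{-2}(-\sin x) = \dfrac{\sin x}{\cos^2 x} = \dfrac{1}{\cos x} \cdot \dfrac{\sin x}{\cos x} = \sec x \tan x$, so $\dfrac{d}{dx}(\sec x) = \sec x \tan x$.
14 Similarly, $\dfrac{d}{dx}(\csc x) = -\csc x \cot x$.
15 $\dfrac{d}{dx}(\cosh x) = \dfrac{d}{dx}\left(\dfrac{e^x + e^{-x}}{2}\right) = \dfrac{1}{2}\left(e^x - e^{-x}\right) = \sinh x$, so $\dfrac{d}{dx}(\cosh x) = \sinh x$.
16 Similarly, $\dfrac{d}{dx}(\sinh x) = \cosh x$.
17 A calculation similar to that for the tangent function shows that $\dfrac{d}{dx}(\tanh x) = \operatorname{sech}^2 x$.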
etc.
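If you’d like to convince yourself of rules 11 through 17 without redoing the algebra, a numerical spot check is one option. The sketch below (our own illustration; the helper `central_diff`, the sample point `x0 = 0.7`, and the step size are arbitrary assumptions) compares each stated derivative against a central-difference estimate.

```python
import math

def central_diff(func, x, h=1e-5):
    """Symmetric difference quotient: a numerical stand-in for the derivative."""
    return (func(x + h) - func(x - h)) / (2 * h)

# Each entry pairs a function with the derivative claimed in the list above.
rules = {
    "tan":  (math.tan,                   lambda x: 1 / math.cos(x) ** 2),           # sec^2 x
    "cot":  (lambda x: 1 / math.tan(x),  lambda x: -1 / math.sin(x) ** 2),          # -csc^2 x
    "sec":  (lambda x: 1 / math.cos(x),  lambda x: math.tan(x) / math.cos(x)),      # sec x tan x
    "csc":  (lambda x: 1 / math.sin(x),  lambda x: -1 / (math.sin(x) * math.tan(x))),  # -csc x cot x
    "cosh": (math.cosh,                  math.sinh),
    "sinh": (math.sinh,                  math.cosh),
    "tanh": (math.tanh,                  lambda x: 1 / math.cosh(x) ** 2),          # sech^2 x
}

x0 = 0.7
errors = {name: abs(central_diff(func, x0) - dfunc(x0))
          for name, (func, dfunc) in rules.items()}

for name, err in errors.items():
    print(f"{name:>4}: |numerical - formula| = {err:.2e}")
```

A check like this cannot prove a formula, but a sign error or a missing factor would show up immediately as a large discrepancy.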
For functions which are defined implicitly, such as the functions defined by the equation
$x^2 + y^2 = 4$, we might try solving for y (or x) explicitly before using the formulas above.
There is another way to proceed, though (which is fortunate, since it isn’t always possible to
All we need to do is “view” y as a function of x, and differentiate both sides of the equation
Example:
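$$\begin{aligned}
x^2 + [y(x)]^2 &= 4 \\
\implies \frac{d}{dx}\left[x^2 + [y(x)]^2\right] &= \frac{d}{dx}(4) \\
\implies 2x + 2y(x)\frac{dy}{dx} &= 0 \\
\implies \frac{dy}{dx} &= -\frac{x}{y} \quad (\text{for } y \neq 0)
\end{aligned}$$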
Of course, we could now replace y with $\pm\sqrt{4 - x^2}$, if we wish.
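As a rough check of the result dy/dx = −x/y, we can compare it with a numerical derivative of the explicit upper branch y = √(4 − x²). This is only a sketch; the sample point x = 1 and the function names are our own assumptions.

```python
import math

def upper_branch(x):
    """Explicit solution y = +sqrt(4 - x^2) of x^2 + y^2 = 4."""
    return math.sqrt(4 - x ** 2)

def implicit_slope(x, y):
    """Slope from implicit differentiation: dy/dx = -x / y (valid for y != 0)."""
    return -x / y

x0 = 1.0
y0 = upper_branch(x0)          # the point (1, sqrt(3)) lies on the circle
h = 1e-6
numeric_slope = (upper_branch(x0 + h) - upper_branch(x0 - h)) / (2 * h)

print(f"implicit formula: {implicit_slope(x0, y0):.6f}")
print(f"numerical slope:  {numeric_slope:.6f}")
```

The same formula applied at (1, −√3) gives the slope on the lower branch, with no need to differentiate a second explicit expression.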
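$$\begin{aligned}
e^{xy}\left(y + x\frac{dy}{dx}\right) &= 1 + \frac{dy}{dx} \\
\implies y e^{xy} + x e^{xy}\frac{dy}{dx} &= 1 + \frac{dy}{dx} \\
\implies \frac{dy}{dx} &= \frac{1 - y e^{xy}}{x e^{xy} - 1}.
\end{aligned}$$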
Unfortunately we can’t do anything about the fact that our expression involves both x and y.
Nevertheless, if we can identify a point on the curve, then we should be able to find the slope
at that point. For example, we can see that the point (0, 1) is on the curve, and the slope
there is zero.
Suppose we know the derivative of an invertible function f (x). What does that tell us about
the derivative of its inverse? Well, let’s start by writing $y = f^{-1}(x)$. We wish to determine $\frac{dy}{dx}$. We can use the implicit differentiation technique we’ve just discussed to find it! We know that
$$\begin{aligned}
x &= f(y), \\
\text{so} \quad \frac{d}{dx}(x) &= \frac{d}{dx}(f(y)) \\
\implies 1 &= f'(y)\frac{dy}{dx} \\
\implies \frac{dy}{dx} &= \frac{1}{f'(y)} \\
&= \frac{1}{f'(f^{-1}(x))}.
\end{aligned}$$
This formula looks a bit cumbersome, but consider what happens if we write it in Leibniz
notation. Our second-last line, $\frac{dy}{dx} = \frac{1}{f'(y)}$, can be rewritten as
$$\frac{dy}{dx} = \frac{1}{dx/dy},$$
which is a marvelously simple result!
By mimicking this procedure, we can prove a few more formulas to add to our list:
16 Suppose $y = \ln x$. Then $x = e^y$, so $1 = e^y \frac{dy}{dx}$, and so $\frac{dy}{dx} = \frac{1}{e^y} = \frac{1}{x}$.
That is, $\dfrac{d}{dx} \ln x = \dfrac{1}{x}$.
dx x
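Therefore $1 = \cos y \, \frac{dy}{dx}$, so
$$\frac{dy}{dx} = \frac{1}{\cos y} = \frac{1}{\sqrt{1 - \sin^2 y}} = \frac{1}{\sqrt{1 - x^2}}$$
(we know that we need the positive root, since $\cos y \ge 0$ when $y \in \left[-\frac{\pi}{2}, \frac{\pi}{2}\right]$).
Hence $\dfrac{d}{dx} \sin^{-1} x = \dfrac{1}{\sqrt{1 - x^2}}$.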
18 Similarly, $\dfrac{d}{dx} \cos^{-1} x = -\dfrac{1}{\sqrt{1 - x^2}}$ (try proving this to see where the negative sign comes from).
19 Using the same strategy with the arctangent function yields an extremely useful rule: $\dfrac{d}{dx} \tan^{-1} x = \dfrac{1}{1 + x^2}$.
Caution: Our formula for the derivative of an inverse, $\frac{dy}{dx} = \frac{1}{dx/dy}$, suggests that we can
treat derivatives as fractions. This is usually true, but only for first-order deriva-
It is not hard to find a counterexample to prove this claim. Consider:
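If $y = e^x$ then $x = \ln y$, so $\frac{dy}{dx} = e^x$ while $\frac{dx}{dy} = \frac{1}{y} = \frac{1}{e^x}$, but $\frac{d^2 y}{dx^2} = e^x$ whereas $\frac{d^2 x}{dy^2} = -\frac{1}{y^2} = -\frac{1}{e^{2x}}$.
So, if $\frac{d^2 y}{dx^2} \neq \frac{1}{d^2 x / dy^2}$, what should the formula be? It’s unlikely that you’ll ever have to use it, but finding it is a useful exercise in using the chain rule:
$$\frac{d^2 y}{dx^2} = \frac{d}{dx}\left(\frac{dy}{dx}\right) = \frac{d}{dx}\left(\frac{1}{dx/dy}\right).$$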
This is now asking for the derivative with respect to x of a function of y (or rather, y (x)),
which is exactly what the Chain Rule is for. We differentiate with respect to y, and multiply
by dy/dx:
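$$\begin{aligned}
\frac{d}{dx}\left(\frac{1}{dx/dy}\right) &= \frac{d}{dy}\left(\frac{1}{dx/dy}\right) \cdot \frac{dy}{dx} \\
&= -\frac{1}{(dx/dy)^2} \cdot \frac{d^2 x}{dy^2} \cdot \frac{dy}{dx} \quad \text{(Chain Rule again!)} \\
&= -\frac{1}{(dx/dy)^3} \frac{d^2 x}{dy^2}.
\end{aligned}$$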
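As a quick sanity check (our own worked example, reusing $y = e^x$ from the counterexample above, for which $\frac{dx}{dy} = \frac{1}{y}$ and $\frac{d^2x}{dy^2} = -\frac{1}{y^2}$), we can confirm that the formula does reproduce $\frac{d^2y}{dx^2} = e^x$:

```latex
-\frac{1}{(dx/dy)^3}\,\frac{d^2x}{dy^2}
  = -\frac{1}{(1/y)^3}\left(-\frac{1}{y^2}\right)
  = y^3 \cdot \frac{1}{y^2}
  = y
  = e^x .
```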
You do NOT need to know this formula, but the steps are quite similar to certain calculations
How might we find the derivative of a function such as $(\cos x)^x$? First of all, we need to realize that the formulas we’ve discussed so far do not allow us to calculate it straightforwardly (the formula $\frac{d}{dx} x^k = k x^{k-1}$ requires that k be a constant, so we cannot simply write $y' = x (\cos x)^{x-1}$).
The trick is to apply logarithms, in order to displace the exponent:
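$$\begin{aligned}
y &= (\cos x)^x \\
\implies \ln y &= \ln\left[(\cos x)^x\right] \\
&= x \ln(\cos x) \\
\implies \frac{1}{y}\frac{dy}{dx} &= \ln(\cos x) + x \, \frac{1}{\cos x}(-\sin x) \\
\implies \frac{dy}{dx} &= (\cos x)^x \left[\ln(\cos x) - x \tan x\right].
\end{aligned}$$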
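The following sketch compares this result with a numerical derivative, and also with the tempting but incorrect “power rule” guess x(cos x)^(x−1). The sample point x = 0.5 (where cos x > 0, so the logarithm is defined) and the function names are our own assumptions.

```python
import math

def y(x):
    """y = (cos x)^x, defined here only where cos x > 0."""
    return math.cos(x) ** x

def dy_dx(x):
    """Result of logarithmic differentiation: (cos x)^x [ln(cos x) - x tan x]."""
    return y(x) * (math.log(math.cos(x)) - x * math.tan(x))

def wrong_power_rule(x):
    """Misapplied power rule, treating the exponent x as if it were a constant."""
    return x * math.cos(x) ** (x - 1)

x0 = 0.5
h = 1e-6
numeric = (y(x0 + h) - y(x0 - h)) / (2 * h)   # central-difference estimate

print(f"logarithmic differentiation: {dy_dx(x0):.6f}")
print(f"numerical derivative:        {numeric:.6f}")
print(f"misapplied power rule:       {wrong_power_rule(x0):.6f}")
```

The first two values agree closely, while the third does not even have the right sign at this point.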
We could, in fact, use this technique to derive a formula for the derivative of a function of the
Of course, in practice you won’t often encounter functions of this form. This technique,
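$$\begin{aligned}
\ln y &= \ln\left[\frac{x^2 \sqrt[3]{7x - 14}}{(1 + x^2)^4}\right] \\
&= \ln x^2 + \ln\left[7(x - 2)\right]^{1/3} - \ln\left(1 + x^2\right)^4 \\
&= 2 \ln x + \frac{1}{3}\ln 7 + \frac{1}{3}\ln(x - 2) - 4\ln\left(1 + x^2\right)
\end{aligned}$$
so
$$\frac{1}{y}\frac{dy}{dx} = \frac{2}{x} + \frac{1}{3(x - 2)} - \frac{8x}{1 + x^2}$$
and hence
$$\frac{dy}{dx} = \frac{x^2 \sqrt[3]{7x - 14}}{(1 + x^2)^4}\left[\frac{2}{x} + \frac{1}{3(x - 2)} - \frac{8x}{1 + x^2}\right].$$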
Comment: Technically these steps are only valid for x > 2, since we may only apply
the logarithm to positive numbers. However, the result is also valid for x < 2, and this is not
just a “fluke”. One extra step would make this clear: we’d just need to apply absolute values
Logarithmic differentiation also allows us to derive one more pair of differentiation rules:
20 Consider $f(x) = a^x$, where a can be any positive number. To find its derivative, let’s write
$$\begin{aligned}
y &= a^x, \\
\ln y &= x \ln a \\
\implies \frac{1}{y}\frac{dy}{dx} &= \ln a \\
\implies \frac{dy}{dx} &= y \ln a = (\ln a)\, a^x.
\end{aligned}$$
Therefore $\dfrac{d}{dx}(a^x) = (\ln a)\, a^x$.
21 Using the previously-discussed method for dealing with inverse functions, we can show that $\dfrac{d}{dx}(\log_a x) = \dfrac{1}{(\ln a)\, x}$.
17 Theorems
$$\begin{aligned}
\lim_{x \to a}\left[f(x) - f(a)\right] &= \lim_{x \to a}(x - a) \; \lim_{x \to a}\frac{f(x) - f(a)}{x - a} \quad \text{(true if both limits exist)} \\
&= 0.
\end{aligned}$$
Theorem: If f (x) is continuous on [a, b] and differentiable on (a, b), then there exists a
Note that there may be more than one such number. The graphical interpretation is that f
attains its average slope at least once in (a, b), which we hope seems intuitive. This particular
theorem can be very useful for proving other theorems in calculus. For example, we can prove
the following:
Corollary: If $f'(x) = 0$ for all x ∈ (a, b), then f (x) is constant on (a, b).
Proof: Let $x_1$ and $x_2$ be any two distinct points in (a, b). By the Mean Value Theorem, there is a number c between them such that
$$f'(c) = \frac{f(x_1) - f(x_2)}{x_1 - x_2}.$$
But we’ve assumed that $f'(c) = 0$, so $f(x_1) = f(x_2)$. Since $x_1$ and $x_2$ are arbitrary, f has
give you some practice with this in the homework and assignments.
Note: you will also see the name spelled “L’Hospital’s Rule”. This is how Guillaume de
l’Hospital, after whom the result is named, spelled his own name, but in modern French
spelling the silent “s” is replaced by the circumflex over the preceding vowel. The “H” is also
18 Related Rates
Here’s a very direct application of the chain rule: if we know that two quantities are related
(let’s say that y = f (x)), and each of those two quantities is changing in time, then the rates
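$$\begin{aligned}
y(t) &= f(x(t)) \\
\implies \frac{d}{dt}\left[y(t)\right] &= \frac{d}{dt}\left[f(x(t))\right] \\
\implies \frac{dy}{dt} &= \frac{df}{dx}\frac{dx}{dt}.
\end{aligned}$$
Note that $\frac{df}{dx}$ will not usually be constant, so to find $\frac{dy}{dt}$, we will usually need to know both $\frac{dx}{dt}$ and t (from which we can determine x, and hence determine $\frac{df}{dx}$).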
Example: Suppose a spherical balloon is being inflated at a rate of 1 litre per minute. How
quickly is the diameter of the balloon increasing at the moment when the diameter is 10 cm?
Solution: Uh-oh - we have a word problem. Don’t panic; we just need to translate it
into mathematics! We are given information about volume and diameter, so let’s call those
V and d. We need to consider the relationship between them; we know that for a sphere,
$V = \frac{4}{3}\pi r^3$, where r is the radius, and of course $r = \frac{d}{2}$, so $V = \frac{\pi d^3}{6}$. Differentiating both sides with respect to time, t, we find that
$$\frac{dV}{dt} = \frac{\pi d^2}{2}\frac{dd}{dt}.$$
We are given both $\frac{dV}{dt}$ and d, so all we have to do is plug them in and solve for $\frac{dd}{dt}$. There’s
just one catch: we have to take some care that we use compatible units. Since 1 litre is a cubic
$$1 = \frac{\pi}{2}\frac{dd}{dt},$$
so $\frac{dd}{dt} = \frac{2}{\pi}$ dm/min, or $\frac{20}{\pi}$ cm/min if we’d rather not use decimeters for the final result.
Comment: We can also see that this rate of change will be slowing rapidly, since with $\frac{dV}{dt}$ fixed at 1 L/min, we have $\frac{dd}{dt} = \frac{2}{\pi d^2}$.
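The unit bookkeeping is the step most likely to go wrong, so here is a small sketch that keeps everything in decimetres (1 L = 1 dm³) and converts only at the end. The function name `diameter_rate` is our own, and the extra diameters in the loop are illustrative values, not part of the original problem.

```python
import math

def diameter_rate(dV_dt, d):
    """Solve dV/dt = (pi d^2 / 2) dd/dt for dd/dt.
    dV_dt is in dm^3/min (litres per minute) and d is in dm, so the result is in dm/min."""
    return 2 * dV_dt / (math.pi * d ** 2)

rate_dm = diameter_rate(1.0, 1.0)   # 1 L/min, diameter 10 cm = 1 dm
rate_cm = rate_dm * 10              # convert dm/min to cm/min

print(f"dd/dt = {rate_dm:.4f} dm/min = {rate_cm:.4f} cm/min")

# The rate falls off like 1/d^2 as the balloon grows.
for d_dm in (1.0, 2.0, 4.0):
    print(f"d = {d_dm} dm: dd/dt = {diameter_rate(1.0, d_dm):.4f} dm/min")
```

Doubling the diameter cuts the rate of growth to a quarter, which is the “slowing rapidly” behaviour noted in the comment above.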
The essence of every one of these problems is the same; we’re just differentiating everything
with respect to time. The challenge will typically lie in the “translation to mathematics”, but
• Read the question carefully (what information are you given, and what information are
you being asked for?). Give names to the quantities, and be aware that some will be
• Determine the relationship between the variables. This may come from a standard
formula, as in the example above, or it might be based on the specific geometry of the
Once you have that relationship, all you have to do is differentiate both sides of the equation
with respect to time, applying the chain rule. The last step is just plugging numbers in, with
km/hr. An observer on the ground sees it pass directly overhead, and watches it travel away
from him. When the angle of elevation reaches 45◦ , how quickly is it decreasing?
Solution: We are given an altitude of h = 10 km (constant). We are also told that the
horizontal component of the distance from the observer (let’s call this x) is changing at the
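rate $\frac{dx}{dt} = 900$ km/hr. Finally, we are given that the angle of elevation (θ) is $\frac{\pi}{4}$ radians (since we’re never going to use degrees in a calculus problem), and we are asked to find $\frac{d\theta}{dt}$. So...
how are x and θ related? A diagram might help here.
We can see that $\tan\theta = \frac{h}{x}$, so differentiating with respect to time tells us that
$$\sec^2\theta \, \frac{d\theta}{dt} = -\frac{h}{x^2}\frac{dx}{dt}.$$
When $\theta = \frac{\pi}{4}$, we have $x = \frac{h}{\tan\theta} = 10$ km, and so
$$\begin{aligned}
\frac{d\theta}{dt} &= -\frac{h \cos^2\theta}{x^2}\frac{dx}{dt} \\
&= -\frac{(10 \text{ km})\left(\frac{1}{2}\right)}{100 \text{ km}^2}\,(900 \text{ km/hr}) \\
&= -45 \text{ rad/hr}.
\end{aligned}$$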
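A result like −45 rad/hr is worth double-checking. The sketch below (our own, with hypothetical names `elevation` and `elevation_rate`) evaluates the formula and compares it against a direct numerical derivative of θ(t) = arctan(h / x(t)), letting x move at 900 km/hr over a very short time step.

```python
import math

H = 10.0        # altitude in km (constant)
SPEED = 900.0   # dx/dt in km/hr

def elevation(x):
    """Angle of elevation in radians when the horizontal distance is x km."""
    return math.atan2(H, x)

def elevation_rate(x):
    """d(theta)/dt in rad/hr, from sec^2(theta) d(theta)/dt = -(h / x^2) dx/dt."""
    theta = elevation(x)
    return -H * math.cos(theta) ** 2 / x ** 2 * SPEED

x_45 = H / math.tan(math.pi / 4)   # horizontal distance when theta = pi/4
rate = elevation_rate(x_45)

# Independent check: central difference of theta over a tiny time step (in hours).
dt = 1e-7
numeric = (elevation(x_45 + SPEED * dt) - elevation(x_45 - SPEED * dt)) / (2 * dt)

print(f"formula:   {rate:.6f} rad/hr")
print(f"numerical: {numeric:.6f} rad/hr")
print(f"in degrees per second: {math.degrees(rate) / 3600:.4f}")
```

Converting to degrees per second at the end makes the size of the answer easier to picture, while the calculus itself stays in radians.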
19 Differentials
As we mentioned earlier in the course, the expressions ∆x and ∆y are used to denote small
increments in the values of x and y. We also use the expressions dx and dy to represent
these quantities in limits as they approach zero. For example, when we introduce the definite
$$\lim_{\Delta x \to 0} \sum_{i=1}^{n} f(x_i^*)\,\Delta x = \int_a^b f(x)\,dx,$$
$$\lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} = \frac{dy}{dx}.$$
Let’s explore this definition of the derivative a bit further. Assuming that the increment
∆x is small, we know that $\frac{\Delta y}{\Delta x} \approx \frac{dy}{dx}$. That is (if we switch to Lagrange’s notation),
$$\frac{\Delta y}{\Delta x} \approx f'(x).$$
Well, we can write this as $\Delta y \approx f'(x)\,\Delta x$ (since ∆y and ∆x actually are separate finite
$$dy = f'(x)\,dx.$$
The expression $f'(x)\,dx$ is called the differential of f (and we may also write it as df instead of dy). Essentially what we’ve done is provide some justification for treating the expression $\frac{dy}{dx}$ as a fraction. The differentials dy and dx are still meaningless in isolation; they are defined by their relationship to each other. Nevertheless, we can manipulate them as separate quantities
As a rule of thumb, given expressions involving differentials (dx, dy, dz, ds, etc.), we can
treat them as separate quantities as long as we put everything back into sensible expressions
17 As a simple example, we can see an immediate connection to our discussion of related rates: if we divide the expression above by a differential dt, we get $\frac{dy}{dt} = f'(x)\frac{dx}{dt}$.
when we’re finished. For example, consider the chain rule. Given the derivative $\frac{dy}{dx}$, we can imagine dy and dx to be distinct quantities, and we can multiply and divide by a third quantity du:
$$\frac{dy}{dx} = \frac{dy}{dx}\frac{du}{du} = \frac{dy}{du}\frac{du}{dx},$$
and this gives a valid result as long as we’ve defined the variable u appropriately and as long as the derivatives $\frac{dy}{du}$ and $\frac{du}{dx}$ both exist. We’ll manipulate differentials in a similar way when we study integration; the differential dx in the expression $\int_a^b f(x)\,dx$ can be treated as a finite quantity, as long as when we’re finished “playing with it” we end up with a sensible integral expression.
The notation of differentials is sometimes used in a different way. Consider a function f (x),
on an interval [x0 , x0 + ∆x]. When x increases by the amount ∆x, the value of f will increase
by an amount ∆f , and we know that we can use the tangent line to f at x0 to find an
approximation for this change. The straightforward way is to find the equation of the tangent
line to f at x0 , and evaluate it at x0 + ∆x, but we can be more efficient than that.
Since $\frac{\Delta f}{\Delta x} \approx f'(x_0)$, we know that $\Delta f \approx f'(x_0)\,\Delta x$. This allows us to conclude that
In practice, it is helpful to have a separate notation for the approximate change, $f'(x_0)\,\Delta x$, and we traditionally use the differentials for this purpose$^{18}$. That is, while we’re using the increments ∆x and ∆f for the horizontal and vertical components of motion along the actual curve y = f (x), we’ll use the differentials dx and df to represent the corresponding quantities for the tangent line. Of course, the change in x is the same for both, so we’re setting dx = ∆x.
[Figure 37: the curve y = f(x) and its tangent line at x0, comparing the differential df = f′(x0) dx with the actual change ∆f = f(x0 + ∆x) − f(x0) over the increment ∆x = dx.]
The idea is that if we’re interested in finding the change ∆f , we can find a quick approxi-
Example: Consider a square of side length x. Its area is $A(x) = x^2$. Suppose we increase x
by a small amount ∆x. Since the differential is $dA = 2x\,dx$, we can conclude that the change
$$\begin{aligned}
\Delta A &= A(x + \Delta x) - A(x) \\
&= (x + \Delta x)^2 - x^2 \\
&= x^2 + 2x\,\Delta x + (\Delta x)^2 - x^2 \\
&= 2x\,\Delta x + (\Delta x)^2.
\end{aligned}$$
You can see that the error in our differential approximation is $(\Delta x)^2$, which will be very small
[Figure 38: a square of side x enlarged by ∆x; the added area consists of two strips of area x∆x and a corner of area (∆x)².]
This is, admittedly, not quite as useful as it was 50 years ago! If we have a calculator handy,
and we have numbers, then we won’t need the approximation. For example, if x = 20 cm and
∆x = 0.1 cm, then we can immediately calculate the new area as $(20.1 \text{ cm})^2 = 404.01 \text{ cm}^2$ (and
finding the approximate value of $404 \text{ cm}^2$ actually takes a few seconds longer!). However, as
you’ll see in the assignments, the idea is still useful when we are not dealing with specific
numbers.
Example: Use differentials to approximate $\sqrt{78}$.
Solution: Realizing that $\sqrt{81} = 9$, we identify $f(x) = \sqrt{x}$, $x_0 = 81$, and $\Delta x = dx = -3$.
We then calculate df:
$$df = f'(x)\,dx = \frac{1}{2\sqrt{x}}\,dx = \frac{1}{2\sqrt{81}} \cdot (-3) = -\frac{1}{6}.$$
Therefore
$$\sqrt{78} = \sqrt{81} + \Delta f \approx \sqrt{81} + df = 9 - \frac{1}{6} = 8\tfrac{5}{6} \approx 8.833.$$
(Compare this to the calculator value of $\sqrt{78} \approx 8.83176...$)
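The same calculation can be packaged as a small reusable routine. This is only a sketch: the helper name `linear_estimate` is our own, and it simply evaluates f(x0) + f′(x0) dx for whatever function and derivative we hand it.

```python
import math

def linear_estimate(f, df, x0, dx):
    """Tangent-line (differential) estimate: f(x0 + dx) is roughly f(x0) + f'(x0) * dx."""
    return f(x0) + df(x0) * dx

def sqrt_derivative(x):
    """Derivative of sqrt(x), namely 1 / (2 sqrt(x))."""
    return 1 / (2 * math.sqrt(x))

approx = linear_estimate(math.sqrt, sqrt_derivative, 81.0, -3.0)
exact = math.sqrt(78.0)

print(f"differential estimate: {approx:.5f}")
print(f"calculator value:      {exact:.5f}")
print(f"error:                 {abs(approx - exact):.5f}")
```

Choosing x0 = 81 works well because the square root is known exactly there and the step dx = −3 is small compared with 81.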
Example: A metal sphere of radius 10cm is to be coated with a layer of silver, 0.02 cm
Solution: We essentially want to know the change in the volume of a sphere when the
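$$V(r) = \frac{4}{3}\pi r^3,$$
$$dV = 4\pi r^2\,\Delta r = 4\pi (10)^2 (0.02) = 8\pi \approx 25.133 \text{ cm}^3.$$
For comparison, the exact value should be $\frac{4}{3}\pi r_1^3 - \frac{4}{3}\pi r_0^3 = \frac{4}{3}\pi\left(10.02^3 - 10^3\right) \approx 25.183 \text{ cm}^3$.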
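To see how good the estimate is, and how it degrades as the coating gets thicker, we can compare dV = 4πr² dr with the exact difference of volumes. The sketch below uses our own function names, and the thicker coatings in the loop are illustrative values, not part of the original example.

```python
import math

def sphere_volume(r):
    """Volume of a sphere of radius r."""
    return 4.0 / 3.0 * math.pi * r ** 3

def coating_volume_estimate(r, dr):
    """Differential estimate dV = 4 pi r^2 dr for a thin coating of thickness dr."""
    return 4.0 * math.pi * r ** 2 * dr

r, dr = 10.0, 0.02                                   # cm
estimate = coating_volume_estimate(r, dr)            # equals 8 pi
exact = sphere_volume(r + dr) - sphere_volume(r)
relative_error = abs(exact - estimate) / exact

print(f"estimate = {estimate:.3f} cm^3, exact = {exact:.3f} cm^3, "
      f"relative error = {relative_error:.2%}")

for thickness in (0.02, 0.2, 2.0):
    est = coating_volume_estimate(r, thickness)
    ex = sphere_volume(r + thickness) - sphere_volume(r)
    print(f"dr = {thickness:>4} cm: relative error = {abs(ex - est) / ex:.2%}")
```

The relative error grows with the ratio dr/r, which is why the approximation is described as valid only for a thin coating.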
To appreciate the advantage of the method of differentials, consider that we can state more
generally that for any initial radius r, and any desired thickness of silver coating dr, the
volume of silver required will be approximately dV = 4πr2 dr (as long as the coating is thin,
in relation to the base sphere). Realizing that this quantity is directly proportional to the
thickness of the coating, but proportional to the square of the radius of the sphere may be
20 Graphical Implications of the Derivative (Application to Curve-Sketching)
You probably already have some intuitive understanding of the terms “increasing”, “decreasing”,
“maximum”, and “minimum”, as they apply to the graphs of functions. However, it will be
Definition:
• If f is either increasing or decreasing on I (one or the other, but we don’t care or know
Now, consider the definition of the derivative of f with this new definition in mind. Since we
can write $f'(x_1) = \lim_{x_2 \to x_1} \frac{f(x_2) - f(x_1)}{x_2 - x_1}$, it follows that if $f'(x) > 0$ for all x in the interval I, then f (x) is increasing on I. Similarly, if $f'(x) < 0$, then f is decreasing on I.
Note that the converse is not quite true; if f is increasing we may still find that $f'(x) = 0$
Example: Consider the function $f(x) = x - \sin x$. Differentiating, we find that $f'(x) = 1 - \cos x$. Clearly $f'(x) \ge 0$ for all x, but equality holds only at the discrete points x = 0,
±2π, ±4π, etc. This does imply that f (x) is increasing for all x. The graph is plotted below.
[Figure 39: graph of f(x) = x − sin x.]
Definition:
• A function f has an absolute (or global) maximum at x0 if f (x0 ) ≥ f (x) for all x in the
domain of f .
• Similarly, if f (x0 ) ≤ f (x) for all x in its domain, then we say that f has an absolute (or
global) minimum at x0 .
For determining the shape of a graph, the following terms are more helpful:
• A function f has a local (or relative) maximum at x0 if there is a number h such that
• A function f has a local (or relative) minimum at x0 if there is a number h such that
“minimum” is “minima”. We also refer to them collectively as “extrema” (the singular of which
is “extremum”).
Now, a little thought should reveal that if the function under consideration is continuous,
then it can only possess a local maximum at x0 if it is increasing to the left of x0 and decreasing
to the right of x0 (↗ ↘). What does this tell us about the derivative of f at x0 ? Well,
if $f'$ changes from positive to negative as x increases through x0 , then there are only two
Of course, exactly the same observations can be made about local minima, which suggests
If we wish to find all of the local extrema of a continuous function f , we must first find all
neither, we may apply the First Derivative Test: if $f'$ changes from positive to negative as x
increases through x0 , then f must have a local maximum at x0 (↗ ↘), while if $f'$ changes
from negative to positive, then f must have a local minimum instead (↘ ↗).
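The First Derivative Test can be sketched in a few lines of code. In this illustration (our own; the function `classify` and the offset `h` are assumptions) we simply sample the sign of f′ a short distance to either side of a critical point, which is reliable only if f′ does not change sign again within that distance.

```python
def classify(df, x0, h=1e-3):
    """Classify a critical point x0 by the sign of the derivative df on each side."""
    left, right = df(x0 - h), df(x0 + h)
    if left > 0 and right < 0:
        return "local maximum"      # increasing, then decreasing
    if left < 0 and right > 0:
        return "local minimum"      # decreasing, then increasing
    return "neither"

def df_cubic(x):
    """Derivative of 2x^3 - 3x^2 - 12x + 5, which factors as 6(x - 2)(x + 1)."""
    return 6 * x ** 2 - 6 * x - 12

def df_x_cubed(x):
    """Derivative of x^3: zero at x = 0 but positive on both sides."""
    return 3 * x ** 2

for point in (-1, 2):
    print(f"x = {point}: {classify(df_cubic, point)}")
print(f"x = 0 for x^3: {classify(df_x_cubed, 0)}")
```

The last case is a reminder that a critical point need not be an extremum: the derivative must actually change sign.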
Examples:
• Consider the function $f(x) = 2x^3 - 3x^2 - 12x + 5$. Differentiating, we find that $f'(x) = 6x^2 - 6x - 12 = 6(x^2 - x - 2) = 6(x - 2)(x + 1)$. Therefore f has critical points where
x = −1 and x = 2. Furthermore, $f' > 0$ on (−∞, −1), $f' < 0$ on (−1, 2), and $f' > 0$
on (2, ∞). Hence we can conclude that the point (−1, 12) is a local maximum, while the
[Figure 40: graph of f(x) = 2x³ − 3x² − 12x + 5.]
• Consider the function $f(x) = x^{1/3}$. Here differentiating yields $f'(x) = \frac{1}{3}x^{-2/3}$, so $f'(0)$
does not exist. This makes (0, 0) a critical point, but it is in fact not an extremum, since
$$f'(x) = \frac{2}{3}(3x + 1)^{-1/3}(3) = \frac{2}{\sqrt[3]{3x + 1}}.$$
We can see that $f'(x)$ is undefined when x = −1/3. Furthermore, $f'(x) < 0$ when
x < −1/3, while $f'(x) > 0$ when x > −1/3, so f has a local minimum at (−1/3, 0).
[Figure 42: graph of f, showing the local minimum at (−1/3, 0).]
• Consider the function $f(x) = \frac{1}{x^2} = x^{-2}$. We have $f'(x) = -2x^{-3} = -\frac{2}{x^3}$, which is
positive when x is negative, and negative when x is positive. However, there is no
critical point here, since x = 0 is not in the domain of f ! In fact, even if we were to
94
re-define f as, say,
1,
2
if x 6= 0
f (x) = x
a
if x = 0
then the point (0, a) would still not be a maximum. In fact it would be a local minimum!
The point here is that all of our discussions regarding derivatives and their implications
7.5
2.5
Figure 43:
•
Recall the Extreme Value Theorem: if f (x) is continuous on a closed interval [a, b], then f
attains an absolute maximum value and a minimum value on that interval. A little thought
reveals that these absolute extrema must occur either at critical points of f on [a, b] or at the
endpoints x = a, x = b. So, all we need to do is find the critical points and compare their
values, both against each other and against f (a) and f (b) (it is not necessary to classify them
as local maxima or minima). This is commonly referred to as the Closed Interval Method for
Example: Find the absolute maximum and minimum values of $f(x) = x^3 - 3x + 1$ on the
interval $[0, 2]$.
Solution: Differentiating, we find $f'(x) = 3x^2 - 3 = 3(x + 1)(x - 1)$, so the only critical
value within the interval $[0, 2]$ is $x = 1$. At this point we find that $f(1) = -1$. Meanwhile,
at the endpoints we have $f(0) = 1$ and $f(2) = 3$. Comparing these, we conclude that the
absolute maximum value is 3 (at $x = 2$) and the absolute minimum value is $-1$ (at $x = 1$).
Comment: If the interval of interest is not closed, then the Extreme Value Theorem doesn’t
apply, and there may not be any absolute extrema. However, there may be, and if there are
then we can usually modify the Closed Interval Method by considering limits as we approach
each end of the interval. For example, consider the function $f(x) = \frac{x}{1 + x^2}$ on the interval
$(0, \infty)$. This has derivative $f'(x) = \frac{1 - x^2}{(1 + x^2)^2}$, so the only critical point within the interval
is at $x = 1$. We aren’t allowed to evaluate $f$ at zero (or, obviously, “at” infinity), but if we
consider that $\lim_{x \to 0^+} f(x) = 0$ and $\lim_{x \to \infty} f(x) = 0$, while $f(1) = 1/2$, it becomes clear that the
maximum value of $f$ on the given interval is 1/2. On the other hand, $f$ does not have a
minimum value (although we could state that 0 is a lower bound on the values of $f$).
Figure 44:
• If the concavity of f changes at a point (x0 , f (x0 )) (either from up to down or from
Considering our earlier discussion of the terms “increasing” and “decreasing”, we can see an
easy test for concavity: if $f''(x) > 0$ on $I$ then $f$ must be concave up, while if $f''(x) < 0$
then $f$ must be concave down. Also, just as critical points occur where $f'(x)$ is either zero
or undefined, we see that inflection points must occur either where $f''(x) = 0$ or $f''(x)$ does
not exist.
is either zero or undefined (and these are important because they are the possible locations
of maxima or minima), we do not have a corresponding term for points at which $f''$ is zero
or undefined. These are simply “possible inflection points”, or, to be more precise, they are
critical points of $f'$.
Examples:
• Consider $f(x) = x^{1/3}$. Here $f'(x) = \frac{1}{3}x^{-2/3}$, and $f''(x) = -\frac{2}{9}x^{-5/3}$, so we see that even
though $f$ is continuous everywhere, none of its derivatives are defined at $(0, 0)$. Therefore
this is a critical point. Is it an extremum? Well, applying the First Derivative Test, we
see that $f'$ is positive for $x < 0$ and for $x > 0$, so this is in fact not an extremum.
However, the second derivative $f''$ does change sign at 0, from positive to negative, so
In general, if we want to produce a reasonably accurate graph by hand, we can use information
from $f$, $f'$, and $f''$. From $f$ itself we can plot a point or two to “anchor” the graph (usually we
look for points at which the curve crosses the x or y axes), and it’s often useful to consider limits
and locate any extrema, and from $f''$ we can determine concavity and locate any inflection
points.
Solution:
$x^{2/3}(x - 1)$, we can see that $f = 0$ at $x = 0$ and $x = 1$ (these are the only “x-intercepts”), and
the only limits of interest are $\lim_{x \to \infty} f = \infty$ and $\lim_{x \to -\infty} f = -\infty$ (both of which you should be
able to see upon inspection).
Information from $f'$: Differentiating gives
$$f'(x) = \frac{5}{3}x^{2/3} - \frac{2}{3}x^{-1/3} = \frac{1}{3x^{1/3}}(5x - 2),$$
from which we can see that the only critical values are x = 0 and x = 2/5. To apply the first
derivative test and summarize the behaviour of the function on the intervals between the
critical points, some people find it helpful to display the signs of each factor of f 0 in a chart,
in this way:
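$$\begin{array}{c|ccc}
 & (-\infty, 0) & (0, \tfrac{2}{5}) & (\tfrac{2}{5}, \infty) \\ \hline
\frac{1}{3x^{1/3}} & - & + & + \\
(5x - 2) & - & - & + \\ \hline
f' & + & - & + \\
f & \nearrow & \searrow & \nearrow
\end{array}$$
Here the last line is intended to represent the deduction that $f$ must be increasing on
$(-\infty, 0)$ and $(\frac{2}{5}, \infty)$, and decreasing in between. From this we can conclude from the first
derivative test that the point $(0, 0)$ must be a local maximum, while the point $(\frac{2}{5}, -0.326\ldots)$
must be a local minimum.
$$f''(x) = \frac{10}{9}x^{-1/3} + \frac{2}{9}x^{-4/3} = \frac{2}{9x^{4/3}}(5x + 1).$$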
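From this we can see that the only possible locations of inflection points are at $x = -\frac{1}{5}$ and
$x = 0$. Again, we can use a chart to display the signs of $f''$ on each interval of interest, for
each factor:
$$\begin{array}{c|ccc}
 & (-\infty, -\tfrac{1}{5}) & (-\tfrac{1}{5}, 0) & (0, \infty) \\ \hline
\frac{2}{9x^{4/3}} & + & + & + \\
(5x + 1) & - & + & + \\ \hline
f'' & - & + & + \\
f & \frown & \smile & \smile
\end{array}$$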
Putting all of this information together, we arrive at the graph shown below.
Figure 45:
Comment: Some graphing software has difficulty in producing the graph of this function
(the entire left side of the graph may be omitted). The problem can usually be corrected by
rewriting the function in the form $f(x) = (x^5)^{1/3} - (x^2)^{1/3}$.
alternative to the First Derivative Test available to us. If x = x0 is a critical value of f , and
Of course, if $f''(x_0) = 0$, or if $f''(x_0)$ does not exist, then the test is of no help, so we must
rely on the First Derivative Test. For example, consider the functions $f(x) = \pm x^4$. You can
easily verify that $x = 0$ is a critical point for both functions, and that $f''(0) = 0$ for both
functions. Obviously, though, $x^4 > 0$ for all $x \neq 0$, while $-x^4 < 0$ for all $x \neq 0$, so the origin
As another example, consider that for the sketch above (Figure 45), we have only one way
In fact, the First Derivative Test will often be the easier of the two to apply, because the
calculation of the second derivative can be more trouble than it’s worth. The Second Derivative
Test will be a useful shortcut when we’re working with polynomials and other simple functions.
Part IV
Integral Calculus
Introduction
Integral calculus is actually much older than differential calculus. The essential idea is simply
that of taking a difficult problem, breaking it down into smaller, more manageable pieces, and
then putting the results together (“re-integrating” them!). There is one application which is
Consider a continuous, non-negative function f (x). How might we find the area between
y=f(x)
x=a x=b
Figure 46:
You should be able to see numerous ways in which we could find an approximation; we
could split the region up into rectangles and triangles, calculate the area of each one, and add
Rather than doing this haphazardly, though, we have a standard algorithm, which will
allow us to develop theorems and formulas (and to calculate such an area much more quickly).
First, we divide the interval [a, b] into n subintervals of equal length ∆x. We’ll label the
Figure 47: (the interval from $a$ to $b$, marked with the points $x_0, x_1, x_2, \ldots, x_n$)
In each interval $[x_{i-1}, x_i]$, we pick some value $x_i^*$ at which to evaluate $f$, and use this value
to define the height of a rectangle occupying that interval:
Figure 48:
We calculate the area of this rectangle. We then repeat the procedure for each of the other
Figure 49:
$$\text{Area} \approx A_1 + A_2 + \cdots + A_n$$
That is,
$$\text{Area} \approx \sum_{i=1}^{n} f(x_i^*)\,\Delta x$$
(this is called a Riemann Sum).
Example: Let’s find an approximate value for the area below the curve $y = x^2$, between
$x = 0$ and $x = 3$. For simplicity, we’ll use just three rectangles (of width 1), and base their
[Figure: the curve $y = x^2$ with sample points $x_1^* = 0.5$, $x_2^* = 1.5$, $x_3^* = 2.5$]
This gives
$$\text{Area} \approx f(0.5)(1) + f(1.5)(1) + f(2.5)(1) = 0.25 + 2.25 + 6.25 = 8.75.$$
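The three-rectangle estimate, using the sample points 0.5, 1.5 and 2.5, can be reproduced in a few lines. This is a minimal sketch; the function name `riemann_sum` and its `rule` parameter are illustrative choices, not notation used elsewhere in these notes.

```python
def riemann_sum(f, a, b, n, rule="midpoint"):
    """Approximate the area under f on [a, b] using n rectangles of equal width."""
    dx = (b - a) / n
    total = 0.0
    for i in range(1, n + 1):
        left = a + (i - 1) * dx          # left end of the i-th subinterval
        if rule == "left":
            x_star = left
        elif rule == "right":
            x_star = left + dx
        else:                            # default: midpoint of the subinterval
            x_star = left + dx / 2
        total += f(x_star) * dx          # area of one rectangle
    return total


estimate = riemann_sum(lambda x: x**2, 0, 3, 3)
print(estimate)
```

Changing `rule` to `"left"` or `"right"` shows how much the choice of sample point matters when only a few rectangles are used.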
Now, you might be thinking “couldn’t we have used more rectangles?”, and of course you’re
right; using more rectangles should give us a better approximation. In fact, we can take it one
$$\lim_{n \to \infty} \sum_{i=1}^{n} f(x_i^*)\,\Delta x.$$
Example: Let’s revisit our previous example, and try to find the exact area. Here’s what
we do:
(partition points: $0, \frac{3}{n}, \frac{6}{n}, \ldots, \frac{3i}{n}, \ldots, 3$)
• Choose $x_i^*$ to be $x_i$, say (that is, evaluate $f$ at the right endpoint of each interval)
• Let n → ∞.
This gives
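$$\begin{aligned}
\text{Area} &= \lim_{n \to \infty} \sum_{i=1}^{n} f(x_i^*)\,\Delta x \\
&= \lim_{n \to \infty} \sum_{i=1}^{n} \left(\frac{3i}{n}\right)^2 \frac{3}{n} \\
&= \lim_{n \to \infty} \sum_{i=1}^{n} \frac{27i^2}{n^3} \\
&= \lim_{n \to \infty} \frac{27}{n^3} \sum_{i=1}^{n} i^2 \\
&= \lim_{n \to \infty} \frac{27}{n^3} \cdot \frac{n(n + 1)(2n + 1)}{6} \quad \left(\text{this is a known formula for } \sum i^2\right) \\
&= 9
\end{aligned}$$
19 In fact, we can view this as the definition of the area of an irregularly shaped region.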
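To see the limit at work, one can compare the right-endpoint sum for $y = x^2$ on $[0, 3]$ with the closed form $\frac{27}{n^3} \cdot \frac{n(n+1)(2n+1)}{6}$ for increasing $n$. The sketch below uses function names chosen only for illustration.

```python
def right_endpoint_sum(n):
    """Right-endpoint Riemann sum for y = x^2 on [0, 3] with n rectangles."""
    dx = 3 / n
    return sum((i * dx) ** 2 * dx for i in range(1, n + 1))


def closed_form(n):
    """The same sum, simplified with the formula for 1^2 + 2^2 + ... + n^2."""
    return 27 * n * (n + 1) * (2 * n + 1) / (6 * n**3)


for n in (3, 10, 100, 1000):
    print(n, right_endpoint_sum(n), closed_form(n))
```

Both columns agree up to rounding, and both decrease towards 9 as $n$ grows.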
You can see that even with just three rectangles, we didn’t do too badly. This is partly because
we chose to use the midpoint of each rectangle; as a rule of thumb this tends to work best
Definition:
Let $f$ be continuous on the interval $[a, b]$. Partition $[a, b]$ into $n$ subintervals of equal length
$\Delta x = \frac{b - a}{n}$. Label the endpoints of the subintervals $x_i$, for $i = 0, \ldots, n$ (so that the $i$th interval
is $[x_{i-1}, x_i]$, $i = 1, \ldots, n$), and in each interval, select a point $x_i^*$. The definite integral of $f$ from
$a$ to $b$ is
$$\int_a^b f(x)\,dx = \lim_{n \to \infty} \sum_{i=1}^{n} f(x_i^*)\,\Delta x.$$
1 The symbol “$\int$” is an elongated “s” for “sum”. This should serve as an eternal reminder
of the definition; it may be helpful to think of the integral as a kind of sum, of infinitely
2 There are standard terms for the various elements within the integral. The “dx” is called
the differential (as we’ve already discussed), the function f (x) is called the integrand,
and the numbers a and b are called the limits of integration (although this usage of the
word limit is not consistent with the rest of mathematics; boundaries might be a more
appropriate word).
3 Our definition says nothing about areas! Instead, the integral can represent anything
that the product f (x) ∆x can represent. If you’re ever struggling to interpret an integral,
try looking at the units, and remember that the differential is part of the integral!
E.g. If $t$ is time, and $f(t)$ is the velocity of an object travelling in a straight line, then
$\int_a^b f(t)\,dt$ has units of (m/s)·s = m. It’s the distance travelled!
4 Because $\int_a^b f(x)\,dx$ is a number, the $x$ is often called a “dummy” variable. It is only
used for the calculation process, and so any other variable can be substituted, whenever
we wish it:
$$\int_a^b f(x)\,dx = \int_a^b f(t)\,dt = \int_a^b f(\theta)\,d\theta = \int_a^b f(\gamma)\,d\gamma = \ldots$$
Definite integrals have a number of useful properties. We won’t prove these here, but if
you think of the application to area, most of them should make sense:
Figure 51:
2 $\int_a^a f(x)\,dx = 0$ (The region has no width.)
3 $\int_a^b [f(x) \pm g(x)]\,dx = \int_a^b f(x)\,dx \pm \int_a^b g(x)\,dx$
4 $\int_a^b k f(x)\,dx = k \int_a^b f(x)\,dx$
5 $\int_b^a f(x)\,dx = -\int_a^b f(x)\,dx$ $\left(\text{because } \Delta x = \frac{a - b}{n} = -\frac{b - a}{n}\right)$
This property is hard to make sense of in terms of areas, but it will occasionally be
helpful; we can reverse the limits of integration at any time we wish, and just multiply
by −1 to compensate.
6 $\int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx$
When we’re dealing with areas, this statement just says that we can split areas into two:
However, we can easily use property 5 to show that this rule works even if the number
If $f(x) \geq g(x)$ for $x \in [a, b]$, then $\int_a^b f(x)\,dx \geq \int_a^b g(x)\,dx$.
Figure 53:
The Connection between Integral and Differential Calculus
Consider a function f (x), continuous and positive on the interval [a, b]. Let A (x) be the area
below the curve between x = a and an arbitrary point x in the interval [a, b]. Notice that the
To be precise, if we add a thin strip of area to the right, of width h, then the area of the
Figure 54:
It must also be approximately equal to $h f(x)$ (since it is nearly rectangular). For small
$$A(x + h) - A(x) \approx h f(x),$$
and so
$$\frac{A(x + h) - A(x)}{h} \approx f(x).$$
$$\lim_{h \to 0} \frac{A(x + h) - A(x)}{h} = f(x)$$
The rate of change of A (x) isn’t just related to f (x); it’s equal to it!
Example: As a very simple demonstration, suppose $f(x) = x$. The region below the line
$y = x$ is always a triangle, with area $A(x) = \frac{1}{2}x^2$. Sure enough, $\frac{dA}{dx} = x$.
Figure 55:
The relationship found above is the single most important discovery in calculus. To generalize
it (so that we’re not limited to discussing areas), note that our area function $A(x)$ could be
expressed as $\int_a^x f(x)\,dx$. Actually, this reveals that we’ve used $x$ for two different purposes
(it’s both the name of the axis and the name of an arbitrary point on the axis). To avoid this
confusion, recall our comments about the “dummy” variable; we can rewrite this as $\int_a^x f(t)\,dt$.
We then obtain this:
$$g(x) = \int_a^x f(t)\,dt, \quad \text{for } x \in [a, b]$$
$$\frac{d}{dx} \int_a^x f(t)\,dt = f(x).$$
Comments:
1 We’ve dropped the restriction that f (x) ≥ 0, and made no reference to areas.
2 The FTC can be paraphrased as “differentiation is the inverse of integration”.
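$$f \;\longrightarrow\; \boxed{\int_a^x} \;\longrightarrow\; \int_a^x f(t)\,dt \;\longrightarrow\; \boxed{\frac{d}{dx}} \;\longrightarrow\; f$$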
3 We’ve just said that the FTC can be thought of as a differentiation rule, but it may
be even more helpful to think of it in reverse; the function we’ve called g (x) is an
Examples:
$$\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2}\,dt.$$
You can see by applying the FTC that $\frac{d}{dx}(\operatorname{erf}(x)) = \frac{2}{\sqrt{\pi}} e^{-x^2}$. We’ll discuss
numerous techniques for evaluating integrals, but none of them will ever help us to
simplify the expression $\int_0^x e^{-t^2}\,dt$; it cannot be expressed in terms of elementary
functions! All we can do is add the error function to our list of functions; we’ll have
to accept that it’s defined as an integral, and we’ll only ever be able to evaluate it
$$S(x) = \int_0^x \sin\left(\frac{\pi t^2}{2}\right) dt.$$
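Since the error function is defined as an integral, a numerical approach is a natural way to get a feel for it. The sketch below approximates $\operatorname{erf}(x)$ with a midpoint Riemann sum and compares the result with Python's built-in `math.erf`; it then checks the FTC claim $\frac{d}{dx}\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}}e^{-x^2}$ with a central difference. The helper name `erf_by_midpoint` and the values of `n` and `h` are arbitrary choices made for illustration.

```python
import math


def erf_by_midpoint(x, n=2000):
    """Approximate erf(x) = (2/sqrt(pi)) * integral of exp(-t^2) from 0 to x."""
    dt = x / n
    # Midpoint rule: sample the integrand at the middle of each subinterval.
    total = sum(math.exp(-((i + 0.5) * dt) ** 2) for i in range(n)) * dt
    return 2 / math.sqrt(math.pi) * total


x = 0.7
h = 1e-5
# Central-difference estimate of the derivative of erf at x.
slope = (math.erf(x + h) - math.erf(x - h)) / (2 * h)
expected_slope = 2 / math.sqrt(math.pi) * math.exp(-x**2)

print(erf_by_midpoint(1.0), math.erf(1.0))
print(slope, expected_slope)
```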
22.2 The FTC Part II
We commented above that $g(x) = \int_a^x f(t)\,dt$ is an antiderivative of $f(x)$. However it’s
actually infinitely many of them... a different one for each value of $a$. This shouldn’t be
surprising; we know in general that $\frac{d}{dx}(f(x) + C) = f'(x)$ for any value of $C$, so if we work
backwards there should be infinitely many antiderivatives.
Of course, we can see that the antiderivatives must be related: if g1 (x) and g2 (x) are both
Suppose we can find a specific antiderivative for $f(x)$; call it $F(x)$ (for example, given
$f(x) = x^2$ we might be able to guess that $F(x) = \frac{1}{3}x^3$). No matter what the value of $a$ is, we
know that if $g(x) = \int_a^x f(t)\,dt$, then $g(x) = F(x) + C$, for some value of $C$.
That is,
$$\int_a^x f(t)\,dt = F(x) + C.$$
Setting $x = a$ makes the left-hand side zero, so $C = -F(a)$, and therefore
$$\int_a^x f(t)\,dt = F(x) - F(a).$$
In particular, at $x = b$,
$$\int_a^b f(t)\,dt = F(b) - F(a).$$
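A quick numerical comparison makes the result concrete. Using the example $f(x) = x^2$ with antiderivative $F(x) = \frac{1}{3}x^3$, the sketch below compares a midpoint Riemann sum on $[1, 2]$ with $F(2) - F(1)$. The interval and the helper name `midpoint_sum` are assumptions chosen for illustration.

```python
def midpoint_sum(f, a, b, n=1000):
    """Midpoint Riemann sum of f on [a, b] with n equal subintervals."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx


def f(t):
    return t**2


def F(x):
    return x**3 / 3          # one antiderivative of f


a, b = 1.0, 2.0
numeric = midpoint_sum(f, a, b)   # "sum up all of the values"
exact = F(b) - F(a)               # evaluate the antiderivative at two points
print(numeric, exact)
```

Adding any constant to `F` leaves `F(b) - F(a)` unchanged, which is why it does not matter which antiderivative is used.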
This is a remarkable result. It says, essentially, that to sum up all of the values of f (x)
between a and b, all we need to do is evaluate an antiderivative of f (x) at just two points!
To use this, of course, we’ll need to be able to find antiderivatives, so that will be our focus
for the next several lectures.
Comment on Notation: Since we may encounter complicated functions, we will often write
the difference $F(b) - F(a)$ as $F(x)\big|_a^b$, to save ourselves having to write $F$ out twice.
23 Indefinite Integrals
Thanks to the Fundamental Theorem of Calculus, the first step towards evaluating an integral
is finding an antiderivative (unless we’re going to use numerical methods instead of the
FTC). Because of this, we often use the word “integral” as a synonym for “antiderivative”,
even though they are, in principle, completely different concepts! This has become part of
universally-accepted terminology, and we even use the integral sign (without limits) to represent
antiderivatives:
Definition:
The indefinite integral of $f(x)$ is the collection of all possible antiderivatives. We denote it
by $\int f(x)\,dx$.
If we happen to know one antiderivative $F(x)$, then of course this allows us to write
$$\int f(x)\,dx = F(x) + C.$$
Now, how might we go about finding antiderivatives? Well, for a handful of expressions
we can simply reverse our differentiation rules, so let’s start with that:
1 Since $\frac{d}{dx}(x^n) = nx^{n-1}$, we know that
$$\int x^n\,dx = \frac{1}{n + 1}x^{n+1} + C$$
(the differentiation rule is “multiply by the original exponent, then reduce the exponent
by one”, so the reverse is “increase the exponent by 1, then divide by the new exponent”).
2 Rule 1 breaks down if $n = -1$, so we need a special case. Recall that $\frac{d}{dx}(\ln x) = \frac{1}{x}$, so
we may claim that
$$\int \frac{1}{x}\,dx = \ln x + C.$$
This is not quite complete, though, because it only makes sense if $x$ is positive. What if
$x$ is negative? Well, for $x < 0$ the function $\ln x$ is undefined, but $\ln(-x)$ is ok. In fact,
$$\frac{d}{dx}(\ln(-x)) = \frac{1}{-x} \cdot (-1) = \frac{1}{x},$$
so we have
$$\int \frac{1}{x}\,dx = \begin{cases} \ln(x) + C, & \text{if } x > 0 \\ \ln(-x) + C, & \text{if } x < 0 \end{cases}.$$
$$\int \frac{1}{x}\,dx = \ln|x| + C.$$
3 Since $\frac{d}{dx}(e^x) = e^x$, we know that
$$\int e^x\,dx = e^x + C.$$
More generally, since $\frac{d}{dx}(a^x) = (\ln a)\,a^x$, we know that
$$\int a^x\,dx = \frac{a^x}{\ln a} + C.$$
4 Considering our list of rules for differentiating the trigonometric functions, we can state
the following:
$$\int \cos x\,dx = \sin x + C$$
$$\int \sin x\,dx = -\cos x + C$$
$$\int \sec^2 x\,dx = \tan x + C$$
$$\int \csc^2 x\,dx = -\cot x + C$$
$$\int \sec x \tan x\,dx = \sec x + C$$
$$\int \csc x \cot x\,dx = -\csc x + C$$
5 Similarly,
$$\int \cosh x\,dx = \sinh x + C$$
$$\int \sinh x\,dx = \cosh x + C$$
etc.
6
$$\int \frac{1}{1 + x^2}\,dx = \tan^{-1} x + C$$
$$\int \frac{1}{\sqrt{1 - x^2}}\,dx = \sin^{-1} x + C$$
We might also state that $\int -\frac{1}{\sqrt{1 - x^2}}\,dx = \cos^{-1} x + C$, but this isn’t really adding
anything for us, except that it reveals something about the inverse trigonometric functions.20
Unfortunately, those are about all of the actual rules available to us! Over the next few
lectures we’ll introduce several “techniques of integration”, but all any of these can do is convert
integrals into new forms... and the goal will be to obtain one of the integrals we already know.
For this reason the list above will be essential; you MUST know every formula we’ve written
above.
20 If $\int \frac{1}{\sqrt{1 - x^2}}\,dx = \sin^{-1} x + C_1 = -\cos^{-1} x + C_2$, then $\sin^{-1} x$ and $(-\cos^{-1} x)$ must only differ by a
constant:
$$\sin^{-1} x = -\cos^{-1} x + K$$
or $\sin^{-1} x + \cos^{-1} x = K$.
Setting $x = 0$ tells us that $K = \pi/2$, so we’ve discovered an identity:
$$\sin^{-1} x + \cos^{-1} x = \frac{\pi}{2}.$$
Also, keep in mind that some integration problems are actually impossible! As we’ve
already mentioned, $e^{-x^2}$ and $\sin x^2$ do not possess “nice” antiderivatives (every continuous
function possesses an antiderivative, but it may not be possible to express that antiderivative
Examples
With a bit of thought, we can apply the rules above to functions which don’t quite match the
formulas exactly.
1. Consider $\int (2 + x^2)^2\,dx$. All we need to do is expand the integrand; we have
$$\int (4 + 4x^2 + x^4)\,dx = 4x + \frac{4}{3}x^3 + \frac{1}{5}x^5 + C.$$
2. Consider $\int \cos(2x)\,dx$. We might guess that this should involve $\sin(2x)$, but checking
this gives $\frac{d}{dx}\sin(2x) = 2\cos(2x)$. All we need to do here is adjust for the 2:
$$\int \cos(2x)\,dx = \frac{1}{2}\sin(2x) + C.$$
This is sometimes referred to as the “guess and fix-up method”. If you’re going to guess,
though, make sure you check your answer! Here we can indeed see that
$$\frac{d}{dx}\left(\frac{1}{2}\sin(2x) + C\right) = \cos(2x).$$
3. Consider $\int \cos^2 x\,dx$. Your first thought might be that this should involve $\sin^2 x$... but
if you check that you’ll see that it isn’t even close! In fact there’s a standard trick for
$$\begin{aligned}
\int \cos^2 x\,dx &= \int \frac{1 + \cos 2x}{2}\,dx \\
&= \int \frac{1}{2}\,dx + \frac{1}{2}\int \cos(2x)\,dx \\
&= \frac{1}{2}x + \frac{1}{4}\sin(2x) + C.
\end{aligned}$$
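4. Consider $\int \frac{2 + e^{-x}}{e^x}\,dx$. We can expand this:
$$\begin{aligned}
\int \frac{2 + e^{-x}}{e^x}\,dx &= \int \left(2e^{-x} + e^{-2x}\right) dx = 2\int e^{-x}\,dx + \int e^{-2x}\,dx \\
&= 2\left(-e^{-x} + C_1\right) + \left(-\frac{1}{2}e^{-2x} + C_2\right) \quad \text{(guessing and fixing)} \\
&= -2e^{-x} - \frac{1}{2}e^{-2x} + C.
\end{aligned}$$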
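The advice to check a guessed antiderivative can also be followed numerically: differentiate the candidate $F$ with a central difference and compare against the integrand. The sketch below does this for $\int \cos^2 x\,dx$ and $\int \frac{2 + e^{-x}}{e^x}\,dx$; the helper name `derivative`, the step size and the sample points are arbitrary illustrative choices.

```python
import math


def derivative(F, x, h=1e-5):
    """Central-difference approximation to F'(x)."""
    return (F(x + h) - F(x - h)) / (2 * h)


# Each entry pairs a candidate antiderivative F with the integrand f it should reproduce.
checks = {
    "cos^2 x": (lambda x: 0.5 * x + 0.25 * math.sin(2 * x),
                lambda x: math.cos(x) ** 2),
    "(2 + e^-x)/e^x": (lambda x: -2 * math.exp(-x) - 0.5 * math.exp(-2 * x),
                       lambda x: (2 + math.exp(-x)) / math.exp(x)),
}

max_error = 0.0
for name, (F, f) in checks.items():
    for x in (-1.0, 0.3, 2.0):
        error = abs(derivative(F, x) - f(x))
        max_error = max(max_error, error)
        print(name, x, error)
```

A large error at any sample point would signal a wrong guess; agreement at a few points is evidence, not proof, so differentiating by hand remains the real check.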
Technique)
Inverting the rules for specific functions was easy enough, but can we invert the Product,
We won’t worry about the Quotient Rule, since it is never required (we can always write
$\frac{f(x)}{g(x)}$ as $f(x)[g(x)]^{-1}$ and apply the Product and Chain Rules). Of the other two, the easier
one to deal with is the Chain Rule, so let’s start there:
$$\frac{d}{dx}(f(g(x))) = f'(g(x))\,g'(x).$$
This structure will usually be hard to recognize, but we may be able to break it down into two
steps. Notice that the integral contains a function (g (x)) “inside” another function, and the
derivative of this “inner” function also appears in the integral. If we happen to notice this, we
can try introducing a different variable; we make the substitution u = g (x). If we differentiate
$$du = g'(x)\,dx.$$
This allows us to rewrite the entire integral in terms of our new variable:
$$\int f'(g(x))\,g'(x)\,dx = \int f'(u)\,du.$$
If we’ve been lucky enough to encounter a problem which is well suited to this idea, and clever
enough to spot the substitution, then we might now recognize the function $f'(u)$ as being one
of the functions from our list of known antidifferentiation formulas, allowing us to conclude
that
$$\int f'(g(x))\,g'(x)\,dx = \int f'(u)\,du = f(u) + C = f(g(x)) + C.$$
A few examples:
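$$\int \frac{6x}{3x^2 + 4}\,dx = \int \frac{du}{u}.$$
$$\begin{aligned}
\int \frac{6x}{3x^2 + 4}\,dx &= \int \frac{du}{u} \\
&= \ln|u| + C \\
&= \ln\left(3x^2 + 4\right) + C
\end{aligned}$$
(we can drop the absolute values here because we know that $3x^2 + 4$ is always positive).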
2 Consider $\int \frac{\sin(3 \ln x)}{x}\,dx$. There’s an obvious choice here; $3 \ln x$ is sitting “inside” another
function, and its derivative is also in the integral... or at least a multiple of its
derivative is there.
We let $u = 3 \ln x$, which means that $du = \frac{3}{x}\,dx$, which we could rearrange as $\frac{dx}{x} = \frac{du}{3}$.
Now
$$\begin{aligned}
\int \frac{\sin(3 \ln x)}{x}\,dx &= \int \sin(3 \ln x)\,\frac{dx}{x} \\
&= \int \sin(u)\,\frac{du}{3} \\
&= \frac{1}{3}\int \sin u\,du \\
&= -\frac{1}{3}\cos u + C \\
&= -\frac{1}{3}\cos(3 \ln x) + C.
\end{aligned}$$
3 Here’s a simple one for which we don’t yet have a formula: what’s $\int \tan x\,dx$? We need
Now we let $u = \cos x$, because this choice gives us $du = -\sin x\,dx$, which matches the
$$\begin{aligned}
\int \tan x\,dx = \int \frac{\sin x}{\cos x}\,dx &= -\int \frac{1}{u}\,du \\
&= -\ln|u| + C \\
&= -\ln|\cos x| + C.
\end{aligned}$$
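4 How might we tackle $\int e^x \sqrt{1 + e^x}\,dx$? We let $u = 1 + e^x$, which gives $du = e^x\,dx$. Then
$$\begin{aligned}
\int e^x \sqrt{1 + e^x}\,dx &= \int \sqrt{u}\,du \\
&= \frac{2}{3}u^{3/2} + C \\
&= \frac{2}{3}(1 + e^x)^{3/2} + C.
\end{aligned}$$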
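5 Sometimes the method can be made to work even when it doesn’t initially look hopeful.
For example, consider $\int x^3 \sqrt{1 + x^2}\,dx$. We see that we have an “inner” function $1 + x^2$,
but its derivative is $2x$, which does not match the $x^3$ constituting the rest of the
$$u = 1 + x^2 \implies du = 2x\,dx$$
$$\int x^3 \sqrt{1 + x^2}\,dx = \int x^2 \sqrt{1 + x^2}\,(x\,dx)$$
A glance back at our substitution tells us that we can replace $x^2$ with $u - 1$, and so
$$\begin{aligned}
\int x^3 \sqrt{1 + x^2}\,dx &= \int x^2 \sqrt{1 + x^2}\,(x\,dx) \\
&= \int (u - 1)\sqrt{u}\,\frac{du}{2} \\
&= \frac{1}{2}\int \left(u^{3/2} - u^{1/2}\right) du \\
&= \frac{1}{2}\left(\frac{2}{5}u^{5/2} - \frac{2}{3}u^{3/2}\right) + C \\
&= \frac{1}{5}\left(1 + x^2\right)^{5/2} - \frac{1}{3}\left(1 + x^2\right)^{3/2} + C.
\end{aligned}$$
If we’re unsure of ourselves, we always have the option of checking our results by differentiation:
If $y = \frac{1}{5}\left(1 + x^2\right)^{5/2} - \frac{1}{3}\left(1 + x^2\right)^{3/2} + C$, then
$$\begin{aligned}
\frac{dy}{dx} &= \frac{1}{2}\left(1 + x^2\right)^{3/2} \cdot 2x - \frac{1}{2}\left(1 + x^2\right)^{1/2} \cdot 2x \\
&= x\left(1 + x^2\right)^{3/2} - x\left(1 + x^2\right)^{1/2} \\
&= x\sqrt{1 + x^2}\left[\left(1 + x^2\right) - 1\right] \\
&= x^3 \sqrt{1 + x^2}, \text{ as it should be!}
\end{aligned}$$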
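Besides differentiating, another sanity check is to compare a definite integral computed from the antiderivative with a direct numerical estimate. The sketch below does this for $\int_0^1 x^3\sqrt{1 + x^2}\,dx$; the interval $[0, 1]$ and the helper name `midpoint_sum` are assumptions made only for this illustration.

```python
import math


def midpoint_sum(f, a, b, n=2000):
    """Midpoint Riemann sum of f on [a, b] with n equal subintervals."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx


def integrand(x):
    return x**3 * math.sqrt(1 + x**2)


def F(x):
    # Antiderivative obtained from the substitution u = 1 + x^2.
    u = 1 + x**2
    return u**2.5 / 5 - u**1.5 / 3


numeric = midpoint_sum(integrand, 0.0, 1.0)
from_antiderivative = F(1.0) - F(0.0)
print(numeric, from_antiderivative)
```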
6 Sometimes we will have to rework the integral before a useful substitution becomes
apparent. This may require some creativity. For example, consider $\int \frac{dx}{1 + e^x}$. Your first
thought might be to try letting $u = 1 + e^x$, but this fails (because differentiation yields
There are several tricks that might work here. Here’s one: multiply the numerator and
denominator by $e^{-x}$:
$$\int \frac{dx}{1 + e^x} = \int \frac{e^{-x}}{e^{-x} + 1}\,dx.$$
$$\begin{aligned}
\int \frac{dx}{1 + e^x} &= \int \frac{e^{-x}}{e^{-x} + 1}\,dx \\
&= -\int \frac{du}{u} \\
&= -\ln|u| + C \\
&= -\ln\left(e^{-x} + 1\right) + C.
\end{aligned}$$
Note that the result can be written in several different ways: $-\ln\left(e^{-x} + 1\right) = \ln\left(\frac{1}{e^{-x} + 1}\right) = \ln\left(\frac{e^x}{1 + e^x}\right) = x - \ln(1 + e^x)$. This is something to keep in mind if you’re comparing
answers to practice problems with your classmates, or with the answers in the back of
the textbook, or with answers given by software; if your answers don’t match exactly, it’s
possible that they’re equal, but in a different form. They may even differ by a constant!
25 Integration by Parts (IBP)
We have just one more differentiation rule to reverse. Recall the Product Rule:
$$\frac{d}{dx}\left(u(x)\,v(x)\right) = \frac{du}{dx} \cdot v + u \cdot \frac{dv}{dx}$$
(we’re using u and v instead of f and g here simply because we’re working towards a formula,
If we separate the integral into two and simplify the differentials, this takes on the form
$$\int v\,du + \int u\,dv = uv + C.$$
Now, it’s going to be hard to recognize this pattern in a pair of integrals; we’re normally going
to be looking at one integral at a time. So, we isolate one of the integrals. This gives the
integration by parts formula:
$$\int u\,dv = uv - \int v\,du.$$
Note that we’ve omitted the “+C”. This is because we have an indefinite integral on each
side of the equation - each one of them already contains an arbitrary constant!
This formula probably looks a bit strange, but remember that both $u$ and $v$ are supposed
to be functions of $x$. The left-hand side is really $\int u(x)\,\frac{dv}{dx}\,dx = \int u(x)\,v'(x)\,dx$, and the
right-hand side contains $u(x)$, $v(x)$, and $\int v(x)\,u'(x)\,dx$. Therefore, to use the formula, we
need to break our integrand up into a function $u(x)$ and a function $v'(x)$. If we can integrate
$v'(x)$ to find $v(x)$, then the integration by parts formula will enable us to replace the original
integration problem with a new one. If we’re lucky, the new one will be a problem we can
solve!
Examples:
1 Consider $\int xe^x\,dx$. First observe that there really aren’t any useful substitutions available
here, and we really can’t simplify the integrand in any way. However, we do have
a product of functions, so IBP is worth trying. So, we want to identify $u$ and $dv$ such
that $\int xe^x\,dx$ will become $\int u\,dv$. There seems to be more than one way to proceed,
$$u = x \quad \text{and} \quad dv = e^x\,dx.$$
This means
$$du = dx \quad \text{and} \quad v = e^x + C.$$
$$\begin{aligned}
\int xe^x\,dx = \int u\,dv &= uv - \int v\,du \\
&= x\left(e^x + C\right) - \int \left(e^x + C\right) dx \\
&= xe^x + Cx - \left(e^x + Cx + K\right) \\
&= xe^x - e^x + K.
\end{aligned}$$
Notice that the arbitrary constant from v (the “C”) cancelled out. This will always
It isn’t always going to be clear how we should pick u and dv, or that we should be using
IBP at all; it will take practice and experience. Two principles may help:
• The goal is to obtain a simpler integral than we start with, so it may help to pick
$$u = e^x \quad \text{and} \quad dv = x\,dx$$
$$\implies du = e^x\,dx, \quad v = \frac{1}{2}x^2.$$
This gives
$$\int xe^x\,dx = \frac{1}{2}x^2 e^x - \frac{1}{2}\int x^2 e^x\,dx.$$
This is entirely correct... but we’ve arrived at a more complicated integral than we
started with! When there are two obvious ways of proceeding, we’ll often find that one
of them leads to a simpler integral, while the other leads in the other direction.
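2 Many problems will require repeated applications of IBP. For example, consider $\int x^2 e^x\,dx$.
If we let
$$u = x^2 \quad \text{and} \quad dv = e^x\,dx$$
$$\text{so} \quad du = 2x\,dx, \quad v = e^x$$
then
$$\begin{aligned}
\int x^2 e^x\,dx &= uv - \int v\,du \\
&= x^2 e^x - 2\int xe^x\,dx
\end{aligned}$$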
3 Here’s a famous example with an odd twist: what do we do with $\int e^x \sin x\,dx$? There
are two obvious choices for $u$ again, but in this case it turns out that both work equally
$$u = e^x, \quad dv = \sin x\,dx$$
This gives
$$uv - \int v\,du = -e^x \cos x + \int e^x \cos x\,dx,$$
and now we have a second integral, similar to the original one. We try the same strategy
again:
$$u = e^x, \quad dv = \cos x\,dx$$
$$du = e^x\,dx, \quad v = \sin x$$
It looks as though we’re back to where we started. However, if we carefully write down
what we’ve got so far, we discover that we’re not actually stuck at all; we have
$$\int e^x \sin x\,dx = -e^x \cos x + e^x \sin x - \int e^x \sin x\,dx$$
$$2\int e^x \sin x\,dx = e^x(\sin x - \cos x) + C$$
(adding “+C” because we now have an indefinite integral on only one side of the equation), so
$$\int e^x \sin x\,dx = \frac{1}{2}e^x(\sin x - \cos x) + C_2$$
4 Normally, we’ll consider using IBP when our integrand is a product of two functions,
but there is a handful of exceptions. For example, consider $\int \ln x\,dx$. If we let
$$u = \ln x, \quad dv = dx$$
$$du = \frac{1}{x}\,dx, \quad v = x$$
then
$$\int \ln x\,dx = x \ln x - \int x \cdot \frac{1}{x}\,dx = x \ln x - x + C.$$
The same trick works on the inverse trigonometric functions $\sin^{-1} x$ and $\tan^{-1} x$, with
5 It will occasionally be useful to use both the Change-of-Variable technique and IBP
in the same problem. Consider $\int \sinh\sqrt{x}\,dx$. At first glance neither method seems
hopeful. However, we can always try a substitution, and it will transform the integral
$$t = \sqrt{x}.$$
Then
$$dt = \frac{1}{2\sqrt{x}}\,dx,$$
so that
$$dx = 2\sqrt{x}\,dt = 2t\,dt.$$
This converts $\int \sinh\sqrt{x}\,dx$ into $\int 2t \sinh t\,dt$, and now we can try integration by parts!
Let
$$u = 2t, \quad dv = \sinh t\,dt$$
$$du = 2\,dt, \quad v = \cosh t.$$
Then
$$\begin{aligned}
\int \sinh\sqrt{x}\,dx &= \int 2t \sinh t\,dt \\
&= 2t \cosh t - 2\int \cosh t\,dt \\
&= 2t \cosh t - 2 \sinh t + C \\
&= 2\sqrt{x}\cosh\sqrt{x} - 2\sinh\sqrt{x} + C.
\end{aligned}$$
If we’re working with a definite integral, and we decide that we need either to make a substitution
or to use integration by parts, we could work on the indefinite integral in the margin,
and then apply the Fundamental Theorem of Calculus directly. However, we can be slightly
more efficient.
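Example: Consider $\int_2^3 xe^{x^2}\,dx$. If we let
$$u = x^2$$
$$\text{then} \quad du = 2x\,dx$$
and we can express the limits of integration in terms of the new variable as well. If $x = 2$,
then $u = 4$, and if $x = 3$, then $u = 9$, so
$$\int_2^3 xe^{x^2}\,dx = \frac{1}{2}\int_4^9 e^u\,du = \frac{1}{2}e^u\Big|_4^9 = \frac{1}{2}\left(e^9 - e^4\right).$$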
Thus, if we make a substitution in a definite integral, we never have to return to the original
variables! We will be expecting you to take advantage of this in your own work; it might
not be absolutely necessary, but there is usually no good reason not to change the limits of
Example: Consider $\int_2^3 x^2 \ln x\,dx$. We’ll use integration by parts here, with
$$u = \ln x, \quad dv = x^2\,dx$$
$$du = \frac{1}{x}\,dx, \quad v = \frac{1}{3}x^3.$$
Since the integration by parts formula involves two new variables, we’ll always revert to the
original one. We can simply carry the limits through the integration process, like this:
$$\begin{aligned}
\int_2^3 x^2 \ln x\,dx &= \frac{1}{3}x^3 \ln x\,\Big|_2^3 - \int_2^3 \frac{1}{3}x^2\,dx \\
&= 9\ln 3 - \frac{8}{3}\ln 2 - \frac{1}{9}x^3\Big|_2^3 \\
&= 9\ln 3 - \frac{8}{3}\ln 2 - \frac{19}{9}.
\end{aligned}$$
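Both definite-integral examples can be cross-checked against a direct numerical estimate. The sketch below uses a midpoint Riemann sum with a fairly large number of rectangles, because $xe^{x^2}$ grows quickly on $[2, 3]$; the helper name `midpoint_sum` and the value of `n` are illustrative assumptions.

```python
import math


def midpoint_sum(f, a, b, n=20000):
    """Midpoint Riemann sum of f on [a, b] with n equal subintervals."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx


# Substitution example: the closed form comes from the changed limits u = 4 to u = 9.
substitution_numeric = midpoint_sum(lambda x: x * math.exp(x**2), 2.0, 3.0)
substitution_exact = 0.5 * (math.exp(9) - math.exp(4))

# Integration-by-parts example, with the limits carried through in x.
parts_numeric = midpoint_sum(lambda x: x**2 * math.log(x), 2.0, 3.0)
parts_exact = 9 * math.log(3) - (8 / 3) * math.log(2) - 19 / 9

print(substitution_numeric, substitution_exact)
print(parts_numeric, parts_exact)
```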
26 Other Simple Applications of Integration
We introduced the concept of the definite integral with a view to finding the area between a
curve and the x-axis, but we can make one generalization very easily. Suppose we have two
Figure 56:
Of course! Considering Figure (56), we proceed exactly as before: we partition the interval
$[a, b]$ into small segments of width $\Delta x$, and pick a point $x_i^*$ in each subinterval. We evaluate
$f$ and $g$ at this point to create a rectangle of width $\Delta x$ and height $f(x_i^*) - g(x_i^*)$. Adding all
$$\sum_{i=1}^{n} \left[f(x_i^*) - g(x_i^*)\right] \Delta x.$$
This is just the Riemann sum from the definition of the definite integral, with the function
$f(x)$ replaced with the function $f(x) - g(x)$, and so if we let $n \to \infty$, we find that
$$\text{Area} = \int_a^b \left[f(x) - g(x)\right] dx.$$
Solution: First, a quick sketch reveals that there is exactly one region enclosed by these
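[Figure: the curves $y = x^2$ and $y = x^3$, with a vertical strip of width $\Delta x$]
$$A = \int_0^1 \left(x^2 - x^3\right) dx = \left(\frac{x^3}{3} - \frac{x^4}{4}\right)\Big|_0^1 = \frac{1}{3} - \frac{1}{4} = \frac{1}{12}.$$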
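The area between the two curves can be estimated directly from the Riemann sum of $f(x) - g(x)$. The sketch below does this for $y = x^2$ and $y = x^3$ on $[0, 1]$; the function name `area_between` is a hypothetical helper chosen for illustration.

```python
def area_between(upper, lower, a, b, n=1000):
    """Midpoint Riemann sum of upper(x) - lower(x) on [a, b]."""
    dx = (b - a) / n
    total = 0.0
    for i in range(n):
        x_star = a + (i + 0.5) * dx
        total += (upper(x_star) - lower(x_star)) * dx   # one thin vertical rectangle
    return total


area = area_between(lambda x: x**2, lambda x: x**3, 0.0, 1.0)
print(area, 1 / 12)
```

On $[0, 1]$ we have $x^2 \geq x^3$, so $x^2$ is passed as the upper curve; swapping the two arguments would simply flip the sign of the result.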
Occasionally a problem will be made easier if we divide the region into horizontal rectangles
instead. We then use y as the variable of integration. This will be the obvious way to proceed
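Figure 58:
This way,
$$\begin{aligned}
\text{Area} &\approx \sum_{i=1}^{n} \left[x_{\text{right}} - x_{\text{left}}\right] \Delta y \\
&= \sum_{i=1}^{n} \left[f(y_i^*) - g(y_i^*)\right] \Delta y,
\end{aligned}$$
so
$$\text{Area} = \int_c^d \left[f(y) - g(y)\right] dy.$$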
We refer to regions bounded by two functions of x, such as in Figure (56) as “Type I” regions,
while regions bounded by two functions of y, as in Figure (58), are called “Type II”. If a region
can be described in either way (which is the case in Figure (57)), then we call it “Type III”.
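[Figure: the curve $y = \ln(x)$, with a horizontal strip of height $\Delta y$]
This is a Type III region, so we can express the area as a single integral, using either x or
$$\begin{aligned}
A &= \int_1^3 \ln x\,dx \\
&= (3\ln 3 - 3) - (0 - 1) \\
&= 3\ln 3 - 2.
\end{aligned}$$
$$\begin{aligned}
A &= \int_0^{\ln 3} \left(3 - e^y\right) dy \\
&= \left(3y - e^y\right)\Big|_0^{\ln 3} \\
&= (3\ln 3 - 3) - (0 - 1) \\
&= 3\ln 3 - 2.
\end{aligned}$$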
We may choose to integrate on y either because the integral is easier that way (in the above
example using x requires integration by parts, while using y does not), or because of the shape
of the region.
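[Figure: the curves $y = x$ and $y = \frac{2}{x} - 1$]
Solution: We always have two options, but since this is a Type II region, one of them
$$\begin{aligned}
A &= \int_0^1 x\,dx + \int_1^2 \left(\frac{2}{x} - 1\right) dx \\
&= \ldots = 2\ln 2 - \frac{1}{2}.
\end{aligned}$$
$$\begin{aligned}
A &= \int_0^1 \left(\frac{2}{y + 1} - y\right) dy \\
&= \ldots = 2\ln 2 - \frac{1}{2}.
\end{aligned}$$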
26.2 Mean Values of Functions
Here’s a completely different application. Think back to our development of the definite
integral. At one stage we have $n$ rectangles, of heights $f(x_i^*)$. The average height of these
rectangles must be
$$\frac{\sum_{i=1}^{n} f(x_i^*)}{n},$$
and it seems reasonable to claim that this can be used as an approximation for the average
Letting $n \to \infty$, we can define the mean value of a continuous function $f$ on an interval $[a, b]$
as
$$\text{m.v.}(f) = \frac{1}{b - a}\int_a^b f(x)\,dx.$$
Example: The mean value of f (x) = sin x over the interval [0, π] is
$$\frac{1}{\pi}\int_0^{\pi} \sin x \, dx = \frac{1}{\pi}\left(-\cos x\right)\Big|_0^{\pi} = \frac{1}{\pi}(1 + 1) = \frac{2}{\pi}$$
(which is approximately 0.637). We could consider this to be the mean value of the absolute
value of f over the entire real line (its average distance from the x-axis). See below:
Figure 61:
square root of the mean of the square of f (the “root mean square”):
$$\text{r.m.s.}(f) = \sqrt{\frac{1}{b-a}\int_a^b [f(x)]^2\,dx}$$
This will always give a value at least as large as the mean of the absolute value of f ; it
$$\text{m.v.}(\sin x) = 0$$
$$\text{m.v.}(|\sin x|) = \frac{2}{\pi} \approx 0.637$$
$$\text{r.m.s.}(\sin x) = \sqrt{\frac{1}{2\pi}\int_0^{2\pi} \sin^2 x \, dx} = \dots = \frac{1}{\sqrt{2}} \approx 0.707.$$
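The three values above can be checked numerically. The following is a minimal sketch, assuming the interval [0, 2π] for all three quantities and using a hypothetical helper `simpson` (composite Simpson's rule) in place of exact integration.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

a, b = 0.0, 2.0 * math.pi
# Mean value, mean of the absolute value, and root mean square of sin x.
mean = simpson(math.sin, a, b) / (b - a)
mean_abs = simpson(lambda x: abs(math.sin(x)), a, b) / (b - a)
rms = math.sqrt(simpson(lambda x: math.sin(x) ** 2, a, b) / (b - a))

print(mean, mean_abs, rms)
```

The root mean square comes out larger than the mean of the absolute value, as claimed in the text.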
27 Trigonometric Substitutions
Now we introduce a variation on the change-of-variable technique, useful for a fairly specific set of problems. If we encounter integrals containing terms such as $\sqrt{a^2 - x^2}$ or $\sqrt{x^2 \pm a^2}$, where a is a constant, there are three specific substitutions which may be helpful. The idea
For example, if we encounter the expression $\sqrt{x^2 + 1}$, replacing x with tan θ will enable us to
$$\sqrt{x^2 + 1} = \sqrt{\tan^2\theta + 1} = \sqrt{\sec^2\theta} = \sec\theta$$
(this last step is only valid if sec θ > 0, but for reasons we’ll explain in a moment, this will
Notice that we’ve written these substitutions in inverse form; what we are really doing is
introducing new variables $\theta = \tan^{-1}\frac{x}{a}$, etc. For these particular substitutions, the calcula-
As with any substitution, the effect is simply to transform the integral into a new form,
and there’s no guarantee that the new integral will be one that we can evaluate... but it may
be worth trying!
Finally, if we are indeed able to evaluate the integral in θ, we’ll want to rewrite the result
in terms of x. This will typically involve a problem of the following type: given that $\tan\theta = \frac{x}{a}$,
what is sin θ? To answer this, we could either work with identities, or draw a sample right
[Figure 62: a right triangle with angle θ, opposite side x, adjacent side a, and hypotenuse $\sqrt{x^2 + a^2}$]
From this we can read off the values of the other trigonometric functions. In this example,
$$\sin\theta = \frac{x}{\sqrt{x^2 + a^2}}, \quad\text{and}\quad \cos\theta = \frac{a}{\sqrt{x^2 + a^2}}.$$
Examples:
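a) Consider $\int \frac{dx}{\sqrt{4 + x^2}}$. Following the rules above, we let x = 2 tan θ. As in every substitution, we differentiate this:
$$dx = 2\sec^2\theta \, d\theta.$$
$$\begin{aligned}
\int \frac{dx}{\sqrt{4 + x^2}} &= \int \frac{2\sec^2\theta}{\sqrt{4 + 4\tan^2\theta}}\,d\theta \\
&= \int \frac{\sec^2\theta}{\sqrt{1 + \tan^2\theta}}\,d\theta \\
&= \int \frac{\sec^2\theta}{|\sec\theta|}\,d\theta
\end{aligned}$$
Now, we’ve defined θ as $\theta = \tan^{-1}\frac{x}{2}$. That means that $\theta \in \left(-\frac{\pi}{2}, \frac{\pi}{2}\right)$, and so sec θ is always positive. Therefore we have
$$\int \sec\theta \, d\theta.$$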
Now, that’s a problem. This is not on our basic list, and it turns out that with the
techniques available to us in this course, it’s quite difficult to evaluate it. However, the
formula is known, and this particular integral appears often enough that we’ll add it to
$$\int \sec x \, dx = \ln|\sec x + \tan x| + C$$
$$\int \csc x \, dx = -\ln|\csc x + \cot x| + C$$
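$$\int \frac{dx}{\sqrt{4 + x^2}} = \int \sec\theta \, d\theta = \ln|\sec\theta + \tan\theta| + C.$$
We still need to return to x. We know that tan θ = x/2, but what’s sec θ? We can either draw
$$\sec\theta = \sqrt{1 + \tan^2\theta} = \sqrt{1 + \left(\frac{x}{2}\right)^2} = \frac{1}{2}\sqrt{4 + x^2}.$$
Continuing,
$$\begin{aligned}
\int \frac{dx}{\sqrt{4 + x^2}} &= \ln\left|\frac{\sqrt{4 + x^2}}{2} + \frac{x}{2}\right| + C_1 \\
&= \ln\left|x + \sqrt{4 + x^2}\right| + C_2
\end{aligned}$$
where we’ve used the fact that $\ln\frac{A}{B} = \ln A - \ln B$, and set $C_2 = C_1 - \ln 2$.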
and let u = x/2, so du = dx/2, giving
$$\int \frac{du}{\sqrt{1 + u^2}} = \sinh^{-1} u + C = \sinh^{-1}\frac{x}{2} + C,$$
and this is, believe it or not, equivalent to our first result! (Since the hyperbolic functions are defined in terms of exponentials, their inverses can be expressed in terms of logarithms.)
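A short numerical illustration of that equivalence (a sketch only; the function name `difference` and the sample points are arbitrary choices): the logarithmic form and the inverse hyperbolic sine form should differ by the same constant, ln 2, at every x, which is absorbed into the constant of integration.

```python
import math

def difference(x):
    """ln(x + sqrt(4 + x^2)) minus asinh(x / 2)."""
    log_form = math.log(x + math.sqrt(4.0 + x * x))
    asinh_form = math.asinh(x / 2.0)
    return log_form - asinh_form

for x in (-3.0, -0.5, 0.0, 1.0, 10.0):
    # Each printed difference should be close to ln 2.
    print(x, difference(x), math.log(2.0))
```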
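b) Now consider the integral $\int \frac{1}{x\sqrt{x^2 - 1}}\,dx$ (you might recognize the integrand, if you’ve been looking at tables of derivatives, but let’s assume that you don’t). Looking at the
x = sec θ,
$$\begin{aligned}
\int \frac{1}{x\sqrt{x^2 - 1}}\,dx &= \int \frac{1}{\sec\theta}\cdot\frac{1}{\sqrt{\sec^2\theta - 1}}\,\sec\theta\tan\theta \, d\theta \\
&= \int \frac{\tan\theta}{\sqrt{\tan^2\theta}}\,d\theta.
\end{aligned}$$
Now, since θ = sec⁻¹(x), we have $\theta \in \left(-\pi, -\frac{\pi}{2}\right) \cup \left(0, \frac{\pi}{2}\right)$ (see section 8 of these notes, and observe that we have to exclude the points θ = −π, θ = 0 to avoid division by zero),
$$\begin{aligned}
\int \frac{1}{x\sqrt{x^2 - 1}}\,dx &= \int \frac{\tan\theta}{\tan\theta}\,d\theta \\
&= \int d\theta \\
&= \theta + C \\
&= \sec^{-1} x + C,
\end{aligned}$$
$$\frac{d}{dx}\sec^{-1} x = \frac{1}{x\sqrt{x^2 - 1}}.$$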
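c) Consider $\int \frac{x^3}{\sqrt{4 - x^2}}\,dx$. This time we let
$$x = 2\sin\theta \implies dx = 2\cos\theta \, d\theta.$$
$$\begin{aligned}
\int \frac{x^3}{\sqrt{4 - x^2}}\,dx &= 8\int \frac{\sin^3\theta\cos\theta}{\sqrt{1 - \sin^2\theta}}\,d\theta \\
&= 8\int \frac{\sin^3\theta\cos\theta}{\cos\theta}\,d\theta \qquad \text{(question: why not } \pm\cos\theta\text{?)} \\
&= 8\int \sin^3\theta \, d\theta \\
&= 8\int \left(1 - \cos^2\theta\right)\sin\theta \, d\theta \\
&= -8\int \left(1 - u^2\right) du \\
&= 8\int \left(u^2 - 1\right) du \\
&= 8\left(\frac{u^3}{3} - u\right) + C \\
&= 8\left(\frac{1}{3}\cos^3\theta - \cos\theta\right) + C
\end{aligned}$$
Use a triangle:
$$\begin{aligned}
&= 8\left(\frac{\left(4 - x^2\right)^{3/2}}{24} - \frac{\sqrt{4 - x^2}}{2}\right) + C \\
&= \frac{1}{3}\left(4 - x^2\right)^{3/2} - 4\sqrt{4 - x^2} + C.
\end{aligned}$$
$$\begin{aligned}
f'(x) &= \frac{1}{2}\left(4 - x^2\right)^{1/2}(-2x) - 2\left(4 - x^2\right)^{-1/2}(-2x) \\
&= -x\sqrt{4 - x^2} + \frac{4x}{\sqrt{4 - x^2}} \\
&= \frac{-x\left(4 - x^2\right) + 4x}{\sqrt{4 - x^2}} \\
&= \frac{x^3}{\sqrt{4 - x^2}} \qquad \text{as it should be!}
\end{aligned}$$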
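The same check can be done numerically. This is only a sketch: it compares a central-difference approximation of the derivative of the antiderivative found above with the original integrand at a few sample points inside (−2, 2). The names `antiderivative`, `integrand` and `central_difference`, and the step size, are illustrative assumptions.

```python
import math

def antiderivative(x):
    """F(x) = (1/3)(4 - x^2)^(3/2) - 4 sqrt(4 - x^2), valid for |x| < 2."""
    return (4.0 - x * x) ** 1.5 / 3.0 - 4.0 * math.sqrt(4.0 - x * x)

def integrand(x):
    """x^3 / sqrt(4 - x^2)."""
    return x ** 3 / math.sqrt(4.0 - x * x)

def central_difference(f, x, h=1e-5):
    """Symmetric finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

for x in (-1.5, 0.3, 1.0):
    print(x, central_difference(antiderivative, x), integrand(x))
```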
The same technique can be useful for other powers of $a^2 - x^2$ or $x^2 \pm a^2$ (not just square
Every rational function has an antiderivative which can be expressed in terms of some com-
bination of rational functions, the natural logarithm, and the arctangent. We’ll discuss the
required strategies through a number of examples, starting with some easy ones.
1. $\int \frac{1}{x - 4}\,dx = \ln|x - 4| + C$
(If you have trouble seeing this right away, you could make the substitution u = x − 4).
2. $\int \frac{1}{4 - x}\,dx = -\ln|4 - x| + C$
(Again, a substitution could be made here, or you could realize that $\frac{1}{4 - x} = -\frac{1}{x - 4}$, and ln |x − 4| = ln |4 − x|).
3. $\int \frac{1}{(x - 4)^2}\,dx = -\frac{1}{x - 4} + C$
(Once again, try u = x − 4 if you need to. Once you’ve seen a few examples like this,
you’ll probably find that you no longer need to go through the substitution procedure.)
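4. What do we do with $\int \frac{1}{x^2 - 4}\,dx$? Use partial fractions!
$$\frac{1}{x^2 - 4} = \frac{A}{x + 2} + \frac{B}{x - 2}$$
$$\implies 1 = A(x - 2) + B(x + 2)$$
$$x = 2 \implies 1 = 4B$$
$$x = -2 \implies 1 = -4A$$
$$\implies A = -\tfrac{1}{4}, \quad B = \tfrac{1}{4}$$
$$\begin{aligned}
\text{so}\quad \int \frac{1}{x^2 - 4}\,dx &= \int \frac{-\frac{1}{4}}{x + 2}\,dx + \int \frac{\frac{1}{4}}{x - 2}\,dx \\
&= -\frac{1}{4}\ln|x + 2| + \frac{1}{4}\ln|x - 2| + C \\
&= \frac{1}{4}\ln\left|\frac{x - 2}{x + 2}\right| + C.
\end{aligned}$$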
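The "substitute the roots" step can be mirrored in a few lines of exact rational arithmetic. This is a sketch for this particular denominator only (the function names `original` and `decomposed` are made up for illustration), not a general partial-fractions routine.

```python
from fractions import Fraction

# From 1 = A(x - 2) + B(x + 2), substitute the roots of the denominator.
B = Fraction(1, 2 + 2)     # x = 2  gives 1 = 4B
A = Fraction(1, -2 - 2)    # x = -2 gives 1 = -4A

def original(x):
    """1 / (x^2 - 4), evaluated exactly for integer x other than +2 or -2."""
    return Fraction(1, x * x - 4)

def decomposed(x):
    """A / (x + 2) + B / (x - 2) with the coefficients found above."""
    return A / (x + 2) + B / (x - 2)

for x in (0, 1, 3, 5, -7):
    print(x, original(x), decomposed(x))
```

Because `Fraction` arithmetic is exact, the two columns should match exactly rather than approximately.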
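5. Next, what about $\int \frac{1}{x^2 + 4}\,dx$?
Recall from our basic list of integrals that $\int \frac{1}{x^2 + 1}\,dx = \tan^{-1} x + C$. So... we just need to figure out how to deal with the “4”:
$$\int \frac{1}{x^2 + 4}\,dx = \frac{1}{4}\int \frac{1}{\left(\frac{x}{2}\right)^2 + 1}\,dx.$$
Now we let $u = \frac{x}{2}$, and differentiate: $du = \frac{1}{2}\,dx$
$$\begin{aligned}
\int \frac{1}{x^2 + 4}\,dx &= \frac{1}{4}\int \frac{2\,du}{u^2 + 1} = \frac{1}{2}\int \frac{du}{u^2 + 1} \\
&= \frac{1}{2}\tan^{-1} u + C \\
&= \frac{1}{2}\tan^{-1}\frac{x}{2} + C.
\end{aligned}$$
We could easily make this more general:
$$\int \frac{1}{x^2 + a^2}\,dx = \frac{1}{a}\tan^{-1}\frac{x}{a} + C.$$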
$$u = x^2 + 4 \implies du = 2x \, dx.$$
Then
$$\int \frac{x}{x^2 + 4}\,dx = \int \frac{\frac{1}{2}\,du}{u} = \frac{1}{2}\ln|u| + C = \frac{1}{2}\ln\left(x^2 + 4\right) + C.$$
7. Let’s take that one step further: what do we do with $\int \frac{x^2}{x^2 + 4}\,dx$? Well, the integrand is an improper rational function, and if we encounter these it will usually be a good idea to split them up. We can use long division if necessary, but in this case you might be
$$\frac{x^2}{x^2 + 4} = \frac{x^2 + 4}{x^2 + 4} - \frac{4}{x^2 + 4} = 1 - \frac{4}{x^2 + 4}.$$
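integral $\int \frac{dx}{x^2 - 3x + 2}$?
Well, this is actually no more complicated than example 4; we can use partial fractions again.
$$\frac{1}{x^2 - 3x + 2} = \frac{A}{x - 2} + \frac{B}{x - 1}$$
$$\implies 1 = A(x - 1) + B(x - 2)$$
$$\implies B = -1, \quad A = 1.$$
$$\begin{aligned}
\int \frac{dx}{x^2 - 3x + 2} &= \int \frac{1}{x - 2}\,dx - \int \frac{1}{x - 1}\,dx \\
&= \ln|x - 2| - \ln|x - 1| + C \\
&= \ln\left|\frac{x - 2}{x - 1}\right| + C.
\end{aligned}$$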
9. Next, consider $\int \frac{dx}{x^2 - 2x + 5}$. This looks very much like the previous example... but the denominator is irreducible! In this circumstance the trick is to complete the square:
$$\int \frac{dx}{x^2 - 2x + 5} = \int \frac{dx}{(x - 1)^2 + 4} = \frac{1}{2}\tan^{-1}\frac{x - 1}{2} + C.$$
10. As we discussed earlier in the course, every polynomial can be factored into linear and
quadratic factors, so we can break any rational function down into a sum of polynomials
and proper rational functions in which the denominator is of degree no greater than
two. So, after performing long division and a partial fraction decomposition, we really
shouldn’t have to deal with anything worse than this: $\int \frac{2x + 5}{x^2 + 5x + 10}\,dx$.
Actually, this one isn’t so bad - we can just make a substitution!
$$\text{Let } u = x^2 + 5x + 10 \implies du = (2x + 5)\,dx$$
$$\text{and then}\quad \int \frac{2x + 5}{x^2 + 5x + 10}\,dx = \int \frac{du}{u} = \ln|u| + C = \ln\left(x^2 + 5x + 10\right) + C$$
(we’ve dropped the absolute values because this quadratic is always positive).
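$$u = x^2 + x + 1$$
we get
$$du = (2x + 1)\,dx,$$
which does not match the numerator. However, it does tell us something. Notice that it’s really the “+2” in the numerator of our integrand that’s causing the problem. In other words, the substitution would have worked if the problem had been to evaluate $\int \frac{x + \frac{1}{2}}{x^2 + x + 1}\,dx$ instead (because $\frac{du}{2} = \left(x + \frac{1}{2}\right) dx$). Therefore we can try splitting the integral up; the remaining part will have only a constant in the numerator, and we’ll
$$\begin{aligned}
\int \frac{x + 2}{x^2 + x + 1}\,dx &= \int \frac{x + \frac{1}{2}}{x^2 + x + 1}\,dx + \int \frac{\frac{3}{2}}{x^2 + x + 1}\,dx \\
&= \int \frac{\frac{1}{2}\,du}{u} + \frac{3}{2}\int \frac{dx}{\left(x + \frac{1}{2}\right)^2 + \frac{3}{4}} \\
&= \frac{1}{2}\ln|u| + \frac{3}{2}\cdot\frac{2}{\sqrt{3}}\tan^{-1}\left(\frac{2}{\sqrt{3}}\left(x + \frac{1}{2}\right)\right) \qquad \text{(using our formula from example 5 again)} \\
&= \frac{1}{2}\ln\left(x^2 + x + 1\right) + \sqrt{3}\tan^{-1}\left(\frac{2x + 1}{\sqrt{3}}\right) + C.
\end{aligned}$$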
12. The previous example is just about as hard as it gets. The only way it could possibly be worse is if our denominator contains a repeated quadratic factor. In these cases we may
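$$\int \frac{1}{\left(x^2 + 1\right)^2}\,dx,$$
we would let
$$x = \tan\theta \implies dx = \sec^2\theta \, d\theta.$$
Then,
$$\begin{aligned}
\int \frac{1}{\left(x^2 + 1\right)^2}\,dx &= \int \frac{\sec^2\theta}{\left(\tan^2\theta + 1\right)^2}\,d\theta \\
&= \int \frac{\sec^2\theta}{\sec^4\theta}\,d\theta \\
&= \int \cos^2\theta \, d\theta \\
&= \int \frac{1 + \cos 2\theta}{2}\,d\theta \\
&= \frac{1}{2}\left(\theta + \frac{\sin(2\theta)}{2}\right) + C.
\end{aligned}$$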
To return to x, we first need to apply one of the double angle formulas, to obtain
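$$\frac{1}{2}\left(\theta + \sin\theta\cos\theta\right) + C.$$
[Figure 63: a right triangle with opposite side x, adjacent side 1, and hypotenuse $\sqrt{x^2 + 1}$]
and we have our result:
$$\int \frac{1}{\left(x^2 + 1\right)^2}\,dx = \frac{1}{2}\left(\tan^{-1} x + \frac{x}{x^2 + 1}\right) + C.$$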
29 Summary of Integration Rules and Techniques
• $\int \sec x \, dx = \ln|\sec x + \tan x| + C$
• $\int \csc x \, dx = -\ln|\csc x + \cot x| + C$
• Integration by Parts
• Rewriting the Integrand (by simple algebra, trigonometric identities, partial fractions,
or completing squares)
All of these techniques have the same goal: to replace the given integral with an integral which
is on our list! For this reason some memorization is essential; you MUST know these formulas.
30 More Applications of Integration
Now that we’ve studied all of the methods of integration available to us, we turn to another
couple of applications.
Given a segment of a curve y = f (x), spanning an interval [a, b], we’ve discussed how to find
the area below it, and how to find its average distance from the x-axis. We might also be
interested in its length. Of course, we could measure this, by laying a string along the curve
(carefully), then stretching out the string and measuring that, but how might we calculate it?
The strategy will be the same as in every other application of integration; we’ll break the
problem into small pieces. The first step, as usual, is to partition an axis (let’s say the x-axis)
into subintervals of equal length (∆x). Now, the section of the curve y = f (x) which spans
each one of these subintervals will be short (and will get shorter when we let ∆x approach
zero), and so we can reasonably approximate it by a straight line. . ., and we do know how to
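[Figure 64: over the subinterval from x_{i−1} to x_i, the curve is approximated by a straight line segment of length ∆L_i, with horizontal run ∆x and vertical rise ∆y_i]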
From the figure above, you can see that the length of each straight line segment can be
expressed as
$$\Delta L_i = \sqrt{(\Delta x)^2 + (\Delta y_i)^2}. \tag{5}$$
What we need to do next is let ∆x approach zero, but it might not be immediately clear how we would go about calculating $\lim_{\Delta x \to 0}\sum_{i=1}^{n}\sqrt{(\Delta x)^2 + (\Delta y_i)^2}$; we need one more step first. The trick is to factor a copy of ∆x out of the square root; this way we have a differential at the end of our expression, and we can interpret our sum as a Riemann sum. That is, from equation (5) we write
$$\Delta L_i = \sqrt{1 + \left(\frac{\Delta y_i}{\Delta x}\right)^2}\,\Delta x,$$
so that
$$L \approx \sum_{i=1}^{n}\sqrt{1 + \left(\frac{\Delta y_i}{\Delta x}\right)^2}\,\Delta x,$$
and now we may state that the exact length of the curve y = f (x) over the interval [a, b] is
$$L = \int_a^b \sqrt{1 + \left(\frac{dy}{dx}\right)^2}\,dx.$$
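The straight-line approximation behind this formula can be tried directly. The sketch below sums the segment lengths from equation (5) for an illustrative curve, y = cosh x on [0, 1], chosen here only because its exact length, sinh 1, is easy to state; the function name `polygonal_length` is an assumption.

```python
import math

def polygonal_length(f, a, b, n):
    """Sum of the n straight-segment lengths sqrt(dx^2 + dy_i^2)."""
    dx = (b - a) / n
    total = 0.0
    for i in range(1, n + 1):
        dy = f(a + i * dx) - f(a + (i - 1) * dx)
        total += math.sqrt(dx * dx + dy * dy)
    return total

exact = math.sinh(1.0)  # integral of sqrt(1 + sinh^2 x) = cosh x over [0, 1]
for n in (10, 100, 1000):
    print(n, polygonal_length(math.cosh, 0.0, 1.0, n), exact)
```

The sums increase toward the exact length as the partition is refined, which is what the limiting argument above relies on.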
Example: It can be shown that a cable suspended between two supports of equal height
$$y = a\cosh\left(\frac{x}{a}\right) + K,$$
[Figure: the hanging cable, with the heights y = a·cosh(1) + K and y = a + K marked]
If we can determine the value of a, then we can calculate the length of the cable (the
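$$\begin{aligned}
L &= \int_{-D/2}^{D/2} \sqrt{1 + \left(\frac{dy}{dx}\right)^2}\,dx \\
&= \int_{-D/2}^{D/2} \sqrt{1 + \sinh^2\left(\frac{x}{a}\right)}\,dx \\
&= 2\int_0^{D/2} \sqrt{1 + \sinh^2\left(\frac{x}{a}\right)}\,dx \qquad \text{(since the integrand is even)} \\
&= 2\int_0^{D/2} \sqrt{\cosh^2\left(\frac{x}{a}\right)}\,dx \qquad \text{(since } \cosh^2 x - \sinh^2 x = 1\text{)} \\
&= 2\int_0^{D/2} \cosh\left(\frac{x}{a}\right) dx \qquad \text{(since } \cosh x > 0 \text{ for all } x\text{)} \\
&= 2a\sinh\left(\frac{D}{2a}\right).
\end{aligned}$$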
Unfortunately, most of the integrals generated by the arc length formula will have to be
approximated by numerical methods. For example, even for a simple curve like $y = x^3$, the
$$L = \int_a^b \sqrt{1 + 9x^4}\,dx,$$
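As an illustration of what "approximated by numerical methods" can look like, here is a minimal sketch that applies a composite Simpson's rule to the arc length integral for y = x³, taking a = 0 and b = 1 as an arbitrary example interval. The helper name `simpson` and the subinterval counts are assumptions.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def speed(x):
    """Integrand sqrt(1 + (dy/dx)^2) for y = x^3."""
    return math.sqrt(1.0 + 9.0 * x ** 4)

length_coarse = simpson(speed, 0.0, 1.0, n=100)
length_fine = simpson(speed, 0.0, 1.0, n=1000)
print(length_coarse, length_fine)
```

A simple plausibility check: the result must exceed the straight-line distance √2 between (0, 0) and (1, 1), and cannot exceed 2, the combined horizontal and vertical travel of this increasing curve.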
Comment: We can easily adapt the formula for curves which are given in the form x = g (y).
We simply subdivide the y axis instead, and factor out ∆y before taking the limit as ∆y → 0,
to obtain
$$L = \int_c^d \sqrt{1 + \left(\frac{dx}{dy}\right)^2}\,dy.$$
Consider the graph of an equation y = f (x) over an interval [a, b], with f (x) > 0 on this interval. Imagine revolving this curve segment about the x-axis, in three dimensions; the result will be a three-dimensional surface. We can go one step further; consider the region bounded by the curve segment, the x-axis, and the lines x = a and x = b, and imagine revolving that entire region about the x-axis; the result will be a three-dimensional solid. For
[Figure 66: graph of y = e^x]
but because of the symmetry of this particular object it can be fully described in terms of the
single variable x, and we already have the tools to answer questions such as this: what is the
We are about to develop a formula for this, but before we do so, here’s a word of caution:
we could easily consider a different axis of revolution, and each choice will require a different
formula. For example, we could consider revolving the same curve segment about the line
y = −5; this will lead to a larger solid with a rather different geometry:
We could even consider a vertical axis of revolution; using the y-axis produces a circular
Figure 69:
Because of this variability in the nature of the examples you might encounter, you are
strongly advised not to rely on memorization of the formulas we are about to develop. Instead,
How are we going to develop these formulas? By now the strategy should be familiar: we’re going to partition the x-axis (or possibly the y-axis) into intervals of length ∆x (or ∆y), make an approximation of the volume for each corresponding section of the solid, add them up, and take a limit as the width of the intervals goes to zero. Since we’ve done this sort of thing more than once now, for different applications, let’s dispense with some of the rigour. In a very rough sense, the essential idea of integration is to break a problem down into a sum of infinitely many infinitesimal parts. We’ll start by dividing the region into infinitesimally thin rectangles (of thickness dx or dy), and calculate the volume of the corresponding (infinitesimally thin) section of the solid. We’ll label this volume dV (often called a “volume element”). Summing up all infinitely many of them, the volume of the entire solid will be $V = \int dV$.
There are two very different circumstances to consider, each with a few possible variations.
30.2.1 Vertical Rectangles Revolved about a Horizontal Axis (or Vice Versa)
Consider the first problem proposed above (Figure 67). If we were asked to find the area of the
generating region, it would be sensible to begin with a partition of the x-axis, producing thin
vertical rectangles. So, let’s start the same way, but ask a different question: what happens to
infinitesimally thin vertical rectangles as the region they cover is revolved about the x-axis?
We hope you can see that each one of these rectangles will generate a disk, of radius f (x)
$$dV = (\text{area of face})(\text{thickness}) = \pi [f(x)]^2\,dx.$$
$$V = \int dV = \int_a^b \pi [f(x)]^2\,dx.$$
Setting $f(x) = e^x$ and the interval as [1, 2], we find that the volume of the object in Figure 67 is
$$V = \int_1^2 \pi e^{2x}\,dx = \frac{\pi}{2}\left(e^4 - e^2\right) \text{ units}^3.$$
Now consider the second of our proposed problems (Figure 68). What happens when we revolve our vertical rectangles about an axis y = k, k < 0? We obtain disks with holes in the middle (we call these washers). This just requires a minor adjustment to our formula; each infinitesimally-thin washer has outer radius f (x) − k and inner radius −k (both positive, since k < 0), so it will have volume
$$\begin{aligned}
dV &= (\text{area of face})(\text{thickness}) \\
&= \left(\pi r_{\text{outer}}^2 - \pi r_{\text{inner}}^2\right) dx \\
&= \pi\left[(f(x) - k)^2 - k^2\right] dx,
\end{aligned}$$
$$V = \int_a^b \pi\left[(f(x))^2 - 2k f(x)\right] dx.$$
So (setting $f(x) = e^x$, k = −5 and the interval as [1, 2] again) we find that the volume of the object in Figure 68 is
$$V = \int_1^2 \pi\left[\left(e^x + 5\right)^2 - 5^2\right] dx = \pi\int_1^2 \left(e^{2x} + 10e^x\right) dx = \pi\left(\frac{1}{2}e^4 + \frac{19}{2}e^2 - 10e\right) \text{ units}^3.$$
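Both volumes can be confirmed numerically. The sketch below assumes the curve y = e^x on [1, 2] revolved about the x-axis (disks) and about the line y = −5 (washers), as in the text, and uses a hypothetical helper `simpson` for the integration.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

k = -5.0  # axis of revolution y = k for the washer case

# Disks: radius f(x) = e^x about the x-axis.
disk_volume = simpson(lambda x: math.pi * math.exp(x) ** 2, 1.0, 2.0)
# Washers: outer radius f(x) - k, inner radius -k.
washer_volume = simpson(
    lambda x: math.pi * ((math.exp(x) - k) ** 2 - k ** 2), 1.0, 2.0
)

e = math.e
disk_exact = math.pi / 2.0 * (e ** 4 - e ** 2)
washer_exact = math.pi * (0.5 * e ** 4 + 9.5 * e ** 2 - 10.0 * e)
print(disk_volume, disk_exact)
print(washer_volume, washer_exact)
```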
30.2.2 Vertical Rectangles about a Vertical Axis (or Horizontal about Horizontal)
Next, consider the third of our proposed problems (Figure 69). If we stick with the choice
of vertical rectangles (which does make sense, given the shape of the region), then we find
ourselves with a very different situation: revolving these rectangles about a vertical axis won’t
produce disks or washers at all. Instead, we obtain infinitesimally thin cylinders (these are
Very well, then... what’s the volume of a cylindrical shell? We could calculate it as the
difference between the volumes of two cylinders of (very) slightly different radii. For the
situation shown in Figure 69, the smaller cylinder would have radius x, and the larger one
would have radius x + dx. They both have height f (x), and so the volume of a typical
$$\begin{aligned}
dV &= \pi\left(r_{\text{outer}}^2 - r_{\text{inner}}^2\right) h \\
&= \pi\left[(x + dx)^2 - x^2\right] f(x) \\
&= \pi\left[2x\,dx + (dx)^2\right] f(x).
\end{aligned}$$
Since dx is infinitesimally small, the $(dx)^2$ term is infinitesimal even when compared to dx, so it can be ignored; we can use $dV = 2\pi x f(x)\,dx$ as our volume element, and the volume of the
$$V = \int_a^b 2\pi x f(x)\,dx.$$
Thus we find that the volume of the object in Figure (69) is $V = \int_1^2 2\pi x e^x\,dx = 2\pi\left[xe^x - e^x\right]_1^2 = 2\pi e^2$ units³.
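To see why dropping the $(dx)^2$ term is harmless, the following sketch compares the exact volume of a shell (the difference of two cylinders) with the volume element 2πx f(x) dx for shrinking thickness. The sample radius x = 1.5, the height e^x, and the function names are illustrative assumptions.

```python
import math

def exact_shell_volume(x, dx, h):
    """Difference between cylinders of radii x + dx and x, both of height h."""
    return math.pi * ((x + dx) ** 2 - x ** 2) * h

def thin_shell_volume(x, dx, h):
    """Volume element 2*pi*x*h*dx, with the (dx)^2 term dropped."""
    return 2.0 * math.pi * x * h * dx

def relative_error(x, dx, h):
    exact = exact_shell_volume(x, dx, h)
    return (exact - thin_shell_volume(x, dx, h)) / exact

x = 1.5
h = math.exp(x)  # height f(x) = e^x
for dx in (0.1, 0.01, 0.001):
    print(dx, relative_error(x, dx, h))
```

The relative error works out to dx / (2x + dx), so it shrinks in proportion to the shell thickness.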
Note that we’ll also generate cylindrical shells if we revolve horizontal rectangles about
a horizontal axis, and so we’ll require a similar integral in y. And of course, just as our
disk/washer formula might need to be modified on a case-by-case basis, the same is true of
our cylindrical shell formula. If the original region is bounded between two curves y = f (x)
and y = g (x) with f (x) > g (x) on [a, b], and the axis of revolution is, say, the line x = k
(where k < a), then we’ll need to calculate V as $\int_a^b 2\pi (x - k)\left[f(x) - g(x)\right] dx$.
Comments:
1. There is a simpler way to think of the cylindrical shell formula: since the cylindrical shell
is infinitesimally thin, we could imagine cutting it along a vertical line, and flattening
it out like a sheet of paper. That sheet would have thickness dx, height h = f (x), and
length given by what was the circumference of the cylindrical shell: 2πr = 2πx. Thinking
2. Most textbooks devote separate sections to a “disk / washer method” and a “cylindrical
shell” method, but you shouldn’t be making decisions based upon which of the “methods”
you want to use. Rather, the decision to be made is whether to partition the original
region into vertical rectangles or horizontal rectangles - the appropriate formula will
follow from that decision. How do we make that decision? The same way we make that
decision in other applications, like finding areas (we’d rather end up with one integral
than more than one, and some functions might be easier to work with than their inverses).
3. Something to think about: what if the axis of revolution is at an angle? It’s unlikely
that you’ll see any problems like this, but ask yourself: if the same generating region
we’ve used in our discussion so far were to be revolved about the line y = x − 2, say,
Example: Consider the region bounded by the curve y = tan−1 x and the lines x = 0 and
y = π/4. What is the volume of the object generated by revolving this region about the line
x = −1?
Solution: This region can be conveniently broken down into either vertical or horizontal
(I) Using Vertical Rectangles: Revolving vertical rectangles about the given (verti-
(II) Using Horizontal Rectangles: Revolving horizontal rectangles about the given
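$$\begin{aligned}
dV &= (\text{area})(\text{thickness}) \\
&= \left(\pi r_2^2 - \pi r_1^2\right) dy \\
&= \pi\left[\left(x_{\text{right}} + 1\right)^2 - \left(x_{\text{left}} + 1\right)^2\right] dy \\
&= \pi\left[(\tan y + 1)^2 - 1^2\right] dy.
\end{aligned}$$
So, if we don’t like the look of the integral in (I), we could try this instead:
$$\begin{aligned}
V &= \int_0^{\pi/4} \pi\left(\tan^2 y + 2\tan y\right) dy \\
&= \pi\int_0^{\pi/4} \left(\sec^2 y - 1 + 2\tan y\right) dy \\
&= \pi\left[\tan y - y - 2\ln|\cos y|\right]_0^{\pi/4} \\
&= \pi\left(1 - \frac{\pi}{4} + \ln 2\right).
\end{aligned}$$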
Exercise: Verify that the integral in (I) gives the same result. Hint: resist the urge to multiply everything out! Just proceed directly with integration by parts, with $u = \frac{\pi}{4} - \tan^{-1} x$ and dv = (x + 1) dx.
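Before attempting the exercise by hand, a numerical comparison can be reassuring. This sketch assumes that the shell integral in (I) has the form $2\pi\int_0^1 (x + 1)\left(\frac{\pi}{4} - \tan^{-1} x\right) dx$, which is what the hint suggests; the helper `simpson` is likewise an illustrative assumption.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# (I) Vertical rectangles (shells): radius x + 1, height pi/4 - arctan(x).
shells = simpson(
    lambda x: 2.0 * math.pi * (x + 1.0) * (math.pi / 4.0 - math.atan(x)),
    0.0, 1.0,
)
# (II) Horizontal rectangles (washers): radii tan(y) + 1 and 1.
washers = simpson(
    lambda y: math.pi * ((math.tan(y) + 1.0) ** 2 - 1.0),
    0.0, math.pi / 4.0,
)
exact = math.pi * (1.0 - math.pi / 4.0 + math.log(2.0))
print(shells, washers, exact)
```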
That’s right - we’re not done with solids of revolution yet; there’s another question we can
answer! Let’s revisit the original idea from the previous section: take a segment of the curve
y = f (x) over an interval [a, b] and revolve it about the x-axis. To find the surface area of this
solid of revolution, we begin by partitioning the given interval [a, b] into subintervals of length
dx. On each of these intervals, the curve y = f (x) can be approximated by a straight line
segment - and from our earlier discussion of arc length we know the length of this segment to be $ds = \sqrt{1 + [f'(x)]^2}\,dx$. When revolved about the x-axis, each one of these line segments generates a frustum of a cone (a section bounded by two parallel planes). So, what’s the
The change in the radius of the cone over the interval is negligible, so we can approximate the surface area of each frustum as the surface area of a cylinder of radius f (x) and length ds. That is, for each interval of length dx, we get a frustum of surface area $dA = 2\pi r l = 2\pi f(x)\sqrt{1 + [f'(x)]^2}\,dx$, and so the surface area for the entire solid of revolution (not counting
$$A = \int_a^b 2\pi f(x)\sqrt{1 + [f'(x)]^2}\,dx.$$
What if we are to revolve the curve segment about a vertical axis instead? Then the radius
of each frustum isn’t f (x)! If, for example, the axis of revolution is the y-axis (and a > 0),
$$A = \int_a^b 2\pi x\sqrt{1 + [f'(x)]^2}\,dx.$$
presence of the square root will often make these difficult to evaluate exactly (as we saw with
Examples:
1. For the solid we began with, generated by the curve y = ex over the interval [1, 2],
$$A = \int_1^2 2\pi e^x\sqrt{1 + e^{2x}}\,dx.$$
This can be tackled by hand (with methods we’ve discussed), but it’s a challenge. The exact value is $\pi\left[e^2\sqrt{1 + e^4} - e\sqrt{1 + e^2} + \ln\left(e^2 + \sqrt{1 + e^4}\right) - \ln\left(e + \sqrt{1 + e^2}\right)\right]$ square
2. For our second solid (Figure 68), the area of the outer surface (just the surface generated
by the curve y = ex , ignoring the left, right, and interior surfaces of the solid) must be
$$A = \int_1^2 2\pi\left(e^x + 5\right)\sqrt{1 + e^{2x}}\,dx.$$
The only change we’ve had to make is to increase the radii of the frusta, but it makes the integral even harder. A calculator gives an approximate value of 301.7 square units.
3. For our third solid (Figure 69), the radius of each frustum is x, so the surface generated
$$A = \int_1^2 2\pi x\sqrt{1 + e^{2x}}\,dx.$$
That integral doesn’t look very friendly either, so maybe we could consider integrating
gives us
$$A = \int_e^{e^2} 2\pi \ln y\,\sqrt{1 + \frac{1}{y^2}}\,dy.$$
Ouch. It’s safe to say that we won’t be asking you to evaluate that by hand on your final
exam, either! The really important thing (for your exam, and for your future studies) is
that you be able to set up the integral correctly. After all, that’s the part that requires
human understanding; the evaluation can always be left to a machine. The two integrals
above give the same result (of course): it’s approximately 47.45 square units.
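Leaving the evaluation "to a machine" might look like the following sketch, which approximates the surface area integrals from the three examples with a composite Simpson's rule. The helper names `simpson` and `ds_dx` and the variable names are assumptions made for illustration.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def ds_dx(x):
    """Arc length factor sqrt(1 + (dy/dx)^2) for y = e^x."""
    return math.sqrt(1.0 + math.exp(2.0 * x))

two_pi = 2.0 * math.pi
area_1 = simpson(lambda x: two_pi * math.exp(x) * ds_dx(x), 1.0, 2.0)
area_2 = simpson(lambda x: two_pi * (math.exp(x) + 5.0) * ds_dx(x), 1.0, 2.0)
area_3_x = simpson(lambda x: two_pi * x * ds_dx(x), 1.0, 2.0)
area_3_y = simpson(
    lambda y: two_pi * math.log(y) * math.sqrt(1.0 + 1.0 / (y * y)),
    math.e, math.e ** 2,
)

print(area_1, area_2, area_3_x, area_3_y)
```

The last two values should agree with each other, since they describe the same surface using x and y respectively as the variable of integration.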
31 Improper Integrals
Our definition of the definite integral applies only to continuous functions, on closed intervals. However, it is possible to relax these conditions. For example, consider the function $f(x) = \frac{1}{x^2}$. The area below the graph of f (x) from x = 1 to an arbitrary point x = t is
$$A(t) = \int_1^t \frac{1}{x^2}\,dx = \left[-\frac{1}{x}\right]_1^t = -\frac{1}{t} + 1.$$
[Figure 70: the region under y = 1/x^2 between x = 1 and x = t]
This gives $A(2) = \frac{1}{2}$, $A(3) = \frac{2}{3}$, $A(4) = \frac{3}{4}$, etc, and we immediately notice something: the area will always be less than 1, and in fact we can state that
$$\lim_{t \to \infty} A(t) = \lim_{t \to \infty}\left(1 - \frac{1}{t}\right) = 1.$$
More generally, then, we can extend our definition of the definite integral to infinite intervals
in this way:
Definition:
If f (x) is continuous on [a, ∞), then the improper integral $\int_a^{\infty} f(x)\,dx$ is defined as
$$\int_a^{\infty} f(x)\,dx = \lim_{t \to \infty}\int_a^t f(x)\,dx.$$
If this limit exists, then we say that the integral converges; otherwise it diverges.
Similarly, we define
$$\int_{-\infty}^{a} f(x)\,dx = \lim_{t \to -\infty}\int_t^a f(x)\,dx.$$
$$\int_1^{\infty} \frac{1}{x^2}\,dx = \lim_{t \to \infty}\int_1^t \frac{1}{x^2}\,dx = \dots = 1$$
A comment should be made here; we appear to be looking at a region which has infinite
length, but finite area! If this seems paradoxical, it’s because we (humans) tend to misinterpret
statements involving the concept of “infinity”. Infinity is not a “thing”, and there’s no such
$$\int_1^{\infty} \frac{1}{x^2}\,dx = 1$$
$$\lim_{t \to \infty}\int_1^t \frac{1}{x^2}\,dx = 1;$$
21
Actually, it is possible to define infinity as an object, but care is required. In the “extended reals”, ∞ and
−∞ are treated as numbers, while in the usual treatment of complex numbers they merge into a single point
at infinity (if you’re interested, look up “Riemann Sphere”).
it simply means that as we continue to enlarge the area highlighted in Figure 70, the area
increases more and more slowly, and never reaches 1. This is analogous to the idea of a
convergent infinite sequence, which you’ll study in Math 119; as an example consider that we
could take the number 0.9, add 0.09, then add 0.009, etc, and never reach 1 no matter how
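Example 2: Now consider the improper integral $\int_1^{\infty} \frac{1}{x}\,dx$. Applying the definition, we find that
$$\begin{aligned}
\int_1^{\infty} \frac{1}{x}\,dx &= \lim_{t \to \infty}\int_1^t \frac{1}{x}\,dx \\
&= \lim_{t \to \infty} \ln|x|\,\Big|_1^t \\
&= \lim_{t \to \infty} \ln t \\
&= \infty.
\end{aligned}$$
That is, this integral diverges. In fact, it is not hard to show that $\int_1^{\infty} \frac{1}{x^p}\,dx$ converges if and only if p > 1.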
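The p > 1 criterion can be explored numerically using the elementary antiderivative of $x^{-p}$. The sketch below (the function name `integral_1_to_t` is an assumption) evaluates the integral from 1 to t exactly for a few values of p and increasingly large t.

```python
import math

def integral_1_to_t(p, t):
    """Exact value of the integral of x**(-p) from 1 to t, for t > 1."""
    if p == 1:
        return math.log(t)
    return (t ** (1 - p) - 1) / (1 - p)

for p in (0.5, 1, 2):
    values = [integral_1_to_t(p, t) for t in (1e2, 1e4, 1e6)]
    print(p, values)
```

For p = 2 the values settle toward 1, while for p = 1 and p = 0.5 they keep growing as t increases, consistent with divergence.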
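Example 3: Consider $\int_2^{\infty} \frac{1}{x(\ln x)^2}\,dx$. To find an antiderivative, we would let u = ln x, giving $du = \frac{1}{x}\,dx$. Hence
$$\begin{aligned}
\int_2^{\infty} \frac{1}{x(\ln x)^2}\,dx &= \lim_{t \to \infty}\int_2^t \frac{1}{x(\ln x)^2}\,dx \\
&= \lim_{t \to \infty}\int_{\ln 2}^{\ln t} \frac{1}{u^2}\,du \\
&= \lim_{t \to \infty}\left[-\frac{1}{u}\right]_{\ln 2}^{\ln t} \\
&= \lim_{t \to \infty}\left(-\frac{1}{\ln t} + \frac{1}{\ln 2}\right) \\
&= \frac{1}{\ln 2}.
\end{aligned}$$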
For functions with discontinuities (or for functions being considered on finite but open
Definition:
If f (x) is continuous at every point in the interval [a, b] except at x = a, then the improper integral $\int_a^b f(x)\,dx$ is defined as
$$\int_a^b f(x)\,dx = \lim_{t \to a^+}\int_t^b f(x)\,dx.$$
As for the first type of improper integral, if this limit exists, then we say that the integral
$$\int_a^b f(x)\,dx := \lim_{t \to b^-}\int_a^t f(x)\,dx.$$
$$\int_a^b f(x)\,dx := \int_a^c f(x)\,dx + \int_c^b f(x)\,dx,$$
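Example 4:
$$\begin{aligned}
\int_0^1 \frac{1}{\sqrt{x}}\,dx &= \lim_{t \to 0^+}\int_t^1 \frac{1}{\sqrt{x}}\,dx \\
&= \lim_{t \to 0^+} 2\sqrt{x}\,\Big|_t^1 \\
&= \lim_{t \to 0^+}\left(2 - 2\sqrt{t}\right) \\
&= 2
\end{aligned}$$
More generally, it can be shown that $\int_0^1 \frac{1}{x^p}\,dx$ converges if and only if p < 1 (compare this to Example 2).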
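Example 5: Consider $\int_1^4 \frac{dt}{(t - 2)^2}$. If you fail to notice the discontinuity, you might write
$$\int_1^4 \frac{dt}{(t - 2)^2} = \left[-\frac{1}{t - 2}\right]_1^4 = -\frac{1}{2} - 1 = -\frac{3}{2},$$
and it should be obvious that this can’t be right (the integrand is always positive, so the
$$\begin{aligned}
\int_1^4 \frac{dt}{(t - 2)^2} &= \int_1^2 \frac{dt}{(t - 2)^2} + \int_2^4 \frac{dt}{(t - 2)^2} \\
&= \lim_{x \to 2^-}\int_1^x \frac{dt}{(t - 2)^2} + \lim_{x \to 2^+}\int_x^4 \frac{dt}{(t - 2)^2} \\
&= \lim_{x \to 2^-}\left[-\frac{1}{t - 2}\right]_1^x + \lim_{x \to 2^+}\left[-\frac{1}{t - 2}\right]_x^4 \\
&= \lim_{x \to 2^-}\left(-\frac{1}{x - 2} - 1\right) + \lim_{x \to 2^+}\left(-\frac{1}{2} + \frac{1}{x - 2}\right).
\end{aligned}$$
Now, the first limit is infinite, and the second one is also infinite. Hence the integral diverges.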
Example 6: Consider $\int_{-\infty}^{\infty} x\,dx$. You might be tempted to say that this is zero, since $\int_{-a}^{a} x\,dx = 0$, for any value of a. However, in doing so you’d be assuming that the upper and lower limits of the integral go to ±∞ at the same rate! To use our definitions, we must write (see footnote 22)
$$\begin{aligned}
\int_{-\infty}^{\infty} x\,dx &= \int_{-\infty}^{0} x\,dx + \int_0^{\infty} x\,dx \\
&= \underbrace{\lim_{t \to -\infty}\left[\frac{x^2}{2}\right]_t^0}_{\to\,-\infty} + \underbrace{\lim_{t \to \infty}\left[\frac{x^2}{2}\right]_0^t}_{\to\,\infty}
\end{aligned}$$
22 The temptation is to write $\int_{-\infty}^{\infty} f(x)\,dx = \lim_{t \to \infty}\int_{-t}^{t} f(x)\,dx$. In fact, this definition is used in certain contexts, because it does give useful results in applications. However, from a strictly logical perspective, it is flawed; why couldn’t $\int_{-\infty}^{\infty} f\,dx$ be defined instead as $\lim_{t \to \infty}\int_{-t}^{2t} f\,dx$, or perhaps $\lim_{t \to \infty}\int_{-t}^{t+1} f\,dx$?
Since both integrals diverge, we must conclude that $\int_{-\infty}^{\infty} x\,dx$ diverges.
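The point about the limits going to ±∞ "at the same rate" can be made concrete. In this sketch (the function name `integral_of_x` is an assumption), the integral of x is evaluated exactly over three families of intervals that all expand to the whole real line, and the results behave completely differently.

```python
def integral_of_x(lower, upper):
    """Exact integral of x from lower to upper: (upper^2 - lower^2) / 2."""
    return (upper ** 2 - lower ** 2) / 2.0

for t in (10.0, 100.0, 1000.0):
    symmetric = integral_of_x(-t, t)        # always 0
    stretched = integral_of_x(-t, 2.0 * t)  # grows like 3t^2 / 2
    shifted = integral_of_x(-t, t + 1.0)    # grows like t + 1/2
    print(t, symmetric, stretched, shifted)
```

Since the answer depends on how the two limits are taken, no single value can be assigned to the integral, which is why the definition treats the two halves separately.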
If you’re wondering why we need these definitions, here are a couple of examples to show
Example 7: Suppose we try to apply our arc length formula to a circle of radius r to find its circumference (yes, we already know what the answer should be!). Let’s proceed by finding the length of the portion of the circle in the first quadrant, and multiplying it by 4. The upper half of the circle has equation $y = \sqrt{r^2 - x^2}$, so $y' = -\frac{x}{\sqrt{r^2 - x^2}}$. Therefore the circumference of the circle must be
$$\begin{aligned}
C &= 4\int_0^r \sqrt{1 + \left(\frac{dy}{dx}\right)^2}\,dx \\
&= 4\int_0^r \sqrt{1 + \frac{x^2}{r^2 - x^2}}\,dx \\
&= 4\int_0^r \sqrt{\frac{r^2}{r^2 - x^2}}\,dx \\
&= 4\int_0^r \frac{r}{\sqrt{r^2 - x^2}}\,dx.
\end{aligned}$$
This is an improper integral, since the integrand is undefined at the upper limit x = r. So,
$$\begin{aligned}
C &= 4\lim_{t \to r^-}\int_0^t \frac{1}{\sqrt{1 - \left(\frac{x}{r}\right)^2}}\,dx \\
&= 4\lim_{t \to r^-}\left[r\sin^{-1}\left(\frac{x}{r}\right)\right]_0^t \\
&= 4r\lim_{t \to r^-}\left(\sin^{-1}\left(\frac{t}{r}\right) - \sin^{-1}(0)\right) \\
&= 4r\sin^{-1}(1) \\
&= 2\pi r,
\end{aligned}$$
as expected.
Note: In this particular example (and many similar ones), you would still get the same
result if you failed to notice that the integral was improper. However, understanding the
concept may still be helpful to you. For instance, if you were to try to evaluate this integral
using a calculator (after choosing a specific value for r), it would fail! Realizing that it is
improper might help you modify your approach to fix the problem.
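The kind of failure described here is easy to reproduce. In the sketch below (r = 2 is an arbitrary choice, and the function names are assumptions), evaluating the integrand at the upper limit raises an error, whereas taking the upper limit t slightly below r and using the antiderivative from the example gives values approaching 2πr.

```python
import math

r = 2.0

def integrand(x):
    """r / sqrt(r^2 - x^2); undefined at x = r."""
    return r / math.sqrt(r * r - x * x)

try:
    integrand(r)
except ZeroDivisionError:
    print("integrand is undefined at x = r")

def truncated_circumference(t):
    """4 * integral of the integrand from 0 to t < r, via r * asin(x / r)."""
    return 4.0 * r * math.asin(t / r)

for eps in (1e-2, 1e-6, 1e-12):
    print(eps, truncated_circumference(r * (1.0 - eps)), 2.0 * math.pi * r)
```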
Example 8: We can use an improper integral to calculate the escape velocity of a projectile
fired from a planet’s surface (we’ll ignore air resistance here). The key realization is that the
kinetic energy of the projectile at launch, $\frac{1}{2}mv^2$, must be greater than the total work to be done against gravity as the projectile rises. Now, for motion in a straight line we know that work = force × distance, but if the force varies with distance then this must be calculated as
$$W = \int_a^b F(x)\,dx$$
(we split the journey into small distances of length dx, imagine the gravitational force to be
constant over each small interval, and sum up the amounts of work required to traverse each
one).
In our case,
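$$F = -\frac{GMm}{x^2},$$
and we want to move the projectile from x = R (the surface of the planet) to x = ∞ (an
$$\begin{aligned}
\text{Work} &= \int_R^{\infty} \frac{GMm}{x^2}\,dx \\
&= \lim_{t \to \infty}\int_R^t \frac{GMm}{x^2}\,dx \\
&= GMm\lim_{t \to \infty}\left[-\frac{1}{x}\right]_R^t \\
&= GMm\lim_{t \to \infty}\left(-\frac{1}{t} + \frac{1}{R}\right) \\
&= \frac{GMm}{R}.
\end{aligned}$$
$$\frac{1}{2}mv^2 > \frac{GMm}{R},$$
$$\text{i.e.}\quad v > \sqrt{\frac{2GM}{R}}.$$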
If the planet is earth, this gives an astonishing speed of 11.2 km/s... and we haven’t even
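The figure quoted for Earth can be reproduced with a short calculation. The constants below are commonly quoted approximate values and are assumptions here, not taken from the notes.

```python
import math

# Approximate physical constants (assumed values, SI units).
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
M_EARTH = 5.972e24   # mass of the Earth, kg
R_EARTH = 6.371e6    # mean radius of the Earth, m

def escape_velocity(mass, radius):
    """Threshold speed sqrt(2GM/R) from the inequality v > sqrt(2GM/R)."""
    return math.sqrt(2.0 * G * mass / radius)

v = escape_velocity(M_EARTH, R_EARTH)
print(v / 1000.0, "km/s")
```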
Part V
Polar Coordinates
32 Introduction
So far, all of our discussions have taken place in the Cartesian coordinate system, in which
the coordinates are the distances from a vertical and a horizontal axis, respectively.
[Figure 71: the point P = (x, y) in the Cartesian coordinate system]
Alternatively, we could describe the location of a point by giving its distance from the
origin (we’ll call this ρ) and the angle between the horizontal axis and a line segment drawn
[Figure 72: the point P = (ρ, φ) in polar coordinates]
can take on any value, although we only need the values φ ∈ [0, 2π]. This means that polar
representations are not unique; the point (1, 0) can also be described as (1, 2π), (1, 4π), etc.
Note also that we lack distinct notations for the two systems; if we see the expression (a, b)
If we lay our two diagrams over top of each other, we can convert between the two systems.
x = ρ cos φ
y = ρ sin φ.
[Figure 73: the point P = (ρ, φ), showing the distance ρ, the angle φ, and the vertical coordinate y]
Examples: For points on the axes, no calculations should be necessary; just think about
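$$(x, y) = (0, 1) \text{ is } (\rho, \varphi) = \left(1, \frac{\pi}{2}\right)$$
$$(x, y) = (1, 1) \text{ is } (\rho, \varphi) = \left(\sqrt{2}, \frac{\pi}{4}\right)$$
$$(x, y) = (-1, 1) \text{ is } (\rho, \varphi) = \left(\sqrt{2}, \frac{3\pi}{4}\right)$$
etc.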
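These conversions can be automated. The sketch below uses the formulas x = ρ cos φ, y = ρ sin φ in one direction and, as an assumption about the inverse relations, ρ = √(x² + y²) together with the two-argument arctangent in the other; the function names are illustrative.

```python
import math

def to_polar(x, y):
    """Return (rho, phi) with rho >= 0 and phi in [0, 2*pi)."""
    rho = math.hypot(x, y)
    phi = math.atan2(y, x) % (2.0 * math.pi)
    return rho, phi

def to_cartesian(rho, phi):
    """Return (x, y) using x = rho*cos(phi), y = rho*sin(phi)."""
    return rho * math.cos(phi), rho * math.sin(phi)

for point in ((0.0, 1.0), (1.0, 1.0), (-1.0, 1.0)):
    rho, phi = to_polar(*point)
    print(point, (rho, phi / math.pi), to_cartesian(rho, phi))
```

The angle is printed as a multiple of π so that it can be compared directly with the examples above. Using `atan2` rather than a plain arctangent of y/x keeps the angle in the correct quadrant.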
Examples:
• Consider the equation ρ = sin φ, with the restriction φ ∈ [0, π]. What does the graph
look like? If we try drawing it roughly, by hand, just by thinking about what the
Figure 74:
Is this really a circle? Yes it is! We can confirm this by using the conversion formulas:
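$$\begin{aligned}
\rho = \sin\varphi &\implies \sqrt{x^2 + y^2} = \frac{y}{\sqrt{x^2 + y^2}} \\
&\implies x^2 + y^2 = y \\
&\implies x^2 + y^2 - y = 0 \\
&\implies x^2 + \left(y - \frac{1}{2}\right)^2 = \frac{1}{4},
\end{aligned}$$
and we recognize this as being the equation of a circle of radius 1/2, centered at $\left(0, \frac{1}{2}\right)$.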
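A numerical spot check of the same fact (a sketch; the function name `point_on_curve` and the sample angles are arbitrary): points generated from ρ = sin φ should all satisfy the circle equation just derived.

```python
import math

def point_on_curve(phi):
    """Cartesian coordinates of the point with rho = sin(phi)."""
    rho = math.sin(phi)
    return rho * math.cos(phi), rho * math.sin(phi)

def circle_residual(phi):
    """x^2 + (y - 1/2)^2 - 1/4, which should vanish on the circle."""
    x, y = point_on_curve(phi)
    return x * x + (y - 0.5) ** 2 - 0.25

for k in range(9):
    phi = k * math.pi / 8.0
    print(phi, circle_residual(phi))
```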
Question: What happens for φ ∈ (π, 2π), where we obtain negative values for ρ?
There are two conventions in use. Some authors simply ignore this range, since ρ is
supposed to be a distance (so there are no points corresponding to these values of φ). Others,
though, interpret a negative distance as a distance in the opposite direction. That is, they use
the rule (−ρ, φ) = (ρ, φ + π). Thus, for example, the point $(\rho, \varphi) = \left(-2, \frac{\pi}{4}\right)$ is $(\rho, \varphi) = \left(2, \frac{5\pi}{4}\right)$:
[Figure 75: the ray φ = π/4, with the point (−2, π/4) plotted in the opposite direction]
More Examples:
• For ρ = sin φ, the section for φ ∈ (π, 2π) reproduces the same circle!
[Figure: graph of ρ = cos(φ), with ρ(0) = 1 and ρ(π/2) = 0 marked]
• Simple equations in polar coordinates can have some interesting graphs. Consider ρ =
2 sin φ + 1. Conversion into Cartesian coordinates is not very helpful here, so it’s easiest
to trace the curve out using what we know about the sine function (it might help to
graph the equation y = 2 sin x + 1 first, to see how the output values change as the input
values we’ll use the rule described above. We should get something like this:
Figure 78: [graph of ρ = 2 sin φ + 1, annotated with ρ(0) = 1, ρ(π/2) = 3, ρ(π) = 1, and “ρ increases from 1 to 3”]
• You might try also the equations ρ = cos 2φ and ρ = 2 − sin φ, as exercises.
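Tracing a curve such as ρ = 2 sin φ + 1 by hand amounts to tabulating ρ against φ. The sketch below (an illustration, not the notes' own method) does this at multiples of π/6 and also shows that the conversion formulas automatically implement the convention (−ρ, φ) = (ρ, φ + π) for negative ρ. The function names are hypothetical.

```python
import math

def rho(phi):
    """The polar equation rho = 2*sin(phi) + 1."""
    return 2 * math.sin(phi) + 1

def to_cartesian(r, phi):
    """x = r*cos(phi), y = r*sin(phi); valid for negative r as well."""
    return r * math.cos(phi), r * math.sin(phi)

# Tabulate the curve at multiples of pi/6 over one full turn.
for k in range(13):
    phi = k * math.pi / 6
    r = rho(phi)
    x, y = to_cartesian(r, phi)
    print(f"phi = {k}*pi/6   rho = {r:6.3f}   (x, y) = ({x:6.3f}, {y:6.3f})")

# A negative radius lands on the opposite ray: (-1, 3*pi/2) is the same
# point as (1, pi/2), i.e. the Cartesian point (0, 1).
negative_case = to_cartesian(rho(3 * math.pi / 2), 3 * math.pi / 2)
opposite_ray = to_cartesian(1, math.pi / 2)
```

In the table, ρ is negative strictly between φ = 7π/6 and φ = 11π/6, which is where sin φ < −1/2.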
Since circles (and spheres) are such natural shapes, we encounter them frequently in applica-
tions. And since they can be described so much more simply in polar coordinates, it can be
very useful to be able to do calculus in polar coordinates. You’ll get a sense of how important
this is in Math 119, but we are already in a position to at least ask some simple questions:
given a curve (or curve segment) with polar equation ρ = f (φ), how might we find its length?
33.1 Areas
We know how to find the area between the x-axis and a curve y = f (x) for x ∈ [a, b], and it
isn’t too difficult to modify that idea. The first step is to realize what it is we will actually be
calculating! Polar equations describe distance from the origin instead of from an axis, and so
given a curve segment ρ = f (φ), φ ∈ [α, β], the quantity we’ll most easily be able to calculate
is the area between the origin and the curve over that range of angles, i.e. the area bounded
Figure 79: [the curve ρ = f(φ) between the rays φ = α and φ = β]
From here, we follow a familiar argument, modifying it for the context. We partition the
interval [α, β] into tiny angular increments ∆φ. In each small interval $[\phi_{i-1}, \phi_i]$, i = 1, ..., n, we
pick a value $\phi_i^*$ at which to evaluate f. We use that value to define the radius of a sector
of a circle (instead of the height of a rectangle). Recalling that the area of a sector of a
circle of radius r and angle θ is $A = \frac{1}{2} r^2 \theta$, we know that this sector has area
$dA = \frac{1}{2}\rho_i^2\,d\phi = \frac{1}{2}[f(\phi_i^*)]^2\,d\phi$. Thus, we can approximate the area “inside” the curve as the sum of the areas
of a multitude of sectors of different radii, and letting dφ → 0 gives us the exact value:
\[ A = \int_\alpha^\beta \frac{1}{2}[f(\phi)]^2 \, d\phi. \]
If what we’re looking for is the area between two curves ρ = f (φ) and ρ = g (φ), with
f > g on [α, β], then we just have to subtract the smaller area from the larger one:
\[ A = \int_\alpha^\beta \frac{1}{2}\left([f(\phi)]^2 - [g(\phi)]^2\right) d\phi. \]
Example: Find the area of the portion of the circle x2 + y 2 = 4 which lies above the line
y = 1.
Solution: If you were to encounter this question on an exam, your instincts might tell
you to just proceed with a calculation in Cartesian coordinates; this area must be
\[ A = \int_{-\sqrt{3}}^{\sqrt{3}} \left(\sqrt{4 - x^2} - 1\right) dx. \]
That integral can indeed be evaluated, but polar coordinates provide
an easier option. We need to rewrite the curves in polar form, of course: the circle has polar
equation ρ = 2, and for the line we have ρ sin φ = 1, so ρ = csc φ. We also need to know the
range of values of φ; for this we can just observe that the curves intersect when csc φ = 2, i.e.
when sin φ = 1/2. Therefore φ runs from π/6 to 5π/6. The desired area must therefore be
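\[
\begin{aligned}
A &= \int_{\phi_1}^{\phi_2} \frac{1}{2}\left(\rho_{\text{outer}}^2 - \rho_{\text{inner}}^2\right) d\phi \\
&= \int_{\pi/6}^{5\pi/6} \frac{1}{2}\left(2^2 - \csc^2\phi\right) d\phi
 = \int_{\pi/6}^{5\pi/6} \left(2 - \frac{1}{2}\csc^2\phi\right) d\phi \\
&= \left[2\phi + \frac{1}{2}\cot\phi\right]_{\pi/6}^{5\pi/6} \\
&= \left(\frac{5\pi}{3} - \frac{\sqrt{3}}{2}\right) - \left(\frac{\pi}{3} + \frac{\sqrt{3}}{2}\right) \\
&= \frac{4\pi}{3} - \sqrt{3}.
\end{aligned}
\]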
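The value 4π/3 − √3 can be cross-checked by evaluating both the polar integral and a Cartesian integral (the height of the region above the line y = 1) numerically. The sketch below uses a hand-written composite Simpson's rule; the helper `simpson` and the number of subintervals are assumptions chosen for illustration.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# Polar form: (1/2)(rho_outer^2 - rho_inner^2) with rho_outer = 2, rho_inner = csc(phi).
polar_area = simpson(lambda p: 0.5 * (2**2 - 1 / math.sin(p)**2),
                     math.pi / 6, 5 * math.pi / 6)

# Cartesian form: height of the region between y = 1 and the circle.
cartesian_area = simpson(lambda x: math.sqrt(4 - x * x) - 1,
                         -math.sqrt(3), math.sqrt(3))

exact_area = 4 * math.pi / 3 - math.sqrt(3)
print(polar_area, cartesian_area, exact_area)
```

All three numbers should agree to many decimal places, which supports both the polar set-up and the closed-form answer.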
Note: there are a couple of ways we could have made this problem even easier, by taking
advantage of symmetry: consider that the area bounded above the line y = 1 must be the
same as the area bounded to the right of the line x = 1; that gives an integral involving sec2 φ
instead of csc2 φ. We could also cut the region in half along the axis, and double the result.
To find the length of a curve with polar equation ρ = f (φ), with φ ∈ [α, β], recall the work
we’ve already done on arc length. The idea was to partition the curve into small segments of
length ∆s, calculate the length of each one, add them up, and take a limit. Roughly speaking
\[ ds = \sqrt{(dx)^2 + (dy)^2}. \]
To use x as the variable of integration (with x ∈ [a, b]), we simply factored out a dx within
\[ L = \int_a^b \sqrt{1 + \left(\frac{dy}{dx}\right)^2} \, dx. \]
To use y as the variable of integration (with y ∈ [c, d]), we simply factored out a dy:
\[ L = \int_c^d \sqrt{\left(\frac{dx}{dy}\right)^2 + 1} \, dy. \]
If we wish to use φ as the variable of integration, we just have to introduce a differential dφ,
Now, to use this formula, we do need a bit of work. You can see that we will need to
express both x and y in terms of φ. That isn’t hard to do; we know that
This describes the curve using two functions of the single variable φ; we call this a parameter-
ization of the curve, and we’ll discuss the concept further in Math 119. For the moment, all
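\[
\frac{dx}{d\phi} = f'(\phi)\cos\phi - f(\phi)\sin\phi, \qquad
\frac{dy}{d\phi} = f'(\phi)\sin\phi + f(\phi)\cos\phi
\]
\[
\implies \left(\frac{dx}{d\phi}\right)^2 = [f'(\phi)]^2\cos^2\phi - 2f(\phi)f'(\phi)\sin\phi\cos\phi + [f(\phi)]^2\sin^2\phi
\]
and
\[
\left(\frac{dy}{d\phi}\right)^2 = [f'(\phi)]^2\sin^2\phi + 2f(\phi)f'(\phi)\sin\phi\cos\phi + [f(\phi)]^2\cos^2\phi,
\]
so
\[
\left(\frac{dx}{d\phi}\right)^2 + \left(\frac{dy}{d\phi}\right)^2 = [f'(\phi)]^2 + [f(\phi)]^2.
\]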
Therefore, the arc length formula has polar form
\[ L = \int_\alpha^\beta \sqrt{[f'(\phi)]^2 + [f(\phi)]^2} \, d\phi. \]
If you prefer, you can use the fact that ρ = f (φ) and express this as
\[ L = \int_\alpha^\beta \sqrt{\left(\frac{d\rho}{d\phi}\right)^2 + \rho^2} \, d\phi. \]
Example: Consider the graph of the equation ρ = 2 sin φ + 1, which we discussed in the
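previous section (see Figure 78). The outer loop has length
\[
\begin{aligned}
L &= \int_{-\pi/6}^{7\pi/6} \sqrt{(2\cos\phi)^2 + (2\sin\phi + 1)^2} \, d\phi \\
&= \int_{-\pi/6}^{7\pi/6} \sqrt{4\cos^2\phi + 4\sin^2\phi + 4\sin\phi + 1} \, d\phi \\
&= \int_{-\pi/6}^{7\pi/6} \sqrt{5 + 4\sin\phi} \, d\phi.
\end{aligned}
\]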
Unfortunately, using polar coordinates doesn’t get us away from the fact that arc length
integrals contain square roots, and we’ve got another one which is hard to evaluate exactly. A
calculator gives an approximation easily enough: the length is about 10.68 units.
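In place of a calculator, the approximation can be reproduced with a short numerical integration. This is a sketch only: it uses a hand-written Simpson's rule, and the names `polar_arc_length`, `f`, and `df` are hypothetical.

```python
import math

def polar_arc_length(f, df, alpha, beta, n=2000):
    """Approximate the integral of sqrt(f'(phi)^2 + f(phi)^2) over [alpha, beta]
    using composite Simpson's rule with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (beta - alpha) / n

    def integrand(phi):
        return math.sqrt(df(phi)**2 + f(phi)**2)

    total = integrand(alpha) + integrand(beta)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(alpha + i * h)
    return total * h / 3

def f(phi):
    """rho = 2 sin(phi) + 1."""
    return 2 * math.sin(phi) + 1

def df(phi):
    """d(rho)/d(phi) = 2 cos(phi)."""
    return 2 * math.cos(phi)

outer_loop = polar_arc_length(f, df, -math.pi / 6, 7 * math.pi / 6)
print(round(outer_loop, 2))
```

As a quick check of the helper itself, a circle ρ = 1 traversed over [0, 2π] should return a length close to 2π.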
34 Complex Numbers
The basic idea behind complex numbers is a simple one, but it took centuries to be taken
seriously. Consider the algebraic equation $x^2 + 1 = 0$. We know that this has no real solutions
(just consider the fact that the graph of $y = x^2 + 1$ does not cross the x-axis). However, if
we “blindly” apply the rules we usually use for solving such equations, we find that $x^2 = -1$,
This is clearly nonsense, but it turns out to be a remarkably useful sort of nonsense!
Defining an imaginary number or a complex number needs some care; we will avoid using
the expression “$\sqrt{-1}$” in our definitions altogether because it can lead to errors like this (see footnote 24):
\[ 1 = \sqrt{1} = \sqrt{(-1)(-1)} = \sqrt{(-1)}\,\sqrt{(-1)} = \left(\sqrt{-1}\right)^2 = -1. \]
Instead, we define a complex number z to be an ordered pair (x, y), written in the special
notation
z = x + iy
and accompanied by rules of algebra which ensure that $i^2 = -1$ (we’ll state those rules in
a moment). This allows us to avoid the “√” symbol, but still introduce i as an “imaginary”
number such that i and −i are the two square roots of −1. If y = 0 then z is a real number,
The number x is referred to as the real part of z, and we may use the notation Re (z) = x.
The number y is referred to as the imaginary part of z, and we may use the notation
Im (z) = y. There is something odd about this particular choice of terminology; the imaginary
part of z is a real number ! The quantity iy is imaginary, but y itself is real, and this is the
There is also one other peculiarity of notation you’ll need to get used to. In the engineering
world the letter i has another role; it stands for electrical current (often, I is used for a constant
value, and i (t) is used for a current which changes as a function of time). For that reason,
engineers will usually use j for the imaginary unit instead of i. We might as well make that
switch right away; from now on we’ll write our complex numbers as z = x + jy.

Footnote 24: What’s wrong with this calculation? The problem is that when we work with real numbers, the symbol
“√” represents the positive square root, but complex numbers cannot be described as positive or negative!
In fact, when working with complex numbers, we will have to treat roots (and all non-integer exponents) as
multivalued functions, so “$\sqrt{-1}$” will have to be understood as representing both i and −i.
Remarkably, the introduction of complex numbers can make math simpler ! For example,
consider the problem of factoring polynomials (which is what motivated the idea in the first
place). When we work with real numbers, it is known that every polynomial can be factored
into a collection of linear and irreducible quadratic factors. If we allow ourselves to use
complex numbers, then we can shorten that statement: every polynomial can be factored into
\[ z = \frac{-1 \pm \sqrt{1 - 4}}{2} = -\frac{1}{2} \pm \frac{\sqrt{-3}}{2}. \]
We can see then that there are no real solutions, but in the world of complex numbers we can
You’ll notice that the two roots differ only in the sign of the imaginary part, and you should be able to
see that this will happen with every quadratic equation with real coefficients and non-real roots, as a result of the “±” in the quadratic
formula. We say that the solutions are complex conjugates of each other, and we have a special
Before we really get our hands dirty, we should make one more comment about notation:
we do not distinguish between x + jy and x + yj; either form is acceptable. Common practice
is to write numbers in front of j, but variables behind it, so for example we would usually
Complex Arithmetic
Let two complex numbers be z1 = x1 + jy1 and z2 = x2 + jy2 . We define equality, addition,
2. $z_1 + z_2 := (x_1 + x_2) + j\,(y_1 + y_2)$
3. $z_1 z_2 := (x_1 x_2 - y_1 y_2) + j\,(x_1 y_2 + x_2 y_1)$
4. $\dfrac{z_1}{z_2} := \dfrac{x_1 x_2 + y_1 y_2}{(x_2)^2 + (y_2)^2} + j\,\dfrac{x_2 y_1 - x_1 y_2}{(x_2)^2 + (y_2)^2}$
The multiplication and division rules look complicated, but they are designed such that we
can simply use our familiar rules for real variables, combined with the rule $j^2 = -1$.
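To see that the multiplication and division rules really do agree with ordinary algebra plus j² = −1, we can code them directly on ordered pairs and compare with Python's built-in complex type (which also writes the imaginary unit as `j`). The function names below are hypothetical, and the sample values are arbitrary.

```python
def multiply(z1, z2):
    """Rule 3: (x1, y1)(x2, y2) = (x1*x2 - y1*y2, x1*y2 + x2*y1)."""
    (x1, y1), (x2, y2) = z1, z2
    return (x1 * x2 - y1 * y2, x1 * y2 + x2 * y1)

def divide(z1, z2):
    """Rule 4: quotient of ordered pairs; requires z2 != (0, 0)."""
    (x1, y1), (x2, y2) = z1, z2
    denom = x2**2 + y2**2
    return ((x1 * x2 + y1 * y2) / denom, (x2 * y1 - x1 * y2) / denom)

# Compare against Python's own complex arithmetic for one sample pair.
a, b = (3, 2), (1, -4)
product_pair = multiply(a, b)
product_builtin = complex(*a) * complex(*b)
quotient_pair = divide(a, b)
quotient_builtin = complex(*a) / complex(*b)
print(product_pair, product_builtin)
print(quotient_pair, quotient_builtin)
```

Representing z as a tuple mirrors the ordered-pair definition given earlier; the built-in type is used only as an independent check.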
5. $(z^*)^* = z$
6. $z + z^* = 2\,\mathrm{Re}(z)$, so $\mathrm{Re}(z) = \frac{1}{2}(z + z^*)$
7. $z - z^* = 2j\,\mathrm{Im}(z)$, so $\mathrm{Im}(z) = \frac{1}{2j}(z - z^*)$
8. $z z^* = (x + jy)(x - jy) = x^2 + y^2$
Graphical Representation
We can represent z = x + jy graphically by plotting it as the ordered pair (x, y) in the complex
plane:
Figure 80: [the complex plane with axes Re(z) and Im(z), showing the point 3 + 2j]
This turns out to be a very useful idea. For one thing, you can now see why the “imaginary
part” is defined to be the real number y; in this graphical interpretation j is nothing more
than a placeholder. Also, we can see that addition and subtraction of complex numbers is
analogous to addition and subtraction of vectors. Even more importantly, it opens the door
Recall that we can move from Cartesian coordinates to polar coordinates by letting x = r cos θ
z = x + jy
= r cos θ + jr sin θ
= r (cos θ + j sin θ) .
Figure 81: [the complex plane with axes Re(z) and Im(z), showing 3 + 2j = r(cos(θ) + j sin(θ)), with r = √13 and θ = arctan(2/3)]
The number r is called the modulus of z, and we’ll also write it as |z| .
The number θ is called the argument of z, and we may write it as arg (z).
and
\[ \tan\theta = \frac{y}{x}. \]
To solve for θ, of course, we need to know which quadrant z is in. If Re (z) > 0 then
θ = tan−1 (y/x), while if Re (z) < 0 we have θ = tan−1 (y/x) + π. Of course, we could use
tan−1 (y/x) − π instead; every complex number has infinitely many polar representations since
Now, if we try multiplying two complex numbers expressed in polar form, something sur-
prising happens. Let z1 = r1 (cos θ1 + j sin θ1 ) and let z2 = r2 (cos θ2 + j sin θ2 ). Then
That is, when we multiply complex numbers, the moduli get multiplied and the arguments
get added! This might look familiar; compare it to what happens when we multiply two
This suggests a way to define the exponential function for imaginary numbers:
Euler’s Formula:
This is one of the most important formulas in all of mathematics. The special case where
θ = π is particularly elegant: we find that $e^{j\pi} = -1$, so $e^{j\pi} + 1 = 0$. This one simple equation
ties together the five most important numbers in mathematics: 0, 1, j, π, and e! Also, Euler’s
Formula gives us a third way to express complex numbers, which is equivalent to the polar
\[
\begin{aligned}
z &= x + jy \\
&= r(\cos\theta + j\sin\theta) \\
&= re^{j\theta}.
\end{aligned}
\]
The standard form x + jy will still be the easier form to use when we need to add or
subtract complex numbers, but the exponential form can make multiplication and division
much easier.
Example:
\[
\begin{aligned}
\frac{1 + j}{1 - j} &= \frac{\sqrt{2}\,e^{j\pi/4}}{\sqrt{2}\,e^{-j\pi/4}} \\
&= e^{j\pi/2} \\
&= j
\end{aligned}
\]
A comment seems necessary here: you shouldn’t need the conversion formulas to get
through this example; just think about the location of the points in the complex plane. For
example, $e^{j\pi/2}$ has r = 1 and θ = π/2, so it lies one unit up the imaginary axis; that’s the number
j. Similarly, you should immediately recognize that $e^{j\pi} = -1$, and $e^{j3\pi/2} = -j$. Of course, the
Example:
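\[
\begin{aligned}
\frac{2 + j}{-3 + 2j} \cdot (3 - j)
&= \frac{\sqrt{5}\,e^{j\tan^{-1}(1/2)}}{\sqrt{13}\,e^{j(\tan^{-1}(-2/3) + \pi)}} \cdot \sqrt{10}\,e^{j\tan^{-1}(-1/3)} \\
&= \sqrt{\frac{50}{13}}\; e^{j\left[\tan^{-1}(1/2) + \tan^{-1}(2/3) - \pi - \tan^{-1}(1/3)\right]} \\
&\approx 1.961\,e^{j(-2.412)} \\
&\approx -1.46 - 1.31j
\end{aligned}
\]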
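Python's standard `cmath` module can reproduce this calculation both ways: directly in the form x + jy, and by multiplying moduli and adding arguments. The sketch below is only an illustration; the variable names are hypothetical.

```python
import cmath
import math

# Direct evaluation in standard form.
direct = (2 + 1j) / (-3 + 2j) * (3 - 1j)

# Polar route: moduli multiply (or divide), arguments add (or subtract).
# cmath.polar returns (modulus, argument) with the argument in (-pi, pi].
r_num, t_num = cmath.polar(2 + 1j)
r_den, t_den = cmath.polar(-3 + 2j)
r_fac, t_fac = cmath.polar(3 - 1j)

modulus = r_num / r_den * r_fac
argument = t_num - t_den + t_fac
via_polar = cmath.rect(modulus, argument)   # back to x + jy

print(direct)
print(modulus, argument, via_polar)
```

Note that `cmath.polar` already places the argument in the correct quadrant, so the "+ π" adjustment for −3 + 2j is handled automatically.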
Euler’s Formula can be used to extend the definitions of some familiar functions to the
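• We define $e^z = e^{x+jy} = e^x e^{jy}$. So, for example, $e^{1+j\pi/2} = e \cdot e^{j\pi/2} = ej$.
• Since $e^{j\theta} = \cos\theta + j\sin\theta$, replacing θ with −θ gives $e^{-j\theta} = \cos\theta - j\sin\theta$. Adding the
two expressions together and dividing by 2 gives the formula for cos θ, while subtracting them and dividing by 2j gives the formula for sin θ:
\[
\cos\theta = \frac{1}{2}\left(e^{j\theta} + e^{-j\theta}\right), \quad \text{and} \quad
\sin\theta = \frac{1}{2j}\left(e^{j\theta} - e^{-j\theta}\right).
\]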
• The structures of the two formulas above might look familiar. In fact, if we set θ = jx,
we discover that
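\[
\begin{aligned}
\cos(jx) &= \frac{1}{2}\left(e^{-x} + e^{x}\right) \\
&= \frac{1}{2}\left(e^{x} + e^{-x}\right) \\
&= \cosh x,
\end{aligned}
\]
and
\[
\begin{aligned}
\sin(jx) &= \frac{1}{2j}\left(e^{-x} - e^{x}\right) \\
&= -\frac{1}{2j}\left(e^{x} - e^{-x}\right) \\
&= \frac{-1}{j}\sinh x \\
&= j\sinh x
\end{aligned}
\]
(using the fact that $\frac{-1}{j} = \frac{-j}{j^2} = j$). We use these as our definitions of the cosines and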
sines of imaginary numbers, and combining them with the sum-of-angle identities allows
Example:
cos (2 + 3j) = cos (2) cos (3j) − sin (2) sin (3j)
≈ −4.19 − 9.11j
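The numerical value in this example can be confirmed by combining the sum-of-angle identity with cos(jy) = cosh y and sin(jy) = j sinh y, and comparing against `cmath.cos` from the Python standard library. The helper name `cos_complex` is hypothetical.

```python
import cmath
import math

def cos_complex(x, y):
    """cos(x + jy) = cos(x)cos(jy) - sin(x)sin(jy)
                   = cos(x)cosh(y) - j sin(x)sinh(y)."""
    return complex(math.cos(x) * math.cosh(y), -math.sin(x) * math.sinh(y))

by_identity = cos_complex(2, 3)
by_library = cmath.cos(2 + 3j)
print(by_identity, by_library)
```

Both routes should give the same complex number, with real part near −4.19 and imaginary part near −9.11.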
De Moivre’s Theorem:
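\[
\begin{aligned}
z^n &= \left(re^{j\theta}\right)^n \\
&= r^n e^{jn\theta}
\end{aligned}
\]
\[
\begin{aligned}
\left(\frac{1}{2} + \frac{1}{2}j\right)^{10} &= \left(\frac{1}{\sqrt{2}}\,e^{j\pi/4}\right)^{10} \\
&= \frac{1}{2^5}\,e^{j5\pi/2} \\
&= \frac{1}{32}\left(\cos\frac{5\pi}{2} + j\sin\frac{5\pi}{2}\right) \\
&= \frac{1}{32}\,j.
\end{aligned}
\]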
Complex Roots
De Moivre’s Theorem can be modified for fractional powers. Before we explore this, though, we
need another comment about notation. For real numbers, the expression $\sqrt{x}$ is understood to
represent the positive square root of x, and it is undefined when x is negative. More generally,
there may be 0, 1, or 2 nth roots of a real number, depending on whether n is even or odd, and
whether the real number is positive or negative. Whenever there are two roots, the notation
$x^{1/n}$ is always understood to represent the positive one. However, when we discuss complex
numbers, the words “positive” and “negative” have no meaning (all we can do is describe the
real and imaginary parts as being either positive or negative). So, in the context of complex
numbers, we consider the expressions $\sqrt{z}$ and $z^{1/n}$ to be multivalued; $\sqrt{z}$ has two values, and
Now, if we try to apply De Moivre’s Theorem using a fractional power, we end up with
this:
However, this only gives one root, even in cases where we already know that there should be
two (e.g. for $1^{1/2}$ we get 1, but not −1). To find the rest, we need to remember that the polar
and exponential forms of z are not unique; if $z = re^{j\theta}$, then $z = re^{j(\theta + 2k\pi)}$, for every integer
k. Using this expression for z, we discover that there are n distinct values of $z^{1/n}$, given by
\[
\begin{aligned}
w_k = z^{1/n} &= \left(re^{j(\theta + 2k\pi)}\right)^{1/n} \\
&= r^{1/n} e^{j(\theta + 2k\pi)/n} \\
&= r^{1/n}\left[\cos\left(\frac{\theta}{n} + \frac{2k\pi}{n}\right) + j\sin\left(\frac{\theta}{n} + \frac{2k\pi}{n}\right)\right],
\end{aligned}
\]
where k = 0, 1, 2, ..., n − 1 (when k reaches n we have $r^{1/n} e^{j(\theta/n + 2\pi)} = r^{1/n} e^{j\theta/n}$, so we begin
Notice that all of the roots have the same modulus, and the arguments differ by multiples
of 2π/n. That is, the roots are all located on the circle $|w| = |z|^{1/n}$, and are evenly spaced
around it!
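Solution: In exponential form, $-8 = 8e^{j\pi}$ (that is, r = 8 and θ = π). However, we can
\[
\begin{aligned}
w_0 &= \sqrt{2}\,e^{j\pi/6} = \sqrt{2}\left(\cos\frac{\pi}{6} + j\sin\frac{\pi}{6}\right) = \sqrt{2}\left(\frac{\sqrt{3}}{2} + \frac{1}{2}j\right) \\
w_1 &= \sqrt{2}\,e^{j(\pi/6 + \pi/3)} = \sqrt{2}\,e^{j\pi/2} = \sqrt{2}\,j \\
w_2 &= \sqrt{2}\,e^{j(\pi/6 + 2\pi/3)} = \sqrt{2}\left(\cos\frac{5\pi}{6} + j\sin\frac{5\pi}{6}\right) = \sqrt{2}\left(\frac{-\sqrt{3}}{2} + \frac{1}{2}j\right) = \frac{1}{\sqrt{2}}\left(-\sqrt{3} + j\right) \\
w_3 &= \sqrt{2}\,e^{j(\pi/6 + 3\pi/3)} = \sqrt{2}\left(\cos\frac{7\pi}{6} + j\sin\frac{7\pi}{6}\right) = \sqrt{2}\left(\frac{-\sqrt{3}}{2} - \frac{1}{2}j\right) = \frac{1}{\sqrt{2}}\left(-\sqrt{3} - j\right) \\
w_4 &= \sqrt{2}\,e^{j(\pi/6 + 4\pi/3)} = \sqrt{2}\,e^{j3\pi/2} = -\sqrt{2}\,j \\
w_5 &= \sqrt{2}\,e^{j(\pi/6 + 5\pi/3)} = \sqrt{2}\left(\cos\frac{11\pi}{6} + j\sin\frac{11\pi}{6}\right) = \sqrt{2}\left(\frac{\sqrt{3}}{2} - \frac{1}{2}j\right) = \frac{1}{\sqrt{2}}\left(\sqrt{3} - j\right)
\end{aligned}
\]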
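The root formula translates almost line for line into code. The following sketch (the function name `nth_roots` is hypothetical) computes all n values of z^{1/n} from the exponential form and checks the six roots of −8 listed above by raising each one back to the sixth power.

```python
import cmath
import math

def nth_roots(z, n):
    """Return the n values w_k = r^(1/n) * exp(j*(theta + 2*k*pi)/n), k = 0..n-1.

    Assumes z is nonzero, so that the n values are distinct.
    """
    r, theta = cmath.polar(z)
    return [cmath.rect(r ** (1 / n), (theta + 2 * math.pi * k) / n)
            for k in range(n)]

roots = nth_roots(-8, 6)
for k, w in enumerate(roots):
    print(f"w{k} = {w:.4f}   w{k}^6 = {w**6:.4f}")
```

Every root has modulus √2 ≈ 1.4142, and consecutive arguments differ by 2π/6, in line with the remark about even spacing on a circle.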
Figure 82: [the six roots w0, w1, ..., w5 plotted in the complex plane, with axes Re(z) and Im(z)]
Example: Let’s just make sure that our knowledge of real numbers fits into this new way
Solution: In exponential form, 4 is just 4... but it can also be written as $4e^{j2k\pi}$.
Therefore $\sqrt{4} = (4)^{1/2} = 2e^{j(2k\pi)/2} = 2e^{jk\pi}$. Letting k = 0 and k = 1, we obtain $w_0 = 2$
and $w_1 = 2e^{j\pi} = -2$, as expected! And indeed, the two points are equally spaced on the circle
of radius 2.
Euler’s Formula and the resulting formulas for the sine and cosine functions can be useful for
those of us who dislike working with trigonometric functions; sometimes we can avoid them
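Solution: We know that cos θ can be expressed as $\frac{1}{2}\left(e^{j\theta} + e^{-j\theta}\right)$. Therefore
\[
\begin{aligned}
\cos^3\theta &= \frac{1}{8}\left(e^{j\theta} + e^{-j\theta}\right)^3 \\
&= \frac{1}{8}\left(e^{3j\theta} + 3e^{j\theta} + 3e^{-j\theta} + e^{-3j\theta}\right) \quad \text{(using the binomial expansion)} \\
&= \frac{1}{8}\left(e^{3j\theta} + e^{-3j\theta}\right) + \frac{3}{8}\left(e^{j\theta} + e^{-j\theta}\right) \\
&= \frac{1}{4}\left(\frac{e^{3j\theta} + e^{-3j\theta}}{2}\right) + \frac{3}{4}\left(\frac{e^{j\theta} + e^{-j\theta}}{2}\right) \\
&= \frac{1}{4}\cos 3\theta + \frac{3}{4}\cos\theta.
\end{aligned}
\]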
Example: Evaluate $\int e^x \cos 2x \, dx$.
Solution: Using the traditional methods, we would need to do two iterations of integration
by parts here. However, if we recognize cos 2x as being the real part of $e^{2jx}$, we can
\[
\int e^x \cos 2x \, dx = \int e^x\,\mathrm{Re}\left(e^{2jx}\right) dx = \mathrm{Re}\left(\int e^x e^{2jx} \, dx\right)
\]
\[
\begin{aligned}
\text{Now, } \int e^x e^{2jx} \, dx &= \int e^{(1+2j)x} \, dx \\
&= \frac{e^{(1+2j)x}}{1 + 2j} + C \quad \text{(where C is a complex constant)} \\
&= e^{(1+2j)x}\,\frac{(1 - 2j)}{5} + C \\
&= \frac{1}{5}\left[\left(e^x\cos 2x + je^x\sin 2x\right)(1 - 2j)\right] + C.
\end{aligned}
\]
All that is left is to write down the real part of this; we can conclude that
\[
\int e^x \cos 2x \, dx = \frac{1}{5}\left(e^x\cos 2x + 2e^x\sin 2x\right) + C.
\]
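One way to gain confidence in this result is to check numerically that the antiderivative matches the real part of the complex expression, and that its derivative reproduces the integrand. The sketch below takes C = 0 and uses a central difference with an assumed step size of 1e-5; all function names are hypothetical.

```python
import cmath
import math

def integrand(x):
    return math.exp(x) * math.cos(2 * x)

def antiderivative(x):
    """The real-valued result from the example, with C = 0."""
    return (math.exp(x) * math.cos(2 * x) + 2 * math.exp(x) * math.sin(2 * x)) / 5

def antiderivative_via_complex(x):
    """Real part of e^{(1+2j)x} / (1 + 2j), also with C = 0."""
    return (cmath.exp((1 + 2j) * x) / (1 + 2j)).real

def central_difference(F, x, h=1e-5):
    """Approximate F'(x) with a symmetric difference quotient."""
    return (F(x + h) - F(x - h)) / (2 * h)

for x in (-1.0, 0.0, 0.7, 2.0):
    print(x, antiderivative(x), antiderivative_via_complex(x),
          central_difference(antiderivative, x), integrand(x))
```

In each row the second and third columns should agree, and so should the fourth and fifth.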
You might not be convinced that this is easier, but it does demonstrate the surprising fact
that complex numbers can be used to solve problems involving real-valued variables! We’ll
35 An Application of Complex Numbers: Impedance
It is known from experiment that when a current of i (t) = I sin (ωt) passes through a resistor
of resistance R the voltage drop across the resistor is vR (t) = IR sin (ωt). This can be stated
more concisely as
v = iR,
which is the famous “Ohm’s Law”. However, the relationship between current and voltage is
slightly more complicated when capacitors or inductors are involved. When the same current
i (t) = I sin (ωt) passes across a capacitor of capacitance C the voltage drop across the capacitor
is $v_C(t) = \frac{I}{\omega C}\sin\left(\omega t - \frac{\pi}{2}\right)$, and when it passes through an inductor of inductance L
we find that $v_L(t) = \omega L I \sin\left(\omega t + \frac{\pi}{2}\right)$. Even though these formulas involve phase shifts, it
is possible to express them in a form similar to Ohm’s Law, if we take advantage of complex
numbers!
Recall that
\[ i(t) = I \cdot \mathrm{Im}\left(e^{j\omega t}\right). \]
In fact, as a mathematical trick, we will pretend that the current is a complex quantity; we
• Each value of t gives a point in the complex plane, on the circle of radius I.
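Figure 83: [the complex current ĩ(t) = I e^{jωt} on the circle of radius I in the complex plane, with ĩ(0) = ĩ(2π/ω) = ĩ(4π/ω) = I marked, along with the points ĩ = −I and ĩ = −Ij]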
• The actual current is given by the imaginary part of the complex one; we just consider
Now consider our three rules again; how should they be expressed for the complex current
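Resistors: v (t) = IR sin (ωt) becomes $\tilde{v}(t) = IRe^{j\omega t}$. So, Ohm’s Law is unchanged:
\[ \tilde{v} = R\,\tilde{i}. \]
Capacitors: $v(t) = \frac{I}{\omega C}\sin\left(\omega t - \frac{\pi}{2}\right)$ becomes
\[
\begin{aligned}
\tilde{v}(t) &= \frac{I}{\omega C}\,e^{j(\omega t - \pi/2)} \\
&= \frac{I}{\omega C}\,e^{-j\pi/2}\,e^{j\omega t} \\
&= \frac{-jI}{\omega C}\,e^{j\omega t} \\
&= \frac{I}{j\omega C}\,e^{j\omega t} \quad \left(\text{using } \frac{1}{j} = -j\right)
\end{aligned}
\]
But now this is just a multiple of the current! We have arrived at a version of Ohm’s Law for
capacitors:
\[ \tilde{v} = \frac{1}{j\omega C}\,\tilde{i} \]
Inductors: $v(t) = \omega L I \sin\left(\omega t + \frac{\pi}{2}\right)$ becomes
\[
\begin{aligned}
\tilde{v}(t) &= \omega L I\,e^{j(\omega t + \pi/2)} \\
&= \omega L I\,e^{j\pi/2}\,e^{j\omega t} \\
&= j\omega L I\,e^{j\omega t}
\end{aligned}
\]
\[ \tilde{v} = j\omega L\,\tilde{i} \]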
All three component types represent opposition to the flow of electricity, so we call this new
You may be familiar with the fact that if a current passes through several resistors connected in
series, then we can treat that section of the circuit as a single resistor, and the total resistance
is simply the sum of the individual ones. On the other hand, if the resistors are connected in
Figure 84: [resistors R1, R2, R3 between terminals A and B, connected in series, where R = R1 + R2 + R3, and in parallel, where 1/R = 1/R1 + 1/R2 + 1/R3]
Fortunately, capacitances and inductances combine in the same way (at least, if we view
1/C as the quantity of interest for capacitors instead of C). Furthermore, now that we have
generalized Ohm’s Law, we can treat resistors, capacitors, and inductors as if they were all
Series                            Parallel
R = R1 + R2 + R3 + ...            1/R = 1/R1 + 1/R2 + 1/R3 + ...
1/C = 1/C1 + 1/C2 + 1/C3 + ...    C = C1 + C2 + C3 + ...
L = L1 + L2 + L3 + ...            1/L = 1/L1 + 1/L2 + 1/L3 + ...
Z = Z1 + Z2 + Z3 + ...            1/Z = 1/Z1 + 1/Z2 + 1/Z3 + ...
Examples:
Figure 85: [a resistor R = 5 Ω and a capacitor C = 0.1 F between terminals A and B]
Find v (t): the difference in electrical potential between terminals A and B.
Note: 0.1 farads is a rather large value for a capacitor, but we won’t concern ourselves
with realism too much here; the goal is to illustrate the method.
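Solution: The complex current must be $\tilde{i}(t) = Ie^{jt}$ (so that the imaginary part is
i (t)).
\[
\begin{aligned}
\tilde{v}(t) = Z\,\tilde{i}(t) &= (5 - 10j)\,Ie^{jt} \\
&= I\left[(5\cos t + 10\sin t) + j\,(5\sin t - 10\cos t)\right].
\end{aligned}
\]
Taking the imaginary part gives $v(t) = I(5\sin t - 10\cos t)$
(which we could rewrite as $v(t) = 5\sqrt{5}\,I\sin(t - 1.107)$).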
[Figure: circuit between terminals A and B, including a resistor R = 2 Ω]
Since they are connected in parallel, the total impedance can be calculated this way:
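\[
\begin{aligned}
\frac{1}{Z} &= \frac{1}{Z_1} + \frac{1}{Z_2} \\
&= \frac{1}{2} + \frac{1}{2j} \\
&= \frac{1}{2}(1 - j)
\end{aligned}
\]
\[
\implies Z = \frac{1}{\frac{1}{2}(1 - j)} = \frac{2}{1 - j} \cdot \frac{1 + j}{1 + j} = 1 + j.
\]
\[
\begin{aligned}
\tilde{v}(t) = Z\,\tilde{i} &= (1 + j)\,Ie^{4jt} \\
&= I\left[(\cos 4t - \sin 4t) + j\,(\cos 4t + \sin 4t)\right],
\end{aligned}
\]
and taking the imaginary part gives
\[
v(t) = I(\cos 4t + \sin 4t) = \sqrt{2}\,I\sin\left(4t + \frac{\pi}{4}\right).
\]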
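Comment: Since the number 1 − j is recognizable as $\sqrt{2}\,e^{-j\pi/4}$, these calculations are
\[
\begin{aligned}
Z &= \frac{1}{\frac{1}{2}(1 - j)} = \frac{2}{\sqrt{2}\,e^{-j\pi/4}} = \sqrt{2}\,e^{j\pi/4} = 1 + j \\
\implies \tilde{v}(t) &= \sqrt{2}\,e^{j\pi/4}\,e^{4jt}\,I \\
&= \sqrt{2}\,I\,e^{j(4t + \pi/4)} \\
\implies v(t) &= \sqrt{2}\,I\sin\left(4t + \frac{\pi}{4}\right).
\end{aligned}
\]
Note in particular that this puts the solution directly into the desired amplitude/phase
form!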
Figure 87: [circuit between terminals A and B with a resistor R = 5 Ω, an inductor L = 0.5 H, and a capacitor C = 0.1 F]
Again, the problem is to find the voltage drop between terminals A and B, given an
To determine how they combine, we first consider the part of the circuit which is con-
structed in parallel. We can calculate an impedance for this section; let’s call it ZRC :
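\[
\begin{aligned}
\frac{1}{Z_{RC}} &= \frac{1}{Z_R} + \frac{1}{Z_C} \\
&= \frac{1}{5} + \frac{j}{5} = \frac{1}{5}(1 + j) = \frac{\sqrt{2}}{5}\,e^{j\pi/4}
\end{aligned}
\]
\[
\implies Z_{RC} = \frac{5}{\sqrt{2}}\,e^{-j\pi/4} = \frac{5}{2}(1 - j)
\]
Now we simply have two components connected in series, and the net impedance is
\[
\begin{aligned}
Z &= Z_{RC} + Z_L \\
&= \frac{5}{2}(1 - j) + j \\
&= \frac{5}{2} - \frac{3}{2}j.
\end{aligned}
\]
\[
\begin{aligned}
\tilde{v}(t) = Z\,\tilde{i}(t) &= \frac{5 - 3j}{2}\,Ie^{j2t} \\
&= \frac{\sqrt{34}\,I}{2}\,e^{-j\tan^{-1}(3/5)}\,e^{j2t} \\
&= \frac{\sqrt{34}\,I}{2}\,e^{j\left(2t - \tan^{-1}(3/5)\right)}
\end{aligned}
\]
\[
v(t) = \frac{\sqrt{34}}{2}\,I\sin\left(2t - \tan^{-1}(3/5)\right)
\]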
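The impedance bookkeeping in this example is mechanical enough to script. The sketch below uses Python's built-in complex numbers; the helper names (`z_resistor`, `z_capacitor`, `z_inductor`, `series`, `parallel`) are hypothetical, and the angular frequency ω = 2 is read off from the factor e^{j2t} in the calculation above.

```python
import cmath
import math

def z_resistor(R):
    return complex(R, 0.0)

def z_capacitor(C, omega):
    """Generalized Ohm's Law for capacitors: Z = 1 / (j*omega*C)."""
    return 1 / (1j * omega * C)

def z_inductor(L, omega):
    """Generalized Ohm's Law for inductors: Z = j*omega*L."""
    return 1j * omega * L

def series(*impedances):
    """Series connection: impedances add."""
    return sum(impedances)

def parallel(*impedances):
    """Parallel connection: reciprocals add."""
    return 1 / sum(1 / z for z in impedances)

omega = 2.0  # assumed from the e^{j2t} factor in the worked example
Z_RC = parallel(z_resistor(5.0), z_capacitor(0.1, omega))
Z = series(Z_RC, z_inductor(0.5, omega))

# v(t) = I * amplitude * sin(omega*t + phase)
amplitude, phase = cmath.polar(Z)
print(Z, amplitude, phase)
```

Reading the result in polar form gives the amplitude factor √34/2 and the phase shift −tan⁻¹(3/5) directly, matching the final expression for v(t).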
A final comment: in one of your second-year courses, you will see another method for
solving problems like these, which is based on calculus instead of complex algebra. It is
based on the observation that the voltage drop formulas for capacitors
and inductors can be expressed as antiderivatives or derivatives of the current. Given
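\[ v_C(t) = \frac{I}{\omega C}\sin\left(\omega t - \frac{\pi}{2}\right), \]
and
\[ v_L(t) = \omega L I\sin\left(\omega t + \frac{\pi}{2}\right), \]
and
\[
v_C(t) = \frac{-I}{\omega C}\cos(\omega t) = \frac{I}{C}\int_a^t \sin(\omega\tau)\,d\tau = \frac{1}{C}\int_a^t i(\tau)\,d\tau.
\]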
These formulas turn out to be valid even when the current is not simply alternating,
which makes the resulting method (using differential equations) much more powerful.