LectureNotes HT22 Part2
Part 2
Jan-Fredrik Olsen
Introduction
In this chapter we study limits for functions. For sequences, we considered what happened
to a sequence $a_n$ as $n \to \infty$. Now, we mainly consider what happens to a function
$f(x)$ as $x \to a$. Even though these two situations are quite similar, we need to be a bit
more nuanced when it comes to limits of functions.
Remark 6.1 (Selected problems from previous exams based on this chapter)
$$\lim_{x \to 2} (3x^3 - 9x + 1) = 7.$$
176 CHAPTER 6. LIMITS FOR FUNCTIONS
$$(x - \delta, x + \delta) \subset D.$$
The point is that if I is a neighbourhood of a point x, then this means that there is
a little bit of wiggle-room both to the left and the right of x inside of I.
Example 6.3 The function $f(x) = \sqrt{x}$ is defined in a neighbourhood of the point $x = 1$,
but it is not defined in a neighbourhood of $x = 0$ (however, it is defined in a one-sided
neighbourhood of $x = 0$, see exercise 6.11).
$$(x - \delta, x) \cup (x, x + \delta) \subset D.$$
Example 6.5 The natural domain of the function $f(x) = \sin x / x$ is $\mathbb{R} \setminus \{0\}$. This
function is therefore defined in a punctured neighbourhood of $x = 0$.
As we will see on the next page, it therefore makes sense (at least intuitively) to
study the limit of $f(x) = \sin x / x$ as $x$ approaches 0.
6.1. A FIRST LOOK AT LIMITS FOR FUNCTIONS 177
Informal definition 6.6 (limit of a function) Suppose that $f$ is defined (at least)
in a punctured neighbourhood of a point $a$. Then, if $f(x)$ "approaches" some value $L$ as
$x$ "approaches" $a$, we write
$$\lim_{x \to a} f(x) = L \quad\text{or}\quad f(x) \xrightarrow[x \to a]{} L.$$
Of course, strictly speaking, this "informal definition" does not really define anything.
But in the following example, we try to illustrate what we are getting at.
Fig. 1. Two visualisations of f (x) where we indicated the computed values of f (x)
with red dots. Notice that x = 0 is not in the domain of f .
From the above figures, it seems like the values of $f(x)$ approach 1 as $x \to 0$. It is
therefore reasonable to believe that
$$\lim_{x \to 0} \frac{\sin x}{x} = 1.$$
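The numerical experiment behind such figures can be tried in a few lines of Python. The sketch below is only an illustration: the helper name `sinc_table` and the sample points are our own choices, not taken from the notes. It evaluates $\sin x / x$ at points approaching 0 from both sides, and never at $x = 0$ itself.

```python
import math

def sinc_table(exponents=range(1, 6)):
    """Return (x, sin(x)/x) pairs for x = +10^(-k) and x = -10^(-k).

    The point x = 0 is never evaluated, since it is outside the domain.
    """
    rows = []
    for k in exponents:
        for x in (10.0 ** (-k), -(10.0 ** (-k))):
            rows.append((x, math.sin(x) / x))
    return rows

# Print the computed values; they should settle near 1 as |x| shrinks.
for x, value in sinc_table():
    print(f"x = {x:+.5f}   sin(x)/x = {value:.10f}")
```

A table like this only suggests the value of the limit; it does not prove it.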
Let us summarise the insight from the above example in a rule of thumb:
Remark 6.9 (The finger does not care about the point itself) When discussing
if $\lim_{x \to a} f(x) = L$, we are not allowed to use any information about $f(x)$ at the point
$x = a$ itself. This is a feature of the formal definition of the limit which is necessary
since we typically want to investigate what happens as we approach points just outside
the domain of $f$. For this reason, the finger is blind to any information at $x = a$ itself.
Exercise 6.10 What appears to be the limit for each of the functions in Figure 3 as
x approaches 0?
One-sided limits
Exercise 6.11 Use definitions 6.2 and 6.4 as inspiration to define what we ought to
mean by one-sided (punctured) neighbourhoods to the right and to the left of a point,
respectively.
Remark: The point is to find a condition that ensures that the domain of a function
f is such that we can evaluate it as x approaches the point a from the left or from the
right, respectively.
Informal definition 6.12 (one-sided limits) Suppose that $f$ is defined (at least)
on a punctured one-sided neighbourhood to the right of the point $a$. Then, if $f(x)$
"approaches" some value $L$ as $x$ "approaches" $a$ from the right, we write
$$\lim_{x \to a^+} f(x) = L \quad\text{or}\quad f(x) \xrightarrow[x \to a^+]{} L.$$
If the same holds, but with the word "right" replaced by "left" above, we write
$$\lim_{x \to a^-} f(x) = L \quad\text{or}\quad f(x) \xrightarrow[x \to a^-]{} L.$$
Exercise 6.13 (a) What are the one-sided limits of the function shown in Figure 4,
above?
(b) What are the one-sided limits of the function f (x) = sin x/x studied in Example
6.7, above?
This proposition, which you are asked to prove in exercise 6.35, is rather useful since
it sometimes reduces checking an annoying limit (involving, say, absolute values) to
checking two hopefully friendlier one-sided limits.
Since the one-sided limits are different, we conclude that the limit does not exist.
$$\lim_{x \to a} f(x) = L,$$
That is, in situations where it only makes sense to consider one-sided limits, then by
the limit of a function, we always mean its one-sided limit.
Example 6.18 The limit of $f(x) = \sqrt{x}$ as $x \to 0$ is equal to 0.
Continuity
Above, we remarked that when computing $\lim_{x \to a} f(x)$ our finger does not care
about what happens at $x = a$ itself. But what if we do care?
Here is an informal, not quite accurate, but quite useful description of what it means
for a function to be continuous at a point.
Notice that the property that $D_f \setminus \{a\}$ has points arbitrarily close to $a$ just means
that the left-hand side of the implication in the above definition is not empty.
Remark 6.26 When working with limits, sometimes it is enough to assert that some
property holds for $f$ when $x \neq a$ is "sufficiently close" to $a$. By this we mean that there
exists some $\delta > 0$ so that the property holds for $x \in D_f \setminus \{a\}$ whenever $|x - a| < \delta$.
Our goal is now to show that we get $(2x - 1)$ closer than $\varepsilon$ to 3 by moving $x$ sufficiently
close to 2.
6.2. THE DEFINITION OF THE LIMIT FOR FUNCTIONS 183
The first step towards achieving this is to simplify the expression we are considering:
Exercise 6.28 What is the appropriate delta response to $\varepsilon = 1$ and $\varepsilon = 1/2$,
respectively, according to the computations in the above example? Does this match what
we see in Figure 7?
Exercise 6.29 In this exercise you are to use the definition of the limit to show that
$$\lim_{x \to 3} (4x - 3) = 9.$$
Example 6.30 Let us do a more difficult example. We use the epsilon-delta definition
of the limit to show that
$$f(x) = x^3 - 2x^2 - 5x + 8$$
is continuous at $x = 1$. That is, we need to use the epsilon-delta definition to show that
$$\lim_{x \to 1} (x^3 - 2x^2 - 5x + 8) = 2.$$
Again, we begin by supposing that we are given a fixed but unknown $\varepsilon > 0$.
Fig. 8. Here, we illustrate the function $f(x) = x^3 - 2x^2 - 5x + 8$ with two possible
epsilon challenges. The question is, how do we respond?
The first step towards achieving this is to simplify the expression we are considering:
To have any hope of succeeding, we need the factor $x - 1$ to appear. But this is exactly
what happens since $x = 1$ is a root of this expression (check this!). Using what we
learned in Chapter 1, we obtain
$$|(x^3 - 2x^2 - 5x + 8) - 2| = |x^3 - 2x^2 - 5x + 6| = |x - 1|\,|x^2 - x - 6|.$$
This is all good, but how to deal with $|x^2 - x - 6|$? We now use a trick: observe that
while we do not yet know how small we need to make $|x - 1|$, let us at least agree that we
will make this distance smaller than 1 (because why not?). Opening the absolute value,
we see that
$$|x - 1| < 1 \iff -1 < x - 1 < 1 \iff 0 < x < 2.$$
Next, we see that, under the condition $0 < x < 2$, the triangle inequality gives us that
$$|x^2 - x - 6| \le x^2 + |x| + 6 < 2^2 + 2 + 6 = 12.$$
That is, by combining what we have done so far, we get
$$|x - 1| < 1 \implies |(x^3 - 2x^2 - 5x + 8) - 2| < 12\,|x - 1|.$$
But now, we are in exactly the same situation as in the previous example. Here, the
final observation is that
$$|x - 1| < \frac{\varepsilon}{12} \implies 12\,|x - 1| < \varepsilon.$$
In conclusion, we have shown that
$$\begin{cases} |x - 1| < 1 \\ |x - 1| < \varepsilon/12 \end{cases} \implies |(x^3 - 2x^2 - 5x + 8) - 2| < \varepsilon.$$
But wait! Does this satisfy the definition of the limit? Yes, what this means is that given
a challenge $\varepsilon > 0$, the proper response $\delta$ is the smallest of the two numbers 1
and $\varepsilon/12$, whichever that may be. We express this as choosing $\delta = \min\{1, \varepsilon/12\}$. Done!
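As a sanity check of this response, one can sample points numerically. The following Python sketch is an illustration under our own naming (`delta_response` and `response_works` are hypothetical helpers, not part of the notes): for a given $\varepsilon$ it picks $\delta = \min\{1, \varepsilon/12\}$ and tests $|f(x) - 2| < \varepsilon$ at many points with $0 < |x - 1| < \delta$. Sampling can support the computation, but it is of course no substitute for the proof above.

```python
def f(x):
    """The polynomial from the example: x^3 - 2x^2 - 5x + 8."""
    return x**3 - 2 * x**2 - 5 * x + 8

def delta_response(eps):
    """The response from the example: delta = min{1, eps/12}."""
    return min(1.0, eps / 12.0)

def response_works(eps, samples=10_001):
    """Sample x with 0 < |x - 1| < delta and test |f(x) - 2| < eps."""
    delta = delta_response(eps)
    for i in range(1, samples):
        t = delta * i / samples          # 0 < t < delta
        for x in (1.0 - t, 1.0 + t):     # points to the left and right of 1
            if abs(f(x) - 2.0) >= eps:
                return False
    return True

# Try a few epsilon challenges, including the two shown in Figure 8.
for eps in (4.0, 2.0, 0.5, 1e-3):
    print(eps, delta_response(eps), response_works(eps))
```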
The first three exercises below are meant to help you understand the above example.
Exercise 6.31 Draw the graph of the function $\delta(\varepsilon) = \min\{1, \varepsilon/12\}$.
Exercise 6.32 In Example 6.30, what seems like appropriate delta responses based
on Figure 8 (where epsilon is equal to 4 and 2, respectively)? What delta response is
suggested by the computations in the example? Does it matter that these are not the same?
Exercise 6.33 In this exercise you are to use the definition of the limit to show that
If you are able to solve the following exercise, then you have an excellent understanding
of the epsilon-delta type proofs and the techniques involved. Most students will need
more than one attempt to solve it.
(c) Is there any epsilon we can beat by choosing $\delta = 2$? (You are also supposed to
answer this question by inspecting the graph visually.)
(d) Use the definition of the limit to verify that f (x) is continuous at x = 2.
Remark: Part (c) is included to help you notice an added difficulty when solving (d).
The point of this exercise is to figure out how to successfully deal with this in part (d).
Exercise 6.35 (a) Formulate an epsilon-type definition for the one-sided limits
appearing in Remark 6.12 and Proposition 6.14.
(b) Prove the $\Longleftarrow$ part of Proposition 6.14.
(c) Prove the $\Longrightarrow$ part of Proposition 6.14.
Hint: Once you have done (a), the rest of this exercise is just a matter of comparing
the definitions for one and two-sided limits.
Remark 6.36 In the definition of continuity (Definition 6.19), it may strike you as
arbitrary that we choose to call a function continuous at isolated points. However, one
reason why this is natural is that it is equivalent to the following way to define continuity
at a point:
We say that $f(x)$ is continuous at a point $a \in D_f$ if, for every $\varepsilon > 0$, there exists a
$\delta > 0$ so that
$$|x - a| < \delta \implies |f(x) - f(a)| < \varepsilon.$$
Exercise 6.37 Show that the definition in the above remark is equivalent to
Definition 6.19.
Remark: This means that you need to prove that if f is continuous at a point according
to the one definition, then this implies that it is also continuous according to the other.
That is, there are two implications to check.
and
respectively. Since this does not require any new ideas (just look at the definitions for
the limits of sequences and functions that we have come across so far), we leave this as
an exercise.
Exercise 6.38 Formulate an $\varepsilon$-type definition for both limits mentioned above.
Proposition 6.39 (Rulebook for the limit of functions) Suppose that the limits
$\lim_{x \to a} f(x)$ and $\lim_{x \to a} g(x)$ exist (and are finite). Then the following hold:
(i) $\lim_{x \to a} \big( f(x) \pm g(x) \big) = \lim_{x \to a} f(x) \pm \lim_{x \to a} g(x)$
The next two rules state that the limit respects inequalities. We note that in both rules,
it is enough for the inequalities to hold for $x \neq a$ that are sufficiently close to $a$:
(v) $f(x) \le g(x) \le h(x)$ and $\lim_{x \to a} f(x) = \lim_{x \to a} h(x) = L \implies \lim_{x \to a} g(x) = L$.
We now state a "composition rule" for the limit. It says that if we can evaluate the limit
of $f \circ g$ as $x$ tends to $a$, then
(vi) $\begin{cases} f \text{ is continuous, and} \\ \lim_{x \to a} g(x) \text{ exists and is in } D_f \end{cases} \implies \lim_{x \to a} f \circ g(x) = f\big( \lim_{x \to a} g(x) \big)$
Here is essentially the same rule as above, formulated as a "change of variables" rule. In
this case, we also need to assume that $g(x)$ is not equal to $b$ for $x$ sufficiently close to $a$:
(vi') $\lim_{x \to a} g(x) = b$ and $\lim_{u \to b} f(u)$ exists $\implies \lim_{x \to a} f \circ g(x) = \lim_{u \to b} f(u)$.
Finally, the following concrete limits are useful enough to be included here:
Finally, the same extensions to infinite limits as for the limits of sequences hold, and,
moreover, the above rules also apply if we replace $a$ by $+\infty$ or $-\infty$.
6.3. THE RULEBOOK FOR LIMITS OF FUNCTIONS 189
Proposition 6.40
$$\lim_{x \to a} f(x) = L \ \text{ and } \ \lim_{x \to a} g(x) = M \implies \lim_{x \to a} \big( f(x) + g(x) \big) = L + M.$$
Note that the statement "$|x - a|$ is small" is exactly what we mean when we write $|x - a| < \delta$.
In particular, the hypothesis of the proposition, i.e., the statements $f(x) \to L$ and
$g(x) \to M$, mean the following: for all $\varepsilon_1, \varepsilon_2 > 0$, there exist numbers $\delta_1, \delta_2 > 0$, so that
As in the corresponding proof for sequences, we are almost done with the proof at this
point. Recall that our goal is to somehow use the information in (6.3) to obtain (6.2).
So, what we do is to connect these expressions by using the triangle inequality as follows:
Next, observe that if we choose $\varepsilon_1 = \varepsilon/2$ and $\varepsilon_2 = \varepsilon/2$, then we are guaranteed that there
exist numbers $\delta_1, \delta_2$ so that (6.3) holds. Combining this with the above, we find that
$$\begin{cases} |x - a| < \delta_1 \\ |x - a| < \delta_2 \\ x \neq a \end{cases} \implies |(f(x) + g(x)) - (L + M)| < \varepsilon_1 + \varepsilon_2 = \varepsilon.$$
In other words, the definition of the limit $\lim_{x \to a} (f(x) + g(x)) = L + M$ is satisfied if we
choose $\delta = \min\{\delta_1, \delta_2\}$.
Exercise 6.41 Modify the proof of the product rule for limits of sequences so that it
applies to limits for functions.
Exercise 6.42 Modify the proof of the squeeze theorem for limits of sequences so
that it applies to limits for functions.
Proof of Proposition 6.43. In slightly sloppy language, what we know is the following:
$$\begin{aligned} \forall \varepsilon_1 > 0 &: \ |x - a| \text{ small} \implies |g(x) - b| < \varepsilon_1 \\ \forall \varepsilon_2 > 0 &: \ |u - b| \text{ small} \implies |f(u) - L| < \varepsilon_2 \end{aligned} \tag{6.4}$$
What we need to prove is that given some unknown, but fixed, $\varepsilon > 0$, then
$$|x - a| \text{ small} \implies |f(g(x)) - L| < \varepsilon.$$
But this is rather reasonable. Indeed, note that by using the connection $u = g(x)$, we ought to
be able to combine the two lines in (6.4) to arrive at the following chain of implications:
$$|x - a| \text{ small} \overset{u = g(x)}{\implies} |u - b| \text{ small} \implies |f(u) - L| < \varepsilon$$
In the figure, below, we illustrate the basic objective of the proof. You start out with
an epsilon target around $L$ and are supposed to find some delta on the x-axis that you
know answers this challenge. To do this, you need to take into account what happens
on the u-axis in the middle.
Exercise 6.44 In this exercise you are
asked to complete the proof.
Remark 6.45 (A final rule of thumb – an algebraic finger) If you can get away
with replacing x by its limit, then go for it!
$$\lim_{x \to 1} \frac{x^2 - 1}{x - 1} = \lim_{x \to 1} (x + 1) = 2.$$
Exercise 6.47 Use the rulebook to justify the steps in the above example.
Exercise 6.48 Determine the following limits.
(a) $\lim_{x \to 2} \dfrac{x - 2}{x^2 + x - 6}$ (b) $\lim_{x \to -1} \Big( \dfrac{1}{x + 1} + \dfrac{2}{x^2 - 1} \Big)$.
Next, suppose that we know that the function $y = x^2$ is continuous (you are asked to
verify this below). Then we can use (vi) to simplify the above computation as follows:
$$\lim_{x \to 1} (4x + 1)^2 = \Big( \lim_{x \to 1} (4x + 1) \Big)^2 = 5^2 = 25.$$
Now this last computation might not seem so appealing as the notation was a bit
cumbersome by comparison to the application of (vi). But consider the following, more
elegant way of presenting the computation using rule (vi'):
Since $u = 4x + 1 \to 5$ as $x \to 1$, we have
$$\lim_{x \to 1} (4x + 1)^2 \overset{u = 4x + 1}{=} \lim_{u \to 5} u^2 = 25.$$
Exercise 6.50 In this exercise you are asked to study the connection between rules
(vi) and (vi’).
Recall that the elementary functions included all functions in the following list, as well
as all combinations of these functions using a finite number of the operations of addition,
subtraction, multiplication, division and composition (no inverses though!):
As a first step to proving the above theorem, we establish the following proposition,
which more or less follows immediately from the rulebook for the limit of functions.
Proposition 6.52 Suppose that f and g are continuous. Then all combinations of f
and g using a finite number of the operations of addition, subtraction, multiplication,
division and composition are continuous.
Proof. Suppose that f and g are continuous, and that a is some point in the domain of
f + g. If this is an isolated point of the domain, we are done. If not, we check the limit:
Since the limit of f + g as x tends to a equals the value of (f + g)(a), we conclude that
the sum must be continuous.
The proofs of the continuity of $f - g$, $f \cdot g$, $f/g$ and $f \circ g$ are almost identical to the
one above, and so we leave these as an exercise.
Note that the above proposition does not say anything about continuity of inverse
functions. This is because even if a function is continuous and invertible, its inverse is
not necessarily continuous. We also record the following fact, which we will need.
is continuous.
While we postpone the proof of this proposition until the end of Chapter 9, we invite
you to do the following exercise.
By the above propositions, in order to prove Theorem 6.51, we actually only need to
check that a few of the functions in the list of elementary functions are continuous.
Lemma 6.56
(i) If all polynomials are continuous, then so are all rational functions.
(ii) If the logarithm is continuous, then so are the exponential function and all power
functions.
(iii) If the sine and cosine functions are continuous, then so are the tangent function and
the inverse trigonometric functions.
In light of Lemma 6.56, the following three lemmas establish Theorem 6.51.
Proof. From the rulebook of the limit, we already know that the constants and $f(x) = x$
are continuous. This is enough to do a proof by induction to show that all polynomials
of degree $n$ for all $n \in \mathbb{N}$ are continuous. In particular, the continuity of the constants is
exactly the base case $n = 0$.
To do the induction step, we assume that all polynomials of degree n are continuous.
Our goal is to prove that this implies that all polynomials of degree n + 1 are continuous.
So, suppose that $p(x)$ is a polynomial of degree $n + 1$. Now, observe that $y =
p(x) - p(0)$ has a zero at $x = 0$. By the fundamental theorem of algebra, this means
6.4. HOW TO USE THE RULEBOOK IN PRACTICE 195
we can factor out $x$ and write $p(x) - p(0) = x\,r(x)$ for some polynomial $r(x)$ of degree
$n$, which, by the induction hypothesis, is continuous. But now we have basically won!
Indeed, solving for $p(x)$, we obtain the formula $p(x) = x\,r(x) + p(0)$ which allows us to
use Proposition 6.52 to conclude that $p(x)$ is continuous (keep in mind, we already know
that $y = x$ is continuous).
We now turn to the logarithm, the sine and the cosine. We first recall that in the
"geometrically obvious" facts listed on these functions in remarks 2.49 and 2.65, we also
said that these functions have no "jumps" in their graphs. That is, we take it as obviously
true that these functions are continuous. Nevertheless, below, we indicate how to show
that the continuity of these functions can be deduced from some facts that we will
be able to prove later, once we define the sine, cosine and logarithm in a non-geometric
fashion.
Proof. Let us first prove that the logarithm is continuous at $x = 1$. Since $\ln(1) = 0$, this
amounts to showing that
$$\lim_{x \to 1} \ln(x) = 0.$$
But this follows immediately by applying the squeeze theorem to the following inequality
(recall Proposition 2.70):
$$\frac{x - 1}{x} \le \log x \le x - 1, \quad \forall x > 0.$$
Next, we use the logarithmic laws to prove that this implies that the logarithm is con-
tinuous for all x > 0. To do this, we need to use the logarithmic laws in combination
with the change of variables formula. Here is one way to compute this (note that we are,
yet again, relying on the trick of adding by 0):
Proof. This proof is rather similar to the one for the logarithm above. Indeed, we start
out by establishing that the sine is continuous at $x = 0$. Since $\sin(0) = 0$, this amounts
to showing that
$$\lim_{x \to 0} \sin x = 0. \tag{6.5}$$
As for the logarithm, we again apply an inequality from Chapter 2 (recall Proposition
2.60):
$$0 \le \sin x \le x \le \tan x, \quad \forall x \in [0, \pi/2).$$
Here, we only need the two left-most parts of this triple inequality. Namely that $0 \le
\sin x \le x$. From this, it immediately follows from the squeeze theorem that (6.5) holds
when $x \to 0^+$. Using that $\sin x$ is an odd function, the corresponding one-sided limit
from the left also holds:
$$\lim_{x \to 0^-} \sin x \overset{u = -x}{=} \lim_{u \to 0^+} \sin(-u) = -\lim_{u \to 0^+} \sin(u) = 0.$$
Next, we establish that it now follows that the cosine function is also continuous at $x = 0$.
This can be done in several ways. For instance, we can apply the half-angle formula in
the form
$$\cos(x) = 1 - 2 \sin^2\Big( \frac{x}{2} \Big).$$
Indeed, taking the limit of this expression as $x \to 0$, we find that
$$\begin{aligned} \lim_{x \to 0} \cos x &= 1 - 2 \lim_{x \to 0} \sin^2 \frac{x}{2} \\ &= 1 - 2 \Big( \lim_{x \to 0} \sin \frac{x}{2} \Big)^2 \\ &\overset{u = x/2}{=} 1 - 2 \Big( \lim_{u \to 0} \sin u \Big)^2 = 1 - 2 \cdot 0^2 = 1. \end{aligned}$$
(Notice that we used the continuity of the polynomial $y = x^2$ to be able to pass the limit
inside of the square in the last line!)
Now, all that remains is to use the continuity of the sine and cosine at $x = 0$ to prove
that they are continuous at all $x$. Here is how to prove that the sine is continuous at
an arbitrary point $a \in \mathbb{R}$ using, yet again, trigonometric identities and the change of
variables rule:
$$\begin{aligned} \lim_{x \to a} \sin(x) &\overset{u = x - a}{=} \lim_{u \to 0} \sin(u + a) \\ &= \lim_{u \to 0} \big( \sin u \cos a + \sin a \cos u \big) \\ &= 0 \cdot \cos a + \sin a \cdot 1 = \sin a. \end{aligned}$$
Remark 6.63 We remark that Theorem 6.51, above, is the answer to the following
question: "How do I justify that a function is continuous on the final exam". That is,
just remark that it is an elementary function! (This is something we will practice doing
over and over and over again in Chapter 9!)
Proposition 6.64
$$\lim_{x \to 0} \frac{\ln(x + 1)}{x} = 1.$$
Since this limit can be established using basically the same ideas as when we showed
that the logarithm was continuous at $x = 1$, we outline the proof as an exercise. As
before, the central players will be the logarithmic laws and the inequality
$$\frac{x - 1}{x} \le \log(x) \le x - 1, \quad \forall x > 0.$$
Exercise 6.65 (a) Verify the limit visually by considering a suitable plot.
(b) Prove the above limit by following the proof of the continuity of the logarithm
at x = 1, line by line.
Proposition 6.66
(i) $\lim_{x \to 0} \dfrac{\sin x}{x} = 1$ (ii) $\lim_{x \to 0} \dfrac{1 - \cos(x)}{x} = 0$
Again we give the proofs of these limits as an exercise, since the plan is to mimic what
we did when proving that the sine and cosine were continuous. Specifically, the point is
to use trigonometric identities in combination with the inequality
Exercise 6.67 (a) Verify the limits visually by considering a suitable plot.
(b) Prove the above limits by following the proof of the continuity of the sine and
cosine at x = 0, "essentially" line by line.
(a) $\lim_{x \to 1} \dfrac{\sin(x^2 - 1)}{x - 1}$ (b) $\lim_{x \to 0} \dfrac{e^x - 1}{x}$
Proposition 6.69
$$\lim_{x \to \infty} \frac{\ln x}{x^\alpha} = 0, \quad \alpha > 0.$$
We now give an example of how we can use this limit in a computation. Note that
the rulebook for limits is well-adapted to deal with limits where $x$ tends to infinity.
Example 6.70 We use the above limit in combination with a change of variables to
show that for $\beta > 0$, we have
$$\lim_{x \to \infty} \frac{x^\beta}{e^x} = 0.$$
Notice that this limit is of the form $[\infty/\infty]$. So, this is just another example of the
exponential function winning essentially every fight it is involved in.
Let us try the change of variables $e^x = u$, which is the same as $x = \ln u$. Our hope
is that this will transform this expression into the limit from the above proposition. We
note that $x \to +\infty$ implies that $u = e^x \to +\infty$. This allows the following computation:
$$\begin{aligned} \lim_{x \to \infty} \frac{x^\beta}{e^x} &\overset{u = e^x}{=} \lim_{u \to \infty} \frac{(\ln u)^\beta}{u} = \lim_{u \to \infty} \Big( \frac{\ln u}{u^{1/\beta}} \Big)^\beta \\ &= \Big( \lim_{u \to \infty} \frac{\ln u}{u^{1/\beta}} \Big)^\beta = 0. \end{aligned}$$
In the last line, we first used the fact that $y = x^\beta$ is continuous (see exercise 6.62),
followed by Proposition 6.69 for $\alpha = 1/\beta$.
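To get a feeling for how quickly the exponential function wins, one can tabulate the quotient numerically. The sketch below is an illustration only: we write `beta` for the positive exponent in the example, and the helper name `power_over_exp` and the value `beta = 3.0` are our own assumptions.

```python
import math

def power_over_exp(x, beta):
    """Evaluate x**beta / e**x for x > 0."""
    return x**beta / math.exp(x)

beta = 3.0  # assumed sample value; any beta > 0 may be tried here
for x in (10.0, 20.0, 50.0, 100.0):
    print(f"x = {x:6.1f}   x^beta / e^x = {power_over_exp(x, beta):.3e}")
```

Trying larger values of `beta` delays the decay, but the printed values still shrink towards 0 once `x` is large enough.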
$$\lim_{x \to 0^+} x^\alpha \ln x.$$
Example 6.74 (Vertical asymptotes) Let us determine any vertical and horisontal
asymptotes of the expression
$$f(x) = \frac{1}{\sqrt{x^2 - 2x} - x}.$$
To find the vertical asymptotes, we need to identify all points in $\mathbb{R}$ where the function
can tend to infinity. First, note that when $x$ approaches points where the denominator
is defined and non-zero, the "additional rule of thumb" applies and we get a finite limit.
To figure out where we can have an infinite limit, we investigate $\sqrt{x^2 - 2x} =
\sqrt{x(x - 2)}$. We see that this root is defined on $(-\infty, 0] \cup [2, +\infty)$, and, moreover,
that it is zero when $x \in \{0, 2\}$. We check the following limits:
$$\lim_{x \to 2^+} f(x) = \frac{1}{0 - 2} = -\frac{1}{2} \quad\text{and}\quad \lim_{x \to 0^-} f(x) = \Big[ \frac{1}{0^+ - 0^-} \Big] = \Big[ \frac{1}{0^+} \Big] = +\infty.$$
That is, we have a vertical asymptote at $x = 0$. Here, we used the symbols $0^+$ and $0^-$
to indicate whether the zeroes are approached from the positive or negative sides.
Moreover, we checked one-sided limits since we cannot approach from inside $(0, 2)$.
The final possibility for vertical asymptotes is at points where the root is defined,
but $\sqrt{x^2 - 2x} - x = 0$. That is, at points where $x$ satisfies:
$$\sqrt{x^2 - 2x} = x \implies x^2 - 2x = x^2 \iff x = 0.$$
But this is the point we already detected, so we have found all vertical asymptotes.
To check for horisontal asymptotes, we need to figure out if the expression approaches
some constant as $x$ approaches $+\infty$ or $-\infty$. In this case, the computations more or less
act as if we were computing with sequences and letting $n \to \infty$.
Let us revisit the above example.
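Example 6.75 (Horisontal asymptotes) We check whether the function $f(x)$ in the
previous example has any horisontal asymptotes. First, we check if there is a horisontal
asymptote as $x \to +\infty$:
$$\begin{aligned} \lim_{x \to \infty} \frac{1}{\sqrt{x^2 - 2x} - x} = \Big[ \frac{1}{\infty - \infty} \Big] &= \lim_{x \to \infty} \frac{1}{\sqrt{x^2 - 2x} - x} \cdot \frac{\sqrt{x^2 - 2x} + x}{\sqrt{x^2 - 2x} + x} \\ &= \lim_{x \to \infty} \frac{\sqrt{x^2 - 2x} + x}{-2x} \\ &= \lim_{x \to \infty} \frac{x \big( \sqrt{1 - \frac{2}{x}} + 1 \big)}{-2x} = \frac{\sqrt{1 - 0} + 1}{-2} = -1. \end{aligned}$$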
(Notice how we used practically every rule of thumb here.)
We leave it as an exercise to check for a horisontal asymptote as $x \to -\infty$.
Exercise 6.76 (a) Determine whether the function in the previous example has a
horisontal asymptote as $x \to -\infty$.
(b) Use some visualisation tool to draw the graph of f (x) to verify your answer in
(a).
Exercise 6.77 (a) Determine any vertical and horisontal asymptotes of
$$f(x) = \frac{1}{x^2} \sin\Big( \frac{1}{x} \Big).$$
(b) Use some visualisation tool to draw the graph of the function in (a) to verify
your answers.
We now turn to the question of how to compute skew asymptotes (which, we remind
you, are sometimes also called oblique asymptotes). In Appendix A, we explain how we
can identify skew asymptotes of rational functions by using polynomial division. Here is
a recipe for finding skew asymptotes that also works for functions that are not rational.
Remark 6.79 An example of finding a skew asymptote is worked out in the YouTube
film linked to Method 6.78.
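A numerical experiment can suggest what a skew asymptote should be before any limits are computed by hand. The sketch below assumes the usual recipe, in which the slope is $A = \lim_{x \to \infty} f(x)/x$ and the intercept is $B = \lim_{x \to \infty} (f(x) - Ax)$, so that $y = Ax + B$ is the candidate asymptote. The sample function $\sqrt{x^2 + 2x}$ and the helper names are our own choices for illustration, not taken from the notes.

```python
import math

def slope_estimates(func, xs):
    """Values of func(x)/x at the given large x; they should approach A."""
    return [func(x) / x for x in xs]

def intercept_estimates(func, A, xs):
    """Values of func(x) - A*x at the given large x; they should approach B."""
    return [func(x) - A * x for x in xs]

def sample(x):
    """Hypothetical sample function sqrt(x^2 + 2x), not from the exercises."""
    return math.sqrt(x**2 + 2 * x)

xs = (1e1, 1e2, 1e3, 1e4)
print(slope_estimates(sample, xs))             # suggests a value for A
print(intercept_estimates(sample, 1.0, xs))    # with A = 1, suggests a value for B
```

The value of `A` passed to `intercept_estimates` has to be the exact limit found in the first step; using a rounded estimate of the slope would distort the intercept for large `x`.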
Exercise 6.80 Use the above method to find the skew asymptotes as $x \to \infty$ of the
following functions.
(a) $f(x) = 3x - 2$ (b) $f(x) = \dfrac{x^3 + 2x^2 - 5x}{x^2 + 1}$ (c) $f(x) = x e^{1/x}$.
Hint: The skew asymptote in part (b) can also be found using polynomial division, as
we did in Chapter 2. Try both methods to see if they match up. In (c), a standard
limit may come in handy.
Exercise 6.81 In this exercise, we prove that Method 6.78 will always give the correct
answer.
(iii) (4 points) $\lim_{x \to 2} \dfrac{x^3 - 3x + 1}{2x - 3} = 3$
Exercise 6.83 (Lund, May 2015) Do one of the following exercises (both are worth
the same number of points).
(a) Give the definition of $\lim_{x \to a} f(x) = L$ and use it to show that
$$\lim_{x \to 1} (x^2 + 3x + 1) = 5.$$
Exercise 6.85 (Lund, May 2014) Explain briefly what it means for a function $f$
to be continuous at a point $x$. Then use the epsilon-delta definition of the limit to
show that $f(x) = x^3 + 2x^2 + 3x + 1$ is continuous at $x = 2$.
6.21 C = 1.
6.22 No.
6.23 The function is continuous, but not at x = 0 (how can this be?).
6.28 $\delta = 1/2$ and $\delta = 1/4$, respectively (or any $\delta$ smaller than this).
6.29 (a) $|x - 3| < 1/4$, (b) $|x - 3| < 1/40$, (c) $|x - 3| < \varepsilon/4$.
6.32 It is a bit hard to see, but based on the first figure $\delta = 1/2$ (or even something
slightly larger than this) seems to work, and in the second figure, $\delta = 1/4$ seems to
work. The formula from the example gives $\delta = 4/12 = 1/3$ and $\delta = 2/12 = 1/6$,
respectively. It does not matter that these are not the same (since if one $\delta$ works,
then all smaller $\delta$ also automatically work; the point is to find some $\delta$ that is
small enough).
6.33 (c) Following the steps of Example 6.30, we arrive at $\delta = \min\{1, \varepsilon/21\}$. (But
just changing the steps slightly may lead to other choices for $\delta$ which are also
acceptable).
6.41 Basically, the proofs are the same except conditions of the type $n > N$ are replaced
by conditions of the type $|x - a| < \delta$.
6.71 The limit is equal to 0. Do a change of variables to make it into standard limit
(iii).
6.72 Rewrite the expressions using $a^x = e^{x \log a}$, then you will find the answers (a) $e^{2/3}$,
(b) 1.
6.81 In this exercise, it is important to keep in mind that $y = kx + m$ is a skew asymptote
for $f(x)$ if $\lim_{x \to \infty} (f(x) - kx - m) = 0$. In (a), the strategy is to add by 0 in both
the expressions for $A$ and $B$ to make $kx + m$ appear, and then to use the definition
of the skew asymptote. In (b), it is enough to rewrite the computation that gives
you $B$ in order to verify that $y = Ax + B$ satisfies the definition of being a skew
asymptote.
Chapter 7
What is Calculus
As in Chapter 3, we now take the time to explain one of the major ideas of mathematics.
In Chapter 3, the point was to explain how mathematical analysis can be understood
as the study of mathematical objects in terms of limits of "infinite" processes. Here, we
discuss some of the central ideas of Calculus, which is the part of mathematical analysis
that deals with the interplay between the derivative and the definite integral.
Informal definition 7.1 (Day job of the derivative) The derivative $f'(x)$ of a
function $f$ denotes the slope of the line tangent to the graph of $f$ at the point $x$.
Fig. 1. The slope of the blue line is exactly what we mean by $f'(1)$.
208 CHAPTER 7. WHAT IS CALCULUS
Based on the informal definition, it may come as a surprise that the derivative is
probably among the most important scientific objects ever "discovered". Indeed, it is
central for our understanding of our physical reality in mathematical terms!
Fig. 2. The derivative is like Batman. It has a seemingly boring day job, but is a
superhero by night (describing gravitational waves and curing cancer!).
$$f'(1) \approx \frac{f(2) - f(1)}{2 - 1} = \frac{2^2 - 1^2}{1} = 3.$$
Fig. 3. A first guess of the slope of $f$ at $x = 1$.
How to improve this estimate? Well, let us make the gaps between the red dots smaller:
$$f'(1) \approx \frac{f(1 + \frac{1}{2}) - f(1)}{1 + \frac{1}{2} - 1} = \frac{(1 + \frac{1}{2})^2 - 1^2}{\frac{1}{2}} = 2.5$$
$$f'(1) \approx \frac{f(1 + \frac{1}{10}) - f(1)}{1 + \frac{1}{10} - 1} = \frac{(1 + \frac{1}{10})^2 - 1^2}{\frac{1}{10}} = 2.1.$$
What happens if we make the gaps between the red dots even smaller? Could it be that the
approximations for $f'(1)$ stabilise at 2? As you are supposed to check in the following
exercise, this is indeed the case!
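The slopes computed above are easy to reproduce in Python. The following is a minimal sketch; the helper name `secant_slope` and the list of gap sizes are our own choices. It computes the slope of the straight line through $(1, f(1))$ and $(1 + h, f(1 + h))$ for a shrinking gap $h$.

```python
def f(x):
    """The function from the example, f(x) = x^2."""
    return x**2

def secant_slope(func, a, h):
    """Slope of the line through (a, func(a)) and (a + h, func(a + h))."""
    return (func(a + h) - func(a)) / h

# The gaps h = 1, 1/2 and 1/10 are the ones used in the text above.
for h in (1.0, 0.5, 0.1, 0.01, 0.001):
    print(f"h = {h:<6}  slope = {secant_slope(f, 1.0, h):.6f}")
```

Very small values of `h` eventually suffer from rounding errors in floating point arithmetic, so the numerical experiment complements, but does not replace, the analytic computation of the limit.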
Exercise 7.3 We consider the function $f(x) = x^2$ from the above example.
(a) What approximation for $f'(1)$ do you get if you compute the slope of the straight
line through the points $(1, f(1))$ and $(1 + 1/100, f(1 + 1/100))$?
(b) Write out an expression for the slope of the line passing through the points
$(1, f(1))$ and $(1 + h, f(1 + h))$, where $h$ is some unknown number. Try to determine the limit
as $h \to 0$ both numerically using Python and analytically using computational
rules for the limit.
Inspired by the above discussion, we now give the definition of what we mean by the
derivative and tangent lines, respectively, of a function f at a point x.
$$f'(x) \overset{\text{def}}{=} \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}$$
at all points where this limit exists. Moreover, if the limit exists, we say that $f$ is
differentiable at $x$. If $f$ is differentiable at all points in its domain, we simply say that $f$
is differentiable. Note that we sometimes write $\frac{d}{dx} f(x)$ or $\frac{df}{dx}(x)$ instead of $f'(x)$.
$$y = f'(a)(x - a) + f(a)$$
Exercise 7.6 Use the definition of the derivative to calculate the derivatives of:
(a) $f(x) = C$ (b) $f(x) = x$ (c) $f(x) = x^2$
(d) $f(x) = 1/x$ (e) $f(x) = kx + m$ (f) $f(x) = e^{Cx}$
Informal definition 7.7 (Day job of the definite integral) The definite integral
$$\int_a^b f(x)\,dx$$
denotes the area under the graph of the function $f(x)$ on the interval $[a, b]$, where the
area below the x-axis is to be interpreted as being negative.
Fig. 6. Left: The definite integral of $y = x^2$ over the interval $[0, 1]$. Right: The
definite integral of $y = \sin x$ over the interval $[0, 8]$. The area below the x-axis is
counted as "negative area".
There is more than one way to set up a limiting process for computing the area under
the graph of a function, and all of them lead to a version of the definite integral. In fact,
it is said that every mathematician in the 18th century had his own version of the defi-
nite integral. The Riemann integral, which is the one we learn in this course, is mostly
considered a pedagogical tool. In practice, mathematicians, physicists and engineers use
the more advanced Lebesgue integral which you will meet in later courses.
Fig. 7. If the derivative is like Batman, then it makes sense to think of the definite
integral as Robin. Indeed, as we shall see below, together they form a formidable
crime fighting duo!
Example 7.8 Let us try to compute the area under the graph of $f(x) = \sqrt{1 - x^2}$ for
$x \in [0, 1]$. As in high school, our strategy is to approximate this area by using rectangles,
and, as with the derivative, we make such approximations pretending that we only have
knowledge of the graph of $f$ at a finite number of evenly spread red dots.
Fig. 8. To the left, we approximate the area under the graph of $f(x) = \sqrt{1 - x^2}$ using
4 rectangles with base lengths 1/4, to the right, we approximate using 8 rectangles
with base lengths 1/8 (in both cases, the right-most rectangle has height zero).
In the above figures, we see two examples of lower Riemann sums. That is, finite
sums of the area of rectangles lying below some graph. For convenience, we denote the
lower Riemann sums shown above by $L_4$ and $L_8$, respectively.
In particular, if we write $x_k = k/4$ for $k \in \{0, 1, 2, 3, 4\}$, then the area expressed in
the left-most figure, above, is equal to
Writing $\Delta x_k = (x_k - x_{k-1})$ for the base-lengths of the rectangles, we can express the
resulting approximation as
$$\int_0^1 f(x)\,dx \approx \sum_{k=1}^{4} f(x_k)\,\Delta x_k = \sum_{k=1}^{4} \sqrt{1 - (k/4)^2} \cdot \frac{1}{4} = 0.62\ldots,$$
Exercise 7.9 (a) Use high school geometry to compute the exact value of the area
considered in the above example. (Hint: Rewrite $y = \sqrt{1 - x^2}$.)
(b) Compute $L_8$, $L_{100}$, $L_{1000}$ and compare these values to the value obtained in (a).
Hint: To do (b), feel free to use the code given in Example 7.10, below.
7.1. THE DERIVATIVE AND THE DEFINITE INTEGRAL 213
Notice how the code mirrors the mathematical notation (and that we, in this code, split
the interval into N equally long pieces)! Also, note that we could replace the three last
lines by the single line S = sum([f(X[k])*(X[k] - X[k-1]) for k in range(1,N+1)]).
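The code of Example 7.10 is not reproduced in this copy of the notes. As a minimal sketch of what such code might look like, assuming the function $f(x) = \sqrt{1 - x^2}$ from Example 7.8 and the list name X used in the remark above (the helper name lower_sum is hypothetical), one could write:

```python
import math

def f(x):
    # The function from Example 7.8; max(...) guards against a tiny
    # negative round-off under the square root.
    return math.sqrt(max(0.0, 1 - x**2))

def lower_sum(N, a=0.0, b=1.0):
    # Split [a, b] into N equally long pieces: X[0] = a, ..., X[N] = b.
    X = [a + (b - a) * k / N for k in range(N + 1)]
    # f is decreasing on [0, 1], so the right endpoint X[k] gives the
    # height of the rectangle lying below the graph.
    return sum(f(X[k]) * (X[k] - X[k - 1]) for k in range(1, N + 1))

L4 = lower_sum(4)
L8 = lower_sum(8)
print(L4, L8)
```

Note that using the right endpoint only yields a lower sum because this particular f is decreasing; for other functions the choice of sample point would have to be reconsidered.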
Inspired by the above discussion, we now give the definition of what we mean by the
definite integral of a function f with respect to some interval [a,b].
At the moment, even though it may seem reasonable, we do not know if the limit in
Definition 7.11 exists for all continuous functions. This is something we prove is true in
Chapter 12.
Exercise 7.12 For each of the following integrals, (a) make a drawing of the actual
area that the integrals represent, and use what you know about high school geome-
try to compute their exact values, and (b) adapt the code in the above example to
approximate them numerically.
(i) $\int_1^3 x\,dx$, (ii) $\int_0^{2\pi} \sin(x)\,dx$.
Example 7.13 Let us denote the upper Riemann sum illustrated in Figure 9 by U4 .
Writing, as above, $x_k = k/4$ for $k \in \{0,1,2,3,4\}$ and letting $\Delta x_k = x_k - x_{k-1}$, we obtain
$$U_4 = \sum_{k=1}^{4} f(x_{k-1})\,\Delta x_k = 0.87\ldots$$
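To make the comparison between upper and lower sums concrete, here is a small sketch that computes both for four rectangles. It assumes, as in Example 7.8, that $f(x) = \sqrt{1 - x^2}$ on [0,1]; the variable names are illustrative only.

```python
import math

def f(x):
    # Function from Example 7.8, guarded against negative round-off.
    return math.sqrt(max(0.0, 1 - x**2))

N = 4
X = [k / N for k in range(N + 1)]  # x_k = k/4

# f is decreasing, so left endpoints give the upper sum and
# right endpoints give the lower sum.
U4 = sum(f(X[k - 1]) * (X[k] - X[k - 1]) for k in range(1, N + 1))
L4 = sum(f(X[k]) * (X[k] - X[k - 1]) for k in range(1, N + 1))

print(U4, L4, U4 - L4)
```

The exact area lies between the two printed sums, so their difference bounds the error of either approximation.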
In the following exercise, we make the important point that for monotone functions,
we can figure out the quality of approximations by Riemann sums without having to
compute the Riemann sums themselves.
Exercise 7.15 Use the result of the previous exercise to determine how large we have
to choose $n$ in order for $L_n$ to approximate $\int_0^1 \sqrt{1 - x^2}\,dx$ with an error of less than
1/1000. Use Python to verify that this is correct.
However, there is also a third way to compute definite integrals – one which you prob-
ably used the most in high school. This method uses the connection between derivatives
and the definite integral as discovered by Newton and Leibniz in the 17th century. Some
even claim that this is the most important scientific discovery ever made!
3. Use the evaluation formula. That is, if $F$ is a primitive function of $f$ on $[a,b]$ (that
is, $F'(x) = f(x)$ for all $x \in [a,b]$), then
$$\int_a^b f(t)\,dt = F(b) - F(a).$$
Fig. 11. The derivative and the definite integral, finally together, kicking ass!
When reading the above result, keep in mind that the definite integral and the derivative
come from two completely different geometric notions: areas and tangent lines, respectively.
It was therefore quite surprising, and very useful, when Newton and Leibniz realised that
these two concepts are mirror images of one another! (In fact, historically, the definite
integral was studied long before anyone came up with the derivative.) At the time, there
was a huge argument between the two on who discovered "Calculus" – the link between the
theory of integration and the theory of differentiation – first. In the end, Newton was
credited for the discovery, but we are using Leibniz' notation.
Fig. 12. Gottfried Wilhelm von Leibniz (1646–1716) shown biting Sir Isaac Newton (1642–1727).
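As a quick sanity check of the evaluation formula, the following sketch compares a Riemann sum with $F(b) - F(a)$. The choice $f(x) = \cos x$ with primitive $F(x) = \sin x$ on [0,1] is an illustrative assumption and is not taken from the notes.

```python
import math

a, b, N = 0.0, 1.0, 1000
X = [a + (b - a) * k / N for k in range(N + 1)]

# Riemann sum with right endpoints, in the notation of Example 7.8.
riemann = sum(math.cos(X[k]) * (X[k] - X[k - 1]) for k in range(1, N + 1))

# Evaluation formula: F(b) - F(a) with F = sin, since sin' = cos.
exact = math.sin(b) - math.sin(a)

print(riemann, exact)
```

The two numbers should agree to roughly three decimals for N = 1000, and the agreement improves as N grows.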
Exercise 7.16 Use the evaluation formula in combination with exercise 7.6 to com-
pute the following integrals.
(a) $\int_1^3 x\,dx$ (b) $\int_1^3 \frac{dx}{x^2}$ (c) $\int_1^3 e^{5x}\,dx$.
Exercise 7.17 Compute the following expressions by first using the evaluation for-
mula, and then differentiating (you can assume that f has a primitive). Do the answers
surprise you?
(a) $\frac{d}{dx} \int_1^x \frac{dx}{x^2}$ (b) $\frac{d}{dx} \int_1^x f(t)\,dt$
$= a_n + a_n = 2a_n$
The assumption that we have one single bacterium at time zero means that we put $a_0 = 1$.
By the above formula for $a_{n+1}$, we immediately obtain
$a_1 = 2a_0 = 2$
$a_2 = 2a_1 = 4$
allow us to compute the number of bacteria at any given time. We call this system of
equations the proportional growth model for the E. coli population on our sandwich. In
particular, we call $a_{n+1} = 2a_n$ the evolution equation of the model, and $a_0 = 1$ its initial
condition (e.g., if we initially had more bacteria, we would just adjust $a_0$).
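The evolution equation and initial condition translate directly into a short loop. The following is a minimal sketch; the function name proportional_growth is hypothetical.

```python
def proportional_growth(a0, steps):
    # a[0] is the initial condition; each step applies a_{n+1} = 2 * a_n.
    a = [a0]
    for n in range(steps):
        a.append(2 * a[n])
    return a

a = proportional_growth(1, 6)
print(a)  # [1, 2, 4, 8, 16, 32, 64]
```

Changing the first argument corresponds to adjusting the initial condition $a_0$.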
7.2. A FIRST LOOK AT DIFFERENTIAL EQUATIONS 217
Exercise 7.19 (a) According to Google, around 50 E. coli bacteria are more than
enough to poison the average person. How long does it take for the E. coli population
to reach this size?
(b) The earth weighs $6 \cdot 10^{24}$ kg and a bacterium about $10^{-15}$ kg. How long does it
take for the colony of bacteria to weigh more than the earth? Is this reasonable?
A point of the above exercise is to illustrate that this model for bacterial growth is
not particularly realistic in the long run. The following model improves on this by taking
into account that bacteria produce less offspring when food and space become scarce.
Example 7.20 (The discrete-time logistic growth model) The evolution equation
of the so-called logistic growth model is obtained by making the assumption that the
number of new bacteria produced at time step n + 1 is given by the formula
$$(\text{number of new bacteria}) = a_n \Bigl(1 - \frac{a_n}{L}\Bigr). \tag{7.1}$$
Here, $L$ is a parameter reflecting the maximum number of bacteria supported by the
sandwich (note that the expression (7.1) approaches 0 as $a_n$ approaches $L$). This means
that the evolution equation from the proportional growth model should be replaced by
$$a_{n+1} = a_n + a_n \Bigl(1 - \frac{a_n}{L}\Bigr) = a_n \Bigl(2 - \frac{a_n}{L}\Bigr).$$
Fig. 13. Left: A plot of the 20 first terms from the logistic growth model with $L = 100$
and initial condition $a_0 = 1$. Right: Two actual growth curves of E. coli taken from
a report published at http://2009.igem.org/Team:SJTU-BioX-Shanghai/Results.
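The terms plotted in the left panel can be generated with a few lines of code. This is a sketch under the parameters stated in the caption ($L = 100$, $a_0 = 1$); the function name logistic_growth is hypothetical and no plotting is done here.

```python
def logistic_growth(a0, L, steps):
    # Evolution equation of the discrete-time logistic model:
    # a_{n+1} = a_n * (2 - a_n / L).
    a = [a0]
    for n in range(steps):
        a.append(a[n] * (2 - a[n] / L))
    return a

# The 20 first terms a_0, ..., a_19.
a = logistic_growth(1, 100, 19)
for n, value in enumerate(a):
    print(n, round(value, 4))
```

For small $a_n$ the sequence roughly doubles at each step, as in the proportional model, and it then levels off as $a_n$ approaches $L$.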
Exercise 7.21 (a) Suppose an E. coli bacterium has a diameter of $10^{-6}$ meters. Use
this to estimate how many E. coli will fit on your sandwich (i.e., the value of $L$).
(b) Compute how long it takes for the colony to reach 50 bacteria according to the
logistic growth model.
(c) Compute how many bacteria there are on the sandwich after 2 days, according
to the logistic growth model. Compare this to the answer in (b) of exercise 7.19.
Fig. 14. That is, the time variable will now move smoothly along the real line,
instead of only taking certain values through a sequence of discrete jumps.
An advantage of working with differential equations is that we can apply the tools of
Calculus to solving them. A disadvantage, however, is that they are more difficult to
simulate (due to the lack of discrete time-jumps).
Example 7.22 (Continuous-time growth models) Suppose that the function $y(t)$
describes the number of bacteria on the sandwich from Example 7.18 at time $t$. This
means that the derivative $y'(t)$ describes the rate of growth of the colony. Assuming that
we start out with one lonely bacterium at time $t = 0$, our initial condition is $y(0) = 1$.
If we assume that the bacterial growth is simply proportional to the number of
bacteria present (similar to the discrete-time proportional growth model), then we arrive
at the continuous-time proportional growth model
$$\begin{cases} y'(t) = Cy(t) \\ y(0) = 1, \end{cases}$$
for some experimentally determined constant $C$. Here, $y'(t) = Cy(t)$ is the evolution
equation of the model, and $y(0) = 1$ is its initial condition. As in the discrete model, if
we initially had more bacteria, we would adjust $y(0)$.
Similar to the discrete-time case, a more realistic model is given by the continuous-
time logistic growth model, where we replace the evolution equation by
$$y'(t) = Cy(t) \Bigl(1 - \frac{y(t)}{L}\Bigr).$$
While there is no nice formula for the solution of the discrete-time logistic growth model,
we will see later on that using the tools of Calculus, a formula for the solution is possible
to find in the continuous-time case.
Exercise 7.23 (a) Verify that $y(t) = De^{Ct}$ solves the continuous-time proportional
growth model for all $D \in \mathbb{R}$. Hint: Recall exercise 7.6.
(b) Suppose that $y(t)$ from (a) models the growth of the bacteria on our sandwich
after $t$ minutes. Use the equations $y(0) = 1$ and $y(20) = 2$ to find $C$ and $D$.
(c) How long does it take for the colony to reach 50 bacteria according to this model?
$$y' = xy. \tag{7.2}$$
While we do not know what the solution to this equation is, we do know that if its
solution has the value $y = 2$ at $x = 1$, then the derivative $y'$ of the solution at $x = 1$
must satisfy
$$y'(1) = 1 \cdot 2 = 2.$$
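The same reasoning works at any point: the differential equation prescribes the slope of a solution passing through $(x, y)$. A small sketch that tabulates these slopes for equation (7.2) on a coarse grid follows; the grid points are chosen arbitrarily for illustration.

```python
def f(x, y):
    # Right-hand side of the differential equation y' = x*y from (7.2).
    return x * y

# Slope prescribed by the equation at each point of a small grid.
grid = [(x, y) for x in (-1, 0, 1) for y in (0, 1, 2)]
slopes = {(x, y): f(x, y) for (x, y) in grid}

for point in grid:
    print(point, slopes[point])
```

The entry for the point (1, 2) is the slope 2 computed in the text.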
Exercise 7.25 Match the slope fields shown below to the differential equations:
(a) $y' = 1/x$ (b) $y' = y$
(i) (ii)
Exercise 7.26 Suppose we use the continuous-time growth model to model the num-
ber of bacteria on our sandwich, with C and L as in exercise 7.21. Draw a slope field
for the evolution equation for this model (use reasonable values for x and y).
Hint: Try not to be too detailed here. A quite rough slope field will be sufficient for
understanding how the solutions behave.
The second step in obtaining a visual solution is to place your pen somewhere in the
slope field, and then trace the path suggested by the "flow" of the slope field. We make
this more precise in the following example:
Example 7.27 Again, we consider the differential equation $y' = xy$ from the previous
example. Let us assume that the solution we are looking for satisfies $y(-1) = 1$.
That is, we are looking for a solution of the system
$$\begin{cases} y' = xy \\ y(-1) = 1 \end{cases}$$
As in the discrete-time case, we call $y' = xy$ the evolution equation of this system, and
$y(-1) = 1$ its initial condition.
Now, if we place our pen at the point $(x,y) = (-1,1)$, and then trace out the path suggested
by the "flow" of the slope field, we arrive at the following:
Fig. 16. Here we have used the slope field to visualise the solution of the differential
equation with initial condition $y(-1) = 1$.
In the above example, notice how the solution seems to have a slope at each point
that matches the slope prescribed by the differential equation! That is, we have visually
solved the differential equation for the given initial condition. Intuitively, we can think
of the slope field as describing the current of some river, and the solution being the path
of a leaf dropped at the coordinate of the initial condition.
Exercise 7.28 (Exercise 7.25, continued) Use the above slope fields to visually solve
the following differential equations with given initial values:
(a) $\begin{cases} y' = 1/x \\ y(1) = 0 \end{cases}$ (b) $\begin{cases} y' = y \\ y(0) = 1 \end{cases}$
Exercise 7.29 (Exercise 7.26, continued) Draw a few solutions for the continuous-
time logistic growth model for our E. coli infested sandwich. How do the solutions
behave with respect to different initial conditions at x = 0?
Fig. 17. Here, we have drawn, by hand, a "solution" of the differential equation
$y' = x^2 - y^2$, with initial condition $y(-1) = 1$. The idea is to follow the blue arrows!
To analyse, step-by-step, how we came up with the red curve in the above figure, we
assume that the differential equation can be expressed in the form
$$y' = f(x,y), \tag{7.3}$$
where $f(x,y)$ is some expression in terms of $x$ and $y$ (for instance, the differential equation
in Example 7.24 is of this form for $f(x,y) = xy$).
Step 1: Place the point of your pen at an initial coordinate $(x_0, y_0)$ and choose a
step-size $\Delta x$.
Step 2: Check the slope of the blue line of the slope field at $(x_0, y_0)$. According to
formula (7.3), the value of $y'$ is given by $f(x_0, y_0)$.
Step 3: Let your pen follow the blue line as you move a distance of $\Delta x$ in the horizontal
direction. Since the blue line has a slope of $f(x_0, y_0)$, this means you also move a distance
of $\Delta y = f(x_0, y_0)\,\Delta x$ in the vertical direction. In this way, you end up moving diagonally
from $(x_0, y_0)$ to $(x_1, y_1) = (x_0 + \Delta x, y_0 + \Delta y)$.
Fig. 18. When drawing the blue lines one after another, we start to get something looking
like a graph.
Step 4: We now repeat steps 1, 2 and 3, with the point $(x_0, y_0)$ replaced by $(x_1, y_1)$, to
obtain a new line segment ending at the point $(x_2, y_2)$, and so on, until you have reached
as far as you want (note that it is usual to keep the same $\Delta x$ at each consecutive step).
Example 7.30 Let us, by hand, simulate the differential equation $y' = x^2 - y^2$ with
initial condition $(x_0, y_0) = (-1, 1)$ and time-step $\Delta x = 0.1$. This means that $f(x,y) =
x^2 - y^2$. According to step 2, above, we compute $y' = f(-1,1) = 0$. This gives $\Delta y =
f(-1,1)\,\Delta x = 0$. This means that our new point is
$$(x_1, y_1) = (x_0 + \Delta x, y_0 + \Delta y) = (-0.9, 1).$$
To move a second "time-step" of $\Delta x = 0.1$, we repeat the process with the point $(-1,1)$
replaced by $(-0.9,1)$. Note that this time $y' = f(-0.9,1) = (-0.9)^2 - 1^2 = -0.19$.
Fig. 19. A selection of simulated curves. The orange is the one produced in the
above example. The right-most image is the one produced by this code.
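The code of Example 7.31 is not reproduced in this copy of the notes. As a minimal sketch of how Steps 1 to 4 might be automated, assuming the equation $y' = x^2 - y^2$ and the starting point $(-1, 1)$ shown in Fig. 17 (the function name euler is hypothetical; deltaX is the variable name mentioned in Remark 7.35), one could write:

```python
def f(x, y):
    # Right-hand side of y' = f(x, y); here y' = x^2 - y^2.
    return x**2 - y**2

def euler(f, x0, y0, deltaX, steps):
    # Step 1: start at (x0, y0) with a fixed step-size deltaX.
    X, Y = [x0], [y0]
    for n in range(steps):
        # Step 2: the slope at the current point is f(X[n], Y[n]).
        slope = f(X[n], Y[n])
        # Step 3: move deltaX horizontally and slope * deltaX vertically.
        X.append(X[n] + deltaX)
        Y.append(Y[n] + slope * deltaX)
        # Step 4: the loop repeats from the new point.
    return X, Y

X, Y = euler(f, -1.0, 1.0, 0.1, 20)
for x, y in zip(X[:3], Y[:3]):
    print(round(x, 2), round(y, 4))
```

The lists X and Y can then be handed to any plotting routine to draw the simulated curve as a sequence of line segments.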
Exercise 7.32 Consider the differential equation $y' = xy$ with $(x_0, y_0) = (-2, 1)$.
(a) Where do you end up if you take a "time-step" of length $\Delta x = 0.1$? Compare
this with the figure in Example 7.27.
(b) Where do you end up if you take a second time step? There are now two sources
of error. Which ones?
(c) Use the code from Example 7.31 to simulate the differential equation with the
above initial condition. Does it match the figure in Example 7.27?
Exercise 7.33 (Exercise 7.28, continued.) Use the code from Example 7.31 to sim-
ulate the initial value problems from Exercise 7.28. How do the simulations compare
to your visual solutions?
Exercise 7.34 (Exercise 7.29, continued.) Use the code from Example 7.31 to simu-
late the continuous-time logistic growth model for our sandwich. Choose initial con-
ditions at y(0) that will result in different types of behaviour.
Remark 7.35 (The danger of butterflies) The code in Example 7.31 has an extreme
– and unnecessary – weakness which has led to people being killed, and stock markets to
crash. The problem is that there is a round-off error which experiences a snowball effect
as the code runs.
To understand what is going on, recall that a computer cannot represent every real
number. Instead, it has a finite number of floating point numbers that it can repre-
sent. Unfortunately, 1/10 is not one of them. Instead, the variable deltaX will contain
something that is close to 1/10, but which contains an error in the 16th digit. That is,
$\Delta x = 1/10 + \epsilon$ where $\epsilon \approx 10^{-17}$ (an important detail: why do we claim that the error
is of the size $10^{-17}$ and not $10^{-16}$?). After $n$ repetitions of the for-loop, this means that
the error in the x variable is equal to $n \cdot \epsilon \approx n \cdot 10^{-17}$.
To the right, we see the Patriot defensive missile
system used by the Americans, for instance, in their
wars against Iraq. It had a targeting computer that
made a "tick" every 1/10 seconds. Some poor pro-
grammer needed to make a vector containing a sched-
ule for all the "ticks" from the targeting computer,
and did this using a for-loop essentially executing the
command t = t + 0.1 (just as we do in Example 7.31). The result was that the missile
system became useless after a certain amount of time.
Fig. 20.
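The effect is easy to observe on a small scale. The following sketch assumes standard double-precision floating point numbers, which is what Python's float type uses on common platforms.

```python
t = 0.0
for n in range(10):
    t = t + 0.1  # ten "ticks" of 1/10 second each

# Mathematically t should now equal 1, but the stored value differs slightly.
print(t == 1.0)
print(abs(t - 1.0))
```

After only ten steps the discrepancy is tiny, but it keeps growing with the number of repetitions of the loop.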
Exercise 7.36 (a) Compute the error accumulated in the time variable 100 hours
after a Patriot missile system was turned on.
(b) Suggest a way to change the code to avoid the problem of round-off errors alto-
gether in the list X.
7.3. ANSWERS TO SELECTED EXERCISES 225
7.6 (a) $f'(x) = 0$, (b) $f'(x) = 1$, (c) $f'(x) = 2x$, (d) $f'(x) = -1/x^2$, (e) $f'(x) = k$, (f)
$f'(x) = Ce^{Cx}$.
7.12 (a): The area represented by (i) is a rectangle plus a triangle with combined area
4. The area represented by (ii) has equal size above and below the x-axis, and so,
by symmetry, the integral is equal to 0.
7.14 The combined area of the difference between $U_N$ and $L_N$ is equal to a column
with area $(f(b) - f(a)) \cdot (b - a)/N$. For this to be less than $\epsilon > 0$, we must have
$N > (f(b) - f(a))(b - a)/\epsilon$.
7.19 (a) After 2 hours there are 64 bacteria. This is just enough to get food poisoning,
(b) just less than 44 hours.
7.23 (b) $D = 1$, $C = (\ln 2)/20$, which yields $y(t) = 2^{t/20}$, (c) A bit less than 113 minutes.
7.32 You should get the following answers: (a) After one step, the simulated solution
gives $y(-1.9) \approx 0.8$, while the exact solution gives $y(-1.9) = e^{-0.195} = 0.8228\ldots$.
(b) After two time steps, the simulated solution gives $y(-1.8) \approx 0.648$. The main
sources of error are now that we are relying on an approximate value of $y(-1.9)$
(in step one, we had an exact value of $y(-2)$), as well as making a mistake by
following this slope for the full length of the time-step $\Delta x = 0.1$ (in the first step,
this was the only source of error).
Additionally, the computer makes a round-off error, but this has an extremely
minor effect at this point.
Chapter 8
The derivative
In this chapter we figure out the computational rules for the derivative and prove the
differentiation formulas for the functions most commonly appearing in these lecture
notes.
Remark 8.1 (Selected problems from previous exams based on this chapter)
1. The following elementary functions are closely related to the trigonometric func-
tions, and are called hyperbolic functions:
$$\sinh x = \frac{e^x - e^{-x}}{2} \quad \text{and} \quad \cosh x = \frac{e^x + e^{-x}}{2}.$$
(a) Prove the differentiation formulas
$$f'(x) \overset{\text{def}}{=} \lim_{h \to 0} \frac{f(x+h) - f(x)}{h}$$
at all points where this limit exists. Moreover, if the limit exists, we say that f is
differentiable at x. If f is differentiable at all points in its domain, we simply say that f
is differentiable. Note that we sometimes write $\frac{d}{dx} f(x)$ or $\frac{df}{dx}(x)$ instead of $f'(x)$.
Exercise 8.3 (a) In Chapter 7 we computed the derivative of $f(x) = x^2$ using the
definition. Check that you can also compute this derivative by using the limit
$$\lim_{u \to x} \frac{f(x) - f(u)}{x - u}.$$
(b) Is it always true that the limit in (a) is equal to the derivative of f (x) for all
differentiable functions f ? If yes, prove this, or, if not, then find a counter-
example.
Remark 8.4 (Leibniz notation) The letter $h$ in the definition of the derivative denotes
how far we move away from the point $x$ on the x-axis. Similarly, the quantity
$f(x+h) - f(x)$ denotes how far this pushes the function away from the value $f(x)$ along the
y-axis. Denoting these changes to the values $x$ and $f(x)$ by $\Delta x$ and $\Delta f$, respectively,
Leibniz came up with the notation $\frac{df}{dx}$ for the derivative. (Here, we should point out
that the Greek letter $\Delta$ corresponds to the Latin letter "d" and stands for "difference".)
Explicitly, we have
$$\frac{df}{dx} = \lim_{\Delta x \to 0} \frac{\Delta f}{\Delta x} = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}.$$
In order to remember this notation, the following is worth keeping in mind:
(In fact, this also holds true in the case of the notation for the definite integral.)
8.1. COMPUTATIONAL RULES FOR THE DERIVATIVE 229
$$\lim_{h \to 0} \frac{f(0+h) - f(0)}{h} = \lim_{h \to 0} \frac{f(h)}{h}.$$
Here, we used that f (0) = 0, according to the definition of f . Now, to plug in a formula
for f (h), we need to know if h is positive or negative (since f has different formulas
depending on the sign of h). This forces us to consider the one-sided limits separately:
$$\lim_{h \to 0^+} \frac{f(h)}{h} = \lim_{h \to 0^+} \frac{h^2 + h}{h} = \lim_{h \to 0^+} (h + 1) = 1,$$
$$\lim_{h \to 0^-} \frac{f(h)}{h} = \lim_{h \to 0^-} \frac{Ch}{h} = \lim_{h \to 0^-} C = C.$$
Since these two limits are equal if and only if the two-sided limit exists (Proposition
6.14), it follows that f is differentiable at x = 0 exactly if C = 1.
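The two one-sided limits can also be explored numerically. In the sketch below, the piecewise formula for f is inferred from the limits computed above ($x^2 + x$ for $x \geq 0$ and $Cx$ for $x < 0$), since the definition itself is not visible in this copy; the helper names are hypothetical.

```python
def make_f(C):
    # Piecewise function inferred from the one-sided limits in the example.
    def f(x):
        return x**2 + x if x >= 0 else C * x
    return f

def one_sided_quotients(f, h):
    # Difference quotients at 0, approaching from the left and from the right.
    left = (f(0 - h) - f(0)) / (-h)
    right = (f(0 + h) - f(0)) / h
    return left, right

for C in (1.0, 2.0):
    left, right = one_sided_quotients(make_f(C), 1e-6)
    print(C, left, right)
```

Only for C = 1 do the two quotients approach the same value, in line with the conclusion of the example.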
Exercise 8.6 Consider the function in the above example. Suppose that x > 0 is
some fixed number. Do we have to take both formulas of f into consideration when
computing f 0 (x)? Explain why.
Exercise 8.7 Determine values for C and D so that the following function is both
continuous and differentiable at x = 0:
$$f(x) = \begin{cases} \sqrt{x+1} + D, & x \geq 0, \\ C(x+1), & x < 0. \end{cases}$$
$x^2$    $2x$
$x$    $1$
$1/x$
$\sqrt{x}$
$x^\alpha$
$e^x$
$\ln x$
$\sin x$
$\cos x$
$\tan x$
$\arcsin x$
$\arccos x$
$\arctan x$
(v) $\frac{d}{dx} f\bigl(g(x)\bigr) = f'\bigl(g(x)\bigr) \cdot g'(x)$ (chain rule)
We prove these rules in Section 8.2, below. As we shall see, these rules are all con-
sequences of the computational rules for the limit. For this reason, it may be surprising
that the sum rule looks very similar to the one for the limit, while others do not.
While the above computational rules should be more or less familiar from high school,
the following rule is probably not.
$$\frac{d}{dx} f^{-1}(x) = \frac{1}{f'\bigl(f^{-1}(x)\bigr)}.$$
We immediately note that while this result may be hard to read and apply, it is
actually not that hard to prove. Indeed, it follows almost immediately from the chain
rule. We shall return to this when we discuss implicit differentiation later in the chapter.
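To see how the formula for the derivative of an inverse function is applied, here is a small numerical sketch. The choice $f(t) = t^3$ with inverse $f^{-1}(x) = x^{1/3}$ for $x > 0$ is an illustrative assumption, and the helper name inverse_derivative is hypothetical.

```python
def inverse_derivative(f_prime, f_inverse, x):
    # Formula from the text: (f^{-1})'(x) = 1 / f'(f^{-1}(x)).
    return 1 / f_prime(f_inverse(x))

f_prime = lambda t: 3 * t**2         # derivative of f(t) = t^3
f_inverse = lambda x: x ** (1 / 3)   # inverse of f, valid for x > 0

x, h = 8.0, 1e-6
by_formula = inverse_derivative(f_prime, f_inverse, x)
# Independent check: symmetric difference quotient of the inverse itself.
by_difference = (f_inverse(x + h) - f_inverse(x - h)) / (2 * h)

print(by_formula, by_difference)
```

Both numbers should be close to 1/12, which is also what the power rule gives for $x^{1/3}$ at $x = 8$.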
Example 8.11 Here are some examples to illustrate rules (i) to (iv).
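(i) $\frac{d}{dx}\bigl(x^3 + \sin x\bigr) = 3x^2 + \cos x$
(ii) $\frac{d}{dx}\bigl(x^3 \sin x\bigr) = 3x^2 \sin x + x^3 \cos x = x^2\bigl(3 \sin x + x \cos x\bigr)$
(iii) $\frac{d}{dx}\Bigl(\frac{1}{x^3 \sin x}\Bigr) = -\frac{1}{(x^3 \sin x)^2} \cdot \frac{d}{dx}\bigl(x^3 \sin x\bigr) = -\frac{x^2\bigl(3 \sin x + x \cos x\bigr)}{x^6 \sin^2 x} = -\frac{3 \sin x + x \cos x}{x^4 \sin^2 x}$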
Notice how, in (iii), we do not try to solve everything in one line. Instead, the first
step was essentially to recall the reciprocal rule. Indeed, having a bit of patience when
computing derivatives often helps us avoid mistakes.
We include one more example on the product rule to illustrate the importance of
patience when using the product rule:
Example 8.12 Applying the product rule twice, we can differentiate the product of
three functions:
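$$\frac{d}{dx}\Bigl(\ln x \cdot \sin x \cdot \arctan x\Bigr) = \frac{d}{dx}\Bigl(\underbrace{\ln x}_{f} \cdot \underbrace{\sin x \cdot \arctan x}_{g}\Bigr)$$
$$= \Bigl(\frac{d}{dx} \ln x\Bigr) \cdot \sin x \cdot \arctan x + \ln x \cdot \frac{d}{dx}\Bigl(\sin x \cdot \arctan x\Bigr)$$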
Remark: You can avoid the use of the chain rule in this exercise.
Exercise 8.14 Show that for constants $a, b, c, d$, with $c$ and $d$ not both zero,
$$\frac{d}{dx}\Bigl(\frac{ax + b}{cx + d}\Bigr) = \frac{ad - bc}{(cx + d)^2}.$$
Exercise 8.15 (a) Use the product rule to prove by induction that
$$\frac{d}{dx} x^n = n x^{n-1}, \quad \forall n \in \{1, 2, 3, \ldots\}.$$
(b) Combine the formula from (a) with the reciprocal rule to prove that
$$\frac{d}{dx} x^n = n x^{n-1}, \quad \forall n \in \{-1, -2, -3, \ldots\}.$$
We now move on to rule (v), namely, the chain rule.
Here, we use the chain rule as formulated in Proposition 8.9 with $f(x) = \sin x$ and
$g(x) = x^3$. In particular, since $f'(x) = \cos x$, this means that $f'\bigl(g(x)\bigr) = \cos(x^3)$.
Remark: At first, the chain rule can be confusing. If you are struggling with this
exercise, continue reading, and then try again after taking a look at Example 8.20.
The chain rule is usually the computational rule for the derivative that requires the
most effort to master. The main reason is probably that the notation is sort of bad.
Indeed, notice that in our formulation of the chain rule,
$$\frac{d}{dx} f\bigl(g(x)\bigr) \neq f'\bigl(g(x)\bigr). \tag{8.1}$$
So what is going on? Well, in the expression to the left, we are trying to say that one
should first compose $f$ and $g$, to get $f(g(x)) = \sin(x^3)$, and then take the derivative of
this composition. In the expression to the right, on the other hand, we mean to say that
you should first take the derivative of f (x) = sin(x), and afterwards compose the result
with g(x).
This difference is really not at all clear from how we write these expressions. So,
to make the chain rule easier to understand, it is common to introduce different letters
for the variables and write f (u) for the outer function and g(x) for the inner function.
With the Leibniz notation for the derivative (recall Remark 5.24), we can now write the
right-most expression in (8.1) as follows:
$$\frac{d}{du} f(u) \quad \text{or} \quad \frac{d}{du} f(u) \Big|_{u = g(x)}.$$
These two expressions mean the same thing. However, in the right-most variant, we
make the extra effort of reminding the reader that only after taking the derivative, do
we put u = g(x). The chain rule can now be expressed as
$$\frac{d}{dx} f\bigl(g(x)\bigr) = \frac{d}{dx} f(u) = \underbrace{\frac{df}{du}}_{\text{outer der.}} \cdot \underbrace{\frac{du}{dx}}_{\text{inner der.}}.$$
Example 8.16 (continued) In the case of $\sin(x^3)$, $f(u) = \sin u$ is the outer function
and $g(x) = x^3$ is the inner function. This means that $f'(u) = \cos u$ is the outer derivative
and $g'(x) = 3x^2$ is the inner derivative. By the chain rule, we get
$$\frac{d}{dx} \sin(x^3) \overset{u = x^3}{=} \frac{d}{dx} \sin(u) = \cos u \cdot \frac{du}{dx} = \cos(x^3) \cdot \frac{d}{dx} x^3 = \cos(x^3) \cdot 3x^2.$$
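A result like this can be sanity-checked numerically by comparing it with a difference quotient. The sketch below uses a symmetric difference quotient; the evaluation point 1.2 and the step h are arbitrary choices made for illustration.

```python
import math

def numerical_derivative(F, x, h=1e-6):
    # Symmetric difference quotient approximating F'(x).
    return (F(x + h) - F(x - h)) / (2 * h)

F = lambda x: math.sin(x**3)                      # the composition f(g(x))
chain_rule = lambda x: math.cos(x**3) * 3 * x**2  # outer der. times inner der.

x = 1.2
print(numerical_derivative(F, x), chain_rule(x))
```

If the inner derivative $3x^2$ is forgotten, the two printed numbers no longer agree, which makes this a handy way of catching chain rule mistakes.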
Exercise 8.18 Compute the derivatives in exercise 8.17 using this notation.
Exercise 8.19 Check the definition of the indefinite integral in Appendix A, and use
it to pair the following integrals with the suitable expression. Note that to solve this
exercise, you only need to be able to compute derivatives. (Why? Also, note that you
do not even need to know what the derivative of arctan x is.)
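(a) $\int \frac{dx}{4 + x^2}$    (i) $\frac{1}{2} \ln(x^2 + 4) + C$
(b) $\int \frac{dx}{4 + x}$    (ii) $\frac{1}{4}\bigl(\ln(2 + x) - \ln(2 - x)\bigr) + C$
(c) $\int \frac{dx}{(4 + x)^2}$    (iii) $\ln(4 + x) + C$
(d) $\int \frac{x\,dx}{4 + x^2}$    (iv) $-\frac{1}{4 + x} + C$
(e) $\int \frac{dx}{4 - x^2}$    (v) $\frac{1}{2} \arctan \frac{x}{2} + C$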
Let us consider one more example where we illustrate the use of the chain rule.
Exercise 8.22 Compute the derivatives of the following functions. Note that they
all have something in common. In particular, after having done this exercise, think
about what this means for their graphs, and plot them to see if you are correct.
(a) $f(x) = \arctan \frac{1}{x} + \arctan x$
(b) $f(x) = \arcsin \frac{x}{\sqrt{1 + x^2}} - \arctan x$
(c) $f(x) = 2 \arctan\bigl(x - \sqrt{x^2 - 1}\bigr) + \arctan \sqrt{x^2 - 1}$
Hint: These functions – and how to compute their derivatives – have all appeared on
recent exams. You can find these exams, with full solutions, on the course website.
8.2. PROOF OF THE COMPUTATIONAL RULES FOR THE DERIVATIVE 237
Example 8.23 (Proof of the sum rule) We use the computational rules for the limit
to show that
$$\bigl(f(x) + g(x)\bigr)' = f'(x) + g'(x).$$
Notice that we could use the summation rule for the limit since we knew that both the
limits $f'(x)$ and $g'(x)$ exist.
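The computation belonging to this example is not reproduced in this copy of the notes. A reconstruction that is consistent with the definition of the derivative and the remark on the summation rule could read as follows (a sketch, not necessarily the author's exact wording):

```latex
\begin{align*}
\bigl(f(x)+g(x)\bigr)'
&= \lim_{h\to 0}\frac{\bigl(f(x+h)+g(x+h)\bigr)-\bigl(f(x)+g(x)\bigr)}{h}\\
&= \lim_{h\to 0}\left(\frac{f(x+h)-f(x)}{h}+\frac{g(x+h)-g(x)}{h}\right)\\
&= \lim_{h\to 0}\frac{f(x+h)-f(x)}{h}+\lim_{h\to 0}\frac{g(x+h)-g(x)}{h}
 = f'(x)+g'(x).
\end{align*}
```

The summation rule for the limit is used in passing from the second line to the third.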
Exercise 8.24 Show that if f has a derivative at x, then it is also continuous there.
What part of the diagram in Figure 1 does this justify?
Hint: Recall the formula from exercise 8.3, and find a chain of equalities showing that
$$\lim_{u \to x} \bigl(f(u) - f(x)\bigr) = \cdots = 0.$$
$$\frac{d}{dx}\bigl(f(x) \cdot g(x)\bigr) = \lim_{h \to 0} \frac{f(x+h)g(x+h) - f(x)g(x)}{h} = \cdots = f'(x)g(x) + f(x)g'(x).$$
Exercise 8.27 (a) Use the definition of the derivative to figure out a formula for
$$\frac{d}{dx} \frac{1}{g(x)}.$$
(b) Use the formula found in (a) and the product rule for derivatives to derive a
formula for
$$\frac{d}{dx} \frac{f(x)}{g(x)}.$$
Finally, we turn to the chain rule. It is – by far – the most difficult of the computational
rules to prove. To prepare us for the proof, we give a simple, but unfortunately false,
argument that helps us understand what is going on (curiously, this "proof" may be found
in numerous high school textbooks).
Fig. 4. Fake proof ahead.
Example 8.28 (Fake "proof" of chain rule) We wish to compute the limit
To end the proof, we need to compute the limit labeled by $(\ast)$. To do this, we make a
change of variables. That is, we set $k = g(x+h) - g(x)$. Note that as $h \to 0$, then $k \to 0$
(this is true since differentiable functions are automatically continuous). This allows us
to write
$$\lim_{h \to 0} \frac{f\bigl(g(x+h)\bigr) - f\bigl(g(x)\bigr)}{g(x+h) - g(x)} = \lim_{k \to 0} \frac{f\bigl(g(x) + k\bigr) - f\bigl(g(x)\bigr)}{k} = f'\bigl(g(x)\bigr).$$
Notice that the last expression here means the derivative of the function f evaluated at
the point g(x).
Exercise 8.29 (a) What is the problem with the above proof? (A correct proof is
supplied on the following page.)
(b) This "fake proof" can be used to come up with a correct proof for the differenti-
ation formula for inverse functions from Proposition 8.10.
Hint: In (b), use the "fake proof" to differentiate $f^{-1}(f(x))$.
Now, the problem with the fake proof of the chain rule occurs when we multiply by one.
Indeed, the expression
$$\frac{g(x+h) - g(x)}{g(x+h) - g(x)}$$
may be of the form 0/0 an infinite number of times as $h \to 0$. That is, we need to find
an alternative approach that avoids division by $g(x+h) - g(x)$.
For this reason, let us now consider the definition of $f'$, which we write up as follows:
$$f'(u) = \lim_{k \to 0} \frac{f(u+k) - f(u)}{k}. \tag{8.2}$$
As in the fake proof, we want to put $k = g(x+h) - g(x)$. However, the problem we
mentioned above then becomes precisely that $k$ may be zero for various values of $h$, and
we may therefore not divide by it. But here is the crucial step: before we make the
connection between $k$ and $g(x+h) - g(x)$, we define
$$E(k) = \begin{cases} f'(u) - \dfrac{f(u+k) - f(u)}{k}, & k \neq 0, \\ 0, & k = 0, \end{cases} \tag{8.3}$$
where we keep in mind that we consider $u$ as being fixed and $k$ as the variable. Here, we
also notice that since (8.2) holds, it follows that
$$\lim_{k \to 0} E(k) = 0.$$
That is, when defined in this way, the function $E(k)$ is continuous at the origin. Why
did we do all of this? Well, by multiplying up $k$, and rearranging (8.3), we can now write
$$f(u+k) - f(u) = \bigl(f'(u) - E(k)\bigr)\,k.$$
Since this expression is fine for $k = 0$, it is now safe to put $u = g(x)$ and make the
connection $k = g(x+h) - g(x)$, which, in particular, means that $k \to 0$ as $h \to 0$ and
that we can write
$$f\bigl(g(x) + k\bigr) - f\bigl(g(x)\bigr) = \bigl(f'(g(x)) - E(k)\bigr)\bigl(g(x+h) - g(x)\bigr).$$
Here, we did not write out the first $k$ since this will make the computation that follows
below slightly easier to read. Also, notice that we can rewrite $k = g(x+h) - g(x)$ as
$g(x+h) = g(x) + k$.
Finally, we have gathered all the necessary pieces needed to make the following
computation:
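The computation itself is not reproduced in this copy of the notes. A reconstruction that uses only (8.2), (8.3) and the identities derived above, with $k = g(x+h) - g(x)$, could read as follows (a sketch, not necessarily the author's exact presentation):

```latex
\begin{align*}
\frac{d}{dx} f\bigl(g(x)\bigr)
&= \lim_{h\to 0}\frac{f\bigl(g(x+h)\bigr)-f\bigl(g(x)\bigr)}{h}
 = \lim_{h\to 0}\frac{f\bigl(g(x)+k\bigr)-f\bigl(g(x)\bigr)}{h}\\
&= \lim_{h\to 0}\Bigl(f'\bigl(g(x)\bigr)-E(k)\Bigr)\cdot\frac{g(x+h)-g(x)}{h}\\
&= \Bigl(f'\bigl(g(x)\bigr)-0\Bigr)\cdot g'(x)
 = f'\bigl(g(x)\bigr)\cdot g'(x).
\end{align*}
```

In the last step we use that $k \to 0$ as $h \to 0$, so that $E(k) \to 0$ by the continuity of $E$ at the origin, together with the product rule for limits.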
Remark 8.30 While the above proof seems to be more complicated than the "fake"
proof, the general idea is basically the same. However, here, things get more complicated
as we need to do some extra bookkeeping to make sure that nothing bad happens if k
happens to be zero for h arbitrarily close to 0 (but not equal to 0).
Exercise 8.31 (Discussion) Is the "fake" proof really that bad? Can you think of
any conditions under which it will actually work, and do the functions we normally
consider in these lecture notes satisfy such conditions?
Exercise 8.32 Use the chain rule to prove the differentiation formula for invertible
functions from Proposition 8.10. Here, you may assume that both $f$ and $f^{-1}$ are
differentiable.
Remark: You should compare this exercise to exercise 8.29.
Remark 8.33 In the YouTube film linked in the margin, another proof of the
formula for the derivative of an inverse function is given.
Example 8.34
Let f (x) = arcsin(x) and g(x) = sin(x). Surely,
these functions are differentiable. However, this
is not the case for the composition $f \circ g(x) = \arcsin(\sin x)$. As we see in the figure to the right,
there are plenty of pointy edges! The point is
that arcsin(x) is not differentiable at the end-
points of its domain, and this causes trouble
when x is such that sin x = ±1.
Fig. 5. Look! Pointy edges!
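The pointy edges can also be detected numerically by comparing one-sided difference quotients at a point where $\sin x = 1$. The following sketch does this at $x = \pi/2$; the step h is an arbitrary small number chosen for illustration.

```python
import math

def F(x):
    # The composition arcsin(sin x) from the example.
    return math.asin(math.sin(x))

x0, h = math.pi / 2, 1e-4
left_slope = (F(x0) - F(x0 - h)) / h
right_slope = (F(x0 + h) - F(x0)) / h

print(left_slope, right_slope)
```

The slope is close to 1 on the left and close to -1 on the right, so the difference quotient has no two-sided limit at this point.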
But all is not lost. The following proposition is analogous to Proposition 6.52 (notice
that it contains a little bit of "fine print").
Proposition 8.35 Suppose that $f$ and $g$ are differentiable. Then the same is true for
the functions $f \pm g$, $f \cdot g$ and $f/g$. Moreover, if $g$ is differentiable at a point $x$ and $f$ is
differentiable at the point $g(x)$, then $f \circ g$ is also differentiable at $x$.
Proposition 8.38
(i) $\frac{d}{dx} \log x = \frac{1}{x}$, $x > 0$
(ii) $\frac{d}{dx} \sin x = \cos x$
(iii) $\frac{d}{dx} \cos x = -\sin x$
As it happens, we have already done the hard work in proving this proposition in
the previous chapter (see page 198). In the following exercises, the point is to help you
realise this.
Exercise 8.39 We now ask you to prove part (i) of the above proposition.
(a) Write out the definition of the derivative of the logarithm at x = 1 and verify
that we have already proved that formula (i) holds at this point.
(b) Use the logarithmic laws, and the change of variables rule for the limit, to extend
the differentiation formula to all other x.
Hint: The solution to (a) should reveal where to look for inspiration for (b).
Exercise 8.40 In this exercise, we ask you to prove parts (ii) and (iii) of the above
proposition.
(a) Write out the definition of the derivative of the sine and cosine at x = 0 and
verify that we have already proved that formulas (ii) and (iii) hold there.
(b) Use suitable trigonometric formulas, and the change of variables rule for the limit,
to extend the differentiation formulas to all other x.
Hint: The solution to (a) should reveal where to look for inspiration for (b).
Now, an interesting observation is that while the logarithm has domain $x > 0$, its
derivative $y = 1/x$ has the much larger domain $x \neq 0$. This leads us to ponder if we can
somehow extend the logarithm to negative $x$ in such a way that the derivative of this
extension is equal to $1/x$ there. This is the point of the following exercise:
Exercise 8.41 (a) Use the chain rule for the derivative to prove that for $x < 0$ we
have
\[ \frac{d}{dx} \log(-x) = \frac{1}{x}. \]
(b) Use what you learned in (a) to write a formula for a function $f(x)$ with domain
$\mathbb{R} \setminus \{0\}$ such that $f'(x) = 1/x$ there.
\[ y' = \frac{d}{dx} \sqrt{1 - x^2} = \frac{-x}{\sqrt{1 - x^2}} \implies y'(1/2) = \frac{-1/2}{\sqrt{1 - 1/4}} = -1/\sqrt{3}. \]
Fig. 7. The function $y = +\sqrt{1 - x^2}$ describes the upper semi-circle and $y =$ …
The point is that we can solve our problem without ever needing to know the formula
for f (x).
The first step is to put $y = f(x)$ into the equation for the circle:
\[ x^2 + f(x)^2 = 1. \]
Since the left-hand side is identical to the right-hand side for all $x$, the derivative of the
left-hand side has to be equal to the derivative of the right-hand side. We obtain:
\[ \frac{d}{dx}\Bigl( x^2 + f(x)^2 \Bigr) = \frac{d}{dx}\Bigl( 1 \Bigr) \implies 2x + 2f(x)f'(x) = 0. \]
Here, $f'(x)$ appears since it is the inner derivative when we use the chain rule to get
$(f(x)^2)' = 2f(x)f'(x)$. Rewriting the above expression, we get
\[ f'(x) = -\frac{x}{f(x)} \iff y' = -\frac{x}{y}. \]
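As a quick sanity check of the formula $y' = -x/y$, one can compare it numerically with a difference quotient on the upper semi-circle. The following is only a minimal sketch: the function names and the step size `h` are our own illustrative choices and not part of the notes.

```python
import math

def upper_semicircle(x):
    """y = +sqrt(1 - x^2), the upper half of the unit circle."""
    return math.sqrt(1.0 - x * x)

def implicit_slope(x, y):
    """Slope obtained by implicit differentiation of x^2 + y^2 = 1."""
    return -x / y

def central_difference(f, x, h=1e-6):
    """Symmetric difference quotient approximating f'(x) (h is an assumed step)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

x0 = 0.5
slope_implicit = implicit_slope(x0, upper_semicircle(x0))
slope_numeric = central_difference(upper_semicircle, x0)

# Both numbers should be close to -1/sqrt(3).
print(slope_implicit, slope_numeric, -1.0 / math.sqrt(3.0))
```

The point of the comparison is that the implicit formula never uses the explicit expression for $f(x)$ beyond evaluating $y$ at the point of interest.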
x4 y4 x2 + y 2 = 0.
Example 8.45 We now use implicit differentiation to compute the derivative of the
function $y = \arcsin(x)$ by using the fact that it is the inverse function of the sine. That
is, for $x \in D_{\arcsin}$ we have
\[ y = \arcsin x \iff \sin y = x, \]
Remark 8.46 Observe that when using implicit differentiation to study the derivatives
of inverse functions as we do above, we have no need for the implicit function
theorem. Indeed, since the function is differentiable, we know that we can consider both
$y$ as a function of $x$ and $x$ as a function of $y$ (if needed).
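Proposition 8.52
(i) $\dfrac{d}{dx} \exp(x) = \exp(x)$
(ii) $\dfrac{d}{dx} \exp(ix) = i \exp(ix)$
(iii) $\dfrac{d}{dx} x^{\alpha} = \alpha x^{\alpha - 1}, \quad \alpha \in \mathbb{R}, \; x > 0$
(iv) $\dfrac{d}{dx} \arcsin x = \dfrac{1}{\sqrt{1 - x^2}}, \quad x \in (-1,1)$,
(v) $\dfrac{d}{dx} \arccos x = -\dfrac{1}{\sqrt{1 - x^2}}, \quad x \in (-1,1)$,
(vi) $\dfrac{d}{dx} \arctan x = \dfrac{1}{1 + x^2}, \quad x \in \mathbb{R}$.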
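A proposition like this can be spot-checked numerically before it is proved. The sketch below compares some of the formulas (parts (i), (iv), (v) and (vi)) with a symmetric difference quotient at one sample point. The sample point `x0 = 0.3`, the step `h` and all names are assumptions made for illustration only, and of course such a check proves nothing.

```python
import math

def central_difference(f, x, h=1e-6):
    """Symmetric difference quotient approximating f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

# (name, function, claimed derivative) for parts (i), (iv), (v), (vi).
checks = [
    ("exp", math.exp, math.exp),
    ("arcsin", math.asin, lambda x: 1.0 / math.sqrt(1.0 - x * x)),
    ("arccos", math.acos, lambda x: -1.0 / math.sqrt(1.0 - x * x)),
    ("arctan", math.atan, lambda x: 1.0 / (1.0 + x * x)),
]

x0 = 0.3  # an arbitrary point inside (-1, 1)
errors = {
    name: abs(central_difference(f, x0) - claimed(x0))
    for name, f, claimed in checks
}

for name, err in errors.items():
    print(f"{name:7s} |difference quotient - formula| = {err:.2e}")
```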
Exercise 8.53 (Exam 2015-05-27, part of 5) Make a table of signs for the deriva-
tive of the function
x 1
f (x) = ln x , x 1.
x+1
Exercise 8.54 (Exam 2014-08-18, part of 2) Make a table of signs for the derivative
of the function
\[ f(x) = 2\arctan x + \frac{1}{x}, \quad x \neq 0. \]
Exercise 8.55 (Exam 2014-05-26, part of 3) Make a table of signs for the deriva-
tive, and the second derivative, of the function
1/x
f (x) = |x|e , x 6= 0.
Exercise 8.56 (Exam 2012-12-19, part of 4) Make a table of signs for the deriva-
tive of the function
Exercise 8.57 (Exam 2012-05-28, part of 1) Make a table of signs for the deriva-
tive of the function p
2
f (x) = e x /2 x2 + 1, x 2 R.
Exercise 8.58 (Exam 2012-05-28, part of 3) Make a table of signs for the deriva-
tive of the function
x 1
f (x) = ln(1 + e ) , x 2 R.
ex +1
8.5. ANSWERS TO SELECTED EXERCISES
8.6 No.
8.13 (a) (2x arctan(x) 1)/ arctan(x)2 , (b) 1/(1 + cos(x)), (c) 2(ln(x) 1)/(ln x)2 .
2
8.17 (a) cos xesin x , (b) 2x/(1 + x2 ), (c) 1/(1 + x2 ), (d) (1 + 2x2 )ex
8.19 (a) - (v), (b) - (iii), (c) - (iv), (d) - (i), (e) - (ii).
8.24 Here is an additional hint: multiply the expression in the original hint by one in
such a way that you can take advantage of the fact that the limit
\[ \lim_{u \to x} \frac{f(u) - f(x)}{u - x} \]
exists and is equal to some finite number.
8.25 It says that there are functions that are continuous but not differentiable. That is,
that the two areas in the Venn diagram do not coincide.
8.27 (a)
\[ \lim_{h \to 0} \frac{\frac{1}{g(x+h)} - \frac{1}{g(x)}}{h} = \lim_{h \to 0} \frac{g(x) - g(x+h)}{h\,g(x)\,g(x+h)} = -\frac{1}{g(x)} \lim_{h \to 0} \frac{1}{g(x+h)} \cdot \frac{g(x+h) - g(x)}{h} = -\frac{g'(x)}{g(x)^2}. \]
8.48 Let $y = e^x$, then $\log y = x$. Now differentiate implicitly on both sides, and then
substitute back.
8.51 Let $y = f^{-1}(x)$, then $f(y) = x$. Now differentiate implicitly on both sides, and
then substitute back.
Chapter 9
Introduction
The main focus of the chapter is the Mean Value Theorem. Basically the first half of
the chapter is about understanding its statement and exploring how it has been playing
a part in our lives since high-school mathematics. The second half of the chapter is
about proving the Mean Value Theorem. Along the way, we will meet the Intermediate
Value Theorem, the Min-Max Theorem and, perhaps the most important of them all,
the Bolzano-Weierstrass theorem.
Remark 9.1 (Selected problems from previous exams based on this chapter)
252 CHAPTER 9. MORE ON FUNCTIONS AND LIMITS
Theorem 9.2 (The Mean Value Theorem for Derivatives) Suppose that a function
$f$ is continuous on $[a,b]$ and differentiable on $(a,b)$. Then there exists (at least) a
point $c \in (a,b)$ such that
\[ f'(c) = \frac{f(b) - f(a)}{b - a}. \]
Let us formulate the Granny example in mathematical language: Suppose that the
function $f(x)$ describes the position of Granny at time $x$, that she starts at time $x = a$,
and that she stops at time $x = b$. Her average velocity is then expressed as the quotient
\[ \frac{f(b) - f(a)}{b - a}. \]
Now, according to our intuitive reasoning, at some point in time she must actually
have driven at the average velocity. In other words, there has to be some time
$x = c$, with $a < c < b$, so that her actual velocity, given by $f'(c)$, was equal to her
average velocity. That is, there has to exist a $c \in (a,b)$ so that
\[ f'(c) = \frac{f(b) - f(a)}{b - a}. \]
This is exactly the conclusion of the Mean Value Theorem!
Fig. 3. Busted!
Let us now consider the hypotheses of the Mean Value Theorem. Why do we need to
assume that f is continuous on [a,b] and differentiable on (a,b)? Well, suppose we allow
f to have discontinuities. This would mean that Granny can teleport! And if Granny
has a teleportation device in her living room, this means that she can easily get from
Lund to Malmö in 5 minutes without ever breaking any laws (of traffic, at least).
Fig. 4. Left: If Granny can travel in a discontinuous manner, then she can teleport.
Right: If Granny can travel without being differentiable, she will break her neck.
And what about being differentiable? Well, if f (t) is not differentiable, then the slope
may change instantaneously. That is, Granny’s velocity can go from 90 km/h to 300
km/h and back to 90 km/h without ever passing the average velocity. While this is bad
news for anyone who wants to avoid whiplash, it also means that we cannot expect the
Mean Value Theorem to hold if the function is not differentiable.
We consider the proof of the Mean Value Theorem in Section 9.4.
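To make the statement of the Mean Value Theorem concrete, one can look for such a point $c$ numerically. The sketch below does this for the cubic $f(x) = x^3 - 3x + 1$ on $[-2,3]$. The choice of function, the helper name `find_mean_value_point` and the search bracket $[0,3]$ are our own assumptions for illustration. The theorem only promises that some $c$ exists and says nothing about how to find it.

```python
def f(x):
    return x**3 - 3 * x + 1

def f_prime(x):
    return 3 * x**2 - 3

def find_mean_value_point(a, b, lo, hi, tol=1e-12):
    """Bisection on g(c) = f'(c) - (average slope on [a, b]).

    Assumes that g changes sign on the bracket [lo, hi], which must lie in (a, b]."""
    slope = (f(b) - f(a)) / (b - a)

    def g(c):
        return f_prime(c) - slope

    if g(lo) * g(hi) > 0:
        raise ValueError("g does not change sign on the given bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(lo) * g(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

a, b = -2.0, 3.0
average_slope = (f(b) - f(a)) / (b - a)
c = find_mean_value_point(a, b, 0.0, b)
print(average_slope, c, f_prime(c))
```

Here the average slope is $4$, so the equation $3c^2 - 3 = 4$ can also be solved by hand, which gives $c = \pm\sqrt{7/3}$. Both values lie in $(-2,3)$, which illustrates the words "at least" in the theorem.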
Proof. We prove part (ii), and leave parts (i) and (iii) as an exercise.
So, we assume that $f'(x) = 0$ on $I$, and want to show that $f(x)$ is constant on $I$.
To do this, we choose two points $x_1, x_2 \in I$ with $x_1 < x_2$. By the Mean Value
Theorem, there exists a point $c \in (x_1, x_2)$ so that
\[ \frac{f(x_2) - f(x_1)}{x_2 - x_1} = f'(c). \]
Since $f'(c) = 0$, and the denominator is non-zero ($x_1 \neq x_2$), it follows that the numerator
has to be zero. That is,
\[ f(x_2) = f(x_1). \]
Since $x_1$ and $x_2$ were arbitrary, this means that $f$ takes the same value at all points
in $I$ and is therefore constant.
Corollary 9.3 may not seem like much, but this is far from true. The following exercise
is meant to give a hint of its power.
Exercise 9.6 Suppose you only know the differentiation formulas $(\sin x)' = \cos x$,
$(\cos x)' = -\sin x$ and that $\sin 0 = 0$ and $\cos 0 = 1$. Use this and Corollary 9.3 to prove
that
\[ \sin^2 x + \cos^2 x = 1, \quad \forall x \in \mathbb{R}. \]
Theorem 9.8 (The Min-Max Theorem) Suppose that $f$ is continuous on the finite
and closed interval $[a,b]$. Then there exist (at least) two points $\ell, u \in [a,b]$ so that for all
$x \in [a,b]$ we have
\[ f(\ell) \le f(x) \le f(u). \]
The Min-Max Theorem is closely related to the Intermediate Value Theorem, and is
difficult to prove for the same reason. Namely, could it happen that the max and min
points happen at "holes" in the real line? In order to build a little more mathematical
"muscle" before tackling the proofs of these theorems, we postpone further study until
Section 9.5 (we will not need these results until that section anyway).
Exercise 9.9 Give examples that show how (a) the Intermediate Value Theorem
and (b) the Min-Max Theorem both fail if we remove the condition that f is continuous
on a closed and finite interval.
Exercise 9.10 Show that the following statement is a consequence of the Intermediate
Value Theorem. "Suppose that $f$ is continuous on $[a,b]$ and $s \in \mathbb{R}$, and that one
of the following conditions holds:
Exercise 9.11 Use the previous exercise and the Min-Max Theorem to prove that if
the domain of a continuous function f is a closed interval, then the range of f is also
an interval.
Remark: This is also true if f is continuous but defined on an interval that is not
closed. As an extra challenge, feel free to try to extend your proof to also cover this
situation.
9.2. CONSEQUENCES OF THE MEAN VALUE THEOREM 257
Example 9.12 For $x$ in $[-2,3]$, we determine the largest and smallest values of
\[ f(x) = x^3 - 3x + 1. \]
Following what we (should have) learned in high school, we compute the derivative
$f'(x) = 3x^2 - 3 = 3(x^2 - 1)$ and observe that $f'(x) = 0$ holds exactly when $x = \pm 1$.
To efficiently use the information contained in f 0 (x) to sketch the function, one can
make a table of signs for the derivative, and then apply Corollary 9.3 as follows:
Fig. 8. In the first three rows, we figure out the sign of f 0 (x). In the last row, we
use Corollary 9.3 to translate this into how f (x) behaves.
Based on the above table, we see that $f$ has its smallest value at either $x = -2$ or $x = 1$,
and its largest value at either $x = -1$ or $x = 3$. To better understand what is going on,
we compute the values of $f$ at these points:
\[ f(-2) = -1, \quad f(-1) = 3, \quad f(1) = -1, \quad f(3) = 19. \]
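The arithmetic in this last step is easy to get wrong by a sign, so here is a minimal sketch that evaluates $f$ at the four candidate points and picks out the smallest and largest values. The variable names are our own. The list of candidates (the endpoints and the zeros of $f'$) is taken from the example.

```python
def f(x):
    return x**3 - 3 * x + 1

# Endpoints of [-2, 3] together with the zeros of f'(x) = 3(x^2 - 1).
candidates = [-2, -1, 1, 3]
values = {x: f(x) for x in candidates}

smallest = min(values.values())
largest = max(values.values())
min_points = [x for x in candidates if values[x] == smallest]
max_points = [x for x in candidates if values[x] == largest]

print(values)
print("smallest value", smallest, "at", min_points)
print("largest value", largest, "at", max_points)
```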
Let us now recall some vocabulary relevant to the above example. We call a point
$x = a$ a local maximum point of a function $f$ if $f(x) \le f(a)$ for all $x$ in a neighbourhood
of $x = a$. Local minimum points are defined similarly (at end-points we only require
this to hold in one-sided neighbourhoods). Collectively, these are called local extremal
points. The local extremal points at which the function takes its largest and smallest
values are called global extremal points. In the above example, $x = 3$ was a global
maximum point, while both $x = -2$ and $x = 1$ were global minima.
Points where $f'(x) = 0$ are called stationary points, while points $x \in D_f$ where $f'(x)$
does not exist are called singular points. We remark that the definition of singular
points varies between textbooks. (For instance, some textbooks do not require singular
points to be in the domain in order for them to be able to say that, e.g., $f(x) = 1/x$ has a
singular point at $x = 0$.)
Fig. 10. Restricted to $[-1,2)$, $f(x) = |x|$ has two extremal points. Do you see which these are?
Next, we mention a result that is due to Fermat, and which leads to an explanation of
where a function can have maximum and minimum points. While this result is sometimes
referred to as "Fermat’s theorem" it should not be confused with "Fermat’s last theorem",
which is a famous problem that took more than 300 years to solve (Simon Singh’s book on
Fermat’s last theorem was basically what got me into mathematics in the first place).
Fig. 11. Pierre de Fermat 1607 – 1665.
Note that all points of an interval that are not endpoints are called inner points.
Proof. Suppose that $x = c$ is a local maximum of $f$. By hypothesis $f'(c)$ exists, and so,
by definition, we have
\[ f'(c) = \lim_{x \to c} \frac{f(x) - f(c)}{x - c}. \tag{9.1} \]
Since $c$ is not an endpoint of the interval $I$, we can consider one-sided limits from each
side. Since we know that $f(x) - f(c) \le 0$ when $x$ is close to $c$, this allows us to
use knowledge of the sign of $(x - c)$ in the one-sided limits to deduce that
\[ \lim_{x \to c^-} \frac{f(x) - f(c)}{x - c} \ge 0 \quad \text{and} \quad \lim_{x \to c^+} \frac{f(x) - f(c)}{x - c} \le 0. \]
Since $f'(c)$ must be equal to both of these limits, it follows that $f'(c) = 0$. The proof in
the case of a local minimum is similar.
• x is a stationary point of f .
• x is an endpoint of [a,b].
Proof. Suppose that $x$ is an extremal point of $f$. We can split the proof into two cases:
(i) $x$ is an endpoint of $[a,b]$, (ii) $x$ is an inner point of $[a,b]$.
In the first case, we are done (since being an endpoint is one of the conclusions we aim for). So,
we deal with case (ii). But this we can split into two sub-cases: (ii-a) $f$ is differentiable
at $x$, (ii-b) $f$ is not differentiable at $x$. In the first of these sub-cases, we use Fermat’s
theorem to conclude that $x$ is a stationary point. In the second, we are immediately
done, since this means that $x$ is a singular point of $f$.
\[ f(x) = x^3 - 3x + 1. \]
Since $f$ has no singular points, this means that we are guaranteed that the global
maximum and minimum points of $f$ will be among the points in the following
list: $x = -2$, $x = 3$ (the endpoints) or $x = \pm 1$ (the stationary points of $f$).
By comparing the values of $f$ computed in the table of Example 9.12, we see that
$f(-2) = f(1) = -1$ is the global minimum and $f(3) = 19$ is the global maximum.
While the above approach is somewhat less complicated than making a table of signs,
it also gives less information. For instance, without making further arguments, we do not
know if x = 1 is a local maximum, local minimum or a terrace point of f – something
which would be immediately clear from a table of signs for f 0 .
Remark 9.16 (Warning) Please note that while elegant, the justification of the ap-
proach from Example 9.15 collapses completely if f is continuous and differentiable on
an open interval. For instance, if we take the same function, that is f (x) = x3 3x + 1,
but now consider it on ( 1,1), then none of the points considered above are global
maximum or minimum points for f (in fact, on this interval, f has no global extremal
points!).
Exercise 9.19 Determine all extreme points and asymptotes of the function
2
p
f (x) = e x /4 x2 + 1.
Classify the extreme points and determine whether they are global. Illustrate your
answer in a simple sketch of the function.
Exercise 9.20 Consider the function given by
8p
< 1 + x2 x>0
f (x) =
:
1 x2 x0
In the previous example, we used the table of signs to make crude sketches of the graph
of a function based on information on its derivative. Here, we take a look at how this
can be improved by taking the second derivative into account.
First, recall that the second derivative is the derivative of the derivative, the third
derivative is the derivative of the second derivative, and so on. That is,
\[ f''(x) \overset{\text{def}}{=} \frac{d}{dx} f'(x), \quad f'''(x) \overset{\text{def}}{=} \frac{d}{dx} f''(x), \quad \cdots. \]
For higher order derivatives, it becomes silly to keep writing $f'''$, $f''''$ and so on, so it is
practical to use the notation $f^{(3)}, f^{(4)}, \ldots$ instead.
Also, recall that if $y$ denotes position with respect to time, then $y'$ represents velocity
(change of position), $y''$ acceleration (change of velocity), and $y'''$ is called the “jerk”
(change of acceleration).
Fig. 13. The mascots of the Rice Krispies cereals are apparently called “Snap”, “Crackle”
and “Pop”. Which, apparently, is exactly what engineers call $y^{(4)}$, $y^{(5)}$ and $y^{(6)}$, respectively.
To understand, visually, what it means that $f''$ is positive or negative, we apply
Corollary 9.3 to $f'$ (instead of to $f$). This tells us that if $f''$ is positive, then $f'$ is
increasing, and if $f''$ is negative then $f'$ is decreasing. By definition, we say that a
function $f$ is convex on an interval $I$ if $f'$ is increasing on $I$, and concave if $f'$ is
decreasing on $I$ (see figure to the right).
Fig. 14. The happy graph has a growing derivative, and the sad graph has a decreasing derivative.
The point where a function changes from being convex to concave (or vice versa) is
called an inflection point. We require, as is usual, that for a point $x$ to be called an
inflection point for a function $f$, the function $f$ must be differentiable at $x$.
Now, let us make the following observation from Figure 15. If a stationary point is in an
interval where the graph is happy (convex), then it has to lie at the bottom of the smile,
and thus it must be a local minimum. This is a special case of the second-derivative test:
Fig. 15. A happy function with a stationary point.
Proposition 9.21 (The second derivative test) Suppose that f (x) is twice differ-
entiable on (a,b) and has a stationary point at c 2 (a,b). Then the following holds.
If f 00 (c) = 0, then anything can happen. (It could be an extreme point, or it could be
an inflection point.)
Proof. This proof is actually kind of fun and kind of similar to that of Fermat’s
theorem, above. Suppose we are in situation (i). Our first observation is that since $x = c$
is a stationary point for $f$ we have
\[ f''(c) = \lim_{x \to c} \frac{f'(x) - f'(c)}{x - c} = \lim_{x \to c} \frac{f'(x)}{x - c}. \]
In particular, taking the one-sided limit from the right, we get
\[ 0 > f''(c) = \lim_{x \to c^+} \frac{f'(x)}{x - c}. \]
Here, the denominator $x - c$ must be positive. But since the fraction $f'(x)/(x - c)$ is
supposed to give a negative limit, this means that $f'(x)$ has to become negative as
$x \to c^+$. For similar reasons, $f'(x)$ has to become positive as $x \to c^-$.
Fig. 16. In some (possibly very small) neighbourhood of $x = c$, we know the sign of $f'(x)$.
This means that we know the behaviour of $f(x)$.
Let us now consider a first example.
Example 9.22 (use of the second derivative test) Let us find the extremal points
of $f(x) = x^3 - 3x + 1$ without making a table of signs. We can do this by considering
\[ f'(x) = 3x^2 - 3 \]
and
\[ f''(x) = 6x. \]
We notice that this function is continuous and differentiable on all of $\mathbb{R}$. This means
that the only possible extremal points are the stationary points $x = \pm 1$. Since
Here is a second, more complicated example, where we use the second derivative to
provide a relatively detailed sketch of a function.
Example 9.23
the formula for $f(x)$, we see directly that it has a vertical asymptote at $x = 0$. The
horizontal asymptotes follow from the computation:
\[ \lim_{x \to \pm\infty} \Bigl( 2\arctan x + \frac{1}{x} \Bigr) = 2 \cdot \Bigl( \pm\frac{\pi}{2} \Bigr) + 0 = \pm\pi. \]
Taking into account the information from the table of signs, and the fact that
$\sqrt{\sqrt{2} + 1} \approx \sqrt{1.4 + 1} = \sqrt{2.4} > 1$, we arrive at the following sketch.
Fig. 20. Our sketch for f . Notice how, in certain respects, it is more informative
than the computer generated image (Figure 17).
Exercise 9.24 Sketch the graph of the function f (x) = xe 1/x . In particular, deter-
mine its domain, all asymptotes (vertical, horisontal and skew), all extremal points
(local and global) and where it is convex and concave.
Exercise 9.25 Repeat the previous exercise for the function $f(x) = (x^2 - 1)^{2/3}$.
Hint: The first and second derivatives can be expressed quite nicely; however, the
computation is messy if you are not careful.
By Corollary 9.3, this means that $f(x)$ is constant. If we put $x = 1$, we see that
\[ f(1) = \ln\frac{1}{1} + \ln 1 = 0 + 0 = 0. \]
That is, $f(x)$ is constantly equal to 0, and we are done.
Exercise 9.28 As in the above example, use only that $(\ln x)' = 1/x$ and $\ln 1 = 0$ to
prove that for all $a > 0$, we have
\[ \ln(ax) = \ln a + \ln x, \quad x > 0. \]
Exercise 9.33 (Exam 95-05-31) Show that ln(1 + 4x) > arctan 3x for all x > 0.
\[ f'(x) = x\,2^{x+1} \cdot \frac{(2 - x\ln 2)}{(x^2 + 2^x)^2}. \]
Fig. 23. Table of signs for $f'$.
Since $f$ is differentiable on $\mathbb{R}$, we see from the table of signs that, by the above exercise,
$f$ has at most one zero on each interval $(-\infty, 0]$, $(0, 2/\ln 2)$ and $[2/\ln 2, +\infty)$. That is,
at most 3 zeroes on $\mathbb{R}$. To investigate further, we check that
\[ \lim_{x \to \infty} f(x) = \lim_{x \to \infty} \frac{x^2 - 2^x}{x^2 + 2^x} = -1 < 0, \]
\[ f(0) = -1 < 0 \quad \text{and} \quad f(2/\ln 2) = \frac{\frac{4}{(\ln 2)^2} - 2^{2/\ln 2}}{\frac{4}{(\ln 2)^2} + 2^{2/\ln 2}} > 0, \]
\[ \lim_{x \to -\infty} f(x) = \lim_{x \to -\infty} \frac{x^2 - 2^x}{x^2 + 2^x} = 1 > 0. \]
In particular, this means f changes sign on each of the three intervals mentioned above,
and therefore, by the Intermediate Value Theorem applied to the continuous function
f three times, we find that f has to have at least three zeroes on R. We can therefore
conclude that f has exactly three zeroes on R.
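The argument only tells us how many zeroes there are, not where they are. As an illustration, the sketch below locates one zero in each of the three intervals by bisection, using $f(x) = (x^2 - 2^x)/(x^2 + 2^x)$ as it appears in the limits above. The finite brackets $[-10, 0]$ and $[2/\ln 2, 10]$ that stand in for the unbounded intervals, and the helper name `bisect_zero`, are assumptions made for this sketch.

```python
import math

def f(x):
    return (x**2 - 2.0**x) / (x**2 + 2.0**x)

def bisect_zero(g, lo, hi, tol=1e-12):
    """Find a zero of g in [lo, hi], assuming g(lo) and g(hi) have opposite signs."""
    g_lo = g(lo)
    if g_lo * g(hi) > 0:
        raise ValueError("g does not change sign on the given bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if g_mid == 0:
            return mid
        if g_lo * g_mid < 0:
            hi = mid
        else:
            lo, g_lo = mid, g_mid
    return 0.5 * (lo + hi)

turning_point = 2.0 / math.log(2.0)
brackets = [(-10.0, 0.0), (0.0, turning_point), (turning_point, 10.0)]
zeros = [bisect_zero(f, lo, hi) for lo, hi in brackets]
print(zeros)
```

Since $f(x) = 0$ exactly when $x^2 = 2^x$, the two positive zeroes are $x = 2$ and $x = 4$, which gives a simple way to judge the output.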
Exercise 9.36 Let $f(x) = \cos x$ and $g(x) = x$. Prove that $f(x)$ and $g(x)$ intersect
each other on the interval $[0, \pi/2]$ (a) at least once, and (b) at most once.
Exercise 9.37 (Exam 2001-10-31) Determine the number of real roots of
\[ \arctan x + \frac{1}{x + 2} = 1. \]
(a) If $f$ attains its maximum and minimum value for points $u, \ell \in I$, respectively,
then $R_f = [c,d]$ with $c = f(\ell)$ and $d = f(u)$.
(b) If $f$ attains neither its maximum nor its minimum value on $I$, but the limits
both exist or are equal to infinity, then $R_f = (c,d)$ with $c$ and $d$ being equal to
the smallest and largest limit, respectively.
(c) Formulate what happens if, say, $f$ attains its maximum value at some point $u \in I$,
but not its minimum value, and both limits mentioned in (b) exist.
We now apply the results of the above exercise to determine the range of a function.
Example 9.39 Let us determine the range of the function from Example 9.23. Since
it is not continuous at $x = 0$, we need to consider the function restricted to the intervals
$(-\infty, 0)$ and $(0, \infty)$ separately.
Let us first consider $f$ restricted to the interval $(-\infty, 0)$. By what was done in
Example 9.23, we see that $f$ attains its maximum value on that interval at the point
$x = -1$, and that $f(-1) = -\pi/2 - 1$. Moreover, we observed that
By the above exercise, this means that the range of $f$ when restricted to $(-\infty, 0)$ is equal
to $(-\infty, -\pi/2 - 1]$.
Similarly, when we consider $f$ restricted to the interval $(0, \infty)$, we find that its range
is equal to $[\pi/2 + 1, \infty)$. We therefore conclude that $f$ with $D_f = (-\infty, 0) \cup (0, \infty)$ has
range
\[ R_f = (-\infty, -\pi/2 - 1] \cup [\pi/2 + 1, \infty). \]
Exercise 9.40 Determine the range of the function from Example 9.35.
9.3. A NICE LITTLE TRICK: L’HOPITAL’S RULE 269
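\[ \lim_{x \to c} \frac{f'(x)}{g'(x)} \]
exist.
Then, if we have either
\[ \text{(i)} \; \lim_{x \to c} \frac{f(x)}{g(x)} = \Bigl[ \frac{0}{0} \Bigr] \quad \text{or} \quad \text{(ii)} \; \lim_{x \to c} \frac{f(x)}{g(x)} = \Bigl[ \frac{\infty}{\infty} \Bigr], \]
it follows that
\[ \lim_{x \to c} \frac{f(x)}{g(x)} = \lim_{x \to c} \frac{f'(x)}{g'(x)}. \tag{9.2} \]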
Before discussing how to prove this result, let us illustrate how it works. Here is a
first example:
Example 9.42 Let us use L’Hopital’s rule to compute the familiar limit
\[ \lim_{x \to 0} \frac{\sin x}{x}. \]
Since this expression is of the form $[0/0]$, we want to use L’Hopital’s rule. If we, for
the moment, ignore all conditions we need to check to make sure the rule applies, we see
that
\[ \lim_{x \to 0} \frac{\sin x}{x} = \Bigl[ \frac{0}{0} \Bigr] \overset{\text{L'Hop.}}{=} \lim_{x \to 0} \frac{\cos x}{1} = 1. \]
But can we trust this? Well, we need to check that all conditions of L’Hopital’s rule
are satisfied. But this follows since $\sin x$ and $x$ are elementary functions defined in a
neighbourhood of $x = 0$, and since the limit of $\cos x/1$ exists as $x \to 0$. Done!
\[ \lim_{x \to \infty} \frac{x^2}{e^x}. \]
This limit is of the form $[\infty/\infty]$. As in the previous example, we start by applying
formula (9.2) of L’Hopital’s rule without really worrying about whether the involved limits
exist:
\[ \lim_{x \to \infty} \frac{x^2}{e^x} = \Bigl[ \frac{\infty}{\infty} \Bigr] \overset{\text{L'Hop.}}{=} \lim_{x \to \infty} \frac{2x}{e^x}. \]
This limit is also of the type $[\infty/\infty]$. So, what to do? Well, again, let us hope for the
best and apply (9.2) to get:
\[ \lim_{x \to \infty} \frac{2x}{e^x} = \Bigl[ \frac{\infty}{\infty} \Bigr] \overset{\text{L'Hop.}}{=} \lim_{x \to \infty} \frac{2}{e^x} = \frac{2}{\infty} = 0. \]
Ok, so now we ended up with a concrete value, which is nice. But can we trust the
computation? That is, are the conditions of L’Hopital’s rule met? Now, the crucial
condition in each step is whether the limit of $f'/g'$ exists. As it turns out, we can justify
this condition "backwards". Indeed, in the final step, we verified that the limit of $2/e^x$
exists, and therefore the second application of L’Hopital’s rule is justified. But this
means that the limit of $2x/e^x$ exists, and therefore the first application of L’Hopital’s
rule is justified, and all is good!
Finally, we look at an example where not checking the condition that the limit f 0 /g 0
exists does get us into trouble.
Example 9.44 Let us try to use L’Hopital’s rule to compute the limit
\[ \lim_{x \to \infty} \frac{x + \sin x}{x}. \]
As in the previous examples, let us just use L’Hopital’s rule and worry about conditions
later. This gives:
\[ \lim_{x \to \infty} \frac{x + \sin x}{x} = \Bigl[ \frac{\infty}{\infty} \Bigr] \overset{\text{L'Hop.}}{=} \lim_{x \to \infty} \frac{1 + \cos x}{1}. \]
Now, since $\cos x$ diverges as $x \to \infty$, this limit does not exist. But what does this say
about our original expression? Well, not so much. Indeed, here is what happens when
we try to compute it without using L’Hopital’s rule:
\[ \lim_{x \to \infty} \frac{x + \sin x}{x} = \lim_{x \to \infty} \Bigl( 1 + \frac{\sin x}{x} \Bigr) = 1 + 0 = 1. \]
That is, in this case, a careless use of L’Hopital’s rule gives us the wrong answer. And the
reason is that the limit of $f'/g'$ does not exist, and so L’Hopital’s rule does not apply here.
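The difference between the two quotients is easy to see numerically. The following sketch tabulates $(x + \sin x)/x$ and $1 + \cos x$ for growing $x$. The sample points and function names are our own choices, and a table of values is of course only an illustration, not a proof.

```python
import math

def ratio(x):
    """The original quotient (x + sin x) / x."""
    return (x + math.sin(x)) / x

def derivative_ratio(x):
    """The quotient of derivatives (1 + cos x) / 1."""
    return 1.0 + math.cos(x)

# The original quotient settles down near 1 ...
for k in range(1, 7):
    x = 10.0**k
    print(f"x = {x:9.0f}   f/g = {ratio(x):.8f}   f'/g' = {derivative_ratio(x):.8f}")

# ... while f'/g' keeps returning to both 2 and 0 however large x gets.
at_even_multiple = derivative_ratio(2.0 * math.pi * 1000)
at_odd_multiple = derivative_ratio(math.pi * 2001)
print(at_even_multiple, at_odd_multiple)
```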
Exercise 9.47 What happens if you try to use L’Hopital’s rule to compute the limit
\[ \lim_{x \to \infty} \frac{x}{\sqrt{x^2 - 1}}\,? \]
Exercise 9.48 Compare what happens if you try to compute the limit
\[ \lim_{x \to 0} \frac{f(x)}{g(x)}, \]
with $f(x) = x^2 \sin(1/x)$ and $g(x) = x$, (i) using L’Hopital’s rule, and (ii) without
using L’Hopital’s rule. Does this make sense?
A naive “proof” in the [0/0] case in which c is a finite point goes something like this:
Exercise 9.50 In this exercise, we ask you to prove L’Hopital’s rule in the case when
x tends to a finite number.
(a) Prove that L’Hopital’s rule is a consequence of the above proposition in the
case where $c$ is a finite number, and the limit of $f/g$ is of the type $[0/0]$.
(b) Deduce from (a) that L’Hopital’s rule holds for limits of the type $[0/0]$ when
$x \to \infty$.
Exercise 9.51 (a) It would seem that Proposition 9.49 has a problem for functions
g such that g(a) = g(b). Explain why this potential problem is already taken
care of by the hypothesis of the proposition.
(b) Some textbooks on Calculus formulate Proposition 9.49 in such a way that there
is no problem if g(a) = g(b). Can you suggest such a formulation?
9.4. A CLOSER LOOK AT THE MEAN VALUE THEOREM 273
Theorem 9.2 (The Mean Value Theorem for the derivative) Suppose that a
function $f$ is continuous on $[a,b]$ and differentiable on $(a,b)$. Then there exists (at least)
a point $c \in (a,b)$ such that
\[ f'(c) = \frac{f(b) - f(a)}{b - a}. \]
Rolle’s theorem
The first step to proving the Mean Value Theorem is to establish a baby-version usually
called Rolle’s theorem. We choose to call it a “lemma”, since this is the customary word
to use for results that are purely preparatory in nature.
Lemma 9.52 (Rolle’s theorem) Suppose that $f$ is continuous on $[a,b]$ and differentiable
on $(a,b)$. Also, suppose that $f(a) = f(b) = 0$. Then there exists a point $c \in (a,b)$
such that $f'(c) = 0$.
Exercise 9.53 In Rolle’s theorem we made the assumption that f (a) = f (b) = 0. Is
this needed for the conclusion to be true? Motivate your answer with either a proof
or an example, as appropriate.
Proof that the Mean Value Theorem follows from Rolle’s theorem
Suppose that we are given a function $f$ that is continuous on $[a,b]$ and differentiable
on $(a,b)$. The point of the proof is to create a situation where we can apply Rolle’s
theorem.
To this end, we define the following function
\[ h(x) \overset{\text{def}}{=} f(x) - \Bigl( f(a) + \frac{f(b) - f(a)}{b - a}\,(x - a) \Bigr). \]
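To see why this choice of $h$ is natural, here is a short sketch of how it can be used. It is our own reconstruction, under the assumption that the argument proceeds via Rolle’s theorem as announced above, and it takes for granted that $h$ is continuous on $[a,b]$ and differentiable on $(a,b)$.

```latex
\begin{align*}
h(a) &= f(a) - \bigl( f(a) + 0 \bigr) = 0, \\
h(b) &= f(b) - \bigl( f(a) + (f(b) - f(a)) \bigr) = 0, \\
h'(x) &= f'(x) - \frac{f(b) - f(a)}{b - a}.
\end{align*}
```

So $h(a) = h(b) = 0$, and Rolle’s theorem would give a point $c \in (a,b)$ with $h'(c) = 0$. By the last line, this is the same as saying that $f'(c)$ equals the quotient $(f(b) - f(a))/(b - a)$.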
Exercise 9.54 Motivate why the function h(x) is continuous on [a,b] and differen-
tiable on (a,b).
Hint: Use Proposition 6.52.
Exercise 9.55 Modify the above proof to prove the Generalised Mean Value Theorem
for derivatives (Proposition 9.49).
Hint: The point is to modify the help function so that when we apply Rolle’s theorem,
the Generalised Mean Value Theorem follows.
9.5. PROOFS OF A FEW DEEP THEOREMS RELATED TO CONTINUITY 275
Theorem 9.8 (The Min-Max Theorem) Suppose that $f$ is continuous on the finite
and closed interval $[a,b]$. Then there exist (at least) two points $\ell, u \in [a,b]$ so that for
all $x \in [a,b]$ we have
\[ f(\ell) \le f(x) \le f(u). \]
Since some students find the logic of this chapter a bit hard to follow, here is an
overview that indicates how the main results of this chapter are connected (some of
these results will be formulated for the first time in the following pages):
• Case 1: We found a zero! We end the search and return the value $m$ as our $c$.
• Case 2: $f(a)$ and $f(m)$ have opposite signs, and we put $a_1 = a$ and $b_1 = m$.
• Case 3: $f(m)$ and $f(b)$ have opposite signs, and we put $a_1 = m$ and $b_1 = b$.
What happens now? Well, if Case 1 applies, we are done. So let us assume that one of
cases 2 or 3 holds. In either case, we end up with a new interval $[a_1,b_1]$ for which
\[ \text{(i)} \; f(a_1) < 0 \qquad \text{(ii)} \; f(b_1) > 0 \qquad \text{(iii)} \; |b_1 - a_1| = \frac{|b - a|}{2}. \]
But this means that the function $f$ satisfies exactly the same hypotheses on $[a_1,b_1]$ as on
$[a,b]$. This allows us to repeat the above process on $[a_1,b_1]$ to produce an interval that
we now call $[a_2,b_2]$. Continuing in this way, we obtain a sequence of nested intervals
\[ [a,b] \supset [a_1,b_1] \supset [a_2,b_2] \supset [a_3,b_3] \supset \cdots \]
for which
\[ \text{(i)} \; f(a_n) < 0 \qquad \text{(ii)} \; f(b_n) > 0 \qquad \text{(iii)} \; |b_n - a_n| = \frac{|b - a|}{2^n}. \]
Now, one of two things may happen. Either case 1 kicks in after a finite number of
steps, and we have found a zero of f (and the proof ends). Or it does not, and we end
up with an infinite sequence of intervals [an ,bn ]. The point is now to prove that even if
the process does not stop after a finite number of steps, we still find a zero of f .
\[ B - A = \lim_{n \to \infty} b_n - \lim_{n \to \infty} a_n = \lim_{n \to \infty} (b_n - a_n) = \lim_{n \to \infty} \frac{b - a}{2^n} = 0. \]
This means that we can put $c = A = B$. Moreover, since the $(a_n)_{n=1}^{\infty}$ are bounded above
by $b$, and the $(b_n)_{n=1}^{\infty}$ are bounded below by $a$, it follows that $A \le b$ and $B \ge a$, and so
$c \in [a,b]$.
The last step of this proof is to show that $f(c) = 0$. To this end, recall that $f$ is
continuous on $[a,b]$, and that $f(a_n) < 0$ and $f(b_n) > 0$ for all $n$. From this, we obtain
both
\[ f(c) = f\bigl( \lim_{n \to \infty} a_n \bigr) = \lim_{n \to \infty} f(a_n) \le 0, \]
and
\[ f(c) = f\bigl( \lim_{n \to \infty} b_n \bigr) = \lim_{n \to \infty} f(b_n) \ge 0. \]
In other words, $0 \le f(c) \le 0$, and so we conclude that $f(c) = 0$.
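The interval-halving process in this proof is also a practical algorithm. The sketch below runs it for a fixed number of steps and records the nested intervals $[a_n, b_n]$. The test function $f(x) = x^3 - 3x + 1$ on $[1,2]$, the step count and the name `bisection_intervals` are our own assumptions, chosen so that $f(a) < 0 < f(b)$ as in the proof.

```python
import math

def f(x):
    return x**3 - 3 * x + 1

def bisection_intervals(g, a, b, steps):
    """Return [(a, b), (a_1, b_1), ...] with g(a_n) < 0 < g(b_n), as in the proof.

    Assumes g(a) < 0 < g(b). Stops early if a midpoint is an exact zero (Case 1)."""
    if not (g(a) < 0 < g(b)):
        raise ValueError("need g(a) < 0 < g(b)")
    intervals = [(a, b)]
    for _ in range(steps):
        m = 0.5 * (a + b)
        g_m = g(m)
        if g_m == 0:      # Case 1: we found a zero
            intervals.append((m, m))
            break
        if g_m > 0:       # Case 2: g(a) and g(m) have opposite signs
            b = m
        else:             # Case 3: g(m) and g(b) have opposite signs
            a = m
        intervals.append((a, b))
    return intervals

intervals = bisection_intervals(f, 1.0, 2.0, steps=40)
a_n, b_n = intervals[-1]
c = 0.5 * (a_n + b_n)
print(len(intervals) - 1, a_n, b_n, c, f(c))
```

After $n$ steps the zero is trapped in an interval of length $|b - a|/2^n$, which is exactly property (iii) above. For this particular cubic the zero in $(1,2)$ is known to be $2\cos(2\pi/9)$, which gives an independent check.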
(i) $(d_k)_{k=1}^{\infty}$ only consists of elements taken from the sequence $(c_n)_{n=1}^{\infty}$.
(ii) $(d_k)_{k=1}^{\infty}$ respects the order of the sequence $(c_n)_{n=1}^{\infty}$.
Example 9.58 Let $(c_n)_{n=1}^{\infty} = 1/n$. Then $(d_k)_{k=1}^{\infty} = (1/2^k)_{k=1}^{\infty}$ is a subsequence of
$(c_n)_{n=1}^{\infty}$. To see this a bit more clearly, let us write out the first few terms of the two
sequences as follows:
\[ 1, \; \underbrace{\frac{1}{2}}_{=d_1}, \; \frac{1}{3}, \; \underbrace{\frac{1}{4}}_{=d_2}, \; \frac{1}{5}, \; \frac{1}{6}, \; \frac{1}{7}, \; \underbrace{\frac{1}{8}}_{=d_3}, \; \frac{1}{9}, \; \ldots \]
As we see, the sequence $d_1, d_2, d_3, \ldots$ only consists of numbers taken from the list
$c_1, c_2, c_3, \ldots$ and the order of the original sequence has been respected.
This is a typical way of expressing subsequences. Indeed, so much so, that we take the
following to be our formal definition of a subsequence:
Definition 9.59 If $(n_k)_{k=1}^{\infty}$ is a strictly increasing sequence of positive integers, then we say
that $(c_{n_k})_{k=1}^{\infty}$ is a subsequence of $(c_n)_{n=1}^{\infty}$.
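In programming terms, a subsequence is obtained by picking entries of the original sequence along a strictly increasing list of indices. Here is a minimal sketch for the sequence $c_n = 1/n$ with indices $n_k = 2^k$, as in Example 9.58. The helper names are our own, and exact fractions are used only to avoid rounding.

```python
from fractions import Fraction

def c(n):
    """The original sequence c_n = 1/n, for n = 1, 2, 3, ..."""
    return Fraction(1, n)

def n_k(k):
    """The index sequence n_k = 2^k, for k = 1, 2, 3, ..."""
    return 2**k

def is_strictly_increasing(indices):
    return all(i < j for i, j in zip(indices, indices[1:]))

indices = [n_k(k) for k in range(1, 6)]
d = [c(n) for n in indices]  # d_k = c_{n_k}

print(indices)                          # the indices n_1, ..., n_5
print(is_strictly_increasing(indices))  # the condition in Definition 9.59
print(d)                                # the first five terms of the subsequence
```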
Exercise 9.61 Suppose that $c_n = 1/(2n)$ for $n = 1, 2, \ldots$. Write the following entries
from a subsequence $d_k$ of $c_n$ in the form $c_{n_k}$.
\[ d_1 = \frac{1}{2^2}, \quad d_2 = \frac{1}{10}, \quad d_3 = \frac{1}{5!}, \quad d_4 = \frac{1}{2^{10}}. \]
Remark: The point is to figure out suitable values for the $n_k$ for these four terms.
• Case 1: Only the half $[a, m]$ has an infinite number of entries of $(c_n)_{n=0}^{\infty}$. Denote
by $n_1$ the index of the first $c_n$ to appear inside this interval. We choose $d_1 = c_{n_1}$ and
put $a_1 = a$, $b_1 = m$.
• Case 2: Only the half $[m, b]$ has an infinite number of entries of $(c_n)_{n=0}^{\infty}$. Denote
by $n_1$ the index of the first $c_n$ to appear inside this interval. We choose $d_1 = c_{n_1}$ and
put $a_1 = m$, $b_1 = b$.
• Case 3: Both halves $[a, m]$ and $[m, b]$ have an infinite number of entries from
$(c_n)_{n=1}^{\infty}$. In this case, we choose $d_1$, $a_1$ and $b_1$ exactly as in Case 1.
In particular, this means that the sequence $(d_k)_{k=0}^{\infty}$ where $d_k = c_{n_k}$ is a subsequence of
$(c_n)_{n=0}^{\infty}$. Moreover, we are in a good position to prove that $d_k$ converges to some limit
$L \in [a,b]$. Since this is done by repeating, word by word, the last lines of the proof of
the Intermediate Value Theorem, we leave this to the reader.
For the interested student, we also mention a completely different way of proving the
Bolzano-Weierstrass theorem. It is based on the following, rather interesting, lemma.
(a) Explain how the Bolzano-Weierstrass theorem follows from the lemma.
(b) To prove the lemma itself, suppose that $(c_n)_{n=1}^{\infty}$ has no decreasing subsequence.
Use this assumption to prove that there can only be a finite number of $n$ with
the property that for all $m \ge n$ we have $c_m \le c_n$ (do this by contradiction).
(c) Use what you proved in (b) to prove that $(c_n)_{n=1}^{\infty}$ must have an increasing
subsequence.
(d) Explain why combining (a), (b) and (c) proves the Bolzano-Weierstrass theorem.
\[
-C \le f(x) \le C, \qquad \forall x \in [a,b].
\]
Before we give the proof, note how this lemma says less than the Min-Max Theorem.
Indeed, we are not claiming that the function $f$ is able to attain the values $-C$ and $C$
at some points on the interval $[a,b]$. Instead, we are merely saying that the function is
bounded by these constants.
Proof of the lemma. We begin by proving the existence of an upper bound for $f(x)$. We do
this by a contradiction argument involving the Bolzano-Weierstrass theorem. To this end,
suppose that $f(x)$ is not bounded from above. That is, for all $n \in \mathbb{N}$ there exists a
$c_n \in [a, b]$ so that
\[
f(c_n) \ge n. \tag{9.3}
\]
\[
f(d_k) \ge k.
\]
But this is absurd since $L \in D_f$ and so, in particular, $f(L)$ has to be a finite number.
We conclude that $f(x)$ has to be bounded above by some constant $C > 0$.
Exercise 9.66 Prove the second half of the lemma. That is, prove that the function
must have a bound from below.
Hint: Using the part we already proved, this can be done in one or two lines.
We now turn to the proof of the Min-Max Theorem itself. Since the argument follows
the same pattern as the ones we have seen above, we only outline the proof.
Outline of the proof of the Min-Max Theorem. We restrict ourselves to showing that
there exists a number $u \in [a,b]$ so that $f(x) \le f(u)$ for all $x \in [a,b]$. That is, $f$ attains
its global maximum.
Our plan is to apply the Bolzano-Weierstrass theorem to find a sequence $d_k$ converging
to a number $L$ so that $f(L)$ is the global maximum of $f$. This we can do in the
following way:
Fig. 35. Illustration of the situation of the above proof. The key to getting the
contradiction is to show that there exists a sequence $y_n$ that converges to $y$, but so
that the sequence $x_n = f^{-1}(y_n)$ stays away from $x = f^{-1}(y)$.
Exercise 9.69 Complete the proof of the above proposition by filling out the above
steps. In particular, point out why we arrive at a contradiction in Step 4.
Exercise 9.70 (Challenge) Prove that if f is defined, continuous and invertible on
an interval, then f has to be monotone.
Hint: Use exercise 2.44.
\[
\frac{d}{dx}\arcsin(x) = \frac{1}{\sqrt{1-x^2}}, \qquad x \in (-1,1).
\]
x1 x2 =) f (x1 ) f (x2 ).
\[
\arctan(x) + \arctan\frac{1}{x} = \frac{\pi}{2}, \qquad x > 0.
\]
(a) Say that a function takes the y-values 1 and 3. Under what assumptions can we
guarantee that f must also take the y-value 2? (Here, you are supposed to refer
to and formulate a theorem from the course.)
(b) Give an example of a function that takes the y-values 1 and 3, but not the y-value
2. Point out why your example does not contradict the theorem you cited in (a).
(c) Determine the range of the function $f(x) = 2\arctan(x) + x^{-1}$.
Exercise 9.76 (Exam 2014-05-26) Let f (x) = |x|e 1/x for x 6= 0. Make a sketch
of this function which shows where it is growing/decreasing, convex/concave and any
asymptotes it may have. In your sketch, make sure you also point out any extreme
points, and how the function behaves close to x = 0.
Exercise 9.77 (Exam 2014-01-09) Determine the number of zeros of the function
$f(x) = e^x - x^2$.
Exercise 9.78 (Exam 2013-12-18) Let
\[
f(x) = \frac{1}{\sqrt{x^2 - 2x} - x}.
\]
(a) Determine the domain of f .
(b) Determine where the function is positive and negative, respectively.
(c) Determine where the function is continuous.
(d) Determine any horizontal and vertical asymptotes the function may have.
(e) Determine all local and global extreme points.
It is important that you illustrate your answers in a sketch of the function.
Exercise 9.79 (Exam 2013-08-21) Prove the inequality
\[
2x\ln x < x^2 - 1, \qquad x > 1.
\]
Exercise 9.82 (Exam 2012-12-19) Determine all local and global extreme points
of the function
\[
f(x) = |x^3 - 6x^2 + 9x - 4|
\]
on the interval $[0,5]$ and sketch its graph there.
\[
\ln(1 + e^{-x}) > \frac{1}{e^x + 1}, \qquad x \in \mathbb{R}.
\]
9.7. ANSWERS TO SELECTED EXERCISES 287
9.5 The modification is that the strict inequalities have to be replaced by non-strict
inequalities (to see why, think of the example $f(x) = x^3$). Next, since the derivative
of $f$ exists, we know that
\[
\lim_{h\to 0^+} \frac{f(x+h) - f(x)}{h} = f'(x) = \lim_{h\to 0^-} \frac{f(x+h) - f(x)}{h}.
\]
9.6 Put $f(x) = \cos^2 x + \sin^2 x - 1$. Then $f'(x) = -2\sin x\cos x + 2\sin x\cos x = 0$. Hence,
by Proposition 9.3, $f$ is equal to some constant. What remains is to figure out the
value of this constant.
9.18 As a general point of strategy, it is wise to first study the graph of $g(x) = x^2 + x - 2 =
(x + 2)(x - 1)$. Now, $g$ has zeroes at $x = -2$ and $x = 1$, and doing a table of signs
for $g'$ reveals that it has a local minimum at $x = -1/2$. When taking the absolute
value to obtain $f(x) = |g(x)|$, the part of $g$ that is negative becomes positive,
that is, it is reflected with respect to the $x$-axis:
Fig. 36. A plot of $g(x)$ (left) and $f(x)$ (right). Notice how the zeroes of $g$ become the
global minima of $f$.
In particular, zeroes of $g(x)$ become global minima of $f(x)$, and the local minimum
for $g$ at $x = -1/2$ becomes a local maximum $f(-1/2) = 9/4$. There are no global
maxima. Here is an illustration of the situation:
9.29 Put $f(x) = \arcsin x - \pi/2 + \arccos x$ and study its derivative. The formula is true
for $-1 \le x \le 1$.
9.30 Put $f(x) = \arctan\sqrt{x^2 - 1} + \arcsin(1/x)$ and study its derivative. The value of
the constant is $\pi/2$.
9.47 If you just use L'Hôpital (without simplifying the expression first), then the
expression will just repeat itself (more or less – you will get its reciprocal).
9.50 To use the Generalised Mean Value Theorem, we need for $f$ and $g$ to be defined
and continuous at $x = c$ (but not necessarily differentiable there). If this is the
case, then fine. If not, consider instead the "extended" functions
\[
F(x) = \begin{cases} f(x) & x \neq c \\ 0 & x = c \end{cases}
\qquad
G(x) = \begin{cases} g(x) & x \neq c \\ 0 & x = c \end{cases}
\]
These functions are both defined, and continuous at $x = c$. Moreover, since they
agree with $f$ and $g$, respectively, for $x \neq c$, we also have that
\[
\lim_{x\to c} \frac{f(x)}{g(x)} = \lim_{x\to c} \frac{F(x)}{G(x)}.
\]
9.60 First, notice that $n_1 \ge 1$ (in fact, all $n_k$ are larger than or equal to one). Next, we
observe that since $n_2$ is an integer, the inequality $n_2 > n_1 \ge 1$ implies $n_2 \ge 2$.
Using these observations, one should be able to set up an induction proof.
9.62 The next step in the proof is to follow, word-by-word, the portion of the proof of
the intermediate value theorem establishing that $\lim_{k\to\infty} a_k = \lim_{k\to\infty} b_k = c$ for
some $c \in [a,b]$. Since $a_k \le d_k \le b_k$, the conclusion now follows from the squeeze
theorem.
9.66 Hint: To obtain a lower bound for $f(x)$ is the same as getting an upper bound for
$g(x) = -f(x)$...
and
\[
\lim_{k\to\infty} f(x_{n_k}) = \lim_{k\to\infty} y_{n_k} = y = f(x).
\]
Since $f$ is invertible (and therefore satisfies the horizontal line criterion), this
means that $x = L$. By what we did in Step 2, this gives a contradiction
(please make sure you see why this is).
Chapter 10
In this chapter we study the indefinite integral. Since the indefinite integral is just a
matter of taking the derivative "backwards", all facts about the indefinite integral are
obtained from corresponding facts on the derivative.
Remark 10.1 (Selected problems from previous exams based on this chapter)
294 CHAPTER 10. THE INDEFINITE INTEGRAL
Example 10.2 (Free fall example) Suppose that $y(x)$ describes the vertical position
of an object in free fall at time $x$ ("free fall" means that the only force acting upon the
object is gravity). According to introductory physics, its acceleration is described by
\[
y''(x) = -9.82.
\]
We now have an equation for the velocity y 0 . But we want an expression for y.
Guessing once more (again, you are asked to check this below), it seems like the following
guy does the trick:
\[
y(x) = -\frac{9.82}{2}x^2.
\]
Exercise 10.3 By using the definition of the derivative, verify that if $y = (-9.82/2)x^2$,
then $y' = -9.82x$ and $y'' = -9.82$.
Our solution method can be described as doing derivatives backwards. That is, we
are looking for primitive functions:
Let us now look closer at Example 10.2, above. Is it really reasonable that the
motion of your mobile phone is described by the expression $y = -9.82x^2/2$? Well, not
necessarily. For instance, notice that this expression forces the position at time $x = 0$ to
be $y(0) = 0$. Moreover, it also forces the velocity at time $x = 0$ to be $y'(0) = 0$.
The expression for $y$ found in the above example matches the situation where you
drop your phone down a hole in the ground. That is, the initial height and velocity are
0. But what if the situation is the one to the right, where the initial height and velocity
are different from zero? The problem is that in the example, we missed a lot of solutions.
A function does not only have one primitive function, it has lots of them.
10.1. A FIRST LOOK AT THE INDEFINITE INTEGRAL 295
Fig. 1. It does not seem like too much to ask for Newton’s laws of nature to be able
to describe the physics in both of these situations.
Exercise 10.5 Suppose that $f$ is a function on some interval $[a,b]$. We now ask you
to show that if $F$ is a primitive function of $f$, then all primitive functions of $f$ are
exactly of the form $F + C$, where $C$ is any constant.
(a) Suppose that $F(x)$ is a primitive of $f(x)$. By using the definition of the derivative,
verify that $G(x) = F(x) + C$ is also a primitive of $f(x)$, no matter the constant
$C \in \mathbb{R}$.
(b) Suppose that $F(x)$ and $G(x)$ are two primitive functions of $f(x)$. Use the Mean
Value Theorem to show that there must exist a constant $C$ so that $G(x) =
F(x) + C$.
Example 10.6 (Free fall example, continued) By exercise 10.5, we can modify our
first guess to be
\[
y'(x) = -9.82x + C.
\]
Indeed, no matter the constant $C$, if we differentiate this expression, we get $y''(x) =
-9.82$. By noticing that $y'(0) = C$, we see that we should interpret $C$ as the initial
velocity.
Next, we modify our second guess to be
\[
y(x) = -\frac{9.82}{2}x^2 + Cx + D.
\]
Indeed, by differentiating this expression, we get $y'(x) = -9.82x + C$. Here, we see
that $y(0) = D$, so the constant $D$ should be interpreted as the initial position. This
means that we have found an equation for free fall that is flexible with respect to initial
velocities and positions.
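As a quick numerical illustration of the roles of $C$ and $D$, here is a short Python sketch (not from the notes; the function names and the sample values $C = 3$, $D = 2$ are assumptions chosen for the example). It approximates the first and second derivatives of $y$ with difference quotients and confirms that $y(0) = D$, $y'(0) = C$ and $y'' = -9.82$.

```python
G = 9.82  # gravitational acceleration in m/s^2, as in the example


def position(x, C=0.0, D=0.0):
    """y(x) = -(G/2) x^2 + C x + D."""
    return -G / 2 * x**2 + C * x + D


def first_derivative(f, x, h=1e-3):
    """Central difference quotient approximating f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def second_derivative(f, x, h=1e-3):
    """Second-order difference quotient approximating f''(x)."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / h**2


def y(x):
    # Hypothetical initial velocity C = 3 m/s and initial height D = 2 m.
    return position(x, C=3.0, D=2.0)


print(y(0.0))                     # the initial position D
print(first_derivative(y, 0.0))   # approximately the initial velocity C
print(second_derivative(y, 1.5))  # approximately -9.82 at any time
```

Changing `C` and `D` shifts the initial velocity and height, but leaves the second derivative untouched, which is exactly why both constants are free.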
Exercise 10.7 Suppose that you throw your mobile phone upwards. The initial height
is roughly 2 meters (your hand is extended upwards as you release it) and it takes 3
seconds for the mobile phone to hit the ground. (a) What was the initial velocity?
(b) How far up did the phone go?
Keeping track of these constants means that our equations (hopefully) model all
relevant physical situations, and not just a particular one. Since these constants are so
important, we introduce the following definition.
to denote all primitive functions of $f$. By what we did in exercise 10.5, it follows that if
$f$ is defined on an interval, and $F$ is any primitive function of $f$, then we can write
\[
\int f(x)\,dx = F(x) + C, \qquad C \in \mathbb{R}.
\]
Remark 10.9 If you read the definition of the indefinite integral carefully, you may
realise that it would be more correct to write
\[
\int f(x)\,dx = \{F(x) + C : C \in \mathbb{R}\}.
\]
However, as long as we keep track of the constant $C$, we can safely skip the set notation
in most situations (note that if $f$ is not defined on an interval, then we should adjust
this slightly – do you see why and how?).
The day job of the indefinite integral is to represent all primitive functions of f (x).
Because of this, the operations of taking derivatives and taking indefinite integrals are
by definition inverse to each other.
Fig. 2. That is, what the one does, the other tries to undo.
Since we know that $\sin x$ and $\cos x$ are related by the derivative, we guess that a primitive
of $x\sin(x^2)$ ought to be $F(x) = -\cos(x^2)$. To check whether this is correct, we differentiate:
\[
F'(x) = 2x\sin(x^2).
\]
Since this is not equal to $x\sin(x^2)$, our guess was wrong! However, only by the factor
2. To try to compensate for this, we make the modified guess $F(x) = -(1/2)\cos(x^2)$.
This time differentiation gives $F'(x) = x\sin(x^2)$, and our guess is shown to be correct!
In conclusion, we have proved that
\[
\int x\sin(x^2)\,dx = -\frac{1}{2}\cos(x^2) + C.
\]
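The guess-and-check routine can be mimicked on a computer. The sketch below is an illustration under stated assumptions and not a proof: the helper `is_primitive`, the sample points and the tolerance are all made up for this example. It differentiates a candidate $F$ numerically and compares the result with $f(x) = x\sin(x^2)$ at a few points.

```python
import math


def derivative(F, x, h=1e-6):
    """Central difference quotient approximating F'(x)."""
    return (F(x + h) - F(x - h)) / (2 * h)


def is_primitive(F, f, points, tol=1e-6):
    """Numerical check that F' agrees with f at the given sample points."""
    return all(abs(derivative(F, x) - f(x)) < tol for x in points)


def f(x):
    return x * math.sin(x**2)


def first_guess(x):
    return -math.cos(x**2)          # off by the factor 2


def second_guess(x):
    return -0.5 * math.cos(x**2)    # the corrected guess


points = [0.3, 1.0, 1.7, 2.4]
print(is_primitive(first_guess, f, points))
print(is_primitive(second_guess, f, points))
```

Agreement at a handful of points only suggests that a guess is right; the actual verification is still the differentiation carried out by hand above.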
Exercise 10.13 Use the method of guessing and checking to compute the following
indefinite integrals.
\[
\text{(a)} \int \frac{dx}{x^3} \qquad \text{(b)} \int x e^{x^2}\,dx \qquad \text{(c)} \int \frac{3x^2}{1 + x^3}\,dx \qquad \text{(d)} \int \tan^2 x\,dx
\]
Hint: In (d), it helps to know a few ways of how to express the derivative of $\tan x$.
The rulebook, part 2: The summation rule and a weak product rule
The method of guessing and checking also allows us to obtain other computational rules
for the indefinite integral. We begin with the following pair of basic rules:
Proposition 10.14 For all functions $f, g$ and constants $k \in \mathbb{R}\setminus\{0\}$, we have
(i) $\int k f(x)\,dx = k \int f(x)\,dx$, where $k$ is a constant,
(ii) $\int \big(f(x) + g(x)\big)\,dx = \int f(x)\,dx + \int g(x)\,dx$.
While these formulas are rather straight-forward, there is a complication here. In-
deed, there is something mysterious going on with the constant C. Let us look at an
example to see what is going on.
Example 10.15 (A closer look at the undetermined constant) Since $(x^2)' = 2x$,
it holds by the definition of the indefinite integral that
\[
\int 2x\,dx = x^2 + C.
\]
Now, in practice, we tend to avoid using set notation when dealing with indefinite
integrals. Instead, we usually justify the above identity by using different symbols for the
indefinite constants showing up. For instance, we write
\[
\int 2x\,dx = x^2 + C \quad\text{and}\quad 2\int x\,dx = x^2 + 2D,
\]
and then observe that as $C$ and $D$ run through all of $\mathbb{R}$, these two formulas describe
exactly the same functions, and we may therefore conclude that $\int 2x\,dx = 2\int x\,dx$.
10.2. THE RULEBOOK FOR INDEFINITE INTEGRATION 299
Fig. 3. Here we see a selection of primitive functions for y = 2x. Notice that these
functions all have the same slope for a given value of x, and that all can be obtained
from both the formula x2 + C and x2 + 2D.
In fact, since the indefinite integral is really a set of functions, we are allowed to
simplify the constants involved. For instance, the computation in the above example is
often expressed as
\[
\int 2x\,dx = 2\int x\,dx = 2\Big(\frac{1}{2}x^2 + C\Big) = x^2 + 2C = x^2 + D,
\]
or even
\[
\int 2x\,dx = 2\int x\,dx = 2\Big(\frac{1}{2}x^2 + C\Big) = x^2 + 2C = x^2 + C.
\]
Note that in the second computation, we actually change the role of C in the middle of
the computation. While sloppy, this is quite common. To warn people that this is about
to happen, mathematicians sometimes write "the constant C may change from line to
line" at the start of certain computations. We do this when the actual value of C does
not matter.
Hint: No fancy techniques are needed. Just rewrite the expression in some useful way.
\[
\frac{d}{dx}\Big(f(x)g(x)\Big) = f'(x)g(x) + f(x)g'(x)
\]
But this means that $f(x)g(x)$ is a primitive of the expression $f'(x)g(x) + f(x)g'(x)$. By
the definition of the indefinite integral, this means that
\[
\int \Big(f'(x)g(x) + f(x)g'(x)\Big)\,dx = f(x)g(x) + C.
\]
In the last equality, we combine the constant C with the constant still inside of the
integral appearing on the left-hand side. This means that the constant is still there, we
just do not see it in the formula.
We have now proved the formula called integration by parts (or partial integration):
To make the formula work, we have to pretend that one of the factors is the term f 0
from the integration by parts formula, and that the other is g.
The following is a bad choice:
\[
\int \underbrace{x}_{f'}\,\underbrace{\sin x}_{g}\,dx = \underbrace{\frac{1}{2}x^2\sin x}_{f\cdot g} - \int \underbrace{\frac{1}{2}x^2\cos x}_{f\cdot g'}\,dx.
\]
Here, we first choose candidates for f 0 and g, compute f and g 0 , and then insert everything
into the partial integration formula. However, it turns out that this was a bad idea since
our next expression looks even worse than the one we began with.
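We therefore try another choice for $f'$ and $g$:
\begin{align*}
\int \underbrace{x}_{g}\,\underbrace{\sin x}_{f'}\,dx &= \underbrace{x\cdot(-\cos x)}_{g\cdot f} - \int \underbrace{1\cdot(-\cos x)}_{g'\cdot f}\,dx \\
&= -x\cos x + \int \cos x\,dx.
\end{align*}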
Important: If you doubt your answer when computing an indefinite integral, or just
want to double check that your answer is correct, then recall that you can guess and check.
If taking the derivative of the answer does not give you back the original function, you
have messed up. In the above example, the following verifies that our answer is correct:
\[
\frac{d}{dx}\Big(-x\cos x + \sin x + C\Big) = -\cos x + x\sin x + \cos x = x\sin x.
\]
The following three exercises more or less sum up the basic tricks related to partial
integration.
Hint: In (b), you need two tricks. The first is $\sin^2 x = \sin x\cdot\sin x$. Try to figure out
the second yourself. (Using this trick twice will not work!)
Next, we are going to investigate how we can make the chain rule give us a computational
rule for the indefinite integral. This will result in the most useful computational rule for
the indefinite integral, namely the change of variables formula.
To get some intuition of what is going on, let us first look at some examples.
This essentially means that we have to guess a primitive for sin 3x. The guess cos 3x
seems natural, but it is slightly wrong (check this yourself!). However, if we adjust this
guess – compensating for the constant just as in example 10.12 – we obtain the correct
formula
\[
\int \sin 3x\,dx = -\frac{1}{3}\cos 3x + C.
\]
Both examples above are typically handled by the change of variables formula for the
indefinite integral, which is a consequence of the chain rule for derivatives. To this end,
we begin by writing up the chain rule as follows:
\[
\frac{d}{dx} F\big(g(x)\big) = f\big(g(x)\big)\,g'(x)
\]
Here, we use F to denote a primitive of f . But this means that F (g(x)) is a primitive
If you examine the two previous examples closely, then you see that this is actually the
formula we used. However, this is not the formula that most mathematicians think of
when they say change of variables.
So, let us massage the above expression a bit, introducing the variable $u = g(x)$. The
job of this variable is to hide the complexity of $g(x)$. Indeed, we can write the following:
\[
F\big(g(x)\big) + C = F(u) + C = \int f(u)\,du.
\]
Here, the integral $\int f(u)\,du$ tells us that we are to find the primitive of $f(u)$ as if $u$ is
the variable of $f$. That is, when computing $\int f(u)\,du$, we are allowed to forget about the
connection $u = g(x)$.
Combining the above, we get:
To remember this formula, we use Leibniz notation. Notice that when we put $u =
g(x)$, then computing the derivative of $u$ with respect to $x$ gives
\[
\frac{du}{dx} = g'(x) \iff du = g'(x)\,dx.
\]
The expression to the right does not really mean anything (since we have only given
meaning to the symbol du/dx, and not to du and dx separately), but the notation is
excellent as a reminder of how to use the change of variables formula.
Let us revisit some of the above examples.
The point is now more or less to think in the same way as when we used the chain rule.
Here sin u is a friendly outer function (since we know how to integrate it with respect to
u), so we put u = 2x. (Note that we often call sin u the outer function, and u = 2x the
inner function.) We compute the derivative of the inner function (usually just called the
inner derivative), and change it to the form indicated above:
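\[
\frac{du}{dx} = 2 \iff du = 2\,dx.
\]
By the change of variables formula, this yields
\begin{align*}
\int \sin 2x\,dx &= \frac{1}{2}\int \sin 2x\cdot 2\,dx \\
&= \frac{1}{2}\int \sin u\,du \\
&= \frac{1}{2}\Big(-\cos u + C\Big) = -\frac{1}{2}\cos 2x + C.
\end{align*}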
(Note that the constant C changed in the last equality. We could have used a new name,
like D, and pointed out that C/2 = D, but it does not really matter, since it is an
“indefinite constant”.)
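\[
\frac{du}{dx} = \frac{1}{2\sqrt{x}} \iff du = \frac{dx}{2\sqrt{x}}.
\]
This is just what we need! (Such miracles occur frequently on exams.) The change of
variables formula now yields
\begin{align*}
\int \frac{\sin\sqrt{x}}{2\sqrt{x}}\,dx &= \int \sin\sqrt{x}\cdot\frac{dx}{2\sqrt{x}} \\
&= \int \sin u\,du \\
&= -\cos u + C = -\cos\sqrt{x} + C.
\end{align*}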
Exercise 10.28 Use change of variables together with Proposition 8.8 to compute:
\[
\text{(a)} \int (2x + 3)^{10}\,dx \qquad \text{(b)} \int \cos x\, e^{\sin x}\,dx
\]
\[
\text{(c)} \int \frac{x}{1 + x^2}\,dx \qquad \text{(d)} \int \frac{\sin(2x)}{1 + \cos^2 x}\,dx
\]
Hint: Most of these should be rather straightforward. However, to solve part (d), you
may want to browse the various trigonometric identities from Chapter A.
Exercise 10.29 Use a suitable change of variables to show that
\[
\int \frac{x^2(11x - 9)}{3(x - 1)^{1/3}}\,dx = x^3(x - 1)^{2/3} + C.
\]
Remark: This integral looks much worse than it actually is...
With a bit more experience, these exercises will get easier, since we will get more used
to spotting suitable pairs of inner and outer functions. When we reach the exam, these
should (hopefully) be routine exercises. Note that what makes the change of variables formula
difficult, in the way we are using it here, is that it is not enough to spot a suitable pair
of inner and outer functions – the derivative of the inner function must also appear!
However, this is not strictly necessary. The change of variables formula can also
be used even if the derivative of the inner function does not appear. As the following
example shows, there is quite a lot of freedom.
This is similar to the previous example, however this time there is no sign of a suitable
inner derivative. Still, we can try to force the change of variables. So, again, we put
$u = \sqrt{x}$ and compute $du = dx/(2\sqrt{x})$. But since $u = \sqrt{x}$, we can also express this as
$du = dx/(2u)$, which is the same as $2u\,du = dx$.
Using this,
\[
\int \sin\sqrt{x}\,dx = \int \sin u\cdot 2u\,du = 2\int u\sin u\,du.
\]
Exercise 10.32 Parts (iii) and (iv) of Proposition 8.9 also yield integration formulas,
even though they are practically never used. Determine these.
This is called a partial fraction decomposition of $1/(x^2 - 1)$. Indeed, once such $A, B$ are
found, the integral is solved as follows
\[
\int \frac{dx}{x^2 - 1} = A\ln|x + 1| + B\ln|x - 1| + C. \tag{10.1}
\]
This leads to two questions: how do we know that we can decompose $1/(x^2 - 1)$ in
this way, and how do we determine the constants $A$ and $B$? We address the second
question first.
\[
1 = (A + B)x + (B - A)
\]
10.3. A BIG BAG OF INTEGRATION TRICKS 307
Notice that for the left-hand side to be equal to the right-hand side, it suffices to have
$A + B = 0$ and $B - A = 1$. That is, we get the system of equations
\begin{align*}
A + B &= 0 \\
B - A &= 1.
\end{align*}
Solving this system, we get the solutions $A = -1/2$ and $B = 1/2$.
Method 2 – laying on hands: The following method is often easier and more direct
than the one above. The trick is to take advantage of the fact that (10.2) is supposed to
hold for all $x$. In particular, choosing $x = 1$ and $x = -1$, we get
\begin{align*}
x = 1 &\implies 1 = A(1 - 1) + B(1 + 1) = 0 + 2B \implies B = \frac{1}{2} \\
x = -1 &\implies 1 = A(-1 - 1) + B(-1 + 1) = -2A + 0 \implies A = -\frac{1}{2}.
\end{align*}
(Can you guess why the method is called laying on hands?)
We conclude that
\[
\int \frac{dx}{x^2 - 1} = \frac{1}{2}\ln\left|\frac{x - 1}{x + 1}\right| + C.
\]
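When the denominator is a product of distinct first degree factors, the trick of plugging in the zeroes of the denominator can be automated. The following Python sketch is an illustration under that assumption; the function name `cover_up` is hypothetical and not taken from the notes. For a numerator $p$ and distinct zeroes $r$, the constant belonging to $1/(x - r)$ is $p(r)$ divided by the product of $(r - s)$ over the other zeroes $s$.

```python
from fractions import Fraction


def cover_up(numerator, roots):
    """Constants in p(x) / prod(x - r) = sum of coeff[r] / (x - r).

    Assumes the roots are distinct and deg p is less than the number of roots.
    """
    coeffs = {}
    for r in roots:
        denominator = Fraction(1)
        for s in roots:
            if s != r:
                denominator *= (r - s)   # evaluate the other factors at x = r
        coeffs[r] = Fraction(numerator(r)) / denominator
    return coeffs


# The example from the text: 1/(x^2 - 1) = A/(x + 1) + B/(x - 1).
coeffs = cover_up(lambda x: 1, [Fraction(-1), Fraction(1)])
A = coeffs[Fraction(-1)]   # belongs to the factor x + 1
B = coeffs[Fraction(1)]    # belongs to the factor x - 1
print(A, B)
```

For repeated or second degree factors this shortcut is not enough on its own, and one falls back on comparing coefficients as in the first method.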
Exercise 10.34 Modify the solution procedure in the above example a tiny bit to
compute the integral
\[
\int \frac{x\,dx}{x^2 - 1}.
\]
Exercise 10.35 Combine the strategies from 10.33 and 10.34 to compute the integral
\[
\int \frac{3x + 2}{x^2 - x - 2}\,dx.
\]
1. Make sure that the degree of the numerator $p(x)$ is strictly less than the degree
of the denominator $q(x)$. If this is not the case, perform a polynomial division to
write
\[
\frac{p(x)}{q(x)} = r(x) + \frac{p_1(x)}{q(x)},
\]
where $r(x)$ and $p_1(x)$ are polynomials and $\deg p_1(x) < \deg q(x)$.
2. Factorise the denominator $q(x)$ as much as possible. Here, any factorisation is fine
as long as no two factors have common zeroes.
3. Make a suitable guess for the partial fraction decomposition of $p(x)/q(x)$ (or, if
you needed to do a polynomial division in Step 1, of $p_1(x)/q(x)$). For each factor
of the form $(x + a)^n$ in $q(x)$, we include in the guess either
\[
\frac{A_0 + A_1 x + \cdots + A_{n-1}x^{n-1}}{(x + a)^n}
\quad\text{or}\quad
\frac{B_0}{x + a} + \frac{B_1}{(x + a)^2} + \cdots + \frac{B_{n-1}}{(x + a)^n}
\]
and for each factor of the form $(ax^2 + bx + c)^n$, we include in the guess either
\[
\frac{A_0 + A_1 x + \cdots + A_{2n-1}x^{2n-1}}{(ax^2 + bx + c)^n}
\quad\text{or}\quad
\frac{B_0 + C_0 x}{ax^2 + bx + c} + \frac{B_1 + C_1 x}{(ax^2 + bx + c)^2} + \cdots + \frac{B_{n-1} + C_{n-1}x}{(ax^2 + bx + c)^n},
\]
where the $A_j$, $B_j$ and $C_j$ are all constants (see below for some examples).
4. Determine the constants $A_j$, $B_j$ and $C_j$.
5. Double check that your formula is correct (this is how you know whether or not
you have made a mistake – and will actually be the proof that your formula is
true).
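The polynomial division in Step 1 is a mechanical procedure, and can be sketched in a few lines of Python. This is only an illustration: the function name `poly_divmod` and the convention of listing coefficients from the highest degree down are assumptions made for the example, and the leading coefficient of $q$ is assumed to be nonzero.

```python
from fractions import Fraction


def poly_divmod(p, q):
    """Divide p by q, both given as coefficient lists, highest degree first.

    Returns (r, p1) with p = r*q + p1 and deg p1 < deg q.
    """
    p = [Fraction(c) for c in p]
    q = [Fraction(c) for c in q]
    quotient = []
    while len(p) >= len(q):
        factor = p[0] / q[0]          # next coefficient of the quotient
        quotient.append(factor)
        for i in range(len(q)):       # subtract factor * q, aligned at the top
            p[i] -= factor * q[i]
        p.pop(0)                      # the leading term is now zero
    return quotient, p


# Example: (x^3 + 2x + 1) / (x^2 - 1) = x + (3x + 1) / (x^2 - 1)
quotient, remainder = poly_divmod([1, 0, 2, 1], [1, 0, -1])
print(quotient, remainder)
```

After this step, only the fraction with the remainder in the numerator needs a partial fraction decomposition; the polynomial part $r(x)$ is integrated directly.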
We now take a closer look at how to make suitable guesses in Step 3 above. The
thing is that the guess we need to make for the partial fractions decomposition sometimes
needs to be adjusted. Here are some pointers on how this is done.
2) Here is an example where one factor is of the form $(x + a)^n$ with $n > 1$. According
to our method, each first degree term with higher multiplicity can be included in one of
two ways: either once with a numerator of degree one less than the multiplicity,
or several times with increasing multiplicity as shown in the second line below:
\begin{align}
\frac{16x + 61}{(x + 2)^3(x - 3)} &= \frac{Ax^2 + Bx + C}{(x + 2)^3} + \frac{D}{x - 3} \tag{10.4} \\
&= \frac{E}{x + 2} + \frac{F}{(x + 2)^2} + \frac{G}{(x + 2)^3} + \frac{H}{x - 3} \tag{10.5}
\end{align}
3) Finally, if you have a second degree factor that you “cannot” factorise then you
can compensate by choosing a suitable degree for the numerator. For instance:
\[
\frac{16x + 61}{(x^2 + 2)(x - 3)} = \frac{Ax + B}{x^2 + 2} + \frac{C}{x - 3} \tag{10.6}
\]
With the above guidelines, you essentially know all you need to know about partial
fraction decompositions of rational functions (see Remark 10.41, below).
Exercise 10.38 (a) Inspired by the guidelines above, state an appropriate guess for
\[
\frac{x^3 + 2x + 1}{(x - 1)^2(x^2 - 1)(x^2 + 1)^2}.
\]
\[
\frac{2x + 1}{(x - 1)(x^2 - 1)} = \frac{A}{x - 1} + \frac{Bx + C}{x^2 - 1}.
\]
What goes wrong? (b) Adjust the partial fraction decomposition so that it works and
compute the integral.
Remark 10.41 Note that we do not state or prove a general theorem on partial fraction
decompositions (the techniques needed to prove such a result go beyond the scope of this
course). This means that you need to prove your partial fraction decomposition each and
every time you use this technique. (But this is in any case a good idea as this amounts
to double checking whatever such formula you come up with!)
Next, we point out that the guesses mentioned in Method 10.37 are all that we need.
Indeed, any factor in $q(x)$ can always be expressed as a product of factors of the form
$(x + a)^n$ and/or $(ax^2 + bx + c)^n$. A simple example of this would be
\[
x^3 - 1 = (x - 1)(x^2 + x + 1).
\]
Moreover, if we allow complex numbers, then we can express any factor of $q(x)$ as a
product of factors $(x + a)^n$. An example of this would be
In other words, by allowing for complex numbers, we could simplify the types of guesses
we need to make! Alas, this would get us into trouble, as we do not know how to integrate
factors of the form $1/(x + a)$ when $a$ is a complex number (it would require us to define
the logarithm of a complex number – which requires a deeper discussion on complex
numbers).
Exercise 10.42 Prove that all third degree polynomials can be written as a product
of a second degree polynomial and a first degree polynomial with real coefficients.
Hint: Use the graph sketching techniques from Chapter 9 in combination with Propo-
sition A.34.
Remark: Note that if we would allow complex coefficients, then by the fundamental
theorem of algebra, every polynomial of degree three can be written as a product of
three first degree polynomials.
Example 10.43 To compute the above integral, we start by completing the square
of the denominator (if the numerator was of equal or higher degree, we would start by
doing a polynomial division):
\[
\int \frac{dx}{x^2 + 4x + 5} = \int \frac{dx}{(x + 2)^2 + 1}.
\]
Next, we make the change of variables $x + 2 = u$. This gives $dx = du$, and we compute
\begin{align*}
\int \frac{dx}{(x + 2)^2 + 1} = \int \frac{du}{u^2 + 1} &= \arctan u + C \\
&= \arctan(x + 2) + C.
\end{align*}
Exercise 10.44 Compute the following integrals by first completing the square of the
denominator.
\[
\text{(a)} \int \frac{dx}{x^2 + 4x + 5} \qquad \text{(b)} \int \frac{dx}{x^2 + 4x + 3}
\]
Exercise 10.45 Compute
\[
\int \frac{dx}{(x + 1)(x^2 + x + 1)}.
\]
Exercise 10.46 Compute
\[
\text{(a)} \int \frac{dx}{x^3 - 3x^2 + 2x} \qquad \text{(b)} \int \frac{x^3}{x^2 + 5}\,dx
\]
is fairly easy to compute when $n$ is odd, and quite annoying when $n$ is even. To see why
this is, we consider $n = 5$. Then, by the Pythagorean identity, we get
\[
\int \cos^5 x\,dx = \int \cos^4 x\cdot\cos x\,dx = \int (1 - \sin^2 x)^2\cos x\,dx.
\]
Exercise 10.49 Complete the computation of the integral in the previous example by
making a suitable change of variables.
Hint: First, multiply with 1 so that you can use Pythagoras without square roots.
when $n$ is an even number? Well, the strategy is to reduce the number $n$ as much as
possible by using the half-angle formulas
\[
\cos^2 x = \frac{1 + \cos 2x}{2} \quad\text{and}\quad \sin^2 x = \frac{1 - \cos 2x}{2}
\]
To illustrate this, suppose that $n = 4$. Then, by the half-angle formula,
\begin{align*}
\int \cos^4 x\,dx &= \int \Big(\frac{1 + \cos 2x}{2}\Big)^2\,dx \\
&= \int \frac{1 + 2\cos 2x + \cos^2(2x)}{4}\,dx
\end{align*}
Exercise 10.52 Complete the computation of the integral in the previous example by
applying the half-angle formula once more.
Exercise 10.53 Compute the indefinite integral of $\sin^6 x$.
Exercise 10.54 Pair the following integrals with suitable primitive functions.
(a) $\int \frac{\sin 2x}{1 - \cos^2 x}\,dx$                  (i) $\frac{1}{6}\sin^6 x - \frac{1}{8}\sin^8 x + C$
(b) $\int (\sin x)^5(\cos x)^3\,dx$                 (ii) $\frac{x}{2} - \frac{1}{4}\sin(2x) + C$
(c) $\int \frac{dx}{\cos x}$                              (iii) $2\ln(\sin x) + C$
(d) $\int \sin x\,(\cos^5 x + \cos^3 x)^{1/3}\,dx$     (iv) $\frac{1}{2}\ln\frac{1 + \sin x}{1 - \sin x} + C$
(e) $\int \sin^2 x\,dx$                             (v) $-\frac{3}{8}(\cos^2 x + 1)^{4/3} + C$
Remark: You should make it your business to understand how to compute all of these
integrals, and not just verify their solution using differentiation.
stopping us, so let’s try! Taking into account that $dx = \cos u\,du$, we get
\[
\int \sqrt{1 - x^2}\,dx = \int |\cos u|\cos u\,du.
\]
Can we get rid of the absolute value? Yes, to see how, you need to notice that the change
of variables we really made is not $x = \sin u$, but rather $\arcsin(x) = u$ (this is why we
call it an inverse substitution – if this confuses you, go back and carefully re-read the
section on the change of variables formula, and compare to what is going on here). Since
the range of $\arcsin(x)$ is $[-\pi/2, \pi/2]$, it follows that $|\cos u| = \cos u$ (see Example 2.86).
Yay! In other words, we need to compute the integral
\[
\int \sqrt{1 - x^2}\,dx = \int \cos^2 u\,du.
\]
By exercise 10.54 (sort of), and the fact that $x = \sin u$ is the same as $\arcsin x = u$, we
get
\begin{align*}
\int \cos^2 u\,du &= \frac{u + \sin u\cos u}{2} + C \\
&= \frac{\arcsin x + \sin(\arcsin x)\cos(\arcsin x)}{2} + C \\
&= \frac{\arcsin x + x\sqrt{1 - x^2}}{2} + C
\end{align*}
Here, we used in the last step that $\sin(\arcsin x) = x$ and $\cos(\arcsin x) = \sqrt{1 - x^2}$ for
$x \in [-1,1]$ (recall Example 2.86).
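Since the computation above has several steps, it is worth double checking the final formula by differentiating it. The following Python sketch does this numerically; it is an illustration only, and the sample points, step size and names are assumptions made for the example. It compares a difference quotient of $F(x) = (\arcsin x + x\sqrt{1 - x^2})/2$ with $\sqrt{1 - x^2}$ at a few points inside $(-1, 1)$.

```python
import math


def F(x):
    """Candidate primitive found by the inverse substitution."""
    return (math.asin(x) + x * math.sqrt(1 - x * x)) / 2


def f(x):
    """The integrand sqrt(1 - x^2)."""
    return math.sqrt(1 - x * x)


def derivative(g, x, h=1e-6):
    """Central difference quotient approximating g'(x)."""
    return (g(x + h) - g(x - h)) / (2 * h)


points = [-0.9, -0.5, 0.0, 0.5, 0.9]
max_error = max(abs(derivative(F, x) - f(x)) for x in points)
print(max_error)  # tiny, so F' and f agree at the sample points
```

The sample points stay away from $x = \pm 1$, where the difference quotient would need values outside the domain of $F$.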
where $f$ is a rational function in two variables, can be computed using partial fraction
decompositions. The trick is to use the change of variables
\[
y = \tan\frac{x}{2}.
\]
The point is that with this change of variables, it is possible (but not easy) to compute
that
\[
dx = \frac{2\,dy}{1 + y^2}, \qquad \sin x = \frac{2y}{1 + y^2}, \qquad \cos x = \frac{1 - y^2}{1 + y^2}.
\]
This means that the integral in (10.7) can be rewritten and computed using the partial
fraction decomposition technique. While we will not really pursue this technique (the
expressions you end up with tend to be really horrible), here are two exercises to get a
feel of what is going on.
Exercise 10.60 Prove the formulas for dx, sin x and cos x in the above remark.
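Before proving the three formulas, it can be reassuring to test them numerically. The Python sketch below is a sanity check and no substitute for the proof; the function names and sample values are made up for the example. It uses that $y = \tan(x/2)$ can be inverted as $x = 2\arctan y$ for $x \in (-\pi, \pi)$, so that $dx/dy$ can be approximated by a difference quotient.

```python
import math


def trig_errors(x):
    """Differences between the y-formulas and sin x, cos x, for y = tan(x/2)."""
    y = math.tan(x / 2)
    sin_from_y = 2 * y / (1 + y**2)
    cos_from_y = (1 - y**2) / (1 + y**2)
    return sin_from_y - math.sin(x), cos_from_y - math.cos(x)


def dx_dy(y, h=1e-6):
    """Difference quotient for x = 2 arctan(y), the inverse of y = tan(x/2)."""
    return (2 * math.atan(y + h) - 2 * math.atan(y - h)) / (2 * h)


for x in [-2.0, -0.5, 0.3, 1.0, 2.5]:
    print(x, trig_errors(x))

for y in [-1.5, 0.0, 0.7, 3.0]:
    print(y, dx_dy(y), 2 / (1 + y**2))
```

The two columns printed in the second loop should agree to many decimals, in line with the formula $dx = 2\,dy/(1 + y^2)$.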
10.4. EXAM EXERCISES 317
(a) State the definition of the indefinite integral and explain the difference to definite
integrals.
(b) Compute the integrals
(i) $\int \sin\sqrt{x} \, dx$    (ii) $\int_0^3 |x^2 - 5x + 4| \, dx$
Remark: In part (ii), the definite integral actually requires some thinking before it can
be solved using "indefinite" techniques.
(a) Explain briefly the relation between a primitive function $\int f(x)\,dx$ and the integral $\int_a^b f(x)\,dx$.
(b) Find all primitive functions of
\[ \frac{\tan x}{\sin x + 1}. \]
(c) Compute
\[ \int_1^3 \frac{dx}{2 + |x^2 - 2|}. \]
Remark: As in the first exercise above, part (c) requires some thought before the indefinite techniques can be used.
10.5
\[ (F(x) + C)' = \lim_{h \to 0} \frac{(F(x+h) + C) - (F(x) + C)}{h} = \lim_{h \to 0} \frac{F(x+h) - F(x)}{h} = f(x). \]
10.7 (a) About 14 m/s, which is roughly the same as 51 km/h. (b) About 12 meters
(you do not need to differentiate to figure this out. For instance, you can deduce
this by completing the square of the expression – indeed, this was how we found
the range of a second degree expression in Chapter 0.).
x2
10.10 (a) x + C, (b) 2 + C, (c) x
1
+ C, (d) C.
2
10.13 (a) 1/2x2 + C, (b) e x /2 + C, (c) ln(1 + x3 ) + C, (d) tan x x + C. (In (d),
we used the fact that we already know that $(\tan x)' = 1 + \tan^2 x$.)
10.16 We give the argument for (i). On the one hand, if $F(x)$ is a primitive of $f(x)$, then it follows by the product rule for derivatives that $(kF(x))' = kf(x)$. It therefore follows by the definition of indefinite integrals that
\[ \int kf(x) \, dx = kF(x) + C. \]
On the other hand, by the same definition,
\[ k\int f(x) \, dx = k(F(x) + D) = kF(x) + kD. \]
As we allow $C$ and $D$ to run through all of $\mathbb{R}$, the two above formulas describe exactly the same functions (given that $k \neq 0$). Hence, these two indefinite integrals represent the same sets of functions, and we conclude that
\[ \int kf(x) \, dx = k\int f(x) \, dx. \]
10.20 (a) $xe^x - e^x + C$, (b) $(x^2 - 2)\sin x + 2x\cos x + C$, (c) $(\ln(x) + 1)/x + C$.
10.22 (a) $(1/2)e^x(\sin x - \cos x) + C$, (b) $(1/2)(x - \sin x\cos x) + C$, (c) $(\ln(x))^2/2 + C$.
10.28 (a) $(1/22)(2x+3)^{11} + C$, (b) $e^{\sin x} + C$, (c) $(1/2)\ln(x^2+1) + C$, (d) $\ln(1 + \cos^2 x) + C$.
\[ \frac{x}{(x-1)(x+1)} = \frac{A}{x+1} + \frac{B}{x-1}. \]
That is, you need to find $A, B$ that solve $x = A(x-1) + B(x+1)$. Using either method from Example 10.33 gives $A = B = 1/2$.
10.36 (a) Compare the limits as $x \to \infty$ of both expressions. Can they ever be the same?
(b) x + ln |x2 1| ln |x+1|
2 + C.
10.38 (a)
\[ \frac{A + Bx + Cx^2}{(x-1)^3} + \frac{D}{x+1} + \frac{E + Fx + Gx^2 + Hx^3}{(x^2+1)^2} \]
or
\[ \frac{A}{x-1} + \frac{B}{(x-1)^2} + \frac{C}{(x-1)^3} + \frac{D}{x+1} + \frac{E + Fx}{x^2+1} + \frac{G + Hx}{(x^2+1)^2} \]
(b) According to the second of the above expressions, the integral will be equal to
\[ A\log|x-1| - \frac{B}{x-1} - \frac{C}{2(x-1)^2} + D\log|x+1| + E\arctan(x) + \frac{F}{2}\log(1+x^2) + G\,\text{???} - \frac{H}{2}\cdot\frac{1}{x^2+1} + C, \]
where ??? stands for whatever the answer to exercise 10.57 is.
10.39 (a) Not possible. The factor $(x - 1)$ appears in both expressions on the right-hand side.
10.5. ANSWERS TO SELECTED EXERCISES 321
10.54 (a) - (iii), (b) - (i), (c) - (iv), (d) - (v), (e) - (ii).
\[ \cos(\arcsin y) - \frac{1}{2}\ln\frac{1 + \cos(\arcsin y)}{1 - \cos(\arcsin y)} + C. \]
Plugging in the expression for $\cos(\arcsin y)$, and then simplifying, leads you to the expression
\[ \sqrt{1 - y^2} - \log\left(\sqrt{1 - y^2} + 1\right) + \log(y) + C. \]
10.60 Rewrite $y = \tan(x/2)$ as $\arctan y = x/2$. With this, implicit differentiation will immediately give you the formula for $dx$. To get the formulas for $\sin x$ and $\cos x$, observe that $x = 2\arctan y$ and that the expressions you need to compute are $\sin(2\arctan y)$ and $\cos(2\arctan y)$. You can now do as we did in Chapter 0 (that is, either work geometrically or use trigonometric identities).
Chapter 11
Differential equations
Introduction
Differential equations are the main mathematical tool used by scientists to formulate the
laws of nature. In Chapter 7, we discussed the intuition behind differential equations and
how to simulate them. Here, we discuss how to solve them using the tools of Calculus.
Remark 11.1 (Selected problems from previous exams based on this chapter)
(i) $\frac{1}{x\sin x} = \frac{1 + y^2}{y'}$,    (ii) $y' - \frac{y}{x} = \frac{1}{1 + x^2}$, $x > 0$,    (iii) $y'' + y' - 2y = e^x$.
A first order differential equation is said to be separable if it can be written in the form
\[ g(y)\,y' = f(x). \tag{11.1} \]
In particular, both the continuous-time proportional growth model and logistic-growth
model considered in Chapter 7 are of this form. We now describe a method that can be
used to solve them.
Note that for the last line in the above chain of equivalences to hold, $x^2 + C$ must be in the interval $(-\pi/2, \pi/2)$ (do you see why?). However, this restriction is not that important. Indeed, treating $y = \tan(x^2 + C)$ as a guess for the solution, we can see, by checking, that everything is fine whenever $x^2 + C$ is not an integer multiple of $\pi/2$ (again, why?).
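The "checking" can also be done numerically. The sketch below assumes that the differential equation of Example 11.2 is $y' = 2x(1 + y^2)$ (this is our reading of the solution $y = \tan(x^2 + C)$; the equation itself is not restated here), and compares a difference quotient of the guess with the right-hand side. The value of `C` and the sample points are arbitrary choices of ours:

```python
import math

C = 0.3  # arbitrary constant, chosen only for illustration

def y(x):
    return math.tan(x * x + C)

def right_hand_side(x):
    # Assumed form of the equation in Example 11.2: y' = 2x(1 + y^2)
    return 2 * x * (1 + y(x) ** 2)

h = 1e-6
max_residual = 0.0
# The last point has x^2 + C outside (-pi/2, pi/2), but not at a multiple of pi/2
for x in [-0.8, -0.3, 0.0, 0.5, 0.9, 2.0]:
    approx_derivative = (y(x + h) - y(x - h)) / (2 * h)
    max_residual = max(max_residual, abs(approx_derivative - right_hand_side(x)))

print(max_residual)  # should be very close to zero
```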
In the above example, we more or less tried to explain exactly what was going on.
However, since the level of detail may hide the solution strategy, we now try to express
the above solution in a way that (hopefully) reveals the strategy more clearly.
Here, the comment that we made at the end of Example 11.2 also applies.
Exercise 11.4 Use the above method to solve the following differential equations
(and verify that your solutions satisfy the original differential equations!).
Exercise 11.5 Solve the differential equation in the model for continuous-time pro-
portional growth (recall Example 7.22).
For easy reference, here is a brief summary of the above solution method.
Method 11.7 (Solution method for separable first order differential equations)
\begin{align*}
g(y)\,y' = f(x) &\iff g(y)\,\frac{dy}{dx} = f(x) \\
&\iff g(y)\,dy = f(x)\,dx \\
&\iff \int g(y)\,dy = \int f(x)\,dx
\end{align*}
Here, the same remark as we made in the text following Proposition 8.25 applies.
\[ mv' = m \cdot 9.82 - kv \tag{11.4} \]
Exercise 11.8 Explain exactly why this differential equation is of the form (11.3).
We now describe a solution method tailored for these types of differential equations.
\begin{align*}
&\iff xy = -\cos x + C \\
&\iff y = \frac{C - \cos x}{x}.
\end{align*}
1
We explain why we use the word linear when we deal with second order differential equations.
11.1. FIRST ORDER DIFFERENTIAL EQUATIONS 327
Unfortunately, this method has an obvious weakness: what if there is no obvious way
of applying the product rule "backwards"? For instance, how to solve
\[ y' + 2y = \sin x \,? \]
The thing is that there is a miraculous trick that always fixes this problem. That is, we
can always use the product rule backward if we first multiply by a suitable integrating
factor!
Example 11.10 (A first look at integrating factors) We now try to solve the differential equation
\[ y' + 2y = \sin x. \]
Here, it is not immediately clear how to apply the product rule backwards to simplify the left-hand side. But look what happens if we multiply by $e^{2x}$ on both sides:
\[ e^{2x}y' + 2e^{2x}y = e^{2x}\sin x. \]
Suddenly, the left-hand side is the result of a product rule. Indeed, the left-hand side is equal to
\[ \frac{d}{dx}\left(e^{2x}y\right). \]
The rest of the solution is now more or less as in the previous example (but with a harder
integral to compute), and we leave it as an exercise.
Exercise 11.12 Below, we explain how to identify appropriate integrating factors. But
before reading on, try to figure out the integrating factors needed to solve the following
differential equations:
But how to determine the integrating factor? Well, suppose your differential equation is of the form
\[ y' + p(x)y = \sin(x). \]
The trick is to multiply by the factor $e^{P(x)}$, where $P(x)$ is a primitive of $p(x)$. Indeed, the differential equation becomes
\[ e^{P(x)}y' + p(x)e^{P(x)}y = e^{P(x)}\sin(x). \]
You can now verify yourself that the left-hand side is of the form (do this!)
\[ \frac{d}{dx}\left(e^{P(x)}y\right). \]
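To see the integrating factor at work numerically, here is a minimal sketch (not a definitive implementation). It uses the fact that $(e^{P}y)' = e^{P}q$ gives $y(x) = e^{-P(x)}\big(e^{P(0)}y(0) + \int_0^x e^{P(t)}q(t)\,dt\big)$ for $y' + p(x)y = q(x)$, and computes the integral with the trapezoidal rule. The helper name `solve_with_integrating_factor`, the initial value $y(0) = 1$ and the closed-form expression used for comparison are our own assumptions:

```python
import math

def solve_with_integrating_factor(P, q, x_end, y0, steps=2000):
    """Approximate y(x_end) for y' + p(x) y = q(x), y(0) = y0, where P is a primitive of p."""
    h = x_end / steps
    integral = 0.0
    for k in range(steps):
        t0, t1 = k * h, (k + 1) * h
        # Trapezoidal rule for the integral of e^{P(t)} q(t)
        integral += 0.5 * h * (math.exp(P(t0)) * q(t0) + math.exp(P(t1)) * q(t1))
    return math.exp(-P(x_end)) * (math.exp(P(0.0)) * y0 + integral)

# Example 11.10: y' + 2y = sin x, so p(x) = 2 and we may take P(x) = 2x.
approx = solve_with_integrating_factor(lambda x: 2 * x, math.sin, x_end=3.0, y0=1.0)

# Closed form for the same initial value problem, computed by hand (our own computation).
exact = (2 * math.sin(3.0) - math.cos(3.0)) / 5 + (1.0 + 1 / 5) * math.exp(-2 * 3.0)

print(approx, exact)
```

The two printed numbers should agree to several decimals; increasing `steps` makes the agreement better.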
Exercise 11.13 Identify the appropriate integrating factors, and use them to solve:
(a) $y' - \frac{1}{x} \cdot y = x^2 e^x$, $x > 0$    (b) $(1 + x^2) \cdot y' + y = 1$    (c) $y' + xy = x^3$.
Exercise 11.14 (a) Determine the solutions of the ODE
\[ \frac{1}{x^2}y' - \frac{2}{x^3}y + \frac{1}{x^2 + x} = 0, \qquad x > 0. \]
(b) Which of these solutions satisfies the terminal condition $\lim_{x \to \infty} \frac{f(x)}{x^2} = 1$?
Exercise 11.15 (a) Use the method of integrating factors to solve the differential equation for the free fall with air resistance given by the equation
\[ mv' = m \cdot 9.82 - kv. \]
(b) In the initial stages of a skydive, people tend to lie horizontally to increase air resistance, and thus decrease the speed of the fall. The stable velocity at this stage is (according to the web) about 200 km/h. Suppose this is true for Elsa. Determine $k$ by considering a suitable limit of the differential equation in (a).
For easy reference, here is a brief summary of the above solution method.
Note that we are justified in multiplying these equations with $e^{P(x)}$ since this expression is never zero.
We now remark that since at every step of the above description of the solution procedure to first order linear differential equations we have equivalences, it follows that the procedure gives us every solution of the differential equation. We cannot make the same claim about our solution method for separable differential equations since one step, strictly speaking, does not make sense (treating $dy/dx$ as a quotient). We ask you to fix this in the next exercise.
Exercise 11.17 (Challenge) Inspired by the above solution method for first order
linear differential equations, try to rewrite the solution method for separable differ-
ential equations so that it becomes clear that it gives all solutions (on an implicit
form).
Hint: Involve the primitive functions G and F of g and f , respectively.
11.2. SECOND ORDER LINEAR ODES WITH CONSTANT COEFFICIENTS 329
\[ ay'' + by' + cy = g(x) \tag{11.5} \]
where $a, b, c, p, q$ are constants and $g(x), f(x)$ functions. Below, we discuss the physical significance of these equations and study how to solve them (at some point we also discuss why we use the term “linear” here).
Physical significance
Linear second order differential equations give a mathematical model for the oscillation of a mass-spring system. For non-physicists, this may seem like a contrived example, but it cannot be overstated how fundamental this system is to the understanding of many (perhaps all?) phenomena studied by physicists (indeed, everything vibrates, even vacuum!). When interpreted this way, we should consider $x$ to represent time, $y$ the displacement of the spring from its equilibrium point, and the constants $a$, $b$ and $c$ to represent inherent properties of the spring-mass system (see the figure), while $f(x)$ should be thought of as an external force (for example gravity, wind, or a motor that is attached to the system).
Fig. 2. Explanation of the spring-mass system.
As an example, let us consider the initial value problem
\[ \begin{cases} y'' + y' + 4y = 0 \\ y(0) = 1 \\ y'(0) = 0 \end{cases} \]
Here, the mass of the system is $a = 1$, the friction of the spring is $b = 1$, the stiffness is $c = 4$ (we ignore units) and there is no external force. The two initial conditions, both at time $x = 0$, say that the object has an initial displacement of $y = 1$ and speed $y' = 0$.
Fig. 3. The initial condition $y(0) = 1$ essentially means that we push the mass up one unit, and then release at time $x = 0$. The condition $y'(0) = 0$ means that there is no initial speed.
Note that there are two initial conditions above. In general, you should impose the same number of initial conditions as the order of the differential equation. To get a feeling for why, recall that when solving $y'' = 10$, you integrate twice, giving you the solution $y = 5x^2 + Cx + D$. To determine both coefficients, we need two initial conditions. This is morally true for all second order equations (even if the solution methods vary).
Fig. 5. Here is what happens if we change the initial conditions to $y(0) = 0$ and $y'(0) = 2$. This means that we start the motion by giving the object a vertical punch upwards resulting in an initial motion with speed 2 (whatever this means). To the right, we have simulated (with $\Delta x = 0.1$) the effect of such an initial condition on a system with evolution equation $y'' + y' + 4y = 0$. Notice that the dots are more spread at first, which indicates a high initial velocity.
Differential operators
We now start our discussion of how to solve second order differential equations of the
form
\[ y'' + py' + qy = f(x), \qquad p, q \in \mathbb{R}. \tag{11.7} \]
The basic idea is to figure out a way to factorise such second order differential equations into first order linear differential equations of the form
\[ y' + g(x)y = h(x), \]
for which we already have a nice solution method (integrating factors!). To this end, we
first recall that we can express the derivative of, say, a sum as follows:
\[ f'(x) + g'(x) = \frac{d}{dx}\Big(f(x) + g(x)\Big). \]
Now, in the same way, for some function $f = f(x)$, we should be able to write
\[ f'(x) + 2f(x) = \Big(\frac{d}{dx} + \mathbf{2}\Big) f(x). \]
Here, $(d/dx + \mathbf{2})$ is what we call a differential operator. In particular, just as the symbol $d/dx$ is not a function, neither is the symbol $\mathbf{2}$. Instead, both symbols are what we call operators, that is, functions that act on functions (we use boldface font in order to distinguish the operator $\mathbf{2}$ from the number 2). Specifically, the operator $d/dx$ takes a function $f$ as input and gives the function $f'$ as output, and the operator $\mathbf{2}$ takes a function $f$ as input and gives the function $2f$ as output.
Exercise 11.20 Suppose $g$ is some differentiable function, and let $\mathbf{g}$ be the operator of multiplication by $g$. Check whether or not it is true in general that $(d/dx)\mathbf{g} = \mathbf{g}(d/dx)$.
\[ p(r) = r^2 + pr + q. \]
But what happens if we now replace each occurrence of $r$ by $d/dx$? Well, this gives us a differential operator that acts as follows:
\begin{align*}
\Big(\frac{d}{dx} + 1\Big)\Big(\frac{d}{dx} + 2\Big) y &= \Big(\frac{d}{dx} + 1\Big)(y' + 2y) \\
&= \frac{d}{dx}(y' + 2y) + (y' + 2y) \\
&= (y'' + 2y') + (y' + 2y) \\
&= y'' + 3y' + 2y. \tag{11.8}
\end{align*}
That is, by considering the characteristic polynomial, we have arrived at the identity
\[ \Big(\frac{d^2}{dx^2} + 3\frac{d}{dx} + 2\Big) = \Big(\frac{d}{dx} + 1\Big)\Big(\frac{d}{dx} + 2\Big). \]
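The idea that operators are "functions that act on functions" can be made concrete in code. The following sketch (our own illustration; the helper names `derivative`, `add`, `scale` and `d_plus` are hypothetical) represents a polynomial by its list of coefficients and checks the factorisation above on one arbitrarily chosen test polynomial:

```python
def derivative(coeffs):
    # coeffs[k] is the coefficient of x^k; returns the coefficients of the derivative
    return [k * c for k, c in enumerate(coeffs)][1:] or [0]

def add(p, q):
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def scale(c, p):
    return [c * a for a in p]

def d_plus(c):
    # The operator (d/dx + c), represented as a Python function acting on polynomials
    return lambda p: add(derivative(p), scale(c, p))

y = [5, -1, 0, 2]  # y(x) = 5 - x + 2x^3, an arbitrary test polynomial

factored = d_plus(1)(d_plus(2)(y))            # (d/dx + 1)(d/dx + 2) y
expanded = add(add(derivative(derivative(y)),  # y'' + 3y' + 2y
                   scale(3, derivative(y))),
               scale(2, y))

print(factored, expanded)
```

Both lists describe the same polynomial, here $7 + 10x + 18x^2 + 4x^3$. A single test polynomial is of course only an illustration, not a proof of the operator identity.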
\[ y'' + 3y' + 2y = e^x. \]
By Example 11.22, we know that it can be written in the factorised form
\[ \Big(\frac{d}{dx} + 1\Big)\Big(\frac{d}{dx} + 2\Big) y = e^x. \]
But how to proceed? Let us jump out of the example for a while to discuss the strategy. The idea is basically to use the factorised form to express the second order differential equation as a system of two first order differential equations. This can be done as follows:
\[ \begin{cases} y' + 2y = u \\ u' + u = e^x. \end{cases} \]
Conversely, note that if y solves the original differential equation, then y gives a solution
to this system! We formulate this as a theorem.
Proposition 11.25 (Solution theorem for second order linear differential equations) Let $p, q \in \mathbb{R}$, and consider the differential equation
\[ y'' + py' + qy = f(x). \]
Moreover, suppose that the characteristic polynomial can be factored in the form
\[ p(r) = r^2 + pr + q = (r - r_1)(r - r_2). \]
Then $y$ is a solution of the differential equation if and only if it solves the system
\[ \begin{cases} y' - r_2 y = u \\ u' - r_1 u = f(x). \end{cases} \]
Let us illustrate how to solve such a system by continuing our work on the above
example. Notice that by using the theorem, we do not have to mention differential
operators at all (from now on, they do their work in the background!):
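\[ \begin{cases} y' + 2y = u \\ u' + u = e^x. \end{cases} \]
Step 1: We solve the equation for $u$ using the integrating factor $e^x$:
\begin{align*}
u' + u = e^x &\iff u'e^x + ue^x = e^{2x} \\
&\iff (ue^x)' = e^{2x} \\
&\iff ue^x = \tfrac{1}{2}e^{2x} + A \\
&\iff u = \tfrac{1}{2}e^x + Ae^{-x}.
\end{align*}
Step 2: We plug the expression for $u$ into $u = y' + 2y$ to get the differential equation
\[ y' + 2y = \tfrac{1}{2}e^x + Ae^{-x}. \]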
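Solving the equation in Step 2 with the integrating factor $e^{2x}$ leads, by our own computation, to $y = \tfrac{1}{6}e^x + Ae^{-x} + Be^{-2x}$. The sketch below checks this candidate numerically: it verifies both the first order equation from Step 2 and the original second order equation for one arbitrary choice of the constants (the values of `A` and `B` are ours and carry no meaning):

```python
import math

A, B = 1.5, -0.7  # arbitrary constants, chosen only for the check

def y(x):
    return math.exp(x) / 6 + A * math.exp(-x) + B * math.exp(-2 * x)

def dy(x):
    return math.exp(x) / 6 - A * math.exp(-x) - 2 * B * math.exp(-2 * x)

def ddy(x):
    return math.exp(x) / 6 + A * math.exp(-x) + 4 * B * math.exp(-2 * x)

max_residual = 0.0
for x in [-1.0, 0.0, 0.5, 2.0]:
    u = math.exp(x) / 2 + A * math.exp(-x)               # u from Step 1
    first_order = dy(x) + 2 * y(x) - u                   # y' + 2y - u
    second_order = ddy(x) + 3 * dy(x) + 2 * y(x) - math.exp(x)  # y'' + 3y' + 2y - e^x
    max_residual = max(max_residual, abs(first_order), abs(second_order))

print(max_residual)  # should be zero up to rounding errors
```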
\[ y'' + py' + qy = f(x), \]
we call the solutions to
\[ y'' + py' + qy = 0 \tag{11.9} \]
its homogeneous solutions. That is, these are the solutions to the spring-mass system when it is not influenced by external forces.
As it turns out, the homogeneous solutions are rather well behaved. In particular, they satisfy the two following propositions$^2$.
Proposition 11.28 Suppose that $y_1, y_2$ are two solutions of (11.9). Then for all $A, B \in \mathbb{C}$, the linear combinations $y = Ay_1 + By_2$ are also solutions to (11.9).
Proposition 11.29 (Structure theorem for homogeneous solutions) For all $p, q \in \mathbb{R}$, the following holds:
(i) If the characteristic polynomial of (11.9) has two different roots (i.e., $r_0 \neq r_1$), then all solutions of (11.9) are given by
\[ y = Ae^{r_0 x} + Be^{r_1 x}, \qquad A, B \text{ constants.} \]
(ii) If the characteristic polynomial of (11.9) has a repeated root (i.e., $r_0 = r_1$), then all solutions of (11.9) are given by
\[ y = (A + Bx)e^{r_0 x}, \qquad A, B \text{ constants.} \]
$^2$ After taking the courses in linear algebra, you will recognise that the first proposition says that the homogeneous solutions of a second order linear differential equation always form a vector space, and the second that this vector space is always of dimension 2.
has complex solutions $y = Ae^{ix} + Be^{-ix}$. Let us see what happens when we combine this with the formula $e^{ix} = \cos(x) + i\sin(x)$ for the complex exponential:
\begin{align*}
y &= Ae^{ix} + Be^{-ix} \\
&= A\big(\cos(x) + i\sin(x)\big) + B\big(\cos(-x) + i\sin(-x)\big) \\
&= \underbrace{(A + B)}_{=C}\cos(x) + \underbrace{(iA - iB)}_{=D}\sin(x) = C\cos(x) + D\sin(x). \tag{11.10}
\end{align*}
That is, it seems that by allowing $C, D$ to be (possibly) complex, we get the solutions on “real form”. But this is not really the case. Indeed, if we solve for the initial conditions $y(0) = 1$ and $y'(0) = 0$, we obtain $C = 1$ and $D = 0$. That is, the constants $C, D$ were real all along; it was the constants $A, B$ that were complex! In particular, $y = \cos x$. (Compare this with the simulation shown in Figure 6!)
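Python's standard `cmath` module lets us play with these constants directly. In the sketch below (the specific values of `A` and `B` are our own choices), the first pair solves the initial value problem above, while the second pair shows genuinely complex $A, B$ producing real $C, D$:

```python
import cmath
import math

def real_form_constants(A, B):
    # C and D as defined in (11.10)
    return A + B, 1j * A - 1j * B

# Initial conditions y(0) = 1, y'(0) = 0 give A + B = 1 and iA - iB = 0, so A = B = 1/2.
A, B = 0.5, 0.5
C, D = real_form_constants(A, B)

max_error = 0.0
for x in [-2.0, 0.0, 0.7, 3.0]:
    y = A * cmath.exp(1j * x) + B * cmath.exp(-1j * x)
    max_error = max(max_error, abs(y - math.cos(x)))

# A pair of complex constants (conjugates of each other) that still gives real C and D
C2, D2 = real_form_constants(0.5 - 1j, 0.5 + 1j)

print(C, D, max_error, C2, D2)
```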
Exercise 11.35 Express the solutions of parts (c) and (d) of Exercise 11.32 in real form.
Exercise 11.37 Prove this proposition. That is (a) prove that if $y_h$ and $y_p$ are as above, then $y = y_p + y_h$ solves (11.11), and (b) prove that if $y$ and $y_p$ are two solutions of (11.11), then $y - y_p$ is a homogeneous solution.
Hint: The proof is quite similar to that of Proposition 11.28.
We now formulate our most "general structure theorem" for the solutions.
Proposition 11.38 (The general structure theorem) Suppose that $y_p$ is any solution of the differential equation
\[ y'' + py' + qy = f(x). \tag{11.12} \]
Moreover, suppose that $r_1, r_2$ are the (possibly repeated) roots of the polynomial
\[ p(r) = r^2 + pr + q. \]
Then $y$ is a solution of (11.12) if and only if
\[ y = y_p + y_h \]
Moreover, if $p, q$ are real and the roots in Case 1 are complex, then there exist $k, \omega \in \mathbb{R}$ so that $r_1 = k + i\omega$ and $r_2 = k - i\omega$, and we can write
We will explore how to use this theorem on the next couple of pages.
expresses all solutions of the differential equation $y' = f(x)$ there. In particular, $F(x)$
plays the part of the particular solution and C the part of the homogeneous solution.
Notice that in both structure theorems, we get complete information on how the
homogeneous solution looks, but no information on the particular solution. That is, to
find particular solutions we have to study each case separately. Here is a first example:
Example 11.39 (First "guess and check" example, part 1) Let us consider the differential equation
\[ y'' + 3y' + 2y = 2(x^2 + 1). \]
Since the characteristic polynomial is
\[ p(r) = r^2 + 3r + 2 = (r + 1)(r + 2), \]
we know by the structure theorem that the homogeneous solutions are given by
\[ y_h = Ae^{-x} + Be^{-2x}. \]
Now, to find all solutions, we must also determine a particular solution yp . That is, we
seek a function yp so that, when inserted into the above equation, makes the left-hand
side equal to the right-hand side.
Before finding the particular solution, let us make a note on the physical interpreta-
tion of what we are doing. Namely, the inhomogeneous differential equation represents a
physical spring-mass system as explained on page 329 with external force f (x). Below,
we simulate how the system evolves when starting from initial values $y(0) = 1$, $y'(0) = 0$.
Fig. 7. Left: Simulation of the differential equation with $f(x) = 2(x^2 + 1)$. Right: For comparison, the same simulation, but with $f(x) = \sin x$.
In both cases, the solution behaves, after a transitional period, more or less like the external force. This suggests to us the following: Guess that the particular solution is more or less of the same form as the external force!
Example 11.39 (part 2) Since $f(x)$ is a second degree polynomial, we guess that our solution is of the form
\[ y_p = C_2 x^2 + C_1 x + C_0. \]
The point is now to use this guess and insert $y_p, y_p', y_p''$ into the differential equation, and see what this says about the constants $C_0, C_1, C_2$. The calculation is as follows:
\[ y = y_h + y_p = Ae^{-x} + Be^{-2x} + x^2 - 3x + \frac{9}{2}. \]
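The coefficient comparison behind this result can be sketched in a few lines of code. Inserting the guess gives $y_p'' + 3y_p' + 2y_p = 2C_2x^2 + (6C_2 + 2C_1)x + (2C_2 + 3C_1 + 2C_0)$, which must equal $2x^2 + 0 \cdot x + 2$. The snippet below (our own illustration, using exact fractions) solves the resulting triangular system from the top down and then checks the residual at a few points:

```python
from fractions import Fraction

# Right-hand side 2(x^2 + 1) = 2x^2 + 0x + 2
rhs_x2, rhs_x1, rhs_x0 = Fraction(2), Fraction(0), Fraction(2)

C2 = rhs_x2 / 2                      # from 2*C2 = 2
C1 = (rhs_x1 - 6 * C2) / 2           # from 6*C2 + 2*C1 = 0
C0 = (rhs_x0 - 2 * C2 - 3 * C1) / 2  # from 2*C2 + 3*C1 + 2*C0 = 2

def residual(x):
    # y_p'' + 3 y_p' + 2 y_p - 2(x^2 + 1) for the computed coefficients
    yp = C2 * x * x + C1 * x + C0
    dyp = 2 * C2 * x + C1
    ddyp = 2 * C2
    return ddyp + 3 * dyp + 2 * yp - 2 * (x * x + 1)

print(C2, C1, C0)
print([residual(Fraction(x)) for x in (-2, 0, 1, 5)])
```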
Exercise 11.40 Solve the initial value problem (shown in Figure 11.7):
\[ \begin{cases} y'' + 3y' + 2y = 2(x^2 + 1) \\ y(0) = 1 \\ y'(0) = 0 \end{cases} \]
Next, we address the following question: what are natural guesses for other choices of $f(x)$? Well, the thing is that for general $f(x)$, we cannot hope to find solutions (just as we cannot hope to solve $\int f(x)\,dx$ for general $f$). However, there are some specific $f$’s that allow for reasonable guesses:
Here, we provide a list of some choices of f (x) along with natural guesses for particular
solutions:
Moreover, if $f(x)$ is either a sum or product of the above functions, then our guess should be the sum or product, respectively, of the corresponding guesses.
(a) $y'' + 2y' + 4y = 2x + 3$    (b) $y'' + 4y' + 4y = e^x$    (c) $y'' + 6y' + 4y = \cos(x)$
Exercise 11.43 Solve the initial value problem (simulated in Figure 11.7):
\[ \begin{cases} y'' + 3y' + 2y = \sin x \\ y(0) = 1 \\ y'(0) = 0 \end{cases} \]
Remark: When you are done, plot your solution. Does it match the simulation shown
in Figure 7?
Why natural guesses sometimes fail and how to deal with this
Unfortunately, it turns out that even if you make a guess based on Table 11.41, it may
be that it does not work! In the following example, we explain why this happens.
yh = C sin x + D cos x.
But this is identical to the natural guess! In other words, when plugged into (11.13), the
natural guess will make the left-hand side equal to zero, and not sin x.
So, what to do? For inspiration, let us try to solve two homogeneous equations by
using the guess and check method.
But let us suppose we do not know this. Instead, we observe that the first order linear differential equation $3y' + 2y = 0$ has solution $y = Ce^{-(2/3)x}$, and therefore suspect that the solution could be of the form $y = e^{rx}$ (by the structure theorem for homogeneous solutions, we may ignore the constant). To check if this works for some value of $r$, we compute
\[ y' = re^{rx}, \qquad y'' = r^2e^{rx} \]
and plug these expressions into the homogeneous differential equation to get
\begin{align*}
r^2e^{rx} + 3re^{rx} + 2e^{rx} = 0 &\iff e^{rx}(r^2 + 3r + 2) = 0 \\
&\iff r^2 + 3r + 2 = 0.
\end{align*}
That is, we observe that $y = e^{rx}$ is a solution if and only if $r$ is a root of the characteristic polynomial! This should not be surprising, since this gives us the two solutions $y_1 = e^{-x}$ and $y_2 = e^{-2x}$ promised by the structure theorem.
Here is an example where the "guess and check" strategy for finding homogeneous
solutions, indicated above, fails.
\[ r^2 + 2r + 1 = 0. \]
But since this polynomial has a repeated root at $r = -1$, we only find the solution $y_1 = e^{-x}$ in this way. That is, the solution $y_2 = xe^{-x}$ is "missing".
Now, let us return to Example 11.44, and the failure of the natural guess. Note that while we missed the solution $y_2 = xe^{-x}$ in Example 11.47, we do obtain it by multiplying $y_1$ by $x$. Could it be that we can fix failed guesses for finding particular solutions simply by multiplying by $x$? Well, that would just be too good to be true... or?
If we plug the expressions for $y''$ and $y$ into the differential equation $y'' + y = \sin x$, we get the equation
\[ -2D\sin x + 2C\cos x = \sin x. \]
This implies that for the left-hand and right-hand sides to agree for all $x$, we need $D = -1/2$ and $C = 0$. That is,
\[ y_p = -\frac{x}{2}\cos x. \]
Taking the real form of the homogeneous solution into account (recall Example 11.33), we conclude that the general solution is given by
\[ y = y_p + y_h = -\frac{x}{2}\cos x + C\sin x + D\cos x. \]
Success!
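If you want to convince yourself numerically, the following sketch (our own check, using a second order difference quotient with a step `h` of our choosing) compares the natural guess with the modified guess. The natural guess is annihilated by the left-hand side, while $y_p = -\tfrac{x}{2}\cos x$ reproduces $\sin x$:

```python
import math

def second_derivative(f, x, h=1e-4):
    # Central second order difference quotient
    return (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)

def natural_guess(x):
    return 0.3 * math.sin(x) - 1.2 * math.cos(x)  # arbitrary C sin x + D cos x

def modified_guess(x):
    return -(x / 2) * math.cos(x)

max_natural = 0.0   # largest value of |y'' + y| for the natural guess
max_modified = 0.0  # largest value of |y'' + y - sin x| for the modified guess
for x in [-3.0, -1.0, 0.5, 2.0, 5.0]:
    max_natural = max(max_natural,
                      abs(second_derivative(natural_guess, x) + natural_guess(x)))
    max_modified = max(max_modified,
                       abs(second_derivative(modified_guess, x) + modified_guess(x) - math.sin(x)))

print(max_natural, max_modified)  # both should be very close to zero
```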
Remark 11.50 Some students find it frustrating that there is no general method
to solve inhomogeneous second order linear differential equations (as opposed to their
homogeneous counter-parts). Please keep in mind that if we want exact formulas for the
solutions, then just as there is no general strategy to solve all indefinite integrals, there is
no general strategy for solving all second order differential equations. However, if all we
need is to solve the differential equation numerically, then simulations essentially always
work (see Example 11.56 for how to simulate second order differential equations).
y 00 + y = sin(x). (11.14)
yh = A sin x + B cos x.
One important feature of this solution is that no matter the $A$ and $B$, the homogeneous solution will always have the same frequencies. Indeed, it is possible to show that given $A, B$ you can always find $C, D$ such that
\[ A\sin x + B\cos x = C\sin(x + D). \]
Exercise 11.51 Use the addition formula for sin x and Pythagoras to prove this.
That is, the homogeneous solution of (11.14) really dictates the frequency with which
this system wants to oscillate when there is no outside force. Now, what happens when
we impose an external force that oscillates with exactly this frequency (as we are doing
above)? Well, resonance:
Exercise 11.52 Determine the solution to the differential equation in (11.14) with
initial conditions y(0) = 1 and y 0 (0) = 0. Does it match the simulation shown above?
Proposition 11.55 (The structure theorem) Suppose that $y_p$ is any solution of the differential equation
\[ y^{(n)} + p_{n-1}y^{(n-1)} + \cdots + p_1y' + p_0y = f(x). \tag{11.15} \]
\begin{align*}
y_h = {} & \underbrace{(A_1 + B_1x + \cdots)}_{\text{degree } m_1 \text{ polynomial}}e^{r_1x} + \underbrace{(A_2 + B_2x + \cdots)}_{\text{degree } m_2 \text{ polynomial}}e^{r_2x} \\
& + \cdots + \underbrace{(A_k + B_kx + \cdots)}_{\text{degree } m_k \text{ polynomial}}e^{r_kx}.
\end{align*}
If all coefficients $p_j$ are real, then any complex roots appear in conjugate pairs having the same multiplicity. In particular, if $r_i, r_j$ form such a pair with multiplicities $m_j = m_i = m$, then there exist $k, \omega \in \mathbb{R}$ so that $r_i = k + i\omega$, $r_j = k - i\omega$ and the corresponding two "groups" of terms from the formula for $y_h$ can be replaced by the single "group" of terms
\[ e^{kx}\Big( \underbrace{(C_1 + C_2x + \cdots)}_{\text{degree } m \text{ polynomial}}\cos(\omega x) + \underbrace{(D_1 + D_2x + \cdots)}_{\text{degree } m \text{ polynomial}}\sin(\omega x) \Big) \]
While we will neither prove nor use this result, we remark that the proof is a straight-
forward, albeit tedious, extension of the proof for the case n = 2.
\[ u' + u = e^x. \tag{11.17} \]
Our strategy is basically this:
Step 1: Choose initial conditions $y(x_0)$ and $y'(x_0)$, and a time step $\Delta x > 0$.
Step 2: Simulate the first order differential equation (11.17) (here, we use (11.16) to translate initial conditions for $y$ into initial conditions for $u$).
Step 3: Simulate the first order differential equation (11.16) (here, we use the simulated values for $u$ from Step 2).
import numpy as np                          # Needed for np.exp

def ddy(x, y, dy):                          # y'' expressed through x, y and y'
    return np.exp(x) - 2*y - 3*dy

deltaX = 1/2**3; N = 2**5                   # Stepsize and number of steps
X = [0 + n*deltaX for n in range(0, N+1)]   # X values starting at 0
Y = [1]; DY = [0]                           # Initial conditions for Y and DY

for n in range(0, N):                       # Loop simulating Y and DY
    slope = DY[n]
    slope_of_slope = ddy(X[n], Y[n], DY[n])
    Y.append(Y[n] + slope*deltaX)           # Euler step for y
    DY.append(DY[n] + slope_of_slope*deltaX)  # Euler step for y'
Exercise 11.59 Use the above code to simulate $y'' + y = 0$ with initial conditions $y(0) = 1$ and $y'(0) = 0$.
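How much can we trust such a simulation? As a rough experiment (a sketch under our own assumptions, not part of the method above), we can run the same Euler scheme for $y'' + 3y' + 2y = e^x$, $y(0) = 1$, $y'(0) = 0$ with two different step sizes and compare with the exact solution. The formula in `exact` is our own computation based on the structure theorem, with constants fitted to the initial conditions; the helper names are hypothetical:

```python
import math

def ddy(x, y, dy):
    # y'' expressed through x, y and y' for the equation y'' + 3y' + 2y = e^x
    return math.exp(x) - 2 * y - 3 * dy

def simulate(delta_x, n_steps):
    # Euler simulation starting from y(0) = 1, y'(0) = 0; returns the final value of y
    x, y, dy = 0.0, 1.0, 0.0
    for _ in range(n_steps):
        y, dy = y + dy * delta_x, dy + ddy(x, y, dy) * delta_x
        x += delta_x
    return y

def exact(x):
    # Our own hand computation: y = e^x/6 + (3/2)e^{-x} - (2/3)e^{-2x}
    return math.exp(x) / 6 + 1.5 * math.exp(-x) - (2 / 3) * math.exp(-2 * x)

x_end = 4.0
error_coarse = abs(simulate(1 / 8, 32) - exact(x_end))      # step size used in the code above
error_fine = abs(simulate(1 / 1024, 4096) - exact(x_end))   # a much smaller step size

print(error_coarse, error_fine)
```

The error shrinks roughly in proportion to the step size, which is the typical behaviour of this simple simulation scheme.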
11.3. RELEVANT EXERCISES FROM PREVIOUS EXAMS 349
(a) Show that if $y_p$ and $\tilde{y}_p$ are two solutions of this equation, then their difference solves the homogeneous equation.
(b) Find all solutions of the differential equation. Comment on the relevance of what
you did in part (a).
Exercise 11.62 (Exam 2015-08-17, Problem 3) Solve the initial value problem
\[ \begin{cases} (4 + e^{2x})y' = ye^x \\ y(0) = 1 \end{cases} \]
Exercise 11.65 (Exam 2014-12-18, 3) Determine all solutions to the following differential equations.
(a) $y' + \frac{y}{x} = \arctan(x)$,    (b) $(1 + \cos x)y' = (1 + y^2)\tan x$.
\[ xy' + y = x^3, \qquad x \in \,]0, \infty[. \]
Exercise 11.70 (Exam 2013-12-18, Problem 2) Solve the initial value problem
\[ \begin{cases} y' + \dfrac{y}{x} = 1, \\ y(1) = 2 \end{cases} \]
Exercise 11.75 (Exam 2012-05-28, Problem 2) Solve the initial value problem
\[ \begin{cases} xy' = y + xy \\ y(1) = 1 \end{cases} \]
11.3. ANSWERS TO SELECTED EXERCISES 351
11.8 It can be rewritten as $v' + p(x)v = q(x)$ with $p(x) = k/m$ and $q(x) = 9.82$ (so, both $p(x)$ and $q(x)$ are constants).
11.12 The integrating factors are (a) $e^{3x}$, (b) $e^{(x^2)}$.
11.15 (a) $v = (m/k) \cdot 9.82 + Ce^{-kx/m}$, (b) assuming Elsa weighs $m = 100$ kg, then $k = 4.91$ (we ignore the units).
11.18 (b) Underdamped (b < 0), harmonic (b = 0), damped (0 < b < 2), critically
damped (b = 2), and overdamped (b > 2).
11.30 For $y = Ay_1 + By_2$, compute $y'$ and $y''$. Plug this into the differential equation $ay'' + by' + cy = 0$, and observe that the left-hand side becomes equal to the right-hand side (you need to use the fact that both $y_1$ and $y_2$ satisfy this differential equation on their own).
11.40 $y = -4e^{-x} + (1/2)e^{-2x} + x^2 - 3x + 9/2$.
11.42 (a) $y = x/2 + 1/2 + e^{-x}\big(C\sin(\sqrt{3}x) + D\cos(\sqrt{3}x)\big)$, (b) $y = e^x/9 + (A + Bx)e^{-2x}$, (c) $y = (2\sin x + \cos x)/15 + Ae^{(-3+\sqrt{5})x} + Be^{(-3-\sqrt{5})x}$.
11.49 (a) The "natural" guess $y_p = Cx^2e^{-x}$ fails, (b) this gives solution $y = Ae^{-x} + Bxe^{-x} + (1/6)x^3e^{-x}$, (c) $y = e^{-x} + xe^{-x} + (1/6)x^3e^{-x}$
11.51 By the addition formula $C\sin(x + D) = C\cos(D)\sin(x) + C\sin(D)\cos(x)$. This gives the relations $A = C\cos(D)$ and $B = C\sin(D)$. This means that $C$ and $D$ are the modulus and angle of the coordinate $(A,B)$. In other words, we can put $C = \sqrt{A^2 + B^2}$, and, say, $D = \arctan(B/A)$.
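In code, the modulus and angle of the point $(A,B)$ are conveniently computed with `math.hypot` and `math.atan2`. Note that using `atan2` instead of $\arctan(B/A)$ is our own choice: the two agree when $A > 0$, but `atan2` also handles $A \le 0$. The following sketch (with a hypothetical helper `amplitude_phase`) checks the identity for a few pairs $(A,B)$:

```python
import math

def amplitude_phase(A, B):
    # Modulus and angle of the point (A, B), so that A = C cos(D) and B = C sin(D)
    C = math.hypot(A, B)
    D = math.atan2(B, A)
    return C, D

max_error = 0.0
for A, B in [(3.0, 4.0), (-1.0, 2.0), (0.0, -1.0)]:
    C, D = amplitude_phase(A, B)
    for x in [-2.0, 0.0, 0.9, 4.0]:
        lhs = A * math.sin(x) + B * math.cos(x)
        rhs = C * math.sin(x + D)
        max_error = max(max_error, abs(lhs - rhs))

print(max_error)  # should be zero up to rounding errors
```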
Introduction
In this chapter, we study the definite integral. A highlight is the proof of the Fundamental
Theorem of Calculus, which explains why definite integration and differentiation are, in
a sense, "opposite" operations.
Remark 12.1 (Selected problems from previous exams based on this chapter)
2. Explain what we mean by an improper integral, and determine whether the following improper integrals converge or diverge:
(a) $\int_2^\infty \frac{dx}{x(\ln x)^2}$    (b) $\int_0^1 \frac{dx}{\sqrt{\sin x}}$
354 CHAPTER 12. THE DEFINITE INTEGRAL
An immediate issue is that we do not actually know if this definition holds for some
continuous functions, or for all continuous functions. Or could it be that it holds for
more than just continuous functions?
But notice the following. If we extend Definition 12.2 in this way, then we would have
\[ \int_0^1 f(x)\,dx = \lim_{N \to \infty}\sum_{k=1}^N f\big(\underbrace{k/N}_{x_k}\big) \cdot \frac{1}{N} = \lim_{N \to \infty}\sum_{k=1}^N 0 \cdot \frac{1}{N} = 0, \]
and
\[ \int_0^{1/\sqrt{2}} f(x)\,dx = \lim_{N \to \infty}\sum_{k=1}^N f\big(\underbrace{k/(\sqrt{2}N)}_{x_k}\big) \cdot \frac{1}{\sqrt{2}N} = \lim_{N \to \infty}\sum_{k=1}^N 1 \cdot \frac{1}{\sqrt{2}N} = \frac{1}{\sqrt{2}}. \]
This is completely unreasonable (since it does not respect inequality (12.1)), and therefore we cannot hope to extend Definition 12.2 naively!
Exercise 12.4 Show that the Dirichlet function is discontinuous at all points in R.
Definition 12.6 (Partitions and mesh size) A finite set of points $\mathcal{P} \subset [a,b]$ is called a partition of the interval $[a,b]$ if we can write $\mathcal{P} = \{x_k\}_{k=0}^N$ with
\[ a = x_0 < x_1 < \cdots < x_N = b. \]
We call the size of the largest "gap" $\Delta x_k = x_k - x_{k-1}$ the mesh size of the partition $\mathcal{P}$, and denote it by $\operatorname{mesh}(\mathcal{P})$ (the notation $\|\mathcal{P}\|$ is also common). Note that unless otherwise stated, we always express a partition in such a way that the above inequalities hold.
Fig. 3. Here we show two Riemann sums for the function $f(x) = \sqrt{1 - x^2}$. The partition used on the right has more points, but both have the same mesh size. Notice that a large mesh size indicates that the Riemann sum is probably bad.
Next, given a partition $\mathcal{P}$, we will allow the height of the rectangle with base $[x_{k-1}, x_k]$ to be computed with respect to any height $f(c_k)$ as long as $c_k \in [x_{k-1}, x_k]$. That is, we define a Riemann sum as follows.
Definition 12.7 (Riemann sums) Let $f$ be a bounded function defined on $[a,b]$. Then the sum
\[ \sum_{k=1}^N f(c_k)\,\Delta x_k, \]
where $\mathcal{P}_N = \{x_k\}_{k=0}^N$ is any partition of $[a,b]$ into $N$ pieces, and $c_k \in [x_{k-1}, x_k]$ for all $k \in \{1, \ldots, N\}$, is called a Riemann sum for $f$.
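To make the definition concrete, here is a small sketch that computes Riemann sums for randomly chosen partitions and points $c_k$. The choice of $f(x) = \sqrt{1 - x^2}$ on $[0,1]$ (whose area is a quarter of the unit disc, $\pi/4$), the helper names and the random seed are our own assumptions for illustration:

```python
import math
import random

def riemann_sum(f, partition, tags):
    # Sum of f(c_k) * (x_k - x_{k-1}) over all subintervals
    return sum(f(c) * (x1 - x0) for x0, x1, c in zip(partition, partition[1:], tags))

def mesh(partition):
    # Size of the largest gap x_k - x_{k-1}
    return max(x1 - x0 for x0, x1 in zip(partition, partition[1:]))

def random_tagged_partition(a, b, n, rng):
    # n subintervals with random interior points, and one random point c_k in each
    interior = sorted(rng.uniform(a, b) for _ in range(n - 1))
    partition = [a] + interior + [b]
    tags = [rng.uniform(x0, x1) for x0, x1 in zip(partition, partition[1:])]
    return partition, tags

f = lambda x: math.sqrt(1 - x * x)
rng = random.Random(0)  # fixed seed, so that the experiment can be repeated

results = []
for n in [10, 100, 1000, 10000]:
    P, c = random_tagged_partition(0.0, 1.0, n, rng)
    results.append((n, mesh(P), riemann_sum(f, P, c)))

for row in results:
    print(row)
print(math.pi / 4)
```

As the mesh size shrinks, the sums should settle near $\pi/4$, regardless of how the points were chosen.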
12.1. THE DEFINITION OF THE DEFINITE INTEGRAL 357
Fig. 4. Illustration of two Riemann sums that lie neither above nor below the graph.
As we see from the figure on the right, when the mesh size is small, it really should
not matter that much which $c_k$ we choose in each subinterval $[x_{k-1}, x_k]$.
has the same limit for all sequences of partitions $P_N$ with mesh size going to 0, as
$N \to \infty$, and all choices $c_k \in [x_{k-1}, x_k]$. Moreover, we call the common value of these
limits the Riemann integral of f over [a,b] and denote it by
\[
\int_a^b f(x)\,dx.
\]
If the above does not hold, we say that f is not Riemann integrable on [a,b].
Note that, above, we are tacitly assuming that $P_N$ partitions [a,b] into N pieces.
While this is rather standard, it is not really necessary: the important thing is that the
mesh size of $P_N$ goes to 0 as $N \to \infty$. Moreover, while both the $c_k$ and $x_k$ depend
on N, we ignore this in the notation, as is customary, to make the expression (12.2) less
scary.
Exercise 12.9 What does this definition say about the Dirichlet function from Ex-
ample 12.3?
358 CHAPTER 12. THE DEFINITE INTEGRAL
Definition 12.10 (Lower and upper Riemann sums) Let f be a bounded function
defined on [a,b], and let $P$ be a partition of [a,b]. For each subinterval $[x_{k-1}, x_k]$, denote
by $m_k$ and $M_k$ the infimum and supremum, respectively, of f over this interval.
We now define the lower and upper Riemann sums of f with respect to $P$ by
\[
L(f,P) = \sum_{k=1}^{N} m_k\,\Delta x_k \quad\text{and}\quad U(f,P) = \sum_{k=1}^{N} M_k\,\Delta x_k.
\]
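As a small illustration of Definition 12.10, the following Python sketch computes lower and upper Riemann sums in the special case of an increasing function, where $m_k = f(x_{k-1})$ and $M_k = f(x_k)$. The restriction to increasing functions and the names `lower_upper_sums_increasing` and `square` are assumptions made only for this example; for a general bounded function the infimum and supremum cannot be read off from the endpoints.

```python
def lower_upper_sums_increasing(f, points):
    """Return (L(f,P), U(f,P)) for an increasing f on the partition `points`."""
    lower = 0.0
    upper = 0.0
    for k in range(1, len(points)):
        width = points[k] - points[k - 1]   # Delta x_k
        lower += f(points[k - 1]) * width   # m_k is attained at the left endpoint
        upper += f(points[k]) * width       # M_k is attained at the right endpoint
    return lower, upper

def square(x):
    return x * x

for N in (10, 100, 1000):
    P = [k / N for k in range(N + 1)]       # uniform partition of [0, 1]
    lower, upper = lower_upper_sums_increasing(square, P)
    print(N, lower, upper, upper - lower)
```

For $f(x) = x^2$ on [0,1] with N equal pieces, the difference between the upper and lower sum telescopes to 1/N, so the two sums squeeze the value 1/3 from both sides as N grows.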
Exercise 12.11 (a) Suppose that f is a bounded function defined on [a,b]. Are the
lower and upper Riemann sums for f always Riemann sums? Consider some
suitable example and explain.
(b) What if we in addition assume that f is continuous?
Lemma 12.14 Suppose that f is Darboux integrable on [a,b]. Then there exists a
unique number A with the property that for all partitions $P, P'$ of [a,b], we have
\[
L(f,P) \le A \le U(f,P'). \tag{12.3}
\]
Proof. Let f be a Darboux integrable function on [a,b]. In exercise 12.15, below, you are
asked to prove that if $P, P'$ are two partitions of [a,b], then
\[
L(f,P) \le U(f,P').
\]
Geometrically, this should be clear, since all lower Riemann sums should lie completely
below the graph of the function, while the upper Riemann sums lie completely above the
graph (however, as Example 12.3 teaches us, we need to tread carefully when studying
the concept of areas below graphs).
For any fixed $P'$, the number $U(f,P')$ is an upper bound for $L(f,P)$ for all partitions $P$ of
[a,b]. Therefore, by the completeness axiom, $\sup_P L(f,P)$ exists. Similarly, $\inf_{P'} U(f,P')$
exists. Moreover, by exercise 1.114, we have that
\[
\sup_P L(f,P) \le \inf_{P'} U(f,P').
\]
Next, we note that if a number A satisfies (12.3), then it must also satisfy
\[
\sup_P L(f,P) \le A \le \inf_{P'} U(f,P'). \tag{12.4}
\]
Fig. 6. The set of values obtained by the $L(f,P)$ and $U(f,P')$ as $P, P'$ run through all
possible partitions of [a,b] are indicated in red and blue, respectively. In particular,
the distance between $A_1$ and $A_2$ should be smaller than the distance between any
two upper and lower Riemann sums.
We now invoke the Darboux integrability of the function f for the first time. This
implies that for all $\varepsilon > 0$, there exists a partition $P$ so that
\[
U(f,P) - L(f,P) < \varepsilon.
\]
Since $B - A > 0$, we can choose $\varepsilon = (B - A)/2$. But this implies that
\[
B - A \le U(f,P) - L(f,P) < \frac{B - A}{2},
\]
which is absurd. Done!
Exercise 12.15 Let $P$ be a partition of [a,b].
(a) Prove that if $P = P'$, then $L(f,P) \le U(f,P')$.
(b) Prove that if you extend $P$ by putting $P' = P \cup \{y\}$, for $y \in [a,b]$, then
\[
L(f,P) \le L(f,P') \quad\text{and}\quad U(f,P') \le U(f,P). \tag{12.5}
\]
(c) Prove, by induction, that if you extend $P$ by putting $P'' = P \cup \{y_1, \dots, y_M\}$, for
$y_j \in [a,b]$, then (12.5) holds with $P'$ replaced by $P''$.
(d) Combine parts (a) and (c) to show that for all partitions $P, P'$ we have
$L(f,P) \le U(f,P')$.
Hint: For (b), see the figure below. In (c), you can use (b) to prove both the base case
and the induction step. In (d), you need to define, in terms of $P$ and $P'$, some third
partition $P''$ to which you can compare both $P$ and $P'$.
Fig. 7. Here is what happens to a lower Riemann sum when we add some extra point
y1 to the partition.
With the above lemma in hand, the following definition makes sense, and explains
what we mean by the Darboux integral.
We immediately note that most of the functions we encounter in these lecture notes
are smooth (such as polynomials, rational functions, the exponential function and the
logarithm, as well as the trigonometric functions). However, there are plenty of non-
smooth functions. Even elementary functions can fail to be smooth. A basic example is
the absolute value function (which is not differentiable at x = 0).
Exercise 12.18 Prove that if f is smooth on an interval [a,b], then there exists a
constant M so that $|f(x_1) - f(x_0)| \le M|x_1 - x_0|$ for all $x_1, x_0 \in [a,b]$.
Hint: The proof is short and sweet, but uses some of the main results of the course.
Proof. By the definition, we need to check that for all $\varepsilon > 0$, there exists a partition $P$ of
[a,b] so that $U(f,P) - L(f,P) < \varepsilon$. To this end, we first let $\varepsilon > 0$ be some fixed number.
Next, we study the expression $U(f,P) - L(f,P)$. To obtain a more explicit formula,
we denote the points in the partition $P$ by $a = x_0 < x_1 < \dots < x_N = b$. Moreover, since
f is continuous, it follows by the Min-Max theorem that the infimum and supremum
of f on each of the subintervals $[x_{k-1}, x_k]$ exist and are attained at points $\ell_k$ and $u_k$,
respectively. In particular, this means that the formulas for the lower and upper Riemann
sums, respectively, can be expressed as
\[
L(f,P) = \sum_{k=1}^{N} f(\ell_k)\,\Delta x_k \quad\text{and}\quad U(f,P) = \sum_{k=1}^{N} f(u_k)\,\Delta x_k,
\]
and therefore
\[
U(f,P) - L(f,P) = \sum_{k=1}^{N} \big(f(u_k) - f(\ell_k)\big)\,\Delta x_k.
\]
Next, we invoke exercise 12.18, which says that there exists some M > 0 so that
\[
|f(u_k) - f(\ell_k)| \le M\,|u_k - \ell_k| \le \Delta x_k\, M \le \mathrm{mesh}(P)\, M.
\]
Here, in the second step, we used that, for each k, the points $u_k, \ell_k$ are from the same
subinterval $[x_{k-1}, x_k]$, and in the final step, we used that the largest $\Delta x_k$ in a partition
is called its mesh size mesh(P).
Inserting this into the expression for $U(f,P) - L(f,P)$, we obtain the inequality
\[
|U(f,P) - L(f,P)| \le \mathrm{mesh}(P)\, M \sum_{k=1}^{N} \Delta x_k = \mathrm{mesh}(P)\, M\,(b - a).
\]
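The inequality above can be checked numerically. The following Python sketch does this for $f(x) = x^2$ on [0,2], where $|f'(x)| \le 4$, so that M = 4 is an admissible constant in exercise 12.18. The function name `gap_and_bound`, the random partition and the seed are assumptions chosen purely for illustration; since f is increasing on [0,2], the infimum and supremum on each subinterval are attained at the endpoints.

```python
import random

def gap_and_bound(points, M):
    """Return U(f,P) - L(f,P) for f(x) = x^2 and the bound mesh(P) * M * (b - a)."""
    gap = 0.0
    mesh = 0.0
    for k in range(1, len(points)):
        width = points[k] - points[k - 1]
        # f is increasing on [0, 2]: f(u_k) - f(l_k) = x_k^2 - x_{k-1}^2
        gap += (points[k] ** 2 - points[k - 1] ** 2) * width
        mesh = max(mesh, width)
    bound = mesh * M * (points[-1] - points[0])
    return gap, bound

random.seed(0)  # fixed seed, only so that the run is reproducible
interior = sorted(random.uniform(0.0, 2.0) for _ in range(50))
P = [0.0] + interior + [2.0]
gap, bound = gap_and_bound(P, M=4.0)
print(gap, bound, gap <= bound)
```

In particular, any partition with mesh size below $\varepsilon / (M(b - a))$ gives a gap smaller than $\varepsilon$.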
Proof. The idea of this proof is essentially to push the techniques used in the proof of
Proposition 12.19 just a little bit further. Indeed, for each $N \in \mathbb{N}$, let $P_N = \{x_k\}_{k=0}^{N}$
and $\{c_k\}_{k=0}^{N}$ be as in the hypothesis.
First, we note that by Proposition 12.19, f is Darboux integrable on [a,b]. We denote
the value of its Darboux integral on [a,b] by A (this A is then the same as the one in
Lemma 12.12.3).
Next, let $f(\ell_k)$ and $f(u_k)$ denote the smallest and largest value of f on $[x_{k-1}, x_k]$.
This means that for all $c_k \in [x_{k-1}, x_k]$, we have $f(\ell_k) \le f(c_k) \le f(u_k)$. In particular,
As we mentioned above, a problem with developing a theory for the definite integral
for smooth functions is that it will not apply to, for instance, the absolute value function.
However, this can be easily fixed by noticing that the absolute value function is what we
call piece-wise smooth. We explain what we mean by this in the following example.
Exercise 12.23 (a) Define what it means for a function to be piece-wise smooth.
(b) Prove that piece-wise smooth functions are Darboux integrable.
Remark: This definition of piece-wise smooth is not entirely standard. Usually,
one also asks for the function to be continuous at the "break points".
If you re-read the proof of Proposition 12.20, you will notice that the only point
where we used the hypothesis that f is smooth, is when we obtain, for all k, the estimate
\[
|f(u_k) - f(\ell_k)| \le \mathrm{mesh}(P)\, M.
\]
The point of this estimate is that by choosing mesh(P) small enough, we can get
$|f(u_k) - f(\ell_k)|$ to be as small as we want, for all k.
While this seems like just the thing continuity would imply (i.e., $|u_k - \ell_k|$ small
implies $|f(u_k) - f(\ell_k)|$ small), unfortunately, the fact that we need to get this "for all
k" causes problems. To explain why this is, we recall the definition of continuity.
For all $\varepsilon > 0$ there exists $\delta > 0$ such that for all $y \in I$ we have
So why is using continuity problematic? Well, given a partition of [a,b], then to know
if $|u_k - \ell_k|$ small implies $|f(u_k) - f(\ell_k)|$ small, we would have to apply the definition
of continuity to each point $\ell_k$. For each k, this would give a different value of $\delta_k$, and
we would have no guarantee that these $\delta_k$'s are large enough for $|u_k - \ell_k| < \delta_k$ to hold
for all (or even any) k. Potentially, we could respond to this by making the partition
finer (thus moving the $u_k$'s and $\ell_k$'s closer together), but this would result in a new
set of $u_k$'s, and therefore also new $\delta_k$'s, leaving us back at square one.
To be able to deal with this, we need another type of continuity.
For all $\varepsilon > 0$, there exists $\delta > 0$ such that for all $x, y \in I$ we have
Here, an important detail is to notice that while usual continuity (see footnote 1) is always
relative to some specific point, uniform continuity is always relative to an interval.
Example 12.27 (a) The function $f(x) = x^2$ is not uniformly continuous on $\mathbb{R}$. Indeed,
let us show that the definition fails for $\varepsilon = 1$ by finding two points that are arbitrarily
close together, but such that $|f(x) - f(y)| \ge 1$. To this end, let $\delta > 0$ be some arbitrary,
but fixed number, and consider the points $x + \delta$ and $x$, for some $x > 0$ to be decided.
With this, we obtain
\[
|f(x + \delta) - f(x)| = |(x + \delta)^2 - x^2| = 2x\delta + \delta^2.
\]
For $x_0 \in [0,10]$, the smallest value for $\delta$ prescribed by this formula would be
$\min\{1, \varepsilon/21\}$.
Since this value is non-zero, and works for all $x_0 \in [0,10]$, we are done.
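The failure of uniform continuity for $f(x) = x^2$ on $\mathbb{R}$ in part (a) can also be observed numerically. In the Python sketch below, the helper `witness` and the particular choice $x = 1/(2\delta)$ are our own assumptions: with this choice, $2x\delta + \delta^2 = 1 + \delta^2 \ge 1$, no matter how small $\delta$ is.

```python
def witness(delta):
    """Return x > 0 with |f(x + delta) - f(x)| >= 1 for f(x) = x^2."""
    return 1.0 / (2.0 * delta)  # then 2*x*delta + delta**2 = 1 + delta**2

gaps = []
for delta in (0.1, 0.01, 0.001):
    x = witness(delta)
    gap = (x + delta) ** 2 - x ** 2  # equals 2*x*delta + delta**2
    gaps.append(gap)
    print(delta, x, gap)
```

The smaller $\delta$ is, the further out on the real line we have to go to find the two points, which is exactly why the same argument does not work on a bounded interval such as [0,10].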
Exercise 12.28 (a) Show that f (x) = 1/x is not uniformly continuous on (0,1].
(b) Show that f (x) = 1/x is uniformly continuous on [1/2,1].
Exercise 12.29 Show that smooth functions are uniformly continuous on intervals
[a,b]. Hint: Recall exercise 12.18.
Exercise 12.30 Show that if f is uniformly continuous, then for all $\varepsilon > 0$ there exists a
$\delta > 0$ so that for all partitions with mesh size mesh(P) < $\delta$, we have $U(f,P) - L(f,P) < \varepsilon$.
Explain why this proves Proposition 12.20 with "smooth" replaced by "uniformly
continuous".
Remark: This exercise seems challenging at first, but once you solve it, you will see that
you needed to do almost nothing (since the concept of uniform continuity essentially is
tailored for the task).
Footnote 1: When also discussing uniform continuity, we sometimes call the usual form of
continuity "pointwise continuity".
In light of exercise 12.30, the following result immediately implies Proposition 12.24.
Exercise 12.32 Use the following three rules for forming negations to deduce the above
negation of uniform continuity:
(i) The negation of "for all x then A holds" is "there exists x such that A is false".
(ii) The negation of "there exists x such that A holds" is "for all x then A is false".
(iii) The negation of "for all x we have $A \implies B$" is "there exists x so that A and
the negation of B both hold".
Exercise 12.33 There is no reason why the sequences $x_n$ and $y_n$ should converge in
the above argument. Use Bolzano-Weierstrass to fix this, and explain why the limits
are the same.
Hint: It is a bad idea to apply Bolzano-Weierstrass separately to $x_n$ and $y_n$.
12.2. BASIC RULEBOOK FOR THE DEFINITE INTEGRAL 367
Definition 12.34 Suppose that f is a Riemann integrable function on [a,b] and that
$c \in [a,b]$. Then we define:
(i) $\int_c^c f(x)\,dx = 0$
(ii) $\int_b^a f(x)\,dx = -\int_a^b f(x)\,dx$
Proposition 12.35 (Basic rulebook for the definite integral) Suppose that f, g, h
are Riemann integrable on [a,b], $c \in (a,b)$ and $k \in \mathbb{R}$. Then we have
(iii) $\int_a^b dx = b - a$,
(iv) $\int_a^c f(x)\,dx + \int_c^b f(x)\,dx = \int_a^b f(x)\,dx$,
(v) $\int_a^b \big(f(x) + g(x)\big)\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx$,
(vi) $\int_a^b k f(x)\,dx = k \int_a^b f(x)\,dx$,
(vii) $f(x) \le g(x)$ on [a,b] $\implies \int_a^b f(x)\,dx \le \int_a^b g(x)\,dx$.
Finally, we have the triangle inequality for definite integrals: if a < b, then
(viii) $\Big|\int_a^b f(x)\,dx\Big| \le \int_a^b |f(x)|\,dx$.
Proof of rule (iv). For each $N \in \mathbb{N}$, we partition [a,c] and [c,b] into N equally long
subintervals. We denote these partitions by $\{x_0, x_1, \dots, x_N\}$ and $\{y_0, y_1, \dots, y_N\}$, respectively
(in particular, this means that $x_N = y_0$). Since the mesh sizes of both partitions tend
to zero as N grows, it follows by Definition 12.8 that
\[
\int_a^c f(x)\,dx = \lim_{N\to\infty} \sum_{k=1}^{N} f(x_k)\,\Delta x_k, \quad\text{and}\quad
\int_c^b f(x)\,dx = \lim_{N\to\infty} \sum_{k=1}^{N} f(y_k)\,\Delta y_k.
\]
Exercise 12.36 The simplest rule is rule (iii). Show that it follows almost immediately
from Definition 12.8.
Exercise 12.37 Use Definition 12.8 to prove rules (v), (vi) and (vii). (The arguments
are all minor variations of the one used to prove rule (iv).)
Exercise 12.38 (Discussion) Although we claimed that rules (i) and (ii) do not
really fit into our framework, it is possible to modify what we mean by a partition so
that we can give proofs for these rules.
If we know that the triangle inequality holds for the sum of two numbers, then it
can almost immediately be extended to a sum of three numbers as follows:
\[
|a + b + c| = |(a + b) + c| \le |a + b| + |c| \le |a| + |b| + |c|.
\]
Notice that in both of the inequalities in this computation, we only used the usual
triangle inequality (12.6).
(b) Use Definition 12.8, in combination with (a), to prove rule (viii), namely that
\[
\Big|\int_a^b f(t)\,dt\Big| \le \int_a^b |f(t)|\,dt.
\]
Proposition 12.40 (Mean Value Theorem for Integrals) Suppose that f is
continuous on [a,b]. Then there exists a $c \in [a,b]$ so that
\[
f(c)(b - a) = \int_a^b f(x)\,dx.
\]
Before we begin the proof, let us think about what this result actually says. Now,
by the Min-Max Theorem, there exist points $\ell, u \in [a,b]$ so that $f(\ell)$ and $f(u)$ are the
smallest and largest values of f on [a,b]. Consider the following figure:
Fig. 10. The red rectangle has area $f(\ell)(b - a)$, the blue rectangle has area $f(u)(b - a)$.
The Mean Value Theorem for Integrals just says that there has to be some "average"
height $f(c)$ so that the rectangle with area $f(c)(b - a)$ is exactly equal to the green
area under the graph on the interval [a,b].
Proof of the Mean Value Theorem for Integrals. As discussed above, let $f(\ell)$ and $f(u)$
be the minimum and maximum of f on [a,b], respectively. This gives us the inequalities
Since the values $f(\ell)$ and $f(u)$ do not depend on x, we can use rule (vi) to rewrite
these inequalities as
\[
f(\ell) \int_a^b dx \le \int_a^b f(x)\,dx \le f(u) \int_a^b dx,
\]
Notice that the value which we label by s is between $f(\ell)$ and $f(u)$. This means that by
the Intermediate Value Theorem (or more specifically, exercise 9.10), there must exist
some value $c \in [a,b]$ so that $s = f(c)$. This is exactly what we needed to prove!
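To see the Mean Value Theorem for Integrals at work, the following Python sketch locates the point c for $f(x) = x^2$ on [0,1], where the average height is 1/3 and hence $c = 1/\sqrt{3}$. The midpoint rule used to approximate the integral, the bisection search, and the names `midpoint_integral` and `mean_value_point` are illustrative assumptions; the bisection step also assumes that f is increasing, which the theorem itself does not require.

```python
def midpoint_integral(f, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (k + 0.5) * h) for k in range(n)) * h

def mean_value_point(f, a, b, tol=1e-10):
    """Find c in [a, b] with f(c) = s, the average of f (f assumed increasing)."""
    s = midpoint_integral(f, a, b) / (b - a)
    lo, hi = a, b
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < s:
            lo = mid   # f(mid) is below the average, so c lies to the right
        else:
            hi = mid
    return (lo + hi) / 2, s

c, s = mean_value_point(lambda x: x * x, 0.0, 1.0)
print(c, s)
```

The bisection here plays the role that the Intermediate Value Theorem plays in the proof: it only uses that s lies between the smallest and the largest value of f.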
As for the Mean Value Theorem for derivatives, there also exists a Generalised Mean
Value Theorem for Integrals which we shall need in a later chapter. Since the proof is
just a minor variation of the above proof, we formulate this result, and leave the proof
to the exercises.
Exercise 12.42 (a) Prove the Generalised Mean Value Theorem for Integrals by
modifying the proof of the Mean Value Theorem for integrals.
(b) Does the conclusion of the Generalised Mean Value Theorem for Integrals hold
if we replace the assumption that g is positive on [a,b], by the assumption that g
is negative on [a,b]?
Exercise 12.43 It is possible to improve both Mean Value Theorems for Integrals
formulated above, so that in the conclusion we can replace $c \in [a,b]$ by $c \in (a,b)$.
Modify the proof of Proposition 12.40 so that this holds.
Hint: Take a look at the strategy for the proof of Rolle’s theorem.
Above, we have already seen how to both "cheat" and use "brute force" to study
the integral $\int_0^1 \sqrt{1 - x^2}\,dx$. The main point of this section is to explain how and why
the third method works. To this end, we need to establish what some claim is the most
important scientific discovery ever made:
(ii) (the evaluation formula) If F is a primitive function of f on I, then for all $a, b \in I$
we have
\[
\int_a^b f(t)\,dt = F(b) - F(a).
\]
Note that we often use the notation $[F(t)]_a^b$ in place of $F(b) - F(a)$.
Our goal is to show that G(x) is a primitive function of f(x). In other words, we need
to compute the derivative of G(x).
Fig. 12. Illustration of f(t) (the red graph) and G(x) (the green "area function").
Using the basic computational rule (iv) for the definite integral, we begin the
computation of $G'(x)$ as follows:
\[
\begin{aligned}
G'(x) &= \lim_{h\to 0} \frac{G(x + h) - G(x)}{h} \\
&= \lim_{h\to 0} \frac{1}{h}\Big( \int_a^{x+h} f(t)\,dt - \int_a^{x} f(t)\,dt \Big) \\
&= \lim_{h\to 0} \frac{1}{h} \int_x^{x+h} f(t)\,dt.
\end{aligned}
\]
Next, by the Mean Value Theorem for Integrals, we continue the computation by
\[
\dots = \lim_{h\to 0} \frac{1}{h} f(c)(x + h - x) = \lim_{h\to 0} f(c).
\]
Finally, we observe that since $c \in [x, x + h]$, by the squeeze theorem, it follows that $h \to 0$
implies $c \to x$, and so, we complete the computation using the continuity of f:
\[
\dots = \lim_{c\to x} f(c) = f(x).
\]
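The computation above says that the difference quotient of the area function G tends to f(x). The Python sketch below checks this numerically for $f(t) = \cos t$ with a = 0 at the point x = 1. The function name `G`, the midpoint rule used to approximate the integral, and the step sizes are assumptions made for illustration only.

```python
import math

def G(x, a=0.0, n=20_000):
    """Midpoint-rule approximation of G(x), the integral of cos over [a, x]."""
    h = (x - a) / n
    return sum(math.cos(a + (k + 0.5) * h) for k in range(n)) * h

x = 1.0
quotients = []
for h in (0.1, 0.01, 0.001):
    quotient = (G(x + h) - G(x)) / h  # difference quotient of the area function
    quotients.append(quotient)
    print(h, quotient, math.cos(x))
```

As h shrinks, the printed quotients approach $\cos 1$, in line with $G'(x) = f(x)$.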
Proof of part (ii). Fix $a, b \in I$ and some primitive function of f on I that we call F (we
emphasize that at the moment, we know nothing about F, except that $F' = f$). Letting
G be as in the first part of the theorem, since also $G' = f$ on I, we already know from
the theory of indefinite integration that there exists a constant C so that
\[
F(x) = G(x) + C = \int_a^x f(t)\,dt + C.
\]
Exercise 12.45 Compute the following integrals by first identifying the relevant
primitive functions (that is, by first computing the relevant indefinite integrals).
(a) $\int_0^4 \sin\sqrt{x}\,dx$   (b) $\int_2^4 \frac{dx}{x^2 - 1}$   (c) $\int_1^e \frac{dx}{x(1 + (\ln x)^2)}$
Exercise 12.46 Compute the following definite integrals by first splitting up the
definite integral using rule (iv) from Proposition 12.35.
(a) $\int_0^{2\pi} |\sin x|\,dx$   (b) $\int_0^3 |x^4 - 16|\,dx$
Exercise 12.47 (From an old exam) What is the largest value taken by the following
function?
\[
g(x) = \int_0^x \frac{1 - t}{1 + t^2}\,dt, \quad x \in \mathbb{R}.
\]
Exercise 12.48 Let F(x) be some primitive function of $f(t) = \arctan(t^2)$. By the
evaluation formula it holds that
\[
\int_{x^2}^{x^4} \arctan(t^2)\,dt = F(x^4) - F(x^2).
\]
Use this to find a formula for the derivative of the integral on the left-hand side with
respect to x (note that you never need to figure out what F actually is).
Exercise 12.49 An alternative proof of the evaluation formula starts as follows: Choose
a partition $P = \{x_0, x_1, \dots, x_n\}$ of [a,b] and add 0 n times by writing
Proposition 12.50 (The change of variable formula) Suppose that g has a
continuous derivative on [a,b], and suppose that f is continuous on the range of g. Then
\[
\int_a^b f(g(t))\,g'(t)\,dt = \int_{g(a)}^{g(b)} f(u)\,du.
\]
Proof. Suppose that F is a primitive function of f. By the chain rule, we know that
\[
f(g(t))\,g'(t) = \frac{d}{dt} F(g(t)).
\]
On the one hand, integrating both sides over the interval [a,b], and applying the
evaluation formula to the right-hand side, we get
\[
\int_a^b f(g(t))\,g'(t)\,dt = \int_a^b \frac{d}{dt} F(g(t))\,dt = F(g(b)) - F(g(a)).
\]
On the other hand, since F is a primitive function of f, the evaluation formula gives
\[
\int_{g(a)}^{g(b)} f(u)\,du = F(g(b)) - F(g(a)).
\]
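The change of variable formula can be sanity-checked numerically. In the Python sketch below we take $g(t) = t^2$ on [0,2] and $f(u) = \cos u$, so that both sides should equal $\sin 4$. The choice of f and g, the midpoint rule and the name `midpoint_integral` are our own assumptions for the sake of the example.

```python
import math

def midpoint_integral(f, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (k + 0.5) * h) for k in range(n)) * h

def g(t):
    return t * t

def g_prime(t):
    return 2 * t

a, b = 0.0, 2.0
left = midpoint_integral(lambda t: math.cos(g(t)) * g_prime(t), a, b)
right = midpoint_integral(math.cos, g(a), g(b))  # limits change to g(a), g(b)
print(left, right, math.sin(4.0))
```

Note how the limits of integration on the right-hand side are g(a) = 0 and g(b) = 4, not a and b.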
Exercise 12.52 (a) Suppose that f is an odd and continuous function on the interval
$[-a,a]$. What does this say about the integral $\int_{-a}^{a} f(x)\,dx$?
Exercise 12.53 Prove Proposition 12.50 by modifying the argument in exercise 12.49.
Proposition 12.54 (The partial integration formula) Suppose that f, g have
continuous derivatives on [a,b]. Then
\[
\int_a^b f'(t)g(t)\,dt = [f(t)g(t)]_a^b - \int_a^b f(t)g'(t)\,dt.
\]
Proof. As in the proof of the partial integration formula for the indefinite integral, we
begin by considering the product rule for the derivative:
\[
(fg)' = f'g + fg'.
\]
Taking the definite integral on both sides of this expression, and applying the evaluation
formula to the left-hand side, we get
\[
f(b)g(b) - f(a)g(a) = \int_a^b f'(t)g(t)\,dt + \int_a^b f(t)g'(t)\,dt,
\]
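As a quick numerical check of the partial integration formula, the Python sketch below uses $f(t) = e^t$ and $g(t) = t$ on [0,1], for which both sides equal 1. The choice of functions, the midpoint rule and the name `midpoint_integral` are assumptions made only for this illustration.

```python
import math

def midpoint_integral(f, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (k + 0.5) * h) for k in range(n)) * h

a, b = 0.0, 1.0
# f(t) = e^t, so f'(t) = e^t; g(t) = t, so g'(t) = 1
lhs = midpoint_integral(lambda t: math.exp(t) * t, a, b)
boundary = math.exp(b) * b - math.exp(a) * a          # [f(t) g(t)] from a to b
rhs = boundary - midpoint_integral(lambda t: math.exp(t) * 1.0, a, b)
print(lhs, rhs)
```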
(a) Compute I1 , I2 , I3 .
(b) Show that the sequence In converges.
Hint: (b) becomes easier when you realise that you do not have to show what the limit
is (however, the limit is possible to determine – so if you want an extra challenge, try
to do also this).
In the following exercises you need to use the full flexibility of Riemann sums to find
formulas describing certain geometric properties of various objects.
Exercise 12.60 (Challenge) In this last exercise, we consider how to compute the
surface area of a function with rotational symmetry. As in the previous exercise, we
think of the object as a thinly sliced baguette.
(a) First, consider the case when the baguette is just a cylinder (that is, f(x) is
constant). Explain why the surface area of the slice denoted by $S_k$ is equal to
$2\pi f(x_k)\,\Delta s_k$, where $\Delta s_k$ is defined as in exercise 12.58.
(b) Next, we consider the more general situation shown in Figure 15, where the slice
is not a cylinder. Use the figure as inspiration, and explain why it follows from
the intermediate value theorem that almost the same formula for $S_k$ holds.
(c) Deduce the following formula for the total surface:
\[
S = \lim_{N\to\infty} \sum_{k=1}^{N} S_k = \int_a^b 2\pi f(x)\sqrt{1 + f'(x)^2}\,dx.
\]
(d) Compute the surface area of $y = \sqrt{x}$, $x \in [0,1]$.
Fig. 15. Left: The thinly sliced baguette. To approximate the area of its "crust",
we think of the function as being a straight line on each slice (just as when we
computed the length of the graph). As before, we denote the length of this line by
$\Delta s_k$. This allows us to approximate the "crust-area" of each slice by the surface
area of a sequence of bands. Right: To compute the area of the blue band, think
of it as being made of rubber (that is flexible). If you keep the width $\Delta s_k$, but
stretch/squish the band so that it becomes cylindrical with the same radius on both
sides, then the total area either gets larger (green) or smaller (red). This is a key
observation to getting a nice formula in part (b) of exercise 12.60.
Remark 12.61 Above, we ask you to compute formulas for length, volume and surface
area without actually defining what we mean by these concepts. While not ideal, we
allow ourselves to be slightly sloppy since we only want to briefly showcase how, in
certain situations, these concepts can be computed in terms of one variable definite
integrals.
For the actual definitions, we refer the interested student to any text on several
variable calculus, where these concepts all belong naturally. For instance, the two-
dimensional analogue of the Darboux-Riemann integral defines what we mean by vol-
ume.
12.4. APPLICATIONS TO GEOMETRY AND ELEMENTARY FUNCTIONS 379
We now ask you to use the theory of the definite integral to deduce some basic
properties of the logarithm.
Exercise 12.63 By only using properties of the definite integral, prove that:
Remark 12.64 Based on the above exercise, the proofs given for the properties of the
exponential function are now correct.
Fig. 16. This is the "infinitely long" area under $1/(1 + x^2)$ represented by the above
integral.
Exercise 12.65 In this exercise, we explain the general idea behind computing "in-
finitely long" areas.
(a) Fix a number c > 0. Compute the integral of $f(x) = 1/(1 + x^2)$ over [0, c].
(b) Take the limit as $c \to \infty$ in the answer from (a). What do we get? Use the above
figure to explain why it makes sense to consider such a limit.
We are so happy with the procedure from the above example that we make the
following definition.
Definition 12.66 Let f be continuous on $[a,\infty)$. Then we define the improper
integral of f on $[a,\infty)$ to be the limit
\[
\int_a^\infty f(x)\,dx = \lim_{c\to\infty} \int_a^c f(x)\,dx.
\]
We say that the integral converges if this limit exists, and that it diverges otherwise.
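To illustrate the definition, the Python sketch below evaluates $\int_0^c \frac{dx}{1 + x^2}$ for increasing values of c, using the primitive function $\arctan x$. The function name `integral_to` and the sample values of c are assumptions chosen for illustration; the printed values approach $\pi/2$.

```python
import math

def integral_to(c):
    """Integral of 1/(1 + x^2) over [0, c], computed via the primitive arctan."""
    return math.atan(c) - math.atan(0.0)

for c in (1.0, 10.0, 100.0, 1e4, 1e6):
    print(c, integral_to(c), math.pi / 2 - integral_to(c))
```

The last column shows the remaining "tail" of the area, which shrinks roughly like 1/c.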
12.5. COMPUTING AND ESTIMATING UNBOUNDED AREAS 381
Definition 12.68 Suppose that f is continuous on (a,b]. Then we define the improper
integral of f over (a,b] to be the limit
\[
\int_a^b f(x)\,dx = \lim_{c\to a^+} \int_c^b f(x)\,dx.
\]
We say that the integral converges if this limit exists, and that it diverges otherwise.
Remark: When $\infty$ is replaced by c, these integrals can all be computed by first finding
primitive functions.
Exercise 12.71 Determine exactly for which $\alpha \in \mathbb{R}$ the following integral diverges and
converges, respectively:
\[
\int_1^\infty \frac{dx}{x^\alpha}
\]
Exercise 12.73 Determine exactly for which $\alpha \in \mathbb{R}$ the following integral diverges and
converges, respectively:
\[
\int_0^1 \frac{dx}{x^\alpha}
\]
Remark 12.74 (i) Note that while there also exist other types of improper integrals,
they are mostly just minor variations of the above ideas. For instance, if f is continuous
on [a,b) with a vertical asymptote at x = b, then to obtain its integral over [a,b), we
should first integrate over [a,c] for some a < c < b, and then take the limit $c \to b^-$.
(ii) However, one variation that requires some care is if an integral is improper for
two (or more) reasons. For instance, this is the case with
\[
\int_0^\infty \frac{dx}{x}. \tag{12.7}
\]
In such cases, we must split up the integral so that each "issue" can be considered separately.
That is, here, we should first make the split
\[
\int_0^\infty \frac{dx}{x} = \int_0^1 \frac{dx}{x} + \int_1^\infty \frac{dx}{x},
\]
and then consider the two integrals on the right-hand side (that are both improper for
one single reason), separately. If both converge, we say that (12.7) converges. If at least
one of them diverges then we say that (12.7) diverges.
Fig. 18. Here, we have plotted the graph of $f(x) = 1/(1 + x^2)$ and indicated the
values $1/(1 + k^2)$ for k = 1, 2, 3, . . . with vertical red line segments topped with a
dot. Next to each vertical line segment, we have placed a rectangle with base length
1. This means that the area of each rectangle is equal to its height, and that the
combined area of all the rectangles is equal to the value of the series $\sum_{k=1}^{\infty} 1/(1 + k^2)$.
The point of the above figure is that the rectangles completely lie under the graph of the
function. In particular, merely using that a function f is continuous and decreasing, the
following computation is justified:
\[
\begin{aligned}
\int_0^N f(x)\,dx &= \int_0^1 f(x)\,dx + \int_1^2 f(x)\,dx + \dots + \int_{N-1}^{N} f(x)\,dx \\
&\ge \int_0^1 f(1)\,dx + \int_1^2 f(2)\,dx + \dots + \int_{N-1}^{N} f(N)\,dx \\
&= f(1) + f(2) + \dots + f(N) = \sum_{k=1}^{N} f(k).
\end{aligned}
\]
Hence, by the Balloon lemma (or rather, the dichotomy for positive series), it follows
that the series (12.8) converges.
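The comparison between partial sums and integrals can be observed directly. The Python sketch below compares $\sum_{k=1}^{N} 1/(1 + k^2)$ with $\int_0^N \frac{dx}{1 + x^2} = \arctan N$, which in turn is bounded by $\pi/2$. The name `partial_sum` and the sample values of N are our own choices for this illustration.

```python
import math

def partial_sum(N):
    """Partial sum of the series with terms 1 / (1 + k^2), k = 1, ..., N."""
    return sum(1.0 / (1.0 + k * k) for k in range(1, N + 1))

rows = []
for N in (1, 10, 100, 1000):
    rows.append((N, partial_sum(N), math.atan(N)))
    print(N, partial_sum(N), math.atan(N), math.pi / 2)
```

In every row the partial sum stays below the integral, so the increasing sequence of partial sums is bounded above.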
Exercise 12.76 You can also justify the connection between the partial sums and the
integral using what we know about Riemann sums. Do this.
Note that above, we were able to bound an infinite series by an integral. The opposite
is also possible. Indeed, consider the following figure:
Fig. 19. Here, we have placed the rectangles to the right of their respective heights
instead of to the left. Their combined area has not changed, but now the rectangles
completely cover a part of the area under the graph.
Exercise 12.77 Suppose that f is a continuous and decreasing function. Use the above
figure as inspiration to prove that
\[
\sum_{k=1}^{N} f(k) \ge \int_1^{N+1} f(x)\,dx.
\]
Exercise 12.78 (a) Use the above exercise to find a new proof of the fact that the
harmonic series diverges (that does not use Oresme's trick).
(b) Use the above techniques to prove that $\sum_{k=1}^{\infty} 1/k^\alpha$ converges if and only if $\alpha > 1$.
Proposition 12.79 (Integral test for infinite series) Suppose that f is positive,
continuous and decreasing on $[a,\infty)$, and let A be any integer such that A > a. Then
(i) $\int_a^\infty f(x)\,dx$ converges $\iff \sum_{k=A}^{\infty} f(k)$ converges,
(ii) $\int_a^\infty f(x)\,dx$ diverges $\iff \sum_{k=A}^{\infty} f(k)$ diverges.
Proof. By what we did on the previous page, the implication $\implies$ of (i) and $\impliedby$ of (ii)
hold. To obtain the remaining implications, we need a Balloon lemma for monotone
functions, in order to get a dichotomy result for improper integrals (cf. Proposition 5.24).
Since these results are established by repeating, essentially word-by-word, the proofs in
the case of infinite sequences and series, respectively, we leave them to the interested
reader.
converges. To apply the integral test, we first need to check whether or not the terms
are decreasing. To this end, we compute the derivative of the function $f(x) = xe^{-x^2/2}$:
\[
f'(x) = e^{-x^2/2} - x^2 e^{-x^2/2} = (1 - x^2)e^{-x^2/2}.
\]
We see that the derivative of f is negative whenever $1 - x^2 < 0$. Since this holds
when x > 1, it follows that the function is decreasing on $[1,\infty)$. By the integral test
(Proposition 12.79), this means that the series converges if and only if the same is true
for
\[
\int_1^\infty xe^{-x^2/2}\,dx.
\]
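To determine whether or not this improper integral converges, we do the following:
\[
\begin{aligned}
\int_1^\infty xe^{-x^2/2}\,dx &= \lim_{c\to\infty} \int_1^c xe^{-x^2/2}\,dx \\
&\overset{u = x^2/2}{=} \lim_{c\to\infty} \int_{1/2}^{c^2/2} e^{-u}\,du \\
&= \lim_{c\to\infty} \big[-e^{-u}\big]_{1/2}^{c^2/2} = \lim_{c\to\infty} \big(e^{-1/2} - e^{-c^2/2}\big) = e^{-1/2}.
\end{aligned}
\]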
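The computation above can be double-checked with a few lines of Python. The sketch below evaluates the closed form $e^{-1/2} - e^{-c^2/2}$ for growing c and also compares a partial sum of $\sum k e^{-k^2/2}$ with the bound $f(1) + \int_1^\infty f(x)\,dx$ that the comparison with rectangles suggests. The names `f` and `integral_1_to`, and the cut-off at 50 terms, are assumptions for the purpose of illustration.

```python
import math

def f(x):
    return x * math.exp(-x * x / 2)

def integral_1_to(c):
    """Integral of f over [1, c], using the primitive -exp(-x^2/2)."""
    return math.exp(-0.5) - math.exp(-c * c / 2)

limit = math.exp(-0.5)
for c in (2.0, 5.0, 10.0):
    print(c, integral_1_to(c), limit)

partial = sum(f(k) for k in range(1, 51))
print(partial, f(1) + limit)  # the partial sum stays below f(1) + integral
```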
Exercise 12.81 Use the integral test to determine whether the following infinite series
converge or not.
(a) $\sum_{k=1}^{\infty} \frac{1}{k}$   (b) $\sum_{k=1}^{\infty} \frac{1}{k^2}$   (c) $\sum_{k=2}^{\infty} \frac{1}{k \ln k}$   (d) $\sum_{k=2}^{\infty} \frac{1}{k(\ln k)^2}$.
Remark: The point is to study these series using the tools we have for integrals.
Exercise 12.82 Use figures similar to figures 18 and 19 to show that for N = 1, 2, 3, . . .
we have
\[
\ln N \le \sum_{k=1}^{N} \frac{1}{k} \le 1 + \ln N.
\]
Hint: If you run into problems at x = 0 in the integral expression, then study the sum
$\sum_{k=2}^{N} 1/k$ instead.
This result remains true if we replace the sense in which the integrals are improper.
Exercise 12.84 Adapt the proof for the comparison test for infinite series to prove the
above proposition.
Fig. 20. Here, a = 1. It should be intuitively clear that if the larger area is finite,
then so is the smaller area. And correspondingly, if the smaller is infinite, then so
is the larger one.
\[
\lim_{x\to\infty} \frac{f(x)}{g(x)} = L,
\]
then
\[
\int_a^\infty f(x)\,dx < \infty \iff \int_a^\infty g(x)\,dx < \infty.
\]
If L = 0 or $L = \infty$, then in each case half of the result holds (can you see which?).
This result remains true if we replace the sense in which the integrals are improper.
Exercise 12.86 Adapt the proof for the limit comparison test for infinite series to
prove the above proposition.
As with series, in order to use these comparison tests, we need something to compare
with. And, just like with series, we call the most useful class of integrals "$\alpha$-integrals".
Exercise 12.88 Prove this proposition (without using the integral test).
If both (I) and (II) converge, then the integral is convergent. If one, or both, of (I) and
(II) diverge, then the integral is divergent.
We first check (I). In much the same way as we used comparison tests for series, we
start out by using our intuition of what happens when $x \approx 0$:
\[
\int_0^1 \frac{x + 2}{\sqrt{x^3 + x^5}}\,dx \approx \int_0^1 \frac{2}{\sqrt{x^3 + 0}}\,dx = \int_0^1 \frac{2}{x^{3/2}}\,dx.
\]
By Proposition 12.87, this indicates that we ought to expect divergence since $\alpha = 3/2 > 1$.
To verify this, we use the limit comparison test (Proposition 12.85):
\[
\lim_{x\to 0^+} \frac{\dfrac{x + 2}{\sqrt{x^3 + x^5}}}{\dfrac{1}{x^{3/2}}}
= \lim_{x\to 0^+} \frac{x^{3/2}(x + 2)}{\sqrt{x^3 + x^5}}
= \lim_{x\to 0^+} \frac{x^{3/2}(x + 2)}{x^{3/2}\sqrt{1 + x^2}}
= \frac{0 + 2}{\sqrt{1 + 0}} = 2.
\]
By the limit comparison test, this means that the integral (I) is divergent. But this
means that the total integral (I) + (II) is divergent, and we are done.
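The limit used in the limit comparison test above is easy to probe numerically. In the Python sketch below, the function name `ratio` and the sample points are illustrative assumptions; the printed ratios approach 2 as x approaches 0 from the right.

```python
import math

def ratio(x):
    """Quotient of the integrand (x+2)/sqrt(x^3+x^5) and the comparison 1/x^(3/2)."""
    integrand = (x + 2) / math.sqrt(x ** 3 + x ** 5)
    comparison = 1 / x ** 1.5
    return integrand / comparison

for x in (1e-1, 1e-2, 1e-4, 1e-6):
    print(x, ratio(x))
```

A finite, non-zero limit is exactly what the limit comparison test needs in order to transfer divergence from the $\alpha$-integral with $\alpha = 3/2$ to the integral (I).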
Exercise 12.90 Determine if (II) from the above example is convergent or not.
Exercise 12.92 For which $\alpha \in \mathbb{R}$ does the following integral converge or diverge:
\[
\int_0^\infty \frac{dx}{\sqrt{x}\,(1 + x^\alpha)}\,?
\]
Exercise 12.93 Motivate why the following integrals converge, and show that for a > 0,
we have
\[
\int_0^\infty x^n e^{-ax}\,dx = \frac{n!}{a^{n+1}}, \quad n = 1, 2, 3, \dots.
\]
Exercise 12.94 Define what it means for an improper integral to (a) converge abso-
lutely, and (b) converge conditionally.
Remark: As we have done above, to keep things reasonable, it is enough to consider the
case of functions that are either unbounded or are defined on an unbounded domain.
Exercise 12.95 Prove that if an improper integral converges absolutely, then it also
converges in the usual sense.
Hint: The proof is essentially, word-by-word, the same as for infinite series.
Exercise 12.96 Determine if the following improper integral converges absolutely,
conditionally or if it diverges.
\[
\int_1^\infty \frac{\sin x}{x^2}\,dx
\]
Exercise 12.99 (Exam 2016-01-23, part of exercise) Determine whether the
following improper integral converges or diverges. If it converges, compute its value.
\[
\int_2^\infty \frac{dx}{x(\ln x)^2}
\]
Exercise 12.100 (Exam 2015-01-08) (a) By comparing the following sum with an
integral, show that
\[
\sum_{n=1}^{N} \arctan(n) \ge N \arctan(N) - \frac{1}{2}\ln(1 + N^2).
\]
with 6= 0.
Exercise 12.102 (Exam 2014-12-18, part of exercise) Explain why the length of the graph of $y = f(x)$ as $x \in [a,b]$ is given by the formula
$$\int_a^b \sqrt{1 + f'(x)^2}\,dx.$$
Exercise 12.103 (Exam 2014-10-04, part of exercise) (a) Suppose that $f$ is continuous on $(0,1]$. Define what we mean when we say that the improper integral $\int_0^1 f(x)\,dx$ is convergent or divergent, respectively.
(b) Determine whether or not the following integral is convergent.
$$\int_0^1 \frac{dx}{\sqrt{\sin x}}.$$
(a) Draw a figure which illustrates the area that this integral represents. Also, without computing the integral, determine its value for $a = 1$.
(b) Show that for $a \in [0,1]$ it holds that $\sin(2\arcsin a) = 2a\sqrt{1-a^2}$. (Hint: The addition formula for the sine.)
(c) Compute the integral by using the change of variables $x = \sin u$. Double-check your answer by using that for $a = 1/2$ the value is supposed to be $\sqrt{3}/8 + \pi/12$.
Exercise 12.105 (Exam 2014-05-26, part of exercise) (a) Explain the difference between definite and indefinite integrals. (b) Solve
$$\int_1^3 \frac{dx}{2 + |x^2 - 2|}.$$
Exercise 12.106 (Exam 2014-05-26, part of exercise) Determine a curve for which the following integral denotes the length:
$$\int_0^{\pi/6} \sqrt{1 + \sin^2(x)}\,dx.$$
Exercise 12.114 (Exam 2013-01-09) Determine all local and global extreme points of the function
$$f(x) = \int_0^x \frac{1-t}{1+t^2}\,dt$$
on $\mathbb{R}$. Illustrate your answer in a sketch.
Exercise 12.117 (Exam 2012-05-28) (a) Let $a > 0$. For which $s > 0$ does the improper integral
$$\int_a^\infty \frac{dx}{x^s}$$
converge? Compute the exact value for the cases when it does converge.
(b) Show (for instance by making a suitable illustration) that
$$\int_2^{N+1} \frac{dx}{x^s} \le \sum_{n=2}^{N} \frac{1}{n^s} \le \int_1^{N} \frac{dx}{x^s}.$$
(c) Determine
$$\lim_{s\to 1^+} (s-1)\sum_{n=1}^{\infty} \frac{1}{n^s}.$$
$$\sum_{k=1}^{N} \frac{1}{\sqrt{k}\,(1+\sqrt{k})} \ge 2\ln\frac{1+\sqrt{N+1}}{2}.$$
$$\sum_{k=1}^{N} \frac{1}{\sqrt{k^2+1}+k} > \frac{1}{2}\ln\frac{2N+1}{3}.$$
Remark: This one is slightly tricky. It is much easier to get the lower bound $(\sqrt{2}+1)^{-1}\ln(N+1)$.
Exercise 12.120 (Lund, May 2016) Choose one of the following two exercises. (Both
give full credit.)
Since the proof of this equality requires techniques that are not included in this
course, you should not attempt to prove it. Instead, determine how large we need
to choose $N$ so that the difference between the partial sum
$$\sum_{k=1}^{N} \frac{1}{k^2}$$
and the series (that is, the error) is less than 1/1000.
(b) The following Python code computes the value of a partial sum of some series.
12.6. RELEVANT EXERCISES FROM PREVIOUS EXAMS 393
    N = 100
    S = 0
    for n in range(0, N):
        S = S + n / (n**4 + 1) ** (1/2)
    print(S)
(i) In mathematical notation, write down the partial sum and series.
(ii) In case the series is convergent, explain how many terms are needed for the
partial sum to approximate the value of the series with an error of at most 1/1000.
In case the series is divergent, explain how many terms are needed for the partial
sum to be larger than 1000.
12.37 To prove (iv), choose a sequence of partitions $P_N$ with mesh size going to 0. This
yields a chain of equalities:
$$\int_a^b \big(f(x)+g(x)\big)\,dx = \lim_{N\to\infty}\sum_{k=1}^{N}\big(f(x_k)+g(x_k)\big)\Delta x_k = \cdots = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx.$$
(d) No.
12.39 (a) The main point in the induction step is to notice that
$$\sum_{k=1}^{N} a_k = \sum_{k=1}^{N-1} a_k + a_N.$$
(b) This argument is now very similar to those from exercise 12.37. (c)
12.45 (a) 2 sin(2) 4 cos(2) (see Example 10.30), (b) (1/2) ln(9/5) (see also Example
10.33), (c) $\pi/4$ (use substitution $u = \ln(x)$).
12.47 The largest value is $\pi/4 - (\log 2)/2$ (to see why this is, use the Fundamental theorem
of calculus to make a table of signs for $g'(x)$).
12.48
$$\frac{d}{dx}\int_{x^2}^{x^4} \arctan(t^2)\,dt = 4x^3\arctan(x^8) - 2x\arctan(x^4).$$
12.49 Additional hint: After applying the Mean Value Theorem, the expression should
be exactly on the form required by Definition 12.12.2.
12.51 (a) $\log(7/4)$ (hint: use the double angle formula), (b) $\pi^2/4 - 2$ (hint: partial
integration), (c) $\pi/2 - 1$ (hint: substitute $u = \sqrt{x}$).
12.7. ANSWERS TO SELECTED EXERCISES 395
12.52 (a) 0 (compute this by expressing the integral as a limit of Riemann sums, and
then use $f(-x) = -f(x)$), (b) 0.
12.58 (a) $\Delta s_k = \sqrt{(\Delta x_k)^2 + (\Delta f_k)^2}$, (b) $\Delta f_k = \Delta x_k \cdot f'(x_k)$, (c) the point is to take the
limit of the Riemann sums $\sum \Delta s_k = \sum \sqrt{1 + f'(c_k)^2}\,\Delta x_k$, (d) $(1/27)(13\sqrt{13} - 8) \approx 1.439...$
12.59 (a) $\Delta V_k = \pi f(x_k)^2 \cdot \Delta x_k$ (this is the formula for the volume of a cylinder with
"radius" $f(x_k)$ and thickness $\Delta x_k$), (b) the point is to take the limit of the Riemann
sums $\sum \Delta V_k = \sum \pi f(x_k)^2\,\Delta x_k$, (c) $(4/3)\pi R^3$.
12.60 (a) this follows from the standard formula for the surface area of the side of a
cylinder (google it), (b) write up the expression for the surface areas of the sides
of the red and green cylinders, respectively. By the intermediate value theorem, the
surface area of the side of the blue object has to be $\Delta S_k = 2\pi f(c_k)\,\Delta s_k$ for some
$c_k \in [x_{k-1}, x_k]$, (c) the point is to take the limit of the Riemann sum
$\sum \Delta S_k = \sum 2\pi f(c_k)\,\Delta s_k = \sum 2\pi f(c_k)\sqrt{1 + f'(d_k)^2}\,\Delta x_k$ where $c_k, d_k \in [x_{k-1}, x_k]$. (Note that
here we have a problem since we are dealing with both $c_k$'s and $d_k$'s: you can
choose to either ignore this, or you can take the extra challenge and try to figure
out how to deal with this! Hint: it can be dealt with in a pretty naive way.) (d)
$(\pi/6)(5^{3/2} - 1)$.
12.71 By computing the improper integral, we see that it converges for all $\alpha > 1$ and
diverges for all $\alpha \le 1$ (you need to consider the cases $\alpha \ne 1$ and $\alpha = 1$ separately).
12.67 (a) $2(1 - \sqrt{c})$, (b) 2.
12.73 Converges for all $\alpha < 1$ and diverges for all $\alpha \ge 1$ (be sure to compare this to the
result of exercise 12.71).
12.92 Converges for all $\alpha > 1/2$ (be sure to take into account that this integral is improper
for two reasons, and therefore needs to be split up).
Taylor polynomials
Introduction
We have now come to the final chapter of these lecture notes, where we combine most
of what we have learned so far in the study of Taylor polynomials.
398 CHAPTER 13. TAYLOR POLYNOMIALS
$$T_3(x) = x - \frac{x^3}{3!},$$
$$T_5(x) = x - \frac{x^3}{3!} + \frac{x^5}{5!}.$$
Note that T1 is identical to the tangent line of sin(x) at x = 0, while T3 and T5 are
what we ought to call the "tangent cube" and "tangent quintics", respectively, of sin x
at x = 0. For fun, below, we illustrate how T37 approximates sin x.
Fig. 2. Comparison of $y = \sin x$ and $y = T_{37}(x) = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots + \frac{x^{37}}{37!}$.
Remark 13.2 (Python code for Taylor polynomials of the sine function)
    import math as m
    def T(k, x):  # Returns the value of the Taylor polynomial of order n = 2k+1.
        # The terms j = 0, ..., k are included, so the highest power of x is 2k+1.
        C = [(-1)**j * x**(2*j+1) / m.factorial(2*j+1) for j in range(0, k+1)]
        return sum(C)
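As a usage sketch for the idea in Remark 13.2, the snippet below tabulates how the error shrinks as the order grows. The function name `taylor_sin` and the sample point `x = 2.0` are our own choices for illustration and are not part of the notes.

```python
import math

def taylor_sin(k, x):
    """Taylor polynomial of sin, centred at 0, of order 2k+1, evaluated at x."""
    return sum((-1)**j * x**(2*j + 1) / math.factorial(2*j + 1)
               for j in range(k + 1))

x = 2.0  # sample point (an arbitrary choice)
for k in range(6):
    approx = taylor_sin(k, x)
    error = abs(approx - math.sin(x))
    print(f"order {2*k + 1:2d}: T = {approx:.10f}, error = {error:.2e}")
```

The same loop, run at a different point and with a stopping criterion on the error, is one way to approach questions about how many terms are needed for a given accuracy.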
Exercise 13.3 How large does n have to be in the Taylor polynomials for sin x cen-
tered at x = 0 for Tn (1) to match 4 digits of sin(1)? (Use the code in Remark 13.2.)
13.1. A FIRST LOOK AT TAYLOR POLYNOMIALS 399
Example 13.4 We now indicate how to use Taylor polynomials for $\sin x$ to compute
the standard limit
$$\lim_{x\to 0} \frac{\sin x}{x}.$$
What could be simpler than this? The problem with this computation, of course, is
that $\sin x$ is not actually equal to $x$. Expressed more correctly, the above computation
is actually
$$\lim_{x\to 0}\frac{\sin x}{x} = \lim_{x\to 0}\frac{T_1(x) + \text{error}}{x} = \lim_{x\to 0}\frac{T_1(x)}{x} + \lim_{x\to 0}\frac{\text{error}}{x} = 1 + \lim_{x\to 0}\frac{\text{error}}{x}.$$
For this reason, to make the computation work, we need to prove that the error term
goes to 0 faster than $x$ as $x \to 0$. Understanding the error we make when we replace
functions by their Taylor polynomials is one of the main goals of this chapter.
Fig. 4. The function $e^x$ together with its Taylor polynomials $T_1$, $T_2$ and $T_3$.
$$T_2(x) = 1 + x + \frac{x^2}{2!}$$
$$T_3(x) = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!}$$
and so forth. For instance, by plugging $x = -t^2/2$ into $T_1(x)$, we ought to have
$$e^{-t^2/2} \approx T_1(-t^2/2) = 1 - \frac{t^2}{2}$$
close to $t = 0$. Integrating this, we ought to get
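$$\frac{1}{\sqrt{2\pi}}\int_0^1 e^{-t^2/2}\,dt \approx \frac{1}{\sqrt{2\pi}}\int_0^1 \Big(1 - \frac{t^2}{2}\Big)dt = \frac{1}{\sqrt{2\pi}}\Big(1 - \frac{1}{6}\Big) = \frac{5}{6\sqrt{2\pi}} = 0.3324...$$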
As in the previous example, to make this computation accurate, we must take the error
made when approximating by Taylor polynomials into account. This gives the expression
$$\frac{1}{\sqrt{2\pi}}\int_0^1 e^{-t^2/2}\,dt = \frac{1}{\sqrt{2\pi}}\int_0^1 \Big(\underbrace{1 - \frac{t^2}{2}}_{\text{Taylor approximation}} + \text{error}\Big)dt = 0.3324... + \text{total error}.$$
In particular, understanding the error allows us to figure out what order Taylor polynomial
to use in order to beat any $\varepsilon$ threshold for the desired accuracy of this computation.
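The numbers in this example can be reproduced with a few lines of Python. This is a sketch under stated assumptions: the helper name `approx_integral` is ours, the polynomial $T_n(-t^2/2)$ is integrated term by term over $[0,1]$, and the reference value is taken from the standard library's error function via the identity $\frac{1}{\sqrt{2\pi}}\int_0^1 e^{-t^2/2}\,dt = \frac{1}{2}\operatorname{erf}(1/\sqrt{2})$, which is not discussed in these notes.

```python
import math

def approx_integral(n):
    """(1/sqrt(2*pi)) times the integral over [0, 1] of T_n(-t^2/2).

    Term by term: the integral of (-t^2/2)^j / j! over [0, 1]
    equals (-1)^j / (2^j * j! * (2j + 1)).
    """
    total = sum((-1)**j / (2**j * math.factorial(j) * (2*j + 1))
                for j in range(n + 1))
    return total / math.sqrt(2 * math.pi)

# Reference value via the error function (an assumption of this sketch).
reference = 0.5 * math.erf(1 / math.sqrt(2))

for n in range(1, 4):
    value = approx_integral(n)
    print(n, value, abs(value - reference))
```

For `n = 1` the first column reproduces the value 0.3324... found above; the last column is the total error, which is the quantity the rest of the chapter learns to estimate.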
Exercise 13.7 How many terms from the Taylor polynomials for $e^x$, centered at
$x = 0$, do you need to approximate the above integral so that you match the first 6
decimal digits of its true value? (Use Table 13.18 in combination with trial and error.)
Exercise 13.9 Prove that if $T_1$ satisfies the above definition, then $T_1(x) = f(a) + f'(a)(x-a)$.
Hint: A polynomial $p(x)$ is of degree at most one exactly if it is of the form $p(x) = c_0 + c_1 x$. Indeed, if $c_1 = 0$, then it is of degree 0. Otherwise it is of degree 1.
Next, we consider how to define a second order tangent curve – what we probably
should call a tangent parabola. In light of the above formulation of the definition for
tangent lines, the following definition is rather natural.
Exercise 13.11 Prove that if $T_2$ satisfies the above definition for $a = 0$, then
$$T_2(x) = f(0) + f'(0)x + \frac{f''(0)}{2}x^2.$$
Exercise 13.12 Determine the tangent parabola at $x = 0$ for the following functions.
(a) $f(x) = e^x$ (b) $f(x) = \ln(1+x)$ (c) $f(x) = \sin x$.
In the above exercise, you were asked to find the formula for tangent parabolas
at $x = 0$. While it is not that hard to directly find the corresponding formula near
some $x \ne 0$, the computations do become annoying if you are not careful, especially
when we consider Taylor polynomials of large order. Here is a "trick" that helps keep
computations simple.
$$T_2(x + a) = c_0 + c_1 x + c_2 x^2.$$
$$T_2(x) = c_0 + c_1(x-a) + c_2(x-a)^2.$$
Exercise 13.15 Use the formula from the above remark to show that the tangent
parabola of a function at $x = a$ is given by
$$T_2(x) = f(a) + f'(a)(x-a) + \frac{f''(a)}{2}(x-a)^2.$$
$\arctan(x) \approx$
$(1+x)^\alpha \approx$
$\arcsin(x) \approx$
$\arccos(x) \approx$
We highly recommend that you fill out the above table every time you encounter a
new Taylor polynomial throughout this chapter.
The following theorem extends the formulas for the tangent lines and parabolas on
page 401. It allows us to compute most (but not all) Taylor polynomials in Table 13.18.
$$T_n(x) = f(a) + f'(a)(x-a) + \frac{f''(a)}{2}(x-a)^2 + \cdots + \frac{f^{(n)}(a)}{n!}(x-a)^n.$$
2 n!
$$\vdots$$
$$T_n^{(n)}(x) = 0 + 0 + 0 + \cdots + c_n\,n(n-1)(n-2)\cdots 2\cdot 1.$$
Letting $x = a$, we observe that only the first term from each expression is non-zero, and
that we get
$$T_n(a) = c_0$$
$$T_n'(a) = c_1$$
$$T_n''(a) = c_2\cdot 2$$
$$\vdots$$
$$T_n^{(n)}(a) = c_n\,n!$$
But, by hypothesis, we have $T_n^{(k)}(a) = f^{(k)}(a)$. Solving for the $c_k$, and inserting this into
(13.1), above, we obtain the desired formula.
$$f''(x) = 2\cdot(1-x)^{-3}$$
$$f'''(x) = 2\cdot 3\cdot(1-x)^{-4}$$
$$f^{(4)}(x) = 2\cdot 3\cdot 4\cdot(1-x)^{-5}.$$
Based on this, it would seem that a reasonable guess for the $k$'th derivative would be
$$f^{(k)}(x) = k!\,(1-x)^{-(k+1)}.$$
Let us now prove this formula by induction. Since we already took care of proving the
"base case", all that remains is to do the "induction step". That is, we show that if the
formula holds for $k$, then it also holds for $k + 1$. We do this as follows:
$$f^{(k+1)}(x) = \frac{d}{dx}f^{(k)}(x) = \frac{d}{dx}\,k!\,(1-x)^{-(k+1)} = k!\,(-k-1)(1-x)^{-(k+1)-1}\cdot(-1) = (k+1)!\,(1-x)^{-(k+2)}.$$
(We point out that in this computation, the induction hypothesis was used when we
wrote out the formula for $f^{(k)}$. Also, note that the factor $(-1)$ appearing in the middle
expression is the derivative of the inner function $1 - x$.)
Inserting x = 0 into the expression for the k’th derivative of f , we get
f (k) (0) = k!
And so, by Taylor’s formula, we find that the Taylor polynomials of f centered at x = 0
are exactly
Tn (x) = 1 + x + x2 + · · · + xn .
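To see these Taylor polynomials in action, here is a small numerical sketch. The function name `T` mirrors the notation $T_n$, while the sample point `x = 0.3` and the chosen orders are our own; the last printed column uses the identity $\frac{1}{1-x} - T_n(x) = \frac{x^{n+1}}{1-x}$ for the finite geometric sum.

```python
def T(n, x):
    """Taylor polynomial 1 + x + ... + x^n of f(x) = 1/(1 - x), centred at 0."""
    return sum(x**k for k in range(n + 1))

x = 0.3  # sample point inside (-1, 1), an arbitrary choice
exact = 1 / (1 - x)
for n in [1, 2, 5, 10]:
    # Columns: order, T_n(x), actual error, x^(n+1)/(1 - x).
    print(n, T(n, x), exact - T(n, x), x**(n + 1) / (1 - x))
```

The last two columns agree up to rounding, which illustrates how quickly the error decays when $|x| < 1$.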
Exercise 13.22 Compute the Taylor polynomials, of all orders, centered at $x = 0$, for
(a) $f(x) = e^x$ (b) $f(x) = \ln(1+x)$ (c) $f(x) = \sin x$ (d) $f(x) = \cos x$.
= sin x x,
Example 13.25 How well is $f(x) = \sin x$ approximated on the interval $[-1/10, 1/10]$
by its tangent line $T_1(x)$ centered at $x = 0$? By what we did above, we know that
$$|E_1(x)| \le |x|^3.$$
In particular, this means that the error we make on the interval $[-1/10, 1/10]$, when we
replace $f$ by $T_1$, is no larger than $1/10^3 = 1/1000$.
Exercise 13.26 (a) In the above example we first showed that $\sin x = x + E_1(x)$,
where $|E_1(x)| \le x^2$ for $x$ close to 0. Use this to "fix" the computation in Example
13.4.
(b) The second estimate above was $|E_1(x)| \le |x|^3$. Does this matter when applied to
Example 13.4? Would it make a difference if the estimate was $|E_1(x)| \le C|x|^n$
where $C > 0$ is any constant and $n \ge 2$?
Exercise 13.27 Use Proposition 13.24 to determine the following limits.
(a) $\displaystyle\lim_{x\to 0}\frac{\ln(1+x)}{x}$ (b) $\displaystyle\lim_{x\to 0}\frac{e^x - 1}{x}$ (c) $\displaystyle\lim_{x\to 0}\frac{\arcsin(x)}{x}$.
Example 13.28 (Example 13.6 revisited) Using Proposition 13.24, we obtain that
for $x = -t^2/2 \in [-1,1]$ we have
$$e^{x} = 1 - \frac{t^2}{2} + E_1(-t^2/2) \quad\text{with}\quad |E_1(-t^2/2)| \le \frac{e\,t^4}{4}.$$
But this means that
$$\frac{1}{\sqrt{2\pi}}\int_0^1 e^{-t^2/2}\,dt = \underbrace{\frac{1}{\sqrt{2\pi}}\int_0^1 T_1(-t^2/2)\,dt}_{\text{the approximation}} + \underbrace{\frac{1}{\sqrt{2\pi}}\int_0^1 E_1(-t^2/2)\,dt}_{\text{the error made}}.$$
We now make the following observation: the actual error made in
Example 13.6 is roughly
$$0.3413 - 0.3324 \approx 0.009,$$
which is much smaller than the estimate found above. This motivates the following
questions:
As it turns out, the answer to both questions is yes. In particular, since understanding
how to get a better value for the constant C in Proposition 13.24 will help us extend the
proposition to higher order Taylor polynomials, we will now explain how this is done.
Proof of Proposition 13.24, revisited. A problem in our original proof is that we use the
Mean Value Theorem in the first steps. Indeed, the "unknown" quantities c and d are
rather hard to handle since we do not know exactly where they are. Instead, we can try
to use the Fundamental Theorem of Calculus (FTC) to start the computation as follows:
13.2. ERROR ESTIMATES FOR TAYLOR POLYNOMIALS 409
$$E_1(x) = f(x) - T_1(x) = f(x) - \Big(f(a) + f'(a)(x-a)\Big)$$
where we flipped $(x - t)$ to $(t - x)$ to get rid of a minus sign in the last step. Moreover,
observe how we are still in total control since there are no $c$'s or $d$'s anywhere. But this
is as far as it goes. Because now, in the last step, we apply the Generalised Mean Value
Theorem for Integrals (Proposition 12.41):
$$= f''(c)\int_a^x (x-t)\,dt = \frac{f''(c)}{2}(x-a)^2,$$
where, finally, $c$, which is some number between $x$ and $a$, appears.
Exercise 13.29 Use the result of the above computation to find a better value for the
constant C appearing in Proposition 13.24 (you should probably record this improved
value in the margin next to the proposition).
Exercise 13.30 Use the error estimate from exercise 13.29, in addition to any other
computational tweak you can think of, to improve the error estimate in Example 13.28.
Exercise 13.31 (Exercises 13.16 and 13.23, continued.) Use the error estimate from
exercise 13.29 to estimate the error made when we used tangent lines to approximate
$\sqrt{2}$. (Compare this to the actual error.)
1
Let us think of this as an homage to Taylor as he invented the technique.
As it turns out, we are in a good position to prove this result. Indeed, the second proof
for Proposition 13.24 actually establishes Lagrange’s formula for n = 1, and basically
contains all the ideas required. In the following exercise, you are guided, step-by-step,
into proving the full result.
Exercise 13.33 We now prove Theorem 13.32. To this end, let $f$ be as in the statement,
and let $E_n = f - T_n$.
(a) The main part of the proof is to establish, by induction, the following integral
formula:
$$E_n(x) = \int_a^x f^{(n+1)}(t)\,\frac{(x-t)^n}{n!}\,dt.$$
(i) Re-read the second proof for Proposition 13.24, and observe that we have
already shown that the base case $n = 1$ holds.
(ii) To better understand how the induction step should work, first justify why
$E_2(x) = E_1(x) - \frac{f''(a)}{2}(x-a)^2$.
(iii) Next, use the relation in (ii) in combination with a suitable application of
integration by parts, to prove that the integral formula for n = 2 follows
from the one for n = 1.
(iv) Use induction to prove the integral formula for general n.
(b) Use the Generalised Mean Value Theorem for Integrals (Proposition 12.41) to
deduce Lagrange’s error formula from the integral error formula.
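Before proving the integral formula, it can be reassuring to test it numerically in one concrete case. The sketch below takes $f(x) = e^x$ and $a = 0$ (so every derivative of $f$ is again $e^x$) and compares $f(x) - T_n(x)$ with the integral, evaluated by a composite Simpson rule. The helper names, the sample point `x = 1.5` and the number of subintervals are our own choices.

```python
import math

def taylor_exp(n, x):
    """Taylor polynomial of exp of order n, centred at 0."""
    return sum(x**k / math.factorial(k) for k in range(n + 1))

def simpson(g, a, b, m=200):
    """Composite Simpson rule with m (even) subintervals."""
    h = (b - a) / m
    s = g(a) + g(b)
    for i in range(1, m):
        s += (4 if i % 2 else 2) * g(a + i * h)
    return s * h / 3

def integral_error(n, x):
    """Right-hand side of the integral formula for f = exp and a = 0."""
    return simpson(lambda t: math.exp(t) * (x - t)**n / math.factorial(n), 0.0, x)

x = 1.5  # sample point (an arbitrary choice)
for n in range(1, 5):
    # Columns: order, f(x) - T_n(x), value of the integral.
    print(n, math.exp(x) - taylor_exp(n, x), integral_error(n, x))
```

The two columns agree to many digits, which is consistent with the formula but of course proves nothing for general $f$.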
Over the next few pages, we are going to consider some examples and exercises. To
help you out, in the following remark, we summarise what we have shown above.
Remark 13.34 Suppose $f, f', \ldots, f^{(n)}$ are defined at the point $x = a$. Then we can
compute the $n$'th order Taylor polynomials of $f$, centered at $x = a$, to be
$$f(x) = T_n(x) + E_n(x) \quad\text{with}\quad E_n(x) = \frac{f^{(n+1)}(c)}{(n+1)!}(x-a)^{n+1}.$$
Exercise 13.35 (a) Show in detail how we use the Bounded Function Theorem to
obtain the last estimate for En (x) in the above remark from the Lagrange error
formula.
(b) What formula does this give for the constant C?
This means that Lagrange's error formula (or Remark 13.34) yields the estimate
$$|E_n(x)| = |f^{(n+1)}(c)|\,\frac{|x|^{n+1}}{(n+1)!} \le \frac{|x|^{n+1}}{(n+1)!}.$$
The question now becomes, when is this less than $1/100$ for all $x \in [-6,6]$? First, note that $|x|^{n+1}$
is the largest when $x = \pm 6$. This means we have to figure out how large $n$ has to be for
$$\frac{6^{n+1}}{(n+1)!} \le \frac{1}{100}.$$
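Finding the smallest such $n$ by hand is tedious, so a short loop is a natural tool. The following sketch tabulates the bound and searches for the first order at which it drops below the tolerance; the function names `bound` and `smallest_order` are our own.

```python
import math

def bound(n, x=6.0):
    """The upper bound |x|^(n+1) / (n+1)! for the error of T_n(x) for sin."""
    return abs(x)**(n + 1) / math.factorial(n + 1)

def smallest_order(tol, x=6.0):
    """Smallest n for which the bound is at most tol."""
    n = 0
    while bound(n, x) > tol:  # terminates since the bound tends to 0
        n += 1
    return n

for n in range(0, 21, 2):
    print(n, bound(n))
print("smallest n:", smallest_order(1 / 100))
```

Note that the bound first grows before the factorial takes over, so a fairly high order is needed at $x = 6$.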
Exercise 13.38 In the above example, we basically figured out how large to choose the
order for $\sin(6)$ to be approximated by $T_n(6)$, where the $T_n$ are Taylor polynomials
centered at $x = 0$. Determine what order you need for this approximation to hold
with the same accuracy (i.e., 1/100), if you instead use Taylor polynomials for $\sin x$
centered at $x = 2\pi \approx 6.28$.
Exercise 13.39 (a) Let $T_{2n+1}$ be the Taylor expansion of $\sin(x)$ of order $2n + 1$
centered at $x = 0$. Show that the error function $E_{2n+1}$ satisfies the inequality
$$|E_{2n+1}(x)| \le \frac{|x|^{2n+3}}{(2n+3)!}.$$
(b) According to the error estimate in (a), what order Taylor polynomial do you need
for it to approximate $\sin(1)$ with an error of less than $10^{-4}$? What about $10^{-16}$?
Exercise 13.40 Show that the error function $E_n$ for the Taylor polynomial $T_n$ of
$f(x) = 1/(1-x)$ is given by
$$E_n(x) = \frac{1}{(1-c)^{n+2}}\,x^{n+1},$$
Exercise 13.41 In this exercise we are to use the Taylor polynomials $T_n$ for $e^x$ (centered
at $x = 0$), to estimate the value for $e$. We suppose that we only know the
derivatives of $e^x$ and that $e \le 3$.
(a) Show that on the interval $[-1,1]$, the error function for the Taylor polynomials
$T_n(x)$ satisfies the relation
$$|E_n(x)| \le \frac{3|x|^{n+1}}{(n+1)!}$$
(b) How many terms from the Taylor polynomial of $e^x$ are needed to approximate
$e = e^1$ with an accuracy of 16 digits? (Why not make a table such as in the
above example?).
Example 13.42 (Example 13.6, revisited, again) Let us again consider the integral
$$\frac{1}{\sqrt{2\pi}}\int_0^1 e^{-t^2/2}\,dt. \tag{13.2}$$
We now use Lagrange's formula for the error function to determine how many terms
from the Taylor polynomial $T_n$ for $f(x) = e^x$ (centered at $x = 0$) we need for the integral
to be approximated by
$$\frac{1}{\sqrt{2\pi}}\int_0^1 T_n\Big(-\frac{t^2}{2}\Big)\,dt \tag{13.3}$$
with an error less than $10^{-7}$.
Here, $R_n$ represents the error we make when we approximate the integral (13.2) by the
integral (13.3). So, our goal is to estimate how large it is.
By Lagrange's formula, we obtain
$$E_n(x) = f^{(n+1)}(c)\,\frac{x^{n+1}}{(n+1)!} = e^c\,\frac{x^{n+1}}{(n+1)!},$$
where, for $x = -t^2/2$, the number $c$ is between 0 and $-t^2/2$. Since then $c \le 0$ and $e^c \le 1$, taking absolute values, this means that
$$\Big|E_n\Big(-\frac{t^2}{2}\Big)\Big| \le \frac{t^{2n+2}}{2^{n+1}(n+1)!}.$$
By using the triangle inequality for integrals, this implies that
$$|R_n| \le \frac{1}{\sqrt{2\pi}}\int_0^1 \Big|E_n\Big(-\frac{t^2}{2}\Big)\Big|\,dt \le \frac{1}{\sqrt{2\pi}}\int_0^1 \frac{t^{2n+2}}{2^{n+1}(n+1)!}\,dt = \frac{1}{\sqrt{2\pi}}\cdot\frac{1}{2^{n+1}(2n+3)(n+1)!}.$$
You are asked to use this estimate to complete this example in exercise 13.43, below.
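The estimate is easy to evaluate on a computer. The snippet below is a minimal sketch that only tabulates the right-hand side of the estimate for the first few $n$; the function name `bound_R` is our own, and extending the table is left to the exercise.

```python
import math

def bound_R(n):
    """The estimate 1 / (sqrt(2*pi) * 2^(n+1) * (2n+3) * (n+1)!) for |R_n|."""
    return 1 / (math.sqrt(2 * math.pi) * 2**(n + 1) * (2*n + 3) * math.factorial(n + 1))

for n in range(1, 4):
    print(n, bound_R(n))
```

Each step in $n$ multiplies the denominator by roughly $2(n+2)$, so the bound decays faster than geometrically.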
(a) Use, say, Python to make a table such as the one in Example 13.37. How large
do you have to choose $n$ to get $|R_n| \le 10^{-7}$? Does this match what you found in
Exercise 13.7?
(b) Compare the estimated error and the "actual error". Which is largest? Does
this make sense? (Here, to find the "actual error", why not use the value for the
integral as given by WolframAlpha?)
Exercise 13.44 (From old exam) Use Taylor's formula with Lagrange error term to
decide whether or not
$$\int_0^1 \cos(x^2)\,dx > 9/10.$$
$$\sqrt{1+t} = 1 + \frac{t}{2} - \frac{t^2}{8} + R(t) \quad\text{with}\quad |R(t)| \le \frac{t^3}{16}.$$
Example 13.47 Let us try to find the Taylor expansion of $f(x) = \arctan(x)$ of order $n$
(centered at $x = 0$). Since $\arctan x$ is our favourite function, we expect this to work in
an extremely beautiful way. But here is what happens when we compute its derivatives:
$$f'(x) = \frac{1}{1+x^2}$$
$$f''(x) = \frac{-2x}{(1+x^2)^2}$$
$$f'''(x) = \frac{-2(1+x^2)^2 - (-2x)\,2(1+x^2)(2x)}{(1+x^2)^4}$$
This is rather curious since it turns out that the arctangent has rather nice Taylor
polynomials. The problem is that we need some tool other than Taylor’s formula to
determine them. Such a tool is actually offered by the uniqueness theorem for Taylor
polynomials.
Before we state the uniqueness theorem, we need to know exactly what we should mean
by a "good" approximation near the point x = a. We therefore recall that, by Remark
13.34, we know that the n’th order error function satisfies
for some constant whose value, it turns out, does not matter here. Indeed, the uniqueness
theorem says that any polynomial approximating f with an error smaller than or equal
to what is given in the above estimate (no matter how large C is) must itself be a Taylor
polynomial.
In fact, recall that also in exercise 13.26, we saw a situation where the exact value of
the constant C in the error term estimate did not matter. Therefore, we might as well
introduce the following definition.
13.3. A UNIQUENESS THEOREM FOR TAYLOR POLYNOMIALS 417
In terms of this notation, we can formulate the uniqueness theorem for Taylor polynomials.
When you get used to the Big-oh notation, you will start to notice how it makes
these types of statements a bit easier to read (indeed, compare it to the last statement
of Remark 13.34).
Proof. As the implication "$\Leftarrow$" is just Remark 13.34, it only remains to prove the
implication "$\Rightarrow$". To this end, we start by expressing the polynomial $p$ in the form
(recall Remark 13.13)
$$p(x) = c_0 + c_1(x-a) + c_2(x-a)^2 + \cdots + c_n(x-a)^n.$$
By the hypothesis, and the definition of the Big-oh, we have for $x$ close to $a$ that
$$|f(x) - c_0 - c_1(x-a) - c_2(x-a)^2 - \cdots - c_n(x-a)^n| \le C|x-a|^{n+1}. \tag{13.4}$$
Since $f$ is continuous, letting $x \to a$, we get $|f(a) - c_0| = 0$. That is,
$$c_0 = f(a).$$
Inserting this into (13.4), and then dividing by $|x - a|$ on both sides, we get
$$\Big|\frac{f(x)-f(a)}{x-a} - c_1 - c_2(x-a) - \cdots - c_n(x-a)^{n-1}\Big| \le C|x-a|^{n}.$$
Since $f$ is differentiable, letting $x \to a$, we get $|f'(a) - c_1| = 0$. That is, $c_1 = f'(a)$.
Continuing in this way, we obtain the desired result.
To this end, we recall Lemma 5.13, which, in the context of the language of Taylor
polynomials says that
$$f(x) = p(x) + E_n(x) \quad\text{with}\quad E_n(x) = \frac{x^{n+1}}{1-x}.$$
Now, to obtain the desired estimate on the error term, all we have to do is find some
neighbourhood of $x = 0$ where $1/(1-x)$ does not become too large. One such interval is
$[-1/2, 1/2]$, where we have $1/2 \le 1 - x \le 3/2$. This implies that on this interval we have
$$|E_n(x)| = \Big|\frac{x^{n+1}}{1-x}\Big| = \frac{1}{1-x}\,|x|^{n+1} \le 2|x|^{n+1}.$$
From this, it follows by the uniqueness theorem for Taylor polynomials that $p(x) = T_n(x)$.
Exercise 13.52 Determine a C > 0, so that for x 2 [ , ], the inequality |En (x)|
C|x|n+1 holds.
Exercise 13.53 Compare the error term found in the above example to the one from
exercise 13.40. Which one do you think is better?
Example 13.54 We show that the $n$'th order Taylor polynomial, centered at $x = 0$, of
$$f(x) = \ln(1+x)$$
is given by
$$p(x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \cdots + (-1)^{n-1}\frac{x^n}{n}.$$
To do this, the first step is to plug $x = -t$ into the formula for the $(n-1)$'st partial
sums of the Geometric series. This yields the expression
$$\frac{1}{1+t} = 1 - t + t^2 - \cdots + (-1)^{n-1}t^{n-1} + \frac{(-1)^n t^n}{1+t}. \tag{13.5}$$
1+t 1+t
The point is that when we integrate both sides of this expression from 0 to $x$, we obtain
$$\ln(1+x) = \int_0^x \Big(1 - t + t^2 - \cdots + (-1)^{n-1}t^{n-1}\Big)dt + \int_0^x \frac{(-1)^n t^n}{1+t}\,dt = \underbrace{x - \frac{x^2}{2} + \frac{x^3}{3} - \cdots + (-1)^{n-1}\frac{x^n}{n}}_{=p(x)} + \int_0^x \frac{(-1)^n t^n}{1+t}\,dt.$$
This is great! In particular, this gives the following formula for the error term:
$$E_n(x) = f(x) - p(x) = \int_0^x \frac{(-1)^n t^n}{1+t}\,dt.$$
Using the Generalised Mean Value Theorem for Integrals (Proposition 12.41), we obtain
$$E_n(x) = \frac{1}{1+c}\cdot\frac{(-1)^n x^{n+1}}{n+1},$$
where $c$ is some number between 0 and $x$. Next, if we restrict $x$ to, say, the interval
$[-1/2, 1/2]$, this implies that $c \in [-1/2, 1/2]$, and so, we have $1/2 \le 1 + c \le 3/2$. In
particular, this means that
$$|E(x)| \le \frac{2}{n+1}\,|x|^{n+1}.$$
This means that by the Uniqueness Theorem for Taylor Polynomials, we have that $p(x)$
is the Taylor polynomial for $\ln(1+x)$ centered at $x = 0$.
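The error estimate on $[-1/2, 1/2]$ can be checked numerically. This is a sketch only: the helper names `p` and `bound`, as well as the sample points and orders, are our own choices.

```python
import math

def p(n, x):
    """The polynomial x - x^2/2 + ... + (-1)^(n-1) x^n / n."""
    return sum((-1)**(k - 1) * x**k / k for k in range(1, n + 1))

def bound(n, x):
    """The estimate 2 |x|^(n+1) / (n+1), valid for x in [-1/2, 1/2]."""
    return 2 * abs(x)**(n + 1) / (n + 1)

for x in [-0.5, 0.25, 0.5]:
    for n in [1, 3, 6]:
        err = abs(math.log(1 + x) - p(n, x))
        print(x, n, err, bound(n, x), err <= bound(n, x))
```

The estimate is sharpest at the left endpoint $x = -1/2$, where $1 + c$ can come close to $1/2$.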
(a) Modify the above argument to find an estimate for $|E(x)|$ that is valid for $x \in [-9/10, 9/10]$.
(b) Find an estimate for $|E(x)|$ that is valid for $x = 1$.
(c) Use (b) to prove Mengoli's formula for the value of the Alternating Harmonic
series.
Proposition 13.59
(i) $\dfrac{O(x^2)}{x} = O(x)$ (ii) $\displaystyle\lim_{x\to 0} O(x) = 0$.
It follows from, say, the Lagrange error formula that we can write
$$\sin x = x + E_1(x) \quad\text{where}\quad |E_1(x)| \le C|x|^2 \text{ for } x \text{ close to } 0.$$
But this means that by the definition of the Big-oh, we can write
$$\sin x = x + O(x^2) \quad\text{as } x \to 0. \tag{13.6}$$
Finally, by the observations recorded in Proposition 13.59, we obtain
$$\lim_{x\to 0}\frac{\sin x}{x} = \lim_{x\to 0}\frac{x + O(x^2)}{x} = \lim_{x\to 0}\Big(1 + O(x)\Big) = 1 + 0 = 1.$$
13.4. THE BIG-OH "CALCULUS" FOR ERROR TERMS 423
Notice how the Big-oh notation allows us to make the computation in the above
example using exactly the information we need, but nothing more. As we will see,
keeping the level of detail to a minimum allows us to compute more efficiently.
Exercise 13.61 Try to compute the following “standard limits” using the Taylor ap-
proximations of order 1, 2 and 3 using the Big-oh notation for the error term. How
does the answer depend on the order?
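$e^x =$
$\sin x =$
$\cos x = 1 - \dfrac{x^2}{2} + \dfrac{x^4}{4!} - \cdots + (-1)^n\dfrac{x^{2n}}{(2n)!} + O(x^{2n+2})$
$\dfrac{1}{1-x} =$
$\ln(1+x) =$
$\arctan(x) = x - \dfrac{x^3}{3} + \dfrac{x^5}{5} - \cdots + (-1)^n\dfrac{x^{2n+1}}{2n+1} + O(x^{2n+3})$
$(1+x)^\alpha =$
$\arcsin x =$
$\arccos x =$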
Proposition 13.64 (Computational rules) First, we formulate rules (i) and (ii),
from above, in a more general way. Indeed, for $\alpha, \beta \in \mathbb{R}$, the following holds as $x \to 0$:
Proof. We only prove rule (v), and leave the others as exercises. So, assume that $\beta \le \alpha$.
The following computation then holds for all $x \in [-1,1]\setminus\{0\}$:
$$= (|x|^{\alpha-\beta} + C)\cdot|x|^{\beta} \le (1 + C)|x|^{\beta}$$
Note that in the last step, we used the fact that $\alpha - \beta \ge 0$ and that $|x| \le 1$. This is
fine since the Big-oh only demands that the estimate holds (at least) on a punctured
neighbourhood of $x = 0$.
Exercise 13.65 Prove the remaining rules of the above proposition.
Hint: Rules (i) and (ii) are proved just like their counterparts on page 422. Rules
(iii) and (iv) follow almost immediately from the definition of the Big-oh. Finally,
the proof of rule (vi) is more or less the same as that of rule (v).
Exercise 13.66 While efficient, the slightly sloppy notation we use to express the
computational rules for the Big-oh is not without flaws. For instance, we could claim
both that $O(x^3) = O(x^2)$ is true, but that $O(x^2) = O(x^3)$ is not. Can you explain
how to make sense of this apparent nonsense?
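Example 13.67 Let us try to find the third order Taylor polynomial (centered at $x = 0$)
of
$$f(x) = \sin x\cos x.$$
A naive approach would be to just multiply the third order Taylor polynomials of $\sin x$
and $\cos x$ and hope for the best. So, let us do this! Multiplying
$$\sin x = x - \frac{x^3}{3!} + O(x^5) \quad\text{and}\quad \cos x = 1 - \frac{x^2}{2} + O(x^4)$$
we get
$$\begin{aligned}
\sin x\cos x &= \Big(x - \frac{x^3}{3!} + O(x^5)\Big)\Big(1 - \frac{x^2}{2} + O(x^4)\Big)\\
&= x - \frac{x^3}{2} - \frac{x^3}{3!} + \frac{x^5}{2\cdot 3!}\\
&\quad + x\cdot O(x^4) - \frac{x^3}{3!}\cdot O(x^4) + O(x^5)\cdot O(x^4)\\
&\quad + O(x^5) - \frac{x^2}{2}\cdot O(x^5).
\end{aligned} \tag{13.7}$$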
This looks absolutely horrible! But not to worry. When we use the Pacman-rules of
the Big-oh, the term $O(x^5)$ eats up all terms with $x$ of the same or higher exponent,
resulting in
$$\sin x\cos x = x - \frac{x^3}{2} - \frac{x^3}{3!} + O(x^5) = x - \frac{2x^3}{3} + O(x^5).$$
By the uniqueness theorem, we have actually found the Taylor polynomial of $\sin x\cos x$
of order 4 (!) centered at $x = 0$.
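The bookkeeping in this example (multiply the polynomial parts, then let the Big-oh term swallow every power from $x^5$ on) is mechanical enough to sketch in code. The function `multiply_truncated` and the coefficient-list representation are our own illustration, not something used elsewhere in these notes; exact fractions avoid rounding.

```python
from fractions import Fraction

def multiply_truncated(p, q, order):
    """Multiply coefficient lists p and q, discarding every power above `order`."""
    result = [Fraction(0)] * (order + 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if i + j <= order:  # higher powers are absorbed by the Big-oh term
                result[i + j] += a * b
    return result

# Coefficients of 1, x, x^2, ... for the polynomial parts used above.
sin_poly = [Fraction(0), Fraction(1), Fraction(0), Fraction(-1, 6)]  # x - x^3/3!
cos_poly = [Fraction(1), Fraction(0), Fraction(-1, 2)]               # 1 - x^2/2

product = multiply_truncated(sin_poly, cos_poly, order=4)
print([str(c) for c in product])
```

The printed coefficients are those of $1, x, x^2, x^3, x^4$ in the product, and can be compared with the polynomial $x - \frac{2x^3}{3}$ found above.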
Exercise 13.68 Let $f(x) = \sin x\cos x$. Without computing any derivatives, use the
above example to determine the values of $f'(0)$, $f''(0)$, $f'''(0)$ and $f^{(4)}(0)$.
Example 13.69 We now compute the general expression for the Taylor polynomial of
$$f(x) = x\arctan(x^2)$$
centered at $x = 0$. To this end, we first recall that
$$\arctan(x) = x - \frac{x^3}{3} + \frac{x^5}{5} - \cdots + (-1)^n\frac{x^{2n+1}}{2n+1} + O(x^{2n+3}).$$
But this means that
$$\arctan(x^2) = x^2 - \frac{x^6}{3} + \frac{x^{10}}{5} - \cdots + (-1)^n\frac{x^{4n+2}}{2n+1} + O(x^{4n+6}),$$
and finally, where we need to use rule (i) of the rulebook for the Big-oh, we get
$$x\arctan(x^2) = x^3 - \frac{x^7}{3} + \frac{x^{11}}{5} - \cdots + (-1)^n\frac{x^{4n+3}}{2n+1} + O(x^{4n+7}).$$
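$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} + O(x^7)$$
$$\cos x = 1 - \frac{x^2}{2} + \frac{x^4}{4!} + O(x^6).$$
Using the rulebook for the Big-oh, we get
$$\begin{aligned}
\sin x - x\cos x &= \Big(x - \frac{x^3}{6} + O(x^5)\Big) - x\cdot\Big(1 - \frac{x^2}{2} + O(x^4)\Big)\\
&= x - \frac{x^3}{6} + O(x^5) - x + \frac{x^3}{2} + O(x^5) = \frac{x^3}{3} + O(x^5).
\end{aligned}$$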
Exercise 13.71 Find the third order Taylor approximation centered at $x = 0$ of:
(a) $f(x) = \ln(1+x)\sin x$ (b) $f(x) = \Big(1 - x + \dfrac{x^2}{2}\Big)e^x$ (c) $f(x) = \sin(\sin x)$.
The point is that this puts us in an excellent position to use Rule of Thumb 2 from
Chapter 4 (let the strong main terms fight each other!). In this case, we use the expressions
found in examples 13.69 and 13.70, in combination with rules (i) and (ii) from the
rulebook of the Big-oh, to get
$$\lim_{x\to 0}\frac{x\arctan(x^2)}{\sin x - x\cos x} = \lim_{x\to 0}\frac{x^3 + O(x^7)}{\frac{x^3}{3} + O(x^5)} = \lim_{x\to 0}\frac{x^3}{x^3}\cdot\frac{1 + O(x^4)}{\frac{1}{3} + O(x^2)} = \frac{1+0}{\frac{1}{3}+0} = 3.$$
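A quick numerical experiment supports the value of this limit. The sketch below simply evaluates the quotient at a few small values of $x$; the function name `ratio` and the sample points are our own, and for much smaller $x$ one should expect cancellation in the denominator to spoil the floating-point result.

```python
import math

def ratio(x):
    """The quotient x * arctan(x^2) / (sin x - x cos x)."""
    return x * math.atan(x**2) / (math.sin(x) - x * math.cos(x))

for x in [0.5, 0.1, 0.01, 0.001]:
    print(x, ratio(x))
```

The printed values settle near 3 as $x$ shrinks, in agreement with the Big-oh computation.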
Exercise 13.73 What happens if we increase the order of any of the Taylor approx-
imations in Example 13.72? In particular, add one term to each of the three Taylor
approximations (and adjust the corresponding Big-oh terms accordingly). How does
this affect the computation of the limit? Is there any way to recognise that you used
"too many" terms?
Example 13.75 Determine whether the following infinite series converges or diverges:
$$\sum_{k=1}^{\infty} \frac{\frac{1}{k} - \arctan\frac{1}{k}}{\sin\frac{1}{k}}.$$
Intuitive step: We need to figure out how the terms in the series behave for large $k$.
This means that $1/k$ is small, and it makes sense to consider the Taylor approximations of
$\arctan x$ and $\sin x$ as $x$ tends to 0. To express both the numerator and the denominator
in the form "main term" + "error term", we make the following observation:
$$\begin{cases}\arctan x = x - \dfrac{x^3}{3} + O(x^5)\\[1ex] \sin x = x + O(x^3)\end{cases}\ \text{as } x \to 0 \quad\Longrightarrow\quad \begin{cases}\arctan\dfrac{1}{k} \approx \dfrac{1}{k} - \dfrac{1}{3k^3}\\[1ex] \sin\dfrac{1}{k} \approx \dfrac{1}{k}\end{cases}\ \text{as } k \to \infty.$$
This implies that
$$\frac{\frac{1}{k} - \arctan\frac{1}{k}}{\sin\frac{1}{k}} \approx \frac{\big(\frac{1}{3k^3}\big)}{\big(\frac{1}{k}\big)} = \frac{1}{3k^2}, \quad\text{as } k \to \infty,$$
and so, we expect that the series will behave like the convergent $\alpha$-series $\sum_{k=1}^{\infty} 1/k^2$.
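Formal step: We use the limit comparison test, making sure to include Big-oh terms:
$$\lim_{k\to\infty}\frac{\Big(\frac{\frac{1}{k} - \arctan\frac{1}{k}}{\sin\frac{1}{k}}\Big)}{\big(\frac{1}{k^2}\big)} = \lim_{k\to\infty} k^2\cdot\frac{\frac{1}{k} - \arctan\frac{1}{k}}{\sin\frac{1}{k}} = \lim_{k\to\infty} k^2\cdot\frac{\frac{1}{3k^3} + O\big(\frac{1}{k^5}\big)}{\frac{1}{k} + O\big(\frac{1}{k^3}\big)} = \lim_{k\to\infty}\frac{\frac{1}{3} + O\big(\frac{1}{k^2}\big)}{1 + O\big(\frac{1}{k^2}\big)} = \frac{\frac{1}{3}+0}{1+0} = \frac{1}{3}.$$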
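The limit comparison can also be observed numerically. In this sketch, `term` (our own name) is the $k$'th term of the series; the loop prints $k^2$ times the term, and the last line prints a partial sum to illustrate that the sums stay bounded. The cut-off 10000 is an arbitrary choice.

```python
import math

def term(k):
    """The k'th term (1/k - arctan(1/k)) / sin(1/k) of the series."""
    return (1 / k - math.atan(1 / k)) / math.sin(1 / k)

for k in [10, 100, 1000]:
    print(k, k**2 * term(k))

partial_sum = sum(term(k) for k in range(1, 10001))
print(partial_sum)
```

The values of $k^2$ times the term approach $1/3$, matching the limit computed in the formal step.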
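Exercise 13.76 (From previous exam) Determine whether the following series and
generalised integrals converge:
(a) $\displaystyle\sum_{k=1}^{\infty} k\cdot\Big(\frac{1}{k} - \sin\frac{1}{k}\Big)$ (b) $\displaystyle\int_1^{\infty}\Big(\frac{1}{x} - \arctan\frac{1}{x}\Big)dx$.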
Exercise 13.77 (From previous exam) Use Taylor approximations with Big-oh error
term to determine for which values of $a$ the following infinite series converges:
$$\sum_{k=1}^{\infty}\Big(\sqrt{k^{2a}+1} - k^a\Big).$$
(a) Use L'Hopital's rule to verify that $f_P(\lambda)$ resolves the ultraviolet catastrophe.
(b) Use a Taylor polynomial with Big-oh error term to show that for large wavelengths,
Planck's law becomes more and more like the Rayleigh-Jeans law.
(c) For what value of $\lambda$ is $f_P$ the largest for our sun? Google a bit, and use realistic
values for the various constants appearing in the expression. (Hint: The answer
is kind of indicated in the figure above.)
(d) As in (c), estimate the maximum of $f_P$ for the stars Betelgeuse ($T = 3400$),
Procyon ($T = 6400$) and Sirius ($T = 9200$). In particular, why do you think
Sirius is thought of as a "blue" star, while Betelgeuse is thought of as a "red"
star?
430 CHAPTER 13. TAYLOR POLYNOMIALS
Fig. 22. Albert Einstein (1879 – 1955). Personally, I choose to believe that there is
a Big-oh lurking behind the guy. (A prize to the one who finds an actual Big-oh on
a blackboard behind Einstein – such a photo has to exist!)
(a) Determine the Taylor expansion of order 4n + 1 with error term for the function
\[
f(x) = \int_0^x \frac{1}{1 + 2t^4}\, dt.
\]
(b) For which x can you guarantee that the error term goes to 0 as $n \to \infty$?
Bonus question: Show that the Taylor expansion diverges as $n \to \infty$ for the
remaining x.
(a) Formulate Taylor’s formula centered at x = 0 with an explicit estimate for the
error (make sure you state under which assumptions the statement holds).
(b) Show that
\[
\int_0^x \cos(t^2)\, dt = x - \frac{x^5}{10} + R(x) \quad\text{with}\quad |R(x)| < \frac{x^9}{216}.
\]
(c) Determine an approximation for $\int_0^{0.1} \cos(t^2)\, dt$ that is accurate to 10 decimal
digits. Your approximation should be given in the form of a rational number (i.e.,
a fraction).
(a) Explain why the length of the graph of y = f(x) for x ∈ [a,b] is given by the
formula
\[
\int_a^b \sqrt{1 + f'(x)^2}\, dx.
\]
\[
(1 + u)^{1/2} = 1 + \frac{1}{2}u - \frac{1}{8}u^2 + R(u) \quad\text{with}\quad |R(u)| < \frac{u^3}{16} \ \text{ for } u \geq 0.
\]
(b) Use these results to approximate the length of the curve y = sin x for x ∈ [0,π/2],
and give the smallest estimate for this error that you are able to.
(a) Formulate Taylor’s formula with explicit error term estimate for a function f (x)
when the approximation is centered at x = 0.
(b) Given a function with the graph shown in Figure 23, what can we say about the
three first coefficients of its Taylor expansion centered at x = 0?
(c) Determine what order of the Taylor polynomial of $f(x) = \sqrt{4 + x^2}$ is needed for the
error to be less than $1/10^5$ on the interval [−1,1].
Exercise 13.85 (Exam 2014-08-18) From a previous exercise on this exam we know
that
\[
\int_0^{1/2} \sqrt{1 - x^2}\, dx = \frac{\sqrt{3}}{8} + \frac{\pi}{12}.
\]
We are now supposed to use this to find a rational number that approximates π.
(a) Use a suitable Taylor expansion to determine a rational number that approximates
$\sqrt{3}$ with an error of at most 1/100.
13.5. RELEVANT EXERCISES FROM PREVIOUS EXAMS 433
(b) Use a suitable Taylor expansion to determine a rational number that approximates
the integral above with the same accuracy.
(c) Use parts (a) and (b) to give a rational number that approximates π. What is
the error of this approximation?
Exercise 13.86 (Exam 2014-05-26)
(a) Determine a curve for which the following integral gives the length:
\[
\int_0^{\pi/6} \sqrt{1 + \sin^2(x)}\, dx.
\]
(b) Find a fraction that approximates the value of this length with an error of at
most 1/10. (Bonus points if you manage to approximate with an error less than
1/200.)
Exercise 13.87 (Exam 2014-01-09) As all exercises, this gives at most 5 points.
Here, we are going to approximate the value of
\[
\int_0^1 \cos(x^2)\, dx.
\]
Do one of the following:
(a) Show that $\cos x = 1 + R_1(x)$ with $|R_1(x)| \leq x^2/2$. Use this to give an
approximation of the integral and an estimate of the error. (3 points)
(b) Show that $\cos x = 1 - x^2/2 + R_2(x)$ with $|R_2(x)| \leq x^4/24$. Use this to give an
approximation of the integral and an estimate of the error. (4 points)
(c) Give a rational number approximating the above integral with an error less than
1/1000. (5 points)
Exercise 13.88 (Exam 2013-12-18)
(a) Show that
\[
\frac{1}{1 + x^2} = 1 - x^2 + x^4 - x^6 + \cdots + (-1)^n x^{2n} + R_n(x) \quad\text{with}\quad |R_n(x)| \leq x^{2n+2}.
\]
(b) Use part (a) to find the formula for the Taylor expansion of arctan(x). In particular,
what inequality for the error term do you get?
(c) Use this to find a rational number approximating π with an error less than 1/100.
(This rational number can be given as a sum of fractions.)
Exercise 13.89 (Exam 2013-05-29) Motivate why the function $\sin(x^2)/x^2$ can be
integrated over the interval [−1,1], and determine a rational approximation of
\[
\int_{-1}^{1} \frac{\sin(x^2)}{x^2}\, dx
\]
by using Taylor approximation. Give this approximation of the integral both with
Exercise 13.90 (Exam 2016-01-23, part of exercise) Determine whether the following
series converges:
\[
\sum_{k=1}^{\infty} k\left(1 - \cos\frac{1}{k}\right)
\]
Exercise 13.93 (Exam 2016-01-07, part of exercise) Determine whether the following
series diverges or converges:
\[
\sum_{k=1}^{\infty} \left(\sin\frac{1}{k} - \arctan\frac{1}{k}\right).
\]
Exercise 13.97 (Exam 2014-05-26, part of exercise) Determine whether the following
series converges or diverges.
\[
\sum_{k=1}^{\infty} \frac{\tan\left(\frac{1}{k}\right) - \frac{1}{k}}{\sqrt{\sin\frac{1}{k}}}.
\]
\[
\lim_{x\to 0} \frac{1 - \cos(\arctan x)}{(\arctan x)^2}.
\]
13.5 You get the same answer as with T1 (all additional terms just give extra zeroes).
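13.7 6 terms, which gives the approximation
\[
\frac{1}{\sqrt{2\pi}} \int_0^1 e^{-t^2/2}\, dt \approx \frac{1}{\sqrt{2\pi}} \left( 1 - \frac{1}{3 \cdot 2} + \frac{1}{5 \cdot 2^2 \cdot 2!} - \frac{1}{7 \cdot 2^3 \cdot 3!} + \frac{1}{9 \cdot 2^4 \cdot 4!} - \frac{1}{11 \cdot 2^5 \cdot 5!} \right).
\]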
13.30 It suffices to apply the error estimate for x ∈ [−1/2,0]. Using this, in combination
with the improved error estimate, we find that the total error will not be larger
than approximately 0.00997....
(ii), and then integrate by parts (differentiating $f''$ and integrating $(x - t)$). (iv)
Do the same as in (iii), but now for general n.
13.40 (There is an error in the formula: the term $(-1)^n$ should not be there.) The point
of the exercise is to show (by induction) that
\[
f^{(n)}(x) = \frac{n!}{(1 - x)^{n+1}}.
\]
13.43 (a) n = 6 will do, (b) the estimated error is roughly $4.1 \cdot 10^{-8}$, while the distance
computed between the integral and the estimate obtained using Taylor polynomials
with n = 6 is roughly $3.9 \cdot 10^{-8}$.
If we would be so lucky that the main term is equal to 9/10 for some value of n,
then all we need to do is figure out if we can tell which sign the error term is...
13.45 (a) Use the Taylor formula with Lagrange error function, (b) the point is to observe
that
\[
\int_{\pi/3}^{\pi/2} \sin(\cos t)\, dt = \underbrace{\int_{\pi/3}^{\pi/2} \left(\cos t - \frac{\cos^3 t}{6}\right) dt}_{\text{main term}} + \underbrace{\int_{\pi/3}^{\pi/2} R(\cos t)\, dt}_{\text{error term}}.
\]
13.46 (a) This follows by using the Taylor formula with Lagrange error term, (b) $(1/64) \cdot 10^{-8}$.
13.52 C = 1/(1 ) for sufficiently small (well, strictly smaller than 1).
13.53 The one from the example has no unknown "c", and should therefore be easier to
handle.
13.55 (a)
P1|E(x)| k 110|x|
n+1 /(n + 1), (b) |E(x)| |x|n+1 /(n + 1), (c) the point is that
\[
\frac{1}{1 + x^2} = P_n(-x^2) + R_n(-x^2),
\]
where $P_n$ is the Taylor expansion of $1/(1 - x)$ of order n centered at x = 0, and
$R_n$ is the corresponding error function (here, we use the notations $P_n$ and $R_n$
since $T_n$ and $E_n$ are to be used for the Taylor polynomials and error functions for
arctan(x)).
(b) The Taylor polynomials for arctan(x) are $T_{2n+1} = \sum_{k=0}^{n} (-1)^k x^{2k+1}/(2k + 1)$
with error term given by
\[
E_{2n+1}(x) = \int_0^x R_n(-t^2)\, dt.
\]
Basically, playing around with this integral expression as in Example 13.54 solves
the exercise.
13.57 (a) $\binom{n}{m} = (n)_m/m! = (n)_m/(m)_m$, (b) $f^{(n)}(x) = (\alpha)_n (1 + x)^{\alpha - n}$, (c) do as in
example 13.21, (d) $T_n(x) = \sum_{k=0}^{n} (\alpha)_k x^k/k!$.
13.58 (a) $T_{2n+1}(x) = \sum_{k=0}^{n} (-1)^k \frac{(-1/2)_k}{k! \cdot (2k+1)} x^{2k+1}$, (b) use $\arccos(x) = \pi/2 - \arcsin(x)$.
13.61 (a) answer 1 for orders 1,2,3, (b) 1 for orders 1,2,3, (c) no answer for order 1, for
order 2,3 answer is 1/2 (d) answer 1 for orders 1,2,3.
13.65 Here is a proof for (iii):
\[
|C \cdot O(x^{\alpha})| \leq C \cdot D \cdot |x^{\alpha}| = E \cdot |x^{\alpha}|,
\]
where C, D, E are constants. This implies, by definition, that $C \cdot O(x^{\alpha}) = O(x^{\alpha})$.
13.66 The point is that the formulas are formulated in a sloppy way. If we are to be
careful, they should be formulated as
\[
f(x) = O(x^3) \implies f(x) = O(x^2),
\]
and
\[
f(x) = O(x^2) \implies f(x) = O(x^3).
\]
Now, the first implication is always true, while the second one is not (why?).
13.71 (a) $x^2/2 + x^3/2 + O(x^4)$, (b) $1 + x^3/6 + O(x^4)$, (c) $x - x^3/3 + O(x^5)$.
13.73 It does not affect the result. The extra terms all give extra 0’s when we take the
limit (which is unnecessary).
13.74 (a) 1/2 (b) 1 (c) 1/6 (d) 1/2.
13.76 both (a) and (b) converge.
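13.77 The point is to start by writing
\[
\sum_{k=1}^{\infty} \left(\sqrt{k^{2\alpha} + 1} - k^{\alpha}\right) = \sum_{k=1}^{\infty} k^{\alpha} \left(\sqrt{1 + \frac{1}{k^{2\alpha}}} - 1\right).
\]
Using Taylor polynomials for $\sqrt{1 + x}$ centered at x = 0 we see that convergence
happens if and only if α > 1.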
13.79 (a) the (relativistic) energy goes to infinity, (c) when v/c is small, the relativistic
energy gets closer to the classical energy.
Appendix E
E-2 APPENDIX E. ADDITIONAL RESULTS ON THE DEFINITE INTEGRAL
yields exactly the energy of this sound-wave. Or, if f (t) is the probability distribution
of a quantum particle, then the same expression yields the probability of this particle to
be located in the interval [a,b]. For reasons such as these, it is useful to have a concept
of integration which allows one to compute (or at least define) the definite integral for a
large class of functions. (We note that there are versions of the definite integral that go,
in different ways, beyond the Lebesgue integral. Examples include the Bochner integral
for certain vector valued functions and the Ito integral for "stochastic" functions).
Remark E.1 (The Lebesgue integral of Dirichlet’s function) Let us now discuss
briefly why the Dirichlet function, as we defined it in Example 5.3, satisfies
\[
\int_0^1 f(x)\, dx = 1
\]
according to Lebesgue. In terms of the general approach, the main difference between
the Riemann integral and the Lebesgue integral is that in the latter, we consider a
"partition" of the y-axis and not the x-axis. In particular, we can express the Lebesgue
integral of f as follows:
\[
\int_0^1 f(x)\, dx = 0 \cdot |\{x : f(x) = 0\}| + 1 \cdot |\{x : f(x) = 1\}| = |[0,1] \cap (\mathbb{R} \setminus \mathbb{Q})|,
\]
where, for $A \subset \mathbb{R}$, we use the notation |A| to denote the "size" or "measure" of the set of
points A. (To indicate why this makes sense, note that if f(x) = 3 for x in, say, [0,1/2],
and zero for all other x, then the Lebesgue integral of f over [0,1] is equal to 3 · 1/2,
which coincides with the Riemann integral of this function.)
Now, a problem with the Lebesgue integral is that it is quite difficult to make a
theory for how to measure the size of subsets of R. In fact, this gives rise to a separate
branch of mathematics called "measure theory" on which "integration theory" is built.
Here, we mention the following facts, which should all be reasonable for any naive notion
of "measure" for subsets of the real line:
• Suppose that A and B are two subsets of R that can be "measured" (not all sets
can be measured!), then:
• Finally, we mention that the summation rule holds for sequences $(A_n)_{n=1}^{\infty}$ of disjoint
sets in the sense that
\[
|A_1 \cup A_2 \cup A_3 \cup \cdots| = \sum_{n=1}^{\infty} |A_n|.
\]
Let us now see how we can apply these facts to figure out the Lebesgue integral of the
Dirichlet function. As we have seen in Appendix B, Cantor showed that the rational
numbers are countable. That is, we can write $\mathbb{Q} = (r_n)_{n=1}^{\infty}$. But this means that
\[
|\mathbb{Q}| = \sum_{n=1}^{\infty} |\{r_n\}| = 0.
\]
Remark F.1 (Selected problems from previous exams) Note that while this
chapter is not, strictly speaking, part of the course, there have been "difficult" problems
on previous exams that explore parts of what we discuss here. Here are two examples of
such problems:
F-2 APPENDIX F. TAYLOR AND POWER SERIES
Taylor series are exactly what you get when you take the limit $n \to \infty$ in a Taylor
expansion of the form
\[
f(x) = T_n(x) + E_n(x).
\]
As we will indicate here, this allows us to think about certain functions as being "poly-
nomials of infinite degree".
Here is a first example.
Example F.2 (Taylor series of the exponential function) The n’th order Taylor
expansion of the exponential function centered at x = 0 can be expressed as
\[
e^x = \sum_{k=0}^{n} \frac{x^k}{k!} + \frac{e^c}{(n + 1)!} x^{n+1}, \tag{F.1}
\]
where the error term is the one given by Lagrange’s formula. By using the quotient test
for the convergence of series, we observe that the series
\[
\sum_{k=0}^{\infty} \frac{x^k}{k!}
\]
converges absolutely for all fixed choices of x. Moreover, by the table of growth, we find
that for all fixed choices of x, we have
\[
\lim_{n\to\infty} \frac{x^n}{n!} = 0.
\]
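To see the Lagrange error term of (F.1) at work numerically, here is a minimal sketch that compares the actual error of the n'th partial sum with the bound $e^{|x|}|x|^{n+1}/(n+1)!$ (which follows from (F.1) since c lies between 0 and x). The helper names `exp_partial_sum` and `lagrange_bound` are our own, and the library value `math.exp` is used only as a reference.

```python
import math

def exp_partial_sum(x: float, n: int) -> float:
    """Taylor polynomial T_n(x) = sum_{k=0}^{n} x^k / k! of exp centered at 0."""
    total, term = 0.0, 1.0          # term holds x^k / k!
    for k in range(n + 1):
        total += term
        term *= x / (k + 1)
    return total

def lagrange_bound(x: float, n: int) -> float:
    """Upper bound e^{|x|} |x|^(n+1) / (n+1)! for the Lagrange error term."""
    return math.exp(abs(x)) * abs(x) ** (n + 1) / math.factorial(n + 1)

x = 3.0
for n in (5, 10, 20):
    err = abs(math.exp(x) - exp_partial_sum(x, n))
    print(n, err, lagrange_bound(x, n))  # actual error vs. bound
```

For fixed x both columns shrink rapidly as n grows, in line with the table of growth.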
Now, the exponential function is by no means the only function that can be repre-
sented by a Taylor series. Here is a second example.
F.1. A FIRST LOOK AT TAYLOR AND POWER SERIES F-3
Example F.3 (Taylor series for the arctangent) By exercise 13.56, we know that
the 2n + 1’st order Taylor polynomial of the arctangent, centered at x = 0, is given by
\[
\arctan x = \sum_{k=0}^{n} (-1)^k \frac{x^{2k+1}}{2k + 1} + E_{2n+1}(x) \quad\text{where}\quad |E_{2n+1}(x)| \leq \frac{|x|^{2n+3}}{2n + 3}.
\]
converges for all |x| ≤ 1, and by the divergence test, this series diverges for all |x| > 1.
Moreover, for fixed |x| ≤ 1, we have
\[
\lim_{n\to\infty} |E_{2n+1}(x)| \leq \lim_{n\to\infty} \frac{|x|^{2n+3}}{2n + 3} \leq \lim_{n\to\infty} \frac{1}{2n + 3} = 0.
\]
while for x outside of this interval, the right-hand side of this expression diverges.
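The behaviour described in Example F.3 can be observed numerically. The sketch below (function names are our own) compares the actual error of the Taylor polynomial with the bound $|x|^{2n+3}/(2n+3)$ for two points in [−1,1], and then prints the size of some partial sums at x = 1.5, where the series diverges.

```python
import math

def arctan_partial_sum(x: float, n: int) -> float:
    """T_{2n+1}(x) = sum_{k=0}^{n} (-1)^k x^(2k+1) / (2k+1)."""
    return sum((-1) ** k * x ** (2 * k + 1) / (2 * k + 1) for k in range(n + 1))

def error_bound(x: float, n: int) -> float:
    """The bound |x|^(2n+3) / (2n+3) on |E_{2n+1}(x)|, valid for |x| <= 1."""
    return abs(x) ** (2 * n + 3) / (2 * n + 3)

for x in (0.5, 1.0):
    for n in (5, 50):
        err = abs(math.atan(x) - arctan_partial_sum(x, n))
        print(x, n, err, error_bound(x, n))

# Outside [-1, 1] the terms themselves grow, so the partial sums cannot settle down.
print([abs(arctan_partial_sum(1.5, n)) for n in (5, 10, 20)])
```

Note how slowly the error decreases at the endpoint x = 1 compared with x = 0.5.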
Notice how the two examples we consider above differ in that the Taylor series for the
exponential function represents its function for all x ∈ ℝ, while the one for the arctangent
only represents its function for x ∈ [−1,1]. This motivates the following definition.
Definition F.4 (Entire function) If a function f is such that its Taylor series centered
at x = 0 converges, and is equal to f, for all x ∈ ℝ, then we call it entire.
The following proposition, which we do not prove here, makes identifying entire
functions easier.
Proposition F.5 Let f be defined on ℝ and have derivatives of all orders at some point
x = a. If the Taylor series for a function centered at x = a converges to f(x) for all
x ∈ ℝ, then f is an entire function.
Exercise F.6 Show that sin x and cos x are entire functions.
Exercise F.7 Are inverse functions of invertible entire functions themselves entire?
Exercise F.8 Use the Taylor series of $e^x$ to find an expression for $e^{ix}$ in terms of an
infinite series. What are the real and imaginary parts of this series?
Example F.9 We wish to determine for which x the following power series converges:
\[
\sum_{k=1}^{\infty} \frac{(3x - 1)^k}{k + 1}.
\]
First, we suppose that x ∈ ℝ is fixed, but unknown, and study absolute convergence of
the series. That is, we study the convergence of
\[
\sum_{k=1}^{\infty} \underbrace{\frac{|3x - 1|^k}{k + 1}}_{=|a_k|}.
\]
The quotient of consecutive terms satisfies
\[
\lim_{k\to\infty} \frac{|a_{k+1}|}{|a_k|} = \lim_{k\to\infty} \frac{|3x - 1|^{k+1}}{|3x - 1|^k} \cdot \frac{k + 1}{k + 2} = |3x - 1|.
\]
By the quotient test, the series is absolutely convergent whenever this limit is strictly
less than 1. Hence, the series is absolutely convergent when
\[
|3x - 1| < 1 \iff -1 < 3x - 1 < 1 \iff 0 < 3x < 2 \iff 0 < x < 2/3.
\]
Moreover, the quotient test says that the series diverges whenever
\[
|3x - 1| > 1 \iff x < 0 \ \text{ or } \ x > 2/3.
\]
However, the quotient test gives no information about what happens when the limit is
equal to 1. That is, when x = 0 or x = 2/3. So, we have to investigate these cases
separately. Plugging these values into the series, we see that
\[
x = 0 \implies \sum_{k=1}^{\infty} \frac{(-1)^k}{k + 1} \quad\text{and}\quad x = 2/3 \implies \sum_{k=1}^{\infty} \frac{1}{k + 1}.
\]
That is, for x = 0 we obtain the alternating harmonic series, which converges, and for
x = 2/3 we obtain the harmonic series, which diverges.
In conclusion, the power series converges for x ∈ [0,2/3) and diverges for all x outside
of this interval. In particular, the convergence is absolute for all x ∈ (0,2/3).
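The three kinds of behaviour found in Example F.9 can be made visible with a small experiment. The following sketch (the function name `partial_sum` is our own) prints partial sums at an interior point and at the two endpoints; it illustrates the conclusion but of course proves nothing.

```python
import math

def partial_sum(x: float, n: int) -> float:
    """n'th partial sum of sum_{k>=1} (3x - 1)^k / (k + 1)."""
    y = 3 * x - 1
    return sum(y ** k / (k + 1) for k in range(1, n + 1))

for n in (10, 100, 1000, 10000):
    print(n,
          partial_sum(0.5, n),    # interior point: settles almost immediately
          partial_sum(0.0, n),    # endpoint x = 0: alternating, converges slowly
          partial_sum(2 / 3, n))  # endpoint x = 2/3: grows without bound
```

The last column grows roughly like log n, as expected for the harmonic series.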
Let us now make some observations with respect to the above example. First of all,
the power series from the example is centered at x = 1/3. This is seen by writing
\[
\sum_{k=1}^{\infty} \frac{(3x - 1)^k}{k + 1} = \sum_{k=1}^{\infty} \frac{3^k}{k + 1} (x - 1/3)^k.
\]
Next, observe that the power series converges in the interval [0,2/3). While this interval
is a fairly ugly one, it has some important features:
• this interval is centered at the same point as the power series (that is, at x = 1/3),
• the power series converges absolutely for all inner points of the interval,
• the power series converges conditionally on one end-point (x = 0), and diverges at
the other (x = 2/3).
there exists a number R ≥ 0 called the radius of convergence (which we also allow to be
∞) such that:
(i) the power series converges precisely on an interval with radius R ≥ 0 centered at
x = a (we call this the interval of convergence of the power series),
(ii) the power series converges absolutely on all interior points of this interval,
(iii) the power series may or may not converge (conditionally or absolutely) on its
endpoints.
Proof. To prove (i) and (ii), it suffices to prove that if the power series converges at a
point $x_0 \neq a$, then it converges absolutely for all x such that
\[
|x - a| < |x_0 - a|
\]
(that is, it will converge absolutely at all points that are strictly closer to a than $x_0$).
Indeed, this shows that the convergence has to be on an interval, and that at the inner
points, we have absolute convergence. As we saw in the above example, it is possible to
have both divergence and conditional convergence taking place at endpoints (we leave it
to the reader to find an example of a power series that has absolute convergence at at
least one endpoint). So, suppose the power series converges at $x_0 \neq a$. That is,
\[
\sum_{k=0}^{\infty} c_k (x_0 - a)^k
\]
Finally, by inequality (F.3), on the previous page, we conclude that the series on the
right-hand side is a convergent Geometric series, and we are done.
We remark that since absolute convergence in some sense is "better" than conditional
convergence, this means that a power series has "better" convergence behaviour in the
interior than on the endpoints of its interval of convergence. Starting on the next page,
we shall show that power series converge in an even nicer way if we keep a
positive distance to the endpoints of the interval of convergence.
Exercise F.13 Not every problem in the known universe can be solved using Taylor
series. Consider
\[
f(x) = \begin{cases} e^{-1/x^2} & x \neq 0 \\ 0 & x = 0 \end{cases}
\]
(a) Show that f(x) is continuous at x = 0.
(b) Use induction to determine $f^{(n)}(0)$ for all n. What does this say about $T_n(x)$?
Definition F.14 (Uniform convergence for power series) We say that a power
series
\[
\sum_{k=0}^{\infty} c_k (x - a)^k
\]
converges uniformly on an interval I if, for all ε > 0 there exists a number N so that for
all x ∈ I we have
\[
n > N \implies \Bigl| \underbrace{\sum_{k=0}^{\infty} c_k (x - a)^k - \sum_{k=0}^{n} c_k (x - a)^k}_{\text{this is the n'th tail}} \Bigr| < \epsilon.
\]
Before considering an example where we use the above definition, let us immediately
point out that the above definition is not really about power series, or even about series
at all.
Note that definition F.14 follows from Definition F.15 by letting the gn be the n’th
partial sum of the power series. Also, note that in the above definition, we mean that
the gn are defined at least on the set I (which does not need to be an interval at all).
1
For this reason our "normal" continuity is often referred to as pointwise continuity, and the "normal"
convergence of a power series is often referred to as pointwise convergence.
F.2. UNIFORM CONVERGENCE AND CONTINUITY OF POWER SERIES F-9
Example F.16 Let us again consider the power series from Example F.9,
\[
\sum_{k=1}^{\infty} \frac{(3x - 1)^k}{k + 1}.
\]
As we saw above, it converges on the interval [0,2/3). Let us now show that it converges
uniformly on the interval [1/6,3/6]. That is, for ε > 0 given, our goal is to prove the
existence of a number N so that for n > N we have
\[
\Bigl| \sum_{k=n+1}^{\infty} \frac{(3x - 1)^k}{k + 1} \Bigr| < \epsilon.
\]
Next, notice that the assumption x ∈ [1/6,3/6] implies that x − 1/3 ∈ [−1/6, 1/6]. That
is,
\[
x \in [1/6,3/6] \implies |3x - 1| \leq 3 \cdot \frac{1}{6} = \frac{1}{2}.
\]
So, if we apply the triangle inequality to the infinite series, we obtain the following chain
of inequalities:
\[
\Bigl| \sum_{k=n+1}^{\infty} \frac{(3x - 1)^k}{k + 1} \Bigr| \leq \sum_{k=n+1}^{\infty} \frac{1}{2^k (k + 1)} \leq \sum_{k=n+1}^{\infty} \frac{1}{2^k} = \frac{1}{2^n}.
\]
Finally, solving $1/2^n < \epsilon$ shows that when n > N with N = −log ε/ log 2, the above
expression is less than ε for all x ∈ [1/6,3/6], and we are done!
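The uniform bound from Example F.16 can be checked numerically on a grid. In the sketch below, the names `tail` and `sup_tail`, the grid size and the truncation after 200 terms are all our own choices; since |3x − 1| ≤ 1/2 on the interval, the truncation error is negligible.

```python
def tail(x: float, n: int, terms: int = 200) -> float:
    """Approximate n'th tail sum_{k>n} (3x-1)^k/(k+1), truncated after `terms` terms."""
    y = 3 * x - 1
    return sum(y ** k / (k + 1) for k in range(n + 1, n + 1 + terms))

def sup_tail(n: int, a: float = 1 / 6, b: float = 1 / 2, grid: int = 101) -> float:
    """Largest |tail| over an evenly spaced grid on [a, b]."""
    xs = [a + (b - a) * i / (grid - 1) for i in range(grid)]
    return max(abs(tail(x, n)) for x in xs)

for n in (1, 5, 10, 20):
    print(n, sup_tail(n), 0.5 ** n)  # observed largest tail vs. the bound 1/2^n
```

The point of uniform convergence is visible here: a single n makes the tail small at every grid point simultaneously.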
Exercise F.17 We consider again the same power series as in the above example.
(a) Show that the power series converges uniformly on the even larger interval [1/9, 5/9].
(b) (Challenge) The power series does not converge uniformly on the full interval
[0,2/3). Show that this is the case.
Exercise F.18 On what intervals does the Taylor series for exp(x) converge uniformly?
Is there any interval where this series does not converge uniformly?
Proof. Let c ∈ I and ε > 0 be given. Our goal is to find some δ > 0 so that for all x ∈ I,
we have
\[
|x - c| < \delta \implies |g(x) - g(c)| < \epsilon.
\]
Our first observation is that since $g_n$ converges to g uniformly on I, it follows that there
exists a number N so that we have
\[
n > N \implies \begin{cases} |g_n(c) - g(c)| < \epsilon/3 \\ |g_n(x) - g(x)| < \epsilon/3 \end{cases},
\]
While it may not be immediately clear, we have now won! This δ is exactly the one we
are looking for. Indeed, if |x − c| < δ and n = N + 1, then the following computation is
justified:
Exercise F.20 Show that if $f_k(x)$ is a sequence of continuous functions
on some interval I, and if $\sum_{k=0}^{\infty} f_k(x)$ converges uniformly on I, then this infinite series is
continuous on I.
Hint: Apply the above proposition to the sequence of partial sums $\sum_{k=0}^{n} f_k(x)$.
converges uniformly on [1/6,3/6]. Since each partial sum of this power series is a finite
sum of continuous functions, and therefore itself continuous, it follows from Proposition
F.19 that the function defined by the power series is continuous on [1/6,3/6].
Now, in the above examples, we showed that the power series we studied converges
uniformly on [1/6,3/6] and is therefore continuous on this interval. However, in exercise
F.17, you were asked to show that it converges uniformly on the larger interval [1/9,5/9],
and therefore is also continuous there. The next result shows how far we can push this.
Proposition F.22 Power series converge uniformly at all inner points of their domain
of convergence.
While it is not that hard to give a direct proof of this proposition, we choose to first
establish a lemma which is important in its own right, as it gives a seemingly modest,
but quite useful, convergence test to establish uniform convergence.
In combination, propositions F.19 and F.22, yield most of the following result.
Proposition F.26 Power series are continuous at all points in their domain of conver-
gence.
Exercise F.27 (Challenge) Complete the proof of the above proposition. That is,
prove that if a power series converges at an endpoint of its domain of convergence,
then it is continuous there.
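\[
e^x = 1 + x + \frac{x^2}{2} + \frac{x^3}{3!} + \frac{x^4}{4!} + \cdots.
\]
Are we allowed to differentiate the infinite sum? Well, we do not know this yet, but let
us take a look at what happens if we differentiate both sides of this identity:
\[
\begin{aligned}
\frac{d}{dx} e^x &= \frac{d}{dx}\Bigl(1 + x + \frac{x^2}{2} + \frac{x^3}{3!} + \frac{x^4}{4!} + \cdots\Bigr) \\
&= \frac{d}{dx} 1 + \frac{d}{dx} x + \frac{d}{dx} \frac{x^2}{2} + \frac{d}{dx} \frac{x^3}{3!} + \frac{d}{dx} \frac{x^4}{4!} + \cdots \\
&= 0 + 1 + x + \frac{x^2}{2} + \frac{x^3}{3!} + \cdots = e^x.
\end{aligned}
\]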
Nice, no? :-)
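For a finite number of terms, the computation above is just the statement that differentiating the n'th Taylor polynomial of $e^x$ gives the (n−1)'th one. A minimal sketch with exact rational arithmetic (the helper names are our own) confirms this for a truncated series; it says nothing yet about whether the infinite sum may be differentiated term by term.

```python
from fractions import Fraction
from math import factorial

def exp_coeffs(n: int) -> list[Fraction]:
    """Coefficients 1/k! (k = 0..n) of the n'th Taylor polynomial of e^x."""
    return [Fraction(1, factorial(k)) for k in range(n + 1)]

def differentiate(coeffs: list[Fraction]) -> list[Fraction]:
    """Differentiate a polynomial term by term: sum c_k x^k -> sum k c_k x^(k-1)."""
    return [k * c for k, c in enumerate(coeffs)][1:]

n = 8
print(differentiate(exp_coeffs(n)) == exp_coeffs(n - 1))  # True
```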
Exercise F.29 Differentiate the formula (F.2) for the arctangent term-by-term. What
does this say about the value of the derivative of arctan(x) at x = 1? Does this equal
the value of $(\arctan x)' = 1/(1 + x^2)$ at x = 1?
The point of the above example and exercise is to show that, while it is natural to want
to differentiate an infinite sum of functions term by term, doing so can also be
dangerous. That is, we need some type of proposition to tell us when we are allowed
to write
\[
\frac{d}{dx} \sum_{k=0}^{\infty} f_k(x) = \sum_{k=0}^{\infty} \frac{d}{dx} f_k(x), \quad\text{and} \tag{F.4}
\]
\[
\int_a^x \Bigl( \sum_{k=0}^{\infty} f_k(t) \Bigr) dt = \sum_{k=0}^{\infty} \Bigl( \int_a^x f_k(t)\, dt \Bigr). \tag{F.5}
\]
F.3. WHEN CAN WE INTEGRATE AND DIFFERENTIATE POWER SERIES?F-13
Proof. Let us denote the uniform limit of the sequence $g_n$ on I by g. We first
observe that since a uniform limit of continuous functions is itself continuous, it follows
that g is integrable on I. Next, let ε > 0 be given. Since $g_n$ converges uniformly to g,
there exists an N so that for n > N, we have
\[
|g_n(x) - g(x)| < \frac{\epsilon}{|I|} \quad \forall x \in I.
\]
as required.
Exercise F.31 Use the above proposition to prove that if $f_k$ is a sequence of functions
continuous on a closed interval I, and if $\sum_{k=0}^{\infty} f_k(x)$ converges uniformly on I, then
(F.5) holds.
Hint: Apply the above proposition to the sequence of partial sums $\sum_{k=0}^{n} f_k(x)$.
Exercise F.32 Explain how the above result applies to power series.
Exercise F.34 (Challenge) Combine Proposition F.30 with the Fundamental theorem
of calculus to prove the above proposition.
Hint: $\int_a^x g_n'(t)\, dt = g_n(x) - g_n(a)$.
Exercise F.35 Use the above proposition to prove the following: let $f_k$ be a sequence of
functions differentiable on some interval I. If the infinite series $\sum_{k=0}^{\infty} f_k(x)$ converges on at least
one point $x_0 \in I$, and the infinite series $\sum_{k=0}^{\infty} f_k'(x)$ converges uniformly on I, then
(F.4) holds.
Exercise F.36 Explain how the above result applies to power series.
Exercise F.37 In this exercise you are to explore a situation where we are not allowed
to interchange the order of limits and the derivative. Specifically, let $f_k(x) = \sin(kx)/k$,
and show that in this case
\[
\lim_{k\to\infty} \frac{d}{dx} f_k(x) \neq \frac{d}{dx} \lim_{k\to\infty} f_k(x).
\]
Exercise F.38 One application of the above proposition is to determine the sum of
certain power series. In particular, can you suggest, and prove, a formula for the
function defined by the power series
\[
\sum_{k=1}^{\infty} k x^{k-1}.
\]
F.4. SUMMARY ON HOW TO DEFINE THE ELEMENTARY FUNCTIONS F-15
As we have pointed out, a critical defect of any pretense for a "rigorous" construction
of the theory of these functions is that for all – except the polynomials and rational
functions – we have based ourselves on definitions based on geometric considerations.
We now discuss and summarise how all definitions can now be put on a safe foundation,
and how to obtain their fundamental properties.
We begin with the logarithm. Now, the definition for the logarithm is on firm foundation,
since the geometric definition we used in Chapter 2 can be formulated in terms of a
definite integral. 3
In particular, it follows more or less immediately from this definition that the loga-
rithm has the following properties:
Let us move on to the exponential function. Now, since the logarithm has a strictly
positive derivative on its domain, it follows that it is an invertible function. We call this
inverse exp(x). Here is how we can now obtain all the fundamental properties we need
of this function.
• By its definition, we obtain that the exponential function satisfies exp(0) = 1, has
domain ℝ and range (0,∞).
• Since the exponential function is the inverse function of a continuous function
defined on an interval, it is itself continuous.
• By what we did in exercise 8.48, which only used the properties for the logarithm
listed above, we know that $(\exp x)' = \exp x$.
• In Proposition W2.33 (in the suggested changes document), we explained how all
the laws for the exponential function follow from the laws of the logarithm. In
particular, we explained why $\exp(x) = e^x$ with $e \overset{\text{def}}{=} \exp(1)$.
• By what we did in Example F.1, which only used what we have discussed above,
we know that
\[
\exp(x) = \sum_{k=0}^{\infty} \frac{x^k}{k!}.
\]
• Using the Taylor series for the exponential function, we can compute the value
$e \overset{\text{def}}{=} \exp(1)$ to any desired accuracy. In fact, this approximation formula is so good
that to get at least 16 correct decimals, we only need the partial sum for which
$k! \geq 10^{16}$, which happens when k = 19.
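The claim about computing e can be tried out directly. The sketch below sums the series exactly with rational numbers up to k = 19 and prints the result next to the library constant. The error bound uses the Lagrange form of the error together with the rough estimate $e^c < 3$ for 0 < c < 1, which is an assumption we add here for illustration; the function name is also our own.

```python
from fractions import Fraction
from math import e, factorial

def e_partial_sum(n: int) -> Fraction:
    """Exact partial sum sum_{k=0}^{n} 1/k! of the Taylor series for exp(1)."""
    return sum((Fraction(1, factorial(k)) for k in range(n + 1)), Fraction(0))

n = 19
approx = e_partial_sum(n)
# Lagrange error term: e^c / (n+1)! with 0 < c < 1; we use the rough estimate e^c < 3.
bound = Fraction(3, factorial(n + 1))
print(float(approx), e, float(bound))
```

Since `approx` is an exact fraction, it can be converted to as many decimals as the bound justifies, for instance with the `decimal` module.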
Finally, we explain how we can use the above to extend the exponential function to
the complex plane, thus obtaining the definition of the complex exponential. Indeed,
while we, in Chapter 2, defined the complex exponential in terms of the sine and cosine,
we can use the Taylor series of the exponential function to give a direct definition. To
prepare for this, we prove the following result.
Proof. The point of the proof is that absolute convergence also works when we sum up
complex numbers. That is, if $c_k$ is a sequence of complex numbers, then
\[
\sum_{k=0}^{\infty} |c_k| < \infty \implies \sum_{k=0}^{\infty} c_k \ \text{converges}.
\]
Since this is a rather modest extension, we leave it as an exercise below. Here, the point
is simply to notice that since, for any complex number z, we have $|z^k| = |z|^k$, we now
obtain
\[
\sum_{k=0}^{\infty} \frac{|z|^k}{k!} < \infty \implies \sum_{k=0}^{\infty} \frac{z^k}{k!} \ \text{converges}.
\]
Not surprisingly, we can obtain all properties of the complex exponential from this
definition. Since this can be a bit tedious, let us restrict ourselves to proving one such
property (that we will need below, when discussing the trigonometric functions).
Proof. To prove the above formula, let us work on the n’th partial sum for exp(z + w):
\[
\sum_{k=0}^{n} \frac{(z + w)^k}{k!} = \sum_{k=0}^{n} \frac{1}{k!} \sum_{j=0}^{k} \binom{k}{j} z^j w^{k-j} = \sum_{k=0}^{n} \sum_{j=0}^{k} \frac{z^j w^{k-j}}{j!(k - j)!}
\]
\[
\cdots = \sum_{j=0}^{n} \sum_{k=j}^{n} \frac{z^j w^{k-j}}{j!(k - j)!} = \sum_{j=0}^{n} \sum_{k=0}^{n-j} \frac{z^j w^k}{j!\,k!}.
\]
Notice that in the last step, we made a change of summation index in the inner sum,
replacing k − j by a new index (which we again denote by k). Finally, by taking the limit
as $n \to \infty$ on both sides of our computation, we arrive at
\[
\sum_{k=0}^{\infty} \frac{(z + w)^k}{k!} = \sum_{j=0}^{\infty} \sum_{k=0}^{\infty} \frac{z^j w^k}{j!\,k!} = \Bigl( \sum_{j=0}^{\infty} \frac{z^j}{j!} \Bigr) \Bigl( \sum_{k=0}^{\infty} \frac{w^k}{k!} \Bigr).
\]
(This is a bit dodgy since n occurs in two different places, making the "partial sum"
on the right-hand side into not really being a partial sum. We ask you to fix this in an
exercise below.)
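The identity exp(z + w) = exp(z) exp(w) for the series definition can be checked numerically for sample values. In this sketch, the function name `exp_series`, the cut-off at 60 terms and the sample points z and w are all arbitrary choices of ours; the library function `cmath.exp` appears only as an independent reference.

```python
import cmath

def exp_series(z: complex, n: int = 60) -> complex:
    """Partial sum sum_{k=0}^{n} z^k / k! of the complex exponential series."""
    total, term = 0j, 1 + 0j        # term holds z^k / k!
    for k in range(n + 1):
        total += term
        term *= z / (k + 1)
    return total

z, w = 1.0 + 2.0j, -0.5 + 0.75j
lhs = exp_series(z + w)
rhs = exp_series(z) * exp_series(w)
print(abs(lhs - rhs))               # tiny: only rounding and truncation error
print(abs(lhs - cmath.exp(z + w)))  # comparison with the library exponential
```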
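Exercise F.44 (Challenge) Show that if $c_{k,j}$ is absolutely convergent, in the sense
that
\[
\sum_{j=0}^{\infty} \sum_{k=0}^{\infty} |c_{k,j}| < \infty,
\]
then
\[
\lim_{n\to\infty} \sum_{j=0}^{n} \sum_{k=0}^{n-j} c_{k,j} = \lim_{n\to\infty} \sum_{j=0}^{n} \sum_{k=0}^{n} c_{k,j}.
\]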
Remark: This exercise looks worse than it is. The point is just to use the absolute
convergence to show that "the tail" will be as small as you want.
Remark F.45 The above exercise can be extended in the sense that if the double sum
converges absolutely, then all ways of summing that double series will converge and give
the same result.
Remark F.46 The scheme for extending the power series of the exponential function
to the complex plane works for all power series. Indeed, since all power series converge
absolutely on interior points of their intervals of convergence, it follows by the same
argument that every power series converges on a disc in the complex plane with the same
center and radius as its interval of convergence.
and
\[
\cos x = \sum_{k=0}^{\infty} \frac{(-1)^k x^{2k}}{(2k)!}.
\]
(Notice that we do not need to make any new definition of the tangent, since, as before,
we just put tan x = sin x/ cos x.)
Let us now indicate how to collect all fundamental facts on the sine and cosine
starting from the above definition.
• The domains of the sine and cosine are ℝ, and sin 0 = 0 and cos 0 = 1.
• sin x is an odd function and cos x is an even function.
• Since the sine and cosine converge uniformly on all finite intervals in ℝ, it follows
we can differentiate their defining power series term by term. This yields $(\sin x)' = \cos x$
and $(\cos x)' = -\sin x$.
• As we saw in exercise 9.6, the Pythagorean identity $\cos^2 x + \sin^2 x = 1$ follows by
the above points using the main corollary of the Mean value theorem.
• By comparing the above power series to that of the complex exponential, we get
exp(ix) = cos x + i sin x.
• By equating the real and imaginary parts of the left and right-hand sides of the
identity exp(i(a + b)) = exp(ia) exp(ib), the addition formulas of the sine and
cosine follow immediately.
• By combining the Pythagorean identity with the addition formulas, we obtain the
double-angle and half-angle formulas.
• By using the error term estimate of the alternating series test, we can check that
cos(2) < 0. By the intermediate value theorem, this implies that there exists at
least one point c ∈ (0,2) so that cos(c) = 0. Let c be the smallest such point, and
define π = 2c.
• Since cos(x) is strictly positive on (0,⇡/2), it follows that sin x is strictly growing
on [0,⇡/2]. In particular, this implies that sin(⇡/2) is positive, and we can then
deduce from the Pythagorean identity that sin(⇡/2) = 1.
• Using the addition formula in combination with the values of the sine and cosine at
⇡/2, we can obtain all translation formulas for the sine and cosine involving ⇡/2,
⇡ and 2⇡.
• In particular, from the above, we find that cosine is strictly positive on ( ⇡/2,⇡/2)
and and that the sine is strictly positive on (0,⇡). By the differentiation formulas,
this implies that the sine is strictly increasing on [ ⇡/2,⇡/2] and that the cosine
is strictly decreasing on [0,⇡].
• The values of the sine and cosine at $\pi/3$ follow by using the above to play around
with the expression $\sin(3x)$. The values at $\pi/6$ can then be deduced.
• It follows from the Pythagorean identity that the ranges of the sine and cosine
are contained in $[-1,1]$. Using the fact that $\sin(\pm\pi/2) = \pm 1$, $\cos(0) = 1$ and
$\cos(\pi) = -1$ in combination with the intermediate value theorem, we obtain that their
ranges are exactly $[-1,1]$.
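As a worked check of the claim in the list above that $\cos(2) < 0$, here is a sketch of the estimate. It assumes that the cosine is defined by the usual power series $\cos x = \sum_{k \ge 0} (-1)^k x^{2k}/(2k)!$; the symbol $a_k$ is introduced only for this illustration.

```latex
\[
\cos 2 = \sum_{k=0}^{\infty} (-1)^k a_k, \qquad a_k = \frac{2^{2k}}{(2k)!},
\qquad a_0 = 1,\ a_1 = 2,\ a_2 = \tfrac{2}{3},\ a_3 = \tfrac{4}{45},\ \dots
\]
Since
\[
\frac{a_{k+1}}{a_k} = \frac{4}{(2k+1)(2k+2)} < 1 \quad \text{for } k \ge 1,
\]
the terms $a_1 > a_2 > a_3 > \cdots$ decrease to $0$. Applying the alternating
series estimate to the tail starting at $k = 1$ gives
\[
a_1 - a_2 + a_3 - \cdots \ \ge\ a_1 - a_2 = 2 - \tfrac{2}{3} = \tfrac{4}{3},
\]
and therefore
\[
\cos 2 = 1 - (a_1 - a_2 + a_3 - \cdots) \ \le\ 1 - \tfrac{4}{3} = -\tfrac{1}{3} < 0.
\]
```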
Above, you should note one thing that might strike you as shocking: we are defining
$\pi$ to be twice the smallest positive zero of the cosine. Below, we point out an
algorithm for computing its value to as high a level of accuracy as we would ever want.
Also, we point out that since all properties of the tangent are derived from its definition
$\tan(x) = \sin(x)/\cos(x)$, as soon as we know everything about the cosine and sine
functions, we also obtain all facts about the tangent function.
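To make the definition of $\pi$ as twice the smallest positive zero of the cosine concrete, the following is a minimal numerical sketch. It is not the algorithm the notes refer to; it is plain bisection on $[0,2]$, and it assumes that the cosine is given by the power series $\sum_k (-1)^k x^{2k}/(2k)!$ and has exactly one zero in $[0,2]$. The names `cos_series` and `smallest_zero_of_cos` are made up for this illustration.

```python
import math


def cos_series(x, terms=30):
    """Partial sum of the cosine power series sum_k (-1)^k x^(2k) / (2k)!."""
    total = 0.0
    term = 1.0  # the k = 0 term
    for k in range(terms):
        total += term
        # Ratio of consecutive terms: -x^2 / ((2k+1)(2k+2)).
        term *= -x * x / ((2 * k + 1) * (2 * k + 2))
    return total


def smallest_zero_of_cos(tol=1e-12):
    """Bisection on [0, 2], using cos(0) = 1 > 0 and cos(2) < 0.

    Assumption: the cosine has exactly one zero in [0, 2], so the zero
    found here is the smallest positive one.
    """
    lo, hi = 0.0, 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if cos_series(mid) > 0:
            lo = mid  # the zero lies to the right of mid
        else:
            hi = mid  # the zero lies at or to the left of mid
    return (lo + hi) / 2


pi_estimate = 2 * smallest_zero_of_cos()
print(pi_estimate, abs(pi_estimate - math.pi))
```

The comparison with `math.pi` is only a sanity check; the estimate itself uses nothing but the power series.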
But before we do this, let us now turn to the inverse trigonometric functions. By
the list of points above, it follows that the cosine is invertible if its domain is restricted
to $[0,\pi]$ and that the sine is invertible if its domain is restricted to $[-\pi/2,\pi/2]$,
and that when restricted to these domains, these functions retain their full range. Similarly,
we obtain that the tangent function is invertible when restricted to $(-\pi/2,\pi/2)$ with
range $\mathbb{R}$. Now, since we have already, in earlier chapters, explained how to obtain
all properties of the inverse trigonometric functions from the sine, cosine and tangent, we
need not repeat this here. However, we do point out the Taylor series
\[
\arctan x = \sum_{k=0}^{\infty} (-1)^k \frac{x^{2k+1}}{2k+1}, \qquad x \in [-1,1],
\]
(see exercise 13.56 and Example F.3 for an explanation of this). Since $\arctan(1) = \pi/4$,
it follows that
\[
\pi = 4 \sum_{k=0}^{\infty} \frac{(-1)^k}{2k+1} = 4\Big(1 - \frac{1}{3} + \frac{1}{5} - \cdots\Big).
\]
Unfortunately, while this formula for ⇡ is very pretty, it converges rather slowly, meaning
we have to compute a lot of terms to get a reasonable accuracy for ⇡. But, hey, at least
it is an alternating series, and the error is therefore easily tracked.
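To illustrate both the slow convergence and the easy error tracking, here is a small sketch that sums the first $n$ terms of the series and reports the alternating series bound, namely the size of the first omitted term, $4/(2n+1)$. The function name `leibniz_pi` is an assumption made for this example.

```python
import math


def leibniz_pi(n_terms):
    """Approximate pi by 4 * sum_{k < n_terms} (-1)^k / (2k + 1).

    Returns the approximation together with the alternating series
    error bound, which is the absolute value of the first omitted term.
    """
    partial = 0.0
    for k in range(n_terms):
        partial += (-1) ** k / (2 * k + 1)
    bound = 4 / (2 * n_terms + 1)
    return 4 * partial, bound


for n in (10, 1000, 100000):
    approx, bound = leibniz_pi(n)
    # The true error (measured against math.pi) stays below the bound.
    print(n, approx, abs(approx - math.pi), bound)
```

Since the bound only decays like $1/n$, each additional correct decimal costs roughly ten times as many terms.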
F.5. POWER SERIES AND DIFFERENTIAL EQUATIONS F-21
Example F.48 (Airy’s equation) To give an idea of what we can accomplish using
power series, we consider a famous equation from physics known as Airy’s equation:
\[
y'' - xy = 0.
\]
Why would we be interested in the Airy equation? Well, the following quote is taken
from Wikipedia (and is meant to make this equation sound important):
[The function solving the Airy equation] is the solution to Schrödinger’s equation for a
particle confined within a triangular potential well, [it] underlies the form of the intensity
near an optical directional caustic, [and] is also important in microscopy and astronomy;
it describes the pattern, due to diffraction and interference, produced by a point source
of light.
OK, so how do we solve Airy’s equation? The first observation is that the Airy equation
is a homogeneous and linear differential equation of order 2 – but with non-constant
coefficients! For this reason, none of the solution methods we have seen for differential
equations, so far, will work. This means we need a new idea. So, let us try to "genetically
engineer" (!) a solution of the form
\[
y(x) = \sum_{k=0}^{\infty} c_k x^k.
\]
This is actually a very good strategy, which (almost immediately) leads to the two
(linearly independent) solutions
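\[
y_0(x) = 1 + \frac{x^3}{2\cdot 3} + \frac{x^6}{2\cdot 3\cdot 5\cdot 6}
       + \frac{x^9}{2\cdot 3\cdot 5\cdot 6\cdot 8\cdot 9} + \cdots
\]
and
\[
y_1(x) = x + \frac{x^4}{3\cdot 4} + \frac{x^7}{3\cdot 4\cdot 6\cdot 7}
       + \frac{x^{10}}{3\cdot 4\cdot 6\cdot 7\cdot 9\cdot 10} + \cdots .
\]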
As in the case of 2nd order linear differential equations with constant coefficients, the
solution space here is still linear and 2-dimensional, so all solutions are of the form
\[
y = A y_0 + B y_1 .
\]
While the solution method of using Taylor series should be understandable for students
having read this far in the lecture notes, we do not include the details as an act of mercy.
Instead, we refer the interested student to exercise F.51, below, and/or the YouTube
video https://www.youtube.com/watch?v=0jnXdXfIbKk for the details.
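For readers who want to check the coefficients of $y_0$ and $y_1$ mechanically, here is a minimal sketch. It assumes the recurrence obtained by inserting $y(x) = \sum_k c_k x^k$ into $y'' - xy = 0$ and comparing coefficients of $x^{k-2}$, namely $c_2 = 0$ and $k(k-1)\,c_k = c_{k-3}$ for $k \ge 3$. The function name `airy_coefficients` is hypothetical, and exact rational arithmetic is used so that the results can be compared with the displayed fractions.

```python
from fractions import Fraction


def airy_coefficients(c0, c1, n):
    """Coefficients c_0, ..., c_{n-1} (n >= 2) of a series solving y'' - x*y = 0.

    Assumed recurrence: c_2 = 0 and c_k = c_{k-3} / (k * (k - 1)) for k >= 3.
    """
    c = [Fraction(0)] * n
    c[0], c[1] = Fraction(c0), Fraction(c1)
    # c[2] stays 0: the constant term of y'' is 2*c_2, while x*y has none.
    for k in range(3, n):
        c[k] = c[k - 3] / (k * (k - 1))
    return c


y0 = airy_coefficients(1, 0, 10)  # y(0) = 1, y'(0) = 0
y1 = airy_coefficients(0, 1, 11)  # y(0) = 0, y'(0) = 1

# Print the nonzero coefficients as (power of x, coefficient) pairs.
print([(k, ck) for k, ck in enumerate(y0) if ck != 0])
print([(k, ck) for k, ck in enumerate(y1) if ck != 0])
```

The denominators produced this way are the products $2\cdot 3$, $2\cdot 3\cdot 5\cdot 6$, and so on for $y_0$, and $3\cdot 4$, $3\cdot 4\cdot 6\cdot 7$, and so on for $y_1$.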
Exercise F.49 Suppose that $y(x) = \sum_{k=0}^{\infty} c_k x^k$ satisfies the initial value
problem
\[
\begin{cases}
y' = y \\
y(0) = 1
\end{cases}
\]
(b) Use the relation $y' = y$ to show that we must have the relation
\[
c_{k+1} = \frac{c_k}{k+1}, \qquad \forall k \ge 0.
\]
(c) Use the above, in combination with the initial condition $y(0) = 1$, to determine
the values of all $c_k$. Do you recognise the function $y(x)$? Is this surprising?
Exercise F.50 In this exercise, we invite you to repeat the procedure from the above
example to find a solution to the initial value problem
\[
\begin{cases}
y'' + y = 0 \\
y(0) = 0 \\
y'(0) = 1
\end{cases}
\]
Exercise F.51 Repeat the procedure from the above exercise to solve Airy’s equation.