Calc1 2 Introduction To Differentiation
Contents
2 Introduction to Differentiation
2.1 Motivation: Rates and Tangent Lines
2.7.2 Differentials
2 Introduction to Differentiation
In this chapter, we develop the derivative along with techniques for differentiation. We begin by defining the
derivative as a limit and then use the definition to compute derivatives of simple functions. We then prove the basic
rules (such as the Product Rule and the Chain Rule) that will allow us to compute more complicated derivatives,
and mention additional time-saving techniques such as logarithmic differentiation. We then discuss implicit relations
and implicit functions (and how to compute their derivatives), and problems involving related rates and the idea
of the linearization of a function.
◦ For example, we might want to know how fast a ball falls, how fast a chemical reaction occurs, how fast
a population is growing.
◦ Or we might be interested in how fast our car is going, how fast money is accruing in our bank accounts,
and how fast we are learning something.
◦ We would like a way to quantify this idea precisely.
• The derivative of a function captures how fast that function is changing, at the point we measure it. In other
words, the derivative measures the rate of change of a function.
◦ Think of looking at a car's speedometer: it measures how fast the car is moving at the specific instant
the speedometer is read.
• One notion we already can talk about is average velocity: [average velocity]=[total distance] / [total time].
◦ Example: A world-class sprinter can run the 100m dash in 10 seconds. Over that time, the sprinter is
traveling at an average speed of 10 meters per second.
• However, we are not asking for an average rate of change; we want to know the instantaneous rate of change,
which is the velocity at a specific instant, rather than over a time interval.
• One thing we might try is to calculate average rates of change over smaller and smaller intervals around our
point of interest, to see if these numbers approach some value.
◦ If the position function is given by $f(t)$, and we want to look over the time interval $[t, t + \Delta t]$ where $\Delta t$ is some very small time increment, then the average velocity over that time interval is
$$[\text{average velocity}] = \frac{[\text{end position}] - [\text{start position}]}{[\text{total time}]} = \frac{f(t + \Delta t) - f(t)}{\Delta t}.$$
◦ We would like to examine this difference quotient as we make the time interval $\Delta t$ smaller and smaller.
We can do this precisely using limits, and this leads us to the formal definition of the derivative.
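The idea of shrinking the time interval can be tried numerically. The following is a minimal sketch, assuming a hypothetical position function $f(t) = t^2$ (chosen only for illustration; it does not come from the notes) and hypothetical helper names:

```python
def average_velocity(f, t, dt):
    """Average velocity of the position function f over [t, t + dt]."""
    return (f(t + dt) - f(t)) / dt


def position(t):
    # Hypothetical position function used for illustration: f(t) = t^2
    return t ** 2


# Shrink the time increment and watch the average velocities settle down.
for dt in [1.0, 0.1, 0.01, 0.001]:
    print(dt, average_velocity(position, 3.0, dt))
```

For this choice of $f$ the difference quotient equals $6 + \Delta t$, so the printed values approach 6 as $\Delta t$ shrinks, which is the instantaneous velocity at $t = 3$.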
• Another important use of the derivative is in finding the equation for the line tangent to a curve at a
point.
◦ If f (x) is a nice function, then the tangent line to the graph y = f (x) at the point (a, f (a)) is the line
passing through (a, f (a)) that just touches the graph at (a, f (a)), and, near that point of tangency, is
the line which is closest to the graph of y = f (x).
◦ Here is a graph of $y = 2x - x^3$ along with two tangent lines to the curve:
◦ To specify the equation of a line requires two pieces of information: the slope of the line and one point
that the line passes through.
◦ The tangent line to y = f (x) at x = a, by definition, passes through the point (a, f (a)), so in order to
determine the tangent line, it is enough to find the slope of the tangent line.
◦ One way we can try to compute the slope of the tangent line is by approximating the tangent line with
secant lines passing through two points of the graph of y = f (x): through (a, f (a)) and another nearby
point (a + h, f (a + h)), where h is very small.
◦ The slope of the secant line through (a, f (a)) and (a + h, f (a + h)) is, by the slope formula, given by
$$\frac{f(a + h) - f(a)}{(a + h) - a} = \frac{f(a + h) - f(a)}{h}.$$
◦ Then, if f behaves well enough, we should be able to find the slope of the tangent line by evaluating
the slope $\frac{f(a + h) - f(a)}{h}$ of the secant line through (a, f (a)) and (a + h, f (a + h)), as h gets closer and
closer to 0.
◦ Again, using limits, we can formalize this notion, and (once again) the answer is given by the derivative.
◦ Alternate Definition: An alternate way of writing the definition of the derivative is $f'(a) = \lim_{x \to a} \frac{f(x) - f(a)}{x - a}$.
◦ Note that the limit in the alternate definition is the same as the limit in the definition above, upon
making the change of variables $h = x - a$.
◦ A function whose derivative exists at x = a is called differentiable at x = a. If a function is differentiable
for every value of x, we say it is everywhere differentiable.
◦ Most of the functions we will encounter are differentiable everywhere that they are defined, although in
general, functions do not need to be differentiable anywhere.
• The derivative $f'(x)$ is also the slope of the tangent line to the graph of y = f (x).
◦ These are the most fundamental interpretations of the derivative: the author of these notes calls them
the calculus mantra.
• Example: Use the definition of the derivative to compute $f'(x)$ for $f(x) = x^2$, and then find an equation for
the tangent line to $y = x^2$ at x = 1.
• Non-Example: Show that the derivative of f (x) = |x| does not exist at x = 0.
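A quick numerical look at this non-example, as a sketch only (the helper name `difference_quotient` is hypothetical): the difference quotients of $|x|$ at 0 computed from the right and from the left settle on different values, so no single limit can exist.

```python
def difference_quotient(f, a, h):
    """Slope of the secant line through (a, f(a)) and (a + h, f(a + h))."""
    return (f(a + h) - f(a)) / h


steps = (0.1, 0.01, 0.001)

# Approach 0 from the right (h > 0) and from the left (h < 0).
right = [difference_quotient(abs, 0.0, h) for h in steps]
left = [difference_quotient(abs, 0.0, -h) for h in steps]

print(right)  # every entry is 1.0
print(left)   # every entry is -1.0
```

Since $|h|/h$ is $+1$ for $h > 0$ and $-1$ for $h < 0$, the two one-sided limits disagree, which is exactly why the derivative fails to exist at $x = 0$.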
• Definition: The second derivative of f, denoted $f''(x)$ or $\frac{d^2 f}{dx^2}$, is the derivative of the derivative of f.
◦ In general we mostly care about the first and second derivatives of f (x), as they have the most meaning
in physical applications.
◦ But there is no reason to stop with just the second derivative: in general, we define the nth derivative
of f, denoted $f^{(n)}(x)$ or $\frac{d^n f}{dx^n}$, to be the result of taking the derivative of f, n times.
• Example: Find the second and third derivatives of $f(x) = x^2$.
$$\frac{d}{dx}[2] = \lim_{h \to 0} \frac{2 - 2}{h} = \lim_{h \to 0} 0 = 0$$
• In the particular event that a function represents the position p(t) of an object moving along a one-dimensional
axis, with t representing time, the first and second derivatives have particularly concrete interpretations:
• Definition: If an object has position p(t) at time t, the first derivative $p'(t)$ represents the velocity of the
object, and the second derivative $p''(t)$ represents the acceleration of the object.
◦ Another way of phrasing these definitions is that velocity is the rate of change of position, and acceleration
is the rate of change of velocity.
◦ If the units of p(t) are distance, then the units of $p'(t)$ are distance per time, and the units of $p''(t)$ are
distance per time squared.
◦ For example, if p(t) is measured in meters and t is measured in seconds, then the units of $p'(t)$ are m/s
(meters per second) and the units of $p''(t)$ are m/s$^2$ (meters per second squared, or equivalently meters
per second per second).
◦ These units come directly out of the limit definition, because in the difference quotient $p'(t) = \lim_{h \to 0} \frac{p(t + h) - p(t)}{h}$,
the numerator has units of distance while the denominator has units of time. Likewise, in $p''(t) = \lim_{h \to 0} \frac{p'(t + h) - p'(t)}{h}$,
the numerator has units of distance per time, while the denominator still has units of time.
◦ Remark: Derivatives past the second derivative carry less obvious physical meaning, but the third derivative
$p'''(t)$, which represents the rate of change of acceleration, is sometimes called jerk.
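To make the velocity and acceleration interpretations concrete, here is a small sketch that approximates both with finite differences. The position function $p(t) = 4.9\,t^2$ (meters, with $t$ in seconds) and the helper names are assumptions chosen for illustration, not taken from the notes.

```python
def derivative(f, t, h=1e-4):
    """Central-difference approximation to f'(t)."""
    return (f(t + h) - f(t - h)) / (2 * h)


def second_derivative(f, t, h=1e-4):
    """Central-difference approximation to f''(t)."""
    return (f(t + h) - 2 * f(t) + f(t - h)) / (h * h)


def p(t):
    # Hypothetical position in meters at time t seconds: p(t) = 4.9 t^2
    return 4.9 * t ** 2


velocity = derivative(p, 2.0)              # units: m/s, close to 19.6
acceleration = second_derivative(p, 2.0)   # units: m/s^2, close to 9.8
print(velocity, acceleration)
```

The first quotient divides a distance by a time, and the second divides a change in velocity by a time, matching the unit bookkeeping described above.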
• Remarks on Notation: We will frequently write both $\frac{df}{dx}$ and $f'(x)$ to denote the derivative of f with respect
to x. They mean the same thing, but we will typically use $f'(x)$ since it is easier to write, and use $\frac{df}{dx}$ only
when we need to emphasize the difference quotient nature of the derivative.
◦ Another notation we will use for the derivative is $\frac{d}{dx}[f]$: the symbol $\frac{d}{d\,?}$ means take the derivative
of the thing that follows, with respect to ?. We will generally use this notation when f is something
complicated, to make it clearer what we are doing.
◦ When f is a function of a single variable, the notation $f'$ always means the derivative of f with respect
to that variable. Taking additional derivatives is denoted by adding primes: thus $f'''(x)$ is the third
derivative of f.
◦ For functions involving more than one variable, the notation $f'$ is ambiguous, and should not be used.
For such functions, the notation $f_x$ is often used (in place of $f'$) to denote the derivative of f with respect
to x.
◦ On occasion, especially in physics, the notation $\dot{f}$ is used to denote the derivative of f with respect to a
variable representing time (t), in contrast with the notation $f'$ used to denote the derivative of f with
respect to a variable representing space (x).
• Roughly speaking, a differentiable function (one whose derivative exists everywhere) behaves nicely, at least
in comparison to arbitrary functions. We have another notion of niceness (namely, continuity), and it turns
out that being differentiable is a stronger condition than being continuous:
◦ This theorem says a differentiable function is continuous. The converse of this theorem is false: there
exist functions which are continuous at a point but not differentiable there, such as |x| at x = 0.
◦ Even more pathologically, there exist functions which are continuous everywhere but differentiable nowhere.
◦ Proof: By the multiplication rule for limits, we can write
$$\lim_{x \to a} [f(x) - f(a)] = \lim_{x \to a} \left[ \frac{f(x) - f(a)}{x - a} \cdot (x - a) \right] = \lim_{x \to a} \frac{f(x) - f(a)}{x - a} \cdot \lim_{x \to a} (x - a) = f'(a) \cdot 0 = 0$$
where in the last step we used the alternate definition of the derivative $f'(a) = \lim_{x \to a} \frac{f(x) - f(a)}{x - a}$.
◦ Therefore, $\lim_{x \to a} [f(x) - f(a)] = 0$. Now since f (a) is a constant, we can pull it out of the limit and then
rearrange the equality to read $\lim_{x \to a} f(x) = f(a)$.
◦ But this says precisely that f (x) is continuous at x = a, as we wanted.
$$\frac{d}{dx}[c] = 0$$
$$\frac{d}{dx}[x^n] = n\,x^{n-1}$$
$$\frac{d}{dx}[e^x] = e^x$$
$$\frac{d}{dx}[\ln(x)] = \frac{1}{x}$$
$$\frac{d}{dx}[\sin(x)] = \cos(x)$$
$$\frac{d}{dx}[\cos(x)] = -\sin(x)$$
◦ Note 1: The formulas for the derivatives of sine and cosine require that the angle be measured in radians,
not degrees. This is the reason that we measure angles in radians: they are the most natural unit when
doing calculus.
◦ Note 2: In a similar way, the formula for the derivative of $e^x$ is surprisingly simple, while (as we will
see) the derivative of the more general exponential function $f(x) = a^x$ is $f'(x) = a^x \ln(a)$, which has
an unpleasant natural logarithm constant factor in it. This is the reason we usually prefer to use the
natural exponential $e^x$ in calculus, since its derivative is the simplest. Likewise, among all the different
logarithms, the one having the simplest derivative is the natural logarithm.
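The entries in the list of basic derivatives, along with the formula for $a^x$ mentioned in Note 2, can be spot-checked numerically. This is only a sanity check at one arbitrarily chosen point (not a proof), and the helper `derivative` is a hypothetical central-difference approximation; angles are in radians, as Note 1 requires.

```python
import math


def derivative(f, x, h=1e-6):
    """Central-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


x = 1.3  # arbitrary test point (an assumption for illustration)

# Each entry pairs a function with the value its derivative formula predicts at x.
checks = {
    "x^4": (lambda t: t ** 4, 4 * x ** 3),
    "e^x": (math.exp, math.exp(x)),
    "ln(x)": (math.log, 1 / x),
    "sin(x)": (math.sin, math.cos(x)),
    "cos(x)": (math.cos, -math.sin(x)),
    "2^x": (lambda t: 2 ** t, 2 ** x * math.log(2)),
}

for name, (f, expected) in checks.items():
    print(name, derivative(f, x), expected)
```

In each row the numerical estimate and the formula's prediction should agree to several decimal places.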
• Here is the proof of each of these rules, from the definition of the derivative:
◦ The derivative of a constant is zero, since if f (x) = c for all x, then $f'(x) = \lim_{h \to 0} \frac{c - c}{h} = \lim_{h \to 0} \frac{0}{h} = 0$.
◦ If n is a positive integer, then for $f(x) = x^n$, we have (by the alternate definition of the derivative)
$$\begin{aligned}
f'(a) &= \lim_{x \to a} \frac{x^n - a^n}{x - a} \\
&= \lim_{x \to a} \frac{(x - a)(x^{n-1} + x^{n-2} a + \cdots + x\,a^{n-2} + a^{n-1})}{x - a} \\
&= \lim_{x \to a} \left( x^{n-1} + x^{n-2} a + \cdots + x\,a^{n-2} + a^{n-1} \right) \\
&= \underbrace{a^{n-1} + a^{n-1} + \cdots + a^{n-1}}_{n \text{ terms}} \\
&= n\,a^{n-1}.
\end{aligned}$$
For other n (negative integers, rational numbers, general real numbers) the proof uses the same ideas,
but is not much more enlightening.
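◦ For $f(x) = e^x$, we have
$$\begin{aligned}
f'(x) &= \lim_{h \to 0} \frac{e^{x+h} - e^x}{h} \\
&= \lim_{h \to 0} \frac{e^x e^h - e^x}{h} \\
&= \lim_{h \to 0} \frac{e^x (e^h - 1)}{h} \\
&= e^x \cdot \lim_{h \to 0} \frac{e^h - 1}{h} \\
&= e^x
\end{aligned}$$
because $\lim_{h \to 0} \frac{e^h - 1}{h} = 1$, either by a calculation or by definition of the exponential.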
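◦ For $f(x) = \ln(x)$, we have
$$\begin{aligned}
f'(x) &= \lim_{h \to 0} \frac{\ln(x + h) - \ln(x)}{h} \\
&= \lim_{h \to 0} \frac{1}{h} \ln\left(1 + \frac{h}{x}\right) = \lim_{h \to 0} \frac{1}{x} \cdot \frac{x}{h} \ln\left(1 + \frac{h}{x}\right) \\
&= \lim_{y \to \infty} \frac{1}{x}\, y \ln\left(1 + \frac{1}{y}\right) = \frac{1}{x} \lim_{y \to \infty} y \ln\left(1 + \frac{1}{y}\right) \\
&= \frac{1}{x} \ln\left[ \lim_{y \to \infty} \left(1 + \frac{1}{y}\right)^{y} \right] = \frac{1}{x} \ln(e) = \frac{1}{x}
\end{aligned}$$
where we made the change of variables $y = x/h$ and used the fact that $\lim_{y \to \infty} \left(1 + \frac{1}{y}\right)^{y} = e$, either by a
calculation or by definition of the number e.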
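◦ For $f(x) = \sin(x)$, we have
$$\begin{aligned}
f'(x) &= \lim_{h \to 0} \frac{\sin(x + h) - \sin(x)}{h} \\
&= \lim_{h \to 0} \frac{\sin(x)\cos(h) + \cos(x)\sin(h) - \sin(x)}{h} \\
&= \lim_{h \to 0} \left[ \sin(x) \cdot \frac{\cos(h) - 1}{h} + \cos(x) \cdot \frac{\sin(h)}{h} \right] \\
&= \sin(x) \cdot \lim_{h \to 0} \frac{\cos(h) - 1}{h} + \cos(x) \cdot \lim_{h \to 0} \frac{\sin(h)}{h} \\
&= \cos(x)
\end{aligned}$$
because $\lim_{h \to 0} \frac{\cos(h) - 1}{h} = 0$ and $\lim_{h \to 0} \frac{\sin(h)}{h} = 1$.
∗ To obtain the sine limit, a geometry calculation shows that $\cos(x) < \frac{\sin(x)}{x} < 1$ for small positive
x. Then apply the squeeze theorem as x → 0.
∗ For the cosine limit, observe that $\frac{\cos(x) - 1}{x} = \frac{\cos(x) - 1}{x} \cdot \frac{\cos(x) + 1}{\cos(x) + 1} = \frac{-\sin^2(x)}{x \cdot (\cos(x) + 1)} = -\frac{\sin(x)}{x} \cdot \frac{\sin(x)}{\cos(x) + 1}$,
and then note that as x → 0 the first term goes to −1 while the second goes to 0.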
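◦ For $f(x) = \cos(x)$, we have
$$\begin{aligned}
f'(x) &= \lim_{h \to 0} \frac{\cos(x + h) - \cos(x)}{h} \\
&= \lim_{h \to 0} \frac{\cos(x)\cos(h) - \sin(x)\sin(h) - \cos(x)}{h} \\
&= \lim_{h \to 0} \left[ \cos(x) \cdot \frac{\cos(h) - 1}{h} - \sin(x) \cdot \frac{\sin(h)}{h} \right] \\
&= \cos(x) \cdot \lim_{h \to 0} \frac{\cos(h) - 1}{h} - \sin(x) \cdot \lim_{h \to 0} \frac{\sin(h)}{h} \\
&= -\sin(x)
\end{aligned}$$
again because $\lim_{h \to 0} \frac{\cos(h) - 1}{h} = 0$ and $\lim_{h \to 0} \frac{\sin(h)}{h} = 1$.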
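The four special limits used in these proofs can be explored numerically. The sketch below only tabulates values for shrinking $h$ (and growing $y$); it gives evidence for the limits rather than a proof, and the function names are hypothetical.

```python
import math


def exp_quotient(h):
    """(e^h - 1) / h, which should approach 1 as h -> 0."""
    return (math.exp(h) - 1) / h


def sin_quotient(h):
    """sin(h) / h with h in radians, which should approach 1 as h -> 0."""
    return math.sin(h) / h


def cos_quotient(h):
    """(cos(h) - 1) / h, which should approach 0 as h -> 0."""
    return (math.cos(h) - 1) / h


def compound(y):
    """(1 + 1/y)^y, which should approach e as y -> infinity."""
    return (1 + 1 / y) ** y


for h in [0.1, 0.01, 0.001]:
    print(h, exp_quotient(h), sin_quotient(h), cos_quotient(h))

for y in [10, 1000, 100000]:
    print(y, compound(y), math.e)
```

Repeating the sine experiment with the angle converted from degrees would give a limit of $\pi/180$ instead of 1, which is the numerical face of the remark that these formulas require radians.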
• The derivatives of the other trigonometric functions, and the inverse trigonometric functions, often appear as
well. The most important are $\tan(x)$, $\sin^{-1}(x)$, and $\tan^{-1}(x)$.
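$$\frac{d}{dx}[\tan(x)] = \sec^2(x)$$
$$\frac{d}{dx}[\sec(x)] = \sec(x)\tan(x)$$
$$\frac{d}{dx}[\cot(x)] = -\csc^2(x)$$
$$\frac{d}{dx}[\csc(x)] = -\csc(x)\cot(x)$$
$$\frac{d}{dx}\left[\sin^{-1}(x)\right] = \frac{1}{\sqrt{1 - x^2}}$$
$$\frac{d}{dx}\left[\cos^{-1}(x)\right] = -\frac{1}{\sqrt{1 - x^2}}$$
$$\frac{d}{dx}\left[\tan^{-1}(x)\right] = \frac{1}{1 + x^2}$$
$$\frac{d}{dx}\left[\sec^{-1}(x)\right] = \frac{1}{x\sqrt{x^2 - 1}}$$
$$\frac{d}{dx}\left[\csc^{-1}(x)\right] = -\frac{1}{x\sqrt{x^2 - 1}}$$
$$\frac{d}{dx}\left[\cot^{-1}(x)\right] = -\frac{1}{1 + x^2}$$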
◦ The formulas for tangent, secant, cosecant, and cotangent can be obtained, with some effort, from the
formal definition by reducing the calculations to sines and cosines. However, it is much easier to obtain
them using some of the rules for computing derivatives (which we have not discussed yet).
◦ The derivatives of the inverse trigonometric functions can be computed using the definition of derivative,
but it is complicated and rather tricky; we will avoid this and instead discuss a general formula later for
finding derivatives of inverse functions.
2.4.1 Rules for Computing Derivatives
• In the list of rules below, $f$ means $f(x)$, $f'$ means $f'(x)$, and so on.
• Using these rules, we can compute the derivatives of any elementary function (namely, any function which is
a combination of sums, products, quotients, exponentials, logs, and trigonometric and inverse trigonometric
functions of x).
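• Sum/Difference Rule: For any differentiable f and g, we have $\frac{d}{dx}[f + g] = f' + g'$ and $\frac{d}{dx}[f - g] = f' - g'$.
◦ What this means is: if we have a sum (or difference) of functions, we can compute the derivative of the
sum (or difference) just by differentiating each term one at a time, and then adding (or subtracting) the
results.
◦ Proof: We have
$$\frac{d}{dx}[f(x) + g(x)] = \lim_{h \to 0} \frac{[f(x + h) + g(x + h)] - [f(x) + g(x)]}{h} = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} + \lim_{h \to 0} \frac{g(x + h) - g(x)}{h} = f'(x) + g'(x),$$
and the same calculation with the plus signs replaced by minus signs gives the rule for differences.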
• It might seem, based on the rules for sums and differences, that the derivative of a product of two functions
would be the product of their derivatives. However, this is not true!
◦ For example, $x^5 = x^3 \cdot x^2$, but the derivative of $x^5$ is $5x^4$ while the product of the derivatives of $x^3$ and
$x^2$ is $3x^2 \cdot 2x = 6x^3$.
• Product Rule: For any differentiable functions f and g, we have $\frac{d}{dx}[f \cdot g] = f' \cdot g + f \cdot g'$.
◦ In the case where g is the constant function c, we obtain the Constant Multiple Rule $\frac{d}{dx}[c\,f] = c\,f'$.
◦ Proof: We have
$$\begin{aligned}
\frac{d}{dx}[f(x)g(x)] &= \lim_{h \to 0} \frac{f(x + h)g(x + h) - f(x)g(x)}{h} \\
&= \lim_{h \to 0} \frac{[f(x + h)g(x + h) - f(x + h)g(x)] + [f(x + h)g(x) - f(x)g(x)]}{h} \\
&= \lim_{h \to 0} f(x + h) \cdot \lim_{h \to 0} \frac{g(x + h) - g(x)}{h} + \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} \cdot \lim_{h \to 0} g(x) \\
&= f(x)g'(x) + f'(x)g(x)
\end{aligned}$$
where in the last step, we used the fact that $\lim_{h \to 0} f(x + h) = f(x)$ because f (x) is differentiable and
therefore continuous.
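The $x^5 = x^3 \cdot x^2$ example can be checked exactly with a few lines of polynomial arithmetic. This is a sketch under the assumption that a polynomial is stored as a coefficient list `[c0, c1, c2, ...]`; the helper names are hypothetical.

```python
def poly_derivative(coeffs):
    """Derivative of c0 + c1 x + c2 x^2 + ..., using the Power Rule term by term."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def poly_multiply(p, q):
    """Product of two polynomials given as coefficient lists."""
    result = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            result[i + j] += a * b
    return result


def poly_add(p, q):
    """Sum of two polynomials, padding the shorter list with zeros."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]


f = [0, 0, 0, 1]  # x^3
g = [0, 0, 1]     # x^2

lhs = poly_derivative(poly_multiply(f, g))                      # (f g)'
rhs = poly_add(poly_multiply(poly_derivative(f), g),
               poly_multiply(f, poly_derivative(g)))            # f' g + f g'
naive = poly_multiply(poly_derivative(f), poly_derivative(g))   # f' g'

print(lhs)    # [0, 0, 0, 0, 5], i.e. 5x^4
print(rhs)    # [0, 0, 0, 0, 5], i.e. 5x^4
print(naive)  # [0, 0, 0, 6], i.e. 6x^3
```

The Product Rule expression reproduces $5x^4$, while the naive product of derivatives gives $6x^3$, matching the counterexample in the text.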
• There is also a rule for computing derivatives of quotients. Like with the Product Rule, the derivative of a
quotient is not the corresponding quotient of derivatives, but rather something slightly more complicated.
• Quotient Rule: For any differentiable functions f and g with $g \neq 0$, we have $\frac{d}{dx}\left[\frac{f}{g}\right] = \frac{f' \cdot g - f \cdot g'}{g^2}$.
◦ In the Quotient Rule, it is very important not to mix up the order of the two terms in the numerator
($f'g$ has a positive sign, while $g'f$ has a negative sign). Getting them backwards will yield a result that
is off by a factor of −1.
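◦ Proof: First we show that $\frac{d}{dx}\left[\frac{1}{g}\right] = -\frac{g'}{g^2}$:
$$\begin{aligned}
\frac{d}{dx}\left[\frac{1}{g(x)}\right] &= \lim_{h \to 0} \frac{\dfrac{1}{g(x + h)} - \dfrac{1}{g(x)}}{h} \\
&= \lim_{h \to 0} \frac{\dfrac{g(x) - g(x + h)}{g(x + h) \cdot g(x)}}{h} \\
&= \lim_{h \to 0} \left[ \frac{g(x) - g(x + h)}{h} \cdot \frac{1}{g(x) \cdot g(x + h)} \right] \\
&= \lim_{h \to 0} \frac{g(x) - g(x + h)}{h} \cdot \lim_{h \to 0} \frac{1}{g(x) \cdot g(x + h)} \\
&= -g'(x) \cdot \frac{1}{g(x)^2}
\end{aligned}$$
where in the last step, we used the definition of the derivative and the fact that $\lim_{h \to 0} g(x + h) = g(x)$ because
g(x) is differentiable and therefore continuous.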
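◦ The general result for $\frac{f}{g}$ follows by writing $\frac{f}{g} = f \cdot \frac{1}{g}$ and applying the Product Rule:
$$\begin{aligned}
\frac{d}{dx}\left[\frac{f}{g}\right] = \frac{d}{dx}\left[f \cdot \frac{1}{g}\right] &= \frac{d}{dx}[f] \cdot \frac{1}{g} + f \cdot \frac{d}{dx}\left[\frac{1}{g}\right] \\
&= f' \cdot \frac{1}{g} + f \cdot \frac{-g'}{g^2} \\
&= \frac{f' \cdot g}{g^2} - \frac{f \cdot g'}{g^2} = \frac{f' \cdot g - f \cdot g'}{g^2}.
\end{aligned}$$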
• The rules above allow us to differentiate any combination of sums, differences, products, and quotients.
◦ However, it is not possible to describe a function like h(x) = sin(cos(x)) using only those operations and
functions whose derivatives we know.
◦ The missing ingredient is function composition: notice that h(x) = f (g(x)) where f (x) = sin(x) and
g(x) = cos(x).
◦ We will now give a rule for finding the derivative of a composition chain like h(x) = f (g(x)).
• Chain Rule: For any differentiable functions f and g, we have $\frac{d}{dx}[f(g(x))] = f'(g(x)) \cdot g'(x)$.
◦ Another way of writing the Chain Rule is $\frac{dz}{dx} = \frac{dz}{dy} \cdot \frac{dy}{dx}$, where z is a function of y and y is a
function of x. The translation between this formulation and the explicit one above is y = g(x) and
z = f (y) = f (g(x)).
◦ A natural way to interpret this formulation of the Chain Rule is using rates of change: for example, if z
changes twice as fast as y and y changes four times as fast as x, then z is changing 2 · 4 = 8 times as fast
as x.
◦ Proof: By the alternate definition of the derivative, the derivative of $f(g(x))$ at $x = a$ is
$$\begin{aligned}
\frac{d}{dx}[f(g(x))]\Big|_{x = a} &= \lim_{x \to a} \frac{f(g(x)) - f(g(a))}{x - a} \\
&= \lim_{x \to a} \left[ \frac{f(g(x)) - f(g(a))}{g(x) - g(a)} \cdot \frac{g(x) - g(a)}{x - a} \right] \\
&= \lim_{x \to a} \frac{f(g(x)) - f(g(a))}{g(x) - g(a)} \cdot \lim_{x \to a} \frac{g(x) - g(a)}{x - a} \\
&= \lim_{y \to b} \frac{f(y) - f(b)}{y - b} \cdot \lim_{x \to a} \frac{g(x) - g(a)}{x - a} \\
&= f'(b) \cdot g'(a) \\
&= f'(g(a)) \cdot g'(a)
\end{aligned}$$
where to get from the third equation to the fourth, we substituted y = g(x) and b = g(a) in the first
limit and used the fact that y → b as x → a because g(x) is differentiable and therefore continuous.
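As a numerical sanity check of the Chain Rule on the function $h(x) = \sin(\cos(x))$ mentioned earlier, the sketch below compares the formula $f'(g(x)) \cdot g'(x)$ with a central-difference estimate. The helper names and the test point are assumptions for illustration.

```python
import math


def derivative(f, x, h=1e-6):
    """Central-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def h_func(x):
    """The composition h(x) = f(g(x)) with f = sin and g = cos."""
    return math.sin(math.cos(x))


def h_prime_chain_rule(x):
    """Chain Rule: f'(g(x)) * g'(x) = cos(cos(x)) * (-sin(x))."""
    return math.cos(math.cos(x)) * (-math.sin(x))


x0 = 0.7  # arbitrary test point
numeric = derivative(h_func, x0)
exact = h_prime_chain_rule(x0)
print(numeric, exact)
```

The two printed values should agree to many decimal places; dropping the factor $g'(x)$ would make them visibly different.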
• There are a few useful corollaries of the Chain Rule which are worth remembering on their own:
◦ When f (x) is the natural logarithm, we obtain the Log Rule $\frac{d}{dx}[\ln(g)] = \frac{g'}{g}$.
◦ When f (x) is the base-e exponential we obtain the Exponential Rule $\frac{d}{dx}\left[e^{g(x)}\right] = e^{g(x)} \cdot g'(x)$.
◦ In particular, because $a^x = e^{x \ln(a)}$, if we set $g(x) = x \ln(a)$ in the rule above, we obtain the general-base
exponential derivative $\frac{d}{dx}[a^x] = a^x \ln(a)$.
• Finally, we will give a rule for computing the derivative of an inverse function:
• Inverse Rule: If f is a differentiable function with inverse $f^{-1}$, then $\frac{d}{dx}\left[f^{-1}(x)\right] = \frac{1}{f'(f^{-1}(x))}$.
◦ Proof: By the alternate definition of derivative, we have $\frac{d}{dx}\left[f^{-1}(x)\right](a) = \lim_{x \to a} \frac{f^{-1}(x) - f^{-1}(a)}{x - a}$.
◦ Now set $b = f^{-1}(a)$ and make the substitution $y = f^{-1}(x)$, so that $a = f(b)$ and $x = f(y)$.
◦ We obtain $\frac{d}{dx}\left[f^{-1}(x)\right](a) = \lim_{y \to b} \frac{y - b}{f(y) - f(b)} = \frac{1}{\lim_{y \to b} \frac{f(y) - f(b)}{y - b}} = \frac{1}{f'(b)} = \frac{1}{f'(f^{-1}(a))}$, as desired.
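The Inverse Rule can be exercised on pairs of functions whose derivatives are already known. The sketch below (with a hypothetical helper `inverse_rule_derivative`) recovers the derivative of $\ln(x)$ from $f(x) = e^x$ and the derivative of $\tan^{-1}(x)$ from $f(x) = \tan(x)$.

```python
import math


def inverse_rule_derivative(f_prime, f_inverse, x):
    """Derivative of f^{-1} at x via the Inverse Rule: 1 / f'(f^{-1}(x))."""
    return 1.0 / f_prime(f_inverse(x))


def sec_squared(x):
    """Derivative of tan(x)."""
    return 1.0 / math.cos(x) ** 2


# f(x) = e^x has f'(x) = e^x and inverse ln(x); the rule should give 1/x.
ln_slope = inverse_rule_derivative(math.exp, math.log, 2.5)

# f(x) = tan(x) has f'(x) = sec^2(x) and inverse arctan(x); the rule should give 1/(1 + x^2).
atan_slope = inverse_rule_derivative(sec_squared, math.atan, 0.8)

print(ln_slope, 1 / 2.5)
print(atan_slope, 1 / (1 + 0.8 ** 2))
```

Both results match the entries $\frac{1}{x}$ and $\frac{1}{1 + x^2}$ from the earlier lists of derivatives.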
◦ Note that $\sqrt{x} = x^{1/2}$, so applying the Power Rule yields $f'(x) = \frac{1}{2} x^{-1/2}$.
◦ Here, we can just differentiate term-by-term (secretly applying the Sum and Difference Rules) using the
Power Rule to get $g'(x) = 30x^9 - 3x^{-2} - \frac{1}{2} x^{-1/2}$.
◦ The Product Rule gives $h'(y) = e^y \cdot \frac{d}{dy}[\ln(y)] + \frac{d}{dy}[e^y] \cdot \ln(y) = e^y \cdot \frac{1}{y} + e^y \cdot \ln(y)$.
• Example: Differentiate $q(t) = \frac{e^t + 3}{t^2}$ with respect to t.
◦ The Quotient Rule gives $q'(t) = \frac{\frac{d}{dt}[e^t + 3] \cdot (t^2) - \frac{d}{dt}[t^2] \cdot (e^t + 3)}{(t^2)^2} = \frac{e^t \cdot (t^2) - 2t \cdot (e^t + 3)}{(t^2)^2}$.
◦ This can be simplified by cancelling a factor of t, giving $q'(t) = \frac{t\,e^t - 2(e^t + 3)}{t^3}$.
◦ If we write $f(x) = \sin(x)/\cos(x)$, we can apply the Quotient Rule to see
$$\begin{aligned}
f'(x) &= \frac{\frac{d}{dx}[\sin(x)] \cdot (\cos(x)) - \frac{d}{dx}[\cos(x)] \cdot (\sin(x))}{\cos^2(x)} \\
&= \frac{\cos(x) \cdot \cos(x) - (-\sin(x)) \cdot \sin(x)}{\cos^2(x)} \\
&= \frac{\cos^2(x) + \sin^2(x)}{\cos^2(x)} = \frac{1}{\cos^2(x)} \\
&= \sec^2(x).
\end{aligned}$$
◦ The derivatives of secant, cosecant, and cotangent can be computed the same way.
◦ Here, we can apply the Chain Rule, with $f(x) = x^{15}$ and $g(x) = x^2 + e^x$.
◦ We obtain $h'(x) = f'(g(x)) \cdot g'(x) = 15(x^2 + e^x)^{14} \cdot (2x + e^x)$.
• Example: Differentiate $k(x) = \sqrt{x^3 + 1}$.
◦ Here, we can apply the Chain Rule, with $f(x) = \sqrt{x} = x^{1/2}$ and $g(x) = x^3 + 1$.
◦ We obtain $k'(x) = f'(g(x)) \cdot g'(x) = \frac{1}{2}(x^3 + 1)^{-1/2} \cdot 3x^2$.
◦ We use the Inverse Rule: notice that $g(x) = f^{-1}(x)$ for $f(x) = \tan(x)$.
◦ Since $f'(x) = \sec^2(x)$, we get $g'(x) = \frac{1}{f'(f^{-1}(x))} = \frac{1}{1 + \tan^2(\tan^{-1}(x))} = \frac{1}{1 + x^2}$.
◦ The derivatives of the other inverse trigonometric functions can be computed the same way.
• Using these differentiation techniques, we can quickly solve the kinds of problems that motivated our definition
of the derivative without having to resort to cumbersome limit calculations:
• Example: Find an equation for the tangent line to $y = x^2 + \frac{3}{x}$ at x = 1.
◦ We have y(1) = 4, so the tangent line passes through (1, 4).
◦ The derivative of this function is $y' = 2x - \frac{3}{x^2}$, so $y'(1) = -1$.
◦ Therefore, the tangent line has slope −1 and passes through the point (1, 4), so its equation is $y - 4 = -1(x - 1)$.
(Note that we could also write the equation in a variety of other ways, such as y = 5 − x or x + y = 5.)
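The tangent line just found can be reproduced in code. This is a small sketch with hypothetical helper names; it builds the line from a point and a slope exactly as in the example, then compares the line with the curve near the point of tangency.

```python
def y(x):
    """The curve from the example: y = x^2 + 3/x."""
    return x ** 2 + 3 / x


def y_prime(x):
    """Its derivative, from the Power Rule: y' = 2x - 3/x^2."""
    return 2 * x - 3 / x ** 2


def tangent_line(f, f_prime, a):
    """Return the function x -> f(a) + f'(a) (x - a), the tangent line at x = a."""
    point, slope = f(a), f_prime(a)
    return lambda x: point + slope * (x - a)


line = tangent_line(y, y_prime, 1.0)

print(line(1.0))            # 4.0: the line passes through (1, 4)
print(line(3.0))            # 2.0: consistent with y = 5 - x
print(y(1.001), line(1.001))  # curve and line nearly agree close to x = 1
```

Close to $x = 1$ the curve and the line differ only by a term proportional to $(x - 1)^2$, which reflects the description of the tangent line as the line closest to the graph near the point of tangency.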
• Example: If at time t seconds, a particle has position p(t) = t − sin(πt) meters, how fast is it moving at t = 2s
and how fast is it accelerating at t = 3s?
◦ The velocity is given by the derivative of position, so $v(t) = p'(t) = 1 - \pi\cos(\pi t)$, so at t = 2s, we see
that $v(2) = 1 - \pi\cos(2\pi) = 1 - \pi$.
◦ The units on the velocity are meters per second, and since 1−π is negative and speed is positive, at
◦ Acceleration is the derivative of velocity, and the units on acceleration are meters per second squared,
m/s$^2$, so $a(t) = v'(t) = \pi^2 \sin(\pi t)$, so at t = 3s, we see that $a(3) = 0$ m/s$^2$. (This says that the particle is
we guess that the tangent line to this curve is vertical at the origin and thus has equation x = 0 .
◦ Indeed, this is borne out by the graph of the curve and the three tangent lines we found above: the
tangent line at the origin does indeed look like a vertical line:
◦ Although it may seem difficult, the procedure is very algorithmic: simply decide which rule can be
applied first. Then write down the result of applying that rule to the given expression, and then evaluate
any remaining derivatives in the same way.
• Example: Differentiate $f(x) = \ln\left(1 + e^{\sqrt{x}}\right)$.
◦ First, observe that we can apply the Chain Rule to see that $f'(x) = \frac{\frac{d}{dx}\left[1 + e^{\sqrt{x}}\right]}{1 + e^{\sqrt{x}}}$.
◦ Now we are left with the simpler task of evaluating the derivative in the numerator. We can do this with
another application of the Chain Rule, yielding the end result $f'(x) = \frac{e^{\sqrt{x}} \cdot \frac{1}{2} x^{-1/2}}{1 + e^{\sqrt{x}}}$.
• Example: Differentiate $f(x) = \frac{x^3 + 2\sqrt{x}}{\sin(x)}$.
◦ We start by applying the Quotient Rule, giving $f'(x) = \frac{\frac{d}{dx}\left[x^3 + 2\sqrt{x}\right] \cdot \sin(x) - \frac{d}{dx}[\sin(x)] \cdot (x^3 + 2\sqrt{x})}{(\sin(x))^2}$.
◦ Evaluating the leftover derivatives yields $f'(x) = \frac{(3x^2 + x^{-1/2}) \cdot \sin(x) - \cos(x) \cdot (x^3 + 2\sqrt{x})}{(\sin(x))^2}$.
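$$\begin{aligned}
h'(r) &= \frac{d}{dr}\left[r^3\, 2^r\right] \cdot \ln(r) + r^3\, 2^r \cdot \frac{d}{dr}[\ln(r)] \\
&= \left(3r^2\, 2^r + r^3\, 2^r \ln(2)\right) \cdot \ln(r) + r^3\, 2^r \cdot \frac{1}{r}
\end{aligned}$$
where we used the Product Rule again to differentiate $r^3\, 2^r$.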
• Example: Differentiate $f(x) = \sin(\sin(\sin(\sin(x))))$.
◦ This function seems intimidating, but it is built entirely through function composition, so we can find
its derivative using the Chain Rule by working from the outside to the inside.
◦ To start, note that f (x) has the form sin(?), where ? is some other function (namely, sin(sin(sin(x)))).
◦ Therefore, $f'(x) = \cos(?) \cdot \frac{d}{dx}[?]$ by the Chain Rule.
◦ We have now reduced the problem to finding the derivative of the simpler expression ? = sin(sin(sin(x))).
◦ Repeatedly applying the Chain Rule in this manner yields
$$\begin{aligned}
f'(x) &= \cos(\sin(\sin(\sin(x)))) \cdot \frac{d}{dx}[\sin(\sin(\sin(x)))] \\
&= \cos(\sin(\sin(\sin(x)))) \cdot \cos(\sin(\sin(x))) \cdot \frac{d}{dx}[\sin(\sin(x))] \\
&= \cos(\sin(\sin(\sin(x)))) \cdot \cos(\sin(\sin(x))) \cdot \cos(\sin(x)) \cdot \frac{d}{dx}[\sin(x)] \\
&= \cos(\sin(\sin(\sin(x)))) \cdot \cos(\sin(\sin(x))) \cdot \cos(\sin(x)) \cdot \cos(x).
\end{aligned}$$
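The repeated use of the Chain Rule in this example is mechanical enough to write as a loop. The sketch below (hypothetical helper names) multiplies one cosine factor per layer; it accumulates the factors from the innermost layer outward, which gives the same product as working from the outside in.

```python
import math


def nested_sin(x, depth):
    """Apply sin to x repeatedly, depth times."""
    for _ in range(depth):
        x = math.sin(x)
    return x


def nested_sin_derivative(x, depth):
    """Derivative of the depth-fold composition sin(sin(...sin(x)...)) via the Chain Rule."""
    product = 1.0
    inner = x
    for _ in range(depth):
        product *= math.cos(inner)  # derivative of this layer, evaluated at its input
        inner = math.sin(inner)     # move one layer outward
    return product


x0 = 0.5  # arbitrary test point
by_chain_rule = nested_sin_derivative(x0, 4)

# Independent check with a central difference (an approximation, not exact).
step = 1e-6
by_finite_difference = (nested_sin(x0 + step, 4) - nested_sin(x0 - step, 4)) / (2 * step)

print(by_chain_rule, by_finite_difference)
```

With `depth = 4` the product has exactly the four factors $\cos(x)$, $\cos(\sin(x))$, $\cos(\sin(\sin(x)))$, and $\cos(\sin(\sin(\sin(x))))$ from the final line of the computation.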
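• Example: Differentiate $f(x) = x \tan\left(x^3 + e^{2x^2\cos(x)}\right)$.
◦ By applying the Product Rule and Chain Rule repeatedly, we obtain
$$\begin{aligned}
f'(x) &= \frac{d}{dx}[x] \cdot \tan\left(x^3 + e^{2x^2\cos(x)}\right) + x \cdot \frac{d}{dx}\left[\tan\left(x^3 + e^{2x^2\cos(x)}\right)\right] \\
&= 1 \cdot \tan\left(x^3 + e^{2x^2\cos(x)}\right) + x \cdot \sec^2\left(x^3 + e^{2x^2\cos(x)}\right) \cdot \frac{d}{dx}\left[x^3 + e^{2x^2\cos(x)}\right] \\
&= 1 \cdot \tan\left(x^3 + e^{2x^2\cos(x)}\right) + x \cdot \sec^2\left(x^3 + e^{2x^2\cos(x)}\right) \cdot \left[3x^2 + e^{2x^2\cos(x)} \cdot \frac{d}{dx}\left[2x^2\cos(x)\right]\right] \\
&= 1 \cdot \tan\left(x^3 + e^{2x^2\cos(x)}\right) + x \cdot \sec^2\left(x^3 + e^{2x^2\cos(x)}\right) \cdot \left[3x^2 + e^{2x^2\cos(x)} \cdot \left(4x\cos(x) - 2x^2\sin(x)\right)\right].
\end{aligned}$$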
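• Example: Differentiate $q(t) = \sqrt{\dfrac{\tan^{-1}(3t^2)}{\sin(\sec(t))}}$.
$$\begin{aligned}
q'(t) &= \frac{1}{2}\left[\frac{\tan^{-1}(3t^2)}{\sin(\sec(t))}\right]^{-1/2} \cdot \frac{d}{dt}\left[\frac{\tan^{-1}(3t^2)}{\sin(\sec(t))}\right] \\
&= \frac{1}{2}\left[\frac{\tan^{-1}(3t^2)}{\sin(\sec(t))}\right]^{-1/2} \cdot \frac{\frac{1}{1 + (3t^2)^2} \cdot \frac{d}{dt}\left[3t^2\right] \cdot \sin(\sec(t)) - \tan^{-1}(3t^2) \cdot \cos(\sec(t)) \cdot \frac{d}{dt}[\sec(t)]}{\sin^2(\sec(t))} \\
&= \frac{1}{2}\left[\frac{\tan^{-1}(3t^2)}{\sin(\sec(t))}\right]^{-1/2} \cdot \frac{\frac{1}{1 + (3t^2)^2} \cdot 6t \cdot \sin(\sec(t)) - \tan^{-1}(3t^2) \cdot \cos(\sec(t)) \cdot \sec(t)\tan(t)}{\sin^2(\sec(t))}.
\end{aligned}$$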
• There is a useful trick to simplify such differentiation problems, called logarithmic differentiation: the basic
idea of the procedure is to differentiate ln(f ) rather than f itself. If ln(f ) can be simplified using properties
of logarithms, the resulting computations can be substantially easier. We will illustrate the idea with a few
examples.
◦ We could just use the Product and Chain Rules, giving $f'(x) = 3(x - 2)^2 \cdot (x^2 + 1)^7 + (x - 2)^3 \cdot 7(x^2 + 1)^6 \cdot 2x$.
• The logarithmic differentiation method saves the most time with complicated products, quotients, or powers.
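◦ Differentiating yields $\frac{f'}{f} = \frac{1}{2} \cdot \frac{3x^2}{x^3 + 3} + \frac{2}{5} \cdot \frac{3x^2}{x^3 - 1} - 100 \cdot \frac{4x^3 + 1}{x^4 + x + 1}$.
◦ Thus, $f' = \frac{\sqrt{x^3 + 3} \cdot \sqrt[5]{(x^3 - 1)^2}}{(x^4 + x + 1)^{100}} \cdot \left[\frac{1}{2} \cdot \frac{3x^2}{x^3 + 3} + \frac{2}{5} \cdot \frac{3x^2}{x^3 - 1} - 100 \cdot \frac{4x^3 + 1}{x^4 + x + 1}\right]$.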
• More generally, we can use logarithmic differentiation to find the derivative of a general exponential of the
form $f(x)^{g(x)}$: the result is $\frac{d}{dx}\left[f(x)^{g(x)}\right] = \left[g'(x) \ln f(x) + g(x) \cdot \frac{f'(x)}{f(x)}\right] \cdot f(x)^{g(x)}$.
◦ It is not worth memorizing this rule (which is why it is not boxed). Instead, it is much more useful to
remember how to compute such derivatives using logarithmic differentiation.
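For a concrete instance of the general-exponential formula, take $f(x) = x$ and $g(x) = x$, so the function is $x^x$ (an example chosen here for illustration, not taken from the notes). Logarithmic differentiation gives $\ln(y) = x \ln(x)$, hence $y'/y = \ln(x) + 1$ and $y' = (\ln(x) + 1)\,x^x$. The sketch below evaluates the general formula with hypothetical helper names and compares it with this special case.

```python
import math


def power_derivative(f, f_prime, g, g_prime, x):
    """Derivative of f(x)^g(x): [g' ln(f) + g f'/f] * f^g, assuming f(x) > 0."""
    bracket = g_prime(x) * math.log(f(x)) + g(x) * f_prime(x) / f(x)
    return bracket * f(x) ** g(x)


def identity(x):
    return x


def one(x):
    return 1.0


# x^x at x = 2: the general formula should reduce to (ln(2) + 1) * 2^2.
value = power_derivative(identity, one, identity, one, 2.0)
print(value, (math.log(2.0) + 1) * 4.0)
```

The same function handles other base and exponent pairs, provided the base is positive so that its logarithm is defined.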
• Sometimes, however, we are also interested in implicit relations, which relate the values of y and x not via
an explicit function y = f (x), but as some equation of the form R(x, y) = 0, where R is an arbitrary function
of the two variables x and y .
◦ Any equation relating x and y can (by moving all the terms to one side) be written in the form R(x, y) = 0.
◦ So an implicit relation involving x and y is the same thing as some equation that x and y satisfy.
◦ We allow ourselves to rearrange implicit relations, to put things on either side of the equality.
• Sometimes, it is possible to turn an implicit relation into an explicit one by rearranging it. Other times, this
is not possible.
• Like with explicit functions y = f (x), we can produce a graph of all the points (x, y) satisfying an implicit
relation.
◦ Note that actually producing the graph of an implicit relation, though, is trickier than producing the
graph of an explicit function y = f (x): more is required than just picking a bunch of values of x and
evaluating the function.
◦ Usually, the graph of a single implicit relation will be a curve or a collection of curves (possibly intersecting
each other) in the plane.
◦ As can be seen from the plot of x−y + y cos(x) = tan(x − y), implicit relations can have very complicated
graphs!
• Motivating Example: Consider the implicit relation $x^2 + y^2 = 1$, whose graph is the unit circle.
◦ To describe the graph of the unit circle explicitly we have to glue together the graphs of the two functions
$y = \sqrt{1 - x^2}$ and $y = -\sqrt{1 - x^2}$ for $-1 \le x \le 1$. But if we allow ourselves an implicit relation for y,
then we can describe the unit circle with the far simpler single equation $x^2 + y^2 = 1$.
◦ In the case of $x^2 + y^2 = 1$, we can solve the implicit relation for y.
◦ However, for more complicated implicit relations, it may not be possible to solve for y explicitly as a
function of x; for example, it is not possible to solve y + sin(y) = x + ex for either x or y as an elementary
function of the other.
◦ Even if it is possible to solve for y, the result might be very complicated. And it is also possible to lose
information, if we are not extremely careful about solving the relation for y.
∗ For example, $x = \sin(y)$ and $y = \sin^{-1}(x)$ appear to be equivalent, but in fact they aren't: $(x, y) = (0, 2\pi)$
satisfies the first equation but not the second one, because the (principal) arcsine function
$\sin^{-1}$ only has range $\left[-\frac{\pi}{2}, \frac{\pi}{2}\right]$.
• Remark (relation versus function): We often want to speak of implicit relations between x and y as
implicitly defining y as a function of x. Technically speaking, this is often troublesome: in any given implicit
relation, there may be more than one value of y associated to a particular value of x, in which case y is not
actually a function of x. If we want to insist on trying to make y an actual function of x, we would have to
specify which particular value of y we want whenever there is a choice.
◦ For example, consider the implicit relation $x^2 = y^2$. If we solve for y in terms of x then we obtain $y = \pm x$.
For each value of x (except for x = 0) the relation gives two possible values of y.
◦ If we wanted to solve the implicit relation $x^2 = y^2$ to obtain y as a function of x, we would need to
specify which value of y we want for each possible value of x.
◦ There are many ways of doing this: some of the obvious ones are y = x, or y = −x, or y = |x|. But we
could also do something like taking y = x, for −3 ≤ x ≤ 3, and taking y = −x for the other possible x.
• From here onwards, we will abuse terminology and not worry about whether the things we want to call
implicit functions should really be named functions or not.
◦ Note that the interpretation of $y'$ is still exactly the same: $y'$ still gives the slope of the tangent line to
the curve. Thus, when $y' = 0$, the tangent line is horizontal, and when $y'$ blows up to $\pm\infty$, the tangent
line is vertical.
◦ One other type of unusual behavior can happen with implicit relations: if the graph of the curve $R(x, y) = 0$
crosses itself, then the value of $y'$ will usually be an indeterminate fraction of the form $\frac{0}{0}$.
• Furthermore, if we want to compute the second derivative $y''$ in terms of x and y, then we can simply
differentiate $y'$ (using the Chain Rule), and then substitute in for any occurrences of $y'$ in the resulting
expression to write $y''$ as a function of x and y. Repeatedly differentiating in this manner allows calculation
of any higher derivative, although the results usually get complicated very quickly.
• Example: For the implicit function y defined by $x^2 + y^2 = 1$, find $y'$ using implicit differentiation, and then
find the slope of the tangent line to its graph at the point $\left(\frac{3}{5}, -\frac{4}{5}\right)$.
◦ To find $y'$, we think of y as a function of x and differentiate both sides of $x^2 + [y(x)]^2 = 1$ with the Chain
Rule.
◦ This gives $2x + 2 \cdot [y(x)]^1 \cdot y'(x) = 0$, or, more compactly, $2x + 2y\,y' = 0$.
◦ Solving for $y'$ gives $2y\,y' = -2x$, so that $y' = \frac{-2x}{2y} = -\frac{x}{y}$.
◦ The slope of the tangent line at $\left(\frac{3}{5}, -\frac{4}{5}\right)$, which we can verify actually does lie on the curve since
$\left(\frac{3}{5}\right)^2 + \left(-\frac{4}{5}\right)^2 = 1$, is then $y' = -\frac{3/5}{-4/5} = \frac{3}{4}$.
• Example (repeated): For the implicit function y defined by $x^2 + y^2 = 1$, try to find $y'$ by solving for y explicitly,
and then find the slope of the tangent line to its graph at the point $\left(\frac{3}{5}, -\frac{4}{5}\right)$.
◦ Solving $x^2 + y^2 = 1$ for y gives $y = \pm\sqrt{1 - x^2} = \pm(1 - x^2)^{1/2}$. Note that the ± is very much necessary
because there are two possible values of the square root, positive and negative.
◦ If we blindly take the derivative of the expression $y = \pm(1 - x^2)^{1/2}$ with the Chain Rule, we obtain
$y' = \pm\frac{1}{2} \cdot (1 - x^2)^{-1/2} \cdot (-2x)$.
◦ Now there is already an issue: which sign do we want? The only plausible answer is that it must depend
on the value of y we are interested in.
◦ For $y = -4/5$, in particular, we probably want the negative sign. Proceeding with this assumption,
plugging in $x = \frac{3}{5}$ to the expression $y' = -\frac{1}{2} \cdot (1 - x^2)^{-1/2} \cdot (-2x)$ gives
$y' = -\frac{1}{2} \cdot \left[1 - \left(\frac{3}{5}\right)^2\right]^{-1/2} \cdot \left(-2 \cdot \frac{3}{5}\right)$,
which after some arithmetic eventually does simplify to the answer $y' = \frac{3}{4}$ we found above.
◦ Now compare the general formula from the explicit solution to the general formula from the
implicit solution: the formula $y' = \pm\frac{1}{2} \cdot (1 - x^2)^{-1/2} \cdot (-2x)$ is far more complicated (not to mention
the ambiguity about the sign) than the formula $y' = -\frac{x}{y}$.
◦ In this case, we can see that the implicit formula is much better. (This is almost always the case for
implicit differentiation problems!)
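As a numerical sanity check on the two approaches, the following minimal sketch compares the implicit formula $y' = -x/y$ with a finite-difference derivative of the explicit lower branch $y = -\sqrt{1 - x^2}$ at the point $(3/5, -4/5)$. The function names and the step size `h` are illustrative choices, not part of the notes.

```python
import math

def slope_implicit(x, y):
    """Slope from implicit differentiation of x^2 + y^2 = 1, namely y' = -x/y."""
    return -x / y

def lower_branch(x):
    """Explicit lower branch y = -sqrt(1 - x^2) of the unit circle."""
    return -math.sqrt(1.0 - x * x)

def central_difference(f, x, h=1e-6):
    """Symmetric finite-difference estimate of f'(x); h is an assumed step size."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

x0, y0 = 3 / 5, -4 / 5
implicit = slope_implicit(x0, y0)
numeric = central_difference(lower_branch, x0)
print(implicit, numeric)  # both should be close to 3/4
```

Note that the implicit formula needs no choice of sign: the sign information is carried by the value of $y$ itself.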
• Example: Find an equation for the tangent line to the curve $x^2 + xy + y^4 = 3$ at the point $(x, y) = (1, 1)$.
◦ We have a point the tangent line passes through, namely $(1, 1)$, so all we need is the line's slope, which
is the derivative $y'$.
◦ To find $y'$, we think of $y$ as an implicit function $y(x)$, and write the equation as $x^2 + x \cdot [y(x)] + [y(x)]^4 = 3$.
◦ Now we differentiate both sides using the Chain Rule and the Product Rule, to get
$2x + (1 \cdot y(x) + x \cdot y'(x)) + 4 \cdot [y(x)]^3 \cdot y'(x) = 0$, or, more compactly, $2x + y + xy' + 4y^3 y' = 0$.
◦ Rearranging gives $(2x + y) + (x + 4y^3)y' = 0$, or $y' = -\frac{2x + y}{x + 4y^3}$.
◦ Plugging in the point $(1, 1)$ gives $y'(1, 1) = -\frac{3}{5}$.
◦ Thus the tangent line has slope $-\frac{3}{5}$ and passes through $(1, 1)$, so an equation for the tangent line is
$y - 1 = -\frac{3}{5}(x - 1)$.
◦ [Figure: graph of the curve together with this tangent line.]
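The slope $-\frac{3}{5}$ can also be checked numerically without the implicit formula. The sketch below, under the assumption that Newton's method started at $y = 1$ stays on the branch through $(1, 1)$, solves $x^2 + xy + y^4 = 3$ for $y$ at nearby values of $x$ and takes a finite difference. All function names and the step size are hypothetical.

```python
def F(x, y):
    """Left side minus right side of the curve x^2 + x*y + y^4 = 3."""
    return x**2 + x * y + y**4 - 3.0

def dF_dy(x, y):
    """Partial derivative of F with respect to y, used by Newton's method."""
    return x + 4.0 * y**3

def solve_y(x, y_guess=1.0, steps=50):
    """Newton iteration in y for fixed x, started near the branch through (1, 1)."""
    y = y_guess
    for _ in range(steps):
        y -= F(x, y) / dF_dy(x, y)
    return y

def slope_implicit(x, y):
    """Formula from implicit differentiation: y' = -(2x + y) / (x + 4y^3)."""
    return -(2.0 * x + y) / (x + 4.0 * y**3)

h = 1e-5  # assumed step size for the finite difference
numeric_slope = (solve_y(1.0 + h) - solve_y(1.0 - h)) / (2.0 * h)
formula_slope = slope_implicit(1.0, 1.0)
print(formula_slope, numeric_slope)  # both should be close to -3/5
```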
• Example: Find $y'$ and $y''$ for the curve $y^2 = x^3 + x^2$. Examine what happens to $y'$ at the point $(-1, 0)$ and
at the point $(0, 0)$.
◦ Differentiating both sides of $[y(x)]^2 = x^3 + x^2$ via the Chain Rule gives $2[y(x)] \cdot y'(x) = 3x^2 + 2x$.
◦ Solving for $y'$ gives $y' = \frac{3x^2 + 2x}{2y}$.
◦ To find $y''$, we differentiate the expression for $y'$ with respect to $x$: this yields
$y'' = \frac{\frac{d}{dx}\left[3x^2 + 2x\right] \cdot (2y) - (3x^2 + 2x) \cdot \frac{d}{dx}[2y]}{(2y)^2} = \frac{(6x + 2) \cdot (2y) - (3x^2 + 2x) \cdot (2y')}{(2y)^2}$.
◦ Substituting in for $y'$ gives $y'' = \frac{(6x + 2) \cdot (2y) - (3x^2 + 2x) \cdot \frac{3x^2 + 2x}{y}}{(2y)^2} = \frac{(6x + 2) \cdot (2y) \cdot y - (3x^2 + 2x)^2}{4y^3}$.
◦ At $(-1, 0)$, the formula gives $y' = \frac{1}{0}$, which is the 'infinite' type of undefined, so here the tangent line is
vertical, as is confirmed by the graph.
◦ At $(0, 0)$, the formula gives $y' = \frac{0}{0}$, which is the 'indeterminate' type of undefined. From the graph, it
appears that the curve crosses itself at the origin, and it even looks (in some sense) like there are actually
two tangent lines at the origin.
◦ In fact, this suspicion is accurate, per the following calculation: if we solve for $y$, we obtain $y = x\sqrt{x+1}$
or $y = -x\sqrt{x+1}$.
◦ For $x \neq 0$, setting $y = x\sqrt{x+1}$ yields $y' = \frac{3x^2 + 2x}{2y} = \frac{3x^2 + 2x}{2(x\sqrt{x+1})} = \frac{3x + 2}{2\sqrt{x+1}}$, and now taking the
limit as $x \to 0$ yields $y' = 1$.
◦ In a similar way, setting $y = -x\sqrt{x+1}$ and then taking the limit as $x \to 0$ yields $y' = -1$.
◦ Thus, the two 'tangent lines' at the origin have slopes $-1$ and $1$, meaning that they are $y = x$ and
$y = -x$.
◦ Remark: Curves with the general form $y^2 = x^3 + ax^2 + bx + c$ are called elliptic curves. (The curve in
this example is called a singular elliptic curve, because the graph crosses itself.) Elliptic curves have a
rather surprisingly wide variety of applications: in particular, they are used extensively in cryptography
(the secure transmission of information) and are the basis for some algorithms for factorization of large
integers.
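The limiting slopes at the origin of $y^2 = x^3 + x^2$ can be observed numerically. The following sketch evaluates $y' = \frac{3x^2 + 2x}{2y}$ on each branch $y = \pm x\sqrt{x+1}$ for values of $x$ approaching 0; the function name and the sample values of $x$ are illustrative assumptions.

```python
import math

def slope_on_branch(x, sign):
    """y' = (3x^2 + 2x) / (2y) on the branch y = sign * x * sqrt(x + 1), for x != 0."""
    y = sign * x * math.sqrt(x + 1.0)
    return (3.0 * x * x + 2.0 * x) / (2.0 * y)

# Approach the origin from both sides along each branch.
for x in (0.1, 0.01, 0.001, -0.001):
    print(x, slope_on_branch(x, +1), slope_on_branch(x, -1))
```

The printed values should approach $1$ on the branch with the plus sign and $-1$ on the branch with the minus sign, matching the two tangent lines $y = x$ and $y = -x$.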
◦ Often in physical applications, we will have systems involving a number of variables that are each a
function of time, and we want to relate those variables to each other.
◦ One very common situation is the following: a particle moves around in the Cartesian plane (i.e., the
$xy$-plane) over time, so that both its $x$-coordinate and $y$-coordinate are functions of $t$.
◦ Given a parametrization $x = x(t)$ and $y = y(t)$, we would like to analyze properties of the parametric
curve that $(x(t), y(t))$ traces out in the plane, as $t$ varies over some interval.
◦ To graph a parametric curve, it is sometimes possible to find a Cartesian equation for the curve of the
form $y = f(x)$. However, there is no general method for doing this.
◦ In the absence of a clever way to eliminate the variable $t$, the standard method is just to plug in many
values of the parameter $t$ to find some points $(x(t), y(t))$ on the curve, and then connect them up with a
smooth curve.
◦ To sketch a parametric curve, we first make a table of values: we plug in easy values for $t$ and compute
the corresponding points $(x, y)$.
t | 0 | π/6  | π/4  | π/3  | π/2 | 2π/3 | 3π/4  | 5π/6  | π
x | 1 | √3/2 | √2/2 | 1/2  | 0   | −1/2 | −√2/2 | −√3/2 | −1
y | 0 | 1/2  | √2/2 | √3/2 | 1   | √3/2 | √2/2  | 1/2   | 0
◦ If we plot these points and then join them up with a smooth curve, we see that the resulting shape seems to
be a circle.
◦ Indeed, it is a circle: from the Pythagorean identity $\sin^2(t) + \cos^2(t) = 1$, we immediately see that if
$x = \cos(t)$ and $y = \sin(t)$, then $x^2 + y^2 = 1$.
◦ [Figure: graphs of the parametric curve $x = \cos(t)$, $y = \sin(t)$ for $0 \le t \le \pi/2$, $0 \le t \le \pi$, and $0 \le t \le 2\pi$.]
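The table-of-values approach for $x = \cos(t)$, $y = \sin(t)$ is easy to automate. The sketch below, with illustrative variable names, generates the points for the same nine values of $t$ and checks that each satisfies $x^2 + y^2 = 1$.

```python
import math

# The same easy parameter values used when sketching by hand.
ts = [0, math.pi / 6, math.pi / 4, math.pi / 3, math.pi / 2,
      2 * math.pi / 3, 3 * math.pi / 4, 5 * math.pi / 6, math.pi]

# Points (x(t), y(t)) on the parametric curve x = cos(t), y = sin(t).
points = [(math.cos(t), math.sin(t)) for t in ts]

for t, (x, y) in zip(ts, points):
    print(f"t = {t:.4f}  x = {x:+.4f}  y = {y:+.4f}  x^2 + y^2 = {x * x + y * y:.4f}")
```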
◦ Motivated by the previous example, we see that the Pythagorean identity $\sin^2(2t) + \cos^2(2t) = 1$ again
implies that $x^2 + y^2 = 1$, so this curve is once again the unit circle.
◦ However, the parametrization is different: if we plug in a few values for $t$, we will in general not end up
with the same points as before. In fact, it is easy to see that this parametrization will move along the
unit circle twice as quickly as the previous one.
• Example: Describe the parametric curve $x = a\cos(t)$, $y = b\sin(t)$, for $0 \le t \le 2\pi$, where $a$ and $b$ are positive
real numbers.
◦ From our analysis earlier, this curve will have the same general shape as a circle, but stretched by a
factor of $a$ in the $x$-direction and by a factor of $b$ in the $y$-direction.
◦ This describes an ellipse: specifically, the ellipse $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$. (That this is a Cartesian equation for the
parametric curve is easy to see once it is pointed out: $x/a = \cos(t)$ and $y/b = \sin(t)$, and $\cos^2(t) + \sin^2(t) = 1$.)
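A quick numerical check of the Cartesian equation for the ellipse can be sketched as follows. The semi-axis values `a = 3` and `b = 2` are arbitrary illustrative choices; any positive values should behave the same way.

```python
import math

a, b = 3.0, 2.0  # illustrative semi-axes (assumed values, not from the notes)

def ellipse_point(t):
    """Point on the parametric curve x = a cos(t), y = b sin(t)."""
    return (a * math.cos(t), b * math.sin(t))

# Residual of x^2/a^2 + y^2/b^2 = 1 at eight equally spaced parameter values.
residuals = []
for k in range(8):
    t = 2.0 * math.pi * k / 8
    x, y = ellipse_point(t)
    residuals.append((x / a) ** 2 + (y / b) ** 2 - 1.0)

print(max(abs(r) for r in residuals))  # should be at the level of rounding error
```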
◦ Notice that this curve crosses itself at the origin: it passes through the origin both when $t = -1$ and
when $t = 1$.
◦ This curve has a special name: it is called a (singular) elliptic curve, plotted above.
◦ It can be verified that this particular curve also has a Cartesian equation given by $y^2 = x^3 + x^2$.
◦ An elliptic curve is a curve having the general form $y^2 = x^3 + ax^2 + bx + c$; such curves have a surprisingly
wide variety of applications, including in cryptography and in factoring algorithms.
◦ It is not particularly easy (or worthwhile) to plug in enough points to create accurate pictures by hand.
Instead, it is best to use a computer; here are the results:
◦ [Figure: computer-generated plots of the two curves.]
◦ The first graph is called a nine-pointed hypocycloid. (A hypocycloid is a curve traced out by a point on
a circle being rolled around the inside of a larger circle; to illustrate, the circle has been included in the
picture.)
◦ The second graph is a spiral starting at the origin and winding around counterclockwise with a continually
increasing radius.
• Example: Describe the parametric curve $x = t$, $y = 0$ for $-\infty < t < \infty$, and compare it to the curve
$x = \tan(t)$, $y = 0$ for $-\frac{\pi}{2} < t < \frac{\pi}{2}$.
◦ The first curve is just the $x$-axis, where the particle tracing out the curve moves at constant speed.
◦ The second curve is also the $x$-axis, but this time the particle moves at non-constant speed. (In fact, it
takes only a finite amount of time to cover the entire axis.)
• Parametric Differentiation Formula: The slope $\frac{dy}{dx}$ of the tangent line to the curve $(x(t), y(t))$ is given by
$\frac{dy}{dx} = \frac{dy/dt}{dx/dt} = \frac{y'(t)}{x'(t)}$.
◦ This formula is just a rearrangement of the Chain Rule $\frac{dy}{dt} = \frac{dy}{dx} \cdot \frac{dx}{dt}$; solving for $\frac{dy}{dx}$ gives the formula.
• Example: Find the slope of the tangent line at time $t$ for the curve parametrized by $x = \cos(t)$, $y = \sin(t)$.
◦ We have $x'(t) = -\sin(t)$ and $y'(t) = \cos(t)$, so by the formula, the slope of the tangent line to the circle
at time $t$ is $\frac{dy}{dx} = \frac{\cos(t)}{-\sin(t)} = -\cot(t)$.
◦ Remark: As we saw above, this curve is the unit circle. Note that the circle's radius has slope
$y/x = \sin(t)/\cos(t) = \tan(t)$, and $\tan(t) \cdot (-\cot(t)) = -1$, reflecting the fact that for a circle, the radius is
perpendicular to the tangent line.
◦ Note also that for $t = 0$ the formula gives $-1/0$, indicating that the tangent line is vertical.
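The parametric differentiation formula translates directly into code. The sketch below (function and variable names are illustrative, and $t = \pi/3$ is an arbitrary sample time) computes the tangent slope on the unit circle and checks that it is perpendicular to the radius, i.e. that the product of the two slopes is $-1$.

```python
import math

def parametric_slope(dx_dt, dy_dt, t):
    """dy/dx = (dy/dt) / (dx/dt); only meaningful when dx/dt is nonzero at t."""
    return dy_dt(t) / dx_dt(t)

def dx_dt(t):
    """Derivative of x(t) = cos(t)."""
    return -math.sin(t)

def dy_dt(t):
    """Derivative of y(t) = sin(t)."""
    return math.cos(t)

t = math.pi / 3  # arbitrary sample time with sin(t) and cos(t) both nonzero
tangent_slope = parametric_slope(dx_dt, dy_dt, t)
radius_slope = math.tan(t)
product = tangent_slope * radius_slope
print(tangent_slope, radius_slope, product)
```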
• Example: Show that the parametric curve $x = e^t + e^{-t}$, $y = e^t - e^{-t}$ traces out half of the hyperbola
$x^2 - y^2 = 4$. Then find the slope of the tangent line to the curve at time $t$.
◦ If $x = e^t + e^{-t}$ and $y = e^t - e^{-t}$, then $x^2 - y^2 = (e^t + e^{-t})^2 - (e^t - e^{-t})^2 = (e^{2t} + 2 + e^{-2t}) - (e^{2t} - 2 + e^{-2t}) = 4$.
◦ However, the parametric curve is only half of the hyperbola: $x = e^t + e^{-t}$ is always positive (since
exponentials are positive), so the parametric curve only traces out the half with $x > 0$.
◦ For the slope of the tangent, we have $y'(t) = e^t + e^{-t}$ and $x'(t) = e^t - e^{-t}$, so $\frac{dy}{dx} = \frac{y'(t)}{x'(t)} = \frac{e^t + e^{-t}}{e^t - e^{-t}}$.
◦ Note that since $x = e^t + e^{-t}$ and $y = e^t - e^{-t}$ we can also write $\frac{dy}{dx}$ as $\frac{x}{y}$. This is the result obtained
from implicit differentiation of $x^2 - y^2 = 4$.
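Both claims in this example, that the points lie on $x^2 - y^2 = 4$ and that the parametric slope agrees with $x/y$, can be spot-checked numerically. The sample times in this sketch are arbitrary nonzero values, and the function names are illustrative.

```python
import math

def point(t):
    """Point (x, y) = (e^t + e^-t, e^t - e^-t) on the parametric curve."""
    return (math.exp(t) + math.exp(-t), math.exp(t) - math.exp(-t))

def slope(t):
    """Parametric slope y'(t) / x'(t); undefined at t = 0, where x'(t) = 0."""
    return (math.exp(t) + math.exp(-t)) / (math.exp(t) - math.exp(-t))

checks = []
for t in (-1.0, 0.5, 2.0):  # arbitrary nonzero sample times
    x, y = point(t)
    checks.append((x * x - y * y, slope(t), x / y))
    print(t, x * x - y * y, slope(t), x / y)
```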
• In the specific situation where $(x(t), y(t))$ represents the position of a particle at time $t$, $x'(t)$ is the
velocity of the particle in the $x$-direction and $y'(t)$ is the velocity of the particle in the $y$-direction.
• Definition: If at time $t$ a particle has position $r(t) = (x(t), y(t))$, then its velocity is defined as
$v(t) = (x'(t), y'(t))$ and its acceleration is defined as $a(t) = (x''(t), y''(t))$. The particle's speed is the magnitude of
the velocity, which is defined as $\|v(t)\| = \sqrt{x'(t)^2 + y'(t)^2}$.
• Example: Find the velocity, speed, and acceleration of a particle whose position at time $t$ seconds is equal to
$r(t) = (2 + e^{2t}, 2 - e^{2t})$ meters.
◦ Since $x(t) = 2 + e^{2t}$ meters and $y(t) = 2 - e^{2t}$ meters, we have $x'(t) = 2e^{2t}$ m/s and $y'(t) = -2e^{2t}$ m/s, so
the velocity is $v(t) = (2e^{2t}, -2e^{2t})$ m/s.
◦ Then the speed is $\sqrt{x'(t)^2 + y'(t)^2} = \sqrt{(2e^{2t}\ \text{m/s})^2 + (-2e^{2t}\ \text{m/s})^2} = 2\sqrt{2}\,e^{2t}$ m/s.
◦ Likewise, since $x''(t) = 4e^{2t}$ and $y''(t) = -4e^{2t}$ (both in m/s$^2$), the acceleration is $a(t) = (4e^{2t}, -4e^{2t})$ m/s$^2$.
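The velocity and speed formulas for $r(t) = (2 + e^{2t}, 2 - e^{2t})$ can be compared against finite differences of the position. This is a minimal sketch: the sample time `t = 0.5` and the step size `h` are assumed values chosen for illustration.

```python
import math

def position(t):
    """r(t) = (2 + e^(2t), 2 - e^(2t)), in meters."""
    e = math.exp(2.0 * t)
    return (2.0 + e, 2.0 - e)

def velocity(t):
    """v(t) = (2 e^(2t), -2 e^(2t)), in m/s."""
    e = math.exp(2.0 * t)
    return (2.0 * e, -2.0 * e)

def speed(t):
    """Magnitude of the velocity vector."""
    vx, vy = velocity(t)
    return math.hypot(vx, vy)

t, h = 0.5, 1e-6  # assumed sample time and finite-difference step
(x1, y1), (x0, y0) = position(t + h), position(t - h)
numeric_velocity = ((x1 - x0) / (2.0 * h), (y1 - y0) / (2.0 * h))
print(velocity(t), numeric_velocity, speed(t))
```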
◦ In particular, linear approximations are useful when the relation between two variables can only be
written implicitly, and we would like to be able to say something about what happens to one variable if
the other changes.
2 More properly, we are actually treating (x(t), y(t)) as a vector, rather than an ordered pair representing the coordinates of a point
in the plane.
◦ Intuitively, the definition says that for small values of $\Delta x$, the quotient $\frac{f(x_0 + \Delta x) - f(x_0)}{\Delta x}$ should
be close to the value $f'(x_0)$, which we write as $\frac{f(x_0 + \Delta x) - f(x_0)}{\Delta x} \approx f'(x_0)$. (The symbol $\approx$ means
approximately equal to.)
◦ Clearing the denominator and rearranging yields $f(x_0 + \Delta x) \approx f(x_0) + f'(x_0) \cdot \Delta x$, where $\Delta x$ is small.
◦ Equivalently, if we write $x = x_0 + \Delta x$, this says $f(x) \approx f(x_0) + f'(x_0) \cdot (x - x_0)$, where $x - x_0$ is small
(i.e., when $x$ is close to $x_0$).
◦ Note how similar this expression is to the equation of the tangent line to $y = f(x)$ at $x = x_0$, which is
$y = f(x_0) + f'(x_0) \cdot (x - x_0)$.
◦ We have $f(x) = (1 + x)^3$ and $x_0 = 0$, and also $f'(x) = 3(1 + x)^2$ by the Chain Rule.
• The linearization is useful because it is the best linear approximation to a differentiable function:
• Theorem (Linear Approximation): If $f(x)$ is differentiable at $x = x_0$, then the linearization of $f(x)$ near $x = x_0$,
defined by $L(x) = f(x_0) + f'(x_0) \cdot (x - x_0)$, is the only linear function satisfying $\lim_{x \to x_0} \frac{f(x) - L(x)}{x - x_0} = 0$.
◦ Equivalently, this result says that the tangent line $y = L(x)$ is the only linear function with the property that,
when we approximate $f(x)$ with $L(x)$, the resulting error near $x = x_0$ is small when compared to $x - x_0$.
◦ Proof: Suppose $f(x)$ is differentiable at $x = x_0$.
◦ We compute $\lim_{x \to x_0} \frac{f(x) - L(x)}{x - x_0} = \lim_{x \to x_0} \frac{f(x) - f(x_0) - f'(x_0) \cdot (x - x_0)}{x - x_0} = \lim_{x \to x_0} \left[\frac{f(x) - f(x_0)}{x - x_0} - f'(x_0)\right] = 0$
by the definition of the derivative, so the linearization does have the desired property.
◦ Now suppose that $g(x) = a + b(x - x_0)$ is another linear function with $\lim_{x \to x_0} \frac{f(x) - g(x)}{x - x_0} = 0$.
◦ By the limit laws, $0 \cdot 0 = \lim_{x \to x_0} \frac{f(x) - g(x)}{x - x_0} \cdot \lim_{x \to x_0} (x - x_0) = \lim_{x \to x_0} (f(x) - g(x)) = f(x_0) - g(x_0)$, so $a = f(x_0)$.
◦ Also, $0 = \lim_{x \to x_0} \frac{f(x) - g(x)}{x - x_0} = \lim_{x \to x_0} \frac{f(x) - f(x_0) - b(x - x_0)}{x - x_0} = \lim_{x \to x_0} \left[\frac{f(x) - f(x_0)}{x - x_0} - b\right] = f'(x_0) - b$, so
$b = f'(x_0)$.
◦ So we see that the linearization is the only possible $g(x)$, as claimed.
• We can use the linearization of a function to find numerical approximations to function values.
◦ The idea is to find a function $f$, a nice value of $x_0$, and a small $\Delta x$ such that $f(x_0)$ is easy to compute
and $f(x_0 + \Delta x)$ is the value we want.
◦ Then $f(x_0 + \Delta x) \approx f(x_0) + f'(x_0) \cdot \Delta x$ gives the approximate value desired.
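• Example: Find a linear approximation to $f(x) = \ln(x)$ near $x = 1$, and use it to approximate $\ln(0.9)$.
◦ Since $f(1) = 0$ and $f'(x) = 1/x$ gives $f'(1) = 1$, the linearization is $L(x) = x - 1$, so $\ln(0.9) \approx -0.1$.
◦ With a calculator, we compute $\ln(0.9) \approx -0.10536\ldots$, so the linear approximation is quite good.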
• Example: Approximate the values of $\sqrt{1.1}$ and $\sqrt{1.4}$.
◦ To approximate $\sqrt{1.1}$, the most natural thing to try is $x_0 = 1$, $\Delta x = 0.1$, and $f(x) = \sqrt{x}$: then
$f(x_0 + \Delta x) = \sqrt{1.1}$ is the value we want to compute.
◦ For $f(x) = \sqrt{x} = x^{1/2}$ we have $f'(x) = \frac{1}{2}x^{-1/2}$.
◦ We get the approximation $\sqrt{1.1} = f(1 + 0.1) \approx 1 + \frac{1}{2} \cdot 1^{-1/2} \cdot (0.1) = 1 + \frac{1}{2} \cdot 0.1 = 1.05$.
◦ With a computer, we obtain $\sqrt{1.1} = 1.04881\ldots$, so we see that the linear approximation is quite good.
◦ If instead we take $\Delta x = 0.4$, we get the approximation $\sqrt{1.4} = f(1 + 0.4) \approx 1 + \frac{1}{2} \cdot 1^{-1/2} \cdot (0.4) = 1.2$.
◦ With a computer, we obtain $\sqrt{1.4} = 1.18322\ldots$. The approximation is still good, but not as good as
the previous one.
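The recipe $f(x_0 + \Delta x) \approx f(x_0) + f'(x_0) \cdot \Delta x$ is short enough to write as a reusable helper. In this sketch, `linearize` is a hypothetical helper name, and the derivative is supplied by hand rather than computed automatically.

```python
import math

def linearize(f, fprime, x0):
    """Return the function L(x) = f(x0) + f'(x0) * (x - x0)."""
    f0, slope = f(x0), fprime(x0)
    return lambda x: f0 + slope * (x - x0)

# Linearizations at x0 = 1 of sqrt(x) and ln(x), with hand-supplied derivatives.
sqrt_lin = linearize(math.sqrt, lambda x: 0.5 / math.sqrt(x), 1.0)
log_lin = linearize(math.log, lambda x: 1.0 / x, 1.0)

for name, approx, exact in [
    ("sqrt(1.1)", sqrt_lin(1.1), math.sqrt(1.1)),
    ("sqrt(1.4)", sqrt_lin(1.4), math.sqrt(1.4)),
    ("ln(0.9)", log_lin(0.9), math.log(0.9)),
]:
    print(f"{name}: linear {approx:.5f}, exact {exact:.5f}, error {abs(approx - exact):.5f}")
```

The printed errors illustrate the general pattern: the farther $x$ is from $x_0$, the worse the linear approximation tends to be.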
• Although the linearization gives the best possible linear approximation to a differentiable function, we have
not actually given any way to estimate how large the approximation error might be.
◦ If we do not actually know how accurate the approximation is, then it will not be particularly helpful
for doing calculations.
◦ It is possible to give an error estimate for the linearization (presuming the function is twice-differentiable),
but we will postpone discussion of this topic for now since we must develop more background first.
◦ For completeness, here is such an error estimate: if $f(x)$ is a function whose second derivative is continuous,
then for any values $a$ and $b$, if $L_a(x)$ is the linearization of $f$ at $x = a$, then
$|f(b) - L_a(b)| \le M \cdot \frac{|b - a|^2}{2}$, where $M$ is any constant such that $|f''(x)| \le M$ for all $x$ in the interval $[a, b]$.
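To see the error estimate in action, consider $f(x) = \sqrt{x}$ with $a = 1$. Here $f''(x) = -\frac{1}{4}x^{-3/2}$, so on any interval $[1, b]$ with $b > 1$ we may take $M = \frac{1}{4}$ as a bound on $|f''(x)|$. The sketch below (with an illustrative helper name) compares the actual linearization error to the bound $M \cdot \frac{|b - a|^2}{2}$.

```python
import math

def linearization_error_bound(M, a, b):
    """Upper bound M * |b - a|^2 / 2 on |f(b) - L_a(b)|, given a bound M on |f''| over [a, b]."""
    return M * (b - a) ** 2 / 2.0

a = 1.0
M = 0.25  # bound on |f''(x)| = x^(-3/2) / 4 for x >= 1
results = []
for b in (1.1, 1.4):
    linear_value = 1.0 + 0.5 * (b - a)  # L_a(b) for f(x) = sqrt(x) at a = 1
    actual = abs(math.sqrt(b) - linear_value)
    bound = linearization_error_bound(M, a, b)
    results.append((b, actual, bound))
    print(f"b = {b}: actual error {actual:.5f}, bound {bound:.5f}")
```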
• Remark: There is no reason only to consider approximation by linear functions, aside from the fact that linear
functions are the easiest to analyze. We could just as well look for quadratic approximations $a_0 + a_1 x + a_2 x^2$,
or cubic approximations $a_0 + a_1 x + a_2 x^2 + a_3 x^3$, or general polynomial approximations to $f(x)$.
◦ This is precisely the idea behind Taylor polynomials, which (among other things) give even better
approximations of a function than the linearization does. However, developing these ideas fully requires
integral calculus, so we will not pursue the discussion further at the moment.
◦ It is also possible to use different classes of functions to approximate functions, such as trigonometric
functions. Approximating a function by one of the form $a_0 + a_1\sin(x) + b_1\cos(x) + a_2\sin(2x) + b_2\cos(2x) + \cdots$
leads to the idea of a Fourier series, which possesses an equally deep and useful theory (and which,
unfortunately, we also will not develop further at the moment).
2.7.2 Differentials
• Definition: When $y = f(x)$ is given as a function of $x$, the differential $dy$ is defined as $dy = f'(x)\,dx$, where
$dx$ is an independent variable.
◦ As can be seen by rearranging the definition, we have defined differentials so that the derivative, which
we have denoted as $f'(x) = \frac{dy}{dx}$, is now actually a quotient of variables.
◦ The intuitive idea of differentials is that as $\Delta x$ tends to zero the approximate equality $\Delta y \approx f'(x_0) \cdot \Delta x$
becomes more and more accurate. In the limit we turn the $\Delta$ symbols into $d$ symbols.
• Differentials are a piece of what is sometimes (pejoratively) called "abstract nonsense": some theoretical
notation which is useful and yields correct answers, but whose rigorous use we do not have the tools to justify
at this stage.
◦ There are several different technical ways to make sense of differentials, but all of them involve significantly
more advanced mathematics than calculus.
◦ Our primary interest in differentials is that they greatly simplify some computations and arguments in
integral calculus, particularly in calculations involved in changing variables.
◦ Differentials also simplify and make clearer some of the properties of derivatives: for example, the Chain
Rule's differential form $\frac{dz}{dx} = \frac{dz}{dy} \cdot \frac{dy}{dx}$ can now be interpreted as a cancellation of the $dy$ terms.
◦ Note that we obtained the same result either way, on account of the Chain Rule.
◦ Thus, if we know the rate that one quantity is changing, we can find the rate at which the other is
changing. Problems of this type are called related rates problems, as they involve analyzing the rate
of change of one quantity using information about related quantities.
◦ First, if not already given, choose variable names and translate the given information into mathematical
statements. Identify all relations between the variables. Draw a picture, if the relations are not
immediately clear.
◦ Finally, solve for the desired quantity using the given information and the information obtained from the
derivatives.
• Example: A spherical balloon is inflated such that its volume increases by $40\pi\ \text{cm}^3$ per second. Find the rate
at which the balloon's radius is increasing (i) when the radius is 2 cm, and (ii) when the radius is 5 cm.
◦ The volume of the balloon is given by $V = \frac{4}{3}\pi r^3$ (in $\text{cm}^3$), where $r$ is the radius of the balloon (in cm).
◦ We are given that $\frac{dV}{dt} = 40\pi\ \text{cm}^3/\text{s}$ and want to find $\frac{dr}{dt}$.
◦ Differentiating the relation $V = \frac{4}{3}\pi r^3$ via the Chain Rule gives $\frac{dV}{dt} = 4\pi r^2 \frac{dr}{dt}$.
◦ Then, solving for $\frac{dr}{dt}$ gives $\frac{dr}{dt} = \frac{dV/dt}{4\pi r^2}$. Now we can plug in the appropriate values to find the desired
information.
◦ When $r = 2$ we see $\frac{dr}{dt} = \frac{40\pi\ \text{cm}^3/\text{s}}{4\pi \cdot 2^2\ \text{cm}^2} = \frac{5}{2}$ cm/s. So when $r = 2$, the radius is increasing at 2.5 cm/s.
◦ When $r = 5$ we see $\frac{dr}{dt} = \frac{40\pi\ \text{cm}^3/\text{s}}{4\pi \cdot 5^2\ \text{cm}^2} = \frac{2}{5}$ cm/s. So when $r = 5$, the radius is increasing at 0.4 cm/s.
◦ Note that we obtain the correct units (cm/s, distance per time) for the rate of change $\frac{dr}{dt}$.
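The final formula $\frac{dr}{dt} = \frac{dV/dt}{4\pi r^2}$ can be evaluated for any radius with a one-line function. The sketch below uses an illustrative function name and reproduces the two cases from the example.

```python
import math

def radius_rate(dV_dt, r):
    """dr/dt for a sphere, from dV/dt = 4 * pi * r^2 * dr/dt (units: cm and seconds)."""
    return dV_dt / (4.0 * math.pi * r**2)

dV_dt = 40.0 * math.pi  # given inflation rate in cm^3/s
for r in (2.0, 5.0):
    print(f"r = {r} cm: dr/dt = {radius_rate(dV_dt, r):.3f} cm/s")
```

As the output suggests, the same volume rate produces a slower radius growth for a larger balloon, since the added volume is spread over a larger surface.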
• Example: Two ships leave a port, one traveling north and the other traveling east. After 4 hours, the north-
moving ship is moving at 40 km/h and is 120 km from the port, and the east-moving ship is moving at 30 km/h
and is 90 km from the port. Find the rate at which the distance between the ships is changing after 4 hours.
◦ Let the distance of the north-traveling ship from the port be $x(t)$ and the distance of the east-traveling
ship from the port be $y(t)$, where $t$ is measured in hours.
◦ Then by the Pythagorean Theorem, the distance $d(t)$ between the ships satisfies $d(t)^2 = x(t)^2 + y(t)^2$.
We want to find $d'(t)$.
◦ The given information says that $x(4) = 120$ km, $x'(4) = 40$ km/h, $y(4) = 90$ km, and $y'(4) = 30$ km/h.
◦ Plugging in $t = 4$ gives $d(4)^2 = x(4)^2 + y(4)^2 = (120\ \text{km})^2 + (90\ \text{km})^2 = 150^2\ \text{km}^2$, so $d(4) = 150$ km.
◦ Now, differentiating the relation $d(t)^2 = x(t)^2 + y(t)^2$ via the Chain Rule gives
$2 \cdot d(t) \cdot d'(t) = 2 \cdot x(t) \cdot x'(t) + 2 \cdot y(t) \cdot y'(t)$.
◦ Upon solving for the desired quantity $d'(t)$, we see $d'(t) = \frac{x(t) \cdot x'(t) + y(t) \cdot y'(t)}{d(t)}$.
◦ If we then take $t = 4$ and plug in all of the known values, we get
$d'(4) = \frac{x(4) \cdot x'(4) + y(4) \cdot y'(4)}{d(4)} = \frac{(120\ \text{km})(40\ \text{km/h}) + (90\ \text{km})(30\ \text{km/h})}{150\ \text{km}} = \frac{7500\ \text{km}^2/\text{h}}{150\ \text{km}} = 50$ km/h.
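The ship computation follows a pattern that works for any two objects moving along perpendicular lines from a common point. A minimal sketch, with a hypothetical function name and the distances and speeds from the example as inputs:

```python
import math

def separation_rate(a, da_dt, b, db_dt):
    """Rate of change of d = sqrt(a^2 + b^2), from d * d' = a * a' + b * b'."""
    d = math.hypot(a, b)
    return (a * da_dt + b * db_dt) / d

# One ship is 120 km out moving at 40 km/h, the other is 90 km out moving at 30 km/h.
rate = separation_rate(120.0, 40.0, 90.0, 30.0)
print(rate)  # rate of change of the distance between the ships, in km/h
```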
• Example: A large rectangular box has a length of 6 m, a width of 4 m, and a height of 3 m. The length is
changing at a rate of +1 m/s, the width is changing at a rate of −2 m/s, and the height is changing at a rate of
+1 m/s. Find the rates at which the volume and total surface area of the box are changing.
◦ Let the length be $l(t)$, the width be $w(t)$, and the height be $h(t)$.
◦ We are given $l = 6$ m, $w = 4$ m, $h = 3$ m, $l' = +1$ m/s, $w' = -2$ m/s, and $h' = +1$ m/s, all at time $t = 0$.
◦ The volume is given by $V = l \cdot w \cdot h$ and the surface area is given by $A = 2lw + 2lh + 2wh$.
◦ Differentiating each expression yields
$\frac{dV}{dt} = \frac{d}{dt}[(lw) \cdot h] = \frac{d}{dt}[lw] \cdot h + (lw) \cdot \frac{dh}{dt} = \left(\frac{dl}{dt} \cdot w + l \cdot \frac{dw}{dt}\right) \cdot h + (lw) \cdot \frac{dh}{dt} = \frac{dl}{dt}wh + l\frac{dw}{dt}h + lw\frac{dh}{dt}$
and
$\frac{dA}{dt} = \frac{d}{dt}[2lw + 2lh + 2wh] = 2\left[\left(\frac{dl}{dt}w + l\frac{dw}{dt}\right) + \left(\frac{dl}{dt}h + l\frac{dh}{dt}\right) + \left(\frac{dw}{dt}h + w\frac{dh}{dt}\right)\right]$.
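◦ Now we just need to plug in all of the given data. We obtain
$\frac{dV}{dt} = (1)(4)(3) + (6)(-2)(3) + (6)(4)(1) = 12 - 36 + 24 = 0\ \text{m}^3/\text{s}$
and
$\frac{dA}{dt} = 2\left[(4 - 12) + (3 + 6) + (-6 + 4)\right] = 2 \cdot (-1) = -2\ \text{m}^2/\text{s}$.
◦ So the volume of the box is not changing, and the surface area is decreasing at $2\ \text{m}^2/\text{s}$.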
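Because the box example involves many terms, it is easy to make an arithmetic slip when plugging in. The following sketch (function names are illustrative) evaluates the two product-rule expressions for $\frac{dV}{dt}$ and $\frac{dA}{dt}$ directly from the given data; a negative result means the quantity is decreasing.

```python
def volume_rate(l, w, h, dl, dw, dh):
    """dV/dt for V = l*w*h, by the Product Rule."""
    return dl * w * h + l * dw * h + l * w * dh

def surface_area_rate(l, w, h, dl, dw, dh):
    """dA/dt for A = 2lw + 2lh + 2wh, by the Product Rule."""
    return 2.0 * ((dl * w + l * dw) + (dl * h + l * dh) + (dw * h + w * dh))

# Dimensions in meters and rates in m/s, as given in the example.
data = dict(l=6.0, w=4.0, h=3.0, dl=1.0, dw=-2.0, dh=1.0)
dV = volume_rate(**data)
dA = surface_area_rate(**data)
print(f"dV/dt = {dV} m^3/s, dA/dt = {dA} m^2/s")
```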