Differentiation - in A Nutshell: The Fundamentals

The document provides an overview of differentiation in 3 sentences or less: It discusses the basic definition of differentiation as finding the slope of the tangent line to a function at a point and approximating functions locally with polynomials. The derivative of a function f(x) is defined as the coefficient of ε in the approximation f(x + ε) ≈ f(x) + εf'(x). Higher order derivatives provide better approximations by including higher order terms in this polynomial approximation.

Uploaded by

Raghav Goel

Differentiation – In a Nutshell

Dennis Chen, Tanush Chopra, Vishal Muthuvel


2020

We discuss differentiation in a nutshell and provide a rundown of the basic definition, the fundamental
laws, a few other fundamental derivatives that are good to know, and derive the trigonometric derivatives.
At the end we present a collection of problems, ranging from accessible to challenging. A large portion of
the first section is based on “evan explains differentiation in 20 minutes with no rigor whatsoever,” which I
recommend watching.

1 The Fundamentals
I suspect that most people reading this handout will already know the limit definition of the derivative,
which I am absolutely not interested in. Therefore, we instead provide two more intuitive and fundamental
definitions for those who feel that they are just pushing symbols. (Both of these definitions are secretly the
same idea.) The first one is for those who cannot understand what a derivative means at all, and the second
one is the one we are going to push to understand what calculus means at all.
Actually, I am going to state this explicitly, because this is the result we’re building up to anyways:
calculus is about approximating functions (and perturbations of functions at a point) with a polynomial.

Tangent Line. The derivative of a function at a point is just the slope of the tangent line to that point.

Derivatives are Approximations. Derivatives are a way to approximate a function near a particular point.

None of these means anything without an example, so we provide the prototypical example of $f(x) = x^2$.

Example. Find the derivative of $f(x) = x^2$ at $x = 2$ (and describe what this means).

Solution: Note that the line is tangent to the curve at the point $(2, 4)$, which means that it is a linear approximation of $f(x)$ at points close to $(2, 4)$. In particular, this means that
\[ f(2 + \epsilon) \approx 4 + \underbrace{\text{slope}}_{\text{derivative of } f \text{ at } 2} \times \epsilon. \]

We denote the derivative of $f$ at 2 as $f'(2)$. To compute this, we can just expand $(2 + \epsilon)^2 = 4 + 4\epsilon + \epsilon^2$, and so approximating it as a linear function will give you $4 + 4\epsilon$. Thus the derivative is 4.
This mindset will completely explain the Power Rule and higher order derivatives (such as second order
derivatives). But first, we should define the order of a function. Violent abuses of notation will follow.

1.1 Order of a Function


Borrowing a concept from computer science, we discuss the order of a function. Typically in computer science we want to examine how programs behave as their input grows large, so a function that runs in $n^2 + 7n + 4$ time is said to be "$O(n^2)$," because only the $n^2$ term matters when $n$ grows large. But in calculus we want to examine the behavior of a function as the perturbation $\epsilon$ grows smaller; thus, the most significant term is the constant term, followed by the linear term, then the quadratic term, etc. Terms with smaller degrees are more significant.

Order. We say that the order of a function is $O(\epsilon^k)$ if the smallest degree of $\epsilon$ in a term is $k$.

Now we can be more specific when we say $f(2 + \epsilon) \approx 4 + 4\epsilon$: what we really mean is that $f(2 + \epsilon) = 4 + 4\epsilon + O(\epsilon^2)$, where we truthfully could not care less what the function $O(\epsilon^2)$ entails.
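To see the order statement numerically, here is a minimal Python sketch (the names `f` and `remainder` are ours, not the handout's). It measures what is left of $f(2 + \epsilon)$ after subtracting the linear approximation and compares it with $\epsilon^2$; if the leftover really is $O(\epsilon^2)$, the ratio should stay bounded as $\epsilon$ shrinks.

```python
def f(x):
    return x ** 2

def remainder(eps):
    """Error left over after the linear approximation 4 + 4*eps at x = 2."""
    return f(2 + eps) - (4 + 4 * eps)

for eps in (1e-1, 1e-2, 1e-3):
    ratio = remainder(eps) / eps ** 2
    print(f"eps={eps:g}  remainder={remainder(eps):.3e}  remainder/eps^2={ratio:.6f}")
```

For this particular $f$ the remainder is exactly $\epsilon^2$, so the printed ratio should sit at 1 up to floating-point rounding.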

1.2 The Power Rule


We can take the derivative of a function at a point – but generally we will want to analyze this derivative as
the point changes. Thus, the derivative of a function itself is a function.

Derivative of a Function. We denote the derivative of $f$ at an arbitrary point $x$ as $f'(x)$, expressed in terms of $x$. It satisfies the equation

\[ f(x + \epsilon) = f(x) + \epsilon f'(x) + O(\epsilon^2). \]

This fundamentally means that $f'(x)$ is the coefficient of the $\epsilon$ term of $f(x + \epsilon)$.
We take $f(x) = x^2$ again as an example.

Example. What is the derivative of $f(x) = x^2$, expressed as a function of $x$?

Solution: Note that $f(x + \epsilon) = (x + \epsilon)^2 = x^2 + 2x\epsilon + \epsilon^2$. Since $\epsilon^2$ is $O(\epsilon^2)$ and the rest of the terms are not, $f'(x) = 2x$.
Now we go further – what's the derivative of $f(x) = x^n$ in general?

Power Rule. If $f(x) = x^n$, then $f'(x) = nx^{n-1}$.

Proof. By the Binomial Theorem, $f(x + \epsilon) = (x + \epsilon)^n = x^n + nx^{n-1}\epsilon + O(\epsilon^2)$, so the derivative is the coefficient of $\epsilon$, namely $nx^{n-1}$.

Keep in mind that this holds for non-integer $n$ as well because of the Extended Binomial Theorem (as $\binom{n}{1}$ is always $n$).
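The Power Rule proof can be replayed for a concrete case. The sketch below is only an illustration for non-negative integer $n$ (the helper name `expansion_coefficients` is hypothetical): it lists the coefficients of $(x + \epsilon)^n$ by powers of $\epsilon$ and reads off the $\epsilon^1$ coefficient.

```python
from math import comb

def expansion_coefficients(n, x):
    """Coefficients of eps^0, eps^1, ..., eps^n in (x + eps)^n, integer n >= 0."""
    return [comb(n, k) * x ** (n - k) for k in range(n + 1)]

n, x = 5, 3
coeffs = expansion_coefficients(n, x)
print(coeffs)                       # lowest power of eps first
print(coeffs[1], n * x ** (n - 1))  # the eps coefficient next to n*x^(n-1)
```

The two numbers on the last line should agree, since the $\epsilon$ coefficient is $\binom{n}{1}x^{n-1}$.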

Exercise. Find the equation of the line tangent to $y = x^4 + 3x^2$ at $(2, 28)$.

1.3 Higher-Order Derivatives
If the first derivative gives the linear approximation (the coefficient of the $\epsilon$ term) of $f(x + \epsilon)$, then it intuitively follows that the second derivative gives the quadratic approximation of $f(x + \epsilon)$, or in other words, it determines the coefficient of the $\epsilon^2$ term (up to the factorial that appears in the Taylor Series). With this understanding, note that adding higher order approximations just reduces the error in our previous approximation. If you keep adding higher order approximations, you should be able to get the function itself, and this is the idea behind Taylor Series.

Taylor Series. Given a function $f$,

\[ f(x + \epsilon) = \frac{f(x)}{0!} + \frac{f'(x)\epsilon}{1!} + \frac{f''(x)\epsilon^2}{2!} + \frac{f'''(x)\epsilon^3}{3!} + \cdots. \]ᵃ
ᵃ This is non-standard notation for the Taylor expansion, but I personally believe this is the best way to think about it.

This is the entire definition of higher order derivatives. Even with non-polynomial functions, there is usually¹ a Taylor Series, because we can just keep approximating.

nth Derivative. We denote the $n$th derivative, as defined by the Taylor Series, as $f^{(n)}(x)$. We denote the result of taking the derivative of $y = f(x)$ $n$ times as $\frac{d^n y}{dx^n} = \frac{d^n}{dx^n} f(x)$.

We now prove that the $n$th derivative is achieved by taking the derivative $n$ times to show that it is consistent with our Taylor Series definition.

nth Derivative results from taking the derivative n times. Given a function $y = f(x)$,

\[ f^{(n)}(x) = \frac{d^n}{dx^n} f(x), \]

where the derivative of $f$ is taken $n$ times.

This proof relies on the fact that derivatives are additive and scalar multiplicative, which we will not formally prove until later. An astute reader will note that the factorial denominators are not strictly necessary to sum up approximations, and that we in fact contrive the definition of Taylor series to create this consistency in the definitions.

Proof. Since derivatives are additive and scalar multiplicative, and the functions we consider can be written as power series, we need only prove this for monomials $f(x) = x^c$. We note that by the Power Rule, $f'(x) = cx^{c-1}$. Applying this recursively yields

\[ \frac{d^n}{dx^n} f(x) = c(c-1)(c-2)\cdots(c-(n-1))x^{c-n}. \]

Now note by expanding $f(x + \epsilon) = (x + \epsilon)^c$ we get the coefficient of the $\epsilon^n$ term to be

\[ \binom{c}{n} x^{c-n}. \]

Since
\[ \frac{c(c-1)(c-2)\cdots(c-(n-1))x^{c-n}}{n!} = \binom{c}{n} x^{c-n}, \]
we have shown the desired result.

¹ For our purposes, we are only going to look at functions with Taylor Series – but keep in mind there are weird functions that do not, namely those that are not differentiable at any point (which do exist).

Thus the notation is interchangeable, and more crucially, the definitions are too.
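The identity that closes the proof can be spot-checked for small non-negative integers. This is a sketch under that restriction only (the proof itself also covers non-integer $c$), and the helper name `nth_derivative_coefficient` is ours.

```python
from math import comb, factorial, perm

def nth_derivative_coefficient(c, n):
    """c(c-1)...(c-(n-1)): the factor from applying the Power Rule n times to x^c."""
    return perm(c, n)

for c in range(0, 8):
    for n in range(0, c + 1):
        # Dividing by n! must give the coefficient C(c, n) of eps^n in (x + eps)^c.
        assert nth_derivative_coefficient(c, n) == factorial(n) * comb(c, n)
print("falling factorial equals n! * C(c, n) for all 0 <= n <= c < 8")
```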

Maclaurin Series. If you want to approximate $f(\epsilon)$ with a polynomial, plug in $x = 0$ to get

\[ f(0 + \epsilon) = \frac{f(0)}{0!} + \frac{f'(0)\epsilon}{1!} + \frac{f''(0)\epsilon^2}{2!} + \frac{f'''(0)\epsilon^3}{3!} + \cdots \]

Remember that we're taking $x = 0$, not $\epsilon = 0$.ᵃ
ᵃ Standard notation for the Taylor Series actually does not abide by this principle, but we made a design choice to emphasize that $\epsilon$ is slightly perturbing the function at a point, and then approximating it at $x$ with the derivative.

A corollary of Maclaurin Series is L'Hopital's Rule.

L'Hopital's Rule. If $f(0) = g(0) = 0$, $k$ is the smallest number such that $g^{(k)}(0) \neq 0$, and $f^{(j)}(0) = 0$ for all $j < k$, then

\[ \lim_{x \to 0} \frac{f(x)}{g(x)} = \frac{f^{(k)}(0)}{g^{(k)}(0)}. \]

Proof. Note the Maclaurin Series look like

\[ f(x) = \frac{f^{(k)}(0)x^k}{k!} + O(x^{k+1}), \]
\[ g(x) = \frac{g^{(k)}(0)x^k}{k!} + O(x^{k+1}), \]
so
\[ \lim_{x \to 0} \frac{f(x)}{g(x)} = \lim_{x \to 0} \frac{f^{(k)}(0) + O(x)}{g^{(k)}(0) + O(x)} = \frac{f^{(k)}(0)}{g^{(k)}(0)}. \]
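As a sanity check of the rule with $k = 2$, here is a small numerical sketch. The functions $f(x) = 1 - \cos x$ and $g(x) = x^2$ are our own choice of example, not taken from the handout.

```python
import math

def ratio(x):
    """f(x)/g(x) with f(x) = 1 - cos x and g(x) = x^2 (both vanish at 0)."""
    return (1 - math.cos(x)) / x ** 2

# Here f'(0) = g'(0) = 0, and k = 2 is the first order with g^(k)(0) != 0:
# f''(0) = cos 0 = 1 and g''(0) = 2, so the rule predicts a limit of 1/2.
predicted = 1 / 2
for x in (1e-1, 1e-2, 1e-3):
    print(f"x={x:g}  f/g={ratio(x):.8f}  predicted={predicted}")
```

Much smaller values of $x$ are avoided on purpose, because the subtraction $1 - \cos x$ loses precision in floating point.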

A proof with the limit definition of derivatives is also quite straightforward. We do not prove the more general rule, which is that
\[ \lim_{x \to a} \frac{f(x)}{g(x)} = \lim_{x \to a} \frac{f'(x)}{g'(x)} \]
in the indeterminate forms, but you still do need to know this.

Exercise. Find $\lim_{x \to 0} \frac{\sin(2x)}{x + x^2}$.

1.4 Aside: Limit and Order Definitions


We address a comment from mathchampion1 here: "You don't explicitly prove that the derivative actually gives you the slope of the tangent line when you start out with the approximation-based definition."

Tangent slope and approximation give the same derivative. Say that $f(x + \epsilon) = f(x) + f'(x)\epsilon + O(\epsilon^2)$. Then $y - f(a) = f'(a)(x - a)$ is tangent to $f$ at $(a, f(a))$.

Proof. By the limit definition, the slope of the tangent line at $x$ is
\[ \lim_{\epsilon \to 0} \frac{f(x + \epsilon) - f(x)}{\epsilon}. \]
Rearranging the approximation-based definition gives
\[ \frac{f(x + \epsilon) - f(x)}{\epsilon} = f'(x) + O(\epsilon). \]
Since $\epsilon$ approaches 0 and $O(\epsilon^2)$ is smaller than linear, the $O(\epsilon)$ term vanishes in the limit. The slope of the tangent line is therefore $f'(x)$, and the two definitions are consistent.

1.5 Summary
We give a summary here because the fundamental ideas are quite advanced/non-standard; we do not continue this for other sections because formula sheets already exist.
• Derivatives are approximations.
  ◦ The first derivative is the linear approximation, the second derivative is the quadratic approximation, and so on.
  ◦ The Taylor Series is defined as
    \[ f(x + \epsilon) = \frac{f(x)}{0!} + \frac{f'(x)\epsilon}{1!} + \frac{f''(x)\epsilon^2}{2!} + \frac{f'''(x)\epsilon^3}{3!} + \cdots. \]
  ◦ Plugging in $x = 0$ gives the Maclaurin Series, which can be used to express a function as a polynomial without the perturbation perspective.
• The factorials in the Taylor Series exist solely to make the Taylor Series definition consistent with differentiating a function $n$ times.
  ◦ You prove this by noting that derivatives are additive and scalar multiplicative and then taking the $n$th derivative of each term at a time.
• L'Hopital's Rule
  ◦ If $\frac{f(a)}{g(a)}$ is indeterminate and $g'(a) \neq 0$,
    \[ \lim_{x \to a} \frac{f(x)}{g(x)} = \frac{f'(a)}{g'(a)}. \]
  ◦ This can be proved with Maclaurin Series or the limit definition of a derivative.

2 Laws of Differentiation
I will say the following explicitly: Everything is based off of the chain and product rules (except for the additive and scalar multiplicative properties, which are just obvious).

Derivatives are Additive and Scalar Multiplicative. For any functions $f$, $g$ (where $(f + g)(x)$ denotes the function $f(x) + g(x)$), and given scalars $a$, $b$,

\[ a f^{(n)}(x) + b g^{(n)}(x) = (af + bg)^{(n)}(x). \]

Proof. Consider the functions as Taylor Series, which are polynomials, and note that the $n$th degree terms of polynomials are additive and scalar multiplicative for all $n$.ᵃ
ᵃ This is also why $(fg)' \neq f'g'$: it should not be too hard to think of two polynomials $f$ and $g$ such that the $x$ coefficient of $fg$ is different from the product of the $x$ coefficient of $f$ and the $x$ coefficient of $g$.

Okay, now onto the heavy lifting.

2.1 Fundamental Laws of Differentiation


These are the chain and product rules.
The results are the quotient rule, inverse function rule, and implicit differentiation. The quotient rule is
a consequence of the chain and product rules, and implicit differentiation is a special case of the chain rule
(which we will not prove).

Product Rule. Given differentiable functions $f(x)$, $g(x)$,

\[ (f(x)g(x))' = f'(x)g(x) + f(x)g'(x). \]

We prove this through algebraic manipulations and Taylor Series.

Proof 1 (Limit). By the limit definition of the derivative, we get that

\[ (fg)'(a) = \lim_{x \to a} \frac{f(x)g(x) - f(a)g(a)}{x - a}. \]
To see why the product rule must be true, we proceed by adding and subtracting $\frac{f(a)g(x)}{x - a}$ inside the above limit:
\[ \begin{aligned}
(fg)'(a) &= \lim_{x \to a} \frac{f(x)g(x) - f(a)g(a)}{x - a} \\
&= \lim_{x \to a} \frac{f(x)g(x) - f(a)g(x) + f(a)g(x) - f(a)g(a)}{x - a} \\
&= \lim_{x \to a} \left( \frac{f(x) - f(a)}{x - a} \cdot g(x) + f(a) \cdot \frac{g(x) - g(a)}{x - a} \right) \\
&= \lim_{x \to a} \frac{f(x) - f(a)}{x - a} \cdot \lim_{x \to a} g(x) + \lim_{x \to a} f(a) \cdot \lim_{x \to a} \frac{g(x) - g(a)}{x - a} \\
&= f'(a)g(a) + f(a)g'(a).
\end{aligned} \]
The last step uses $\lim_{x \to a} g(x) = g(a)$, which holds because a differentiable function is continuous.

Proof 2 (Taylor). Note $f(x + \epsilon) = f(x) + f'(x)\epsilon + O(\epsilon^2)$ and $g(x + \epsilon) = g(x) + g'(x)\epsilon + O(\epsilon^2)$, so

\[ f(x + \epsilon)g(x + \epsilon) = f(x)g(x) + (f'(x)g(x) + f(x)g'(x))\epsilon + O(\epsilon^2). \]

By the definition of the derivative, $(f(x)g(x))'$ is the coefficient of the $\epsilon$ term, or $f'(x)g(x) + f(x)g'(x)$, as desired.
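The bookkeeping in Proof 2 can be mechanized. The sketch below uses dual numbers, a standard technique that the handout does not name: each value carries its $\epsilon^0$ and $\epsilon^1$ coefficients, and every $\epsilon^2$ term is dropped. The class `Dual` and the helper `variable` are hypothetical names chosen for this illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Dual:
    """A value a + b*eps, where every eps^2 term is discarded."""
    value: float   # coefficient of eps^0, i.e. f(x)
    deriv: float   # coefficient of eps^1, i.e. f'(x)

    def __add__(self, other):
        return Dual(self.value + other.value, self.deriv + other.deriv)

    def __mul__(self, other):
        # (a + b*eps)(c + d*eps) = ac + (bc + ad)*eps + O(eps^2)
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

def variable(x):
    """The function f(x) = x evaluated at x + eps."""
    return Dual(x, 1.0)

x = variable(3.0)
f = x * x            # f(x) = x^2
g = x * x * x + x    # g(x) = x^3 + x
product = f * g      # (fg)(x) = x^5 + x^3
print(product.deriv) # compare with 5*3^4 + 3*3^2
```

The multiplication rule inside `__mul__` is exactly the product rule, which is why the derivative of $x^5 + x^3$ comes out without ever expanding the polynomial.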

Chain Rule. Given differentiable functions $f$ and $g$,

\[ (f(g(x)))' = f'(g(x))g'(x). \]

Proof. By the definition of the derivative as a limit, we want the derivative of the composed function $f(g(x))$ at $x = a$ to be
\[ \lim_{x \to a} \frac{f(g(x)) - f(g(a))}{x - a}. \]
But why do we multiply the derivative of $f$ with respect to $g(x)$ by the derivative of $g(x)$ when differentiating a composition of functions? Well, that is the question we address in this proof.
To start, the limit definition of the derivative of $f$, evaluated at the input $g(a)$, gives
\[ f'(g(a)) = \lim_{x \to a} \frac{f(g(x)) - f(g(a))}{g(x) - g(a)}. \]
Note here that the denominator contains $g(x) - g(a)$ because we must measure the change of $f(g(x))$ relative to the change of its input (i.e., $g(x)$ in this case, as it is the input of the outer function). We assume here that $g(x) \neq g(a)$ for $x$ near $a$, so that the quotient is defined. Multiplying the above limit by the derivative of $g$ as a limit, we get the following:

\[ f'(g(a)) \cdot g'(a) = \lim_{x \to a} \frac{f(g(x)) - f(g(a))}{g(x) - g(a)} \cdot \frac{g(x) - g(a)}{x - a} = \lim_{x \to a} \frac{f(g(x)) - f(g(a))}{x - a} = \left. \frac{d}{dx} f(g(x)) \right|_{x = a}. \]

And this expression is exactly what we want. To complete this proof, we summarize the above findings as
\[ \left. \frac{d}{dx} f(g(x)) \right|_{x = a} = f'(g(a)) \cdot g'(a). \]
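A quick numerical check of the Chain Rule, as a sketch only: the composition $\sin(x^2)$, the point $a = 1.3$, and the helper `central_difference` are all our own choices for illustration.

```python
import math

def central_difference(func, x, h=1e-6):
    """Symmetric difference quotient, a numerical stand-in for the derivative."""
    return (func(x + h) - func(x - h)) / (2 * h)

def outer(u):
    return math.sin(u)

def inner(x):
    return x ** 2

def composed(x):
    return outer(inner(x))

a = 1.3
numeric = central_difference(composed, a)
chain_rule = math.cos(inner(a)) * (2 * a)   # f'(g(a)) * g'(a)
print(numeric, chain_rule)
```

The two printed values should agree to several decimal places; they differ only by the discretization and rounding error of the difference quotient.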

2.2 Quotient Rule


We first prove the reciprocal rule as a lemma.²

Reciprocal Rule. Given a function $f$,

\[ \left( \frac{1}{f(x)} \right)' = -\frac{f'(x)}{f(x)^2}. \]

Proof. This is just a consequence of the chain rule, since the inner function is $f$, the outer function is $\frac{1}{x}$, and the derivative of $\frac{1}{x}$ is $-\frac{1}{x^2}$.

Now we can prove the quotient rule in its full glory.

Quotient Rule. Given functions $f$, $g$,

\[ \left( \frac{f(x)}{g(x)} \right)' = \frac{f'(x)g(x) - f(x)g'(x)}{g(x)^2}. \]

Proof. Note that
\[ \left( f(x) \cdot \frac{1}{g(x)} \right)' = f'(x) \cdot \frac{1}{g(x)} + f(x) \cdot \frac{-g'(x)}{g(x)^2} \]
by the product and reciprocal rules, which simplifies to

\[ \frac{f'(x)g(x) - f(x)g'(x)}{g(x)^2}, \]

as desired.

² Colloquially, a lemma is an intermediate result proved in order to prove a theorem (the main result).

2.3 Implicit Differentiation


Sometimes we can't or don't want to take the effort to rearrange some relation $P(x, y) = 0$ into something like $y = f(x)$, where $P$ is in terms of $x$ and $y$. This calls for implicit differentiation.

Example. Find the slope of the line tangent to the circle $x^2 + y^2 = 1$ at $\left( \frac{\sqrt{2}}{2}, \frac{\sqrt{2}}{2} \right)$.

Solution: Differentiate both sides to get

\[ 2x + \frac{d}{dx}(y^2) = 0, \]
and apply the Chain Rule³ to get
\[ 2x + 2yy' = 0. \]
Now rearrange to get
\[ y' = -\frac{x}{y}, \]
and plug in $(x, y) = \left( \frac{\sqrt{2}}{2}, \frac{\sqrt{2}}{2} \right)$ to get $y' = -1$.
This is quite tricky, so let's do another example.

Example. Find the slope of the line tangent to the curve $xy^2 + y^3 + xy + x + y = 8$ at $(2, 1)$.

Solution: Differentiate both sides to get

\[ \left( x \frac{d}{dx}(y^2) + y^2 \right) + \left( \frac{d}{dx}(y^3) \right) + (xy' + y) + (1) + (y') = 0. \]⁴
By the Chain Rule, this is equal to

\[ (2xyy' + y^2) + (3y^2y') + (xy' + y) + (1) + (y') = 0. \]

Now let's plug in $(2, 1)$ and solve for $y'$. This gives us

\[ (4y' + 1) + (3y') + (2y' + 1) + (1) + (y') = 10y' + 3 = 0, \]

or $y' = -\frac{3}{10}$, which is our answer.
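The slope at $(2, 1)$ can also be estimated without implicit differentiation, which makes a useful cross-check. The sketch below is one possible approach, not the handout's: for $x$ slightly left and right of 2 it solves the curve equation for $y$ with Newton's method and then takes a difference quotient. All function names are hypothetical.

```python
def curve(x, y):
    """P(x, y) = x*y^2 + y^3 + x*y + x + y - 8; the curve is P = 0."""
    return x * y**2 + y**3 + x * y + x + y - 8

def curve_dy(x, y):
    """Partial derivative of P with respect to y (used by Newton's method)."""
    return 2 * x * y + 3 * y**2 + x + 1

def solve_y(x, y_guess=1.0, steps=50):
    """Find y near y_guess with P(x, y) = 0, by Newton's method in y."""
    y = y_guess
    for _ in range(steps):
        y -= curve(x, y) / curve_dy(x, y)
    return y

h = 1e-5
slope = (solve_y(2 + h) - solve_y(2 - h)) / (2 * h)
print(slope)  # should be close to -3/10
```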
Many important/convenient results in calculus are proved using implicit differentiation; we present a couple of these results here.

³ We do this because we want to solve for $y' = \frac{d}{dx} y$, not $\frac{d}{dx}(y^2)$. The reason we can apply the Chain Rule to $y$ is because we can represent $y$ as $f(x)$ – if this doesn't make sense to you, replace every $y$ with an $f(x)$ until it does.
⁴ Parentheses are placed around the derivative of each term for clarity.

Inverse Function Rule. Given an invertible function $f$,

\[ (f^{-1}(x))' = \frac{1}{f'(f^{-1}(x))}, \]

as long as $f'(f^{-1}(x)) \neq 0$.

Proof. We differentiate $y = f^{-1}(x)$ by first taking $f$ of both sides. This gives us

\[ f(y) = x. \]

Now using the Chain Rule and actually differentiating gives us

\[ f'(y)y' = 1, \qquad y' = \frac{1}{f'(y)}. \]
Note that $y = f^{-1}(x)$, so substituting gives us

\[ (f^{-1}(x))' = \frac{1}{f'(f^{-1}(x))}, \]

as desired.
Make sure you understand that intuitively, the derivative of the inverse is just the reciprocal of the derivative at the corresponding point.
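To make "the reciprocal of the derivative at the corresponding point" concrete, here is a minimal sketch. The function $f(x) = x^3 + x$ is our own example, and its inverse is computed numerically by bisection because it has no convenient closed form.

```python
def f(x):
    return x**3 + x          # strictly increasing, hence invertible

def f_prime(x):
    return 3 * x**2 + 1

def f_inverse(y, lo=-10.0, hi=10.0, iterations=200):
    """Invert f on [lo, hi] by bisection (valid because f is increasing)."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

x = 10.0                     # f(2) = 10, so f_inverse(10) = 2
h = 1e-5
numeric = (f_inverse(x + h) - f_inverse(x - h)) / (2 * h)
rule = 1 / f_prime(f_inverse(x))
print(numeric, rule)         # both should be close to 1/13
```

The corresponding point here is $f^{-1}(10) = 2$, where $f'(2) = 13$, so the slope of the inverse at $x = 10$ is $\frac{1}{13}$.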
If you want basic practice with implicit differentiation, all you have to do is make up some sort of
reasonably small polynomial. Therefore, we’ll be presenting harder exercises that aren’t so mindless (in
particular, no polynomials).

Exercise (AoPS Calculus, 3.6.3). Find $\frac{dy}{dx}$ if $x^2 + y = \ln(y^2 - 1)$.

Exercise (AoPS Calculus, 3.6.4). Find the slope of the tangent line to the curve $x \sin(x + y) = y \cos(x - y)$ at the point $(0, \frac{\pi}{2})$.

3 Derivatives of Certain Functions


3.1 Trigonometric Functions
For differentiating trigonometric functions there are many approaches, one of which is purely using the definition of a derivative as a limit. The proofs for sin and cos rely on the facts that $\lim_{x \to 0} \frac{\sin x}{x} = 1$ and $\lim_{x \to 0} \frac{\cos x - 1}{x} = 0$, which we will prove first using geometric methods.

Sine Limit. When $x$ approaches 0, $\frac{\sin x}{x}$ approaches 1.

Proof. Consider a unit circle with an arc of $x$ (in radians). Note that $\sin x$ is the height of the altitude from one point on the circle to the other radius in the diagram, and also note that $x$ is the arclength. As $x$ approaches 0, the arc gets smaller and less curved, and the line becomes a better approximation of the arclength.

[Figure: unit circle with the altitude of length $\sin x$ and the arc of length $x$ labeled.]

Cosine Limit. When $x$ approaches 0, $\frac{\cos x - 1}{x}$ approaches 0.

Proof. Refer to the same setup as above. Note that the altitude splits the radius into two pieces of length $\cos x$ and $1 - \cos x$. As $x$ approaches 0, the path $\sin x$ takes approaches the path $x$ takes, so the difference in the paths (i.e. $1 - \cos x$) becomes negligible compared to $x$.
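The geometric arguments above are informal, so it can help to watch both ratios numerically. This is a small sketch with hypothetical helper names; it illustrates the limits and does not prove them.

```python
import math

def sine_ratio(x):
    return math.sin(x) / x

def cosine_ratio(x):
    return (math.cos(x) - 1) / x

for x in (1e-1, 1e-2, 1e-3, 1e-4):
    print(f"x={x:g}  sin(x)/x={sine_ratio(x):.10f}  (cos(x)-1)/x={cosine_ratio(x):.3e}")
```

The first column should approach 1 and the second should approach 0 as $x$ shrinks.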

Derivative of sin. The derivative of $f(x) = \sin x$ is $\cos x$.

Proof. We use the limit definition of the derivative and the angle addition formulas.

\[ \begin{aligned}
\frac{d}{dx} \sin(x) &= \lim_{\Delta x \to 0} \frac{\sin(x + \Delta x) - \sin(x)}{\Delta x} \\
&= \lim_{\Delta x \to 0} \frac{\sin(x)\cos(\Delta x) + \sin(\Delta x)\cos(x) - \sin(x)}{\Delta x} \\
&= \lim_{\Delta x \to 0} \frac{\sin(x)(\cos(\Delta x) - 1) + \cos(x)\sin(\Delta x)}{\Delta x} \\
&= \lim_{\Delta x \to 0} \frac{\sin(x)(\cos(\Delta x) - 1)}{\Delta x} + \lim_{\Delta x \to 0} \frac{\cos(x)\sin(\Delta x)}{\Delta x} \\
&= \sin(x) \lim_{\Delta x \to 0} \frac{\cos(\Delta x) - 1}{\Delta x} + \cos(x) \lim_{\Delta x \to 0} \frac{\sin(\Delta x)}{\Delta x}.
\end{aligned} \]

Because $\lim_{x \to 0} \frac{\sin x}{x} = 1$ and $\lim_{x \to 0} \frac{\cos x - 1}{x} = 0$,

\[ \frac{d}{dx} \sin(x) = \cos(x). \]

Derivative of cos. The derivative of $f(x) = \cos x$ is $-\sin x$.

Proof. We just piggyback on the sin proof. Note that $\cos x = \sin(x + \frac{\pi}{2})$, so by the Chain Rule the derivative of $\cos x$ is $\cos(x + \frac{\pi}{2}) = \sin(x + \pi) = -\sin x$.

A natural exercise comes from the preceding two theorems.

Exercise (Periodic Derivatives). If $f(x) = \sin x$, find $f'(x)$, $f''(x)$, $f'''(x)$, and $f''''(x)$. Do the same for $f(x) = \cos x$.

Derivative of tan. The derivative of $f(x) = \tan(x)$ is $\sec^2(x)$.

Proof. We just use the quotient rule. Note that

\[ \begin{aligned}
\frac{d}{dx} \tan(x) &= \frac{d}{dx} \left( \frac{\sin(x)}{\cos(x)} \right) \\
&= \frac{\cos(x)\cos(x) - \sin(x)(-\sin(x))}{\cos^2(x)} \\
&= \frac{\cos^2(x) + \sin^2(x)}{\cos^2(x)} \\
&= \frac{1}{\cos^2(x)} \\
&= \sec^2(x).
\end{aligned} \]

Exercise (Derivatives of Reciprocal Functions). Given how the trigonometric derivatives for sin, cos, and tan were derived, determine and prove the derivatives of csc, sec, and cot.

3.1.1 Optional: Taylor Series of sin and cos


This is officially optional (including for people learning Calculus for the first time), but I highly recommend you do it some time. It does not have to be on your first read-through of the handout, but you are going to need to know this eventually.
With the derivatives of $\sin x$ and $\cos x$ in mind, it should be easy to express $f(x) = \sin x$ and $f(x) = \cos x$ as a Taylor Series.

Maclaurin Series of sin and cos. The Maclaurin Series of $\sin \epsilon$ is
\[ \sin \epsilon = \frac{\epsilon}{1!} - \frac{\epsilon^3}{3!} + \frac{\epsilon^5}{5!} - \cdots, \]
and the Maclaurin Series of $\cos \epsilon$ is
\[ \cos \epsilon = \frac{\epsilon^0}{0!} - \frac{\epsilon^2}{2!} + \frac{\epsilon^4}{4!} - \cdots. \]

Proof. If $f(x) = \sin x$, note that

\[ \begin{aligned}
f(0 + \epsilon) &= \frac{f(0)}{0!} + \frac{f'(0)\epsilon}{1!} + \frac{f''(0)\epsilon^2}{2!} + \frac{f'''(0)\epsilon^3}{3!} + \cdots \\
&= \frac{\sin 0}{0!} + \frac{(\cos 0)\epsilon}{1!} + \frac{(-\sin 0)\epsilon^2}{2!} + \frac{(-\cos 0)\epsilon^3}{3!} + \cdots \\
&= \frac{\epsilon}{1!} - \frac{\epsilon^3}{3!} + \frac{\epsilon^5}{5!} - \cdots.
\end{aligned} \]
The proof for cosine follows nearly identically. (Do it on your own.)
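To see how quickly the series for $\sin \epsilon$ closes in on the true value, here is a short sketch that sums the first few nonzero terms and compares against `math.sin`. The function name `sin_maclaurin` and the sample point $\epsilon = 0.5$ are our own choices.

```python
import math

def sin_maclaurin(eps, terms):
    """Sum the first `terms` nonzero terms eps/1! - eps^3/3! + eps^5/5! - ..."""
    total = 0.0
    for k in range(terms):
        power = 2 * k + 1
        total += (-1) ** k * eps ** power / math.factorial(power)
    return total

eps = 0.5
for terms in (1, 2, 3, 4):
    approx = sin_maclaurin(eps, terms)
    print(f"{terms} term(s): {approx:.10f}  error={abs(approx - math.sin(eps)):.2e}")
```

Each extra term should shrink the error, which is the "keep adding higher order approximations" idea from Section 1.3 in action.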

Exercise. Find the Maclaurin Series of $x \cos x$.

3.2 Inverse of Trigonometric Functions


We will implicitly differentiate here.

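Arcsin Derivative. The derivative of $f(x) = \arcsin(x)$ is $\frac{1}{\sqrt{1 - x^2}}$.

Proof. Let $f(x) = y$ and note that this implies $\sin y = x$. Differentiating with respect to $x$ gives

\[ \begin{aligned}
\cos y \, \frac{dy}{dx} &= 1 \\
\frac{dy}{dx} &= \frac{1}{\cos y} \\
\frac{dy}{dx} &= \frac{1}{\sqrt{1 - x^2}},
\end{aligned} \]
where the last step uses $\cos y = \sqrt{1 - \sin^2 y} = \sqrt{1 - x^2}$ (the square root is nonnegative because $y = \arcsin x$ lies in $[-\frac{\pi}{2}, \frac{\pi}{2}]$).

Arccos Derivative. The derivative of $f(x) = \arccos(x)$ is $-\frac{1}{\sqrt{1 - x^2}}$.

Proof. Let $f(x) = y$ and note that this implies $\cos y = x$. Differentiating with respect to $x$ gives

\[ \begin{aligned}
-\sin y \, \frac{dy}{dx} &= 1 \\
\frac{dy}{dx} &= -\frac{1}{\sin y} \\
\frac{dy}{dx} &= -\frac{1}{\sqrt{1 - x^2}},
\end{aligned} \]
where the last step uses $\sin y = \sqrt{1 - \cos^2 y} = \sqrt{1 - x^2}$ (nonnegative because $y = \arccos x$ lies in $[0, \pi]$).

Arctan Derivative. The derivative of $f(x) = \arctan(x)$ is $\frac{1}{1 + x^2}$.

Proof. Let $f(x) = y$ and note that this implies $\tan y = x$. Differentiating with respect to $x$ gives

\[ \begin{aligned}
\sec^2 y \, \frac{dy}{dx} &= 1 \\
\frac{dy}{dx} &= \frac{1}{\sec^2 y} \\
\frac{dy}{dx} &= \frac{1}{1 + x^2},
\end{aligned} \]
where the last step uses $\sec^2 y = 1 + \tan^2 y = 1 + x^2$.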
The other three functions (the reciprocal functions) are left to the reader. You can either implicitly
differentiate from the start or just use the quotient rule on sin, cos, and tan – both will work.
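The three formulas proved above (arcsin, arccos, arctan) can be checked against a numerical derivative. This is only a sketch: the sample point $x = 0.3$ and the helper `central_difference` are our own choices, and the reciprocal functions left as an exercise are deliberately not included.

```python
import math

def central_difference(func, x, h=1e-6):
    """Symmetric difference quotient, a numerical stand-in for the derivative."""
    return (func(x + h) - func(x - h)) / (2 * h)

x = 0.3
checks = {
    "arcsin": (central_difference(math.asin, x), 1 / math.sqrt(1 - x**2)),
    "arccos": (central_difference(math.acos, x), -1 / math.sqrt(1 - x**2)),
    "arctan": (central_difference(math.atan, x), 1 / (1 + x**2)),
}
for name, (numeric, formula) in checks.items():
    print(f"{name}: numeric={numeric:.8f}  formula={formula:.8f}")
```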

Exercise (Derivative of Inverse of Reciprocal Trigonometric Functions). Find the derivative of arccsc 𝑥,
arcsec 𝑥, and arccot 𝑥.

3.3 Exponential and Logarithmic Functions
If this is your first time learning calculus, What is e? is mandatory reading.
Here’s a short summary of the facts about 𝑒 you absolutely have to know.

Facts about e.

1. $e$ is defined as the constant such that the derivative of $f(x) = e^x$ is $e^x$.
2. Furthermore, $k \cdot e^x$ is the only class of functions with this property, where $k$ is some arbitrary constant.
3. The Maclaurin Series of $e^x$ is constructed to allow this to happen.

Now we find and prove the derivative of $\ln x$. It follows straight from the Inverse Function Rule, so try to do it on your own for a little bit.

Derivative of ln. The derivative of $f(x) = \ln x$ is $\frac{1}{x}$.

Proof. We put this straight into the Inverse Function Rule. Note that

\[ (\ln x)' = \frac{1}{e^{\ln x}} = \frac{1}{x}, \]ᵃ
as desired.
ᵃ Remember that the derivative of $e^x$ is itself.

Exercise. Find the derivative of $f(x) = \log_a(g(x))$.

4 Problems
As a disclaimer, miscellaneous problems related to differentiation that are not on this handout will show up,
such as limit problems and maximization/minimization problems.

Minimum is [32 p]. Problems denoted with n are required. (They still count towards the point total.)

“I raised that boy.”


Kaguya-sama

[2 n] Prove that the derivative of $f(x) = e^{g(x)}$ is $e^{g(x)} g'(x)$.
[2p] (HMMT) Let $f(x) = x^3 + ax + b$, with $a \neq b$, and suppose that the tangent lines to the graph of $f$ at $x = a$ and $x = b$ are parallel. Find $f(1)$.
[2p] (HMMT Calculus 2010/1) Suppose that $p(x)$ is a polynomial and that $p(x) - p'(x) = x^2 + 2x + 1$. Compute $p(5)$.
[3p] (HMMT Calculus 2010/3) Let $p$ be a monic cubic polynomial such that $p(0) = 1$ and such that all the zeroes of $p'(x)$ are also zeros of $p(x)$. Find $p$. Note: monic means that the leading coefficient is 1.
[3 n] (Lemma of Hong Kong) Determine the minimum value $f(x) = e^x + \frac{1}{e^x}$ can take.
𝑥
[3p] Find the derivative of 4𝑥 +1 .
4

[3p] (MIT OCW) Show that $g(h) = \frac{f(a + h) - f(a)}{h}$ has a removable discontinuity at $h = 0$ given that $f'(a)$ exists.
[3p] (HMMT) Determine the real number $a$ having the property that $f(a) = a$ is a relative minimum of $f(x) = x^4 - x^3 - x^2 + ax + 1$.
[4 n] (HMMT) Compute $\lim_{x \to 0} \frac{e^{x \cos x} - 1 - x}{\sin(x^2)}$.
[4p] (MAST Diagnostic 2020/C10) Find the maximum value of $k$ such that $(x + 1)^4 \geq kx^3$ for all $x$.
[2p] (Extension of C10) Find the range of values $k$ such that $(x + 1)^4 \geq kx^3$ for all $x$.
[6 n] (Leibniz Rule) Given two $n$th differentiable functions $f$, $g$, prove that
\[ (fg)^{(n)}(x) = \sum_{k=0}^{n} \binom{n}{k} f^{(k)}(x) g^{(n-k)}(x). \]

[13p] (Hong Kong TST 2021/1/1) Find, with proof, all real triples $(a, b, c)$ satisfying

\[ (2^{2a} + 1)(2^{2b} + 2)(2^{2c} + 8) = 2^{a+b+c+5}. \]
