
Chapter 4 Differentiation

Chapter 4 discusses the concept of differentiation in calculus, focusing on the derivatives of univariate functions. It introduces the derivative as the slope of the tangent line to a curve at a given point and explains how to find it using limits and secant lines. The chapter also includes definitions related to derivatives and provides examples to illustrate the process of differentiation.

Chapter 4

Differentiation
Part I: Derivatives of Univariate Functions

4.1 Introduction

From before the time of the great Greek scientist Archimedes (287-212 B.C.), mathematicians
were concerned with two important problems:

Problem 1: Finding the unique tangent line (if one exists) to a given curve at a given point on
the curve. See Figure 4.1.

Figure 4.1: One of the tangent lines to the curve 𝑦 = 𝑓(𝑥).

Problem 2: Finding the area bounded by a given curve and the 𝑥-axis (see Figure 4.2).

Figure 4.2: Area bounded by the curve 𝑦 = 4𝑥 − 𝑥² and the 𝑥-axis

The solution to Problem 1 led to what is now called differential calculus, while the solution to
Problem 2 gave rise to integral calculus. In this chapter, we study differential calculus, while
integral calculus is the subject of Chapter 5. As we shall see in due course, differential and
integral calculus are really two closely related parts of the same subject.

Remark 4.1: The central idea of this chapter is that of the slope of a curve at a given point.
That is, we introduce the derivative of a function and the important rules for finding
derivatives of various types of functions. We also show how the derivative is used to analyze
the rate of change of a quantity.

4.2 Tangent Lines and Derivatives

Geometrically, the derivative is interpreted as the slope of the tangent line
to a curve at a given point.

4.2.1 Tangent Line

To obtain a suitable definition of a tangent line, we use the limit concept and the geometric
notion of a secant line. A secant line is a line that intersects a curve at two or more points. See
the graph of the function 𝑦 = 𝑓(𝑥).

Figure 4.3: Secant line PQ and the tangent line to the curve of 𝑦 = 𝑓(𝑥) at the point P.

Suppose we wish to define the tangent line at the point 𝑃. If 𝑄 is a different point on the curve,
the line 𝑃𝑄 is a secant line. If 𝑄 moves along the curve and approaches 𝑃 from the right (see
Figure 4.4), typical secant lines are 𝑃𝑄′, 𝑃𝑄′′, 𝑃𝑄′′′, etc. And, as 𝑄 approaches 𝑃 from the left,
typical secant lines are 𝑃𝑄1, 𝑃𝑄2, and so on. Show these on the curve in Figure 4.4. In both
cases, the secant lines approach the same limiting position. This common limiting position
of the secant lines is defined to be the tangent line to the curve at 𝑃. This definition of a
tangent line is reasonable and applies to curves in general, not just circles.

Figure 4.4: The tangent line is a limiting position of secant lines.

Definition 4.1: Slope of a tangent line

The slope of a curve at a point 𝑃 is the slope, if it exists, of the tangent line to the curve at 𝑃.
Now, since the tangent line at 𝑃 is a limiting position of secant lines 𝑃𝑄, we consider the slope
of the tangent line to be the limiting value of the slopes of the secant lines as 𝑄 approaches
𝑃.

4.3 Derivative

The central concept in the study of calculus is the concept of a derivative. Intuitively, the
derivative of a function 𝑓 at the value 𝑥0 is the slope of the line tangent to the graph of 𝑓 at
the point (𝑥0, 𝑓(𝑥0)). We use the notation 𝑓′(𝑥0) to denote the derivative of 𝑓 at 𝑥0. The
function 𝑓′ (the derivative of 𝑓) assigns to each real number 𝑥0 at which this slope exists a new number 𝑓′(𝑥0).
Thus,

𝑓 ′ (𝑥0 ) = slope of a tangent line to the graph of 𝑓 at the point ( 𝑥0 , 𝑓(𝑥0 )).

Finding the Derivative of a Function 𝒇

(The method of Newton and Leibniz. Newton's Mathematical Principles of Natural
Philosophy (Principia) was published in 1687.)

Consider the function 𝑦 = 𝑓(𝑥), a part of whose graph is given in Figure 4.5. To determine the
derivative function, we must find the slope of the line tangent to the curve at each point on
the curve at which there is a unique tangent line. Let (𝑥0, 𝑓(𝑥0)) be such a point, assuming
that 𝑓 is defined “near” 𝑥0.

Figure 4.5: The derivative of a function 𝑦 = 𝑓(𝑥) at the point (𝑥, 𝑓(𝑥)) equals the
slope/gradient of the tangent line to the graph at that point

To denote the change in a variable, say 𝑥, the symbol ∆𝑥 (read “delta 𝑥”) or the letter ℎ is
commonly used.

If ∆𝑥 (or ℎ) is a small number (positive or negative), then 𝑥0 + ∆𝑥 will be close to 𝑥0. In
moving from 𝑥0 to 𝑥0 + ∆𝑥, the values of 𝑓 will move from 𝑓(𝑥0) to 𝑓(𝑥0 + ∆𝑥).

Now, consider the secant line in Figure 4.5, joining the points (𝑥0, 𝑓(𝑥0)) and
(𝑥0 + ∆𝑥, 𝑓(𝑥0 + ∆𝑥)). What is the slope of the secant line?

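If we define

\[ \Delta y = f(x_0 + \Delta x) - f(x_0) \]

and use $m$ to denote the slope of such a secant line, then

\[ m = \frac{\text{change in } y}{\text{change in } x} = \frac{\Delta y}{\Delta x} = \frac{f(x_0 + \Delta x) - f(x_0)}{(x_0 + \Delta x) - x_0} \]

\[ \therefore\quad m = \frac{f(x_0 + \Delta x) - f(x_0)}{\Delta x} \qquad (1) \]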

Clearly, as ∆𝑥 gets smaller, the secant line gets closer and closer to the tangent line (the
limiting position of secant lines). That is, as ∆𝑥 approaches zero (∆𝑥 → 0), the slope of the
secant line approaches the slope of the tangent line. But the slope of the tangent line at the
point (𝑥0, 𝑓(𝑥0)) is the derivative, 𝑓′(𝑥0). Therefore, we have that

\[ f'(x_0) = \lim_{\Delta x \to 0} m = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} \]

Therefore,

\[ f'(x_0) = \lim_{\Delta x \to 0} \frac{f(x_0 + \Delta x) - f(x_0)}{\Delta x}, \]

provided the limit exists, as illustrated in Figure 4.5.

The process of finding a derivative is called differentiation. This result leads to the following
definition (Definition 4.2):

Definition 4.2: Derivative at a point

Let the function 𝑓 be defined on an open interval containing the point 𝑥0, and suppose that
$\lim_{\Delta x \to 0} \frac{f(x_0 + \Delta x) - f(x_0)}{\Delta x}$ exists and is finite. Then 𝑓 is said to be differentiable at the value 𝑥0, and
the derivative of 𝑓 at 𝑥0, denoted by 𝑓′(𝑥0), is given by

\[ f'(x_0) = \lim_{\Delta x \to 0} \frac{f(x_0 + \Delta x) - f(x_0)}{\Delta x} \qquad (2) \]

Definition 4.3: The Derivative Function

The derivative 𝑓′ of the function 𝑓 is the function defined as follows:

1.) Domain of $f'$ $= \left\{ x : \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x} \text{ exists and is finite} \right\}$.

2.) For every number 𝑥 in the domain of 𝑓′,

\[ f'(x) = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x} \qquad (3) \]

Equation (3) says that 𝑓′ is the function, defined at every 𝑥 for which the limit in (3) exists
and is finite, that assigns to every 𝑥 in its domain the derivative 𝑓′(𝑥). Thus, the derivative of
a function is a function, and the value of the derivative at a given value 𝑥0 is the limit obtained
in (2) (Definition 4.2).

Remark 4.2: From Definition 4.2, it follows that the derivative 𝑓′(𝑥) of a function 𝑓 can exist
at 𝑥 only if 𝑓(𝑥) is defined.

Definition 4.4: Equation of Tangent Line - Point-slope method

If the limit $\lim_{\Delta x \to 0} \frac{f(x_0 + \Delta x) - f(x_0)}{\Delta x}$ exists and is finite, we say that the graph of the function 𝑓 has
a tangent line at the point (𝑥0, 𝑓(𝑥0)). The tangent line is the line passing through the point
(𝑥0, 𝑓(𝑥0)) with slope 𝑓′(𝑥0), the derivative of 𝑓 at the point 𝑥0.

One equation of the tangent line is given by the point-slope method:

\[ y - f(x_0) = f'(x_0)(x - x_0) \qquad (4) \]

Equation (4) holds because, for any other point $(x, y)$ on the tangent line, the slope is
$f'(x_0) = \frac{y - f(x_0)}{x - x_0}$; making $y - f(x_0)$ the subject of the formula in this equation gives (4).

Example 4.1

a) Find the derivative of the function 𝑦 = 𝑓(𝑥) = 𝑥² from first principles.

b) What is the equation of the tangent line to the graph of 𝑓(𝑥) = 𝑥² at the point (3, 9)?

Solution

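a) If $f(x) = x^2$, then $f(x + \Delta x) = (x + \Delta x)^2$. Writing $y = f(x) = x^2$, we have $\frac{dy}{dx} = \frac{d}{dx} f(x)$, so

\[
\begin{aligned}
\frac{dy}{dx} &= \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x} \\
&= \lim_{\Delta x \to 0} \frac{(x + \Delta x)^2 - x^2}{\Delta x} \\
&= \lim_{\Delta x \to 0} \frac{x^2 + 2x\,\Delta x + (\Delta x)^2 - x^2}{\Delta x} \\
&= \lim_{\Delta x \to 0} \frac{2x\,\Delta x + (\Delta x)^2}{\Delta x} \\
&= \lim_{\Delta x \to 0} (2x + \Delta x)
\end{aligned}
\]

\[ \therefore\quad \frac{dy}{dx} = f'(x) = 2x \]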

b) From (a), we see that at every point of the form $(x, f(x)) = (x, x^2)$, the slope of the line
tangent to the curve is $f'(x) = \frac{dy}{dx} = 2x$. Therefore, the slope of the tangent line at the point
(3, 9) is $f'(3) = 2(3) = 6$.

Thus, using the point-slope method, we find the equation of the line tangent to the graph
of $y = x^2$ at the point $(3, 9) = (x_0, f(x_0))$ as

\[
\begin{aligned}
y - f(x_0) &= f'(x_0)(x - x_0) \\
\Rightarrow\; y - f(3) &= f'(3)(x - 3) \\
\Rightarrow\; y - 3^2 &= 6(x - 3) \quad \text{since } f'(3) = 2(3) = 6 \\
\Rightarrow\; y - 9 &= 6x - 18
\end{aligned}
\]

i.e. $y = 6x - 9$
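The limiting process in Example 4.1 can also be observed numerically. The following is a minimal Python sketch, not part of the original notes: the helper names `difference_quotient` and `tangent_line` are illustrative choices. It evaluates the secant slope of Equation (1) at 𝑥0 = 3 for shrinking ∆𝑥 and compares it with the derivative value 6.

```python
def difference_quotient(f, x0, dx):
    """Slope of the secant line through (x0, f(x0)) and (x0 + dx, f(x0 + dx))."""
    return (f(x0 + dx) - f(x0)) / dx


def f(x):
    return x ** 2


# Secant slopes at x0 = 3 for shrinking dx; algebraically each slope is 6 + dx.
steps = (1.0, 0.1, 0.01, 0.001)
slopes = [difference_quotient(f, 3.0, dx) for dx in steps]


def tangent_line(x):
    """Tangent line at (3, 9) found in Example 4.1(b): y = 6x - 9."""
    return 6 * x - 9


for dx, m in zip(steps, slopes):
    print(f"dx = {dx:<6} secant slope = {m:.6f}")
```

The printed slopes approach 6 as ∆𝑥 shrinks, which is the behaviour the limit in Definition 4.2 formalises.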

Example 4.2 (Finding the derivative of a function involving the radical sign √ )

Using the first principle of differentiation, find the derivative of 𝑦 = √𝑥 and calculate the
slope of the tangent line at the point (4, 2).

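Solution: If $y = f(x) = \sqrt{x}$, then $f(x + \Delta x) = \sqrt{x + \Delta x}$.

By definition,

\[ f'(x) = \frac{dy}{dx} = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x} = \lim_{\Delta x \to 0} \frac{\sqrt{x + \Delta x} - \sqrt{x}}{\Delta x} \]

To simplify the expression on the RHS, notice that the numerator, $\sqrt{x + \Delta x} - \sqrt{x}$, looks like
one factor, $(a - b)$, of the two factors in a difference of two squares, $(a - b)(a + b) = a^2 - b^2$:

\[ \left(\sqrt{x + \Delta x} - \sqrt{x}\right)\left(\sqrt{x + \Delta x} + \sqrt{x}\right) = \left(\sqrt{x + \Delta x}\right)^2 - \left(\sqrt{x}\right)^2, \]

on writing $a = \sqrt{x + \Delta x}$ and $b = \sqrt{x}$.

Thus, multiply both numerator and denominator by $\sqrt{x + \Delta x} + \sqrt{x}$ (the conjugate of the
numerator) to obtain

\[
\begin{aligned}
f'(x) &= \lim_{\Delta x \to 0} \frac{\left(\sqrt{x + \Delta x} - \sqrt{x}\right)\left(\sqrt{x + \Delta x} + \sqrt{x}\right)}{\Delta x\left(\sqrt{x + \Delta x} + \sqrt{x}\right)} \\
&= \lim_{\Delta x \to 0} \frac{\left(\sqrt{x + \Delta x}\right)^2 - \left(\sqrt{x}\right)^2}{\Delta x\left(\sqrt{x + \Delta x} + \sqrt{x}\right)} \\
&= \lim_{\Delta x \to 0} \frac{(x + \Delta x) - x}{\Delta x\left(\sqrt{x + \Delta x} + \sqrt{x}\right)} \\
&= \lim_{\Delta x \to 0} \frac{\Delta x}{\Delta x\left(\sqrt{x + \Delta x} + \sqrt{x}\right)} \\
&= \lim_{\Delta x \to 0} \frac{1}{\sqrt{x + \Delta x} + \sqrt{x}} \\
&= \frac{1}{\sqrt{x + 0} + \sqrt{x}} = \frac{1}{\sqrt{x} + \sqrt{x}} = \frac{1}{2\sqrt{x}} \quad \left(= \frac{1}{2} \cdot \frac{1}{\sqrt{x}} = \frac{1}{2} x^{-1/2}\right)
\end{aligned}
\]

\[ \therefore\quad f'(x) = \frac{1}{2} x^{-1/2} \]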

Finding the slope of the tangent line at the point (4, 2), i.e. at $x = 4$:

\[ f'(4) = \frac{1}{2}(4)^{-1/2} = \frac{1}{2} \cdot \frac{1}{\sqrt{4}} = \frac{1}{2} \cdot \frac{1}{2} = \frac{1}{4} \]

Therefore, $f'(4) = 1/4$ is the slope of the tangent line at (4, 2).

Exercise: Verify that the equation of the line tangent to the graph of 𝑓(𝑥) = √𝑥 at the point
(4, 2) is 𝑦 = 1⁄4 𝑥 + 1.
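As a rough numerical companion to Example 4.2 and the exercise, the sketch below (an illustration only; the function names `quotient`, `conjugate_form` and `tangent_at_4` are hypothetical) compares the raw difference quotient for √𝑥 with the form obtained after multiplying by the conjugate, and checks that the stated tangent line passes through (4, 2) with slope 1/4.

```python
import math


def quotient(x, dx):
    """Raw difference quotient (sqrt(x + dx) - sqrt(x)) / dx."""
    return (math.sqrt(x + dx) - math.sqrt(x)) / dx


def conjugate_form(x, dx):
    """Same quantity after multiplying by the conjugate: 1 / (sqrt(x + dx) + sqrt(x))."""
    return 1.0 / (math.sqrt(x + dx) + math.sqrt(x))


def tangent_at_4(x):
    """Tangent line from the exercise: y = x/4 + 1."""
    return 0.25 * x + 1


# The two forms agree for dx != 0; only the conjugate form can be evaluated at dx = 0.
print(quotient(4.0, 0.5), conjugate_form(4.0, 0.5))
print(conjugate_form(4.0, 0.0))   # the limit value f'(4)
print(tangent_at_4(4.0))          # y-value of the tangent line at x = 4
```

The conjugate form is the reason the limit can be evaluated by direct substitution: its denominator no longer vanishes when ∆𝑥 = 0.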

Definition 4.5: Differentiability on an Open Interval

The function 𝑓(𝑥) is differentiable on an open interval (𝑎, 𝑏) if 𝑓 ′ (𝑥) exists for every 𝑥 in
the interval (𝑎, 𝑏).

Example 4.3

From Example 4.2, the derivative of the function $y = \sqrt{x}$ is $f'(x) = \frac{1}{2\sqrt{x}}$. Clearly this
derivative exists for every 𝑥 in the open interval 0 < 𝑥 < 𝑏. Thus, $f(x) = \sqrt{x}$ is differentiable
on any interval of the form (0, 𝑏), where 𝑏 > 0.

Example 4.4

Show that the function 𝑓(𝑥) = 𝑥 2 is differentiable on (−∞, ∞).

Solution

Given $f(x) = x^2$, we found in Example 4.1 that

\[ f'(x) = \frac{df}{dx} = \frac{d(x^2)}{dx} = 2x \]

The function $f'(x) = 2x$ is defined for all values of 𝑥 on the real line. Hence, $f(x) = x^2$ is
differentiable on the interval $(-\infty, \infty)$.

4.3.1 General Rules of Differentiation

Reading: Handout titled 4.3.1 General Rules of Differentiation - EVLM

4.3.2 Derivatives of Trigonometric Functions

In this section, we discuss derivatives of sin 𝑥, cos 𝑥 and related functions. The material in
this section depends on knowledge of basic trigonometry (see Chapter 2 of this course). In
this chapter, we assume, unless otherwise specifically stated, that 𝒙 (or 𝜽) is measured in
radians and that Domain(sin 𝑥) = Domain(cos 𝑥) = ℝ = (−∞, ∞).

Theorem 4.3.1(a) (Derivatives of sin 𝑥 and cos 𝑥)

\[ \frac{d \sin x}{dx} = \cos x \qquad (1) \]

\[ \frac{d \cos x}{dx} = -\sin x \qquad (2) \]

Theorem 4.3.1(b)

If 𝑢 is a differentiable function of 𝑥, denoted 𝑢(𝑥), we may apply the chain rule together with
Equations (1) and (2) to conclude that

\[ \frac{d \sin u(x)}{dx} = \frac{d \sin u(x)}{du} \cdot \frac{du}{dx} \quad \text{(by the chain rule)} \]

\[ \Rightarrow\quad \frac{d \sin u(x)}{dx} = \cos u(x) \cdot \frac{du(x)}{dx} \qquad (3) \]

and

\[ \frac{d \cos u(x)}{dx} = \frac{d \cos u(x)}{du} \cdot \frac{du}{dx} \]

\[ \Rightarrow\quad \frac{d \cos u(x)}{dx} = -\sin u(x) \cdot \frac{du(x)}{dx} \qquad (4) \]

Notice that Theorem 4.3.1(a) is a special case of Theorem 4.3.1(b) when $u(x) = x$, in which case
$\frac{du(x)}{dx} = 1$. For example, by Equation (3), $\frac{d \sin x}{dx} = \cos x \cdot \frac{dx}{dx} = \cos x$.

Example 4.5

Evaluate $\frac{d \sin x^2}{dx}$.

Solution: Let $u = x^2$ so that $f(u) = \sin u$. Then, using the chain rule and the fact that
$\frac{d \sin x}{dx} = \cos x$, we have that

\[ \frac{d \sin x^2}{dx} = \frac{d}{dx} f(u) = \frac{df(u)}{du} \cdot \frac{du}{dx} = \frac{d \sin u}{du} \cdot \frac{d x^2}{dx} = \cos u \cdot (2x) \]

\[ \therefore\quad \frac{d \sin x^2}{dx} = \cos x^2 \cdot (2x) \]

i.e. $\frac{d \sin x^2}{dx} = 2x \cos x^2$ (as generalised in Equation (3) above).

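Example 4.6

Evaluate $\frac{d}{dx} \cos \sqrt{x}$.

Solution

Here, the outer function is cos(·) while the inner function is $\sqrt{x}$. So set the inner function to
$u = \sqrt{x}$, so that $\frac{du}{dx} = \frac{1}{2} x^{-1/2} = \frac{1}{2\sqrt{x}}$. Using Equation (4), we have that

\[
\begin{aligned}
\frac{d \cos \sqrt{x}}{dx} &= \frac{d \cos u}{dx} = \frac{d \cos u}{du} \cdot \frac{du}{dx} \quad \text{(by the chain rule)} \\
&= -\sin u \cdot \frac{du}{dx} \\
&= \left(-\sin \sqrt{x}\right)\left(\frac{1}{2\sqrt{x}}\right)
\end{aligned}
\]

\[ \therefore\quad \frac{d \cos \sqrt{x}}{dx} = \frac{-\sin \sqrt{x}}{2\sqrt{x}} \]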

Example 4.7 (Exercise!)

Find $\frac{d}{dx}(x + \sin x)^4$.

Solution: The function $f(x) = (x + \sin x)^4$ is of the form $f(x) = [u(x)]^n$, a power function,
with $u(x) = x + \sin x$ and $n = 4$. So, by the general power rule, the derivative of such a function is

\[ \frac{df(x)}{dx} = n[u(x)]^{n-1} \cdot \frac{du}{dx} \]

First, evaluate $\frac{du}{dx} = \frac{d(x + \sin x)}{dx}$ using the sum rule:

\[ \frac{du}{dx} = \frac{d}{dx}(x + \sin x) = \frac{dx}{dx} + \frac{d \sin x}{dx} \quad \text{(Derivative of a Sum or a Difference rule)} \]

\[ \Rightarrow\quad \frac{du}{dx} = 1 + \cos x \quad \text{since } \frac{d \sin x}{dx} = \cos x \]

Hence, on writing in terms of the original variable 𝑥, we get

\[ \frac{d(x + \sin x)^4}{dx} = 4(x + \sin x)^3 (1 + \cos x) \quad \text{(by the chain rule)} \]

\[ \therefore\quad \frac{d}{dx}(x + \sin x)^4 = 4(1 + \cos x)(x + \sin x)^3 \]
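The chain-rule results of Examples 4.5 to 4.7 can be sanity-checked numerically. The sketch below is an illustration rather than part of the notes: it assumes a central-difference approximation (the helper `central_diff`, the step size and the test point 𝑥 = 1.3 are arbitrary choices) and compares it with each closed-form derivative.

```python
import math


def central_diff(f, x, h=1e-6):
    """Central-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


# (function, derivative claimed in the corresponding example)
cases = [
    (lambda x: math.sin(x ** 2),
     lambda x: 2 * x * math.cos(x ** 2)),                          # Example 4.5
    (lambda x: math.cos(math.sqrt(x)),
     lambda x: -math.sin(math.sqrt(x)) / (2 * math.sqrt(x))),      # Example 4.6
    (lambda x: (x + math.sin(x)) ** 4,
     lambda x: 4 * (1 + math.cos(x)) * (x + math.sin(x)) ** 3),    # Example 4.7
]

x = 1.3  # arbitrary test point with x > 0 so that sqrt(x) is defined
errors = [abs(central_diff(f, x) - df(x)) for f, df in cases]
print(errors)
```

Small errors at a test point do not prove the formulas, but a large error would flag an algebra slip in the hand derivation.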

Derivatives of other Trigonometric Functions

Note: For a given right-angled triangle containing the angle 𝜃, the numbers sin 𝜃 and cos 𝜃
are only two of six (6) possible ratios of lengths of the sides. This section describes how each
of the remaining 4 ratios defines a trigonometric function, and how each function is related
to the sine and cosine functions.

[Aside: Figure 4.6 shows a right-angled triangle with angle 𝜃, opposite side of length 𝑦,
adjacent side of length 𝑥 and hypotenuse of length ℎ. From Figure 4.6, four (4) other
trigonometric functions of 𝜃 may be defined in terms of the basic functions sin 𝜃 and cos 𝜃.

Figure 4.6: A right-angled triangle with angle 𝜃

The tangent of 𝜃, denoted by tan 𝜃, is

\[ \tan\theta = \frac{\text{opposite}}{\text{adjacent}}, \quad \text{i.e. } \tan\theta = \frac{y}{x} \]

But $\sin\theta = \frac{\text{opposite}}{\text{hypotenuse}} = \frac{y}{h}$ and $\cos\theta = \frac{\text{adjacent}}{\text{hypotenuse}} = \frac{x}{h}$, so

\[ \tan\theta = \frac{y/h}{x/h} = \frac{\sin\theta}{\cos\theta} \]
]

i.e.

\[ \tan\theta = \frac{\sin\theta}{\cos\theta} \qquad (5) \]

The cotangent of 𝜃, denoted by cot 𝜃, is defined as the reciprocal of tan 𝜃:

\[ \cot\theta = \frac{1}{\tan\theta} = \frac{\cos\theta}{\sin\theta} \qquad (6) \]

The secant of 𝜃, written sec 𝜃, is defined as the reciprocal of cos 𝜃:

\[ \sec\theta = \frac{1}{\cos\theta} \qquad (7) \]

The cosecant of 𝜃, denoted by csc 𝜃, is the reciprocal of sin 𝜃:

\[ \csc\theta = \frac{1}{\sin\theta} \qquad (8) \]

Reciprocal Identities: Relationships (6), (7) and (8) are called Reciprocal identities.

A summary of derivatives of all the trigonometric functions is given in Theorem 4.3.2

Theorem 4.3.2 (Derivatives of Trigonometric Functions)

\[ \frac{d \sin x}{dx} = \cos x \]

\[ \frac{d \cos x}{dx} = -\sin x \]

\[ \frac{d \tan x}{dx} = \sec^2 x \]

\[ \frac{d \cot x}{dx} = -\csc^2 x \]

\[ \frac{d \sec x}{dx} = \sec x \tan x \]

\[ \frac{d \csc x}{dx} = -\csc x \cot x \]
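The six derivatives in Theorem 4.3.2, with tan 𝑥 and cot 𝑥 differentiating to sec² 𝑥 and −csc² 𝑥, can be spot-checked numerically. This is a hedged sketch only: `central_diff`, the helper definitions of sec, csc and cot via the reciprocal identities, and the test point 𝑥 = 0.7 radians are assumptions made for illustration.

```python
import math


def central_diff(f, x, h=1e-6):
    """Central-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


# Reciprocal identities (6)-(8)
def sec(x):
    return 1.0 / math.cos(x)


def csc(x):
    return 1.0 / math.sin(x)


def cot(x):
    return math.cos(x) / math.sin(x)


# name -> (function, derivative stated in Theorem 4.3.2)
table = {
    "sin": (math.sin, math.cos),
    "cos": (math.cos, lambda x: -math.sin(x)),
    "tan": (math.tan, lambda x: sec(x) ** 2),
    "cot": (cot, lambda x: -csc(x) ** 2),
    "sec": (sec, lambda x: sec(x) * math.tan(x)),
    "csc": (csc, lambda x: -csc(x) * cot(x)),
}

x = 0.7  # radians; all six functions are defined here
max_error = max(abs(central_diff(f, x) - df(x)) for f, df in table.values())
print(max_error)
```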

4.3.3 Derivative of Logarithmic Functions

Recall: Logarithmic function

If $a^y = x$, then 𝑦 is called the logarithm of 𝑥 to the base 𝑎, written as $y = \log_a x$.

Note: If 𝑎 and 𝑥 are positive and 𝑎 ≠ 1, then there is only one real value for 𝑦. See Theorem 4.3.3.

Theorem 4.3.3

If 𝑎 > 1, then the logarithmic function $f(x) = \log_a x$ is a one-to-one, continuous, increasing function
of 𝑥 with domain (0, ∞) and range ℝ = (−∞, ∞).

Properties (See Chapter 1 (Handout) of Spiegel, M.R.)

If 𝑥 and 𝑦 are both > 0, then for 𝑎 > 0 (𝑎 ≠ 1):

i) $\log_a (xy) = \log_a x + \log_a y$

ii) $\log_a \frac{x}{y} = \log_a x - \log_a y$

iii) $\log_a x^y = y \log_a x$

iv) $\log_a x = \frac{1}{\log_x a}$ for $x > 0$ and $x \neq 1$ (i.e. the log of 𝑥 to the base 𝑎 equals the reciprocal of the log of 𝑎 to
the base 𝑥)

v)

4.3.3.1 Natural Logarithms (𝒚 = 𝐥𝐧 𝒙)

In practice, two bases 𝒂 are used: base 𝑎 = 10 and base 𝑎 = 𝑒 (= 2.71828 … ). The logarithm of 𝒙 to
the base 𝒆 is called the natural logarithm of 𝒙, denoted by ln 𝑥. Thus, log 𝑒 𝑥 = ln 𝑥.

Properties

i) If $\ln x = y$, then it means that $\log_e x = y$, and therefore $x = e^y$.

ii) $\ln e^x = \log_e e^x = x$, since $\log_a x = y \Rightarrow a^y = x$. Therefore $\ln e^x = x$.

iii) $\ln e = \log_e e = 1$, as $e^1 = e$ and $\log_a x = y \Rightarrow a^y = x$. Therefore $\ln e = 1$.

iv) $e^{\ln x} = x^{\ln e} = x$, since $\ln e = 1$. Therefore $e^{\ln x} = x$.

Derivatives of Logarithmic Functions

1. $\frac{d}{dx} \log_a x = \frac{1}{x} \log_a e$ \quad (1)

2. $\frac{d}{dx} \log_a x = \frac{1}{x \ln a}$ \quad (2)

Consider Equation (1) above. By property (iv) of logarithmic functions above, $\log_a x = \frac{1}{\log_x a}$, so

\[ \log_a e = \frac{1}{\log_e a} = \frac{1}{\ln a}, \]

i.e. $\log_a e = \frac{1}{\ln a}$, so that Equation (1) becomes Equation (2).

3. $\frac{d \ln x}{dx} = \frac{1}{x}$

Proof

If $a = e$ (i.e. the base is $e$), then Equation (2) becomes

\[ \frac{d}{dx} \log_e x = \frac{d}{dx} \ln x = \frac{1}{x \ln e} = \frac{1}{x} \qquad (3) \]

4.3.3.2 Differentiating Composite Logarithmic Functions

Most logarithmic functions that we shall encounter in applications are actually composite
functions of the form 𝑦 = ln 𝑢, where 𝑢 is a differentiable function of 𝑥, 𝑢(𝑥). We can differentiate such
functions using Equation (3) and the chain rule, as illustrated below.

Let $y = \ln|u|$, where 𝑢 is a differentiable function of 𝑥. Then, by the chain rule,

\[ \frac{d}{dx} \ln|u(x)| = \frac{dy}{du} \cdot \frac{du}{dx} = \frac{d \ln|u|}{du} \cdot \frac{du}{dx} \]

\[ \Rightarrow\quad \frac{d}{dx} \ln|u(x)| = \frac{1}{u \ln e} \cdot \frac{du}{dx} \quad \text{for } u \neq 0. \]

But we know that $\ln e = 1$, so that

\[ \therefore\quad \frac{d}{dx} \ln|u(x)| = \frac{1}{u} \cdot \frac{du}{dx} \quad \text{for } u \neq 0 \qquad (4) \]

In general,

\[ \frac{d}{dx}[\ln f(x)] = \frac{f'(x)}{f(x)} \qquad (5) \]

Example 4.8

Differentiate the following functions.

i) $f(x) = 5 \ln x$
ii) $y = \ln(x^3 + 1)$
iii) $f(x) = \frac{\ln x}{x^2}$
iv) $f(x) = \ln(\sin x)$

Solutions

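i) $f(x) = 5 \ln x$

Here $u = x$, so

\[ f'(x) = \frac{df(x)}{dx} = \frac{d(5 \ln x)}{dx} = 5\,\frac{d \ln x}{dx} = \frac{5}{x} \quad \text{for } x > 0, \text{ since } \frac{d \ln x}{dx} = \frac{1}{x}. \]

\[ \therefore\quad \frac{d(5 \ln x)}{dx} = \frac{5}{x} \]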

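ii) $y = \ln(x^3 + 1)$

This function is of the form $y = \ln u$ with $u = x^3 + 1$. So, by the chain rule,

\[ \frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx}, \quad \text{i.e. } \frac{dy}{dx} = \frac{d \ln u}{du} \cdot \frac{d(x^3 + 1)}{dx} \]

\[ \Rightarrow\quad \frac{dy}{dx} = \frac{1}{u}(3x^2) = \frac{1}{x^3 + 1} \cdot (3x^2) \]

\[ \therefore\quad \frac{d \ln(x^3 + 1)}{dx} = \frac{3x^2}{x^3 + 1} \]

(verifying that $\frac{d}{dx}[\ln f(x)] = \frac{f'(x)}{f(x)}$).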

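iii) $f(x) = \frac{\ln x}{x^2}$

By the quotient rule, we have that

\[
\begin{aligned}
\frac{df(x)}{dx} &= \frac{d}{dx}\left(\frac{\ln x}{x^2}\right) \\
&= \frac{x^2 \cdot \frac{d}{dx} \ln x - \ln x \cdot \frac{d}{dx} x^2}{(x^2)^2} \\
&= \frac{x^2 \left(\frac{1}{x}\right) - \ln x\,(2x)}{x^4} \quad \text{since } \frac{d \ln x}{dx} = \frac{1}{x} \\
&= \frac{x - 2x \ln x}{x^4} \\
&= \frac{1 - 2 \ln x}{x^3}
\end{aligned}
\]

\[ \therefore\quad \frac{d}{dx}\left(\frac{\ln x}{x^2}\right) = \frac{1 - 2 \ln x}{x^3} \quad \text{for } x > 0. \]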

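iv) $f(x) = \ln(\sin x)$

Let $u = \sin x$ so that $f(x) = \ln u$.

Using the chain rule

\[ \frac{df(x)}{dx} = \frac{df(x)}{du} \cdot \frac{du}{dx} \]

we have that

\[
\begin{aligned}
\frac{d \ln(\sin x)}{dx} &= \frac{d \ln u}{du} \cdot \frac{d \sin x}{dx} \\
&= \frac{1}{u} \cdot \cos x \quad \text{as } \frac{d \ln u}{du} = \frac{1}{u} \text{ and } \frac{d \sin x}{dx} = \cos x \\
&= \frac{1}{\sin x} \cdot \cos x \quad \text{on reverting to the original variable } x \\
&= \frac{\cos x}{\sin x} = \cot x \quad \text{by definition}
\end{aligned}
\]

(Remember: $\tan x = \frac{\sin x}{\cos x} \Rightarrow \cot x = \frac{1}{\tan x} = \frac{\cos x}{\sin x}$.)

\[ \therefore\quad \frac{d \ln(\sin x)}{dx} = \frac{\cos x}{\sin x} = \cot x. \]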
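The four answers of Example 4.8 can be cross-checked against a finite-difference estimate. The sketch below is illustrative only: `central_diff` and the test point 𝑥 = 1.2 (chosen so that ln 𝑥 and ln(sin 𝑥) are both defined) are assumptions, not part of the notes.

```python
import math


def central_diff(f, x, h=1e-6):
    """Central-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


# (function, derivative obtained in Example 4.8)
cases = {
    "i":   (lambda x: 5 * math.log(x),        lambda x: 5 / x),
    "ii":  (lambda x: math.log(x ** 3 + 1),   lambda x: 3 * x ** 2 / (x ** 3 + 1)),
    "iii": (lambda x: math.log(x) / x ** 2,   lambda x: (1 - 2 * math.log(x)) / x ** 3),
    "iv":  (lambda x: math.log(math.sin(x)),  lambda x: math.cos(x) / math.sin(x)),
}

x = 1.2  # x > 0 and sin(x) > 0, so every logarithm is defined
errors = {k: abs(central_diff(f, x) - df(x)) for k, (f, df) in cases.items()}
print(errors)
```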

4.3.3.3 Derivatives of Composite Log Functions to a Base Different from 𝒆.

To differentiate a logarithmic function to a base different from 𝑒, first convert the logarithm to the
natural logarithm via the Change–of–base formula (STA 101 Lecture Notes!), and then differentiate
the resulting expression.

[ASIDE: Change-of-base formula: This allows us to rewrite a logarithm in terms of logs written with
another base. This is especially helpful when using a calculator to evaluate a log to any base other
than 10 or 𝑒. Assume that 𝑥, 𝑎, and 𝑏 are all positive and that 𝑎 ≠ 1, 𝑏 ≠ 1. Then, by the change-of-base
formula, $\log_a x = \frac{\log_b x}{\log_b a}$. Notice that the base has been changed from 𝑎 to 𝑏.

Examples: (i) $\log_{16} 32 = \frac{\log_2 32}{\log_2 16} = \frac{5}{4} = 1.25$, (ii) $\log_2 3 = \frac{\log_{10} 3}{\log_{10} 2} \approx \frac{0.4771}{0.3010} \approx 1.585$, (iii) $\log_8 x = \frac{\ln x}{\ln 8}$,
i.e. $\log_8 x = \frac{\log_e x}{\log_e 8}$.]

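Consider the function $f(x) = \log_a u$, where 𝑢 is a differentiable function of 𝑥. By the change-of-base
formula,

\[ f(x) = \log_a u = \frac{\log_e u}{\log_e a} = \frac{\ln u}{\ln a} \quad \text{for } a, u > 0 \text{ and } a \neq 1 \]

Differentiating this with respect to 𝑥 yields

\[
\begin{aligned}
\frac{d}{dx}(\log_a u) &= \frac{d}{dx}\left(\frac{\ln u}{\ln a}\right) \\
&= \frac{1}{\ln a}\left(\frac{d}{dx} \ln u\right) \quad \text{since } a \text{ is a constant (not a function of } x\text{), so that } \ln a \text{ is itself a constant} \\
&= \frac{1}{\ln a}\left(\frac{d \ln u}{du} \cdot \frac{du}{dx}\right) \\
&= \frac{1}{\ln a}\left(\frac{1}{u} \cdot \frac{du}{dx}\right)
\end{aligned}
\]

\[ \therefore\quad \frac{d}{dx} \log_a u(x) = \frac{1}{u \ln a} \cdot \frac{du}{dx} \quad \text{for } a, u > 0 \text{ and } a \neq 1 \]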

Example 4.9

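Differentiate $y = \log_2 x$.

Solution: This is the case where the base 𝑎 ≠ 𝑒, so by the change-of-base formula we have that

\[ y = \log_a u = \frac{\ln u}{\ln a} \quad \text{and} \quad \frac{dy}{dx} = \frac{1}{u \ln a} \cdot \frac{du}{dx}, \quad u > 0 \]

But here $u = x$ and $a = 2$, so

\[
\begin{aligned}
\frac{d \log_2 x}{dx} &= \frac{d}{dx}\left(\frac{\ln x}{\ln 2}\right) \\
&= \frac{1}{\ln 2}\left(\frac{d \ln x}{dx}\right) \\
&= \frac{1}{\ln 2}\left(\frac{1}{x}\right)
\end{aligned}
\]

\[ \therefore\quad \frac{d}{dx} \log_2 x = \frac{1}{x \ln 2} \]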
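A short numerical illustration of the change-of-base formula and of the result of Example 4.9 follows. It is a sketch under stated assumptions: `log_base` and `central_diff` are hypothetical helper names, and the test point 𝑥 = 3 is arbitrary.

```python
import math


def log_base(a, x):
    """Change-of-base formula: log_a(x) = ln(x) / ln(a), for a, x > 0 and a != 1."""
    return math.log(x) / math.log(a)


def central_diff(f, x, h=1e-6):
    """Central-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


x = 3.0
numeric = central_diff(lambda t: log_base(2, t), x)
formula = 1.0 / (x * math.log(2))   # d/dx log_2 x = 1 / (x ln 2)

print(log_base(2, 3))               # roughly 1.585, as in the change-of-base aside
print(numeric, formula)
```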

4.3.4 Derivatives of Exponential Functions


An exponential function is a function of the form $f(x) = a^x$, where 𝑎 is a positive constant.

Note: i) $a^{-x} = \frac{1}{a^x}$.

ii) If 𝑥 is a rational number, i.e. $x = \frac{p}{q}$, $q > 0$, then

\[ a^x = a^{p/q} = \sqrt[q]{a^p} \quad \text{(the } q\text{th root of } a^p\text{)} \]

Theorem 4.4.3

If 𝑎 > 0 and 𝑎 ≠ 1, then $f(x) = a^x$ is a continuous function with domain $D(f) = \mathbb{R} = (-\infty, \infty)$
and range $R(f) = \mathbb{R}^+ = (0, \infty)$.

Properties

If $a, b > 0$ are constants and $x, y \in \mathbb{R}$, then

i) $a^{x+y} = a^x a^y$
ii) $a^{x-y} = \frac{a^x}{a^y}$ $(= a^x a^{-y})$
iii) $(a^x)^y = a^{xy}$
iv) $(ab)^{xy} = a^{xy} b^{xy}$

Derivatives

Well-known result: The derivative of $y = a^x$ is given by

\[ \frac{dy}{dx} = \frac{d(a^x)}{dx} = a^x \ln a \qquad (1) \]

This result can be generalized to obtain the derivative of an exponential function of the form
$y = a^{f(x)}$, where 𝑓(𝑥) is a differentiable function of 𝑥:

\[ \frac{d}{dx}\left[a^{f(x)}\right] = a^{f(x)} \cdot f'(x) \ln a \qquad (2) \]

Special Case (Derivative of 𝒆𝒙 )

The identity linking the natural logarithm and exponential functions (Equation (3)),

\[ \ln(e^x) = x \qquad (3) \]

provides us with the derivative of the function $y = e^x$.

[Proof of the result in Equation (3): $\ln(e^x) = \log_e e^x = x$.]

Differentiating both sides of Equation (3) with respect to 𝑥, and using the chain rule
$\frac{d \ln u}{dx} = \frac{1}{u} \cdot \frac{du}{dx}$ with $u = e^x$ on the LHS (since, if $u = e^x$, the LHS is $\ln u$), yields

\[ \frac{1}{e^x} \cdot \frac{d e^x}{dx} = \frac{dx}{dx}, \quad \text{i.e. } \frac{1}{e^x} \cdot \frac{d e^x}{dx} = 1 \]

\[ \therefore\quad \frac{d e^x}{dx} = e^x \qquad (4) \]

Notice, therefore, that

i) the function 𝑓(𝑥) = 𝑒 𝑥 is its own derivative.

ii) 𝑓(𝑥) = 𝑒 𝑥 and its multiples are the only functions that have this property, making them very useful
in applications.

Differentiating Composite Functions of the Form 𝒇(𝒙) = 𝒆𝒖

Combining Equation (4) with the chain rule, we can derive the following rule for differentiating
composite functions of the form $y = e^u$, where 𝑢 is a differentiable function of 𝑥.

If $f(x) = e^{u(x)}$, then

\[ \frac{df(x)}{dx} = e^{u(x)} \cdot \frac{du(x)}{dx} \ln e \]

where $\ln e = 1$, so that

\[ \frac{d e^{u(x)}}{dx} = e^{u(x)} \cdot \frac{du(x)}{dx} \]

or just

\[ \frac{d e^u}{dx} = e^u \cdot \frac{du}{dx} \qquad (5) \]

Example 4.10: Find $\frac{d}{dx} e^{x^3 + 3x}$.

Solution: This function is of the form $f(x) = e^{u(x)}$, where 𝑢 is a differentiable function of 𝑥, 𝑢(𝑥). Thus,

\[ \frac{d e^{u(x)}}{dx} = e^{u(x)} \cdot \frac{du(x)}{dx} \]

Now, let $u(x) = x^3 + 3x$, so that $\frac{du(x)}{dx} = 3x^2 + 3$. Then

\[ \frac{d}{dx} e^{x^3 + 3x} = e^{x^3 + 3x} (3x^2 + 3) \]

\[ \therefore\quad \frac{d}{dx} e^{x^3 + 3x} = (3x^2 + 3)\, e^{x^3 + 3x} \]

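Example 4.11: Find the derivative of the function $f(x) = e^{\sqrt{x}}$.

Solution

Let $u(x) = \sqrt{x} = x^{1/2}$. Then

\[ \frac{du(x)}{dx} = \frac{d x^{1/2}}{dx} = \frac{1}{2} x^{-1/2} = \frac{1}{2\sqrt{x}} \]

\[ \Rightarrow\quad \frac{df}{dx} = \frac{d e^{\sqrt{x}}}{dx} = e^{u(x)} \cdot \frac{du(x)}{dx} = e^{\sqrt{x}} \cdot \frac{1}{2\sqrt{x}} \]

\[ \therefore\quad \frac{d e^{\sqrt{x}}}{dx} = \frac{e^{\sqrt{x}}}{2\sqrt{x}} \]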
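The exponential rules of this section can be checked numerically in the same spirit as the earlier examples. The following sketch is illustrative: `central_diff`, the base 𝑎 = 2 and the test point 𝑥 = 0.8 are arbitrary assumptions.

```python
import math


def central_diff(f, x, h=1e-6):
    """Central-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


a = 2.0  # illustrative base for the rule d(a^x)/dx = a^x ln a

# (function, derivative claimed in Section 4.3.4)
cases = {
    "a^x":          (lambda x: a ** x,
                     lambda x: a ** x * math.log(a)),
    "Example 4.10": (lambda x: math.exp(x ** 3 + 3 * x),
                     lambda x: (3 * x ** 2 + 3) * math.exp(x ** 3 + 3 * x)),
    "Example 4.11": (lambda x: math.exp(math.sqrt(x)),
                     lambda x: math.exp(math.sqrt(x)) / (2 * math.sqrt(x))),
}

x = 0.8  # x > 0 so that sqrt(x) is defined
errors = {k: abs(central_diff(f, x) - df(x)) for k, (f, df) in cases.items()}
print(errors)
```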

4.4 Higher Order Derivatives

The derivative of a function 𝑦 = 𝑓(𝑥) is itself a function, 𝑦′ = 𝑓′(𝑥). Therefore, we can take
the derivative of 𝑓′(𝑥), which is referred to as the second derivative of 𝑓(𝑥) and written
$f''(x)$ or $f^{(2)}(x)$. This differentiation process can be continued to find the third, fourth, and
successive derivatives of 𝑓(𝑥), which are called higher order derivatives of 𝑓(𝑥). As the
“prime” notation for derivatives becomes messier as successive higher order derivatives are
taken, it is preferable to use the numerical notation $f^{(n)}(x)$ or $y^{(n)}(x)$ to denote the nth
derivative of 𝑓(𝑥).

Example 4.12: Find the first, second and third derivatives of f  x   5x 4  3x3  7 x 2  9 x  2 .

Solution

f '  x   20 x3  9 x 2  14 x  9
f ''  x   f (2)  60 x 2  18 x  14
f '''  x   f (3)  120 x 2  18

Example 4.13: Find the first, second and third derivatives of $f(x) = \sin^2 x$.

Solution

Note that $f(x) = \sin^2 x = (\sin x)^2$. Letting $u = \sin x$, so that $y = u^2$ and $\frac{du}{dx} = \cos x$, and using the
result $\frac{df(x)}{dx} = \frac{df(x)}{du} \cdot \frac{du}{dx}$ leads to

\[
\begin{aligned}
f^{(1)}(x) &= \frac{d(\sin x)^2}{dx} = \frac{d u^2}{du} \cdot \frac{du}{dx} \\
&= 2u \cdot \cos x \\
&= 2(\sin x) \cdot \cos x
\end{aligned}
\]

\[ \therefore\quad f^{(1)}(x) = 2 \cos x \sin x \]

For the second derivative,

\[ f^{(2)}(x) = \frac{d}{dx}\left(2 \cos x \sin x\right) \]

Noticing that this is of the form $\frac{d}{dx}(uv)$, with $u = 2 \cos x$ and $v = \sin x$, we
apply the product rule:

\[
\begin{aligned}
f^{(2)}(x) &= 2 \cos x \cdot \frac{d}{dx} \sin x + \sin x \cdot \frac{d}{dx}(2 \cos x) \\
&= 2 \cos x (\cos x) + \sin x (-2 \sin x) \\
&= 2 \cos^2 x - 2 \sin^2 x \\
&= 2\left(\cos^2 x - \sin^2 x\right)
\end{aligned}
\]

Note: $\cos^2 x = (\cos x)^2$ and $\sin^2 x = (\sin x)^2$.

Similarly,

\[
\begin{aligned}
f^{(3)}(x) &= \frac{d}{dx}\left\{2 \cos^2 x - 2 \sin^2 x\right\} \\
&= \frac{d}{dx}\, 2 \cos^2 x - \frac{d}{dx}\, 2 \sin^2 x \\
&= 4 \cos x (-\sin x) - 4 \sin x (\cos x) \\
&= -4 \cos x \sin x - 4 \cos x \sin x
\end{aligned}
\]

\[ \therefore\quad f^{(3)}(x) = -8 \cos x \sin x \]

Exercise: Evaluate $f^{(3)}(x)$ for $f(x) = \sin^2 x$ if the angle $x = 45^\circ$.

Example 4.14: Verify that $f^{(3)}(4) = \frac{3}{256}$ if $f(x) = \sqrt{x}$.
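Higher order derivatives can also be estimated numerically. The sketch below is an illustration, not the notes' method: it assumes a standard central-difference stencil for the third derivative (the name `third_diff` and the step size are arbitrary choices) and uses it to check Example 4.14 and the third derivative of sin² 𝑥 at a sample point.

```python
import math


def third_diff(f, x, h=1e-2):
    """Central-difference estimate of f'''(x) with error of order h^2."""
    return (f(x + 2 * h) - 2 * f(x + h) + 2 * f(x - h) - f(x - 2 * h)) / (2 * h ** 3)


# Example 4.14: f(x) = sqrt(x), with the claimed value f'''(4) = 3/256.
sqrt_estimate = third_diff(math.sqrt, 4.0)

# Example 4.13: f(x) = sin^2 x, with f'''(x) = -8 cos x sin x, checked at x = 0.5 rad.
x = 0.5
sin2_estimate = third_diff(lambda t: math.sin(t) ** 2, x)
sin2_formula = -8 * math.cos(x) * math.sin(x)

print(sqrt_estimate, 3 / 256)
print(sin2_estimate, sin2_formula)
```

The step size is kept fairly large on purpose: dividing by h³ amplifies rounding error, so a very small h would make the estimate worse rather than better.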

4.5 Applications of the Derivative


This section discusses the use of the derivative in real-life applications, including
optimization (max-min) problems, graphical analysis (curve sketching), rates of change and
marginal analysis in business and economics. [An application of the derivative to finding
zeros of functions (Newton’s Method) will not be considered in this section. See the document
dated 21-10-2014 titled 10.6 Newton’s Method.]

4.5.1: Curve Sketching

Sketching a curve from knowledge of the signs of the first and second derivatives is a useful
way of finding the approximate shape of the graph of a function. When curve sketching,
making a sign chart of the derivatives is an easy way to identify possible points of inflection
and to find the relative minima and maxima, which are both key in sketching the path of a
function. This important technique will be illustrated in Section 4.5.2.

4.5.2 Optimisation (Maxima and Minima for Univariate Functions)

Reading: Grossman, Section 4.3, p. 207; STA 102 Lecture Notes

One of the most important applications of differential calculus is to find extreme function
values. The calculus methods for finding the maximum and minimum values of a function are
the basic tools of optimization theory, a very active branch of mathematical research applied
to nearly all fields of practical endeavor.

Theorem 4.5.1: Suppose a function 𝑓 is differentiable on the open interval (𝑎, 𝑏), i.e. for 𝑎 < 𝑥 < 𝑏, and continuous
on the closed interval [𝑎, 𝑏], i.e. for 𝑎 ≤ 𝑥 ≤ 𝑏. Then, if

i) $\frac{df}{dx} > 0$ for every 𝑥 in the open interval (𝑎, 𝑏), 𝑓 is increasing on the closed interval [𝑎, 𝑏].

ii) $\frac{df}{dx} < 0$ for every 𝑥 in the open interval (𝑎, 𝑏), 𝑓 is decreasing on the closed interval [𝑎, 𝑏].

Example 4.15

Let $y = x^3 + 3x^2 - 9x - 10$.

For what values of 𝑥 is this function increasing or decreasing? Sketch the curve.

Solution

First, find the derivative of the function:

\[ \frac{dy}{dx} = 3x^2 + 6x - 9 \]

and set this to zero to find the critical points:

\[
\begin{aligned}
3x^2 + 6x - 9 &= 0 \\
3(x^2 + 2x - 3) &= 0 \\
x^2 + 2x - 3 &= 0 \\
x^2 + 3x - x - 3 &= 0 \\
x(x + 3) - (x + 3) &= 0 \\
(x + 3)(x - 1) &= 0
\end{aligned}
\]

So, either $x = -3$ or $x = 1$. These are the critical points, and they lead to 5 cases (three intervals and the two critical points themselves):

\[ x < -3; \quad x = -3; \quad -3 < x < 1; \quad x = 1; \quad 1 < x \]

Now, for each critical value ($x = -3$, $x = 1$) and for $x = 0$, find the $(x, y)$ point to be plotted to
sketch the curve. Thus, at $x = -3$, $y = 17$ → point (−3, 17); at $x = 1$, $y = -15$ →
point (1, −15); at $x = 0$, $y = -10$ (the 𝑦-intercept) → point (0, −10).

Then use this information to construct a sign chart of the derivative of the given function
$y = x^3 + 3x^2 - 9x - 10$ in order to sketch the path of the function, as illustrated in Table
4.1.

Table 4.1: Sign chart of $\frac{dy}{dx} = 3x^2 + 6x - 9 = 3(x + 3)(x - 1)$

Interval       Factor (x + 3)   Factor (x − 1)   Sign of dy/dx   The function y is
x < −3         −                −                +               increasing
x = −3         0                −4               0               at a critical point
−3 < x < 1     +                −                −               decreasing
x = 1          4                0                0               at a critical point
1 < x          +                +                +               increasing

Using the three points (−3, 17), (1, −15) and (0, −10) together with the information
contained in Table 4.1, we can sketch the curve $y = x^3 + 3x^2 - 9x - 10$ as in Figure 4.1.

Figure 4.1: Sketch of the curve of $y = x^3 + 3x^2 - 9x - 10$

From Figure 4.1, we can see that

i.) The point (−3, 17) is a maximum point in the sense that “near” 𝑥 = −3, 𝑦 takes its
largest value of 𝑦 = 17. However, there is no global (or absolute) maximum value for
the function 𝑓 since, as 𝑥 increases beyond the value 𝑥 = 1, 𝑦 increases without
bound. For example, when 𝑥 = 10, 𝑦 = 1200, which is far bigger than 𝑦 = 17.
Thus, the point (−3, 17) is called a local maximum or relative maximum in the sense
that the function achieves its maximum value there for points near 𝑥 = −3.
ii.) Similarly, we call the point (1, −15) a local minimum or relative minimum.
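The sign-chart reasoning of Example 4.15 can be mirrored in a few lines of code. This is a minimal sketch, assuming one test point per interval (the points −4, 0 and 2 and the helper name `behaviour` are illustrative choices): the sign of the derivative at each point determines whether the function is increasing or decreasing there.

```python
def y(x):
    return x ** 3 + 3 * x ** 2 - 9 * x - 10


def dy(x):
    """Derivative from Example 4.15: dy/dx = 3x^2 + 6x - 9."""
    return 3 * x ** 2 + 6 * x - 9


def behaviour(x):
    """Classify the function at x from the sign of its derivative."""
    slope = dy(x)
    if slope > 0:
        return "increasing"
    if slope < 0:
        return "decreasing"
    return "critical point"


# One test point per interval, plus the two critical values themselves.
chart = {x: behaviour(x) for x in (-4, -3, 0, 1, 2)}
points = [(x, y(x)) for x in (-3, 0, 1)]

print(chart)
print(points)
```

Because the derivative is continuous and vanishes only at the critical values, a single test point inside each interval is enough to fix its sign on the whole interval.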

In general, we have the following definition for maxima and minima.

Definition 4.5.1 (Maxima and Minima)


i) A function 𝑓 has a relative maximum at 𝑥0 if there is an open interval (𝑐, 𝑑) containing
𝑥0 such that 𝑓(𝑥0) ≥ 𝑓(𝑥) for every 𝑥 in (𝑐, 𝑑).
ii) A function 𝑓 has a relative minimum at 𝑥0 if there is an open interval (𝑐, 𝑑) containing
𝑥0 such that 𝑓(𝑥0) ≤ 𝑓(𝑥) for every 𝑥 in (𝑐, 𝑑).
iii) A function 𝑓 has a global maximum at 𝑥0 if 𝑓(𝑥0) ≥ 𝑓(𝑥) for every 𝑥 in the domain of
𝑓.
iv) A function 𝑓 has a global minimum at 𝑥0 if 𝑓(𝑥0) ≤ 𝑓(𝑥) for every 𝑥 in the domain of
𝑓.

Theorem 4.5.2 (Critical point)

Let a function 𝑓 have a local maximum or minimum at 𝑥0. Then 𝑥0 is a critical point of 𝑓, that is, a point at which 𝑓′(𝑥0) = 0 or 𝑓′(𝑥0) does not exist.

Remark 4.5.1: At a critical point (𝑥0 , 𝑓(𝑥0 )) a function may have a local maximum, a local
minimum or neither.

There are two ways of determining when a critical point is a local (relative) maximum or
minimum: the First derivative test and the Second derivative test.

1. The First Derivative Test

Theorem 4.5.3 (The First Derivative Test)

Let 𝑥0 be a critical point of a function 𝑓 with 𝑥0 in an open interval (𝑎, 𝑏). Suppose that 𝑓 is
continuous for 𝑎 ≤ 𝑥 ≤ 𝑏 and differentiable for 𝑎 < 𝑥 < 𝑏, except possibly at 𝑥0 itself. Then,

i) If 𝑓′(𝑥) > 0 for 𝑎 < 𝑥 < 𝑥0 and 𝑓′(𝑥) < 0 for 𝑥0 < 𝑥 < 𝑏, 𝑓 has a relative maximum
at 𝑥0.
ii) If 𝑓′(𝑥) < 0 for 𝑎 < 𝑥 < 𝑥0 and 𝑓′(𝑥) > 0 for 𝑥0 < 𝑥 < 𝑏, 𝑓 has a relative minimum at
𝑥0.
iii) If 𝑓′(𝑥) < 0 for all 𝑥 in (𝑎, 𝑏) or 𝑓′(𝑥) > 0 for all 𝑥 in (𝑎, 𝑏) (except possibly at 𝑥0
itself), 𝑓 has neither a relative maximum nor a relative minimum at 𝑥0.

Property (i) says that if 𝑓 increases to the left of 𝑥0 and decreases to the right of 𝑥0 , then 𝑓
has a relative maximum at 𝑥0 .

Property (ii) says that if 𝑓 decreases to the left of 𝑥0 and increases to the right of 𝑥0 , then 𝑓
has relative minimum at 𝑥0 .

The First derivative test is illustrated in Figure 4.2(a)-(d) in which we have drawn 3 tangent
lines in each of the 4 cases.

Figure 4.2: Relative minima and relative maxima of the function $f(x) = 3x^5 - 5x^3 + 3$

There are three critical points for this function: $x = -1$, $x = 0$, and $x = 1$. Notice that $x = -1$ is at
a relative maximum and that the function is concave down at this point, meaning that
$f^{(2)}(-1) < 0$. Similarly, $x = 1$ gives a relative minimum and the function is concave up at this
point, meaning that $f^{(2)}(1) > 0$. However, we will need to be very careful with $x = 0$. In this
case the second derivative is zero, i.e., $f^{(2)}(0) = 0$, but that alone does not tell us whether the
point (0, 3) is a relative minimum or maximum.

Example 4.16 (First Derivative Test)

From Example 4.12, 𝑓(𝑥) = 𝑥³ + 3𝑥² − 9𝑥 − 10 and 𝑓′(𝑥) = 3𝑥² + 6𝑥 − 9 = 3(𝑥 + 3)(𝑥 − 1). We can use
Table 4.1 to describe the nature of the critical points 𝑥 = −3 and 𝑥 = 1. For example, in
the interval (−4, −2), 𝑓′(𝑥) > 0 for 𝑥 < −3 and 𝑓′(𝑥) < 0 for 𝑥 > −3. This shows that
𝑓(𝑥) has a relative maximum at 𝑥 = −3. Similarly, in the interval (0, 2), 𝑓′(𝑥) < 0 for 𝑥 < 1
and 𝑓′(𝑥) > 0 for 𝑥 > 1. Thus, in the interval (0, 2), 𝑓(𝑥) has a relative minimum at 𝑥 = 1.
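The sign check used in the First Derivative Test can be sketched in a few lines of Python. This is only an illustration: the derivative 3𝑥² + 6𝑥 − 9 comes from differentiating 𝑓(𝑥) = 𝑥³ + 3𝑥² − 9𝑥 − 10 term by term, while the function names and the step size `h` are assumptions made for this sketch.

```python
def f_prime(x):
    """Derivative of f(x) = x^3 + 3x^2 - 9x - 10."""
    return 3 * x**2 + 6 * x - 9


def first_derivative_test(df, x0, h=0.5):
    """Classify the critical point x0 from the sign of df on either side of it.

    h is an assumed step size; it must be small enough that no other
    critical point lies between x0 - h and x0 + h.
    """
    left, right = df(x0 - h), df(x0 + h)
    if left > 0 and right < 0:
        return "relative maximum"
    if left < 0 and right > 0:
        return "relative minimum"
    return "neither"


for x0 in (-3, 1):
    print(x0, first_derivative_test(f_prime, x0))
```

For a derivative that does not change sign, such as 3𝑥² at 𝑥 = 0, the same function returns "neither", which corresponds to case (iii) of the theorem.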

2. The Second Derivative Test


(Grossman ().Applied Calculus, Sec 3.3)
The second derivative 𝑓⁽²⁾(𝑥), or 𝑑²𝑓(𝑥)/𝑑𝑥², of a function 𝑓(𝑥) provides us with a very
straightforward test for determining whether certain critical numbers for the function
𝑓(𝑥) give relative extrema. The second derivative test is often easier to apply than the first
derivative test.

Concavity and the Second Derivative


Whether the graph of a function 𝑓 is cupped up (concave up) or cupped down (concave
down) is determined by whether the first derivative 𝑓′ (i.e. the slope) is an increasing or
decreasing function. To determine whether 𝑓′ itself is increasing or decreasing, we must
examine the derivative of 𝑓′, that is, the second derivative 𝑓⁽²⁾(𝑥) of the function 𝑓, if it exists. In
this section, we illustrate how the second derivative of a function is related to the shape of
its graph and how that information can be used to classify relative extreme values.

Since the function 𝑓′(𝑥) is increasing if its derivative 𝑑/𝑑𝑥 𝑓′(𝑥) = 𝑓⁽²⁾(𝑥) is positive, it follows
that the graph of 𝑓 is concave up when 𝑓⁽²⁾(𝑥) > 0. Similarly, the graph of 𝑓 is concave down
when 𝑓⁽²⁾(𝑥) < 0. [See Figure 4.2 above.]

Definition 4.5.2 (Concavity Theorem)

Let a function 𝑓(𝑥) be twice differentiable (that is, let 𝑓′(𝑥) and 𝑓⁽²⁾(𝑥) exist) for all 𝑥 in
the interval (𝑎, 𝑏). Then,

i) The graph of 𝑦 = 𝑓(𝑥) is said to be concave up on (𝑎, 𝑏) if 𝑓⁽²⁾(𝑥) > 0 for all 𝑥 in (𝑎, 𝑏).
ii) The graph of 𝑦 = 𝑓(𝑥) is said to be concave down on (𝑎, 𝑏) if 𝑓⁽²⁾(𝑥) < 0 for all 𝑥 in (𝑎, 𝑏).

Point of inflection: A point on the graph of 𝑦 = 𝑓(𝑥) that separates arcs of opposite
concavity is called a point of inflection. That is, inflection points are points on a graph where
the concavity changes. A positive second derivative means a function is concave up; a
negative second derivative means the function is concave down. The points of inflection are
the points (on the graph) where the second derivative is zero and the function changes from
concave up to concave down or vice versa.

Definition 4.5.2 provides us with a procedure for determining the intervals on which the
graph of 𝑦 = 𝑓(𝑥) is concave up or concave down:

i) Find all numbers 𝑥 for which the second derivative 𝑓⁽²⁾(𝑥) = 0 or 𝑓⁽²⁾(𝑥) fails to
exist.
ii) Check the sign of 𝑓⁽²⁾(𝑥) on each of the resulting intervals to determine concavity.

Example 4.17: Determine the concavity for the graph of the function 𝑓(𝑥) = 𝑥⁴ − 6𝑥² + 2.

Solution

𝑓(𝑥) = 𝑥⁴ − 6𝑥² + 2

𝑓′(𝑥) = 4𝑥³ − 12𝑥

Now, to check for the points of inflection (𝑥₀, 𝑓(𝑥₀)), find the second derivative of 𝑓(𝑥),
equate it to zero and solve for 𝑥. [This is done because the points of inflection are
points (𝑥₀, 𝑓(𝑥₀)) on the graph where the second derivative is zero.]

𝑓⁽²⁾(𝑥) = 12𝑥² − 12

Setting 𝑓⁽²⁾(𝑥) = 0 yields

12𝑥² − 12 = 0

12(𝑥² − 1) = 0

i.e. 𝑥² − 1 = 0

⇒ (𝑥 + 1)(𝑥 − 1) = 0, since 𝑥² − 1 = 𝑥² − 1² = (𝑥 + 1)(𝑥 − 1)

So, the zeros of 𝑓⁽²⁾(𝑥) are 𝑥₁ = −1 and 𝑥₂ = 1. Notice that here there are no values of 𝑥
for which 𝑓⁽²⁾(𝑥) is undefined.

Check the sign of 𝑓⁽²⁾(𝑥) on each of the resulting intervals (−∞, −1), (−1, 1) and (1, ∞) by
choosing one “test number” 𝑡 in each interval and calculating 𝑓⁽²⁾(𝑡). The results are given
in Table 4.2.

Table 4.2

Interval          Test number 𝑡     𝑓⁽²⁾(𝑡)              Sign of 𝑓⁽²⁾(𝑡)
−∞ < 𝑥 < −1       𝑡 = −2            𝑓⁽²⁾(−2) = 36        +
−1 < 𝑥 < 1        𝑡 = 0             𝑓⁽²⁾(0) = −12        −
1 < 𝑥 < ∞         𝑡 = 2             𝑓⁽²⁾(2) = 36         +

Thus, applying Definition 4.5.2 to the results in the last column of Table 4.2, we conclude
that the graph of 𝑓(𝑥) = 𝑥⁴ − 6𝑥² + 2 is

i) Concave up on (−∞, −1)
ii) Concave down on (−1, 1)
iii) Concave up on (1, ∞)

The points of inflection are the points (𝑥₀, 𝑓(𝑥₀)) on the graph where the second derivative is
zero and the function changes from concave up to concave down or vice versa.

The concavity of the graph changes at both the points (−1, 𝑓(−1)) = (−1, −3) and
(1, 𝑓(1)) = (1, −3). So these are the points of inflection, as illustrated in Figure
4.5 (Homework: Sketch the curve).
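The sign table for the second derivative can be reproduced with a short Python sketch. The second derivative 12𝑥² − 12 and the test numbers −2, 0 and 2 are taken from the example; the function and variable names are assumptions made for illustration.

```python
def f2(x):
    """Second derivative of f(x) = x^4 - 6x^2 + 2."""
    return 12 * x**2 - 12


def concavity(d2f, t):
    """Return the concavity implied by the sign of the second derivative at t."""
    value = d2f(t)
    if value > 0:
        return "concave up"
    if value < 0:
        return "concave down"
    return "undetermined"


# One test number per interval, as in Table 4.2.
tests = {"(-inf, -1)": -2, "(-1, 1)": 0, "(1, inf)": 2}
table = {interval: (t, f2(t), concavity(f2, t)) for interval, t in tests.items()}

for interval, row in table.items():
    print(interval, row)
```

Because the sign of the second derivative differs on either side of 𝑥 = −1 and of 𝑥 = 1, both zeros correspond to points of inflection.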

Theorem 4.5.4 (The Second Derivative Test)

Let 𝑓(𝑥) be differentiable on an open interval (𝑐, 𝑑) containing the critical value 𝑥 = 𝑥₀
with 𝑓′(𝑥₀) = 0. Suppose also that 𝑓⁽²⁾(𝑥) exists throughout this interval. Then,

i) If 𝑓⁽²⁾(𝑥₀) < 0, 𝑓(𝑥₀) is a relative maximum.
ii) If 𝑓⁽²⁾(𝑥₀) > 0, 𝑓(𝑥₀) is a relative minimum.
iii) If 𝑓⁽²⁾(𝑥₀) = 0, the test is inconclusive.

Example 4.18

Find all relative extrema for 𝑓(𝑥) = 𝑥⁴ − 8𝑥² + 2.

Solution

First, set 𝑑𝑓/𝑑𝑥 = 0 and find the critical numbers.

Given 𝑓(𝑥) = 𝑥⁴ − 8𝑥² + 2,

𝑑𝑓(𝑥)/𝑑𝑥 = 4𝑥³ − 16𝑥 = 0

⇒ 4𝑥(𝑥² − 4) = 0

or 4𝑥(𝑥² − 2²) = 0

⇒ 4𝑥(𝑥 − 2)(𝑥 + 2) = 0

So, either 4𝑥 = 0 or 𝑥² − 4 = 0.

∴ 𝑥 = 0, 𝑥 = −2 and 𝑥 = 2 are the 3 critical values.

Find the second derivative: 𝑓⁽²⁾(𝑥) = 𝑑/𝑑𝑥 (4𝑥³ − 16𝑥) = 12𝑥² − 16

Check the sign of 𝑓⁽²⁾(𝑥) for each critical number 𝑥₀ and apply the second derivative test:

i) 𝑓⁽²⁾(0) = 12(0)² − 16 = −16 < 0 ⇒ (0, 2) is a relative maximum.
ii) 𝑓⁽²⁾(−2) = 12(−2)² − 16 = 48 − 16 = 32 > 0 ⇒ (−2, −14) is a relative minimum.
iii) 𝑓⁽²⁾(2) = 12(2)² − 16 = 48 − 16 = 32 > 0 ⇒ (2, −14) is a relative minimum.
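As a quick check of the three classifications, the Second Derivative Test can be sketched in Python. The function 𝑥⁴ − 8𝑥² + 2, its second derivative 12𝑥² − 16 and the critical numbers are those of the example; the helper names are assumptions made for this sketch.

```python
def f(x):
    """The function of Example 4.18."""
    return x**4 - 8 * x**2 + 2


def f2(x):
    """Its second derivative."""
    return 12 * x**2 - 16


def second_derivative_test(d2f, x0):
    """Classify a critical number x0 (where f'(x0) = 0) by the sign of d2f(x0)."""
    value = d2f(x0)
    if value < 0:
        return "relative maximum"
    if value > 0:
        return "relative minimum"
    return "inconclusive"


# Critical numbers found by solving 4x^3 - 16x = 0.
results = {x0: (f(x0), second_derivative_test(f2, x0)) for x0 in (-2, 0, 2)}

for x0, (value, kind) in results.items():
    print(f"x0 = {x0}: f(x0) = {value}, {kind}")
```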

Remark 4.5.2: A continuous function 𝑓(𝑥) on a closed interval [𝑎, 𝑏] always has both a global
(absolute) maximum and a global (absolute) minimum, so that examining the critical values
and the endpoints is enough to find the global maximum and minimum. See Theorem 4.5.5.

Theorem 4.5.5: (Extreme value theorem)

If a function 𝑓(𝑥) is continuous on a closed interval [𝑎, 𝑏], then 𝑓(𝑥) has both a global
minimum value and a global maximum value. That is, there are real numbers 𝑐 and 𝑑 in
[𝑎, 𝑏] such that, for every 𝑥 in [𝑎, 𝑏], 𝑓(𝑐) ≤ 𝑓(𝑥) ≤ 𝑓(𝑑). Here 𝑓(𝑐) is the absolute (global)
minimum value and 𝑓(𝑑) is the absolute (global) maximum value.

Remark 4.5.3: If a function is continuous and has a single critical value, then if there is a local
maximum at the critical value it is a global maximum, and if it is a local minimum it is a global
minimum. There may also be a global minimum in the first case, or a global maximum in the
second case, but that will generally require more effort to determine.

Example 4.19
Find the global (absolute) minimum and maximum of the function 𝑓(𝑥) = (1 − 𝑥)𝑒ˣ on the
closed interval [−1, 5].
Solution
Note: The only place a global minimum or maximum can occur on an interval is at one of the
endpoints of the interval or at a critical point inside the interval. A critical point is a point
where the function is defined and its derivative is either 0 or undefined. So start by finding
the derivative:

𝑓′(𝑥) = 𝑒ˣ · 𝑑/𝑑𝑥 (1 − 𝑥) + (1 − 𝑥) · 𝑑/𝑑𝑥 𝑒ˣ   by the product rule
      = −𝑒ˣ + (1 − 𝑥)𝑒ˣ
So
𝑓′(𝑥) = −𝑥𝑒ˣ, −∞ < 𝑥 < ∞

Notice that this derivative is defined everywhere, so the only possible critical numbers are
where the derivative is equal to 0. And 𝑓′(𝑥) = −𝑥𝑒ˣ = 0 can only be true if either 𝑥 = 0
or 𝑒ˣ = 0. But 𝑒ˣ is always positive and so is never equal to 0. Thus, the only critical point
occurs at 𝑥 = 0.
As stated before, the global maximum and minimum can only occur at an endpoint or
at a critical number. So our only choices for 𝑥 here are −1, 0 or 5. We therefore just have to
substitute each of the three values into the original function 𝑓(𝑥) = (1 − 𝑥)𝑒ˣ and see which
one gives the highest value and which gives the lowest value:
𝑓(−1) = 2/𝑒 < 1
𝑓(0) = 1
𝑓(5) = −4𝑒⁵
Thus, the global minimum occurs at the point (5, −4𝑒⁵) and the global maximum occurs at
the point (0, 1).
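The comparison of candidate points can be sketched in Python as follows. The function (1 − 𝑥)𝑒ˣ, the interval [−1, 5] and the critical number 𝑥 = 0 are those of the example; the variable names are assumptions made for illustration.

```python
import math


def f(x):
    """f(x) = (1 - x) e^x from Example 4.19."""
    return (1 - x) * math.exp(x)


# Candidates: the two endpoints of [-1, 5] and the single critical number x = 0.
candidates = [-1, 0, 5]
values = {x: f(x) for x in candidates}

x_min = min(values, key=values.get)  # candidate with the smallest f value
x_max = max(values, key=values.get)  # candidate with the largest f value

print("global minimum at x =", x_min, "with value", values[x_min])
print("global maximum at x =", x_max, "with value", values[x_max])
```

The sketch only compares the three candidates. It relies on the reasoning above to guarantee that no other point of the interval can give an extreme value.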

Theorem 4.5.6: Rolle’s Theorem

Suppose a function 𝑓(𝑥) is (i) continuous on a closed interval [𝑎, 𝑏], (ii) differentiable on the
open interval (𝑎, 𝑏) and (iii) 𝑓(𝑎) = 𝑓(𝑏). Then there is a number 𝑐 in (𝑎, 𝑏) such that
𝑓′(𝑐) = 0. That is, 𝑓(𝑥) has a critical point 𝑐 in (𝑎, 𝑏).

Theorem 4.5.7: (The Mean value theorem for derivatives)


Suppose a function 𝑓(𝑥) is (i) continuous on a closed interval [𝑎, 𝑏] and (ii) differentiable on
the open interval (𝑎, 𝑏). Then there is a number 𝑐 in (𝑎, 𝑏) such that the derivative
𝑓′(𝑐) is equal to the function's average rate of change over [𝑎, 𝑏],
i.e., 𝑓′(𝑐) = (𝑓(𝑏) − 𝑓(𝑎))/(𝑏 − 𝑎) ⇔ 𝑓(𝑏) − 𝑓(𝑎) = 𝑓′(𝑐)(𝑏 − 𝑎).
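As a small worked illustration of the Mean Value Theorem, take the function 𝑓(𝑥) = 𝑥² on the interval [1, 3]. Both the function and the interval are assumptions chosen for this sketch; they do not come from the chapter.

```latex
\begin{align*}
\text{Average rate of change over } [1, 3]:\quad
  & \frac{f(3) - f(1)}{3 - 1} = \frac{9 - 1}{2} = 4 \\
\text{Derivative:}\quad
  & f'(x) = 2x \\
\text{Solve } f'(c) = 4:\quad
  & 2c = 4 \;\Rightarrow\; c = 2, \qquad 1 < 2 < 3
\end{align*}
```

So the tangent line at 𝑐 = 2 is parallel to the secant line joining (1, 1) and (3, 9), as the theorem guarantees.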

4.5.3 Applications of Derivatives to Business Economics

4.5.3.1 Marginal Analysis

References
i) Section 3-7: Marginal Analysis in Business and Economics.
http://faculty.mdc.edu/mmontane/marginal-analysis.pdf.

Example 4.20: Do Examples 1, 2, 3 and Matched Problem 3.


ii) CHAPTER 2 Applications of the Derivative. 2.7 Applications of Derivatives to Business and
Economics. http://math.hawaii.edu/~mchyba/documents/syllabus/Math499/extracredit.pdf

4.5.3.2 Price Elasticity of Demand

Reference: Mark Mac Lean (2011). Price elasticity of Demand. MATH 104 and MATH 184. University
of British Columbia, Canada. http://www.math.ubc.ca/~kliu/notesonelasticity.pdf

Example 4.21: Do Example 1.

Part II: Differentiation of Multivariate Functions

4.6 Differentiation of Multivariate Functions
Readings: R Horan & M. Lavelle. Intermediate Mathematics. Introduction to Partial Differentiation.
The University of Plymouth

4.6.0 Multivariate Function

In practice, many quantities that we measure are often functions of two or more variables. That is,
the more common situation is for a dependent variable to be related to more than one independent
variable. Multivariate calculus (also known as Multivariable calculus) is the extension of calculus in
one variable (i.e. Univariate calculus) to calculus in more than one variable. This involves the
differentiation and integration of functions involving multiple variables, rather than just one.

Definition 4.6.1: (Bivariate function)

A bivariate (two-variable) function is a function whose value depends on two variables, i.e., on a
vector in two-dimensional space. Symbolically, this is written $f(x_1, x_2) = f(\mathbf{x})$, where
$\mathbf{x} = (x_1, x_2)$ is a 2-dimensional vector.

Definition 4.6.2: (Multivariable function)

A p-variable function (or a p-variate function) is a function whose range is a subset of the real
line $\mathbb{R}$ and whose domain is a subset of the p-dimensional space $\mathbb{R}^p$. That is,
$f(x_1, x_2, \ldots, x_p) = f(\mathbf{x})$, where $\mathbf{x} = (x_1, x_2, \ldots, x_p)$ and $f(\mathbf{x})$ is a scalar.

4.6.1 Partial Differentiation

When one or more of the input variables of a function $f$ of several independent variables change,
it is important to calculate the resulting change in the function itself. To determine the rate of change
of a multivariate real function $f(x_1, x_2, \ldots, x_p)$, where $p$ denotes the number of variables, with
respect to one of its independent variables $x_j$, $j = 1, 2, \ldots, p$, we find the derivative of $f$ with
respect to one variable $x_j$ at a time, while holding the other independent variables constant. This
process is called partial differentiation.

Notation: The symbol ∂ is used whenever a function of more than one variable is being
differentiated. The symbol $f_x$ (or $\partial f/\partial x$) denotes the first partial derivative of $f(x, y)$ with
respect to the variable $x$. Likewise, $f_y$ (or $\partial f/\partial y$) denotes the first partial derivative of $f(x, y)$ with
respect to the variable $y$.

Definition 4.6.3: Partial Derivative for a bivariate function

The first partial derivatives of a two-variable function $f(x, y)$ with respect to the variables $x$ and $y$,
respectively, are given by

$$\frac{\partial f(x, y)}{\partial x} = \lim_{\Delta x \to 0} \frac{f(x + \Delta x, y) - f(x, y)}{\Delta x}$$

$$\frac{\partial f(x, y)}{\partial y} = \lim_{\Delta y \to 0} \frac{f(x, y + \Delta y) - f(x, y)}{\Delta y}$$

Definition 4.6.4: Partial Derivative for a p-variate function


If $f(x_1, x_2, \ldots, x_p)$ is a function of $p$ variables, then the partial derivative of $f$ with respect to the $j$th
variable $x_j$, $j = 1, 2, \ldots, p$, is defined as

$$\frac{\partial f}{\partial x_j} = \lim_{\Delta x_j \to 0} \frac{f(x_1, x_2, \ldots, x_j + \Delta x_j, \ldots, x_p) - f(x_1, x_2, \ldots, x_j, \ldots, x_p)}{\Delta x_j}$$

4.6.1.1 Rules of Partial Differentiation

The rules of partial differentiation follow exactly the same techniques as univariate
differentiation. The only difference is that we have to decide how to treat the other independent
variable. If we hold the other variable constant, that means that it is treated just like any other
constant. Hence, in Definition 4.6.3 the first equation gives the rate of change of f(x, y) with respect
to x while y is held constant. Similarly, the second equation gives the rate of change of f(x, y) with
respect to y while x is held constant.
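The idea of holding the other variable constant can be illustrated numerically. The sketch below approximates the limit in the definition of a partial derivative with a small but finite increment. The function 𝑥²𝑦 + 3𝑦, the evaluation point (2, 5), the step `h` and the helper name `partial` are all assumptions made for illustration.

```python
def partial(f, point, j, h=1e-6):
    """Forward-difference estimate of the partial derivative of f with
    respect to variable number j (0-based) at the given point.

    Only coordinate j is changed; every other variable is held constant.
    """
    shifted = list(point)
    shifted[j] += h
    return (f(*shifted) - f(*point)) / h


def f(x, y):
    """Hypothetical example function f(x, y) = x^2 y + 3y."""
    return x**2 * y + 3 * y


# Treating y as a constant gives f_x = 2xy; treating x as a constant gives f_y = x^2 + 3.
fx = partial(f, (2.0, 5.0), 0)  # close to 2 * 2 * 5 = 20
fy = partial(f, (2.0, 5.0), 1)  # close to 2**2 + 3 = 7

print(fx, fy)
```

The estimates agree with the exact partial derivatives only up to the error of the finite step, so they are a check on hand calculations rather than a replacement for them.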

4.6.1.2 Higher Order Partial Derivatives

The first partial derivatives $f_x$ and $f_y$ are themselves functions of the variables $x$ and $y$, so we can find their
derivatives. Thus, as with ordinary derivatives of functions of one variable, we can compute higher
order partial derivatives of functions of several variables. If $f(x, y)$ is a bivariate function of $x$ and $y$,
then

$f_{xx} = \dfrac{\partial^2 f}{\partial x^2} = \dfrac{\partial}{\partial x}\left(\dfrac{\partial f}{\partial x}\right)$ denotes the second partial derivative of $f$ with respect to $x$,

$f_{yy} = \dfrac{\partial^2 f}{\partial y^2} = \dfrac{\partial}{\partial y}\left(\dfrac{\partial f}{\partial y}\right)$ denotes the second partial derivative of $f$ with respect to $y$,

$f_{xy} = \dfrac{\partial^2 f}{\partial y\,\partial x} = \dfrac{\partial}{\partial y}\left(\dfrac{\partial f}{\partial x}\right)$ says first differentiate with respect to $x$ and then with respect to $y$,

$f_{yx} = \dfrac{\partial^2 f}{\partial x\,\partial y} = \dfrac{\partial}{\partial x}\left(\dfrac{\partial f}{\partial y}\right)$ says first differentiate with respect to $y$ and then with respect to $x$.

The 3rd and 4th equations are called mixed partial (or cross partial) derivatives. Notice that when
finding mixed partial derivatives, we differentiate first with respect to the variable nearest f.
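As a brief worked illustration of the four second-order partial derivatives, consider the function $f(x, y) = x^3 y^2$. This function is an assumption chosen for the sketch and is not one of the chapter's examples.

```latex
\begin{align*}
f(x, y) &= x^3 y^2 \\
f_x &= 3x^2 y^2, & f_y &= 2x^3 y \\
f_{xx} &= \frac{\partial}{\partial x}\left(3x^2 y^2\right) = 6x y^2, &
f_{yy} &= \frac{\partial}{\partial y}\left(2x^3 y\right) = 2x^3 \\
f_{xy} &= \frac{\partial}{\partial y}\left(3x^2 y^2\right) = 6x^2 y, &
f_{yx} &= \frac{\partial}{\partial x}\left(2x^3 y\right) = 6x^2 y
\end{align*}
```

Here the two mixed partial derivatives coincide, which is what one expects whenever the second-order partial derivatives are continuous.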

Example 4.22: Find the first- and second-order partial derivatives of the function

f  x, y   e xy  ln x 2  y 
Solution

Example 4.23: Find mixed partial derivatives of the function f  x, y   x 2 cos y  2 xy .

Solution

4.7. Applications of Partial Derivatives

Partial derivatives and mixed partial derivatives are important since they allow us to determine local
maximum and minimum points for multivariate functions.

4.7.1 Unconstrained Optimisation Using the Second Partial Derivative Test

4.7.1.2 Relative Maxima and Minima for a Bivariate function

The definition of relative extrema for functions of two variables is identical to that for functions of
one variable; we just need to remember that we are now working with functions of two variables.

Definition 4.7.1: (Relative Minimum and Maximum)

Suppose we are interested in finding points of relative maxima and minima for a function of two
variables, i.e., $f(x, y)$. Then,

i) $f(x, y)$ has a relative minimum at the point $(x_0, y_0)$ if $f(x, y) \ge f(x_0, y_0)$ for all points $(x, y)$ in
some region around $(x_0, y_0)$.
ii) $f(x, y)$ has a relative maximum at the point $(x_0, y_0)$ if $f(x, y) \le f(x_0, y_0)$ for all points $(x, y)$ in
some region around $(x_0, y_0)$.

Definition 4.7.2 (Stationary Points)

For a differentiable multivariable function, a stationary (critical) point is a point on the surface of the
graph where all its partial derivatives are zero (equivalently, the gradient is zero).

By Definition 4.7.2, a point $(x_0, y_0)$ is a stationary (critical) point of a bivariate function $f(x, y)$ if one of the
following is true:

i) $f_x(x_0, y_0) = 0$ and $f_y(x_0, y_0) = 0$ (i.e., both the partial derivatives of $f(x, y)$ at $(x_0, y_0)$ are zero).
ii) $f_x(x_0, y_0)$ and/or $f_y(x_0, y_0)$ does not exist.

Finding Stationary Point of a Bivariate Function

Thus, to find the relative minima/maxima of a two-variable function $f(x, y)$, the first step is to find
the stationary points $(x_0, y_0)$ where the gradient is the zero vector. That is, find $f_x$ and $f_y$, set both to
zero and solve the resulting system of equations for $x$ and $y$.

Definition 4.7.3: (The Gradient)

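The gradient of a multivariate function $f(x_1, x_2, \ldots, x_p)$ at the p-dimensional vector
$\mathbf{x} = (x_1, x_2, \ldots, x_p)^T$, denoted by $\nabla f(\mathbf{x})$, is the vector

$$\nabla f(\mathbf{x}) =
\begin{bmatrix}
\dfrac{\partial}{\partial x_1} f(\mathbf{x}) \\
\dfrac{\partial}{\partial x_2} f(\mathbf{x}) \\
\vdots \\
\dfrac{\partial}{\partial x_p} f(\mathbf{x})
\end{bmatrix}
=
\begin{bmatrix}
f_{x_1} \\ f_{x_2} \\ \vdots \\ f_{x_p}
\end{bmatrix},$$

a column vector of the first partial derivatives of $f(x_1, x_2, \ldots, x_p)$ with
respect to each one of the independent variables $x_1, x_2, \ldots, x_p$.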

Note: Both of the first order partial derivatives must be zero at the point (x0 , y0 ). If only one of the
first order partial derivatives is zero at the point, then the point (x0 , y0 ) will not be a stationary point.

Classifying (Identifying) Stationary Points

Consider a bivariate function $f(x_1, x_2)$. To check whether a point $(x_0, y_0)$ with a zero gradient is a relative
minimum or relative maximum, we determine the Hessian matrix of $f(x_1, x_2)$, a matrix whose $(i, j)$th
element is the second-order partial derivative $\dfrac{\partial^2 f(x_1, x_2)}{\partial x_i\,\partial x_j}$, for $i, j = 1, 2$:

$$\mathbf{H}(\mathbf{x}) =
\begin{bmatrix}
f_{xx}(\mathbf{x}) & f_{xy}(\mathbf{x}) \\
f_{yx}(\mathbf{x}) & f_{yy}(\mathbf{x})
\end{bmatrix}$$

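[ASIDE: In general, the Hessian matrix of a multivariable function $f(x_1, x_2, \ldots, x_p)$, denoted by
$\mathbf{H}(\mathbf{x})$, is the matrix whose $(i, j)$th element is the second-order partial derivative
$\dfrac{\partial^2 f(x_1, x_2, \ldots, x_p)}{\partial x_i\,\partial x_j}$, for $i, j = 1, 2, \ldots, p$:

$$\mathbf{H}(\mathbf{x}) =
\begin{bmatrix}
\dfrac{\partial^2}{\partial x_1 \partial x_1} f(\mathbf{x}) & \dfrac{\partial^2}{\partial x_1 \partial x_2} f(\mathbf{x}) & \cdots & \dfrac{\partial^2}{\partial x_1 \partial x_p} f(\mathbf{x}) \\
\dfrac{\partial^2}{\partial x_2 \partial x_1} f(\mathbf{x}) & \dfrac{\partial^2}{\partial x_2 \partial x_2} f(\mathbf{x}) & \cdots & \dfrac{\partial^2}{\partial x_2 \partial x_p} f(\mathbf{x}) \\
\vdots & \vdots & \ddots & \vdots \\
\dfrac{\partial^2}{\partial x_p \partial x_1} f(\mathbf{x}) & \dfrac{\partial^2}{\partial x_p \partial x_2} f(\mathbf{x}) & \cdots & \dfrac{\partial^2}{\partial x_p \partial x_p} f(\mathbf{x})
\end{bmatrix}$$

or

$$\mathbf{H}(\mathbf{x}) =
\begin{bmatrix}
f_{x_1 x_1}(\mathbf{x}) & f_{x_1 x_2}(\mathbf{x}) & \cdots & f_{x_1 x_p}(\mathbf{x}) \\
f_{x_2 x_1}(\mathbf{x}) & f_{x_2 x_2}(\mathbf{x}) & \cdots & f_{x_2 x_p}(\mathbf{x}) \\
\vdots & \vdots & \ddots & \vdots \\
f_{x_p x_1}(\mathbf{x}) & f_{x_p x_2}(\mathbf{x}) & \cdots & f_{x_p x_p}(\mathbf{x})
\end{bmatrix}$$
]

Note: The Hessian matrix is a symmetric matrix. That is, $\dfrac{\partial^2}{\partial x_i\,\partial x_j} f(\mathbf{x}) = \dfrac{\partial^2}{\partial x_j\,\partial x_i} f(\mathbf{x})$. Why?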

To verify if a critical point, say, (𝒙𝟎 , 𝒚𝟎 ), is a relative minimum, relative maximum or a saddle point, we
apply The Second partial derivative test, stated in Theorem 4.7.1.

Theorem 4.7.1: (The Second Partial Derivative Test)

Suppose $f(x, y)$ is a two-variable function with a critical point $(x_0, y_0)$, i.e. $\nabla f(x_0, y_0) = \mathbf{0}$, and that the
second order partial derivatives are continuous in some region that contains $(x_0, y_0)$. Let $D$ denote the
determinant of the Hessian matrix $\mathbf{H}(\mathbf{x})$ at the point $(x_0, y_0)$,

$$D = f_{xx}(x_0, y_0)\, f_{yy}(x_0, y_0) - \left(f_{xy}(x_0, y_0)\right)^2,$$

called the discriminant of the function $f(x, y)$. Then

i) if $D > 0$ and $f_{xx}(x_0, y_0) > 0$, then $(x_0, y_0)$ corresponds to a relative minimum.
ii) if $D > 0$ and $f_{xx}(x_0, y_0) < 0$, then $(x_0, y_0)$ corresponds to a relative maximum.
iii) if $D < 0$, then $(x_0, y_0)$ corresponds to a saddle point.
iv) if $D = 0$, then the test is inconclusive. That is, the point $(x_0, y_0)$ may be a relative minimum,
relative maximum or a saddle point. Other techniques would need to be used to classify the
critical point.

Example 4.23
Find and classify the stationary points (local maximum, local minimum or saddle point) of the
function $f(x, y) = x^2 + y^4 + 1$.

Solution
To find the stationary points, first find the first-order partial derivatives and equate each to zero:
$f_x(x, y) = 2x = 0 \Rightarrow x = 0$
$f_y(x, y) = 4y^3 = 0 \Rightarrow y = 0$
Hence,

$$\nabla f(\mathbf{x}) =
\begin{bmatrix}
\dfrac{\partial}{\partial x_1} f(\mathbf{x}) \\
\dfrac{\partial}{\partial x_2} f(\mathbf{x})
\end{bmatrix}
=
\begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

only at $(0, 0)$. Thus, the only stationary point of $f$ is $(0, 0)$. To classify the stationary point $(0, 0)$, apply the
Second partial derivative test as follows:
The second-order partials are $f_{xx} = 2$, $f_{xy} = 0$, $f_{yy} = 12y^2$, $f_{yx} = 0$, so that the Hessian matrix is

$$\mathbf{H} =
\begin{bmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{bmatrix}
=
\begin{bmatrix} 2 & 0 \\ 0 & 12y^2 \end{bmatrix}.$$

Hence, the discriminant of the function is

$$D = \det(\mathbf{H}) = \begin{vmatrix} 2 & 0 \\ 0 & 12y^2 \end{vmatrix} = 24y^2 - 0 = 24y^2.$$

At the stationary point $(0, 0)$, $D = 0$ and so the test provides no information about the nature of
this stationary point (i.e., the test is inconclusive).
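Although the Second partial derivative test gives no information here, the stationary point can still be classified by comparing function values directly. The short derivation below is added as an illustration of such an "other technique"; it uses only the function $f(x, y) = x^2 + y^4 + 1$ of the example.

```latex
\begin{align*}
f(x, y) - f(0, 0) &= \left(x^2 + y^4 + 1\right) - 1 \\
                  &= x^2 + y^4 \;\ge\; 0 \quad \text{for all } (x, y) \in \mathbb{R}^2 .
\end{align*}
```

Hence $f(x, y) \ge f(0, 0) = 1$ everywhere, so $(0, 0)$ gives a relative (indeed a global) minimum of $f$.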

Example 4.24
Locate and classify the stationary points of the function $f(x, y) = 3x^2 y + y^3 - 3x^2 - 3y^2 + 1$.

Solution
Note: In this case the function $f$ is continuous and defined for every point $(x, y) \in \mathbb{R}^2$, so that any local
extreme values will occur at the critical (stationary) points of $f$. We find the stationary points by
setting the gradient of $f$ equal to the null vector and then solving simultaneously for $x$ and $y$ (just
like in Example 4.17):

$$\nabla f(\mathbf{x}) =
\begin{bmatrix} f_{x_1} \\ f_{x_2} \end{bmatrix}
=
\begin{bmatrix} 6xy - 6x \\ 3x^2 + 3y^2 - 6y \end{bmatrix}
=
\begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

And, $6xy - 6x = 0 \Rightarrow 6x(y - 1) = 0 \Rightarrow x = 0$ or $y = 1$.

Now, if $x = 0$, the second equation $3x^2 + 3y^2 - 6y = 0$ becomes $3y^2 - 6y = 0 \Rightarrow 3y(y - 2) = 0$, so
that $y = 0$ or $y = 2$ and, hence, $(0, 0)$ and $(0, 2)$ are stationary points of $f$.

Similarly, if $y = 1$, the second equation $3x^2 + 3y^2 - 6y = 0$ becomes $3x^2 - 3 = 0$, so that either
$x = -1$ or $x = 1$ and, hence, $(-1, 1)$ and $(1, 1)$ are also stationary points of $f$.

Thus, there are 4 stationary points of $f$ to be classified by using the Second partial derivative test:
$(-1, 1)$, $(0, 0)$, $(0, 2)$ and $(1, 1)$.

By computing the second-order partial derivatives, we easily obtain the following Hessian matrix:

$$\mathbf{H} =
\begin{bmatrix} 6y - 6 & 6x \\ 6x & 6y - 6 \end{bmatrix}$$

And so the discriminant of $f$ is

$$D = \begin{vmatrix} 6y - 6 & 6x \\ 6x & 6y - 6 \end{vmatrix} = (6y - 6)^2 - 36x^2.$$

This leads to the following classification of the 4 stationary points of $f$ above.

Stationary point (x, y)   D(x, y)               f_xx(x, y)             Conclusion
(−1, 1)                   D(−1, 1) = −36 < 0    0                      ((−1, 1), f(−1, 1)) = (−1, 1, −1) is a saddle point
(0, 0)                    D(0, 0) = 36 > 0      f_xx(0, 0) = −6 < 0    ((0, 0), f(0, 0)) = (0, 0, 1) is a local maximum
(0, 2)                    D(0, 2) = 36 > 0      f_xx(0, 2) = 6 > 0     ((0, 2), f(0, 2)) = (0, 2, −3) is a local minimum
(1, 1)                    D(1, 1) = −36 < 0     0                      ((1, 1), f(1, 1)) = (1, 1, −1) is a saddle point
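The classification can be checked with a short Python sketch of the Second partial derivative test. The first- and second-order partial derivatives are those of the function 3𝑥²𝑦 + 𝑦³ − 3𝑥² − 3𝑦² + 1; the function names and the layout of the results are assumptions made for illustration.

```python
def f(x, y):
    """f(x, y) = 3x^2 y + y^3 - 3x^2 - 3y^2 + 1."""
    return 3 * x**2 * y + y**3 - 3 * x**2 - 3 * y**2 + 1


def gradient(x, y):
    """First-order partial derivatives (f_x, f_y)."""
    return (6 * x * y - 6 * x, 3 * x**2 + 3 * y**2 - 6 * y)


def classify(x, y):
    """Apply the second partial derivative test at a stationary point (x, y)."""
    f_xx, f_xy, f_yy = 6 * y - 6, 6 * x, 6 * y - 6
    d = f_xx * f_yy - f_xy**2  # discriminant: determinant of the Hessian
    if d > 0 and f_xx > 0:
        label = "local minimum"
    elif d > 0 and f_xx < 0:
        label = "local maximum"
    elif d < 0:
        label = "saddle point"
    else:
        label = "inconclusive"
    return d, f_xx, label


points = [(-1, 1), (0, 0), (0, 2), (1, 1)]
results = {}
for p in points:
    assert gradient(*p) == (0, 0), "not a stationary point"
    results[p] = (f(*p),) + classify(*p)

for p, row in results.items():
    print(p, row)
```

Each printed row lists the function value, the discriminant, the value of the second partial derivative with respect to 𝑥 and the resulting classification.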

Example 4.25 (Question 13 of Exercises 4 2017)


Modise Kwena is the operations Vice President for Credit Bank Pty Ltd. The bank gives two categories
of loans, consumer and commercial, and Modise has developed the following functional relationship
showing how profit depends on the amount of loans in each category:
$\text{Profit} = f(x, y) = -2x^2 - 3y^2 + 60x + 120y + 2000,$
where x denotes the number of consumer loans and y represents number of commercial loans.
Determine the optimum operating level (combination of loan types) for Modise’s Bank to maximise
profit.
Example 4.26
A company makes two products whose demand equations are given by $q_1 = 200 - 3p_1 - p_2$ and
$q_2 = 150 - p_1 - 2p_2$, respectively, where $p_1, q_1$ and $p_2, q_2$ are the price and quantity of products 1 and 2,
respectively.

a) Determine the price the company should charge for each product in order to maximize total
revenue. Hint: Since the company is selling two products, the total revenue will be the sum of
the total revenues realised from the two products.
b) Verify that the maximum revenue accruable to the company is P4375.
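One way to check the stated figure numerically is sketched below. Following the hint, total revenue is taken as price times quantity summed over the two products; the first-order conditions are then a pair of linear equations, solved here by Cramer's rule. The variable names and the choice of Cramer's rule are assumptions made for this sketch.

```python
def revenue(p1, p2):
    """Total revenue R = p1*q1 + p2*q2 with the demand equations of Example 4.26."""
    q1 = 200 - 3 * p1 - p2
    q2 = 150 - p1 - 2 * p2
    return p1 * q1 + p2 * q2


# R = 200 p1 + 150 p2 - 3 p1^2 - 2 p1 p2 - 2 p2^2, so the first-order conditions are
#   R_p1 = 200 - 6 p1 - 2 p2 = 0   ->   6 p1 + 2 p2 = 200
#   R_p2 = 150 - 2 p1 - 4 p2 = 0   ->   2 p1 + 4 p2 = 150
a11, a12, b1 = 6, 2, 200
a21, a22, b2 = 2, 4, 150

det = a11 * a22 - a12 * a21
p1 = (b1 * a22 - a12 * b2) / det
p2 = (a11 * b2 - b1 * a21) / det

# Second-order check: R_p1p1 = -6, R_p2p2 = -4, R_p1p2 = -2.
discriminant = (-6) * (-4) - (-2) ** 2

print("prices:", p1, p2)
print("revenue:", revenue(p1, p2))
print("discriminant:", discriminant)
```

Since the discriminant is positive and the second partial derivative with respect to the first price is negative, the stationary point is a relative maximum by the Second partial derivative test.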

4.7.2 Constrained Optimisation Using Lagrange Multipliers

The Lagrange multipliers method is one of the methods for solving constrained extrema problems. Recall
that for a p-variate function $f$ the necessary condition for a local extremum is that, at the extremum,
all partial derivatives, if they exist, must be zero. As a result, there are $p$ equations in $p$ unknowns
$(x_1, x_2, \ldots, x_p)$ that may be solved to find the potential extremum points (called stationary points). When
the variables $x_1, x_2, \ldots, x_p$ are constrained, there is (at least one) additional equation (the constraint)
but no additional variables, so that the set of equations is overdetermined. Hence, the method
introduces an additional variable (the Lagrange multiplier), denoted by $\lambda$, that enables us to solve the
problem.

More specifically, suppose we wish to find the values of $x_1, x_2, \ldots, x_p$ that maximise/minimise
$f(x_1, x_2, \ldots, x_p)$ subject to a constraint that permits only some values of $\mathbf{x}$.

More formally, the problem is to minimise/maximise the objective function $f(x_1, x_2, \ldots, x_p)$
subject to the constraint $g(x_1, x_2, \ldots, x_p) = k$.

The Lagrange multipliers method is based on setting up a new function, called the Lagrange
function,

$$L(x_1, x_2, \ldots, x_p, \lambda) = f(x_1, x_2, \ldots, x_p) - \lambda\left(g(x_1, x_2, \ldots, x_p) - k\right),$$

where $k$ is a constant and $\lambda$ is an additional variable called the Lagrange multiplier. From the Lagrange
function, stationary points are obtained by finding the partial derivatives of $L(x_1, x_2, \ldots, x_p, \lambda)$ with
respect to $x_1, x_2, \ldots, x_p$ and $\lambda$, setting each result to zero and then solving the resulting system of
equations simultaneously for $x_1, x_2, \ldots, x_p$ and $\lambda$. That is,

$$\frac{\partial L}{\partial x_1} = \frac{\partial f}{\partial x_1} - \lambda \frac{\partial g}{\partial x_1} = 0$$

$$\frac{\partial L}{\partial x_2} = \frac{\partial f}{\partial x_2} - \lambda \frac{\partial g}{\partial x_2} = 0$$

$$\vdots$$

$$\frac{\partial L}{\partial x_p} = \frac{\partial f}{\partial x_p} - \lambda \frac{\partial g}{\partial x_p} = 0$$

$$\frac{\partial L}{\partial \lambda} = -\left(g(x_1, x_2, \ldots, x_p) - k\right) = 0$$

The last equation simply recovers the constraint $g(x_1, x_2, \ldots, x_p) = k$.
Note: The Lagrange multiplier method provides us with a way of finding the stationary points but,
does not tell us whether the points yield a minimum, maximum or neither. To confirm if the
stationary points indeed yield a minimum, maximum or neither, second-order conditions must be
verified.

Example 4.27 (Consumer’s Utility level)


A consumer’s utility function is given by $U = x_1 x_2$, where $x_1$ is the quantity of good 1 that is bought
and $x_2$ is the quantity of good 2 that is bought. The price of good 1 is P10 while the price of good 2 is
P2. If the consumer’s income is P100, what will the consumer’s optimal utility level be?

Solution

Maximise $U = x_1 x_2$

subject to $10x_1 + 2x_2 = 100$

Using the Lagrange multiplier method, first set up the Lagrangian function

$$L(x_1, x_2, \lambda) = x_1 x_2 - \lambda(10x_1 + 2x_2 - 100)$$

or $L = x_1 x_2 + \lambda(100 - 10x_1 - 2x_2)$.

Differentiating $L$ with respect to $x_1$, $x_2$ and $\lambda$ and then setting each result to zero yields Equations (1) to (3).

$$\frac{\partial L}{\partial x_1} = x_2 - 10\lambda = 0 \qquad (1)$$

$$\frac{\partial L}{\partial x_2} = x_1 - 2\lambda = 0 \qquad (2)$$

$$\frac{\partial L}{\partial \lambda} = 100 - 10x_1 - 2x_2 = 0 \qquad (3)$$

Solve the 3 simultaneous equations:

From Equation (2), $x_1 = 2\lambda$,

so that $\lambda = x_1/2$.

From Equation (1), $x_2 - 10(x_1/2) = 0$,

i.e. $x_2 = 5x_1$.

By (3), $100 - 10x_1 - 2(5x_1) = 0$

$20x_1 = 100$

$x_1 = 5$

$x_2 = 5x_1 = 5(5) = 25$

The optimal value of $U$ is attained where $x_1 = 5$ and $x_2 = 25$.

That is, $U = x_1 x_2 = 5(25) = 125$.

Interpretation of $\lambda$: In this problem, we obtained $\lambda = x_1/2 = 2.5$.
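The solution can be checked numerically with the sketch below. It first substitutes the values of the two quantities and of the multiplier into Equations (1) to (3), and then scans the budget line 10𝑥₁ + 2𝑥₂ = 100 to confirm that no other affordable bundle gives a higher utility. The grid of 0.01-unit steps and the helper names are assumptions made for illustration.

```python
def utility(x1, x2):
    """U = x1 * x2."""
    return x1 * x2


def x2_on_budget(x1):
    """Quantity of good 2 that exhausts the income of 100 when x1 units of good 1 are bought."""
    return (100 - 10 * x1) / 2


# Candidate solution from the Lagrange conditions.
x1, x2, lam = 5, 25, 2.5

# Residuals of Equations (1), (2) and (3); all should be zero.
residuals = (x2 - 10 * lam, x1 - 2 * lam, 100 - 10 * x1 - 2 * x2)

# Scan the budget line for x1 between 0 and 10 in steps of 0.01.
grid = [i / 100 for i in range(0, 1001)]
best_x1 = max(grid, key=lambda v: utility(v, x2_on_budget(v)))

print("residuals:", residuals)
print("best x1 on the grid:", best_x1)
print("utility there:", utility(best_x1, x2_on_budget(best_x1)))
```

The scan is a rough second-order check: it confirms on this grid that the stationary point found by the Lagrange multiplier method is a maximum along the constraint.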
