Chapter 4 Differentiation
Part I: Derivatives of Univariate Functions
4.1 Introduction
From before the time of the great Greek scientist Archimedes (287–212 B.C.), mathematicians
were concerned with two important problems:
Problem 1: Finding the unique tangent line (if one exists) to a given curve at a given point on
the curve. See Figure 4.1.
Problem 2: Finding the area bounded by a given curve and the 𝑥-axis (see Figure 4.2).
The solution to problem 1 led to what is now called differential calculus, while the solution to
problem 2 gave rise to integral calculus. In this chapter, we study differential calculus, while
integral calculus is the subject of Chapter 5. As we shall see in due course, differential and
integral calculus are really two closely related parts of the same subject.
Remark 4.1: The central idea of this chapter is that of the slope of a curve at a given point.
That is, we introduce the derivative of a function and the important rules for finding
derivatives of various types of functions. We also show how the derivative is used to analyze
the rate of change of a quantity.
Geometrically, the derivative is interpreted as the slope of the tangent line to a curve at a
given point.
To obtain a suitable definition of a tangent line, we use the limit concept and the geometric
notion of a secant line. A secant line is a line that intersects a curve at two or more points. See
the graph of the function 𝑦 = 𝑓(𝑥).
Figure 4.3: Secant line PQ and the tangent line to the curve of 𝑦 = 𝑓(𝑥) at the point P.
Suppose we wish to define the tangent line at the point P. If Q is a different point on the curve,
the line 𝑃𝑄 is a secant line. If 𝑄 moves along the curve and approaches 𝑃 from the right (see
Figure 4.4), typical secant lines are 𝑃𝑄′, 𝑃𝑄″, 𝑃𝑄‴, etc. And, as 𝑄 approaches 𝑃 from the left,
typical secant lines are 𝑃𝑄1, 𝑃𝑄2 and so on. Show these on the curve in Figure 4.4. In both
cases, the secant lines approach the same limiting position. This common limiting position
of the secant lines is defined to be the tangent line to the curve at 𝑃. This definition of a
tangent line is reasonable and applies to curves in general, not just circles.
Figure 4.4: The tangent line is a limiting position of secant lines.
The slope of a curve at a point 𝑃 is the slope, if it exists, of the tangent line to the curve at 𝑃.
Now, since the tangent line at 𝑃 is a limiting position of secant lines 𝑃𝑄, we consider the slope
of the tangent line to be the limiting value of the slopes of the secant lines as 𝑄 approaches
𝑃.
4.3 Derivative
The central concept in the study of calculus is the concept of a derivative. Intuitively, the
derivative of a function 𝑓 at the value 𝑥0 is the slope of the line tangent to the graph of 𝑓 at
the point (𝑥0 , 𝑓(𝑥0 )). We use the notation 𝑓′(𝑥0 ) to denote the derivative of 𝑓 at 𝑥0 . The
function 𝑓′ (the derivative of 𝑓) assigns to each real number 𝑥0 at which the tangent slope exists a unique number 𝑓′(𝑥0).
Thus,
𝑓 ′ (𝑥0 ) = slope of a tangent line to the graph of 𝑓 at the point ( 𝑥0 , 𝑓(𝑥0 )).
Consider the function 𝑦 = 𝑓(𝑥), part of whose graph is given in Figure 4.5. To determine the
derivative function, we must find the slope of the line tangent to the curve at each point on
the curve at which there is a unique tangent line. Let (𝑥𝑜 , 𝑓(𝑥𝑜 )) be such a point, assuming
that 𝑓 is defined “near” 𝑥𝑜 .
Figure 4.5: The derivative of a function 𝑦 = 𝑓(𝑥) at the point (𝑥, 𝑓(𝑥)) equals the
slope/gradient of the tangent line to the graph at that point
To denote the change in a variable, say 𝑥, the symbol ∆𝑥 (read “delta 𝑥”) or the letter ℎ is
commonly used.
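Now, consider the secant line in Figure 4.5, joining the points $(x_0, f(x_0))$ and
$(x_0 + \Delta x, f(x_0 + \Delta x))$. What is the slope of the secant line?
If we define $\Delta y = f(x_0 + \Delta x) - f(x_0)$, then the slope $m$ of the secant line is
$$m = \frac{\text{change in } y}{\text{change in } x} = \frac{\Delta y}{\Delta x} = \frac{f(x_0 + \Delta x) - f(x_0)}{(x_0 + \Delta x) - x_0}.$$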
Clearly, as ∆𝑥 gets smaller, the secant line gets closer and closer to the tangent line (the
limiting position of secant lines). That is as ∆𝑥 approaches zero (∆𝑥 → 0), the slope of the
secant line approaches the slope of the tangent line. But, the slope of the tangent line at the
point $(x_0, f(x_0))$ is the derivative, $f'(x_0)$. Therefore, we have that
$$f'(x_0) = \lim_{\Delta x \to 0} m$$
$$\therefore\; f'(x_0) = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x}$$
The process of finding a derivative is called differentiation. This result leads to the following
definition (Definition 4.2):
Let the function $f$ be defined on an open interval containing the point $x_0$, and suppose that
$\lim_{\Delta x \to 0} \frac{f(x_0 + \Delta x) - f(x_0)}{\Delta x}$ exists and is finite. Then, $f$ is said to be differentiable at the value $x_0$ and
the derivative of $f$ at $x_0$, denoted by $f'(x_0)$, is given by
$$f'(x_0) = \lim_{\Delta x \to 0} \frac{f(x_0 + \Delta x) - f(x_0)}{\Delta x} \tag{2}$$
2.) For every number $x$ in the domain of $f'$,
$$f'(x) = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}. \tag{3}$$
Equation (3) says that 𝑓 ′ is the function, defined at every 𝑥 for which the limit in (3) exists
and is finite, that assigns to every 𝑥 in its domain the derivative 𝑓′(𝑥). Thus, the derivative of
a function is a function, and the value of the derivative at a given value 𝑥0 is the limit obtained
in (2) (Definition 4.2).
Remark 4.1: From Definition 4.2, it follows that the derivative 𝑓′(𝑥) of a function 𝑓 can exist
only if 𝑓(𝑥) is defined.
since $f'(x_0) = \frac{y - f(x_0)}{x - x_0}$ for the point $(x_0, f(x_0))$. Equation (4) is obtained by making $y - f(x_0)$
the subject of the formula in this equation, $f'(x_0) = \frac{y - f(x_0)}{x - x_0}$.
Example 4.1
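Solution
$$\frac{dy}{dx} = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x} = \lim_{\Delta x \to 0} \frac{(x + \Delta x)^2 - x^2}{\Delta x} = \lim_{\Delta x \to 0} \frac{x^2 + 2x\Delta x + (\Delta x)^2 - x^2}{\Delta x}$$
i.e.
$$\frac{dy}{dx} = \lim_{\Delta x \to 0} \frac{2x\Delta x + (\Delta x)^2}{\Delta x} = \lim_{\Delta x \to 0} (2x + \Delta x)$$
$$\therefore\; \frac{dy}{dx} = f'(x) = 2x$$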
b) From (a), we see that at every point of the form $(x, f(x)) = (x, x^2)$, the slope of the line
tangent to the curve is $f'(x) = \frac{dy}{dx} = 2x$. Therefore, the slope of the tangent line at the point
$(3, 9)$ is $f'(3) = 2(3) = 6$.
Thus, using the point-slope method we find the equation of the line tangent to the graph
of $y = x^2$ at the point $(3, 9) = (x_0, f(x_0))$ as
$$y - f(3) = f'(3)(x - 3)$$
$$\Rightarrow\; y - 9 = 6x - 18$$
i.e. $y = 6x - 9$
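The limiting process in Example 4.1 can be sketched numerically. The short Python illustration below is not part of the original notes; the function names `difference_quotient`, `square` and `tangent_line` are hypothetical. It evaluates the secant slope at $x_0 = 3$ for shrinking values of $\Delta x$ and encodes the tangent line $y = 6x - 9$ found above.

```python
def difference_quotient(f, x0, dx):
    """Slope of the secant line through (x0, f(x0)) and (x0 + dx, f(x0 + dx))."""
    return (f(x0 + dx) - f(x0)) / dx


def square(x):
    """The function y = x^2 from Example 4.1."""
    return x ** 2


# Secant slopes at x0 = 3 for dx = 0.1, 0.01, ..., 0.00001.
# For y = x^2 the quotient equals 6 + dx, so the slopes approach f'(3) = 6.
secant_slopes = [difference_quotient(square, 3.0, 10.0 ** (-k)) for k in range(1, 6)]


def tangent_line(x):
    """Tangent line to y = x^2 at the point (3, 9): y = 6x - 9."""
    return 6.0 * x - 9.0


if __name__ == "__main__":
    for k, slope in enumerate(secant_slopes, start=1):
        print(f"dx = 1e-{k}: secant slope = {slope:.6f}")
```

The tangent line passes through the point of tangency, so `tangent_line(3.0)` returns 9.0.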
Example 4.2 (Finding the derivative of a function involving the radical sign √ )
Using the first principle of differentiation, find the derivative of 𝑦 = √𝑥 and calculate the
slope of the tangent line at the point (4, 2).
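i.e.
$$f'(x) = \lim_{\Delta x \to 0} \frac{\sqrt{x + \Delta x} - \sqrt{x}}{\Delta x}$$
To simplify the expression on the RHS, notice that the numerator, $\sqrt{x + \Delta x} - \sqrt{x}$, looks like
one factor, $(a - b)$, of the two factors in a difference of two squares, $(a - b)(a + b) = a^2 - b^2$:
$$(\sqrt{x + \Delta x} - \sqrt{x})(\sqrt{x + \Delta x} + \sqrt{x}) = (\sqrt{x + \Delta x})^2 - (\sqrt{x})^2.$$
Thus, multiply both numerator and denominator by $\sqrt{x + \Delta x} + \sqrt{x}$ (the conjugate of the
numerator) to obtain
$$f'(x) = \lim_{\Delta x \to 0} \frac{(\sqrt{x + \Delta x} - \sqrt{x})(\sqrt{x + \Delta x} + \sqrt{x})}{\Delta x(\sqrt{x + \Delta x} + \sqrt{x})} = \lim_{\Delta x \to 0} \frac{(\sqrt{x + \Delta x})^2 - (\sqrt{x})^2}{\Delta x(\sqrt{x + \Delta x} + \sqrt{x})}$$
i.e.
$$f'(x) = \lim_{\Delta x \to 0} \frac{(x + \Delta x) - x}{\Delta x(\sqrt{x + \Delta x} + \sqrt{x})} = \lim_{\Delta x \to 0} \frac{\Delta x}{\Delta x(\sqrt{x + \Delta x} + \sqrt{x})}$$
i.e.
$$f'(x) = \lim_{\Delta x \to 0} \frac{1}{\sqrt{x + \Delta x} + \sqrt{x}} = \frac{1}{\sqrt{x + 0} + \sqrt{x}} = \frac{1}{\sqrt{x} + \sqrt{x}} = \frac{1}{2\sqrt{x}} \quad \left(= \frac{1}{2} \cdot \frac{1}{\sqrt{x}} = \frac{1}{2} x^{-1/2}\right)$$
$$\therefore\; f'(x) = \frac{1}{2} x^{-1/2}$$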
Finding the slope of the tangent line at the point (4, 2), i.e. at $x = 4$:
$$f'(4) = \frac{1}{2}(4)^{-1/2} = \frac{1}{2} \cdot \frac{1}{\sqrt{4}} = \frac{1}{2} \cdot \frac{1}{2} = \frac{1}{4}$$
Exercise: Verify that the equation of the line tangent to the graph of 𝑓(𝑥) = √𝑥 at the point
(4, 2) is 𝑦 = 1⁄4 𝑥 + 1.
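The conjugate trick of Example 4.2 also has a practical numerical reading: the rationalised form $1/(\sqrt{x + \Delta x} + \sqrt{x})$ avoids subtracting two nearly equal square roots. The following Python sketch is an illustration only, with hypothetical function names. It compares the two forms of the difference quotient and checks the tangent line stated in the exercise.

```python
import math


def naive_quotient(x, dx):
    """Difference quotient of sqrt written directly from the definition."""
    return (math.sqrt(x + dx) - math.sqrt(x)) / dx


def rationalised_quotient(x, dx):
    """The same quotient after multiplying by the conjugate, as in Example 4.2."""
    return 1.0 / (math.sqrt(x + dx) + math.sqrt(x))


def sqrt_derivative(x):
    """Closed form f'(x) = (1/2) x^(-1/2), valid for x > 0."""
    return 0.5 * x ** -0.5


slope_at_4 = sqrt_derivative(4.0)


def tangent_at_4(x):
    """Point-slope form of the tangent line at (4, 2): y = 2 + f'(4)(x - 4)."""
    return 2.0 + slope_at_4 * (x - 4.0)


if __name__ == "__main__":
    for dx in (1e-2, 1e-6, 1e-10):
        print(dx, naive_quotient(4.0, dx), rationalised_quotient(4.0, dx))
    print("tangent at x = 0:", tangent_at_4(0.0))  # the intercept of y = x/4 + 1
```

Both quotients tend to $f'(4) = 1/4$; the rationalised version stays accurate even for very small $\Delta x$, where the naive version loses digits to cancellation.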
The function 𝑓(𝑥) is differentiable on an open interval (𝑎, 𝑏) if 𝑓 ′ (𝑥) exists for every 𝑥 in
the interval (𝑎, 𝑏).
Example 4.3
From Example 4.2, the derivative of the function $y = \sqrt{x}$ is $f'(x) = \frac{1}{2\sqrt{x}}$. Clearly this
derivative exists for every $x$ in the open interval $0 < x < b$. Thus, $f(x) = \sqrt{x}$ is differentiable
on any interval of the form $(0, b)$, where $b > 0$.
Example 4.4
Solution
i.e. 𝑓 ′ (𝑥) = 2𝑥
The function 𝑓 ′ (𝑥) is defined for all values 𝑥 on the real line. Hence, 𝑓(𝑥) = 𝑥 2 is
differentiable on the interval (−∞, ∞) = (−∞ < 𝑥 < ∞).
In this section, we discuss derivatives of sin 𝑥, cos 𝑥 and related functions. The material in
this section depends on knowledge of basic trigonometry (see Chapter 2 of this course). In
this chapter, we assume unless otherwise specifically stated, that 𝒙 (𝒐𝒓 𝜽) is measured in
radians and that the Domain (sin 𝑥) = 𝐷𝑜𝑚𝑎𝑖𝑛 (cos 𝑥) = ℝ = (−∞, ∞).
$$\frac{d}{dx}\cos x = -\sin x \tag{2}$$
Theorem 4.3.1(b)
If $u$ is a differentiable function of $x$, denoted $u(x)$, we may apply the chain rule together with
Equations (1) and (2) to conclude that
$$\frac{d}{dx}\sin u(x) = \frac{d \sin u(x)}{du} \cdot \frac{du}{dx} \quad \text{by the Chain rule.}$$
Notice that Theorem 4.3.1(a) is a special case of Theorem 4.3.1(b) when
$u(x) = x$, in which case $\frac{du(x)}{dx} = 1$ (for example, by Eq. (3), $\frac{d}{dx}\sin x = \cos x \cdot \frac{dx}{dx} = \cos x$).
Example 4.5
Evaluate $\frac{d}{dx}\sin x^2$.
Solution: Let $u = x^2$ so that $f(u) = \sin u$. Then, using the chain rule and the fact that
$\frac{d}{dx}\sin x = \cos x$, we have that
$$\frac{d}{dx}\sin x^2 = \cos u \cdot \frac{du}{dx} = \cos x^2 \, (2x)$$
i.e. $\frac{d}{dx}\sin x^2 = 2x \cos x^2$ (as generalised in Equation (3) above).
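Example 4.6
Evaluate $\frac{d}{dx}\cos\sqrt{x}$.
Solution
Here, the outer function is $\cos(\cdot)$ while the inner function is $\sqrt{x}$. So set the inner function to
$u = \sqrt{x} \Rightarrow \frac{du}{dx} = \frac{1}{2}x^{-1/2} = \frac{1}{2\sqrt{x}}$, and using Equation (4), we have that
$$\frac{d}{dx}\cos\sqrt{x} = -\sin u \cdot \frac{du}{dx} = (-\sin\sqrt{x})\left(\frac{1}{2\sqrt{x}}\right)$$
$$\therefore\; \frac{d}{dx}\cos\sqrt{x} = \frac{-\sin\sqrt{x}}{2\sqrt{x}}$$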
Solution: The function $f(x) = (x + \sin x)^4$ is of the form $f(x) = [u(x)]^n$, a power function,
with $u(x) = x + \sin x$. And so by the General power rule, the derivative of such a function is
$$\frac{df(x)}{dx} = n[u(x)]^{n-1} \cdot \frac{du}{dx}.$$
First, evaluate $\frac{du}{dx} = \frac{d(x + \sin x)}{dx}$ using the sum rule:
$$\frac{du}{dx} = \frac{d}{dx}(x + \sin x) = \frac{dx}{dx} + \frac{d \sin x}{dx} \quad \text{(Derivative of a Sum or a Difference rule)}$$
$$\Rightarrow\; \frac{du}{dx} = 1 + \cos x \quad \text{since } \frac{d \sin x}{dx} = \cos x.$$
Hence, on writing in terms of the original variable $x$, we get
$$\frac{d(x + \sin x)^4}{dx} = 4(x + \sin x)^3 (1 + \cos x) \quad \text{by the Chain rule}$$
$$\therefore\; \frac{d}{dx}(x + \sin x)^4 = 4(1 + \cos x)(x + \sin x)^3$$
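A quick numerical sanity check of the three chain-rule results above (for $\sin x^2$, $\cos\sqrt{x}$ and $(x + \sin x)^4$) can be sketched as follows. This is an illustration under stated assumptions rather than part of the notes: the helper `central_difference`, the dictionary `examples` and the test point $x = 1.3$ are arbitrary choices.

```python
import math


def central_difference(f, x, h=1e-6):
    """Symmetric difference quotient, a numerical stand-in for f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


# Each entry pairs a function with the derivative obtained by the chain rule.
examples = {
    "sin(x^2)": (
        lambda x: math.sin(x ** 2),
        lambda x: 2 * x * math.cos(x ** 2),
    ),
    "cos(sqrt(x))": (
        lambda x: math.cos(math.sqrt(x)),
        lambda x: -math.sin(math.sqrt(x)) / (2 * math.sqrt(x)),
    ),
    "(x + sin x)^4": (
        lambda x: (x + math.sin(x)) ** 4,
        lambda x: 4 * (1 + math.cos(x)) * (x + math.sin(x)) ** 3,
    ),
}


def max_error(x=1.3):
    """Largest gap between the numerical slope and the chain-rule formula."""
    return max(abs(central_difference(f, x) - df(x)) for f, df in examples.values())


if __name__ == "__main__":
    print("largest discrepancy at x = 1.3:", max_error())
```

A small discrepancy does not prove the formulas, but a large one would flag an algebra slip.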
Derivatives of other Trigonometric Functions
Note: For a given right-angled triangle containing the angle 𝜃, the numbers sin 𝜃 and cos 𝜃
are only two of six possible ratios of lengths of the sides. This section describes how each
of the remaining four ratios defines a trigonometric function, and how each function is related
to the sine and cosine functions.
[Aside: Figure 4.6 shows a right-angled triangle with angle $\theta$, opposite side of length $y$,
adjacent side of length $x$ and hypotenuse of length $h$. From Figure 4.6, four other
trigonometric functions of $\theta$ may be defined in terms of the basic functions $\sin\theta$ and $\cos\theta$,
i.e. $\tan\theta = \frac{y}{x}$.
But $\sin\theta = \frac{y}{h}$ since $\sin\theta = \frac{\text{opposite}}{\text{hypotenuse}}$, and $\cos\theta = \frac{x}{h}$ since $\cos\theta = \frac{\text{adjacent}}{\text{hypotenuse}}$, so
$$\tan\theta = \frac{y/h}{x/h} = \frac{\sin\theta}{\cos\theta}]$$
i.e.
$$\tan\theta = \frac{\sin\theta}{\cos\theta} \tag{5}$$
Reciprocal Identities: Relationships (6), (7) and (8) are called Reciprocal identities.
$$\frac{d}{dx}\cos x = -\sin x$$
$$\frac{d}{dx}\tan x = \sec^2 x$$
$$\frac{d}{dx}\cot x = -\csc^2 x$$
$$\frac{d}{dx}\sec x = \sec x \tan x$$
$$\frac{d}{dx}\csc x = -\csc x \cot x$$
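As a worked illustration of how these entries follow from Equation (5) and the derivatives of $\sin x$ and $\cos x$, the derivative of $\tan x$ can be obtained with the quotient rule. This is a sketch of one standard derivation, assuming $\cos x \neq 0$; it is not reproduced from the notes.

```latex
\begin{aligned}
\frac{d}{dx}\tan x
  &= \frac{d}{dx}\left(\frac{\sin x}{\cos x}\right)
   = \frac{\cos x \cdot \frac{d}{dx}\sin x - \sin x \cdot \frac{d}{dx}\cos x}{\cos^2 x} \\
  &= \frac{\cos x \cdot \cos x - \sin x \cdot (-\sin x)}{\cos^2 x}
   = \frac{\cos^2 x + \sin^2 x}{\cos^2 x}
   = \frac{1}{\cos^2 x}
   = \sec^2 x .
\end{aligned}
```

The remaining entries can be checked the same way, writing $\cot x = \cos x / \sin x$, $\sec x = 1/\cos x$ and $\csc x = 1/\sin x$.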
Note: If 𝑎 and 𝑥 are positive and 𝑎 ≠ 1, then there is only one real value for 𝑦. See Theorem 4.3.3.
Theorem 4.3.3
If 𝑎 > 1, then the logarithmic function 𝑓(𝑥) = log 𝑎 𝑥 is a one-to-one continuous increasing function
of 𝑥 with domain (0, ∞) and range ℝ = (−∞, ∞).
ii) $\log_a \frac{x}{y} = \log_a x - \log_a y$
iv) $\log_a x = \frac{1}{\log_x a}$ for $x > 0$ and $x \neq 1$ (i.e. log of $x$ to the base $a$ equals the reciprocal of log of $a$ to
the base $x$)
v)
4.3.3.1 Natural Logarithms (𝒚 = 𝐥𝐧 𝒙)
In practice, two bases 𝒂 are used: base 𝑎 = 10 and base 𝑎 = 𝑒 (= 2.71828 … ). The logarithm of 𝒙 to
the base 𝒆 is called the natural logarithm of 𝒙, denoted by ln 𝑥. Thus, log 𝑒 𝑥 = ln 𝑥.
Properties
$\log_e x = y$
$\therefore\; x = e^y$
$\therefore\; \ln e^x = x$
$\therefore\; e^{\ln x} = x$
2. $\frac{d}{dx}\log_a x = \frac{1}{x \ln a}$ (2)
Consider Equation (1) above. By property (iv) of logarithmic functions above, $\log_a x = \frac{1}{\log_x a}$, so that
$$\log_a e = \frac{1}{\log_e a} = \frac{1}{\ln a}$$
i.e. $\log_a e = \frac{1}{\ln a}$, so that Equation (1) becomes Equation (2).
3. $\frac{d \ln x}{dx} = \frac{1}{x}$
Proof
4.3.3.2 Differentiating Composite Logarithmic Functions
Most logarithmic functions that we shall encounter in applications are actually composite (implicit)
functions of the form 𝑦 = ln 𝑢, where 𝑢 = 𝑢(𝑥) is a differentiable function of 𝑥. We can differentiate such
functions using Equation (3) and the chain rule, as illustrated below.
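$$\frac{d \ln|u|}{dx} = \frac{d \ln|u|}{du} \cdot \frac{du}{dx}$$
$$\Rightarrow\; \frac{d}{dx}\ln|u(x)| = \frac{1}{u \ln e} \cdot \frac{du}{dx} \quad \text{for } u \neq 0.$$
But, we know that $\ln e = 1$, so that
$$\therefore\; \frac{d}{dx}\ln|u(x)| = \frac{1}{u} \cdot \frac{du}{dx} \quad \text{for } u \neq 0 \tag{4}$$
In general,
$$\frac{d}{dx}[\ln f(x)] = \frac{f'(x)}{f(x)} \tag{5}$$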
Example 4.8
i) $f(x) = 5 \ln x$
ii) $y = \ln(x^3 + 1)$
iii) $f(x) = \frac{\ln x}{x^2}$
iv) $f(x) = \ln(\sin x)$
Solutions
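i) $f(x) = 5 \ln x$
Here $u = x$
$$\Rightarrow\; f'(x) = \frac{df(x)}{dx} = \frac{d(5 \ln x)}{dx} \cdot \frac{dx}{dx} = 5\,\frac{d \ln x}{dx} \cdot 1 = \frac{5}{x} \quad \text{for } x > 0, \text{ since } \frac{d}{dx}\ln x = \frac{1}{x}.$$
$$\therefore\; \frac{d(5 \ln x)}{dx} = \frac{5}{x}$$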
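ii) $y = \ln(x^3 + 1)$
This function is of the form $y = \ln u$ with $u = x^3 + 1$. So, by the Chain rule
$$\frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx}$$
i.e.
$$\frac{dy}{dx} = \frac{d \ln u}{du} \cdot \frac{d(x^3 + 1)}{dx}$$
$$\Rightarrow\; \frac{dy}{dx} = \frac{1}{u}(3x^2) = \frac{1}{x^3 + 1} \cdot (3x^2) = \frac{3x^2}{x^3 + 1}$$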
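iii) $f(x) = \frac{\ln x}{x^2}$
$$f'(x) = \frac{x^2 \cdot \frac{d}{dx}\ln x - \ln x \cdot \frac{d}{dx}x^2}{(x^2)^2}$$
$$= \frac{x^2\left(\frac{1}{x}\right) - \ln x \,(2x)}{x^4} \quad \text{since } \frac{d}{dx}\ln x = \frac{1}{x}$$
$$= \frac{x - 2x \ln x}{x^4} = \frac{1 - 2 \ln x}{x^3}$$
$$\therefore\; \frac{d}{dx}\left(\frac{\ln x}{x^2}\right) = \frac{1 - 2 \ln x}{x^3} \quad \text{for } x > 0.$$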
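we have that
$$\frac{d \ln(\sin x)}{dx} = \frac{d \ln u}{du} \cdot \frac{d \sin x}{dx}$$
$$= \frac{1}{u} \cdot \cos x \quad \text{as } \frac{d \ln u}{du} = \frac{1}{u} \text{ and } \frac{d \sin x}{dx} = \cos x$$
$$= \frac{1}{\sin x} \cdot \cos x \quad \text{on reverting to the original variable } x$$
$$\therefore\; \frac{d \ln(\sin x)}{dx} = \frac{\cos x}{\sin x} = \cot x.$$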
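Equation (5), $\frac{d}{dx}\ln f(x) = f'(x)/f(x)$, translates directly into a small helper. The Python sketch below is illustrative only: the names `log_derivative`, `d_ln_cubic`, `d_ln_sin` and `numeric_slope` are hypothetical, and it assumes $f(x) > 0$ at the points used. It applies the rule to parts (ii) and (iv) of Example 4.8.

```python
import math


def log_derivative(f, df, x):
    """Equation (5): d/dx ln f(x) = f'(x) / f(x), assuming f(x) > 0."""
    return df(x) / f(x)


def d_ln_cubic(x):
    """Example 4.8 (ii): derivative of ln(x^3 + 1), i.e. 3x^2 / (x^3 + 1)."""
    return log_derivative(lambda t: t ** 3 + 1, lambda t: 3 * t ** 2, x)


def d_ln_sin(x):
    """Example 4.8 (iv): derivative of ln(sin x), i.e. cot x."""
    return log_derivative(math.sin, math.cos, x)


def numeric_slope(f, x, h=1e-6):
    """Symmetric difference quotient used as an independent check."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


if __name__ == "__main__":
    print(d_ln_cubic(2.0), numeric_slope(lambda t: math.log(t ** 3 + 1), 2.0))
    print(d_ln_sin(1.0), numeric_slope(lambda t: math.log(math.sin(t)), 1.0))
```

The same helper covers any composite logarithm once $f$ and $f'$ are supplied.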
To differentiate a logarithmic function to a base different from 𝑒, first convert the logarithm to the
natural logarithm via the Change–of–base formula (STA 101 Lecture Notes!), and then differentiate
the resulting expression.
[ASIDE: Change of Base Formula: Allows us to rewrite a logarithm in terms of logs written with
another base. This is especially helpful when using a calculator to evaluate a log to any base other
than 10 or $e$. Assume that $x$, $a$, and $b$ are all positive and that $a \neq 1$, $b \neq 1$. Then, by the change-of-base
formula, $\log_a x = \frac{\log_b x}{\log_b a}$. Notice that the base has been changed from $a$ to $b$.]
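Consider the function $f(x) = \log_a u$, where $u$ is a differentiable function of $x$. By the change-of-base
formula
$$f(x) = \log_a u = \frac{\log_e u}{\log_e a} = \frac{\ln u}{\ln a} \quad \text{for } a, u > 0 \text{ and } a \neq 1.$$
Differentiating with respect to $x$,
$$\frac{d}{dx}\log_a u = \frac{1}{\ln a}\left(\frac{d}{dx}\ln u\right) \quad \text{since } a \text{ is not a function of } x \text{ (is constant), so that } \ln a \text{ is itself a constant}$$
$$= \frac{1}{\ln a}\left(\frac{d \ln u}{du} \cdot \frac{du}{dx}\right)$$
$$\frac{d}{dx}\log_a u = \frac{1}{\ln a}\left(\frac{1}{u} \cdot \frac{du}{dx}\right)$$
$$\therefore\; \frac{d}{dx}\log_a u(x) = \frac{1}{u \ln a} \cdot \frac{du}{dx} \quad \text{for } a, u > 0 \text{ and } a \neq 1$$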
Example 4.9
Differentiate 𝑦 = log 2 𝑥
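Solution: This is the case where the base $a \neq e$, so by the change-of-base formula we have that
$$y = \log_a u = \frac{\ln u}{\ln a} \quad \text{and} \quad \frac{dy}{dx} = \frac{1}{u \ln a} \cdot \frac{du}{dx}, \quad u > 0.$$
Here $a = 2$ and $u = x$, so
$$\frac{d}{dx}\log_2 x = \frac{1}{\ln 2}\left(\frac{d \ln x}{dx}\right) = \frac{1}{\ln 2}\left(\frac{1}{x}\right)$$
$$\therefore\; \frac{d}{dx}\log_2 x = \frac{1}{x \ln 2}$$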
Note: i) $a^{-x} = \frac{1}{a^x}$.
ii) If $x$ is a rational number, i.e. $x = \frac{p}{q}$, $q > 0$, then
$a^x = a^{p/q} = \sqrt[q]{a^p}$ (the $q$th root of $a^p$)
Theorem 4.4.3
Properties
i) $a^{x+y} = a^x a^y$
ii) $a^{x-y} = \frac{a^x}{a^y}$ $(= a^x a^{-y})$
iii) $(a^x)^y = a^{xy}$
iv) $(ab)^{xy} = a^{xy} b^{xy}$
Derivatives
$$\frac{dy}{dx} = \frac{d(a^x)}{dx} = a^x \ln a \tag{1}$$
This result can be generalized to obtain the derivative of an exponential function of the form
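$$\ln(e^x) = x \tag{3}$$
Differentiating both sides of equation (3) with respect to $x$ and using the Chain rule that
$\frac{d \ln u}{dx} = \frac{1}{u} \cdot \frac{du}{dx}$ with $u = e^x$ on the LHS (since, if $u = e^x$, the LHS is $\ln u$) yields
$$\frac{1}{e^x} \cdot \frac{d e^x}{dx} = \frac{dx}{dx}$$
i.e.
$$\frac{1}{e^x} \cdot \frac{d e^x}{dx} = 1$$
$$\therefore\; \frac{d e^x}{dx} = e^x \tag{4}$$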
ii) 𝑓(𝑥) = 𝑒 𝑥 and its multiples are the only functions that have this property, making them very useful
in applications.
Combining Equation (4) with the Chain rule, we can derive the following rule for differentiating
composite functions of the form $y = e^u$, where $u$ is a differentiable function of $x$.
where $\ln e = 1$, so that
$$\frac{d e^{u(x)}}{dx} = e^{u(x)} \cdot \frac{du(x)}{dx}$$
or just
$$\frac{d e^u}{dx} = e^u \cdot \frac{du}{dx} \tag{5}$$
Example 4.10: Find $\frac{d}{dx} e^{x^3 + 3x}$.
Solution: This function is of the form $f(x) = e^{u(x)}$, where $u$ is a differentiable function of $x$, $u(x)$. Thus,
$$\frac{d e^{u(x)}}{dx} = e^{u(x)} \cdot \frac{du(x)}{dx}$$
Now, let $u(x) = x^3 + 3x \Rightarrow \frac{du(x)}{dx} = 3x^2 + 3$
$$\Rightarrow\; \frac{d}{dx} e^{x^3 + 3x} = e^{x^3 + 3x}(3x^2 + 3)$$
$$\therefore\; \frac{d}{dx} e^{x^3 + 3x} = (3x^2 + 3)e^{x^3 + 3x}$$
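Solution
$$\Rightarrow\; \frac{df}{dx} = \frac{d e^{\sqrt{x}}}{dx} = e^{u(x)} \cdot \frac{du(x)}{dx} = e^{\sqrt{x}} \cdot \frac{1}{2\sqrt{x}}$$
$$\therefore\; \frac{d e^{\sqrt{x}}}{dx} = \frac{1}{2\sqrt{x}} e^{\sqrt{x}}$$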
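Equation (5) for $e^{u(x)}$ can be written as a single reusable function. The sketch below is an illustration, not part of the notes; `exp_chain_rule`, `d_example_4_10`, `d_example_4_11` and `numeric_slope` are hypothetical names. It applies the rule to $e^{x^3 + 3x}$ and $e^{\sqrt{x}}$ and compares each result with a numerical slope.

```python
import math


def exp_chain_rule(u, du, x):
    """Equation (5): d/dx e^{u(x)} = e^{u(x)} * u'(x)."""
    return math.exp(u(x)) * du(x)


def d_example_4_10(x):
    """Derivative of e^(x^3 + 3x): (3x^2 + 3) e^(x^3 + 3x)."""
    return exp_chain_rule(lambda t: t ** 3 + 3 * t, lambda t: 3 * t ** 2 + 3, x)


def d_example_4_11(x):
    """Derivative of e^sqrt(x) for x > 0: e^sqrt(x) / (2 sqrt(x))."""
    return exp_chain_rule(math.sqrt, lambda t: 1.0 / (2.0 * math.sqrt(t)), x)


def numeric_slope(f, x, h=1e-6):
    """Symmetric difference quotient used as an independent check."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


if __name__ == "__main__":
    x = 0.5
    print(d_example_4_10(x), numeric_slope(lambda t: math.exp(t ** 3 + 3 * t), x))
    print(d_example_4_11(x), numeric_slope(lambda t: math.exp(math.sqrt(t)), x))
```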
4.4 Higher Order Derivatives
The derivative of a function 𝑦 = 𝑓(𝑥) is itself a function, 𝑦′ = 𝑓′(𝑥). Therefore, we can take
the derivative of $f'(x)$, which is referred to as the second derivative of $f(x)$ and is written
$f''(x)$ or $f^{(2)}(x)$. This differentiation process can be continued to find the third, fourth, and
successive derivatives of $f(x)$, which are called higher order derivatives of $f(x)$. As the
“prime” notation for derivatives becomes messier as successive higher order derivatives are
taken, it is preferable to use the numerical notation $f^{(n)}(x)$ or $y^{(n)}(x)$ to denote the $n$th
derivative of $f(x)$.
Example 4.12: Find the first, second and third derivatives of f x 5x 4 3x3 7 x 2 9 x 2 .
Solution
f ' x 20 x3 9 x 2 14 x 9
f '' x f (2) 60 x 2 18 x 14
f ''' x f (3) 120 x 2 18
Example 4.13: Find the first, second and third derivatives of $f(x) = \sin^2 x$.
Solution
Let $u = \sin x$, so that $f = u^2$. By the chain rule,
$$f^{(1)}(x) = \frac{d(\sin x)^2}{dx} = \frac{d u^2}{du} \cdot \frac{du}{dx}$$
i.e. $f^{(1)}(x) = 2u \cdot \cos x$
i.e. $f^{(1)}(x) = 2 \sin x \cdot \cos x$
$$f^{(2)}(x) = \frac{d}{dx}\left(2 \cos x \sin x\right) = 2\cos x \, \frac{d}{dx}\sin x + \sin x \, \frac{d}{dx}\left(2\cos x\right)$$
$$f^{(2)}(x) = 2\cos x(\cos x) + \sin x(-2\sin x)$$
$$f^{(2)}(x) = 2\cos^2 x - 2\sin^2 x$$
$$f^{(2)}(x) = 2\left(\cos^2 x - \sin^2 x\right)$$
Similarly,
$$f^{(3)}(x) = \frac{d}{dx}\{2\cos^2 x - 2\sin^2 x\} = \frac{d}{dx}\,2\cos^2 x - \frac{d}{dx}\,2\sin^2 x$$
$$f^{(3)}(x) = 4\cos x(-\sin x) - 4\sin x \cos x \quad \text{(noticing that } \cos^2 x = (\cos x)^2 \text{ and } \sin^2 x = (\sin x)^2\text{)}$$
$$f^{(3)}(x) = -4\cos x \sin x - 4\cos x \sin x$$
$$f^{(3)}(x) = -8\cos x \sin x$$
Example 4.14: Verify that $f^{(3)}(4) = \frac{3}{256}$ if $f(x) = \sqrt{x}$.
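Higher order derivatives are obtained by differentiating the previous derivative, so each closed form can be checked against a numerical slope of the one before it. The sketch below is illustrative, with hypothetical names and an arbitrary test point. It does this for the derivatives of $\sin^2 x$, and it also evaluates the third derivative of $\sqrt{x}$ at $x = 4$, computed here by repeated use of the power rule as $\frac{3}{8}x^{-5/2}$.

```python
import math


def central_difference(f, x, h=1e-5):
    """Symmetric difference quotient, a numerical stand-in for f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def f0(x):
    return math.sin(x) ** 2


def f1(x):
    return 2.0 * math.sin(x) * math.cos(x)


def f2(x):
    return 2.0 * (math.cos(x) ** 2 - math.sin(x) ** 2)


def f3(x):
    return -8.0 * math.cos(x) * math.sin(x)


def stage_errors(x=0.9):
    """Compare each derivative with the numerical slope of the previous one."""
    stages = [f0, f1, f2, f3]
    return [abs(central_difference(lower, x) - higher(x))
            for lower, higher in zip(stages, stages[1:])]


def sqrt_third_derivative(x):
    """Third derivative of x^(1/2) by the power rule: (3/8) x^(-5/2)."""
    return 0.375 * x ** -2.5


if __name__ == "__main__":
    print("stage errors:", stage_errors())
    print("third derivative of sqrt at 4:", sqrt_third_derivative(4.0), "vs", 3 / 256)
```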
Sketching a curve from knowledge of the signs of the first and second derivatives is a useful
way of finding the approximate shape of the graph of a function. When curve sketching,
making a sign chart of the derivatives is an easy way to identify possible points of inflection
and to find the relative minima and maxima, which are both key in sketching the path of a
function. This important technique will be illustrated in Section 4.5.2.
Reading: Grossman, Section 4.3, p. 207; STA 102 Lecture Notes
One of the most important applications of differential calculus is to find extreme function
values. The calculus methods for finding the maximum and minimum values of a function are
the basic tools of optimization theory, a very active branch of mathematical research applied
to nearly all fields of practical endeavor.
Theorem 4.5.1: Suppose a function $f$ is differentiable on the open interval $(a, b)$ and continuous
on the closed interval $[a, b]$. Then, if
i) $\frac{df}{dx} > 0$ for every $x$ in the open interval $(a, b)$, $f$ is increasing on the closed interval $[a, b]$.
ii) $\frac{df}{dx} < 0$ for every $x$ in the open interval $(a, b)$, $f$ is decreasing on the closed interval $[a, b]$.
Example 4.15
Let 𝑦 = 𝑥 3 + 3𝑥 2 − 9𝑥 − 10
For what values of 𝑥 is this function increasing or decreasing? Sketch the curve.
Solution
i.e 3𝑥 2 + 6𝑥 − 9 = 0
3(𝑥 2 + 2𝑥 − 3) = 0
𝑥 2 + 2𝑥 − 3 = 0
𝑥 2 + 3𝑥 − 𝑥 + 3(−1) = 0
𝑥(𝑥 + 3) − (𝑥 + 3) = 0
(𝑥 + 3)(𝑥 − 1) = 0
So, either 𝑥 = −3 or 𝑥 = 1. These are the critical points, and lead to 5 intervals:
Now, for each critical value ($x = -3$, $x = 1$) and for $x = 0$, find the $(x, y)$ point to be plotted to
sketch the curve. Thus, at $x = -3$, $y = 17$ → point $(-3, 17)$; at $x = 1$, $y = -15$ →
point $(1, -15)$; at $x = 0$, $y = -10$ (the $y$-intercept) → point $(0, -10)$.
Then use this information to construct a sign chart of the derivatives of the given function
𝑦 = 𝑥 3 + 3𝑥 2 − 9𝑥 − 10 in order to sketch the path of the function, as illustrated in Table
4.1.
Table 4.1: Sign chart of $\frac{df}{dx} = 3x^2 + 6x - 9$
Using the three points (−3,17), (1, −15) 𝑎𝑛𝑑 (0, −10) together with the information
contained in Table 4.1, we can sketch the curve 𝑦 = 𝑥 3 + 3𝑥 2 − 9𝑥 − 10 as in Figure 4.1.
i.) The point (−3,17) is a maximum point in the sense that “near” 𝑥 = −3, 𝑦 takes its
largest value of 𝑦 = 17. However, there is no global (or absolute) maximum value for
the function 𝑓 since as 𝑥 increases beyond the value 𝑥 = 1, 𝑦 increases without
bound. For example, when 𝑥 = 10, then 𝑦 = 1200, which is far bigger than 𝑦 = 17.
Thus, the point (−3,17) is called a local maximum or relative maximum in the sense
that the function achieves its maximum value there for points near 𝑥 = −3.
ii.) similarly, we call the point (1, −15) a local minimum or relative minimum.
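The steps of Example 4.15 (solve $f'(x) = 0$, then test the sign of $f'$ between the critical points) can be sketched in Python. This is a minimal illustration; `quadratic_roots`, `sign_chart` and the test points $-4$, $0$ and $2$ are choices made here, not taken from the notes.

```python
import math


def f(x):
    """The function of Example 4.15."""
    return x ** 3 + 3 * x ** 2 - 9 * x - 10


def df(x):
    """Its derivative, 3x^2 + 6x - 9."""
    return 3 * x ** 2 + 6 * x - 9


def quadratic_roots(a, b, c):
    """Real roots of ax^2 + bx + c = 0, assuming a non-negative discriminant."""
    root = math.sqrt(b * b - 4 * a * c)
    return sorted(((-b - root) / (2 * a), (-b + root) / (2 * a)))


# Critical points: solutions of 3x^2 + 6x - 9 = 0.
critical_points = quadratic_roots(3, 6, -9)


def sign_chart(test_points=(-4.0, 0.0, 2.0)):
    """Behaviour of f on each interval, read off from the sign of f' at one test point."""
    return ["increasing" if df(t) > 0 else "decreasing" for t in test_points]


if __name__ == "__main__":
    print("critical points:", critical_points)
    print("sign chart:", sign_chart())
    print("points to plot:", [(x, f(x)) for x in (*critical_points, 0.0)])
```

One test point per interval is enough because $f'$ is continuous and can only change sign at a critical point.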
Let a function 𝑓 have a local maximum or minimum at 𝑥0. Then 𝑥0 is a critical point of 𝑓.
Remark 4.5.1: At a critical point (𝑥0 , 𝑓(𝑥0 )) a function may have a local maximum, a local
minimum or neither.
There are two ways of determining when a critical point is a local (relative) maximum or
minimum: the First derivative test and the Second derivative test.
Let 𝑥0 be a critical point of a function 𝑓 with 𝑥0 in an open interval (𝑎, 𝑏). Suppose that 𝑓 is
continuous for 𝑎 ≤ 𝑥 ≤ 𝑏 and differentiable for 𝑎 < 𝑥 < 𝑏, except possibly at 𝑥0 itself. Then,
i) If 𝑓′(𝑥) > 0 for 𝑎 < 𝑥 < 𝑥0 and 𝑓′(𝑥) < 0 for 𝑥0 < 𝑥 < 𝑏, 𝑓 has a relative maximum
at 𝑥0.
ii) If 𝑓′(𝑥) < 0 for 𝑎 < 𝑥 < 𝑥0 and 𝑓′(𝑥) > 0 for 𝑥0 < 𝑥 < 𝑏, 𝑓 has a relative minimum at
𝑥0.
iii) If 𝑓′(𝑥) < 0 for all 𝑥 in (𝑎, 𝑏), or 𝑓′(𝑥) > 0 for all 𝑥 in (𝑎, 𝑏) (except possibly at 𝑥0
itself), 𝑓 has neither a relative maximum nor a relative minimum at 𝑥0.
Property (i) says that if 𝑓 increases to the left of 𝑥0 and decreases to the right of 𝑥0 , then 𝑓
has a relative maximum at 𝑥0 .
Property (ii) says that if 𝑓 decreases to the left of 𝑥0 and increases to the right of 𝑥0 , then 𝑓
has relative minimum at 𝑥0 .
The First derivative test is illustrated in Figure 4.2(a)-(d) in which we have drawn 3 tangent
lines in each of the 4 cases.
Figure 4.2: Relative minima and relative maxima of the function $f(x) = 3x^5 - 5x^3 + 3$
There are three critical points for this function: $x = -1$, $x = 0$ and $x = 1$. Notice that $x = -1$ is at
a relative maximum and that the function is concave down at this point, meaning that
$f^{(2)}(-1) < 0$. Similarly, $x = 1$ gives a relative minimum and the function is concave up at this
point, meaning that $f^{(2)}(1) > 0$. However, we will need to be very careful with $x = 0$. In this
case the second derivative is zero, i.e., $f^{(2)}(0) = 0$, but that alone does not tell us whether the
point $(0, 3)$ is a relative minimum, a relative maximum or neither.
From Example 4.15, 𝑓(𝑥) = 𝑥³ + 3𝑥² − 9𝑥 − 10 and 𝑓′(𝑥) = 3(𝑥 + 3)(𝑥 − 1). We can use
Table 4.1 to describe the nature of the critical points 𝑥 = −3 and 𝑥 = 1. For example, in
the interval (−4, −2), 𝑓 ′ (𝑥) > 0 𝑓𝑜𝑟 𝑥 < −3 and 𝑓 ′ (𝑥) < 0 𝑓𝑜𝑟 𝑥 > −3. This suggests that
𝑓(𝑥) has a relative maximum at 𝑥 = −3. Similarly, in the interval (0, 2), 𝑓 ′ (𝑥) < 0 for 𝑥 < 1
and 𝑓 ′ (𝑥) > 0 for 𝑥 > 1. Thus, in the interval (0,2), 𝑓(𝑥) has a relative minimum at 𝑥 = 1.
Let a function 𝑓(𝑥) be twice differentiable (that is let 𝑓 ′ (𝑥) 𝑎𝑛𝑑 𝑓 (2) (𝑥) exist) for all 𝑥 in
the interval (𝑎, 𝑏). Then,
i) The graph of 𝑦 = 𝑓(𝑥) is said to be concave up on (𝑎, 𝑏) if 𝑓 ′′ (𝑥) > 0 for 𝑎𝑙𝑙 𝑥 𝑖𝑛 (𝑎, 𝑏).
ii) The graph of 𝑦 = 𝑓(𝑥) is said to be concave down on (𝑎, 𝑏) if 𝑓 ′′ (𝑥) <
0 for 𝑎𝑙𝑙 𝑥 𝑖𝑛(𝑎, 𝑏).
Point of inflection: The point on the graph of 𝑦 = 𝑓(𝑥) that separates the arcs of opposite
concavity is called a point of inflection. That is, inflection points are points on a graph where
the concavity changes. A positive second derivative means a function is concave up; a
negative second derivative means the function is concave down. The points of inflection are
the points (on the graph) where the second derivative is zero and, the function changes from
concave up to concave down or vice versa.
Definition 4.5.2 provides us with a procedure for determining the intervals on which the
graph of 𝑦 = 𝑓(𝑥) is concave up or concave down:
i) Find all numbers 𝑥 for which the second derivative 𝑓 (2) (𝑥) = 0 or 𝑓 (2) (𝑥) fails to
exist.
ii) Check the sign of 𝑓 (2) (𝑥) on each of the resulting intervals to determine concavity.
Example 4.17: Determine the concavity for the graph of the function 𝑓(𝑥) = 𝑥 4 − 6𝑥 2 + 2.
Solution
𝑓(𝑥) = 𝑥 4 − 6𝑥 2 + 2
𝑓 ′ (𝑥) = 4𝑥 3 − 12𝑥
Now, to check for the points of inflection (𝑥0 , 𝑓(𝑥0 )), find the second derivative of 𝑓(𝑥),
equate it to zero and solve it for 𝑥. [This is so done because the points of inflection are the
points (𝒙𝟎 , 𝒇(𝒙𝟎 )) on the graph where the second derivative is zero]
𝑓 (2) (𝑥) = 12𝑥 2 − 12
12𝑥 2 − 12 = 0
12(𝑥 2 − 1) = 0
𝑖. 𝑒. 𝑥 2 − 1 = 0
So, the zeros of 𝑓 (2) (𝑥) are 𝑥1 = −1 𝑎𝑛𝑑 𝑥2 = 1 . Notice that here, there are no values of 𝑥
for which 𝑓 (2) (𝑥) is undefined.
Check the sign of 𝑓 (2) (𝑥) on each of the resulting intervals: (−∞, −1), (−1, 1) 𝑎𝑛𝑑 (1, ∞) by
choosing one “test number” 𝑡 in each interval and calculating 𝑓 (2) (𝑡). The results are given
in Table 4.2.
Table 4.2
Thus, applying Definition 4.5.2 to the results in the last column of Table 4.2, we conclude
that the graph of $f(x) = x^4 - 6x^2 + 2$ is concave up on $(-\infty, -1)$ and on $(1, \infty)$, and concave down on $(-1, 1)$.
The points of inflection are the points (𝒙𝟎 , 𝒇(𝒙𝟎 )) on the graph where the second derivative is
zero and, the function changes from concave up to concave down or vice versa.
The concavity of the graph changes at both the points (−1, 𝑓(−1)) =
(−1, −3) 𝑎𝑛𝑑 (1, 𝑓(1)) = (1, −3). So these are the points of inflection, as illustrated in Figure
4.5 (Homework: Sketch the curve).
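The concavity procedure of Example 4.17 lends itself to a short sketch in which a polynomial is stored as a list of coefficients. The representation (index $k$ holds the coefficient of $x^k$), the helper names and the test numbers $-2$, $0$ and $2$ are assumptions made for this illustration.

```python
def poly_derivative(coeffs):
    """Differentiate a polynomial given as [c0, c1, c2, ...] (ck multiplies x^k)."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def poly_eval(coeffs, x):
    """Evaluate the polynomial at x."""
    return sum(c * x ** k for k, c in enumerate(coeffs))


f_coeffs = [2, 0, -6, 0, 1]                               # x^4 - 6x^2 + 2
f2_coeffs = poly_derivative(poly_derivative(f_coeffs))    # 12x^2 - 12


def concavity(t):
    """Classify concavity at a test number t from the sign of the second derivative."""
    value = poly_eval(f2_coeffs, t)
    if value > 0:
        return "up"
    if value < 0:
        return "down"
    return "undetermined"


# One test number in each of (-inf, -1), (-1, 1) and (1, inf).
chart = {t: concavity(t) for t in (-2, 0, 2)}

if __name__ == "__main__":
    print("second derivative coefficients:", f2_coeffs)
    print("concavity chart:", chart)
    print("inflection points:", [(x, poly_eval(f_coeffs, x)) for x in (-1, 1)])
```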
Let 𝑓(𝑥) be differentiable on an open interval (c, d) containing the critical value 𝑥 =
𝑥0 with 𝑓 ′ (𝑥0 ) = 0. Suppose also that 𝑓 (2) (𝑥) exists throughout this interval. Then,
Example 4.18
Solution
First, set $\frac{df}{dx} = 0$ and find the critical numbers.
Given $f(x) = x^4 - 8x^2 + 2$,
$$\frac{df(x)}{dx} = 4x^3 - 16x = 0$$
$\Rightarrow 4x(x^2 - 4) = 0$
or $4x(x^2 - 2^2) = 0$
$\Rightarrow 4x(x - 2)(x + 2) = 0$
So, either $4x = 0$ or $x^2 - 4 = 0$, i.e. $x = 0$, $x = -2$ or $x = 2$.
Check the sign of 𝑓 (2) (𝑥) for each critical number 𝑥0 and apply the second derivative test:
Remark 4.5.2: A continuous function 𝑓(𝑥) on a closed interval [a, b] always has both a global
(absolute) maximum and a global (absolute) minimum, so that examining the critical values
and the endpoints is enough to find the global maximum and minimum. See Theorem 4.5.4.
If a function 𝑓(𝑥) is continuous on a closed interval [a, b], then 𝑓(𝑥) has both a global
minimum value and a global maximum value. That is, there are real numbers c and d in [a,
b] so that for every x in [a, b], 𝑓(𝑥) attains an absolute (a global) minimum value f(c),
𝑖. 𝑒. , 𝑓(𝑐) ≤ 𝑓(𝑥), and an absolute (a global) maximum value 𝑓(𝑑), 𝑖. 𝑒., 𝑓(𝑥) ≤ 𝑓(𝑑).
Remark 4.5.3: If a function is continuous and has a single critical value, then if there is a local
maximum at the critical value it is a global maximum, and if it is a local minimum it is a global
minimum. There may also be a global minimum in the first case, or a global maximum in the
second case, but that will generally require more effort to determine.
Example 4.19
Find the global (absolute) minimum and maximum of the function 𝑓(𝑥) = (1 − 𝑥)𝑒 𝑥 on the
closed interval [−1, 5].
Solution
Note: The only place a global minimum or maximum can occur on an interval is at one of the
endpoints of the interval or at a critical point inside the interval. A critical point is a point
where the function is defined and its derivative is either 0 or undefined. So start by finding
the derivative
$$f'(x) = e^x \frac{d}{dx}(1 - x) + (1 - x)\frac{d}{dx}e^x \quad \text{by the product rule}$$
$$= -e^x + (1 - x)e^x$$
So
$$f'(x) = -x e^x, \quad -\infty < x < \infty$$
Notice that this derivative is defined everywhere, so the only possible critical numbers will
be where the derivative is equal to 0. And, 𝑓 ′ (𝑥) = −𝑥𝑒 𝑥 = 0 can only be true if either 𝑥 = 0
or 𝑒 𝑥 = 0. But, 𝑒 𝑥 is always positive and, so, is never equal to 0. Thus, the only critical point
comes when 𝑥 = 0.
And, as stated before, the global maximum and minimum can only occur at an endpoint or
at a critical number. So our only choices for 𝑥 here are −1, 0 or 5 . If these are our only
choices, then we just have to plug each one of the three values into the original function
$f(x) = (1 - x)e^x$ and see which one gives the highest value and which gives the lowest value:
$f(-1) = \frac{2}{e} < 1$
$f(0) = 1$
$f(5) = -4e^5$
Thus, the global minimum comes at the point (5, −4𝑒 5 ) and the global maximum comes at
the point (0, 1).
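The candidate-point method used in Example 4.19 (evaluate the function at the endpoints and at the interior critical points, then compare) can be sketched as below. The function name `global_extrema` and its argument layout are hypothetical, and the critical points are supplied by hand rather than computed.

```python
import math


def f(x):
    """The function of Example 4.19."""
    return (1 - x) * math.exp(x)


def global_extrema(func, endpoints, critical_points):
    """Return ((x_min, f(x_min)), (x_max, f(x_max))) over a closed interval.

    Only the endpoints and the critical points strictly inside the interval
    are compared, as in the note at the start of the solution.
    """
    a, b = endpoints
    candidates = [a, b] + [c for c in critical_points if a < c < b]
    x_min = min(candidates, key=func)
    x_max = max(candidates, key=func)
    return (x_min, func(x_min)), (x_max, func(x_max))


minimum, maximum = global_extrema(f, (-1.0, 5.0), [0.0])

if __name__ == "__main__":
    print("global minimum:", minimum)   # at x = 5, value -4e^5
    print("global maximum:", maximum)   # at x = 0, value 1
```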
Suppose a function $f(x)$ is (i) continuous on a closed interval $[a, b]$, (ii) differentiable on the
open interval $(a, b)$ and (iii) $f(a) = f(b)$. Then, there is a number $c$ in $(a, b)$ such that
$f'(c) = 0$. That is, $f(x)$ has a critical point $c$ in $(a, b)$.
References
i) Section 3-7: Marginal Analysis in Business and Economics.
http://faculty.mdc.edu/mmontane/marginal-analysis.pdf.
Reference: Mark Mac Lean (2011). Price Elasticity of Demand. MATH 104 and MATH 184. University
of British Columbia, Canada. http://www.math.ubc.ca/~kliu/notesonelasticity.pdf
4.6 Differentiation of Multivariate Functions
Readings: R Horan & M. Lavelle. Intermediate Mathematics. Introduction to Partial Differentiation.
The University of Plymouth
In practice, many quantities that we measure are often functions of two or more variables. That is,
the more common situation is for a dependent variable to be related to more than one independent
variable. Multivariate calculus (also known as Multivariable calculus) is the extension of calculus in
one variable (i.e. Univariate calculus) to calculus in more than one variable. This involves the
differentiation and integration of functions involving multiple variables, rather than just one.
A bivariate (two-variable) function is a function whose value depends on two variables, i.e., on a
vector in two-dimensional space. Symbolically, this is written $f(x_1, x_2) = f(\mathbf{x})$, where $\mathbf{x} = (x_1, x_2)$ is a
2-dimensional vector.
A p-variable function (or a p-variate function) is a function whose range is a subset of the real line
ℝ and whose domain is a subset of the p-dimensional space $\mathbb{R}^p$. That is,
$f(x_1, x_2, \ldots, x_p) = f(\mathbf{x})$, where $\mathbf{x} = (x_1, x_2, \ldots, x_p)$ and $f$ is a scalar.
When one or more of the input variables of a function $f$ of several independent variables change,
it is important to calculate the resulting change in the function itself. To determine the rate of change
of a multivariate real function $f(x_1, x_2, \ldots, x_p)$, where $p$ denotes the number of variables, with
respect to one of its several independent variables $x_j$, $j = 1, 2, \ldots, p$, we find the derivative of $f$ with
respect to one $x_j$ at a time, while holding the other independent variables constant. This
process is called partial differentiation.
Notation: The symbol ∂ is used whenever a function of more than one variable is being
differentiated. The symbol $f_x$ (or ∂f/∂x) is used to denote the first partial derivative of f(x, y) with
respect to the variable x. Likewise, $f_y$ (or ∂f/∂y) denotes the first partial derivative of f(x, y) with
respect to the variable y.
The first partial derivatives of a two-variable function f(x, y) with respect to the variables x and y,
respectively, are given by
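$$\frac{\partial f(x, y)}{\partial x} = \lim_{\Delta x \to 0} \frac{f(x + \Delta x, y) - f(x, y)}{\Delta x}$$
$$\frac{\partial f(x, y)}{\partial y} = \lim_{\Delta y \to 0} \frac{f(x, y + \Delta y) - f(x, y)}{\Delta y}$$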
If f x1 , x2 ,..., x p is a function of p variables, then the partial derivative of f with respect to the j th
The rules of partial differentiation follow exactly the same techniques as univariate
differentiation. The only difference is that we have to decide how to treat the other independent
variable. If we hold the other variable constant, that means that it is treated just like any other
constant. Hence, in Definition 4.6.3 the first equation gives the rate of change of f(x, y) with respect
to x while y is held constant. Similarly, the second equation gives the rate of change of f(x, y) with
respect to y while x is held constant.
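The limit definitions of the partial derivatives translate directly into difference quotients in which only one argument is nudged. The sketch below illustrates this. The test function $g(x, y) = x^2 y + y^3$ is a hypothetical example chosen here (it does not come from the notes), and `partial_x` and `partial_y` are illustrative names.

```python
def partial_x(f, x, y, dx=1e-6):
    """Forward difference quotient in x with y held constant."""
    return (f(x + dx, y) - f(x, y)) / dx


def partial_y(f, x, y, dy=1e-6):
    """Forward difference quotient in y with x held constant."""
    return (f(x, y + dy) - f(x, y)) / dy


def g(x, y):
    """Hypothetical example function g(x, y) = x^2 y + y^3."""
    return x ** 2 * y + y ** 3


def g_x(x, y):
    """Treat y as a constant: d/dx (x^2 y + y^3) = 2xy."""
    return 2 * x * y


def g_y(x, y):
    """Treat x as a constant: d/dy (x^2 y + y^3) = x^2 + 3y^2."""
    return x ** 2 + 3 * y ** 2


if __name__ == "__main__":
    print(partial_x(g, 1.0, 2.0), g_x(1.0, 2.0))
    print(partial_y(g, 1.0, 2.0), g_y(1.0, 2.0))
```

Holding the other variable constant is exactly what the unchanged second (or first) argument in each quotient expresses.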
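The first partial derivatives $f_x$ and $f_y$ are functions of the variables x and y and, so, we can find their
derivatives. Thus, as with ordinary derivatives of functions of one variable, we can compute higher
order partial derivatives of functions of several variables. If f(x, y) is a bivariate function of x and y,
then
$f_{xx} = \frac{\partial^2 f}{\partial x^2} = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\right)$ denotes the second partial derivative of f with respect to x,
$f_{yy} = \frac{\partial^2 f}{\partial y^2} = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial y}\right)$ denotes the second partial derivative of f with respect to y.
$f_{xy} = \frac{\partial^2 f}{\partial y \partial x} = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right)$ says first differentiate with respect to x and then with respect to y,
$f_{yx} = \frac{\partial^2 f}{\partial x \partial y} = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right)$ says first differentiate with respect to y and then with respect to x.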
The 3rd and 4th equations are called mixed partial (or cross partial) derivatives. Notice that when
finding mixed partial derivatives, we differentiate first with respect to the variable nearest f.
Example 4.22: Find the first- and second-order partial derivatives of the function
f x, y e xy ln x 2 y
Solution
Partial derivatives and mixed partial derivatives are important since they allow us to determine local
maximum and minimum points for multivariate functions.
The definition of relative extrema for functions of two variables is identical to that for functions of
one variable; we just need to remember that we are now working with functions of two variables.
Definition 4.7.1: (Relative Minimum and Maximum)
Suppose we are interested in finding points of relative maxima and minima for a function of two
variables, i.e., 𝑓(𝑥, 𝑦). Then,
i) 𝑓(𝑥, 𝑦) has a relative minimum at the point (𝑥0, 𝑦0) if 𝑓(𝑥, 𝑦) ≥ 𝑓(𝑥0, 𝑦0) for all points (𝑥, 𝑦) in
some region around (𝑥0, 𝑦0).
ii) 𝑓(𝑥, 𝑦) has a relative maximum at the point (𝑥0, 𝑦0) if 𝑓(𝑥, 𝑦) ≤ 𝑓(𝑥0, 𝑦0) for all points (𝑥, 𝑦) in
some region around (𝑥0, 𝑦0).
For a differentiable multivariable function, a stationary (critical) point is a point on the surface of the
graph where all its partial derivatives are zero (equivalently, the gradient is zero).
By Definition 4.7.2, a point (𝑥0, 𝑦0) is a stationary point of a bivariate function 𝑓(𝑥, 𝑦) if one of the
following is true:
i) 𝑓𝑥(𝑥0, 𝑦0) = 0 and 𝑓𝑦(𝑥0, 𝑦0) = 0 (i.e., both partial derivatives of 𝑓(𝑥, 𝑦) at (𝑥0, 𝑦0) are zero).
ii) 𝑓𝑥(𝑥0, 𝑦0) and/or 𝑓𝑦(𝑥0, 𝑦0) do (does) not exist.
Thus, to find the relative minima/maxima of a two-variable function 𝑓(𝑥, 𝑦), the first step is to find
the stationary points (𝑥0, 𝑦0) where the gradient is the zero vector. That is, find 𝑓𝑥 and 𝑓𝑦, set both
to zero and solve the resulting system of equations for x and y.
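The gradient of a function 𝑓(𝒙) of the vector 𝒙 = (𝑥1, 𝑥2, …, 𝑥𝑝)ᵀ, denoted by ∇𝑓(𝒙), is the vector
\[
\nabla f(\boldsymbol{x}) =
\begin{bmatrix}
\dfrac{\partial}{\partial x_1} f(\boldsymbol{x}) \\[2mm]
\dfrac{\partial}{\partial x_2} f(\boldsymbol{x}) \\
\vdots \\
\dfrac{\partial}{\partial x_p} f(\boldsymbol{x})
\end{bmatrix}
=
\begin{bmatrix}
f_{x_1} \\ f_{x_2} \\ \vdots \\ f_{x_p}
\end{bmatrix},
\]
a column vector of the first partial derivatives of 𝑓(𝑥1, 𝑥2, …, 𝑥𝑝) with
respect to each one of the independent variables 𝑥1, 𝑥2, …, 𝑥𝑝.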
Note: Both of the first order partial derivatives must be zero at the point (x0 , y0 ). If only one of the
first order partial derivatives is zero at the point, then the point (x0 , y0 ) will not be a stationary point.
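As a rough illustration of the gradient as a vector of first partial derivatives, the sketch below approximates each component by a central difference, changing one variable at a time while the others are held constant. The helper name `gradient`, the three-variable function and the evaluation point are assumptions introduced for this example only.

```python
def gradient(f, x, h=1e-6):
    """Approximate the gradient of f at the point x (a list of p numbers)."""
    grad = []
    for j in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[j] += h   # move only the j-th variable forward
        x_minus[j] -= h  # move only the j-th variable backward
        grad.append((f(x_plus) - f(x_minus)) / (2 * h))
    return grad


def f(x):
    # Hypothetical function of p = 3 variables (an assumption for illustration).
    return x[0] ** 2 + 3 * x[0] * x[1] + x[2] ** 3


# By hand the gradient is (2*x1 + 3*x2, 3*x1, 3*x3^2), which is (8, 3, 3) at (1, 2, -1).
g = gradient(f, [1.0, 2.0, -1.0])
print(g)
```

A point is stationary only when every component of this vector is zero, which is why a single zero partial derivative is not enough.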
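Consider a bivariate function 𝑓(𝑥1, 𝑥2). To check if the point (𝑥0, 𝑦0) with a zero gradient is a relative
minimum or relative maximum, we determine the Hessian matrix of 𝑓(𝑥1, 𝑥2), a matrix whose (𝑖, 𝑗)th
element is the second-order partial derivative ∂²𝑓(𝑥1, 𝑥2)/∂𝑥𝑖∂𝑥𝑗, for 𝑖, 𝑗 = 1, 2:
\[
\boldsymbol{H}(\boldsymbol{x}) =
\begin{bmatrix}
\dfrac{\partial^2 f(x_1, x_2)}{\partial x_1 \partial x_1} & \dfrac{\partial^2 f(x_1, x_2)}{\partial x_1 \partial x_2} \\[3mm]
\dfrac{\partial^2 f(x_1, x_2)}{\partial x_2 \partial x_1} & \dfrac{\partial^2 f(x_1, x_2)}{\partial x_2 \partial x_2}
\end{bmatrix}
\]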
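[ASIDE: In general, the Hessian matrix of a multivariable function 𝑓(𝑥1, 𝑥2, …, 𝑥𝑝), denoted by 𝑯(𝒙), is
the matrix whose (𝑖, 𝑗)th element is the second-order partial derivative ∂²𝑓(𝑥1, 𝑥2, …, 𝑥𝑝)/∂𝑥𝑖∂𝑥𝑗, for
𝑖, 𝑗 = 1, 2, …, 𝑝:
\[
\boldsymbol{H}(\boldsymbol{x}) =
\begin{bmatrix}
\dfrac{\partial^2}{\partial x_1 \partial x_1} f(\boldsymbol{x}) & \dfrac{\partial^2}{\partial x_1 \partial x_2} f(\boldsymbol{x}) & \cdots & \dfrac{\partial^2}{\partial x_1 \partial x_p} f(\boldsymbol{x}) \\[3mm]
\dfrac{\partial^2}{\partial x_2 \partial x_1} f(\boldsymbol{x}) & \dfrac{\partial^2}{\partial x_2 \partial x_2} f(\boldsymbol{x}) & \cdots & \dfrac{\partial^2}{\partial x_2 \partial x_p} f(\boldsymbol{x}) \\
\vdots & \vdots & \ddots & \vdots \\
\dfrac{\partial^2}{\partial x_p \partial x_1} f(\boldsymbol{x}) & \dfrac{\partial^2}{\partial x_p \partial x_2} f(\boldsymbol{x}) & \cdots & \dfrac{\partial^2}{\partial x_p \partial x_p} f(\boldsymbol{x})
\end{bmatrix}
\]
]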
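Note: The Hessian matrix is a symmetric matrix when the second-order partial derivatives are
continuous. That is,
\[
\frac{\partial^2}{\partial x_i \partial x_j} f(\boldsymbol{x}) = \frac{\partial^2}{\partial x_j \partial x_i} f(\boldsymbol{x}).
\]
Why?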
To verify whether a critical point, say (𝑥0, 𝑦0), is a relative minimum, a relative maximum or a saddle
point, we apply the Second partial derivative test, stated in Theorem 4.7.1.
Suppose 𝑓(𝑥, 𝑦) is a two-variable function with a critical point (𝑥0, 𝑦0), i.e. ∇𝑓(𝑥0, 𝑦0) = 𝟎, and that the
second-order partial derivatives are continuous in some region that contains (𝑥0, 𝑦0). Let D denote the
determinant of the Hessian matrix 𝑯 evaluated at the point (𝑥0, 𝑦0), that is,
\[
D = f_{xx}(x_0, y_0)\, f_{yy}(x_0, y_0) - \bigl(f_{xy}(x_0, y_0)\bigr)^2 .
\]
Then:
i) if 𝐷 > 0 and 𝑓𝑥𝑥 (𝑥0 , 𝑦0 ) > 0, then (𝑥0 , 𝑦0 ) corresponds to a relative minimum.
ii) if 𝐷 > 0 and 𝑓𝑥𝑥 (𝑥0 , 𝑦0 ) < 0, then (𝑥0 , 𝑦0 ) corresponds to a relative maximum.
iii) if 𝐷 < 0, then (𝑥0 , 𝑦0 ) corresponds to a saddle point.
iv) if 𝐷 = 0, then the test is inconclusive. That is, the point (𝑥0 , 𝑦0 ) may be a relative minimum,
relative maximum or a saddle point. Other techniques would need to be used to classify the
critical point.
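The four cases of the test translate directly into a short decision procedure. The sketch below is one possible way to write it; the function name `classify_critical_point` and its argument order are assumptions made for this illustration, and the three arguments are the second-order partial derivatives already evaluated at the critical point.

```python
def classify_critical_point(fxx, fyy, fxy):
    """Apply the second partial derivative test at a critical point.

    fxx, fyy and fxy are the second-order partial derivatives
    evaluated at the critical point (x0, y0).
    """
    D = fxx * fyy - fxy ** 2  # determinant of the Hessian matrix
    if D > 0 and fxx > 0:
        return "relative minimum"
    if D > 0 and fxx < 0:
        return "relative maximum"
    if D < 0:
        return "saddle point"
    return "inconclusive"  # D = 0: the test gives no information


# Hypothetical values, chosen only to exercise each case.
print(classify_critical_point(2, 2, 0))    # D = 4 > 0 and fxx > 0
print(classify_critical_point(-2, -2, 0))  # D = 4 > 0 and fxx < 0
print(classify_critical_point(2, -2, 0))   # D = -4 < 0
print(classify_critical_point(2, 0, 0))    # D = 0
```

When the derivatives come from floating-point calculations, the comparison with zero should be read with some tolerance, since a value of D that is exactly zero in theory may be computed as a tiny nonzero number.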
Example 4.23
Find and classify the stationary points (local maximum, local minimum or saddle point) of the
function 𝑓(𝑥, 𝑦) = 𝑥² + 𝑦⁴ + 1.
Solution
To find the stationary points first find the first-order partial derivatives and equate each to zero:
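𝑓𝑥(𝑥, 𝑦) = 2𝑥 = 0 ⇒ 𝑥 = 0
𝑓𝑦(𝑥, 𝑦) = 4𝑦³ = 0 ⇒ 𝑦 = 0
Hence,
\[
\nabla f(\boldsymbol{x}) =
\begin{bmatrix}
\dfrac{\partial}{\partial x_1} f(\boldsymbol{x}) \\[3mm]
\dfrac{\partial}{\partial x_2} f(\boldsymbol{x})
\end{bmatrix}
=
\begin{bmatrix} 0 \\ 0 \end{bmatrix}
\]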
Thus, the only stationary point of f is (0, 0). And, to classify the stationary point (0, 0), apply the
Second partial derivative test as follows:
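The second-order partial derivatives are 𝑓𝑥𝑥 = 2, 𝑓𝑥𝑦 = 0, 𝑓𝑦𝑦 = 12𝑦², 𝑓𝑦𝑥 = 0, so that the Hessian matrix is
\[
\boldsymbol{H} =
\begin{bmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{bmatrix}
=
\begin{bmatrix} 2 & 0 \\ 0 & 12y^2 \end{bmatrix}.
\]
Hence, the discriminant of the function is
\[
D = \det(\boldsymbol{H}) =
\begin{vmatrix} 2 & 0 \\ 0 & 12y^2 \end{vmatrix}
= 24y^2 - 0 = 24y^2 .
\]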
At the stationary point (0, 0), 𝐷 = 24(0)² = 0, and so the test provides no information about the
nature of this stationary point (i.e., the test is inconclusive).
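When the test is inconclusive, one alternative technique is to compare 𝑓(𝑥, 𝑦) with 𝑓(0, 0) directly, as in Definition 4.7.1. Since 𝑥² ≥ 0 and 𝑦⁴ ≥ 0, we have 𝑓(𝑥, 𝑦) = 𝑥² + 𝑦⁴ + 1 ≥ 1 = 𝑓(0, 0) for every point, which suggests that (0, 0) is a relative minimum. The sketch below checks this inequality on a small grid around the origin; the grid spacing and extent are assumptions, and a numerical check of this kind supports the algebraic argument but does not replace it.

```python
def f(x, y):
    # The function from Example 4.23.
    return x**2 + y**4 + 1


# Sample points in the square [-0.1, 0.1] x [-0.1, 0.1] around (0, 0).
steps = [i / 100 for i in range(-10, 11)]

# True if f(x, y) >= f(0, 0) at every sampled point.
all_at_least = all(f(x, y) >= f(0, 0) for x in steps for y in steps)
print(all_at_least)
```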
Example 4.24
Locate and classify the stationary points of the function 𝑓(𝑥, 𝑦) = 3𝑥²𝑦 + 𝑦³ − 3𝑥² − 3𝑦² + 1.
Solution
Note: In this case the function f is continuous and differentiable at every point (𝑥, 𝑦) ∈ ℝ², so that
any local extreme values will occur at the critical (stationary) points of f. We find the stationary
points by setting the gradient of f equal to the null vector and then solving simultaneously for x and
y (just like in Example 4.17):
\[
\nabla f(\boldsymbol{x}) =
\begin{bmatrix} f_{x_1} \\ f_{x_2} \end{bmatrix}
=
\begin{bmatrix} 6xy - 6x \\ 3x^2 + 3y^2 - 6y \end{bmatrix}
=
\begin{bmatrix} 0 \\ 0 \end{bmatrix}
\]
And, 6𝑥𝑦 − 6𝑥 = 0 ⇒ 6𝑥(𝑦 − 1) = 0 ⇒ 𝑥 = 0 or 𝑦 = 1.
Now, if 𝑥 = 0, the second equation 3𝑥² + 3𝑦² − 6𝑦 = 0 becomes 3𝑦² − 6𝑦 = 0 ⇒ 3𝑦(𝑦 − 2) = 0 so
that 𝑦 = 0 or 𝑦 = 2 and, hence, two stationary points of f are (0, 0) and (0, 2). Similarly, if 𝑦 = 1,
the second equation becomes 3𝑥² + 3 − 6 = 0 ⇒ 𝑥² = 1 so that 𝑥 = −1 or 𝑥 = 1, which gives the
stationary points (−1, 1) and (1, 1).
Thus, there are 4 stationary points of f to be classified using the Second partial derivative test:
(−1, 1), (0, 0), (0, 2) and (1, 1).
Computing the second-order partial derivatives gives the Hessian matrix
\[
\boldsymbol{H} =
\begin{bmatrix} 6y - 6 & 6x \\ 6x & 6y - 6 \end{bmatrix}
\]
And so the discriminant of f is
\[
D = \begin{vmatrix} 6y - 6 & 6x \\ 6x & 6y - 6 \end{vmatrix} = (6y - 6)^2 - 36x^2 .
\]
This leads to the following classification of the 4 stationary points of f above.
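The classification can be checked by evaluating 𝑓𝑥𝑥 = 6𝑦 − 6 and 𝐷 = (6𝑦 − 6)² − 36𝑥² at each of the four stationary points (−1, 1), (0, 0), (0, 2) and (1, 1) and applying the Second partial derivative test. The following is a minimal sketch of that calculation; the helper names are assumptions introduced for this illustration.

```python
def second_partials(x, y):
    """Second-order partial derivatives of f = 3x^2 y + y^3 - 3x^2 - 3y^2 + 1."""
    fxx = 6 * y - 6
    fxy = 6 * x
    fyy = 6 * y - 6
    return fxx, fxy, fyy


def classify(x, y):
    """Apply the second partial derivative test at the stationary point (x, y)."""
    fxx, fxy, fyy = second_partials(x, y)
    D = fxx * fyy - fxy ** 2
    if D > 0 and fxx > 0:
        return D, "relative minimum"
    if D > 0 and fxx < 0:
        return D, "relative maximum"
    if D < 0:
        return D, "saddle point"
    return D, "inconclusive"


points = [(-1, 1), (0, 0), (0, 2), (1, 1)]
results = {p: classify(*p) for p in points}
for p in points:
    D, kind = results[p]
    print(p, "D =", D, "->", kind)
```

Under this calculation, (0, 0) gives D = 36 with 𝑓𝑥𝑥 = −6 (relative maximum), (0, 2) gives D = 36 with 𝑓𝑥𝑥 = 6 (relative minimum), and (−1, 1) and (1, 1) both give D = −36 (saddle points).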
a) Determine the price the company should charge for each product in order to maximize total
revenue. Hint: Since the company is selling two products, the total revenue will be the sum of
the total revenues realised from the two products.
b) Verify that the maximum revenue accruable to the company is P4375.
The Lagrange multipliers method is one of the methods for solving constrained extrema problems.
Recall that for a p-variate function f the necessary condition for a local extremum is that, at the
extremum, all partial derivatives, if they exist, must be zero. As a result, there are p equations in p
unknowns (𝑥1, 𝑥2, …, 𝑥𝑝) that may be solved to find the potential extremum (called a stationary
point). When the variables 𝑥1, 𝑥2, …, 𝑥𝑝 are constrained, there is (at least one) additional equation
(the constraint) but no additional variables, so that the set of equations is overdetermined. Hence,
the method introduces an additional variable (the Lagrange multiplier), denoted by 𝜆, that enables us
to solve the problem.
The Lagrange multipliers method is based on setting up the new function, called the Lagrange
function,
𝐿(𝑥1 , 𝑥2 , … , 𝑥𝑝 , 𝜆) = 𝑓(𝑥1 , 𝑥2 , … , 𝑥𝑝 ) − 𝜆(𝑔(𝑥1 , 𝑥2 , … , 𝑥𝑝 ) − 𝑘)
where k is a constant and 𝜆 is an additional variable called the Lagrange multiplier. From the Lagrange
function, stationary points are obtained by finding the partial derivatives of 𝐿(𝑥1, 𝑥2, …, 𝑥𝑝, 𝜆) with
respect to 𝑥1, 𝑥2, …, 𝑥𝑝 and 𝜆, setting each result to zero, and then solving the resulting system of
equations simultaneously for 𝑥1, 𝑥2, …, 𝑥𝑝 and 𝜆. That is,
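\[
\begin{aligned}
\frac{\partial}{\partial x_1} L(x_1, x_2, \dots, x_p, \lambda) &= \frac{\partial}{\partial x_1} f(x_1, x_2, \dots, x_p) - \lambda \frac{\partial}{\partial x_1} g(x_1, x_2, \dots, x_p) = 0 \\
\frac{\partial}{\partial x_2} L(x_1, x_2, \dots, x_p, \lambda) &= \frac{\partial}{\partial x_2} f(x_1, x_2, \dots, x_p) - \lambda \frac{\partial}{\partial x_2} g(x_1, x_2, \dots, x_p) = 0 \\
&\;\;\vdots \\
\frac{\partial}{\partial x_p} L(x_1, x_2, \dots, x_p, \lambda) &= \frac{\partial}{\partial x_p} f(x_1, x_2, \dots, x_p) - \lambda \frac{\partial}{\partial x_p} g(x_1, x_2, \dots, x_p) = 0 \\
\frac{\partial}{\partial \lambda} L(x_1, x_2, \dots, x_p, \lambda) &= -\bigl(g(x_1, x_2, \dots, x_p) - k\bigr) = 0
\end{aligned}
\]
The last equation simply recovers the constraint 𝑔(𝑥1, 𝑥2, …, 𝑥𝑝) = 𝑘, since f and g do not depend on 𝜆.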
Note: The Lagrange multiplier method provides us with a way of finding the stationary points, but it
does not tell us whether the points yield a minimum, a maximum or neither. To confirm which of
these holds at a stationary point, second-order conditions must be verified.
Solution
Maximise U = x1 x2
subject to 10x1 + 2x2 = 100
Using the Lagrange multiplier method, first set up the Lagrangian function
𝐿(𝑥1, 𝑥2, 𝜆) = 𝑥1𝑥2 − 𝜆(10𝑥1 + 2𝑥2 − 100)
or L = x1 x2 + 𝜆(100 − 10x1 − 2x2)
Differentiating L w.r.t 𝑥1 , 𝑥2 and 𝜆 and then setting each result to zero yields Equations (1) to (3).
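\[
\frac{\partial L}{\partial x_1} = x_2 - 10\lambda = 0 \qquad (1)
\]
\[
\frac{\partial L}{\partial x_2} = x_1 - 2\lambda = 0 \qquad (2)
\]
\[
\frac{\partial L}{\partial \lambda} = 100 - 10x_1 - 2x_2 = 0 \qquad (3)
\]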
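Solve the 3 simultaneous equations. From (1), x2 = 10𝜆 and from (2), x1 = 2𝜆, so that
x2 = 5x1.
Substituting x2 = 5x1 into (3) gives 100 − 10x1 − 10x1 = 0, that is,
20x1 = 100
x1 = 5
x2 = 5x1 = 5(5) = 25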
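Because Equations (1) to (3) are linear in x1, x2 and 𝜆, the solution can also be checked mechanically. The sketch below solves the 3 × 3 system by Cramer's rule using exact fractions; the helper names `det3` and `solve3` and the choice of Cramer's rule are assumptions made for this illustration, not part of the method described in the text.

```python
from fractions import Fraction


def det3(m):
    """Determinant of a 3 x 3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def solve3(A, rhs):
    """Solve A z = rhs for a 3 x 3 system by Cramer's rule."""
    d = det3(A)
    solution = []
    for col in range(3):
        Ai = [row[:] for row in A]  # copy A, then replace one column by rhs
        for r in range(3):
            Ai[r][col] = rhs[r]
        solution.append(Fraction(det3(Ai), d))
    return solution


# Unknowns in the order (x1, x2, lambda):
#   (1)  x2 - 10*lambda = 0
#   (2)  x1 -  2*lambda = 0
#   (3)  10*x1 + 2*x2   = 100
A = [[0, 1, -10],
     [1, 0, -2],
     [10, 2, 0]]
rhs = [0, 0, 100]

x1, x2, lam = solve3(A, rhs)
U = x1 * x2  # value of the objective at the stationary point
print(x1, x2, lam, U)
```

This reproduces x1 = 5 and x2 = 25, with multiplier 𝜆 = 5/2 and objective value U = 125. As noted earlier, second-order conditions are still needed to confirm that this stationary point is a maximum.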