
Lecturenotes Weeks1 4 2320

Uploaded by

gordenpey

MA1104 Week1

Vectors, Lines & Planes

1. Distance between two points

A very fundamental concept in mathematics is that of distance. Here, we


want to find the formula for distance in terms of the coordinates of the points.
Suppose we have two points (x1 , y1 ) and (x2 , y2 ) on the xy-plane.

Consider the right-angled triangle formed by the points $(x_1, y_1)$, $(x_2, y_2)$ and
$(x_1, y_2)$. By the Pythagorean Theorem, the distance
$d$ between $(x_1, y_1)$ and $(x_2, y_2)$ is
$$d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}.$$

Suppose now we have two points in xyz-space:

Using a similar argument (see online lecture video), the distance $d$ between
two points $(x_1, y_1, z_1)$ and $(x_2, y_2, z_2)$ is
$$d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2}.$$
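
As a quick numerical check of the two distance formulas, here is a minimal Python sketch. The function name `distance` and the sample points are illustrative choices, not part of the notes.

```python
import math

def distance(p, q):
    """Euclidean distance between two points with the same number of coordinates."""
    if len(p) != len(q):
        raise ValueError("points must have the same dimension")
    # Sum the squared coordinate differences, then take the square root.
    return math.sqrt(sum((qi - pi) ** 2 for pi, qi in zip(p, q)))

d2 = distance((0, 0), (3, 4))        # 2-D case: legs 3 and 4 of a right triangle
d3 = distance((1, 1, 1), (3, 1, 1))  # 3-D case: points differing only in x
print(d2, d3)
```

The same function covers the plane and space cases because the formula only adds one squared difference per coordinate.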

2. Introduction to vectors

A vector is completely defined by two things:

• Length
• Direction

Two vectors are equal if they have the same length and the same direction.

Throughout, vectors will be denoted in bold letters: u, v etc.


A vector u is often represented as

• a position vector $\mathbf{u} = \langle a, b, c \rangle$, where $(a, b, c)$ is the terminal point when
the initial point is placed at the origin; or
• a directed line segment $\mathbf{u} = \overrightarrow{AB}$ when we want to emphasize the
initial point $A(a_1, a_2, a_3)$ as well as the terminal point $B(b_1, b_2, b_3)$.

Let us now turn to the question on adding two vectors. This can be defined
geometrically as follows: The sum u + v is the resulting vector that starts at
the initial point of u and ends at the terminal point of v when we place the
initial point of v at the terminal point of u:

This defining rule of addition is known as the Triangle Law of Addition.
By the same token, we can define the sum v + u, which turns out to be the
same as u + v. So, vector addition is commutative, i.e. the order in which
we compute the addition does not matter.

Equivalently, we can define vector addition algebraically as follows:


If $\mathbf{a} = \langle a_1, a_2, a_3 \rangle$ and $\mathbf{b} = \langle b_1, b_2, b_3 \rangle$, then

$$\mathbf{a} + \mathbf{b} = \langle a_1 + b_1,\ a_2 + b_2,\ a_3 + b_3 \rangle.$$

The zero vector, denoted by 0, has length 0. It is the only vector with no
specific direction.

Let’s talk about multiplication. Let c ∈ R and u be a vector.

The scalar multiple cu is the vector

• whose length is |c| times the length of u; and

• whose direction is the same as u if c > 0 and is opposite to u if c < 0.

If c = 0 or u = 0, then cu = 0.

Just like vector addition, it is easier to compute a scalar multiple using position
vectors: if $c \in \mathbb{R}$ and $\mathbf{u} = \langle u_1, u_2, u_3 \rangle$, then

$$c\mathbf{u} = \langle cu_1, cu_2, cu_3 \rangle.$$

3. Sphere & Midpoint

Example 1. Find the equation (in terms of x, y and z) of the sphere of


radius 2 centered at the point (1, 1, 1).

Solution. Let $(x, y, z)$ be a point on the sphere. Then the distance between
$(x, y, z)$ and the centre $(1, 1, 1)$ is 2. Hence, by the distance formula,
$$\sqrt{(x - 1)^2 + (y - 1)^2 + (z - 1)^2} = 2,$$
and squaring both sides gives
$$(x - 1)^2 + (y - 1)^2 + (z - 1)^2 = 4.$$




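Example 2. Prove that the midpoint of the line segment from $A(x_1, y_1, z_1)$
to $B(x_2, y_2, z_2)$ is
$$\left( \frac{x_1 + x_2}{2},\ \frac{y_1 + y_2}{2},\ \frac{z_1 + z_2}{2} \right).$$
Solution. Let $M(a, b, c)$ be the midpoint.

Note that
$$\overrightarrow{AM} = \overrightarrow{MB}$$
$$\overrightarrow{OM} - \overrightarrow{OA} = \overrightarrow{OB} - \overrightarrow{OM}$$
$$\langle a, b, c \rangle - \langle x_1, y_1, z_1 \rangle = \langle x_2, y_2, z_2 \rangle - \langle a, b, c \rangle$$
$$\langle a - x_1,\ b - y_1,\ c - z_1 \rangle = \langle x_2 - a,\ y_2 - b,\ z_2 - c \rangle$$

Comparing components:
$$a - x_1 = x_2 - a, \qquad b - y_1 = y_2 - b, \qquad c - z_1 = z_2 - c.$$

Solving for $a$, $b$, $c$:
$$a = \frac{x_1 + x_2}{2}, \qquad b = \frac{y_1 + y_2}{2}, \qquad c = \frac{z_1 + z_2}{2}.$$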

Example 3. Find the position vector which is equal to the vector directed
from the point A(1, 2, 3) to the point B(3, −2, 0).
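Solution. Let $O(0, 0, 0)$ be the origin. Then
$$\overrightarrow{AB} = \overrightarrow{AO} + \overrightarrow{OB} = -\overrightarrow{OA} + \overrightarrow{OB} = -\langle 1, 2, 3 \rangle + \langle 3, -2, 0 \rangle = \langle 2, -4, -3 \rangle.$$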

4. Length of Vector

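The standard basis vectors are:
$$\mathbf{i} = \langle 1, 0, 0 \rangle, \quad \mathbf{j} = \langle 0, 1, 0 \rangle, \quad \mathbf{k} = \langle 0, 0, 1 \rangle.$$

Any 3D vector can be written as a linear combination of the standard basis
vectors:
$$\langle a, b, c \rangle = \langle a, 0, 0 \rangle + \langle 0, b, 0 \rangle + \langle 0, 0, c \rangle = a\langle 1, 0, 0 \rangle + b\langle 0, 1, 0 \rangle + c\langle 0, 0, 1 \rangle = a\mathbf{i} + b\mathbf{j} + c\mathbf{k}.$$

The length of the vector $\mathbf{u} = \langle u_1, u_2, u_3 \rangle$ is
$$\|\mathbf{u}\| = \sqrt{u_1^2 + u_2^2 + u_3^2}.$$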

A unit vector is a vector whose length is 1.

Theorem 1. Let c ∈ R and u be a vector. Then

||cu|| = |c| ||u|| .

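Proof. Let $\mathbf{u} = \langle u_1, u_2, u_3 \rangle$. Then
$$\|c\mathbf{u}\| = \sqrt{(cu_1)^2 + (cu_2)^2 + (cu_3)^2} = \sqrt{c^2 (u_1^2 + u_2^2 + u_3^2)} = \sqrt{c^2}\,\sqrt{u_1^2 + u_2^2 + u_3^2} = |c|\,\|\mathbf{u}\|. \quad \blacksquare$$

Theorem 2. If $\mathbf{a} \neq \mathbf{0}$, then a unit vector in the same direction as $\mathbf{a}$ is given by
$$\mathbf{u} = \frac{\mathbf{a}}{\|\mathbf{a}\|}.$$
Proof. Notice that $1/\|\mathbf{a}\|$ is a positive scalar, so $\mathbf{u}$ is in the same direction as $\mathbf{a}$.
It remains to show that $\mathbf{u}$ is a unit vector, i.e. $\|\mathbf{u}\| = 1$. By the preceding
theorem,
$$\|\mathbf{u}\| = \left\| \frac{\mathbf{a}}{\|\mathbf{a}\|} \right\| = \left| \frac{1}{\|\mathbf{a}\|} \right| \|\mathbf{a}\| = \frac{1}{\|\mathbf{a}\|}\,\|\mathbf{a}\| = 1.$$
So $\mathbf{u}$ is the unit vector in the same direction as $\mathbf{a}$. $\blacksquare$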
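
Theorems 1 and 2 can be checked numerically. The sketch below is only an illustration: the helper names `norm`, `scale` and `unit`, and the sample vector, are assumptions made for this example.

```python
import math

def norm(u):
    """Length of a vector given as a tuple of components."""
    return math.sqrt(sum(c * c for c in u))

def scale(c, u):
    """Scalar multiple c*u, computed component-wise."""
    return tuple(c * ui for ui in u)

def unit(a):
    """Unit vector a/||a||; undefined for the zero vector."""
    length = norm(a)
    if length == 0:
        raise ValueError("the zero vector has no direction")
    return scale(1 / length, a)

a = (2, -4, -3)            # arbitrary sample vector
lhs = norm(scale(-3, a))   # ||c u|| with c = -3
rhs = abs(-3) * norm(a)    # |c| ||u||
u = unit(a)
print(lhs, rhs, norm(u))
```

Up to floating-point rounding, `lhs` and `rhs` agree (Theorem 1) and `norm(u)` is 1 (Theorem 2).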

5. Dot product & angle

The dot product of two vectors $\mathbf{a} = \langle a_1, a_2, a_3 \rangle$ and $\mathbf{b} = \langle b_1, b_2, b_3 \rangle$ is defined
to be
$$\mathbf{a} \cdot \mathbf{b} = a_1 b_1 + a_2 b_2 + a_3 b_3.$$

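Theorem 3 (Properties of Dot Product). For vectors $\mathbf{a}$, $\mathbf{b}$ and $\mathbf{c}$ and any
scalar $d$,
(i) $\mathbf{a} \cdot \mathbf{b} = \mathbf{b} \cdot \mathbf{a}$ (commutativity)
(ii) $\mathbf{a} \cdot (\mathbf{b} + \mathbf{c}) = \mathbf{a} \cdot \mathbf{b} + \mathbf{a} \cdot \mathbf{c}$ (distributive law)
(iii) $(d\mathbf{a}) \cdot \mathbf{b} = d(\mathbf{a} \cdot \mathbf{b}) = \mathbf{a} \cdot (d\mathbf{b})$
(iv) $\mathbf{0} \cdot \mathbf{a} = 0$
(v) $\mathbf{a} \cdot \mathbf{a} = \|\mathbf{a}\|^2$.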

Notice a · b = 0 does not imply that a = 0 or b = 0.

For two nonzero vectors $\mathbf{a}$ and $\mathbf{b}$ in $\mathbb{R}^3$, we define the angle $\theta$ between them
to be the smaller angle between $\mathbf{a}$ and $\mathbf{b}$ when placing their initial points
together.

Note that the larger angle is 2π − θ. So

0 ≤ θ ≤ π.

Some special cases:


• a and b have the same direction iff θ = 0.
• a and b have opposite directions iff θ = π.
• a and b are orthogonal (perpendicular) iff θ = π/2.
Theorem 4 (Dot product angle formula). Let θ be the angle between nonzero
vectors a and b. Then
a · b = ||a|| ||b|| cos θ.

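Example 4. Find the angle between the vectors $\mathbf{a} = \langle 2, 1, -3 \rangle$ and $\mathbf{b} = \langle 1, 5, 6 \rangle$.
Solution.
$$\cos\theta = \frac{\mathbf{a} \cdot \mathbf{b}}{\|\mathbf{a}\|\,\|\mathbf{b}\|} = \frac{-11}{\sqrt{14}\,\sqrt{62}},$$
$$\theta = \cos^{-1}\left( \frac{-11}{\sqrt{14}\,\sqrt{62}} \right) \approx 1.953 \text{ radians}.$$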
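
The computation in Example 4 can be reproduced with a short Python sketch. The helper names `dot`, `norm` and `angle` are illustrative, and the clamping of the cosine to $[-1, 1]$ is an added safeguard against rounding, not something the notes require.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def angle(a, b):
    """Angle in radians between two nonzero vectors, via a.b = ||a|| ||b|| cos(theta)."""
    if norm(a) == 0 or norm(b) == 0:
        raise ValueError("angle is only defined for nonzero vectors")
    c = dot(a, b) / (norm(a) * norm(b))
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    return math.acos(max(-1.0, min(1.0, c)))

theta = angle((2, 1, -3), (1, 5, 6))
print(theta)
```

Because `math.acos` returns values in $[0, \pi]$, the result automatically satisfies $0 \le \theta \le \pi$.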


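Theorem 5. Two nonzero vectors $\mathbf{a}$ and $\mathbf{b}$ are orthogonal if and only if
$$\mathbf{a} \cdot \mathbf{b} = 0.$$
Proof. Since $\mathbf{a}$ and $\mathbf{b}$ are nonzero, $\|\mathbf{a}\|\,\|\mathbf{b}\| \neq 0$. Hence
$$\|\mathbf{a}\|\,\|\mathbf{b}\| \cos\theta = \mathbf{a} \cdot \mathbf{b} = 0$$
if and only if $\cos\theta = 0$, if and only if $\theta = \pi/2$ (recall $0 \le \theta \le \pi$), which is equivalent to saying
that $\mathbf{a}$ and $\mathbf{b}$ are orthogonal. $\blacksquare$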

6. Projections

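Let $\mathbf{a} = \overrightarrow{PQ}$ and $\mathbf{b} = \overrightarrow{PR}$, and let $S$ be the foot of the perpendicular from $R$ to the line containing $\overrightarrow{PQ}$.

The vector $\overrightarrow{PS}$ is called the vector projection of $\mathbf{b}$ onto $\mathbf{a}$, denoted by
$$\operatorname{proj}_{\mathbf{a}} \mathbf{b}.$$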

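The scalar projection of $\mathbf{b}$ onto $\mathbf{a}$ (also called the component of $\mathbf{b}$ along $\mathbf{a}$)
is defined to be the signed magnitude of the vector projection:
$$\operatorname{comp}_{\mathbf{a}} \mathbf{b} = \|\mathbf{b}\| \cos\theta = \frac{\mathbf{a} \cdot \mathbf{b}}{\|\mathbf{a}\|}.$$
This value is negative if $\pi/2 < \theta \le \pi$, where $\theta$ is the angle between $\mathbf{a}$ and $\mathbf{b}$.
Therefore,
$$\operatorname{proj}_{\mathbf{a}} \mathbf{b} = \operatorname{comp}_{\mathbf{a}} \mathbf{b}\,\frac{\mathbf{a}}{\|\mathbf{a}\|} = \left( \frac{\mathbf{a} \cdot \mathbf{b}}{\|\mathbf{a}\|} \right) \frac{\mathbf{a}}{\|\mathbf{a}\|} = \frac{\mathbf{a} \cdot \mathbf{b}}{\|\mathbf{a}\|^2}\,\mathbf{a} = \frac{\mathbf{a} \cdot \mathbf{b}}{\mathbf{a} \cdot \mathbf{a}}\,\mathbf{a}.$$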
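Example 5. Let $\mathbf{a} = \langle -2, 3, 1 \rangle$ and $\mathbf{b} = \langle 1, 1, 2 \rangle$. Find the scalar projection
and vector projection of $\mathbf{b}$ onto $\mathbf{a}$.
Solution.
$$\operatorname{comp}_{\mathbf{a}} \mathbf{b} = \frac{\mathbf{a} \cdot \mathbf{b}}{\|\mathbf{a}\|} = \frac{(-2)(1) + 3(1) + 1(2)}{\sqrt{14}} = \frac{3}{\sqrt{14}}.$$
$$\operatorname{proj}_{\mathbf{a}} \mathbf{b} = \frac{3}{\sqrt{14}}\,\frac{\mathbf{a}}{\|\mathbf{a}\|} = \frac{3}{14}\,\mathbf{a} = \left\langle -\frac{3}{7},\ \frac{9}{14},\ \frac{3}{14} \right\rangle.$$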
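
Example 5 can be verified with exact rational arithmetic. This is a sketch under the assumption of integer components; the function names `scalar_projection` and `vector_projection` are chosen for illustration.

```python
from fractions import Fraction
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def scalar_projection(a, b):
    """comp_a b = (a . b) / ||a||, as a float."""
    return dot(a, b) / math.sqrt(dot(a, a))

def vector_projection(a, b):
    """proj_a b = ((a . b) / (a . a)) a, exact for integer components."""
    k = Fraction(dot(a, b), dot(a, a))
    return tuple(k * ai for ai in a)

a = (-2, 3, 1)
b = (1, 1, 2)
comp = scalar_projection(a, b)
proj = vector_projection(a, b)
print(comp, proj)
```

Using the form $\frac{\mathbf{a} \cdot \mathbf{b}}{\mathbf{a} \cdot \mathbf{a}}\,\mathbf{a}$ avoids square roots entirely, which is why the vector projection can be kept exact here.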


7. Cross product

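For two vectors $\mathbf{a} = \langle a_1, a_2, a_3 \rangle$ and $\mathbf{b} = \langle b_1, b_2, b_3 \rangle$, define the cross product
of $\mathbf{a}$ and $\mathbf{b}$ to be
$$\mathbf{a} \times \mathbf{b} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix} = \begin{vmatrix} a_2 & a_3 \\ b_2 & b_3 \end{vmatrix} \mathbf{i} - \begin{vmatrix} a_1 & a_3 \\ b_1 & b_3 \end{vmatrix} \mathbf{j} + \begin{vmatrix} a_1 & a_2 \\ b_1 & b_2 \end{vmatrix} \mathbf{k}$$
$$= (a_2 b_3 - a_3 b_2)\mathbf{i} - (a_1 b_3 - a_3 b_1)\mathbf{j} + (a_1 b_2 - a_2 b_1)\mathbf{k}.$$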

To compute a × b, we must write the components of a in the second row of
the determinant, and the components of b in the third row. The order is
important!

Theorem 6. The vector a × b is orthogonal to both a and b.

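Proof. To show $\mathbf{a} \times \mathbf{b}$ is orthogonal to $\mathbf{a}$, we compute their dot product as
follows:
$$(\mathbf{a} \times \mathbf{b}) \cdot \mathbf{a} = \begin{vmatrix} a_2 & a_3 \\ b_2 & b_3 \end{vmatrix} a_1 - \begin{vmatrix} a_1 & a_3 \\ b_1 & b_3 \end{vmatrix} a_2 + \begin{vmatrix} a_1 & a_2 \\ b_1 & b_2 \end{vmatrix} a_3$$
$$= (a_2 b_3 - a_3 b_2) a_1 - (a_1 b_3 - a_3 b_1) a_2 + (a_1 b_2 - a_2 b_1) a_3 = 0.$$

A similar computation shows that $(\mathbf{a} \times \mathbf{b}) \cdot \mathbf{b} = 0$. $\blacksquare$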

The vector a × b points in a direction perpendicular to a and b. This can


be given by the right-hand rule as follows:

Theorem 7 (Cross product angle formula). If $\theta$ is the angle between $\mathbf{a}$ and
$\mathbf{b}$, then
$$\|\mathbf{a} \times \mathbf{b}\| = \|\mathbf{a}\|\,\|\mathbf{b}\| \sin\theta.$$

Theorem 8 (Properties of cross product). If a, b and c are vectors and d


is a scalar, then

(i) a × b = −b × a

(ii) (da) × b = d(a × b) = a × (db)

(iii) a × (b + c) = a × b + a × c

(iv) (a + b) × c = a × c + b × c
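
The component formula for the cross product, Theorem 6 and property (i) can be checked on a concrete pair of vectors. The sketch below uses illustrative helper names and an arbitrary integer example.

```python
def cross(a, b):
    """Cross product a x b from the cofactor expansion along the first row."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2, -(a1 * b3 - a3 * b1), a1 * b2 - a2 * b1)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

a = (2, -4, 4)    # arbitrary sample vectors
b = (4, -1, -2)
n = cross(a, b)
print(n, dot(n, a), dot(n, b), cross(b, a))
```

Swapping the arguments flips the sign of every component, which is the computational face of the remark that the order of the rows matters.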

We can use cross product

• to find the area of a parallelogram

• to find the distance from a point to a line in R3 .

Finding area of the parallelogram spanned by a & b:

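Notice that if we take the base of the parallelogram to be given by $\mathbf{a}$, then the
height of the parallelogram is
$$\|\mathbf{b}\| \sin\theta,$$
where $\theta$ is the angle between $\mathbf{a}$ and $\mathbf{b}$. Hence, the area of the parallelogram
is
$$\text{base} \times \text{height} = \|\mathbf{a}\| \times \|\mathbf{b}\| \sin\theta = \|\mathbf{a} \times \mathbf{b}\|.$$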

Finding the distance from a point to a line:

The distance from the point $Q$ to the line $L$ is the shortest distance
between $Q$ and a point of $L$, i.e. the length of the segment joining $Q$ to the
line $L$ that is perpendicular to $L$. Suppose $P$ and $R$ are two distinct points on $L$. From the
figure above, the distance is
$$\|\overrightarrow{PQ}\| \sin\theta,$$
where $\theta$ is the angle between $\overrightarrow{PQ}$ and $\overrightarrow{PR}$.
Using the cross product angle formula, this can be written as
$$\|\overrightarrow{PQ}\| \sin\theta = \frac{\|\overrightarrow{PQ} \times \overrightarrow{PR}\|}{\|\overrightarrow{PR}\|}.$$
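
Both applications of the cross product translate directly into code. The following is a minimal sketch; the function names and the test points are hypothetical and chosen so that the answers are easy to check by hand.

```python
import math

def sub(p, q):
    """Vector from q to p, i.e. p - q."""
    return tuple(pi - qi for pi, qi in zip(p, q))

def cross(a, b):
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2, -(a1 * b3 - a3 * b1), a1 * b2 - a2 * b1)

def norm(a):
    return math.sqrt(sum(c * c for c in a))

def parallelogram_area(a, b):
    """Area of the parallelogram spanned by a and b: ||a x b||."""
    return norm(cross(a, b))

def point_line_distance(q, p, r):
    """Distance from Q to the line through distinct points P and R: ||PQ x PR|| / ||PR||."""
    pq, pr = sub(q, p), sub(r, p)
    if norm(pr) == 0:
        raise ValueError("P and R must be distinct points")
    return norm(cross(pq, pr)) / norm(pr)

area = parallelogram_area((2, 0, 0), (0, 3, 0))                # a 2-by-3 rectangle
dist = point_line_distance((0, 3, 4), (0, 0, 0), (1, 0, 0))    # Q above the x-axis
print(area, dist)
```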

8. Equation of a line

To locate a particular line L in space, we need

• A point say P0 (x0 , y0 , z0 ) on the line L.

• A vector v whose direction is parallel to the line L.

It is enough to describe an arbitrary point $P(x, y, z)$ on the line $L$ by
describing the position vector $\mathbf{r} = \langle x, y, z \rangle$ of $P$.

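By the Triangle Law of Addition,

$$\mathbf{r} = \mathbf{r}_0 + \mathbf{a},$$

where $\mathbf{r}_0$ is the position vector of $P_0$ and $\mathbf{a} = \overrightarrow{P_0 P}$. Since $\mathbf{a}$ is parallel to $\mathbf{v}$, we can write

$$\mathbf{a} = t\mathbf{v} \quad \text{for some } t \in \mathbb{R}.$$

Hence,
$$\mathbf{r} = \mathbf{r}_0 + t\mathbf{v}, \quad t \in \mathbb{R}.$$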

This is called a vector equation of L.

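We can also write the vector equation in component form. Let

$$\mathbf{v} = \langle a, b, c \rangle, \quad \mathbf{r}_0 = \langle x_0, y_0, z_0 \rangle, \quad \mathbf{r} = \langle x, y, z \rangle.$$

Then
$$\mathbf{r} = \mathbf{r}_0 + t\mathbf{v}$$
$$\langle x, y, z \rangle = \langle x_0, y_0, z_0 \rangle + t\langle a, b, c \rangle.$$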

By comparing components, we have a set of parametric equations of the line:

Theorem 9 ( Parametric Equation of Line).

x = x0 + at, y = y0 + bt, z = z0 + ct.

Remarks:

• Usually the parameter t (in the parametric equation of line) takes values
on the entire R or an interval I.

• The numbers a, b and c are called direction numbers of the line L.
These numbers are not unique.

• The vector equation and parametric equations of a line are not unique.

Example 7. Find an equation of the line passing through P (1, 2, −1) and
Q(5, −3, 4).
Solution. A vector parallel to the line is
$$\overrightarrow{PQ} = \langle 5 - 1,\ -3 - 2,\ 4 - (-1) \rangle = \langle 4, -5, 5 \rangle.$$

A fixed point on the line is P (1, 2, −1). Thus a set of parametric equations
for the line is

x = 1 + 4t, y = 2 − 5t, z = −1 + 5t.




In general, how do two lines relate to each other?

In 2-D, two lines are either parallel or intersect.

In 3-D, two lines are either


• parallel;
• non-parallel and intersect; or
• non-parallel and non-intersecting (skew lines)

Let $L_1$ and $L_2$ be two lines in $\mathbb{R}^3$, with direction vectors $\mathbf{a}$ and $\mathbf{b}$, respectively,
and let $\theta$ be the angle between $\mathbf{a}$ and $\mathbf{b}$.

• The lines L1 and L2 are parallel whenever a and b are parallel.


• If L1 and L2 intersect then θ is an angle between L1 and L2 . Notice
π − θ is also an angle between the lines.

Example 8. Show that the lines

L1 : x − 2 = −t, y − 1 = 2t, z − 5 = 2t,

L2 : x − 1 = s, y − 2 = −s, z − 1 = 3s,
are skew.

Solution. A vector parallel to $L_1$ is $\mathbf{a} = \langle -1, 2, 2 \rangle$ and a vector parallel to
$L_2$ is $\mathbf{b} = \langle 1, -1, 3 \rangle$. Since $\mathbf{a}$ is not a scalar multiple of $\mathbf{b}$, these lines are not
parallel.
Rearranging, we have

L1 : x = 2 − t, y = 1 + 2t, z = 5 + 2t,

L2 : x = 1 + s, y = 2 − s, z = 1 + 3s,
Assume for a contradiction that L1 and L2 intersect.
Then there must exist a choice of the parameter t and s such that the values
of x, y and z are the same.

• The x-coordinate must satisfy

2 − t = 1 + s,

so that s = 1 − t.

• The y-coordinate must satisfy

y = 1 + 2t = 2 − s.

Substituting s = 1 − t into the last equation, we have

t = 0, s = 1.

Now, the z-coordinate must satisfy

z = 5 + 2t = 5,

z = 1 + 3s = 4,
which is absurd!

Hence our assumption that $L_1$ and $L_2$ intersect was wrong. Since the lines are neither
parallel nor intersecting, they are skew, as desired. $\blacksquare$
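
The argument of Example 8 (solve the x- and y-equations for $t$ and $s$, then test the z-equation) can be sketched in Python as follows. The function names are hypothetical, and the sketch assumes the two lines are already known to be non-parallel and that the x/y system has a unique solution.

```python
from fractions import Fraction

def solve_xy(p1, a, p2, b):
    """Solve p1 + t*a = p2 + s*b in the x- and y-coordinates only (Cramer's rule)."""
    d1, d2 = p2[0] - p1[0], p2[1] - p1[1]
    det = b[0] * a[1] - a[0] * b[1]
    if det == 0:
        raise ValueError("x/y system is singular; use another pair of coordinates")
    t = Fraction(b[0] * d2 - b[1] * d1, det)
    s = Fraction(a[0] * d2 - a[1] * d1, det)
    return t, s

def lines_intersect(p1, a, p2, b):
    """True if the z-coordinates also agree for the (t, s) found from x and y."""
    t, s = solve_xy(p1, a, p2, b)
    return p1[2] + t * a[2] == p2[2] + s * b[2]

# Data of Example 8: L1 through (2, 1, 5) with direction <-1, 2, 2>,
# L2 through (1, 2, 1) with direction <1, -1, 3>.
p1, a = (2, 1, 5), (-1, 2, 2)
p2, b = (1, 2, 1), (1, -1, 3)
t, s = solve_xy(p1, a, p2, b)
meets = lines_intersect(p1, a, p2, b)
print(t, s, meets)
```

Exact fractions are used so that the final comparison of z-coordinates is not affected by rounding.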

9. Equation of a plane

To locate a particular plane in space, we need


• A point say P0 (x0 , y0 , z0 ) on the plane.
• A vector n whose direction is perpendicular to the plane.
How do we describe an arbitrary point P (x, y, z) on the plane?

Let $\mathbf{r}$ and $\mathbf{r}_0$ denote the position vectors of $P$ and $P_0$ respectively. The aim
is to describe the position vector $\mathbf{r}$ of the point $P$.

Notice that $\mathbf{r} - \mathbf{r}_0$ is represented by $\overrightarrow{P_0 P}$.

The normal vector $\mathbf{n}$ (which is orthogonal to the plane) is always orthogonal
to $\mathbf{r} - \mathbf{r}_0$. Hence,
$$(\mathbf{r} - \mathbf{r}_0) \cdot \mathbf{n} = 0.$$

Theorem 10 (Vector Equation of Plane).


n · (r − r0 ) = 0
which can be written as
n · r = n · r0 .

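In terms of the components, we have

$$\mathbf{n} = \langle a, b, c \rangle, \quad \mathbf{r} = \langle x, y, z \rangle, \quad \mathbf{r}_0 = \langle x_0, y_0, z_0 \rangle.$$

Then $\mathbf{n} \cdot \mathbf{r} = \mathbf{n} \cdot \mathbf{r}_0$ becomes

$$\langle a, b, c \rangle \cdot \langle x, y, z \rangle = \langle a, b, c \rangle \cdot \langle x_0, y_0, z_0 \rangle$$
$$ax + by + cz = ax_0 + by_0 + cz_0,$$

which is called the linear equation of the plane.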

Theorem 11 (Linear Equation of Plane).

ax + by + cz = d,

where
d = ax0 + by0 + cz0 .

Example 9. Find an equation of the plane that passes through the points
P (1, 3, 2), Q(3, −1, 6), R(5, 2, 0).

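Solution. First, we need a vector $\mathbf{n}$ orthogonal to the plane. This can be
given by
$$\mathbf{n} = \overrightarrow{PQ} \times \overrightarrow{PR}.$$
Notice
$$\overrightarrow{PQ} = \langle 2, -4, 4 \rangle, \quad \overrightarrow{PR} = \langle 4, -1, -2 \rangle.$$
So
$$\mathbf{n} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 2 & -4 & 4 \\ 4 & -1 & -2 \end{vmatrix} = 12\mathbf{i} + 20\mathbf{j} + 14\mathbf{k}.$$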

With the point P (1, 3, 2) and the normal vector n, an equation of the plane
is:
12(x) + 20(y) + 14(z) = 12(1) + 20(3) + 14(2)
or after simplifications,
6x + 10y + 7z = 50.
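
The recipe of Example 9 (normal from a cross product, then $d = \mathbf{n} \cdot \mathbf{r}_0$) can be sketched as below. The function name `plane_through` is hypothetical, and the sketch assumes integer coordinates so that the common factor can be divided out exactly.

```python
from functools import reduce
from math import gcd

def sub(p, q):
    return tuple(pi - qi for pi, qi in zip(p, q))

def cross(a, b):
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2, -(a1 * b3 - a3 * b1), a1 * b2 - a2 * b1)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def plane_through(p, q, r):
    """Return ((a, b, c), d) with ax + by + cz = d for three non-collinear integer points."""
    n = cross(sub(q, p), sub(r, p))          # normal vector PQ x PR
    if n == (0, 0, 0):
        raise ValueError("the three points are collinear")
    d = dot(n, p)                            # d = n . r0
    g = reduce(gcd, (abs(c) for c in (*n, d)))  # common integer factor
    return tuple(c // g for c in n), d // g

normal, d = plane_through((1, 3, 2), (3, -1, 6), (5, 2, 0))
print(normal, d)
```

Dividing by the common factor mirrors the "after simplifications" step; any nonzero multiple of the equation describes the same plane.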


How do two planes relate to each other?

Two planes are parallel if their normal vectors are parallel.

If two planes are not parallel, then


• They intersect in a straight line.

• An angle between the two planes is the angle θ between their normal
vectors. Notice π − θ is also an angle between the planes.

Example 10. (a) Find the angle between the planes x + 2y + z = 3 and
x − 4y + 3z = 5.

(b) Find the line of intersection of these two planes.

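Solution. (a) The normal vectors of these planes are

$$\mathbf{n}_1 = \langle 1, 2, 1 \rangle, \quad \mathbf{n}_2 = \langle 1, -4, 3 \rangle.$$

So, if $\theta$ is the angle between them, then
$$\theta = \cos^{-1}\left( \frac{\mathbf{n}_1 \cdot \mathbf{n}_2}{\|\mathbf{n}_1\|\,\|\mathbf{n}_2\|} \right) = \cos^{-1}\left( \frac{1(1) + 2(-4) + 1(3)}{\sqrt{1 + 4 + 1}\,\sqrt{1 + 16 + 9}} \right) = \cos^{-1}\left( \frac{-4}{\sqrt{156}} \right) \approx 108.7^\circ.$$
An angle between the planes is $108.7^\circ$. You can also give the other angle,
which is $71.3^\circ$.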
(b) Solving both equations for x,

x = 3 − 2y − z and x = 5 + 4y − 3z.

Setting them to be equal:

3 − 2y − z = 5 + 4y − 3z

z = 3y + 1
x = 3 − 2y − (3y + 1) = −5y + 2.
Let y = t be the parameter, we obtain a parametric equation for the line of
intersection

x = −5t + 2, y = t, z = 3t + 1.
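
Both parts of Example 10 can be sanity-checked numerically. The sketch below recomputes the angle and confirms that sample points of the parametric line satisfy both plane equations; the variable names are illustrative.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

n1, d1 = (1, 2, 1), 3     # plane x + 2y + z = 3
n2, d2 = (1, -4, 3), 5    # plane x - 4y + 3z = 5

# (a) angle between the normal vectors, in degrees
theta_deg = math.degrees(math.acos(dot(n1, n2) / math.sqrt(dot(n1, n1) * dot(n2, n2))))

# (b) the claimed line of intersection
def line_point(t):
    return (-5 * t + 2, t, 3 * t + 1)

on_both = all(
    dot(n1, line_point(t)) == d1 and dot(n2, line_point(t)) == d2
    for t in range(-3, 4)
)
print(theta_deg, on_both)
```

Checking a handful of parameter values is not a proof, but two distinct points on both planes already determine the line of intersection.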


10. Examples on plane

Example 11. Find an equation of the plane containing the point (0, 1, 2)
and the line given by

x = t, y = t, z = 2t + 5, t ∈ R.

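Solution. A point on the line is $(0, 0, 5)$ (choose $t = 0$).

Two vectors parallel to the plane are $\langle 0, 0, 5 \rangle - \langle 0, 1, 2 \rangle = \langle 0, -1, 3 \rangle$ and the direction vector $\langle 1, 1, 2 \rangle$ of the line.
A normal to the plane is

$$\langle 1, 1, 2 \rangle \times \langle 0, -1, 3 \rangle = \langle 5, -3, -1 \rangle.$$

So, the equation of the plane is

$$\langle 5, -3, -1 \rangle \cdot \langle x, y, z - 5 \rangle = 0,$$

$$5x - 3y - z + 5 = 0.$$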


Example 12. Find the (shortest) distance between the following planes:

x + 2y + 5z = 6, x + 2y + 5z = 11.

Solution. A point on the plane x + 2y + 5z = 6 is (6, 0, 0). A point on the


plane x + 2y + 5z = 11 is (11, 0, 0).

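Let
$$\mathbf{u} = \langle 11, 0, 0 \rangle - \langle 6, 0, 0 \rangle = \langle 5, 0, 0 \rangle.$$
A normal vector to the planes is

$$\mathbf{n} = \langle 1, 2, 5 \rangle.$$

Let $\theta$ be the angle between $\mathbf{u}$ and $\mathbf{n}$.

The distance is (why?)

$$\|\mathbf{u}\|\,|\cos\theta| = \frac{|\mathbf{u} \cdot \mathbf{n}|}{\|\mathbf{n}\|} = \frac{5}{\sqrt{30}}.$$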
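
The formula $|\mathbf{u} \cdot \mathbf{n}| / \|\mathbf{n}\|$ can be tried with different choices of points on the two planes. In this sketch the function name is hypothetical, and the second pair of points is an extra choice made here to illustrate that the result does not depend on which points are picked.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def distance_between_parallel_planes(n, p1, p2):
    """|u . n| / ||n||, where u joins a point p1 on one plane to a point p2 on the other."""
    u = tuple(b - a for a, b in zip(p1, p2))
    return abs(dot(u, n)) / math.sqrt(dot(n, n))

n = (1, 2, 5)
dist = distance_between_parallel_planes(n, (6, 0, 0), (11, 0, 0))  # points used in Example 12
alt = distance_between_parallel_planes(n, (0, 3, 0), (1, 0, 2))    # other points on the same planes
print(dist, alt)
```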


11. Finding angle between planes

Example 13. Find the angle between adjacent sides i.e. the angle between
the plane OP T S and the plane P QU T of the following symmetric-looking
chute.

Solution. Here is the view from the top of the symmetric-looking chute:

Place the chute in our coordinate system so that O(0, 0, 0), P (6, 0, 0), Q(6, 6, 0)
and R(0, 6, 0).
Then we have:

S(−1, −1, 8), T (7, −1, 8), U (7, 7, 8), V (−1, 7, 8).

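We need to find the angle between the plane $\Pi_1$, which contains the vectors
$\overrightarrow{OP}$ and $\overrightarrow{OS}$, and the plane $\Pi_2$, which contains the vectors $\overrightarrow{PQ}$ and $\overrightarrow{PT}$.

Let
$$\mathbf{u} = \overrightarrow{OP} \times \overrightarrow{OS} = \langle 0, -48, -6 \rangle,$$
$$\mathbf{v} = \overrightarrow{PQ} \times \overrightarrow{PT} = \langle 48, 0, -6 \rangle.$$

The angle between $\mathbf{u}$ and $\mathbf{v}$ is
$$\cos^{-1}\left( \frac{\mathbf{u} \cdot \mathbf{v}}{\|\mathbf{u}\|\,\|\mathbf{v}\|} \right) = \cos^{-1}\left( \frac{36}{48^2 + 36} \right) \approx 89^\circ.$$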
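
The numbers in Example 13 can be reproduced from the coordinates of the chute. This is a sketch only; the helper names are illustrative.

```python
import math

def sub(p, q):
    return tuple(pi - qi for pi, qi in zip(p, q))

def cross(a, b):
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2, -(a1 * b3 - a3 * b1), a1 * b2 - a2 * b1)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

O, P, Q = (0, 0, 0), (6, 0, 0), (6, 6, 0)
S, T = (-1, -1, 8), (7, -1, 8)

u = cross(sub(P, O), sub(S, O))   # normal to the plane OPTS
v = cross(sub(Q, P), sub(T, P))   # normal to the plane PQUT
angle_deg = math.degrees(math.acos(dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))))
print(u, v, angle_deg)
```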


12. Vector functions of one variable

The goal of this section is to define and visualize vector function of one
variable r(t).

A vector-valued function is

$$\mathbf{r}(t) = \langle f(t), g(t), h(t) \rangle = f(t)\mathbf{i} + g(t)\mathbf{j} + h(t)\mathbf{k}.$$

The scalar functions $f$, $g$ and $h$ are called the component functions of $\mathbf{r}$.

Vector equations of lines are examples of vector functions $\mathbf{r}(t)$.

The vector function r(t) traces out the curve C:

We say that r(t) is a parametrization of C.

A curve $C$ can have more than one parametrization.

For example, both

$$\mathbf{r}(t) = \langle t, t^2 \rangle, \quad t \in \mathbb{R}$$

$$\mathbf{r}(t) = \langle t^3, t^6 \rangle, \quad t \in \mathbb{R}$$

parametrize the same parabola $y = x^2$ in the xy-plane.


But then, what’s the difference? See discussion at the end of the next section.

Example 14. Sketch the curve traced out by the vector-valued function

$$\mathbf{r}(t) = \sin t\,\mathbf{i} - 3\cos t\,\mathbf{j} + 2t\,\mathbf{k}.$$

Solution. With $x = \sin t$ and $y = -3\cos t$, we have
$$x^2 + \left( \frac{y}{3} \right)^2 = \sin^2 t + \cos^2 t = 1,$$
which is the equation of an ellipse in 2-D. In 3-D, however, it becomes the
equation of an elliptic cylinder whose axis is the z-axis.
Since $z = 2t$, the curve will wind its way up the cylinder anticlockwise as $t$ increases. We
call this curve an elliptical helix.

13. Tangent vectors

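The derivative of $\mathbf{r}(t)$ at $t = a$, defined by
$$\mathbf{r}'(a) = \lim_{\Delta t \to 0} \frac{\mathbf{r}(a + \Delta t) - \mathbf{r}(a)}{\Delta t},$$
can be regarded as the rate of change of $\mathbf{r}(t)$ at $t = a$.

Just like for a single-variable function $f(x)$, the derivative
$$f'(a) = \lim_{\Delta x \to 0} \frac{f(a + \Delta x) - f(a)}{\Delta x}$$
can be regarded as the rate of change of $f$ at $x = a$.

Notice that for $\Delta t > 0$, the vector $\frac{\mathbf{r}(a + \Delta t) - \mathbf{r}(a)}{\Delta t}$ points in the same direction
as $\mathbf{r}(a + \Delta t) - \mathbf{r}(a)$.
As $\Delta t \to 0$, $\frac{\mathbf{r}(a + \Delta t) - \mathbf{r}(a)}{\Delta t}$ approaches $\mathbf{r}'(a)$.

This is a vector tangent to the curve at $\mathbf{r}(a)$. We also call $\mathbf{r}'(a)$ a tangent
vector to the curve at $t = a$.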

An interpretation:

$\mathbf{r}(t)$ = the position vector of a particle in space at time $t$

$\mathbf{r}'(t)$ = the velocity of the particle at time $t$. Hence,

$\|\mathbf{r}'(t)\|$ = speed of the particle at time $t$.

We seldom compute r0 (a) from the definition.

It turns out that we can just differentiate

‘component-wise’ !!!

Theorem 12 (Derivative of Vector-valued Function). Let $\mathbf{r}(t) = \langle f(t), g(t), h(t) \rangle$
and suppose that the components $f$, $g$ and $h$ are all differentiable at $t = a$.

Then $\mathbf{r}$ is differentiable at $t = a$ and its derivative is given by

$$\mathbf{r}'(a) = \langle f'(a), g'(a), h'(a) \rangle.$$

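Theorem 13 (Derivative Rules). Suppose $\mathbf{r}(t)$ and $\mathbf{s}(t)$ are differentiable
vector-valued functions, $f(t)$ is a differentiable scalar function and $c$ is a
scalar constant. Then

(i) $\frac{d}{dt}\big(\mathbf{r}(t) + \mathbf{s}(t)\big) = \mathbf{r}'(t) + \mathbf{s}'(t)$

(ii) $\frac{d}{dt}\big(c\,\mathbf{r}(t)\big) = c\,\mathbf{r}'(t)$

(iii) $\frac{d}{dt}\big(f(t)\,\mathbf{r}(t)\big) = f'(t)\,\mathbf{r}(t) + f(t)\,\mathbf{r}'(t)$

(iv) $\frac{d}{dt}\big(\mathbf{r}(t) \cdot \mathbf{s}(t)\big) = \mathbf{r}'(t) \cdot \mathbf{s}(t) + \mathbf{r}(t) \cdot \mathbf{s}'(t)$

(v) $\frac{d}{dt}\big(\mathbf{r}(t) \times \mathbf{s}(t)\big) = \mathbf{r}'(t) \times \mathbf{s}(t) + \mathbf{r}(t) \times \mathbf{s}'(t)$.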

Example 15. Find the tangent line L to the curve r(t) = hcos t, sin t, ti at
(0, 1, π/2).

Solution. At the point $(0, 1, \pi/2)$, we have $t = \pi/2$.

Since $\mathbf{r}'(t) = \langle -\sin t, \cos t, 1 \rangle$, a direction of the tangent line $L$ is given
by the tangent vector

$$\mathbf{r}'(\pi/2) = \langle -\sin(\pi/2), \cos(\pi/2), 1 \rangle = \langle -1, 0, 1 \rangle.$$

So a parametric equation of $L$ is

$$x = 0 + (-1)t, \quad y = 1 + (0)t, \quad z = \pi/2 + (1)t,$$

that is
$$x = -t, \quad y = 1, \quad z = \pi/2 + t.$$


Recall the following example from the previous section. Both of the following
vector functions parametrize the same parabola $y = x^2$ in the xy-plane:

$$\mathbf{r}(t) = \langle t, t^2 \rangle, \quad t \in \mathbb{R}$$

$$\mathbf{r}(t) = \langle t^3, t^6 \rangle, \quad t \in \mathbb{R}$$
However, their respective tangent vectors are different:

$$\mathbf{r}'(t) = \langle 1, 2t \rangle, \quad t \in \mathbb{R}$$

$$\mathbf{r}'(t) = \langle 3t^2, 6t^5 \rangle, \quad t \in \mathbb{R}$$

Hence, the ‘velocities’ of the tracing of the curve are different. For example,
at $t = 0$, the first vector function gives the velocity $\langle 1, 0 \rangle$, whereas the second
vector function gives the velocity $\langle 0, 0 \rangle$ at the same point $(0, 0)$.
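
The difference between the two parametrizations can be seen numerically by evaluating the difference quotient from the definition of $\mathbf{r}'(a)$ with a small step. The function name `difference_quotient` and the step size are assumptions made for this sketch.

```python
def difference_quotient(r, a, dt=1e-6):
    """(r(a + dt) - r(a)) / dt, an approximation of r'(a) for small dt > 0."""
    ahead, here = r(a + dt), r(a)
    return tuple((p - q) / dt for p, q in zip(ahead, here))

def r1(t):
    return (t, t ** 2)

def r2(t):
    return (t ** 3, t ** 6)

v1 = difference_quotient(r1, 0.0)   # close to <1, 0>
v2 = difference_quotient(r2, 0.0)   # close to <0, 0>
print(v1, v2)
```

Both functions pass through the point $(0, 0)$ at $t = 0$, but only the first has nonzero velocity there.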

14. Arc Length

A natural question about a curve in space is ‘How long is it?’. This is the
arc length of the curve. Our formula for arc length only applies to smooth
curves, i.e. curves that do not have sharp corners at their interior points.

The curve above is not smooth.


Our assumption:
Suppose that a smooth curve is traced out by the endpoint of $\mathbf{r}(t) = \langle f(t), g(t), h(t) \rangle$,
where

• $f$, $f'$, $g$, $g'$, $h$, $h'$ are all continuous for $t \in [a, b]$; and

• the curve is traversed exactly once as $t$ increases from $a$ to $b$.

We can approximate the arc length of this curve as follows:

Step 1.

Partition the interval $[a, b]$ into $n$ subintervals of equal size: $a = t_0 < t_1 <
\cdots < t_n = b$, where $t_i - t_{i-1} = \Delta t = \frac{b - a}{n}$ for all $i = 1, 2, \ldots, n$.
Step 2.

Let $s_i$ denote the arc length of that portion of the curve traced out as $t$
increases from $t_{i-1}$ to $t_i$.

We can approximate $s_i$ by the distance of the point $(f(t_i), g(t_i), h(t_i))$ from
$(f(t_{i-1}), g(t_{i-1}), h(t_{i-1}))$ (since $f$, $g$ and $h$ are continuous).

By the distance formula,
$$s_i \approx \sqrt{(f(t_i) - f(t_{i-1}))^2 + (g(t_i) - g(t_{i-1}))^2 + (h(t_i) - h(t_{i-1}))^2}.$$
On the other hand, we have (why?)
$$f(t_i) - f(t_{i-1}) = f'(c_i)(t_i - t_{i-1}) = f'(c_i)\,\Delta t$$
$$g(t_i) - g(t_{i-1}) = g'(d_i)(t_i - t_{i-1}) = g'(d_i)\,\Delta t$$
$$h(t_i) - h(t_{i-1}) = h'(e_i)(t_i - t_{i-1}) = h'(e_i)\,\Delta t$$
for some points $c_i$, $d_i$ and $e_i$ in the interval $(t_{i-1}, t_i)$.
This yields
$$s_i \approx \sqrt{(f(t_i) - f(t_{i-1}))^2 + (g(t_i) - g(t_{i-1}))^2 + (h(t_i) - h(t_{i-1}))^2}$$
$$= \sqrt{(f'(c_i)\,\Delta t)^2 + (g'(d_i)\,\Delta t)^2 + (h'(e_i)\,\Delta t)^2}$$
$$= \sqrt{f'(c_i)^2 + g'(d_i)^2 + h'(e_i)^2}\;\Delta t.$$

Notice that when $\Delta t$ is small, $c_i$, $d_i$ and $e_i$ are very close to each other.

So we can further approximate $s_i$ as
$$s_i \approx \sqrt{f'(c_i)^2 + g'(c_i)^2 + h'(c_i)^2}\;\Delta t$$
for each $i = 1, 2, \ldots, n$ (since we assume $f'$, $g'$ and $h'$ are continuous).

The total arc length is then approximately
$$s \approx \sum_{i=1}^{n} \sqrt{f'(c_i)^2 + g'(c_i)^2 + h'(c_i)^2}\;\Delta t,$$
where the total error in the approximation tends to 0 as $\Delta t \to 0$.

We can make $\Delta t \to 0$ by taking $n \to \infty$. This gives the exact length:
$$s = \lim_{n \to \infty} \sum_{i=1}^{n} \sqrt{f'(c_i)^2 + g'(c_i)^2 + h'(c_i)^2}\;\Delta t,$$
provided the limit exists.


Theorem 14 (Arc Length Formula). Let $C$ be the curve given by

$$\mathbf{r}(t) = \langle f(t), g(t), h(t) \rangle, \quad a \le t \le b,$$

where $f'$, $g'$ and $h'$ are continuous. If $C$ is traversed exactly once as $t$ increases
from $a$ to $b$, then its length is
$$s = \int_a^b \sqrt{f'(t)^2 + g'(t)^2 + h'(t)^2}\,dt = \int_a^b \|\mathbf{r}'(t)\|\,dt.$$

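Example 16. Find the arc length of the curve traced out by the endpoint of
the vector-valued function $\mathbf{r}(t) = \langle 2t, \ln t, t^2 \rangle$ for $1 \le t \le e$.

Solution. Note that
$$\mathbf{r}'(t) = \left\langle 2, \frac{1}{t}, 2t \right\rangle.$$
Hence
$$s = \int_1^e \sqrt{2^2 + \left( \frac{1}{t} \right)^2 + (2t)^2}\,dt = \int_1^e \sqrt{\frac{1 + 4t^2 + 4t^4}{t^2}}\,dt = \int_1^e \sqrt{\frac{(1 + 2t^2)^2}{t^2}}\,dt$$
$$= \int_1^e \frac{1 + 2t^2}{t}\,dt = \int_1^e \left( \frac{1}{t} + 2t \right) dt = \Big[ \ln|t| + t^2 \Big]_1^e = (\ln e + e^2) - (\ln 1 + 1) = e^2.$$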
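
The Riemann-sum derivation of the arc length formula suggests a direct numerical check of Example 16. The sketch below takes each $c_i$ to be the midpoint of its subinterval, which is one admissible choice among many; the function name `arc_length` and the number of subintervals are assumptions.

```python
import math

def arc_length(dr, a, b, n=10_000):
    """Sum of ||r'(c_i)|| * dt over n equal subintervals, with c_i the midpoints."""
    dt = (b - a) / n
    total = 0.0
    for i in range(n):
        c = a + (i + 0.5) * dt                              # sample point in (t_{i-1}, t_i)
        total += math.sqrt(sum(comp ** 2 for comp in dr(c))) * dt
    return total

def dr(t):
    """r'(t) = <2, 1/t, 2t> for r(t) = <2t, ln t, t^2>."""
    return (2.0, 1.0 / t, 2.0 * t)

s = arc_length(dr, 1.0, math.e)
print(s, math.e ** 2)
```

The two printed values should agree to several decimal places, in line with the exact answer $e^2$.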

MA1104 Week2
Functions of Two Variables, Quadric
Surfaces, Limits & Continuity

1. Two-variable function f(x,y)

So far we have seen functions of one variable, i.e. the domain is a subset of
$\mathbb{R}$:

function           Domain D                        Range R
(scalar) f(t)      $D \subseteq \mathbb{R}$        $R \subseteq \mathbb{R}$
(vector) r(t)      $D \subseteq \mathbb{R}$        $R \subseteq V_2$ or $V_3$

Here, $V_2$ and $V_3$ denote the sets of 2D and 3D vectors in $\mathbb{R}^2$ and $\mathbb{R}^3$ respectively.

In this section, we shall define two-variable function f (x, y), and learn how
to visualize it
• as a surface z = f (x, y) in the xyz-space

• through level curves & contour plots.


Definition 1 (Two-variable functions). A function $f$ of two variables is a
rule that assigns to each ordered pair of real numbers $(x, y)$ in a set $D \subseteq
\mathbb{R}^2 = \mathbb{R} \times \mathbb{R}$ a unique real number denoted by $f(x, y)$.

If a function $f$ is given by a formula and no domain is specified, then the
domain of $f$ is understood to be the set of all pairs $(x, y)$ for which the given
expression is a well-defined real number.

Example 1. Find the domain of

$$f(x, y) = x \ln(y^2 - x).$$

Solution. The function $\ln$ is only defined for positive real numbers. So $f(x, y)$ is
defined for all $x$, $y$ such that

$$y^2 - x > 0.$$
The domain of $f$ is

$$D = \{(x, y) : x < y^2\}.$$

How can we visualise f (x, y)??

The graph of a function f of two variables is also called the surface S with
equation z = f (x, y).

We can visualize the graph S of f as lying directly above or below its domain
D in the xy-plane.

Graphing functions f (x, y) is not easy!

We can also ‘visualize’ a two-variable function through its traces.

Horizontal traces (level curves): resulting curves when we intersect the sur-
face z = f (x, y) with horizontal planes.

Vertical traces: resulting curves when we intersect the surface z = f (x, y)


with vertical planes.

Sometimes we can identify a surface by examining these traces. Let us see one
example.
Example 2. Match the functions $f(x, y) = \ln(x^2 + y^2)$ and $g(x, y) = \cos(x^2 + y^2)$
to the surfaces shown below:

Solution. We consider the horizontal traces:

Let $c$ be a constant. Then

$$f(x, y) = \ln(x^2 + y^2) = c$$
$$x^2 + y^2 = e^c = \left( \sqrt{e^c} \right)^2$$

corresponds to the circle of radius $\sqrt{e^c}$ centred at the origin.

On the other hand,

$$g(x, y) = \cos(x^2 + y^2) = c$$
holds exactly when $x^2 + y^2 = u$ for some $u \ge 0$ with $\cos u = c$.
Since for $-1 \le c \le 1$ there are many positive solutions $u$, we deduce that the above
gives a collection of circles with different radii.

Hence $z = f(x, y)$ is the surface on the left, and $z = g(x, y)$ is the surface on
the right in the above figure. $\blacksquare$

Definition 2 (Level Curve). A level curve of $f(x, y)$ is the two-dimensional
graph of the equation $f(x, y) = k$ for some constant $k$.

Definition 3 (Contour Plot). A contour plot of $f(x, y)$ is a graph of numerous
level curves $f(x, y) = k$, for representative values of $k$.

To sketch contour plots, we use values of k that are equally spaced. The
surface is:

• steep where the level curves are close together.

• flatter where the level curves are farther apart.

Example 3. Sketch some level curves of $h(x, y) = 4x^2 + y^2$.

Solution.

2. Cylinder & Quadric Surfaces

We now introduce some special types of surfaces, namely the cylinders and
quadric surfaces. These surfaces will be used later on to explain the theory.

When we mention the word cylinder, we probably think of the following


object

We will use the term cylinder to mean something more general.

Definition 4 (Cylinders). A surface is a cylinder if there is a plane $P$ such
that all the planes parallel to $P$ intersect the surface in the same curve (when
viewed in two dimensions).

Example 4. The surface given by

$$y^2 + z^2 = 1$$

is a cylinder.

In fact, any equation in $x$, $y$ and $z$ in which one of the variables is missing describes a
cylinder.

Example 5. Sketch the graph of the surface $z = x^2$.

Solution.

Other common surfaces that we shall use frequently are the quadric surfaces.

Definition 5 (Quadric Surface). A quadric surface is the graph of a second-degree
equation in three variables $x$, $y$ and $z$:

$$Ax^2 + By^2 + Cz^2 + Dxy + Eyz + Fxz + Gx + Hy + Iz + J = 0,$$

where $A, B, \ldots, J$ are constants.

By translation and rotation, a quadric surface can be brought into one of the
two standard forms:

$$Ax^2 + By^2 + Cz^2 + J = 0 \quad \text{or} \quad Ax^2 + By^2 + Iz = 0.$$

Excluding cylinders, there are six basic quadric surfaces:

Standard form (symmetric about z-axis)                            Surface
$\frac{x^2}{a^2} + \frac{y^2}{b^2} = \frac{z}{c}$                 Elliptic paraboloid
$\frac{x^2}{a^2} - \frac{y^2}{b^2} = \frac{z}{c}$                 Hyperbolic paraboloid
$\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1$         Ellipsoid
$\frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = 0$         (Elliptic) cone
$\frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = 1$         Hyperboloid of one sheet
$\frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = -1$        Hyperboloid of two sheets

3. Elliptic Paraboloid

For this section and the subsequent ones, it may be helpful to make use of
the following websites for visualization:

http://matkcy.github.io/MA1104-implicitplot.html
http://matkcy.github.io/MA1104-3dgrapher.html

Definition 6 (Elliptic paraboloid – symmetric about the z-axis).

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = \frac{z}{c}$$
Horizontal traces: ellipses.
Vertical traces: parabolas.

The figure below is an example of the graph of the elliptic paraboloid $\frac{x^2}{a^2} + \frac{y^2}{b^2} = \frac{z}{c}$
when $c > 0$.

The point $(0, 0, 0)$ is called the vertex of the elliptic paraboloid above.

The vertex will be shifted to $(x_0, y_0, z_0)$ if the elliptic paraboloid is given by

$$\frac{(x - x_0)^2}{a^2} + \frac{(y - y_0)^2}{b^2} = \frac{z - z_0}{c}.$$

4. Hyperbolic Paraboloid

Definition 7 (Hyperbolic paraboloid – symmetric about the z-axis).

$$\frac{x^2}{a^2} - \frac{y^2}{b^2} = \frac{z}{c}.$$
Horizontal traces: hyperbolas.
Vertical traces: parabolas.

The case c < 0 is illustrated below:

The following website can draw two surfaces in space. If we intersect a given
surface using a horizontal plane z = k (or vertical planes x = k or y = k),
then the intersection curve will be a horizontal (vertical) trace. Try it out
on the hyperbolic paraboloid!

http://matkcy.github.io/MA1104-intersection.html

5. Ellipsoid, Cones & Hyperboloid

Definition 8 (Ellipsoid).
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1$$

If a = b = c, then the ellipsoid is a sphere.

Horizontal traces: ellipses.

Vertical traces: ellipses.

The following three quadric surfaces have very similar equations:

Standard form (symmetric about z-axis)                            Surface
$\frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = 0$         (Elliptic) cone
$\frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = 1$         Hyperboloid of one sheet
$\frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = -1$        Hyperboloid of two sheets

Definition 9 (Elliptic cone – symmetric about the z-axis).

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = 0$$
Horizontal traces: ellipses.

Vertical traces in the planes $x = k$ and $y = k$ are hyperbolas if $k \neq 0$; if
$k = 0$ then the trace is a pair of lines.

Definition 10 (Hyperboloid of one sheet – symmetric about the z-axis).

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = 1$$
Horizontal traces: ellipses.

Vertical traces: hyperbolas.

Definition 11 (Hyperboloid of Two Sheets - symmetric about the z-axis).

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = -1$$
Horizontal traces in $z = k$ are ellipses if $k > c$ or $k < -c$ (taking $c > 0$).

Vertical traces: hyperbolas.

6. Examples on Quadric Surfaces

Example 6. Identify and sketch the surface
$$x^2 + 2z^2 - 6x - y + 10 = 0.$$
Solution. By completing the square, we rewrite the equation as

$$y - 1 = (x - 3)^2 + \frac{z^2}{1/2}.$$

• This surface is an elliptic paraboloid.
• Its vertex is the point $(3, 1, 0)$, and
• It is symmetric about the line which is parallel to the y-axis and passes
through $(3, 1, 0)$.

Example 7. Identify and sketch the surface

$$\frac{x^2}{4} - y^2 - \frac{z^2}{2} = 1.$$

Solution. Rearranging, we have

$$y^2 + \frac{z^2}{2} - \frac{x^2}{4} = -1.$$
This is a hyperboloid of two sheets symmetric about the x-axis.



7. Functions of three variables

Definition 12. A function $f$ of three variables is a rule that assigns to each
ordered triple of real numbers $(x, y, z)$ in a set $D \subseteq \mathbb{R}^3 = \mathbb{R} \times \mathbb{R} \times \mathbb{R}$ a unique
real number denoted by $f(x, y, z)$.

It is even more difficult to visualize a function $f$ of three variables by its
graph.

That would lie in a four-dimensional space!!!

Definition 13 (Level Surface). A level surface of $f(x, y, z)$ is the three-dimensional
graph of the equation $f(x, y, z) = k$ for some constant $k$.

If the point (x, y, z) moves along a level surface, the value of f (x, y, z) remains
fixed.

Example 8. Find the level surfaces of the function

$$f(x, y, z) = x^2 + y^2 + z^2.$$

Solution. For $t > 0$, the level surface $f(x, y, z) = t$ is the sphere $x^2 + y^2 + z^2 = t$ of
radius $\sqrt{t}$ centred at the origin. Each level surface can be regarded as one
instance of the function at time $t$. We can then think of $f(x, y, z)$ as evolving
spheres.

8. Limit of f (x, y)

Recall for single-variable function:

If we write
$$\lim_{x \to a} f(x) = L,$$
we mean that as $x$ gets closer and closer (but not equal) to $a$, the function $f(x)$
gets closer and closer to $L$.

Informally, if we write
$$\lim_{(x, y) \to (a, b)} f(x, y) = L,$$
we mean that as $(x, y)$ gets closer and closer (but not equal) to $(a, b)$, $f(x, y)$ gets
closer and closer to $L$.

For two variable function f (x, y), we can get close to a point (a, b) in the
domain via infinitely many directions!

Here is the formal definition:

Definition 14 (Limit: A formal definition). Let $f$ be a function of two
variables whose domain $D$ contains points arbitrarily close to $(a, b)$. We say
that the limit of $f(x, y)$ as $(x, y)$ approaches $(a, b)$ is $L \in \mathbb{R}$, denoted by

$$\lim_{(x, y) \to (a, b)} f(x, y) = L,$$

if for any number $\varepsilon > 0$ there exists a number $\delta > 0$ such that $|f(x, y) - L| < \varepsilon$
whenever $0 < \sqrt{(x - a)^2 + (y - b)^2} < \delta$.

The definition says that the distance between f (x, y) and L can be made
arbitrarily small by making the distance from (x, y) to (a, b) sufficiently
small (but not 0).

Remark: When speaking about limits, the function f is not required to be defined at (a, b) (i.e. the domain D might not contain (a, b)).

It can be proved from the definition that if $\lim_{(x,y) \to (a,b)} f(x, y) = L$, then

• its value L is unique, and

• L is independent of the choice of path approaching (a, b).

9. How to show limit does not exist

The idea is based on the following observation:

Limit exists at (a, b)


=⇒ The limits along ALL paths at (a, b) are the same.

Or equivalently, the contrapositive of the above statement is given as follows:

The limits along SOME paths at (a, b) are different


=⇒ Limit does not exist at (a, b)

This means that as long as we have two paths through the point (a, b) along
which the limits are different, then the function cannot have limit at (a, b).

Theorem 15 (How to show limit does not exist). If f (x, y) approaches $L_1$ as (x, y) approaches (a, b) along a path $P_1$, and f (x, y) approaches $L_2$ as (x, y) approaches (a, b) along a path $P_2$, and $L_1 \ne L_2$, then

\[ \lim_{(x,y) \to (a,b)} f(x, y) \]

does not exist.

Example 9. Show that the following limit does not exist.


\[ \lim_{(x,y) \to (1,0)} \frac{y}{x + y - 1}. \]

Solution. Consider the vertical line (path) x = 1, and compute the limit as y → 0 along this path:
\[ \lim_{(1,y) \to (1,0)} \frac{y}{1 + y - 1} = \lim_{y \to 0} 1 = 1. \]

Next, consider the path along the horizontal line y = 0, and compute the limit as x → 1 along this path:
\[ \lim_{(x,0) \to (1,0)} \frac{0}{x + 0 - 1} = \lim_{x \to 1} 0 = 0. \]

Since the function approaches different values along two different paths passing through (1, 0), the limit does not exist at (1, 0).


In general, some of the paths (passing through a given point (a, b)) that we
can try are:

• x = a, y → b (vertical lines);

• y = b, x → a (horizontal lines);

• y = g(x), x → a, where g(x) is some simple function (usually linear or quadratic) such that g(a) = b;

• x = g(y), y → b, where g(y) is some simple function (usually linear or quadratic) such that g(b) = a.
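The path test can also be explored numerically. The following Python sketch evaluates the function of Example 9 along the vertical path x = 1 and the horizontal path y = 0; the helper name `along_path` and the sample step sizes are illustrative choices, not part of the notes. Differing values only suggest that the limit fails to exist; the actual proof is the path computation itself.

```python
def f(x, y):
    # Function from Example 9; it is undefined on the line x + y - 1 = 0.
    return y / (x + y - 1)

def along_path(path, ts):
    """Evaluate f at the points path(t) for each parameter value t."""
    return [f(*path(t)) for t in ts]

# Parameter values shrinking towards 0, so that path(t) -> (1, 0).
ts = [10.0 ** -k for k in range(1, 7)]

vertical = along_path(lambda t: (1.0, t), ts)          # x = 1, y -> 0
horizontal = along_path(lambda t: (1.0 + t, 0.0), ts)  # y = 0, x -> 1

print("along x = 1:", vertical[-1])    # close to 1
print("along y = 0:", horizontal[-1])  # exactly 0.0
```

Other paths, such as y = g(x) with g(1) = 0, can be tried by passing a different lambda to `along_path`.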

Example 10. Show that the following limit does not exist.
\[ \lim_{(x,y) \to (0,0)} \frac{xy}{x^2 + y^2}. \]

Solution. Consider the limit along the path x = 0. We have


\[ \lim_{(0,y) \to (0,0)} \frac{0}{0 + y^2} = \lim_{y \to 0} 0 = 0. \]

Similarly, for the path y = 0, we have
\[ \lim_{(x,0) \to (0,0)} \frac{0}{x^2 + 0} = \lim_{x \to 0} 0 = 0. \]

Be careful! Just because these two limits are the same does not mean that
the limit exists.

For a limit to exist, the limit must be the same for ALL paths through (0, 0),
not just the two we had considered.

There is another simple path through (0, 0): the path y = x.

Using this path, we have

\[ \lim_{(x,x) \to (0,0)} \frac{x^2}{x^2 + x^2} = \lim_{x \to 0} \frac{1}{2} = \frac{1}{2}. \]

Since this limit does not match the limit along the first two paths we considered, the limit does not exist.


Example 11. Show that the following limit does not exist:

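\[ \lim_{(x,y) \to (0,0)} \frac{xy^2}{x^2 + y^4} \]

Solution. Let's approach (0, 0) along the path y = mx, where m is a real number.

\[ \lim_{(x,mx) \to (0,0)} \frac{xy^2}{x^2 + y^4} = \lim_{x \to 0} \frac{x \cdot (mx)^2}{x^2 + (mx)^4} = \lim_{x \to 0} \frac{m^2 x}{1 + m^4 x^2} = \frac{\lim_{x \to 0} m^2 x}{\lim_{x \to 0} (1 + m^4 x^2)} = \frac{0}{1} = 0. \]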

The function $\frac{xy^2}{x^2 + y^4}$ approaches the same number 0 as (x, y) → (0, 0) along every line y = mx. But this does not imply that the limit of the function is 0.

Let’s try a different path.

Now, approach (0, 0) along the path $y^2 = x$. We have

\[ \lim_{(y^2,y) \to (0,0)} \frac{xy^2}{x^2 + y^4} = \lim_{y \to 0} \frac{y^2 \cdot y^2}{y^4 + y^4} = \frac{1}{2}. \]

Since this limit along $y^2 = x$ is different from the one we had previously for y = mx, we conclude that the limit does not exist.
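As a numerical illustration of Example 11 (a sketch only; the slopes and step sizes below are arbitrary choices), the following Python code samples the function along several lines y = mx and along the parabola x = y². The values along the lines shrink towards 0, while the values along the parabola stay at about 1/2.

```python
def f(x, y):
    # Function from Example 11, defined away from the origin.
    return x * y**2 / (x**2 + y**4)

ts = [10.0 ** -k for k in range(1, 7)]

# Along straight lines y = m x the values shrink towards 0 ...
line_values = {m: [f(t, m * t) for t in ts] for m in (1.0, -2.0, 5.0)}

# ... but along the parabola x = y^2 they stay at about 1/2.
parabola_values = [f(t**2, t) for t in ts]

for m, values in line_values.items():
    print(f"y = {m} x:", values[-1])
print("x = y^2:", parabola_values[-1])
```

This is why testing straight lines alone is not enough: every line through the origin gives 0 here, yet the limit does not exist.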

10. How to show limit exists

To show that a limit exists:

(1) we can deduce it from known/simple functions using properties of limits or continuity; or

(2) we can use the Squeeze Theorem.

We begin with some basic results about limits.

Theorem 16 (Limit Theorems). Suppose f (x, y) and g(x, y) both have limits as (x, y) approaches (a, b). Then

\[ \lim_{(x,y) \to (a,b)} \bigl( f(x, y) \pm g(x, y) \bigr) = \lim_{(x,y) \to (a,b)} f(x, y) \pm \lim_{(x,y) \to (a,b)} g(x, y), \]

\[ \lim_{(x,y) \to (a,b)} f(x, y)\, g(x, y) = \Bigl( \lim_{(x,y) \to (a,b)} f(x, y) \Bigr) \Bigl( \lim_{(x,y) \to (a,b)} g(x, y) \Bigr). \]

Theorem 17 (Limit Theorems: continued).

\[ \lim_{(x,y) \to (a,b)} \frac{f(x, y)}{g(x, y)} = \frac{\lim_{(x,y) \to (a,b)} f(x, y)}{\lim_{(x,y) \to (a,b)} g(x, y)}, \]

provided
\[ \lim_{(x,y) \to (a,b)} g(x, y) \ne 0. \]

Example 12. Find the limit

\[ \lim_{(x,y) \to (1,1)} \frac{\sin(x\pi) + \cos(y\pi)}{x^2 + y^2}. \]

Solution. Applying the addition and quotient rules for limits:

\[ \frac{\lim_{(x,y) \to (1,1)} \sin(x\pi) + \lim_{(x,y) \to (1,1)} \cos(y\pi)}{\lim_{(x,y) \to (1,1)} x^2 + \lim_{(x,y) \to (1,1)} y^2} = \frac{0 + (-1)}{1 + 1} = \frac{-1}{2}. \]


Another method of proving that a limit exists is the Squeeze Theorem.

Theorem 18 (Squeeze). Suppose

• $|f(x, y) - L| \le g(x, y)$ for all (x, y) close to (a, b), and

• $\lim_{(x,y) \to (a,b)} g(x, y) = 0$.

Then
\[ \lim_{(x,y) \to (a,b)} f(x, y) = L. \]

The proof is omitted.

Example 13. Show that

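\[ \lim_{(x,y) \to (0,0)} \frac{3x^2 y}{x^2 + y^2} = 0. \]

Solution. We begin by finding an upper bound for $\left| \frac{3x^2 y}{x^2 + y^2} - 0 \right|$. Notice that

\[ \left| \frac{3x^2 y}{x^2 + y^2} - 0 \right| = 3|y| \, \frac{x^2}{x^2 + y^2} \le 3|y|, \]

since $\frac{x^2}{x^2 + y^2} \le 1$.

Now, since $\lim_{(x,y) \to (0,0)} |y| = 0$ (and hence $\lim_{(x,y) \to (0,0)} 3|y| = 0$), the result follows from the Squeeze Theorem.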
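The bound used in Example 13 can be checked numerically. The sketch below (the function name `max_abs_on_circle` and the sampling scheme are assumptions made for illustration) samples points on small circles around the origin, where |y| ≤ r, and compares the largest sampled value of |f| with the bound 3r. Sampling is not a proof; it only illustrates the inequality that the Squeeze Theorem relies on.

```python
import math

def f(x, y):
    # Function from Example 13, defined away from the origin.
    return 3 * x**2 * y / (x**2 + y**2)

def max_abs_on_circle(r, n=360):
    """Largest sampled |f| on the circle of radius r centred at (0, 0)."""
    return max(
        abs(f(r * math.cos(2 * math.pi * k / n), r * math.sin(2 * math.pi * k / n)))
        for k in range(n)
    )

radii = [10.0 ** -k for k in range(1, 6)]
maxima = [max_abs_on_circle(r) for r in radii]

for r, m in zip(radii, maxima):
    # On this circle |y| <= r, so the bound 3|y| gives |f| <= 3r.
    print(f"r = {r:g}: max |f| = {m:.3e}, bound 3r = {3 * r:.3e}")
```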


11. Continuity of f (x, y)

Definition 19 (Definition of Continuity). We say that f is continuous at (a, b) if

\[ \lim_{(x,y) \to (a,b)} f(x, y) = f(a, b). \tag{1} \]

The defining equation (1) is called the substitution property.

Theorem 20 (Continuity Theorems). If f (x, y) and g(x, y) are continuous at (a, b), then
• f ± g is continuous at (a, b).
• f · g is continuous at (a, b).
• f /g is continuous at (a, b), provided g(a, b) ≠ 0.

Theorem 21 (Continuity and Composition). Suppose f (x, y) is continuous at (a, b) and g(x) is continuous at f (a, b). Then

\[ h(x, y) = (g \circ f)(x, y) = g(f(x, y)) \]

is continuous at (a, b).

Consequently, the following classes of functions are continuous on their domains.

• Polynomial in x and y;

• Trigonometric and exponential functions in x and y;

• Rational function in x and y.

Example:
\[ f(x, y) = \frac{x^2 + x^3 y}{x + y} \]
is continuous on
\[ D = \{ (x, y) : x + y \ne 0 \}. \]

Example 14. Determine if the following function is continuous at (0, 0):


\[ g(x, y) = \begin{cases} \dfrac{x^4}{x^2 + y^2} & \text{if } (x, y) \ne (0, 0), \\ 0 & \text{if } (x, y) = (0, 0). \end{cases} \]

Solution. We need to show that

\[ \lim_{(x,y) \to (0,0)} g(x, y) = g(0, 0). \]

For (x, y) ≠ (0, 0), since $\frac{x^2}{x^2 + y^2} \le 1$, we deduce that

\[ |g(x, y)| = \frac{x^4}{x^2 + y^2} \le |x^2| = x^2. \]

Since $\lim_{(x,y) \to (0,0)} x^2 = 0$, we deduce from the Squeeze Theorem that

\[ \lim_{(x,y) \to (0,0)} g(x, y) = 0 = g(0, 0). \]

Therefore, g(x, y) is continuous at (0, 0).




MA1104 Week 3
Partial Derivatives, Chain Rule,
Directional Derivatives

1. Partial Derivative

Recall that for a function f of a single variable x, we define the derivative function as

\[ f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}. \]

To extend this to multivariable functions, the idea is to ‘vary’ one variable and keep the other variable(s) fixed.
Definition 1 (Partial Derivative). If f is a function of two variables, its partial derivatives are the functions $f_x$ and $f_y$ defined by:
\[ f_x(x, y) = \lim_{h \to 0} \frac{f(x + h, y) - f(x, y)}{h}, \]
\[ f_y(x, y) = \lim_{h \to 0} \frac{f(x, y + h) - f(x, y)}{h}. \]

Basically, the definition says that if we wish to differentiate with respect to x, then we regard y as a constant. Similarly, when differentiating with respect to y, we regard x as a constant.

Other notations for partial derivatives:


\[ f_x = \frac{\partial f}{\partial x}, \qquad f_y = \frac{\partial f}{\partial y}. \]

Example 1. For $f(x, y) = e^{xy} + \frac{x}{y}$, compute $f_x$ and $f_y$.

Solution. Treating y as a constant, we have
\[ f_x(x, y) = y e^{xy} + \frac{1}{y}. \]
Treating x as a constant, we have
\[ f_y(x, y) = x e^{xy} - \frac{x}{y^2}. \]
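The formulas of Example 1 can be sanity-checked with difference quotients. The sketch below uses a central difference (a symmetric variant of the one-sided quotient in Definition 1, chosen here for better accuracy); the helper names `partial_x` and `partial_y`, the step size and the sample point are all illustrative assumptions.

```python
import math

def f(x, y):
    return math.exp(x * y) + x / y

def partial_x(func, x, y, h=1e-6):
    """Central-difference estimate of the partial derivative in x (y held fixed)."""
    return (func(x + h, y) - func(x - h, y)) / (2 * h)

def partial_y(func, x, y, h=1e-6):
    """Central-difference estimate of the partial derivative in y (x held fixed)."""
    return (func(x, y + h) - func(x, y - h)) / (2 * h)

x0, y0 = 0.5, 2.0  # arbitrary sample point with y0 != 0

fx_formula = y0 * math.exp(x0 * y0) + 1 / y0
fy_formula = x0 * math.exp(x0 * y0) - x0 / y0**2

print(partial_x(f, x0, y0), fx_formula)
print(partial_y(f, x0, y0), fy_formula)
```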


Geometric interpretation of partial derivatives:

Consider the surface S given by the equation z = f (x, y).

The curve C1 is the graph of the function g(x) = f (x, b), which is the intersection curve of the surface and the vertical plane y = b.

The slope of its tangent T1 at P is: $g'(a) = f_x(a, b)$.

The curve C2 is the graph of the function h(y) = f (a, y), which is the intersection curve of the surface and the vertical plane x = a.

The slope of its tangent T2 at P is: $h'(b) = f_y(a, b)$.

Hence, we can visualize the partial derivatives at the point P on the surface as the slopes of the tangent lines T1 and T2 at that point.

2. Higher order partial derivatives

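In the previous section, we have defined partial derivatives for functions of two variables. These can be extended to functions depending on any number of variables. For example, if w = f (x, y, z), then we have
\[ f_x = \frac{\partial f}{\partial x} = \frac{\partial w}{\partial x}, \qquad f_y = \frac{\partial f}{\partial y} = \frac{\partial w}{\partial y}, \qquad f_z = \frac{\partial f}{\partial z} = \frac{\partial w}{\partial z}. \]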

Here, fx means differentiating f with respect to x by regarding the other


variables y and z as constants.

We can also consider partial derivatives of the partial derivatives:

(fx )x , (fx )y , (fy )x , (fy )y .

These are called the second partial derivatives of f .

If z = f (x, y), we use the following notation:


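\[ (f_x)_x = f_{xx} = \frac{\partial^2 f}{\partial x^2} = \frac{\partial^2 z}{\partial x^2} \]
\[ (f_x)_y = f_{xy} = \frac{\partial^2 f}{\partial y \, \partial x} = \frac{\partial^2 z}{\partial y \, \partial x} \]
\[ (f_y)_x = f_{yx} = \frac{\partial^2 f}{\partial x \, \partial y} = \frac{\partial^2 z}{\partial x \, \partial y} \]
\[ (f_y)_y = f_{yy} = \frac{\partial^2 f}{\partial y^2} = \frac{\partial^2 z}{\partial y^2} \]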

Thus, the notation fxy means that we first differentiate with respect to x
and then with respect to y. Notice that the order the variables appear in the
denominator is reversed when using the curly ∂ notation.

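Example 2. Find all second-order partial derivatives of $f(x, y) = x^2 y - y^3 + \ln x$.

Solution. First, we compute the first-order derivatives:

\[ f_x = 2xy + \frac{1}{x}, \qquad f_y = x^2 - 3y^2. \]

Differentiating the partial derivatives one more time, we have

\[ f_{xx} = \frac{\partial}{\partial x}\left( 2xy + \frac{1}{x} \right) = 2y - \frac{1}{x^2}, \]
\[ f_{xy} = \frac{\partial}{\partial y}\left( 2xy + \frac{1}{x} \right) = 2x, \]
\[ f_{yx} = \frac{\partial}{\partial x}\left( x^2 - 3y^2 \right) = 2x, \]
\[ f_{yy} = \frac{\partial}{\partial y}\left( x^2 - 3y^2 \right) = -6y. \]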

Theorem 2 (Clairaut’s Theorem). Suppose f is defined on a disk D that contains (a, b). If the functions $f_{xy}$ and $f_{yx}$ are both continuous on D, then

\[ f_{xy}(a, b) = f_{yx}(a, b). \]

Using Clairaut’s Theorem, it can be shown that:

fxyy = fyxy = fyyx .

In fact, provided the partial derivatives involved are continuous, any two higher-order partial derivatives in which each variable occurs the same number of times in the subscript are equal.

For example: for f (x, y, z), we have

fxxyyyzz = fxzxzyyy = fyzyzyxx = · · · .

3. Tangent plane equation

Recall that we use the derivative $f'(a)$ to get the tangent line to the curve y = f (x) at x = a:

\[ y = f(a) + f'(a)(x - a). \]

The same idea applies to find tangent plane equations.

The tangent plane to the surface S at the point P (a, b, c) is defined to be the plane that contains both tangent lines T1 and T2 as shown in the figure below. Recall that T1 and T2 are the tangent lines to the curves of intersection of the surface S and the vertical planes y = b and x = a respectively.

Since the point P (a, b, c) has been given, we just need to look for a normal
vector n to the plane.

• A vector with the same direction as T1 is

\[ \langle 1, 0, f_x(a, b) \rangle. \]

• A vector with the same direction as T2 is

\[ \langle 0, 1, f_y(a, b) \rangle. \]

Thus, a vector normal to the plane is given by the cross product:

\[ \mathbf{n} = \langle 0, 1, f_y(a, b) \rangle \times \langle 1, 0, f_x(a, b) \rangle = \langle f_x(a, b), f_y(a, b), -1 \rangle. \]

Theorem 3 (Equation of Tangent Plane). Consider the surface S given by z = f (x, y). A normal vector to the tangent plane to S at the point (a, b, f (a, b)) is

\[ \langle f_x(a, b), f_y(a, b), -1 \rangle. \]

The tangent plane is given by

\[ z = f(a, b) + f_x(a, b)(x - a) + f_y(a, b)(y - b). \]

Example 3. Find the tangent plane to the elliptic paraboloid $z = 2x^2 + y^2$ at the point (1, 1, 3).

Solution. Notice
\[ f_x(x, y) = 4x, \quad f_x(1, 1) = 4, \]
\[ f_y(x, y) = 2y, \quad f_y(1, 1) = 2. \]

The equation of the plane is

\[ z = f(1, 1) + 4(x - 1) + 2(y - 1), \]

that is, since f (1, 1) = 3,

\[ z = 4x + 2y - 3. \]
The figure shows the elliptic paraboloid and its tangent plane at (1, 1, 3) that
we found in the preceding example:
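To see how closely the tangent plane of Example 3 follows the surface, the following Python sketch compares the two at points approaching (1, 1). The offsets d are arbitrary illustrative values. Algebraically the gap is 2(x − 1)² + (y − 1)², so along x = y = 1 + d it equals 3d², which shrinks much faster than d itself.

```python
def f(x, y):
    return 2 * x**2 + y**2

def tangent_plane(x, y):
    # Tangent plane at (1, 1, 3) from Example 3.
    return 4 * x + 2 * y - 3

offsets = (0.1, 0.01, 0.001)
gaps = []
for d in offsets:
    gap = f(1 + d, 1 + d) - tangent_plane(1 + d, 1 + d)
    gaps.append(gap)
    print(f"d = {d}: surface - plane = {gap:.3e}")
```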



4. Differentiability of f (x, y)

Recall that for a single-variable function f (x),

\[ f(x) \text{ is differentiable at } a \iff f'(a) \text{ exists}. \]

In other words, in the single-variable case, the notion of differentiability is equivalent to the existence of the derivative.
However, for functions of more than one variable, existence of partial derivatives DOES NOT imply differentiability!

In general, for f (x, y), we have

\[ f_x \text{ and } f_y \text{ exist} \;\not\Longrightarrow\; f \text{ differentiable}, \]

\[ f_x \text{ and } f_y \text{ exist} \;\Longleftarrow\; f \text{ differentiable}. \]

To define differentiability, we first require the concept of increment.


Definition 4 (Increment). Let z = f (x, y). Suppose $\Delta x$ and $\Delta y$ are increments in the independent variables x and y respectively from a fixed point (a, b).

Then the increment in z at (a, b) is defined by
\[ \Delta z = f(a + \Delta x, b + \Delta y) - f(a, b). \]
Example 4. Let $z = 2x^2 - xy$.

Find $\Delta z$ at (1, 2) given that

• $\Delta x = 0.1$,
• $\Delta y = -0.2$.
Solution.
\[ \Delta z = f(x + \Delta x, y + \Delta y) - f(x, y) \]
\[ = f(1 + 0.1, 2 - 0.2) - f(1, 2) \]
\[ = f(1.1, 1.8) - f(1, 2) \]
\[ = \bigl( 2(1.1)^2 - (1.1)(1.8) \bigr) - \bigl( 2(1)^2 - (1)(2) \bigr) \]
\[ = 0.44. \]

Informally, differentiability can be defined as follows:
Definition 5. We say that f is differentiable at (a, b) if the tangent plane
at (a, b) is a good approximation to f at points close to (a, b).
But how good is good? Although the above definition is not rigorous, it does
give an intuition about differentiability: The more we ‘zoom’ into the point
(a, b), the more the surface looks like the tangent plane at that point!

Here is the formal definition of differentiability (for two-variable functions).


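Definition 6 (Differentiability - Two Variable). Let z = f (x, y). We say that f is differentiable at (a, b) if we can write
\[ \Delta z = f_x(a, b)\,\Delta x + f_y(a, b)\,\Delta y + \varepsilon_1 \Delta x + \varepsilon_2 \Delta y \]
where $\varepsilon_1$ and $\varepsilon_2$ are functions of $\Delta x$ and $\Delta y$ which vanish (i.e. $\varepsilon_1, \varepsilon_2 \to 0$ as $(\Delta x, \Delta y) \to (0, 0)$).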

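Notice that

\[ \underbrace{\Delta z}_{\text{change in the function at } (a, b)} = \underbrace{f_x(a, b)\,\Delta x + f_y(a, b)\,\Delta y}_{\text{change in tangent plane}} + \text{error} \approx \underbrace{f_x(a, b)\,\Delta x + f_y(a, b)\,\Delta y}_{\text{change in tangent plane}}. \]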

We say that f is differentiable on a region $R \subseteq \mathbb{R}^2$ if f is differentiable at every point in R.
Most of the functions in this course are differentiable on their respective domains.

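Example: Using the definition of differentiability, show that $f(x, y) = 2x^2 - xy$ is differentiable at (1, 2).

Solution. Let z = f (x, y). Then, the increment of z at (1, 2) is

\[ \Delta z = f(1 + \Delta x, 2 + \Delta y) - f(1, 2) \]
\[ = 2(1 + \Delta x)^2 - (1 + \Delta x)(2 + \Delta y) - \bigl( 2(1)^2 - (1)(2) \bigr) \]
\[ = 2\bigl( 1 + 2\Delta x + (\Delta x)^2 \bigr) - \bigl( 2 + \Delta y + 2\Delta x + \Delta x \Delta y \bigr) \]
\[ = 2\Delta x - \Delta y + 2(\Delta x)^2 - \Delta x \Delta y. \]

Notice that
\[ f_x(x, y) = 4x - y, \qquad f_y(x, y) = -x. \]
So at (1, 2), we have
\[ f_x(1, 2) = 2, \qquad f_y(1, 2) = -1. \]
Hence we can write

\[ \Delta z = f_x(1, 2)\,\Delta x + f_y(1, 2)\,\Delta y + 2(\Delta x)^2 - \Delta x \Delta y. \]

In view of the definition, it remains to find $\varepsilon_1$ and $\varepsilon_2$ such that

\[ \varepsilon_1, \varepsilon_2 \to 0 \quad \text{as } (\Delta x, \Delta y) \to (0, 0) \]

and
\[ 2(\Delta x)^2 - \Delta x \Delta y = \varepsilon_1 \Delta x + \varepsilon_2 \Delta y. \]
There are many choices. Here, we can choose

\[ \varepsilon_1 = 2\Delta x, \qquad \varepsilon_2 = -\Delta x. \]

It is clear that these choices satisfy the requirements.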
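The decomposition in this example can be checked with concrete numbers. The Python sketch below (the function name `decompose` is an illustrative choice) splits the increment of f at (1, 2) into its linear part and the error part built from the two vanishing functions chosen in the solution, using the increments 0.1 and −0.2 from Example 4.

```python
def f(x, y):
    return 2 * x**2 - x * y

a, b = 1.0, 2.0
fx, fy = 2.0, -1.0  # f_x(1, 2) and f_y(1, 2) from the solution above

def decompose(dx, dy):
    """Return (increment, linear part, error part) of f at (a, b)."""
    dz = f(a + dx, b + dy) - f(a, b)
    linear = fx * dx + fy * dy
    eps1, eps2 = 2 * dx, -dx          # the choice made in the solution
    error = eps1 * dx + eps2 * dy
    return dz, linear, error

dz, linear, error = decompose(0.1, -0.2)  # increments from Example 4
print(dz, linear, error)  # approximately 0.44, 0.4 and 0.04
```

The linear part accounts for 0.4 of the total increment 0.44; the remaining 0.04 is the error term, which is of second order in the increments.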

5. Linear approximation

Theorem 7. Suppose z = f (x, y) is differentiable at (a, b). Let $\Delta x$ and $\Delta y$ be small increments in x and y respectively from (a, b). Then

\[ \Delta z \approx f_x(a, b)\,\Delta x + f_y(a, b)\,\Delta y. \]

Notice that the above result follows immediately from the definition of differentiability. Since $\Delta z = f(a + \Delta x, b + \Delta y) - f(a, b)$, we have

\[ f(a + \Delta x, b + \Delta y) - f(a, b) \approx f_x(a, b)\,\Delta x + f_y(a, b)\,\Delta y, \]

\[ f(a + \Delta x, b + \Delta y) \approx f(a, b) + f_x(a, b)\,\Delta x + f_y(a, b)\,\Delta y, \]

which is called the Linear Approximation of the function f at (a, b).

This can be extended to functions of more than two variables. For example, if f (x, y, z) is differentiable, then the Linear Approximation of f at (a, b, c) is

\[ f(a + \Delta x, b + \Delta y, c + \Delta z) \approx f(a, b, c) + f_x(a, b, c)\,\Delta x + f_y(a, b, c)\,\Delta y + f_z(a, b, c)\,\Delta z, \]

given (relatively) small changes $\Delta x$, $\Delta y$, $\Delta z$ in the underlying variables.

Example 5. The base radius and height of a circular cone are measured as 10cm and 25cm respectively, with a possible error in measurement of as much as 0.1cm in each.

Using linear approximation, estimate the magnitude of the maximum error in the calculated volume of the cone.
Solution. The volume of the cone is $V = \pi r^2 h / 3$. So

\[ \Delta V \approx V_r \Delta r + V_h \Delta h = \frac{2\pi r h}{3} \Delta r + \frac{\pi r^2}{3} \Delta h. \]

Since each error is at most 0.1cm, we can take $\Delta r = 0.1$ and $\Delta h = 0.1$ along with r = 10, h = 25 to give
\[ \Delta V \approx \frac{500\pi}{3}(0.1) + \frac{100\pi}{3}(0.1) = 20\pi. \]
The estimated maximum error is $20\pi$ cm³.
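For comparison, the linear estimate of Example 5 can be set against the actual change in volume when both measurements are off by the full 0.1cm. This Python sketch is illustrative only; the variable names are assumptions.

```python
import math

def volume(r, h):
    return math.pi * r**2 * h / 3

r, h, err = 10.0, 25.0, 0.1

# Linear approximation: V_r * dr + V_h * dh with dr = dh = err.
linear_estimate = (2 * math.pi * r * h / 3) * err + (math.pi * r**2 / 3) * err

# Actual change in volume when both dimensions are larger by err.
actual_change = volume(r + err, h + err) - volume(r, h)

print(linear_estimate / math.pi)  # approximately 20
print(actual_change / math.pi)    # slightly larger than 20
```

The two numbers agree to within about one percent, which is what one expects from a first-order approximation with small increments.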

Example 6. A company creates rolls of metal by feeding the metal through very large rollers.
The thickness of the resulting metal depends on:
• the speed at which the rollers turn, and
• the temperature of the metal.

Suppose that for a certain metal, a thickness of 4mm is produced by
• a speed of 10 m/s, and
• a temperature of 900°C.
Also, experiments show that
• with no change in temperature, an increase in speed of 0.2 m/s increases the thickness by 0.06mm,
• with no change in speed, an increase in temperature of 10°C decreases the thickness by 0.04mm.
Question: Estimate the thickness of the metal at a speed of 10.1 m/s and a temperature of 880°C.

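Solution. Let T (s, t) be the thickness function (in mm) depending on the speed s (in m/s) and temperature t (in °C). Based on the information given, we have
\[ T(10, 900) = 4, \]
\[ \frac{\partial T}{\partial s} \approx \frac{0.06}{0.2} = 0.3, \qquad \frac{\partial T}{\partial t} \approx \frac{-0.04}{10} = -0.004, \]
\[ \Delta s = 0.1, \qquad \Delta t = -20. \]
Hence, by Linear Approximation:
\[ T(10.1, 880) \approx T(10, 900) + \frac{\partial T}{\partial s} \Delta s + \frac{\partial T}{\partial t} \Delta t \approx 4 + 0.3(0.1) - 0.004(-20) \approx 4.11. \]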
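The arithmetic of this estimate is short enough to script. In the sketch below, `linear_estimate` is a hypothetical helper that applies the linear approximation formula to the measured rates; nothing here goes beyond the data given in the example.

```python
def linear_estimate(base, rate_s, rate_t, ds, dt):
    """First-order estimate: base + (dT/ds) * ds + (dT/dt) * dt."""
    return base + rate_s * ds + rate_t * dt

rate_s = 0.06 / 0.2    # mm per (m/s), from the speed experiment
rate_t = -0.04 / 10    # mm per degree C, from the temperature experiment

thickness = linear_estimate(4.0, rate_s, rate_t, ds=10.1 - 10, dt=880 - 900)
print(round(thickness, 4))
```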


6. Chain Rule

The Chain Rule is a very useful tool for differentiation. Recall that for
functions of one variable, the Chain Rule says that:

If y = f (x) and x = g(t), where f and g are differentiable functions, then y is indirectly a differentiable function of t, and
\[ \frac{dy}{dt} = \frac{dy}{dx} \frac{dx}{dt}. \]

It turns out that for multivariable functions, there are many different versions
of Chain Rule. We begin with the simplest one.
Theorem 8 (The Chain Rule - Case 1). Suppose that z = f (x, y) is a differentiable function of x and y, where x = g(t) and y = h(t) are both differentiable functions of t. Then, z is a differentiable function of t and
\[ \frac{dz}{dt} = \frac{\partial f}{\partial x} \frac{dx}{dt} + \frac{\partial f}{\partial y} \frac{dy}{dt}. \]

Sketch of proof: Since z = f (x, y) is differentiable, for small $\Delta x$ and $\Delta y$, we have
\[ \Delta z \approx f_x \Delta x + f_y \Delta y. \]
Dividing throughout by $\Delta t$:
\[ \frac{\Delta z}{\Delta t} \approx f_x \frac{\Delta x}{\Delta t} + f_y \frac{\Delta y}{\Delta t}. \]
In the limit, we have the Chain Rule:
\[ \frac{dz}{dt} = f_x \frac{dx}{dt} + f_y \frac{dy}{dt}. \]


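Example 8. For $z = f(x, y) = x^2 e^y$, $x = g(t) = t^2 - 1$ and $y = h(t) = \sin t$, find the derivative $\frac{dz}{dt}$.

Solution. First, compute the partial derivatives:

\[ \frac{\partial z}{\partial x} = 2x e^y, \qquad \frac{\partial z}{\partial y} = x^2 e^y. \]
Next, compute the derivatives:
\[ \frac{dx}{dt} = 2t, \qquad \frac{dy}{dt} = \cos t. \]
Therefore, using the chain rule,

\[ \frac{dz}{dt} = \frac{\partial z}{\partial x} \frac{dx}{dt} + \frac{\partial z}{\partial y} \frac{dy}{dt} \]
\[ = 2x e^y (2t) + x^2 e^y \cos t \]
\[ = 2(t^2 - 1) e^{\sin t} (2t) + (t^2 - 1)^2 e^{\sin t} \cos t. \]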
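The result of this example can be checked by differentiating the composite function numerically. The following Python sketch compares the Chain Rule formula with a central difference quotient of z as a function of t; the function names, the step size and the sample value of t are illustrative assumptions.

```python
import math

def z_of_t(t):
    # z = x^2 e^y with x = t^2 - 1 and y = sin t substituted directly.
    x, y = t**2 - 1, math.sin(t)
    return x**2 * math.exp(y)

def dz_dt_chain_rule(t):
    x, y = t**2 - 1, math.sin(t)
    # (dz/dx)(dx/dt) + (dz/dy)(dy/dt)
    return 2 * x * math.exp(y) * (2 * t) + x**2 * math.exp(y) * math.cos(t)

def dz_dt_numeric(t, h=1e-6):
    """Central difference quotient of the composite function."""
    return (z_of_t(t + h) - z_of_t(t - h)) / (2 * h)

t0 = 1.5
print(dz_dt_chain_rule(t0), dz_dt_numeric(t0))
```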

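Theorem 9 (Chain Rule - Case 2). Suppose that z = f (x, y) is a differentiable function of x and y, where x = g(s, t) and y = h(s, t) are both differentiable functions of s and t. Then,
\[ \frac{\partial z}{\partial s} = \frac{\partial f}{\partial x} \frac{\partial x}{\partial s} + \frac{\partial f}{\partial y} \frac{\partial y}{\partial s}, \]
\[ \frac{\partial z}{\partial t} = \frac{\partial f}{\partial x} \frac{\partial x}{\partial t} + \frac{\partial f}{\partial y} \frac{\partial y}{\partial t}. \]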
Here there are three types of variables:

• s and t are independent variables.

• x and y are called intermediate variables.

• z is the dependent variable.

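Example 9. If $z = e^x \sin y$, where $x = st^2$ and $y = s^2 t$, find $\frac{\partial z}{\partial s}$ and $\frac{\partial z}{\partial t}$.

Solution. Applying Case 2 of the Chain Rule,

\[ \frac{\partial z}{\partial s} = \frac{\partial z}{\partial x} \frac{\partial x}{\partial s} + \frac{\partial z}{\partial y} \frac{\partial y}{\partial s} \]
\[ = (e^x \sin y)(t^2) + (e^x \cos y)(2st) \]
\[ = t^2 e^{st^2} \sin(s^2 t) + 2st\, e^{st^2} \cos(s^2 t). \]

\[ \frac{\partial z}{\partial t} = \frac{\partial z}{\partial x} \frac{\partial x}{\partial t} + \frac{\partial z}{\partial y} \frac{\partial y}{\partial t} \]
\[ = (e^x \sin y)(2st) + (e^x \cos y)(s^2) \]
\[ = 2st\, e^{st^2} \sin(s^2 t) + s^2 e^{st^2} \cos(s^2 t). \]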

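The Chain Rule can be remembered using a tree diagram: branches run from the dependent variable z to the intermediate variables x and y, and from each of these to the independent variables s and t.

To find $\frac{\partial z}{\partial s}$, we find the product of the partial derivatives along each path from z to s and then add these products:

\[ \frac{\partial z}{\partial s} = \frac{\partial z}{\partial x} \frac{\partial x}{\partial s} + \frac{\partial z}{\partial y} \frac{\partial y}{\partial s}. \]

Similarly, we find $\frac{\partial z}{\partial t}$ by using the paths from z to t.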

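Theorem 10 (The Chain Rule - General Version). Suppose that u is a differentiable function of n variables $x_1, \dots, x_n$, and each $x_j$ is a differentiable function of m variables $t_1, \dots, t_m$. Then u is a function of $t_1, \dots, t_m$ and
\[ \frac{\partial u}{\partial t_i} = \frac{\partial u}{\partial x_1} \frac{\partial x_1}{\partial t_i} + \frac{\partial u}{\partial x_2} \frac{\partial x_2}{\partial t_i} + \cdots + \frac{\partial u}{\partial x_n} \frac{\partial x_n}{\partial t_i} \]
for each i = 1, . . . , m.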

Example 10. Write out the Chain Rule for w = f (x, y, z, t) where

• x = x(u, v),

• y = y(u, v),

• z = z(u, v),

• t = t(u, v).

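Solution.

\[ \frac{\partial w}{\partial u} = \frac{\partial w}{\partial x} \frac{\partial x}{\partial u} + \frac{\partial w}{\partial y} \frac{\partial y}{\partial u} + \frac{\partial w}{\partial z} \frac{\partial z}{\partial u} + \frac{\partial w}{\partial t} \frac{\partial t}{\partial u}. \]

\[ \frac{\partial w}{\partial v} = \frac{\partial w}{\partial x} \frac{\partial x}{\partial v} + \frac{\partial w}{\partial y} \frac{\partial y}{\partial v} + \frac{\partial w}{\partial z} \frac{\partial z}{\partial v} + \frac{\partial w}{\partial t} \frac{\partial t}{\partial v}. \]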


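Example 11. If $w = f(x^2 - y^2, y^2 - x^2)$ and f is differentiable, show that

\[ y \frac{\partial w}{\partial x} + x \frac{\partial w}{\partial y} = 0. \]
Solution. Introduce intermediate variables:

\[ u = x^2 - y^2, \qquad v = y^2 - x^2. \]

Using the Chain Rule,

\[ \frac{\partial w}{\partial x} = \frac{\partial w}{\partial u} \frac{\partial u}{\partial x} + \frac{\partial w}{\partial v} \frac{\partial v}{\partial x} = \frac{\partial w}{\partial u}(2x) + \frac{\partial w}{\partial v}(-2x) \]
and
\[ \frac{\partial w}{\partial y} = \frac{\partial w}{\partial u} \frac{\partial u}{\partial y} + \frac{\partial w}{\partial v} \frac{\partial v}{\partial y} = \frac{\partial w}{\partial u}(-2y) + \frac{\partial w}{\partial v}(2y). \]
Therefore
\[ y \frac{\partial w}{\partial x} + x \frac{\partial w}{\partial y} = \left( \frac{\partial w}{\partial u}(2xy) + \frac{\partial w}{\partial v}(-2xy) \right) + \left( \frac{\partial w}{\partial u}(-2xy) + \frac{\partial w}{\partial v}(2xy) \right) = 0. \]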


7. Implicit Differentiation

Up to now, we have often spoken of z = f (x, y), in which case z is a function of x and y and it is explicitly given by the expression f (x, y). However, functions can also be implicitly defined.

We say that z is an implicit function of x and y defined by F (x, y, z) = 0 if for every choice of x and y, the value of z is determined by F (x, y, z) = 0.

For example, consider the following equation

x2 + y 2 + z 2 − 4 = 0.

If we regard x and y as independent variables, then we can say that z is an implicit function of x and y defined by the above equation, although in this case one can actually solve explicitly for z in terms of x and y, resulting in two equations:
\[ z = \sqrt{4 - x^2 - y^2}, \]
\[ z = -\sqrt{4 - x^2 - y^2}. \]
The point about implicit function is that we can still speak of it even when
we do not know how to solve for z in terms of the other two variables. For
instance, we can say that z is an implicit function of x and y defined by

ln(x2 yz + xy) + z = 0.

Question: Suppose z is an implicit function of x and y defined by F (x, y, z) = 0.
How do we compute $\frac{\partial z}{\partial x}$ and $\frac{\partial z}{\partial y}$ without solving for z?

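Solution. The idea is to use the Chain Rule.

Using the Chain Rule to differentiate the equation F (x, y, z) = 0 with respect to x:
\[ \frac{\partial F}{\partial x} + \frac{\partial F}{\partial z} \frac{\partial z}{\partial x} = 0, \]
since here we regard x and y as independent variables, so that $\frac{\partial x}{\partial x} = 1$ and $\frac{\partial y}{\partial x} = 0$.

Therefore, if $\frac{\partial F}{\partial z} \ne 0$, then
\[ \frac{\partial z}{\partial x} = -\frac{\partial F / \partial x}{\partial F / \partial z} = -\frac{F_x}{F_z}. \]

Likewise, if $\frac{\partial F}{\partial z} \ne 0$, then
\[ \frac{\partial z}{\partial y} = -\frac{\partial F / \partial y}{\partial F / \partial z} = -\frac{F_y}{F_z}. \]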

Theorem 11 (Implicit Differentiation: Two Independent Variables). Suppose the equation F (x, y, z) = 0, where F is differentiable, defines z implicitly as a differentiable function of x and y. Then,

\[ \frac{\partial z}{\partial x} = -\frac{F_x(x, y, z)}{F_z(x, y, z)}, \qquad \frac{\partial z}{\partial y} = -\frac{F_y(x, y, z)}{F_z(x, y, z)}, \]

provided $F_z(x, y, z) \ne 0$.

Example 12. Find $\frac{\partial z}{\partial x}$ if

\[ x^3 + y^3 + z^3 + 6xyz = 1. \]

Solution. Let $F(x, y, z) = x^3 + y^3 + z^3 + 6xyz - 1$. Then

\[ F_x = 3x^2 + 6yz, \qquad F_z = 3z^2 + 6xy. \]

Therefore, by the Implicit Differentiation Theorem,

\[ \frac{\partial z}{\partial x} = -\frac{F_x}{F_z} = -\frac{3x^2 + 6yz}{3z^2 + 6xy}. \]
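The formula of Example 12 can be tested numerically even though z is not solved for in closed form. The Python sketch below finds z by bisection for given x and y, then compares the implicit-differentiation formula with a difference quotient of the computed z. The helper `solve_z`, the bracketing interval [0, 1] and the sample point (0.5, 0.5) are assumptions made for this illustration; the interval works here because F changes sign on it near that point.

```python
def F(x, y, z):
    return x**3 + y**3 + z**3 + 6 * x * y * z - 1

def solve_z(x, y, lo=0.0, hi=1.0, iterations=60):
    """Bisection for a root z of F(x, y, z) = 0 in [lo, hi] (assumes a sign change)."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if F(x, y, lo) * F(x, y, mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

x0, y0 = 0.5, 0.5
z0 = solve_z(x0, y0)

# Formula from the Implicit Differentiation Theorem: dz/dx = -F_x / F_z.
formula = -(3 * x0**2 + 6 * y0 * z0) / (3 * z0**2 + 6 * x0 * y0)

# Difference quotient of the numerically solved z(x, y) with y held fixed.
h = 1e-5
numeric = (solve_z(x0 + h, y0) - solve_z(x0 - h, y0)) / (2 * h)

print(formula, numeric)
```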


8. Directional Derivatives

In this section, we shall extend the idea of partial derivatives to that of directional derivatives. First, recall that

$f_x(x_0, y_0)$: Rate of change of f in the positive direction of x.

$f_y(x_0, y_0)$: Rate of change of f in the positive direction of y.

Question: What is the rate of change of f in an arbitrary direction $\mathbf{u} = \langle a, b \rangle$?

Definition 12 (Directional derivative). The directional derivative of f (x, y) at $(x_0, y_0)$ in the direction of the unit vector $\mathbf{u} = \langle a, b \rangle$ is

\[ D_{\mathbf{u}} f(x_0, y_0) = \lim_{h \to 0} \frac{f(x_0 + ha, y_0 + hb) - f(x_0, y_0)}{h}, \]
provided this limit exists.

The discussion below attempts to clarify the rationale behind this definition.

Consider the surface S with equation z = f (x, y) (the graph of f ) and we let
z0 = f (x0 , y0 ).

Then, the point P (x0 , y0 , z0 ) lies on S. The vertical plane that passes through
P in the direction of u intersects S in a curve C.
The slope of the tangent line T to C at the point P is the rate of change of
z in the direction of u.
To compute this slope, we need another point on the curve C, say Q(x, y, z).
Let P′ and Q′ be the projections of P and Q on the xy-plane.
The vector $\overrightarrow{P'Q'}$ is parallel to u. So
\[ \overrightarrow{P'Q'} = h\mathbf{u} = \langle ha, hb \rangle \quad \text{for some scalar } h. \]

Therefore,
\[ \overrightarrow{P'Q'} = \langle x - x_0, y - y_0 \rangle = \langle ha, hb \rangle \]
and so
\[ x = x_0 + ha, \qquad y = y_0 + hb. \]

Consequently,
\[ \frac{\Delta z}{h} = \frac{z - z_0}{h} = \frac{f(x_0 + ha, y_0 + hb) - f(x_0, y_0)}{h}. \]
If we take the limit as h → 0, we obtain the rate of change of z (with respect to distance) in the direction of u.
This is called the directional derivative of f in the direction of u, provided the limit exists:

\[ \lim_{h \to 0} \frac{\Delta z}{h} = \lim_{h \to 0} \frac{f(x_0 + ha, y_0 + hb) - f(x_0, y_0)}{h}. \]

Based on the figure above, we can interpret $D_{\mathbf{u}} f(x_0, y_0)$ as the slope of the tangent line T in the direction given by u. Hence,

The directional derivative $D_{\mathbf{u}} f(x_0, y_0)$ is the rate of change of the function at the point $(x_0, y_0)$ in the direction given by u.

Examples:

In the figure above, Du f (x0 , y0 ) > 0.

In the figure above, Du f (x0 , y0 ) < 0.

9. Computing directional derivatives

We seldom compute directional derivatives using the definition. Instead, we often use the following formula.
Theorem 13 (Computing Directional Derivative). If f (x, y) is a differentiable function, then f has a directional derivative in the direction of any unit vector $\mathbf{u} = \langle a, b \rangle$ and

\[ D_{\mathbf{u}} f(x, y) = f_x(x, y)\,a + f_y(x, y)\,b. \]

We can rewrite it in terms of vectors:

\[ D_{\mathbf{u}} f(x, y) = \langle f_x, f_y \rangle \cdot \langle a, b \rangle = \langle f_x, f_y \rangle \cdot \mathbf{u}. \]

Proof.

\[ D_{\mathbf{u}} f(x_0, y_0) = \lim_{h \to 0} \frac{f(x_0 + ah, y_0 + bh) - f(x_0, y_0)}{h}. \]

Set $g(h) = f(x_0 + ah, y_0 + bh)$. Then

\[ D_{\mathbf{u}} f(x_0, y_0) = \lim_{h \to 0} \frac{g(h) - g(0)}{h} = g'(0). \]

We now compute $g'(0)$ in a different way.

Set $x = x_0 + ah$, $y = y_0 + bh$.

We can compute $g'(h)$ using the Chain Rule:

\[ g'(h) = \frac{\partial f}{\partial x} \frac{dx}{dh} + \frac{\partial f}{\partial y} \frac{dy}{dh} = f_x(x, y)\,a + f_y(x, y)\,b. \]

Substituting h = 0:

\[ g'(0) = f_x(x_0, y_0)\,a + f_y(x_0, y_0)\,b. \]

Hence

\[ D_{\mathbf{u}} f(x_0, y_0) = g'(0) = f_x(x_0, y_0)\,a + f_y(x_0, y_0)\,b. \]




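Notice that
\[ D_{\mathbf{u}} f = \langle f_x, f_y \rangle \cdot \mathbf{u}. \]
The vector $\langle f_x, f_y \rangle$ is so important that we will give it a special name.

Definition 14 (Gradient). The gradient of f (x, y) is the vector-valued function
\[ \nabla f(x, y) = \langle f_x, f_y \rangle = f_x\,\mathbf{i} + f_y\,\mathbf{j} = \frac{\partial f}{\partial x}\,\mathbf{i} + \frac{\partial f}{\partial y}\,\mathbf{j}, \]
provided both partial derivatives exist.

$\nabla f$ is read ‘del f ’.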

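With this notation, we have

\[ D_{\mathbf{u}} f(x, y) = \nabla f(x, y) \cdot \mathbf{u}. \]
• If $\mathbf{u} = \mathbf{i} = \langle 1, 0 \rangle$ then

\[ D_{\mathbf{i}} f = \langle f_x, f_y \rangle \cdot \langle 1, 0 \rangle = f_x. \]

• If $\mathbf{u} = \mathbf{j} = \langle 0, 1 \rangle$ then
\[ D_{\mathbf{j}} f = \langle f_x, f_y \rangle \cdot \langle 0, 1 \rangle = f_y. \]

So the partial derivatives of f with respect to x and y are just special cases of the directional derivative.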

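Example 13. Find the directional derivative of the function $f(x, y) = x^2 y^3 - 4y$ at the point (2, −1) in the direction of the vector $\mathbf{v} = 2\mathbf{i} + 5\mathbf{j}$.

Solution. First compute the gradient vector at (2, −1):

\[ \nabla f(x, y) = 2xy^3\,\mathbf{i} + (3x^2 y^2 - 4)\,\mathbf{j}, \]

\[ \nabla f(2, -1) = -4\mathbf{i} + 8\mathbf{j}. \]

Notice v is NOT a unit vector. The unit vector in the direction of v is
\[ \mathbf{u} = \frac{\mathbf{v}}{\|\mathbf{v}\|} = \frac{2}{\sqrt{29}}\,\mathbf{i} + \frac{5}{\sqrt{29}}\,\mathbf{j}. \]
Therefore
\[ D_{\mathbf{u}} f(2, -1) = \nabla f(2, -1) \cdot \mathbf{u} = \langle -4, 8 \rangle \cdot \left\langle \frac{2}{\sqrt{29}}, \frac{5}{\sqrt{29}} \right\rangle = \frac{32}{\sqrt{29}}. \]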


The idea of directional derivative can be easily extended to 3D:

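Definition 15 (3-D Directional Derivative). The directional derivative of f (x, y, z) at $(x_0, y_0, z_0)$ in the direction of the unit vector $\mathbf{u} = \langle a, b, c \rangle$ is
\[ D_{\mathbf{u}} f(x_0, y_0, z_0) = \lim_{h \to 0} \frac{f(x_0 + ha, y_0 + hb, z_0 + hc) - f(x_0, y_0, z_0)}{h}, \]
provided this limit exists.

As in the 2D case, we compute 3D directional derivatives using the gradient vector.

Theorem 16 (Computing 3-D Directional Derivative). If f is differentiable, then

\[ D_{\mathbf{u}} f(x_0, y_0, z_0) = \nabla f(x_0, y_0, z_0) \cdot \mathbf{u}, \]

where
\[ \nabla f = \langle f_x, f_y, f_z \rangle = \frac{\partial f}{\partial x}\,\mathbf{i} + \frac{\partial f}{\partial y}\,\mathbf{j} + \frac{\partial f}{\partial z}\,\mathbf{k} \]
is the gradient vector.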

MA1104 Week 4
Gradient Vector, Extrema & Lagrange
Multiplier

1. Gradient Vectors & Level Curves

In this section, we shall study a very fundamental relationship between the gradient vector $\nabla f$ of a two-variable function f (x, y) and its level curves. As an application of this connection, we can make use of gradient vectors to find the equations of
• normal lines to a curve given by f (x, y) = constant;
• tangent lines to a curve given by f (x, y) = constant.
Theorem 1 (Level Curve vs $\nabla f$). Suppose f (x, y) is a differentiable function of x and y at $(x_0, y_0)$. Suppose $\nabla f(x_0, y_0) \ne \mathbf{0}$.
Then $\nabla f(x_0, y_0)$ is normal to the level curve f (x, y) = k that contains the point $(x_0, y_0)$.
This result can be visualized in the following figure. Notice that the gradient vector $\nabla f(x_0, y_0)$ at a given point $(x_0, y_0)$ is normal to the level curve of the function containing the same point. In the figure, the gradient vector is pointing away from the level curve; in general, however, it could also be pointing in the opposite direction, into the level curve, depending on the function involved. The point is, the gradient vector must be perpendicular to the level curve.

Let us now see a proof of this theorem.

Proof. Let $\mathbf{r}(t) = \langle x(t), y(t) \rangle$ be a parametrization of the level curve f (x, y) = k containing the fixed point $(x_0, y_0)$, i.e. $f(x_0, y_0) = k$.
The main idea is to differentiate (with respect to t) both sides of the equation of the level curve:
\[ f(x, y) = k. \]
Doing this, we get
\[ f_x \frac{dx}{dt} + f_y \frac{dy}{dt} = 0, \]
where the left-hand side follows from the Chain Rule. Since $\nabla f = \langle f_x, f_y \rangle$ and $\left\langle \frac{dx}{dt}, \frac{dy}{dt} \right\rangle = \mathbf{r}'(t)$, we can rewrite the above as

\[ \langle f_x, f_y \rangle \cdot \mathbf{r}'(t) = 0, \]

which says that $\nabla f$ is perpendicular to $\mathbf{r}'(t)$.

Since $\mathbf{r}'(t)$ is tangential to the level curve, this says that $\nabla f$ is perpendicular to the level curve!
In particular, $\nabla f(x_0, y_0)$ is perpendicular to the level curve f (x, y) = k at $(x_0, y_0)$, as required.

Example 1. Find the equation of the normal line (on the xy-plane) at the point (2, 1) to the ellipse
\[ \frac{x^2}{4} + y^2 = 2. \]
Solution. Let $f(x, y) = x^2/4 + y^2$. Then

\[ \nabla f(x, y) = \langle x/2, 2y \rangle, \]

\[ \nabla f(2, 1) = \langle 1, 2 \rangle. \]

By the preceding theorem, a vector which is normal to the ellipse at (2, 1) is $\nabla f(2, 1) = \langle 1, 2 \rangle$. Hence, using this vector as the direction vector of the line, a set of parametric equations for the normal line is

\[ x = 2 + t, \quad y = 1 + 2t, \quad t \in \mathbb{R}. \]

Example 2. Find the equation of the tangent line (on the xy-plane) at the point (2, 1) to the ellipse
\[ \frac{x^2}{4} + y^2 = 2. \]
Solution. From the previous example, we know that the gradient vector at (2, 1) is
\[ \nabla f(2, 1) = \langle 1, 2 \rangle. \]
A tangent vector must be perpendicular to this gradient. Since $\langle 2, -1 \rangle \cdot \langle 1, 2 \rangle = 0$, a vector tangential to the ellipse at (2, 1) is

\[ \langle 2, -1 \rangle. \]

Using this vector as the direction vector for the tangent line, a set of parametric equations for the tangent line is:

\[ x = 2 + 2t, \quad y = 1 - t, \quad t \in \mathbb{R}. \]


2. Gradient Vectors & Level Surfaces

As in the two-variable case, there is a fundamental relationship between the gradient vector $\nabla F$ of a three-variable function F (x, y, z) and its level surfaces! As an application of this connection, we can use gradient vectors to find the equation of the tangent plane to a surface whose equation is F (x, y, z) = constant.

Theorem 2 (Level Surface vs $\nabla F$). Suppose F (x, y, z) is a differentiable function of x, y and z at $(x_0, y_0, z_0)$. Suppose $\nabla F(x_0, y_0, z_0) \ne \mathbf{0}$.
Then $\nabla F(x_0, y_0, z_0)$ is normal to the level surface F (x, y, z) = k that contains the point $(x_0, y_0, z_0)$.

The proof is very similar to the two-variable case. However, we shall omit
the proof here (although it is given in the online lecture videos).

As before, we can visualize this fact in the figure below. Notice that the gradient vector $\nabla F$ at the given point $P(x_0, y_0, z_0)$ is always perpendicular to the level surface of the function containing that point. In the figure, the gradient vector is pointing away from the level surface; in general, however, it could also be pointing in the opposite direction, into the level surface, depending on the function involved. The point is, the gradient vector must be perpendicular to the level surface.

Consequently, we can use the gradient vector as a normal vector to the tan-
gent plane to the level surface F (x, y, z) = k at the given point.

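Theorem 3 (Tangent Plane to Level Surface). Suppose F is differentiable at $(x_0, y_0, z_0)$ and $\nabla F(x_0, y_0, z_0) \ne \mathbf{0}$. Then the tangent plane to the level surface F (x, y, z) = k at the point $(x_0, y_0, z_0)$ has equation

\[ \nabla F(x_0, y_0, z_0) \cdot \langle x - x_0, y - y_0, z - z_0 \rangle = 0. \]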

Example 3. Find the equation of the tangent plane at the point (−2, 1, −3)
to the ellipsoid

x²/4 + y² + z²/9 = 3.

Solution. The ellipsoid is the level surface (with k = 3) of the function

F(x, y, z) = x²/4 + y² + z²/9.

Therefore,

Fx(x, y, z) = x/2,   Fy(x, y, z) = 2y,   Fz(x, y, z) = 2z/9,

Fx(−2, 1, −3) = −1,   Fy(−2, 1, −3) = 2,   Fz(−2, 1, −3) = −2/3.

The equation of the tangent plane at (−2, 1, −3) is

∇F(−2, 1, −3) · ⟨x − (−2), y − 1, z − (−3)⟩ = 0,

−1(x + 2) + 2(y − 1) − (2/3)(z + 3) = 0,

which simplifies to

3x − 6y + 2z + 18 = 0.
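
The arithmetic in Example 3 can be double-checked with a short script. The following is only a sketch: the helper names grad_F and tangent_plane are our own, and the partial derivatives are typed in by hand rather than computed symbolically. It uses exact fractions so that the coefficient −2/3 is not rounded.

```python
from fractions import Fraction

def grad_F(x, y, z):
    """Gradient of F(x, y, z) = x^2/4 + y^2 + z^2/9, entered by hand."""
    return (Fraction(x) / 2, Fraction(2) * y, Fraction(2) * z / 9)

def tangent_plane(point):
    """Return (a, b, c, d) such that a*x + b*y + c*z + d = 0 is the tangent plane."""
    x0, y0, z0 = point
    a, b, c = grad_F(x0, y0, z0)
    # Expanding grad F . <x - x0, y - y0, z - z0> = 0 gives the constant term d.
    d = -(a * x0 + b * y0 + c * z0)
    return a, b, c, d

a, b, c, d = tangent_plane((-2, 1, -3))
# Multiply by -3 to clear the fraction, as in the last step of the example.
scaled = tuple(int(-3 * t) for t in (a, b, c, d))
print(scaled)
```

The tuple scaled lists the coefficients of x, y, z and the constant term, to be compared with 3x − 6y + 2z + 18 = 0.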


3. Maximum/Minimum Rate of Change

In this section, we wish to answer the following question:

Question: At a given point (x0, y0, z0), in which direction does f(x, y, z)
change the fastest/slowest?

In other words, what is the maximum/minimum rate of change of f at
(x0, y0, z0)?

Notice that the same question can be asked for a two-variable function f(x, y).

The answer lies in ∇f!

To fix the idea, consider the rate of change of f(x, y, z) along a unit vector
u at (x0, y0, z0). Recall that this is just the directional derivative of f along
u at this point:

Du f(x0, y0, z0) = ∇f(x0, y0, z0) · u.

Here is the main idea: the function f has been fixed by the context, so its
gradient vector ∇f(x0, y0, z0) at the fixed point (x0, y0, z0) has also been
fixed. What can change is the direction u. So we are going to vary u and see
which one gives the maximum or minimum value of Du f.

Let θ be the angle between ∇f and u. Then

Du f = ∇f · u
     = ||∇f|| ||u|| cos θ
     = ||∇f|| cos θ      (since u is a unit vector).

• The maximum value of cos θ is 1 and this happens when θ = 0.
  So the maximum value of Du f is ||∇f|| and it occurs when θ = 0, i.e.
  u points in the direction of ∇f.

• The minimum value of cos θ is −1 and this happens when θ = π.
  So the minimum value of Du f is −||∇f|| and it occurs when θ = π, i.e.
  u points in the direction of −∇f.

We record these facts in the following theorem.

Theorem 4 (Maximizing Rate of Increase/Decrease of f). Suppose f is a
differentiable function of two or three variables. Let P denote a given point.
Assume ∇f(P) ≠ 0.

• ∇f(P) points in the direction of maximum rate of change of f at P
  (the maximum value of Du f(P) is ||∇f(P)||).

• −∇f(P) points in the direction of minimum rate of change of f at P,
  i.e. the direction of fastest decrease
  (the minimum value of Du f(P) is −||∇f(P)||).

Example 4. Let f(x, y) = x e^y.

• In what direction does f have the maximum rate of change at the point
  P(2, 0)?
• What is this maximum rate of change?

Solution. Note that

∇f(x, y) = ⟨fx, fy⟩ = ⟨e^y, x e^y⟩.

• f increases fastest in the direction of the gradient vector

∇f(2, 0) = ⟨1, 2⟩.

• The maximum rate of change is

||∇f(2, 0)|| = √(1² + 2²) = √5.
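
As a numerical sanity check of Theorem 4 on Example 4, we can scan over many unit vectors u = ⟨cos t, sin t⟩ and see which one gives the largest directional derivative. This is only a sketch: the function names and the scan resolution (steps = 3600) are our own choices, and the gradient is entered by hand.

```python
import math

def grad_f(x, y):
    """Gradient of f(x, y) = x * e^y, entered by hand."""
    return (math.exp(y), x * math.exp(y))

def directional_derivative(point, t):
    """D_u f at `point` for the unit vector u = <cos t, sin t>."""
    gx, gy = grad_f(*point)
    return gx * math.cos(t) + gy * math.sin(t)

P = (2.0, 0.0)
steps = 3600  # assumed resolution of the angle scan
angles = [2 * math.pi * k / steps for k in range(steps)]

# Pick the scanned direction with the largest rate of change.
best_angle = max(angles, key=lambda t: directional_derivative(P, t))
best_rate = directional_derivative(P, best_angle)

gx, gy = grad_f(*P)
print(best_rate, math.hypot(gx, gy))   # scanned maximum vs. ||grad f(P)||
print(best_angle, math.atan2(gy, gx))  # scanned direction vs. direction of grad f(P)
```

Up to the resolution of the scan, the two numbers on each printed line should agree: the best direction is that of ⟨1, 2⟩ and the best rate is close to √5.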

4. Critical Points of f (x, y)

One of the applications of multivariable calculus is to find the maximum or
minimum of multivariable functions on a given domain. Very often in the
process, we are required to first determine the critical points of the function
which are candidates for such maximum and minimum.

In this section, we shall look at critical points of a two-variable function
f(x, y). There are three types, namely

• Local maximum

• Local minimum

• Saddle points

Let us start off with some definitions.

Definition 5 (Local Maximum). Let f(x, y) : D → R. Then f has a local
maximum at (a, b) if

f(x, y) ≤ f(a, b) for all points (x, y) close to (a, b).

The number f(a, b) is called a local maximum value.

Definition 6 (Local Minimum). Let f(x, y) : D → R. Then f has a local
minimum at (a, b) if

f(x, y) ≥ f(a, b) for all points (x, y) close to (a, b).

The number f(a, b) is called a local minimum value.

The following is a property that both local max and local min must have.

Theorem 7 (A necessary condition). If f has a local maximum or minimum
at (a, b) and the first-order partial derivatives of f exist there, then

fx(a, b) = fy(a, b) = 0.

We shall not provide a formal proof here. Instead, we shall establish the
intuition behind this fact. Just imagine you are standing at a local minimum
on the surface of a two-variable function. All the points around you in the
vicinity must be ‘higher’ than where you are standing. This says that you
must be standing upright or on a flat plane, as shown in the figure below.

[Figure: standing on a local minimum]

This implies that the tangent plane at a local minimum must be flat! In other
words, the equation of the tangent plane is z = c, where c is some constant.

The same can be said when you are standing on a local max:

[Figure: standing on a local maximum]

But now, recall that the tangent plane equation at the point you are standing,
say (a, b), is

z = f(a, b) + fx(a, b)(x − a) + fy(a, b)(y − b).

Since the tangent plane is flat at a local max/min, we conclude that

fx(a, b) = fy(a, b) = 0.

The contrapositive of the preceding theorem is that if we are not standing
upright (i.e. the tangent plane is not flat) at (a, b), then the point (a, b) can
be neither a local max nor a local min.

However, the converse of the preceding theorem is NOT true! The fact that
the tangent plane is flat at (a, b) DOES NOT imply that the point is a local
max or local min:

fx(a, b) = fy(a, b) = 0 ⇏ Local Max/Min

In the following figure, we notice that the tangent plane at the point where
we are standing is flat, but there are points in the vicinity where the function
value is larger, as well as points where it is smaller, than the value of our
function at that point.

The last figure above suggests that we consider another type of point, called
a saddle point.

Definition 9 (Saddle Point). Let f(x, y) : D → R. A point (a, b) is
called a saddle point of f if

• fx(a, b) = fy(a, b) = 0; AND

• every neighborhood of (a, b) contains points (x, y) ∈ D for which
  f(x, y) < f(a, b) and points (x, y) ∈ D for which f(x, y) > f(a, b).

Suppose you are standing on a surface and you are standing upright
(parallel to the z-axis). Moreover, when you begin walking, some directions
take you uphill while other directions take you downhill.

Then you are standing at a saddle point!

Example 5. Find the local maximum/minimum and saddle points of
f(x, y) = y² − x².

Solution. Since fx = −2x and fy = 2y, the only solution to fx = fy = 0 is
(0, 0). So the only candidate is (0, 0).

We still have to check whether (0, 0) is a local maximum, a local minimum
or a saddle point.

Note that f(0, 0) = 0.

• If y = 0 and x ≠ 0, then

f(x, y) = −x² < 0 = f(0, 0).

• If x = 0 and y ≠ 0, then

f(x, y) = y² > 0 = f(0, 0).

Both kinds of points occur arbitrarily close to (0, 0). Hence (0, 0) is a
saddle point.
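
The sign check in Example 5 can be imitated numerically by sampling f on a small circle around the candidate point. This is a rough sketch, not a proof: the helper compare_nearby, the radius and the number of samples are our own assumptions, and sampling can only suggest (never establish) the type of a critical point.

```python
import math

def f(x, y):
    return y**2 - x**2

def compare_nearby(func, a, b, radius=1e-3, samples=8):
    """Sample func on a small circle around (a, b) and compare with func(a, b)."""
    centre = func(a, b)
    higher = lower = False
    for k in range(samples):
        t = 2 * math.pi * k / samples
        value = func(a + radius * math.cos(t), b + radius * math.sin(t))
        if value > centre:
            higher = True
        elif value < centre:
            lower = True
    if higher and lower:
        return "saddle-like"
    # Only one sign (or none) was seen among the sampled points.
    return "local min candidate" if higher else "local max candidate"

print(compare_nearby(f, 0.0, 0.0))
```

For f(x, y) = y² − x² the samples along the x-axis are lower and those along the y-axis are higher, matching the two bullet points in the solution.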

5. Finding Absolute Maximum/Minimum

Definition 10 (Absolute Maximum). Let f(x, y) : D → R. Then f has an
absolute maximum at (a, b) if

f(x, y) ≤ f(a, b) for all points (x, y) in the domain D.

The number f(a, b) is called an absolute maximum value.

Definition 11 (Absolute Minimum). Let f(x, y) : D → R. Then f has an
absolute minimum at (a, b) if

f(x, y) ≥ f(a, b) for all points (x, y) in the domain D.

The number f(a, b) is called an absolute minimum value.

The bad news is that an absolute maximum/minimum does not always exist!

However, if f is continuous and we restrict its domain to a closed and
bounded region, then the absolute maximum and absolute minimum always exist!

Definition 12 (Closed Set in R²). A set R ⊆ R² is closed if it contains all
its boundary points.
(A boundary point of R is a point (a, b) such that every disk with center
(a, b) contains points in R and also points in R² \ R.)

Definition 13 (Bounded Set in R²). A set R ⊆ R² is bounded if it is
contained within some disk. In other words, it is finite in extent.

[Figure: examples of closed and bounded sets in R²]

[Figure: examples of regions which are bounded but not closed]

Theorem 14 (Extreme Value Theorem). If f(x, y) is continuous on a closed
and bounded set D ⊆ R², then f attains

• an absolute maximum value f(x1, y1), AND

• an absolute minimum value f(x2, y2)

at some points (x1, y1) and (x2, y2) in D.

The following are the basic steps for finding Absolute Maximum/Minimum
- The Closed Interval Method:

Step 1. Find the values of f at its critical points in D.


Step 2. Find the extreme values of f on the boundary of D.
Step 3. The largest of the values from Step 1 and Step 2 is the absolute
maximum value; the smallest of these values is the absolute minimum value.

Example 6. Find the absolute maximum and absolute minimum values of
the function f(x, y) = x² − 2xy + 2y on the rectangle

D = {(x, y) : 0 ≤ x ≤ 3, 0 ≤ y ≤ 2}.

Solution.

Step 1. Find the values of f at critical points.


Solving the equations:

fx = 2x − 2y = 0, fy = −2x + 2 = 0,
we obtain only one critical point (1, 1), which yields f (1, 1) = 1.

Step 2. Find extreme values of f on the boundary of D.

There are four different boundaries L1 , L2 , L3 and L4 as shown in the figure.


We need to analyse each of them separately, and find the corresponding
maximum and minimum on each of these boundaries.

On L1, we have y = 0, so

f(x, 0) = x²,   0 ≤ x ≤ 3.

This is an increasing function of x. So

f(0, 0) = 0 is the minimum and f(3, 0) = 9 is the maximum.

On L2, we have x = 3, so

f(3, y) = 9 − 4y,   0 ≤ y ≤ 2.

This is a decreasing function of y. So

f(3, 0) = 9 is the maximum and f(3, 2) = 1 is the minimum.

On L3, we have y = 2, so

f(x, 2) = x² − 4x + 4 = (x − 2)²,   0 ≤ x ≤ 3.

By single-variable calculus, we have

f(2, 2) = 0 is the minimum and f(0, 2) = 4 is the maximum.

Finally, on L4, we have x = 0, so

f(0, y) = 2y,   0 ≤ y ≤ 2.

This is an increasing function of y. So

f(0, 0) = 0 is the minimum and f(0, 2) = 4 is the maximum.

Step 3. Let us compare the values obtained in Step 1 and Step 2:

point     value of f
(1, 1)    1
(0, 0)    0
(3, 0)    9
(3, 2)    1
(0, 2)    4
(2, 2)    0

• The absolute maximum value of f on D is f(3, 0) = 9.

• The absolute minimum value is f(0, 0) = f(2, 2) = 0.
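
The answer to Example 6 can be cross-checked by brute force: evaluate f on a fine grid covering the rectangle D and take the largest and smallest values. This sketch is not the method of the notes; the grid spacing of 0.01 is an assumption chosen so that the points (3, 0), (0, 0) and (2, 2) lie exactly on the grid.

```python
def f(x, y):
    return x**2 - 2 * x * y + 2 * y

# Grid over D = {0 <= x <= 3, 0 <= y <= 2} with spacing 0.01 (assumed).
grid = [(i / 100, j / 100) for i in range(301) for j in range(201)]

max_point = max(grid, key=lambda p: f(*p))
min_point = min(grid, key=lambda p: f(*p))

print(max_point, f(*max_point))
print(min_point, f(*min_point))
```

Note that min returns only the first grid point attaining the smallest value, so it reports one of the two minimisers found in Step 3.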



6. Lagrange Multiplier - 2-Variable Case

In the previous section, we considered optimisation problems for f(x, y) on a
region D. Sometimes, in practice, we may need to replace the region D by
just a curve in the domain. In particular, we shall consider the following
question:

Problem: Find the extrema of f(x, y) subject to a given constraint
g(x, y) = k.

Notice that g(x, y) = k represents a curve on the xy-plane. The procedure
to solve this problem is called the Method of Lagrange Multipliers.

Instead of proving this method in a rigorous manner, I will just explain the
idea behind this technique by using the following example.

Let us assume that the function f has a maximum on the constraint g(x, y) =
k. Instead of visualising the function as a surface, we can look at its level
curves. This way, we can visualise the level curves of f and the constraint
curve g(x, y) = k simultaneously on the xy-plane as shown below.

This is what we are going to do:

Starting from a point on the constraint curve g(x, y) = k, we are going to move
along the curve so as to maximise the function value.

Each time we arrive at a new point, we shall answer the question ‘Are we at the
maximum yet?’.

If the answer is ‘No’, we will keep moving in the direction that can increase our
function value.

This way, we shall stop when we arrive at the maximum.

Let the red dot represent the point (x0 , y0 ), which is our current position.
Suppose we are now at the following point:

The function value at this point cannot be maximum, since it lies between
the level curves f (x, y) = 7 and f (x, y) = 8. Its value must be somewhere
between 7 and 8. This can be increased slightly if we move towards the right.

Suppose the point (x0 , y0 ) now is here:

Again, the function value at this point cannot be maximum, since it lies
between the level curves f (x, y) = 9 and f (x, y) = 10. Its value must be
somewhere between 9 and 10. This can be increased slightly if we move
towards the right.

Suppose the point (x0 , y0 ) now is here:

Again, the function value at this point cannot be maximum, since it lies
between the level curves f(x, y) = 8 and f(x, y) = 9. Its value must be
somewhere between 8 and 9. This can be increased slightly if we reverse
direction and go back the way we came.

Suppose the point (x0 , y0 ) now is here:

Again, the function value at this point cannot be maximum, since it lies
between the level curves f (x, y) = 9 and f (x, y) = 10. Its value must be
somewhere between 9 and 10.

It is not hard to believe that at the maximum (or minimum) point (x0, y0),
the level curve of f containing (x0, y0) and the constraint curve g(x, y) = k
must have the same tangent line!
Since each gradient vector is perpendicular to its own curve, this means that
at the maximum (minimum) point (x0, y0), the gradient vectors are parallel to
each other:

∇f(x0, y0) = λ∇g(x0, y0) for some λ ∈ R.

Theorem 15 (Lagrange Multipliers for Function of Two Variables). Suppose
f(x, y) and g(x, y) are differentiable functions such that ∇g(x, y) ≠ 0
on the constraint curve g(x, y) = k.

Suppose that the minimum/maximum value of f(x, y) subject to the
constraint g(x, y) = k occurs at (x0, y0). Then

∇f(x0, y0) = λ∇g(x0, y0)

for some constant λ (called a Lagrange Multiplier).


Important remark: When applying the above result, we assume that the
maximum/minimum value of f subject to the constraint g(x, y) = k exists. In
this course we always assume this, unless we are explicitly asked to prove
the existence of extrema.

The following are the steps of the method of Lagrange Multipliers for
two-variable functions:

Step 1. Find all values of x, y and λ such that

∇f(x, y) = λ∇g(x, y)
and
g(x, y) = k.

By comparing components of the gradient vectors in the first equation, we
have to solve the following system of equations:

fx = λgx
fy = λgy
g(x, y) = k.

Our aim is to find the point(s) (x, y) satisfying the above equations.
Occasionally, we still have to solve for λ first before we can get to the
solutions for (x, y).

Step 2. Evaluate f at all the points obtained in Step 1.

• The largest of these values is the maximum value of f;
• The smallest is the minimum value of f.

Example 7. Find the extreme values of the function f(x, y) = x² + 2y² on
the circle x² + y² = 1.

Solution. Using Lagrange Multipliers, we want to solve the equations

∇f = λ∇g,   g(x, y) = 1,

where g(x, y) = x² + y².

Writing in components, we need to solve

2x = 2xλ        (e1)
4y = 2yλ        (e2)
x² + y² = 1.    (e3)

From (e1), we have x = 0 or λ = 1. Consider the following cases:

• x = 0: Then it follows from (e3) that y = ±1. So the solutions are
  (0, 1) and (0, −1).

• λ = 1: Then it follows from (e2) that y = 0. Hence, from (e3), we have
  x = ±1. So the solutions are (1, 0) and (−1, 0).

Summary:

points     f(x, y)   Conclusion
(0, 1)     2         max
(0, −1)    2         max
(1, 0)     1         min
(−1, 0)    1         min
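
As a cross-check of Example 7 that does not use Lagrange multipliers, we can parametrise the circle as (cos t, sin t) and scan over t. This is only a sketch under our own choices (the scan resolution and the helper cross_of_gradients, with both gradients entered by hand). It also checks that the gradients are parallel at the extreme points, by computing the 2D cross product of ∇f and ∇g, which vanishes exactly when the two vectors are parallel.

```python
import math

def f(x, y):
    return x**2 + 2 * y**2

def cross_of_gradients(x, y):
    """2D cross product of grad f = <2x, 4y> and grad g = <2x, 2y>."""
    return (2 * x) * (2 * y) - (4 * y) * (2 * x)

steps = 3600  # assumed resolution of the scan around the circle
circle = [
    (math.cos(2 * math.pi * k / steps), math.sin(2 * math.pi * k / steps))
    for k in range(steps)
]

max_point = max(circle, key=lambda p: f(*p))
min_point = min(circle, key=lambda p: f(*p))

print(max_point, f(*max_point), cross_of_gradients(*max_point))
print(min_point, f(*min_point), cross_of_gradients(*min_point))
```

Up to rounding, the scan should land on one of the points (0, ±1) for the maximum and one of (±1, 0) for the minimum, with a cross product close to 0 at both.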

Here is the picture:

7. Lagrange Multiplier - 3-Variable Case

The constrained optimisation problem for two-variable functions in the
preceding section can also be considered for three-variable functions
f(x, y, z):

Problem: Find the extrema of f(x, y, z) subject to a given constraint
g(x, y, z) = k.

It is worth noting that this time, the constraint g(x, y, z) = k is a
surface in the xyz-space, rather than a curve on the xy-plane.

Again, we can answer this question by using the Method of Lagrange
Multipliers.

Theorem 16 (Lagrange Multiplier - Three Variables). Suppose f(x, y, z)
and g(x, y, z) are differentiable functions such that ∇g(x, y, z) ≠ 0 on the
constraint surface g(x, y, z) = k.

Suppose that the minimum/maximum value of f(x, y, z) subject to the
constraint g(x, y, z) = k occurs at (x0, y0, z0).

Then
∇f(x0, y0, z0) = λ∇g(x0, y0, z0)
for some constant λ (called a Lagrange Multiplier).

Below are the steps of the method of Lagrange Multipliers in the
three-variable case:

Step 1. Find all values of x, y, z and λ such that

∇f(x, y, z) = λ∇g(x, y, z)

and
g(x, y, z) = k.

By comparing components of the gradient vectors in the first equation, we
have to solve the following system of equations:

fx = λgx
fy = λgy
fz = λgz
g(x, y, z) = k.

Our aim is to find the point(s) (x, y, z) satisfying the above equations.
Occasionally, we still have to solve for λ first before we can get to the
solutions for (x, y, z).

Step 2. Evaluate f at all the points obtained in Step 1.

• The largest of these values is the maximum value of f;

• The smallest is the minimum value of f.

Example 8. A rectangular box without a lid is to be made from 12 m² of
cardboard. Find the maximum volume of such a box.

Solution. Let x, y, z be the dimensions of the box (measured in meters),
where the bottom surface is a rectangle whose area is xy.

We wish to maximize

V(x, y, z) = xyz

subject to the constraint

g(x, y, z) = 2xz + 2yz + xy = 12.

Using the Method of Lagrange Multipliers, we look for values of x, y, z and λ
such that

∇V = λ∇g,   g(x, y, z) = 12.

That is, we need to solve the equations:

yz = λ(2z + y)           (E1)

xz = λ(2z + x)           (E2)

xy = λ(2x + 2y)          (E3)

2xz + 2yz + xy = 12.     (E4)

Note that none of x, y, z can be 0 (why?).

Now, multiply (E1), (E2), (E3) by x, y and z respectively:

xyz = λ(2xz + xy)        (E5)

xyz = λ(2yz + xy)        (E6)

xyz = λ(2xz + 2yz)       (E7)

Also, notice that λ ≠ 0; otherwise, it follows from (E1) that one of y and z
must be 0, a contradiction.

Therefore, from (E5) and (E6),

2xz + xy = 2yz + xy,
xz = yz,
x = y   (since z ≠ 0).

On the other hand, from (E6) and (E7),

2yz + xy = 2xz + 2yz,
xy = 2xz,
y = 2z   (since x ≠ 0).

Now, putting x = y = 2z into (E4),

2(2z)z + 2(2z)z + (2z)(2z) = 12z² = 12,

which gives z = 1 (since z cannot be negative).

Hence
x = 2, y = 2, z = 1.

The maximum volume is 2 · 2 · 1 = 4 m³.
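
The result of Example 8 can be cross-checked without Lagrange multipliers by eliminating z from the constraint, z = (12 − xy)/(2x + 2y), and searching over a grid of base dimensions. This is a sketch under our own assumptions: the helper names, the search range 0 < x, y < 4 and the spacing 0.01 are not part of the notes.

```python
def height(x, y):
    """Solve the constraint 2xz + 2yz + xy = 12 for z."""
    return (12 - x * y) / (2 * x + 2 * y)

def volume(x, y):
    """Volume xyz of the box once z is fixed by the constraint."""
    return x * y * height(x, y)

# Candidate base dimensions on a 0.01 grid (assumed search range and spacing).
# Points with xy > 12 give a negative height, hence a negative volume,
# so they can never be selected as the maximum.
candidates = [(i / 100, j / 100) for i in range(1, 400) for j in range(1, 400)]

best_x, best_y = max(candidates, key=lambda p: volume(*p))
best_z = height(best_x, best_y)

print(best_x, best_y, best_z, volume(best_x, best_y))
```

The search should single out the base x = y = 2 with height z = 1, in agreement with the solution above.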



