Calc3 Chapter1


Lecture Notes for

MTH5102: CALCULUS III


Dr. Will Sutherland
These notes are updated from previous versions by Malcolm
MacCallum and M.J. Thompson, with figures by C.D. Murray, and by
P. Saha.
School of Mathematical Sciences,
Queen Mary, University of London
September 2010

“It is one of the most unnatural features of science that the abstract language of mathematics
should provide such a powerful tool for describing the behaviour of systems both inanimate, as
in physics, and living, as in biology. Why the world should conform to mathematical descriptions
is a deep question. Whatever the answer, it is astonishing.”
Lewis Wolpert (1992)

Students on the module MTH5102 Calculus III are welcome to download, print and photocopy these notes,
in whole or in part, for their personal use. The notes are intended to supplement rather than to replace the
lectures.


© Will Sutherland 2009-10, M.A.H. MacCallum 2006-8, 2001, P. Saha 2005, M.J. Thompson 1999.
Chapter 1

Introductory material

Last revised: 29 Sep 2010

This chapter gives a quick review of the key parts of the prerequisite courses (Calculus I and II, and
Geometry I) which we will actually use in Calculus III, adding some extra material. Those parts which are
revision will be without examples.

1.1 Trigonometric functions

1.1.1 Values

(See Thomas 1.6)


We can quickly obtain the value of a trigonometric function for any argument in terms of values for $x \in [0, \tfrac{1}{2}\pi]$
by remembering a few things. First we have the table

Angle   $0^\circ$   $30^\circ = \pi/6$ rad   $45^\circ = \pi/4$ rad   $60^\circ = \pi/3$ rad   $90^\circ = \pi/2$ rad
cos     $1$         $\sqrt{3}/2$             $1/\sqrt{2}$             $1/2$                    $0$
sin     $0$         $1/2$                    $1/\sqrt{2}$             $\sqrt{3}/2$             $1$

To get the sign for other values we can use the mnemonic table

Radians              Degrees                      sin   cos   tan   Positive functions

$(0, \pi/2)$         $(0^\circ, 90^\circ)$        +     +     +     All
$(\pi/2, \pi)$       $(90^\circ, 180^\circ)$      +     −     −     Sin
$(\pi, 3\pi/2)$      $(180^\circ, 270^\circ)$     −     −     +     Tan
$(3\pi/2, 2\pi)$     $(270^\circ, 360^\circ)$     −     +     −     Cos

sometimes called the ‘Add Sugar To Coffee’ rule, or use Thomas’ variant “All Students Take Calculus”.
(Note: to be entirely accurate we should have special rows in this table for the values $\tfrac{1}{2}\pi$ etc. because at those
points one or more of the functions will be zero or unbounded.)

Then we remember what happens when we replace x by −x, x + π /2 or x + π :

\[
\begin{aligned}
\cos(-x) &= \cos x, & \sin(-x) &= -\sin x, \\
\cos(x + \tfrac{\pi}{2}) &= -\sin x, & \sin(x + \tfrac{\pi}{2}) &= \cos x, \\
\cos(x + \pi) &= -\cos x, & \sin(x + \pi) &= -\sin x.
\end{aligned}
\tag{1.1}
\]

These are very easy to derive from $e^{ix} = \cos x + i \sin x$, remembering that $e^{i\pi/2} = i$, $e^{i\pi} = -1$. Using them
in combination we can get
\[ \cos(\pi - x) = -\cos x, \qquad \sin(\pi - x) = \sin x, \]
and so on.

More generally
\[ \cos(x + (m + \tfrac{1}{2})\pi) = (-1)^{m+1} \sin x, \qquad \sin(x + (m + \tfrac{1}{2})\pi) = (-1)^{m} \cos x, \tag{1.2} \]
\[ \cos(x + n\pi) = (-1)^{n} \cos x, \qquad \sin(x + n\pi) = (-1)^{n} \sin x, \tag{1.3} \]

where m and n are integers. These identities enable us to relate the value we want to a value in the first
quadrant (i.e. the range $[0, \tfrac{1}{2}\pi]$). Remember the special cases for x = 0,

\[ \cos(n\pi) = (-1)^{n}, \qquad \sin((n + \tfrac{1}{2})\pi) = (-1)^{n}, \tag{1.4} \]
\[ \cos((n + \tfrac{1}{2})\pi) = 0, \qquad \sin(n\pi) = 0, \tag{1.5} \]

which will turn up regularly later on.
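
The reduction to the first quadrant can be sketched in Python. This is a minimal illustration added to these notes, not a recommended way to compute cosines: the function name `cos_via_first_quadrant` is hypothetical, and it uses only cos(−x) = cos x, identity (1.3) and cos(π − x) = −cos x.

```python
import math

def cos_via_first_quadrant(x):
    """Evaluate cos(x) using a cosine value for an argument in [0, pi/2] only.

    Hypothetical helper for illustration. It relies on
    cos(-x) = cos x, cos(x + n*pi) = (-1)**n cos x  (1.3),
    and cos(pi - x) = -cos x.
    """
    x = abs(x)                      # cos(-x) = cos x
    n = math.floor(x / math.pi)     # number of whole multiples of pi to strip off
    y = x - n * math.pi             # now 0 <= y < pi (up to rounding)
    sign = -1 if n % 2 else 1       # the factor (-1)**n from (1.3)
    if y > math.pi / 2:             # second quadrant: cos(y) = -cos(pi - y)
        y = math.pi - y
        sign = -sign
    return sign * math.cos(y)       # y now lies in the first quadrant

# Compare against the library cosine for a few arguments.
for angle in (2.0, 4.0, -5.5, 10.0):
    assert math.isclose(cos_via_first_quadrant(angle), math.cos(angle), abs_tol=1e-12)
```

The same pattern works for sin, using the sine halves of the identities.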

1.1.2 Identities for the trigonometric functions

The most important formulae to remember are

\[ \sin^2 A + \cos^2 A = 1 \tag{1.6} \]
\[ \cos(A + B) = \cos A \cos B - \sin A \sin B \tag{1.7} \]
\[ \sin(A + B) = \sin A \cos B + \cos A \sin B. \tag{1.8} \]

If you have trouble remembering which of the last two is which, and which has the minus in it, try substituting
some special values such as A = 0 or $B = \tfrac{1}{2}\pi$ and checking the result. For example, taking A = 0 in the last
equation gives sin B = 0 + sin B, consistent, whereas if you had tried sin(A + B) = sin A cos B − cos A sin B you
would get sin B = 0 − sin B, clearly wrong. From these and the earlier results Eq. 1.1 we get

\[ \cos(A - B) = \cos A \cos B + \sin A \sin B, \]
\[ \sin(A - B) = \sin A \cos B - \cos A \sin B, \]

and by adding or subtracting various pairs of the above equations, we get

\[ \cos A \cos B = \tfrac{1}{2}(\cos(A + B) + \cos(A - B)) \tag{1.9} \]
\[ \sin A \sin B = \tfrac{1}{2}(\cos(A - B) - \cos(A + B)) \tag{1.10} \]
\[ \sin A \cos B = \tfrac{1}{2}(\sin(A + B) + \sin(A - B)), \tag{1.11} \]
which we will find very useful in doing integrations like $\int \cos(nx) \cos(mx)\,dx$ which turn up later on.
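
As a sketch of how (1.9) is used in such an integration (a worked illustration added here, assuming n and m are integers with n ≠ ±m so that neither denominator vanishes):

```latex
\begin{aligned}
\int \cos(nx)\cos(mx)\,dx
  &= \tfrac{1}{2}\int \bigl[\cos((n+m)x) + \cos((n-m)x)\bigr]\,dx \\
  &= \frac{\sin((n+m)x)}{2(n+m)} + \frac{\sin((n-m)x)}{2(n-m)} + \text{constant}.
\end{aligned}
```

Taken between the limits −π and π, both sine terms vanish, so the definite integral is zero for such n and m.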

The double angle cases

\[ \sin 2x = 2 \sin x \cos x \]
\[ \cos 2x = \cos^2 x - \sin^2 x = 2\cos^2 x - 1 = 1 - 2\sin^2 x \]
\[ \cos^2 x = \tfrac{1}{2}(1 + \cos 2x) \]
\[ \sin^2 x = \tfrac{1}{2}(1 - \cos 2x) \]

come up often; we get the “half angle” cases by just substituting in y = 2x, x = y/2 in the above.

Using the basic identities we can easily derive plenty more, such as

\[ \sec^2 A = 1 + \tan^2 A \]
\[ \cos C + \cos D = 2 \cos \tfrac{1}{2}(C + D) \cos \tfrac{1}{2}(C - D). \]

We should also note (see Thomas 3.4) that for any constant k,

\[ \frac{d(\sin(kx))}{dx} = k \cos(kx), \qquad \frac{d(\cos(kx))}{dx} = -k \sin(kx). \]
Both sin(kx) and cos(kx) therefore obey¹
\[ \frac{d^2 y}{dx^2} = -k^2 y, \]
and it can be proved that these give all solutions, i.e.

\[ \frac{d^2 y}{dx^2} = -k^2 y \iff y = a \cos(kx) + b \sin(kx) \tag{1.12} \]
for some constants a and b. We could also write the right side as a combination of $e^{ikx}$ and $e^{-ikx}$.

1.2 Ln, or $\log_e$, exp, and hyperbolic functions

(See Thomas section 7.2)


The natural logarithm ln x can be defined as
\[ \ln x = \int_1^x \frac{dt}{t}. \]
This implies ln 1 = 0. Note that this is not a good definition if x < 0, but it is easy to show that for negative x,
$\int^x du/u = \ln|x| + \text{constant}$. The number e (Euler’s number) is then defined by ln e = 1. (After π, this is the
second most important constant in maths.)

From the definition it is obvious that


\[ \frac{d \ln x}{dx} = \frac{1}{x}. \]
One also finds:
\[ \ln(ab) = \ln a + \ln b. \]
Repeated application of this shows $\ln(a^n) = n \ln a$ for integer n, and it turns out this is true for any power p,
i.e.
\[ \ln x^p = p \ln x. \]
In particular, either putting p = −1 in the above, or b = 1/a in the previous equation, gives us

\[ \ln(a^{-1}) = \ln(1/a) = -\ln(a) \]

and hence
\[ \ln(a/b) = \ln a - \ln b. \]

Note that $\ln(a + b) \neq \ln a + \ln b$ (unless a + b = ab).

1 Those who have done applied maths at A-level or later may recognize this as an equation for simple harmonic motion.

We can define exp (see Thomas 7.3) to be the inverse function to ln, so that exp(ln x) = ln(exp x) = x.
Then exp 1 = e and $\exp r = e^r$. Note that $\exp(a + b) = e^a e^b$, NOT $e^a + e^b$. For any positive number a, $a = e^{\ln a}$ and
hence $a^x = (e^{\ln a})^x = e^{x \ln a}$. In particular, this enables us to relate the usual logarithms (base 10) to natural
logarithms since if $x = \log_{10} y$, $y = 10^x = e^{x \ln 10}$, so ln y = x ln 10 and x = ln y / ln 10. For a general a we can
define $\log_a y = x$ to be such that $y = a^x$, so $\ln y = \log_e y$.
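
The change-of-base relation can be tried out numerically. The following is a small sketch added for illustration; the function names `log_base` and `power` are hypothetical, and both assume a > 0 (with a ≠ 1 for the logarithm) and y > 0.

```python
import math

def log_base(y, a):
    """log_a(y) computed from natural logarithms: log_a y = ln y / ln a."""
    return math.log(y) / math.log(a)

def power(a, x):
    """a**x computed as exp(x ln a), valid for a > 0."""
    return math.exp(x * math.log(a))

# Base-10 logarithm of 1000, and 2 to the power 10, via ln and exp only.
assert math.isclose(log_base(1000.0, 10.0), 3.0)
assert math.isclose(power(2.0, 10.0), 1024.0)
```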

One can show that


\[ \frac{d \exp x}{dx} = \exp x; \]
to prove that, take y = exp x, take ln of both sides so ln y = x, then differentiate giving (1/y) dy/dx = 1 (by
the chain rule), and rearrange.

We can now use ex to define the hyperbolic functions (see Thomas 7.8)

\[ \cosh x = \tfrac{1}{2}(e^x + e^{-x}), \qquad \sinh x = \tfrac{1}{2}(e^x - e^{-x}). \]

These functions have identities and derivative properties that run closely parallel to those of sin and cos. If
you know the trigonometric identities, the identities for hyperbolic functions can be recovered by substituting
cosh for cos and i sinh for sin, where $i^2 = -1$.

From differentiating ex we find


\[ \frac{d \sinh kx}{dx} = k \cosh kx, \qquad \frac{d \cosh kx}{dx} = k \sinh kx. \]
Thence
\[ \frac{d^2 y}{dx^2} = k^2 y \iff y = a \cosh(kx) + b \sinh(kx) \tag{1.13} \]
for some constants a and b (we can also write y as a combination of $e^{kx}$ and $e^{-kx}$).

Comparing Eq. 1.13 with 1.12, we now see how to solve $d^2y/dx^2 = Cy$ for any constant C: if C is
positive, we define $k = \sqrt{C}$ and get 1.13, while if C is negative we define $k = \sqrt{-C}$ and get 1.12; finally if
C = 0 we easily integrate twice to get y = ax + b.
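
The three cases can be collected into one routine. This is a sketch added for illustration, not part of the original notes: `general_solution` and `second_derivative` are hypothetical names, and the finite-difference check is only a rough numerical confirmation that the returned function satisfies y'' = Cy.

```python
import math

def general_solution(C, a, b):
    """Return a function y(x) solving y'' = C*y, with arbitrary constants a and b."""
    if C > 0:                        # k = sqrt(C), hyperbolic case (1.13)
        k = math.sqrt(C)
        return lambda x: a * math.cosh(k * x) + b * math.sinh(k * x)
    if C < 0:                        # k = sqrt(-C), trigonometric case (1.12)
        k = math.sqrt(-C)
        return lambda x: a * math.cos(k * x) + b * math.sin(k * x)
    return lambda x: a * x + b       # C = 0: integrate twice

def second_derivative(y, x, h=1e-4):
    """Central finite-difference estimate of y''(x)."""
    return (y(x + h) - 2.0 * y(x) + y(x - h)) / h**2

# Check y'' = C*y at a sample point for each sign of C.
for C in (4.0, -9.0, 0.0):
    y = general_solution(C, 1.5, -0.5)
    assert abs(second_derivative(y, 0.3) - C * y(0.3)) < 1e-4
```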

1.3 Double and triple integrals

(See Thomas 15.1 and 15.4)


First let us revise the idea of 2-D integration.

Example 1.1. Integrate the function $f(x, y) = x^2 y^2$ over the triangular area R: 0 ≤ x ≤ 1, 0 ≤ y ≤ x.

We can write this integral as
\[ \int_R f(x, y)\,dA, \]

[Figure: the line y = x plotted for 0 ≤ x ≤ 1; the horizontal axis is x.]

Figure 1.1: Integrating over the triangular region R: 0 ≤ x ≤ 1, 0 ≤ y ≤ x.

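where dA is an area element. But the area of a little rectangle of length $\delta x$ in the x-direction and length $\delta y$ in
the y-direction is $\delta A = \delta x\,\delta y$; hence we can rewrite dA as dA = dx dy. Thus the integral we want (cf. Fig. 1.1)
is
\[
\begin{aligned}
\iint f(x, y)\,dx\,dy &= \int_{x=0}^{1} \left( \int_{y=0}^{x} x^2 y^2\,dy \right) dx \\
&= \int_{x=0}^{1} x^2 \left( \int_{y=0}^{x} y^2\,dy \right) dx \\
&= \int_{0}^{1} x^2 \left( \tfrac{1}{3} x^3 \right) dx \\
&= \left[ \tfrac{1}{18} x^6 \right]_0^1 = \frac{1}{18}.
\end{aligned}
\]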
Here there are two key points to note: the limits on the (inner) y-integral depend on x, and in the second
step we have moved the $x^2$ outside the y-integral because it does not depend on y; so, the $x^2$ behaves like a
“constant” inside the y-integral, but not for the x-integral.
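
The same nested structure shows up if we estimate the integral numerically. The sketch below is an illustration added here, not the method of the notes: it uses a simple midpoint rule, and the name `integrate_triangle` and the grid size `n` are arbitrary choices.

```python
def integrate_triangle(f, n=400):
    """Midpoint-rule estimate of the integral of f over 0 <= x <= 1, 0 <= y <= x."""
    total = 0.0
    dx = 1.0 / n
    for i in range(n):
        x = (i + 0.5) * dx           # midpoint of the i-th strip in x
        dy = x / n                   # inner limits depend on x: y runs from 0 to x
        inner = 0.0
        for j in range(n):
            y = (j + 0.5) * dy
            inner += f(x, y) * dy    # inner (y) integral at this fixed x
        total += inner * dx          # outer (x) integral
    return total

estimate = integrate_triangle(lambda x, y: x**2 * y**2)
# The estimate should agree with the exact value 1/18 to several decimal places.
assert abs(estimate - 1.0 / 18.0) < 1e-5
```

Note that the inner loop is set up afresh for each x, which mirrors the rule that the inner limits may depend on the outer variable.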

An area integral such as this is often called a double integral (because it can be rewritten as two 1-D
integrations). Some authors use two integration signs, to remind you that it is an area integral: thus they
would write $\iint f(x, y)\,dA$. In this course, whenever it is obvious that an integral is over area, we shall
generally just write $\int f(x, y)\,dA$.

Similarly some books write $\iiint f(x, y, z)\,dV$ for a volume integral: where no confusion will arise, we
shall just write $\int f(x, y, z)\,dV$.

We shall need to put in all the integral signs when obtaining a value by doing the two or three integrations
with respect to coordinates.
Exercise 1.1. Calculate $\int_R f(x, y)\,dA$ for
$f(x, y) = 1 - 6x^2 y$ and R: 0 ≤ x ≤ 2, −1 ≤ y ≤ 1.
[Answer: 4]

Note that in that exercise, the region of integration is a rectangle, so the limits of both the x- and y-
integrations were constants so one could do the x- or the y-integration first – the answer will be the same.
This holds for any rectangular region in 2-D , or for a cuboid when we come to 3-D integration.

In Example 1.1, we had a triangle; now the upper limit of the “inside” integral depends on the “outer”
variable. The upper limit of the y-integral was x, so the y-integration had to be performed first with the limits as
given. Otherwise the answer would have read

\[ \iint f(x, y)\,dx\,dy = \int_{y=0}^{x} \left( \int_{x=0}^{1} x^2 y^2\,dx \right) dy = \frac{x^3}{9}. \]

This depends on x, which is ridiculous as the answer is for a whole area, not some value of x. If we want to
change the order of integration we need to take y going from 0 to 1, then x runs from y to 1 (check the sketch);
now we have to put the dx integral on the inside, and we get
\[ \iint f(x, y)\,dx\,dy = \int_{y=0}^{1} \left( \int_{x=y}^{1} x^2 y^2\,dx \right) dy. \]

Check that this does give the same answer.
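
For reference, one way the check can go (a worked sketch added here, with y running from 0 to 1 on the outside and x from y to 1 on the inside):

```latex
\begin{aligned}
\int_{y=0}^{1}\left(\int_{x=y}^{1} x^2 y^2\,dx\right)dy
  &= \int_{0}^{1} y^2 \left[\tfrac{1}{3}x^3\right]_{x=y}^{x=1} dy
   = \tfrac{1}{3}\int_{0}^{1} \left(y^2 - y^5\right) dy \\
  &= \tfrac{1}{3}\left(\tfrac{1}{3} - \tfrac{1}{6}\right) = \tfrac{1}{18}.
\end{aligned}
```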

It’s important that you understand how to get these limits: when doing a numerical evaluation of a multiple
integral, there are several rules to remember :

i) Work out the limits on each variable from a sketch.


ii) The limits on each integral may depend on the variables x, y etc appearing as dx, dy outside that integral,
but should not depend on those inside it. So the limits on the outermost integral sign should not depend
on any of x, y, z; if dx is the outermost integral
iii) The limits on each integral apply to the “matching” variable, again working from inside to outside. So
the last integral sign matches the first one of dx, dy, dz etc.
iv) Evaluate the resulting multiple integral from the “inside out”, so you evaluate the innermost integration
first. Putting in brackets can be helpful here, as in the example above.

It is a straightforward step from double integrals to volume integrals (triple integrals) of the form $\int_V f(x, y, z)\,dV$.
In Cartesian coordinates we have dV = dx dy dz (the volume of a 3-D rectangular box) and so
\[ \int_V f(x, y, z)\,dV = \iiint_V f(x, y, z)\,dx\,dy\,dz. \]

Sometimes the geometry of the volume will make other choices of coordinate system preferable. In
Thomas 15.3 and 15.6, which were studied in Calculus II, two-dimensional plane integrals in polar coordinates,
and triple integrals in spherical and cylindrical polars are discussed: you will find it very useful to
revise those sections. For a general change of coordinate system from Cartesians (x, y, z) to (u, v, w),
\[ \iiint_V f\,dx\,dy\,dz = \iiint_V f\,|J|\,du\,dv\,dw \]

where J is the Jacobian determinant of (x, y, z) with respect to (u, v, w). Its magnitude |J| is the volume
ratio of the two coordinate systems: if we take an infinitesimal cuboid in (u, v, w) space of volume du dv dw,
this will map to a parallelepiped in x, y, z space, and |J| is the ratio of those volumes (if you need to revise this
in more detail, see Thomas 15.7).
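
As a reminder of how this works in the simplest case, here is the two-dimensional analogue for plane polar coordinates (a standard worked illustration added here, with x = r cos θ, y = r sin θ; the disc of radius a is chosen purely as an example):

```latex
\begin{aligned}
J = \frac{\partial(x, y)}{\partial(r, \theta)}
  &= \begin{vmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{vmatrix}
   = r\cos^2\theta + r\sin^2\theta = r, \\
\text{area of the disc } x^2 + y^2 \le a^2
  &= \int_{\theta=0}^{2\pi}\left(\int_{r=0}^{a} r\,dr\right)d\theta
   = 2\pi \cdot \tfrac{1}{2}a^2 = \pi a^2 .
\end{aligned}
```

So dx dy is replaced by r dr dθ, and the limits become constants because the disc is a "rectangle" in (r, θ).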

Exercise 1.2. Evaluate $\iiint e^x\,dx\,dy\,dz$ over the volume V of the tetrahedron bounded by the four planes
x = 0, y = 0, z = 0 and x + y + z = a (a > 0). [Answer: $e^a - \tfrac{1}{2}a^2 - a - 1$.]

1.4 Curves and surfaces

We shall use various geometrical shapes in examples, so we need the equations for them. The main ones
are so-called ‘conic sections’ in two dimensions, and related three-dimensional surfaces. Other courses also
discuss more complicated shapes (see e.g. Thomas 10.6 and 10.7).

First we discuss curves in two dimensions. There are three main ways to specify a curve.
One way is to give an equation y = f(x); a second way is to give an equation g(x, y) = 0: the curve is
then the set of points (x, y) obeying the equation. Given the first form, we can get the second by defining
g(x, y) = y − f(x), but not necessarily the converse.
A third way is the parametric form in terms of two functions of some variable t: x = a(t), y = b(t) (see
Thomas 3.5). Sometimes we can take t = x itself. The parametrized form carries extra information, about
which direction and how fast we go along the curve as t changes.² We will see a lot more examples of the
parametrized form in Chapter 2.

Using the second way, some standard curves are:

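\[ x^2 + y^2 = a^2 \qquad \text{circle, centre } (0, 0), \text{ radius } a \tag{1.14} \]
\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} = 1 \qquad \text{ellipse, centre } (0, 0), \text{ semi-axes } a \text{ and } b \tag{1.15} \]
\[ y = ax^2 + b \qquad \text{parabola, symmetric about } x = 0 \tag{1.16} \]
\[ cy^2 - kx^2 = a^2, \ (ck > 0) \qquad \text{hyperbola, symmetric about } x = 0 \text{ and } y = 0 \tag{1.17} \]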

(See Thomas 1.2, 1.5.) The special case of a hyperbola with a = 0 is just a pair of straight lines. These curves
involving only constants and powers up to $x^2$ and $y^2$ are known as the conic sections.

To recognize these, first look at the coefficients of $x^2$ and $y^2$:

if one is 0, but the corresponding variable appears linearly, it’s a parabola;
if they have the same sign it’s an ellipse (or as a special case a circle), and
if they have opposite sign it’s a hyperbola
(assuming the remaining constants allow there to be some points: $x^2 + y^2 = -5$ has no real points).

What if the equation is quadratic but not one of these standard forms? Given

\[ x^2 + 6x + y^2 + 8y = 0 \]

we can carry out a process called ‘completing the square’ to write it as

\[ (x + 3)^2 + (y + 4)^2 = 25 \]

which we now recognize as a circle of radius 5, centre (−3, −4): this circle passes through the origin. Similar
methods can be used to recognize the other standard curves if they are given relative to origins different from
the ones used in the most standard forms above (cf. Thomas 1.5).

We can also recognize the case where the axes have been transformed, in a similar way. For example,
$xy = b^2 \iff (y + x)^2 - (y - x)^2 = 4b^2$, so it’s a hyperbola where the symmetry axes are at 45° to those used in
(1.17) with c = k = 1 and with $4b^2 = a^2$. In general we have to complete the square on the terms quadratic in
x and y: for example the rearrangement

\[ x^2 + 4xy + 3y^2 = (x + 2y)^2 - y^2 \]

shows the curve $x^2 + 4xy + 3y^2 = 6$ is not an ellipse, as you might think from the fact the coefficients of $x^2$
and $y^2$ are both positive, but a hyperbola.
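
A quick way to double-check such a classification is the discriminant test. Note that this is a different technique from the completing-the-square argument used in these notes: for the quadratic part $Ax^2 + Bxy + Cy^2$, the sign of $B^2 - 4AC$ decides the type. The sketch below is an added illustration; `classify_conic` is a hypothetical name, and degenerate cases (a pair of lines, a single point, no real points) are not detected.

```python
def classify_conic(A, B, C):
    """Classify the curve A*x^2 + B*x*y + C*y^2 + (lower-order terms) = 0.

    Uses the sign of the discriminant B^2 - 4AC. Degenerate cases are ignored.
    """
    if A == 0 and B == 0 and C == 0:
        raise ValueError("no quadratic terms: not a conic")
    disc = B * B - 4 * A * C
    if disc < 0:
        return "ellipse"       # includes the circle as a special case
    if disc == 0:
        return "parabola"
    return "hyperbola"

# The example from the text: x^2 + 4xy + 3y^2 = 6 has B^2 - 4AC = 16 - 12 > 0.
assert classify_conic(1, 4, 3) == "hyperbola"
# xy = b^2 has A = C = 0, B = 1.
assert classify_conic(0, 1, 0) == "hyperbola"
```

With B = 0 this reduces to the rule above: same-sign coefficients of $x^2$ and $y^2$ give an ellipse, opposite signs a hyperbola, and one vanishing coefficient a parabola.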
2 The latter approach is used heavily in Geometry II.

Parametrized curves are also useful, especially when calculating line integrals along curves later on.
Here are some standard parametrizations for the circle, ellipse and hyperbola:

\[ x^2 + y^2 = a^2 \qquad (x, y) = (a \cos\theta, a \sin\theta) \tag{1.18} \]
\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} = 1 \qquad (x, y) = (a \cos\theta, b \sin\theta) \tag{1.19} \]
\[ y^2 - x^2 = a^2 \qquad (x, y) = (a \sinh\theta, a \cosh\theta). \tag{1.20} \]

These work because of the identity (1.6) and its hyperbolic counterpart $\cosh^2 x - \sinh^2 x = 1$. (See Thomas
10.4 for further or alternative parametrizations.) We shall use these, especially the first two, later.
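
These parametrizations are easy to spot-check numerically. The following sketch is added for illustration (the function names are hypothetical): it generates points from (1.18)-(1.20) and confirms that they satisfy the corresponding implicit equations.

```python
import math

def circle_point(a, theta):
    """Point on x^2 + y^2 = a^2, from (1.18)."""
    return (a * math.cos(theta), a * math.sin(theta))

def ellipse_point(a, b, theta):
    """Point on x^2/a^2 + y^2/b^2 = 1, from (1.19)."""
    return (a * math.cos(theta), b * math.sin(theta))

def hyperbola_point(a, theta):
    """Point on y^2 - x^2 = a^2, from (1.20)."""
    return (a * math.sinh(theta), a * math.cosh(theta))

for theta in (-1.0, 0.0, 0.7, 2.0):
    x, y = circle_point(2.0, theta)
    assert math.isclose(x**2 + y**2, 4.0)
    x, y = ellipse_point(2.0, 5.0, theta)
    assert math.isclose((x / 2.0)**2 + (y / 5.0)**2, 1.0)
    x, y = hyperbola_point(3.0, theta)
    assert math.isclose(y**2 - x**2, 9.0)
```

One thing the check does not show: since cosh θ ≥ 1, the parametrization (1.20) with a > 0 only covers the branch of the hyperbola with y ≥ a.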

1.5 Surfaces in 3-D

[Here we meet material you may not have seen before.]

For surfaces in 3 dimensions, there are similarly three main ways to give the equations. One is to give
one coordinate in terms of the other two, e.g. z = h(x, y). Another is to use a single equation V (x, y, z) = 0.
The third is by a parametrization in terms of two variables e.g. (x(u, v), y(u, v), z(u, v)) (see Thomas 16.6,
and more details in Chapter 2 ).

We shall again focus on surfaces described by quadratics in x, y and z at worst. To work out what the
surface is like, one good way is to consider letting one coordinate be constant, for example z = d, which
means we are considering a “slice” through the surface V = 0 at the plane z = d. The intersection of a curved
surface and a plane is generally a 1-D curve, which we should be able to identify from the previous section.
Then we just stack those curves for varying d.

One simple case is


\[ x^2 + y^2 = a^2. \]
The equation is the same as for a circle, but as we are now in 3 dimensions, it implies z can take any value.
In each plane z = d we have a circle. Hence, this is an infinite circular cylinder along the z-axis. Very often
some bounding values of z are given, e.g. 0 ≤ z ≤ 2. Then we have a finite cylinder, the shape of a drinks can.

Example 1.2. What is the surface $\dfrac{y^2}{a^2} + \dfrac{z^2}{b^2} = 1$?

It is an infinite elliptical cylinder along the x-axis.

We can also have parabolic and hyperbolic “cylinders”, using (1.16) and (1.17).

Another simple three-dimensional surface is that of a sphere of radius a centred at the origin:

\[ x^2 + y^2 + z^2 = a^2. \tag{1.21} \]

Example 1.3. What is the surface $x^2 + y^2 + z^2 = a^2$, x ≥ 0?

The hemisphere to the right of the plane x = 0.

We can put together cases where we get one of the standard types of curve listed earlier in planes z = d
and different ones in planes x = k or y = m say.

For example, we can generalize the ellipse (1.15) to

\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1 \tag{1.22} \]
(see Thomas, 12.6). In this case each of the three types of cut-plane gives an ellipse as the curve. The surface
is an ellipsoid. This shape, and the ones that follow, are shown in diagrams 12.48-12.52 in Thomas (also,
the Wikipedia article on “Quadrics” has some pretty graphics).

If instead we had taken


\[ \frac{z}{c} = \frac{x^2}{a^2} + \frac{y^2}{b^2}, \tag{1.23} \]
we then have an ellipse in each plane z = d but a parabola in each plane x = k or y = m. This is an elliptic
paraboloid. Changing the plus to a minus in this equation gives a hyperbolic paraboloid.

Similarly we can obtain an (elliptic) hyperboloid as

\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = +1. \tag{1.24} \]
Here we have ellipses in planes z = d, and hyperbolae in the planes x = 0 and y = 0. Moving the $z^2$ term over
to the RHS, we see the RHS is positive for any z, so there is an ellipse for any fixed value of z and the surface
has just one piece (we say ‘one sheet’).

However, if instead we had a −1 on the right, i.e.

\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = -1, \tag{1.25} \]
we can rearrange into
\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} = \frac{z^2}{c^2} - 1. \tag{1.26} \]

It’s now clear that if $z^2/c^2 > 1$ (i.e. z < −c or z > c), we again get an ellipse in the xy plane; but if
−c < z < c the RHS is negative and there are no solutions for x, y. This is a hyperboloid of two sheets.

The elliptic paraboloid and hyperboloid have circular special cases where a = b. Note also that we can
swap x, y and z around in these forms so we have different choices of axes for the same shapes.

There is also the special case of the hyperboloid equation where the constant on the RHS is zero, i.e.

\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = 0. \tag{1.27} \]
This is a cone through the origin. Taking a plane through the origin such as x = 0, we get two straight lines,
while taking planes perpendicular to the axes but not through the origin gives ellipses (z = d) or hyperbolae
(x = k or y = m). In fact all
the quadratic curves (ellipses, circles, parabolae and hyperbolae) can be obtained by intersecting the circular
cone $z^2 = x^2 + y^2$ with planes (not necessarily perpendicular to the axes): this is why they are called conic
sections (see Thomas chapter 10).

If we are given a quadratic surface in a different form, we can first rearrange it into one of the forms
above: rearrange so that all the x, y, z terms are on the left, and the constant on the right; if the constant is not
zero, divide by it to get a +1 on the right; then look at the $x^2$, $y^2$ and $z^2$ parts: if two or three of these have
negative coefficients, just multiply by −1 to make at least two of the coefficients of $x^2$, $y^2$, $z^2$ positive. Then:
If all 3 are positive, it’s an ellipsoid (or a sphere).
If two are positive and one negative it’s a hyperboloid, and we need to check the constant term
to see if it’s one sheet or two.
If one of the three $x^2$, $y^2$ or $z^2$ terms is zero, but there is a linear term in the corresponding
variable, it’s a paraboloid: the relative sign of the other two shows if it is elliptic or hyperbolic.
If one variable is missing completely, it’s a “cylinder” given by the matching 2-D curve.
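
The first part of this recipe can be sketched in code for the simplest situation, a surface of the form $px^2 + qy^2 + rz^2 = s$ with no linear terms and all three squared terms present. This is an added illustration under those assumptions: `classify_quadric` is a hypothetical name, and cylinders and paraboloids are deliberately left out of its scope.

```python
def classify_quadric(p, q, r, s):
    """Classify the axis-aligned surface p*x^2 + q*y^2 + r*z^2 = s.

    Hypothetical helper: assumes p, q and r are all non-zero and that
    there are no linear terms.
    """
    coeffs = (p, q, r)
    if 0 in coeffs:
        raise ValueError("a missing squared term means a cylinder or paraboloid")
    if s == 0:
        same_sign = all(c > 0 for c in coeffs) or all(c < 0 for c in coeffs)
        return "single point (origin)" if same_sign else "cone"
    # Divide through by the constant so that the right-hand side is +1,
    # then count how many coefficients are positive.
    positives = sum(1 for c in coeffs if c / s > 0)
    return {
        3: "ellipsoid",
        2: "hyperboloid of one sheet",
        1: "hyperboloid of two sheets",   # multiply by -1: two positive, RHS -1
        0: "no real points",
    }[positives]

assert classify_quadric(1, 1, -1, 1) == "hyperboloid of one sheet"    # form (1.24)
assert classify_quadric(1, 1, -1, -1) == "hyperboloid of two sheets"  # form (1.25)
```

Counting positive coefficients after dividing by the constant is equivalent to the "multiply by −1" step in the text: one positive coefficient with +1 on the right is the same surface as two positive coefficients with −1 on the right.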

As in the case of curves, we can work out what the shape is if the equations are not in standard form but
have shifted origins or rotated axes, by completing the square. For this course, we’ll keep it simple though,
so we will only be looking at surfaces which are aligned with the coordinate axes.

Now for some parametrized versions (see Thomas 16.6)

Surface       Implicit form                                                      Parametric form (x, y, z)                                        Parameters

Cylinder      $x^2 + y^2 = a^2$                                                  $(a \cos\theta, a \sin\theta, z)$                                $\theta, z$      (1.28)
Sphere        $x^2 + y^2 + z^2 = a^2$                                            $(a \sin\theta \cos\phi, a \sin\theta \sin\phi, a \cos\theta)$   $\theta, \phi$   (1.29)
Ellipsoid     $\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = +1$         $(a \sin\theta \cos\phi, b \sin\theta \sin\phi, c \cos\theta)$   $\theta, \phi$   (1.30)
Hyperboloid   $\frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = 1$          $(a \cos u, b \sin u \cosh v, c \sin u \sinh v)$                 $u, v$           (1.31)

1.6 Vectors

(Note: this is in chapter 12 in Thomas but in Geometry I you used Hirst)


Vectors can be introduced as displacements in space, called position vectors. To describe a position vector,
we need to specify its direction and its length or magnitude (to say how far we go in the given direction).
This is a geometric definition. A vector is different from a scalar, a quantity which has only a magnitude but
no direction.

One can draw a vector as an arrow of the appropriate length and direction. Vectors are usually notated in
print by boldface type, e.g. $\mathbf{a}$, and in handwriting by under- or over-lining such as $\underline{a}$, $\vec{a}$, or $\underset{\sim}{a}$.
Warning: When writing, it is tempting to miss off the under/overlines to save time. This is a bad idea,
because if you confuse what’s a scalar and what’s a vector in your working, you immediately get nonsense.

To define a vector algebraically, i.e. in a formula, we can use the Cartesian coordinates of the point to
which it displaces the origin, e.g.
\[ \mathbf{r} = (x, y, z). \tag{1.32} \]
(Note: As you saw in Geometry I, we can write vectors either as row or column vectors. The column vector
form is useful if you are multiplying by matrices (like rotation matrices), but in this course we shall mainly
use the row vector form which is more compact.)

Here x, y and z are called the components of r. We may refer to (x, y, z) as the point r. From now on we
shall use the notation r only for this vector.

The length of a vector $\mathbf{v}$ is denoted by $|\mathbf{v}|$ or sometimes just $v$; this is a scalar. The vector $\mathbf{r}$ has length
$r = \sqrt{x^2 + y^2 + z^2}$, by Pythagoras’ theorem in 3 dimensions.

To add vectors $\mathbf{a}$ and $\mathbf{b}$ we simply take the displacement obtained by displacing first by $\mathbf{a}$ and then by $\mathbf{b}$
(the result can be defined as the diagonal of the parallelogram with sides $\mathbf{a}$ and $\mathbf{b}$). In components this says
that $\mathbf{v} = (v_1, v_2, v_3)$ and $\mathbf{w} = (w_1, w_2, w_3)$ have the sum

\[ \mathbf{v} + \mathbf{w} = (v_1 + w_1, v_2 + w_2, v_3 + w_3). \]
Subtraction can then be defined similarly. The zero vector 0 is the one with zero magnitude (and no well-
defined direction!).

It is now easy to show this obeys the usual rules of addition (and subtraction).³

The displacement from a point r1 to a point r2 is r2 − r1 .

We can multiply a vector by a scalar (a number) $\lambda$, simply by multiplying its magnitude, preserving the
direction. In components, if $\mathbf{v} = (v_1, v_2, v_3)$ then we have

\[ \lambda \mathbf{v} = (\lambda v_1, \lambda v_2, \lambda v_3). \]

This operation also obeys very simple and obvious rules.⁴ This multiplication gives us a way to define the
unit vector (the vector of length 1) in the same direction as $\mathbf{v}$, denoted by $\hat{\mathbf{v}}$, by $\hat{\mathbf{v}} \equiv \mathbf{v}/|\mathbf{v}|$ (strictly, we should
write the number first so we would have to write $(1/|\mathbf{v}|)\mathbf{v}$, but in practice it’s obvious what we mean).
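
In components these operations are one-liners. The sketch below is an added illustration using plain Python tuples for vectors; the function names are hypothetical.

```python
import math

def length(v):
    """|v| by Pythagoras' theorem, for a vector given as a tuple of components."""
    return math.sqrt(sum(c * c for c in v))

def add(v, w):
    """Component-wise sum v + w."""
    return tuple(a + b for a, b in zip(v, w))

def scale(lam, v):
    """Scalar multiple lam * v."""
    return tuple(lam * c for c in v)

def unit(v):
    """Unit vector v / |v|; undefined for the zero vector."""
    n = length(v)
    if n == 0:
        raise ValueError("the zero vector has no direction")
    return scale(1.0 / n, v)

assert length((1.0, 2.0, 2.0)) == 3.0
assert math.isclose(length(unit((0.0, 3.0, 4.0))), 1.0)
```

Note that `unit` writes the number first, as (1/|v|) v, just as the strict notation above requires.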

These rules give us another common way of writing a vector. We note that we can arrive at the same total
displacement by first moving along the x-axis, then parallel to the y-axis, then parallel to the z-axis; we
can express this by defining the unit vectors $\mathbf{i}$, $\mathbf{j}$ and $\mathbf{k}$ along the directions of the three axes and writing

\[ \mathbf{r} = x\mathbf{i} + y\mathbf{j} + z\mathbf{k}. \]

This way of writing (1.32) has the advantage of making it clearer how the components change if we change
our choice of axes: if we rotate our axes to a different system $x', y', z'$, we will get 3 new unit vectors, e.g. $\mathbf{i}'$,
$\mathbf{j}'$ and $\mathbf{k}'$, and converting vectors between systems looks like a matrix multiplication (more on this later).

Note that all of these statements about position vectors in 3 dimensions can very simply be applied in 2
dimensions also, with obvious minor changes.

Although we have motivated vectors by introducing them as displacements, they can represent, or be
interpreted as, many other things: for example, a force, a velocity, inputs and outputs in an economic model,
and so on.

A parametric equation of the type

r = p + tq, −∞ < t < ∞ (1.33)

defines a line through point p parallel to direction q. For example r = tk, −∞ < t < ∞ is the z axis.

Using this, we can get the straight line going through two given points $\mathbf{r}_1$ and $\mathbf{r}_2$: the vector from $\mathbf{r}_1$ to $\mathbf{r}_2$
is $\mathbf{r}_2 - \mathbf{r}_1$, so the (infinite) line through them is

\[ \mathbf{r} = \mathbf{r}_1 + t(\mathbf{r}_2 - \mathbf{r}_1), \qquad -\infty < t < \infty. \tag{1.34} \]

If instead we take a range 0 ≤ t ≤ 1 in the above, this gives us the finite line segment with end-points at the
two given points. This will be very useful later on: memorise it.
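
A minimal sketch of (1.34) in code, added for illustration (the name `point_on_line` is hypothetical, and vectors are plain tuples):

```python
def point_on_line(r1, r2, t):
    """Point r1 + t*(r2 - r1) on the line through r1 and r2.

    t = 0 gives r1, t = 1 gives r2, and 0 <= t <= 1 sweeps out the
    line segment between them.
    """
    return tuple(p + t * (q - p) for p, q in zip(r1, r2))

r1, r2 = (1.0, 0.0, 2.0), (3.0, 4.0, 6.0)
assert point_on_line(r1, r2, 0.0) == r1
assert point_on_line(r1, r2, 1.0) == r2
assert point_on_line(r1, r2, 0.5) == (2.0, 2.0, 4.0)   # midpoint of the segment
```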
3 This means that for any vectors a, b and c,
a + b = b + a, (a + b) + c = a + (b + c), ∃0 such that a + 0 = a,
and given a, ∃(−a) such that a + (−a) = 0. These rules are purely abstract and make no reference to displacements or three dimensions,
and are part of the general definition of a vector space which is given in Linear Algebra I. Those who have encountered groups will
recognise that they ensure that the space of vectors is an additive group under vector addition.
4 More precisely, for any vectors a and b, and numbers λ and µ , we have

λ (a + b) = λ a + λ b, (λ + µ )a = λ a + µ a, (λ µ )a = λ ( µ a)
and 1a = a. For a general vector space, as defined in Linear Algebra I, the scalars are elements of a general field but here we shall only
use the real numbers R. However, these rules do apply when λ and µ are elements of a general field, for instance the complex numbers
C.

Example 1.4. Medians of a triangle

Vectors can often be used to derive geometrical results very concisely, as this example shows.

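Let $\mathbf{a}$, $\mathbf{b}$, $\mathbf{c}$ be the corners of a triangle. The midpoint of the side connecting $\mathbf{b}$ and $\mathbf{c}$ will be $\tfrac{1}{2}(\mathbf{b} + \mathbf{c})$. A
line through this midpoint and $\mathbf{a}$ is

\[ \mathbf{r} = \mathbf{a} + t(\tfrac{1}{2}\mathbf{b} + \tfrac{1}{2}\mathbf{c} - \mathbf{a}), \qquad -\infty < t < \infty \]

which is called the median through $\mathbf{a}$. Putting $t = \tfrac{2}{3}$ (note: here this choice is a rabbit out of the hat, but we can
find it by writing down a second median and solving for the intersection point) we get the point $\tfrac{1}{3}(\mathbf{a} + \mathbf{b} + \mathbf{c})$.
Since this point is symmetric in $\mathbf{a}$, $\mathbf{b}$, $\mathbf{c}$, the medians through $\mathbf{b}$ and $\mathbf{c}$ will also pass through it. Hence the three
medians of a triangle intersect at a single point.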
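
For completeness, here is one way the value t = 2/3 can be found (a sketch added to these notes; s is a second parameter introduced for the median through b). The two medians meet where

```latex
\begin{aligned}
\mathbf{a} + t\left(\tfrac{1}{2}\mathbf{b} + \tfrac{1}{2}\mathbf{c} - \mathbf{a}\right)
  &= \mathbf{b} + s\left(\tfrac{1}{2}\mathbf{a} + \tfrac{1}{2}\mathbf{c} - \mathbf{b}\right), \\
\text{i.e.}\quad (1 - t)\,\mathbf{a} + \tfrac{1}{2}t\,\mathbf{b} + \tfrac{1}{2}t\,\mathbf{c}
  &= \tfrac{1}{2}s\,\mathbf{a} + (1 - s)\,\mathbf{b} + \tfrac{1}{2}s\,\mathbf{c}.
\end{aligned}
```

The two sides certainly agree if the coefficients of a, b and c match: 1 − t = s/2, t/2 = 1 − s and t/2 = s/2. The last gives s = t, and then 1 − t = t/2 gives t = 2/3, which also satisfies the middle equation. Since two distinct non-parallel lines meet in at most one point, this is the intersection.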

If we write out the components of (1.33), with notation $\mathbf{r} = (x, y, z)$, $\mathbf{p} = (p_1, p_2, p_3)$, $\mathbf{q} = (q_1, q_2, q_3)$
we find
\[ x = p_1 + tq_1, \qquad y = p_2 + tq_2, \qquad z = p_3 + tq_3, \]
from which we can eliminate t to get
\[ \frac{x - p_1}{q_1} = \frac{y - p_2}{q_2} = \frac{z - p_3}{q_3}, \]
giving the two independent linear equations (e.g. for y and z in terms of x) needed for a line in
three-dimensional space.

We can now write functions of 3-dimensional position f(x, y, z) more compactly as functions $f(\mathbf{r})$.
Equations of the form $f(\mathbf{r}) = \text{constant}$ define surfaces, the constant surfaces of f. A simple example is $r^2 = 1$,
which is a sphere of unit radius centred at the origin. (Recall our notation allows $r \equiv |\mathbf{r}|$.)

Example 1.5. A sphere

The geometrical interpretation of
\[ |\mathbf{r} - \mathbf{k}| = 1 \]
as a sphere of unit radius centred at (0, 0, 1) is obvious. Equivalent expressions are $x^2 + y^2 + (z - 1)^2 = 1$ and
$x^2 + y^2 + z^2 - 2z = 0$.

Warning: One of the commonest errors made by students is to confuse vectors and scalars, in particular
to start adding together the components of a vector. The vector (3, 1, 2) is not the same as the scalar 6. This
may seem obvious now, but the mistake is more easily made when using basis vectors like i, j and k; then it
somehow seems to be easier to make the mistake 3i + j + 2k = 6.

1.7 Scalar and vector products

We have defined vector addition and subtraction, but not multiplication of vectors. This is more complicated
because to obtain another vector we need to define both a magnitude and a direction (and in general, vector
division cannot be defined at all; we can divide a vector by a scalar λ just by multiplying by 1/λ , but we
cannot divide anything by a vector).

We first define the dot product, or scalar product⁵, whose result is not a vector but a scalar. For vectors $\mathbf{v}$
and $\mathbf{w}$, this is defined by
\[ \mathbf{v}.\mathbf{w} \equiv |\mathbf{v}||\mathbf{w}| \cos\theta, \tag{1.35} \]
where θ is the angle between $\mathbf{v}$ and $\mathbf{w}$. An alternative definition in terms of the components $(v_1, v_2, v_3)$ and
$(w_1, w_2, w_3)$ of $\mathbf{v}$ and $\mathbf{w}$ is
\[ \mathbf{v}.\mathbf{w} \equiv v_1 w_1 + v_2 w_2 + v_3 w_3 = \sum_{i=1}^{3} v_i w_i. \]

One can prove that the two definitions are the same by applying Pythagoras’ theorem to a triangle
constructed as follows. Take sides v, w and v + w. Draw the perpendicular from v + w to the line in direction v.
It has height |w| sin θ and meets the direction v at a distance |v| + |w| cos θ . Now write out Pythagoras with
the lengths in terms of |v|, |w| and θ and again in terms of components and compare the results. The details
are left as an exercise (if you have trouble, look in the online notes for MAS114 Geometry I or in A.E. Hirst,
Vectors in 2 or 3 dimensions, Arnold 1995, chapter 3).

We note in particular that two non-zero vectors v and w are perpendicular ( θ is a right angle) if and only
if v.w = 0.

Example 1.6. (This example was used in Geometry I.)


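Find cos θ where θ is the angle between $\mathbf{v} = (1, 3, -1)$ and $\mathbf{w} = (2, 2, 1)$.

\[ \mathbf{v}.\mathbf{w} = 1 \times 2 + 3 \times 2 + (-1) \times 1 = 7 = |\mathbf{v}||\mathbf{w}| \cos\theta, \]
\[ |\mathbf{v}|^2 = 1^2 + 3^2 + (-1)^2 = 11, \]
\[ |\mathbf{w}|^2 = 2^2 + 2^2 + 1^2 = 9, \quad \text{so} \]
\[ \cos\theta = \frac{7}{\sqrt{11}\sqrt{9}} = \frac{7}{3\sqrt{11}}. \]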
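
The same calculation in code, as a sketch added for illustration (the function names `dot` and `cos_angle` are hypothetical, and vectors are plain tuples):

```python
import math

def dot(v, w):
    """Scalar product in components: v1*w1 + v2*w2 + v3*w3."""
    return sum(a * b for a, b in zip(v, w))

def cos_angle(v, w):
    """cos(theta) from v.w = |v||w| cos(theta); both vectors must be non-zero."""
    return dot(v, w) / (math.sqrt(dot(v, v)) * math.sqrt(dot(w, w)))

v, w = (1, 3, -1), (2, 2, 1)
assert dot(v, w) == 7
assert math.isclose(cos_angle(v, w), 7 / (3 * math.sqrt(11)))
```

Here |v| is computed as the square root of v.v, using the component form of the definition.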

5 In a more abstract setting (such as in Linear Algebra I) this may also be called the inner product.

From either form of the definition we can easily derive various algebraic rules.⁶

A geometrical application of the dot product is in giving the equation of a plane. The plane through a
fixed point p perpendicular to a fixed vector v is given by the set of all points r which have r − p perpendicular
to v, as is easily seen from a sketch. Since two perpendicular vectors have a dot product of zero, this gives

(r − p).v = 0 (1.36)

This easily rearranges to r.v = p.v and the right-hand side is just a constant for given p, v.

In components, if $\mathbf{v} = (a, b, c)$ and $\mathbf{p}.\mathbf{v} = d$ the equation for a plane reads ax + by + cz = d. In practice
people often choose a unit vector $\mathbf{n}$ when specifying a plane in this form, so that $\mathbf{p}.\mathbf{n}$ becomes the perpendicular
distance of the plane from the origin. Then, the distance of any other point $\mathbf{r}_1$ from that plane is given by
$(\mathbf{r}_1 - \mathbf{p}).\mathbf{n} = \mathbf{r}_1.\mathbf{n} - \mathbf{p}.\mathbf{n} = \mathbf{r}_1.\mathbf{n} - d$ (the sign here tells one which side of the plane $\mathbf{r}_1$ is on).
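
A short sketch of this distance formula, added for illustration (the name `signed_distance` is hypothetical; the normal v need not be a unit vector on input, since the function normalizes it first):

```python
import math

def signed_distance(r1, p, v):
    """Signed distance of the point r1 from the plane through p with normal v.

    Computes (r1 - p).n with n = v/|v|; the sign says which side of the
    plane r1 lies on (positive on the side that v points towards).
    """
    norm = math.sqrt(sum(c * c for c in v))
    if norm == 0:
        raise ValueError("the normal vector must be non-zero")
    return sum((a - b) * c / norm for a, b, c in zip(r1, p, v))

# The plane z = 2, written with the non-unit normal (0, 0, 5).
p, v = (0.0, 0.0, 2.0), (0.0, 0.0, 5.0)
assert math.isclose(signed_distance((1.0, 1.0, 5.0), p, v), 3.0)
assert math.isclose(signed_distance((0.0, 0.0, 0.0), p, v), -2.0)
```

The second check shows the origin at distance 2 on the opposite side from the normal, consistent with p.n = 2 being the perpendicular distance of the plane from the origin.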

The vector product: To define a product of two vectors which is a third vector, we need to define a
direction from two vectors u and v. The only way to do this which treats the two vectors equally is to take the
perpendicular to the plane in which u and v lie. However, this does not fully define a direction, because we
need to know which way to go along the perpendicular. For that the convention is to use the so-called right-
hand rule: hold the fingers of your right hand so they curl round from u to v and then take the direction your
thumb points (see Thomas figures 12.27 and 12.28). If you do DIY, you may find it helpful to remember that
this is the direction a normal screw travels if you turn your screwdriver clockwise. Note that this definition
only works in three dimensions: there is no well-defined vector product in n dimensions for n > 3.

The magnitude of v × w is defined to be |v||w| sin θ (θ as before). Geometrically this is the area of a
parallelogram with sides v and w. Note that for perpendicular vectors this rule implies that the magnitude is
|v||w|. These rules have the consequences that for any vectors u, v and w and any scalar λ ,

v×w = −w × v,
(λ v) × w = λ (v × w) = v × (λ w),
u × (v + w) = (u × v) + (u × w),
(u + v) × w = u × w + v × w,

and v × w = 0 for non-zero v, w if and only if v and w are parallel or anti-parallel (in particular, v × v = 0 for
any v).

Note: the sign-change property v × w = −w × v is particularly important. This looks
“silly”, but is a consequence of the “handedness” of three-dimensional space, and which way round we
choose to label our three coordinate axes.

From the notation used, the vector product is often called the cross product.7

6 The main ones are that v.w = w.v and that for any vectors u, v and w and any scalar λ,
v.(λw) = λ(v.w) = (λv).w,
u.(v + w) = (u.v) + (u.w),
(u + v).w = (u.w) + (v.w),
v.v = |v|² ≥ 0,
v.v = 0 ⇔ v = 0.

7 You will also find that in some texts it is denoted v ∧ w, but I strongly advise against using this notation as it leads to confusion in
more general settings where v ∧ w is not a vector. The reason for this misuse is that v ∧ w is what’s called a two-form, and there is an
operation called the Hodge dual, denoted by ∗, such that in three dimensions ∗(v ∧ w) = v × w.

To get the expressions for the cross product in terms of components, we can start by noting that the unit
vectors i, j and k are perpendicular to one another (so the vector product of any two distinct ones among them
has magnitude 1). This means that i × j must have length 1 and be perpendicular to both of them, so it is
either +k or −k. Since the usual x, y and z axes, in that order, are a right-handed set, it will turn out that

i × j = +k, j × k = +i, k × i = +j,

and therefore
j × i = −k, k × j = −i, i × k = −j.
(To remember these, think of the sequence ijkijk...; if the two vectors in the cross product are in the same
order as in that sequence, the RHS has a + sign, while if they are in reverse order there is a − sign.)

Also i × i = j × j = k × k = 0. Using these we easily obtain

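\[ (v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k}) \times (w_1\mathbf{i} + w_2\mathbf{j} + w_3\mathbf{k}) = (v_2 w_3 - v_3 w_2)\mathbf{i} + (v_3 w_1 - v_1 w_3)\mathbf{j} + (v_1 w_2 - v_2 w_1)\mathbf{k}. \tag{1.37} \]

This can also be written as the formal determinant

\[ \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{vmatrix}. \]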
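The component formula (1.37) translates directly into code. The sketch below is a plain-Python illustration (the function name `cross` is our own choice) that reproduces the rules for i, j and k and confirms that the result is perpendicular to both inputs.

```python
def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def cross(v, w):
    """Cross product of two 3-vectors, term by term as in equation (1.37)."""
    v1, v2, v3 = v
    w1, w2, w3 = w
    return (v2 * w3 - v3 * w2,
            v3 * w1 - v1 * w3,
            v1 * w2 - v2 * w1)

i, j, k = (1, 0, 0), (0, 1, 0), (0, 0, 1)

v = (1, 3, -1)
w = (2, 2, 1)
n = cross(v, w)

print(cross(i, j), cross(j, i))   # +k and -k respectively
print(n, dot(n, v), dot(n, w))    # n is perpendicular to both v and w
```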

One geometrical use of the cross product is in forming the volume of a parallelepiped with sides u, v
and w. Thinking of (say) u and v as the base, which has area |u × v|, and θ as the angle between u × v and w,
so that the height is |w| cos θ, we see that
Volume of parallelepiped = (u × v).w (1.38)
(positive if u, v and w are a right-handed set). This quantity is called the scalar triple product and it is easy to
show that
u.(v × w) = v.(w × u) = w.(u × v)
(but this is −v.(u × w) etc, remember). We can also show that swapping the dot and cross gives the same
result, i.e. (u × v).w = w.(u × v) = u.(v × w) from above, but note that the brackets also move, i.e. the cross
product must be done first (inside the brackets), otherwise the result is nonsense. (Some textbooks may omit
the brackets, but this is potentially confusing.) Clearly swapping the two vectors inside the bracket changes
the sign, and we can show that this is also true for swapping any two of the three vectors.
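The properties of the scalar triple product can be checked numerically. The following sketch uses three illustrative vectors of our own choosing and the hypothetical helper names `dot`, `cross` and `triple`. It is a spot-check on one example, not a proof.

```python
def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def cross(v, w):
    """Cross product of two 3-vectors in components."""
    v1, v2, v3 = v
    w1, w2, w3 = w
    return (v2 * w3 - v3 * w2, v3 * w1 - v1 * w3, v1 * w2 - v2 * w1)

def triple(u, v, w):
    """Scalar triple product (u x v).w; the cross product is done first."""
    return dot(cross(u, v), w)

# Illustrative edge vectors of a parallelepiped.
u, v, w = (1, 0, 0), (1, 2, 0), (1, 1, 3)

volume = abs(triple(u, v, w))          # volume is the magnitude
cyclic = (dot(u, cross(v, w)), dot(v, cross(w, u)), dot(w, cross(u, v)))
swapped = triple(v, u, w)              # swapping two vectors flips the sign
print(volume, cyclic, swapped)
```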

Exercise 1.3. Prove from the definitions that, for all a, b and c,

a × (b × c) = (a.c)b − (a.b)c.

This quantity is called the vector triple product; note that the position of the brackets matters here.
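Before attempting the proof, it can be reassuring to test the identity on a concrete case. The sketch below uses three arbitrary integer vectors (our own choice) and also shows that moving the brackets gives a different vector. A numerical check of this kind does not replace the proof asked for in the exercise.

```python
def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def cross(v, w):
    """Cross product of two 3-vectors in components."""
    v1, v2, v3 = v
    w1, w2, w3 = w
    return (v2 * w3 - v3 * w2, v3 * w1 - v1 * w3, v1 * w2 - v2 * w1)

# Arbitrary integer test vectors, so all arithmetic is exact.
a, b, c = (1, 2, 3), (4, -1, 0), (2, 5, -3)

lhs = cross(a, cross(b, c))                                    # a x (b x c)
rhs = tuple(dot(a, c) * bi - dot(a, b) * ci for bi, ci in zip(b, c))  # (a.c)b - (a.b)c
other_bracketing = cross(cross(a, b), c)                       # (a x b) x c

print(lhs, rhs, other_bracketing)
```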

1.8 Gradients and directional derivatives

(See Thomas 14.5 [and 16.2])


In Calculus II you met functions of more than one variable. Those that were discussed there were scalar
functions, i.e. functions whose value at a particular point is a number. Such a scalar quantity (magnitude but
no direction) that depends on position in space is called a scalar field. An example would be the temperature
in a room – it has magnitude but not direction (so it is a scalar), and it is (in general) a function of position.

Suppose that V (x, y, z) or V (r) is a scalar field defined in some region. Then we can define a vector, the
gradient of V , at each point, which we denote ∇V , as follows:

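\[ \nabla V = \frac{\partial V}{\partial x}\mathbf{i} + \frac{\partial V}{\partial y}\mathbf{j} + \frac{\partial V}{\partial z}\mathbf{k}. \]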

So the x-, y- and z-components of the new vector are ∂ V /∂ x, ∂ V /∂ y and ∂ V /∂ z. See Thomas 14.5 if
you need to revise this in more detail. Sometimes instead of ∇V we write grad V : the two notations are
interchangeable.

Example 1.7. If V(x, y, z) = x² sin z, calculate ∇V.

In this example, ∂V/∂x = 2x sin z, ∂V/∂y = 0 and ∂V/∂z = x² cos z. Hence

∇V = 2x sin z i + x² cos z k.
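One way to check a hand-computed gradient such as the one in Example 1.7 is to compare it with central finite differences. The sketch below assumes a step size h = 1e-6 and an arbitrary test point; both are illustrative choices, and the function names are our own.

```python
import math

def V(x, y, z):
    """The scalar field of Example 1.7."""
    return x**2 * math.sin(z)

def grad_V_exact(x, y, z):
    """Gradient found by hand: (2x sin z, 0, x^2 cos z)."""
    return (2 * x * math.sin(z), 0.0, x**2 * math.cos(z))

def grad_numeric(f, x, y, z, h=1e-6):
    """Approximate each partial derivative by a central difference."""
    return (
        (f(x + h, y, z) - f(x - h, y, z)) / (2 * h),
        (f(x, y + h, z) - f(x, y - h, z)) / (2 * h),
        (f(x, y, z + h) - f(x, y, z - h)) / (2 * h),
    )

point = (1.5, -2.0, 0.7)   # arbitrary test point
exact = grad_V_exact(*point)
approx = grad_numeric(V, *point)
max_error = max(abs(e - a) for e, a in zip(exact, approx))
print(exact, approx, max_error)
```

A small `max_error` suggests the hand calculation is right; a large one usually points to a slip in one of the partial derivatives.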

Exercise 1.4. Evaluate the gradient ∇ f of the following scalar fields.

(a) f = x + y + z,
(b) f = yx² + y³ − y + 2x²z,
(c) f = a.r, where a is a constant vector.

Now ∇V tells us how V changes if we move from one point to a nearby point. Suppose we start at a point
r = (x, y, z), and then move a small distance dr = (dx, dy, dz) to the new point r + dr = (x + dx, y + dy, z + dz):
we will get a small change in V , given by

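\[
\begin{aligned}
dV &\equiv V(x + dx, y + dy, z + dz) - V(x, y, z) \\
   &= \frac{\partial V}{\partial x}\,dx + \frac{\partial V}{\partial y}\,dy + \frac{\partial V}{\partial z}\,dz.
\end{aligned}
\]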
Here the second line uses the Taylor series in more than one variable, and discards terms in second and higher
derivatives since dr is small. But (dx, dy, dz) = dr, and so the right-hand side is just ∇V.dr. Hence for a small
change dr, the change in V is
dV = ∇V.dr (1.39)
Note: to use this, you must evaluate ∇V at the point concerned.
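To see how good the approximation (1.39) is for a small but finite step, the sketch below compares the exact change in V with ∇V.dr. The field V = x² sin z, the point r and the displacement dr are illustrative assumptions of our own.

```python
import math

def V(x, y, z):
    """Illustrative scalar field V = x^2 sin z."""
    return x**2 * math.sin(z)

def grad_V(x, y, z):
    """Its gradient, (2x sin z, 0, x^2 cos z)."""
    return (2 * x * math.sin(z), 0.0, x**2 * math.cos(z))

r = (1.0, 2.0, 0.5)            # starting point
dr = (1e-3, -2e-3, 1e-3)       # small displacement

# Exact change in V between r and r + dr.
exact_change = V(*(a + b for a, b in zip(r, dr))) - V(*r)

# First-order estimate dV = grad V . dr, with grad V evaluated at r.
linear_estimate = sum(g * d for g, d in zip(grad_V(*r), dr))

discrepancy = abs(exact_change - linear_estimate)
print(exact_change, linear_estimate, discrepancy)
```

The discrepancy is second order in |dr|, so it should be far smaller than the change itself.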

In our original definition of grad, it was implicitly assumed that we were working in terms of some
specified Cartesian coordinate system (x, y, z). Equation (1.39) is important, because we can use it as a more
fundamental definition of grad, which will enable us to write down ∇V in more general coordinate systems.
We shall return to this point later, in Chapter 5.

Next, consider a surface


V (r) = constant,
and suppose that the point r1 is on that surface. If dr is a displacement on this surface, V (r1 + dr) = V (r1 ).
Thus dV = ∇V |r1 .dr = 0. Since this applies for every small displacement dr in the surface, ∇V |r1 must be
perpendicular to the surface at r1 . This gives us a way of finding a normal to a surface when the surface is
specified by a single equation: the unit normal n will be ∇V /|∇V | evaluated at the point concerned.

Suppose we now want a normal line to V = constant, or a tangent plane (see Thomas 14.6). As we know
from (1.33) and (1.36), a line through point a in direction q can be written in parametric form as

r = a + tq, −∞ < t < ∞,

while the plane through a perpendicular to q is

(r − a).q = 0,

so all we have to do is insert values of a and q in these formulae. For the tangent plane and normal line to a
surface at a given point p, this gives

(r − p).∇V |p = 0 and r = p + t∇V|p .

It is sometimes convenient to eliminate the parameter t for the normal line, which we can do by taking
the cross product with ∇V |p :
(r − p) × ∇V|p = 0.
Note that the forms (r − p).n = 0 and (r − p) × n = 0, using the unit normal n, would give the same plane or
line (though in r = a + tn such a change alters the values of t for given points) so we need not calculate |∇V |
to get the tangent plane or normal line.
Repeated Note: to use this, you must evaluate ∇V at the point concerned.

Exercise 1.5. Find equations for the (i) tangent plane and (ii) normal line at the point P0 on each of the
surfaces:

(a) x² + 3yz + 4xy = 27, P0 = (3, 1, 2).

(b) y²z + x²y = 7, P0 = (2, 1, 3).

[Answers: (a) 10x + 18y + 3z = 54, r = (3 + 10t)i + (1 + 18t)j + (2 + 3t)k

(b) 4x + 10y + z = 21, r = (2 + 4t)i + (1 + 10t)j + (3 + t)k]
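The quoted answers can be checked mechanically once the gradients are known. In the sketch below the two gradient functions are derived by hand from the surfaces in the exercise, and the helper names `tangent_plane` and `normal_line_point` are our own. The plane is returned as a pair (normal, d) meaning r.normal = d.

```python
def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def grad_a(x, y, z):
    """Hand-derived gradient of x^2 + 3yz + 4xy."""
    return (2 * x + 4 * y, 3 * z + 4 * x, 3 * y)

def grad_b(x, y, z):
    """Hand-derived gradient of y^2 z + x^2 y."""
    return (2 * x * y, 2 * y * z + x**2, y**2)

def tangent_plane(grad, p):
    """Tangent plane (r - p).grad = 0, rearranged to r.normal = d."""
    normal = grad(*p)           # the gradient must be evaluated at p
    return normal, dot(normal, p)

def normal_line_point(grad, p, t):
    """Point r = p + t * grad on the normal line through p."""
    normal = grad(*p)
    return tuple(pi + t * ni for pi, ni in zip(p, normal))

plane_a = tangent_plane(grad_a, (3, 1, 2))
plane_b = tangent_plane(grad_b, (2, 1, 3))
print(plane_a, plane_b)
```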

Suppose now that we want to calculate the rate of change of V (r) in a particular direction specified by the
unit vector t. Let s be the distance travelled in the direction of t; then dr = t ds. So dV = ∇V.t ds. Hence we
can conclude that the rate of change of V in the direction of t is
\[ \frac{dV}{ds} = \nabla V.\mathbf{t} = \mathbf{t}.\nabla V. \]
t.∇V is called the directional derivative. Now

∇V.t = |∇V | |t| cos θ = |∇V | cos θ ,

where θ is the angle between the vectors ∇V and t. This is maximized when cos θ = 1, i.e. when θ = 0. Thus
V changes most rapidly in the direction of ∇V, and |∇V| is this most rapid rate of change. It is this property,
in the two-dimensional case, that gave rise to the name gradient, because |∇V| is the gradient of the surface
given by z = V(x, y) in that case. Correspondingly the most rapid decrease is when t is opposite to ∇V.

Example 1.8. Find the directions in which the function f (x, y, z) = (x/y) − yz increases and decreases
most rapidly at the point P, (4, 1, 1).

We can describe the directions in which f increases and decreases most rapidly by specifying the unit
vectors in those directions. Now
 
\[ \nabla f = \frac{1}{y}\,\mathbf{i} - \left(\frac{x}{y^2} + z\right)\mathbf{j} - y\,\mathbf{k} = (1, -5, -1) \text{ at } P. \]

The rate of change of f in the direction of unit vector t is ∇ f .t. This has its maximum when t is in the same
direction as ∇ f ; so the direction t in which f increases most rapidly is

\[ \frac{\nabla f}{|\nabla f|} = \frac{1}{\sqrt{27}}(1, -5, -1), \]

and the actual rate is √27. The rate of decrease of f is greatest, at −√27, when t is in the opposite direction,
i.e.

\[ \frac{-1}{\sqrt{27}}(1, -5, -1). \]
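The numbers in Example 1.8 can be reproduced with a short script. The gradient function below is the hand-derived one from the example, and the variable names and the comparison direction (1, 0, 0) are our own illustrative choices.

```python
import math

def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def grad_f(x, y, z):
    """Gradient of f = x/y - yz, found by hand."""
    return (1 / y, -(x / y**2 + z), -y)

P = (4, 1, 1)
g = grad_f(*P)                        # gradient evaluated at P
max_rate = math.sqrt(dot(g, g))       # |grad f|, the steepest rate of increase

t_up = tuple(c / max_rate for c in g)     # unit vector of fastest increase
t_down = tuple(-c for c in t_up)          # unit vector of fastest decrease

rate_up = dot(g, t_up)                # directional derivative along t_up
rate_down = dot(g, t_down)            # directional derivative along t_down
rate_x = dot(g, (1, 0, 0))            # for comparison: along the x-axis
print(g, max_rate, rate_up, rate_down, rate_x)
```

Any other unit direction, such as the x-axis used here, gives a rate of change strictly between the two extremes.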

Exercise 1.6. Find the directional derivative of Φ at the point (1, 2, 3) in the direction of the vector
(1, 1, 1) where
\[ \Phi = \frac{x^2}{3} + \frac{y^2}{9} + \frac{z^2}{27}. \]

We can write ∇ on its own as


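\[ \nabla = \mathbf{i}\frac{\partial}{\partial x} + \mathbf{j}\frac{\partial}{\partial y} + \mathbf{k}\frac{\partial}{\partial z} \]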
and work with it like a vector field, although it is in fact not a vector field (since we cannot say what numerical
value its components have at a particular point); strictly speaking ∇ is a vector differential operator. The
name of the symbol ∇ is ‘nabla’ but often in speech we say ‘del’. It is easy to see how to take a two-
dimensional version of ∇. We will return to ∇ in Chapter 3, where we shall see how the ∇ operator can also
be used to differentiate vectors.
