Calc3 Chapter1
“It is one of the most unnatural features of science that the abstract language of mathematics
should provide such a powerful tool for describing the behaviour of systems both inanimate, as
in physics, and living, as in biology. Why the world should conform to mathematical descriptions
is a deep question. Whatever the answer, it is astonishing.”
Lewis Wolpert (1992)
Students on the module MTH5102 Calculus III are welcome to download, print and photocopy these notes,
in whole or in part, for their personal use. The notes are intended to supplement rather than to replace the
lectures.
© Will Sutherland 2009-10, M.A.H. MacCallum 2006-8, 2001, P. Saha 2005, M.J. Thompson 1999.
Chapter 1
Introductory material
This chapter gives a quick review of the key parts of the prerequisite courses (Calculus I and II, and
Geometry I) which we will actually use in Calculus III, adding some extra material. Those parts which are
revision will be without examples.
1.1.1 Values
         0°     30° = π/6 rad    45° = π/4 rad    60° = π/3 rad    90° = π/2 rad
cos      1      √3/2             1/√2             1/2              0
sin      0      1/2              1/√2             √3/2             1
To get the sign for other values we can use the mnemonic table
sometimes called the ‘Add Sugar To Coffee’ rule – or use Thomas’ variant “All Students Take Calculus”.
(Note: to be entirely accurate we should have special rows in this table for the values π/2 etc. because at those
points one or more of the functions will be zero or unbounded.)
Then we remember what happens when we replace x by −x, x + π /2 or x + π :
These are very easy to derive from $e^{ix} = \cos x + i \sin x$, remembering that $e^{i\pi/2} = i$, $e^{i\pi} = -1$. Using them
in combination we can get
$$\cos(\pi - x) = -\cos x, \qquad \sin(\pi - x) = \sin x,$$
and so on.
More generally
$$\cos\left(x + (m + \tfrac{1}{2})\pi\right) = (-1)^{m+1} \sin x, \qquad \sin\left(x + (m + \tfrac{1}{2})\pi\right) = (-1)^{m} \cos x, \qquad (1.2)$$
$$\cos(x + n\pi) = (-1)^{n} \cos x, \qquad \sin(x + n\pi) = (-1)^{n} \sin x, \qquad (1.3)$$
where m and n are integers. These identities enable us to relate the value we want to a value in the first
quadrant (i.e. the range $[0, \tfrac{1}{2}\pi]$). Remember the special cases for x = 0,
$$\cos(n\pi) = (-1)^{n}, \qquad \sin\left((n + \tfrac{1}{2})\pi\right) = (-1)^{n}, \qquad (1.4)$$
$$\cos\left((n + \tfrac{1}{2})\pi\right) = 0, \qquad \sin(n\pi) = 0. \qquad (1.5)$$
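The shift identities (1.2) and (1.3) are easy to spot-check numerically. The sketch below is an illustration only; the helper names `sign` and `check_shift_identities` are invented here and do not come from the notes.

```python
import math

def sign(k):
    """Return (-1)^k for any integer k, including negative k."""
    return -1 if k % 2 else 1

def check_shift_identities(x, m, n, tol=1e-9):
    """Compare both sides of (1.2) and (1.3) at the angle x for integers m, n."""
    pairs = [
        # (1.2): shifts by an odd multiple of pi/2 swap cos and sin
        (math.cos(x + (m + 0.5) * math.pi), sign(m + 1) * math.sin(x)),
        (math.sin(x + (m + 0.5) * math.pi), sign(m) * math.cos(x)),
        # (1.3): shifts by a multiple of pi only change the sign
        (math.cos(x + n * math.pi), sign(n) * math.cos(x)),
        (math.sin(x + n * math.pi), sign(n) * math.sin(x)),
    ]
    return all(abs(lhs - rhs) < tol for lhs, rhs in pairs)
```

A numerical check like this is not a proof, but it is a quick way to catch a misplaced sign when quoting the identities from memory.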
If you have trouble remembering which of the last two is which, and which has the minus in it, try substituting
some special values such as A = 0 or B = π/2 and checking the result. For example, taking A = 0 in the last
equation gives sin B = 0 + sin B, consistent, whereas if you had tried sin(A + B) = sin A cos B − cos A sin B you
would get sin B = 0 − sin B, clearly wrong. From these and the earlier results Eq. 1.1 we get
The double angle cases
$$\cos^2 x = \tfrac{1}{2}(1 + \cos 2x)$$
$$\sin^2 x = \tfrac{1}{2}(1 - \cos 2x)$$
come up often; we get the “half angle” cases by just substituting y = 2x, x = y/2 in the above.
Using the basic identities we can easily derive plenty more, such as
$$\sec^2 A = 1 + \tan^2 A$$
$$\cos C + \cos D = 2 \cos\tfrac{1}{2}(C + D)\, \cos\tfrac{1}{2}(C - D).$$
We should also note (see Thomas 3.4) that for any constant k,
$$\frac{d(\sin(kx))}{dx} = k \cos(kx), \qquad \frac{d(\cos(kx))}{dx} = -k \sin(kx).$$
Both sin(kx) and cos(kx) therefore obey
$$\frac{d^2 y}{dx^2} = -k^2 y,$$
and it can be proved that these give all solutions, i.e.
$$\frac{d^2 y}{dx^2} = -k^2 y \iff y = a \cos(kx) + b \sin(kx) \qquad (1.12)$$
for some constants a and b. We could also write the right side as a combination of $e^{ikx}$ and $e^{-ikx}$.
This implies ln 1 = 0. Note that this is not a good definition if x < 0, but it is easy to show that for negative x,
$\int du/u = \ln|x| + \text{constant}$. The number e (Euler’s number) is then defined by ln e = 1. (After π, this is the
second most important constant in maths.)
In particular, either putting p = −1 in the above, or b = 1/a in the previous equation, gives us
$$\ln(1/a) = -\ln a$$
and hence
$$\ln(a/b) = \ln a - \ln b.$$
We can define exp (see Thomas 7.3) to be the inverse function to ln, so that exp(ln x) = ln(exp x) = x.
Then exp 1 = e and exp r = $e^r$. Note that exp(a + b) = $e^a e^b$, NOT $e^a + e^b$. For any positive number a, $a = e^{\ln a}$ and
hence $a^x = (e^{\ln a})^x = e^{x \ln a}$. In particular, this enables us to relate the usual logarithms (base 10) to natural
logarithms, since if $x = \log_{10} y$ then $y = 10^x = e^{x \ln 10}$, so ln y = x ln 10 and x = ln y / ln 10. For a general a we can
define $\log_a y = x$ to be such that $y = a^x$, so $\ln y = \log_e y$.
We can now use $e^x$ to define the hyperbolic functions (see Thomas 7.8)
These functions have identities and derivative properties that run closely parallel to those of sin and cos. If
you know the trigonometric identities, the identities for hyperbolic functions can be recovered by substituting
cosh for cos and i sinh for sin, where $i^2 = -1$.
Comparing Eq. 1.13 with 1.12, we now see how to solve $d^2 y/dx^2 = Cy$ for any constant C: if C is
positive, we define $k = \sqrt{C}$ and get 1.13, while if C is negative we define $k = \sqrt{-C}$ and get 1.12; finally if
C = 0 we easily integrate twice to get y = ax + b.
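The three cases can be collected into one small function. This is only a sketch: the names `general_solution` and `second_derivative` are invented for illustration, and the positive-C branch assumes the solution is written as a cosh(kx) + b sinh(kx), which is equivalent to a combination of $e^{kx}$ and $e^{-kx}$ but may not be the exact form the notes use for Eq. 1.13.

```python
import math

def general_solution(C, a, b):
    """Return a function y(x) satisfying y'' = C*y, with free constants a and b."""
    if C > 0:
        k = math.sqrt(C)
        # Assumed hyperbolic form of the positive-C solution
        return lambda x: a * math.cosh(k * x) + b * math.sinh(k * x)
    if C < 0:
        k = math.sqrt(-C)
        # Trigonometric solution, as in (1.12)
        return lambda x: a * math.cos(k * x) + b * math.sin(k * x)
    # C == 0: integrating twice gives a straight line
    return lambda x: a * x + b

def second_derivative(y, x, h=1e-4):
    """Central finite-difference estimate of y''(x)."""
    return (y(x + h) - 2 * y(x) + y(x - h)) / h**2
```

For any C, comparing `second_derivative(y, x)` with `C * y(x)` at a few points gives a quick numerical confirmation that the right branch was chosen.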
Example 1.1. Integrate the function $f(x, y) = x^2 y^2$ over the triangular area R: 0 ≤ x ≤ 1, 0 ≤ y ≤ x.
[Figure 1.1: sketch of the region of integration R, bounded above by the line y = x.]
where dA is an area element. But the area of a little rectangle of length δ x in the x-direction and length δ y in
the y-direction is δ A = δ x δ y; hence we can rewrite dA as dA = dx dy. Thus the integral we want (cf. Fig. 1.1)
is
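$$\begin{aligned}
\int\!\!\int f(x, y)\, dx\, dy &= \int_{x=0}^{1} \int_{y=0}^{x} x^2 y^2 \, dy\, dx \\
&= \int_{x=0}^{1} x^2 \left( \int_{y=0}^{x} y^2 \, dy \right) dx \\
&= \int_{0}^{1} x^2 \, \tfrac{1}{3} x^3 \, dx \\
&= \left[ \tfrac{1}{18} x^6 \right]_{0}^{1} = \tfrac{1}{18}.
\end{aligned}$$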
Here there are two key points to note: the limits on the (inner) y-integral depend on x, and in the second
step we have moved the $x^2$ outside the y-integral because it does not depend on y; so the $x^2$ behaves like a
“constant” inside the y-integral, but not for the x-integral.
An area integral such as this is often called a double integral (because it can be rewritten as two 1-D
integrations). Some authors use two integration signs, to remind you that it is an area integral: thus they
would write $\int\!\!\int f(x, y)\, dA$. In this course, whenever it is obvious that an integral is over area, we shall
generally just write $\int f(x, y)\, dA$.
Similarly some books write $\int\!\!\int\!\!\int f(x, y, z)\, dV$ for a volume integral: where no confusion will arise, we
shall just write $\int f(x, y, z)\, dV$.
We shall need to put in all the integral signs when obtaining a value by doing the two or three integrations
with respect to coordinates.
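As a rough numerical cross-check of Example 1.1, the iterated integration can be approximated with a midpoint rule in which the inner limits are functions of x. This is a sketch under our own assumptions: the function name `double_integral`, its arguments and the choice of midpoint rule are illustrative, not part of the course material.

```python
def double_integral(f, x_lo, x_hi, y_lo, y_hi, n=400):
    """Midpoint-rule estimate of the integral of f(x, y) over
    x_lo <= x <= x_hi, y_lo(x) <= y <= y_hi(x)."""
    dx = (x_hi - x_lo) / n
    total = 0.0
    for i in range(n):
        x = x_lo + (i + 0.5) * dx
        # The inner limits are evaluated at the current x
        y0, y1 = y_lo(x), y_hi(x)
        dy = (y1 - y0) / n
        inner = 0.0
        for j in range(n):
            y = y0 + (j + 0.5) * dy
            inner += f(x, y) * dy
        total += inner * dx
    return total

# Example 1.1: f = x^2 y^2 over the triangle 0 <= x <= 1, 0 <= y <= x
estimate = double_integral(lambda x, y: x**2 * y**2,
                           0.0, 1.0, lambda x: 0.0, lambda x: x)
```

With these settings `estimate` should agree with the exact value 1/18 to several decimal places; the inner loop runs inside the outer one precisely because the y-limits depend on x.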
Exercise 1.1. Calculate $\int_R f(x, y)\, dA$ for
$f(x, y) = 1 - 6x^2 y$ and R: 0 ≤ x ≤ 2, −1 ≤ y ≤ 1.
[Answer: 4]
Note that in that exercise, the region of integration is a rectangle, so the limits of both the x- and y-
integrations were constants so one could do the x- or the y-integration first – the answer will be the same.
This holds for any rectangular region in 2-D , or for a cuboid when we come to 3-D integration.
In Example 1.1 we had a triangle, so the upper limit of the “inside” integral depends on the “outer”
variable: the upper limit of the y-integral was x, so the y-integration had to be performed first with the limits as
given. Otherwise the answer would have read
$$\int\!\!\int f(x, y)\, dx\, dy = \int_{y=0}^{x} \int_{x=0}^{1} x^2 y^2 \, dx\, dy = \frac{x^3}{9}.$$
This depends on x, which is ridiculous as the answer is for a whole area, not some value of x. If we want to
change the order of integration we need to take y going from 0 to 1, then x runs from y to 1 (check the sketch);
now we have to put the dx integral on the inside, and we get
$$\int\!\!\int f(x, y)\, dx\, dy = \int_{y=0}^{1} \int_{x=y}^{1} x^2 y^2 \, dx\, dy.$$
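As a check, the reordered integral can be evaluated in the same way as Example 1.1. The working below is a sketch added for illustration; it should reproduce the value 1/18 found with the original order of integration.

```latex
\begin{aligned}
\int_{y=0}^{1} \int_{x=y}^{1} x^2 y^2 \, dx \, dy
  &= \int_{0}^{1} y^2 \left[ \tfrac{1}{3} x^3 \right]_{x=y}^{1} dy \\
  &= \int_{0}^{1} \tfrac{1}{3} \left( y^2 - y^5 \right) dy \\
  &= \tfrac{1}{3} \left( \tfrac{1}{3} - \tfrac{1}{6} \right) = \tfrac{1}{18}.
\end{aligned}
```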
It’s important that you understand how to get these limits: when doing a numerical evaluation of a multiple
integral, there are several rules to remember :
It is a straightforward step from double integrals to volume integrals (triple integrals) of the form $\int_V f(x, y, z)\, dV$.
In Cartesian coordinates we have dV = dx dy dz (the volume of a 3-D rectangular box) and so
$$\int_V f(x, y, z)\, dV = \int\!\!\int\!\!\int_V f(x, y, z)\, dx\, dy\, dz.$$
Sometimes the geometry of the volume will make other choices of coordinate system preferable. In
Thomas 15.3 and 15.6, which were studied in Calculus II, two-dimensional plane integrals in polar coordinates,
and triple integrals in spherical and cylindrical polars are discussed: you will find it very useful to
revise those sections. For a general change of coordinate system from Cartesians (x, y, z) to (u, v, w),
$$\int\!\!\int\!\!\int_V f \, dx\, dy\, dz = \int\!\!\int\!\!\int_V f \, J \, du\, dv\, dw$$
where J is the Jacobian determinant of (x, y, z) with respect to (u, v, w). This determinant J is the volume
ratio of the two coordinate systems: if we take an infinitesimal cuboid in (u, v, w) space of volume du dv dw,
this will map to a parallelepiped in x, y, z space, and J is the ratio of those volumes (if you need to revise this
in more detail, see Thomas 15.7).
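To see the volume-ratio interpretation in action, the Jacobian determinant can be estimated by finite differences and compared with a known result. The sketch below makes several assumptions of its own: the function names are invented, and the spherical polar convention (theta measured from the z-axis, phi the azimuthal angle) is one common choice that may differ from the labelling in Thomas. With this convention and ordering (r, theta, phi) the determinant should be $r^2 \sin\theta$.

```python
import math

def spherical_to_cartesian(r, theta, phi):
    """Assumed convention: theta is the angle from the z-axis, phi the azimuth."""
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))

def jacobian_determinant(transform, u, v, w, h=1e-6):
    """Central-difference estimate of det d(x, y, z)/d(u, v, w)."""
    point = (u, v, w)
    rows = []
    for i in range(3):
        plus, minus = list(point), list(point)
        plus[i] += h
        minus[i] -= h
        fp, fm = transform(*plus), transform(*minus)
        # rows[i] holds the derivatives of (x, y, z) with respect to the i-th variable
        rows.append([(p - m) / (2 * h) for p, m in zip(fp, fm)])
    a, b, c = rows
    # 3x3 determinant; transposing the matrix does not change its value
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))
```

For example, `jacobian_determinant(spherical_to_cartesian, 2.0, 0.8, 1.1)` should be close to `4 * math.sin(0.8)`.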
1.4 Curves and surfaces
We shall use various geometrical shapes in examples, so we need the equations for them. The main ones
are so-called ‘conic sections’ in two dimensions, and related three-dimensional surfaces. Other courses also
discuss more complicated shapes (see e.g. Thomas 10.6 and 10.7).
First we discuss curves in two dimensions. There are three main ways to specify a curve:
One way is to give an equation y = f(x); a second way is to give an equation g(x, y) = 0: the curve is
then the set of points (x, y) obeying the equation. Given the first form, we can get the second by defining
g(x, y) = y − f(x), but not necessarily the converse.
A third way is the parametric form in terms of two functions of some variable t: x = a(t), y = b(t) (see
Thomas 3.5). Sometimes we can take t = x itself. The parametrized form carries extra information, about
which direction and how fast we go along the curve as t changes.2 We will see a lot more examples of the
parametrized form in Chapter 2.
(See Thomas 1.2, 1.5.) The special case of a hyperbola with a = 0 is just a pair of straight lines. These curves
involving only constants and powers up to $x^2$ and $y^2$ are known as the conic sections.
What if the equation is quadratic but not one of these standard forms? Given
$$x^2 + 6x + y^2 + 8y = 0,$$
we can complete the square in x and in y to get
$$(x + 3)^2 + (y + 4)^2 = 25,$$
which we now recognize as a circle radius 5, centre (−3, −4): this circle passes through the origin. Similar
methods can be used to recognize the other standard curves if they are given relative to origins different from
the ones used in the most standard forms below (cf. Thomas 1.5).
We can also recognize the case where the axes have been transformed, in a similar way. For example,
$xy = b^2 \iff (y + x)^2 - (y - x)^2 = 4b^2$, so it’s a hyperbola where the symmetry axes are at 45° to those used in
(1.17) with c = k = 1 and with $4b^2 = a^2$. In general we have to complete the square on the terms quadratic in
x and y: for example the rearrangement
$$x^2 + 4xy + 3y^2 = (x + 2y)^2 - y^2 = 6$$
shows the curve $x^2 + 4xy + 3y^2 = 6$ is not an ellipse, as you might think from the fact the coefficients of $x^2$
and $y^2$ are both positive, but a hyperbola.
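A quick alternative to completing the square is the discriminant test, which is a standard result but not the method used in the text above: for quadratic terms $Ax^2 + Bxy + Cy^2$ the sign of $B^2 - 4AC$ decides the type. The function name below is invented for illustration, and degenerate cases (a pair of lines, a single point, no real points) are deliberately ignored.

```python
def classify_conic(A, B, C):
    """Classify A x^2 + B x y + C y^2 + (lower-order terms) = 0
    by the sign of the discriminant B^2 - 4AC (non-degenerate cases only)."""
    disc = B * B - 4 * A * C
    if disc < 0:
        return "ellipse"
    if disc > 0:
        return "hyperbola"
    return "parabola"
```

For the curve $x^2 + 4xy + 3y^2 = 6$ the discriminant is 16 − 12 = 4 > 0, so `classify_conic(1, 4, 3)` agrees with the conclusion reached by completing the square.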
2 The latter approach is used heavily in Geometry II.
Parametrized curves are also useful, especially when calculating line integrals along curves later on.
Here are some standard parametrizations for the circle, ellipse and hyperbola:
These work because of the identity (1.6) and its hyperbolic counterpart $\cosh^2 x - \sinh^2 x = 1$. (See Thomas
10.4 for further or alternative parametrizations.) We shall use these, especially the first two, later.
For surfaces in 3 dimensions, there are similarly three main ways to give the equations. One is to give
one coordinate in terms of the other two, e.g. z = h(x, y). Another is to use a single equation V (x, y, z) = 0.
The third is by a parametrization in terms of two variables e.g. (x(u, v), y(u, v), z(u, v)) (see Thomas 16.6,
and more details in Chapter 2 ).
We shall again focus on surfaces described by quadratics in x, y and z at worst. To work out what the
surface is like, one good way is to consider letting one coordinate be constant, for example z = d, which
means we are considering a “slice” through the surface V = 0 at the plane z = d. The intersection of a curved
surface and a plane is generally a 1-D curve, which we should be able to identify from the previous section.
Then we just stack those curves for varying d.
Example 1.2. What is the surface $\frac{y^2}{a^2} + \frac{z^2}{b^2} = 1$?
We can also have parabolic and hyperbolic “cylinders”, using (1.16) and (1.17).
Another simple three-dimensional surface is that of a sphere of radius a centred at the origin:
$$x^2 + y^2 + z^2 = a^2. \qquad (1.21)$$
We can put together cases where we get one of the standard types of curve listed earlier in planes z = d
and different ones in planes x = k or y = m say.
For example, we can generalize the ellipse (1.15) to
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1 \qquad (1.22)$$
(see Thomas, 12.6). In this case each of the three types of cut-plane gives an ellipse as the curve. The surface
is an ellipsoid. This shape, and the ones that follow, are shown in diagrams 12.48-12.52 in Thomas (the
Wikipedia article on “Quadrics” also has some pretty graphics).
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = +1. \qquad (1.24)$$
Here we have ellipses in planes z = d, and hyperbolae in the planes x = 0 and y = 0. Moving the $z^2$ term over
to the RHS, we see the RHS is positive for any z, so there is an ellipse for any fixed value of z and the surface
has just one piece (we say ‘one sheet’).
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = -1. \qquad (1.25)$$
we can rearrange into
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = \frac{z^2}{c^2} - 1. \qquad (1.26)$$
It’s now clear that if $z^2/c^2 > 1$ (i.e. z < −c or z > c), we again get an ellipse in the xy plane; but if
−c < z < c the RHS is negative and there are no solutions for x, y. This is a hyperboloid of two sheets.
The elliptic paraboloid and hyperboloid have circular special cases where a = b. Note also that we can
swap x, y and z around in these forms so we have different choices of axes for the same shapes.
There is also the special case of the hyperboloid equation where the constant on the RHS is zero, i.e.
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = 0 \qquad (1.27)$$
This is a cone through the origin. Taking a plane through the origin such as x = 0, we get two straight lines,
while taking planes perpendicular to the axes but not through the origin gives ellipses or hyperbolae. In fact all
the quadratic curves (ellipses, circles, parabolae and hyperbolae) can be obtained by intersecting the circular
cone $z^2 = x^2 + y^2$ with planes (not necessarily perpendicular to the axes): this is why they are called conic
sections (see Thomas chapter 10).
If we are given a quadratic surface in a different form, we can first rearrange it into one of the forms
above: rearrange so that all the x, y, z terms are on the left, and the constant on the right; if the constant is not
zero, divide by it to get a +1 on the right; then look at the $x^2$, $y^2$ and $z^2$ parts: if two or three of these have
negative coefficients, just multiply by −1 to make at least two of the coefficients of $x^2$, $y^2$, $z^2$ positive. Then:
If all 3 are positive, it’s an ellipsoid (or a sphere).
If two are positive and one negative it’s a hyperboloid, and we need to check the constant term
to see if it’s one sheet or two.
If one of the three $x^2$, $y^2$ or $z^2$ terms is zero, but there is a linear term in the corresponding
variable, it’s a paraboloid: the relative sign of the other two shows if it is elliptic or hyperbolic.
If one variable is missing completely, it’s a “cylinder” given by the matching 2-D curve.
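The rules above can be sketched as a small decision function. Everything about its interface is an assumption made for illustration: the surface is taken as $a x^2 + b y^2 + c z^2 + p x + q y + r z = \text{rhs}$, passed as `squares=(a, b, c)`, `linear=(p, q, r)` and `rhs`, and a variable is assumed to carry a linear term only when its squared term is absent (otherwise complete the square first).

```python
def classify_quadric(squares, linear=(0, 0, 0), rhs=0):
    """Classify a x^2 + b y^2 + c z^2 + p x + q y + r z = rhs for an
    axis-aligned quadric, following the sign rules in the text."""
    # A variable that appears nowhere gives a "cylinder"
    if any(s == 0 and l == 0 for s, l in zip(squares, linear)):
        return "cylinder"
    zero_squares = [s == 0 for s in squares]  # these variables have a linear term
    if sum(zero_squares) == 1:
        others = [s for s in squares if s != 0]
        same_sign = others[0] * others[1] > 0
        return "elliptic paraboloid" if same_sign else "hyperbolic paraboloid"
    if sum(zero_squares) > 1:
        return "not covered by these rules"
    if rhs == 0:
        all_same = all(s > 0 for s in squares) or all(s < 0 for s in squares)
        return "single point" if all_same else "cone"
    # Dividing by rhs puts +1 on the right; count the positive coefficients
    positives = sum(1 for s in squares if s / rhs > 0)
    return ["no real points", "hyperboloid of two sheets",
            "hyperboloid of one sheet", "ellipsoid"][positives]
```

For instance `classify_quadric((1, 1, -1), rhs=1)` and `classify_quadric((1, 1, -1), rhs=-1)` distinguish the one-sheet form (1.24) from the two-sheet form (1.25).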
As in the case of curves, we can work out what the shape is if the equations are not in standard form but
have shifted origins or rotated axes, by completing the square. For this course, we’ll keep it simple though,
so we will only be looking at surfaces which are aligned with the coordinate axes.
1.6 Vectors
One can draw a vector as an arrow of the appropriate length and direction. Vectors are usually notated in
print by boldface type, e.g. a, and in handwriting by under- or over-lining such as $\underline{a}$, $\vec{a}$, or $\underset{\sim}{a}$.
Warning: When writing, it is tempting to miss off the under/overlines to save time. This is a bad idea,
because if you confuse what’s a scalar and what’s a vector in your working, you immediately get nonsense.
To define a vector algebraically, i.e. in a formula, we can use the Cartesian coordinates of the point to
which it displaces the origin, e.g.
r = (x, y, z). (1.32)
(Note: As you saw in Geometry I, we can write vectors either as row or column vectors. The column vector
form is useful if you are multiplying by matrices (like rotation matrices), but in this course we shall mainly
use the row vector form which is more compact.)
Here x, y and z are called the components of r. We may refer to (x, y, z) as the point r. From now on we
shall use the notation r only for this vector.
The length of a vector v is denoted by |v| or sometimes just v; this is a scalar. The vector r has length
$r = \sqrt{x^2 + y^2 + z^2}$, by Pythagoras’ theorem in 3 dimensions.
To add vectors a and b we simply take the displacement obtained by displacing first by a and then by b
(the result can be defined as the diagonal of the parallelogram with sides a and b). In components this says
that v = (v1 , v2 , v3 ) and w = (w1 , w2 , w3 ) have the sum
v + w = (v1 + w1 , v2 + w2 , v3 + w3 ).
Subtraction can then be defined similarly. The zero vector 0 is the one with zero magnitude (and no well-
defined direction!).
It is now easy to show this obeys the usual rules of addition (and subtraction).3
We can multiply a vector by a scalar (a number) λ , simply by multiplying its magnitude, preserving the
direction. In components, if v = (v1 , v2 , v3 ) then we have
λ v = (λ v1 , λ v2 , λ v3 ).
This operation also obeys very simple and obvious rules. 4 This multiplication gives us a way to define the
unit vector (the vector of length 1) in the same direction as v, denoted by v̂, by v̂ ≡ v/|v| (strictly, we should
write the number first so we would have to write (1/|v|)v, but in practice it’s obvious what we mean).
These rules give us another common way of writing a vector. We note that we can arrive at the same total
displacement by first moving along the x-axis, then parallel to the y-axis then parallel to the z-axis; and we
can express this by defining the unit vectors i, j and k along the directions of the three axes by
r = xi + yj + zk.
This way of writing (1.32) has the advantage of making it clearer how the components change if we change
our choice of axes: if we rotate our axes to a different system x′ , y′ , z′ , we will get 3 new unit vectors e.g. i’,
j’ and k’, and converting vectors between systems looks like a matrix multiplication - more on this later.
Note that all of these statements about position vectors in 3 dimensions can very simply be applied in 2
dimensions also, with obvious minor changes.
Although we have motivated vectors by introducing them as displacements, they can represent, or be
interpreted as, many other things: for example, a force, a velocity, inputs and outputs in an economic model,
and so on.
The equation
$$\mathbf{r} = \mathbf{p} + t\mathbf{q}, \qquad -\infty < t < \infty \qquad (1.33)$$
defines a line through point p parallel to direction q. For example r = tk, −∞ < t < ∞ is the z axis.
Using this, we can get the straight line going through two given points r1 and r2: the vector from r1 to r2
is r2 − r1, so the (infinite) line through them is
$$\mathbf{r} = \mathbf{r}_1 + t(\mathbf{r}_2 - \mathbf{r}_1), \qquad -\infty < t < \infty.$$
If instead we take a range 0 ≤ t ≤ 1 in the above, this gives us the finite line segment with end-points at the
two given points. This will be very useful later on, so memorise it.
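In code, the line through two points is a one-line function of the parameter t. This is a minimal sketch; the name `point_on_line` is invented here, and vectors are represented as plain tuples.

```python
def point_on_line(r1, r2, t):
    """Return the point r1 + t*(r2 - r1); 0 <= t <= 1 covers the segment from r1 to r2."""
    return tuple(a + t * (b - a) for a, b in zip(r1, r2))

# End-points and midpoint of the segment from (1, 0, 2) to (3, 4, 6)
start = point_on_line((1, 0, 2), (3, 4, 6), 0)
end = point_on_line((1, 0, 2), (3, 4, 6), 1)
middle = point_on_line((1, 0, 2), (3, 4, 6), 0.5)
```

Values of t outside [0, 1] give points on the infinite line beyond the two end-points.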
3 This means that for any vectors a, b and c,
a + b = b + a, (a + b) + c = a + (b + c), ∃0 such that a + 0 = a,
and given a, ∃(−a) such that a + (−a) = 0. These rules are purely abstract and make no reference to displacements or three dimensions,
and are part of the general definition of a vector space which is given in Linear Algebra I. Those who have encountered groups will
recognise that they ensure that the space of vectors is an additive group under vector addition.
4 More precisely, for any vectors a and b, and numbers λ and µ , we have
λ (a + b) = λ a + λ b, (λ + µ )a = λ a + µ a, (λ µ )a = λ ( µ a)
and 1a = a. For a general vector space, as defined in Linear Algebra I, the scalars are elements of a general field but here we shall only
use the real numbers R. However, these rules do apply when λ and µ are elements of a general field, for instance the complex numbers
C.
Example 1.4. Medians of a triangle
Vectors can often be used to derive geometrical results very concisely, as this example shows.
Let a, b, c be the corners of a triangle. The midpoint of the side connecting b and c will be $\tfrac{1}{2}$(b + c). A
line through this midpoint and a is
$$\mathbf{r} = \mathbf{a} + t\left(\tfrac{1}{2}(\mathbf{b} + \mathbf{c}) - \mathbf{a}\right),$$
which is called the median through a. Putting t = 2/3 (note: here this choice is a rabbit out of the hat, but we can
find it by writing down a second median and solving for the intersection point) we get the point $\tfrac{1}{3}$(a + b + c).
Since this point is symmetric in a, b, c, the medians through b and c will also pass through it. Hence the three
medians of a triangle intersect at a single point.
If we write out the components of (1.33), with notation r = (x, y, z), p = (p1 , p2 , p3 ), q = (q1 , q2 , q3 )
we find
$$x = p_1 + t q_1, \qquad y = p_2 + t q_2, \qquad z = p_3 + t q_3,$$
from which we can eliminate t to get
$$\frac{x - p_1}{q_1} = \frac{y - p_2}{q_2} = \frac{z - p_3}{q_3},$$
giving the two independent linear equations (e.g. for y and z in terms of x) needed for a line in
three-dimensional space.
We can now write functions of 3-dimensional position f(x, y, z) more compactly as functions f(r).
Equations of the form f(r) = constant define surfaces, the constant surfaces of f. A simple example is $r^2 = 1$,
which is a sphere of unit radius centred at the origin. (Recall our notation allows r ≡ |r|.)
Example 1.5. A sphere
Warning: One of the commonest errors made by students is to confuse vectors and scalars, in particular
to start adding together the components of a vector. The vector (3, 1, 2) is not the same as the scalar 6. This
may seem obvious now, but the mistake is more easily made when using basis vectors like i, j and k; then it
somehow seems to be easier to make the mistake 3i + j + 2k = 6.
We have defined vector addition and subtraction, but not multiplication of vectors. This is more complicated
because to obtain another vector we need to define both a magnitude and a direction (and in general, vector
division cannot be defined at all; we can divide a vector by a scalar λ just by multiplying by 1/λ , but we
cannot divide anything by a vector).
We first define the dot product, or scalar product 5, whose result is not a vector but a scalar. For vectors v
and w, this is defined by
v.w ≡ |v||w| cos θ , (1.35)
where θ is the angle between v and w. An alternative definition in terms of the components (v1 , v2 , v3 ) and
(w1 , w2 , w3 ) of v and w is
$$\mathbf{v}.\mathbf{w} \equiv v_1 w_1 + v_2 w_2 + v_3 w_3 = \sum_{i=1}^{3} v_i w_i.$$
One can prove that the two definitions are the same by applying Pythagoras’ theorem to a triangle
constructed as follows. Take sides v, w and v + w. Draw the perpendicular from v + w to the line in direction v.
It has height |w| sin θ and meets the direction v at a distance |v| + |w| cos θ . Now write out Pythagoras with
the lengths in terms of |v|, |w| and θ and again in terms of components and compare the results. The details
are left as an exercise (if you have trouble, look in the online notes for MAS114 Geometry I or in A.E. Hirst,
Vectors in 2 or 3 dimensions, Arnold 1995, chapter 3).
We note in particular that two non-zero vectors v and w are perpendicular ( θ is a right angle) if and only
if v.w = 0.
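The two definitions work together in practice: the component form gives the number, and (1.35) turns it into an angle. The sketch below uses helper names (`dot`, `angle_between`) chosen here for illustration, with vectors as plain tuples.

```python
import math

def dot(v, w):
    """Component form of the dot product: v1*w1 + v2*w2 + v3*w3."""
    return sum(vi * wi for vi, wi in zip(v, w))

def angle_between(v, w):
    """Angle in radians from v.w = |v||w| cos(theta); v and w must be non-zero."""
    cos_theta = dot(v, w) / math.sqrt(dot(v, v) * dot(w, w))
    # Clamp to [-1, 1] to guard against rounding error before calling acos
    return math.acos(max(-1.0, min(1.0, cos_theta)))
```

For example, (1, 2, 3) and (3, 0, −1) have dot product 3 + 0 − 3 = 0, so they are perpendicular and `angle_between` returns π/2 for them.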
5 In a more abstract setting (such as in Linear Algebra I) this may also be called the inner product.
From either form of the definition we can easily derive various algebraic rules.6
A geometrical application of the dot product is in giving the equation of a plane. The plane through a
fixed point p perpendicular to a fixed vector v is given by the set of all points r which have r − p perpendicular
to v, as is easily seen from a sketch. Since two perpendicular vectors have a dot product of zero, this gives
(r − p).v = 0 (1.36)
This easily rearranges to r.v = p.v and the right-hand side is just a constant for given p, v.
In components, if v = (a, b, c) and p.v = d the equation for a plane reads ax + by + cz = d. In practice
people often choose a unit vector n when specifying a plane in this form, so that p.n becomes the perpendic-
ular distance of the plane from the origin. Then, the distance of any other point r1 from that plane is given by
(r1 − p).n = r1 .n − p.n = r1 .n − d (the sign here tells one which side of the plane r1 is on).
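The signed distance from a plane can be computed directly from this formula, normalising the normal vector on the way. This is a sketch with invented names (`signed_distance`, and the arguments `point`, `p`, `v`); it accepts any non-zero normal v rather than requiring a unit vector.

```python
import math

def dot(v, w):
    """Dot product in components."""
    return sum(vi * wi for vi, wi in zip(v, w))

def signed_distance(point, p, v):
    """Signed distance of `point` from the plane through p with normal v.
    Equivalent to (r1 - p).n with n = v/|v|; the sign says which side of the plane."""
    diff = tuple(a - b for a, b in zip(point, p))
    return dot(diff, v) / math.sqrt(dot(v, v))
```

For the plane z = 2, given as p = (0, 0, 2) and v = (0, 0, 3), the point (1, 1, 5) lies a distance 3 on the side the normal points to, while the origin lies a distance 2 on the other side.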
The vector product: To define a product of two vectors which is a third vector, we need to define a
direction from two vectors u and v. The only way to do this which treats the two vectors equally is to take the
perpendicular to the plane in which u and v lie. However, this does not fully define a direction, because we
need to know which way to go along the perpendicular. For that the convention is to use the so-called right-
hand rule: hold the fingers of your right hand so they curl round from u to v and then take the direction your
thumb points (see Thomas figures 12.27 and 12.28). If you do DIY, you may find it helpful to remember that
this is the direction a normal screw travels if you turn your screwdriver clockwise. Note that this definition
only works in three dimensions: there is no well-defined vector product in n dimensions for n > 3.
The magnitude of v × w is defined to be |v||w| sin θ (θ as before). Geometrically this is the area of a
parallelogram with sides v and w. Note that for perpendicular vectors this rule implies that the magnitude is
|v||w|. These rules have the consequences that for any vectors u, v and w and any scalar λ ,
v×w = −w × v,
(λ v) × w = λ (v × w) = v × (λ w),
u × (v + w) = (u × v) + (u × w),
(u + v) × w = u × w + v × w,
and v × w = 0 for non-zero v, w if and only if v and w are parallel or anti-parallel (in particular, v × v = 0 for
any v).
Note: it is particularly important to note the sign-change property that v × w = −w × v. This looks
“silly”, but is a consequence of the “handedness” of three-dimensional space, and which way round we
choose to label our three coordinate axes.
From the notation used, the vector product is often called the cross product.7
6 The main ones are that v.w = w.v and that for any vectors u, v and w and any scalar λ ,
v.(λ w) = λ (v.w) = (λ v).w,
u.(v + w) = (u.v) + (u.w),
(u + v).w = (u.w) + (v.w),
v.v = |v|2 ≥ 0,
v.v = 0 ⇔ v = 0.
7 You will also find that in some texts it is denoted v ∧ w, but I strongly advise against using this notation as it leads to confusion in
more general settings where v ∧ w is not a vector. The reason for this misuse is that v ∧ w is what’s called a two-form, and there is an
operation called the Hodge dual, denoted by ∗, such that in three dimensions ∗(v ∧ w) = v × w.
To get the expressions for the cross product in terms of components, we can start by noting that the unit
vectors i, j and k are perpendicular to one another (so the vector product of any two distinct ones among them
has magnitude 1). This means that i × j must have length 1 and be perpendicular to both of them, so it is
either +k or −k. Since the usual x, y and z axes, in that order, are a right-handed set, it will turn out that
i × j = k, j × k = i, k × i = j,
and therefore
j × i = −k, k × j = −i, i × k = −j.
(To remember these, think of the sequence ijkijk...; if the two vectors in the cross-product are in the same
order as in that sequence, the RHS has a + sign, while if they are in reverse order there is a − sign.)
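$$(v_1 \mathbf{i} + v_2 \mathbf{j} + v_3 \mathbf{k}) \times (w_1 \mathbf{i} + w_2 \mathbf{j} + w_3 \mathbf{k}) = (v_2 w_3 - v_3 w_2)\mathbf{i} + (v_3 w_1 - v_1 w_3)\mathbf{j} + (v_1 w_2 - v_2 w_1)\mathbf{k}. \qquad (1.37)$$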
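Formula (1.37) translates directly into code. The sketch below uses plain tuples for vectors; the function name `cross` is our own choice.

```python
def cross(v, w):
    """Cross product in components, following (1.37)."""
    v1, v2, v3 = v
    w1, w2, w3 = w
    return (v2 * w3 - v3 * w2,
            v3 * w1 - v1 * w3,
            v1 * w2 - v2 * w1)

# The basis-vector rules: i x j = k, and reversing the order flips the sign
i, j, k = (1, 0, 0), (0, 1, 0), (0, 0, 1)
i_cross_j = cross(i, j)
j_cross_i = cross(j, i)
```

A useful sanity check on any computed cross product is that it is perpendicular to both inputs, i.e. its dot product with each of them is zero.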
One geometrical use of the cross product is in forming the volume of a parallelepiped with sides u, v
and w. Thinking of (say) u and v as the base, and θ as the angle between u × v and w, so that the height is
|w| cos θ, we see that
Volume of parallelepiped = (u × v).w (1.38)
(positive if u, v and w are a right-handed set). This quantity is called the scalar triple product and it is easy to
show that
u.(v × w) = v.(w × u) = w.(u × v)
(but this is −v.(u × w) etc., remember). We can also show that swapping the dot and cross gives the same
result, i.e. (u × v).w = w.(u × v) = u.(v × w) from above, but note that the brackets also move, i.e. the cross
product must be done first (inside the brackets), otherwise the result is nonsense. (Some textbooks may omit
the brackets, but this is potentially confusing.) Clearly swapping the two vectors inside the bracket changes
the sign, and we can show that this is also true for swapping any two of the three vectors.
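The cyclic symmetry and the sign change under a swap can be checked on a concrete example. This is a sketch with invented helper names (`dot`, `cross`, `scalar_triple`); integer components are used so the comparisons are exact.

```python
def dot(v, w):
    """Dot product in components."""
    return sum(vi * wi for vi, wi in zip(v, w))

def cross(v, w):
    """Cross product in components, as in (1.37)."""
    return (v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0])

def scalar_triple(u, v, w):
    """u.(v x w): the signed volume of the parallelepiped with sides u, v, w."""
    return dot(u, cross(v, w))

u, v, w = (1, 2, 3), (4, 5, 6), (7, 8, 10)
cyclic = (scalar_triple(u, v, w), scalar_triple(v, w, u), scalar_triple(w, u, v))
swapped = scalar_triple(v, u, w)
```

Here all three entries of `cyclic` are equal, and `swapped` has the opposite sign; the negative value indicates that u, v, w in this order form a left-handed set.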
Exercise 1.3. Prove from the definitions that, for all a, b and c,
a × (b × c) = (a.c)b − (a.b)c.
This quantity is called the vector triple product; note that the position of the brackets matters here.
1.8 Gradients and directional derivatives
Suppose that V (x, y, z) or V (r) is a scalar field defined in some region. Then we can define a vector, the
gradient of V , at each point, which we denote ∇V , as follows:
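$$\nabla V = \frac{\partial V}{\partial x}\mathbf{i} + \frac{\partial V}{\partial y}\mathbf{j} + \frac{\partial V}{\partial z}\mathbf{k}.$$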
So the x-, y- and z-components of the new vector are ∂ V /∂ x, ∂ V /∂ y and ∂ V /∂ z. See Thomas 14.5 if
you need to revise this in more detail. Sometimes instead of ∇V we write grad V : the two notations are
interchangeable.
$\nabla V = 2x \sin z \, \mathbf{i} + x^2 \cos z \, \mathbf{k}$.
(a) $f = x + y + z$,
(b) $f = yx^2 + y^3 - y + 2x^2 z$,
(c) f = a.r, where a is a constant vector.
Now ∇V tells us how V changes if we move from one point to a nearby point. Suppose we start at a point
r = (x, y, z), and then move a small distance dr = (dx, dy, dz) to the new point r + dr = (x + dx, y + dy, z + dz):
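we will get a small change in V, given by
$$dV = \frac{\partial V}{\partial x} dx + \frac{\partial V}{\partial y} dy + \frac{\partial V}{\partial z} dz = \nabla V . d\mathbf{r}. \qquad (1.39)$$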
In our original definition of grad, it was implicitly assumed that we were working in terms of some
specified Cartesian coordinate system (x, y, z). Equation (1.39) is important, because we can use it as a more
fundamental definition of grad, which will enable us to write down ∇V in more general coordinate systems.
We shall return to this point later, in Chapter 5.
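Suppose we now want a normal line to V = constant, or a tangent plane (see Thomas 14.6). As we know
from (1.33) and (1.36), a line through point a in direction q can be written in parametric form as
$$\mathbf{r} = \mathbf{a} + t\mathbf{q},$$
and the plane through a perpendicular to q as
$$(\mathbf{r} - \mathbf{a}).\mathbf{q} = 0,$$
so all we have to do is insert values of a and q in these formulae. For the tangent plane and normal line to a
surface at a given point p, this gives
$$(\mathbf{r} - \mathbf{p}).\nabla V|_{\mathbf{p}} = 0 \quad \text{(tangent plane)}, \qquad \mathbf{r} = \mathbf{p} + t\,\nabla V|_{\mathbf{p}} \quad \text{(normal line)}.$$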
It is sometimes convenient to eliminate the parameter t for the normal line, which we can do by taking
the cross product with ∇V |p :
(r − p) × ∇V|p = 0.
Note that the forms (r − p).n = 0 and (r − p) × n = 0, using the unit normal n, would give the same plane or
line (though in r = a + tn such a change alters the values of t for given points) so we need not calculate |∇V |
to get the tangent plane or normal line.
Repeated Note: to use this, you must evaluate ∇V at the point concerned.
Exercise 1.5. Find equations for the (i) tangent plane and (ii) normal line at the point P0 on each of the
surfaces:
Suppose now that we want to calculate the rate of change of V (r) in a particular direction specified by the
unit vector t. Let s be the distance travelled in the direction of t; then dr = t ds. So dV = ∇V.t ds. Hence we
can conclude that the rate of change of V in the direction of t is
$$\frac{dV}{ds} = \nabla V . \mathbf{t} = \mathbf{t} . \nabla V.$$
t.∇V is called the directional derivative. Now
$$\mathbf{t} . \nabla V = |\nabla V| \cos\theta,$$
where θ is the angle between the vectors ∇V and t. This is maximized when cos θ = 1, i.e. when θ = 0. Thus
V changes most rapidly in the direction of ∇V , and |∇V | is this most rapid rate of change. It is this property,
in the two-dimensional case, that gave rise to the name gradient, because |∇V | is the gradient of the surface
given by z = f (x, y) in that case. Correspondingly the maximum decrease is when t is opposite to ∇V .
Example 1.8. Find the directions in which the function f (x, y, z) = (x/y) − yz increases and decreases
most rapidly at the point P, (4, 1, 1).
We can describe the directions in which f increases and decreases most rapidly by specifying the unit
vectors in those directions. Now
$$\nabla f = \frac{1}{y}\mathbf{i} - \left(\frac{x}{y^2} + z\right)\mathbf{j} - y\,\mathbf{k} = (1, -5, -1) \text{ at } P.$$
The rate of change of f in the direction of unit vector t is ∇f.t. This has its maximum when t is in the same
direction as ∇f; so the direction t in which f increases most rapidly is
$$\frac{\nabla f}{|\nabla f|} = \frac{1}{\sqrt{27}}(1, -5, -1),$$
and the actual rate is $\sqrt{27}$. The rate of decrease of f is greatest, at $-\sqrt{27}$, when t is in the opposite direction,
i.e.
$$-\frac{1}{\sqrt{27}}(1, -5, -1).$$
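The result of Example 1.8 can be cross-checked with a finite-difference gradient. This is a sketch only: `numerical_gradient` is a helper invented here, and central differences are our choice of approximation, not something the notes prescribe.

```python
import math

def f(x, y, z):
    """The function from Example 1.8."""
    return x / y - y * z

def numerical_gradient(func, point, h=1e-6):
    """Central-difference estimate of the gradient of func at `point`."""
    grad = []
    for i in range(3):
        plus, minus = list(point), list(point)
        plus[i] += h
        minus[i] -= h
        grad.append((func(*plus) - func(*minus)) / (2 * h))
    return tuple(grad)

grad = numerical_gradient(f, (4.0, 1.0, 1.0))           # close to (1, -5, -1)
max_rate = math.sqrt(sum(g * g for g in grad))           # close to sqrt(27)
steepest_ascent = tuple(g / max_rate for g in grad)      # unit vector along grad f
```

The directional derivative along any other unit vector t is then the dot product of `grad` with t, which can never exceed `max_rate`.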
Exercise 1.6. Find the directional derivative of Φ at the point (1, 2, 3) in the direction of the vector
(1, 1, 1) where
$$\Phi = \frac{x^2}{3} + \frac{y^2}{9} + \frac{z^2}{27}.$$