NotesForCalc2,3 VectorCalculus 20230715
ALBERT M. FISHER
Contents
1. Review of Logic and Set Theory 2
2. Review of Linear Algebra 2
3. Review of the Riemann Integration 2
4. Vector Calculus, Part I: Derivatives and the Chain Rule 2
4.1. Metrics, open sets, continuity 2
4.2. Curves 4
4.3. Arc length of a curve 6
4.4. Level curves of a function. 8
4.5. Partial derivatives; directional derivative. 8
4.6. Properties of the gradient of a function. 11
4.7. Three types of curves and surfaces. 12
4.8. The gradient vector field; the matrix form of the tangent vector and of the gradient. 17
4.9. General definition of derivative of a map. 20
4.10. The general Chain Rule. 24
4.11. Level curves and parametrized curves. 25
4.12. Level surfaces, the gradient and the tangent plane. 27
4.13. Two definitions of the determinant. 30
4.14. Orientation 31
4.15. Three definitions of the vector product. 32
4.16. The Inverse and Implicit Function Theorems 37
4.17. Higher order partial derivatives. 40
4.18. Finding maximums and minimums. 42
4.19. The Taylor polynomial and Taylor series. 42
4.20. Lagrange Multipliers 47
5. Vector Calculus, Part II: the calculus of fields, curves and surfaces 48
5.1. Vector Fields 48
5.2. The line integral 53
5.3. Conservative vector fields 58
5.4. Rotations and exponentials; angle as a potential 64
5.5. Line integral with respect to a differential form 70
5.6. Green’s Theorem: Stokes’ Theorem in the Plane 71
Date: July 15, 2023.
Definition 4.5. If γ(t) represents the position of a particle at time t, then the derivative γ′ (the tangent vector) gives the velocity of the particle, v(t) = γ′(t), and the acceleration at time t is the vector a(t) = v′(t) = γ″(t). Note that all of these are vector quantities, having both a magnitude and a direction. The speed is the magnitude of the velocity vector, the scalar quantity ||v||.
Now we prove some basic facts about curves and their derivatives:
Proposition 4.3. (Leibniz' Rule for curves) Given two differentiable curves γ, η : [a, b] → Rᵐ, then (γ · η)′ = γ′ · η + γ · η′.
Proof. We just write the curves in coordinates and apply Leibniz' Rule (the Product Rule) for functions from R to R.
Proposition 4.4. Let γ be a differentiable curve in Rᵐ such that ||γ|| = c for some constant c. Then γ ⊥ γ′.
Proof. We use Leibniz' Rule. We have c² = ||γ||² = γ · γ, so for all t,
0 = (γ · γ)′ = γ′ · γ + γ · γ′ = 2γ · γ′,
using commutativity of the inner product.
The meaning of ||γ|| = c is intuitively clear: for R² this says that the curve lies on a circle; for R³, that the image of the curve lies in a sphere, and the statement is that the tangent vector to the curve is tangent to the sphere, as it is perpendicular to the position vector. See Fig. 5.
Corollary 4.5. If γ : [a, b] → Rᵐ is twice differentiable and ||γ′|| is constant, then γ′ ⊥ γ″.
Proof. We just apply the Proposition to the curve γ′.
Corollary 4.6. If γ : [a, b] → Rᵐ is twice differentiable and represents the position of a particle at time t, then if the speed ||γ′|| is constant, the acceleration is perpendicular to the curve (i.e. a ⊥ v).
In other words if you are driving a car at a constant speed around a track, the only
acceleration you will feel is side-to-side. If you apply the brakes or the accelerator
pedal, a component vector of acceleration tangent to the curve will be added to this.
If we reparametrize a curve to have speed 1, then the magnitude of the acceleration
vector can be used to measure how much it curves: we explain this next.
Proposition 4.4 allows us to make the following definition.
Definition 4.6. The curvature of a twice differentiable curve γ in Rⁿ at time t is defined as follows. For its unit-speed parametrization γ̂(s) we define the curvature at time s to be κ̂(s) = ||â(s)||; for γ, the curvature at time t is κ(t) = (κ̂ ∘ l)(t).
For example, the curve γ_r(t) = r(cos(t/r), sin(t/r)) has velocity γ_r′(t) = (−sin(t/r), cos(t/r)), which has norm one; the acceleration is γ_r″(t) = −(1/r)(cos(t/r), sin(t/r)) = −(1/r²)γ_r(t), with norm 1/r. The curvature is therefore 1/r. So if the radius of the next curve on the race
track is half as much, you will feel twice the force, since by Newton’s law, F = ma!
This is the physical (and geometric) meaning of the curvature.
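The computation above is easy to check numerically. The following sketch (ours, not from the notes; the helper names are our own) approximates the acceleration of the unit-speed circle γ_r by finite differences and confirms κ = 1/r:

```python
import math

def circle(r, t):
    # unit-speed parametrization of the circle of radius r: gamma_r(t) = r*(cos(t/r), sin(t/r))
    return (r * math.cos(t / r), r * math.sin(t / r))

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def second_derivative(f, t, h=1e-4):
    # componentwise central-difference approximation of f''(t)
    pm, p0, pp = f(t - h), f(t), f(t + h)
    return tuple((a - 2 * b + c) / h**2 for a, b, c in zip(pm, p0, pp))

for r in (0.5, 1.0, 2.0):
    kappa = norm(second_derivative(lambda t: circle(r, t), 1.0))
    print(r, kappa)  # kappa is close to 1/r: half the radius gives twice the curvature
```
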
∫_a^b ||γ′(t)|| dt.

∫_γ ds = ∫_a^b ||γ′(t)|| dt.

We claim that the new formula includes this one: parametrize the graph as a curve in the plane, γ(t) = (t, g(t)). Then γ′(t) = (1, g′(t)), so ||γ′(t)|| = √(1 + (g′(t))²), whence indeed the arc length is ∫_γ ds = ∫_a^b √(1 + (g′(t))²) dt, as claimed.
Proposition 4.7.
(i) The arc length of a curve is unchanged by any change of parametrization, independent of orientation. That is,
∫_{γ₁} ds = ∫_{γ₂} ds.
Proof. (i) Writing u = h(t), we have γ₂(u) = γ₂(h(t)) = γ₁(t). Then since du = h′(t) dt, and using the Chain Rule, we have:
∫_{γ₂} ds ≡ ∫_{u=c}^{u=d} ||γ₂′(u)|| du = ∫_{t=a}^{t=b} ||γ₂′(h(t))|| h′(t) dt.
Assuming first that h′ > 0, this equals (1)
∫_{t=a}^{t=b} ||γ₂′(h(t)) h′(t)|| dt = ∫_{t=a}^{t=b} ||(γ₂ ∘ h)′(t)|| dt = ∫_{t=a}^{t=b} ||γ₁′(t)|| dt = ∫_{γ₁} ds.
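A quick numerical sanity check of (i), with a midpoint Riemann sum and finite-difference tangents (our own helper, not part of the notes): the half circle parametrized in two different ways has the same length π.

```python
import math

def arc_length(gamma, a, b, n=20000):
    # midpoint Riemann sum for the integral of ||gamma'(t)|| dt over [a, b]
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * h
        d = 1e-6
        dx = (gamma(t + d)[0] - gamma(t - d)[0]) / (2 * d)
        dy = (gamma(t + d)[1] - gamma(t - d)[1]) / (2 * d)
        total += math.hypot(dx, dy) * h
    return total

gamma1 = lambda t: (math.cos(t), math.sin(t))          # half circle, t in [0, pi]
gamma2 = lambda u: (math.cos(u * u), math.sin(u * u))  # same image, u in [0, sqrt(pi)]
print(arc_length(gamma1, 0.0, math.pi))                # close to pi
print(arc_length(gamma2, 0.0, math.sqrt(math.pi)))     # close to pi as well
```
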
We next see how this can be used to give a unit-speed parametrization of a curve γ : [a, b] → Rⁿ. Set l(t) = ∫_a^t ||γ′(r)|| dr, so l(t) is the arc length of γ from time a to time t. Let us denote the length of γ by L. Thus the function l maps [a, b] to [0, L]. Note that l is a primitive (antiderivative) for ||γ′||, so l′(t) = ||γ′(t)||. We shall assume that ||γ′(t)|| > 0 for all t; in this case, the function l is invertible. Our parameter change will be given by its inverse, h(t) = l⁻¹(t); then h′ is also positive.
Proposition 4.8. Assume that ||γ′(t)|| > 0 for all t. Then the reparametrized curve γ̂ = γ ∘ h has speed one.
Proof. Now (l ∘ h)(t) = t, so 1 = (l ∘ h)′(t) = l′(h(t)) · h′(t). We have l′(t) = ||γ′(t)||, so l′(h(t)) = ||γ′(h(t))||. Thus, since h′(t) > 0,
||γ̂′(t)|| = ||(γ ∘ h)′(t)|| = ||γ′(h(t))|| · h′(t) = l′(h(t)) h′(t) = 1.
The function l maps [a, b] to [0, L], whence the parameter-change function h maps [0, L] to [a, b]. We keep t for the variable in [a, b] and define s = l(t), the arc length up to time t; so now s is the variable in [0, L], and h(s) = t.
The change of parameter gives γ̂(s) = (γ ∘ h)(s) = γ(h(s)) = γ(t). This indeed parametrizes the curve γ̂ by arc length s.
Note further that
∫_γ ds ≡ ∫_a^b ||γ′(t)|| dt = ∫_0^{l(b)} ||γ̂′(s)|| ds ≡ ∫_{γ̂} ds.
From s = l(t) we have ds = l′(t) dt = ||γ′(t)|| dt. Now we understand rigorously what ∫ ds is: it represents the infinitesimal arc length; this helps explain the notation ∫_γ ds for the total arc length.
∂F/∂y (p) = D_u(F)|_p.
See Fig. 4.
It is very easy to calculate the partial derivatives. For the partial with respect to x, we fix the variable y and find the derivative with respect to x alone.
For example, when F(x, y) = x²y³, then ∂F/∂x = 2xy³ while ∂F/∂y = 3x²y².
For F : Rⁿ → R the definitions are similar.
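The example F(x, y) = x²y³ can be checked concretely: the short script below (our illustration) compares central finite differences with the formulas 2xy³ and 3x²y²:

```python
F = lambda x, y: x**2 * y**3

def partial_x(F, x, y, h=1e-6):
    # fix y, differentiate in x alone
    return (F(x + h, y) - F(x - h, y)) / (2 * h)

def partial_y(F, x, y, h=1e-6):
    # fix x, differentiate in y alone
    return (F(x, y + h) - F(x, y - h)) / (2 * h)

x, y = 1.3, 0.7
print(partial_x(F, x, y), 2 * x * y**3)    # the two values agree
print(partial_y(F, x, y), 3 * x**2 * y**2) # likewise
```
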
4.6. Properties of the gradient of a function. For this, first we define: given a map F : Rᵐ → R, we define a vector at each point p ∈ Rᵐ,
∇F = (∂F/∂x₁, . . . , ∂F/∂xₘ),
called the gradient of F at p.
As we shall see in the next sections, the gradient has the following important
properties:
(1) This defines a vector field, called the gradient vector field of F .
(2) The gradient vector field is everywhere orthogonal to the level sets of F . These are
level curves for F : R2 → R and level surfaces for F : R3 → R. We prove this via the
Chain Rule, see §4.11. In general, the level sets are submanifolds, i.e. differentiable subsets of Rⁿ, of dimension (n − 1); this is a consequence of the Implicit Function Theorem. (Here we have to assume that ∇F ≠ 0.)
(3) The gradient points in the direction of steepest increase of the function F at the
point p.
(4) The directional derivative of F at p, in the direction of the unit vector u, is given
simply by the inner product:
Du (F )|p = ∇F |p · u.
(5) The gradient ∇F is the vector form of the derivative DF of the function F .
(6) The gradient can be used to easily write the equation of the tangent line to a
level curve of F : R2 → R at the point p = (x, y). For F : R3 → R, the gradient
can be used to write the equation of the tangent plane to a level surface at a point
p = (x0 , y0 , z0 ). We explain this below in §4.11.
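Property (3) can be illustrated numerically: scanning the directional derivatives of F(x, y) = x²y³ (the example used above) over many unit vectors, the maximum occurs in the direction of ∇F/||∇F||. A sketch, with our own helper names:

```python
import math

F = lambda x, y: x**2 * y**3
grad_F = lambda x, y: (2 * x * y**3, 3 * x**2 * y**2)
p = (1.0, 1.0)

def directional(F, p, u, h=1e-6):
    # central-difference approximation of the directional derivative D_u(F) at p
    return (F(p[0] + h * u[0], p[1] + h * u[1])
            - F(p[0] - h * u[0], p[1] - h * u[1])) / (2 * h)

# scan 3600 unit vectors and keep the direction of steepest increase
angles = [k * math.pi / 1800 for k in range(3600)]
best = max(angles, key=lambda a: directional(F, p, (math.cos(a), math.sin(a))))
u_best = (math.cos(best), math.sin(best))

g = grad_F(*p)
ng = math.hypot(*g)
print(u_best)               # close to the normalized gradient below
print((g[0] / ng, g[1] / ng))
```
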
4.7. Three types of curves and surfaces. In the course we actually encounter
three different (but related) types of curves and surfaces. First, recall:
Definition 4.8. For f : [a, b] → R, its graph is
graph(f) = {(x, f(x)) : x ∈ [a, b]},
which is the subset of the plane we usually draw for this. Similarly, for F : R² → R,
graph(F) = {(v, F(v)) : v = (x, y) ∈ R²}. Thus graph(F) = {(x, y, z) : z = F(x, y)}.
The different types of curves are:
(i) the graph of function f : R → R;
(ii) a level curve of a function F : R2 → R;
(i) Which lines in the plane, or planes in R3 , can (or cannot) be written as the graph
of an affine function as in (i)?
(ii) For which values of a, b or a, b, c in (ii) do you get a line or plane?
(iii) For which vectors v and v, w in (iii) do you get a line or plane?
(iv) Make sure you know how to go from one type of line (or plane) to the other,
whenever possible (see the Linear Algebra lecture notes and exercises)!
(v) Write each type of line or plane in matrix form.
Solution: We explain (ii). We claim that the equation Ax + By + C = 0 gives a line
in the plane R2 exactly when not both A, B are 0. Here we have to understand the
meaning of “gives the equation of a line in the plane.”
There are two important points:
(1) This means that we are in the plane, this is our Universe of Discourse (we are
talking only about points in the plane R2 , not about R or R3 ).
(2) “gives the equation of a line” means that the collection of all solutions to the
equation forms a line.
That is,
{(x, y) ∈ R2 : Ax + By + C = 0}
is a geometrical line in R2 .
It makes a huge difference what is our Universe of Discourse (i.e. what we are
talking about). For example, the equation x = 2 in R is a point, in R2 it is a vertical
line, {(x, y) : x = 2}, in R3 it is a vertical plane.
Now for Ax + By + C = 0 to be a line means that
{(x, y) ∈ R2 : Ax + By + C = 0}
is a line. Let us consider the case where B ≠ 0. Then this equation is equivalent, i.e. it has the same solutions, to
y = −(A/B)x − C/B = ax + b,
which we know is a line.
Next suppose A, B are both 0. Then we have
{(x, y) ∈ R2 : 0x + 0y + C = 0}
equivalently
{(x, y) ∈ R2 : C = 0}
and there are two cases:
(i) C = 0: the statement is true, hence is true for all (x, y), so the solution set is all of R²;
(ii) C ≠ 0: the statement is false, hence is false for all (x, y), so the solution set is the empty set.
This proves the Claim.
Planes are handled similarly.
Exercise 4.6.
(i) Given vector spaces V, W and a linear transformation T : V → W , prove that:
Proposition 4.9. The image of T , Im(T ) and the kernel (null space) of T , ker(T )
are (vector) subspaces of W, V respectively.
For a level curve there are two ways to approach finding the tangent line. The
first is to parametrize the level curve somehow and apply the previous case of a
parametrized curve.
Exercise 4.7. For F (x, y) = x2 + y 2 , the curve of level 1 is the unit circle, the
solutions of the equation (i.e. all pairs (x, y) which satisfy the equation)
x2 + y 2 = 1.
Find parametrizations for this level curve, and use that to find the tangent line at
a point.
Solution: We can parametrize this for example by the variable x. Then
y = ±√(1 − x²),
and we have two parametrized curves, with t = x, so γ(t) = (t, ±√(1 − t²)). This works at all points except where y′(x) = ∞, that is, x = ±1. If we instead parametrize it by y, then this works except where x′(y) = ∞, that is, for y = ±1. We can also parametrize the entire curve at once, by the angle θ, with
γ(t) = (cos t, sin t)
and t = θ.
When we parametrize by the variable x, we say the functions f(x) = √(1 − x²), f̃(x) = −√(1 − x²) are defined implicitly by the equation x² + y² = 1.
That is, they are explicit functions which are "implied" by the equation.
The Implicit Function Theorem describes when this can be done: basically, whenever the (partial) derivatives do not become infinite, as they do at the exceptional points above.
Given this, we can apply the formulas for the graph of a function, or for a curve in
the plane, to find the tangent line.
Finding the tangent line or plane: using the normal vector to find the
tangent space. The second way to find the tangent line to a level curve is to find a
normal vector to the curve. We explain this in §4.11.
Definition 4.9. Given F : Rⁿ → R, the tangent space T_p to the level set at the point p = (x₁, x₂, . . . , xₙ), for level c = F(p), is an affine subset of Rⁿ: all vectors v such that (v − p) is orthogonal to the gradient, n = ∇F_p.
We consider first the case of n = 2. We write the equation of the tangent line,
recalling that given a point p and a normal vector n = (A, B) then the line passing
through p and perpendicular to n is the collection of all x = (x, y) such that
n · (x − p) = 0.
Thus for n = (A, B) and x = (x, y) and p = (x0 , y0 ) then
(A, B) · (x − x0 , y − y0 ) = 0
giving the general equation for the line,
Ax + By + C = 0
where C = −n · p = −(Ax0 + By0 ).
Since ∇F = n = (∂F/∂x|_p, ∂F/∂y|_p), this gives the formula for the tangent line as
∂F/∂x|_p (x − x₀) + ∂F/∂y|_p (y − y₀) = 0. (3)
We know the formula for the tangent line to the graph of a function f : R → R is l(x) = f(p) + f′(p)(x − p). We can also use the normal vector method to find this formula in a second way. To do this we define F : R² → R by
F (x, y) = f (x) − y.
Then the level curve of level 0 gives f (x) − y = 0, so y = f (x) which is the graph of
f.
(Consider a simple example like f (x) = x2 to understand what is going on!)
Note that at the point p = (p, f(p)) we have ∇F_p = (f′(p), −1), so the formula n · (x − p) = 0, with p = (p, f(p)), gives
(f′(p), −1) · (x − p, y − f(p)) = 0,
so
f′(p)(x − p) − (y − f(p)) = 0,
so
y = f(p) + f′(p)(x − p)
as claimed.
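For a concrete instance (our own sketch, taking f(x) = x² as suggested above), one can check that every point of the tangent line y = f(p) + f′(p)(x − p) satisfies n · (x − p) = 0 with n = (f′(p), −1):

```python
f = lambda x: x**2
df = lambda x: 2 * x                     # f'
p = 1.5
n = (df(p), -1.0)                        # gradient of F(x, y) = f(x) - y at (p, f(p))
tangent = lambda x: f(p) + df(p) * (x - p)

for x in (-2.0, 0.0, 3.7):
    # n . ((x, y) - (p, f(p))) with y taken on the tangent line
    dot = n[0] * (x - p) + n[1] * (tangent(x) - f(p))
    print(dot)  # 0 up to rounding
```
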
Exercise 4.8. See Exercise 4.9 below.
4.8. The gradient vector field; the matrix form of the tangent vector and of the gradient. The gradient ∇F of a function F : Rᵐ → R gives an important example of a vector field. In general, a vector field V on Rᵐ is a function V from Rᵐ to Rᵐ.
As we mentioned above and shall prove in Proposition 4.15, the level curves of a
function F are orthogonal to the gradient vector field, so the gradient can help us
understand the level curves of F .
We draw the vector wv = V (v) based at each point v. See Fig. 12.
The tangent vector gives the first definition of derivative of a curve γ : R → Rⁿ, the vector form of the derivative; the second definition, the matrix form of the derivative, is the (n × 1) matrix, i.e. the column vector with those same entries:
Dγ = [x₁′(t), . . . , xₙ′(t)]ᵗ.
Thus Dγ : R → M_{n×1}.
For a function F : Rⁿ → R, the vector form of its derivative is the gradient ∇F. This has a matrix form, the row vector, i.e. the (1 × n) matrix with the same entries:
DF|_x = [∂F/∂x₁ . . . ∂F/∂xₙ].
Given γ : R → Rᵐ and F : Rᵐ → R, the composition is F ∘ γ : R → R, so we can take its derivative (F ∘ γ)′(t). The Chain Rule says we can compute this in a second way. In vector notation it states:
(F ∘ γ)′(t) = ∇F_{γ(t)} · γ′(t).
This is even simpler to remember in matrix notation, as we have the product of a row vector and a column vector. For example, with γ : R → R³ and F : R³ → R, we have
D(F ∘ γ)|_t = [F_x F_y F_z]|_{γ(t)} [x₁′(t), x₂′(t), x₃′(t)]ᵗ.
Exercise 4.9. F(x, y) = x²y³; γ(t) = (eᵗ, e^{t²}).
(1) Find (F ∘ γ)′(0).
First method (directly): f(t) = (F ∘ γ)(t) = e^{2t} e^{3t²}, so f′(t) = 2e^{2t} e^{3t²} + 6t · e^{2t} e^{3t²}, and f′(0) = 2.
Second method (Chain Rule): ∇F = (2xy³, 3x²y²) and γ′(t) = (eᵗ, 2t e^{t²}).
γ(0) = (1, 1), ∇F(1, 1) = (2, 3), γ′(0) = (1, 0).
So (F ∘ γ)′(0) = (2, 3) · (1, 0) = 2.
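The two methods can also be compared numerically; the sketch below (ours) approximates (F ∘ γ)′(0) by a central difference and evaluates the gradient–tangent inner product directly:

```python
import math

F = lambda x, y: x**2 * y**3
gamma = lambda t: (math.exp(t), math.exp(t * t))
grad_F = lambda x, y: (2 * x * y**3, 3 * x**2 * y**2)
dgamma = lambda t: (math.exp(t), 2 * t * math.exp(t * t))

h = 1e-6
direct = (F(*gamma(h)) - F(*gamma(-h))) / (2 * h)  # (F o gamma)'(0), numerically
g, v = grad_F(*gamma(0.0)), dgamma(0.0)
chain = g[0] * v[0] + g[1] * v[1]                  # gradient dot tangent vector
print(direct, chain)  # both are 2, up to the finite-difference error
```
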
Let us relate the above formula to the usual definition for a function f : R → R, that is,
f′(x) = lim_{h→0} (f(x + h) − f(x))/h = c.
This definition still works for curves, giving us the tangent vector. However for V of dimension larger than 1 this makes no sense, as we cannot take the ratio of two vectors.
Remark 4.4. Or nearly. Consider the following: given a linear map L : V → V with Lv = w, then in some sense
w/v = L :
the ratio "should be" a linear transformation!!
However L is not well-defined by this: many linear maps will solve the equation
Lv = w; it is only well-defined if V has dimension 1. What the definition of derivative
requires is that L works for all directions h, and this does make L well-defined.
Remark 4.5. Let us see what happens to the general definition for f : R → R. Then
lim_{h→0} (f(x + h) − f(x))/h = c
iff for each ε > 0, there exists δ > 0 such that for |h| < δ,
|(f(x + h) − f(x))/h − c| < ε,
or
|f(x + h) − f(x) − ch| / |h| < ε.
And this is now a special case of the general formula.
We introduce the notation L(V, W) for the collection of all linear transformations from V to W. If we choose bases for V = Rⁿ and W = Rᵐ, then L : Rⁿ → Rᵐ is represented by an (m × n) matrix. Then L(Rⁿ, Rᵐ) can be identified with the matrices M_{m×n} ≅ R^{mn}, so DF : Rⁿ → L(Rⁿ, Rᵐ) ≅ R^{mn}.
When considering F : Rⁿ → Rᵐ, both x ∈ Rⁿ and F(x) ∈ Rᵐ can be written in components, with x = (x₁, . . . , xₙ) and F(x) = (F₁(x), . . . , Fₘ(x)). We write the components of F as a column vector, so
F(x) = [F₁(x), . . . , Fₘ(x)]ᵗ.
Each component F_k is a function from Rⁿ to R, so it has a gradient, ∇F_k = (∂F_k/∂x₁, . . . , ∂F_k/∂xₙ). Now DF_k is by definition the corresponding row vector, so
DF_k = [∂F_k/∂x₁ . . . ∂F_k/∂xₙ].
We define the matrix of partials of F : Rⁿ → Rᵐ to be the (m × n) matrix with k-th row this gradient vector. Let us write [∇F_k] for the row vector DF_k.
The most basic cases are f : R¹ → Rᵐ and F : Rⁿ → R¹. The first is a curve, discussed above, and usually written γ : R → Rᵐ. The general formula then gives the matrix form of the tangent vector; since γ is a column vector, with
γ(t) = [x₁(t), . . . , xₘ(t)]ᵗ,
Dγ is the (m × 1) matrix with the same entries as the tangent vector γ′(t) = (x₁′(t), . . . , xₘ′(t)), so
Dγ|_t = [x₁′(t), . . . , xₘ′(t)]ᵗ.
The second type of map, F : Rⁿ → R, we call simply a function. The general formula above then gives the (1 × n) matrix:
DF|_x = [∂F/∂x₁ . . . ∂F/∂xₙ]|_x.
This row vector is the matrix form of the gradient ∇F, since as explained above,
∇F = (∂F/∂x₁, . . . , ∂F/∂xₙ).
As we shall see in Proposition 4.15, the level curves of a function F are orthogonal
to the gradient vector field.
An example of level curves is seen in Fig. 21.
Another basic theorem regarding derivatives is the relation to the matrix of partials:
Theorem 4.11. If for F : Rⁿ → Rᵐ all the partial derivatives ∂F_i/∂x_j exist and are continuous at p, then F is differentiable at p, and its derivative is the linear map given by the matrix of partials.
For a proof see Theorem 6.4 of Marsden's book [Mar74]. The theory of derivatives is very clearly carried out on pp. 158–185 of Marsden.
V --F--> W --G--> Z, with composition G ∘ F : V → Z;
V --DF|_x--> W --DG|_{F(x)}--> Z, with composition D(G ∘ F)|_x : V → Z.
The first example is γ : R → R³ and F : R³ → R, where we have seen the Chain Rule above; in matrix notation it is:
D(F ∘ γ)|_t = [F_x F_y F_z]|_{γ(t)} [x₁′(t), x₂′(t), x₃′(t)]ᵗ.
The product gives a (1 × 1) matrix, whose entry is a number.
In vector notation the Chain Rule is (F ∘ γ)′(t) = ∇F_{γ(t)} · γ′(t). This gives a second proof of an earlier result:
Proposition 4.14. Let γ be a differentiable curve in Rn such that ||γ|| = c for some
constant c. Then γ ⊥ γ 0 .
Proof. (Second proof, using the gradient) We define a function F : Rⁿ → R by F(x) = ||x||² = x · x = Σᵢ₌₁ⁿ xᵢ². Then since ||γ|| = c is constant, F ∘ γ = c² is constant, whence by the Chain Rule,
0 = (F ∘ γ)′(t) = ∇F(γ(t)) · γ′(t);
but F(x) = F(x₁, . . . , xₙ) = x₁² + · · · + xₙ², whence ∇F(x) = 2(x₁, . . . , xₙ) = 2x. Thus 0 = 2γ(t) · γ′(t), as claimed.
Directional derivative and the gradient.
The gradient gives us a simple way of calculating the directional derivative. Given F : Rⁿ → R, with gradient vector field ∇F, and given a unit vector u, the directional derivative of F at p in direction u is given simply by the inner product:
D_u(F)|_p = ∇F|_p · u.
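A brief numerical illustration (our sketch; the function F and the point are arbitrary choices) comparing the finite-difference directional derivative with the inner product ∇F|_p · u:

```python
import math

F = lambda x, y: x**2 * y**3
grad_F = lambda x, y: (2 * x * y**3, 3 * x**2 * y**2)
p = (1.2, 0.8)
theta = 0.9
u = (math.cos(theta), math.sin(theta))  # a unit vector

h = 1e-6
numeric = (F(p[0] + h * u[0], p[1] + h * u[1])
           - F(p[0] - h * u[0], p[1] - h * u[1])) / (2 * h)
g = grad_F(*p)
inner = g[0] * u[0] + g[1] * u[1]
print(numeric, inner)  # the inner-product formula matches the limit definition
```
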
4.11. Level curves and parametrized curves. There are two very distinct types
of curves we encounter here: the curves of this section, which are parametrized curves
(with parameter t = time), and the level curves of a function. Next we describe a link
between the two:
Proposition 4.15. Let G : R2 → R be differentiable and suppose γ : [a, b] → R2 is a
curve which stays in a level curve of G of level c. Then γ 0 (t) is perpendicular to the
gradient of G.
Proof. We have G(γ(t)) = c for all t. Hence (G ∘ γ)′(t) = 0 for all t. Then by the Chain Rule, 0 = D(G ∘ γ)|_t = DG|_{γ(t)} Dγ|_t. The derivatives here are matrices, with DG a (1 × 2) matrix (a row vector) and Dγ a column vector; in vector notation these are the gradient and tangent vector, so this gives 0 = (G ∘ γ)′(t) = (∇G)(γ(t)) · γ′(t), so ∇G|_{γ(t)} · γ′(t) = 0, telling us that the gradient is perpendicular to the tangent vector of the curve, as claimed.
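To see the proposition in action, take G(x, y) = x² + y² with the level-1 curve γ(t) = (cos t, sin t); the sketch below (ours) confirms that ∇G(γ(t)) · γ′(t) = 0:

```python
import math

grad_G = lambda x, y: (2 * x, 2 * y)          # gradient of G(x, y) = x^2 + y^2
gamma = lambda t: (math.cos(t), math.sin(t))  # stays in the level curve G = 1
dgamma = lambda t: (-math.sin(t), math.cos(t))

for t in (0.0, 0.7, 2.5):
    g, v = grad_G(*gamma(t)), dgamma(t)
    print(g[0] * v[0] + g[1] * v[1])  # 0: the gradient is perpendicular to the tangent
```
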
Example 2. (Dual hyperbolas) See Fig. 21, depicting level curves of the functions
F (x, y) = (x2 − y 2 ) and G(x, y) = 2xy.
Exercise 4.11. Plot the level curves of F for levels 0, 1, −1 and for G of levels 0, 2, −2.
Compute the gradient vector fields and find their matrices (they are linear!) Compare
to the earlier examples of linear vector fields.
These functions are related algebraically by a change of variables, u = (1/√2)(x − y), v = (1/√2)(x + y), and geometrically by a rotation R_{π/4}. To verify this we define the
4.12. Level surfaces, the gradient and the tangent plane. In Proposition 4.15
of §4.11 we showed that the gradient vector field of F : R2 → R is orthogonal to the
level curves of F . In fact something similar is true for any dimension. For the case
of R3 we get a new formula for the tangent plane, as we now explain.
Proposition 4.16. Let G : R3 → R be differentiable and suppose γ : [a, b] → R3 is a
curve such that the image of γ remains inside the level surface of level c, {(x, y, z) :
G(x, y, z) = c}. That is, for all t, G(γ(t)) = c. Then γ 0 (t) is perpendicular to the
gradient of G.
More generally this is true for higher dimensions, G : Rn → R.
Proof. We have G(γ(t)) = c for all t. Hence (G ∘ γ)′(t) = 0 for all t. Now by the Chain Rule, 0 = D(G ∘ γ)|_t = DG|_{γ(t)} Dγ|_t. DG is now a (1 × n) matrix and Dγ an (n × 1) column vector; in vector notation these are the gradient and tangent vector, so this gives
0 = (d/dt) c = (G ∘ γ)′(t) = (∇G)(γ(t)) · γ′(t).
Exercise 4.12. First we have a review problem from Linear Algebra: Recall that
the general equation for a plane in R3 is:
Ax + By + Cz + D = 0
where not all three of A, B, C are 0. Given a point p = (x0 , y0 , z0 ) and a vector
n = (A, B, C) then find the general equation of the plane through p and perpendicular
to n.
Solution: We know that the plane is the collection of all x = (x, y, z) such that
n · (x − p) = 0,
so for n = (A, B, C) and x = (x, y, z) and p = (x0 , y0 , z0 ) then
(A, B, C) · (x − x0 , y − y0 , z − z0 ) = 0
giving the general equation for the plane,
Ax + By + Cz + D = 0
where D = −n · p = −(Ax0 + By0 + Cz0 ).
See also Exercise 4.5.
Exercise 4.13. Given the function F(x, y, z) = x² + y² + z², find the tangent plane to its level surface (a sphere) at the point (1, 2, 3).
Solution: Note that F(1, 2, 3) = 14. Therefore this point is on the level surface of F of level 14. (This is the sphere about the origin of radius √14.)
Now the gradient of F is ∇F (x, y, z) = (2x, 2y, 2z). We know the gradient is
orthogonal to the sphere hence to the tangent plane. This normal vector (to both)
is n = ∇F (1, 2, 3) = (2, 4, 6). We are in the situation of the previous exercise: the
equation of the plane is
Ax + By + Cz + D = 0
where the normal vector is n = (A, B, C) = (2, 4, 6) and the plane passes through the
point p = (1, 2, 3).
The equation of the plane is therefore
n · (x − p) = 0
or
n · ((x, y, z) − p) = 0
so
(A, B, C) · (x − 1, y − 2, z − 3) = 0
giving
2x + 4y + 6z + D = 0
where
D = −n · p = −(2, 4, 6) · (1, 2, 3) = −(2 + 8 + 18) = −28,
so we have the plane with general equation
2x + 4y + 6z − 28 = 0.
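The arithmetic of this exercise is easy to double-check in a few lines (our sketch):

```python
grad_F = lambda x, y, z: (2 * x, 2 * y, 2 * z)  # gradient of F = x^2 + y^2 + z^2
p = (1, 2, 3)
n = grad_F(*p)                                   # normal vector (2, 4, 6)
D = -(n[0] * p[0] + n[1] * p[1] + n[2] * p[2])   # D = -n . p = -28
print(n, D)
# the point p itself lies on the plane 2x + 4y + 6z - 28 = 0:
print(n[0] * p[0] + n[1] * p[1] + n[2] * p[2] + D)  # 0
```
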
Exercise 4.14. We solve an exercise requested by a student.
Guidorizzi Vol 2 # 2 of §11.3, p. 204. [Gui02],
Find the equation of a plane which passes through the points (1, 1, 2) and (−1, 1, 1) and which is tangent to the graph of the function f(x, y) = xy.
Solution. We use normal vectors, as follows. The graph of f is
{(x, y, z) : z = f (x, y)}
which equals
{(x, y, z) : z = xy}
equivalently written
{(x, y, z) : xy − z = 0}.
This is the level surface of F : R3 → R defined by F (x, y, z) = xy − z. This has
gradient vector ∇F = (y, x, −1). Let p = (x0 , y0 , z0 ) denote the point where the
plane meets the graph. Then at the point p we have ∇Fp = (y0 , x0 , −1). We know
that the gradient is orthogonal to the level surfaces, in other words it is orthogonal
to the tangent plane to the surface at that point. So n = ∇Fp is a normal vector to
the tangent plane of the level surface at p. This gives us the equation for the tangent
plane
n · (x − p) = 0
so
(y₀, x₀, −1) · ((x, y, z) − (x₀, y₀, z₀)) = 0,
(y₀, x₀, −1) · (x − x₀, y − y₀, z − z₀) = 0,
so, using the fact that z₀ = x₀y₀ (since p lies on the graph),
y₀x + x₀y − z − x₀y₀ = 0.
Substituting the two given points: for (1, 1, 2) we get y₀ + x₀ − 2 − x₀y₀ = 0, and for (−1, 1, 1) we get −y₀ + x₀ − 1 − x₀y₀ = 0. Subtracting the second from the first,
2y₀ − 1 = 0,
y₀ = 1/2.
We now have from the first equation,
y₀ + x₀ − 2 − x₀y₀ = 0,
so
1/2 + x₀ − 2 − x₀/2 = 0;
multiplying by 2,
1 + 2x₀ − 4 − x₀ = 0,
x₀ = 3.
Thus z₀ = x₀y₀ = 3/2, giving the equation of the plane:
n · (x − p) = 0
with n = (y0 , x0 , −1) = (1/2, 3, −1) and p = (x0 , y0 , z0 ) = (3, 1/2, 3/2). Finally in
the form
Ax + By + Cz + D = 0
we have
1/2x + 3y − z − 3/2 = 0
or equivalently
x + 6y − 2z − 3 = 0.
To check our numbers we can verify that the three points are indeed on this plane.
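As the last line suggests, checking the numbers is mechanical; a small script (ours) verifies that the two given points and the tangency point lie on x + 6y − 2z − 3 = 0, and that the plane's normal is parallel to ∇F at the tangency point:

```python
plane = lambda x, y, z: x + 6 * y - 2 * z - 3     # x + 6y - 2z - 3 = 0
for q in [(1, 1, 2), (-1, 1, 1), (3, 0.5, 1.5)]:  # given points and tangency point p
    print(plane(*q))                              # all three values are 0

grad_F = lambda x, y, z: (y, x, -1)               # gradient of F(x, y, z) = xy - z
g = grad_F(3, 0.5, 1.5)                           # (1/2, 3, -1)
print([2 * c for c in g])                         # equals the plane's normal (1, 6, -2)
```
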
Remark 4.7. In these notes we have emphasized the role of three distinct ways of
presenting, or of viewing, the same object: for example a curve may be the graph of
a function, a level curve, or a parametrized curve. We wish to indicate how this fits
into a larger context, in other parts of mathematics.
First, here is a solution to part of Exercise 4.6: to write the image and kernel in
matrix form.
Consider
[2 3; 1 2; 4 5] [s; t] = s (2, 1, 4)ᵗ + t (3, 2, 5)ᵗ. (4)
Thus if we write the columns of a (3 × 2) matrix as v = (v₁, v₂, v₃), w = (w₁, w₂, w₃), we have more generally
[v₁ w₁; v₂ w₂; v₃ w₃] [s; t] = s v + t w, (5)
defining the map T : R2 → R3 where T (s, t) = sv + tw. This is a parametrized plane,
which in this case passes through 0.
Given a parametrized plane in R3 , we should be able to find the general equation.
To do this we bring in the vector product, which we next explain. But first, a few
words about the determinant!
Similarly we define the expansion along any row, Σⱼ₌₁ⁿ (−1)^{i+j} a_{ij} det A_{(ij)}, or indeed along any column.
It turns out these are all equal, giving the same number whatever row or column is chosen.
Note that this algorithm also works for the (2 × 2) case!
Geometric definition:
Definition 4.11. Let M be an (n × n) real matrix. Then
detM = (±1)(factor of change of volume)
where we take +1 if M preserves orientation, −1 if that is reversed. (Here this is
n-dimensional volume and so is length, area in dimensions 1, 2).
Theorem 4.17. The algebraic and geometric definitions are equivalent.
Proof. For A (2 × 2), note that the factor of change of volume is the area of the image
of the unit square, that generated by the standard basis vectors (1, 0) and (0, 1),
which equals the area of the parallelogram with sides the matrix columns, (a, c) and
(b, d).
Case 1: c = 0. Then the matrix is upper triangular and its determinant algebraically is ad. But the parallelogram area is (base)(height) = ad as well.
The formula area(parallelogram)= (base)(height) is usually proved by cutting off a
triangle vertically and shifting it to the other side, thus forming a rectangle of the same
base and height. Here is a different way to picture this: imagine the parallelogram is
a pile of horizontal layers, like a stack of cards, and straighten the pile to a vertical
pile by sliding the cards, ending up with the same (a × d) rectangle.
General Case: We reduce to Case 1 as follows, not by rotating (also possible!) but by sliding the far side of the parallelogram along the direction (b, d). A simple computation shows the area is indeed ad − bc.
Higher dimensions: We note that the above “sliding” operations can be done alge-
braically by an operation of column reduction, equivalently, multiplying on the right
by an elementary matrix of determinant one. This reduces to the upper diagonal
case, and beyond to the diagonal case if desired.
We observe that the same procedure works in R3 and beyond.
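The "sliding" operation corresponds to the algebraic fact that adding a multiple of one column to the other leaves ad − bc unchanged; a two-line check (ours):

```python
def det2(a, b, c, d):
    # algebraic determinant of the matrix with columns (a, c) and (b, d)
    return a * d - b * c

a, b, c, d = 3.0, 1.0, 2.0, 4.0
t = 5.0
# shear: replace column 2 by (column 2) + t * (column 1)
print(det2(a, b, c, d), det2(a, b + t * a, c, d + t * c))  # equal determinants
```
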
Theorem 4.18.
(i) det(AB) = det(A)det(B).
(ii) det(B −1 AB) = det(A).
Proof. Part (i) can be proved algebraically, but it is much easier to use the geometric
definition of determinant, that det(A) = (±1) · (factor of change of volume). (Since
u · (v ∧ w) = det(u₁ u₂ u₃; v₁ v₂ v₃; w₁ w₂ w₃). (6)
Taking u = v in (6), the determinant has a repeated row, so it follows that v · z = 0; similarly for w, proving (i). Recall that
det(u₁ u₂ u₃; v₁ v₂ v₃; w₁ w₂ w₃) = ± (volume of the parallelepiped spanned by u, v, w),
using the fact that det M = det Mᵗ, where the sign is + iff the map preserves orientation, since the parallelepiped is the image of the unit cube, and since from Theorem 4.19 we know the determinant gives ± (factor of change of volume).
Now taking u = z = v ∧ w in (6), we get
||z||² = z · z = det(z₁ z₂ z₃; v₁ v₂ v₃; w₁ w₂ w₃) ≥ 0,
so the orientation of (z, v, w) is positive. Using this, from the geometric definition of the determinant,
det(z₁ z₂ z₃; v₁ v₂ v₃; w₁ w₂ w₃) = vol(z, v, w),
where this means the volume of the parallelepiped spanned by the basis (if linearly independent) (z, v, w). Here we use the fact that we can exchange rows for columns, as det A = det Aᵗ. But since z is orthogonal to the base parallelogram, this volume is (base area)(height).
This gives
||z||² = (base area)(height) = (base area)||z||,
so ||z|| = (base area), as claimed. This concludes the proof that Def. (1) implies Def. (2).
It is clear that both Defs. (1), (2) imply (3), but knowing Def. (3) for the ba-
sis vectors i, j, k determines v ∧ w for all v, w, by bilinearity. Hence all three are
equivalent.
Corollary 4.20. We have the nice (and useful!) formula
||v ∧ w||2 = (v · v)(w · w) − (v · w)2 .
Proof. From Theorem 4.19 we know that
||v ∧ w||² = (area)² = (||v|| ||w|| |sin θ|)²,
and this is
||v||² ||w||² (1 − cos² θ) = ||v||² ||w||² − (||v|| ||w|| cos θ)² = ||v||² ||w||² − (v · w)².
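The corollary is easy to test on a sample pair of vectors (our sketch, using the standard component formula for v ∧ w):

```python
def cross(v, w):
    # component formula for the vector product v ^ w in R^3
    return (v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0])

def dot(v, w):
    return sum(a * b for a, b in zip(v, w))

v, w = (1.0, 2.0, 3.0), (-2.0, 0.5, 4.0)
lhs = dot(cross(v, w), cross(v, w))           # ||v ^ w||^2
rhs = dot(v, v) * dot(w, w) - dot(v, w)**2    # (v.v)(w.w) - (v.w)^2
print(lhs, rhs)  # both 162.5
```
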
We shall next see how the vector product satisfies three important properties, the
first two of which we have already proved:
Definition 4.12. A Lie bracket [x, y] on a vector space V is an operation on V (a
function from V × V to V ) which satisfies the axioms:
– bilinearity;
– anticommutativity: [y, x] = −[x, y];
– the Jacobi identity:
[x, [y, z]] + [y, [z, x]] + [z, [x, y]] = 0.
Proposition 4.21. The vector product v ∧ w on R3 is a Lie bracket, setting [v, w] =
v ∧ w.
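Proposition 4.21 can be spot-checked numerically: bilinearity and anticommutativity are clear from the component formula, and the Jacobi identity can be verified on sample vectors (our sketch):

```python
def cross(v, w):
    # component formula for the vector product v ^ w in R^3
    return (v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0])

def add3(u, v, w):
    return tuple(a + b + c for a, b, c in zip(u, v, w))

x, y, z = (1.0, 0.5, -2.0), (3.0, -1.0, 0.25), (0.0, 2.0, 1.0)
jacobi = add3(cross(x, cross(y, z)),
              cross(y, cross(z, x)),
              cross(z, cross(x, y)))
print(jacobi)  # (0, 0, 0) up to floating-point rounding
```
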
R² --T--> R³ --S--> R¹ (9)
The plane P is a set of points (x, y, z), which on the one hand is the image of the
map T , and on the other is a translate of the kernel of the map S by the vector p.
Level surfaces of different levels (that is, planes which are parallel, with different
constants D) fit together as described by Equation (5.10).
Remark 4.9. The important point in this is the following: The plane P is, by itself, simply a subset of points, a two-dimensional plane in R^3 (a translate of a two-dimensional subspace). However,
Equation (10) gives us two very different ways of viewing P : via the map T or the
map S.
Summarizing, P is the image of R2 via the map T . That is, the map T parametrizes
P ; thus via this map P becomes the parametrized plane Lp (s, t) = sv + tw + p.
On the other hand, via the map S, P is the preimage (inverse image) of a constant value −D. Thus it is seen to be a level surface of the map (of level −D), and is thus only one of a family of parallel planes, of different levels.
This also gives us insight as to the meaning of the diagram: it says something about
the object (in this case the plane P ) in the middle, from two different perspectives,
given by the two maps.
Again, this just reflects the difference between our two ways of understanding a
plane, as a parametrized plane, see Equation (21), or as the solution set of its general
equation. And this latter is, geometrically, a plane which passes through a point and
has a certain normal vector, n = (A, B, C).
This is the simplest case, of a line in the plane or a plane in space. The general
situation comes from these fundamental results of Linear Algebra:
Theorem 4.22. Given finite-dimensional vector spaces V, W let T : V → W be a
linear transformation. Then:
(i) the null space N(T) is a vector subspace of V;
(ii) the image Im(T) is also; and
(iii) dim(N(T)) + dim(Im(T)) = dim(V).
Exercise 4.15. Prove (i), (ii)! See Exercise 4.6.
Corollary 4.23. If T above is surjective, then dim(N (T )) = dim(V ) − dim(W ).
Before we describe the proof, we write it as a diagram, of linear transformations
on vector spaces:
K −−I−→ V −−T−→ W
Here the first map I is an injection I(v) = v, which just means that it is a 1 − 1
function (just the identity map in this case). Its image is the subspace Im(I) = K
which is the kernel of T , and the image of T is W . That is, the map T is onto.
For the previous example, the map I represents the plane K as a parametrized
subspace, while the map T gives its general equation.
In Algebra, a diagram of maps where the image of one map is the kernel of the
following map is called an exact sequence. In fact, the above diagram of vector spaces
extends to
{0} −→ K −−I−→ V −−T−→ W −−π−→ {0}
where I is the injection and π is the projection π(v) = 0. This extended diagram is also exact: exactness of the first part
{0} −→ K −−I−→ V
says that I is injective (1 − 1 to its image), since the kernel of I is then {0}, while exactness of the second part
V −−T−→ W −−π−→ {0}
tells us that the map T is onto (surjective), as the kernel of π is all of W, which by exactness is the image of T.
Back to the proof of the theorem, part (iii) can be proved by writing the map as
a matrix and solving the system of linear equations.
For example when m = 3 and n = 2, we have the following.
Given a matrix
$$M = \begin{pmatrix} A & B & C \\ D & E & F \end{pmatrix}$$
we have the matrix equation
$$Mv = w$$
where w is fixed, and M is fixed, and by the solution set of this equation we mean the collection of all v which satisfy this equation. Writing w = (s, t) we have
$$\begin{pmatrix} A & B & C \\ D & E & F \end{pmatrix}\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} s \\ t \end{pmatrix}. \qquad (11)$$
The multiplication v ↦ Mv defines a linear transformation T : R^3 → R^2. Note that Im(T) is equal to the column space of M, the subspace of R^2 generated by the columns of the matrix. This is simply because for a standard column basis vector e_k, M e_k gives the k-th column of M.
Note that the matrix equation (11) is equivalent to the “system of two linear equations in three unknowns”:
$$\begin{cases} Ax + By + Cz = s \\ Dx + Ey + Fz = t \end{cases}$$
This system has full rank iff the rows are linearly independent, iff the dimension of
the image Im(T ) is the maximum possible, in this case 2.
From Linear Algebra we can find the solution set explicitly by row reduction.
For a concrete example, after row reduction we may have
$$\begin{cases} x + y = s \\ 2z = t \end{cases}$$
and we are free to choose y (for this reason known as a “free variable”) but then no
longer free to choose x or z as these are determined, since x = −y + s and z = t/2.
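The structure of the solution set can be made concrete in a few lines of code; this sketch (mine; the values of s and t are arbitrary choices) shows that each value of the free variable y determines exactly one solution of the reduced system:

```python
# The solutions of x + y = s, 2z = t form a line in R^3 parametrized by the
# free variable y: x and z are determined by x = −y + s and z = t/2.
# The right-hand-side values s, t are arbitrary sample choices.

s, t = 5.0, 4.0

def solution(y):
    return (-y + s, y, t / 2)

for y in (-1.0, 0.0, 2.5):
    x, y_, z = solution(y)
    assert x + y_ == s and 2 * z == t  # every choice of y gives a solution
print(solution(0.0))  # (5.0, 0.0, 2.0)
```

The dimension count matches Theorem 4.22: the map R^3 → R^2 is onto, so the solution set of each equation Mv = w has dimension 3 − 2 = 1, one free parameter.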
The differentiable version of this result is that if DF (p) has full rank, then F −1 (q)
is a submanifold of R3 , of dimension 3 − 1 = 2. This means that it is a parametrized
surface.
The Implicit Function Theorem moreover gives conditions when given an equation
F (x1 , . . . , xn ) = 0
we can solve for one of the variables, and use the rest as our parameters. For the simplest example, $F(x, y) = x^2 + y^2 = 1$ becomes $y = \pm\sqrt{1 - x^2}$.
Just as for matrices, the Implicit Function Theorem has a version for F : Rm → Rn
whenever m ≥ n. The case m = n is indeed the Inverse Function Theorem!
Definition 4.14. F : R^m → R^n is continuously differentiable (of class C^1) iff the derivative DF at each point p exists and the matrix DF_p is a continuous function of p.
Given an open set U ⊆ Rm , a function F : U → V = F (U) is invertible iff there
exists Fe defined on V such that Fe ◦ F is the identity on U and F ◦ Fe is the identity
on V.
A submanifold M ⊆ R^m of dimension d < m is the following: there exists U ⊆ R^d and Φ : U → M surjective and C^1 such that DΦ is injective at each point of U, with image a linear subspace of dimension d.
Theorem 4.25. (Implicit Function Theorem) Let F : R^m → R^n be C^1, with m ≥ n. Let q = F(p) and suppose the matrix DF_p is surjective. Then the set F^{−1}(q) is a submanifold of R^m of dimension m − n. That is, for each x ∈ F^{−1}(q) there exists an open subset U ⊆ R^{m−n} and a C^1 map H : U → R^m such that F ∘ H(u) = q for all u ∈ U.
Proof. See [War71] Theorem 1.38, p. 31.
For example, if F : R^3 → R, then the level surface F^{−1}(q) is a parametrized surface: its parametrization near the point x is given by the map H.
Remark 4.11. Similarly to the linear case as explained in Remark 4.9, in the smooth case the same set (an embedded manifold) is viewed in two different ways, by means of the two maps: for one it is the image, for the other part of the domain. The first parametrizes the manifold; the second places it as a level curve, surface or manifold of a map on the higher-dimensional space, and thus shows how it is but one of a family of such “parallel” manifolds. Such a family is a special mathematical object known as a foliation.
Thus the level surfaces of a function F : R3 → R foliate R3 , and in the special case
of F linear, F is given by the inner product with a normal vector, and the foliation
consists of all those parallel planes.
Thus a parametrized m-dimensional manifold in R^n is a map α : U → M ⊂ R^n where U is a connected open subset of R^m and α is differentiable and invertible with image M.
The higher dimensional version of level curves and surfaces can be stated as follows. Given f : R^{m+1} → R, if f is differentiable and surjective and Df is everywhere onto (one says Df is of maximal rank), then for any value c in the image of f the set M = f^{−1}(c) is locally a parametrized m-dimensional manifold. Moreover this holds when that condition holds just at the relevant points (not necessarily all points): a value y = f(p) is called a regular value if Df|_p is of maximal rank at every p with f(p) = y, and then f^{−1}(y) is again locally a parametrized m-dimensional manifold.
Proposition 4.26. (Lemmas 1,2 of Chapter 2 of [MW97]) If f : M → N is a
smooth map between manifolds of dimension m ≥ n, and if y ∈ N is a regular
value, then the set f −1 (y) is a smooth manifold of dimension m − n. The null space
of Dfx : T Mx → T Ny is the tangent space of this submanifold, and its orthogonal
complement is mapped onto T Ny .
One then has a similar diagram
f^{−1}(y) −−α−→ M −−f−→ N
where the first map is injective and the second is surjective. When one considers the derivative maps then one gets an exact diagram as in the linear case:
{0} −→ R^{m−n} −−Dα|_x−→ TM_x −−Df_x−→ TN_y −→ {0}
For the differentiable case, there are many versions of these theorems. For an
introduction see Lemmas 1,2 of Chapter 2 of [MW97] and for surfaces Proposition 2
of Chapter 2 of [DC16]. For a simple and beautiful general statement see Theorem
1.39 of [War71].
Remark 4.12. A nice simpler version with examples is on pp. 239-240 of Vol. II of
[Gui02]. See also my handwritten Notas de Aula.
More on the Implicit and the related Inverse Function Theorems can be found e.g. in §7.2–4 of [Mar74], and in Chapter 2.10 and on p. 729 of [HH15]. We next consider some examples.
Example 3. F (x, y) = x2 + y 2 , the curve of level 1 is the unit circle, the solutions of
the equation (i.e. all pairs (x, y) which satisfy the equation)
x2 + y 2 = 1.
Higher derivatives.
For the second-order approximation we add a term involving the second-order par-
tial derivatives and so on. This gets more and more complicated as we describe
next.
The map F is called C 0 iff it is continuous, and C k iff the k th derivative exists and
is continuous. For this we need to define higher derivatives.
Writing L(V, W ) for the collection of linear transformations from V to W , then
this is a Banach space with the operator norm. Since DF : V → L(V, W ), then
we see that the second derivative is a linear map D2 Fx : V → L(V, W ) and thus
D2 F : V → L(V, L(V, W )), and so on.
In the same way the second, third derivatives are defined, with matrices of increas-
ing size.
The only exception is when n = 1, for a curve γ : [a, b] → R^m: in this case (as noted above) γ′ is also a curve in R^m, thus so is γ′′ = (γ′)′, etcetera. By contrast, for a function F : R^n → R the gradient ∇F : R^n → R^n is a vector field, so DF : R^n → R^n, but then the second derivative is no longer a vector field, as D^2F : R^n → R^n × R^n ∼ R^{n^2}, D^3F : R^n → R^{n^3}, and so on, getting more and more complicated.
A domain is an open subset of R^n. A vector field on a domain U is simply such a map defined only on the subset U. The vector field is termed C^k, for k ≥ 0, iff the map has those properties (again, C^0 means continuous, and C^k that D^kF exists and is continuous, so D : C^{k+1} → C^k).
$$G_y = \frac{\partial}{\partial y}(G) = \frac{\partial}{\partial y}(F_x) = \frac{\partial F_x}{\partial y} = (F_x)_y = F_{yx}.$$
(This notation can be confusing since $F_{yx} = (F_x)_y$!)
Now for G = F_x we have G : R^2 → R. This has as its gradient
$$\nabla G = (G_x, G_y) = (F_{xx}, F_{yx}).$$
Similarly for $\widetilde G = F_y$ its gradient is
$$\nabla\widetilde G = (\widetilde G_x, \widetilde G_y) = (F_{xy}, F_{yy}).$$
The meaning of this symmetric matrix becomes clear when discussing Taylor poly-
nomials of order 2, and finding maximums and minimums.
(2) If it is a maximum then it must be a maximum for the function restricted to the line y = y_0 (varying x). We can then consider the second partials and use the second derivative test from Calculus 1: if F_{xx} > 0 then F_x is increasing, so the point is a minimum along that line. This does not necessarily mean it is a minimum off the line.
However there is a fuller method: see Guidorizzi Vol 2 §16.3 [Gui02], and §3.6 of
[HH15].
(i) If F_{xx} > 0 at p and the Hessian H(p) > 0 then p is a local minimum.
(ii) If F_{xx} < 0 at p and the Hessian H(p) > 0 then p is a local maximum.
(iii) If H(p) < 0 then p is a saddle point; thus it is neither max nor min.
(iv) If H(p) = 0 then we cannot say from this test and have to look more closely.
Exercise 4.16. Compare the above tests for the functions we have encountered:
F (x, y) = x2 + y 2 , F (x, y) = x2 − y 2 , F (x, y) = xy.
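One way to work Exercise 4.16 is numerically; the following sketch (mine, not from the notes) estimates the second partials at (0, 0) by central differences and applies the test (i)–(iv) above:

```python
# Classify the critical point (0,0) of each function using H = Fxx*Fyy − Fxy^2,
# estimating second partials by central finite differences (step h is a choice).

def second_partials(F, p=(0.0, 0.0), h=1e-4):
    x, y = p
    Fxx = (F(x + h, y) - 2 * F(x, y) + F(x - h, y)) / h**2
    Fyy = (F(x, y + h) - 2 * F(x, y) + F(x, y - h)) / h**2
    Fxy = (F(x + h, y + h) - F(x + h, y - h)
           - F(x - h, y + h) + F(x - h, y - h)) / (4 * h**2)
    return Fxx, Fyy, Fxx * Fyy - Fxy**2

def classify(F):
    Fxx, Fyy, H = second_partials(F)
    if H > 1e-6:
        return "local min" if Fxx > 0 else "local max"
    if H < -1e-6:
        return "saddle"
    return "inconclusive"

print(classify(lambda x, y: x**2 + y**2))  # local min
print(classify(lambda x, y: x**2 - y**2))  # saddle
print(classify(lambda x, y: x * y))        # saddle
```

For x^2 + y^2 one finds H ≈ 4 with F_{xx} ≈ 2 (a minimum); for x^2 − y^2 and xy one finds H ≈ −4 and H ≈ −1 respectively, so both have saddle points, agreeing with the discussion above.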
4.19. The Taylor polynomial and Taylor series. Taylor series in one dimen-
sion. Given a map f : R → R, the terminology “ k th -order approximation” to f at a
point x ∈ R comes from the Taylor polynomials and Taylor series. The best k th -order
approximation at x ∈ R is the polynomial of degree k which best fits the map near
that point. This is the polynomial which has all the same derivatives at that point,
up to order k.
Thus the best 0th-order approximation of f at p ∈ R is the constant map with the value at that point: the map x ↦ f(p). To get the best first-order approximation we add on the linear map given by the derivative f′(p).
This is the affine map
x 7→ f (p) + f 0 (p)(x − p)
whose graph is the tangent line to the graph of f at that point.
For a function f : R → R, we define a sequence of polynomials, each of degree n, which approximate this function better and better as n → ∞. For this we choose a point about which we make the approximation, and call this the Taylor polynomial about x_0. Here for simplicity we work with x_0 = 0, and note that the Taylor polynomials in this case are also called Maclaurin polynomials.
Let us recall that a polynomial of degree n is
p(x) = a0 + a1 x + · · · + ak xk + · · · + an xn .
For f(x) = e^x the Taylor polynomial of degree n is
$$p_n(x) = 1 + x + x^2/2! + x^3/3! + \cdots + x^n/n!$$
and the Taylor series is
$$e^x = 1 + x + x^2/2! + x^3/3! + \cdots + x^n/n! + \cdots = \sum_{k=0}^{\infty} \frac{x^k}{k!}. \qquad (12)$$
For f(x) = sin x we have $p_{2n+1}(x) = x - x^3/3! + x^5/5! - \cdots + (-1)^n x^{2n+1}/(2n+1)!$, and for f(x) = cos x, $p_{2n}(x) = 1 - x^2/2! + x^4/4! - \cdots + (-1)^n x^{2n}/(2n)!$.
Exercise 4.17. Check that for e^x the derivative of p_{n+1} is p_n, and that the derivative of p_{n+1} for sin(x) is p_n for cos(x). This agrees with (e^x)′ = e^x and (sin)′ = cos, (cos)′ = −sin!
The definition of the Taylor series for a differentiable function (about 0) is
$$\sum_{k=0}^{\infty} a_k x^k$$
where $a_k = f^{(k)}(0)/k!$ (here $f^{(k)}(0)$ is the k-th derivative of f at 0).
Exercise 4.18. (1) Check that this general formula does give the above Taylor series
for ex , sin(x) and cos(x).
(2) Show that the polynomial pn has the same derivatives as f at x = 0, of order
0, 1, . . . , n. That is, pn (0) = f (0), p0n (0) = f 0 (0), p00n (0) = f 00 (0), and so on.
To define the Taylor polynomials about x_0 we simply replace x by (x − x_0) and the derivatives at 0 by those at x_0. Thus the Taylor polynomial of degree n about x_0 is
$$p_n(x) = \sum_{k=0}^{n} \frac{f^{(k)}(x_0)}{k!}\,(x - x_0)^k.$$
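As a small numerical illustration (mine, not from the notes; the point x = 1 is an arbitrary choice), the partial sums of (12) converge quickly to e^x:

```python
import math

# The degree-n Taylor polynomial of e^x about x0 = 0 is p_n(x) = sum x^k / k!;
# the approximation error shrinks rapidly as n grows.

def taylor_exp(x, n):
    return sum(x**k / math.factorial(k) for k in range(n + 1))

x = 1.0
errors = [abs(taylor_exp(x, n) - math.exp(x)) for n in (2, 5, 10)]
print(errors[0] > errors[1] > errors[2])  # True: the error shrinks as n grows
```

At x = 1 the errors are roughly 2e-1, 1.6e-3, and 2.7e-8 for n = 2, 5, 10, reflecting the factorial decay of the series coefficients.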
The Taylor polynomial in higher dimensions.
We can understand the role of the Hessian matrix and Hessian determinant much
better by explaining how to define the Taylor polynomial of a function of 2 variables.
Consider F : R^2 → R. Then the best 0th-order approximation about 0 = (0, 0) is the constant function with the value F(0). The 1st-order approximation is the tangent plane to the graph at 0. The best 2nd-order approximation may be a paraboloid but could instead be a hyperbolic paraboloid, Fig. 2. This depends on the partial derivatives of order 2 at the point.
The Taylor polynomials pn will be functions of two variables (x, y), thus pn : R2 →
R. Just as for one dimension, a polynomial of degree n is a linear combination of
basic terms of degree k for k = 0 (a constant) up to k = n. Each basic term of degree
k will be of the form xi y j such that (i + j) = k. Thus for example the degree of x2 y 3
is 2 + 3 = 5, and the basic polynomials of degree 1 are p(x, y) = x, p(x, y) = y and of
order 2 are p(x, y) = x2 , p(x, y) = y 2 and p(x, y) = xy.
Taking a linear combination of terms of degree ≤ n gives a polynomial of degree
n, for example
p(x, y) = 1 + x + 3y + x2 + y 2 + 5xy
has degree 2.
Consider for example p(x, y) = x^2 + y^2. Its graph is a paraboloid, while the graph of p(x, y) = xy is a hyperbolic paraboloid. See Figs. 3, 2.
Both these polynomials have degree 2. Both have horizontal tangent plane at 0.
The first has a minimum there while the second is a saddle point hence neither max
nor min. When x = y we have F (x, y) = xy = x2 , an upward parabola, so a minimum
along the line x = y. When x = −y we have F (x, y) = −x2 , so a maximum. Thus
(0, 0) can be neither max nor min. This is the essence of a saddle point.
In fact the terms of order 2 can be understood with the help of a symmetric matrix.
Definition 4.16. A quadratic form on R2 is a function of the form
Q(x, y) = ax2 + by 2 + cxy.
That is, it is a linear combination of the possible terms of degree 2.
Proposition 4.30. Given a quadratic form Q, there is a symmetric (2 × 2) matrix A such that for $v = \begin{pmatrix} x \\ y \end{pmatrix}$, then
$$Q(v) = v^t A v.$$
That is,
$$Q(v) = \begin{pmatrix} x & y \end{pmatrix} A \begin{pmatrix} x \\ y \end{pmatrix}.$$
Proof. In fact, for $A = \begin{pmatrix} a & c/2 \\ c/2 & b \end{pmatrix}$ we have
$$\begin{pmatrix} x & y \end{pmatrix}\begin{pmatrix} a & c/2 \\ c/2 & b \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = ax^2 + by^2 + cxy = Q(x, y).$$
Exercise 4.19. Check that when $A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ then Q(x, y) = 2xy. What do we get for $A = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$? For $\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$?
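Proposition 4.30 is easy to verify numerically; in this sketch (mine; the coefficients a, b, c are arbitrary sample values) we check that the polynomial and the matrix expression v^t A v agree on several vectors:

```python
# Q(x, y) = ax^2 + by^2 + cxy agrees with v^t A v for A = [[a, c/2], [c/2, b]].
# The coefficients below are arbitrary sample choices.

a, b, c = 3.0, -2.0, 4.0
A = [[a, c / 2], [c / 2, b]]

def Q(x, y):
    return a * x**2 + b * y**2 + c * x * y

def quad_form(A, v):
    # v^t A v, written out for a 2x2 matrix
    x, y = v
    return (x * (A[0][0] * x + A[0][1] * y)
            + y * (A[1][0] * x + A[1][1] * y))

for v in [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, -3.0)]:
    assert abs(Q(*v) - quad_form(A, v)) < 1e-12
print("agree")
```

Splitting the cross term c evenly between the two off-diagonal entries is exactly what makes A symmetric.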
To graph a quadratic form we have the following:
Theorem 4.31. A quadratic form
$$Q(x, y) = ax^2 + by^2 + cxy = \begin{pmatrix} x & y \end{pmatrix} A \begin{pmatrix} x \\ y \end{pmatrix}$$
has either
(i) a local min or max at 0, if detA > 0;
(ii) a saddle point at 0, if detA < 0.
If detA = 0, we cannot tell from this test.
Proof. (Sketch) From Linear Algebra, a symmetric matrix A can be diagonalized: there exists an orthogonal matrix U such that U^{−1}AU = D where D is diagonal. Now an orthogonal matrix is a rotation, a reflection, or a product of these. That does not change whether 0 is a saddle point, max or min. Also, det D = det U^{−1}AU = det A. This proves that det A is the product of its eigenvalues, since the eigenvalues of A and D are the same. The graph of the quadratic form defined by D has the two types described, completing the proof. See the above examples.
Proof. Since ∇G_p ≠ 0, the linear transformation (the matrix with those entries) DG_p is surjective, which allows us to use the Implicit Function Theorem, Theorem 4.25. So the level set B has a parametrization. Calling one of the coordinates t, we have a curve γ(t) in B which passes through p at time 0, and such that given any chosen vector v tangent to B, γ′(0) = v ≠ 0.
Then G(γ(t)) = c for all t, so D(G ∘ γ)(t) = 0. By the Chain Rule this is
$$D(G \circ \gamma)(t) = DG_{\gamma(t)}\, D\gamma(t) = \nabla G_{\gamma(t)} \cdot \gamma'(t).$$
In particular for t = 0, ∇G_p · γ′(0) = 0.
On the other hand, F has a local maximum at p, so in particular, F ◦ γ(t) has a
local maximum at t = 0.
Therefore,
D(F ◦ γ)(0) = DFp Dγ(0) = ∇Fp · γ 0 (0) = 0.
Since v = γ 0 (0) 6= 0 was any tangent vector to B, both ∇Fp and ∇Gp are orthog-
onal to any such v. Thus they must be multiples (think for example of a level curve,
or a level surface).
Remark 4.14. Note that the derivatives are 0 for two completely different reasons:
that G ◦ γ is constant, and that F ◦ γ has a maximum.
Note that it is possible for λ to be 0, and also possible for ∇Fp to be 0. However for
the proof ∇Gp must be nonzero to be able to apply the Implicit Function Theorem.
5. Vector Calculus, Part II: the calculus of fields, curves and
surfaces
5.1. Vector Fields. In Part I we have already encountered the gradient vector field. Here is the general setting:
Definition 5.1. A continuous vector field is a continuous function V : Rm → Rm .
A linear or a differentiable vector field on Rm is simply a linear or differentiable such
function.
The reason we call this a vector field rather than just a function is because of the
special way in which we visualize this. Note that for m ≥ 2 we cannot draw the graph
of a vector field, as we would need too many dimensions! Indeed the graph of V is (by
definition) the collection of all ordered pairs (v, V (v)), a point in Rm × Rm = R2m so
for R2 , to draw the graph of the vector field would require four dimensions.
Instead, we picture the vector field by drawing the vector wv = V (v) based at each
point v. See Fig. 12. We can imagine this field represents the velocity field of a liquid
or gas, showing its motion.
Exercise 5.1. Sketch the following linear vector fields V(x, y) = (ax + by, cx + dy), given by the matrix $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ acting on column vectors, that is:
$$A \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} ax + by \\ cx + dy \end{pmatrix},$$
Figure 13. A time-varying velocity vector field: the wind at the sur-
face of the Earth, from nullschool.net
In fact, given a C 2 vector field, we can always find the corresponding flow. This
is the content of the Fundamental Theorem of Ordinary Differential Equations: in
Rn , any C 2 vector field is tangent to a unique family of curves, meaning that there
exists a unique curve γ through a point p tangent to the vector field V : Rn → Rn ,
and furthermore, these can be put together as a flow. Conversely, any such family of
curves can be differentiated (by finding the tangent vector at each point) to give the
vector field. Finding the curves from the vector field is called integration, and the
curves are called integral curves of the vector field.
Remark 5.2. In the above situation, we call the vector field a velocity vector field as
its value is the tangent vector (velocity) v of a solution curve γ(t), since V (γ(t)) =
γ 0 (t) = v(t).
Now instead of a velocity field, vector fields can also depict a force field, such as
gravity or an electric field.
By a force field we mean the following. The differential equation is now a second-
order vector ODE as it involves the second derivative: for example, F (γ(t)) = mγ 00 (t),
which expresses Newton’s Law F = ma, where γ(t) is the position, γ 0 (t) = v(t) the
velocity, a(t) = γ′′(t) the acceleration, and m > 0 is the mass of the object. Here we
need two initial conditions: initial position γ(0) and initial velocity γ 0 (0) = v(0); our
Fundamental Theorem then guarantees that we will again have a unique solution.
Further possible interpretations are for example that F represents a magnetic field,
or an area element of a surface as the covector for a two-form. But the first two are
certainly the most common and important for our intuition.
It is possible that the force on the object also depends on its velocity; in that case,
this is given by a vector field F where mγ 00 (t) = F (γ(t), γ 0 (t)). This is the case for a
charged particle moving in a magnetic field.
The definition of a second-order vector DE in R^n is just that: we are given F : U × R^n → R^n which is C^1; then
$$\gamma''(t) = F(\gamma(t), \gamma'(t)). \qquad (14)$$
In the time-varying case it would be
γ 00 (t) = F (t, γ(t), γ 0 (t)). (15)
In the first case, F only depends on position so is a vector field on U ⊆ R^n. In the last case it is a vector field on R^{1+2n}, however with values in R^n ⊆ R^{1+2n}.
In fact, all higher-order vector DEs can be converted into first-order vector DEs;
if the order is k, we need k times the dimension. Thus for a second-order vector DE
in Rn , so with dimension n, to write it as a first-order system we simply include the
vector γ 0 as a new variable, giving a new solution curve η = (γ, γ 0 ) in dimension 2n.
Furthermore, time-dependent vector fields, so-called nonstationary or nonautonomous
DEs, can be seen in this context by adding one more parameter (time). See Fig. 13.
Thus all DEs can be interpreted geometrically, as finding integral curves (and flows)
to a velocity vector field.
The electrostatic fields in Figs. 25, 31, 26, are gradient vector fields: depicted are
two families of curves, orthogonal to each other; the electrostatic field is tangent to the
lines of flux between the charges. (Even though they are actually force, not velocity
fields, it is useful to picture them as velocity fields). The curves going around the
charges are the level curves of the electrostatic potential function Φ. Thus F = ∇Φ
is the electrostatic field. Not all vector fields are gradient fields for some potential;
below we find conditions such that this important property holds.
5.2. The line integral. Given a vector field F on R^n, the line integral of F along γ is
$$\int_\gamma F \cdot d\gamma \equiv \int_a^b F(\gamma(t)) \cdot \gamma'(t)\,dt.$$
A line integral gives a weight at each point of the curve which depends not only
on the location γ(t) but also on the direction, γ 0 (t) with respect to F (γ(t)): if these
two vectors are aligned it gets a positive weight, if opposed it is negative, and if
perpendicular it is zero. If for example F gives a force field, then the dot product
measures the amount of work needed to move in that direction. Thus an ice skater
glides on the ice doing no work, because the plane of the frozen lake is perpendicular
to the direction of gravity.
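The defining formula can be approximated directly by a Riemann sum; in this sketch (mine; the field and curve are my choices, not from the notes) we integrate F(x, y) = (−y, x) around the unit circle, where F(γ(t)) · γ′(t) ≡ 1 so the line integral is 2π:

```python
import math

# Approximate ∫_γ F · dγ = ∫_a^b F(γ(t)) · γ'(t) dt by a midpoint Riemann sum,
# for F(x, y) = (−y, x) along γ(t) = (cos t, sin t), t in [0, 2π].

def line_integral(F, gamma, dgamma, a, b, n=100000):
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * h          # midpoint of each subinterval
        Fx, Fy = F(*gamma(t))
        dx, dy = dgamma(t)
        total += (Fx * dx + Fy * dy) * h
    return total

F = lambda x, y: (-y, x)
gamma = lambda t: (math.cos(t), math.sin(t))
dgamma = lambda t: (-math.sin(t), math.cos(t))

val = line_integral(F, gamma, dgamma, 0.0, 2 * math.pi)
print(abs(val - 2 * math.pi) < 1e-6)  # True
```

Since the field is everywhere aligned with the direction of travel, every point contributes a positive weight, matching the sign discussion above.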
The line integral can also be interpreted as the integral of the curve with respect to a one-form, the one-form dual to the vector field, just as the dual space V^* is dual to V. We return to this below.
Figure 17. Equipotential curves and lines of force for the electrostatic field of two like charges in the plane. For a gravitational potential this would be a topographic map showing either two mountains or two valleys. Note that there is a saddle (hyperbolic) point between the two.
Figure 18. Equipotential curves and lines of force for the electrostatic field of two like charges in the plane, showing a closeup view of the saddle point in the center. Note that this approximates the dual hyperbolas of Fig. 21.
We have already seen this special case above in Part I. (So there is some overlap
here with our earlier discussion!)
For an example we already know from first semester Calculus, consider a function g : [a, b] → R and its graph {(x, g(x)) : a ≤ x ≤ b}. We know from Calculus that the arc length of this graph is
$$\int_a^b \sqrt{1 + (g'(x))^2}\, dx.$$
We claim that the new formula includes this one: parametrize the graph as a curve in the plane, γ(t) = (t, g(t)). Then γ′(t) = (1, g′(t)) so $\|\gamma'(t)\| = \sqrt{1 + (g'(t))^2}$, whence indeed $\int_\gamma ds = \int_a^b \sqrt{1 + (g'(t))^2}\, dt$ as claimed.
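The claim can be confirmed numerically; this sketch (mine; the choice g(x) = x^2 is arbitrary) compares the integral formula with the length of a fine polygonal approximation of the graph:

```python
import math

# The arc-length formula ∫_a^b sqrt(1 + g'(x)^2) dx agrees with the length of
# a fine inscribed polygon on the graph of g(x) = x^2 over [0, 1].

g = lambda x: x * x
dg = lambda x: 2 * x
a, b, n = 0.0, 1.0, 20000
h = (b - a) / n

# midpoint-rule value of the integral formula
integral = sum(math.sqrt(1 + dg(a + (i + 0.5) * h) ** 2) * h for i in range(n))

# length of the inscribed polygon through the points (x_i, g(x_i))
xs = [a + i * h for i in range(n + 1)]
polyline = sum(math.hypot(xs[i + 1] - xs[i], g(xs[i + 1]) - g(xs[i]))
               for i in range(n))

print(abs(integral - polyline) < 1e-6)  # True
```

Both quantities converge to the same arc length (about 1.4789 here) as the subdivision is refined, which is exactly the statement that the parametrized-curve formula recovers the graph formula.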
Proposition 5.3.
(i) The line integral of second type of a function along a curve gives the same value for any change of parametrization, independent of orientation. That is,
$$\int_{\gamma_1} f(v)\,ds = \int_{\gamma_2} f(v)\,ds.$$
Proof. (i) Writing u = h(t), we have γ_2(u) = γ_2(h(t)) = γ_1(t). Then since du = h′(t)dt, and using the Chain Rule, we have:
$$\int_{\gamma_2} f(v)\,ds \equiv \int_{u=c}^{u=d} f(\gamma_2(u))\,\|\gamma_2'(u)\|\,du = \int_{t=a}^{t=b} f(\gamma_2(h(t)))\,\|\gamma_2'(h(t))\|\,h'(t)\,dt.$$
Assuming first that h′ > 0, this equals (18)
$$\int_{t=a}^{t=b} f(\gamma_1(t))\,\|\gamma_2'(h(t))\,h'(t)\|\,dt = \int_{t=a}^{t=b} f(\gamma_1(t))\,\|(\gamma_2 \circ h)'(t)\|\,dt = \int_{t=a}^{t=b} f(\gamma_1(t))\,\|\gamma_1'(t)\|\,dt = \int_{\gamma_1} f(v)\,ds.$$
If instead h′ < 0, then we have as before
$$\int_{\gamma_2} f(v)\,ds \equiv \int_{u=c}^{u=d} f(\gamma_2(u))\,\|\gamma_2'(u)\|\,du = \int_{t=b}^{t=a} f(\gamma_2(h(t)))\,\|\gamma_2'(h(t))\|\,h'(t)\,dt$$
because, since h′ < 0, h(b) = c, h(a) = d. (19)
Also we now have $\|\gamma_2'(h(t))\|\,h'(t) = -\|\gamma_2'(h(t))\,h'(t)\|$ so this is
$$-\int_{t=b}^{t=a} f(\gamma_1(t))\,\|\gamma_2'(h(t))\,h'(t)\|\,dt = \int_{t=a}^{t=b} f(\gamma_1(t))\,\|(\gamma_2 \circ h)'(t)\|\,dt = \int_{t=a}^{t=b} f(\gamma_1(t))\,\|\gamma_1'(t)\|\,dt = \int_{\gamma_1} f(v)\,ds.$$
We next see how this can be used to give a unit speed parametrization of a curve γ : [a, b] → R^n. Set $l(t) = \int_a^t \|\gamma'(r)\|\,dr$, so l(t) is the arclength of γ from time a to time t. Note that l′(t) = ||γ′(t)||. Therefore, if ||γ′(t)|| > 0 for all t, l is invertible. Our parameter change will be given by h = l^{−1}, the inverse function.
Proposition 5.4. Assume that ||γ′(t)|| > 0 for all t. Then the reparametrized curve $\hat\gamma = \gamma \circ h$ has speed one.
Proof. Now 1 = (l ∘ h)′(t) = l′(h(t))h′(t), so $\|\hat\gamma'(t)\| = \|(\gamma \circ h)'(t)\| = \|\gamma'(h(t))\,h'(t)\| = l'(h(t))\,h'(t) = 1$.
The function l maps [a, b] to [0, l(γ)], whence the parameter-change function h maps [0, l(γ)] to [a, b]. We keep t for the variable in [a, b] and define s = l(t), the arc length up to time t, so now s is the variable in [0, l(γ)] and h(s) = t.
The change of parameter gives $\hat\gamma(s) = (\gamma \circ h)(s) = \gamma(h(s)) = \gamma(t)$. This indeed parametrizes the curve $\hat\gamma$ by arc length s.
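Proposition 5.4 can be tried out numerically; in this sketch (mine; the parabola γ(t) = (t, t^2) is an arbitrary example) we compute l by a Riemann sum, invert it by bisection, and check that the reparametrized curve has speed one:

```python
import math

# Reparametrize γ(t) = (t, t^2) on [0, 1] by arc length and check that the
# reparametrized curve has speed one at a sample arclength value.

def speed(t):
    return math.hypot(1.0, 2.0 * t)       # ||γ'(t)|| for γ(t) = (t, t^2)

def arclength(t, n=2000):                 # l(t) = ∫_0^t ||γ'(r)|| dr (midpoint rule)
    step = t / n
    return sum(speed((i + 0.5) * step) * step for i in range(n))

def h_inv(s):                             # h = l^{-1}, found by bisection on [0, 1]
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if arclength(mid) < s:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

gamma = lambda t: (t, t * t)
gamma_hat = lambda s: gamma(h_inv(s))     # the reparametrized curve γ ∘ h

s, eps = 0.7, 1e-5                        # sample arclength inside [0, l(1)]
(x1, y1), (x2, y2) = gamma_hat(s - eps), gamma_hat(s + eps)
est = math.hypot(x2 - x1, y2 - y1) / (2 * eps)  # finite-difference speed
print(round(est, 3))  # 1.0
```

The finite-difference speed is 1 to within numerical error, even though the original parametrization has speed varying from 1 up to √5.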
Note further that
$$\int_\gamma f(v)\,ds \equiv \int_a^b f(\gamma(t))\,\|\gamma'(t)\|\,dt = \int_0^{l(b)} f(\hat\gamma(s))\,\|\hat\gamma'(s)\|\,ds \equiv \int_{\hat\gamma} f(v)\,ds.$$
From s = l(t) we have ds = l′(t)dt = ||γ′(t)||dt. Now we understand rigorously what ds is: it represents the infinitesimal arc length; this helps explain the notation for this type of integral.
Level curves and parametrized curves.
There are two very distinct types of curves we encounter in Vector Calculus: the
curves of this section, and the level curves of a function. Next we describe a link
between the two:
Proposition 5.5. Let G : R2 → R be differentiable and suppose γ : [a, b] → R2 is a
curve which stays in a level curve of G of level c. Then γ 0 (t) is perpendicular to the
gradient of G.
Proof. We have that G(γ(t)) = c for all t. Then by the chain rule, D(G ∘ γ)(t) = DG(γ(t))Dγ(t). The derivatives here are matrices, with DG a (1 × 2) matrix (a row vector) and Dγ a column vector; in vector notation, these are the gradient and tangent vector, so this reads $0 = \frac{d}{dt}c = (G \circ \gamma)'(t) = (\nabla G)(\gamma(t)) \cdot \gamma'(t)$.
Corollary 5.6. If γ is a curve with ||γ′(t)|| = c, then γ′ ⊥ γ′′.
Here is a second, direct proof; see also Corollary 4.6 above:
Proposition 5.7. For a unit-speed curve γ, then always γ 0 ⊥ γ 00 .
Proof. 1 = γ 0 · γ 0 whence by Leibnitz’ Rule,
(γ 0 · γ 0 )0 = 2(γ 0 · γ 00 ) = 0.
This fact allows us to make the following
Proof. Suppose F = ∇ϕ. Then
$$\int_\gamma F \cdot d\gamma \equiv \int_a^b F(\gamma(t)) \cdot \gamma'(t)\,dt = \int_a^b \nabla\varphi(\gamma(t)) \cdot \gamma'(t)\,dt = \int_a^b (\varphi \circ \gamma)'(t)\,dt = \varphi(\gamma(b)) - \varphi(\gamma(a)) = \varphi(B) - \varphi(A);$$
thus this value only depends on ϕ(A) and ϕ(B), not on the path taken to get there. Hence if there are two paths γ_1, γ_2 with the same initial and final points A, B, then $\int_{\gamma_1} F \cdot d\gamma_1 = \int_{\gamma_2} F \cdot d\gamma_2$.
(ii) ⟹ (iii): If γ is a closed path, then γ(a) = A = γ(b) = B. Define a second path η with the same initial and final points A = B but with η(t) = A for all t. Then η′(t) = 0 so $\int_\eta F \cdot d\eta = 0$, whence by (ii) also $\int_\gamma F \cdot d\gamma = 0$.
Another proof is the following: Given a closed path γ, we choose some c ∈ [a, b] and define C = γ(c). Write γ_1 for the path γ restricted to [a, c] and γ_2 for γ restricted to [c, b]. Then by (ii) γ_1 and the time-reversed path $\widetilde\gamma_2$ have the same initial and final points, so
$$\int_{\gamma_1} F \cdot d\gamma_1 = \int_{\widetilde\gamma_2} F \cdot d\widetilde\gamma_2.$$
Therefore
$$\int_\gamma F \cdot d\gamma = \int_a^b F(\gamma(t)) \cdot \gamma'(t)\,dt = \int_a^c F(\gamma(t)) \cdot \gamma'(t)\,dt + \int_c^b F(\gamma(t)) \cdot \gamma'(t)\,dt = \int_{\gamma_1} F \cdot d\gamma_1 + \int_{\gamma_2} F \cdot d\gamma_2 = \int_{\gamma_1} F \cdot d\gamma_1 - \int_{\widetilde\gamma_2} F \cdot d\widetilde\gamma_2 = 0.$$
(iii) ⟹ (ii): We essentially reverse this last argument. We are given that the integral over a closed path is 0. If there are two paths γ_1, γ_2 with the same initial and final points A, B, we are to show that $\int_{\gamma_1} F \cdot d\gamma_1 = \int_{\gamma_2} F \cdot d\gamma_2$.
As above, we write $\widetilde\gamma_2$ for the time-reversed path. Then $\gamma = \gamma_1 + \widetilde\gamma_2$ is a closed loop, so
$$0 = \int_\gamma F \cdot d\gamma = \int_{\gamma_1} F \cdot d\gamma_1 + \int_{\widetilde\gamma_2} F \cdot d\widetilde\gamma_2 = \int_{\gamma_1} F \cdot d\gamma_1 - \int_{\gamma_2} F \cdot d\gamma_2.$$
(ii) ⟹ (i): We define a function ϕ by fixing some point A and choosing ϕ(A) arbitrarily. Then we define the other values as follows. Letting B ∈ Ω, since the region is path connected there exists a piecewise C^1 path γ : [a, b] → Ω with A = γ(a), B = γ(b). We set
$$\varphi(B) = \int_\gamma F \cdot d\gamma.$$
By (ii), this is well-defined as it does not depend on the path.
We claim that ∇ϕ = (ϕ_x, ϕ_y) = F = (F_1, F_2), showing the calculation for the case of a field F on R^2. We compute ∂ϕ/∂x at the point B = (B_0, B_1) and shall show that
$$\frac{\partial\varphi}{\partial x}\Big|_B = F_1(B).$$
Defining a path η by η(t) = B + te_1, so η(0) = B, we extend the path γ by sticking η on its end. That is, we define for t ≥ b, γ(t) = η(t − b). We still have for C = γ(c), with c > b,
$$\varphi(C) = \int_a^c F(\gamma(t)) \cdot \gamma'(t)\,dt$$
by path-independence of (ii). This equals $\int_a^b F(\gamma(t)) \cdot \gamma'(t)\,dt + \int_b^c F(\gamma(t)) \cdot \gamma'(t)\,dt = \varphi(B) + \int_0^{c-b} F(\eta(t)) \cdot \eta'(t)\,dt$.
By definition, and taking c = b + h,
$$\frac{\partial\varphi}{\partial x}\Big|_B = \lim_{h\to 0} \frac 1h \big(\varphi(\gamma(b+h)) - \varphi(\gamma(b))\big) = \lim_{h\to 0} \frac 1h \big(\varphi(\eta(h)) - \varphi(\eta(0))\big) = \lim_{h\to 0} \frac 1h \big(\varphi(\eta(h)) - \varphi(B)\big).$$
Now
$$\lim_{h\to 0} \frac 1h \big(\varphi(\eta(h)) - \varphi(B)\big) = \lim_{h\to 0} \frac 1h \int_0^h F(\eta(t)) \cdot \eta'(t)\,dt = \lim_{h\to 0} \frac 1h \int_0^h F(B_0 + t, B_1) \cdot (1, 0)\,dt = \lim_{h\to 0} \frac 1h \int_0^h F_1(B_0 + t, B_1)\,dt = F_1(B_0, B_1) = F_1(B).$$
If the position of the object in time is given by the curve γ(t), then we write v(t) = γ′(t) for the velocity and a(t) = v′(t) = γ′′(t) for the acceleration. So Newton's law states F(γ(t)) = ma(t) = mγ′′(t).
Definition 5.5. Work is defined in mechanics to be (force) · (distance). This means
that the work done by moving a particle against a force is given by that expression.
The continuous-time version of this is given by a line integral.
Precisely, we define the work done by moving a particle along a path (a curve) γ in a force field F to be $\int_\gamma F \cdot d\gamma$.
The kinetic energy of the particle is $\frac 12 m\|v\|^2$.
Proposition 5.11. The work done by moving along the path γ in a force field F
from time a to time b is the difference in kinetic energies, Ekin(b) − Ekin(a).
Proof. The work done by moving along the path γ from time a to time b is
∫_γ F · dγ = ∫_a^b F(γ(t)) · γ′(t) dt = m ∫_a^b γ′′(t) · γ′(t) dt.
By Leibnitz' Rule (Proposition 4.3), γ′′ · γ′ = (1/2)(d/dt)||γ′||², so this equals
(m/2)(||γ′(b)||² − ||γ′(a)||²) = Ekin(b) − Ekin(a).
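As a numerical sanity check of Proposition 5.11, the following sketch (the path γ(t) = (t², t³) and the mass m = 2 are invented sample choices) approximates the work integral ∫ m γ′′ · γ′ dt by a midpoint Riemann sum and compares it with the change in kinetic energy:

```python
m = 2.0  # sample mass

def dgamma(t):   # γ′(t) for the sample path γ(t) = (t², t³)
    return (2*t, 3*t**2)

def ddgamma(t):  # γ″(t)
    return (2.0, 6*t)

# Work W = ∫_0^1 m γ″(t)·γ′(t) dt, approximated by the midpoint rule
N = 100_000
W = 0.0
for k in range(N):
    t = (k + 0.5) / N
    ax, ay = ddgamma(t)
    vx, vy = dgamma(t)
    W += m * (ax*vx + ay*vy) / N

def kinetic(t):
    vx, vy = dgamma(t)
    return 0.5 * m * (vx*vx + vy*vy)

print(W, kinetic(1.0) - kinetic(0.0))  # both ≈ 13.0
```

The two quantities agree, illustrating that work equals the change in kinetic energy, regardless of whether the field is conservative.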
Proof. We have shown in Proposition 5.11 that the work done (in any field) is
∫_γ F · dγ = Ekin(b) − Ekin(a).
But in a conservative field, we also have a second expression for this: the work done
is
∫_γ F · dγ = ϕ(B) − ϕ(A) = Epot(A) − Epot(B).
Thus
Ekin(b) − Ekin(a) = Epot(A) − Epot(B)
so
Etot(a) = Ekin(a) + Epot(A) = Ekin(b) + Epot(B) = Etot(b).
Remark 5.5. Note that we calculated the line integral ∫_a^b F(γ(t)) · γ′(t) dt in two
different ways, in Proposition 5.9 and Proposition 5.11. For the first we used the
existence of a potential to rewrite F(γ(t)) as ∇ϕ(γ(t)) and used the Chain Rule; for
the second we used Newton's Law to rewrite F as ma = mγ′′ and applied Leibnitz'
Rule.
It is interesting that these are the same two very different techniques applied to give
two different proofs of Corollary 5.6 above.
Remark 5.6. Note that these formulas represent the determinant of a matrix of sym-
bols rather than numbers, so only make sense as formulas. Nevertheless some of the
properties carry over from the usual situation of a matrix of numbers. For exam-
ple, multilinearity of the determinant or linearity of the vector product is reflected
in linearity of the curl: given two vector fields on R3 , F, G then curl(αF + βG) =
α curl(F ) + β curl(G).
The formulas for R² and R³ are connected. To understand this, take the fields
F = (P, Q) and F̂ = (P̂, Q̂, R̂) with R̂ ≡ 0 and with P̂(x, y, z) = P(x, y) and
Q̂(x, y, z) = Q(x, y), whence Q̂z = P̂z = 0, so then
curl(F̂) = (R̂y − Q̂z, P̂z − R̂x, Q̂x − P̂y) = (0, 0, Q̂x − P̂y) = (Qx − Py) k.
In other words, curl(F̂) = curl(F) in this case.
Proposition 5.14. If a field F on R2 is conservative, then the curl is 0.
Proof. This follows immediately from the equality of mixed partials, Lemma 4.28.
Remark 5.7. The proposition says: curl(grad ϕ) = 0, that is,
∇ ∧ (∇ϕ) = ∇ × (∇ϕ) = 0.
In fact, the curl in R³ can be understood with the help of that in R²: if F̂ is 0 in
some other direction v (replacing the direction k), then the curl is a multiple of v,
and is equal to the curl on the plane perpendicular to v.
This will always be the case for a linear vector field, because we can rotate the field
so that v now lines up with k and we are in the previous situation. If F̂ is not linear,
we define:
Definition 5.8. The linearization of F at p is the linear vector field defined by the
derivative matrix F ∗ = DFp .
As we next show, the curl of F at p is equal to that for its linearization: curl(F)|p =
curl(F*)|0:
Theorem 5.15. Let F = (P, Q, R) be a differentiable vector field on R3 , with deriv-
ative DFp at the point p. Let F ∗ denote the linear vector field defined by the matrix
DFp .
Then curl(F )|p = curl(F ∗ )|0 , which is constant.
The same holds for R2 .
Proof. For the case of R², so F = (P, Q), the derivative matrix is
DF = [ Px  Py ]
     [ Qx  Qy ].
The curl is calculated from the off-diagonal entries. So curl(F) and curl(F*) are
the same, as they are determined by these entries. More precisely, DFp(x, y) =
(xPx + yPy, xQx + yQy) = (P̃, Q̃), which has curl ((Q̃)x − (P̃)y) k = (Qx − Py) k.
For the (3 × 3) case, the derivative of a linear map is constant, so for all x,
D(F*)(x) = D(F*)(0) = DFp = F*.
F* ≡ DFp = [ Px  Py  Pz ]
           [ Qx  Qy  Qz ]
           [ Rx  Ry  Rz ] |p
Write the rows as P̃, Q̃, R̃. Then the curl of the linear vector field defined by F* is
(R̃y − Q̃z, P̃z − R̃x, Q̃x − P̃y) = (Ry − Qz, Pz − Rx, Qx − Py)|p = curl(F)|p,
proving the claim.
We note that since for any chosen p, DFp is a linear map, its derivative is constant,
equal to that linear map at any point. Thus curl(F ∗ )|q = curl(F ∗ )|0 , for any q.
Another way to say this is that for any linear vector field the curl is the same at all
points.
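Theorem 5.15 can be tested numerically. In the sketch below the field F(x, y, z) = (yz, xz + x², xy + z³) and the base point p = (1, 2, 3) are invented sample choices; all partial derivatives are approximated by central differences, and the curl of the linearization is evaluated at an unrelated point to illustrate that it is constant:

```python
h = 1e-5

def F(x, y, z):
    # sample nonlinear field; analytically curl(F) = (0, 0, 2x)
    return (y*z, x*z + x**2, x*y + z**3)

def jac(f, pt):
    # Jacobian matrix by central differences
    J = []
    for i in range(3):
        row = []
        for j in range(3):
            p1 = list(pt); p1[j] += h
            p2 = list(pt); p2[j] -= h
            row.append((f(*p1)[i] - f(*p2)[i]) / (2*h))
        J.append(row)
    return J

def curl(f, pt):
    J = jac(f, pt)
    return (J[2][1] - J[1][2], J[0][2] - J[2][0], J[1][0] - J[0][1])

p = (1.0, 2.0, 3.0)
DF = jac(F, p)  # derivative matrix at p
# linearization F*(x) = DF_p · x
Fstar = lambda x, y, z: tuple(DF[i][0]*x + DF[i][1]*y + DF[i][2]*z for i in range(3))

print(curl(F, p))                     # ≈ (0, 0, 2)
print(curl(Fstar, (5.0, -7.0, 0.3)))  # same value: the curl of a linear field is constant
```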
The curl is a type of derivative, so it makes sense that it can be calculated from
the derivative matrix. The geometrical meaning of curl is an infinitesimal rotation:
a sphere in R³ rotates about an axis. (To prove this, in Linear Algebra the Spectral
Theorem tells us that a rotation, given by an orientation-preserving orthogonal
matrix, has an eigenvector; this gives the axis.) The curl measures the infinitesimal
rotation of the vector field, and its vector points along that axis, using the right-hand
rule to indicate the direction of the vector. Why this is an infinitesimal rotation is
explained by the notion of the exponential of a matrix, illustrated in the next example.
See the online text https://activecalculus.org/vector/ for some nice illustrations.
5.4. Rotations and exponentials; angle as a potential. First we consider the
linear vector field V on R² defined by
A = [ 0  −1 ]
    [ 1   0 ].
We shall explain how this is tangent to the rotation flow
Rt = [ cos t  − sin t ]
     [ sin t    cos t ],
see Fig. 19.
The relationship between the matrices A and Rt is simple, beautiful and profound.
We extend the definition of e^x to a square matrix M via the Taylor series
exp(M) = I + M + M²/2 + · · · + M^k/k! + · · ·
It is not hard to show (using comparison and the matrix norm) that this always
converges. In particular, for A as above,
e^{tA} = [ cos t  − sin t ]
         [ sin t    cos t ] = Rt
gives the rotation flow. To see this, write out the first few terms of the matrix series
and use the Taylor series for sin, cos.
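Writing out the series numerically makes the identity e^{tA} = Rt easy to check. This is a minimal sketch; the truncation at 30 terms and the sample value t = 0.7 are arbitrary choices:

```python
import math

def mat_mul(A, B):
    # product of two 2×2 matrices
    return [[sum(A[i][k]*B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def mat_exp(M, terms=30):
    # exp(M) = I + M + M²/2! + ..., truncated Taylor series
    result = [[1.0, 0.0], [0.0, 1.0]]  # running sum, starts at I
    term = [[1.0, 0.0], [0.0, 1.0]]    # current term M^k/k!
    for k in range(1, terms):
        term = mat_mul(term, M)
        term = [[x / k for x in row] for row in term]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result

t = 0.7
A = [[0.0, -1.0], [1.0, 0.0]]
tA = [[t*x for x in row] for row in A]
E = mat_exp(tA)
R = [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]
print(E)  # matches the rotation matrix R_t
print(R)
```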
A similar equation holds in R3 , which explains why the curl of a vector field does
measure the infinitesimal rotation.
This is related to the most basic and most important differential equation: that for
exponential growth,
f′(t) = f(t),
which has as its solution
f(t) = Ke^t.
The same holds for the vector differential equation γ′(t) = Aγ(t) where
A = [ 0  −1 ]
    [ 1   0 ]
and γ = (x, y); that is, in matrix form,
[ x′(t) ]   [ 0  −1 ] [ x(t) ]
[ y′(t) ] = [ 1   0 ] [ y(t) ]
which with initial condition (x0, y0) has solution
[ x(t) ]   [ cos t  − sin t ] [ x0 ]
[ y(t) ] = [ sin t    cos t ] [ y0 ].
The derivative of the linear map V : R² → R² at a point p is DVp = A for all
p, since the derivative of a linear map is constant, with value equal to the matrix
itself.
We claim the field V is not conservative. Writing V = (P, Q),
DV = [ Px  Py ]   [ 0  −1 ]
     [ Qx  Qy ] = [ 1   0 ],
so the curl is Qx − Py = 1 − (−1) = 2. Thus by Proposition 5.14,
V is not conservative.
For a second proof, we calculate the line integral ∫_γ V · dγ for the curve γ(t) =
(cos t, sin t), t ∈ [0, 2π]. This is
∫_0^{2π} V(γ(t)) · γ′(t) dt = ∫_0^{2π} (− sin t, cos t) · (− sin t, cos t) dt = 2π.
But this is a closed loop, hence by (iii) of Proposition 5.10, V is not conservative.
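The second proof can be replicated numerically; this sketch approximates ∫_γ V · dγ for the rotation field V = (−y, x) around the unit circle with a midpoint Riemann sum:

```python
import math

def V(x, y):
    # the rotation field defined by A = [[0, -1], [1, 0]]
    return (-y, x)

# ∫_γ V · dγ around γ(t) = (cos t, sin t), t ∈ [0, 2π], midpoint rule
N = 100_000
total = 0.0
for k in range(N):
    t = 2*math.pi*(k + 0.5)/N
    x, y = math.cos(t), math.sin(t)
    dx, dy = -math.sin(t), math.cos(t)   # γ′(t)
    Fx, Fy = V(x, y)
    total += (Fx*dx + Fy*dy) * (2*math.pi/N)

print(total)  # ≈ 2π ≈ 6.2832: nonzero around a closed loop, so V is not conservative
```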
−1/(1 + y²) + 1/(y² + 1) = 0.
To find the constant we evaluate at a single point (actually at two points), where
it is easy: at 1 and −1. Now cot(π/4) = 1, arccot(1) = π/4; this is for the case y > 0,
and cot(−π/4) = −1, arccot(−1) = 3π/4, for the case y < 0, of (24).
So as claimed in (24),
arccot(y) − arccot(−1/y) = { π/4 − 3π/4 = −π/2,  for y = 1
                           { 3π/4 − π/4 = π/2,   for y = −1
Combining this with (23),
ϕ(x, y) = arccot(x/y) + { −π  for y > 0
                        { 0   for y < 0
so
ϕ(x, y) + π = arccot(x/y) + { 0  for y > 0
                            { π  for y < 0.
From (22),
Θ(x, y) = arccot(x/y) + { 0  for y > 0        (25)
                        { π  for y < 0.
Lastly, for y = 0, x < 0 we have ϕ + π = 0 + π = π and also Θ = π, since
ϕ(−1, 0) = 0 while Θ(−1, 0) = π.
This proves the Claim for all cases, that
Θ = ϕ + π.
Remark 5.9. To better understand the potential function Θ, draw its level curves;
they are rays from the origin, climbing up like a spiral staircase.
Note that for γ(t) = (cos t, sin t) then
∫_0^{2π} F(γ(t)) · γ′(t) dt = ∫_0^{2π} (− sin t, cos t) · (− sin t, cos t) dt = 2π
and also
lim_{t→2π} Θ(γ(t)) − Θ(γ(0)) = 2π − 0 = 2π,
so the formula ∫_γ F · dγ = ϕ(B) − ϕ(A) is still valid in the limit; it is also valid if we
can somehow allow for a “multi-valued function” as a potential!
See §5.15, Fig 24 below for a different view of this potential: it is related to the
electrostatic field of a single charge at the origin.
5.5. Line integral with respect to a differential form. We have been studying
line integrals,
∫_γ F · dγ.
There is a dual way to write these: in R², a differential one-form is an expression
η = P dx + Q dy.
Note that the coefficients depend on the location: they are functions P(x, y), Q(x, y).
This one-form is dual to the vector field F = (P, Q), and conversely, F is dual to
η.
Similarly, in R³ we can express a one-form as
η = P dx + Q dy + R dz.
Again, we then define the line integral with respect to a one-form as equal to its line
integral over the associated vector field.
Given a one-form η, we define the line integral of a curve γ over η to be simply the line
integral of the corresponding vector field F, so
∫_γ η = ∫_γ P dx + Q dy = ∫_γ F · dγ = ∫_a^b F(γ(t)) · γ′(t) dt.
Thus to calculate a line integral over a one-form, the first step is to write it out as a
standard line integral with respect to the dual field F = (P, Q).
A key fact about line integrals is that the orientation of γ is important, since for
γ : [a, b] → R² with opposite curve γ̃ = −γ, then as we know, ∫_{γ̃} F · dγ̃ = − ∫_γ F · dγ.
Thus γ is an oriented curve, and not just the point set Im(γ), the image of the
curve.
This is the same as the difference between the Riemann integral ∫_{[a,b]} f(x) dx and the
integral ∫_a^b f(x) dx = F(b) − F(a) defined from a primitive F, since in the second case
A = [a, b] is treated as an oriented interval and we have ∫_b^a f(x) dx = − ∫_a^b f(x) dx.
These matters become more subtle for double, triple, and k-fold integrals,
where V* is replaced by the set of alternating k-tensors on Rᵏ, as we explain below.
So far we have only treated one-tensors:
Definition 5.11. Given a vector space V, a one-tensor is an element of the dual
space V*. A differential one-form η on a vector space V is a function taking values in
the one-tensors, so equivalently, η : V → V*. Choice of an inner product identifies
V with V*, by sending v ↦ λv ∈ V* with λv(w) = ⟨v, w⟩. This is an isomorphism,
which depends on the choice of inner product.
5.6. Green’s Theorem: Stokes’ Theorem in the Plane. Here we follow the
outlines of Guidorizzi’s Calculus 3 text: [Gui02]. In my opinion this is (for those who
know Portuguese) a good text to teach from, as it is well organized, with correct
proofs and good worked-out examples and exercises of a consistent level, but it’s not
so easy to study from as it is too dry and also because it lacks the beauty of a more
advanced and abstract approach. The latter is given in spades in Spivak’s beautiful
[Spi65] and Guillemin and Pollack’s transcendent [GP74]; the approach in these notes
is to bridge the way to this very beautiful and powerful more abstract approach while
keeping our feet firmly on the ground of simplicity.
Definition 5.12. Given a simple closed C¹ curve γ in R², so γ : [a, b] → R² with
γ(a) = γ(b), and with γ′(t) never zero, we define a curve on the circle by
γ̂(t) = γ′(t)/||γ′(t)||. This is just the normalized tangent vector, so to see how the
tangent vector turns, we look at how γ̂ moves along the unit circle.
One can prove (and it makes sense intuitively) that:
Lemma 5.20. γ̂ either goes around once in the clockwise direction or once in the
counterclockwise direction.
We say γ is oriented positively if it is a counterclockwise motion, otherwise we say
it is oriented negatively.
Given a simple closed curve γ in the plane, to state Green's Theorem we need to
be able to talk about its inside and outside. This enables us to define its orientation
as well.
These ideas are made precise by the famous Jordan Curve Theorem:
Theorem 5.21. (Jordan) A continuous simple closed curve γ in R2 partitions the
plane into three connected sets:
–the interior of the curve, an open set we call K;
–the image of γ, a closed set, which is the topological boundary of K, so we call it
∂K = Im(γ), the boundary of K;
–the exterior of γ, the open set which is the complement of K ∪ ∂K.
Definition 5.13. Given such a curve, we say it has positive orientation iff it goes in
the counterclockwise direction as seen from the inside.
Proposition 5.22. If γ is oriented positively and piecewise C¹, then the interior region
K is to the left of the tangent vector γ′(t) for all t where γ′(t) exists and is nonzero.
Unfortunately, we will not prove any of these beautiful results here, as good proofs
require a more advanced perspective, bringing in ideas from algebraic or differential
topology (see [Arm83]), and as they are intuitively clear from sketching a few pictures.
These ideas also are needed in Complex Analysis. There is a nice treatment relating
this to line integrals in the third edition of Marsden-Hoffman: [MH98].
Theorem 5.23. (Green’s Theorem) Let γ be a simple closed positively oriented curve
in R2 , piecewise C 1 , with non-empty interior. Write K for the closure of the interior
of γ. Let F = (P, Q) be a C 1 vector field defined on some open set U ⊇ K.
Then
∫_γ F · dγ = ∬_K curl(F) · k dx dy,
equivalently,
∫_γ P dx + Q dy = ∬_K (Qx − Py) dx dy.
Proof. Proof for a rectangle: Let K = [a, b] × [c, d]. Write A = (a, c), B = (b, c), C =
(b, d), D = (a, d). Let γ = γ1 + · · · + γ4 be unit-speed boundary curves traversing
the segments in a counterclockwise direction, γ1 from A to B and so on. Thus
γ1(t) = A + t(1, 0) = (t, c) for t ∈ [a, b], so γ1′ = (1, 0). We have
∫_{γ1} P dx + Q dy = ∫_a^b P(t, c) dt.
Here
∫_a^b (P, Q)(γ3(t)) · (1, f′(t)) dt = ∫_a^b P(t, f(t)) + Q(t, f(t)) f′(t) dt
so the total is
∫_a^b P(t, c) dt + ∫_c^d Q(b, t) dt − ∫_a^b P(t, f(t)) dt − ∫_a^b Q(t, f(t)) f′(t) dt.
We are almost done. Note that each expression has four terms, and the first three
of them agree, just changing the variable of integration from time t to the spatial
coordinates x and y. It remains to check the last term. This is a substitution,
making use of the inverse function: writing s = f(t), so t = g(s), then ds = f′(t) dt,
whence indeed
∫_a^b Q(t, f(t)) f′(t) dt = ∫_{s=c}^{s=d} Q(g(s), s) ds = ∫_{y=c}^{y=d} Q(g(y), y) dy.
A more formal proof uses the notion of chains as developed in [Spi65] or [GP74].
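Green's Theorem is also easy to check numerically on a rectangle. In this sketch the field F = (P, Q) = (x²y, x + y³) and the rectangle [0, 1] × [0, 2] are invented sample choices; both the boundary line integral and the double integral of Qx − Py come out to ≈ 4/3:

```python
def P(x, y): return x*x*y
def Q(x, y): return x + y**3

a, b, c, d = 0.0, 1.0, 0.0, 2.0
N = 20_000

def mid_int(f, lo, hi):
    # midpoint-rule quadrature on [lo, hi]
    h = (hi - lo)/N
    return sum(f(lo + (k + 0.5)*h) for k in range(N)) * h

# ∮ P dx + Q dy counterclockwise around the rectangle
line = (mid_int(lambda t: P(t, c), a, b)     # bottom
        + mid_int(lambda t: Q(b, t), c, d)   # right
        - mid_int(lambda t: P(t, d), a, b)   # top, reversed
        - mid_int(lambda t: Q(a, t), c, d))  # left, reversed

# ∬_K (Qx − Py) dx dy, midpoint grid with central-difference partials
M = 400
hx, hy = (b - a)/M, (d - c)/M
eps = 1e-6
dbl = 0.0
for i in range(M):
    for j in range(M):
        x = a + (i + 0.5)*hx
        y = c + (j + 0.5)*hy
        Qx = (Q(x+eps, y) - Q(x-eps, y))/(2*eps)
        Py = (P(x, y+eps) - P(x, y-eps))/(2*eps)
        dbl += (Qx - Py)*hx*hy

print(line, dbl)  # both ≈ 4/3
```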
Consider the setting of Exercise 5.2, for the region with two boundary circles of radius 1 and 2. What
does Green's Theorem say in this case?
Remark 5.10. The proof of Green’s Theorem for rectangular regions given above may
remind the reader of the proof we gave above for the equality of mixed partials Lemma
4.28. We next see the exact connection between the two arguments, by showing how
the equality of mixed partials follows as a corollary of Green’s Theorem.
Given a vector field F = (P, Q) and the rectangle from the proof above, R =
[a, b] × [c, d], we parametrize the boundary of R by a counterclockwise curve γ and
calculate the line integral ∫_γ F · dγ. The corners of R are A, B, C, D with A =
(a, c), B = (b, c), C = (b, d) and D = (a, d). We have γ = γAB + γBC + γCD + γDA
where these are the unit-speed paths; we use the inverse paths for the last two. Thus
γAB(t) = (t, c) for t ∈ [a, b];  γAB′ = (1, 0)
γBC(t) = (b, t) for t ∈ [c, d];  γBC′ = (0, 1)
−γCD(t) = (t, d) for t ∈ [a, b];  (−γCD)′ = (1, 0)
−γDA(t) = (a, t) for t ∈ [c, d];  (−γDA)′ = (0, 1)
Then
∫_{γAB} F · dγAB = ∫_a^b (P, Q)(γAB) · (1, 0) dt = ∫_a^b P(t, c) dt
∫_{γBC} F · dγBC = ∫_c^d (P, Q)(γBC) · (0, 1) dt = ∫_c^d Q(b, t) dt
− ∫_{γCD} F · dγCD = ∫_a^b (P, Q)(−γCD) · (1, 0) dt = ∫_a^b P(t, d) dt
− ∫_{γDA} F · dγDA = ∫_c^d (P, Q)(−γDA) · (0, 1) dt = ∫_c^d Q(a, t) dt
So far this is true for any vector field. We now assume F is conservative, so there
exists ϕ with F = ∇ϕ, so F = (P, Q) where P(x, y) = (∂/∂x) ϕ(x, y) and Q(x, y) =
(∂/∂y) ϕ(x, y). So
∫_{γAB} F · dγAB = ∫_a^b P(t, c) dt = ∫_a^b (∂/∂x) ϕ(t, c) dt = ϕ(t, c)|_a^b
and we have:
∫_{γAB} F · dγAB = ϕ(t, c)|_a^b = ϕ(b, c) − ϕ(a, c)
∫_{γBC} F · dγBC = ϕ(b, t)|_c^d = ϕ(b, d) − ϕ(b, c)
− ∫_{γCD} F · dγCD = ϕ(t, d)|_a^b = ϕ(b, d) − ϕ(a, d)
− ∫_{γDA} F · dγDA = ϕ(a, t)|_c^d = ϕ(a, d) − ϕ(a, c)
Thus
∫_γ F · dγ = ϕ(b, c) − ϕ(a, c) + ϕ(b, d) − ϕ(b, c)
           + ϕ(a, d) − ϕ(b, d) + ϕ(a, c) − ϕ(a, d) = 0.
Note that this statement,
∫_γ F · dγ = 0,
is equivalent to
∫_{γAB + γCD} F · dγ = − ∫_{γBC + γDA} F · dγ,
as we traverse the sides of R in a different order. And this was exactly the concluding
step in the proof of Lemma 4.28.
We have proved that if F is conservative, then the integral around any rectangular
loop is 0. This proof has not used the equality of mixed partials.
But we can prove the equality of mixed partials from this fact as a corollary of
Green's Theorem. Green's Theorem states that
∬_R curl(F) dx dy = ∫_γ F · dγ
for γ = ∂R as above.
Since we have proved that the line integral around ∂R is 0 for each rectangle,
Green's Theorem tells us that curl(F) = 0.
But curl(F) = (∂Q/∂x − ∂P/∂y) k, and hence the mixed partials are equal.
Theorem 5.24. Let F = (P, Q) be a C¹ vector field in the plane, and let γ be a
piecewise C¹, positively oriented simple closed curve, with interior region K. We
define n = γ′*/||γ′*|| = γ′*/||γ′||; this is the outward normal vector of γ.
Then
∫_γ F · n ds = ∬_K div(F) dx dy.
The same holds more generally for a finite collection of disjoint such regions K1, . . . , Kn
with boundaries γ1, . . . , γn, writing K = K1 ∪ · · · ∪ Kn and γ = γ1 + · · · + γn.
Proof. We place the two statements side-by-side, for γ the boundary curve of K, one
for the field F = (P, Q) and the other for G = (−Q, P):
Green's Theorem:
∫_γ G · dγ = ∬_K curl(G) · k dx dy
Divergence Theorem:
∫_γ F · n ds = ∬_K div(F) dx dy.
Note here that curl(G) · k = div(F), so once we prove the two different types of
line integrals are equal, the theorem is proved!
For γ(t) = (x(t), y(t)), then γ′(t) = (x′(t), y′(t)), and n = γ′*/||γ′|| = (y′, −x′)/||γ′|| =
(y′, −x′)/||(y′, −x′)||.
Recall (Def. 5.2) that the line integral of second type of a function f : R² → R over
γ : [a, b] → R² is defined to be
∫_γ f(v) ds ≡ ∫_a^b f(γ(t)) ||γ′(t)|| dt;
what we mean by this is the line integral of second type of the function f over γ,
where f is defined on the image of γ by
f(γ(t)) = F(γ(t)) · n(t).
Thus
∫_γ F · n ds ≡ ∫_γ f(v) ds ≡ ∫_a^b f(γ(t)) ||γ′(t)|| dt.
So
∫_γ F · n ds = ∫_a^b F(γ(t)) · n(t) ||γ′(t)|| dt = ∫_a^b F(γ(t)) · (y′, −x′)/||γ′(t)|| · ||γ′(t)|| dt
= ∫_a^b (P, Q)(γ(t)) · (y′, −x′) dt = ∫_a^b (−Q, P)(γ(t)) · (x′, y′) dt = ∫_a^b G(γ(t)) · γ′(t) dt
= ∫_γ G · dγ = ∬_K curl(G) · k dx dy = ∬_K div(F) dx dy.
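The plane Divergence Theorem can also be tested numerically. In the sketch below the field F = (x + y², xy) and the unit disk are invented sample choices; the flux integral uses the formula ∫ (P, Q)·(y′, −x′) dt from the proof, and the double integral is done in polar coordinates (both sides ≈ π):

```python
import math

def P(x, y): return x + y*y
def Q(x, y): return x*y

# Flux across the unit circle: ∮ F·n ds = ∫ (P, Q)·(y′, −x′) dt, midpoint rule
N = 100_000
flux = 0.0
for k in range(N):
    t = 2*math.pi*(k + 0.5)/N
    dx, dy = -math.sin(t), math.cos(t)       # γ′(t)
    x, y = math.cos(t), math.sin(t)
    flux += (P(x, y)*dy - Q(x, y)*dx) * (2*math.pi/N)

# ∬_K div(F) dx dy over the unit disk, midpoint grid in polar coordinates
eps = 1e-6
M = 400
dbl = 0.0
for i in range(M):
    r = (i + 0.5)/M
    for j in range(M):
        th = 2*math.pi*(j + 0.5)/M
        x, y = r*math.cos(th), r*math.sin(th)
        divF = ((P(x+eps, y) - P(x-eps, y)) + (Q(x, y+eps) - Q(x, y-eps)))/(2*eps)
        dbl += divF * r * (1.0/M) * (2*math.pi/M)   # r dr dθ

print(flux, dbl)  # both ≈ π
```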
Now this parallelogram P(v, w) is the image of the unit square I × I = [0, 1] × [0, 1]
in R² by the linear map
A = [ v1  w1 ]
    [ v2  w2 ]
    [ v3  w3 ].
This suggests that we can turn the above definition around and give an analogue
of the geometric definition of the determinant for the rectangular matrix A: it is to be
the area of this image parallelogram.
In other words, although the algebraic definition of determinant does not extend
to rectangular matrices, the geometric definition does, and more generally for a k-
parallelepiped P in Rⁿ, analogous to this simplest case of 2-parallelograms in R³.
The tantalizing task is then to find an algebraic formula for this geometric definition
in general, which must of course include the usual (n × n) case.
An answer comes from the following formula for k-dimensional volume in Rⁿ; see
[HH15] p. 526. Noting that AᵗA is a square matrix, Hubbard proves that the volume of a
k-parallelepiped P in Rⁿ equals
vol(P) = √(det(AᵗA)).
The Gram matrix is AᵗA; this is useful in Linear Algebra. The name comes from
Jørgen Pedersen Gram, famous for many things including the Gram-Schmidt orthog-
onalization procedure; see Wikipedia. The Gram determinant is the determinant of
the Gram matrix, so Hubbard's formula is the square root of this. See [CJBS89]
p. 191. We prefer Hubbard's presentation to Courant-John, in part because we find
the matrix form both easier to work with and to understand.
We present three proofs of the volume formula, first for the simplest cases:
Lemma 5.25.
(i) For A the (2 × 1) matrix given by the column vector v = (v1, v2), the length of the image of the unit
segment I = [0, 1] ⊂ R is √(det(AᵗA)).
(ii) For A the (3 × 2) matrix with columns v, w as above,
||v ∧ w|| = √(det(AᵗA)).
Proof. For (i) we have
AᵗA = [ v1  v2 ] [ v1 ]
                 [ v2 ]  = v1² + v2² = ||v||².
For (ii),
AᵗA = [ v1  v2  v3 ] [ v1  w1 ]
      [ w1  w2  w3 ] [ v2  w2 ]   =  [ v·v  v·w ]
                     [ v3  w3 ]      [ w·v  w·w ],
so det(AᵗA) = ||v||² ||w||² − (v · w)², while we know from Corollary 4.20 that the area of
the parallelogram P(v, w) ⊂ R³ satisfies
(area)² = (v · v)(w · w) − (v · w)².
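Lemma 5.25 (ii) is easy to test numerically; the vectors v and w below are arbitrary sample choices:

```python
import math

v = (1.0, 2.0, 3.0)
w = (-2.0, 0.5, 4.0)

# ||v ∧ w|| via the cross product
cross = (v[1]*w[2] - v[2]*w[1], v[2]*w[0] - v[0]*w[2], v[0]*w[1] - v[1]*w[0])
area_cross = math.sqrt(sum(c*c for c in cross))

# Gram determinant det(AᵗA) = (v·v)(w·w) − (v·w)² for the 3×2 matrix with columns v, w
dot = lambda a, b: sum(x*y for x, y in zip(a, b))
gram_det = dot(v, v)*dot(w, w) - dot(v, w)**2

print(area_cross, math.sqrt(gram_det))  # equal
```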
This is independent of parameter change; the first proof is geometric and is simply
that volume does not depend on parametrization. The second, algebraic proof shows
how nice the formulas are:
Proof.
Det(AB) = √(det((AB)ᵗ AB)) = √(det((BᵗAᵗ)AB)) = √(det(Bᵗ(AᵗA)B))
        = √(det(Bᵗ) det(AᵗA) det(B)) = |det B| √(det(AᵗA)) = |det B| Det(A).
We give a second proof of Theorem 5.27, using only the geometric definition of
determinant.....
(TO DO)
........
Given a (2 × 2) matrix B = [ a  b ]
                           [ c  d ], define B̂ to be the following (3 × 3) matrix:
B̂ = [ 1  0  0 ]
    [ 0  a  b ]
    [ 0  c  d ]
and so the area of P(v, w), which we write as area(v, w), equals the volume of the
image of the unit cube by Â, which we write as vol(Â); we conclude that
area(v, w) = vol(Â) = det Â = Det A.
5.9. Surface area and surface integrals. Given a domain (a connected open set)
B ⊆ R² and a C¹ map σ : B → S ⊆ R³ such that every (u, v) ∈ B is a regular
point, i.e. such that ||σu ∧ σv|| ≠ 0, then as above, σ is a parametrized surface. (Recall
that the regularity condition guarantees that the tangent plane exists at that point.)
Definition 5.16. We define the surface area of σ to be
area(σ) = ∬_B ||σu ∧ σv|| du dv.
This makes sense because area(P(v, w)) = ||v ∧ w|| is the area of the parallelogram
spanned by the vectors v, w, so ||σu ∧ σv|| du dv is the infinitesimal area, and the integral
adds this up. The intuition is that for a C¹ map, the surface can be well-approximated
by a polygonal surface made up of parallelograms.
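As a check of Definition 5.16, the sketch below approximates the area of the unit sphere, with the standard spherical-coordinate parametrization as a sample choice; σu and σv are taken by central differences, and the answer should be 4π:

```python
import math

def sigma(u, v):
    # unit sphere: σ(u, v) = (sin u cos v, sin u sin v, cos u)
    return (math.sin(u)*math.cos(v), math.sin(u)*math.sin(v), math.cos(u))

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0])

def norm(a):
    return math.sqrt(sum(x*x for x in a))

h = 1e-6
M = 400
du, dv = math.pi/M, 2*math.pi/M
area = 0.0
for i in range(M):
    u = (i + 0.5)*du
    for j in range(M):
        v = (j + 0.5)*dv
        # partial derivatives σ_u, σ_v by central differences
        su = tuple((p - q)/(2*h) for p, q in zip(sigma(u+h, v), sigma(u-h, v)))
        sv = tuple((p - q)/(2*h) for p, q in zip(sigma(u, v+h), sigma(u, v-h)))
        area += norm(cross(su, sv)) * du * dv

print(area, 4*math.pi)  # ≈ 12.566
```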
Given a parametrized surface σ as above and an invertible C¹ map H : A → B,
then σ̃ = σ ◦ H is a reparametrization of σ via the change of parameter H.
The next result is the analogue of Proposition 4.7 for curves:
Theorem 5.30. (Area is invariant under reparametrization.) Suppose A ⊆ R² is a domain
and H : A → B is C¹ and invertible. Then for σ̃ = σ ◦ H,
area(σ̃) = area(σ).
Proof. We are to show that
∬_A ||σ̃s ∧ σ̃t|| ds dt = ∬_B ||σu ∧ σv|| du dv.
We know from the change-of-variables formula for double integrals that for F :
B → R, defining F̃ = F ◦ H, then
∬_A F̃(s, t) |det DH(s, t)| ds dt = ∬_A F ◦ H(s, t) |det DH(s, t)| ds dt = ∬_B F(u, v) du dv.
Proof. We follow the above proof, again using the change-of-variables theorem for
double integrals.
Given a parametrized surface σ as above, with its image the parametrized surface
S = σ(B), and a function G : S → R, we define the surface integral of G over σ to
be:
∬_S G(v) dA = ∬_B G(σ(u, v)) ||σu ∧ σv|| du dv.
Theorem 5.31 shows this is indeed well-defined as it is invariant with respect to change
of parametrization.
This notation is analogous to the line integral of second type in Def. 5.2:
∫_γ f(v) ds ≡ ∫_a^b f(γ(t)) ||γ′(t)|| dt
where we called ds = ||γ′(t)|| dt the infinitesimal arc length, and this integral is
integration with respect to arc length. Here we call dA = ||σu ∧ σv|| du dv the area
form, and this is an integral with respect to area.
5.10. Integrals over parametrized submanifolds. Let us note that the above
formula for the surface area of a parametrized surface σ can be written in a different way,
given our definition of Det(A) for a rectangular matrix:
area(σ) = ∬_B ||σu ∧ σv|| du dv = ∬_B Det(Dσ) du dv.
In this integral, we are not keeping track of orientation of the surface, since surface
area is always positive (Det M = √(det(MᵗM)) ≥ 0) and the integral of a positive function
with respect to dA is always positive. This is just like a line integral of second type.
When we wish to include orientation, we use instead the notion of a two-form on
R³ (or a one-form for line integrals).
The above formula, and the proof of invariance for change of parameter, extends
immediately to the situation of a k-dimensional space inside of Rⁿ:
Definition 5.17. Given a domain (a connected open set) B ⊆ Rᵏ and a C¹ map
ϕ : B → S ⊆ Rⁿ such that every u ∈ B is a regular point, i.e. such that Det(Dϕ) ≠
0, equivalently Dϕ has maximal rank (= k), then as above, ϕ is a parametrized
submanifold of dimension k. The regularity condition guarantees that the tangent
space to ϕ exists at that point.
We define the k-dimensional volume of ϕ to be
vol(ϕ) = ∫_B Det(Dϕ) du
where du = dx1 dx2 . . . dxk.
This is just like the case in one dimension where given f : [a, b] → R and a function
F satisfying F′ = f, then
∫_a^b f(x) dx = F(b) − F(a).
There is however one important difference: for f Riemann integrable there always
exists such a primitive or antiderivative F, while in higher dimensions this only works
if the field F is conservative, in which case ϕ is called a potential function.
Equivalently, in differential form notation, say for F = (P, Q), the form η =
P dx + Q dy has a primitive ϕ such that dϕ = η. This leads to the nice formula
∫_γ η = ∫_γ P dx + Q dy = ∫_γ dϕ = ∫_{∂γ} ϕ = ϕ(B) − ϕ(A),
saying that these operators are adjoints. (Note that this is indeed analogous to the
definition of the transpose, or adjoint, of a linear operator!)
The first difficulty hidden by this simple notation is all in the definitions, which are
equally abstract and deep. The second difficulty comes in bridging the abstraction
to the concrete versions of Vector Calculus in R² and R³.
We mention two auxiliary points which come up in all these settings. The basic
theorem is Stokes' Theorem, which can be thought of as (and indeed can be called) the
Fundamental Theorem of Vector Calculus.
We shall need:
Definition 5.18. A differential k-form η is closed iff dη = 0.
It is exact iff there exists a (k − 1)-form α such that dα = η.
Lemma 5.34. If dα = η then dη = 0. Thus, d(dα) = 0. That is, an exact form is
closed.
In fact, we have seen a special case of this in Proposition 5.14, that ∇ × (∇ϕ) = 0.
The two other results are these:
Theorem 5.35. (Poincaré Lemma) On a simply connected domain, a closed form is
exact.
Thus the Poincaré Lemma says that for a topologically nice domain (simply con-
nected), a primitive always exists. We have seen this in a special case: in R³, on a
simply connected domain, for the dual vector field F, if curl(F) = 0 then there
exists ϕ such that ∇ϕ = F, thus F has a potential. And ∇ϕ = F iff dϕ = η = Σ Pi dxi.
The second related result is:
Theorem 5.36. (Hodge Decomposition) On a simply connected domain, every dif-
ferential form can be uniquely written as the sum of a closed form and an exact form.
For vector fields in Rn , we say:
Definition 5.19. A vector field F is divergence-free or incompressible iff div(F ) = 0.
It is curl-free or conservative or irrotational iff curl(F ) = 0.
The Hodge decomposition then gives:
Theorem 5.37. (Helmholtz Decomposition) On a simply connected domain, every
vector field which vanishes fast enough at ∞ can be uniquely written as the sum of
two vector fields, one divergence-free and one curl-free.
Corollary 5.38. A vector field on a simply connected domain, which vanishes fast
enough at ∞, is determined by its divergence and its curl.
Proof. By the Helmholtz Decomposition, our field F = Fd + Fc where Fd is curl-
free and Fc is divergence-free. Then curl(F) = curl(Fc) + curl(Fd) = curl(Fc) and
div(F) = div(Fc) + div(Fd) = div(Fd). So curl(F) determines the curl of Fc and
div(F) determines the divergence of Fd; given the decay at ∞, these determine Fc
and Fd, hence F.
For vector fields on a simply connected domain in Rⁿ, there are two versions of
Poincaré's Lemma. The first says that a curl-free vector field has a potential, hence
is conservative: if curl(F) = 0 then there exists ϕ such that ∇ϕ = F.
The second statement is:
Theorem 5.39. If div(F ) = 0, then there exists a field A such that curl(A) = F .
For the proof we need:
Lemma 5.40. (Derivative under the Integral) Suppose for U ⊆ R² open that f : U →
R is continuous, and that ∂f/∂y exists and is continuous. Define ϕ(y) = ∫_a^b f(x, y) dx.
Then
ϕ′(y) = (d/dy) ∫_a^b f(x, y) dx = ∫_a^b (∂f/∂y)(x, y) dx.
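Lemma 5.40 can be checked numerically; the integrand f(x, y) = e^{−y x²} on [0, 1] and the point y0 = 0.8 below are invented sample choices:

```python
import math

def f(x, y):
    return math.exp(-y*x*x)

def df_dy(x, y):
    # ∂f/∂y computed by hand
    return -x*x*math.exp(-y*x*x)

N = 50_000
def phi(y):
    # ϕ(y) = ∫_0^1 f(x, y) dx, midpoint rule
    return sum(f((k + 0.5)/N, y) for k in range(N)) / N

y0 = 0.8
h = 1e-4
lhs = (phi(y0 + h) - phi(y0 - h)) / (2*h)                 # ϕ′(y0) by difference quotient
rhs = sum(df_dy((k + 0.5)/N, y0) for k in range(N)) / N   # ∫_0^1 ∂f/∂y dx
print(lhs, rhs)  # agree
```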
Defining the curve γ(t) = (t², t), then h(t) = ϕ(γ(t)), for the function ϕ(x, y) =
∫_0^x e^{−y u²} du. We then apply the Chain Rule:
h′(t) = ∇ϕ|_{γ(t)} · γ′(t) = (∂ϕ/∂x)|_{γ(t)} x′(t) + (∂ϕ/∂y)|_{γ(t)} y′(t)
= e^{−t·t⁴} · 2t + ∫_0^{t²} −u² e^{−tu²} du = 2t e^{−t⁵} − ∫_0^{t²} u² e^{−tu²} du.
Now from the Lemma, taking the derivative inside the integral, we can check the first
component of curl(A) = (Ny − Mz, Lz − Nx, Mx − Ly). With L = 0,
N(x, y, z) = − ∫_{t=x0}^x Q(t, y, z) dt,  M(x, y, z) = ∫_{t=x0}^x R(t, y, z) dt + d(y, z),
we get
Ny − Mz = − ∫_{t=x0}^x (∂Q/∂y)(t, y, z) dt − ∫_{t=x0}^x (∂R/∂z)(t, y, z) dt − (∂/∂z) d(y, z).
Since div(F) = ∂P/∂x + ∂Q/∂y + ∂R/∂z = 0, the integrand equals ∂P/∂x, so this is
∫_{t=x0}^x (∂P/∂x)(t, y, z) dt − (∂/∂z) d(y, z) = P(x, y, z) − P(x0, y, z) − (∂/∂z) d(y, z).
For this to equal P we need
(∂/∂z) d(y, z) = −P(x0, y, z).
So we simply define
d(y, z) = − ∫_{r=z0}^z P(x0, y, r) dr,
giving the first component of the solution, with d defined up to a constant.
The second and third components are immediate: Lz − Nx = −Nx = Q and
Mx − Ly = Mx = R.
Putting these together, we have shown that given F = (P, Q, R) with div(F) = 0,
then for A = (L, M, N) defined by
L = 0
M(x, y, z) = ∫_{t=x0}^x R(t, y, z) dt − ∫_{r=z0}^z P(x0, y, r) dr
N(x, y, z) = − ∫_{t=x0}^x Q(t, y, z) dt
we have
curl(A) = F.
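The construction can be tested numerically. In this sketch the divergence-free field F = (y, z, x) is an invented sample; A = (0, M, N) is built from the integral formulas with x0 = z0 = 0, using the sign convention that makes the standard curl (Ny − Mz, Lz − Nx, Mx − Ly) come out to F, and curl(A) is then approximated by central differences:

```python
# sample divergence-free field F = (P, Q, R) = (y, z, x)
def P(x, y, z): return y
def Q(x, y, z): return z
def R(x, y, z): return x

x0, z0 = 0.0, 0.0
N_steps = 4000

def quad(g, lo, hi):
    # midpoint-rule quadrature
    h = (hi - lo)/N_steps
    return sum(g(lo + (k + 0.5)*h) for k in range(N_steps)) * h

def A(x, y, z):
    # A = (L, M, N), with L = 0, from the integral formulas
    M = quad(lambda t: R(t, y, z), x0, x) - quad(lambda r: P(x0, y, r), z0, z)
    Ncomp = -quad(lambda t: Q(t, y, z), x0, x)
    return (0.0, M, Ncomp)

def curl(f, pt):
    h = 1e-5
    def d(i, j):
        p1 = list(pt); p1[j] += h
        p2 = list(pt); p2[j] -= h
        return (f(*p1)[i] - f(*p2)[i]) / (2*h)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))

p = (0.7, -1.2, 0.4)
print(curl(A, p))           # ≈ (y, z, x) at p
print(P(*p), Q(*p), R(*p))  # (-1.2, 0.4, 0.7)
```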
(ii) This implies the map is conformal: angles and orientation are preserved infinites-
imally. By contrast, an anticonformal map preserves angles but reverses orientation;
the simplest example is z ↦ z̄ where for z = a + ib, its complex conjugate is z̄ = a − ib.
A general antiholomorphic map is given by a holomorphic map preceded or followed
by complex conjugation, so the R²-derivative is a rotation composed with a reflection
in a line through (0, 0). Note that for both conformal and anticonformal maps, in-
finitesimal circles are taken to infinitesimal circles (not ellipses, which is the general
case).
(iii) A function is (complex) analytic iff it has a power series expansion near a point.
In particular, knowing a function has one continuous complex derivative, i.e. is
C¹, implies, very differently from the real case, that it is not only infinitely continuously
differentiable (C∞) but has a power series (is Cω).
Remark 5.14. Recalling Definition 5.20, if f : C → C is a complex analytic function,
with f = u + iv, then this defines a vector field F = (u, v) on R². We note that in
this case the field F has a special form:
DF = [ ux  uy ]   [ a  −b ]
     [ vx  vy ] = [ b   a ]
since f is analytic iff it is complex differentiable, meaning that f′(z) is a complex
number w = a + ib = re^{iθ}, giving a dilation times a rotation. This proves the
Cauchy-Riemann equations ux = vy, uy = −vx.
Now the line integral ∫_γ F · dγ is closely related to the contour integral of f over γ,
written ∫_γ f. The beginnings of the theory are developed in parallel; see e.g. [MH87]
p. 95 ff. In particular, the winding number can be defined using a contour integral.
Of course this is only a starting point for the deep and beautiful subject of Complex
Analysis.
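For a concrete instance of Remark 5.14, take f(z) = z², so u = x² − y² and v = 2xy. This sketch checks the Cauchy-Riemann equations at an arbitrary sample point by central differences:

```python
# f(z) = z² gives u = x² − y², v = 2xy; check u_x = v_y and u_y = −v_x
def u(x, y): return x*x - y*y
def v(x, y): return 2*x*y

h = 1e-6
x0, y0 = 1.3, -0.7  # sample point
ux = (u(x0+h, y0) - u(x0-h, y0))/(2*h)
uy = (u(x0, y0+h) - u(x0, y0-h))/(2*h)
vx = (v(x0+h, y0) - v(x0-h, y0))/(2*h)
vy = (v(x0, y0+h) - v(x0, y0-h))/(2*h)
print(ux, vy)   # both ≈ 2·x0 = 2.6
print(uy, -vx)  # both ≈ −2·y0 = 1.4
```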
Definition 5.21. A function u : U → R is harmonic iff u is C² and uxx + uyy = 0.
We define a linear operator ∆, also written as ∇² and called the Laplacian, on the
vector space C²(U, R) by ∆(u) = uxx + uyy. So u is harmonic iff ∆(u) = 0, iff u is in
the kernel of the operator.
The reason for the notation ∇² is because it is notationally suggestive, as we can
think of it as a dot product: ∇²ϕ = (∇ · ∇)(ϕ) = ∇ · (∇ϕ) = (∂/∂x, ∂/∂y) · (ϕx, ϕy) = ϕxx + ϕyy.
Figure 22. Level curves for the real and imaginary parts of f (z) =
z 2 (z − 1)2 .
Figure 23. Level curves for the real and imaginary parts of f (z) = z 3 .
Since this is a rotation followed by a dilation, this matrix has a special form. Writing
f = u + iv, then thought of as a map F of R², this is the vector field F = (u, v).
Figure 24. Equipotential curves and lines of force for the electrostatic
field of a single charge in the plane. The equipotentials are level curves
for the potential function ϕ and change color as the angle increases from
0 to π and again from π to 2π. This depends on the formula chosen for
ϕ and the “color map” chosen for the graphics. In complex terms, the
complex log function is f (z) = log(z) and for z = reiθ with θ ∈ [0, 2π)
then f (z) = log(reiθ ) = log(r) + log(eiθ ) = log(r) + iθ = u + iv with
harmonic conjugates u(x, y) = log(r) and v(x, y) = θ. We see the level
curves in the Figure; they form a spiral staircase. See Fig. 20.
Example 6. Consider f(z) = z² = u + iv. The gradient fields are F(v) = Av and
F̃(v) = Ãv, where

A = ( 1   0 )
    ( 0  −1 )

and

Ã = ( 0  1 )
    ( 1  0 ),

for the potentials u and v respectively.
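The claim can be checked numerically; here is a small sketch (not from the text: the helper `grad` and the explicit factor of 2, which the matrices above absorb into the constant, are my additions). For f(z) = z² we have u = x² − y² and v = 2xy, so ∇u = 2A(x, y) and ∇v = 2Ã(x, y).

```python
# Sketch (assumptions: f(z) = z^2, so u = x^2 - y^2 and v = 2xy; the
# factor 2 is made explicit here, while the text absorbs constants).
def grad(g, x, y, h=1e-6):
    """Central-difference gradient of g at (x, y)."""
    return ((g(x + h, y) - g(x - h, y)) / (2 * h),
            (g(x, y + h) - g(x, y - h)) / (2 * h))

u = lambda x, y: x * x - y * y      # Re(z^2)
v = lambda x, y: 2 * x * y          # Im(z^2)

x, y = 0.7, -1.3
gu, gv = grad(u, x, y), grad(v, x, y)
# grad u = 2*(x, -y) = 2*A*(x, y); grad v = 2*(y, x) = 2*Atilde*(x, y)
assert abs(gu[0] - 2 * x) < 1e-6 and abs(gu[1] + 2 * y) < 1e-6
assert abs(gv[0] - 2 * y) < 1e-6 and abs(gv[1] - 2 * x) < 1e-6
# the two gradient fields are orthogonal at every point:
assert abs(gu[0] * gv[0] + gu[1] * gv[1]) < 1e-5
```

The orthogonality assertion is the dual-families phenomenon seen in the level-curve figures.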
5.15. Electrostatic and gravitational fields in the plane and in R3 . The same
geometry (with dual, orthogonal families of level curves) happens for electrostatic
fields: one family is the equipotentials (curves or surfaces, depending on the dimension)
while the other depicts the lines of force: flow lines tangent to the force vector field.
See the Figures.
Figure 25. Equipotential curves and lines of force for the electrostatic
field of two opposite charges in the plane. Colors indicate different
levels of the potential and dual potential, where these are the harmonic
conjugates coming from the associated complex function g(z) = f (z) −
f (z − 1) = log(z) − log(z − 1). These harmonic functions are u(x, y) −
u(x − 1, y) and v(x, y) − v(x − 1, y).
When the opposite charges of Fig. 25 get closer and closer, the behavior approximates
that of an electrostatic dipole; see Figs. 26, 30. The charges would cancel out
if we placed one on top of the other, but if we take a limit of the fields as the distance d
goes to 0 while the charges c grow so that the product dc remains constant,
then the limit of the fields (and potentials) exists. Note there is a limiting vector
from plus to minus, along the x-axis. The picture is for the case of charges in the
plane.
We note here that the pictures are unchanged by this sort of normalization, since:
Lemma 5.47.
(i) If F is a conservative field on Rⁿ with potential function ϕ, then the collection of
equipotential curves (or dimension (n − 1) submanifolds) is the same as for the field
aF, a ≠ 0.
(ii) If γ is a line of force for F, then γ is orthogonal to each equipotential submanifold.
Proof. (i) We have: ∇ϕ = F iff ∇(aϕ) = aF, and the level curve of level c corresponds
to that of level ac.
(ii) A line of force for F is a curve γ with the property that F(γ(t)) = γ′(t), i.e. γ is
tangent to the field everywhere (it is an orbit of the flow for the ODE). If η is a curve
lying in an equipotential, then ϕ(η(t)) = c, so (ϕ ◦ η)′(t) = 0; but by the Chain Rule,
(ϕ ◦ η)′(t) = ∇ϕ(η(t)) · η′(t) = F(η(t)) · η′(t), so the field, and hence each line of
force, is orthogonal to the equipotential.
Figure 26. Equipotential curves and lines of force for the electrostatic
field of two unlike charges, now closer together.
That the pictures converge (of both the equipotentials and field lines) looks clear
from the figures, but to have the fields and potentials converge we need this normal-
ization.
The potential function shown is
u(x, y) = (1/d) log( ((x + d)² + y²) / ((x − d)² + y²) )
for d = 1, .5, .05.
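Numerically one can watch this convergence (a sketch of mine, not from the text: the limit value 4x/(x² + y²), obtained by differentiating the potential with respect to d at d = 0, is an assumption of this check).

```python
import math

# u is the normalized two-charge potential above; as d -> 0 it should
# approach the dipole potential 4x/(x^2 + y^2) (my computed limit).
def u(d, x, y):
    return (1.0 / d) * math.log(((x + d) ** 2 + y ** 2) /
                                ((x - d) ** 2 + y ** 2))

x, y = 1.0, 2.0
dipole = 4 * x / (x ** 2 + y ** 2)
errors = [abs(u(d, x, y) - dipole) for d in (0.5, 0.05, 0.005)]
assert errors[0] > errors[1] > errors[2]   # convergence as d shrinks
assert errors[2] < 1e-3
```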
Dipoles (both electric and magnetic) are useful in applications to electrical
engineering and are intriguing mathematically.
We mention that the geometry of fields in two-dimensional space has practical
relevance: for example, the magnetic field generated by electric current passing through
a wire (in the form of a line) decreases like 1/r, as we can think of the field as
being in the plane perpendicular to the wire. For fascinating related material see the
Wikipedia article on Ampère's circuital law.
Experiments show that the force between two charged particles with charges q₁, q₂ ∈
R, with position difference given by a vector v ∈ R³, is
(q₁q₂/r²) · (v/||v||),   r = ||v||
(so it is positive, hence repulsive, if the charges have the same sign).
An intuitive explanation for the factor of 1/r² is this: suppose we have a light bulb
at the origin and we want to calculate the light density at distance r; the light consists
of photons, and the number emitted per second is the same as the number that pass
through a sphere of radius r, whose area is 4πr², so the density decreases like 1/r².
Another way to say this is that we are counting the number of field lines per unit
area. Both the electrostatic field of a single charge and gravity (which is simpler, as
there is no negative gravity) are mediated by radiating particles and so should decrease
in the same way.
We claim that the attractive potential ϕ of a single charge in R³ is
ϕ = 1/r = (x² + y² + z²)^(−1/2).
Since the force field is then F = ∇ϕ, we have F = (P, Q, R) where
P = −x/(x² + y² + z²)^(3/2)
and similarly for Q, R. The field strength at (x, y, z) is then
||F(x, y, z)|| = ||(x, y, z)||/||(x, y, z)||³ = 1/r²
as we wanted.
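A numerical spot check of this computation (my own sketch; `phi` and `grad3` are ad hoc helper names, not from the text):

```python
import math

# phi = 1/r in R^3; its gradient should be -(x, y, z)/r^3, with norm 1/r^2.
def phi(x, y, z):
    return (x * x + y * y + z * z) ** -0.5

def grad3(g, p, h=1e-6):
    """Central-difference gradient of g at the point p = (x, y, z)."""
    out = []
    for i in range(3):
        q1, q2 = list(p), list(p)
        q1[i] += h
        q2[i] -= h
        out.append((g(*q1) - g(*q2)) / (2 * h))
    return out

p = (1.0, 2.0, 2.0)                        # here r = 3
r = math.sqrt(sum(c * c for c in p))
F = grad3(phi, p)
for Fi, ci in zip(F, p):
    assert abs(Fi - (-ci / r ** 3)) < 1e-6     # P = -x/r^3, etc.
assert abs(math.sqrt(sum(c * c for c in F)) - 1 / r ** 2) < 1e-6
```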
We are thinking of a single large charge being tested by a small charge; we are not
yet calculating the resulting field of two equal charges (or the gravitational field of
two equal mass objects).
In two dimensions, the math is very different: the field strength should now be
proportional to 1/r, as it is inversely proportional to the circumference of a circle,
2πr.
Figure 28. Equipotential curves and lines of force for the electrostatic
field of a planar dipole: two unlike charges very close together. The
potential is (1/d) log(((x + d)² + y²)/((x − d)² + y²)) for d = 0.5.
Thus in R², for a single unit charge particle at the origin, we claim that the potential
is
ϕ(x, y) = (1/2) log(x² + y²)
for then the force field is
F = (P, Q) = ∇ϕ = ( x/(x² + y²) , y/(x² + y²) ),
which has norm
||F|| = ||(x, y)||/||(x, y)||² = 1/r,
as we wished.
The dual field is
F* = (−Q, P) = ( −y/(x² + y²) , x/(x² + y²) ),
which as we have seen in §5.4 has as potential the angle function Θ of Fig. 20, given
by ψ(x, y) = arctan(y/x) or ψ(x, y) = arccot(x/y) depending on the location (since
R² \ {0} is not simply connected).
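Both formulas are easy to confirm numerically (a sketch of mine; the helper `grad` is an assumption, and the check of ψ is restricted to x > 0 where arctan(y/x) is valid):

```python
import math

# phi = (1/2) log(x^2 + y^2) should have gradient (x, y)/r^2, and
# psi = arctan(y/x) should have gradient (-y, x)/r^2, the dual field.
def grad(g, x, y, h=1e-6):
    return ((g(x + h, y) - g(x - h, y)) / (2 * h),
            (g(x, y + h) - g(x, y - h)) / (2 * h))

phi = lambda x, y: 0.5 * math.log(x * x + y * y)
psi = lambda x, y: math.atan(y / x)        # valid for x > 0

x, y = 2.0, 1.0
r2 = x * x + y * y
F = grad(phi, x, y)
Fdual = grad(psi, x, y)
assert abs(F[0] - x / r2) < 1e-6 and abs(F[1] - y / r2) < 1e-6
assert abs(Fdual[0] + y / r2) < 1e-6 and abs(Fdual[1] - x / r2) < 1e-6
```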
The corresponding analytic function is
f (z) = log(z)
Figure 29. Equipotential curves and lines of force for the electrostatic
field of an approximate planar dipole: two unlike charges close together.
Figure 30. Equipotential curves and lines of force for the electrostatic
field of an approximate planar dipole: two unlike charges close together.
Figure 31. Equipotential curves and lines of force for the electrostatic
field of two like charges in the plane. Since for one charge at 0 the
associated complex function is f (z) = log(z) = u + iv, here it is g(z) =
f (z) + f (z − 1) = log(z) + log(z − 1). The equipotentials and field lines
are respectively the level curves for the harmonic conjugates u(x, y) +
u(x − 1, y) and v(x, y) + v(x − 1, y).
and for z = re^(iθ) with θ ∈ [0, 2π), then f(z) = log(re^(iθ)) = log(r) + log(e^(iθ)) = log(r) +
iθ = u + iv, giving the harmonic conjugates u(x, y) = log(r) and v(x, y) = θ, whose
level curves we see in Fig. 24.
This is the case of a single charge. In fact, when combining objects all we have to
do is add the two potentials, ϕ = ϕ1 + ϕ2 , and then the gradient will give the field.
See Figs. 25, 31 for the cases of two oppositely, and equally, charged particles.
That we sum the potentials means in two dimensions that we sum the associated
complex functions as well; for opposite charges we change one of the signs.
In this figure, we have depicted two sets of curves: the level curves of the potential ϕ
(the equipotentials), and the flow lines of the gradient field F = ∇ϕ (the lines of
force).
We can formulate this as a theorem; compare to Theorem 5.41 regarding analytic
functions:
Theorem 5.48. For an electrostatic field F = (P, Q) on the plane, P and Q
are harmonic conjugates, whence
(i) their gradient vector fields are orthogonal;
(ii) their families of level curves are orthogonal.
Further, the potential P and dual potential Q are (perhaps integral) linear combi-
nations of the log and argument (angle) functions on R2 . The corresponding analytic
functions are (integral) linear combinations of the complex log function.
Proof. For a finite combination of point charges at points pᵢ ∈ R² with charges qᵢ ∈ R,
the associated analytic function on C is f(z) = Σᵢ qᵢ log(z − zᵢ), where p = (x, y)
corresponds to z = x + iy.
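This superposition can be illustrated numerically (my sketch, not from the text; the unit charges at 0 and 1 follow Fig. 31, and `potential`/`grad` are hypothetical helper names): the gradient of Re Σᵢ qᵢ log(z − zᵢ) equals the sum of the single-charge fields.

```python
import cmath

# potential = Re(sum of q_i log(z - z_i)); its gradient should equal the
# superposition of the single-charge fields q_i * (p - p_i) / r_i^2.
def potential(z, charges):
    return sum(q * cmath.log(z - zi) for q, zi in charges).real

def grad(g, z, h=1e-6):
    # derivatives in the x and y directions of the plane C ~ R^2
    return ((g(z + h) - g(z - h)) / (2 * h),
            (g(z + 1j * h) - g(z - 1j * h)) / (2 * h))

charges = [(1.0, 0.0), (1.0, 1.0)]          # unit charges at z = 0 and z = 1
z = 2.0 + 1.0j
F = grad(lambda w: potential(w, charges), z)

expected = [0.0, 0.0]                       # sum of single-charge fields
for q, zi in charges:
    d = z - zi
    r2 = abs(d) ** 2
    expected[0] += q * d.real / r2
    expected[1] += q * d.imag / r2
assert abs(F[0] - expected[0]) < 1e-5 and abs(F[1] - expected[1]) < 1e-5
```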
Figure 32. Equipotential curves and lines of force for the electrostatic
field of two like charges in the plane. Close to the center, (1/2, 0), the
potential and its dual start to approximate the dual hyperbolas of
Fig. 21.
At first we may think that a potential such as the hyperbola shown in Fig. 21
cannot come from an electrostatic field. However, as Feynman Vol. II §7.3 [FLS64]
points out, it can (in the limit): the field in the exact middle of the two opposite charges
of Fig. 25 looks just like this. See Figs. 32, 33.
To prove this rigorously, take instead the charges at −1, 1 so now f (z) = log(z +
1) + log(z − 1). The Taylor expansion of this about the middle point 0 is −z 2 + . . .
and g(z) = z 2 is indeed the analytic function of Fig. 21. g determines harmonic
conjugates which define the linear vector fields depicted there.
Theorem 5.49. For gravitational fields in R2 , we have the same statement as The-
orem 5.48 except that now only positive values of the density function q can occur.
Figure 33. Equipotential curves and lines of force for the electrostatic
field of two like charges in the plane. Close to the center, (1/2, 0), the
potential and its dual start to approximate the dual hyperbolas of
Fig. 21.
In fact, according to Feynman, any harmonic function and hence any complex
analytic function can occur for a physical electrostatic field in R2 . One can prove this
as follows.
From the mathematical point of view, there are two equivalent ways to characterize
an electrostatic field (in R2 or in R3 ). The first is that the potential of the field is a
solution of Poisson’s equation,
∇2 (ϕ) = ρ
where ρ is a signed measure describing the distribution of charge. From this point
of view one can then go about solving this linear partial differential equation. The
second is to describe the fundamental solution, which is the single point charge with
its associated (gradient) field, and then define an electric field to be a (vector-valued
integral) linear combination of such fundamental solutions, integrated with respect
to the charge density.
From the first point of view what is fundamental is the PDE, from the second what
is most basic is the fundamental solution (this is Coulomb’s law!). What bridges the
two is the superposition principle, which simply says the space of solutions is a vector
space: we can take linear combinations.
In other words, for this linear equation knowing the fundamental solution charac-
terizes the infinite-dimensional vector space of all solutions. And conversely, one of
the methods for solving the PDE is to find its fundamental solution.
(For gravity the solution space is not all of the vector space but rather the positive
cone inside of it).
Now ∇(ϕ) = F is the field, so Poisson’s equation states that
∇2 (ϕ) = div(F ) = ρ.
Thus from the field or the potential we can determine the charge distribution. Ap-
plying the operator ∇ is a type of derivative; the opposite procedure is a type of
integration. Thus given the charge density ρ we find the field by solving the (partial)
differential equation divF = ρ, and given the field we find the potential by solving
the PDE
∇ϕ = F.
Combining these, given ρ we can find ϕ by solving the PDE
∇2 ϕ = ρ,
which is now a second order PDE as it involves second order partials.
The general operation of solving a DE is referred to as integration. As always,
differentiation is automatic, while integration can be hard! Mathematically speaking,
the first task is to prove that under certain circumstances a solution exists, and
conversely to try to identify any obstructions to having a solution. Such obstructions
are often especially interesting because they are topological; e.g. the equation ∇ϕ = F*,
for the dual field F* above, only has a solution on a simply connected U ⊆ R² \ {0}.
If there is no charge in a region U, then from Poisson's equation
∇²(ϕ) = ρ = 0
and the potential function ϕ is harmonic. Thus for Figs. 25, 31, the potential is
harmonic everywhere except exactly at those two points. At those points themselves the
potential is infinite and the field is not only infinite but points in all directions, so
neither is defined. When we have a continuous charge density, however, these are
defined everywhere. In that case, by Poisson's equation the potential is not harmonic,
as ∇²(ϕ) = ρ ≠ 0. When the charge density is continuous but nonzero, the field
and potential make perfect sense mathematically, being continuous functions, but the
potential is no longer a harmonic function, so it certainly cannot (in
R²) have a harmonic conjugate and does not extend to a complex analytic function.
Hence the tools of Complex Analysis are not as applicable. Nevertheless, there is still
a dual potential, whose level sets are orthogonal to those of ϕ, similar to the harmonic
case.
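As a numerical illustration of harmonicity away from the charge (a sketch of mine; the five-point finite-difference stencil in `laplacian` is an assumption of the check):

```python
import math

# Away from the origin, phi = (1/2) log(x^2 + y^2) should satisfy
# phi_xx + phi_yy = 0; we test this with a five-point finite-difference
# Laplacian.
def laplacian(g, x, y, h=1e-3):
    return (g(x + h, y) + g(x - h, y) + g(x, y + h) + g(x, y - h)
            - 4 * g(x, y)) / (h * h)

phi = lambda x, y: 0.5 * math.log(x * x + y * y)
assert abs(laplacian(phi, 1.3, -0.7)) < 1e-5
```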
To prove this (I believe, and would like to work this out!), we can again refer to
the fundamental solution; since it holds there, it must extend to all densities ρ.
But what “is” a point charge? From the mathematical point of view it is a point
mass, simply a measure concentrated at a point. In physics this is called a Dirac
delta function, which is the viewpoint of Riemann-Stieltjes integration. From the
standpoint of Lebesgue integration, it is a measure and not a function at all.
Then we know how to rigorously treat two cases: point masses and continuous
densities. Similarly, one can include densities given by any other Borel measures.
I say “density” rather than “distribution” here because that word will immediately
get used in a very different way! That is the yet more sophisticated viewpoint of
Laurent Schwartz’ theory of distributions, see e.g. [Rud73]. Roughly speaking a
Schwartz distribution is a continuous linear functional defined on a carefully chosen
space of test functions which are smooth and rapidly decreasing. This enables one to
define derivatives, by duality. Thus the advantage of Schwartz distributions is that
they can be differentiated and also convolved: if one finds a fundamental
solution to be a Schwartz distribution, the general solution is found by convolving
it with the density. This is exactly what we have described above.
For the simplest case of the fields described above we can get away with point
masses, but for more sophisticated examples we really do need Schwartz distributions.
This is the case when we consider dipoles, but that is beyond the present scope.
For a clear overview of the physics, see the beginning of Jackson’s text [Jac99];
this however goes quickly into much (much) deeper material, including boundary
values, dipoles, Green’s functions, and magnetism, dynamics and the connections
with Special Relativity.
For a remarkable mathematical treatment see Arnold’s book on PDEs: [Arn04].
Now to sketch a proof of Feynman’s claim, given a harmonic function, we define a
field to be the gradient of this potential. Given a field, we find such a potential. ...
Finding a potential
Next we see (by working out some examples) how to find the potential of a conser-
vative vector field.
We know that given F : Rn → R, the gradient vector field is orthogonal to the
level hypersurfaces (submanifolds of dimension n − 1) of F , so level surfaces in R3
and level curves in R2 .
We know that a vector field is conservative iff it is the gradient of a function. There
are two ways that this can fail to be the case: locally or globally.
We want to first examine the local problem: when is a vector field locally conser-
vative?
Switching equivalently to the language of differential forms, the vector field V is
conservative iff the associated 1-form η is exact. We know that a necessary condition
for this to occur is that the form be closed, i.e. that d(η) = 0. In R² or R³ this is the
same as curl = 0.
Poincaré’s Lemma tells us that locally, the converse holds: any closed form is
exact. A basic counterexample for the global exactness is the angle function on the
plane: there is a local potential (the infinite spiral staircase) but this is a multivalued
function, so not a potential in the usual sense.
Here is a method to try to find a potential for any vector field in the plane. Given a
nowhere-0 vector field V, we want to find a potential ϕ, that is, a function ϕ : R² → R
such that ∇ϕ = V. In this case, its level curves are orthogonal to V. So, let us
consider the orthogonal vector field W to V, say at angle +π/2. Then, using the
Fundamental Theorem of ODEs, draw the integral curves. These are unique, hence
do not intersect. Globally they might, say, be spirals; we can define a function ϕ
taking a different value on each. Thus ϕ is a candidate for a potential.
We can see an example in the illustrations of the electrostatic potentials, Figs. 31,
25.
There are two families of curves: the equipotentials and the lines of force.
The lines of force are tangent to the gradient vector field. For opposite charges,
we can picture the gradient flow as flowing from the positive to the negative charge.
In fact, we can interpret this as a gravitational field, with a mountain at the positive
and a valley at the negative charge. For like charges, we can picture two mountains.
It is important to remember that there are two quite different interpretations, as
force fields or as velocity fields. The gradient flow refers to the velocity field, and a
particle moves along the curve with that tangent vector. For the force field interpre-
tation, the particle accelerates and may go off the curve because of the acceleration
due to the curvature.
In any case, we can try to imagine switching roles, so the equipotential curves
become the orbits of a gradient flow and vice-versa.
If this works, we will have succeeded in constructing a potential for our vector field.
....
5.16. The role of differential forms. (To DO)......
Exponential growth.
(i) a^(b+c) = a^b a^c
and
(ii) a^(bc) = (a^b)^c.
An exponential function is of the form f(x) = a^x for a > 0 and x ∈ R. The number
a is termed the base and x the exponent. By the above rules, 1/a = a^(−1) and (a^(1/2))² = a,
whence √a = a^(1/2). This makes it easy to define a^x for rational exponent x.
(1) First, we can use continuity: it can be proved that there is a unique continuous
way to extend this function to the reals. That is, we can approximate x by rational
numbers and take the limit.
(2) From the Fundamental Theorem of Ordinary Differential Equations, there exists
a unique solution to the differential equation (or DE) y′ = y satisfying y(0) = 1 (this
is called an initial condition for the DE). We define this function to be exp(t) = y(t),
and then define the number e to be exp(1). This is also the slope of e^x at x = 0.
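Definition (2) can be made concrete with Euler's method (a sketch, not from the text; `euler_exp` is a name of mine): stepping y′ = y from y(0) = 1 up to t = 1 approximates e.

```python
import math

# Euler's method for y' = y, y(0) = 1: after n steps of size h = t/n,
# y(t) is approximated by (1 + h)^n, which converges to exp(t).
def euler_exp(t, n):
    y, h = 1.0, t / n
    for _ in range(n):
        y += h * y          # the DE y' = y
    return y

assert abs(euler_exp(1.0, 100000) - math.e) < 1e-4
```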
We need:
Convergence of the Taylor series for the exponential function.
Theorem 6.1. The series for e^x in (26) converges for all x ∈ R. The corresponding
series where x is replaced by a complex number z ∈ C converges, as does the series
where x is replaced by a square matrix.
Proof. For x fixed, let m > 2x, so x/m < 1/2. Then for any n > 0,
x^(n+m)/(n + m)! ≤ (x^m/m!) · (1/2)^n,
which gives a geometric series, hence converges. Thus the sequence of partial sums is
an increasing bounded sequence, hence converges by the completeness property of the
real numbers.
Similarly, using the fact that |zw| = |z||w|, the series
exp(z) = 1 + z + z²/2 + z³/6 + · · · + z^n/n! + . . .
converges for all z ∈ C.
For square matrices we need the following: we define a norm on the linear space
L(V, W) of all linear operators from one normed space V to another W by ||A|| =
sup_{v∈V} ||Av||/||v|| = sup_{||v||=1} ||Av||. This is called the operator norm. One of its
main useful properties is that it (clearly) behaves nicely under composition: ||AB|| ≤
||A|| ||B||. This is called submultiplicativity.
In particular this holds for square matrices A, B. Using this, just as for complex
numbers, the series
exp(A) = I + A + A²/2 + A³/6 + · · · + A^n/n! + . . .
converges for all A ∈ M(d×d), the collection of square matrices (with entries in R or
C).
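The matrix series can be summed directly; this sketch (mine, with ad hoc helpers `mat_mul` and `mat_exp`) checks the partial sums against the rotation matrices that appear below for the equation w′ = Aw.

```python
import math

# Partial sums of exp(A) = I + A + A^2/2! + ... for 2x2 matrices.
def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_exp(A, terms=30):
    S = [[1.0, 0.0], [0.0, 1.0]]   # running sum, starting at I
    T = [[1.0, 0.0], [0.0, 1.0]]   # current term A^k / k!
    for k in range(1, terms):
        T = [[x / k for x in row] for row in mat_mul(T, A)]
        S = [[S[i][j] + T[i][j] for j in range(2)] for i in range(2)]
    return S

t = 1.2
E = mat_exp([[0.0, -t], [t, 0.0]])     # exp of t times the rotation generator
R = [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]
assert all(abs(E[i][j] - R[i][j]) < 1e-9 for i in range(2) for j in range(2))
```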
Note that the derivative of the series for f(x) = e^x, taken term-by-term, does satisfy
the DE y′ = y, y(0) = 1, so (3) yields (2). Conversely, knowing the derivative from
(2) gives (3) as the Taylor series: recall that for an infinitely differentiable function
f the Taylor series about 0 is
Σ_{k=0}^∞ f^(k)(0) x^k / k!.
We let ln(x) denote the natural logarithm, the inverse function g(x) of f(x) = e^x =
exp(x), so ln = log_e.
Another way to define ln is via integration: for x > 0,
ln(x) = ∫₁^x (1/t) dt.
Another possible definition for ln is via its Taylor series, calculated around the
value x = 1; in other words, find the series for ln(x + 1) around 0.
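The integral definition is easy to test numerically (a sketch; `ln_integral` and the choice of the midpoint rule are mine):

```python
import math

# ln(x) = integral from 1 to x of dt/t, approximated by the midpoint rule.
def ln_integral(x, n=10000):
    h = (x - 1.0) / n
    return sum(h / (1.0 + (i + 0.5) * h) for i in range(n))

assert abs(ln_integral(2.0) - math.log(2.0)) < 1e-7
```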
Figure 34. Slope field and solution curves for exponential growth y′ =
cy. The equation is in one dimension, and its flow is along the real line;
these curves are the graphs of those solutions, so including the time
variable. This can also be viewed as solutions to a vector ODE in
the plane, where the curves are tangent to the vector field V(x, y) =
(1, y). These solution curves are γ(t) = (t, y(t)), so γ′(t) = (1, y′(t)) =
V(γ(t)) = (1, y(t)). The difference between a slope field and a vector
field is this: segments in the slope field are parallel to the vector field
but meet the curves at their midpoints. The picture of the slope field
is often easier to understand, as it is much less cluttered, since the
segments are all of the same manageable length.
More examples:
y′(t) + y(t) = 1: a DE of first order
sin(y′(t)) + y^3(t) = t: an implicit DE of first order
y^(7)(t) + y^9(t) = t + sin(5t): a DE of order 7
y(t) + y^2(t) = t is not a DE (as there is no derivative involved!).
Examples:
- The DE y′(t) = 0 admits an infinite number of solutions. Indeed, y(t) = c for any
c ∈ R is a solution.
- The implicit DE (y′(t))² + 1 = 0 does not admit any real solution.
6.2. Flows, systems of DEs and vector differential equations. At this point it
will actually be better to consider an apparently more difficult problem: that of DEs
in higher dimensions, or vector differential equations.
Definition 6.2. Given a vector field V on Rⁿ, a vector differential equation of first
order with initial condition x ∈ Rⁿ is: γ′(t) = V(γ(t)), γ(0) = x.
τt+s = τs ◦ τt .
even and odd terms. Writing c = cos t and s = sin t, this gives:
exp(tA) = Σ_{k=0}^∞ (tA)^k/k! = I + tA + (tA)²/2 + (tA)³/3! + (tA)⁴/4! + · · ·
order-one vector DE (also in the nonautonomous case, by adding one more dimen-
sion). Geometrically, this means we have a velocity vector field, with an integral
curve exactly corresponding to a solution!
To carry this out in this case, we set w₁ = −y, w₂ = y′. We thus have the pair of
equations w₂′ = y″ = −y = w₁ and w₁′ = −y′ = −w₂, giving the system
w₁′ = −w₂
w₂′ = w₁
This can be written in matrix form, where w = (w₁, w₂), as
w′ = Aw (29)
so
( w₁′ )   ( 0  −1 ) ( w₁ )
( w₂′ ) = ( 1   0 ) ( w₂ ).
Definition 6.4. Equation (29), w′ = Aw, where A is an (n × n) matrix and w =
w(t) ∈ Rⁿ, is called a vector DE. Note that A defines a linear vector field on Rⁿ, and
w(t) is a curve in Rⁿ; it is a curve which is tangent to the vector field, as its tangent
vector at each point is given by the vector field. If the starting point of the curve w(0) = w₀
is specified, then we have a vector DE with this initial condition.
Remark 6.1. Now A defines a linear vector field on R². Since the variable for time
does not occur here, we had for y″ = −y an autonomous second-order equation,
and now we have an autonomous vector DE, equivalently an autonomous system of
two first-order equations. Physically, the variables (w₁, w₂) = (−y, y′) represent, for
the oscillator, minus the position, and the velocity (or momentum, since momentum = mv). The
vector solution w(t) = (cos t, sin t) gives the one-dimensional solution y(t) = cos t
for the original equation; the graph of the curve w(t) is the helix (t, cos t, sin t), which
projects to the position y(t) = cos t and velocity y′(t) = sin t.
In Physics, we often pass to (position, momentum) space, which is called phase
space.
Exercise 6.2. Solve the second order linear equation y″ = −y by the following
strategy: we define y′ = x and x′ = −y, giving a system of two equations of first
order. Then we rewrite the system in vector form x′ = Ax, for x = (x, y) as above.
Now solve this vector DE explicitly for initial condition x₀ = (a, b) and sketch the
solutions. Lastly, returning to the original equation y″ = −y, what are the solutions
y(t)?
Solution. The matrix is
A = ( 0  −1 )
    ( 1   0 ).
We have the solution
e^(tA) x₀ = R_t x₀ = ( cos t  −sin t ) ( a ) = ( a cos t − b sin t ) = ( x(t) )
                     ( sin t   cos t ) ( b )   ( a sin t + b cos t )   ( y(t) )
for the vector DE with initial condition x₀ = (a, b). The general solution for the
one-dimensional second order equation y″ = −y is therefore
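One can confirm directly (a numerical sketch of mine, not from the text) that this solution curve satisfies w′ = Aw and the initial condition:

```python
import math

# w(t) = (a cos t - b sin t, a sin t + b cos t) should satisfy
# w1' = -w2, w2' = w1 and w(0) = (a, b).
a, b = 2.0, -1.0
def w(t):
    return (a * math.cos(t) - b * math.sin(t),
            a * math.sin(t) + b * math.cos(t))

t, h = 0.9, 1e-6
w1, w2 = w(t)
dw = ((w(t + h)[0] - w(t - h)[0]) / (2 * h),    # central-difference w'
      (w(t + h)[1] - w(t - h)[1]) / (2 * h))
assert abs(dw[0] - (-w2)) < 1e-6 and abs(dw[1] - w1) < 1e-6
assert w(0.0) == (a, b)
```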
A = ( 0   1   0  · · ·  0 )
    ( 0   0   1  · · ·    )
    (             · · ·   )
    ( 0   0   0  · · ·  1 )
    ( a1  a2  a3 · · ·  an )
References
[Ahl66] Lars V. Ahlfors. Complex Analysis. McGraw-Hill, second edition, 1966.
[Arm83] M. A. Armstrong. Basic Topology, volume 8 of Undergraduate Texts in Mathematics.
Springer, 1983.
[Arn04] Vladimir Igorevich Arnold. Lectures on Partial Differential Equations. Springer, 2004.
[CB14] Ruel Churchill and James Brown. Complex Variables and Applications. McGraw-Hill, 2014.
[CJBS89] Richard Courant, Fritz John, Albert A. Blank, and Alan Solomon. Introduction to Calculus
and Analysis, volume 2. Springer, 1989.
[DC16] Manfredo P. do Carmo. Differential Geometry of Curves and Surfaces, revised and updated
second edition. Dover Publications, 2016.
[FLS64] R. P. Feynman, R. B. Leighton, and M. Sands. The Feynman Lectures on Physics, Vol. II.
Addison-Wesley, Reading, Massachusetts, 1964.
[GP74] Victor Guillemin and Alan Pollack. Differential Topology. Prentice-Hall, 1974.
[Gui02] Hamilton Luiz Guidorizzi. Um Curso de Cálculo, Vols. I–III. LTC, 5th edition, 2002.
[HH15] John H. Hubbard and Barbara Burke Hubbard. Vector Calculus, Linear Algebra, and
Differential Forms: A Unified Approach. Matrix Editions, 2015.
[Jac99] John David Jackson. Classical Electrodynamics. Wiley, third edition, 1999.
[Lan99] Serge Lang. Complex Analysis, volume 103 of Graduate Texts in Mathematics. Springer,
1999.
[Mar74] Jerrold E. Marsden. Elementary Classical Analysis. W. H. Freeman, 1974.
[MH87] Jerrold E. Marsden and Michael J. Hoffman. Basic Complex Analysis. W. H. Freeman,
second edition, 1987.
[MH98] Jerrold E. Marsden and Michael J. Hoffman. Basic Complex Analysis. W. H. Freeman,
third edition, 1998.
[Mil16] John Milnor. Morse Theory (AM-51), volume 51. Princeton University Press, 2016.
[MW97] John Milnor and David W. Weaver. Topology from the Differentiable Viewpoint, volume 21.
Princeton University Press, 1997.
[O'N06] Barrett O'Neill. Elementary Differential Geometry. Elsevier, 2006.
[Ros68] Maxwell Rosenlicht. Liouville's theorem on functions with elementary integrals. Pacific
Journal of Mathematics, 24(1):153–161, 1968.
[Rud73] W. Rudin. Functional Analysis. McGraw-Hill, New York, 1973.
[Spi65] Michael Spivak. Calculus on Manifolds. W. A. Benjamin, New York, 1965.
[War71] Frank W. Warner. Foundations of Differentiable Manifolds and Lie Groups, volume 94 of
Graduate Texts in Mathematics. Springer Verlag, 1971.
Albert M. Fisher, Dept Mat IME-USP, Caixa Postal 66281, CEP 05315-970 São
Paulo, Brazil
URL: http://ime.usp.br/~afisher
E-mail address: [email protected]