
LECTURE NOTES FOR VECTOR CALCULUS PARTS I AND II (CALCULUS 2 AND 3)

ALBERT M. FISHER

Contents
1. Review of Logic and Set Theory
2. Review of Linear Algebra
3. Review of Riemann Integration
4. Vector Calculus, Part I: Derivatives and the Chain Rule
4.1. Metrics, open sets, continuity
4.2. Curves
4.3. Arc length of a curve
4.4. Level curves of a function
4.5. Partial derivatives; directional derivative
4.6. Properties of the gradient of a function
4.7. Three types of curves and surfaces
4.8. The gradient vector field; the matrix form of the tangent vector and of the gradient
4.9. General definition of derivative of a map
4.10. The general Chain Rule
4.11. Level curves and parametrized curves
4.12. Level surfaces, the gradient and the tangent plane
4.13. Two definitions of the determinant
4.14. Orientation
4.15. Three definitions of the vector product
4.16. The Inverse and Implicit Function Theorems
4.17. Higher order partial derivatives
4.18. Finding maximums and minimums
4.19. The Taylor polynomial and Taylor series
4.20. Lagrange Multipliers
5. Vector Calculus, Part II: the calculus of fields, curves and surfaces
5.1. Vector Fields
5.2. The line integral
5.3. Conservative vector fields
5.4. Rotations and exponentials; angle as a potential
5.5. Line integral with respect to a differential form
5.6. Green's Theorem: Stokes' Theorem in the Plane
5.7. The Divergence Theorem in the plane
5.8. Surface area and the "determinant" of a rectangular matrix
5.9. Surface area and surface integrals
5.10. Integrals over parametrized submanifolds
5.11. The Divergence Theorem in space
5.12. Stokes' Theorem
5.13. Poincaré's Lemma: Existence of the vector potential
5.14. Analytic functions and harmonic conjugates
5.15. Electrostatic and gravitational fields in the plane and in R3
5.16. The role of differential forms
6. Ordinary differential equations
6.1. The classical one-dimensional case
6.2. Flows, systems of DEs and vector differential equations
References

Date: July 15, 2023.

1. Review of Logic and Set Theory


2. Review of Linear Algebra
3. Review of Riemann Integration
4. Vector Calculus, Part I: Derivatives and the Chain Rule

4.1. Metrics, open sets, continuity. Let us recall:


Definition 4.1. A norm || · || on a vector space V is a function with values in R which satisfies:
(i) ||av|| = |a| · ||v|| (homogeneity);
(ii) ||v + w|| ≤ ||v|| + ||w|| (triangle inequality);
(iii) ||v|| ≥ 0, and ||v|| = 0 iff v = 0 (positive definiteness).
Given a set X, a metric on X is a function d : X × X → [0, +∞] satisfying:
(i) d(x, y) = d(y, x) (symmetry);
(ii) d(x, z) ≤ d(x, y) + d(y, z) (triangle inequality);
(iii) d(x, y) ≥ 0, and d(x, y) = 0 iff x = y (positive definiteness).
We then say that (V, || · ||), respectively (X, d), is a normed space, respectively a
metric space.
Having a norm allows us to define a metric on V , with the distance between points
defined by d(v, w) = ||w − v||.
Exercise 4.1. Verify this!
A topology on X is a collection T of subsets of X (these will be called open sets)
satisfying:
(i) ∅, X ∈ T;
(ii) an arbitrary union of open sets is open;
(iii) a finite intersection of open sets is open.
A set C ⊆ X is closed iff its complement Cᶜ = X \ C = {x ∈ X : x ∉ C} is open.

The collection F of closed sets satisfies:


(i) ∅, X ∈ F;
(ii) an arbitrary intersection of closed sets is closed;
(iii) a finite union of closed sets is closed.
These properties of F are equivalent to the corresponding properties for T via the
laws for unions and intersections of complements of sets:
(A ∩ B)ᶜ = Aᶜ ∪ Bᶜ, (A ∪ B)ᶜ = Aᶜ ∩ Bᶜ,
and more generally,
(∩_{i∈I} Ai)ᶜ = ∪_{i∈I} (Ai)ᶜ, (∪_{i∈I} Ai)ᶜ = ∩_{i∈I} (Ai)ᶜ,
where I is some index set, for example N = {0, 1, 2, . . . } or even an uncountable set like R.
Exercise 4.2. Verify these statements!
For a metric space, a limit point of A ⊆ X is a point x ∈ X such that for each r > 0, there is some point of A in Br(x) (the ball defined below). For a metric space, a set C is closed iff it contains all of its limit points. Thus for example (a, b) is not a closed set, as a and b are limit points.
Having a metric allows us to define a topology on X, as follows.
Definition 4.2. Given a metric space (X, d), the (open) ball of radius r ∈ [0, ∞]
around x ∈ X is Br (x) = {y ∈ X : d(x, y) < r}.
A set U ⊆ X is open iff, equivalently,
(i) U is a union of open balls;
(ii) for every x ∈ U, ∃r > 0 such that Br(x) ⊆ U.
Exercise 4.3. Verify that this does give a topology.
Convergence and continuity.
Given a sequence (xn)_{n∈N}, we say (xn) converges to x, equivalently written lim_{n→∞} xn = x or xn → x, iff for every open set U containing x we have xn ∈ U for all n sufficiently large. For a metric space this is equivalent to: given r > 0, ∃N such that for all n > N, d(xn, x) < r (since we can use balls of radius r).
Definition 4.3. Given two metric spaces (X1, d1) and (X2, d2), then f : X1 → X2 is continuous iff
(i) if xn → x then f(xn) → f(x). (This is the usual definition for f : R → R.)
Equivalently, iff:
(ii) if f(x0) = y0, then given ε > 0, ∃δ > 0 such that if d1(x, x0) < δ then, for y = f(x), d2(y, y0) < ε. (This is the famous "ε-δ" definition.)
(iii) the inverse image of every open set is open.
This third definition works for the more general situation of two topological spaces,
(X, T ) and (Y, S).
Exercise 4.4.
(i) Use each of the three definitions to prove:

Proposition 4.1. A composition of continuous functions is continuous.


(ii) Show that (for metric spaces), all three definitions are equivalent. Hint: first
prove (i) ⇐⇒ (ii), then (ii) ⇐⇒ (iii).
Remark 4.1. For some unusual non-metric topological spaces one has to replace sequences by so-called nets or filters.
4.2. Curves.
Definition 4.4. A (parametrized) curve in a vector space V is a function γ : [a, b] →
V.
The image of the curve is the image of this function, i.e. the collection of all values:
{γ(t) : t ∈ [a, b]}. Thus the parameter t parametrizes the image.
The simplest example is a parametrized line: the curve l(t) = p + tv where p, v are elements of some vector space V. The image of l is a straight line in V; the parametrized line passes through the point p going in the direction v.
Note that the image of a curve is different from the graph. Here we recall that by definition the graph of a function f : X → Y (where X, Y can be any sets) is graph(f) ≡ {(x, y) ∈ X × Y : y = f(x)} = {(x, f(x)) : x ∈ X}. (Here X × Y is the product space, defined to be the collection of all ordered pairs.)
Thus the graph of the curve γ in V is {(t, γ(t)) : t ∈ [a, b]}, a subset of [a, b] × V .
The image shows where you go on the curve, but not how fast or in what direction you go along this image. We shall see shortly how one can change the parametrization of a curve, keeping the same image.
Let us suppose that V has a norm defined on it. Then V is a metric space with
d(v, w) = ||v − w||, so we know what it means for a curve to be continuous. We
define the derivative of γ to be
γ′(t) = lim_{h→0} (γ(t + h) − γ(t))/h
if the limit exists; this is also called the tangent vector to γ at time t. See Fig. 5.
Lemma 4.2. We have in coordinates: γ′(t) = (x1′(t), . . . , xm′(t)).
Proof. This is immediate from the definition. For example, in R2, for γ(t) = (x, y)(t) = (x(t), y(t)) we have
(γ(t + h) − γ(t))/h = ((x(t + h) − x(t))/h, (y(t + h) − y(t))/h) → (x′(t), y′(t)). □

Note that given a differentiable curve γ : [a, b] → Rm, then γ′ is a second curve in Rm. We can keep going and define the higher derivatives γ″ = (γ′)′ and so on, all curves in Rm.
The most common interpretation of the tangent vector of a curve comes from
physics. There we interpret t as time and γ as position:

Definition 4.5. If γ(t) represents the position of a particle at time t, then the derivative γ′ (the tangent vector) gives the velocity of the particle, v(t) = γ′(t), and the acceleration at time t is the vector a(t) = v′(t) = γ″(t). Note that all of these are vector quantities, having both a magnitude and a direction. The speed is the magnitude of the velocity vector, the scalar quantity ||v||.
Now we prove some basic facts about curves and their derivatives:
Proposition 4.3. (Leibnitz' Rule for curves) Given two differentiable curves γ, η : [a, b] → Rm, then (γ · η)′ = γ′ · η + γ · η′.
Proof. We just write the curves in coordinates, and apply Leibnitz' Rule (the Product Rule) for functions from R to R. □
Proposition 4.4. Let γ be a differentiable curve in Rm such that ||γ|| = c for some constant c. Then γ ⊥ γ′.
Proof. We use Leibnitz' Rule. We have c² = ||γ||² = γ · γ, so for all t,
0 = (γ · γ)′ = γ′ · γ + γ · γ′ = 2γ · γ′,
using commutativity of the inner product. □
The meaning of ||γ|| = c is intuitively clear: for R2 this says that the image of the curve lies on a circle; for R3, that the image of the curve lies on a sphere, and the statement is that the tangent vector to the curve is tangent to the sphere, as it is perpendicular to the position vector. See Fig. 5.
Corollary 4.5. If γ : [a, b] → Rm is twice differentiable and ||γ′|| is constant, then γ′ ⊥ γ″.
Proof. We just apply the Proposition to the curve γ′. □
Corollary 4.6. If γ : [a, b] → Rm is twice differentiable and represents the position of a particle at time t, then if the speed ||γ′|| is constant, the acceleration is perpendicular to the curve (i.e. a ⊥ v).
In other words if you are driving a car at a constant speed around a track, the only
acceleration you will feel is side-to-side. If you apply the brakes or the accelerator
pedal, a component vector of acceleration tangent to the curve will be added to this.
If we reparametrize a curve to have speed 1, then the magnitude of the acceleration
vector can be used to measure how much it curves: we explain this next.
Proposition 4.4 allows us to make the following definition.
Definition 4.6. The curvature of a twice differentiable curve γ in Rn is defined as follows. For its unit-speed parametrization γ̂(s), we define the curvature at time s to be κ̂(s) = ||â(s)||, the norm of the acceleration; for γ itself, the curvature at time t is κ(t) = (κ̂ ◦ l)(t), where s = l(t) is the arc length function of §4.3.
For example, the curve γr(t) = r(cos(t/r), sin(t/r)) has velocity γr′(t) = (−sin(t/r), cos(t/r)), which has norm one; the acceleration is γr″(t) = −(1/r)(cos(t/r), sin(t/r)) = −(1/r²)γr(t), with norm 1/r. The curvature is therefore 1/r. So if the radius of the next curve on the race

track is half as much, you will feel twice the force, since by Newton’s law, F = ma!
This is the physical (and geometric) meaning of the curvature.
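As a quick check (this snippet is an addition to these notes: a minimal sketch using the Python library sympy, with the radius r kept symbolic), we can verify the speed and curvature computation for the circle:

```python
import sympy as sp

t, r = sp.symbols('t r', positive=True)

# Circle of radius r, parametrized at unit speed: gamma_r(t) = r(cos(t/r), sin(t/r))
gamma = sp.Matrix([r * sp.cos(t / r), r * sp.sin(t / r)])

velocity = gamma.diff(t)           # (-sin(t/r), cos(t/r))
acceleration = velocity.diff(t)    # equals -(1/r^2) * gamma(t)

print(sp.simplify(sp.sqrt(velocity.dot(velocity))))          # 1: unit speed
print(sp.simplify(sp.sqrt(acceleration.dot(acceleration))))  # 1/r: the curvature
```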

4.3. Arc length of a curve. Given a curve γ1 : [a, b] → Rn, by a reparametrization γ2 of the curve we mean the following: we have an invertible differentiable function h : [c, d] → [a, b] such that γ2 = γ1 ◦ h. Note that γ1 and γ2 have the same image, and that the tangent vectors are multiples: γ2′(t) = (γ1 ◦ h)′(t) = γ1′(h(t)) h′(t). We call this a positive or orientation-preserving parameter change if h′(t) > 0, negative or orientation-reversing if h′(t) < 0.

Definition 4.7. We define the arc length of γ to be:

∫_a^b ||γ′(t)|| dt.
We introduce a special formula for this:
∫_γ ds = ∫_a^b ||γ′(t)|| dt.

As we shall explain below, ds is interpreted to mean integration with respect to arc length, and "∫_γ" is read "the integral over the curve γ", so all together this is read "the integral over the curve γ with respect to arc length".

For an example we already know from Calculus I, consider a function g : [a, b] → R and its graph {(x, g(x)) : a ≤ x ≤ b}. We know the arc length of this graph is
∫_a^b √(1 + (g′(x))²) dx.
We claim that the new formula includes this one: parametrize the graph as a curve in the plane, γ(t) = (t, g(t)). Then γ′(t) = (1, g′(t)), so ||γ′(t)|| = √(1 + (g′(t))²), whence indeed the arc length is ∫_γ ds = ∫_a^b √(1 + (g′(t))²) dt, as claimed.
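As a numerical illustration (an addition to these notes, using sympy; the example g(x) = x² on [0, 1] is my choice, not the author's), both formulas give the same arc length for a parabola:

```python
import sympy as sp

x, t = sp.symbols('x t')
g = x**2  # example: the graph y = x^2 over [0, 1]

# Calculus I formula for the arc length of a graph
L_graph = sp.integrate(sp.sqrt(1 + sp.diff(g, x)**2), (x, 0, 1))

# Arc length of the parametrized curve gamma(t) = (t, g(t))
gamma = sp.Matrix([t, t**2])
dgamma = gamma.diff(t)
L_curve = sp.integrate(sp.sqrt(dgamma.dot(dgamma)), (t, 0, 1))

print(sp.simplify(L_graph - L_curve))  # 0: the two formulas agree
print(sp.N(L_graph))                   # about 1.479
```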

Proposition 4.7.
(i) The arc length of a curve is unchanged under any change of parametrization, independent of orientation. That is,
∫_{γ1} ds = ∫_{γ2} ds.
Proof. (i) Let us write h for the inverse parameter change, so that γ1 = γ2 ◦ h with h : [a, b] → [c, d]. Writing u = h(t), we have γ2(u) = γ2(h(t)) = γ1(t), and du = h′(t) dt.
Assume first that h′ > 0, so that h(a) = c and h(b) = d. Then, using the Chain Rule,
∫_{γ2} ds ≡ ∫_{u=c}^{u=d} ||γ2′(u)|| du = ∫_{t=a}^{t=b} ||γ2′(h(t))|| h′(t) dt
= ∫_{t=a}^{t=b} ||γ2′(h(t)) h′(t)|| dt = ∫_{t=a}^{t=b} ||(γ2 ◦ h)′(t)|| dt = ∫_{t=a}^{t=b} ||γ1′(t)|| dt = ∫_{γ1} ds.
If instead h′ < 0, then h(b) = c and h(a) = d, so
∫_{γ2} ds ≡ ∫_{u=c}^{u=d} ||γ2′(u)|| du = ∫_{t=b}^{t=a} ||γ2′(h(t))|| h′(t) dt.
Since h′ < 0 we now have ||γ2′(h(t))|| h′(t) = −||γ2′(h(t)) h′(t)||, so this is
−∫_{t=b}^{t=a} ||γ2′(h(t)) h′(t)|| dt = ∫_{t=a}^{t=b} ||(γ2 ◦ h)′(t)|| dt = ∫_{t=a}^{t=b} ||γ1′(t)|| dt = ∫_{γ1} ds. □
We next see how this can be used to give a unit speed parametrization of a curve γ : [a, b] → Rn. Set l(t) = ∫_a^t ||γ′(r)|| dr, so l(t) is the arc length of γ from time a to time t. Let us denote the length of γ by L. Thus the function l maps [a, b] to [0, L]. Note that l is a primitive (antiderivative) for ||γ′||, so l′(t) = ||γ′(t)||. We shall assume that ||γ′(t)|| > 0 for all t; in this case, the function l is invertible. Our parameter change will be given by its inverse, h = l⁻¹; then h′ is also positive.
Proposition 4.8. Assume that ||γ′(t)|| > 0 for all t. Then the reparametrized curve γ̂ = γ ◦ h has speed one.
Proof. Now (l ◦ h)(t) = t, so 1 = (l ◦ h)′(t) = l′(h(t)) · h′(t). We have l′(t) = ||γ′(t)||, so l′(h(t)) = ||γ′(h(t))||. Thus, since h′(t) > 0,
||γ̂′(t)|| = ||(γ ◦ h)′(t)|| = ||γ′(h(t))|| · h′(t) = l′(h(t)) h′(t) = 1. □

The function l maps [a, b] to [0, L], whence the parameter-change function h maps [0, L] to [a, b]. We keep t for the variable in [a, b] and define s = l(t), the arc length up to time t, so now s is the variable in [0, L] and h(s) = t.
The change of parameter gives γ̂(s) = (γ ◦ h)(s) = γ(h(s)) = γ(t). This indeed parametrizes the curve γ̂ by arc length s.
Note further that
∫_γ ds ≡ ∫_a^b ||γ′(t)|| dt = ∫_0^{l(b)} ||γ̂′(s)|| ds ≡ ∫_{γ̂} ds.
From s = l(t) we have ds = l′(t) dt = ||γ′(t)|| dt. Now we understand rigorously what ds is: it represents the infinitesimal arc length; this helps explain the notation ∫_γ ds for the total arc length.
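To make the construction concrete, here is a small added sketch (not in the original notes) carrying out the reparametrization for a line of constant speed 5, where l and h = l⁻¹ have closed forms:

```python
import sympy as sp

t, s, u = sp.symbols('t s u', nonnegative=True)

gamma = sp.Matrix([3 * t, 4 * t])                  # ||gamma'(t)|| = 5
speed = sp.sqrt(gamma.diff(t).dot(gamma.diff(t)))

ell = sp.integrate(speed.subs(t, u), (u, 0, t))    # l(t) = 5t, arc length up to time t
h = sp.solve(sp.Eq(ell, s), t)[0]                  # h(s) = l^{-1}(s) = s/5
gamma_hat = gamma.subs(t, h)                       # reparametrized curve

new_speed = sp.sqrt(gamma_hat.diff(s).dot(gamma_hat.diff(s)))
print(sp.simplify(new_speed))                      # 1: unit speed, as in Proposition 4.8
```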

4.4. Level curves of a function. We would like to visualize a function F : R2 → R.


We do this in two main ways, by drawing the graph of the function (the subset
{(x, y, z) : z = F (x, y)}) or by drawing the level curves of the function. The level
curve of level c ∈ R is the following subset of the plane R2 :
{(x, y) : F (x, y) = c}.
Remark 4.2. In geography, a topographic map of a region X shows the level curves of
the altitude function F (x, y) with F : X → R. See Fig. 1.
For a function F : R3 → R we can no longer draw the graph (we would need four dimensions!) but can still draw the analogue of the level curves. These are the level surfaces. An example is F(x, y, z) = x² + y² + z², for which the level surface of level c² is the sphere of radius c. See §4.12.
Remark 4.3. In weather maps we see curves which could indicate constant pressure or temperature. The actual functions are defined on space (since height above the ground is also a variable), so the curves shown come from the part of the corresponding level surfaces close to the Earth; if the Earth were perfectly flat, these would be the level curves of G defined by G(x, y) = F(x, y, 0). In other words, the level surfaces for F meet sea level z = 0 in the level curves for G.

4.5. Partial derivatives; directional derivative. Given a map F : Rn → R and a point p ∈ Rn, the directional derivative of F at p in the direction u is defined as follows. Here we assume ||u|| = 1, i.e. u is a unit vector.
Above we have defined the parametrized line l(t) = p + tu: the curve which starts at p and moves in the direction u at unit speed.
Now f(t) = F(l(t)) is a function from R to R. We define
Du(F)|p = f′(0).
This gives the rate of increase of the function F in the direction u at that point.
A special case is u = (1, 0) (for n = 2). We define
∂F/∂x (p) = Du(F)|p.

Figure 1. From Wikipedia, Topographic Map: a topographic map of


the ski area of Stowe, Vermont and a shaded version of the map which
helps to visualize the landscape.

Similarly for u = (0, 1) we define
∂F/∂y (p) = Du(F)|p.

See Fig. 4.
It is very easy to calculate the partial derivatives. For the partial with respect to
x, we fix the variable y and find the derivative with respect to x alone.
For example, when F(x, y) = x²y³, then ∂F/∂x = 2xy³ while ∂F/∂y = 3x²y².
For F : Rn → R the definitions are similar.
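(A quick symbolic check of this example, added to these notes as a sketch using sympy:)

```python
import sympy as sp

x, y = sp.symbols('x y')
F = x**2 * y**3

print(sp.diff(F, x))  # 2*x*y**3: differentiate in x, holding y fixed
print(sp.diff(F, y))  # 3*x**2*y**2: differentiate in y, holding x fixed
```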

Figure 2. Graph of the function F(x, y) = x² − y² (a hyperbolic paraboloid). From Google, search "x^2 − y^2" (rotatable image there) and from https://web.ma.utexas.edu/users/m408m/Display12-6-2.shtml. Horizontal slices (these project to the level curves) give a family of hyperbolas in the plane. Slicing vertically parallel to the x and y axes gives parabolas; slicing vertically parallel to the lines x = ±y gives straight lines (the surface is doubly ruled).

Figure 3. Graphs of the hyperbolic paraboloid with level curves (a family of hyperbolas), and of the paraboloid F(x, y) = x² + y². From https://web.ma.utexas.edu/users/m408m/Display12-6-2.shtml.

4.6. Properties of the gradient of a function. For this, first we define: given a map F : Rm → R, we define a vector at each point p ∈ Rm,
∇F = (∂F/∂x1, . . . , ∂F/∂xm),
called the gradient of F at p.
As we shall see in the next sections, the gradient has the following important properties:

Figure 4. Partial derivatives; figure from https://activecalculus.org/vector/

(1) This defines a vector field, called the gradient vector field of F .
(2) The gradient vector field is everywhere orthogonal to the level sets of F. These are level curves for F : R2 → R and level surfaces for F : R3 → R. We prove this via the Chain Rule, see §4.11. In general, the level sets are submanifolds, i.e. differentiable subsets, of Rn, of dimension (n − 1); this is a consequence of the Implicit Function Theorem. (Here we have to assume the gradient is nonzero: ∇F ≠ 0.)
(3) The gradient points in the direction of steepest increase of the function F at the
point p.
(4) The directional derivative of F at p, in the direction of the unit vector u, is given
simply by the inner product:

Du (F )|p = ∇F |p · u.
(5) The gradient ∇F is the vector form of the derivative DF of the function F .
(6) The gradient can be used to easily write the equation of the tangent line to a
level curve of F : R2 → R at the point p = (x, y). For F : R3 → R, the gradient
can be used to write the equation of the tangent plane to a level surface at a point
p = (x0 , y0 , z0 ). We explain this below in §4.11.
4.7. Three types of curves and surfaces. In the course we actually encounter
three different (but related) types of curves and surfaces. First, recall:
Definition 4.8. For f : [a, b] → R then its graph is
graph(f ) = {(x, f (x)) : x ∈ [a, b]}
which is the subset of the plane we usually draw for this. Similarly, for F : R2 → R
then graph(F ) = {(v, F (v)) : v = (x, y) ∈ R2 }. Thus graph(F ) = {(x, y, z) : z =
F (x, y)}.
The different types of curves are:
(i) the graph of function f : R → R;
(ii) a level curve of a function F : R2 → R;

(iii) a parametrized curve γ : R → Rm .


Note that the first two are curves in the plane, while the third is a curve in Rm for any dimension m.
For surfaces we have, similarly:
(i) the graph of function F : R2 → R;
(ii) a level surface of a function F : R3 → R;
(iii) a parametrized surface L : R2 → Rm .
Now the first two are surfaces in R3 , while the last is a surface inside of m-
dimensional space so is much more general.
In both situations, curves and surfaces, these are all related, with (i) being a special
case of (ii) and (ii) a special case of (iii).
In all three cases it is important to first consider the linear (or affine) situation.
That is because, first, it is the simplest case, and secondly, because these will describe
the tangent line and the tangent plane, of a curve or surface, in all cases. These are
exactly the affine lines or planes which best approximate the curve or surface at a
chosen point.

Here are the affine versions in all cases:


(i) An affine function f : R → R is f(x) = y where y = ax + b.
An affine function F : R2 → R has the form F(x, y) = z where z = ax + by + c.
Note that the graph of f is a line in the plane, while the graph of F is a plane in R3.
(ii) The level curve of an affine function F : R2 → R: given F(x, y) = ax + by, then the level curve of level c is the line in the plane
ax + by = c,
equivalently
Ax + By + C = 0,
for A = a, B = b, C = −c. This is now in the form of the general equation of a line in the plane.
Given the affine function F : R3 → R defined by F(x, y, z) = ax + by + cz, then the level surface of level d is the plane in R3
ax + by + cz = d,
or equivalently the general equation of a plane
Ax + By + Cz + D = 0,
for A = a, B = b, C = c, D = −d.
(iii) A parametrized affine curve l : R → Rm : this is a parametrized line,
l(t) = p + tv.
A parametrized affine surface is L : R2 → Rm : this is a parametrized plane:
L(s, t) = p + sv + tw.
Exercise 4.5.

(i) Which lines in the plane, or planes in R3 , can (or cannot) be written as the graph
of an affine function as in (i)?
(ii) For which values of a, b or a, b, c in (ii) do you get a line or plane?
(iii) For which vectors v and v, w in (iii) do you get a line or plane?
(iv) Make sure you know how to go from one type of line (or plane) to the other,
whenever possible (see the Linear Algebra lecture notes and exercises)!
(v) Write each type of line or plane in matrix form.
Solution: We explain (ii). We claim that the equation Ax + By + C = 0 gives a line
in the plane R2 exactly when not both A, B are 0. Here we have to understand the
meaning of “gives the equation of a line in the plane.”
There are two important points:
(1) This means that we are in the plane; this is our Universe of Discourse (we are talking only about points in the plane R2, not about R or R3).
(2) “gives the equation of a line” means that the collection of all solutions to the
equation forms a line.
That is,
{(x, y) ∈ R2 : Ax + By + C = 0}
is a geometrical line in R2 .
It makes a huge difference what is our Universe of Discourse (i.e. what we are
talking about). For example, the equation x = 2 in R is a point, in R2 it is a vertical
line, {(x, y) : x = 2}, in R3 it is a vertical plane.
Now for Ax + By + C = 0 to be a line means that
{(x, y) ∈ R2 : Ax + By + C = 0}
is a line. Let us consider the case where B ≠ 0. Then the equation is equivalent (i.e. it has the same solutions) to
y = −(A/B)x − C/B = ax + b,
which we know is a line.
Next suppose A, B are both 0. Then we have
{(x, y) ∈ R2 : 0x + 0y + C = 0},
equivalently
{(x, y) ∈ R2 : C = 0},
and there are two cases:
(i) C = 0: this statement is true, hence is true for all (x, y), so the solution set is all of R2;
(ii) C ≠ 0: this statement is false, hence is false for all (x, y), so the solution set is the empty set.
In neither case do we get a line. This proves the Claim.
Planes are handled similarly.
Exercise 4.6.
(i) Given vector spaces V, W and a linear transformation T : V → W , prove that:
Proposition 4.9. The image of T , Im(T ) and the kernel (null space) of T , ker(T )
are (vector) subspaces of W, V respectively.

(ii) Interpret these statements for T : R2 → Rm (for the image), respectively T :


R3 → R (for the kernel) as matrix equations, and describe the connection to the
parametric and general equations of a plane.
Finding the tangent line or plane: as the graph of a function.
Next, for each of the three cases of curves and surfaces, we show how to find the tangent line, respectively the tangent plane. These are exactly the affine lines or planes which best approximate the curve or surface at the point.
First, given a function f : R → R, the formula for the tangent line to its graph at the point p is
l(x) = f(p) + f′(p)(x − p).
(Draw a picture!)
Note that l is itself a curve, and that it satisfies:
(i) it is an affine function, that is, linear plus a vector;
(ii) l(p) = f(p);
(iii) l′(p) = f′(p).

The formula for the tangent line to a (parametrized) curve γ : R → Rm is nearly identical:
l(t) = γ(p) + γ′(p)(t − p).
This now satisfies similar properties:
(i) it is an affine function;
(ii) l(p) = γ(p);
(iii) l′(p) = γ′(p).

For a level curve there are two ways to approach finding the tangent line. The
first is to parametrize the level curve somehow and apply the previous case of a
parametrized curve.
Exercise 4.7. For F (x, y) = x2 + y 2 , the curve of level 1 is the unit circle, the
solutions of the equation (i.e. all pairs (x, y) which satisfy the equation)
x2 + y 2 = 1.
Find parametrizations for this level curve, and use that to find the tangent line at
a point.
Solution: We can parametrize this for example by the variable x. Then
y = ±√(1 − x²),
and we have two parametrized curves, with t = x, so γ(t) = (t, ±√(1 − t²)). This works at all points except where y′(x) = ∞, that is, x = ±1. If we instead parametrize it by y, then this works except where x′(y) = ∞, that is, for y = ±1. We can also parametrize the entire curve at once, by the angle θ, with
γ(t) = (cos t, sin t)
and t = θ.

When we parametrize by the variable x, we say the functions f(x) = √(1 − x²), f̃(x) = −√(1 − x²) are defined implicitly by the equation x² + y² = 1. That is, they are explicit functions which are "implied" by the equation.
The Implicit Function Theorem describes when this can be done; basically it fails only where a (partial) derivative becomes infinite, as above.
Given this, we can apply the formulas for the graph of a function, or for a curve in the plane, to find the tangent line.
Finding the tangent line or plane: using the normal vector to find the
tangent space. The second way to find the tangent line to a level curve is to find a
normal vector to the curve. We explain this in §4.11.
Definition 4.9. Given F : Rn → R, the tangent space Tp to the level set at the point p = (x1, x2, . . . , xn), for level c = F(p), is an affine subset of Rn: all vectors v such that (v − p) is orthogonal to the gradient, n = ∇F|p.
We consider first the case of n = 2. We write the equation of the tangent line,
recalling that given a point p and a normal vector n = (A, B) then the line passing
through p and perpendicular to n is the collection of all x = (x, y) such that
n · (x − p) = 0.
Thus for n = (A, B) and x = (x, y) and p = (x0 , y0 ) then
(A, B) · (x − x0 , y − y0 ) = 0
giving the general equation for the line,

Ax + By + C = 0
where C = −n · p = −(Ax0 + By0 ).
Since n = ∇F|p = (∂F/∂x|p, ∂F/∂y|p), this gives the formula for the tangent line as
∂F/∂x|p (x − x0) + ∂F/∂y|p (y − y0) = 0.     (3)
We know the formula for the tangent line to the graph of a function f : R → R is l(x) = f(p) + f′(p)(x − p). We can also use the normal vector method to find this formula in a second way. To do this we define F : R2 → R by
F(x, y) = f(x) − y.
Then the level curve of level 0 gives f(x) − y = 0, so y = f(x), which is the graph of f.
(Consider a simple example like f(x) = x² to understand what is going on!)
Note that at the point p = (p, f(p)) we have ∇F|p = (f′(p), −1), so the formula n · (x − p) = 0 gives, with p = (p, f(p)),
(f′(p), −1) · (x − p, y − f(p)) = 0,
so
f′(p)(x − p) − (y − f(p)) = 0,
so
y = f(p) + f′(p)(x − p),
as claimed.
Exercise 4.8. See Exercise 4.9 below.

4.8. The gradient vector field; the matrix form of the tangent vector and of the gradient. The gradient ∇F of a function F : Rm → R gives an important example of a vector field. In general, a vector field V on Rm is a function V from Rm to Rm.
As we mentioned above and shall prove in Proposition 4.15, the level curves of a function F are orthogonal to the gradient vector field, so the gradient can help us understand the level curves of F.
We draw the vector wv = V(v) based at each point v. See Fig. 12.
The tangent vector gives the first definition of derivative of a curve γ : R → Rm, the vector form of the derivative; the second definition, the matrix form of the derivative, is the (m × 1) matrix, i.e. the column vector with those same entries:
Dγ = [ x1′(t) ]
     [   ⋮    ]
     [ xm′(t) ]
Thus Dγ : R → Mm×1.
For a function F : Rn → R, the vector form of its derivative is the gradient ∇F. This has a matrix form: the row vector, i.e. the (1 × n) matrix with the same entries,
DF|x = [ ∂F/∂x1 . . . ∂F/∂xn ].
Given γ : R → Rm and F : Rm → R, the composition is F ◦ γ : R → R, so we can take its derivative (F ◦ γ)′(t). The Chain Rule says we can compute this in a second way. In vector notation it states:
(F ◦ γ)′(t) = ∇F|γ(t) · γ′(t).
This is even simpler to remember in matrix notation, as we have the product of a row vector and a column vector. For example, with γ : R → R3 and F : R3 → R, we have
D(F ◦ γ)(t) = [ Fx Fy Fz ]|γ(t) [ x1′(t) ]
                                [ x2′(t) ]
                                [ x3′(t) ]
Exercise 4.9. F(x, y) = x²y³; γ(t) = (e^t, e^(t²)).
(1) Find (F ◦ γ)′(0).
First method (directly): f(t) = F ◦ γ(t) = e^(2t) e^(3t²), so f′(t) = 2e^(2t)e^(3t²) + 6t · e^(2t)e^(3t²), and f′(0) = 2.
Second method (Chain Rule): ∇F = (2xy³, 3x²y²); γ′(t) = (e^t, 2t e^(t²)).
γ(0) = (1, 1). ∇F(1, 1) = (2, 3). γ′(0) = (1, 0).
So (F ◦ γ)′(0) = (2, 3) · (1, 0) = 2.

Figure 5. The normal vector to the surface is normal (orthogonal) to the tangent plane at that point. Tangent plane to the graph of a function defined on the plane, showing the meaning of the partial derivatives at the point. A tangent vector to a curve in a surface is in the tangent plane. From https://web.ma.utexas.edu/users/m408m/Display14-4-2.shtml. From https://www.researchgate.net/figure/Figure-S3-Geometric-illustration-of-tangent-vector-tangent-space-curve-and... From: Wikipedia, Normal (geometry)

Figure 6. A vector field in the plane, from Wikipedia. Compare to


the pictures of curves below!

Figure 7. Equipotential curves for the electrostatic field of two oppo-


site charges in the plane. Colors indicate different levels of the potential.
This can also be interpreted as a gravitational field, where the potential
function is height above sea level, and the positive charge is a mountain
top while the negative charge is a valley. Orthogonal to the equipoten-
tials are the lines of force; these are tangent to the gradient vector field
of the potential function. One can imagine flowing along the lines of
force from positive to negative charge, as in a fluid, although this is a force, not a velocity field. (Because of this analogy with fluids, they are also called the lines of flux of the electrostatic field.)

Figure 8. Equipotential curves and lines of force for the electrostatic


field of two opposite charges in the plane, now closer together.

Which is easier depends on the problem!


(2) Find ∇F at p = (2, 5).
(3) Find the equation of the tangent plane to the graph of F at that point.
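Part (1) can be verified symbolically; the following sketch (an addition to these notes) runs both methods with sympy:

```python
import sympy as sp

t, x, y = sp.symbols('t x y')
F = x**2 * y**3
gamma = (sp.exp(t), sp.exp(t**2))

# First method: substitute and differentiate directly
f = F.subs({x: gamma[0], y: gamma[1]})
print(f.diff(t).subs(t, 0))   # 2

# Second method: Chain Rule, grad F(gamma(0)) . gamma'(0)
grad_F = sp.Matrix([F.diff(x), F.diff(y)])
dgamma = sp.Matrix(gamma).diff(t)
chain = grad_F.subs({x: gamma[0], y: gamma[1]}).dot(dgamma)
print(sp.simplify(chain.subs(t, 0)))  # 2, as computed above
```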
4.9. General definition of derivative of a map. Both the matrix version of tan-
gent vector and of gradient are special cases of the general notion of derivative of a
map from F : Rn → Rm . We state this more generally for normed vector spaces.
Definition 4.10. Let V, W be Banach spaces (a vector space, possibly infinite-
dimensional, on which we have a complete norm defined; complete here means that
there are “no holes” as Cauchy sequences converge; this only can be an issue in infinite
dimensions. The reader can think of Rn with the standard inner product and norm
to get the basic idea). We say a function (or map) F : V → W is differentiable at
the point p ∈ V iff there exists a linear transformation L : V → W such that
lim_{h→0} ||F(p + h) − F(p) − L(h)|| / ||h|| = 0.

We write DFp = L for this transformation, called the derivative of F at p.


The idea here is that the derivative DF should give the "best linear approximation" at each point.
What this actually means is: it is the linear part of the best first-order approximation. The best 0th-order approximation at x ∈ Rn is the constant map with the value at that point, thus the map with constant value p = F(x). If L = DF|x, then the best first-order approximation will be the linear map shifted by this value, thus the affine map (x + v) ↦ p + Lv.

Let us relate the above formula to the usual definition for a function f : R → R, that is,
f′(x) = lim_{h→0} (f(x + h) − f(x))/h = c.
This definition still works for curves, giving us the tangent vector. However for V of dimension larger than 1 this makes no sense, as we cannot take the ratio of two vectors.
Remark 4.4. Or nearly. Consider the following: given a linear map L : V → V, so Lv = w, then in some sense
w/v = L :
the ratio "should be" a linear transformation!!
However L is not well-defined by this: many linear maps will solve the equation Lv = w; it is only well-defined if V has dimension 1. What the definition of derivative requires is that L works for all directions h, and this does make L well-defined.
Remark 4.5. Let us see what happens to the general definition for f : R → R. Then
lim_{h→0} (f(x + h) − f(x))/h = c
iff for each ε > 0, there exists δ > 0 such that for |h| < δ,
|(f(x + h) − f(x))/h − c| < ε,
or
|f(x + h) − f(x) − ch| / |h| < ε.
And this is now a special case of the general formula.
We introduce the notation L(V, W ) for the collection of all linear transformations
from V to W . If we choose a basis for V = Rn and W = Rm , then L : Rn → Rm
is represented by an (m × n) matrix. Then L(Rn , Rm ) can be identified with the
matrices Mmn ∼ Rmn , so DF : Rn → L(Rn , Rm ) ∼ Rmn .
When considering F : Rn → Rm, both x ∈ Rn and F(x) ∈ Rm can be written in components, with x = (x1, . . . , xn) and F(x) = (F1(x), . . . , Fm(x)). We write the components of F as a column vector, so
F(x) = [ F1(x) ]
       [   ⋮   ]
       [ Fm(x) ]
Each component Fk is a function from Rn to R so has a gradient, ∇Fk = (∂Fk/∂x1, . . . , ∂Fk/∂xn).
Now DFk is by definition the corresponding row vector, so
DFk = [ ∂Fk/∂x1 . . . ∂Fk/∂xn ].
We define the matrix of partials of F : Rn → Rm to be the (m × n) matrix with kth row this gradient vector. Let us write [∇Fk] for the row vector DFk.

Then the ijth matrix entry is the partial derivative
(DF)ij = ∂Fi/∂xj,
and so we have the (m × n) matrix
DF|x = [ [∇F1] ]   [ ∂F1/∂x1 . . . ∂F1/∂xn ]
       [   ⋮   ] = [    ⋮             ⋮    ]
       [ [∇Fm] ]   [ ∂Fm/∂x1 . . . ∂Fm/∂xn ]

The most basic cases are f : R1 → Rm and F : Rn → R1. The first is a curve, discussed above, and usually written γ : R → Rm. The general formula then gives the matrix form of the tangent vector; since γ is a column vector, with
γ(t) = [ x1(t) ]
       [   ⋮   ]
       [ xm(t) ]
then Dγ is the (m × 1) matrix with the same entries as the tangent vector γ′(t) = (x1′, . . . , xm′)(t), so
Dγ|t = [ x1′(t) ]
       [   ⋮    ]
       [ xm′(t) ]
The second type of map, F : Rn → R, we call simply a function. The general formula above then gives the (1 × n) matrix
DF|x = [ ∂F/∂x1 . . . ∂F/∂xn ]|x.
This row vector is the matrix form of the gradient ∇F, since as explained above, ∇F = (∂F/∂x1, . . . , ∂F/∂xn).
As we shall see in Proposition 4.15, the level curves of a function F are orthogonal
to the gradient vector field.
An example of level curves is seen in Fig. 21.

One can think of the matrix of partials for a function F : Rn → Rm as consisting of


lined-up column vectors (tangent vectors) or row vectors (gradients) respectively. We
have explained this regarding the rows. To understand this for the columns, writing a
vector in the domain as x = (x1 , . . . , xn ) then fixing say x2 , . . . , xn and setting t = x1
we have a curve γ(t) = F (t, x2 , . . . , xn ); note that the first column of the derivative
matrix DF is the derivative of this curve, the column tangent vector Dγ.
We have described how the derivative at a point defines a matrix of partial deriva-
tives. The converse is:
Lemma 4.10. A differentiable map F : V → W is continuous. For the case F : Rn →
Rm , the map is differentiable with a continuous derivative iff the partial derivatives
exist and are continuous.

For proof we refer to e.g. §11.2 of [Gui02].

Another basic theorem regarding derivatives is the relation to the matrix of partials:
Theorem 4.11. If for F : Rn → Rm all the partial derivatives ∂Fi/∂xj exist and are continuous at p, then F is differentiable at p, and its derivative is the linear map given by the matrix of partials.
For a proof see Theorem 6.4 of Marsden's book [Mar74]. The theory of derivatives is very clearly carried out on pp. 158-185 of Marsden.

Best affine approximation: tangent line and plane.


Given a map F : Rn → Rm, the terminology "kth-order approximation" to F at a point x ∈ Rn comes from the Taylor polynomials and Taylor series. For a function f : Rn → Rm, the best kth-order approximation at x ∈ Rn is the polynomial of degree k which best fits the map near that point. This is the polynomial (in n variables!) which has all the same partial derivatives at that point, up to order k.
Thus the best 0th-order approximation of F at p ∈ Rn is the constant map with the value at that point: the map x ↦ F(p). To get the best first-order approximation we add on the linear map given by the derivative matrix DF|p.
This is the affine map
x ↦ F(p) + DF|p (x − p),
which we mentioned above for the case of R2, where this gives the equation of the tangent plane to the graph of F.
Example 1. (Derivative of a linear or affine map) What is the best linear approximation to a linear map? Answer: it should be the linear map itself!
Let us understand this precisely.
Consider the matrix
A = [ a b ]
    [ c d ]
Acting on column vectors, this defines the linear transformation
[ x ] ↦ A [ x ] = [ ax + by ]
[ y ]     [ y ]   [ cx + dy ]
or written as vectors,
T(x, y) = (ax + by, cx + dy).
We want to compute the matrix of partials DT|p at a point p = (x0, y0). Now the components of T are T = (T1, T2) where
T1(x, y) = ax + by
T2(x, y) = cx + dy.
Then
∇T1 = (a, b)
∇T2 = (c, d)
for each point p. Thus at each point p,
DT|p = [ a b ] = A.
       [ c d ]
This shows:
Proposition 4.12. For an affine map F : Rn → Rm, defined by F(v) = v0 + T(v) where T : Rn → Rm is linear, the derivative is
DF|p = T.
That is to say, the derivative is constant, equal to the linear part of the map.
Remark 4.6. To really understand this, consider the case of Rθ , the rotation counter-
clockwise of the plane by angle θ.
4.10. The general Chain Rule. The main theorem involving derivatives is the:
Proposition 4.13. (Chain Rule) A composition of differentiable maps is differentiable, and the derivative is the composition of the corresponding linear maps.
That is, for F : V → W and G : W → Z, then for G ◦ F : V → Z we have:
D(G ◦ F)|p = DG|F(p) ◦ DF|p.
Thus for the finite-dimensional case the chain rule is stated using the product of matrices.

In diagrams: F : V → W and G : W → Z compose to G ◦ F : V → Z; correspondingly, the derivatives DF|x : V → W and DG|F(x) : W → Z compose to D(G ◦ F)|x : V → Z.

The first example is γ : R → R3 and F : R3 → R, where we have seen the Chain Rule above; in matrix notation it is:
D(F ◦ γ)(t) = [ Fx Fy Fz ]|γ(t) [ x1′(t) ]
                                [ x2′(t) ]
                                [ x3′(t) ]
The product gives a (1 × 1) matrix, whose entry is a number.
In vector notation the Chain Rule is:
(F ◦ γ)′(t) = ∇F|γ(t) · γ′(t).
This number is the same as the entry of the (1 × 1) matrix above.
Now we can give a second proof of Proposition 4.4 above, which we repeat here:

Proposition 4.14. Let γ be a differentiable curve in Rn such that ||γ|| = c for some constant c. Then γ ⊥ γ′.
Proof. (Second proof, using the gradient) We define a function F : Rn → R by F(x) = ||x||² = x · x = Σ_{i=1}^n xi². Then since ||γ|| = c is constant, F ◦ γ ≡ c², whence by the Chain Rule,
0 = (F ◦ γ)′(t) = ∇F(γ(t)) · γ′(t);
but F(x) = F(x1, . . . , xn) = x1² + · · · + xn², whence ∇F(x) = 2(x1, . . . , xn) = 2x. Thus 0 = 2γ(t) · γ′(t), as claimed. □
Directional derivative and the gradient.
The gradient gives us a simple way of calculating the directional derivative. Given
F : Rn → R, with gradient vector field ∇F , and given a unit vector u, then the
directional derivative of F in direction u is given simply by the inner product:

Du (F )|p = (∇F (p)) · u.


Exercise 4.10. Check this on the standard basis vectors and compare to the partial derivatives! What is the direction of steepest increase of F at a point p? Of steepest decline? What is the rate of increase of F in a direction tangent to a level curve?
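Here is a small added check (not in the original notes; the function F = x² + y², the point p = (1, 2), and the direction u are chosen for illustration) that the inner-product formula agrees with the definition Du(F)|p = f′(0):

```python
import sympy as sp

x, y, t = sp.symbols('x y t')
F = x**2 + y**2                       # example function
p = sp.Matrix([1, 2])
u = sp.Matrix([1, 1]) / sp.sqrt(2)    # unit vector

# Via the definition: f(t) = F(p + t u), directional derivative = f'(0)
line = p + t * u
f = F.subs({x: line[0], y: line[1]})
print(sp.simplify(f.diff(t).subs(t, 0)))   # 3*sqrt(2)

# Via the gradient formula: grad F(p) . u
grad_F = sp.Matrix([F.diff(x), F.diff(y)]).subs({x: p[0], y: p[1]})
print(sp.simplify(grad_F.dot(u)))          # 3*sqrt(2), the same
```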

4.11. Level curves and parametrized curves. There are two very distinct types
of curves we encounter here: the curves of this section, which are parametrized curves
(with parameter t = time), and the level curves of a function. Next we describe a link
between the two:
Proposition 4.15. Let G : R2 → R be differentiable and suppose γ : [a, b] → R2 is a curve which stays in a level curve of G of level c. Then γ′(t) is perpendicular to the gradient of G.
Proof. We have that G(γ(t)) = c for all t. Hence (G ◦ γ)′(t) = 0 for all t. Then by the chain rule, 0 = D(G ◦ γ)(t) = DG|γ(t) Dγ|t. The derivatives here are matrices, with DG a (1 × 2) matrix (a row vector) and Dγ a column vector; in vector notation, these are the gradient and tangent vector, so this gives 0 = (G ◦ γ)′(t) = (∇G)(γ(t)) · γ′(t), so ∇G|γ(t) · γ′(t) = 0, telling us that the gradient is perpendicular to the tangent vector of the curve, as claimed. □
Example 2. (Dual hyperbolas) See Fig. 21, depicting level curves of the functions F(x, y) = x² − y² and G(x, y) = 2xy.
Exercise 4.11. Plot the level curves of F for levels 0, 1, −1 and for G of levels 0, 2, −2.
Compute the gradient vector fields and find their matrices (they are linear!) Compare
to the earlier examples of linear vector fields.
These functions are related algebraically by a change of variables, u = (1/√2)(x − y), v = (1/√2)(x + y), and geometrically by a rotation Rπ/4.

Figure 9. Dual families of hyperbolas: level curves for the functions F(x, y) = x² − y² and G(x, y) = 2xy. Note that in this special example the level curves of F are orthogonal to the level curves of G. In fact, the gradient vector field of F is orthogonal to the level curves of F, and is tangent to the level curves of G, and vice-versa!

To verify this, we define the function H : R2 → R2 by H(x, y) = (u, v) = (√2/2)(x − y, x + y); then
G ◦ H(x, y) = 2 · (1/2)(x − y)(x + y) = x² − y² = F(x, y),
so F = G ◦ H.
Now H is a linear transformation of R2, given by
(√2/2) [ 1 −1 ] [ x ] = [ u ]
       [ 1  1 ] [ y ]   [ v ]
and so by the matrix
(√2/2) [ 1 −1 ] = [ √2/2  −√2/2 ]
       [ 1  1 ]   [ √2/2   √2/2 ]
which is indeed rotation counterclockwise by π/4.
We next check Proposition 4.15 for this example.
The gradient of F is ∇F = (∂F/∂x, ∂F/∂y) = (2x, −2y), and that of G is ∇G = (∂G/∂x, ∂G/∂y) = (2y, 2x). Note that (2x, −2y) · (2y, 2x) = 0, so these vector fields are orthogonal. Furthermore we can find tangent vectors to the level curves as follows. Let us parametrize the level curve F(x, y) = x² − y² = c by the variable x. Then y = y(x), so the curve is γ(x) = (x, y(x)) with tangent vector (1, y′(x)); taking the derivative of the equation with respect to x gives 2x − 2yy′ = 0, so y′ = x/y. Thus γ′(x) = (1, x/y). This is proportional to the vector (y, x), hence to ∇G, which as we have already noted is orthogonal to the gradient of F at that point.
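The claims F = G ◦ H and ∇F ⊥ ∇G can also be checked mechanically; a sympy sketch (added here, not part of the original notes):

```python
import sympy as sp

x, y = sp.symbols('x y')
F = x**2 - y**2
G = 2 * x * y

# H is rotation by pi/4: H(x, y) = (sqrt(2)/2)(x - y, x + y)
u = sp.sqrt(2) / 2 * (x - y)
v = sp.sqrt(2) / 2 * (x + y)
print(sp.expand(G.subs({x: u, y: v}, simultaneous=True)))  # x**2 - y**2, i.e. F

# The two gradient fields are orthogonal at every point
grad_F = sp.Matrix([sp.diff(F, x), sp.diff(F, y)])  # (2x, -2y)
grad_G = sp.Matrix([sp.diff(G, x), sp.diff(G, y)])  # (2y, 2x)
print(sp.simplify(grad_F.dot(grad_G)))              # 0
```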

4.12. Level surfaces, the gradient and the tangent plane. In Proposition 4.15
of §4.11 we showed that the gradient vector field of F : R2 → R is orthogonal to the
level curves of F . In fact something similar is true for any dimension. For the case
of R3 we get a new formula for the tangent plane, as we now explain.
Proposition 4.16. Let G : R3 → R be differentiable and suppose γ : [a, b] → R3 is a curve such that the image of γ remains inside the level surface of level c, {(x, y, z) : G(x, y, z) = c}. That is, for all t, G(γ(t)) = c. Then γ′(t) is perpendicular to the gradient of G.
More generally this is true in higher dimensions, for G : Rn → R.
Proof. We have that G(γ(t)) = c for all t. Hence (G ◦ γ)′(t) = 0 for all t. Now by the chain rule, 0 = D(G ◦ γ)(t) = DG(γ(t)) Dγ(t). DG is now a (1 × n) matrix and Dγ an (n × 1) column vector; in vector notation, these are the gradient and tangent vector, so this gives 0 = (d/dt)c = (G ◦ γ)′(t) = (∇G)(γ(t)) · γ′(t). □
Exercise 4.12. First we have a review problem from Linear Algebra: Recall that
the general equation for a plane in R3 is:
Ax + By + Cz + D = 0
where not all three of A, B, C are 0. Given a point p = (x0 , y0 , z0 ) and a vector
n = (A, B, C) then find the general equation of the plane through p and perpendicular
to n.
Solution: We know that the plane is the collection of all x = (x, y, z) such that
n · (x − p) = 0,
so for n = (A, B, C) and x = (x, y, z) and p = (x0 , y0 , z0 ) then
(A, B, C) · (x − x0 , y − y0 , z − z0 ) = 0
giving the general equation for the plane,

Ax + By + Cz + D = 0
where D = −n · p = −(Ax0 + By0 + Cz0 ).
See also Exercise 4.5.
Exercise 4.13. Given the function F (x, y, z) = x2 + y 2 + z 2 , find the tangent plane
to this sphere at the point (1, 2, 3).
Solution: Note that F(1, 2, 3) = 14. Therefore this point is on the level surface of F of level 14. (This is the sphere about the origin of radius √14.)
Now the gradient of F is ∇F(x, y, z) = (2x, 2y, 2z). We know the gradient is orthogonal to the sphere, hence to the tangent plane. This normal vector (to both) is n = ∇F(1, 2, 3) = (2, 4, 6). We are in the situation of the previous exercise: the equation of the plane is
Ax + By + Cz + D = 0
where the normal vector is n = (A, B, C) = (2, 4, 6) and the plane passes through the point p = (1, 2, 3).
The equation of the plane is therefore
n · (x − p) = 0
or
n · ((x, y, z) − p) = 0
so
(A, B, C) · (x − 1, y − 2, z − 3) = 0
giving
2x + 4y + 6z + D = 0
where
D = −n · p = −(2, 4, 6) · (1, 2, 3) = −(2 + 8 + 18) = −28,
so we have the plane with general equation
2x + 4y + 6z − 28 = 0.
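A numerical sanity check of this answer (an addition to these notes, using numpy):

```python
import numpy as np

p = np.array([1.0, 2.0, 3.0])
n = 2 * p                # gradient of F at p: (2, 4, 6), normal to the sphere

D = -n @ p               # D = -n . p = -28
print(n, D)              # [2. 4. 6.] -28.0
print(n @ p + D)         # 0.0: p lies on the plane 2x + 4y + 6z - 28 = 0
```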
Exercise 4.14. We solve an exercise requested by a student.
Guidorizzi Vol. 2, §11.3 # 2, p. 204 [Gui02]:
Find the equation of a plane which passes through the points (1, 1, 2) and (−1, 1, 1) and which is tangent to the graph of the function f(x, y) = xy.
Solution. We use normal vectors, as follows. The graph of f is
{(x, y, z) : z = f(x, y)},
which equals
{(x, y, z) : z = xy},
equivalently written
{(x, y, z) : xy − z = 0}.
This is the level surface of level 0 of F : R3 → R defined by F(x, y, z) = xy − z. This has gradient vector ∇F = (y, x, −1). Let p = (x0, y0, z0) denote the point where the plane meets the graph. Then at the point p we have ∇F|p = (y0, x0, −1). We know that the gradient is orthogonal to the level surfaces; in other words, it is orthogonal to the tangent plane to the surface at that point. So n = ∇F|p is a normal vector to the tangent plane of the level surface at p. This gives us the equation for the tangent plane
n · (x − p) = 0,
so
(y0, x0, −1) · (x − x0, y − y0, z − z0) = 0,
so
y0 x + x0 y − z − 2x0 y0 + z0 = 0.
Now z0 = x0 y0, since p is also on the graph of the function. This gives
y0 x + x0 y − z − x0 y0 = 0.
We need to find x0, y0. The two points are on this plane so satisfy the equation. Substituting (x, y, z) = (1, 1, 2) and (−1, 1, 1) gives us the equations
y0 + x0 − 2 − x0 y0 = 0
−y0 + x0 − 1 − x0 y0 = 0.
Subtracting,
2y0 − 1 = 0, so y0 = 1/2.
We now have from the first equation
1/2 + x0 − 2 − x0/2 = 0;
multiplying by 2,
1 + 2x0 − 4 − x0 = 0, so x0 = 3.
Thus z0 = 3/2, giving the equation of the plane
n · (x − p) = 0
with n = (y0, x0, −1) = (1/2, 3, −1) and p = (x0, y0, z0) = (3, 1/2, 3/2). Finally, in the form
Ax + By + Cz + D = 0,
we have
(1/2)x + 3y − z − 3/2 = 0,
or equivalently
x + 6y − 2z − 3 = 0.
To check our numbers we can verify that the three points are indeed on this plane.
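Carrying out that check (a small added numpy sketch, not part of the original solution):

```python
import numpy as np

n = np.array([0.5, 3.0, -1.0])   # normal vector (y0, x0, -1) with x0 = 3, y0 = 1/2
p = np.array([3.0, 0.5, 1.5])    # point of tangency (x0, y0, x0*y0)

# The tangency point and the two given points all satisfy n . (x - p) = 0:
for q in [p, np.array([1.0, 1.0, 2.0]), np.array([-1.0, 1.0, 1.0])]:
    print(n @ (q - p))           # 0.0 each time
```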
Remark 4.7. In these notes we have emphasized the role of three distinct ways of
presenting, or of viewing, the same object: for example a curve may be the graph of
a function, a level curve, or a parametrized curve. We wish to indicate how this fits
into a larger context, in other parts of mathematics.
First, here is a solution to part of Exercise 4.6: to write the image and kernel in
matrix form.
Consider
[ 2 3 ]           [ 2 ]     [ 3 ]
[ 1 2 ] [ s ] = s [ 1 ] + t [ 2 ]     (4)
[ 4 5 ] [ t ]     [ 4 ]     [ 5 ]
Thus if we write the columns of a (3 × 2) matrix as v = (v1, v2, v3), w = (w1, w2, w3), we have more generally
[ v1 w1 ]           [ v1 ]     [ w1 ]
[ v2 w2 ] [ s ] = s [ v2 ] + t [ w2 ]     (5)
[ v3 w3 ] [ t ]     [ v3 ]     [ w3 ]
defining the map T : R2 → R3 where T (s, t) = sv + tw. This is a parametrized plane,
which in this case passes through 0.
Given a parametrized plane in R3 , we should be able to find the general equation.
To do this we bring in the vector product, which we next explain. But first, a few
words about the determinant!

4.13. Two definitions of the determinant.


Algebraic definition: Let A be an (n × n) real or complex matrix. We begin with the usual algebraic definition, which is inductive on n. For n = 1, A = [a] = [A11] and det A is just the number a. For n = 2,
A = [ a b ]
    [ c d ]
and we set det(A) = ad − bc.
This is extended as follows. We define a matrix with entries Sij ∈ {1, −1} by Sij = (−1)^(i+j). To visualize this, we write simply the corresponding signs, in a checkerboard pattern:
S = [ + − + − ]
    [ − + − + ]
    [ + − + − ]
    [ − + − + ]
The ij minor A(ij) of A is defined to be the (n − 1) × (n − 1) matrix formed by removing the ith row and jth column of A.
Then we expand along the top row by forming the sum of the terms (±1) A1j det A(1j), where the signs are given by the top row of S, i.e.
det(A) = Σ_{j=1}^n (−1)^(1+j) A1j det A(1j).
Similarly we define the expansion along any row, Σ_{j=1}^n (−1)^(i+j) Aij det A(ij), or indeed along any column.
It turns out these are all equal, giving the same number whatever row or column is chosen.
Note that this algorithm also works for the (2 × 2) case!
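The expansion is easy to implement; here is a minimal recursive sketch (an addition to these notes; the test matrix is arbitrary), compared against numpy's built-in determinant:

```python
import numpy as np

def det_cofactor(A):
    """Determinant via expansion along the top row (the algebraic definition)."""
    n = A.shape[0]
    if n == 1:
        return A[0, 0]
    total = 0
    for j in range(n):
        # the (1, j+1) minor: remove row 0 and column j (0-indexed)
        minor = np.delete(np.delete(A, 0, axis=0), j, axis=1)
        total += (-1) ** j * A[0, j] * det_cofactor(minor)  # (-1)**j is (-1)^(1+j) in 1-indexed form
    return total

A = np.array([[2, 3, 1], [1, 2, 4], [4, 5, 0]])
print(det_cofactor(A))            # 5
print(round(np.linalg.det(A)))    # 5, agreeing with numpy
```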

Geometric definition:
Definition 4.11. Let M be an (n × n) real matrix. Then
detM = (±1)(factor of change of volume)
where we take +1 if M preserves orientation, −1 if that is reversed. (Here this is
n-dimensional volume and so is length, area in dimensions 1, 2).
Theorem 4.17. The algebraic and geometric definitions are equivalent.

Proof. For A (2 × 2), note that the factor of change of volume is the area of the image of the unit square, that generated by the standard basis vectors (1, 0) and (0, 1), which equals the area of the parallelogram with sides the matrix columns, (a, c) and (b, d).
Case 1: c = 0. Then the matrix is upper triangular and its determinant algebraically is ad. But the parallelogram area is (base)(height) = ad as well.
The formula area(parallelogram) = (base)(height) is usually proved by cutting off a triangle vertically and shifting it to the other side, thus forming a rectangle of the same base and height. Here is a different way to picture this: imagine the parallelogram is a pile of horizontal layers, like a stack of cards, and straighten the pile to a vertical pile by sliding the cards, ending up with the same (a × d) rectangle.
General Case: We reduce to Case 1 as follows, not by rotating (also possible!) but by sliding the far side of the parallelogram along the direction (b, d). A simple computation shows the area is indeed ad − bc.
Higher dimensions: We note that the above "sliding" operations can be done algebraically by an operation of column reduction, equivalently, multiplying on the right by an elementary matrix of determinant one. This reduces to the upper triangular case, and beyond to the diagonal case if desired.
We observe that the same procedure works in R3 and beyond. □

4.14. Orientation. We may be accustomed to thinking of a certain basis as having positive orientation and another negative, but this has no intrinsic meaning: what does make sense is to say that two given bases have the same or different orientation. As we shall explain, there are only two choices for this.
Thus, given Rn, we let B̂ denote the collection of all bases. The change from one basis B1 to another B2 is given by an invertible matrix A. By definition the collection of such matrices is called GL(n, R). The collection of those with det A > 0 is called GL⁺; these are the orientation-preserving matrices. (From the point of view of Group Theory, GL⁺ is a subgroup of index 2 of GL, and its coset is GL⁻, the collection (not a subgroup!) of orientation-reversing matrices.) Letting GL act on the bases B̂, we define two bases B1, B2 to have the same orientation iff one is taken to the other by an element of GL⁺. Since this subgroup has index 2, there are only these two choices, and the second case is expressed by saying they have opposite orientation.
Then, choosing one basis B1, we declare (arbitrarily) that it has positive orientation. The image of this by applying all elements of GL⁺ defines B̂⁺, the bases with positive orientation, and the complement defines B̂⁻, the bases with negative orientation. Note that B̂⁻ is the GL⁺-image of any B2 not in B̂⁺.

Theorem 4.18.
(i) det(AB) = det(A) det(B).
(ii) det(B⁻¹AB) = det(A).
Proof. Part (i) can be proved algebraically, but it is much easier to use the geometric definition of determinant, that det(A) = (±1) · (factor of change of volume). Since (AB)v = A(Bv) (this is multiplication of matrices, and we have the associative law), multiplying the volume by the factor b of B and then by the factor a of A changes it by the factor ab. Now we have the factor +1 if A preserves orientation, −1 if not. This again works for the product: changing the orientation twice leaves it fixed, and (−1)(−1) = 1.
Part (ii) follows from this. □
4.15. Three definitions of the vector product. The vector product v ∧ w is defined only on R3, to give a vector in R3. (In R2, we make the special definition (a, b) ∧ (c, d) = v ∧ w ∈ R3 where v = (a, b, 0) and w = (c, d, 0).) Here we present three equivalent definitions of the vector product. We write i, j, k for the standard basis vectors in R3. We write P(v, w, z) for the parallelepiped spanned by v, w, z ∈ R3, that is, P(v, w, z) ≡ {av + bw + cz : a, b, c ∈ [0, 1]}, and P(v, w) for the parallelogram spanned by v, w ∈ R3, so P(v, w) ≡ {av + bw : a, b ∈ [0, 1]}.
Theorem 4.19. The following definitions are equivalent.
(1) (Via the “determinant” formula):
\[
v \wedge w = \begin{vmatrix} \mathbf i & \mathbf j & \mathbf k \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{vmatrix}
= \mathbf i \begin{vmatrix} v_2 & v_3 \\ w_2 & w_3 \end{vmatrix}
- \mathbf j \begin{vmatrix} v_1 & v_3 \\ w_1 & w_3 \end{vmatrix}
+ \mathbf k \begin{vmatrix} v_1 & v_2 \\ w_1 & w_2 \end{vmatrix}
\]
(2) (The geometric definition):
v ∧ w satisfies the following properties:
(i) z = v ∧ w is perpendicular to v and to w;
(ii) The norm of z is equal to the area of the parallelogram P(v, w); thus
‖v ∧ w‖ = ‖v‖ ‖w‖ · |sin(θ)|.
(iii) If z ≠ 0, then (v, w, z) forms a positively oriented basis for R³.
(3) (The algebraic definition): The vector product is the anticommutative bilinear operation such that
i ∧ j = k, j ∧ k = i, k ∧ i = j.
Remark 4.8. (1) is the usual definition given in texts.
Regarding (2), we remark that θ is the angle from v to w, where in the plane this
would mean measured in the counterclockwise sense from v to w; in R3 , together
with an orientation, “counterclockwise” is defined by looking down along the thumb
for the right-hand rule. Note that since the modulus is taken, this is the same for the
angle −θ from w to v and in any case is positive as a norm should be.
The formula in (3) is easy to remember as it follows a circle from i to j to k.
Proof. To prove that (1) =⇒ (2), we use, for any vector u, the mixed product, a
mixture of the inner and vector products, u · (v ∧ w), and note that:
\[
u \cdot (v \wedge w) = \begin{vmatrix} u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{vmatrix} \tag{6}
\]
Taking u = v in (6) it follows that v · z = 0, and similarly for w, proving (i). Recall that
\[
\begin{vmatrix} u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{vmatrix} = \pm\,(\text{volume of the parallelepiped spanned by } u, v, w),
\]
using the fact that det M = det Mᵗ, where the sign is + iff the map preserves orientation,
since the parallelepiped is the image of the unit cube, and since from the geometric
definition in §4.13 we know the determinant gives ± (factor of change of volume).
Now taking u = z = v ∧ w in (6), then
\[
\|z\|^2 = z \cdot z = \begin{vmatrix} z_1 & z_2 & z_3 \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{vmatrix} \ge 0,
\]
so the orientation of (z, v, w) is positive. Using this, from the geometric definition of the
determinant,
\[
\begin{vmatrix} z_1 & z_2 & z_3 \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{vmatrix} = \mathrm{vol}(z, v, w),
\]
where this means the volume of the parallelepiped spanned by (z, v, w) (if linearly
independent). Here we use the fact that we can exchange rows for columns, as
det A = det Aᵗ. But since z is orthogonal to the base parallelogram, this volume is
(base area)(height). This gives
\[
\|z\|^2 = (\text{base area})(\text{height}) = (\text{base area})\,\|z\|,
\]
so ‖z‖ = (base area) as claimed. This concludes the proof that Def. (1) implies Def. (2).
It is clear that both Defs. (1), (2) imply (3); and knowing Def. (3) for the basis
vectors i, j, k determines v ∧ w for all v, w, by bilinearity. Hence all three are
equivalent.

Corollary 4.20. We have the nice (and useful!) formula
\[
\|v \wedge w\|^2 = (v \cdot v)(w \cdot w) - (v \cdot w)^2.
\]
Proof. From Theorem 4.19 we know that
\[
\|v \wedge w\|^2 = (\text{area})^2 = (\|v\|\,\|w\|\,|\sin\theta|)^2
= \|v\|^2\|w\|^2(1-\cos^2\theta) = \|v\|^2\|w\|^2 - (\|v\|\,\|w\|\cos\theta)^2
= \|v\|^2\|w\|^2 - (v \cdot w)^2. \qquad\square
\]

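The identities of Theorem 4.19 and Corollary 4.20 can likewise be verified numerically; here is a minimal Python sketch, assuming NumPy (an illustration only):

import numpy as np

# For random u, v, w in R^3 check: (a) v·(v∧w) = w·(v∧w) = 0;
# (b) the Lagrange identity ||v∧w||^2 = (v·v)(w·w) − (v·w)^2;
# (c) the mixed product u·(v∧w) equals det of the matrix with rows u, v, w.
rng = np.random.default_rng(1)
u, v, w = rng.standard_normal((3, 3))
z = np.cross(v, w)

assert np.isclose(v @ z, 0) and np.isclose(w @ z, 0)
assert np.isclose(z @ z, (v @ v) * (w @ w) - (v @ w) ** 2)
assert np.isclose(u @ z, np.linalg.det(np.vstack([u, v, w])))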
We shall next see how the vector product satisfies three important properties, the
first two of which we have already proved:
Definition 4.12. A Lie bracket [x, y] on a vector space V is an operation on V (a
function from V × V to V ) which satisfies the axioms:
– bilinearity;
– anticommutativity: [y, x] = −[x, y];
– the Jacobi identity
\[
[x, [y, z]] + [y, [z, x]] + [z, [x, y]] = 0.
\]
Proposition 4.21. The vector product v ∧ w on R3 is a Lie bracket, setting [v, w] =
v ∧ w.
Proof. We have shown the first two properties.


Now from (3) we have an exceptionally easy proof of the Jacobi identity, since by
bilinearity it is enough to check it on the basis vectors; for example
\[
[\mathbf i, [\mathbf j, \mathbf k]] + [\mathbf j, [\mathbf k, \mathbf i]] + [\mathbf k, [\mathbf i, \mathbf j]] = 0,
\]
since each term is 0, and similarly for the other cases. □
General equation of a plane; matrix form. Going back to the parametric
equation of a plane in (21), we had the linear transformation T : R² → R³ where
T(s, t) = sv + tw. Writing H for the (3 × 2) matrix and z for the (2 × 1) column
vector z = (s, t)ᵗ, then in matrix form this is
\[
Hz = \begin{pmatrix} v_1 & w_1 \\ v_2 & w_2 \\ v_3 & w_3 \end{pmatrix} \begin{pmatrix} s \\ t \end{pmatrix}
= s\begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix} + t\begin{pmatrix} w_1 \\ w_2 \\ w_3 \end{pmatrix} \tag{7}
\]
The image of the map H is a plane which passes through 0. The plane parallel to
this which passes through some point p is the image of the function Hₚ : (s, t) ↦
sv + tw + p. Note that H is a linear transformation, while Hₚ is affine but
not linear (unless p happens to lie on the plane Im(H)).
Given this parametric equation we can find the general equation of the plane
Im(Hₚ) as follows: we take our normal vector to be n = (A, B, C) where n = v ∧ w.
Then points (x, y, z) = sv + tw + p in the plane satisfy the equation
\[
(A, B, C) \cdot ((x, y, z) - p) = 0,
\]
giving
\[
Ax + By + Cz + D = 0
\]
where D = −n · p.
We have explained this above, in Exercise 4.12.
In matrix form this is
\[
Mv = \begin{pmatrix} A & B & C \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} -D \end{pmatrix}. \tag{8}
\]
Defining S : R3 → R to be the function S(x, y, z) = Ax + By + Cz then the plane
is the level surface of level −D of S.
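The passage from the parametric to the general equation can be carried out numerically; here is a minimal Python sketch, assuming NumPy, where the particular vectors v, w, p below are chosen only for illustration:

import numpy as np

# From the parametric plane T(s,t) = s v + t w + p, recover the general
# equation Ax + By + Cz + D = 0 with n = (A,B,C) = v ∧ w and D = −n·p.
v = np.array([1.0, 2.0, 0.0])
w = np.array([0.0, 1.0, 3.0])
p = np.array([1.0, 1.0, 1.0])

n = np.cross(v, w)   # normal vector (A, B, C)
D = -n @ p

# Every point T(s,t) of the plane satisfies n·(x,y,z) + D = 0:
for s, t in [(0.0, 0.0), (2.0, -1.0), (0.5, 3.0)]:
    point = s * v + t * w + p
    assert np.isclose(n @ point + D, 0.0)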
Putting these two maps together, we have the composition of maps, with the two
matrices acting on column vectors:
\[
M_{2,1} \xrightarrow{\;H\;} M_{3,1} \xrightarrow{\;M\;} M_{1,1}
\]
or as linear transformations:
\[
\mathbb{R}^2 \xrightarrow{\;T\;} \mathbb{R}^3 \xrightarrow{\;S\;} \mathbb{R}^1 \tag{9}
\]

Restricting to the image of T, the geometrical plane P ⊆ R³, we have:
\[
\mathbb{R}^2 \xrightarrow{\;T\;} P \xrightarrow{\;S\;} \{-D\} \tag{10}
\]

The plane P is a set of points (x, y, z), which on the one hand is the image of the
map T , and on the other is a translate of the kernel of the map S by the vector p.
Level surfaces of different levels (that is, planes which are parallel, with different
constants D) fit together as described by Equation (10).
Remark 4.9. The important point in this is the following: the plane P is,
by itself, simply a subset of points, a two-dimensional affine subspace of R³. However,
Equation (10) gives us two very different ways of viewing P: via the map T or the
map S.
Summarizing, P is the image of R² via the map T. That is, the map T parametrizes
P; thus via this map P becomes the parametrized plane Hₚ(s, t) = sv + tw + p.
On the other hand, via the map S, P is the preimage (inverse image) of a constant
value −D. Thus it is seen to be a level surface of the map (of level −D), and so it
is only one of a family of parallel planes, of different levels.
This also gives us insight as to the meaning of the diagram: it says something about
the object (in this case the plane P ) in the middle, from two different perspectives,
given by the two maps.
Again, this just reflects the difference between our two ways of understanding a
plane, as a parametrized plane, see Equation (21), or as the solution set of its general
equation. And this latter is, geometrically, a plane which passes through a point and
has a certain normal vector, n = (A, B, C).
This is the simplest case, of a line in the plane or a plane in space. The general
situation comes from these fundamental results of Linear Algebra:
Theorem 4.22. Given finite-dimensional vector spaces V, W, let T : V → W be a
linear transformation. Then:
(i) the null space N(T) is a vector subspace of V;
(ii) the image Im(T) is also; and
(iii) dim(N(T)) + dim(Im(T)) = dim(V).
Exercise 4.15. Prove (i), (ii)! See Exercise 4.6.
Corollary 4.23. If T above is surjective, then dim(N (T )) = dim(V ) − dim(W ).
Before we describe the proof, we write it as a diagram, of linear transformations
on vector spaces:

\[
K \xrightarrow{\;I\;} V \xrightarrow{\;T\;} W
\]
Here the first map I is the inclusion I(v) = v, a one-to-one (injective) function.
Its image is the subspace Im(I) = K, which is the kernel of T, and the image of T is W.
That is, the map T is onto.
For the previous example, the map I represents the plane K as a parametrized
subspace, while the map T gives its general equation.
In Algebra, a diagram of maps where the image of one map is the kernel of the
following map is called an exact sequence. In fact, the above diagram of vector spaces
extends to
\[
\{0\} \longrightarrow K \xrightarrow{\;I\;} V \xrightarrow{\;T\;} W \xrightarrow{\;\pi\;} \{0\}
\]
where I is the injection and π is the projection π(w) = 0. This extended diagram is
also exact: exactness of the first part
\[
\{0\} \longrightarrow K \xrightarrow{\;I\;} V
\]
says that I is injective (one-to-one to its image), since the kernel of I is then {0}, while
exactness of the second part
\[
V \xrightarrow{\;T\;} W \xrightarrow{\;\pi\;} \{0\}
\]
tells us that the map T is onto (surjective), as the kernel of π is all of W, which by
exactness is the image of T.
Back to the proof of the theorem, part (iii) can be proved by writing the map as
a matrix and solving the system of linear equations.
For example when m = 3 and n = 2, we have the following.
Given a matrix
\[
M = \begin{pmatrix} A & B & C \\ D & E & F \end{pmatrix}
\]
we have the matrix equation

\[
Mv = w
\]
where w is fixed, and M is fixed, and by the solution set of this equation we
mean the collection of all v which satisfy it. Writing w = (s, t) we have
\[
\begin{pmatrix} A & B & C \\ D & E & F \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} s \\ t \end{pmatrix}. \tag{11}
\]
The multiplication v ↦ Mv defines a linear transformation T : R³ → R². Note
that Im(T) is equal to the column space of M, the subspace of R² generated by the
columns of the matrix. This is simply because for a standard column basis vector eₖ,
Meₖ gives the kᵗʰ column of M.
Note that the matrix equation (11) is equivalent to the “system of two linear
equations in three unknowns”:
\[
\begin{cases} Ax + By + Cz = s \\ Dx + Ey + Fz = t \end{cases}
\]
This system has full rank iff the rows are linearly independent, iff the dimension of
the image Im(T ) is the maximum possible, in this case 2.
From Linear Algebra we can find the solution set explicitly by row reduction.
For a concrete example, after row reduction we may have
\[
\begin{cases} x + y = s \\ 2z = t \end{cases}
\]
and we are free to choose y (for this reason known as a “free variable”) but then no
longer free to choose x or z as these are determined, since x = −y + s and z = t/2.
Thus we have for a solution
\[
(x, y, z) = (-y + s,\; y,\; t/2) = (s, 0, t/2) + y(-1, 1, 0) = p + yv = l(y),
\]
which is a parametrized line passing through the point p in the direction of v.
If we change s, t we get lines parallel to this one. In particular, if s = t = 0, so
p = (0, 0, 0), then the solution set is the parametric line l(y) = yv, and this is the
kernel of the map T, of dimension 1.
In conclusion, the dimension of the solution set is the number of free variables, so
in this case of full rank this is, by Cor. 4.23, 3 − 2 = 1, indeed a line.
The geometrical way to think of this is that each of the equations gives a plane, so
the solutions for the pair of equations is the intersection of two planes which is a line;
the full rank condition means that these planes are not parallel, since their normal
vectors are the rows of M , which are linearly independent.
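The structure of this solution set (a parametrized line p + yv) can be checked directly; here is a minimal Python sketch, assuming NumPy, with s, t chosen arbitrarily for illustration:

import numpy as np

# The solution set of  x + y = s,  2z = t  is the line l(y) = p + y v,
# with p = (s, 0, t/2) and v = (−1, 1, 0) spanning the kernel.
M = np.array([[1.0, 1.0, 0.0],
              [0.0, 0.0, 2.0]])
s, t = 5.0, 4.0
p = np.array([s, 0.0, t / 2])
v = np.array([-1.0, 1.0, 0.0])

for y in (-1.0, 0.0, 2.5):
    assert np.allclose(M @ (p + y * v), [s, t])  # every such point is a solution
assert np.allclose(M @ v, 0.0)  # v spans the kernel: one free variable, dimension 1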
4.16. The Inverse and Implicit Function Theorems. What we will see next
is how this same point of view applies for the much more general situation of a
differentiable, but nonlinear, map.
We know that given F : Rᵐ → Rⁿ continuously differentiable, the derivative matrix
DF|ₚ well-approximates the function at the point p. That means that certain
properties of F near p should be reflected in the linear map DF|ₚ and vice-versa.
Important examples are given by these two theorems, which are closely related.
Indeed one can choose to prove either one first, then deducing the other from that.
The Inverse Function Theorem states that F : Rᵐ → Rᵐ is invertible near p iff
the matrix DF|ₚ is invertible, which is true iff det(DF|ₚ) ≠ 0. First we need:
Definition 4.13. F : Rᵐ → Rⁿ is continuously differentiable (of class C¹) iff the
derivative DF at each point p exists and the matrix DFₚ is a continuous function of p.
Given an open set U ⊆ Rᵐ, a function F : U → V = F(U) is invertible iff there
exists F̃ defined on V such that F̃ ∘ F is the identity on U and F ∘ F̃ is the identity
on V.
Theorem 4.24. (Inverse Function Theorem) Let F : Rᵐ → Rᵐ be C¹. Suppose the
matrix DFₚ is invertible. Then there exists an open set U containing p such that F
is C¹ and invertible on U.
Proof. See [Mar74], p. 206 and p. 230 or (for a stronger statement, with estimates)
[HH15] p. 264 ff. 
Remark 4.10. Note that by the Chain Rule, we then know that for all points x ∈ U,
with y = F(x), we have $\widetilde F \circ F(x) = x$, so $I = D(\widetilde F \circ F)(x) = (D\widetilde F)_{F(x)}\,DF_x$, whence the inverse
of the matrix $DF_x$ is $(DF_x)^{-1} = D\widetilde F_y$.
The Implicit Function Theorem states that for a C¹ function F : Rᵐ → Rⁿ with
m ≥ n, if the derivative matrix at a point p is surjective (onto; of full rank), then
the inverse image set F⁻¹(q) for q = F(p) behaves like the inverse image of a point
by the matrix: it is a submanifold of dimension m − n. A submanifold of dimension
1 is a parametrized curve; one of dimension 2 is a parametrized surface.
For example, for F : R³ → R, if DF(p) has full rank then F⁻¹(q) is a submanifold
of R³ of dimension 3 − 1 = 2. This means that it is (locally) a parametrized surface.
The Implicit Function Theorem moreover gives conditions when given an equation
F (x1 , . . . , xn ) = 0
we can solve for one of the variables, and use the rest as our parameters. For the
simplest example, F(x, y) = x² + y² with F(x, y) = 1 becomes y = ±√(1 − x²).
Just as for matrices, the Implicit Function Theorem has a version for F : Rm → Rn
whenever m ≥ n. The case m = n is indeed the Inverse Function Theorem!
Definition 4.14. A submanifold M ⊆ Rᵐ of dimension d < m is the following: there
exists an open set U ⊆ Rᵈ and Φ : U → M surjective and C¹ such that DΦ is injective
at each point of U, onto a linear subspace of dimension d.
Theorem 4.25. (Implicit Function Theorem) Let F : Rᵐ → Rⁿ be C¹, with m ≥ n.
Suppose the matrix DFₓ is surjective at each x ∈ F⁻¹(p). Then the set F⁻¹(p) is a
submanifold of Rᵐ of dimension m − n. That is, for x ∈ F⁻¹(p), there exists an open
subset U ⊆ Rᵐ⁻ⁿ and a C¹ map H : U → Rᵐ parametrizing F⁻¹(p) near x, so that
F ∘ H(u) = p for all u ∈ U.
Proof. See [War71] Theorem 1.38, p. 31. 
For example, if F : R³ → R, then the level surface F⁻¹(p) is a parametrized surface:
its parametrization near the point x is given by the map H.
Remark 4.11. Similarly to the linear case as explained in Remark 4.9, in the smooth
case the same set (an embedded manifold) is viewed in two different ways, by means
of two maps: for one it is the image, for the other part of the domain. The first
parametrizes the manifold; the second exhibits it as a level curve, surface or manifold
of a map on the higher-dimensional space, and thus shows how it is but one of a family
of such “parallel” manifolds. This structure is a special mathematical object known
as a foliation.
Thus the level surfaces of a function F : R3 → R foliate R3 , and in the special case
of F linear, F is given by the inner product with a normal vector, and the foliation
consists of all those parallel planes.
Thus a parametrized m-dimensional manifold M ⊆ Rⁿ (for m ≤ n) is given by a map
α : U → M ⊆ Rⁿ where U is a connected open subset of Rᵐ and α is differentiable
and invertible with image M.
The higher dimensional version of level curves and surfaces can be stated as follows.
Given f : Rᵐ⁺¹ → R differentiable with Df everywhere onto (one says Df is of
maximal rank), then for any value c in the image the set M = f⁻¹(c) is locally a
parametrized m-dimensional manifold. Moreover it suffices that this condition hold
on the level set itself (not at all points): a value y is called a regular value iff Df|ₚ
is of maximal rank for every p with f(p) = y, and the conclusion holds for every
regular value y.
Proposition 4.26. (Lemmas 1,2 of Chapter 2 of [MW97]) If f : M → N is a
smooth map between manifolds of dimension m ≥ n, and if y ∈ N is a regular
value, then the set f −1 (y) is a smooth manifold of dimension m − n. The null space
of Dfx : T Mx → T Ny is the tangent space of this submanifold, and its orthogonal
complement is mapped onto T Ny .
One then has a similar diagram
\[
f^{-1}(y) \xrightarrow{\;\alpha\;} M \xrightarrow{\;f\;} \{y\}
\]
where the first map is injective and the second is surjective. Passing to the
derivative maps, one gets the exact diagram of the linear case:
\[
\{0\} \longrightarrow T(f^{-1}(y))_x \xrightarrow{\;D\alpha|_x\;} TM_x \xrightarrow{\;Df_x\;} TN_y \longrightarrow \{0\}
\]
For the differentiable case, there are many versions of these theorems. For an
introduction see Lemmas 1,2 of Chapter 2 of [MW97] and for surfaces Proposition 2
of Chapter 2 of [DC16]. For a simple and beautiful general statement see Theorem
1.39 of [War71].
Remark 4.12. A nice simpler version with examples is on pp. 239-240 of Vol. II of
[Gui02]. See also my handwritten Notas de Aula.
More on the Implicit and the related Inverse Function Theorem can be found e.g. in §7.2-4
of [Mar74], and in Chapter 2.10 and on p. 729 of [HH15]. We next consider some
examples.
Example 3. For F(x, y) = x² + y², the curve of level 1 is the unit circle, the set of
solutions of the equation (i.e. all pairs (x, y) which satisfy)
\[
x^2 + y^2 = 1.
\]
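For this example the conclusion of the Implicit Function Theorem can be tested numerically: near a point of the circle with y > 0 we solve for y = g(x) = √(1 − x²), and the implicit derivative formula g′(x) = −Fₓ/F_y can be compared with a finite-difference slope. A minimal Python sketch, assuming NumPy (an illustration only):

import numpy as np

def g(x):
    # explicit solution of x^2 + y^2 = 1 on the upper half circle
    return np.sqrt(1.0 - x ** 2)

x, h = 0.3, 1e-6
numeric_slope = (g(x + h) - g(x - h)) / (2 * h)  # central difference
Fx, Fy = 2 * x, 2 * g(x)                         # partials of F(x,y) = x^2 + y^2
assert np.isclose(numeric_slope, -Fx / Fy, atol=1e-6)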
Higher derivatives.
For the second-order approximation we add a term involving the second-order par-
tial derivatives and so on. This gets more and more complicated as we describe
next.
The map F is called C⁰ iff it is continuous, and Cᵏ iff the kᵗʰ derivative exists and
is continuous. For this we need to define higher derivatives.
Writing L(V, W) for the collection of linear transformations from V to W, this is a
Banach space with the operator norm. Since DF : V → L(V, W), we see that the
second derivative at x is a linear map D²Fₓ : V → L(V, W), and thus
D²F : V → L(V, L(V, W)), and so on.
In the same way the second, third derivatives are defined, with matrices of increas-
ing size.
The only exception is when n = 1, for a curve γ : [a, b] → Rᵐ: in this case (as
noted above) γ′ is also a curve in Rᵐ, thus so is γ″ = (γ′)′, etcetera. By contrast,
for a function F : Rⁿ → R the gradient ∇F : Rⁿ → Rⁿ is a vector field, so
DF : Rⁿ → Rⁿ; but then the second derivative is no longer a vector field, as
D²F : Rⁿ → Rⁿ × Rⁿ ≅ Rⁿ², and so on, getting more and more complicated.
A domain is an open subset of Rⁿ. A vector field on a domain U is simply such a
map defined only on the subset U. The vector field is termed Cᵏ, for k ≥ 0, iff the
map has those properties (again, C⁰ means continuous, and Cᵏ that DᵏF exists and
is continuous, so D : Cᵏ⁺¹ → Cᵏ).

4.17. Higher order partial derivatives. Given F : R² → R, then Fₓ = ∂F/∂x is a
function defined on the plane. Setting G(x, y) = Fₓ(x, y), we can take its partial
derivatives. We write ∂G/∂y in these equivalent ways:
\[
G_y = \frac{\partial}{\partial y}(G) = \frac{\partial}{\partial y}(F_x) = \frac{\partial F_x}{\partial y} = (F_x)_y = F_{yx}.
\]
(This notation can be confusing, since F_{yx} = (F_x)_y!)
(This notation can be confusing since Fyx = (Fx )y !)
Now for G = Fx then G : R2 → R. This has as its gradient
∇G = (Gx , Gy ) = (Fxx , Fyx ).
Similarly for G
e = Fy then its gradient is

∇G
e = (G
ex , G
ey ) = (Fxy , Fyy ).

When a function L : R² → R² is written in components L = (L₁, L₂), or in matrix form
\[
[L] = \begin{pmatrix} L_1 \\ L_2 \end{pmatrix},
\]
we know its derivative matrix is
\[
[DL] = \begin{pmatrix} \nabla L_1 \\ \nabla L_2 \end{pmatrix} = \begin{pmatrix} (L_1)_x & (L_1)_y \\ (L_2)_x & (L_2)_y \end{pmatrix}.
\]
So for DF : R² → R² we have
\[
D^2F = D(DF) = \begin{pmatrix} (F_x)_x & (F_x)_y \\ (F_y)_x & (F_y)_y \end{pmatrix} = \begin{pmatrix} F_{xx} & F_{yx} \\ F_{xy} & F_{yy} \end{pmatrix}.
\]
Now in fact it is a bit simpler than this, because of the following:
Proposition 4.27. For F : R² → R of class C², we have F_{xy} = F_{yx}.
Equality of mixed partials.
The important fact in the above Proposition is often called the equality of mixed partials.
It can be proved using just derivatives, but we like the following “Fubini's
Theorem argument”, partly because it leads in to Green's Theorem later on:
Lemma 4.28. If f : Rⁿ → R has continuous second partial derivatives, then we can
change the order in taking two partial derivatives: e.g. for n = 2,
\[
\frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right).
\]
Proof. Given two continuous functions $\varphi, \widetilde\varphi : \Omega \to \mathbb{R}$ on an open set Ω, if for
every rectangle R ⊆ Ω we have
\[
\iint_R \varphi \,dx\,dy = \iint_R \widetilde\varphi \,dx\,dy,
\]
then we can conclude that $\varphi = \widetilde\varphi$ on Ω. (Because, if they differ at a point, then one
is larger than the other on a small rectangle about that point, and the integrals there
are different, a contradiction.)
We define $\varphi(x, y) = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right)(x, y)$ and $\widetilde\varphi(x, y) = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right)(x, y)$. Our strategy of proof
will be to show that for any R = [a, b] × [c, d] we have the above equality of integrals,
and the result will then follow.
Fubini's Theorem tells us that
\[
\iint_R \varphi(x, y)\,dx\,dy = \int_c^d\!\left(\int_a^b \varphi(x, y)\,dx\right) dy = \int_c^d\!\left(\int_a^b \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right) dx\right) dy.
\]
Now
\[
\int_a^b \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right)(x, y)\,dx = \frac{\partial f}{\partial y}(b, y) - \frac{\partial f}{\partial y}(a, y),
\]
so the iterated integral equals
\[
\int_c^d \frac{\partial f}{\partial y}(b, y)\,dy - \int_c^d \frac{\partial f}{\partial y}(a, y)\,dy = \bigl(f(b, d) - f(b, c)\bigr) - \bigl(f(a, d) - f(a, c)\bigr).
\]

Again, by Fubini's Theorem:
\[
\iint_R \widetilde\varphi(x, y)\,dx\,dy = \int_a^b\!\left(\int_c^d \widetilde\varphi(x, y)\,dy\right) dx = \int_a^b\!\left(\int_c^d \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right) dy\right) dx.
\]
This time,
\[
\int_c^d \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right)(x, y)\,dy = \frac{\partial f}{\partial x}(x, d) - \frac{\partial f}{\partial x}(x, c),
\]
so the iterated integral equals
\[
\int_a^b \frac{\partial f}{\partial x}(x, d)\,dx - \int_a^b \frac{\partial f}{\partial x}(x, c)\,dx = \bigl(f(b, d) - f(a, d)\bigr) - \bigl(f(b, c) - f(a, c)\bigr),
\]
which equals the previous expression, finishing the proof. □


 
Corollary 4.29. The above matrix D²F is symmetric: it has the form
\[
\begin{pmatrix} a & b \\ b & c \end{pmatrix}.
\]
In the case of F : Rⁿ → R, all of this makes sense: D²F is a symmetric (n × n)
matrix, sometimes called the Hessian matrix. Its determinant is called the Hessian
determinant or simply the Hessian. In Guidorizzi Vol. 2 §16.3 [Gui02] this is written
H(x, y); see also §3.6 of [HH15].
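The symmetry F_{xy} = F_{yx} is easy to test numerically by nested central differences; here is a minimal Python sketch, assuming NumPy, for a sample smooth function chosen only for illustration:

import numpy as np

def F(x, y):
    return np.sin(x * y) + x ** 3 * y ** 2

def mixed(x, y, first, h=1e-4):
    # nested central differences: (F_x)_y if first == 'x', else (F_y)_x
    if first == 'x':
        Fx = lambda a, b: (F(a + h, b) - F(a - h, b)) / (2 * h)
        return (Fx(x, y + h) - Fx(x, y - h)) / (2 * h)
    Fy = lambda a, b: (F(a, b + h) - F(a, b - h)) / (2 * h)
    return (Fy(x + h, y) - Fy(x - h, y)) / (2 * h)

assert np.isclose(mixed(0.7, -0.4, 'x'), mixed(0.7, -0.4, 'y'), atol=1e-5)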

The meaning of this symmetric matrix becomes clear when discussing Taylor poly-
nomials of order 2, and finding maximums and minimums.

4.18. Finding maximums and minimums. We note that:
(1) Given F : R² → R, if a minimum or maximum value occurs at p = (x₀, y₀), then
the tangent plane must be horizontal. Equivalently, if F is differentiable, the
partial derivatives Fₓ, F_y at p are 0.

Definition 4.15. In this case, p is a critical point (ponto crítico) of F; equivalently,
∇F(p) = 0.

(2) If it is a maximum then it must be a maximum for the function restricted to the
line x = x0 . We can then consider the second partials and use the second derivative
test from Calculus 1: If Fxx > 0 then Fx is increasing, so it is a minimum along that
line. This does not necessarily mean it is a minimum off the line.
However there is a fuller method: see Guidorizzi Vol 2 §16.3 [Gui02], and §3.6 of
[HH15].
(i) If Fₓₓ > 0 at p and the Hessian H(p) > 0, then p is a local minimum.
(ii) If Fₓₓ < 0 at p and the Hessian H(p) > 0, then p is a local maximum.
(iii) If H(p) < 0, then p is a saddle point; thus it is neither max nor min.
(iv) If H(p) = 0, then we cannot say from this test and have to look more closely.

Exercise 4.16. Compare the above tests for the functions we have encountered:
F (x, y) = x2 + y 2 , F (x, y) = x2 − y 2 , F (x, y) = xy.
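After working the exercise by hand, the test can be checked symbolically; here is a minimal Python sketch, assuming the SymPy library is available:

import sympy as sp

x, y = sp.symbols('x y')
for F in (x**2 + y**2, x**2 - y**2, x*y):
    H = sp.hessian(F, (x, y))
    print(F, ': Fxx =', H[0, 0], ', Hessian det =', H.det())
# x**2 + y**2 : Fxx = 2, det = 4  -> local minimum at 0
# x**2 - y**2 : Fxx = 2, det = -4 -> saddle point
# x*y         : Fxx = 0, det = -1 -> saddle point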

4.19. The Taylor polynomial and Taylor series. Taylor series in one dimension.
Given a map f : R → R, the terminology “kᵗʰ-order approximation” to f at a
point x ∈ R comes from the Taylor polynomials and Taylor series. The best kᵗʰ-order
approximation at x ∈ R is the polynomial of degree k which best fits the map near
that point: the polynomial which has all the same derivatives at that point, up to
order k.
up to order k.
Thus the best 0th -order approximationof f at p ∈ R is the constant map with the
value at that point: the map x 7→ f (p). To get the best first-order approximation we
add on the linear map given by the derivative matrix f 0 (p).
This is the affine map
x 7→ f (p) + f 0 (p)(x − p)
whose graph is the tangent line to the graph of f at that point.
For a function f : R → R, we define a sequence of polynomials, each of degree n,
which approximate this function better and better as n → ∞. For this we choose
a point about which we make the approximation, and call this the Taylor polyno-
mial about x0 . Here for simplicity we work with x0 = 0, and note that the Taylor
polynomials in this cae are also called Maclaurin polynomials.
Let us recall that a polynomial of degree n is

\[
p(x) = a_0 + a_1 x + \cdots + a_k x^k + \cdots + a_n x^n.
\]

Figure 10. cos(x) and its Taylor polynomials pn for n = 0, 2, 4, 10.

Here k is a nonnegative integer and aₖ ∈ R. Since x⁰ = 1 for any x ∈ R, this is
equal to
\[
p(x) = a_0 x^0 + a_1 x^1 + \cdots + a_n x^n = \sum_{k=0}^{n} a_k x^k.
\]

Figure 11. sin(x) and its Taylor polynomials pn for n = 1, 3, 5, 13.

(Here we use the definition 0! = 1.) Thus a polynomial of degree 0 is a constant
function p(x) = a₀, of degree one is an affine function p(x) = a₀ + a₁x, of degree two
is quadratic, and so on.
We write pₙ for the nᵗʰ Taylor polynomial (about 0). We also say this is the
nᵗʰ-order approximation to f.
In the nicest cases, pₙ actually converges to f as n → ∞. This is true, for example,
for f(x) = sin(x), cos(x), eˣ.
The Taylor series is the infinite series, which can be thought of as an infinite
polynomial. For example, the nᵗʰ Taylor polynomial for eˣ is
\[
1 + x + x^2/2! + x^3/3! + \cdots + x^n/n!
\]
and the Taylor series is
\[
e^x = 1 + x + x^2/2! + x^3/3! + \cdots + x^n/n! + \cdots = \sum_{k=0}^{\infty} \frac{x^k}{k!} \tag{12}
\]
For f(x) = sin(x) we have
\[
\sin(x) = x - x^3/3! + x^5/5! - \cdots
\]
with only the odd powers, and
\[
\cos(x) = 1 - x^2/2! + x^4/4! - \cdots
\]
with only the even powers, both with alternating signs.
That the graphs of the polynomials pₙ do approach the function can be seen in
Figs. 10, 11. The 0ᵗʰ-order approximation is a horizontal line with that value, the
1ˢᵗ-order approximation is the tangent line to the graph at that point. The 2ⁿᵈ-order
approximation is the parabola which best fits the curve, and so on. In Figs. 10 and 11
we show the Taylor polynomials pₙ for cos and sin. Note how close the fit becomes
as n increases!

Exercise 4.17. Check that for eˣ the derivative of pₙ₊₁ is pₙ, and that for sin(x)
the derivative of its pₙ₊₁ is the Taylor polynomial pₙ of cos(x). This agrees with
(eˣ)′ = eˣ and (sin)′ = cos, (cos)′ = −sin!
The definition of the Taylor series for an infinitely differentiable function (about 0) is
\[
\sum_{k=0}^{\infty} a_k x^k
\]
where $a_k = f^{(k)}(0)/k!$ (here $f^{(k)}(0)$ is the kᵗʰ derivative of f at 0; note that we
write $f^{(0)}$ for f itself).
Thus the Taylor polynomial of degree n is
\[
p_n(x) = \sum_{k=0}^{n} \frac{f^{(k)}(0)}{k!}\, x^k.
\]

Exercise 4.18. (1) Check that this general formula does give the above Taylor series
for eˣ, sin(x) and cos(x).
(2) Show that the polynomial pₙ has the same derivatives as f at x = 0, of order
0, 1, . . . , n. That is, pₙ(0) = f(0), pₙ′(0) = f′(0), pₙ″(0) = f″(0), and so on.
To define the Taylor polynomials about x₀ we simply replace x by (x − x₀) and the
derivatives at 0 by those at x₀. Thus the Taylor polynomial of degree n about x₀ is
\[
p_n(x) = \sum_{k=0}^{n} \frac{f^{(k)}(x_0)}{k!}\,(x - x_0)^k.
\]
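One can watch the convergence of the Taylor polynomials numerically; here is a minimal Python sketch for eˣ about x₀ = 0 (an illustration only):

import math

def taylor_exp(x, n):
    # the nth Maclaurin polynomial of exp: sum of x^k / k! for k = 0, ..., n
    return sum(x ** k / math.factorial(k) for k in range(n + 1))

x = 1.5
for n in (2, 5, 10):
    print(n, taylor_exp(x, n), math.exp(x))
# The printed values approach e^1.5 rapidly as n grows.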
The Taylor polynomial in higher dimensions.
We can understand the role of the Hessian matrix and Hessian determinant much
better by explaining how to define the Taylor polynomial of a function of 2 variables.
Consider F : R² → R. The best 0ᵗʰ-order approximation about 0 = (0, 0) is the
constant function with the value F(0). The 1ˢᵗ-order approximation is the tangent
plane to the graph at 0. The best 2ⁿᵈ-order approximation may be a paraboloid, but
could instead be a hyperbolic paraboloid, Fig. 2. This depends on the partial
derivatives of order 2 at the point.
The Taylor polynomials pₙ will be functions of two variables (x, y), thus pₙ : R² → R.
Just as in one dimension, a polynomial of degree n is a linear combination of basic
terms of degree k for k = 0 (a constant) up to k = n. Each basic term of degree k is
of the form xⁱyʲ with i + j = k. Thus for example the degree of x²y³ is 2 + 3 = 5;
the basic polynomials of degree 1 are p(x, y) = x, p(x, y) = y, and those of degree 2
are p(x, y) = x², p(x, y) = y² and p(x, y) = xy.
Taking a linear combination of terms of degree ≤ n gives a polynomial of degree n;
for example
\[
p(x, y) = 1 + x + 3y + x^2 + y^2 + 5xy
\]
has degree 2.
Consider for example p(x, y) = x² + y². Its graph is a paraboloid, while the graph
of p(x, y) = xy is a hyperbolic paraboloid. See Figs. 3, 2.

Both these polynomials have degree 2, and both have horizontal tangent plane at 0.
The first has a minimum there, while the second has a saddle point, hence neither
max nor min: when x = y we have F(x, y) = xy = x², an upward parabola, so a
minimum along the line x = y; when x = −y we have F(x, y) = −x², so a maximum.
Thus (0, 0) can be neither max nor min. This is the essence of a saddle point.
In fact the terms of order 2 can be understood with the help of a symmetric matrix.
Definition 4.16. A quadratic form on R² is a function of the form
\[
Q(x, y) = ax^2 + by^2 + cxy.
\]
That is, it is a linear combination of the possible terms of degree 2.
Proposition 4.30. Given a quadratic form Q, there is a symmetric (2 × 2) matrix A
such that for v = (x, y)ᵗ,
\[
Q(v) = v^t A v, \qquad\text{that is,}\qquad Q(v) = \begin{pmatrix} x & y \end{pmatrix} A \begin{pmatrix} x \\ y \end{pmatrix}.
\]
 
Proof. In fact, for
\[
A = \begin{pmatrix} a & c/2 \\ c/2 & b \end{pmatrix}
\]
we have
\[
\begin{pmatrix} x & y \end{pmatrix}\begin{pmatrix} a & c/2 \\ c/2 & b \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = ax^2 + by^2 + cxy = Q(x, y). \qquad\square
\]
 
Exercise 4.19. Check that when $A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, then Q(x, y) = 2xy. What do we get
for $A = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$? For $\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$?
To graph a quadratic form we have the following:
Theorem 4.31. A quadratic form
\[
Q(x, y) = ax^2 + by^2 + cxy = \begin{pmatrix} x & y \end{pmatrix} A \begin{pmatrix} x \\ y \end{pmatrix}
\]
has either
(i) a local min or max at 0, if det A > 0;
(ii) a saddle point at 0, if det A < 0.
If det A = 0, we cannot tell from this test.
Proof. (Sketch) From Linear Algebra, a symmetric matrix A can be diagonalized:
there exists an orthogonal matrix U such that U⁻¹AU = D where D is diagonal.
Now an orthogonal matrix is a rotation, a reflection, or a product of these; that does
not change whether 0 is a saddle point, max or min. Also, det D = det U⁻¹AU = det A;
this shows that det A is the product of the eigenvalues, since the eigenvalues of A and
D are the same. The graph of the quadratic form defined by D has the two types
described, completing the proof. See the above examples. □
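The diagonalization argument can be illustrated numerically; here is a minimal Python sketch, assuming NumPy, classifying the three quadratic forms above by the eigenvalues of A (both positive: minimum; both negative: maximum; mixed signs: saddle):

import numpy as np

for A in (np.array([[1.0, 0.0], [0.0, 1.0]]),    # Q = x^2 + y^2 : minimum
          np.array([[1.0, 0.0], [0.0, -1.0]]),   # Q = x^2 - y^2 : saddle
          np.array([[0.0, 1.0], [1.0, 0.0]])):   # Q = 2xy       : saddle
    eigs = np.linalg.eigvalsh(A)                 # eigenvalues of the symmetric matrix A
    print(eigs, 'det =', np.prod(eigs))          # det A = product of the eigenvalues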

We use this to study the Taylor polynomial of order 2 of F : R² → R. Write A for
the Hessian matrix
\[
A = D^2F = \begin{pmatrix} F_{xx} & F_{yx} \\ F_{xy} & F_{yy} \end{pmatrix} \tag{13}
\]
as explained above; note that A is symmetric!
Then the Taylor polynomials about p = (x₀, y₀) are, for degree 0:
\[
p_0(x, y) = F(p).
\]
For degree 1, writing h = (x − x₀), k = (y − y₀), then
\[
p_1(x, y) = F(p) + F_x(p)(x - x_0) + F_y(p)(y - y_0) = F(p) + F_x(p)h + F_y(p)k
\]
(this is the tangent plane).
For degree 2 we have
\[
p_2(x, y) = F(p) + F_x(p)h + F_y(p)k + \frac{1}{2}\begin{pmatrix} h & k \end{pmatrix}\begin{pmatrix} F_{xx} & F_{yx} \\ F_{xy} & F_{yy} \end{pmatrix}\begin{pmatrix} h \\ k \end{pmatrix}
= F(p) + F_x(p)h + F_y(p)k + \frac{1}{2}\bigl(F_{xx}h^2 + 2F_{xy}hk + F_{yy}k^2\bigr).
\]
Thus
\[
p_2(x, y) = p_1(x, y) + \begin{pmatrix} h & k \end{pmatrix}\frac{D^2F(p)}{2}\begin{pmatrix} h \\ k \end{pmatrix},
\]
which reminds us of the formula for dimension 1. Note that the last term is a quadratic
form, since the Hessian matrix D²F is symmetric.
Looking for maximum and minimum points, first we see if the tangent plane is
horizontal. Then the first-order term is 0, so the Taylor polynomial is simply
\[
p_2(x, y) = F(p) + \begin{pmatrix} h & k \end{pmatrix}\frac{D^2F(p)}{2}\begin{pmatrix} h \\ k \end{pmatrix} = F(p) + Q(h, k)
\]
where Q is a quadratic form.
As shown in Theorem 4.31, the sign of the determinant then tells us whether the
surface gives a max or min, as for a paraboloid or elliptic paraboloid (like a paraboloid
but with an ellipse as cross-section), or a saddle point, as for F(x, y) = xy, discussed
above.
For F : Rⁿ → R with n ≥ 2 a similar formula can be given. The higher terms of
the Taylor series also have a nice expression when the derivatives DᵏF are viewed as
k-linear functions. For a clear treatment see [Mar74]; see also §3.6 of [HH15].
Remark 4.13. The Hessian (matrix) gives a local form for critical points; see the Morse
Lemma in §1.7 of [GP74] and §I.2 of [Mil16], with a proof in Lemma 2.2 there. See
also §7.6 of [Mar74]. This is related to what we have seen about the Taylor series.
4.20. Lagrange Multipliers.
Theorem 4.32. Given an open set U ⊆ Rⁿ and two C¹ functions F, G : U → R, let
B be a level set for G, so B = {x ∈ U : G(x) = c}. Assume that for some point
p ∈ U, ∇Gₚ ≠ 0. Then if F has a local maximum at p ∈ B, there exists some λ ∈ R
such that
\[
\nabla F_p = \lambda \nabla G_p.
\]

Proof. Since ∇Gₚ ≠ 0, the linear transformation (the matrix with those entries) DGₚ
is surjective, which allows us to use the Implicit Function Theorem, Theorem 4.25.
So the level set B has a parametrization. Calling one of the coordinates t, we have
a curve γ(t) in B which passes through p at time 0, and such that, given any chosen
vector v ≠ 0 tangent to B at p, γ′(0) = v.
Then G(γ(t)) = c for all t, so D(G ∘ γ)(t) = 0. By the Chain Rule this is
\[
D(G \circ \gamma)(t) = DG_{\gamma(t)}\, D\gamma(t) = \nabla G_{\gamma(t)} \cdot \gamma'(t).
\]
In particular for t = 0, ∇Gₚ · γ′(0) = 0.
On the other hand, F has a local maximum at p, so in particular, F ◦ γ(t) has a
local maximum at t = 0.
Therefore,
\[
D(F \circ \gamma)(0) = DF_p\, D\gamma(0) = \nabla F_p \cdot \gamma'(0) = 0.
\]
Since v = γ′(0) ≠ 0 was an arbitrary tangent vector to B at p, both ∇Fₚ and ∇Gₚ
are orthogonal to every such v. Thus they must be multiples of each other (think for
example of a level curve, or a level surface). □

Remark 4.14. Note that the derivatives are 0 for two completely different reasons:
that G ◦ γ is constant, and that F ◦ γ has a maximum.
Note that it is possible for λ to be 0, and also possible for ∇Fp to be 0. However for
the proof ∇Gp must be nonzero to be able to apply the Implicit Function Theorem.
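The Lagrange condition can be solved symbolically in examples; here is a minimal Python sketch, assuming the SymPy library, for the illustrative choice F(x, y) = xy on the circle G(x, y) = x² + y² = 1:

import sympy as sp

x, y, lam = sp.symbols('x y lam', real=True)
F, G = x * y, x ** 2 + y ** 2
eqs = [sp.diff(F, x) - lam * sp.diff(G, x),   # F_x = λ G_x
       sp.diff(F, y) - lam * sp.diff(G, y),   # F_y = λ G_y
       G - 1]                                 # the constraint G = 1
print(sp.solve(eqs, [x, y, lam]))
# Candidates (±1/√2, ±1/√2) with λ = ±1/2; F is maximal (= 1/2) where xy > 0.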
5. Vector Calculus, Part II: the calculus of fields, curves and
surfaces

5.1. Vector Fields. In Part I we have already encountered the gradient vector
field. Here is the general setting:
Definition 5.1. A continuous vector field is a continuous function V : Rm → Rm .
A linear or a differentiable vector field on Rm is simply a linear or differentiable such
function.
The reason we call this a vector field rather than just a function is because of the
special way in which we visualize this. Note that for m ≥ 2 we cannot draw the graph
of a vector field, as we would need too many dimensions! Indeed the graph of V is (by
definition) the collection of all ordered pairs (v, V(v)), a point in Rᵐ × Rᵐ = R²ᵐ; so
already for R², to draw the graph of a vector field would require four dimensions.
Instead, we picture the vector field by drawing the vector w_v = V(v) based at each
point v; see Fig. 12. We can imagine this field represents the velocity field of a liquid
or gas, showing its motion.
Exercise 5.1. Sketch the following linear vector fields V(x, y) = (ax + by, cx + dy)
given by the matrix $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ acting on column vectors, that is:
\[
A\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} ax + by \\ cx + dy \end{pmatrix},
\]

Figure 12. A vector field in the plane, from Wikipedia. Compare to


the pictures of curves below!

for these matrices:
\[
\text{(i) } A = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}; \quad
\text{(ii) } A = \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}; \quad
\text{(iii) } A = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}; \quad
\text{(iv) } A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}; \quad
\text{(v) } A = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}.
\]
Compare your sketches to the figures of curves.
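A computer sketch can supplement hand drawing here; a minimal Python sketch, assuming NumPy and Matplotlib, shown for matrix (i) (substitute the others to compare):

import numpy as np
import matplotlib.pyplot as plt

A = np.array([[0.0, -1.0], [1.0, 0.0]])  # matrix (i): the rotation field
x, y = np.meshgrid(np.linspace(-2, 2, 15), np.linspace(-2, 2, 15))
u = A[0, 0] * x + A[0, 1] * y            # first component  ax + by
v = A[1, 0] * x + A[1, 1] * y            # second component cx + dy
plt.quiver(x, y, u, v)
plt.gca().set_aspect('equal')
plt.show()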

Remark 5.1. Vector fields, flows, and Ordinary Differential Equations. To
better understand vector fields, we draw their integral curves: an integral curve of V
through p is a curve γ satisfying γ(0) = p and, for all t ∈ R, γ′(t) = V(γ(t)). That is,
the curve is always tangent to the vector field.
Looking at all the curves at once, we see a continuous motion of Rn , called a flow:
by definition, a flow on Rn is a collection of maps τt : Rn → Rn for each t ∈ R, which
satisfy the flow property τt+s = τt ◦ τs . Given some initial point x, a flow determines
the curve γ(t) = τt (x), whose image is called the orbit of the point x.
An example is the rotation flow of the plane, defined by R_t(x, y) = (x̂, ŷ), where the
vector has been rotated counterclockwise by angle t. This is given by the rotation
matrix:
\[
R_t(v) = \begin{pmatrix} \cos t & -\sin t \\ \sin t & \cos t \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} \hat x \\ \hat y \end{pmatrix}.
\]
See Fig. 14. Applying R_t represents flowing for the time t.

Figure 13. A time-varying velocity vector field: the wind at the sur-
face of the Earth, from nullschool.net

In fact, given a C² vector field, we can always find the corresponding flow. This
is the content of the Fundamental Theorem of Ordinary Differential Equations: in
Rⁿ, any C² vector field is tangent to a unique family of curves, meaning that there
exists a unique curve γ through each point p tangent to the vector field V : Rⁿ → Rⁿ;
furthermore, these can be put together as a flow. Conversely, any such family of

Figure 14. Level curves of the function F (x, y) = x2 + y 2 , tangent to


the velocity vector field of the rotation flow.

Figure 15. Two hyperbolic flows.

curves can be differentiated (by finding the tangent vector at each point) to give the
vector field. Finding the curves from the vector field is called integration, and the
curves are called integral curves of the vector field. The equation

\[
\gamma'(t) = V(\gamma(t)) \qquad\text{with}\qquad \gamma(0) = p
\]

is called a (vector) differential equation with initial condition p. When written in


coordinates, this gives (equivalently) a system of n differential equations. A curve
satisfying this is an integral curve of the vector field, and is also called a solution
of the differential equation with that initial condition. So to solve the differential
equation means to find the curve!
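Numerically, solving the differential equation (finding the integral curve) is routine; here is a minimal Python sketch, assuming SciPy, for the rotation field V(x, y) = (−y, x), whose integral curve through (1, 0) is the unit circle γ(t) = (cos t, sin t):

import numpy as np
from scipy.integrate import solve_ivp

V = lambda t, z: [-z[1], z[0]]   # the vector field, as a function of (time, position)
sol = solve_ivp(V, (0.0, 2 * np.pi), [1.0, 0.0],
                dense_output=True, rtol=1e-9, atol=1e-12)

t = np.pi / 3
assert np.allclose(sol.sol(t), [np.cos(t), np.sin(t)], atol=1e-6)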

Figure 16. Integral curves of some linear vector fields.

Remark 5.2. In the above situation, we call the vector field a velocity vector field, as
its value is the tangent vector (velocity) v of a solution curve γ(t), since V(γ(t)) =
γ′(t) = v(t).
Now instead of a velocity field, vector fields can also depict a force field, such as
gravity or an electric field.
By a force field we mean the following. The differential equation is now a second-order
vector ODE, as it involves the second derivative: for example, F(γ(t)) = mγ″(t),
which expresses Newton's Law F = ma, where γ(t) is the position, γ′(t) = v(t) the
velocity, a(t) = γ″(t) the acceleration, and m > 0 is the mass of the object. Here we
need two initial conditions: initial position γ(0) and initial velocity γ′(0) = v(0); our
Fundamental Theorem then guarantees that we will again have a unique solution.

Remark 5.3. (On the interpretation of vectors and of vector fields)


There are various possible interpretations of vectors. The two most important are
as movement (a translation of position of a particle) and force (applied to a particle);
it is important not to confuse them, as these are completely different! For a simple
example, consider the six vectors a, b, c, d, e, f in the plane defining the vertices of a
regular hexagon, so d = −a and so on. We prove that a + b + c + d + e + f = 0
in two different ways, using these interpretations. First, movement: we consider the
path 0, a, a + b, . . . , a + b + c + d + e + f ; this walks along a translated hexagon and
returns us to 0, so that is the total motion. Second, each vector represents a force
applied to a particle located at 0. Now they cancel pairwise, giving 0 as the resultant
force.
The same two interpretations arise for a vector field F . Now the movement inter-
pretation is that the field is tangent to the flow of a fluid, and a particle (perhaps an
ant on a leaf!) is being carried along the flow lines. The second interpretation is that
a particle is moving according to Newton's law F = ma in this force field.

Further possible interpretations are for example that F represents a magnetic field,
or an area element of a surface as the covector for a two-form. But the first two are
certainly the most common and important for our intuition.
It is possible that the force on the object also depends on its velocity; in that case,
this is given by a field F where mγ″(t) = F(γ(t), γ′(t)). This is the case for a
charged particle moving in a magnetic field.
The definition of a second-order vector DE in Rⁿ is just that: we are given F, which
is C¹, and require
\[
\gamma''(t) = F(\gamma(t), \gamma'(t)). \tag{14}
\]
In the time-varying case it would be
\[
\gamma''(t) = F(t, \gamma(t), \gamma'(t)). \tag{15}
\]
In the first case, F only depends on position, so it is a vector field on U ⊆ Rⁿ. In
the last case it is defined on R¹⁺²ⁿ, with values in Rⁿ.
In fact, all higher-order vector DEs can be converted into first-order vector DEs;
if the order is k, we need k times the dimension. Thus for a second-order vector DE
in Rⁿ, to write it as a first-order system we simply include the vector γ′ as a new
variable, giving a new solution curve η = (γ, γ′) in dimension 2n. Furthermore,
time-dependent vector fields, the so-called nonstationary or nonautonomous DEs,
can be seen in this context by adding one more parameter (time); see Fig. 13.
Thus all DEs can be interpreted geometrically, as finding integral curves (and flows)
of a velocity vector field.
The electrostatic fields in Figs. 25, 31, 26, are gradient vector fields: depicted are
two families of curves, orthogonal to each other; the electrostatic field is tangent to the
lines of flux between the charges. (Even though they are actually force, not velocity
fields, it is useful to picture them as velocity fields). The curves going around the
charges are the level curves of the electrostatic potential function Φ. Thus F = ∇Φ
is the electrostatic field. Not all vector fields are gradient fields for some potential;
below we find conditions such that this important property holds.

5.2. The line integral. Given a vector field F on Rⁿ and a curve γ : [a, b] → Rⁿ,
the line integral of F along γ is
\[
\int_\gamma F \cdot d\gamma \equiv \int_a^b F(\gamma(t)) \cdot \gamma'(t)\,dt.
\]
A line integral gives a weight at each point of the curve which depends not only
on the location γ(t) but also on the direction, γ 0 (t) with respect to F (γ(t)): if these
two vectors are aligned it gets a positive weight, if opposed it is negative, and if
perpendicular it is zero. If for example F gives a force field, then the dot product
measures the amount of work needed to move in that direction. Thus an ice skater
glides on the ice doing no work, because the plane of the frozen lake is perpendicular
to the direction of gravity.
The line integral can also be interpreted as the integral along the curve of a one-form,
the one-form dual to the vector field, just as the dual space V* is dual to V. We
return to this below.
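Line integrals are easy to approximate numerically from the definition; here is a minimal Python sketch, assuming NumPy, for the field F(x, y) = (−y, x) around the unit circle, where the exact value is 2π:

import numpy as np

t = np.linspace(0.0, 2 * np.pi, 20001)
gamma = np.stack([np.cos(t), np.sin(t)], axis=1)    # γ(t) on the unit circle
dgamma = np.stack([-np.sin(t), np.cos(t)], axis=1)  # γ'(t)
F = np.stack([-gamma[:, 1], gamma[:, 0]], axis=1)   # F(γ(t)) = (−y, x)

integrand = np.sum(F * dgamma, axis=1)              # F(γ(t))·γ'(t)
integral = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(t))  # trapezoid rule
assert np.isclose(integral, 2 * np.pi)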

Figure 17. Equipotential curves and lines of force for the electrostatic
field of two like charges in the plane. For a gravitational potential
this would be a topographic map showing either two mountains or two
valleys. Note that there is a saddle (hyperbolic) point between the two.

Figure 18. Equipotential curves and lines of force for the electrostatic
field of two like charges in the plane, showing a closeup view of the
saddle point in the center. Note that this approximates the dual
hyperbolas of Fig. 21.

Given a curve γ₁ : [c, d] → Rⁿ, by a reparametrization γ₂ of the curve we mean
the following: we have an invertible differentiable function h : [a, b] → [c, d] such that
γ₂ = γ₁ ∘ h : [a, b] → Rⁿ. Note that γ₁ and γ₂ have the same image, and that the
tangent vectors are multiples: γ₂′(t) = (γ₁ ∘ h)′(t) = γ₁′(h(t))h′(t). We call this a
positive or orientation-preserving parameter change if h′(t) > 0, negative or
orientation-reversing if h′(t) < 0.
Proposition 5.1.
(i) The line integral is unchanged under an orientation-preserving reparametrization;
that is,
\[
\int_{\gamma_1} F \cdot d\gamma_1 = \int_{\gamma_2} F \cdot d\gamma_2.
\]
(ii) For an orientation-reversing reparametrization, the sign changes.
Proof. (i) Writing u = h(t), we have γ₂(t) = γ₁(h(t)) = γ₁(u). Since du = h′(t)dt,
then using the Chain Rule:
\[
\int_{\gamma_2} F \cdot d\gamma_2 \equiv \int_{t=a}^{t=b} F(\gamma_2(t)) \cdot \gamma_2'(t)\,dt
= \int_{t=a}^{t=b} F(\gamma_1(h(t))) \cdot \gamma_1'(h(t))\,h'(t)\,dt
= \int_{u=c}^{u=d} F(\gamma_1(u)) \cdot \gamma_1'(u)\,du = \int_{\gamma_1} F \cdot d\gamma_1. \tag{16}
\]
(ii) For h′ < 0, then h(a) = d, h(b) = c. The calculation is the same, with that
change of the limits of integration:
\[
\int_{\gamma_2} F \cdot d\gamma_2 = \int_{u=d}^{u=c} F(\gamma_1(u)) \cdot \gamma_1'(u)\,du
= -\int_{u=c}^{u=d} F(\gamma_1(u)) \cdot \gamma_1'(u)\,du = -\int_{\gamma_1} F \cdot d\gamma_1. \tag{17}
\]
□

Corollary 5.2. If γ : [a, b] → Rⁿ is a path, then writing γ̃ for the orientation-reversed
path, we have
\[
\int_{\widetilde\gamma} F \cdot d\widetilde\gamma = -\int_{\gamma} F \cdot d\gamma.
\]
Proof. Define h : [a, b] → [a, b] by h(a) = b, h(b) = a, interpolated linearly; thus
h(t) = −t + (a + b). Then γ̃(t) ≡ γ ∘ h(t), and the claim follows from the
Proposition. □

There is a second notion of integral along a curve, where we integrate a function
rather than a vector field, so there is no dot product:
Definition 5.2. Given f : Rⁿ → R, the line integral of second type of f along γ is
\[
\int_\gamma f\,ds \equiv \int_a^b f(\gamma(t))\,\|\gamma'(t)\|\,dt.
\]
Taking the special case of f ≡ 1, we define the arc length of γ to be:
\[
\int_\gamma ds = \int_a^b \|\gamma'(t)\|\,dt.
\]

We have already seen this special case above in Part I. (So there is some overlap
here with our earlier discussion!)
For an example we already know from first-semester Calculus, consider a function
g : [a, b] → R and its graph {(x, g(x)) : a ≤ x ≤ b}. We know from Calculus that
the arc length of this graph is
\[
\int_a^b \sqrt{1 + (g'(x))^2}\,dx.
\]
We claim that the new formula includes this one: parametrize the graph as a curve
in the plane, γ(t) = (t, g(t)). Then γ′(t) = (1, g′(t)), so ‖γ′(t)‖ = √(1 + (g′(t))²),
whence indeed ∫_γ ds = ∫_a^b √(1 + (g′(t))²) dt as claimed.
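Arc length is just as easy to approximate; here is a minimal Python sketch, assuming NumPy, for the graph of g(x) = x² over [0, 1], compared with the closed form of ∫₀¹ √(1 + 4t²) dt:

import numpy as np

t = np.linspace(0.0, 1.0, 10001)
speed = np.sqrt(1.0 + 4.0 * t ** 2)   # ||γ'(t)|| for γ(t) = (t, t^2)
length = np.sum(0.5 * (speed[1:] + speed[:-1]) * np.diff(t))  # trapezoid rule
exact = (2 * np.sqrt(5) + np.arcsinh(2.0)) / 4                # the antiderivative evaluated
assert np.isclose(length, exact, atol=1e-6)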
Proposition 5.3. The line integral of second type of a function along a curve gives
the same value for any change of parametrization, independent of orientation; that is,
\[
\int_{\gamma_1} f\,ds = \int_{\gamma_2} f\,ds.
\]

Proof. Write u = h(t), so γ₂(t) = γ₁(h(t)) and ‖γ₂′(t)‖ = ‖γ₁′(h(t))‖·|h′(t)|.
Assuming first that h′ > 0, the substitution du = h′(t)dt gives:
\[
\int_{\gamma_2} f\,ds \equiv \int_{t=a}^{t=b} f(\gamma_2(t))\,\|\gamma_2'(t)\|\,dt
= \int_{t=a}^{t=b} f(\gamma_1(h(t)))\,\|\gamma_1'(h(t))\|\,h'(t)\,dt
= \int_{u=c}^{u=d} f(\gamma_1(u))\,\|\gamma_1'(u)\|\,du = \int_{\gamma_1} f\,ds. \tag{18}
\]
If instead h′ < 0, then h(a) = d, h(b) = c, and now ‖γ₂′(t)‖ = −‖γ₁′(h(t))‖ h′(t), so
\[
\int_{\gamma_2} f\,ds = -\int_{t=a}^{t=b} f(\gamma_1(h(t)))\,\|\gamma_1'(h(t))\|\,h'(t)\,dt
= -\int_{u=d}^{u=c} f(\gamma_1(u))\,\|\gamma_1'(u)\|\,du
= \int_{u=c}^{u=d} f(\gamma_1(u))\,\|\gamma_1'(u)\|\,du = \int_{\gamma_1} f\,ds. \tag{19}
\]
□
t=a γ1


We next see how this can be used to give a unit-speed parametrization of a curve
γ : [a, b] → Rⁿ. Set
\[
l(t) = \int_a^t \|\gamma'(r)\|\,dr,
\]
so l(t) is the arc length of γ from time a to time t. Note that l′(t) = ‖γ′(t)‖;
therefore, if ‖γ′(t)‖ > 0 for all t, then l is invertible. Our parameter change will be
given by h = l⁻¹, the inverse function.
Proposition 5.4. Assume that ‖γ′(t)‖ > 0 for all t. Then the reparametrized curve
γ̂ = γ ∘ h has speed one.
Proof. Since 1 = (l ∘ h)′(s) = l′(h(s)) h′(s), we get
\[
\|\hat\gamma'(s)\| = \|\gamma'(h(s))\|\,h'(s) = l'(h(s))\,h'(s) = 1. \qquad\square
\]
The function l maps [a, b] to [0, l(γ)], where l(γ) denotes the total arc length, whence
the parameter-change function h maps [0, l(γ)] to [a, b]. We keep t for the variable
in [a, b] and define s = l(t), the arc length up to time t; so now s is the variable in
[0, l(γ)] and h(s) = t. The change of parameter gives γ̂(s) = (γ ∘ h)(s) = γ(h(s)) = γ(t).
This indeed parametrizes the curve γ̂ by arc length s.
Note further that
\[
\int_\gamma f\,ds \equiv \int_a^b f(\gamma(t))\,\|\gamma'(t)\|\,dt
= \int_0^{l(b)} f(\hat\gamma(s))\,\|\hat\gamma'(s)\|\,ds \equiv \int_{\hat\gamma} f\,ds.
\]
From s = l(t) we have ds = l′(t)dt = ‖γ′(t)‖dt. Now we understand rigorously what
ds is: it represents the infinitesimal arc length; this helps explain the notation for
this type of integral.
Level curves and parametrized curves.
There are two very distinct types of curves we encounter in Vector Calculus: the
curves of this section, and the level curves of a function. Next we describe a link
between the two:
Proposition 5.5. Let G : R² → R be differentiable and suppose γ : [a, b] → R² is a
curve which stays in a level curve of G of level c. Then γ′(t) is perpendicular to the
gradient of G.
Proof. We have G(γ(t)) = c for all t. Then by the Chain Rule, D(G ∘ γ)(t) =
DG_{γ(t)} Dγ(t). The derivatives here are matrices, with DG a (1 × 2) matrix (a
row vector) and Dγ a column vector; in vector notation, these are the gradient and
tangent vector, so this reads
\[
0 = \tfrac{d}{dt}\, c = (G \circ \gamma)'(t) = (\nabla G)(\gamma(t)) \cdot \gamma'(t). \qquad\square
\]
Corollary 5.6. If γ is a curve with ‖γ′(t)‖ = c constant, then γ′ ⊥ γ″. (Apply
Proposition 5.5 to the curve γ′, which stays in a level curve of G(x) = ‖x‖².)
Here is a second, direct proof; see also Corollary 4.6 above:
Proposition 5.7. For a unit-speed curve γ, we always have γ′ ⊥ γ″.
Proof. Since 1 = γ′ · γ′, by Leibnitz' Rule
\[
0 = (\gamma' \cdot \gamma')' = 2(\gamma' \cdot \gamma'').
\]

This fact allows us to make the following

Definition 5.3. The curvature of a twice differentiable curve γ in Rⁿ is defined as
follows. For its unit-speed parametrization γ̂(s) we define the curvature at arc length s
to be κ̂(s) = ‖γ̂″(s)‖; for γ itself the curvature at time t is κ(t) = (κ̂ ∘ l)(t).
For example, the curve γᵣ(t) = r(cos(t/r), sin(t/r)) has velocity γᵣ′(t) = (−sin(t/r), cos(t/r)),
which has norm one; the acceleration is γᵣ″(t) = −(1/r)(cos(t/r), sin(t/r)) = −(1/r²)γᵣ(t),
with norm 1/r. The curvature is therefore 1/r. So if the radius of the next curve on
the race track is half as much, you will feel twice the force, since by Newton's law,
F = ma! This is the physical (and geometric) meaning of the curvature. In differential
geometry see p. 59 of [O'N06]; for how curvature can be defined for surfaces and
manifolds, see e.g. [DC16].
We have seen how a level curve F = c can (sometimes) be filled in by a parametrized
curve γ(t).
This is for f : R² → R. For functions on R³ the notion of level curve is replaced
by that of level surface. These can also (sometimes) be parametrized; the exact
conditions which permit this are given by the Implicit Function Theorem, see §4.16
and vector calculus texts.

5.3. Conservative vector fields.


Definition 5.4. A subset V of Rⁿ is pathwise connected iff given two points
A, B in V, there exists a continuous path γ : [a, b] → V such that γ(a) = A, γ(b) = B.
(This definition makes sense not just for Rⁿ but for any metric space, indeed any
topological space.) A related definition is this: a set is connected iff it is not the union
of two disjoint nonempty open subsets. It is immediate that pathwise connected
implies connected.
By a region in Rn we shall mean a pathwise connected open set. A vector field F
on a region Ω ⊆ Rn is conservative iff there exists ϕ : Ω → R such that the gradient
∇ϕ = F . Such a function is called a potential for F .
Lemma 5.8. If Ω is connected and ϕ, ψ are two potentials for F then they differ by
a constant.
Proof.
\[
\frac{\partial\varphi}{\partial x} = \frac{\partial\psi}{\partial x} \implies \varphi(x, y) = \psi(x, y) + c(y); \qquad
\frac{\partial\varphi}{\partial y} = \frac{\partial\psi}{\partial y} \implies \varphi(x, y) = \psi(x, y) + d(x).
\]
Subtracting, c(y) = d(x), so this is locally a constant, hence by connectedness is
constant. □
Proposition 5.9. If F is conservative with potential ϕ, and γ : [a, b] → Ω with
A = γ(a), B = γ(b), then
\[
\int_\gamma F \cdot d\gamma = \varphi(B) - \varphi(A).
\]

Proof.
\[
\int_\gamma F \cdot d\gamma \equiv \int_a^b F(\gamma(t)) \cdot \gamma'(t)\,dt,
\]

and F(γ(t)) = ∇ϕ(γ(t)), so
\[
F(\gamma(t)) \cdot \gamma'(t) = \nabla\varphi(\gamma(t)) \cdot \gamma'(t) = (\varphi \circ \gamma)'(t);
\]
thus
\[
\int_\gamma F \cdot d\gamma = \int_a^b (\varphi \circ \gamma)'(t)\,dt = \varphi \circ \gamma(t)\Big|_a^b
= \varphi(\gamma(b)) - \varphi(\gamma(a)) = \varphi(B) - \varphi(A). \qquad\square
\]

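Proposition 5.9 is easy to test numerically: take a potential, compute its gradient field, and integrate along any path. Here is a minimal Python sketch, assuming NumPy; the potential and the wiggly path below are chosen only for illustration:

import numpy as np

phi = lambda p: p[0] ** 2 * p[1]                        # potential ϕ(x,y) = x^2 y
F = lambda p: np.array([2 * p[0] * p[1], p[0] ** 2])    # F = ∇ϕ = (2xy, x^2)

t = np.linspace(0.0, 1.0, 20001)
gamma = np.stack([np.cos(3 * t), t ** 2], axis=1)       # some path γ(t)
dgamma = np.stack([-3 * np.sin(3 * t), 2 * t], axis=1)  # its tangent γ'(t)

integrand = np.array([F(p) @ d for p, d in zip(gamma, dgamma)])
integral = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(t))  # trapezoid rule
assert np.isclose(integral, phi(gamma[-1]) - phi(gamma[0]), atol=1e-6)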
Remark 5.4. This says that for conservative vector fields, we can find a potential
and then evaluate a line integral in a very simple way, just as in one dimension with
the Fundamental Theorem of Calculus. Both of these are special cases of Stokes'
Theorem; see below.
Next we review some equivalent conditions for F to be conservative.
Proposition 5.10. The following are equivalent, for a vector field F on a pathwise
connected domain Ω ⊆ Rⁿ:
(i) F is conservative, i.e. there exists a potential function for F, that is, ϕ : Ω → R
such that ∇ϕ = F.
(ii) The line integral is path-independent.
(iii) For γ a piecewise C¹ path which is closed, i.e. γ(a) = γ(b), the line integral is 0.
Proof. (i) =⇒ (ii): From Proposition 5.9,
\[
\int_\gamma F \cdot d\gamma = \varphi(B) - \varphi(A);
\]
thus this value only depends on ϕ(A) and ϕ(B), not on the path taken to get there.
Hence if there are two paths γ₁, γ₂ with the same initial and final points A, B, then
\[
\int_{\gamma_1} F \cdot d\gamma_1 = \int_{\gamma_2} F \cdot d\gamma_2.
\]
(ii) =⇒ (iii): If γ is a closed path, then γ(a) = A = γ(b) = B. Define a second
path η with the same initial and final points A = B but with η(t) = A for all t. Then
η′(t) = 0, so ∫_η F · dη = 0, whence by (ii) also ∫_γ F · dγ = 0.
Another proof is the following. Given a closed path γ, we choose some c ∈ [a, b]
and define C = γ(c). Write γ₁ for the path γ restricted to [a, c] and γ₂ for γ restricted
to [c, b]. Then γ₁ and the time-reversed path γ̃₂ have the same initial and final
points, so by (ii)
\[
\int_{\gamma_1} F \cdot d\gamma_1 = \int_{\widetilde\gamma_2} F \cdot d\widetilde\gamma_2.
\]
Therefore
\[
\int_\gamma F \cdot d\gamma = \int_a^b F(\gamma(t)) \cdot \gamma'(t)\,dt
= \int_a^c F(\gamma(t)) \cdot \gamma'(t)\,dt + \int_c^b F(\gamma(t)) \cdot \gamma'(t)\,dt
= \int_{\gamma_1} F \cdot d\gamma_1 + \int_{\gamma_2} F \cdot d\gamma_2
= \int_{\gamma_1} F \cdot d\gamma_1 - \int_{\widetilde\gamma_2} F \cdot d\widetilde\gamma_2 = 0.
\]

(iii) =⇒ (ii): We essentially reverse this last argument. We are given that the
integral over a closed path is 0. If there are two paths γ₁, γ₂ with the same initial
and final points A, B, we are to show that ∫_{γ₁} F · dγ₁ = ∫_{γ₂} F · dγ₂.
As above, we write γ̃₂ for the time-reversed path. Then γ = γ₁ + γ̃₂ is a closed
loop, so
\[
0 = \int_\gamma F \cdot d\gamma = \int_{\gamma_1} F \cdot d\gamma_1 + \int_{\widetilde\gamma_2} F \cdot d\widetilde\gamma_2
= \int_{\gamma_1} F \cdot d\gamma_1 - \int_{\gamma_2} F \cdot d\gamma_2.
\]

(ii) =⇒ (i): We define a function ϕ by fixing some point A and choosing ϕ(A)
arbitrarily. Then we define the other values as follows. Letting B ∈ Ω, since the
region is path-connected there exists a piecewise C¹ path γ : [a, b] → Ω with A =
γ(a), B = γ(b). We set
\[
\varphi(B) = \varphi(A) + \int_\gamma F \cdot d\gamma.
\]
By (ii), this is well-defined as it does not depend on the path.
We claim that ∇ϕ = (ϕₓ, ϕ_y) = F = (F₁, F₂), showing the calculation for the case
of a vector field on R². We compute ∂ϕ/∂x at the point B = (B₀, B₁) and shall show
that
\[
\frac{\partial\varphi}{\partial x}\Big|_B = F_1(B).
\]
Defining a path η by η(t) = B + t e₁, so η(0) = B, we extend the path γ by attaching
η to its end: that is, we define, for t ≥ b, γ(t) = η(t − b). We still have, for C = γ(c)
with c > b,
\[
\varphi(C) = \varphi(A) + \int_a^c F(\gamma(t)) \cdot \gamma'(t)\,dt
\]
by path-independence (ii). This equals
\[
\varphi(A) + \int_a^b F(\gamma(t)) \cdot \gamma'(t)\,dt + \int_b^c F(\gamma(t)) \cdot \gamma'(t)\,dt
= \varphi(B) + \int_0^{c-b} F(\eta(t)) \cdot \eta'(t)\,dt.
\]
By definition,
\[
\frac{\partial\varphi}{\partial x}\Big|_B = \lim_{h\to 0}\frac{1}{h}\bigl(\varphi(\eta(h)) - \varphi(\eta(0))\bigr)
= \lim_{h\to 0}\frac{1}{h}\int_0^h F(\eta(t)) \cdot \eta'(t)\,dt.
\]
Now η′(t) = (1, 0), so this equals
\[
\lim_{h\to 0}\frac{1}{h}\int_0^h F(\eta(t)) \cdot (1, 0)\,dt
= \lim_{h\to 0}\frac{1}{h}\int_0^h F_1(B_0 + t, B_1)\,dt = F_1(B_0, B_1) = F_1(B),
\]
since the component F₁ is assumed to be continuous.


This shows that ∂ϕ/∂x|_B = F₁(B); similarly for the other partials, so ∇ϕ = F.
The same argument works in any dimension, proving the theorem. □

Next we explain where the term “conservative” comes from: from the conservation
of energy in mechanics!
Suppose we have an object (a point mass) and a vector field F of forces acting
on this object. This will move according to Newton’s law F = ma; here F and also
the acceleration a are vector quantities, while the mass m is a positive scalar. If the

position of the object in time is given by the curve γ(t), then we write v(t) = γ′(t)
for the velocity and a(t) = v′(t) = γ″(t) for the acceleration. So Newton's law states
F(γ(t)) = ma(t) = mγ″(t).
Definition 5.5. Work is defined in mechanics to be (force) · (distance). This means
that the work done by moving a particle against a force is given by that expression.
The continuous-time version of this is given by a line integral.
Precisely, we define the work done by moving a particle along a path (a curve) γ
in a force field F to be ∫_γ F · dγ.
The kinetic energy of the particle is ½ m‖v‖².
Proposition 5.11. The work done by moving along the path γ in a force field F
from time a to time b is the difference in kinetic energies, Ekin (b) − Ekin (a).
Proof. The work done by moving along the path γ from time a to time b is
$$\int_\gamma F \cdot d\gamma = \int_a^b F(\gamma(t)) \cdot \gamma'(t)\,dt = m\int_a^b \gamma''(t) \cdot \gamma'(t)\,dt.$$
Now by Leibnitz' Rule,
$$\gamma''(t) \cdot \gamma'(t) = \frac{1}{2}\big(\gamma'(t) \cdot \gamma'(t)\big)' = \frac{1}{2}\frac{d}{dt}\|v\|^2(t),$$
so our integral is
$$\frac{1}{2}m\int_a^b \frac{d}{dt}\|v(t)\|^2\,dt = \frac{1}{2}m\|v(t)\|^2\Big|_a^b = \frac{1}{2}m\|v(b)\|^2 - \frac{1}{2}m\|v(a)\|^2 = E_{kin}(b) - E_{kin}(a). \qquad\square$$
This is valid for any force field, conservative or not.


Definition 5.6. Given a conservative vector field F , so with potential function ϕ,
we define the potential energy of F to be Epot = −ϕ.
Note that the potential energy function of physics has the opposite sign from the
potential function used in mathematics, whose gradient gives the field.
The total energy of a particle moving in a force field is the sum of the potential and
kinetic energies, Etot = Epot + Ekin . Note that the potential energy at time a depends
only on the position A = γ(a), so we write this as Epot (A), while the kinetic energy
depends on time and the path, so we write this as Ekin (a), as for the total energy
Etot (a).
Proposition 5.12. In a conservative force field F , the work done by moving along
the path γ from time a to time b is ϕ(B) − ϕ(A) = Epot (A) − Epot (B).
Proof. This is just Proposition 5.9 restated in the context of mechanics. 
Theorem 5.13. If a particle moves according to Newton’s law F = ma in a conser-
vative force field, then the total energy is preserved: Etot (a) = Etot (b).
Proof. We have shown in Proposition 5.11 that the work done (in any field) is
$$\int_\gamma F \cdot d\gamma = E_{kin}(b) - E_{kin}(a).$$
But in a conservative field, we also have a second expression for this: the work done is
$$\int_\gamma F \cdot d\gamma = \varphi(B) - \varphi(A) = E_{pot}(A) - E_{pot}(B).$$
Thus
$$E_{kin}(b) - E_{kin}(a) = E_{pot}(A) - E_{pot}(B),$$
so
$$E_{tot}(a) = E_{kin}(a) + E_{pot}(A) = E_{kin}(b) + E_{pot}(B) = E_{tot}(b). \qquad\square$$
Remark 5.5. Note that we calculated the line integral $\int_a^b F(\gamma(t)) \cdot \gamma'(t)\,dt$ in two different ways, in Proposition 5.9 and Proposition 5.11. For the first we used the existence of a potential to rewrite F(γ(t)) as ∇ϕ(γ(t)) and used the Chain Rule; for the second we used Newton's Law to rewrite F as ma = mγ″ and applied Leibnitz' Rule.
It is interesting that these same two very different techniques were applied to give two different proofs of Corollary 5.6 above.
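To see Theorem 5.13 concretely, here is a minimal numerical sketch (the potential, mass, step size and initial data are hypothetical choices, not taken from the text): a particle moving by Newton's law in the conservative field F = ∇ϕ with ϕ(x, y) = −(x² + y²)/2, so that E_pot = −ϕ = (x² + y²)/2. Integrating the motion with the standard velocity-Verlet scheme, the total energy stays (nearly) constant along the computed orbit.

```python
import numpy as np

m = 1.0
def F(p):                       # F = grad(phi) for phi = -(x^2+y^2)/2
    return -p                   # so E_pot = -phi = (x^2+y^2)/2

def E_tot(p, v):                # kinetic plus potential energy
    return 0.5*m*np.dot(v, v) + 0.5*np.dot(p, p)

p, v = np.array([1.0, 0.0]), np.array([0.0, 1.0])
dt, steps = 1e-3, 10000
E0 = E_tot(p, v)
for _ in range(steps):          # velocity-Verlet (leapfrog) integration
    v_half = v + 0.5*dt*F(p)/m
    p = p + dt*v_half
    v = v_half + 0.5*dt*F(p)/m
print(E0, E_tot(p, v))          # total energy conserved up to O(dt^2)
```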

The curl of a vector field; conservative vector fields.



Definition 5.7. The curl of a vector field F = (P, Q) on R² is $\mathrm{curl}(F) = (\frac{\partial}{\partial x}Q - \frac{\partial}{\partial y}P)\mathbf{k}$; this is in R³ not R², but we will soon see the reason for this convention. The curl of a vector field F = (P, Q, R) on R³ is
$$\mathrm{curl}(F) = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ P & Q & R \end{vmatrix} = \begin{vmatrix} \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ Q & R \end{vmatrix}\mathbf{i} - \begin{vmatrix} \frac{\partial}{\partial x} & \frac{\partial}{\partial z} \\ P & R \end{vmatrix}\mathbf{j} + \begin{vmatrix} \frac{\partial}{\partial x} & \frac{\partial}{\partial y} \\ P & Q \end{vmatrix}\mathbf{k} = \big(R_y - Q_z,\; P_z - R_x,\; Q_x - P_y\big).$$
This can also be written as a vector product, since
$$v \wedge w = v \times w = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{vmatrix},$$
see Part I of these Notes. So one writes
$$\mathrm{curl}(F) = \Big(\frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z}\Big) \wedge (P, Q, R),$$
which is often abbreviated as
$$\mathrm{curl}(F) = \nabla \wedge F = \nabla \times F.$$
Note that to define the curl of a vector field in R², we have to understand that R² is identified with the x-y plane embedded in R³, with the curl a vector in R³ which is perpendicular to this embedded plane.
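As a quick computational sketch of Definition 5.7 (and of the fact, noted below, that curl(∇ϕ) = 0), one can let sympy carry out the symbolic determinant; the potential ϕ here is an arbitrary hypothetical example.

```python
import sympy as sp

x, y, z = sp.symbols('x y z')

def curl(P, Q, R):
    # (R_y - Q_z, P_z - R_x, Q_x - P_y), as in Definition 5.7
    return (sp.diff(R, y) - sp.diff(Q, z),
            sp.diff(P, z) - sp.diff(R, x),
            sp.diff(Q, x) - sp.diff(P, y))

phi = x**2 * sp.sin(y) * z                   # hypothetical potential
P, Q, R = sp.diff(phi, x), sp.diff(phi, y), sp.diff(phi, z)
print([sp.simplify(c) for c in curl(P, Q, R)])   # [0, 0, 0]
```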
Remark 5.6. Note that these formulas represent the determinant of a matrix of sym-
bols rather than numbers, so only make sense as formulas. Nevertheless some of the
properties carry over from the usual situation of a matrix of numbers. For exam-
ple, multilinearity of the determinant or linearity of the vector product is reflected
in linearity of the curl: given two vector fields on R3 , F, G then curl(αF + βG) =
α curl(F ) + β curl(G).
The formulas for R² and R³ are connected. To understand this, take the fields F = (P, Q) and $\widehat F = (\widehat P, \widehat Q, \widehat R)$ with $\widehat R \equiv 0$ and with $\widehat P(x, y, z) = P(x, y)$ and $\widehat Q(x, y, z) = Q(x, y)$. Then $\widehat R_y = \widehat R_x = \widehat P_z = \widehat Q_z = 0$, so
$$\mathrm{curl}(\widehat F) = \big(\widehat R_y - \widehat Q_z,\; \widehat P_z - \widehat R_x,\; \widehat Q_x - \widehat P_y\big) = \big(0,\, 0,\, \widehat Q_x - \widehat P_y\big) = (Q_x - P_y)\mathbf{k}.$$
In other words, curl($\widehat F$) = curl(F) in this case.
Proposition 5.14. If a field F on R2 is conservative, then the curl is 0.
Proof. This follows immediately from the equality of mixed partials, Lemma 4.28. 
Remark 5.7. The proposition says: curl(gradϕ) = 0, that is,
∇ ∧ (∇ϕ) = ∇ × (∇ϕ) = 0.
In fact, the curl in R3 can be understood with the help of that in R2 : if Fb is 0 in
some other direction v (replacing the direction k), then the curl is a multiple of v,
and is equal to the curl on the plane perpendicular to v.
This will always be the case for a linear vector field, because we can rotate the field
so that v now lines up with k and we are in the previous situation. If Fb is not linear,
we define:
Definition 5.8. The linearization of F at p is the linear vector field defined by the
derivative matrix F ∗ = DFp .
As we next show, the curl of F at p is equal to that of its linearization: curl(F)|ₚ = curl(F*)|₀:
Theorem 5.15. Let F = (P, Q, R) be a differentiable vector field on R3 , with deriv-
ative DFp at the point p. Let F ∗ denote the linear vector field defined by the matrix
DFp .
Then curl(F )|p = curl(F ∗ )|0 , which is constant.
The same holds for R2 .
 
Proof. For the case of R², so F = (P, Q), the derivative matrix is
$$DF = \begin{pmatrix} P_x & P_y \\ Q_x & Q_y \end{pmatrix}.$$
The curl is calculated from the off-diagonal entries. So curl(F) and curl(F*) are the same, as they are determined by these entries. More precisely, $DF_p(x, y) = (xP_x + yP_y,\, xQ_x + yQ_y) = (\tilde P, \tilde Q)$, which has curl $((\tilde Q)_x - (\tilde P)_y)\mathbf{k} = (Q_x - P_y)\mathbf{k}$.
For the (3 × 3) case, the derivative of a linear map is constant, so for all x, D(F*)(x) = D(F*)(0) = DFₚ = F*, where
$$F^* \equiv DF_p = \begin{pmatrix} P_x & P_y & P_z \\ Q_x & Q_y & Q_z \\ R_x & R_y & R_z \end{pmatrix}\Bigg|_p.$$
Write the rows as $\tilde P, \tilde Q, \tilde R$. Then the curl of the linear vector field defined by F* is
$$\big(\tilde R_y - \tilde Q_z,\; \tilde P_z - \tilde R_x,\; \tilde Q_x - \tilde P_y\big) = \big(R_y - Q_z,\; P_z - R_x,\; Q_x - P_y\big) = \mathrm{curl}(F)|_p,$$
proving the claim.
We note that since for any chosen p, DFₚ is a linear map, its derivative is constant, equal to that linear map at any point. Thus curl(F*)|_q = curl(F*)|₀ for any q. Another way to say this is that for any linear vector field the curl is the same at all points. $\square$

The curl is a type of derivative, so it makes sense that it can be calculated from the derivative matrix. The geometrical meaning of curl is an infinitesimal rotation: a sphere in R³ rotates about an axis. (To prove this, the Spectral Theorem of Linear Algebra tells us that a rotation, given by an orientation-preserving orthogonal matrix, has an eigenvector; this gives the axis.) The curl measures the infinitesimal rotation of the vector field, and its vector points along that axis, using the right-hand rule to indicate the direction of the vector. Why this is an infinitesimal rotation is explained by the notion of the exponential of a matrix, illustrated in the next example.
See the online text https://activecalculus.org/vector/ for some nice illustrations.
5.4. Rotations and exponentials; angle as a potential. First we consider the linear vector field V on R² defined by $A = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$. We shall explain how this is tangent to the rotation flow
$$R_t = \begin{pmatrix} \cos t & -\sin t \\ \sin t & \cos t \end{pmatrix},$$
see Fig. 19.
The relationship between the matrices A and R_t is simple, beautiful and profound. We extend the definition of eˣ to a square matrix M via the Taylor series
$$\exp(M) = I + M + M^2/2 + \cdots + M^k/k! + \cdots$$
It is not hard to show (using comparison and the matrix norm) that this always converges. In particular, for $A = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$,
$$e^{tA} = \begin{pmatrix} \cos t & -\sin t \\ \sin t & \cos t \end{pmatrix} = R_t$$
gives the rotation flow. To see this, write out the first few terms of the matrix series and use the Taylor series for sin, cos:
$$\sin(t) = t - t^3/3! + t^5/5! - \cdots$$
$$\cos(t) = 1 - t^2/2! + t^4/4! - \cdots$$
Conversely, A is the infinitesimal version of this flow, since $\frac{d}{dt}e^{tA} = Ae^{tA}$ exactly as for real functions; hence at t = 0 this equals A. Thus $\frac{d}{dt}\big|_0 R_t = A$, so A does give the infinitesimal rotation.
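This can be checked numerically; here is a minimal sketch using scipy's matrix exponential (the value t = 0.7 is an arbitrary choice):

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[0.0, -1.0],
              [1.0,  0.0]])
t = 0.7
Rt = np.array([[np.cos(t), -np.sin(t)],
               [np.sin(t),  np.cos(t)]])
print(np.allclose(expm(t*A), Rt))   # True: exp(tA) is the rotation by angle t
```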
Figure 19. Orbits of the rotation flow.

A similar equation holds in R³, which explains why the curl of a vector field does measure the infinitesimal rotation.
This is related to the most basic and most important differential equation, that for exponential growth, $f'(t) = f(t)$, which has as its solution $f(t) = Ke^t$.
The same holds for the vector differential equation $\gamma'(t) = A\gamma(t)$ where $A = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$ and γ = (x, y); that is, in matrix form,
$$\begin{pmatrix} x'(t) \\ y'(t) \end{pmatrix} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} x(t) \\ y(t) \end{pmatrix},$$
which with initial condition (x₀, y₀) has solution
$$\begin{pmatrix} x(t) \\ y(t) \end{pmatrix} = \begin{pmatrix} \cos t & -\sin t \\ \sin t & \cos t \end{pmatrix}\begin{pmatrix} x_0 \\ y_0 \end{pmatrix}.$$
The derivative of the linear map V : R² → R² at a point p is DVₚ = A for all p, since the derivative of a linear map is constant, with value equal to the matrix itself. We claim the field V is not conservative. Now writing V = (P, Q),
$$DV = \begin{pmatrix} P_x & P_y \\ Q_x & Q_y \end{pmatrix} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix},$$
so the curl is Q_x − P_y = 1 − (−1) = 2. Thus by Proposition 5.14, V is not conservative.
For a second proof, we calculate the line integral $\int_\gamma V\cdot d\gamma$ for the curve γ(t) = (cos t, sin t), t ∈ [0, 2π]. This is
$$\int_0^{2\pi} V(\gamma(t)) \cdot \gamma'(t)\,dt = \int_0^{2\pi} (-\sin t, \cos t) \cdot (-\sin t, \cos t)\,dt = 2\pi.$$
But this is a closed loop, hence by (iii) of Proposition 5.10, V is not conservative.
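Both computations are easy to reproduce numerically; here is a sketch approximating the line integral by a Riemann sum, for the rotation field V around the unit circle (giving 2π) and, for contrast, for a conservative field ∇ϕ with the hypothetical potential ϕ(x, y) = x²y (giving 0 around any closed loop):

```python
import numpy as np

def line_integral(F, gamma, a, b, n=100000):
    # midpoint Riemann sum for the integral of F(gamma(t)).gamma'(t) dt
    t = np.linspace(a, b, n+1)
    mid = (t[:-1] + t[1:]) / 2
    dt = t[1] - t[0]
    vel = (gamma(t[1:]) - gamma(t[:-1])) / dt   # finite-difference velocity
    return np.sum(F(*gamma(mid)) * vel) * dt

circle = lambda t: np.array([np.cos(t), np.sin(t)])
V = lambda x, y: np.array([-y, x])                 # the rotation field
gradphi = lambda x, y: np.array([2*x*y, x**2])     # gradient of phi = x^2 y

print(line_integral(V, circle, 0, 2*np.pi))        # approx 2*pi
print(line_integral(gradphi, circle, 0, 2*np.pi))  # approx 0
```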
Next we modify V to a nonlinear vector field F, defined everywhere on the plane except at 0. Thus on the open set U = R² \ {(0, 0)} we define
$$F = (P, Q) = \Big(\frac{-y}{x^2+y^2},\; \frac{x}{x^2+y^2}\Big).$$
Exercise 5.2. What is ||F (v)|| for v = (x, y) in terms of r = ||v||? Calculate the
derivative, DF , and use that to verify that curl(F ) = 0.
Lemma 5.16. Verify:
(i) For θ ∈ [0, +∞) and for γ : [0, θ] → U with γ(t) = (cos t, sin t), then $\int_\gamma F \cdot d\gamma = \theta$.
(ii) For θ ∈ (−∞, 0], then also $\int_\gamma F \cdot d\gamma = \theta$.
We can use line integrals to measure (more precisely, to define!) the number of
times a curve in the plane “winds about” a certain point. Here is the definition for
the point 0:
Definition 5.9. Given a closed curve γ in R² \ {0}, the winding number or index of γ about 0 is $I(\gamma; 0) \equiv \frac{1}{2\pi}\int_\gamma F \cdot d\gamma$.
Corollary 5.17. For γ(t) = (cos(2πnt), sin(2πnt)) with t ∈ [0, 1], n ∈ Z then the
winding number of γ about 0 is n.
Exercise 5.3. Let A = (1, 0) and B = (1, 1) and suppose γ : [a, b] → R² \ {0} with γ(a) = A, γ(b) = B. What are the possible values of $\int_\gamma F \cdot d\gamma$? Why, precisely?

To define this for a different point x ∈ R², we would translate F, setting F_x(v) = F(v − x), and set $I(\gamma; x) \equiv \frac{1}{2\pi}\int_\gamma F_x \cdot d\gamma$.
Remark 5.8. This provides one way of defining the inside and outside of a curve: x
is on the outside iff I(γ; x) = 0, otherwise on the inside. (For x ∈ Im(γ) it is not
defined).
Conclusion: Despite the fact that we have curl(F) = 0, this field F cannot be conservative, because the integral around the closed loop γ with θ = 2π is $\int_\gamma F \cdot d\gamma = 2\pi$.
We set Ω = R2 \ {(x, y) : y = 0, x ≥ 0}, the plane with the positive part of the
x-axis removed. We define the angle function Θ : Ω → (0, 2π) to be the angle of the
point (x, y) measured in the counterclockwise direction from this halfline.
A formula for Θ is given in (22) below. See Fig. 20.
Definition 5.10. Two curves γ, η : [a, b] → Rᵐ are homotopic iff there is a continuous function Φ : [0, 1] × [a, b] → Rᵐ such that Φ(0, t) = γ(t) and Φ(1, t) = η(t). If you draw a picture of this you will see that it says that the first curve can be continuously deformed into the second. A curve γ is said to be homotopic to a point iff it is homotopic to a constant curve η(t) = p for all t. For example, the curve γ(t) = (cos t, sin t) in R² is homotopic to a point; however in the domain U = R² \ {0} it is not.
A region Ω ⊆ R² is simply connected iff it is pathwise connected and has no "holes", meaning every closed curve is homotopic to a point. In the above example, U = R² \ {0} has a "hole" at 0.
The basic result is:
Theorem 5.18. If a region Ω is simply connected, and if curl(F ) = 0 on Ω, then
there exists a potential function ϕ for F defined on Ω.
Proof. We proved in Proposition 5.10 that if the domain Ω is pathwise-connected, and
the line integral over a closed loop is 0 (or equivalently, path-independent) then there
exists a potential function ϕ. The method of proof was to define ϕ by integration;
the path-independence means this function is well-defined.
We claim that if Ω is simply connected and the curl is 0, then path-independence holds. The reason is that the path integral changes continuously over a continuous homotopy; more precisely, one can subdivide the homotopy into small steps, each contained in a disk, on which a curl-free field always has a potential (a special case of the Poincaré Lemma, Theorem 5.35 below), so the integral is unchanged at each step. Since in a simply connected region any two paths with the same endpoints are homotopic, path-independence follows. $\square$
Example 4. We analyze the important specific example of the angle function Θ. This
is a potential function for the field F , but only on the restricted, simply connected
domain R2 minus the positive real axis.
What happens at the limit as the angle goes to 2π is quite interesting, explained
geometrically by the graph of Θ.
For the angle function Θ example we carry this out directly. The domain of defi-
nition of cot(θ) = cos(θ)/ sin(θ) is (0, π). So:
$$\Theta(x, y) = \begin{cases} \operatorname{arccot}(x/y) & \text{for } y > 0, \text{ taking values } \theta \in (0, \pi) \\ \operatorname{arccot}(x/y) + \pi & \text{for } y < 0, \text{ taking values } \theta \in (\pi, 2\pi) \end{cases} \tag{20}$$
We choose the initial point A = (−1, 0) and connect it to B ∈ Ω by a path γ in Ω, defining ϕ by ϕ(A) = 0 and
$$\varphi(B) = \int_\gamma F \cdot d\gamma.$$
This is well-defined since Ω is pathwise connected, so by (ii) of Proposition 5.10 it is
path-independent.
Lemma 5.19. We claim that ϕ(x, y) + π = Θ(x, y) for all (x, y) ∈ Ω.
Proof. We use the following path to connect A and B = (x, y). We define γ₁(t) = (−1, t) for t ∈ [0, y] and γ₂(t) = (t, y) for t ∈ [−1, x]. Note that γ₁′ = (0, 1), γ₂′ = (1, 0). We define γ = γ₁ + γ₂. This goes vertically up from A to the point (−1, y) and then horizontally over to B.
We have
$$\varphi(x, y) = \int_{\gamma_1} F \cdot d\gamma_1 + \int_{\gamma_2} F \cdot d\gamma_2. \tag{21}$$
To evaluate this we need to recall some facts about inverse trigonometric functions.
We have cot(θ) = cos(θ)/ sin(θ) = x/y so arccot(x/y) = θ. The domain of definition
of cot is (0, π).
So we have these formulas for the angle function Θ:
Figure 20. The angle function Θ.

Θ(x, y) = arccot(x/y), taking values θ ∈ (0, π), for y > 0; and Θ(x, y) = arccot(x/y) + π, taking values θ ∈ (π, 2π), for y < 0.
Summarizing,
$$\Theta(x, y) = \begin{cases} \operatorname{arccot}(x/y) & \text{for } y > 0 \\ \operatorname{arccot}(x/y) + \pi & \text{for } \Theta \in (\pi, 2\pi), \text{ i.e. for } y < 0 \\ \pi & \text{for } \Theta = \pi, \text{ i.e. for } y = 0,\ x < 0 \end{cases} \tag{22}$$

Next we evaluate ϕ from (21):
$$\int_{\gamma_1} F \cdot d\gamma_1 = \int_0^y F(-1, t)\cdot(0, 1)\,dt = \int_0^y \frac{x}{x^2+y^2}\Big|_{(-1, t)}\,dt = \int_0^y \frac{-1}{1+t^2}\,dt = \operatorname{arccot}(y) - \operatorname{arccot}(0) = \operatorname{arccot}(y) - \pi/2.$$
And:
$$\int_{\gamma_2} F \cdot d\gamma_2 = \int_{-1}^x \frac{-y}{x^2+y^2}\Big|_{(t, y)}\,dt = \int_{-1}^x \frac{-y}{t^2+y^2}\,dt.$$
Here we use the substitution u = t/y, so t = uy, dt = y du, and
$$\frac{-y}{t^2+y^2}\,dt = \frac{-y}{(uy)^2+y^2}\,y\,du = \frac{-1}{u^2+1}\,du.$$
Thus we have
$$\int_{-1}^x \frac{-y}{t^2+y^2}\,dt = \int_{u=-1/y}^{u=x/y} \frac{-1}{u^2+1}\,du = \operatorname{arccot}(x/y) - \operatorname{arccot}(-1/y).$$
Combining,
$$\varphi(x, y) = \int_{\gamma_1} F \cdot d\gamma_1 + \int_{\gamma_2} F \cdot d\gamma_2 = \operatorname{arccot}(y) - \operatorname{arccot}(-1/y) + \operatorname{arccot}(x/y) - \pi/2. \tag{23}$$

We claim that the first part of this is locally constant, in fact:
$$\operatorname{arccot}(y) - \operatorname{arccot}(-1/y) = \begin{cases} -\pi/2 & \text{for } y > 0 \\ \pi/2 & \text{for } y < 0 \end{cases} \tag{24}$$
To prove this, we calculate that the following derivative is 0:
$$\frac{d}{dy}\Big(\operatorname{arccot}(y) - \operatorname{arccot}(-1/y)\Big) = \frac{-1}{1+y^2} - \frac{-1}{1+(-1/y)^2}\cdot\frac{1}{y^2} = \frac{-1}{1+y^2} + \frac{1}{y^2+1} = 0.$$
To find the constant we evaluate at a single point in each region, where it is easy: at y = 1 and y = −1. Now cot(π/4) = 1, so arccot(1) = π/4 (for the case y > 0), and cot(3π/4) = −1, so arccot(−1) = 3π/4 (for the case y < 0) of (24).
So as claimed in (24),
$$\operatorname{arccot}(y) - \operatorname{arccot}(-1/y) = \begin{cases} \pi/4 - 3\pi/4 = -\pi/2 & \text{for } y = 1 \\ 3\pi/4 - \pi/4 = \pi/2 & \text{for } y = -1 \end{cases}$$
Combining this with (23),
$$\varphi(x, y) = \operatorname{arccot}(x/y) + \begin{cases} -\pi & \text{for } y > 0 \\ 0 & \text{for } y < 0 \end{cases}$$
so
$$\varphi(x, y) + \pi = \operatorname{arccot}(x/y) + \begin{cases} 0 & \text{for } y > 0 \\ \pi & \text{for } y < 0. \end{cases}$$
From (22),
$$\Theta(x, y) = \operatorname{arccot}(x/y) + \begin{cases} 0 & \text{for } y > 0 \\ \pi & \text{for } y < 0 \end{cases} \tag{25}$$
Lastly, for y = 0, x < 0 we have ϕ + π = 0 + π = π and also Θ = π, since ϕ(−1, 0) = 0 while Θ(−1, 0) = π.
This proves the Claim in all cases: Θ = ϕ + π. $\square$

Remark 5.9. To better understand the potential function Θ, draw its level curves; they are rays from the origin, climbing up like a spiral staircase.
Note that for γ(t) = (cos t, sin t),
$$\int_0^{2\pi} F(\gamma(t)) \cdot \gamma'(t)\,dt = \int_0^{2\pi} (-\sin t, \cos t) \cdot (-\sin t, \cos t)\,dt = 2\pi,$$
and also
$$\lim_{t\to 2\pi^-}\Theta(\gamma(t)) - \lim_{t\to 0^+}\Theta(\gamma(t)) = 2\pi - 0 = 2\pi,$$
so the formula $\int_\gamma F \cdot d\gamma = \varphi(B) - \varphi(A)$ is still valid in the limit; it is also valid if we can somehow allow for a "multi-valued function" as a potential!
See §5.15, Fig. 24 below for a different view of this potential: it is related to the electrostatic field of a single charge at the origin.
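As a computational aside (a sketch; the test curve is a hypothetical choice), the winding number of Definition 5.9 can be approximated directly from the line integral:

```python
import numpy as np

def winding_number(gamma, n=200000):
    # I(gamma; 0) = (1/2pi) * integral of P dx + Q dy for F = (-y, x)/(x^2+y^2);
    # gamma maps [0, 1] to R^2 \ {0} and is assumed closed
    t = np.linspace(0.0, 1.0, n+1)
    x, y = gamma(t)
    mx, my = (x[:-1]+x[1:])/2, (y[:-1]+y[1:])/2   # midpoints
    dx, dy = np.diff(x), np.diff(y)
    r2 = mx**2 + my**2
    return np.sum((-my*dx + mx*dy)/r2) / (2*np.pi)

circle3 = lambda t: np.array([np.cos(2*np.pi*3*t), np.sin(2*np.pi*3*t)])
print(winding_number(circle3))    # approx 3.0, as in Corollary 5.17
```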
5.5. Line integral with respect to a differential form. We have been studying line integrals,
$$\int_\gamma F \cdot d\gamma,$$
with this expression defined to equal
$$\int_a^b F(\gamma(t)) \cdot \gamma'(t)\,dt.$$
We now introduce a different notation for the line integral. If F = (P, Q) is a vector field on R², we write $\int_\gamma P\,dx + Q\,dy$ and define this to be simply equal to the line integral with respect to F. But exactly what is the meaning of the expression P dx + Q dy? (Do not mistake this for a formula from first-semester Calculus!)
To explain this we recall that given a vector space V, its dual space V* is the set of all linear functionals on V, that is, all λ : V → R linear. Note that V* is itself a vector space; the operations on V* are defined pointwise, as for any collection of functions taking values in a vector space: that is, (λ₁ + λ₂)(v) ≡ λ₁(v) + λ₂(v), and similarly, (aλ)(v) ≡ a(λ(v)). We call λ a dual vector or a co-vector. If we have an inner product ⟨v, w⟩ on V, then we define an explicit vector v* = λ_v ∈ V* dual to v ∈ V by λ_v(w) = ⟨v, w⟩. We write this function as ⟨v, ·⟩ = λ_v(·). By bilinearity, V defines linear functionals on V*, denoted ⟨·, v⟩.
The map v ↦ ⟨v, ·⟩ = v* from V to V* depends on the choice of the inner product, or equivalently on the choice of a basis, which we define to be orthonormal. Given this choice we can think of a co-vector as simply a vector.
The term duality in math refers to any situation where you can switch back and forth; in this case, v ↦ v* ↦ (v*)* = v. Thus the double dual map is just the identity on V, and V** = V.
A differential one-form η is a field of dual vectors, elements of the dual vector space V* to V. That is, η : V → V*.

Given a function ϕ : Rⁿ → R, we define η = dϕ to be the one-form dual to the gradient, ∇ϕ. In particular, taking ϕ(x, y) = x, the one-form dϕ = dx is dual to the constant vector field F(x) = e₁ where e₁ = (1, 0). So any one-form η can be written as a linear combination:
$$\eta = P\,dx + Q\,dy.$$

Note that the coefficients depend on the location: they are functions P (x, y), Q(x, y).
This one-form is dual to the vector field F = (P, Q), and conversely, F is dual to
η.
Similarly in R3 we can express a one-form as

η = P dx + Qdy + Rdz.

Again, we then define the line integral with respect to a one-form as equal to its line
integral over the associated vector field.

Given a one-form η, we define line integral of a curve γ over η to be simply the line
integral of the corresponding vector field F , so
Z Z Z Z b
η = P dx + Qdy = F · dγ = F (γ(t)) · γ 0 (t)dt.
γ γ γ a
Thus to calculate a line integral over a one-form, the first step is to write it out as a
standard line integral with respect to the dual field F = (P, Q).
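For instance (a hypothetical example, not from the text), the one-form η = x dy integrated over the positively oriented unit circle is the line integral of the dual field F = (0, x); a short sympy computation gives the value π, which by Green's Theorem below is the enclosed area:

```python
import sympy as sp

t = sp.symbols('t')
x, y = sp.cos(t), sp.sin(t)           # gamma(t), t in [0, 2*pi]
P, Q = 0, x                           # eta = P dx + Q dy = x dy
integrand = P*sp.diff(x, t) + Q*sp.diff(y, t)
print(sp.integrate(integrand, (t, 0, 2*sp.pi)))   # pi
```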
A key fact about line integrals is that the orientation of γ is important, since for γ : [a, b] → R² with opposite curve $\widetilde\gamma = -\gamma$, then as we know, $\int_{\widetilde\gamma} F \cdot d\widetilde\gamma = -\int_\gamma F \cdot d\gamma$. Thus γ is an oriented curve, and not just the point set Im(γ), the image of the curve.
This is the same as the difference between the Riemann integral $\int_{[a,b]} f(x)\,dx$ and the integral $\int_a^b f(x)\,dx = F(b) - F(a)$ defined from a primitive F, since in the second case A = [a, b] is treated as an oriented interval and we have $\int_b^a f(x)\,dx = -\int_a^b f(x)\,dx$.
These matters become more subtle for double and triple and k-tuple integrals,
where V ∗ is replaced by the set of alternating k-tensors on Rk , as we explain below.
So far we have only treated one-tensors:
Definition 5.11. Given a vector space V, a one-tensor is an element of the dual space V*. A differential one-form η on a vector space V is a function taking values in the one-tensors, so equivalently, η : V → V*. Choice of an inner product associates V to V*, by sending v ↦ λ_v ∈ V* with λ_v(w) = ⟨v, w⟩. This is an isomorphism, which depends on the choice of inner product.
5.6. Green’s Theorem: Stokes’ Theorem in the Plane. Here we follow the
outlines of Guidorizzi’s Calculus 3 text: [Gui02]. In my opinion this is (for those who
know Portuguese) a good text to teach from, as it is well organized, with correct
proofs and good worked-out examples and exercises of a consistent level, but it’s not
so easy to study from as it is too dry and also because it lacks the beauty of a more
advanced and abstract approach. The latter is given in spades in Spivak’s beautiful
[Spi65] and Guillemin and Pollack’s transcendent [GP74]; the approach in these notes
is to bridge the way to this very beautiful and powerful more abstract approach while
keeping our feet firmly on the ground of simplicity.
Definition 5.12. Given a simple closed C¹ curve γ in R², so γ : [a, b] → R² with γ(a) = γ(b), we define a curve on the unit circle by $\widehat\gamma(t) = \gamma'(t)/\|\gamma'(t)\|$. This is just the normalized tangent vector, so to see how the tangent vector turns, we look at how $\widehat\gamma$ moves along the unit circle.
One can prove (and it makes sense intuitively) that:
Lemma 5.20. $\widehat\gamma$ either goes around once in the clockwise direction or once in the counterclockwise direction.
We say γ is oriented positively if it is a counterclockwise motion, otherwise we say
it is oriented negatively.
Given a simple closed curve γ in the plane, to state Green's Theorem we need to be able to talk about its inside and outside. This enables us to define its orientation as well.
These ideas are made precise by the famous Jordan Curve Theorem:
Theorem 5.21. (Jordan) A continuous simple closed curve γ in R2 partitions the
plane into three connected sets:
–the interior of the curve, an open set we call K;
–the image of γ, a closed set, which is the topological boundary of K, so we call it
∂K = Im(γ), the boundary of K;
–the exterior of γ, the open set which is the complement of K ∪ ∂K.
Definition 5.13. Given such a curve, we say it has positive orientation iff it goes in the counterclockwise direction as seen from the inside.
Proposition 5.22. If γ is oriented positively and piecewise C¹, then the interior region K is to the left of the tangent vector γ′(t) for all t where γ′(t) exists and is nonzero.
Unfortunately, we will not prove any of these beautiful results here, as good proofs require a more advanced perspective, bringing in ideas from algebraic or differential topology (see [Arm83]), and as they are intuitively clear after sketching a few pictures. These ideas are also needed in Complex Analysis. There is a nice treatment relating this to line integrals in the third edition of Marsden-Hoffman: [MH98].
Theorem 5.23. (Green's Theorem) Let γ be a simple closed positively oriented curve in R², piecewise C¹, with non-empty interior. Write K for the closure of the interior of γ. Let F = (P, Q) be a C¹ vector field defined on some open set U ⊇ K.
Then
$$\int_\gamma F \cdot d\gamma = \iint_K \mathrm{curl}(F)\cdot\mathbf{k}\,dx\,dy,$$
equivalently,
$$\int_\gamma P\,dx + Q\,dy = \iint_K (Q_x - P_y)\,dx\,dy.$$
The proof of Green's Theorem will be given in stages:

Proof. Proof for rectangle: Let K = [a, b] × [c, d]. Write A = (a, c), B = (b, c), C = (b, d), D = (a, d). Let γ = γ₁ + ⋯ + γ₄ be unit-speed boundary curves traversing the segments in a counterclockwise direction, γ₁ from A to B and so on. Thus γ₁(t) = (t, c) for t ∈ [a, b], so γ₁′ = (1, 0). We have
$$\int_{\gamma_1} P\,dx + Q\,dy = \int_a^b P(t, c)\,dt,$$
and similarly for the other cases, so
$$\int_\gamma P\,dx + Q\,dy = \int_a^b P(t, c)\,dt - \int_a^b P(t, d)\,dt + \int_c^d Q(b, t)\,dt - \int_c^d Q(a, t)\,dt$$
$$= \int_a^b \big(P(t, c) - P(t, d)\big)\,dt + \int_c^d \big(Q(b, t) - Q(a, t)\big)\,dt.$$
On the other hand,
$$\iint_K (Q_x - P_y)\,dx\,dy = \int_c^d\Big(\int_a^b \frac{\partial Q}{\partial x}\,dx\Big)\,dy - \int_a^b\Big(\int_c^d \frac{\partial P}{\partial y}\,dy\Big)\,dx$$
$$= \int_c^d \big(Q(b, y) - Q(a, y)\big)\,dy - \int_a^b \big(P(x, d) - P(x, c)\big)\,dx$$
$$= \int_c^d \big(Q(b, t) - Q(a, t)\big)\,dt - \int_a^b \big(P(t, d) - P(t, c)\big)\,dt,$$
which is exactly what we had before! $\square$
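Before continuing, here is a sympy sketch checking this rectangle case on a concrete (hypothetical) example, F = (P, Q) = (xy, x²) on K = [0, 1] × [0, 2]; both sides of Green's Theorem come out equal to 1:

```python
import sympy as sp

x, y, t = sp.symbols('x y t')
P, Q = x*y, x**2                  # hypothetical field
a, b, c, d = 0, 1, 0, 2

# double integral of Q_x - P_y over K
lhs = sp.integrate(sp.diff(Q, x) - sp.diff(P, y), (x, a, b), (y, c, d))

# line integral over the four boundary segments, counterclockwise
bottom = sp.integrate(P.subs({x: t, y: c}), (t, a, b))
right  = sp.integrate(Q.subs({x: b, y: t}), (t, c, d))
top    = -sp.integrate(P.subs({x: t, y: d}), (t, a, b))
left   = -sp.integrate(Q.subs({x: a, y: t}), (t, c, d))
print(lhs, bottom + right + top + left)   # both equal 1
```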


Proof for right triangle: We take for K the triangle with corners A = (0, 0), B = (1, 0), C = (1, 1), with boundary curve γ₁ + γ₂ − γ₃ where γ₁(t) = (t, 0), γ₂(t) = (1, t) and γ₃(t) = (t, t), all for t ∈ [0, 1]. (Here −γ₃ means the opposite, i.e. orientation-reversed, curve.) Thus γ₁′ = (1, 0), γ₂′ = (0, 1) and γ₃′ = (1, 1).
We have for F = (P, Q),
$$\int_\gamma F \cdot d\gamma = \int_\gamma P\,dx + Q\,dy = \int_0^1 \big(P(t, 0) + Q(1, t)\big)\,dt - \int_0^1 \big(P(t, t) + Q(t, t)\big)\,dt.$$
On the other hand,
$$\iint_K (Q_x - P_y)\,dx\,dy = \int_{y=0}^{y=1}\Big(\int_{x=y}^{x=1}\frac{\partial Q}{\partial x}\,dx\Big)\,dy - \int_{x=0}^{x=1}\Big(\int_{y=0}^{y=x}\frac{\partial P}{\partial y}\,dy\Big)\,dx$$
$$= \int_{y=0}^{y=1} Q(x, y)\Big|_{x=y}^{x=1}\,dy - \int_{x=0}^{x=1} P(x, y)\Big|_{y=0}^{y=x}\,dx = \int_{y=0}^{y=1}\big(Q(1, y) - Q(y, y)\big)\,dy - \int_{x=0}^{x=1}\big(P(x, x) - P(x, 0)\big)\,dx,$$
which equals the line integral! $\square$


Proof for right triangle with one curvy side: Next we consider a topological triangle with vertices at A = (a, c), B = (b, c), C = (b, d), with boundary curve γ₁ + γ₂ − γ₃ where γ₁(t) = (t, c) for t ∈ [a, b]; γ₂(t) = (b, t) for t ∈ [c, d]; and γ₃(t) = (t, f(t)) for t ∈ [a, b], with f(a) = c and f(b) = d.
We assume that f is invertible, with inverse g.
We have for F = (P, Q):
$$\int_\gamma F \cdot d\gamma = \int_\gamma P\,dx + Q\,dy = \int_a^b P(t, c)\,dt + \int_c^d Q(b, t)\,dt - \int_a^b (P, Q)(\gamma_3(t))\cdot(1, f'(t))\,dt.$$
Here
$$\int_a^b (P, Q)(\gamma_3(t))\cdot(1, f'(t))\,dt = \int_a^b \big(P(t, f(t)) + Q(t, f(t))f'(t)\big)\,dt,$$
so the total is
$$\int_a^b P(t, c)\,dt + \int_c^d Q(b, t)\,dt - \int_a^b P(t, f(t))\,dt - \int_a^b Q(t, f(t))f'(t)\,dt.$$

On the other hand,
$$\iint_K (Q_x - P_y)\,dx\,dy = \int_{y=c}^{y=d}\Big(\int_{x=g(y)}^{x=b}\frac{\partial Q}{\partial x}\,dx\Big)\,dy - \int_{x=a}^{x=b}\Big(\int_{y=c}^{y=f(x)}\frac{\partial P}{\partial y}\,dy\Big)\,dx$$
$$= \int_{y=c}^{y=d}\big(Q(b, y) - Q(g(y), y)\big)\,dy - \int_{x=a}^{x=b}\big(P(x, f(x)) - P(x, c)\big)\,dx$$
$$= \int_{x=a}^{x=b} P(x, c)\,dx + \int_{y=c}^{y=d} Q(b, y)\,dy - \int_{x=a}^{x=b} P(x, f(x))\,dx - \int_{y=c}^{y=d} Q(g(y), y)\,dy.$$
We are almost done. Note that each expression has four terms, and the first three of them agree, just changing the variable of integration from time t to the spatial coordinates x and y. It remains to check the last term. This is a substitution, making use of the inverse function: writing s = f(t), so t = g(s), then ds = f′(t)dt, whence indeed
$$\int_a^b Q(t, f(t))f'(t)\,dt = \int_{s=c}^{s=d} Q(g(s), s)\,ds = \int_{y=c}^{y=d} Q(g(y), y)\,dy,$$
completing the proof. $\square$



Proof for more complicated regions: Once we have these special cases we can build up to the general statement of Green's Theorem as follows. First we consider other cases of an open region K with a simple closed piecewise-C¹ boundary curve γ. Using straight lines, we cut K into pieces of the above forms and add up the results. The key point is that the pieces have nonintersecting interiors, and meet on their boundaries in curves with opposite orientation. The double integrals add, since the common boundary has content zero; on the line integral side of the equation, the question is why the boundary intersections always meet in curves with opposite orientation. But this is easy to justify: we prove it by induction on the number of pieces, reducing to two regions. Their boundaries meet in curves with opposite orientations because each is counterclockwise as seen from its own interior, hence opposite as seen from the other region.
The next step is to consider two disjoint simple closed piecewise-C¹ boundary curves γ₁, γ₂ with regions K₁, K₂. If these regions are disjoint, we simply define the boundary of the union K₁ ∪ K₂ to be γ₁ together with γ₂, which we write as γ₁ + γ₂. The result clearly holds for this case also. Next consider the case where γ₂ is inside of K₁. Then we consider the region K = K₁ \ (K₂ ∪ Im(γ₂)). For example, if the curves are concentric circles, then K is called an annulus: a disk with a hole removed from it. We define the boundary curve to be γ = γ₁ − γ₂. That is, the outer curve γ₁ is oriented positively, while the inner curve is oriented negatively.
Note that the resulting boundary curve γ has the property that as we traverse the curve, the region K always occurs on the left-hand side.
It is then easy to show, by subtracting the two results for γ₁, γ₂, that Green's Theorem still holds.
Note that such a region is now not simply connected.
We do similarly for a disk with k holes removed.
A more formal proof uses the notion of chains as developed in [Spi65] or [GP74].

Exercise 5.4. Consider the field
$$F = (P, Q) = \Big(\frac{-y}{x^2+y^2},\; \frac{x}{x^2+y^2}\Big)$$
of Exercise 5.2, for the region with two boundary circles of radius 1 and 2. What does Green's Theorem say in this case?

Remark 5.10. The proof of Green's Theorem for rectangular regions given above may remind the reader of the proof we gave above for the equality of mixed partials, Lemma 4.28. We next see the exact connection between the two arguments, by showing how the equality of mixed partials follows as a corollary of Green's Theorem.
Given a vector field F = (P, Q) and the rectangle from the proof above, R = [a, b] × [c, d], we parametrize the boundary of R by a counterclockwise curve γ and calculate the line integral $\int_\gamma F \cdot d\gamma$. The corners of R are A, B, C, D with A = (a, c), B = (b, c), C = (b, d) and D = (a, d). We have γ = γ_{AB} + γ_{BC} + γ_{CD} + γ_{DA} where these are the unit-speed paths; we use the inverse paths for the last two. Thus
$$\gamma_{AB}(t) = (t, c) \text{ for } t \in [a, b]; \quad \gamma_{AB}' = (1, 0)$$
$$\gamma_{BC}(t) = (b, t) \text{ for } t \in [c, d]; \quad \gamma_{BC}' = (0, 1)$$
$$-\gamma_{CD}(t) = (t, d) \text{ for } t \in [a, b]; \quad -\gamma_{CD}' = (1, 0)$$
$$-\gamma_{DA}(t) = (a, t) \text{ for } t \in [c, d]; \quad -\gamma_{DA}' = (0, 1)$$

Then
$$\int_{\gamma_{AB}} F \cdot d\gamma_{AB} = \int_a^b (P, Q)(\gamma_{AB})\cdot(1, 0)\,dt = \int_a^b P(t, c)\,dt$$
$$\int_{\gamma_{BC}} F \cdot d\gamma_{BC} = \int_c^d (P, Q)(\gamma_{BC})\cdot(0, 1)\,dt = \int_c^d Q(b, t)\,dt$$
$$\int_{\gamma_{CD}} F \cdot d\gamma_{CD} = \int_a^b (P, Q)(\gamma_{CD})\cdot(1, 0)\,dt = \int_a^b P(t, d)\,dt$$
$$\int_{\gamma_{DA}} F \cdot d\gamma_{DA} = \int_c^d (P, Q)(\gamma_{DA})\cdot(0, 1)\,dt = \int_c^d Q(a, t)\,dt$$

So far this is true for any vector field. We now assume F is conservative, so there exists ϕ with F = ∇ϕ; thus F = (P, Q) where $P(x, y) = \frac{\partial}{\partial x}\varphi(x, y)$ and $Q(x, y) = \frac{\partial}{\partial y}\varphi(x, y)$. So
$$\int_{\gamma_{AB}} F \cdot d\gamma_{AB} = \int_a^b P(t, c)\,dt = \int_a^b \frac{\partial}{\partial x}\varphi(t, c)\,dt = \varphi(t, c)\Big|_a^b,$$
and we have:
$$\int_{\gamma_{AB}} F \cdot d\gamma_{AB} = \varphi(t, c)\Big|_a^b = \varphi(b, c) - \varphi(a, c)$$
$$\int_{\gamma_{BC}} F \cdot d\gamma_{BC} = \varphi(b, t)\Big|_c^d = \varphi(b, d) - \varphi(b, c)$$
$$-\int_{\gamma_{CD}} F \cdot d\gamma_{CD} = \varphi(t, d)\Big|_a^b = \varphi(b, d) - \varphi(a, d)$$
$$-\int_{\gamma_{DA}} F \cdot d\gamma_{DA} = \varphi(a, t)\Big|_c^d = \varphi(a, d) - \varphi(a, c)$$

Thus
$$\int_\gamma F \cdot d\gamma = \big(\varphi(b, c) - \varphi(a, c)\big) + \big(\varphi(b, d) - \varphi(b, c)\big) + \big(\varphi(a, d) - \varphi(b, d)\big) + \big(\varphi(a, c) - \varphi(a, d)\big) = 0.$$
Note that this statement,
$$\int_\gamma F \cdot d\gamma = 0,$$
is equivalent to
$$\int_{\gamma_{AB}+\gamma_{CD}} F \cdot d\gamma = -\int_{\gamma_{BC}+\gamma_{DA}} F \cdot d\gamma,$$
as we traverse the sides of R in a different order. And this was exactly the concluding step in the proof of Lemma 4.28.
We have proved that if F is conservative, then the integral around any rectangular loop is 0. This proof has not used the equality of mixed partials.
But we can then derive the equality of mixed partials from this fact, as a corollary of Green's Theorem. Green's Theorem states that
$$\iint_R \mathrm{curl}(F)\cdot\mathbf{k}\,dx\,dy = \int_\gamma F \cdot d\gamma$$
for γ = ∂R as above.
Since we have proved that the line integral around ∂R is 0 for every rectangle, Green's Theorem tells us that curl(F) = 0: if Q_x − P_y were nonzero at some point, by continuity it would be nonzero on some small rectangle, giving a nonzero integral.
But $\mathrm{curl}(F) = \big(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\big)\mathbf{k}$, and hence the mixed partials are equal.

5.7. The Divergence Theorem in the plane.
Definition 5.14. Let F = (P, Q) be a C¹ vector field in R². The divergence of F is defined to be:
$$\mathrm{div}(F) = P_x + Q_y.$$
We shall use the notation: given a vector v = (a, b) ∈ R², then v* = (b, −a) and $\widetilde v^* = (-b, a)$.
For the particular case of F = (P, Q) we write G for $\widetilde F^* = (-Q, P)$.
Theorem 5.24. Let F = (P, Q) be a C¹ vector field in the plane, and let γ be a piecewise C¹, positively oriented simple closed curve, with interior region K. We define n = γ′*/||γ′*|| = γ′*/||γ′||; this is the outward normal vector of γ.
Then
$$\int_\gamma F \cdot n\,ds = \iint_K \mathrm{div}(F)\,dx\,dy.$$
The same holds more generally for a finite collection of disjoint such regions K₁, ..., Kₙ with boundaries γ₁, ..., γₙ, then writing K = ∪Kᵢ and γ = γ₁ + ⋯ + γₙ.
Proof. We place the two statements side-by-side, for γ the boundary curve of K, one for the field F and the other for G = $\widetilde F^*$:
Green's Theorem:
$$\int_\gamma G \cdot d\gamma = \iint_K \mathrm{curl}(G)\cdot\mathbf{k}\,dx\,dy$$
Divergence Theorem:
$$\int_\gamma F \cdot n\,ds = \iint_K \mathrm{div}(F)\,dx\,dy.$$

Note here that curl(G)·k = div(F), so once we prove the two different types of line integrals are equal, the theorem is proved!
For γ(t) = (x(t), y(t)) we have γ′(t) = (x′(t), y′(t)), and n = γ′*/||γ′|| = (y′, −x′)/||γ′|| = (y′, −x′)/||(y′, −x′)||.

Recall (Def. 5.2) that the line integral of second type of a function f : R² → R over γ : [a, b] → R² is defined to be
$$\int_\gamma f(v)\,ds \equiv \int_a^b f(\gamma(t))\,\|\gamma'(t)\|\,dt,$$
where ds is the element of arclength, ds = ||γ′(t)||dt. As we showed in Proposition 5.3, this value is independent of parametrization. Now for this to make sense, it is enough for the function f to be defined on the image of γ, not necessarily on all of R². So when we write the formula
$$\int_\gamma F \cdot n\,ds,$$
what we mean by this is the line integral of second type of the function f over γ, where f is defined on the image of γ by
$$f(\gamma(t)) = F(\gamma(t)) \cdot n(t).$$
Thus
$$\int_\gamma F \cdot n\,ds \equiv \int_\gamma f(v)\,ds \equiv \int_a^b f(\gamma(t))\,\|\gamma'(t)\|\,dt.$$

Now writing in components F = (P, Q), we have
$$\int_\gamma F \cdot n\,ds = \int_a^b F(\gamma(t)) \cdot n(t)\,\|\gamma'(t)\|\,dt = \int_a^b F(\gamma(t)) \cdot \frac{(y', -x')}{\|\gamma'(t)\|}\,\|\gamma'(t)\|\,dt$$
$$= \int_a^b (P, Q)(\gamma(t)) \cdot (y', -x')\,dt = \int_a^b (-Q, P)(\gamma(t)) \cdot (x', y')\,dt = \int_a^b G(\gamma(t)) \cdot \gamma'(t)\,dt$$
$$= \int_\gamma G \cdot d\gamma = \iint_K \mathrm{curl}(G)\cdot\mathbf{k}\,dx\,dy = \iint_K \mathrm{div}(F)\,dx\,dy,$$
proving the Theorem. $\square$



Remark 5.11. An explanation is that F is lined up with n, thus producing positive divergence, iff $\widetilde F^*$ is lined up with γ′, thus producing positive curl. The reason for using $\widetilde F^*$ rather than F* is so that the signs match; the key point is that for v = (a, b) and w = (c, d), then v* = (b, −a) and $\widetilde v^* = (-b, a)$, and $v \cdot w^* = w \cdot \widetilde v^*$. So α(v, w) ≡ v · w* defines an alternating form; indeed, it equals det(v, w)! See Proposition ?? ff. regarding two-tensors.
Using this notation, the last part of the proof can be summarized as:
$$\int_\gamma F \cdot n\,ds = \int_a^b F(\gamma(t)) \cdot \gamma'^*(t)\,dt = \int_a^b \widetilde F^*(\gamma(t)) \cdot \gamma'(t)\,dt = \iint_K \mathrm{curl}(\widetilde F^*)\cdot\mathbf{k}\,dx\,dy = \iint_K \mathrm{div}(F)\,dx\,dy.$$

See p. 79 of [War71] regarding the star operator.
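As a concrete sketch (the field F = (x, y) on the unit disk is a hypothetical choice), both sides of the plane Divergence Theorem can be computed symbolically; each equals 2π:

```python
import sympy as sp

t, r, th = sp.symbols('t r theta')
# flux side: gamma(t) = (cos t, sin t), with n ds = (y', -x') dt
x, y = sp.cos(t), sp.sin(t)
P, Q = x, y                                   # F = (x, y)
flux = sp.integrate(P*sp.diff(y, t) - Q*sp.diff(x, t), (t, 0, 2*sp.pi))
# divergence side in polar coordinates: div F = 2, area element r dr dtheta
div_int = sp.integrate(2*r, (r, 0, 1), (th, 0, 2*sp.pi))
print(flux, div_int)                          # 2*pi, 2*pi
```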


5.8. Surface area and the "determinant" of a rectangular matrix. To define surface area, and more generally, k-dimensional volume in Rⁿ for k ≤ n, we need an interesting bit of linear algebra.
We recall from above the equivalent definitions of the determinant, proved in Theorem 4.19.
The first was the standard Algebraic Definition; note that this is only defined for square matrices.
We then gave the Geometric Definition: Let M be an (n × n) real matrix. Then
$$\det M = (\pm 1)(\text{factor of change of volume}),$$
where we take +1 if M preserves orientation, −1 if that is reversed. (Here this is n-dimensional volume, and so is length or area in dimensions 1, 2.)
We next discussed the vector product in R³, presenting three definitions. The first two were:
Algebraic definition of v ∧ w, via the symbolic "determinant" formula;
Geometric definition of v ∧ w.
Letting P(v, w) denote the parallelogram spanned by v, w ∈ R³, that is, P(v, w) ≡ {av + bw : a, b ∈ [0, 1]}, then from the geometric definition, the norm of the vector product ||v ∧ w|| equals the area of P(v, w).
Now this parallelogram P(v, w) is the image of the unit square I × I = [0, 1] × [0, 1] in R² by the linear map
$$A = \begin{pmatrix} v_1 & w_1 \\ v_2 & w_2 \\ v_3 & w_3 \end{pmatrix}.$$
This suggests that we can turn the above definition around and give an analogue of the geometric definition of the determinant for the rectangular matrix A: it is to be the area of this image parallelogram.
In other words, although the algebraic definition of determinant does not extend
to rectangular matrices, the geometric definition does, and more generally for a k-
parallelopiped P in Rn , analogous to this simplest case of 2-parallelograms in R3 .
The tantalizing task is then to find an algebraic formula for this geometric definition
in general, which must of course include the usual (n × n) case.
An answer comes from the following formula for k-dimensional volume in Rⁿ; see [HH15] p. 526. Noting that AᵗA is a square matrix, Hubbard proves that the volume of a k-parallelopiped P in Rⁿ equals:
$$\mathrm{vol}(P) = \sqrt{\det(A^t A)}.$$
The Gram matrix is At A; this is useful in Linear Algebra. The name comes from
Jorgen Pedersen Gram, famous for many things including the Gram-Schmidt orthog-
onalization procedure; see Wikipedia. The Gram determinant is the determinant of
the Gram matrix, so Hubbard’s formula is the square root of this. See [CJBS89]
p. 191. We prefer Hubbard’s presentation to Courant-John, in part because we find
the matrix form both easier to work with and to understand.
We present three proofs of the volume formula, first for the simplest cases:
Lemma 5.25.
(i) For A a (2 × 1) matrix, i.e. a single column $v = \begin{pmatrix} v_1 \\ v_2 \end{pmatrix}$, the length of the image of the unit segment {tv : t ∈ [0, 1]} is $\sqrt{\det(A^t A)}$.
(ii) For A the (3 × 2) matrix with columns v, w as above, then
$$\|v \wedge w\| = \sqrt{\det(A^t A)}.$$
Proof. For (i) we have $A^t A = \begin{pmatrix} v_1 & v_2 \end{pmatrix}\begin{pmatrix} v_1 \\ v_2 \end{pmatrix} = \big(v_1^2 + v_2^2\big)$, whose square root is the length ||v||.
For (ii),
$$A^t A = \begin{pmatrix} v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{pmatrix}\begin{pmatrix} v_1 & w_1 \\ v_2 & w_2 \\ v_3 & w_3 \end{pmatrix} = \begin{pmatrix} v \cdot v & v \cdot w \\ w \cdot v & w \cdot w \end{pmatrix},$$
so $\det A^t A = \|v\|^2\|w\|^2 - (v \cdot w)^2$, while we know from Corollary 4.20 that the area of the parallelogram P(v, w) ⊂ R³ satisfies
$$(\text{area})^2 = (v \cdot v)(w \cdot w) - (v \cdot w)^2.$$
Note that the formula
$$\mathrm{vol}(P)^2 = \det(A^t A)$$
is not only completely general but is much easier to remember! $\square$
Following Hubbard, we shall extend the above ideas to (n × k) matrices. For this we introduce the following notation.
Definition 5.15. As we have explained, the word determinant is reserved for square matrices, so we suggest using this notation for the general case of an (n × k) matrix:
$$\mathrm{Det}(A) \equiv \sqrt{\det(A^t A)}.$$
Note that (unlike det A), Det A is always ≥ 0.
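A quick numerical sketch of this definition, checking Lemma 5.25(ii) for random vectors (the random seed is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
v, w = rng.standard_normal(3), rng.standard_normal(3)
A = np.column_stack([v, w])               # (3 x 2) matrix with columns v, w
lhs = np.linalg.norm(np.cross(v, w))      # ||v ^ w||
rhs = np.sqrt(np.linalg.det(A.T @ A))     # Det(A)
print(np.isclose(lhs, rhs))               # True
```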
Lemma 5.26. Let A be an (n × k) matrix. Then Det(A) = Det(Aᵗ) (where, for a matrix B with more columns than rows, we use the Gram matrix of its rows: Det(B) ≡ $\sqrt{\det(BB^t)}$).
Proof. Note that so far we know this only for square matrices! Also, for A of size (3 × 2) the two matrices AᵗA and AAᵗ are quite different, the first being (2 × 2) while the second is (3 × 3).
We give the proof first for the case of a (3 × 2) matrix
$$A = \begin{pmatrix} v_1 & w_1 \\ v_2 & w_2 \\ v_3 & w_3 \end{pmatrix}.$$
We define n to be the unit vector perpendicular to v, w such that (v, w, n) has positive orientation, so n = v ∧ w/||v ∧ w||. Write $\widehat A$ for the (3 × 3) matrix with columns n, v, w. Thus
$$\widehat A^t \widehat A = \begin{pmatrix} n_1 & n_2 & n_3 \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{pmatrix}\begin{pmatrix} n_1 & v_1 & w_1 \\ n_2 & v_2 & w_2 \\ n_3 & v_3 & w_3 \end{pmatrix} = \begin{pmatrix} n \cdot n & n \cdot v & n \cdot w \\ v \cdot n & v \cdot v & v \cdot w \\ w \cdot n & w \cdot v & w \cdot w \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & v \cdot v & v \cdot w \\ 0 & w \cdot v & w \cdot w \end{pmatrix},$$
so $\det(\widehat A^t \widehat A) = \det(A^t A)$. Now we know that $\det(\widehat A^t) = \det(\widehat A)$, since these are square matrices. Thus
$$\mathrm{Det}\,A = \det(\widehat A) = \det(\widehat A^t) = \mathrm{Det}(A^t).$$
This is not a priori obvious since, as noted above, AᵗA ≠ AAᵗ (the first being (2 × 2), the second (3 × 3); also the second is more complicated!).
For the general case, with A an (n × k) matrix, first for k < n, instead of adjoining the single vector n to form $\widehat A$, we adjoin n − k vectors, as follows. We consider the subspace V generated by v₁, ..., v_k and find an orthonormal basis (u₁, ..., u_k) for V by the Gram-Schmidt procedure. We then complete this to an orthonormal basis (u₁, ..., u_k, û_{k+1}, ..., û_n) of Rⁿ; thus these last vectors are perpendicular to V. The resulting calculations are identical to the (3 × 3) case.
A fortiori this also proves the case k > n.

Theorem 5.27. Given an (n × k) matrix A with k ≤ n, and denoting by P the parallelopiped in Rⁿ which is the image by A of I₁ × ⋯ × I_k where each I_j = I, then the k-dimensional volume of P is $\mathrm{Det}\,A \equiv \sqrt{\det(A^t A)}$.
Proof. $\mathrm{vol}(P) = |\det(\widehat A)| = \mathrm{Det}\,A$. $\square$

This is independent of parameter change; the first proof is geometric and is simply
that volume does not depend on parameterization. The second, algebraic proof shows
how nice the formulas are:

Lemma 5.28. (linear change-of-variables theorem) Let B be a (k × k) matrix and A an (n × k) matrix. Then for $\widetilde A = AB$, $\mathrm{Det}(\widetilde A) = \mathrm{Det}(A)\,|\det(B)|$.
Proof.
$$\mathrm{Det}(AB) = \sqrt{\det((AB)^t AB)} = \sqrt{\det(B^t (A^t A) B)} = \sqrt{\det(B^t)\det(A^t A)\det(B)} = |\det B|\sqrt{\det(A^t A)},$$
since as we know, det B = det Bᵗ. $\square$




Corollary 5.29. If det B = 1, v₁, ..., v_k are the columns of A, and $\widetilde v_1, ..., \widetilde v_k$ are the columns of AB, then the parallelopiped spanned by $\widetilde v_1, ..., \widetilde v_k$ and that spanned by v₁, ..., v_k have the same volume.

We give a second proof of Theorem 5.27, using only the geometric definition of determinant.....
(TO DO)
........
Given a (2 × 2) matrix $B = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$, define $\widehat B$ to be the following (3 × 3) matrix:
$$\widehat B = \begin{pmatrix} 1 & 0 & 0 \\ 0 & a & b \\ 0 & c & d \end{pmatrix};$$
then det $\widehat B$ = det B.


...............
Noting that $\det(\widehat A) = \pm\mathrm{Det}\,A$, and so the area of P(v, w), which we write as area(v, w), equals the volume of the image of the unit cube by $\widehat A$, which we write as vol($\widehat A$), we conclude that
$$\mathrm{area}(v, w) = \mathrm{vol}(\widehat A) = |\det \widehat A| = \mathrm{Det}\,A,$$
proving Hubbard's formula for this case.


5.9. Surface area and surface integrals. Given a domain (a connected open set) B ⊆ R² and a C¹ map σ : B → S ⊆ R³ such that every (u, v) ∈ B is a regular point, i.e. such that ||σᵤ ∧ σᵥ|| ≠ 0, then as above, σ is a parametrized surface. (Recall that the regularity condition guarantees that the tangent plane exists at that point.)
Definition 5.16. We define the surface area of σ to be
$$\mathrm{area}(\sigma) = \iint_B \|\sigma_u \wedge \sigma_v\|\,du\,dv.$$

This makes sense because area(P(v, w)) = ||v ∧ w|| is the area of the parallelogram spanned by the vectors v, w, so ||σᵤ ∧ σᵥ|| is the infinitesimal area, and the integral adds this up. The intuition is that for a C¹ map, the surface can be well-approximated by a polygonal surface made up of parallelograms.
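As a numerical sketch of Definition 5.16 (with the standard spherical parametrization as a hypothetical test case), approximating the partial derivatives by finite differences recovers the area 4π of the unit sphere:

```python
import numpy as np

def sigma(u, v):   # spherical parametrization of the unit sphere
    return np.array([np.sin(u)*np.cos(v), np.sin(u)*np.sin(v), np.cos(u)])

n = 400
u = np.linspace(0, np.pi, n)
v = np.linspace(0, 2*np.pi, 2*n)
uu, vv = np.meshgrid(u, v, indexing='ij')
du, dv = u[1]-u[0], v[1]-v[0]
S = sigma(uu, vv)                          # shape (3, n, 2n)
Su = np.gradient(S, du, axis=1)            # finite-difference sigma_u
Sv = np.gradient(S, dv, axis=2)            # finite-difference sigma_v
cross = np.cross(Su, Sv, axis=0)
area = np.sum(np.linalg.norm(cross, axis=0)) * du * dv
print(area, 4*np.pi)                       # approx 4*pi = 12.566...
```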
Given a parametrized surface σ as above and an invertible C¹ map H : A → B, then $\widetilde\sigma = \sigma \circ H$ is a reparametrization of σ via the change of parameter H.
The next result is the analogue of Proposition 4.7 for curves:
Theorem 5.30. (area is invariant of parametrization) Suppose A ⊆ R² is a domain and H : A → B is C¹ and invertible. Then for $\widetilde\sigma = \sigma \circ H$,
$$\mathrm{area}(\widetilde\sigma) = \mathrm{area}(\sigma).$$
Proof. We are to show that
$$\iint_A \|\widetilde\sigma_s \wedge \widetilde\sigma_t\|\,ds\,dt = \iint_B \|\sigma_u \wedge \sigma_v\|\,du\,dv.$$
We know from the change-of-variables formula for double integrals that for F : B → R, defining $\widetilde F = F \circ H$, then
$$\iint_A \widetilde F(s, t)\,|\det DH(s, t)|\,ds\,dt = \iint_A F \circ H(s, t)\,|\det DH(s, t)|\,ds\,dt = \iint_B F(u, v)\,du\,dv.$$

We define F by $F(u, v) = \|\sigma_u \wedge \sigma_v\|(u, v) = \mathrm{Det}\,D\sigma(u, v)$, and as above $\widetilde F = F \circ H$. Since by the Chain Rule
$$D(\sigma \circ H)|_{(s,t)} = D\sigma|_{(u,v)}\,DH|_{(s,t)}, \qquad (u, v) = H(s, t),$$
we have at the point (s, t), using Lemma 5.28:
$$\|\widetilde\sigma_s \wedge \widetilde\sigma_t\| = \mathrm{Det}\big(D(\sigma \circ H)|_{(s,t)}\big) = \mathrm{Det}\big(D\sigma|_{(u,v)}\big)\,\big|\det DH|_{(s,t)}\big| = \widetilde F(s, t)\,\big|\det DH|_{(s,t)}\big|.$$
So
$$\iint_A \|\widetilde\sigma_s \wedge \widetilde\sigma_t\|\,ds\,dt = \iint_A \widetilde F(s, t)\,\big|\det DH|_{(s,t)}\big|\,ds\,dt = \iint_B F(u, v)\,du\,dv = \iint_B \|\sigma_u \wedge \sigma_v\|\,du\,dv. \qquad\square$$

Theorem 5.31. (surface integral is invariant of parametrization) Suppose A ⊆ R² is a domain, H : A → B is C¹ and invertible, and F : B → R is continuous. Then for $\widetilde\sigma = \sigma \circ H$,
$$\iint_A F \circ H(s, t)\,\|\widetilde\sigma_s \wedge \widetilde\sigma_t\|\,ds\,dt = \iint_B F(u, v)\,\|\sigma_u \wedge \sigma_v\|\,du\,dv.$$
Proof. We follow the above proof, again using the change-of-variables theorem for double integrals. $\square$
Given a parametrized surface σ as above, with its image the parametrized surface S = σ(B), and a function G : S → R, we define the surface integral of G over σ to be:
$$\iint_S G\,dA = \iint_B G(\sigma(u, v))\,\|\sigma_u \wedge \sigma_v\|\,du\,dv.$$
Theorem 5.31 shows this is indeed well-defined, as it is invariant with respect to change of parametrization.
This notation is analogous to the line integral of second type in Def. 5.2:
$$\int_\gamma f(v)\,ds \equiv \int_a^b f(\gamma(t))\,\|\gamma'(t)\|\,dt,$$
where we called ds = ||γ′(t)||dt the infinitesimal arc length, and this integral is integration with respect to arc length. Here we call dA = ||σᵤ ∧ σᵥ|| du dv the area form, and this is an integral with respect to area.
5.10. Integrals over parametrized submanifolds. Let us note that the above formula for the surface area of a parametrized surface σ can be written in a different way, given our definition of Det(A) for a rectangular matrix:
$$\mathrm{area}(\sigma) = \iint_B \|\sigma_u \wedge \sigma_v\|\,du\,dv = \iint_B \mathrm{Det}\,D\sigma\,du\,dv.$$
In this integral, we are not keeping track of orientation of the surface, since surface area is always positive ($\mathrm{Det}\,M = \sqrt{\det(M^t M)} \ge 0$, and the integral of a positive function with respect to dA is always positive). This is just like a line integral of second type. When we wish to include orientation, we use instead the notion of a two-form on R³ (or a one-form for line integrals).
The above formula, and the proof of invariance for change of parameter, extends immediately to the situation of a k-dimensional surface inside of Rⁿ:
Definition 5.17. Given a domain (a connected open set) B ⊆ Rᵏ and a C¹ map ϕ : B → S ⊆ Rⁿ such that every u ∈ B is a regular point, i.e. such that Det Dϕ ≠ 0, equivalently Dϕ has maximal rank (= k), then as above, ϕ is a parametrized submanifold of dimension k. The regularity condition guarantees that the tangent space to ϕ exists at that point.
We define the k-dimensional volume of ϕ to be
$$\mathrm{vol}(\varphi) = \int\!\cdots\!\int_B \mathrm{Det}\,D\varphi\,du,$$
where du = dx₁dx₂⋯dx_k.
Given a parametrized submanifold ϕ as above and an invertible C¹ map H : A → B, then $\widetilde\varphi = \varphi \circ H$ is a reparametrization of ϕ via the change of parameter H.
Theorem 5.32. (k-dimensional volume is invariant of parametrization) Suppose A ⊆ Rᵏ is a domain and H : A → B is C¹ and invertible. Then
$$\mathrm{vol}(\widetilde\varphi) = \mathrm{vol}(\varphi).$$
Proof. We exactly follow the proof for surfaces. $\square$
Similar to the case for surfaces, we say for M = ϕ(B): given a function G : M → R, we define the volume integral of G to be:
$$\int_B G(\varphi(u))\,\mathrm{Det}\,D\varphi\,du.$$
Theorem 5.33. (Change-of-variables theorem)....(TO DO)


Proof. 
Next we prove Hubbard’s formula in a different way. We consider a k-parallelopiped
in Rn .
First we give the proof for k = 2 and n arbitrary.
(TO DO)...
5.11. The Divergence Theorem in space. (TO DO)
5.12. Stokes’ Theorem. Green’s Theorem and the Divergence Theorem both turn
out to be a special case of the fundamental result of vector calculus: Stokes’ Theorem,
where the points A, B, C, D are the boundary of the curve γ and get replaced by the
boundary of any domain.
Z Z
ω= dω
∂Ω Ω
or, in a different notation,
h∂Ω, ωi = hΩ, dωi.
In this notation, which can be called functional notation, h·, ·i is a pairing . A pairing
is a bilinear operator, but on the right we have a vector space (of d-forms) and on the
left an additive group (of d-chains, generated by d dimensional submanifolds). Here
d = k − 1 on the left and d = k on the right. The analogous assumption to the field
being conservative is hidden here, in that we begin with a k − 1-form on the left, like
the potential, and take its derivative on the right, like its gradient.
(TO DO)
5.13. Poincaré’s Lemma: Existence of the vector potential. A key idea of
Vector Calculus is to extend the Fundamental Theorem of Calculus in a variety of
ways. The first is that if for a vector field F on Rn , we have a function φ : Rn → R such
that ∇ϕ = F , then for a C 1 path γ : [a, b] → Rn with endpoints A = γ(a), B = γ(b)
then Z Z b
F · dγ = F (γ(t)) · γ 0 (t)dt = ϕ(B) − ϕ(A).
γ a
This is just like the case in one dimension where, given f : [a, b] → R and a function F satisfying F′ = f, then
$$\int_a^b f(x)\,dx = F(b) - F(a).$$
There is however one important difference: for f Riemann integrable there always exists such a primitive or antiderivative F, while for higher dimensions this only works if the field F is conservative, in which case ϕ is called a potential function.
Equivalently, in differential form notation, say for F = (P, Q), the form η = P dx + Q dy has a primitive ϕ such that dϕ = η. This leads to the nice formula
$$\int_\gamma \eta = \int_\gamma P\,dx + Q\,dy = \int_\gamma d\varphi = \int_{\partial\gamma}\varphi = \varphi(B) - \varphi(A).$$
The terminology is thus that F has a potential ϕ iff η has a primitive ϕ.


In one dimension the potential is simply called the primitive, and can be defined from the integral by $F(x) = \int_{x_0}^x f(r)\,dr$. This is defined up to a constant, choice of which corresponds to changing the initial point x₀; thus we have made the choice F(x₀) = 0.
Exactly the same thing works for line integrals, where we can attempt to define a potential function in the same way; we did this in Theorem 5.18. Recalling the proof, the potential is defined up to choice of an initial point A, setting ϕ(A) = 0 and $\varphi(B) = \int_a^b F(\gamma(t)) \cdot \gamma'(t)\,dt$. This will be defined if there exists a path γ connecting A and B (by definition, iff the region is pathwise-connected) and will be well-defined iff this definition is independent of the path chosen. That is one of the equivalent characterizations of a conservative field. (To prove that this definition indeed gives a potential, we calculated the partials and showed one indeed recovers the field; this is the "hardest" step in proving the equivalence of the conditions.)
This result, which extends the Fundamental Theorem of Vector Calculus, itself
extends much further, to Green’s Theorem, the Divergence Theorem, and Stokes’
Theorem in R2 and R3 . All of this becomes simultaneously much more complicated
and much simpler in its most natural setting: the generalized Stokes Theorem on
manifolds with boundary.
Why we say “much simpler” is shown by the statement:
$$\int_{\partial B}\eta = \int_B d\eta,$$
or equivalently,
$$\langle\partial B, \eta\rangle = \langle B, d\eta\rangle.$$
The second notation exhibits the integral as a bilinear form, like an inner product.
However here the elements on the right-hand side are differential forms, which form
a vector space, while on the left-hand side these are chains, parametrized manifolds
which can be added, subtracted or multiplied by integers, thus belonging to a module
(over the ring Z) rather than a vector space.
This second equation says that the boundary operator ∂ on chains is dual to the
exterior derivative operator d on forms. This relationship can be summarized by
86 ALBERT M. FISHER

saying that these operators are adjoints. (Note that this is indeed analogous to the
definition of the transpose, or adjoint, of a linear operator!)
The first difficulty hidden by this simple notation is all in the definitions, which are equally abstract and deep. The secondary difficulty comes in bridging the abstraction to the concrete versions of Vector Calculus in R² and R³.
We mention two auxiliary points which come up in all these settings. The basic theorem is Stokes', which can be thought of as (and indeed can be called) the Fundamental Theorem of Vector Calculus.
We shall need:
Definition 5.18. A differential k-form η is closed iff dη = 0.
It is exact iff there exists a (k − 1)-form α such that dα = η.
Lemma 5.34. If dα = η then dη = 0. Thus, d(dα) = 0. That is, an exact form is
closed.
In fact, we have seen a special case of this in Proposition 5.14, that ∇ × (∇ϕ) = 0.
The two other results are these:
Theorem 5.35. (Poincaré Lemma) On a simply connected domain, a closed form is
exact.
Thus the Poincaré Lemma says that for a topologically nice domain (simply connected), a primitive always exists. For one-forms in Rⁿ we have already seen this in a special case: in R³, for a simply connected domain, for the dual vector field F, if curl(F) = 0 then there exists ϕ such that ∇ϕ = F; thus F has a potential. And ∇ϕ = F iff dϕ = η = ΣPᵢdxᵢ.
The second related result is:
Theorem 5.36. (Hodge Decomposition) On a simply connected domain, every dif-
ferential form can be uniquely written as the sum of a closed form and an exact form.
For vector fields in Rn , we say:
Definition 5.19. A vector field F is divergence-free or incompressible iff div(F ) = 0.
It is curl-free or conservative or irrotational iff curl(F ) = 0.
The Hodge decomposition then gives:
Theorem 5.37. (Helmholtz Decomposition) On a simply connected domain, every
vector field which vanishes fast enough at ∞ can be uniquely written as the sum of a
two vector fields, one divergence-free and one curl-free.
Corollary 5.38. A vector field on a simply connected domain, which vanishes fast
enough at ∞, is determined by its divergence and its curl.
Proof. By the Helmholtz Decomposition, our field F = F_d + F_c where F_d is curl-free and F_c is divergence-free. Then curl(F) = curl(F_c) + curl(F_d) = curl(F_c) and div(F) = div(F_c) + div(F_d) = div(F_d). Hence, by the uniqueness in the Helmholtz Decomposition, F_c is determined by curl(F) and F_d by div(F), so F = F_c + F_d is determined by its curl and divergence. $\square$
For vector fields on a simply connected domain in Rn , there are two versions of
Poincaré’s Lemma. The first says that a curl-free vector field has a potential, hence
is conservative: if curlF = 0 then there exists ϕ such that ∇ϕ = F .
The second statement is:
Theorem 5.39. If div(F ) = 0, then there exists a field A such that curl(A) = F .
For the proof we need:
Lemma 5.40. (Derivative under the Integral) Suppose for U ⊆ R² open that f : U → R is continuous, and that ∂f/∂y exists and is continuous. Define $\varphi(y) = \int_a^b f(x, y)\,dx$. Then
$$\varphi'(y) = \frac{d}{dy}\int_a^b f(x, y)\,dx = \int_a^b \frac{\partial f}{\partial y}(x, y)\,dx.$$

Example 5. Before the proof, we consider some examples.


Remark 5.12. For these examples, recall that the Gaussian function $e^{-x^2}$ is a well-known example for which the antiderivative cannot be found "in closed form". Roughly this means as a finite formula involving other elementary functions (polynomials, trigonometric functions, log and exp); for a precise statement, which makes use of the notion of a differential field, see [Ros68]. (Note that one can however easily give an infinite formula, using Taylor's series.)
Problem: Find ∂ϕ/∂x and ∂ϕ/∂y for
$$\varphi(x, y) = \int_0^x e^{-yt^2}\,dt.$$
The first is easy, as by the Fundamental Theorem of Calculus we have: $\frac{\partial\varphi}{\partial x} = e^{-yx^2}$.
For the second, we apply the Lemma, giving:
$$\frac{\partial\varphi}{\partial y} = \frac{\partial}{\partial y}\int_0^x e^{-yt^2}\,dt = \int_0^x \frac{\partial}{\partial y}e^{-yt^2}\,dt = \int_0^x -t^2 e^{-yt^2}\,dt.$$
Problem: For
$$h(t) = \int_0^{t^2} e^{-tu^2}\,du,$$
calculate h′(t).
Well, this is indeed pretty confusing, as the variable t occurs in two different spots! The trick is to first define a function of two variables
$$\varphi(x, y) = \int_0^x e^{-yu^2}\,du$$
and then compose it with a curve. Now as above $\frac{\partial\varphi}{\partial x} = e^{-yx^2}$, while
$$\frac{\partial\varphi}{\partial y} = \int_0^x -u^2 e^{-yu^2}\,du.$$
Defining the curve γ(t) = (t², t), then h(t) = ϕ(γ(t)). We then apply the Chain Rule:
$$h'(t) = \nabla\varphi|_{\gamma(t)} \cdot \gamma'(t) = \frac{\partial\varphi}{\partial x}\Big|_{\gamma(t)}x'(t) + \frac{\partial\varphi}{\partial y}\Big|_{\gamma(t)}y'(t) = e^{-t\cdot t^4}\cdot 2t + \int_0^{t^2} -u^2 e^{-tu^2}\,du = 2te^{-t^5} - \int_0^{t^2} u^2 e^{-tu^2}\,du.$$
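A numerical check of this formula (a sketch; the evaluation point t = 1.3 is arbitrary), comparing a symmetric finite-difference derivative of h with the Chain Rule answer:

```python
import numpy as np
from scipy.integrate import quad

def h(t):
    return quad(lambda u: np.exp(-t*u**2), 0.0, t**2)[0]

def h_prime(t):
    # 2t*exp(-t^5) - integral_0^{t^2} u^2 exp(-t u^2) du, as derived above
    tail = quad(lambda u: u**2*np.exp(-t*u**2), 0.0, t**2)[0]
    return 2*t*np.exp(-t**5) - tail

t0, eps = 1.3, 1e-6
fd = (h(t0+eps) - h(t0-eps)) / (2*eps)   # finite-difference derivative
print(fd, h_prime(t0))                   # the two values agree closely
```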
Proof. (of Lemma) Our proof follows Apostol p. 448 [?].
We are given that $\varphi(y) = \int_a^b f(x, y)\,dx$ and want to find ϕ′(y). We have:
$$\frac{\varphi(y+h) - \varphi(y)}{h} = \frac{1}{h}\Big(\int_a^b f(x, y+h)\,dx - \int_a^b f(x, y)\,dx\Big) = \int_a^b \frac{1}{h}\big(f(x, y+h) - f(x, y)\big)\,dx.$$
Now by the Mean Value Theorem, for each fixed x there exists $c_{x,h}$ between y and y + h such that
$$f(x, y+h) - f(x, y) = \frac{\partial f}{\partial y}(x, c_{x,h})\cdot h.$$
So
$$\frac{\varphi(y+h) - \varphi(y)}{h} = \int_a^b \frac{1}{h}\,\frac{\partial f}{\partial y}(x, c_{x,h})\cdot h\,dx = \int_a^b \frac{\partial f}{\partial y}(x, c_{x,h})\,dx.$$
But by the (uniform) continuity of the partial derivative, $\frac{\partial f}{\partial y}(x, c_{x,h}) \to \frac{\partial f}{\partial y}(x, y)$ as h → 0. This gives
$$\frac{\varphi(y+h) - \varphi(y)}{h} \to \int_a^b \frac{\partial f}{\partial y}(x, y)\,dx,$$
as claimed. $\square$


Proof. (of Theorem)
We give the proof for the easier case of a star-shaped domain.
Let F = (P, Q, R), with 0 = div F = P_x + Q_y + R_z.
We want to find a field A = (L, M, N) such that curl(A) = F. Now
curl A = (N_y − M_z, L_z − N_x, M_x − L_y).
We look for a solution of the simpler form with L = 0. Then we need
L_z − N_x = −N_x = Q,   M_x − L_y = M_x = R.
Now N_x = −Q, so for any initial point (x₀, y, z) we have
N(x, y, z) = −∫_{x₀}^x Q(t, y, z) dt + c(y, z)
where c(y, z) is constant in x. Similarly, M_x = R gives
M(x, y, z) = ∫_{x₀}^x R(t, y, z) dt + d(y, z)
where d(y, z) is constant in x.
We look for a solution with c(y, z) = 0. We still need P = N_y − M_z; differentiating the previous two equations gives
P = N_y − M_z = −(∂/∂y) ∫_{x₀}^x Q(t, y, z) dt − (∂/∂z) ∫_{x₀}^x R(t, y, z) dt − (∂/∂z) d(y, z).
Now from the Lemma, taking the derivatives inside the integrals, this gives
P = ∫_{x₀}^x ( −(∂/∂y) Q(t, y, z) − (∂/∂z) R(t, y, z) ) dt − (∂/∂z) d(y, z).
Using the fact that div F = 0, we know that −Q_y − R_z = P_x, so this is
P = ∫_{x₀}^x (∂/∂x) P(t, y, z) dt − (∂/∂z) d(y, z) = P(x, y, z) − P(x₀, y, z) − (∂/∂z) d(y, z).
We now have the equation
P(x, y, z) = P(x, y, z) − P(x₀, y, z) − (∂/∂z) d(y, z).
So we will be done if we can find a function d(y, z) satisfying
(∂/∂z) d(y, z) = −P(x₀, y, z).
So we simply define
d(y, z) = −∫_{z₀}^z P(x₀, y, r) dr,
giving the solution, defined up to a constant.
Putting these together, we have shown that given F = (P, Q, R) with div F = 0, then for A = (L, M, N) defined by
L = 0,
N(x, y, z) = −∫_{x₀}^x Q(t, y, z) dt,
M(x, y, z) = ∫_{x₀}^x R(t, y, z) dt − ∫_{z₀}^z P(x₀, y, r) dr,
we have
curl(A) = F. □
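The construction can be checked symbolically on a concrete example. A minimal sketch, assuming SymPy, with a divergence-free field of our own choosing (not from the notes):

    import sympy as sp

    x, y, z, t, r = sp.symbols('x y z t r')

    # A divergence-free field: F = (P, Q, R) = (x, y, -2z), so div F = 1 + 1 - 2 = 0.
    P, Q, R = x, y, -2*z

    # The vector potential A = (L, M, N) from the proof, with x0 = z0 = 0:
    L = sp.Integer(0)
    N = -sp.integrate(Q.subs(x, t), (t, 0, x))
    M = sp.integrate(R.subs(x, t), (t, 0, x)) - sp.integrate(P.subs([(x, 0), (z, r)]), (r, 0, z))

    curl = (sp.diff(N, y) - sp.diff(M, z),
            sp.diff(L, z) - sp.diff(N, x),
            sp.diff(M, x) - sp.diff(L, y))
    print([sp.simplify(c - f) for c, f in zip(curl, (P, Q, R))])   # [0, 0, 0]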

Figure 21. Dual families of hyperbolas: level curves (equipotential curves) for the real and imaginary parts of f(z) = z² = (x + iy)² = (x² − y²) + 2xy i.

Remark 5.13. There is a strangeness in the above proof, as we arbitrarily chose L = 0 and yet somehow found a solution.
This is explained by noting that any solution A above is defined up to addition of a field B with curl(B) = 0. Call the particular solution above A_L. Then if we carry out the above construction assuming instead that M = 0 we get a solution A_M, and if we assume instead N = 0 we get a solution A_N. But then indeed curl(A_L − A_M) = F − F = 0, and similarly curl(A_L − A_N) = 0, curl(A_M − A_N) = 0.

5.14. Analytic functions and harmonic conjugates.


Definition 5.20. A remarkable fact in Complex Analysis is that we have these three equivalent definitions:
(i) A function f : U ⊆ C → C is holomorphic iff it is complex differentiable, i.e. its derivative, given by the usual limit, exists and is a complex number. Thus, f is holomorphic at z ∈ U, with derivative f′(z), iff f′(z) = lim_{h→0} (f(z + h) − f(z))/h exists.
If this derivative is the complex number w ∈ C, then writing w = re^{iθ} for r ≥ 0, since by Euler's formula e^{iθ} = cos θ + i sin θ = c + is, we see that the multiplication map u ↦ w · u is in real coordinates
(u₁, u₂) ↦ r [[c, −s], [s, c]] (u₁, u₂)
(writing the matrix by rows). In other words the function has a very special type of R²-derivative: a dilation composed with a rotation.
(ii) This implies the map is conformal: angles and orientation are preserved infinitesimally. By contrast, an anticonformal map preserves angles but reverses orientation; the simplest example is z ↦ z̄, where for z = a + ib its complex conjugate is z̄ = a − ib. A general antiholomorphic map is given by a holomorphic map preceded or followed by complex conjugation, so the R²-derivative is a dilation composed with a reflection in a line through (0, 0). Note that for both conformal and anticonformal maps, infinitesimal circles are taken to infinitesimal circles (not ellipses, which is the general case).
(iii) A function is (complex) analytic iff it has a power series expansion near a point.
In particular, knowing a function has one continuous complex derivative, i.e. is C¹, implies, very differently from the real case, that it is not only infinitely continuously differentiable (C^∞) but has a power series (is C^ω).
Remark 5.14. Recalling Definition 5.20, if f : C → C is a complex analytic function, with f = u + iv, then this defines a vector field F = (u, v) on R². We note that in this case the field F has a special form:
DF = [[u_x, u_y], [v_x, v_y]] = [[a, −b], [b, a]]
since f is analytic iff it is complex differentiable, meaning that f′(z) is a complex number w = a + ib = re^{iθ}, giving a dilation times a rotation. This proves the Cauchy-Riemann equations u_x = v_y, u_y = −v_x.
Now the line integral ∫_γ F · dγ is closely related to the contour integral of f over γ, written ∫_γ f. The beginnings of the theory are developed in parallel; see e.g. [MH87] p. 95 ff. In particular, the winding number can be defined using a contour integral. Of course this is only a starting point for the deep and beautiful subject of Complex Analysis.
Definition 5.21. A function u : U → R is harmonic iff u is C² and u_xx + u_yy = 0. We define a linear operator ∆, also written as ∇² and called the Laplacian, on the vector space C²(U, R) by ∆(u) = u_xx + u_yy. So u is harmonic iff ∆(u) = 0, iff u is in the kernel of the operator.
The reason for the notation ∇² is that it is notationally suggestive: we can think of it as a dot product, ∇²ϕ = (∇ · ∇)(ϕ) = ∇ · (∇ϕ) = (∂/∂x, ∂/∂y) · (ϕ_x, ϕ_y) = ϕ_xx + ϕ_yy.

Theorem 5.41. For a complex analytic function f : U → C, where U ⊆ C is open, with real and imaginary parts u = Re(f), v = Im(f), so f = u + iv, then thought of as real functions on U ⊆ R²:
(i) these satisfy the Cauchy-Riemann equations u_x = v_y, u_y = −v_x;
(ii) u, v are both harmonic functions;
(iii) their gradient vector fields are orthogonal;
(iv) their families of level curves are orthogonal.
Proof. If f : C → C is a complex analytic function, then by definition the derivative f′(z) = lim_{h→0} (f(z + h) − f(z))/h is a complex number w = a + ib = re^{iθ}. Now multiplication by a complex number defines a linear transformation of C, hence of R²;
Figure 22. Level curves for the real and imaginary parts of f(z) = z²(z − 1)².

Figure 23. Level curves for the real and imaginary parts of f(z) = z³.

since this is a rotation followed by a dilation, this matrix has a special form. Writing f = u + iv, then thought of as a map F of R², this is the vector field F = (u, v), the derivative of which is the matrix
DF = [[u_x, u_y], [v_x, v_y]].
Because we know this is a rotation by θ followed by a dilation by r ≥ 0, this equals
[[a, −b], [b, a]] = r [[cos θ, −sin θ], [sin θ, cos θ]].
This proves the Cauchy-Riemann equations u_x = v_y, u_y = −v_x.
Now u_x = v_y whence u_xx = v_xy, and u_y = −v_x whence u_yy = −v_yx, giving that
u_xx + u_yy = v_xy − v_yx = 0
by the equality of mixed partials, Lemma 4.28.
Similarly, from u_x = v_y we have that u_yx = v_yy, and from u_y = −v_x that u_xy = −v_xx, whence
v_xx + v_yy = u_xy − u_yx = 0.
So both u and v are harmonic.
Recalling the notation that for F = (P, Q) then F* = (Q, −P), take F = ∇v = (v_x, v_y); then
F* = (v_y, −v_x) = (u_x, u_y) = ∇u,
so ∇u = (∇v)* and the two gradient fields are orthogonal at each point.
Lastly, the level curves are perpendicular to the gradient fields ∇u and ∇v, so since these are orthogonal, so are those families of curves.
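These statements are easy to check symbolically on a concrete example; a minimal sketch for f(z) = z², assuming SymPy:

    import sympy as sp

    x, y = sp.symbols('x y', real=True)
    u, v = x**2 - y**2, 2*x*y                   # f(z) = z^2 = u + iv

    # (i) Cauchy-Riemann: u_x = v_y and u_y = -v_x
    print(sp.diff(u, x) - sp.diff(v, y), sp.diff(u, y) + sp.diff(v, x))   # 0 0
    # (ii) both are harmonic
    print(sp.diff(u, x, 2) + sp.diff(u, y, 2), sp.diff(v, x, 2) + sp.diff(v, y, 2))  # 0 0
    # (iii) the gradients are orthogonal
    grad_u = sp.Matrix([sp.diff(u, x), sp.diff(u, y)])
    grad_v = sp.Matrix([sp.diff(v, x), sp.diff(v, y)])
    print(sp.simplify(grad_u.dot(grad_v)))      # 0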

In fact the converse also holds:
Proposition 5.42. If u, v are C² functions which are harmonic, such that the pair (u, v) satisfies the Cauchy-Riemann equations, then f = u + iv is analytic.
Proof. As above, the derivative of F : R² → R² with F = (u, v) is the matrix
DF = [[u_x, u_y], [v_x, v_y]].
The Cauchy-Riemann equations imply that this equals
[[a, −b], [b, a]],
and so the map is given by multiplication by the complex number w = re^{iθ} as above. That the limit exists for DF implies that the limit exists for f′(z) and equals w.
Definition 5.22. If f = u + iv is analytic then u, v are called harmonic conjugates.
Proposition 5.43.
(i) If U is a simply connected domain and u : U → R is harmonic, then there exists v : U → R, unique up to an additive constant, such that (u, v) are harmonic conjugates.
(ii) The ordered pair (u, v) are harmonic conjugates iff the pair (v, −u) are (so order matters here!).
Proof. (i): By the previous proposition it is enough to find v harmonic such that (u, v) satisfies the Cauchy-Riemann equations, so such that v_y = u_x and v_x = −u_y. But this is just like the problem of finding a potential for a curl-zero vector field! Thus, we consider the vector field F = (P, Q) = (−u_y, u_x).
Then curl(F) · k = Q_x − P_y = u_xx + u_yy = 0.
By Theorem 5.18, there is a potential for F; we call this v. Thus ∇v = (v_x, v_y) = (P, Q) = (−u_y, u_x), so v_xx + v_yy = −u_xy + u_yx = 0 by the equality of mixed partials, whence v is a harmonic function such that the pair (u, v) are indeed harmonic conjugates.
In this proof of (i) we have followed Churchill [CB14]. □
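To illustrate the construction with an example: take u(x, y) = x³ − 3xy², which is harmonic since u_xx + u_yy = 6x − 6x = 0. The field of the proof is F = (−u_y, u_x) = (6xy, 3x² − 3y²), and a potential for F is v(x, y) = 3x²y − y³, since ∇v = (6xy, 3x² − 3y²). Indeed u + iv = z³, whose level curves appear in Fig. 23.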
Remark 5.15. Lang [Lan99] and Marsden-Hoffman [MH87] have nice treatments of this. Following Marsden and Hoffman, note that since −if = v − iu is analytic, v and −u are harmonic conjugates (but the order is important!). A second, purely complex analytic, proof of (i) is given in [MH87]. See also Ahlfors [Ahl66].
Fig. 21 shows the harmonic conjugates for the function f(z) = z².
Corollary 5.44. Given a harmonic function u : U → R, where U is a simply connected domain in R², then there exists v : U → R, unique up to an additive constant, which is harmonic and such that (u, v) satisfy the Cauchy-Riemann equations. Also there exists an analytic f on U (thought of as a subset of C), unique up to an additive imaginary constant, such that u = Re(f); indeed f = u + iv. Writing f̃ for the analytic function defined in the same way from the harmonic function v, then f̃ = v − iu has harmonic conjugate pair (v, −u). Furthermore f̃(z) = −if(z).
Harmonic functions are characterized by the important mean value property; for a proof see e.g. [MH87].
Theorem 5.45. A C² function u is harmonic iff its value at each point p is equal to the average of its values on any circle about p (whose closed disk lies in the domain).
Definition 5.23. A flow τ_t on Rⁿ is a gradient flow iff there is a function ϕ : Rⁿ → R such that, for the field F = ∇ϕ, the flow orbits are tangent to this gradient vector field. That is, the orbits γ(t) = τ_t(x) for an initial point x satisfy the differential equation
γ′(t) = F(γ(t)).
We conclude:
Theorem 5.46. Let u be a harmonic function on U ⊆ R², and let v be its harmonic conjugate. Write F = ∇u and F̃ = ∇v; by the Cauchy-Riemann equations, F = (v_y, −v_x) and F̃ = (−u_y, u_x). Then the gradient flow of u is the flow of F, and the gradient flow of v is the flow of F̃. The flow lines of F are the level curves of v, and the flow lines of F̃ are the level curves of u. The orbits of F and F̃ are mutually orthogonal.
Figure 24. Equipotential curves and lines of force for the electrostatic
field of a single charge in the plane. The equipotentials are level curves
for the potential function ϕ and change color as the angle increases from
0 to π and again from π to 2π. This depends on the formula chosen for
ϕ and the “color map” chosen for the graphics. In complex terms, the
complex log function is f (z) = log(z) and for z = reiθ with θ ∈ [0, 2π)
then f (z) = log(reiθ ) = log(r) + log(eiθ ) = log(r) + iθ = u + iv with
harmonic conjugates u(x, y) = log(r) and v(x, y) = θ. We see the level
curves in the Figure; they form a spiral staircase. See Fig. 20.

Example 6. Consider f(z) = z² = u + iv, so u = x² − y² and v = 2xy. The gradient fields are F(w) = Aw and F̃(w) = Ãw where
A = [[2, 0], [0, −2]]
and
Ã = [[0, 2], [2, 0]],
for the potentials u and v respectively.
5.15. Electrostatic and gravitational fields in the plane and in R³. The same geometry (with dual, orthogonal families of level curves) happens for electrostatic fields: one family is the equipotentials (curves or surfaces, depending on the dimension), while the other depicts the lines of force: flow lines tangent to the force vector field. See the Figures.
Figure 25. Equipotential curves and lines of force for the electrostatic
field of two opposite charges in the plane. Colors indicate different
levels of the potential and dual potential, where these are the harmonic
conjugates coming from the associated complex function g(z) = f (z) −
f (z − 1) = log(z) − log(z − 1). These harmonic functions are u(x, y) −
u(x − 1, y) and v(x, y) − v(x − 1, y).

When the opposite charges of Fig. 25 get closer and closer, the behavior approximates that of an electrostatic dipole; see Figs. 26, 30. The charges would cancel out if we placed one on top of the other, but if we take a limit of the fields as the distance d goes to 0, scaling the charges c so that the product dc remains constant, then the limit of the fields (and potentials) exists. Note there is a limiting vector from plus to minus, along the x-axis. The picture is for the case of charges in the plane.
We note here that the pictures are unchanged by this sort of normalization, since:

Lemma 5.47.
(i) If F is a conservative field on Rⁿ with potential function ϕ, then the collection of equipotential curves (or dimension-(n − 1) submanifolds) is the same as for the field aF, a ≠ 0.
(ii) If γ is a line of force for F, then γ is orthogonal to each equipotential submanifold.

Proof. (i) We have ∇ϕ = F iff ∇(aϕ) = aF, and the level curve of level c corresponds to that of level ac.
(ii) A line of force for F is a curve γ with the property that γ′(t) = F(γ(t)), i.e. γ is everywhere tangent to the field (it is an orbit of the flow for the ODE). If η is a curve lying in an equipotential, so that ϕ(η(t)) = c, then differentiating gives, by the Chain Rule, 0 = (ϕ ∘ η)′(t) = ∇ϕ(η(t)) · η′(t) = F(η(t)) · η′(t). Thus the field, and hence each line of force, is orthogonal to the equipotentials.

Figure 26. Equipotential curves and lines of force for the electrostatic
field of two unlike charges, now closer together.

That the pictures converge (of both the equipotentials and field lines) looks clear from the figures, but to have the fields and potentials converge we need this normalization.
The potential function shown is
u(x, y) = (1/d) log( ((x + d)² + y²) / ((x − d)² + y²) )
for d = 1, .5, .05.
Dipoles (both electric and magnetic) are useful in applications to electrical engineering and are intriguing mathematically.
We mention that the geometry of fields in two-dimensional space has practical relevance: for example, the magnetic field generated by electric current passing through a wire (in the form of a line) decreases like 1/r, as we can think of the field as being in the plane perpendicular to the wire. For fascinating related material see the Wikipedia article on Ampère's circuital law.
Experiments show that the force between two charged particles with charges q₁, q₂ ∈ R, with position difference given by a vector v ∈ R³, is
(q₁ q₂ / r²) · (v / ||v||), where r = ||v||
(so it is positive, hence repulsive, if the charges have the same sign).
An intuitive explanation for the factor of 1/r² is this: suppose we have a light bulb at the origin and we want to calculate the light density at distance r; the light consists of photons, and the number emitted per second is the same as the number that pass
Figure 27. Equipotential curves for the electrostatic field of a planar dipole: two unlike charges close together.

through a sphere of radius r; dividing by the area 4πr² of the sphere gives a density proportional to 1/r². Another way to say this is that we are counting the number of field lines per unit area. Both the electrostatic field of a single charge and gravity (which is simpler, as there is no negative gravity) are mediated by radiating particles and so should decrease in the same way.
We claim that the attractive potential ϕ of a single charge in R³ is
ϕ = 1/r = (x² + y² + z²)^{−1/2}.
Since the force field is then F = ∇ϕ, we have F = (P, Q, R) where
P = −x / (x² + y² + z²)^{3/2}
and similarly for Q, R. The field strength at (x, y, z) is then
||F(x, y, z)|| = ||(x, y, z)|| / ||(x, y, z)||³ = 1/r²,
as we wanted.
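This claim is easy to verify symbolically; a minimal sketch, assuming SymPy:

    import sympy as sp

    x, y, z = sp.symbols('x y z', real=True, positive=True)
    r = sp.sqrt(x**2 + y**2 + z**2)
    F = [sp.diff(1/r, w) for w in (x, y, z)]             # F = grad(1/r)
    norm_F = sp.simplify(sp.sqrt(sum(c**2 for c in F)))
    print(norm_F)                                         # 1/(x**2 + y**2 + z**2), i.e. 1/r^2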
We are thinking of a single large charge being tested by a small charge; we are not
yet calculating the resulting field of two equal charges (or the gravitational field of
two equal mass objects).
In two dimensions, the math is very different, as the field strength now should be
proportional to 1/r as it is inversely proportional to the circumference of a circle,
2πr.
Figure 28. Equipotential curves and lines of force for the electrostatic field of a planar dipole: two unlike charges very close together. The potential is (1/d) log(((x + d)² + y²)/((x − d)² + y²)) for d = 0.5.

Thus in R², for a single unit charge particle at the origin, we claim that the potential is
ϕ(x, y) = (1/2) log(x² + y²),
for then the force field is
F = (P, Q) = ∇ϕ = ( x/(x² + y²), y/(x² + y²) ),
which has norm
||F|| = ||(x, y)|| / ||(x, y)||² = 1/r,
as we wished.
The dual field is
F* = (−Q, P) = ( −y/(x² + y²), x/(x² + y²) ),
which as we have seen in §5.4 has as potential the angle function Θ of Fig. 20, given by ψ(x, y) = arctan(y/x) or ψ(x, y) = arccot(x/y) depending on the location (since R² \ {0} is not simply connected).
The corresponding analytic function is
f(z) = log(z)
Figure 29. Equipotential curves and lines of force for the electrostatic
field of an approximate planar dipole: two unlike charges close together.

Figure 30. Equipotential curves and lines of force for the electrostatic
field of an approximate planar dipole: two unlike charges close together.
Figure 31. Equipotential curves and lines of force for the electrostatic
field of two like charges in the plane. Since for one charge at 0 the
associated complex function is f (z) = log(z) = u + iv, here it is g(z) =
f (z) + f (z − 1) = log(z) + log(z − 1). The equipotentials and field lines
are respectively the level curves for the harmonic conjugates u(x, y) +
u(x − 1, y) and v(x, y) + v(x − 1, y).

and for z = re^{iθ} with θ ∈ [0, 2π), then f(z) = log(re^{iθ}) = log(r) + log(e^{iθ}) = log(r) + iθ = u + iv, giving the harmonic conjugates u(x, y) = log(r) and v(x, y) = θ, whose level curves we see in Fig. 24.
This is the case of a single charge. In fact, when combining objects all we have to
do is add the two potentials, ϕ = ϕ1 + ϕ2 , and then the gradient will give the field.
See Figs. 25, 31 for the cases of two oppositely, and equally, charged particles.
That we sum the potentials means in two dimensions that we sum the associated
complex functions as well; for opposite charges we change one of the signs.
In these figures, we have depicted two sets of curves: the level curves of the potential ϕ (the equipotentials), and the flow lines of the gradient field F = ∇ϕ (the lines of force).
We can formulate this as a theorem; compare to Theorem 5.41 regarding analytic functions:
Theorem 5.48. For an electrostatic field F = (P, Q) on a charge-free region of the plane, the pair (P, −Q) are harmonic conjugates, whence
(i) their gradient vector fields are orthogonal;
(ii) their families of level curves are orthogonal.
Further, the potential ϕ and the dual potential ψ are (perhaps integral) linear combinations of the log and argument (angle) functions on R². The corresponding analytic functions are (integral) linear combinations of the complex log function.
Proof. For a finite combination of point charges at points pᵢ ∈ R² with charges qᵢ ∈ R, the associated analytic function on C is f(z) = Σᵢ qᵢ log(z − zᵢ), where pᵢ = (xᵢ, yᵢ) corresponds to zᵢ = xᵢ + iyᵢ.
Figure 32. Equipotential curves and lines of force for the electrostatic
field of two like charges in the plane. Close to the center, (1/2, 0), the
potential and its dual start to approximate the dual hyperbolas of
Fig. 21.

For a charge density given by a Riemann integrable real-valued function q, the associated analytic function is the vector-valued integral version of this (see §??):
f(z) = ∫_{R²} q(w) log(z − w) dx dy.
(The more general measure version of this also holds.) □
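Pictures in the style of Figs. 25 and 31 can be generated directly from this recipe. A minimal plotting sketch, assuming NumPy and Matplotlib (the grid and charge positions are our choices; the branch cut of the principal log produces spurious jump curves in the imaginary part):

    import numpy as np
    import matplotlib.pyplot as plt

    X, Y = np.meshgrid(np.linspace(-1, 2, 400), np.linspace(-1.5, 1.5, 400))
    Z = X + 1j * Y
    g = np.log(Z) - np.log(Z - 1)            # two opposite charges, at 0 and 1
    plt.contour(X, Y, g.real, levels=30)     # equipotentials: level curves of u
    plt.contour(X, Y, g.imag, levels=30)     # lines of force: level curves of v
    plt.gca().set_aspect('equal')
    plt.show()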




At first we may think that a potential such as the hyperbolas shown in Fig. 21 cannot come from an electrostatic field. However, as Feynman Vol. II §7.3 [FLS64] points out, it can (in the limit): the field near the exact middle of the two like charges of Fig. 31 looks just like this. See Figs. 32, 33.
To prove this rigorously, take instead the charges at −1, 1, so now f(z) = log(z + 1) + log(z − 1). The Taylor expansion of this about the middle point 0 is, up to an additive constant, −z² + ..., and g(z) = z² is indeed the analytic function of Fig. 21. g determines harmonic conjugates which define the linear vector fields depicted there.

Theorem 5.49. For gravitational fields in R2 , we have the same statement as The-
orem 5.48 except that now only positive values of the density function q can occur.
Figure 33. Equipotential curves and lines of force for the electrostatic
field of two like charges in the plane. Close to the center, (1/2, 0), the
potential and its dual start to approximate the dual hyperbolas of
Fig. 21.

In fact, according to Feynman, any harmonic function and hence any complex analytic function can occur for a physical electrostatic field in R². One can prove this as follows.
From the mathematical point of view, there are two equivalent ways to characterize an electrostatic field (in R² or in R³). The first is that the potential of the field is a solution of Poisson's equation,
∇²(ϕ) = ρ,
where ρ is a signed measure describing the distribution of charge. From this point of view one can then go about solving this linear partial differential equation. The second is to describe the fundamental solution, which is the single point charge, with its associated (gradient) field, and then define an electric field to be a (vector-valued integral) linear combination of such fundamental solutions, integrated with respect to the charge density.
From the first point of view what is fundamental is the PDE, from the second what
is most basic is the fundamental solution (this is Coulomb’s law!). What bridges the
two is the superposition principle, which simply says the space of solutions is a vector
space: we can take linear combinations.
In other words, for this linear equation knowing the fundamental solution characterizes the infinite-dimensional vector space of all solutions. And conversely, one of the methods for solving the PDE is to find its fundamental solution.
(For gravity the solution space is not all of the vector space but rather the positive
cone inside of it).
Now ∇ϕ = F is the field, so Poisson's equation states that
∇²(ϕ) = div(F) = ρ.
Thus from the field or the potential we can determine the charge distribution. Applying the operator ∇ is a type of derivative; the opposite procedure is a type of integration. Thus given the charge density ρ we find the field by solving the (partial) differential equation div F = ρ, and given the field we find the potential by solving the PDE
∇ϕ = F.
Combining these, given ρ we can find ϕ by solving the PDE
∇²ϕ = ρ,
which is now a second order PDE as it involves second order partials.
The general operation of solving a DE is referred to as integration. As always, differentiation is automatic, while integration can be hard! Mathematically speaking, the first task is to prove that under certain circumstances a solution exists, and conversely to try to identify any obstructions to having a solution. Such obstructions are often especially interesting because they are topological; e.g. the equation ∇ϕ = F for the angle field F = (−y, x)/(x² + y²) only has a solution on a simply connected U ⊆ R² \ {0}.
If there is no charge in a region U, then from Poisson's equation
∇²(ϕ) = ρ = 0
and the potential function ϕ is harmonic. Thus for Figs. 25, 31, the charge density is 0 everywhere except exactly at those two points. At those points themselves the potential is infinite and the field is not only infinite but points in all directions, so neither is defined there. When we have a continuous charge density, however, these are defined everywhere. In that case, by Poisson's equation the potential is not harmonic where ∇²(ϕ) = ρ ≠ 0. So when the charge density is continuous but nonzero, the field and potential make perfect sense mathematically, being continuous functions, but the potential is no longer a harmonic function; it certainly cannot (in R²) have a harmonic conjugate and does not extend to a complex analytic function. Hence the tools of Complex Analysis are not as applicable. Nevertheless, there is still a dual potential, whose level sets are orthogonal to those of ϕ, similar to the harmonic case.
To prove this (I believe, and would like to work this out!), we can again refer to the fundamental solution; since it holds there, it must extend to all densities ρ.
But what “is” a point charge? From the mathematical point of view it is a point
mass, simply a measure concentrated at a point. In physics this is called a Dirac
delta function, which is the viewpoint of Riemann-Stieltjes integration. From the
standpoint of Lebesgue integration, it is a measure and not a function at all.
Then we know how to rigorously treat two cases: point masses and continuous
densities. Similarly, one can include densities given by any other Borel measures.
I say “density” rather than “distribution” here because that word will immediately
get used in a very different way! That is the yet more sophisticated viewpoint of
Laurent Schwartz’ theory of distributions, see e.g. [Rud73]. Roughly speaking a
Schwartz distribution is a continuous linear functional defined on a carefully chosen
space of test functions which are smooth and rapidly decreasing. This enables one to
define derivatives, by duality. Thus the advantage of Schwartz distributions is that
they can be differentiated and also can be convolved. Thus if one finds a fundamental
solution to be a Schwartz distribution, the general solution is found by convolving
this over the density. This is exactly what we have described above.
For the simplest case of the fields described above we can get away with point
masses, but for more sophisticated examples we really do need Schwartz distributions.
This is the case when we consider dipoles, but that is beyond the present scope.
For a clear overview of the physics, see the beginning of Jackson’s text [Jac99];
this however goes quickly into much (much) deeper material, including boundary
values, dipoles, Green’s functions, and magnetism, dynamics and the connections
with Special Relativity.
For a remarkable mathematical treatment see Arnold’s book on PDEs: [Arn04].
Now to sketch a proof of Feynman’s claim, given a harmonic function, we define a
field to be the gradient of this potential. Given a field, we find such a potential. ...
Finding a potential
Next we see (by working out some examples) how to find the potential of a conservative vector field.
We know that given f : Rⁿ → R, the gradient vector field ∇f is orthogonal to the level hypersurfaces (submanifolds of dimension n − 1) of f, so level surfaces in R³ and level curves in R².
We know that a vector field is conservative iff it is the gradient of a function. There are two ways that this can fail to be the case: locally or globally.
We want to first examine the local problem: when is a vector field locally conservative?
Switching equivalently to the language of differential forms, the vector field V is conservative iff the associated 1-form η is exact. We know that a necessary condition for this to occur is that the form be closed, i.e. that d(η) = 0. In R² or R³ this is the same as curl = 0.
Poincaré’s Lemma tells us that locally, the converse holds: any closed form is
exact. A basic counterexample for the global exactness is the angle function on the
plane: there is a local potential (the infinite spiral staircase) but this is a multivalued
function, so not a potential in the usual sense.
Here is a method to try to find a potential for any vector field in the plane. Given a nowhere-0 vector field V, we want to find a potential ϕ, that is, a function ϕ : R² → R such that ∇ϕ = V. In this case, its level curves are orthogonal to V. So, let us consider the orthogonal vector field W to V, say at angle +π/2. Then, using the Fundamental Theorem of ODEs, draw the integral curves. These are unique, hence do not intersect. Globally they might, say, be spirals; when the family is well enough behaved, we can define a function ϕ with different values on each curve. Thus, ϕ is a candidate for a potential.
We can see an example in the illustrations of the electrostatic potentials, Figs. 31 and 25.
There are two families of curves: the equipotentials and the lines of force.
The lines of force are tangent to the gradient vector field. For opposite charges,
we can picture the gradient flow as flowing from the positive to the negative charge.
In fact, we can interpret this as a gravitational field, with a mountain at the positive
and a valley at the negative charge. For like charges, we can picture two mountains.
It is important to remember that there are two quite different interpretations, as
force fields or as velocity fields. The gradient flow refers to the velocity field, and a
particle moves along the curve with that tangent vector. For the force field interpre-
tation, the particle accelerates and may go off the curve because of the acceleration
due to the curvature.
In any case, we can try to imagine switching roles, so the equipotential curves
become the orbits of a gradient flow and vice-versa.
If this works, we will have succeeded in constructing a potential for our vector field.
....
5.16. The role of differential forms. (To do.)

6. Ordinary differential equations


6.1. The classical one-dimensional case. Introduction:
Our notes for this section are based in part on course notes by Marina Talet, Université Aix-Marseille, 2021.
Consider an integration exercise from Calculus such as: given f(x) = 1/x, defined on R \ {0}, find
F(x) = ∫ f(x) dx,
with the solution
F(x) = log|x| + c.
The function F has several names: the antiderivative, integral, primitive or indefinite integral of f, and the operation of finding F is termed integration of the function f.
We can rewrite this as: find y = y(x) where
y′ = f,
which can be considered as the simplest type of differential equation. If we specify that y(1) = 0, this initial condition fixes the solution on the interval (0, +∞), as then c = 0.
The explanation for the integral being indefinite, in that it is defined only up to a constant c, is that the derivative map D : C^{k+1} → C^k is a linear transformation with one-dimensional kernel, namely the constant functions.
The general concept of integrating or finding a primitive goes far beyond this.
We find further examples in what follows.
Exponential growth.
Let us recall the two main rules for exponents:
(i) a^{b+c} = a^b a^c
and
(ii) a^{bc} = (a^b)^c.
An exponential function is of the form f(x) = a^x for a > 0 and x ∈ R. The number a is termed the base and x the exponent. By the above rules, 1/a = a^{−1} and (a^{1/2})² = a, whence √a = a^{1/2}. This makes it easy to define a^x for rational exponents x.

Thus exponentiation turns addition into multiplication, and multiplication into taking powers. Mathematically speaking, the fact that a^{x+y} = a^x a^y and a⁰ = 1 tells us that, writing Φ_a(x) = a^x, the map Φ_a is an isomorphism from the additive group of real numbers (R, +) to the multiplicative group of positive real numbers (R_{>}, ·), where R_{>} ≡ (0, +∞).
We write ln_a(x) for the inverse function of base a; that is, for f(x) = a^x and g = ln_a, then f ∘ g(x) = x and g ∘ f(x) = x wherever these are defined: the first for x > 0 and the second for all x ∈ R.
This function does the opposite of the exponential: it maps R_{>} ≡ (0, +∞) to (R, +) and converts multiplication to addition and powers to products, thus
ln_a(xy) = ln_a(x) + ln_a(y)
and
ln_a(x^y) = y ln_a(x).

The most practical base in many applications is 10 or 2, but in pure mathematics by far the most important base is the irrational number e = 2.71828... The reason is that the formula for the derivative is much simpler for base e, as we shall see shortly. But first, to define e^x for real, non-rational exponents, we note that there are several approaches one can take.

(1) First, we can use continuity: it can be proved that there is a unique continuous way to extend this function to the reals. That is, we can approximate x by rational numbers and take the limit.

We can also give more explicit definitions:

(2) From the Fundamental Theorem of Ordinary Differential Equations, there exists a unique solution to the differential equation (or DE) y′ = y satisfying y(0) = 1 (this is called an initial condition for the DE). We define this function to be exp(t) = y(t), and then define the number e to be exp(1). Equivalently, e is the unique base for which the slope of a^x at x = 0 equals 1.
(3) We define e^x to be the function with the following series expansion:
exp(x) = 1 + x + x²/2 + x³/6 + ··· + xⁿ/n! + ...   (26)
where the factorial is defined by 0! = 1, k! = 1 · 2 · ... · k.
This expresses e^x as a limit of rational numbers. In particular, the number e = e¹ can be approximated as a decimal using this series.

We need:
Convergence of the Taylor series for the exponential function.
Theorem 6.1. The series for e^x in (26) converges for all x ∈ R. The corresponding series where x is replaced by a complex number z ∈ C converges, as does the series where x is replaced by a square matrix.
Proof. For x fixed, let m > 2|x|, so |x|/m < 1/2. Then for any n > 0,
|x|^{n+m} / (n + m)! ≤ (|x|^m / m!) · (1/2)ⁿ,
which is dominated by a convergent geometric series. Thus (for x > 0) the sequence of partial sums is an increasing bounded sequence, hence converges by the completeness property of the real numbers; for general x the same estimate gives absolute convergence.
Similarly, using the fact that |zw| = |z||w|, the series
exp(z) = 1 + z + z²/2 + z³/6 + ··· + zⁿ/n! + ...
converges for all z ∈ C.
For square matrices we need the following: we define a norm on the linear space L(V, W) of all linear operators from one normed space V to another W by ||A|| = sup_{v≠0} ||Av||/||v|| = sup_{||v||=1} ||Av||. This is called the operator norm. One of its main useful properties is that it (clearly) behaves nicely under composition: ||AB|| ≤ ||A|| ||B||. This is called submultiplicativity.
In particular this holds for square matrices A, B. Using this, just as for complex numbers, the series
exp(A) = I + A + A²/2 + A³/6 + ··· + Aⁿ/n! + ...
converges for all A ∈ M_{d×d}, the collection of square matrices (with entries in R or C).
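Numerically, the partial sums of the matrix series converge very quickly. A sketch, assuming NumPy and SciPy (whose expm computes the matrix exponential by other means):

    import numpy as np
    from scipy.linalg import expm

    A = np.array([[0.0, -1.0], [1.0, 0.0]])
    S, term = np.eye(2), np.eye(2)
    for k in range(1, 20):            # partial sums of I + A + A^2/2! + ...
        term = term @ A / k
        S += term
    print(np.max(np.abs(S - expm(A))))   # ~1e-16: 20 terms already suffice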


Note that the derivative of the series for f(x) = e^x, taken term-by-term, does satisfy the DE y′ = y, y(0) = 1, so (3) yields (2). Conversely, knowing the derivative from (2) gives (3) as the Taylor series: recall that for an infinitely differentiable function f the Taylor series about 0 is
Σ_{k=0}^∞ f⁽ᵏ⁾(0) xᵏ / k!.

We let ln(x) denote the natural logarithm, the inverse function g(x) of f(x) = e^x = exp(x), so ln = ln_e.
Another way to define ln is via integration: for x > 0,
ln(x) = ∫_1^x (1/t) dt.
Another possible definition for ln is via its Taylor series, calculated around the value x = 1 (in other words, find the series for ln(x + 1) around 0).

This leads to a third definition of exp:

(4) First we define ln, in one of these ways; then exp is defined to be its inverse function.

Next we define, for any base a > 0:
a^x = e^{(ln a)x}.
Hence the derivative is (a^x)′ = ln(a) a^x. Note that indeed (e^x)′ = e^x, and the number e is the only base such that a^x is its own derivative.
We then denote by ln_a its inverse function.

Exponential growth and doubling times; exponential decay and half-life.
Suppose a quantity f(n) doubles every day, starting at 1 at time n = 0. Then we have f(n) = 2ⁿ. (You should draw the graph, for say n = −3, ..., 3.) Here the doubling time is 1.
When we first see this equation, we naturally wonder why not use base 2 (or perhaps 10!) instead of the irrational number e = 2.71828.... The reason is that base e has the simplest expressions for the derivative, hence also for the series. In fact, for both calculations and theory, for exponentials and also for logs, it is generally easier to first change to base e.
Nevertheless the concept of doubling time is intuitively very useful.
For f(t) = a^t with a < 1 we call this exponential decay, for example the decay of radioactivity of a substance.
When we know the doubling time of, for instance, a pandemic or a bank account, we can easily make rough estimates in our heads, and similarly for the half-life governing exponential decay of a radioactive substance.
These can vary considerably, ranging from 4.4 billion years for Uranium-238 to 10⁻²⁴ seconds for Hydrogen-7. Plutonium-239 has a half-life of 24,110 years, indicating its danger when in radioactive waste, while Carbon-14, which is so useful in the radiocarbon dating process used by archeologists, has a half-life of 5,730 years.
Exercises:
(1) Show that ln_a(x) = ln(x)/ln(a).
(2) Find the Taylor series for ln(x + 1) about x = 0.
(3) Prove that the series for e^x converges for all x ∈ R.
(4) Find a formula for the doubling time t_d for f(t) = e^{at}, for a > 0.
(5) Find the half-life t_h for f(t) = e^{at}, for a < 0.
Solving the equation y′ = ay, a ∈ R.
Exponential growth y(t) = A^t for A > 1 grows at a rate proportional to the quantity at time t. Thus for example for A = 2, 2^{n+1} − 2ⁿ = 2ⁿ(2 − 1) = 2ⁿ, while for a bank account growing at 10% per year, A = 1.10 and y(n + 1) − y(n) = A^{n+1} − Aⁿ = Aⁿ(A − 1) = c · y(n) for the constant c = A − 1.
As noted, the equation y′ = ay includes both exponential growth and decay, and also the constant case y′ = 0 (for a = 0).
We simplify to the special case a = 1 and recall that y(t) = e^t solves this. Then we see that y(t) = Ke^t for K ∈ R also works. Lastly we note that y(t) = Ke^{at} will provide a solution of the DE y′ = ay. This is valid for any a, K ∈ R.
But are these all possible solutions? To answer this let v(t) = e^{at} and suppose that u is another solution, so u′ = au. Now since v(t) = e^{at}, v⁻¹ = e^{−at}, whence (v⁻¹)′ = −av⁻¹.
We guess that u = Kv is the only possibility, thus that u/v will be a constant; equivalently, its derivative is 0. We compute:
(u/v)′ = (u · v⁻¹)′ = u′ v⁻¹ + u (v⁻¹)′ = auv⁻¹ − auv⁻¹ = 0
as we guessed, so u/v = K and
u = Kv = Ke^{at}.
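A computer algebra system reproduces this; a sketch assuming SymPy:

    import sympy as sp

    t, a = sp.symbols('t a')
    y = sp.Function('y')
    print(sp.dsolve(sp.Eq(y(t).diff(t), a * y(t))))   # Eq(y(t), C1*exp(a*t))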

1.b. Analytic definition of a differential equation.
It is time for a definition! Basically, a differential equation (denoted DE) is a relation between an unknown function y (to be determined) and a certain number of its derivatives. More precisely:
Definition 6.1. A differential equation in one dimension is an equation of the form
(∗) F(t, y(t), y′(t), ..., y⁽ⁿ⁾(t)) = 0,
where F is a given real-valued continuous function of n + 2 variables (so at least one derivative is involved!) and t ↦ y(t) is an unknown function that we are trying to find. We denote by y⁽ⁿ⁾ its derivative of order n, a strictly positive integer.
The above DE (∗) is then said to be of order n.
It is a linear DE iff the function F : R^{n+2} → R is affine (unfortunately, this is the standard terminology!). Thus the equation y′ = −y + c is called linear.
It is an autonomous or stationary DE if the variable t does not appear explicitly in F, so F is of the form
(∗) F(y, y′, ..., y⁽ⁿ⁾) = 0.
Figure 34. Slope field and solution curves for exponential growth y′ = cy. The equation is in one dimension, and its flow is along the real line; these curves are the graphs of those solutions, so including the time variable. This can also be viewed as solutions to a vector ODE in the plane, where the curves are tangent to the vector field V(x, y) = (1, y). These solution curves are γ(t) = (t, y(t)), so γ′(t) = (1, y′(t)) = V(γ(t)) = (1, y(t)). The difference between a slope field and a vector field is this: segments in the slope field are parallel to the vector field but meet the curves at their midpoints. The picture of the slope field is often easier to understand, as it is much less cluttered, since the segments are all of the same manageable length.

Otherwise it is a non-autonomous or non-stationary or time-varying DE.
The DE is said to be explicit of order n, or in normal form, if it can be solved for the highest-order derivative, in other words if it can be written in the form
(∗) y⁽ⁿ⁾(t) = F(t, y(t), y′(t), ..., y⁽ⁿ⁻¹⁾(t)),
where now F is a given continuous function of n + 1 variables. Otherwise it is an implicit DE.
To motivate this terminology, recall that in Calculus the equation x² + y² = 1 is equivalent to the four equations y = ±√(1 − x²), x = ±√(1 − y²), and we can say that the first equation is an implicit equation in that it “implies” the other “explicit” equations, where we have solved for one variable as a function of the other. For Rⁿ, the Implicit Function Theorem can help us determine when this can be done. In much the same way, we can have an implicit DE, for example (y′)² + y² = 1, which implies the explicit equations y′ = ±√(1 − y²).
Here are some examples:
(1) For a : R → R continuous, y′ = a(t) is an explicit equation, nonautonomous unless a(t) = a is constant. The solution is just the antiderivative of a(t).
(2) F(r, s) = s, so y′ = F(t, y) = y: y′ = y. This is the autonomous, linear, first order equation we encountered above.
(3) For a, b ∈ R, F(r, s) = a · s + b, so y′ = ay + b. This is an autonomous linear DE.
(4) For a, b : R → R continuous, F(r, s) = a(t) · s + b(t), so y′ = a(t)y + b(t). This is a linear first-order nonautonomous DE.
More examples:
y′(t) + y(t) = 1: a DE of first order.
sin(y′(t)) + y³(t) = t: an implicit DE of first order.
y⁽⁷⁾(t) + y⁹(t) = t + sin(5t): a DE of order 7.
y(t) + y²(t) = t is not a DE (as there is no derivative involved!).

A solution of (∗) over an interval I of R is a function t ↦ y(t) which is n times differentiable on I and which satisfies (∗).
The interval I on which we solve a DE is very important, as changing the interval may allow for different solutions.
To solve (∗) means to find all the solutions of (∗).
There can be zero, one, several or an infinite number of solutions.

Examples:
- The DE y′(t) = 0 admits an infinite number of solutions. Indeed, y(t) = c for any c ∈ R is a solution.
- The implicit DE (y′)²(t) + 1 = 0 does not admit any real solution.

6.2. Flows, systems of DEs and vector differential equations. At this point it will actually be better to consider an apparently more difficult problem: that of DEs in higher dimensions, or vector differential equations.
Definition 6.2. Given a vector field V on Rⁿ, a vector differential equation of first order with initial condition x ∈ Rⁿ is:
γ′(t) = V(γ(t)); γ(0) = x.
This is equivalent to a system of n first order DEs. For example, taking n = 2, so V = (P, Q), then for γ = (x, y),
(x′, y′) = (P(x, y), Q(x, y)),
or equivalently
x′ = P(x, y),  y′ = Q(x, y).   (27)
In §§?? and 5.4 we have already given an introduction to this topic.
We summarize what was said there.

Definition 6.3. Given a set X and a function τ : X × R → X, note that fixing t ∈ R


then τt (x) = τ (x, t) defines a map τt : X → X. Thus {τt }t∈R is a collection of maps
on X.
We say τ defines a flow on X iff
(i) τ0 is the identity map and
(ii) τt satisfies the flow property

τt+s = τs ◦ τt .

A flow is also known as a one-parameter group of transformations.


We think of the variable t as time; then τt is called the time-t map of the flow.
The orbit of a point x ∈ X is {τt (x) : t ∈ R}.
A flow is a continuous-time dynamical system as it describes the time evolution of
a point x ∈ X as it moves along its orbit. The future orbit is the collection of points
τt (x) for t > 0, the past orbit for t < 0, and the present moment is τ0 (x) = x, where
we are right now. It is assumed that the system has only one past and future; there
is no randomness here. In other words, two orbits are either disjoint or identical.
(This is the case for any physical system except possibly quantum mechanics, for
example one obeying Newton’s laws; however, even in such a “predictable” system,
randomness can enter in a different way because of complicated dynamics: chaos or
sensitive dependence on initial conditions.)
If X is a vector space V , then τt is a linear flow iff each map τt is linear. Note that
by the flow property plus the fact that τ0 is the identity, τt is bijective as its inverse
is τ−t . Thus each τt is a linear isomorphism of V .

Example 7. (Rotation Flow 1)
Consider the group of rotations of the plane, setting a = cos(2πt), b = sin(2πt) and
R_t = [[a, −b], [b, a]],
and defining G = {R_t : t ∈ R} ≅ T¹ = R/Z.
Noting that R_{t+s} = R_s ∘ R_t = R_t ∘ R_s, this is a flow.
The next result will show that these maps are of the form e^{tA}, for a certain matrix A, making the connection with Lie algebras. As we shall see, in fact all linear flows arise in this way.
Proposition 6.2. (Rotation flow) For A = [[0, −1], [1, 0]], then
e^{tA} = R_t = [[cos t, −sin t], [sin t, cos t]].
 
Proof. (First proof) For A = [[0, −1], [1, 0]] we note that the powers of A have period 4, with (A⁰, A¹, A², A³, ...) = (I, A, −I, −A, ...). We separate the Taylor series into
Figure 35. Some discrete and continuous-time orbits of the rotation flow of Proposition 6.2.

even and odd terms. Writing c = cos t and s = sin t, this gives:
exp(tA) = Σ_{k=0}^∞ (tA)ᵏ/k!
= I + tA + (tA)²/2 + (tA)³/3! + (tA)⁴/4! + ···
= (I + (tA)²/2 + (tA)⁴/4! + ...) + (tA + (tA)³/3! + (tA)⁵/5! + ...)   (28)
= I(1 − t²/2 + t⁴/4! − t⁶/6! + ...) + A(t − t³/3! + t⁵/5! − ...)
= [[c, 0], [0, c]] + A[[s, 0], [0, s]] = [[c, 0], [0, c]] + [[0, −s], [s, 0]] = [[c, −s], [s, c]],
as claimed. □
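The identity e^{tA} = R_t is easy to confirm numerically; a sketch assuming NumPy and SciPy:

    import numpy as np
    from scipy.linalg import expm

    t = 0.7
    A = np.array([[0.0, -1.0], [1.0, 0.0]])
    Rt = np.array([[np.cos(t), -np.sin(t)],
                   [np.sin(t),  np.cos(t)]])
    print(np.allclose(expm(t * A), Rt))   # True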

Exercise 6.1. (Harmonic oscillator)
Perhaps the second most important example is that of the harmonic oscillator, with the DE
y″ = −y.
The idea is that a spring is attached to a wall and the other end to an object; when this is pulled out to a distance y the force felt is approximately −cy, where c > 0 is a constant called the spring constant. By Newton's Law F = ma, this gives (taking m = 1 and, for simplicity, c = 1) the above equation. The reason for the − sign is that the object is being pulled back toward its rest position 0: in the negative direction if y > 0, in the positive direction if y < 0, changing as it oscillates.
It is clear that y(t) = sin t and cos t are solutions.
To find all the solutions, we show how this second-order equation in one dimension can be rewritten as a system of two one-dimensional first-order equations, and equivalently as a vector DE in R², given by a linear vector field. This is always possible: a single higher-order DE in one dimension can always be rewritten as an equivalent
order-one vector DE (also in the nonautonomous case, by adding one more dimension). Geometrically, this means we have a velocity vector field, with an integral curve exactly corresponding to a solution!
To carry this out in this case, we set w₁ = −y, w₂ = y′. We thus have the pair of equations w₂′ = y″ = −y = w₁, w₁′ = −y′ = −w₂, giving the system
w₁′ = −w₂,  w₂′ = w₁.
This can be written in matrix form, where w = (w₁, w₂), as
w′ = Aw   (29)
so
(w₁′, w₂′) = [[0, −1], [1, 0]] (w₁, w₂).
Definition 6.4. Equation (29), w′ = Aw, where A is an (n × n) matrix and w = w(t) ∈ Rⁿ, is called a vector DE. Note that A defines a linear vector field on Rⁿ, and w(t) is a curve in Rⁿ; it is a curve which is tangent to the vector field, as its tangent vector at each point is given by the vector field. If the starting point of the curve w(0) = w₀ is specified, then we have a vector DE with this initial condition.
Remark 6.1. Now A defines a linear vector field on R². Since the variable for time does not occur here, we had for y″ = −y an autonomous second-order equation, and now we have an autonomous vector DE, equivalently an autonomous system of two first-order equations. Physically, the variables (w₁, w₂) = (−y, y′) represent, for the oscillator, minus the position, and the velocity (or momentum, since momentum = mv and m = 1). The vector solution w(t) = (cos t, sin t) gives the one-dimensional solution y(t) = −cos t for the original equation; the graph of the curve w(t) is the helix (t, cos t, sin t), which projects to w₁(t) = cos t and to the velocity w₂(t) = y′(t) = sin t.
In Physics, we often pass to (position, momentum) space, which is called phase space.
Exercise 6.2. Solve the second order linear equation y″ = −y by the following strategy: we define y′ = x and x′ = −y, giving a system of two equations of first order. Then we rewrite the system in vector form x′ = Ax, for x = (x, y) as above. Now solve this vector DE explicitly for initial condition x₀ = (a, b) and sketch the solutions. Lastly, returning to the original equation y″ = −y, what are the solutions y(t)?

Solution.
The matrix is A = [[0, −1], [1, 0]]. We have the solution
e^{tA} x₀ = R_t x₀ = [[cos t, −sin t], [sin t, cos t]] (a, b) = (a cos t − b sin t, a sin t + b cos t) = (x(t), y(t))
for the vector DE with initial condition x₀ = (a, b). The general solution for the one-dimensional second order equation y″ = −y is therefore
y(t) = a sin t + b cos t.   (30)
Note that x(0) = a = y′(0), so the initial condition is y(0) = b, y′(0) = a. Physically, this corresponds to a harmonic oscillator with mass and spring constant 1, and with initial position y(0) = b and initial velocity y′(0) = a.
Fixing y(0) = b, we see all the circles which meet the line y = b in the plane, each corresponding to a different initial velocity.
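As a numerical check of (30), one can integrate the system w′ = Aw and compare; a sketch assuming NumPy and SciPy (the initial data a, b are our choices):

    import numpy as np
    from scipy.integrate import solve_ivp

    a, b = 1.5, 0.5                                   # y'(0) = a, y(0) = b
    sol = solve_ivp(lambda t, w: [w[1], -w[0]], (0, 10), [b, a],
                    dense_output=True, rtol=1e-10, atol=1e-10)
    ts = np.linspace(0, 10, 50)
    err = np.max(np.abs(sol.sol(ts)[0] - (a*np.sin(ts) + b*np.cos(ts))))
    print(err)                                        # tiny: matches y = a sin t + b cos t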

The general higher-order linear case.
Any higher-order linear DE can be handled similarly. Thus, the matrix for the general n-th order linear case
y⁽ⁿ⁾ = a₁y + a₂y′ + ··· + aₙy⁽ⁿ⁻¹⁾
has the nice "companion matrix" form

A =
[ 0   1   0   ...  0  ]
[ 0   0   1   ...     ]
[ ...                 ]
[ 0   0   0   ...  1  ]
[ a₁  a₂  a₃  ...  aₙ ]
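As a concrete check (our example, not from the notes): for y‴ = 2y + y′ − 2y″ we have a₁ = 2, a₂ = 1, a₃ = −2, and the eigenvalues of the companion matrix recover the exponents λ of the basic solutions e^{λt}. A sketch assuming NumPy:

    import numpy as np

    # y''' = 2y + y' - 2y'' as w' = A w with w = (y, y', y''):
    A = np.array([[0., 1., 0.],
                  [0., 0., 1.],
                  [2., 1., -2.]])
    print(np.linalg.eigvals(A))   # 1, -1, -2: the solutions are e^t, e^{-t}, e^{-2t}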

References
[Ahl66] Lars V. Ahlfors. Complex Analysis. McGraw-Hill, second edition, 1966.
[Arm83] M.A. Armstrong. Basic Topology, volume 8 of Undergraduate Texts in Mathematics. Springer, 1983.
[Arn04] Vladimir Igorevich Arnold. Lectures on Partial Differential Equations. Springer, 2004.
[CB14] Ruel Churchill and James Brown. Complex Variables and Applications. McGraw-Hill, 2014.
[CJBS89] Richard Courant, Fritz John, Albert A. Blank, and Alan Solomon. Introduction to Calculus and Analysis, volume 2. Springer, 1989.
[DC16] Manfredo P. do Carmo. Differential Geometry of Curves and Surfaces, revised and updated second edition. Courier Dover Publications, 2016.
[FLS64] R.P. Feynman, R.B. Leighton, and M. Sands. The Feynman Lectures on Physics, Vol. II. Addison-Wesley, Reading, Massachusetts, 1964.
[GP74] Victor Guillemin and Alan Pollack. Differential Topology. Prentice-Hall, 1974.
[Gui02] Hamilton Luiz Guidorizzi. Um Curso de Cálculo, Vols. I–III. LTC, 5th edition, 2002.
[HH15] John H. Hubbard and Barbara Burke Hubbard. Vector Calculus, Linear Algebra, and Differential Forms: A Unified Approach. Matrix Editions, 2015.
[Jac99] John David Jackson. Classical Electrodynamics. Wiley, third edition, 1999.
[Lan99] Serge Lang. Complex Analysis, volume 103 of Graduate Texts in Mathematics. Springer, 1999.
[Mar74] Jerrold E. Marsden. Elementary Classical Analysis. W. H. Freeman, 1974.
[MH87] Jerrold E. Marsden and Michael J. Hoffman. Basic Complex Analysis. W. H. Freeman, second edition, 1987.
[MH98] Jerrold E. Marsden and Michael J. Hoffman. Basic Complex Analysis. W. H. Freeman, third edition, 1998.
[Mil16] John Milnor. Morse Theory (AM-51), volume 51 of Annals of Mathematics Studies. Princeton University Press, 2016.
[MW97] John Milnor and David W. Weaver. Topology from the Differentiable Viewpoint. Princeton University Press, 1997.
[O'N06] Barrett O'Neill. Elementary Differential Geometry. Elsevier, 2006.
[Ros68] Maxwell Rosenlicht. Liouville's theorem on functions with elementary integrals. Pacific Journal of Mathematics, 24(1):153–161, 1968.
[Rud73] W. Rudin. Functional Analysis. McGraw-Hill, New York, 1973.
[Spi65] Michael Spivak. Calculus on Manifolds. W. A. Benjamin, New York, 1965.
[War71] Frank W. Warner. Foundations of Differentiable Manifolds and Lie Groups, volume 94 of Graduate Texts in Mathematics. Springer Verlag, 1971.

Albert M. Fisher, Dept Mat IME-USP, Caixa Postal 66281, CEP 05315-970 São
Paulo, Brazil
URL: http://ime.usp.br/∼afisher
E-mail address: [email protected]
