Vector Calculus
Andrew Monnot
Contents
1 Vector Spaces 2
1.1 Basic Terminology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2 Bases and Dimension of Vector Spaces . . . . . . . . . . . . . . . . . . . 2
1.3 Functions between Vector Spaces . . . . . . . . . . . . . . . . . . . . . . 3
4 Differentiation 15
4.1 Differentiation in Vector Spaces . . . . . . . . . . . . . . . . . . . . . . . 15
4.2 Differentiation in Rn . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
4.3 Differentials and Tangent Planes in Rn . . . . . . . . . . . . . . . . . . . 17
4.4 Optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
5 Integration 21
5.1 Integration in Vector Spaces? . . . . . . . . . . . . . . . . . . . . . . . . 21
5.2 Integration in Rn . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
5.3 Change of Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
5.4 Subordinate Integration . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
6 Vector Calculus in R3 28
6.1 Gradient, Curl, and Divergence . . . . . . . . . . . . . . . . . . . . . . . 28
6.2 Main Theorems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
7 Applications 35
7.1 Vortex Dynamics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
7.2 Electrodynamics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
1 Vector Spaces
1.1 Basic Terminology
We begin with the formal definition of a vector space.
Definition 1.1. A vector space over a field F (also called an F-vector space) is a set V together with a binary operation + : V × V → V (called addition), a function · : F × V → V (called scalar multiplication), and an element 0 ∈ V (called the zero vector) such that, for all u, v, w ∈ V and r, s ∈ F:

(u + v) + w = u + (v + w)
u + v = v + u
v + 0 = v
for each v ∈ V there is a −v ∈ V with v + (−v) = 0
r · (u + v) = r · u + r · v
(r + s) · v = r · v + s · v
r · (s · v) = (rs) · v
1 · v = v.

Recall that the Cartesian product of two sets X and Y is

X × Y = {(x, y) : x ∈ X and y ∈ Y}.
Definition 1.2. Let U and V be vector spaces over a field F. We define the product vector space U × V as the Cartesian product of U and V together with the following definitions:
0 = (0, 0)
(u1 , v1 ) + (u2 , v2 ) = (u1 + u2 , v1 + v2 )
r · (u, v) = (ru, rv).
One can easily verify that U × V becomes an F-vector space with the above rules (note that by ru we mean r · u, and we will omit the · from now on when performing scalar multiplication). It follows that Rn = R × · · · × R (n factors) is a real vector space, and that Cn is both a real and complex vector space (meaning its scalars may come from the field R or C).
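As a quick computational aside (a Python sketch with illustrative helper names, not part of the formal development), the componentwise rules above can be modeled directly:

    # A minimal sketch of the product-space rules, modeling vectors in R^3
    # as tuples of floats; vec_add and scalar_mul are hypothetical names.
    def vec_add(u, v):
        # (u1, ..., un) + (v1, ..., vn) = (u1 + v1, ..., un + vn)
        return tuple(ui + vi for ui, vi in zip(u, v))

    def scalar_mul(r, u):
        # r(u1, ..., un) = (r u1, ..., r un)
        return tuple(r * ui for ui in u)

    zero = (0.0, 0.0, 0.0)  # the zero vector of R^3 = R x R x R
    u, v = (1.0, 2.0, 3.0), (4.0, 5.0, 6.0)
    assert vec_add(u, zero) == u                    # additive identity
    assert vec_add(u, scalar_mul(-1.0, u)) == zero  # additive inverses
    print(vec_add(u, v), scalar_mul(2.0, u))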
Definition 1.3. Let V be an F-vector space and S ⊆ V. The span of S is the set of all (finite) linear combinations

c1 e1 + · · · + ck ek

for ei ∈ S and ci ∈ F. We say that S spans V if every vector of V is such a linear combination.
Example 1.4. Let V = R2 and S = {î, ĵ} where î = (1, 0) and ĵ = (0, 1). Then S spans R2 since for any v = (a, b) ∈ R2 we can write

v = (a, b) = a î + b ĵ.
Example 1.7. We showed that {î, ĵ} spanned R2. Now observe that if

a î + b ĵ = (a, b) = (0, 0),

then a = 0 and b = 0. So {î, ĵ} is a linearly independent set, and is hence a basis of R2.
Proposition 1.8. Let B and B′ be two bases for a vector space V. Then B and B′ have the same number of elements (cardinality). Moreover, every vector space has a basis.
Definition 1.9. Let B be a basis of V. We define the dimension of V over its field F
by
dimF (V ) = |B|,
where |B| is the cardinality of B.
Example 1.10. Let V = Rn and ei = (0, ..., 1, ..., 0) be the vector in Rn with a 1 in the
ith component and 0s in each of the other components. Then {e1 , ..., en } is a basis of Rn
and hence
dimR (Rn ) = n.
1.3 Functions between Vector Spaces
Definition 1.11. Let U and V be F-vector spaces. A function f : U → V is a linear map if for all x, y ∈ U and r ∈ F:

f(x + y) = f(x) + f(y)
f(rx) = rf(x).

Example 1.12. Any function f : R → R of the form f(x) = ax is a linear map since

f(x + y) = a(x + y) = ax + ay = f(x) + f(y)

and

f(rx) = a(rx) = r(ax) = rf(x).
Proposition 1.13. If f : U → V is a linear map between vector spaces, then f(0) = 0.
Proof. Since f is a linear map and every vector x ∈ U has an inverse −x such that x + (−x) = 0, we have that

f(0) = f(x + (−x)) = f(x) + f((−1)x) = f(x) − f(x) = 0.
2 The Vector Space Rn
2.1 Magnitude in R
For x ∈ R, we define its norm (or magnitude) as its absolute value |x|, which is defined as

|x| = { x if x > 0; −x if x < 0; 0 if x = 0 }.
And for two real numbers x, y ∈ R, we define their distance as the norm of the difference:
d(x, y) = |x − y|.
Proposition 2.1. For all x, y ∈ R:
(a) |x| ≥ 0, and |x| = 0 iff x = 0;
(b) |xy| = |x||y|;
(c) |x + y| ≤ |x| + |y|.
Proof. (a) This follows from the definition (if x < 0, then −x > 0). (b) xy < 0 iff x < 0 or y < 0 but not both. Hence |xy| = −xy = |x||y|. xy > 0 iff x, y > 0 or x, y < 0. In either case, |xy| = xy = (−x)(−y) = |x||y|. And |xy| = 0 iff x = 0 or y = 0, and hence iff |x||y| = 0. (c) Note that we have −|x| ≤ x ≤ |x| and −|y| ≤ y ≤ |y|. If we add these two inequalities we obtain

−(|x| + |y|) ≤ x + y ≤ |x| + |y|,

and hence |x + y| ≤ |x| + |y|.
Proposition 2.2. For all x, y, z ∈ R:
(a) |x − y| ≥ 0, and |x − y| = 0 iff x = y.
(b) |x − y| = |y − x|.
(c) |x − y| ≤ |x − z| + |z − y|.
Proof. (a) Nonnegativity follows immediately from the previous proposition, as does the fact that |x − y| = 0 iff x − y = 0 iff x = y. (b) We have

|x − y| = |(−1)(y − x)| = |−1||y − x| = |y − x|.
(c) We have
|x − y| = |(x − z) + (z − y)| ≤ |x − z| + |z − y|
by the previous proposition.
2.2 Magnitudes in Rn
Can we construct a notion of magnitude ‖·‖ on Rn that satisfies the same properties? Namely, is there a function ‖·‖ : Rn → R such that
(a) ‖x‖ ≥ 0 and ‖x‖ = 0 iff x = 0,
(b) ‖cx‖ = |c|‖x‖, and
(c) ‖x + y‖ ≤ ‖x‖ + ‖y‖,
which we will call a norm. Note that in condition (b) above we look at a vector x ∈ Rn being multiplied by a scalar c ∈ R since we don't yet have a notion of a product of vectors. The first guess for x = (x₁, ..., xₙ) ∈ Rn might be

‖x‖ = ‖x‖₁ = Σ_{i=1}^n |xᵢ|.
And lastly,

‖x + y‖₁ = ‖(x₁, ..., xₙ) + (y₁, ..., yₙ)‖₁
         = ‖(x₁ + y₁, ..., xₙ + yₙ)‖₁
         = Σ_{i=1}^n |xᵢ + yᵢ|
         ≤ Σ_{i=1}^n (|xᵢ| + |yᵢ|)
         = Σ_{i=1}^n |xᵢ| + Σ_{i=1}^n |yᵢ|
         = ‖x‖₁ + ‖y‖₁.
Let us also define

‖x‖ₙ = ( Σ_{i=1}^m |xᵢ|^n )^{1/n}

for x = (x₁, ..., xₘ) ∈ Rᵐ.

Proposition 2.4. ‖·‖ₙ is a norm on Rᵐ.
Proof. ‖x‖ₙ ≥ 0 as it is a root of a sum of nonnegative numbers, and it is zero iff x = 0. We also have

‖cx‖ₙ^n = Σ_{i=1}^m |cxᵢ|^n = Σ_{i=1}^m |c|^n |xᵢ|^n = |c|^n Σ_{i=1}^m |xᵢ|^n = |c|^n ‖x‖ₙ^n

and hence ‖cx‖ₙ = |c|‖x‖ₙ. The last part, ‖x + y‖ₙ ≤ ‖x‖ₙ + ‖y‖ₙ, will be proved for n = 2 once we discuss the dot product.
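As a numerical sanity check of Proposition 2.4 (a sketch using numpy; random test vectors are only an illustration, not a proof):

    # Spot-check the norm axioms for ||.||_n on random vectors in R^5.
    import numpy as np

    def p_norm(x, n):
        # ||x||_n = (sum_i |x_i|^n)^(1/n)
        return float((np.abs(x) ** n).sum() ** (1.0 / n))

    rng = np.random.default_rng(0)
    x, y = rng.normal(size=5), rng.normal(size=5)
    for n in (1, 2, 3):
        # homogeneity: ||cx||_n = |c| ||x||_n
        assert np.isclose(p_norm(-3.0 * x, n), 3.0 * p_norm(x, n))
        # triangle inequality: ||x + y||_n <= ||x||_n + ||y||_n
        assert p_norm(x + y, n) <= p_norm(x, n) + p_norm(y, n) + 1e-12
    print("norm axioms hold on this sample")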
In R, |x − y| satisfied additional properties.
Proposition 2.5. For x, y, z ∈ Rᵐ we have
(a) ‖x − y‖ₙ ≥ 0 and ‖x − y‖ₙ = 0 iff x = y.
(b) ‖x − y‖ₙ = ‖y − x‖ₙ.
(c) ‖x − y‖ₙ ≤ ‖x − z‖ₙ + ‖z − y‖ₙ.
Proof. (a) This again follows immediately from the fact that ‖·‖ₙ is a norm. (b) We have

‖x − y‖ₙ = ( Σ_{i=1}^m |xᵢ − yᵢ|^n )^{1/n}
         = ( Σ_{i=1}^m |(−1)(yᵢ − xᵢ)|^n )^{1/n}
         = ( Σ_{i=1}^m |−1|^n |yᵢ − xᵢ|^n )^{1/n}
         = ( Σ_{i=1}^m |yᵢ − xᵢ|^n )^{1/n}
         = ‖y − x‖ₙ.

(c) And this similarly follows immediately:

‖x − y‖ₙ = ‖(x − z) + (z − y)‖ₙ ≤ ‖x − z‖ₙ + ‖z − y‖ₙ.
Because of this relation of the dot product to the norm ‖·‖₂, we will henceforth denote ‖·‖₂ simply by ‖·‖.
Theorem 2.7. (Cauchy-Schwarz Inequality) If x, y ∈ Rn , then
|x · y| ≤ ‖x‖‖y‖.
Proof. If y = 0, the result is trivial. Now let y ≠ 0 and note that for any t ∈ R we have

0 ≤ ‖x − ty‖² = ‖x‖² − 2t(x · y) + t²‖y‖².

Let t = (x · y)/‖y‖²; then we obtain

0 ≤ ‖x‖² − (x · y)²/‖y‖²,

so (x · y)² ≤ ‖x‖²‖y‖², and taking square roots gives the result.
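The inequality is easy to test numerically; the following sketch checks it on random vectors and at the equality case x = ty:

    # Spot-check Cauchy-Schwarz: |x . y| <= ||x|| ||y||.
    import numpy as np

    rng = np.random.default_rng(1)
    for _ in range(1000):
        x, y = rng.normal(size=4), rng.normal(size=4)
        assert abs(x @ y) <= np.linalg.norm(x) * np.linalg.norm(y) + 1e-12
    # equality holds when x and y are parallel:
    x = 2.5 * y
    print(abs(x @ y), np.linalg.norm(x) * np.linalg.norm(y))  # equal up to rounding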
2.4 Hyperplanes
To define a hyperplane in Rn, we will want to know the direction n that it faces and one point p we want to be on the hyperplane. We then take all points whose displacement from p is orthogonal to n:

Πₙ(p) = {x ∈ Rn : (x − p) · n = 0}.
That is, n is orthogonal to x − p for every point x on the plane. It follows that points x = (x₁, ..., xₘ) on hyperplanes in Rᵐ satisfy the equation

Σ_{i=1}^m nᵢ(xᵢ − pᵢ) = 0

or equivalently

Σ_{i=1}^m nᵢxᵢ = Σ_{i=1}^m nᵢpᵢ = C

for a choice of n, p. Hence a hyperplane is a vector space iff C = 0, for in this case we'd have Σ_{i=1}^m nᵢxᵢ = 0 and hence

Σ_{i=1}^m nᵢ(cxᵢ) = c Σ_{i=1}^m nᵢxᵢ = c · 0 = 0.
The cross product on R³ satisfies the following identities for all x, y, z ∈ R³ and c ∈ R:
(a) x × x = 0
(b) x × y = −(y × x)
(c) (cx) × y = x × (cy) = c(x × y)
(d) x × (y + z) = (x × y) + (x × z)
(e) (x × y) · z = x · (y × z)
Using (e) together with the expansion y × (x × y) = ‖y‖²x − (y · x)y, we obtain Lagrange's identity:

‖x × y‖² = (x × y) · (x × y)
         = x · (y × (x × y))
         = x · (‖y‖²x − (y · x)y)
         = ‖x‖²‖y‖² − x · ((y · x)y)
         = ‖x‖²‖y‖² − (x · y)².

Properties (b) and (e) also show that x × y is orthogonal to x (and likewise to y):

(x × y) · x = −(y × x) · x = −y · (x × x) = −y · 0 = 0.
Corollary 2.16.
x × (y × z) + z × (x × y) + y × (z × x) = 0.
The above identity is called the Jacobi identity.
Corollary 2.17. Let θ be the angle between x and y. Then

‖x × y‖ = ‖x‖‖y‖ sin θ.
Proof. By Lagrange's identity and the relation x · y = ‖x‖‖y‖ cos θ,

‖x × y‖² = ‖x‖²‖y‖² − (x · y)² = ‖x‖²‖y‖²(1 − cos²θ) = ‖x‖²‖y‖² sin²θ,

and taking square roots gives the result (sin θ ≥ 0 for θ ∈ [0, π]).
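Both corollaries rest on polynomial identities in the components, so they can also be verified symbolically; a sympy sketch:

    # Verify the Jacobi identity and Lagrange's identity symbolically.
    import sympy as sp

    x = sp.Matrix(sp.symbols('x1 x2 x3'))
    y = sp.Matrix(sp.symbols('y1 y2 y3'))
    z = sp.Matrix(sp.symbols('z1 z2 z3'))

    jacobi = x.cross(y.cross(z)) + z.cross(x.cross(y)) + y.cross(z.cross(x))
    assert jacobi.expand() == sp.zeros(3, 1)

    # ||x cross y||^2 = ||x||^2 ||y||^2 - (x . y)^2
    lagrange = x.cross(y).dot(x.cross(y)) - (x.dot(x) * y.dot(y) - x.dot(y) ** 2)
    assert sp.expand(lagrange) == 0
    print("Jacobi and Lagrange identities verified")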
3 Limits and Continuity of Functions between Vector Spaces
In order to discuss limits and continuity on vector spaces, we need a notion of “closeness”. If our vector space has a norm ‖·‖, then we will say two vectors x and y are ε-close for a real number ε > 0 if there is some vector z ∈ V with ‖z‖ < ε such that

y = x + z

(equivalently, ‖x − y‖ < ε). Limits can also be defined on more abstract vector spaces that only have a notion of distance (metric vector spaces), or even more generally, that have a more abstract notion of closeness (topological vector spaces). But we will limit ourselves to vector spaces with norms (normed spaces) whose field is R.
3.1 Limits
Definition 3.1. Let U and V be normed spaces with norms ‖·‖_U and ‖·‖_V respectively. Then for a function f : U → V, we say L is the limit of f as x → a, denoted

lim_{x→a} f(x) = L,

if for every ε > 0 there is a δ > 0 such that

‖x − a‖_U < δ =⇒ ‖f(x) − L‖_V < ε.
Proposition 3.2. Let f, g : U → V, let c ∈ R, and suppose lim_{x→a} f(x) and lim_{x→a} g(x) exist. Then
(a) lim_{x→a} (f(x) + g(x)) = lim_{x→a} f(x) + lim_{x→a} g(x),
(b) lim_{x→a} cf(x) = c lim_{x→a} f(x), and
(c) lim_{x→a} ‖f(x)‖ = ‖lim_{x→a} f(x)‖.
Proof. We omit (a) and (b) as their proofs are essentially the same as those from calculus. (c) Let L = lim_{x→a} f(x). Then for every ε > 0 there is a δ_L such that

‖x − a‖_U < δ_L =⇒ ‖f(x) − L‖_V < ε.

Hence we wish to show lim_{x→a} ‖f(x)‖ = ‖L‖. Let ε > 0 and let δ = δ_L. Then ‖x − a‖_U < δ implies

| ‖f(x)‖ − ‖L‖ | ≤ ‖f(x) − L‖ < ε

by the reverse triangle inequality.
Proposition 3.3. Let f, g : Rn → Rm; then

lim_{x→a} (f(x) · g(x)) = ( lim_{x→a} f(x) ) · ( lim_{x→a} g(x) ).

Also if f, g : R³ → R³, then

lim_{x→a} (f(x) × g(x)) = ( lim_{x→a} f(x) ) × ( lim_{x→a} g(x) ).
Proof. Let f(x) = (f(x)₁, ..., f(x)ₘ) and g(x) = (g(x)₁, ..., g(x)ₘ). Then

lim_{x→a} (f(x) · g(x)) = lim_{x→a} Σ_{i=1}^m f(x)ᵢ g(x)ᵢ
                        = Σ_{i=1}^m lim_{x→a} (f(x)ᵢ g(x)ᵢ)
                        = Σ_{i=1}^m ( lim_{x→a} f(x)ᵢ )( lim_{x→a} g(x)ᵢ )
                        = ( lim_{x→a} f(x) ) · ( lim_{x→a} g(x) ).

The cross product case is the same computation applied componentwise.
We also have a claim regarding composition of functions.
Proposition 3.4. Let f : U → V and g : V → W be such that lim_{x→a} f(x) = L and lim_{y→L} g(y) = M. Then

lim_{x→a} g(f(x)) = M.

Proof. Let ε > 0. Since lim_{y→L} g(y) = M, there is a δ₂ such that ‖y − L‖_V < δ₂ implies ‖g(y) − M‖_W < ε. And since lim_{x→a} f(x) = L, there is a δ₁ such that ‖x − a‖_U < δ₁ implies ‖f(x) − L‖_V < δ₂. Hence

‖x − a‖_U < δ₁ =⇒ ‖g(f(x)) − M‖_W < ε.
Proposition 3.5. Let f : Rn → Rm. Then lim_{x→a} f(x) exists iff lim_{xᵢ→aᵢ} f(x) exists for all 1 ≤ i ≤ n. Moreover, if lim_{x→a} f(x) exists, then it equals the iterated limit

lim_{x₁→a₁} · · · lim_{xₙ→aₙ} f(x).
3.2 Continuity
Definition 3.6. Let f : U → V be a function between normed spaces and Ω ⊆ U. f is continuous on Ω iff for all a ∈ Ω

lim_{x→a} f(x) = f(a).

In other words, f is continuous on Ω iff for every ε > 0 and a ∈ Ω, there is a δₐ such that

‖x − a‖_U < δₐ =⇒ ‖f(x) − f(a)‖_V < ε.

That is, the limits of f exist at all points a ∈ Ω and are equal to f(a).
Proposition 3.7. Let f, g : U → V be continuous on U, let c ∈ R, and let h : V → W be continuous on V. Then
(a) f + g is continuous on U.
(b) cf is continuous on U.
(c) f g is continuous on U.
(d) h ◦ f is continuous on U.
4 Differentiation
4.1 Differentiation in Vector Spaces
Definition 4.1. Let f : U → V be a function between normed spaces. We say that f is differentiable at x ∈ U if there is a linear function Dfₓ : U → V with sup_{h≠0} ‖Dfₓ(h)‖_V/‖h‖_U < ∞ such that

lim_{h→0} ‖f(x + h) − f(x) − Dfₓ(h)‖_V / ‖h‖_U = 0.

f is differentiable on U if it is differentiable at every x ∈ U (that is, for every x ∈ U, there is a Dfₓ satisfying the above limit). In this case we call Df the derivative (or Fréchet derivative) of f.
That is, f is differentiable at x if for every ε > 0, there is a δₓ > 0 such that

‖h‖_U < δₓ =⇒ ‖f(x + h) − f(x) − Dfₓ(h)‖_V / ‖h‖_U < ε,

since the fraction given is a function Fₓ : U → R with

Fₓ(h) = ‖f(x + h) − f(x) − Dfₓ(h)‖_V / ‖h‖_U

and we are looking at lim_{h→0} Fₓ(h).
Proposition 4.2. If f : U → V is differentiable at a ∈ U, then f is continuous at a.
Proof. Let ε > 0; we wish to find a δ > 0 such that

‖x − a‖_U < δ =⇒ ‖f(x) − f(a)‖_V < ε,

so that lim_{x→a} f(x) = f(a). Note that since f is differentiable, f(a) is defined. We also have, for any ε′, a δₐ such that

‖h‖_U < δₐ =⇒ ‖f(a + h) − f(a) − Dfₐ(h)‖_V / ‖h‖_U < ε′,

or equivalently,

‖f(a + h) − f(a) − Dfₐ(h)‖_V < ε′‖h‖_U < ε′δₐ.

If ε′ = 1, then note that we have

‖f(a + h) − f(a) − Dfₐ(h)‖_V < ‖h‖_U

and hence by the triangle inequality

‖f(a + h) − f(a)‖_V ≤ ‖Dfₐ(h)‖_V + ‖h‖_U ≤ ‖Dfₐ‖‖h‖_U + ‖h‖_U = ‖h‖_U (‖Dfₐ‖ + 1),

where ‖Dfₐ‖ = sup_{h≠0} ‖Dfₐ(h)‖_V / ‖h‖_U. Since f is differentiable, ‖Dfₐ‖ < ∞. Thus lim_{h→0} f(a + h) = f(a). Hence if h = x − a and δ = min{δₐ, ε/(‖Dfₐ‖ + 1)} we obtain

‖x − a‖_U = ‖h‖_U < δ =⇒ ‖f(x) − f(a)‖_V = ‖f(a + h) − f(a)‖_V
                       ≤ ‖h‖_U (‖Dfₐ‖ + 1)
                       < ( ε/(‖Dfₐ‖ + 1) )(‖Dfₐ‖ + 1)
                       = ε.

So f is continuous at a.
Some standard properties follow, whose proofs we leave to the reader.
Proposition 4.3. Let f, g : U → V be differentiable on U and h : V → W be differentiable on ran f, with derivatives Df, Dg, and Dh respectively. Then
(i) f + g is differentiable, and D(f + g) = Df + Dg;
(ii) cf is differentiable for c ∈ R, and D(cf) = cDf;
(iii) (Chain Rule) h ◦ f is differentiable, and D(h ◦ f) = D(h(f))D(f) (where the product is matrix multiplication, or more generally composition of linear maps).
4.2 Differentiation in Rn
Let f : Rn → R. Then the derivative Df of f (if it exists) would satisfy

lim_{h→0} |f(x + h) − f(x) − Dfₓ(h)| / ‖h‖ = 0

for every x ∈ Rn.
Proposition 4.5. Let f : Rn → Rm be differentiable with f(x) = (f₁(x), ..., fₘ(x)). Then Dᵢfⱼ exists for each 1 ≤ i ≤ n and 1 ≤ j ≤ m.
Proof. Let h = teᵢ with t > 0 (without loss of generality). Then since f is differentiable (and hence each fⱼ) and ‖teᵢ‖ = t‖eᵢ‖ = t, we have

0 = lim_{t→0} ‖f(x + teᵢ) − f(x) − Dfₓ(teᵢ)‖ / t,

so each component satisfies lim_{t→0} (fⱼ(x + teᵢ) − fⱼ(x))/t = (Dfₓ(eᵢ))ⱼ; that is, the partial derivative Dᵢfⱼ exists.
Note, often in calculus/analysis texts, Dfx and ∇fx are written as Df (x) and ∇f (x).
We do not use this notation since we do not mean the matrix multiplication of Df and
x or ∇f and x. Contrastingly in the definition of the derivative we had Dfx (h) where we
did mean the matrix multiplication of Dfx and the vector h. Correspondingly, evaluation
of Dfx on a vector h ∈ Rn is defined by matrix multiplication (or in the case of the
gradient, via the dot product). We will henceforth leave out the (x) for convenience.
Theorem 4.9. If Dij f and Dji f exist and are continuous, then Dij f = Dji f.
For differentiable f, g : Rn → R we also have a product rule:

∇(fg) = f∇g + g∇f.

For a unit vector v ∈ Rn, the directional derivative of f in the direction v is

∇ᵥf = lim_{t→0} ( f(x + tv) − f(x) ) / t.

Hence

∇_{eᵢ} f = ∂f/∂xᵢ.
4.3 Differentials and Tangent Planes in Rn
Recall that for f : R → R the differential is

df = f′ dx = Df dx.

To still get a real value in n dimensions, we change the product to a dot product.
Definition 4.13. Let f : Rn → R. Then we define the differential of f as

df = Df · dx = ∇f · dx = Σ_{i=1}^n (∂f/∂xᵢ) dxᵢ.
Recall that points x on a hyperplane Π satisfy

n · (x − p) = 0

for a point p ∈ Π and a vector n orthogonal to the plane. Suppose we wish to find a hyperplane tangent to f at point p for a differentiable function f. Then we need only to find a normal vector n. In the case when f : R → R, we used the point-slope method to obtain the tangent plane (which was a line) through the point (p, f(p)) that also went through (a, f(a)), defined by

y − f(a) = f′(a)(x − a).
In this case, if we place the function in R² so that f = {(x, f(x))}, we had a tangent line at the point (p, f(p)). Rewriting the point-slope equation as

f′(a)(x − a) − (y − f(a)) = (f′(a), −1) · ((x, y) − (a, f(a))) = 0

exhibits a normal vector n = (f′(a), −1).
Hence, placing the function into Rn+1 so that our second vector becomes A = (a₁, ..., aₙ, f(a)) and the function becomes X = (x₁, ..., xₙ, f(x)), we would want

0 = n · (X − A) = Σ_{i=1}^{n+1} nᵢ(Xᵢ − Aᵢ) = −(f(x) − f(a)) + Σ_{i=1}^n nᵢ(xᵢ − aᵢ).

Hence

n = ( ∂f/∂x₁(p), ..., ∂f/∂xₙ(p), −1 ).
So, this characterizes an "n-dimensional" tangent hyperplane in Rn+1 (it has n+1 coordinates, but the evaluations of the partial derivatives at p put one linear constraint on them), while viewing f as an "n-dimensional" subset of Rn+1, namely all points of the form (x₁, ..., xₙ, f(x)) (similarly, choosing the n values xᵢ constrains f(x), hence "n-dimensional").
The construction of a tangent hyperplane is the answer to the question: “Given a
differentiable function f : Rn → R and two points p ∈ Rn and a ∈ Rn+1 , can one
construct a hyperplane containing the points (p1 , ..., pn , f (p)) and a?” A weaker question
is: “Given a differentiable function f : Rn → R and a point a ∈ Rn+1 , can one find a
point p ∈ Rn such that there is a tangent hyperplane of f at p passing through a?” The
answer to this is also yes with some slight assumptions.
For two points a, b ∈ Rn, we define the line segment between them as the set {(1 − t)a + tb : t ∈ [0, 1]}.
4.4 Optimization
Optimization is the study of critical points of differentiable functions. Like the differential
of a function, the notion of a critical point of a function doesn’t quite generalize to an
arbitrary differentiable function between vector spaces. We in turn only consider the
notion of critical points for differentiable functions f : Rn → Rm .
For motivation, if f : R → R is differentiable and f 0 (x) = 0 for some x ∈ R, then we
call x a critical point. For functions f : Rn → R, if ∇f (x) = 0 for some x ∈ Rn , we will
say the same of x. But for f : Rn → Rm , we slightly weaken our requirements regarding
the Jacobian derivative Df.
Definition. Let f : Rn → R and p ∈ Rn, and let Bᵣ(p) denote the open ball of radius r about p. We call p a
(a) local minimum of f if there is some r > 0 such that f(p) ≤ f(x) for all x ∈ Bᵣ(p);
(b) local maximum of f if there is some r > 0 such that f(p) ≥ f(x) for all x ∈ Bᵣ(p);
Theorem 4.20. (Second Derivatives Test) Let x be a critical point of a twice continuously differentiable function f : Rn → R, and let D²f(x) denote the Hessian matrix of second partial derivatives at x.
(a) If all of the eigenvalues of D²f(x) are negative, then x is a local maximum.
(b) If all of the eigenvalues of D²f(x) are positive, then x is a local minimum.
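For instance (a numerical sketch; the sample function is illustrative), f(x, y) = x² + 3y² − 2xy has ∇f = (2x − 2y, 6y − 2x) = 0 at the origin, and the eigenvalues of its Hessian classify that critical point:

    # Second derivatives test via Hessian eigenvalues.
    import numpy as np

    # Hessian of f(x, y) = x^2 + 3y^2 - 2xy is the constant matrix below.
    H = np.array([[2.0, -2.0], [-2.0, 6.0]])
    eigenvalues = np.linalg.eigvalsh(H)  # symmetric, so eigenvalues are real
    if np.all(eigenvalues > 0):
        print("local minimum:", eigenvalues)
    elif np.all(eigenvalues < 0):
        print("local maximum:", eigenvalues)
    else:
        print("mixed signs; the test is inconclusive here:", eigenvalues)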
5 Integration
5.1 Integration in Vector Spaces?
Can we define a notion of integration in general vector spaces? It turns out that if f : U →
V is a function between vector spaces with U and V normed and V “complete” (Cauchy
sequences have limits), then there is a natural way to define the integral of f (if it exists).
The resulting construction is not very intuitive and is correspondingly not particularly interesting to examine here. It requires some knowledge of measure theory: U has an induced measure space structure with d-dimensional Hausdorff measure and Carathéodory-measurable sets as the σ-algebra, and the completeness of V allows one to define limits of step functions...but enough of that. Since these notes do not require measure theory as a prerequisite, we in turn restrict ourselves to Rn.
5.2 Integration in Rn
Recall that for a function f : R → R, we define the integral of f over the interval [a, b] (or equivalently (a, b), (a, b] or [a, b)) by

∫_a^b f(x) dx = lim_{n→∞} Σ_{i=1}^n f(tᵢ)Δxᵢ

where Δxᵢ = |xᵢ − xᵢ₋₁|, x₀ = a, xₙ = b, and tᵢ ∈ [xᵢ₋₁, xᵢ]. The limit assumes that we make further subdivisions of the interval [a, b] as n → ∞, where in each successive step xᵢ might be different than before.
To amend this, one can discuss partitions of the interval [a, b] as sets P consisting of points in [a, b] (the xᵢ's). One can then say a partition P′ is finer than P if P ⊆ P′ (i.e. P′ has all the same points as P and possibly some more). We can define a norm on a partition P = {x₀, x₁, ..., xₙ} (with x₀ = a and xₙ = b) by

‖P‖ = max_{1≤i≤n} |xᵢ − xᵢ₋₁|.

Then if {Pₙ} is a sequence of partitions of [a, b] such that Pₙ₊₁ is finer than Pₙ for all n and lim_{n→∞} ‖Pₙ‖ = 0, we can then define the integral by

∫_a^b f(x) dx = lim_{n→∞} Σ_{xᵢ∈Pₙ} f(tᵢ)Δxᵢ
where ti ∈ [xi−1 , xi ]. In this case, one can show that the above definition does not depend
upon the ti ’s chosen in each subinterval.
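A short numerical sketch of this definition, using the dyadic partitions Pₖ of [0, 1] into 2ᵏ equal subintervals (so each Pₖ₊₁ refines Pₖ) with left-endpoint tags:

    # Riemann sums over refining partitions P_k; here f(x) = x^2 on [0, 1].
    import numpy as np

    def riemann_sum(f, a, b, k):
        xs = np.linspace(a, b, 2 ** k + 1)              # points of P_k
        return float(np.sum(f(xs[:-1]) * np.diff(xs)))  # tags t_i = x_{i-1}

    f = lambda x: x ** 2
    for k in (2, 6, 10, 14):
        print(k, riemann_sum(f, 0.0, 1.0, k))  # approaches 1/3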
Now what if f : R2 → R? If we wanted to continue with the same reasoning, we
might want to look at sums of the form
Σᵢ f(tᵢ)ΔAᵢ
where Ai is a rectangle in R2 , ∆Ai is the area of the rectangle, and ti is a point in the
rectangle. It turns out this is the correct way to define the integral. Now since ti ∈ R2 ,
we can write it as tᵢ = (tᵢ¹, tᵢ²) for tᵢ¹, tᵢ² ∈ R. Similarly, since Aᵢ is a rectangle in R², its area can be written as a product |xᵢ − xᵢ₋₁||yᵢ − yᵢ₋₁|. Hence our sum looks like

Σᵢ f(tᵢ)ΔAᵢ = Σᵢ f(tᵢ¹, tᵢ²)|xᵢ − xᵢ₋₁||yᵢ − yᵢ₋₁| = Σᵢ f(tᵢ¹, tᵢ²)ΔxᵢΔyᵢ.
Hence we are essentially just subdividing the x axis and y axis, or rather intervals on them such as [a, b] and [c, d], with the intent of letting the number of rectangles go to infinity (while their areas go to 0). Should we require [a, b] and [c, d] to have the same number of subdivisions at each stage? It turns out that for most ordinary circumstances it will not matter. Hence we may index the subdivisions of [a, b] and [c, d] separately as follows:

Σⱼ Σᵢ f(tᵢ¹, tⱼ²)ΔxᵢΔyⱼ = Σⱼ ( Σᵢ f(tᵢ¹, tⱼ²)Δxᵢ ) Δyⱼ
and correspondingly sum over one and then the other. So we want a sequence of partitions {Pₙ} of [a, b] and a sequence of partitions {P′ₘ} of [c, d] such that lim_{n→∞} ‖Pₙ‖ = 0 and lim_{m→∞} ‖P′ₘ‖ = 0. We then define the integral of f over the rectangle [a, b] × [c, d] by

∫_c^d ∫_a^b f(x, y) dx dy = lim_{m→∞} lim_{n→∞} Σ_{yⱼ∈P′ₘ} ( Σ_{xᵢ∈Pₙ} f(tᵢ¹, tⱼ²) Δxᵢ ) Δyⱼ.
As with derivatives, if f : Rn → Rm with f(x) = (f₁(x), ..., fₘ(x)) for x ∈ Rn and R = [a₁, b₁] × · · · × [aₙ, bₙ], we define

∫_R f dx = ( ∫_R f₁ dx, ..., ∫_R fₘ dx ).
What if we wish to integrate over subsets of Rn that aren't rectangles? This requires another definition of the integral, which turns out to include the one we previously had.
Say we want to integrate over a subset X ⊆ Rn. We will have to make an assumption about X: that

sup_{S⊆X, P(S)} Σ_{i=1}^k |Rᵢ| = inf_{X⊆S′, P(S′)} Σ_{j=1}^m |S′ⱼ|,

where the supremum (or infimum) is taken over all finite unions of rectangles S (or S′) contained in (or containing) X and over partitions of those rectangles (and Rᵢ denotes the ith rectangle in a partition P(S) of S). In this case, we call X a Jordan region, denote the common value by |X|, and call it the volume of X. Let us define the norm of an n-dimensional partition by

‖P(S)‖ = maxᵢ |Rᵢ|.

That is, the norm of the partition is the volume of the largest rectangle in the partition.
5.3 Change of Variables
Example 5.5. (Polar Coordinates in R²) Define φ : R⁺ × [0, 2π) → R² by

φ(r, θ) = (r cos θ, r sin θ).

Then it is clear that this map is continuously differentiable and injective. The magnitude of the determinant of the Jacobian is

| det Dφ | = | det [ ∂x/∂r  ∂x/∂θ ; ∂y/∂r  ∂y/∂θ ] | = | det [ cos θ  −r sin θ ; sin θ  r cos θ ] | = |r cos²θ + r sin²θ| = r.
Hence

∫_{R²} f(x, y) dx dy = ∫_0^{2π} ∫_0^∞ f(r, θ) r dr dθ.
Example 5.6. (Spherical Coordinates in R3 ) Define φ : R+ × [0, 2π) × [0, π] → R3
by
φ(ρ, θ, ϕ) = (ρ sin ϕ cos θ, ρ sin ϕ sin θ, ρ cos ϕ).
This map is also clearly continuously differentiable and injective. We also have

| det Dφ | = | det [ ∂x/∂ρ  ∂x/∂θ  ∂x/∂ϕ ; ∂y/∂ρ  ∂y/∂θ  ∂y/∂ϕ ; ∂z/∂ρ  ∂z/∂θ  ∂z/∂ϕ ] |
           = | det [ sin ϕ cos θ  −ρ sin ϕ sin θ  ρ cos ϕ cos θ ; sin ϕ sin θ  ρ sin ϕ cos θ  ρ cos ϕ sin θ ; cos ϕ  0  −ρ sin ϕ ] |
           = ρ² sin ϕ.
Thus

∫_{R³} f(x, y, z) dx dy dz = ∫_0^π ∫_0^{2π} ∫_0^∞ f(ρ, θ, ϕ) ρ² sin ϕ dρ dθ dϕ.
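Both Jacobians can be confirmed symbolically. Note that with the column ordering (ρ, θ, ϕ) used above the raw determinant comes out as −ρ² sin ϕ; its absolute value is ρ² sin ϕ since sin ϕ ≥ 0 on [0, π]. A sympy sketch:

    # Verify the polar and spherical Jacobian determinants.
    import sympy as sp

    r, theta, rho, phi = sp.symbols('r theta rho phi', positive=True)

    polar = sp.Matrix([r * sp.cos(theta), r * sp.sin(theta)])
    assert sp.simplify(polar.jacobian([r, theta]).det()) == r

    spherical = sp.Matrix([rho * sp.sin(phi) * sp.cos(theta),
                           rho * sp.sin(phi) * sp.sin(theta),
                           rho * sp.cos(phi)])
    det3 = spherical.jacobian([rho, theta, phi]).det()
    assert sp.simplify(det3 + rho ** 2 * sp.sin(phi)) == 0  # det = -rho^2 sin(phi)
    print("Jacobians verified: r and rho^2 sin(phi) in absolute value")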
5.4 Subordinate Integration
This may seem to suggest that there is no nice way of integrating over subsets of strictly smaller "dimension". Rather than integrating in the classical way (in which case we essentially get hypervolumes being 0 since, say, one of the components is constant and hence rectangles have 0 length in that component), we will come up with alternate methods for subordinate integration.
Let γ : [0, 1] → Rn be a continuously differentiable and injective map (which we will call a path) and f : Rn → R. We ask the question: what is the integral of f along γ? Change of variables would give us

∫_γ f dγ := ∫_{γ([0,1])} f(x) dγ(x) = ∫_0^1 f(γ(t)) | det Dγ | dt.
But det Dγ doesn't make sense in this case since

Dγ = (γ′₁(t), ..., γ′ₙ(t))

is not a square matrix. It turns out that ‖Dγ‖ works as a substitute for | det Dγ|. Intuitively, in Rn we have an infinitesimal Pythagorean theorem:

dγ² = dγ₁² + · · · + dγₙ²,

which would yield

∫_γ f dγ = ∫_0^1 f √( (dγ₁/dt)² + · · · + (dγₙ/dt)² ) dt = ∫_0^1 f ‖Dγ‖ dt.
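As a numerical sketch of this formula (the path and integrand are illustrative): taking f = 1 along the quarter unit circle should return its arc length π/2.

    # Path integral int_0^1 f(gamma(t)) ||Dgamma(t)|| dt, via the midpoint rule.
    import numpy as np

    def path_integral(f, gamma, dgamma, num=100_000):
        t = np.linspace(0.0, 1.0, num, endpoint=False) + 0.5 / num  # midpoints
        speed = np.linalg.norm(dgamma(t), axis=0)  # ||Dgamma(t)||
        return float(np.sum(f(*gamma(t)) * speed) / num)

    gamma = lambda t: np.array([np.cos(np.pi * t / 2), np.sin(np.pi * t / 2)])
    dgamma = lambda t: (np.pi / 2) * np.array([-np.sin(np.pi * t / 2),
                                               np.cos(np.pi * t / 2)])
    one = lambda x, y: np.ones_like(x)
    print(path_integral(one, gamma, dgamma))  # ~ pi/2 = 1.5707...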
Thus

ΔS = ‖det A(V)‖
   = √( (∂S/∂x₁ Δx₁ · · · Δxₙ)² + · · · + (∂S/∂xₙ Δx₁ · · · Δxₙ)² + (Δx₁ · · · Δxₙ)² )
   = √( (∂S/∂x₁)² + · · · + (∂S/∂xₙ)² + 1 ) Δx₁ · · · Δxₙ.

Or in the limit:

dS = √( (∂S/∂x₁)² + · · · + (∂S/∂xₙ)² + 1 ) dx₁ · · · dxₙ.
If instead we have that f : Rn+1 → Rn+1, we can define its surface integral with respect to an oriented surface S (meaning it has a normal vector that changes continuously along S) as

∫_S f · dS = ∫_S (f · n) dS

where n is a unit normal vector to S. Since the surface is the graph of S : Rn → R, suppose we consider the function

Σ(x₁, ..., xₙ₊₁) = xₙ₊₁ − S(x₁, ..., xₙ),

which is zero exactly on the surface.
Also,

∇Σ · (x₁, ..., xₙ, S(x₁, ..., xₙ)) = −(∂S/∂x₁)x₁ − · · · − (∂S/∂xₙ)xₙ + S(x₁, ..., xₙ) = c

for some constant c (since applying d to both sides must give us a 0 on the right, by computing the differential of S). But since Σ = 0 on the surface, we must have c = 0. So ∇Σ is normal to the surface. Hence we can define the unit normal vector

n = ∇Σ/‖∇Σ‖ = ( −∂S/∂x₁, ..., −∂S/∂xₙ, 1 ) / √( (∂S/∂x₁)² + · · · + (∂S/∂xₙ)² + 1 ).
Hence we obtain

∫_S f · dS = ∫_S (f · n) dS
           = ∫_S f · [ ( −∂S/∂x₁, ..., −∂S/∂xₙ, 1 ) / √( (∂S/∂x₁)² + · · · + (∂S/∂xₙ)² + 1 ) ] √( (∂S/∂x₁)² + · · · + (∂S/∂xₙ)² + 1 ) dx₁ · · · dxₙ
           = ∫_S (f₁, ..., fₙ₊₁) · ( −∂S/∂x₁, ..., −∂S/∂xₙ, 1 ) dx₁ · · · dxₙ
           = ∫_S ( −f₁ ∂S/∂x₁ − · · · − fₙ ∂S/∂xₙ + fₙ₊₁ ) dx₁ · · · dxₙ
where each fi has the change of variables in its arguments: fi (x1 , ..., xn , S(x1 , ..., xn )).
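A quick sanity check of the formula just derived (a numerical sketch with an illustrative surface): for f = (0, 0, 1) the integrand reduces to −f₁Sₓ − f₂S_y + f₃ = 1, so the flux through any graph z = S(x, y) over the unit square should equal the square's area.

    # Flux through the graph z = S(x, y) via the formula above.
    import numpy as np

    Sx = lambda x, y: 2 * x   # dS/dx for S(x, y) = x^2 + y^2
    Sy = lambda x, y: 2 * y   # dS/dy
    f1 = f2 = lambda x, y: 0.0 * x
    f3 = lambda x, y: 1.0 + 0.0 * x   # the field f = (0, 0, 1)

    xs = (np.arange(400) + 0.5) / 400  # midpoints of a 400x400 grid on [0,1]^2
    X, Y = np.meshgrid(xs, xs)
    integrand = -f1(X, Y) * Sx(X, Y) - f2(X, Y) * Sy(X, Y) + f3(X, Y)
    flux = float(integrand.mean())  # midpoint rule; the square has area 1
    print(flux)  # ~ 1.0, the area of the unit square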
6 Vector Calculus in R3
6.1 Gradient, Curl, and Divergence
Recall that for f : Rn → R, we have the gradient ∇f : Rn → Rn defined by

∇f(x) = ( ∂f/∂x₁(x), ..., ∂f/∂xₙ(x) ).

Since we will be restricting ourselves to R³ in this section, we will simplify our notation with x₁ = x, x₂ = y, and x₃ = z:

∇f = ( ∂f/∂x, ∂f/∂y, ∂f/∂z ) = (fx, fy, fz).
For a vector field f = (f₁, f₂, f₃) : R³ → R³ we define the curl and divergence of f by

curl f = ( ∂f₃/∂y − ∂f₂/∂z, ∂f₁/∂z − ∂f₃/∂x, ∂f₂/∂x − ∂f₁/∂y )

div f = ∂f₁/∂x + ∂f₂/∂y + ∂f₃/∂z.
Proposition 6.3.

curl ∇f = 0.

Proof. Writing ∇f = (fx, fy, fz) and using the equality of mixed partials (Theorem 4.9),

curl ∇f = ( fzy − fyz, fxz − fzx, fyx − fxy ) = 0.
Proposition 6.4.

div curl f = 0.

Proof.

div curl f = div ( ∂f₃/∂y − ∂f₂/∂z, ∂f₁/∂z − ∂f₃/∂x, ∂f₂/∂x − ∂f₁/∂y )
           = ∂/∂x ( ∂f₃/∂y − ∂f₂/∂z ) + ∂/∂y ( ∂f₁/∂z − ∂f₃/∂x ) + ∂/∂z ( ∂f₂/∂x − ∂f₁/∂y )
           = 0.
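Both propositions, and Proposition 6.5 below, are identities about second partial derivatives, so they can be checked mechanically for generic fields; a sympy sketch:

    # Check curl grad f = 0, div curl F = 0, and curl curl F = grad(div F) - lap F.
    import sympy as sp

    x, y, z = sp.symbols('x y z')

    def grad(g):
        return sp.Matrix([sp.diff(g, x), sp.diff(g, y), sp.diff(g, z)])

    def curl(F):
        F1, F2, F3 = F
        return sp.Matrix([sp.diff(F3, y) - sp.diff(F2, z),
                          sp.diff(F1, z) - sp.diff(F3, x),
                          sp.diff(F2, x) - sp.diff(F1, y)])

    def div(F):
        F1, F2, F3 = F
        return sp.diff(F1, x) + sp.diff(F2, y) + sp.diff(F3, z)

    lap = lambda g: sp.diff(g, x, 2) + sp.diff(g, y, 2) + sp.diff(g, z, 2)

    f = sp.Function('f')(x, y, z)  # generic scalar field
    F = sp.Matrix([sp.Function(n)(x, y, z) for n in ('F1', 'F2', 'F3')])

    assert sp.simplify(curl(grad(f))) == sp.zeros(3, 1)
    assert sp.simplify(div(curl(F))) == 0
    identity = curl(curl(F)) - (grad(div(F)) - sp.Matrix([lap(c) for c in F]))
    assert sp.simplify(identity) == sp.zeros(3, 1)
    print("all three identities hold")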
We thus have a chain of maps

(scalar fields) → (vector fields) → (vector fields) → (scalar fields),

given by grad, curl, and div in turn, such that any double composition in the chain is 0 (i.e. curl of grad, or div of curl). Such a chain is called a complex. Of interest might be the vector fields f : R³ → R³ that fall into two categories: (1) irrotational ones that aren't gradients of a scalar field, and (2) incompressible ones that aren't the curl of a vector field. It turns out, however, that all irrotational fields are the gradient of a scalar field, and all incompressible fields are the curl of a vector field. These theorems follow from the fundamental theorem for line integrals, Stokes' theorem, and Gauss' theorem (the latter two of which appear in the next section).
We will also use the following notation:

∇²f = ∇ · ∇f = div(grad f) = ∂²f/∂x² + ∂²f/∂y² + ∂²f/∂z²

for a scalar field f, and

∇²f = ( ∇²f₁, ∇²f₂, ∇²f₃ )

for a vector field f = (f₁, f₂, f₃).
Proposition 6.5.
curl (curl f ) = ∇(div f ) − ∇2 f.
Proof. Let P = f3y − f2z , Q = f1z − f3x , and R = f2x − f1y so that curl f = (P, Q, R).
Then

curl(curl f) = det [ e₁  e₂  e₃ ; ∂/∂x  ∂/∂y  ∂/∂z ; P  Q  R ]
             = ( Ry − Qz, Pz − Rx, Qx − Py )
             = ( (f2xy − f1yy) − (f1zz − f3xz), (f3yz − f2zz) − (f2xx − f1yx), (f1zx − f3xx) − (f3yy − f2zy) )
             = ( f2xy + f3xz − (f1yy + f1zz), f3yz + f1yx − (f2xx + f2zz), f1zx + f2zy − (f3xx + f3yy) )
             = ( f1xx + f2xy + f3xz, f1yx + f2yy + f3yz, f1zx + f2zy + f3zz ) − ∇²f
             = ( ∂/∂x (f1x + f2y + f3z), ∂/∂y (f1x + f2y + f3z), ∂/∂z (f1x + f2y + f3z) ) − ∇²f
             = ∇(f1x + f2y + f3z) − ∇²f
             = ∇(div f) − ∇²f.
We will denote the above by curl2 f.
6.2 Main Theorems
A deep generalization of the fundamental theorem of calculus is the Radon-Nikodym theorem of measure theory: given two suitably compatible measures ρ and µ, there is an integrable function f such that

ρ(E) = ∫_E f dµ

for all measurable sets E. Here f is called the Radon-Nikodym derivative and is denoted f = dρ/dµ. You can think of it as saying "given two differentials dy and dx, there is an integrable function f such that ∫ dy = ∫ f dx," and we can write f = dy/dx. The actual definition of a measure isn't quite like a differential, but it's a manageable generalization. Another general form of the fundamental theorem is Stokes' theorem: if M is a compact oriented n-dimensional manifold with boundary ∂M and ω is a smooth (n − 1)-form on M, then

∫_M dω = ∫_{∂M} ω.
An n-dimensional manifold can be thought of as a space that looks like Rn locally (i.e.
in small neighborhoods). For example, the interval [a, b] is a compact 1-dimensional
manifold with boundary (its boundary is {a, b}, the endpoints). A 0-form on this interval
is just a smooth function f defined on it (in general, an n-form looks like f dx₁ ∧ · · · ∧ dxₙ). So the theorem applied to this example is (and recall df = f′(x) dx)

∫_{[a,b]} df = ∫_{[a,b]} f′(x) dx = ∫_{{a,b}} f = f(b) − f(a),
which is simply the fundamental theorem of calculus. It turns out that all of the theorems
in this section are special cases of the above Stokes’ theorem. In what follows, a simple
path will be an injective one (so it doesn’t intersect itself); an orientation on a closed
path will be a choice about whether one moves clockwise or counterclockwise. A closed path
with orientation is called an oriented path. An oriented surface is a surface with a
set of normal vectors at each point that change continuously.
Theorem. (Kelvin-Stokes' Theorem) Let S be an oriented surface given as the graph of a function S : R² → R, with positively oriented boundary γ. If f : R³ → R³ is continuously differentiable on S, then

∫_γ f · dγ = ∫∫_S curl f · dS.

This can be seen as a special case of the above Stokes' theorem by the following reasoning. S is a surface, so it looks like R² in very small neighborhoods. That is, S is a 2-dimensional manifold. It also satisfies the assumptions we need. Also, f · dγ is a sum of 1-forms, and is thus a 1-form itself. In the statement of the general Stokes' theorem, dω refers to the de Rham derivative of ω, and in our case it turns out that

d(f · dγ) = curl f · dS.

But we can prove the Kelvin-Stokes theorem directly, without showing it is a special case of the more general Stokes' theorem, by noting that since γ is the boundary of the surface, we can write

γ(t) = (x(t), y(t), z(t)) with z(t) = S(x(t), y(t)).
Proof.

∫_γ f · dγ = ∫_a^b f · γ′(t) dt
           = ∫_a^b ( f₁x′(t) + f₂y′(t) + f₃z′(t) ) dt
           = ∫_a^b ( f₁x′(t) + f₂y′(t) + f₃( ∂S/∂x dx/dt + ∂S/∂y dy/dt ) ) dt
           = ∫_a^b ( (f₁ + f₃ ∂S/∂x) dx/dt + (f₂ + f₃ ∂S/∂y) dy/dt ) dt
           = ∫_{γ∩R²} (f₁ + f₃ ∂S/∂x) dx + (f₂ + f₃ ∂S/∂y) dy
           = ∫∫_S ( ∂/∂x (f₂ + f₃ ∂S/∂y) − ∂/∂y (f₁ + f₃ ∂S/∂x) ) dx dy
           = ∫∫_S ( −(f3y − f2z) ∂S/∂x − (f1z − f3x) ∂S/∂y + (f2x − f1y) ) dx dy
           = ∫∫_S curl f · ( −∂S/∂x, −∂S/∂y, 1 ) dx dy
           = ∫∫_S curl f · dS.
Theorem 6.8. (Gauss' Theorem)(Divergence Theorem) Let B be a simple body and S be its boundary with positive orientation. If f : R³ → R³ is continuously differentiable on B, then

∫∫_S f · dS = ∫∫∫_B div f dx dy dz.
Proof.

∫∫_S f · dS = ∫∫_S (f₁, f₂, f₃) · (n₁, n₂, n₃) dS
            = ∫∫_S ( f₁n₁ + f₂n₂ + f₃n₃ ) dS
            = ∫∫ ( ∫ ∂f₁/∂x dx ) dy dz + ∫∫ ( ∫ ∂f₂/∂y dy ) dx dz + ∫∫ ( ∫ ∂f₃/∂z dz ) dx dy
            = ∫∫∫_B ( ∂f₁/∂x + ∂f₂/∂y + ∂f₃/∂z ) dx dy dz
            = ∫∫∫_B div f dx dy dz.
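As a concrete check of Gauss' theorem on a sample field (a sympy sketch; the field and region are illustrative):

    # Gauss' theorem for f = (x^2, y z, x + z) on the unit cube [0, 1]^3.
    import sympy as sp

    x, y, z = sp.symbols('x y z')
    f1, f2, f3 = x ** 2, y * z, x + z

    div_f = sp.diff(f1, x) + sp.diff(f2, y) + sp.diff(f3, z)
    volume = sp.integrate(div_f, (x, 0, 1), (y, 0, 1), (z, 0, 1))

    # outward flux through the six faces (normals are +/- e_i)
    flux = (sp.integrate(f1.subs(x, 1) - f1.subs(x, 0), (y, 0, 1), (z, 0, 1))
            + sp.integrate(f2.subs(y, 1) - f2.subs(y, 0), (x, 0, 1), (z, 0, 1))
            + sp.integrate(f3.subs(z, 1) - f3.subs(z, 0), (x, 0, 1), (y, 0, 1)))
    assert sp.simplify(flux - volume) == 0
    print(flux, volume)  # both equal 5/2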
Recall that the curl of a gradient is zero and the divergence of a curl is zero. Also recall that vector fields with 0 curl are irrotational and vector fields with 0 divergence are incompressible. We said that, in fact, every irrotational vector field is the gradient of some scalar field, and that every incompressible vector field is the curl of another vector field. The first step in showing this is to show that every vector field can be written as the sum of an irrotational and an incompressible field.
Theorem. (Helmholtz Decomposition) A (sufficiently smooth and rapidly decaying) vector field f : R³ → R³ can be written as a sum f = C + R of an incompressible field C and an irrotational field R; namely,

f = curl A − ∇B

for some vector field A and scalar field B. It follows immediately that C = curl A and R = −∇B, for the divergence of a curl is 0 and the curl of a gradient is 0. In fact A and B have the forms

A = (1/4π) ∫_{R³} curl f / ‖x − x₀‖ dx

and

B = (1/4π) ∫_{R³} div f / ‖x − x₀‖ dx

for any choice of x₀ ∈ R³.
Corollary. Let f : R³ → R³ be a vector field as above.
(a) If f is the gradient of a scalar field, then it is irrotational.
(b) If f is the curl of a vector field, then it is incompressible.
(c) If f is an irrotational vector field, then it is the gradient of some scalar field.
(d) If f is an incompressible vector field, then it is the curl of some vector field.
Proof. We have already proven (a) and (b). By Helmholtz we have f = curl A − ∇B
where A and B are defined by the integrals above. If f is irrotational, curl f = 0, so
A = 0 and thus f = −∇B. Similarly if f is incompressible, then div f = 0, so B = 0 and
hence f = curl A.
Now suppose f is both irrotational and incompressible. Since both representations involve the Laplacian (∇²) and f can be written in either way, we will stick to the simpler first case. Hence an irrotational and incompressible vector field f has the form
f = −∇ϕ
where ϕ is a solution to Laplace’s equation:
∇2 ϕ = 0.
7 Applications
7.1 Vortex Dynamics
In fluid mechanics, we can let v : R³ → R³ be the velocity field of a fluid (where a fluid is defined with the local property of being an amount of mass or energy per unit volume and having a velocity vector at each point). Let us assume conservation of mass/energy in a fixed volume V. So we have

E = ∫∫∫_V ρ dV

where ρ denotes mass/energy density and E is the total mass/energy in the volume. The change in energy over time depends on how much energy enters or leaves the volume V.
Hence if J denotes the energy velocity field (in the sense that J = ρv), then

∂E/∂t = −∫∫_{∂V} J · dS.
That is, a negative change in energy corresponds to how much of the energy velocity field hits the boundary of the volume (i.e. exits the volume). We use partial derivative notation since we think of the density as a function ρ : R⁴ → R with a time variable as well. By Gauss' theorem we have

−∫∫∫_V div J dV = −∫∫_{∂V} J · dS = ∂E/∂t = ∫∫∫_V ∂ρ/∂t dV.

Hence

∂ρ/∂t = −div J.
This is equivalent to saying

0 = ∂ρ/∂t + div(ρv) = ∂ρ/∂t + ∇ρ · v + ρ div v.
This is called the continuity equation. As a function of four variables, the total derivative of the energy density with respect to time is

dρ/dt = ∂ρ/∂t dt/dt + ∂ρ/∂x dx/dt + ∂ρ/∂y dy/dt + ∂ρ/∂z dz/dt
      = ∂ρ/∂t + ∇ρ · v
      = −ρ div v.
Now we can define the vorticity of the fluid velocity field v as its curl. We will write
ω = curl v.
Suppose we have a subvolume in which the vorticity of the fluid is nonzero (which we will call a vortex). Then since div ω = 0, it follows that the rate of change in density of the vortex is 0 by applying the equation we derived above. We can however ask how the vortex changes as a whole over time in the fluid. A similar equation can be derived from the Navier-Stokes equation for a Newtonian fluid:

dω/dt = ∂ω/∂t + ∇ω · v.
Since v : R³ → R³, we also have ω : R³ → R³, so what does ∂ω/∂t mean? Moreover, what is the gradient of a vector field? Suppose we have a collection of velocity fields {vt}. Then we have a collection of vorticity fields {ωt = curl vt}. We can in turn think of v and ω as functions v, ω : R⁴ → R³. Correspondingly we will have

∂ω/∂t = ∂/∂t ( ω₁(x, y, z, t), ω₂(x, y, z, t), ω₃(x, y, z, t) ) = ( ∂ω₁/∂t, ∂ω₂/∂t, ∂ω₃/∂t ),

while ∇ω · v is computed by dotting v against the rows of the Jacobian of ω, whose entries are

∂ωᵢ/∂xⱼ = ωix if j = 1, ωiy if j = 2, ωiz if j = 3.
This equation, called the vorticity equation, describes how a vortex changes over time as it moves through a velocity field. It gives the componentwise equations

dωᵢ/dt = ∂ωᵢ/∂t + Σ_{j=1}^3 vⱼ ∂ωᵢ/∂xⱼ.
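A small symbolic sketch (the velocity field is hypothetical) computing ω = curl v, confirming div ω = 0, and evaluating the transport term Σⱼ vⱼ ∂ωᵢ/∂xⱼ:

    # Vorticity of a sample time-dependent velocity field.
    import sympy as sp

    x, y, z, t = sp.symbols('x y z t')
    v = sp.Matrix([-y * sp.exp(-t), x * sp.exp(-t), sp.sin(z)])  # swirling flow

    omega = sp.Matrix([sp.diff(v[2], y) - sp.diff(v[1], z),
                       sp.diff(v[0], z) - sp.diff(v[2], x),
                       sp.diff(v[1], x) - sp.diff(v[0], y)])
    print(omega.T)  # (0, 0, 2*exp(-t)): the vortex decays in time

    div_omega = sum(sp.diff(omega[i], s) for i, s in enumerate((x, y, z)))
    assert sp.simplify(div_omega) == 0  # div curl v = 0, as always

    transport = sp.Matrix([sum(v[j] * sp.diff(omega[i], s)
                               for j, s in enumerate((x, y, z)))
                           for i in range(3)])
    print(transport.T)  # zero here, since omega has no spatial dependence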
7.2 Electrodynamics
In electrodynamics we have two vector fields E, B : R3 → R3 called the electric and
magnetic fields respectively. Contributions from Gauss, Faraday, Ampère, and Maxwell
led to the discovery of what are called Maxwell's equations:

div E = ρ/ε₀
div B = 0
curl E = −∂B/∂t
curl B = µ₀J + µ₀ε₀ ∂E/∂t

where the partial derivatives with respect to time are defined as before for fluid velocity fields, ρ = dQ/dV is the charge density, J is the current density (current per unit area), ε₀ is the permittivity of free space (or electric constant), and µ₀ is the permeability of free space (or magnetic constant).
Integrating the first equation over a closed volume and applying Gauss' theorem gives us

∫∫∫_V div E dV = ∫∫_{∂V} E · dS = ∫∫∫_V (ρ/ε₀) dV = Q/ε₀

where S = ∂V and Q is the total charge in V. Similarly for the second equation we obtain

∫∫_{∂V} B · dS = 0.

Integrating the third equation over a surface S bounded by a curve γ and applying the Kelvin-Stokes theorem gives

∫_γ E · dγ = −∫∫_S ∂B/∂t · dS,
where γ is a simple closed curve on the boundary of V. Similarly for the fourth equation
one can obtain
∫_γ B · dγ = ∫∫_S ( µ₀J + µ₀ε₀ ∂E/∂t ) · dS = µ₀I_S + µ₀ε₀ ∫∫_S ∂E/∂t · dS

where I_S is the net current through the surface. This gives us the integral forms of Maxwell's equations:

∫∫_{∂V} E · dS = Q/ε₀

∫∫_{∂V} B · dS = 0

∫_γ E · dγ = −∫∫_S ∂B/∂t · dS = −(d/dt) ∫∫_S B · dS

∫_γ B · dγ = µ₀I_S + µ₀ε₀ ∫∫_S ∂E/∂t · dS = µ₀I_S + µ₀ε₀ (d/dt) ∫∫_S E · dS
where γ is a simple closed curve bounding the surface S. If we assume our volume has no charge or current, then ρ = 0 and J = 0, and Maxwell's equations are

div E = 0
div B = 0
curl E = −∂B/∂t
curl B = µ₀ε₀ ∂E/∂t = (1/c²) ∂E/∂t

where c is the speed of an electromagnetic wave in a vacuum, since

c = 1/√(µ₀ε₀).
When there is no charge or current, we also have

curl²(E) = ∇(div E) − ∇²E = −∇²E

and

curl²(E) = curl(curl E) = curl( −∂B/∂t ) = −∂/∂t (curl B) = −(1/c²) ∂²E/∂t².

So

∂²E/∂t² = c²∇²E.
Also

curl²(B) = ∇(div B) − ∇²B = −∇²B

and

curl²(B) = curl(curl B) = (1/c²) curl( ∂E/∂t ) = (1/c²) ∂/∂t (curl E) = −(1/c²) ∂²B/∂t².

So we also have

∂²B/∂t² = c²∇²B.
Hence E and B both satisfy the equation

∂²ψ/∂t² = c²∇²ψ
for a constant c and vector field ψ. This equation is called the wave equation, and has
important applications in other areas of physics as well.
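As a closing sketch, a one-dimensional plane wave ψ(x, t) = sin(kx − ckt) can be checked against the wave equation symbolically:

    # Verify that a plane wave solves psi_tt = c^2 psi_xx.
    import sympy as sp

    x, t, k, c = sp.symbols('x t k c')
    psi = sp.sin(k * x - c * k * t)
    residual = sp.diff(psi, t, 2) - c ** 2 * sp.diff(psi, x, 2)
    assert sp.simplify(residual) == 0
    print("plane wave satisfies the wave equation")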