Type classification: this resource is a course.

Subject classification: this is a physics resource.

Subject classification: this is a mathematics resource.

Educational level: this is a secondary education resource.

Educational level: this is a tertiary (university) resource.

Multivariable calculus (also known as multivariate calculus) is the study of problems of differential and integral calculus in situations in which the number of variables being differentiated with respect to, or integrated over, is greater than one. Simply put, it is calculus in multiple dimensions.

It is a field rich in theoretical issues and even richer in practical applications. In particular, it relates to vectors, areas, volumes, curves, surfaces, and their analogs in dimensions even higher than the familiar three.

MV calculus is a rather advanced course. A thorough understanding of it involves a number of prerequisite fields, with which the student will need to become familiar. They will not be covered in this course, though they will be covered elsewhere in Wikiversity. They include:

Differential and integral calculus, of course. The student will be required to evaluate definite integrals, either through knowledge of the subject, or through computer programs or web sites that can perform integration. Some integrators are listed at the end of this article.
The various theoretical mathematical issues that go with a good understanding of calculus, such as limits and elementary set theory.
Finite-dimensional linear algebra—vectors, linear transformations, matrices, inversion, determinants, and orthogonal matrices.
The basic notion of using different coordinate systems (such as polar coordinates) to describe a space. This concept is central to the subject. Much of what we do will involve operating in a coordinate system suited to the problem at hand.
For the physical applications that will be covered, an understanding of the underlying physics. This may include, at various times, vectors, electric and magnetic fields, velocity, and momentum.

MV calculus is a wide-ranging subject. It can involve very abstract and theoretical issues, as well as intuitive geometrical ones. This course will emphasize the latter. In particular, it will not cover:

The inverse function theorem and implicit function theorem.
Fubini's theorem. We will give plausible geometrical arguments for multiple integration, and let it go at that. We will not give rigorous proofs.
Measure theory.
Issues of repeated differentiability and "smoothness".
Various theoretical considerations of "pathological" situations such as divergent integrals or non-orientable manifolds.

We will not give rigorous proofs of the various theorems here. Such proofs are taught in college-level analysis and differential geometry courses, and often dwell upon esoteric issues of singularities and infinities.

Integration

In multivariable calculus, and in the related areas of physics, we will be extending the notion of the integral well beyond the simple case of the Riemann Integral from elementary calculus. In fact, integration will be the main subject of this course.

These extensions will consist of

Integration over higher-dimensional regions, such as surfaces and volumes, with sophisticated ways of specifying these regions.
Integration of functions, typically representing physical quantities, such as integration of density to get total mass, or integration of a gravitational or electric field along a path to get potential energy or electric potential.
Integration of vector fields, for example, a magnetic field, across a surface to get the total flux.
Integration in coordinate systems other than the familiar Cartesian coordinates, such that the physically correct result is always obtained, independently of the coordinate system used.

In all cases, the additional sophistication in the integrals will just involve things that are done to set up the mathematical problem. The actual integration will always be the same: one-dimensional Riemann integration of some function. That is, the integration that we will do will always reduce to the familiar definite integration from elementary calculus:

\int _{a}^{b}f(x)dx

Recall that the solution to such a problem is easy to state, though maybe not easy to solve: Find a function $g(x)$ such that $f$ is the derivative of $g$ :

f(x)={\frac {dg}{dx}}

Then the integral is the difference between the values of $g$ at $a$ and $b$ :

\int _{a}^{b}f(x)dx=g(b)-g(a)

We will generalize the familiar integration of mathematical functions defined on intervals of the real numbers, to integration over geometric curves, surfaces, and volumes. We will take a slightly different viewpoint from that of elementary calculus. Instead of integrating a mathematical function—an act which gets the area under the graph of the function—the focus will be on integration over geometric regions. This is just a change in the way we think about these operations—when solving actual problems to get actual answers, we will still perform integration in the usual way.

The first thing we do is develop a more sophisticated way of measuring the area of a plane figure. We know that the area under the graph of a function can be obtained by calculating the definite integral:

A=\int _{a}^{b}f(x)dx

This can give us the area of a figure with a straight horizontal bottom edge, straight vertical left and right edges, and an arbitrary function giving the top edge. It adds up the areas of tiny vertical strips, where the width of each strip is the mysterious symbol "dx" and the height is $f(x)$ . We could replace $f(x)$ with

\int _{0}^{f(x)}1\ dy

so we have

A=\int _{a}^{b}\left(\int _{0}^{f(x)}1\ dy\right)dx

This is two nested integrals, or a double integral. The quantity in parentheses is the integrand of the outer integral. As such, it is a function of x, so its limits are permitted to depend on x. In more complicated problems, such as finding the total electric charge, we might replace the inner integrand ("1" in the present example) with the density of electric charge, which could be a function of x and y.

The big parentheses are usually omitted, but their implicit meaning must be followed. The innermost integral, and its limits of integration, relate to the innermost "d" symbol, and so on.

In general, the area of a region is:

\int _{\text{leftmost point of the region}}^{\text{rightmost point of the region}}\left(\int _{\text{bottom of the region at a given x}}^{\text{top of the region at a given x}}\ 1\ dy\right)dx

This is the essence of how double (and later, triple) integrals are used to calculate areas, volumes, and other things over 2 or 3-dimensional regions.

A triple integral could be written:

\int _{A}^{B}\int _{C(x)}^{D(x)}\int _{E(x,y)}^{F(x,y)}\ G(x,y,z)\ dz\ dy\ dx

Each successive integral may use the values of all outer integration variables in specifying its limits. The limits of the outermost integral must be constants.
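As a rough illustration of how the limits nest, here is a minimal Python sketch that evaluates an iterated integral numerically with the midpoint rule. The helper names `midpoint` and `triple_integral`, and the tetrahedron used as the region, are assumptions made for this example and are not part of the course material.

```python
def midpoint(f, a, b, n=60):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def triple_integral(G, A, B, C, D, E, F, n=60):
    """Iterated integral of G(x, y, z) dz dy dx.

    A and B are constants, C and D are functions of x,
    and E and F are functions of x and y.
    """
    def inner_z(x, y):
        # Innermost integral: its limits may depend on x and y.
        return midpoint(lambda z: G(x, y, z), E(x, y), F(x, y), n)

    def middle_y(x):
        # Middle integral: its limits may depend on x only.
        return midpoint(lambda y: inner_z(x, y), C(x), D(x), n)

    # Outermost integral: constant limits.
    return midpoint(middle_y, A, B, n)


# Hypothetical region: the tetrahedron 0 <= x <= 1, 0 <= y <= 1 - x,
# 0 <= z <= 1 - x - y. Integrating 1 over it gives its volume.
tetra_volume = triple_integral(
    lambda x, y, z: 1.0,
    0.0, 1.0,
    lambda x: 0.0, lambda x: 1.0 - x,
    lambda x, y: 0.0, lambda x, y: 1.0 - x - y,
)
print(tetra_volume)  # should be close to the exact volume 1/6
```

Replacing the integrand `1.0` with a density function would give a total mass or charge instead of a volume.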

While this use of double integrals to calculate areas may seem like excessive make-work, and using "1" as the integrand may seem boring, it allows us to find the areas of many (but not all^[1]) regions.

The blue area is the integral of g, and the blue and yellow areas combined are the integral of f. The yellow area is the desired result.

If $f(x)$ and $g(x)$ give the upper and lower edges of a region, its area is:

A=\int _{a}^{b}\int _{g(x)}^{f(x)}1\ dy\ dx

Suppose we have a circle of radius R centered at (A, B). The integration limits are A-R and A+R. The upper function is:

B+{\sqrt {R^{2}-(x-A)^{2}}}

and the lower function is:

B-{\sqrt {R^{2}-(x-A)^{2}}}

The area of the circle is:

\int _{A-R}^{A+R}\int _{B-{\sqrt {R^{2}-(x-A)^{2}}}}^{B+{\sqrt {R^{2}-(x-A)^{2}}}}1\ dy\ dx=\int _{A-R}^{A+R}\ 2\ {\sqrt {R^{2}-(x-A)^{2}}}\ dx

We use the change-of-variable theorem to shift x, using the substitution x = A + u:

\int _{-R}^{R}\ 2\ {\sqrt {R^{2}-u^{2}}}\ du

(We'll have a lot more to say about the change-of-variable theorem later, but now we are just using the version from elementary calculus, that helps us calculate ordinary integrals.) When we evaluate this integral, we get:

\left[u{\sqrt {R^{2}-u^{2}}}+R^{2}\sin ^{-1}(u/R)\right]_{-R}^{R}=\pi R^{2}

Approximating a plane region with tiny squares.

The geometrical interpretation is that we have divided the region into thin vertical strips abutting each other from left to right, and then divided each strip into tiny squares, abutting each other from bottom to top, and added up all the areas. A better way to look at this is that we have simply divided the figure into tiny square or rectangular pieces (tiny parallelograms will in fact turn out to be the best way to think about this) and added up their areas.

We could have performed the double integration in the other order, dividing the circle into horizontal strips first, and then subdividing those, so that dx is the "inner" integral and dy the "outer" one. This would have gotten the same result. The theorem that says that this is so is Fubini's theorem. Proving it is outside the scope of this course.
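The claim that both orders of integration give the same area can be spot-checked numerically. The following is only a sketch under stated assumptions: the function names are made up for this example, the circle parameters are arbitrary, and the midpoint rule stands in for exact integration.

```python
import math


def midpoint(f, a, b, n=300):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def circle_area_dy_dx(R, A=0.0, B=0.0, n=300):
    """Vertical strips: the inner integral runs over y, the outer over x."""
    def strip(x):
        half = math.sqrt(max(R * R - (x - A) ** 2, 0.0))
        return midpoint(lambda y: 1.0, B - half, B + half, n)
    return midpoint(strip, A - R, A + R, n)


def circle_area_dx_dy(R, A=0.0, B=0.0, n=300):
    """Horizontal strips: the inner integral runs over x, the outer over y."""
    def strip(y):
        half = math.sqrt(max(R * R - (y - B) ** 2, 0.0))
        return midpoint(lambda x: 1.0, A - half, A + half, n)
    return midpoint(strip, B - R, B + R, n)


# Arbitrary example: radius 2, centered at (1, -3).
area_xy = circle_area_dy_dx(2.0, 1.0, -3.0)
area_yx = circle_area_dx_dy(2.0, 1.0, -3.0)
print(area_xy, area_yx, math.pi * 2.0 ** 2)  # all three should be close
```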

Now, given that Fubini's theorem says that the order of the integrations doesn't matter, we can "abstract away" that order, say that we are really just "integrating over a region", and use a more abstract notation. Once we have defined a 2-dimensional region R, we can just write two successive integral signs with the subscript R instead of the written-out limits of integration, and use the symbol "dA" to mean "infinitesimal piece of area". (Recall that "dx" means, in an informal way, "infinitesimal bit of length".) The integral would look like:

\iint _{R}\ 1\ dA

or, more generally, if $f$ is some function that we want to integrate (for example, density):

\iint _{R}\ f(p)\ dA

where p is some way of indicating a point in the region. We will always use a coordinate system, so, for example, if we are using polar coordinates, the integral might look like:

\iint _{R}\ f(r,\theta )\ dA

For triple integrals over a volume, we do something similar:

\iiint _{R}\ f(r,\theta ,\phi )\ dV

The symbols dA and dV are often called the "area element" and "volume element", respectively.

Obviously, we have more work to do before we can integrate over regions described by coordinate systems other than Cartesian coordinates.

Digression: Other notation for integrals.

People sometimes use various creative symbols to describe particular types of regions. Don't worry about these notations—we are listing them here to explain things that you might see elsewhere.

\oint _{C}

means a 1-dimensional region, that is, an arc, $C$ , whose end point is the same as its starting point. That is, the arc is a closed loop.

\oiint _{S}

means a 2-dimensional region, that is, a surface, $S$ , that is the boundary of some 3-dimensional volume.

\int _{\partial R}

(that's right; a partial derivative symbol!) means the boundary of $R$ . So, for example,

\iint _{\partial V}

is the integral over the surface of the volume $V$ . This notation arises in the general form of Stokes' theorem.

Change of Variable; Change of Coordinate System

The change-of-variable theorem is an extremely important tool of integral calculus. It is even more important in multivariate calculus, since it is central to the concepts of coordinate systems and coordinate system change operations. It is implicit in nearly all integrals over curves, surfaces, and volumes.

We would like to be able to perform the integrations of the previous section in coordinate systems other than plain Cartesian coordinates. In fact, the use of arbitrary coordinate systems (often called "curvilinear coordinates") is the central object of this course. For example, the integral to find the area of a circle would be easier to set up if we were using polar coordinates, because a circle is trivial to describe in polar coordinates. To do this, we need to revisit the change-of-variable theorem from elementary calculus, and extend it to higher dimensions.

Basically, the change-of-variable theorem of elementary calculus says that algebraic substitution and manipulations actually work, even when the manipulations involve the "d" symbol. (Remember that things like "dx" are infinitesimals; their actual value would have to be considered to be zero. By a miracle of calculus notation, we can manipulate them anyway.)

Here is a simple example, not (yet) involving integrals. Suppose y is a function of u, and u is a function of x, as follows:

y=\log u\,

u=\sin x\,

We know that

{\frac {dy}{du}}={\frac {1}{u}}

{\frac {du}{dx}}=\cos x

Can we get ${\frac {dy}{dx}}$ from this? The change-of-variable theorem says that we can multiply the derivatives and "cancel" the du:

{\frac {dy}{dx}}={\frac {dy}{du}}{\frac {du}{dx}}={\frac {1}{u}}\ \cos x

We can remove u from this, getting the answer in terms of x:

{\frac {dy}{dx}}={\frac {\cos x}{\sin x}}=\cot x

This is the same answer as the chain rule would have given:

y=\log(\sin x)\,

{\frac {dy}{dx}}={\frac {1}{\sin x}}\cos x=\cot x

It's easy to see why this is true: the multiplication in the chain rule is just the multiplication from which we canceled the du.
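The cancellation can be checked numerically at a single point. This is a small sketch, with the sample point and the finite-difference helper `derivative` chosen only for illustration.

```python
import math


def y_of_x(x):
    """The chained function y = log(sin x)."""
    return math.log(math.sin(x))


def derivative(f, x, h=1e-6):
    """Central-difference estimate of df/dx."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


x = 0.7                      # arbitrary sample point with sin x > 0
u = math.sin(x)
dy_du = 1.0 / u              # derivative of log u
du_dx = math.cos(x)          # derivative of sin x
product = dy_du * du_dx      # "cancel" the du
numeric = derivative(y_of_x, x)
cot_x = math.cos(x) / math.sin(x)
print(product, numeric, cot_x)  # the three values should agree closely
```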

We can do the same thing for integrals, observing why the "dx" symbol in an integral is such an important part of the notation that integrals use. Given:

\int \cot x\ dx

we introduce a new variable u:

u=\sin x\,

so the integral is

\int {\frac {\cos x}{u}}dx

We have

{\frac {du}{dx}}=\cos x

So, taking the usual liberties with the notation:

dx={\frac {du}{\cos x}}

so the integral is:

\int {\frac {\cos x}{u}}{\frac {du}{\cos x}}=\int {\frac {du}{u}}=\log u+C=\log \sin x+C

Tricks like this are staples of integration technique.
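To see that the substitution really does preserve the value of a definite integral, one can evaluate both forms numerically. The interval below is an arbitrary choice on which sin x is positive and increasing, and `midpoint` is a hypothetical helper written for this sketch.

```python
import math


def midpoint(f, a, b, n=2000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


a, b = 0.5, 1.5  # arbitrary limits in x

# Original integral, in the variable x.
in_x = midpoint(lambda x: math.cos(x) / math.sin(x), a, b)

# After substituting u = sin x: integrand 1/u, limits sin(a) and sin(b).
in_u = midpoint(lambda u: 1.0 / u, math.sin(a), math.sin(b))

# Antiderivative log(sin x) evaluated at the limits.
exact = math.log(math.sin(b)) - math.log(math.sin(a))
print(in_x, in_u, exact)  # all three should be close
```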

Why do we consider the change-of-variable theorem from elementary calculus to be so important? Because a change of variable is actually a change of coordinate system.

When we were integrating

\int \cot x\ dx

and we made the substitution

u=\sin x\,

we were actually changing from the coordinate system $x$ on our "space" to the new coordinate system $u$ .

So, to review what the change-of-variable procedure does, if $x$ is one coordinate system and $u$ is another, and we want to calculate

\int _{R}f(x)\ dx

where $f$ is a function of position in "space" (1-dimensional) and $R$ is a description of the region in terms of $x$ , we have

\int _{R}f(x)\ dx=\int _{R}f(u){\frac {dx}{du}}\ du

The integral on the right is an integral in terms of $u$ . The region $R$ is defined in terms of $u$ in determining the limits of the integration, and $f(u)$ is the same function of position, defined in terms of $u$ .

Let's try this for an example in two dimensions. Suppose $R$ is a unit disk, which happens to be more dense at the edges than in the center, and $f({\text{position}})$ is the density at a given point. The density is the square of the distance from the center.

f(x,\ y)=x^{2}+y^{2}\,

To find the mass of the whole disk we need to get

M=\iint _{R}f(x,y)\ dy\ dx=\int _{-1}^{1}\int _{-{\sqrt {1-x^{2}}}}^{\sqrt {1-x^{2}}}(x^{2}+y^{2})\ dy\ dx

We know the coordinate change functions between Cartesian and polar coordinates:

x=r\cos \theta \quad \quad \quad r={\sqrt {x^{2}+y^{2}}}

y=r\sin \theta \quad \quad \quad \theta =\tan ^{-1}(y/x)

We rewrite the integral using the change-of-variable theorem:

M=\iint _{R}f(r,\theta ){\frac {d(x,y)}{d(r,\theta )}}\ d\theta \ dr

We need to figure out what

{\frac {d(x,y)}{d(r,\theta )}}

must mean. We will work this out shortly, but here is a sneak peek. It will be called the Jacobian determinant, or just Jacobian, denoted $J\langle xy/r\theta \rangle \,$ , or just $\mathbf {J}$ when the context is clear. In this case its value will be $r$ (we will explain that soon), so:

M=\int _{0}^{1}\int _{0}^{2\pi }f(r,\theta )\ J\ d\theta \ dr=\int _{0}^{1}\int _{0}^{2\pi }r^{2}\ r\ d\theta \ dr=\int _{0}^{1}2\pi r^{3}\ dr=2\pi {\frac {r^{4}}{4}}{\Bigg |}_{0}^{1}{\Bigg .}={\frac {\pi }{2}}
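As a sanity check on the value π/2, the mass can be approximated numerically in both coordinate systems. The sketch below assumes the Jacobian value r quoted above, and uses a simple midpoint rule; the function names are illustrative only.

```python
import math


def midpoint(f, a, b, n=300):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def mass_cartesian(n=300):
    """Integrate x^2 + y^2 over the unit disk using dy dx."""
    def strip(x):
        top = math.sqrt(max(1.0 - x * x, 0.0))
        return midpoint(lambda y: x * x + y * y, -top, top, n)
    return midpoint(strip, -1.0, 1.0, n)


def mass_polar(n=300):
    """Integrate r^2 times the Jacobian r using d(theta) dr."""
    def ring(r):
        return midpoint(lambda theta: r * r * r, 0.0, 2.0 * math.pi, n)
    return midpoint(ring, 0.0, 1.0, n)


m_cart = mass_cartesian()
m_polar = mass_polar()
print(m_cart, m_polar, math.pi / 2.0)  # all three should be close
```

The polar version has constant limits and a smooth integrand, which is why it is both easier to set up and more accurate for the same number of sample points.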

Partial Derivatives, the Chain Rule, and the Jacobian Matrix

To make sense of the mysterious

{\frac {d(x,y)}{d(r,\theta )}}

we are going to have to extend the concept of derivatives to multiple dimensions. When a function depends on multiple arguments the partial derivative is required. This is the derivative of a function with respect to a particular one of its arguments, while holding all the other arguments fixed.

For example, if $f(x,\ y)$ is a function of two arguments (for example, it's the temperature measured over a disk or other 2-dimensional object), then it has the two partial derivatives:

\displaystyle {\frac {\partial f}{\partial x}}\ \ \ {\text{and}}\ \ \ \displaystyle {\frac {\partial f}{\partial y}}

When we change coordinate systems, we have N functions of N arguments, where N is the dimension of the space. (N will be 2 or 3 for the cases we handle, but the methods work for any N.)

For example, when converting between 2-dimensional polar coordinates and Cartesian coordinates, we have $x$ and $y$ , each of which is a function of $r$ and $\theta$ , and vice-versa:

x=r\cos \theta \,

y=r\sin \theta \,

r={\sqrt {x^{2}+y^{2}}}\,

\theta =\tan ^{-1}(y/x)\,

In 3-dimensional spherical coordinates, we have $x$ , $y$ and $z$ , each of which is a function of $r$ , $\theta$ and $\phi$ , and vice-versa:

x=r\sin \theta \cos \phi \,

y=r\sin \theta \sin \phi \,

z=r\cos \theta \,

r={\sqrt {x^{2}+y^{2}+z^{2}}}\,

\theta =\tan ^{-1}({\sqrt {x^{2}+y^{2}}}/z)\,

\phi =\tan ^{-1}(y/x)\,

It is useful to put the $N^{2}$ partial derivatives (4 when N is 2, 9 when N is 3) into an N×N matrix, like this:

{\begin{bmatrix}\displaystyle {\frac {\partial x}{\partial r}}&\displaystyle {\frac {\partial x}{\partial \theta }}\\\\\displaystyle {\frac {\partial y}{\partial r}}&\displaystyle {\frac {\partial y}{\partial \theta }}\end{bmatrix}}\,

Or, going in the other direction:

{\begin{bmatrix}\displaystyle {\frac {\partial r}{\partial x}}&\displaystyle {\frac {\partial r}{\partial y}}\\\\\displaystyle {\frac {\partial \theta }{\partial x}}&\displaystyle {\frac {\partial \theta }{\partial y}}\end{bmatrix}}\,

This is the derivative matrix of the transformation from one coordinate system to the other. It is often called the Jacobian matrix, though some authors use the term "Jacobian" specifically to refer to the matrix's determinant. To avoid any confusion on this point, we will always put square brackets around matrices, and use boldface letters to denote determinants.

The notation for the Jacobian matrix from the coordinate system $x_{1},x_{2},...x_{n}\,$ to $f_{1},f_{2},...f_{n}\,$ is:

J\langle f_{1}f_{2}...f_{n}/x_{1}x_{2}...x_{n}\rangle ={\frac {\partial (f_{1},f_{2},...,f_{n})}{\partial (x_{1},x_{2},...,x_{n})}}={\begin{bmatrix}\displaystyle {\frac {\partial f_{1}}{\partial x_{1}}}&...&\displaystyle {\frac {\partial f_{1}}{\partial x_{n}}}\\...&&...\\\displaystyle {\frac {\partial f_{n}}{\partial x_{1}}}&...&\displaystyle {\frac {\partial f_{n}}{\partial x_{n}}}\end{bmatrix}}

and the determinant is:

\mathbf {J} ={\begin{vmatrix}\displaystyle {\frac {\partial f_{1}}{\partial x_{1}}}&...&\displaystyle {\frac {\partial f_{1}}{\partial x_{n}}}\\...&&...\\\displaystyle {\frac {\partial f_{n}}{\partial x_{1}}}&...&\displaystyle {\frac {\partial f_{n}}{\partial x_{n}}}\end{vmatrix}}

For 2-dimensional polar coordinates, the Jacobian matrices are:

J\langle xy/r\theta \rangle ={\frac {\partial (x,y)}{\partial (r,\theta )}}={\begin{bmatrix}\displaystyle {\frac {\partial x}{\partial r}}&\displaystyle {\frac {\partial x}{\partial \theta }}\\[2.5ex]\displaystyle {\frac {\partial y}{\partial r}}&\displaystyle {\frac {\partial y}{\partial \theta }}\end{bmatrix}}={\begin{bmatrix}\cos \theta &-r\sin \theta \\\sin \theta &r\cos \theta \end{bmatrix}}={\begin{bmatrix}\displaystyle {\frac {x}{\sqrt {x^{2}+y^{2}}}}&-y\\[2.5ex]\displaystyle {\frac {y}{\sqrt {x^{2}+y^{2}}}}&x\end{bmatrix}}

with determinant $r\,$ , and:

J\langle r\theta /xy\rangle =\displaystyle {\frac {\partial (r,\theta )}{\partial (x,y)}}={\begin{bmatrix}\displaystyle {\frac {\partial r}{\partial x}}&\displaystyle {\frac {\partial r}{\partial y}}\\[2.5ex]\displaystyle {\frac {\partial \theta }{\partial x}}&\displaystyle {\frac {\partial \theta }{\partial y}}\end{bmatrix}}={\begin{bmatrix}\displaystyle {\frac {x}{\sqrt {x^{2}+y^{2}}}}&\displaystyle {\frac {y}{\sqrt {x^{2}+y^{2}}}}\\[2.5ex]\displaystyle {\frac {-y}{x^{2}+y^{2}}}&\displaystyle {\frac {x}{x^{2}+y^{2}}}\end{bmatrix}}={\begin{bmatrix}\cos \theta &\sin \theta \\\\\displaystyle {\frac {-\sin \theta }{r}}&\displaystyle {\frac {\cos \theta }{r}}\end{bmatrix}}

with determinant $1/r\,$ .

For 3-dimensional spherical coordinates, the Jacobian matrices are:

J\langle xyz/r\theta \phi \rangle ={\begin{bmatrix}\sin \theta \cos \phi &r\cos \theta \cos \phi &-r\sin \theta \sin \phi \\\sin \theta \sin \phi &r\cos \theta \sin \phi &r\sin \theta \cos \phi \\\cos \theta &-r\sin \theta &0\end{bmatrix}}

with determinant $r^{2}\sin \theta \,$ , and:

J\langle r\theta \phi /xyz\rangle ={\begin{bmatrix}\sin \theta \cos \phi &\sin \theta \sin \phi &\cos \theta \\\\\displaystyle {\frac {\cos \theta \cos \phi }{r}}&\displaystyle {\frac {\cos \theta \sin \phi }{r}}&-\displaystyle {\frac {\sin \theta }{r}}\\\\-\displaystyle {\frac {\sin \phi }{r\sin \theta }}&\displaystyle {\frac {\cos \phi }{r\sin \theta }}&0\end{bmatrix}}

with determinant $1/(r^{2}\sin \theta )\,$ .

The chain rule of elementary calculus tells us that, if two functions are chained together in a sequence, the derivative of the final chained function is the product of the individual derivatives. If $f\,$ is a function of $g\,$ , with derivative $df/dg\,$ , and $g\,$ is a function of $x\,$ , with derivative $dg/dx\,$ , then the derivative of $f\,$ as a function of $x\,$ is:

{\frac {df}{dx}}={\frac {df}{dg}}{\frac {dg}{dx}}

A miracle of calculus notation makes this all look straightforward.

In multiple dimensions, we might have $f_{1}...f_{n}\,$ as functions of $g_{1}...g_{n}\,$ , which are themselves functions of $x_{1}...x_{n}\,$ . This means that $f_{1}...f_{n}\,$ are functions of $x_{1}...x_{n}\,$ . What are their partial derivatives? What is the Jacobian matrix?

The Chain Rule for Partial Derivatives tells us that:

\displaystyle {\frac {\partial f_{i}}{\partial x_{j}}}=\sum _{k=1}^{n}\displaystyle {\frac {\partial f_{i}}{\partial g_{k}}}\displaystyle {\frac {\partial g_{k}}{\partial x_{j}}}

In terms of the matrices, this means that:

[J\langle f/x\rangle ]_{ij}=\sum _{k=1}^{n}[J\langle f/g\rangle ]_{ik}[J\langle g/x\rangle ]_{kj}

But that's just the formula for matrix multiplication!

So we have

[J\langle f/x\rangle ]=[J\langle f/g\rangle ][J\langle g/x\rangle ]\,

This is the chain rule for multivariable calculus.
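The matrix form of the chain rule can be spot-checked with finite differences. In this sketch the inner map g is the polar-to-Cartesian conversion and the outer map f is an arbitrary made-up function; `jacobian` and `matmul` are hypothetical helpers written for the example, and the comparison is only as good as the finite-difference approximation.

```python
import math


def jacobian(func, point, h=1e-6):
    """Central-difference estimate of the Jacobian matrix of func at point."""
    n = len(point)
    J = [[0.0] * n for _ in range(n)]
    for j in range(n):
        plus = list(point)
        minus = list(point)
        plus[j] += h
        minus[j] -= h
        f_plus, f_minus = func(plus), func(minus)
        for i in range(n):
            J[i][j] = (f_plus[i] - f_minus[i]) / (2.0 * h)
    return J


def matmul(P, Q):
    """Product of two square matrices stored as lists of rows."""
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def g_of_x(p):
    """Inner map: polar (r, theta) to Cartesian (g1, g2)."""
    r, theta = p
    return [r * math.cos(theta), r * math.sin(theta)]


def f_of_g(p):
    """Outer map, chosen arbitrarily for illustration."""
    g1, g2 = p
    return [g1 * g1 - g2 * g2, 2.0 * g1 * g2]


def f_of_x(p):
    """The chained map f(g(x))."""
    return f_of_g(g_of_x(p))


x0 = [1.5, 0.4]                        # arbitrary sample point (r, theta)
J_fx = jacobian(f_of_x, x0)            # Jacobian of the chained map
J_fg = jacobian(f_of_g, g_of_x(x0))    # evaluated at the intermediate point
J_gx = jacobian(g_of_x, x0)
chained = matmul(J_fg, J_gx)
max_diff = max(abs(J_fx[i][j] - chained[i][j])
               for i in range(2) for j in range(2))
print(max_diff)  # small, limited by finite-difference error
```

Note that the outer Jacobian has to be evaluated at the intermediate point g(x), not at x itself.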

It follows from this that, if the second change is just the inverse of the first, bringing us back to the original coordinate system, the two matrices are inverses. And their determinants are reciprocals of each other.

Exercise: Check that for the matrices referred to above, going back and forth between Cartesian and polar/spherical coordinates.
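A numerical spot check is not a proof, but it can catch transcription errors before attempting the algebra. The sketch below copies the two spherical-coordinate matrices given earlier, multiplies them at one arbitrarily chosen point, and compares the determinants; the helper names are assumptions made for this example.

```python
import math


def matmul(P, Q):
    """Product of two square matrices stored as lists of rows."""
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))


def spherical_jacobians(r, theta, phi):
    """The matrices J<xyz/r theta phi> and J<r theta phi/xyz> from the text."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    forward = [[st * cp, r * ct * cp, -r * st * sp],
               [st * sp, r * ct * sp, r * st * cp],
               [ct, -r * st, 0.0]]
    backward = [[st * cp, st * sp, ct],
                [ct * cp / r, ct * sp / r, -st / r],
                [-sp / (r * st), cp / (r * st), 0.0]]
    return forward, backward


r, theta, phi = 2.0, 0.7, 1.1  # arbitrary point away from the poles
forward, backward = spherical_jacobians(r, theta, phi)
product = matmul(forward, backward)
identity_error = max(abs(product[i][j] - (1.0 if i == j else 0.0))
                     for i in range(3) for j in range(3))
det_forward = det3(forward)
det_backward = det3(backward)
print(identity_error)                             # close to 0
print(det_forward, r * r * math.sin(theta))       # should agree
print(det_forward * det_backward)                 # close to 1
```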

The coordinates should always be chosen such that the Jacobian determinant is positive. If it isn't, exchange the order of two of the coordinates. This ensures that the coordinate system is "right handed". (Or more precisely, that it has the same handedness as the Cartesian system.) As an example, the coordinate order $(r,\theta )\,$ is correct for polar coordinates; $(\theta ,r)\,$ is not. The order $(r,\theta ,\phi \,)$ is correct for spherical coordinates; $(r,\phi ,\theta )\,$ is not.

The Jacobian determinant must not be zero. (Equivalently, the Jacobian matrix must not be singular.) The astute reader will notice that this means that polar coordinates don't work at the origin, and spherical coordinates don't work at the north and south poles.

The rule we are developing, for integrating over a region in arbitrary coordinates $u$ and $v$ , is:

\iint _{R}f(u,v)\ \mathbf {J} \langle xy/uv\rangle \ du\ dv

where $x$ and $y$ are Cartesian coordinates.

Using this formula, the area of a circle is now trivial:

A=\int _{0}^{2\pi }\int _{0}^{R}\ 1\ \mathbf {J} \langle xy/r\theta \rangle \ dr\ d\theta =\int _{0}^{2\pi }\int _{0}^{R}\ r\ dr\ d\theta =\int _{0}^{2\pi }{\frac {R^{2}}{2}}d\theta =\pi R^{2}

Exercise: Use spherical coordinates, with $\mathbf {J} =r^{2}\sin \theta \,$ , to get the volume of a sphere. Compare this with the complexity of the integral to find the volume in Cartesian coordinates.
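Once the exercise has been worked by hand, the result can be compared against a numerical estimate. This sketch assumes the usual ranges for spherical coordinates (r from 0 to R, θ from 0 to π, φ from 0 to 2π) and uses a midpoint rule; the function names are illustrative.

```python
import math


def midpoint(f, a, b, n=40):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def sphere_volume(R, n=40):
    """Integrate 1 * J over the ball, with J = r^2 sin(theta)."""
    def over_phi(r, theta):
        return midpoint(lambda phi: r * r * math.sin(theta),
                        0.0, 2.0 * math.pi, n)

    def over_theta(r):
        return midpoint(lambda theta: over_phi(r, theta), 0.0, math.pi, n)

    return midpoint(over_theta, 0.0, R, n)


volume = sphere_volume(1.0)
print(volume, 4.0 * math.pi / 3.0)  # the two values should be close
```

All three pairs of limits are constants here, in contrast to the square-root limits that the Cartesian version needs.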

Why Do We Put the J Determinant Into an Integral?

We need to show that the formula, from the previous section, for an integral in arbitrary coordinates, is correct:

\iint _{R}f(u,v)\ \mathbf {J} \langle xy/uv\rangle \ du\ dv

When we calculate a surface integral in 2 dimensions with Cartesian coordinates, we are breaking the surface into tiny rectangles delineated by tiny changes in the coordinate values $x$ and $y$ . Those tiny changes can be denoted $\Delta x\,$ and $\Delta y\,$ . The area of each rectangle is the product $\Delta x\ \Delta y\,$ . That area is multiplied by the integrand function $f(x,\ y)$ (evaluated somewhere in the rectangle), and the results are added. As the sizes of the rectangles become infinitesimal, the error arising from imprecision in evaluating $f(x,\ y)$ goes to zero, in accordance with the theory of Riemann integration, and we get the true integral:

\iint _{R}f(x,y)\ dx\ dy

We can use a different Cartesian coordinate system (shifted or rotated), still using the same formula:

\iint _{R}f(u,v)\ du\ dv

but the function $f$ must be expressed in terms of the new coordinates, and the meaning of $R\,$ in terms of the limits of integration on $u$ and $v$ will be different. The areas of the tiny rectangles will still be $\Delta u\ \Delta v\,$ .

A tiny "rectangle" in polar coordinates.

How about non-Cartesian coordinate systems such as polar coordinates? Since the coordinate lines are not straight, we don't really have rectangles, but, in the limit as the surface elements get smaller, the tiny areas delineated by the lines of constant $r$ and $\theta$ become arbitrarily close to being rectangular. Also, we know that the "height" of each almost-rectangle is $\Delta r\,$ , and its "width" is $r\ \Delta \theta \,$ . Hence the area is $r\ \Delta r\ \Delta \theta \,$ , and an integral in polar coordinates is therefore:

\iint _{R}f(r,\theta )\ r\ dr\ d\theta

We already calculated that $\mathbf {J}$ , the Jacobian determinant for the $x,y\to r,\theta$ transformation is $r\,$ , so the integral is:

\iint _{R}f(r,\theta )\ \mathbf {J} \ dr\ d\theta

We will work out the general case next, for a transformation between any coordinate system and Cartesian coordinates. Many treatments of this require the use of orthogonal coordinates, that is, coordinate systems in which the coordinate lines always cross at right angles. Essentially all of the coordinate systems that people use in practice (polar, cylindrical, spherical, etc.) satisfy this condition. However we will derive the theorems for any (nonsingular) coordinate system.

Transforming Between Arbitrary Coordinates and Cartesian Coordinates in Two Dimensions

A very tiny "patch" in arbitrary coordinates.

When we use an arbitrary coordinate system $(u,v)$ , the tiny patches delineated by the lines of constant $u$ and $v$ aren't necessarily rectangular, because the lines don't intersect at right angles. The tiny patches are parallelograms. The lines of constant $u$ and $v$ , while not orthogonal, still approach straightness as the patch gets smaller.

In the example shown, the blue patch has $\Delta u=.001$ and $\Delta v=.0001$ .

Now label 3 of the points: A, B, and C.

When we move from point A to point B, $v$ does not change, but $u$ increases by .001, that is, $\Delta u=.001$ . Since $x$ and $y$ , the true Cartesian coordinates of these points, depend on $u$ and $v$ , we can calculate their change, using the partial derivatives. ${\frac {\partial x}{\partial u}}$ and ${\frac {\partial y}{\partial u}}$ tell us how much $x$ and $y$ change for this tiny change in $u$ while holding $v$ constant, so

\Delta x=\displaystyle {\frac {\partial x}{\partial u}}\ \Delta u\ \ \ \ \ \Delta y=\displaystyle {\frac {\partial y}{\partial u}}\ \Delta u

Similarly, as we move from A to C, we get:

\Delta x=-\displaystyle {\frac {\partial x}{\partial v}}\ \Delta v\ \ \ \ \ \Delta y=\displaystyle {\frac {\partial y}{\partial v}}\ \Delta v

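(Here $\Delta x$ is the horizontal distance covered, taken as a positive number. In the figure, $x$ decreases while $v$ increases, so ${\frac {\partial x}{\partial v}}$ is negative and the minus sign makes the distance positive.)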

The third diagram shows the overall rectangle, with area

\left(\displaystyle {\frac {\partial x}{\partial u}}\ \Delta u-\displaystyle {\frac {\partial x}{\partial v}}\ \Delta v\right)\left(\displaystyle {\frac {\partial y}{\partial u}}\ \Delta u+\displaystyle {\frac {\partial y}{\partial v}}\ \Delta v\right)

which is

\displaystyle {\frac {\partial x}{\partial u}}\ \displaystyle {\frac {\partial y}{\partial u}}\ \left(\Delta u\right)^{2}-\displaystyle {\frac {\partial x}{\partial v}}\ \displaystyle {\frac {\partial y}{\partial v}}\ \left(\Delta v\right)^{2}+\left(\displaystyle {\frac {\partial x}{\partial u}}\ \displaystyle {\frac {\partial y}{\partial v}}-\displaystyle {\frac {\partial x}{\partial v}}\ \displaystyle {\frac {\partial y}{\partial u}}\right)\ \Delta u\ \Delta v

But the yellow part has area

\displaystyle {\frac {\partial x}{\partial u}}\ \displaystyle {\frac {\partial y}{\partial u}}\ \left(\Delta u\right)^{2}

and the pink part has area

-\displaystyle {\frac {\partial x}{\partial v}}\ \displaystyle {\frac {\partial y}{\partial v}}\ \left(\Delta v\right)^{2}

so the actual inner parallelogram has area

\left(\displaystyle {\frac {\partial x}{\partial u}}\ \displaystyle {\frac {\partial y}{\partial v}}-\displaystyle {\frac {\partial x}{\partial v}}\ \displaystyle {\frac {\partial y}{\partial u}}\right)\ \Delta u\ \Delta v=\mathbf {J} \langle xy/uv\rangle \ \Delta u\ \Delta v

This establishes that the Jacobian determinant, from a 2-dimensional Cartesian system to an arbitrary coordinate system, is the correct multiplier to use when integrating.
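The area formula for the tiny patch can be tested on a concrete, non-orthogonal coordinate system. Everything specific in the sketch below is an assumption made for illustration: the map `to_cartesian` is made up, the sample point is arbitrary, and the patch is treated as a straight-edged quadrilateral whose area is computed with the shoelace formula.

```python
def to_cartesian(u, v):
    """A made-up, non-orthogonal coordinate map (u, v) -> (x, y)."""
    return (u + 0.3 * v * v, v + 0.2 * u * v)


def jacobian_det(u, v):
    """Analytic determinant dx/du * dy/dv - dx/dv * dy/du for the map above."""
    dx_du, dx_dv = 1.0, 0.6 * v
    dy_du, dy_dv = 0.2 * v, 1.0 + 0.2 * u
    return dx_du * dy_dv - dx_dv * dy_du


def polygon_area(points):
    """Signed area of a polygon from its vertices (shoelace formula)."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return total / 2.0


u0, v0 = 1.0, 0.5          # arbitrary corner of the patch
du, dv = 0.001, 0.0001     # same step sizes as in the text's example
corners = [
    to_cartesian(u0, v0),
    to_cartesian(u0 + du, v0),
    to_cartesian(u0 + du, v0 + dv),
    to_cartesian(u0, v0 + dv),
]
patch_area = polygon_area(corners)
predicted = jacobian_det(u0, v0) * du * dv
ratio = patch_area / predicted
print(ratio)  # close to 1 for a small patch
```

Shrinking `du` and `dv` further should push the ratio closer to 1, until floating-point round-off starts to dominate.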

Transforming Between Arbitrary Coordinates in Two Dimensions

But we can do better—we can handle transformations among any (2-dimensional) coordinate systems. Suppose that the coordinates that we have been calling $x$ and $y$ , and have been assuming are Cartesian, are not Cartesian. The argument above still holds, with some minor modifications. We aren't allowed to say that $\Delta x\ \Delta y$ is the true area of the patch, and we aren't allowed to assume that the $x/y$ coordinate lines are perpendicular.

First, we are going to change the names $x$ and $y$ to $q$ and $r$ , because we don't like to use names $x$ , $y$ , (or $z$ ) for any coordinate system that isn't Cartesian.

Now suppose that the area of a patch, in $(q,r)\,$ coordinates, is given by $\mathbf {K} \ \Delta q\ \Delta r$ . $\mathbf {K}$ is the correction factor for the $(q,r)\,$ coordinate system. (From what we proved above, we know that it is just the Jacobian $\mathbf {J} \langle xy/qr\rangle$ . $\mathbf {K}$ can vary from one point to another, just as Jacobians can.)

An arbitrary patch measured relative to another arbitrary coordinate system.

The picture now looks like the diagram at the right. The outer parallelogram is delineated by the lines in the $(q,r)\,$ coordinate system, with a $q$ dimension of

\displaystyle {\frac {\partial q}{\partial u}}\ \Delta u-\displaystyle {\frac {\partial q}{\partial v}}\ \Delta v

and an $r$ dimension of

\displaystyle {\frac {\partial r}{\partial u}}\ \Delta u+\displaystyle {\frac {\partial r}{\partial v}}\ \Delta v

Its area is

\mathbf {K} \left(\displaystyle {\frac {\partial q}{\partial u}}\ \Delta u-\displaystyle {\frac {\partial q}{\partial v}}\ \Delta v\right)\left(\displaystyle {\frac {\partial r}{\partial u}}\ \Delta u+\displaystyle {\frac {\partial r}{\partial v}}\ \Delta v\right)

Now the two pink triangles can be combined into a parallelogram with a $q$ dimension of

-\displaystyle {\frac {\partial q}{\partial v}}\ \Delta v

and an $r$ dimension of

\displaystyle {\frac {\partial r}{\partial v}}\ \Delta v

So their combined area is

-\mathbf {K} \displaystyle {\frac {\partial q}{\partial v}}\ \displaystyle {\frac {\partial r}{\partial v}}\ \left(\Delta v\right)^{2}

Similarly, the area of the combined yellow triangles is

\mathbf {K} \displaystyle {\frac {\partial q}{\partial u}}\ \displaystyle {\frac {\partial r}{\partial u}}\ \left(\Delta u\right)^{2}

So the area of the inner parallelogram (the one delineated by the lines in the $(u,v)\,$ coordinate system) is

\mathbf {K} \left(\displaystyle {\frac {\partial q}{\partial u}}\ \displaystyle {\frac {\partial r}{\partial v}}-\displaystyle {\frac {\partial q}{\partial v}}\ \displaystyle {\frac {\partial r}{\partial u}}\right)\ \Delta u\ \Delta v=\mathbf {K} \ \mathbf {J} \langle qr/uv\rangle \ \Delta u\ \Delta v

This means that, if an integral can be expressed in $(q,r)\,$ coordinates as

\iint _{R}g(q,r)\ \mathbf {K} \ dq\ dr

we can calculate it in $(u,v)\,$ coordinates as

\iint _{R}g(u,v)\ \mathbf {K} \ \mathbf {J} \langle qr/uv\rangle \ du\ dv

We can move $\mathbf {K}$ into the integrand function that we call $g$ : $f=g\mathbf {K} \,$ , and treat that as the new integrand.

So we get the general formula for change of arbitrary coordinates in (2-dimensional) integration—multiply by the Jacobian determinant:

\iint _{R}f(q,r)\ dq\ dr=\iint _{R}f(u,v)\ \mathbf {J} \langle qr/uv\rangle \ du\ dv

We need to have it work in the other direction too, of course:

\iint _{R}f(q,r)\ \mathbf {J} \langle uv/qr\rangle \ dq\ dr=\iint _{R}f(u,v)\ du\ dv

This requires that

\mathbf {J} \langle uv/qr\rangle ={\frac {1}{\mathbf {J} \langle qr/uv\rangle }}

but we have already established that the matrices are inverse, so their determinants are reciprocals.

This means that (in 2 dimensions) we can change coordinate systems at will, introducing a Jacobian determinant whenever we do so.
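The two facts used above (the area factors multiply, and the Jacobian determinants of inverse transformations are reciprocals) can be spot-checked numerically. The following is only a sketch: the coordinate systems are made up for illustration ($(q,r)$ is taken to be polar, so $K=q$, and $(u,v)$ is defined by $q=u^{2}$, $r=u+v$), and the helper `jacobian_det_2d` is a hypothetical finite-difference routine, not part of the course.

```python
import math

def jacobian_det_2d(f, u, v, h=1e-6):
    """Central-difference estimate of the Jacobian determinant of f at (u, v)."""
    a_hi, b_hi = f(u + h, v)
    a_lo, b_lo = f(u - h, v)
    da_du = (a_hi - a_lo) / (2 * h)
    db_du = (b_hi - b_lo) / (2 * h)
    a_hi, b_hi = f(u, v + h)
    a_lo, b_lo = f(u, v - h)
    da_dv = (a_hi - a_lo) / (2 * h)
    db_dv = (b_hi - b_lo) / (2 * h)
    return da_du * db_dv - da_dv * db_du

# Assumed example: (q, r) are polar coordinates, so x = q cos r, y = q sin r.
def xy_from_qr(q, r):
    return q * math.cos(r), q * math.sin(r)

# Assumed example: q = u^2, r = u + v, with inverse u = sqrt(q), v = r - sqrt(q).
def qr_from_uv(u, v):
    return u * u, u + v

def uv_from_qr(q, r):
    u = math.sqrt(q)
    return u, r - u

def xy_from_uv(u, v):
    return xy_from_qr(*qr_from_uv(u, v))

u0, v0 = 1.3, 0.4
q0, r0 = qr_from_uv(u0, v0)

K = jacobian_det_2d(xy_from_qr, q0, r0)        # J<xy/qr>, the factor K (equals q0 here)
J_qr_uv = jacobian_det_2d(qr_from_uv, u0, v0)  # J<qr/uv> (equals 2*u0 here)
J_xy_uv = jacobian_det_2d(xy_from_uv, u0, v0)  # J<xy/uv>, computed directly
J_uv_qr = jacobian_det_2d(uv_from_qr, q0, r0)  # J<uv/qr>

print(K * J_qr_uv, J_xy_uv)   # the two values should agree
print(J_uv_qr * J_qr_uv)      # should be close to 1
```

Finite differences are used so that the check does not depend on differentiating the example by hand.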

Transforming Between Arbitrary Coordinates in Any Number of Dimensions

Finally, we generalize this to 3 (or more) dimensions. We use a geometrical trick equivalent to the one that elementary geometry students use for deriving the area of a parallelogram—part of the parallelogram is removed from one side and reattached to the other, as shown at the right.

The "parallelogram trick". The area is the base times the perpendicular height.

(We could have used the same trick in 2 dimensions; instead we used a method that makes it easier to see the connection between the geometry and the calculation of the determinant.)

We will first consider a change of coordinates in which only one coordinate is different. That is, we have a coordinate system $(q,r,s)\,$ , in which the volume of a tiny chunk of space delineated by the coordinate lines is $K\Delta q\ \Delta r\ \Delta s$ .

Let $(u,v,w)\,$ be another coordinate system, but with a very simple transformation—only the third coordinate is allowed to be different.

q=u\,

r=v\,

s={\text{some function of }}(u,v,w)\,

The Jacobian matrix is

J\langle qrs/uvw\rangle ={\begin{bmatrix}\displaystyle {\frac {\partial q}{\partial u}}&\displaystyle {\frac {\partial q}{\partial v}}&\displaystyle {\frac {\partial q}{\partial w}}\\[2.5ex]\displaystyle {\frac {\partial r}{\partial u}}&\displaystyle {\frac {\partial r}{\partial v}}&\displaystyle {\frac {\partial r}{\partial w}}\\[2.5ex]\displaystyle {\frac {\partial s}{\partial u}}&\displaystyle {\frac {\partial s}{\partial v}}&\displaystyle {\frac {\partial s}{\partial w}}\end{bmatrix}}={\begin{bmatrix}1&0&0\\[2.5ex]0&1&0\\[2.5ex]\displaystyle {\frac {\partial s}{\partial u}}&\displaystyle {\frac {\partial s}{\partial v}}&\displaystyle {\frac {\partial s}{\partial w}}\end{bmatrix}}

The Jacobian determinant is

\mathbf {J} \langle qrs/uvw\rangle =\displaystyle {\frac {\partial s}{\partial w}}
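As a short worked step, the determinant can be read off by expanding the matrix above along its first row; only the entries already shown are used:

```latex
\mathbf{J}\langle qrs/uvw\rangle
= 1\cdot
\begin{vmatrix} 1 & 0 \\ \dfrac{\partial s}{\partial v} & \dfrac{\partial s}{\partial w} \end{vmatrix}
- 0\cdot
\begin{vmatrix} 0 & 0 \\ \dfrac{\partial s}{\partial u} & \dfrac{\partial s}{\partial w} \end{vmatrix}
+ 0\cdot
\begin{vmatrix} 0 & 1 \\ \dfrac{\partial s}{\partial u} & \dfrac{\partial s}{\partial v} \end{vmatrix}
= \frac{\partial s}{\partial w}
```

The partial derivatives of $s$ with respect to $u$ and $v$ drop out, which is why the shear of the chunk in the $s$ direction does not affect its volume.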

A chunk delineated by (u, v, w), shown in (q, r, s) space.

The "parallelogram trick" in 3 dimensions.

Now suppose we have a tiny chunk of space delineated by the lines in the $(u,v,w)\,$ coordinate system, with sides $\Delta u\,$ , $\Delta v\,$ , and $\Delta w\,$ . The diagram at the right shows that chunk, relative to the $(q,r,s)\,$ system.

Since the first two coordinates are the same, the chunk fits properly into the $(q,r)\,$ coordinate system, but is pushed upward in the $s\,$ coordinate.

We chop off a "level" (relative to the $(q,r,s)\,$ coordinates) piece from the top of the chunk, and insert it at the bottom. The volume is unchanged. It is

V=K\Delta q\ \Delta r\ \Delta s=K\Delta u\ \Delta v\ \displaystyle {\frac {\partial s}{\partial w}}\Delta w=K\Delta u\ \Delta v\ \mathbf {J} \langle qrs/uvw\rangle \ \Delta w

This means that an integral that would have been

\iiint _{R}g(q,r,s)\ \mathbf {K} \ dq\ dr\ ds

in one coordinate system is

\iiint _{R}g(u,v,w)\ \mathbf {K} \ \mathbf {J} \langle qrs/uvw\rangle \ du\ dv\ dw

in the other.

As before, we move $\mathbf {K}$ into the integrand: $f=g\mathbf {K} \,$ , so we have:

\iiint _{R}f(q,r,s)\ dq\ dr\ ds=\iiint _{R}f(u,v,w)\ \mathbf {J} \langle qrs/uvw\rangle \ du\ dv\ dw

So we have shown that the transformation formula works in 3 dimensions (actually, any number of dimensions, but the "parallelogram trick" can't be visualized) as long as the coordinate transformation involves only one coordinate. But we can make coordinate changes repeatedly, so we can change the coordinates one at a time until we get the desired coordinate system.

Actually, there is a slight problem that we need to point out. We don't have complete freedom to choose the order of the coordinates. We could not change the system $(x,y)\,$ to $(y,x)\,$ one coordinate at a time, because the intermediate stage would be $(x,x)\,$ or $(y,y)\,$ , both of which are singular. But we can always change a given coordinate in one system into some coordinate in the other system.
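The idea of changing one coordinate at a time can be illustrated numerically. In the sketch below, the three steps are invented purely for illustration, and `jacobian_det_3d` is a hypothetical finite-difference helper. Each step changes exactly one coordinate, so its Jacobian determinant is a single partial derivative, and the product of the three should match the determinant of the combined transformation.

```python
import math

def jacobian_det_3d(f, p, h=1e-6):
    """Central-difference Jacobian determinant of f: R^3 -> R^3 at the point p."""
    cols = []
    for k in range(3):
        hi = list(p)
        lo = list(p)
        hi[k] += h
        lo[k] -= h
        f_hi, f_lo = f(*hi), f(*lo)
        cols.append([(f_hi[i] - f_lo[i]) / (2 * h) for i in range(3)])
    # m[i][k] is the derivative of output i with respect to input k
    m = [[cols[k][i] for k in range(3)] for i in range(3)]
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

# Three made-up steps; each one changes a single coordinate.
def step1(x, y, w):   # (x, y, w) -> (x, y, z), determinant dz/dw = 1 + x^2
    return x, y, w * (1 + x * x)

def step2(x, v, w):   # (x, v, w) -> (x, y, w), determinant dy/dv = exp(w)
    return x, v * math.exp(w), w

def step3(u, v, w):   # (u, v, w) -> (x, v, w), determinant dx/du = 3u^2 + 1
    return u ** 3 + u, v, w

def full(u, v, w):    # (u, v, w) -> (x, y, z), all three steps combined
    return step1(*step2(*step3(u, v, w)))

p3 = (0.7, -0.2, 0.5)   # a point in (u, v, w)
p2 = step3(*p3)         # the same point in (x, v, w)
p1 = step2(*p2)         # the same point in (x, y, w)

product = (jacobian_det_3d(step1, p1)
           * jacobian_det_3d(step2, p2)
           * jacobian_det_3d(step3, p3))
direct = jacobian_det_3d(full, p3)
print(product, direct)  # the two values should agree
```

Each factor is evaluated at the point expressed in the coordinates of its own stage, which is what repeated application of the one-coordinate rule requires.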

To summarize the integral transformation rule:

\int \cdots \int _{R}f(q_{1},\ldots ,q_{N})\ dq_{1}\cdots dq_{N}=\int \cdots \int _{R}f(u_{1},\ldots ,u_{N})\ \mathbf {J} \langle q_{1}\ldots q_{N}/u_{1}\ldots u_{N}\rangle \ du_{1}\cdots du_{N}

In 1 dimension this is just the change-of-variable theorem from elementary calculus.

Example: Volume of a Solid Torus

Mathematicians usually use the word "torus" to mean the 2-dimensional surface—it has interesting topological and geometrical properties. But for now, we are interested in the 3-dimensional solid torus. What is its volume?

Let $S$ be the main radius, and $R$ be the cross-sectional radius. (We consider only the case $R<S$ .) A naive approximation would give the volume as the cross-sectional area times the main circumference, or $2\pi ^{2}SR^{2}\,$ . But is that exactly right? The volume contribution is greater in the "outside" part, and smaller in the "inside" part.

We can solve the problem by creating "toroidal coordinates"^[2] in 3-dimensional space, $(r,\theta ,\phi \,)$ . $\theta \,$ is the angle around the main center, $\phi \,$ is the angle around the cross-sectional circle, and $r\,$ is the distance from the cross-sectional center.

This coordinate system is not acceptable for all of space—there are points (outside of the torus) that have two different sets of coordinates. But it works inside the torus.

In these coordinates, the solid torus is just the set of points with $r\leq R\,$ . (We will look at the surface of the torus in a later lecture. It is the restriction $r=R\,$ .)

The transformation to Cartesian coordinates is:

x=(S+r\cos \phi )\cos \theta \,

y=(S+r\cos \phi )\sin \theta \,

z=r\sin \phi \,

The Jacobian matrix is:

J\langle xyz/r\theta \phi \rangle ={\begin{bmatrix}\displaystyle {\frac {\partial x}{\partial r}}&\displaystyle {\frac {\partial x}{\partial \theta }}&\displaystyle {\frac {\partial x}{\partial \phi }}\\[2.5ex]\displaystyle {\frac {\partial y}{\partial r}}&\displaystyle {\frac {\partial y}{\partial \theta }}&\displaystyle {\frac {\partial y}{\partial \phi }}\\[2.5ex]\displaystyle {\frac {\partial z}{\partial r}}&\displaystyle {\frac {\partial z}{\partial \theta }}&\displaystyle {\frac {\partial z}{\partial \phi }}\end{bmatrix}}={\begin{bmatrix}\cos \theta \cos \phi &-(S+r\cos \phi )\sin \theta &-r\cos \theta \sin \phi \\[2.5ex]\sin \theta \cos \phi &(S+r\cos \phi )\cos \theta &-r\sin \theta \sin \phi \\[2.5ex]\sin \phi &0&r\cos \phi \end{bmatrix}}

The determinant is:

\mathbf {J} \langle xyz/r\theta \phi \rangle =r(S+r\cos \phi )\,

So the volume of the torus is:

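\int _{0}^{R}\int _{0}^{2\pi }\int _{0}^{2\pi }r(S+r\cos \phi )\ d\theta \ d\phi \ dr=2\pi \int _{0}^{R}\int _{0}^{2\pi }(rS+r^{2}\cos \phi )\ d\phi \ dr=4\pi ^{2}S\int _{0}^{R}r\ dr=2\pi ^{2}SR^{2}

The $\cos \phi \,$ term integrates to zero over a full period, so the naive estimate turns out to be exactly right.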
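Both the determinant and the volume can be checked numerically. This is a sketch under stated assumptions: the radii `S = 3.0` and `R = 1.0` are arbitrary example values with $R<S$, the sample point is arbitrary, and `det3_numeric` is a hypothetical finite-difference helper. It compares the numerical Jacobian determinant with $r(S+r\cos \phi )$ and a midpoint-rule integral with the closed form $2\pi ^{2}SR^{2}$.

```python
import math

S, R = 3.0, 1.0   # example main radius and cross-sectional radius (R < S)

def to_cartesian(r, theta, phi):
    """The transformation from (r, theta, phi) to (x, y, z) given in the text."""
    rho = S + r * math.cos(phi)
    return rho * math.cos(theta), rho * math.sin(theta), r * math.sin(phi)

def det3_numeric(f, p, h=1e-6):
    """Central-difference Jacobian determinant of f: R^3 -> R^3 at the point p."""
    cols = []
    for k in range(3):
        hi = list(p)
        lo = list(p)
        hi[k] += h
        lo[k] -= h
        f_hi, f_lo = f(*hi), f(*lo)
        cols.append([(f_hi[i] - f_lo[i]) / (2 * h) for i in range(3)])
    m = [[cols[k][i] for k in range(3)] for i in range(3)]
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

# 1. Determinant at an arbitrary sample point (r, theta, phi).
point = (0.6, 1.1, 2.3)
numeric_det = det3_numeric(to_cartesian, point)
closed_det = point[0] * (S + point[0] * math.cos(point[2]))
print(numeric_det, closed_det)   # the two values should agree

# 2. Volume by the midpoint rule in r and phi. The integrand does not
#    depend on theta, so the theta integral contributes a factor 2*pi.
n = 200
dr, dphi = R / n, 2 * math.pi / n
volume = 0.0
for i in range(n):
    r = (i + 0.5) * dr
    for j in range(n):
        phi = (j + 0.5) * dphi
        volume += r * (S + r * math.cos(phi)) * dr * dphi
volume *= 2 * math.pi
print(volume, 2 * math.pi ** 2 * S * R ** 2)   # the two values should agree
```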

Exercise: Calculate the volumes of the outside part ( $-\pi /2\leq \phi \leq \pi /2$ , where $\cos \phi \geq 0$ and points are farther from the main axis) and the inside part ( $\pi /2\leq \phi \leq 3\pi /2$ ) separately.

Exercise: The angular momentum of a rigid body rotating around a fixed axis is the angular speed times the integral, over the volume, of the density times the square of the distance from the axis of rotation. What is the angular momentum of a torus of uniform density rotating around its main axis?

Exercise: The moment of rotational inertia is the integral of the density times the square of the distance from the axis of rotation. What is the moment of rotational inertia of a torus of uniform density?

Example: Confocal Coordinates

This example illustrates the wide applicability of coordinate systems. This coordinate system, often called elliptic coordinates, is less commonly used than polar coordinates. The coordinates are $\rho \,$ and $\theta \,$ . $A\,$ is a fixed parameter giving the position of the foci, which lie at $(\pm A,0)\,$ . Curves of constant $\rho \,$ are ellipses, and curves of constant $\theta \,$ are hyperbolas.

The transformation to Cartesian coordinates is:

x=A\cosh \rho \cos \theta \,

y=A\sinh \rho \sin \theta \,

The Jacobian matrix is:

J\langle xy/\rho \theta \rangle ={\begin{bmatrix}A\sinh \rho \cos \theta &-A\cosh \rho \sin \theta \\[2.5ex]A\cosh \rho \sin \theta &A\sinh \rho \cos \theta \end{bmatrix}}

The determinant is:

\mathbf {J} \langle xy/\rho \theta \rangle =A^{2}(\sinh ^{2}\rho +\sin ^{2}\theta )\,

Suppose we want to find the area of the ellipse formed by holding $\rho =R\,$ .

The semi-major axis is $A\cosh R\,$ , and the semi-minor axis is $A\sinh R\,$ .

The area is:

\int _{0}^{R}\int _{0}^{2\pi }A^{2}(\sinh ^{2}\rho +\sin ^{2}\theta )\ d\theta \ d\rho =\pi A^{2}\sinh R\cosh R
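The result agrees with the familiar formula for the area of an ellipse, $\pi$ times the product of the semi-axes. As a rough numerical check (a sketch only; the values `A = 1.5` and `R = 0.8` are arbitrary examples), the double integral can be approximated with the midpoint rule:

```python
import math

A, R = 1.5, 0.8   # example focal parameter and bounding value of rho

n = 400
drho, dtheta = R / n, 2 * math.pi / n

# Midpoint-rule approximation of the integral of the Jacobian determinant
# A^2 (sinh^2 rho + sin^2 theta) over 0 <= rho <= R, 0 <= theta <= 2*pi.
area = 0.0
for i in range(n):
    rho = (i + 0.5) * drho
    for j in range(n):
        theta = (j + 0.5) * dtheta
        area += A * A * (math.sinh(rho) ** 2 + math.sin(theta) ** 2) * drho * dtheta

semi_major = A * math.cosh(R)
semi_minor = A * math.sinh(R)
expected = math.pi * semi_major * semi_minor   # pi * a * b for an ellipse

print(area, expected)   # the two values should agree to several digits
```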

External Integration Software

Footnotes and References

[1] The restriction is that a vertical line must enter and leave the region just once. But more complicated regions can be broken up into smaller ones, so that any reasonable region may in fact be integrated. Furthermore, changing the coordinate system can often help.
[2] This coordinate system is not standard; you won't find it in the literature. We are just making it up. We can make up any coordinate system we want.
