Partial Derivatives
[Figure: sketch of the graph, with the x- and y-axes and the point (1, 1) marked.]
The figure above shows more than just the graph of three points. Here are the steps we
used to draw the graph. Remember, this is just a sketch: it should suggest the shape of the
graph and some of its features.
1. First we draw the axes. The z-axis points up, the y-axis is to the right and the x-axis
comes out of the page, so it is drawn at the angle shown. This gives a perspective with the
eye somewhere in the first octant.
2. The yz-traces are those curves found by setting x = a constant. We start with the trace
when x = 0. This is an upward pointing parabola in the yz-plane.
3. Next we sketch the trace with z = 3. This is a circle of radius $\sqrt{3}$ at height z = 3. Note,
the traces where z = constant are generally called level curves.
This is enough for this graph. Other graphs take other traces. You should expect to do a
certain amount of trial and error before your figure looks right.
MIT OpenCourseWare
http://ocw.mit.edu
For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
Gallery of graphs
Level Curves and Contour Plots
Level curves and contour plots are another way of visualizing functions of two variables. If
you have seen a topographic map then you have seen a contour plot.
Example: To illustrate this we first draw the graph of $z = x^2 + y^2$. On this graph we draw
contours, which are curves at a fixed height z = constant.
For example the curve at height z = 1 is the circle $x^2 + y^2 = 1$. On the graph we have
to draw this at the correct height. Another way to show this is to draw the curves in the
xy-plane and label them with their z-value. We call these curves level curves and the entire
plot is called a contour plot.
For this example they are shown in the plot on the right. Notice that the 3D graph is simply
the level curves ’pulled out’ each to its correct height.
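As a rough numerical illustration (a sketch, not part of the original notes), the following Python samples points on the level curve of $z = x^2 + y^2$ at a given height c and checks that each sampled point really sits at that height. The names `f` and `level_curve_points` are made up for this example.

```python
import math

def f(x, y):
    """The function whose level curves we sample: z = x^2 + y^2."""
    return x**2 + y**2

def level_curve_points(c, n=8):
    """Return n points on the level curve f(x, y) = c, a circle of radius sqrt(c), for c > 0."""
    r = math.sqrt(c)
    return [(r * math.cos(2 * math.pi * k / n), r * math.sin(2 * math.pi * k / n))
            for k in range(n)]

for c in (1, 2, 4):
    pts = level_curve_points(c)
    # Every sampled point should sit at height c on the graph.
    assert all(abs(f(x, y) - c) < 1e-12 for x, y in pts)
    print(f"z = {c}: circle of radius {math.sqrt(c):.3f}")
```

Labeling each circle with its value of c is exactly what turns the set of curves into a contour plot.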
[Figure: the graph of $z = x^2 + y^2$ with contours drawn at heights z = 1, z = 2 and z = 4, and the corresponding level curves in the xy-plane.]
Here is another plot, of a ’mountain pass’. Notice that in the contour plot the mountain
pass is represented by a level curve that crosses itself. Moving up or down from the crossing
point, the heights of the level curves decrease; moving right or left, they increase.
[Figure: two panels, "Mountain pass" and "Level curves"; the level curves are labeled z = 400, z = 600, z = 800 and z = 1000.]
Partial derivatives
Let w = f (x, y) be a function of two variables. Its graph is a surface in xyz-space, as
pictured.
[Figure: the surface w = f(x, y), the curve w = f(x, y0) in the plane y = y0, and the point P above (x0, y0).]
Fix a value y = y0 and just let x vary. You get a function of one variable,
(1) w = f (x, y0 ), the partial function for y = y0 .
Its graph is a curve in the vertical plane y = y0 , whose slope at the
point P where x = x0 is given by the derivative
(2) $\left.\dfrac{d}{dx} f(x, y_0)\right|_{x_0}$, or $\left.\dfrac{\partial f}{\partial x}\right|_{(x_0, y_0)}$.
We call (2) the partial derivative of f with respect to x at the point (x0 , y0 ); the right side
of (2) is the standard notation for it. The partial derivative is just the ordinary derivative
of the partial function — it is calculated by holding one variable fixed and differentiating
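with respect to the other variable. Other notations for this partial derivative are
$$f_x(x_0, y_0), \quad \left.\frac{\partial w}{\partial x}\right|_{(x_0, y_0)}, \quad \left(\frac{\partial f}{\partial x}\right)_0, \quad \left(\frac{\partial w}{\partial x}\right)_0;$$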
the first is convenient for including the specific point; the second is common in science
and engineering, where you are just dealing with relations between variables and don’t
mention the function explicitly; the third and fourth indicate the point by just using a
single subscript.
Analogously, fixing x = x0 and letting y vary, we get the partial function w = f (x0 , y),
whose graph lies in the vertical plane x = x0 , and whose slope at P is the partial derivative
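of f with respect to y; the notations are
$$\left.\frac{\partial f}{\partial y}\right|_{(x_0, y_0)}, \quad f_y(x_0, y_0), \quad \left.\frac{\partial w}{\partial y}\right|_{(x_0, y_0)}, \quad \left(\frac{\partial f}{\partial y}\right)_0, \quad \left(\frac{\partial w}{\partial y}\right)_0.$$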
The partial derivatives ∂f /∂x and ∂f /∂y depend on (x0 , y0 ) and are therefore functions of
x and y.
Written as ∂w/∂x, the partial derivative gives the rate of change of w with respect to x
alone, at the point (x0 , y0 ): it tells how fast w is increasing as x increases, when y is held
constant.
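To make "hold one variable fixed and differentiate with respect to the other" concrete, here is a minimal numerical sketch. It estimates partial derivatives by central differences; the helper names `partial_x` and `partial_y`, the step size `h`, and the sample function are assumptions chosen for illustration, not something the notes prescribe.

```python
def partial_x(f, x0, y0, h=1e-6):
    """Central-difference estimate of the partial derivative of f in x at (x0, y0); y stays fixed."""
    return (f(x0 + h, y0) - f(x0 - h, y0)) / (2 * h)

def partial_y(f, x0, y0, h=1e-6):
    """Central-difference estimate of the partial derivative of f in y at (x0, y0); x stays fixed."""
    return (f(x0, y0 + h) - f(x0, y0 - h)) / (2 * h)

def w(x, y):
    # Sample function chosen only for illustration: w = x^3 y^4.
    return x**3 * y**4

print(partial_x(w, 1.0, 1.0))  # close to 3, the exact value of 3 x^2 y^4 at (1, 1)
print(partial_y(w, 1.0, 1.0))  # close to 4, the exact value of 4 x^3 y^3 at (1, 1)
```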
For a function of three or more variables, w = f (x, y, z, . . . ), we cannot draw graphs any
more, but the idea behind partial differentiation remains the same: to define the partial
derivative with respect to x, for instance, hold all the other variables constant and take the
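ordinary derivative with respect to x; the notations are the same as above:
$$\left.\frac{d}{dx} f(x, y_0, z_0, \ldots)\right|_{x_0} = f_x(x_0, y_0, z_0, \ldots), \quad \left(\frac{\partial f}{\partial x}\right)_0, \quad \left(\frac{\partial w}{\partial x}\right)_0.$$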
The Tangent Approximation
Assume the plane is not vertical; then C ≠ 0, so we can divide through by C and solve for
w − w0 , getting an equation of the form w − w0 = a(x − x0 ) + b(y − y0 ).
The plane passes through (x0 , y0 , w0 ); what values of the coefficients a and b will make it
also tangent to the graph there? We have
The intuitive idea is that if we stay near (x0 , y0 , w0 ), the graph of the tangent plane (4)
will be a good approximation to the graph of the function w = f (x, y). Therefore if the
point (x, y) is close to (x0 , y0 ),
(5) $f(x, y) \approx w_0 + \left(\dfrac{\partial w}{\partial x}\right)_0 (x - x_0) + \left(\dfrac{\partial w}{\partial y}\right)_0 (y - y_0)$
height of graph ≈ height of tangent plane
The function on the right side of (5) whose graph is the tangent plane is often called the
linearization of f (x, y) at (x0 , y0 ): it is the linear function which gives the best approximation
to f (x, y) for values of (x, y) close to (x0 , y0 ).
An equivalent form of the approximation (5) is obtained by using Δ notation; if we put
Δx = x − x0 , Δy = y − y0 , Δw = w − w0 ,
then (5) becomes
(6) $\Delta w \approx \left(\dfrac{\partial w}{\partial x}\right)_0 \Delta x + \left(\dfrac{\partial w}{\partial y}\right)_0 \Delta y$, if Δx ≈ 0, Δy ≈ 0 .
This formula gives the approximate change in w when we make a small change in x and y.
We will use it often.
The analogous approximation formula for a function w = f (x, y, z) of three variables
would be
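(7) $\Delta w \approx \left(\dfrac{\partial w}{\partial x}\right)_0 \Delta x + \left(\dfrac{\partial w}{\partial y}\right)_0 \Delta y + \left(\dfrac{\partial w}{\partial z}\right)_0 \Delta z$, if Δx, Δy, Δz ≈ 0 .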
Unfortunately, for functions of three or more variables, we can’t use a geometric argument
for the approximation formula (7); for this reason, it’s best to recast the argument for (6)
in a form which doesn’t use tangent planes and geometry, and therefore can be generalized
to several variables. This is done at the end of this Chapter TA; for now let’s just assume
the truth of (7) and its higher-dimensional analogues.
Here are two typical examples of the use of the approximation formula. Other examples
are in the Exercises. In the rest of your study of partial differentiation, you will see how the
approximation formula is used to derive the important theorems and formulas.
Example 1. Give a reasonable square, centered at (1, 1), over which the value of
$w = x^3 y^4$ will not vary by more than ± .1 .
Solution. We use (6). We calculate for the two partial derivatives
$w_x = 3x^2 y^4$, $w_y = 4x^3 y^3$
and therefore, evaluating the partials at (1, 1) and using (6), we get
Δw ≈ 3Δx + 4Δy.
Thus if |Δx| ≤ .01 and |Δy| ≤ .01, we should have
|Δw| ≤ 3|Δx| + 4|Δy| ≤ .07,
which is within the bounds. So the answer is the square with center at (1,1) given by
|x − 1| ≤ .01, |y − 1| ≤ .01 .
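The estimate can be checked by brute force. The sketch below evaluates $w = x^3 y^4$ on a grid covering the square and reports the largest deviation from w(1, 1) = 1; the grid resolution `n` and the function name `max_deviation` are arbitrary choices made for this illustration.

```python
def w(x, y):
    return x**3 * y**4

def max_deviation(half_width, n=50):
    """Largest |w(x, y) - w(1, 1)| over an (n+1) x (n+1) grid on the square
    |x - 1| <= half_width, |y - 1| <= half_width."""
    worst = 0.0
    for i in range(n + 1):
        for j in range(n + 1):
            x = 1 - half_width + 2 * half_width * i / n
            y = 1 - half_width + 2 * half_width * j / n
            worst = max(worst, abs(w(x, y) - w(1, 1)))
    return worst

print(max_deviation(0.01))   # roughly 0.07, inside the allowed 0.1
```

The largest deviation occurs at the corner (1.01, 1.01), slightly above the linear estimate .07 because the approximation ignores higher-order terms.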
ΔV ≈ bc Δa + ac Δb + ab Δc
≈ 6 Δa + 3 Δb + 2 Δc, at (1, 2, 3);
thus it is most sensitive to small changes in side a, since Δa occurs with the largest coefficient.
(That is, if one at a time the measurement of each side were changed by say .01, it is the
change in a which would produce the biggest change in V , namely .06 .)
The result may seem paradoxical — the value of V is most sensitive to the
length of the shortest side — but it’s actually intuitive, as you can see by thinking
about how the box looks.
Sensitivity Principle The numerical value of w = f (x, y, . . . ), calculated at some
point (x0 , y0 , . . . ), will be most sensitive to small changes in that variable for which the
corresponding partial derivative wx , wy , . . . has the largest absolute value at the point.
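The principle is easy to mechanize. As a small sketch for the box example V = abc at (a, b, c) = (1, 2, 3), the code below computes the three partial derivatives and picks the variable with the largest absolute value; the function names are hypothetical.

```python
def volume_partials(a, b, c):
    """Partial derivatives of V = a*b*c with respect to a, b and c."""
    return {"a": b * c, "b": a * c, "c": a * b}

def most_sensitive(partials):
    """Name of the variable whose partial derivative has the largest absolute value."""
    return max(partials, key=lambda name: abs(partials[name]))

p = volume_partials(1, 2, 3)
print(p)                   # {'a': 6, 'b': 3, 'c': 2}
print(most_sensitive(p))   # a
```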
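The Tangent approximation
(7) $\Delta w \approx \left(\dfrac{\partial w}{\partial x}\right)_0 \Delta x + \left(\dfrac{\partial w}{\partial y}\right)_0 \Delta y + \left(\dfrac{\partial w}{\partial z}\right)_0 \Delta z$, if Δx, Δy, Δz ≈ 0 .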
is not a precise mathematical statement, since the symbol ≈ does not specify exactly how
close the quantities on either side of the formula are to each other. To fix this up, one
would have to specify the error in the approximation. (This can be done, but it is not often
used.)
A more fundamental objection is that our discussion of approximations was based on the
assumption that the tangent plane is a good approximation to the surface at (x0 , y0 , w0 ).
Is this really so?
Look at it this way. The tangent plane was determined as the plane which has the same
slope as the surface in the i and j directions. This means the approximation (6) will be
good if you move away from (x0 , y0 ) in the i direction (by taking Δy = 0), or in the j
direction (putting Δx = 0). But does the tangent plane have the same slope as the surface
in all the other directions as well?
Intuitively, we should expect that this will be so if the graph of f (x, y) is a “smooth”
surface at (x0 , y0 ) — it doesn’t have any sharp points, folds, or look peculiar. Here is the
mathematical hypothesis which guarantees this.
These are continuous at all points except (0, 0), where they are undefined. So the function
is smooth except at the origin; the approximation formula (6) should be valid everywhere
except at the origin.
Indeed, investigating the graph of this function, since $w = \sqrt{x^2 + y^2}$ says that
height of graph over (x, y) = distance of (x, y) from w-axis,
the graph is a right circular cone, with vertex at (0, 0), axis along the w-axis, and vertex
angle a right angle. Geometrically the graph has a sharp point at the origin, so there should
be no tangent plane there, and no valid approximation formula (6) — there is no linear
function which approximates a cone at its vertex.
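A quick numerical experiment shows the trouble at the vertex. The sketch below computes the right-hand and left-hand difference quotients of the cone in the x direction at the origin; the helper name `one_sided_slopes_x` and the step `h` are assumptions for illustration.

```python
import math

def cone(x, y):
    return math.sqrt(x**2 + y**2)

def one_sided_slopes_x(f, h=1e-6):
    """Right- and left-hand difference quotients of f in the x direction at the origin."""
    right = (f(h, 0.0) - f(0.0, 0.0)) / h
    left = (f(-h, 0.0) - f(0.0, 0.0)) / (-h)
    return right, left

# The two slopes come out near +1 and -1; since they disagree,
# the partial derivative w_x does not exist at (0, 0).
print(one_sided_slopes_x(cone))
```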
A non-geometrical argument for the approximation formula
We promised earlier a non-geometrical approach to the approximation formula (6) that
would generalize to higher-dimensions, in particular to the 3-variable formula (7). This
approach will also show why the hypothesis (8) of smoothness is needed. The argument is
still imprecise, since it uses the symbol ≈, but it can be refined to a proof (which you will
find in your book, though it’s not easy reading).
It uses the one-variable approximation formula for a differentiable function w = f (u) :
(9) Δw ≈ f ′ (u0 )Δu, if Δu ≈ 0 .
We wish to justify — without using reasoning based on 3-space — the approximation formula
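(6) $\Delta w \approx \left(\dfrac{\partial w}{\partial x}\right)_0 \Delta x + \left(\dfrac{\partial w}{\partial y}\right)_0 \Delta y$, if Δx ≈ 0, Δy ≈ 0 .
[Figure: the points P = (x0, y0), Q = (x0 + Δx, y0) and R = (x0 + Δx, y0 + Δy) in the xy-plane.]
We are trying to calculate the change in w as we go from P to R in the
picture, where P = (x0 , y0 ), R = (x0 + Δx, y0 + Δy). This change can be
thought of as taking place in two steps: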
similarly,
$\Delta w_2 \approx \left.\dfrac{d}{dy} f(x_0 + \Delta x, y)\right|_{y_0} \cdot \Delta y = f_y(x_0 + \Delta x, y_0)\, \Delta y$
(12) $\approx f_y(x_0, y_0)\, \Delta y,$
if we assume that fy is continuous (i.e., f is smooth), since the difference between the two
terms on the right in the last two lines will then be like ε Δy, which is negligible compared
with either term itself. Substituting the two approximate values (11) and (12) into (10)
gives us the approximation formula (6).
To make this a proof, the error terms in the approximations have to be analyzed, or more
simply, one has to replace the ≈ symbol by equalities based on the Mean-Value Theorem of
one-variable calculus.
Tangent approximation
1. a) Find the equation of the tangent plane to the graph of $z = x^2 + y^2$ at the point (2,1,5).
b) Give the tangent approximation for z near the point (x0 , y0 ) = (2, 1).
Answer: a) $\dfrac{\partial z}{\partial x} = 2x$ and $\dfrac{\partial z}{\partial y} = 2y$ ⇒ $\dfrac{\partial z}{\partial x}(2, 1) = 4$ and $\dfrac{\partial z}{\partial y}(2, 1) = 2$.
The tangent plane at (2,1,5) is
$$(z - 5) = \left.\frac{\partial z}{\partial x}\right|_0 (x - 2) + \left.\frac{\partial z}{\partial y}\right|_0 (y - 1) = 4(x - 2) + 2(y - 1).$$
b) The tangent approximation is the same formula, with the interpretation that for a fixed
(x0 , y0 ) the value of z on the graph of the function is near that of z on the tangent plane.
Thus, for (x, y) ≈ (2, 1) we have
Δz ≈ 4Δx + 2Δy.
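To see how good the approximation is, one can compare the graph with the tangent plane at a few nearby points. This is only a sketch; the sample offsets are arbitrary and the function names are made up.

```python
def z(x, y):
    return x**2 + y**2

def tangent_plane(x, y):
    """Height of the tangent plane at (2, 1, 5): z = 5 + 4(x - 2) + 2(y - 1)."""
    return 5 + 4 * (x - 2) + 2 * (y - 1)

for dx, dy in [(0.1, 0.1), (0.01, -0.02), (0.001, 0.001)]:
    x, y = 2 + dx, 1 + dy
    error = z(x, y) - tangent_plane(x, y)
    # For this particular function the error works out to dx^2 + dy^2,
    # so it shrinks quadratically as (x, y) approaches (2, 1).
    print(dx, dy, error)
```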
Critical Points
Critical points:
A standard question in calculus, with applications to many fields, is to find the points where
a function reaches its relative maxima and minima.
Just as in single variable calculus we will look for maxima and minima (collectively called
extrema) at points (x0 , y0 ) where the first derivatives are 0. Accordingly we define a critical
point as any point (x0 , y0 ) where
$$\frac{\partial f}{\partial x}(x_0, y_0) = 0 \quad \text{and} \quad \frac{\partial f}{\partial y}(x_0, y_0) = 0.$$
Often we will abbreviate this as fx = 0 and fy = 0.
Our first job is to verify that relative maxima and minima occur at critical points. The
figures below illustrate that they occur at places where the tangent plane is horizontal.
[Figures: a maximum with horizontal tangent plane (left); a minimum with horizontal tangent plane (right).]
Since horizontal planes are of the form z = constant and the equation of the tangent plane
at (x0 , y0 , z0 ) is
z = z0 + fx (x0 , y0 )(x − x0 ) + fy (x0 , y0 )(y − y0 )
we see it is horizontal when
fx (x0 , y0 ) = 0 and fy (x0 , y0 ) = 0.
Thus, extrema occur at critical points. But, just as in single variable calculus, not all critical
points are extrema.
Example: Find the critical points of $z = x^2 + y^2 + .5$.
Answer: $\dfrac{\partial z}{\partial x} = 2x$ and $\dfrac{\partial z}{\partial y} = 2y$. Clearly the only point where both derivatives are 0 is
(0, 0). Thus, there is a single critical point at (0, 0). The figure shows it is clearly the point
where z reaches a minimum value. (See the figure above on the right.)
Example: Find the critical points of $z = 1 - x^2 - y^2$.
Answer: $\dfrac{\partial z}{\partial x} = -2x$ and $\dfrac{\partial z}{\partial y} = -2y$. Clearly the only point where both derivatives are
0 is (0, 0). Thus, there is a single critical point at (0, 0). The figure shows it is clearly the
point where z reaches a maximum value. (See the figure above on the left.)
Example: Find the critical points of $z = -x^2 + y^2$.
Answer: $\dfrac{\partial z}{\partial x} = -2x$ and $\dfrac{\partial z}{\partial y} = 2y$. Clearly the only point where both derivatives are
0 is (0, 0). Thus, there is a single critical point at (0, 0). The figure shows it is neither a
minimum nor a maximum.
[Figure: saddle with horizontal tangent plane.]
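That the critical point of $z = -x^2 + y^2$ is neither a minimum nor a maximum can also be seen without a picture, by sampling the function a short distance from the origin along each axis. The step `h` below is an arbitrary small number chosen for this sketch.

```python
def z(x, y):
    return -x**2 + y**2

h = 0.1
print(z(h, 0.0))   # negative: the surface dips below z(0, 0) = 0 along the x-axis
print(z(0.0, h))   # positive: the surface rises above 0 along the y-axis
```

Since the function takes values on both sides of z(0, 0) arbitrarily close to the origin, the critical point cannot be an extremum.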
Least Squares Interpolation
(1) y = ax + b
which “best” passes through them. Assuming our errors in measurement are distributed
randomly according to the usual bell-shaped curve (the so-called “Gaussian distribution”),
it can be shown that the right choice of a and b is the one for which the sum D of the
squares of the deviations
(2) $D = \sum_{i=1}^{n} \bigl(y_i - (a x_i + b)\bigr)^2$
is a minimum. In the formula (2), the quantities in parentheses (shown by
dotted lines in the picture) are the deviations between the observed values
yi and the ones axi + b that would be predicted using the line (1).
[Figure: the data points (xi, yi), the line, and dotted segments joining each (xi, yi) to (xi, axi + b).]
The deviations are squared for theoretical reasons connected with the assumed Gaussian
error distribution; note however that the effect is to ensure that we sum only positive
quantities; this is important, since we do not want deviations of opposite sign to cancel each
other out. It also weights more heavily the larger deviations, keeping experimenters honest,
since they tend to ignore large deviations (“I had a headache that day”).
This prescription for finding the line (1) is called the method of least squares, and the
resulting line (1) is called the least-squares line or the regression line.
To calculate the values of a and b which make D a minimum, we see where the two partial
derivatives are zero:
(3) $\dfrac{\partial D}{\partial a} = \sum_{i=1}^{n} 2(y_i - a x_i - b)(-x_i) = 0$
    $\dfrac{\partial D}{\partial b} = \sum_{i=1}^{n} 2(y_i - a x_i - b)(-1) = 0 .$
These give us a pair of linear equations for determining a and b, as we see by collecting
terms and cancelling the 2’s:
(4) $\left(\sum x_i^2\right) a + \left(\sum x_i\right) b = \sum x_i y_i$
    $\left(\sum x_i\right) a + n\, b = \sum y_i .$
(Notice that it saves a lot of work to differentiate (2) using the chain rule, rather than first
expanding out the squares.)
The equations (4) are usually divided by n to make them more expressive:
(5) $\bar{s}\, a + \bar{x}\, b = \dfrac{1}{n} \sum x_i y_i$
    $\bar{x}\, a + b = \bar{y} ,$
where $\bar{x}$ and $\bar{y}$ are the averages of the xi and yi , and $\bar{s} = \sum x_i^2 / n$ is the average of the squares.
From this point on use linear algebra to determine a and b. It is a good exercise to see
that the equations are always solvable unless all the xi are the same (in which case the best
line is vertical and can’t be written in the form (1)).
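For checking hand computations, the pair of equations (4) can be solved in a few lines. The sketch below uses Cramer's rule on the 2 × 2 system; the function name `least_squares_line` and the sample data are hypothetical, and this is one possible way to solve the system rather than the method the notes require.

```python
def least_squares_line(points):
    """Solve the normal equations (4) for the line y = a*x + b by Cramer's rule."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    det = sxx * n - sx * sx          # vanishes when all the x_i coincide
    if det == 0:
        raise ValueError("all x-values are equal; the best line is vertical")
    a = (sxy * n - sx * sy) / det
    b = (sxx * sy - sx * sxy) / det
    return a, b

# Hypothetical data lying exactly on y = 2x + 1, so the fit should recover a = 2, b = 1.
print(least_squares_line([(0, 1), (1, 3), (2, 5), (3, 7)]))   # (2.0, 1.0)
```

The determinant test mirrors the remark above: the system fails to be solvable only when all the xi are the same.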
In practice, least-squares lines are found by pressing a calculator button, or giving a
MatLab command. Examples of calculating a least-squares line are in the exercises accompanying
the course. Do them from scratch, starting from (2), since the purpose here is to
get practice with max-min problems in several variables; don’t plug into the equations (5).
Remember to differentiate (2) using the chain rule; don’t expand out the squares, which
leads to messy algebra and highly probable error.
is a minimum. Now there are three unknowns, a0 , a1 , a2 . Calculating (remember to use the
chain rule!) the three partial derivatives ∂D/∂ai , i = 0, 1, 2, and setting them equal to zero
leads to a square system of three linear equations; the ai are the three unknowns, and the
coefficients depend on the data points (xi , yi ). They can be solved by finding the inverse
matrix, elimination, or using a calculator or MatLab.
If the points seem to lie more and more along a line as x → ∞, but lie on one side of the
line for low values of x, it might be reasonable to try a function which has similar behavior,
like
(8) $y = a_0 + a_1 x + a_2 \dfrac{1}{x}$
and again minimize the sum of the squares of the deviations, as in (7). In general, this
method of least squares applies to a trial expression of the form
(9) $y = a_0 f_0(x) + a_1 f_1(x) + \cdots + a_r f_r(x)$
where the fi (x) are given functions (usually simple ones like 1, x, $x^2$, 1/x, $e^{kx}$, etc.). Such an
expression (9) is called a linear combination of the functions fi (x). The method produces
a square inhomogeneous system of linear equations in the unknowns a0 , . . . , ar which can
be solved by finding the inverse matrix to the system, or by elimination.
The method also applies to finding a linear function
(10) z = a1 + a2 x + a3 y
where there are two independent variables x and y and a dependent variable z (this is
the quantity being experimentally measured, for different values of (x, y)). This time after
differentiation we get a 3 × 3 system of linear equations for determining a1 , a2 , a3 .
The essential point in all this is that the unknown coefficients ai should occur linearly
in the trial function. Try fitting a function like $c e^{kx}$ to data points by using least squares,
and you’ll see the difficulty right away. (Since this is an important problem — fitting an
exponential to data points — one of the Exercises explains how to adapt the method to this
type of problem.)
Least squares interpolation
1. Use the method of least squares to fit a line to the three data points
Answer: We are looking for the line y = ax + b that best models the data. The deviation
of a data point (xi , yi ) from the model is
yi − (axi + b).
By best we mean the line that minimizes the sum of the squares of the deviation. That is
we want to minimize
(Remember, the variables whose values are to be found are a and b.) We do not expand
out the squares, rather we take the derivatives first. Setting the derivatives equal to 0 gives
$\dfrac{\partial D}{\partial a} = -2(2 - a - b) - 4(1 - 2a - b) = 0$ ⇒ 10a + 6b = 8 ⇒ 5a + 3b = 4
$\dfrac{\partial D}{\partial b} = 2b - 2(2 - a - b) - 2(1 - 2a - b) = 0$ ⇒ 6a + 6b = 6 ⇒ 3a + 3b = 3.
This linear system of two equations in two unknowns is easy to solve. We get
$$a = \frac{1}{2}, \quad b = \frac{1}{2}.$$
Here is a plot of the problem.
[Figure: plot of the data points and the line $y = \frac{1}{2}x + \frac{1}{2}$.]
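The answer can be sanity-checked numerically. The data points are not reproduced in this text, so the sketch below assumes the points (0, 0), (1, 2), (2, 1), which are the ones consistent with the two derivative equations in the answer; treat them as an inference, not as given data.

```python
# Data points inferred from the derivative equations in the answer (an assumption).
POINTS = [(0, 0), (1, 2), (2, 1)]

def D(a, b, points=POINTS):
    """Sum of squared deviations of the points from the line y = a*x + b."""
    return sum((y - (a * x + b)) ** 2 for x, y in points)

best = D(0.5, 0.5)
print(best)   # 1.5
# Nudging a or b in either direction should not decrease D.
for da, db in [(0.1, 0), (-0.1, 0), (0, 0.1), (0, -0.1)]:
    assert D(0.5 + da, 0.5 + db) > best
```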
Second Derivative Test
Second-derivative test. Let (x0 , y0 ) be a critical point of f (x, y), and A, B, and C
be as in (1). Then
Example 1. Find the critical points of $w = 12x^2 + y^3 - 12xy$ and determine their type.
Solution. We calculate the partial derivatives easily:
(2) $w_x = 24x - 12y$, $w_y = 3y^2 - 12x$; $A = w_{xx} = 24$, $B = w_{xy} = -12$, $C = w_{yy} = 6y$.
To find the critical points we solve simultaneously the equations wx = 0 and wy = 0; we get
$w_x = 0,\ w_y = 0$ ⇒ $y = 2x,\ y^2 = 4x$ ⇒ $4x^2 = 4x$ ⇒ x = 0, 1 ⇒ (x, y) = (0, 0) or (1, 2).
Thus there are two critical points: (0, 0) and (1, 2). To determine their type, we use the
second derivative test: we have $AC - B^2 = 144y - 144$, so that
at (0, 0), we have $AC - B^2 = -144$, so it is a saddle point;
at (1, 2), we have $AC - B^2 = 144$ and A > 0, so it is a relative minimum.
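The classification step is mechanical enough to script. The sketch below assumes the standard form of the second-derivative test (AC − B² > 0 with A > 0 gives a minimum, with A < 0 a maximum; AC − B² < 0 gives a saddle; AC − B² = 0 is inconclusive) and applies it to the two critical points of this example. The function names are hypothetical.

```python
def classify(A, B, C):
    """Standard second-derivative test on A = w_xx, B = w_xy, C = w_yy at a critical point."""
    disc = A * C - B * B
    if disc > 0:
        return "minimum" if A > 0 else "maximum"
    if disc < 0:
        return "saddle"
    return "inconclusive"

def second_partials(x, y):
    """A, B, C for w = 12x^2 + y^3 - 12xy, as computed in (2)."""
    return 24, -12, 6 * y

for point in [(0, 0), (1, 2)]:
    print(point, classify(*second_partials(*point)))
# (0, 0) saddle
# (1, 2) minimum
```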
The test involves the quantity $AC - B^2$. In general, whenever we see the expressions
$B^2 - 4AC$ or $B^2 - AC$ or their negatives, it means the quadratic formula is involved, in one
of its two forms (the second is often used to get rid of the excess two’s):
(3) $Ax^2 + Bx + C = 0 \;\Rightarrow\; x = \dfrac{-B \pm \sqrt{B^2 - 4AC}}{2A}$
(4) $Ax^2 + 2Bx + C = 0 \;\Rightarrow\; x = \dfrac{-B \pm \sqrt{B^2 - AC}}{A}$
This is what is happening here. We want to know whether, near a critical point P0 , the
graph of our function w = f (x, y) always stays on one side of its horizontal tangent plane
(P0 is then a maximum or minimum point), or whether it lies partly above and partly below
the tangent plane (P0 is then a saddle point). As we will see, this is determined by how the
graph of a quadratic function f (x) lies with respect to the x-axis. Here is the basic lemma.
Proof of the Lemma. To prove (5), we note that the quadratic formula in the form (4)
shows that the zeros of $Ax^2 + 2Bx + C$ are imaginary, i.e., it has no real zeros. Therefore
its graph must lie entirely on one side of the x-axis; which side can be determined from
either A or C, since
$A > 0 \Rightarrow \lim_{x \to \infty} (Ax^2 + 2Bx + C) = \infty$; $\quad C > 0 \Rightarrow Ax^2 + 2Bx + C > 0$ when x = 0.
u = x − x0 , v = y − y0 .
Then the best quadratic approximation is (if the x, y on the left and u, v on the right is
upsetting, just imagine u and v replaced everywhere by x − x0 and y − y0 ):
(13) $w = f(x, y) \approx w_0 + \dfrac{1}{2}\left(Au^2 + 2Buv + Cv^2\right);$
here the coefficients A, B, C are given as in (1) by the second partial derivatives with respect
to u and v at (0, 0), or what is the same (according to the chain rule—see the footnote below),
by the second partial derivatives with respect to x and y at (x0 , y0 ).
(Intuitively, one can see the coefficients have these values by differentiating
both sides of (13) and pretending the approximation is an equality. There are no
linear terms in u and v on the right since (0, 0) is a critical point.)
Since the quadratic function on the right of (13) is the best approximation to w = f (x, y)
for (x, y) close to (x0 , y0 ), it is reasonable to suppose that their graphs are essentially the
same near (x0 , y0 ), so that if the quadratic function has a maximum, minimum or saddle
point there, so will f (x, y). Thus our results for the special case of a quadratic function
having the origin as critical point carry over to the general function f (x, y) at a critical
point (x0 , y0 ), if we interpret A, B, C as the second partial derivatives at (x0 , y0 ).
This is what the second derivative test says.
Footnote: Using u = x − x0 and v = y − y0 , we can apply the chain rule for partial
derivatives, which tells us that for all x, y and the corresponding u, v, we have
$w_x = w_u \dfrac{\partial u}{\partial x} + w_v \dfrac{\partial v}{\partial x} = w_u$, since $u_x = 1$ and $v_x = 0$,
and similarly, wy = wv . Therefore at the corresponding points,
(wx )(x0 ,y0 ) = (wu )(0,0) , (wy )(x0 ,y0 ) = (wv )(0,0) ,
(wxx )(x0 ,y0 ) = (wuu )(0,0) , (wxy )(x0 ,y0 ) = (wuv )(0,0) , (wyy )(x0 ,y0 ) = (wvv )(0,0) .
Second derivative test
$f(x, y) = x^6 + y^3 + 6x - 12y + 7.$
Chain Rule and Total Differentials
1. Find the total differential of $w = x^3 yz + xy + z + 3$ at (1, 2, 3).
Answer: The total differential at the point (x0 , y0 , z0 ) is
dw = wx (x0 , y0 , z0 ) dx + wy (x0 , y0 , z0 ) dy + wz (x0 , y0 , z0 ) dz.
In our case,
$w_x = 3x^2 yz + y$, $w_y = x^3 z + x$, $w_z = x^3 y + 1$.
Substituting in the point (1, 2, 3) we get: wx (1, 2, 3) = 20, wy (1, 2, 3) = 4, wz (1, 2, 3) = 3.
Thus,
dw = 20 dx + 4 dy + 3 dz.
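The total differential predicts the change in w for small changes in x, y, z. As a sketch, the code below compares the predicted change with the actual change for one arbitrarily chosen small displacement from (1, 2, 3).

```python
def w(x, y, z):
    return x**3 * y * z + x * y + z + 3

def dw(dx, dy, dz):
    """Total differential at (1, 2, 3): dw = 20 dx + 4 dy + 3 dz."""
    return 20 * dx + 4 * dy + 3 * dz

# Arbitrary small displacement, chosen only for illustration.
dx, dy, dz = 0.001, -0.002, 0.0005
actual = w(1 + dx, 2 + dy, 3 + dz) - w(1, 2, 3)
print(actual, dw(dx, dy, dz))   # the two values should nearly agree
```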
2. Suppose $w = x^3 yz + xy + z + 3$ and
x = 3 cos t, y = 3 sin t, z = 2t.
Compute $\dfrac{dw}{dt}$ and evaluate it at t = π/2.
Answer: We do not substitute for x, y, z before differentiating, so we can practice the chain
rule.
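$$\frac{dw}{dt} = \frac{\partial w}{\partial x}\frac{dx}{dt} + \frac{\partial w}{\partial y}\frac{dy}{dt} + \frac{\partial w}{\partial z}\frac{dz}{dt}$$
$$= (3x^2 yz + y)(-3 \sin t) + (x^3 z + x)(3 \cos t) + (x^3 y + 1)(2).$$
At t = π/2 we have x = 0, y = 3, z = π, sin π/2 = 1, cos π/2 = 0.
Thus,
$$\left.\frac{dw}{dt}\right|_{\pi/2} = 3(-3) + 3(0) + (1)2 = -7.$$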
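One way to double-check a chain-rule computation is to substitute first and differentiate numerically. The sketch below does that with a central difference; the step size `h` and the helper names are assumptions made for this illustration.

```python
import math

def w_of_t(t):
    """w as a function of t alone, after substituting x = 3 cos t, y = 3 sin t, z = 2t."""
    x, y, z = 3 * math.cos(t), 3 * math.sin(t), 2 * t
    return x**3 * y * z + x * y + z + 3

def derivative(g, t, h=1e-5):
    """Central-difference estimate of g'(t)."""
    return (g(t + h) - g(t - h)) / (2 * h)

print(derivative(w_of_t, math.pi / 2))   # close to -7
```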
3. Show how the tangent approximation formula leads to the chain rule that was used in
the previous problem.
Answer: The approximation formula is
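$$\Delta w \approx \left.\frac{\partial f}{\partial x}\right|_o \Delta x + \left.\frac{\partial f}{\partial y}\right|_o \Delta y + \left.\frac{\partial f}{\partial z}\right|_o \Delta z.$$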
If x, y, z are functions of time then dividing the approximation formula by Δt gives
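$$\frac{\Delta w}{\Delta t} \approx \left.\frac{\partial f}{\partial x}\right|_o \frac{\Delta x}{\Delta t} + \left.\frac{\partial f}{\partial y}\right|_o \frac{\Delta y}{\Delta t} + \left.\frac{\partial f}{\partial z}\right|_o \frac{\Delta z}{\Delta t}.$$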
In the limit as Δt → 0 we get the chain rule.
Note: we use the regular ’d’ for the derivative $\dfrac{dw}{dt}$ because in the chain of computations
t → x, y, z → w
the dependent variable w is ultimately a function of exactly one independent variable t.
Thus, the derivative with respect to t is not a partial derivative.
Chain rule
Now we will formulate the chain rule when there is more than one independent variable.
We suppose w is a function of x, y and that x, y are functions of u, v. That is,
The use of the term chain comes because to compute w we need to do a chain of computations
(u, v) → (x, y) → w.
We will say w is a dependent variable, u and v are independent variables and x and y
are intermediate variables.
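Since w is a function of x and y it has partial derivatives $\dfrac{\partial w}{\partial x}$ and $\dfrac{\partial w}{\partial y}$.
Since, ultimately, w is a function of u and v we can also compute the partial derivatives
$\dfrac{\partial w}{\partial u}$ and $\dfrac{\partial w}{\partial v}$. The chain rule relates these derivatives by the following formulas.
$$\frac{\partial w}{\partial u} = \frac{\partial w}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial w}{\partial y}\frac{\partial y}{\partial u}$$
$$\frac{\partial w}{\partial v} = \frac{\partial w}{\partial x}\frac{\partial x}{\partial v} + \frac{\partial w}{\partial y}\frac{\partial y}{\partial v}.$$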
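Example: Given $w = x^2 y + y^2 + x$, $x = u^2 v$, $y = u v^2$, find $\dfrac{\partial w}{\partial u}$.
Answer: First we compute
$\dfrac{\partial w}{\partial x} = 2xy + 1$, $\dfrac{\partial w}{\partial y} = x^2 + 2y$, $\dfrac{\partial x}{\partial u} = 2uv$, $\dfrac{\partial y}{\partial u} = v^2$, $\dfrac{\partial x}{\partial v} = u^2$, $\dfrac{\partial y}{\partial v} = 2uv$.
The chain rule then implies
$$\frac{\partial w}{\partial u} = \frac{\partial w}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial w}{\partial y}\frac{\partial y}{\partial u} = (2xy + 1)\,2uv + (x^2 + 2y)\,v^2$$
$$\frac{\partial w}{\partial v} = \frac{\partial w}{\partial x}\frac{\partial x}{\partial v} + \frac{\partial w}{\partial y}\frac{\partial y}{\partial v} = (2xy + 1)\,u^2 + (x^2 + 2y)\,2uv.$$
Often, it is okay to leave the variables mixed together. If, for example, you wanted to
compute $\dfrac{\partial w}{\partial u}$ when (u, v) = (1, 2) all you have to do is compute x and y and use these
values, along with u, v, in the formula for $\dfrac{\partial w}{\partial u}$.
x = 2, y = 4 ⇒ $\dfrac{\partial w}{\partial u} = (17)(4) + (12)(4) = 116.$
If you actually need the derivatives expressed in just the variables u and v then you would
have to substitute for x and y.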
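A short script can evaluate the chain-rule formula at (u, v) = (1, 2) and compare it with a direct numerical derivative of w regarded as a function of u and v. This is a sketch for checking arithmetic; the function names and the step `h` are made up for the purpose.

```python
def w_of_uv(u, v):
    """w = x^2 y + y^2 + x with x = u^2 v and y = u v^2 substituted in."""
    x, y = u**2 * v, u * v**2
    return x**2 * y + y**2 + x

def dw_du_chain(u, v):
    """Chain-rule formula (2xy + 1)(2uv) + (x^2 + 2y)(v^2), with the variables left mixed."""
    x, y = u**2 * v, u * v**2
    return (2 * x * y + 1) * (2 * u * v) + (x**2 + 2 * y) * v**2

def dw_du_numeric(u, v, h=1e-6):
    """Central-difference estimate of the partial derivative in u, holding v fixed."""
    return (w_of_uv(u + h, v) - w_of_uv(u - h, v)) / (2 * h)

# At (u, v) = (1, 2): x = 2, y = 4, so 2xy + 1 = 17 and x^2 + 2y = 12.
print(dw_du_chain(1, 2))     # 116
print(dw_du_numeric(1, 2))   # close to 116
```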
Proof of the chain rule:
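Just as before our argument starts with the tangent approximation at the point (x0 , y0 ).
$$\Delta w \approx \left.\frac{\partial w}{\partial x}\right|_o \Delta x + \left.\frac{\partial w}{\partial y}\right|_o \Delta y.$$
Finally, letting Δu → 0 gives the chain rule for $\dfrac{\partial w}{\partial u}$.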
Ambiguous notation
Often you have to figure out the dependent and independent variables from context.
Thermodynamics is a big player here. It has, for example, the variables P , T , V , U , S,
and any two can be taken to be independent and the others are functions of those two.
We will do more with this topic in the future.
Chain rule with more variables
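1. Let w = xyz, $x = u^2 v$, $y = u v^2$, $z = u^2 + v^2$.
a) Use the chain rule to find $\dfrac{\partial w}{\partial u}$.
b) Find the total differential dw in terms of du and dv.
c) Find $\dfrac{\partial w}{\partial u}$ at the point (u, v) = (1, 2).
Answer: a) The chain rule says
$$\frac{\partial w}{\partial u} = \frac{\partial w}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial w}{\partial y}\frac{\partial y}{\partial u} + \frac{\partial w}{\partial z}\frac{\partial z}{\partial u} = (yz)(2uv) + (xz)(v^2) + (xy)(2u).$$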
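b) dw = yz dx + xz dy + xy dz
and
$dx = 2uv\,du + u^2\,dv$, $dy = v^2\,du + 2uv\,dv$, $dz = 2u\,du + 2v\,dv$.
Substituting for dx, dy, dz in the equation for dw gives
$$dw = (2yzuv + xzv^2 + 2xyu)\,du + (yzu^2 + 2xzuv + 2xyv)\,dv.$$
Therefore
$\dfrac{\partial w}{\partial u} = 2yzuv + xzv^2 + 2xyu$ and $\dfrac{\partial w}{\partial v} = yzu^2 + 2xzuv + 2xyv$.
c) (u, v) = (1, 2) ⇒ (x, y, z) = (2, 4, 5) ⇒ $\dfrac{\partial w}{\partial u} = (20)(4) + (10)(4) + (8)(2) = 136.$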
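The value at (u, v) = (1, 2) can be confirmed the same way as in the two-variable case: evaluate the chain-rule expression, then compare with a numerical derivative of w after substitution. A sketch, with hypothetical function names:

```python
def w_of_uv(u, v):
    """w = xyz with x = u^2 v, y = u v^2, z = u^2 + v^2 substituted in."""
    x, y, z = u**2 * v, u * v**2, u**2 + v**2
    return x * y * z

def dw_du_chain(u, v):
    """Chain-rule expression (yz)(2uv) + (xz)(v^2) + (xy)(2u)."""
    x, y, z = u**2 * v, u * v**2, u**2 + v**2
    return (y * z) * (2 * u * v) + (x * z) * v**2 + (x * y) * (2 * u)

def dw_du_numeric(u, v, h=1e-6):
    """Central-difference estimate of the partial derivative in u, holding v fixed."""
    return (w_of_uv(u + h, v) - w_of_uv(u - h, v)) / (2 * h)

print(dw_du_chain(1, 2))     # 136
print(dw_du_numeric(1, 2))   # close to 136
```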
Gradient: definition and properties
Note well the following: (as we look more deeply into properties of the gradient these can
be points of confusion).
1. The gradient takes a scalar function f (x, y) and produces a vector ∇f .
2. The vector ∇f (x, y) lies in the plane.
The gradient has many geometric properties. In the next session we will prove that for
w = f (x, y) the gradient is perpendicular to the level curves f (x, y) = c. We can show this
by direct computation in the following example.
Example 1: Compute the gradient of $w = (x^2 + y^2)/3$ and show that the gradient at
(x0 , y0 ) = (1, 2) is perpendicular to the level curve through that point.
Answer: The gradient is easily computed
$$\nabla w = \langle 2x/3,\ 2y/3 \rangle = \frac{2}{3}\langle x, y \rangle.$$
At (1, 2) we get $\nabla w(1, 2) = \frac{2}{3}\langle 1, 2 \rangle$. The level curve through (1, 2) is
$$(x^2 + y^2)/3 = 5/3,$$
which is identical to $x^2 + y^2 = 5$. That is, it is a circle of radius $\sqrt{5}$ centered
at the origin. Since the gradient at (1,2) is a multiple of ⟨1, 2⟩, it points
radially outward and hence is perpendicular to the circle. Below is a figure
showing the gradient field and the level curves.
[Figure: gradient field and level curves of $z = (x^2 + y^2)/3$.]
Example 2: Consider the graph of $y = e^x$. Find a vector perpendicular to the tangent
to $y = e^x$ at the point (1, e).
Old method: Find the slope, take the negative reciprocal and make the vector.
New method: This graph is the level curve $w = y - e^x = 0$.
$\nabla w = \langle -e^x, 1 \rangle$ ⇒ (at x = 1) $\nabla w(1, e) = \langle -e, 1 \rangle$ is perpendicular to the tangent vector to
the graph, $\mathbf{v} = \langle 1, e \rangle$.
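Perpendicularity in both examples comes down to a dot product being zero, which is easy to confirm numerically. In the sketch below, the tangent direction ⟨−y, x⟩ to a circle centered at the origin is a standard fact supplied here rather than taken from the notes, and `dot` is a small helper written for the occasion.

```python
import math

def dot(p, q):
    """Dot product of two vectors given as tuples."""
    return sum(a * b for a, b in zip(p, q))

# Example 1: w = (x^2 + y^2)/3 at (1, 2); the circle's tangent there is along <-y, x> = <-2, 1>.
grad1 = (2 * 1 / 3, 2 * 2 / 3)
tangent1 = (-2, 1)
print(dot(grad1, tangent1))   # 0.0

# Example 2: w = y - e^x at (1, e); the tangent to y = e^x there is along <1, e>.
grad2 = (-math.e, 1.0)
tangent2 = (1.0, math.e)
print(dot(grad2, tangent2))   # 0.0
```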
Higher dimensions
Similarly, for w = f (x, y, z) we get level surfaces f (x, y, z) = c. The gradient is perpendicular
to the level surfaces.
$w = x^2 + 2y^2 + 3z^2$.
Our surface is the level surface w = 6. Saying the gradient is perpendicular to the surface
means exactly the same thing as saying it is normal to the tangent plane. Computing
Using point normal form we get the equation of the tangent plane is
Gradient: proof that it is perpendicular to level curves and
surfaces
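Let \(r(t) = \langle x(t), y(t), z(t) \rangle\) be a curve on the level surface with
\(r(t_0) = \langle x_0, y_0, z_0 \rangle = P\). We let g(t) = f(x(t), y(t), z(t)).
Since the curve is on the level surface we have g(t) = f(x(t), y(t), z(t)) = c. Differentiating
this equation with respect to t (using the chain rule) gives
\[ \frac{dg}{dt} = \frac{\partial f}{\partial x}\Big|_P \frac{dx}{dt}\Big|_{t_0} + \frac{\partial f}{\partial y}\Big|_P \frac{dy}{dt}\Big|_{t_0} + \frac{\partial f}{\partial z}\Big|_P \frac{dz}{dt}\Big|_{t_0} = 0. \]
In vector form this says \(\nabla f|_P \cdot r'(t_0) = 0\).
Since the dot product is 0, we have shown that the gradient is perpendicular to the tangent
to any curve that lies on the level surface, which is exactly what we needed to show.
[Figure: the gradient ∇f drawn normal to the level surface in xyz-space.]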
Tangent Plane to a Level Surface
1. Find the tangent plane to the surface \(x^2 + 2y^2 + 3z^2 = 36\) at the point P = (1, 2, 3).
Answer: In order to use gradients we introduce a new variable
\[ w = x^2 + 2y^2 + 3z^2. \]
Our surface is then the level surface w = 36. Therefore the normal to the surface is
\[ \nabla w = \langle 2x, 4y, 6z \rangle. \]
At the point P we have \(\nabla w|_P = \langle 2, 8, 18 \rangle\). Using point normal form, the equation of the
tangent plane is
\[ 2(x - 1) + 8(y - 2) + 18(z - 3) = 0, \quad \text{or equivalently} \quad 2x + 8y + 18z = 72. \]
2. Use gradients and level surfaces to find the normal to the tangent plane of the graph of
z = f (x, y) at P = (x0 , y0 , z0 ).
Answer: Introduce the new variable
w = f (x, y) − z.
The graph of z = f (x, y) is just the level surface w = 0. We compute the normal to the
surface to be
\[ \nabla w = \langle f_x, f_y, -1 \rangle. \]
At the point P the normal is \(\langle f_x(x_0, y_0), f_y(x_0, y_0), -1 \rangle\), so the equation of the tangent
plane is
\[ f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0) - (z - z_0) = 0. \]
We can write this in a more compact form as
\[ (z - z_0) = \frac{\partial f}{\partial x}\Big|_0 (x - x_0) + \frac{\partial f}{\partial y}\Big|_0 (y - y_0), \]
which is exactly the formula we saw earlier for the tangent plane to a graph.
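As a quick check of problem 1 above, the following sketch computes the normal vector at P = (1, 2, 3) for the surface x² + 2y² + 3z² = 36 and the constant in the point-normal equation. The helper names `grad_w` and `plane_value` are hypothetical, introduced only for this illustration.

```python
def grad_w(x, y, z):
    """Gradient of w = x^2 + 2y^2 + 3z^2, computed by hand: <2x, 4y, 6z>."""
    return (2 * x, 4 * y, 6 * z)

P = (1, 2, 3)

# P must lie on the level surface w = 36
w_at_P = P[0] ** 2 + 2 * P[1] ** 2 + 3 * P[2] ** 2

# Normal vector to the tangent plane at P
n = grad_w(*P)

def plane_value(point):
    """Left side n . point of the plane equation n . (x, y, z) = d."""
    return sum(ni * qi for ni, qi in zip(n, point))

# The plane passes through P, so d = n . P
d = plane_value(P)
print("normal:", n, " plane: %dx + %dy + %dz = %d" % (n[0], n[1], n[2], d))
```

Dividing the resulting equation by 2 gives the equivalent form x + 4y + 9z = 36.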
Directional Derivatives
Directional derivative
Like all derivatives the directional derivative can be thought of as a ratio. Fix a unit vector
u and a point P0 in the plane. The directional derivative of w at P0 in the direction u
is defined as
\[ \frac{dw}{ds}\Big|_{P_0, u} = \lim_{\Delta s \to 0} \frac{\Delta w}{\Delta s}. \]
Here Δw is the change in w caused by a step of length Δs in the direction of u (all in the
xy-plane).
Below we will show that
\[ \frac{dw}{ds}\Big|_{P_0, u} = \nabla w(P_0) \cdot u. \tag{1} \]
We illustrate this with a figure showing the graph of w = f (x, y). Notice that Δs is
measured in the plane and Δw is the change of w on the graph.
[Figure: the graph of w = f(x, y) over the xy-plane, showing the point P0, the unit vector u, the step Δs in the plane, and the resulting change Δw on the graph.]
Proof of equation 1
The figure below represents the change in position from P0 resulting from taking a step of
size Δs in the u direction.
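[Figure: a step of length Δs from P0 in the direction u in the xy-plane, with horizontal component Δx and vertical component Δy.]
Since \((\Delta s)^2 = (\Delta x)^2 + (\Delta y)^2\) we have that \(\langle \Delta x/\Delta s,\ \Delta y/\Delta s \rangle\) is a unit vector, so
\[ u = \left\langle \frac{\Delta x}{\Delta s}, \frac{\Delta y}{\Delta s} \right\rangle. \]
The tangent plane approximation at P0 is
\[ \Delta w \approx \frac{\partial w}{\partial x}\Big|_{P_0} \Delta x + \frac{\partial w}{\partial y}\Big|_{P_0} \Delta y. \]
Dividing this approximation by Δs gives
\[ \frac{\Delta w}{\Delta s} \approx \frac{\partial w}{\partial x}\Big|_{P_0} \frac{\Delta x}{\Delta s} + \frac{\partial w}{\partial y}\Big|_{P_0} \frac{\Delta y}{\Delta s}. \]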
We can rewrite this as a dot product
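\[ \frac{\Delta w}{\Delta s} \approx \left\langle \frac{\partial w}{\partial x}\Big|_{P_0}, \frac{\partial w}{\partial y}\Big|_{P_0} \right\rangle \cdot \left\langle \frac{\Delta x}{\Delta s}, \frac{\Delta y}{\Delta s} \right\rangle. \]
In the dot product the first term is \(\nabla w|_{P_0}\) and the second is just u, so
\[ \frac{\Delta w}{\Delta s} \approx \nabla w|_{P_0} \cdot u. \]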
Now taking the limit we get equation (1).
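Equation (1) can be illustrated numerically by comparing the ratio Δw/Δs for a small step with the dot product ∇w · u. The function w = x²y + y, the point P0 = (1, 2) and the direction u = ⟨3/5, 4/5⟩ below are hypothetical choices made for this sketch; they do not come from the notes.

```python
def w(x, y):
    # Hypothetical example function (an assumption for this sketch)
    return x ** 2 * y + y

def grad_w(x, y):
    # Partial derivatives of the example function, computed by hand
    return (2 * x * y, x ** 2 + 1)

P0 = (1.0, 2.0)
u = (3 / 5, 4 / 5)   # unit vector: (3/5)^2 + (4/5)^2 = 1

# Ratio delta_w / delta_s for a small step of length ds in the direction u
ds = 1e-6
dw = w(P0[0] + ds * u[0], P0[1] + ds * u[1]) - w(P0[0], P0[1])
ratio = dw / ds

# Right-hand side of equation (1): gradient dotted with u
gx, gy = grad_w(*P0)
exact = gx * u[0] + gy * u[1]

print("difference quotient:", ratio)
print("grad(w) . u        :", exact)
```

The two printed numbers should agree to several decimal places, and the agreement improves as `ds` shrinks (until floating-point roundoff takes over).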
(In the last equation we dropped the |u| because it equals 1.) Now it is obvious that this is
greatest when θ = 0. That is, when ∇w and u are in the same direction.
Lagrange Multipliers
We will give the argument for why Lagrange multipliers work later. Here, we’ll look at
where and how to use them. Lagrange multipliers are used to solve constrained optimization
problems. That is, suppose you have a function, say f (x, y), for which you want to find
the maximum or minimum value. But, you are not allowed to consider all (x, y) while you
look for this value. Instead, the (x, y) you can consider are constrained to lie on some curve
or surface. There are lots of examples of this in science, engineering and economics, for
example, optimizing some utility function under budget constraints.
[Figure: a box with edges labeled x, y and z.]
The area of one side = yz. There are two double thick sides ⇒ cardboard used = 4yz.
The area of the front (and back) = xz. It is single thick ⇒ cardboard used = 2xz.
The area of the bottom = xy. It is triple thick ⇒ cardboard used = 3xy.
Thus, the total cardboard used is
w = f (x, y, z) = 4yz + 2xz + 3xy.
The fixed volume acts as the constraint. It forces a relation between x, y and z so they
can’t all be varied independently. The constraint is
V = xyz = 3.
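Our first job is to set up the equations to look for critical points.
\(\nabla f = \langle 2z + 3y,\ 4z + 3x,\ 4y + 2x \rangle\) and \(\nabla V = \langle yz, xz, xy \rangle\).
The Lagrange multiplier equations are then
\[ \nabla f = \lambda \nabla V, \text{ and } V = 3 \]
\[ \Leftrightarrow \langle 2z + 3y,\ 4z + 3x,\ 4y + 2x \rangle = \lambda \langle yz, xz, xy \rangle, \quad xyz = 3. \]
Next we solve these equations for critical points. We do this by solving for λ in each equation
(we call this solving symmetrically).
\[ \frac{2z + 3y}{yz} = \lambda, \quad \frac{4z + 3x}{xz} = \lambda, \quad \frac{4y + 2x}{xy} = \lambda, \quad xyz = 3 \ \Rightarrow\ \frac{2}{y} + \frac{3}{z} = \frac{4}{x} + \frac{3}{z} = \frac{4}{x} + \frac{2}{y} \]
\[ \Rightarrow\ \frac{2}{y} = \frac{4}{x} \Rightarrow x = 2y \quad \text{and} \quad \frac{3}{z} = \frac{2}{y} \Rightarrow z = \tfrac{3}{2} y. \]
Now, \(xyz = 3 \Rightarrow 3y^3 = 3 \Rightarrow y = 1\).
Answer: x = 2, y = 1, z = 3/2, w = 18.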
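The answer to the box example can be sanity-checked in a few lines. The sketch below verifies that ∇f is a multiple of ∇V at (2, 1, 3/2) and, as an extra check that is not part of the notes, samples nearby points satisfying xyz = 3 to see that none uses less cardboard. The names `cardboard` and `nearby` are hypothetical.

```python
def cardboard(x, y, z):
    """Total cardboard w = 4yz + 2xz + 3xy from the example."""
    return 4 * y * z + 2 * x * z + 3 * x * y

x, y, z = 2.0, 1.0, 1.5

# Gradients from the example, evaluated at the candidate point
grad_f = (2 * z + 3 * y, 4 * z + 3 * x, 4 * y + 2 * x)
grad_V = (y * z, x * z, x * y)

# If grad_f = lambda * grad_V, the three component ratios agree
ratios = [a / b for a, b in zip(grad_f, grad_V)]
w_min = cardboard(x, y, z)

# Sample nearby points on the constraint xyz = 3 (z is forced by x and y)
steps = [-0.1, -0.05, 0.05, 0.1]
nearby = [cardboard(x + dx, y + dy, 3 / ((x + dx) * (y + dy)))
          for dx in steps for dy in steps]

print("lambda candidates:", ratios)
print("w at critical point:", w_min, " smallest nearby w:", min(nearby))
```

All three ratios coincide, which gives the value of λ, and every sampled neighbor on the constraint surface uses at least as much cardboard as the critical point.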
Sphere example:
Minimize w = y constrained to x2 + y 2 + z 2 = 1.
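Answer: Here \(g = x^2 + y^2 + z^2\), so \(\nabla f = \langle 0, 1, 0 \rangle\), \(\nabla g = \langle 2x, 2y, 2z \rangle\).
\(\nabla f = \lambda \nabla g \Rightarrow \langle 0, 1, 0 \rangle = \lambda \langle 2x, 2y, 2z \rangle \Rightarrow x = z = 0\) (λ ≠ 0 because 1 = 2λy).
Constraint ⇒ y = ±1. (y = −1 gives the minimum and y = 1 gives the maximum.)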
Proof of Lagrange Multipliers
Here we will give two arguments, one geometric and one analytic, for why Lagrange
multipliers work.
Critical points
For the function w = f(x, y, z) constrained by g(x, y, z) = c (c a constant) the critical points
are defined as those points which satisfy the constraint and where ∇f is parallel to ∇g.
In equations:
\[ \nabla f(x, y, z) = \lambda \nabla g(x, y, z) \quad \text{and} \quad g(x, y, z) = c. \]
[Figure: level curves of f labeled w = 6, 8, 10, 12, 14, 15, 16, 17, with a point P marked.]
Analytic proof for Lagrange (in three dimensions)
Suppose f has a local maximum at P on the constraint surface.
Let r(t) = (x(t), y(t), z(t)) be an arbitrary parametrized curve which lies on the constraint
surface and has (x(0), y(0), z(0)) = P . Finally, let h(t) = f (x(t), y(t), z(t)). The setup
guarantees that h(t) has a maximum at t = 0.
Taking a derivative using the chain rule in vector form gives
Non-independent Variables
1. We give a worked example here. A fuller explanation will be given in the next session.
Let
\[ w = x^3 y^2 + x^2 y^3 + y \]
and assume x and y satisfy the relation
\[ x^2 + y^2 = 1. \]
We consider x to be the independent variable. Then, because y depends on x, w is
ultimately a function of the single variable x.
a) Compute \(\frac{dw}{dx}\) using implicit differentiation.
b) Compute \(\frac{dw}{dx}\) using total differentials.
Answer:
a) Implicit differentiation means remembering that y is a function of x, e.g., \(\frac{d(y^2)}{dx} = 2y \frac{dy}{dx}\).
Thus,
\[ \frac{dw}{dx} = 3x^2 y^2 + 2x^3 y \frac{dy}{dx} + 2x y^3 + 3x^2 y^2 \frac{dy}{dx} + \frac{dy}{dx}. \]
Now we differentiate the constraint to find \(\frac{dy}{dx}\).
\[ x^2 + y^2 = 1 \ \Rightarrow\ 2x + 2y \frac{dy}{dx} = 0 \ \Rightarrow\ \frac{dy}{dx} = -\frac{x}{y}. \]
Substituting this in the equation for \(\frac{dw}{dx}\) gives
\[ \frac{dw}{dx} = 3x^2 y^2 - 2x^3 y \frac{x}{y} + 2x y^3 - 3x^2 y^2 \frac{x}{y} - \frac{x}{y} = 3x^2 y^2 - 2x^4 + 2x y^3 - 3x^3 y - \frac{x}{y}. \]
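The final formula in part (a) can be compared against a direct numerical derivative. The sketch below assumes the upper branch y = √(1 − x²) of the constraint and the sample point x = 0.6 (so y = 0.8); both choices, and the function names, are made only for illustration.

```python
import math

def w_on_circle(x):
    """w = x^3 y^2 + x^2 y^3 + y with y = sqrt(1 - x^2) (upper branch, an assumption)."""
    y = math.sqrt(1 - x ** 2)
    return x ** 3 * y ** 2 + x ** 2 * y ** 3 + y

def dw_dx_formula(x):
    """dw/dx from implicit differentiation, after substituting dy/dx = -x/y."""
    y = math.sqrt(1 - x ** 2)
    return (3 * x ** 2 * y ** 2 - 2 * x ** 4 + 2 * x * y ** 3
            - 3 * x ** 3 * y - x / y)

x0 = 0.6
h = 1e-6

# Central difference approximation of dw/dx along the constraint
numeric = (w_on_circle(x0 + h) - w_on_circle(x0 - h)) / (2 * h)
formula = dw_dx_formula(x0)

print("central difference:", numeric)
print("implicit formula  :", formula)
```

On the lower branch y = −√(1 − x²) the same formula applies with the corresponding negative value of y.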
Non-independent Variables
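while three of the many possible actual partial derivatives are (we use the chain rule)
\[ \left(\frac{\partial f}{\partial x}\right)_y = f_x + f_z \left(\frac{\partial z}{\partial x}\right)_y = y^2 z^4 + 8x y^2 z^3; \]
\[ \left(\frac{\partial f}{\partial y}\right)_x = f_y + f_z \left(\frac{\partial z}{\partial y}\right)_x = 2xy z^4 + 12 x y^2 z^3; \]
\[ \left(\frac{\partial f}{\partial z}\right)_x = f_y \left(\frac{\partial y}{\partial z}\right)_x + f_z = \tfrac{2}{3} x y z^4 + 4x y^2 z^3. \]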
Rules connecting partial derivatives. These rules are widely used in the applications,
especially in thermodynamics. Here we will use them as an excuse for further practice with
the chain rule and differentials.
With an eye to thermodynamics, we assume a set of variables t, u, v, w, x, y, z, . . .
connected by several equations in such a way that
• any two are independent;
• any three are connected by an equation.
Thus, one can choose any two of them to be the independent variables, and then each of
the other variables can be expressed in terms of these two.
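We give each rule in two forms: the second form is the one ordinarily used, while the
first is easier to remember. (The first two rules are fairly simple in either form.)
\[ \text{(8a,b)} \quad \left(\frac{\partial x}{\partial y}\right)_z \left(\frac{\partial y}{\partial x}\right)_z = 1; \qquad \left(\frac{\partial x}{\partial y}\right)_z = \frac{1}{(\partial y/\partial x)_z} \qquad \text{reciprocal rule} \]
\[ \text{(9a,b)} \quad \left(\frac{\partial x}{\partial y}\right)_z \left(\frac{\partial y}{\partial t}\right)_z = \left(\frac{\partial x}{\partial t}\right)_z; \qquad \left(\frac{\partial x}{\partial y}\right)_z = \frac{(\partial x/\partial t)_z}{(\partial y/\partial t)_z} \qquad \text{chain rule} \]
\[ \text{(10a,b)} \quad \left(\frac{\partial x}{\partial y}\right)_z \left(\frac{\partial y}{\partial z}\right)_x \left(\frac{\partial z}{\partial x}\right)_y = -1; \qquad \left(\frac{\partial x}{\partial y}\right)_z = -\frac{(\partial x/\partial z)_y}{(\partial y/\partial z)_x} \qquad \text{cyclic rule} \]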
Note how the successive factors in the cyclic rule are formed: the variables are used in the
successive orders x, y, z; y, z, x; z, x, y; one says they are permuted cyclically, and this
explains the name.
Proof of the rules. The first two rules are simple: since z is being held fixed throughout,
each variable becomes a function of just one other variable, and (9) is just the one-variable
chain rule. Then (8) is just the special case of (9) where x = t.
The cyclic rule is less obvious. On the right side it looks almost like the chain rule, but
different variables are being held constant in each of the differentiations, and this changes it
entirely. To prove it, we suppose f(x, y, z) = 0 is the equation satisfied by x, y, z; taking y
and z as the independent variables and differentiating f(x, y, z) = 0 with respect to y gives:
\[ \text{(11)} \quad f_x \left(\frac{\partial x}{\partial y}\right)_z + f_y = 0; \quad \text{therefore} \quad \left(\frac{\partial x}{\partial y}\right)_z = -\frac{f_y}{f_x}. \]
Permuting the variables in (11) and multiplying the resulting three equations gives (10a):
\[ \left(\frac{\partial x}{\partial y}\right)_z \left(\frac{\partial y}{\partial z}\right)_x \left(\frac{\partial z}{\partial x}\right)_y = \left(-\frac{f_y}{f_x}\right) \cdot \left(-\frac{f_z}{f_y}\right) \cdot \left(-\frac{f_x}{f_z}\right) = -1. \]
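The cyclic rule (10a) is easy to test numerically on a concrete relation. The sketch below assumes the relation xy = z (for instance the ideal gas law PV = kT with k = 1, reading x, y, z as P, V, T); this relation, the sample point and the helper `partial` are hypothetical choices for illustration only.

```python
# Relation assumed for this sketch: x * y = z, solved for each variable in turn
def x_of(y, z):
    return z / y

def y_of(z, x):
    return z / x

def z_of(x, y):
    return x * y

def partial(fn, a, b, h=1e-6):
    """Central difference in the first argument, holding the second fixed."""
    return (fn(a + h, b) - fn(a - h, b)) / (2 * h)

x0, y0 = 2.0, 3.0
z0 = x0 * y0

dx_dy = partial(x_of, y0, z0)   # (dx/dy) with z held fixed
dy_dz = partial(y_of, z0, x0)   # (dy/dz) with x held fixed
dz_dx = partial(z_of, x0, y0)   # (dz/dx) with y held fixed

product = dx_dy * dy_dz * dz_dx
print("factors:", dx_dy, dy_dz, dz_dx)
print("product:", product)
```

The product comes out at −1 rather than the +1 one might guess from "cancelling" the differentials, because a different variable is held constant in each factor.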
Example 6. Suppose w = w(x, r), with r = r(x, θ). Give an expression for \(\left(\frac{\partial w}{\partial r}\right)_\theta\) in
terms of formal partial derivatives of w and r.
Solution. Evidently the independent variables are to be r and θ, since these are the
ones that occur in the lower part of the partial derivative, with x dependent on them. Since
θ is viewed as a constant, the chain rule gives
\[ \left(\frac{\partial w}{\partial r}\right)_\theta = w_x \left(\frac{\partial x}{\partial r}\right)_\theta + w_r; \]
\[ \left(\frac{\partial x}{\partial r}\right)_\theta = \frac{1}{(\partial r/\partial x)_\theta}, \]
18.02 Problem Set 4
At MIT problem sets are referred to as 'psets'. You will see this term used occasionally
within the problem sets.
The 18.02 psets are split into two parts, 'part I' and 'part II'. The part I problems are all taken
from the supplementary problems. You will find a link to the supplementary problems
and solutions on this website. The intention is that these help the student develop
some fluency with concepts and techniques. Students have access to the solutions
while they do the problems, so they can check their work or get a little help as they
do the problems. After you finish the problems go back and redo the ones for which
you needed help from the solutions.
The part II problems are more involved. At MIT the students do not have access
to the solutions while they work on the problems. They are encouraged to work
together, but they have to write their solutions independently.
At MIT the underlined problems must be done and turned in for grading.
The ‘Others’ are some suggested choices for more practice.
A listing like ’§1B : 2, 5b, 10’ means do the indicated problems from supplementary
problems section 1B.
1 Functions of several variables. Level Curves. Partial derivatives. Tangent plane.
Linear Approximation
§2A: 1c, 2be, 3b, 5a; Others: 1abd, 2acd, 3ac, 4, 5b
§2B: 1b, 6, 9; Others: 1a, 2, 3, 5
Part II (29 points)
Problem 4 (5)
Suppose that three non-negative numbers are restricted by the condition that the
sum of their squares is equal to 27. Using critical point analysis, with 2nd derivative
and/or boundary tests as needed, find the maximum and minimum values of the sum
of their cubes.
2. Partial Differentiation
2A. Functions and Partial Derivatives
2A-1 Sketch five level curves for each of the following functions. Also, for a-d, sketch the
portion of the graph of the function lying in the first octant; include in your sketch the
traces of the graph in the three coordinate planes, if possible.
a) \(1 - x - y\) b) \(\sqrt{x^2 + y^2}\) c) \(x^2 + y^2\) d) \(1 - x^2 - y^2\) e) \(x^2 - y^2\)
2A-2 Calculate the first partial derivatives of each of the following functions:
a) \(w = x^3 y - 3xy^2 + 2y^2\) b) \(z = \frac{x}{y}\) c) \(\sin(3x + 2y)\) d) \(e^{x^2 y}\)
e) \(z = x \ln(2x + y)\) f) \(x^2 z - 2yz^3\)
2A-4 By using \(f_{xy} = f_{yx}\), tell for what value of the constant a there exists a function
f(x, y) for which \(f_x = axy + 3y^2\), \(f_y = x^2 + 6xy\), and then using this value, find such a
function by inspection.
2A-5 Show the following functions w = f(x, y) satisfy the equation \(w_{xx} + w_{yy} = 0\) (called
the two-dimensional Laplace equation):
a) \(w = e^{ax} \sin ay\) (a constant) b) \(w = \ln(x^2 + y^2)\)
2B-3 Using the approximation formula, find the approximate change in the hypotenuse of
a right triangle, if the legs, initially of length 3 and 4, are each increased by .010 .
2B-4 The combined resistance R of two wires in parallel, having resistances R1 and R2
respectively, is given by
\[ \frac{1}{R} = \frac{1}{R_1} + \frac{1}{R_2}. \]
If the resistance in the wires are initially 1 and 2 ohms, with a possible error in each
of ±.1 ohm, what is the value of R, and by how much might this be in error? (Use the
approximation formula.)
2B-5 Give the linearizations of each of the following functions at the indicated points:
a) \((x + y + 2)^2\) at (0, 0); at (1, 2) b) \(e^x \cos y\) at (0, 0); at (0, π/2)
2B-6 To determine the volume of a cylinder of radius around 2 and height around 3, about
how accurately should the radius and height be measured for the error in the calculated
volume not to exceed .1 ?
2B-7 a) If x and y are known to within .01, with what accuracy can the polar coordinates
r and θ be calculated? Assume x = 3, y = 4.
b) At this point, are r and θ more sensitive to small changes in x or in y? Draw a
picture showing x, y, r, θ and confirm your results by using geometric intuition.
2B-8* Two sides of a triangle are a and b, and θ is the included angle. The third side is c.
a) Give the approximation for Δc in terms of a, b, c, θ, and Δa, Δb, Δθ.
b) If a = 1, b = 2, θ = π/3, is c more sensitive to small changes in a or b?
2C-3 Two sides of a triangle have lengths respectively a and b, with θ the included angle.
Let A be the area of the triangle.
a) Express dA in terms of the variables and their differentials.
b) If a = 1, b = 2, θ = π/6, to which variable is A most sensitive? least sensitive?
c) Using the values in (b), if the possible error in each value is .02, what is the possible
error in A, to two decimal places?
2C-4 The pressure, volume, and temperature of an ideal gas confined to a container are
related by the equation P V = kT , where k is a constant depending on the amount of gas
and the units. Calculate dP two ways:
a) Express P in terms of V and T , and calculate dP as usual.
b) Calculate the differential of both sides of the equation, getting a “differential
equation”, and then solve it algebraically for dP .
c) Show the two answers agree.
2C-5 The following equations define w implicitly as a function of the other variables.
Find dw in terms of all the variables by taking the differential of both sides and solving
algebraically for dw.
a) \(\frac{1}{w} = \frac{1}{t} + \frac{1}{u} + \frac{1}{v}\) b) \(u^2 + 2v^2 + 3w^2 = 10\)
2D-3 By viewing the following surfaces as a contour surface of a function f (x, y, z), find
its tangent plane at the given point.
2D-4 The function \(T = \ln(x^2 + y^2)\) gives the temperature at each point in the plane (except
(0, 0)).
a) At the point P : (1, 2), in which direction should you go to get the most rapid increase
in T ?
b) At P , about how far should you go in the direction found in part (a) to get an increase
of .20 in T ?
c) At P , approximately how far should you go in the direction of i + j to get an increase
of about .12?
d) At P , in which direction(s) will the rate of change of temperature be 0?
2D-8 The atmospheric pressure in a region of space near the origin is given by the formula
\(P = 30 + (x + 1)(y + 2)e^z\). Approximately where is the point closest to the origin at which
the pressure is 31.1?
2D-9 The accompanying picture shows the level curves of a function w = f(x, y). The
value of w on each curve is marked. A unit distance is given.
[Figure: level curves labeled 1 through 5, with marked points P, Q and A and a unit length.]
a) Draw in the gradient vector at A.
b) Find a point B where w = 3 and ∂w/∂x = 0.
c) Find a point C where w = 3 and ∂w/∂y = 0.
d) At the point P estimate the value of ∂w/∂x and ∂w/∂y.
e) At the point Q, estimate dw/ds in the direction of i + j.
f) At the point Q, estimate dw/ds in the direction of i − j.
g) Approximately where is the gradient 0?
2E-2 In each of these, information about the gradient of an unknown function f(x, y) is
given; x and y are in turn functions of t. Use the chain rule to find out additional information
about the composite function \(w = f(x(t), y(t))\), without trying to determine f explicitly.
a) ∇w = 2 i + 3 j at P : (1, 0); x = cos t, y = sin t. Find the value of \(\frac{dw}{dt}\) at t = 0.
b) ∇w = y i + x j ; x = cos t, y = sin t. Find \(\frac{dw}{dt}\) and tell for what t-values it is zero.
c) \(\nabla f = \langle 1, -1, 2 \rangle\) at (1, 1, 1). Let \(x = t,\ y = t^2,\ z = t^3\); find \(\frac{df}{dt}\) at t = 1.
d) \(\nabla f = \langle 3x^2 y,\ x^3 + z,\ y \rangle\); \(x = t,\ y = t^2,\ z = t^3\). Find \(\frac{df}{dt}\).
2E-3 a) Use the chain rule for f(u, v), where u = u(t), v = v(t), to prove the product rule
\[ D(uv) = v\,Du + u\,Dv, \quad \text{where } D = \frac{d}{dt}. \]
b) Using the chain rule for f(u, v, w), derive a similar product rule for \(\frac{d}{dt}(uvw)\), and use
it to differentiate \(t e^{2t} \sin t\).
c)* Derive similarly a rule for the derivative \(\frac{d}{dt} u^v\), and use it to differentiate \((\ln t)^t\).
2E-4 Let w = f(x, y), and assume that ∇w = 2 i + 3 j at (0, 1). If \(x = u^2 - v^2\), \(y = uv\),
find \(\frac{\partial w}{\partial u}\), \(\frac{\partial w}{\partial v}\) at u = 1, v = 1.
2E-5 Let w = f(x, y), and suppose we change from rectangular to polar coordinates:
x = r cos θ, y = r sin θ.
a) Show that \((w_x)^2 + (w_y)^2 = (w_r)^2 + \frac{1}{r^2}(w_\theta)^2\).
b) Suppose ∇w = 2 i − j at the point x = 1, y = 1. Find \(\frac{\partial w}{\partial r}\) and \(\frac{\partial w}{\partial \theta}\) when
\(r = \sqrt{2}\), θ = π/4, and verify the relation in part (a) at the point.
2E-6 Let w = f(x, y), and make the change of variables \(x = u^2 - v^2\), \(y = 2uv\). Show
\[ (w_x)^2 + (w_y)^2 = \frac{(w_u)^2 + (w_v)^2}{4(u^2 + v^2)}. \]
2E-7 The Jacobian matrix for the change of variables x = x(u, v), y = y(u, v) is defined
to be \(J = \begin{pmatrix} x_u & x_v \\ y_u & y_v \end{pmatrix}\). Let ∇f(x, y) be represented as the row vector \(\langle f_x\ f_y \rangle\).
Show that
\[ \nabla f\big(x(u, v), y(u, v)\big) = \nabla f(x, y) \cdot J \quad \text{(matrix multiplication)}. \]
2E-8 a) Let w = f(y/x); i.e., w is the composite of the functions w = f(u), u = y/x.
Show that w satisfies the PDE (partial differential equation) \(x \frac{\partial w}{\partial x} + y \frac{\partial w}{\partial y} = 0\).
b)* Let \(w = f(x^2 - y^2)\); show that w satisfies the PDE \(y \frac{\partial w}{\partial x} + x \frac{\partial w}{\partial y} = 0\).
c)* Let w = f(ax + by); show that w satisfies the PDE \(b \frac{\partial w}{\partial x} - a \frac{\partial w}{\partial y} = 0\).
2F. Maximum-minimum Problems
2F-1 Find the point(s) on each of the following surfaces which is closest to the origin.
(Hint: it’s easier to minimize the square of the distance, rather than the distance itself.)
a) \(xyz^2 = 1\) b) \(x^2 - yz = 1\)
2F-2 A rectangular produce box is to be made of cardboard; the sides of single thickness,
the front and back of double thickness, and the bottom of triple thickness, with the top
left open. Its volume is to be 1 cubic foot; what proportions for the sides will use the least
cardboard?
2F-3* Consider all planes passing through the point (2,1,1) and such that the intercepts
on the three coordinate axes are all positive. For which of these planes is the product of the
three intercepts smallest? (Hint: take the plane in the form z = ax + by + c, where a and b
are the independent variables.)
2F-4* Find the extremal point of \(x^2 + 2xy + 4y^2 + 6\) and show it is a minimum point by
completing the square.
2F-5 A drawer in a chest has an open top; the bottom and back are made of cheap wood
costing $1/sq. ft; the sides have to be thicker, and cost $2/sq.ft., while the front costs
$4/sq.ft. for the better quality wood and finishing. The volume is to be 2.5 cu. ft. What
dimensions will produce the drawer costing the least to manufacture?
2G-1 Find by the method of least squares the line which best fits the three data points
given. Do it from scratch, using the formula
\[ D = \sum_{i=1}^{n} \big(y_i - (a x_i + b)\big)^2, \]
which was in the reading on least squares, and differentiation (use the chain rule). Sketch
the line and the three points as a check.
a)* (0, 0), (0, 2), (1, 3) b)* (0, 0), (1, 2), (2, 1) c) (1, 1), (2, 3), (3, 2)
2G-2* Show that the equations (4) for the method of least squares have a unique solution,
unless all the \(x_i\) are equal. Explain geometrically why this exception occurs.
Hint: use the fact that for all values of u, we have \(\sum_{i=1}^{n} (x_i - u)^2 \ge 0\), since
squares are always non-negative. Write the left side as a quadratic polynomial in
u. Usually it has no roots. What does this imply about the coefficients? When
does it have a root? (Answer these two questions by using the quadratic formula.)
2G-3* Use least squares to fit a second degree polynomial exactly through the points
(−1, −1), (0, 0), (1, 3) (you might want to go back and read the last section in the note
about least squares).
2G-4 What linear equations in a, b, c does the method of least squares lead to, when you
use it to fit a linear function z = a + bx + cy to a set of data points (xi , yi , zi ), i = 1, . . . , n?
2G-5* What equations are you led to for determining a when you try to fit the exponential
curve \(y = e^{ax}\) to two data points \((1, y_1), (2, y_2)\) by the method of least squares?
The moral is: don’t do it this way. In general to fit an exponential \(y = ce^{ax}\)
to a set of data points \((x_i, y_i)\), take the log of both sides:
\[ \ln y = ax + \ln c \]
This gives a linear function in the variables x and ln y, whose coefficients a and
ln c can be determined by applying the method of least squares to fit the data
points (xi , ln yi ).
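The log-linearization described above can be sketched in a few lines. The data below are synthetic, generated from y = 2e^{0.5x} purely for illustration, and `fit_line` is a hypothetical helper implementing the standard closed-form solution of the two normal equations for a straight-line fit.

```python
import math

# Synthetic data from y = 2 * exp(0.5 * x) (hypothetical values for this sketch)
xs = [0.0, 1.0, 2.0, 3.0]
ys = [2.0 * math.exp(0.5 * x) for x in xs]

def fit_line(xs, vs):
    """Least-squares line v = slope * x + intercept via the normal equations."""
    n = len(xs)
    sx, sv = sum(xs), sum(vs)
    sxx = sum(x * x for x in xs)
    sxv = sum(x * v for x, v in zip(xs, vs))
    slope = (n * sxv - sx * sv) / (n * sxx - sx * sx)
    intercept = (sv - slope * sx) / n
    return slope, intercept

# Fit a line to the points (x_i, ln y_i); slope is a, intercept is ln c
a, ln_c = fit_line(xs, [math.log(y) for y in ys])
c = math.exp(ln_c)

print("a =", a, " c =", c)
```

Note that this minimizes the squared deviations of ln y rather than of y itself, so with noisy data the result generally differs from a direct least-squares fit of the exponential.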
a) \(x^2 - xy - 2y^2 - 3x - 3y + 1\) b) \(3x^2 + xy + y^2 - x - 2y + 4\) c) \(2x^4 + y^2 - xy + 1\)
2H-2* Use the 2nd-derivative criterion to verify that the critical point \((m_0, b_0)\) determining
the regression (= least-squares) line \(y = m_0 x + b_0\) really minimizes the function D(m, b)
giving the sum of the squares of the deviations. (You will need the inequality in problem
1B-15, for n-vectors \(A = \langle a_1, a_2, \ldots, a_n \rangle\), defining \(|A| = \sqrt{\sum a_i^2}\) and \(A \cdot B = \sum a_i b_i\).)
2H-6 a) Two wires of length 4 are cut in the same way into three pieces, of length x, y
and z; the four x, y pieces are used as the four sides of a rectangle; the two z pieces are bent
at the middle and joined at the ends to make a square of side z/2. Find the rectangle and
square made this way which together have the largest and the smallest total area.
Using the answer, tell what type the critical point is.
b) Confirm the critical point type by using the second derivative test.
2H-7 a) Find the maximum and minimum points of the function \(2x^2 - 2xy + y^2 - 2x\)
on the rectangle R : 0 ≤ x ≤ 2; −1 ≤ y ≤ 2; using this information, determine the type of
the critical point.
b) Confirm the critical point type by using the second derivative test.
2I-2 Using Lagrange multipliers, tell which point P in the first octant and on the surface
\(x^3 y^2 z = 6\sqrt{3}\) is closest to the origin. (As usual, it is easier algebraically to minimize \(|OP|^2\)
rather than \(|OP|\).)
2I-3 (Repeat of 2F-2, but this time use Lagrange multipliers.) A rectangular produce box
is to be made of cardboard; the sides of single thickness, the ends of double thickness, and
the bottom of triple thickness, with the top left open. Its volume is to be 1 cubic foot; what
should be its proportions in order to use the least cardboard?
2I-4 In an open-top wooden drawer, the two sides and back cost $2/sq.ft., the bottom
$1/sq.ft. and the front $4/sq.ft. Using Lagrange multipliers, show that the following
problems lead to the same set of three equations in λ, plus a different fourth equation, and they
have the same solution.
a) Find the dimensions of the drawer with largest capacity that can be made for a total
wood cost of $72.
b) Find the dimensions of the most economical drawer having volume 24 cu. ft.
2J-3 In Example 2, using the chain rule calculate, in terms of x, y, z, t, the derivatives
a) \(\left(\frac{\partial w}{\partial t}\right)_{x,z}\) b) \(\left(\frac{\partial w}{\partial z}\right)_{x,y}\)
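2J-5 Let S = S(p, v, T) be the entropy of a gas, assumed to obey the ideal gas law (1).
Give expressions in terms of the formal partial derivatives \(S_p\), \(S_v\), and \(S_T\) for
a) \(\left(\frac{\partial S}{\partial p}\right)_v\) b) \(\left(\frac{\partial S}{\partial T}\right)_v\)
2J-6 If \(w = u^3 - uv^2\), \(u = xy\), \(v = u + x\), find \(\left(\frac{\partial w}{\partial u}\right)_x\) and \(\left(\frac{\partial w}{\partial x}\right)_u\) using
a) the chain rule b) differentials.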
2J-7 Let P be the point (1, −1, 1), and assume \(z = x^2 + y + 1\), and that f(x, y, z) is a
differentiable function for which ∇f (x, y, z) = 2 i + j − 3 k at P .
Let g(x, z) = f (x, y(x, z), z); find ∇g at the point (1, 1), i.e., x = 1, z = 1.
2J-8 Interpreting r, θ as polar coordinates, let \(w = \sqrt{r^2 - x^2}\).
a) Calculate \(\left(\frac{\partial w}{\partial r}\right)_\theta\), by first writing w in terms of r and θ.
b)* Then calculate it by substituting into the final formula given
in Example 6.
c)* Finally, obtain the answer by intuitive geometrical reasoning (see picture).
[Figure: triangle with sides labeled r, w and x, and angle θ.]
2K-2. For what value(s) of n will \(w = (x^2 + y^2 + z^2)^n\) solve the 3-dimensional Laplace
equation (Notes P, (1))? Where have you seen this function in physics?
2K-3. The solutions in exercises 2K-1 and 2K-2 have circular and spherical symmetry,
respectively. But there are many other solutions. For example
a) find all solutions of the two-dimensional Laplace equation (see 2K-1) of the form
\[ w = ax^2 + bxy + cy^2 \]
and show they can all be written in the form c1 f1 (x, y) + c2 f2 (x, y), where c1 , c2 are
arbitrary constants, and f1 , f2 are two particular polynomials — that is, all such solutions
are linear combinations of two particular polynomial solutions.
b)* Find and derive the analogue of part (a) for all of the cubic polynomial solutions
\(ax^3 + bx^2 y + cxy^2 + dy^3\) to the two-dimensional Laplace equation.
2K-4. Show that the one-dimensional wave equation (Notes P, (4), first equation) is
satisfied by any function of the form
\[ w = f(x + ct) + g(x - ct), \]
where f(u) and g(u) are arbitrary twice-differentiable functions of one variable.
Take g(u) = 0, and interpret physically the solution w = f (x + ct). What does f (x)
represent? What is the relation of f (x + ct) to it?
Note how this exercise shows that a solution to the wave equation can
involve completely arbitrary functions; this is also clear from the remarks about
the Laplace equation being solved by any gravitational or electrostatic potential
function in a mass- or charge-free region of space.
2K-5. Find solutions to the one-dimensional heat equation (Notes P, (5), first equation)
having the form
\[ w = e^{rt} \sin kx, \quad k, r \text{ constants} \]
satisfying the additional conditions for all t:
w(0, t) = 0, w(1, t) = 0 .
18.02 Problem Set 5
1 Differentials. Chain rule
§2C: 1ad, 2, 3, 5ab Others: 1bc
§2E: 1c, 2bc, 8a; Others: 1ab, 2d, 4, 5, 7
Problem 2 (3)
Let f(x, y, z, t) be a smooth function, and let \(\nabla f = \langle f_x, f_y, f_z \rangle\) be the gradient in the
space variables only. Let \(r = r(t) = \langle x(t), y(t), z(t) \rangle\) be a smooth curve, and \(v = r'(t)\);
and suppose we use the notation \(\frac{Df}{Dt} = \frac{d}{dt} f(r(t), t)\).
Use the Chain Rule to show that \(\frac{Df}{Dt} = \frac{\partial f}{\partial t} + v \cdot \nabla f\).
Background: The notation \(\frac{D}{Dt}\) comes from the physics of fluid motion, where it is
called the convective derivative (or material or substantial derivative, and by several
other names), and means the rate of change along a moving path of some physical
quantity (scalar or vector) which is being transported by fluid currents.
In this macroscopic model, the fluid is pictured as a continuum of point masses rather
than as individual molecules. At a location (x, y, z) in space and a time t, the point
mass has a density ρ = ρ(x, y, z, t), and a velocity v = v(x, y, z, t). This means
that the vector v(x, y, z, t) points in direction tangent to the path of a particle at
(x, y, z, t) in the flow, and has magnitude equal to the instantaneous speed of the
particle located at that point and which is moving in the flow.
Now suppose that the curve r = r(t) is a path of a point mass in the flow, so that
(by definition) \(r'(t) = v(r(t), t)\). The convective derivative \(\frac{Df}{Dt}\) of f along this path is
the time rate of change of f using only the values of f(x, y, z, t) for which the space
variables (x, y, z) are restricted to the path \(r(t) = \langle x(t), y(t), z(t) \rangle\) of a particle in the
flow. For this reason you will see the convective derivative described as the rate of
change of the quantity f “moving along the flow” or “moving with an element of the
fluid” (and other similar language).
Problem 3 (5: 1,2,2) (continuation)
Now take the case f = ρ, the density of the fluid. A fluid flow is called incompressible
if \(\frac{D\rho}{Dt} = 0\).
As discussed above, this means that the mass density is constant along the paths of
the flow. Any substance (like water, at moderate pressures) which has the property
that its density is constant in all variables (x, y, z, t) will of course be incompressible,
which is the usual way one pictures something which cannot be compressed. However,
incompressibility is in general a property of the flow rather than just the fluid itself,
since it says only that the rate of change of the density moving along the flow is zero.
The following examples illustrate this.
a) Suppose that the density function depends only on time t but is constant in the
space variables (x, y, z), that is, ρ = ρ(t). Then show that the flow is incompressible
if and only if the density ρ(t) is constant in all the variables (x, y, z, t) (that is, the
constant-density case discussed above).
b) Next suppose instead that the density depends only on the space variables (x, y, z)
but not (explicitly) on t, so that ρ = ρ(x, y, z). An incompressible flow in this case is
called stratified.
Use the result of problem 2 to give the condition on ρ and v for stratified flow.
A flow is called steady if the density ρ and the velocity field v of the flow do not
depend explicitly on the time t, i.e. ρ = ρ(x, y, z) and v = v(x, y, z). In this case,
the term streamlines is used for the paths of the particles in the flow, since they keep
their same shapes over time.
c) Suppose one has a 2D stratified steady flow, so that ρ = ρ(x, y) and v = v(x, y),
and suppose also that the density varies only with the height y. Draw a picture of
the streamlines for such a flow. Then explain why they must follow this pattern,
and why the term “stratified” fits in this case.
(This could be, for example, a cross-section of a very regular ocean current, if it is an
incompressible steady flow whose density varies only with the depth.)
18.02 Problem Set 5, Part II Solutions
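Problem 2
$$\frac{Df}{Dt} = \frac{d}{dt} f(\mathbf{r}(t), t) = \frac{d}{dt} f(x(t), y(t), z(t), t) = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt} + \frac{\partial f}{\partial z}\frac{dz}{dt} + \frac{\partial f}{\partial t}\frac{dt}{dt}$$
$$= \mathbf{r}'(t) \cdot \nabla f(\mathbf{r}(t), t) + \frac{\partial f}{\partial t} = \mathbf{v} \cdot \nabla f + \frac{\partial f}{\partial t},$$
using $\mathbf{v} = \mathbf{r}'(t)$.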
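The identity in Problem 2 can be sanity-checked numerically. The sketch below is only an illustration: the velocity field v = (-y, x, 1), the scalar field f = xy + zt, and the sample time are hypothetical choices, not part of the problem set. It compares a finite-difference rate of change of f along a particle path with the value of ∂f/∂t + v · ∇f at the same point.

```python
import math

# Hypothetical example data (not from the problem set):
# v(x, y, z, t) = (-y, x, 1) and f(x, y, z, t) = x*y + z*t.

def v(x, y, z, t):
    return (-y, x, 1.0)

def f(x, y, z, t):
    return x * y + z * t

def path(t):
    # A particle path satisfying r'(t) = v(r(t), t), starting at (1, 0, 0).
    return (math.cos(t), math.sin(t), t)

def convective_derivative(x, y, z, t):
    # Right-hand side: df/dt + v . grad f, with the partials of f taken by hand.
    fx, fy, fz, ft = y, x, t, z
    vx, vy, vz = v(x, y, z, t)
    return ft + vx * fx + vy * fy + vz * fz

def rate_along_path(t, h=1e-6):
    # Left-hand side: central difference of t -> f(r(t), t).
    fp = f(*path(t + h), t + h)
    fm = f(*path(t - h), t - h)
    return (fp - fm) / (2 * h)

t0 = 0.7
lhs = rate_along_path(t0)
rhs = convective_derivative(*path(t0), t0)
print(lhs, rhs)
```

For this particular choice, f along the path is cos t sin t + t², so both numbers should agree with cos 2t + 2t up to finite-difference error.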
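Problem 3 $\frac{D\rho}{Dt} = \frac{\partial \rho}{\partial t} + \mathbf{v} \cdot \nabla \rho$
(a) If ρ = ρ(t) only, then $\nabla \rho = \langle \rho_x, \rho_y, \rho_z \rangle = \mathbf{0}$. Thus $\frac{D\rho}{Dt} = 0$ if and only if $\frac{\partial \rho}{\partial t} = 0$.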
b) If $\frac{\partial \rho}{\partial t} = 0$, then $\frac{D\rho}{Dt} = 0$ if and only if $\mathbf{v} \cdot \nabla \rho = 0$. So the condition
for stratified flow is that the velocity vectors of the flow are orthogonal to
the density gradients, or, equivalently, tangent to the surfaces of constant
density.
c) If ρ = ρ(y) only, then $\nabla \rho = \langle 0, \rho_y \rangle$, so that the gradient of the density
is always parallel to j. Therefore, by the result of part (b), the streamlines,
which follow the velocity vectors v, are always horizontal. The flow is thus
layered by density, which is consistent with the meaning of the word stratified.
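A small numerical illustration of the stratified-flow condition v · ∇ρ = 0 may help. In this sketch the density profile ρ = 1.0 - 0.1y and the two sample velocity vectors are hypothetical; the point is only that a horizontal velocity gives Dρ/Dt = 0, while a velocity with a vertical component does not.

```python
def grad_rho(x, y):
    # Hypothetical density rho(x, y) = 1.0 - 0.1*y, which depends on height only,
    # so its gradient is (0, rho_y) = (0, -0.1).
    return (0.0, -0.1)

def density_rate_along_flow(v, x, y):
    # Steady flow: the partial of rho in t is 0, so D(rho)/Dt = v . grad(rho).
    gx, gy = grad_rho(x, y)
    return v[0] * gx + v[1] * gy

horizontal = (2.0, 0.0)   # velocity along a horizontal streamline
tilted = (2.0, 0.5)       # velocity with an upward component

horizontal_rate = density_rate_along_flow(horizontal, 1.0, 3.0)
tilted_rate = density_rate_along_flow(tilted, 1.0, 3.0)
print(horizontal_rate, tilted_rate)
```

Only the horizontal velocity keeps the density constant along the particle path, which matches the layered picture described in part (c).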
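(b) We compute
$$\nabla f(x, y) = \langle f_x, f_y \rangle = \langle -1, -4 \rangle.$$
We are looking for a point (x, y) that lies on the line that passes through the
origin in the gradient direction, i.e.,
$$(x, y) = s \langle -1, -4 \rangle.$$
Thus x = −s and y = −4s = 4x. Plugging y = 4x into the level curve for
f = 0,
$$x + 4y = 4,$$
gives
$$x + 16x = 4,$$
or x = 4/17 and y = 16/17.
$$\nabla f(x, y) \cdot \frac{\mathbf{w}}{|\mathbf{w}|} = \langle -1, -4 \rangle \cdot \frac{\langle -2, -1 \rangle}{\sqrt{5}} = \frac{6}{\sqrt{5}}.$$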
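The arithmetic in this solution can be checked with exact fractions. The sketch below assumes only the data visible in the solution: the gradient ⟨−1, −4⟩, the level curve x + 4y = 4, and the direction w = ⟨−2, −1⟩. The variable names are illustrative.

```python
from fractions import Fraction
import math

grad_f = (Fraction(-1), Fraction(-4))

# Point on the line (x, y) = s * grad_f that also lies on x + 4y = 4.
# Substituting gives s * (grad_f[0] + 4 * grad_f[1]) = 4.
s = Fraction(4) / (grad_f[0] + 4 * grad_f[1])
point = (s * grad_f[0], s * grad_f[1])

# Directional derivative of f in the direction of w = <-2, -1>.
w = (-2, -1)
directional = float(grad_f[0] * w[0] + grad_f[1] * w[1]) / math.hypot(*w)
print(point, directional)
```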
18.02 Problem Set 6
At MIT problem sets are referred to as ‘psets’. You will see this term used occasionally
within the problem sets.
The 18.02 psets are split into two parts, ‘part I’ and ‘part II’. The part I problems are all taken
from the supplementary problems. You will find a link to the supplementary problems
and solutions on this website. The intention is that these help the student develop
some fluency with concepts and techniques. Students have access to the solutions
while they do the problems, so they can check their work or get a little help as they
do the problems. After you finish the problems go back and redo the ones for which
you needed help from the solutions.
The part II problems are more involved. At MIT the students do not have access
to the solutions while they work on the problems. They are encouraged to work
together, but they have to write their solutions independently.
At MIT the underlined problems must be done and turned in for grading.
The ‘Others’ are some suggested choices for more practice.
A listing like ’§1B : 2, 5b, 10’ means do the indicated problems from supplementary
problems section 1B.
1 Lagrange multipliers.
§2I: 1, 3; Others: 2, 4
2 Non-independent variables.
§2J: 1, 2, 3a, 4b, 5a, 7; Others: 3b, 4a, 6
Part II (15 points)
a) Determine what choice of I1 and I2 will minimize energy loss and hence determine
what the currents will be along the two paths. (Alternatively, if you already are
familiar with resistors in parallel and current flows, verify that the currents I1 and I2
do in fact minimize energy loss).
b) Suppose instead we had three resistors in parallel. In terms of R1, R2, and R3
determine the values of I1 , I2 , and I3 which minimize energy loss.
Problem 3 (7: 1,2,2,2)
Using the usual rectangular and polar coordinates, let w be the area of the right
triangle in the first quadrant having its vertices at (0, 0), (x, 0) and (x, y). Using the
equation expressing w in terms of x and y and the equations expressing y in terms of
x and θ, calculate the two partial derivatives $\left(\frac{\partial w}{\partial x}\right)_\theta$ and $\left(\frac{\partial w}{\partial \theta}\right)_x$ in three different
ways.
18.02 Problem Set 6, Part II Solutions
Problem 2
a) We want to minimize
$$I_1^2 R_1 + I_2^2 R_2$$
subject to
I1 + I2 = I
where I is a constant. Using Lagrange multipliers we get the equations:
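$$2 I_1 R_1 = \lambda, \quad 2 I_2 R_2 = \lambda, \quad I_1 + I_2 = I$$
which we solve to get that
$$I_1 = \frac{R_2}{R_1 + R_2} I, \quad I_2 = \frac{R_1}{R_1 + R_2} I.$$
b) We want to minimize
$$I_1^2 R_1 + I_2^2 R_2 + I_3^2 R_3$$
subject to
$$I_1 + I_2 + I_3 = I$$
where I is a constant. Using Lagrange multipliers we get the equations:
$$2 I_1 R_1 = \lambda, \quad 2 I_2 R_2 = \lambda, \quad 2 I_3 R_3 = \lambda, \quad I_1 + I_2 + I_3 = I$$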
which we solve to get that
$$I_1 = \frac{R_2 R_3}{D} I, \quad I_2 = \frac{R_1 R_3}{D} I, \quad I_3 = \frac{R_2 R_1}{D} I,$$
where $D = R_1 R_3 + R_2 R_3 + R_1 R_2$.
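As a rough check of the three-resistor formula, the sketch below uses hypothetical values (resistances 2, 3, 6 and total current 12) and exact fractions. It builds the currents proportional to 1/R_k, compares them with the R_j R_k / D expressions above, and confirms that moving a small amount of current from one branch to another, which keeps the total fixed, only increases the energy loss.

```python
from fractions import Fraction

def energy(currents, resistances):
    # Energy loss: sum of I_k^2 * R_k over the parallel branches.
    return sum(i * i * r for i, r in zip(currents, resistances))

def optimal_currents(total, resistances):
    # Lagrange condition 2 * I_k * R_k = lambda means I_k is proportional to 1/R_k;
    # scale so that the currents add up to `total`.
    conductances = [1 / Fraction(r) for r in resistances]
    scale = Fraction(total) / sum(conductances)
    return [scale * g for g in conductances]

R = [2, 3, 6]   # hypothetical resistances
I = 12          # hypothetical total current

best = optimal_currents(I, R)

# The closed form from the solution, with D = R1*R3 + R2*R3 + R1*R2.
D = R[0] * R[2] + R[1] * R[2] + R[0] * R[1]
formula = [Fraction(R[1] * R[2], D) * I,
           Fraction(R[0] * R[2], D) * I,
           Fraction(R[0] * R[1], D) * I]

# Shift current d from branch 2 to branch 1; the constraint still holds.
shifted = [[best[0] + d, best[1] - d, best[2]]
           for d in (Fraction(1, 10), Fraction(-1, 10), Fraction(1), Fraction(-1))]
print(best, energy(best, R))
```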
Problem 3
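[Fig. 1: the right triangle with vertices (0, 0), (x, 0) and (x, y), hypotenuse r and angle θ at the origin. Fig. 2: the same triangle with x increased by Δx. Fig. 3: the same triangle with θ increased, so that the hypotenuse becomes r + Δr.]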
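a) y = x tan θ (see Fig. 1). Area = $w = \frac{1}{2} x y = \frac{1}{2} x^2 \tan\theta$.
$$\Rightarrow \left(\frac{\partial w}{\partial x}\right)_\theta = x \tan\theta = y, \quad \text{and} \quad \left(\frac{\partial w}{\partial \theta}\right)_x = \frac{1}{2} x^2 \sec^2\theta.$$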
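b) As before, y = x tan θ and $w_x = \frac{1}{2} y$, $w_y = \frac{1}{2} x$.
$$\left(\frac{\partial w}{\partial x}\right)_\theta = w_x \left(\frac{\partial x}{\partial x}\right)_\theta + w_y \left(\frac{\partial y}{\partial x}\right)_\theta = \frac{1}{2} y + \frac{1}{2} x \tan\theta = \frac{1}{2} y + \frac{1}{2} y = y,$$
$$\left(\frac{\partial w}{\partial \theta}\right)_x = w_x \left(\frac{\partial x}{\partial \theta}\right)_x + w_y \left(\frac{\partial y}{\partial \theta}\right)_x = 0 + \frac{1}{2} x^2 \sec^2\theta = \frac{1}{2} x^2 \sec^2\theta.$$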
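c) $dw = \frac{1}{2} y\, dx + \frac{1}{2} x\, dy$, $dy = \tan\theta\, dx + x \sec^2\theta\, d\theta$.
Eliminate dy from the equation for dw.
$$dw = \frac{1}{2} y\, dx + \frac{1}{2} x (\tan\theta\, dx + x \sec^2\theta\, d\theta) = \left(\frac{1}{2} y + \frac{1}{2} x \tan\theta\right) dx + \left(\frac{1}{2} x^2 \sec^2\theta\right) d\theta.$$
$$\Rightarrow \left(\frac{\partial w}{\partial x}\right)_\theta = \frac{1}{2} y + \frac{1}{2} x \tan\theta = y, \quad \text{and} \quad \left(\frac{\partial w}{\partial \theta}\right)_x = \frac{1}{2} x^2 \sec^2\theta.$$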
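d) If we fix θ and vary x then (see Fig. 2)
$$\Delta w = \text{area of trapezoidal strip at right} = \Delta x \cdot \frac{1}{2}(y + y + \Delta y) = y \Delta x + \frac{1}{2} \Delta x \cdot \Delta y \approx y \Delta x.$$
(We ignore second order terms.)
$$\Rightarrow \frac{\Delta w}{\Delta x} \approx y \quad \Rightarrow \quad \left(\frac{\partial w}{\partial x}\right)_\theta = y.$$
If we fix x and vary θ then (see Fig. 3) Δw = area of thin wedge.
The angle of the wedge is Δθ and
$$\Delta w = \frac{1}{2} r (r + \Delta r) \sin(\Delta\theta) \approx \frac{1}{2} r (r + \Delta r) \Delta\theta \approx \frac{1}{2} r^2 \Delta\theta.$$
(Here, we've used sin x ≈ x and then dropped second order terms.)
$$\Rightarrow \frac{\Delta w}{\Delta \theta} \approx \frac{1}{2} r^2 = \frac{1}{2} x^2 \sec^2\theta \quad \Rightarrow \quad \left(\frac{\partial w}{\partial \theta}\right)_x = \frac{1}{2} x^2 \sec^2\theta.$$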
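The two constrained partial derivatives can also be checked by finite differences. This is a minimal sketch at a hypothetical sample point (x = 2, θ = π/6); the helper names are illustrative. It also evaluates the partial of w in x with y held fixed, to show that the answer depends on which variable is held constant.

```python
import math

def area(x, theta):
    # w = (1/2) x y with y = x tan(theta), so w is a function of x and theta.
    return 0.5 * x * (x * math.tan(theta))

def central_difference(g, a, h=1e-6):
    # Symmetric difference quotient approximating g'(a).
    return (g(a + h) - g(a - h)) / (2 * h)

x0, theta0 = 2.0, math.pi / 6   # hypothetical sample point
y0 = x0 * math.tan(theta0)

# (dw/dx) with theta held fixed, and (dw/dtheta) with x held fixed.
dw_dx_fixed_theta = central_difference(lambda x: area(x, theta0), x0)
dw_dtheta_fixed_x = central_difference(lambda t: area(x0, t), theta0)

# For contrast: (dw/dx) with y held fixed, using w = (1/2) x y directly.
dw_dx_fixed_y = central_difference(lambda x: 0.5 * x * y0, x0)

expected_dtheta = 0.5 * x0**2 / math.cos(theta0)**2
print(dw_dx_fixed_theta, y0)
print(dw_dtheta_fixed_x, expected_dtheta)
print(dw_dx_fixed_y, 0.5 * y0)
```

Holding θ fixed gives y, while holding y fixed gives y/2, which is why the subscript on the partial derivative matters.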