Differentiation Theory
\[
\frac{\partial f}{\partial x_j}(x_1, \dots, x_n)
= \lim_{h \to 0} \frac{f(x_1, x_2, \dots, x_j + h, \dots, x_n) - f(x_1, \dots, x_n)}{h}
= \lim_{h \to 0} \frac{f(\mathbf{x} + h\mathbf{e}_j) - f(\mathbf{x})}{h}.
\]
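In other words, $\partial f/\partial x_j$ is just the derivative of $f$ with respect to the variable $x_j$,
with the other variables held fixed. If $f : \mathbb{R}^3 \to \mathbb{R}$, we shall often use the notation
$\partial f/\partial x$, $\partial f/\partial y$, $\partial f/\partial z$ in place of $\partial f/\partial x_1$, $\partial f/\partial x_2$, $\partial f/\partial x_3$. If $f : U \subset \mathbb{R}^n \to \mathbb{R}^m$,
then we can write
\[
f(x_1, \dots, x_n) = \bigl(f_1(x_1, \dots, x_n), \dots, f_m(x_1, \dots, x_n)\bigr),
\]
so that we can speak of the partial derivatives of each component; for example,
$\partial f_m/\partial x_n$ is the partial derivative of the $m$th component with respect to $x_n$, the $n$th
variable.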
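The difference quotient in the definition can be tried out numerically. The following is a minimal sketch, not part of the original notes: the helper `partial_derivative`, the step size `h = 1e-6`, and the sample function are all illustrative assumptions, and a small fixed `h` only approximates the limit.

```python
def partial_derivative(f, x, j, h=1e-6):
    """Approximate the partial derivative of f with respect to x[j] at x.

    Evaluates the difference quotient (f(x + h*e_j) - f(x)) / h from the
    definition, with a small fixed h standing in for the limit h -> 0.
    """
    shifted = list(x)
    shifted[j] += h  # x + h*e_j: only the j-th coordinate moves
    return (f(shifted) - f(x)) / h


def f(x):
    # Illustrative function (an assumption): f(x1, x2) = x1^2 * x2 + x2^3
    return x[0] ** 2 * x[1] + x[1] ** 3


point = [1.0, 2.0]
df_dx1 = partial_derivative(f, point, 0)  # exact value is 2*x1*x2 = 4
df_dx2 = partial_derivative(f, point, 1)  # exact value is x1^2 + 3*x2^2 = 13
print(df_dx1, df_dx2)
```

Each call holds every coordinate fixed except the $j$th, which is exactly what "the other variables held fixed" means.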
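To indicate that a partial derivative is to be evaluated at a particular point, for
example, at $(x_0, y_0)$, we write
\[
\frac{\partial f}{\partial x}(x_0, y_0)
\quad\text{or}\quad
\left.\frac{\partial f}{\partial x}\right|_{x = x_0,\, y = y_0}
\quad\text{or}\quad
\left.\frac{\partial f}{\partial x}\right|_{(x_0, y_0)}.
\]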
When we write $z = f(x, y)$ for the dependent variable, we sometimes write $\partial z/\partial x$
for $\partial f/\partial x$. Strictly speaking, this is an abuse of notation, but it is common practice
to use these two notations interchangeably.
6.2 The Linear or Affine Approximation
To “motivate” our definition of differentiability, let us compute what the equation
of the plane tangent to the graph of $f : \mathbb{R}^2 \to \mathbb{R}$, $(x, y) \mapsto f(x, y)$ at $(x_0, y_0)$ ought
to be if $f$ is smooth enough. In $\mathbb{R}^3$, a non-vertical plane has an equation of the form
\[
z = ax + by + c.
\]
If it is to be the plane tangent to the graph of $f$, the slopes along the $x$ and $y$ axes
must be equal to $\partial f/\partial x$ and $\partial f/\partial y$, the rates of change of $f$ with respect to $x$ and
$y$. Thus, $a = \partial f/\partial x$, $b = \partial f/\partial y$ (evaluated at $(x_0, y_0)$). Finally, we may determine
the constant $c$ from the fact that $z = f(x_0, y_0)$ when $x = x_0$, $y = y_0$. Thus, we get
the linear approximation (or, more accurately, the affine approximation):
\[
z = f(x_0, y_0) + \left[\frac{\partial f}{\partial x}(x_0, y_0)\right](x - x_0) + \left[\frac{\partial f}{\partial y}(x_0, y_0)\right](y - y_0), \tag{1}
\]
which should be the equation of the plane tangent to the graph of $f$ at $(x_0, y_0)$,
if $f$ is “smooth enough” (see figure below). Our definition of differentiability will
mean in effect that the plane defined by the linear approximation (1) is a “good”
approximation of $f$ near $(x_0, y_0)$.
Figure 16: For points $(x, y)$ near $(x_0, y_0)$, the graph of the tangent plane is close to
the graph of $f$.
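To make the claim about closeness concrete, here is a small sketch that evaluates the affine approximation (1) for one function. The choice $f(x, y) = x^2 + y^2$, the base point $(1, 2)$, and the function names are illustrative assumptions, not taken from the notes; the partial derivatives are entered by hand.

```python
def f(x, y):
    # Illustrative function (an assumption): f(x, y) = x^2 + y^2
    return x ** 2 + y ** 2


def tangent_plane(x, y, x0=1.0, y0=2.0):
    """Right-hand side of equation (1) for f above, based at (x0, y0)."""
    fx = 2 * x0  # partial derivative of f with respect to x at (x0, y0)
    fy = 2 * y0  # partial derivative of f with respect to y at (x0, y0)
    return f(x0, y0) + fx * (x - x0) + fy * (y - y0)


# Move toward (1, 2) along the diagonal and compare f with the plane.
for d in (0.1, 0.01, 0.001):
    x, y = 1.0 + d, 2.0 + d
    error = abs(f(x, y) - tangent_plane(x, y))
    print(d, error)
```

For this particular $f$ the error equals $(x - x_0)^2 + (y - y_0)^2$, so it shrinks like the square of the distance to $(x_0, y_0)$, faster than the distance itself.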
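Definition 6.2. (Differentiable: Two Variables) Let $f : \mathbb{R}^2 \to \mathbb{R}$. We
say $f$ is differentiable at $(x_0, y_0)$ if $\partial f/\partial x$ and $\partial f/\partial y$ exist at $(x_0, y_0)$ and
if
\[
\frac{f(x, y) - f(x_0, y_0) - \left[\dfrac{\partial f}{\partial x}(x_0, y_0)\right](x - x_0) - \left[\dfrac{\partial f}{\partial y}(x_0, y_0)\right](y - y_0)}{\|(x, y) - (x_0, y_0)\|} \to 0 \tag{2}
\]
as $(x, y) \to (x_0, y_0)$.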
If $f$ is differentiable at $(x_0, y_0)$, the plane in $\mathbb{R}^3$ defined by equation (1) is called the tangent plane of the graph of $f$ at the point $(x_0, y_0, f(x_0, y_0))$.
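The definition of differentiability asserts that the expression
\[
f(x_0, y_0) + \left[\frac{\partial f}{\partial x}(x_0, y_0)\right](x - x_0) + \left[\frac{\partial f}{\partial y}(x_0, y_0)\right](y - y_0) \tag{3}
\]
is our good approximation to $f$ near $(x_0, y_0)$. “Good” is taken in the sense that
expression (3) differs from $f(x, y)$ by something small times $\sqrt{(x - x_0)^2 + (y - y_0)^2}$.
We say that expression (3) is the best linear approximation to $f$ near $(x_0, y_0)$.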
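The quotient in condition (2) can be evaluated directly, which also shows that the existence of the partial derivatives alone is not enough. The sketch below is an illustration under stated assumptions: the helper `quotient` and both sample functions are hypothetical choices, and sampling a few points along one line only suggests the limiting behavior rather than proving it.

```python
import math


def quotient(f, fx, fy, x0, y0, x, y):
    """Left-hand side of condition (2), given the partials fx, fy at (x0, y0)."""
    numerator = f(x, y) - f(x0, y0) - fx * (x - x0) - fy * (y - y0)
    return numerator / math.hypot(x - x0, y - y0)


def smooth(x, y):
    # Illustrative function with partials 2 and 4 at (1, 2)
    return x ** 2 + y ** 2


def corner(x, y):
    # Illustrative function: it vanishes on both axes, so both partial
    # derivatives at the origin exist and equal 0
    return math.sqrt(abs(x * y))


# Approach the base point along the diagonal direction (t, t).
for t in (1e-1, 1e-2, 1e-3):
    q_smooth = quotient(smooth, 2.0, 4.0, 1.0, 2.0, 1.0 + t, 2.0 + t)
    q_corner = quotient(corner, 0.0, 0.0, 0.0, 0.0, t, t)
    print(t, q_smooth, q_corner)
```

For `smooth` the quotient equals $\sqrt{2}\,t$ and tends to 0. For `corner` it stays at $1/\sqrt{2}$ along the diagonal, so condition (2) fails at the origin even though both partial derivatives exist there.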
6.5 Differentiability: The General Case
Now we are ready to give a definition of differentiability for maps $f$ of $\mathbb{R}^n$ to
$\mathbb{R}^m$, using the preceding discussion as motivation. The derivative $\mathbf{D}f(\mathbf{x}_0)$ of $f =
(f_1, \dots, f_m)$ at a point $\mathbf{x}_0$ is a matrix $\mathbf{T}$ whose elements are $t_{ij} = \partial f_i/\partial x_j$ evaluated
at $\mathbf{x}_0$.
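We say that $f : U \subset \mathbb{R}^n \to \mathbb{R}^m$ is differentiable at $\mathbf{x}_0 \in U$ if the partial derivatives of $f$ exist at $\mathbf{x}_0$ and if
\[
\lim_{\mathbf{x} \to \mathbf{x}_0} \frac{\|f(\mathbf{x}) - f(\mathbf{x}_0) - \mathbf{T}(\mathbf{x} - \mathbf{x}_0)\|}{\|\mathbf{x} - \mathbf{x}_0\|} = 0, \tag{4}
\]
where $\mathbf{T} = \mathbf{D}f(\mathbf{x}_0)$ is the $m \times n$ matrix with matrix elements $\partial f_i/\partial x_j$ evaluated
at $\mathbf{x}_0$ and $\mathbf{T}(\mathbf{x} - \mathbf{x}_0)$ means the product of $\mathbf{T}$ with $\mathbf{x} - \mathbf{x}_0$ (regarded as
a column matrix). We call $\mathbf{T}$ the derivative of $f$ at $\mathbf{x}_0$.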
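In the case $m = 1$, writing $\mathbf{h} = \mathbf{x} - \mathbf{x}_0$, the differentiability condition becomes
\[
\lim_{\mathbf{h} \to \mathbf{0}} \frac{1}{\|\mathbf{h}\|} \left| f(\mathbf{x}_0 + \mathbf{h}) - f(\mathbf{x}_0) - \sum_{j=1}^{n} \frac{\partial f}{\partial x_j}(\mathbf{x}_0)\, h_j \right| = 0,
\]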
because
\[
\mathbf{T}\mathbf{h} = \sum_{j=1}^{n} h_j \frac{\partial f}{\partial x_j}(\mathbf{x}_0).
\]
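For the general case of $f$ mapping a subset of $\mathbb{R}^n$ to $\mathbb{R}^m$, the derivative is the $m \times n$ matrix
\[
\mathbf{D}f(\mathbf{x}_0) =
\begin{bmatrix}
\dfrac{\partial f_1}{\partial x_1} & \cdots & \dfrac{\partial f_1}{\partial x_n} \\
\vdots & & \vdots \\
\dfrac{\partial f_m}{\partial x_1} & \cdots & \dfrac{\partial f_m}{\partial x_n}
\end{bmatrix},
\]
where $\partial f_i/\partial x_j$ is evaluated at $\mathbf{x}_0$. The matrix $\mathbf{D}f(\mathbf{x}_0)$ is, appropriately, called the
matrix of partial derivatives of $f$ at $\mathbf{x}_0$.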
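The matrix of partial derivatives can be approximated entry by entry with difference quotients. The following is a rough sketch under stated assumptions: the helper `jacobian`, the step size, and the sample map from $\mathbb{R}^3$ to $\mathbb{R}^2$ are illustrative and not from the notes.

```python
def jacobian(f, x, h=1e-6):
    """Approximate the m x n matrix of partial derivatives of f at x.

    Entry [i][j] is the difference quotient of the i-th component of f
    with respect to the j-th variable.
    """
    fx = f(x)
    m, n = len(fx), len(x)
    T = [[0.0] * n for _ in range(m)]
    for j in range(n):
        shifted = list(x)
        shifted[j] += h  # perturb only the j-th variable
        f_shifted = f(shifted)
        for i in range(m):
            T[i][j] = (f_shifted[i] - fx[i]) / h
    return T


def f(v):
    # Illustrative map (an assumption): f(x, y, z) = (x*y, y + z^2)
    x, y, z = v
    return [x * y, y + z ** 2]


T = jacobian(f, [1.0, 2.0, 3.0])
# Exact matrix at (1, 2, 3): [[y, x, 0], [0, 1, 2z]] = [[2, 1, 0], [0, 1, 6]]
for row in T:
    print(row)
```

Row $i$ collects the partial derivatives of the component $f_i$, and column $j$ collects the derivatives with respect to $x_j$, so a map from $\mathbb{R}^3$ to $\mathbb{R}^2$ gives a $2 \times 3$ matrix.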
6.6 Gradients
For real-valued functions we use special terminology for the derivative.
We can form the corresponding vector $(\partial f/\partial x_1, \dots, \partial f/\partial x_n)$, called the gradient
of $f$ and denoted by $\nabla f$, or $\operatorname{grad} f$.
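For $f : \mathbb{R}^3 \to \mathbb{R}$,
\[
\nabla f = \frac{\partial f}{\partial x}\,\mathbf{i} + \frac{\partial f}{\partial y}\,\mathbf{j} + \frac{\partial f}{\partial z}\,\mathbf{k},
\]
while for $f : \mathbb{R}^2 \to \mathbb{R}$,
\[
\nabla f = \frac{\partial f}{\partial x}\,\mathbf{i} + \frac{\partial f}{\partial y}\,\mathbf{j}.
\]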
In terms of inner products, we can write the derivative of $f$ as
\[
\mathbf{D}f(\mathbf{x})(\mathbf{h}) = \nabla f(\mathbf{x}) \cdot \mathbf{h}.
\]
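The inner-product formula can be checked on a small example by comparing $\nabla f(\mathbf{x}) \cdot \mathbf{h}$ with the actual change $f(\mathbf{x} + \mathbf{h}) - f(\mathbf{x})$ for a small $\mathbf{h}$. This is only a sketch: the function $f(x, y, z) = xy + z^2$, its hand-computed gradient, and the particular displacement are illustrative assumptions.

```python
def f(x, y, z):
    # Illustrative function (an assumption): f(x, y, z) = x*y + z^2
    return x * y + z ** 2


def gradient(x, y, z):
    # Gradient of f above, computed by hand: (y, x, 2z)
    return (y, x, 2 * z)


def dot(u, v):
    # Inner product of two vectors given as tuples
    return sum(a * b for a, b in zip(u, v))


x0 = (1.0, 2.0, 3.0)
h = (1e-3, -2e-3, 5e-4)  # a small displacement

linear_change = dot(gradient(*x0), h)  # the derivative applied to h
actual_change = f(*(a + b for a, b in zip(x0, h))) - f(*x0)
print(linear_change, actual_change)
```

The two numbers differ by terms of second order in $\mathbf{h}$, which is the sense in which the derivative gives the best linear approximation to the change in $f$.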
It is very reasonable that a differentiable function must be continuous, because “differentiability” means that there is enough
smoothness to have a tangent plane, which is stronger than just being continuous.
As we have seen, it is usually easy to tell when the partial derivatives of a function
exist using what we know from one-variable calculus. However, the definition of
differentiability looks somewhat complicated, and the required approximation condition
in equation (4) may seem, and sometimes is, difficult to verify. Fortunately,
there is a simple criterion, given in the following theorem, that tells us when a
function is differentiable.
Theorem 6.2. Let $f : U \subset \mathbb{R}^n \to \mathbb{R}^m$. Suppose the partial derivatives
$\partial f_i/\partial x_j$ of $f$ all exist and are continuous in a neighborhood of a point $\mathbf{x} \in U$.
Then $f$ is differentiable at $\mathbf{x}$.
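As a brief worked illustration of how the theorem is typically applied, consider the map below. The particular function is an assumption chosen for this example and does not come from the notes.

```latex
\begin{align*}
f(x, y) &= \bigl(e^{x}\cos y,\; x^{2} y\bigr), \\
\frac{\partial f_1}{\partial x} &= e^{x}\cos y, &
\frac{\partial f_1}{\partial y} &= -e^{x}\sin y, \\
\frac{\partial f_2}{\partial x} &= 2xy, &
\frac{\partial f_2}{\partial y} &= x^{2}, \\
\mathbf{D}f(x, y) &=
\begin{bmatrix}
e^{x}\cos y & -e^{x}\sin y \\
2xy & x^{2}
\end{bmatrix}.
\end{align*}
```

All four partial derivatives are continuous on all of $\mathbb{R}^2$, so by Theorem 6.2 this $f$ is differentiable at every point, without any need to check the limit in the definition directly.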