
Notes on Differentiation

1 The Chain Rule


This is the following famous result:

1.1 Theorem. Suppose $U$ and $V$ are open subsets of $\mathbb{C}$, and $f$ and $g$ are
complex-valued functions defined on $U$ and $V$ respectively, where $f(U) \subset V$.
Suppose that $z_0 \in U$ (so that $f(z_0) \in V$). If $f$ is (complex) differentiable at
$z_0$ and $g$ is differentiable at $f(z_0)$, then $g \circ f : U \to \mathbb{C}$ is
differentiable at $z_0$, and $(g \circ f)'(z_0) = g'(f(z_0))\,f'(z_0)$.

Proof. Let $w_0 = f(z_0)$. Our hypotheses are that

$$f(z) - f(z_0) = f'(z_0)(z - z_0) + R(z) \quad\text{where}\quad \frac{|R(z)|}{|z - z_0|} \to 0 \text{ as } z \to z_0, \tag{1}$$

and

$$g(w) - g(w_0) = g'(w_0)(w - w_0) + S(w) \quad\text{where}\quad \frac{|S(w)|}{|w - w_0|} \to 0 \text{ as } w \to w_0. \tag{2}$$

Our goal is to show that if $h = g \circ f$ then

$$h(z) - h(z_0) = g'(f(z_0))\,f'(z_0)(z - z_0) + T(z) \quad\text{where}\quad \frac{|T(z)|}{|z - z_0|} \to 0 \text{ as } z \to z_0. \tag{3}$$

To this end, substitute $w = f(z)$ into (2) (legal because $f(U) \subset V$) to get

$$h(z) - h(z_0) = g'(f(z_0))\,[f(z) - f(z_0)] + S(f(z)),$$

and then substitute the result of (1) into this equation to get:

$$h(z) - h(z_0) = g'(f(z_0))\,f'(z_0)(z - z_0) + g'(f(z_0))\,R(z) + S(f(z)). \tag{4}$$

The second term on the right side of (4) is $o(|z - z_0|)$ as $z \to z_0$, so all we have
to do is show that the same is true for the third term on the right.
For this, let $\varepsilon(w) = S(w)/|w - w_0|$ for $w \ne w_0$ and $\varepsilon(w_0) = 0$,
so that $S(w) = \varepsilon(w)|w - w_0|$ where $\varepsilon(w) \to 0$ as $w \to w_0$. Then
substituting $w = f(z)$ and using (1) one last time, together with the triangle inequality:

$$|S(f(z))| = |\varepsilon(f(z))|\,|f(z) - f(z_0)| \le |\varepsilon(f(z))|\,|f'(z_0)|\,|z - z_0| + |\varepsilon(f(z))|\,|R(z)|,$$

where, on the right-hand side, the first term is $o(|z - z_0|)$ because $f(z) \to f(z_0)$
as $z \to z_0$ (remember: differentiability at a point implies continuity there), so that
$\varepsilon(f(z)) \to 0$, and the second term is $o(|z - z_0|)$ because $R(z)$ has this
property and $\varepsilon(f(z))$ is bounded near $z_0$. Thus $S(f(z))$ is $o(|z - z_0|)$ as
$z \to z_0$, and the proof of the chain rule is complete. □
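As a numerical sanity check of Theorem 1.1, the sketch below compares difference quotients of $g \circ f$ with the predicted value $g'(f(z_0))f'(z_0)$. The functions $f(z) = z^2 + 1$ and $g(w) = e^w$ and the point $z_0$ are illustrative choices, not taken from the notes.

```python
import cmath

def f(z):
    # Inner function (illustrative choice): f'(z) = 2z.
    return z * z + 1

def g(w):
    # Outer function (illustrative choice): g'(w) = exp(w).
    return cmath.exp(w)

def composite(z):
    return g(f(z))

z0 = 0.3 + 0.4j
# Value predicted by the chain rule: g'(f(z0)) * f'(z0).
predicted = cmath.exp(f(z0)) * (2 * z0)

# Difference quotients along the complex direction 1 + i, with shrinking step.
errors = []
for k in range(1, 6):
    step = (1 + 1j) * 10.0 ** (-k)
    quotient = (composite(z0 + step) - composite(z0)) / step
    errors.append(abs(quotient - predicted))

for k, err in enumerate(errors, start=1):
    print(f"|step| ~ 1e-{k}: error = {err:.3e}")
```

The error shrinks roughly in proportion to the step size, which is what the $o(|z - z_0|)$ remainder in (3) predicts.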
Math 829 Spring 1999

1.2 Exercise. Suppose $f$ obeys the hypotheses above, $[a, b]$ is a finite, closed real
interval, and $\gamma : [a, b] \to U$ is (real) differentiable at a point $t_0 \in (a, b)$,
with $f$ differentiable at $\gamma(t_0)$. Show that $f \circ \gamma$ is (real) differentiable
at $t_0$, and that $(f \circ \gamma)'(t_0) = f'(\gamma(t_0))\,\gamma'(t_0)$.

2 Sufficient condition for differentiability.


We begin with a concrete situation. Suppose G is an open subset of R2 and u : G → R
is a real-valued function defined on G. Let z0 = (x0 , y0 ) be a point of G.

2.1 Theorem. If the first partial derivatives of u exist at every point of G and are
continuous at z0 , then u is differentiable at z0 .

Proof. From our discussion in class, it is enough to show that if

$$R(h) \overset{\text{def}}{=} u(z_0 + h) - u(z_0) - [u_x(z_0)h_1 + u_y(z_0)h_2]$$

(for $h = (h_1, h_2) \in \mathbb{R}^2$ with $|h|$ sufficiently small), then $|R(h)| = o(|h|)$ as $|h| \to 0$.
The first step is to write

u(z0 + h) − u(z0 ) = [u(x0 + h1 , y0 + h2 ) − u(x0 , y0 + h2 )] + [u(x0 , y0 + h2 ) − u(x0 , y0 )],

and then apply the (one-variable) Mean Value Theorem of differential calculus to each
of the square-bracketed terms on the right. With this you see that the right-hand
side of the equation above is:

$$u_x(x_0 + t_1, y_0 + h_2)\,h_1 + u_y(x_0, y_0 + t_2)\,h_2,$$

where $t_1$ lies between 0 and $h_1$ and $t_2$ lies between 0 and $h_2$. Thus
$z_1 = (x_0 + t_1, y_0 + h_2)$ and $z_2 = (x_0, y_0 + t_2)$ both $\to z_0$ as $h \to 0$, and

$$R(h) = [u_x(z_1) - u_x(z_0)]\,h_1 + [u_y(z_2) - u_y(z_0)]\,h_2, \tag{5}$$

for all h sufficiently small.


Let $\varepsilon_1(h)$ and $\varepsilon_2(h)$ denote, respectively, the two terms in square
brackets on the right-hand side of (5) (these terms are functions of $h$ because the points
$z_1$ and $z_2$ depend only on $h$). The continuity of $u_x$ and $u_y$ at $z_0$ (used here
for the first and only time) guarantees that both $\varepsilon_1(h)$ and
$\varepsilon_2(h) \to 0$ as $h \to 0$. Thus the same is true of

$$\varepsilon(h) \overset{\text{def}}{=} \max\{|\varepsilon_1(h)|, |\varepsilon_2(h)|\}.$$

Now an easy estimate starting with (5) yields

$$|R(h)| \le \varepsilon(h)\,(|h_1| + |h_2|) \le \sqrt{2}\,\varepsilon(h)\,|h|,$$

which shows that $u$ is differentiable at $z_0$ because (as we just noted)
$\varepsilon(h) \to 0$ as $h \to 0$. □
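The conclusion $|R(h)| = o(|h|)$ can be watched numerically. The sketch below evaluates $|R(h)|/|h|$ for a sample function with continuous partial derivatives; the function $u(x, y) = x^2 y + \sin(xy)$, the base point, and the direction of $h$ are assumptions made only for illustration.

```python
import math

def u(x, y):
    # Sample function (illustrative choice) with continuous first partials.
    return x * x * y + math.sin(x * y)

def u_x(x, y):
    return 2 * x * y + y * math.cos(x * y)

def u_y(x, y):
    return x * x + x * math.cos(x * y)

x0, y0 = 1.0, 2.0

def remainder_ratio(h1, h2):
    # |R(h)| / |h| with R(h) = u(z0 + h) - u(z0) - [u_x(z0) h1 + u_y(z0) h2].
    r = u(x0 + h1, y0 + h2) - u(x0, y0) - (u_x(x0, y0) * h1 + u_y(x0, y0) * h2)
    return abs(r) / math.hypot(h1, h2)

# Shrink h along the fixed unit direction (0.6, -0.8).
ratios = [remainder_ratio(0.6 * 10.0 ** (-k), -0.8 * 10.0 ** (-k)) for k in range(1, 6)]

for k, ratio in enumerate(ratios, start=1):
    print(f"|h| = 1e-{k}: |R(h)|/|h| = {ratio:.3e}")
```

The printed ratios decrease toward zero as $|h|$ shrinks, in line with the estimate $|R(h)| \le \sqrt{2}\,\varepsilon(h)|h|$.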


2.2 Exercise. Define “differentiable” for real-valued functions defined on open sub-
sets of Rn . State and prove a sufficient condition for differentiability of such functions
that generalizes Theorem 2.1.

3 The “cosmic truth” about differentiation


Suppose U is an open subset of Rn , and that f : U → Rm .

3.1 Definition. We say that $f$ is differentiable at a point $p_0$ of $U$ if there is a
linear transformation $T : \mathbb{R}^n \to \mathbb{R}^m$ such that

$$f(p_0 + h) = f(p_0) + Th + R(h) \quad \forall h \in \mathbb{R}^n \text{ sufficiently close to } 0, \tag{6}$$

where $|R(h)| = o(|h|)$ as $|h| \to 0$ in $\mathbb{R}^n$.

3.2 Notation. If f is differentiable at p0 then the linear transformation T in the


above definition is called the derivative of f at p0 . We will write T = Df (p0 ),
preferring (almost always) to reserve the “prime” notation for the complex derivative.

3.3 Exercises. You should go back to the definition of differentiability for functions
R2 → R and identify this linear transformation. Do the same for functions from an
open subset of Rn to R. Do the same for differentiable functions from intervals of
the real line to R2 , or more generally Rn (the so-called “vector-valued functions”).
Finally, how do you fit the definition of “complex differentiability” into this “linear
transformation” context?

3.4 Exercise. If f is differentiable at p0 then f is continuous at p0 .

3.5 The matrix of Df(p0 ). Let $\{e_1, e_2, \ldots, e_n\}$ be the standard unit vector basis
for $\mathbb{R}^n$ (i.e., $e_j$ is the vector with 1 in the j-th position and zeros elsewhere).
Then upon fixing $j$ and substituting $h = te_j$ into (6) and letting $t \to 0$, an argument
entirely similar to the one we used in class for the case $n = 2$, $m = 1$ shows that

$$Te_j = \frac{\partial f}{\partial x_j}(p_0) = \left(\frac{\partial f_1}{\partial x_j}(p_0), \ldots, \frac{\partial f_m}{\partial x_j}(p_0)\right),$$

where $f_i : U \to \mathbb{R}$ is the i-th coordinate function of $f$:

$$f = (f_1, f_2, \ldots, f_m).$$

Thus the matrix of $Df(p_0)$ with respect to the standard bases in $\mathbb{R}^n$ and
$\mathbb{R}^m$ respectively is the one whose j-th column is $\frac{\partial f}{\partial x_j}(p_0)$
(written as a column vector, rather than as the usual row vector). Let’s call this matrix
$[Df(p_0)]$. Thus

$$[Df(p_0)] = \left[\frac{\partial f_i}{\partial x_j}(p_0)\right]_{i=1,\,j=1}^{m,\,n}.$$

The image $Df(p_0)h$ of a vector $h \in \mathbb{R}^n$ is the m-dimensional vector found from
the equation

$$[Df(p_0)h] = [Df(p_0)][h],$$


where square brackets around a vector denote the corresponding column matrix, i.e.
the transpose of the original (row) vector.
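The recipe of 3.5 (column $j$ of $[Df(p_0)]$ is the limit obtained from $h = te_j$) can be imitated numerically with small $t$. The sketch below is a minimal illustration, assuming a sample map $f : \mathbb{R}^2 \to \mathbb{R}^3$ and a central-difference approximation; the helper name `jacobian` is hypothetical.

```python
import math

def f(p):
    # Sample map R^2 -> R^3 (illustrative choice).
    x, y = p
    return [x * y, math.sin(x), x + y * y]

def jacobian(func, p, t=1e-6):
    """Approximate [Df(p)]; column j approximates T e_j = (partial f / partial x_j)(p)."""
    n = len(p)
    m = len(func(p))
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        forward = list(p)
        backward = list(p)
        forward[j] += t      # p + t e_j
        backward[j] -= t     # p - t e_j
        f_plus, f_minus = func(forward), func(backward)
        for i in range(m):
            J[i][j] = (f_plus[i] - f_minus[i]) / (2 * t)
    return J

p0 = [1.0, 2.0]
J = jacobian(f, p0)

# Matrix of partial derivatives computed by hand for this f at p0 = (1, 2).
exact = [[2.0, 1.0], [math.cos(1.0), 0.0], [1.0, 4.0]]

for row in J:
    print([round(entry, 6) for entry in row])
```

The result is a 3 by 2 matrix (m rows, n columns), matching the convention that the partial derivative vectors are written down the columns.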

3.6 The Chain Rule revisited. Now suppose that, in addition to the setup above,
$V$ is an open subset of $\mathbb{R}^m$ containing $f(U)$, and $g : V \to \mathbb{R}^p$ is
differentiable at $f(p_0)$. Then $g \circ f : U \to \mathbb{R}^p$ is differentiable at $p_0$, and

$$D(g \circ f)(p_0) = (Dg)(f(p_0))\,Df(p_0),$$

where the product between the derivatives on the right is the product of linear
transformations, i.e. their composition.
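In matrix terms, 3.6 says that the matrix of $D(g \circ f)(p_0)$ is the product $[Dg(f(p_0))][Df(p_0)]$. The sketch below checks this numerically for sample maps $f : \mathbb{R}^2 \to \mathbb{R}^3$ and $g : \mathbb{R}^3 \to \mathbb{R}^2$; the maps, the finite-difference helper `jacobian`, and the helper `matmul` are all assumptions made for illustration.

```python
import math

def jacobian(func, p, t=1e-6):
    # Central-difference approximation of the matrix of D func(p).
    n, m = len(p), len(func(p))
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        forward, backward = list(p), list(p)
        forward[j] += t
        backward[j] -= t
        f_plus, f_minus = func(forward), func(backward)
        for i in range(m):
            J[i][j] = (f_plus[i] - f_minus[i]) / (2 * t)
    return J

def matmul(A, B):
    # Plain matrix product of nested lists.
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def f(p):
    # Sample map R^2 -> R^3.
    x, y = p
    return [x * y, math.sin(x), x + y * y]

def g(q):
    # Sample map R^3 -> R^2.
    a, b, c = q
    return [a + b * c, math.exp(a) - c]

p0 = [1.0, 2.0]
lhs = jacobian(lambda p: g(f(p)), p0)               # matrix of D(g o f)(p0)
rhs = matmul(jacobian(g, f(p0)), jacobian(f, p0))   # [Dg(f(p0))][Df(p0)]

max_gap = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
print("largest entrywise difference:", max_gap)
```

The two 2 by 2 matrices agree up to finite-difference error, which is the composition of linear transformations written as a matrix product.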

3.7 Exercise. Adapt the argument of §1 to prove this version of the Chain Rule.

Suggestion: If you make a suitable definition of the norm $|T|$ of a linear transformation, say

$$|T| = \max\{|Tx| : x \in \mathbb{R}^n,\ |x| = 1\},$$

then the maximum in question exists (and is finite) because the map $x \mapsto |Tx|$ is
a continuous real-valued function on the (compact) unit sphere of $\mathbb{R}^n$, and you can
easily prove that

$$|Tx| \le |T|\,|x| \quad \forall x \in \mathbb{R}^n. \tag{7}$$

From this you should be able to write down a proof of the chain rule that is almost
word-for-word the same as the one in §1. If you want a more concretely defined norm
for linear transformations you can take $|T|$ to be the square root of the sum of the squares
of the entries of $[T]$, a quantity that is, in general, at least as large as the previously
defined norm. For this one the Cauchy-Schwarz inequality gives (7).
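The two norms suggested above can be compared on a small example. The sketch below approximates the maximum over the unit circle by sampling (so it slightly underestimates the true maximum) and compares it with the square root of the sum of squared entries; the matrix `T` and the sample count are arbitrary illustrative choices.

```python
import math

T = [[1.0, 2.0], [0.0, 3.0]]   # sample 2 x 2 matrix (illustrative choice)

def apply(matrix, x):
    # Matrix-vector product [T][x].
    return [sum(row[j] * x[j] for j in range(len(x))) for row in matrix]

def norm(v):
    # Euclidean length of a vector.
    return math.sqrt(sum(c * c for c in v))

# Approximate max{|Tx| : |x| = 1} by sampling points on the unit circle.
samples = 3600
op_norm = max(
    norm(apply(T, [math.cos(2 * math.pi * k / samples),
                   math.sin(2 * math.pi * k / samples)]))
    for k in range(samples)
)

# Square root of the sum of the squares of the entries of [T].
entry_norm = math.sqrt(sum(c * c for row in T for c in row))

# Inequality (7) with the entrywise norm, tested on one arbitrary vector.
x = [0.7, -1.9]
holds = norm(apply(T, x)) <= entry_norm * norm(x)

print("sampled max norm:", op_norm)
print("entrywise norm:  ", entry_norm)
print("(7) holds for x: ", holds)
```

For this matrix the sampled maximum is about 3.650 while the entrywise norm is $\sqrt{14} \approx 3.742$, consistent with the second norm being the larger one.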

3.8 Corollary Suppose that:

• f : U → Rm is differentiable at each point of an open set U ⊂ Rn ,

• I is an open interval of the real line, and

• γ : I → Rn is a differentiable function with γ(I) ⊂ U .

Suppose $t_0 \in I$, and $v$ is a nonzero vector in $\mathbb{R}^n$ that is tangent to the curve
$\gamma$ at $\gamma(t_0)$. Then $Df(\gamma(t_0))v$ is a vector tangent to the image-curve
$f \circ \gamma$ at $f(\gamma(t_0))$ (as long as this vector is nonzero).

Proof. A tangent vector to a curve $\gamma$ at one of its points $\gamma(t_0)$ is just
$\gamma'(t_0)$, which you can think of as a vector (with coordinates equal to the derivatives
of the coordinate functions). (As a linear transformation $\mathbb{R} \to \mathbb{R}^n$, this
derivative would just be the map that takes $h \in \mathbb{R}$ to $h$ times the tangent vector.)
With this in hand, the Corollary becomes a restatement of the Chain Rule. Try it! □


Corollary 3.8 says, roughly, that the linear map Df (p0 ) takes each vector tangent
to a given curve through p0 into a vector tangent to the image curve at f (p0 ). In
this context, when we assert that Df (p0 ) is a linear transformation from Rn to Rm we
should actually think of Rn as the space of vectors tangent to all possible differentiable
curves through p0 , and Rm as the corresponding “tangent space” at f (p0 ). This point
of view will show up again in the next section.

3.9 Exercise. Suppose G is an open subset of Rn , p0 ∈ G, and uj : G → R is a


real-valued function defined on G (j = 1, 2, . . . , m). Then f = (u1 , . . . , um ) : G →
Rm , and every Rm -valued function on G has this form. State and prove a sufficient
condition for differentiability of f at p0 that generalizes Theorem 2.1 to this situation.
Suggestion: The problem quickly reduces to Exercise 2.2.

4 Conformality of the Stereographic Projection


We apply the ideas of the previous sections to prove that the stereographic projection
is conformal, in the sense that differentiable curves in the plane that meet at a point $z$
get projected to curves on $S^2$ that meet at $z^*$ and make the same angle (with the same
sense) there.

4.1 Notation. For this discussion we denote:


• Points of R3 by (ξ, η, ζ), and those of R2 by (x, y),
• The North Pole (0, 0, 1) of S 2 by N ,
• The inner product (i.e. dot product) of two vectors $v$ and $w$ in the same Euclidean
space by $\langle v, w\rangle$.

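4.2 The “stereographic extension.” Consider the natural extension $\sigma$ of the
stereographic projection to $\mathbb{R}^3 \setminus \{\zeta = 1\}$, defined by:

$$\sigma(\xi, \eta, \zeta) = \left(\frac{\xi}{1 - \zeta}, \frac{\eta}{1 - \zeta}\right). \tag{8}$$

By the sufficient condition of Exercise 3.9, $\sigma$ is differentiable on
$M \overset{\text{def}}{=} \mathbb{R}^3 \setminus \{\zeta = 1\}$, and by our work in class the
matrix of the derivative of $\sigma = (\sigma_1, \sigma_2)$ at a point
$p = (\xi, \eta, \zeta) \in M$ is obtained by placing the (vector) partial derivative of
$\sigma$ with respect to each coordinate down the respective columns of a two by three matrix:

$$[D\sigma(p)] = \begin{bmatrix} \frac{\partial\sigma_1}{\partial\xi}(p) & \frac{\partial\sigma_1}{\partial\eta}(p) & \frac{\partial\sigma_1}{\partial\zeta}(p) \\[4pt] \frac{\partial\sigma_2}{\partial\xi}(p) & \frac{\partial\sigma_2}{\partial\eta}(p) & \frac{\partial\sigma_2}{\partial\zeta}(p) \end{bmatrix} = \frac{1}{1 - \zeta}\begin{bmatrix} 1 & 0 & \frac{\xi}{1 - \zeta} \\[4pt] 0 & 1 & \frac{\eta}{1 - \zeta} \end{bmatrix}.$$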

From our discussion of the chain rule, the problem is to show that, for every point
p = (ξ, η, ζ) ∈ S 2 \{N }, the linear transformation Dσ(p) preserves angles between
vectors tangent to S 2 at p! In other words, if v and w are three-dimensional vectors
tangent to S 2 at p (analytically: their inner products with p are both zero), then the
angle from v to w is the same as the angle from Dσ(p)v to Dσ(p)w.


4.3 Analytic statement of the problem. It’s pretty clear that the stereographic
projection preserves the sense of angles, so we will concentrate on preservation of
magnitudes of angles. Here is the analytic expression of what needs to be done.

Suppose $p \in S^2 \setminus \{N\}$ and $v, w \in \mathbb{R}^3$ with

$$\langle p, v\rangle = 0 \quad\text{and}\quad \langle p, w\rangle = 0, \tag{9}$$

(i.e. $v$ and $w$ are orthogonal to the line from the origin to $p$, and hence tangent to
$S^2$ at $p$). Let $A = [D\sigma(p)]$, the matrix of $D\sigma(p)$ with respect to the standard
bases. We desire to show that

$$\frac{\langle Av, Aw\rangle}{|Av|\,|Aw|} = \frac{\langle v, w\rangle}{|v|\,|w|}. \tag{10}$$

The quantities that show up on the left and right hand sides of this last equation
are, you will recall, the cosines of the angles between the respective pairs of vectors.¹

4.4 Exercise: reduction of problem. Show that in order to prove (10) you need
only show that there is a positive constant $c$ such that

$$\langle Av, Aw\rangle = c\,\langle v, w\rangle \tag{11}$$

for all $v, w \in \mathbb{R}^3$ satisfying (9).

In order to prove (11) we observe that

$$\langle Av, Aw\rangle = \langle A^*Av, w\rangle,$$

where $A^*$ is the linear transformation $\mathbb{R}^2 \to \mathbb{R}^3$ whose matrix is the
transpose of the matrix of $A$. This is just a matrix calculation based on the fact that
$\langle x, y\rangle = [x]^T[y]$ (matrix product), where $x$ and $y$ are any vectors in the
same Euclidean space, $[x]$ and $[y]$ are their respective column vectors, and the
superscript “$T$” denotes matrix transpose. Thus:

$$\langle Aw, Av\rangle = [Aw]^T[Av] = ([A][w])^T[A][v] = [w]^T[A]^T[A][v] = [w]^T[A^*A][v] = \langle w, A^*Av\rangle,$$

from which the desired result follows by the symmetry of the inner product (it is
unchanged if the order of its entries is reversed). Note that this calculation works as
well for any $m \times n$ matrix and any pair of vectors in $\mathbb{R}^n$.
¹ Remember that the inner product on the left-hand side of the equation is the one for
$\mathbb{R}^2$, and the one on the right is that of $\mathbb{R}^3$.


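4.5 Computations with $A^*A$. For $p = (\xi, \eta, \zeta) \in S^2 \setminus \{N\}$ let’s write

$$x = \frac{\xi}{1 - \zeta} \quad\text{and}\quad y = \frac{\eta}{1 - \zeta}. \tag{12}$$

We use $x$ and $y$ simply as notational conveniences here, but nevertheless, note that
$x + iy$ is the stereographic image, in the complex plane, of the point
$p = (\xi, \eta, \zeta)$ of $S^2$. From the work of §4.2 the matrix of
$A \overset{\text{def}}{=} D\sigma(p)$ can be written

$$[A] = \frac{1}{1 - \zeta}\begin{bmatrix} 1 & 0 & x \\ 0 & 1 & y \end{bmatrix},$$

whereupon

$$[A^*A] = [A]^T[A] = c\begin{bmatrix} 1 & 0 & x \\ 0 & 1 & y \\ x & y & x^2 + y^2 \end{bmatrix},$$

where $c = \dfrac{1}{(1 - \zeta)^2}$.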

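Suppose for the moment that $v = (v_1, v_2, v_3)$ and $w = (w_1, w_2, w_3)$ are any vectors
in $\mathbb{R}^3$. Then using the fact that
$\langle Av, Aw\rangle = \langle A^*Av, w\rangle = [w]^T[A^*A][v]$ we obtain after a little
calculation:

$$\begin{aligned}
c^{-1}\langle Av, Aw\rangle &= c^{-1}\langle A^*Av, w\rangle = c^{-1}[w]^T[A^*A][v] \\
&= (v_1 + xv_3)w_1 + (v_2 + yv_3)w_2 + [xv_1 + yv_2 + (x^2 + y^2)v_3]\,w_3 \\
&= v_1w_1 + v_2w_2 + v_3w_3 + (xw_1 + yw_2)v_3 + [xv_1 + yv_2 + (x^2 + y^2 - 1)v_3]\,w_3 \\
&= \langle v, w\rangle + \Delta,
\end{aligned}$$

where

$$\Delta = (xw_1 + yw_2)v_3 + [xv_1 + yv_2 + (x^2 + y^2 - 1)v_3]\,w_3.$$

We claim that if $v$ and $w$ are tangent to $S^2$ at $p$ then $\Delta = 0$, which will finish
our proof. For this, go back to equation (12) describing $x$ and $y$ in terms of the
coordinates of $p$, and note that, because $\xi^2 + \eta^2 + \zeta^2 = 1$,

$$x^2 + y^2 - 1 = \frac{\xi^2 + \eta^2}{(1 - \zeta)^2} - 1 = \frac{1 + \zeta}{1 - \zeta} - 1 = \frac{2\zeta}{1 - \zeta},$$

hence

$$\begin{aligned}
(1 - \zeta)\Delta &= (\xi w_1 + \eta w_2)v_3 + (\xi v_1 + \eta v_2 + 2\zeta v_3)w_3 \\
&= (\xi w_1 + \eta w_2 + \zeta w_3)v_3 + (\xi v_1 + \eta v_2 + \zeta v_3)w_3 \\
&= \langle p, w\rangle v_3 + \langle p, v\rangle w_3.
\end{aligned}$$

If $v$ and $w$ are tangent to $S^2$ at $p$, then the inner products in the last line are both
zero, hence $\Delta = 0$ (since $p \ne N \Rightarrow \zeta \ne 1$). Thus (11) holds with
$c = 1/(1 - \zeta)^2 > 0$, and this completes the proof that the stereographic projection is
conformal. □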
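Identity (11) is easy to test numerically. The sketch below picks a point $p$ on the sphere, builds two tangent vectors by removing the component along $p$ from arbitrary vectors, and compares $\langle Av, Aw\rangle$ with $c\langle v, w\rangle$. The particular point and vectors are arbitrary illustrative choices, and `tangent_part` is a hypothetical helper.

```python
import math

def dot(a, b):
    return sum(s * t for s, t in zip(a, b))

def tangent_part(u, p):
    # Remove the component of u along the unit vector p, so that <p, result> = 0.
    k = dot(u, p)
    return [uc - k * pc for uc, pc in zip(u, p)]

# A point p = (xi, eta, zeta) on the unit sphere, away from the north pole.
theta, phi = 1.1, 0.7
p = [math.sin(theta) * math.cos(phi), math.sin(theta) * math.sin(phi), math.cos(theta)]
xi, eta, zeta = p

# x, y as in (12), the matrix [A] of section 4.5, and c = 1/(1 - zeta)^2.
x, y = xi / (1 - zeta), eta / (1 - zeta)
A = [[1 / (1 - zeta), 0.0, x / (1 - zeta)],
     [0.0, 1 / (1 - zeta), y / (1 - zeta)]]
c = 1 / (1 - zeta) ** 2

# Two vectors tangent to the sphere at p.
v = tangent_part([1.0, 2.0, -0.5], p)
w = tangent_part([-0.3, 0.4, 2.0], p)

Av = [dot(row, v) for row in A]
Aw = [dot(row, w) for row in A]

lhs = dot(Av, Aw)        # <Av, Aw>, inner product in R^2
rhs = c * dot(v, w)      # c <v, w>, inner product in R^3

# Cosines of the two angles, as in (10).
cos_image = lhs / math.sqrt(dot(Av, Av) * dot(Aw, Aw))
cos_sphere = dot(v, w) / math.sqrt(dot(v, v) * dot(w, w))

print("<Av, Aw> =", lhs, "  c <v, w> =", rhs)
print("cosines:", cos_image, cos_sphere)
```

Dropping the call to `tangent_part` (so that $v$ and $w$ are no longer tangent) generally breaks the equality, which shows where hypothesis (9) is used.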


5 The Mercator Projection.


This section borrows heavily from the beautiful book by Eli Maor: Trigonometric
Delights, Princeton University Press, 1998, especially Chapters 13 and 14.

5.1 Mercator, conformality, and calculus. In order to make a map of the earth
one has to address the problem of representing the sphere $S^2$ with some accuracy
on a flat plane. The Flemish map maker Gerardus Mercator attacked the problem
of making a map that would represent $S^2 \setminus \{|\zeta| \ge 1 - \varepsilon\}$, the unit
sphere of $\mathbb{R}^3$ with equal spherical caps removed around the north and south poles,
on a rectangle $[-\pi, \pi] \times [-h, h]$, where $\varepsilon$ is some small positive number,
and $h$ is positive. Mercator wanted his map to have the following properties:

(a) The circles of latitude should be represented by horizontal lines of length 2π,
with the equator (latitude zero degrees) the horizontal axis of symmetry.

(b) Equally spaced circles of longitude (great circles on S 2 through the north and
south poles) should be represented by equally spaced vertical lines.

(c) The correspondence between points of S 2 and points of Mercator’s map should
be conformal.

Conformality is the most important property: it guarantees that, at least in principle,
a traveler wishing to go from point A on the globe to point B need only draw a line
between the corresponding points on Mercator’s map, measure the angle between this
line and the vertical axis (true north), and using a compass, travel along a path that
always keeps the desired constant compass heading.
Such a map cannot, of course, accurately represent distances. Indeed, the circles
of latitude on S 2 get smaller as their centers approach the poles, and the Mercator
representation must stretch distances along these circles to make the corresponding
horizontal lines the same length. This is why Greenland, for example, looks immense
on a Mercator projection, when in reality it is not. In order to preserve conformality
in the face of such horizontal stretching, Mercator had to correspondingly distort the
distances between the lines of latitude: for example the vertical distance on Mercator’s
map between the lines representing latitude 15° and 30° will be larger than that
between the equator and the line representing latitude 15°. Mercator’s great triumph
was to figure out how to accomplish the vertical stretching that ensures conformality.
He published his map of the world in 1569, but unfortunately never explained his
method, an omission that contributed to a certain skepticism about the value of his
accomplishment.
In 1599 Edward Wright lifted the veil of mystery from Mercator’s method, publish-
ing an accurate account of its underlying mathematics. Nowadays the mathematical
foundation of Mercator’s feat reduces to an exercise in freshman calculus, as is de-
scribed succinctly in Chapter 13 of the above-mentioned book by Maor. There it’s
shown how, in order for spherical rectangles to get mapped to “similar” plane rectangles,
the height $y$ of the line representing the circle of latitude $\lambda$ on the sphere must
obey the differential equation

$$dy = \sec\lambda\,d\lambda, \tag{13}$$

in other words,

$$y = \int_0^\lambda \sec t\,dt. \tag{14}$$
Now Wright did not phrase his solution in terms of calculus; indeed, his work appeared
long before Newton and Leibniz fully developed the subject! What Wright
did was solve the discrete analogue of (13) in increments of one degree. Nowadays we
prove in freshman calculus (at least we used to, before Calculus “Reform”) that

$$\int_0^\lambda \sec t\,dt = \ln\left|\tan\left(\frac{\lambda}{2} + \frac{\pi}{4}\right)\right|, \tag{15}$$

but note that it wasn’t until 1614, fifteen years after Wright’s treatise on Mercator’s
projection, that Napier published his invention of logarithms!
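Formula (15) can be checked numerically. The sketch below approximates the integral in (14) with composite Simpson's rule (a standard quadrature choice, not something the notes prescribe) and compares it with the closed form at a few latitudes chosen for illustration.

```python
import math

def sec_integral(lam, n=1000):
    # Composite Simpson's rule for the integral of sec t from 0 to lam (n even).
    step = lam / n
    total = 1 / math.cos(0.0) + 1 / math.cos(lam)
    for k in range(1, n):
        weight = 4 if k % 2 else 2
        total += weight / math.cos(k * step)
    return total * step / 3

def closed_form(lam):
    # Right-hand side of (15).
    return math.log(abs(math.tan(lam / 2 + math.pi / 4)))

latitudes_deg = [15, 30, 45, 60, 75]
gaps = []
for deg in latitudes_deg:
    lam = math.radians(deg)
    numeric, exact = sec_integral(lam), closed_form(lam)
    gaps.append(abs(numeric - exact))
    print(f"latitude {deg:2d} deg: integral = {numeric:.10f}, closed form = {exact:.10f}")

# Heights of the 15 and 30 degree parallels on the map, from (14) and (15).
y15 = closed_form(math.radians(15))
y30 = closed_form(math.radians(30))
```

The gap `y30 - y15` comes out larger than `y15`, matching the earlier remark that the strip between latitudes 15° and 30° is taller on the map than the strip between the equator and 15°.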
In 1645 Henry Bond, based on both the work of Wright and recently published
tables of logarithms, conjectured (15), and this became one of the outstanding mathe-
matical problems of the latter half of the seventeenth century. James Gregory proved
(15) in 1668, but his proof was so complicated that it was viewed as suspicious at
best. Two years later Isaac Barrow, Newton’s predecessor at Cambridge, gave a com-
prehensible proof—essentially the same one you find in the calculus books of today.
For this Barrow invented the technique of partial fraction decompositions, which he
applied to the evaluation of many other integrals.

5.2 The Mercator Projection and the complex logarithm. The work we have
done in these notes on conformality of the stereographic projection, the corresponding
conformality of holomorphic functions done in class, and the holomorphicness of the
Principal Branch of the logarithm function result in a quick solution of Mercator’s
problem:
(a) First note that the stereographic projection maps the spherical region
$S_\varepsilon \overset{\text{def}}{=} S^2 \setminus \{|\zeta| \ge 1 - \varepsilon\}$ onto the
annular open subset of $\mathbb{C}$ described by

$$A_R = \{z \in \mathbb{C} : 1/R < |z| < R\},$$

where $R = \sqrt{(2 - \varepsilon)/\varepsilon} > 1$. This is a little exercise in right triangles, along with
the formulas describing the stereographic projection, and one of the reflection
results you obtained in the problem set about that projection. I leave it to you.
The actual relationship between R and ε is not important: what is important
is that the inner radius of the annulus is the reciprocal of the outer radius.

(b) Recall that, under the stereographic projection, the circles of longitude on the
sphere go to rays in the plane, and the circles of latitude go to circles in the
plane centered at the origin.


(c) The conformal map $w = \varphi(z) = i\,\mathrm{Log}\,z$ takes the slit annulus
$A_R \setminus (-R, -1/R)$ one-to-one onto the interior of the rectangle
$R_h = [-\pi, \pi] \times [-h, h]$, where $h = \ln R$, taking the concentric
circles that are images of the latitudes on the sphere onto horizontal lines in the
rectangle, and the rays that are the images of the circles of longitude to vertical
lines in the rectangle.

(d) Since both the stereographic projection $\sigma$ and the logarithm map $\varphi$ defined
above are conformal, so is their composition $\varphi \circ \sigma : S_\varepsilon \to R_h$.
This is the Mercator projection.

5.3 Position of the lines of latitude. A careful analysis of our representation
$\varphi \circ \sigma$ of the Mercator projection gives another way of seeing how the integral
of the secant enters the picture. First note that the stereographic projection sends a point
$P^*$ of latitude $\lambda$ on the sphere to a point $P$ of the plane that lies
$\tan\left(\frac{\lambda}{2} + \frac{\pi}{4}\right)$ units from the origin. To see this, just
look at the picture below, which takes place in the plane of $P^*$ and the vertical axis of
$\mathbb{R}^3$.

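[Figure 1: Latitude $\lambda$ of $P^*$ vs. magnitude of $P$. The picture shows the origin $O$, the north pole $N$, the point $P^*$ on the unit circle, and its image $P$ on the horizontal axis, with angle $\gamma$ marked at both $N$ and $P^*$, angle $\alpha$ at $O$ between $ON$ and $OP^*$, and latitude $\lambda$. It records $\alpha + \lambda = \pi/2$ and $2\gamma + \alpha = \pi$, and therefore $\gamma = \lambda/2 + \pi/4$.]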

The distance in question, $OP$, is the tangent of the angle $\gamma = \angle ONP^*$ (vertex
at $N$), but because the sides $ON$ and $OP^*$ of triangle $ONP^*$ have the same length
(namely, 1), you can easily see that $\gamma = \frac{\lambda}{2} + \frac{\pi}{4}$. We are
interested in the imaginary part of $\varphi(P)$, and this is just

$$\ln\left|\tan\left(\frac{\lambda}{2} + \frac{\pi}{4}\right)\right|,$$

which, as we noted earlier, equals $\int_0^\lambda \sec t\,dt$.
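The composition $\varphi \circ \sigma$ can also be evaluated directly. The sketch below parametrizes a point of the sphere by latitude and longitude (this parametrization and the sample coordinates are assumptions for illustration), applies the stereographic projection and then $i\,\mathrm{Log}$, and compares the imaginary part with $\ln\tan(\lambda/2 + \pi/4)$.

```python
import cmath
import math

def stereographic(xi, eta, zeta):
    # Formula (8), with the image point written as the complex number x + iy.
    return complex(xi / (1 - zeta), eta / (1 - zeta))

def mercator(lat, lon):
    # Point of the unit sphere with latitude lat and longitude lon (radians).
    p = (math.cos(lat) * math.cos(lon), math.cos(lat) * math.sin(lon), math.sin(lat))
    z = stereographic(*p)
    # cmath.log returns the principal branch of the logarithm.
    return 1j * cmath.log(z)

lat, lon = math.radians(40.0), math.radians(-75.0)
w = mercator(lat, lon)

expected_height = math.log(math.tan(lat / 2 + math.pi / 4))

print("Re w =", w.real, "  (minus the longitude:", -lon, ")")
print("Im w =", w.imag, "  (ln tan(lat/2 + pi/4):", expected_height, ")")
```

With this sign convention the real part of $w$ is minus the longitude, since $i\,\mathrm{Log}\,z = -\mathrm{Arg}\,z + i\ln|z|$; the imaginary part depends only on the latitude, as the discussion above requires.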
