A Brief Introduction To Infinitesimal Calculus: Section 1: Intuitive Proofs With "Small" Quantities
Suppose a function f(x) is continuous on a compact interval [a, b]. Then f(x) attains both a maximum and a minimum; that is, there are points x_MAX and x_min in [a, b] so that for every other x in [a, b], f(x_min) ≤ f(x) ≤ f(x_MAX).
Formulating the meaning of "continuous" is a large part of making this result precise. We will take the intuitive "definition" that f(x) is continuous to mean that if an input value x₁ is close to another, x₂, then the output values are close. We summarize this as: f(x) is continuous if and only if

  a ≤ x₁ ≈ x₂ ≤ b  ⟹  f(x₁) ≈ f(x₂)
Lecture1.nb
Partition [a, b] into H pieces with the points x = a + k·(b − a)/H, k = 0, 1, …, H. The maximum of f over the finite partition occurs at one (or more) of the points x_M = a + k·(b − a)/H. This means that for any other partition point x₁ = a + j·(b − a)/H, f(x_M) ≥ f(x₁). Any point a ≤ x ≤ b is within (b − a)/H of a partition point x₁ = a + j·(b − a)/H, so if H is very large, x ≈ x₁ and

  f(x_M) ≥ f(x₁) ≈ f(x)

so we have found the approximate maximum. It is not hard to make this idea into a sequential argument where x_M[H] depends on H, but there is quite some trouble to make the sequence x_M[H] converge (using some form of compactness of [a, b]). Robinson's theory simply shows that the hyperreal x_M chosen when 1/H is infinitesimal is infinitely near an ordinary real number where the maximum occurs. (A very general and simple re-formulation of compactness.) We complete this proof as a simple example of Keisler's Axioms in Section 2.
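As a numeric sketch of this finite-partition argument (the function, names, and numbers here are our own illustration, not part of the text):

```python
def approx_max(f, a, b, H):
    """Return the partition point x = a + k*(b - a)/H, 0 <= k <= H,
    at which f is largest -- the approximate maximizer x_M."""
    best = a
    for k in range(H + 1):
        x = a + k * (b - a) / H
        if f(x) > f(best):
            best = x
    return best

# f(x) = x*(1 - x) on [0, 1] has its true maximum at x = 1/2; the scan
# finds a point within one partition step (b - a)/H of it.
xM = approx_max(lambda x: x * (1 - x), 0.0, 1.0, 10_000)
print(xM)
```

When 1/H is merely small, the scan only locates an approximate maximizer; Robinson's step is exactly the passage from "within one partition step" to "infinitely near".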
In beginning calculus you learned that the derivative measures the slope of the line tangent to a curve y = f(x) at a particular point, (x, f(x)). We begin by setting up convenient "local variables" to use to discuss this problem. If we fix a particular (x, f(x)) in the x-y-coordinates, we can define new parallel coordinates (dx, dy) through this point. The (dx, dy)-origin is the point of tangency to the curve.

[Figure: the graph y = f(x) with local (dx, dy)-axes at the point of tangency and the line dy = m·dx.]
A line in the local coordinates through the local origin has equation dy = m·dx for some slope m. Of course, we seek the proper value of m to make dy = m·dx tangent to y = f(x).
The Tangent as a Limit

You probably learned the derivative from the approximation
  lim_{Δx→0} (f(x + Δx) − f(x))/Δx = f'(x)
If we write the error in this limit explicitly, the approximation can be expressed as
  (f(x + Δx) − f(x))/Δx = f'(x) + ε   or   f(x + Δx) − f(x) = f'(x)·Δx + ε·Δx
where ε → 0 as Δx → 0. Intuitively we may say the error is small, ε ≈ 0, in the formula

  f(x + dx) − f(x) = f'(x)·dx + ε·dx   (1.1.1)
when the change in input is small, dx ≈ 0. The nonlinear change on the left side equals a linear change plus a term that is small compared with the input change. The error ε has a direct graphical interpretation as the error measured above x + dx after magnification by 1/dx. This magnification makes the small change dx appear unit size, and the term ε·dx measures ε after magnification.

[Figure: the graph near (x, f(x)) and, magnified by 1/dx, the unit-size increment from x to x + dx showing the linear part dy = m·dx and the error ε above x + dx.]
When we focus a powerful microscope at the point (x, f(x)) we only see the linear curve dy = m·dx, because ε ≈ 0 is smaller than the thickness of the line. The figure below shows a small box magnified on the right.

[Figure: a small box on the graph of y = f(x), magnified on the right to a (dx, dy)-view in which the curve appears linear.]
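The error term ε of (1.1.1) is easy to compute numerically; a small sketch (our own illustration, with a concrete f of our choosing) showing ε shrink with dx:

```python
def eps(f, fprime, x, dx):
    """Solve f(x + dx) - f(x) = f'(x)*dx + eps*dx for the error term eps."""
    return (f(x + dx) - f(x)) / dx - fprime(x)

# For f(x) = x**3 at x = 1 the error works out to exactly 3*dx + dx**2,
# so it vanishes along with dx.
errs = [abs(eps(lambda t: t**3, lambda t: 3 * t**2, 1.0, dx))
        for dx in (0.1, 0.01, 0.001)]
print(errs)
```

This is precisely what "under the microscope the curve looks like its tangent" means: the residual per unit of magnified input is ε, and ε → 0.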
The definition of the integral we use is the real number approximated by a sum of small slices,

  ∫ₐᵇ f(x) dx ≈ Σ_{x=a, step dx}^{b−dx} f(x)·dx,  when dx ≈ 0
We know that if F(x) has derivative F'(x) = f(x), the differential approximation above says

  F(x + dx) − F(x) = f(x)·dx + ε·dx
Summing both sides of this approximation over the partition, the left-hand side telescopes:

  Σ_{x=a, step dx}^{b−dx} (F(x + dx) − F(x)) = F(b') − F(a)

where b' is the largest partition point, b' ≈ b. This gives

  Σ_{x=a, step dx}^{b−dx} f(x)·dx = (F(b') − F(a)) − Σ_{x=a, step dx}^{b−dx} ε·dx

or

  ∫ₐᵇ f(x) dx ≈ Σ_{x=a, step dx}^{b−dx} f(x)·dx ≈ F(b') − F(a)

Since F(x) is continuous, F(b') ≈ F(b), so ∫ₐᵇ f(x) dx = F(b) − F(a).
We need to know that all the epsilons above are small when the step size is small, ε ≈ 0 when dx ≈ 0, for all x = a, a + dx, a + 2·dx, …. This is a uniform condition that has a simple appearance in Robinson's theory. There is something to explain here, because the theorem stated above is false if we take the usual pointwise notion of derivative and the Riemann integral. (There are pointwise differentiable functions whose derivative is not Riemann integrable.) The condition needed to make this proof complete is natural geometrically and plays a role in the intuitive proof of the inverse function theorem in the next section.
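The sum-of-slices computation above can be tried numerically; a minimal sketch (our own, choosing f = cos and F = sin) comparing the finite sum of f(x)·dx with F(b) − F(a):

```python
import math

def sum_of_slices(f, a, b, dx):
    """Sum f(x)*dx over the partition x = a, a + dx, ..., stopping below b."""
    total, x = 0.0, a
    while x + dx <= b + 1e-12:  # small slack guards float round-off at b
        total += f(x) * dx
        x += dx
    return total

s = sum_of_slices(math.cos, 0.0, 1.0, 1e-5)
print(abs(s - (math.sin(1.0) - math.sin(0.0))))  # small when dx is small
```

The gap between the finite sum and sin(1) − sin(0) is exactly the accumulated Σ ε·dx of the telescoping argument, which the uniform smallness of the epsilons controls.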
Let x₁ ≈ x₂, but x₁ ≠ x₂. Use the differential approximation with x = x₁ and dx = x₂ − x₁, and also with x = x₂ and dx = x₁ − x₂; geometrically, this is looking at the tangent approximation from both endpoints:

  f(x₂) − f(x₁) = f'(x₁)·(x₂ − x₁) + ε₁·(x₂ − x₁)
  f(x₁) − f(x₂) = f'(x₂)·(x₁ − x₂) + ε₂·(x₁ − x₂)

Adding, we obtain

  0 = ((f'(x₁) − f'(x₂)) + (ε₁ − ε₂))·(x₂ − x₁)

Dividing by the nonzero term (x₂ − x₁) and adding f'(x₂) to both sides, we obtain f'(x₁) = f'(x₂) + (ε₂ − ε₁), so f'(x₁) ≈ f'(x₂).
If f'(x₀) ≠ 0 then f(x) has an inverse function in a small neighborhood of x₀; that is, if y ≈ y₀ = f(x₀), then there is a unique x ≈ x₀ so that y = f(x).
We saw above that the differential approximation makes a microscopic view of the graph look linear. If y ≈ y₀, the linear equation dy = m·dx with m = f'(x₀) can be inverted to find a first approximation to the inverse:

  y − y₀ = m·(x₁ − x₀)
  x₁ = x₀ + (1/m)·(y − y₀)

[Figure: the magnified graph near (x₀, y₀ = f(x₀)), with the line dy = m·dx used to locate x₁.]

We test to see if f(x₁) = y. If not, examine the graph microscopically at (x₁, y₁) = (x₁, f(x₁)). Since the graph appears the same as its tangent to within ε, and since m = f'(x₀) ≈ f'(x₁), the local coordinates at (x₁, y₁) look like a line of slope m:

[Figure: the magnified graph near (x₁, y₁), used to locate x₂ from y − y₁.]
Continue in this way, generating a sequence of approximations x₁ = x₀ + (1/m)·(y − y₀), x_{n+1} = G(x_n), where the recursion function is G(x) = x + (1/m)·(y − f(x)). The distance between successive approximations is

  |x_{n+1} − x_n| = |G(x_n) − G(x_{n−1})| = |G'(ξ)|·|x_n − x_{n−1}|,  for some ξ between x_{n−1} and x_n

by the Mean Value Theorem for Derivatives. Notice that G'(x) = 1 − f'(x)/m ≈ 0 for x ≈ x₀, so |G'(x)| < 1/2 in particular, and

  |x₂ − x₁| = |G(x₁) − G(x₀)| ≤ (1/2)·|x₁ − x₀|
  |x₃ − x₂| = |G(x₂) − G(x₁)| ≤ (1/2)·|x₂ − x₁| ≤ (1/2²)·|x₁ − x₀|
  ⋮
  |x_{n+1} − x_n| ≤ (1/2ⁿ)·|x₁ − x₀|
A geometric series estimate shows that the sequence converges: x_n → x ≈ x₀ and f(x) = y.
To complete this proof we need to show that G(x) is a contraction on some nonzero interval: the function G(x) must map the interval into itself and have derivative less than 1/2 in absolute value on the interval. The precise definition of the derivative matters, because the result is false if f'(x) is defined by a pointwise limit. The function f(x) = x + x²·Sin(π/x) with f(0) = 0 has pointwise derivative 1 at zero, but is not increasing in any neighborhood of zero. (If you "move the microscope an infinitesimal amount" when looking at y = x + x²·Sin(π/x), the graph will look nonlinear.)
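The recursion above is easy to run numerically. A sketch with a function of our own choosing (this frozen-slope iteration is the classical simplified Newton method):

```python
def solve_inverse(f, m, x0, y, steps=50):
    """Iterate x_{n+1} = G(x_n) with G(x) = x + (y - f(x))/m,
    where the slope m = f'(x0) is frozen at the starting point."""
    x = x0
    for _ in range(steps):
        x = x + (y - f(x)) / m
    return x

# f(x) = x**3 + x has f'(1) = 4; invert it near x0 = 1 to solve f(x) = 2.1.
x = solve_inverse(lambda t: t**3 + t, 4.0, 1.0, 2.1)
print(x, x**3 + x)
```

Because |G'| stays well below 1/2 near x₀, each step at least halves the remaining distance, exactly as the geometric series estimate predicts.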
[Figure: the unit circle with angle θ, vertical side Sin(θ), horizontal side Cos(θ), and hypotenuse 1.]

Now make a small change in the angle and magnify by 1/dθ to make the change appear unit size.
[Figure: magnified view of the circular increment from θ to θ + dθ, showing the increment "triangle" with hypotenuse dθ and vertical leg dSin = Sin(θ + dθ) − Sin(θ), beside the large triangle on the unit circle with legs Cos(θ), Sin(θ) and hypotenuse 1.]
Since magnification does not change lines, the radial segments from the origin to the tiny increment of the circle meet the circle at right angles and appear under magnification to be parallel lines. Smoothness of the circle means that under powerful magnification the circular increment appears to be a line. The difference in values of the sine is the long vertical leg of the increment "triangle" above on the right. The apparent hypotenuse, with length dθ, is the circular increment. Since the radial lines meet the circle at right angles, the large triangle on the unit circle at the left, with long leg Cos(θ) and hypotenuse 1, is similar to the increment triangle, giving
  Cos(θ)/1 ≈ dSin/dθ
We write approximate similarity because the increment "triangle" actually has one circular side that is only ε-straight. In any case, this is a convincing argument that dSin/dθ = Cos(θ). A similar geometric argument on the increment triangle shows that dCos/dθ = −Sin(θ).

The Polar Area Differential

The derivation of sine and cosine is related to the area differential in polar coordinates. If we take an angle increment of dθ and magnify a view of the circular arc on a circle of radius r, the length of the circular increment is r·dθ, by similarity.
[Figure: magnified view of the region between the circles of radii r and r + dr and the rays at angles θ and θ + dθ, appearing as a rectangle with sides dr and r·dθ.]

A magnified view of circles of radii r and r + dr between the rays at angles θ and θ + dθ appears to be a rectangle with sides of lengths dr and r·dθ. If this were a true rectangle its area would be r·dθ·dr, but it is only an approximate rectangle. Technically, we can show that the area of this region is r·dθ·dr plus a term that is small compared with this infinitesimal,

  dA = r·dθ·dr + ε·dθ·dr

Keisler's Infinite Sum Theorem assures us that we can neglect this size error and integrate with respect to r·dθ·dr.

Holditch's Formula

The area swept out by a tangent of length R as it traverses an arbitrary convex curve in the plane is A = π·R².
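Before the derivation, a sanity check (our own, not in the text) on the one curve where the swept region is easy to describe: for a circle of radius a, the tangent segments of length R sweep the annulus between radii a and √(a² + R²), and its area is π·R² no matter what a is.

```python
import math

a, R = 3.0, 2.0                         # illustrative circle and tangent length
outer = math.sqrt(a * a + R * R)        # tangent endpoints lie on this circle
swept = math.pi * outer**2 - math.pi * a**2
print(swept, math.pi * R * R)
```

Holditch's formula says this independence of the curve holds for every convex curve, not just circles.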
We can see this interesting result by using a variation of polar coordinates and the infinitesimal polar area increment above. Since the curve is convex, each tangent meets the curve at a unique angle, φ, and each point in the region swept out by the tangents lies at a distance r along that tangent.
[Figure: a convex curve with one tangent segment; a point of the swept region has coordinates (r, φ).]
We look at an infinitesimal increment of the region in r-φ-coordinates, first holding the φ-base point on the curve fixed and changing φ. Microscopically this produces an increment like the polar increment:
[Figure: an increment with sides r·dφ and dr.]
Next, look at the base point of the tangent on the curve: moving to the correct φ + dφ-base point moves along the curve. Microscopically this looks like translation along the tangent line (by smoothness):
[Figure: the tangent directions at the base points (0, φ) and (0, φ + dφ), nearly parallel under magnification.]

Including this near-translation in the infinitesimal area increment produces a parallelogram. The parallelogram has base dr and height r·dφ, hence the same area r·dφ·dr as the polar increment, so the swept area is

  A = ∫₀^{2π} ∫₀^R r dr dφ = π·R²

independent of the original convex curve.
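Summing the parallelogram increments r·dφ·dr as a finite double sum (our own numeric check, evaluating r at midpoints of the radial slots) reproduces π·R²:

```python
import math

R, n = 2.0, 400
dphi, dr = 2 * math.pi / n, R / n
# Each increment is a parallelogram of base dr and height r*dphi; take
# r = (j + 0.5)*dr, the midpoint radius of the j-th radial slot.
A = sum((j + 0.5) * dr * dphi * dr for i in range(n) for j in range(n))
print(A, math.pi * R * R)
```

Keisler's Infinite Sum Theorem is what licenses replacing this finite sum by the integral when the steps are infinitesimal.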
The radius r of a circle drawn through three infinitely nearby points on a curve in the (x, y)-plane satisfies

  1/r = −d(dy/ds)/dx

where s denotes the arclength. For example, if y = f(x), so ds = √(1 + f'(x)²)·dx, then

  1/r = −d/dx( f'(x)/√(1 + f'(x)²) ) = −f''(x)/(1 + f'(x)²)^(3/2)

If the curve is given parametrically, x = x(t) and y = y(t), so ds = √(x'(t)² + y'(t)²)·dt, then

  1/r = −d(dy/ds)/dx = (y'(t)·x''(t) − x'(t)·y''(t))/(x'(t)² + y'(t)²)^(3/2)
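The parametric formula can be sanity-checked on a circle of radius R, where the radius of curvature must be R itself (our own check; the overall sign depends on orientation, so we compare absolute values):

```python
import math

def one_over_r(xp, xpp, yp, ypp):
    """Parametric curvature (y'*x'' - x'*y'') / (x'^2 + y'^2)^(3/2)."""
    return (yp * xpp - xp * ypp) / (xp * xp + yp * yp) ** 1.5

# Circle x = R*cos(t), y = R*sin(t) at t = 0.7:
# x' = -R sin t, x'' = -R cos t, y' = R cos t, y'' = -R sin t.
R, t = 5.0, 0.7
k = one_over_r(-R * math.sin(t), -R * math.cos(t),
               R * math.cos(t), -R * math.sin(t))
print(k)
```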
Changes

Consider three points on a curve C with equal distances Δs between the points. Let α_I and α_II denote the angles between the horizontal and the segments connecting the points, as shown. We have the relation between the changes in y and α:
  Sin(α) = Δy/Δs   (1.1.2)

[Figure: points P_I, P_II, P_III on the curve, the connecting segments of length Δs, the angles α_I and α_II, and the angle change Δα between the perpendicular bisectors.]
The angle between the perpendicular bisectors of the connecting segments is also Δα, because they meet the connecting segments at right angles. These bisectors meet at the center of a circle through the three points on the curve, whose radius we denote r. The small triangle with hypotenuse r gives
  Sin(Δα/2) = Δs/(2·r)   (1.1.3)
Small Changes

Now we apply these relations when the distance between the successive points is an infinitesimal ds. The change
  −d(dy/ds) = −d Sin(α) = Sin(α) − Sin(α − dα) = Cos(α)·dα + ι·dα   (1.1.4)
Combining this with formula (1.1.3) for the infinitesimal case (assuming r ≉ 0), we get

  dα = ds/r + ι·dα,  with ι ≈ 0
Keisler's Function Extension Axiom allows us to apply formulas (1.1.3) and (1.1.4) when the change is infinitesimal, as we shall see. We still have a gap to fill in order to know that we may replace infinitesimal differences with differentials (or derivatives), especially because we have a difference of a quotient of differences. First differences and derivatives have a fairly simple rigorous version in Robinson's theory, just using the differential approximation (1.1.1). This can be used to derive many classical differential equations like the tractrix, catenary, and isochrone; see Chapter 5, Differential Equations from Increment Geometry, in Projects for Calculus: The Language of Change on my website at http://www.math.uiowa.edu/%7Estroyan/ProjectsCD/estroyan/indexok.htm

Second differences and second derivatives have a complicated history. See H. J. M. Bos, Differentials, Higher-Order Differentials and the Derivative in the Leibnizian Calculus, Archive for History of Exact Sciences, vol. 14, nr. 1, 1974. This is a very interesting paper that begins with a course in calculus as Leibniz might have presented it.

The natural exponential

The natural exponential function satisfies
  dy = y·dx
  y(0) = 1
Recursively,
  y(dx) = y(0) + y'(0)·dx = y(0)·(1 + dx) = (1 + dx)
  y(2·dx) = y(dx) + y'(dx)·dx = y(dx)·(1 + dx) = (1 + dx)²
  y(3·dx) = y(2·dx) + y'(2·dx)·dx = y(2·dx)·(1 + dx) = (1 + dx)³
  ⋮
  y(x) = (1 + dx)^(x/dx),  for x = 0, dx, 2·dx, 3·dx, …
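Numerically, the recursion amounts to computing (1 + dx)^(x/dx); a small sketch (our own) at x = 1, where for small dx the value should be close to e:

```python
import math

dx = 1e-6
y1 = (1.0 + dx) ** (1.0 / dx)   # y(1) from the recursion at one step size
print(y1, math.e)
```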
This is the product expansion (1 + dx)^(1/dx), for dx ≈ 0. No introduction to calculus is complete without mention of this sort of "infinite algebra" as championed by Euler, as in L. Euler, Introductio in Analysin Infinitorum, Tomus Primus, Lausanne, 1748. Reprinted as L. Euler, Opera Omnia, ser. 1, vol. 8. Translated from the Latin by J. D. Blanton, Introduction to Analysis of the Infinite, Book I, Springer-Verlag, New York, 1988.

A wonderful modern interpretation of these sorts of computations is in Mark McKinzie and Curtis Tuckey, Higher Trigonometry, Hyperreal Numbers and Euler's Analysis of Infinities, Mathematics Magazine, vol. 74, nr. 5, Dec. 2001, p. 339-368.

W. A. J. Luxemburg's reformulation of the proof of one of Euler's central formulas,

  Sin(z) = z·∏_{k=1}^{∞} (1 − (z/(k·π))²)
appears in our monograph, Introduction to the Theory of Infinitesimals, Academic Press Series on Pure and Applied Mathematics, vol. 72, Academic Press, New York, 1976.
Abraham Robinson's monograph, Non-standard Analysis, North-Holland Publishing Co., Amsterdam, 1966 (revised edition by Princeton University Press, Princeton, 1996), begins:

"The history of a subject is usually written in the light of later developments. For over half a century now, accounts of the history of the Differential and Integral Calculus have been based on the belief that even though the idea of a number system containing infinitely small and infinitely large elements might be consistent, it is useless for the development of Mathematical Analysis. In consequence, there is in the writings of this period a noticeable contrast between the severity with which the ideas of Leibniz and his successors are treated and the leniency accorded to the lapses of the early proponents of the doctrine of limits. We do not propose here to subject any of these works to a detailed criticism. However, it will serve as a starting point for our discussion to try to give a fair summary of the contemporary impression of the history of the Calculus..."

I recommend that you read Robinson's Chapter X. I have often wondered if mathematicians in the time of Weierstrass said things like, 'Karl's epsilon-delta stuff isn't very interesting. All he does is re-prove old formulas of Euler.'

I have a non-standard interest in the history of infinitesimal calculus. It really is not historical. Some of the old derivations, like Bernoulli's derivation of Leibniz' formula for the radius of curvature, seem to me to have a compelling clarity. Robinson's theory of infinitesimals offers me an opportunity to see what is needed to complete these arguments with a contemporary standard of rigor. Working on such problems has led me to believe that the best theory to use to underlie calculus when we present it to beginners is one based on the kind of derivatives described in Section 2, and not the pointwise approach that is the current custom in the U.S.
I believe we want a theory that supports intuitive reasoning like the examples above and pointwise defined derivatives do not.