Extending The Algebraic Manipulability of Differentials
natural inclination is to treat differentials as fractions.¹ Additionally, there are several little-known but extremely helpful formulas which are straightforwardly deducible from this new notation.

Even absent these practical concerns, we find that reconceptualizing differentials as algebraically manipulable terms is an interesting project in its own right, and perhaps may help us see the derivative in a new way, and adapt it to new uses in the future. There may also be additional formulas which can in the future be more directly connected to the algebraic formulation of the derivative.

¹ Since many in the engineering disciplines are not formally trained mathematicians, this also can prevent professionals in applied fields from making similar mistakes.

2 The Problem of Manipulating Differentials Algebraically

When dealing with the first derivative, there are generally few practical problems in treating differentials algebraically. If y is a function of x, then $\frac{dy}{dx}$ is the first derivative of y with respect to x. This can generally be treated as a fraction.

For instance, since $\frac{dx}{dy}$ is the first derivative of x with respect to y, it is easy to see that these values are merely the inverse of each other. The inverse function theorem of calculus states that $\frac{dx}{dy} = \frac{1}{\frac{dy}{dx}}$. The generalization of this theorem into the multivariable domain essentially provides for fraction-like behavior within the first derivative.

Likewise, in preparation for integration, both sides of the equation can be multiplied by dx. Even in multivariate equations, differentials can essentially be multiplied and divided freely, as long as the manipulations are dealing with the first derivative.

Even the chain rule goes along with this. Let x depend on parameter u. If one has the derivative $\frac{dy}{du}$ and multiplies it by the derivative $\frac{du}{dx}$, then the result will be $\frac{dy}{dx}$. This is identical to the chain rule in Lagrangian notation.

It is well recognized that problems occur when one tries to extend this technique to the second derivative [1]. Take for a simple example the function $y = x^3$. The first derivative is $\frac{dy}{dx} = 3x^2$. The second derivative is $\frac{d^2y}{dx^2} = 6x$.

Say that it is later discovered that x is a function of t so that $x = t^2$. The problem here is that the chain rule for the second derivative is not the same as what would be implied by the algebraic representation.

Here we arrive at one of the major problematic points for using the current notation of the second derivative algebraically. To demonstrate the problem explicitly, if one were to take the second derivative seriously as a set of algebraic units, one should be able to multiply $\frac{d^2y}{dx^2}$ by $\frac{dx^2}{dt^2}$ to get the second derivative of y with respect to t. However, this does not work. If the differentials are being treated as algebraic units, then $\frac{dx^2}{dt^2}$ is the same as $\left(\frac{dx}{dt}\right)^2$, which is just the square of the first derivative of x with respect to t. The first derivative of x with respect to t is $\frac{dx}{dt} = 2t$. Therefore, treating the second derivative algebraically would imply that all that is needed to convert the second derivative of y with respect to x into the second derivative of y with respect to t is to multiply by $(2t)^2$.

However, this reasoning leads to the false conclusion that $\frac{d^2y}{dt^2} = 24t^4$. If, instead, the substitution is done at the beginning, it can be easily seen that the result should be $30t^4$:

$$\begin{aligned}
y &= x^3 \\
x &= t^2 \\
y &= (t^2)^3 \\
y &= t^6 \\
y' &= 6t^5 \\
y'' &= 30t^4
\end{aligned}$$

This is also shown by the true chain rule for the second derivative, based on Faà di Bruno’s formula [2]. This formula says that the chain rule for the second derivative should be:

$$\frac{d^2y}{dt^2} = \frac{d^2y}{dx^2}\left(\frac{dx}{dt}\right)^2 + \frac{dy}{dx}\frac{d^2x}{dt^2} \tag{1}$$

This, however, is extremely unintuitive, and essentially makes a mockery out of the concept of using the differential as an algebraic unit.

It is generally assumed that this is a problem for the idea that second differentials should be treated as algebraic units. However, it is possible that the real problem is that the notation for second differentials has not been given as much careful attention as it should.

The habits of mind that have come from this have even affected nonstandard analysis, whose practitioners, despite their appreciation for the algebraic properties of differentials, have left the algebraic nature of the second derivative either unexamined (as in [3]) or examined poorly (i.e., leaving out the problematic nature of the second derivative, as in [4, pg. 4]).

dx as separate steps. Originally, in the Leibnizian conception of the differential, one did not even bother solving for derivatives, as they made little sense from the original geometric construction of them [5, pgs. 8, 59].

stand, and requires few if any special cases, save the standard requirements of continuity and smoothness.

ential of $x^2$ was wanted, it would be written as $d(x^2)$. The rules are given in [5, pg. 24].
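As a cross-check of the Section 2 example, the two terms of (1) can be evaluated separately for $y = x^3$, $x = t^2$. The worked computation below is only an illustrative sketch built from the quantities already stated in that example; it suggests that the naive fraction-style product supplies the first term of (1) and omits the second.

```latex
\begin{align*}
\frac{d^2y}{dx^2}\left(\frac{dx}{dt}\right)^2 &= (6x)(2t)^2 = (6t^2)(4t^2) = 24t^4, \\
\frac{dy}{dx}\,\frac{d^2x}{dt^2} &= (3x^2)(2) = (3t^4)(2) = 6t^4, \\
\frac{d^2y}{dt^2} &= 24t^4 + 6t^4 = 30t^4 .
\end{align*}
```

The $6t^4$ that the naive product misses is exactly the $\frac{dy}{dx}\frac{d^2x}{dt^2}$ term, which the conventional notation offers no algebraic way to produce.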
arrives at the result $\frac{d^2y}{dx^2}$. Unfortunately, that is not the same sequence of steps that happens when two derivatives are performed, and thus it leads to a faulty formulation of the second derivative.

4 Extending the Second Derivative’s Algebraic Manipulability

As a matter of fact, order of operations is very important when doing derivatives. When doing a derivative, one first takes the differential and then divides by dx. The second derivative is the derivative of the first, so the next differential occurs after the first derivative is complete, and the process finishes by dividing by dx again.

However, what does it look like to take the differential of the first derivative? Basic calculus rules tell us that the quotient rule should be used:

$$\begin{aligned}
d\left(\frac{dy}{dx}\right) &= \frac{dx\,(d(dy)) - dy\,(d(dx))}{(dx)^2} \\
&= \frac{dx\,d^2y - dy\,d^2x}{dx^2} \\
&= \frac{dx\,d^2y}{dx^2} - \frac{dy\,d^2x}{dx^2} \\
&= \frac{dx}{dx}\frac{d^2y}{dx} - \frac{dy}{dx}\frac{d^2x}{dx} \\
&= \frac{d^2y}{dx} - \frac{dy}{dx}\frac{d^2x}{dx}
\end{aligned}$$

Then, for the second step, this can be divided by dx, yielding:

$$\frac{d\left(\frac{dy}{dx}\right)}{dx} = \frac{d^2y}{dx^2} - \frac{dy}{dx}\frac{d^2x}{dx^2} \tag{3}$$

This, in fact, yields a notation for the second derivative which is as algebraically manipulable as the first derivative. It is not very pretty or compact, but it works algebraically.

The chain rule for the second derivative fits this algebraic notation correctly, provided we replace each instance of the second derivative with its full form (cf. (1)):

$$\frac{d^2y}{dt^2} - \frac{dy}{dt}\frac{d^2t}{dt^2} = \left(\frac{d^2y}{dx^2} - \frac{dy}{dx}\frac{d^2x}{dx^2}\right)\left(\frac{dx}{dt}\right)^2 + \frac{dy}{dx}\left(\frac{d^2x}{dt^2} - \frac{dx}{dt}\frac{d^2t}{dt^2}\right) \tag{4}$$

This in fact works out perfectly algebraically.

One objection that has been given to the present authors by early reviewers about the formula presented in (3) is that the ratio $\frac{d^2x}{dx^2}$ reduces to zero. However, this is not necessarily true. The concern is that, since $\frac{dx}{dx}$ is always 1 (i.e., a constant), then $\frac{d^2x}{dx^2}$ should be zero. The problem with this concern is that we are no longer taking $\frac{d^2x}{dx^2}$ to be the derivative of $\frac{dx}{dx}$. Using the notation in (3), the derivative of $\frac{dx}{dx}$ would be:

$$\frac{d\left(\frac{dx}{dx}\right)}{dx} = \frac{d^2x}{dx^2} - \frac{dx}{dx}\frac{d^2x}{dx^2} \tag{5}$$

In this case, since $\frac{dx}{dx}$ reduces to 1, the expression is obviously zero. However, in (5), the term $\frac{d^2x}{dx^2}$ is not itself necessarily zero, since it is not the second derivative of x with respect to x.

5 The Notation for the Higher Order Derivatives

The notation for the third and higher derivatives can be found using the same techniques as for the second derivative. To find the third derivative of y with respect to x, one starts with the second derivative and takes the differential:

$$\begin{aligned}
d\left(\frac{d\left(\frac{dy}{dx}\right)}{dx}\right) &= d\left(\frac{d^2y}{dx^2} - \frac{dy}{dx}\frac{d^2x}{dx^2}\right) \\
&= d\left(\frac{dx\,d^2y - dy\,d^2x}{dx^3}\right) \\
&= \frac{(dx^3)\,(d(dx\,d^2y - dy\,d^2x)) - (dx\,d^2y - dy\,d^2x)\,(d(dx^3))}{(dx^3)^2} \\
&= \frac{d^3y}{dx^2} - \frac{dy}{dx}\frac{d^3x}{dx^2} - 3\frac{d^2x}{dx^2}\frac{d^2y}{dx} + 3\frac{dy}{dx}\frac{(d^2x)^2}{dx^3}
\end{aligned}$$

Finally, this result is divided by dx:

$$\frac{d\left(\frac{d\left(\frac{dy}{dx}\right)}{dx}\right)}{dx} = \frac{d^3y}{dx^3} - \frac{dy}{dx}\frac{d^3x}{dx^3} - 3\frac{d^2x}{dx^2}\frac{d^2y}{dx^2} + 3\frac{dy}{dx}\frac{(d^2x)^2}{dx^4} \tag{6}$$

This expression includes a lot of terms not normally seen, so some explanation is worthwhile. In this expression, $d^2x$ represents the second differential of x, or $d(d(x))$. Therefore, $(d^2x)^2$ represents $(d(d(x)))^2$. Likewise, $dx^4$ represents $(d(x))^4$.

Because the expanded notation for the second and higher derivatives is much more verbose than the first
derivative, it is often useful to adopt a slight modification of Arbogast’s D notation (see [6, pgs. 209, 218–219]) for the total derivative instead of writing it as algebraic differentials:⁴

$$D^2_x y = \frac{d^2y}{dx^2} - \frac{dy}{dx}\frac{d^2x}{dx^2} \tag{7}$$

$$D^3_x y = \frac{d^3y}{dx^3} - \frac{dy}{dx}\frac{d^3x}{dx^3} - 3\frac{d^2x}{dx^2}\frac{d^2y}{dx^2} + 3\frac{dy}{dx}\frac{(d^2x)^2}{dx^4} \tag{8}$$

This gets even more important as the number of derivatives increases. Each one is more unwieldy than the previous one. However, each level can be interconverted into differential notation as follows:

$$D^n_x y = \frac{d(D^{n-1}_x y)}{dx} \tag{9}$$

The advantages of Arbogast’s notation over Lagrangian notation are that (1) this modification of Arbogast’s notation clearly specifies both the top and bottom differential, and (2) for very high order derivatives, Lagrangian notation takes up n superscript spaces to write the nth derivative, while Arbogast’s notation only takes up log(n) spaces.

Therefore, when a compact representation of higher order derivatives is needed, this paper will use Arbogast’s notation for its clarity and succinctness.⁵

⁴ The difference between this notation and that of Arbogast is that we are subscripting the D with the variable with respect to which the derivative is being taken. Additionally, we are always supplying in the superscript the number of derivatives we are taking. Therefore, where Arbogast would write simply D, this notation would be written as $D^1_x$.

⁵ It may be surprising to find a paper on the algebraic notation of differentials using a non-algebraic notation. The goal, however, is to only use ratios when they act as ratios. When writing a ratio that works like a ratio is too cumbersome, we prefer simply avoiding the ratio notation altogether, to prevent making unwarranted leaps based on notation that may mislead the intuition.

6 Swapping the Independent and Dependent Variables

In fact, just as the algebraic manipulation of the first derivative can be used to convert the derivative of y with respect to x into the derivative of x with respect to y, combining it with Arbogast’s notation for the second derivative can be used to generate the formula for doing this on the second derivative:

$$\begin{aligned}
D^2_x y &= \frac{d^2y}{dx^2} - \frac{dy}{dx}\frac{d^2x}{dx^2} \\
D^2_x y\,\frac{dx^3}{dy^3} &= \frac{d^2y}{dx^2}\frac{dx^3}{dy^3} - \frac{dy}{dx}\frac{d^2x}{dx^2}\frac{dx^3}{dy^3} \\
D^2_x y\,\frac{dx^3}{dy^3} &= \frac{dx}{dy}\frac{d^2y}{dy^2} - \frac{d^2x}{dy^2} \\
-D^2_x y\,\frac{dx^3}{dy^3} &= \frac{d^2x}{dy^2} - \frac{dx}{dy}\frac{d^2y}{dy^2} \\
-D^2_x y\left(\frac{1}{\frac{dy}{dx}}\right)^3 &= \frac{d^2x}{dy^2} - \frac{dx}{dy}\frac{d^2y}{dy^2} \\
-D^2_x y\left(\frac{1}{D^1_x y}\right)^3 &= \frac{d^2x}{dy^2} - \frac{dx}{dy}\frac{d^2y}{dy^2}
\end{aligned}$$

It can be seen that the right-hand side of this final equation is the second derivative of x with respect to y, written in the expanded form of (7). Therefore, it can generally be stated that the second derivative of y with respect to x can be transformed into the second derivative of x with respect to y with the following formula:

$$-D^2_x y\left(\frac{1}{D^1_x y}\right)^3 = D^2_y x \tag{10}$$

To see this formula in action on a simple equation, consider $y = x^3$. Performing two derivatives gives us:

$$y = x^3 \tag{11}$$

$$D^1_x y = 3x^2 \tag{12}$$

$$D^2_x y = 6x \tag{13}$$

According to (10), $D^2_y x$ (or, $x''$ in Lagrangian notation) can be found by performing the following:

$$\begin{aligned}
D^2_y x &= -(6x)\left(\frac{1}{3x^2}\right)^3 \\
&= \frac{-6x}{27x^6} \\
&= -\frac{2}{9}x^{-5}
\end{aligned} \tag{14}$$

This can be checked by taking successive derivatives of the inverse function of (11):

$$\begin{aligned}
x &= y^{\frac{1}{3}} \\
D^1_y x &= \frac{1}{3}y^{-\frac{2}{3}} \\
D^2_y x &= -\frac{2}{9}y^{-\frac{5}{3}}
\end{aligned} \tag{15}$$

(15) can be seen to be equivalent to (14) by substituting for y using (11):

$$\begin{aligned}
D^2_y x &= -\frac{2}{9}(x^3)^{-\frac{5}{3}} \\
&= -\frac{2}{9}x^{-5}
\end{aligned} \tag{16}$$
This is the same result achieved by using the inversion formula (cf. (10)).

7 Using the Inversion Formula for the Second Derivative

While the inversion formula (cf. (10)) is not original, it is a tool that many mathematicians are unaware of, and is rarely considered for solving higher-order differentials.⁶

As an example of how to apply (10), consider second order ordinary nonlinear differential equations of the form

$$F(y'', y', y) = 0.$$

the real exact solution of which is

$$y(x) = \frac{6\sqrt[3]{2}\,c_1}{\sqrt[3]{162(x + c_2) + \sqrt{23328c_1^3 + [162(x + c_2)]^2}}} - \frac{\sqrt[3]{162(x + c_2) + \sqrt{23328c_1^3 + [162(x + c_2)]^2}}}{3\sqrt[3]{2}} \tag{19}$$

Here $c_1$ and $c_2$ are integration constants that must be determined from given boundary or Cauchy conditions.

On the other hand, (18) results in

$$x(y) = \frac{y^3}{6} + c_1 y + c_2,$$

the real inverse of which exactly coincides with (19).
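As a further sanity check of the inversion formula (10), consider a function not treated in the text: $y = e^x$, chosen here purely for illustration. The sketch below applies (10) and then compares the result against differentiating the inverse function $x = \ln y$ directly (so $y > 0$ is assumed).

```latex
\begin{align*}
D^1_x y &= e^x, \qquad D^2_x y = e^x, \\
D^2_y x &= -D^2_x y\left(\frac{1}{D^1_x y}\right)^3
         = -\frac{e^x}{e^{3x}} = -e^{-2x} = -\frac{1}{y^2}, \\
\text{directly:}\quad x &= \ln y, \qquad D^1_y x = \frac{1}{y}, \qquad D^2_y x = -\frac{1}{y^2}.
\end{align*}
```

Both routes give $-1/y^2$, matching the pattern of agreement seen for $y = x^3$.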
“independent” variables. In the Leibniz conception, what we would consider an “independent” variable is a variable whose first differential is considered constant. This leads to numerous simplifications of differentials because, if a differential is constant, by standard differential rules its differential is zero. Therefore, if x is the independent variable (using modern terminology) then that implies that dx is constant. If dx is a constant (even if it is an infinitely small, unknowable constant), then its differential is zero. Therefore, $d^2x$ and higher differentials of x reduce to zero, simplifying the equation.⁷

As an example, given the equation

$$xy = 3$$

the first differential of this would be given by

$$x\,dy + y\,dx = 0$$

and the second differential of this would be given by

$$x\,d^2y + 2\,dx\,dy + y\,d^2x = 0.$$

Then, you could simplify the equation by choosing any single differential to hold constant. This is referred to in Leibnizian thought as choosing a “progression of variables,” and it is identical to choosing an independent variable [5, pg. 71]. Therefore, if one chooses x as the independent variable, then dx is constant, and therefore $d^2x = 0$. Thus, the equation reduces to

$$x\,d^2y + 2\,dx\,dy = 0.$$

However, if y is the independent variable, then dy is held constant and therefore its differential, $d^2y$, is zero. This leads to the equation

$$2\,dx\,dy + y\,d^2x = 0.$$

This understanding explains the success of the modern notation of the second derivative. The notation given in (3) is

$$D^2_x y = \frac{d^2y}{dx^2} - \frac{dy}{dx}\frac{d^2x}{dx^2}.$$

However, if we assume that x is truly the independent variable, then this means that $d^2x = 0$ and therefore the whole expression $\frac{dy}{dx}\frac{d^2x}{dx^2}$ reduces to 0 as well. This reduces (3) to the modern notation of $\frac{d^2y}{dx^2}$. Additionally, if we take the assumption that x is the independent variable, then the problems identified in Section 2 disappear, because x, as an independent variable, cannot be dependent on t.⁸

In addition to (3) being reducible to $\frac{d^2y}{dx^2}$ under the assumption that x is the independent variable, the Leibnizian view also gives a set of tools that allows us to reinflate instances of $\frac{d^2y}{dx^2}$ into (3). Euler showed that, given an equation from a specific “progression of variables” (i.e., a particular choice of an independent variable), we can modify that equation in order to see what it would have been if no choice of independent variable had been made. According to [5, pg. 75], to reinflate a differential from a particular progression of variables (i.e., a particular independent variable) into one that is independent of the progression of variables (i.e., no independent variable chosen), a substitution practically identical to the expansion in (3) can be used.

⁷ As a way of understanding this, imagine the common independent variable used in physics, especially prior to relativity: time. Especially consider the way that time flows in a pre-relativistic era. It flows in a continual, constant fashion. Therefore, if the flow of time (i.e., dt) is constant, then by the rules of differentiation the second differential of time must be zero. Thus, an independent variable is one which acts in a similar fashion to time. Another way to consider this is to consider the independence of the independent variable. Its changes (i.e., differences) are, by definition, independent of anything else. Therefore, we may not assign a rule about the differences between the values. Thus, because there is no valid rule, the second differential may not be zero, but it is at most undefinable by definition.

⁸ To be clear, there is nothing preventing someone from making an independent variable dependent on a parameter. However, doing so then brings them around to needing to use the form of the second derivative defined here (which does not presume a particular choice of independent variable), or a compensating mechanism such as Faà di Bruno’s formula.

9 Future Work

The notation presented here greatly improves the ability to manipulate higher order differentials algebraically.

This improved notation yields several potential areas for study. These include:

1. developing a general formula for the algebraic expansion of higher derivatives,
2. identifying additional second order differential equations that are solvable by swapping the dependent and independent variable,
3. finding other ways that differential equations can be rendered solvable using insights from the new notation,
4. finding further reductions in special formulas that can be rendered by using algebraically manipulable notations,
5. investigating the conjecture that the second differentials of independent variables are always zero, along with its potential implications, and
6. extending this project to allow partial differentials
to be algebraically manipulable.
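To make the role of the independent variable (cf. item 5) concrete, the $xy = 3$ example can be compared against ordinary differentiation. The worked computation below is a sketch under the assumption that x is chosen as the independent variable, so that $d^2x = 0$; it is offered as an illustration rather than as part of the original derivation.

```latex
\begin{align*}
x\,d^2y + 2\,dx\,dy = 0
  &\;\Longrightarrow\; \frac{d^2y}{dx^2} = -\frac{2}{x}\,\frac{dy}{dx}
  && \text{(divide by } x\,dx^2\text{)}, \\
y = \frac{3}{x}
  &\;\Longrightarrow\; \frac{dy}{dx} = -\frac{3}{x^2}, \qquad \frac{d^2y}{dx^2} = \frac{6}{x^3}, \\
-\frac{2}{x}\left(-\frac{3}{x^2}\right) &= \frac{6}{x^3}.
\end{align*}
```

With $d^2x = 0$, the reduced second-differential equation agrees with the conventional second derivative of $y = 3/x$, as expected when x is the independent variable.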
10 Acknowledgements
References
[1] E. W. Swokowski, Calculus with Analytic Geometry. PWS Publishers, alternate ed., 1983.
[2] W. P. Johnson, “The curious history of Faà di Bruno’s formula,” American Mathematical Monthly, pp. 217–234, 2002.
[3] J. M. Henle and E. M. Kleinberg, Infinitesimal Calculus. Dover Publications, 1979.
[4] H. J. Keisler, Elementary Calculus: An Infinitesimal Approach. PWS Publishers, second ed., 1985.
[5] H. J. M. Bos, “Differentials, higher-order differentials, and the derivative in the Leibnizian calculus,” Archive for History of Exact Sciences, vol. 14, no. 1, pp. 1–90, 1974.
[6] F. Cajori, A History of Mathematical Notations Volume II. Open Court Publishing, 1929.