Ordinary Differential Equations
Com S 477/577
Nov 7, 2002
Introduction
The solution of differential equations is an important problem that arises in a host of areas. Many
differential equations are too difficult to solve in closed form. Instead, it becomes necessary to
employ numerical techniques.
Differential equations have a major application in understanding physical systems that involve
aerodynamics, fluid dynamics, thermodynamics, heat diffusion, mechanical oscillations, etc. They
are used for developing control algorithms and for dynamic simulations. Other applications include
optimization and stochastic modeling.
We will consider ordinary differential equations (ODEs) and focus on two classes of problems:
(a) first-order initial-value problems and (b) linear higher-order boundary-value problems. The
basic principles we will see in these two classes of problems carry over to more general problems.
For instance, higher-order initial-value problems can be rewritten in vector form to yield a set of
simultaneous first-order equations. First-order techniques may then be used to solve this system.
Consider a general nth order differential equation in the form
y^{(n)}(x) = f(x, y(x), y'(x), \ldots, y^{(n-1)}(x)),    (1)

where f is a function from \mathbb{R}^{n+1} to \mathbb{R}. A general solution of this equation will usually contain
n arbitrary constants. A particular solution is obtained by specifying n auxiliary constraints.
For instance, one might specify the values of y and its derivatives at some point x = x_0 as
y(x_0), y'(x_0), \ldots, y^{(n-1)}(x_0). Such a problem is called an initial-value problem. In effect, the auxiliary
conditions specify all the relevant information at some starting point x_0, and the differential
equation tells us how to proceed from that point.
If the n auxiliary conditions are specified at different points, then the problem is called a
boundary-value problem. In this case, we tie the function (or its derivatives) down at several points,
and the differential equation tells us the shape of y(x) between those points. For example, the
following n conditions on y are specified at two points x_0 and x_1:

f_1(y(x_0), \ldots, y^{(n-1)}(x_0), y(x_1), \ldots, y^{(n-1)}(x_1)) = 0,
        \vdots
f_n(y(x_0), \ldots, y^{(n-1)}(x_0), y(x_1), \ldots, y^{(n-1)}(x_1)) = 0.
The nth order equation (1) can be transformed into a system of first-order equations if we
introduce n - 1 variables z_i = y^{(i)}, i = 1, \ldots, n - 1. This system consists of n equations:

y' = z_1,
z_1' = z_2,
        \vdots
z_{n-1}' = f(x, y, z_1, \ldots, z_{n-1}).
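As a concrete illustration (our own example, not from the notes), the second-order equation y'' = -y becomes the pair y' = z_1, z_1' = -y. A minimal sketch in Python:

```python
# A minimal sketch (our own example): rewrite the second-order equation
# y'' = -y as a first-order system by introducing z1 = y'.

def system(x, state):
    """Right-hand side of the equivalent first-order system.
    state = (y, z1) with z1 = y'; returns (y', z1') = (z1, -y)."""
    y, z1 = state
    return (z1, -y)

# Any first-order solver can now advance the pair (y, z1) together; for the
# initial values y(0) = 0, z1(0) = 1 the exact solution is y(x) = sin(x).
```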
Taylor's Algorithm
If the exact solution y(x) has a Taylor series expansion about x_0, then we can write

y(x) = y_0 + (x - x_0) y'(x_0) + \frac{(x - x_0)^2}{2!} y''(x_0) + \cdots
Of course, if we do not know y(x), then we do not explicitly know the values of its derivatives
y'(x_0), y''(x_0), \ldots. However, if f is sufficiently differentiable, then we can obtain the derivatives of
y from f:

y'(x) = f(x, y(x)),
y'' = f_x + f_y y' = f_x + f_y f,
y''' = f_{xx} + 2 f_{xy} f + f_{yy} f^2 + f_x f_y + f_y^2 f,    (2)

and so on. Taylor's series tells us that, for each k = 1, 2, ...,

y(x + h) \approx y(x) + h T_k(x, y(x)),

where

T_k(x, y) = f(x, y) + \frac{h}{2!} f'(x, y) + \cdots + \frac{h^{k-1}}{k!} f^{(k-1)}(x, y)

and

f^{(j)}(x, y) = \frac{d^j}{dx^j} f(x, y(x)) = y^{(j+1)}(x).

Taylor's algorithm of order k approximates the solution of y' = f(x, y), y(a) = y_0, over an interval [a, b]. It chooses a step size

h = \frac{b - a}{N},

sets x_n = a + nh, and computes

y_{n+1} = y_n + h T_k(x_n, y_n),    for n = 0, 1, \ldots, N - 1.
The calculation of y_{n+1} uses information about y and its derivatives that comes from a single
point, namely from x_n. For this reason, Taylor's algorithm is a one-step method.
Taylor's algorithm of order 1, also known as Euler's method, has the basic update rule

y_{n+1} = y_n + h f(x_n, y_n).
Unfortunately, this method is not very accurate: it requires very small step sizes to obtain
good accuracy, and it suffers from stability problems (such as error accumulation). Nevertheless, the
basic idea of adding small increments to previous estimates to obtain new estimates leads to more
advanced methods.
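Euler's method takes only a few lines of code. A minimal sketch, assuming nothing beyond the update rule above (the function name and signature are our own):

```python
# A minimal sketch of Euler's method (Taylor's algorithm of order 1).

def euler(f, x0, y0, b, N):
    """Approximate y' = f(x, y), y(x0) = y0 on [x0, b] using N Euler steps."""
    h = (b - x0) / N
    xs, ys = [x0], [y0]
    for _ in range(N):
        ys.append(ys[-1] + h * f(xs[-1], ys[-1]))
        xs.append(xs[-1] + h)
    return xs, ys
```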
Error Estimation
We would like to understand the quality of our differential equation solvers by estimating the error
between y(xn ) and yn . There are three types of errors:
1. Local discretization error. This is the error introduced in a single step of the equation solver
as it moves from x_n to x_{n+1}. In other words, it is the error in the estimate y_{n+1} that would
result if y(x_n) were known perfectly.

2. Full discretization error. This is the net error between y(x_n) and y_n at step n. This error is
the sum of the local discretization errors, plus any numerical roundoff errors.

3. Numerical roundoff error. Limited machine precision can introduce errors. We will ignore
this type of error.
In general, an algorithm is said to be of order k if its local discretization error is O(hk+1 ), where
h is the step size.
3.1 Local Discretization Error

Taylor's theorem tells us that the local error for Taylor's algorithm of order k is simply

\frac{h^{k+1}}{(k+1)!} f^{(k)}(\xi, y(\xi)),   or equivalently,   \frac{h^{k+1}}{(k+1)!} y^{(k+1)}(\xi),

where \xi is some point in the interval (x_n, x_n + h). Thus Taylor's algorithm of order k is indeed of
order k, and Euler's algorithm is of order 1.
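To make the order 2 case concrete, consider Taylor's algorithm of order 2 for the particular equation y' = -y^2 (our own illustration, not from the notes). Here f' = f_x + f_y f = 0 + (-2y)(-y^2) = 2y^3 can be computed by hand, so T_2(x, y) = f + (h/2) f':

```python
# A sketch of Taylor's algorithm of order 2 for the particular equation
# y' = f(x, y) = -y**2.  Here f' = f_x + f_y * f = 0 + (-2y)(-y**2) = 2y**3
# was worked out by hand, so T_2(x, y) = f + (h/2) * f'.

def taylor2_step(x, y, h):
    """One step of Taylor's algorithm of order 2 for y' = -y**2."""
    f = -y * y            # y'(x)
    fprime = 2 * y**3     # y''(x) = f_x + f_y * f
    return y + h * (f + (h / 2) * fprime)
```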
3.2 Full Discretization Error

This type of error can be very difficult to estimate. Let us therefore illustrate the approach with
Euler's method. First, we define

e_n = y(x_n) - y_n.

Taylor's theorem says that

y(x_{n+1}) = y(x_n) + h y'(x_n) + \frac{h^2}{2} y''(\xi_n)

for some \xi_n \in (x_n, x_{n+1}).
Recall that

y_{n+1} = y_n + h f(x_n, y_n).

Subtract this equation from the previous one:

e_{n+1} = e_n + h [ f(x_n, y(x_n)) - f(x_n, y_n) ] + \frac{h^2}{2} y''(\xi_n).

Now we apply the mean-value theorem to f and obtain

e_{n+1} = e_n + h f_y(x_n, \bar{y}_n) (y(x_n) - y_n) + \frac{h^2}{2} y''(\xi_n)
        = e_n (1 + h f_y(x_n, \bar{y}_n)) + \frac{h^2}{2} y''(\xi_n),    (3)

where \bar{y}_n is some value between y_n and y(x_n). Suppose now that |f_y| \le L and |y''| \le Y. Then

|e_{n+1}| \le (1 + hL) |e_n| + \frac{h^2}{2} Y.
Since e_0 = 0 and 1 + hL > 1, we can prove by induction that |e_n| \le \delta_n, where \delta_n satisfies the
recurrence

\delta_{n+1} = (1 + hL) \delta_n + \frac{h^2}{2} Y,    \delta_0 = 0.
The solution of this recurrence is

\delta_n = \frac{hY}{2L} ((1 + hL)^n - 1).

Consequently, since 1 + hL \le e^{hL},

|e_n| \le \delta_n \le \frac{hY}{2L} (e^{nhL} - 1) = \frac{hY}{2L} (e^{(x_n - x_0) L} - 1).    (4)
From the above analysis we see that the full discretization error approaches zero as the step size
h is made smaller. That the error is only O(h), however, implies that the convergence may be slow.
Example 1. Consider the differential equation

y' = -y^2,    with y(1) = 1.

We solve it over the interval [1, 2] using Euler's method with step size h = 0.1:

y_{n+1} = y_n - h y_n^2,    n = 0, 1, \ldots.
The local discretization error satisfies

|E| \le \max_{1 \le x \le 2} \frac{h^2}{2} |y''(x)|.

To obtain the full error bound, note that the exact solution is y(x) = 1/x, so that

y''(x) = \frac{2}{x^3},

and we may take

Y = 2    and    L = 2.

Hence, by (4),

|e_n| \le \frac{h}{2} ( e^{2 (x_n - 1)} - 1 ),

and therefore

\max_n |e_n| \le \frac{h}{2} (e^2 - 1) \approx 0.3195.

So we should expect about 1 decimal digit of accuracy.
The following table compares the actual results obtained against those yielded by the exact solution.

x_n     y_n        f(x_n, y_n)    y(x_n)
1.0     1.0        -1.0           1.0
1.1     0.9        -0.810         0.90909
1.2     0.819      -0.67076       0.83333
1.3     0.7519     -0.56539       0.76923
1.4     0.69539    -0.48356       0.71429
1.5     0.64703    -0.41865       0.66667
1.6     0.60516    -0.36622       0.625
1.7     0.56854    -0.32324       0.58824
1.8     0.53622    -0.28753       0.55556
1.9     0.50746    -0.25752       0.52632
2.0     0.48171    -0.23205       0.5
The maximum error is just under 0.02, well within our estimates. But it is still not small enough.
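The table can be reproduced with the euler() sketch given earlier (a usage example of our own, not part of the original notes):

```python
# Reproducing Example 1 with the euler() sketch from earlier: y' = -y**2,
# y(1) = 1, h = 0.1, exact solution y(x) = 1/x.
xs, ys = euler(lambda x, y: -y * y, 1.0, 1.0, 2.0, 10)
print(max(abs(y - 1.0 / x) for x, y in zip(xs, ys)))  # maximum error, about 0.018
```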
Runge-Kutta Methods
Euler's method is unacceptable because it requires small step sizes. Higher-order Taylor algorithms
are unacceptable because they require higher-order derivatives.
Runge-Kutta methods attempt to obtain greater accuracy than, say, Euler's method, without
explicitly evaluating higher-order derivatives. These methods evaluate f(x, y) at selected points in
the interval [x_n, x_{n+1}].
The idea is to combine the estimates of y resulting from these selected evaluations in such a
way that error terms of order h, h^2, etc. are canceled out, to the desired accuracy.
For instance, the Runge-Kutta method of order 2 tries to cancel terms of order h and h^2, leaving
a local discretization error of order h^3. Similarly, the Runge-Kutta method of order 4 tries to cancel
terms of order h, h^2, h^3, and h^4, leaving an O(h^5) local discretization error.
We wish to evaluate f(x, y) at two points in the interval [x_n, x_{n+1}], then combine the results to
obtain an error of order h^3. The basic step of the algorithm is

y_{n+1} = y_n + a k_1 + b k_2,

where

k_1 = h f(x_n, y_n),
k_2 = h f(x_n + \alpha h, y_n + \beta k_1).

In the above, a, b, \alpha, and \beta are fixed constants. They will be chosen in such a way as to obtain the
O(h^3) local discretization error.
Intuitively, the basic step first evaluates y' at (x_n, y_n) by computing y_n' = f(x_n, y_n). The
algorithm then tentatively uses this approximation to y' to step to the trial point (x_n + \alpha h, y_n + \beta k_1).
At this trial point, the algorithm reevaluates the derivative y', then uses both derivative estimates
to compute y_{n+1}. By using two points to compute y', the algorithm should be obtaining second
derivative information, hence be able to reduce the size of the discretization error.
Let us derive the constants a, b, \alpha, and \beta. First, we use Taylor's expansion in x:

y(x_{n+1}) = y(x_n) + h y'(x_n) + \frac{h^2}{2!} y''(x_n) + \frac{h^3}{3!} y'''(x_n) + \cdots
           = y(x_n) + h f(x_n, y_n) + \frac{h^2}{2} (f_x + f f_y)|_{(x_n, y_n)}
             + \frac{h^3}{6} (f_{xx} + 2 f_{xy} f + f_{yy} f^2 + f_x f_y + f_y^2 f)|_{(x_n, y_n)} + O(h^4).    (5)

Second, we use Taylor's expansion in two variables to write

f(x_n + \alpha h, y_n + \beta k_1) = f + \alpha h f_x + \beta k_1 f_y + \frac{\alpha^2 h^2}{2} f_{xx} + \frac{\beta^2 k_1^2}{2} f_{yy} + \alpha \beta h k_1 f_{xy} + O(h^3),

where f and its partial derivatives are evaluated at (x_n, y_n).
Subsequently, we have

y_{n+1} = y_n + a k_1 + b k_2
        = y_n + (a + b) h f + b h^2 (\alpha f_x + \beta f f_y)
          + b h^3 ( \frac{\alpha^2}{2} f_{xx} + \frac{\beta^2}{2} f^2 f_{yy} + \alpha \beta f f_{xy} ) + O(h^4).    (6)
We want y_{n+1} to estimate y(x_{n+1}) with O(h^3) local error. Let us equate the formulas (5)
and (6) and match the terms in h and h^2, respectively. This yields the following three constraint
equations for a, b, \alpha, and \beta:

a + b = 1,
b \alpha = \frac{1}{2},
b \beta = \frac{1}{2}.
Note that for the purpose of local discretization error, y(x_n) matches y_n. The three constraint
equations on four constants give us a family of second-order Runge-Kutta methods.
One popular method is to choose a = b = 1/2 and \alpha = \beta = 1. This effectively places the trial
point at x_{n+1}, then moves from y_n to y_{n+1} by averaging the derivatives computed at x_n and x_{n+1}.
Thus

y_{n+1} = y_n + \frac{1}{2} (k_1 + k_2)

with

k_1 = h f(x_n, y_n),
k_2 = h f(x_n + h, y_n + k_1).
The Runge-Kutta method of order 2 has a local discretization error of O(h^3). This is superior
to Euler's method, which has local error O(h^2). In order for a Taylor-based method to obtain
O(h^3) error, it must compute second derivatives. The Runge-Kutta method, however, does not
need to compute second derivatives. Instead it performs two evaluations of the first derivative.
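A sketch of this particular second-order method (often called Heun's method; the name rk2_step is our own):

```python
# A sketch of the order 2 Runge-Kutta step above, with the popular choice
# a = b = 1/2 and alpha = beta = 1.

def rk2_step(f, x, y, h):
    """One second-order Runge-Kutta step: average the slopes at x and x + h."""
    k1 = h * f(x, y)
    k2 = h * f(x + h, y + k1)
    return y + 0.5 * (k1 + k2)
```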
5.1 Runge-Kutta of Order 4
A widely used method is the following order 4 Runge-Kutta method. It produces O(h^5) local
discretization error, and thus yields accurate solutions quickly. The integration formula
has the form

y_{n+1} = y_n + \frac{1}{6} (k_1 + 2 k_2 + 2 k_3 + k_4),

where

k_1 = h f(x_n, y_n),
k_2 = h f(x_n + \frac{h}{2}, y_n + \frac{k_1}{2}),
k_3 = h f(x_n + \frac{h}{2}, y_n + \frac{k_2}{2}),
k_4 = h f(x_n + h, y_n + k_3).
The term k_1 estimates the derivative y' = f(x, y) at the start point x_n. The term k_2 estimates y' at
the midpoint (x_n + x_{n+1})/2, using a y value obtained with k_1. The term k_3 again estimates y' at
the midpoint, but with an updated y value obtained by using k_2. The term k_4 estimates y'(x_{n+1}),
using the y value obtained with k_3.
[Figure: an integral curve of y'(x) = f(x, y) passing through (x_n, y(x_n)) = (x_n, y_n), with the x-axis marked at x_n, (x_n + x_{n+1})/2, and x_{n+1}, the values y_{n+1} and y(x_{n+1}) marked on the vertical axis, and the slope-evaluation points labeled 1 through 4.]
In the above figure, the curve represents the actual solution to the differential equation y'(x) =
f(x, y). If y(x_n) = y_n as in the picture, then we desire that y_{n+1} = y(x_{n+1}). The picture shows
the process whereby y_{n+1} is computed. Points are labelled by numbers in the order that they are
computed. We assume that the vector field given by y' = f(x, y) is continuous, and that other
integral curves locally look like the one shown.
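The four formulas translate directly into code. A minimal sketch (the name rk4_step is our own):

```python
# A sketch of the classical order 4 Runge-Kutta step described above.

def rk4_step(f, x, y, h):
    """One fourth-order Runge-Kutta step for y' = f(x, y)."""
    k1 = h * f(x, y)
    k2 = h * f(x + h / 2, y + k1 / 2)
    k3 = h * f(x + h / 2, y + k2 / 2)
    k4 = h * f(x + h, y + k3)
    return y + (k1 + 2 * k2 + 2 * k3 + k4) / 6
```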
Example 2. Reconsider the differential equation in Example 1:

y' = -y^2    with y(1) = 1.

The table below shows the solution of this problem over the interval [1, 2] using the fourth-order Runge-Kutta
method with step size h = 0.1.
x_n     y_n        f(x_n, y_n)    y(x_n)
1.0     1.0        -1.0           1.0
1.1     0.90909    -0.82645       0.90909
1.2     0.83333    -0.69445       0.83333
1.3     0.76923    -0.59172       0.76923
1.4     0.71429    -0.51020       0.71429
1.5     0.66667    -0.44445       0.66667
1.6     0.62500    -0.39063       0.625
1.7     0.58824    -0.34602       0.58824
1.8     0.55556    -0.30864       0.55556
1.9     0.52632    -0.27701       0.52632
2.0     0.50000    -0.25000       0.5
Compare this with the results of Euler's method shown in Example 1. There is no error to the fifth
decimal digit. In fact, the error was less than 5 \times 10^{-7}.
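As a sanity check (our own usage example, not from the notes), the rk4_step() sketch from earlier can reproduce this error estimate:

```python
# Checking Example 2 with the rk4_step() sketch from earlier: the maximum
# error over [1, 2] should come out below 5e-7, as reported above.
x, y, err = 1.0, 1.0, 0.0
for _ in range(10):
    y = rk4_step(lambda t, u: -u * u, x, y, 0.1)
    x += 0.1
    err = max(err, abs(y - 1.0 / x))
print(err)
```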
Adams-Bashforth Method
Runge-Kutta propagates a solution over an interval by combining the information from several
Euler-style steps (each involving one evaluation of the right-hand side f), and then using the
information obtained to match a Taylor series expansion up to some higher order. It essentially makes
several small steps for each major step from x_n to x_{n+1}.

An alternative way of moving from x_n to x_{n+1} is to make use of prior information at x_{n-1}, x_{n-2}, \ldots.
In other words, rather than evaluating f(x, y) at several intermediate points, we make use of known
values of f(x, y) at several past points. Methods that make use of information at several of the
x_i's are called multi-step methods. We will here look at one such method, known as the Adams-Bashforth method.
Suppose we have approximations to y(x) and y'(x) at the points x_0, \ldots, x_n. If we integrate the
differential equation

y'(x) = f(x, y(x))

from x_n to x_{n+1}, we obtain

\int_{x_n}^{x_{n+1}} y'(x) \, dx = \int_{x_n}^{x_{n+1}} f(x, y(x)) \, dx.

Hence

y_{n+1} = y_n + \int_{x_n}^{x_{n+1}} f(x, y(x)) \, dx.
How do we evaluate the integral? The trick is to approximate f(x, y(x)) with an interpolating
polynomial. Of course, we do not know exact values of f(x, y(x)) anywhere except at x_0. However,
we do have approximate values f(x_k, y_k) for k = 0, 1, \ldots, n. So we will construct an interpolating
polynomial that passes through some of those values.

Specifically, if we are interested in an order m + 1 method, we will approximate the function
f(x, y(x)) with an interpolating polynomial that passes through the m + 1 points (x_{n-m}, f_{n-m}),
\ldots, (x_n, f_n), where f_i = f(x_i, y_i). We then integrate this polynomial in order to approximate the
integral \int_{x_n}^{x_{n+1}} f(x, y(x)) \, dx.
We have seen interpolating polynomials written in terms of divided differences. It is also possible
to write interpolating polynomials in terms of forward differences. Forward differences are defined
recursively as
\Delta^i f_k = f_k,   if i = 0;
\Delta^i f_k = \Delta^{i-1} f_{k+1} - \Delta^{i-1} f_k,   if i > 0.
A forward difference table looks like this:
x_0   f_0
             \Delta f_0
x_1   f_1                 \Delta^2 f_0
             \Delta f_1                  \Delta^3 f_0
x_2   f_2                 \Delta^2 f_1                   \Delta^4 f_0
             \Delta f_2                  \Delta^3 f_1                    \Delta^5 f_0
x_3   f_3                 \Delta^2 f_2                   \Delta^4 f_1
             \Delta f_3                  \Delta^3 f_2
x_4   f_4                 \Delta^2 f_3
             \Delta f_4
x_5   f_5
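The table's entries can be generated mechanically. A minimal sketch (our own illustration, not from the notes):

```python
# A sketch of how the forward-difference table can be built: column i holds
# the values Delta^i f_k for k = 0, ..., len(fs) - 1 - i.

def forward_differences(fs):
    """Return the columns of the forward-difference table for the values fs."""
    table = [list(fs)]
    while len(table[-1]) > 1:
        prev = table[-1]
        table.append([prev[k + 1] - prev[k] for k in range(len(prev) - 1)])
    return table
```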
In terms of forward differences, the interpolating polynomial through the points (x_{n-m}, f_{n-m}), \ldots,
(x_n, f_n) can be written as

p(x) = \sum_{k=0}^{m} (-1)^k \binom{-s}{k} \Delta^k f_{n-k},

where

s = \frac{x - x_n}{h},

\binom{-s}{k} = \frac{(-s)(-s-1) \cdots (-s-k+1)}{k!},   if k \ge 1.

Integrating the polynomial from x_n to x_{n+1}, that is, for s from 0 to 1, gives

y_{n+1} = y_n + h \int_0^1 \sum_{k=0}^{m} (-1)^k \binom{-s}{k} \Delta^k f_{n-k} \, ds
        = y_n + h \{ \gamma_0 f_n + \gamma_1 \Delta f_{n-1} + \cdots + \gamma_m \Delta^m f_{n-m} \},    (7)

where

\gamma_k = (-1)^k \int_0^1 \binom{-s}{k} \, ds.

The sequence \{\gamma_k\} consists of precomputable numbers. The simplest case, obtained by setting m = 0
in (7), leads to Euler's method.
One popular Adams-Bashforth method is the one of order 4. To derive it, we need to compute
\gamma_i, i = 0, 1, 2, 3:

\gamma_0 = 1,
\gamma_1 = \int_0^1 s \, ds = \frac{1}{2},
\gamma_2 = \int_0^1 \frac{s(s+1)}{2} \, ds = \frac{5}{12},
\gamma_3 = \int_0^1 \frac{s(s+1)(s+2)}{6} \, ds = \frac{3}{8}.
So

y_{n+1} = y_n + h ( f_n + \frac{1}{2} \Delta f_{n-1} + \frac{5}{12} \Delta^2 f_{n-2} + \frac{3}{8} \Delta^3 f_{n-3} ).    (8)

Expanding the forward differences in terms of the f_i turns (8) into

y_{n+1} = y_n + \frac{h}{24} ( 55 f_n - 59 f_{n-1} + 37 f_{n-2} - 9 f_{n-3} ),

whose local discretization error is

\frac{251}{720} h^5 y^{(5)}(\xi)

for some \xi.
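Combining this update rule with a Runge-Kutta start yields a complete solver. A minimal sketch (our own, reusing the rk4_step() sketch from earlier):

```python
# A sketch of the order 4 Adams-Bashforth method.  As in Example 3 below, the
# three extra starting values are generated with fourth-order Runge-Kutta.

def adams_bashforth4(f, x0, y0, b, N):
    """Approximate y' = f(x, y), y(x0) = y0 on [x0, b] using N steps."""
    h = (b - x0) / N
    xs, ys = [x0], [y0]
    for _ in range(3):                        # bootstrap y_1, y_2, y_3
        ys.append(rk4_step(f, xs[-1], ys[-1], h))
        xs.append(xs[-1] + h)
    fs = [f(x, y) for x, y in zip(xs, ys)]
    for n in range(3, N):
        ys.append(ys[n] + (h / 24) * (55 * fs[n] - 59 * fs[n - 1]
                                      + 37 * fs[n - 2] - 9 * fs[n - 3]))
        xs.append(xs[-1] + h)
        fs.append(f(xs[-1], ys[-1]))
    return xs, ys
```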
Example 3. Let us return to the example y' = -y^2 with y(1) = 1, to be solved over the interval [1, 2].
In order to start off the order 4 Adams-Bashforth method, we need to know the values of f(x, y(x)) at
the four points x_0, x_1, x_2, x_3. In general, one obtains these either by knowing them, or by running a highly
accurate method that computes approximations to y(x_1), y(x_2), and y(x_3). In our case, we will take the
values obtained by the Runge-Kutta method in Example 2.
x_n     y_n        f(x_n, y_n)    y(x_n)
1.0     1.0        -1.0           1.0
1.1     0.90909    -0.82645       0.90909
1.2     0.83333    -0.69445       0.83333
1.3     0.76923    -0.59172       0.76923
1.4     0.71444    -0.51042       0.71429
1.5     0.66686    -0.44470       0.66667
1.6     0.62525    -0.39093       0.625
1.7     0.58848    -0.34631       0.58824
1.8     0.55580    -0.30892       0.55556
1.9     0.52655    -0.27726       0.52632
2.0     0.50023    -0.25023       0.5
Notice that the maximum error is approximately 2.5 \times 10^{-4}. This is indeed within the bound we obtained
for the local error, namely

|E| \le \max_{1 \le x \le 2} \frac{251}{720} h^5 |y^{(5)}(x)|
     = \frac{251}{720} (0.1)^5 \max_{1 \le x \le 2} \frac{120}{x^6}
     \approx 4.2 \times 10^{-4}.
Both the Runge-Kutta method and the Adams-Bashforth method are methods of order 4,
meaning that their local discretization error is O(h^5). However, the constant coefficient hidden
in the error term tends to be higher for the Adams-Bashforth method than for the Runge-Kutta
method. Consequently, the Runge-Kutta method tends to exhibit greater accuracy, as shown
in Example 3. Adams-Bashforth also has the drawback that it is not self-starting: one must
supply four initial data points rather than just one.
Both methods require evaluation of f(x, y(x)) at four points in order to move from x_n to
x_{n+1}. However, while Runge-Kutta must generate three intermediate points at which to evaluate
f(x, y(x)), Adams-Bashforth uses information already available. Consequently, Adams-Bashforth
requires less computation per step and is therefore faster.
Multi-step methods use the basic formula

y_{n+1} = y_{n-p} + h \int_{-p}^{1} \sum_{k=0}^{m} (-1)^k \binom{-s}{k} \Delta^k f_{n-k} \, ds.

The integration is from x_{n-p} to x_{n+1}, using interpolation at the points x_{n-m}, \ldots, x_n. The case
p = 0 yields Adams-Bashforth. Some especially interesting formulas of this type are
y_{n+1} = y_{n-1} + 2 h f_n,    E = \frac{h^3}{3} y'''(\xi),    when m = 1 and p = 1,

and

y_{n+1} = y_{n-3} + \frac{4h}{3} (2 f_n - f_{n-1} + 2 f_{n-2}),    E = \frac{14}{45} h^5 y^{(5)}(\xi),    when m = 3 and p = 3.
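For illustration (our own sketch, not from the notes), the first of these formulas is a one-line update that needs one extra starting value besides y_0:

```python
# A sketch of the two-step formula y_{n+1} = y_{n-1} + 2 h f_n
# (the m = 1, p = 1 case above).

def midpoint_step(f, x_n, y_prev, y_n, h):
    """Advance the two-step rule: return y_{n+1} from y_{n-1} (y_prev) and y_n."""
    return y_prev + 2 * h * f(x_n, y_n)
```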
Existence and Uniqueness of Solutions

It can be shown that the initial value problem involving the differential equation y' = f(x, y) has
exactly one solution provided that the function f satisfies a few simple regularity conditions.
Theorem 1  Let f be defined and continuous on the strip S = \{ (x, y) \mid a \le x \le b, \; y \in \mathbb{R}^n \}.
Furthermore, let there be a constant L such that

\| f(x, y_1) - f(x, y_2) \| \le L \| y_1 - y_2 \|    (Lipschitz condition)

for all x \in [a, b] and all y_1, y_2 \in \mathbb{R}^n. Then for every x_0 \in [a, b] and every y_0 \in \mathbb{R}^n there exists
exactly one function y(x) such that

(a) y(x) is continuously differentiable on [a, b];

(b) y'(x) = f(x, y(x)) for x \in [a, b];

(c) y(x_0) = y_0.
The second theorem states that the solution of an initial value problem depends continuously
on the initial value.

Theorem 2  Let f be defined and continuous on the strip S = \{ (x, y) \mid a \le x \le b, \; y \in \mathbb{R}^n \}, and
let it satisfy the Lipschitz condition

\| f(x, y_1) - f(x, y_2) \| \le L \| y_1 - y_2 \|,    for some constant L,

for all (x, y_i) \in S, i = 1, 2. Let a \le x_0 \le b. Then for the solution y(x; s) of the initial value
problem

y' = f(x, y),    y(x_0; s) = s,

there holds the estimate

\| y(x; s_1) - y(x; s_2) \| \le e^{L |x - x_0|} \| s_1 - s_2 \|,    for a \le x \le b.
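As a quick illustration (our own example, not from the notes), consider the scalar equation y' = \lambda y, whose solutions are y(x; s) = s e^{\lambda (x - x_0)}. Here f(x, y) = \lambda y satisfies the Lipschitz condition with L = |\lambda|, and

|y(x; s_1) - y(x; s_2)| = e^{\lambda (x - x_0)} |s_1 - s_2| \le e^{L |x - x_0|} |s_1 - s_2|,

so the estimate of Theorem 2 holds, with equality when \lambda \ge 0 and x \ge x_0.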