
Math 361S Lecture Notes

Numerical solution of ODEs: Part I


Jeffrey Wong
April 2, 2019

Topics covered
• Overview

◦ Brief review of ODEs


◦ Perspective for numerical methods

• Euler’s method

◦ Local and global truncation error; consistency


◦ Convergence; proof of convergence

• Stability

◦ Stiff differential equations


◦ Regions of absolute stability
◦ Some implicit methods (trapezoidal method, backward Euler)

• Runge-Kutta methods

◦ (Briefly) Taylor series method


◦ Constructing higher-order one step methods
◦ Runge-Kutta methods: the general approach
◦ Example: RK4 (fixed step size)

• Implicit one step methods

◦ General notes
◦ Implementation

1 Overview of goals
In this section of the course we will consider the solution of ordinary differential equations.
The tools we have developed so far (interpolation, differentiation, and integration) will
serve as a foundation for constructing the algorithms. We will also make use of some linear
algebra and non-linear equations (e.g. Newton's method) to solve some sub-problems.
Our main goal here is twofold. First, we seek to understand the fundamentals of numerical
methods for ODEs: what numerical issues arise, how to recognize and correct them, and
what perspective to take. If you are handed a nasty ODE, how do you approach
writing a method to solve it? How can you argue that the answer you obtain is correct?
Second, since numerical methods for ODEs synthesize existing techniques, we will see how the
pieces can be combined and how the fundamental principles (error in interpolation, numerical
stability of integration, and so on) show up in practice. We'll also see a few more concepts:
• How stability manifests in numerical solution to ODEs (and how it connects to the
mathematical notion of stability from theory)
• Managing trade-offs between accuracy and stability (the consequences of sacrificing
one for the other are more severe for ODEs than the other problems we have seen)
• Designing adaptive algorithms that ’just work’ like ode45 in Matlab (we saw a bit of
this with adaptive integration, but will explore it in detail here)

2 The problem to solve


2.1 Note on background
See sections 7.1-7.3 of Moler’s book or any standard text on ODEs for a review of ODEs. The
bare minimum will be presented here. Essentially no ODE theory is required to solve ODEs
numerically, but the theory does provide important intuition, so it will greatly enhance your
understanding of the numerics.

2.2 Initial value problems


Here, we will consider solving an initial value problem

y'(t) = f(t, y),   t ∈ [a, b],        (2.1a)
y(t0) = y0.                           (2.1b)

The equation (2.1a) is the ODE (ordinary differential equation) for y(t) and (2.1b) is
the initial condition. We seek a function y(t) that satisfies (2.1a) for all t in the given
interval and whose value at t0 is y0.

The ODE is first-order (the highest derivative appearing is the first derivative) and scalar
(y(t) is a real-valued function).

Throughout, we shall assume that

• f (t, y) is a continuous function in t and y

• f has partial derivatives of all orders required for any derivation (mostly for Taylor
expansions)

It is a fundamental theorem that these conditions ensure a unique solution exists, at least
in some interval around (t0 , y0 ). It is not enough to ensure that a solution exists for all t;
the solution may diverge.

For our purposes, we will attempt to construct numerical solutions where the actual so-
lution exists, so the theory is just there to ensure that the problem to solve is well-defined.

2.3 Slope fields and sensitivity


Here we review a useful geometric interpretation of the ODE

y' = f(t, y).

A solution (t, y(t)) forms a curve in the (t, y) plane. The ODE tells us the direction of the
curve at any given point, since the tangent vector to the curve

is parallel to (1, y') = (1, f).

In this sense, solutions to the ODE follow the 'slope field', which is the vector field

(1, f(t, y))

in the (t, y) plane. To find a solution to the IVP starting at (t0 , y0 ), we may follow the
slope field to construct the curve; this is the basis of the simplest numerical method that is
detailed in the next section.

The slope field gives geometric intuition for some important concepts for numerical methods.
For instance, we may ask: if y(t) gets perturbed by some amount ∆y at time t0 , how far
apart are the original and perturbed curves after some time?

Put another way: how sensitive is the ODE to changes in the initial condition?

The picture is shown in Figure 1. Suppose we have two solutions x(t) and y(t) to the
same ODE,

y' = f(t, y),   y(t0) = y0,
x' = f(t, x),   x(t0) = x0.

Informally, the difference z = y − x satisfies

z' = f(t, y) − f(t, x) = (∂f/∂y)(t, ξ) z

for some ξ between x(t) and y(t), by the mean value theorem. Thus the size of ∂f/∂y determines how fast the
difference can change with time. Let

L = max_{t∈[a,b], y∈R} |∂f/∂y(t, y)|.

Then we have, informally, that

|z|' ≤ L|z|  ⟹  |z| ≤ |z0| e^{Lt},

ignoring that the absolute value cannot be manipulated this way with a derivative. Thus L
(the maximum of the variation of f with respect to y) is the exponential rate, at worst, at which
the two solutions can move apart.

However, the bound is sometimes pessimistic. Taking absolute values discards information
about the sign, so if z' ∼ −Lz then the bound is the same, even though z then decays
exponentially. This is shown in the figure.

Figure 1: Sketch of the difference in two solutions that start at nearby points (t0, x0) and
(t0, y0), and numerical examples for y' = ty and y' = −ty.
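For a concrete look at this sensitivity, the behavior sketched in Figure 1 can be reproduced
numerically. The following MATLAB sketch (not from the notes; it uses the built-in solver
ode45, mentioned later) compares solutions started at two nearby initial values for both ODEs.

% Sensitivity to the initial condition: y' = ty (solutions spread apart)
% versus y' = -ty (solutions come together).
tspan = [0 2];
y0 = 1;  dy = 0.1;                       % base initial value and perturbation
[ta, ya] = ode45(@(t,y)  t.*y, tspan, y0);
[tb, yb] = ode45(@(t,y)  t.*y, tspan, y0 + dy);
[tc, yc] = ode45(@(t,y) -t.*y, tspan, y0);
[td, yd] = ode45(@(t,y) -t.*y, tspan, y0 + dy);
subplot(1,2,1); plot(ta, ya, tb, yb); title('y'' = ty');
subplot(1,2,2); plot(tc, yc, td, yd); title('y'' = -ty');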

3 Numerical methods: the basics

Summary of notation in this section:

• t0 , · · · , tN : the points where the approximate solution is defined

• yj = y(tj ): the exact solution to the IVP

• ỹj : the approximation at tj

• τj : truncation error in obtaining ỹj

• h or ∆t: the ‘step size’ tj − tj−1 (if it is constant); otherwise hj or ∆tj

Here we introduce Euler’s method, and the framework to be used for better numerical
methods later. We seek a numerical solution to the IVP

y' = f(t, y),   y(a) = y0,

and suppose we wish to solve for y(t) up to a time¹ t = b. The approximation will take the
form of values ỹj defined on a grid

a = t0 < t1 < · · · < tN = b

such that
ỹj ≈ y(tj ).
For convenience, denote by yj the exact solution at tj and let the ‘error’ at each point be

ej = yj − ỹj .

It will be assumed that we have a free choice of the tj ’s. The situation is sketched in Figure 2.

3.1 Euler’s method


Assume for simplicity that

h = tj − tj−1,

the 'time step' size, is constant (not necessary, just convenient).

Suppose that we have the exact value of y(t). To get y(t + h) from y(t), expand in a
Taylor series and use the ODE to simplify the derivatives:

y(t + h) = y(t) + h y'(t) + O(h²)
         = y(t) + h f(t, y) + O(h²).

¹Here t is just the independent variable; time is a useful analogy since it suggests the direction (forward
in time) in which the ODE is to be solved.

Figure 2: Numerical solution of an IVP forward in time from t = a to t = b.

This gives, for any point tj, the formula

y(tj + h) = y(tj) + h f(tj, y(tj)) + τj+1        (3.1)

where τj+1 is the local truncation error defined below. We could derive a formula, but
the important thing is that

τj+1 = O(h²).

Dropping the error in (3.1) and iterating this formula, we get Euler's method:

ỹj+1 = ỹj + h f(tj, ỹj).        (3.2)

The initial point is, ideally,

ỹ0 = y0,

since the initial value (at t0) is given. However, in practice, there may be some initial error
in this quantity as well.

Notice that the total error is not just the sum of the truncation errors, because f is evaluated
at the approximation ỹj. The truncation error will propagate through the iteration, as a
careful analysis will show.
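To make the iteration concrete, here is a minimal MATLAB sketch of Euler's method with a
fixed step size (the function name and interface are illustrative, not part of the notes):

function [t, y] = euler_fixed(f, tspan, y0, h)
% Euler's method (3.2) with fixed step size h.
% f is a handle f(t, y); tspan = [a b]; y0 is the initial value.
    t = (tspan(1):h:tspan(2))';          % grid t0, ..., tN
    y = zeros(size(t));                  % approximations ytilde_j
    y(1) = y0;
    for j = 1:length(t)-1
        y(j+1) = y(j) + h*f(t(j), y(j));
    end
end

For example, euler_fixed(@(t,y) t.*y, [0 2], 0.1, 0.05) approximates the example IVP used
later in this section.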

Definition (Local truncation error, or LTE): The local truncation error is the error
incurred in obtaining ỹj when the previous data (yj−1 etc.) is known exactly. It is ’local’ in
the sense that it does not include errors created at previous steps.

Another interpretation is this: The local truncation error is what is ’left over’ when
the exact solution is plugged into the approximation. For Euler’s method,

ỹj+1 = ỹj + h f(tj, ỹj);

if we plug in y(tj) instead of ỹj, the LHS and RHS are not equal; the difference is the local
truncation error:

yj+1 = yj + h f(tj, yj) + τj+1.

3.2 Convergence
Suppose we use Euler's method to generate an approximation

(tj, ỹj),   j = 0, ..., N

to the solution y(t) in the interval [a, b] (with t0 = a and tN = b). The 'error' in the
approximation that matters in practice is the global error

E = max_{0≤j≤N} |yj − ỹj| = max_{0≤j≤N} |ej|,

where

ej = yj − ỹj

is the error at tj. This is a measure of how well the approximation ỹj agrees with the true
solution over the whole interval.²

Our goal is to show that, given an interval [a, b], the global error has the form

max_{0≤j≤N} |ej| = O(h^p)

for some integer p, the order of the approximation. In this case we say that the method is
convergent; the approximation, in theory, converges to the true solution y(t) as h → 0.³

As an example, consider

y' = ty,   y(0) = 0.1,

which has exact solution y(t) = 0.1 e^{t²/2}. Below, we plot some approximations for various
time-steps h; on the right is the max. error in the interval. The log-log plot has a slope of 1,
indicating the error should be O(h).
²Note that this is not precisely true since the approximation is not defined for all t; we would need to
interpolate and that would have its own error bounds. But in practice we typically consider error at the
points where the approximation is computed.
³See the previous footnote; really this means that the approximation as a piecewise-defined function, e.g.
piecewise linear, converges to y(t) as h → 0. Since the points get arbitrarily close together as h → 0, the
distinction between 'max error at the tj's' and 'max error as functions' is not of much concern here.

[Figure: left, the exact solution and Euler approximations with h = 0.4, 0.2, 0.1 (y versus t);
right, the max. error versus h on a log-log scale.]
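A convergence study like the one in the plot above takes only a few lines. The sketch below
assumes the hypothetical euler_fixed function from the previous section and estimates the
slope of the log-log error curve.

% Convergence of Euler's method for y' = ty, y(0) = 0.1 on [0, 2].
yexact = @(t) 0.1*exp(t.^2/2);           % exact solution
hs = 2.^-(2:8);                          % step sizes to test
err = zeros(size(hs));
for k = 1:length(hs)
    [t, y] = euler_fixed(@(t,y) t.*y, [0 2], 0.1, hs(k));
    err(k) = max(abs(y - yexact(t)));
end
p = polyfit(log(hs), log(err), 1);       % p(1) should be close to 1
loglog(hs, err, 'o-'); xlabel('h'); ylabel('max. error');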

3.3 Euler’s method: error analysis


The details of the proof are instructive, as they will illustrate how error propagates and the
’worst case’. Assume that, as before, we have a fixed step size h = tj − tj−1 , points

a = t0 < t1 < · · · < tN = b

and seek a solution in [a, b]. Further assume that

i) The partial derivative ∂f/∂y is bounded by L in the interval:

   L := max_{t∈[a,b], y∈R} |∂f/∂y(t, y)| < ∞,        (3.3)

ii) The local truncation error τj is O(h²), uniform in j:

   max_{0≤j≤N} |τj| = O(h²).

   This holds, for instance, if the appropriate derivatives of f are all bounded in [a, b] (for
   all y), but we will not be too precise about the matter.

Condition (3.3) is called a Lipschitz condition and L is the Lipschitz constant. The
relevant consequence is that

|f (t, y1 ) − f (t, y2 )| ≤ L|y1 − y2 | for all t ∈ [a, b] and y1 , y2 ∈ R (3.4)

which is a direct consequence of the mean value theorem. This inequality is the key ingredient
for the proof. The theorem is the following:

Theorem (Convergence for Euler's method): Let ỹj be the result of applying Euler's
method (3.2) starting at (t0, ỹ0) and suppose that (i) and (ii) hold. Then

max_{0≤j≤N} |ej| ≤ e^{L(b−a)} |e0| + ((e^{L(b−a)} − 1)/L) · (max_k |τk|)/h,        (3.5)

where τk is the local truncation error and ej = yj − ỹj is the error at tj.

In particular, max |τj| = O(h²) and, as h → 0, if e0 = 0 (no error in y0) then

max_{0≤j≤N} |ej| = O(h).        (3.6)

Proof. To start, recall that from the definition of the truncation error and the formula,

yj+1 = yj + h f(tj, yj) + τj+1,        (3.7)
ỹj+1 = ỹj + h f(tj, ỹj).               (3.8)

Subtracting (3.8) from (3.7) gives

ej+1 = ej + h [f(tj, yj) − f(tj, ỹj)] + τj+1.

From assumption (i) we have the Lipschitz property (3.4), so

|ej+1| ≤ (1 + Lh)|ej| + |τj+1|.

Iterating, we get

|e1| ≤ (1 + Lh)|e0| + |τ1|,
|e2| ≤ (1 + Lh)²|e0| + (1 + Lh)|τ1| + |τ2|,

and in general

|ej| ≤ (1 + Lh)^j |e0| + Σ_{k=1}^{j} (1 + Lh)^{j−k} |τk|.

Bounding each |τk| by the maximum in (ii) and evaluating the sum,

|ej| ≤ (1 + Lh)^j |e0| + (max_k |τk|) · ((1 + Lh)^j − 1)/(Lh).
Now we want to take the maximum over j, so the RHS must be written to be independent
of j. To fix this problem, we use the crude estimate

(1 + Lh) ≤ e^{Lh}

to obtain

|ej| ≤ e^{Ljh} |e0| + (max_k |τk|) · (e^{Ljh} − 1)/(Lh).

But jh = tj − t0 ≤ b − a (with equality when j = N), so

|ej| ≤ e^{L(b−a)} |e0| + ((e^{L(b−a)} − 1)/L) · (max_k |τk|)/h.

Taking the maximum over j (note that the RHS is independent of j), we get (3.5).

Now if ỹ0 = y0, i.e. the initial value is calculated exactly, then

max_{0≤j≤N} |ej| ≤ C · (max_k |τk|)/h

for a constant C that depends on the interval size and L but not on h. By assumption (ii),
we know that max |τk| = O(h²), so the maximum error is O(h).

3.4 Interpreting the error bound


A few observations on what the error bound tells us:

• The LTE introduced at each step grows at a rate of e^{Lt} at worst.

• An initial error (in y0) also propagates in the same way.

• The LTE is O(h²) and O(1/h) steps are taken, so the total error is O(h); the propagation
does not affect the order of the error on a finite interval.

Note that the factors L and (1 + Lh) are method dependent; for other methods, the factors
may be other expressions related to L.

Our two examples from the theory,

(a) y' = ty,        (b) y' = −ty,

illustrate the propagation issue. As with the actual solutions, the error in a numerical solution
(or the difference between two nearby numerical solutions) can grow like e^{Lt} at worst. Indeed,
for (a), the error grows in this way; the error bound is good here.

However, for (b), the numerical solutions actually converge to the true solution as t increases;
in fact the error behaves more like e^{−Lt}. But the error bound cannot distinguish between
the two cases, so it is pessimistic for (b).

In either case, the global error over [a, b] is O(h) as h → 0.

3.5 Order
The order p of a numerical method for an ODE is the order of the global truncation error.
Euler’s method, for instance, has order 1 since the global error is O(h).


Figure 3: Numerical solutions to y' = ty and y' = −ty with different values of N; note the
behavior of the error as t increases.

This 1/h factor appears for (most) other methods as well, so as a rule of thumb,

LTE = O(h^{p+1})  ⟹  global error = O(h^p).

The interpretation here is that to get from a to b we take ∼ 1/h steps, so the error is on
the order of the number of steps times the error at each step, (1/h) · O(h^{p+1}) = O(h^p). The
careful analysis shows that the order is not further worsened by the propagation of the errors.

Warning: Some texts define the LTE with an extra factor of h so that it lines up with the
global error, in which case the rule is that the LTE and global error have the same order.
For this reason it is safest to say that the error is O(h^p) rather than to use the term 'order
p', but either is fine in this class.

4 Consistency, stability, and convergence


Convergence is the property any good numerical method for an ODE must have. However,
as the proof shows, establishing it takes some effort (and is harder for more complicated
methods). To understand it better, we must identify the key properties that guarantee
convergence and how they can be controlled.

This strategy leads to two notions, consistency and stability, which are the ingredients
needed for convergence.

As before, we solve an IVP in an interval [a, b] starting at t0 = a with step size h.

Definition (convergence): The numerical method is said to be convergent with order p
if, on an interval [a, b] where (i) holds,

max_{0≤j≤N} |ỹj − y(tj)| = O(h^p)   as h → 0.

4.1 Consistency and stability


A method is called consistent if

the LTE at each step is o(h) as h → 0.

That is,

lim_{h→0} τj/h = 0   for all j.

To check consistency, we may assume the result of the previous step is exact (since
this is how the LTE is defined). This is a benefit, as there is no need to worry about the
accumulation of errors at earlier steps.

Example (Checking consistency): Euler's method,

ỹj+1 = ỹj + h f(tj, ỹj),

is consistent since the truncation error is

τj+1 = y(tj+1) − y(tj) − h f(tj, y(tj)) = (h²/2) y''(ξj),

where ξj lies between tj and tj+1. For each j, the error is O(h²) as h → 0. From the
point of view of consistency, j is fixed so y''(ξj) is just some constant; we do not need a
uniform bound on y'' that is true for all j.

In contrast, stability refers to the sensitivity of solutions to initial conditions. First, it is
worth reviewing the notion of stability for IVPs.

Stability for first-order IVPs: Consider the IVP

y' = f(t, y),   y(t0) = y0        (4.1)

and assume that

|∂f/∂y(t, y)| ≤ L.        (4.2)

If (4.2) holds for some interval t ∈ [a, b] containing t0 and all y, and f is continuous, then the
IVP (4.1) has a unique solution in [a, b].
Moreover, if y1 and y2 are two solutions to the ODE with different initial conditions, then

|y1(t) − y2(t)| ≤ e^{L(t−t0)} |y1(t0) − y2(t0)|.

We would like to have a corresponding notion of ‘numerical’ stability.

Definition (zero-stability): Suppose {yn} and {zn} are approximate solutions to (4.1) in
[a, b]. If it holds that

|yn − zn| ≤ C|y0 − z0| + O(h^p)        (4.3)

where C is independent of n, then the method is called zero stable.

Note that the best we can hope for is C = e^{L(t−t0)}, since the numerical method will never
be more stable than the actual IVP. In what follows, we will try to determine the right
notions of stability for the numerical method.

As written, the stability condition is not easy to check. However, one can derive easy-to-verify
conditions that imply zero stability. We have the following informal result:

Zero-stability, again: A 'typical' numerical method is zero stable in the sense of (4.3) if it
is numerically stable when used to solve the trivial ODE

y' = 0.

The condition is non-trivial to prove but much easier to check.

Here ‘typical’ includes any of the methods we consider in class (like Euler’s method) and
covers most methods for ODEs one encounters in practice.
With some effort, one can show that this notion of stability is exactly the minimum required
for the method to converge.

Convergence theorem (Dahlquist): A 'typical' numerical method converges if it is
consistent and zero stable. Moreover, it is true that

LTE = O(h^{p+1})  ⟹  global error = O(h^p).

This assertion was proven for Euler's method directly. Observe that the theorem lets
us verify two simple conditions (easy to prove) to show that a method converges (hard to
prove).

Example (convergence by theorem): The Backward Euler method (to be studied in
detail later) is

ỹj+1 = ỹj + h f(tj+1, ỹj+1).

Consistency: The local truncation error is defined by

yj+1 = yj + h f(tj + h, yj+1) + τj+1.

Expanding around tj + h and using the ODE, we get

yj = yj+1 − h y'(tj+1) + O(h²) = yj+1 − h f(tj + h, yj+1) + O(h²).

Plugging this into the formula yields τj+1 = O(h²), so the method is consistent.

Zero stability: This part is trivial. When the ODE is y' = 0,

ỹj+1 = ỹj,

which is clearly numerically stable.

The theorem then guarantees that the method is convergent, and that the order
of convergence is 1 (the global error is O(h)).

Note: It is mostly straightforward to prove convergence directly as we did for


Euler’s method; see HW for this.

4.2 Example of an unstable method


Obviously, any method we propose should be consistent (i.e. the truncation error is small
enough). As the theorem asserts, consistency is not enough for convergence! A simple ex-
ample illustrates what can go wrong when a method is consistent but not stable.

Euler's method can be derived by replacing y' in the ODE with a forward difference:

(y(t + h) − y(t))/h = y' = f(t, y).

One might hope, then, that a more accurate method can be obtained by using a second-order
forward difference,

y'(t) = (−y(t + 2h) + 4y(t + h) − 3y(t))/(2h) + O(h²).

Plugging this in, we obtain the method

−ỹj+2 = −4ỹj+1 + 3ỹj + 2h f(tj, ỹj),        (4.4)

which is consistent with an O(h³) LTE. However, this method is not zero stable!

It suffices to show numerical instability for the trivial ODE y' = 0. The iteration reduces to

ỹj+2 = 4ỹj+1 − 3ỹj.

Plugging in ỹj = r^j, we get a solution when

r² − 4r + 3 = 0  ⟹  r = 1, 3,

so the general solution is

ỹj = a + b · 3^j.

If initial values are chosen so that

ỹ0 = ỹ1,

then ỹj = ỹ0 for all j with exact arithmetic. However, if there are any errors (ỹ0 ≠ ỹ1) then
b ≠ 0 and |ỹj| will grow exponentially. Thus, the method is unstable, and is not convergent.

Obtaining a second order method therefore requires a different approach.
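As a quick experiment (not in the notes), one can apply (4.4) to a harmless test problem and
watch the parasitic root r = 3 amplify roundoff; here the method is used in the rearranged
form ỹj+2 = 4ỹj+1 − 3ỹj − 2h f(tj, ỹj).

% The unstable method (4.4) applied to y' = -y, y(0) = 1 on [0, 1].
% Even with an exact second starting value, roundoff excites the 3^j mode.
f = @(t, y) -y;
h = 0.01;  t = 0:h:1;  N = length(t);
y = zeros(1, N);
y(1) = 1;  y(2) = exp(-h);               % exact values at t0 and t1
for j = 1:N-2
    y(j+2) = 4*y(j+1) - 3*y(j) - 2*h*f(t(j), y(j));
end
semilogy(t, abs(y - exp(-t)) + eps);     % the error eventually blows up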

5 Runge-Kutta methods
In this section the most popular general-purpose formulas are introduced, which can be
constructed to be of any order.

5.1 Setup: one step methods


Euler’s method is the simplest of the class of one step methods, which are methods that
involve only yj and intermediate quantities to compute the next value yj+1 . In contrast,
multi-step methods use more than the previous point, e.g.

yj+1 = yj + h(a1 yj−1 + a2 yj−2 )

which will be addressed later.

Definition (one step methods): A general 'explicit' one step method has the form

ỹj+1 = ỹj + h ψ(tj, ỹj)        (5.1)

where ψ is some function we can evaluate at (tj, yj). The truncation error is defined by

yj+1 = yj + h ψ(tj, yj) + τj+1.        (5.2)

The term ‘explicit’ refers to the fact that yj+1 can be computed explicitly at each step, which
makes computation easy.

To improve on the accuracy of Euler's method with a one-step method, we may try to
include higher-order terms to get a more accurate formula. To start, write (5.2) as

yj+1 = yj + h ψ(tj, yj) + τj+1,

and regard yj+1 as the 'LHS' and yj + h ψ(tj, yj) as the 'RHS'.

For a p-th order method, we want the LHS to equal the RHS up to O(h^{p+1}). Now expand
the LHS in a Taylor series around tj:

LHS:  yj+1 = y(tj) + h y'(tj) + (h²/2) y''(tj) + · · ·

A p-th order formula is therefore obtained by taking

ψ(tj, yj) = y'(tj) + (h/2) y''(tj) + · · · + (h^{p−1}/p!) y^{(p)}(tj).
The key point is that the derivatives of y(t) can be expressed in terms of f and its partial
derivatives, which we presumably know. Simply differentiate the ODE y' = f(t, y(t)) in t,
being careful with the chain rule. If G(t, y) is any function of t and y evaluated on the
solution y(t), then

d/dt [G(t, y(t))] = Gt + Gy y'(t) = Gt + f Gy,

with subscripts denoting partial derivatives and Gt etc. evaluated at (t, y(t)).

It follows that

y'(t) = f(t, y(t)),
y''(t) = ft + fy f,
y'''(t) = (ft + fy f)' = ftt + fty f + · · ·  (see HW).

In operator form,

y^{(p)} = (∂/∂t + f ∂/∂y)^{p−1} f.

Taylor's method: The p-th order one-step formula

yj+1 = yj + h y'(tj) + (h²/2) y''(tj) + · · · + (h^p/p!) y^{(p)}(tj) + O(h^{p+1}).

Note that y', y'', · · · are replaced by formulas involving f and its partials, obtained by
repeatedly differentiating the ODE.

This method is generally not used due to the convenience of the (more or less equiv-
alent) Runge-Kutta methods.

5.2 A better way
Taylor’s method is inconvenient because it involves derivatives of f . Ideally, we want a one
step method that needs to know f (t, y) and nothing more.

The key observation is that the choice of ψ in Taylor's method is not unique. We can replace
ψ with anything else that has an error of the same order. The idea of a Runge-Kutta
method is to replace ψ with function evaluations at 'intermediate' points, built from
computable values starting with f(tj, yj).

Let us illustrate this by deriving a second-order one step method of the form

yj+1 = yj + w1 h f1 + w2 h f2 + O(h³),

where the LHS is yj+1, the RHS is yj + w1 h f1 + w2 h f2, and

f1 = f(tj, yj),
f2 = f(tj + h/2, yj + h β f1),

with w1, w2, β constants to be found.

Aside (integration): You may notice that this resembles an integration formula using two
points; this is not a coincidence, since

y' = f(t, y)  ⟹  yj+1 − yj = ∫_{tj}^{tj+1} f(t, y(t)) dt,

so we are really estimating the integral of f(t, y(t)) using points at tj and tj+1/2. The problem
is more complicated than just integrating f(t) because the argument depends on the unknown
y(t), so that also has to be approximated.

To find the coefficients, expand everything in a Taylor series, keeping terms up to order h²:

LHS = yj + h yj' + (h²/2) yj'' + O(h³)
    = yj + h f + (h²/2)(ft + f fy) + O(h³),

where f etc. are all evaluated at (tj, yj). For the fi's, we only need to expand f2:

h f2 = h f + (h²/2) ft + h² fy (β f1) + O(h³)
     = h f + (h²/2) ft + β h² f fy + O(h³).

Plugging this into the RHS gives

RHS = yj + h(w1 + w2) f + (w2 h²/2) ft + w2 β h² f fy + O(h³),
LHS = yj + h f + (h²/2)(ft + f fy) + O(h³).

Comparing, the LHS and RHS are equal up to O(h³) if

w1 + w2 = 1,   w2 = 1,   w2 β = 1/2,

which gives

w1 = 0,   w2 = 1,   β = 1/2.
We have therefore obtained the formula

yj+1 = yj + h f2 + O(h³),

f1 = f(tj, yj),   f2 = f(tj + h/2, yj + h f1/2),

which is called the modified Euler method.

Remark (integration connection): In this case one can interpret the formula as using
the midpoint rule to estimate

y(tj+1) = y(tj) + ∫_{tj}^{tj+1} f(t, y(t)) dt ≈ y(tj) + h f(tj + h/2, y(tj + h/2)),

but using Euler's method to estimate the midpoint value:

y(tj + h/2) ≈ yj + (h/2) f(tj, yj).

In fact, when f = f(t), the method reduces exactly to the composite midpoint rule for
∫ f(t) dt. However, in general, the various intermediate quantities do not have a clear
interpretation, and the formulas can appear somewhat mysterious. Deriving methods from
integration formulas is done in a different way (multistep methods).
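A minimal sketch of the modified Euler method (the function name and interface are
illustrative, not taken from the notes):

function [t, y] = modified_euler(f, tspan, y0, h)
% Modified Euler: an Euler half-step to the midpoint, then a full step
% using the midpoint slope.
    t = (tspan(1):h:tspan(2))';
    y = zeros(size(t));  y(1) = y0;
    for j = 1:length(t)-1
        f1 = f(t(j), y(j));                   % slope at the left endpoint
        f2 = f(t(j) + h/2, y(j) + (h/2)*f1);  % slope at the estimated midpoint
        y(j+1) = y(j) + h*f2;
    end
end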

5.3 Higher-order explicit methods
The modified Euler method belongs in the class of Runge-Kutta methods.

Definition (Explicit RK method): A general explicit Runge-Kutta (RK) method uses
m 'substeps' to get from yj to yj+1 and has the form

f1 = f(tj, yj)
f2 = f(tj + c2 h, yj + h a21 f1)
f3 = f(tj + c3 h, yj + h a31 f1 + h a32 f2)
  ...
fm = f(tj + cm h, yj + h am1 f1 + · · · + h am,m−1 fm−1)
yj+1 = yj + h(w1 f1 + · · · + wm fm).

Each fi is an evaluation of f at a y-value obtained as yj plus a linear combination of the
previous fi's. The 'next' value yj+1 is yj plus h times a linear combination of all the fi's.

• The best possible local truncation error is O(h^{p+1}) where p ≤ m. For each p, the
system is underdetermined and has a family of solutions (see HW for the p = 2 case).

• Unfortunately, it is not always possible to have p = m. That is, to get a high-order
method (fifth order and above), we need more substeps per iteration than the order.

• Deriving RK methods past third order is quite tedious and a mess of algebra, since
the system for the coefficients is non-linear and the Taylor series expansions become
complicated. (Exercise: verify that RK-4 below has an O(h⁵) LTE.)

Thankfully, just about every useful set of coefficients (at least for general-purpose methods)
has been calculated already, so in practice one can just look them up. They are typically
arranged in the Butcher tableau (see book for details).

The classical RK-4 method: One four-stage method of note is the classical 'RK-4' method

f1 = h f(tn, yn)
f2 = h f(tn + h/2, yn + f1/2)
f3 = h f(tn + h/2, yn + f2/2)
f4 = h f(tn + h, yn + f3)
yn+1 = yn + (1/6)(f1 + 2f2 + 2f3 + f4).

This method has a good balance of efficiency and accuracy (only four function evaluations
per step, and O(h⁵) LTE). The method would be a good first choice for solving ODEs, except
that there is a more popular variant that is better for error estimation.
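A fixed-step implementation of classical RK-4 is short; the sketch below mirrors the formulas
above (the interface is illustrative).

function [t, y] = rk4_fixed(f, tspan, y0, h)
% Classical RK-4 with fixed step size h; the fi here include the factor h,
% as in the formulas above.
    t = (tspan(1):h:tspan(2))';
    y = zeros(size(t));  y(1) = y0;
    for n = 1:length(t)-1
        f1 = h*f(t(n),       y(n));
        f2 = h*f(t(n) + h/2, y(n) + f1/2);
        f3 = h*f(t(n) + h/2, y(n) + f2/2);
        f4 = h*f(t(n) + h,   y(n) + f3);
        y(n+1) = y(n) + (f1 + 2*f2 + 2*f3 + f4)/6;
    end
end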

5.4 Implicit methods
The explicit RK methods are nice because we simply ’iterate’ to compute successive values,
ending up with yj+1 . That is, all quantities in the formula can be computed explicitly.

However, we could also include yj+1 in the formula. For a one step method, the
formula would take the form

yj+1 = yj + ψ(tj, yj, yj+1) + τj+1.

Nothing is different in the theory; the truncation error is calculated in the same way and the
same remarks on convergence apply. In practice, however, more work is required because
yj+1 is defined implicitly by the formula:

ỹj+1 = ỹj + ψ(tj, ỹj, ỹj+1).

Suppose we have computed values up to ỹj. Define

g(z) = z − ỹj − ψ(tj+1, ỹj, z).

Then ỹj+1 is a root of g(z) (which is computable for any z). Thus ỹj+1 can be computed by
applying Newton's method (ideally) or some other root-finder to g(z).

Practical note: The obvious initial guess is ỹj, which is typically close to the root. If h is
small, then ỹj is almost guaranteed to be close to the root, and moreover

ỹj+1 → ỹj as h → 0.

Thus, if Newton's method fails to converge, h can be reduced to make it work. Since the
initial guess is close, quadratic convergence ensures that the Newton iteration will only take
a few steps to achieve very high accuracy, so each step is only a few times more work than
an equally accurate explicit method.

You may wonder why we would bother with an implicit method when the explicit methods
are more efficient per step; the reason is that implicit methods have other desirable properties,
to be explored in the next section. For some problems, implicit methods can use much larger
h values than explicit ones.

5.4.1 Example: using Backward Euler


Suppose we wish to solve

y' = −t sin y,   y(0) = y0,

using the Backward Euler method

yj+1 = yj + h f(tj+1, yj+1).

If Newton's method is used, the code must know f and ∂f/∂y. The function in Matlab may
be written, for instance, in the form

[T,Y] = beuler(f, fy, [a b], y0, h)

where f, fy are both functions of t and y. Internally, at step j we have

g(z) = z − yj − h f(tj+1, z),   g'(z) = 1 − h fy(tj+1, z),

and the Newton iteration is

zk+1 = zk − g(zk)/g'(zk).

This is iterated until convergence; then yj+1 is set to the resulting z.
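A minimal version of the beuler interface described above might look as follows (a sketch only;
the tolerance and iteration cap are arbitrary choices, and a production code would guard
against the Newton iteration failing to converge):

function [T, Y] = beuler(f, fy, tspan, y0, h)
% Backward Euler with a Newton solve at each step.
% f(t,y) is the right-hand side and fy(t,y) its partial derivative in y.
    T = (tspan(1):h:tspan(2))';
    Y = zeros(size(T));  Y(1) = y0;
    for j = 1:length(T)-1
        z = Y(j);                            % initial guess: previous value
        for k = 1:20                         % Newton iteration on g(z)
            g  = z - Y(j) - h*f(T(j+1), z);
            gp = 1 - h*fy(T(j+1), z);
            dz = g/gp;
            z  = z - dz;
            if abs(dz) < 1e-12, break; end
        end
        Y(j+1) = z;
    end
end

For the example above: [T, Y] = beuler(@(t,y) -t.*sin(y), @(t,y) -t.*cos(y), [0 5], 2, 0.1).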

5.4.2 Example: deriving the trapezoidal method


Here we derive a second-order method that uses f(tj, yj) and f(tj+1, yj+1). The formula is

yj+1 = yj + h w1 f(tj, yj) + h w2 f(tj+1, yj+1) + τj+1.

First, note that for the RHS, we only need to expand yj+1 up to an O(h²) error. Using

yj+1 = yj + h f + O(h²),

we have that

f(tj+1, yj+1) = f(tj + h, yj + A)

where A = h f + O(h²). Since A² = O(h²), the result is

RHS = yj + h w1 f + h w2 f(tj + h, yj + A)
    = yj + h w1 f + h w2 [f + h ft + fy A + O(A²)]
    = yj + h(w1 + w2) f + w2 h² (ft + f fy) + O(h³).

Comparing to the LHS,

LHS = yj+1 = yj + h f + (h²/2)(ft + f fy) + O(h³),

we find that w1 = w2 = 1/2.

Trapezoidal method: The implicit formula

yj+1 = yj + (h/2)(fj + fj+1) + O(h³),

where fj = f(tj, yj) and fj+1 = f(tj+1, yj+1).

Note that when f = f(t), the formula reduces to the composite trapezoidal rule.
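One possible implementation sketch of the trapezoidal method uses a simple fixed-point
iteration for the implicit equation; Newton's method, as in the Backward Euler example,
would be the more robust choice, especially for stiff problems. The interface below is
illustrative only.

function [t, y] = trapezoidal(f, tspan, y0, h)
% Trapezoidal method: y_{j+1} = y_j + (h/2)(f_j + f_{j+1}), solved for
% y_{j+1} by fixed-point iteration with an Euler-step predictor.
    t = (tspan(1):h:tspan(2))';
    y = zeros(size(t));  y(1) = y0;
    for j = 1:length(t)-1
        fj = f(t(j), y(j));
        z = y(j) + h*fj;                           % predictor (Euler step)
        for k = 1:50
            znew = y(j) + (h/2)*(fj + f(t(j+1), z));
            if abs(znew - z) < 1e-12, z = znew; break; end
            z = znew;
        end
        y(j+1) = z;
    end
end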

