MODELING WITH ODE
P. Howard
Spring 2005
Contents

1 Overview
3 Well-posedness Theory
  3.1 Stability Theory
  3.2 Stability and Eigenvalues
  3.3 Maximum Sustainable Yield
  3.4 Uniqueness Theory
  3.5 Existence Theory
5 Numerical Methods
  5.1 Euler's Method
1 Overview
A wide variety of natural phenomena, such as projectile motion, the flow of electric current, and the progression of chemical reactions, are well described by equations that relate changing quantities. As the derivative of a function provides the rate at which that function is changing with respect to its independent variable, the equations describing these phenomena often involve one or more derivatives, and we refer to them as differential equations. In these notes we consider three critical aspects in the theory of ordinary differential equations: 1. Developing models of physical phenomena; 2. Determining whether our models are mathematically "well-posed" (do solutions exist? are these solutions unique? do the solutions we find for our equation genuinely correspond with the phenomenon we are modeling?); and 3. Solving ODE numerically with MATLAB.
$$y'' = \frac{\partial}{\partial t}f(t,y) + \frac{\partial}{\partial y}f(t,y)\,y',$$
$$y''(0) = \frac{\partial}{\partial t}f(0,y(0)) + \frac{\partial}{\partial y}f(0,y(0))\,y'(0).$$
Example 2.1. (Drug concentration in an organ.) Suppose blood carries a certain drug into an organ at variable rate r_I(t) cm^3/s and out of the organ at variable rate r_O(t) cm^3/s, and that the organ has an initial blood volume V cm^3. If the concentration of drug in the blood entering the organ is c(t) g/cm^3, determine an ODE for the amount of drug in the organ at time t.

Let x(t) denote the amount of drug in the organ at time t, measured in grams. The input rate is then r_I(t)c(t), while the output rate, assuming instantaneous mixing, is (x(t)/V(t))r_O(t), where the volume of blood in the organ, V(t), can be computed as the initial volume V plus the difference between the blood that flows into the organ over time t and the blood that flows out during the same time:
$$V(t) = V + \int_0^t r_I(s) - r_O(s)\,ds.$$
We obtain the ODE
$$\frac{dx}{dt} = r_I(t)c(t) - \frac{r_O(t)}{V + \int_0^t r_I(s) - r_O(s)\,ds}\,x(t); \quad x(0) = 0.$$
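For concreteness (this sketch is not part of the original example), the ODE can be handed directly to MATLAB's ode45; all of the rates and the initial volume below are invented for illustration:

rI = @(t) 2 + sin(t);                            % inflow rate, cm^3/s (assumed)
rO = @(t) 2 + 0*t;                               % outflow rate, cm^3/s (assumed)
c  = @(t) 0.5*exp(-t);                           % inflow drug concentration, g/cm^3 (assumed)
V0 = 100;                                        % initial blood volume, cm^3 (assumed)
V  = @(t) V0 + integral(@(s) rI(s)-rO(s),0,t);   % V(t) as defined above
f  = @(t,x) rI(t)*c(t) - (rO(t)/V(t))*x;         % dx/dt
[t,x] = ode45(f,[0 10],0);                       % x(0) = 0
plot(t,x)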
Example 2.2. (Cleaning the Great Lakes.) The Great Lakes are connected by a network of waterways, as
roughly depicted in Figure 2.1. Assume the volume of each lake remains constant, and that water flows into
the lake with volume Vk at rate rk . Suppose pollution stops abruptly (i.e., no more pollution flows into the
lakes) and develop a system of ODE that models the progression of pollution as it clears from the lakes.
[Figure 2.1: Schematic of the Great Lakes network: Superior (V1), Huron (V2), Michigan (V3), Erie (V4), and Ontario (V5), connected by waterways with flow rates r1 through r8.]
We have one differential equation for each lake. Let x_k(t) represent the amount of pollutant in the lake with volume V_k. We obtain the system
\begin{align*}
\frac{dx_1}{dt} &= -\frac{r_1}{V_1}x_1\\
\frac{dx_2}{dt} &= \frac{r_1}{V_1}x_1 + \frac{r_3}{V_3}x_3 - \frac{r_1+r_2+r_3}{V_2}x_2\\
\frac{dx_3}{dt} &= -\frac{r_3}{V_3}x_3\\
\frac{dx_4}{dt} &= \frac{r_1+r_2+r_3}{V_2}x_2 - \frac{r_1+r_2+r_3+r_4}{V_4}x_4\\
\frac{dx_5}{dt} &= \frac{r_1+r_2+r_3+r_4}{V_4}x_4 - \frac{r_1+r_2+r_3+r_4+r_5}{V_5}x_5.
\end{align*}
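Since the system is linear, it can be written as x' = Ax and solved numerically. The volumes and flow rates in the sketch below are placeholders, not measured values:

V = [12000 3500 4900 480 1600];   % lake volumes V1..V5 (assumed values and units)
r = [65 15 35 10 15];             % flow rates r1..r5 (assumed)
A = zeros(5);
A(1,1) = -r(1)/V(1);
A(2,[1 2 3]) = [r(1)/V(1), -(r(1)+r(2)+r(3))/V(2), r(3)/V(3)];
A(3,3) = -r(3)/V(3);
A(4,[2 4]) = [(r(1)+r(2)+r(3))/V(2), -(r(1)+r(2)+r(3)+r(4))/V(4)];
A(5,[4 5]) = [(r(1)+r(2)+r(3)+r(4))/V(4), -(r(1)+r(2)+r(3)+r(4)+r(5))/V(5)];
[t,x] = ode45(@(t,x) A*x,[0 200],ones(5,1));   % unit initial pollution in each lake
plot(t,x)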
$$NO_2 + CO \longrightarrow NO + CO_2.$$
Energy is released or absorbed in a chemical reaction, but no change in total molecular weight occurs.
$$2NO_2 \longrightarrow NO + NO_3.$$
In particular, this is an example of a second order elementary reaction, because two molecules interact (NO_2 with itself). (A first order reaction, though not elementary,1 is
$$N_2O \longrightarrow N_2 + O.)$$
A proposed mechanism for the reaction of nitrogen dioxide with carbon monoxide is
\begin{align*}
2NO_2 &\longrightarrow NO + NO_3\\
NO_3 + CO &\longrightarrow NO_2 + CO_2.
\end{align*}
The designation "proposed" mechanism is a recognition that it is extremely difficult to know with certainty what is happening at the intermolecular level in a given reaction. In general, the best we can do is propose a mechanism that fits all known experimental data.
1 Which is to say, I couldn’t hunt one down to use as an example.
2.2.3 Rates of reaction
Recall that for radioactive decay, the decrease in decaying substance is assumed proportional to the amount left. For example, the process of carbon dating depends on the decay of carbon-14 (a carbon isotope with six protons and eight neutrons) into nitrogen-14, in which an electron is released,
$$C^{14} \overset{r}{\longrightarrow} N^{14} + e^-.$$
The assumption that the rate of decay is proportional to the amount left can be written in the form
$$\frac{d[C^{14}]}{dt} = -r[C^{14}],$$
where [C^{14}] represents the concentration of carbon-14 at time t, typically measured as moles per unit volume. (Recall that 1 mole is approximately 6.022 × 10^{23} molecules, where 6.022 × 10^{23} is Avogadro's number, which corresponds with the number of atoms in a 12 gram sample of carbon-12.) According to the conservation of mass, we must have
$$\frac{d[N^{14}]}{dt} = +r[C^{14}],$$
which is simply to say that for each atom of carbon-14 lost, an atom of nitrogen-14 is gained.
In general, for elementary reactions, we will assume the law of mass action.
Law of mass action. The rate of a chemical reaction is proportional to the product of the concentrations of the reactants.
In the case of our elementary reaction above between nitrogen trioxide and carbon monoxide,
$$NO_3 + CO \overset{k_2}{\longrightarrow} NO_2 + CO_2,$$
the law of mass action asserts
$$\frac{d[NO_3]}{dt} = \frac{d[CO]}{dt} = -k_2[NO_3][CO].$$
A good intuitive way to think about this is that since the nitrogen trioxide and the carbon monoxide only
react when they come into contact with one another, the chance of reaction is increased if either (or, of
course, both) has a high concentration. Again by conservation of mass we have the relations
$$\frac{d[NO_2]}{dt} = \frac{d[CO_2]}{dt} = +k_2[NO_3][CO].$$
Observe that the rate is always determined by the reacting chemicals.
For the reaction in which nitrogen dioxide decomposes into nitrogen monoxide and nitrogen trioxide,
$$2NO_2 \overset{k_1}{\longrightarrow} NO + NO_3,$$
we regard the left hand side as NO_2 + NO_2, so that the decay of nitrogen dioxide can be written
$$\frac{d[NO_2]}{dt} = -2k_1[NO_2]^2.$$
Observe that the coefficient 2 is critical in this case and indicates that for each reaction that takes place,
2 molecules of nitrogen dioxide are used. The exponent 2 is a consequence of the law of mass action. By
conservation of mass, we have
$$\frac{d[NO]}{dt} = \frac{d[NO_3]}{dt} = +k_1[NO_2]^2.$$
Finally, notice that the entire reaction NO_2 + CO → NO + CO_2 is modeled by a system of ODE,
\begin{align*}
\frac{d[NO_3]}{dt} &= -k_2[NO_3][CO] + k_1[NO_2]^2\\
\frac{d[CO]}{dt} &= -k_2[NO_3][CO]\\
\frac{d[NO_2]}{dt} &= -2k_1[NO_2]^2 + k_2[NO_3][CO].
\end{align*}
Notice that we have a complete system of ODE and do not need to consider the concentrations [NO] and [CO_2].
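As an illustration, this system can be solved numerically once rate constants and initial concentrations are chosen; the values below are made up:

k1 = 0.1; k2 = 0.05;                        % assumed rate constants
f = @(t,u) [-k2*u(1)*u(2) + k1*u(3)^2;      % u(1) = [NO3]
            -k2*u(1)*u(2);                  % u(2) = [CO]
            -2*k1*u(3)^2 + k2*u(1)*u(2)];   % u(3) = [NO2]
[t,u] = ode45(f,[0 50],[0; 1; 1]);          % assumed initial concentrations
plot(t,u)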
Example 2.3. In certain cases, a reaction can proceed in either direction. For example, in the hydrogenation of ethylene (C_2H_4) to ethane (C_2H_6),
$$C_2H_4 + H_2 \longrightarrow C_2H_6,$$
a proposed mechanism is
\begin{align*}
H_2 &\underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} 2H\\
C_2H_4 + H &\overset{k_2}{\longrightarrow} C_2H_5\\
C_2H_5 + H &\overset{k_3}{\longrightarrow} C_2H_6,
\end{align*}
where the first reaction can proceed in either direction. According to the law of mass action, we can model this mechanism with the following system of ODE,
\begin{align*}
\frac{d[H_2]}{dt} &= -k_1[H_2] + k_{-1}[H]^2\\
\frac{d[H]}{dt} &= 2k_1[H_2] - 2k_{-1}[H]^2 - k_2[C_2H_4][H] - k_3[C_2H_5][H]\\
\frac{d[C_2H_4]}{dt} &= -k_2[C_2H_4][H]\\
\frac{d[C_2H_5]}{dt} &= k_2[C_2H_4][H] - k_3[C_2H_5][H].
\end{align*}
Trial | [NO] | [H2] | Initial rate
  1   |  .1  |  .1  | 1.23 × 10^{-3}
  2   |  .1  |  .2  | 2.46 × 10^{-3}
  3   |  .2  |  .1  | 4.92 × 10^{-3}

Table 2.1: Concentrations of nitrogen monoxide and hydrogen and initial reaction rates.
We now posit the general reaction rate 2k[NO]^a[H_2]^b for three constants a, b, and k. That is, we assume
$$\frac{d[NO]}{dt} = \frac{d[H_2]}{dt} = -2k[NO]^a[H_2]^b,$$
2 If you’ve taken first-year chemistry with a lab component you’ve almost certainly carried out an experiment similar to this
one.
where the 2 is really just for show here, since it could be subsumed into k. (Observe that if this reaction is elementary, we regard it as
$$NO + NO + H_2 + H_2 \longrightarrow N_2 + 2H_2O,$$
for which a and b will both be 2.) For convenience of notation, we will write the positive rate R = -d[NO]/dt, so that we have
$$R = 2k[NO]^a[H_2]^b.$$
Taking now the natural logarithm of both sides, we find
$$\ln R = \ln 2k + a\ln[NO] + b\ln[H_2].$$
Given our data for R, [NO], and [H_2], we can use multivariate linear regression to determine the values of a, b, and k. That is, if we write X = ln[NO], Y = ln[H_2], and Z = ln R, we have
$$Z = \ln 2k + aX + bY.$$
(In this case, we have exactly three pieces of data and three unknowns, so the fit will be exact, but in general we would have more data and we would proceed through regression.) In the following MATLAB code, N represents [NO] and H represents [H_2].
>>N=[.1 .1 .2];
>>H=[.1 .2 .1];
>>R=[1.23e-3 2.46e-3 4.92e-3];
>>M=[ones(size(N))’ log(N)’ log(H)’];
>>p=M\log(R)’
p=
0.2070
2.0000
1.0000
>>k=exp(.207)/2
k=
0.6150
In this case we determine that a = 2, b = 1, and k = .615, so that our rate law becomes
$$\frac{d[NO]}{dt} = -2(.615)[NO]^2[H_2].$$
We conclude that this reaction is most likely not elementary.
So, what is the mechanism? Well, judging from our analysis, the first reaction might look something like
$$2NO + H_2 \longrightarrow\ ?$$
At this point, we need a chemist.
$$\frac{d[C^{14}]}{dt} = -r[C^{14}], \quad r = 1.2097 \times 10^{-4}\ \text{years}^{-1}$$
(this rate corresponds with the commonly quoted fact that carbon-14 has a half-life of 5730 years, by which we mean the level of carbon-14 in a substance is reduced by half after 5730 years). Since the ratio of carbon-14 to carbon-12 remains relatively constant in living organisms (at the same level as it occurs in the atmosphere, roughly [C^{14}]/[C^{12}] ≈ 1.3 × 10^{-12}), we can determine how long an organism has been dead by
measuring this ratio and determining how much carbon-14 has radiated away. For example, if we find that
half the carbon-14 has radiated away, then we can say the material is roughly 5730 years old. In practice,
researchers measure this ratio of carbon-14 to carbon-12 in units called modern carbons, in which the living
ratio (ratio of carbon-14 to carbon-12 in a living organism) is defined to be 1 modern carbon.
Example 2.5. (Carbon dating the Shroud of Turin)3 The most famous (and controversial) case of carbon
dating was that of the Shroud of Turin, which many believe to have covered Jesus of Nazareth in his tomb.
In 1988, samples of the cloth were independently studied by three groups, one at the University of Arizona,
one at Oxford University, and one at the Swiss Federal Institute of Technology (ETH) in Zurich. In this
example, we will consider the data collected in Zurich. Five measurements were made on the level of modern
carbon remaining in the shroud, .8755, .8766, .8811, .8855, and .8855 (two measurements led to the same
number). Averaging these, we have a value M = .8808. Since the level of carbon-12 remains relatively
constant, we can assume that the level of the ratio of carbon-14 to carbon-12 is reduced at the same rate as
the level of carbon-14. We have, then
$$\frac{dM}{dt} = -rM; \quad r = 1.2097 \times 10^{-4}\ \text{years}^{-1} \Rightarrow M(t) = M(0)e^{-rt}.$$
Setting M(0) = 1 as the level of modern carbon when the shroud was made, we need to find t so that
$$.8808 = e^{-1.2097 \times 10^{-4}\,t}.$$
Solving this relation, we find t = 1049 years, which dates the shroud to the year 1988 − 1049 = 939 A.D.
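The arithmetic of this example is easily reproduced in MATLAB (a check, not new data):

r = 1.2097e-4;     % decay rate, 1/years
M = 0.8808;        % averaged modern-carbon level
t = -log(M)/r      % returns approximately 1049 years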
$$\frac{dp}{dt} = c; \quad p(0) = p_0 \Rightarrow p(t) = ct + p_0.$$
Examples include cars coming off an assembly line and T cells being created in bone marrow.
2. Malthusian model. Named for the British economist Thomas R. Malthus (1766–1834), 4 the Malthusian
model assumes that both the birth rate of a population and the death rate of a population are proportional
to the current size of the population. For example, in a population of two people, the population will not
3 The study we take our data from was originally published in Nature vol. 337 (1989), no. 6208 611–615, which is available at
www.shroud.com/nature.htm. The data in the form I’m giving it here was given by Remi Van Haelst in his article Radiocarbon
Dating the Shroud of Turin: the Nature Report, which is available at www.shroud.com/vanhels5.pdf. The results of this study
have been widely disputed. One compelling argument against the date we find in this example is that the shroud was patched
in medieval times and the examples studied were part of that patch.
4 Malthus is perhaps the single most important figure in shaping current socioeconomic views of population dynamics. Prior
to the publication of Malthus’s Essay on Population in 1798, European statesmen and economists largely agreed that rising
population was an indication of economic prosperity (this point of view was argued, for example, by the influential Scottish
political economist and philosopher Adam Smith (1723–1790) in his Wealth of Nations (1776)). According to this point of view,
if a king or ruling assemblage wanted to increase its nation’s prosperity, it need only increase its pool of taxpayers. In his Essay
on Population, Malthus pointed out that environments have finite resources, and consequently that rising populations must
eventually lead to famine. Though increasingly clever methods of cultivation have allowed industrialized countries to sustain
more people than Malthus would likely have thought possible, his thesis is now widely accepted.
grow very rapidly, but in a population of 6.2 billion people (roughly the earth’s population in 2004) growth
is extremely rapid. Letting b represent birth rate and d represent death rate, we write,
$$\frac{dp}{dt} = bp - dp = rp; \quad p(0) = p_0 \Rightarrow p(t) = p_0e^{rt},$$
where r, which is typically positive, will be referred to as the growth rate of the population.
3. Logistic model. A clear drawback of the Malthusian model is that it assumes there are no inherent
limitations on the growth of a population. In practice, most populations have a size beyond which their
environment can no longer sustain them. The logistic model incorporates this observation through the
introduction of a “carrying capacity” K, the greatest population an environment can sustain. We have,
$$\frac{dp}{dt} = rp\Big(1 - \frac{p}{K}\Big); \quad p(0) = p_0 \Rightarrow p(t) = \frac{p_0K}{(K - p_0)e^{-rt} + p_0}.$$
In order to better understand the role K plays, we recall the idea of equilibrium points or steady states (this will anticipate the general stability discussion of Section 3). An equilibrium point is some point at which a population quits changing: dp/dt = 0. In the case of the logistic equation, we can find all equilibrium points by solving the algebraic equation
$$rp_e\Big(1 - \frac{p_e}{K}\Big) = 0 \Rightarrow p_e = 0,\ K.$$
We determine whether or not a population moves toward a particular equilibrium point by considering the sign of dp/dt on either side of the equilibrium point. For the equilibrium point p_e = K, we observe that for p > K, dp/dt < 0 (that is, the population is decreasing), while for p < K, dp/dt > 0 (that is, the population is increasing). In this case, the population always approaches K, and we refer to K as a stable equilibrium point. Very generally, stable equilibrium points represent long time behavior of solutions to ODE.
4. Gompertz model. Named for the British actuary and mathematician Benjamin Gompertz (1779–1865),
the Gompertz model is qualitatively similar to the logistic model. We have
$$\frac{dp}{dt} = -rp\ln\Big(\frac{p}{K}\Big); \quad p(0) = p_0.$$
The Gompertz model is often used in the study of tumor growth.
5. General single population model. The logistic and Gompertz models are both special cases of the
general population model,
$$\frac{dp}{dt} = \frac{r}{a}p\Big(1 - \Big(\frac{p}{K}\Big)^a\Big),$$
where r and K play the same roles as in the logistic and Gompertz models, and a is typically fit to data.
We note that a = 1 gives the logistic model, and the Gompertz model is recovered in the limit a → 0.
6. Lotka-Volterra model. Named for the Italian mathematician Vito Volterra (1860–1940) and the
Austrian chemist, demographer, ecologist, and mathematician Alfred J. Lotka (1880–1949), the Lotka–
Volterra model describes the interaction between a predator (e.g., wildcats) with population y(t) and its
prey (e.g., rabbits) with population x(t). We have,
\begin{align*}
\frac{dx}{dt} &= ax - bxy; \quad x(0) = x_0\\
\frac{dy}{dt} &= -ry + cxy; \quad y(0) = y_0,
\end{align*}
where a, b, c, and r are all taken positive. We observe that in the absence of predators (i.e., in the case y ≡ 0)
the prey thrive (they have Malthusian growth), while in the absence of prey (i.e., in the case x ≡ 0) the
predators die off. The interaction or predation terms signify that the larger either the predator population or
the prey population is, the more often the two populations interact, and that interactions tend to increase the
predator population and decrease the prey population. While qualitatively enlightening, the Lotka–Volterra
model isn’t robust enough to model many real interactions, though see Examples 2.6 and 2.7 in the course
notes Modeling Basics.
7. Competition models. In addition to predator–prey interactions, we would often like to model two
species such as rabbits and deer that compete for the same resources. In the (unlikely) event that each
species uses exactly the same amount of the environment, we could model this interaction through the ODE,
\begin{align*}
\frac{dx}{dt} &= r_1x\Big(1 - \frac{x+y}{K}\Big)\\
\frac{dy}{dt} &= r_2y\Big(1 - \frac{x+y}{K}\Big),
\end{align*}
where we have simply asserted that if the total population x + y exceeds the carrying capacity, both popula-
tions will begin to die off. More generally, we assume that each population has a different carrying capacity
and a different interaction with its environment, and only keep this general idea that if either population
gets sufficiently large, the other will begin to die off. Under this assumption, a reasonable model is,
\begin{align*}
\frac{dx}{dt} &= r_1x\Big(1 - \frac{x}{K_1} - \frac{s_1y}{K_1}\Big)\\
\frac{dy}{dt} &= r_2y\Big(1 - \frac{y}{K_2} - \frac{s_2x}{K_2}\Big),
\end{align*}
where K1 represents the carrying capacity of species x, K2 represents the carrying capacity of species y,
and s1 represents a scaling for the amount of species x’s environment used by species y (and similarly for
s2 ). For example, suppose species y is larger and eats roughly twice as much per animal as species x. Then
we take s1 = 2. It seems fairly natural that if s1 = 2, then s2 = 1/2. That is, if species y uses twice the
environment of species x, then species x uses half the environment of species y. While intuitively satisfying,
this reciprocity doesn’t always fit the data.
8. The SIR epidemic model. The simplest model for studying the spread of epidemics involves three populations: the susceptible members, S(t), the infected members, I(t), and the removed members, R(t). (The removed members can either have recovered (in which case they are assumed in this model to be immune) or died.) The SIR model takes the form
\begin{align*}
\frac{dS}{dt} &= -aSI\\
\frac{dI}{dt} &= aSI - bI\\
\frac{dR}{dt} &= bI.
\end{align*}
In the first equation, we observe that the rate at which susceptible members of the population become
infected is proportional to the number of interactions there are between members of the population. The
second equation records that each member lost from S(t) moves to population I(t) and that members of I(t)
recover or die at some rate b determined by the disease and typically found experimentally.
9. Half saturation constants. In the Lotka–Volterra predator–prey model above, the predator growth
due to predation takes the form +cxy. Even if there is only one predator left, this claims that if there are
enough prey, the predators will continue to grow rapidly. A better expression might be,
$$\frac{cxy}{x+M},$$
for which there is an intrinsic limit on how fast the predator population can grow when saturated with prey. In particular, at full saturation, we consider the limit as the prey population goes to infinity,
$$\lim_{x\to\infty}\frac{cxy}{x+M} = cy,$$
which is the full saturation growth. We refer to the constant M as the "half saturation" constant, because when x = M, the growth rate is at precisely half saturation,
$$\frac{cxy}{x+x} = \frac{1}{2}cy.$$
10. Learning terms. We often want to specify in our model that a species becomes more (or less) adept
at some function as time progresses. For example, we might find in a predator–prey situation that prey
learn over time to avoid predators. In this case, the growth and decay rates due to predation will depend
on the independent variable t. Typically, we assume this change is slow, and logarithmic terms are often
employed. In the Lotka–Volterra model, under the assumption that the prey learn to avoid the predators
(and the predators do not get more adept at finding the prey), we could write,
\begin{align*}
\frac{dx}{dt} &= ax - \frac{b}{(\ln(e+t))^k}xy\\
\frac{dy}{dt} &= -ry + \frac{c}{(\ln(e+t))^k}xy,
\end{align*}
where we evaluate natural log at e + t so that we get 1 when t = 0, and k is a new parameter to be fit to
data.
11. Delay models. One of the deficiencies with the population models discussed above is that in each case
birth rate is assumed to change instantly with a change in population. More generally, we might expect the
members of a population to reach some threshold age before giving birth, introducing a time delay into the
model. For example, a time-delay Malthusian model would take the form
$$\frac{dp}{dt} = rp(t-T),$$
wherein the growth rate of the population at time t depends on the population at time t − T .
12. Discrete time models.5 In a discrete time population model, we increment time by some discrete
amount and update our population according to some rule. One frequently used discrete model, similar to
the continuous logistic model, takes the form
$$p_{t+1} = p_te^{r\left(1 - \frac{p_t}{K}\right)}.$$
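A minimal sketch of iterating this map, with made-up values of r, K, and the initial population:

r = 0.5; K = 100;               % assumed growth rate and carrying capacity
P = zeros(1,50); P(1) = 10;     % assumed initial population
for n = 1:49
    P(n+1) = P(n)*exp(r*(1 - P(n)/K));
end
plot(P,'o-')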
13. Matrix models. Another type of discrete population model involves the notion of a transition
matrix. In this case, we consider a number of different categories the population can be classified in, and
study the probability of a member from one category of the population moving to another category. For
example, a recent paper in the journal Conservation Biology [P. C. Cross and S. R. Beissinger, Using logistic
regression to analyze the sensitivity of PVA models: a comparison of models based on African wild dog model,
Conservation Biology 15 (2001), no. 5, 1335–1346.] considers African wild dogs, dividing the population into
three categories: pups, yearlings and adults. We assume pups survive to become yearlings with probability
Sp , that yearlings reproduce pups with probability Ry , that yearlings survive to adulthood with probability
Sy , that adults reproduce pups with probability Ra , and finally that adults have an annual probability of
survival Sa (see Figure 2.2).
Suppose we know initial populations P0 , Y0 , and A0 , and would like to determine the corresponding
populations a year later. For pups, we have loss as they become yearlings and gain as the yearlings and
adults reproduce. Arguing similarly for yearlings and adults, we arrive at the model,
\begin{align*}
P_1 &= R_yY_0 + R_aA_0\\
Y_1 &= S_pP_0\\
A_1 &= S_yY_0 + S_aA_0.
\end{align*}
(It might be tempting to think we should have a loss term in the pups equation of the form −Sp P0 , but keep
in mind that after one year all pups are lost (if they’ve survived, they’ve become yearlings), and we have
only gotten new pups from reproduction.) Of course, we can write this last expression in the matrix form,
$$\begin{pmatrix}P_1\\ Y_1\\ A_1\end{pmatrix} = \begin{pmatrix}0 & R_y & R_a\\ S_p & 0 & 0\\ 0 & S_y & S_a\end{pmatrix}\begin{pmatrix}P_0\\ Y_0\\ A_0\end{pmatrix}.$$
5 The final two models in this section are not ODE models, but are well worth mentioning.
[Figure 2.2: Transition diagram for the African wild dog population: pups become yearlings with probability S_p, yearlings survive to adulthood with probability S_y, adults survive with probability S_a, and yearlings and adults reproduce pups with probabilities R_y and R_a.]
Similarly, given the year 1 populations, we can produce the year 2 populations through
$$\begin{pmatrix}P_2\\ Y_2\\ A_2\end{pmatrix} = \begin{pmatrix}0 & R_y & R_a\\ S_p & 0 & 0\\ 0 & S_y & S_a\end{pmatrix}\begin{pmatrix}P_1\\ Y_1\\ A_1\end{pmatrix} = \begin{pmatrix}0 & R_y & R_a\\ S_p & 0 & 0\\ 0 & S_y & S_a\end{pmatrix}^2\begin{pmatrix}P_0\\ Y_0\\ A_0\end{pmatrix}.$$
F = −µmg.
Observe in this relationship that we can reasonably regard µ ≤ 1. If not, then it would take less force to lift
the object and carry it than it would to push it. Since the entire force pushing a dragster forward is due to
friction (between the tires and the road), we expect the maximum force propelling the dragster forward to
be F = mg. Under this assumption, we can determine the minimum time it will take a dragster to complete
a standard quarter-mile course (402.34 meters). If x(t) represents position at time t along the course (with
initial position and initial velocity assumed 0), then we have, according to Newton’s second law,
$$\frac{d^2x}{dt^2} = g \Rightarrow x(t) = \frac{1}{2}gt^2.$$
We compute the minimum track time as
$$t = \sqrt{\frac{2(402.34)}{9.81}} = 9.06 \text{ seconds}.$$
Let’s put this to the test. On June 2, 2001, Kenny Bernstein set the world record for a quarter mile track
with a time t = 4.477 seconds.6
Example 2.7. (Planetary motion) Consider the earth–sun system in two space dimensions. We choose
some arbitrary origin (0, 0) and let r1 = (x1 , y1 ) represent the position of the sun (mass M ) relative to the
origin and r2 = (x2 , y2 ) represent the position of the earth (mass m) relative to this origin. (See Figure 2.3.)
[Figure 2.3: The sun (mass M, position r1) and the earth (mass m, position r2) relative to an arbitrarily chosen origin (0,0), with separation vector r2 − r1.]
According to Newton’s law of gravitation, the magnitude of the force exerted by one (point) mass on
another is proportional to the product of the masses and inversely proportional to the distance between the
masses squared, with constant of proportionality G. Ignoring direction, we have
$$F = \frac{GMm}{d^2}.$$
In order to incorporate direction, we assume the force on either mass is directed radially toward the other
mass. The force on the sun due to the earth is given by,
$$F_{sun} = \frac{GMm}{|r_2 - r_1|^3}(r_2 - r_1),$$
6 Race car drivers “burn out” their tires at the beginning of a race, and this makes the tires adhere to the racing surface, so
that they can “push off.” Viewed another way, the cars get more difficult to lift.
while the force on the earth due to the sun is given by,
$$F_{earth} = -\frac{GMm}{|r_2 - r_1|^3}(r_2 - r_1).$$
Finally, according to Newton’s second law of motion, we can set F = ma, for which we obtain the vector
ODE
\begin{align*}
M\frac{d^2r_1}{dt^2} &= \frac{GMm}{|r_2 - r_1|^3}(r_2 - r_1)\\
m\frac{d^2r_2}{dt^2} &= -\frac{GMm}{|r_2 - r_1|^3}(r_2 - r_1),
\end{align*}
or component-wise
\begin{align*}
x_1'' &= \frac{Gm(x_2 - x_1)}{((x_2 - x_1)^2 + (y_2 - y_1)^2)^{3/2}},\\
y_1'' &= \frac{Gm(y_2 - y_1)}{((x_2 - x_1)^2 + (y_2 - y_1)^2)^{3/2}},\\
x_2'' &= -\frac{GM(x_2 - x_1)}{((x_2 - x_1)^2 + (y_2 - y_1)^2)^{3/2}},\\
y_2'' &= -\frac{GM(y_2 - y_1)}{((x_2 - x_1)^2 + (y_2 - y_1)^2)^{3/2}}.
\end{align*}
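To solve these numerically we can write them as a first order system, in the style of the M-files used later in these notes. The following sketch uses placeholder values of G, M, and m rather than physical constants:

function uprime = twobodyode(t,u)
%TWOBODYODE: Two-body equations written as a first order system.
%u = [x1; y1; x1'; y1'; x2; y2; x2'; y2'].
%G, M, and m below are placeholder values, not physical constants.
G = 1; M = 1000; m = 1;
d3 = ((u(5)-u(1))^2 + (u(6)-u(2))^2)^(3/2);      % |r2 - r1|^3
uprime = [u(3); u(4);
          G*m*(u(5)-u(1))/d3;  G*m*(u(6)-u(2))/d3;
          u(7); u(8);
         -G*M*(u(5)-u(1))/d3; -G*M*(u(6)-u(2))/d3];

A call such as [t,u] = ode45(@twobodyode,[0 40],[0;0;0;0;10;0;0;3]) then integrates the motion, and plot(u(:,1),u(:,2),u(:,5),u(:,6)) traces both orbits.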
$$\frac{d\vec{x}}{dt} = r_te_r + r\frac{de_r}{dt} = r_te_r + r\theta_te_\theta,$$
and similarly
$$\frac{d^2\vec{x}}{dt^2} = (r_{tt} - r(\theta_t)^2)e_r + (2r_t\theta_t + r\theta_{tt})e_\theta.$$
That is, acceleration in the radial direction is given by rtt − r(θt )2 , while acceleration in the angular direction
is given by 2rt θt + rθtt . In the case of a central force such as gravity, all force is in the radial direction, and
[Figure: Polar coordinates at a point P: radius r, angle θ, unit vectors e_r and e_θ, and their increments de_r and de_θ.]
\begin{align*}
m(r_{tt} - r(\theta_t)^2) &= -F\\
m(2r_t\theta_t + r\theta_{tt}) &= 0.
\end{align*}
In particular, if we assume the sun's position is fixed in space, the earth's motion around it can be described by the system
\begin{align*}
r_{tt} - r(\theta_t)^2 &= -\frac{GM}{r^2}\\
2r_t\theta_t + r\theta_{tt} &= 0.
\end{align*}
where p is the system momentum and potential energy is described through the function U(y) such that
$$\text{Force} = -\frac{\partial U}{\partial y}.$$
The Hamiltonian, defined by
$$H(p,y) = \text{Kinetic Energy} + \text{Potential Energy} = \frac{p^2}{2m} + mgy,$$
represents the entire energy of the system. For a conservative system (in the event that energy is conserved)
we have dH/dt = 0, from which we find
$$\frac{dH}{dt} = \frac{\partial H}{\partial p}\frac{dp}{dt} + \frac{\partial H}{\partial y}\frac{dy}{dt} = 0. \tag{2.3}$$
Since kinetic energy is independent of position (∂K.E./∂y = 0), we must have
$$\frac{\partial H}{\partial y} = \frac{\partial U}{\partial y} = -F.$$
where the functions y(x) can be any continuously differentiable functions for which y(0) = y(x_1) = 0 (we assume y_s(x) and y_e(x) agree at the endpoints). We say that y(x) belongs to the function class C_0^1[0, x_1]: the collection of all continuously differentiable functions on [0, x_1] that vanish at the endpoints.
According to our assumption, F[y_e] = minimum, and consequently
$$F[y_s]\Big|_{s=0} = \min \Rightarrow \frac{\partial}{\partial s}F[y_s]\Big|_{s=0} = 0.$$
That is, since φ(s) := F[y_s] is minimized at s = 0 (for any y(x)), its s-derivative at s = 0 must be 0. We have
$$F[y_s] = F[y_e + sy] = \int_0^{x_1}\sqrt{1 + (y_e' + sy')^2}\,dx,$$
from which
$$\frac{\partial}{\partial s}F[y_s] = \int_0^{x_1}\frac{(y_e' + sy')y'}{\sqrt{1 + (y_e' + sy')^2}}\,dx.$$
Upon setting $\frac{\partial}{\partial s}F[y_s]\big|_{s=0} = 0$, we have
$$\int_0^{x_1}\frac{y_e'y'}{\sqrt{1 + (y_e')^2}}\,dx = 0.$$
Integrating by parts, we find that this integral equation can be re-written as
$$\int_0^{x_1}\frac{y_e''\,y}{(1 + (y_e')^2)^{3/2}}\,dx = 0.$$
At this point, we argue as follows: since y(x) can be almost any function we choose (it only needs to be continuously differentiable and to vanish on the boundary), we can choose it to always have the same sign as y_e''. In this case, the numerator of our integrand will be nonnegative and the denominator positive, and consequently the only chance our integral has of being 0 is if y_e'' = 0. In that case, we have the boundary value problem
$$y_e'' = 0; \quad y_e(0) = 0, \quad y_e(x_1) = y_1.$$
In this way we have converted the problem of minimizing a functional into the problem of solving an ODE.
3 Well-posedness Theory
Despite the clunky name, well-posedness analysis is one of the most important things for an applied math-
ematician to understand. In order to get an idea of the issues involved, we will consider the example of a
pendulum, initially perturbed but otherwise under the influence of gravity alone.
Example 3.1. Consider the motion of a mass, m, swinging at the end of a rigid rod, as depicted in Figure
3.1. Assume air resistance is negligible.
[Figure 3.1: A mass m swinging at the end of a rigid rod of length l, displaced by angle θ; the rod tension T balances part of the gravitational force −mg, leaving a tangential force F driving the rotation.]
The force due to gravity on m acts vertically downward, and must be decomposed into a force −T, which is exactly balanced by the rod, and a force F, directed tangentially to the arc of motion. Observing the right triangle, with hypotenuse of length −mg, we have
\begin{align*}
\cos\theta &= -\frac{T}{mg} \Rightarrow T = -mg\cos\theta,\\
\sin\theta &= -\frac{F}{mg} \Rightarrow F = -mg\sin\theta.
\end{align*}
Measuring distance as arclength, d = lθ, Newton's second law of motion (F = ma) determines
$$\frac{d^2\theta}{dt^2} = -\frac{g}{l}\sin\theta; \quad \theta(0) = \theta_0, \quad \frac{d\theta}{dt}(0) = \omega_0. \tag{3.1}$$
In order to solve equation (3.1) with MATLAB, we must first write it as a first order system. Taking x_1 = θ and x_2 = dθ/dt, we have
\begin{align*}
\frac{dx_1}{dt} &= x_2; \quad x_1(0) = \theta_0,\\
\frac{dx_2}{dt} &= -\frac{g}{l}\sin x_1; \quad x_2(0) = \omega_0. \tag{3.2}
\end{align*}
Taking l = 1, we will store this equation in the MATLAB M-file pendode.m,
function xprime = pendode(t,x);
%PENDODE: Holds ODE for pendulum equation.
g = 9.81; l = 1;
xprime = [x(2);-(g/l)*sin(x(1))];
and solve it with the M-file pend.m,
function f = pend(theta0,v0);
%PEND: Solves and plots ODE for pendulum equation
%Inputs are initial angle and initial angular velocity
x0 = [theta0 v0];
tspan = [0 5];
[t,x] = ode45(@pendode,tspan,x0);
plot(x(:,1),x(:,2));
Taking initial angle π/4 and initial velocity 0 with the command pend(pi/4,0), leads to Figure 3.2 (I’ve
added the labels from MATLAB’s pop-up graphics window).
Notice that time has been suppressed and the two dependent variables x1 and x2 have been plotted
in what we refer to as a phase portrait. Beginning at the initial point θ0 = π/4, ω0 = 0 (the right-hand
tip of the football), we observe that angular velocity becomes negative (the pendulum swings to the left)
and angle decreases. At the bottom of the arc, the angle is 0 but the angular velocity is at a maximum
magnitude (though negatively directed), while at the left-hand tip of the football the object has stopped
swinging (instantaneously), and is turning around. The remainder of the curve corresponds with the object’s
swinging back to its starting position. In the (assumed) absence of air resistance or other forces, the object
continues to swing like this indefinitely. Alternatively, taking initial angle 0 and initial velocity 10 with the
command pend(0,10) leads to Figure 3.3.
Observe that in this case angular velocity is always positive, indicating that the pendulum is always
swinging in the same (angular) direction: we have started it with such a large initial velocity that it’s
looping its axis.
Now that we have a fairly good idea of how to understand the pendulum phase diagrams, we turn to the
critical case in which the pendulum starts pointed vertically upward from its axis (remember that we have
assumed it is attached to a rigid rod). After changing the variable tspan in pend to [0, 20] (solving now for
20 seconds), the command pend(pi,0) leads to Figure 3.4. In the absence of any force other than gravity, we
expect our model to predict that the pendulum remains standing vertically upward. (What could possibly
cause it to fall one way rather than the other?) What we find, however, is that our model predicts that it
will fall to the left and then begin swinging around its axis.
Consider finally a change in this last initial data of one over one trillion (10^{-12} = .000000000001). The
MATLAB command pend(pi+1e-12,0) produces Figure 3.5. We see that with a change in initial data as
small as 10−12 radians, the change in behavior is enormous: the pendulum spins in the opposite direction.
We conclude that our model, at least as it is solved on MATLAB, fails at the initial data point (π, 0). In
particular, we say that our model is not well-posed at this point.
[Figure 3.2: Pendulum motion for the case θ0 = π/4 and ω0 = 0 (angular velocity versus angle).]
[Figure 3.3: Pendulum motion for the case θ0 = 0 and ω0 = 10 (angular velocity versus angle).]
[Figure 3.4: Pendulum motion for the case θ0 = π and ω0 = 0 (angular velocity versus angle).]
[Figure 3.5: Pendulum motion for the case θ0 = π + 10^{-12} and ω0 = 0 (angular velocity versus angle).]
In general, for well-posedness, we will require three things of a model:
1. (Existence) There exists a solution to the model.
2. (Uniqueness) The solution is unique.
3. (Stability) The solution does not change dramatically if we only change the initial data a little.
In the next three sections, we will consider each of these in turn, beginning with stability and working our
way back to the most abstract theory, existence.
[Figure 3.6: Phase plane diagram for a simple pendulum (Example 3.1 continued), showing trajectories through (θ0, ω0) = (π/12, 0), (π/4, 6), and (π, 0).]
$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots.$$
For x near 0, higher powers of x are dominated by x, and we can take the approximation sin x ≈ x, which leads to the linearized equations
\begin{align*}
\frac{dx_1}{dt} &= x_2\\
\frac{dx_2}{dt} &= -\frac{g}{l}x_1. \tag{3.4}
\end{align*}
(That is, the right-hand sides of (3.4) are both linear, which will always be the case when we take the linear
terms from a Taylor expansion about an equilibrium point.) Developing the phase plane equation as before,
we now have
dx2 dx2
dt − gl x1
= dx = ,
dx1 dt
1 x2
with solution
x22 g x2
+ · 1 = C,
2 l 2
22
√ p
which corresponds with ellipses centered at (0, 0) with radial axis lengths 2C and 2lC/g (see Figure 3.7).
Typically such solutions are referred to as integral curves. Returning to equations (3.4), we add direction
along the ellipses by observing from the first equation that for x2 > 0, x1 is increasing, and for x2 < 0, x1
is decreasing. The directed sections of integral curves along which the object moves are called trajectories.
Our stability conclusion is exactly the same as we drew from the more complicated Figure 3.6. In particular,
in the case that we have closed loops about an equilibrium point, we say the point is orbitally stable.
[Figure 3.7: Phase plane diagram near the equilibrium point (0,0): ellipses with radial axis lengths √(2C) and √(2lC/g).]
which can be solved by the method of separation of variables for the implicit solution
$$-\frac{y_2^2}{2} + \frac{g}{l}\cdot\frac{y_1^2}{2} = C,$$
which corresponds with hyperbolas (see Figure 3.8). Observe that in this case all trajectories move first toward the equilibrium point and then away. We refer to such an equilibrium point as an unstable saddle.
[Figure 3.8: Phase plane diagram near the equilibrium point (π, 0): hyperbolas with asymptotes y_2 = ±√(g/l) y_1.]
Example 3.2. As a second example of stability analysis, we will consider the Lotka–Volterra predator–prey
equations,
\begin{align*}
\frac{dx}{dt} &= ax - bxy\\
\frac{dy}{dt} &= -ry + cxy. \tag{3.5}
\end{align*}
First, we find all equilibrium points by solving the system of algebraic equations,
\begin{align*}
ax - bxy &= 0\\
-ry + cxy &= 0.
\end{align*}
We find two solutions, (x_1, y_1) = (0,0) and (x_2, y_2) = (r/c, a/b). The first of these corresponds with an absence
of both predator and prey, and of course nothing happens (in the short term). The second is more interesting,
a point at which the predator population and the prey population live together without either one changing.
If this second point is unstable then any small fluctuation in either species will destroy the equilibrium
and one of the populations will change dramatically. If this second point is stable then small fluctuations
in species population will not destroy the equilibrium, and we would expect to observe such equilibria in
nature. In this way, stability typically determines physically viable behavior.
In order to study the stability of this second point, we first linearize our equations by making the substitutions
$$x = \frac{r}{c} + z_1, \qquad y = \frac{a}{b} + z_2.$$
Substituting x and y directly into equation (3.5) we find
\begin{align*}
\frac{dz_1}{dt} &= a\Big(\frac{r}{c} + z_1\Big) - b\Big(\frac{r}{c} + z_1\Big)\Big(\frac{a}{b} + z_2\Big) = -\frac{br}{c}z_2 - bz_1z_2\\
\frac{dz_2}{dt} &= -r\Big(\frac{a}{b} + z_2\Big) + c\Big(\frac{r}{c} + z_1\Big)\Big(\frac{a}{b} + z_2\Big) = \frac{ca}{b}z_1 + cz_1z_2.
\end{align*}
(Observe that in the case of polynomials a Taylor expansion emerges from the algebra, saving us a step.)
Dropping the nonlinear terms, we arrive at our linear equations,
\begin{align*}
\frac{dz_1}{dt} &= -\frac{br}{c}z_2\\
\frac{dz_2}{dt} &= \frac{ca}{b}z_1.
\end{align*}
Proceeding as in the previous case, we solve the phase plane equation,
$$\frac{dz_2}{dz_1} = \frac{\frac{ca}{b}z_1}{-\frac{br}{c}z_2},$$
$$\frac{d^2y_1}{dt^2} = \frac{g}{l}y_1.$$
Homogeneous constant coefficient equations can be solved through the ansatz (guess) y_1(t) = e^{rt}, for which we have r^2 - \frac{g}{l} = 0, or r = \pm\sqrt{\frac{g}{l}}. According to standard ODE theory, we conclude that any solution y_1(t) can be written in the form
$$y_1(t) = C_1e^{-\sqrt{g/l}\,t} + C_2e^{\sqrt{g/l}\,t}.$$
If we define matrix exponentiation through Taylor expansion,
$$e^{At} = I + At + \frac{1}{2}A^2t^2 + \frac{1}{3!}A^3t^3 + \cdots,$$
then as in the case with single equations, we can conclude that
$$y(t) = e^{At}\begin{pmatrix}c_1\\ c_2\end{pmatrix}$$
is a solution to (3.7). (This assertion can be checked through direct term-by-term differentiation.) In the
event that A is diagonal (which is not the case in our example), e^{At} is straightforward to evaluate. For
$$A = \begin{pmatrix}a_1 & 0\\ 0 & a_2\end{pmatrix},$$
we have
\begin{align*}
e^{At} &= I + At + \frac{1}{2}A^2t^2 + \frac{1}{3!}A^3t^3 + \cdots\\
&= \begin{pmatrix}1 & 0\\ 0 & 1\end{pmatrix} + \begin{pmatrix}a_1 & 0\\ 0 & a_2\end{pmatrix}t + \frac{1}{2}\begin{pmatrix}a_1^2 & 0\\ 0 & a_2^2\end{pmatrix}t^2 + \frac{1}{3!}\begin{pmatrix}a_1^3 & 0\\ 0 & a_2^3\end{pmatrix}t^3 + \cdots\\
&= \begin{pmatrix}e^{a_1t} & 0\\ 0 & e^{a_2t}\end{pmatrix}.
\end{align*}
In the event that A is not diagonal, we will proceed by choosing a change of basis that diagonalizes A. This is where eigenvalues begin to emerge. Recall that eigenvalues, µ, of the matrix A are scalar constants that satisfy Av = µv for some vector v, which is referred to as the eigenvector associated with µ. Typically, an n × n matrix will have n linearly independent eigenvectors. Observe that in the event that µ is an eigenvalue of the matrix A, we have
$$(A - \mu I)v = 0,\ v \neq 0 \Rightarrow \det(A - \mu I) = 0.$$
(If det(A − µI) ≠ 0, we would conclude that v = 0 by standard matrix inversion.) We compute eigenvalues, then, by solving the polynomial equation det(A − µI) = 0. (The polynomial D(µ) := det(A − µI) is typically referred to as the characteristic polynomial.) In our case, that is, with
$$A = \begin{pmatrix}0 & 1\\ \frac{g}{l} & 0\end{pmatrix},$$
we have
$$D(\mu) = \det\begin{pmatrix}-\mu & 1\\ \frac{g}{l} & -\mu\end{pmatrix} = \mu^2 - \frac{g}{l} = 0 \Rightarrow \mu = \pm\sqrt{\frac{g}{l}}.$$
We can determine the eigenvectors associated with the eigenvalues by solving
$$Av = \mu v \Rightarrow \begin{pmatrix}0 & 1\\ \frac{g}{l} & 0\end{pmatrix}\begin{pmatrix}v_1\\ v_2\end{pmatrix} = \pm\sqrt{\frac{g}{l}}\begin{pmatrix}v_1\\ v_2\end{pmatrix} \Rightarrow v_2 = \pm\sqrt{\frac{g}{l}}\,v_1.$$
Observe, in particular, that though we have two equations, we only get one relation for each eigenvalue. This
means that one component of v can be chosen (almost) arbitrarily, which corresponds with the observation
that if you multiply an eigenvector by a constant, you will get another (linearly dependent) eigenvector. In
this case, let’s choose v1 = 1 for each eigenvector (recall that we should have two), giving
p1 p1
V1 = , V2 = .
g
l − gl
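A quick numerical check of this computation (with l = 1, as in the pendulum M-files above):

g = 9.81; l = 1;
A = [0 1; g/l 0];
[V,D] = eig(A)    % diagonal of D is approximately -3.1321 and 3.1321;
                  % each column of V is a scalar multiple of [1; mu]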
Finally, we are prepared to diagonalize A. A general procedure for diagonalizing a matrix is outlined in the
following three steps.
1. For an n × n matrix A, find n linearly independent eigenvectors of A, V1 , V2 , ..., Vn .
2. Form a matrix P that consists of V1 as its first column, V2 as its second column, etc., with finally Vn
as its last column.
3. The matrix P −1 AP will then be diagonal with diagonal entries the eigenvalues associated with V1 , V2 , ..., Vn :
µ1 , µ2 , ..., µn .
Remark on Steps 1–3. First, it is not always the case that a matrix will have n linearly independent eigenvectors, and in situations for which this is not the case, more work is required (in particular, instead of diagonalizing the matrix, we put it in Jordan canonical form). Under the assumption that Step 1 is possible, the validity of Steps 2 and 3 is straightforward. If P is the matrix of eigenvectors, then
$$AP = (\mu_1V_1\ |\ \mu_2V_2\ |\ \cdots\ |\ \mu_nV_n);$$
that is, the matrix containing as its kth column the vector µ_kV_k. Multiplying on the left by P^{-1}, which must exist if the V_k are all linearly independent, we have
$$P^{-1}(\mu_1V_1\ |\ \mu_2V_2\ |\ \cdots\ |\ \mu_nV_n) = \begin{pmatrix}\mu_1 & 0 & \cdots & 0\\ 0 & \mu_2 & & 0\\ \vdots & & \ddots & \vdots\\ 0 & 0 & \cdots & \mu_n\end{pmatrix}.$$
In this last calculation, we are almost computing P^{-1}P, which would yield the identity matrix.
Returning to our example, we have
$$P = \begin{pmatrix}1 & 1\\ \sqrt{\frac{g}{l}} & -\sqrt{\frac{g}{l}}\end{pmatrix} \Rightarrow P^{-1} = \frac{1}{2}\begin{pmatrix}1 & \sqrt{\frac{l}{g}}\\ 1 & -\sqrt{\frac{l}{g}}\end{pmatrix}.$$
Observe now that we can write A in terms of its diagonalization, as A = PDP^{-1}, where D is the diagonal matrix of eigenvalues taken in the order of the eigenvectors, D = diag(√(g/l), −√(g/l)). We can compute, then,
\begin{align*}
e^{At} &= I + At + \frac{1}{2}A^2t^2 + \frac{1}{3!}A^3t^3 + \cdots\\
&= I + (PDP^{-1})t + \frac{1}{2}(PDP^{-1})^2t^2 + \frac{1}{3!}(PDP^{-1})^3t^3 + \cdots\\
&= I + (PDP^{-1})t + \frac{1}{2}(PDP^{-1}PDP^{-1})t^2 + \frac{1}{3!}(PDP^{-1}PDP^{-1}PDP^{-1})t^3 + \cdots\\
&= P\Big(I + Dt + \frac{1}{2}D^2t^2 + \frac{1}{3!}D^3t^3 + \cdots\Big)P^{-1}\\
&= Pe^{Dt}P^{-1}.
\end{align*}
Consequently, we can write the solution of our example system (3.6) as
\begin{align*}
y(t) = e^{At}\begin{pmatrix}c_1\\ c_2\end{pmatrix} &= Pe^{Dt}P^{-1}\begin{pmatrix}c_1\\ c_2\end{pmatrix}\\
&= \begin{pmatrix}1 & 1\\ \sqrt{\frac{g}{l}} & -\sqrt{\frac{g}{l}}\end{pmatrix}\begin{pmatrix}e^{\sqrt{g/l}\,t} & 0\\ 0 & e^{-\sqrt{g/l}\,t}\end{pmatrix}\frac{1}{2}\begin{pmatrix}1 & \sqrt{\frac{l}{g}}\\ 1 & -\sqrt{\frac{l}{g}}\end{pmatrix}\begin{pmatrix}c_1\\ c_2\end{pmatrix}\\
&= \begin{pmatrix}\frac{1}{2}\big(c_1 + \sqrt{\frac{l}{g}}c_2\big)e^{\sqrt{g/l}\,t} + \frac{1}{2}\big(c_1 - \sqrt{\frac{l}{g}}c_2\big)e^{-\sqrt{g/l}\,t}\\ \sqrt{\frac{g}{l}}\Big[\frac{1}{2}\big(c_1 + \sqrt{\frac{l}{g}}c_2\big)e^{\sqrt{g/l}\,t} - \frac{1}{2}\big(c_1 - \sqrt{\frac{l}{g}}c_2\big)e^{-\sqrt{g/l}\,t}\Big]\end{pmatrix}.
\end{align*}
Observing that in the absence of initial values the constants c_1 and c_2 remain arbitrary, we can recover the representation of y_1(t) above by combining C_1 = \frac{1}{2}\big(c_1 - \sqrt{\frac{l}{g}}c_2\big) and C_2 = \frac{1}{2}\big(c_1 + \sqrt{\frac{l}{g}}c_2\big).
The primary observation I would like to make here is that the solution’s rates of growth and decay are
exactly the eigenvalues of the matrix A. If each of these is negative (or, in the case of complex eigenvalues,
the real parts are negative) then we can conclude stability. If any of them are positive, we can conclude
instability. In the event that a growth rate is 0 or purely imaginary, more analysis remains to be done.
Theorem 3.1. (Linear ODE Stability) For the linear first order system of ODE
$$y' = Ay, \quad y \in \mathbb{R}^n,\ A \in \mathbb{R}^{n\times n},$$
the zero vector y ≡ 0 is stable or unstable as follows:
1. If all eigenvalues of A have nonpositive real parts, and all those with zero real parts are simple, then y = 0 is stable.
2. y = 0 is asymptotically stable if and only if all eigenvalues of A have negative real parts.
3. If one or more eigenvalues of A have a positive real part, then y = 0 is unstable.
Substituting into the SIR model, we find
\begin{align*}
\frac{dx}{dt} &= -a(S_e + x)y\\
\frac{dy}{dt} &= a(S_e + x)y - by\\
\frac{dz}{dt} &= by.
\end{align*}
Dropping the nonlinear terms we have the linearized equations
\begin{align*}
\frac{dx}{dt} &= -aS_ey\\
\frac{dy}{dt} &= (aS_e - b)y\\
\frac{dz}{dt} &= by.
\end{align*}
The system matrix A is given by
$$A = \begin{pmatrix}0 & -aS_e & 0\\ 0 & aS_e - b & 0\\ 0 & b & 0\end{pmatrix},$$
with eigenvalues determined by
$$\det\begin{pmatrix}-\mu & -aS_e & 0\\ 0 & (aS_e - b) - \mu & 0\\ 0 & b & -\mu\end{pmatrix} = \mu^2((aS_e - b) - \mu) = 0 \Rightarrow \mu = 0,\ 0,\ aS_e - b.$$
In the event that aSe − b > 0, we can conclude that this equilibrium point is unstable. This corresponds
with the situation that the number of infectives grows faster than it dies off (by recovery or death). In this
case, we would expect even a single infected person to cause an epidemic. In the case that aSe − b ≤ 0, we
require a more detailed study.
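A numerical check of this eigenvalue computation, for one made-up choice of parameters:

a = 0.001; b = 0.1; Se = 500;    % assumed parameter values
A = [0 -a*Se 0; 0 a*Se-b 0; 0 b 0];
max(real(eig(A)))    % returns a*Se - b = 0.4 > 0, so this equilibrium is unstable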
for which the matrix A is the scalar A = r − h. But we have already argued that r > h, so A > 0, and we can conclude instability. For the second equilibrium point, we introduce the perturbation variable x(t) through
$$p(t) = K\Big(1 - \frac{h}{r}\Big) + x(t),$$
for which we find
$$\frac{dx}{dt} = r\Big(x + K\Big(1 - \frac{h}{r}\Big)\Big)\Big(1 - \frac{x + K(1 - \frac{h}{r})}{K}\Big) - h\Big(x + K\Big(1 - \frac{h}{r}\Big)\Big).$$
Dropping high order terms, we have
$$\frac{dx}{dt} = -(r - h)x,$$
for which A = −(r − h) < 0, and we conclude stability. (In the case of single equations, stability is more readily observed directly, through consideration of the sign of dp/dt for p above and below the equilibrium point (see the discussion of the logistic equation in Section 2.3), but it's instructive to see how the general approach works.) We conclude that so long as h < r, p_e = K(1 − h/r) is a stable equilibrium point.
Finally, we choose our harvest rate h to maximize the yield, defined by
$$Y(h) = p_eh = hK\Big(1 - \frac{h}{r}\Big).$$
Maximizing in the usual way through differentiation, we have
$$Y'(h) = K\Big(1 - \frac{h}{r}\Big) - \frac{hK}{r} = 0 \Rightarrow h = \frac{r}{2}.$$
For this rate, our harvest is \frac{r}{2}K\big(1 - \frac{r/2}{r}\big) = \frac{rK}{4}, and the fish population approaches its equilibrium point K\big(1 - \frac{r/2}{r}\big) = \frac{K}{2}.
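A quick sketch confirming the maximum numerically, for made-up r and K:

r = 2; K = 100;                  % assumed
h = linspace(0,r,201);
Y = h.*K.*(1 - h/r);
[Ymax,i] = max(Y);
[h(i) Ymax]                      % approximately [r/2, r*K/4] = [1, 50]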
>>[t,y]=ode23(inline('y^(2/3)','t','y'),[0,.5],0)
t=
0
0.0500
0.1000
0.1500
0.2000
0.2500
0.3000
0.3500
0.4000
0.4500
0.5000
y=
0
0
0
0
0
0
0
0
0
0
0
According to MATLAB, the solution is y(t) = 0 for all t, and indeed it is straightforward to check that this
is a valid solution to the equation. We find, in fact, that for any c > 0, the function y(t) given as
$$y(t) = \begin{cases}\frac{(t-c)^3}{27}, & t \geq c\\ 0, & t \leq c\end{cases}$$
satisfies this equation. In practice, this is the fundamental issue with uniqueness: If our model does not have
a unique solution, we don’t know whether or not the solution MATLAB (or alternative software) gives us is
the one that corresponds with the phenomenon we're modeling.
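It is straightforward to verify in MATLAB that both solutions satisfy the equation; here is a check of the nonzero family for the arbitrary choice c = 0.2:

c = 0.2;                                 % any c > 0 works
y  = @(t) ((t-c).^3/27).*(t>=c);         % the nonzero solution family
dy = @(t) ((t-c).^2/9).*(t>=c);          % its derivative
t = linspace(0,.5,6);
max(abs(dy(t) - y(t).^(2/3)))            % essentially zero: y satisfies y' = y^(2/3)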
Two critical questions are apparent: 1. When can we insure that this problem won’t arise (that solutions
are unique)? and 2. In the case of nonuniqueness, can we develop a theory that selects the correct solution?
The second of these questions can only be answered in the context of the phenomenon we’re modeling. For
example, in Example 3.5, we selected t > 0 because we were trying to predict a future time, and only one
solution satisfied t > 0. As we observed, however, the other solution answered a different question that
might have been posed: how long ago would the object have had to leave the ground to get to height h?
Fortunately, for the first of our two questions—at least in the case of ODE—we have a definitive general
theorem.
Theorem 3.2. (ODE Uniqueness) Let f(t,y) = (f_1(t,y), f_2(t,y), ..., f_n(t,y))^{tr} be a vector function whose components are each continuous in both t and y in some neighborhood a ≤ t ≤ b, a_1 ≤ y_1 ≤ b_1, a_2 ≤ y_2 ≤ b_2, ..., a_n ≤ y_n ≤ b_n, and whose partial derivatives ∂_{y_l}f_k(t,y) are continuous in both t and y in the same neighborhood for each l, k = 1, ..., n. Then given any initial point (t_0, y_0) ∈ R × R^n such that a < t_0 < b and a_k < y_{0k} < b_k for all k = 1, ..., n, any solution to
$$\frac{dy}{dt} = f(t,y); \quad y(t_0) = y_0$$
is unique on the neighborhood of continuity.
Example 3.6 continued. Notice that our equation from Example 3.6 better not satisfy the conditions of Theorem 3.2. In this case, f(t,y) = y^{2/3}, which is continuous in both t (trivially) and y. Computing ∂_yf(t,y) = \frac{2}{3}y^{-1/3}, we see that the y-derivative of f is not continuous at the initial value y = 0.
Example 3.7. Consider again the Lotka–Volterra predator–prey model, which we can re-write in the notation of Theorem 3.2 as (y_1 = x, y_2 = y)
\begin{align*}
\frac{dy_1}{dt} &= ay_1 - by_1y_2; \quad y_1(t_0) = y_{01}\\
\frac{dy_2}{dt} &= -ry_2 + cy_1y_2; \quad y_2(t_0) = y_{02}.
\end{align*}
In this case, the vector f(t,y) is
$$\begin{pmatrix}f_1(t,y_1,y_2)\\ f_2(t,y_1,y_2)\end{pmatrix} = \begin{pmatrix}ay_1 - by_1y_2\\ -ry_2 + cy_1y_2\end{pmatrix}.$$
As polynomials, f1 , f2 , ∂y1 f1 , ∂y2 f1 , ∂y1 f2 , and ∂y2 f2 must all be continuous for all t, y1 , and y2 , so any
solution we find to these equations must be unique.
Idea of the uniqueness proof. Before proceeding with a general proof of Theorem 3.2, we will work through the idea of the proof in the case of a concrete example. Consider the ODE
$$\frac{dy}{dt} = y^2; \quad y(0) = 1, \tag{3.9}$$
and suppose we want to establish uniqueness on the intervals a ≤ t ≤ b and a_1 ≤ y ≤ b_1, with 0 ∈ (a,b) and 1 ∈ (a_1, b_1). We begin by supposing that y_1(t) and y_2(t) are both solutions to (3.9) and defining the squared difference between them as a variable,
$$E(t) := (y_1(t) - y_2(t))^2.$$
Our goal becomes to show that E(t) ≡ 0; that is, that y_1(t) and y_2(t) must necessarily be the same function.
Computing directly, we have
\begin{align*}
\frac{dE}{dt} &= 2(y_1(t) - y_2(t))\Big(\frac{dy_1}{dt} - \frac{dy_2}{dt}\Big)\\
&= 2(y_1(t) - y_2(t))(y_1(t)^2 - y_2(t)^2)\\
&= 2(y_1(t) - y_2(t))(y_1(t) - y_2(t))(y_1(t) + y_2(t))\\
&= 2(y_1(t) - y_2(t))^2(y_1(t) + y_2(t))\\
&= 2E(t)(y_1(t) + y_2(t)).
\end{align*}
Since y_1 and y_2 are both assumed less than b_1, we conclude the differential inequality
$$\frac{dE}{dt} \leq 2E(t)(2b_1) = 4b_1E(t),$$
which upon multiplication by the (positive) integrating factor e^{-4b_1t} can be written as
$$\frac{d}{dt}\big[e^{-4b_1t}E(t)\big] \leq 0.$$
Integrating, we have
$$\int_0^t \frac{d}{ds}\big[e^{-4b_1s}E(s)\big]\,ds = e^{-4b_1s}E(s)\Big|_0^t = e^{-4b_1t}E(t) - E(0) \leq 0.$$
Recalling that y1 (0) = y2 (0) = 1, we observe that E(0) = 0 and consequently E(t) ≤ 0. But E(t) ≥ 0 by
definition, so that we can conclude that E(t) = 0.
Proof of Theorem 3.2. In order to restrict the tools of this proof to a theorem that should be familiar to
most students, we will carry it out only in the case of a single equation. The extension to systems is almost
identical, only requiring a more general form of the Mean Value Theorem.
We begin as before by letting y1 (t) and y2 (t) represent two solutions of the ODE
$$\frac{dy}{dt} = f(t,y); \quad y(t_0) = y_0.$$
Again, we define the squared difference between y_1(t) and y_2(t) as E(t) := (y_1(t) − y_2(t))^2. Computing directly, we have now
\begin{align*}
\frac{dE}{dt} &= 2(y_1(t) - y_2(t))\Big(\frac{dy_1}{dt} - \frac{dy_2}{dt}\Big)\\
&= 2(y_1(t) - y_2(t))(f(t,y_1) - f(t,y_2)).
\end{align*}
At this point, we need to employ the Mean Value Theorem (see Appendix A), which asserts in this context that for each t there exists some number c ∈ [y_1, y_2] so that
$$\partial_yf(t,c) = \frac{f(t,y_1) - f(t,y_2)}{y_1 - y_2}, \quad \text{or} \quad f(t,y_1) - f(t,y_2) = \partial_yf(t,c)(y_1 - y_2).$$
Since ∂_yf is assumed continuous on the closed interval t ∈ [a,b], y ∈ [a_1,b_1], the Extreme Value Theorem (see Appendix A) guarantees the existence of some constant L so that |∂_yf(t,y)| ≤ L for all t ∈ [a,b], y ∈ [a_1,b_1]. We have, then, the so-called Lipschitz inequality,
$$|f(t,y_1) - f(t,y_2)| \leq L|y_1 - y_2|. \tag{3.10}$$
We conclude that
$$\frac{dE}{dt} \leq 2|y_1(t) - y_2(t)|\,L\,|y_1(t) - y_2(t)| = 2LE(t),$$
from which we conclude exactly as above that E(t) ≡ 0.
Remark on the Lipschitz Inequality. Often the ODE uniqueness theorem is stated under the assumption
of the Lipschitz inequality (3.10). I have chosen to state it here in terms of the continuity of derivatives of
f because that is typically an easier condition to check. Since continuity of derivatives implies the Lipschitz
inequality, the Lipschitz formulation is more general.
$$x^7 + 6x^4 + 3x + 9 = 0.$$
While actually finding a real solution to this equation is quite difficult, it’s fairly easy to recognize that such
a solution must exist. As x goes to +∞, the left hand side becomes positive, while as x goes to −∞ the left
hand side becomes negative. Somewhere in between these two extremes, the left hand side must equal 0. In
this way we have deduced that a solution exists without saying much of anything about the nature of the
solution. (Mathematicians in general are notorious for doing just this sort of thing.)
If we really wanted to ruin MATLAB’s day, we could assign it the ODE
$$\frac{dy}{dt} = t^{-1}; \quad y(0) = 1.$$
Solving by direct integration, we see that y(t) = log t + C, so that no value of C can match our initial
data. (The current version of MATLAB simply crashes.) As with the case of uniqueness, we would like to
insure the existence of some solution before trying to solve the equation. Fortunately, we have the following
theorem, due to Picard.
Theorem 3.3. (ODE Existence)7 Let f(t,y) = (f_1(t,y), f_2(t,y), ..., f_n(t,y))^{tr} be a vector function whose components are each continuous in both t and y in some neighborhood a ≤ t ≤ b, a_1 ≤ y_1 ≤ b_1, a_2 ≤ y_2 ≤ b_2, ..., a_n ≤ y_n ≤ b_n, and whose partial derivatives ∂_{y_l}f_k(t,y) are continuous in both t and y in the same neighborhood for each l, k = 1, ..., n. Then given any initial point (t_0, y_0) ∈ R × R^n such that a < t_0 < b and a_k < y_{0k} < b_k for all k = 1, ..., n, there exists a solution to the ODE
$$\frac{dy}{dt} = f(t,y); \quad y(t_0) = y_0 \tag{3.11}$$
for some domain |t − t_0| < τ, where τ > 0 may be extremely small. Moreover, the solution y is a continuous function of the independent variable t and of the parameters t_0 and y_0.
Example 3.9. Consider the ODE
$$\frac{dy}{dt} = y^2; \quad y(0) = 1.$$
Since f(t,y) = y^2 is clearly continuous with continuous derivatives, Theorem 3.3 guarantees that a solution to this ODE exists. Notice particularly, however, that the interval of existence is not specified. To see exactly what this means, we solve the equation by separation of variables, to find
$$y(t) = \frac{1}{1-t},$$
from which we observe that though f(t,y) and its derivatives are continuous for all t and y, existence is lost at t = 1. Referring to the statement of our theorem, we see that this is equivalent to saying that τ = 1. Unfortunately, our general theorem does not specify τ for us a priori.
Idea of the proof of Theorem 3.3, single equations. Consider the ODE
$$\frac{dy}{dt} = y; \quad y(0) = 1.$$
Our goal here is to establish that a solution exists without ever actually finding the solution. (Though if we accidentally stumble across a solution on our way, that's fine too.) We begin by simply integrating both sides, to obtain the integral equation
$$y(t) = 1 + \int_0^t y(s)\,ds.$$
(Unlike in the method of separation of variables, we have integrated both sides with respect to the same variable, t.) Next, we try to find a solution by an iteration. (Technically, Picard Iteration.) The idea here is that we guess at a solution, say y_{guess}(t), and then use our integral equation to (hopefully) improve our guess through the calculation
$$y_{new\ guess}(t) = 1 + \int_0^t y_{old\ guess}(s)\,ds.$$
Typically, we call our first guess y_0(t) and use the initial value: here, y_0(t) = 1. Our second guess, y_1(t), becomes
$$y_1(t) = 1 + \int_0^t y_0(s)\,ds = 1 + \int_0^t 1\,ds = 1 + t.$$
Proceeding similarly, we find that
$$y_n(t) = \sum_{k=0}^n \frac{t^k}{k!} \Rightarrow \lim_{n\to\infty}y_n(t) = \sum_{k=0}^\infty \frac{t^k}{k!},$$
and our candidate for a solution becomes y(t) = \sum_{k=0}^\infty t^k/k!, an infinite series amenable to such tests as the integral test, the comparison test, the limit comparison test, the alternating series test, and the ratio test.
The last step is to use one of these tests to show that our candidate converges. We will use the ratio test, reviewed in Appendix A. Computing directly, we find
$$\lim_{k\to\infty}\frac{a_{k+1}}{a_k} = \lim_{k\to\infty}\frac{t^{k+1}}{(k+1)!}\cdot\frac{k!}{t^k} = \lim_{k\to\infty}\frac{t}{k+1} = 0, \quad \text{for all } t.$$
We conclude that y(t) is indeed a solution. Observe that though we have developed a series representation for our solution, we have not found a closed form solution. (What is the closed form solution?)
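The iteration itself can be automated with MATLAB's Symbolic Math Toolbox; a minimal sketch (the loop bound 5 is arbitrary):

syms t s
y = sym(1);                               % y0(t) = 1
for n = 1:5
    y = 1 + int(subs(y,t,s),s,0,t);       % y_{n+1}(t) = 1 + int_0^t y_n(s) ds
end
y    % returns 1 + t + t^2/2 + ... + t^5/120, the degree-5 Taylor polynomial of e^t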
Idea of the proof of Theorem 3.3, higher order equations and systems. We consider the ODE
$$y''(t) + y(t) = 0; \quad y(0) = 0,\ y'(0) = 1. \tag{3.12}$$
In order to proceed as above and write (3.12) as an integral equation, we first write it in the notation of Theorem 3.3 by making the substitutions y_1(t) = y(t) and y_2(t) = y'(t):
\begin{align*}
\frac{dy_1}{dt} &= y_2; \quad y_1(0) = 0\\
\frac{dy_2}{dt} &= -y_1; \quad y_2(0) = 1.
\end{align*}
(Notice that the assumptions of Theorem 3.3 clearly hold for this equation.) Integrating, we obtain the integral equations
\begin{align*}
y_1(t) &= \int_0^t y_2(s)\,ds\\
y_2(t) &= 1 - \int_0^t y_1(s)\,ds.
\end{align*}
Again, we can apply the ratio test to determine that this series converges (to what?).
Proof of Theorem 3.3. As with Theorem 3.2, we will only prove Theorem 3.3 in the case of single
equations. The proof in the case of systems actually looks almost identical, where each statement is replaced
by a vector generalization. I should mention at the outset that this is by far the most technically difficult
proof of the semester. Not only is the argument itself fairly subtle, it involves a number of theorems from
advanced calculus (e.g. M409).
Integrating equation (3.11), we obtain the integral equation
$$y(t) = y_0 + \int_{t_0}^t f(s,y(s))\,ds.$$
Iterating exactly as in the examples above, we begin with y_0 and compute y_1, y_2, ... according to
$$y_{n+1}(t) = y_0 + \int_{t_0}^t f(s,y_n(s))\,ds; \quad n = 0, 1, 2, ....$$
As both a useful calculation and a warmup for the argument to come, we will begin by estimating \|y_1 - y_0\|, where \|\cdot\| is defined similarly as in Theorem A.6 by
$$\|y(t)\| := \sup_{|t-t_0|<\tau}|y(t)|.$$
We compute
$$|y_1(t) - y_0| = \Big|\int_{t_0}^t f(s,y_0)\,ds\Big|.$$
Our theorem assumes that f is continuous on s ∈ [t_0, t] and hence bounded, so there exists some constant M so that |f(s,y_0)| ≤ M. We have, then,
$$|y_1(t) - y_0| \leq \tau M.$$
Observing that the right-hand side is independent of t, we can take the supremum over t on both sides to obtain
$$\|y_1 - y_0\| \leq \tau M.$$
Finally, since we are at liberty to take τ as small as we like, we will choose it so that 0 < τ ≤ ε/M, for some ε > 0 to be chosen. In this way, we can insure that our new value y_1 remains in our domain of continuity of f.
We now want to look at the difference between two successive iterations and make sure the difference is getting smaller—that our iteration is actually making progress. For |t − t_0| < τ, we have
\begin{align*}
|y_{n+1}(t) - y_n(t)| &= \Big|\int_{t_0}^t \big(f(s,y_n(s)) - f(s,y_{n-1}(s))\big)\,ds\Big|\\
&\leq \Big|\int_{t_0}^t L|y_n(s) - y_{n-1}(s)|\,ds\Big| \leq L\tau\sup_{|t-t_0|\leq\tau}|y_n(t) - y_{n-1}(t)|.
\end{align*}
Taking the supremum over both sides (and observing that t has become a dummy variable on the right-hand
side), we conclude
$$\|y_{n+1} - y_n\| \le L\tau \|y_n - y_{n-1}\|.$$
Since τ can be taken arbitrarily small, we now choose it with 0 < τ ≤ 1/(2L), so that Lτ ≤ 1/2. In
this way, we have
$$\|y_{n+1} - y_n\| \le \frac{1}{2} \|y_n - y_{n-1}\|.$$
We see that, indeed, on such a small interval of time our iterations are getting better. In fact, by carrying
this argument back to our initial data, we find
$$\|y_{n+1} - y_n\| \le \frac{1}{2}\|y_n - y_{n-1}\| \le \frac{1}{2} \cdot \frac{1}{2}\|y_{n-1} - y_{n-2}\| \le \cdots \le \frac{1}{2^n}\|y_1 - y_0\| \le \frac{\epsilon}{2^n}.$$
In this way, we see that for n > m
$$\|y_n - y_m\| = \left\| \sum_{k=m}^{n-1} (y_{k+1} - y_k) \right\| \le \sum_{k=m}^{n-1} \|y_{k+1} - y_k\| \le \sum_{k=m}^{\infty} \frac{\epsilon}{2^k} = \frac{\epsilon}{2^{m-1}}.$$
We conclude that
$$\lim_{n > m \to \infty} \|y_n - y_m\| = \lim_{m \to \infty} \frac{\epsilon}{2^{m-1}} = 0,$$
and thus by Cauchy's Convergence Condition (Theorem A.6) y_n(t) converges to some function y(t), which
is our solution.
Notice in particular that MATLAB uses capital D to indicate the derivative and requires that the entire
equation appear in single quotes. MATLAB takes t to be the independent variable by default, so here x must
be explicitly specified as the independent variable. Alternatively, if you are going to use the same equation
a number of times, you might choose to define it as a variable, say, eqn1.
>>eqn1 = ’Dy = y*x’
eqn1 =
Dy = y*x
>>y = dsolve(eqn1,’x’)
y = C1*exp(1/2*x^2)
To solve an initial value problem, say, equation (4.1) with y(1) = 1, use
>>y = dsolve(eqn1,’y(1)=1’,’x’)
y=
1/exp(1/2)*exp(1/2*x^2)
or
>>inits = ’y(1)=1’;
>>y = dsolve(eqn1,inits,’x’)
y=
1/exp(1/2)*exp(1/2*x^2)
[Footnote 8: Actually, whenever you do symbolic manipulations in MATLAB, what you're really doing is calling Maple.]
Now that we’ve solved the ODE, suppose we want to plot the solution to get a rough idea of its behavior.
We run immediately into two minor difficulties: (1) our expression for y(x) isn't suited for array operations
(.*, ./, .^), and (2) y, as MATLAB returns it, is actually a symbol (a symbolic object). The first of these
obstacles is straightforward to fix, using vectorize(). For the second, we employ the useful command eval(),
which evaluates or executes text strings that constitute valid MATLAB commands. Hence, we can use
>>x = linspace(0,1,20);
>>z = eval(vectorize(y));
>>plot(x,z)
You may notice a subtle point here, that eval() evaluates strings (character arrays), and y, as we have
defined it, is a symbolic object. However, vectorize converts symbolic objects into strings.
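As a usage aside (not part of the workflow above): a symbolic expression can also be graphed directly with ezplot, e.g. >>ezplot(y,[0,1]). The vectorize()/eval() route, however, leaves you with ordinary numerical arrays, which are more convenient for further manipulation.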
4.1.3 Systems
Suppose we want to solve and plot solutions to the system of three ordinary differential equations
$$\frac{dx}{dt} = x + 2y - z, \qquad \frac{dy}{dt} = x + z, \qquad \frac{dz}{dt} = 4x - 4y + 5z.$$
First, to find a general solution, we proceed as in Section 4.1.1, except with each equation now braced in its
own pair of (single) quotation marks:
>>[x,y,z]=dsolve(’Dx=x+2*y-z’,’Dy=x+z’,’Dz=4*x-4*y+5*z’)
x=
2*C1*exp(2*t)-2*C1*exp(t)-C2*exp(3*t)+2*C2*exp(2*t)-1/2*C3*exp(3*t)+1/2*C3*exp(t)
y=
2*C1*exp(t)-C1*exp(2*t)+C2*exp(3*t)-C2*exp(2*t)+1/2*C3*exp(3*t)-1/2*C3*exp(t)
z=
-4*C1*exp(2*t)+4*C1*exp(t)+4*C2*exp(3*t)-4*C2*exp(2*t)-C3*exp(t)+2*C3*exp(3*t)
(If you use MATLAB to check your work, keep in mind that its choice of constants C1, C2, and C3 probably
won’t correspond with your own. For example, you might have C = −2C1 + 1/2C3, so that the coefficients
of exp(t) in the expression for x are combined. Fortunately, there is no such ambiguity when initial values
are assigned.) Notice that since no independent variable was specified, MATLAB used its default, t. For an
example in which the independent variable is specified, see Section 4.1.1. To solve an initial value problem,
we simply define a set of initial values and add them at the end of our dsolve() command. Suppose we have
x(0) = 1, y(0) = 2, and z(0) = 3. We have, then,
>>inits=’x(0)=1,y(0)=2,z(0)=3’;
>>[x,y,z]=dsolve(’Dx=x+2*y-z’,’Dy=x+z’,’Dz=4*x-4*y+5*z’,inits)
x=
6*exp(2*t)-5/2*exp(t)-5/2*exp(3*t)
y=
5/2*exp(t)-3*exp(2*t)+5/2*exp(3*t)
z=
-12*exp(2*t)+5*exp(t)+10*exp(3*t)
>>t=linspace(0,.5,25);
>>xx=eval(vectorize(x));
>>yy=eval(vectorize(y));
>>zz=eval(vectorize(z));
>>plot(t, xx, t, yy, t, zz)
[Figure: the solution curves x(t), y(t), and z(t) plotted over 0 ≤ t ≤ 0.5.]
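The script below calls a short M-file firstode.m. A minimal version consistent with the output tabulated below (the values match y(x) = 1/(1 − x), the solution of the assumed example equation y′ = xy² + y with y(0) = 1) is:

function yprime = firstode(x,y)
%FIRSTODE: Computes the derivative y' at the point (x,y)
%for the (assumed) example equation y' = x*y^2 + y.
yprime = x*y^2 + y;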
Notice that all firstode.m does is take values x and y and return the value of the derivative y′(x) at the point
(x, y). A script for solving the ODE and plotting its solutions now takes the following form:
[Footnote 9: Actually, for an equation this simple, we don't have to work as hard as we're going to work here, but I'm giving you an approach that works in general.]
>>xspan = [0,.5];
>>y0 = 1;
>>[x,y]=ode23(@firstode,xspan,y0);
>>x
x=
0
0.0500
0.1000
0.1500
0.2000
0.2500
0.3000
0.3500
0.4000
0.4500
0.5000
>>y
y=
1.0000
1.0526
1.1111
1.1765
1.2500
1.3333
1.4286
1.5384
1.6666
1.8181
1.9999
>>plot(x,y)
Notice that xspan is the domain of x over which we're asking MATLAB to solve the equation, and y0 = 1
means we're taking the initial value y(0) = 1. MATLAB solves the equation at discrete points and places the
domain and range in vectors x and y. These are then easily manipulated, for example to plot the solution
with plot(x,y). Finally, observe that it is not the differential equation itself that is passed to ode23, but
rather a function (here firstode) returning the derivatives, which MATLAB assumes describe a first order system.
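Higher order equations are handled by writing them as first order systems. The next example calls an M-file secondode.m; a minimal version consistent with the output tabulated below (the values match the equation y″ + 8y′ + 2y = cos x with y(0) = 1, y′(0) = 0, which we take as the assumed example equation) is:

function yprime = secondode(x,y)
%SECONDODE: Computes the derivatives of y1 = y and y2 = y',
%written as a column vector, for the (assumed) equation
%y'' + 8y' + 2y = cos(x).
yprime = [y(2); cos(x) - 8*y(2) - 2*y(1)];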
Observe that y1 is stored as y(1) and y2 as y(2), the two components of the input vector y. Additionally,
yprime is returned as a column vector, as is evident from the semicolon following the first appearance of y(2).
The MATLAB input and output for solving this ODE is given below.
>>xspan = [0,.5];
>>y0 = [1;0];
>>[x,y]=ode23(@secondode,xspan,y0);
>>[x,y]
ans =
0 1.0000 0
0.0001 1.0000 -0.0001
0.0005 1.0000 -0.0005
0.0025 1.0000 -0.0025
0.0124 0.9999 -0.0118
0.0296 0.9996 -0.0263
0.0531 0.9988 -0.0433
0.0827 0.9972 -0.0605
0.1185 0.9948 -0.0765
0.1613 0.9912 -0.0904
0.2113 0.9864 -0.1016
0.2613 0.9811 -0.1092
0.3113 0.9755 -0.1143
0.3613 0.9697 -0.1179
0.4113 0.9637 -0.1205
0.4613 0.9576 -0.1227
0.5000 0.9529 -0.1241
In the final expression above, the first column tabulates x values, while the second and third columns tabulate
y1 and y2 (y(1) and y(2)), or y and y′. MATLAB regards y as a matrix whose elements can be referred
to as y(m, n), where m refers to the row and n refers to the column. Here, y has two columns (y1 and y2 )
and 17 rows, one for each value of x. To get, for instance, the 4th entry of the vector y(1), type y(4,1)—4th
row, 1st column. To refer to the entirety of y1 , use y(:, 1), which MATLAB reads as every row, first column.
Thus to plot y1 versus y2 we use plot(y(:,1),y(:,2)).
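The commands below assume an M-file lorenz.m encoding the Lorenz system, together with initial data x0. A minimal sketch (the classical parameter values σ = 10, ρ = 28, β = 8/3 and the initial vector are assumptions) is:

function xprime = lorenz(t,x)
%LORENZ: Computes the derivatives of the Lorenz system,
%with the classical parameter values (an assumption):
%sigma = 10, rho = 28, beta = 8/3.
sigma = 10; rho = 28; beta = 8/3;
xprime = [ -sigma*x(1) + sigma*x(2);
            rho*x(1) - x(2) - x(1)*x(3);
           -beta*x(3) + x(1)*x(2) ];

with initial data assigned, for instance, by >>x0=[-8 8 27];.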
If in the Command Window, we type
>>tspan=[0,50];
>>[t,x]=ode45(@lorenz,tspan,x0);
>>plot(x(:,1),x(:,3))
the famous “Lorenz strange attractor” is sketched. (See Figure 4.2.)
[Figure 4.2: The Lorenz strange attractor, with x(t) on the horizontal axis and y(t) on the vertical axis.]
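We next turn to boundary value problems, taking as our example the equation y″ = 3y′ − 2y with boundary conditions y(0) = 0 and y(1) = 10 (the problem encoded in the M-files below). First, we write the equation as a first order system and store it in the M-file bvpexample.m.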
function yprime = bvpexample(t,y)
%BVPEXAMPLE: Differential equation for boundary value
%problem example.
yprime=[y(2); -2*y(1)+3*y(2)];
Next, we write the boundary conditions as the M-file bc.m, which records the boundary residues.
function res=bc(y0,y1)
%BC: Evaluates the residue of the boundary condition
res=[y0(1);y1(1)-10];
By residue, we mean the left-hand side of the boundary condition once it has been set to 0. In this case,
the second boundary condition is y(1) = 10, so its residue is y(1) − 10, which is recorded in the second
component of the vector that bc.m returns. The variables y0 and y1 represent the solution at x = 0 and at
x = 1 respectively, while the 1 in parentheses indicates the first component of the vector. In the event that
the second boundary condition were y′(1) = 10, we would replace y1(1)−10 with y1(2)−10.
We are now in a position to begin solving the boundary value problem. In the following code, we first
specify a grid of x values for MATLAB to solve on and an initial guess for the vector that would be given for
an initial value problem [y(0), y′(0)]. (Of course, y(0) is known, but y′(0) must be a guess. Loosely speaking,
MATLAB will solve a family of initial value problems, searching for one for which the boundary conditions
are met.) We solve the boundary value problem with MATLAB’s built-in solver bvp4c.
>>sol=bvpinit(linspace(0,1,25),[0 1]);
>>sol=bvp4c(@bvpexample,@bc,sol);
>>sol.x
ans =
Columns 1 through 9
0 0.0417 0.0833 0.1250 0.1667 0.2083 0.2500 0.2917 0.3333
Columns 10 through 18
0.3750 0.4167 0.4583 0.5000 0.5417 0.5833 0.6250 0.6667 0.7083
Columns 19 through 25
0.7500 0.7917 0.8333 0.8750 0.9167 0.9583 1.0000
>>sol.y
ans =
Columns 1 through 9
0 0.0950 0.2022 0.3230 0.4587 0.6108 0.7808 0.9706 1.1821
2.1410 2.4220 2.7315 3.0721 3.4467 3.8584 4.3106 4.8072 5.3521
Columns 10 through 18
1.4173 1.6787 1.9686 2.2899 2.6455 3.0386 3.4728 3.9521 4.4805
5.9497 6.6050 7.3230 8.1096 8.9710 9.9138 10.9455 12.0742 13.3084
Columns 19 through 25
5.0627 5.7037 6.4090 7.1845 8.0367 8.9726 9.9999
14.6578 16.1327 17.7443 19.5049 21.4277 23.5274 25.8196
We observe that in this case MATLAB returns the solution as a structure whose first component sol.x
simply contains the x values we specified. The second component of the structure sol is sol.y, which is a
matrix containing as its first row values of y(x) at the x grid points we specified, and as its second row the
corresponding values of y′(x).
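To plot the computed solution against x, we can then use, for example, >>plot(sol.x,sol.y(1,:)).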
Since we do not know the appropriate time interval (in fact, that’s what we’re trying to determine), we
would like to specify that MATLAB solve the equation until the pendulum swings through some specified
fraction of its complete cycle and to give the time this took. In our case, we will record the time it takes
the pendulum to reach the bottom of its arc, and multiply this by 4 to arrive at the pendulum’s period. (In
this way, the event is independent of the pendulum’s initial conditions.) Our pendulum equation
$$\frac{d^2\theta}{dt^2} = -\frac{g}{l} \sin\theta$$
is stored in pendode.m with l = 1 (see Example 3.1). In addition to this file, we write an events file
pendevent.m that specifies the event we are looking for.
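Minimal sketches of these two files, consistent with the description that follows (l = 1, and g = 9.81 is an assumed value), are:

function xprime = pendode(t,x)
%PENDODE: The pendulum equation, written as a first order
%system; l = 1, and g = 9.81 is an assumed value.
g = 9.81; l = 1;
xprime = [x(2); -(g/l)*sin(x(1))];

function [lookfor, stop, direction] = pendevent(t,x)
%PENDEVENT: Specifies the event we are looking for.
lookfor = x(1);   % searches for the event x(1) = 0
stop = 1;         % stop solving when the event is located
direction = -1;   % accept only events with x(1) decreasing (x(2) < 0)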
In pendevent.m, the line lookfor=x(1) specifies that MATLAB should look for the event x(1) = 0 (that is,
x(t) = 0). (If we wanted to look for the event x(t) = 1, we would use lookfor=x(1)-1.) The line stop=1
instructs MATLAB to stop solving when the event is located, and the command direction=-1 instructs
MATLAB to only accept events for which x(2) (that is, x′) is negative (if the pendulum starts to the right
of center, it will be moving in the negative direction the first time it reaches the center point).
We can now solve the ODE up until the time our pendulum reaches the center point with the following
commands issued in the Command Window:
>>options=odeset(’Events’,@pendevent);
>>x0=[pi/4 0];
>>[t, x, te, xe, ie]=ode45(@pendode, [0, 10], x0, options);
>>te
te =
0.5215
>>xe
xe =
-0.0000 -2.3981
Here, x0 is a vector of initial data, for which we have chosen that the pendulum begin with angle π/4 and
with no initial velocity. The command ode45() returns a vector of times t, a matrix of dependent variables x,
the time at which the event occurred, te, the values of x when the event occurred, xe, and finally the index
when the event occurred, ie. In this case, we see that the event occurred at time t = .5215, and consequently
the period is P = 2.086 (within numerical error). Though the exact period of the pendulum is difficult to
compute analytically, it is not difficult to show through the small angle approximation sin θ ≈ θ that for θ
small the period of the pendulum is approximately $P = 2\pi\sqrt{l/g}$, which in our case gives P = 2.001. (While
the small angle approximation gives a period independent of θ, the period of a pendulum does depend on θ.)
5 Numerical Methods
Though we can solve ODE in MATLAB without any knowledge of the numerical methods it employs, it's
often useful to understand the basic underlying principles. In order to gain a basic understanding of how
ODE solvers work in general, we recall the development of Euler’s method.
5.1 Euler’s Method
Consider the first order ODE
$$y' = f(t, y); \qquad y(0) = y_0.$$
Recalling the definition of the derivative,
$$y'(t) = \lim_{h\to 0} \frac{y(t+h) - y(t)}{h},$$
we suspect that for h sufficiently small,
$$y'(t) \cong \frac{y(t+h) - y(t)}{h}.$$
If this is the case, then we have the approximate equation
$$\frac{y(t+h) - y(t)}{h} = f(t, y(t)),$$
which we can rearrange as
$$y(t+h) = y(t) + h f(t, y(t)).$$
We now proceed by taking h small and computing y(t) iteratively. For example, taking h = .1 and beginning
with the known initial point y(0) = y_0, we have the sequence of iterations
$$y(.1) = y(0) + .1 f(0, y(0)),$$
$$y(.2) = y(.1) + .1 f(.1, y(.1)),$$
$$y(.3) = y(.2) + .1 f(.2, y(.2)),$$
and so on, each new value computed from the previous one.
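This iteration is easy to code directly. The following is a minimal sketch (the file name euler.m and its interface are hypothetical, chosen to mirror the ode23 calls above):

function [t,y] = euler(f,tspan,y0,h)
%EULER: Fixed-step Euler method for y' = f(t,y), y(tspan(1)) = y0.
%(Hypothetical helper file, not part of MATLAB's solver suite.)
t = (tspan(1):h:tspan(2))';          % grid of time values
y = zeros(length(t),1);
y(1) = y0;                           % known initial point
for n = 1:length(t)-1
    y(n+1) = y(n) + h*f(t(n),y(n));  % y(t+h) = y(t) + h f(t,y(t))
end

For instance, >>[t,y]=euler(@firstode,[0 .5],1,.1); approximates the solution of the earlier first order example.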
We refer to processes in which y(t_{n+1}) depends only on t_n and y(t_n) (with Δt assumed fixed) as single-step
solvers. In the event that our solver depends on additional previous times, we refer to the solver as a multistep
solver.
• Multistep solvers
– ode113: worth trying if you are using stringent error tolerances or solving a computationally intensive ODE file.
6.1 Stiff ODE
By a stiff ODE we mean, roughly, an ODE for which numerical errors compound dramatically over time
unless very small steps are taken. For example, consider the ODE
$$y' = -100y + 100t + 1; \qquad y(0) = 1.$$
Since the dependent variable y in the equation is multiplied by 100, small errors in our approximation tend
to become greatly magnified. In general, we must take considerably smaller steps in time to solve stiff ODE,
and this can lengthen the time to solution dramatically. Often, solutions can be computed more efficiently
using one of the solvers designed for stiff problems, such as ode15s.
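As an illustrative sketch (assuming anonymous-function syntax; the exact solution of this equation, found by solving the linear equation directly, is y(t) = t + e^{-100t}), we can compare the step counts of a nonstiff and a stiff solver:

f = @(t,y) -100*y + 100*t + 1;   % right-hand side of the stiff equation
[t1,y1] = ode45(f,[0 1],1);      % nonstiff solver
[t2,y2] = ode15s(f,[0 1],1);     % stiff solver
fprintf('ode45: %d steps, ode15s: %d steps\n',length(t1),length(t2))
plot(t1,y1,'o',t2,y2,'x',t2,t2+exp(-100*t2),'-')  % compare with exact solution

The nonstiff solver is forced to take many small steps to control the error from the fast-decaying term, while the stiff solver handles the equation with far fewer steps.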
Fundamental Theorems
One of the most useful theorems from calculus is the Implicit Function Theorem, which addresses the question
of existence of solutions to algebraic equations. Instead of stating its most general version here, we will state
exactly the case we use.
Theorem A.1. (Implicit Function Theorem) Suppose the function f(x_1, x_2, ..., x_n) is C¹ in a neighborhood
of the point (p_1, p_2, ..., p_n) (the function is continuous at this point, and its derivatives with respect to each
variable are also continuous at this point). Suppose additionally that
$$f(p_1, p_2, ..., p_n) = 0$$
and
$$\partial_{x_1} f(p_1, p_2, ..., p_n) \ne 0.$$
Then there exists a neighborhood N_p of (p_2, p_3, ..., p_n) and a function φ : N_p → ℝ so that
$$p_1 = \varphi(p_2, p_3, ..., p_n),$$
and for each (x_2, ..., x_n) ∈ N_p we have f(φ(x_2, ..., x_n), x_2, ..., x_n) = 0.
Theorem A.2. (Mean Value Theorem) Suppose f(x) is a differentiable function on the interval x ∈ [a, b].
Then there exists some number c ∈ [a, b] so that
$$f'(c) = \frac{f(b) - f(a)}{b - a}.$$
Theorem A.3. (Extreme Value Theorem) Suppose f (x) is a function continuous on a closed interval
x ∈ [a, b]. Then f (x) attains a bounded absolute maximum value f (c) and a bounded absolute minimum
value f (d) at some numbers c and d in [a, b].
Theorem A.4. (The Ratio Test) For the series $\sum_{k=1}^{\infty} a_k$, if $\lim_{k\to\infty} |a_{k+1}/a_k| = L < 1$, then the series is
absolutely convergent (which means that not only does the series itself converge, but a series created by taking
absolute values of the summands in the series also converges). On the other hand, if $\lim_{k\to\infty} |a_{k+1}/a_k| = L > 1$
or $\lim_{k\to\infty} |a_{k+1}/a_k| = \infty$, the series diverges. If $\lim_{k\to\infty} |a_{k+1}/a_k| = 1$, the ratio test is inconclusive.
Theorem A.5. (Cauchy's Convergence Condition) Let $\{a_n\}_{n=1}^{\infty}$ be a sequence of points and consider the
limit $\lim_{n\to\infty} a_n$. A necessary and sufficient condition that this limit exist is that
$$\lim_{n > m \to \infty} |a_n - a_m| = 0.$$
Theorem A.6. (Cauchy's Convergence Condition for functions, in exactly the form we require) Let the
sequence of functions $\{y_n(t)\}_{n=1}^{\infty}$ be defined for t ∈ [a, b], and define ‖·‖ by the relation
$$\|y(t)\| := \sup_{t \in [a,b]} |y(t)|.$$
Then if
$$\lim_{n > m \to \infty} \|y_n(t) - y_m(t)\| = 0,$$
we have that
$$\lim_{n\to\infty} y_n(t)$$
exists for each t ∈ [a, b]; that is, the sequence converges to some limit function y(t).
Index
boundary value problems, 42
dsolve, 37
eigenvalues, 25
equilibrium points, 21
eval, 38
event location, 43
Extreme Value Theorem, 46
Gompertz model, 9
laplace, 42
Laplace transforms, 42
law of mass action, 5
Lipschitz inequality, 33
logistic model, 9
Lotka-Volterra model, 9
Malthusian model, 8
maximum sustainable yield, 29
Mean Value Theorem, 46
multistep solver, 45
ode, 39
ode113(), 45
ode15s(), 45
ode23s(), 45
ode23t(), 45
ode23tb(), 45
single-step solvers, 45
SIR model, 10
stability
    orbital, 23
stiff ODE, 46
vectorize(), 38
well-posedness, 17