
Theory of ODEs

Spring Term 2023

James C. Robinson

March 9, 2023
Contents

I Qualitative Theory of ODEs

1 Existence and uniqueness results
  1.1 Recasting an ODE as an integral equation
  1.2 Picard’s Theorem: globally Lipschitz case
  1.3 Picard’s Theorem: locally Lipschitz case
    1.3.1 Continuous dependence on x0 and f
    1.3.2 Maximal interval of existence of a solution

2 Linear systems
  2.1 General linear n × n systems with constant coefficients
    2.1.1 Jordan canonical form over R: 2 × 2 case
    2.1.2 Jordan canonical form over R: n × n case
    2.1.3 Coordinate transformations and solution of ẋ = Ax
  2.2 Classification of 2D linear systems
    2.2.1 Systems in Jordan canonical form

3 Qualitative theory of ODEs
  3.1 General concepts: flows, orbits, invariant sets, and stability
  3.2 One-dimensional dynamics
  3.3 ω-limit points and ω-limit sets
  3.4 Aside: Cartesian and polar coordinates
  3.5 Stability

4 Qualitative dynamics of some model systems
  4.1 Conservative systems (the nonlinear pendulum)
  4.2 Lyapunov functions (the damped pendulum)

5 Linearisation and the dynamics near fixed points
  5.1 Linearised equations
    5.1.1 Sinks are attracting
    5.1.2 Lyapunov method for proving stability/attractivity of fixed points
  5.2 Unstable fixed points: saddles are unstable
  5.3 Stable and unstable manifolds at hyperbolic fixed points
    5.3.1 Approximating the stable and unstable manifolds as power series
  5.4 Non-hyperbolic fixed points in R2
  5.5 Global phase portraits by combining local pictures

6 Periodic orbits in two-dimensional systems
  6.1 The Poincaré–Bendixson Theorem
  6.2 Non-existence of periodic orbits: Dulac’s criterion
    6.2.1 Example: the Lotka–Volterra model again
Preliminaries

Open, closed, and compact sets; the matrix norm (MVC review)

Open sets

A subset A of Rn is open if for every x ∈ A there exists ε > 0 such that

B(x, ε) ⊂ A,

where B(x, ε) = {y ∈ Rn : |y − x| < ε} is the open ball of radius ε around x. The union of open sets is open and a finite intersection of open sets is open.

Closed sets

A subset A of Rn is closed if its complement is open.

Lemma 0.1. A set A ⊂ Rn is closed if and only if whenever (xn ) is a sequence in A with xn → x, we have x ∈ A.

Proof. ⇒: Suppose that A is closed and that (xn ) is a sequence in A with xn → x, where x ∉ A. Since A is closed, Rn \ A is open by definition; so there exists ε > 0 such that B(x, ε) ⊂ Rn \ A, i.e. B(x, ε) ∩ A = ∅. But by the definition of convergence there exists N such that |xn − x| < ε for all n ≥ N , i.e. xn ∈ B(x, ε), contradicting the fact that xn ∈ A for every n. This shows that we must have x ∈ A.

⇐: We show that Rn \ A is open. Take a point y ∉ A, and suppose that there is no ε > 0 such that B(y, ε) ⊂ Rn \ A. Then for each n ∈ N we have B(y, 1/n) ∩ A ≠ ∅, so there exists a point xn ∈ A such that |xn − y| < 1/n. This gives a sequence (xn ) in A with xn → y, so by assumption we should have y ∈ A; but we started with y ∉ A, a contradiction. So there must exist ε > 0 such that B(y, ε) ⊂ Rn \ A, and so A is closed.

Compact sets

Definition 0.2. A subset K of Rn is (sequentially) compact if any sequence (xn ) ∈


K has a convergent subsequence, whose limit lies in K.

[You will see in NMT that there is a definition of compactness in terms of open sets that is more general. It is equivalent to sequential compactness in any metric space, in particular in Rn with its usual distance.]

Theorem 0.3 (Sequential Heine–Borel Theorem). The interval [0, 1] is (sequen-
tially) compact.

Proof. Take any sequence (xn ) ∈ [0, 1]; then (xn ) is a bounded sequence of real
numbers, so by the Bolzano–Weierstrass Theorem has a convergent subsequence,
xnj → x. Since 0 ≤ xnj ≤ 1, 0 ≤ x ≤ 1, i.e. x ∈ [0, 1].

[The proof in NMT using the other definition of compactness is much more painful.]

To show that [0, 1] is compact we used the Bolzano–Weierstrass Theorem. To


discuss compact sets in Rn it is useful to have a ‘higher-dimensional’ generalisation
of this theorem.

Theorem 0.4 (Bolzano–Weierstrass Theorem in Rn ). Any bounded sequence in


Rn has a convergent subsequence.

Proof. We take subsequences of subsequences.

Let (x^{(j)}) be a bounded sequence in Rn, and write x^{(j)} = (x^{(j)}_1, x^{(j)}_2, . . . , x^{(j)}_n). Since (x^{(j)}) is bounded, each coordinate is a bounded sequence of real numbers. In particular, (x^{(j)}_1) is a bounded sequence of real numbers, so we can use the Bolzano–Weierstrass Theorem to extract a convergent subsequence (x^{(j_{1,k})}_1), such that

x^{(j_{1,k})}_1 → x∗_1.

Now (x^{(j_{1,k})}_2) is a bounded sequence of real numbers, so again we can find a further subsequence (x^{(j_{2,k})}_2) that converges to some x∗_2. Since (x^{(j_{2,k})}_1) is a subsequence of (x^{(j_{1,k})}_1), it still converges to x∗_1. We can continue in this way, to find a subsequence (x^{(j_{n,k})}) along which every coordinate converges; and then x^{(j_{n,k})} → x∗ = (x∗_1, . . . , x∗_n).

It is now easy to characterise compact subsets of Rn .

Theorem 0.5 (Compact sets in Rn ). A subset of Rn is compact if and only if it


is closed and bounded.

Proof. Let K be a closed, bounded subset of Rn , and take a sequence (xn ) ∈ K.


Since K is bounded, the Bolzano–Weierstrass Theorem in Rn guarantees that (xn )
has a subsequence (xnj ) that converges to some x ∈ Rn . Since xnj ∈ K and K is
closed, it follows that x ∈ K, and so K is compact.

For the other implication, if K is compact and (xn ) ∈ K with xn → x, then as
(xn ) ∈ K and K is compact there is a subsequence xnj → y ∈ K; but xnj → x (it is
a subsequence of xn , which converges to x), so by uniqueness of limits x = y ∈ K,
i.e. K is closed. Suppose now that K is unbounded: then there exist xn ∈ K such
that |xn | ≥ n; and this sequence has no convergent subsequence.

Continuous functions and compact sets

Continuity between Euclidean spaces (or, in fact, normed spaces) looks just like continuity in R. If f : Rn → Rm then f is continuous at x ∈ Rn if for every ε > 0 there exists δ > 0 such that

|x′ − x| < δ ⇒ |f (x′ ) − f (x)| < ε. (1)

(Note that the first | · | is the Euclidean norm in Rn , while the second is the
Euclidean norm in Rm .)

We can now generalise the Extreme Value Theorem from Analysis II.

Lemma 0.6. If A ⊂ Rn is compact and f : Rn → Rm is continuous then f (A) ⊂ Rm is compact.

Proof. Take a sequence (yn ) ∈ f (A). Then yn = f (xn ) for some sequence (xn ) ∈ A. Since A is compact, (xn ) has a convergent subsequence xnj → x∗ ∈ A. Since f is continuous,

f (xnj ) → f (x∗ ) =: y∗ ∈ f (A),

so every sequence in f (A) has a subsequence converging to a limit in f (A), i.e. f (A) is compact.

Corollary 0.7. Suppose that f : K → R is continuous, where K is a compact


subset of Rn . Then f is bounded and attains its bounds.

Proof. By Lemma 0.6, f (K) is a compact subset of R. So f (K) must be closed


and bounded. That f (K) is bounded means that there exists R such that

f (K) ⊂ [−R, R],

so |f (x)| ≤ R for every x ∈ K, i.e. f is bounded. To see that f attains its


bounds, let f̄ = sup_{x∈K} f (x). Then for each n there exists xn ∈ K such that f̄ − 1/n < f (xn ) ≤ f̄. Since K is compact, (xn ) has a subsequence xnj → x∗ ∈ K; taking limits along this subsequence (using the continuity of f ) shows that f̄ ≤ f (x∗ ) ≤ f̄, i.e. f (x∗ ) = f̄, so f attains its supremum. A similar argument works for the infimum.

Continuity in terms of open sets

If f : X → Y , we define the preimage of a set B ⊂ Y , written f −1 (B), to be

f −1 (B) = {x ∈ X : f (x) ∈ B}. (2)

The notation can be a bit confusing: this is not ‘the inverse of f applied to B’, since the preimage as defined in (2) makes sense even when f is not invertible. In general, for f : X → Y we have only

f −1 (f (A)) ⊇ A for A ⊂ X and f (f −1 (B)) ⊆ B for B ⊂ Y.

Simple examples show that these inclusions can be strict: consider X = Y = R and f (x) = x2 . Then

f −1 (f ([0, 1])) = f −1 ([0, 1]) = [−1, 1] and f (f −1 ([−1, 1])) = f ([0, 1]) = [0, 1].

The following result shows (in particular) that we could define continuity in
terms of open sets, but in this module we will mostly use the fact that if f is
continuous (understood as in (1)) then the preimage of any open set is open (and
the preimage of any closed set is closed).

Theorem 0.8. A function f : Rn → Rm is continuous if and only if f −1 (U ) is


open in Rn whenever U is open in Rm .

Proof. ⇒ Let U be an open set in Rm . If f −1 (U ) is empty then it is open; otherwise


we take x ∈ f −1 (U ). By definition f (x) ∈ U , which is open, so there exists ε > 0
such that B(f (x), ε) ⊂ U . Since f is continuous at x, there exists δ > 0 such
that |f (y) − f (x)| < ε if |y − x| < δ, i.e. B(x, δ) ⊂ f −1 (B(f (x), ε)) ⊂ f −1 (U ). So
f −1 (U ) is open.

⇐ Take x ∈ Rn and ε > 0. Then B(f (x), ε) is an open subset of Rm , so f −1 (B(f (x), ε)) is an open subset of Rn that contains x. In particular, this means
that there exists δ > 0 such that

B(x, δ) ⊂ f −1 (B(f (x), ε)).

But this means that for any y ∈ B(x, δ), i.e. any y with |y − x| < δ, we must have
f (y) ∈ B(f (x), ε), i.e. |f (y) − f (x)| < ε; so f is continuous.

The matrix norm

We will consider the collection of all n × n matrices, Rn×n. We write

A = \begin{pmatrix} a11 & ⋯ & a1n \\ ⋮ & & ⋮ \\ an1 & ⋯ & ann \end{pmatrix},

written (aij)_{i,j=1}^{n} for short. To every n × n matrix we can associate a linear map Rn → Rn, given by x ↦ Ax; recall that

(Ax)_i = Σ_{j=1}^{n} a_{ij} x_j.

We define the matrix norm of A to be

∥A∥ = sup_{x∈Rn, x≠0} |Ax| / |x|.

Since A is linear, if x ≠ 0 then we have

|Ax| / |x| = |A(x/|x|)|;

so

∥A∥ = sup_{|x|=1} |Ax|,

which can be an easier definition to apply. The point is that it follows immediately from the definition that |Ax| ≤ ∥A∥|x| for every x ∈ Rn.

It is possible to relate ∥A∥ to the entries of A. Note that for any x ∈ Rn with |x| = 1 we have

|Ax|² = Σ_{i=1}^{n} ( Σ_{j=1}^{n} a_{ij} x_j )² ≤ Σ_{i=1}^{n} [ ( Σ_{j=1}^{n} a_{ij}² ) ( Σ_{j=1}^{n} |x_j|² ) ] = Σ_{i,j=1}^{n} |a_{ij}|²,

using the Cauchy–Schwarz inequality on the inner sum. It follows that

∥A∥ ≤ ( Σ_{i,j=1}^{n} |a_{ij}|² )^{1/2}.

However, if we let {e1, . . . , en} be the standard basis of Rn, then Aei is the i-th column of A, so that

Σ_{j=1}^{n} |a_{ji}|² = |Aei|²,

and therefore

Σ_{i,j=1}^{n} |a_{ij}|² = Σ_{i=1}^{n} |Aei|² ≤ Σ_{i=1}^{n} ∥A∥² |ei|² = n∥A∥².

It follows that

(1/√n) ( Σ_{i,j=1}^{n} |a_{ij}|² )^{1/2} ≤ ∥A∥ ≤ ( Σ_{i,j=1}^{n} |a_{ij}|² )^{1/2}. (3)

In particular, |a_{ij}| ≤ √n ∥A∥ for every i, j.

This ‘matrix norm’ really is a norm on Rn×n : we have to check that

(i) ∥A∥ = 0 if and only if A = 0. This is simple.


(ii) ∥λA∥ = |λ|∥A∥. This is easy too.
(iii) The triangle inequality: ∥A + B∥ ≤ ∥A∥ + ∥B∥. For any x ∈ Rn we have

|(A + B)x| = |Ax + Bx| ≤ |Ax| + |Bx| ≤ ∥A∥|x| + ∥B∥|x| = (∥A∥ + ∥B∥)|x|,

and so

∥A + B∥ = sup_{|x|=1} |(A + B)x| ≤ ∥A∥ + ∥B∥.

For compositions we have ∥AB∥ ≤ ∥A∥∥B∥, since

|(AB)x| = |A(Bx)| ≤ ∥A∥|Bx| ≤ ∥A∥∥B∥|x|.

Finally, suppose that (A^{(k)}) are matrices such that Σ_{k=1}^{∞} ∥A^{(k)}∥ < ∞. Then for each i, j the series

Σ_{k=1}^{∞} a_{ij}^{(k)}

converges absolutely, using the fact that |a_{ij}^{(k)}| ≤ √n ∥A^{(k)}∥ [see (3)] and the comparison test. So every entry of the matrix sum Σ_{k=1}^{∞} A^{(k)} converges absolutely, and hence

Σ_{k=1}^{∞} A^{(k)}

converges.
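[A quick numerical sanity check of the two-sided bound (3) — a sketch, not part of the original notes, assuming Python with numpy is available; np.linalg.norm(A, 2) computes the operator norm ∥A∥ and np.linalg.norm(A, 'fro') the entrywise quantity (Σ|aij|²)^{1/2}.]

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n))

op_norm = np.linalg.norm(A, 2)      # operator norm ||A|| = sup_{|x|=1} |Ax|
frob = np.linalg.norm(A, 'fro')     # (sum_{i,j} |a_ij|^2)^{1/2}

# the two-sided bound (3): frob/sqrt(n) <= ||A|| <= frob
assert frob / np.sqrt(n) <= op_norm <= frob
print(frob / np.sqrt(n), op_norm, frob)
```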

The Contraction Mapping Theorem (NMT preview)

Norms

If V is a vector space (over R), then a norm on V is a map ∥ · ∥ : V → [0, ∞)


such that

(i) ∥v∥ = 0 if and only if v = 0;


(ii) ∥λv∥ = |λ|∥v∥ for all λ ∈ R and v ∈ V ; and
(iii) ∥u + v∥ ≤ ∥u∥ + ∥v∥ (triangle inequality).

A ‘normed space’ is a vector space plus a norm: (V, ∥ · ∥).

‘Trivial’ examples:

1. (R, | · |).

2. (Rn , | · |), where |(x1 , . . . , xn )| = √(x1² + ⋯ + xn²) (we use the same notation for the usual norm on Rn as for the absolute value in R).

More interesting examples:

1. (Rn , | · |1 ), where |(x1 , . . . , xn )|1 = Σ_{j=1}^{n} |xj |.

2. (Rn , | · |∞ ), where |(x1 , . . . , xn )|∞ = max_{j=1,...,n} |xj |.

3. Let V = Rn×n , the space of all n × n matrices. If A ∈ Rn×n , then the ‘matrix norm’ of A is

∥A∥ = sup_{x∈Rn, x≠0} |Ax| / |x| = sup_{x∈Rn, |x|=1} |Ax|.

This defines a norm on V and

|Ax| ≤ ∥A∥|x| for every x ∈ Rn .

Moreover if A, B ∈ V then

∥AB∥ ≤ ∥A∥∥B∥.

4a. Let V = C([a, b]; R), the continuous functions from [a, b] into R; then

∥f ∥∞ := sup_{t∈[a,b]} |f (t)|

is a norm on V .

4b. Let

V = C([a, b]; Rn ) = {f = (f1 , . . . , fn ) : [a, b] → Rn : fi ∈ C([a, b]; R), i = 1, . . . , n},

i.e. the continuous functions from [a, b] into Rn . Then

∥f ∥∞ := sup_{t∈[a,b]} |f (t)|

is a norm on V .

Examples 3 and 4b will be particularly important.

Convergence in normed spaces and completeness

To define convergence in a general normed space we just repeat the usual definition, but replacing the absolute value by the norm: if (xn ) is a sequence in X, then xn → x if for every ε > 0 there exists N such that

n ≥ N ⇒ ∥xn − x∥ < ε.

By copying almost word-for-word the proofs from Analysis I you can prove that
this kind of convergence behaves in the way you would expect, e.g. if vn → v and
un → u then vn + un → v + u (you have to use the fact that X is a vector space
to make sure that ‘adding’ makes sense).

Recall (from Analysis I) that a sequence (xn ) of real numbers converges if and
only if it is a Cauchy sequence: for every ε > 0 there exists N such that
n, m ≥ N ⇒ |xn − xm | < ε.
The point of this is it gives a test for convergence without knowing what the limit
is.

We say that a normed space (X, ∥ · ∥) is complete if every Cauchy sequence in the space converges (and has a limit in the space). So this Analysis I result can be rephrased as ‘(R, | · |) is complete’ (or ‘R is complete with its usual norm’).

You may remember that showing that any Cauchy sequence converges is not
trivial, even when this is ‘only’ a sequence of real numbers, so it should be no
surprise that ‘being complete’ is quite a special property for a normed space.
Thankfully, many of the spaces that are often used are complete (actually, this is
probably why they are often used). In particular, we can show this for two of the
examples above.

Proposition 0.9. The space X = (C([a, b]; R), ∥ · ∥∞) is complete.

Proof. Suppose that (fn ) is a Cauchy sequence in X: then for every ε > 0 there
exists N such that

n, m ≥ N ⇒ ∥fn − fm ∥∞ < ε.

If we write this out in full as

n, m ≥ N ⇒ sup_{t∈[a,b]} |fn (t) − fm (t)| < ε, (4)

then it is much clearer that for each fixed t ∈ [a, b] we have

n, m ≥ N (ε) ⇒ |fn (t) − fm (t)| < ε, (5)

where we have specified that N depends only on ε to preserve the uniformity we


have in (4).

This means that for each fixed t, (fn (t)) is a Cauchy sequence of real numbers,
so converges to some limit: we define

f (t) := lim_{n→∞} fn (t).

Now we show that in fact fn → f uniformly on [a, b]. But we can do this using
(5): if we take m → ∞ then we obtain

n ≥ N (ε) ⇒ |fn (t) − f (t)| < ε;

since this is true for every t ∈ [a, b], it follows that

n ≥ N (ε) ⇒ ∥fn − f ∥∞ < ε,

and so fn → f uniformly. Since f is the uniform limit of continuous functions, it


is continuous (see Analysis III).

Corollary 0.10. The space (C([a, b]; Rn ), ∥ · ∥∞ ) is complete.

Proof. Given a sequence (fk ) ∈ C([a, b]; Rn ) we can write fk = (f_k^1 , f_k^2 , . . . , f_k^n ), where each (f_k^j )_k , for j = 1, . . . , n, is a sequence in C([a, b]; R), so we can apply Proposition 0.9 to each of these in turn.

The Contraction Mapping Theorem

Recall that a subset B of Rn is closed if whenever we have a sequence (xn ) ∈ B


and xn → x, then x ∈ B. We have the same definition in a normed space: a subset
U of (V, ∥ · ∥) is closed if whenever we have a sequence (xn ) ∈ U and xn → x, then
x ∈ U.

Theorem 0.11 (Contraction Mapping Theorem). Let (V, ∥ · ∥) be a complete


normed space, U a closed subset of V , and T : U → U a map such that

∥T (u) − T (v)∥ ≤ κ∥u − v∥, u, v ∈ U,

for some 0 ≤ κ < 1. Then there exists a unique ū ∈ U such that

T (ū) = ū,

i.e. ū is a fixed point of T .

Moreover, for any u0 ∈ U we have T n (u0 ) → ū as n → ∞ with

∥T n (u0 ) − ū∥ ≤ κn ∥u0 − ū∥, (6)

where T^n (u) := (T ∘ T ∘ ⋯ ∘ T )(u), with T applied n times.

Proof. Pick any u0 ∈ U and set un = T^n (u0 ). Now, for any n ≥ 1 we have

∥un − un+1 ∥ = ∥T (un−1 ) − T (un )∥ ≤ κ∥un−1 − un ∥ = κ∥T (un−2 ) − T (un−1 )∥ ≤ κ²∥un−2 − un−1 ∥ ≤ ⋯ ≤ κ^n ∥u0 − u1 ∥

(strictly this should be proved by induction).

So now if m > n we can use the triangle inequality to give

∥un − um ∥ ≤ ∥un − un+1 ∥ + ∥un+1 − un+2 ∥ + . . . + ∥um−1 − um ∥
          ≤ (κ^n + ⋯ + κ^{m−1}) ∥u0 − u1 ∥
          ≤ (κ^n / (1 − κ)) ∥u0 − u1 ∥. (7)

It follows that (un ) is a Cauchy sequence in (V, ∥ · ∥); since this space is complete
we must have un → ū for some ū ∈ V . Since un ∈ U and U is closed, we must
have ū ∈ U .

Since T is a contraction it is continuous, and so

T (ū) = T ( lim_{n→∞} un ) = lim_{n→∞} T (un ) = lim_{n→∞} un+1 = ū,

i.e. T (ū) = ū as required.

To show that ū is unique, suppose that we also have T (v̄) = v̄. Then
∥ū − v̄∥ = ∥T (ū) − T (v̄)∥ ≤ κ∥ū − v̄∥,
which is impossible unless ∥ū − v̄∥ = 0, i.e. ū = v̄.

To obtain (6) note that, with un = T^n (u0 ) once again,

∥un − ū∥ = ∥T (un−1 ) − T (ū)∥ ≤ κ∥un−1 − ū∥ ≤ ⋯ ≤ κ^n ∥u0 − ū∥.
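[A minimal numerical illustration of the iteration in the proof — a sketch, assuming Python is available, and not part of the original notes. T (u) = cos u maps the closed set U = [0, 1] into itself and, by the Mean Value Theorem, is a contraction there with κ = sin 1 ≈ 0.84; the iterates converge to the unique fixed point and obey the error bound (6).]

```python
import math

T = math.cos          # a contraction on U = [0, 1] with kappa = sin(1) < 1
kappa = math.sin(1.0)

# iterate to (approximate) the unique fixed point u_bar with T(u_bar) = u_bar
u_bar = 0.0
for _ in range(100):
    u_bar = T(u_bar)
print(u_bar, abs(math.cos(u_bar) - u_bar))   # ~0.739085, tiny residual

# check the error bound (6): |T^n(u0) - u_bar| <= kappa^n |u0 - u_bar|
u0 = 0.0
u = u0
for n in range(1, 11):
    u = T(u)
    assert abs(u - u_bar) <= kappa**n * abs(u0 - u_bar) + 1e-12
```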

To turn this into the version we need as a ‘black box’ for Picard’s Theorem in
the module, note that we have already shown that V = C([a, b]; Rn ) is complete.
Now observe that if Y is a closed subset of Rn , then U = C([a, b]; Y ) is a closed
subset of C([a, b]; Rn ): given a sequence (fn ) ∈ C([a, b]; Y ) such that fn → f (in
the sup norm), this means that in particular for every t ∈ [a, b] we have
fn (t) → f (t);
so (fn (t)) is a convergent sequence every element of which lies in Y , and since Y
is closed this implies that f (t) ∈ Y . So f (t) ∈ Y for every t ∈ [a, b], so we have
f ∈ C([a, b]; Y ), which shows that C([a, b], Y ) is closed.
Theorem 0.12 (Contraction Mapping Theorem in spaces of continuous func-
tions). Let X = C([a, b]; Y ), where Y is a closed subset of Rn . Suppose that
P : X → X is a map such that
∥P (x) − P (y)∥∞ ≤ κ∥x − y∥∞ , x, y ∈ X,
for some 0 ≤ κ < 1, where ∥x∥∞ := supt∈[a,b] |x(t)|. Then there exists a unique
x∗ ∈ X such that
P (x∗ ) = x∗ ,
i.e. x∗ is a fixed point of P .

Moreover, for any x ∈ X we have P n (x) → x∗ as n → ∞ with


∥P n (x) − x∗ ∥∞ ≤ κn ∥x − x∗ ∥∞ , (8)
where P^n (x) := (P ∘ P ∘ ⋯ ∘ P )(x), with P applied n times.

Part I

Qualitative Theory of ODEs

Chapter 1

Existence and uniqueness results

We will primarily study autonomous ODEs,

ẋ = f (x), x ∈ U ⊂ Rn .

[Here ẋ denotes dx/dt; implicit is that x is x(t), a function of t. You could (and
many do) rewrite this equation as ẋ(t) = f (x(t)), but the extra ts just make it
look more complicated.]

However, we will start with a simple existence and uniqueness result for the
non-autonomous problem

ẋ = f (x, t), x(t0 ) = x0 .

1.1 Recasting an ODE as an integral equation

In order to analyse the initial value problem (IVP)

ẋ = f (x, t) x(t0 ) = x0 (1.1)

it is helpful to recast it as the integral equation


x(t) = x0 + ∫_{t0}^{t} f (x(s), s) ds. (1.2)

Under fairly mild conditions on f these two problems are equivalent.

Lemma 1.1. If f : Rn × R → Rn is continuous and τ > 0, then the following
statements are equivalent:

(i) x : [t0 − τ, t0 + τ ] → Rn is differentiable and satisfies (1.1)


(ii) x : [t0 − τ, t0 + τ ] → Rn is continuous and satisfies (1.2).

Proof. (i) ⇒ (ii) Since x is differentiable it is continuous. The map s ↦ f (x(s), s) is continuous, as it is the composition of continuous functions [s ↦ (x(s), s) and (x, t) ↦ f (x, t)]; we can therefore use the Fundamental Theorem of Calculus to deduce that

x(t) = x0 + ∫_{t0}^{t} f (x(s), s) ds for every t ∈ [t0 − τ, t0 + τ ].

(ii) ⇒ (i) Since s ↦ f (x(s), s) is continuous, we can use the Fundamental Theorem of Calculus to deduce that

ẋ(t) = f (x(t), t),

and we have x(t0 ) = x0 as required.

In order to find solutions of the integral equation (1.2), we will use the following
version of the Contraction Mapping Theorem – the full version is proved in Norms,
Metrics, and Topologies, and also in the preliminary material.
Theorem 1.2 (Contraction Mapping Theorem in spaces of continuous functions).
Let X = C([a, b]; Y ), where Y is a closed subset of Rn . Suppose that P : X → X
is a map such that
∥P (x) − P (y)∥∞ ≤ κ∥x − y∥∞ , x, y ∈ X,
for some 0 ≤ κ < 1, where
∥x∥∞ := sup |x(t)|.
t∈[a,b]

Then there exists a unique x∗ ∈ X such that


P (x∗ ) = x∗ ,
i.e. x∗ is a fixed point of P .

Moreover, for any x ∈ X we have ∥P^n (x) − x∗ ∥∞ → 0 as n → ∞, where P^n (x) := (P ∘ P ∘ ⋯ ∘ P )(x), with P applied n times.

Note that the whole of Rn is a closed set, so X = C([a, b]; Rn ) is allowed.

We will also need the following result, which you know for functions of a scalar
variable from Analysis III, but not for vector-valued integrals.
Lemma 1.3. If f : [a, b] → Rn is Riemann integrable then

| ∫_a^b f (t) dt | ≤ ∫_a^b |f (t)| dt.

Proof. Let γ = ∫_a^b f (s) ds; then

|γ|² = γ · γ = γ · ∫_a^b f (s) ds = ∫_a^b γ · f (s) ds ≤ ∫_a^b |γ · f (s)| ds ≤ ∫_a^b |γ||f (s)| ds = |γ| ∫_a^b |f (s)| ds,

using the Cauchy–Schwarz inequality |γ · f (s)| ≤ |γ||f (s)| in the last step. If γ = 0 then the claimed inequality is trivially true; if γ ≠ 0 then we can divide by |γ| to obtain it.

1.2 Picard’s Theorem: globally Lipschitz case

We first give a proof of the existence and uniqueness of solutions to the non-
autonomous equation ẋ = f (x, t) when f is globally Lipschitz, i.e. when (1.3) is
satisfied.
Theorem 1.4 (Picard’s Theorem for globally Lipschitz f ). Suppose that the func-
tion f : Rn × R → Rn is continuous and that there exists L > 0 such that

|f (x, t) − f (y, t)| ≤ L|x − y| for every x, y ∈ Rn , t ∈ R. (1.3)

If τ L < 1, then for every x0 ∈ Rn there exists a unique differentiable function x : [t0 − τ, t0 + τ ] → Rn that satisfies

ẋ = f (x, t), x(t0 ) = x0 , for all t ∈ [t0 − τ, t0 + τ ].

Proof. Lemma 1.1 shows that finding a solution of the ODE on [t0 − τ, t0 + τ ] is equivalent to finding a continuous function x : [t0 − τ, t0 + τ ] → Rn that satisfies

x(t) = x0 + ∫_{t0}^{t} f (x(s), s) ds,

i.e. to finding a fixed point of the operator P : X → X, where

X = C([t0 − τ, t0 + τ ]; Rn )

and

[P (x)](t) := x0 + ∫_{t0}^{t} f (x(s), s) ds. (1.4)

Note first that P (x) is a continuous function of t by the Fundamental Theorem of Calculus; so P : X → X as required.

Now, given x, y ∈ X, for any t ∈ [t0 − τ, t0 + τ ] we have

[P (x)](t) − [P (y)](t) = ∫_{t0}^{t} [f (x(s), s) − f (y(s), s)] ds.

Using Lemma 1.3, we have

|[P (x)](t) − [P (y)](t)| ≤ | ∫_{t0}^{t} |f (x(s), s) − f (y(s), s)| ds | ≤ L | ∫_{t0}^{t} |x(s) − y(s)| ds |   [using (1.3)]
                         ≤ L|t − t0 | ∥x − y∥∞ ≤ Lτ ∥x − y∥∞

(recall that ∥x − y∥∞ = sup_{s∈[t0 −τ, t0 +τ ]} |x(s) − y(s)|). It follows that

∥P (x) − P (y)∥∞ ≤ Lτ ∥x − y∥∞ ,

and so P is a contraction mapping provided that Lτ < 1, and we can apply Theorem 1.2. Under this condition P has a unique fixed point in X, which is the solution we are looking for.

Comments on the globally Lipschitz Picard Theorem

1. Being ‘Lipschitz’ is just a little weaker than being differentiable. The standard
example showing that a Lipschitz function need not be differentiable is f (x) = |x|,
which is not differentiable at zero. (In fact a Lipschitz function is differentiable
‘almost everywhere’.)

If f : R → R is differentiable then the Mean Value Theorem guarantees that

|f (x) − f (y)| ≤ |f ′ (c)||x − y|

for some c between x and y. So if |f ′ (c)| ≤ M for all c ∈ [a, b] we have

|f (x) − f (y)| ≤ M |x − y| for all x, y ∈ [a, b].

If |f ′ | is bounded on R then f is globally Lipschitz; but this condition is satisfied


by very few examples. For instance, if f : R → R is a polynomial, then for f to be
globally Lipschitz it must be of the form f (x) = ax + b. (Other examples are sin x
and cos x.) We will deal with this in our next theorem.

2. As stated, the theorem only guarantees that a solution exists locally, i.e. for times close to t0 (t ∈ [t0 − τ, t0 + τ ]). However, when f is globally Lipschitz (as it is here) we can use the theorem repeatedly to extend the interval on which the solution exists, using the following simple observation.

Suppose that x1 (t) solves

ẋ = f (x, t), x(t0 ) = x0 (1.5)

on [t0 , t1 ], and x2 (t) solves

ẋ = f (x, t), x(t1 ) = x1 (t1 )

on [t1 , t2 ]. Then

x(t) = x1 (t) for t0 ≤ t ≤ t1 ,  x(t) = x2 (t) for t1 < t ≤ t2 ,

solves (1.5) on [t0 , t2 ] (note that the left derivative of x1 matches the right derivative of x2 at t1 , since both equal f (x1 (t1 ), t1 ), so x is still C¹). So we can try to keep extending the interval on which the solution is defined.

We consider the problem of showing the existence of a solution with y(0) = y0


for all t ≥ 0. Under the global Lipschitz condition of the theorem we have shown
that there is a uniform time τ such that the solution starting at x0 at time t = t0
exists on the time interval [t0 , t0 + τ ] (we are only looking for the solution ‘in the
future’ at the moment).

We start with y(0) = y0 ; then the solution exists for all t ∈ [0, τ ]. We now solve
the equation starting at x(τ ) = y(τ ); the solution here exists for t ∈ [τ, 2τ ]; again
we apply the theorem with x(2τ ) = y(2τ ) and find a solution valid for t ∈ [2τ, 3τ ].
The gluing observation above means that if we continue in this way we end up
with a solution that exists for all t ≥ 0.

We could argue similarly to extend the solution to all t ≤ 0.

The issue of uniqueness of this extended solution is treated on Examples 1.

3. We now apply this result to the linear equation

ẋ = Ax, x(0) = x0 , (1.6)

where x ∈ Rn and A is an n × n matrix. Since (see the Preliminaries)

|Au − Av| = |A(u − v)| ≤ ∥A∥|u − v|, u, v ∈ Rn

the right-hand side is a globally Lipschitz function of x, and so we can apply the
global Picard Theorem to deduce that (1.6) has a unique solution t 7→ x(t) defined
for t ∈ [−τ, τ ] provided we choose τ < ∥A∥−1 . (In fact, we will soon see that when
f is globally Lipschitz the solution exists for all t ∈ R.) We will now deduce the
form of this solution on [−τ, τ ] using the Contracting Mapping argument.

Recall from Theorem 1.4 that the successive iterates of the Picard map P from (1.4) converge to the solution. If we start with x⁰(t) := x0 (constant) then we obtain

x¹(t) = x0 + ∫_0^t A x⁰(s) ds = x0 + tAx0 ,
x²(t) = x0 + ∫_0^t A x¹(s) ds = x0 + ∫_0^t A[x0 + sAx0 ] ds = x0 + tAx0 + (t²/2) A²x0 , . . .

By induction we obtain

x^k (t) = Σ_{j=0}^{k} (t^j A^j / j!) x0 ,

and by Theorem 1.4 this sequence converges to the solution of (1.6) as k → ∞: so the solution is

x(t) = e^{tA} x0 ,

where we define

e^{tA} = Σ_{j=0}^{∞} t^j A^j / j! ;

that the sum converges for |t| ≤ τ is guaranteed by the way we used the Contraction Mapping Theorem (Theorem 1.2) in the proof of Theorem 1.4, although we will see later (more directly) that the sum in fact converges for all t ∈ R (and gives the solution for all t in this range).
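[A numerical sketch of this computation, not from the original notes, assuming Python with numpy and scipy: the k-th Picard iterate is the k-th partial sum of the exponential series applied to x0, and it matches scipy.linalg.expm. The matrix A and time t here are arbitrary illustrative choices.]

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[0.0, 1.0], [-2.0, -3.0]])
x0 = np.array([1.0, 0.0])
t = 0.3

# partial sums sum_{j=0}^{k} t^j A^j / j!  (the Picard iterates applied to x0)
S = np.zeros((2, 2))
term = np.eye(2)                 # current term t^j A^j / j!, starting at j = 0
for j in range(25):
    S += term
    term = term @ (t * A) / (j + 1)

print(S @ x0)                    # x^k(t) for k = 24
print(expm(t * A) @ x0)          # e^{tA} x0, the exact solution at time t
```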

1.3 Picard’s Theorem: locally Lipschitz case

We now prove a similar theorem, but weaken the Lipschitz condition on f to a


local one. To simplify the argument, we now restrict to the autonomous case.

Before we prove the theorem, note that solutions of autonomous initial-value


problems depend only on the elapsed time. Suppose that x : [−τ, τ ] → Rn solves

ẋ = f (x), x(0) = x0 (1.7)

for all t ∈ [−τ, τ ]. Then y : [t0 − τ, t0 + τ ] → Rn defined by setting y(t0 + t) := x(t) satisfies

ẏ = f (y), y(t0 ) = x0 , for all t ∈ [t0 − τ, t0 + τ ].
[Equivalently, we could set y(t) := x(t − t0 ), but this is perhaps a little less clear.]
So to understand the ‘general’ initial value problem (specify x at time t0 ) it is
enough to understand what we would most naturally call the initial value problem
(specify x at time 0). We will return to this idea later, when we use the solutions
of the equation to define a ‘flow’ by setting ϕt (x0 ) = x(t; x0 ), the solution of (1.7)
at time t.

Theorem 1.5. Suppose that U ⊂ Rn is open and that f : U → Rn is continuous and locally Lipschitz, i.e. for every compact subset K of U there exists LK > 0 such that

|f (u) − f (v)| ≤ LK |u − v| for all u, v ∈ K.

Then for some τ = τ (x0 ) > 0 there exists a unique x : [t0 − τ, t0 + τ ] → U that
solves
ẋ = f (x), x(t0 ) = x0 (1.8)
for all t ∈ [t0 − τ, t0 + τ ].

We will soon show that in fact we can take τ (x0 ) to be uniform for all x0 ∈ K,
when K is any compact subset of U .

Proof. From our initial observations, we can restrict to the (notationally more
convenient) case t0 = 0.

Choose r > 0 such that

K := B̄(x0 , r) ⊂ U,

where B̄(x0 , r) = {y ∈ Rn : |y − x0 | ≤ r} is the closed ball of radius r about x0 (such an r exists since U is open). The set K is compact; let L = LK and let

M = sup_{x∈K} |f (x)|,

which is finite since x ↦ |f (x)| is a continuous map from the compact set K into R (so we can use the Extreme Value Theorem). [We do this to fix a set that contains x0 on which f is bounded and Lipschitz.]

Now choose τ such that1 τ M ≤ r and Lτ < 1, and set

X := C([−τ, τ ]; B̄(x0 , r)).

As before, we want to show that the Picard operator P ,

P (x)(t) := x0 + ∫_0^t f (x(s)) ds, (1.9)

maps X into itself and is a contraction: it will then follow from the Contraction Mapping Theorem (Theorem 1.2) that P has a unique fixed point, which will be the solution of the ODE (1.8) by Lemma 1.1.

Step 1: Show that P : X → X.

Take x ∈ X. Then

P (x)(t) = x0 + ∫_0^t f (x(s)) ds

makes sense, since x(s) ∈ B̄(x0 , r) = K ⊂ U , where f is defined. P (x) is a continuous function by the Fundamental Theorem of Calculus (since s ↦ f (x(s)) is continuous). We also have

|P (x)(t) − x0 | ≤ | ∫_0^t |f (x(s))| ds | ≤ M |t| ≤ M τ ≤ r, t ∈ [−τ, τ ],

so P : X → X as claimed.

Step 2: Show that P is a contraction.


1 This will make sure that our iterations stay in K, which is the set on which we have a bound
and Lipschitz constant for f .

Given x, y ∈ X, for each t ∈ [−τ, τ ] we have

|P (x)(t) − P (y)(t)| = | ∫_0^t [f (x(s)) − f (y(s))] ds |
                     ≤ | ∫_0^t |f (x(s)) − f (y(s))| ds |
                     ≤ L | ∫_0^t |x(s) − y(s)| ds |
                     ≤ L|t| sup_{s∈[−τ,τ ]} |x(s) − y(s)|
                     ≤ Lτ ∥x − y∥∞ ;

so

∥P (x) − P (y)∥∞ ≤ Lτ ∥x − y∥∞ ,

and by assumption Lτ < 1. Thus P is a contraction, and it has a unique fixed point x ∈ X, which is the solution we want.
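[The fixed-point iteration in the proof can be carried out numerically; the following is a sketch (not part of the original notes, assuming Python with numpy) for ẋ = x², x(0) = 1 on [0, 0.2], discretising the Picard operator with the trapezoidal rule and comparing with the explicit solution x(t) = 1/(1 − t) that appears in comment 3 below.]

```python
import numpy as np

tau, N = 0.2, 2001                  # a short interval, as the proof requires
t = np.linspace(0.0, tau, N)
h = t[1] - t[0]

def P(x):
    """Picard operator (Px)(t) = 1 + int_0^t x(s)^2 ds (trapezoidal rule)."""
    f = x**2
    return 1.0 + np.concatenate(([0.0], np.cumsum((f[1:] + f[:-1]) * h / 2)))

x = np.ones(N)                      # start from the constant function x0
for _ in range(20):
    x = P(x)                        # successive Picard iterates

exact = 1.0 / (1.0 - t)             # the explicit solution for x0 = 1
print(np.max(np.abs(x - exact)))    # small: the iterates converge on [0, tau]
```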

Comments on Picard’s Theorem and further results

1. If the assumptions of the theorem do not hold then solutions need not be unique.
For example, consider the equation on [0, ∞) given by

ẋ = |x|1/2 , x(0) = 0.

This has an ‘obvious’ solution x(t) = 0 for all t ≥ 0, but we can also solve by separating variables, noting that any solution satisfies x(t) ≥ 0:

∫_0^{x(t)} dx / x^{1/2} = ∫_0^t dt  ⇒  2x(t)^{1/2} = t,

which gives

x(t) = t²/4.

In fact there is an infinite family of solutions: for any s ≥ 0,

xs (t) := 0 for 0 ≤ t ≤ s,  xs (t) := (t − s)²/4 for t > s,

is a solution.

This does not contradict the theorem because |x|^{1/2} is not locally Lipschitz near zero: for any L > 0, if we take 0 < x < L^{−2} then

||x|^{1/2} − |0|^{1/2}| = x^{1/2} > Lx = L|x − 0|,

so no local Lipschitz constant will work near x = 0.

2. However, if f is locally Lipschitz, then the uniqueness result that is part of this
theorem means that for an autonomous equation, distinct solutions cannot cross.

Suppose that x1 : I1 → Rn and x2 : I2 → Rn both satisfy the ODE, and that


x1 (t1 ) = x2 (t2 ). If we consider y1 (t) := x1 (t1 + t) and y2 (t) := x2 (t2 + t) then the
theorem implies that y1 (t) = y2 (t) for all t ∈ [−τ, τ ] for some τ > 0.

3. Once again, the theorem only guarantees the existence of a solution ‘locally’, i.e. near t = 0. And if f is only locally Lipschitz, then solutions can blow up in finite time: when we try to keep extending the existence time by applying the theorem repeatedly we may end up with a solution defined only on a finite time interval.

To see this in an example, we take U = R and consider the equation

ẋ = x² , x(0) = x0 .
If we separate variables and integrate,

∫_{x0}^{x(t)} dx / x² = ∫_0^t dt  ⇒  1/x0 − 1/x(t) = t,

then we obtain the solution

x(t) = 1 / (x0^{−1} − t). (1.10)

Clearly, if x0 > 0 then as t → 1/x0 we have x(t) → ∞. So the solution cannot be extended beyond t = 1/x0 .

To see how this is consistent with repeatedly using the ‘short-time existence’
we get from Theorem 1.5, we follow the proof to see what value of τ we are allowed
to take, depending on x0 . We will take x0 > 0, since this is the situation in which
the solution blows up for positive t.

In the proof we have to choose r so that B̄(x0 , r) ⊂ U = R. This holds for any r > 0, so we are (at this stage) free to choose any r. Note that, since we are in R, B̄(x0 , r) = [x0 − r, x0 + r].

Given a choice of r, we then set

M = sup_{x∈[x0 −r, x0 +r]} |f (x)| = (x0 + r)².

The first constraint on the local existence time τ in the proof is to take τ M ≤ r. This requires

τ ≤ r / (x0 + r)² ,

and the right-hand side is maximised when we take r = x0 , giving τ = 1/4x0 .

The second condition that restricts τ is to take τ L < 1, where L is the Lipschitz constant for f (x) = x² on [x0 − r, x0 + r] = [0, 2x0 ] (given our choice of r). The easiest way to find the Lipschitz constant of f is to use the Mean Value Theorem: we know that there exists c between x and y such that

(f (x) − f (y)) / (x − y) = f ′ (c),

and taking the modulus of both sides we obtain

|f (x) − f (y)| = |f ′ (c)||x − y|.

For f (x) = x² we have f ′ (x) = 2x, and so whatever c ∈ (0, 2x0 ) arises, we have |f ′ (c)| < 4x0 , and so

|x² − y²| ≤ 4x0 |x − y|, x, y ∈ [0, 2x0 ].

The condition τ L < 1 now requires τ < 1/4x0 .

A valid choice of τ (depending on x0 ) is therefore τ = 1/5x0 : if we start with x(t0 ) = x0 then the solution exists on [t0 , t0 + 1/5x0 ], and then (using our explicit solution)

x(t0 + 1/5x0 ) = 1 / (1/x0 − 1/5x0 ) = (5/4) x0 .

So the solution increases by a factor of 5/4 in time 1/5x0 .

As the solution increases, the existence time we get from our result decreases; we can keep going up to a maximum time

1/(5x0 ) + 1/(5(5/4)x0 ) + 1/(5(5/4)²x0 ) + ⋯ = (1/(5x0 )) · 1/(1 − 4/5) = 1/x0 ,

by which time the solution will have become infinite.
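[The computation above can be checked numerically — a sketch, assuming Python is available: each application of the local theorem advances time by 1/(5x) and multiplies the solution by 5/4, and the step times sum to the blowup time 1/x0.]

```python
x0 = 1.0
t, x = 0.0, x0
for _ in range(200):         # 200 applications of the local existence result
    t += 1.0 / (5.0 * x)     # each step advances time by 1/(5x) ...
    x *= 5.0 / 4.0           # ... and multiplies the solution by 5/4

print(t, 1.0 / x0)           # the accumulated time approaches 1/x0
```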

4. We used the Mean Value Theorem in R in the previous point. We can do some-
thing similar in higher dimensions. Suppose that S ⊂ Rn and that the Jacobian

matrix Df (x) exists for x ∈ S and x ↦ Df (x) is continuous on S. Then for each convex2 compact subset C ⊂ S there exists an LC > 0 such that

|f (x) − f (y)| ≤ LC |x − y|, x, y ∈ C.

This is a consequence of the Mean Value Inequality in Rn (see Multivariable Cal-


culus). Since the argument of Theorem 1.5 only uses the fact that f is Lipschitz on
closed balls, we can apply Theorem 1.5 in this case. [“Bounded derivatives imply
that f is Lipschitz.”]

5. As promised above, we now show that the existence time is uniform for all x0
in any compact subset K of U .

To prove this we will need the following lemma. Remember that for subsets of Rn , compact ≡ closed and bounded. The notion of ‘an open set containing A’ will occur repeatedly: we will call such a set a neighbourhood of A.

Lemma 1.6. If A ⊂ Rn is compact 3 and V is a neighbourhood of A then there


exists ε > 0 such that the ‘ε-neighbourhood of A’ is contained in V :

Aε := {y ∈ Rn : dist(y, A) < ε} ⊂ V.

Note that if A = {a} is a single point, this reduces to the (trivial) result that
any open set V containing a contains B(a, ε) for some ε > 0 (‘trivial’ since it
follows immediately from the fact that a ∈ V and V is open).

Proof. Suppose that this is not the case. Then for every n ∈ N there exists a point yn with dist(yn , A) < 1/n but yn ∉ V . Since dist(yn , A) < 1/n, there exists an ∈ A such that

|an − yn | = dist(yn , A) < 1/n.

Since A is compact, the sequence (an ) has a subsequence (anj ) that converges to some a ∈ A.
2 A set C is convex if u, v ∈ C and λ ∈ [0, 1] implies that λu + (1 − λ)v ∈ C (the line segment
joining u and v lies in C).
3 If A is only closed (and not bounded), then the result of this lemma need not be true: consider the sets A and V in R2 given by

A = {(x, 0) : x ∈ R}, V = {(x, y) : |y| < e^{−x²}}.

Given any ε > 0, we can take x large enough that (x, ε/2) ∉ V [just take x such that e^{−x²} < ε/2], so we do not have Aε ⊂ V for any ε > 0.

Figure 1.1: Left: a neighbourhood of a compact set A contains an ε-neighbourhood
of A for some ε > 0. Right: in the simplest case when A = {a}, any neighbourhood
contains B(a, ε) for some ε > 0.

Since V contains A it certainly contains a; since V is open there exists ε > 0 such that B(a, ε) ⊂ V . Then, since

|ynj − a| ≤ |ynj − anj | + |anj − a|,

it follows that for j sufficiently large |ynj − a| < ε, so ynj ∈ B(a, ε) ⊂ V . But this contradicts our choice of the yn at the beginning of the proof.
Lemma 1.7. Under the same conditions as Theorem 1.5, if Z is a compact subset
of U then there exists τ (Z) such that a unique solution of (1.8) exists for all
t ∈ [t0 , t0 + τ (Z)] for each x0 ∈ Z.

Proof. We use Lemma 1.6 to guarantee there exists r > 0 such that
Zr := {y ∈ Rn : |y − z| ≤ r, for some z ∈ Z} ⊂ U
(the lemma gives this inclusion with |y − z| < ε for some ε > 0; now take any r
with 0 < r < ε). It follows that B(x0 , r) ⊂ U for every x0 ∈ Z.

The set Zr is compact: suppose that (xn ) ∈ Zr ; then xn = zn + pn , where zn ∈ Z and pn ∈ B̄(0, r). Since Z is compact we can find a subsequence nj such that znj → z ∈ Z; since |pnj | ≤ r we can find a further subsequence (which we relabel) such that pnj → p with |p| ≤ r. Then xnj = znj + pnj → z + p ∈ Zr , so Zr is compact as claimed.

Since Zr is compact, as in the main proof we can let M := supx∈Zr |f (x)| and
set L = LZr . The proof now follows exactly the same lines, showing that τ can be
taken uniformly for all x0 ∈ Z.

We will use this uniform existence time later to investigate what happens if a
solution cannot be extended to exist for all t ≥ 0.

6. In fact, we often treat a (simple) situation in which Theorem 1.5 is not directly
applicable. If we have a planar system and write it in polar coordinates then the
equations (for the radial direction) will be of the form

ṙ = f (r), r(0) = r0 ;

clearly in this case r ∈ [0, ∞), so f : [0, ∞) → R, and since [0, ∞) is not open we
cannot apply Theorem 1.5 directly. We will now analyse a differential inequality
that will enable us to deal with this case (on Examples 2).

1.3.1 Continuous dependence on x0 and f

We now investigate how solutions depend on the ‘data’, i.e. the initial condition
x0 and the ‘model’ f . We will need the following result.

Corollary 1.8 (Gronwall’s inequality, integral version). If u : [0, τ ] → R is continuous and

u(t) ≤ C0 + ∫_0^t [Lu(s) + ε] ds, t ∈ [0, τ ], (1.11)

for some L ≥ 0, then

u(t) ≤ C0 e^{Lt} + (ε/L)(e^{Lt} − 1), t ∈ [0, τ ]. (1.12)

(The proof also implies a differential version of this result: if u : [0, τ ] → R is


differentiable and u̇ ≤ Lu + ε, then (1.12) holds with C0 = u(0). This version,
though, does not require L ≥ 0.)
Proof. Let U (t) := C0 + ∫_0^t [Lu(s) + ε] ds; then u(t) ≤ U (t) [by (1.11)] and, by the Fundamental Theorem of Calculus, U is differentiable with

U̇ (t) = Lu(t) + ε ≤ LU (t) + ε.

Multiply the inequality U̇ − LU ≤ ε by the integrating factor e^{−Lt} and rearrange:

e^{−Lt} (U̇ − LU ) ≤ e^{−Lt} ε  ⇒  (d/dt)[ e^{−Lt} U (t) ] ≤ e^{−Lt} ε.

Now integrating both sides from 0 to t gives

e^{−Lt} U (t) − U (0) ≤ ε ∫_0^t e^{−Ls} ds = (ε/L)(1 − e^{−Lt}),

i.e.

U (t) ≤ U (0) e^{Lt} + (ε/L)(e^{Lt} − 1), t ∈ [0, τ ].

Since U (0) = C0 and u(t) ≤ U (t), equation (1.12) now follows.

Theorem 1.9 (Continuous dependence). Suppose that U ⊂ Rn is open and f, g : U → Rn are both continuous. Assume that

|f (u) − g(u)| ≤ ε and |f (u) − f (v)| ≤ L|u − v| for all u, v ∈ U.

If

ẋ = f (x), x(0) = x0 , and ẏ = g(y), y(0) = y0

have solutions t ↦ x(t) and t ↦ y(t) respectively, then while x(t), y(t) ∈ U ,

|x(t) − y(t)| ≤ |x0 − y0 | e^{L|t|} + (ε/L)(e^{L|t|} − 1). (1.13)

Proof. Using the integral formulation, we have

|x(t) − y(t)| ≤ |x0 − y0 | + ∫_0^t |f (x(s)) − g(y(s))| ds. (1.14)

Since

|f (x(s)) − g(y(s))| ≤ |f (x(s)) − f (y(s))| + |f (y(s)) − g(y(s))| ≤ L|x(s) − y(s)| + ε, (1.15)

then, setting δ(t) := |x(t) − y(t)|, from (1.14) we obtain

δ(t) ≤ δ(0) + ∫_0^t [Lδ(s) + ε] ds,

and (1.13) now follows using Gronwall’s inequality (Corollary 1.8).

Note that if f = g and we consider the equation ẋ = f (x) with two different
initial conditions x0 and y0 then we get ε = 0 in this result and (1.13) becomes

|x(t) − y(t)| ≤ |x0 − y0 |eL|t| .

The solutions are continuous with respect to the initial condition, but can separate exponentially fast. This is the basis of the ‘butterfly effect’: small changes in initial conditions can be magnified very rapidly. [Note, however, that you can see this sort of behaviour even in the simple linear system ẋ = λx with λ > 0, and there is no ‘chaos’ there.]

(This also gives another proof of the uniqueness of solutions, since if x0 = y0


then x(t) = y(t) for t ≥ 0.)
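[A numerical sketch of (1.13) with f = g (so ε = 0), not from the original notes, assuming Python with numpy and scipy: f(x) = sin x is globally Lipschitz with L = 1, and two nearby solutions stay within the Gronwall envelope |x0 − y0| e^{Lt}.]

```python
import numpy as np
from scipy.integrate import solve_ivp

f = lambda t, x: np.sin(x)           # globally Lipschitz with L = 1
L, delta = 1.0, 1e-6
t_eval = np.linspace(0.0, 5.0, 101)

sol_x = solve_ivp(f, (0, 5), [1.0], t_eval=t_eval, rtol=1e-10, atol=1e-12)
sol_y = solve_ivp(f, (0, 5), [1.0 + delta], t_eval=t_eval, rtol=1e-10, atol=1e-12)

gap = np.abs(sol_x.y[0] - sol_y.y[0])
bound = delta * np.exp(L * t_eval)   # the right-hand side of (1.13), eps = 0
assert np.all(gap <= bound + 1e-9)
print(gap[-1], bound[-1])
```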

1.3.2 Maximal interval of existence of a solution

Definition 1.10. Suppose that U is an open subset of Rn , and that f : U → Rn is


locally Lipschitz. The maximal interval of existence J(x0 ) is the largest interval J
that contains 0, such that the solution of the initial value problem

ẋ = f (x) x(0) = x0

exists for all t ∈ J [with x(t) ∈ U for all t ∈ J].

(If f is only defined on U then ‘existing’ requires x(s) ∈ U .)

If the solution can be found explicitly, then we can compute J(x0 ) exactly.

Example: ẋ = x2 with x(0) = x0 .

The function x ↦ x² is locally Lipschitz on R, so there is a solution at least on some interval [−τ, τ ]. We have already solved the equation explicitly: if x0 = 0 then x(t) = 0 for all t ∈ R, and if x0 ≠ 0 then

x(t) = 1 / (x0^{−1} − t).

So the maximal interval of existence is

J(x0 ) = (−∞, 1/x0 ) for x0 > 0, J(0) = (−∞, ∞), and J(x0 ) = (1/x0 , ∞) for x0 < 0.
Figure 1.2: Solutions of ẋ = x2 : finite-time blowup.

Even if we cannot find solutions explicitly we can say something qualitative


about the maximal interval of existence.
Theorem 1.11 (Maximal interval of existence). Suppose that U ⊂ Rn is open
and that f : U → Rn is locally Lipschitz (as in Theorem 1.5). Then the maximal
interval of existence J is open.

Proof. We consider the right-hand endpoint of J, i.e. solving for t ≥ 0; we want


to show that there is a maximal interval [0, β) on which the solution exists (and
remains in U ). Let
T := {t > 0 : the equation has a solution x : [0, t] → Rn
with x(s) ∈ U for all s ∈ [0, t]};
this set is not empty (by the local version of Picard’s Theorem). If it is unbounded
then a solution exists on the semi-open interval [0, ∞) and we are done.

Otherwise, set β = sup T < ∞. If β ∈ T then the solution exists on [0, β], with x(β) ∈ U . But now the local version of Picard’s Theorem allows us to extend the solution beyond t = β, contradicting the definition of β. So we must have β ∉ T , and the solution exists only on [0, β).
Theorem 1.12 (Unbounded solution). Suppose that U ⊂ Rn is open, and that
f : U → Rn is locally Lipschitz. Let J = (α, β) be the maximal interval of existence
for
ẋ = f (x) x(0) = x0 .

If β < ∞ then for every compact subset K of U there exists a time tK < β such that x(t) ∉ K for all tK ≤ t < β. In particular, if U = Rn , then |x(t)| → ∞ as t → β.

Because the sets

KR := {x ∈ Rn : dist(x, ∂U ) ≥ 1/R and |x| ≤ R}

are compact, we could think of the first part of this result as saying ‘x(t) → ∂U as t → β’.

The behaviour when U = Rn is ‘finite-time blowup’ of the solution.

Proof. Suppose that β < ∞ but that there exists a compact set K and a sequence (tn ) with tn ↑ β such that x(tn ) ∈ K. By Lemma 1.7 the local existence time for solutions starting at initial conditions in K is some τ > 0, uniform over K. So if n is large enough that tn + τ > β, the solution starting at x(tn ) extends x beyond t = β, a contradiction.

If U = Rn then taking the compact set K = {|x| ≤ R} we see that for each
R there exists tR < β such that |x(t)| ≥ R for all t ≥ tR , i.e. |x(t)| → ∞ as
t → β.

We have already seen the example ẋ = x2 , for which the solution blows up in
finite time. For a (much more artificial) example where the solution tends to the
boundary of U , consider
ẋ = f (x) = −1/x, x(0) = x0 > 0.

Take U = (0, ∞). To solve this equation we separate variables:

−x dx = dt  ⇒  x0²/2 − x(t)²/2 = t  ⇒  x(t) = √(x0² − 2t).

The solution is contained in U on the maximal interval J = (−∞, x0²/2); and

lim_{t→x0²/2} x(t) = 0 ∈ ∂U.

If U = Rn and there is no blowup then Theorem 1.12 means that the solution
must exist for all t ∈ R. For globally Lipschitz equations (e.g. linear equations)
there is no blowup (see Examples 2) which gives another proof that such equations
enjoy global existence of solutions.

Chapter 2

Linear systems

We now consider the autonomous linear system

ẋ = Ax x(0) = x0 ,

where x ∈ Rn and A is an n × n matrix.

We have remarked earlier that a unique solution of this equation exists for all
t ∈ R (since globally Lipschitz equations have unique solutions for all time). We
also showed earlier (see discussion after (1.6)) that close to t = 0 the solution is of
the form

x(t) = e^{tA} x0 = Σ_{k=0}^{∞} ((tA)^k / k!) x0 .

We now show that etA is defined for all t ∈ R, and discuss how to compute it.

We start with two two-dimensional examples; we will give a third when we


have an easier way to compute the matrix exponential.

Example 1: distinct eigenvalues

A = \begin{pmatrix} λ & 0 \\ 0 & µ \end{pmatrix}, so (tA)^k = \begin{pmatrix} t^k λ^k & 0 \\ 0 & t^k µ^k \end{pmatrix}.

Computing the sum entry-by-entry we obtain

e^{tA} = \begin{pmatrix} e^{λt} & 0 \\ 0 & e^{µt} \end{pmatrix}.

Example 2: Jordan block

A = \begin{pmatrix} λ & ε \\ 0 & λ \end{pmatrix}, so (tA)^k = \begin{pmatrix} (tλ)^k & ε k t^k λ^{k−1} \\ 0 & (tλ)^k \end{pmatrix};

adding entry-by-entry now yields

e^{tA} = \begin{pmatrix} e^{λt} & εt e^{λt} \\ 0 & e^{λt} \end{pmatrix},

since

Σ_{k=0}^{∞} ε k t^k λ^{k−1} / k! = ε Σ_{k=1}^{∞} t^k λ^{k−1} / (k−1)! = εt Σ_{k=0}^{∞} t^k λ^k / k! = εt e^{λt}.

We now show that etA is always well defined, i.e. the series always converges.
Lemma 2.1. For any n × n matrix A = (aij ) the series

Σ_{k=0}^{∞} (tA)^k / k!

converges absolutely for every t ∈ R, and so e^{tA} is well defined for every t.

Proof. Write aij (k) for the ij entry of A^k , and let a = max_{i,j} |aij |. Then

|aij (2)| = | Σ_{l=1}^{n} ail alj | ≤ na² ≤ (na)²;

assuming that |aij (k)| ≤ (na)^k we have

|aij (k + 1)| = | Σ_{l=1}^{n} ail (k) alj | ≤ n (na)^k a = (na)^{k+1}.

So by induction |aij (k)| ≤ (na)^k . Therefore

Σ_{k=0}^{∞} |t^k aij (k)| / k! ≤ Σ_{k=0}^{∞} |t|^k (na)^k / k! = exp(na|t|),

so each ij entry converges absolutely by the comparison test, hence so does the full matrix sum Σ_{k=0}^{∞} (tA)^k / k!, and e^{tA} is well defined.

We now prove some properties of etA .

Lemma 2.2 (Properties of etA ). Let A, B, and T be n × n matrices, with T


invertible.

(i) If B = T −1 AT then exp(B) = T −1 exp(A)T ;

(ii) if AB = BA then exp(A + B) = exp(A) exp(B); and

(iii) exp(−A) = (exp A)−1 .

Proof. (i) Observe that

(T −1 AT )k = (T −1 AT )(T −1 AT ) · · · (T −1 AT ) = T −1 Ak T

and clearly T^{−1}(A + B)T = T^{−1}AT + T^{−1}BT , so for any n ≥ 0

T^{−1} ( Σ_{k=0}^{n} A^k / k! ) T = Σ_{k=0}^{n} (T^{−1}AT )^k / k!.

Now taking the limit n → ∞ gives the result.

(ii) This is a little painful. First, since A and B commute, we have

(A + B)^n / n! = (1/n!) Σ_{j=0}^{n} (n choose j) A^j B^{n−j} = Σ_{j+k=n} (A^j / j!)(B^k / k!).

So now

e^{A+B} = Σ_{n=0}^{∞} Σ_{j+k=n} (A^j / j!)(B^k / k!);

we want to show that the right-hand side here is equal to

( Σ_{j=0}^{∞} A^j / j! ) ( Σ_{k=0}^{∞} B^k / k! ).

To do this, note that we have

Σ_{n=0}^{2m} Σ_{j+k=n} (A^j / j!)(B^k / k!) − ( Σ_{j=0}^{m} A^j / j! )( Σ_{k=0}^{m} B^k / k! ) = Σ′ (A^j / j!)(B^k / k!) + Σ′′ (A^j / j!)(B^k / k!),

where Σ′ is over j + k ≤ 2m, 0 ≤ j ≤ m, m + 1 ≤ k ≤ 2m, and Σ′′ is over j + k ≤ 2m, m + 1 ≤ j ≤ 2m, 0 ≤ k ≤ m.

Now,

∥ Σ′ (A^j / j!)(B^k / k!) ∥ ≤ ( Σ_{j=0}^{m} ∥A∥^j / j! ) ( Σ_{k=m+1}^{2m} ∥B∥^k / k! );

the first factor tends to e^{∥A∥} and the second tends to zero, since Σ_{k=0}^{∞} ∥B∥^k / k! converges.

Similarly, ∥Σ′′∥ → 0 as m → ∞. This shows the required equality, and hence e^{A+B} = e^A e^B as claimed.

At least now (iii) is simple: take B = −A.
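[A numerical illustration of (ii), not from the original notes, assuming Python with scipy: scipy.linalg.expm computes the matrix exponential, and exp(A + B) = exp(A) exp(B) holds for commuting matrices but fails in general.]

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[1.0, 2.0], [0.0, 1.0]])
B = 3.0 * A                              # B commutes with A
C = np.array([[0.0, 1.0], [1.0, 0.0]])   # C does not commute with A

print(np.allclose(expm(A + B), expm(A) @ expm(B)))   # True: AB = BA
print(np.allclose(expm(A + C), expm(A) @ expm(C)))   # False in general
```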

We can now treat our third 2 × 2 example.

Example 3: complex eigenvalues

The matrix

A = \begin{pmatrix} a & b \\ −b & a \end{pmatrix}

has eigenvalues a ± ib. We want to compute e^{At} in this case.

Note that if we set

B = \begin{pmatrix} a & 0 \\ 0 & a \end{pmatrix}, C = \begin{pmatrix} 0 & b \\ −b & 0 \end{pmatrix}

then

BC = \begin{pmatrix} 0 & ab \\ −ab & 0 \end{pmatrix} = CB,

so we can use (ii) of the above lemma. We have already shown that e^{Bt} = e^{at} I; for C we have C² = −b² I, so

C^{2n} = (−1)^n b^{2n} I, C^{2n+1} = (−1)^n b^{2n} C,

and summing the series gives

e^{Ct} = cos(bt) I + (sin(bt)/b) C = \begin{pmatrix} cos bt & sin bt \\ −sin bt & cos bt \end{pmatrix},

and so

exp(At) = e^{Bt} e^{Ct} = e^{at} \begin{pmatrix} cos bt & sin bt \\ −sin bt & cos bt \end{pmatrix}.
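[A quick check of this formula — again a sketch assuming scipy; the values of a, b, t are arbitrary illustrative choices.]

```python
import numpy as np
from scipy.linalg import expm

a, b, t = 0.5, 2.0, 0.7
A = np.array([[a, b], [-b, a]])

R = np.exp(a * t) * np.array([[np.cos(b * t), np.sin(b * t)],
                              [-np.sin(b * t), np.cos(b * t)]])
print(np.allclose(expm(t * A), R))   # True: expm(tA) = e^{at} x rotation
```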

Each entry of the matrix e^{tA} depends on t, so we can differentiate it: we define (d/dt) e^{tA} to be the matrix obtained by differentiating each entry.

Lemma 2.3. We have

(d/dt) e^{tA} = A exp(tA) = exp(tA) A.

Proof. Using the definition of the derivative,

(d/dt) exp(tA) = lim_{h→0} [exp((t + h)A) − exp(tA)] / h
              = lim_{h→0} [exp(tA) exp(hA) − exp(tA)] / h
              = exp(tA) lim_{h→0} [exp(hA) − I] / h
              = exp(tA) lim_{h→0} [hA + (hA)²/2 + ⋯] / h.

Note that we could write the term inside the limit as

A + lim_{h→0} h [ A²/2 + h A³/3! + h² A⁴/4! + ⋯ ];

the expression inside the square brackets satisfies

∥ A²/2 + h A³/3! + h² A⁴/4! + ⋯ ∥ ≤ ∥A∥²/2 + h ∥A∥³/3! + h² ∥A∥⁴/4! + ⋯ ,

and so for h ≤ 1 its norm is certainly bounded by e^{∥A∥}; the limit as h → 0 is therefore A, and the overall limit is exp(tA)A. That this is the same as A exp(tA) can be seen by taking k terms in the sum for exp(tA), which all commute with A, and letting k tend to infinity.

Therefore x(t) = exp(tA)x0 is the solution of the IVP

ẋ = Ax, x(0) = x0 ,

since ẋ = A exp(tA)x0 = Ax(t) and exp(0A) = I, so x(0) = x0 .
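[A finite-difference sanity check of Lemma 2.3 — a sketch assuming scipy; the matrix A, time t, and step h are illustrative choices.]

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[0.0, 1.0], [-1.0, -0.5]])
t, h = 1.3, 1e-6

# central difference approximation to (d/dt) exp(tA)
deriv_fd = (expm((t + h) * A) - expm((t - h) * A)) / (2 * h)
print(np.allclose(deriv_fd, A @ expm(t * A), atol=1e-6))   # True
print(np.allclose(A @ expm(t * A), expm(t * A) @ A))       # A, e^{tA} commute
```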

2.1 General linear n × n systems with constant coefficients

In general given a matrix A it is hard to compute etA directly. One possibility is to


compute it for a simpler related matrix and use part (i) of Lemma 2.2. Solutions
can also be obtained directly using the eigenvalues and eigenvectors (as in the
Differential Equations module). We discuss both ways here, briefly.

Consider ẋ = Ax where x(0) = x0 ∈ Rn and A is an n × n matrix. If we make


a linear change of coordinates x = T y then y = T −1 x and so

ẏ = T −1 ẋ = T −1 Ax = [T −1 AT ]y.

Recall from Algebra 1 that T can be chosen so that AJ = T −1 AT is in (real)


Jordan normal form. We now recall how this works in the 2 × 2 case.

2.1.1 Jordan canonical form over R: 2 × 2 case

First note that if A is a multiple of the identity then we do not need to do any coordinate transformation to put A into its canonical form. So we will exclude this case in what follows.

Two distinct real eigenvalues

Suppose that A has real eigenvalues λ1 and λ2 with eigenvectors v1 and v2


(which will be linearly independent).

The basis change matrix is T = [v1 v2 ]. (Recall that we have x = T y, where


x is the original system and y is the ‘new system’: the unit coordinate axes for y
become v1 and v2 in the x coordinates, which is what we want.)

The new matrix in y coordinates is T^{−1}AT , which will be

[v1 v2]^{−1} A [v1 v2] = [v1 v2]^{−1} [λ1 v1  λ2 v2]
                      = [v1 v2]^{−1} [v1 v2] \begin{pmatrix} λ1 & 0 \\ 0 & λ2 \end{pmatrix}
                      = \begin{pmatrix} λ1 & 0 \\ 0 & λ2 \end{pmatrix}.

[The ‘trick’ here, and in other cases, is to write AT as T AJ (moving from the first to the second line), so that T^{−1}T cancels.]

So in this case

AJ = \begin{pmatrix} λ1 & 0 \\ 0 & λ2 \end{pmatrix}.

A repeated real eigenvalue with only one eigenvector

If we have a repeated real eigenvalue λ for which there is only a single eigen-
vector v1 , then to find the Jordan Canonical Form we find a second vector v2 such
that
(A − λI)v2 = εv1 and so Av2 = λv2 + εv1 .
It is usual to choose ε = 1, but this more general choice will be useful later.

If we make a change of basis to [v1 v2] then A becomes

[v1 v2]^{−1} A [v1 v2] = [v1 v2]^{−1} [λv1  εv1 + λv2]
                      = [v1 v2]^{−1} [v1 v2] \begin{pmatrix} λ & ε \\ 0 & λ \end{pmatrix}
                      = \begin{pmatrix} λ & ε \\ 0 & λ \end{pmatrix}.

So here

AJ = \begin{pmatrix} λ & ε \\ 0 & λ \end{pmatrix}.

Complex conjugate eigenvalues

If the matrix A is real and it has complex eigenvalues, these must occur in
complex conjugate pairs. So the only way we can have complex eigenvalues for a
2 × 2 matrix is if the eigenvalues are ρ ± iω. So there are two distinct eigenvalues:
the usual (complex) Jordan canonical form of A would be

\begin{pmatrix} ρ + iω & 0 \\ 0 & ρ − iω \end{pmatrix};

but we want a real matrix instead.

If
Av = (ρ + iω)v (2.1)
and we split the eigenvector v into its real and imaginary parts, v = v1 + iv2 , then
we have

A(v1 + iv2 ) = (ρ + iω)(v1 + iv2 ) = (ρv1 − ωv2 ) + i(ωv1 + ρv2 ). (2.2)

[Note that the other eigenvector is v1 − iv2 with eigenvalue ρ − iω: we get this by
taking complex conjugates on both sides of (2.1), since A is real.]

Taking real and imaginary parts in (2.2) we obtain

Av1 = ρv1 − ωv2 Av2 = ωv1 + ρv2 .

If we change to the basis1 {v1 , v2 } then with respect to this basis A becomes

[v1 v2]^{−1} A [v1 v2] = [v1 v2]^{−1} [Av1  Av2]
                      = [v1 v2]^{−1} [ρv1 − ωv2  ωv1 + ρv2]
                      = [v1 v2]^{−1} [v1 v2] \begin{pmatrix} ρ & ω \\ −ω & ρ \end{pmatrix},

so we have

AJ = \begin{pmatrix} ρ & ω \\ −ω & ρ \end{pmatrix}.

2.1.2 Jordan canonical form over R: n × n case.

For an n × n matrix, we can use a coordinate transformation to give AJ = diag(J1 , . . . , Jk ), where each Jordan block Jm has one of the following forms: for a real eigenvalue αm ,

Jm = (αm );

for a complex conjugate pair βm ± iγm ,

Jm = \begin{pmatrix} βm & γm \\ −γm & βm \end{pmatrix};
1 The real vectors v1 and v2 are linearly independent. We know, since they correspond to different eigenvalues, that v+ = v1 + iv2 and v− = v1 − iv2 are linearly independent over C, so that if

α1 v+ + α2 v− = 0

then α1 = α2 = 0. Now suppose that β1 v1 + β2 v2 = 0; since

2v1 = v+ + v− and 2v2 = −i(v+ − v− )

we have

β1 (v+ + v− ) − β2 i(v+ − v− ) = (β1 − iβ2 )v+ + (β1 + iβ2 )v− = 0.

Since v+ and v− are linearly independent over C, it follows that we must have

β1 − iβ2 = β1 + iβ2 = 0,

which implies that β1 = β2 = 0, and so v1 and v2 are indeed linearly independent.

or, for a real eigenvalue λm with a chain of generalised eigenvectors,

Jm = \begin{pmatrix} λm & ε & 0 & ⋯ & 0 \\ 0 & λm & ε & ⋱ & ⋮ \\ ⋮ & & ⋱ & ⋱ & 0 \\ & & & λm & ε \\ 0 & 0 & ⋯ & 0 & λm \end{pmatrix},

or the analogous block for a complex pair µm ± iνm , built from the 2 × 2 blocks

Dm = \begin{pmatrix} µm & νm \\ −νm & µm \end{pmatrix}

down the diagonal, with 2 × 2 blocks εI on the superdiagonal:

Jm = \begin{pmatrix} Dm & εI & & \\ & Dm & εI & \\ & & ⋱ & ⋱ \\ & & & Dm \end{pmatrix}.

In the usual Jordan form we use ε = 1, but in fact any ε ≠ 0 can be chosen by an appropriate rescaling of the basis vectors, and we will use the form with small ε later in the module.

2.1.3 Coordinate transformations and solution of ẋ = Ax

We return to trying to solve ẋ = Ax with x(0) = x0 ∈ Rn , where A is an n × n matrix. We have seen that we can make a linear change of coordinates y = T^{−1}x so that

ẏ = [T^{−1}AT ]y, y(0) = T^{−1}x0 ,

with AJ := T^{−1}AT the real Jordan canonical form of A.

There are now two ways to think about finding the solution of ẋ = Ax.

1. We just want to calculate eAt . Using part (i) of Lemma 2.2 we know that

exp(tA) = T exp(t(T −1 AT ))T −1 = T exp(tAJ )T −1

and so the solution of the IVP is

x(t) = T exp(tAJ )T −1 x0 .

Here we don’t actually use the equation for y at all; the Jordan canonical form is
just a useful trick for finding an expression for eAt more easily.

2. But it can be useful to think about the y system: this is a linear system that satisfies the simpler equation ẏ = AJ y.

Because AJ is in canonical form, with the coordinate axes as the special (eigen)directions, the behaviour of solutions is easier to understand and the phase portrait (the figure that shows the qualitative behaviour of all solutions) is easier to draw. The solutions of ẋ = Ax are solutions of ẏ = AJ y transformed back to the original coordinates using T, so the phase portrait for x is that for y transformed by T.

The type of picture (node, saddle, etc.) will not be changed by this transformation; it just ‘moves the eigenvectors’. We use this observation in the next section to classify all 2D linear systems and draw their phase portraits.

Before that, we look at a simple 3D system.

Example 3D system

Consider

ẋ = ( 1 0 0 ; 1 2 0 ; 1 0 −1 ) x,  x(0) = x0.

The eigenvalues are 1, 2, and −1, with respective eigenvectors

(2, −2, 1)ᵀ,  (0, 1, 0)ᵀ,  (0, 0, 1)ᵀ.

The matrix T to change coordinates is then

T = ( 2 0 0 ; −2 1 0 ; 1 0 1 ),  with  T⁻¹ = ½ ( 1 0 0 ; 2 2 0 ; −1 0 2 ).

Then

AJ = T⁻¹AT = ( 1 0 0 ; 0 2 0 ; 0 0 −1 ),  e^{tAJ} = ( e^t 0 0 ; 0 e^{2t} 0 ; 0 0 e^{−t} ),

and

e^{tA} = T e^{tAJ} T⁻¹
       = ½ ( 2 0 0 ; −2 1 0 ; 1 0 1 ) ( e^t 0 0 ; 0 e^{2t} 0 ; 0 0 e^{−t} ) ( 1 0 0 ; 2 2 0 ; −1 0 2 )
       = ½ ( 2 0 0 ; −2 1 0 ; 1 0 1 ) ( e^t 0 0 ; 2e^{2t} 2e^{2t} 0 ; −e^{−t} 0 2e^{−t} )
       = ( e^t 0 0 ; −e^t + e^{2t}  e^{2t} 0 ; ½(e^t − e^{−t}) 0 e^{−t} ).

So

x(t) = ( e^t 0 0 ; −e^t + e^{2t}  e^{2t} 0 ; ½(e^t − e^{−t}) 0 e^{−t} ) x0.
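[As an aside, a computation like this is easy to check numerically. The following sketch, which is not part of the notes, uses Python with numpy and scipy (both assumed available) to compare the hand-computed formula for e^{tA} with scipy's matrix exponential.]

# Sketch: verify e^{tA} = T e^{t AJ} T^{-1} for the 3D example above.
import numpy as np
from scipy.linalg import expm

A = np.array([[1.0, 0.0, 0.0],
              [1.0, 2.0, 0.0],
              [1.0, 0.0, -1.0]])
T = np.array([[2.0, 0.0, 0.0],
              [-2.0, 1.0, 0.0],
              [1.0, 0.0, 1.0]])
Tinv = np.linalg.inv(T)

AJ = Tinv @ A @ T                      # should be diag(1, 2, -1)
assert np.allclose(AJ, np.diag([1.0, 2.0, -1.0]))

t = 0.7
et, e2t, emt = np.exp(t), np.exp(2 * t), np.exp(-t)
closed_form = np.array([[et, 0.0, 0.0],
                        [-et + e2t, e2t, 0.0],
                        [0.5 * (et - emt), 0.0, emt]])
assert np.allclose(T @ expm(t * AJ) @ Tinv, closed_form)
assert np.allclose(expm(t * A), closed_form)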

2.2 Classification of 2D linear systems

First we look at some systems in Jordan canonical form. Then we look at general
systems in terms of their eigenvalues. From the comments in the previous section,
the classification of the ‘canonical system’ transfers immediately to the general
case, since the only difference is a change of coordinates. [One could turn this into
a definition: the classification of a linear system ẋ = Ax is that of the ‘canonical
system’ ż = AJ z, where AJ is the (real) Jordan canonical form of A.]

2.2.1 Systems in Jordan canonical form

We look at the possible (real) Jordan canonical forms for 2 × 2 matrices.

Two distinct real eigenvalues


A = ( λ 0 ; 0 µ ),  ẋ = Ax,  x(0) = x0,  x(t) = e^{tA} x0 = ( e^{tλ} 0 ; 0 e^{tµ} ) x0.

• If µ < λ < 0 then we have a stable node, and for every x0, x(t) → 0 as t → ∞. The solution is

x(t) = ( e^{λt} x01 ; e^{µt} x02 ).

Note that we have x2(t) = C x1(t)^{µ/λ}, and µ/λ > 1, so the solution curves are tangent to the x1 axis. (Solutions approach the origin tangent to the slower direction.)

Figure 2.1: A stable node: eigenvalues µ < λ < 0. Solutions approach the origin
tangent to the slower direction.

• If λ, µ > 0 then we have an unstable node: if |x0| ≠ 0 then |x(t)| → ∞ as t → ∞. Solutions move away tangent to the slower direction (same argument as before).

Figure 2.2: An unstable node: eigenvalues λ, µ > 0.

• For µ < 0 < λ the origin is a saddle point. For most initial conditions |x(t)| → ∞, unless x01 = 0, in which case x(t) → 0. Solutions move on the curves |x1|^{|µ|} |x2|^{λ} = constant.

A repeated real eigenvalue λ

The Jordan canonical form is either

( λ 0 ; 0 λ )  or  ( λ ε ; 0 λ ).

Figure 2.3: A saddle: eigenvalues µ < 0 < λ.

In the first (diagonal) case we have a stable or unstable ‘star’.

Figure 2.4: A ‘stable star’ for a repeated negative real eigenvalue.

The phase portrait is probably easier to draw by looking at the equations

ẋ = λx + εy
ẏ = λy,

than by looking at the solution.

Figure 2.5: A stable ‘improper node’ with repeated eigenvalue λ < 0.

To find the explicit form of the solutions we can use part (ii) of Lemma 2.2 to compute e^{tA}: we have tA = tΛ + tE, where Λ = ( λ 0 ; 0 λ ) and E = ( 0 ε ; 0 0 ). Since ΛE = λE = EΛ and E² = 0, we have

e^{tA} = e^{t(Λ+E)} = e^{tΛ} e^{tE} = ( e^{tλ} 0 ; 0 e^{tλ} ) ( 1 tε ; 0 1 ) = ( e^{tλ}  εt e^{tλ} ; 0  e^{tλ} ).

[We did this in a more painful way before.]

Complex conjugate eigenvalues

For eigenvalues a ± ib the (real) JCF is

A = ( a b ; −b a ),

so the equations are ẋ = ax + by, ẏ = −bx + ay.

It’s easier to understand the solutions by using polar coordinates (r, θ) than by
using the explicit solution. We have

rṙ = xẋ + y ẏ = ar2 ⇒ ṙ = ar

and
r2 θ̇ = xẏ − y ẋ = −br2 ⇒ θ̇ = −b.
The origin is a stable focus if a < 0, an unstable focus if a > 0, and a centre if a = 0.

Figure 2.6: Unstable focus (left) and stable focus (right).

Figure 2.7: A centre: many periodic orbits (b > 0).
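[The case analysis above is mechanical enough to automate. The following rough sketch, not part of the notes, uses Python with numpy; the helper name classify is illustrative, and the boundary cases (zero or borderline eigenvalues) are only partially handled.]

# Sketch: classify the phase portrait of a 2x2 linear system x' = Ax.
import numpy as np

def classify(A, tol=1e-9):
    A = np.asarray(A, dtype=float)
    lam = np.linalg.eigvals(A)
    if abs(lam[0].imag) > tol:              # complex pair rho +/- i omega
        rho = lam[0].real
        if rho < -tol: return "stable focus"
        if rho > tol:  return "unstable focus"
        return "centre"
    l1, l2 = sorted(lam.real)
    if l1 < -tol and l2 > tol: return "saddle"
    if abs(l1 - l2) < tol:                  # repeated real eigenvalue
        rank = np.linalg.matrix_rank(A - l1 * np.eye(2))
        kind = "star" if rank == 0 else "improper node"
        return ("stable " if l1 < -tol else "unstable ") + kind
    if l2 < -tol: return "stable node"
    if l1 > tol:  return "unstable node"
    return "degenerate (a zero eigenvalue)"

print(classify([[0, 1], [-1, 0]]))   # centre
print(classify([[1, 0], [0, -2]]))   # saddle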

Chapter 3

Qualitative theory of ODEs

From now on we consider autonomous differential equations

ẋ = f(x),  x ∈ X,

where X ⊂ Rn is called the ‘state space’, and f : X → Rn is continuously differentiable (so that solutions to the IVP exist and are unique); we will take X = Rn most of the time.

3.1 General concepts: flows, orbits, invariant sets,


and stability

We can think of the solutions of

ẋ = f (x), x(0) = x0 (3.1)

as functions of the initial value x0 and the time t. Then we can try to describe all
solutions rather than study each IVP separately. If the solutions exist for all t ∈ R
(‘global solutions’) then we can use the differential equation to define a ‘flow’, i.e. a
map ϕ : X × R → X, (x0 , t) 7→ ϕt (x0 ), where ϕt (x0 ) is the solution of (3.1) at time
t (we could also write x(t; x0 ), or something similar, if we wanted the notation to
look more like that in the previous chapter).

So ϕt(x) satisfies

(d/dt) ϕt(x) = f(ϕt(x)),  ϕ0(x) = x.
For example, for ẋ = Ax we have ϕt (x0 ) = eAt x0 .
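[A quick numerical sanity check of the flow property for this linear example, as a sketch in Python (numpy and scipy assumed; the helper phi is illustrative):]

# Sketch: check phi_s(phi_t(x)) = phi_{s+t}(x) for the linear flow e^{At}x.
import numpy as np
from scipy.linalg import expm

A = np.array([[0.0, 1.0], [-1.0, -0.5]])
x0 = np.array([1.0, 2.0])
phi = lambda t, x: expm(A * t) @ x

s, t = 0.3, 1.1
assert np.allclose(phi(s, phi(t, x0)), phi(s + t, x0))
assert np.allclose(phi(-t, phi(t, x0)), x0)   # (phi_t)^{-1} = phi_{-t}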

Since solutions of (3.1) are unique, we have

ϕs (ϕt (x0 )) = ϕs+t (x0 ) = ϕt+s (x0 ) (3.2)

or (perhaps more abstractly)

ϕ0 = id, ϕs ◦ ϕt = ϕt+s = ϕt ◦ ϕs . (3.3)

Figure 3.1: The ‘flow property’ ϕs ◦ ϕt = ϕt+s = ϕt ◦ ϕs , which is a consequence of


uniqueness

In particular the flow is ‘reversible’ with

(ϕt )−1 = ϕ−t . (3.4)

For each t ∈ R, ϕt : X → X is a bijection. Furthermore, it follows from the


continuous dependence on initial conditions (Theorem 1.9) that ϕt : X → X is
continuous for each t ∈ R. Finally, note that since x(t; x0 ) [the solution of (3.1)] is
differentiable, it is a continuous function of t, so ϕt (x0 ) is continuous in t for every
x0 .

Note that it is possible to define a flow ‘abstractly’: it need not arise from a
differential equation.

Definition 3.1. Let X be a subset of Rn (in fact we could take (X, d) to be any
metric space). A map (t, x) 7→ ϕt (x) from R × X → X is a flow on X if

(i) ϕ0 (x) = x for every x ∈ X;

(ii) ϕt+s = ϕt ◦ ϕs = ϕs ◦ ϕt for every t, s ∈ R;

(iii) the map (t, x) 7→ ϕt (x) is continuous.

We now define various ‘orbits’ of a flow, which correspond to solutions of the
ODE.
Definition 3.2. The orbit of a point x ∈ X (or the trajectory through x) is the
set
O(x) := {ϕt (x) : t ∈ R} ⊂ X (3.5)
or sometimes the indexed set (ϕt (x))t∈R if the time parametrisation is important
[this indexed set contains much more information than O(x), since it gives the
direction and speed at which the orbit is traversed].

The forward orbit of x is the set


O+ (x) := {ϕt (x) : t ≥ 0}
[or (ϕt (x))t≥0 ] and the backward orbit of x is
O− (x) := {ϕt (x) : t ≤ 0}
[or (ϕt (x))t≤0 ].

Figure 3.2: The forward and backward orbits O±(x) through a point x.

A point x0 ∈ X is called a fixed point (or equilibrium point or stationary point or singular point) if f(x0) = 0, in which case

ϕt(x0) = x0 for all t ∈ R (3.6)

and O(x0) = {x0}.

A periodic point x is one for which ϕT(x) = x for some T > 0 but ϕt(x) ≠ x for 0 < t < T, and then

ϕt(x) = ϕt+T(x) for all t ∈ R

and

O(x) = {ϕt(x) : 0 ≤ t ≤ T}

is called a periodic orbit.

Figure 3.3: A periodic orbit through x.

 
Example: if ẋ = Ax with A = ( 0 b ; −b 0 ) in X = R2 then

ϕt(x) = e^{At} x = ( cos bt  sin bt ; −sin bt  cos bt ) x;

the origin is a fixed point (ϕt(0) = 0 for every t ∈ R) and every point x ≠ 0 is periodic, with O(x) a (circular) periodic orbit (of period T = 2π/b).

Figure 3.4: A centre: picture shows the case b > 0.

Other interesting kinds of orbits (which we often have to exclude in order to


prove general results):

A homoclinic orbit connects a fixed point back to itself: if x is a fixed point


and for some y ̸= x
ϕt (y) → x as t → ±∞

then O(y) is a homoclinic orbit.

Figure 3.5: A homoclinic orbit: the orbit connects x back to itself.

A heteroclinic orbit connects two distinct fixed points: if x0 ≠ x1 are both fixed points, and for some y ∉ {x0, x1}

ϕt(y) → x0 as t → −∞,  ϕt(y) → x1 as t → ∞,

then O(y) is a heteroclinic orbit.

Figure 3.6: A heteroclinic orbit that connects two distinct fixed points x0 and x1 .

A sequence of heteroclinic orbits that ‘join up’ is called a heteroclinic loop:

Figure 3.7: A heteroclinic loop.

Definition 3.3. A subset Λ ⊂ X is invariant under a flow ϕ if

x∈Λ ⇒ ϕt (x) ∈ Λ for all t ∈ R.

The set Λ is forward invariant if x ∈ Λ ⇒ ϕt (x) ∈ Λ for all t ≥ 0, and backward


invariant if x ∈ Λ ⇒ ϕt (x) ∈ Λ for all t ≤ 0.

In the next result we extend the definition of ϕt to sets of initial conditions in


the obvious way, i.e. if A ⊂ X then

ϕt A := {ϕt x : x ∈ A}.

Lemma 3.4. Λ is invariant if and only if ϕt Λ = Λ for all t ∈ R.

Proof. The ‘if’ direction is immediate.

Conversely, suppose that Λ is invariant. If y ∈ ϕt Λ then y = ϕt x for some


x ∈ Λ, so y ∈ Λ, i.e. ϕt Λ ⊆ Λ. On the other hand, if z ∈ Λ then ϕ−t z ∈ Λ, from
which it follows that z = ϕt (ϕ−t z) ∈ ϕt Λ, so Λ ⊆ ϕt Λ.

Fixed points, periodic orbits, homoclinic and heteroclinic orbits are all invariant
sets, as are any orbit O(x), the whole state space X, and – in the case of two
dimensions – the area inside any periodic orbit, homoclinic loop, or heteroclinic
loop.

Figure 3.8: Periodic orbits, homoclinic loops, and heteroclinic loops can divide R2 into two distinct regions, an ‘inside’ and an ‘outside’.

3.2 One-dimensional dynamics

Now consider
ẋ = f (x), x ∈ R, (3.7)

where f is a Lipschitz continuous function, so that the local flow ϕ is defined (i.e.
ϕt (x) exists at least for t close to zero).
Proposition 3.5. The orbits of (3.7) consist of fixed points (where f (x) = 0),
heteroclinic orbits joining distinct fixed points, and orbits which tend to ±∞ or
come from ±∞.

We saw in the Differential Equations module that we can easily sketch the phase portrait for one-dimensional systems. We now prove a result that guarantees that this sketching procedure gives the right behaviour.

Figure 3.9: It is easy to sketch the phase portrait for a 1D system ẋ = f (x), by
sketching the graph of the function f .

Note that the set of all fixed points, in this case

Fix(f ) := {x ∈ X : f (x) = 0}

is closed (as f is continuous). So its complement X \ Fix(f ) is open – and an open


subset of R is the disjoint union of open intervals.
Proposition 3.6 (“Solution by sketch”). Suppose that f ≠ 0 on I = (a, b), with f(a) = f(b) = 0. Then for every x ∈ I, ϕt(x) is defined for all t ∈ R and is monotonic: if f > 0 on I then

ϕt(x) → b as t → ∞  and  ϕt(x) → a as t → −∞,

and if f < 0 on I then

ϕt(x) → a as t → ∞  and  ϕt(x) → b as t → −∞.

Proof. Taking x ∈ (a, b), uniqueness of solutions implies that ϕt (x) cannot cross
a and b (since they are fixed points). So ϕt (x) ∈ (a, b) and exists for all t ∈ R by
Theorem 1.12.

We consider the case f > 0 on I and show that ϕt (x) → b as t → ∞. Since


f > 0 it is clear that ϕt (x) is monotonically increasing (in t); it is also bounded
above by b, so we can set
c = sup ϕt (x).
t≥0

Now, if c < b then f (c) = δ > 0, and since f is continuous there exists ε > 0 such
that f (y) ≥ δ/2 for y ∈ [c−ε, c+ε]. Now, for some τ > 0 we have ϕτ (x) ∈ [c−ε, c];
but then if we take τ′ > τ + 2ε/δ we have

ϕτ′(x) = ϕτ(x) + ∫_τ^{τ′} f(ϕt(x)) dt > c,

which contradicts the definition of c; so we must have c = b, i.e. ϕt (x) → b as


t → ∞.

An almost identical argument shows that on the extreme intervals (−∞, xm) and (xM, ∞), where xm and xM are the smallest and largest zeros of f, the orbits must go to or come from ±∞ (in finite or infinite time). [E.g. if f > 0 on (xM, ∞), we take x > xM; if we assume that c := sup_{t≥0} ϕt x < ∞ then we obtain a contradiction in exactly the same way.]

Note that if we have an equation posed on [0, ∞) (such as a radial equation for
a planar system in polar coordinates) we have a parallel result to Proposition 3.6.
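[The monotone convergence in Proposition 3.6 is easy to see numerically. Here is a sketch (Python with scipy assumed) for f(x) = x(1 − x), where f > 0 on (0, 1), so orbits starting in (0, 1) should increase towards b = 1; the tolerances are deliberately loose to allow for solver error.]

# Sketch: orbits of x' = x(1 - x) starting in (0, 1) increase towards 1.
import numpy as np
from scipy.integrate import solve_ivp

f = lambda t, x: x * (1 - x)
sol = solve_ivp(f, (0, 20), [0.01], t_eval=np.linspace(0, 20, 200))
x = sol.y[0]
assert np.all(np.diff(x) > -1e-9)   # monotone increase (up to solver noise)
assert abs(x[-1] - 1.0) < 1e-3      # phi_t(x) -> b = 1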

3.3 ω-limit points and ω-limit sets

To study the eventual/asymptotic behaviour of orbits we make the following defi-


nition.

Definition 3.7. A point y ∈ X is an ω-limit point of x for a flow ϕ if

ϕtk (x) → y as k → ∞

for some increasing sequence of times tk → ∞.

The set

ω(x) = {y ∈ X : ϕtk (x) → y as k → ∞ for some sequence (tk ) → ∞}

[the set of all ω-limit points of x] is called the omega-limit set (ω-limit set) of x.

The α-limit set of x is the set of all ‘α-limit points’:

α(x) = {y ∈ X : ϕtk (x) → y as k → ∞ for some sequence (tk ) → −∞}.

There are two very easy cases.


Lemma 3.8. If x0 is a fixed point then

ω(x0 ) = {x0 } = α(x0 ).

If y is on a periodic orbit then ω(y) = O(y) = α(y).

Proof. If x0 is a fixed point then ϕt(x0) = x0 for every t ∈ R.

If y is on a periodic orbit then, given any x ∈ O(y), there exists t such that x = ϕt(y), and then if T is the period of y,

ϕt+kT(y) = ϕt(y) = x for every k ∈ Z;

so we can let t±k = t ± kT: then (t±k) → ±∞ and ϕt±k(y) → x as k → ∞. This shows that O(y) ⊆ ω(y) and O(y) ⊆ α(y); the reverse inclusions hold because any limit of points ϕtk(y) must lie in the closed set O(y).

Examples in R2

(i) Consider the ODE given in polar coordinates by

ṙ = r(1 − r2 ) θ̇ = 1. (3.8)

Figure 3.10: Graph of r(1 − r2 ) and the 1D phase portrait for the r dynamics.

In the sketch of the trajectories in R2 we can see that γ := {r = 1} is a periodic


orbit and is the ω-limit set of every point x ̸= 0.

Figure 3.11: α and ω limit sets for equation (3.8).
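[We can see this numerically too: the following sketch (Python with scipy assumed; the helper f is the Cartesian form of (3.8)) checks that orbits approach the circle r = 1 from both inside and outside.]

# Sketch: the unit circle is the omega-limit set for (3.8).
import numpy as np
from scipy.integrate import solve_ivp

def f(t, z):
    x, y = z
    r2 = x * x + y * y
    # Cartesian form of r' = r(1 - r^2), theta' = 1
    return [x * (1 - r2) - y, y * (1 - r2) + x]

for z0 in ([0.1, 0.0], [3.0, 0.0]):
    sol = solve_ivp(f, (0, 30), z0, rtol=1e-8)
    r_end = np.hypot(*sol.y[:, -1])
    assert abs(r_end - 1.0) < 1e-4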

(ii) Trajectories near a homoclinic orbit

- x2 is a fixed point so ω(x2 ) = {x2 }

- y1 is on a homoclinic orbit, so α(y1 ) = ω(y1 ) = {x2 }

- ω(x1 ) = O(y1 ) ∪ {x2 }

- ω(y2 ) = ∅ since |ϕt (y2 )| → ∞ as t → ∞

Figure 3.12: Sketch of trajectories near a homoclinic orbit in R2 .

Properties of the ω-limit set

Proposition 3.9. If ω(x) is non-empty then it is closed and invariant.

Proof. First we show that ω(x) is closed. Suppose that yk ∈ ω(x), k ∈ N, with
yk → y ∗ ∈ X; we need to show that y ∗ ∈ ω(x).

For each n ∈ N we can find kn such that |ykn − y ∗ | < 1/n. Since ykn ∈ ω(x),
we can find tn > tn−1 + 1 such that

|ϕtn (x) − ykn | < 1/n,

and so

|ϕtn (x) − y ∗ | ≤ |ϕtn (x) − ykn | + |ykn − y ∗ | < 2/n → 0 as n → ∞,

which shows that y ∗ ∈ ω(x).

Now to show that ω(x) is invariant, take y ∈ ω(x); then there exist (tj) with tj → ∞ such that ϕtj(x) → y. Since ϕτ : X → X is continuous for every τ ∈ R, we have

ϕτ+tj(x) = ϕτ(ϕtj(x)) → ϕτ(y)

as j → ∞; so ϕτ(y) ∈ ω(x) for any τ ∈ R, i.e. ω(x) is invariant.

Given any set A ⊂ Rn, for x ∈ Rn we set

dist(x, A) = inf_{a∈A} |x − a|.

If A is closed then there exists a point ā ∈ A such that dist(x, A) = |x − ā|, see
Examples Sheet 3.

Proposition 3.10 (Localisation of the ω-limit set). If the forward orbit of x ∈ Rn


lies in a compact subset K of Rn then ω(x) is a non-empty subset of K, and
dist(ϕt (x), ω(x)) → 0 as t → ∞.

Proof. Since ϕt (x) ∈ K for all t ≥ 0, it follows that in particular (ϕn (x)) is a
bounded sequence in Rn , and hence by the (Rn -generalisation of the) Bolzano–
Weierstrass Theorem, it has a convergent subsequence whose limit is an element
of ω(x) by definition, which shows that ω(x) is non-empty. Since K is compact
it is closed, so the limit of any sequence contained in K (such as ϕtn (x) for some
tn → ∞) lies in K.

Now suppose that dist(ϕt (x), ω(x)) does not converge to zero: this means that
there is some ε > 0 and a sequence tn → ∞ such that

dist(ϕtn (x), ω(x)) ≥ ε. (3.9)

Since ϕtn (x) is a sequence contained in the compact set K, it must have a conver-
gent subsequence: so ϕtnj (x) → y. Then y ∈ ω(x), so

dist(ϕtnj (x), ω(x)) → 0,

which contradicts (3.9). So dist(ϕt (x), ω(x)) → 0 as claimed.

Note that in the proofs of these two results we only ever used ϕt with t ≥ 0;
this will be useful later.

3.4 Aside: Cartesian and polar coordinates

The fundamental equality to move between Cartesian and polar coordinates is, of
course
(x, y) = (r cos θ, r sin θ).

To convert from Cartesian to polar, for the r equation we use

r 2 = x2 + y 2 ⇒ rṙ = xẋ + y ẏ.

For the θ equation, start from x = r cos θ, y = r sin θ; differentiate to give

ẋ = ṙ cos θ − rθ̇ sin θ  ⇒  rẋ = ṙx − ryθ̇
ẏ = ṙ sin θ + rθ̇ cos θ  ⇒  rẏ = ṙy + rxθ̇.

Multiply the first equation by y and the second by x:

ryẋ = ṙxy − ry²θ̇
rxẏ = ṙxy + rx²θ̇;

subtracting the first from the second gives

r[xẏ − yẋ] = r(x² + y²)θ̇ = r³θ̇,

which gives r²θ̇ = xẏ − yẋ.

For a quicker derivation of the expression for θ̇, ignoring possible problems when x = 0, we can write

θ = tan⁻¹(y/x)  ⇒  θ̇ = [1/(1 + y²/x²)] · (xẏ − yẋ)/x² = (xẏ − yẋ)/(x² + y²) = (xẏ − yẋ)/r².

In both of these cases (rṙ = xẋ + y ẏ and r2 θ̇ = xẏ − y ẋ) you will most likely
end up with an expression on the right-hand side that still involves x and y. To
get a system for r and θ you then have to put x = r cos θ and y = r sin θ. [Very
occasionally it will be easier to interpret the right-hand side in x and y coordinates,
though!]
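[This bookkeeping can be delegated to a computer algebra system; here is a sketch using sympy (assumed available), applied to the Cartesian form of equation (3.8).]

# Sketch: convert a planar system to polar coordinates with sympy.
import sympy as sp

r, th = sp.symbols('r theta', positive=True)
x, y = r * sp.cos(th), r * sp.sin(th)
xdot = x * (1 - r**2) - y
ydot = y * (1 - r**2) + x

rdot = sp.simplify((x * xdot + y * ydot) / r)      # from r r' = x x' + y y'
thdot = sp.simplify((x * ydot - y * xdot) / r**2)  # from r^2 th' = x y' - y x'
print(rdot, thdot)   # expect r*(1 - r**2) (possibly expanded) and 1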

3.5 Stability

Recall that1 a neighbourhood of a set A is any open set containing A.

Definition 3.11. An invariant set Λ is said to be (Lyapunov) stable if given any


neighbourhood U of Λ there is a neighbourhood V of Λ such that

x∈V ⇒ ϕt (x) ∈ U for all t ≥ 0.

If an invariant set is not Lyapunov stable then it is said to be unstable.

Figure 3.13: Lyapunov stability: ‘start near, stay near’


1 There is some ‘flexibility’ in this definition. For some authors a neighbourhood of A is any set that contains an open set that contains A, and an ‘open neighbourhood’ is any open set containing A.
Example in R2 :

ẋ = y
ẏ = −x.

If we rewrite this in polar coordinates (x, y) = (r cos θ, r sin θ) then

r 2 = x2 + y 2 ⇒ rṙ = xẋ + y ẏ = xy − yx = 0

and

θ̇ = (xẏ − yẋ)/r² = (−x² − y²)/(x² + y²) = −1,

so

ṙ = 0,  θ̇ = −1.

Figure 3.14: A centre: the origin is Lyapunov stable, and so is each periodic orbit
γR , R > 0.

In this example the origin is stable and so is each periodic orbit

γR := {(r, θ) : r = R}, R > 0.

This is easy to ‘see’, but to apply the definition carefully, first take any neighbour-
hood U of γR ; then (by Lemma 1.6) there exists ε > 0 such that

V := {(r, θ) : |r − R| < ε} ⊂ U.

The region V is invariant (it is bounded by two closed orbits and solutions cannot cross), so x ∈ V ⇒ ϕt(x) ∈ V ⊂ U for all t ≥ 0.

A saddle point in a linear system is unstable. For example, if we take

ẋ = x, ẏ = −y,

then if U is a ball of radius r around the origin, whatever neighbourhood V we


choose will contain a ball of radius ε for some ε > 0 (by Lemma 1.6), and if we
start at (ε/2, 0) the solution will leave U .

Figure 3.15: A saddle point is always unstable.

We now make a definition of what it means for an invariant set to be attract-


ing. Be careful! Different authors use this word very differently (I used it for
asymptotically stable last year).

Definition 3.12. An invariant set Λ is said to be attracting if there exists a


neighbourhood V of Λ such that for every x ∈ V , ϕt (x) → Λ as t → ∞, meaning
that
dist(ϕt (x), Λ) := inf{|ϕt (x) − y| : y ∈ Λ} → 0 as t → ∞.

However, more useful is the following definition, which combines both kinds of
‘stability’.

Definition 3.13. An invariant set Λ is said to be asymptotically stable if Λ is


Lyapunov stable and also attracting.

So this is ‘start near, stay near and tend to’.

An invariant set that is asymptotically stable if we reverse time is called asymp-
totically unstable.

In the example

ṙ = r(1 − r²),  θ̇ = 1 (3.10)

the periodic orbit {r = 1} is asymptotically stable and the origin is asymptotically unstable. [Cartesian version: ẋ = x − y − x³ − xy², ẏ = x + y − x²y − y³.]

Figure 3.16: Phase portrait for equation (3.10): the origin is asymptotically un-
stable, and the circle at r = 1 is asymptotically stable.

But consider the fixed point p in Figure 3.17.

For every point x ≠ p in a neighbourhood V of p we have ϕt(x) → p as t → ∞ (and also ω(x) = {p}); however, p is not Lyapunov stable, since we can get as close to p as we want and still find a point whose orbit leaves V, e.g. any point x0 on the ‘unstable side’ of the homoclinic orbit γ connecting p to itself. So while p is ‘attracting’, it is not asymptotically stable. [However, γ ∪ {p} is asymptotically stable.]

Definition 3.14. If Λ is an invariant set then its basin of attraction, B(Λ), is

B(Λ) := {x ∈ X : dist(ϕt (x), Λ) → 0 as t → ∞}.

Example: the damped Duffing equation (see Problem Sheet 5) has phase por-
trait:

Figure 3.17: The point p is attracting, but not stable.

Figure 3.18: Basins of attraction of two fixed points for the damped Duffing equa-
tion.

Lemma 3.15. If Λ is a compact invariant set that is attracting then B(Λ) is open.

Proof. Since Λ is attracting, it has a neighbourhood V such that for every x ∈ V ,


ϕt (x) → Λ as t → ∞. Since V is a neighbourhood of Λ, there exists ε > 0 such
that dist(x, Λ) < ε implies that x ∈ V [Lemma 1.6].

Now suppose that y ∈ B(Λ); then, since ϕt(y) → Λ, there exists t > 0 such that dist(ϕt(y), Λ) < ε. Since ϕt(x) is continuous in x, there exists δ > 0 such that for all z ∈ B(y, δ), dist(ϕt(z), Λ) < ε; then ϕt(z) ∈ V, and so ϕt+s(z) → Λ as s → ∞. This shows that z ∈ B(Λ) for all z ∈ B(y, δ), so B(Λ) is open.

Note that if Λ is not attracting then its basin of attraction may not be open,
e.g. consider a saddle point as in Figure 3.15. Its basin of attraction is the y axis,
which is not an open subset of R2 .

Definition 3.16. If B(Λ) = X (the whole space) then we say that Λ is a global
attractor.

Easy example in R2 : ẋ = −x, ẏ = −2y, then the origin is the global attractor.

Figure 3.19: For the linear system ẋ = −x, ẏ = −2y, the origin is the global
attractor.

In general, for a linear system ẋ = Ax in Rn , if all the eigenvalues have negative


real parts then x = 0 is the global attractor.

[Note that with this definition there can be more than one global attractor: for example in the 1D system ẋ = x(1 − x²) both {−1, 0, 1} and [−1, 1] are global attractors.

Figure 3.20: Non-uniqueness in the definition of a global attractor.

If we alter the definition so that the global attractor has to attract bounded sets
of initial conditions at a uniform rate then we obtain uniqueness (and some other
nice properties, e.g. connectedness).]

Chapter 4

Qualitative dynamics of some


model systems

We now introduce some general concepts, motivated and illustrated by some par-
ticular examples.

4.1 Conservative systems (the nonlinear pendulum)

When there is a conserved quantity (e.g. the energy) this can help us understand
the dynamics of a system of ODEs.

A canonical example is the ideal (nonlinear) pendulum


θ̈ + sin θ = 0.
If we put p = θ̇ then we obtain the two-dimensional system
θ̇ = p
ṗ = − sin θ,
with x = (θ, p) in the phase space S¹ × R (a cylinder), since θ is an angle (defined mod 2π). To draw the phase portrait we ‘cut’ the cylinder along the line θ = π (which is identified with θ = −π) and ‘flatten’ it.

Note that the energy

H(x) = ½p² − cos θ
Figure 4.1: A simple ideal pendulum: the phase space is a cylinder.

[p²/2 is the kinetic energy, −cos θ the gravitational potential energy; we have set all constants equal to 1 to make the algebra simpler] is constant:

(d/dt) H(x) = (∂H/∂p) ṗ + (∂H/∂θ) θ̇ = p(−sin θ) + (sin θ)p = 0.
Definition 4.1. A system ẋ = f (x) is said to be conservative if there exists a
non-trivial C 1 function H : X → R that is constant along all orbits, i.e.
(d/dt) H(ϕt(x))|t=0 = 0 for every x ∈ X.

(By non-trivial we mean that DH ≠ 0 almost everywhere.)

By the chain rule we have

(d/dt) H(ϕt(x))|t=0 = DH(x) (d/dt)ϕt(x)|t=0 = DH(x) f(x) = ∇H(x) · f(x).

If we denote the level sets of H by1


ΣE := {x ∈ X : H(x) = E} = H −1 ({E}),
then since H is constant along orbits,
x ∈ ΣE ⇒ ϕt (x) ∈ ΣE for all t ∈ R,
1 Level sets can have several disconnected components: in the pendulum example ΣE consists
of two disjoint pieces for E > 1.

so each ΣE is an invariant set. If we study the dynamics on each level set then we
can reduce the dimension of the problem by one.

If we want to understand the structure of the level sets then the following
‘Level Set Lemma’ is extremely useful. Recall that a critical point of a function
H : R2 → R is a point z at which ∇H(z) = 0.
Theorem 4.2. Let H : R2 → R be a C 1 function and let E be one of the connected
components of

Ec := {(x, y) ∈ R2 : H(x, y) = c} = H −1 ({c}).

If E is bounded and contains no critical points of H then it is a closed curve of


finite length.

The nonlinear pendulum

The equations are


θ̇ = p ṗ = − sin θ.
There are two fixed points, (0, 0) (pendulum straight down) and (π, 0) (pendulum
straight up).

The level sets of H(x) = p²/2 − cos θ are

ΣE = {(θ, p) : ½p² = cos θ + E},

so on ΣE we have p = ±√(2(cos θ + E)).

• if E < −1 then ΣE = ∅.

• if E = −1 then ΣE = {(0, 0)}.

• if −1 < E < 1 then ΣE is a closed curve encircling the origin, made of the two parts

{(θ, √(2(cos θ + E))) : −θE ≤ θ ≤ θE}

and its mirror image in the p = 0 axis,

{(θ, −√(2(cos θ + E))) : −θE ≤ θ ≤ θE},

where θE = cos⁻¹(−E) is the value at which cos θE + E = 0.

We could also use the Level Set Lemma to show that this is a closed curve: the
range of θ is bounded (as we just showed), and p2 ≤ 2(1 + E), so p is bounded.

Since the only critical points of H are (0, 0) and (±π, 0), these level sets are closed curves of finite length. (You can check directly that they have finite length, but it is painful.)

In fact, this curve ΣE must be a periodic orbit: we use the fact that ΣE has
finite length ℓE , and there are no fixed points on ΣE (the only two fixed points are
(0, 0), where E = −1, and (π, 0), where E = 1).

So, since f is continuous, the trajectory along ΣE must move in only one direction (f is never zero on ΣE, so the trajectory cannot change direction) and |f| ≥ δ > 0 on ΣE. To see this second fact, observe that since |f| : ΣE → R is continuous and ΣE is compact, |f| attains its lower bound on ΣE: this must be some δ > 0, since otherwise there would be a point on ΣE with |f(x)| = 0.

If we let ℓ denote the arclength on ΣE as we move along the trajectory starting at some x ∈ ΣE then

ℓ(t) = ∫₀ᵗ |f(ϕs(x))| ds ≥ δt,

and since ℓ is a continuous function of t there exists T > 0 such that ℓ(T) = ℓE, and then ϕT(x) = x. These periodic orbits correspond to oscillations of the pendulum.
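[Conservation of H along orbits is easy to observe numerically; the following sketch (Python with scipy assumed, and tight tolerances) integrates an oscillation of the pendulum and checks that H stays essentially constant.]

# Sketch: H = p^2/2 - cos(theta) is conserved along pendulum orbits.
import numpy as np
from scipy.integrate import solve_ivp

f = lambda t, z: [z[1], -np.sin(z[0])]    # theta' = p, p' = -sin(theta)
H = lambda th, p: 0.5 * p**2 - np.cos(th)

sol = solve_ivp(f, (0, 50), [1.0, 0.0], rtol=1e-10, atol=1e-10)
energies = H(sol.y[0], sol.y[1])
assert energies.max() - energies.min() < 1e-6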

It is worth giving the consequence of this argument more formally.


Lemma 4.3. Suppose that f : R2 → R2 is locally Lipschitz, and let H be as in
Theorem 4.2. Suppose that H is conserved under the flow generated by f . If E is
a connected component of Ec that contains no critical points of H and no points
at which f = 0, then E is a periodic orbit of ẋ = f (x).

• if E = 1 then ΣE consists of the fixed point (π, 0) and two orbits homoclinic to
(π, 0).

• if E > 1 then ΣE consists of two disjoint closed curves that wind around the
cylinder; again, these are both periodic orbits, which represent clockwise and anti-
clockwise rotations of the pendulum. [The argument would be the same as in the
case |E| < 1, but showing that the closed curves (remember that they are actually
on the cylinder) have finite length is easier since the integrand in the arc-length
integral is bounded above.]

We can use the conservation of H to investigate the stability of some of the


invariant sets in this phase portrait.

The fixed point (0, 0) is stable: given a neighbourhood U of (0, 0), let
ε = inf{H(x) : x ∈ X \ U } > −1

Figure 4.2: The phase portrait for the pendulum.

and let
V = {x : H(x) < ε}.
Then V is an invariant set entirely contained in U , so if x ∈ V then ϕt (x) ∈ V ⊂ U
for all t ≥ 0.

Similarly, each of the periodic orbits is stable. If we consider a neighbourhood


U of the periodic orbit Γ = H −1 (E) with E ∈ (−1, 1), we want to find ε such that

Γε := {x : E − ε < H(x) < E + ε} ⊂ U. (4.1)

Since H −1 ([−1, M ]) is bounded for any M , Γδ is bounded for any δ > 0.

Suppose that an ε as in (4.1) does not exist. Then there exist xn such that
|H(xn ) − E| < 1/n and xn ∈ / U . Since xn ∈ Γ1 is bounded it is contained in a
compact set, and so there is a subsequence xnj such that xnj → x∗ . Since H is
continuous, H(x∗ ) = E, so x∗ ∈ Γ.

Since xnj → x∗ , we must have xnj ∈ U for j sufficiently large, contradicting


our choice of (xn ).

Figure 4.3: The phase portrait from Figure 4.2 ‘rolled up’ again onto the cylinder.
The ‘back’ of the cylinder looks essentially the same (but reflected).

Figure 4.4: An alternative view, in which horizontal slices correspond to H =


constant. The phase portrait on the cylinder from Figure 4.3 has been ‘bent’ to
give this new picture.

Figure 4.5: The origin is stable.

We can now take V = Γε in the definition of Lyapunov stability: if x ∈ Γε then


ϕt x ∈ Γε ⊂ U for all t ≥ 0 since H is constant on trajectories.

[We could use the same argument for the invariant set Σ1 made of the two
homoclinic orbits (blue in Figure 4.2). If we want to use a similar argument for
either one of the orbits that make up ΣE , then we would start with the definition

Γε = {x : p > 0, E − ε < H(x) < E + ε}

for the upper portion, or the same with p < 0 for the lower portion.]

The fixed point (π, 0) is unstable, meaning not Lyapunov stable: we can find a neighbourhood U of (π, 0) such that for any neighbourhood V of (π, 0) there are points x ∈ V with ϕt(x) ∉ U for some t > 0. Take U to be a ball of radius 1, say, centred at (π, 0). Then whatever neighbourhood V of (π, 0) we start in, there are points in V on the homoclinic trajectories whose forward orbit leaves U after some time.

Similarly, neither of the homoclinic orbits taken individually are stable. How-
ever, Σ1 is stable, as indicated above.

The pendulum is an example of a Hamiltonian system.


Definition 4.4. A system of the form

ẋ = ∂H/∂y (x, y)
ẏ = −∂H/∂x (x, y),

where H : X → R is a C¹ function, is called a Hamiltonian system.

Here the Hamiltonian H is often the energy, e.g. for the pendulum if we set

Figure 4.6: (π, 0) is unstable, as are the homoclinic orbits joining this fixed point
to itself.

H = p²/2 − cos θ then

θ̇ = ∂H/∂p = p,  ṗ = −∂H/∂θ = −sin θ.

All Hamiltonian systems are conservative (with H constant) since

dH/dt = (∂H/∂x)ẋ + (∂H/∂y)ẏ = (∂H/∂x)(∂H/∂y) + (∂H/∂y)(−∂H/∂x) = 0.

Conserved quantities can be extremely useful: they can be used to reduce the dimension of the problem by one. If X = R² then a conserved quantity turns the problem into a collection of one-dimensional problems; if X = R³ then the dynamics can be reduced to two-dimensional dynamics, and these are much simpler than dynamics in any higher dimension because of the Jordan Curve Theorem (‘a closed curve in the plane has an inside and an outside’). Trajectories cannot cross from inside to outside a periodic orbit or a homoclinic/heteroclinic loop.

4.2 Lyapunov functions (the damped pendulum)

If there is a function H : X → R that is non-increasing along orbits this can help


us understand the dynamics.

Example: the damped pendulum

We take k > 0 and consider

θ̈ + kθ̇ + sin θ = 0  ⇔  θ̇ = p,  ṗ = −kp − sin θ.

If we consider H = p²/2 − cos θ again, then now we have

dH/dt = (sin θ)p + p(−kp − sin θ) = −kp² ≤ 0 for every (θ, p).

So H is non-increasing along orbits: H(ϕt(x)) ≤ H(x) for all x ∈ X, t ≥ 0. We only have dH/dt = 0 when p = 0, but if θ ∉ {0, π} then ṗ ≠ 0 and the orbits are only at such points instantaneously.

More formally, since for any θ ≠ 0, π there exists ε > 0 such that |ṗ| > ε in a neighbourhood of x = (θ, 0), we have (using |p(t)| ≥ ε|t| near such a point)

ΔH ≤ −k ∫_{−τ}^{τ} ε²t² dt = −(2k/3) τ³ε² < 0,

so H decreases along all orbits except for the two fixed points.

[The idea here is to consider the trajectory passing through x = (θ, 0) at t = 0, where ṗ = −sin θ ≠ 0. In a neighbourhood of this point, if θ ∈ (0, π) we have ṗ ≤ −ε and if θ ∈ (−π, 0) we have ṗ ≥ ε, for some ε > 0 (we could take ε = |sin θ|/2). In the case illustrated in the figure (θ ∈ (−π, 0)), for t > 0 we then have p(t) = ∫₀ᵗ ṗ(s) ds ≥ tε, and for t < 0 we have p(t) = ∫₀ᵗ ṗ(s) ds ≤ tε; so |p(t)| ≥ ε|t| (and similarly for the other range of θ). We use these lower bounds in the integral for ΔH.]

We will now introduce La Salle’s Invariance Principle, which will enable us to prove that each orbit converges to a fixed point as t → ∞.

Definition 4.5. A Lyapunov function for a flow ϕ : X × R → X is a continuous


function H : X → R such that for all x ∈ X

H(ϕt(x)) ≤ H(x) for all t ≥ 0. (4.2)

If H is C 1 then we could replace (4.2) with dH/dt ≤ 0 along orbits (i.e.


DH(x)f (x) ≤ 0).

Figure 4.7: Apart from at the two fixed points, ∆H < 0.

For each c ∈ R, let us write

Mc := {x ∈ X : H(ϕt (x)) = c for all t ≥ 0};

i.e. Mc is all x that have forward orbits along which H is constant and equal to
c. For example, for the damped pendulum H decreases along all orbits except for
the fixed points, so

M1 = {(π, 0)}, M−1 = {(0, 0)}, and Mc = ∅ if c ̸= −1, 1.

The LaSalle Invariance Principle says that the Lyapunov function is constant
on any ω-limit set.
Theorem 4.6 (LaSalle Invariance Principle). If H is a Lyapunov function for a
flow ϕ then for every x ∈ X there exists c ∈ R such that

ω(x) ⊂ Mc := {y ∈ X : H(ϕt (y)) = c for all t ≥ 0}. (4.3)

Proof. Pick x ∈ X. If ω(x) is empty then there is nothing to prove, as the empty
set is a subset of any set.

Otherwise, let

c := inf_{t≥0} H(ϕt(x)) = lim_{t→∞} H(ϕt(x)) ∈ R ∪ {−∞},

since H is non-increasing. If c = −∞ then |ϕt(x)| → ∞ (since H is continuous, it is bounded below on any bounded set); but then ω(x) = ∅, so if ω(x) is non-empty we have c ∈ R.

Now take y ∈ ω(x): then there is an increasing sequence (tj ) → ∞ such that
ϕtj (x) → y as j → ∞. Since H is continuous, H(y) = limj→∞ H(ϕtj (x)) = c.

All that remains is to show that y ∈ Mc , i.e. that H(ϕt (y)) = c for all t ≥ 0.
But we know that ω(x) is invariant, so if y ∈ ω(x) we have ϕt (y) ∈ ω(x), and we
have just shown that H(z) = c for all z ∈ ω(x), so H(ϕt (y)) = c as required.

Note that in this proof we only ever use properties of ϕt for t ≥ 0. [The results on ω-limit sets from Propositions 3.9 and 3.10 only use this ‘future’ part of ϕt.] This means that the Invariance Principle is also true if we have a Lyapunov function that is defined on some positively invariant subset V of X (and then (4.3) holds for all x ∈ V); we will use this observation in the proof of Theorem 5.4.

If H is strictly decreasing along trajectories except at fixed points, then the


fact that ω(x) ⊂ Mc means that ω(x) must be a subset of all the fixed points. If
there are only a finite number of fixed points we can do better.
Corollary 4.7. Suppose that H is strictly decreasing along trajectories (except at
fixed points) and that ϕ has only a finite number of fixed points. Then if ω(x) ̸= ∅,
ω(x) = {x∗ }, where x∗ is a fixed point.

Proof. By the Invariance Principle, and since H is strictly decreasing except at fixed points, ω(x) consists only of fixed points. Since there are only a finite number of fixed points, there exists δ > 0 such that for any pair of distinct fixed points y and z, |y − z| > δ.

Since ϕt x → ω(x), there exists t0 such that

dist(ϕt x, ω(x)) < δ/3 for all t ≥ t0 . (4.4)

Suppose that ω(x) contains two distinct fixed points, y and z. Then, since
y ∈ ω(x), there exists ty > t0 such that

|ϕty x − y| < δ/3.

Since z ∈ ω(x), there exists tz > ty such that

|ϕtz x − z| < δ/3.

Now observe that

|ϕtz x − y| ≥ |y − z| − |ϕtz x − z| > δ − (δ/3) = 2δ/3.

Since t 7→ |ϕt x − y| is continuous it follows, using the Intermediate Value Theorem,


that there exists t with ty < t < tz such that

|ϕt x − y| = δ/2.

If w ̸= y is any fixed point then

|ϕt x − w| ≥ |w − y| − |ϕt x − y| > δ − (δ/2) = δ/2.

So we have |ϕt x − w| ≥ δ/2 for any fixed point w (including y): in other words, since ω(x) consists of fixed points, dist(ϕt x, ω(x)) ≥ δ/2. But t > t0, contradicting (4.4). It follows that ω(x) consists of a single fixed point as claimed.

Application to the damped pendulum

If x ∈ X then H(ϕt (x)) ≤ E = H(x) for all t ≥ 0. Since the set

{y ∈ X : H(y) ≤ E}

is compact it follows from Proposition 3.10 that ω(x) is non-empty.

La Salle’s Invariance Principle implies that there must exist c ∈ {−1, 1} such
that ω(x) ⊂ Mc , i.e. ω(x) = {(0, 0)} or {(π, 0)}, and Proposition 3.10 shows that
ϕt (x) → ω(x), so the forward orbit must converge to one of the fixed points.

The origin is an attractor: since it is Lyapunov stable, nearby orbits cannot


converge to (π, 0), so they must converge to (0, 0).
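[A numerical illustration, as a sketch (Python with scipy assumed): along a typical orbit of the damped pendulum H decreases, and the orbit converges to the origin, exactly as the Invariance Principle predicts.]

# Sketch: H decreases along orbits of the damped pendulum, and for
# this initial condition the orbit converges to (0, 0).
import numpy as np
from scipy.integrate import solve_ivp

k = 0.5
f = lambda t, z: [z[1], -np.sin(z[0]) - k * z[1]]

sol = solve_ivp(f, (0, 100), [2.5, 0.0], rtol=1e-10, atol=1e-10,
                t_eval=np.linspace(0, 100, 2000))
th, p = sol.y
energy = 0.5 * p**2 - np.cos(th)
assert np.all(np.diff(energy) <= 1e-8)             # H non-increasing
assert abs(th[-1]) < 1e-3 and abs(p[-1]) < 1e-3    # orbit -> (0, 0)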

We would like to be able to identify which fixed points orbits of the damped
pendulum tend to. We claim that the phase portraits are as in Figure 4.8 (for the
case k = 2 see Sheet 6, Q2).

Just two orbits tend to (π, 0) and all the others go to (0, 0). To justify this, we
will have to examine the dynamics near the fixed points, which we will do in the
next section via linearisation.


Figure 4.8: Phase portrait for the damped pendulum, depending on the value of
k. For k = 2 see Examples Sheet 6.

Figure 4.9: The phase portrait for the damped pendulum presented as in Figure
4.4.
Chapter 5

Linearisation and the dynamics


near fixed points

Let us consider ẋ = f (x) on a state space X (typically Rn ), assuming that f is C 1


and that the flow has a fixed point x∗ , where f (x∗ ) = 0.

5.1 Linearised equations

By Taylor’s Theorem,

f(x) = f(x∗) + Df(x∗)(x − x∗) + o(|x − x∗|),

where f(x∗) = 0, the term Df(x∗)(x − x∗) is the linear part, and the o(|x − x∗|) remainder is the nonlinear part; recall that o(|x − x∗|)/|x − x∗| → 0 as x → x∗.

We change coordinates, putting x = x∗ + y; then

(d/dt)(x∗ + y) = ẋ = f(x) = Df(x∗)y + o(|y|),

so we obtain

ẏ = Df(x∗)y + o(|y|),

where o(|y|) represents the higher-order nonlinear terms.

The equation
ẏ = Df (x∗ )y (5.1)

is called the linearised equation near x∗ .

In this section we will see what features of the linear system (5.1) persist when
we reintroduce the nonlinear terms.
Definition 5.1. A fixed point x∗ is hyperbolic if every eigenvalue of Df (x∗ ) has
non-zero real part.
We will see that the local phase portrait near any hyperbolic fixed point x∗
‘looks like’ that of the linearised system. There are three possible cases:

• if all eigenvalues of Df (x∗ ) have negative real parts then x∗ is a sink and for the
original system x∗ is attracting;

• if all eigenvalues of Df (x∗ ) have positive real parts then x∗ is a source and for
the original system x∗ is a repellor;

• otherwise, x∗ is a saddle point and for the original system x∗ is unstable.

Figure 5.1: Sinks and sources in one-dimensional systems

Example: for the (possibly damped) pendulum


θ̇ = p
ṗ = −sin θ − kp

we have (for x = (θ, p))

Df(x) = ( 0 1 ; −cos θ −k ).

At x∗ = (0, 0) we have

Df(0, 0) = ( 0 1 ; −1 −k ),

whose eigenvalues are the solutions of λ² + kλ + 1 = 0, so are

λ = (−k ± √(k² − 4))/2;

if k = 0 (ideal pendulum) then λ = ±i, so (0, 0) is not hyperbolic;
if 0 < k < 2 then the origin is a stable focus;
if k = 2 the origin is a stable improper node;
if k > 2 the origin is a stable node.

For k > 0 the origin is hyperbolic and is a sink.

At x∗ = (π, 0) we have

Df(π, 0) = ( 0 1 ; 1 −k ),

whose eigenvalues are the solutions of λ² + kλ − 1 = 0, so are

λ = (−k ± √(k² + 4))/2;

so for all k ≥ 0 the point (π, 0) is always a (hyperbolic) saddle.
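[These eigenvalue computations are easy to confirm numerically; a sketch (Python with numpy assumed; the helper name Df is illustrative):]

# Sketch: eigenvalues of Df at the pendulum's fixed points for various k.
import numpy as np

def Df(theta, k):
    return np.array([[0.0, 1.0], [-np.cos(theta), -k]])

for k in (0.0, 1.0, 2.0, 3.0):
    print(k, np.linalg.eigvals(Df(0.0, k)), np.linalg.eigvals(Df(np.pi, k)))
# k = 0: +/- i at (0, 0) (not hyperbolic); 0 < k < 2: complex pair, Re < 0;
# k > 2: two negative reals. At (pi, 0): one positive and one negative
# eigenvalue for every k >= 0, i.e. a saddle.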

We will state a theorem later (the Hartman–Grobman Theorem) that guaran-


tees that close to a hyperbolic fixed point the phase portrait ‘looks like’ that for
the linearised system. For now we prove some related (and simpler) results.

5.1.1 Sinks are attracting

To show that sinks are attracting we will use the Adapted Norm Lemma, which
we prove here in the two-dimensional case (see separate ‘handout’ for the general
case).
Lemma 5.2 (Adapted Norm Lemma). Let A be a linear operator on a real n-dimensional vector space E. Suppose that all the eigenvalues of A satisfy

a < Re λ < b, for some a, b ∈ R.

Then there exists a basis B = {b1, . . . , bn} such that

a∥x∥B² ≤ ⟨x, Ax⟩B ≤ b∥x∥B² for every x ∈ E, (5.2)

where ⟨·, ·⟩B denotes the usual inner product in this basis and ∥·∥B is the associated norm.

If x = Σ_{j=1}^n xj bj and y = Σ_{j=1}^n yj bj we define ⟨x, y⟩B = Σ_{j=1}^n xj yj and set ∥x∥B = √⟨x, x⟩B. It is easy to show that there exist constants c1, c2 with 0 < c1 ≤ c2 such that

c1 ∥x∥B ≤ |x| ≤ c2 ∥x∥B, (5.3)

where |x| is the usual Euclidean norm of x. [See handout; or use the fact that all norms on Rn are equivalent, see Norms, Metrics, & Topologies.]

Proof. We restrict to the case n = 2. We choose a basis {b1, b2} for E so that A has real Jordan normal form, which is one of the following three cases:

( λ 0 ; 0 µ ),  ( λ ε ; 0 λ ),  ( ρ ω ; −ω ρ ),

where we choose (if need be) ε sufficiently small so that λ ± ε ∈ [a, b]. Then, case by case, for x = (x1, x2) in terms of the basis B, ⟨x, Ax⟩B is given by

λ|x1|² + µ|x2|²  in the diagonal case;

λ|x1|² + εx1x2 + λ|x2|² = λ∥x∥B² + εx1x2  in the repeated-eigenvalue case;

ρ|x1|² + ρ|x2|² = ρ∥x∥B²  in the complex case.

Note that in the middle case we have |x1x2| ≤ x1² + x2² = ∥x∥B², so in every case we obtain (5.2).

Theorem 5.3 (Sinks are attracting). If x∗ is a sink of ẋ = f(x) on Rn, then x∗ is attracting. More precisely, suppose that Re λ < −a, for some a > 0, for every eigenvalue λ of Df(x∗). Then there exists a neighbourhood V of x∗ such that

(a) ϕt(x) is defined and contained in V for all x ∈ V and t ≥ 0;

(b) there is a constant C > 0 such that

|ϕt(x) − x∗| ≤ Ce^{−at}|x − x∗| for every x ∈ V, t ≥ 0. (5.4)

In particular, x∗ is Lyapunov stable and ϕt(x) → x∗ as t → ∞ for every x ∈ V, so x∗ is attracting.

Proof. We will assume that x∗ = 0 (if not we can change coordinates, putting
y = x − x∗ ). Let A = Df (0), and let ε be small enough that

Re λ < −(a + ε)

for every eigenvalue λ of A.

The Adapted Norm Lemma (Lemma 5.2) shows that Rn has a basis B such
that
⟨x, Ax⟩B ≤ −(a + ε)∥x∥2B for all x ∈ Rn .
By the definition of the derivative we have

lim_{x→0} ∥f(x) − Ax∥B / ∥x∥B = 0.

[This limit certainly holds if we use the usual norm on Rn in place of ∥·∥B; that it also holds using ∥·∥B is a consequence of (5.3).] Since the Cauchy–Schwarz inequality gives

|⟨x, f(x) − Ax⟩B| ≤ ∥x∥B ∥f(x) − Ax∥B,

it follows that

lim_{x→0} ⟨x, f(x) − Ax⟩B / ∥x∥B² = 0.

So we can find δ > 0 such that if x ∈ V := {y ∈ Rn : ∥y∥B ≤ δ} then

| ⟨x, f(x)⟩B/∥x∥B² − ⟨x, Ax⟩B/∥x∥B² | < ε,

so that ⟨x, f(x)⟩B ≤ −a∥x∥B².

Now it follows that we have

∥x(t)∥B (d/dt)∥x(t)∥B = ½ (d/dt)∥x(t)∥B² = Σ_{i=1}^n xi(t)ẋi(t) = ⟨x, f(x)⟩B ≤ −a∥x∥B²;

so

(d/dt)∥x∥B ≤ −a∥x∥B (5.5)

and ∥x(t)∥B is decreasing (whenever x ≠ 0).

Therefore x(t) cannot escape the set V, and so x(t) is defined and remains in V for all t ≥ 0.

Integrating (5.5) gives ∥ϕt (x)∥B ≤ e−at ∥x∥B , or, translating the fixed point
back to x∗ ,
∥ϕt (x) − x∗ ∥B ≤ e−at ∥x − x∗ ∥B .
Now part (b) follows using (5.3).

To show that x∗ is Lyapunov stable, take any neighbourhood U of x∗ ; then U ∩V


is also a neighbourhood of x∗ , so there exists δ > 0 such that B(x∗ , δ) ⊂ U ∩ V .
Then if x ∈ B(x∗ , δ/C) it follows that |ϕt (x) − x∗ | ≤ C|x − x∗ | < δ for all t ≥ 0,
so in particular ϕt (x) ∈ U for all t ≥ 0.

Figure 5.2: Nonlinear sinks in adapted and standard coordinates.

Note that by reversing time, if all the eigenvalues λ of Df(x∗) have Re λ > 0 then x∗ is repelling (= attracting ‘in negative time’).

We can immediately apply this theorem to guarantee that for the damped
pendulum, (0, 0) is attracting.

5.1.2 Lyapunov method for proving stability/attractivity


of fixed points

We can use a similar method to show that fixed points are attracting by using
other possible choices of Lyapunov functions.
Theorem 5.4 (Lyapunov’s Stability Theorems). Let ϕ be a continuous flow on X with a fixed point x∗. Let N be a compact subset of X that contains a neighbourhood of x∗ and let H : X → R be a continuous function with

H(x) > H(x∗) for all x ∈ N \ {x∗}

such that H is non-increasing along orbits in N (i.e. H(ϕt(x)) ≤ H(x) for all x ∈ N and t ≥ 0 such that ϕs(x) ∈ N for all 0 ≤ s ≤ t). Then

(i) x∗ is Lyapunov stable (Lyapunov’s First Stability Theorem) and

(ii) if H is strictly decreasing along orbits in N \ {x∗ } then x∗ is asymptotically


stable (Lyapunov’s Second Stability Theorem).

Such a function H is called a local Lyapunov function. In the proof of Theorem


5.3 we used ∥x∥B as a local Lyapunov function.

Proof. By assumption N contains a neighbourhood Z of x∗ .

(i) Let U be a neighbourhood of x∗ in X. We need to show that there is a


neighbourhood V of x∗ such that x ∈ V implies that ϕt (x) ∈ U for all t ≥ 0.

We will show that there is an open subset V ⊂ U such that x ∈ V ⇒ ϕt (x) ∈ V


for all t ≥ 0.

Without loss of generality we can assume that H(x∗ ) = 0, so that H(x) > 0
for all x ∈ N \ {x∗ } (replace H(x) by H(x) − H(x∗ ) if need be).

Now, W = U ∩ Z is a neighbourhood of x∗ with W ⊂ U ∩ N . Then N \ W is


compact [closed and bounded], so there exists δ > 0 such that H(x) ≥ δ on N \ W
(since a continuous function on a compact set attains its bounds and H ̸= 0 on
N \ W ). Set
V := {x ∈ W : H(x) < δ} :
this is a neighbourhood of x∗ , since it contains x∗ (as H(x∗ ) = 0 < δ) and is
an open set because it is the inverse image of the open set1 (−∞, δ) under the
continuous map H : X → R intersected with the open set W .

We want to show that for any x ∈ V we have ϕt (x) ∈ V for all t ≥ 0. Given
some x ∈ V , let

T := sup{τ ≥ 0 : ϕt (x) ∈ V for all t ∈ [0, τ )};

we will assume that T < ∞ and deduce a contradiction.

Since H is non-increasing we have H(ϕt(x)) ≤ H(x) for all t ∈ [0, T). Since H is continuous and ϕt(x) is continuous in t we must also have H(ϕT(x)) ≤ H(x); we claim that we must also have ϕT(x) ∈ V.
1 Since H : X → [0, ∞) we could take any open set (−k, δ) with k > 0.

Figure 5.3: Construction for proof of Lyapunov’s First Stability Theorem.

Note that

• H(ϕT (x)) ≤ H(x) < δ since x ∈ V ;

• since ϕt (x) ∈ N for all t ∈ [0, T ) and N is compact, we have ϕT (x) ∈ N ; but
H(y) ≥ δ for all y ∈ N \ W , so we must have ϕT (x) ∈ W .

Therefore
ϕT (x) ∈ W with H(ϕT (x)) < δ
so ϕT (x) ∈ V as claimed.

Now, since V is open and ϕt (x) is continuous in t, we must have ϕT +s (x) ∈ V


for all s ∈ [0, ε) for some ε > 0 sufficiently small: but this contradicts the definition
of T , so O+ (x) ⊂ V , which shows that x∗ is Lyapunov stable.

(ii) If we take V as above and some x ∈ V \ {x∗ }, then ϕt (x) ∈ V for all
t ≥ 0, which (by Proposition 3.10) shows that ω(x) ̸= ∅. Now we need to show
that ω(x) = {x∗ }. This follows immediately from La Salle’s Invariance Principle
applied on the positively invariant set V : since H is strictly decreasing along
trajectories, the only set in V on which H is constant is {x∗ }.

There is a useful variant of the result in (ii), which is worth stating formally.
Corollary 5.5. Suppose that V is contained in a compact set, and is a positively invariant set for a flow ϕ. Assume that V contains a fixed point x∗, that H(x) > H(x∗) for all x ∈ V \ {x∗}, and that H is strictly decreasing along orbits in V \ {x∗}. Then ϕt(y) → x∗ as t → ∞ for all y ∈ V. If V contains a neighbourhood of x∗ then x∗ is asymptotically stable.

Proof. The proof of the first part follows exactly the proof of part (ii) in Theorem
5.4, except that now we are given the existence of a positively invariant V as an
assumption.

To show that x∗ is asymptotically stable we have to show that it is also Lya-


punov stable. In order to use part (i) of Theorem 5.4 it suffices to find a compact
set N ⊂ V that contains a neighbourhood of x∗ . This is simple, since the fact
that V contains a neighbourhood of x∗ means that there exists ε > 0 such that
B(x∗ , ε) ⊂ V , and then N := B(x∗ , ε/2) satisfies our requirements.

We can use these results to show that some fixed points that are not sinks are
nevertheless Lyapunov stable or attracting.

Example in R: ẋ = −x³. The linearisation at the origin is ẋ = 0, which gives no information. If we set H(x) = ½x² then

Ḣ = (dH/dx)ẋ = −x⁴ < 0 for x ≠ 0,

so the origin is an attractor.

Figure 5.4: The equation ẋ = −x3 : the origin is an attractor.

Example in R2 :
ẋ = −y − x3 ẏ = x5 . (5.6)
The origin is a fixed point; the linearised dynamics are ẋ = −y, ẏ = 0.

However, if we try the Lyapunov function

H(x, y) = a x^{2k}/(2k) + b y^{2m}/(2m)

then

dH/dt = (∂H/∂x)ẋ + (∂H/∂y)ẏ = a x^{2k−1}(−y − x³) + b y^{2m−1} x⁵;

Figure 5.5: The linearised system ẋ = −y, ẏ = 0 has a line of fixed points.

so if we take 2k − 1 = 5 and 2m − 1 = 1, with a = b [k = 3, m = 1, a = b = 1, say], then for

H(x, y) = x⁶/6 + y²/2 ≥ 0

we have

dH/dt = −x⁸ ≤ 0,

and so the origin is Lyapunov stable.

We can do better with a little more work: the fact that Ḣ = −x⁸ means that H is strictly decreasing except when x = 0. But when x = 0 and y ≠ 0 we have |ẋ| = |y|, and in some neighbourhood of (0, y) we have |ẋ| > |y|/2. Suppose that the solution starts at (0, y); then for t small enough the x-coordinate satisfies |x(t)| ≥ |y|t/2, and then

ΔH = −∫₀ᵗ |x(s)|⁸ ds ≤ −∫₀ᵗ (|y|s/2)⁸ ds < 0,

so H is in fact strictly decreasing along all trajectories apart from at the origin. It now follows that the origin is also attracting, so it is asymptotically stable.

If instead of this we had

ẋ = −y − x3 ẏ = x5 − y 3 , (5.7)

then taking the same choice of H as above we would get

Ḣ = x5 (−y − x3 ) + y(x5 − y 3 ) = −x8 − y 4 < 0 for all (x, y) ̸= (0, 0),

and it is easy to see that the origin is asymptotically stable without the sort of
more careful argument above.

Figure 5.6: The origin is Lyapunov stable for equation (5.6). It is – in fact – also
attracting.

Figure 5.7: The origin is attracting for equation (5.7).

5.2 Unstable fixed points: saddles are unstable

Theorem 5.6. Suppose that x∗ is a fixed point of ẋ = f (x) on Rn , with f a C 1


function. If Df (x∗ ) has an eigenvalue with positive real part then x∗ is unstable.

We give the proof in the two-dimensional case; the argument in higher dimen-
sions is essentially the same, but the 2D case is a little more straightforward and
so significantly clearer.

Note that in 2D if both eigenvalues have positive real part then we can use our
previous result in the form ‘sources are repellors’. So we only need to consider the
case that we have one positive eigenvalue µ and one negative eigenvalue −λ (we
take λ, µ > 0).

Proof. Without loss of generality, as before we take x∗ = 0, and set A := Df(0). We change coordinates so that A takes on its canonical form; then with z = (x, y) (in the new coordinates) we have

f(x, y) = ( µx ; −λy ) + ( R(z) ; S(z) ),

where R(z) and S(z) are higher order terms: o(|z|) as z → 0 [recall that g(z) = o(|z|) as z → 0 means that |g(z)|/|z| → 0 as z → 0].

Note that we are writing f (z) = Df (0)z + V (z), where V = (R, S). By the
definition of the derivative, |f (z) − Df (0)z| = |V (z)| = o(|z|); this is where the
inequalities for |R(z)| and |S(z)|, the components of V , come from.

Let H(z) = ½(|y|² − |x|²); then

(d/dt)H(z) = yẏ − xẋ = y(−λy + S(z)) − x(µx + R(z))
           ≤ −λ|y|² − µ|x|² + |y||S(z)| + |x||R(z)|
           ≤ −µ|x|² + |z|(|R(z)| + |S(z)|).

Now, since R(z) and S(z) are o(|z|), there exists δ > 0 such that

|R(z)|, |S(z)| ≤ (µ/8)|z| for all z ∈ U := {z ∈ R2 : |z| ≤ δ}.

Let

C = {z : |y| ≤ |x|};

then if z ∈ C we have |z|² ≤ 2|x|², and so if z ∈ C ∩ U,

|z|(|R(z)| + |S(z)|) ≤ (µ/4)|z|² ≤ (µ/2)|x|²,

and so within C ∩ U we have

(d/dt)H(z) ≤ −(µ/2)|x|²:
H is non-increasing along all orbits in C ∩ U \ {0}.

While the orbits remain in U they cannot leave C: in C we have H ≤ 0, with H = 0 only on the boundary ∂C, so to cross ∂C \ {0} the value of H would have to increase, whereas dH/dt < 0 there. We want to show that orbits cannot remain in C ∩ U for all t > 0.

We have

½ (d/dt)|x|² = µ|x|² + R(z)x ≥ µ|x|² − (µ/8)|z||x| ≥ (µ/2)|x|² in C ∩ U,

and so |x(t)| ≥ e^{µt/2}|x(0)| as long as x(t) remains in C ∩ U. This will eventually exceed δ, so that the solution leaves U.

Thus there are points z arbitrarily close to 0 such that ϕt (z) leaves U for some
t > 0. This contradicts the Lyapunov stability of 0, and therefore the origin is
unstable.

5.3 Stable and unstable manifolds at hyperbolic


fixed points

Suppose that x∗ is a hyperbolic fixed point of ẋ = f (x) on Rn , where f is C k ,


k ≥ 1, and the linearised dynamics at x∗ is
ẋ = Df (x∗ )x on Rn . (5.8)

• Let E s be the subspace of Rn spanned by the (generalised) eigenvectors of Df (x∗ )


associated to the eigenvalues λ with Re λ < 0; let ds = dim(E s ).

• Let E u be the subspace of Rn spanned by the (generalised) eigenvectors of Df (x∗ )


with Re λ > 0 and let du = dim(E u ).

Since x∗ is hyperbolic, Rn = E s ⊕ E u , so du + ds = n.

Figure 5.8: Unstable (E1 = Eu ) and stable (E2 = Es ) subspaces.

Each of E u and E s is invariant under the linearised dynamics (5.8), with

E s = {x ∈ Rn : ϕLt(x) → 0 as t → ∞},

the ‘stable subspace’, where ϕLt is the linearised flow arising from (5.8), and

E u = {x ∈ Rn : ϕLt(x) → 0 as t → −∞},

the ‘unstable subspace’. [Note that E s is not a ‘stable set’ in the Lyapunov sense.]
Definition 5.7. The stable manifold W s (x∗ ) and the unstable manifold W u (x∗ )
of the fixed point x∗ are the sets
W s (x∗ ) :={x ∈ Rn : ϕt (x) → x∗ as t → ∞}
W u (x∗ ) :={x ∈ Rn : ϕt (x) → x∗ as t → −∞};
they are the nonlinear versions of the stable and unstable subspaces defined above.

Note that these are invariant: if x ∈ W s (x∗ ) then for any s ∈ R we have
ϕt (ϕs x) = ϕt+s x → x∗
as t → ∞, so ϕs x ∈ W s (x∗ ); a very similar argument works to show that W u (x∗ )
is invariant.

We will state (and use) the following result about the existence of stable and
unstable manifolds.
Theorem 5.8 (Stable and unstable manifolds). Given the above assumptions there exists a stable manifold W s(x∗) of dimension ds and an unstable manifold W u(x∗) of dimension du, both of which contain x∗, and which are tangent (at x∗) to x∗ + E s and x∗ + E u, respectively.

Roughly speaking, an n-dimensional manifold M is a set such that near each


point of M , M ‘looks like’ a little bit of Rn . These are the subject of the third
year Manifolds module, but here we will only consider one-dimensional manifolds
(which are curves) and two-dimensional manifolds (which are surfaces).

In Figure 5.9, ds = 2 and du = 1. The stable manifold is tangent to the plane E s + x̄, and the unstable manifold is tangent to the line E u + x̄.

Note that if ds = n (i.e. x∗ is a sink) then E s = Rn and if du = n (i.e. x∗ is a


source) then E u = Rn .

The stable manifold is unstable as a set unless du = 0.

In two dimensions a fixed point x∗ is a saddle if and only if the eigenvalues


of Df (x∗ ) are λ and −µ, for some λ, µ > 0. There are precisely two orbits that
converge to x∗ as t → ∞, and they approach tangentially to the stable subspace
E s , one from each side. There are precisely two orbits that converge to x∗ as
t → −∞, and these are tangential to the unstable subspace (see Figure 5.10).

Figure 5.9: Stable and unstable manifolds in R3 near a hyperbolic fixed point x̄:
in this picture the stable manifold is two-dimensional and the unstable manifold
is one-dimensional.

Figure 5.10: Stable and unstable manifolds near a saddle point in R2 .

Stable and unstable manifolds for the damped pendulum

Recall from Section 4.2 that for the damped pendulum

θ̇ = p
ṗ = − sin θ − kp,

La Salle’s Invariance Principle implies that the ω-limit set of every orbit is one of
the fixed points. The fixed point (π, 0) is a saddle, so the SUM Theorem implies
that (π, 0) is the ω-limit set of precisely two orbits and is the α-limit set of two
other orbits. The origin is then the ω-limit set of all other orbits in S 1 × R.

Figure 5.11: Stable and unstable manifolds for the damped pendulum.

The stable manifold as ‘separatrix’

Note that if dim W u(x∗) ≠ 0 then W s(x∗) forms the boundary between different behaviours of the orbits; it is therefore sometimes called a separatrix. For example, in the phase portrait for the Duffing equation with damping

ẋ = y
ẏ = −ky + x − x³

(see Sheet 5, Q1) the origin is a saddle point and W s(0, 0) is the boundary between the basins of attraction of the two sinks at (−1, 0) and (1, 0).

Figure 5.12: Stable manifolds in the Duffing equation.

We will see the stable manifold playing the role of a ‘separatrix’ again later
when we look at predator-prey systems in certain parameter ranges.

5.3.1 Approximating the stable and unstable manifolds as
power series

If f is smooth enough, then we can compute W u (x∗ ) and W s (x∗ ) close to any fixed
point using power series.

We will look at some two-dimensional examples close to saddle points, where


the stable and unstable manifolds are one-dimensional. The same approach works
for both the stable and unstable manifolds.

Suppose that the system is

ẋ = f(x, y)
ẏ = g(x, y),

with a fixed point at the origin. Suppose that we want to approximate the unstable manifold, when the eigenvector in the unstable direction is (1, α)ᵀ.

We will look for the unstable manifold in the form2 y = Y (x), where Y is a
polynomial:
Y (x) = a0 + a1 x + a2 x2 + a3 x3 + · · · .
Note that we must have Y (0) = 0, because the manifold passes through the origin,
and we must have Y ′ (0) = a1 = α because we know that the unstable manifold is
tangent to the linear unstable space at the origin.

So we try
Y (x) = αx + a2 x2 + a3 x3 + · · · . (5.9)
Since the manifold is invariant, if we differentiate y = Y (x) we obtain

ẏ = Y ′ (x)ẋ;

we substitute in from the original ODEs, with y = Y (x), and end up with

g(x, Y (x)) = Y ′ (x)f (x, Y (x)).

We now substitute the power series expansion of Y (x) in (5.9) into this equation
and solve for the aj s by equating coefficients of powers of x.
² The only situation in which we cannot write the manifold as y = Y (x) is if the linear direction
is vertical; in this case the direction will be (0, 1), and we would have to look for the manifold
in the form x = X(y); we can use (almost) exactly the same approach. Differentiating, we
must have ẋ = X ′ (y)ẏ, and we’d look for a power series in the form X(y) = a2 y 2 + a3 y 3 + · · · ,
there being no linear term since the manifold has to be tangent to the y-axis at the origin.

For a simple example, consider the two-dimensional system

ẋ = x ẏ = −y + x2 (5.10)

near the fixed point (0, 0).

Figure 5.13: Stable and unstable manifolds in (5.10).

The y-axis (x = 0) is the stable manifold for the nonlinear system. The SUM
Theorem says that the unstable manifold W u (0) should be tangent to the x-axis,
so we try

Y (x) = a2 x2 + a3 x3 + · · · .

For this to be invariant we need ẏ = Y ′ (x)ẋ, i.e.

(−Y + x2 ) = Y ′ (x)x,

which gives

−(a2 x2 + a3 x3 + · · · + an xn + . . . ) + x2 = x(2a2 x + 3a3 x2 + · · · + nan xn−1 + · · · ).

Equating coefficients yields

coeff of x2 :  2a2 = −a2 + 1 ⇒ a2 = 1/3
coeff of x3 :  3a3 = −a3 ⇒ a3 = 0
coeff of xn :  nan = −an ⇒ an = 0 for all n > 2.

So in this case we just get Y (x) = x2 /3; here the series terminates, and the
unstable manifold is exactly the parabola y = x2 /3.
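This order-by-order matching is easy to automate. The following is a minimal
sympy sketch (our own illustration, not part of the notes; the truncation order N
is an arbitrary choice) which recovers a2 = 1/3 and an = 0 for n > 2 for (5.10):

import sympy as sp

x = sp.symbols('x')
N = 6                                       # truncation order (arbitrary)
coeffs = sp.symbols(f'a2:{N + 1}')          # a2, a3, ..., aN
Y = sum(a * x**k for k, a in enumerate(coeffs, start=2))

f = x               # xdot = f(x, y) = x
g = -Y + x**2       # ydot = g(x, y), evaluated on y = Y(x)

# invariance: g(x, Y(x)) - Y'(x) f(x, Y(x)) must vanish order by order
residual = sp.expand(g - sp.diff(Y, x) * f)
equations = [residual.coeff(x, k) for k in range(2, N + 1)]
print(sp.solve(equations, coeffs, dict=True)[0])
# expected: {a2: 1/3, a3: 0, a4: 0, a5: 0, a6: 0}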

Example: the damped Duffing equation near the origin.

Now we look at the unstable manifold at the origin in the damped Duffing equa-
tion, choosing k = 3/2 (which simplifies the algebra). Recall that the equations
are

ẋ = y
ẏ = −3y/2 + x − x3 .

At the origin the linearisation is

Ż = ( 0 1 ; 1 −3/2 ) Z;

the eigenvalues are the solutions of λ(λ + 3/2) − 1 = 0, i.e.

λ = (−3/2 ± √(9/4 + 4))/2 = (−3/2 ± 5/2)/2 = −2 or 1/2.
 
The unstable direction has eigenvector (1, 1/2). So we look for the unstable
manifold in the form of the graph y = Y (x), where

Y (x) = x/2 + a2 x2 + a3 x3 + · · · .

We have ẏ = Y ′ (x)ẋ, so

−(3/2)Y (x) + x − x3 = Y ′ (x)Y (x).
This gives

−(3/2)[x/2 + a2 x2 + a3 x3 + · · · ] + x − x3
    = [1/2 + 2a2 x + 3a3 x2 + · · · ][x/2 + a2 x2 + a3 x3 + · · · ].

We compare coefficients. The coefficient of x should already be correct:

−3/4 + 1 = (1/2)(1/2) = 1/4.
If this does not work then you have got your eigenvector in the wrong direction!

Coefficients of x2 :

−(3/2)a2 = (1/2)a2 + 2a2 (1/2) ⇒ a2 = 0.
Coefficients of x3 :

−(3/2)a3 − 1 = (1/2)a3 + 2a2^2 + (3/2)a3 ⇒ (using a2 = 0) a3 = −2/7.
To O(x3 ) we have

Y (x) = x/2 − 2x3 /7.

The dynamics on the unstable manifold are given by the one-dimensional equation
in x alone: since ẋ = y = Y (x) on the manifold,

ẋ = x/2 − 2x3 /7.

We could do the same for the stable manifold at (0, 0); the stable direction
is (1, −2), so we could look for the stable manifold in the form y = Y (x), where
Y (x) = −2x + a2 x2 + a3 x3 + · · · . This is invariant too, so we still have ẏ = Y ′ (x)ẋ,
i.e.

−(3/2)Y (x) + x − x3 = Y ′ (x)Y (x),
which now gives

−(3/2)[−2x + a2 x2 + a3 x3 + · · · ] + x − x3
    = [−2 + 2a2 x + 3a3 x2 + · · · ][−2x + a2 x2 + a3 x3 + · · · ].

Once again the coefficient of x is already correct:

3 + 1 = 4 = (−2)(−2).

The O(x2 ) term again shows that a2 = 0, and then at O(x3 ) [using the fact that
a2 = 0] we obtain

−(3/2)a3 − 1 = −2a3 − 6a3 ⇒ a3 = 2/13;
2
to third order the stable manifold is y = −2x + 2x3 /13, and the equation on the
stable manifold is

ẋ = −2x + 2x3 /13.
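The same computation can be scripted. In the sketch below (again our own
addition; the function name and truncation order are arbitrary choices) the slope
alpha selects the branch: alpha = 1/2 gives the unstable manifold and alpha = −2
the stable one.

import sympy as sp

def manifold_series(alpha, N=5):
    """Series y = alpha*x + a2*x^2 + ... for an invariant manifold of the
    damped Duffing equation with k = 3/2, found by matching coefficients."""
    x = sp.symbols('x')
    coeffs = sp.symbols(f'a2:{N + 1}')
    Y = alpha * x + sum(a * x**k for k, a in enumerate(coeffs, start=2))
    f = Y                                   # xdot = y, with y = Y(x)
    g = -sp.Rational(3, 2) * Y + x - x**3   # ydot = -3y/2 + x - x^3
    residual = sp.expand(g - sp.diff(Y, x) * f)
    sol = sp.solve([residual.coeff(x, k) for k in range(2, N + 1)],
                   coeffs, dict=True)[0]
    return sp.expand(Y.subs(sol))

print(manifold_series(sp.Rational(1, 2)))  # x/2 - 2*x**3/7 + higher order
print(manifold_series(-2))                 # -2*x + 2*x**3/13 + higher order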
If you want to do this procedure near a fixed point (x∗ , y ∗ ) that is not the origin
there are two possible approaches: (i) look for a manifold given by the equation

Y (x) = y ∗ + α(x − x∗ ) + a2 (x − x∗ )2 + a3 (x − x∗ )3 + · · · ,

or (ii) change coordinates in the original equations: put ξ = x − x∗ and η = y − y ∗ ,
look for the manifold near the origin in (ξ, η) coordinates, and then transform
back.

The Hartman–Grobman Theorem

We have seen that the dynamics near a hyperbolic fixed point are similar to
those of the linearised system: sinks are still attracting, and saddles have sta-
ble and unstable manifolds that play the same role as the stable and unstable
subspaces. The Hartman–Grobman Theorem shows that orbits of the linearised
system correspond to orbits of the nonlinear system close to the fixed point (the
‘pictures’ are ‘the same’).
Theorem 5.9 (Hartman–Grobman). Suppose that x∗ is a hyperbolic fixed point
of ẋ = f (x) on Rn , where f is continuously differentiable. Then there exists a
neighbourhood U of x∗ and a homeomorphism h from U onto a neighbourhood V
of 0 ∈ Rn such that h ◦ ϕt = ψt ◦ h, where ϕt is the flow associated to ẋ = f (x)
and ψt is the flow associated to the linearisation ξ̇ = Df (x∗ )ξ.

In other words, locally near the fixed point h maps orbits to orbits and preserves
the time parametrisation.

Figure 5.14: Illustration of the Hartman–Grobman Theorem

We omit the proof.

5.4 Non-hyperbolic fixed points in R2

We now consider what happens near a non-hyperbolic fixed point x∗ of ẋ = f (x),


i.e. a point where at least one eigenvalue of Df (x∗ ) has zero real part.

Examples:

(i) one eigenvalue zero, the other non-zero, e.g. both systems

ẋ = x2 ẋ = x3
ẏ = −y ẏ = −y (5.11)

have the same linearisation at zero, ẋ = 0, ẏ = −y. But the phase portraits are
very different.

Figure 5.15: Phase portrait for the linearised system ẋ = 0, ẏ = −y.

Figure 5.16: Phase portraits for the nonlinear systems in (5.11): ‘saddle-node’ on
the left; ‘nonlinear saddle’ on the right.

(ii) the eigenvalues of Df (x∗ ) are purely imaginary (complex conjugate ±iω)

The linearised system would be a centre. In the case ω = 1 we’d have

ẋ = y
ẏ = −x.

The nonlinear system can be many things:

(a) a ‘nonlinear centre’, like (0, 0) for the pendulum or (±1, 0) for the conservative
Duffing equation.

Figure 5.17: ‘Nonlinear centres’: the pendulum (left) and Duffing’s equation (right)

(b) a sink (‘nonlinear stable focus’), e.g. for

ẋ = y − x(x2 + y 2 ),  ẏ = −x − y(x2 + y 2 ),

which in polar coordinates is ṙ = −r3 , θ̇ = −1;

(c) a source (‘nonlinear unstable focus’), e.g. for

ẋ = y + x(x2 + y 2 ),  ẏ = −x + y(x2 + y 2 ),

which in polar coordinates is ṙ = r3 , θ̇ = −1.

Figure 5.18: Nonlinear stable and unstable foci.
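The Cartesian and polar forms here are related by the identities rṙ = xẋ + y ẏ
and r2 θ̇ = xẏ − y ẋ; as a small added illustration, a sympy check of case (b):

import sympy as sp

x, y = sp.symbols('x y', real=True)
r2 = x**2 + y**2
xdot = y - x * r2      # case (b): nonlinear stable focus
ydot = -x - y * r2

print(sp.factor(x * xdot + y * ydot))   # -(x**2 + y**2)**2, so rdot = -r^3
print(sp.factor(x * ydot - y * xdot))   # -x**2 - y**2,      so thetadot = -1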

(d) infinitely many isolated periodic orbits encircling x∗ , accumulating there, e.g.
for

ẋ = y + x sin(1/√(x2 + y 2 ))
ẏ = −x + y sin(1/√(x2 + y 2 )),

which in polar coordinates is θ̇ = −1 with ṙ = r sin(1/r) for r ̸= 0 (and ṙ = 0
when r = 0); each circle r = 1/(nπ), n ∈ N, is a periodic orbit.

Figure 5.19: Infinitely many periodic orbits.

5.5 Global phase portraits by combining local
pictures

We look at Lotka–Volterra equations from population dynamics as an example.

Recall from Differential Equations that the logistic equation

ẋ = x(A − ax)

is a simple population model for one species: A is the rate of growth of the
population without any restrictions (ẋ = Ax) and the −ax2 term limits growth
due to finite resources.

Lotka–Volterra models are coupled logistic equations: for two species x and y,

ẋ = x(A − a1 x + b1 y)
ẏ = y(B + b2 x − a2 y),

where A, B, a1 , a2 > 0, and b1 and b2 could be of either sign. The sign of the bi
determines the type of interaction:

(i) b1 , b2 > 0. Cooperative case. An increase in one population enhances the


growth of the other, e.g. flowers and bees.

(ii) b1 , b2 < 0. Competition. Both species are in competition for the same re-
source, e.g. rabbits and sheep (and grass).

(iii) b1 > 0 and b2 < 0. Predator-prey. With these parameters x is the predator
and y the prey (e.g. wolves and sheep).

Example: competitive case
ẋ = x(3 − x − 2y)
ẏ = y(2 − x − y).

STEP 1: Find the fixed points and study the linearisation about each fixed
point.

For fixed points we need ẋ = ẏ = 0, i.e.

x(3 − x − 2y) = 0  and  y(2 − x − y) = 0,

which gives the four fixed points (0, 0), (0, 2), (3, 0), and (1, 1).

We have

Df (x, y) = ( 3 − 2x − 2y   −2x ; −y   2 − x − 2y ),

and so:

(0,0): Df (0, 0) = ( 3 0 ; 0 2 ), with eigenvalues and eigenvectors

λ1 = 3, e1 = (1, 0)  and  λ2 = 2, e2 = (0, 1);

this is an unstable node.

(0,2): Df (0, 2) = ( −1 0 ; −2 −2 ), with eigenvalues and eigenvectors

λ1 = −1, e1 = (1, −2)  and  λ2 = −2, e2 = (0, 1);

this is a stable node, as is

(3,0): Df (3, 0) = ( −3 −6 ; 0 −1 ), with eigenvalues and eigenvectors

λ1 = −3, e1 = (1, 0)  and  λ2 = −1, e2 = (3, −1).

(1,1): Df (1, 1) = ( −1 −2 ; −1 −1 ), with eigenvalues and eigenvectors

λ± = −1 ± √2,  e± = (1, ∓1/√2);

since λ+ = −1 + √2 > 0 > λ− , this is a saddle point.
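These hand computations are easily checked by machine; here is a short sympy
sketch (our addition) that finds the fixed points and the eigenvalues of Df at
each one:

import sympy as sp

x, y = sp.symbols('x y', real=True)
f1 = x * (3 - x - 2 * y)
f2 = y * (2 - x - y)
J = sp.Matrix([f1, f2]).jacobian([x, y])

for fp in sp.solve([f1, f2], [x, y], dict=True):
    print((fp[x], fp[y]), J.subs(fp).eigenvals())
# expected: (0,0): {3, 2};  (0,2): {-1, -2};  (3,0): {-3, -1};
#           (1,1): {-1 + sqrt(2), -1 - sqrt(2)}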

STEP 2. Note that all the fixed points are hyperbolic. So the local phase por-
traits near the fixed points look like the phase portraits for the linearised systems.

Figure 5.20: Local phase portraits in a Lotka–Volterra model.

So now we ‘join the dots’, paying attention also to the directions of the vector
field in the regions separated by the ‘nullclines’ (the curves on which ẋ = 0 or
ẏ = 0).

Figure 5.21: Nullclines in the Lotka–Volterra model.

We only need to draw the quadrant x, y ≥ 0, since we are interested in the
equations as a model of species populations. Note that this region is invariant
(if it were not, the model would be a very poor one).

Figure 5.22: Full phase portrait for the Lotka–Volterra model.

Note that in the final phase portrait, the stable manifold W s (1, 1) of the saddle
at (1, 1) divides the quadrant into two regions [the term ‘separatrix’ is sometimes
used]. Everything below this manifold tends to (3, 0), and everything above it
to (0, 2). So in this model one species eventually dies out and the other reaches
a steady population, depending on the initial conditions, apart from the very
special case where the initial condition lies on W s (1, 1), in which case we end up
with equal numbers of both species at the saddle. But this equilibrium is unstable,
so any small change will lead to the extinction of one species.
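A rough numerical illustration of this bistability (our addition, assuming scipy
is available; the two initial conditions are simply chosen well below and well
above the stable manifold):

import numpy as np
from scipy.integrate import solve_ivp

def lv(t, u):
    x, y = u
    return [x * (3 - x - 2 * y), y * (2 - x - y)]

for u0 in ([2.0, 0.5], [0.5, 2.0]):
    sol = solve_ivp(lv, (0, 100), u0, rtol=1e-9, atol=1e-12)
    print(u0, '->', np.round(sol.y[:, -1], 3))
# expected: the first orbit ends near (3, 0), the second near (0, 2)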

[With different parameters other behaviours are possible in competitive models.]

We will return to this class of examples again soon.

Chapter 6

Periodic orbits in
two-dimensional systems

We now turn to two results that allow us to prove the existence/non-existence of


periodic orbits in planar systems.

6.1 The Poincaré–Bendixson Theorem

This result enables us to prove the existence of a periodic orbit.

Theorem 6.1 (Poincaré–Bendixson Theorem). Suppose that ẋ = f (x) on R2 ,


with f continuously differentiable. Denote the associated flow by ϕ. If ϕt (x) ∈ K
for all t ≥ 0, where K is compact, then either ω(x) contains a fixed point or ω(x)
is a periodic orbit.

So we can prove the existence of a periodic orbit by finding a compact forward-


invariant region K that contains no fixed points: the Poincaré–Bendixson Theorem
then guarantees that ω(x) is a periodic orbit for each x ∈ K. A periodic orbit
γ with γ = ω(x) for some x ∈ / γ is called a limit cycle (so, for example, periodic
orbits surrounding a linear centre are not limit cycles).

Proof. Since ϕt (x) ∈ K for all t ≥ 0, ω(x) is a non-empty subset of K
(Proposition 3.10). If we take y ∈ ω(x) then ϕt (y) ∈ ω(x) ⊂ K for all t ≥ 0, so
ω(y) is also non-empty (and a subset of K).

Pick some z ∈ ω(y). If z is a fixed point then z ∈ ω(y) ⊂ ω(x), so we fall into
the first possibility in the theorem.

Otherwise z is not a fixed point, so f (z) ̸= 0. Since f is continuous we can
find a ‘local transverse section’ Σ to the flow near z: a line segment containing z,
with normal n, on which f (x) · n has constant sign (we could take n = f (z)/|f (z)|,
for example).

Figure 6.1: A local transverse section to the flow near z.

So the orbits of points on Σ all cross Σ from the same side.

Since z ∈ ω(y) there is a sequence tk → ∞ such that ϕtk (y) → z as k → ∞.


So O+ (y) intersects Σ arbitrarily close to z.

We want to prove that all these intersections happen at the same point, in
which case O+ (y) is periodic and the common intersection point is z.

Figure 6.2: We want to show that O+ (y) is a periodic orbit.

Suppose that this is not the case, and let y1 and y2 be two consecutive, distinct
intersections of O+ (y) with Σ.

The region bounded by this portion of the orbit and a section of Σ,

{ϕt (y) : t1 ≤ t ≤ t2 } ∪ [y1 , y2 ],  where yi = ϕti (y),

is a ‘trapping region’: either the orbits can enter (through Σ) and then they cannot
escape (left-hand picture), or they can start inside and leave through Σ but then
never return (as in the right-hand picture).

Figure 6.3: (i) once inside orbits cannot escape; (ii) once out, orbits cannot re-
enter.

Now, since y ∈ ω(x) the orbit of x must come arbitrarily close to y; and then,
since solutions depend continuously on the initial conditions, this orbit will stay
close to the orbit through y, so must at some point cross Σ through [y1 , y2 ]. It
is then either trapped inside (so cannot return close to y), or in the second case
trapped outside, and again cannot return close to y. This contradicts the fact
that y ∈ ω(x), so we must have y1 = y2 . This means that ϕt1 (y) = ϕt2 (y), so
ϕt2 −t1 (y) = y and y lies on a periodic orbit γ that is contained in ω(x).

Finally, we want to make sure that in fact ω(x) = γ; to do this we can show
that ϕt (x) → γ as t → ∞.

We take a new local transverse section Σ′ , now through y. The forward orbit
O+ (x) intersects Σ′ arbitrarily close to y at times tj → ∞, and these intersections
must converge monotonically to y, because the forward orbit of each intersection
point is trapped between the earlier part of O+ (x) and the periodic orbit γ.

Figure 6.4: Intersections move monotonically along Σ.

For any neighbourhood V of γ, the fact that ϕ is continuous and ϕtj (x) → y ∈ γ
as j → ∞ implies that there exists a j such that ϕt (x) ∈ V for all t ∈ [tj , tj+1 ].
But then the trapping argument implies that in fact ϕt (x) ∈ V for all t ≥ tj , and
so ϕt (x) → γ; thus ω(x) can contain no points outside γ and the proof is complete.

The theorem relies on the fact that a Jordan curve (a continuous simple closed
curve) divides the state space into two disjoint parts. So the theorem works on
R2 , and also on the sphere S 2 and the cylinder S 1 × R, but not on the torus T2 ;
nor does it work in Rn when n > 2.

Application of the Poincaré–Bendixson Theorem

Typically we find a bounded annular region D that is positively invariant and
contains no fixed points, e.g.

D = {(r, θ) : R1 ≤ r ≤ R2 },  with ṙ ≥ 0 on r = R1 and ṙ ≤ 0 on r = R2 .

(See Sheet 5, Q4&5.)

Figure 6.5: A Jordan curve divides the state space into two disjoint parts: this
works on R2 and S 2 but not on the torus.

Figure 6.6: A positively invariant annular region.

Example: the system

ẋ = y + (1/4)x(1 − 2x2 − 2y 2 )
ẏ = −x + (1/2)y(1 − x2 − y 2 ),

where r2 = x2 + y 2 . Then

r2 θ̇ = ẏx − ẋy = x(−x + (1/2)y(1 − r2 )) − y(y + (1/4)x(1 − 2r2 ))
     = −r2 + (1/4)r2 sin θ cos θ,

so

θ̇ = −1 + (1/8) sin 2θ,

which is strictly negative for all θ, so there are no fixed points (apart from the
origin). To find the radial component we have

rṙ = xẋ + y ẏ = x(y + (1/4)x(1 − 2r2 )) + y(−x + (1/2)y(1 − r2 ))
   = (x2 /4)(1 − 2r2 ) + (y 2 /2)(1 − r2 )
   = x2 /4 + y 2 /4 + y 2 /4 − (x2 r2 )/2 − (y 2 r2 )/2
   = (r2 /4)(1 + sin2 θ) − r4 /2,

so

ṙ = (r/4)(1 + sin2 θ) − r3 /2.

Since 1 ≤ 1 + sin2 θ ≤ 2, we have ṙ ≥ 0 for all θ if r/2 − r3 ≥ 0, i.e. for
0 < r ≤ 1/√2, and ṙ ≤ 0 for all θ if r − r3 ≤ 0, i.e. if r ≥ 1.

So the region

{(r, θ) : 1/√2 ≤ r ≤ 1}
is forward invariant and contains no fixed points, so it must contain at least one
periodic orbit. All orbits except that of the origin enter this region, so their ω-limit
sets are periodic orbits.
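A numerical sanity check of the trapping annulus (again our own addition,
assuming scipy): orbits started inside and outside the annulus should both settle
onto a limit cycle with 1/√2 ≤ r ≤ 1.

import numpy as np
from scipy.integrate import solve_ivp

def rhs(t, u):
    x, y = u
    r2 = x * x + y * y
    return [y + 0.25 * x * (1 - 2 * r2), -x + 0.5 * y * (1 - r2)]

for u0 in ([0.1, 0.0], [3.0, 0.0]):     # one orbit inside, one outside
    sol = solve_ivp(rhs, (0, 200), u0, rtol=1e-9, atol=1e-12)
    print(u0, '-> final radius', round(float(np.hypot(*sol.y[:, -1])), 3))
# expected: both final radii lie between 1/sqrt(2) ~ 0.707 and 1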

6.2 Non-existence of periodic orbits: Dulac’s
criterion

We will use the Divergence Theorem: if Γ is a simple closed curve in R2 enclosing
a region Ω, and f : R2 → R2 and g : R2 → R are both C 1 , then

∫Γ g(f · n) dl = ∫∫Ω ∇ · (gf ) dx dy,

where n denotes the outward normal to Ω.

Theorem 6.2 (Dulac’s Criterion). If there exists a C 1 function g : R2 → R such


that ∇·(gf ) is continuous and non-zero on some simply connected domain D, then
no periodic orbit of ẋ = f (x) can lie entirely within D.

Figure 6.7: Setup for the Divergence Theorem.

The function g is the ‘weight function’. In the simple case g = 1 this is called the
‘Divergence Test’. The theorem can be (easily) generalised to rule out homoclinic
or heteroclinic loops.

Proof. Suppose that some periodic orbit γ does lie entirely in D, and let Ω be the
region enclosed by γ. Then since f is tangent to γ (as γ is a trajectory), f · n = 0,
where n is the outward-pointing normal to Ω. So
∫γ g(f · n) dl = 0.

By the Divergence Theorem, this implies that

∫∫Ω ∇ · (gf ) dx dy = 0.

But since Ω ⊂ D (this is where we use that D is simply connected), our
assumption implies that the scalar function ∇ · (gf ) is either strictly positive or
strictly negative throughout Ω, so this integral must be non-zero, a contradiction.
So there is no periodic orbit lying wholly within D.

6.2.1 Example: the Lotka–Volterra model again

Again we consider

ẋ = x(A − a1 x + b1 y)
ẏ = y(B + b2 x − a2 y),

with a1 , a2 > 0 (in the positive quadrant x, y ≥ 0).

We use the weight function g(x, y) = 1/(xy). Then

gf = ( (A − a1 x + b1 y)/y , (B + b2 x − a2 y)/x )

and

∇ · (gf ) = −a1 /y − a2 /x < 0

for all x, y > 0. So there are no periodic orbits in the positive quadrant; and the
same argument rules them out whenever at least one of a1 , a2 is strictly positive.
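The divergence computation is easily verified symbolically; a minimal sympy
sketch (our addition):

import sympy as sp

x, y, a1, a2 = sp.symbols('x y a1 a2', positive=True)
A, B, b1, b2 = sp.symbols('A B b1 b2', real=True)
f1 = x * (A - a1 * x + b1 * y)
f2 = y * (B + b2 * x - a2 * y)
g = 1 / (x * y)

div = sp.simplify(sp.diff(g * f1, x) + sp.diff(g * f2, y))
print(div)   # -a1/y - a2/x (sympy may print it as -(a1*x + a2*y)/(x*y))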

Can we obtain periodic orbits in the a1 = a2 = 0 case?

Consider the system


ẋ = x(−A + b1 y)
ẏ = y(B − b2 x),
where all the parameters are positive. In this system species x dies out if isolated,
while species y will grow: for example, pikes (x) and eels (y), assuming ample food
supplies for the eels.

This system has two fixed points,

(0, 0)  and  (B/b2 , A/b1 ).
We have

Df (x, y) = ( −A + b1 y   b1 x ; −b2 y   B − b2 x );

Df (0, 0) = ( −A 0 ; 0 B ), and so (0, 0) is a saddle;

Df (B/b2 , A/b1 ) = ( 0   b1 B/b2 ; −b2 A/b1   0 )

with eigenvalues ±i√(AB), so for the linearisation this fixed point is a centre.

Are there actually periodic orbits? In this case we can find the explicit form
of the solution curves:

dy/dx = ẏ/ẋ = y(B − b2 x)/(x(−A + b1 y)).

Separating variables we have

∫ ((−A + b1 y)/y) dy = ∫ ((B − b2 x)/x) dx

Figure 6.8: Periodic orbits in the ‘pike and eels’ model.

and so
−A log y + b1 y − B log x + b2 x = E, constant.
So in fact this is a conservative system with
H(x, y) := −A log y + b1 y − B log x + b2 x
conserved along orbits.

By the Level Set Lemma, the level sets of H in x, y > 0 are closed curves away
from the interior fixed point, since (i) the only critical point of H is where

∂H/∂x = −B/x + b2 = 0  and  ∂H/∂y = −A/y + b1 = 0,

i.e. at (B/b2 , A/b1 ), and (ii) the level sets are bounded, since

H(x, y) = E  ⇒  exp(b1 y + b2 x)/(y^A x^B) = e^E,

and e^(b1 y) y^(−A) → ∞ as y → 0 and as y → ∞ (similarly for e^(b2 x) x^(−B)).

Let H0 = H(B/b2 , A/b1 ). Then for any E with H0 < E < ∞ the curve
H(x, y) = E is a periodic orbit. The argument is the same as we used for the
pendulum (finite length, no fixed points, so the trajectory moves in the same
direction with a speed that is bounded below).
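One can also confirm symbolically that H is constant along orbits, i.e. that
dH/dt = (∂H/∂x)ẋ + (∂H/∂y)ẏ vanishes identically; a quick sympy check (our
addition):

import sympy as sp

x, y, A, B, b1, b2 = sp.symbols('x y A B b1 b2', positive=True)
xdot = x * (-A + b1 * y)
ydot = y * (B - b2 * x)
H = -A * sp.log(y) + b1 * y - B * sp.log(x) + b2 * x

print(sp.simplify(sp.diff(H, x) * xdot + sp.diff(H, y) * ydot))   # 0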

