
Cognitive Science and Technology

James K. Peterson

Calculus for Cognitive Scientists
Higher Order Models and Their Analysis
Cognitive Science and Technology

Series editor
David M.W. Powers, Adelaide, Australia

More information about this series at http://www.springer.com/series/11554


The seadragons were intrigued by calculus and flocked to the teacher
James K. Peterson

Calculus for Cognitive Scientists
Higher Order Models and Their Analysis

James K. Peterson
Department of Mathematical Sciences
Clemson University
Clemson, SC
USA

ISSN 2195-3988 ISSN 2195-3996 (electronic)


Cognitive Science and Technology
ISBN 978-981-287-875-5 ISBN 978-981-287-877-9 (eBook)
DOI 10.1007/978-981-287-877-9

Library of Congress Control Number: 2015958343

© Springer Science+Business Media Singapore 2016


This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part
of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission
or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar
methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are exempt from
the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this
book are believed to be true and accurate at the date of publication. Neither the publisher nor the
authors or the editors give a warranty, express or implied, with respect to the material contained herein or
for any errors or omissions that may have been made.

Printed on acid-free paper

This Springer imprint is published by SpringerNature


The registered company is Springer Science+Business Media Singapore Pte Ltd.

I dedicate this work to the biology students
who have learned this material over the last
10 semesters, the practicing biologists, the
immunologists, the cognitive scientists, and
the computer scientists who have helped an
outsider think much better and to my family
who have listened to my ideas in the living
room and over dinner for many years. I hope
that this text helps inspire everyone who
works in science to consider mathematics and
computer science as indispensable tools in
their own work in the biological sciences.
Acknowledgments

We would like to thank all the students who have used the various iterations
of these notes as they have evolved from handwritten to the fourth fully typed
version here. We particularly appreciate your interest as this course is required and
uses mathematics, a combination that causes fear in many biological science
majors. We have been pleased by the enthusiasm you have brought to this
interesting combination of ideas from many disciplines. Finally, we gratefully
acknowledge the support of Hap Wheeler in the Department of Biological Sciences
during the years 2006 to 2014 for believing that this material would be useful to
biology students.
For this new text on a follow-up course to the first course on calculus for
cognitive scientists, we would like to thank all of the students from Spring 2006 to
Fall 2014 for their comments and patience with the inevitable typographical errors,
mistakes in the way we explained topics, and organizational flaws as we have
taught second-semester calculus ideas to them. This new text assumes you know
the equivalent of a first-semester course in calculus and, in particular, know about
exponential and logarithm functions, first-order models, and the MATLAB tools
needed to solve such models numerically. In addition, you need a working start on
calculus for functions of more than one variable and the ideas of approximating
functions of one and two variables. These are not standard topics in a single
calculus course, which is why our first volume was written to cover all of them. In
addition, all of the mathematics subserves ideas from biological models, so
everything is wrapped together in a pleasing package!
With that background given, in this text, we add new material on linear and
nonlinear systems models and more biological models. We also cover a useful way
of solving what are called linear partial differential equations using the technique

named Separation of Variables. To make sense of all this, we naturally have to dip
into mathematical waters at appropriate points and we are not shy about that! But
rest assured, everything we do is carefully planned because it is of great use to you
in your attempts to forge an alliance between cognitive science, mathematics, and
computation.
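
As a reminder of the kind of MATLAB/Octave background we assume, here is a minimal sketch of Euler's method applied to the first-order logistic model x' = 0.5x(60 − x), x(0) = 20 that appears in Figure 3.1. This is not one of the listings from the text; the step size, number of steps, and variable names are our own illustrative choices.

% Minimal Euler sketch for a first order model x' = f(t,x).
% Illustrative only; the text develops its own, more careful codes.
f  = @(t,x) 0.5*x.*(60 - x);        % logistic right hand side
t0 = 0;  x0 = 20;                   % initial condition x(0) = 20
h  = 0.01;  N = 100;                % step size and number of steps (our choices)
t = zeros(N+1,1);  x = zeros(N+1,1);
t(1) = t0;  x(1) = x0;
for k = 1:N
  x(k+1) = x(k) + h*f(t(k), x(k));  % Euler update
  t(k+1) = t(k) + h;
end
plot(t,x);
xlabel('t');  ylabel('x(t)');
title('Euler approximation to x'' = 0.5 x (60 - x), x(0) = 20');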
Contents

Part I Introduction
1 Introductory Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.1 A Roadmap to the Text. . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.2 Code. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.3 Final Thoughts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

Part II Review
2 Linear Algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.1 The Inner Product of Two Column Vectors . . . . . . . . . . . . . . 11
2.1.1 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.2 Interpreting the Inner Product. . . . . . . . . . . . . . . . . . . . . . . . 13
2.2.1 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.2.2 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.3 Determinants of 2 × 2 Matrices . . . . . . . . . . . . . . . . . . . . 19
2.3.1 Worked Out Problems . . . . . . . . . . . . . . . . . . . . . . 20
2.3.2 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.4 Systems of Two Linear Equations. . . . . . . . . . . . . . . . . . . . . 21
2.4.1 Worked Out Examples. . . . . . . . . . . . . . . . . . . . . . 23
2.4.2 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.5 Solving Two Linear Equations in Two Unknowns . . . . . . . . . 25
2.5.1 Worked Out Examples. . . . . . . . . . . . . . . . . . . . . . 28
2.5.2 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.6 Consistent and Inconsistent Systems . . . . . . . . . . . . . . . . . . . 31
2.6.1 Worked Out Examples. . . . . . . . . . . . . . . . . . . . . . 33
2.6.2 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
2.7 Specializing to Zero Data . . . . . . . . . . . . . . . . . . . . . . . . . . 35
2.7.1 Worked Out Examples. . . . . . . . . . . . . . . . . . . . . . 37
2.7.2 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38


2.8 Matrix Inverses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39


2.8.1 Worked Out Examples. . . . . . . . . . . . . . . . . . . . . . 40
2.8.2 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
2.9 Computational Linear Algebra . . . . . . . . . . . . . . . . . . . . . . . 42
2.9.1 A Simple Lower Triangular System . . . . . . . . . . . . 42
2.9.2 A Lower Triangular Solver . . . . . . . . . . . . . . . . . . 43
2.9.3 An Upper Triangular Solver . . . . . . . . . . . . . . . . . . 43
2.9.4 The LU Decomposition of A Without Pivoting. . . . . 44
2.9.5 The LU Decomposition of A with Pivoting . . . . . . . 48
2.10 Eigenvalues and Eigenvectors . . . . . . . . . . . . . . . . . . . . . . . 50
2.10.1 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
2.10.2 The General Case . . . . . . . . . . . . . . . . . . . . . . . . . 56
2.10.3 The MatLab Approach. . . . . . . . . . . . . . . . . . . . . . 57
Reference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
3 Numerical Methods Order One ODEs. . . . . . . . . . . . . . . . . . . . . . 61
3.1 Taylor Polynomials. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
3.1.1 Fundamental Tools . . . . . . . . . . . . . . . . . . . . . . . . 62
3.1.2 The Zeroth Order Taylor Polynomial. . . . . . . . . . . . 63
3.1.3 The First Order Taylor Polynomial . . . . . . . . . . . . . 64
3.1.4 Quadratic Approximations . . . . . . . . . . . . . . . . . . . 67
3.2 Euler’s Method with Time Independence. . . . . . . . . . . . . . . . 69
3.3 Euler’s Method with Time Dependence . . . . . . . . . . . . . . . . . 74
3.3.1 Lipschitz Functions and Taylor Expansions . . . . . . . 74
3.4 Euler’s Algorithm. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
3.4.1 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
3.5 Runge–Kutta Methods. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
3.5.1 The MatLab Implementation . . . . . . . . . . . . . . . . . 81
4 Multivariable Calculus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
4.1 Functions of Two Variables . . . . . . . . . . . . . . . . . . . . . . . . . 85
4.1.1 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
4.2 Continuity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
4.3 Partial Derivatives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
4.3.1 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
4.4 Tangent Planes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
4.4.1 The Vector Cross Product . . . . . . . . . . . . . . . . . . . 98
4.4.2 Back to Tangent Planes . . . . . . . . . . . . . . . . . . . . . 102
4.4.3 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
4.4.4 Computational Results . . . . . . . . . . . . . . . . . . . . . . 104
4.4.5 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
4.5 Derivatives in Two Dimensions!. . . . . . . . . . . . . . . . . . . . . . 106
4.6 The Chain Rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
4.6.1 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110

4.7 Tangent Plane Approximation Error . . . . . . . . . . . . . . . . . . . 111


4.8 Second Order Error Estimates . . . . . . . . . . . . . . . . . . . . . . . 114
4.8.1 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
4.8.2 Hessian Approximations . . . . . . . . . . . . . . . . . . . . 116
4.9 Extrema Ideas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
4.9.1 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
4.9.2 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124

Part III The Main Event


5 Integration. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
5.1 Integration by Parts. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
5.1.1 How Do We Use Integration by Parts? . . . . . . . . . . 130
5.1.2 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
5.2 Partial Fraction Decompositions . . . . . . . . . . . . . . . . . . . . . . 135
5.2.1 How Do We Use Partial Fraction Decompositions
to Integrate? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
5.2.2 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
Reference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
6 Complex Numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
6.1 The Complex Plane . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
6.1.1 Complex Number Calculations . . . . . . . . . . . . . . . . 143
6.1.2 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
6.2 Complex Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
6.2.1 Calculations with Complex Functions . . . . . . . . . . . 146
6.2.2 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
7 Linear Second Order ODEs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
7.1 Homework. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
7.2 Distinct Roots . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
7.2.1 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
7.3 Repeated Roots . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
7.3.1 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
7.4 Complex Roots . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
7.4.1 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
7.4.2 The Phase Shifted Solution . . . . . . . . . . . . . . . . . . 168
8 Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
8.1 Finding a Solution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
8.1.1 Worked Out Examples. . . . . . . . . . . . . . . . . . . . . . 172
8.1.2 A Judicious Guess . . . . . . . . . . . . . . . . . . . . . . . . 173
8.1.3 Sample Characteristic Equation Derivations . . . . . . . 179
8.1.4 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181

8.2 Two Distinct Eigenvalues . . . . . . . . . . . . . . . . . . . . . . . . . . 182


8.2.1 Worked Out Solutions . . . . . . . . . . . . . . . . . . . . . . 186
8.2.2 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191
8.2.3 Graphical Analysis . . . . . . . . . . . . . . . . . . . . . . . . 192
8.3 Repeated Eigenvalues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 216
8.3.1 The Repeated Eigenvalue Has Two Linearly
Independent Eigenvectors. . . . . . . . . . . . . . . . . . . . 216
8.3.2 The Repeated Eigenvalue Has only One
Eigenvector . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217
8.4 Complex Eigenvalues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222
8.4.1 The General Real and Complex Solution . . . . . . . . . 222
8.4.2 Rewriting the Real Solution . . . . . . . . . . . . . . . . . . 225
8.4.3 The Representation of A . . . . . . . . . . . . . . . . . . . . 226
8.4.4 The Transformed Model . . . . . . . . . . . . . . . . . . . . 227
8.4.5 The Canonical Solution . . . . . . . . . . . . . . . . . . . . . 228
8.4.6 The General Model Solution . . . . . . . . . . . . . . . . . 230
9 Numerical Methods Systems of ODEs . . . . . . . . . . . . . . . . . . . . . . 237
9.1 Setting Up the Matrix and the Vector Functions . . . . . . . . . . . 238
9.1.1 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 241
9.2 Linear Second Order Problems as Systems . . . . . . . . . . . . . . 241
9.2.1 A Practice Problem . . . . . . . . . . . . . . . . . . . . . . . . 243
9.2.2 What If We Don’t Know the True Solution? . . . . . . 244
9.2.3 Homework: No External Input . . . . . . . . . . . . . . . . 245
9.2.4 Homework: External Inputs . . . . . . . . . . . . . . . . . . 246
9.3 Linear Systems Numerically. . . . . . . . . . . . . . . . . . . . . . . . . 246
9.3.1 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 250
9.4 An Attempt at an Automated Phase Plane Plot . . . . . . . . . . . . 252
9.5 Further Automation! . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 254
9.5.1 Eigenvalues in MatLab . . . . . . . . . . . . . . . . . . . . . 254
9.5.2 Linear System Models in MatLab Again . . . . . . . . . 256
9.5.3 Project. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 259
9.6 AutoPhasePlanePlot Again. . . . . . . . . . . . . . . . . . . . . . . . . . 261
9.6.1 Project. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 266
9.7 A Two Protein Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 267
9.7.1 Protein Values Should be Positive! . . . . . . . . . . . . . 267
9.7.2 Solving the Two Protein Model . . . . . . . . . . . . . . . 271
9.7.3 Project. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 277
9.8 A Two Box Climate Model . . . . . . . . . . . . . . . . . . . . . . . . . 278
9.8.1 Project. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 284
9.8.2 Control of Carbon Loading . . . . . . . . . . . . . . . . . . 285
Reference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 286

Part IV Interesting Models


10 Predator–Prey Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 289
10.1 Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 290
10.2 The Nullcline Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . 292
10.2.1 The x' = 0 Analysis . . . . . . . . . . . . . . . . . . . . . . 292
10.2.2 The y' = 0 Analysis . . . . . . . . . . . . . . . . . . . . . . 293
10.2.3 The Nullcline Plane. . . . . . . . . . . . . . . . . . . . . . . . 295
10.2.4 The General Results . . . . . . . . . . . . . . . . . . . . . . . 295
10.2.5 Drawing Trajectories . . . . . . . . . . . . . . . . . . . . . . . 298
10.3 Only Quadrant One Is Biologically Relevant . . . . . . . . . . . . . 299
10.3.1 Trajectories on the y⁺ Axis . . . . . . . . . . . . . . . . . 299
10.3.2 Trajectories on the x⁺ Axis . . . . . . . . . . . . . . . . . 300
10.3.3 What Does This Mean Biologically? . . . . . . . . . . . . 300
10.3.4 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 301
10.4 The Nonlinear Conservation Law . . . . . . . . . . . . . . . . . . . . . 301
10.4.1 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 302
10.4.2 The General Derivation . . . . . . . . . . . . . . . . . . . . . 304
10.4.3 Can a Trajectory Hit the y Axis Redux? . . . . . . . . . 308
10.4.4 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 308
10.5 Qualitative Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 309
10.5.1 The Predator–Prey Growth Functions . . . . . . . . . . . 309
10.5.2 General Results . . . . . . . . . . . . . . . . . . . . . . . . . . 312
10.5.3 The Nonlinear Conservation Law Using f and g . . . . 314
10.5.4 Trajectories are Bounded in x Example . . . . . . . . . . 315
10.5.5 Trajectories are Bounded in y Example . . . . . . . . . . 317
10.5.6 Trajectories are Bounded Example . . . . . . . . . . . . . 318
10.5.7 Trajectories are Bounded in x General Argument . . . 318
10.5.8 Trajectories are Bounded in y General Argument . . . 320
10.5.9 Trajectories are Bounded General Argument. . . . . . . 322
10.5.10 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 323
10.6 The Trajectory Must Be Periodic . . . . . . . . . . . . . . . . . . . . . 324
10.6.1 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 326
10.7 Plotting Trajectory Points! . . . . . . . . . . . . . . . . . . . . . . . . . . 326
10.7.1 Case 1: u = c/d . . . . . . . . . . . . . . . . . . . . . . . . . 327
10.7.2 Case 2: x1 < u < c/d . . . . . . . . . . . . . . . . . . . . . 327
10.7.3 x1 < u1 < u2 < c/d . . . . . . . . . . . . . . . . . . . . . . 328
10.7.4 Three u Points . . . . . . . . . . . . . . . . . . . . . . . . . . . 330
10.8 The Average Value of a Predator–Prey Solution . . . . . . . . . . . 332
10.8.1 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 335
10.9 A Sample Predator–Prey Model . . . . . . . . . . . . . . . . . . . . . . 336
10.9.1 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 336
10.10 Adding Fishing Rates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 338
10.10.1 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 340

10.11 Numerical Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 341


10.11.1 Estimating the Period T Numerically . . . . . . . . . . . . 343
10.11.2 Plotting Predator and Prey Versus Time. . . . . . . . . . 345
10.11.3 Plotting Using a Function . . . . . . . . . . . . . . . . . . . 346
10.11.4 Automated Phase Plane Plots . . . . . . . . . . . . . . . . . 349
10.11.5 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 349
10.12 A Sketch of the Predator–Prey Solution Process . . . . . . . . . . . 352
10.12.1 Project. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 353
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 354
11 Predator–Prey Models with Self Interaction . . . . . . . . . . . . . . . . . 355
11.1 The Nullcline Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . 355
11.1.1 The x' = 0 Analysis . . . . . . . . . . . . . . . . . . . . . . 355
11.1.2 The y' = 0 Analysis . . . . . . . . . . . . . . . . . . . . . . 356
11.1.3 The Nullcline Plane. . . . . . . . . . . . . . . . . . . . . . . . 356
11.1.4 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 358
11.2 Quadrant 1 Trajectories Stay in Quadrant 1 . . . . . . . . . . . . . . 360
11.2.1 Trajectories Starting on the y Axis . . . . . . . . . . . . . 361
11.2.2 Trajectories Starting on the x Axis . . . . . . . . . . . . . 363
11.2.3 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 363
11.3 The Combined Nullcline Analysis in Quadrant 1 . . . . . . . . . . 364
11.3.1 The Case c/d < a/e . . . . . . . . . . . . . . . . . . . . . . . 364
11.3.2 The Nullclines Touch on the X Axis . . . . . . . . . . . . 365
11.3.3 The Nullclines Cross in Quadrant 4. . . . . . . . . . . . . 365
11.3.4 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 367
11.4 Trajectories in Quadrant 1 . . . . . . . . . . . . . . . . . . . . . . . . . . 368
11.5 Quadrant 1 Intersection Trajectories . . . . . . . . . . . . . . . . . . . 369
11.5.1 Limiting Average Values in Quadrant 1. . . . . . . . . . 370
11.6 Quadrant 4 Intersection Trajectories . . . . . . . . . . . . . . . . . . . 373
11.6.1 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 374
11.7 Summary: Working Out a Predator–Prey Self-Interaction
Model in Detail . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 375
11.8 Self Interaction Numerically. . . . . . . . . . . . . . . . . . . . . . . . . 375
11.8.1 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 376
11.9 Adding Fishing! . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 378
11.10 Learned Lessons. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 379
12 Disease Models. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 381
12.1 Disease Model Nullclines . . . . . . . . . . . . . . . . . . . . . . . . . . 382
12.1.1 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 384
12.2 Only Quadrant 1 is Relevant . . . . . . . . . . . . . . . . . . . . . . . . 385
12.2.1 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 386
12.3 The I Versus S Curve . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 387
12.4 Homework. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 389

12.5 Solving the Disease Model Using Matlab . . . . . . . . . . . . . . . 390


12.5.1 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 392
12.6 Estimating Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 392
12.6.1 Approximating the dR/dt Equation . . . . . . . . . . . . . 393
12.6.2 Using dR/dt to Estimate ρ . . . . . . . . . . . . . . . . . . 398
13 A Cancer Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 401
13.1 Two Allele TSG Models . . . . . . . . . . . . . . . . . . . . . . . . . . . 403
13.2 Model Assumptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 406
13.3 Preliminary Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 407
13.4 Solving the Top Pathway Exactly . . . . . . . . . . . . . . . . . . . . . 408
13.4.1 The X0–X1 Subsystem . . . . . . . . . . . . . . . . . . . . 408
13.5 Approximation Ideas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 410
13.5.1 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 411
13.5.2 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 413
13.6 Approximation of the Top Pathway . . . . . . . . . . . . . . . . . . . 413
13.6.1 Approximating X0 . . . . . . . . . . . . . . . . . . . . . . . . . 413
13.6.2 Approximating X1 . . . . . . . . . . . . . . . . . . . . . . . . . 414
13.6.3 Approximating X2 . . . . . . . . . . . . . . . . . . . . . . . . . 415
13.7 Approximating the CIN Pathway Solutions . . . . . . . . . . . . . . 417
13.7.1 The Y0 Estimate . . . . . . . . . . . . . . . . . . . . . . . . . . 418
13.7.2 The Y1 Estimate . . . . . . . . . . . . . . . . . . . . . . . . . . 420
13.7.3 The Y2 Estimate . . . . . . . . . . . . . . . . . . . . . . . . . . 422
13.8 Error Estimates. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 423
13.9 When Is the CIN Pathway Dominant?. . . . . . . . . . . . . . . . . . 424
13.9.1 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 425
13.10 A Little Matlab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 426
13.10.1 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 430
13.11 Insight Is Difficult to Achieve . . . . . . . . . . . . . . . . . . . . . . . 430
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 431

Part V Nonlinear Systems Again


14 Nonlinear Differential Equations . . . . . . . . . . . . . . . . . . . . . . . . . . 435
14.1 Linear Approximations to Nonlinear Models . . . . . . . . . . . . . 435
14.1.1 A First Nonlinear Model . . . . . . . . . . . . . . . . . . . . 438
14.1.2 Finding Equilibrium Points Numerically . . . . . . . . . 443
14.1.3 A Second Nonlinear Model . . . . . . . . . . . . . . . . . . 453
14.1.4 The Predator–Prey Model . . . . . . . . . . . . . . . . . . . 459
14.1.5 The Predator–Prey Model with Self Interaction . . . . . 465
14.1.6 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 472

15 An Insulin Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 475


15.1 Fitting the Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 481
15.1.1 Gradient Descent Implementation . . . . . . . . . . . . . . 484
15.1.2 Adding Line Search . . . . . . . . . . . . . . . . . . . . . . . 487
Reference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 491

Part VI Series Solutions to PDE Models


16 Series Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 495
16.1 The Separation of Variables Method . . . . . . . . . . . . . . . . . . . 495
16.1.1 Determining the Separation Constant . . . . . . . . . . . . 497
16.2 Infinite Series. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 501
16.3 Independent Objects . . . . . . . . . . . . . . . . . . . . . . . . . . 503
16.3.1 Independent Functions . . . . . . . . . . . . . . . . . . . . . . 504
16.3.2 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 507
16.4 Vector Spaces and Basis . . . . . . . . . . . . . . . . . . . . . . . . . . . 507
16.4.1 Inner Products in Function Spaces . . . . . . . . . . . . . 510
16.5 Fourier Series. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 511
16.5.1 The Cable Model Infinite Series Solution . . . . . . . . . 512
Reference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 513

Part VII Summing It All Up


17 Final Thoughts. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 517
17.1 Fostering Interdisciplinary Appreciation. . . . . . . . . . . . . . . . . 517
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 522

Part VIII Advice to the Beginner


18 Background Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 525
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 528

Glossary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 531

Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 547
List of Figures

Figure 3.1 True versus Euler for x' = 0.5x(60 − x), x(0) = 20 . . . . . . . 84
Figure 4.1 The traces f(x0, y) and f(x, y0) for the surface
z = x² + y² for x0 = 0.5 and y0 = 0.5 . . . . . . . . . . . . . . . 90
Figure 4.2 The traces f(x0, y) and f(x, y0) for the surface
z = x² + y² for x0 = 0.5 and y0 = 0.5 with added
tangent lines. We have added the tangent plane
determined by the tangent lines . . . . . . . . . . . . . . . . . . . 92
Figure 4.3 The traces f(x0, y) and f(x, y0) for the surface
z = x² + y² for x0 = 0.5 and y0 = 0.5 with added
tangent lines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
Figure 4.4 The traces f(x0, y) and f(x, y0) for the surface
z = x² + y² for x0 = 0.5 and y0 = 0.5 with added
tangent lines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
Figure 4.5 The surface f(x, y) = 2x² + 4y³ . . . . . . . . . . . . . . . . . . . 124
Figure 6.1 Graphing complex numbers. . . . . . . . . . . . . . . . . . . . . . . 142
Figure 6.2 Graphing the conjugate of a complex number . . . . . . . . . . 143
Figure 6.3 The complex number 2 + 8i . . . . . . . . . . . . . . . . . . . . . 144
Figure 7.1 Linear second order problem: two distinct roots. . . . . . . . . 156
Figure 7.2 Linear second order problem: repeated roots . . . . . . . . . . . 160
Figure 7.3 Linear second order problem: complex roots . . . . . . . . . . . 165
Figure 7.4 Linear second order problem: complex roots two . . . . . . . . 167
Figure 8.1 Finding where x' < 0 and x' > 0 . . . . . . . . . . . . . . . . . . 193
Figure 8.2 Finding where y' < 0 and y' > 0 . . . . . . . . . . . . . . . . . . 193
Figure 8.3 Combining the x' and y' algebraic sign regions . . . . . . . . . 194
Figure 8.4 Drawing the nullclines and the eigenvector lines on
the same graph . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195
Figure 8.5 Trajectories in Region I . . . . . . . . . . . . . . . . . . . . . . . . . 198
Figure 8.6 Combining the x' and y' algebraic sign regions . . . . . . . . . 199
Figure 8.7 Trajectories in Region II . . . . . . . . . . . . . . . . . . . . . . . . . 201


Figure 8.8 Region III trajectories . . . . . . . . . . . . . . . . . . . . . . . . . . . 202


Figure 8.9 A magnified Region III trajectory . . . . . . . . . . . . . . . . . . 203
Figure 8.10 Region IV trajectories. . . . . . . . . . . . . . . . . . . . . . . . . . . 204
Figure 8.11 Trajectories in all regions . . . . . . . . . . . . . . . . . . . . . . . . 205
Figure 8.12 Example x' = 4x − 2y, y' = 3x − y, Page 1 . . . . . . . . . . . 209
Figure 8.13 Example x' = 4x − 2y, y' = 3x − y, Page 2 . . . . . . . . . . . 209
Figure 8.14 Example x' = 4x − 2y, y' = 3x − y, Page 3 . . . . . . . . . . . 210
Figure 8.15 A phase plane plot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 220
Figure 8.16 A scaled phase plane plot . . . . . . . . . . . . . . . . . . . . . . . . 220
Figure 8.17 The partial ellipse for the phase plane portrait . . . . . . . . . . 232
Figure 8.18 The complete ellipse for the phase plane portrait . . . . . . . . 233
Figure 8.19 The ellipse with axes for the phase plane portrait. . . . . . . . 234
Figure 8.20 The spiral out phase plane portrait . . . . . . . . . . . . . . . . . . 234
Figure 9.1 Solution to x'' + 4x' − 5x = 0, x(0) = −1, x'(0) = 1
on [1, 3] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 244
Figure 9.2 Solution to x'' + 4x' − 5x = 0, x(0) = −1, x'(0) = 1
on [1, 3] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245
Figure 9.3 Solving x' = −3x + 4y, y' = −x + 2y,
x(0) = −1, y(0) = 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . 247
Figure 9.4 Solving x' = 4x + 9y, y' = −x − 6y, x(0) = 4, y(0) = −2 . . . 251
Figure 9.5 Phase plane x' = 4x − y, y' = 8x − 5y . . . . . . . . . . . . . . . 254
Figure 9.6 Sample plot. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 258
Figure 9.7 Phase plane plot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 259
Figure 9.8 Phase plane x' = 4x + 9y, y' = −x − 6y . . . . . . . . . . . . . . 263
Figure 9.9 Two interacting proteins again. . . . . . . . . . . . . . . . . . . . . 274
Figure 9.10 Phase plane for two interacting proteins again . . . . . . . . . . 275
Figure 9.11 Phase plane for two interacting proteins on a short
time scale . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ..... 276
Figure 9.12 Phase plane for two interacting proteins on a long
time scale . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 277
Figure 9.13 Simple two box climate model: short time scale . . . . . . . . 283
Figure 9.14 Simple two box climate model: mid time scale . . . . . . . . . 284
Figure 9.15 Simple two box climate model: long time scale . . . . . . . . . 284
Figure 10.1 x' = 0 nullcline for x' = 2x + 5y . . . . . . . . . . . . . . . . . . 294
Figure 10.2 y' = 0 nullcline for y' = 6x + 3y . . . . . . . . . . . . . . . . . . 294
Figure 10.3 The combined x' = 0 nullcline for x' = 2x + 5y and
y' = 0 nullcline for y' = 6x + 3y information . . . . . . . . . . 296
Figure 10.4 Finding where x' < 0 and x' > 0 for the predator–prey
model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 296
Figure 10.5 Finding where y' < 0 and y' > 0 for the predator–prey
model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 297
Figure 10.6 Combining the x' and y' algebraic sign regions . . . . . . . . . 297
Figure 10.7 Trajectories in region I . . . . . . . . . . . . . . . . . . . . . . . . . 298

Figure 10.8 Trajectories in region II . . . . . . . . . . . . . . . . . . . . . . . . . 302


Figure 10.9 A periodic trajectory . . . . . . . . . . . . . . . . . . . . . . . . . . . 310
Figure 10.10 A spiraling out trajectory . . . . . . . . . . . . . . . . . . . . . . . . 310
Figure 10.11 A spiraling in trajectory . . . . . . . . . . . . . . . . . . . . . . . . . 311
Figure 10.12 The predator–prey f growth graph . . . . . . . . . . . . . . . . . . 312
Figure 10.13 The predator–prey g growth graph . . . . . . . . . . . . . . . . . . 313
Figure 10.14 The conservation law f(x) g(y) = 0.7 fmax gmax
implies there are two critical points x1 and x2
of interest . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 316
Figure 10.15 Predator–prey trajectories with initial conditions from
Quadrant 1 are bounded in x . . . . . . . . . . . . . . . . . ..... 316
Figure 10.16 The conservation law f(x) g(y) = 0.7 fmax gmax
implies there are two critical points y1 and y2
of interest . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 317
Figure 10.17 Predator–prey trajectories with initial conditions from
Quadrant 1 are bounded in y . . . . . . . . . . . . . . . . . ..... 318
Figure 10.18 Predator–prey trajectories with initial conditions from
Quadrant 1 are bounded in x and y . . . . . . . . . . . . ..... 319
Figure 10.19 The conservation law f(x) g(y) = 0.7 fmax gmax
implies there are two critical points x1 and x2
of interest . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 320
Figure 10.20 Predator–Prey trajectories with initial conditions from
Quadrant 1 are bounded in x . . . . . . . . . . . . . . . . . ..... 321
Figure 10.21 The conservation law f(x) g(y) = 0.7 fmax gmax
implies there are two critical points y1 and y2
of interest . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 321
Figure 10.22 Predator–prey trajectories with initial conditions from
Quadrant 1 are bounded in y . . . . . . . . . . . . . . . . . ..... 322
Figure 10.23 Predator–prey trajectories with initial conditions from
Quadrant 1 are bounded in x and y . . . . . . . . . . . . ..... 323
Figure 10.24 The f curve with the point x1 < c/d = 7/3 < x < x2
added . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 324
Figure 10.25 The g curve with the rgmax line added showing the
y values for the chosen x . . . . . . . . . . . . . . . . . . . ..... 325
Figure 10.26 The predator–prey f growth graph trajectory analysis
for x1 < u < c/d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 327
Figure 10.27 The predator–prey g growth analysis for one point
x1 < u < c/d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 328
Figure 10.28 The predator–prey f growth graph trajectory analysis
for x1 < u1 < u2 < c/d points . . . . . . . . . . . . . . . . . . . . 329
Figure 10.29 The spread of the trajectory through fixed lines on the
x axis gets smaller as we move away from the center
point c/d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 330
Figure 10.30 The trajectory must be periodic . . . . . . . . . . . . . . . ..... 331

Figure 10.31 The theoretical trajectory for
x' = 2x − 10xy, y' = −3y + 18xy. We do not know
the actual trajectory as we cannot solve for x and
y explicitly as functions of time. However, our
analysis tells us the trajectory has the qualitative
features shown . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 337
Figure 10.32 Phase plane x' = 12x − 5xy, y' = −6y + 3xy,
x(0) = 0.2, y(0) = 8.6 . . . . . . . . . . . . . . . . . . . . . . . . . 342
Figure 10.33 Predator–prey plot with final time less than the
period. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ..... 344
Figure 10.34 Predator–prey plot with final time about the period . ..... 345
Figure 10.35 y versus t for x' = 12x − 5xy, y' = −6y + 3xy,
x(0) = 0.2, y(0) = 8.6 . . . . . . . . . . . . . . . . . . . . . . . . . 346
Figure 10.36 y versus t for x' = 12x − 5xy, y' = −6y + 3xy,
x(0) = 0.2, y(0) = 8.6 . . . . . . . . . . . . . . . . . . . . . . . . . 347
Figure 10.37 Predator–prey plot with step size too large . . . . . . . . . . . . 348
Figure 10.38 Predator–prey plot with step size better! . . . . . . . . . . . . . . 348
Figure 10.39 Predator–prey plot for multiple initial conditions!. . . . . . . . 349
Figure 10.40 Predator–prey phase plot. . . . . . . . . . . . . . . . . . . . . . . . . 353
Figure 10.41 Predator–prey phase plane plot . . . . . . . . . . . . . . . . . . . . 353
Figure 11.1 Finding where x' < 0 and x' > 0 for the Predator–Prey
self interaction model . . . . . . . . . . . . . . . . . . . . . . . . . 357
Figure 11.2 Finding where y' < 0 and y' > 0 for the Predator–Prey
self interaction model . . . . . . . . . . . . . . . . . . . . . . . . . 357
Figure 11.3 x' nullcline for x' = x(4 − 5y − ex) . . . . . . . . . . . . . . . . 359
Figure 11.4 y' nullcline for y' = y(−6 + 2x − fy) . . . . . . . . . . . . . . . 359
Figure 11.5 The x' < 0 and x' > 0 signs in Quadrant 1 for the
Predator–Prey self interaction model . . . . . . . . . . . . . . . 360
Figure 11.6 The y' < 0 and y' > 0 signs in Quadrant 1 for the
Predator–Prey self interaction model . . . . . . . . . . . . . . . 360
Figure 11.7 The Quadrant 1 nullcline regions for the
Predator–Prey self interaction model
when c/d < a/e . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 365
Figure 11.8 The qualitative nullcline regions for the
Predator–Prey self interaction model
when c/d = a/e . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 366
Figure 11.9 The qualitative nullcline regions for the
Predator–Prey self interaction model
when c/d > a/e . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 366
Figure 11.10 The Quadrant 1 nullcline regions for the
Predator–Prey self interaction model
when c/d < a/e . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 367

Figure 11.11 The qualitative nullcline regions for the
Predator–Prey self interaction model
when c/d = a/e . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 368
Figure 11.12 The qualitative nullcline regions for the
Predator–Prey self interaction model
when c/d > a/e . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 368
Figure 11.13 Sample Predator–Prey model with self interaction:
crossing in Q1. . . . . . . . . . . . . . . . . . . . . . . . . . . ..... 369
Figure 11.14 Sample Predator–Prey model with self interaction:
the nullclines cross on the x axis . . . . . . . . . . . . . . ..... 369
Figure 11.15 Sample Predator–Prey model with self interaction:
the nullclines cross in Quadrant 4 . . . . . . . . . . . . . ..... 370
Figure 11.16 Predator–Prey system:
x' = 2x − 3xy − 3x², y' = −4y + 5xy − 3y² . . . . . . . . . . . 376
Figure 11.17 Predator–Prey system: x' = 2x − 3xy − 1.5x²,
y' = −4y + 5xy − 1.5y². In this example, the
nullclines cross so the trajectories move towards a
fixed point (0.23, 0.87) as shown . . . . . . . . . . . . . . . . . 377
Figure 12.1 Finding where I' < 0 and I' > 0 for the disease
model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 383
Figure 12.2 Finding where S' < 0 and S' > 0 regions for the
disease model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 383
Figure 12.3 Finding the (I', S') algebraic sign regions for the
disease model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 384
Figure 12.4 The disease model in Quadrant 1 . . . . . . . . . . . . . . . . . . 386
Figure 12.5 Solution to
S' = −5SI, I' = 5SI + 25I, S(0) = 10, I(0) = 5 . . . . . . . . . 391
Figure 13.1 Typical colon crypts. . . . . . . . . . . . . . . . . . . . . . . ..... 403
Figure 13.2 The pathways for the TSG allele losses . . . . . . . . . ..... 404
Figure 13.3 The pathways for the TSG allele losses rewritten
using selective advantage . . . . . . . . . . . . . . . . . . . ..... 405
Figure 13.4 The pathways for the TSG allele losses rewritten
using mathematical variables . . . . . . . . . . . . . . . . . ..... 405
Figure 13.5 The number of cells in A and ACIN state versus
time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ..... 428
Figure 14.1 Nonlinear phase plot for multiple initial conditions! . ..... 443
Figure 14.2 Equilibrium points graphically for the trigger
model. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 453
Figure 14.3 Trigger model linearized about Q1 . . . . . . . . . . . . . . . . . . 456
Figure 14.4 Trigger model linearized about Q2 . . . . . . . . . . . . . . . . . . 457
Figure 14.5 Trigger model linearized about Q3 . . . . . . . . . . . . . . . . . . 459
Figure 14.6 Trigger model phase plane plot . . . . . . . . . . . . . . . . . . . . 460
Figure 14.7 Predator–Prey model linearized about Q1 . . . . . . . . . . . . . 461
Figure 14.8 Predator–Prey model linearized about Q2 . . . . . . . . . . . . . 463

Figure 14.9 Predator–Prey model phase plane plot . . . . . . . . . . ..... 463


Figure 14.10 Predator–Prey model with self interaction with
nullcline intersection Q4 in Quadrant I, linearized
about Q4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ..... 467
Figure 14.11 Predator–Prey model linearized about Q3 . . . . . . . . ..... 468
Figure 14.12 Predator–Prey self interaction model with intersection
in quadrant I . . . . . . . . . . . . . . . . . . . . . . . . . . . . ..... 469
Figure 14.13 Predator–Prey model with self interaction with
nullcline intersection Q4 in Quadrant IV, linearized
about Q4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ..... 470
Figure 14.14 Predator–Prey model with self interaction with
nullcline intersection in Quadrant IV, linearized about
Q3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ..... 471
Figure 14.15 Predator–Prey self interaction model with nullcline
intersection in Quadrant IV phase plane plot . . . . . . ..... 472
Figure 15.1 The diabetes model curve fit: no line search . . . . . . ..... 486
Figure 15.2 The diabetes model curve fit with line search on
unscaled data. . . . . . . . . . . . . . . . . . . . . . . . . . . . ..... 490
Figure 15.3 The diabetes model curve fit with line search on
scaled data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ..... 491
List of Tables

Table 1.1 Typical engineering calculus sequence. . . . . . . . . . . . ..... 4


Table 10.1 The percent of the total fish catch in the Mediterranean
Sea which was considered not food fish . . . . . . . . . . ..... 289
Table 10.2 The percent of the total fish catch in the Mediterranean
Sea considered predator and considered food . . . . . . . ..... 290
Table 10.3 During World War One, fishing was drastically curtailed,
yet the predator percentage went up while the food
percentage went down . . . . . . . . . . . . . . . . . . . . . . . ..... 290
Table 10.4 The average food and predator fish caught in the
Mediterranean Sea . . . . . . . . . . . . . . . . . . . . . . . . . ..... 339
Table 13.1 The Non CIN Pathway Approximations with error
estimates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ..... 417
Table 13.2 The CIN Model Approximations with error estimates . ..... 423
Table 13.3 The Non CIN and CIN Model Approximations with
error estimates using u1 = u2 = u and uc = R u . . . . . . . . . 423
Table 13.4 The Non CIN and CIN Model Approximations:
dependence on population size N and the CIN rate for
R ≈ 39 with u1 = 10⁻⁷ and uc = R u1 . . . . . . . . . . . . . . . 423
Table 13.5 The CIN decay rates uc required for CIN dominance,
with u1 = u2 = 10⁻⁷ and uc = R u1 for R ≥ 1 . . . . . . . . . . 425

List of Code Examples

Listing 1.1 How to add paths to octave . . . . . . . . . . . . . . . . . . . . . . 5


Listing 1.2 Set paths in octave . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
Listing 2.1 Lower Triangular Solver . . . . . . . . . . . . . . . . . . . . . . . . 43
Listing 2.2 Sample Solution with LTriSol . . . . . . . . . . . . . . . . . . . . 43
Listing 2.3 Upper Triangular Solver . . . . . . . . . . . . . . . . . . . . . . . . 44
Listing 2.4 Sample Solution with UTriSol . . . . . . . . . . . . . . . . . . . . 44
Listing 2.5 Storing multipliers . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
Listing 2.6 Extracting Lower and Upper Parts of a matrix . . . . . . . . . 46
Listing 2.7 LU Decomposition of A Without Pivoting. . . . . . . . . . . . 47
Listing 2.8 Solution using LU Decomposition . . . . . . . . . . . . . . . . . 47
Listing 2.9 LU Decomposition of A With Pivoting . . . . . . . . . . . . . . 48
Listing 2.10 Solving a System with pivoting . . . . . . . . . . . . . . . . . . . 49
Listing 2.11 Eigenvalues in Matlab. . . . . . . . . . . . . . . . . . . . . . . . . . 58
Listing 2.12 Eigenvectors in Matlab . . . . . . . . . . . . . . . . . . . . . . . . . 58
Listing 2.13 Eigenvalues and Eigenvectors Example . . . . . . . . . . . . . . 59
Listing 2.14 Checking orthogonality . . . . . . . . . . . . . . . . . . . . . . . . . 59
Listing 3.1 RKstep.m: Runge–Kutta Codes . . . . . . . . . . . . . . . . . . . 82
Listing 3.2 FixedRK.m: The Runge–Kutta Solution . . . . . . . . . . . . . 83
Listing 3.3 True versus All Four Runge–Kutta
Approximations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
Listing 4.1 DrawSimpleSurface . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
Listing 4.2 Drawing a simple surface . . . . . . . . . . . . . . . . . . . . . . . 87
Listing 4.3 Drawing a full trace . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
Listing 4.4 Drawing Tangent Lines . . . . . . . . . . . . . . . . . . . . . . . . . 92
Listing 4.5 Drawing Tangent Lines . . . . . . . . . . . . . . . . . . . . . . . . . 93
Listing 4.6 DrawTangentPlanePackage . . . . . . . . . . . . . . . . . . . . . . 105
Listing 4.7 Drawing Tangent Planes . . . . . . . . . . . . . . . . . . . . . . . . 106
Listing 4.8 The surface f(x, y) = 2x² + 4y³ . . . . . . . . . . . . . . . . . . . 124
Listing 7.1 Solution to x'' − 3x' − 10x = 0; x(0) = 10,
x'(0) = 10 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
Listing 7.2 Solution to u'' + 16u' + 64u = 0; u(0) = 1, u'(0) = 8 . .... 159


Listing 7.3 Solution to u'' + 8u' + 25u = 0; u(0) = 3, u'(0) = 4 . . . . . . 165


Listing 7.4 Solution to u'' + 8u' + 20u = 0; u(0) = 3, u'(0) = 4 . . . . . . 166
Listing 8.1 Checking our arithmetic! . . . . . . . . . . . . . . . . . . . . . . . . 232
Listing 8.2 Plotting for more time. . . . . . . . . . . . . . . . . . . . . . . . . . 232
Listing 8.3 Plotting the ellipse and its axis. . . . . . . . . . . . . . . . . . . . 233
Listing 9.1 Linear Second Order Dynamics . . . . . . . . . . . . . . . . . . . 242
Listing 9.2 Solving x'' + 4x' – 5x = 0; x(0) = −1, x'(0) = 1 . . . . . . . . 243
Listing 9.3 Solving x'' + 4x' − 5x = 10 sin(5t) e^(−0.03t²),
x(0) = −1, x'(0) = 1 . . . . . . . . . . . . . . . . . . . . . . . . . . 244
Listing 9.4 Solving x' = −3x + 4y, y' = −x + 2y; x(0) = −1;
y(0) = 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ..... 247
Listing 9.5 Solving x' = −3x + 4y, y' = −x + 2y; x(0) = −1;
y(0) = 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ..... 248
Listing 9.6 Phase Plane for x' = 4x + 9y, y' = −x − 6y; x(0) = 4;
y(0) = −2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 250
Listing 9.7 AutoPhasePlanePlot.m . . . . . . . . . . . . . . . . . . . . . . . . . 252
Listing 9.8 AutoPhasePlanePlot Arguments . . . . . . . . . . . . . . . . . . . 253
Listing 9.9 Trying Some Phase Plane Plots . . . . . . . . . . . . . . . . . . . 253
Listing 9.10 Eigenvalues in Matlab: eig . . . . . . . . . . . . . . . . . . . . . . 254
Listing 9.11 Eigenvectors and Eigenvalues in Matlab: eig . . . . . . . . . . 255
Listing 9.12 Example 5 × 5 Eigenvalue and Eigenvector
Calculation in Matlab . . . . . . . . . . . . . . . . . . . . . . . . . . 256
Listing 9.13 Inner Products in Matlab: dot. . . . . . . . . . . . . . . . . . . . . 256
Listing 9.14 Sample Linear Model: x' = 4x − y, y' = 8x − 5y. . . . . . . . 257
Listing 9.15 Find the Eigenvalues and Eigenvectors in Matlab . . . . . . . 258
Listing 9.16 Session for x' = 4x + 9y, y' = −x − 6y Phase Plane
Plots . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ..... 259
Listing 9.17 AutoPhasePlanePlotLinearModel . . . . . . . . . . . . . ..... 261
Listing 9.18 Session for x' = 4x + 9y, y' = −x − 6y Enhanced
Phase Plane Plots . . . . . . . . . . . . . . . . . . . . . . . . ..... 263
Listing 9.19 Extracting A . . . . . . . . . . . . . . . . . . . . . . . . . . . ..... 264
Listing 9.20 Setting up the eigenvector and nullcline lines. . . . . ..... 264
Listing 9.21 Set up Initial conditions, find trajectories and the
bounding boxes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 265
Listing 9.22 Set up the bounding box for all the trajectories . . . . . . . . 265
Listing 9.23 Plot the trajectories at the chosen zoom level . . . . . . . . . . 266
Listing 9.24 Choose s and set the external inputs . . . . . . . . . . . . . . . . 270
Listing 9.25 Finding the A and F for a two protein model. . . . . . . . . . 270
Listing 9.26 Finding the A and F for a two protein model:
uncommented . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 271
Listing 9.27 Setting up the coefficient matrix A . . . . . . . . . . . . . . . . . 272
Listing 9.28 Finding the inverse of A . . . . . . . . . . . . . . . . . . . . . . . . 272
Listing 9.29 Find the particular solution . . . . . . . . . . . . . . . . . . . . . . 272
Listing 9.30 Find eigenvalues and eigenvectors of A . . . . . . . . . . . . . 272

Listing 9.31 Solving the IVP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 273


Listing 9.32 Constructing the solutions x and y and plotting . . . . . . . . 274
Listing 9.33 The protein phase plane plot . . . . . . . . . . . . . . . . . . . . . 274
Listing 9.34 Solving the problem . . . . . . . . . . . . . . . . . . . . . . . . . . . 275
Listing 9.35 Estimated Protein Response Times . . . . . . . . . . . . . . . . . 276
Listing 9.36 Short Time Scale Plot . . . . . . . . . . . . . . . . . . . . . . . . . . 276
Listing 9.37 Long Time Scale Plot . . . . . . . . . . . . . . . . . . . . . . . . . . 277
Listing 9.38 A Sample Two Box Climate Model Solution . . . . . . . . . . 282
Listing 10.1 Phase Plane x' = 12x – 5xy, y' = −6y + 3xy,
x (0) = 0.2, y (0) = 8.6 . . . . . . . . . . . . . . . . . . . . . . . . 341
Listing 10.2 Annotated Predator–Prey Phase Plane Code . . . . . . . . . . . 342
Listing 10.3 First Estimate of the period T . . . . . . . . . . . . . . . . . . . . 343
Listing 10.4 Second Estimate of the period T . . . . . . . . . . . . . . . . . . 344
Listing 10.5 x versus time code . . . . . . . . . . . . . . . . . . . . . . . . . . . . 345
Listing 10.6 y versus time code . . . . . . . . . . . . . . . . . . . . . . . . . . . . 346
Listing 10.7 AutoSystemFixedRK.m . . . . . . . . . . . . . . . . . . . . . . . . . 347
Listing 10.8 Using AutoSystemFixedRK . . . . . . . . . . . . . . . . . . . . . . 348
Listing 10.9 Automated Phase Plane Plot for x' = 6x – 5xy,
y' = −7y + 4xy . . . . . . . . . . . . . . . . . . . . . . . . . . ..... 349
Listing 11.1 Phase Plane for x' = 2x − 3xy − 3x²,
y' = −4y + 5xy − 3y² . . . . . . . . . . . . . . . . . . . . . . . . . 375
Listing 11.2 Phase Plane for x' = 2x − 3xy − 1.5x²,
y' = −4y + 5xy − 1.5y² . . . . . . . . . . . . . . . . . . . . . . . . 376
Listing 12.1 Solving S' = −5SI, I' = 5SI + 25I,
S(0) = 10, I(0) = 5: Epidemic! . . . . . . . . . . . . . . . . . . . 391
Listing 13.1 The cancer model dynamics. . . . . . . . . . . . . . . . . . . . . . 426
Listing 13.2 A First Cancer Model Attempt . . . . . . . . . . . . . . . . . . . . 427
Listing 13.3 Our First Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 428
Listing 13.4 Switching time units to years: step size is one half
year . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 429
Listing 13.5 Switching the step size to one year. . . . . . . . . . . . . . . . . 429
Listing 14.1 Jacobian . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 439
Listing 14.2 Jacobian at equilibrium point (0, 0) . . . . . . . . . . . . . . . . 439
Listing 14.3 Jacobian at equilibrium point (1, 0) . . . . . . . . . . . . . . . . 440
Listing 14.4 Jacobian at equilibrium point (0, 1) . . . . . . . . . . . . . . . . 441
Listing 14.5 Autonomousfunc . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 442
Listing 14.6 Sample Phase Plane Plot . . . . . . . . . . . . . . . . . . . . . . . . 442
Listing 14.7 Bisection Code. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 444
Listing 14.8 Function Definition In MatLab. . . . . . . . . . . . . . . . . . . . 445
Listing 14.9 Applying bisection . . . . . . . . . . . . . . . . . . . . . . . . . . . . 445
Listing 14.10 Bisection step by step code . . . . . . . . . . . . . . . . . . . . . . 445
Listing 14.11 Sample Bisection run step by step . . . . . . . . . . . . . . . . . 446
Listing 14.12 Should We Do A Newton Step?. . . . . . . . . . . . . . . . . . . 447
Listing 14.13 Global Newton Method. . . . . . . . . . . . . . . . . . . . . . . . . 448

Listing 14.14 Global Newton Function . . . . . . . . . . . . . . . . . . . . . . . . 449


Listing 14.15 Global Newton Function Derivative . . . . . . . . . . . . . . . . 449
Listing 14.16 Global Newton Sample . . . . . . . . . . . . . . . . . . . . . . . . . 449
Listing 14.17 Global Newton runtime results . . . . . . . . . . . . . . . . . . . . 449
Listing 14.18 Finite difference approximation to the derivative . . . . . . . 450
Listing 14.19 Secant approximation to the derivative . . . . . . . . . . . . . . 450
Listing 14.20 Finite Difference Global Newton Method . . . . . . . . . . . . 451
Listing 14.21 Sample GlobalNewton Finite Difference solution . . . . . . . 452
Listing 14.22 Global Newton Finite Difference runtime results . . . . . . . 452
Listing 14.23 Finding equilibrium points numerically . . . . . . . . . . . . . . 454
Listing 14.24 Jacobian at Q1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 455
Listing 14.25 Linearsystemep . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 455
Listing 14.26 Phase plane plot for Q1 linearization . . . . . . . . . . . . . . . 456
Listing 14.27 Jacobian at Q2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 456
Listing 14.28 Phase plane for Q2 linearization . . . . . . . . . . . . . . . . . . . 457
Listing 14.29 Jacobian at Q3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 458
Listing 14.30 Phase plane for Q3 linearization . . . . . . . . . . . . . . . . . . . 458
Listing 14.31 Trigger model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 459
Listing 14.32 Phase plane plot for trigger model . . . . . . . . . . . . . . . . . 459
Listing 14.33 Jacobian at Q1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 460
Listing 14.34 Phase plane plot for Q1 linearization . . . . . . . . . . . . . . . 461
Listing 14.35 Jacobian at Q2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 462
Listing 14.36 Phase plane for Q2 linearization . . . . . . . . . . . . . . . . . . . 462
Listing 14.37 Predator–Prey dynamics . . . . . . . . . . . . . . . . . . . . . . . . 462
Listing 14.38 AutoPhasePlanePlotRKF5NoPMultiple . . . . . . . . . . . . . . 464
Listing 14.39 A sample phase plane plot. . . . . . . . . . . . . . . . . . . . . . . 465
Listing 14.40 Jacobian at Q4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 466
Listing 14.41 Phase plane for Q4 linearization . . . . . . . . . . . . . . . . . . . 466
Listing 14.42 Jacobian at Q3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 467
Listing 14.43 Phase plane for Q3 linearization . . . . . . . . . . . . . . . . . . . 468
Listing 14.44 PredPreySelf . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 468
Listing 14.45 Actual Phase plane . . . . . . . . . . . . . . . . . . . . . . . . . . . . 468
Listing 14.46 Jacobian at Q4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 470
Listing 14.47 Phase plane plot for Q4 linearization . . . . . . . . . . . . . . . 470
Listing 14.48 Jacobian at Q3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 471
Listing 14.49 Phase plane for Q3 linearization . . . . . . . . . . . . . . . . . . . 471
Listing 14.50 PredPreySelf . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 472
Listing 14.51 The actual phase plane . . . . . . . . . . . . . . . . . . . . . . . . . 472
Listing 15.1 Nonlinear LS For Diabetes Model: Version One . . . . . . . 484
Listing 15.2 The Error Gradient Function . . . . . . . . . . . . . . . . . . . . . 485
Listing 15.3 Diabetes Error Calculation. . . . . . . . . . . . . . . . . . . . . . . 485
Listing 15.4 Run time results for gradient descent on the original
data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ..... 486
Listing 15.5 Line Search Code. . . . . . . . . . . . . . . . . . . . . . . . ..... 487
Listing 15.6 Nonlinear LS For Diabetes Model . . . . . . . . . . . . . . 488
Listing 15.7 Some Details of the Line Search . . . . . . . . . . . . . . . . . . 489
Listing 15.8 The Full Run with Line Search . . . . . . . . . . . . . . . . . . . 489
Listing 15.9 Optimal Parameter Values . . . . . . . . . . . . . . . . . . . . . . . 490
Listing 15.10 Runtime Result . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 490
Abstract

This book tries to show how mathematics, computer science, and biology can be
usefully and pleasurably intertwined. The first volume (J. Peterson, Calculus for
Cognitive Scientists: Derivatives, Integration and Modeling (Springer, Singapore,
2015 in press)) discussed the necessary one- and two-variable calculus tools as well
as first-order ODE models. In this volume, we explicitly focus on two-variable
ODE models both linear and nonlinear and learn both theoretical and computational
tools using MATLAB to help us understand their solutions. We also carefully go
over how to solve cable models using separation of variables and Fourier series.
We must always caution you to be careful to make sure the use of
mathematics gives you insight. These cautionary words about the modeling of the
physics of stars from 1938 should be taken to heart:
Technical journals are filled with elaborate papers on conditions in the interiors of model
gaseous spheres, but these discussions have, for the most part, the character of exercises in
mathematical physics rather than astronomical investigations, and it is difficult to judge the
degree of resemblance between the models and actual stars. Differential equations are like
servants in livery: it is honourable to be able to command them, but they are “yes” men,
loyally giving support and amplification to the ideas entrusted to them by their master—
Paul W. Merrill, The Nature of Variable Stars, 1938, quoted in Arthur I. Miller Empire
of the Stars: Obsession, Friendship, and Betrayal in the Quest for Black Holes, 2005.

The relevance of this quotation to our pursuits is clear. It is easy to develop
sophisticated mathematical models that abstract from biological complexity
something we can then analyze with mathematics or computational tools in an
attempt to gain insight. But as Merrill says,
it is difficult to judge the degree of resemblance between the models and actual [biology]

where we have taken the liberty to replace physics with our domain here of biology.
We should never forget the last line


Differential equations are like servants in livery: it is honourable to be able to command
them, but they are “yes” men, loyally giving support and amplification to the ideas
entrusted to them by their master.

We must always take our modeling results and go back to the scientists to make
sure they retain relevance.
History

Based On:
Notes On MTHSC 108 for Biologists developed during the
Spring 2006 and Fall 2006,
Spring 2007 and Fall 2007 courses
The first edition of this text was used in
Spring 2008 and Fall 2008,
The course was then relabeled as MTHSC 111
and the text was used in
Spring 2009 and Fall 2009 courses
The second edition of this text was used in
Spring 2010 and Fall 2010 courses
The third edition was used in the
Spring 2011 and Fall 2011,
Spring 2012 and Summer Session I courses
The fourth edition was used in
Fall 2012, Spring 2013, and Fall 2013 courses
The fifth edition was used in the Spring 2014 course
Also, we have used material from notes on
Partial Differential Equation Models
which has been taught to small numbers of students since 2008.

Part I
Introduction
Chapter 1
Introductory Remarks

In this course, we will try to introduce beginning cognitive scientists to more of the
kinds of mathematics and mathematical reasoning that will be useful to them if they
continue to live, grow and learn within this area. In our twenty first century world,
there is a tight integration between the areas of mathematics, computer science and
science. Now a traditional Calculus course for the engineering and physical sciences
consists of a four semester sequence as shown in Table 1.1.
Unfortunately, this sequence of courses is heavily slanted towards the needs of the
physical sciences. For example, many of our examples come from physics, chemistry
and so forth. As long as everyone in the class has the common language to under-
stand the examples, the examples are a wonderful tool for adding insight. However,
the typical students who are interested in cognitive science often find the language
of physics and engineering to be outside their comfort zone. Hence, the examples
lose their explanatory power. Our first course starts a different way of teaching this
material and this text will continue that journey. Our philosophy, as usual, is that all
parts of the course must be integrated, so we don’t want to use mathematics, science
or computer approaches for their own intrinsic value. Experts in these separate fields
must work hard to avoid this. This is the time to be generalists and always look for
connective approaches. Also, models are carefully chosen to illustrate the basic idea
that we know far too much detail about virtually any biologically based system we
can think of. Hence, we must learn to throw away information in the search of the
appropriate abstraction. The resulting ideas can then be phrased in terms of mathe-
matics and simulated or solved with computer based tools. However, the results are
not useful, and must be discarded and the model changed, if the predictions and illu-
minating insights we gain from the model are incorrect. We must always remember
that throwing away information allows for the possibility of mistakes. This is a hard
lesson to learn, but important. Note that models from population biology, genetics,
cognitive dysfunction, regulatory gene circuits and many others are good examples
to work with. All require massive amounts of abstraction and data pruning to get
anywhere, but the illumination payoffs are potentially quite large.


Table 1.1 Typical engineering calculus sequence

MTHSC 106: Simple limit ideas; functions and continuity; differentiation and applications; simple integration and applications.
MTHSC 108: More integration and applications; sequences and series.
MTHSC 206: Polar coordinates; space curves; double and triple integrals; vectors and functions of more than one variable; partial derivatives; 2D and 3D applications.
MTHSC 208: First order ordinary differential equations (ODEs); linear second order differential equations; complex numbers and linear independence of functions; systems of linear differential equations; matrices, eigenvalues and eigenvectors; qualitative analysis of linear systems; Laplace transform techniques.

1.1 A Roadmap to the Text

In this course, we introduce enough relevant mathematics and the beginnings of useful
computational tools so that you can begin to understand a fair bit about Biological
Modeling. We present a selection of nonlinear biological models and slowly build you
to the point where you can begin to have a feel for the model building process. We start
our model discussion with the classical Predator–Prey model in Chap. 10. We try to
talk about it as completely as possible and we use it as a vehicle to show how graphical
analysis coupled with careful mathematical reasoning can give us great insight. We
discuss completely the theory of the original Predator–Prey model in Sect. 10.1 and
its qualitative analysis in Sect. 10.5. We then introduce the use of computational tools
to solve the Predator–Prey model using MatLab in Sect. 10.11. While this model is
very successful at modeling biology, the addition of self-interaction terms is not.
The self-interaction models are analyzed in Chap. 11 and computational tools are
discussed in Sect. 11.8.
In Chap. 12, we show you a simple infectious disease model. The nullclines for this
model are developed in Sect. 12.1 and our reasoning why only trajectories that start
with positive initial conditions are biologically relevant are explained in Sect. 12.2.

The infectious versus susceptible curve is then derived in Sect. 12.3. We finish this
Chapter with a long discussion of how we use a bit of mathematical wizardry to
develop a way to estimate the value of ρ in these disease models by using data
gathered on the value of R  . This analysis in Sect. 12.6, while complicated, is well
worth your effort to peruse!
In Chap. 13, we show you a simple model of colon cancer which while linear
is made more complicated by its higher dimensionality—there are 6 variables of
interest now and graphical analysis is of less help. We try hard to show you how
we can use this model to get insight as to when point mutations or chromosomal
instability are the dominant pathway to cancer.
In Chap. 15, we go over a simple model of insulin detection using second order
models which have complex roots. We use the phase shifted form to try to detect
insulin from certain types of data.

1.2 Code

The code for much of this text is in the directory ODE in our code folder which you
can download from Biological Information Processing (http://www.ces.clemson.
edu/~petersj/CognitiveModels.html). These code samples can then be downloaded
as the zipped tar ball CognitiveCode.tar.gz and unpacked where you wish. If you
have access to MatLab, just add this folder with its sub folders to your MatLab path.
If you don’t have such access, download and install Octave on your laptop. Now
Octave is more of a command line tool, so the process of adding paths is a bit more
tedious. When we start up an Octave session, we use the following trick. We write
up our paths in a file we call MyPath.m. For us, this code looks like this

Listing 1.1: How to add paths to octave


function MyPath()
%
s1 = '/home/petersj/MatLabFiles/BioInfo/:';
s2 = '/home/petersj/MatLabFiles/BioInfo/GSO:';
s3 = '/home/petersj/MatLabFiles/BioInfo/HH:';
s4 = '/home/petersj/MatLabFiles/BioInfo/Integration:';
s5 = '/home/petersj/MatLabFiles/BioInfo/Interpolation:';
s6 = '/home/petersj/MatLabFiles/BioInfo/LinearAlgebra:';
s7 = '/home/petersj/MatLabFiles/BioInfo/Nernst:';
s8 = '/home/petersj/MatLabFiles/BioInfo/ODE:';
s9 = '/home/petersj/MatLabFiles/BioInfo/RootsOpt:';
s10 = '/home/petersj/MatLabFiles/BioInfo/Letters:';
s11 = '/home/petersj/MatLabFiles/BioInfo/Graphs:';
s12 = '/home/petersj/MatLabFiles/BioInfo/PDE:';
s13 = '/home/petersj/MatLabFiles/BioInfo/FDPDE:';
s14 = '/home/petersj/MatLabFiles/BioInfo/3DCode';
s = [s1,s2,s3,s4,s5,s6,s7,s8,s9,s12];
addpath(s);
end

The paths we want to add are setup as strings, here called s1 etc., and to use this,
we start up Octave like so. We copy MyPath.m into our working directory and then
do this

Listing 1.2: Set paths in octave


octave>> MyPath();

We agree it is not as nice as working in MatLab, but it is free! You still have
to think a bit about how to do the paths. For example, in Peterson (2015c), we
develop two different ways to handle graphs in MatLab. The first is in the direc-
tory GraphsGlobal and the second is in the directory Graphs. They are not
to be used together. So if we wanted to use the setup of Graphs and noth-
ing else, we would edit the MyPath.m file to set s = [s11]; only. If we
wanted to use the GraphsGlobal code, we would edit MyPath.m so that
s11 = ’/home/petersj/MatLabFiles/BioInfo/GraphsGlobal:’;
and then set s = [s11];. Note the directories in the MyPath.m are ours: the main
directory is ’/home/petersj/MatLabFiles/BioInfo/ and of course,
you will have to edit this file to put your directory information in there instead
of ours.
All the code will work fine with Octave. So pull up a chair, grab a cup of coffee
or tea and let’s get started.

1.3 Final Thoughts

As we said in Peterson (2015a), we want you to continue to grow in a multidisciplinary
way and so we think you will definitely need to learn more. Remember, every time
you try to figure something out in science, you will find there is a lot of stuff you
don’t know and you have to go learn new tricks. That’s ok and you shouldn’t be afraid
of it. You will probably find you need more mathematics, statistics and so forth in
your work, so don’t forget to read more as there is always something interesting
over the horizon you are not prepared for yet. But the thing is, every time you
figure something out, you get better at figuring out the next thing! Also, we have
written several more companion texts for you to consider on your journey. The next
one is Calculus for Cognitive Scientists: Partial Differential Equation Models,
Peterson (2015b) and then there is the fourth volume on bioinformation processing,
BioInformation Processing: A Primer On Computational Cognitive Science,
Peterson (2015c) which starts you on building interesting neural systems, which this
material will prepare you for. Enjoy your journeys!

References

J. Peterson, Calculus for Cognitive Scientists: Derivatives, Integration and Modeling, Springer
Series on Cognitive Science and Technology (Springer Science+Business Media Singapore Pte
Ltd, Singapore, 2015a in press)
J. Peterson, Calculus for Cognitive Scientists: Partial Differential Equation Models, Springer Series
on Cognitive Science and Technology (Springer Science+Business Media Singapore Pte Ltd,
Singapore, 2015b in press)
J. Peterson, BioInformation Processing: A Primer On Computational Cognitive Science, Springer
Series on Cognitive Science and Technology (Springer Science+Business Media Singapore Pte
Ltd, Singapore, 2015c in press)
Part II
Review
Chapter 2
Linear Algebra

We need to use both vector and matrix ideas in this course. This was covered already
in the first text (Peterson 2015), so we will assume you can review that material
before you start into this chapter. Here we will introduce some new ideas as well
as tools in MatLab we can use to solve what are called linear algebra problems; i.e.
systems of equations. Let’s begin by looking at inner products more closely.

2.1 The Inner Product of Two Column Vectors

We can also define the inner product of two vectors. If V and W are two column
vectors of size n × 1, then the product V T W is a matrix of size 1 × 1 which we
identify with a real number. We see if
$$V = \begin{bmatrix} V_1 \\ V_2 \\ V_3 \\ \vdots \\ V_n \end{bmatrix} \quad\text{and}\quad W = \begin{bmatrix} W_1 \\ W_2 \\ W_3 \\ \vdots \\ W_n \end{bmatrix}$$

then we define the 1 × 1 matrix

$$V^T W = W^T V = \langle V, W \rangle = [\,V_1 W_1 + V_2 W_2 + V_3 W_3 + \cdots + V_n W_n\,]$$

and we identify this one by one matrix with the real number

V1 W1 + V2 W2 + V3 W3 + · · · + Vn Wn

This product is so important, it is given a special name: it is the inner product of
the two vectors V and W . Let’s make this formal with Definition 2.1.1.


Definition 2.1.1 (The Inner Product Of Two Vectors)


If V and W are two column vectors of size n × 1, the inner product of these vectors
is denoted by < V, W >; it is defined as the matrix product V^T W, which equals
W^T V, and we interpret this 1 × 1 matrix product as the real number

V1 W1 + V2 W2 + V3 W3 + · · · + Vn Wn

where Vi are the components of V and Wi are the components of W .
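
In MatLab/Octave this identification is easy to check numerically. Here is a minimal sketch; the particular vectors are just made-up values for illustration:

V = [1; 2; 3];           % two sample column vectors of size 3 x 1
W = [4; 5; 6];
ip1 = V'*W               % the 1 x 1 matrix product V^T W
ip2 = W'*V               % the same number from W^T V
ip3 = sum(V.*W)          % the componentwise sum V1*W1 + V2*W2 + V3*W3
% all three return 32 = 1*4 + 2*5 + 3*6; the built-in dot(V,W) gives this as well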

2.1.1 Homework

Exercise 2.1.1 Find the dot product of the vectors V and W given by
$$V = \begin{bmatrix} 6 \\ 1 \end{bmatrix} \quad\text{and}\quad W = \begin{bmatrix} 7 \\ 2 \end{bmatrix}.$$

Exercise 2.1.2 Find the dot product of the vectors V and W given by
$$V = \begin{bmatrix} -6 \\ -8 \end{bmatrix} \quad\text{and}\quad W = \begin{bmatrix} 2 \\ 6 \end{bmatrix}.$$

Exercise 2.1.3 Find the dot product of the vectors V and W given by
$$V = \begin{bmatrix} 10 \\ -4 \end{bmatrix} \quad\text{and}\quad W = \begin{bmatrix} 2 \\ 80 \end{bmatrix}.$$

We add, subtract and scalar multiply vectors and matrices as usual. We also suggest
you review how to do matrix–vector multiplications. Multiplication of matrices is
more complex as we discussed in the volume (Peterson 2015). Let’s go through it
again a bit more abstractly. Recall the dot product of two vectors V and W is
defined to be
$$\langle V, W \rangle = \sum_{i=1}^{n} V_i W_i$$
where n is the number of components in the vectors. Using this we can define the
multiplication of the matrix A of size n × p with the matrix B of size p × m as
follows.
$$\begin{bmatrix} \text{Row 1 of } A \\ \text{Row 2 of } A \\ \vdots \\ \text{Row } n \text{ of } A \end{bmatrix}
\begin{bmatrix} \text{Column 1 of } B \;\big|\; \cdots \;\big|\; \text{Column } m \text{ of } B \end{bmatrix}$$
$$= \begin{bmatrix}
\langle \text{Row 1 of } A, \text{Column 1 of } B \rangle & \cdots & \langle \text{Row 1 of } A, \text{Column } m \text{ of } B \rangle \\
\langle \text{Row 2 of } A, \text{Column 1 of } B \rangle & \cdots & \langle \text{Row 2 of } A, \text{Column } m \text{ of } B \rangle \\
\vdots & \ddots & \vdots \\
\langle \text{Row } n \text{ of } A, \text{Column 1 of } B \rangle & \cdots & \langle \text{Row } n \text{ of } A, \text{Column } m \text{ of } B \rangle
\end{bmatrix}$$
We can write this more succinctly if we let $A_i$ denote the rows of A and $B^j$ denote the
columns of B. Note the use of subscripts for the rows and superscripts for the columns.
Then, we can rewrite the matrix multiplication algorithm more compactly as
$$\begin{bmatrix} A_1 \\ A_2 \\ \vdots \\ A_n \end{bmatrix}
\begin{bmatrix} B^1 \;\big|\; \cdots \;\big|\; B^m \end{bmatrix}
= \begin{bmatrix}
\langle A_1, B^1 \rangle & \cdots & \langle A_1, B^m \rangle \\
\langle A_2, B^1 \rangle & \cdots & \langle A_2, B^m \rangle \\
\vdots & \ddots & \vdots \\
\langle A_n, B^1 \rangle & \cdots & \langle A_n, B^m \rangle
\end{bmatrix}$$
Thus, the entry in row i and column j of the matrix product AB is
$$(AB)_{ij} = \langle A_i, B^j \rangle.$$

Comment 2.1.1 If A is a matrix of any size and 0 is the appropriate zero matrix
of the same size, then both 0 + A and A + 0 are nicely defined operations and the
result is just A.

Comment 2.1.2 Matrix multiplication is not commutative: i.e. for square matrices
A and B, the matrix product A B is not necessarily the same as the product B A.
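
Both comments are easy to see in a quick MatLab/Octave session; the matrices below are just made-up values for illustration:

A = [1 2; 3 4];          % two sample square matrices
B = [0 1; 1 0];
AB = A*B                 % entry (i,j) is the inner product of row i of A with column j of B
BA = B*A                 % reversing the order swaps the roles of rows and columns
isequal(AB,BA)           % returns 0 (false): the two products differ here
A + zeros(2,2)           % adding the zero matrix just returns A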

2.2 Interpreting the Inner Product

What could this number < V, W > possibly mean? To figure this out, we have to
do some algebra. Let’s specialize to nonzero column vectors with only 2 compo-
nents. Let
$$V = \begin{bmatrix} a \\ c \end{bmatrix} \quad\text{and}\quad W = \begin{bmatrix} b \\ d \end{bmatrix}$$

Since these vectors are not zero, only one of the terms in (a, c) and in (b, d) can be
zero because otherwise both components would be zero and we are assuming these
vectors are not the zero vector. We will use this fact in a bit. Now here < V, W > =
ab + cd. So

$$(ab + cd)^2 = a^2 b^2 + 2abcd + c^2 d^2$$
$$\| V \|^2 \, \| W \|^2 = \bigl(a^2 + c^2\bigr)\bigl(b^2 + d^2\bigr) = a^2 b^2 + a^2 d^2 + c^2 b^2 + c^2 d^2$$
Thus,
$$\| V \|^2 \, \| W \|^2 - \bigl(\langle V, W \rangle\bigr)^2 = a^2 b^2 + a^2 d^2 + c^2 b^2 + c^2 d^2 - a^2 b^2 - 2abcd - c^2 d^2 = a^2 d^2 - 2abcd + c^2 b^2 = (ad - bc)^2.$$

Now, this does look complicated, doesn’t it? But this last term is something squared
and so it must be non-negative! Hence, taking square roots, we have shown that

|< V, W >| ≤ || V || || W ||

Note, since a real number is always less than or equal to its absolute value, we can
also say

< V, W > ≤ || V || || W ||

And we can say more. If it turned out that the term (ad − bc)2 was zero, then
ad − bc = 0. There are then a few cases to look at.

1. If all the terms a, b, c and d are not zero, then we can write ad = bc implies a/c =
b/d. We know the vector V can be interpreted as the line segment starting at (0, 0)
on the line with equation y = (a/c)x. Similarly, the vector W can be interpreted
as the line segment connecting (0, 0) and (b, d) on the line y = (b/d)x. Since
a/c = b/d, these lines are the same. So both points (a, c) and (b, d) lie on the
same line. Thus, we see these vectors lie on top of each other or point directly
opposite each other in the x − y plane; i.e. the angle between these vectors is 0
or π radians (that is 0◦ or 180◦ ).
2. If a = 0, then bc must be 0 also. Since we know the vector V is not the zero
vector, we can’t have c = 0 also. Thus, b must be zero. This tells us V has
components (0, c) for some non zero c and W has components (0, d) for some
non zero d. These components also determine two lines like in the case above
which either point in the same direction or opposite one another. Hence, again,
the angle between the lines determined by these vectors is either 0 or π radians.
3. We can argue just like the case above if d = 0. We would find the angle between
the lines determined by the vectors is either 0 or π radians.

We can summarize our results as a Theorem which is called the Cauchy–Schwartz
Theorem for two dimensional vectors.

Theorem 2.2.1 (Cauchy Schwartz Theorem For Two Dimensional Vectors)


If V and W are two dimensional column vectors with components (a, c) and (b, d)
respectively, then it is always true that

< V, W >≤|< V, W >| ≤ || V || || W ||

Moreover,
|< V, W >| = || V || || W ||

if and only if the quantity ad − bc = 0. Further, this quantity is equal to 0 if and only
if the angle between the line segments determined by the vectors V and W is 0◦
or 180◦ .

Here is yet another way to look at this: assume there is a non zero value of t so
that the equation below is true.
$$V + t\, W = \begin{bmatrix} a \\ c \end{bmatrix} + t \begin{bmatrix} b \\ d \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

This implies
$$\begin{bmatrix} a \\ c \end{bmatrix} = -t \begin{bmatrix} b \\ d \end{bmatrix}$$

Since these two vectors are equal, their components must match. Thus, we must have
a = −t b
c = −t d

Thus,
$$a\, d = (-t\, b)\, \frac{c}{-t} = b\, c$$

and we are back to ad − bc = 0! Hence, another way of saying that the vectors V
and W are either 0◦ or 180◦ apart is to say that as vectors they are multiples of one
another! Such vectors are called collinear vectors to save writing. In general, we
say two n dimensional vectors are collinear if there is a nonzero constant t so that
V = t W although, of course, we can’t really figure out a way to visualize these
vectors!
Now, the scaled vectors $E = V/\| V \|$ and $F = W/\| W \|$ have magnitudes of 1. Their
components are $(a/\| V \|,\; c/\| V \|)$ and $(b/\| W \|,\; d/\| W \|)$. These points
lie on a circle of radius 1 centered at the origin. Let θ1 be the angle E makes with
the positive x-axis. Then, since the hypotenuse distance that defines the cos(θ1 ) and
sin(θ1 ) is 1, we must have

$$\cos(\theta_1) = \frac{a}{\| V \|}, \qquad \sin(\theta_1) = \frac{c}{\| V \|}$$

We can do the same thing for the angle θ2 that F makes with the positive x axis to see

$$\cos(\theta_2) = \frac{b}{\| W \|}, \qquad \sin(\theta_2) = \frac{d}{\| W \|}$$

The angle between vectors V and W is the same as between vectors E and F. Call
this angle θ. Then θ = θ1 − θ2 and using the formula for the cos of the difference
of angles
$$\cos(\theta) = \cos(\theta_1 - \theta_2) = \cos(\theta_1)\cos(\theta_2) + \sin(\theta_1)\sin(\theta_2)
= \frac{a}{\| V \|}\,\frac{b}{\| W \|} + \frac{c}{\| V \|}\,\frac{d}{\| W \|}
= \frac{ab + cd}{\| V \|\, \| W \|}
= \frac{\langle V, W \rangle}{\| V \|\, \| W \|}$$

Hence, the ratio $\langle V, W \rangle / (\| V \|\, \| W \|)$ is the same as cos(θ)! So we can use
this simple calculation to find the angle between a pair of two dimensional vectors.
The more general proof of the Cauchy Schwartz Theorem for n dimensional vec-
tors is a journey you can take in another mathematics class! We will state it though
so we can use it later if we need it.

Theorem 2.2.2 (Cauchy Schwartz Theorem For n Dimensional Vectors)


If V and W are n dimensional column vectors with components (V1 , . . . , Vn ) and
(W1 , . . . , Wn ) respectively, then it is always true that

< V, W > ≤ | < V, W > | ≤ || V || || W ||

Moreover,

|< V, W >| = || V || || W ||

if and only if the vector V is a non zero multiple of the vector W .



Theorem 2.2.2 then tells us that if the vectors V and W are not zero, then

$$-1 \;\le\; \frac{\langle V, W \rangle}{\| V \|\, \| W \|} \;\le\; 1$$

and by analogy to what works for two dimensional vectors, we can use this ratio to
define the cos of the angle between two n dimensional vectors even though we can’t
see them at all! We do this in Definition 2.2.1.
Definition 2.2.1 (The Angle Between n Dimensional Vectors)
If V and W are two non zero n dimensional column vectors with components
(V1 , . . . , Vn ) and (W1 , . . . , Wn ) respectively, the angle θ between these vectors is
defined by
$$\cos(\theta) = \frac{\langle V, W \rangle}{\| V \|\, \| W \|}$$

Moreover, the angle between the vectors is 0◦ if this ratio is 1 and is 180◦ if this
ratio is −1.

2.2.1 Examples

Example 2.2.1 Find the angle between the vectors V and W given by
$$V = \begin{bmatrix} -6 \\ 13 \end{bmatrix} \quad\text{and}\quad W = \begin{bmatrix} -8 \\ 1 \end{bmatrix}.$$

Solution Compute the inner product $\langle V, W \rangle = (-6)(-8) + (13)(1) = 61$. Next,
find the magnitudes of these vectors: $\| V \| = \sqrt{(-6)^2 + (13)^2} = \sqrt{205}$ and
$\| W \| = \sqrt{(-8)^2 + (1)^2} = \sqrt{65}$. Then, if θ is the angle between the vectors, we know
$$\cos(\theta) = \frac{\langle V, W \rangle}{\| V \|\, \| W \|} = \frac{61}{\sqrt{205}\,\sqrt{65}} = 0.5284$$
Hence, since V is in quadrant 2 and W is in quadrant 2 as well, we expect the
angle between them to be between 0◦ and 90◦. Your calculator should return
$\cos^{-1}(0.5284) = 58.10^\circ$ or 1.0141 rad. You should graph these vectors and see this
visually too.
Example 2.2.2 Find the angle between the vectors V and W given by
$$V = \begin{bmatrix} -6 \\ -13 \end{bmatrix} \quad\text{and}\quad W = \begin{bmatrix} 8 \\ 1 \end{bmatrix}.$$

Solution Compute the inner product $\langle V, W \rangle = (-6)(8) + (-13)(1) = -61$. Next,
find the magnitudes of these vectors: $\| V \| = \sqrt{(-6)^2 + (-13)^2} = \sqrt{205}$ and
$\| W \| = \sqrt{(8)^2 + (1)^2} = \sqrt{65}$. Then, if θ is the angle between the vectors, we know
$$\cos(\theta) = \frac{\langle V, W \rangle}{\| V \|\, \| W \|} = \frac{-61}{\sqrt{205}\,\sqrt{65}} = -0.5284$$
Hence, since V is in quadrant 3 and W is in quadrant 1, we expect the angle between
them to be larger than 90◦. Your calculator should return $\cos^{-1}(-0.5284) =
121.90^\circ$ or 2.1275 rad. You should graph these vectors and see this visually too.

Example 2.2.3 Find the angle between the vectors V and W given by
$$V = \begin{bmatrix} 6 \\ -13 \end{bmatrix} \quad\text{and}\quad W = \begin{bmatrix} 8 \\ 1 \end{bmatrix}.$$

Solution Compute the inner product $\langle V, W \rangle = (6)(8) + (-13)(1) = 35$. Next,
find the magnitudes of these vectors: $\| V \| = \sqrt{(6)^2 + (-13)^2} = \sqrt{205}$ and
$\| W \| = \sqrt{(8)^2 + (1)^2} = \sqrt{65}$. Then, if θ is the angle between the vectors, we know
$$\cos(\theta) = \frac{\langle V, W \rangle}{\| V \|\, \| W \|} = \frac{35}{\sqrt{205}\,\sqrt{65}} = 0.3032$$
Hence, since V is in quadrant 4 and W is in quadrant 1, we expect the angle between
them to be between 0◦ and 180◦. Your calculator should return $\cos^{-1}(0.3032) =
72.35^\circ$ or 1.2627 rad. You should graph these vectors and see this visually too.
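
These hand computations are easy to check in MatLab/Octave; here is a minimal sketch using the vectors of Example 2.2.3:

V = [6; -13];                     % the vectors from Example 2.2.3
W = [8; 1];
c = dot(V,W)/(norm(V)*norm(W));   % cos(theta) = <V,W>/(||V|| ||W||), about 0.3032
theta = acos(c)                   % about 1.2627 radians
thetadeg = theta*180/pi           % about 72.35 degrees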

2.2.2 Homework

Exercise 2.2.1 Find the angle between the vectors V and W given by
$$V = \begin{bmatrix} 5 \\ 4 \end{bmatrix} \quad\text{and}\quad W = \begin{bmatrix} 7 \\ 2 \end{bmatrix}.$$

Exercise 2.2.2 Find the angle between the vectors V and W given by
$$V = \begin{bmatrix} -6 \\ -8 \end{bmatrix} \quad\text{and}\quad W = \begin{bmatrix} 9 \\ 8 \end{bmatrix}.$$

Exercise 2.2.3 Find the angle between the vectors V and W given by
$$V = \begin{bmatrix} 10 \\ -4 \end{bmatrix} \quad\text{and}\quad W = \begin{bmatrix} 2 \\ 8 \end{bmatrix}.$$

Exercise 2.2.4 Find the angle between the vectors V and W given by
$$V = \begin{bmatrix} 6 \\ 1 \end{bmatrix} \quad\text{and}\quad W = \begin{bmatrix} 7 \\ 2 \end{bmatrix}.$$

Exercise 2.2.5 Find the angle between the vectors V and W given by
$$V = \begin{bmatrix} 3 \\ -5 \end{bmatrix} \quad\text{and}\quad W = \begin{bmatrix} 2 \\ -3 \end{bmatrix}.$$

Exercise 2.2.6 Find the angle between the vectors V and W given by
$$V = \begin{bmatrix} 1 \\ -4 \end{bmatrix} \quad\text{and}\quad W = \begin{bmatrix} -2 \\ -3 \end{bmatrix}.$$

2.3 Determinants of 2 × 2 Matrices

Since the number ad − bc is so important in all of our discussions about the relationship
between the two dimensional vectors V and W with components (a, c) and
(b, d) respectively, we will define this number to be the determinant of the matrix
A formed by using V for column 1 and W for column 2 of A. That is
$$A = \begin{bmatrix} V & W \end{bmatrix} = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$$

We then formally define the determinant of the 2 × 2 matrix A by Definition 2.3.1.

Definition 2.3.1 (The Determinant Of A 2 × 2 Matrix)


Given the 2 × 2 matrix A defined by

$$A = \begin{bmatrix} a & b \\ c & d \end{bmatrix},$$

the determinant of A is the number ad − bc. We denote the determinant by det ( A)


or | A |.

Comment 2.3.1 It is also common to denote the determinant by


 
$$\det A = \begin{vmatrix} a & b \\ c & d \end{vmatrix}.$$

Also, note that if we looked at the transpose of A, we would find


    
$$A^T = \begin{bmatrix} Y & Z \end{bmatrix} = \begin{bmatrix} a & c \\ b & d \end{bmatrix}.$$

 
Notice that det AT is (a)(d) − (b)(c) also. Hence, if det AT is zero, it means
that Y and Z are collinear. Hence, if det ( A) is zero, both the vectors determined
by the rows of A and the columns of A are collinear. Let’s summarize what we know
about this new thing called the determinant of A.
1. If | A | is 0, then the vectors determined by the columns of A are collinear.
This also means that the vectors determined by the columns are multiples of one
another. Also, the vectors determined by the columns of AT are also collinear.
2. If | A | is not 0, then the vectors determined by the columns of A are not collinear
which means these vectors point in different directions. Another way of saying
this is that these vectors are not multiples of one another. The same is true for the
columns of the transpose of A.
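
MatLab/Octave has a built-in det command, so these collinearity checks take one line each. A minimal sketch, using the vectors of Examples 2.3.3 and 2.3.4 below:

V = [4; 5];  W = [-2; 3];
det([V W])               % returns 22, nonzero, so these columns are not collinear
V = [-6; 4]; W = [3; -2];
det([V W])               % returns 0, so these columns are collinear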

2.3.1 Worked Out Problems

Example 2.3.1 Compute the determinant of
$$A = \begin{bmatrix} 16.0 & 8.0 \\ -6.0 & -5.0 \end{bmatrix}$$

Solution | A | = (16)(−5) − (8)(−6) = −32.

Example 2.3.2 Compute the determinant of
$$A = \begin{bmatrix} -2.0 & 3.0 \\ 6.0 & -9.0 \end{bmatrix}$$

Solution | A | = (−2)(−9) − (3)(6) = 0.


Example 2.3.3 Determine if the vectors V and W given by
$$V = \begin{bmatrix} 4 \\ 5 \end{bmatrix} \quad\text{and}\quad W = \begin{bmatrix} -2 \\ 3 \end{bmatrix}$$
are collinear.

Solution Form the matrix A using these vectors as the columns. This gives
$$A = \begin{bmatrix} 4 & -2 \\ 5 & 3 \end{bmatrix}$$
Then calculate | A | = (4)(3) − (−2)(5). Since this value is 22 which is not zero,
these vectors are not collinear.

Example 2.3.4 Determine if the vectors V and W given by
$$V = \begin{bmatrix} -6 \\ 4 \end{bmatrix} \quad\text{and}\quad W = \begin{bmatrix} 3 \\ -2 \end{bmatrix}$$
are collinear.

Solution Form the matrix A using these vectors as the columns. This gives
$$A = \begin{bmatrix} -6 & 3 \\ 4 & -2 \end{bmatrix}$$
Then calculate | A | = (−6)(−2) − (3)(4). Since this value is 0, these vectors are
collinear. You should graph them in the x−y plane to see this visually.

2.3.2 Homework

Exercise 2.3.1 Compute the determinant of
$$\begin{bmatrix} 2.0 & -3.0 \\ 6.0 & 5.0 \end{bmatrix}$$

Exercise 2.3.2 Compute the determinant of
$$\begin{bmatrix} 12.0 & -1.0 \\ 4.0 & 2.0 \end{bmatrix}$$

2.4 Systems of Two Linear Equations

We can use all of this material to understand a simple system of two linear equations in two
unknowns x and y. Consider the problem
2x +4 y = 7 (2.1)
3 x + 4 y = −8 (2.2)

Now consider the equation below written in terms of vectors:


  
$$x \begin{bmatrix} 2 \\ 3 \end{bmatrix} + y \begin{bmatrix} 4 \\ 4 \end{bmatrix} = \begin{bmatrix} 7 \\ -8 \end{bmatrix}$$
Using the standard ways of multiplying vectors by scalars and adding vectors, we
see the above can be rewritten as
$$\begin{bmatrix} 2x \\ 3x \end{bmatrix} + \begin{bmatrix} 4y \\ 4y \end{bmatrix} = \begin{bmatrix} 7 \\ -8 \end{bmatrix}$$
or
$$\begin{bmatrix} 2x + 4y \\ 3x + 4y \end{bmatrix} = \begin{bmatrix} 7 \\ -8 \end{bmatrix}$$

This last vector equation is clearly the same as the original Eqs. 2.1 and 2.2:

2x + 4y = 7
3x + 4y = −8

Further, in this example, letting


  
$$V = \begin{bmatrix} 2 \\ 3 \end{bmatrix}, \quad W = \begin{bmatrix} 4 \\ 4 \end{bmatrix}, \quad\text{and}\quad D = \begin{bmatrix} 7 \\ -8 \end{bmatrix}$$

we see Eqs. 2.1 and 2.2 are equivalent to the vector equation

x V + y W = D.

We can also write the system Eqs. 2.1 and 2.2 in an equivalent matrix–vector form.
Recall the original system which is written below:
2x +4 y = 7
3 x + 4 y = −8

We have already identified this system is equivalent to the vector equation

xV+yW = D

where
  
$$V = \begin{bmatrix} 2 \\ 3 \end{bmatrix}, \quad W = \begin{bmatrix} 4 \\ 4 \end{bmatrix} \quad\text{and}\quad D = \begin{bmatrix} 7 \\ -8 \end{bmatrix}$$

Now use V and W as column one and column two of the matrix A
$$A = \begin{bmatrix} V & W \end{bmatrix} = \begin{bmatrix} 2 & 4 \\ 3 & 4 \end{bmatrix}$$
Then, the original system can be written in the matrix–vector form
$$\begin{bmatrix} 2 & 4 \\ 3 & 4 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 7 \\ -8 \end{bmatrix}$$
We call the matrix A the coefficient matrix of the system given by Eqs. 2.1 and 2.2.
Now we can introduce a new type of notation. Think of the column vector
$$\begin{bmatrix} x \\ y \end{bmatrix}$$
as being a vector variable. We will use a bold font and a capital letter for this and set
$$X = \begin{bmatrix} x \\ y \end{bmatrix}$$
Then, the original system can be written as
$$A\, X = D.$$
We typically refer to the vector D as the data vector associated with the system given
by Eqs. 2.1 and 2.2.
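
MatLab/Octave works directly with this matrix–vector form, so the system of Eqs. 2.1 and 2.2 can be set up and solved in a few lines. A minimal sketch:

A = [2 4; 3 4];          % the coefficient matrix
D = [7; -8];             % the data vector
X = A\D                  % MatLab solves A X = D; here X = [-15; 9.25]
A*X                      % check: this reproduces D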

2.4.1 Worked Out Examples

Example 2.4.1 Consider the system of equations
1x + 2y = 9
−5x + 12y = −1
Find the matrix vector equation form of this system.

Solution Define V , W and D as follows:
$$V = \begin{bmatrix} 1 \\ -5 \end{bmatrix}, \quad W = \begin{bmatrix} 2 \\ 12 \end{bmatrix} \quad\text{and}\quad D = \begin{bmatrix} 9 \\ -1 \end{bmatrix}$$
Then define the matrix A using V and W as its columns:
$$A = \begin{bmatrix} 1 & 2 \\ -5 & 12 \end{bmatrix}$$
and the system is equivalent to
$$A\, X = D.$$

Example 2.4.2 Consider the system of equations
7x + 5y = 2
−3x + −4y = 1
Find the matrix vector equation form of this system.

Solution Define V , W and D as follows:
$$V = \begin{bmatrix} 7 \\ -3 \end{bmatrix}, \quad W = \begin{bmatrix} 5 \\ -4 \end{bmatrix} \quad\text{and}\quad D = \begin{bmatrix} 2 \\ 1 \end{bmatrix}$$
Then define the matrix A using V and W as its columns:
$$A = \begin{bmatrix} 7 & 5 \\ -3 & -4 \end{bmatrix}$$
and the system is equivalent to
$$A\, X = D.$$

2.4.2 Homework

Exercise 2.4.1 Consider the system of equations

1α + 2β = 3
4α + 5β = 6

Find the matrix vector equation form of this system.

Exercise 2.4.2 Consider the system of equations


−1 w + 3 z = 21
6 w + 7 z = 12

Find the matrix vector equation form of this system.

Exercise 2.4.3 Consider the system of equations


−7 u + 14 v = 8
25 u + −2 v = 8

Find the matrix vector equation form of this system.

2.5 Solving Two Linear Equations in Two Unknowns

We now know how to write the system of two linear equations in two unknowns
given by Eqs. 2.3 and 2.4

a x + b y = D1 (2.3)
c x + d y = D2 (2.4)

in an equivalent matrix–vector form. This system is equivalent to the vector equation

xV+yW = D

where
  
$$V = \begin{bmatrix} a \\ c \end{bmatrix}, \quad W = \begin{bmatrix} b \\ d \end{bmatrix} \quad\text{and}\quad D = \begin{bmatrix} D_1 \\ D_2 \end{bmatrix}$$
Finally, using V and W as column one and column two of the matrix A
$$A = \begin{bmatrix} V & W \end{bmatrix} = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$$
Then, the original system was written in vector and matrix–vector form as
$$x\, V + y\, W = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} D_1 \\ D_2 \end{bmatrix}$$

Now, we can solve this system very easily as follows. We have already discussed the
inner product of two vectors. So we could compute the inner product of both sides
of x V + y W = D with any vector U we want and get
< U, x V + y W > = < U, D >

We can simplify the left hand side to get

x < U, V > +y < U, W > = < U, D >

Since this is true for any vector U, let’s try to find useful ones! Any vector U that
satisfies < U, W > = 0 would be great as then the y would drop out and we could
solve for x. The angle between such vector U and W would then be 90◦ or 270◦ .
We will call such vectors orthogonal as the lines associated with the vectors are
perpendicular.
We can easily find such a vector. Since W defines a line through the origin with
slope d/b, from our usual algebra and pre-calculus courses, we know the line through
the origin which is perpendicular to it has negative reciprocal slope: i.e. −b/d. A
line with the slope −b/d corresponds with a vector with components (d, −b). The
usual symbol for perpendicularity is ⊥ so we will label our vector orthogonal to W
as W ⊥ . We see that

$$W^{\perp} = \begin{bmatrix} d \\ -b \end{bmatrix}$$

and as expected

< W ⊥ , W > = (d)(b) + (−b)(d) = 0

Thus, we have

$$\langle W^{\perp}, D \rangle = x \langle W^{\perp}, V \rangle + y \langle W^{\perp}, W \rangle = x \langle W^{\perp}, V \rangle$$

This looks complicated, but it can be written in terms of things we understand. Let’s
actually calculate the inner products. We find

< W ⊥ , V > = (d)(a) + (−b)(c) = det ( A)

and

$$\langle W^{\perp}, D \rangle = (d)(D_1) + (-b)(D_2) = \det \begin{bmatrix} D_1 & b \\ D_2 & d \end{bmatrix}.$$

Hence, by taking the inner product of both sides with W ⊥ , we find the y term drops
out and we have
 
$$x \, \det \begin{bmatrix} a & b \\ c & d \end{bmatrix} = \det \begin{bmatrix} D_1 & b \\ D_2 & d \end{bmatrix}$$

Thus, if det ( A) is not zero, we can solve for x to get



$$x = \frac{\det \begin{bmatrix} D_1 & b \\ D_2 & d \end{bmatrix}}{\det \begin{bmatrix} a & b \\ c & d \end{bmatrix}} = \frac{\det \begin{bmatrix} D & W \end{bmatrix}}{\det \begin{bmatrix} V & W \end{bmatrix}}$$

We can do a similar thing to find out what the variable y is by taking the inner
product of both sides of $x\, V + y\, W = D$ with the vector $V^{\perp}$ and get
$$x \langle V^{\perp}, V \rangle + y \langle V^{\perp}, W \rangle = \langle V^{\perp}, D \rangle$$
where
$$V^{\perp} = \begin{bmatrix} c \\ -a \end{bmatrix}$$

and as expected

< V ⊥ , V > = (c)(a) + (−a)(c) = 0

Going through the same steps as before, we would find that if det ( A) is non zero,
we could solve for y to get

$$y = \frac{\det \begin{bmatrix} a & D_1 \\ c & D_2 \end{bmatrix}}{\det \begin{bmatrix} a & b \\ c & d \end{bmatrix}} = \frac{\det \begin{bmatrix} V & D \end{bmatrix}}{\det \begin{bmatrix} V & W \end{bmatrix}}$$

Let’s summarize:

1. Given any system of two linear equations in two unknowns, there is a coefficient
matrix A with first column V and second column W that is associated with it.
Further, the right hand side of the system defines a data vector D.
2. If det ( A) is not zero, we can solve for the unknowns x and y as follows:

$$x = \frac{\det \begin{bmatrix} D & W \end{bmatrix}}{\det \begin{bmatrix} V & W \end{bmatrix}}, \qquad y = \frac{\det \begin{bmatrix} V & D \end{bmatrix}}{\det \begin{bmatrix} V & W \end{bmatrix}}$$

This method of solution is known as Cramer’s Rule.


3. This system of two linear equations in two unknowns is associated with two
column vectors V and W . You can see there is a unique solution if and only if
| A | is not zero. This is the same as saying there is a unique solution if and only
if the vectors are not collinear.

We can state this as Theorem 2.5.1.

Theorem 2.5.1 (Cramer’s Rule)


Consider the system of equations
a x + b y = D1
c x + d y = D2 .

Define the vectors
$$V = \begin{bmatrix} a \\ c \end{bmatrix}, \quad W = \begin{bmatrix} b \\ d \end{bmatrix}, \quad\text{and}\quad D = \begin{bmatrix} D_1 \\ D_2 \end{bmatrix}$$
Also, define the matrix A by
$$A = \begin{bmatrix} V & W \end{bmatrix}$$

Then, if det(A) ≠ 0, the unique solution to this system of equations is given by
$$x = \frac{\det \begin{bmatrix} D & W \end{bmatrix}}{\det(A)}, \qquad y = \frac{\det \begin{bmatrix} V & D \end{bmatrix}}{\det(A)}$$

2.5.1 Worked Out Examples

Example 2.5.1 Solve the system
−2x + 4y = 6
8x + −1y = 2
using Cramer’s Rule.

Solution We have
$$V = \begin{bmatrix} -2 \\ 8 \end{bmatrix}, \quad W = \begin{bmatrix} 4 \\ -1 \end{bmatrix} \quad\text{and}\quad D = \begin{bmatrix} 6 \\ 2 \end{bmatrix}$$
$$x = \frac{\det \begin{bmatrix} D & W \end{bmatrix}}{\det \begin{bmatrix} V & W \end{bmatrix}}
= \frac{\det \begin{bmatrix} 6 & 4 \\ 2 & -1 \end{bmatrix}}{\det \begin{bmatrix} -2 & 4 \\ 8 & -1 \end{bmatrix}}
= \frac{-14}{-30} = \frac{7}{15},$$
$$y = \frac{\det \begin{bmatrix} V & D \end{bmatrix}}{\det \begin{bmatrix} V & W \end{bmatrix}}
= \frac{\det \begin{bmatrix} -2 & 6 \\ 8 & 2 \end{bmatrix}}{\det \begin{bmatrix} -2 & 4 \\ 8 & -1 \end{bmatrix}}
= \frac{-52}{-30} = \frac{26}{15}$$

Example 2.5.2 Solve the system
−5x + 1y = 8
9x + −10y = 2
using Cramer’s Rule.

Solution We have
$$V = \begin{bmatrix} -5 \\ 9 \end{bmatrix}, \quad W = \begin{bmatrix} 1 \\ -10 \end{bmatrix} \quad\text{and}\quad D = \begin{bmatrix} 8 \\ 2 \end{bmatrix}$$
$$x = \frac{\det \begin{bmatrix} D & W \end{bmatrix}}{\det \begin{bmatrix} V & W \end{bmatrix}}
= \frac{\det \begin{bmatrix} 8 & 1 \\ 2 & -10 \end{bmatrix}}{\det \begin{bmatrix} -5 & 1 \\ 9 & -10 \end{bmatrix}}
= \frac{-82}{41} = -2,$$
$$y = \frac{\det \begin{bmatrix} V & D \end{bmatrix}}{\det \begin{bmatrix} V & W \end{bmatrix}}
= \frac{\det \begin{bmatrix} -5 & 8 \\ 9 & 2 \end{bmatrix}}{\det \begin{bmatrix} -5 & 1 \\ 9 & -10 \end{bmatrix}}
= \frac{-82}{41} = -2$$
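
Cramer’s Rule is easy to carry out in MatLab/Octave with the det command. A minimal sketch, using the system of Example 2.5.2:

V = [-5; 9]; W = [1; -10]; D = [8; 2];
A = [V W];
x = det([D W])/det(A)    % returns -2
y = det([V D])/det(A)    % returns -2
A\D                      % the built-in solver gives the same answer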

2.5.2 Homework

Exercise 2.5.1 Solve the system

−3 x + 4 y = 6
8 x + 9 y = −1

using Cramer’s Rule.

Exercise 2.5.2 Solve the system

2x + 3y =6
−4 x + 0 y = 8

using Cramer’s Rule.

Exercise 2.5.3 Solve the system

18 x + 1 y = 1
−9 x + 3 y = 17

using Cramer’s Rule.


Exercise 2.5.4 Solve the system

−7 x + 6 y = −4
8x + 1y =1

using Cramer’s Rule.


Exercise 2.5.5 Solve the system

−90 x + 1 y = 1
80 x + −1 y = 1

using Cramer’s Rule.

2.6 Consistent and Inconsistent Systems

So what happens if det ( A) = 0? By the remark above, we know that the vectors V
and W are collinear. We also know from our discussions in Sect. 2.3 that the columns
of AT are collinear. Hence, there is a non zero constant r so that
 
$$\begin{bmatrix} a \\ b \end{bmatrix} = r \begin{bmatrix} c \\ d \end{bmatrix}$$

Thus, a = r c and b = r d and the original system can be written as

r c x + r d y = D1
c x + d y = D2

or

r (c x + d y) = D1
c x + d y = D2

You can see we do not really have two equations in two unknowns since the top
equation on the left hand side is just a multiple of the left hand side of the bottom

equation. This can only make sense if D1 /r = D2 or D1 = r D2 . We can conclude


that the relationship between the components of D must be just right! Hence, we
have the system

c x + d y = D1 /r
c x + d y = D2

Now subtract the top equation from the bottom equation. You find

0 x + 0 y = 0 = D2 − D1 /r

This equation only makes sense if when you subtract the top from the bottom equa-
tion, the new right hand side is 0! We call such systems consistent if the right hand
side becomes 0 and inconsistent if not zero. So we have a great test for inconsistency.
We scale the top or bottom equation just right to make them identical and subtract
the two equations. If we get 0 = α for a nonzero α, the system is inconsistent.
Here is an example. Consider the system

2x +3y = 8
4x +6y = 9

Here, the column vectors of AT are


 
$$Y = \begin{bmatrix} 2 \\ 3 \end{bmatrix} \quad\text{and}\quad Z = \begin{bmatrix} 4 \\ 6 \end{bmatrix}$$

We see Z = 2 Y and we have the system

2 x + 3 y = 8 = D1
2 (2 x + 3 y) = 9 = D2

This system would be consistent if the bottom equation was exactly two times the top
equation. For this to happen, we need D2 = 2 D1 ; i.e., we need 9 = 2 × 8 which is
impossible. So these equations are inconsistent. As mentioned earlier, an even better
way to see these equations are inconsistent is to subtract two times the top equation
from the bottom equation to get
$$0x + 0y = 9 - 16 = -7$$
which again is not possible. Remember, when det(A) = 0, consistent equations
would have some multiple of the top equation minus the bottom equation equal to zero.
Another way to look at this situation is to note that the column vectors, V and W ,
of A are collinear. Hence, there is another non zero scalar s so that V = s W . We
can then rewrite the usual vector form of our system as

D=xV+yW
= x sW + y W

This says that the data vector D = (xs + y) W . Hence, if there is a solution x and y,
it will only happen if D is a multiple of W . This says D is collinear with W which
in turn is collinear with V . Going back to our sample

2x +3y = 8
4 x + 6 y = 9.

We see D with components (8, 9) is not a multiple of V with components (2, 4) or


W with components (3, 6). Thus, the system must be inconsistent.
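
We can run this check numerically as well; here is a minimal sketch for the sample system above:

V = [2; 4]; W = [3; 6]; D = [8; 9];
det([V W])               % returns 0: the columns V and W are collinear
det([W D])               % returns -21, nonzero, so D is not a multiple of W
% since D is not collinear with W (and hence not with V), the system is inconsistent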

2.6.1 Worked Out Examples

Example 2.6.1 Consider the system
4x + 5y = 11
−8x − 10y = −22
Determine if this system is consistent or inconsistent.

Solution We see immediately that the determinant of the coefficient matrix A is
zero. So the question of consistency is reasonable to ask. Here, the column vectors
of A^T are
$$Y = \begin{bmatrix} 4 \\ 5 \end{bmatrix} \quad\text{and}\quad Z = \begin{bmatrix} -8 \\ -10 \end{bmatrix}$$
We see Z = −2 Y and we have the system
4x + 5y = 11 = D1
−2 (4x + 5y) = −22 = D2
This system would be consistent if the bottom equation was exactly minus two times
the top equation. For this to happen, we need D2 = −2 D1; i.e., we need −22 =
−2 × 11 which is true. So these equations are consistent.

Example 2.6.2 Consider the system
6x + 8y = 14
18x + 24y = 48
Determine if this system is consistent or inconsistent.

Solution We see immediately that the determinant of the coefficient matrix A is zero.
So again, the question of consistency is reasonable to ask. Here, the column vectors
of A^T are
$$Y = \begin{bmatrix} 6 \\ 8 \end{bmatrix} \quad\text{and}\quad Z = \begin{bmatrix} 18 \\ 24 \end{bmatrix}$$
We see Z = 3 Y and we have the system
6x + 8y = 14 = D1
3 (6x + 8y) = 48 = D2
This system would be consistent if the bottom equation was exactly three times the
top equation. For this to happen, we need D2 = 3 D1; i.e., we need 48 = 3 × 14
which is not true. So these equations are inconsistent.

2.6.2 Homework

Exercise 2.6.1 Consider the system

2x +5y = 1
8 x + 20 y = 4

Determine if this system is consistent or inconsistent.



Exercise 2.6.2 Consider the system

60 x + 80 y = 120
6 x + 8 y = 13

Determine if this system is consistent or inconsistent.

Exercise 2.6.3 Consider the system

−2 x + 7 y = 10
20 x − 70 y = 4

Determine if this system is consistent or inconsistent.

Exercise 2.6.4 Consider the system

x+y=1
2x +2 y = 3

Determine if this system is consistent or inconsistent.

Exercise 2.6.5 Consider the system

−11 x − 3 y = −2
33 x + 9 y = 6

Determine if this system is consistent or inconsistent.

2.7 Specializing to Zero Data

If the system we want to solve has zero data, then we must solve a system of equa-
tions like

a x +b y = 0
c x + d y = 0.

Define the vectors V and W as usual. Note D is now the zero vector
  
$$V = \begin{bmatrix} a \\ c \end{bmatrix}, \quad W = \begin{bmatrix} b \\ d \end{bmatrix}, \quad\text{and}\quad D = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

Also, define the matrix A by


A= V W

Then, if det(A) ≠ 0, the unique solution to this system of equations is given by
$$x = \frac{\det \begin{bmatrix} 0 & b \\ 0 & d \end{bmatrix}}{\det(A)} = \frac{0}{ad - bc} = 0, \qquad
y = \frac{\det \begin{bmatrix} a & 0 \\ c & 0 \end{bmatrix}}{\det(A)} = \frac{0}{ad - bc} = 0$$

Hence, the unique solution to a system of the form A X = 0 is x = 0 and y = 0.


But what happens if the determinant of A is zero? In this case, we know the column
vectors of AT are collinear: i.e.
 
$$Y = \begin{bmatrix} a \\ b \end{bmatrix} \quad\text{and}\quad Z = \begin{bmatrix} c \\ d \end{bmatrix},$$
are collinear and so there is a non zero constant r so that a = r c and b = r d. This
gives the system
r (c x + d y) = 0
c x + d y = 0.

Now if you multiply the bottom equation by r and subtract from the top equation,
you get 0. This tells us the system is consistent. The original system of two equations
is thus only one equation. We can choose to use either the original top or bottom
equation to solve. Say we choose the original top equation. Then we need to find x
and y choices so that
a x +b y = 0

There are infinitely many solutions here! It is easiest to see how to solve this kind
of problem using some examples.

2.7.1 Worked Out Examples

Example 2.7.1 Find all solutions to the consistent system
−2x + 7y = 0
20x − 70y = 0

Solution First, note the determinant of the coefficient matrix of the system is zero.
Also, since the bottom equation is −10 times the top equation, we see the system is
also consistent. We solve using the top equation:
−2x + 7y = 0
Thus, 7y = 2x, so y = (2/7) x. We see a solution vector of the form
$$X = \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} x \\ (2/7)\,x \end{bmatrix} = x \begin{bmatrix} 1 \\ 2/7 \end{bmatrix}$$
will always work. There is a lot of ambiguity here as the multiplier x is completely
arbitrary. For example, if we let x = 7c for an arbitrary c, then solving for y, we
find y = 2c. We can then rewrite the solution vector as
$$c \begin{bmatrix} 7 \\ 2 \end{bmatrix}$$
in terms of the arbitrary multiplier c. It does not really matter what form we pick,
however we often try to pick a form which has integers as components.

Example 2.7.2 Find all solutions to the consistent system
4x + 5y = 0
8x + 10y = 0

Solution First, note the determinant of the coefficient matrix of the system is zero.
Also, since the bottom equation is 2 times the top equation, we see the system is also
consistent. We solve using the bottom equation this time:
8x + 10y = 0
Thus, 10y = −8x, so y = −(4/5) x. We see a solution vector of the form
$$X = \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} x \\ -(4/5)\,x \end{bmatrix} = x \begin{bmatrix} 1 \\ -4/5 \end{bmatrix}$$
will always work. Again, there is a lot of ambiguity here as the multiplier x is
completely arbitrary. For example, if we let x = 10d for an arbitrary d, then solving
for y, we find y = −8d. We can then rewrite the solution vector as
$$d \begin{bmatrix} 10 \\ -8 \end{bmatrix}$$
in terms of the arbitrary multiplier d. Again, it is important to note that it does not
really matter what form we pick, however we often try to pick a form which has
integers as components.
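
A quick numerical check of Example 2.7.1 in MatLab/Octave, using an arbitrary value for the free multiplier:

A = [-2 7; 20 -70];
det(A)                   % returns 0: there are infinitely many solutions of A X = 0
c = 3.5;                 % any value of the arbitrary multiplier works
X = c*[7; 2];
A*X                      % returns [0; 0] as expected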

2.7.2 Homework

Exercise 2.7.1 Find all solutions to the consistent system

x +3y = 0
6 x + 18 y = 0

Exercise 2.7.2 Find all solutions to the consistent system

−3 x + 4 y = 0
9 x − 12 y = 0

Exercise 2.7.3 Find all solutions to the consistent system

2x +7y = 0
1 x + (3/2) y = 0

Exercise 2.7.4 Find all solutions to the consistent system

−10 x + 5 y = 0
20 x − 10 y = 0

Exercise 2.7.5 Find all solutions to the consistent system

−12 x + 5 y = 0
4 x − (5/3) y = 0

2.8 Matrix Inverses

If a matrix A has a non zero determinant, we know the system


 
$$A \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} D_1 \\ D_2 \end{bmatrix}$$
has a unique solution for each right hand side vector
$$D = \begin{bmatrix} D_1 \\ D_2 \end{bmatrix}.$$

If we could find another matrix B which satisfied

BA= AB=I

we could multiply both sides of our system by B to find


 
$$B A \begin{bmatrix} x \\ y \end{bmatrix} = B \begin{bmatrix} D_1 \\ D_2 \end{bmatrix}$$
$$I \begin{bmatrix} x \\ y \end{bmatrix} = B \begin{bmatrix} D_1 \\ D_2 \end{bmatrix}$$
$$\begin{bmatrix} x \\ y \end{bmatrix} = B \begin{bmatrix} D_1 \\ D_2 \end{bmatrix}$$

which is the solution to our system! The matrix B is, of course, special and clearly
plays the role of an inverse for the matrix A. When such a matrix B exists, it is called
the inverse of A and is denoted by A−1 .
Definition 2.8.1 (The Inverse of the matrix A)
If there is a matrix B of the same size as the square matrix A, B is said to be the
inverse of A if

BA= AB=I

In this case, we denote the inverse of A by A−1 .


We can show that the inverse of A exists if and only if det(A) ≠ 0. In general,
it is very hard to find the inverse of a matrix, but in the case of a 2 × 2 matrix, it
is very easy.

Definition 2.8.2 (The Inverse of the 2 × 2 matrix A)


Let
$$A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$$
and assume det(A) ≠ 0. Then, the inverse of A is given by
$$A^{-1} = \frac{1}{\det(A)} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}$$

2.8.1 Worked Out Examples

Example 2.8.1 For
$$A = \begin{bmatrix} 6 & 2 \\ 3 & 4 \end{bmatrix}$$
find A^{-1}.

Solution Since det(A) = 18, we see
$$A^{-1} = \frac{1}{18} \begin{bmatrix} 4 & -2 \\ -3 & 6 \end{bmatrix}$$

Example 2.8.2 For the system
$$\begin{bmatrix} -2 & 4 \\ 3 & 5 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 8 \\ 7 \end{bmatrix}$$
find the unique solution.

Solution The coefficient matrix here is
$$A = \begin{bmatrix} -2 & 4 \\ 3 & 5 \end{bmatrix}$$
Since det(A) = −22, we see
$$A^{-1} = \frac{-1}{22} \begin{bmatrix} 5 & -4 \\ -3 & -2 \end{bmatrix}$$
and hence,
$$\begin{bmatrix} x \\ y \end{bmatrix} = A^{-1} \begin{bmatrix} 8 \\ 7 \end{bmatrix}
= \frac{-1}{22} \begin{bmatrix} 40 - 28 \\ -24 - 14 \end{bmatrix}
= \begin{bmatrix} -\tfrac{12}{22} \\[3pt] \tfrac{38}{22} \end{bmatrix}$$
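
MatLab/Octave has a built-in inv command and the backslash solver, so Example 2.8.2 can be checked in a few lines. A minimal sketch:

A = [-2 4; 3 5];
Ainv = inv(A)            % matches (1/det(A))*[5 -4; -3 -2] with det(A) = -22
X = Ainv*[8; 7]          % returns [-12/22; 38/22], about [-0.5455; 1.7273]
A\[8; 7]                 % the backslash operator gives the same answer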

2.8.2 Homework

Exercise 2.8.1 For
$$A = \begin{bmatrix} 6 & 8 \\ 3 & 4 \end{bmatrix}$$
find A^{-1} if it exists.

Exercise 2.8.2 For
$$A = \begin{bmatrix} 6 & 8 \\ 3 & 5 \end{bmatrix}$$
find A^{-1} if it exists.

Exercise 2.8.3 For
$$A = \begin{bmatrix} -3 & 2 \\ 3 & 5 \end{bmatrix}$$
find A^{-1} if it exists.

Exercise 2.8.4 For the system
$$\begin{bmatrix} 4 & 3 \\ 11 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} -5 \\ 3 \end{bmatrix}$$
find the unique solution if it exists.

Exercise 2.8.5 For the system
$$\begin{bmatrix} -1 & 2 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 10 \\ 30 \end{bmatrix}$$
find the unique solution if it exists.

Exercise 2.8.6 For the system
$$\begin{bmatrix} 40 & 30 \\ 16 & 5 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} -5 \\ 10 \end{bmatrix}$$
find the unique solution if it exists.

2.9 Computational Linear Algebra

Let’s look at how we can use MatLab/Octave to solve the general linear system of
equations
Ax =b

where A is a n × n matrix, x is a column vector with n rows whose components are


the unknowns we wish to solve for and b is the data vector.

2.9.1 A Simple Lower Triangular System

We will start by writing functions to solve a special kind of system of equations:
triangular systems. For example, the upper triangular system
$$\begin{bmatrix} 1 & -2 & 3 \\ 0 & 4 & 1 \\ 0 & 0 & 6 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 9 \\ 8 \\ 2 \end{bmatrix}$$
is easily solved by starting at the last equation and working backwards. This is
called backsolving. Here, we have
$$z = \frac{2}{6} = \frac{1}{3}, \qquad 4y = 8 - z = 8 - \frac{1}{3} = \frac{23}{3} \;\Longrightarrow\; y = \frac{23}{12}, \qquad
x = 9 + 2y - 3z = 9 + \frac{23}{6} - 1 = \frac{71}{6}$$
A lower triangular system L x = b is handled the same way, except we start at the
first equation and sweep forward instead (forward substitution).
It is easy to write code to do this in MatLab as we do below.

2.9.2 A Lower Triangular Solver

Here is a simple function to solve such a system.

Listing 2.1: Lower Triangular Solver


function x = LTriSol(L,b)
%
% L is n x n Lower Triangular Matrix
% b is nx1 data vector
% Obtain x by forward substitution
%
n = length(b);
x = zeros(n,1);
for j = 1:n-1
  x(j) = b(j)/L(j,j);
  b(j+1:n) = b(j+1:n) - x(j)*L(j+1:n,j);
end
x(n) = b(n)/L(n,n);
end

To use this function, we would enter the following commands at the Matlab prompt.
For now, we are assuming that you are running Matlab in a local directory which
contains your Matlab code LTriSol.m. So we fire up Matlab and enter these com-
mands:

Listing 2.2: Sample Solution with LTriSol


A = [2 0 0; 1 5 0; 7 9 8]
A =
   2   0   0
   1   5   0
   7   9   8
b = [6; 2; 5]
b =
   6
   2
   5
x = LTriSol(A,b)
x =
   3.0000
  -0.2000
  -1.7750

which solves the system as we wanted.

2.9.3 An Upper Triangular Solver

Here is a simple function to solve a similar system where this time A is upper
triangular. The code is essentially the same although the solution process starts at
the top and sweeps down.

Listing 2.3: Upper Triangular Solver


function x = UTriSol(U,b)
%
% U is nxn nonsingular Upper Triangular matrix
% b is nx1 data vector
% x is solved by back substitution
%
n = length(b);
x = zeros(n,1);
for j = n:-1:2
  x(j) = b(j)/U(j,j);
  b(1:j-1) = b(1:j-1) - x(j)*U(1:j-1,j);
end
x(1) = b(1)/U(1,1);
end

As usual, to use this function, we would enter the following commands at the Matlab
prompt. We are still assuming that you are running Matlab in a local directory and
that your Matlab code UTriSol.m is also in this directory.
So we enter these commands in Matlab.

Listing 2.4: Sample Solution with UTriSol


C = [7 9 8; 0 1 5; 0 0 2]
C =
   7   9   8
   0   1   5
   0   0   2
b = [6; 2; 5]
b =
   6
   2
   5
x = UTriSol(C,b)
x =
   11.5000
  -10.5000
    2.5000

which again solves the system as we wanted.

2.9.4 The LU Decomposition of A Without Pivoting

It is possible to take a general matrix A and rewrite it as the product of a lower


triangular matrix L and an upper triangular matrix U. Here is a simple function to
solve a system using the LU decomposition of A. First, it finds the LU decomposition
and then it uses the lower triangular and upper triangular solvers we wrote earlier.
To do this, we add and subtract multiples of rows together to remake the original
matrix A into an upper triangular matrix. Let’s do a simple example. Let the matrix
A be given by
$$A = \begin{bmatrix} 8 & 2 & 3 \\ -4 & 3 & 2 \\ 7 & 8 & 9 \end{bmatrix}$$

Start in the row 1 and column 1 position in A. The entry there is the pivoting element.
Divide the entries below it by the 8 and store them in the rest of column 1. This gives
the new matrix A∗
$$A^* = \begin{bmatrix} 8 & 2 & 3 \\[3pt] -\tfrac{4}{8} & 3 & 2 \\[3pt] \tfrac{7}{8} & 8 & 9 \end{bmatrix}$$

If we took the original row 1, multiplied it by the −4/8 and subtracted it from row
2, we would have the new second row
$$\begin{bmatrix} 0 & 4 & 3.5 \end{bmatrix}$$

If we took the 7/8, multiplied the original row 1 by it and subtracted it from the original
row 3, we would have
$$\begin{bmatrix} 0 & \tfrac{50}{8} & \tfrac{51}{8} \end{bmatrix}$$

With these operations done, we have the matrix A∗ taking the form
$$A^* = \begin{bmatrix} 8 & 2 & 3 \\[3pt] -\tfrac{4}{8} & 4 & 3.5 \\[3pt] \tfrac{7}{8} & \tfrac{25}{4} & \tfrac{51}{8} \end{bmatrix}$$

The multipliers in the lower part of column 1 are important to what we are doing, so
we are saving them in the parts of column 1 we have made zero. In MatLab, what
we have just done could be written like this

Listing 2.5: Storing multipliers


% this is a 3x3 matrix
n = 3;
% store multipliers in the rest of column 1
A(2:n,1) = A(2:n,1)/A(1,1);
% compute the new 2x2 block which
% removes column 1 and row 1
A(2:n,2:n) = A(2:n,2:n) - A(2:n,1)*A(1,2:n);

The code above does what we just did by hand. Now do the same thing again, but
start in the column 2 and row 2 position in the new matrix A∗ . The new pivoting
element is 4, so below it in column 2, we divide the rest of the elements of column
2 by 4 and store the results. This gives
$$A^* = \begin{bmatrix} 8 & 2 & 3 \\[3pt] -\tfrac{4}{8} & 4 & 3.5 \\[3pt] \tfrac{7}{8} & \tfrac{25}{16} & \tfrac{51}{8} \end{bmatrix}$$

We are not done. We now calculate the multiplier 25/16 times the part of this row 2 past
the pivot position and subtract it from the rest of row 3. We actually have a 0 then in
both column 1 and column 2 of row 3 now. So, the calculations give
$$\begin{bmatrix} 0 & 0 & \tfrac{29}{32} \end{bmatrix}$$
although the row we store in A^* is
$$\begin{bmatrix} \tfrac{7}{8} & \tfrac{25}{16} & \tfrac{29}{32} \end{bmatrix}$$

We are now done. We have converted A into the form

$$A^{*} = \begin{bmatrix} 8 & 2 & 3 \\ -\tfrac{4}{8} & 4 & 3.5 \\ \tfrac{7}{8} & \tfrac{25}{16} & \tfrac{29}{32} \end{bmatrix}$$

Let this final matrix be called B. We can extract the strictly lower triangular part of B
using the MatLab command tril(B,-1); adding a main diagonal of 1's to this gives the
lower triangular matrix L. The upper triangular factor U is the upper triangular part
of B, which we find using triu(B). In code (where the working copy of A has been
overwritten and plays the role of B) this is

Listing 2.6: Extracting Lower and Upper Parts of a matrix


L = eye(n,n) + tril(A,-1);
U = triu(A);

In our example, we find

$$L = \begin{bmatrix} 1 & 0 & 0 \\ -\tfrac{4}{8} & 1 & 0 \\ \tfrac{7}{8} & \tfrac{25}{16} & 1 \end{bmatrix} \qquad U = \begin{bmatrix} 8 & 2 & 3 \\ 0 & 4 & \tfrac{7}{2} \\ 0 & 0 & \tfrac{29}{32} \end{bmatrix}$$
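As a quick check of the hand computation (a small sketch, not part of the original text), we can multiply these two factors in MatLab and confirm that we recover the original matrix A.

L = [1 0 0; -1/2 1 0; 7/8 25/16 1];
U = [8 2 3; 0 4 7/2; 0 0 29/32];
L*U        % reproduces A = [8 2 3; -4 3 2; 7 8 9]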

The full code is listed below.



Listing 2.7: LU Decomposition of A Without Pivoting


function [L, U] = GE(A)
%
% A is nxn matrix
% L is nxn lower triangular
% U is nxn upper triangular
%
% We compute the LU decomposition of A using
% Gaussian Elimination
%
[n, n] = size(A);
for k=1:n-1
  % find multiplier
  A(k+1:n,k) = A(k+1:n,k)/A(k,k);
  % zero out column
  A(k+1:n,k+1:n) = A(k+1:n,k+1:n) - A(k+1:n,k)*A(k,k+1:n);
end
L = eye(n,n) + tril(A,-1);
U = triu(A);
end

Now in MatLab, to see it work, we enter these commands:


Listing 2.8: Solution using LU Decomposition
A = [17 24 1 8 15; 23 5 7 14 16; 4 6 13 20 22; ...
     10 12 19 21 3; 11 18 25 2 9]
A =
   17   24    1    8   15
   23    5    7   14   16
    4    6   13   20   22
   10   12   19   21    3
   11   18   25    2    9
[L, U] = GE(A);
L
L =
    1.0000         0         0         0         0
    1.3529    1.0000         0         0         0
    0.2353   -0.0128    1.0000         0         0
    0.5882    0.0771    1.4003    1.0000         0
    0.6471   -0.0899    1.9366    4.0578    1.0000
U
U =
   17.0000   24.0000    1.0000    8.0000   15.0000
         0  -27.4706    5.6471    3.1765   -4.2941
         0         0   12.8373   18.1585   18.4154
         0         0         0   -9.3786  -31.2802
         0         0         0         0   90.1734
b = [1; 3; 5; 7; 9]
b =
   1
   3
   5
   7
   9
y = LTriSol(L, b)
y =
    1.0000
    1.6471
    4.7859
   -0.4170
    0.9249
x = UTriSol(U, y)
x =
    0.0103
    0.0103
    0.3436
    0.0103
    0.0103
c = A*x
c =
    1.0000
    3.0000
    5.0000
    7.0000
    9.0000

which solves the system as we wanted.

2.9.5 The LU Decomposition of A with Pivoting

Here is a simple function to solve a system using the LU decomposition of A with
what is called pivoting. This means we find the largest absolute value entry in the
column we are trying to zero out and perform row interchanges to bring that entry to
the pivot position. The MatLab code changes a bit; see if you can spot what we are
doing and why! This pivoting step is needed when the pivot element in the column k,
row k position is very small: using it as a divisor would cause numerical problems
because we would be multiplying by very large numbers.

Listing 2.9: LU Decomposition of A With Pivoting


function [L, U, piv] = GePiv(A)
%
% A is nxn matrix
% L is nxn lower triangular matrix
% U is nxn upper triangular matrix
% piv is a nx1 integer vector to hold variable order
% permutations
%
[n, n] = size(A);
piv = 1:n;
for k=1:n-1
  [maxc, r] = max(abs(A(k:n,k)));
  q = r+k-1;
  piv([k q]) = piv([q k]);
  A([k q],:) = A([q k],:);
  if A(k,k) ~= 0
    A(k+1:n,k) = A(k+1:n,k)/A(k,k);
    A(k+1:n,k+1:n) = A(k+1:n,k+1:n) - A(k+1:n,k)*A(k,k+1:n);
  end
end
L = eye(n,n) + tril(A,-1);
U = triu(A);
end

We use this code to solve a system as follows:

Listing 2.10: Solving a System with pivoting


A = [17 24 1 8 15; 23 5 7 14 16; 4 6 13 20 22; ...
     10 12 19 21 3; 11 18 25 2 9]
A =
   17   24    1    8   15
   23    5    7   14   16
    4    6   13   20   22
   10   12   19   21    3
   11   18   25    2    9
b = [1; 3; 5; 7; 9]
b =
   1
   3
   5
   7
   9
[L, U, piv] = GePiv(A);
L
L =
    1.0000         0         0         0         0
    0.7391    1.0000         0         0         0
    0.4783    0.7687    1.0000         0         0
    0.1739    0.2527    0.5164    1.0000         0
    0.4348    0.4839    0.7231    0.9231    1.0000
U
U =
   23.0000    5.0000    7.0000   14.0000   16.0000
         0   20.3043   -4.1739   -2.3478    3.1739
         0         0   24.8608   -2.8908   -1.0921
         0         0         0   19.6512   18.9793
         0         0         0         0  -22.2222
piv
piv =
   2   1   5   3   4
y = LTriSol(L, b(piv));
y
y =
    3.0000
   -1.2174
    8.5011
    0.3962
   -0.2279
x = UTriSol(U, y);
x
x =
    0.0103
    0.0103
    0.3436
    0.0103
    0.0103
c = A*x
c =
    1.0000
    3.0000
    5.0000
    7.0000
    9.0000

which solves the system as we wanted.
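Since GePiv interchanges rows as it works, the factors satisfy L*U = A(piv,:) rather than L*U = A. A quick check one might run (a sketch, assuming the variables from the session above are still in the workspace) is:

% the rows of A, permuted according to piv, should match L*U up to round off
norm(A(piv,:) - L*U)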



2.10 Eigenvalues and Eigenvectors

Another important aspect of matrices is called the eigenvalues and eigenvectors of


a matrix. We will motivate this in the context of 2 × 2 matrices of real numbers,
and then note it can also be done for the general square n × n matrix. Consider the
general 2 × 2 matrix A given by

$$A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$$

Is it possible to find a non zero vector v and a number r so that

A v = r v? (2.5)

There are many ways to interpret what such a number and vector pair means, but for
the moment, we will concentrate on finding such a pair (r, v). Now, if this was true,
we could rewrite the equation as

r v− Av = 0 (2.6)

where 0 denotes the vector of all zeros



$$\mathbf{0} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}.$$

Next, recall that the two by two identity matrix I is given by



$$I = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

and it acts like multiplying by one with numbers; i.e. I v = v for any vector v. Thus,
instead of saying r v, we could say r I v. We can therefore write Eq. 2.6 as

r I v− Av = 0 (2.7)

We know that we can factor the vector v out of the left hand side and rewrite again
as Eq. 2.8.
 
$$\bigl( r I - A \bigr)\, v = 0 \qquad (2.8)$$

Now recall that we want the vector v to be non zero. Let the matrix B = r I − A. Note,
in solving the system B v = 0, there are two possibilities:

(i): the determinant of B is non zero, which implies the only solution is v = 0.
(ii): the determinant of B is zero, which implies there are infinitely many solutions
for v, all of the form a constant c times some non zero vector E.
Hence, if we want a non zero solution v, we must look for the values of r that force
det(r I − A) = 0. Thus, we want

$$0 = \det(r I - A) = \det \begin{bmatrix} r-a & -b \\ -c & r-d \end{bmatrix} = (r-a)(r-d) - bc = r^2 - (a+d)\,r + ad - bc.$$

This important quadratic equation in the variable r determines what values of r


will allow us to find non zero vectors v so that A v = r v. Note that although we
started out in our minds thinking that r would be a real number, what we have done
above shows us that it is possible that r could be complex.

Definition 2.10.1 (The Eigenvalues and Eigenvectors of a 2 by 2 Matrix)


Let A be the 2 × 2 matrix

$$A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}.$$

Then an eigenvalue r of the matrix A is a solution to the quadratic equation defined by

det (r I − A) = 0.

Any non zero vector that satisfies the equation

Av =r v

for the eigenvalue r is then called an eigenvector associated with the eigenvalue r
for the matrix A.

Comment 2.10.1 Since this is a quadratic equation, there are always two roots
which take the forms below:
(i): the roots r1 and r2 are real and distinct,
(ii): the roots are repeated r1 = r2 = c for some real number c,
(iii): the roots are complex conjugate pairs; i.e. there are real numbers α and β so
that r1 = α + β i and r2 = α − β i.

Let’s look at some examples:

Example 2.10.1 Find the eigenvalues and eigenvectors of the matrix



$$A = \begin{bmatrix} -3 & 4 \\ -1 & 2 \end{bmatrix}$$

Solution The characteristic equation is


   
$$\det\left( r \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} - \begin{bmatrix} -3 & 4 \\ -1 & 2 \end{bmatrix} \right) = 0$$

or

$$0 = \det \begin{bmatrix} r+3 & -4 \\ 1 & r-2 \end{bmatrix} = (r+3)(r-2) + 4 = r^2 + r - 2 = (r+2)(r-1)$$

Hence, the roots, or eigenvalues, of the characteristic equation are r1 = −2 and


r2 = 1. Next, we find the eigenvectors associated with these eigenvalues.
1. For eigenvalue r1 = −2, substitute the value of this eigenvalue into

$$\begin{bmatrix} r+3 & -4 \\ 1 & r-2 \end{bmatrix}$$

This gives

$$\begin{bmatrix} 1 & -4 \\ 1 & -4 \end{bmatrix}$$

The two rows of this matrix should be multiples of one another. If not, we made
a mistake and we have to go back and find it. Our rows are indeed multiples, so
pick one row to solve for the eigenvector. We need to solve

$$\begin{bmatrix} 1 & -4 \\ 1 & -4 \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

Picking the top row, we get

$$v_1 - 4 v_2 = 0 \quad\Longrightarrow\quad v_2 = \tfrac{1}{4} v_1$$

Letting v1 = A, we find the solutions have the form

$$\begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = A \begin{bmatrix} 1 \\ \tfrac{1}{4} \end{bmatrix}$$

The vector

$$\begin{bmatrix} 1 \\ \tfrac{1}{4} \end{bmatrix}$$

is our choice for an eigenvector corresponding to eigenvalue r1 = −2.


2. For eigenvalue r2 = 1, substitute the value of this eigenvalue into

$$\begin{bmatrix} r+3 & -4 \\ 1 & r-2 \end{bmatrix}$$

This gives

$$\begin{bmatrix} 4 & -4 \\ 1 & -1 \end{bmatrix}$$

Again, the two rows of this matrix should be multiples of one another. If not, we
made a mistake and we have to go back and find it. Our rows are indeed multiples,
so pick one row to solve for the eigenvector. We need to solve

$$\begin{bmatrix} 4 & -4 \\ 1 & -1 \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

Picking the bottom row, we get

$$v_1 - v_2 = 0 \quad\Longrightarrow\quad v_2 = v_1$$

Letting v1 = B, we find the solutions have the form

$$\begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = B \begin{bmatrix} 1 \\ 1 \end{bmatrix}$$

The vector

$$\begin{bmatrix} 1 \\ 1 \end{bmatrix}$$

is our choice for an eigenvector corresponding to eigenvalue r2 = 1.
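A quick numerical check of Example 2.10.1 (a small sketch, not part of the text's sessions) confirms these eigenpairs:

A  = [-3 4; -1 2];
v1 = [1; 1/4];   v2 = [1; 1];
A*v1 + 2*v1      % should be the zero vector since A*v1 = -2*v1
A*v2 - v2        % should be the zero vector since A*v2 = 1*v2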



Example 2.10.2 Find the eigenvalues and eigenvectors of the matrix



$$A = \begin{bmatrix} 4 & 9 \\ -1 & -6 \end{bmatrix}$$

Solution The characteristic equation is


   
$$\det\left( r \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} - \begin{bmatrix} 4 & 9 \\ -1 & -6 \end{bmatrix} \right) = 0$$

or

$$0 = \det \begin{bmatrix} r-4 & -9 \\ 1 & r+6 \end{bmatrix} = (r-4)(r+6) + 9 = r^2 + 2r - 15 = (r+5)(r-3)$$

Hence, the roots, or eigenvalues, of the characteristic equation are r1 = −5 and


r2 = 3. Next, we find the eigenvectors associated with these eigenvalues.
1. For eigenvalue r1 = −5, substitute the value of this eigenvalue into

r − 4 −9
1r +6

This gives

−9 −9
1 1

The two rows of this matrix should be multiples of one another. If not, we made
a mistake and we have to go back and find it. Our rows are indeed multiples, so
pick one row to solve for the eigenvector. We need to solve
  
−9 −9 v1 0
=
1 1 v2 0

Picking the bottom row, we get

v1 + v2 = 0
v2 = − v1

Letting v1 = A, we find the solutions have the form

$$\begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = A \begin{bmatrix} 1 \\ -1 \end{bmatrix}$$

The vector

$$\begin{bmatrix} 1 \\ -1 \end{bmatrix}$$

is our choice for an eigenvector corresponding to eigenvalue r1 = −5.


2. For eigenvalue r2 = 3, substitute the value of this eigenvalue into

$$\begin{bmatrix} r-4 & -9 \\ 1 & r+6 \end{bmatrix}$$

This gives

$$\begin{bmatrix} -1 & -9 \\ 1 & 9 \end{bmatrix}$$

Again, the two rows of this matrix should be multiples of one another. If not, we
made a mistake and we have to go back and find it. Our rows are indeed multiples,
so pick one row to solve for the eigenvector. We need to solve

$$\begin{bmatrix} -1 & -9 \\ 1 & 9 \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

Picking the bottom row, we get

$$v_1 + 9 v_2 = 0 \quad\Longrightarrow\quad v_2 = -\tfrac{1}{9} v_1$$

Letting v1 = B, we find the solutions have the form

$$\begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = B \begin{bmatrix} 1 \\ -\tfrac{1}{9} \end{bmatrix}$$

The vector

$$\begin{bmatrix} 1 \\ -\tfrac{1}{9} \end{bmatrix}$$

is our choice for an eigenvector corresponding to eigenvalue r2 = 3.



2.10.1 Homework

Exercise 2.10.1 Find the eigenvalues and eigenvectors of the matrix

$$A = \begin{bmatrix} 6 & 3 \\ -11 & -8 \end{bmatrix}$$

Exercise 2.10.2 Find the eigenvalues and eigenvectors of the matrix

$$A = \begin{bmatrix} 2 & 1 \\ -4 & -3 \end{bmatrix}$$

Exercise 2.10.3 Find the eigenvalues and eigenvectors of the matrix

$$A = \begin{bmatrix} -2 & -1 \\ 8 & 7 \end{bmatrix}$$

Exercise 2.10.4 Find the eigenvalues and eigenvectors of the matrix

$$A = \begin{bmatrix} -6 & -3 \\ 4 & 1 \end{bmatrix}$$

Exercise 2.10.5 Find the eigenvalues and eigenvectors of the matrix

$$A = \begin{bmatrix} -4 & -2 \\ 13 & 11 \end{bmatrix}$$

2.10.2 The General Case

For a general n × n matrix A, we have the following:

Definition 2.10.2 (The Eigenvalues and Eigenvectors of a n by n Matrix)


Let A be the n × n matrix

$$A = \begin{bmatrix} A_{11} & A_{12} & \cdots & A_{1n} \\ A_{21} & A_{22} & \cdots & A_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ A_{n1} & A_{n2} & \cdots & A_{nn} \end{bmatrix}.$$

Then an eigenvalue r of the matrix A is a solution to the polynomial defined by

det (r I − A) = 0.

Any non zero vector that satisfies the equation

Av =r v

for the eigenvalue r is then called an eigenvector associated with the eigenvalue r
for the matrix A.

Comment 2.10.2 Since this is a polynomial equation, there are always n roots: some
are distinct real numbers, some might be repeated and some might be complex
conjugate pairs (and those can be repeated also!). An example will help. Suppose we
started with a 5 × 5 matrix. Then the roots could be
1. All the roots are real and distinct; for example, 1, 2, 3, 4 and 5.
2. Two roots are the same and three roots are distinct; for example, 1, 1, 3, 4 and 5.
3. Three roots are the same and two roots are distinct; for example, 1, 1, 1, 4 and 5.
4. Four roots are the same and one root is distinct from that; for example, 1, 1, 1, 1
and 5.
5. All five roots are the same; for example, 1, 1, 1, 1 and 1.
6. Two pairs of roots are the same and one root is different from them; for example,
1, 1, 3, 3 and 5.
7. One triple root and one pair of real roots; for example, 1, 1, 1, 3 and 3.
8. One triple root and one complex conjugate pair of roots; for example, 1, 1, 1,
3 + 4i and 3 − 4i.
9. One double root, one complex conjugate pair of roots and one different real root;
for example, 1, 1, 2, 3 + 4i and 3 − 4i.
10. Two complex conjugate pairs of roots and one different real root; for example,
−2, 1 + 6i, 1 − 6i, 3 + 4i and 3 − 4i.

2.10.3 The MatLab Approach

We will now discuss certain ways to compute eigenvalues and eigenvectors for a
square matrix in MatLab. For a given A, we can compute its eigenvalues as follows:

Listing 2.11: Eigenvalues in Matlab


A = [1 2 3; 4 5 6; 7 8 -1]
A =
   1   2   3
   4   5   6
   7   8  -1
E = eig(A)
E =
   -0.3954
   11.8161
   -6.4206

So we have found the eigenvalues of this small 3 × 3 matrix. To get the eigenvectors,
we do this:

Listing 2.12: Eigenvectors in Matlab


[V, D] = eig(A)
V =
   0.7530  -0.3054  -0.2580
  -0.6525  -0.7238  -0.3770
   0.0847  -0.6187   0.8896
D =
  -0.3954        0        0
        0  11.8161        0
        0        0  -6.4206

Note the eigenvalues are not returned in ranked order. The eigenvalue/eigenvector
pairs are thus

$$\lambda_1 = -0.3954, \quad V_1 = \begin{bmatrix} 0.7530 \\ -0.6525 \\ 0.0847 \end{bmatrix}$$

$$\lambda_2 = 11.8161, \quad V_2 = \begin{bmatrix} -0.3054 \\ -0.7238 \\ -0.6187 \end{bmatrix}$$

$$\lambda_3 = -6.4206, \quad V_3 = \begin{bmatrix} -0.2580 \\ -0.3770 \\ 0.8896 \end{bmatrix}$$
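The matrices returned by eig satisfy the defining relation A V = V D, so a quick check (a small sketch using the A, V and D computed above, not part of the original session) is:

norm(A*V - V*D)   % should be on the order of machine precision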

Now let’s try a nice 5 × 5 array that is symmetric:



Listing 2.13: Eigenvalues and Eigenvectors Example


B = [1 2 3 4 5;
     2 5 6 7 9;
     3 6 1 2 3;
     4 7 2 8 9;
     5 9 3 9 6]
B =
   1   2   3   4   5
   2   5   6   7   9
   3   6   1   2   3
   4   7   2   8   9
   5   9   3   9   6
[W, Z] = eig(B)
W =
   0.8757   0.0181  -0.0389   0.4023   0.2637
  -0.4289  -0.4216  -0.0846   0.6134   0.5049
   0.1804  -0.6752   0.4567  -0.4866   0.2571
  -0.1283   0.5964   0.5736  -0.0489   0.5445
   0.0163   0.1019  -0.6736  -0.4720   0.5594
Z =
   0.1454        0        0        0        0
        0   2.4465        0        0        0
        0        0  -2.2795        0        0
        0        0        0  -5.9321        0
        0        0        0        0  26.6197

It is possible to show that the eigenvalues of a symmetric matrix will be real and
eigenvectors corresponding to distinct eigenvalues will be 90◦ apart. Such vectors
are called orthogonal and recall this means their inner product is 0. Let’s check it
out. The eigenvectors of our matrix are the columns of W above. So their dot product
should be 0!

Listing 2.14: Checking orthogonality


C = dot(W(1:5,1), W(1:5,2))
C =
   1.3336e-16

Well, the dot product is not exactly 0 because we are dealing with floating point
numbers here, but as you can see it is close to machine precision (the round-off level
of the floating point arithmetic our computer uses). Welcome to the world of computing!

Reference

J. Peterson, Calculus for Cognitive Scientists: Derivatives, Integration and Modeling, Springer
Series on Cognitive Science and Technology (Springer Science + Business Media Singapore Pte
Ltd., Singapore, 2015 in press)
Chapter 3
Numerical Methods Order One ODEs

Now that you are taking this course on More Calculus for Cognitive Scientists,
we note that in the previous course you were introduced to continuity, derivatives,
integrals and models using derivatives. You were also taught about functions of two
variables and partial derivatives along with more interesting models, and you were
introduced to how to solve models using Euler's method. We use these ideas a lot,
so there is much value in reviewing this material. So let's dive into it again. When
we try to solve systems like

$$\frac{dy}{dt} = f(t, y) \qquad (3.1)$$
$$y(t_0) = y_0 \qquad (3.2)$$

where f is continuous in the variables t and y, and y0 is some value the solution is
to have at the time point t0 , we will quickly find that it is very hard in general to do
this by hand. So it is time to begin looking at how the MatLab environment can help
us. We will use MatLab to solve these differential equations with what are called
numerical methods. First, let’s discuss how to approximate functions in general.

3.1 Taylor Polynomials

We can approximate a function at a point using polynomials of various degrees. We


can first find the constant that best approximates a function f at a point p. This is
called the zeroth order Taylor polynomial and the equation we get is

f (x) = f ( p) + E 0 (x, p)


where E_0(x, p) is the error. On the other hand, we could try to find the best straight
line that does the job. We would find

$$f(x) = f(p) + f'(p)(x - p) + E_1(x, p)$$

where E_1(x, p) is the error now. This straight line is the first order Taylor polyno-
mial but we know it also as the tangent line. We can continue finding polynomials
of higher and higher degree and their respective errors. In this class, our interests
stop with the quadratic case. We would find

$$f(x) = f(p) + f'(p)(x - p) + \tfrac{1}{2} f''(p)(x - p)^2 + E_2(x, p)$$

where E_2(x, p) is the error. This is called the second order Taylor polynomial or
quadratic approximation. Now let's dig into the theory behind this so that we can
better understand the error terms.

3.1.1 Fundamental Tools

Let's consider a function which is defined locally at the point p. This means there
is at least an interval (a, b) containing p where f is defined. Of course, this interval
could be the whole x axis! Let's also assume f' exists locally at p in this same
interval. Now pick any x in the interval [p, b) (we can also pick a point in the left
hand interval (a, p] but we will leave that discussion to you!). From Calculus I,
recall Rolle's Theorem and the Mean Value Theorem. These are usually discussed in
Calculus I, but we really prove them carefully in a course called Mathematical Analysis
(but that is another story).

Theorem 3.1.1 (Rolle's Theorem)

Let f : [a, b] → ℝ be a function defined on the interval [a, b] which is continuous
on the closed interval [a, b] and is at least differentiable on the open interval (a, b).
If f(a) = f(b), then there is at least one point c, between a and b, so that f'(c) = 0.

and

Theorem 3.1.2 (The Mean Value Theorem)

Let f : [a, b] → ℝ be a function defined on the interval [a, b] which is continuous
on the closed interval [a, b] and is at least differentiable on the open interval (a, b).
Then there is at least one point c, between a and b, so that

$$\frac{f(b) - f(a)}{b - a} = f'(c).$$

3.1.2 The Zeroth Order Taylor Polynomial

Our function f on the interval [p, x] satisfies all the requirements of the Mean Value
Theorem. So we know there is a point c_x with p < c_x < x so that

$$\frac{f(x) - f(p)}{x - p} = f'(c_x).$$

This can be written as

$$f(x) = f(p) + f'(c_x)(x - p).$$

Let the constant f(p), a polynomial of degree 0, be denoted by

$$P_0(p, x) = f(p).$$

We'll call this the 0th order Taylor Polynomial for f at the point p. Next, let the 0th
order error term be defined by

$$E_0(p, x) = f(x) - P_0(p, x) = f(x) - f(p).$$

The error or remainder term is clearly the difference or discrepancy between the
actual function value at x and the 0th order Taylor Polynomial. Since f(x) − f(p) =
f'(c_x)(x − p), we can write all we have above as

$$E_0(p, x) = f'(c_x)(x - p), \quad \text{some } c_x \text{ with } p < c_x < x.$$

We can interpret what we have done by saying f ( p) is the best choice of 0th order
polynomial or constant to approximate f (x) near p. Of course, for most functions,
this is a horrible approximation! So the next step is to find the best straight line that
approximates f near p. Let’s try our usual tangent line to f at p. We summarize this
result as a theorem.

Theorem 3.1.3 (Zeroth Order Taylor Polynomial)

Let f : [a, b] → ℝ be continuous on [a, b] and be at least differentiable on (a, b).
Then for each p in [a, b] and each x, there is at least one point c, between p and x, so that
f(x) = f(p) + f'(c)(x − p). The constant f(p) is called the zeroth order Taylor
Polynomial for f at p and we denote it by P_0(x; p). The point p is called the base
point. Note we are approximating f(x) by the constant f(p) and the error we make
is E_0(x, p) = f'(c)(x − p).

3.1.2.1 Examples

Example 3.1.1 If f(t) = t³, by the theorem above, we know on the interval [1, 3]
that at base point 1, f(t) = f(1) + f'(c)(t − 1) where c is some point between 1 and t.
Thus, t³ = 1 + (3c²)(t − 1) for some 1 < c < t. So here the zeroth order Taylor
Polynomial is P_0(t, 1) = 1 and the error is E_0(t, 1) = (3c²)(t − 1).

Example 3.1.2 If f(t) = e^{−1.2t}, by the theorem above, we know that at 0, f(t) =
f(0) + f'(c)(t − 0) where c is some point between 0 and t. Thus, e^{−1.2t} = 1 +
(−1.2)e^{−1.2c}(t − 0) for some 0 < c < t, or e^{−1.2t} = 1 − 1.2 e^{−1.2c} t. So here the zeroth
order Taylor Polynomial is P_0(t, 0) = 1 and the error is E_0(t, 0) = −1.2 e^{−1.2c} t.

Example 3.1.3 If f(t) = e^{−0.00231t}, by the theorem above, we know that at 0,
f(t) = f(0) + f'(c)(t − 0) where c is some point between 0 and t. Thus,
e^{−0.00231t} = 1 + (−0.00231)e^{−0.00231c}(t − 0) for some 0 < c < t, or e^{−0.00231t} =
1 − 0.00231 e^{−0.00231c} t. So here the zeroth order Taylor Polynomial is P_0(t, 0) = 1
and the error is E_0(t, 0) = −0.00231 e^{−0.00231c} t.

3.1.3 The First Order Taylor Polynomial

If a function f is differentiable, from Calculus I, we know we can approximate its
value at the point p by its tangent line, T. We have

$$f(x) = f(p) + f'(p)(x - p) + E_1(x, p), \qquad (3.3)$$

where the tangent line T is the function

$$T(x) = f(p) + f'(p)(x - p)$$

and the term E_1(x, p) represents the error between the true function value f(x) and
the tangent line value T(x). That is

$$E_1(x, p) = f(x) - T(x).$$

Another way to look at this is that the tangent line is the best straight line or linear
approximation to f at the point p. We all know from our first calculus course how
these pictures look. If the function f is curved near p, then the tangent line is not
a very good approximation to f at p unless x is very close to p. Now, let's assume
f is actually two times differentiable on the local interval (a, b) also. Define the
constant M by

$$M = \frac{f(x) - f(p) - f'(p)(x - p)}{(x - p)^2}.$$

In this discussion, this really is a constant value because we have fixed our value of
x and p already. We can rewrite this equation as

$$f(x) = f(p) + f'(p)(x - p) + M(x - p)^2.$$

Now let's define the function g on an interval I containing p by

$$g(t) = f(t) - f(p) - f'(p)(t - p) - M(t - p)^2.$$

Then,

$$g'(t) = f'(t) - f'(p) - 2M(t - p)$$
$$g''(t) = f''(t) - 2M.$$

Then,

$$g(x) = f(x) - f(p) - f'(p)(x - p) - M(x - p)^2$$
$$\;\;= f(x) - f(p) - f'(p)(x - p) - \bigl(f(x) - f(p) - f'(p)(x - p)\bigr) = 0$$

and

$$g(p) = f(p) - f(p) - f'(p)(p - p) - M(p - p)^2 = 0.$$

We thus know g(x) = g(p) = 0. Also, from the Mean Value Theorem, there is a
point c_x^0 between p and x so that

$$\frac{g(x) - g(p)}{x - p} = g'(c_x^0).$$

Since the numerator g(x) − g(p) is zero, we now know g'(c_x^0) = 0. But we also have

$$g'(p) = f'(p) - f'(p) - 2M(p - p) = 0.$$

Next, we can apply Rolle's Theorem to the function g'. This tells us there is a point
c_x^1 between p and c_x^0 so that g''(c_x^1) = 0. Thus,

$$0 = g''(c_x^1) = f''(c_x^1) - 2M,$$

and simplifying, we have

$$M = \frac{1}{2} f''(c_x^1).$$

Remembering what the value of M was gives us our final result

$$\frac{f(x) - f(p) - f'(p)(x - p)}{(x - p)^2} = \frac{1}{2} f''(c_x^1), \quad \text{some } c_x^1 \text{ with } p < c_x^1 < c_x^0,$$

which can be rewritten as

$$f(x) = f(p) + f'(p)(x - p) + \frac{1}{2} f''(c_x^1)(x - p)^2, \quad \text{some } c_x^1 \text{ with } p < c_x^1 < c_x^0.$$

We define the 1st order Taylor polynomial, P_1(x, p), and 1st order error, E_1(x, p), by

$$P_1(x, p) = f(p) + f'(p)(x - p)$$
$$E_1(x, p) = f(x) - P_1(x, p) = f(x) - f(p) - f'(p)(x - p) = \frac{1}{2} f''(c_x^1)(x - p)^2.$$

Thus, we have shown E_1(x, p) satisfies

$$E_1(x, p) = f''(c_x^1)\, \frac{(x - p)^2}{2} \qquad (3.4)$$

where c_x^1 is some point between x and p. Note the usual tangent line is the same as
the first order Taylor Polynomial P_1(x, p) and we have a nice representation of
our error. We can state this as our next theorem:

Theorem 3.1.4 (First Order Taylor Polynomial)

Let f : [a, b] → ℝ be continuous on [a, b] and be at least twice differentiable on
(a, b). For a given p in [a, b] and each x, there is at least one point c, between p
and x, so that f(x) = f(p) + f'(p)(x − p) + (1/2) f''(c)(x − p)². The function f(p) +
f'(p)(x − p) is called the first order Taylor Polynomial for f at p and we denote
it by P_1(x; p). The point p is again called the base point. Note we are approximating
f(x) by the linear function f(p) + f'(p)(x − p) and the error we make is E_1(x, p) =
(1/2) f''(c)(x − p)².

3.1.3.1 Example

Let's do some examples to help this sink in.

Problem One: Let's find the tangent line approximation for a simple exponential
decay function.

Example 3.1.4 For f(t) = e^{−1.2t} on the interval [0, 5], find the tangent line approx-
imation, the error and the maximum the error can be on the interval.

Solution Using base point 0, we have at any t

$$f(t) = f(0) + f'(0)(t - 0) + \tfrac{1}{2} f''(c)(t - 0)^2$$
$$\;= 1 + (-1.2)(t - 0) + \tfrac{1}{2}(-1.2)^2 e^{-1.2c}(t - 0)^2 = 1 - 1.2t + \tfrac{1}{2}(1.2)^2 e^{-1.2c}\, t^2,$$

where c is some point between 0 and t. Hence, c is between 0 and 5 also. The first
order Taylor Polynomial is P_1(t, 0) = 1 − 1.2t which is also the tangent line to
e^{−1.2t} at 0. The error is (1/2)(1.2)² e^{−1.2c} t².
Now let AE(t) denote the absolute value of the actual error at t and ME be the maximum
absolute error on the interval. The error on [0, 5] is largest when f''(c) is as big as
it can be on the interval. Here,

$$AE(t) = \tfrac{1}{2}(1.2)^2 e^{-1.2c}\, t^2 \le \tfrac{1}{2}(1.2)^2 \times 1 \times (5)^2 = \tfrac{1}{2}(1.44)(25) = 18 = ME.$$
Problem Two: Let's find the tangent line approximation for a simple exponential
decay function again, but let's do it a bit more generally.

Example 3.1.5 If f(t) = e^{−βt}, for β = 1.2 × 10⁻⁵, find the tangent line approxi-
mation, the error and the maximum error on [0, 5].

Solution At any t

$$f(t) = f(0) + f'(0)(t - 0) + \tfrac{1}{2} f''(c)(t - 0)^2$$
$$\;= 1 + (-\beta)(t - 0) + \tfrac{1}{2}\beta^2 e^{-\beta c}(t - 0)^2 = 1 - \beta t + \tfrac{1}{2}\beta^2 e^{-\beta c}\, t^2,$$

where c is some point between 0 and t, which means c is between 0 and 5. The first
order Taylor Polynomial is P_1(t, 0) = 1 − βt which is also the tangent line to e^{−βt}
at 0. The error is (1/2)β² e^{−βc} t². The error is largest on [0, 5] when f''(c) is
as big as it can be on the interval. Here,

$$AE(t) = \bigl|\tfrac{1}{2}(1.2\times 10^{-5})^2 e^{-1.2\times 10^{-5} c}\, t^2 \bigr| \le \tfrac{1}{2}(1.2\times 10^{-5})^2 (1)(5)^2 = \tfrac{1}{2}(1.44\times 10^{-10})(25) = ME.$$

3.1.4 Quadratic Approximations

We could also ask what quadratic function Q fits f best near p. Let the quadratic
function Q be defined by

$$Q(x) = f(p) + f'(p)(x - p) + f''(p)\,\frac{(x - p)^2}{2}. \qquad (3.5)$$

The new error is called E_Q(x, p) and is given by

$$E_Q(x, p) = f(x) - Q(x).$$

If f is three times differentiable, we can argue as we did in the tangent line approx-
imation (using the Mean Value Theorem and Rolle's theorem on an appropriately
defined function g) to show there is a new point c_x^2 between p and c_x^1 with

$$E_Q(x, p) = f'''(c_x^2)\,\frac{(x - p)^3}{6}. \qquad (3.6)$$

So if f looks like a quadratic locally near p, then Q and f match nicely and the
error is pretty small. On the other hand, if f is not quadratic at all near p, the error
will be large. We then define the second order Taylor polynomial, P_2(x, p), and
second order error, E_2(x, p) = E_Q(x, p), by

$$P_2(x, p) = f(p) + f'(p)(x - p) + \tfrac{1}{2} f''(p)(x - p)^2$$
$$E_2(x, p) = f(x) - P_2(x, p) = f(x) - f(p) - f'(p)(x - p) - \tfrac{1}{2} f''(p)(x - p)^2 = \tfrac{1}{6} f'''(c_x^2)(x - p)^3.$$
Theorem 3.1.5 (Second Order Taylor Polynomial)
Let f : [a, b] → ℝ be continuous on [a, b] and be at least three times differentiable
on (a, b). Given p in [a, b], for each x there is at least one point c, between p and x, so
that f(x) = f(p) + f'(p)(x − p) + (1/2) f''(p)(x − p)² + (1/6) f'''(c)(x − p)³.
The quadratic f(p) + f'(p)(x − p) + (1/2) f''(p)(x − p)² is called the second
order Taylor Polynomial for f at p and we denote it by P_2(x, p). The point p
is again called the base point. Note we are approximating f(x) by the quadratic
f(p) + f'(p)(x − p) + (1/2) f''(p)(x − p)² and the error we make is E_2(x, p) =
(1/6) f'''(c)(x − p)³.

3.1.4.1 Examples

Let’s work out some problems involving quadratic approximations.

Example 3.1.6 If f(t) = e^{−βt}, for β = 1.2 × 10⁻⁵, find the second order approxi-
mation, the error and the maximum error on [0, 5].

Solution For each t in the interval [0, 5], there is some 0 < c < t < 5 so that

$$f(t) = f(0) + f'(0)(t - 0) + \tfrac{1}{2} f''(0)(t - 0)^2 + \tfrac{1}{6} f'''(c)(t - 0)^3$$
$$\;= 1 + (-\beta)(t - 0) + \tfrac{1}{2}\beta^2 (t - 0)^2 + \tfrac{1}{6}(-\beta)^3 e^{-\beta c}(t - 0)^3 = 1 - \beta t + \tfrac{1}{2}\beta^2 t^2 - \tfrac{1}{6}\beta^3 e^{-\beta c}\, t^3.$$

The second order Taylor Polynomial is P_2(t, 0) = 1 − βt + (1/2)β²t², which is
also called the quadratic approximation to e^{−βt} at 0. The error is −(1/6)β³ e^{−βc} t³. The
error is largest on [0, 5] when f'''(c) is as big as it can be on the interval. Here,

$$AE(t) = \bigl| -\tfrac{1}{6}(1.2\times 10^{-5})^3 e^{-1.2\times 10^{-5} c}\, t^3 \bigr| \le \tfrac{1}{6}(1.2\times 10^{-5})^3 (1)(5)^3 = \tfrac{1}{6}(1.728\times 10^{-15})(125) = ME.$$

Example 3.1.7 Do this same problem on the interval [0, T].

Solution The approximations are the same, 0 < c < T and

$$AE(t) = \bigl| -\tfrac{1}{6}(1.2\times 10^{-5})^3 e^{-1.2\times 10^{-5} c}\, t^3 \bigr| \le \tfrac{1}{6}(1.2\times 10^{-5})^3\, T^3 = \tfrac{1}{6}(1.728\times 10^{-15})\, T^3 = ME.$$

We can find higher order Taylor polynomials and remainders using these arguments
as long as f has higher order derivatives. But, for our purposes, we can stop here.

3.2 Euler’s Method with Time Independence

Let's try to approximate the solution to the model x' = f(x) with x(0) = x₀. Note
the dynamics does not depend on time t. The solution x(t) can be written

$$x(t) = x(0) + x'(0)(t - 0) + x''(c_t)(t - 0)^2/2$$

where c_t is some number between 0 and t. To approximate the solution, we will
divide the interval [0, T] into pieces of length h. We call h the stepsize. If h does not
evenly divide T, we just use the last subinterval even though it may be a bit short.
Let N be the number of subintervals we get by doing this.
• Example: Divide [0, 5] using h = 0.4. Then 5/0.4 = 12.5, so we create 13
subintervals with the last one of length 0.2 instead of 0.4. So N = 13.
• Example: Divide [0, 10] using h = 0.2. Then 10/0.2 = 50 and we get N = 50.

To approximate the true solution x(h) we then have

$$x(h) = x(0) + x'(0)(h - 0) + x''(c_h)(h - 0)^2/2 = x_0 + x'(0)\,h + x''(c_h)\,h^2/2,$$

where c_h is between 0 and h. We can rewrite this more. Note x' = f(x) tells
us we can replace x'(0) by f(x(0)) = f(x₀). Also, since x' = f(x), the chain
rule tells us x'' = f'(x)\,x' = (df/dx) f where we let f'(x) = (df/dx)(x). So
x''(c_h) = f'(x(c_h)) f(x(c_h)) and we have

$$x(h) = x_0 + f(x_0)\,h + f'(x(c_h))\, f(x(c_h))\, h^2/2.$$

Let x₁ be the true solution x(h) and let x̂₀ be the starting or zeroth Euler approximate
which is defined by x̂₀ = x₀. Hence, we make no error at first. Further, let the first
Euler approximate x̂₁ be defined by x̂₁ = x₀ + f(x₀) h = x̂₀ + f(x̂₀) h, which is
the tangent line approximation to x at the point t = 0! Then we have

$$x_1 = \hat{x}_1 + f'(x(c_h))\, f(x(c_h))\, h^2/2.$$

Define the error at the first step by E₁ = |x₁ − x̂₁|. Thus,

$$E_1 = |x_1 - \hat{x}_1| = |f'(x(c_h))|\, |f(x(c_h))|\, h^2/2.$$

Now let's find some bounds. The solution x is continuous on [0, T], so x is bounded. Hence,
‖x‖_∞ = max_{0≤t≤T} |x(t)| is some finite number. Call it D. We see x(t) lives in the
interval [−D, D] which is on the x axis. Then
• f is continuous on [−D, D], so ‖f‖_∞ = max_{−D≤x≤D} |f(x)| is some finite
number.
• f' is continuous on [−D, D], so ‖f'‖_∞ = max_{−D≤x≤D} |f'(x)| is some finite
number.
The specific numbers we used for the example y' = 3y are an example of these
bounds. Using these bounds, we have

$$E_1 = |x_1 - \hat{x}_1| = |f'(x(c_h))|\, |f(x(c_h))|\, h^2/2 \le \|f\|_\infty \|f'\|_\infty\, h^2/2 = B\, h^2/2 \le C\, h^2/2,$$

where we let A = ‖f'‖_∞, B = ‖f‖_∞ ‖f'‖_∞ and C be the maximum of A and B.


Now let’s do the approximation for x(2h). We will let x2 = x(2h) and we will
define the second Euler approximate by x̂2 = x̂1 + f (x̂1 ) h. The tangent line approx-
imation to x at h gives

x(2h) = x(h) + x  (h) h + x  (c2h ) h 2 /2


= x(h) + f (x(h))h + f  (x(c2h )) f (x(c2h )) h 2 /2
= x1 + f (x1 )h + f  (x(c2h )) f (x(c2h )) h 2 /2.

Now add and subtract x̂₂ = x̂₁ + f(x̂₁) h in this equation.

$$x(2h) = x_1 + f(x_1)h + \hat{x}_2 - \hat{x}_2 + f'(x(c_{2h}))\, f(x(c_{2h}))\, h^2/2$$
$$= \bigl(x_1 + f(x_1)h - \hat{x}_1 - f(\hat{x}_1)h\bigr) + \hat{x}_2 + f'(x(c_{2h}))\, f(x(c_{2h}))\, h^2/2$$
$$= \hat{x}_2 + (x_1 - \hat{x}_1) + \bigl(f(x_1) - f(\hat{x}_1)\bigr)h + f'(x(c_{2h}))\, f(x(c_{2h}))\, h^2/2.$$

We are almost there! Next, we can apply the Mean Value Theorem to the difference
f(x₁) − f(x̂₁) and find f(x₁) − f(x̂₁) = f'(x_d)(x₁ − x̂₁) with x_d between x₁ and
x̂₁. Plugging this in, we have

$$x_2 = \hat{x}_2 + (x_1 - \hat{x}_1) + f'(x_d)(x_1 - \hat{x}_1)h + f'(x(c_{2h}))\, f(x(c_{2h}))\, h^2/2$$
$$= \hat{x}_2 + (x_1 - \hat{x}_1)\bigl(1 + f'(x_d)h\bigr) + f'(x(c_{2h}))\, f(x(c_{2h}))\, h^2/2.$$

Thus

$$x_2 - \hat{x}_2 = (x_1 - \hat{x}_1)\bigl(1 + f'(x_d)h\bigr) + f'(x(c_{2h}))\, f(x(c_{2h}))\, h^2/2.$$

Now E₂ = |x₂ − x̂₂|, so we can overestimate

$$E_2 = |x_2 - \hat{x}_2| \le |x_1 - \hat{x}_1|\bigl(1 + |f'(x_d)|h\bigr) + |f'(x(c_{2h}))|\,|f(x(c_{2h}))|\,h^2/2$$
$$\le E_1\bigl(1 + \|f'\|_\infty h\bigr) + \|f'\|_\infty \|f\|_\infty\, h^2/2 = E_1(1 + Ah) + B\,h^2/2 \le E_1(1 + Ch) + C\,h^2/2.$$

Since E₁ ≤ C h²/2, we find

$$E_2 \le (C\,h^2/2)(1 + Ch) + C\,h^2/2 = (C\,h^2/2)\bigl(1 + (1 + Ch)\bigr).$$

Let's look at the term 1 + Ch. We know using our approximations

$$e^{ut} = 1 + ut + u^2 e^{uc}\, t^2/2$$

for some c between 0 and t. But the error term is positive, so we know e^{ut} ≥ 1 + ut.
Letting u = Ch and using t = 1, we have

$$1 + (1 + Ch) \le 1 + e^{Ch},$$

and so

$$E_2 \le (C\,h^2/2)\bigl(1 + e^{Ch}\bigr).$$

Now let's do the approximation for x(3h). We will let x₃ = x(3h) and we will
define the third Euler approximate by x̂₃ = x̂₂ + f(x̂₂) h. The tangent line approxi-
mation to x at 2h gives

$$x(3h) = x(2h) + x'(2h)\,h + x''(c_{3h})\,h^2/2$$
$$\;= x(2h) + f(x(2h))\,h + f'(x(c_{3h}))\, f(x(c_{3h}))\, h^2/2 = x_2 + f(x_2)\,h + f'(x(c_{3h}))\, f(x(c_{3h}))\, h^2/2.$$

Now add and subtract x̂₃ = x̂₂ + f(x̂₂) h in this equation.

$$x(3h) = x_2 + f(x_2)h + \hat{x}_3 - \hat{x}_3 + f'(x(c_{3h}))\, f(x(c_{3h}))\, h^2/2$$
$$= \bigl(x_2 + f(x_2)h - \hat{x}_2 - f(\hat{x}_2)h\bigr) + \hat{x}_3 + f'(x(c_{3h}))\, f(x(c_{3h}))\, h^2/2$$
$$= \hat{x}_3 + (x_2 - \hat{x}_2) + \bigl(f(x_2) - f(\hat{x}_2)\bigr)h + f'(x(c_{3h}))\, f(x(c_{3h}))\, h^2/2.$$

But we can apply the Mean Value Theorem to the difference f(x₂) − f(x̂₂). We
find f(x₂) − f(x̂₂) = f'(x_u)(x₂ − x̂₂) with x_u between x₂ and x̂₂. Plugging this
in, we find

$$x_3 = \hat{x}_3 + (x_2 - \hat{x}_2)\bigl(1 + f'(x_u)h\bigr) + f'(x(c_{3h}))\, f(x(c_{3h}))\, h^2/2.$$

Thus

$$x_3 - \hat{x}_3 = (x_2 - \hat{x}_2)\bigl(1 + f'(x_u)h\bigr) + f'(x(c_{3h}))\, f(x(c_{3h}))\, h^2/2.$$

Now E₃ = |x₃ − x̂₃|, so we can overestimate



$$E_3 = |x_3 - \hat{x}_3| \le |x_2 - \hat{x}_2|\bigl(1 + |f'(x_u)|h\bigr) + |f'(x(c_{3h}))|\,|f(x(c_{3h}))|\,h^2/2$$
$$\le E_2\bigl(1 + \|f'\|_\infty h\bigr) + \|f'\|_\infty \|f\|_\infty\,h^2/2 = E_2(1 + Ah) + B\,h^2/2.$$

Since E₂ ≤ (C h²/2)(2 + Ch), we find

$$E_3 \le \bigl((C\,h^2/2)(2 + Ch)\bigr)(1 + Ah) + B\,h^2/2 \le \bigl((C\,h^2/2)(2 + Ch)\bigr)(1 + Ch) + C\,h^2/2$$
$$= C\bigl(1 + (1 + Ch) + (1 + Ch)^2\bigr)h^2/2.$$

We have already shown that 1 + u ≤ e^u for any u. It follows that (1 + u)² ≤ (e^u)² = e^{2u}.
So we have

$$E_3 \le C\bigl(1 + (1 + Ch) + (1 + Ch)^2\bigr)h^2/2 \le C\bigl(1 + e^{Ch} + e^{2Ch}\bigr)h^2/2.$$

We also know 1 + u + u² = (u³ − 1)/(u − 1) for u ≠ 1. So we have

$$E_3 \le C\, \frac{e^{3Ch} - 1}{e^{Ch} - 1}\, h^2/2.$$

Continuing, we find after N steps

$$E_N \le C\, \frac{e^{NCh} - 1}{e^{Ch} - 1}\, h^2/2.$$

Now after N steps, we reach the end of the interval at T, so N h = T. Rewriting, the
absolute error we make after N Euler approximation steps is

$$E_N \le C\, \frac{e^{CT} - 1}{e^{Ch} - 1}\, h^2/2.$$

The total error is what we get when we add up E₁ + · · · + E_N ≤ N E_N. This gives
the estimate

$$E_1 + \cdots + E_N \le N\, C\, \frac{e^{CT} - 1}{e^{Ch} - 1}\, \frac{h^2}{2}.$$

Since e^{Ch} − 1 ≥ Ch, this is bounded by

$$N\, C\, \frac{e^{CT} - 1}{Ch}\, \frac{h^2}{2} = \frac{N}{2}\,(e^{CT} - 1)\, h.$$

The total or global error thus has the form of a constant times h and hence the error
over the entire interval is of order h. Of course, the local error at each step is
on the order of h².

3.3 Euler’s Method with Time Dependence

If the dynamics function depends on time, we can still analyze the Euler
approximates. Let's look carefully at the errors we make when we use Euler's
method in this case. We start with a preliminary result called a Lemma.

Lemma 3.3.1 (Estimating The Exponential Function)

For all x, 1 + x ≤ e^x, which implies (1 + x)ⁿ ≤ e^{nx}.

Proof We know if the base point is 0, then since e^x is twice differentiable, there is
a number ξ between 0 and x so that e^x = 1 + x + e^ξ x²/2. However, the last term is
always nonnegative and so dropping that term we obtain e^x ≥ 1 + x, from which the
final result follows. □

3.3.1 Lipschitz Functions and Taylor Expansions

We need another basic idea about functions, the notion of a Lipschitz condition. This is a
definition.

Definition 3.3.1 (Functions Satisfying Lipschitz Conditions)

If f is a function defined on the finite interval [a, b], we say f satisfies a Lipschitz
condition on [a, b] if there is a positive constant K so that

$$|f(x) - f(y)| \le K\, |x - y|$$

for all x and y in [a, b].

Many functions satisfy a Lipschitz condition. For example, if f has a derivative that
is bounded by the constant K on [a, b], this means |f'(x)| ≤ K for all x in [a, b].
We can then apply the Mean Value Theorem to f on any interval [x, y] in [a, b] to
see there is a number ξ between x and y so that

$$|f(x) - f(y)| = |f'(\xi)|\, |x - y| \le K\, |x - y|.$$

We see f is Lipschitz on [a, b].
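For instance, f(x) = x² on [−2, 2] has |f'(x)| = 2|x| ≤ 4 there, so by the Mean Value Theorem argument just given, f satisfies a Lipschitz condition on [−2, 2] with K = 4.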


Next, if we want to solve the equation x  (t) = f (t, x), x(t0 ) = x0 , one way we can
attack it is to assume the solution x has enough smoothness to allow us to expand it
in a Taylor series expansion about the base point (t0 , x0 ). We assume f is continuous

on a rectangle D which is the set of (t, x) with t ∈ [a, b] and x ∈ [c, d] with the point
(t0 , x0 ) in D for some finite intervals [a, b] and [c, d]. The theory of the solutions to
our models allows us to make sure the value of d is large enough to hold the image
of the solution x; i.e., x(t) ∈ [c, d] for all t ∈ [t0 , b]. Also, for convenience, we
are assuming x is a scalar variable although the arguments we present can easily be
modified to handle x being a vector. Using the Taylor Remainder theorem, this gives
for a given time point t:

$$x(t) = x(t_0) + \frac{dx}{dt}(t_0)\,(t - t_0) + \frac{1}{2}\frac{d^2x}{dt^2}(t_0)\,(t - t_0)^2 + \frac{1}{6}\frac{d^3x}{dt^3}(\xi)\,(t - t_0)^3$$

where ξ is some point in the interval [t₀, t]. We also know by the chain rule that since
x' is f,

$$\frac{d^2x}{dt^2} = \frac{\partial f}{\partial t} + \frac{\partial f}{\partial x}\frac{dx}{dt} = \frac{\partial f}{\partial t} + \frac{\partial f}{\partial x}\, f$$

which implies, switching to a standard subscript notation for partial derivatives (yes,
these calculations are indeed yucky!),

$$\frac{d^3x}{dt^3} = (f_{tt} + f_{tx} f) + (f_{xt} + f_{xx} f)\, f + f_x (f_t + f_x f).$$
The important thing to note is that this third order derivative is made up of algebraic
combinations of f and its various partial derivatives. We typically assume that on the
interval [t₀, b], all of these functions are continuous and bounded. Thus, letting
‖g‖_∞ represent the maximum value of a continuous function g(s, u) on the set
[t₀, b] × [c, d], we know there is a constant B so that

$$|x''(t)| \le \|f_x\|_\infty \|f\|_\infty + \|f_t\|_\infty = B,$$

implying ‖x''‖_∞ ≤ B on [t₀, b]. Further, there is a constant C so that

$$|x'''(\xi)| \le \bigl(\|f_{tt}\|_\infty + \|f_{tx}\|_\infty \|f\|_\infty\bigr) + \bigl(\|f_{xt}\|_\infty + \|f_{xx}\|_\infty \|f\|_\infty\bigr)\|f\|_\infty + \|f_x\|_\infty \bigl(\|f_t\|_\infty + \|f_x\|_\infty \|f\|_\infty\bigr) = C.$$

That is, ‖x'''‖_∞ ≤ C on [t₀, b]. Of course, if f has sufficiently smooth higher order
partial derivatives, we can find bounds on higher derivatives of x as well. Now, using
the standard abbreviations f⁰ = f(t₀, x₀), f_t⁰ = (∂f/∂t)(t₀, x₀) and f_x⁰ = (∂f/∂x)(t₀, x₀),
with similar notations for the second order partials, we see our solution can be written
as

$$x(t) = x_0 + f^0 (t - t_0) + \frac{1}{2}\bigl(f_t^0 + f_x^0 f^0\bigr)(t - t_0)^2 + \frac{1}{6}\frac{d^3x}{dt^3}(\xi)(t - t_0)^3$$
$$\;= x_0 + f^0 (t - t_0) + \frac{1}{2}\bigl(f_t^0 + f_x^0 f^0\bigr)(t - t_0)^2 + \frac{1}{6}\Bigl[(f_{tt} + f_{tx} f) + (f_{xt} + f_{xx} f) f + f_x (f_t + f_x f)\Bigr]\Big|_{\xi}(t - t_0)^3.$$

We can now state a result which tells us how much error we make with Euler's
method. From the remarks above, the assumption that ‖x''‖_∞ is bounded is not an
unreasonable one for many models we wish to solve. Our discussions above allow us to
be fairly quantitative about how much local error and global error we make using
Euler's approximations. We state this as a theorem.

Theorem 3.3.2 (Error Estimates For Euler's Method)

Assume x is a solution to x'(t) = f(t, x(t)), x(t₀) = x₀ on the interval [t₀, b] for
some positive b. Assume f and its first order partials are continuous on a rectangle
D which is the set of (t, x) with t ∈ [a, b] and x ∈ [c, d], with the point (t₀, x₀) in
D, for some finite intervals [a, b] and [c, d] with [c, d] containing the range of the
solution x. Let {t₀, t₁, ..., t_N} denote the steps of discrete times obtained using the
step size h in the Euler method. Then the Euler approximates x̂ₙ satisfy

$$|x_n - \hat{x}_n| \le e^{(b-a)K}\, |e_0| + \left(\frac{e^{(b-a)K} - 1}{K}\right) \frac{h}{2}\, \|x''\|_\infty$$

where xₙ is the value of the true solution x(tₙ) and e₀ = x₀ − x̂₀. If we also know
e₀ = 0 (the usual state of affairs), then, letting the constant B be defined by

$$B = \left(\frac{e^{(b-a)K} - 1}{2K}\right) \|x''\|_\infty,$$

we can say |x(b) − x̂_{N(h)}| ≤ B h where N(h) is the index at which t_{N(h)} = b.

Proof From our remarks earlier, our assumptions on f and its first order partials
tell us f satisfies a Lipschitz condition with Lipschitz constant K in D and that the
solution x has a bounded second derivative on the interval [t₀, b]. Let eₙ = xₙ − x̂ₙ
and τₙ = (h/2) x''(ξₙ). Then, the usual Taylor series expansion gives

$$x(t_{n+1}) = x(t_n) + h\, f(t_n, x(t_n)) + h\,\frac{h}{2}\, x''(\xi_n) = x(t_n) + h\, f(t_n, x(t_n)) + h\,\tau_n$$

where ξₙ is between t_{n+1} and tₙ, and the Euler approximates satisfy

$$\hat{x}_{n+1} = \hat{x}_n + h\, f(t_n, \hat{x}_n).$$


Subtracting, we find

$$x_{n+1} - \hat{x}_{n+1} = x_n - \hat{x}_n + h\bigl(f(t_n, x(t_n)) - f(t_n, \hat{x}_n)\bigr) + h\,\tau_n.$$

Thus,

$$e_{n+1} = e_n + h\bigl(f(t_n, x(t_n)) - f(t_n, \hat{x}_n)\bigr) + h\,\tau_n.$$

This leads to the estimate

$$|e_{n+1}| \le |e_n| + h\,|f(t_n, x(t_n)) - f(t_n, \hat{x}_n)| + h\,|\tau_n|.$$

Now apply the Lipschitz condition on f to rewrite the above as

$$|e_{n+1}| \le |e_n| + h K\,|x_n - \hat{x}_n| + h\,|\tau_n| = (1 + hK)|e_n| + h\,|\tau_n|.$$

It is easy to see that

$$|\tau_n| = \Bigl|\frac{h}{2}\, x''(\xi_n)\Bigr| \le \frac{h}{2}\,\|x''\|_\infty.$$

For convenience, let τ(h) = (h/2)‖x''‖_∞. Then, we have the estimate

$$|e_{n+1}| \le (1 + hK)|e_n| + h\,\tau(h).$$

This is a recursion relation. We can easily see what is happening by working out a
few terms.

$$|e_1| \le |e_0|(1 + hK) + h\,\tau(h)$$
$$|e_2| \le |e_1|(1 + hK) + h\,\tau(h) \le \bigl(|e_0|(1 + hK) + h\,\tau(h)\bigr)(1 + hK) + h\,\tau(h)$$
$$\le |e_0|(1 + hK)^2 + h\,\tau(h)\bigl(1 + (1 + hK)\bigr)$$
$$|e_3| \le |e_2|(1 + hK) + h\,\tau(h) \le |e_0|(1 + hK)^3 + h\,\tau(h)\bigl(1 + (1 + hK) + (1 + hK)^2\bigr).$$

It is easy to see that after n steps, we find

$$|e_n| \le |e_0|(1 + hK)^n + h\,\tau(h) \sum_{i=0}^{n-1} (1 + hK)^i.$$

It is well known that for any value of r ≠ 1, we have the identity

$$1 + r + r^2 + r^3 + \cdots + r^{n-1} = \frac{1 - r^n}{1 - r}.$$

Letting r = 1 + hK, we can rewrite our error estimate as

$$|e_n| \le |e_0|(1 + hK)^n + \tau(h)\,\frac{(1 + hK)^n - 1}{K}.$$

Then, using Lemma 3.3.1, (1 + hK)ⁿ ≤ e^{nhK}, and since tₙ = t₀ + nh we know
nh ≤ b − t₀ ≤ b − a. Hence

$$|e_n| \le |e_0|\, e^{(b-a)K} + \tau(h)\,\frac{e^{(b-a)K} - 1}{K}.$$

Now if e₀ = 0 (as it normally would be), recalling τ(h) = (h/2)‖x''‖_∞, we have

$$|e_n| \le \frac{e^{(b-a)K} - 1}{2K}\,\|x''\|_\infty\, h,$$

and letting

$$B = \frac{e^{(b-a)K} - 1}{2K}\,\|x''\|_\infty,$$

we have |e_{N(h)}| ≤ B h as required. □

Comment 3.3.1 Note the local error we make at each step is proportional to h² but
the global error after we reach t = b is proportional to h. Hence, Euler's method is
an order 1 method.

3.4 Euler’s Algorithm

Here then is Euler's Algorithm to approximate the solution to x' = f(t, x) with
x(0) = x₀ using step size h for as many steps as we want.
• x̂₀ = x₀ so E₀ = 0.
• x̂₁ = x̂₀ + f(t₀, x̂₀) h; E₁ = |x₁ − x̂₁|.
• x̂₂ = x̂₁ + f(t₁, x̂₁) h; E₂ = |x₂ − x̂₂|.
• x̂₃ = x̂₂ + f(t₂, x̂₂) h; E₃ = |x₃ − x̂₃|.
• x̂₄ = x̂₃ + f(t₃, x̂₃) h; E₄ = |x₄ − x̂₄|.
• Continue for as many steps as you want.
Recursively:
• x̂₀ = x₀ so E₀ = 0.
• x̂ₙ₊₁ = x̂ₙ + f(tₙ, x̂ₙ) h; Eₙ₊₁ = |xₙ₊₁ − x̂ₙ₊₁| for n = 0, 1, 2, 3, ..., where tₙ = t₀ + n h.
The approximation scheme above is called Euler's Method.
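Here is a minimal sketch of this scheme as a MatLab loop, saved as its own file. This is not the DoEulerTwo.m function from the first text (whose details may differ); the name SimpleEuler and its argument list are our own choices for illustration.

function [tvals, xvals] = SimpleEuler(f, t0, x0, h, n)
% f  dynamics handle f(t,x); t0, x0 initial condition; h stepsize; n number of steps
tvals = t0;  xvals = x0;
tc = t0;  xc = x0;
for j = 1:n
  xc = xc + f(tc, xc)*h;     % the Euler update xhat_{n+1} = xhat_n + f(t_n, xhat_n) h
  tc = tc + h;
  tvals = [tvals tc];
  xvals = [xvals xc];
end
end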

3.4.1 Examples

We need to do some examples by hand so here goes.

Example 3.4.1 Find the first three Euler approximates for x' = −2x, x(0) = 3 using
h = 0.3. Find the true solution values and errors also.

Solution Here f (x) = −2x and the true solution is x(t) = 3e−2t .
Step 0: x̂0 = x0 = 3 so E 0 = 0.
Step 1:

x̂1 = x̂0 + f (x̂0 ) h = 3 + f (3) (0.3)


= 3 + (−2(3)) (0.3) = 3 − 6(0.3) = 3 − 1.8 = 1.2.
x1 = x(h) = 3e−2h = 3e−0.6 = 1.646.
E 1 = |x1 − x̂1 | = |1.646 − 1.2| = 0.446.

Step 2:

x̂2 = x̂1 + f (x̂1 ) h = 1.2 + f (1.2) (0.3)


= 1.2 + (−2(1.2)) (0.3) = 1.2 − 2.4(0.3) = 0.48.
x2 = x(2h) = 3e−2(2h) = 3e−4h = 3e−1.2 = 0.9036.
E 2 = |x2 − x̂2 | = |0.9036 − 0.48| = 0.4236.

Step 3:

x̂3 = x̂2 + f (x̂2 ) h = 0.48 + f (0.48) (0.3)


= 0.48 + (−2(0.48)) (0.3) = 0.48 − 0.96(0.3) = 0.192.
x3 = x(3h) = 3e−2(3h) = 3e−6h = 3e−1.8 = 0.4959.
E 3 = |x3 − x̂3 | = |0.4959 − 0.192| = 0.3039.

Example 3.4.2 Find the first three Euler approximates for x' = 2x, x(0) = 4 using
h = 0.2. Find the true solution values and errors also.

Solution Here f (x) = 2x and the true solution is x(t) = 4e2t .


Step 0: x̂0 = x0 = 4 so E 0 = 0.
Step 1:

x̂1 = x̂0 + f(x̂0) h = 4 + f(4)(0.2)
= 4 + (2(4))(0.2) = 4 + 8(0.2) = 5.6.
x1 = x(h) = 4e^{2h} = 4e^{0.4} = 5.9673.
E1 = |x1 − x̂1| = |5.9673 − 5.6| = 0.3673.

Step 2:

x̂2 = x̂1 + f(x̂1) h = 5.6 + f(5.6)(0.2)
= 5.6 + (2(5.6))(0.2) = 5.6 + 11.2(0.2) = 7.84.
x2 = x(2h) = 4e^{2(2h)} = 4e^{4h} = 4e^{0.8} = 8.9022.
E2 = |x2 − x̂2| = |8.9022 − 7.84| = 1.0622.

Step 3:

x̂3 = x̂2 + f (x̂2 ) h = 7.84 + f (7.84) (0.2)


= 7.84 + (2(7.84)) (0.2) = 7.84 + 15.68(0.2) = 10.976.
x3 = x(3h) = 4e2(3h) = 4e6h = 4e1.2 = 13.2805.
E 3 = |x3 − x̂3 | = |13.2805 − 10.976| = 2.3045.
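The three hand steps above are easy to reproduce with a short loop; here is a sketch (our own illustration, using the dynamics and true solution of Example 3.4.2):

f     = @(t,x) 2*x;            % dynamics for Example 3.4.2
xtrue = @(t) 4*exp(2*t);       % true solution
h = 0.2;  t = 0;  xhat = 4;
for n = 1:3
  xhat = xhat + f(t, xhat)*h;  % Euler step
  t = t + h;
  fprintf('step %d: xhat = %8.4f, x = %8.4f, E = %6.4f\n', n, xhat, xtrue(t), abs(xtrue(t) - xhat));
end

This prints the values 5.6, 7.84 and 10.976 along with the three errors computed by hand above.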

3.5 Runge–Kutta Methods

These methods are based on more sophisticated ways of approximating the solution
y  . These methods use multiple function evaluations at different time points around a
given t ∗ to approximate y(t ∗ ). In more advanced classes, we can show this technique
generates a sequence {yn } starting at y0 using the following recursion equation:

$$y_{n+1} = y_n + h \times F^o(t_n, y_n, h, f), \qquad y_0 \text{ given},$$

where h is the step size we use for our underlying partition of the time space giving

$$t_i = t_0 + i \times h$$

for appropriate indices, and F^o is a fairly complicated function of the previous approx-
imate solution, the step size and the right hand side function f. The Runge–Kutta
methods are available for various choices of the superscript o, which is called the
order of the method. We will not discuss much about F^o in this course, as it is best
served up in a more advanced class. What we can say is this: for order o, the local
error is like h^{o+1}. So
Order One: Local error is h² and this method is the same as the Euler Method.
The global error then goes down linearly with h.
Order Two: Local error is h³ and this method is better than the Euler Method. If
the global error for a given stepsize h is E, then halving the stepsize to h/2 gives
a new global error of E/4. Thus, the global error goes down quadratically. This
means halving the stepsize has a dramatic effect on the global error.
Order Three: Local error is h⁴ and this method is better than the Euler Method. If
the global error for a given stepsize h is E, then halving the stepsize to h/2 gives a
new global error of E/8. Thus, the global error goes down as a cubic power. This
means halving the stepsize has an even more dramatic effect on the global error.
Order Four: Local error is h⁵ and this method is better than the order three
method. If the global error for a given stepsize h is E, then halving the step-
size to h/2 gives a new global error of E/16. Thus, the global error goes down as
a fourth power! This means halving the stepsize has a huge effect on the global error.
We will now look at MatLab code that allows us to solve our differential equation
problems using the Runge–Kutta method instead of the Euler method of Sect. 3.3.

3.5.1 The MatLab Implementation

The basic code to implement the Runge–Kutta methods is broken into two pieces.
The first one, RKstep.m implements the evaluation of the next approximation so-
lution at point (tn , yn ) given the old approximation at (tn−1 , yn−1 ). We then loop
through all the steps to get to the chosen final time using the code in FixedRK.m.
The details of these algorithms are beyond the scope of this text and so we will not go
into them here. In this code, we are allowing for the dynamic functions to depend on

time also. Previously, we have used dynamics like f = @(x) 3*x and we expect
the dynamics functions to have that form in DoEuler. However, we want to have
more complicated dynamics now—at least the possibility of it!—so we will adapt
what we have done before. We will now define our dynamics as if they depend on
time. So from now on, we would write f=@(t,x) 3*x even though there is no
time dependence. We then rewrite our DoEuler to DoEulerTwo so that we can
use these more general dynamics. This code is in DoEulerTwo.m and we have
discussed it in the first text on starting your calculus journey. You can review this
function in that text. The Runge–Kutta code uses the new dynamics functions. We
have gone over this code in the previous text, but we will show it to you again for
completeness. In this code, you see the lines like feval(fname,t,x) which
means take the function fname passed in as an argument and evaluate it as the pair
(t,x). Hence, fname(t,x) is the same as f(t,x).

Listing 3.1: RKstep.m: Runge–Kutta Codes


function [tnew, ynew, fnew] = RKstep(fname, tc, yc, fc, h, k)
%
% fname  the name of the right hand side function f(t,y)
%        t is a scalar usually called time and
%        y is a vector of size d
% yc     approximate solution to y'(t) = f(t,y(t)) at t = tc
% fc     f(tc,yc)
% h      The time step
% k      The order of the Runge-Kutta Method 1 <= k <= 4
%
% tnew   tc+h
% ynew   approximate solution at tnew
% fnew   f(tnew,ynew)
%
if k==1
  k1 = h*fc;
  ynew = yc+k1;
elseif k==2
  k1 = h*fc;
  k2 = h*feval(fname,tc+(h/2),yc+(k1/2));
  ynew = yc + k2;
elseif k==3
  k1 = h*fc;
  k2 = h*feval(fname,tc+(h/2),yc+(k1/2));
  k3 = h*feval(fname,tc+h,yc-k1+2*k2);
  ynew = yc+(k1+4*k2+k3)/6;
elseif k==4
  k1 = h*fc;
  k2 = h*feval(fname,tc+(h/2),yc+(k1/2));
  k3 = h*feval(fname,tc+(h/2),yc+(k2/2));
  k4 = h*feval(fname,tc+h,yc+k3);
  ynew = yc+(k1+2*k2+2*k3+k4)/6;
else
  disp(sprintf('The RK method %2d order is not allowed!',k));
end
tnew = tc+h;
fnew = feval(fname,tnew,ynew);
end

The code above does all the work. It manages all of the multiple tangent line
calculations that Runge–Kutta needs at each step. We loop through all the steps to
get to the chosen final time using the code in FixedRK.m which is shown below.

Listing 3.2: FixedRK.m: The Runge–Kutta Solution


function [tvals, yvals, fcvals] = FixedRK(fname, t0, y0, h, k, n)
%
% Gives approximate solution to
%   y'(t) = f(t,y(t))
%   y(t0) = y0
% using a kth order RK method
%
% t0   initial time
% y0   initial state
% h    stepsize
% k    RK order 1 <= k <= 4
% n    Number of steps to take
%
% tvals   time values of form
%         tvals(j) = t0 + (j-1)*h, 1 <= j <= n
% yvals   approximate solution
%         yvals(:,j) = approximate solution at
%         tvals(j), 1 <= j <= n
%
tc = t0;
yc = y0;
tvals = tc;
yvals = yc;
fc = feval(fname,tc,yc);
fcvals = fc;                 % initialize the stored function evaluations
for j=1:n-1
  [tc,yc,fc] = RKstep(fname,tc,yc,fc,h,k);
  yvals = [yvals yc];
  tvals = [tvals tc];
  fcvals = [fcvals fc];
end
end

Here is an example where we solve a specific model using all four Runge–Kutta
choices and plot them all together. Note when we use RKstep, we only return the
first two outputs; that is, for our returned variables, we write [htime1,xhat1] =
FixedRK(f,0,20,0.06,1,N1); instead of returning the full list of outputs
which includes function evaluations [htime1,xhat1,fhat1] = FixedRK
(f,0,20,0.06,1,N1);. We can do this as it is all right to not return the third
output. However, you still have to return the arguments in the order stated when the
function is defined. For example, if we used the command [htime1,fhat1] =
FixedRK(f,0,20,0.06,1,N1);, this would return the approximate values
and place them in the variable fhat1 which is not what we would want to do.

Fig. 3.1 True versus Euler for x' = 0.5x(60 − x), x(0) = 20

Listing 3.3: True versus All Four Runge–Kutta Approximations


f = @(t,x) .5*x.*(60-x);
true = @(t) 60./(1+(60/20 -1)*exp(-.5*60*t));
T = .6;
time = linspace(0,T,31);
h1 = .06;
N1 = ceil(T/h1);
[htime1,xhat1] = FixedRK(f,0,20,.06,1,N1);
[htime2,xhat2] = FixedRK(f,0,20,.06,2,N1);
[htime3,xhat3] = FixedRK(f,0,20,.06,3,N1);
[htime4,xhat4] = FixedRK(f,0,20,.06,4,N1);
% the ... at the end of the line allows us to continue
% a long line to the start of the next line
plot(time,true(time),htime1,xhat1,'*',htime2,xhat2,'o',...
     htime3,xhat3,'+',htime4,xhat4,'.');
xlabel('Time');
ylabel('x');
% We want to use the derivative symbol x' so
% since Matlab treats ' as the start and stop of the label
% we write ''. That way Matlab will treat '' as a single quote
% that is our differentiation symbol
title('RK Approximations to x''=.5x(60-x), x(0) = 20');
% Notice we continue this line too
legend('True','RK 1, h=.06','RK 2, h=.06','RK 3, h = .06',...
       'RK 4, h = .06','Location','Best');

This generates Fig. 3.1


Note Runge–Kutta Order 4 does a great job even with a large step size.
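To see the order claims above in action, here is a hypothetical experiment one might run with FixedRK (a sketch, not part of the text): solve the same problem with stepsizes h and h/2 and compare the errors at the final time. For an order k method, and small enough h, the ratio should be roughly 2^k.

f   = @(t,x) .5*x.*(60-x);
xex = @(t) 60./(1+(60/20 -1)*exp(-.5*60*t));
k = 2;                              % try the order 2 method
n = 11;                             % reaches t = (n-1)*0.06 = 0.6
[t1,x1] = FixedRK(f,0,20,0.06,k,n);
[t2,x2] = FixedRK(f,0,20,0.03,k,2*n-1);   % same final time with h/2
E1 = abs(x1(end) - xex(t1(end)));
E2 = abs(x2(end) - xex(t2(end)));
E1/E2                               % roughly 2^k = 4 here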
Chapter 4
Multivariable Calculus

Now let’s review functions of more than one variable since it may have been some
time since you looked at these ideas. This is gone over carefully in the first text on
starting calculus ideas, but it is a good idea to talk about them again.

4.1 Functions of Two Variables

Let's start by looking at the x−y plane as a collection of two dimensional vectors.
Each vector is rooted at the origin and the head of the vector corresponds to our
usual coordinate pair (x, y). The set of all such x and y determines the x−y plane
which we will also call ℝ². The superscript two is used because we are now explicitly
acknowledging that we can think of these ordered pairs as vectors also with just a
slight identification on our part. Since we know about vectors, note if we have a
vector we can rewrite it, using our standard rules for vector arithmetic and scaling
of vectors, as

$$\begin{bmatrix} 6 \\ 7 \end{bmatrix} = 6 \begin{bmatrix} 1 \\ 0 \end{bmatrix} + 7 \begin{bmatrix} 0 \\ 1 \end{bmatrix}$$

A little thought will let you see we can do this for any vector and so we define special
vectors i = e₁ and j = e₂ as follows:

$$\mathbf{i} = \mathbf{e}_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix} \quad \text{and} \quad \mathbf{j} = \mathbf{e}_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$$


Thus, any vector can be written as

 
\[
\begin{bmatrix} x \\ y \end{bmatrix} = x\, e_1 + y\, e_2 = x\, i + y\, j
\]

Now let’s start looking at functions that map each ordered pair (x, y) into a number.
Let’s begin with an example. Consider the function f (x, y) = x 2 + y 2 defined for
all x and y. Hence, for each x and y we pick, we calculate a number we can denote by
z whose value is f (x, y) = x 2 + y 2 . Using the same ideas we just used for the x−y
plane, we see the set of all such triples (x, y, z) = (x, y, x 2 + y 2 ) defines a surface
in ℝ³ which is the collection of all ordered triples (x, y, z). Each of these triples can
be identified with a three dimensional vector whose tail is the origin and whose head
is the triple (x, y, z). We note any three dimensional vector can be written as

\[
\begin{bmatrix} x \\ y \\ z \end{bmatrix} = x\, e_1 + y\, e_2 + z\, e_3 = x\, i + y\, j + z\, k
\]

where we define the special vectors used in this representation by

\[
i = e_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \quad
j = e_2 = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \quad\text{and}\quad
k = e_3 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}
\]

We can plot this surface in MatLab with fairly simple code. As discussed in the first
volume, the utility function DrawSimpleSurface manages how to draw
such a surface, using boolean variables like DoGrid to turn individual pieces of the graph on or off. The
surface is drawn using a grid, a mesh, traces, patches and columns and a base, all of
which contribute to a somewhat cluttered figure. Hence, the boolean variables allow
us to select how much clutter we want to see! So if the boolean variable DoGrid is
set to one, the grid is drawn. The code is self-explanatory so we just lay it out here.
We haven’t shown all the code for the individual drawing functions, but we think
you’ll find it interesting to see how we manage the pieces in this one piece of code.
So check this out.

Listing 4.1: DrawSimpleSurface


f u n c t i o n DrawSimpleSurface ( f , d e l x , nx , d e l y , ny , x0 , y0 , domesh , d o t r a c e s , dogrid , dopatch ,
docolumn , dobase )
% f i s the function defining the surface
% delx i s the s i z e of the x s t e p
% nx i s t h e number o f s t e p s l e f t and r i g h t from x0
5 % dely i s the s i z e of the y s t e p
% ny i s t h e number o f s t e p s l e f t and r i g h t from y0
% ( x0 , y0 ) i s t h e l o c a t i o n o f t h e column r e c t a n g l e b a s e
% domesh = 1 i f we want t o do t h e mesh
% d o g r i d = 1 i f we want t o do t h e g r i d
10 % d o p a t c h = 1 i f we want t h e p a t c h a b o v e t h e column
% d o b a s e = 1 i f we want t h e b a s e o f t h e column
% docolumn = 1 i f we want t h e column
% d o t r a c e s = 1 i f we want t h e t r a c e s
%
15 h o l d on
i f d o t r a c e s ==1
% s e t up x t r a c e f o r x0 , y t r a c e f o r y0
DrawTraces ( f , d e l x , nx , d e l y , ny , x0 , y0 ) ;
end
20 i f domesh==1 % p l o t t h e s u r f a c e
DrawMesh ( f , d e l x , nx , d e l y , ny , x0 , y0 ) ;
end
i f d o g r i d ==1 %p l o t x , y g r i d
DrawGrid ( f , d e l x , nx , d e l y , ny , x0 , y0 ) ;
25 end
i f dopatch ==1
% draw p a t c h f o r t o p o f column
DrawPatch ( f , d e l x , nx , d e l y , ny , x0 , y0 ) ;
end
30 i f dobase ==1
% draw p a t c h f o r t o p o f column
DrawBase ( f , d e l x , nx , d e l y , ny , x0 , y0 ) ;
end
i f docolumn ==1
35 %draw column
DrawColumn ( f , d e l x , nx , d e l y , ny , x0 , y0 ) ;
end
hold o f f
end

Hence, to draw everything for this surface, we would use the session:

Listing 4.2: Drawing a simple surface


1 f = @( x , y ) x . ˆ 2 + y . ˆ 2 ;
DrawSimpleSurface ( f , 0 . 5 , 2 , 0 . 5 , 2 , 0 . 5 , 0 . 5 , 1 , 1 , 1 , 1 , 1 , 1 ) ;

This surface has circular cross sections for different positive values of z and it is
called a circular paraboloid. If you used f(x, y) = 4x^2 + 3y^2, the cross sections
for positive z would be ellipses and we would call the surface an elliptical paraboloid.
Now this code is not perfect. However, as an exploratory tool it is not bad! Now it is
time for you to play with it a bit in the exercises below.
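For instance, to look at an elliptical paraboloid you could change the function handle and reuse the same call. This is just a sketch; it assumes DrawSimpleSurface and its helper functions are on your MatLab path with the argument order shown in Listing 4.1:

% elliptical paraboloid: cross sections for positive z are ellipses
g = @(x,y) 4*x.^2 + 3*y.^2;
% same grid choices as before; draw the mesh and traces, turn the rest off
DrawSimpleSurface(g, 0.5, 2, 0.5, 2, 0.5, 0.5, 1, 1, 0, 0, 0, 0);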

4.1.1 Homework

Exercise 4.1.1 Explore the surface graph of the circular paraboloid f(x, y) = x^2 + y^2 for different values of (x_0, y_0), Δx and Δy. Experiment with the 3D rotated view to make sure you see everything of interest.

Exercise 4.1.2 Explore the surface graph of the elliptical paraboloid f(x, y) = 2x^2 + y^2 for different values of (x_0, y_0), Δx and Δy. Experiment with the 3D rotated view to make sure you see everything of interest.

Exercise 4.1.3 Explore the surface graph of the elliptical paraboloid f(x, y) = 2x^2 + 3y^2 for different values of (x_0, y_0), Δx and Δy. Experiment with the 3D rotated view to make sure you see everything of interest.

4.2 Continuity

Let’s recall the ideas of continuity for a function of one variable. Consider these three
versions of a function f defined on [0, 2].

\[
f(x) = \begin{cases} x^2, & 0 \le x < 1 \\ 10, & x = 1 \\ 1 + (x-1)^2, & 1 < x \le 2. \end{cases}
\]

The first version is not continuous at x = 1 because although the lim x→1 f (x) exists
and equals 1 (lim x→1− f (x) = 1 and lim x→1+ f (x) = 1), the value of f (1) is 10
which does not match the limit. Hence, we know f here has a removable discontinuity
at x = 1. Note continuity failed because the limit existed but the value of the function
did not match it. The second version of f is given below.

\[
f(x) = \begin{cases} x^2, & 0 \le x \le 1 \\ (x-1)^2, & 1 < x \le 2. \end{cases}
\]

In this case, lim_{x→1^-} f(x) = 1 and f(1) = 1, so f is continuous from the left. However, lim_{x→1^+} f(x) = 0, which does not match f(1), and so f is not continuous from the right. Also, since the right and left hand limits do not match at x = 1, we know lim_{x→1} f(x) does not exist. Here, the function fails to be continuous because the limit
does not exist. The final example is below:

\[
f(x) = \begin{cases} x^2, & 0 \le x \le 1 \\ x + (x-1)^2, & 1 < x \le 2. \end{cases}
\]

Here, the limit and the function value at 1 both match and so f is continuous at
x = 1. To extend these ideas to two dimensions, the first thing we need to do is
to look at the meaning of the limiting process. What does lim(x,y)→(x0 ,y0 ) mean?
Clearly, in one dimension we can approach a point x_0 in essentially two ways: from the left, from the right, or by jumping back and forth between the two sides. Now, it is apparent
that we can approach a given point (x0 , y0 ) in an infinite number of ways. Draw
a point on a piece of paper and convince yourself that there are many ways you
can draw a curve from another point (x, y) so that the curve ends up at (x0 , y0 )!
We still want to define continuity in the same way; i.e. f is continuous at the point
(x0 , y0 ) if lim(x,y)→(x0 ,y0 ) f (x, y) = f (x0 , y0 ). If you look at the graphs of the surface
z = x^2 + y^2 we have done previously, we clearly see that we have this kind of
behavior. There are no jumps, tears or gaps in the surface we have drawn. Let’s make
this formal.

Definition 4.2.1 (Continuity)


Let z = f(x, y) be a function of the two independent variables x and y defined on some domain. Suppose f is defined at each pair (x, y) in a circle of some finite radius r about (x_0, y_0),

\[
B_r(x_0, y_0) = \{ (x, y) \;|\; \sqrt{(x-x_0)^2 + (y-y_0)^2} < r \}.
\]

If lim_{(x,y)→(x_0,y_0)} f(x, y) exists and matches f(x_0, y_0), we say f is continuous at (x_0, y_0).

Here is an example of a function which is not continuous at the point (0, 0). Let

\[
f(x, y) = \begin{cases} \dfrac{2x}{\sqrt{x^2 + y^2}}, & (x, y) \ne (0, 0) \\ 0, & (x, y) = (0, 0). \end{cases}
\]

If we show the limit as we approach (0, 0) does not exist, then we will know f is not
continuous at (0, 0). If this limit exists, we should get the same value for the limit
no matter what path we take to reach (0, 0). Let the first path be given by x(t) = t and y(t) = 2t with t > 0. Then, as t → 0, (x(t), y(t)) → (0, 0) as desired. Plugging in to f, we find for t > 0 that f(t, 2t) = 2t/√(t^2 + 4t^2) = 2/√5 and hence the limit along this path is this constant value 2/√5. On the other hand, along the path x(t) = t and y(t) = −3t, for t > 0, we have f(t, −3t) = 2t/√(t^2 + 9t^2) = 2/√10 which is not the same. Since the limiting value differs on two paths, the limit can't exist. Hence, f is not continuous at (0, 0).
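You can see this numerically as well. The sketch below evaluates f along the two paths above for smaller and smaller t; the two columns of values settle down to different numbers, which is consistent with the limit failing to exist. The function handle is just our encoding of the formula for f away from the origin:

% f away from (0,0); do not evaluate this handle at the origin itself
f = @(x,y) 2*x ./ sqrt(x.^2 + y.^2);
t = 10.^(-(1:6));              % t = 0.1, 0.01, ..., 1e-6
path1 = f(t, 2*t);             % along x = t, y = 2t
path2 = f(t, -3*t);            % along x = t, y = -3t
disp([t' path1' path2']);      % path1 -> 2/sqrt(5), path2 -> 2/sqrt(10)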

4.3 Partial Derivatives

Let’s go back to our simple surface example and look at the traces again. In Fig. 4.1, we
show the traces for the base point x0 = 0.5 and y0 = 0.5. We have also drawn vertical
lines down from the traces to the x−y plane to further emphasize the placement of

Fig. 4.1 The traces f(x_0, y) and f(x, y_0) for the surface z = x^2 + y^2 for x_0 = 0.5 and y_0 = 0.5

the traces on the surface. The surface itself is not shown as it is somewhat distracting
and makes the illustration too busy.
You can generate this type of graph yourself with the function DrawFullTraces
as follows:

Listing 4.3: Drawing a full trace


DrawFullTraces ( f , 0 . 5 , 2 , 0 . 5 , 2 , 0 . 5 , 0 . 5 ) ;

Note that each trace has a well-defined tangent line and derivative at the points x_0 and y_0. We have

\[
\frac{d}{dx} f(x, y_0) = \frac{d}{dx}\left( x^2 + y_0^2 \right) = 2x
\]

as the value y_0 in this expression is a constant and hence its derivative with respect to x is zero. We denote this new derivative as ∂f/∂x, which we read as the partial derivative of f with respect to x. Its value at the point (x_0, y_0) is 2x_0 here. For any value of (x, y), we would have ∂f/∂x = 2x. We also have

\[
\frac{d}{dy} f(x_0, y) = \frac{d}{dy}\left( x_0^2 + y^2 \right) = 2y
\]

We then denote this new derivative as ∂f/∂y, which we read as the partial derivative of f with respect to y. Its value at the point (x_0, y_0) is then 2y_0 here. For any value of (x, y), we would have ∂f/∂y = 2y.
The tangent lines for these two traces are then


\[
T(x, y_0) = f(x_0, y_0) + \left. \frac{d}{dx} f(x, y_0) \right|_{x_0} (x - x_0) = (x_0^2 + y_0^2) + 2x_0 (x - x_0)
\]
\[
T(x_0, y) = f(x_0, y_0) + \left. \frac{d}{dy} f(x_0, y) \right|_{y_0} (y - y_0) = (x_0^2 + y_0^2) + 2y_0 (y - y_0).
\]

We can also write these tangent line equations like this using our new notation for
partial derivatives.

\[
T(x, y_0) = f(x_0, y_0) + \frac{\partial f}{\partial x}(x_0, y_0)\,(x - x_0) = (x_0^2 + y_0^2) + 2x_0 (x - x_0)
\]
\[
T(x_0, y) = f(x_0, y_0) + \frac{\partial f}{\partial y}(x_0, y_0)\,(y - y_0) = (x_0^2 + y_0^2) + 2y_0 (y - y_0).
\]

We can draw these tangent lines in 3D. To draw T (x, y0 ), we fix the y value to be y0
and then we draw the usual tangent line in the x−z plane. This is a copy of the x−z
plane translated over to the value y0 ; i.e. it is parallel to the x−z plane we see at the
value y = 0. We can do the same thing for the tangent line T(x_0, y); we fix the x
value to be x0 and then draw the tangent line in the copy of the y − z plane translated
to the value x0 . We show this in Fig. 4.3. Note the T (x, y0 ) and the T (x0 , y) lines
are determined by vectors as shown below.

\[
A = \begin{bmatrix} 1 \\ 0 \\ \left. \frac{d}{dx} f(x, y_0) \right|_{x_0} \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \\ 2x_0 \end{bmatrix}
\quad\text{and}\quad
B = \begin{bmatrix} 0 \\ 1 \\ \left. \frac{d}{dy} f(x_0, y) \right|_{y_0} \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \\ 2y_0 \end{bmatrix}
\]

Note that if we connect the lines determined by the vectors A and B, we determine
a flat sheet which you can interpret as a piece of paper laid on top of these two lines.

Fig. 4.2 The traces f(x_0, y) and f(x, y_0) for the surface z = x^2 + y^2 for x_0 = 0.5 and y_0 = 0.5 with added tangent lines. We have added the tangent plane determined by the tangent lines

Of course, we can only envision a small finite subset of this sheet of paper as you
can see in Fig. 4.2. Imagine that the sheet extends infinitely in all directions! The
sheet of paper we are plotting is called the tangent plane to our surface at the point
(x0 , y0 ). We will talk about this more formally later.
To draw this picture with the tangent lines, the traces and the tangent plane, we use
DrawTangentLines which has arguments (f,fx,fy,delx,nx,dely,ny,
r,x0,y0). There are three new arguments: fx which is ∂ f /∂x, fy which is ∂ f /∂ y
and r which is the size of the tangent plane that is plotted. For the picture shown in
Fig. 4.3, we’ve removed the tangent plane because the plot was getting pretty busy.
We did this by commenting out the line that plots the tangent plane. It is easy for
you to go into the code and add it back in if you want to play around. The MatLab
command line is

Listing 4.4: Drawing Tangent Lines


f x = @( x , y ) 2∗ x ;
f y = @( x , y ) 2∗ y ;
%
DrawTangentLines ( f , fx , fy , 0 . 5 , 2 , 0 . 5 , 2 , . 3 , 0 . 5 , 0 . 5 ) ;

If you want to see the tangent plane as well as the tangent lines, all you have to
do is look at the following lines in DrawTangentLines.m.

Fig. 4.3 The traces f(x_0, y) and f(x, y_0) for the surface z = x^2 + y^2 for x_0 = 0.5 and y_0 = 0.5 with added tangent lines

Listing 4.5: Drawing Tangent Lines


1 % s e t up a new l o c a l mesh g r i d n e a r ( x0 , y0 )
[U, V] = meshgrid ( u , v )
% s e t up t h e t a n g e n t p l a n e a t ( x0 , y0 )
W = f ( x0 , y0 ) + f x ( x0 , y0 ) ∗ (U−x0 ) + f y ( x0 , y0 ) ∗ (V−y0 )
% p l o t the tangent plane
6 s u r f (U, V,W, ’EdgeColor’ , ’blue’ ) ;

These lines setup the tangent plane and the tangent plane is turned off if there is
a % in front of surf(U,V,W,’EdgeColor’,’blue’);. We edited the file to
take the % out so we can see the tangent plane. We then see the plane in Fig. 4.4 as
we saw before.
The ideas we have been discussing can be made more general. When we take the
derivative with respect to one variable while holding the other variable constant (as
we do when we find the ordinary one variable derivative along a trace), we say we are taking a
partial derivative of f. Here there are two flavors: the partial derivative with respect
to x and the partial derivative with respect to y. We can now state some formal
definitions and introduce the notations and symbols we use for these things. We
define the process of partial differentiation carefully below.

Definition 4.3.1 (Partial Derivatives)


Let z = f(x, y) be a function of the two independent variables x and y defined on some domain. At each pair (x_0, y_0) where f is defined in a circle of some finite radius r, B_r(x_0, y_0) = { (x, y) | √((x − x_0)^2 + (y − y_0)^2) < r }, it makes sense to try to find the limits

\[
\lim_{x \to x_0,\, y = y_0} \frac{f(x, y) - f(x_0, y_0)}{x - x_0}
\qquad\text{and}\qquad
\lim_{x = x_0,\, y \to y_0} \frac{f(x, y) - f(x_0, y_0)}{y - y_0}.
\]

If these limits exist, they are called the partial derivatives of f with respect to x and y at (x_0, y_0), respectively.

Fig. 4.4 The traces f(x_0, y) and f(x, y_0) for the surface z = x^2 + y^2 for x_0 = 0.5 and y_0 = 0.5 with added tangent lines
y at (x0 , y0 ), respectively.

Comment 4.3.1 For these partial derivatives, we use the symbols


\[
f_x(x_0, y_0), \quad \frac{\partial f}{\partial x}(x_0, y_0), \quad z_x(x_0, y_0), \quad \frac{\partial z}{\partial x}(x_0, y_0)
\]
and
\[
f_y(x_0, y_0), \quad \frac{\partial f}{\partial y}(x_0, y_0), \quad z_y(x_0, y_0), \quad \frac{\partial z}{\partial y}(x_0, y_0)
\]

Comment 4.3.2 We often use another notation for partial derivatives. The function
f of two variables x and y can be thought of as having two arguments or slots into
which we place values. So another useful notation is to let the symbol D1 f be f x and
D2 f be f y . We will be using this notation later when we talk about the chain rule.

Comment 4.3.3 It is easy to take partial derivatives. Just imagine the one variable
held constant and take the derivative of the resulting function just like you did in
your earlier calculus courses.
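If you have the Symbolic Math Toolbox in MatLab (or the symbolic package in Octave), you can also let the computer do this bookkeeping and check your hand computations. This is just a sketch of that idea, not something the later codes depend on:

% symbolic check of a partial derivative computation
syms x y
f = x^2 + y^2;       % the circular paraboloid from Section 4.1
diff(f, x)           % returns 2*x, matching our trace computation
diff(f, y)           % returns 2*y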

Example 4.3.1 Let z = f(x, y) = x^2 + 4y^2 be a function of two variables. Find ∂z/∂x and ∂z/∂y.
Solution Thinking of y as a constant, we take the derivative in the usual way with
respect to x. This gives

\[
\frac{\partial z}{\partial x} = 2x
\]

as the derivative of 4y^2 with respect to x is 0. So, we know f_x = 2x.

In a similar way, we find ∂z/∂y. We see

\[
\frac{\partial z}{\partial y} = 8y
\]

as the derivative of x^2 with respect to y is 0. So f_y = 8y.


Example 4.3.2 Let z = f(x, y) = 4x^2 y^3. Find ∂z/∂x and ∂z/∂y.
Solution Thinking of y as a constant, take the derivative in the usual way with
respect to x: This gives

\[
\frac{\partial z}{\partial x} = 8x y^3
\]

as the term 4y^3 is considered a “constant” here. So f_x = 8x y^3.

Similarly,

\[
\frac{\partial z}{\partial y} = 12x^2 y^2
\]

as the term 4x^2 is considered a “constant” here. So f_y = 12x^2 y^2.


Now let’s do some without spelling out each step. Make sure you see what we are
doing!
Example 4.3.3 f(x, y) = (x^4 + 1)/(y^3 + 2).
Solution

\[
\frac{\partial f}{\partial x} = \frac{4x^3}{y^3 + 2}, \qquad
\frac{\partial f}{\partial y} = -\frac{(x^4 + 1)(3y^2)}{(y^3 + 2)^2}
\]

Example 4.3.4 f(x, y) = (x^4 y^2 + 2)/(y^3 x^5 + 20).
Solution

\[
\frac{\partial f}{\partial x} = \frac{(4x^3 y^2)(y^3 x^5 + 20) - (x^4 y^2 + 2)(5 y^3 x^4)}{(y^3 x^5 + 20)^2}, \qquad
\frac{\partial f}{\partial y} = \frac{(2x^4 y)(y^3 x^5 + 20) - (x^4 y^2 + 2)(3 y^2 x^5)}{(y^3 x^5 + 20)^2}
\]

Example 4.3.5 f(x, y) = sin(x^3 y + 2).

Solution
\[
\frac{\partial f}{\partial x} = \cos(x^3 y + 2)\,(3x^2 y), \qquad
\frac{\partial f}{\partial y} = \cos(x^3 y + 2)\,(x^3)
\]

Example 4.3.6 f(x, y) = e^{-(x^2 + y^4)}.

Solution
\[
\frac{\partial f}{\partial x} = e^{-(x^2 + y^4)}\,(-2x), \qquad
\frac{\partial f}{\partial y} = e^{-(x^2 + y^4)}\,(-4y^3)
\]

Example 4.3.7 f(x, y) = ln(√(x^2 + 2y^2)).

Solution
\[
\frac{\partial f}{\partial x} = \frac{1}{2}\,\frac{2x}{x^2 + 2y^2}, \qquad
\frac{\partial f}{\partial y} = \frac{1}{2}\,\frac{4y}{x^2 + 2y^2}
\]
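Even without symbolic tools, you can sanity check a messy partial derivative with a finite difference quotient, since f_x(x_0, y_0) is approximately (f(x_0 + h, y_0) − f(x_0, y_0))/h for small h. Here is a sketch using Example 4.3.4; the test point and step size are simply our own choices:

f  = @(x,y) (x.^4.*y.^2 + 2)./(y.^3.*x.^5 + 20);
fx = @(x,y) ((4*x.^3.*y.^2).*(y.^3.*x.^5+20) - (x.^4.*y.^2+2).*(5*y.^3.*x.^4))./(y.^3.*x.^5+20).^2;
x0 = 1.2; y0 = 0.7; h = 1e-6;
approx = (f(x0+h,y0) - f(x0,y0))/h;   % forward difference in x
exact  = fx(x0,y0);                   % the formula we derived above
fprintf('finite difference %g versus formula %g\n', approx, exact);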

4.3.1 Homework

These are for you: for each of these functions, find f x and f y .
First, functions with no cross terms.
Exercise 4.3.1 f(x, y) = x^2 + 3y^2.
Exercise 4.3.2 f(x, y) = 4x^2 + 5y^4.
Exercise 4.3.3 f(x, y) = −3x + 2y^8.

Next, functions with cross terms.

Exercise 4.3.4 f(x, y) = x^2 y^2.

Exercise 4.3.5 f(x, y) = 2x^3 y^2 + 5x.

Exercise 4.3.6 f(x, y) = 3x y^2.

Exercise 4.3.7 f(x, y) = x^2 y^5.

Next, functions with fractions.


Exercise 4.3.8 f(x, y) = x^2 / y^5.

Exercise 4.3.9 f(x, y) = (x^2 + 2y)/(5x + y^3).

Exercise 4.3.10 f(x, y) = (x + 2y)/(5x + y).

Exercise 4.3.11 f(x, y) = (x^2 + 2)/(5 + y).

Now, sin and cos things.

Exercise 4.3.12 f(x, y) = sin(x y).

Exercise 4.3.13 f(x, y) = sin(x^2 y).

Exercise 4.3.14 f(x, y) = cos(x + 3y).

Exercise 4.3.15 f(x, y) = sin(√(x + y)).

Exercise 4.3.16 f(x, y) = cos^2(x + 4y).

Exercise 4.3.17 f(x, y) = √(sin(x y)).

Now, let’s add ln and exp.

Exercise 4.3.18 f(x, y) = e^{xy}.

Exercise 4.3.19 f(x, y) = e^{x + 4y}.

Exercise 4.3.20 f(x, y) = e^{−3xy}.

Exercise 4.3.21 f(x, y) = ln(x^2 + 4y^2).

Exercise 4.3.22 f(x, y) = ln(√(1 + xy)).

Exercise 4.3.23 f(x, y) = e^{sin(3x + 5y)}.

Exercise 4.3.24 f(x, y) = e^{cos(3x^2 + 5y)}.

Exercise 4.3.25 f(x, y) = ln(1 + 3x + 8y).



4.4 Tangent Planes

Before we discuss tangent planes to a function f again, let’s digress to the ideas of
planes in general in 3D. We define a plane as follows.
Definition 4.4.1 (Planes)
A plane in 3D through the point (x_0, y_0, z_0) is defined as the set of all points (x, y, z) so that the vector D is perpendicular to the vector N, where D is the vector we get by connecting the point (x_0, y_0, z_0) to the point (x, y, z). Hence, for

\[
D = \begin{bmatrix} x - x_0 \\ y - y_0 \\ z - z_0 \end{bmatrix} \quad\text{and}\quad N = \begin{bmatrix} N_1 \\ N_2 \\ N_3 \end{bmatrix}
\]

the plane is the set of points (x, y, z) so that < D, N > = 0. The vector N is called
the normal vector to the plane.
Comment 4.4.1 A little thought shows that any plane crossing through the origin
is a two dimensional subspace of ℝ³.
Example 4.4.1 The equation 2x + 3y − 5z = 0 defines the plane whose normal vec-
tor is N = [2, 3, −5]T which passes through the origin (0, 0, 0).
Example 4.4.2 The equation 2(x − 2) + 3(y − 1) − 5(z + 3) = 0 defines the plane
whose normal vector is N = [2, 3, −5]T which passes through the point (2, 1, −3).
Note this can be rewritten as 2x + 3y − 5z = 4 + 3 + 15 = 22 after a simple manip-
ulation.
Example 4.4.3 The equation 2x + 3y − 5z = 11 corresponds to a plane with normal
vector N = [2, 3, −5]T which passes through some point (x0 , y0 , z 0 ). There are an
infinite number of choices for this base point: any triple which solves 2x0 + 3y0 −
5z 0 = 11 will do the job. An easy way to pick one is to pick two and solve for the
third. So for example, if z 0 = 0 and y0 = 4, we find 2x0 + 12 = 11 which gives
x0 = −1/2. Thus, this plane could be rewritten as 2(x + 1/2) + 3(y − 4) − 5z = 0.

4.4.1 The Vector Cross Product

There is another very useful way to define a plane which we did not discuss in the first
volume. As long as the vectors A and B point in different directions, they determine
a new vector A × B which is perpendicular to both of them and can serve as the
normal to a plane. Note, saying the vectors A and B point in different directions is
the same as saying they are linearly independent.
Definition 4.4.2 (Planes Again)
The plane in 3D determined by the vectors A and B containing the point (x0 , y0 , z 0 )
is defined as the plane whose normal vector is N = A × B.

To find the vector C = A × B requires a bit of calculation. Assume C = [C_1, C_2, C_3]^T, A = [A_1, A_2, A_3]^T and B = [B_1, B_2, B_3]^T. Then, we want <C, A> = 0 and <C, B> = 0. For convenience, we will assume all of the components of A and B are nonzero. This gives the equations

\[
A_1 C_1 + A_2 C_2 + A_3 C_3 = 0
\]
\[
B_1 C_1 + B_2 C_2 + B_3 C_3 = 0
\]

Solving both equations for C_1, we have

\[
C_1 = -\frac{A_2}{A_1} C_2 - \frac{A_3}{A_1} C_3 = -\frac{B_2}{B_1} C_2 - \frac{B_3}{B_1} C_3.
\]

Solving for C_2, we have

\[
C_2 = \frac{\dfrac{A_3}{A_1} - \dfrac{B_3}{B_1}}{\dfrac{B_2}{B_1} - \dfrac{A_2}{A_1}}\, C_3
= \frac{\dfrac{A_3}{A_1} - \dfrac{B_3}{B_1}}{\dfrac{B_2}{B_1} - \dfrac{A_2}{A_1}} \cdot \frac{A_1 B_1}{A_1 B_1}\, C_3
= \frac{B_1 A_3 - A_1 B_3}{A_1 B_2 - B_1 A_2}\, C_3.
\]

We know A_1 C_1 = −A_2 C_2 − A_3 C_3 and so substituting for C_2 we obtain

\[
\begin{aligned}
C_1 &= -\left( \frac{A_2}{A_1} \cdot \frac{B_1 A_3 - A_1 B_3}{A_1 B_2 - B_1 A_2} + \frac{A_3}{A_1} \right) C_3 \\
&= -\left( \frac{A_2 (B_1 A_3 - A_1 B_3) + A_3 (A_1 B_2 - B_1 A_2)}{A_1 (A_1 B_2 - B_1 A_2)} \right) C_3 \\
&= -\frac{A_2 B_1 A_3 - A_1 A_2 B_3 + A_3 A_1 B_2 - A_3 B_1 A_2}{A_1 (A_1 B_2 - B_1 A_2)}\, C_3 \\
&= -\frac{-A_1 A_2 B_3 + A_3 A_1 B_2}{A_1 (A_1 B_2 - B_1 A_2)}\, C_3 \\
&= \frac{A_2 B_3 - A_3 B_2}{A_1 B_2 - B_1 A_2}\, C_3.
\end{aligned}
\]

We have found the components of C can be written in terms of the parameter C_3:

\[
C_1 = \frac{A_2 B_3 - A_3 B_2}{A_1 B_2 - B_1 A_2}\, C_3, \qquad
C_2 = \frac{B_1 A_3 - A_1 B_3}{A_1 B_2 - B_1 A_2}\, C_3.
\]

Choosing C_3 = A_1 B_2 − B_1 A_2, we find that the vector we are looking for has the components

\[
C_1 = A_2 B_3 - A_3 B_2, \qquad C_2 = B_1 A_3 - A_1 B_3, \qquad C_3 = A_1 B_2 - B_1 A_2.
\]

We can reorganize these components by recognizing they are the determinants of certain 2 × 2 matrices; i.e.

\[
C_1 = \det\begin{bmatrix} A_2 & A_3 \\ B_2 & B_3 \end{bmatrix}, \qquad
C_2 = -\det\begin{bmatrix} A_1 & A_3 \\ B_1 & B_3 \end{bmatrix}, \qquad
C_3 = \det\begin{bmatrix} A_1 & A_2 \\ B_1 & B_2 \end{bmatrix}.
\]

Then using the standard basis vectors for ℝ³, i, j and k, we see the vector C = A × B can be written as

\[
A \times B = i \det\begin{bmatrix} A_2 & A_3 \\ B_2 & B_3 \end{bmatrix}
- j \det\begin{bmatrix} A_1 & A_3 \\ B_1 & B_3 \end{bmatrix}
+ k \det\begin{bmatrix} A_1 & A_2 \\ B_1 & B_2 \end{bmatrix}
\]

This leads us to the following definition.

Definition 4.4.3 (The Vector Cross Product)


The cross product of the two nonzero vectors A and B is defined to be the vector A × B where

\[
A \times B = i \det\begin{bmatrix} A_2 & A_3 \\ B_2 & B_3 \end{bmatrix}
- j \det\begin{bmatrix} A_1 & A_3 \\ B_1 & B_3 \end{bmatrix}
+ k \det\begin{bmatrix} A_1 & A_2 \\ B_1 & B_2 \end{bmatrix}
\]

This calculation is performed so often we define a special use of the determinant to help us remember. We define the matrix

\[
C = \begin{bmatrix} i & j & k \\ A_1 & A_2 & A_3 \\ B_1 & B_2 & B_3 \end{bmatrix}
\]

and we define the determinant of this matrix to coincide with the definition of A × B.

Comment 4.4.2 This is easy to remember. Start with the i in row one. Cross out the first row and first column of C. The first term in the cross product is then i times the determinant of the 2 × 2 submatrix that is left over. This is the matrix

\[
\begin{bmatrix} A_2 & A_3 \\ B_2 & B_3 \end{bmatrix}
\]

The first term is then

\[
i \det\begin{bmatrix} A_2 & A_3 \\ B_2 & B_3 \end{bmatrix}
\]

The second term in row one is j. Associate this term with a minus sign and, since it is in row one, column two, cross that row and column out of C to obtain the submatrix

\[
\begin{bmatrix} A_1 & A_3 \\ B_1 & B_3 \end{bmatrix}
\]

We now have the second term

\[
- j \det\begin{bmatrix} A_1 & A_3 \\ B_1 & B_3 \end{bmatrix}
\]

The last term is associated with the row one, column three entry k. Cross out that row and column in C to obtain the submatrix

\[
\begin{bmatrix} A_1 & A_2 \\ B_1 & B_2 \end{bmatrix}
\]

This gives the last term

\[
k \det\begin{bmatrix} A_1 & A_2 \\ B_1 & B_2 \end{bmatrix}
\]

The cross product is then the sum of these three terms. This is also called expanding a 3 × 3 determinant by the first row, but that is another story.
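MatLab also has built-in cross and dot functions, so you can check any of these hand computations and verify that the result is perpendicular to both inputs. A small sketch with two vectors we made up:

A = [1; 0; 2];
B = [0; 1; 3];
C = cross(A, B)           % the cross product A x B
dot(C, A)                 % both dot products should be 0,
dot(C, B)                 % confirming C is perpendicular to A and B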

4.4.1.1 Homework

For each of these problems, graph the two vectors as well as the cross product.

Exercise 4.4.1 Find i × j, i × k and j × k.

Exercise 4.4.2 Find A × B for A = [1, −2, 3]^T and B = [−1, 5, 3]^T.

Exercise 4.4.3 Find A × B for A = [−2, 4, −3]^T and B = [6, −1, 2]^T.

Exercise 4.4.4 Find A × B for A = [0, −3, 1]^T and B = [2, 0, 8]^T.

4.4.2 Back to Tangent Planes

Recall the tangent plane to a surface z = f (x, y) at the point (x0 , y0 ) was the plane
determined by the tangent lines T (x, y0 ) and T (x0 , y). The T (x, y0 ) line was deter-
mined by the vector

\[
A = \begin{bmatrix} 1 \\ 0 \\ \left. \frac{d}{dx} f(x, y_0) \right|_{x_0} \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \\ 2x_0 \end{bmatrix}
\]

and the T(x_0, y) line was determined by the vector

\[
B = \begin{bmatrix} 0 \\ 1 \\ \left. \frac{d}{dy} f(x_0, y) \right|_{y_0} \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \\ 2y_0 \end{bmatrix}
\]

We know now that we can write these vectors more generally as

\[
A = \begin{bmatrix} 1 \\ 0 \\ \frac{\partial f}{\partial x}(x_0, y_0) \end{bmatrix}
\quad\text{and}\quad
B = \begin{bmatrix} 0 \\ 1 \\ \frac{\partial f}{\partial y}(x_0, y_0) \end{bmatrix}
\]

The plane determined by these vectors has normal A × B which is

\[
A \times B = \det\begin{bmatrix} i & j & k \\ 1 & 0 & f_x(x_0, y_0) \\ 0 & 1 & f_y(x_0, y_0) \end{bmatrix}
\]

or

\[
A \times B = i \det\begin{bmatrix} 0 & f_x(x_0, y_0) \\ 1 & f_y(x_0, y_0) \end{bmatrix}
- j \det\begin{bmatrix} 1 & f_x(x_0, y_0) \\ 0 & f_y(x_0, y_0) \end{bmatrix}
+ k \det\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
= -f_x(x_0, y_0)\, i - f_y(x_0, y_0)\, j + k
= \begin{bmatrix} -f_x(x_0, y_0) \\ -f_y(x_0, y_0) \\ 1 \end{bmatrix}.
\]

The tangent plane to the surface z = f(x, y) at the point (x_0, y_0) is then given by

\[
-f_x(x_0, y_0)(x - x_0) - f_y(x_0, y_0)(y - y_0) + \big( z - f(x_0, y_0) \big) = 0.
\]

This then gives the traditional equation of the tangent plane:

\[
z = f(x_0, y_0) + f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0). \tag{4.1}
\]

We can use another compact definition at this point. We can define the gradient of
the function f to be the vector ∇ f . The gradient is defined as follows.

Definition 4.4.4 (The Gradient)


The gradient of the scalar function z = f (x, y) is defined to be the vector ∇ f where

 
\[
\nabla f(x_0, y_0) = \begin{bmatrix} f_x(x_0, y_0) \\ f_y(x_0, y_0) \end{bmatrix}.
\]

Note the gradient takes a scalar function argument and returns a vector answer.

Using the gradient, Eq. 4.1 can be rewritten as

\[
z = f(x_0, y_0) + \langle \nabla f(x_0, y_0),\, X - X_0 \rangle = f(x_0, y_0) + \nabla f(x_0, y_0)^T (X - X_0)
\]

where X − X_0 = [x − x_0, y − y_0]^T. The obvious question to ask now is how much of a discrepancy is there between the value f(x, y) and the value of the tangent plane?

Example 4.4.4 Find the gradient of f(x, y) = x^2 + 4xy + 9y^2 and the equation of the tangent plane to this surface at the point (1, 2).

Solution
\[
\nabla f(x, y) = \begin{bmatrix} 2x + 4y \\ 4x + 18y \end{bmatrix}.
\]

The equation of the tangent plane at (1, 2) is then

\[
z = f(1, 2) + f_x(1, 2)(x - 1) + f_y(1, 2)(y - 2) = 45 + 10(x - 1) + 40(y - 2) = -45 + 10x + 40y.
\]

Note this can also be written as 10x + 40y − z = 45 which is also a standard form. However, in this form, the attachment point (1, 2, 45) is hidden from view.
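It is easy to check a computation like this numerically. The sketch below encodes the function, its gradient and the tangent plane from Example 4.4.4 and confirms that the plane matches f at the attachment point and stays close nearby; the nearby test point is just our own choice:

f     = @(x,y) x.^2 + 4*x.*y + 9*y.^2;
gradf = @(x,y) [2*x + 4*y; 4*x + 18*y];
x0 = 1; y0 = 2;
T = @(x,y) f(x0,y0) + [x - x0, y - y0]*gradf(x0,y0);   % tangent plane value
[f(x0,y0), T(x0,y0)]              % both are 45 at the attachment point
[f(1.1,2.05), T(1.1,2.05)]        % close, but not equal, away from (1,2)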

4.4.3 Homework

Exercise 4.4.5 Find the gradient of f(x, y) = x^2 − xy^2 + y^2 and the equation of the tangent plane to this surface at the point (1, 1).

Exercise 4.4.6 Find the gradient of f(x, y) = −x^2 + y^2 + 4y and the equation of the tangent plane to this surface at the point (−1, 1).

Exercise 4.4.7 Find the gradient of f(x, y) = sin(xy) and the equation of the tangent plane to this surface at the point (π/4, −π/4).

4.4.4 Computational Results

We can use MatLab/Octave to draw tangent planes and tangent lines to a surface.
Consider the function DrawTangentPlanePackage. The source code is similar
to what we have done in previous functions. This time, we send in the function f and
the two partial derivatives fx and fy. First, we plot the traces and draw vertical lines
from the traces to the x−y plane. Note this code will not do very well on surfaces
where the z values become negative! But then, this code is just for exploration and
it is easy enough to alter it for other jobs. And it is a good exercise! After the traces
and their shadow lines are drawn, we draw the tangent lines. Finally, we draw the
tangent plane. The tangent plane calculation uses the partial derivatives we sent into
this function as arguments.

Listing 4.6: DrawTangentPlanePackage


f u n c t i o n DrawTangentPlanePackage ( f , fx , fy , d e l x , nx , d e l y , ny , r , x0 , y0 )
% f i s the function defining the surface
% delx i s the s i z e of the x s t e p
% nx i s t he number o f s t e p s l e f t and r i g h t from x0
5 % dely i s the s i z e of the y s t e p
% ny i s t he number o f s t e p s l e f t and r i g h t from y 0
% r i s t h e s i z e o f t h e drawn t a n g e n t p l a n e
% ( x0 , y0 ) i s t h e l o c a t i o n o f t h e column r e c t a n g l e b a s e
%
10 % s e t up x and y s t u f f
x = x0−nx∗ d e l x : d e l x / 5 : x0+nx∗ d e l x ;
y = y0−ny∗ d e l y : d e l y / 5 : y0+ny∗ d e l y ;
[ rows , s x ] = s i z e ( x ) ;
[ rows , s y ] = s i z e ( y ) ;
15 h o l d on
% s e t up x t r a c e f o r x = x0
% s e t up y t r a c e f o r y = y0
x t r a c e = f ( x0 , y ) ;
y t r a c e = f ( x , y0 ) ;
20 f i x e d x = x0 ∗ o n e s ( 1 , s x ) ;
f i x e d y = y0 ∗ o n e s ( 1 , s y ) ;
p l o t 3 ( f i x e d x , y , x t r a c e , ’LineWidth’ , 4 , ’Color’ , ’red’ ) ;
p l o t 3 ( x , f i x e d y , y t r a c e , ’LineWidth’ , 4 , ’Color’ , ’red’ ) ;
% now draw x0 , y0 l i n e i n xy p l a n e
25 U = [ x0 ; x0 ] ;
V = [ y0−ny∗ d e l y ; y0+ny∗ d e l x ] ;
W = [0;0];
p l o t 3 (U, V,W) ;
U = [ y0−ny∗ d e l y ; y0+ny∗ d e l x ] ;
30 V = [ y0 ; y0 ] ;
W = [0;0];
p l o t 3 (U, V,W) ;
% now f i l l i n p l a n e s formed by x0 , y0 l i n e s
f o r i =1: s y
35 U = [ x0 ; x0 ] ;
V = [y( i ) ;y( i ) ];
W = [ 0 ; f ( x0 , y ( i ) ) ] ;
p l o t 3 (U, V,W, ’LineWidth’ , 1 , ’Color’ , ’red’ ) ;
end
40 f o r i =1: s x
U = [x( i ) ;x( i ) ];
V = [ y0 ; y0 ] ;
W = [ 0 ; f ( x ( i ) , y0 ) ] ;
p l o t 3 (U, V,W, ’LineWidth’ , 1 , ’Color’ , ’red’ ) ;
45 end
% now draw t a n g e n t l i n e s
% s e t up new l o c a l v a r i a b l e s c e n t e r e d a t ( x0 , y0 )
TX = @( x ) ( f ( x0 , y0 ) + f x ( x0 , y0 ) ∗ ( x−x0 ) ) ;
TY = @( y ) ( f ( x0 , y0 ) + f y ( x0 , y0 ) ∗ ( y−y0 ) ) ;
50 U = [ x0−nx∗ d e l x ; x0+nx∗ d e l x ] ;
V = [ y0 ; y0 ] ;
W = [TX( x0−nx∗ d e l x ) ;TX( x0+nx∗ d e l x ) ] ;
p l o t 3 (U, V,W, ’LineWidth’ , 3 , ’Color’ , ’blue’ ) ;
U = [ x0 ; x0 ] ;
55 V = [ y0−ny∗ d e l y ; y0+ny∗ d e l y ] ;
W = [TY( y0−ny∗ d e l y ) ;TY( y0+ny∗ d e l y ) ] ;
p l o t 3 (U, V,W, ’LineWidth’ , 3 , ’Color’ , ’blue’ ) ;
% p l o t tangent plane
u = [ x0−r ∗ d e l x ; x0+r ∗ d e l x ] ;
60 v = [ y0−r ∗ d e l y ; y0+r ∗ d e l y ] ;
% s e t up a new l o c a l mesh g r i d n e a r ( x0 , y0 )
[U, V] = meshgrid ( u , v ) ;
% s e t up t h e t a n g e n t p l a n e a t ( x0 , y0 )
w = @( u , v ) f ( x0 , y0 ) + f x ( x0 , y0 ) ∗ ( u−x0 ) + f y ( x0 , y0 ) ∗ ( v−y0 ) ;
65 W = w(U, V) ;
% p l o t the tangent plane
s u r f (U, V,W, ’EdgeColor’ , ’blue’ ) ;
hold o f f
end

The illustrations this code produces have already been used in Fig. 4.2. Practice with
this code and draw other pictures! A typical session to generate this figure would
look like

Listing 4.7: Drawing Tangent Planes


1 f = @( x , y ) x . ˆ 2 + y . ˆ 2 ;
f x = @( x , y ) 2∗ x ;
f y = @( x , y ) 2∗ y ;
DrawTangentPlanePackage ( f , fx , fy , 0 . 5 , 2 , 0 . 5 , 2 , . 3 , 0 . 5 , 0 . 5 ) ;

4.4.5 Homework

Exercise 4.4.8 Draw tangent lines and planes for the surface f(x, y) = x^2 + 3y^2 for various points (x_0, y_0).

Exercise 4.4.9 Draw tangent lines and planes for the surface f(x, y) = −x^2 − 3y^2 for various points (x_0, y_0). You will need to modify the code to make this work!

Exercise 4.4.10 Draw tangent lines and planes for the surface f(x, y) = x^2 − 3y^2 for various points (x_0, y_0). You will need to modify the code to make this work! Make sure you try the point (0, 0).

4.5 Derivatives in Two Dimensions!

Let's look at the partial derivatives of f(x, y). As long as f(x, y) is defined locally at (x_0, y_0), we can say f_x(x_0, y_0) and f_y(x_0, y_0) exist if and only if there are error functions E_1(x, x_0, y_0) and E_2(y, x_0, y_0) so that

\[
f(x, y_0) = f(x_0, y_0) + f_x(x_0, y_0)(x - x_0) + E_1(x, x_0, y_0)
\]
\[
f(x_0, y) = f(x_0, y_0) + f_y(x_0, y_0)(y - y_0) + E_2(y, x_0, y_0)
\]

with E_1 → 0 and E_1/(x − x_0) → 0 as x → x_0, and E_2 → 0 and E_2/(y − y_0) → 0 as y → y_0. Using the ideas we have presented here, we can come up with a way to
define the differentiability of a function of two variables.

Definition 4.5.1 (Error Form of Differentiability For Two Variables)


If f(x, y) is defined locally at (x_0, y_0), then f is differentiable at (x_0, y_0) if there are two numbers L_1 and L_2 so that the error function E(x, y, x_0, y_0) = f(x, y) − f(x_0, y_0) − L_1(x − x_0) − L_2(y − y_0) satisfies lim_{(x,y)→(x_0,y_0)} E(x, y, x_0, y_0) = 0 and lim_{(x,y)→(x_0,y_0)} E(x, y, x_0, y_0)/||(x − x_0, y − y_0)|| = 0. Here, the term

\[
\| (x - x_0, y - y_0) \| = \sqrt{(x - x_0)^2 + (y - y_0)^2}.
\]

Note if f is differentiable at (x0 , y0 ), f must be continuous at (x0 , y0 ). The argument


is simple:

f(x, y) = f(x_0, y_0) + L_1 (x − x_0) + L_2 (y − y_0) + E(x, y, x_0, y_0)

and as (x, y) → (x0 , y0 ), we have f (x, y) → f (x0 , y0 ) which is the definition of f


being continuous at (x0 , y0 ). Hence, we can say

Theorem 4.5.1 (Differentiable Implies Continuous: Two Variables)


If f is differentiable at (x0 , y0 ) then f is continuous at (x0 , y0 ).

Proof We have sketched the argument already. 

From Definition 4.5.1, we can show if f is differentiable at the point (x_0, y_0), then L_1 = f_x(x_0, y_0) and L_2 = f_y(x_0, y_0). The argument goes like this: since f is differentiable at (x_0, y_0), we can say

\[
\lim_{(x,y) \to (x_0,y_0)} \frac{f(x, y) - f(x_0, y_0) - L_1(x - x_0) - L_2(y - y_0)}{\sqrt{(x - x_0)^2 + (y - y_0)^2}} = 0.
\]

We can rewrite this using Δx = x − x_0 and Δy = y − y_0 as

\[
\lim_{(\Delta x, \Delta y) \to (0,0)} \frac{f(x_0 + \Delta x, y_0 + \Delta y) - f(x_0, y_0) - L_1 \Delta x - L_2 \Delta y}{\sqrt{(\Delta x)^2 + (\Delta y)^2}} = 0.
\]

In particular, for Δy = 0, we find

\[
\lim_{\Delta x \to 0} \frac{f(x_0 + \Delta x, y_0) - f(x_0, y_0) - L_1 \Delta x}{\sqrt{(\Delta x)^2}} = 0.
\]

For Δx > 0, we find √((Δx)^2) = Δx and so

\[
\lim_{\Delta x \to 0^+} \frac{f(x_0 + \Delta x, y_0) - f(x_0, y_0)}{\Delta x} = L_1.
\]

Thus, the right hand partial derivative f_x(x_0, y_0)^+ exists and equals L_1. On the other hand, if Δx < 0, then √((Δx)^2) = −Δx and we find, with a little manipulation, that we still have

\[
\lim_{\Delta x \to 0^-} \frac{f(x_0 + \Delta x, y_0) - f(x_0, y_0)}{\Delta x} = L_1.
\]

So the left hand partial derivative f x (x0 , y0 )− exists and equals L 1 also. Combining,
we see f x (x0 , y0 ) = L 1 . A similar argument shows that f y (x0 , y0 ) = L 2 . Hence, we
can say if f is differentiable at (x0 , y0 ) then f x and f y exist at this point and we have

f (x, y) = f (x0 , y0 ) + f x (x0 , y0 )(x − x0 ) + f y (x0 , y0 )(y − y0 ) + E f (x, y, x0 , y0 )

where E f (x, y, x0 , y0 ) → 0 and E f (x, y, x0 , y0 )/||(x − x0 , y − y0 )|| → 0 as (x, y)


→ (x0 , y0 ). Note this argument is a pointwise argument. It only tells us that differ-
entiability at a point implies the existence of the partial derivatives at that point.

4.6 The Chain Rule

Now that we know a bit about two dimensional derivatives, let’s go for gold and
figure out the new version of the chain rule. The argument we make here is very
similar in spirit to the one dimensional one. You should go back and check it out! We
will do this argument carefully but without tedious rigor. At least that is our hope.
You’ll have to let us know how we did!
We assume there are two functions u(x, y) and v(x, y) defined locally about (x_0, y_0) and that there is a third function f(u, v) which is defined locally around (u_0 = u(x_0, y_0), v_0 = v(x_0, y_0)). Now assume f(u, v) is differentiable at (u_0, v_0) and u(x, y) and v(x, y) are differentiable at (x_0, y_0). Then we can say

\[
u(x, y) = u(x_0, y_0) + u_x(x_0, y_0)(x - x_0) + u_y(x_0, y_0)(y - y_0) + E_u(x, y, x_0, y_0)
\]
\[
v(x, y) = v(x_0, y_0) + v_x(x_0, y_0)(x - x_0) + v_y(x_0, y_0)(y - y_0) + E_v(x, y, x_0, y_0)
\]
\[
f(u, v) = f(u_0, v_0) + f_u(u_0, v_0)(u - u_0) + f_v(u_0, v_0)(v - v_0) + E_f(u, v, u_0, v_0)
\]

where all the error terms behave as usual as (x, y) → (x_0, y_0) and (u, v) → (u_0, v_0). Note that as (x, y) → (x_0, y_0), u(x, y) → u_0 = u(x_0, y_0) and v(x, y) → v_0 = v(x_0, y_0) as u and v are continuous at (x_0, y_0) since they are differentiable there. Let's consider the partial of f with respect to x. Let Δu = u(x_0 + Δx, y_0) − u(x_0, y_0) and Δv = v(x_0 + Δx, y_0) − v(x_0, y_0). Thus, u_0 + Δu = u(x_0 + Δx, y_0) and v_0 + Δv = v(x_0 + Δx, y_0). Hence

\[
\begin{aligned}
\frac{f(u_0 + \Delta u, v_0 + \Delta v) - f(u_0, v_0)}{\Delta x}
&= \frac{f_u(u_0, v_0)\,\Delta u + f_v(u_0, v_0)\,\Delta v + E_f(u, v, u_0, v_0)}{\Delta x} \\
&= f_u(u_0, v_0)\,\frac{\Delta u}{\Delta x} + f_v(u_0, v_0)\,\frac{\Delta v}{\Delta x} + \frac{E_f(u, v, u_0, v_0)}{\Delta x} \\
&= f_u(u_0, v_0)\,\frac{u_x(x_0, y_0)\,\Delta x + E_u(x, x_0, y_0)}{\Delta x}
 + f_v(u_0, v_0)\,\frac{v_x(x_0, y_0)\,\Delta x + E_v(x, x_0, y_0)}{\Delta x}
 + \frac{E_f(u, v, u_0, v_0)}{\Delta x} \\
&= f_u(u_0, v_0)\, u_x(x_0, y_0) + f_v(u_0, v_0)\, v_x(x_0, y_0)
 + f_u(u_0, v_0)\,\frac{E_u(x, x_0, y_0)}{\Delta x} + f_v(u_0, v_0)\,\frac{E_v(x, x_0, y_0)}{\Delta x} + \frac{E_f(u, v, u_0, v_0)}{\Delta x}.
\end{aligned}
\]

As (x, y) → (x_0, y_0), (u, v) → (u_0, v_0) and so E_f(u, v, u_0, v_0)/Δx → 0. The other two error terms go to zero also as (x, y) → (x_0, y_0). Hence, we conclude

\[
\frac{\partial f}{\partial x} = \frac{\partial f}{\partial u}\frac{\partial u}{\partial x} + \frac{\partial f}{\partial v}\frac{\partial v}{\partial x}.
\]

A similar argument shows

\[
\frac{\partial f}{\partial y} = \frac{\partial f}{\partial u}\frac{\partial u}{\partial y} + \frac{\partial f}{\partial v}\frac{\partial v}{\partial y}.
\]

This result is known as the Chain Rule.

Theorem 4.6.1 (The Chain Rule)


Assume there are two functions u(x, y) and v(x, y) defined locally about (x0 , y0 )
and that there is a third function f (u, v) which is defined locally around (u 0 =
u(x0 , y0 ), v0 = v(x0 , y0 )). Further assume f (u, v) is differentiable at (u 0 , v0 ) and
u(x, y) and v(x, y) are differentiable at (x0 , y0 ). Then f x and f y exist at (x0 , y0 )
and are given by

\[
\frac{\partial f}{\partial x} = \frac{\partial f}{\partial u}\frac{\partial u}{\partial x} + \frac{\partial f}{\partial v}\frac{\partial v}{\partial x},
\qquad
\frac{\partial f}{\partial y} = \frac{\partial f}{\partial u}\frac{\partial u}{\partial y} + \frac{\partial f}{\partial v}\frac{\partial v}{\partial y}.
\]

Proof We have sketched the argument already. 



4.6.1 Examples

Example 4.6.1 Let f(x, y) = x^2 + 2x + 5y^4. Then if x = r cos(θ) and y = r sin(θ), using the chain rule, we find

\[
\frac{\partial f}{\partial r} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial r} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial r},
\qquad
\frac{\partial f}{\partial \theta} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial \theta} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial \theta}
\]

This becomes

\[
\frac{\partial f}{\partial r} = \big( 2x + 2 \big)\cos(\theta) + 20 y^3 \sin(\theta),
\qquad
\frac{\partial f}{\partial \theta} = \big( 2x + 2 \big)\big( -r \sin(\theta) \big) + 20 y^3 \big( r \cos(\theta) \big)
\]

You can then substitute in for x and y to get the final answer in terms of r and θ (kind of ugly though!).

Example 4.6.2 Let f(x, y) = 10x^2 y^4. Then if u = x^2 + 2y^2 and v = 4x^2 − 5y^2, using the chain rule, we find f(u, v) = 10u^2 v^4 and so

\[
\frac{\partial f}{\partial x} = \frac{\partial f}{\partial u}\frac{\partial u}{\partial x} + \frac{\partial f}{\partial v}\frac{\partial v}{\partial x},
\qquad
\frac{\partial f}{\partial y} = \frac{\partial f}{\partial u}\frac{\partial u}{\partial y} + \frac{\partial f}{\partial v}\frac{\partial v}{\partial y}
\]

This becomes

\[
\frac{\partial f}{\partial x} = \big( 20 u v^4 \big)\, 2x + \big( 40 u^2 v^3 \big)\, 8x,
\qquad
\frac{\partial f}{\partial y} = \big( 20 u v^4 \big)\, 4y + \big( 40 u^2 v^3 \big)(-10y)
\]

You can then substitute in for u and v to get the final answer in terms of x and y (even more ugly though!).
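A quick numerical check of a chain rule computation like this is to build f(u(x, y), v(x, y)) as a single function of x and y and difference it directly. This sketch uses Example 4.6.2; the composite handle and the test point are our own choices:

u  = @(x,y) x.^2 + 2*y.^2;
v  = @(x,y) 4*x.^2 - 5*y.^2;
F  = @(x,y) 10*u(x,y).^2.*v(x,y).^4;        % f(u(x,y), v(x,y))
% chain rule formula for dF/dx from the example
Fx = @(x,y) 20*u(x,y).*v(x,y).^4.*(2*x) + 40*u(x,y).^2.*v(x,y).^3.*(8*x);
x0 = 0.9; y0 = 0.4; h = 1e-6;
[(F(x0+h,y0) - F(x0,y0))/h, Fx(x0,y0)]      % the two numbers should agree closely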

Example 4.6.3 In the discussion of Hamilton’s Rule from the first course on calculus
for biologists, we discuss a fitness function w for a model of altruism which depends
on P which is the probability of giving aid and Q which is the probability of receiving
aid. The model is

w = w0 + b Q − c P

where w_0 is a baseline fitness amount. Note, the chain rule gives us

\[
\frac{\partial w}{\partial P} = \frac{\partial w}{\partial P}\frac{\partial P}{\partial P} + \frac{\partial w}{\partial Q}\frac{\partial Q}{\partial P} = -c + b\,\frac{\partial Q}{\partial P}
\]

Let ∂Q/∂P be denoted by r, the coefficient of relatedness. The parameter r is very hard to understand even though it was introduced in 1964 to study altruism. Altruism occurs if fitness increases or ∂w/∂P > 0. So altruism occurs if −c + br > 0 or rb > c. This inequality is Hamilton's Rule, but what counts is we understand what these terms mean biologically.

4.6.1.1 Homework

Exercise 4.6.1 Let f(x, y) = x^2 + 5x^2 y^3 and let u(x, y) = 2xy and v(x, y) = x^2 − y^2. Find ∂_x f(u, v) and ∂_y f(u, v).

Exercise 4.6.2 Let f(x, y) = 2x^2 − 5x^2 y^5 and let u(s, t) = st^2 and v(s, t) = s^2 + t^4. Find f_s and f_t.

Exercise 4.6.3 Let f(x, y) = 5x^2 y^3 + 10 and let u(s, t) = sin(st) and v(s, t) = cos(st). Find f_s and f_t.

4.7 Tangent Plane Approximation Error

We are now ready to give you a whirlwind tour of what you can call second order
ideas in calculus for two variables. Or as some would say, let’s drink from the fountain
of knowledge with a fire hose! Well, maybe not that intense....
We will use these ideas for some practical things. Recall from the first class on calculus for biologists, we used these ideas to find the minimum and maximum of functions of two variables and we applied those ideas to the problem of finding the best straight line that fits a collection of data points. This regression line is of great importance to you in your career as biologists! We also introduced the ideas of average or mean, covariance and variance when we worked out how to find the regression line. The slope of the regression line has many important applications and we showed you some of them in our Hamilton's rule model.
Once we have the chain rule, we can quickly develop other results such as how much error we make when we approximate our surface f(x, y) using a tangent plane at a point (x_0, y_0). To finish our arguments, we need an analog of the Mean Value Theorem from calculus. The first thing we need is to know when a function of two variables is differentiable. Just because its partials exist at a point is not enough to guarantee that! But we can prove that if the partials are continuous around that point,

then the derivative does exist. And that means we can write the function in terms of
its tangent plane plus an error. The arguments to do this are not terribly hard, so let’s
go through them. We will need a version of the Mean Value Theorem for functions
of two variables. Here it is:

Theorem 4.7.1 (Mean Value Theorem)


Assume the partials of f(x, y) exist at (x_0, y_0) and that f is defined locally around (x_0, y_0). Then given any (x, y) where f is locally defined, there is a point (x_c, y_c) on the line between (x_0, y_0) and (x, y), with x_c = x_0 + c(x − x_0) and y_c = y_0 + c(y − y_0) for some c between 0 and 1, so that

\[
f(x, y) - f(x_0, y_0) = f_x(x_c, y_c)(x - x_0) + f_y(x_c, y_c)(y - y_0).
\]

Proof The argument that shows this is pretty straightforward. We apply the chain rule using the simpler functions u(t) = x_0 + t(x − x_0) and v(t) = y_0 + t(y − y_0). Then u and v are differentiable with u′(t) = x − x_0 and v′(t) = y − y_0. Hence, if h(t) = f(u(t), v(t)), we have

\[
h'(t) = f_x(u(t), v(t))\,(x - x_0) + f_y(u(t), v(t))\,(y - y_0)
\]

and from the usual calculus mean value theorem,

\[
h(1) - h(0) = h'(c)
\]

where c is between 0 and 1. Using the definition of h and h′, we have

\[
f(x, y) - f(x_0, y_0) = f_x(x_c, y_c)(x - x_0) + f_y(x_c, y_c)(y - y_0).
\]

Now it is not true that just because a function f has partial derivatives at a point
(x0 , y0 ) that f is differentiable. There are many examples where partials can exist
at a point and the function itself does not satisfy the definition of differentiability.
However, if we know the partials are themselves continuous locally at the point
(x0 , y0 ) then it is true that f is differentiable there. Once we know f is differentiable
there we can apply chain rule type ideas. Let’s assume f is defined locally around
(x0 , y0 ) and consider the difference

\[
f(x_0 + t\Delta x, y_0 + t\Delta y) - f(x_0, y_0)
= \big[ f(x_0 + t\Delta x, y_0 + t\Delta y) - f(x_0 + t\Delta x, y_0) \big]
+ \big[ f(x_0 + t\Delta x, y_0) - f(x_0, y_0) \big].
\]

From the Mean Value Theorem, we know we can write

\[
f(x_0 + t\Delta x, y_0 + t\Delta y) - f(x_0 + t\Delta x, y_0) = f_y(x_0 + t\Delta x, y_0 + t_1 \Delta y)\,(t\Delta y)
\]
\[
f(x_0 + t\Delta x, y_0) - f(x_0, y_0) = f_x(x_0 + t_2 \Delta x, y_0)\,(t\Delta x)
\]

for some numbers t_1 and t_2 between 0 and t. Hence, at t = 1, we have

\[
f(x_0 + \Delta x, y_0 + \Delta y) - f(x_0, y_0) - f_x(x_0, y_0)\Delta x - f_y(x_0, y_0)\Delta y
= \big( f_x(x_0 + t_2 \Delta x, y_0) - f_x(x_0, y_0) \big)\Delta x
+ \big( f_y(x_0 + \Delta x, y_0 + t_1 \Delta y) - f_y(x_0, y_0) \big)\Delta y
\]

where t_1 and t_2 are between 0 and 1. Let

\[
E(\Delta x, \Delta y, x_0, y_0) = f(x_0 + \Delta x, y_0 + \Delta y) - f(x_0, y_0) - f_x(x_0, y_0)\Delta x - f_y(x_0, y_0)\Delta y
\]

like usual. Then we have

\[
E(\Delta x, \Delta y, x_0, y_0) = \big( f_x(x_0 + t_2 \Delta x, y_0) - f_x(x_0, y_0) \big)\Delta x
+ \big( f_y(x_0 + \Delta x, y_0 + t_1 \Delta y) - f_y(x_0, y_0) \big)\Delta y
\]

We know as (Δx, Δy) → (0, 0), the numbers (t_1, t_2) we found using the Mean Value Theorem stay between 0 and 1, and so (x_0 + t_2 Δx, y_0) → (x_0, y_0) and (x_0 + Δx, y_0 + t_1 Δy) → (x_0, y_0). Then the continuity of f_x and f_y at (x_0, y_0) tells us

\[
\big( f_x(x_0 + t_2 \Delta x, y_0) - f_x(x_0, y_0) \big)\Delta x
+ \big( f_y(x_0 + \Delta x, y_0 + t_1 \Delta y) - f_y(x_0, y_0) \big)\Delta y \to 0.
\]

We conclude E(Δx, Δy, x_0, y_0) → 0 as (Δx, Δy) → (0, 0). Further,


 
\[
\frac{E(\Delta x, \Delta y, x_0, y_0)}{\sqrt{(\Delta x)^2 + (\Delta y)^2}}
= \big( f_x(x_0 + t_2 \Delta x, y_0) - f_x(x_0, y_0) \big)\,\frac{\Delta x}{\sqrt{(\Delta x)^2 + (\Delta y)^2}}
+ \big( f_y(x_0 + \Delta x, y_0 + t_1 \Delta y) - f_y(x_0, y_0) \big)\,\frac{\Delta y}{\sqrt{(\Delta x)^2 + (\Delta y)^2}}
\]

But the terms |Δx|/√((Δx)^2 + (Δy)^2) ≤ 1 and |Δy|/√((Δx)^2 + (Δy)^2) ≤ 1 and so as (Δx, Δy) → (0, 0), we must have E(Δx, Δy, x_0, y_0)/√((Δx)^2 + (Δy)^2) → 0 as well.
These two limits show that f is differentiable at (x0 , y0 ). We can state this as a
theorem. We use this idea a lot in two dimensional calculus.

Theorem 4.7.2 (Continuous Partials Imply Differentiability)


Assume the partials of f (x, y) exist at (x0 , y0 ) and that f is defined locally around
(x0 , y0 ). Further, assume the partials are continuous locally at (x0 , y0 ). Then f is
differentiable at (x0 , y0 ).

Now let’s go back to the old idea of a tangent plane to a surface. For the surface
z = f (x, y) if its partials are continuous functions (they usually are for our work!)
then f is differentiable and hence we know that

f (x, y) = f (x0 , y0 ) + f x (x0 , y0 )(x − x0 ) + f y (x0 , y0 )(y − y0 ) + E(x, y, x0 , y0 )

and E(x, y, x_0, y_0) → 0 and E(x, y, x_0, y_0)/√((x − x_0)^2 + (y − y_0)^2) → 0 as (x, y) → (x_0, y_0).

4.8 Second Order Error Estimates

We can characterize the error much better if we have access to what are called the
second order partial derivatives of f . Roughly speaking, we take the partials of f x
and f y to obtain the second order terms. We can make this discussion brief. Assuming
f is defined locally as usual near (x0 , y0 ), we can ask about the partial derivatives
of the functions f x and f y with respect to x and y also. We define the second order
partials of f as follows.

Definition 4.8.1 (Second Order Partials)


If f(x, y), f_x and f_y are defined locally at (x_0, y_0), we can attempt to find the following limits:

\[
\lim_{x \to x_0,\, y = y_0} \frac{f_x(x, y) - f_x(x_0, y_0)}{x - x_0} = \partial_x(f_x),
\qquad
\lim_{x = x_0,\, y \to y_0} \frac{f_x(x, y) - f_x(x_0, y_0)}{y - y_0} = \partial_y(f_x),
\]
\[
\lim_{x \to x_0,\, y = y_0} \frac{f_y(x, y) - f_y(x_0, y_0)}{x - x_0} = \partial_x(f_y),
\qquad
\lim_{x = x_0,\, y \to y_0} \frac{f_y(x, y) - f_y(x_0, y_0)}{y - y_0} = \partial_y(f_y).
\]

Comment 4.8.1 When these second order partials exist at (x_0, y_0), we use the following notations interchangeably: f_xx = ∂_x(f_x), f_xy = ∂_y(f_x), f_yx = ∂_x(f_y) and f_yy = ∂_y(f_y).

The second order partials are often organized into a matrix called the Hessian.

Definition 4.8.2 (The Hessian)


If f (x, y), f x and f y are defined locally at (x0 , y0 ), if the second order partials exist
at (x0 , y0 ), we define the Hessian, H(x0 , y0 ) at (x0 , y0 ) to be the matrix

 
\[
H(x_0, y_0) = \begin{bmatrix} f_{xx}(x_0, y_0) & f_{xy}(x_0, y_0) \\ f_{yx}(x_0, y_0) & f_{yy}(x_0, y_0) \end{bmatrix}
\]

Comment 4.8.2 It is possible to prove that if the second order partials are continuous locally near (x_0, y_0), then the mixed order partials f_xy and f_yx must match at the point (x_0, y_0). Most of our surfaces have this property. Hence, for these smooth surfaces, the Hessian is a symmetric matrix!

Example 4.8.1 Let f(x, y) = 2x − 8xy. Find the first and second order partials of f and its Hessian.

Solution The partials are

\[
f_x(x, y) = 2 - 8y, \quad f_y(x, y) = -8x, \quad
f_{xx}(x, y) = 0, \quad f_{xy}(x, y) = -8, \quad f_{yx}(x, y) = -8, \quad f_{yy}(x, y) = 0.
\]

and so the Hessian is

\[
H(x, y) = \begin{bmatrix} f_{xx}(x, y) & f_{xy}(x, y) \\ f_{yx}(x, y) & f_{yy}(x, y) \end{bmatrix}
= \begin{bmatrix} 0 & -8 \\ -8 & 0 \end{bmatrix}
\]
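You can also approximate second order partials with difference quotients of the first order partials, which gives a quick check of a Hessian computed by hand. Here is a sketch for Example 4.8.1; the test point and step size are simply our own choices:

fx = @(x,y) 2 - 8*y;            % first order partials of f(x,y) = 2x - 8xy
fy = @(x,y) -8*x;
h = 1e-5; x0 = 0.3; y0 = -1.2;
Hxx = (fx(x0+h,y0) - fx(x0,y0))/h;   % approximates f_xx = 0
Hxy = (fx(x0,y0+h) - fx(x0,y0))/h;   % approximates f_xy = -8
Hyx = (fy(x0+h,y0) - fy(x0,y0))/h;   % approximates f_yx = -8
Hyy = (fy(x0,y0+h) - fy(x0,y0))/h;   % approximates f_yy = 0
H = [Hxx Hxy; Hyx Hyy]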

4.8.1 Homework

Exercise 4.8.1 Let f(x, y) = 5x − 2xy. Find the first and second order partials of f and its Hessian.

Exercise 4.8.2 Let f(x, y) = −8y + 9xy − 2y^2. Find the first and second order partials of f and its Hessian.

Exercise 4.8.3 Let f(x, y) = 4x − 6xy − x^2. Find the first and second order partials of f and its Hessian.

Exercise 4.8.4 Let f(x, y) = 4x^2 − 6xy − x^2. Find the first and second order partials of f and its Hessian.

4.8.2 Hessian Approximations

We can now explain the most common approximation result for tangent planes. Let

\[
h(t) = f(x_0 + t\Delta x, y_0 + t\Delta y)
\]

as usual. Then we know we can write

\[
h(t) = h(0) + h'(0)\,t + h''(c)\,\frac{t^2}{2}.
\]

Using the chain rule, we find

\[
h'(t) = f_x(x_0 + t\Delta x, y_0 + t\Delta y)\,\Delta x + f_y(x_0 + t\Delta x, y_0 + t\Delta y)\,\Delta y
\]

and

\[
\begin{aligned}
h''(t) &= \partial_x\big( f_x(x_0 + t\Delta x, y_0 + t\Delta y)\,\Delta x + f_y(x_0 + t\Delta x, y_0 + t\Delta y)\,\Delta y \big)\,\Delta x \\
&\quad + \partial_y\big( f_x(x_0 + t\Delta x, y_0 + t\Delta y)\,\Delta x + f_y(x_0 + t\Delta x, y_0 + t\Delta y)\,\Delta y \big)\,\Delta y \\
&= f_{xx}(x_0 + t\Delta x, y_0 + t\Delta y)(\Delta x)^2 + f_{yx}(x_0 + t\Delta x, y_0 + t\Delta y)(\Delta y)(\Delta x) \\
&\quad + f_{xy}(x_0 + t\Delta x, y_0 + t\Delta y)(\Delta x)(\Delta y) + f_{yy}(x_0 + t\Delta x, y_0 + t\Delta y)(\Delta y)^2
\end{aligned}
\]

We can rewrite this in matrix–vector form as

\[
h''(t) = \begin{bmatrix} \Delta x & \Delta y \end{bmatrix}
\begin{bmatrix} f_{xx}(x_0 + t\Delta x, y_0 + t\Delta y) & f_{yx}(x_0 + t\Delta x, y_0 + t\Delta y) \\
f_{xy}(x_0 + t\Delta x, y_0 + t\Delta y) & f_{yy}(x_0 + t\Delta x, y_0 + t\Delta y) \end{bmatrix}
\begin{bmatrix} \Delta x \\ \Delta y \end{bmatrix}
\]

Of course, using the definition of H, this can be rewritten as

\[
h''(t) = \begin{bmatrix} \Delta x \\ \Delta y \end{bmatrix}^T H(x_0 + t\Delta x, y_0 + t\Delta y) \begin{bmatrix} \Delta x \\ \Delta y \end{bmatrix}
\]

Thus, our tangent plane approximation can be written as

\[
h(1) = h(0) + h'(0)(1 - 0) + \frac{1}{2} h''(c)
\]

for some c between 0 and 1. Substituting for the h terms, we find

\[
f(x_0 + \Delta x, y_0 + \Delta y) = f(x_0, y_0) + f_x(x_0, y_0)\Delta x + f_y(x_0, y_0)\Delta y
+ \frac{1}{2} \begin{bmatrix} \Delta x \\ \Delta y \end{bmatrix}^T H(x_0 + c\Delta x, y_0 + c\Delta y) \begin{bmatrix} \Delta x \\ \Delta y \end{bmatrix}
\]

Clearly, we have shown how to express the error in terms of second order partials. There is a point c between 0 and 1 so that

\[
E(x_0, y_0, \Delta x, \Delta y) = \frac{1}{2} \begin{bmatrix} \Delta x \\ \Delta y \end{bmatrix}^T H(x_0 + c\Delta x, y_0 + c\Delta y) \begin{bmatrix} \Delta x \\ \Delta y \end{bmatrix}
\]

Note the error is a quadratic expression in terms of Δx and Δy.
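Since the error term is quadratic in Δx and Δy, cutting the step in half should cut the tangent plane error by about a factor of four. The following sketch checks that for the circular paraboloid; the base point and the list of steps are our own choices:

f  = @(x,y) x.^2 + y.^2;
fx = @(x,y) 2*x;  fy = @(x,y) 2*y;
x0 = 0.5; y0 = 0.5;
T  = @(x,y) f(x0,y0) + fx(x0,y0)*(x-x0) + fy(x0,y0)*(y-y0);
for h = [0.1 0.05 0.025]
  E = f(x0+h, y0+h) - T(x0+h, y0+h);      % tangent plane error
  fprintf('h = %6.3f   error = %g\n', h, E);
end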

4.9 Extrema Ideas

To understand how to think about finding places where the minimum and maximum of a function of two variables might occur, all you have to do is realize it is a common sense thing. We already know that the tangent plane attached to the surface which represents our function of two variables is a way to approximate the function near the point of attachment. We have seen in our pictures what happens when the tangent plane is flat. This flatness occurs at the minimum and maximum of the function. It also occurs in other situations, but we will leave that more complicated event for other courses. The functions we want to deal with are quite nice and have well-behaved minima and maxima. However, we do want you to know there are more things in the world and we will touch on them only briefly.
To see what to do, just recall the equation of the tangent plane error to our function of two variables f(x, y):

\[
f(x, y) = f(x_0, y_0) + \nabla f(x_0, y_0)^T \begin{bmatrix} x - x_0 \\ y - y_0 \end{bmatrix}
+ \frac{1}{2} \begin{bmatrix} x - x_0 & y - y_0 \end{bmatrix} H\big(x_0 + c(x - x_0),\, y_0 + c(y - y_0)\big) \begin{bmatrix} x - x_0 \\ y - y_0 \end{bmatrix}
\]

where c is some number between 0 and 1 that is different for each (x, y). We also know that the equation of the tangent plane to f(x, y) at the point (x_0, y_0) is

\[
z = f(x_0, y_0) + \langle \nabla f(x_0, y_0),\, X - X_0 \rangle.
\]

Now let's assume the tangent plane is flat at (x_0, y_0). Then the gradient ∇f(x_0, y_0) is the zero vector and we have ∂f/∂x(x_0, y_0) = 0 and ∂f/∂y(x_0, y_0) = 0. So the tangent plane error equation simplifies to

\[
f(x, y) = f(x_0, y_0) + \frac{1}{2} \begin{bmatrix} x - x_0 & y - y_0 \end{bmatrix} H\big(x_0 + c(x - x_0),\, y_0 + c(y - y_0)\big) \begin{bmatrix} x - x_0 \\ y - y_0 \end{bmatrix}
\]

Now let's simplify this. The Hessian is just a 2 × 2 matrix whose components are the second order partials of f. Let

\[
A(c) = \frac{\partial^2 f}{\partial x^2}\big(x_0 + c(x - x_0),\, y_0 + c(y - y_0)\big), \qquad
B(c) = \frac{\partial^2 f}{\partial x \partial y}\big(x_0 + c(x - x_0),\, y_0 + c(y - y_0)\big)
= \frac{\partial^2 f}{\partial y \partial x}\big(x_0 + c(x - x_0),\, y_0 + c(y - y_0)\big),
\]
\[
D(c) = \frac{\partial^2 f}{\partial y^2}\big(x_0 + c(x - x_0),\, y_0 + c(y - y_0)\big).
\]

Then, we have

\[
f(x, y) = f(x_0, y_0) + \frac{1}{2} \begin{bmatrix} x - x_0 & y - y_0 \end{bmatrix}
\begin{bmatrix} A(c) & B(c) \\ B(c) & D(c) \end{bmatrix}
\begin{bmatrix} x - x_0 \\ y - y_0 \end{bmatrix}
\]

We can multiply this out (a nice simple pencil and paper exercise!) to find

\[
f(x, y) = f(x_0, y_0) + \frac{1}{2}\Big( A(c)(x - x_0)^2 + 2B(c)(x - x_0)(y - y_0) + D(c)(y - y_0)^2 \Big)
\]

Now it is time to remember an old technique from high school: completing the square. Remember if we had a quadratic like u^2 + 3uv + 6v^2, to complete the square we take half of the number in front of the mixed term uv, square it, and add and subtract it times v^2 as follows.

\[
u^2 + 3uv + 6v^2 = u^2 + 3uv + (3/2)^2 v^2 - (3/2)^2 v^2 + 6v^2.
\]

Now group the first three terms together and combine the last two terms into one term.

\[
u^2 + 3uv + 6v^2 = \big( u^2 + 3uv + (3/2)^2 v^2 \big) + \big( 6 - (3/2)^2 \big) v^2.
\]

The first three terms are a perfect square, (u + (3/2)v)^2. Simplifying, we find

\[
u^2 + 3uv + 6v^2 = \big( u + (3/2)v \big)^2 + (15/4)\, v^2.
\]

This is called completing the square! Now let's do this with the Hessian quadratic we have. First, factor out the A(c). We will assume it is not zero so the divisions are fine to do. Also, for convenience, we will replace x − x_0 by Δx and y − y_0 by Δy. This gives

\[
f(x, y) = f(x_0, y_0) + \frac{A(c)}{2}\left( (\Delta x)^2 + 2\frac{B(c)}{A(c)}\,\Delta x\,\Delta y + \frac{D(c)}{A(c)}(\Delta y)^2 \right).
\]

One half of the Δx Δy coefficient is B(c)/A(c), so add and subtract (B(c)/A(c))^2 (Δy)^2. We find

\[
f(x, y) = f(x_0, y_0) + \frac{A(c)}{2}\left( (\Delta x)^2 + 2\frac{B(c)}{A(c)}\,\Delta x\,\Delta y
+ \left( \frac{B(c)}{A(c)} \right)^2 (\Delta y)^2 - \left( \frac{B(c)}{A(c)} \right)^2 (\Delta y)^2 + \frac{D(c)}{A(c)}(\Delta y)^2 \right).
\]

Now group the first three terms together (the perfect square) and combine the last two terms into one. We have

\[
f(x, y) = f(x_0, y_0) + \frac{A(c)}{2}\left( \left( \Delta x + \frac{B(c)}{A(c)}\,\Delta y \right)^2
+ \left( \frac{A(c)\,D(c) - (B(c))^2}{(A(c))^2} \right)(\Delta y)^2 \right).
\]

Now we need this common sense result which says that if a function is continuous at a point (x_0, y_0) and positive or negative there, then it is positive or negative in a circle of radius r centered at (x_0, y_0). Here is the formal statement.

Theorem 4.9.1 (Nonzero Values and Continuity)

If f is continuous at (x_0, y_0) and f(x_0, y_0) is positive or negative in value, then there is a radius r so that f(x, y) is positive or negative, respectively, in a circle of radius r around the center (x_0, y_0).

Proof This can be argued carefully using limits.


• If f(x_0, y_0) > 0 and f is continuous at (x_0, y_0), suppose that no matter how close we were to (x_0, y_0), we could find a point (x_r, y_r) where f(x_r, y_r) = 0. Then that set of points would define a path to (x_0, y_0) and the limiting value of f on that path would be 0.
• But we know the value at (x_0, y_0) is positive and we know f is continuous there. Hence, the limiting values for all paths should match.
• So we can't find such points for all values of r. We see there will be a first r where we can't do this, and so inside the circle determined by that r, f will be nonzero.
• You might think we haven't ruled out the possibility that f could be negative at some points. But the only way a continuous f could switch between positive and negative is to pass through zero. And we have already ruled that out. So f is positive inside this circle of radius r. □

Now getting back to our problem. We have at this point where the partials are zero, the following expansion

\[
f(x, y) = f(x_0, y_0) + \frac{A(c)}{2}\left( (\Delta x)^2 + 2\frac{B(c)}{A(c)}\,\Delta x\,\Delta y + \left( \frac{B(c)}{A(c)} \right)^2 (\Delta y)^2 \right)
+ \frac{A(c)}{2}\left( \frac{A(c)\,D(c) - (B(c))^2}{(A(c))^2} \right)(\Delta y)^2.
\]

The algebraic sign of the terms after the function value f(x_0, y_0) is completely determined by the factors which are not squared. We have two simple cases:

• A(c) > 0 and A(c) D(c) − (B(c))^2 > 0, which implies the term after f(x_0, y_0) is positive.
• A(c) < 0 and A(c) D(c) − (B(c))^2 > 0, which implies the term after f(x_0, y_0) is negative.

Now let's assume all the second order partials are continuous at (x_0, y_0). We know A(c) = ∂²f/∂x²(x_0 + c(x − x_0), y_0 + c(y − y_0)) and, from Theorem 4.9.1, if ∂²f/∂x²(x_0, y_0) > 0, then so is A(c) in a circle around (x_0, y_0). The other term A(c) D(c) − (B(c))^2 will also be positive in a circle around (x_0, y_0) as long as

\[
\frac{\partial^2 f}{\partial x^2}(x_0, y_0)\,\frac{\partial^2 f}{\partial y^2}(x_0, y_0)
- \left( \frac{\partial^2 f}{\partial x \partial y}(x_0, y_0) \right)^2 > 0.
\]

We can say similar things about the negative case. Now to save typing let ∂²f/∂x²(x_0, y_0) = f_xx^0, ∂²f/∂y²(x_0, y_0) = f_yy^0 and ∂²f/∂x∂y(x_0, y_0) = f_xy^0. So we can restate our two cases as

• f_xx^0 > 0 and f_xx^0 f_yy^0 − (f_xy^0)^2 > 0, which implies the term after f(x_0, y_0) is positive. This implies that f(x, y) > f(x_0, y_0) in a circle of some radius r, which says f(x_0, y_0) is a minimum value of the function locally at that point.
• f_xx^0 < 0 and f_xx^0 f_yy^0 − (f_xy^0)^2 > 0, which implies the term after f(x_0, y_0) is negative. This implies that f(x, y) < f(x_0, y_0) in a circle of some radius r, which says f(x_0, y_0) is a maximum value of the function locally at that point.

where, for convenience, we use a superscript 0 to denote that we are evaluating the partials at (x_0, y_0). So we have come up with a great condition to verify whether a place where the partials are zero is a minimum or a maximum. If you think about it a bit, you'll notice we left out the case where f_xx^0 f_yy^0 − (f_xy^0)^2 < 0, which is important but which we will not treat in depth in this class. That is for later courses to pick up; however, it is the test for the analog of the behavior we see in the cubic y = x^3. There the derivative is 0 but there is neither a minimum nor a maximum at x = 0. In two dimensions, the situation is more interesting of course. This kind of behavior is called a saddle. We have another Theorem!

Theorem 4.9.2 (Extrema Test)


If the partials of f are zero at the point (x0 , y0 ), we can determine if that point is a
local minimum or local maximum of f using a second order test. We must assume
the second order partials are continuous at the point (x0 , y0 ).

• If f_xx^0 > 0 and f_xx^0 f_yy^0 − (f_xy^0)^2 > 0, then f(x_0, y_0) is a local minimum.
• If f_xx^0 < 0 and f_xx^0 f_yy^0 − (f_xy^0)^2 > 0, then f(x_0, y_0) is a local maximum.

We just don't know anything if the test gives f_xx^0 f_yy^0 − (f_xy^0)^2 = 0. If the test gives f_xx^0 f_yy^0 − (f_xy^0)^2 < 0, we have a saddle.

Proof We have sketched out the reasons for this above. 

Now the second order test fails if det(H(x₀, y₀)) = 0 at the critical point, as a few examples show. First, the function f(x, y) = x⁴ + y⁴ has a global minimum at (0, 0), but at that point

H(x, y) = [ 12x²  0 ; 0  12y² ]  ⟹  det(H(x, y)) = 144x²y²,

and hence det(H(0, 0)) = 0. Secondly, the function f(x, y) = −x⁴ − y⁴ has a global maximum at (0, 0), but at that point

H(x, y) = [ −12x²  0 ; 0  −12y² ]  ⟹  det(H(x, y)) = 144x²y²,

and hence det(H(0, 0)) = 0 as well. Finally, f(x, y) = x⁴ − y⁴ has a saddle at (0, 0), but at that point

H(x, y) = [ 12x²  0 ; 0  −12y² ]  ⟹  det(H(x, y)) = −144x²y²,

and hence det(H(0, 0)) = 0 again. So if det(H(x₀, y₀)) = 0, we just don't know what the behavior is.
Let's finish this section with a more careful discussion of the idea of a saddle. Recall at a critical point (x₀, y₀), we found that

f(x, y) = f(x₀, y₀) + (A(c)/2) ( Δx + (B(c)/A(c)) Δy )² + (A(c)/2) · ( (A(c) D(c) − (B(c))²) / (A(c))² ) (Δy)².

Now suppose we knew A(c) D(c) − (B(c))² < 0 when c = 0; this is the same as saying det(H(x₀, y₀)) < 0. Then, using the usual continuity argument, we know that there is a circle around the critical point (x₀, y₀) on which A(c) D(c) − (B(c))² < 0. But notice that on the line going through the critical point where Δy = 0, this gives

f(x, y) = f(x₀, y₀) + (A(c)/2) (Δx)²,

and on the line through the critical point where Δx + (B(c)/A(c)) Δy = 0, we have

f(x, y) = f(x₀, y₀) + (A(c)/2) · ( (A(c) D(c) − (B(c))²) / (A(c))² ) (Δy)².

Now, if A(c) > 0, the first case gives f (x, y) = f (x0 , y0 ) + a positive number
showing f has a minimum on that trace. However, the second case gives f (x, y) =
f (x0 , y0 )− a positive number which shows f has a maximum on that trace. The
fact that f is minimized in one direction and maximized in another direction gives
rise to the expression that we consider f to behave like a saddle at this critical point.
The analysis is virtually the same if A(c) < 0, except the first trace has the maximum

and the second trace has the minimum. Hence, the test for a saddle point is to see if
det(H(x0 , y0 )) < 0 as we stated in Theorem 4.9.2.

4.9.1 Examples

Example 4.9.1 Use our tests to show f (x, y) = x 2 + 3y 2 has a minimum at (0, 0).

Solution The partials here are f_x = 2x and f_y = 6y. These are zero at x = 0 and y = 0. The Hessian at this critical point is

H(x, y) = [ 2  0 ; 0  6 ] = H(0, 0),

as H is constant here. Our second order test says the point (0, 0) corresponds to a minimum because f_{xx}(0, 0) = 2 > 0 and f_{xx}(0, 0) f_{yy}(0, 0) − (f_{xy}(0, 0))² = 12 > 0.
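We can also let MatLab check this kind of second order test for us. The sketch below is optional and assumes the Symbolic Math Toolbox is available; gradient and hessian here are its symbolic commands.

% Optional check of Example 4.9.1, assuming the Symbolic Math Toolbox.
syms x y
f = x^2 + 3*y^2;
G = gradient(f, [x, y])                 % the partials [2*x; 6*y]; zero only at (0,0)
H = hessian(f, [x, y])                  % the Hessian [2 0; 0 6]
fxx0  = subs(H(1,1), [x, y], [0, 0])    % 2 > 0
disc0 = subs(det(H), [x, y], [0, 0])    % 12 > 0, so (0,0) is a local minimum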

Example 4.9.2 Use our tests to show f (x, y) = x 2 + 6x y + 3y 2 has a saddle at


(0, 0).

Solution The partials here are f x = 2x + 6y and f y = 6x + 6y. These are zero at
when

2x + 6y = 0
6x + 6y = 0

which has solution x = 0 and y = 0. The Hessian at this critical point is


 
H(x, y) = [ 2  6 ; 6  6 ] = H(0, 0),

as H is again constant here. Our second order test says the point (0, 0) corresponds to a saddle because f_{xx}(0, 0) = 2 > 0 and f_{xx}(0, 0) f_{yy}(0, 0) − (f_{xy}(0, 0))² = 12 − 36 < 0.

Example 4.9.3 Show our tests fail on f (x, y) = 2x 4 + 4y 6 even though we know
there is a minimum value at (0, 0).

Solution For f (x, y) = 2x 4 + 4y 6 , you find that the critical point is (0, 0) and all
the second order partials are 0 there. So all the tests fail. Of course, a little common
sense tells you (0, 0) is indeed the place where this function has a minimum value.
Just think about how its surface looks. But the tests just fail. This is much like the
curve f (x) = x 4 which has a minimum at x = 0 but all the tests fail on it also.

Fig. 4.5 The surface f(x, y) = 2x² + 4y³

Example 4.9.4 Show our tests fail on f (x, y) = 2x 2 + 4y 3 and the surface does not
have a minimum or maximum at the critical point (0, 0).

Solution For f(x, y) = 2x² + 4y³, the critical point is again (0, 0) and f_{xx}(0, 0) = 4, f_{yy}(0, 0) = 0 and f_{xy}(0, 0) = f_{yx}(0, 0) = 0. So f_{xx}(0, 0) f_{yy}(0, 0) − (f_{xy}(0, 0))² = 0 so the test fails. Note the x = 0 trace is 4y³, which is a cubic and so is negative below y = 0 and positive above y = 0. Not much like a minimum or maximum behavior on this trace! But the trace for y = 0 is 2x², which is a nice parabola which does reach its minimum at x = 0. So the behavior of the surface around (0, 0) is not a maximum or a minimum. The surface acts a lot like a cubic. Do this in MatLab.

Listing 4.8: The surface f (x, y) = 2x 2 + 4y 3


% plot the surface z = 2x^2 + 4y^3 over the square [-1,1] x [-1,1]
[X, Y] = meshgrid(-1:.2:1);
Z = 2*X.^2 + 4*Y.^3;
surf(Z);

This will give you a surface. In the plot that is shown, go to the tool menu and click on the Rotate 3D option and you can spin it around. Clearly like a cubic! You can see
the plot in Fig. 4.5.

4.9.2 Homework

Exercise 4.9.1 Use our tests to show f (x, y) = 4x 2 + 2y 2 has a minimum at (0, 0).
Feel free to draw a surface plot to help you see what is going on.

Exercise 4.9.2 Use our tests to find where f (x, y) = 2x 2 + 3x + 3y 2 + 8y has a


minimum. Feel free to draw a surface plot to help you see what is going on.

Exercise 4.9.3 Use our tests to find where f (x, y) = 100 − 2x 2 + 3x − 3y 2 + 8y


has a maximum. Feel free to draw a surface plot to help you see what is going on.

Exercise 4.9.4 Use our tests to find where f (x, y) = 2x 2 + x + 8 + 4y 2 + 8y + 20


has a minimum. Feel free to draw a surface plot to help you see what is going on.

Exercise 4.9.5 Show our tests fail on f (x, y) = 6x 4 + 8y 8 even though we know
there is a minimum value at (0, 0). Feel free to draw a surface plot to help you see
what is going on.

Exercise 4.9.6 Show our tests fail on f (x, y) = 10x 2 + 5y 5 and the surface does
not have a minimum or maximum at the critical point (0, 0). Feel free to draw a
surface plot to help you see what is going on.
Part III
The Main Event
Chapter 5
Integration

To help us with our modeling tasks, we need to explore two new ways to compute
antiderivatives and definite integrals. These methods are called Integration By Parts
and Partial Fraction Decompositions. You should also recall our discussions about
antiderivatives or primitives and Riemann integration from Peterson (2015) where
we go over topics such as how we define the Riemann Integral, the Fundamental Theorem of Calculus and the use of the Cauchy Fundamental Theorem of Calculus. You should also review the basic ideas of continuity and differentiability.

5.1 Integration by Parts

This technique is based on the product rule for differentiation. Let’s assume that
the functions f and g are both differentiable on the finite interval [a, b]. Then the
product f g is also differentiable on [a, b] and

(f(t) g(t))' = f'(t) g(t) + f(t) g'(t)

Now, we know the antiderivative of (f g)' is f g + C, where for convenience of


notation, we don’t write the usual (t) in each term. So, if we compute the definite
integral of both sides of this equation on [a, b], we find
∫_a^b (f(t) g(t))' dt = ∫_a^b f'(t) g(t) dt + ∫_a^b f(t) g'(t) dt

The left hand side is simply (f g)|_a^b = f(b)g(b) − f(a)g(a). Hence,

∫_a^b f'(t) g(t) dt + ∫_a^b f(t) g'(t) dt = (f g)|_a^b


This is traditionally written as


∫_a^b f(t) g'(t) dt = (f g)|_a^b − ∫_a^b g(t) f'(t) dt    (5.1)

We usually write this in an even more abbreviated form. If we let u(t) = f(t), then du = f'(t) dt. Also, if v(t) = g(t), then dv = g'(t) dt. Then, we can rephrase Eq. 5.1 as Eq. 5.2.

∫_a^b u dv = uv|_a^b − ∫_a^b v du    (5.2)

We can also develop the integration by parts formula as an indefinite integral. When
we do that we obtain the version in Eq. 5.3
∫ u dv = uv − ∫ v du + C    (5.3)

Equation 5.2 gives what is commonly called the Integration By Parts formula.

5.1.1 How Do We Use Integration by Parts?

Let’s work through some problems carefully. As usual, we will give many details at
first and gradually do the problems faster with less written down. You need to work
hard at understanding this technique.

Example 5.1.1 Evaluate ∫ ln(t) dt.

Solution Let u(t) = ln(t) and dv = dt. Then du = (1/t) dt and v = ∫ dt = t. When we find the antiderivative v, at this stage we don't need to carry around an arbitrary constant C as we will add one at the end. Applying Integration by Parts, we have

∫ ln(t) dt = ∫ u dv
           = uv − ∫ v du
           = ln(t) t − ∫ t (1/t) dt
           = ln(t) t − ∫ dt
           = t ln(t) − t + C
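You can cross-check an antiderivative like this in MatLab; the short sketch below assumes the Symbolic Math Toolbox is available.

% Optional symbolic check, assuming the Symbolic Math Toolbox.
syms t
int(log(t), t)     % returns t*log(t) - t, matching t ln(t) - t up to the constant C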

Example 5.1.2 Evaluate t ln(t) dt.

Solution Let u(t) = ln(t) and dv = tdt. Then du = 1
t
dt and v = tdt = t 2 /2.
Applying Integration by Parts, we have
 
t ln(t) dt = udv

= uv − vdu

1
= ln(t) t 2 /2 − t 2 /2 dt
t
2 
t
= ln(t) − t/2 dt
2
t2 t2
= ln(t) − + C
2 4

Example 5.1.3 Evaluate t 3 ln(t) dt.

Solution Let u(t) = ln(t) and dv = t 3 dt. Then du = 1
t
dt and v = t 3 dt = t 4 /4.
Applying Integration by Parts, we have
 
t ln(t) dt =
3
udv

= uv − vdu

1
= ln(t) t /4 −
4
t 4 /4 dt
t
4 
t
= ln(t) − t 3 /4 dt
4
t2 t4
= ln(t) − +C
2 16

Example 5.1.4 Evaluate t et dt.

Solution Let u(t) = t and dv = et dt. Then du = dt and v = et dt = et . Applying
Integration by Parts, we have
 
t e dt =
t
udv

= uv − vdu

= et t − et dt

= t et − et dt

= t et − et + C

Example 5.1.5 Evaluate t 2 et dt.

Solution Let u(t) = t 2 and dv = et dt. Then du = 2tdt and v = et dt = et . Apply-
ing Integration by Parts, we have
 
t e dt =
2 t
udv

= uv − vdu

= et t 2 − et 2t dt

= t 2 et − 2t et dt


Now the integral 2t et dt also requires the use of integration by parts. So we
integrate again using this technique Let u(t) = 2t and dv = et dt. Then du = 2dt
and v = et dt = et . Applying Integration by Parts again, we have
 
2t et dt = udv

= uv − vdu

= et 2t − et 2 dt

= 2t e −
t
2 et dt

= 2t et − 2 et + C

It is very awkward to do these multiple integration by parts in two separate steps like
we just did. It is much more convenient to repackage the computation like this:
 
t 2 et dt = uv −
vdu
u=t 2
dv = e dt t 
= et t 2 − et 2t dt
du = 2tdt v = et

= et t 2 −  et 2t dt
= t 2 et − 2t et dt
u = 2t dv = et dt
=
du = 2dt v = et
  
= t 2 et − et 2t − et 2 dt
= t 2 et − 2t et − 2 et + C

The framed boxes are convenient for our explanation, but this is still a bit awkward
(and long!) to write out for our problem solution. So let’s try this:
 
t 2 et dt = et t 2 − et 2t dt

u = t 2 ; dv = et dt; du = 2tdt; v = et

= et t 2 − et 2t dt

=t e −
2 t
2t et dt

u = 2t; dv = et dt; du = 2dt v = et


  
= t 2 et − et 2t − et 2 dt
 
= t 2 et − 2t et − 2 et + C
= t 2 et − 2t et + 2 et + C

Example 5.1.6 Evaluate t 2 sin(t) dt.

Solution We will do this one the short way:


 
t 2 sin(t) dt = −t 2 cos(t) − − cos(t) 2t dt

u = t 2 ; dv = sin(t)dt; du = 2tdt; v = − cos(t)



= −t 2 cos(t) + 2t cos(t) dt

u = 2t; dv = cos(t)dt; du = 2dt v = sin(t)


  
= −t cos(t) + 2t sin(t) −
2
2 sin(t) dt

= −t 2 cos(t) + {2t sin(t) + 2 cos(t)} + C


= −t 2 cos(t) + 2t sin(t) + 2 cos(t) + C
3
Example 5.1.7 Evaluate 1 t 2 sin(t) dt.

Solution We will do this one the short way: first do the indefinite integral just like
the last problem.
 
t 2 sin(t) dt = −t 2 cos(t) − − cos(t) 2t dt

u = t 2 ; dv = sin(t)dt; du = 2tdt; v = − cos(t)



= −t 2 cos(t) + 2t cos(t) dt

u = 2t; dv = cos(t)dt; du = 2dt; v = sin(t)


  
= −t cos(t) + 2t sin(t) −
2
2 sin(t) dt

= −t 2 cos(t) + {2t sin(t) + 2 cos(t)} + C


= −t 2 cos(t) + 2t sin(t) + 2 cos(t) + C

Then, we see
 3
t 2 sin(t) dt = {−t 2 cos(t) + 2t sin(t) + 2 cos(t)}(3)
1
− {−t 2 cos(t) + 2t sin(t) + 2 cos(t)}(1)
= {−9 cos(3) + 6 sin(3) + 2 cos(3)}
− {− cos(1) + 2 sin(1) + 2 cos(1)}

And it is not clear we can do much to simplify this expression except possibly just
use our calculator to actually compute a value!
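This is exactly the kind of value MatLab is happy to compute for us. The sketch below is one way to check the answer numerically; it assumes a release that provides the integral command.

% Numerical check of the definite integral from 1 to 3.
f = @(t) t.^2.*sin(t);
integral(f, 1, 3)                                   % roughly 5.55
F = @(t) -t.^2.*cos(t) + 2*t.*sin(t) + 2*cos(t);    % our antiderivative
F(3) - F(1)                                         % same value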

5.1.2 Homework

Exercise 5.1.1 Evaluate ln(5t) dt.

Exercise 5.1.2 Evaluate 2t ln(t 2 ) dt.

Exercise 5.1.3 Evaluate (t + 1)2 ln(t + 1) dt.

Exercise 5.1.4 Evaluate t 2 e2t dt.
2
Exercise 5.1.5 Evaluate 0 t 2 e−3t dt.

Exercise 5.1.6 Evaluate 10t sin(4t) dt.

Exercise 5.1.7 Evaluate 6t cos(8t) dt.

Exercise 5.1.8 Evaluate (6t + 4) cos(8t) dt.
5 2
Exercise 5.1.9 Evaluate 2 (t + 5t + 3) ln(t) dt.

Exercise 5.1.10 Evaluate (t 2 + 5t + 3) e2t dt.

5.2 Partial Fraction Decompositions

Suppose we wanted to integrate a function like 1/((x + 2)(x − 3)). This does not fit into a simple substitution method at all. The way we do this kind of problem is to split the fraction 1/((x + 2)(x − 3)) into the sum of two simpler fractions, one with denominator x + 2 and one with denominator x − 3. This is called the Partial Fractions Decomposition approach. Hence, we want to find numbers A and B so that

1/((x + 2)(x − 3)) = A/(x + 2) + B/(x − 3)

If we multiply both sides of this equation by the term (x + 2) (x − 3), we get the
new equation

1 = A (x − 3) + B (x + 2)

Since this equation holds for all x, we can evaluate it at x = 3 and x = −2 to get

1 = {A (x − 3) + B (x + 2)} |x=3
=5B
1 = {A (x − 3) + B (x + 2)} |x=−2
= −5 A

Thus, we see A is −1/5 and B is 1/5. Hence,

1/((x + 2)(x − 3)) = (−1/5)/(x + 2) + (1/5)/(x − 3)

We could then integrate as follows:

∫ 1/((x + 2)(x − 3)) dx = ∫ [ (−1/5)/(x + 2) + (1/5)/(x − 3) ] dx
                        = ∫ (−1/5)/(x + 2) dx + ∫ (1/5)/(x − 3) dx
                        = −(1/5) ∫ 1/(x + 2) dx + (1/5) ∫ 1/(x − 3) dx
                        = −(1/5) ln(|x + 2|) + (1/5) ln(|x − 3|) + C
                        = (1/5) ln( |x − 3| / |x + 2| ) + C
                        = ln( ( |x − 3| / |x + 2| )^{1/5} ) + C

where it is hard to say which of these equivalent forms is the most useful. In general, in
later chapters, as we work out various modeling problems, we will choose whichever
of the forms above is best for our purposes.
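MatLab can also produce partial fraction decompositions numerically through its residue command. The sketch below illustrates that route for the same fraction; it is a cross-check, not part of the hand method.

% residue works with the coefficients of the numerator and denominator.
% Here 1/((x+2)(x-3)) has denominator x^2 - x - 6.
[r, p, k] = residue(1, [1 -1 -6])
% The residue at the pole 3 is 0.2 and at -2 is -0.2,
% matching B = 1/5 and A = -1/5 above.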

5.2.1 How Do We Use Partial Fraction Decompositions


to Integrate?

Let’s do some examples:

Example 5.2.1 Evaluate



5
dt.
(t + 3) (t − 4)

Solution We start with the decomposition:

5 A B
= +
(t + 3) (t − 4) t +3 t −4
5 = A (t − 4) + B (t + 3)

Now evaluate at t = 4 and t = −3, to get

5 = (A (t − 4) + B (t + 3)) |t=4
=7B
5 = (A (t − 4) + B (t + 3)) |t=−3
= −7 A

Thus, we know A = −5/7 and B = 5/7. We can now evaluate the integral:
  
5 −5/7 5/7
dt = + dt
(t + 3) (t − 4) t +3 t −4
 
−5/7 5/7
= dt + dt
t +3 t −4
 
1 1
= −5/7 dt + 5/7 dt
t +3 t −4
= −5/7 ln (| t + 3 |) + 5/7 ln (| t − 4 |) + C

5 |t −4|
= ln +C
7 |t +3|

| t − 4 | 5/7
= ln +C
|t +3|

Example 5.2.2 Evaluate



10
dt.
(2t − 3) (8t + 5)

Solution We start with the decomposition:

10 A B
= +
(2t − 3) (8t + 5) 2t − 3 8t + 5
10 = A (8t + 5) + B (2t − 3)

Now evaluate at t = −5/8 and t = 3/2, to get

10 = (A (8t + 5) + B (2t − 3)) |t=−5/8


= (−10/8 − 3) B = −34/8B
10 = (A (8t + 5) + B (2t − 3)) |t=3/2
= (24/2 + 5) A = 17 A

Thus, we know B = −80/34 = −40/17 and A = 10/17. We can now evaluate the
integral:
  
10 10/17 −40/17
dt = + dt
(2t − 3) (8t + 5) 2t − 3 8t + 5
 
10/17 −40/17
= dt + dt
2t − 3 8t + 5
 
1 1
= 10/17 dt − 40/17 dt
2t − 3 8t + 5
= 10/17 (1/2) ln (| 2t − 3 |) − 40/17 (1/8) ln (| 8t + 5 |) + C

5 | 2t − 3 |
= ln +C
17 | 8t + 5 |

| 2t − 3 | 5/17
= ln +C
| 8t + 5 |

Example 5.2.3 Evaluate



6
dt.
(4 − t) (9 + t)

Solution We start with the decomposition:

6 A B
= +
(4 − t) (9 + t) 4−t 9+t
6 = A (9 + t) + B (4 − t)

Now evaluate at t = 4 and t = −9, to get

6 = (A (4 − t) + B (9 + t)) |t=4
= 13 B
6 = (A (4 − t) + B (9 + t)) |t=−9
= 13 A

Thus, we know A = 6/13 and B = 6/13. We can now evaluate the integral:
  
6 6/13 6/13
dt = + dt
(4 − t) (9 + t) 4−t 9+t
 
−6/13 6/13
= + dt
t −4 t +9
 
−6/13 6/13
= dt + dt
t −4 t +9
 
1 1
= 6/13 dt + 6/13 dt
t −4 t +9
= 6/13 ln (| t − 4 |) + 6/13 ln (| t + 9 |) + C
6
= ln (| (2t − 3) (8t + 5) |) + C
13
= ln (| (2t − 3) (8t + 5) |)6/13 + C

Example 5.2.4 Evaluate


 7
−6
dt.
4 (t − 2) (2t + 8)

Solution Again, we start with the decomposition:

−6 A B
= +
(t − 2) (2t + 8) t − 2 2t + 8
−6 = A (2t + 8) + B (t − 2)

Now evaluate at t = 2 and t = −4, to get

−6 = (A (2t + 8) + B (t − 2)) |t=2


= 12 A
−6 = (A (2t + 8) + B (t − 2)) |t=−4
= −6 B

Thus, we know A = −1/2 and B = 1. We can now evaluate the indefinite integral:
  
−6 −1/2 1
dt = + dt
(t − 2) (2t + 8) t −2 2t + 8
 
−1/2 1
= + dt
t −2 2t + 8
 
−1/2 1
= dt + dt
t −2 2t + 8
 
1 1
= −1/2 dt + dt
t −2 2t + 8
= −1/2 ln (| t − 2 |) + 1/2 ln (| 2t + 8 |) + C

1 | 2t + 8 |
= ln +C
2 |t −2|

1 2t + 8
= ln | | +C
2 t −2

Then, the definite integral becomes


  
7
−6 1 2t + 8 1 2t + 8
dt = ln | | − ln | |
4 (t − 2) (2t + 8) 2 t −2 2 t −2
  
t=7 t=4
1 22 16
= ln − ln
2 5 2

1 22 2
= ln
2 5 16
 
1 44 1 11
= ln = ln
2 80 2 20

Note that this evaluation would not make sense on an interval [a, b] if either of these two natural logarithm functions were undefined at some point in [a, b]. Here, both natural logarithm functions are nicely defined on [4, 7].
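As a quick numerical sanity check, we can compare this closed form value with MatLab's integral command; the snippet below is a sketch of that comparison.

% Numerical check of Example 5.2.4.
f = @(t) -6./((t - 2).*(2*t + 8));
integral(f, 4, 7)      % roughly -0.299
0.5*log(11/20)         % the closed form (1/2) ln(11/20), also roughly -0.299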

5.2.2 Homework

Exercise 5.2.1 Evaluate



16
du.
(u + 2) (u − 4)

Exercise 5.2.2 Evaluate



35
dz.
(2z − 6) (z + 6)

Exercise 5.2.3 Evaluate



−2
ds.
(s + 4) (s − 5)

Exercise 5.2.4 Evaluate



6
d x.
(x 2 − 16)

Exercise 5.2.5 Evaluate


 6
9
dy.
4 (2y − 4) (3y − 9)

Exercise 5.2.6 Evaluate



−8
dw.
(w + 7) (w − 10)

Exercise 5.2.7 Evaluate



3
dt.
(t + 1) (t − 3)

Reference

J. Peterson, Calculus for Cognitive Scientists: Derivatives, Integration and Modeling, Springer
Series on Cognitive Science and Technology (Springer Science+Business Media Singapore Pte
Ltd, Singapore, 2015 in press)
Chapter 6
Complex Numbers

In the chapters to come, we will need to use the idea of a complex number. When we
use the quadratic equation to find the roots of a polynomial like f (t) = t 2 + t + 1,
we find

t = ( −1 ± √(1 − 4) ) / 2 = −1/2 ± (1/2) √(−3)
Since it is well known that there are no numbers in our world whose squares can be negative, it was easy to think that the term √(−3) represented some sort of imaginary quantity. But it seemed reasonable that the usual properties of the square root function should hold. Thus, we can write

√(−3) = √(−1) √3

and the term √(−1) had the amazing property that when you squared it, you got back −1! Thus, the square root of any negative number √(−c) for a positive c could be rewritten as √(−1) × √c. It became clear to the people studying the roots of polynomials such as our simple one above, that if the set of real numbers was augmented to include numbers of the form √(−1) × √c, there would be a nice way to represent any root of a polynomial. Since a number like √(−1) × 4 or 2 × √(−1) is also possible, it seemed like two copies of the real numbers were needed: one that was the usual real numbers and another copy which was any real number times this strange quantity √(−1). It became very convenient to label the set of all traditional real numbers as the x axis and the set of numbers prefixed by √(−1) as the y axis.
Since the prefix √(−1) was the only real difference between the usual real numbers and the new numbers with the prefix √(−1), this prefix √(−1) seemed the quintessential representative of this difference. Historically, since these new prefixed numbers were already thought of as imaginary, it was decided to start labeling √(−1) as the simple letter i where i is short for imaginary! Thus, a number of this

sort could be represented as a + b i where a and b are any ordinary real numbers.
In particular, the roots of our polynomial could be written as

t = −1/2 ± i √3/2.

6.1 The Complex Plane

A generic complex number z = a + b i can thus be graphed as a point in the plane which has as abscissa the usual x axis and as ordinate the new i y axis. We call this coordinate system the Complex Plane. The magnitude of the complex number z is defined to be the length of the vector which connects the ordered pair (a, b) in this plane with the origin (0, 0). This is labeled as r in Fig. 6.1. This magnitude is called the modulus or magnitude of z and is represented by the same symbol we use for the absolute value of a number, | z |. However, here this magnitude is the length of the hypotenuse of the triangle consisting of the points (0, 0), (a, 0) and (a, b) in the complex plane. Hence,

| z | = √(a² + b²)

The angle measured from the positive x axis to this vector is called the angle asso-
ciated with the complex number z and is commonly denoted by the symbol θ or
Arg(z). Hence, there are two equivalent ways we can represent a complex number
z. We can use coordinate information and write z = a + b i or we can use magnitude
and angle information. In this case, if you look at Fig. 6.1, you can clearly see that
a = r cos(θ) and b = r sin(θ). Thus,

z = | z | (cos(θ) + i sin(θ))
= r (cos(θ) + i sin(θ))

Fig. 6.1 Graphing complex numbers. A complex number a + b i has real part a and imaginary part b. The coordinate (a, b) is graphed in the usual Cartesian manner as an ordered pair in the x−iy complex plane. The magnitude of z is √(a² + b²), which is shown on the graph as r. The angle associated with z is drawn as an arc of angle θ



Fig. 6.2 Graphing the conjugate of a complex number. A complex number a + b i has real part a and imaginary part b. Its complex conjugate is a − b i. The coordinate (a, −b) is graphed in the usual Cartesian manner as an ordered pair in the complex plane. The magnitude of z̄ is √(a² + b²), which is shown on the graph as r. The angle associated with z̄ is drawn as an arc of angle −θ

We can interpret the number cos(θ) + i sin(θ) in a different way. Given a complex
number z = a + b i, we define the complex conjugate of z to be z̄ = a − b i. It is easy to see that

z z̄ = | z |²
z⁻¹ = z̄ / | z |²

Now look at Fig. 6.1 again. In this figure, z is graphed in Quadrant 1 of the complex plane. Now imagine that we replace z by z̄. Then the imaginary component changes from b to −b, which is a reflection across the positive x axis. The magnitude of z̄ and z will then be the same but Arg(z̄) is −θ. We see this illustrated in Fig. 6.2.

6.1.1 Complex Number Calculations

Example 6.1.1 For the complex number z = 2 + 4 i


1. Find its magnitude
2. Write it in the form r (cos(θ) + i sin(θ)) using radians
3. Graph it in the complex plane showing angle in degrees
Solution The complex number 2 + 4 i has magnitude √((2)² + (4)²), which is √20. Since both the real and imaginary components of the complex number are positive, this complex number is graphed in the first quadrant of the complex plane. Hence, the angle associated with z should be between 0 and π/2. We can easily calculate the angle θ associated with z to be tan⁻¹(4/2) = tan⁻¹(2).

Fig. 6.3 The complex number −2 + 8i. A complex number −2 + 8 i has real part −2 and imaginary part 8. The coordinate (−2, 8) is graphed in the usual Cartesian manner as an ordered pair in the x − iy complex plane. The magnitude of z is √((−2)² + (8)²), which is shown on the graph as r. The angle associated with z is drawn as an arc of angle θ

Example 6.1.2 For the complex number −2 + 8 i


1. Find its magnitude
2. Write it in the form r (cos(θ) + i sin(θ)) using radians
3. Graph it in the complex plane showing angle in degrees

Solution The complex number −2 + 8 i has magnitude √((−2)² + (8)²), which is √68. This complex number is graphed in the second quadrant of the complex plane because the real part is negative and the imaginary part is positive. Hence, the angle associated with z should be between π/2 and π. The angle θ associated with z should be π − tan⁻¹(|8/(−2)|) ≈ 1.82 rad or 104.04°. The graph in Fig. 6.3 is instructive.
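MatLab's abs and angle commands compute the modulus and argument directly, so we can check this example numerically with a few lines; nothing beyond base MatLab is needed.

% Numerical check of Example 6.1.2.
z = -2 + 8i;
abs(z)                 % sqrt(68), about 8.246
angle(z)               % about 1.8158 radians
angle(z)*180/pi        % about 104.04 degrees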

We can summarize what we know about complex numbers in Proposition 6.1.1

Proposition 6.1.1 (Complex Numbers)


If z is a + b i, then

1. The magnitude of z is denoted | z | and is defined to be √(a² + b²).
2. The complex conjugate of z is a − b i.
3. The argument or angle associated with the complex number z is the angle θ
defined as the angle measured from the positive x axis to the radius vector which
points from the origin 0 + 0 i to a + b i. This angle is also called Arg(z).
Sometimes we measure this angle clockwise and sometimes counterclockwise.
4. Arg(z̄) = −Arg(z).
5. The magnitude of z is often denoted by r since it is the length of the radius vector
described above.
6. The polar coordinate version of z is given by z = r (cos(θ) + i sin(θ)).

6.1.2 Homework

Exercise 6.1.1 For the complex number −3 + 6 i


1. Find its magnitude
2. Write it in the form r (cos(θ) + i sin(θ)) using radians
3. Graph it in the complex plane showing angle in degrees

Exercise 6.1.2 For the complex number −3 − 6 i


1. Find its magnitude
2. Write it in the form r (cos(θ) + i sin(θ)) using radians
3. Graph it in the complex plane showing angle in degrees

Exercise 6.1.3 For the complex number 3 − 2 i


1. Find its magnitude
2. Write it in the form r (cos(θ) + i sin(θ)) using radians
3. Graph it in the complex plane showing angle in degrees

Exercise 6.1.4 For the complex number 5 + 3 i


1. Find its magnitude
2. Write it in the form r (cos(θ) + i sin(θ)) using radians
3. Graph it in the complex plane showing angle in degrees

Exercise 6.1.5 For the complex number −4 − 3 i


1. Find its magnitude
2. Write it in the form r (cos(θ) + i sin(θ)) using radians
3. Graph it in the complex plane showing angle in degrees

Exercise 6.1.6 For the complex number 5 − 7 i


1. Find its magnitude
2. Write it in the form r (cos(θ) + i sin(θ)) using radians
3. Graph it in the complex plane showing angle in degrees

6.2 Complex Functions

Let z = r (cos(θ) + i sin(θ)) and let's think of θ as a variable now. We have seen the function f(θ) = r (cos(θ) + i sin(θ)) arises when we interpret a complex number in terms of the triangle formed by its angle and its magnitude. Let's find f'(θ). We have

f'(θ) = −r sin(θ) + i r cos(θ)
      = i² r sin(θ) + i r cos(θ)
      = i ( r cos(θ) + i r sin(θ) )
      = i f(θ).

So f'(θ) = i f(θ), or f'(θ)/f(θ) = i. Taking the antiderivative of both sides, this suggests

ln( f(θ) ) = i θ  ⟹  f(θ) = e^{iθ}.

Our antiderivative argument suggests we define e^{iθ} = cos(θ) + i sin(θ), and by direct calculation, we had (e^{iθ})' = i e^{iθ}. This motivational argument is not quite right, of course, and there is deeper mathematics at work here, but it helps to explain why we define e^{iθ} = cos(θ) + i sin(θ). Hence, we extend the exponential function
to complex numbers as follows:

Definition 6.2.1 (The Extension Of e^b to e^{ib})

We extend the exponential function exp to complex number arguments as follows: for any real numbers a and b, define

e^{ib} ≡ cos(b) + i sin(b)
e^{a + ib} ≡ e^a e^{ib} = e^a (cos(b) + i sin(b))

Definition 6.2.2 (The Extension Of e^t to e^{(a + b i) t})

We extend the exponential function exp to the complex argument (a + b i) t as follows

e^{ibt} ≡ cos(bt) + i sin(bt)
e^{(a + ib) t} ≡ e^{at} e^{ibt} = e^{at} (cos(bt) + i sin(bt))

Thus, we can rephrase the polar coordinate form of the complex number z = a + b i as z = r e^{iθ}.

6.2.1 Calculations with Complex Functions

Example 6.2.1 For the complex function

e(−2 +8 i)t

1. Find its magnitude


2. Write it in its fully expanded form

Solution We know that

exp ((a + b i) t) = eat (cos(bt) + i sin(bt)) .

Hence,

exp ((−2 + 8 i) t) = e−2t e8it


= e−2t (cos(8t) + i sin(8t)) .

Since the complex magnitude of e8it is always one, we see

| e(−2 +8 i)t | = e−2t

Example 6.2.2 For the complex function

e(−1 +2 i)t

1. Find its magnitude


2. Write it in its fully expanded form

Solution We have

exp ((−1 + 2 i) t) = e−t e2it


= e−t (cos(2t) + i sin(2t)) .

Since the complex magnitude of e2it is always one, we see

| e(−1 +2 i)t | = e−t

Example 6.2.3 For the complex function

e2i t

1. Find its magnitude


2. Write it in its fully expanded form

Solution We have

exp ((0 + 2 i) t) = e0t e2it


= cos(2t) + i sin(2t).

Since the complex magnitude of e^{2it} is always one, we see

| e^{2it} | = 1
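A small numerical sanity check of these magnitude claims is easy in MatLab; the sample time below is an arbitrary choice.

% |exp((-2+8i)t)| should equal exp(-2t) for any t; try t = 0.7.
t = 0.7;
abs(exp((-2 + 8i)*t))
exp(-2*t)              % prints the same number as the line above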

6.2.2 Homework

For each of the following complex numbers, find its magnitude, write it in the form
r eiθ using radians, and graph it in the complex plane showing angle in degrees.

Exercise 6.2.1 z = −3 + 6 i.

Exercise 6.2.2 z = −3 − 6 i.

Exercise 6.2.3 z = 3 − 6 i.

Exercise 6.2.4 z = 2 + 8 i.

Exercise 6.2.5 z = 5 + 1 i.

For each of the following complex functions, find its magnitude and write it in its
fully expanded form.

Exercise 6.2.6 e(−2 +3 i) t .

Exercise 6.2.7 e(−1 +4 i) t .

Exercise 6.2.8 e(0 +14 i) t .

Exercise 6.2.9 e(−8 −9 i) t .

Exercise 6.2.10 e(−2 +π i) t .


Chapter 7
Linear Second Order ODEs

We now turn our attention to Linear Second Order Differential Equations. The way
we solve these is built from our understanding of exponential growth and the models
built out of that idea. The idea of a half life is also important though it is not used
as much in these second order models. A great example also comes from Protein
Modeling and its version of half life called Response Time.
These have the general form

a u''(t) + b u'(t) + c u(t) = 0    (7.1)
u(0) = u₀,  u'(0) = u₁    (7.2)

where we assume a is not zero. Here are some examples just so you can get the feel
of these models.

x''(t) − 3 x'(t) − 4 x(t) = 0
x(0) = 3
x'(0) = 8

4 y''(t) + 8 y'(t) + y(t) = 0
y(0) = −1
y'(0) = 2

z''(t) + 5 z'(t) + 3 z(t) = 0
z(0) = 2
z'(0) = −3

We have already seen that a first order problem like u'(t) = r u(t) with u(0) = u₀ has the solution u(t) = u₀ e^{rt}. It turns out that the solutions to Eqs. 7.1 and 7.2 will
also have this kind of form. To see how this happens, let’s look at a new concept called
an operator. For us, an operator is something that takes as input a function and then
transforms the function into another one. A great example is the indefinite integral


operator we might call I which takes a nice continuous function f and outputs the new function ∫ f. Another good one is the differentiation operator D which takes a differentiable function f and creates the new function f'. Hence, if we let D denote differentiation with respect to the independent variable and c be the multiply by a constant operator defined by c(f) = c f, we could rewrite u'(t) = r u(t) as

D u = r u    (7.3)

where we suppress the (t) notation for simplicity of exposition. In fact, we could
rewrite the model again as
 
(D − r) u = 0    (7.4)

where we let D − r act on u to create u' − r u. We can apply this idea to Eq. 7.1.
Next, let D² be the second derivative operator: this means D² u = u''. Then we can rewrite Eq. 7.1 as

a D² u + b D u + c u = 0    (7.5)

where we again suppress the time variable t. For example, if we had the problem

u''(t) + 5 u'(t) + 6 u(t) = 0

it could be rewritten as

D² u + 5 D u + 6 u = 0.

A little more practice is good at this point. These models convert to the operator forms indicated. We ignore the initial conditions for the moment.

x''(t) − 3 x'(t) − 4 x(t) = 0 ⟺ (D² − 3D − 4)(x) = 0
4 y''(t) + 8 y'(t) + y(t) = 0 ⟺ (4D² + 8D + 1)(y) = 0
z''(t) + 5 z'(t) + 3 z(t) = 0 ⟺ (D² + 5D + 3)(z) = 0.

In fact, we should begin to think of the models and their operator forms as inter-
changeable. The model most useful to us now is the first order linear model. We now
can write

x' = 3x; x(0) = A ⟺ (D − 3)(x) = 0; x(0) = A.

and

x' = −2x; x(0) = A ⟺ (D + 2)(x) = 0; x(0) = A.



Now consider the model

y'' + 5y' + 6y = 0 ⟺ (D² + 5D + 6)(y) = 0.

Let's try factoring: consider a function f and do the computations with the factors in both orders.

(D + 2)(D + 3)(f) = (D + 2)(f' + 3f)
                  = D(f' + 3f) + 2(f' + 3f)
                  = f'' + 3f' + 2f' + 6f = f'' + 5f' + 6f,

and

(D + 3)(D + 2)(f) = (D + 3)(f' + 2f)
                  = D(f' + 2f) + 3(f' + 2f)
                  = f'' + 2f' + 3f' + 6f = f'' + 5f' + 6f.

We see that

( D2 + 5 D + 6)(y) = 0 ⇐⇒ ( D + 2) ( D + 3)(y) = 0 ⇐⇒ ( D + 3) ( D + 2)(y) = 0.

Now we can figure out how to find the most general solution to this model.
• The general solution to ( D + 3)(y) = 0 is y(t) = Ae−3t .
• The general solution to ( D + 2)(y) = 0 is y(t) = Be−2t .
• Let our most general solution be y(t) = Ae−3t + Be−2t .
We know from our study of first order equations that a problem of the form (D + r)u = 0 has a solution of the form e^{−rt}. This suggests that there are two possible solutions to the problem above. One satisfies (D + 3)u = 0 and the other (D + 2)u = 0. Hence, it seems that any combination of the functions e^{−3t} and e^{−2t} should work. Thus, a general solution to our problem would have the form A e^{−3t} + B e^{−2t} for arbitrary constants A and B. With this intuition established, let's
try to solve this more formally.
For the problem Eq. 7.1, let's assume that e^{rt} is a solution and try to find what values of r might work. We see for u(t) = e^{rt}, we find

0 = a u''(t) + b u'(t) + c u(t)
  = a r² e^{rt} + b r e^{rt} + c e^{rt}
  = (a r² + b r + c) e^{rt}.

Since er t can never be 0, we must have

0 = a r 2 + b r + c.

The roots of the quadratic equation above are the only values of r that will work
as the solution er t . We call this quadratic equation the Characteristic Equation of a
linear second order differential equation Eq. 7.1. To find these values of r , we can
either factor the quadratic or use the quadratic formula. If you remember from your
earlier algebra course, there are three types of roots:

(i): the roots are both real and distinct. We let r1 be the smallest root and r2 the
bigger one.
(ii): the roots are both the same. We let r1 = r2 = r in this case.
(iii): the roots are a complex pair of the form a ± b i.

Example 7.0.1 Consider this model.

u''(t) + 7 u'(t) + 10 u(t) = 0

Note the operator form of this model is

(D² + 7D + 10)(u) = 0.

Let's derive the characteristic equation. We assume the model has a solution of the form e^{rt} for some value of r. Then, plugging u(t) = e^{rt} into the model, we find

(r² + 7r + 10)(e^{rt}) = 0.

Since e^{rt} is never 0 no matter what r's value is, we see this implies we need values of r that satisfy

r² + 7r + 10 = 0.

This factors as

(r + 2)(r + 5) = 0.

Thus, the roots of the model’s characteristic equation are r1 = −5 and r2 = −2.
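If you want a numerical check, MatLab's roots command accepts the coefficients of the characteristic polynomial; the sketch below verifies this example.

% roots of r^2 + 7r + 10
roots([1 7 10])        % returns -5 and -2 (in some order)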

7.1 Homework

For these models,


(i): Write the model in operator form.
(ii): Derive the characteristic equation.
(iii): Find the roots of the characteristic equation.

Exercise 7.1.1

u''(t) + 50 u'(t) − 6 u(t) = 0.

Exercise 7.1.2

u''(t) + 7 u'(t) + 2 u(t) = 0.

Exercise 7.1.3

u''(t) − 8 u'(t) + 3 u(t) = 0.

Exercise 7.1.4

u''(t) + 6 u'(t) + 10 u(t) = 0.

Exercise 7.1.5

u''(t) + 5 u'(t) − 24 u(t) = 0.

Now, let’s figure out what to do in each of these three cases.

7.2 Distinct Roots

Our general problem

a u''(t) + b u'(t) + c u(t) = 0
u(0) = u₀,  u'(0) = u₁

which has characteristic equation

0 = a r² + b r + c,

factors as

0 = (r − r₁)(r − r₂).

We thus know that the general solution u has the form

u(t) = A e^{r₁ t} + B e^{r₂ t}.

We are told that initially, u(0) = u₀ and u'(0) = u₁. Taking the derivative of u, we find

u'(t) = r₁ A e^{r₁ t} + r₂ B e^{r₂ t}.

Hence, to satisfy the initial conditions, we must find A and B to satisfy the two equations in two unknowns below:

u(0) = A e^{r₁·0} + B e^{r₂·0} = A + B = u₀
u'(0) = r₁ A e^{r₁·0} + r₂ B e^{r₂·0} = r₁ A + r₂ B = u₁.

Thus to find the appropriate A and B, we must solve the system of two equations in two unknowns:

A + B = u₀,
r₁ A + r₂ B = u₁.

Example 7.2.1 For this model,

• Derive the characteristic equation.


• Find the roots of the characteristic equation.
• Find the general solution.
• Solve the IVP.
• Plot and print the solution using MatLab.

x''(t) − 3 x'(t) − 10 x(t) = 0
x(0) = −10
x'(0) = 10

Solution (i): To derive the characteristic equation, we assume the solution has the form e^{rt} and plug that into the problem. We find

(r² − 3r − 10) e^{rt} = 0.

Since e^{rt} is never zero, we see this implies that

r² − 3r − 10 = 0.

This is the characteristic equation for this problem.


(ii): We find the roots of the characteristic equation by either factoring it or using
the quadratic formula. This one factors nicely giving

(r + 2) (r − 5) = 0.

Hence, the roots of this characteristic equation are r1 = −2 and r2 = 5.


(iii): The general solution is thus

x(t) = A e^{−2t} + B e^{5t}.

(iv): Next, we find the values of A and B which will let the solution satisfy the initial
conditions. We have

x(0) = A e⁰ + B e⁰ = A + B = −10
x'(0) = −2A e⁰ + 5B e⁰ = −2A + 5B = 10.

This gives the system of two equations in the two unknowns A and B

A + B = −10
−2 A + 5 B = 10.

Multiplying the first equation by 2 and adding, we get B = −10/7. It then follows
that A = −60/7. Thus, the solution to this initial value problem is

x(t) = −(60/7) e^{−2t} − (10/7) e^{5t}.

(v): Finally, we can graph this solution in MatLab as follows:

Listing 7.1: Solution to x'' − 3x' − 10x = 0; x(0) = −10, x'(0) = 10

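A minimal sketch of such a plotting script is shown below; the time window, labels and number of plot points are illustrative choices rather than the author's original listing.

% Plot the solution of Example 7.2.1 on a short time window.
t = linspace(0, 1, 201);
x = -(60/7)*exp(-2*t) - (10/7)*exp(5*t);
plot(t, x, 'LineWidth', 2);
xlabel('t'); ylabel('x(t)');
title('x(t) = -(60/7)e^{-2t} - (10/7)e^{5t}');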
which generates the plot we see in Fig. 7.1.



Fig. 7.1 Linear second order problem: two distinct roots

7.2.1 Homework

For the models below,

• Find the characteristic equation.


• Find the general solution.
• Solve the IVP.
• Plot and print the solution using MatLab.

Exercise 7.2.1 For the ODE below

y''(t) + 5 y'(t) + 6 y(t) = 0


y(0) = 1
y  (0) = −2.

Exercise 7.2.2 For the ODE below

z''(t) + 9 z'(t) + 14 z(t) = 0


z(0) = −1
z  (0) = 1.

Exercise 7.2.3 For the ODE below

P''(t) − 2 P'(t) − 8 P(t) = 0


P(0) = 1
P  (0) = 2.

Exercise 7.2.4 For the ODE below

u''(t) + 3 u'(t) − 10 u(t) = 0


u(0) = −1
u  (0) = −2.

Exercise 7.2.5 For the ODE below

x''(t) + 3 x'(t) + 2 x(t) = 0


x(0) = −10
x  (0) = 200.

7.3 Repeated Roots

Now our general problem

a u''(t) + b u'(t) + c u(t) = 0
u(0) = u₀,  u'(0) = u₁

which has characteristic equation

0 = a r² + b r + c,

factors as

0 = (r − r₁)².

We have one solution u₁(t) = e^{r₁ t}, but we don't know if there are others. Let's assume that another solution is a product f(t) e^{r₁ t}. We know we want

0 = (D − r₁)(D − r₁)( f(t) e^{r₁ t} )
  = (D − r₁)[ ( f'(t) + r₁ f(t) − r₁ f(t) ) e^{r₁ t} ]
  = (D − r₁)( f'(t) e^{r₁ t} )
  = ( f''(t) + r₁ f'(t) − r₁ f'(t) ) e^{r₁ t}
  = f''(t) e^{r₁ t}.

Since e^{r₁ t} is never zero, we must have f'' = 0. This tells us that f(t) = α t + β for any α and β. Thus, a second solution has the form

v(t) = (α t + β) e^{r₁ t} = α t e^{r₁ t} + β e^{r₁ t}

The only new function in this second solution is u₂(t) = t e^{r₁ t}. Hence, our general solution in the case of repeated roots will be

u(t) = A e^{r₁ t} + B t e^{r₁ t} = (A + B t) e^{r₁ t}.

We are told that initially, u(0) = u₀ and u'(0) = u₁. Taking the derivative of u, we find

u'(t) = B e^{r₁ t} + r₁ (A + B t) e^{r₁ t}.

Hence, to satisfy the initial conditions, we must find A and B to satisfy the two equations in two unknowns below:

u(0) = (A + B·0) e^{r₁·0} = A = u₀
u'(0) = B e^{r₁·0} + r₁ (A + B·0) e^{r₁·0} = B + r₁ A = u₁.

Thus to find the appropriate A and B, we must solve the system of two equations in two unknowns:

A = u₀,
B + r₁ A = u₁.

Example 7.3.1 Now let’s look at a problem with repeated roots. We want to

• Find the characteristic equation.


• Find the roots of the characteristic equation.
• Find the general solution.
• Solve the IVP.
• Plot and print the solution using MatLab.

u''(t) + 16 u'(t) + 64 u(t) = 0
u(0) = 1
u'(0) = 8
7.3 Repeated Roots 159

Solution (i): To find the characteristic equation, we assume the solution has the form e^{rt} and plug that into the problem. We find

(r² + 16r + 64) e^{rt} = 0.

Since e^{rt} is never zero, we see this implies that

r² + 16r + 64 = 0.

This is the characteristic equation for this problem.


(ii): We find the roots of the characteristic equation by either factoring it or using
the quadratic formula. This one factors nicely giving

(r + 8) (r + 8) = 0.

Hence, the roots of this characteristic equation are repeated: r1 = −8 and r2 = −8.
(iii): The general solution is thus

u(t) = A e−8t + B t e−8t

(iv): Next, we find the values of A and B which will let the solution satisfy the initial
conditions. We have

u(0) = A e⁰ + B·0·e⁰ = A = 1
u'(0) = ( −8A e^{−8t} + B e^{−8t} − 8B t e^{−8t} )|_{t=0} = −8A + B = 8.

This gives the system of two equations in the two unknowns A and B

A=1
−8 A + B = 8.

This tells us that B = 16. Thus, the solution to this initial value problem is

u(t) = e−8t + 16 t e−8t .

(v): Finally, we can graph this solution in MatLab as follows:

Listing 7.2: Solution to u'' + 16u' + 64u = 0; u(0) = 1, u'(0) = 8

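A minimal sketch of such a plotting script is below; the time window is an illustrative choice.

% Plot the repeated roots solution of Example 7.3.1.
t = linspace(0, 2, 201);
u = exp(-8*t) + 16*t.*exp(-8*t);
plot(t, u, 'LineWidth', 2);
xlabel('t'); ylabel('u(t)');
title('u(t) = e^{-8t} + 16 t e^{-8t}');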


Fig. 7.2 Linear second order problem: repeated roots

which generates the plot we see in Fig. 7.2.

7.3.1 Homework

For the models below,

• Find the characteristic equation.


• Find the general solution.
• Solve the IVP.
• Plot and print the solution using MatLab.

Exercise 7.3.1

x''(t) − 12 x'(t) + 36 x(t) = 0


x(0) = 1
x  (0) = −2.

Exercise 7.3.2

y''(t) + 14 y'(t) + 49 y(t) = 0


y(0) = −1
y  (0) = 1.

Exercise 7.3.3

w''(t) + 6 w'(t) + 9 w(t) = 0


w(0) = 1
w  (0) = 2.

Exercise 7.3.4

Q''(t) + 4 Q'(t) + 4 Q(t) = 0


Q(0) = −1
Q  (0) = −2.

Exercise 7.3.5

 (t) + 2  (t) + (t) = 0


(0) = −10
 (0) = 200.

7.4 Complex Roots

In this last case, our general problem

a u''(t) + b u'(t) + c u(t) = 0
u(0) = u₀,  u'(0) = u₁

which has characteristic equation

0 = a r² + b r + c,

factors as

0 = a ( r − (c + di) ) ( r − (c − di) ).

because the roots are complex. Now we suspect the solutions are

u 1 (t) = e(c+di)t and u 2 (t) = e(c−di)t .

We have already seen how to interpret the complex functions e(c+di)t and e(c−di)t
(see Chap. 6, Definition 6.2.2). Let’s try to find out what the derivative of such a
function might be. First, it seems reasonable that if f(t) has a derivative f'(t) at t, then multiplying by i to get i f(t) only changes the derivative to i f'(t). In fact, the derivative of (c + id) f(t) should be (c + id) f'(t). Thus,

( e^{(c+id)t} )' = ( e^{ct} ( cos(dt) + i sin(dt) ) )'
                = c e^{ct} cos(dt) − d e^{ct} sin(dt) + i c e^{ct} sin(dt) + i d e^{ct} cos(dt)
                = e^{ct} cos(dt) ( c + i d ) + e^{ct} sin(dt) ( i c − d ).

We also know that i² = −1, so replacing −d by i² d in the last equation, we find

( e^{(c+id)t} )' = e^{ct} cos(dt) ( c + i d ) + e^{ct} sin(dt) ( i c + i² d )
                = e^{ct} cos(dt) ( c + i d ) + i e^{ct} sin(dt) ( c + i d )
                = (c + i d) e^{ct} ( cos(dt) + i sin(dt) )
                = (c + i d) e^{(c+id)t}.

We conclude that we can now take the derivative of e^{(c±id)t} to get (c ± id) e^{(c±id)t}. We can now test to see if A e^{(c+id)t} + B e^{(c−id)t} solves our problem. We see

( e^{(c±id)t} )'' = (c ± id)² e^{(c±id)t}.

Thus,

a ( A e^{(c+id)t} + B e^{(c−id)t} )'' + b ( A e^{(c+id)t} + B e^{(c−id)t} )' + c ( A e^{(c+id)t} + B e^{(c−id)t} )
  = A e^{(c+id)t} ( a (c + id)² + b (c + id) + c ) + B e^{(c−id)t} ( a (c − id)² + b (c − id) + c ).

Now since c + id and c − id are roots of the characteristic equation, we know

a (c + id)² + b (c + id) + c = 0
a (c − id)² + b (c − id) + c = 0.

Thus,

a ( A e^{(c+id)t} + B e^{(c−id)t} )'' + b ( A e^{(c+id)t} + B e^{(c−id)t} )' + c ( A e^{(c+id)t} + B e^{(c−id)t} )
  = A e^{(c+id)t} · 0 + B e^{(c−id)t} · 0 = 0.

In fact, you can see that all of the calculations above would work even if the constants
A and B were complex numbers. So we have shown that any combination of the two
complex solutions e(c+id)t and e(c−id)t is a solution of our problem. Of course, a
solution that actually has complex numbers in it doesn’t seem that useful for our
world. After all, we can’t even graph it! So we have to find a way to construct
solutions which are always real valued and use them as our solution. Note
 
u₁(t) = (1/2) ( e^{(c+id)t} + e^{(c−id)t} )
      = (1/2) e^{ct} ( cos(dt) + i sin(dt) + cos(dt) − i sin(dt) )
      = (1/2) · 2 cos(dt) e^{ct}
      = e^{ct} cos(dt)

and

u₂(t) = (1/(2i)) ( e^{(c+id)t} − e^{(c−id)t} )
      = (1/(2i)) e^{ct} ( cos(dt) + i sin(dt) − cos(dt) + i sin(dt) )
      = (1/(2i)) · 2 i sin(dt) e^{ct}
      = e^{ct} sin(dt)

are both real valued solutions! So we will use as general solution to this problem

u(t) = A e^{ct} cos(dt) + B e^{ct} sin(dt)

where the constants A and B are now restricted to be real numbers. To solve the initial value problem, we then have

u(0) = A e^{c·0} cos(d·0) + B e^{c·0} sin(d·0) = A = u₀
u'(0) = ( c A e^{ct} cos(dt) − d A e^{ct} sin(dt) + c B e^{ct} sin(dt) + d B e^{ct} cos(dt) )|_{t=0} = c A + d B = u₁.

Hence, to solve the initial value problem, we find A and B by solving the two equations in two unknowns below:

A = u₀
c A + d B = u₁.

Example 7.4.1 For this model,

• Find the characteristic equation.


• Find the roots of the characteristic equation.
• Find the general complex solution.
• Find the general real solution.
• Solve the IVP.
• Plot and print the solution using MatLab.

u''(t) + 8 u'(t) + 25 u(t) = 0
u(0) = 3
u'(0) = 4

Solution (i): To find the characteristic equation, we assume the solution has the form e^{rt} and plug that into the problem. We find

(r² + 8r + 25) e^{rt} = 0.

Since e^{rt} is never zero, we see this implies that

r² + 8r + 25 = 0.

This is the characteristic equation for this problem.


(ii): We find the roots of this characteristic equation using the quadratic formula. We
find

r = ( −8 ± √(64 − 100) ) / 2 = −4 ± 3i.
Hence, the roots of this characteristic equation occur as the complex pair: r1 =
−4 + 3 i and r2 = −4 − 3 i.
(iii): The general complex solution is thus

u(t) = A e(−4+3 i)t + B e(−4−3 i)t .

(iv): The real solutions we want are then e−4t cos(3t) and e−4t sin(3t). The general
real solution is thus

u(t) = A e−4t cos(3t) + B e−4t sin(3t).

for arbitrary real numbers A and B. (v): Next, we find the values of A and B which
will let the solution satisfy the initial conditions. We have
u(0) = ( A e^{−4t} cos(3t) + B e^{−4t} sin(3t) )|_{t=0} = A = 3,
u'(0) = ( −4A e^{−4t} cos(3t) − 3A e^{−4t} sin(3t) − 4B e^{−4t} sin(3t) + 3B e^{−4t} cos(3t) )|_{t=0} = −4A + 3B = 4.

This gives the system of two equations in the two unknowns A and B

A=3
−4 A + 3 B = 4.

It then follows that B = 16/3. Thus, the solution to this initial value problem is

u(t) = 3 e−4t cos(3t) + 16/3 e−4t sin(3t).

(v): Finally, we can graph this solution in MatLab as follows:

Listing 7.3: Solution to u'' + 8u' + 25u = 0; u(0) = 3, u'(0) = 4

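A minimal sketch of such a plotting script is below; the time window is an illustrative choice.

% Plot the decaying oscillation of Example 7.4.1.
t = linspace(0, 3, 401);
u = 3*exp(-4*t).*cos(3*t) + (16/3)*exp(-4*t).*sin(3*t);
plot(t, u, 'LineWidth', 2);
xlabel('t'); ylabel('u(t)');
title('u(t) = 3e^{-4t}cos(3t) + (16/3)e^{-4t}sin(3t)');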
which generates the plot we see in Fig. 7.3.

Fig. 7.3 Linear second order problem: complex roots

Example 7.4.2 For

u''(t) − 8 u'(t) + 20 u(t) = 0
u(0) = 3
u'(0) = 4

• Derive the characteristic equation and find its roots.


• Find the general complex and general real solution.
• Solve the IVP and plot and print the solution using MatLab.

Solution (i): To find the characteristic equation, we assume the solution is e^{rt}. We find (r² − 8r + 20) e^{rt} = 0. Since e^{rt} is never zero, we must have r² − 8r + 20 = 0. Using the quadratic formula, we find r = (8 ± √(64 − 80))/2 = 4 ± 2i.
(ii): The general complex solution is φ(t) = c₁ e^{(4+2i)t} + c₂ e^{(4−2i)t}.
(iii): The general real solution is thus u(t) = A e^{4t} cos(2t) + B e^{4t} sin(2t) for arbitrary real numbers A and B. Thus,
arbitrary real numbers A and B. Thus,

u'(t) = 4A e^{4t} cos(2t) − 2A e^{4t} sin(2t) + 4B e^{4t} sin(2t) + 2B e^{4t} cos(2t).

(iv): Next, we find the values of A and B which will let the solution satisfy the initial
conditions. We have u(0) = 3 and u  (0) = 4 A + 2B = 4. This gives the system of
two equations in the two unknowns A and B

A=3
4 A + 2 B = 4 ⇒ 2B = 4 − 4 A = −8.

It then follows that B = −4. Thus, the solution to this initial value problem is

u(t) = 3 e4t cos(2t) − 4 e4t sin(2t).

(v): Finally, we can graph this solution in MatLab as follows:

Listing 7.4: Solution to u'' − 8u' + 20u = 0; u(0) = 3, u'(0) = 4

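A minimal sketch of such a plotting script is below; as noted next, the linspace window is a judgment call because the solution grows so quickly.

% Plot the growing oscillation of Example 7.4.2 on a short window.
t = linspace(0, 2, 401);
u = 3*exp(4*t).*cos(2*t) - 4*exp(4*t).*sin(2*t);
plot(t, u, 'LineWidth', 2);
xlabel('t'); ylabel('u(t)');
title('u(t) = 3e^{4t}cos(2t) - 4e^{4t}sin(2t)');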
Of course, we have to judge the time interval to choose for the linspace command.
We see the plot in Fig. 7.4. Note as t gets large, u(t) oscillates out of control.

Fig. 7.4 Linear second order problem: complex roots two

7.4.1 Homework

For these models,

• Find the characteristic equation and roots.


• Find the general complex solution.
• Find the general real solution.
• Solve the IVP.
• Plot and print the solution using MatLab.

Exercise 7.4.1

y''(t) − 8 y'(t) + 41 y(t) = 0


y(0) = +3
y  (0) = −2.

Exercise 7.4.2

x''(t) − 2 x'(t) + 2 x(t) = 0


x(0) = −5
x  (0) = 1.

Exercise 7.4.3

u''(t) − 2 u'(t) + 10 u(t) = 0


u(0) = +1
u  (0) = −2.

Exercise 7.4.4

P''(t) − 6 P'(t) + 13 P(t) = 0


P(0) = 1
P  (0) = 2.

7.4.2 The Phase Shifted Solution

We can also write these solutions in another form. Our solutions here look like u(t) = A e^{ct} cos(dt) + B e^{ct} sin(dt). Let R = √(A² + B²). Rewrite the solution as

u(t) = R e^{ct} ( (A/R) cos(dt) + (B/R) sin(dt) ).

Define the angle δ by tan(δ) = B/A. Then the angle's value will depend on the quadrant in which A and B place it, just like when we find angles for complex numbers and vectors. So cos(δ) = A/R and sin(δ) = B/R. Now, there is a trigonometric identity cos(E − F) = cos(E) cos(F) + sin(E) sin(F). Here we have

u(t) = R e^{ct} ( cos(δ) cos(dt) + sin(δ) sin(dt) )
     = R e^{ct} cos(dt − δ).

The angle δ is called the phase shift. When written in this form, the solution is said
to be in phase shifted cosine form.

Example 7.4.3 Consider the solution u(t) = 3 e^{−4t} cos(3t) + (16/3) e^{−4t} sin(3t), which gives u(0) = 3 and u'(0) = 4. Find the phase shifted cosine solution.

Solution Let R = √(3² + (16/3)²) = √337/3. A and B are positive so they are in Quadrant 1. So δ = tan⁻¹(16/9). We have u(t) = (√337/3) e^{−4t} cos(3t − δ). To draw this solution by hand, do the following:

• On your graph, draw the curve (√337/3) e^{−4t}. This is the top curve that bounds our solution, called the top envelope.
• On your graph, draw the curve −(√337/3) e^{−4t}. This is the bottom curve that bounds our solution, called the bottom envelope.
• Draw the point (0, 3) and from it draw an arrow pointing up as the initial slope is
positive.

• The solution starts at t = 0 and points up. It hits the top curve when the cos term
hits its maximum of 1. It then flips and moves towards its minimum value of −1
where it hits the bottom curve.
• Keep drawing the curve as it hits top and bottom in a cycle. This graph is expo-
nential decay that is oscillating towards zero.
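The hand-drawing recipe above is easy to check on screen as well; the MatLab sketch below plots the phase shifted solution together with its top and bottom envelopes. The time window is a judgment call.

% Example 7.4.3: solution and its envelopes.
t = linspace(0, 2, 401);
R = sqrt(3^2 + (16/3)^2);          % sqrt(337)/3
delta = atan2(16/3, 3);            % phase shift; A and B are in Quadrant 1
u = R*exp(-4*t).*cos(3*t - delta);
plot(t, u, t, R*exp(-4*t), 'r--', t, -R*exp(-4*t), 'r--');
xlabel('t'); ylabel('u(t)');
title('Phase shifted form with top and bottom envelopes');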

Example 7.4.4 Consider the solution u(t) = 3 e2t cos(4t) − 5 e2t sin(4t). This gives
u(0) = 3 and u  (0) = 6 − 20 = −14. Convert to phase shifted form.

Solution Let R = √(3² + (−5)²) = √34. Since A = 3 and B = −5, this is Quadrant 4. So we use δ = 2π − tan⁻¹(5/3). We have u(t) = √34 e^{2t} cos(4t − δ). To draw the solution by hand, do this:

• On your graph, draw the curve √34 e^{2t}. This is the top curve that bounds our solution.
• On your graph, draw the curve −√34 e^{2t}. This is the bottom curve that bounds our solution.
• Draw the point (0, 3) and from it draw an arrow pointing down as the initial slope
is negative.
• The solution starts at t = 0 and points down. It hits the bottom curve when the cos
term hits its minimum of −1. It then flips and moves towards its maximum value
of 1 where it hits the top curve.
• Keep drawing the curve as it hits bottom and top in a cycle. This graph is expo-
nential growth that is oscillating out of control.

Example 7.4.5 Consider the solution u(t) = −8 e−2t cos(2t) − 6 e−2t sin(2t). Here
u(0) = −8 and u  (0) = 4. Find the phase shifted form.

Solution Let R = √((−8)² + (−6)²) = 10. A is negative and B is negative so they are in Quadrant 3. So the angle is δ = π + tan⁻¹(6/8). We have u(t) = 10 e^{−2t} cos(2t − δ). Then draw the solution like so:
• On your graph, draw the curve 10e−2t . This is the top curve that bounds our
solution.
• On your graph, draw the curve −10 e^{−2t}. This is the bottom curve that bounds our
solution.
• Draw the point (0, −8) and from it draw an arrow pointing up as the initial slope
is positive.
• The solution starts at t = 0 and points up. It hits the top curve when the cos term
hits its maximum of 1. It then flips and moves towards its minimum value of −1
where it hits the bottom curve.
• Keep drawing the curve as it hits top and bottom in a cycle. This graph is expo-
nential decay that is oscillating towards zero.

7.4.2.1 Homework

For the models below,


• Find the characteristic equation and roots.
• Find the general solution and solve the IVP.
• Convert the solution to phase shifted cosine form.

Exercise 7.4.5

x''(t) + 4 x'(t) + 29 x(t) = 0


x(0) = 8
x  (0) = −2.

Exercise 7.4.6

y''(t) + 6 y'(t) + 45 y(t) = 0


y(0) = 6
y  (0) = 1.

Exercise 7.4.7

u''(t) − 4 u'(t) + 8 u(t) = 0


u(0) = −3
u  (0) = 2.
Chapter 8
Systems

We are now ready to solve what are called Linear Systems of differential equations.
These have the form
x'(t) = a x(t) + b y(t)    (8.1)
y'(t) = c x(t) + d y(t)    (8.2)
x(0) = x₀    (8.3)
y(0) = y₀    (8.4)

for any numbers a, b, c and d and initial conditions x0 and y0 . The full problem is
called, as usual, an Initial Value Problem or IVP for short. The two initial conditions
are just called the IC’s for the problem to save writing. For example, we might be
interested in the system

x'(t) = −2 x(t) + 3 y(t)
y'(t) = 4 x(t) + 5 y(t)
x(0) = 5
y(0) = −3

Here the IC’s are x(0) = 5 and y(0) = −3. Another sample problem might be the
one below.

x'(t) = 14 x(t) + 5 y(t)
y'(t) = −4 x(t) + 8 y(t)
x(0) = 2
y(0) = 7

We are interested in learning how to solve these problems.


8.1 Finding a Solution

For linear first order problems like u' = 3u and so forth, we have found the solution has the form u(t) = α e^{3t} for some number α. We would then determine the value of α to use by looking at the initial condition. To see what to do with Eqs. 8.1 and 8.2, first let's rewrite the problem in terms of matrices and vectors. In this form, Eqs. 8.1 and 8.2 can be written as

[ x'(t) ; y'(t) ] = [ a  b ; c  d ] [ x(t) ; y(t) ].

The initial conditions Eqs. 8.3 and 8.4 can then be redone in vector form as

[ x(0) ; y(0) ] = [ x₀ ; y₀ ].

8.1.1 Worked Out Examples

Here are some examples of the conversion of a system of two linear differential
equations into matrix–vector form.
Example 8.1.1 Convert
x'(t) = 6 x(t) + 9 y(t)
y'(t) = −10 x(t) + 15 y(t)
x(0) = 8
y(0) = 9

into a matrix–vector system.


Solution The new form is seen to be

[ x'(t) ; y'(t) ] = [ 6  9 ; −10  15 ] [ x(t) ; y(t) ]
[ x(0) ; y(0) ] = [ 8 ; 9 ].

Example 8.1.2 Convert


x'(t) = 2 x(t) + 4 y(t)
y'(t) = − x(t) + 7 y(t)
x(0) = 2
y(0) = −3

into a matrix–vector system.



Solution The new form is seen to be

[ x'(t) ; y'(t) ] = [ 2  4 ; −1  7 ] [ x(t) ; y(t) ]
[ x(0) ; y(0) ] = [ 2 ; −3 ].
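Looking ahead a bit, MatLab's eig command computes the eigenvalues and eigenvectors we are about to need. The sketch below applies it to the coefficient matrix of this example as an illustration; it is not part of the conversion itself.

% Eigenvalues and eigenvectors of the coefficient matrix of Example 8.1.2.
A = [2 4; -1 7];
[V, D] = eig(A)
% The diagonal of D holds the eigenvalues (3 and 6 here);
% the columns of V are corresponding eigenvectors.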

8.1.2 A Judicious Guess

Now that we know how to do this conversion, it seems reasonable to believe that, since a constant times e^{rt} solves a first order linear problem like u' = r u, perhaps a vector times e^{rt} will work here. Let's make this formal. We'll work with a specific system first because numbers are always easier to make sense of in the initial exposure to a technique. So let's look at the problem below

x'(t) = 3 x(t) + 2 y(t)
y'(t) = −4 x(t) + 5 y(t)
x(0) = 2
y(0) = −3

Let's assume the solution has the form V e^{rt} because, by our remarks above, since this is a vector system it seems reasonable to move to using a vector rather than a constant. Let's denote the components of V as follows:

V = [ V₁ ; V₂ ].

Then, it is easy to see that the derivative of V e^{rt} is

( V e^{rt} )' = ( [ V₁ ; V₂ ] e^{rt} )'
             = [ V₁ e^{rt} ; V₂ e^{rt} ]'
             = [ (V₁ e^{rt})' ; (V₂ e^{rt})' ]
             = [ V₁ r e^{rt} ; V₂ r e^{rt} ]
             = r [ V₁ ; V₂ ] e^{rt}
             = r V e^{rt}

Let’s plug in our possible solution into the original problem. That is, we assume the
solution is
 
[ x(t) ; y(t) ] = V e^{rt}.

Hence,
 
[ x'(t) ; y'(t) ] = r V e^{rt}.

When we plug these terms into the matrix–vector form of the problem, we find

r V e^{rt} = [ 3  2 ; −4  5 ] V e^{rt}.

We can rewrite this as


   
r V e^{rt} − [ 3  2 ; −4  5 ] V e^{rt} = [ 0 ; 0 ].

Since one of these terms is a matrix and one is a vector, we need to write all the terms
in terms of matrices if possible. Recall, the two by two identity matrix is
 
1 0
I =
0 1

and I V = V always. Thus, we can rewrite our system as


     
1 0 3 2 0
r Ve − rt
Ve = rt
.
0 1 −4 5 0

Now, factor out the common er t term to give


       
1 0 3 2 0
r V− V er t = .
0 1 −4 5 0

Even though we don’t know yet what values of r will work for this problem, we do
know that the term er t is never zero no matter what value r has. Hence, we can say
that we are looking for a value of r and a vector V so that
     
1 0 3 2 0
r V− V = .
0 1 −4 5 0

For convenience, let the matrix of coefficients determined by our system of differ-
ential equations be denoted by A, i.e.
 
3 2
A= .
−4 5

Then, the equation that r and V must satisfy becomes


 
0
r I V−AV = .
0

Finally, noting the vector V is common, we factor again to get our last equation
   
0
rI−A V = .
0

We can then plug in the value of I and A to get the system of equations that r and V
must satisfy in order for V er t to be a solution.
    
r −3 −2 V1 0
= .
−(−4) r − 5 V2 0

To finish this discussion, note that for any value of r , this is a system of two linear
equations in the two unknowns V1 and V2 . If we choose a value of r for which
det(r I − A) is nonzero, the theory we have so carefully gone over in Sect. 2.4 tells
us the two lines determined by row 1 and row 2 of this system have different slopes.
This means this system of equations has only one solution. Since both lines
cross through the origin, this unique solution must be V1 = 0 and V2 = 0. But, of
course, this tells us the solution is x(t) = 0 and y(t) = 0! We will not be able to
satisfy the initial conditions x(0) = 2 and y(0) = −3 with this solution. So we
must reject any choice of r for which det(r I − A) is nonzero.
This leaves only one choice: the values of r where det (r I − A) = 0. Now, go
back to Sect. 2.10 where we discussed the eigenvalues and eigenvectors of a matrix
A. The values of r where det (r I − A) = 0 are what we called the eigenvalues of
our matrix A and for these values of r , we must find non zero vectors V (non zero
because otherwise, we can’t solve the IC’s!) so that
    
r −3 −2 V1 0
= .
4 r −5 V2 0

Note, as we did in Sect. 2.10, the system above is the same as


    
3 2 V1 V1
=r .
−4 5 V2 V2

Then, for each eigenvalue we find, we should have a solution of the form
   
x(t) V1
= er t .
y(t) V2

In general, for a system of two linear models like this, there are three choices for the
eigenvalues.

• Two real and distinct eigenvalues r1 and r2 with the eigenvectors E 1 and E 2 . This
has been discussed thoroughly. We can now say more about this type of solution.
The two eigenvectors E 1 and E 2 are linearly independent vectors in R^2 and the
two solutions e^{r1 t} and e^{r2 t} are linearly independent functions. Hence, the set of
all possible solutions, which is called the general solution,
 
x(t)
= a E 1 er 1 t + b E 2 er 2 t
y(t)

represents the span of these two linearly independent functions. In fact, these
two linearly independent solutions to the model are the basis vectors of the two
dimensional vector space that consists of the solutions to this model. Note, we are
not saying anything new here, but we are saying it with new terminology and a
higher level of abstraction. Thus, the general solution will be
 
x(t)
= a E 1 er 1 t + b E 2 er 2 t ,
y(t)

where E 1 is the eigenvector for eigenvalue r1 and E 2 is the eigenvector for eigen-
value r2 and a and b are arbitrary real numbers chosen to satisfy the IC’s.
• The eigenvalues are repeated so that r1 = r2 = α for some real number. We are
not yet sure what to do in this case. There are two possibilities:
1. The eigenvalue α, when plugged into the eigenvalue–eigenvector equation

[ α − a   −b ; −c   α − d ] [ V1 ; V2 ] = [ 0 ; 0 ],

behaves as usual: the two rows of this matrix are multiples of one another. Hence, we
use either the top or bottom row to find our choice of nonzero eigenvector E 1 .

For example, if
 
3 1
A=
−1 1

the characteristic equation is (r − 2)^2 = 0 which gives the repeated eigenvalue


α = 2. We then find the eigenvalue–eigenvector equation is
    
2−3 −1 V1 0
= .
1 2−1 V2 0

The top row and the bottom row are multiples and we find E 1 = [1, −1]^T . Note
in this case, the set of all V1 and V2 we can use are all multiples of E 1 . Hence,
this set forms a line through the origin in R^2 . Another way of saying
this is that the set of all possible V1 and V2 here is a one dimensional subspace
of R^2 . We know one solution to our model is E 1 e^{2t} but what is the other one?
2. The other possibility is that A is a multiple of the identity, say A = 2I. Then,
the characteristic equation is the same as the first case: (r − 2)2 . However, the
eigenvalue–eigenvector equation is very different. We find
    
2−2 0 V1 0
= .
0 2−2 V2 0

which is a very strange system as both the top and bottom equation give 0V1 +
0V2 = 0. This says there are no restrictions on the values of V1 and V2 ; they
can be picked independently. So pick V1 = 1 and V2 = 0 to give one choice
of eigenvector: E 1 = [1, 0]^T . Then pick V1 = 0 and V2 = 1 to give a second
choice of eigenvector: E 2 = [0, 1]^T . Another way of looking at this is the set
of all possible V1 and V2 is just R^2 and so we are free to pick any basis of R^2
for our eigenvectors we want. Hence, we might as well pick the simplest one:
E 1 = i and E 2 = j . We actually have two linearly independent solutions to our
model in this case. They are E 1 e^{2t} and E 2 e^{2t} .
• In the last case, the eigenvalues are complex numbers. If we let the eigenvalue be
r = α + βi, note the corresponding eigenvector could be a complex vector. So
let’s write it as V = E + i F where E and F have only real valued components.
Then we know
 
a b
(E + i F) = r (E + i F).
c d

Now take the complex conjugate of each side to get


 
a b
E + i F = r E + i F.
c d

But all the entries of the matrix are real, so complex conjugation does not change
them. The other conjugations then give
 
a b
(E − i F) = r (E − i F).
c d

This says that r = α − βi is also an eigenvalue with eigenvector the complex


conjugate of the eigenvector for r , i.e. E − i F. So eigenvalues and eigenvectors
here occur in complex conjugate pairs and the general complex solution is
 
[ x(t) ; y(t) ] = c1 (E + i F) e^{(α+βi)t} + c2 (E − i F) e^{(α−βi)t}
               = e^{αt} ( c1 (E + i F) e^{iβt} + c2 (E − i F) e^{−iβt} )

where c1 and c2 are complex numbers. Since we are interested in real solutions,
from this general complex solution, we will extract two linearly independent real
solutions which will form the basis for our two dimensional subspace of solutions.
We will return to this case later.
We are now ready for some definitions for Characteristic Equation of the linear
system.
Definition 8.1.1 (The Characteristic Equation of a Linear System of ODEs)
For the system

x  (t) = a x(t) + b y(t)


y  (t) = c x(t) + d y(t)
x(0) = x0
y(0) = y0 ,

the characteristic equation is defined by


 
det r I − A = 0

where A is the coefficient matrix


 
a b
A= .
c d

We can then define the eigenvalue of a linear system of differential equations.


Definition 8.1.2 (The Eigenvalues of a Linear System of ODEs)
The roots of the characteristic equation of the linear system are called its eigenvalues
and any nonzero vector V satisfying
   
0
rIV−A V = .
0

for an eigenvalue r is called an eigenvector associated with the eigenvalue r .


Finally, the general solution of this system can be built from its eigenvalues and
associated eigenvectors. It is pretty straightforward when the two eigenvalues are
real and distinct numbers and a bit more complicated in the other two cases. But
don’t worry, we’ll cover all of it soon enough. Before we go on, let’s do some more
characteristic equation derivations.

8.1.3 Sample Characteristic Equation Derivations

Let’s do some more examples. Here is the first one.


Example 8.1.3 Derive the characteristic equation for the system below

x  (t) = 8 x(t) + 9 y(t)


y  (t) = 3 x(t) − 2 y(t)
x(0) = 12
y(0) = 4

Solution First, note the matrix–vector form is


     
x  (t) 8 9 x(t)
= .
y  (t) 3 −2 y(t)
   
x(0) 12
= .
y(0) 4

The coefficient matrix A is thus


 
8 9
A= .
3 −2

We assume the solution has the form V er t and plug this into the system. This gives
   
8 9 0
r V er t − V er t = .
3 −2 0

Now rewrite using the identity matrix I and factor to obtain


    
8 9 0
rI− V er t = .
3 −2 0

Then, since er t can never be zero no matter what value r is, we find the values of r
and the vectors V we seek satisfy
    
8 9 0
rI− V = .
3 −2 0

Now, if r is chosen so that det (r I − A) = 0, the only solution to this system of two
linear equations in the two unknowns V1 and V2 is V1 = 0 and V2 = 0. This leads
to the solution x(t) = 0 and y(t) = 0 always and this solution does not satisfy the
initial conditions. Hence, we must find values r which give det (r I − A) = 0. The
resulting polynomial is
    
8 9 r −8 −9
det r I − = det
3 −2 −3 r + 2
= (r − 8)(r + 2) − 27 = r 2 − 6r − 43.

This is the characteristic equation of this system.
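It is easy to check a characteristic equation derivation like this numerically. The short Octave sketch below is our own addition, not part of the text: poly(A) returns the coefficients of det(rI − A), so for this example it should return the vector [1 −6 −43], and roots (or eig applied to A) then gives the eigenvalues.

Listing: Checking the characteristic equation (a sketch)
A = [8 9; 3 -2];
p = poly(A)       % characteristic polynomial coefficients: 1  -6  -43
r = roots(p)      % its roots; eig(A) returns the same numbers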

The next one is very similar. We will expect you to be able to do this kind of
derivation also.

Example 8.1.4 Derive the characteristic equation for the system below

x  (t) = −10 x(t) − 7 y(t)


y  (t) = 8 x(t) + 5 y(t)
x(0) = −1
y(0) = −4

Solution We see the matrix–vector form is


     
x  (t) −10 −7 x(t)
= .
y  (t) 8 5 y(t)
   
x(0) −1
= .
y(0) −4

with coefficient matrix A given by


 
−10 −7
A= .
8 5

Assume the solution has the form V er t and plug this into the system giving
   
−10 −7 0
r V er t − V er t = .
8 5 0

Rewriting using the identity matrix I and factoring, we obtain


    
−10 −7 0
rI− Ve =
rt
.
8 5 0

Then, since er t can never be zero no matter what value r is, we find the values of r
and the vectors V we seek satisfy
    
−10 −7 0
rI− V = .
8 5 0

Again, if r is chosen so that det (r I − A) = 0, the only solution to this system of two
linear equations in the two unknowns V1 and V2 is V1 = 0 and V2 = 0. This gives
us the solution x(t) = 0 and y(t) = 0 always and this solution does not satisfy the
initial conditions. Hence, we must find values r which give det (r I − A) = 0. The
resulting polynomial is
    
−10 −7 r + 10 7
det r I − = det
8 5 −8 r − 5
= (r + 10)(r − 5) + 56 = r 2 + 5r + 6.

This is the characteristic equation of this system.

8.1.4 Homework

For each of these problems,


• Write the matrix–vector form.
• Derive the characteristic equation but you don’t have to find the roots.
Exercise 8.1.1

x = 2 x + 3 y
y = 8 x − 2 y
x(0) = 3
y(0) = 5.

Exercise 8.1.2

x  = −4 x + 6 y
y = 9 x + 2 y
x(0) = 4
y(0) = −6.

8.2 Two Distinct Eigenvalues

Next, let’s do the simple case of two distinct real eigenvalues. Before you look at these
calculations, you should review how we found eigenvectors in Sect. 2.10. We worked
out several examples there, however, now let’s put them in this system context. Each
eigenvalue r has a corresponding eigenvector E. Since in this course, we want to
concentrate on the situation where the two roots of the characteristic equation are
distinct real numbers, we will want to find the eigenvector, E 1 , corresponding to
eigenvalue r1 and the eigenvector, E 2 , corresponding to eigenvalue r2 . The general
solution will then be of the form
 
x(t)
= a E 1 er 1 t + b E 2 er 2 t
y(t)

where we will use the IC’s to choose the correct values of a and b. Let’s do a complete
example now. We start with the system

x  (t) = −3 x(t) + 4 y(t)


y  (t) = −1 x(t) + 2 y(t)
x(0) = 2
y(0) = −4

First, note the matrix–vector form is


      
x (t) −3 4 x(t)
= .
y  (t) −1 2 y(t)
   
x(0) 2
= .
y(0) −4

The coefficient matrix A is thus


 
−3 4
A= .
−1 2

The characteristic equation is thus


    
−3 4 r +3 −4
det r I − = det
−1 2 1 r −2
= (r + 3)(r − 2) + 4 = r 2 + r − 2
= (r + 2)(r − 1).

Thus, the eigenvalues of the coefficient matrix A are r1 = −2 and r2 = 1. The general
solution will then be of the form
 
x(t)
= a E 1 e−2t + b E 2 et
y(t)

Next, we find the eigenvectors associated with these eigenvalues.


1. For eigenvalue r1 = −2, substitute the value of this eigenvalue into
 
r +3 −4
1 r −2

This gives
 
1 −4
1 −4

The two rows of this matrix should be multiples of one another. If not, we made
a mistake and we have to go back and find it. Our rows are indeed multiples, so
pick one row to solve for the eigenvector. We need to solve
    
1 −4 v1 0
=
1 −4 v2 0

Picking the top row, we get

v1 − 4 v2 = 0
1
v2 = v1
4
Letting v1 = a, we find the solutions have the form
   
v1 1
=a
v2 1
4

The vector
 
1
E1 =
1/4

is our choice for an eigenvector corresponding to eigenvalue r1 = −2. Thus, the


first solution to our system is
   
x1 (t) 1
= E 1 e−2t = e−2t .
y1 (t) 1/4

2. For eigenvalue r2 = 1, substitute the value of this eigenvalue into


 
r +3 −4
1 r −2

This gives
 
4 −4
1 −1

Again, the two rows of this matrix should be multiples of one another. If not, we
made a mistake and we have to go back and find it. Our rows are indeed multiples,
so pick one row to solve for the eigenvector. We need to solve
    
4 −4 v1 0
=
1 −1 v2 0

Picking the bottom row, we get

v1 − v2 = 0
v2 = v1

Letting v1 = b, we find the solutions have the form


   
v1 1
=b
v2 1

The vector
 
1
E2 =
1

is our choice for an eigenvector corresponding to eigenvalue r2 = 1. Thus, the


second solution to our system is
   
x2 (t) 1
= E2 e =
t
et .
y2 (t) 1

The general solution is therefore


     
x(t) x1 (t) x2 (t)
=a +b
y(t) y1 (t) y2 (t)
   
1 −2t 1
=a e +b et .
1/4 1

Finally, we solve the IVP. Given the IC’s, we find two equations in two unknowns
for a and b:
       
x(0) 2 1 1
= =a e0 + b e0
y(0) −4 1/4 1
 
a+b
= .
(1/4)a + b

This is the usual system

a+b = 2
(1/4)a + b = −4.

This easily solves to give a = 24/5 and b = −14/5. Hence, the solution to this IVP
is
     
x(t) x1 (t) x2 (t)
=a +b
y(t) y1 (t) y2 (t)
   
1 −2t 1
= (24/5) e + (−14/5) et .
1/4 1

This can be rewritten as

x(t) = (24/5) e−2t − (14/5) et


y(t) = (6/5) e−2t − (14/5) et .

Note when t is very large, the only terms that matter are the ones which grow fastest.
Hence, we could say

x(t) ≈ − (14/5) et
y(t) ≈ − (14/5) et .

or in vector form
   
x(t) 1
≈ −(14/5) et .
y(t) 1

This is just a multiple of the eigenvector E 2 ! Note the graph of x(t) and y(t) on an
x–y plane will get closer and closer to the straight line determined by this eigenvector.
So we will call E 2 the dominant eigenvector direction for this system.
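The whole computation we just did by hand can be cross-checked in a few lines of Octave. The sketch below is ours, not the text's: eig returns the eigenvalues and (unit length) eigenvectors, the backslash solve finds the constants that match the initial condition, and we compare the result at one time value against the hand computed formula.

Listing: Checking the worked example numerically (a sketch)
A  = [-3 4; -1 2];
x0 = [2; -4];
[V, D] = eig(A);                    % columns of V are eigenvectors
c  = V \ x0;                        % constants so that V*c matches the IC
t  = 2.0;                           % any time value will do
xt = V * (exp(diag(D)*t) .* c);     % solution at time t built from eig output
xhand = (24/5)*[1; 1/4]*exp(-2*t) + (-14/5)*[1; 1]*exp(t);
disp(norm(xt - xhand))              % should be essentially zero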

8.2.1 Worked Out Solutions

You need some additional practice. Let’s work out a few more. Here is the first one.

Example 8.2.1 For the system below


    
x  (t) −20 12 x(t)
=
y  (t) −13 5 y(t)
   
x(0) −1
=
y(0) 2

• Find the characteristic equation


• Find the general solution
• Solve the IVP

Solution The characteristic equation is


    
1 0 −20 12
det r − =0
0 1 −13 5

or
 
r + 20 −12
0 = det
13 r − 5
= (r + 20)(r − 5) + 156
= r 2 + 15r + 56
= (r + 8)(r + 7)

Hence, eigenvalues, of the characteristic equation are r1 = −8 and r2 = −7. We


need to find the associated eigenvectors for these eigenvalues.
1. For eigenvalue r1 = −8, substitute the value of this eigenvalue into
 
r + 20 −12
13 r − 5

This gives
 
12 −12
13 −13

Again, the two rows of this matrix should be multiples of one another. If not, we
made a mistake and we have to go back and find it. Our rows are indeed multiples,
so pick one row to solve for the eigenvector. We need to solve
    
[ 12 −12 ; 13 −13 ] [ v1 ; v2 ] = [ 0 ; 0 ]

Picking the top row, we get

12v1 − 12 v2 = 0
v2 = v1

Letting v1 = a, we find the solutions have the form


   
v1 1
=a
v2 1

The vector
 
1
E1 =
1

is our choice for an eigenvector corresponding to eigenvalue r1 = −8.


2. For eigenvalue r2 = −7, substitute the value of this eigenvalue into
 
r + 20 −12
13 r − 5

This gives
 
13 −12
13 −12

Again, the two rows of this matrix should be multiples of one another.


Picking the bottom row, we get

13v1 − 12v2 = 0
v2 = (13/12)v1

Letting v1 = b, we find the solutions have the form


   
v1 1
=b
v2 13/12

The vector
 
1
E2 =
13/12

is our choice for an eigenvector corresponding to eigenvalue r2 = −7.



The general solution to our system is thus


     
[ x(t) ; y(t) ] = a [ 1 ; 1 ] e^{−8t} + b [ 1 ; 13/12 ] e^{−7t}

We solve the IVP by finding the a and b that will give the desired initial conditions.
This gives
     
−1 1 1
=a +b
2 1 13/12

or

−1 = a + b
2 = a + (13/12)b

This is easily solved using elimination to give a = −37 and b = 36. The solution
to the IVP is therefore
     
[ x(t) ; y(t) ] = −37 [ 1 ; 1 ] e^{−8t} + 36 [ 1 ; 13/12 ] e^{−7t} = [ −37 e^{−8t} + 36 e^{−7t} ; −37 e^{−8t} + 39 e^{−7t} ]

Note when t is very large, the only terms that matter are the ones which grow fastest
or, in this case, the ones which decay the slowest. Hence, we could say
   
x(t) 1
≈ 36 e−7t .
y(t) (13/12)

This is just a multiple of the eigenvector E 2 ! Note the graph of x(t) and y(t) on an
x–y plane will get closer and closer to the straight line determined by this eigenvector.
So we will call E 2 the dominant eigenvector direction for this system.
We now have all the information needed to analyze the solutions to this system
graphically.

Here is another example in great detail. Again, remember you will have to know
how to do these steps yourselves.

Example 8.2.2 For the system below


    
x  (t) 4 9 x(t)
=
y  (t) −1 −6 y(t)
   
x(0) 4
=
y(0) −2

• Find the characteristic equation


• Find the general solution
• Solve the IVP

Solution The characteristic equation is


    
1 0 4 9
det r − =0
0 1 −1 −6

or
 
r −4 −9
0 = det
1 r +6
= (r − 4)(r + 6) + 9
= r 2 + 2 r − 15
= (r + 5)(r − 3)

Hence, eigenvalues of the characteristic equation are r1 = −5 and r2 = 3. Next,


we find the eigenvectors.
1. For eigenvalue r1 = −5, substitute the value of this eigenvalue into
 
r −4 −9
1 r +6

This gives
 
−9 −9
1 1

We need to solve
    
−9 −9 v1 0
=
1 1 v2 0

Picking the bottom row, we get

v1 + v2 = 0
v2 = −v1

Letting v1 = a, we find the solutions have the form


   
v1 1
=a
v2 −1

The vector
 
1
E1 =
−1

is our choice for an eigenvector corresponding to eigenvalue r1 = −5.


2. For eigenvalue r2 = 3, substitute the value of this eigenvalue into
 
r −4 −9
1 r +6

This gives
 
−1 −9
1 9

This time, we need to solve


    
−1 −9 v1 0
=
1 9 v2 0

Picking the bottom row, we get

v1 + 9 v2 = 0
v2 = −(1/9) v1
Letting v1 = b, we find the solutions have the form
   
v1 1
=b
v2 −1/9

The vector
 
1
−1/9

is our choice for an eigenvector corresponding to eigenvalue r2 = 3.


The general solution to our system is thus
     
x(t) 1 −5t 1
=a e +b e3t
y(t) −1 −1/9

We solve the IVP by finding the a and b that will give the desired initial conditions.
This gives
     
4 1 1
=a +b
−2 −1 −1/9

or

4 = a + b
−2 = −a − (1/9) b
This is easily solved using elimination to give a = 7/4 and b = 9/4. The solution
to the IVP is therefore
     
[ x(t) ; y(t) ] = (7/4) [ 1 ; −1 ] e^{−5t} + (9/4) [ 1 ; −1/9 ] e^{3t} = [ (7/4) e^{−5t} + (9/4) e^{3t} ; −(7/4) e^{−5t} − (1/4) e^{3t} ]

Again, the dominant eigenvector is E 2 .
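As a sanity check, the closed form solution above can be compared against a purely numerical integration. This sketch is our own addition; it uses Octave's lsode (MATLAB users would use ode45) and reports the largest disagreement over the time grid.

Listing: Cross-checking Example 8.2.2 (a sketch)
A  = [4 9; -1 -6];
x0 = [4; -2];
t  = linspace(0, 1, 11);
X  = lsode(@(x, s) A*x, x0, t);            % numerical solution
xh = (7/4)*exp(-5*t) + (9/4)*exp(3*t);     % hand computed x(t)
yh = -(7/4)*exp(-5*t) - (1/4)*exp(3*t);    % hand computed y(t)
disp(max(abs(X(:,1) - xh')))               % both should be tiny
disp(max(abs(X(:,2) - yh')))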

8.2.2 Homework

For these problems:


1. Write matrix, vector form.
2. Find characteristic equation. No derivation needed this time.
3. Find the two eigenvalues.
4. Find the two associated eigenvectors in glorious detail.
5. Write general solution.
6. Solve the IVP.

Exercise 8.2.1

x = 3 x + y
y = 5 x − y
x(0) = 4
y(0) = −6.

The eigenvalues should be −2 and 4.



Exercise 8.2.2

x = x + 4 y
y = 5 x + 2 y
x(0) = 4
y(0) = −5.

The eigenvalues should be −3 and 6.

Exercise 8.2.3

x  = −3 x + y
y  = −4 x + 2 y
x(0) = 1
y(0) = 6.

The eigenvalues should be −2 and 1.

8.2.3 Graphical Analysis

Let’s try to analyze these systems graphically. We are interested in what these solu-
tions look like for many different initial conditions. So let’s look at the problem

x  (t) = −3 x(t) + 4 y(t)


y  (t) = −x(t) + 2 y(t)
x(0) = x0
y(0) = y0

8.2.3.1 Graphing the Nullclines

The set of (x, y) pairs where x  = 0 is called the nullcline for x; similarly, the points
where y  = 0 is the nullcline for y. The x  equation can be set equal to zero to get
−3x + 4y = 0. This is the same as the straight line y = 3/4 x. This straight line
divides the x–y plane into three pieces: the part where x  > 0; the part where x  = 0;
and, the part where x  < 0. In Fig. 8.1, we show the part of the x–y plane where
x  > 0 with one shading and the part where it is negative with another. Similarly, the
y  equation can be set to 0 to give the equation of the line −x + 2y = 0. This gives
the straight line y = 1/2 x. In Fig. 8.2, we show how this line also divides the x–y
plane into three pieces.

Fig. 8.1 Finding where x  < 0 and x  > 0

Fig. 8.2 Finding where y  < 0 and y  > 0

The shaded areas shown in Figs. 8.1 and 8.2 can be combined into Fig. 8.3. In this
figure, we divide the x–y plane into four regions marked with a I, II, III or IV. In each
region, x  and y  are either positive or negative. Hence, each region can be marked
with an ordered pair, (x  ±, y  ±).
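If you would like to see these regions on the computer rather than by hand, a direction field makes the sign pattern of (x', y') visible at a glance. The Octave sketch below is ours, not the text's; the grid size and plotting ranges are arbitrary choices.

Listing: Nullclines and direction field for x' = −3x + 4y, y' = −x + 2y (a sketch)
[x, y] = meshgrid(linspace(-4, 4, 21));
dx = -3*x + 4*y;                 % x' at each grid point
dy = -x + 2*y;                   % y' at each grid point
quiver(x, y, dx, dy); hold on;   % arrows show the (x', y') sign pattern
s = linspace(-4, 4, 2);
plot(s, (3/4)*s, 'r', s, (1/2)*s, 'b');   % x' = 0 and y' = 0 nullclines
hold off;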

8.2.3.2 Homework

For each of these problems,


• Find the x  nullcline and determine the plus and minus regions in the plane.
• Find the y  nullcline and determine the plus and minus regions in the plane.
• Assemble the two nullclines into one picture showing the four regions that result.

Fig. 8.3 Combining the x  and y  algebraic sign regions

Exercise 8.2.4

x = 2 x + 3 y
y = 8 x − 2 y
x(0) = 3
y(0) = 5.

Exercise 8.2.5

x  = −4 x + 6 y
y = 9 x + 2 y
x(0) = 4
y(0) = −6.

8.2.3.3 Graphing the Eigenvector Lines

Now we add the eigenvector lines. In Sect. 8.2, we found that this system has eigen-
values r1 = −2 and r2 = 1 with associated eigenvectors
   
1 1
E1 = , E2 = .
1/4 1

Recall a vector V with components a and b,


 
a
V =
b

determines a straight line with slope b/a. Hence, these eigenvectors each determine
a straight line. The E 1 line has slope 1/4 and the E 2 line has slope 1. We can graph
these two lines overlaid on the graph shown in Fig. 8.3.

8.2.3.4 Homework

For each of these problems,


• Find the x  nullcline and determine the plus and minus regions in the plane.
• Find the y  nullcline and determine the plus and minus regions in the plane.
• Assemble the two nullclines into one picture showing the four regions that result.
• Draw the Eigenvector lines on the same picture (Fig. 8.4).

Fig. 8.4 Drawing the nullclines and the eigenvector lines on the same graph

Exercise 8.2.6

x = 2 x + 3 y
y = 8 x − 2 y
x(0) = 3
y(0) = 5.

Exercise 8.2.7

x  = −4 x + 6 y
y = 9 x + 2 y
x(0) = 4
y(0) = −6.

8.2.3.5 Graphing Region I Trajectories

In each of the four regions, we know the algebraic signs of the derivatives x  and
y  . If we are given an initial condition (x0 , y0 ) which is in one of these regions, we
can use this information to draw the set of points (x(t), y(t)) corresponding to the
solution to our system
x  (t) = −3 x(t) + 4 y(t)
y  (t) = −x(t) + 2 y(t)
x(0) = x0
y(0) = y0 .

This set of points is called the trajectory corresponding to this solution. The first
point on the trajectory is the initial point (x0 , y0 ) and the rest of the points follow
from the solution
     
x(t) 1 1
=a e−2t + b et .
y(t) 1/4 1

where a and b satisfy the system of equations

x0 = a + b
y0 = (1/4)a + b

This can be rewritten as

x(t) = a e−2t + b et
y(t) = (1/4)a e−2t + b et .

Hence,

dy/dx = y'(t)/x'(t) = ( −(1/2) a e^{−2t} + b e^t ) / ( −2 a e^{−2t} + b e^t )

when t is large, as long as b is not zero, the terms involving e^{−2t} are negligible and
so we have

dy/dx = y'(t)/x'(t) ≈ ( b e^t ) / ( b e^t ) = 1.

Hence, when t is large, the slopes of the trajectory approach 1, the slope of E 2 . So,
we can conclude that for large t, as long as b is not zero, the trajectory either parallels
the line determined by E 2 or approaches it asymptotically.
Of course, if an initial condition is chosen that lies on the line determined by E 1 ,
then a little thought will tell you that b is zero in this case and we have

dy/dx = y'(t)/x'(t) = ( −(1/2) a e^{−2t} ) / ( −2 a e^{−2t} ) = 1/4, the slope of E 1 .

In this case,

x(t) = a e−2t
y(t) = (1/4)a e−2t .

and so the coordinates (x(t), y(t)) go to (0, 0) along the line determined by E 1 .
We conclude that unless an initial condition is chosen exactly on the line deter-
mined by E 1 , all trajectories eventually begin to either parallel the line determined by
E 2 or approach it asymptotically. If the initial condition is chosen on the line deter-
mined by E 1 , then the trajectories stay on this line and approach the origin where
they stop as that is a place where both x  and y  become 0. In Fig. 8.5, we show three
trajectories which begin in Region I. They all have a (+, +) sign pattern for x  and
y  , so the x and y components should both increase. We draw the trajectories with
the concavity as shown because that is the only way they can smoothly approach the
eigenvector line E 2 . We show this in Fig. 8.5.
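You can also let the computer draw trajectories directly from the general solution. The sketch below is our own illustration: the three initial conditions are arbitrary points chosen to lie in Region I, and each trajectory is built from the formula a E 1 e^{−2t} + b E 2 e^{t} after solving for a and b.

Listing: Drawing Region I trajectories (a sketch)
E = [1 1; 1/4 1];                     % columns are E1 and E2
t = linspace(0, 2, 101);
hold on;
for x0 = [1 2 3; 2 3 5]               % three Region I initial conditions (our choice)
  c = E \ x0;                         % a and b for this initial condition
  traj = c(1)*E(:,1)*exp(-2*t) + c(2)*E(:,2)*exp(t);
  plot(traj(1,:), traj(2,:));
end
s = linspace(-1, 8, 2);
plot(s, (1/4)*s, 'k--', s, s, 'k--'); % the E1 and E2 eigenvector lines
hold off;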

Fig. 8.5 Trajectories in Region I

8.2.3.6 Can Trajectories Cross?

Is it possible for two trajectories to cross? Consider the trajectories shown in Fig. 8.6.
These two trajectories cross at some point. The two trajectories correspond to dif-
ferent initial conditions which means that the a and b associated with them will be
different. Further, these initial conditions don’t start on eigenvector E 1 or eigenvec-
tor E 2 , so the a and b values for both trajectories will be non zero. If we label these
trajectories by (x1 , y1 ) and (x2 , y2 ), we see

x1 (t) = a1 e−2t + b1 et
y1 (t) = (1/4)a1 e−2t + b1 et .

and

x2 (t) = a2 e−2t + b2 et
y2 (t) = (1/4)a2 e−2t + b2 et .

(Figure: the nullclines, sign regions, and eigenvector lines, with two Region I trajectories drawn so that they cross; the text explains why this is not possible.)
Fig. 8.6 Combining the x  and y  algebraic sign regions

Since we assume they cross, there has to be a time point, t ∗ , so that (x1 (t ∗ ), y1 (t ∗ ))
and (x2 (t ∗ ), y2 (t ∗ )) match. This means, using vector notation,
     
x1 (t) 1 1
= a1 e−2t + b1 et ,
y1 (t) 1/4 1
     
x2 (t) 1 1
= a2 e−2t + b2 et .
y2 (t) 1/4 1

Setting these two equal at t ∗ , then gives


       
1 ∗ 1 ∗ 1 ∗ 1 ∗
a1 e−2t + b1 e t = a2 e−2t + b2 et .
1/4 1 1/4 1

This is a bit messy, so for convenience, let the number e^{t*} be denoted by U and the
number e^{−2t*} be denoted by V . Then, we can rewrite as
       
1 1 1 1
a1 V + b1 U = a2 V + b2 U.
1/4 1 1/4 1

Next, we can combine like vectors to find


   
1 1
(a1 − a2 )V = (b2 − b1 )U .
1/4 1

No matter what the values of a1 , a2 , b1 and b2 , this tells us that


   
1 1
= a multiple of .
1/4 1

This is clearly not possible, so we have to conclude that trajectories can’t cross. We
can do this sort of analysis for trajectories that start in any region, whether it is I, II, III
or IV. Further, a similar argument shows that a trajectory can’t cross an eigenvector
line as if it did, the argument above would lead us to the conclusion that E 1 is a
multiple of E 2 , which it is not.
We can state the results here as formal rules for drawing trajectories.

Theorem 8.2.1 (Trajectory Drawing Rules)


Given the system

x  (t) = a x(t) + b y(t)


y  (t) = c x(t) + d y(t)
x(0) = x0
y(0) = y0 ,

assume the eigenvalues r1 and r2 are different with either both negative or one
negative and one positive. Let E 1 and E 2 be the associated eigenvectors. Then, the
trajectories of this system corresponding to different initial conditions can not cross
each other. In particular, trajectories can not cross eigenvector lines.

8.2.3.7 Graphing Region II Trajectories

In region II, trajectories start where x  < 0 and y  > 0. Hence, the x values must
decrease and the y values, increase in this region. We draw the trajectory in this way,
making sure it curves in such a way that it has no corners or kinks, until it hits the
nullcline x  = 0. At that point, the trajectory moves into region I. Now x  > 0 and
y  > 0, so the trajectory moves upward along the eigenvector E 2 line like we showed
in the Region I trajectories. We show this in Fig. 8.7. Note although the trajectories
seem to overlap near the E 2 line, they actually do not because trajectories can not
cross, as was explained in Sect. 8.2.3.6.

Fig. 8.7 Trajectories in Region II

8.2.3.8 Graphing Region III Trajectories

Next, we examine trajectories that begin in Region III. Here x  and y  are negative,
so the x and y values will decrease and the trajectories will approach the dominant
eigenvector E 2 line from the right side as is shown in Fig. 8.8. The initial condition
that starts in Region III above the eigenvector E 2 line will move towards the y  = 0
line following x  < 0 and y  < 0 until it hits the line x  = 0 using x  < 0 and y  > 0.
Then it moves upward towards the eigenvector E 2 line as shown. It is easier to see
this in a magnified view as shown in Fig. 8.9.

8.2.3.9 Graphing Region IV Trajectories

Finally, we examine trajectories that begin in region IV. Here x  is positive and y  is
negative, so the x values will grow and y values will decrease. The trajectories will
behave in this manner until they intersect the x  = 0 nullcline. Then, they will cross

Fig. 8.8 Region III trajectories

into Region III and approach the dominant eigenvector E 2 line from the left side as
is shown in Fig. 8.10.

8.2.3.10 The Combined Trajectories

In Fig. 8.11, we show all the region trajectories on one plot. We can draw more, but
these should be enough to give you an idea of how to draw them. In addition, there
is a type of trajectory we haven’t drawn yet. Recall, the general solution is
     
x1 (t) 1 −2t 1
=a e +b et .
y1 (t) 1/4 1

If an initial condition was chosen to lie on eigenvector E 1 line, then b = 0. Hence,


for these initial conditions, we have
   
x1 (t) 1
=a e−2t .
y1 (t) 1/4

Fig. 8.9 A magnified Region III trajectory

Thus, these trajectories start somewhere on the eigenvector E 1 line and then as t
increases, x(t) and y(t) go to (0, 0) along this eigenvector. You can easily imagine
these trajectories by placing a dot on the E 1 line with an arrow pointing towards the
origin.
We can do this sort of qualitative analysis for the three cases:
• One eigenvalue negative and one eigenvalue positive: example r1 = −2 and r2 = 1
which we have just completed.
• Both eigenvalues negative: example r1 = −2 and r2 = −1 which we have not
done.
• Both eigenvalues positive: example r1 = 1 and r2 = 2 which we have not done.
• In each case, we have two eigenvectors E1 and E2 . The way we label our eigenval-
ues will always make most trajectories approach the E2 line as t increases because
r2 is always the largest eigenvalue.

Fig. 8.10 Region IV trajectories

8.2.3.11 Mixed Sign Eigenvalues

Here we have one negative eigenvalue and one positive eigenvalue. The positive one
is the dominant one: example, r1 = −2 and r2 = 1 so the dominant eigenvalue is
r2 = 1.
• Trajectories that start on the E1 line go towards (0, 0) along that line.
• Trajectories that start on the E2 line move outward along that line.
• All other ICs give trajectories that move outward from (0, 0) and approach the
dominant eigenvector line, the E2 line as t increases.
• The (+, +), (+, −), (−, +) and (−, −) regions tells us the details of how this is
done. We find these regions using the nullcline analysis.

Fig. 8.11 Trajectories in all regions

8.2.3.12 Two Negative Eigenvalues

Here we have two negative eigenvalues. The least negative one is the dominant one:
example, r1 = −2 and r2 = −1 so the dominant eigenvalue is r2 = −1.
• Trajectories that start on the E1 line go towards (0, 0) along that line.
• Trajectories that start on the E2 line go towards (0, 0) along that line.
• All other ICs give trajectories move towards (0, 0) and approach the dominant
eigenvector line, the E2 line as t increases.
• The (+, +), (+, −), (−, +) and (−, −) regions tells us the details of how this is
done. We find these regions using the nullcline analysis.

8.2.3.13 Two Positive Eigenvalues

Now we have two positive eigenvalues. The positive one is the dominant one: exam-
ple, r1 = 2 and r2 = 3 so the dominant eigenvalue is r2 = 3.
• Trajectories that start on the E1 line move outward along that line.
• Trajectories that start on the E2 line move outward along that line.

• All other ICs give trajectories move outward and approach the dominant
eigenvector line, the E2 line as t increases. This case where both eigenvalues
are positive is the hardest one to draw and many times these trajectories become
parallel to the dominant line rather than approaching it.
• The (+, +), (+, −), (−, +) and (−, −) regions tells us the details of how this is
done. We find these regions using the nullcline analysis.

8.2.3.14 Examples

Example 8.2.3 Do the phase plane analysis for


    
x  (t) −20 12 x(t)
=
y  (t) −13 5 y(t)
   
x(0) −1
=
y(0) 2

Solution • The characteristic equation is


  
−20 12
det r I − = 0 ⇒ (r + 8)(r + 7) = 0.
−13 5

• Hence, eigenvalues or roots of the characteristic equation are r1 = −8 and


r2 = −7.
• The vector
 
1
E1 =
1

is our choice for an eigenvector corresponding to eigenvalue r1 = −8.


• The vector
 
1
E2 = 13
12

is our choice for an eigenvector corresponding to eigenvalue r2 = −7.


• Since both eigenvalues are negative we have r2 = −7 is the dominant one.
• Trajectories that start on the E1 line go towards (0, 0) along that line.
• Trajectories that start on the E2 line go towards (0, 0) along that line.
• All other ICs give trajectories that move towards (0, 0) and approach the dominant
eigenvector line, the E2 line as t increases.
• The (+, +), (+, −), (−, +) and (−, −) regions tells us the details of how this
is done. We find these regions using the nullcline analysis. Here x  = 0 gives
−20x + 12y = 0 or y = 20/12x while y  = 0 gives −13x + 5y = 0 or
y = 13/5x.
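The eigenvalue and eigenvector claims in this solution are easy to confirm numerically. The following lines are our own check, not part of the text; since eig returns unit length eigenvectors, we rescale each column so its first entry is 1 before comparing with the hand computed vectors.

Listing: Checking Example 8.2.3 (a sketch)
A = [-20 12; -13 5];
[V, D] = eig(A);
diag(D)               % the eigenvalues -8 and -7 (possibly listed in the other order)
V(:,1) / V(1,1)       % rescaled columns should match [1; 1] and [1; 13/12],
V(:,2) / V(1,2)       %   again possibly in the other order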

Example 8.2.4 Do the phase plane analysis for


    
x  (t) 4 9 x(t)
=
y  (t) −1 −6 y(t)
   
x(0) 4
=
y(0) −2

Solution • The characteristic equation is


  
4 9
det r I − = 0 ⇒ (r + 5)(r − 3) = 0.
−1 −6

• Hence, eigenvalues or roots of the characteristic equation are r1 = −5 and


r2 = 3.
• The vector
 
1
E1 =
−1

is our choice for an eigenvector corresponding to eigenvalue r1 = −5.


• The vector
 
1
E2 =
− 19

is our choice for an eigenvector corresponding to eigenvalue r2 = 3.


• Since one eigenvalue is positive and one is negative, we have r2 = 3 is the dominant
one.
• Trajectories that start on the E1 line go towards (0, 0) along that line.
• Trajectories that start on the E2 line move outward from (0, 0) along that line.
• All other ICs give trajectories that move outward from (0, 0) and approach the
dominant eigenvector line, the E2 line as t increases.
• The (+, +), (+, −), (−, +) and (−, −) regions tells us the details of how this
is done. We find these regions using the nullcline analysis. Here x  = 0 gives
4x + 9y = 0 or y = −4/9x while y  = 0 gives −x − 6y = 0 or y = −1/6x.

Example 8.2.5 Do the phase plane analysis for


    
x  (t) 4 −2 x(t)
=
y  (t) 3 −1 y(t)
   
x(0) 14
=
y(0) −22

Solution • The characteristic equation is


  
4 −2
det r I − = 0 ⇒ (r − 1)(r − 2) = 0.
3 −1

• Hence, eigenvalues or roots of the characteristic equation are r1 = 1 and r2 = 2.


• The vector
 
1
E1 = 3
2

is our choice for an eigenvector corresponding to eigenvalue r1 = 1.


• The vector
 
1
E2 =
1

is our choice for an eigenvector corresponding to eigenvalue r2 = 2.


• Since both eigenvalues are positive, the larger one, r2 = 2, is the dominant one.
• Trajectories that start on the E1 line move outward from (0, 0) along that line.
• Trajectories that start on the E2 line move outward from (0, 0) along that line.
• All other ICs give trajectories that move outward from (0, 0) and approach the
dominant eigenvector line, the E2 line as t increases.
• The (+, +), (+, −), (−, +) and (−, −) regions tells us the details of how this
is done. We find these regions using the nullcline analysis. Here x  = 0 gives
4x − 2y = 0 or y = 2x while y  = 0 gives 3x − y = 0 or y = 3x.

Finally, here is how we would work out a problem by hand in Figs. 8.12, 8.13 and
8.14; note the wonderful handwriting displayed on these pages.

8.2.3.15 Homework

You are now ready to do some problems on your own. For the problems below

• Find the characteristic equation


• Find the general solution
• Solve the IVP
• On the same x–y graph,
1. draw the x  = 0 line
2. draw the y  = 0 line
3. draw the eigenvector one line

Fig. 8.12 Example x  = 4x − 2y, y  = 3x − y, Page 1

Fig. 8.13 Example x  = 4x − 2y, y  = 3x − y, Page 2



Fig. 8.14 Example x  = 4x − 2y, y  = 3x − y, Page 3

4. draw the eigenvector two line


5. divide the x–y into four regions corresponding to the algebraic signs of x  and y 
6. draw the trajectories of enough solutions for various initial conditions to create
the phase plane portrait

Exercise 8.2.8
    
x  (t) 1 3 x(t)
=
y  (t) 3 1 y(t)
   
x(0) −3
= .
y(0) 1

Exercise 8.2.9
    
x  (t) 3 12 x(t)
=
y  (t) 2 1 y(t)
   
x(0) 6
= .
y(0) 1

Exercise 8.2.10
    
x  (t) −1 1 x(t)
=
y  (t) −2 −4 y(t)
   
x(0) 3
= .
y(0) 8

Exercise 8.2.11
    
x  (t) 3 4 x(t)
=
y  (t) −7 −8 y(t)
   
x(0) −2
= .
y(0) 4

Exercise 8.2.12
    
x  (t) −1 1 x(t)
=
y  (t) −3 −5 y(t)
   
x(0) 2
= .
y(0) −4

Exercise 8.2.13
    
x  (t) −5 2 x(t)
=
y  (t) −4 1 y(t)
   
x(0) 21
= .
y(0) 5

8.2.3.16 Adding Inputs

Now consider the sample problem

x  (t) = −2 x(t) + 3 y(t) + u


y  (t) = 4 x(t) + 5 y(t) + v
x(0) = 1
y(0) = 2

for some constants u and v. In matrix–vector form, we have


      
x(t) −2 3 x(t) u
= +
y(t) 4 5 y(t) v

We call the vector whose components are u and v, the inputs to this model. In general,
if the external inputs are both zero, we call the model a homogeneous model and
otherwise, it is a nonhomogeneous model. Now we don’t know how to solve this, but
let’s try a guess. Let’s assume there is a solution xinputs (t) = x ∗ and yinputs (t) = y ∗ ,
where x ∗ and y ∗ are constants. Then,
   ∗   
[ x'_inputs(t) ; y'_inputs(t) ] = [ x* ; y* ]' = [ 0 ; 0 ]

Now plugging this assumed solution into the original model, we find

[ 0 ; 0 ] = [ −2 3 ; 4 5 ] [ x* ; y* ] + [ u ; v ]

Since det ( A) = −22 which is not zero, we know A−1 exists. Manipulating a bit,
our original equation becomes
   ∗  
−2 3 x u
=−
4 5 y∗ v

and so multiplying both sides by A−1 , we find


 ∗  −1  
x −2 3 u
= −
y∗ 4 5 v

This is easy to solve as we can calculate


 −1  
−2 3 −1 5 −3
=
4 5 22 −4 −2

We conclude a solution to the model is


   ∗   
xinputs x −1 5 −3 u
= ∗ =−
yinputs y 22 −4 −2 v

for any inputs u and v. For this sample problem, the characteristic equation is

det (r I − A) = r 2 − 3r − 22

which has roots


r = ( 3 ± √( 9 − 4(−22) ) ) / 2 = 3/2 ± √97 / 2

The eigenvalues are approximately r1 = −3.42 and r2 = 6.42. The associated


eigenvectors can easily be found to be
  
E 1 = [ 1.0 ; −0.47 ] ,   E 2 = [ 1.0 ; 2.81 ]

and the general solution to this model with no inputs would be


 
xno input (t)
= a E 1 e−3.42t + b E 2 e6.42t
yno input (t)

for arbitrary a and b. A little thought then shows that adding [xno input (t), yno input (t)]T
to the solution [xinputs (t), yinputs (t)]T will always solve the model with inputs. So
the most general solution to the model with constant inputs must be of the form
     
x(t) x (t) x (t)
= no input + inputs
y(t) yno input (t) yinputs (t)
 
u
= a E 1 e−3.42t + b E 2 e6.42t − A−1
v

where A is the coefficient matrix of our model. The solution [xno input (t), yno input (t)]T
occurs so often it is called the homogeneous solution to the model (who knows why!)
and the solution [xinputs (t), yinputs (t)]T because it actually works for the model
with these particular inputs, is called the particular solution. To save subscript-
ing, we label the homogeneous solution [x h (t), yh (t)]T and the particular solution
[x p (t), y p (t)]T . Finally, any model with inputs that are not zero is called a
nonhomogeneous model. We are thus ready for a definition.

Definition 8.2.1 (The Nonhomogeneous Model)


Any solution to the general non-homogeneous model
      
x(t) −2 3 x(t) f (t)
= +
y(t) 4 5 y(t) g(t)

for input functions f (t) and g(t) is called the particular solution and is labeled
[x p (t), y p (t)]T . The model with no inputs is called the homogeneous model and its
solutions are called homogeneous solutions and labeled [x h (t), yh (t)]T . The general
solution to the model with nonzero inputs is then
     
x(t) x (t) x (t)
= h + p
y(t) yh (t) y p (t)

For a linear model, from our discussions above, we know how to find a complete
solution.

8.2.3.17 Worked Out Example

Example 8.2.6
      
x  (t) 2 −5 x(t) 2
= +
y  (t) 4 −7 y(t) 1
   
x(0) 1
=
y(0) 5

Solution It is straightforward to find the eigenvalues here are r1 = −3 and r2 = −2


with corresponding eigenvectors
   
1 1.0
E1 = , E2 =
1 0.8

The particular solution is


[ x_p ; y_p ] = − [ 2 −5 ; 4 −7 ]^{−1} [ 2 ; 1 ] = − (1/(−14 + 20)) [ −7 5 ; −4 2 ] [ 2 ; 1 ] = [ 3/2 ; 1 ]

Note, this particular solution is biologically reasonable if x and y are proteins and
the dynamics represent their interaction since x p and y p are positive.
The homogeneous solution is
 
x h (t)
= a E 1 e−3t + b E 2 e−2t
yh (t)

and the general solution is thus


       
x(t) 1 −3t 1 −2t 9/6
=a e +b e +
y(t) 1 0.8 1

Finally, to solve the IVP, we know a and b must satisfy


[ 1 ; 5 ] = a [ 1 ; 1 ] + b [ 1.0 ; 0.8 ] + [ 3/2 ; 1 ]

which is two equations in two unknowns which we solve in the usual way. We find

1 = a + b + 3/2
5 = a + 0.8b + 1

or
−1/2 = a + b
4 = a + 0.8b

which implies a = 22 and b = −45/2. The solution is then


       
[ x(t) ; y(t) ] = 22 [ 1 ; 1 ] e^{−3t} − (45/2) [ 1.0 ; 0.8 ] e^{−2t} + [ 3/2 ; 1 ]
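Here is a quick Octave cross-check of this nonhomogeneous example; it is our own sketch and not part of the text. The particular solution is just −A^{−1} times the input vector, the homogeneous constants come from a backslash solve against x(0) − x_p, and the reconstructed solution is compared with the hand computed one at a single time.

Listing: Checking the nonhomogeneous example (a sketch)
A  = [2 -5; 4 -7];  f = [2; 1];  x0 = [1; 5];
xp = -(A\f)                     % particular solution, should be [3/2; 1]
[V, D] = eig(A);                % eigenvalues -3 and -2
c  = V \ (x0 - xp);             % constants for the homogeneous part
t  = 1.5;
xt = V*(exp(diag(D)*t).*c) + xp;
xhand = 22*[1; 1]*exp(-3*t) - (45/2)*[1.0; 0.8]*exp(-2*t) + [3/2; 1];
disp(norm(xt - xhand))          % essentially zero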

8.2.3.18 Homework

Solve these IVP problems showing all details.

Exercise 8.2.14
      
x  (t) 1 3 x(t) 3
= +
y  (t) 3 1 y(t) 5
   
x(0) −3
= .
y(0) 1

Exercise 8.2.15
      
x  (t) 3 12 x(t) 6
= +
y  (t) 2 1 y(t) 8
   
x(0) 6
= .
y(0) 1

Exercise 8.2.16
      
x  (t) −1 1 x(t) 1
= +
y  (t) −2 −4 y(t) 10
   
x(0) 3
= .
y(0) 8

Exercise 8.2.17
      
x  (t) 3 4 x(t) 0.3
= +
y  (t) −7 −8 y(t) 6
   
x(0) −2
= .
y(0) 4

Exercise 8.2.18
      
x  (t) −1 1 x(t) 7
= +
y  (t) −3 −5 y(t) 0.5
   
x(0) 2
= .
y(0) −4

Exercise 8.2.19
      
x  (t) −5 2 x(t) 23
= +
y  (t) −4 1 y(t) 11
   
x(0) 21
= .
y(0) 5

8.3 Repeated Eigenvalues

In this case, the two roots of the characteristic equation are both the same real value.
For us, there are two cases where this can happen.

8.3.1 The Repeated Eigenvalue Has Two Linearly


Independent Eigenvectors

Let the matrix A be a multiple of the identity. For example, we have the linear system

x  (t) = −2 x(t) + 0 y(t)


y  (t) = 0 x(t) − 2 y(t)
x(0) = 2
y(0) = 6

which can clearly be written more succinctly as

x  = −2x; x(0) = 2 =⇒ x(t) = 2e−2t


y  = −2y; y(0) = 6 =⇒ y(t) = 6e−2t

The characteristic equation is then (r + 2)2 = 0 with repeated roots r = −2. When
we solve the eigenvalue–eigenvector equation, we find when we substitute in the
eigenvalue r = −2 that we must solve
    
00 V1 0
=
00 V2 0

which is a strange looking system of equations which we have not seen before. The
way to interpret this is to go back to looking at it as two equations in the unknowns
V1 and V2 . We have

0 V1 + 0 V2 = 0
0 V1 + 0 V2 = 0

As usual both rows of the eigenvalue–eigenvector equation are the same and so
choosing the top row to work with, the equation says there are no constraints on the
values of V1 and V2 . Hence, any nonzero vector [a, b]T will work. This gives us the
two parameter family
       
V1 a 1 0
= a +b .
V2 b 0 1

We choose as the first eigenvector E 1 = [1, 0]T and as the second eigenvector
E 2 = [0, 1]T . Hence, the two linearly independent solutions to this system are
E 1 e−2t and E 2 e−2t with the general solution
       −2t 
x(t) 1 −2t 0 −2t Ae
=A e +B e =
y(t) 0 1 Be−2t

Using the initial conditions A = 2 and B = 6, we get the solution we had before

[ x(t) ; y(t) ] = [ 2 e^{−2t} ; 6 e^{−2t} ]

8.3.2 The Repeated Eigenvalue Has only One Eigenvector

Let’s start with an example. Consider the model

x  (t) = 2 x(t) − y(t)


y  (t) = x(t) + 4 y(t)
x(0) = 2
y(0) = 5

The characteristic equation is r^2 − 6r + 9 = 0 which gives the repeated eigenvalue 3. The


eigenvalue equation for this root leads to this system to solve for nonzero V .
    
3−2 1 V1 0
=
−1 3 − 4 V2 0

This reduces to the system


    
1 1 V1 0
=
−1 −1 V2 0

Clearly, these two rows are equivalent and hence we only need to choose one to solve
for V2 in terms of V1 . The first row gives V1 + V2 = 0. Letting V1 = a, we find
V2 = −a. Hence, the eigenvectors have the form
 
1
V =a
−1

Hence, we choose as our eigenvector



1
E= .
−1

Our first solution is thus


 
x1 (t)
= E e3t .
y1 (t)

However, the eigenvalue–eigenvector equation we solve here does not allow us to


find two linearly independent eigenvectors corresponding to the eigenvalue r = 3.
However, we know there is another linearly independent solution. The first solution
is Ee3t and from our experiences with repeated root second order models, we suspect
the second solution should have te3t in it. Let’s try as the second solution,
 
x2 (t)  
= F + Gt e3t .
y2 (t)

and find conditions on the vectors F and G that make this true. If this is a solution,
then we must have
    
x2 (t) x2 (t)
= A
y2 (t) y2 (t)

Hence, by direct calculation


   
A F + Gt e3t = 3F + G e3t + 3t G e3t

Rewriting, we obtain
   
A F e3t + A G t e3t = 3 F + G e3t + 3t G e3t

Equating the coefficients of the terms e3t and te3t , we find

AG =3G
A F = 3 F + G.

The solution to the first equation is simply G = E as it is a restatement of the


eigenvalue–eigenvector equation for the eigenvalue 3. We can then rewrite the second
equation as
 
3I − A F = −E

which can easily be solved for F. The next linearly independent solution therefore
has the form (F + Et) e3t where F solves the system
    
1 1 F1 −1
= −E =
−1 −1 F2 1

From the first row, we find F1 + F2 = −1 so letting F1 = b, we have F2 = −1 − b.


Hence the most general solution is
       
b 0 1 0
= +b = + b E.
−1 − b −1 −1 −1

The contribution to the general solution from b E can be lumped in with the first
solution Ee3t as we have discussed, so we only need choose
 
0
F= .
−1

The general real solution is therefore


       
[ x(t) ; y(t) ] = a [ 1 ; −1 ] e^{3t} + b ( [ 0 ; −1 ] + t [ 1 ; −1 ] ) e^{3t} = [ a + bt ; −a − b − bt ] e^{3t}

Now apply the initial conditions to obtain


   
2 a
=
5 −a − b

Thus, a = 2 and b = −7. The solution is therefore


   
x(t) 2 − 7t 3t
= e
y(t) 5 + 7t

Note for large t, the (x, y) trajectory is essentially parallel to the eigenvector line for
E which has slope −1. Hence, if we scale by e3t to obtain
   
x(t)/e3t 2 − 7t
=
y(t)/e3t 5 + 7t

the phase plane trajectories will look like a series of parallel lines. If we keep in the
e3t factor, the phase plane trajectories will exhibit some curvature. Without scaling, a
typical phase plane plot for multiple trajectories looks like what we see in Fig. 8.15.
If we do the scaling, we see the parallel lines as we mentioned. This is shown in
Fig. 8.16.
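Before trusting a repeated eigenvalue solution, it is worth verifying that it really satisfies the system. The check below is ours, not the text's: we compute the derivatives of x(t) = (2 − 7t)e^{3t} and y(t) = (5 + 7t)e^{3t} with the product rule and confirm they agree with 2x − y and x + 4y.

Listing: Verifying the repeated eigenvalue solution (a sketch)
t  = linspace(0, 1, 201);
x  = (2 - 7*t).*exp(3*t);
y  = (5 + 7*t).*exp(3*t);
xp = (-1 - 21*t).*exp(3*t);     % product rule derivative of x
yp = (22 + 21*t).*exp(3*t);     % product rule derivative of y
disp(max(abs(xp - (2*x - y))))  % zero up to roundoff
disp(max(abs(yp - (x + 4*y))))  % zero up to roundoff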

Fig. 8.15 A phase plane plot

Fig. 8.16 A scaled phase


plane plot

From our discussions for this case of repeated eigenvalues, we can see the general
rule that if r = α is a repeated eigenvalue with only one eigenvector E, the two
linearly independent solutions are
 
x1 (t)
= E eαt
y1 (t)
 
x2 (t)  
= F + Et eαt .
y2 (t)

where F is a solution to the system


 
α I − A F = −E.

8.3.2.1 Homework

We can now do some problems. Find the solution to the following models.

Exercise 8.3.1

x  (t) = 3x(t) + y(t)


y  (t) = −x(t) + y(t)
x(0) = 2
y(0) = 3.

Exercise 8.3.2

x  (t) = 5 x(t) + 4 y(t)


y  (t) = −16 x(t) − 11 y(t)
x(0) = 5
y(0) = −6.

Exercise 8.3.3

x  (t) = 5 x(t) − 9 y(t)


y  (t) = 4 x(t) − 7 y(t)
x(0) = 10
y(0) = −20.

Exercise 8.3.4

x  (t) = 2 x(t) + y(t)


y  (t) = −4 x(t) + 6 y(t)
x(0) = −4
y(0) = −12.

We still have to determine the direction of motion in these trajectories. If we need


this, we do the usual nullcline analysis to get the algebraic sign pairs for (x  , y  ) as
usual.

8.4 Complex Eigenvalues

Let’s begin with a theoretical analysis for a change of pace. If the real valued matrix
A has a complex eigenvalue r = α + iβ, then there is a nonzero vector G so that

AG = (α + iβ)G.

Now take the complex conjugate of both sides to find

Ā Ḡ = (α − iβ) Ḡ.

However, since A has real entries, its complex conjugate Ā is simply A back. Thus,
after taking complex conjugates, we find

A Ḡ = (α − iβ) Ḡ

and we conclude that if α + iβ is an eigenvalue of A with eigenvector G, then the
eigenvalue α − iβ has eigenvector Ḡ, the complex conjugate of G. Hence, letting E be the real part of G and F

eigenvalue α − iβ has eigenvector G. Hence, letting E be the real part of G and F
be the imaginary part, we see E + i F is the eigenvector for α + iβ and E − i F is
the eigenvector for α − iβ.

8.4.1 The General Real and Complex Solution

We can write down the general complex solution immediately.


 
φ(t)
= c1 (E + i F) e(α+iβ)t + c2 (E − i F) e(α−iβ)t
ψ(t)

for arbitrary complex numbers c1 and c2 . We can reorganize this solution into a more
convenient form as follows.
   
φ(t)
= eαt c1 (E + i F) e(iβ)t + c2 (E − i F) e(−iβ)t
ψ(t)
    
= eαt c1 e(iβ)t + c2 e(−iβ)t E + i c1 e(iβ)t − c2 e(−iβ)t F .

The first real solution is found by choosing c1 = 1/2 and c2 = 1/2. This give
   
x1 (t) αt
 (iβ)t (−iβ)t
   (iβ)t (−iβ)t

=e (1/2) e +e E + i (1/2) e −e F .
y1 (t)
   
However, we know that (1/2) e(iβ)t +e(−iβ)t = cos(βt) and (1/2) e(iβ)t −e(−iβ)t =
i sin(βt). Thus, we have
   
x1 (t) αt
=e E cos(βt) − F sin(βt) .
y1 (t)

The second real solution is found by setting c1 = 1/2i and c2 = −1/2i which gives
   
x2 (t)     
= eαt (1/2i) e(iβ)t − e(−iβ)t E + i (1/2i) e(iβ)t + e(−iβ)t F
y2 (t)
 
= eαt E sin(βt) + F cos(βt) .

The general real solution is therefore


      
x(t)
= eαt a E cos(βt) − F sin(βt) + b E sin(βt) + F cos(βt)
y(t)

for arbitrary real numbers a and b.


Example 8.4.1

x  (t) = 2 x(t) + 5 y(t)


y  (t) = −x(t) + 4 y(t)
x(0) = 6
y(0) = −1

Solution The characteristic equation is r^2 − 6r + 13 = 0 which gives the eigenvalues 3 ± 2i.


The eigenvalue equation for the first root, 3 + 2i leads to this system to solve for
nonzero V .
    
(3 + 2i) − 2 −5 V1 0
=
1 (3 + 2i) − 4 V2 0

This reduces to the system


    
1 + 2i −5 V1 0
=
1 −1 + 2i V2 0

Although it is not immediately apparent, the second row is a multiple of row one.
Multiply row two, [1, −1 + 2i], by 1 + 2i. Since (−1 + 2i)(1 + 2i) = −5, this gives
[1 + 2i, −5], which is row one. So even though it is harder to see,
these two rows are equivalent and hence we only need to choose one to solve for V2
in terms of V1 . The first row gives (1 + 2i)V1 − 5V2 = 0. Letting V1 = a, we find
V2 = ((1 + 2i)/5) a. Hence, the eigenvectors have the form
     
G = a [ 1 ; (1 + 2i)/5 ] = a ( [ 1 ; 1/5 ] + i [ 0 ; 2/5 ] )

Hence,

E = [ 1 ; 1/5 ]   and   F = [ 0 ; 2/5 ]

The general real solution is therefore

[ x(t) ; y(t) ] = e^{3t} ( a ( E cos(2t) − F sin(2t) ) + b ( E sin(2t) + F cos(2t) ) )
 = e^{3t} ( a ( [ 1 ; 1/5 ] cos(2t) − [ 0 ; 2/5 ] sin(2t) ) + b ( [ 1 ; 1/5 ] sin(2t) + [ 0 ; 2/5 ] cos(2t) ) )
 = e^{3t} [ a cos(2t) + b sin(2t) ; ( (1/5) a + (2/5) b ) cos(2t) + ( −(2/5) a + (1/5) b ) sin(2t) ]

Now apply the initial conditions (at t = 0, the factor e^{3t} is 1) to obtain

[ 6 ; −1 ] = [ a ; (1/5) a + (2/5) b ]

Thus, a = 6 and b = −11/2. The solution is therefore


   
[ x(t) ; y(t) ] = e^{3t} [ 6 cos(2t) − (11/2) sin(2t) ; −cos(2t) − (7/2) sin(2t) ]
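Complex eigenvalue computations are easy to get wrong by a sign, so a numerical cross-check is worthwhile. The sketch below is our own addition: it integrates the system with Octave's lsode and compares the result with the closed form solution above.

Listing: Cross-checking Example 8.4.1 (a sketch)
A  = [2 5; -1 4];
x0 = [6; -1];
t  = linspace(0, 1, 11);
X  = lsode(@(x, s) A*x, x0, t);
xh = exp(3*t).*(6*cos(2*t) - (11/2)*sin(2*t));
yh = exp(3*t).*(-cos(2*t) - (7/2)*sin(2*t));
disp(max(abs(X(:,1) - xh')))    % both differences should be tiny
disp(max(abs(X(:,2) - yh')))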

8.4.1.1 Homework

We can now do some problems. Find the solution to the following models.

Exercise 8.4.1

x  (t) = x(t) − 3 y(t)


y  (t) = 6 x(t) − 5 y(t)
x(0) = 2
y(0) = −1.

Exercise 8.4.2

x  (t) = 2 x(t) − 5 y(t)


y  (t) = 5 x(t) − 6 y(t)
x(0) = 15
y(0) = −2.

Exercise 8.4.3

x  (t) = 4 x(t) + y(t)


y  (t) = −41 x(t) − 6 y(t)
x(0) = 1
y(0) = −2.

Exercise 8.4.4

x  (t) = 3 x(t) + 13 y(t)


y  (t) = −2 x(t) + y(t)
x(0) = 4
y(0) = 2.

8.4.2 Rewriting the Real Solution

This is then rewritten as


      
x(t)
= eαt a E + b F cos(βt) + b E − a F sin(βt)
y(t)

Now rewrite again in terms of the components of E and F to obtain


            
x(t) αt E1 F1 E1 F1
=e a +b cos(βt) + b −a sin(βt)
y(t) E2 F2 E2 F2
  
a E 1 + bF1 bE 1 − a F1 cos(βt)
= eαt .
a E 2 + bF2 bE 2 − a F2 ] sin(βt)

Finally, we can move back to the vector form and write


   
x(t) αt cos(βt)
=e a E + b F, b E − a F .
y(t) sin(βt)

8.4.3 The Representation of A

We know
   
x  (t) x(t)
= A .
y  (t) y(t)

We can also calculate the derivative directly to find


   
x  (t) αt cos(βt)
= α e a E + b F, b E − a F
y  (t) sin(βt)
 
−β sin(βt)
+eαt a E + b F, b E − a F .
β cos(βt)

To make these equations a bit shorter, define the matrix ab by

ab = a E + b F, b E − a F .

Then the derivative can be written more compactly as


     
x  (t) αt cos(βt) αt −β sin(βt)
= α e ab + e ab .
y  (t) sin(βt) β cos(βt)

Whew! But wait, we can do more! With a bit more factoring, we have
   
x  (t) αt α cos(βt) − β sin(βt)
= e ab
y  (t) β cos(βt) + α sin(βt)
  
α −β cos(βt)
= eαt ab .
β α sin(βt)

Hence, since
   
x(t) cos(βt)
= eαt ab .
y(t) sin(βt)

we can equate our two expressions for the derivative to find


    
cos(βt) α −β cos(βt)
A eαt ab = eαt ab .
sin(βt) β α sin(βt)

Since eαt > 0 always, we have


    
α −β cos(βt)
A ab − ab = 0.
β α sin(βt)

Therefore, after a fair bit of work, we have found the identity


 
α −β
A ab = ab
β α

8.4.4 The Transformed Model

It is straightforward, albeit messy, to show det (ab ) is not zero and so ab is invertible as
long as at least one of a and b is not zero. Thus, if we define the change of variable

long as at least one of a and b is not zero. Thus, if we define the change of variable
   −1  
u(t) x(t)
= ab
v(t) y(t)

we see
    −1   
u (t) x (t)
= ab
v  (t) y  (t)
 −1  
x(t)
= ab A .
y(t)

But, we know
 −1   −1
α −β
ab A= ab .
β α

Thus, we have
    −1  
u  (t) α −β x(t)
= ab
v  (t) β α y(t)
  
α −β u(t)
= .
β α v(t)

This transformed system in the variables u and v also has the eigenvalues α + iβ but
it is simpler to solve.

8.4.5 The Canonical Solution

We can solve the canonical model


     
u (t) α −β u(t)
= .
v  (t) β α v(t)

as usual. The characteristic equation is (r − α)2 + β 2 = 0 giving eigenvalues α ± βi


as usual. We find the eigenvector for α + iβ satisfies
    
iβ β V1 0
=
−β iβ V2 0

This gives the equation iβV1 + βV2 = 0. Letting V1 = 1, we have V2 = −i. Thus,

[ V1 ; V2 ] = [ 1 ; −i ] = [ 1 ; 0 ] + i [ 0 ; −1 ]

Thus,
   
1 0
E= and F =
0 −1

The general real solution is then


       
u(t) αt a E 1 + bF1 bE 1 − a F1 cos(βt) αt a b cos(βt)
=e =e
v(t) a E 2 + bF2 bE 2 − a F2 sin(βt) −b a sin(βt)
 
a cos(βt) + b sin(βt)
= eαt
−b cos(βt) + a sin(βt)

We can rewrite this as follows. Let R = √(a^2 + b^2). Then

[ u(t) ; v(t) ] = R e^{αt} [ (a/R) cos(βt) + (b/R) sin(βt) ; −(b/R) cos(βt) + (a/R) sin(βt) ]

Let the angle δ be defined by tan(δ) = b/a. This tells us cos(δ) = a/R and sin(δ) =
b/R. Then, cos(π/2 − δ) = b/R and sin(π/2 − δ) = a/R. Plugging these values
into our expression, we find
   
u(t) cos(δ) cos(βt) + sin(δ) sin(βt)
= R eαt
v(t) − cos(π/2 − δ) cos(βt) + sin(π/2 − δ) sin(βt)
 
cos(δ) cos(βt) + sin(δ) sin(βt)
= R eαt
− sin(δ) cos(βt) + cos(δ) sin(βt)

Then using the standard cos addition formulae, we obtain


   
u(t) αt cos(βt − δ)
= Re
v(t) sin(βt − δ)
 
Hence, u^2(t) + v^2(t) = e^{2αt} ( R^2 cos^2(βt − δ) + R^2 sin^2(βt − δ) ). This simplifies to
u^2(t) + v^2(t) = e^{2αt} R^2 and hence in this canonical case, the phase plane trajectory
is a spiral in if α < 0, a circle of radius R = √(a^2 + b^2) if α = 0 and a spiral out if
α > 0. Of course, the values of a and b are determined by the initial conditions.
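A quick plot makes the three canonical behaviors easy to remember. This Octave sketch is ours, not the text's; the particular values of R, δ, β and the three choices of α are arbitrary.

Listing: Spiral in, circle, spiral out (a sketch)
R = 1;  d = 0;  b = 2*pi;
t = linspace(0, 3, 601);
hold on;
for a = [-0.3 0 0.3]                    % alpha negative, zero, positive
  u = R*exp(a*t).*cos(b*t - d);
  v = R*exp(a*t).*sin(b*t - d);
  plot(u, v);
end
axis equal; hold off;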

8.4.5.1 Homework

Solve these canonical problems.

Exercise 8.4.5

x  (t) = 2 x(t) − 3 y(t)


y  (t) = 3 x(t) + 2 y(t)
x(0) = −6
y(0) = 8.

Exercise 8.4.6

x  (t) = −4 x(t) − 3 y(t)


y  (t) = 3 x(t) − 4 y(t)
x(0) = 6
y(0) = 5.

Exercise 8.4.7

x  (t) = 5 x(t) − 6 y(t)


y  (t) = 6 x(t) + 5 y(t)
x(0) = −2
y(0) = 4.

Exercise 8.4.8

x  (t) = x(t) − 2 y(t)


y  (t) = 2 x(t) + y(t)
x(0) = 1
y(0) = −1.

8.4.6 The General Model Solution

In general, we know the real solution is given by
$$
\begin{bmatrix} x(t) \\ y(t) \end{bmatrix} = e^{\alpha t} \, \Lambda_{ab} \begin{bmatrix} \cos(\beta t) \\ \sin(\beta t) \end{bmatrix}.
$$
Let $\Lambda_{ab}$ have components
$$
\Lambda_{ab} = \begin{bmatrix} \lambda_{11} & \lambda_{12} \\ \lambda_{21} & \lambda_{22} \end{bmatrix}.
$$
Then we have
$$
\begin{bmatrix} x(t) \\ y(t) \end{bmatrix}
= e^{\alpha t} \begin{bmatrix} \lambda_{11}\cos(\beta t) + \lambda_{12}\sin(\beta t) \\ \lambda_{21}\cos(\beta t) + \lambda_{22}\sin(\beta t) \end{bmatrix}.
$$
Let $R_1 = \sqrt{\lambda_{11}^2 + \lambda_{12}^2}$ and $R_2 = \sqrt{\lambda_{21}^2 + \lambda_{22}^2}$. Then
$$
\begin{bmatrix} x(t) \\ y(t) \end{bmatrix}
= e^{\alpha t}
\begin{bmatrix}
R_1 \left( \frac{\lambda_{11}}{R_1}\cos(\beta t) + \frac{\lambda_{12}}{R_1}\sin(\beta t) \right) \\[6pt]
R_2 \left( \frac{\lambda_{21}}{R_2}\cos(\beta t) + \frac{\lambda_{22}}{R_2}\sin(\beta t) \right)
\end{bmatrix}.
$$

Let the angles $\delta_1$ and $\delta_2$ be defined by $\tan(\delta_1) = \lambda_{12}/\lambda_{11}$ and $\tan(\delta_2) = \lambda_{22}/\lambda_{21}$, where, if we would otherwise divide by zero, these angles are assigned the value of $\pm\pi/2$ as needed. For convenience of exposition, we will assume here that all the entries of $\Lambda_{ab}$ are nonzero although, of course, what we really know is that this matrix is invertible and so it is possible for some entries to be zero. But that is just a messy complication and easy enough to fix with a little thought. So once we have our angles $\delta_1$ and $\delta_2$, we can rewrite the solution as

$$
\begin{bmatrix} x(t) \\ y(t) \end{bmatrix}
= e^{\alpha t}
\begin{bmatrix}
R_1 \left( \cos(\delta_1)\cos(\beta t) + \sin(\delta_1)\sin(\beta t) \right) \\[6pt]
R_2 \left( \cos(\delta_2)\cos(\beta t) + \sin(\delta_2)\sin(\beta t) \right)
\end{bmatrix}
= e^{\alpha t} \begin{bmatrix} R_1 \cos(\beta t - \delta_1) \\ R_2 \cos(\beta t - \delta_2) \end{bmatrix}
$$

This is a little confusing as is, so let's do an example with numbers. Suppose our solution was
$$
\begin{bmatrix} x(t) \\ y(t) \end{bmatrix}
= e^{\alpha t} \begin{bmatrix} 2\cos(\beta t - \pi/6) \\ 3\cos(\beta t - \pi/3) \end{bmatrix}
$$

Then this solution is clearly periodic since the cos functions are periodic. Note that x hits its maximum and minimum values of 2 and −2 at the times t1 where βt1 − π/6 = 0 and t2 with βt2 − π/6 = π. This gives t1 = π/(6β) and t2 = 7π/(6β). At these values of time, the y values are 3 cos(βt1 − π/3) = 3 cos(π/6 − π/3) = 3 cos(−π/6) = 3√3/2 and 3 cos(βt2 − π/3) = 3 cos(7π/6 − π/3) = 3 cos(5π/6) = −3√3/2. Thus, the points (2, 3√3/2) and (−2, −3√3/2) are on this trajectory. The trajectory reaches the point (2, 3√3/2) at time π/(6β) and then hits the point (−2, −3√3/2) at time 7π/(6β).

The extremal y values occur at ±3. The maximum y of 3 is obtained at βt3 − π/3 = 0 or t3 = π/(3β). The corresponding x value is then 2 cos(βt3 − π/6) = 2 cos(π/3 − π/6) = 2 cos(π/6) = √3. So at time t3, the trajectory passes through (√3, 3). Finally, the minimum y value of −3 is achieved at βt4 − π/3 = π or t4 = 4π/(3β). At this time, the corresponding x value is 2 cos(4π/3 − π/6) = 2 cos(7π/6) = −√3.

This is probably not very helpful by itself; however, what we have shown here is that this trajectory is a rotated ellipse. Take a sheet of paper and mark off the x and y axes. Draw a line through the origin passing through the two points (2, 3√3/2) (reached at time π/(6β)) and (−2, −3√3/2) (reached at time 7π/(6β)). This line is the horizontal axis of the ellipse. Now draw another line through the origin passing through the two points (√3, 3) (reached at time π/(3β)) and (−√3, −3) (reached at time 4π/(3β)). This line is the vertical axis of the ellipse. This is a phase shifted ellipse. At time π/(6β), we start at the farthest positive x value of the ellipse on the horizontal axis. Then in π/(6β) additional time, we hit the largest y value on the vertical axis. Next, in an additional 5π/(6β) we reach the most negative x value of the ellipse on the horizontal axis. After another π/(6β) we arrive at the most negative y value on the vertical axis. Finally, we arrive back at the start point after another 5π/(6β). Try drawing it!
Maybe a little Matlab/Octave will help. Consider the following quick plot. We
won’t bother to label axes and so forth as we just want to double check all of the
complicated arithmetic above.
Listing 8.1: Checking our arithmetic!
beta = 3;
x = @(t) 2*cos(beta*t - pi/6);
y = @(t) 3*cos(beta*t - pi/3);
t = linspace(0,1,21);
u = x(t);
v = y(t);
plot(u,v);

This generates part of this ellipse as we can see in Fig. 8.17. We can close the plot
by plotting for a longer time.
Listing 8.2: Plotting for more time
beta = 3;
x = @(t) 2*cos(beta*t - pi/6);
y = @(t) 3*cos(beta*t - pi/3);
t = linspace(0,3,42);
u = x(t);
v = y(t);
plot(u,v);

This fills in the rest of the ellipse as we can see in Fig. 8.18.
We can plot the axes of this ellipse easily as well using the MatLab/Octave session
below.

Fig. 8.17 The partial ellipse for the phase plane portrait

Fig. 8.18 The complete ellipse for the phase plane portrait

Listing 8.3: Plotting the ellipse and its axis


beta = 3;
x = @(t) 2*cos(beta*t - pi/6);
y = @(t) 3*cos(beta*t - pi/3);
A = [-2;2];
B = [-3*sqrt(3)/2;3*sqrt(3)/2];
C = [-2*sqrt(3)/2;2*sqrt(3)/2];
D = [-3;3];
time = linspace(0,3,42);
u = x(time);
v = y(time);
hold on
plot(u,v);
plot(A,B);
plot(C,D);
hold off;

We can now see the axes lines we were talking about earlier in Fig. 8.19. Note these
axes are not perpendicular like we would usually see in an ellipse! Now, this picture
does not include the exponential decay or growth we would get by multiplying by
the scaling factor eαt for various α’s. A typical spiral out would look like Fig. 8.20.
To determine the direction of motion along these trajectories, we do the usual nullcline analysis to get the algebraic sign pairs for (x', y').

8.4.6.1 Homework

Write MatLab/Octave code to graph the following solutions and their axes. We are
using the same models we have solved before but with different initial conditions.

Fig. 8.19 The ellipse with axes for the phase plane portrait

Fig. 8.20 The spiral out phase plane portrait

Exercise 8.4.9

x'(t) = x(t) − 3 y(t)
y'(t) = 6 x(t) − 5 y(t)
x(0) = 6
y(0) = 8.

Exercise 8.4.10

x'(t) = 2 x(t) − 5 y(t)
y'(t) = 5 x(t) − 6 y(t)
x(0) = 9
y(0) = −20.

Exercise 8.4.11

x'(t) = 4 x(t) + y(t)
y'(t) = −41 x(t) − 6 y(t)
x(0) = 10
y(0) = 8.

Exercise 8.4.12

x'(t) = 3 x(t) + 13 y(t)
y'(t) = −2 x(t) + y(t)
x(0) = −5
y(0) = −2.
Chapter 9
Numerical Methods Systems of ODEs

We now want to learn how to solve systems of differential equations. A typical system is the following

x'(t) = f(t, x(t), y(t))   (9.1)
y'(t) = g(t, x(t), y(t))   (9.2)
x(t0) = x0; y(t0) = y0   (9.3)

where x and y are our variables of interest, which might represent the populations of two competing species or other quantities of biological interest. The system starts at time t0 (which for us is usually 0) and we specify the values the variables x and y start at as x0 and y0 respectively. The functions f and g are "nice", meaning that as functions of three arguments (t, x, y) they do not have jumps and corners. The exact nature of "nice" here is a bit beyond our ability to discuss in this introductory course, so we will leave it at that. For example, we could be asked to solve the system

x'(t) = 3x(t) + 2x(t)y(t) − 3y²(t),   (9.4)
y'(t) = −2x(t) + 2x²(t)y(t) + 5y(t),   (9.5)
x(0) = 2; y(0) = 3.   (9.6)

In the system given by Eqs. 9.4 and 9.5, the function f is

f(t, x, y) = 3x + 2xy − 3y²

and the function g is

g(t, x, y) = −2x + 2x²y + 5y.


or the system

x'(t) = 3x(t) + 2x(t)y(t) + 10 sin(t² + 5),   (9.7)
y'(t) = −2x(t) + 2x(t)y³(t) + 20t²e^{−2t²},   (9.8)
x(0) = 12; y(0) = −5.   (9.9)

The functions f and g in these examples are not linear in the variables x and y; hence, the systems of Eqs. 9.4 and 9.5 and of Eqs. 9.7 and 9.8 are what are called nonlinear systems. In general, we can also add arbitrary functions of time, φ and ψ, to the model, giving

x'(t) = 3x(t) + 2x(t)y(t) − 3y²(t) + φ(t),   (9.10)
y'(t) = −2x(t) + 2x²(t)y(t) + 5y(t) + ψ(t),   (9.11)
x(0) = 2; y(0) = 3.   (9.12)

The functions φ and ψ are what we could call data functions. For example, if φ(t) = sin(t) and ψ(t) = te^{−t}, the system of Eqs. 9.10 and 9.11 would become

x'(t) = 3x(t) + 2x(t)y(t) − 3y²(t) + sin(t),
y'(t) = −2x(t) + 2x²(t)y(t) + 5y(t) + te^{−t},
x(0) = 2; y(0) = 3.

The functions f and g would then become

f(t, x, y) = 3x + 2xy − 3y² + sin(t)
g(t, x, y) = −2x + 2x²y + 5y + te^{−t}
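As a preview of the numerical work to come, here is one way these right hand sides could be packaged as a single vector function in MatLab/Octave. This is only a sketch; the handle name vecrhs is an illustrative choice of ours, with w(1) storing x and w(2) storing y.

% package f and g, including the data functions phi and psi, as one vector field
phi = @(t) sin(t);
psi = @(t) t.*exp(-t);
vecrhs = @(t,w) [3*w(1) + 2*w(1)*w(2) - 3*w(2)^2 + phi(t); ...
                 -2*w(1) + 2*w(1)^2*w(2) + 5*w(2) + psi(t)];
w0 = [2;3];          % the initial conditions x(0) = 2 and y(0) = 3
vecrhs(0,w0)         % the right hand side evaluated at the initial condition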

How do we solve such a system of differential equations? There are some things
we can do if the functions f and g are linear in x and y, but many times we will be
forced to look at the solutions using numerical techniques. We explored how to solve
first order equations in Chap. 3 and we will now adapt the tools developed in that
chapter to systems of differential equations. First, we will show you how to write
a system in terms of matrices and vectors and then we will solve some particular
systems.

9.1 Setting Up the Matrix and the Vector Functions

It is easiest to see how to convert problems into a matrix–vector form by looking at


some examples.

Example 9.1.1 Convert to a matrix–vector system

2.0 x''(t) − 3.0 x'(t) + 5.0 x(t) = 0
x(0) = 2.0
x'(0) = −4.0

Solution Writing u for the unknown function x, we let the vector x be given by
$$
\boldsymbol{x}(t) = \begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix} = \begin{bmatrix} u(t) \\ u'(t) \end{bmatrix}
$$
Then,

x_1'(t) = u'(t) = x_2(t),
x_2'(t) = u''(t) = −(5/2)u(t) + (3/2)u'(t) = −(5/2)x_1(t) + (3/2)x_2(t).

We then convert the above into the matrix–vector system
$$
\boldsymbol{x}'(t) = \begin{bmatrix} x_1'(t) \\ x_2'(t) \end{bmatrix}
= \begin{bmatrix} 0 & 1 \\ -(5/2) & (3/2) \end{bmatrix} \begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix}
$$
Also, note that
$$
\boldsymbol{x}(0) = \begin{bmatrix} x_1(0) \\ x_2(0) \end{bmatrix} = \begin{bmatrix} u(0) \\ u'(0) \end{bmatrix} = \begin{bmatrix} 2 \\ -4 \end{bmatrix} = \boldsymbol{x}_0.
$$
Let the matrix above be called A. Then we have converted the original system into the matrix–vector equation x'(t) = A x(t), x(0) = x_0.
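Here is a small sketch of our own showing how Example 9.1.1 looks once it is typed into MatLab/Octave.

% Example 9.1.1 in matrix-vector form
A  = [0, 1; -5/2, 3/2];
x0 = [2; -4];
f  = @(t,x) A*x;     % the right hand side of x' = A x
f(0,x0)              % the slope vector at the initial condition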

Example 9.1.2 Convert to a matrix–vector system

x''(t) + 4.0 x'(t) − 5.0 x(t) = 0
x(0) = −1.0
x'(0) = 1.0

Solution Again, writing u for the unknown function x, we let the vector x be given by
$$
\boldsymbol{x}(t) = \begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix} = \begin{bmatrix} u(t) \\ u'(t) \end{bmatrix}
$$
Then, similar to what we did in the previous example, we find

x_1'(t) = u'(t) = x_2(t),
x_2'(t) = u''(t) = 5u(t) − 4u'(t) = 5x_1(t) − 4x_2(t).

We then convert the above into the matrix–vector system
$$
\boldsymbol{x}'(t) = \begin{bmatrix} x_1'(t) \\ x_2'(t) \end{bmatrix}
= \begin{bmatrix} 0 & 1 \\ 5 & -4 \end{bmatrix} \begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix}
$$
Also, note that
$$
\boldsymbol{x}(0) = \begin{bmatrix} x_1(0) \\ x_2(0) \end{bmatrix} = \begin{bmatrix} u(0) \\ u'(0) \end{bmatrix} = \begin{bmatrix} -1 \\ 1 \end{bmatrix} = \boldsymbol{x}_0.
$$
Let the matrix above be called A. Then we have converted the original system into the matrix–vector equation x'(t) = A x(t), x(0) = x_0.
Example 9.1.3 Convert to a matrix–vector system

x'(t) + 2.0 y'(t) − 2.0 x(t) + 3.0 y(t) = 0
4.0 x'(t) − 1.0 y'(t) + 6.0 x(t) − 8.0 y(t) = 0
x(0) = −1.0
y(0) = 2.0

Solution Let the vector u be given by
$$
\boldsymbol{u}(t) = \begin{bmatrix} u_1(t) \\ u_2(t) \end{bmatrix} = \begin{bmatrix} x(t) \\ y(t) \end{bmatrix}.
$$
It is then easy to see that if we define the matrices A and B by
$$
A = \begin{bmatrix} 1 & 2 \\ 4 & -1 \end{bmatrix}, \quad \text{and} \quad
B = \begin{bmatrix} -2 & 3 \\ 6 & -8 \end{bmatrix},
$$
we can convert the original system into
$$
A \, \boldsymbol{u}'(t) + B \, \boldsymbol{u}(t) = \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \qquad
\boldsymbol{u}(0) = \begin{bmatrix} -1 \\ 2 \end{bmatrix}.
$$
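A brief sketch of our own shows how a system in the form A u' + B u = 0 can be rewritten as u' = −A⁻¹ B u in MatLab/Octave so that it fits the solvers we use later; the handle name vecrhs is just an illustrative choice.

% Example 9.1.3: rewrite A u' + B u = 0 as u' = -inv(A)*B*u
A = [1, 2; 4, -1];
B = [-2, 3; 6, -8];
C = -(A\B);             % same as -inv(A)*B but better behaved numerically
vecrhs = @(t,u) C*u;    % right hand side ready for a numerical solver
u0 = [-1; 2];           % x(0) = -1 and y(0) = 2
vecrhs(0,u0)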

9.1.1 Homework

Exercise 9.1.1 Convert to a matrix–vector system X' = AX

6.0 x''(t) + 14.0 x'(t) + 1.0 x(t) = 0
x(0) = 1.0
x'(0) = −1.0

Exercise 9.1.2 Convert to a matrix–vector system AX' + BX = 0

−2.0 x'(t) + 1.0 y'(t) + 4.0 x(t) + 1.0 y(t) = 0
6.0 x'(t) + 9.0 y'(t) + 1.0 x(t) − 4.0 y(t) = 0
x(0) = 10.0
y(0) = 20.0

Exercise 9.1.3 Convert to a matrix–vector system AX' + BX = 0

1.0 x'(t) + 0.0 y'(t) + 2.0 x(t) + 3.0 y(t) = 0
4.0 x'(t) − 5.0 y'(t) + 6.0 x(t) − 2.0 y(t) = 0
x(0) = 1.0
y(0) = −2.0

9.2 Linear Second Order Problems as Systems

Now let’s consider how to adapt our previous code to handle these systems of dif-
ferential equations. We will begin with the linear second order problems because
we know how to do those already. First, let’s consider a general second order linear
problem.

a u''(t) + b u'(t) + c u(t) = g(t)
u(0) = e
u'(0) = f

where we assume a is not zero, so that we really do have a second order problem! As usual, we let the vector x be given by
$$
\boldsymbol{x}(t) = \begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix} = \begin{bmatrix} u(t) \\ u'(t) \end{bmatrix}
$$

Then,

x_1'(t) = u'(t) = x_2(t),
x_2'(t) = u''(t) = −(c/a)u(t) − (b/a)u'(t) + (1/a)g(t)

We then convert the above into the matrix–vector system
$$
\boldsymbol{x}'(t) = \begin{bmatrix} x_1'(t) \\ x_2'(t) \end{bmatrix}
= \begin{bmatrix} 0 & 1 \\ -(c/a) & -(b/a) \end{bmatrix} \begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix}
+ \begin{bmatrix} 0 \\ (1/a)\, g(t) \end{bmatrix}
$$
Also, note that
$$
\boldsymbol{x}(0) = \begin{bmatrix} x_1(0) \\ x_2(0) \end{bmatrix} = \begin{bmatrix} u(0) \\ u'(0) \end{bmatrix} = \begin{bmatrix} e \\ f \end{bmatrix} = \boldsymbol{x}_0.
$$
Let the matrix above be called A and let G(t) denote the vector [0; (1/a) g(t)]. Then we have converted the original system into the matrix–vector equation x'(t) = A x(t) + G(t), x(0) = x_0. For our purposes of using MatLab, we need to write this in terms of vectors. We have
$$
\begin{bmatrix} x_1'(t) \\ x_2'(t) \end{bmatrix}
= \begin{bmatrix} x_2 \\ -(c/a)\, x_1 - (b/a)\, x_2 + (1/a)\, g(t) \end{bmatrix}
$$
Now, let the dynamics vector f be defined by
$$
\boldsymbol{f} = \begin{bmatrix} f_1 \\ f_2 \end{bmatrix}
= \begin{bmatrix} x_2 \\ -(c/a)\, x_1 - (b/a)\, x_2 + (1/a)\, g(t) \end{bmatrix}
$$
For example, given the model

x'' + 4x' − 5x = te^{−0.03t}
x(0) = −1.0
x'(0) = 1.0

we write this in a MatLab session as is shown in Listing 9.1.

Listing 9.1: Linear Second Order Dynamics


% for second order model
% given the model y'' + 4y' - 5y = t e^{-.03t}
a = 1;
b = 4;
c = -5;
B = -(b/a);
A = -(c/a);
C = 1/a;
g = @(t) t.*exp(-.03*t);
f = @(t,y) [y(2); A*y(1) + B*y(2) + C*g(t)];

9.2.1 A Practice Problem

Consider the problem

y''(t) + 4.0 y'(t) − 5.0 y(t) = 0
y(0) = −1.0
y'(0) = 1.0

This has characteristic equation r² + 4r − 5 = 0 with roots r1 = −5 and r2 = 1. Hence, the general solution is y(t) = Ae^{−5t} + Be^{t}. The initial conditions give

A + B = −1
−5A + B = 1

It is straightforward to see that A = −1/3 and B = −2/3. After the quick check below, we solve the full system using MatLab with the session in Listing 9.2.
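The 2 × 2 system for the constants A and B is also easy to hand to MatLab/Octave; this quick verification is our own addition.

% solve A + B = -1 and -5A + B = 1 for the constants in the general solution
M   = [1 1; -5 1];
rhs = [-1; 1];
M\rhs          % returns [-1/3; -2/3], matching the hand calculation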

Listing 9.2: Solving x'' + 4x' − 5x = 0, x(0) = −1, x'(0) = 1

% define the dynamics for y'' + 4y' - 5y = 0
a = 1; b = 4; c = -5;
B = -(b/a); A = -(c/a); C = 1/a;
g = @(t) 0.0;
f = @(t,y) [y(2); A*y(1) + B*y(2) + C*g(t)];
% define the true solution
true = @(t) [-(1.0/3.0)*exp(-5*t) - (2.0/3.0)*exp(t); ...
             (5.0/3.0)*exp(-5*t) - (2.0/3.0)*exp(t)];
y0 = [-1;1];
h = .2;
T = 3;
time = linspace(0,T,101);
N = ceil(T/h);
[htime1,rkapprox1] = FixedRK(f,0,y0,h,1,N);
yhat1 = rkapprox1(1,:);
[htime2,rkapprox2] = FixedRK(f,0,y0,h,2,N);
yhat2 = rkapprox2(1,:);
[htime3,rkapprox3] = FixedRK(f,0,y0,h,3,N);
yhat3 = rkapprox3(1,:);
[htime4,rkapprox4] = FixedRK(f,0,y0,h,4,N);
yhat4 = rkapprox4(1,:);
ytrue = true(time);
plot(time,ytrue(1,:),htime1,yhat1,'o',htime2,yhat2,'*', ...
     htime3,yhat3,'+',htime4,yhat4,'-');
xlabel('Time');
ylabel('y');
title('Solution to x'''' + 4x'' - 5x = 0, x(0) = -1, x''(0) = 1 on [0,3]');
legend('True','RK1','RK2','RK3','RK4','Location','Best');

One comment about this code. The function true has vector values; the first com-
ponent is the true solution and the second component is the true solution’s derivative.
Since we want to plot only the true solution, we need to extract it from true. We
do this with the command ytrue=true(time) which saves the vector of true
values into the new variable ytrue. Then in the plot command, we plot only the
solution by using ytrue(1,:). This generates a plot as shown in Fig. 9.1.

Fig. 9.1 Solution to x'' + 4x' − 5x = 0, x(0) = −1, x'(0) = 1 on [0, 3]

9.2.2 What If We Don’t Know the True Solution?

Now let's add an external input, g(t) = 10 sin(5t)e^{−0.03t}. In a more advanced class, we could find the true solution for this external input, but if we changed to g(t) = 10 sin(5t)e^{−0.03t²} we would not be able to do that. So in general, there are many models we cannot find the true solution to. However, the Runge–Kutta methods work quite well. Still, we always have the question in the back of our minds: is this plot accurate?

Listing 9.3: Solving x'' + 4x' − 5x = 10 sin(5t)e^{−0.03t}, x(0) = −1, x'(0) = 1

% define the dynamics for y'' + 4y' - 5y = 10 sin(5t) e^{-.03t}
a = 1; b = 4; c = -5;
B = -(b/a); A = -(c/a); C = 1/a;
g = @(t) 10*sin(5*t).*exp(-.03*t);
f = @(t,y) [y(2); A*y(1) + B*y(2) + C*g(t)];
y0 = [-1;1];
h = .2;
T = 3;
N = ceil(T/h);
[htime1,rkapprox1] = FixedRK(f,0,y0,h,1,N);
yhat1 = rkapprox1(1,:);
[htime2,rkapprox2] = FixedRK(f,0,y0,h,2,N);
yhat2 = rkapprox2(1,:);
[htime3,rkapprox3] = FixedRK(f,0,y0,h,3,N);
yhat3 = rkapprox3(1,:);
[htime4,rkapprox4] = FixedRK(f,0,y0,h,4,N);
yhat4 = rkapprox4(1,:);
plot(htime1,yhat1,'o',htime2,yhat2,'*', ...
     htime3,yhat3,'+',htime4,yhat4,'-');
xlabel('Time');
ylabel('Approx y');
title('Solution to x'''' + 4x'' - 5x = 10 sin(5t) e^{-.03t}, x(0) = -1, x''(0) = 1 on [0,3]');
legend('RK1','RK2','RK3','RK4','Location','Best');

This generates a plot as shown in Fig. 9.2.
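One practical way to gain confidence in such a plot, a suggestion of ours rather than something developed in the text, is to halve the step size and see whether the RK4 answer moves. The short session below reuses the f and y0 defined in Listing 9.3 and only compares the final values of the two runs.

% crude accuracy check: rerun RK4 with half the step size and compare endpoints
h = .2; T = 3; N = ceil(T/h);
[t1,z1] = FixedRK(f,0,y0,h,4,N);
[t2,z2] = FixedRK(f,0,y0,h/2,4,2*N);
[z1(1,end), z2(1,end)]   % if these agree to plotting accuracy, trust the picture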

9.2.3 Homework: No External Input

On all of these problems, choose an appropriate stepsize h, time interval [0, T ] for
some positive T and
• find the true solution and write this as MatLab code.
• find the Runge–Kutta order 1 through 4 solutions.
• Write this up with attached plots.

Fig. 9.2 Solution to x'' + 4x' − 5x = 10 sin(5t)e^{−0.03t}, x(0) = −1, x'(0) = 1 on [0, 3]



Exercise 9.2.1

u''(t) + u'(t) − 2 u(t) = 0
u(0) = 1
u'(0) = −2

Exercise 9.2.2

x''(t) + 6 x'(t) + 9 x(t) = 0
x(0) = 1
x'(0) = 2

Exercise 9.2.3

y''(t) + 4 y'(t) + 13 y(t) = 0
y(0) = 1
y'(0) = 2

9.2.4 Homework: External Inputs

For these models, find the Runge–Kutta 1 through 4 solutions and do the write up
with plot as usual.
Exercise 9.2.4

2 u''(t) + 4 u'(t) − 3 u(t) = exp(−2t) cos(3t + 5)
u(0) = −2
u'(0) = 3

Exercise 9.2.5

u''(t) − 2 u'(t) + 13 u(t) = exp(−5t) sin(3t² + 5)
u(0) = −12
u'(0) = 6

9.3 Linear Systems Numerically

We now turn our attention to solving systems of linear ODEs numerically. We will show you how to do it in two worked out problems. We then generate the plot of y versus x, the two lines representing the eigenvectors of the problem, and the x' = 0 and y' = 0 lines on the same plot. A typical MatLab session would look like this:

Listing 9.4: Solving x' = −3x + 4y, y' = −x + 2y; x(0) = −1; y(0) = 1

f = @(t,x) [-3*x(1)+4*x(2); -x(1)+2*x(2)];
E1 = @(x) 0.25*x;
E2 = @(x) x;
xp = @(x) (3/4)*x;
yp = @(x) 0.5*x;
T = 1.4;
h = .03;
x0 = [-1;1];
N = ceil(T/h);
[ht,rk] = FixedRK(f,0,x0,h,4,N);
X = rk(1,:);
Y = rk(2,:);
xmin = min(X);
xmax = max(X);
xtop = max(abs(xmin),abs(xmax));
ymin = min(Y);
ymax = max(Y);
ytop = max(abs(ymin),abs(ymax));
D = max(xtop,ytop);
s = linspace(-D,D,201);
plot(s,E1(s),'-r',s,E2(s),'-m',s,xp(s),'-b',s,yp(s),'-g',X,Y,'-k');
xlabel('x');
ylabel('y');
title('Phase Plane for Linear System x'' = -3x+4y, y''=-x+2y, x(0) = -1, y(0) = 1');
legend('E1','E2','x''=0','y''=0','y vs x','Location','Best');

This generates Fig. 9.3. This matches the kind of qualitative analysis we have done
by hand, although when we do the plots by hand we get a more complete picture.
Here, we see only one plot instead of the many trajectories we would normally sketch.

Fig. 9.3 Solving x' = −3x + 4y, y' = −x + 2y; x(0) = −1; y(0) = 1

Now let’s annotate the code.

Listing 9.5: Solving x  = −3x + 4y, y  = −x + 2y; x(0) = −1; y(0) = 1


% S e t up t h e d y n a m i c s f o r t h e model
f = @( t , x ) [ −3∗x ( 1 ) +4∗ x ( 2 ) ;−x ( 1 ) +2∗ x ( 2 ) ] ;
% E i g e n v e c t o r 1 i s [ 1 ; . 2 5 ] so s l o p e i s . 2 5
% s e t up a s t r a i g h t l i n e u s i n g t h i s s l o p e
5 E1 = @( x ) 0 . 2 5 ∗ x ;
% E i g e n v e c t o r 2 i s [ 1 ; . 1 ] so s l o p e i s 1
% s e t up a s t r a i g h t l i n e u s i n g t h i s s l o p e
E2 = @( x ) x ;
% t h e x ’=0 n u l l c l i n e i s −3x+4 y = 0 o r y = 3 / 4 x
10 % s e t up a l i n e w i t h t h i s s l o p e
xp = @( x ) ( 3 / 4 ) ∗x ;
% t h e y ’=0 n u l l c l i n e i s −x+2 y = 0 o r y = . 5 x
% s o s e t up a l i n e w i t h t h i s s l o p e
yp = @( x ) 0 . 5 ∗ x ;
15 % s e t the f i n a l time
T = 1.4;
% s e t the step s i z e
h = .03;
% s e t t h e IC
20 y0 = [ − 1 ; 1 ] ;
% f i n d how many s t e p s we ’ l l t a k e
N = c e i l (T / h ) ;
% Find N a p p r o x i m a t e RK 4 v a l u e s
% s t o r e t h e t i m e s i n h t and t h e v a l u e s i n r k
25 % rk ( 1 , : ) i s the f i r s t column which i s x
% r k ( 2 , : ) i s t h e s e c o n d column which i s y
[ ht , rk ] = FixedRK ( f , 0 , y0 , h , 4 ,N) ;
% rk ( 1 , : ) i s the f i r s t row which i s s e t t o X
% r k ( 2 , : ) i s t h e s e c o n d row which i s s e t t o Y
30 X = rk ( 1 , : ) ;
Y = rk ( 2 , : ) ;
% t h e x v a l u e s r a n g e from xmin t o xman
xmin = min (X) ;
xmax = max (X) ;
35 % f i n d max o f t h e i r a b s o l u t e v a l u e s
% example : xmin = −7, xmax = 4 ==> x t o p = 7
x t o p = max ( abs ( xmin ) , abs ( xmax ) ) ;
% t h e x v a l u e s r a n g e from xmin t o xman
ymin = min (Y) ;
40 ymax = max (Y) ;
% f i n d max o f t h e i r a b s o l u t e v a l u e s
% example : ymin = −3, ymax = 10 ==> y t o p = 10
y t o p = max ( abs ( ymin ) , abs ( ymax ) ) ;
% f i n d max o f x t o p and y t o p ; our example g i v e s D = 10
45 D = max ( xtop , y t o p ) ;
% The p l o t l i v e s i n t h e box [−D, D] x [−D, D]
% s e t up a l i n s p a c e o f [−D, D]
s = l i n s p a c e (−D, D, 2 0 1 ) ;
% p l o t t h e e i g e n v e c t o r l i n e s , t h e n u l l c l i n e s and Y v s X
50 % −r i s red , −m i s magenta , −b i s b l u e , −g i s g r e e n and −k i s b l a c k
p l o t ( s , E1 ( s ) , ’-r’ , s , E2 ( s ) , ’-m’ , s , xp ( s ) , ’-b’ , s , yp ( s ) , ’-g’ ,X, Y, ’-k’ ) ;
% s e t x and y l a b e l s
x l a b e l ( ’x’ ) ;
y l a b e l ( ’y’ ) ;
55 % set t i t l e
t i t l e ( ’Phase Plane for Linear System x’’ = -3x+4y, y’’=-x+2y, x(0) = -1, y(0) = 1’
);
% s e t legend
l e g e n d ( ’E1’ , ’E2’ , ’x’’=0’ , ’y’’=0’ , ’y vs x’ , ’Location’ , ’Best’ ) ;

Let’s do another example.



Example 9.3.1 For the system below
$$
\begin{bmatrix} x'(t) \\ y'(t) \end{bmatrix} = \begin{bmatrix} 4 & 9 \\ -1 & -6 \end{bmatrix} \begin{bmatrix} x(t) \\ y(t) \end{bmatrix}, \qquad
\begin{bmatrix} x(0) \\ y(0) \end{bmatrix} = \begin{bmatrix} 4 \\ -2 \end{bmatrix}
$$
• Find the characteristic equation
• Find the general solution
• Solve the IVP
• Solve the System Numerically

Solution The eigenvalue and eigenvector portion of this solution has already been done in Example 2.10.2 and so we only have to copy the results here. The characteristic equation is
$$
\det\left( r \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} - \begin{bmatrix} 4 & 9 \\ -1 & -6 \end{bmatrix} \right) = 0
$$
with eigenvalues r1 = −5 and r2 = 3.

1. For eigenvalue r1 = −5, we found the eigenvector to be
$$
\begin{bmatrix} 1 \\ -1 \end{bmatrix}.
$$
2. For eigenvalue r2 = 3, we found the eigenvector to be
$$
\begin{bmatrix} 1 \\ -1/9 \end{bmatrix}.
$$
The general solution to our system is thus
$$
\begin{bmatrix} x(t) \\ y(t) \end{bmatrix}
= A \begin{bmatrix} 1 \\ -1 \end{bmatrix} e^{-5t} + B \begin{bmatrix} 1 \\ -1/9 \end{bmatrix} e^{3t}
$$
We solve the IVP by finding the A and B that will give the desired initial conditions. This gives
$$
\begin{bmatrix} 4 \\ -2 \end{bmatrix}
= A \begin{bmatrix} 1 \\ -1 \end{bmatrix} + B \begin{bmatrix} 1 \\ -1/9 \end{bmatrix}
$$
or

4 = A + B
−2 = −A − (1/9) B

This is easily solved using elimination to give A = 7/4 and B = 9/4. The solution to the IVP is therefore
$$
\begin{bmatrix} x(t) \\ y(t) \end{bmatrix}
= \frac{7}{4} \begin{bmatrix} 1 \\ -1 \end{bmatrix} e^{-5t} + \frac{9}{4} \begin{bmatrix} 1 \\ -1/9 \end{bmatrix} e^{3t}
= \begin{bmatrix} (7/4)\, e^{-5t} + (9/4)\, e^{3t} \\ -(7/4)\, e^{-5t} - (1/4)\, e^{3t} \end{bmatrix}
$$

We now have all the information needed to solve this numerically. The MatLab session
for this problem is then

Listing 9.6: Phase Plane for x' = 4x + 9y, y' = −x − 6y, x(0) = 4, y(0) = −2

f = @(t,x) [4*x(1)+9*x(2); -x(1)-6*x(2)];
E1 = @(x) -1*x;
E2 = @(x) -(1/9)*x;
xp = @(x) -(4/9)*x;
yp = @(x) -(1/6)*x;
T = 0.4;
h = .01;
x0 = [4;-2];
N = ceil(T/h);
[ht,rk] = FixedRK(f,0,x0,h,4,N);
X = rk(1,:);
Y = rk(2,:);
xmin = min(X);
xmax = max(X);
xtop = max(abs(xmin),abs(xmax));
ymin = min(Y);
ymax = max(Y);
ytop = max(abs(ymin),abs(ymax));
D = max(xtop,ytop);
x = linspace(-D,D,201);
plot(x,E1(x),'-r',x,E2(x),'-m',x,xp(x),'-b',x,yp(x),'-c',X,Y,'-k');
xlabel('x');
ylabel('y');
title('Phase Plane for Linear System x'' = 4x+9y, y''=-x-6y, x(0) = 4, y(0) = -2');
legend('E1','E2','x''=0','y''=0','y vs x','Location','Best');

This generates Fig. 9.4. Again this matches the kind of qualitative analysis we
have done by hand, but for only one plot instead of the many trajectories we would
normally sketch. We had to choose the final time T and the step size h by trial and
error to generate the plot you see. If T is too large, the growth term in the solution
generates x and y values that are too big and the trajectory just looks like it lies on
top of the dominant eigenvector line.

9.3.1 Homework

For these models,


• Find the characteristic equation
• Find the general solution

Fig. 9.4 Solving x' = 4x + 9y, y' = −x − 6y; x(0) = 4; y(0) = −2

• Solve the IVP


• Solve the System Numerically

Exercise 9.3.1
$$
\begin{bmatrix} x'(t) \\ y'(t) \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ -1 & -3/2 \end{bmatrix} \begin{bmatrix} x(t) \\ y(t) \end{bmatrix}, \qquad
\begin{bmatrix} x(0) \\ y(0) \end{bmatrix} = \begin{bmatrix} -5 \\ -2 \end{bmatrix}
$$
Exercise 9.3.2
$$
\begin{bmatrix} x'(t) \\ y'(t) \end{bmatrix} = \begin{bmatrix} 4 & 2 \\ -9 & -5 \end{bmatrix} \begin{bmatrix} x(t) \\ y(t) \end{bmatrix}, \qquad
\begin{bmatrix} x(0) \\ y(0) \end{bmatrix} = \begin{bmatrix} -1 \\ 2 \end{bmatrix}
$$
Exercise 9.3.3
$$
\begin{bmatrix} x'(t) \\ y'(t) \end{bmatrix} = \begin{bmatrix} 3 & 7 \\ 4 & 6 \end{bmatrix} \begin{bmatrix} x(t) \\ y(t) \end{bmatrix}, \qquad
\begin{bmatrix} x(0) \\ y(0) \end{bmatrix} = \begin{bmatrix} -2 \\ -3 \end{bmatrix}
$$

9.4 An Attempt at an Automated Phase Plane Plot

Now let’s try to generate a real phase plane portrait by automating the phase plane
plots for a selection of initial conditions. Consider the code below which is saved in
the file AutoPhasePlanePlot.m.

Listing 9.7: AutoPhasePlanePlot.m


f u n c t i o n A u t o P h a s e P l a n e P l o t ( fname , s t e p s i z e , t i n i t , t f i n a l , rkord er , x b o x s i z e , y b o x s i z e ,
xmin , xmax , ymin , ymax )
% fname i s t h e name o f t h e model d y n a m i c s
% s t e p s i z e i s the chosen s t e p s i z e
% t i n i t i s the i n i t i a l time
5 % t f i n a l i s the f i n a l time
% r k o r d e r i s t h e RK o r d e r
% we w i l l u s e i n i t i a l c o n d i t i o n s c h o s e n from t h e box
% [ xmin , xmax ] x [ ymin , ymax ]
% T h i s i s done u s i n g t h e l i n s p a c e command
10 % s o x b o x s i z e i s t h e number o f p o i n t s i n t h e i n t e r v a l [ xmin , xmax ]
% y b o x s i z e i s t h e number o f p o i n t s i n t h e i n t e r v a l [ ymin , ymax ]
% u and v a r e t h e v e c t o r s we u s e t o compute our ICs
n = c e i l ( ( t f i n a l −t i n i t ) / s t e p s i z e ) ;
u = l i n s p a c e ( xmin , xmax , x b o x s i z e ) ;
15 v = l i n s p a c e ( ymin , ymax , y b o x s i z e ) ;
% h o l d p l o t and c y c l e l i n e c o l o r s
x l a b e l ( ’x’ ) ;
y l a b e l ( ’y’ ) ;
n e w p lo t ;
20 hold a l l ;
f o r i =1: x b o x s i z e
f o r j =1: y b o x s i z e
x0 = [ u ( i ) ; v ( j ) ] ;
[ htime , rk , f r k ] = FixedRK ( fname , t i n i t , x0 , s t e p s i z e , rkorder , n ) ;
25 X = rk ( 1 , : ) ;
Y = rk ( 2 , : ) ;
p l o t (X, Y) ;
end
end
30 t i t l e ( ’Phase Plane Plot’ ) ;
hold o f f ;

There are some new elements here. We set up vectors u and v to construct our
initial conditions from. Each initial condition is of the form (u i , v j ) and we use that
to set the initial condition x0 we pass into FixedRK() as usual. We start by telling
MatLab the plot we are going to build is a new one; so the previous plot should be
erased. The command hold all then tells MatLab to keep all the plots we generate
as well as the line colors and so forth until a hold off is encountered. So here we
generate a bunch of plots and we then see them on the same plot at the end! A typical
session usually requires a lot of trial and error. In fact, you should find the analysis by
hand is actually more informative! As discussed, the AutoPhasePlanePlot.m
script is used by filling in values for the inputs it needs. Again the script has these
inputs

Listing 9.8: AutoPhasePlanePlot Arguments


A u t o P h a s e P l a n e P l o t ( fname , s t e p s i z e , t i n i t , t f i n a l , rkorder , x b o x s i z e , y b o x s i z e , xmin ,
xmax , ymin , ymax )
% fname i s t h e name o f our d y n a m i c s f u n c t i o n
% s t e p s i z e i s our c a l l , h e r e . 0 1 seems good
% t i n i t i s the s t a r t i n g time , here always 0
5 % t f i n a l i s hard t o p i c k , h e r e s m a l l seems b e s t ; . 2 − . 6
% b e c a u s e one o f our e i g e n v a l u e s i s p o s i t i v e and t h e s o l u t i o n
% grows t o o f a s t
% r k o r d e r i s t h e Runge−K u t t a o r d e r , h e r e 4
% x b o x s i z e i s how many d i f f e r e n t x i n i t i a l v a l u e s we want , h e r e 4
10 % y b o x s i z e i s how mnay d i f f e r e n t y i n i t i a l v a l u e s we want , h e r e 4
% xmin and xmax g i v e t h e i n t e r v a l we p i c k t h e i n i t i a l x v a l u e s from ;
% here , t h e y come from [ − . 3 , . 3 ] i n t h e l a s t a t t e m p t .
% ymin and ymax g i v e t h e i n t e r v a l we p i c k t h e i n i t i a l x v a l u e s from
% here , t h e y come from [ − . 3 , . 3 ] i n t h e l a s t a t t e m p t .

Now some attempts for the model
$$
\begin{bmatrix} x'(t) \\ y'(t) \end{bmatrix} = \begin{bmatrix} 4 & -1 \\ 8 & -5 \end{bmatrix} \begin{bmatrix} x(t) \\ y(t) \end{bmatrix}
$$
This is encoded in Matlab as vecfunc = @(t,y) [4*y(1)-y(2);8*y(1)-5*y(2)];. We can then try a few phase plane plots.

Listing 9.9: Trying Some Phase Plane Plots


AutoPhasePlanePlot ( vecfunc ,.1 ,0 ,1 ,4 ,4 ,4 , −1 ,1 , −1 ,1) ;
AutoPhasePlanePlot ( vecfunc , . 0 1 , 0 , . 1 , 4 , 4 , 4 , − 1 , 1 , − 1 , 1 ) ;
AutoPhasePlanePlot ( vecfunc , . 0 1 , 0 , . 4 , 4 , 4 , 4 , − 1 , 1 , − 1 , 1 ) ;
AutoPhasePlanePlot ( vecfunc , . 0 1 , 0 , . 6 , 4 , 4 , 4 , − 1 , 1 , − 1 , 1 ) ;
5 AutoPhasePlanePlot ( vecfunc , . 0 1 , 0 , . 6 , 4 , 4 , 4 , − 1 . 5 , 1 . 5 , − 1 . 5 , 1 . 5 ) ;
AutoPhasePlanePlot ( vecfunc , . 0 1 , 0 , . 6 , 4 , 4 , 4 , − . 5 , . 5 , − . 5 , . 5 ) ;
AutoPhasePlanePlot ( vecfunc , . 0 1 , 0 , . 8 , 4 , 4 , 4 , − . 5 , . 5 , − . 5 , . 5 ) ;
AutoPhasePlanePlot ( vecfunc , . 0 1 , 0 , . 8 , 4 , 4 , 4 , − . 3 , . 3 , − . 3 , . 3 ) ;
AutoPhasePlanePlot ( vecfunc , . 0 1 , 0 , . 8 , 4 , 4 , 4 , − . 2 , . 2 , − . 2 , . 2 ) ;
10 AutoPhasePlanePlot ( vecfunc , . 0 1 , 0 , . 4 , 4 , 4 , 4 , − . 2 , . 2 , − . 2 , . 2 ) ;
AutoPhasePlanePlot ( vecfunc , . 0 1 , 0 , . 4 , 4 , 4 , 4 , − . 3 , . 3 , − . 3 , . 3 ) ;
x l a b e l ( ’x axis’ ) ;
y l a b e l ( ’y axis’ ) ;
t i t l e ( ’Phase Plane Plot’ ) ;

After a while, we get a plot we like as shown in Fig. 9.5.


Not bad! But we don’t show the nullclines and the eigenvector lines on top of the
plots! That would be nice to do and we can do that using tools in MatLab.

Fig. 9.5 Phase plane x' = 4x − y, y' = 8x − 5y

9.5 Further Automation!

So let’s start using some more MatLab tools. As always, with power comes an
increased need for responsible behavior!

9.5.1 Eigenvalues in MatLab

We will now discuss certain ways to compute eigenvalues and eigenvectors for a
square matrix in MatLab. For a given A, we can compute its eigenvalues as follows:

Listing 9.10: Eigenvalues in Matlab: eig


A = [1 2 3; 4 5 6; 7 8 -1]

A =

   1   2   3
   4   5   6
   7   8  -1

E = eig(A)

E =

   -0.3954
   11.8161
   -6.4206

So we have found the eigenvalues of this small 3 × 3 matrix. Note, in general they are
not returned in any sorted order like small to large. Bummer! To get the eigenvectors,
we do this:

Listing 9.11: Eigenvectors and Eigenvalues in Matlab: eig


[V, D] = eig(A)

V =

   0.7530  -0.3054  -0.2580
  -0.6525  -0.7238  -0.3770
   0.0847  -0.6187   0.8896

D =

  -0.3954        0        0
        0  11.8161        0
        0        0  -6.4206

The eigenvalue/eigenvector pairs are thus

λ1 = −0.3954
⎡ ⎤
0.7530
V1 = ⎣ −0.6525 ⎦
0.0847

λ2 = 11.8161
⎡ ⎤
−0.3054
V2 = ⎣ −0.7238 ⎦
−0.6187

λ3 = −6.4206
⎡ ⎤
−0.2580
V3 = ⎣ −0.3770 ⎦
0.8896

Now let’s try a nice 5 × 5 array that is symmetric:



Listing 9.12: Example 5 × 5 Eigenvalue and Eigenvector Calculation in Matlab


1 B = [1 2 3 4 5;
2 5 6 7 9;
3 6 1 2 3;
4 7 2 8 9;
5 9 3 9 6]
6
B =

1 2 3 4 5
2 5 6 7 9
11 3 6 1 2 3
4 7 2 8 9
5 9 3 9 6

[W, Z ] = e i g ( B)
16
W =

0.8757 0.0181 −0.0389 0.4023 0.2637


−0.4289 −0.4216 −0.0846 0.6134 0.5049
21 0.1804 −0.6752 0.4567 −0.4866 0.2571
−0.1283 0.5964 0.5736 −0.0489 0.5445
0.0163 0.1019 −0.6736 −0.4720 0.5594

26 Z =

0.1454 0 0 0 0
0 2.4465 0 0 0
0 0 −2.2795 0 0
31 0 0 0 −5.9321 0
0 0 0 0 26.6197

It is possible to show that the eigenvalues of a symmetric matrix will be real and
eigenvectors corresponding to distinct eigenvalues will be 90◦ apart. Such vectors
are called orthogonal and recall this means their inner product is 0. Let’s check it
out. The eigenvectors of our matrix are the columns of W above. So their dot product
should be 0!

Listing 9.13: Inner Products in Matlab: dot


C = dot(W(1:5,1),W(1:5,2))

C =

   1.3336e-16

Well, the dot product is not actually 0 because we are dealing with floating point
numbers here, but as you can see it is close to machine zero (the smallest number
our computer chip can detect). Welcome to the world of computing!
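A slightly stronger version of this check, again an addition of ours rather than part of the text: since B is symmetric, the whole eigenvector matrix W should be orthogonal, so WᵀW should be the identity up to rounding.

% check that the eigenvector columns of W are mutually orthogonal unit vectors
norm(W'*W - eye(5))    % expect a number on the order of machine precision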

9.5.2 Linear System Models in MatLab Again

We have already solved linear models using MatLab tools. Now we will learn to do a bit more. We begin with a sample problem. Note that to analyze a linear systems model, we can do everything by hand; we can then try to emulate the hand work using computational tools. A sketch of the process is thus:
1. For the system below, first do the work by hand.
    
x  (t) 4 −1 x(t)
=
y  (t) 8 −5 y(t)
   
x(0) −3
=
y(0) 1

• Find the characteristic equation


• Find the general solution
• Solve the IVP
• On the same x–y graph,
(a) draw the x' = 0 line
(b) draw the y' = 0 line
(c) draw the eigenvector one line
(d) draw the eigenvector two line
(e) divide the x–y plane into four regions corresponding to the algebraic signs of x' and y'
(f) draw the trajectories of enough solutions for various initial conditions to create the phase plane portrait
2. Now do the graphical work with MatLab. Use the MatLab scripts below. Make
sure you have the relevant codes in your directory. For the problem above, a
typical MatLab session would be

Listing 9.14: Sample Linear Model: x  = 4x − y, y  = 8x − 5y


v e c f u n c = @( t , x ) [ 4 ∗ x ( 1 )−x ( 2 ) ; 8 ∗ x ( 1 ) −5∗x ( 2 ) ] ;
% E1 i s [ 1 ; 8 ] s o s l o p e i s 8
E1 = @( x ) 8∗ x ;
% E2 i s [ 1 ; 1 ] s o s l o p e i s 1
5 E2 = @( x ) x ;
% x ’ = 0 i s 4x−y =0 o r y = 4x
xp = @( x ) 4∗ x ;
%y ’=0 i s 8x−5y =0 s o 5 y = 8x o r y = 8 / 5 x
yp = @( x ) ( 8 / 5 ) ∗x ;
10 T = 0.4;
h = .01;
x0 = [ − 3 ; 1 ] ;
N = c e i l (T / h ) ;
[ ht , rk , f k ] = FixedRK ( v e c f u n c , 0 , x0 , h , 4 ,N) ;
15 X = rk ( 1 , : ) ;
Y = rk ( 2 , : ) ;
xmin = min (X) ;
xmax = max (X) ;
x t o p = max ( abs ( xmin ) , abs ( xmax ) ) ;
20 ymin = min (Y) ;
ymax = max (Y) ;
y t o p = max ( abs ( ymin ) , abs ( ymax ) ) ;

D = max ( xtop , y t o p ) ;
x = l i n s p a c e (−D, D, 2 0 1 ) ;
25 p l o t ( x , E1 ( x ) , ’-r’ , x , E2 ( x ) , ’-m’ , x , xp ( x ) , ’-b’ , x , yp ( x ) , ’-c’ ,X, Y, ’-k’ ) ;
x l a b e l ( ’x’ ) ;
y l a b e l ( ’y’ ) ;
t i t l e ( ’Phase Plane for Linear System x’’ = 4x-y, y’’=8x-5y, x(0) = -3, y(0) =
1’ ) ;
l e g e n d ( ’E1’ , ’E2’ , ’x’’=0’ , ’y’’=0’ , ’y vs x’ , ’Location’ , ’Best’ ) ;

which generates the plot seen in Fig. 9.6.


3. Now find eigenvectors and eigenvalues with MatLab for this problem. In MatLab
this is done like this

Listing 9.15: Find the Eigenvalues and Eigenvectors in Matlab


A = [4,-1;8,-5]

A =

   4  -1
   8  -5

[V, D] = eigs(A)

V =

   0.1240   0.7071
   0.9923   0.7071

D =

  -4   0
   0   3
you should be able to see that the eigenvectors we got before by hand are the
same as these except they are written as vectors of length one.

Fig. 9.6 Sample plot



4. Now plot many trajectories at the same time. We discussed this a bit earlier. It is
very important to note that the hand analysis is in many ways easier. A typical
MatLab session usually requires a lot of trial and error! So try not to get too
frustrated. The AutoPhasePlanePlot.m script is used by filling in values
for the inputs it needs.
Now some attempts.

Listing 9.16: Session for x  = 4x + 9y, y  = −x − 6y Phase Plane Plots


AutoPhasePlanePlot ( vecfunc ,.1 ,0 ,1 ,4 ,4 ,4 , −1 ,1 , −1 ,1) ;
AutoPhasePlanePlot ( vecfunc , . 0 1 , 0 , . 1 , 4 , 4 , 4 , − 1 , 1 , − 1 , 1 ) ;
AutoPhasePlanePlot ( vecfunc , . 0 1 , 0 , . 4 , 4 , 4 , 4 , − 1 , 1 , − 1 , 1 ) ;
AutoPhasePlanePlot ( vecfunc , . 0 1 , 0 , . 6 , 4 , 4 , 4 , − 1 , 1 , − 1 , 1 ) ;
5 AutoPhasePlanePlot ( vecfunc , . 0 1 , 0 , . 6 , 4 , 4 , 4 , − 1 . 5 , 1 . 5 , − 1 . 5 , 1 . 5 ) ;
AutoPhasePlanePlot ( vecfunc , . 0 1 , 0 , . 6 , 4 , 4 , 4 , − . 5 , . 5 , − . 5 , . 5 ) ;
AutoPhasePlanePlot ( vecfunc , . 0 1 , 0 , . 8 , 4 , 4 , 4 , − . 5 , . 5 , − . 5 , . 5 ) ;
AutoPhasePlanePlot ( vecfunc , . 0 1 , 0 , . 8 , 4 , 4 , 4 , − . 3 , . 3 , − . 3 , . 3 ) ;
AutoPhasePlanePlot ( vecfunc , . 0 1 , 0 , . 8 , 4 , 4 , 4 , − . 2 , . 2 , − . 2 , . 2 ) ;
10 AutoPhasePlanePlot ( vecfunc , . 0 1 , 0 , . 4 , 4 , 4 , 4 , − . 2 , . 2 , − . 2 , . 2 ) ;
AutoPhasePlanePlot ( vecfunc , . 0 1 , 0 , . 4 , 4 , 4 , 4 , − . 3 , . 3 , − . 3 , . 3 ) ;
x l a b e l ( ’x axis’ ) ;
y l a b e l ( ’y axis’ ) ;
t i t l e ( ’Phase Plane Plot’ ) ;

which generates the plot seen in Fig. 9.7.

9.5.3 Project

For this project, follow the outline just discussed above to solve any one of these
models. Then

Fig. 9.7 Phase plane plot



Solve the Model By Hand: Do this and attach to your project report.
Plot One Trajectory Using MatLab: Follow the outline above. This part of the
report is done in a word processor with appropriate comments, discussion etc.
Make sure you document all of your MatLab work thoroughly! Show your MatLab
code and sessions as well as plots.
Find the Eigenvalues and Eigenvectors in MatLab: Explain how the MatLab
work connects to the calculations we do by hand.
Plot Many Trajectories Simultaneously Using MatLab: This part of the report
is also done in a word processor with appropriate comments, discussion etc. Show
your MatLab code and sessions as well as plots with appropriate documentation.

Exercise 9.5.1
$$
\begin{bmatrix} x'(t) \\ y'(t) \end{bmatrix} = \begin{bmatrix} 1 & 5 \\ 5 & 1 \end{bmatrix} \begin{bmatrix} x(t) \\ y(t) \end{bmatrix}, \qquad
\begin{bmatrix} x(0) \\ y(0) \end{bmatrix} = \begin{bmatrix} 4 \\ 10 \end{bmatrix}
$$
Exercise 9.5.2
$$
\begin{bmatrix} x'(t) \\ y'(t) \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ -1 & -3/2 \end{bmatrix} \begin{bmatrix} x(t) \\ y(t) \end{bmatrix}, \qquad
\begin{bmatrix} x(0) \\ y(0) \end{bmatrix} = \begin{bmatrix} -4 \\ -6 \end{bmatrix}
$$
Exercise 9.5.3
$$
\begin{bmatrix} x'(t) \\ y'(t) \end{bmatrix} = \begin{bmatrix} 4 & 2 \\ -9 & -5 \end{bmatrix} \begin{bmatrix} x(t) \\ y(t) \end{bmatrix}, \qquad
\begin{bmatrix} x(0) \\ y(0) \end{bmatrix} = \begin{bmatrix} 2 \\ -5 \end{bmatrix}
$$
Exercise 9.5.4
$$
\begin{bmatrix} x'(t) \\ y'(t) \end{bmatrix} = \begin{bmatrix} 3 & 7 \\ 4 & 6 \end{bmatrix} \begin{bmatrix} x(t) \\ y(t) \end{bmatrix}, \qquad
\begin{bmatrix} x(0) \\ y(0) \end{bmatrix} = \begin{bmatrix} -2 \\ 5 \end{bmatrix}
$$
$$
\begin{bmatrix} x'(t) \\ y'(t) \end{bmatrix} = \begin{bmatrix} 1 & 5 \\ 5 & 1 \end{bmatrix} \begin{bmatrix} x(t) \\ y(t) \end{bmatrix}, \qquad
\begin{bmatrix} x(0) \\ y(0) \end{bmatrix} = \begin{bmatrix} 2 \\ 3 \end{bmatrix}
$$

9.6 AutoPhasePlanePlot Again

Here is an enhanced version of the automatic phase plane plot tool. It would be nice
to automate the plotting of the eigenvector lines, the nullclines and the trajectories so
that we didn’t have to do so much work by hand. Consider the new function in Listing
9.17. In this function, we pass in vecfunc into the argument fname. We evaluate
this function using the command feval(fname,0,[a time; an x]);. The
linear model has a coefficient matrix A of the form
 
ab
cd

and so u feval(fname,0,[1;0]); returns column one of A and u feval


(fname,0,[0;1]); returns column two of A. This is how we can extract a, b, c
and d for our linear model without having to type them in. With them we can define
our A using A = [a,b;c,d]; and then grab the eigenvalues and eigenvectors.

Listing 9.17: AutoPhasePlanePlotLinearModel


f u n c t i o n Approx = A u t o P h a s e P l a n e P l o t L i n e a r M o d e l ( fname , s t e p s i z e , t i n i t , t f i n a l ,
rkorder , . . .
x b o x s i z e , y b o x s i z e , xmin , xmax , ymin , ymax , mag )
% fname i s t h e name o f t h e model dy n a m i c s
% s t e p s i z e i s the chosen s t e p s i z e
5 % t i n i t i s the i n i t i a l time
% t f i n a l i s the f i n a l time
% r k o r d e r i s t h e RK o r d e r
% we w i l l u s e i n i t i a l c o n d i t i o n s c h o s e n from t h e box
% [ xmin , xmax ] x [ ymin , ymax ]
10 % T h i s i s done u s i n g t h e l i n s p a c e command
% s o x b o x s i z e i s t h e number o f p o i n t s i n t h e i n t e r v a l [ xmin , xmax ]
% y b o x s i z e i s t h e number o f p o i n t s i n t h e i n t e r v a l [ ymin , ymax ]
% mag i s r e l a t e d t o t h e zoom i n l e v e l o f our p l o t .
%
15 n = c e i l ( ( t f i n a l −t i n i t ) / s t e p s i z e ) ;
% e x t r a c t a , b , c , and d
u = f e v a l ( fname , 0 , [ 1 ; 0 ] ) ;
a = u(1) ;
c = u(2) ;
20 v = f e v a l ( fname , 0 , [ 0 ; 1 ] ) ;
b = v (1) ;
d = v (2) ;
% construct A
A = [a ,b;c ,d];
25 % g e t e i g e n v a l u e s and e i g e n v e c t o r s
[V, D] = e i g (A) ;
e v a l s = d i a g (D) ;
% t h e f i r s t column o f V i s E1 , s e c o n d i s E2
E1 = V ( : , 1 ) ;
30 E2 = V ( : , 2 ) ;
% The r i s e o v e r run o f E1 i s t h e s l o p e
% The r i s e o v e r run o f E2 i s t h e s l o p e
E 1 s l o p e = E1 ( 2 ) / E1 ( 1 ) ;
E 2 s l o p e = E2 ( 2 ) / E2 ( 1 ) ;
35 % define the eigenvector l i n e s
E 1 l i n e = @( x ) E 1 s l o p e ∗x ;
E 2 l i n e = @( x ) E 2 s l o p e ∗x ;
% setup the n u l l c l i n e l i n e s

xp = @( x ) −(a / b ) ∗x ;
40 yp = @( x ) −(c / d ) ∗x ;
% c l e a r o u t any o l d p i c t u r e s
clf
% s e t u p x and y i n i t i a l c o n d i t i o n box
x i c = l i n s p a c e ( xmin , xmax , x b o x s i z e ) ;
45 y i c = l i n s p a c e ( ymin , ymax , y b o x s i z e ) ;
% f i n d a l l t h e t r a j e c t o r i e s and s t o r e them
Approx = { } ;
f o r i = 1: x b o x s i z e
f o r j =1: y b o x s i z e
50 x0 = [ x i c ( i ) ; y i c ( j ) ] ;
[ ht , rk ] = FixedRK ( fname , 0 , x0 , s t e p s i z e , 4 , n ) ;
Approx{ i , j } = rk ;
U = Approx{ i , j } ;
X = U( 1 , : ) ;
55 Y = U( 2 , : ) ;
% g e t t h e p l o t t i n g s q u a r e f o r each t r a j e c t o r y
umin = min (X) ;
umax = max (X) ;
utop = max ( abs ( umin ) , abs ( umax ) ) ;
60 vmin = min (Y) ;
vmax = max (Y) ;
v t o p = max ( abs ( vmin ) , abs ( vmax ) ) ;
D( i , j ) = max ( utop , v t o p ) ;
end
65 end
% get the l a r g e s t square to put a l l the p l o t s i n t o
E = max ( max (D) )
% setup the x linspace for the p l o t
x = l i n s p a c e (−E , E , 2 0 1 ) ;
70 % s t a r t the hold
h o l d on
% p l o t t h e e i g e n v e c t o r l i n e s and t h e n t h e n u l l c l i n e s
p l o t ( x , E 1 l i n e ( x ) , ’-r’ , x , E 2 l i n e ( x ) , ’-m’ , x , xp ( x ) , ’-b’ , x , yp ( x ) , ’-c’ ) ;
% l o o p o v e r a l l t h e ICS and g e t a l l t r a j e c t o r i e s
75 f o r i = 1: x b o x s i z e
f o r j =1: y b o x s i z e
U = Approx{ i , j } ;
X = U( 1 , : ) ;
Y = U( 2 , : ) ;
80 p l o t (X, Y, ’-k’ ) ;
end
end
% s e t l a b e l s and s o f o r t h
x l a b e l ( ’x’ ) ;
85 y l a b e l ( ’y’ ) ;
t i t l e ( ’Phase Plane’ ) ;
l e g e n d ( ’E1’ , ’E2’ , ’x’’=0’ , ’y’’=0’ , ’y vs x’ , ’Location’ , ’BestOutside’ ) ;
% s e t zoom f o r p l o t u s i n g mag
a x i s ([ −E∗mag E∗mag −E∗mag E∗mag ] ) ;
90 % f i n i s h the hold
hold o f f ;
end

In use, for f = @(t,x)[4*x(1)+9*x(2);-x(1)-6*x(2)] we can generate nice looking plots and zero in on a nice view with an appropriate use of the parameter mag.

Listing 9.18: Session for x  = 4x + 9y, y  = −x − 6y Enhanced Phase Plane Plots


Approx = AutoPhasePlanePlotLinearModelE ( f , . 0 1 , 0 , . 4 5 , 4 , 8 , 8 , − . 5 , . 5 , − . 5 , . 5 , . 2 ) ;
E =
4.1421
Approx = AutoPhasePlanePlotLinearModelE ( f , . 0 1 , 0 , . 6 5 , 4 , 8 , 8 , − . 5 , . 5 , − . 5 , . 5 , . 1 ) ;
5 E =
7.6481
Approx = AutoPhasePlanePlotLinearModelE ( f , . 0 1 , 0 , . 6 5 , 4 , 1 2 , 1 2 , − 1 . 5 , 1 . 5 , − 1 . 5 , 1 . 5 , . 1 ) ;
E =
22.9443
10 Approx = AutoPhasePlanePlotLinearModelE ( f
,.01 ,0 ,1.65 ,4 ,12 ,12 , −1.5 ,1.5 , −1.5 ,1.5 ,.01) ;
E =
462.3833
Approx = AutoPhasePlanePlotLinearModelE ( f
,.01 ,0 ,1.65 ,4 ,12 ,12 , −1.5 ,1.5 , −1.5 ,1.5 ,.005) ;
E =
15 462.3833
Approx = AutoPhasePlanePlotLinearModelE ( f , . 0 1 , 0 , 1 . 6 5 , 4 , 6 , 6 , − 1 . 5 , 1 . 5 , − 1 . 5 , 1 . 5 , . 0 0 5 )
;
E =
462.3833
Approx = AutoPhasePlanePlotLinearModelE ( f , . 0 1 , 0 , 1 . 6 5 , 4 , 6 , 6 , − 1 . 5 , 1 . 5 , − 1 . 5 , 1 . 5 , . 0 0 8 )
;
20 E =
462.3833

The last command generates the plot in Fig. 9.8.


Let’s look at all the pieces of this code in detail. It will help you learn a bit more
about how to write code to solve your models. First, we find the number of steps
needed as we have done before; n = ceil((tfinal-tinit)/stepsize);
Now we don’t want to have to hand calculate the eigenvalues and eigenvectors
anymore. To use the eig command, we need the matrix A from the dynamics

Fig. 9.8 Phase plane x' = 4x + 9y, y' = −x − 6y

fname. To extract A, we use the following lines of code. Note, for a linear model,
f = @(t,x) [a*x(1) + b*x(2); c*x(1)+d*x(2)];. Hence, we can
do evaluations to find the coefficients: f(0,[1;0]) = [a;c] and f(0,[0;1])
= [b;d].

Listing 9.19: Extracting A


% extract [a;c]
u = f e v a l ( fname , 0 , [ 1 ; 0 ] ) ;
% e x t r a c t a and c
a = u(1) ;
5 c = u(2) ;
% extract [b;d]
v = f e v a l ( fname , 0 , [ 0 ; 1 ] ) ;
% e x t r a c t b and d
b = v (1) ;
10 d = v (2) ;
% s e t up A
A = [a ,b;c ,d];
% g e t t h e e i g e n v a l u e s and e i g e n v e c t o r s o f A
[V, D] = e i g s (A) ;
15 % t h e d i a g (D) command g e t s t h e d i a g o n a l o f D
% which g i v e s t h e two e i g e n v a l u e s
e v a l s = d i a g (D) ;

Next, we set the first column of V to be the eigenvector E1 and the second column of V to be the eigenvector E2. Then we set up the lines we need to plot the eigenvectors and the nullclines. The x' = 0 nullcline is ax + by = 0 or y = −(a/b)x and the y' = 0 nullcline is cx + dy = 0 or y = −(c/d)x.

Listing 9.20: Setting up the eigenvector and nullcline lines


% e x t r a c t e i g e n v e c t o r 1 and e i g e n v e c t o r 2
E1 = V ( : , 1 ) ;
E2 = V ( : , 2 ) ;
% f i n d t h e s l o p e o f e i g e n v e c t o r 1 and
5 % the slope of eigenvector 2
E 1 s l o p e = E1 ( 2 ) / E1 ( 1 ) ;
E 2 s l o p e = E2 ( 2 ) / E2 ( 1 ) ;
% s e t the eigenvector l i n e s
E 1 l i n e = @( x ) E 1 s l o p e ∗x ;
10 E 2 l i n e = @( x ) E 2 s l o p e ∗x ;
% s e t the n u l l c l i n e l i n e s
xp = @( x ) −(a / b ) ∗x ;
yp = @( x ) −(c / d ) ∗x ;

Then clear any previous figures with clf before we get started with the plot. The initial conditions are chosen by dividing the interval [xmin, xmax] into xboxsize points. Similarly, we divide [ymin, ymax] into yboxsize points. We use these xboxsize × yboxsize possible pairs as our initial conditions.

Listing 9.21: Set up Initial conditions, find trajectories and the bounding boxes
% s e t up p o s s i b l e x c o o r d i n a t e s o f t h e ICs
x i c = l i n s p a c e ( xmin , xmax , x b o x s i z e ) ;
% s e t up p o s s i b l e y c o o r d i n a t e s o f t h e ICs
y i c = l i n s p a c e ( ymin , ymax , y b o x s i z e ) ;
5 % s e t up a d a t a s t r u c t u r e c a l l a c e l l t o s t o r e
% each t r a j e c t o r y
Approx = { } ;
% loop over a l l p o s s i b l e i n i t i a l condition
f o r i =1: x b o x s i z e
10 f o r j =1: y b o x s i z e
% s e t t h e IC
x0 = [ x i c ( i ) ; y i c ( j ) ] ;
% s o l v e t h e model u s i g n RK4 .
% r e t u r n the approximate v a l u e s in rk
15 [ ht , rk ] = FixedRK ( fname , 0 , x0 , s t e p s i z e , 4 , n ) ;
% s t o r e r k a s t h e i , j t h e n t r y i n t h e c e l l Approx
Approx{ i , j } = rk ;
% s e t U t o be t h e c u r r e n t Approx c e l l e n t r y
% which i s t h e same a s t h e r e t u r n e d r k
20 U = Approx{ i , j } ;
% s e t t h e f i r s t row o f U t o be X
% s e t t h e s e c o n d row o f U t o be Y
X = U( 1 , : ) ;
Y = U( 2 , : ) ;
25 % find the square the t r a j e c t o r y f i t s i n s i d e
umin = min (X) ;
umax = max (X) ;
utop = max ( abs ( umin ) , abs ( umax ) ) ;
vmin = min (Y) ;
30 vmax = max (Y) ;
v t o p = max ( abs ( vmin ) , abs ( vmax ) ) ;
% s t o r e the s i z e of the square t h a t f i t s the t r a j e c t o r y
% f o r t h i s IC
D( i , j ) = max ( utop , v t o p ) ;
35 end
end

Now that we have all these squares for the possible trajectories, we find the biggest
one possible and set up the linspace command for this box. All of our trajectories
will be drawn inside the square [−E, E] × [−E, E].

Listing 9.22: Set up the bounding box for all the trajectories
E = max(max(D))
x = linspace(-E,E,201);

Next, plot all the trajectories and set labels and so forth. The last thing we do is to set the axis as axis([-E*mag E*mag -E*mag E*mag]); which zooms in on the square [−E·mag, E·mag] × [−E·mag, E·mag].

Listing 9.23: Plot the trajectories at the chosen zoom level


h o l d on
p l o t ( x , E 1 l i n e ( x ) , ’-r’ , x , E 2 l i n e ( x ) , ’-m’ , x , xp ( x ) , ’-b’ , x , yp ( x ) , ’-c’ ) ;
f o r i = 1: x b o x s i z e
f o r j =1: y b o x s i z e
5 U = Approx{ i , j } ;
X = U( 1 , : ) ;
Y = U( 2 , : ) ;
p l o t (X, Y, ’-k’ ) ;
end
10 end
x l a b e l ( ’x’ ) ;
y l a b e l ( ’y’ ) ;
t i t l e ( ’Phase Plane’ ) ;
l e g e n d ( ’E1’ , ’E2’ , ’x’’=0’ , ’y’’=0’ , ’y vs x’ , ’Location’ , ’BestOutside’ ) ;
15 a x i s ([ −E∗mag E∗mag −E∗mag E∗mag ] ) ;
hold o f f ;

9.6.1 Project

Here is another project which uses AutoPhasePlanePlotLinearModel. For


a given model
Plot Many Trajectories Simultaneously Using MatLab: This part of the report
is done in a word processor with appropriate comments, discussion etc. Show
your MatLab code and sessions as well as plots with appropriate documentation.
Choose Initial Conditions wisely: Choose a useful set of initial conditions to plot
trajectories for by choosing xmin, xmax, ymin, ymax, and xboxsize,
yboxsize appropriately.
Choose the final time and step size: You’ll also have to find the right final time
and step size to use.
Generate a nice phase plane plot: Work hard on generating a really nice phase
plane plot by an appropriate use of the mag factor.

Exercise 9.6.1
$$
\begin{bmatrix} x'(t) \\ y'(t) \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ -1 & -3/2 \end{bmatrix} \begin{bmatrix} x(t) \\ y(t) \end{bmatrix}
$$
Exercise 9.6.2
$$
\begin{bmatrix} x'(t) \\ y'(t) \end{bmatrix} = \begin{bmatrix} 4 & 2 \\ -9 & -5 \end{bmatrix} \begin{bmatrix} x(t) \\ y(t) \end{bmatrix}
$$
Exercise 9.6.3
$$
\begin{bmatrix} x'(t) \\ y'(t) \end{bmatrix} = \begin{bmatrix} 3 & 7 \\ 4 & 6 \end{bmatrix} \begin{bmatrix} x(t) \\ y(t) \end{bmatrix}
$$
Exercise 9.6.4
$$
\begin{bmatrix} x'(t) \\ y'(t) \end{bmatrix} = \begin{bmatrix} 1 & 5 \\ 5 & 1 \end{bmatrix} \begin{bmatrix} x(t) \\ y(t) \end{bmatrix}
$$

9.7 A Two Protein Model

We begin with a description of what you need to do by using a sample problem.


Consider the following two protein system.
      
x  (t) −0.001 −0.001 x(t) 0.1
= +
y  (t) −0.004 −0.008 y(t) 0.5
   
x(0) 0
=
y(0) 0

Note we can group terms like this:

x  = (−0.001x + 0.1) − 0.001y


y  = (−0.008y + 0.5) − 0.004x.

Hence, we can interpret the x dynamics as protein x follows a standard protein


synthesis model (P  = −αP + β) but as protein y is created it binds with the
promoter for the gene controlling x to shut off x production at the rate −0.001.
Similarly, the y dynamics reflect a standard protein synthesis model for y with the
production of x curtailing the production of y through binding with y’s promoter. We
know this sort of linear system with constant inputs has a solution which we can find
using our standard tools. But we do want to make sure this model has a solution where
the protein levels are always positive. Otherwise, it is not a biologically realistic two
protein model!
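Before we worry about the algebra, here is a quick numerical look at this particular system using the FixedRK tools from the earlier chapters. This is only a sketch of ours; the final time and step size below are reasonable guesses rather than values prescribed by the text.

% integrate the two protein system x' = -0.001x - 0.001y + 0.1,
%                                  y' = -0.004x - 0.008y + 0.5
f = @(t,p) [-0.001*p(1) - 0.001*p(2) + 0.1; ...
            -0.004*p(1) - 0.008*p(2) + 0.5];
p0 = [0;0];
h = 1.0; T = 2000; N = ceil(T/h);
[pt,papprox] = FixedRK(f,0,p0,h,4,N);
plot(pt,papprox(1,:),pt,papprox(2,:));
xlabel('time'); ylabel('protein level'); legend('x','y');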

9.7.1 Protein Values Should be Positive!

In general, although it is easy to write down a two protein system, it is a bit harder to
make sure you get positive values for the long term protein concentration levels. So as
we were making up problems, we used a MatLab script to help get a good coefficient
matrix. This is an interesting use of Matlab and for those of you who would like to dig
more deeply into this aspect of modeling, we encourage you to study closely what we do. The terms that inhibit or enhance the other protein's production in general are a c1 y term in the x dynamics and a c2 x term in the y dynamics. Note, we can always think of a two dimensional vector like this
$$
\begin{bmatrix} c_1 \\ c_2 \end{bmatrix} = c_1 \begin{bmatrix} 1 \\ c_2/c_1 \end{bmatrix}
$$
and letting t = c2/c1, we have for c1 = γ, that
$$
\begin{bmatrix} c_1 \\ c_2 \end{bmatrix} = \gamma \begin{bmatrix} 1 \\ t \end{bmatrix} = \begin{bmatrix} \gamma \\ t\gamma \end{bmatrix}.
$$
We can do a similar thing for the constant production terms d1 for x and d2 for y to write
$$
\begin{bmatrix} d_1 \\ d_2 \end{bmatrix} = d_1 \begin{bmatrix} 1 \\ d_2/d_1 \end{bmatrix} = d_1 \begin{bmatrix} 1 \\ s \end{bmatrix} = \begin{bmatrix} \beta \\ s\beta \end{bmatrix}
$$
for d1 = β and s = d2/d1. Finally, we can handle the two decay rates for x and y similarly. If these two rates are −α1 for x and −α2 for y, we can model that as
$$
\begin{bmatrix} \alpha_1 \\ \alpha_2 \end{bmatrix} = \alpha_1 \begin{bmatrix} 1 \\ \alpha_2/\alpha_1 \end{bmatrix} = \alpha_1 \begin{bmatrix} 1 \\ u \end{bmatrix} = \begin{bmatrix} \alpha \\ u\alpha \end{bmatrix}
$$
for α1 = α and u = α2/α1. So, if u is a lot more than 1, y will decay fast and we would expect x to take a lot longer to reach equilibrium. Our general two protein model will then have the form

x' = −α x − γ y + β
y' = −uα y + sβ − tγ x

and we want to choose these parameters so we get positive protein levels. The coefficient matrix A here is
$$
A = \begin{bmatrix} -\alpha & -\gamma \\ -t\gamma & -u\alpha \end{bmatrix}
$$
with characteristic equation

r² + (α + uα) r + uα² − tγ² = 0.

So the roots are
$$
r = \frac{1}{2}\left( -(\alpha + u\alpha) \pm \sqrt{\alpha^2 (1-u)^2 + 4 t \gamma^2} \right)
$$

We want both roots negative so we have a decay situation. Hence,
$$
-(\alpha + u\alpha) + \sqrt{\alpha^2 (1-u)^2 + 4 t \gamma^2} < 0.
$$
This implies
$$
(\alpha + u\alpha) > \sqrt{\alpha^2 (1-u)^2 + 4 t \gamma^2}, \qquad
\alpha^2 (1+u)^2 > \alpha^2 (1-u)^2 + 4 t \gamma^2.
$$
After simplifying, we find we need to choose t so that
$$
t < \frac{u \alpha^2}{\gamma^2}.
$$
Let's choose t = (1/2) u α²/γ². Hence, since then tγ = (1/2) u α²/γ, a good choice for A is
$$
A = \begin{bmatrix} -\alpha & -\gamma \\ -(1/2)\, u \alpha^2/\gamma & -u\alpha \end{bmatrix}
$$
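Here is a quick numerical check of this choice, a sketch of ours with sample parameter values matching the ones used later in the chapter.

% verify that t = 0.5*u*alpha^2/gamma^2 really gives two negative eigenvalues
alpha = 0.001; gamma = 0.001; u = 8;      % sample values
t = 0.5*u*alpha^2/gamma^2;
A = [-alpha, -gamma; -t*gamma, -u*alpha];
eig(A)                                    % both entries should be negative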

Next, we set up the growth levels. We want the particular solution to have positive components. If we let the particular solution be X_p, we know X_p = −A⁻¹F where F is the vector of constant external inputs. Hence, we want −A⁻¹F to have positive components. Thus,
$$
-\frac{1}{\det(A)} \begin{bmatrix} -u\alpha & \gamma \\ t\gamma & -\alpha \end{bmatrix} \begin{bmatrix} \beta \\ s\beta \end{bmatrix}
> \begin{bmatrix} 0 \\ 0 \end{bmatrix}
$$
where we interpret the inequality componentwise. Now det(A) = uα² − tγ². Plugging in t = 0.5 u α²/γ², we find det(A) = 0.5 uα² > 0. Hence, we satisfy our component inequalities if

−u α β + s β γ < 0
t β γ − s α β < 0.

Cancelling β, we find

−u α + s γ < 0 ⇒ s γ < u α
t γ − s α < 0 ⇒ t γ < s α.

Plugging in t, we want

s γ < u α
0.5 u (α²/γ²) γ < s α.
γ

Simplifying, we find we want s to satisfy
$$
0.5\, u \frac{\alpha}{\gamma} < s < u \frac{\alpha}{\gamma}.
$$
Hence, any s strictly between 0.5 uα/γ and uα/γ will work. Parameterize this interval as s = u(α/γ)(1 − 0.5z), which at z = 0 gives uα/γ and at z = 1 gives 0.5 uα/γ. We will choose z = 1/3. The commands in Matlab are then

Listing 9.24: Choose s and set the external inputs


% choose s
z = 1/3;
s = (1 - 0.5*z)*alpha*u/gamma;
% set external input
F = [beta; s*beta]

We can easily write MatLab code to do all of this in a function we'll call twoprotsGetAF. It will return a good A and F for our choices of α, γ, u and β.

Listing 9.25: Finding the A and F for a two protein model


function [A,F] = twoprotsGetAF(alpha,gamma,u,beta)
%
% -alpha is decay rate for protein x
% -gamma is decay rate for protein y influencing protein x
% -t*gamma is decay rate for protein x influencing protein y
% -u*alpha is decay rate for protein y
% beta is the growth rate for protein x
% s*beta is the growth rate for protein y
% A is the dynamics
%
% now set up A
% The roots satisfy
%   2r = -alpha*(1+u) pm sqrt(D)
% where D = alpha^2 (1-u)^2 + 4 t gamma^2
% We want both roots negative, so we want
%
%   -alpha*(1+u) + sqrt(D) < 0
%   alpha*(1+u) > sqrt(D)
%   alpha^2 (1+u)^2 > alpha^2 (1-u)^2 + 4 t gamma^2
%   alpha^2 (1+2u+u^2) > alpha^2 (1-2u+u^2) + 4 t gamma^2
%   4 alpha^2 u > 4 t gamma^2
%   t < u alpha^2/gamma^2
%
% so set t to satisfy this requirement
t = 0.5*alpha^2*u/(gamma^2)
A = [-alpha,-gamma;-t*gamma,-u*alpha];
% set up growth levels
% We want XP > 0 so we want
%   -Ainv*F > 0
%   -(1/det(A)) [-u*alpha,gamma; t*gamma,-alpha]*[1;s]*beta > 0
% Now det(A) = u alpha^2 - t gamma^2
%            = u alpha^2 - (0.5*alpha^2*u/(gamma^2)) gamma^2
%            = u alpha^2 - 0.5 alpha^2 u = 0.5 u alpha^2
% so det(A) > 0
% So we want
%   (1/det(A)) [-u*alpha,gamma; t*gamma,-alpha]*[1;s]*beta < 0
% or
%   [-u*alpha,gamma; t*gamma,-alpha]*[1;s] < 0
% or
%   -u alpha + s gamma < 0  ==> gamma s < u alpha
%                           ==> s < u alpha/gamma
%   t gamma - alpha s < 0   ==> alpha s > t gamma
%                           ==> s > (0.5*alpha^2*u/(gamma^2))(gamma/alpha)
%                           ==> s > 0.5 alpha u/gamma
% So
%   0.5 alpha u/gamma < s < u alpha/gamma
%   s = (alpha u/gamma)(0.5 z + (1-z))
%     = (alpha u/gamma)*(1 - 0.5 z)
%
z = 1/3;
s = (1 - 0.5*z)*alpha*u/gamma
F = [beta; s*beta];
%

Note without the comments, this code is short:

Listing 9.26: Finding the A and F for a two protein model: uncommented
function [A,F] = twoprotsGetAF(alpha,gamma,u,beta)
%
% -alpha is decay rate for protein x
% -gamma is decay rate for protein y influencing protein x
% -t*gamma is decay rate for protein x influencing protein y
% -u*alpha is decay rate for protein y
% beta is the growth rate for protein x
% s*beta is the growth rate for protein y
% A is the dynamics
%
t = 0.5*alpha^2*u/(gamma^2)
A = [-alpha,-gamma;-t*gamma,-u*alpha];
% set up growth levels
z = 1/3;
s = (1 - 0.5*z)*alpha*u/gamma
F = [beta; s*beta];

9.7.2 Solving the Two Protein Model

The section above shows us how to design the two protein system so the protein
levels are always positive so let’s try it out. Let α = γ = a for any positive a and
u = 8. Then for our nice choice of t, we could do the work by hand and find
 
A = [ −a   −a ; −4a   −8a ]

But let’s be smarter and let MatLab do this for us! We can do this for any positive a
and generate a reasonable two protein model to play with. In MatLab, we begin by setting up the matrix A with twoprotsGetAF. We will do this for β = 10.

Listing 9.27: Setting up the coefficient matrix A


a = 10^(-3);
a = 0.0010000
[A,F] = twoprotsGetAF(a,a,8,10);
t = 4
s = 6.6667
A
A =

  -0.0010000  -0.0010000
  -0.0040000  -0.0080000
F
F =

   10.000
   66.667

Then we find the inverse of A using the formulae we developed in class for the
inverse of a 2 × 2 matrix of numbers.

Listing 9.28: Finding the inverse of A


AInv = (1/det(A))*[A(2,2),-A(1,2);-A(2,1),A(1,1)];
AInv
AInv =

  -2000    250
   1000   -250

Next, we find the particular solution, X P.

Listing 9.29: Find the particular solution


XP = -AInv*F
XP =

   3333.67
   6666.67

Now we find the eigenvalues and eigenvectors of the matrix A using the eig com-
mand as we have done before. Recall, this command returns the eigenvectors as
columns of a matrix V and returns the eigenvalues as the diagonal entries of the
matrix D.

Listing 9.30: Find eigenvalues and eigenvectors of A


[V,D] = eig(A)
V =

   0.88316   0.13163
  -0.46907   0.99130

D =

Diagonal Matrix

  -4.6887e-04            0
            0  -8.5311e-03

So column one of V is the eigenvector associated with the eigenvalue −4.6887e−04


and column two of V is the eigenvector associated with the eigenvalue −8.5311e−03.
Note the first eigenvalue is the dominant one as MatLab doesn’t necessarily follow
our conventions on labeling the eigenvalues in a small to large order! The general
solution is known to be
       
[ x(t) ; y(t) ] = C1 [ 0.88316 ; −0.46907 ] e^(−4.6887e−04 t) + C2 [ 0.13163 ; 0.99130 ] e^(−8.5311e−03 t) + [ 3333.67 ; 6666.67 ]

Notice that since eigenvector one and eigenvector two are the columns of the matrix V, we can rewrite this in matrix–vector form as

[ x(t) ; y(t) ] = V [ C1 e^(−4.6887e−04 t) ; C2 e^(−8.5311e−03 t) ] + XP.

Our protein models start with zero levels of proteins, so to satisfy the initial condi-
tions, we have to solve
       
[ 0 ; 0 ] = C1 [ 0.88316 ; −0.46907 ] + C2 [ 0.13163 ; 0.99130 ] + [ 3333.67 ; 6666.67 ]

Hence, the initial data gives us


 
V [ C1 ; C2 ] = −XP.

and the solution is then


 
[ C1 ; C2 ] = V⁻¹(−XP) = −V⁻¹(XP).

In Matlab, we then have

Listing 9.31: Solving the IVP


VInv = (1/det(V))*[V(2,2),-V(1,2);-V(2,1),V(1,1)];
C = -VInv*XP
C =

  -2589.4
  -7950.4

Then we can construct the solutions x(t) and y(t) so we can plot the protein concen-
trations as a function of time. First, store the eigenvalues in a more convenient form
by grabbing the diagonal entries of D using Ev = diag(D). Then, we construct
the solutions like this:

Listing 9.32: Constructing the solutions x and y and plotting


Ev = diag(D);
x = @(t) (C(1)*V(1,1)*exp(Ev(1)*t)+C(2)*V(1,2)*exp(Ev(2)*t)+XP(1));
y = @(t) (C(1)*V(2,1)*exp(Ev(1)*t)+C(2)*V(2,2)*exp(Ev(2)*t)+XP(2));
T = linspace(0,12000,1000);
plot(T,x(T),T,y(T),'-+');
xlabel('Time');
ylabel('Protein Levels');
title('Two Protein Model: alpha = gamma = 1.0e-3, u = 8, beta = 10');
legend('Protein 1','Protein 2');

This generates Fig. 9.9.


Note we can also do a quick phase plane plot: plotting y(t) versus x(t) as shown
in Fig. 9.10. We see the protein trajectory converges to the particular solution as
t → ∞. Of course, you have to plot the solutions over long enough time to see this!

Listing 9.33: The protein phase plane plot


T = linspace(0,12000,1000);
plot(x(T),y(T));
xlabel('Protein One');
ylabel('Protein Two');
title('Protein Two vs Protein One');

In summary, to solve the two protein problem, you would just type a few lines in
MatLab as follows:

Fig. 9.9 Two interacting proteins again



Fig. 9.10 Phase plane for two interacting proteins again

Listing 9.34: Solving the problem


a = 10^(-3);
[A,F] = twoprotsGetAF(a,a,8,10);
% find eigenvalues and eigenvectors
[V,D] = eig(A);
Ev = diag(D)
VInv = (1/det(V))*[V(2,2),-V(1,2);-V(2,1),V(1,1)];
AInv = (1/det(A))*[A(2,2),-A(1,2);-A(2,1),A(1,1)];
% find particular solution
XP = -AInv*F
% Use initial conditions
C = -VInv*XP
% set up x and y
x = @(t) (C(1)*V(1,1)*exp(Ev(1)*t)+C(2)*V(1,2)*exp(Ev(2)*t)+XP(1));
y = @(t) (C(1)*V(2,1)*exp(Ev(1)*t)+C(2)*V(2,2)*exp(Ev(2)*t)+XP(2));
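As an optional sanity check, we can also integrate the same linear system numerically and compare against the closed form solution we just built. The sketch below is not part of the derivation; it simply feeds the dynamics X′ = AX + F to MatLab's ode45 with zero initial conditions and reports the largest disagreement with our analytic x(t) and y(t). The variable names rhs, tN and XN are our own choices.

% optional numerical cross-check of the closed form solution
rhs = @(t,X) A*X + F;              % linear protein dynamics X' = A X + F
[tN,XN] = ode45(rhs,[0 12000],[0;0]);
errx = max(abs(XN(:,1) - x(tN)));  % compare to the analytic solutions
erry = max(abs(XN(:,2) - y(tN)));
fprintf('max x error = %g, max y error = %g\n',errx,erry);

If everything above is set up correctly, both errors should be tiny compared to protein levels of a few thousand.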

The last thing is to think about response times. Here x grows a lot slower by design as we set u = 8, so we can temporarily think of x as ≈ 0, giving the y dynamics y′ ≈ −uαy + sβ. Thus, the response time for y is about t_r^y = ln(2)/(uα), or in MatLab, tRy = log(2)/(-A(2,2)). To approximate the response time of x, start with y achieving its approximate steady state value from the y dynamics approximation above. This is y_∞ ≈ sβ/(uα). Using this in the x dynamics, we have x′ ≈ −αx + β − γ y_∞, which has a response time of t_r^x ≈ ln(2)/α, or in MatLab, tRx = log(2)/(-A(1,1)). Hence, in our problem, one protein grows very slowly and so we typically see two time scales. On the first, one protein changes rapidly while the other is essentially constant. But on the longer time scale, the protein which grows slower initially eventually overtakes the first protein on its way to its higher equilibrium value. As discussed above, we estimate these time scales as

Listing 9.35: Estimated Protein Response Times


tRy = log(2)/(-A(2,2))
tRx = log(2)/(-A(1,1))

Note the large difference in time scales! The protein levels we converge to are
x = 3333.67 and y = 6666.67, but the response time for the x protein is much
longer than the response time for the y protein. So if we plot over different time
scales, we should see primarily y protein until we reach a suitable fraction of the x protein response time. Figure 9.11 shows a plot over a short time scale, 10 t_r^y. Over 10 short time scales, protein y is essentially saturated but protein x is only about 1/3 of its equilibrium value.

Listing 9.36: Short Time Scale Plot


T = linspace(0,10*tRy,100);
plot(T,x(T),T,y(T),'+');
xlabel('Time');
ylabel('Protein Levels');
title('Two Protein Model over Short Time Scale');
legend('Protein X','Protein Y');

Figure 9.12 shows a plot over a long time scale, 10 t_r^x, which is about 80 t_r^y. This allows x to get close to its asymptotic value. Note the long term value of y drops some from its earlier peak. Our response times here are just approximations and the

Fig. 9.11 Two interacting proteins on a short time scale

Fig. 9.12 Two interacting proteins on a long time scale

protein interactions of the model eventually take effect and the equilibrium value of
y drops to its final particular solution value.

Listing 9.37: Long Time Scale Plot


T = linspace(0,10*tRx,10000);
plot(T,x(T),T,y(T),'+');
xlabel('Time');
ylabel('Protein Levels');
title('Two Protein Model over Long Time Scale');
legend('Protein X','Protein Y');

9.7.3 Project

Now we are ready for the project. For the model choice of α = 0.003, γ = 0.005
and u = 8 and β = 0.01, we expect x to grow a lot slower. Your job is to follow
the discussion above and generate a nice report. The report has this format.
Introduction (5 Points) Explain what we are trying to do here with this model.
Contrast this model with two proteins to our earlier model for one protein.
Description of Model (10 Points) Describe carefully what each term in the model
represents in terms of protein transcription. Since these models are abstractions
of reality, explain how these models are a simplified version of the protein tran-
scription process.

Annotated Solution Discussion (27 Points) In this part, you solve the model using
MatLab as I did in the example above. As usual, explain your steps nicely. Cal-
culate the approximate response times for protein x and protein y and use them
to generate appropriate plots. For each plot you generate, provide useful labels,
legends and titles and include them in your document.
Conclusion (5 Points) Discuss what you have done here. Do you think you could
do something similar for more than two proteins using MatLab?
References (3 Points) Put any reference material you use in here.
Now do it all again, but this time set β = 2 with u = 0.1. This should switch the
short term and long term protein behavior.

9.8 A Two Box Climate Model

This is a simple model of how the ocean responds to the various inputs that give rise
to greenhouse warming. The ocean is modeled as two layers: the top layer is shallow
and is approximately 100 m in depth while the second layer is substantially deeper
as it is on average 4000 m in depth. The top layer is called the mixed layer as it
interacts with the atmosphere and so there is a transfer of energy back and forth from
the atmosphere to the top layer. The bottom layer is the deep layer and it exchanges
energy only with the mixed layer. The model we use is as follows:

Cm Tm′ = F − λ1 Tm − λ2 (Tm − Td)
Cd Td′ = λ2 (Tm − Td)

There are a lot of parameters here and they all have a physical meaning.
t: This is time.
Tm : This is the temperature of the mixed layer. We assume it is measured as the
deviation from the mixed layer’s equilibrium temperature. Hence, Tm = 1 would
mean the temperature has risen 1◦ from its equilibrium temperature. We therefore
set Tm = 0 initially and when we solve our model, we will know how much Tm
has gone up.
Td : This is the temperature of the deep layer. Again, this is measured as the
difference from the equilibrium temperature of the deep layer. We set Td = 0
initially also.
F: This is the external input which represents all the outside things that contribute
to global warming such as CO2 release and so forth. It does not have to be a constant
but in our project we will use a constant value for F.
Cm : This is the heat capacity of the mixed layer.
Cd : This is the heat capacity of the deep layer.
λ1 : This is the exchange coefficient which determines the rate at which heat is
transferred from the mixed layer to the atmosphere.

λ2 : This is the exchange coefficient which determines the rate at which heat is
transferred from the mixed layer to the deep layer.
Looking at the mixed layer dynamics, we see there are two loss terms for the mixed
layer temperature. The first is based on the usual exponential decay model using the
exchange coefficient λ1 which specifies how fast heat is transferred from the mixed
layer to the atmosphere. The second loss term models how heat is transferred from the
mixed layer to the deep layer. We assume this rate is proportional to the temperature
difference between the layers which is why this loss is in terms of Tm − Td . The
deep layer temperature dynamics is all growth as the deep layer is picking up energy
from the mixed layer above it. Note this type of modeling—a loss in Tm and the
loss written as a gain for Td —is exactly what we will do in the SIR disease model
that comes later. Our reference for this model is the nice book on climate modeling
(Vallis 2012). It is a book all of you can read with profit as it uses mathematics you
now know very well. We can rewrite the dynamics into the standard form with a little
manipulation.

Cm Tm′ = −(λ1 + λ2) Tm + λ2 Td + F
Cd Td′ = λ2 Tm − λ2 Td

In matrix form we then have


      
[ Cm Tm′ ; Cd Td′ ] = [ −(λ1 + λ2)   λ2 ; λ2   −λ2 ] [ Tm ; Td ] + [ F ; 0 ]

Then dividing by the heat capacities, we find

      
[ Tm′ ; Td′ ] = [ −(λ1 + λ2)/Cm   λ2/Cm ; λ2/Cd   −λ2/Cd ] [ Tm ; Td ] + [ F/Cm ; 0 ]

Hence, this model is another of our standard linear models with an external input
such as we solved in the two protein model. The A matrix here is

 
A = [ −(λ1 + λ2)/Cm   λ2/Cm ; λ2/Cd   −λ2/Cd ]

and letting F denote the external input vector and T denote the vector of layer temperatures, we see the two box climate model is represented by our usual dynamics T′ = AT + F which we know how to solve.
Note the particular solution here is TP = −A⁻¹F.
  
TP = −(1/det(A)) [ −λ2/Cd   −λ2/Cm ; −λ2/Cd   −(λ1 + λ2)/Cm ] [ F/Cm ; 0 ]

We find det(A) = (λ1 λ2)/(Cm Cd) and so


 
TP = −(Cm Cd)/(λ1 λ2) [ −λ2 F/(Cm Cd) ; −λ2 F/(Cm Cd) ]
   = (F/λ1) [ 1 ; 1 ]

So, if both eigenvalues of this model were negative, the long term equilibrium of both the mixed and deep layers would be the same: Td∞ = Tm∞ = F/λ1.
Next, we show the eigenvalues are indeed negative here.
For convenience of exposition, let α = λ1 /Cm , β = λ2 /Cm and γ = λ2 /Cd .
Also, it is helpful to express Cd as a multiple of Cm , so we write Cd = ρCm where
ρ is our multiplier. With these changes, the model can be rewritten. We find
      
[ Tm′ ; Td′ ] = [ −(α + β)   β ; γ   −γ ] [ Tm ; Td ] + [ F/Cm ; 0 ]

But since γ = λ2 /Cd and Cd = ρCm we have γ = β/ρ. This gives the new form

      
[ Tm′ ; Td′ ] = [ −(α + β)   β ; β/ρ   −β/ρ ] [ Tm ; Td ] + [ F/Cm ; 0 ]

Hence, any two box climate model can be represented by the dynamics T′ = AT + F
where

 
A = [ −α − β   β ; β/ρ   −β/ρ ]

Let’s look at the eigenvalues of A. The characteristic equation is (r + α + β)(r + β/ρ) − β²/ρ = 0. This simplifies to r² + (α + β + β/ρ)r + αβ/ρ = 0. We suspect both eigenvalues are negative for our climate model as the physics here implies the temperatures stabilize. From the quadratic formula

r = (1/2) [ −(α + β + β/ρ) ± √( (α + β + β/ρ)² − 4αβ/ρ ) ]

The eigenvalue corresponding to the minus is negative. Next, look at the other root.
The discriminant D is the term inside the square root. Let’s show it is positive and
that will help us show the other root is negative also. We have

D = (α + β + β/ρ)² − 4αβ/ρ
  = (α + β)² + 2(α + β)(β/ρ) + β²/ρ² − 4αβ/ρ
  = (α + β)² − 2αβ/ρ + 2β²/ρ + β²/ρ²
  = 2αβ + β² + 2β²/ρ + β²/ρ² − 2αβ/ρ + α²

So we see

D = 2αβ + β² + 2β²/ρ + (β/ρ − α)² > 0

as ρ, α and β are positive. We now know D = (α + β + β/ρ)² − 4αβ/ρ > 0. Hence, if we drop the −4αβ/ρ term, the result is still positive and we have 0 < D < (α + β + β/ρ)². The plus root then is

r = (1/2) [ −(α + β + β/ρ) + √( (α + β + β/ρ)² − 4αβ/ρ ) ]
  < (1/2) [ −(α + β + β/ρ) + √( (α + β + β/ρ)² ) ]
  = (1/2) [ −(α + β + β/ρ) + (α + β + β/ρ) ] = 0.

Hence, the eigenvalues r1 and r2 are both negative and the general solution is
   
[ Tm ; Td ] = a E1 e^(r1 t) + b E2 e^(r2 t) + [ F/λ1 ; F/λ1 ]

where E1 and E2 are the eigenvectors of the two eigenvalues.


The values of these parameters in real climate modeling are fairly well known.
We know that Cd is much larger than Cm and that the mixed layer takes a short time
to reach its temporary equilibrium and that in the long term both Tm and Td reach an
equilibrium value that is larger. To estimate how this works, assume that the value
of Td stays close to zero at first. Then the Tm dynamics becomes simpler as Td = 0:

Tm′ = F/Cm − ((λ1 + λ2)/Cm) Tm

This is our familiar protein synthesis model with steady state value of T̂m∞ = F/(λ1 + λ2) and response time tRm = Cm ln(2)/(λ1 + λ2). After sufficient response times have passed for the mixed layer temperature to reach quasi equilibrium, we will have Tm′ ≈ 0 as Tm is no longer changing by much. Hence, setting Tm′ = 0, we find a relationship between Tm and Td once this new equilibrium is achieved. We have

−(λ1 + λ2 )Tm + F + λ2 Td = 0

which tells us that Tm = (λ2 Td + F)/(λ1 + λ2 ). Substitute this value of Tm into the
Td dynamics and we will get the final equilibrium value of Td for time much larger
than tRm. The deep layer temperature dynamics now become

Td′ = (λ2/Cd) Tm − (λ2/Cd) Td
    = (λ2/Cd) (λ2 Td + F)/(λ1 + λ2) − (λ2/Cd) Td
    = − [ λ1 λ2 / (Cd (λ1 + λ2)) ] Td + [ λ2 / (Cd (λ1 + λ2)) ] F

This is also a standard protein synthesis problem with a steady state value of
   
T̂d∞ = [ λ2 F / (Cd (λ1 + λ2)) ] / [ λ1 λ2 / (Cd (λ1 + λ2)) ]
     = F/λ1

and a response time of

tRd = ln(2) Cd (λ1 + λ2) / (λ1 λ2)

and as t gets large, both temperatures approach the common steady state value of
F/λ1 .
So with a little reasoning and a lot of algebra we find the two response times of
the climate model are t Rm and t Rd as given above. To see these results in a specific
problem, we wrote a MatLab script to solve a typical climate problem. For the values
λ1 = 0.1, λ2 = 0.15000, Cm = 1.0820, Cd = 4.328 = 4Cm so that ρ = 4 and an
external input of F = 0.5, we can compute the solutions to the climate model as
follows

Listing 9.38: A Sample Two Box Climate Model Solution


% Set up A
lambda1 = .1
lambda2 = 0.15
Cm = 1.0820
Cd = 4.328
rho = Cd/Cm
alpha = lambda1/Cm;
beta = lambda2/Cm;
A = [-alpha-beta,beta;beta/rho,-beta/rho];
% Find A inverse
Ainv = (1/det(A))*[A(2,2),-A(1,2);-A(2,1),A(1,1)];
% set up external input
F = 0.5
% Find particular solution
TP = -Ainv*[F/Cm;0]
% Find solution for initial conditions Tm = 0 and Td = 0
[V,D] = eig(A);
VInv = (1/det(V))*[V(2,2),-V(1,2);-V(2,1),V(1,1)];
C = -VInv*TP;
Ev = diag(D);
% Find response times using equations above
tRmixed = Cm*log(2)/(lambda1+lambda2)
tRmixed = 2.9999
tRdeep = rho*Cm*(log(2)/(lambda1*lambda2))*(lambda1+lambda2)
tRdeep = 49.999
% Quasi Equilibrium Estimate of Tm
Tmshort = F/(lambda1+lambda2)
% Long term equilibrium Tm and Td
Tmlong = F/lambda1
Tdlong = F/lambda1
% Set up Tm and Td solutions
Tm = @(t) (C(1)*V(1,1)*exp(Ev(1)*t)+C(2)*V(1,2)*exp(Ev(2)*t)+TP(1));
Td = @(t) (C(1)*V(2,1)*exp(Ev(1)*t)+C(2)*V(2,2)*exp(Ev(2)*t)+TP(2));
% Plot Tm and Td over a few Tm response times
time = linspace(0,4*tRmixed,100);
figure
plot(time,Tm(time),time,Td(time),'+');
xlabel('Time')
ylabel('T_m');
title('Ocean Layer Temperatures');
legend('Mixed Layer','Deep Layer');
% plot Tm and Td until we get close to Tm steady state
time = linspace(0,8*tRmixed,300);
figure
plot(time,Tm(time),time,Td(time),'+');
xlabel('Time in Years')
ylabel('Temperature Degrees C');
title('Ocean Layer Temperatures');
legend('Mixed Layer','Deep Layer');
% plot Tm and Td over Td response times
time = linspace(0,6*tRdeep,500);
figure
plot(time,Tm(time),time,Td(time),'+');
xlabel('Time in Years')
ylabel('Temperature Degrees C');
title('Ocean Layer Temperatures');
legend('Mixed Layer','Deep Layer');

We see the approximate response time of the mixed layer is only 3 years but the
approximate response time for the deep layer is about 50 years. This MatLab session
generates three plots as shown in Figs. 9.13, 9.14 and 9.15.

Fig. 9.13 Simple two box climate model: short time scale

Fig. 9.14 Simple two box climate model: mid time scale

Fig. 9.15 Simple two box climate model: long time scale

9.8.1 Project

Now we are ready for the project. Solve the two box climate model with λ1 = 0.01,
λ2 = 0.02000, Cm = 0.34625, ρ = 2.2222 (so that Cd = ρCm ) and an external
input of F = 0.06. For this model, follow what I did in the example and generate a
report in word as follows:
Introduction (8 Points) Explain what we are trying to do here with this model.
The project description here tells you how this model works. Make sure you
understand all of the steps so that you can explain what this model really means
for policy decisions on global warming.
Description of Model (12 Points) Find additional references and use them to build
a two page description of this kind of simple climate model. Pay particular atten-
tion to a discussion of what the parameters in the model mean and what the

external input F might represent. Our description was deliberately vague as we


want you to find out more.
Annotated Solution Discussion (22 Points) In this part, you solve the model using
MatLab as I did in the example above. As usual, explain your steps nicely. Cal-
culate the approximate response times for the temperatures in each ocean layer
and also state the quasi mixed ocean layer equilibrium. We want to see plots for
short term, midterm and long term time scales with full discussion of what the
plots mean.
Conclusion (5 Points) Discuss what you have done here and how you might use
it for social policy decisions. How could you convince a skeptical audience that
the model results are valid enough to warrant action?
References (3 Points) Put any reference material you use in here.

9.8.2 Control of Carbon Loading

There is talk currently about strategies to reduce carbon loading via a genetically
engineered bacteria or a nano machine. Such a strategy is called carbon sequestering
and it is not clear if this is possible. If you have read or watched much science fiction,
you’ll recognize carbon sequestering as a type of terraforming. This is certainly
both ambitious and no doubt a plan having a lot of risk that needs to be thought
about carefully. However, if we assume such a control strategy for carbon loading
is implemented say 25 years in the future, what does our simple model say? The analysis is straightforward. At 25 years, our model reaches values Tm25 and Td25 which we know are far from the true equilibrium values of ≈5◦ of global warming in our example. If the sequestering strategy were instantly successful this would correspond to setting F = 0 in our external data. This gives a straight exponential decay system
         25 
[ Tm′ ; Td′ ] = [ −(λ1 + λ2)/Cm   λ2/Cm ; λ2/Cd   −λ2/Cd ] [ Tm ; Td ],   [ Tm(25) ; Td(25) ] = [ Tm25 ; Td25 ].

The eigenvalues and eigenvectors for this system are the same as before and we know the half life is about 50 years.

tRd ≈ ln(2) Cd (λ1 + λ2) / (λ1 λ2) ≈ 50.

It doesn’t really matter at what point carbon loading stops. So we can do this analysis
whether carbon loading stops 25 years in the future or 50 years. The deep layer still
takes a long time to return to a temperature of 0 which for us represents current
temperature and thus no global warming. Note the half life t Rd ≈ 50 is what drives
this. The constants λ1 and λ2 represent exchange rates between the mixed layer and
the atmosphere and the deep layer and the mixed layer. The constant Cd is the heat
capacity of the deep layer. If we also assumed carbon sequestering continued even

after carbon loading was shut off, perhaps this would correspond to increasing the
two critical parameters λ1 and λ2 and decreasing Cd . Since
 
1 1
t Rd ≈ ln(2) Cd +
λ1 λ2

such changes would decrease t Rd and bring the deep layer temperature to 0 faster. It
is conceivable administering nano machines and gene engineered bacteria might do
this but it is a strategy that would alter fundamental balances that we have had in
place for millions of years. Hence, it is hard to say if this is wise; certainly further
study is needed. But the bottom line is the carbon sequestering strategies will not
easily return the deep layer to a state of no global increase in temperature quickly.
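To get a rough feel for how strongly tRd depends on these constants, here is a tiny sketch using the example values from Listing 9.38. The doubled exchange rates and halved heat capacity are purely hypothetical choices meant to illustrate the sensitivity, not estimates of what sequestering could actually achieve.

% hypothetical sensitivity check on the deep layer response time
lambda1 = 0.1; lambda2 = 0.15; Cd = 4.328;
tRdNow = log(2)*Cd*(1/lambda1 + 1/lambda2)              % about 50 years
tRdNew = log(2)*(Cd/2)*(1/(2*lambda1) + 1/(2*lambda2))  % about 12.5 years

Doubling both exchange rates while halving the heat capacity cuts the response time by a factor of four in this toy calculation.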
Note we could also assume the carbon sequestering control strategy alters carbon loading as a simple exponential decay F e^(−εt) for some positive ε. Then our model becomes
      − t     25 
[ Tm′ ; Td′ ] = [ −(λ1 + λ2)/Cm   λ2/Cm ; λ2/Cd   −λ2/Cd ] [ Tm ; Td ] + [ F e^(−εt)/Cm ; 0 ],   [ Tm(25) ; Td(25) ] = [ Tm25 ; Td25 ].

which we could solve numerically. Here, the carbon loading would not instantaneously drop to zero at some time like 25; instead it decays gracefully. However, this
still does not change the fact that the response time is determined by the coefficient
matrix A and so the return to a no global warming state will be slow. Finally, it is
sobering to think about how all of this analysis plays out in the backdrop of political
structures that remain in authority for about 4 years or so. We can see how hard it is
to elicit change when the results won’t show up for 10–40 administrations!
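For completeness, here is one way we might solve the decaying forcing version numerically in MatLab. This is only a sketch: the decay rate epsdecay is a made-up illustrative value, and for simplicity the decay is applied from time zero with zero initial temperatures rather than switched on at year 25.

% sketch: two box model with decaying carbon loading F*exp(-epsdecay*t)
lambda1 = 0.1; lambda2 = 0.15; Cm = 1.0820; Cd = 4.328;
A = [-(lambda1+lambda2)/Cm, lambda2/Cm; lambda2/Cd, -lambda2/Cd];
F = 0.5; epsdecay = 0.05;
rhs = @(t,T) A*T + [F*exp(-epsdecay*t)/Cm; 0];
[t,T] = ode45(rhs,[0 300],[0;0]);
plot(t,T(:,1),t,T(:,2),'+');
xlabel('Time in Years');
ylabel('Temperature Degrees C');
legend('Mixed Layer','Deep Layer');

The temperatures rise while the forcing is still appreciable and then relax back toward zero on the slow time scale set by A, which is the point made above.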

Reference

G. Vallis (ed.), Climate and the Oceans, Princeton Primers on Climate (Princeton University Press,
Princeton, 2012)
Part IV
Interesting Models
Chapter 10
Predator–Prey Models

In the 1920s, the Italian biologist Umberto D’Ancona studied population variations
of various species of fish that interact with one another. He came across the data
shown in Table 10.1.
Here, we interpret the percentage we see in column two of Table 10.1 as predator
fish, such as sharks, skates and so forth. Also, the catches used to calculate these
percentages were reported from all over the Mediterranean. The tonnage from all
the different catches for the entire year were then added and used to calculate the
percentages in the table. Thus, we can also calculate the percentage of catch that was
food by subtracting the predator percentages from 100. This leads to
what we see in Table 10.2.
D’Ancona noted the time period coinciding with World War One, when fishing
was drastically cut back due to military actions, had puzzling data. Let’s highlight
this in Table 10.3. D’Ancona expected both food fish and predator fish to increase
when the rate of fishing was cut back. But in these war years, there is a substantial
increase in the percentage of predators caught at the same time the percentage of
food fish went down. Note, we are looking at percentages here. Of course, the raw

Table 10.1 The percent of the total fish catch in the Mediterranean Sea which was considered not food fish

Year   Percent not food fish
1914   11.9
1915   21.4
1916   22.1
1917   21.2
1918   36.4
1919   27.3
1920   16.0
1921   15.9
1922   14.8
1923   10.7

Table 10.2 The percent of the total fish catch in the Mediterranean Sea considered predator and considered food

Year   Percent food   Percent predator
1914   88.1           11.9
1915   78.6           21.4
1916   77.9           22.1
1917   78.8           21.2
1918   63.6           36.4
1919   72.7           27.3
1920   84.0           16.0
1921   84.1           15.9
1922   85.2           14.8
1923   89.3           10.7

Table 10.3 During World War One, fishing is drastically curtailed, yet the predator percentage went up while the food percentage went down

Year   Percent food   Percent predator
1915   78.6           21.4
1916   77.9           22.1
1917   78.8           21.2
1918   63.6           36.4
1919   72.7           27.3

tonnage of fish caught went down during the war years, but the expectation was
that since there is reduced fishing, there should be a higher percentage of food fish
because they have not been harvested. D’Ancona could not understand this, so he
asked the mathematician Vito Volterra for help.

10.1 Theory

Volterra approached the modeling this way. He let the variable x(t) denote the pop-
ulation of food fish and y(t), the population of predator fish at time t. He was
constructing what you might call a coarse model. The food fish are not divided into categories like halibut and mackerel with a separate variable for each, and the predators are also not divided into different classes like sharks, squids and so forth. Hence,
instead of dozens of variables for both the food and predator population, everything
was lumped together. Following Volterra, we make the following assumptions:
1. The food population grows exponentially. Letting xg denote the growth rate of
the food fish, we must have

xg = a x

for some positive constant a.



2. The number of contacts per unit time between predators and prey is proportional
to the product of their populations. We assume the food fish are eaten by the
predators at a rate proportional to this contact rate. Letting the decay rate of the
food be denoted by xd , we see

xd = − b x y

for some positive constant b.


Thus, the net rate of change of food is x′ = xg + xd giving

x′ = a x − b x y.

for some positive constants a and b. He made assumptions about the predators as
well.

1. Predators naturally die following an exponential decay; letting this decay rate be
given by yd , we have

yd = −c y

for some positive constant c.


2. We assume the predator fish can grow proportional to how much they eat. In turn,
how much they eat is assumed to be proportional to the rate of contact between
food and predator fish. We model the contact rate just like before and let yg be
the growth rate of the predators. We find

yg = d x y

for some positive constant d.

Thus, the net rate of change of predators is y′ = yg + yd giving

y′ = −c y + d x y.

for some positive constants c and d. The full Volterra model is thus

x′ = a x − b x y   (10.1)
y′ = −c y + d x y   (10.2)
x(0) = x0   (10.3)
y(0) = y0   (10.4)

Equations 10.1 and 10.2 give the dynamics of this system. Note these are nonlinear
dynamics, the first we have seen since the logistics model. Equations 10.3 and 10.4

are the initial conditions for the system. Together, these four equations are called a
Predator–Prey system. Since Volterra’s work, this model has been applied in many
other places. A famous example is the wolf–moose predator–prey system which has
been extensively modeled for Island Royale in Lake Superior. We are now going to
analyze this model. We have been inspired by the analysis given in Braun (1978), but Braun uses a bit more mathematics in his explanations and we will try to use only
calculus ideas.

10.2 The Nullcline Analysis

Once we obtain a solution (x, y) to the Predator–Prey problem, we have two nice
curves x(t) and y(t) defined for all non negative time t. As we did in Chap. 8, if we
graph in the x–y plane the ordered pairs (x(t), y(t)), we will draw a curve C where
any point on C corresponds to an ordered pair (x(t), y(t)) for some time t. At t = 0,
we are at the point (x0 , y0 ) on C . Hence, the initial conditions for the Predator–Prey
problem determine the starting point on C . As time increases, the pairs (x(t), y(t))
move in the direction of the tangent line to C . If we knew the algebraic sign of the
derivatives x  and y  at any point on C , we could decide the direction in which we
are moving along the curve C . So we begin our analysis by looking at the curves in
the x–y plane where x  and y  become 0. From these curves, we will be able to find
out the different regions in the plane where each is positive or negative. From that,
we will be able to decide in which direction a point moves along the curve.

10.2.1 The x′ = 0 Analysis

Looking at the predator–prey equations, we see that if t∗ is a time point when x′(t∗) is zero, the food dynamics of the predator–prey system reduce to

0 = a x(t ∗ ) − b x(t ∗ ) y(t ∗ )

or
 
0 = x(t∗) ( a − b y(t∗) )

Thus, the (x, y) pairs in the x–y plane where


 
0 = x ( a − b y )

are the ones where the rate of change of the food fish will be zero. Now these pairs
can correspond to many different time values t ∗ so what we really need to do is to
find all the (x, y) pairs where this happens. Since this is a product, there are two
possibilities: x = 0, the y axis, and y = a/b, a horizontal line.

10.2.2 The y′ = 0 Analysis

In a similar way, the pairs (x, y) where y′ becomes zero satisfy the equation

0 = y ( −c + d x ).

Again, there are two possibilities: y = 0, the x axis, and x = c/d, a vertical line.

10.2.2.1 An Example

Here’s an example: consider the predator–prey model

x′(t) = 2 x(t) − 5 x(t) y(t)
y′(t) = −6 y(t) + 3 x(t) y(t)

The x  nullclines satisfy 2x − 5x y = 0 or x(2 − 5y) = 0. Hence x = 0 or y = 2/5.


Draw the lines x = 0, the y axis, and the horizontal line y = 2/5 in the x y plane.
Then note the factor 2 − 5y is positive when 2 − 5y > 0 or when y < 2/5. Hence,
the factor is negative when y > 2/5. The factor x is positive when x > 0 and negative
when x < 0. So the combination x(2 − 5y) has a sign that can be determined easily
as shown in Fig. 10.1.
Next, for our predator–prey model

x′(t) = 2 x(t) − 5 x(t) y(t)
y′(t) = −6 y(t) + 3 x(t) y(t)

the y  nullclines satisfy −6y + 3x y = 0 or y(−6 + 3x) = 0. Hence y = 0 or x = 2.


Draw the lines y = 0, the x axis, and the vertical line x = 2 in the x y plane. The
factor −6 + 3x is positive when −6 + 3x > 0 or when x > 2. Hence, the factor is
negative when x < 2. The factor y is positive when y > 0 and negative when y < 0.
So the combination y(−6 + 3x) has a sign that can be determined easily as shown
in Fig. 10.2.
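One quick way to check a sign region picture like this by eye is to plot the vector field itself. The short MatLab sketch below does this for the example; the grid ranges are arbitrary choices and the sketch is optional.

% sketch: vector field and nullclines for x' = 2x - 5xy, y' = -6y + 3xy
[x,y] = meshgrid(linspace(0,5,20),linspace(0,2,20));
xp = 2*x - 5*x.*y;              % x' on the grid
yp = -6*y + 3*x.*y;             % y' on the grid
quiver(x,y,xp,yp);
hold on;
plot([0 5],[2/5 2/5],'r');      % x' = 0 nullcline y = 2/5
plot([2 2],[0 2],'g');          % y' = 0 nullcline x = 2
hold off;
xlabel('x'); ylabel('y');

The arrows flip their horizontal direction across the line y = 2/5 and their vertical direction across the line x = 2, which is exactly the sign information in Figs. 10.1 and 10.2.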

Fig. 10.1 The x′ = 0 nullcline for x′ = 2x − 5xy

Fig. 10.2 The y′ = 0 nullcline for y′ = −6y + 3xy



10.2.3 The Nullcline Plane

Just like we did in Chap. 8, we find the parts of the x–y plane where the algebraic signs of x′ and y′ are (+, +), (+, −), (−, +) and (−, −). As usual, the set of (x, y) pairs where x′ = 0 is called the nullcline for x; similarly, the points where y′ = 0 is the nullcline for y. The x′ = 0 equation gives us the y axis and the horizontal line y = a/b while the y′ = 0 gives the x axis and the vertical line x = c/d. The x′ dynamics thus divide the plane into three pieces: the part where x′ > 0; the part where x′ = 0; and, the part where x′ < 0.

10.2.3.1 Back to Our Example

We go back to the model

x′(t) = 2 x(t) − 5 x(t) y(t)
y′(t) = −6 y(t) + 3 x(t) y(t)

We have already determined the nullclines for this model as shown in Figs. 10.1 and
10.2. We combine the x  and y  nullcline information to create a map of how x  and
y  change sign in the x y plane.
• Regions I, II, III and IV divide Quadrant 1.
• We will show there are trajectories moving down the positive y axis and out along
the positive x axis.
• Thus, a trajectory that starts in Quadrant 1 with positive initial conditions can’t
cross the trajectory on the positive x axis or the trajectory on the positive y axis.
• Thus, the Predator–Prey trajectories that start in Quadrant 1 with positive initial
conditions will stay in Quadrant 1.
This is shown in Fig. 10.3.

10.2.4 The General Results

For the general predator–prey model

x′(t) = a x(t) − b x(t) y(t)
y′(t) = −c y(t) + d x(t) y(t)

The x  nullclines satisfy ax − bx y = 0 or x(a − by) = 0. Hence x = 0 or y = a/b.


Draw the lines x = 0 (the y axis) and the horizontal line y = a/b in the xy plane.
• The factor a − by is positive when a − by > 0 or when y < a/b. Hence, the factor
is negative when y > a/b.

Fig. 10.3 The combined x′ = 0 nullcline information for x′ = 2x − 5xy and y′ = 0 nullcline information for y′ = −6y + 3xy

• The factor x is positive when x > 0 and negative when x < 0.


• So the combination x(a − by) has a sign that can be determined easily.
In Fig. 10.4, we show the part of the x–y plane where x  > 0 with one shading and the
part where it is negative with another. Next, the y  nullclines satisfy −cy + d x y = 0
or y(−c + d x) = 0. Hence y = 0 or x = c/d. Draw the lines y = 0 (the x axis) and the vertical line x = c/d in the xy plane.

Fig. 10.4 Finding where x′ < 0 and x′ > 0 for the predator–prey model

Fig. 10.5 Finding where y′ < 0 and y′ > 0 for the predator–prey model

• The factor −c + d x is positive when −c + d x > 0 or when x > c/d. Hence, the
factor is negative when x < c/d.
• The factor y is positive when y > 0 and negative when y < 0.
• So the combination y(−c + d x) has a sign that can also be determined easily.

In Fig. 10.5, we show how the y  nullcline divides the x–y plane into three pieces as
well.
The shaded areas shown in Figs. 10.4 and 10.5 can be combined into Fig. 10.6. In
each region, x′ and y′ are either positive or negative. Hence, each region can be marked with an ordered pair, (x′ ±, y′ ±).

Fig. 10.6 Combining the x′ and y′ algebraic sign regions



10.2.5 Drawing Trajectories

We drew trajectories for the linear system models already without a lot of background
discussion. Now we’ll go over it again in more detail. We use the algebraic signs
of x  and y  to determine this. For example, if we are in Region I, the sign of x  is
negative and the sign of y  is positive. Thus, the variable x decreases and the variable
y increases in this region. So if we graphed the ordered pairs (x(t), y(t)) in the
x–y plane for all t > 0, we would plot a y versus x curve. That is, we would have
y = f (x) for some function of x. Note that, by the chain rule

dy/dt = f′(x) dx/dt.

Hence, as long as x  is not zero (and this is true in Region I!), we have at each time
t, that the slope of the curve y = f (x) is given by

(df/dx)(t) = y′(t)/x′(t).

Since our pair (x, y) is the solution to a differential equation, we expect that x and y
both are continuously differentiable with respect to t. So if we draw the curve for y
versus x in the x–y plane, we do not expect to see a corner in it (as a corner means
the derivative fails to exist). So we can see three possibilities:

• a straight line, if the ratio y′(t)/x′(t) is the same at each t, meaning the slope is always the same,

Fig. 10.7 Trajectories in region I



• a curve that is concave up or


• a curve that is concave down.
We illustrate these three possibilities in Fig. 10.7.
When we combine trajectories from one region with another, we must attach them
so that we do not get corners in the curves. This is how we can determine whether
or not we should use concave up or down or straight in a given region. We can do
this for all the different regions shown in Fig. 10.7.

10.3 Only Quadrant One Is Biologically Relevant

To analyze this nonlinear model, we need a fact from more advanced courses. For
these kinds of nonlinear models, trajectories that start at different initial conditions
can not cross.

Assumption 10.3.1 (Trajectories do not cross in the Predator–Prey model)


We can show, in a more advanced course, that two distinct trajectories to the Predator–
Prey model

x′ = a x − b x y   (10.5)
y′ = −c y + d x y   (10.6)
x(0) = x0   (10.7)
y(0) = y0   (10.8)

can not cross.

10.3.1 Trajectories on the y+ Axis

Let’s begin by looking at a trajectory that starts on the positive y axis. We therefore
need to solve the system

x′ = a x − b x y   (10.9)
y′ = −c y + d x y   (10.10)
x(0) = 0   (10.11)
y(0) = y0 > 0   (10.12)

It is easy to guess the solution is the pair (x(t), y(t)) with x(t) = 0 always and y(t) satisfying y′ = −c y(t). Hence,

y(t) = y0 e^(−ct)

and y decays nicely down to 0 as t increases.

10.3.2 Trajectories on the x + Axis

If we start on the positive x axis, we want to solve

x′ = a x − b x y   (10.13)
y′ = −c y + d x y   (10.14)
x(0) = x0 > 0   (10.15)
y(0) = 0   (10.16)

Again, it is easy to guess the solution is the pair (x(t), y(t)) with y(t) = 0 always and x(t) satisfying x′ = a x(t). Hence,

x(t) = x0 e^(at)

and the trajectory moves along the positive x axis always increasing as t increases.
Since trajectories can’t cross other trajectories, this tells us a trajectory that begins in
Quadrant 1 with a positive (x0, y0) can't hit the x axis or the y axis in a finite amount of time, because if it did, we would have two trajectories crossing.

10.3.3 What Does This Mean Biologically?

This is good news for our biological model. Since we are trying to model food
and predator interactions in a real biological system, we always start with initial
conditions (x0 , y0 ) that are in Quadrant One. It is very comforting to know that these
solutions will always remain positive and, therefore, biologically realistic. In fact, it
doesn’t seem biologically possible for the food or predators to become negative, so
if our model permitted that, it would tell us our model is seriously flawed! Hence, for
our modeling purposes, we need not consider initial conditions that start in Regions
V–IX. Indeed, if you look at Fig. 10.7, you can see that a solution trajectory could
only hit the y axis from Region II. But that can’t happen as if it did, two trajectories
would cross! Also, a trajectory could only hit the x axis from a start in Region III.
Again, since trajectories can’t cross, this is not possible either. So, a trajectory that
starts in Quadrant 1, stays in Quadrant 1—kind of has a Las Vegas feel, doesn't it?

10.3.4 Homework

For the following problems, find the x′ and y′ nullclines and sketch, using multiple colors, the algebraic sign pairs (x′, y′) the nullclines determine in the x–y plane.

Exercise 10.3.1

x′(t) = 100 x(t) − 25 x(t) y(t)
y′(t) = −200 y(t) + 40 x(t) y(t)

Exercise 10.3.2

x′(t) = 1000 x(t) − 250 x(t) y(t)
y′(t) = −2000 y(t) + 40 x(t) y(t)

Exercise 10.3.3

x′(t) = 900 x(t) − 45 x(t) y(t)
y′(t) = −100 y(t) + 50 x(t) y(t)

Exercise 10.3.4

x′(t) = 10 x(t) − 25 x(t) y(t)
y′(t) = −20 y(t) + 40 x(t) y(t)

Exercise 10.3.5

x′(t) = 90 x(t) − 2.5 x(t) y(t)
y′(t) = −200 y(t) + 4.5 x(t) y(t)

10.4 The Nonlinear Conservation Law

So we can assume that for a start in Quadrant 1, the solution pair is always positive.
Let’s see how far we can get with a preliminary mathematical analysis. We can
analyze these trajectories like this. For convenience, assume we start in Region II
and the resulting trajectory hits the y = ab line at some time t ∗ . At that time, we will
have x  (t ∗ ) = 0 and y(t ∗ ) < 0. We show this situation in Fig. 10.8.
Look at the Predator–Prey model dynamics for 0 ≤ t < t ∗ . Since all variables are
positive and their derivatives are not zero for these times, we can look at the fraction
y  (t)/x  (t).
302 10 Predator–Prey Models

Fig. 10.8 Trajectories in region II

y′(t)/x′(t) = [ y(t) (−c + d x(t)) ] / [ x(t) (a − b y(t)) ].

10.4.1 Example

To make this easier to understand, let’s do a specific example.

x′(t) = 2 x(t) − 5 x(t) y(t)
y′(t) = −6 y(t) + 3 x(t) y(t)

• Rewrite as y′/x′:

y′(t)/x′(t) = [ −6y(t) + 3x(t)y(t) ] / [ 2x(t) − 5x(t)y(t) ]
            = [ y(t)(−6 + 3x(t)) ] / [ x(t)(2 − 5y(t)) ].

• Put all the y stuff on the left and all the x stuff on the right:

[ (2 − 5y(t)) / y(t) ] y′(t) = [ (−6 + 3x(t)) / x(t) ] x′(t).

• Rewrite as separate pieces:

2 y′(t)/y(t) − 5 y′(t) = −6 x′(t)/x(t) + 3 x′(t).

• Integrate both sides from 0 to t:


   
2 ∫_0^t y′(s)/y(s) ds − 5 ∫_0^t y′(s) ds = −6 ∫_0^t x′(s)/x(s) ds + 3 ∫_0^t x′(s) ds.

• Do the integrations: everything is positive so we don't need absolute values in the ln's:

2 ln(y(s)) |_0^t − 5 y(s) |_0^t = −6 ln(x(s)) |_0^t + 3 x(s) |_0^t.

• Evaluate:
   
2 ln( y(t)/y0 ) − 5 (y(t) − y0) = −6 ln( x(t)/x0 ) + 3 (x(t) − x0).

• Put ln’s on the left and other terms on the right:


   
6 ln( x(t)/x0 ) + 2 ln( y(t)/y0 ) = 3 (x(t) − x0) + 5 (y(t) − y0).

• Combine ln terms:
ln( (x(t)/x0)^6 ) + ln( (y(t)/y0)^2 ) = 3 (x(t) − x0) + 5 (y(t) − y0).

• Combine ln terms again:


ln( (x(t)/x0)^6 (y(t)/y0)^2 ) = 3 (x(t) − x0) + 5 (y(t) − y0).

• Exponentiate both sides:


( x(t)/x0 )^6 ( y(t)/y0 )^2 = e^( 3(x(t)−x0) + 5(y(t)−y0) ).

• Simplify the exponential term:


( x(t)/x0 )^6 ( y(t)/y0 )^2 = [ e^(3x(t)) e^(5y(t)) ] / [ e^(3x0) e^(5y0) ]

• Put all function terms on the left and all constant terms on the right:

(x(t))^6 (y(t))^2 / [ e^(3x(t)) e^(5y(t)) ] = (x0)^6 (y0)^2 / [ e^(3x0) e^(5y0) ]
• Define the functions f and g by
  – f(x) = x^6/e^(3x).
  – g(y) = y^2/e^(5y).
• Then we can rewrite our result as

f (x(t)) g(y(t)) = f (x0 ) g(y0 ).

• We did this analysis for Region II, but it works in all the regions. So for the entire
trajectory, we know

f (x(t)) g(y(t)) = f (x0 ) g(y0 ).

• The equation

f (x(t)) g(y(t)) = f (x0 ) g(y0 ).

for f(x) = x^6/e^(3x) and g(y) = y^2/e^(5y), is called the Nonlinear Conservation Law or NLCL for the Predator–Prey model

x′(t) = 2 x(t) − 5 x(t) y(t)
y′(t) = −6 y(t) + 3 x(t) y(t)
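We can also watch this conservation law hold numerically. The sketch below integrates the example system with a simple fixed step Runge–Kutta loop and tracks f(x(t)) g(y(t)) along the way; the start point (1, 1), the step size and the number of steps are arbitrary choices made just for illustration.

% sketch: check the NLCL numerically for x' = 2x - 5xy, y' = -6y + 3xy
f = @(x) x.^6./exp(3*x);
g = @(y) y.^2./exp(5*y);
rhs = @(z) [2*z(1) - 5*z(1)*z(2); -6*z(2) + 3*z(1)*z(2)];
z = [1; 1];                       % an arbitrary start in Quadrant 1
h = 0.001; N = 5000;
E = zeros(N,1);
for k = 1:N                       % classic fixed step RK4
  k1 = rhs(z); k2 = rhs(z + 0.5*h*k1);
  k3 = rhs(z + 0.5*h*k2); k4 = rhs(z + h*k3);
  z = z + (h/6)*(k1 + 2*k2 + 2*k3 + k4);
  E(k) = f(z(1))*g(z(2));         % conserved quantity along the path
end
max(abs(E - f(1)*g(1)))           % should stay very close to zero

The reported drift should be negligible compared to f(1) g(1), which is the numerical signature of the NLCL.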

10.4.2 The General Derivation

Recall, we were looking at the Predator–Prey model dynamics for 0 ≤ t < t ∗ . Since
all variables are positive and their derivatives are not zero for these times, we can
look at the fraction y′(t)/x′(t).

y′(t)/x′(t) = [ y(t) (−c + d x(t)) ] / [ x(t) (a − b y(t)) ].

The equation above will not hold at t ∗ , however, because at that point x  (t ∗ ) = 0.
But for t below that critical value, it is ok to look at this fraction.

• Rearranging a bit, we find

[ (a − b y(t)) / y(t) ] y′(t) = [ (−c + d x(t)) / x(t) ] x′(t).

• Switching to the variable s for 0 ≤ s < t, for any value t strictly less than our special value t∗, we have

a y′(s)/y(s) − b y′(s) = −c x′(s)/x(s) + d x′(s).

• Now integrate from s = 0 to s = t to obtain


 t   t 
y  (s)  x  (s) 
a − b y (s) ds = −c + d x (s) ds.
0 y(s) 0 x(s)

• These integrals can be split into separate pieces giving


   
a ∫_0^t y′(s)/y(s) ds − b ∫_0^t y′(s) ds = −c ∫_0^t x′(s)/x(s) ds + d ∫_0^t x′(s) ds.

• These can be integrated easily (yes, it’s true!) and we get


a ln(y(s)) |_0^t − b y(s) |_0^t = −c ln(x(s)) |_0^t + d x(s) |_0^t.

• Evaluating these expressions at s = t and s = 0, using the initial conditions x(0) = x0 and y(0) = y0, we find

a ( ln(y(t)) − ln(y0) ) − b ( y(t) − y0 ) = −c ( ln(x(t)) − ln(x0) ) + d ( x(t) − x0 ).

• Now we simplify a lot (remember x0 and y0 are positive so absolute values are not
needed around them). First, we use a standard logarithm property:
       
a ln( y(t)/y0 ) − b ( y(t) − y0 ) = −c ln( x(t)/x0 ) + d ( x(t) − x0 ).

Then, put all the logarithm terms on the left side and pull the powers a and c inside
the logarithms:
ln( (y(t)/y0)^a ) + ln( (x(t)/x0)^c ) = b ( y(t) − y0 ) + d ( x(t) − x0 ).

Then using properties of the logarithm again,


ln( (y(t)/y0)^a (x(t)/x0)^c ) = b ( y(t) − y0 ) + d ( x(t) − x0 ).

• Now exponentiate both sides and use the properties of the exponential function to
simplify. We find
( y(t)/y0 )^a ( x(t)/x0 )^c = [ e^(b y(t)) e^(d x(t)) ] / [ e^(b y0) e^(d x0) ].

• We can rearrange this as follows:


  
(x(t))^c (y(t))^a / [ e^(d x(t)) e^(b y(t)) ] = x0^c y0^a / [ e^(d x0) e^(b y0) ].   (10.17)

The equation

f (x(t)) g(y(t)) = f (x0 ) g(y0 ).

for f(x) = x^c/e^(d x) and g(y) = y^a/e^(b y), is called the Nonlinear Conservation Law or NLCL for the general Predator–Prey model

x′(t) = a x(t) − b x(t) y(t)
y′(t) = −c y(t) + d x(t) y(t)

10.4.2.1 Approaching t ∗

Now the right hand side is a positive number which for convenience we will call α.
Hence, we have the equation
  
[ (y(t))^a / e^(b y(t)) ] [ (x(t))^c / e^(d x(t)) ] = α

holds for all time t strictly less than t ∗ . Thus, as we allow t to approach t ∗ from
below, the continuity of our solutions x(t) and y(t) allows us to say
       
lim_{t→t∗} [ (y(t))^a / e^(b y(t)) ] [ (x(t))^c / e^(d x(t)) ] = [ (y(t∗))^a / e^(b y(t∗)) ] [ (x(t∗))^c / e^(d x(t∗)) ] = α.

Thus, Eq. 10.17 holds at t ∗ also.



10.4.2.2 The Other Regions

We can do a similar analysis for a trajectory that starts in Region IV and moves up
until it hits the y = a/b line where x′ = 0. This one will start at an initial point (x0, y0) in Region IV and terminate on the y = a/b line at the point (x(t∗), a/b) for some time
t ∗ . In this case, we continue the analysis as before. For any time t < t ∗ , the variables
x(t) and y(t) are positive and their derivatives non zero. Hence, we can manipulate
the Predator–Prey Equations just like before to end up with
   
a ∫_0^t y′(s)/y(s) ds − b ∫_0^t y′(s) ds = −c ∫_0^t x′(s)/x(s) ds + d ∫_0^t x′(s) ds.

We integrate in the same way and apply the initial conditions to obtain Eq. 10.17
again.

  
[ (y(t))^a / e^(b y(t)) ] [ (x(t))^c / e^(d x(t)) ] = [ y0^a / e^(b y0) ] [ x0^c / e^(d x0) ].

Then, taking the limit as t goes to t∗, we see this equation holds at t∗ also. Again, label the right hand side as the positive constant α. We then have

[ (y(t))^a / e^(b y(t)) ] [ (x(t))^c / e^(d x(t)) ] = α.

Letting t approach t ∗ , as we did earlier, we find


  
[ (y(t∗))^a / e^(b y(t∗)) ] [ (x(t∗))^c / e^(d x(t∗)) ] = α.

We conclude Eq. 10.17 holds for trajectories that start in regions that terminate on the x′ = 0 line y = a/b. Since trajectories that start in regions I and III never have x′
become 0, all of the analysis we did above works perfectly. Hence, we can conclude
that Eq. 10.17 holds for all trajectories starting at a positive initial point (x0 , y0 ) in
Quadrant 1.
We know the pairs (x(t), y(t)) are on the trajectory that corresponds to the ini-
tial start of (x0 , y0 ). Hence, we can drop the time dependence (t) above and write
Eq. 10.18 which holds for any (x, y) pair that is on the trajectory.
  
( y^a / e^(b y) ) ( x^c / e^(d x) ) = ( y0^a / e^(b y0) ) ( x0^c / e^(d x0) ).   (10.18)

Equation 10.18 is called the Nonlinear Conservation Law associated with the
Predator–Prey model.

10.4.3 Can a Trajectory Hit the y Axis Redux?

Although we have assumed trajectories can’t cross and therefore a trajectory starting
in Region II can’t hit the y axis for that reason, we can also see this using the nonlinear
conservation law. We can do the same derivation for a trajectory starting in Region
II with a positive x0 and y0 and this time assume the trajectory hits the y axis at a
time t ∗ at the point (0, y1 ) with y1 > 0. We can repeat all of the integration steps to
obtain
  
[ (y(t))^a / e^(b y(t)) ] [ (x(t))^c / e^(d x(t)) ] = [ y0^a / e^(b y0) ] [ x0^c / e^(d x0) ].

This equation holds for all t before t ∗ . Taking the limit as t goes to t ∗ , we obtain
     
[ (y(t∗))^a / e^(b y(t∗)) ] [ (x(t∗))^c / e^(d x(t∗)) ] = [ (y1)^a / e^(b y1) ] [ 0^c / e^0 ] = 0 = [ y0^a / e^(b y0) ] [ x0^c / e^(d x0) ].

This is not possible, so we have another way of seeing that a trajectory can't hit the
y axis. A similar argument shows a trajectory in Region III can’t hit the x axis. We
will leave the details of that argument to you!

10.4.4 Homework

For the following Predator–Prey models, derive the nonlinear conservation law. Since
our discussions have shown us the times when x′ = 0 in the fraction y′/x′ do not
give us any trouble, you can derive this law by integrating

y′(t)/x′(t) = [ y(t) (−c + d x(t)) ] / [ x(t) (a − b y(t)) ].

in the way we have described in this section for the particular values of a, b, c and
d in the given model. So you should derive the equation

( y^a / e^(b y) ) ( x^c / e^(d x) ) = ( (y(0))^a / e^(b y(0)) ) ( (x(0))^c / e^(d x(0)) )
Exercise 10.4.1

x′(t) = 100 x(t) − 25 x(t) y(t)
y′(t) = −200 y(t) + 40 x(t) y(t)

Exercise 10.4.2

x′(t) = 1000 x(t) − 250 x(t) y(t)
y′(t) = −2000 y(t) + 40 x(t) y(t)

Exercise 10.4.3

x′(t) = 900 x(t) − 45 x(t) y(t)
y′(t) = −100 y(t) + 50 x(t) y(t)

Exercise 10.4.4

x′(t) = 10 x(t) − 25 x(t) y(t)
y′(t) = −20 y(t) + 40 x(t) y(t)

Exercise 10.4.5

x′(t) = 90 x(t) − 2.5 x(t) y(t)
y′(t) = −200 y(t) + 4.5 x(t) y(t)

10.5 Qualitative Analysis

From the discussions above, we now know that given an initial start (x0 , y0 ) in
Quadrant 1 of the x–y plane, the solution to the Predator–Prey system will not leave
Quadrant 1. If we piece the various trajectories together for Regions I, II, III and IV,
the solution trajectories will either be periodic, spiral in to some center, or spiral out to give unbounded motion. These possibilities are shown in Fig. 10.9 (periodic), Fig. 10.10 (spiraling out) and Fig. 10.11 (spiraling in). We want to find out which of these possibilities actually occurs.

10.5.1 The Predator–Prey Growth Functions

Recall the Predator–Prey nonlinear conservation law is given by


  
( y^a / e^(b y) ) ( x^c / e^(d x) ) = ( y0^a / e^(b y0) ) ( x0^c / e^(d x0) ).

We have already defined the functions f and g for all non negative real numbers by

Fig. 10.9 A periodic trajectory

Fig. 10.10 A spiraling out trajectory

f(x) = x^c / e^(d x)
g(y) = y^a / e^(b y).
These functions have a very specific look. We can figure this out using a bit of
common sense and some first semester calculus.

Fig. 10.11 A spiraling in trajectory

10.5.1.1 An Example

Consider a typical Predator–Prey Model

x′(t) = 8 x(t) − 6 x(t) y(t)
y′(t) = −7 y(t) + 3 x(t) y(t)

We know for any (x0 > 0, y0 > 0), the trajectory (x(t), y(t)) satisfies the NLCL

f (x(t))g(y(t)) = f (x0 )g(y0 )

where

f(x) = x^7 / e^(3x)
g(y) = y^8 / e^(6y)
What do f and g look like? Let’s look at f first. Recall L’Hôpital’s rule.

lim_{x→∞} x^7/e^(3x) = ∞/∞

and so

lim_{x→∞} x^7/e^(3x) = lim_{x→∞} (x^7)′/(e^(3x))′ = lim_{x→∞} 7x^6/(3e^(3x)).

But this limit is also ∞/∞ and so we can apply L’Hôpital’s rule again.

lim_{x→∞} 7x^6/(3e^(3x)) = lim_{x→∞} 42x^5/(9e^(3x))

After we have taken the derivative of x 7 seven times we find

lim_{x→∞} x^7/e^(3x) = lim_{x→∞} 7!/(3^7 e^(3x)) = 0

as 1/e^(3x) goes to 0 as x grows large. So we know as x → ∞, f(x) → 0 from above


as f (x) is always positive. We also know f (0) = 0. These two facts tell us that since
f is continuous and differentiable, f must rise up to a maximum at some positive
point c. We find c by setting f′(x) = 0 and finding the critical point that is positive.
We have

f′(x) = [ (7x^6) e^(3x) − x^7 (3e^(3x)) ] / (e^(3x))^2 = (7x^6 − 3x^7)/e^(3x) = (x^6/e^(3x)) (7 − 3x).

Since e^(3x) is never zero, f′(x) = 0 when x = 0 or when x = 7/3. Note this is c/d for our Predator–Prey model. A similar analysis holds for g(y) = y^8/e^(6y). We find as y → ∞, g(y) → 0 from above as g(y) is always positive and since g(0) = 0, we know g has a maximum. We use calculus to show the maximum occurs at y = 8/6 which is a/b for our Predator–Prey model.
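A short MatLab sketch makes these shapes easy to see for this example; the plotting ranges are arbitrary and the circles simply mark the maxima we just computed.

% sketch: the growth functions for x' = 8x - 6xy, y' = -7y + 3xy
f = @(x) x.^7./exp(3*x);
g = @(y) y.^8./exp(6*y);
x = linspace(0,6,500);
y = linspace(0,4,500);
figure;
plot(x,f(x),7/3,f(7/3),'o');      % f has its maximum at x = c/d = 7/3
xlabel('x'); ylabel('f(x)');
figure;
plot(y,g(y),8/6,g(8/6),'o');      % g has its maximum at y = a/b = 8/6
xlabel('y'); ylabel('g(y)');

Both curves start at zero, rise to a single hump and then decay back toward zero, which is the generic shape shown in Figs. 10.12 and 10.13.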

10.5.2 General Results

We can do the same sort of analysis for a general Predator–Prey model. From our
specific example, it is easy to infer that f and g have the same generic form which
are shown in Figs. 10.12 and 10.13.

Fig. 10.12 The predator–prey f growth graph



Fig. 10.13 The predator–prey g growth graph

10.5.2.1 Homework

For the following Predator–Prey models, state what the f and g growth functions
are, use calculus to derive where their maximum occurs (you can do either f or g
as the derivation is the same for both) and sketch their graphs nicely.
Exercise 10.5.1

x′(t) = 10 x(t) − 25 x(t) y(t)
y′(t) = −20 y(t) + 40 x(t) y(t)

Exercise 10.5.2

x′(t) = 100 x(t) − 25 x(t) y(t)
y′(t) = −200 y(t) + 40 x(t) y(t)

Exercise 10.5.3

x′(t) = 90 x(t) − 45 x(t) y(t)
y′(t) = −10 y(t) + 5 x(t) y(t)

Exercise 10.5.4

x′(t) = 10 x(t) − 2.5 x(t) y(t)
y′(t) = −20 y(t) + 4 x(t) y(t)

Exercise 10.5.5

x′(t) = 9 x(t) − 3 x(t) y(t)
y′(t) = −300 y(t) + 50 x(t) y(t)

10.5.3 The Nonlinear Conservation Law Using f and g

We can write the nonlinear conservation law using the growth functions f and g in
the form of Eq. 10.19:

f (x) g(y) = f (x0 ) g(y0 ). (10.19)

The trajectories formed by the solutions of the Predator–Prey model that start in
Quadrant 1 are powerfully shaped by these growth functions. It is easy to see that
if we choose (x0 = c/d, y0 = a/b), i.e. we start at the places where f and g have their
maximums, the resulting trajectory is very simple. It is the single point (x(t) = c/d,
y(t) = a/b) for all time t. The solution to this Predator–Prey model with this initial
condition is thus to simply stay at the point where we start. If x0 = c/d and y0 = a/b,
then the NLCL says

f (x(t)) g(y(t)) = f (x0 ) g(y0 ) = f (c/d) g(a/b) = f max gmax .

But f max gmax is just a number so this trajectory is the constant x(t) = c/d and
y(t) = a/b for all time. Otherwise, there are two cases: if x0 ≠ c/d, then f(x0) <
f max . So we can write f (x0 ) = r1 f max where r1 < 1. We don’t know where y0 is,
but we do know g(y0 ) ≤ gmax . So we can write g(y0 ) = r2 gmax with r2 ≤ 1. So in
this case, the NLCL gives

f (x(t)) g(y(t)) = f (x0 ) g(y0 ) = r1 r2 f max gmax .

Let μ = r1 r2 . Then we can say

f (x(t)) g(y(t)) = f (x0 ) g(y0 ) = μ f max gmax .

where μ < 1. Finally, if y0 ≠ a/b, then g(y0) < gmax. We can thus write g(y0) =
r2 gmax where r2 < 1. Although we don't know where x0 is, we do know f(x0) ≤
f max . So we can write f (x0 ) = r1 f max with r1 ≤ 1. So in this case, the NLCL again
gives

f (x(t)) g(y(t)) = f (x0 ) g(y0 ) = r1 r2 f max gmax .

Letting μ = r1 r2 , we can say

f (x(t)) g(y(t)) = f (x0 ) g(y0 ) = μ f max gmax .


where μ < 1. We conclude all trajectories with x0 > 0 and y0 > 0 have an associated
μ ≤ 1 so that the NLCL can be written

f(x(t)) g(y(t)) = f(x0) g(y0) = μ fmax gmax

and μ < 1 for any trajectory with ICs different from the pair (c/d, a/b).
The next step is to examine what happens if we choose a value of μ < 1.
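To make the bookkeeping concrete, here is a short MATLAB sketch that computes μ for our running example x' = 8x − 6xy, y' = −7y + 3xy; the starting point (1.0, 2.0) is only an illustrative choice and any (x0, y0) in Quadrant 1 works.

% Sketch: compute the mu attached to an initial condition (x0, y0)
% for x' = 8x - 6xy, y' = -7y + 3xy, so a = 8, b = 6, c = 7, d = 3.
a = 8; b = 6; c = 7; d = 3;
f = @(x) x.^c ./ exp(d*x);
g = @(y) y.^a ./ exp(b*y);
fmax = f(c/d); gmax = g(a/b);
x0 = 1.0; y0 = 2.0;                  % illustrative Quadrant 1 start
mu = f(x0)*g(y0)/(fmax*gmax);        % mu = 1 only at (c/d, a/b)
fprintf('mu = %.4f\n', mu);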

10.5.4 Trajectories are Bounded in x Example

Let’s do this for the specific Predator–Prey model

x  (t) = 8 x(t) − 6 x(t) y(t)


y  (t) = −7 y(t) + 3 x(t) y(t)

Let’s assume an IC corresponding to μ = 0.7. The arguments work for any μ but it
is nice to be able to pick a number and work off of it.
• Step 1: draw the f curve and the horizontal line of value 0.7 f max . The horizontal
line will cross the f curve twice giving two corresponding x values. Label these
x1 and x2 as shown. Also label the point c/d = 7/3 and draw the vertical lines
from these x values to the f curve itself.
• Also pick a point x1∗ < x1 and a point x2∗ > x2 and draw them in along with their
vertical lines that go up to the f curve.
• We show all this in Fig. 10.14.
• Step 2: Is it possible for the trajectory to contain the point x1∗ ? If so, there is a
corresponding y value so the NLCL holds: f (x1∗ )g(y) = 0.7 f max gmax . But this
implies that

g(y) = (0.7 fmax / f(x1*)) gmax > gmax

as the bottom of the fraction 0.7 fmax / f(x1*) is smaller than the top, making the
fraction larger than 1. But g(y) can never exceed gmax. Hence, the point
x1* is not on the trajectory.
• Step 3: Is it possible for x1 to be on the trajectory? If so, there is a y value
so the NLCL holds giving f (x1 )g(y) = 0.7 f max gmax . But f (x1 ) = 0.7 f max , so
cancelling, we find g(y) = gmax . Thus y = a/b = 8/6 and (x1 , a/b = 8/6) is on
the trajectory.
• Step 4: Is it possible for the trajectory to contain the point x2∗ ? If so, there is a
corresponding y value so the NLCL holds: f (x2∗ )g(y) = 0.7 f max gmax . But this
implies that

Fig. 10.14 The conservation law f (x) g(y) = 0.7 f max gmax implies there are two critical points
x1 and x2 of interest

g(y) = (0.7 fmax / f(x2*)) gmax > gmax

as the bottom of the fraction 0.7 fmax / f(x2*) is smaller than the top, making the
fraction larger than 1. But g(y) can never exceed gmax. Hence, the point
x2* is not on the trajectory.
• Step 5: Is it possible for x2 to be on the trajectory? If so, there is a y value
so the NLCL holds giving f (x2 )g(y) = 0.7 f max gmax . But f (x2 ) = 0.7 f max , so
cancelling, we find g(y) = gmax . Thus y = a/b = 8/6 and (x2 , a/b = 8/6) is on
the trajectory. We conclude if (x(t), y(t)) is on the trajectory, then x1 ≤ x(t) ≤ x2 .
We show this in Fig. 10.15.

Fig. 10.15 Predator–prey trajectories with initial conditions from Quadrant 1 are bounded in x
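The two critical values x1 and x2 can also be pinned down numerically: they are the solutions of f(x) = 0.7 fmax on either side of the peak at c/d = 7/3. Here is a MATLAB sketch using the built-in root finder fzero; the bracket endpoints 0.01 and 20 are just convenient illustrative choices.

% Sketch: solve f(x) = 0.7*fmax on each side of the peak for
% x' = 8x - 6xy, y' = -7y + 3xy, where f(x) = x^7/e^{3x}.
c = 7; d = 3; mu = 0.7;
f = @(x) x.^c ./ exp(d*x);
fmax = f(c/d);
x1 = fzero(@(x) f(x) - mu*fmax, [0.01, c/d]);   % root left of the peak
x2 = fzero(@(x) f(x) - mu*fmax, [c/d, 20]);     % root right of the peak
fprintf('x1 = %.4f, x2 = %.4f\n', x1, x2);

The same idea applied to g(y) = y^8/e^{6y} produces the y bounds y1 and y2 of the next section.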

10.5.5 Trajectories are Bounded in y Example

Again we start with our specific Predator–Prey model

x  (t) = 8 x(t) − 6 x(t) y(t)


y  (t) = −7 y(t) + 3 x(t) y(t)

Our IC corresponds to μ = 0.7.


• Step 1: draw the g curve and the horizontal line of value 0.7gmax . The horizontal
line will cross the g curve twice giving two corresponding y values. Label these
y1 and y2 as shown. Also label the point a/b = 8/6 and draw the vertical lines
from these y values to the g curve itself.
• Also pick a point y1∗ < y1 and a point y2∗ > y2 and draw them in along with their
vertical lines that go up to the g curve.
• We show all this in Fig. 10.16.
• Step 2: Is it possible for the trajectory to contain the point y1∗ ? If so, there is a
corresponding x value so the NLCL holds: f (x)g(y1∗ ) = 0.7 f max gmax . But this
implies that

f(x) = (0.7 gmax / g(y1*)) fmax > fmax

as the bottom of the fraction 0.7 gmax / g(y1*) is smaller than the top, making the
fraction larger than 1. But f(x) can never exceed fmax. Hence, the point
y1* is not on the trajectory.
• Step 3: Is it possible for y1 to be on the trajectory? If so, there is a x value
so the NLCL holds giving f(x) g(y1) = 0.7 fmax gmax. But g(y1) = 0.7 gmax, so
cancelling, we find f(x) = fmax. Thus x = c/d = 7/3 and (c/d = 7/3, y1) is on
the trajectory.

Fig. 10.16 The conservation law f(x) g(y) = 0.7 fmax gmax implies there are two critical points y1 and y2 of interest

Fig. 10.17 Predator–prey trajectories with initial conditions from Quadrant 1 are bounded in y
• Step 4: Is it possible for the trajectory to contain the point y2∗ ? If so, there is a
corresponding x value so the NLCL holds: f(x) g(y2*) = 0.7 fmax gmax. But this
implies that

f(x) = (0.7 gmax / g(y2*)) fmax > fmax

as the bottom of the fraction 0.7 gmax / g(y2*) is smaller than the top, making the
fraction larger than 1. But f(x) can never exceed fmax. Hence, the point
y2* is not on the trajectory.
• Step 5: Is it possible for y2 to be on the trajectory? If so, there is a x value
so the NLCL holds giving f (x)g(y2 ) = 0.7 f max gmax . But g(y2 ) = 0.7gmax , so
cancelling, we find f (x) = f max . Thus x = c/d = 7/3 and (c/d = 7/3, y2 ) is on
the trajectory. We conclude if (x(t), y(t)) is on the trajectory, then y1 ≤ y(t) ≤ y2 .
We show these bounds in Fig. 10.17.

10.5.6 Trajectories are Bounded Example

Combining, we see trajectories are bounded in Quadrant 1. We show this in Fig. 10.18.

Fig. 10.18 Predator–prey trajectories with initial conditions from Quadrant 1 are bounded in x and y

10.5.7 Trajectories are Bounded in x General Argument

We can do the analysis we did for the specific Predator–Prey model for the general
one. Some people like to see these arguments with the parameters a, b, c and d and

others like to see the argument with numbers. However, learning how to see things
abstractly is a skill that is honed by thinking with general terms and not specific
numbers. So as we redo the arguments from the specific example in this general way,
reflect on how similar they are even though you don't see numbers! We now look
at the general model

x  (t) = a x(t) − b x(t) y(t)


y  (t) = −c y(t) + d x(t) y(t)

Let’s again assume an IC corresponding to μ = 0.7. Again, the arguments work for
any μ but it is nice to be able to pick a number and work off of it. So our argument
is a bit of a hybrid: general parameter values and a specific μ value.
• Step 1: draw the f curve and the horizontal line of value 0.7 f max . The horizontal
line will cross the f curve twice giving two corresponding x values. Label these
x1 and x2 as shown. Also label the point c/d and draw the vertical lines from these
x values to the f curve itself.
• Also pick a point x1∗ < x1 and a point x2∗ > x2 and draw them in along with their
vertical lines that go up to the f curve.
• We show all this in Fig. 10.19.
• Step 2: Is it possible for the trajectory to contain the point x1∗ ? If so, there is a
corresponding y value so the NLCL holds: f (x1∗ )g(y) = 0.7 f max gmax . But this
implies that

g(y) = (0.7 fmax / f(x1*)) gmax > gmax

as the bottom of the fraction 0.7 fmax / f(x1*) is smaller than the top, making the
fraction larger than 1. But g(y) can never exceed gmax. Hence, the point
x1* is not on the trajectory.

Fig. 10.19 The conservation law f (x) g(y) = 0.7 f max gmax implies there are two critical points
x1 and x2 of interest

• Step 3: Is it possible for x1 to be on the trajectory? If so, there is a y value


so the NLCL holds giving f (x1 )g(y) = 0.7 f max gmax . But f (x1 ) = 0.7 f max , so
cancelling, we find g(y) = gmax . Thus y = a/b and (x1 , a/b) is on the trajectory.
• Step 4: Is it possible for the trajectory to contain the point x2∗ ? If so, there is a
corresponding y value so the NLCL holds: f (x2∗ )g(y) = 0.7 f max gmax . But this
implies that

g(y) = (0.7 fmax / f(x2*)) gmax > gmax

as the bottom of the fraction 0.7 fmax / f(x2*) is smaller than the top, making the
fraction larger than 1. But g(y) can never exceed gmax. Hence, the point
x2* is not on the trajectory.
• Step 5: Is it possible for x2 to be on the trajectory? If so, there is a y value
so the NLCL holds giving f (x2 )g(y) = 0.7 f max gmax . But f (x2 ) = 0.7 f max , so
cancelling, we find g(y) = gmax . Thus y = a/b and (x2 , a/b) is on the trajectory.
We conclude if (x(t), y(t)) is on the trajectory, then x1 ≤ x(t) ≤ x2 . We show this
in Fig. 10.20.

10.5.8 Trajectories are Bounded in y General Argument

Again we look at the general Predator–Prey model

x  (t) = a x(t) − b x(t) y(t)


y  (t) = −c y(t) + d x(t) y(t)

with an IC corresponding to μ = 0.7.



Fig. 10.20 Predator–Prey trajectories with initial conditions from Quadrant 1 are bounded in x

• Step 1: draw the g curve and the horizontal line of value 0.7gmax . The horizontal
line will cross the g curve twice giving two corresponding y values. Label these
y1 and y2 as shown. Also label the point a/b and draw the vertical lines from these
y values to the g curve itself.
• Also pick a point y1∗ < y1 and a point y2∗ > y2 and draw them in along with their
vertical lines that go up to the g curve.
• We show all this in Fig. 10.21.
• Step 2: Is it possible for the trajectory to contain the point y1∗ ? If so, there is a
corresponding x value so the NLCL holds: f (x)g(y1∗ ) = 0.7 f max gmax . But this
implies that

f(x) = (0.7 gmax / g(y1*)) fmax > fmax

as the bottom of the fraction 0.7 gmax / g(y1*) is smaller than the top, making the
fraction larger than 1. But f(x) can never exceed fmax. Hence, the point
y1* is not on the trajectory.

Fig. 10.21 The conservation law f(x) g(y) = 0.7 fmax gmax implies there are two critical points y1 and y2 of interest
• Step 3: Is it possible for y1 to be on the trajectory? If so, there is a x value
so the NLCL holds giving f (x)g(y1 ) = 0.7 f max gmax . But g(y1 ) = 0.7gmax , so
cancelling, we find f (x) = f max . Thus x = c/d and (c/d, y1 ) is on the trajectory.
• Step 4: Is it possible for the trajectory to contain the point y2∗ ? If so, there is a
corresponding x value so the NLCL holds: f(x) g(y2*) = 0.7 fmax gmax. But this
implies that

f(x) = (0.7 gmax / g(y2*)) fmax > fmax

as the bottom of the fraction 0.7 gmax / g(y2*) is smaller than the top, making the
fraction larger than 1. But f(x) can never exceed fmax. Hence, the point
y2* is not on the trajectory.
• Step 5: Is it possible for y2 to be on the trajectory? If so, there is a x value
so the NLCL holds giving f (x)g(y2 ) = 0.7 f max gmax . But g(y2 ) = 0.7gmax , so
cancelling, we find f (x) = f max . Thus x = c/d and (c/d, y2 ) is on the trajectory.
We conclude if (x(t), y(t)) is on the trajectory, then y1 ≤ y(t) ≤ y2 . We show
these bounds in Fig. 10.22.

10.5.9 Trajectories are Bounded General Argument

Combining, we see trajectories are bounded in Quadrant 1. We show this in Fig. 10.23.
Now that we have discussed these two cases, note that we could have just done
the x variable case and said that a similar thing happens for the y variable. In many
texts, it is very common to do this. Since you are beginners at this kind of reasoning,

we have presented both cases in detail. But you should start training your mind to
see that presenting one case is actually enough!

Fig. 10.22 Predator–prey trajectories with initial conditions from Quadrant 1 are bounded in y

Fig. 10.23 Predator–prey trajectories with initial conditions from Quadrant 1 are bounded in x and y

10.5.10 Homework

For these Predator–Prey models, follow the analysis of the section above to show
that the trajectories must be bounded.

Exercise 10.5.6

x  (t) = 10 x(t) − 25 x(t) y(t)


y  (t) = −20 y(t) + 40 x(t) y(t)

Exercise 10.5.7

x  (t) = 100 x(t) − 25 x(t) y(t)


y  (t) = −20 y(t) + 4 x(t) y(t)

Exercise 10.5.8

x  (t) = 80 x(t) − 4 x(t) y(t)


y  (t) = −10 y(t) + 5 x(t) y(t)

Exercise 10.5.9

x  (t) = 10 x(t) − 2 x(t) y(t)


y  (t) = −25 y(t) + 10 x(t) y(t)

Exercise 10.5.10

x  (t) = 12 x(t) − 4 x(t) y(t)


y  (t) = −60 y(t) + 15 x(t) y(t)

10.6 The Trajectory Must Be Periodic

If the trajectory was not periodic, then there would be horizontal and vertical lines
that would intersect the trajectory in more than two places. We will show that we can
have at most two intersections which tells us the trajectory must be periodic. We go
back to our specific example

x  (t) = 8 x(t) − 6 x(t) y(t)


y  (t) = −7 y(t) + 3 x(t) y(t)

This time we will show the argument for this specific case only and not do a general
example. But it is easy to infer from this specific example how to handle this argument
for any other Predator–Prey model!
• We draw the same figure as before for f but we don’t need the points x1∗ and x2∗ .
This time we add a point x ∗ between x1 and x2 . We’ll draw it so that it is between
c/d = 7/3 and x2 but just remember it could have been chosen on the other side.
We show this in Fig. 10.24.

Fig. 10.24 The f curve with the point x1 < c/d = 7/3 < x ∗ < x2 added

• At the point x ∗ , the NLCL says the corresponding y values satisfy f (x ∗ )g(y) =
0.7 f max gmax . This tells us

g(y) = (0.7 fmax / f(x*)) gmax

• The biggest the ratio 0.7 f max / f (x ∗ ) can be is when the bottom f (x ∗ ) is the
smallest. This occurs when x ∗ is chosen to be x1 or x2 . Then the ratio is
0.7 f max /(0.7 f max ) = 1.
• The smallest the ratio 0.7 f max / f (x ∗ ) can be is when the bottom f (x ∗ ) is the largest.
This occurs when x* is chosen to be c/d = 7/3. Then the ratio is 0.7 fmax / fmax = 0.7.
• So the ratio 0.7 f max / f (x ∗ ) is between 0.7 and 1.
• Draw the g curve now, adding in a horizontal line for r gmax where r = 0.7 fmax / f(x*).
The lowest this line can be is on the line 0.7gmax and the highest it can be is the
line of value gmax . This is shown in Fig. 10.25.
• The figure above shows that there are at most two intersections with the g curve.
• The case of spiral in or spiral out trajectories implies there are points x ∗ with more
than two corresponding y values. Hence, spiral in and spiral out trajectories are
not possible and the only possibility is that the trajectory is periodic.
• So there is a smallest positive number T called the period of the trajectory which
means

(x(0), y(0)) = (x(T ), y(T )) = (x(2T ), y(2T ))


= (x(3T ), y(3T )) = . . .

Fig. 10.25 The g curve with the r gmax line added showing the y values for the chosen x ∗

10.6.1 Homework

For the following problems, show the details of the periodic nature of the Predator–
Prey trajectories by mimicking the analysis in the section above.

Exercise 10.6.1

x  (t) = 8 x(t) − 25 x(t) y(t)


y  (t) = −10 y(t) + 50 x(t) y(t)

Exercise 10.6.2

x  (t) = 30 x(t) − 3 x(t) y(t)


y  (t) = −45 y(t) + 9 x(t) y(t)

Exercise 10.6.3

x  (t) = 50 x(t) − 12.5 x(t) y(t)


y  (t) = −100 y(t) + 50 x(t) y(t)

Exercise 10.6.4

x  (t) = 10 x(t) − 2.5 x(t) y(t)


y  (t) = −2 y(t) + 1 x(t) y(t)

Exercise 10.6.5

x  (t) = 7 x(t) − 2.5 x(t) y(t)


y  (t) = −13 y(t) + 2 x(t) y(t)

10.7 Plotting Trajectory Points!

Now that we know the trajectory is periodic, let’s look at the plot more carefully.
We know the trajectories must lie within the rectangle [x1 , x2 ] × [y1 , y2 ]. Mathe-
matically, this means there is a smallest positive number T so that x(0) = x(T ) and
y(0) = y(T ). This number T is called the period of the Predator–Prey model. We
can see the periodicity of the trajectory by doing a more careful analysis of the tra-
jectories. We know the trajectory hits the points (x1, a/b), (x2, a/b), (c/d, y1) and (c/d, y2).
What happens when we look at x points u with x1 < u < x2? For convenience, let's
look at the case x1 < u < c/d and the case u = c/d separately.

10.7.1 Case 1: u = c/d

In this case, the nonlinear conservation law gives

f (u) g(v) = μ f max gmax .

However, we also know f(c/d) = fmax and so we must have

f max g(v) = μ f max gmax .

or
g(v) = μ gmax .

Since μ is less than 1, we draw the μ gmax horizontal line on the g graph as usual to
obtain the figure we previously drew as Fig. 10.21. Hence, there are two values of v
that give the value μ gmax ; namely, v = y1 and v = y2 . We conclude there are two
possible points on the trajectory, (c/d, v = y1) and (c/d, v = y2). This gives the usual
points shown as the vertical points in Fig. 10.23.

10.7.2 Case 2: x1 < u < c/d

The analysis is very similar to the one we just did for u = c/d. First, for this choice of
u, we can draw a new graph as shown in Fig. 10.26.
Here, the conservation law gives

f (u) g(v) = μ f max gmax .

Fig. 10.26 The predator–prey f growth graph trajectory analysis for x1 < u < c/d

Fig. 10.27 The predator–prey g growth analysis for one point x1 < u < c/d

Dividing through by f (u), we seek v values satisfying

g(v) = μ (fmax / f(u)) gmax.

Here the ratio fmax / f(u) is larger than 1 (just look at Fig. 10.26 to see this). Call this
ratio r. Hence, μ < μ (fmax / f(u)) = μ r and so μ gmax < μ r gmax. Also from
Fig. 10.26, we see μ fmax < f(u), which tells us μ (fmax / f(u)) gmax <
gmax. Now look at Fig. 10.27. The inequalities above show us we must draw the
values that satisfy

f max
μ gmax < g(v) = μ gmax = μ r gmax < gmax .
f (u)

We already know the values of v that satisfy g(v) = μ gmax which are labeled in
Fig. 10.26 as v = y1 and v = y2 . Since the number μ r is larger than μ, we see from
Fig. 10.27 there are two values of v, v = z 1 and v = z 2 for which g(v) = μ r gmax
and y1 < z1 < a/b < z2 < y2 as shown.
From the above, we see that in the case x1 < u < c/d, there are always 2 and only
2 possible v values on the trajectory. These points are (u, z 1 ) and (u, z 2 ).
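Here is a MATLAB sketch of this computation for our running example x' = 8x − 6xy, y' = −7y + 3xy with μ = 0.7; the choice u = 1.8 is only an illustration of a point strictly between x1 and c/d = 7/3.

% Sketch: for a fixed u with x1 < u < c/d, find the two trajectory heights
% z1 < z2, i.e. the two solutions of g(v) = mu*(fmax/f(u))*gmax.
a = 8; b = 6; c = 7; d = 3; mu = 0.7;
f = @(x) x.^c ./ exp(d*x);
g = @(y) y.^a ./ exp(b*y);
fmax = f(c/d); gmax = g(a/b);
u = 1.8;                                   % illustrative point in (x1, c/d)
level = mu*(fmax/f(u))*gmax;               % the value mu*r*gmax
z1 = fzero(@(v) g(v) - level, [0.01, a/b]);
z2 = fzero(@(v) g(v) - level, [a/b, 20]);
fprintf('u = %.2f gives the trajectory points (u, %.4f) and (u, %.4f)\n', u, z1, z2);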

10.7.3 x1 < u1 < u2 < c/d

What happens if we pick two points, x1 < u1 < u2 < c/d? The f curve analysis is
essentially the same but now there are two vertical lines that we draw as shown in
Fig. 10.28.
Fig. 10.28 The predator–prey f growth graph trajectory analysis for the points x1 < u1 < u2 < c/d

Now, applying the conservation law gives two equations

f (u 1 ) g(v) = μ f max gmax


f (u 2 ) g(v) = μ f max gmax

This implies we are searching for v values in the following two cases:

g(v) = μ (fmax / f(u1)) gmax

and

g(v) = μ (fmax / f(u2)) gmax.

Since f (u 1 ) is smaller than f (u 2 ), we see the ratio f max / f (u 1 ) is larger than


f max / f (u 2 ) and both ratios are larger than 1 (just look at Fig. 10.28 to see this).
Call these ratios r1 (the one for u 1 ) and r2 (for u 2 ). It is easy to see r2 < r1 from the
figure. We also still have (as in our analysis of the case of one point u) that both μ r1
and μ r2 are less than 1. We conclude

μ < μ (fmax / f(u2)) = μ r2 < μ (fmax / f(u1)) = μ r1 < 1.

Fig. 10.29 The spread of the trajectory through fixed lines on the x axis gets smaller as we move
away from the center point c/d

Now look at Fig. 10.29. The inequalities above show us we must draw the horizontal
line μ r1 gmax above the line μ r2 gmax which is above the line μ gmax . We already
know the values of v that satisfy g(v) = μ gmax which are labeled in Fig. 10.27 as
v = y1 and v = y2 . Since the number μ r2 is larger than μ, we see from Fig. 10.29
there are two values of v, v = z 21 and v = z 22 for which g(v) = μ r2 gmax and y1 <
z21 < a/b < z22 < y2 as shown. But we can also do this for the line μ r1 gmax to find
two more points z 11 and z 12 satisfying
y1 < z21 < z11 < a/b < z12 < z22 < y2
as seen in Fig. 10.29 also.
We also see that the largest spread in the y direction is at x = c/d giving the two
points (c/d, y1) and (c/d, y2) which corresponds to the line segment [y1, y2] drawn at
the x = c/d location. If we pick the point x1 < u2 < c/d, the two points on the trajectory
give a line segment [z21, z22] drawn at the x = u2 location. Note this line segment
is smaller and contained in the largest one [y1 , y2 ]. The corresponding line segment
for the point u 1 is [z 11 , z 12 ] which is smaller yet.

10.7.4 Three u Points

If you think about it a bit, if we picked three points as follows, x1 < u1 < u2 < u3 < c/d,
and three more points c/d < u4 < u5 < u6 < x2, we would find line segments as
follows:

Point    Spread
x1       One point (x1, a/b)
u1       [z11, z12]
u2       [z21, z22] contains [z11, z12]
u3       [z31, z32] contains [z21, z22]
c/d      [y1, y2] contains [z21, z22]
u4       [z41, z42] inside [y1, y2]
u5       [z51, z52] inside [z41, z42]
u6       [z61, z62] inside [z51, z52]
x2       One point (x2, a/b) inside [z51, z52]

We draw these line segments in Fig. 10.30. We know the Predator–Prey trajectory
must go through these points. Every time the trajectory hits the x value c/d, the corresponding
y spread is [y1, y2]. If the trajectory were spiraling inwards, then the first
time we hit c/d, the spread would be [y1, y2] and the next time, the spread would have
to be less so that the trajectory moved inwards. This can't happen as the second time
we hit c/d, the spread is exactly the same. The points shown in Fig. 10.30 are always
the same. Again, note that since the trajectory is periodic, there is a smallest positive
number T so that

x(t + T ) = x(t) and y(t + T ) = y(t)

for all values of t. This is the behavior we are seeing in Fig. 10.30. Note the value
of this period is really determined by the initial values (x0 , y0 ) as they determine the
bounding box [x1 , x2 ] × [y1 , y2 ] since the initial condition determines μ.

Fig. 10.30 The trajectory must be periodic



10.8 The Average Value of a Predator–Prey Solution

If we had a positive function h defined on an interval [a, b], we can define the average
value of h over [a, b] by the integral

h̄ = (1/(b − a)) ∫_a^b h(t) dt. (10.20)

To motivate this definition, let’s look at the Riemann sums of some nice function
f on the interval [1, 3]. Take a uniform partition of [1, 3] with 5 points: P4 = {1, 1 +
h, 1 + 2h, 1 + 3h, 3} where h = (3 − 1)/4 = 0.5. The evaluation set is the left hand
endpoints: E4 = {1, 1 + h, 1 + 2h, 1 + 3h}. Note 4h = 3 − 1 = 2 which is the
length of [1, 3]. The Riemann sum is
 
RS = [f(1) + f(1 + h) + f(1 + 2h) + f(1 + 3h)] h
   = ([f(1) + f(1 + h) + f(1 + 2h) + f(1 + 3h)] / 4) (4h)
   = ([f(1) + f(1 + h) + f(1 + 2h) + f(1 + 3h)] / 4) (2)

Note that (∑_{j=0}^{3} f(1 + jh))/4 is an estimate of the average value of f on [1, 3] using
4 values of the function.
Now cut h in half. The new partition is {1, 1 + h/2, 1 + 2(h/2), 1 + 3(h/2), . . . ,
3} which has 8 subintervals with 9 points. The evaluation set is left hand endpoints
again. Note 8(h/2) = 3 − 1 = 2 which is the length of [1, 3]

RS = (f(1) + f(1 + (h/2)) + · · · + f(1 + 7(h/2))) (h/2)
   = ([f(1) + f(1 + (h/2)) + · · · + f(1 + 7(h/2))] / 8) (8(h/2))
   = ([f(1) + f(1 + (h/2)) + · · · + f(1 + 7(h/2))] / 8) (2)

Now (∑_{j=0}^{7} f(1 + j(h/2)))/8 is an estimate of the average value of f on [1, 3]
using 8 values of the function. Do this again and again:

• For h/4, ([f(1) + f(1 + (h/4)) + · · · + f(1 + 15(h/4))] / 16) (2) is an estimate for the average value of f on [1, 3] using 16 values.
• For h/8 = h/2^3, ([f(1) + f(1 + (h/8)) + · · · + f(1 + 31(h/8))] / 32) (2) is an estimate for the average value of f on [1, 3] using 32 values.
• For h/64 = h/2^6, ([f(1) + f(1 + (h/64)) + · · · + f(1 + 255(h/64))] / 256) (2) is an estimate for the average value of f on [1, 3] using 256 values.
• These all have the form: the Riemann sum is the average value of f for the partition
times the length of [1, 3].

Letting the number of points in the partition go to ∞, the Riemann sums converge to ∫_1^3 f(t) dt.
So we have that ∫_1^3 f(t) dt is the average value of f on [1, 3] times the length of [1, 3].
This leads to the definition we used at the start of this section: the average value of
f on [a, b] = (1/(b − a)) ∫_a^b f(t) dt. So for a Predator–Prey model, since there is a
period T for the model, we can define

x̄ = Average Value of x on [0, T] = (1/T) ∫_0^T x(t) dt
ȳ = Average Value of y on [0, T] = (1/T) ∫_0^T y(t) dt
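As a quick sanity check of this definition before we use it, the MATLAB sketch below averages h(t) = t^2 over [1, 3] with the built-in quadrature routine integral; the exact answer is (1/2) ∫_1^3 t^2 dt = 13/3.

% Sketch: the average value definition in action for h(t) = t^2 on [1, 3].
h = @(t) t.^2;
a = 1; b = 3;
hbar = integral(h, a, b)/(b - a);
fprintf('average of t^2 on [1,3] = %.4f (exact 13/3 = %.4f)\n', hbar, 13/3);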

Here are the details. Now, recall the Predator–Prey model is given by
 
x' = x (a − b y)
y' = y (−c + d x)
x(0) = x0
y(0) = y0

Rearrange the x  equation like this:

x'(s)/x(s) = a − b y(s)

for all 0 ≤ s ≤ T where T is the period for this trajectory. Now integrate from s = 0
to s = T to get

∫_0^T x'(s)/x(s) ds = ∫_0^T (a − b y(s)) ds.

Hence, we have

ln x(s) |_0^T = a T − b ∫_0^T y(s) ds.

Simplifying, we find

ln(x(T)/x0) = a T − b ∫_0^T y(s) ds.

However, since T is the period for this trajectory, we know x(T ) must equal x(0).
Hence, ln(x(T)/x0) = ln(1) = 0. Rearranging, we conclude

0 = a T − b ∫_0^T y(s) ds,
b ∫_0^T y(s) ds = a T,
(1/T) ∫_0^T y(s) ds = a/b.

The term on the left hand side is the average value of the solution y over the one
period of time, [0, T ]. Using the usual average notation, we will call this ȳ. Thus,
we have

ȳ = (1/T) ∫_0^T y(s) ds = a/b. (10.21)

We can do a similar analysis for the average value of the x component of the solution.
We find
y'(s)/y(s) = −c + d x(s), 0 ≤ s ≤ T,
∫_0^T y'(s)/y(s) ds = ∫_0^T (−c + d x(s)) ds,
ln y(s) |_0^T = −c T + d ∫_0^T x(s) ds,
ln(y(T)/y0) = −c T + d ∫_0^T x(s) ds.

However, since T is the period for this trajectory, we know y(T ) must equal y(0).
Hence, ln(y(T)/y0) = ln(1) = 0. Rearranging, we conclude

0 = −c T + d ∫_0^T x(s) ds,
d ∫_0^T x(s) ds = c T,
(1/T) ∫_0^T x(s) ds = c/d.

The term on the left hand side is the average value of the solution x over the one
period of time, [0, T ]. Using the usual average notation, we will call this x̄. Thus,
we have

x̄ = (1/T) ∫_0^T x(s) ds = c/d. (10.22)

The point (x̄ = c/d, ȳ = a/b) has an important interpretation. It is the average value of


the solution over the period of the trajectory.
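This prediction is easy to test numerically. The sketch below uses the built-in solver ode45 on our earlier example x' = 8x − 6xy, y' = −7y + 3xy and averages the components with trapz; since the window [0, 50] is not an exact multiple of the period, expect the computed averages to land close to, but not exactly on, (c/d, a/b) = (7/3, 4/3). The initial condition (2, 1) is an illustrative choice.

% Sketch: check xbar = c/d and ybar = a/b for x' = 8x - 6xy, y' = -7y + 3xy.
a = 8; b = 6; c = 7; d = 3;
rhs = @(t, z) [a*z(1) - b*z(1)*z(2); -c*z(2) + d*z(1)*z(2)];
opts = odeset('RelTol', 1e-8, 'AbsTol', 1e-10);
[t, z] = ode45(rhs, [0 50], [2; 1], opts);
xbar = trapz(t, z(:,1))/(t(end) - t(1));
ybar = trapz(t, z(:,2))/(t(end) - t(1));
fprintf('xbar = %.4f (c/d = %.4f), ybar = %.4f (a/b = %.4f)\n', xbar, c/d, ybar, a/b);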

10.8.1 Homework

For the following Predator–Prey models, derive the average x and y equations.

Exercise 10.8.1

x  (t) = 100 x(t) − 25 x(t) y(t)


y  (t) = −200 y(t) + 40 x(t) y(t)

Exercise 10.8.2

x  (t) = 1000 x(t) − 250 x(t) y(t)


y  (t) = −2000 y(t) + 40 x(t) y(t)

Exercise 10.8.3

x  (t) = 900 x(t) − 45 x(t) y(t)


y  (t) = −100 y(t) + 50 x(t) y(t)

Exercise 10.8.4

x  (t) = 10 x(t) − 25 x(t) y(t)


y  (t) = −20 y(t) + 40 x(t) y(t)

Exercise 10.8.5

x  (t) = 90 x(t) − 2.5 x(t) y(t)


y  (t) = −200 y(t) + 4.5 x(t) y(t)

10.9 A Sample Predator–Prey Model

Example 10.9.1 Consider the following Predator–Prey model:

x  (t) = 2 x(t) − 10 x(t) y(t)


y  (t) = −3 y(t) + 18 x(t) y(t)

Solution For any choice of initial conditions (x0 , y0 ), we can solve this as discussed
in the previous sections. We find a = 2, b = 10 so that a/b = 0.2 and c = 3, d = 18
so that c/d = 0.1666…. We know a lot about these solutions now.
1. The solution (x(t), y(t)) has an average x value x̄ = c/d, which is 0.1666…, and
an average y value ȳ = a/b = 0.2.
2. The initial condition (x0 , y0 ) is some point on the curve.
3. For each choice of initial condition (x0 , y0 ), there is a corresponding period T
so that (x(t), y(t)) = (x(t + T ), y(t + T )) for all time t.
4. Looking at Fig. 10.30, we can connect the dots so to speak to generate the trajec-
tory shown in Fig. 10.31.

Now that you know how to analyze the predator–prey models, you can look in
the literature and see how they are used. We will leave it up to you to find the many
references on how this model is used to study the wolf–moose population on Isle
Royale in Lake Superior and instead point you to a different one. In Axelsen et al.
(2001), the predators are Atlantic Puffins and the prey are juvenile herring and the
research is trying to understand the shapes that schools of herring take in the wild
under predation. This study is primarily descriptive with no mathematics at all but
the references point to other papers where simulations are carried out. You have
enough training now to follow this paper trail and see how the simulation papers and
the descriptive papers work hand in hand. But we leave the details and hard work
to do this to you. Happy hunting! If you look at the references in this paper, you'll
note that one of the papers there is by Hamilton—the same biologist whose work on
altruism we studied in Peterson (2015).

10.9.1 Homework

Do these analyses for some specific value of μ; for example, μ = 0.8 or something
similar. Of course, the specific value doesn't matter that much, but it is easier to see
the graphical analysis if μ fmax is not too close to the peak of f. This time also
draw the bounding boxes that we get for different values of μ. You will see that when
μ is close to 1, the bounding box is very small and as μ gets close to 0, the bounding
boxes get very large. Also, you will see they are nested inside each other.

Fig. 10.31 The theoretical trajectory for x  = 2x − 10x y; y  = −3y + 18x y. We do not know the
actual trajectory as we can not solve for x and y explicitly as functions of time. However, our
analysis tells us the trajectory has the qualitative features shown

Exercise 10.9.1 Provide a complete analysis for the Predator–Prey Model

x  (t) = 4 x(t) − 7 x(t) y(t)


y  (t) = −9 y(t) + 7 x(t) y(t)

Exercise 10.9.2 Provide a complete analysis for the Predator–Prey Model

x  (t) = 90 x(t) − 45 x(t) y(t)


y  (t) = −180 y(t) + 20 x(t) y(t)

Exercise 10.9.3 Provide a complete analysis for the Predator–Prey Model

x  (t) = 80 x(t) − 4 x(t) y(t)


y  (t) = −100 y(t) + 5 x(t) y(t)

Exercise 10.9.4 Provide a complete analysis for the Predator–Prey Model

x  (t) = 9 x(t) − 27 x(t) y(t)


y  (t) = −18 y(t) + 6 x(t) y(t)

Exercise 10.9.5 Provide a complete analysis for the Predator–Prey Model

x  (t) = 4 x(t) − 18 x(t) y(t)


y  (t) = −3 y(t) + 21 x(t) y(t)

10.10 Adding Fishing Rates

The Predator–Prey model we have looked at so far did not help Volterra explain the
food and predator fish data seen in the Mediterranean sea during World War I. The
model must also handle changes in fishing rates. War activities had decreased the
rate of fishing from 1915 to 1919 or so as shown in Table 10.2. To understand this
data, Volterra added a new decay rate to the model. He let the positive constant r
represent the rate of fishing and assumed that −r x would be removed from food fish
due to fishing and also assumed that the same rate would apply to predator removal.
Hence, −r y would be removed from the predators. This led to the Predator–Prey
with fishing given by

x  (t) = a x(t) − b x(t) y(t) − r x(t) (10.23)



y (t) = −c y(t) + d x(t) y(t) − r y(t). (10.24)

We don't have to work too hard to understand what adding the fishing does to our
model results. We can rewrite the model as

x  (t) = (a − r ) x(t) − b x(t) y(t) (10.25)


y  (t) = −(c + r ) y(t) + d x(t) y(t). (10.26)

We see immediately that it doesn’t make sense for the fishing rate to exceed a as we
want a − r to be positive. We also know the new averages are

x̄_r = (c + r)/d
ȳ_r = (a − r)/b.
where we label the new averages with a subscript r to denote their dependence on
the fishing rate r . What happens if we halve the fishing rate r ? The new model is

x  (t) = a x(t) − b x(t) y(t) − r/2 x(t)


y  (t) = −c y(t) + d x(t) y(t) − r/2 y(t).

which can be reorganized as

x  (t) = (a − r/2) x(t) − b x(t) y(t) (10.27)


y  (t) = −(c + r/2) y(t) + d x(t) y(t). (10.28)

leading to the new averages

x̄_{r/2} = (c + r/2)/d
ȳ_{r/2} = (a − r/2)/b.
Note that as long as we use a feasible r value (i.e. r < a), we have the following
inequality relationships:

x̄_{r/2} = (c + r/2)/d < x̄_r = (c + r)/d
ȳ_{r/2} = (a − r/2)/b > ȳ_r = (a − r)/b.
Hence, if we decrease the fishing rate r , the predator percentage goes up and the food
percentage goes down. Now look at Table 10.2 rewritten with the percentages listed
as fractions and interpreted as x̄ and ȳ. We show this in Table 10.4.
Note that Volterra’s Predator–Prey model with fishing rates added has now
explained this data. During the war years, predator amounts went up and food fish
amounts went down. A wonderful use of modeling, don’t you think? Insight was
gained from the modeling that had not been able to be achieved using other types of
analysis.
Let’s do an example to set this in place. Consider the following Predator–Prey
model with fishing added.

Example 10.10.1

x  (t) = 4 x(t) − 18 x(t) y(t) − 2 x(t)


y  (t) = −3 y(t) + 21 x(t) y(t) − 2 y(t)

Solution The averages are as follows:

x̄_{r=0} = 3/21 ≈ 0.1429, ȳ_{r=0} = 4/18 ≈ 0.2222

Table 10.4 The average food and predator fish caught in the Mediterranean Sea
Year    x̄       ȳ       Fishing rate change            (Δx̄, Δȳ)
1914 0.881 0.119 Starting value No change yet
1915 0.786 0.214 Down relative to 1914 (−, +)
1916 0.779 0.221 Down relative to 1914 (−, +)
1917 0.788 0.212 Down relative to 1914 (−, +)
1918 0.636 0.364 Down relative to 1914 (−, +)
1919 0.727 0.273 Increased relative to 1918 (+, −)
1920 0.840 0.160 Increased relative to 1918 (+, −)
1921 0.841 0.159 Increased relative to 1918 (+, −)
1922 0.852 0.148 Increased relative to 1918 (+, −)
1923 0.893 0.107 Back to normal 1914 rate Back to normal
x̄_{r=2} = 5/21 ≈ 0.2381, ȳ_{r=2} = 2/18 ≈ 0.1111

x̄_{r=1} = 4/21 ≈ 0.1905, ȳ_{r=1} = 3/18 ≈ 0.1667.

We see that halving the fishing rate decreases the food fish amounts (0.2381 down
to 0.1905) and increases the predator amounts (0.1111 up to 0.1667). We could also
show this graphically by drawing all three average pairs on the same x–y plane
but we will leave that to you in the exercises.
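The arithmetic in this example is easy to script. The MATLAB sketch below simply evaluates x̄_r = (c + r)/d and ȳ_r = (a − r)/b for the three fishing rates used above.

% Sketch: average pairs for Example 10.10.1 at fishing rates r = 2, 1, 0.
a = 4; b = 18; c = 3; d = 21;
for r = [2, 1, 0]
  fprintf('r = %g: xbar = %.4f, ybar = %.4f\n', r, (c + r)/d, (a - r)/b);
end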

10.10.1 Homework

For the following problems, add fishing to the model at some rate r which is given.
Find the new average solutions (x̄, ȳ) and explain what happens if we halve the fishing
rate and how this relates to the way Volterra explained the Mediterranean Sea fishing
data from World War I. Draw a simple picture showing these three averages on the
same x–y graph: show the original (x̄, ȳ), the (x̄, ȳ) when the fishing is added and the
(x̄, ȳ) when the fishing is halved. You should clearly see that halving the
fishing rate leads to the average predator value going up and the average food fish
value going down.
Exercise 10.10.1

x  (t) = 4 x(t) − 18 x(t) y(t) − 2 x(t)


y  (t) = −3 y(t) + 21 x(t) y(t) − 2 y(t)

Exercise 10.10.2

x  (t) = 3 x(t) − 10 x(t) y(t) − 1 x(t)


y  (t) = −4 y(t) + 20 x(t) y(t) − 1 y(t)

Exercise 10.10.3

x  (t) = 1 x(t) − 2 x(t) y(t) − 0.5 x(t)


y  (t) = −4 y(t) + 8 x(t) y(t) − 0.5 y(t)

Exercise 10.10.4

x  (t) = 40 x(t) − 18 x(t) y(t) − 0.2 x(t)


y  (t) = −30 y(t) + 20 x(t) y(t) − 0.2 y(t)

Exercise 10.10.5

x  (t) = 7 x(t) − 8 x(t) y(t) − 0.2 x(t)


y  (t) = −3 y(t) + 4 x(t) y(t) − 0.2 y(t)

10.11 Numerical Solutions

Let’s try to solve a typical predator–prey system such as the one given below numer-
ically.
x  (t) = a x(t) − b x(t) y(t)
y  (t) = −c y(t) + d x(t) y(t)

As we have done in Chap. 9, we convert our model to a matrix-vector system.


The right hand side of our system is now a column vector: we identify x with the
component x(1) and y with the component x(2). This gives the vector function

f(t, x) = [ a x(1) − b x(1) x(2) ; −c x(2) + d x(1) x(2) ]

and we can no longer find the true solution, although our theoretical investigations
have told us a lot about the behavior that the true solution must have.
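Before we start generating plots, note that the nonlinear conservation law gives us a handy way to judge the quality of any numerical trajectory: f(x(t)) g(y(t)) should stay essentially constant along it. The sketch below checks this with the built-in solver ode45 (tolerances tightened with odeset) for the model we are about to solve; the time horizon 2 and the tolerance values are illustrative choices.

% Sketch: monitor the conserved quantity f(x)g(y) along a numerical trajectory
% of x' = 12x - 5xy, y' = -6y + 3xy, so a = 12, b = 5, c = 6, d = 3.
a = 12; b = 5; c = 6; d = 3;
rhs = @(t, z) [a*z(1) - b*z(1)*z(2); -c*z(2) + d*z(1)*z(2)];
opts = odeset('RelTol', 1e-8, 'AbsTol', 1e-10);
[t, z] = ode45(rhs, [0 2], [0.2; 8.6], opts);
f = @(x) x.^c ./ exp(d*x);
g = @(y) y.^a ./ exp(b*y);
E = f(z(:,1)).*g(z(:,2));               % should be nearly flat
fprintf('relative drift in f(x)g(y): %.2e\n', max(abs(E - E(1)))/E(1));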
Let’s solve a Predator–Prey Model with Runge–Kutta Order 4.

x  (t) = 12 x(t) − 5 x(t) y(t)


y  (t) = −6 y(t) + 3 x(t) y(t)

We can do this with the following Matlab session.

Listing 10.1: Phase Plane x' = 12x − 5xy, y' = −6y + 3xy, x(0) = 0.2, y(0) = 8.6

f = @(t,x) [12*x(1) - 5*x(1).*x(2); -6*x(2) + 3*x(1).*x(2)];
T = 1.7;
h = .01;
% xbar = 6/3 = 2.0; ybar = 12/5 = 2.4
x0 = [0.2; 8.6];
N = ceil(T/h);
[ht,rk] = FixedRK(f,0,x0,h,4,N);
X = rk(1,:);
Y = rk(2,:);
x1 = min(X)
x2 = max(X);
y1 = min(Y)
y2 = max(Y);
clf
hold on
plot([x1 x1],[y1 y2],'-b');
plot([x2 x2],[y1 y2],'-g');
plot([x1 x2],[y1 y1],'-r');
plot([x1 x2],[y2 y2],'-m');
plot(X,Y,'-k');
xlabel('x');
ylabel('y');
title('Phase Plane for Predator-Prey model x'' = 12x - 5xy, y'' = -6y+3xy, x(0) = 0.2, y(0) = 8.6');
legend('x1','x2','y1','y2','y vs x','Location','Best');
hold off

This gives the plot of Fig. 10.32. Let’s annotate this code.
Fig. 10.32 Phase plane x' = 12x − 5xy, y' = −6y + 3xy, x(0) = 0.2, y(0) = 8.6

Listing 10.2: Annotated Predator–Prey Phase Plane Code


% s e t up P r e d a t o r − Prey dynamics
f = @( t , x ) [ 1 2 ∗ x ( 1 ) − 5∗ x ( 1 ) . ∗ x ( 2 ) ;−6∗x ( 2 ) +3∗ x ( 1 ) . ∗ x ( 2 ) ] ;
% s e t up f i n a l t i m e : t o o s m a l l and t r a j e c t o r y d o e s n o t c l o s e
% t o o b i g and t r a j e c t o r y wraps around m u l t i p l e t i m e s
T = 1.7;
% s e t s t e p s i z e : t o o s m a l l and t r a j e c t o r y l o o k s j a g g e d
h = .01;
% f i n d a v e r a g e s xbar = 6 / 3 = 2 . 0 ; ybar = 1 2 / 5 = 2 . 2
% p i c k IC s o mu i s r e a l l y s m a l l
x0 = [ 0 . 2 ; 8 . 6 ] ;
% f i n d number o f s t e p s
N = c e i l (T / h ) ;
% f i n d RK 4 approximate s o l u t i o n
[ ht , rk ] = FixedRK ( f , 0 , x0 , h , 4 ,N) ;
% s e t X and Y
X = rk ( 1 , : ) ;
Y = rk ( 2 , : ) ;
% g e t x1 and x2 from our bounding box argument
% u s i n g min and max o f X
x1 = min (X)
x2 = max (X) ;
% g e t y1 and y2 from our bounding box argument
% u s i n g min and max o f Y
y1 = min (Y)
y2 = max (Y) ;
% c l e a r p r e v i o u s graph
clf
% we w i l l do m u l t i p l e p l o t s s o s e t a h o l d on
% p l o t x1 v e r t i c a l l i n e i n b l u e
p l o t ( [ x1 x1 ] , [ y1 y2 ] , ’ − b ’ ) ;
% p l o t x2 v e r t i c a l l i n e i n g r e e n
p l o t ( [ x2 x2 ] , [ y1 y2 ] , ’ − g ’ ) ;
% p l o t y1 h o r i z o n t a l l i n e i n red
p l o t ( [ x1 x2 ] , [ y1 y1 ] , ’ − r ’ ) ;
% p l o t y1 h o r i z o n t a l l i n e i n magenta
p l o t ( [ x1 x2 ] , [ y2 y2 ] , ’ −m’ ) ;
% p l o t Y vs X in black
p l o t (X, Y, ’ −k ’ ) ;

% s e t x and y l a b e l s
xlabel ( ’x ’) ;
ylabel ( ’y ’) ;
% set t i t l e
t i t l e ( ’ Phase Pl a n e f o r P r e d a t o r − Prey model x ’ ’ = 1 2 x − 5xy , y ’ ’= −6 y+3xy , x ( 0 ) =
0.2 , y (0) = 8.6 ’) ;
% s e t legend
l e g e n d ( ’ x1 ’ , ’ x2 ’ , ’ y1 ’ , ’ y2 ’ , ’ y v s x ’ , ’ L o c a t i o n ’ , ’ Best ’ ) ;
% c a n c e l hold
hold o f f

We can use this type of session to do interesting things.


• For a given Predator–Prey model with IC, set the final time T so low the trajectory
does not close. Then increase T slowly until trajectory just touches. This gives a
good estimate of the period T. The step size h needs to be adjusted to make sure
the graph is smooth to get a good value of T.
• For different IC’s you get different bounding boxes, so with a little care you can
plot a sequence of nested bounding boxes. You'll find the larger μ value bounding
box is inside the smaller μ value box.

10.11.1 Estimating the Period T Numerically

We’ll estimate the period for our sample problem. We start with a small final time T
and move it up until the trajectory is almost closed.

Listing 10.3: First Estimate of the period T


f = @( t , x ) [ 1 2 ∗ x ( 1 ) − 5∗ x ( 1 ) . ∗ x ( 2 ) ;−6∗x ( 2 ) +3∗ x ( 1 ) . ∗ x ( 2 ) ] ;
T = 1.01;
h = .01;
% xbar = 6 / 3 = 2 . 0 ; ybar = 1 2 / 5 = 2 . 2
x0 = [ 1 . 2 ; 8 . 6 ] ;
N = c e i l (T / h ) ;
[ ht , rk ] = FixedRK ( f , 0 , x0 , h , 4 ,N) ;
X = rk ( 1 , : ) ;
Y = rk ( 2 , : ) ;
x1 = min (X)
x2 = max (X) ;
y1 = min (Y)
y2 = max (Y) ;
clf
h o l d on
p l o t ( [ x1 x1 ] , [ y1 y2 ] , ’ − b ’ ) ;
p l o t ( [ x2 x2 ] , [ y1 y2 ] , ’ − g ’ ) ;
p l o t ( [ x1 x2 ] , [ y1 y1 ] , ’ − r ’ ) ;
p l o t ( [ x1 x2 ] , [ y2 y2 ] , ’ −m’ ) ;
p l o t (X, Y, ’ −k ’ ) ;
xlabel ( ’x ’) ;
ylabel ( ’y ’) ;
t i t l e ( ’ Phase P l a n e f o r P r e d a t o r − Prey model x ’ ’ = 1 2 x − 5 xy , y ’ ’= −6 y+3xy , x ( 0 ) =
0.2 , y (0) = 8.6 ’) ;
l e g e n d ( ’ x1 ’ , ’ x2 ’ , ’ y1 ’ , ’ y2 ’ , ’ y v s x ’ , ’ L o c a t i o n ’ , ’ Best ’ ) ;
hold o f f

This gives us Fig. 10.33 and we can see the period T > 1.01.
Fig. 10.33 Predator–prey plot with final time less than the period

Now we increase the final time a bit.

Listing 10.4: Second Estimate of the period T


f = @( t , x ) [ 1 2 ∗ x ( 1 ) − 5∗ x ( 1 ) . ∗ x ( 2 ) ;−6∗x ( 2 ) +3∗ x ( 1 ) . ∗ x ( 2 ) ] ;
T = 1.02;
h = .01;
% xbar = 6 / 3 = 2 . 0 ; ybar = 1 2 / 5 = 2 . 2
x0 = [ 1 . 2 ; 8 . 6 ] ;
N = c e i l (T / h ) ;
[ ht , rk ] = FixedRK ( f , 0 , x0 , h , 4 ,N) ;
X = rk ( 1 , : ) ;
Y = rk ( 2 , : ) ;
x1 = min (X)
x2 = max (X) ;
y1 = min (Y)
y2 = max (Y) ;
clf
h o l d on
p l o t ( [ x1 x1 ] , [ y1 y2 ] , ’ − b ’ ) ;
p l o t ( [ x2 x2 ] , [ y1 y2 ] , ’ − g ’ ) ;
p l o t ( [ x1 x2 ] , [ y1 y1 ] , ’ − r ’ ) ;
p l o t ( [ x1 x2 ] , [ y2 y2 ] , ’ −m’ ) ;
p l o t (X, Y, ’ −k ’ ) ;
xlabel ( ’x ’) ;
ylabel ( ’y ’) ;
t i t l e ( ’ Phase P l a n e f o r P r e d a t o r − Prey model x ’ ’ = 1 2 x − 5 xy , y ’ ’= −6 y+3xy , x ( 0 ) =
0.2 , y (0) = 8.6 ’) ;
l e g e n d ( ’ x1 ’ , ’ x2 ’ , ’ y1 ’ , ’ y2 ’ , ’ y v s x ’ , ’ L o c a t i o n ’ , ’ Best ’ ) ;
hold o f f

This gives us Fig. 10.34 and we can see the trajectory is now closed. So the period
T ≤ 1.02. Hence, we know 1.01 < T ≤ 1.02.
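Rather than nudging the final time by hand, we can let the computer report the first time the numerical trajectory returns near its starting point. The sketch below reuses the FixedRK driver from Chap. 9; the 0.05 closeness tolerance and the skip past t = 0.5 are ad hoc choices you can tighten (along with h) to refine the estimate.

% Sketch: estimate the period by detecting the first near-return to the IC.
f = @(t,x) [12*x(1) - 5*x(1).*x(2); -6*x(2) + 3*x(1).*x(2)];
x0 = [1.2; 8.6];
h = 0.001; N = ceil(2/h);                  % march out to time 2
[ht, rk] = FixedRK(f, 0, x0, h, 4, N);
dist = sqrt((rk(1,:) - x0(1)).^2 + (rk(2,:) - x0(2)).^2);
start = round(0.5/h);                      % skip the initial departure
k = start - 1 + find(dist(start:end) < 0.05, 1);
fprintf('estimated period T is about %.3f\n', ht(k));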
Fig. 10.34 Predator–prey plot with final time about the period

10.11.2 Plotting Predator and Prey Versus Time

We can also write code to generate x versus t plots and y versus t plots. From these,
we can also estimate the period T. The x versus t code is shown below and right
after it is the one line modification needed to generate the plot of y versus t. The codes
below show a little bit more than one period for x and y.

Listing 10.5: x versus time code


f = @(t,x) [12*x(1) - 5*x(1).*x(2); -6*x(2) + 3*x(1).*x(2)];
T = 1.04;
h = .01;
% xbar = 6/3 = 2.0; ybar = 12/5 = 2.4
x0 = [1.2; 8.6];
N = ceil(T/h);
[ht,rk] = FixedRK(f,0,x0,h,4,N);
X = rk(1,:);
plot(ht,X,'-k');
xlabel('t');
ylabel('x');
title('x vs time for Predator-Prey model x'' = 12x - 5xy, y'' = -6y+3xy, x(0) = 0.2, y(0) = 8.6');

The x versus t plot is shown in Fig. 10.35.


Fig. 10.35 x versus t for x' = 12x − 5xy, y' = −6y + 3xy, x(0) = 0.2, y(0) = 8.6

Listing 10.6: y versus time code


f = @(t,x) [12*x(1) - 5*x(1).*x(2); -6*x(2) + 3*x(1).*x(2)];
T = 1.04;
h = .01;
% xbar = 6/3 = 2.0; ybar = 12/5 = 2.4
x0 = [1.2; 8.6];
N = ceil(T/h);
[ht,rk] = FixedRK(f,0,x0,h,4,N);
Y = rk(2,:);
plot(ht,Y,'-k');
xlabel('t');
ylabel('y');
title('y vs time for Predator-Prey model x'' = 12x - 5xy, y'' = -6y+3xy, x(0) = 0.2, y(0) = 8.6');

The y versus t plot is shown in Fig. 10.36.

10.11.3 Plotting Using a Function

Although it is nice to do things interactively, sometimes it gets a bit tedious. Let’s


rewrite this as a function.
Fig. 10.36 y versus t for x' = 12x − 5xy, y' = −6y + 3xy, x(0) = 0.2, y(0) = 8.6

Listing 10.7: AutoSystemFixedRK.m


f u n c t i o n AutoSystemFixedRK ( fname , s t e p s i z e , t i n i t , t f i n a l , y i n i t , r k o r d e r )
% fname i s t h e name o f t h e model dy n a m i c s
3 % s t e p s i z e i s our s t e p s i z e c h o i c e
% t i n i t i s the i n i t i a l time
% t f i n a l i s the f i n a l time
% r k o r d e r i s t h e RK o r d e r
% y i n i t i s t h e i n i t i a l d a t a e n t e r e d a s [ number 1 ; number 2 ]
8 n = c e i l ( ( t f i n a l −t i n i t ) / s t e p s i z e ) ;
[ htime , rk ] = FixedRK ( fname , t i n i t , y i n i t , s t e p s i z e , rkorder , n ) ;
X = rk ( 1 , : ) ;
Y = rk ( 2 , : ) ;
x1 = min (X)
13 x2 = max (X) ;
y1 = min (Y)
y2 = max (Y) ;
clf
h o l d on
18 p l o t ( [ x1 x1 ] , [ y1 y2 ] , ’-b’ ) ;
p l o t ( [ x2 x2 ] , [ y1 y2 ] , ’-g’ ) ;
p l o t ( [ x1 x2 ] , [ y1 y1 ] , ’-r’ ) ;
p l o t ( [ x1 x2 ] , [ y2 y2 ] , ’-m’ ) ;
p l o t (X, Y, ’-k’ ) ;
23 x l a b e l ( ’x’ ) ;
y l a b e l ( ’y’ ) ;
t i t l e ( ’Phase Plane Plot of y vs x’ ) ;
l e g e n d ( ’x1’ , ’x2’ , ’y1’ , ’y2’ , ’y vs x’ , ’Location’ , ’Best’ ) ;
hold o f f
28 end

This is saved in the file AutoSystemFixedRK.m and we use it in MatLab like


this:

Listing 10.8: Using AutoSystemFixedRK


f = @(t,x) [12*x(1) - 5*x(1).*x(2); -6*x(2) + 3*x(1).*x(2)];
AutoSystemFixedRK(f, 0.02, 0, 5, [2; 4], 4);

This is much more compact! We can use it to generate our graphs much faster.
We can use this to show you how the choice of step size is crucial to generating a
decent plot. We show what happens with too large a step size in Fig. 10.37 and what
we see with a better step size choice in Fig. 10.38.

Fig. 10.37 Predator–prey plot with step size too large

Fig. 10.38 Predator–prey plot with a better step size!

10.11.4 Automated Phase Plane Plots

Next, we can generate a real phase plane portrait by automating the phase plane plots
for a selection of initial conditions. This uses the code AutoPhasePlanePlot.m
which we discussed in Sect. 9.4.
We generate a very nice phase plane plot as shown in Fig. 10.39 for the model

x  (t) = 6 x(t) − 5 x(t) y(t)


y  (t) = −7 y(t) + 4 x(t) y(t)

for initial conditions from the box [0.1, 4.5] × [0.1, 4.5] using a fairly small step
size of 0.2.

Listing 10.9: Automated Phase Plane Plot for x  = 6x − 5x y, y  = −7y + 4x y


f = @(t,x) [6*x(1) - 5*x(1)*x(2); -7*x(2) + 4*x(1)*x(2)];
AutoPhasePlanePlot('PredPrey', 0.02, 0.0, 16.5, 4, 5, 5, .1, 4.5, .1, 4.5);

10.11.5 Homework

Here are some problems on using MatLab!

Exercise 10.11.1

x  (t) = 4 x(t) − 7 x(t) y(t)


y  (t) = −9 y(t) + 7 x(t) y(t)

Fig. 10.39 Predator–prey plot for multiple initial conditions!

1. Use our Runge–Kutta codes for h sufficiently small to generate a periodic orbit
using initial conditions:
(a)

2
1

(b)

5
2

2. Use our MatLab codes to estimate the period in each case


3. Generate plots of the x and y trajectories.

Exercise 10.11.2

x  (t) = 90 x(t) − 45 x(t) y(t)


y  (t) = −180 y(t) + 20 x(t) y(t)

1. Use our Runge–Kutta codes for h sufficiently small to generate a periodic orbit
using initial conditions:
(a)

4
12

(b)

5
20

2. Use our MatLab codes to estimate the period in each case


3. Generate plots of the x and y trajectories.

Exercise 10.11.3

x  (t) = 10 x(t) − 5 x(t) y(t)


y  (t) = −4 y(t) + 20 x(t) y(t)

1. Use our Runge–Kutta codes for h sufficiently small to generate a periodic orbit
using initial conditions:

(a)

40
2

(b)

5
25

2. Use our MatLab codes to estimate the period in each case


3. Generate plots of the x and y trajectories.

Exercise 10.11.4

x  (t) = 7 x(t) − 14 x(t) y(t)


y  (t) = −6 y(t) + 3 x(t) y(t)

1. Use our Runge–Kutta codes for h sufficiently small to generate a periodic orbit
using initial conditions:
(a)

7
12

(b)

0.2
2

2. Use our MatLab codes to estimate the period in each case


3. Generate plots of the x and y trajectories.

Exercise 10.11.5

x  (t) = 8 x(t) − 4 x(t) y(t)


y  (t) = −10 y(t) + 2 x(t) y(t)

1. Use our Runge–Kutta codes for h sufficiently small to generate a periodic
orbit using initial conditions:
(a)

0.1
18

(b)

6
0.1

2. Use our MatLab codes to estimate the period in each case


3. Generate plots of the x and y trajectories.

10.12 A Sketch of the Predator–Prey Solution Process

We now have quite a few tools for analyzing Predator–Prey models. Let’s look at
a sample problem. We can analyze by hand or with computational tools. Here is a
sketch of the process on a sample problem.

1. For the system below, first do the work by hand. For the model

x  (t) = 56 x(t) − 25 x(t) y(t)


y  (t) = −207 y(t) + 40 x(t) y(t)

(a) Derive the algebraic sign regions in Q1 determined by the x  = 0 and y  = 0


lines. Do the x  = 0 analysis first, then the y  = 0 analysis. Then assemble
the results into one graph. Use colors. In each of the four regions you get,
draw appropriate trajectory pieces and explain why at this point the full
trajectory could be spiral in, spiral out or periodic.
(b) Derive the nonlinear conservation law.
(c) The nonlinear conservation law has the form f (x) g(y) = f (x0 ) g(y0 ) for
a very particular f and g. State what f and g are for the following problems
and derive their properties. Give a nice graph of your results.
(d) Derive the trajectories that start on the positive x or positive y axis. Draw
nice pictures.
(e) Prove the trajectories that start in Q1 with positive x0 and y0 must be
bounded. Do this very nicely with lots of colors and explanation. Explain
why this result tells us the trajectories can’t spiral out or spiral in.
(f) Draw typical trajectories for various initial conditions showing how the
bounding boxes nest.
2. Now do the graphical work with MatLab. A typical generated phase plane plot
for this initial condition would be the plot seen in Fig. 10.40. In this plot, we
commented out the code that generates the bounding box.
3. Now plot the x versus time and y versus time for this Predator–Prey model. This
generates the plots seen in Figs. 10.35 and 10.36.
Fig. 10.40 Predator–prey phase plot

Fig. 10.41 Predator–prey phase plane plot

4. Now plot many trajectories at the same time. A typical session usually requires
a lot of trial and error. The AutoPhasePlanePlot.m script is used by filling
in values for the inputs it needs. We generate the plot seen in Fig. 10.41.

10.12.1 Project

Now your actual project. For the model

x  (t) = 10 x(t) − 5 x(t) y(t)


y  (t) = −40 y(t) + 9 x(t) y(t)

Solve the Model By Hand: Do this and attach to your project report.
Plot One Trajectory Using MatLab: Follow the outline above. This part of the
report is done in a word processor with appropriate comments, discussion etc.
Show your MatLab code and sessions as well as plots.
Estimate The Period T: Estimate the period T using the x versus time plot and
then fine tune your estimate using the phase plane plot—keep increasing the
final time until the trajectories touch for the first time. Pick an interesting initial
condition, of course!
Plot Many Trajectories Simultaneously Using MatLab: Follow the outline
above. This part of the report is also done in a word processor with appropri-
ate comments, discussion etc. Show your MatLab code and sessions as well as
plots.

References

B. Axelsen, T. Anker-Nilssen, P. Fossum, C. Kvamme, L. Nettestad, Pretty patterns but a simple strategy: predator–prey interactions between juvenile herring and Atlantic puffins observed with multibeam sonar. Can. J. Zool. 79, 1586–1596 (2001)
M. Braun, Differential Equations and Their Applications (Springer, New York, 1978)
J. Peterson, Calculus for Cognitive Scientists: Derivatives, Integration and Modeling, Springer
Series on Cognitive Science and Technology (Springer Science+Business Media Singapore Pte
Ltd, Singapore, 2015 In press)
Chapter 11
Predator–Prey Models with Self Interaction

Many biologists of Volterra’s time criticized his Predator–Prey model because it did
not include self-interaction terms. These are terms that model how food fish interactions
with other food fish and shark interactions with other predators affect their
populations. We can model these effects by assuming their magnitude is proportional
to the interaction. Mathematically, we assume these are both decay terms giving us
the Predator–Prey Self Interaction model

x'_self = −e x x
y'_self = −f y y.

for positive constants e and f . We are thus led to the new self-interaction model
given below:
x  (t) = a x(t) − b x(t) y(t) − e x(t)2
y  (t) = −c y(t) + d x(t) y(t) − f y(t)2

The nullclines for the self-interaction model are a bit more complicated, but still
straightforward to work with. First, we can factor the dynamics to obtain

x  = x (a − b y − e x),
y  = y (−c + d x − f y).

11.1 The Nullcline Analysis

11.1.1 The x  = 0 Analysis

Looking at the predator–prey self interaction dynamics equations, we see the (x, y)
pairs in the x–y plane where
 
0 = x (a − b y − e x)

are the ones where the rate of change of the food fish will be zero. Now these pairs
can correspond to many different time values so what we really need to do is to
find all the (x, y) pairs where this happens. Since this is a product, there are two
possibilities:

• x = 0; the y axis and


• y = a/b − (e/b) x.

11.1.2 The y = 0 Analysis

In a similar way, the pairs (x, y) where y  becomes zero satisfy the equation
 
0 = y (−c + d x − f y).

Again, there are two possibilities:

• y = 0; the x axis and


• y = −c/f + (d/f) x.

11.1.3 The Nullcline Plane

Just like we did in Chap. 8, we find the parts of the x–y plane where the algebraic
signs of x  and y  are (+, +), (+, −), (−, +) and (−, −). As usual, the set of (x, y)
pairs where x  = 0 is called the nullcline for x; similarly, the points where y  = 0 is
the nullcline for y. The x' = 0 equation gives us the y axis and the line y = a/b − (e/b) x
while the y' = 0 gives the x axis and the line y = −c/f + (d/f) x. The x' and y' nullclines
thus divide the plane into the usual three pieces: the part where the derivative is
positive, zero or negative. In Fig. 11.1, we show the part of the x–y plane where
x  > 0 with one shading and the part where it is negative with another. In Fig. 11.2,
we show how the y  nullcline divides the x–y plane into three pieces as well. For x  ,
in each region of interest, we know the term x  has the two factors x and a − by − ex.
The second factor is positive when

a − by − ex > 0
a/b − (e/b) x > y

Here x' = x (a − by − ex). Setting this to 0, we get x = 0 and y = a/b − (e/b) x, whose graphs are shown. The algebraic signs of x' are shown in the picture, along with the signs of each of the component factors.

Fig. 11.1 Finding where x  < 0 and x  > 0 for the Predator–Prey self interaction model

Here y' = y (−c + dx − f y). Setting this to 0, we get y = 0 and y = −c/f + (d/f) x, whose graphs are shown. The algebraic signs of y' are shown in the picture, along with the signs of each of the component factors.

Fig. 11.2 Finding where y′ < 0 and y′ > 0 for the Predator–Prey self interaction model

So below the line, the factor is positive. In a similar way, the term y′ has the two
factors y and −c + d x − f y. Here the second factor is positive when

−c + d x − f y > 0
−c/f + (d/f) x > y.

So below the line, the factor is positive. We then use this information to determine
the algebraic signs in each region. In Fig. 11.1, we show these four regions (think of
them as Upper Left (UL), Upper Right (UR), Lower Left (LL) and Lower Right (LR)
for convenience) with the x′ equation shown in each region along with the algebraic
signs for each of the two factors. The y′ signs are shown in Fig. 11.2.

The areas shown in Figs. 11.1 and 11.2 can be combined into one drawing. To do
this, we divide the x–y plane into as many regions as needed and in each region, label
x′ and y′ as either positive or negative. Hence, each region can be marked with an
ordered pair, (x′ ±, y′ ±). In this self-interaction case, there are three separate cases:
the one where c/d < a/e, which gives an intersection in Quadrant 1; the one where
c/d = a/e, which gives an intersection on the x axis; and the one where c/d > a/e, which gives an
intersection in Quadrant 4. We are interested in biologically reasonable solutions, so
if the initial conditions start in Quadrant 1, we would like to know that the trajectories
stay in Quadrant 1 away from the x and y axes.

Example 11.1.1 Do the x′ = 0 and y′ = 0 nullcline analysis separately for the
model

x′(t) = 4 x(t) − 5 x(t) y(t) − e x(t)²
y′(t) = −6 y(t) + 2 x(t) y(t) − f y(t)²

Note we don't specify e and f .

Solution • For x′, in each region of interest, we know the term x′ has the two factors
x and 4 − 5y − ex. The second factor is positive when

4 − 5y − ex > 0 ⇒ 4/5 − (e/5) x > y

So below the line, the factor is positive.
• The term y′ has the two factors y and −6 + 2x − f y. Here the second factor is
positive when

−6 + 2x − f y > 0 ⇒ −6/f + (2/f) x > y

So below the line, the factor is positive.
• The graphs for this solution are shown in Figs. 11.3 and 11.4.
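We can also sketch these two nullcline lines directly in Matlab. This is only a quick illustration: since the example leaves e and f unspecified, the values below are chosen simply so the lines can be drawn.

e = 1; f = 2;                     % illustrative choices; not given in the example
x = linspace(0,6,200);
yx = 4/5 - (e/5)*x;               % the non-axis x' = 0 nullcline
yy = -6/f + (2/f)*x;              % the non-axis y' = 0 nullcline
plot(x,yx,x,yy,[0 6],[0 0],'k');
xlabel('x axis'); ylabel('y axis');
legend('x'' = 0 line','y'' = 0 line','x axis');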

11.1.4 Homework

For these models, do the complete nullcline analysis for x′ = 0 and y′ = 0 separately
with all details.

Exercise 11.1.1

x′(t) = 15 x(t) − 2 x(t) y(t) − 2 x(t)²
y′(t) = −8 y(t) + 30 x(t) y(t) − 3 y(t)²


Fig. 11.3 x′ nullcline for x′ = x (4 − 5y − ex)


Fig. 11.4 y′ nullcline for y′ = y (−6 + 2x − f y)

Exercise 11.1.2

x′(t) = 6 x(t) − 20 x(t) y(t) − 10 x(t)²
y′(t) = −28 y(t) + 30 x(t) y(t) − 3 y(t)²

Exercise 11.1.3

x′(t) = 7 x(t) − 6 x(t) y(t) − 2 x(t)²
y′(t) = −10 y(t) + 2 x(t) y(t) − 3 y(t)²

Exercise 11.1.4

x′(t) = 8 x(t) − 2 x(t) y(t) − 2 x(t)²
y′(t) = −9 y(t) + 3 x(t) y(t) − 1.5 y(t)²

11.2 Quadrant 1 Trajectories Stay in Quadrant 1

To prepare for our Quadrant 1 analysis, let's combine the nullclines, but only in
Quadrant 1. First, let's redraw the derivative sign analysis just in Quadrant 1. In
Fig. 11.5 we show the x′ + and x′ − regions in Quadrant 1 only. We will
show that we only need to look at the model in Quadrant 1. To do this, we will show
that the trajectories starting on the positive y axis move down towards the origin.
Further, we will show trajectories starting on the positive x axis move towards the
point (a/e, 0). Then, since trajectories cannot cross, a trajectory that
starts in Q1 with positive ICs cannot cross the y axis and cannot
cross the positive x axis, although it can end up at the point (a/e, 0).
In Fig. 11.6 we then show the Quadrant 1 analysis for the y′ + and y′ − regions.
We know different trajectories cannot cross, so if we can show there are trajecto-
ries that stay on the x and y axes, we will know that trajectories starting in Quadrant
1 stay in Quadrant 1.


Fig. 11.5 The x′ < 0 and x′ > 0 signs in Quadrant 1 for the Predator–Prey self interaction model


Fig. 11.6 The y′ < 0 and y′ > 0 signs in Quadrant 1 for the Predator–Prey self interaction model

11.2.1 Trajectories Starting on the y Axis

If we start at an initial condition on the y axis, x0 = 0 and y0 > 0, the self-interaction
model can be solved by choosing x(t) = 0 for all time t and solving the first order
equation

y′(t) = y (−c − f y).

Let's look at the trajectories that start on the positive y axis for this model.

x′(t) = 4 x(t) − 5 x(t) y(t) − e x(t)²
y′(t) = −6 y(t) + 2 x(t) y(t) − f y(t)²

• With (x0 = 0, y0 > 0) then x(t) = 0 always and y satisfies

y′(t) = y (−6 − f y).

• Rewriting

y′ / ( y (6 + f y) ) = −1.

• Integrating from 0 to t, we find

∫_0^t y′(s) / ( y(s) (6 + f y(s)) ) ds = − ∫_0^t ds = −t.

• Make the substitution u = y(s):

∫_{s=0}^{s=t} du / ( u (6 + f u) ) = −t.

• This integration needs a partial fraction decomposition approach. We search for α
and β so that

1 / ( u (6 + f u) ) = α/u + β/(6 + f u).

• We want

1 = α (6 + f u) + β u.

• Evaluating at u = 0 we get α = 1/6 and when u = −6/ f , β = − f /6.


• Complete the integration

∫_{s=0}^{s=t} du / ( u (6 + f u) ) = ∫_{s=0}^{s=t} ( α/u + β/(6 + f u) ) du
  = (1/6) ∫_{s=0}^{s=t} du/u − (f/6) ∫_{s=0}^{s=t} du/(6 + f u)
  = (1/6) ln |u(s)| |_0^t − (1/6) ln |6 + f u(s)| |_0^t .
• But u = y(s) and here the variables are positive so absolute values are not needed.

(1/6) ln |u(s)| |_0^t − (1/6) ln |6 + f u(s)| |_0^t = (1/6) ln( y(t)/y0 ) − (1/6) ln( (6 + f y(t)) / (6 + f y0) )

• Now simplify the ln.

(1/6) ln( y(t)/y0 ) − (1/6) ln( (6 + f y(t)) / (6 + f y0) )
  = (1/6) ln( ( y(t) / (6 + f y(t)) ) · ( (6 + f y0) / y0 ) )

• The right hand side of the integration was −t, so combining

ln( ( y(t) / (6 + f y(t)) ) · ( (6 + f y0) / y0 ) ) = −6t.

• Exponentiate

( y(t) / (6 + f y(t)) ) · ( (6 + f y0) / y0 ) = e^{−6t} .

• Solve for y(t)

y(t) = ( y0 / (6 + f y0) ) (6 + f y(t)) e^{−6t}

( 1 − ( f y0 / (6 + f y0) ) e^{−6t} ) y(t) = ( 6 y0 / (6 + f y0) ) e^{−6t}

y(t) = ( 6 y0 / (6 + f y0) ) e^{−6t} / ( 1 − ( f y0 / (6 + f y0) ) e^{−6t} )

• As t → ∞, the numerator goes to 0 and the denominator goes to 1. So as t → ∞,
y(t) → 0.
• So if the IC starts on the positive y axis, the trajectory goes down to the origin.
• The argument is the same for other f values and other models.

We can do this argument in general using a generic c and f but you should get
the idea from our example.
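As a quick numerical check, here is a minimal Matlab sketch comparing the closed form above with a solution produced by the built-in solver ode45 (rather than the text's FixedRK routine); the values of f and y0 are illustrative.

f = 2; y0 = 3;                                % illustrative values
rhs = @(t,y) y.*(-6 - f*y);                   % dynamics on the positive y axis
[t,ynum] = ode45(rhs,[0 2],y0);
A = 6*y0/(6 + f*y0);
B = f*y0/(6 + f*y0);
yexact = A*exp(-6*t)./(1 - B*exp(-6*t));      % the closed form derived above
max(abs(ynum - yexact))                       % small; both decay to 0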

11.2.2 Trajectories Starting on the x Axis

Let's look at the trajectories that start on the positive x axis for the same model.

x′(t) = 4 x(t) − 5 x(t) y(t) − e x(t)²
y′(t) = −6 y(t) + 2 x(t) y(t) − f y(t)²

• With (x0 > 0, y0 = 0) then y(t) = 0 always and x(t) satisfies

x′(t) = 4x − ex² = ex((4/e) − x).

• Hence, the x′ equation is a logistic model with L = 4/e and α = e. So x(t) → 4/e
as t → ∞. If x0 > 4/e, x(t) goes down toward 4/e and if x0 < 4/e, x(t) goes
up to 4/e.
• This argument works for any e and any other model. So trajectories that start on
the positive x axis move towards a/e; a short numerical sketch follows this list.
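Here is that minimal sketch, again with an illustrative value of e and using Matlab's ode45:

e = 1; x0 = 7;                               % start above the carrying capacity 4/e
[t,x] = ode45(@(t,x) 4*x - e*x.^2,[0 5],x0); % logistic dynamics on the x axis
x(end)                                       % close to 4/e = 4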

We are now in a position to see what happens if we pick initial conditions in


Quadrant 1.

11.2.3 Homework

Analyze the x and y positive axis trajectories as we have done in the above discus-
sions.

Exercise 11.2.1

x′(t) = 15 x(t) − 20 x(t) y(t) − 2 x(t)²
y′(t) = −20 y(t) + 30 x(t) y(t) − 3 y(t)²

Exercise 11.2.2

x′(t) = 5 x(t) − 25 x(t) y(t) − 10 x(t)²
y′(t) = −28 y(t) + 30 x(t) y(t) − 3 y(t)²

Exercise 11.2.3

x′(t) = 7 x(t) − 7 x(t) y(t) − 2 x(t)²
y′(t) = −10 y(t) + 2 x(t) y(t) − 3 y(t)²

Exercise 11.2.4

x′(t) = 8 x(t) − 3 x(t) y(t) − 3 x(t)²
y′(t) = −9 y(t) + 4 x(t) y(t) − 1.5 y(t)²

11.3 The Combined Nullcline Analysis in Quadrant 1

We now know that if a trajectory starts at x0 > 0 and y0 > 0, it cannot cross the
positive x or y axis. There are three cases to consider. If c/d < a/e, the nullclines
will cross somewhere in Quadrant 1. This intersection point will play the same role
as the average x and average y in the Predator–Prey model without self interaction.
If c/d = a/e, the two nullclines intersect on the x axis at a/e. Finally, if c/d > a/e, the
intersection occurs in Quadrant 4, which is not biologically reasonable and is not
even accessible as the trajectory cannot cross the x axis. Let's work out the details
of these possibilities.

11.3.1 The Case c/d < a/e

We now combine Figs. 11.5 and 11.6 to create the combined graph for the case of
the intersection in Quadrant 1. We show this in Fig. 11.7. The two lines cross when

e x + b y = a
d x − f y = c.

Solving using Cramer’s rule, we find the intersection (x ∗ , y ∗ ) to be




Fig. 11.7 The Quadrant 1 nullcline regions for the Predator–Prey self interaction model when
c/d < a/e

 
x∗ = det [ a  b ; c  −f ] / det [ e  b ; d  −f ] = (a f + b c) / (e f + b d)

y∗ = det [ e  a ; d  c ] / det [ e  b ; d  −f ] = (a d − e c) / (e f + b d).

In this case, we have a/e > c/d or a d − e c > 0.
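As a quick check, we can also compute the crossing point numerically. The parameter values in this minimal Matlab sketch are illustrative only.

% Solve e*x + b*y = a, d*x - f*y = c for the nullcline crossing (x*, y*).
a = 4; b = 5; c = 6; d = 2; e = 1; f = 2;   % satisfies c/d < a/e
M = [e b; d -f];
p = M \ [a; c];                             % p(1) = x*, p(2) = y*
xstar = (a*f + b*c)/(e*f + b*d);            % Cramer's rule formulas
ystar = (a*d - e*c)/(e*f + b*d);
[p' ; xstar ystar]                          % the two rows agree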

11.3.2 The Nullclines Touch on the X Axis

The second case is the one where the nullclines touch on the x axis. We show this
situation in Fig. 11.8. This occurs when c/d = a/e.

11.3.3 The Nullclines Cross in Quadrant 4

The third case is the one where the nullclines do not cross in Quadrant 1. We show
this situation in Fig. 11.9. This occurs when c/d > a/e. The two lines now cross at
a negative y value, but since in this model also, trajectories that start in Quadrant 1


Fig. 11.8 The qualitative nullcline regions for the Predator–Prey self interaction model when
c/d = a/e


Fig. 11.9 The qualitative nullcline regions for the Predator–Prey self interaction model when
c/d > a/e

can't cross the x or y axis, we only draw the situation in Quadrant 1. By Cramer's
rule, the solution to

e x + b y = a
d x − f y = c

is the pair (x∗, y∗) as shown below.

x∗ = (a f + b c) / (e f + b d)
y∗ = (a d − e c) / (e f + b d).

In this case, we have a/e < c/d or a d − e c < 0 and so y ∗ is negative and not
biologically interesting.

11.3.4 Example

Let's do the combined nullcline analysis for the model

x′(t) = 4 x(t) − 5 x(t) y(t) − e x(t)²
y′(t) = −6 y(t) + 2 x(t) y(t) − f y(t)²

Here, c/d = 6/2 = 3 and a/e = 4/e.

• For c/d < a/e, we need 3 < 4/e or e < 4/3.
• For c/d = a/e, we need 3 = 4/e or e = 4/3.
• For c/d > a/e, we need 3 > 4/e or e > 4/3.

We'll do this for various values of e so we can see all three cases.
• e = 1, so this is the c/d < a/e case. Note the value of f is not important. We show
the result in Fig. 11.10.
• e = 4/3, so this is the c/d = a/e case and is shown in Fig. 11.11.
• e = 2, so this is the c/d > a/e case; see Fig. 11.12.


Fig. 11.10 The Quadrant 1 nullcline regions for the Predator–Prey self interaction model when
c/d < a/e


Fig. 11.11 The qualitative nullcline regions for the Predator–Prey self interaction model when
c/d = a/e


Fig. 11.12 The qualitative nullcline regions for the Predator–Prey self interaction model when
c/d > a/e

11.4 Trajectories in Quadrant 1

We are now ready to draw trajectories in Quadrant I. For this model

x′(t) = 4 x(t) − 5 x(t) y(t) − e x(t)²
y′(t) = −6 y(t) + 2 x(t) y(t) − f y(t)²

we have c/d = 6/2 = 3 and a/e = 4/e.

• For c/d < a/e, we need 3 < 4/e or e < 4/3.
• For c/d = a/e, we need 3 = 4/e or e = 4/3.
• For c/d > a/e, we need 3 > 4/e or e > 4/3.
For the choice of e = 1, the nullclines cross in Quadrant I. Hence, the trajectories
spiral in to the point where the nullclines cross. We find the trajectory shown in
Fig. 11.13.
Next, for e = 4/3, the nullclines touch on the x axis and the trajectories move
toward the point (6/2, 0). We show this in Fig. 11.14.

Fig. 11.13 Sample Predator–Prey model with self interaction: crossing in Q1

Fig. 11.14 Sample Predator–Prey model with self interaction: the nullclines cross on the x axis

Finally, for e = 2, the nullclines do not cross in Quadrant I and the trajectories
move toward the point (4/2, 0). We show the trajectories in Fig. 11.15.

11.5 Quadrant 1 Intersection Trajectories

We assume we start in Quadrant 1 in the region with (x′, y′) having algebraic signs
(−, −) or (+, −). In these two regions, we know their corresponding trajectories
can't cross the ones that start on the positive x or y axis. From the signs we see in
Fig. 11.7, it is clear that trajectories must spiral into the point (x∗, y∗). So we don't

Fig. 11.15 Sample Predator–Prey model with self interaction: the nullclines cross in Quadrant 4

have to work as hard as we did before to establish this! However, it is also clear there
is not a true average x and y value here as the trajectory is not periodic. Still,
there is a notion of an asymptotic average value which we now discuss.

11.5.1 Limiting Average Values in Quadrant 1

Now the discussions below will be complicated, but all of you can wade through it as
it does not really use any more mathematics than we have seen before. It is, however,
very messy and looks quite intimidating! Still, mastering these kind of things brings
rewards: your ability to think through complicated logical problems is enhanced! So
grab a cup of tea or coffee and let’s go for a ride. We are going to introduce the idea
of limiting average x and y values.
We know at any time t, the solutions x(t) and y(t) must be positive. Rewrite the
model as follows:

x′/x + e x = a − b y
y′/y + f y = −c + d x.

Now integrate from s = 0 to s = t to obtain

∫_0^t ( x′(s)/x(s) ) ds + e ∫_0^t x(s) ds = a t − b ∫_0^t y(s) ds
∫_0^t ( y′(s)/y(s) ) ds + f ∫_0^t y(s) ds = −c t + d ∫_0^t x(s) ds.

We obtain

ln( x(t)/x0 ) + e ∫_0^t x(s) ds = a t − b ∫_0^t y(s) ds
ln( y(t)/y0 ) + f ∫_0^t y(s) ds = −c t + d ∫_0^t x(s) ds.

Now the solutions x and y are continuous, so the integrals

X(t) = ∫_0^t x(s) ds
Y(t) = ∫_0^t y(s) ds

are also continuous by the Fundamental Theorem of Calculus. Using the new vari-
ables X and Y , we can rewrite these integrations as

ln( x(t)/x0 ) = a t − e X(t) − b Y(t)
ln( y(t)/y0 ) = −c t + d X(t) − f Y(t).

Hence,

d ln( x(t)/x0 ) = a d t − e d X(t) − b d Y(t)
e ln( y(t)/y0 ) = −c e t + d e X(t) − e f Y(t).

Now add the bottom and top equation to get

ln( (x(t)/x0)^d (y(t)/y0)^e ) = (a d − c e) t − (e f + b d) Y(t).

Now divide through by t to get

(1/t) ln( (x(t)/x0)^d (y(t)/y0)^e ) = (a d − c e) − (e f + b d) (1/t) Y(t).     (11.1)

From Fig. 11.7, it is easy to see that no matter what (x0, y0) we choose in Quadrant
1, the trajectories are bounded and so there is a positive constant we will call B so
that

| ln( (x(t)/x0)^d (y(t)/y0)^e ) | ≤ B.

Thus, for all t, we have

(1/t) | ln( (x(t)/x0)^d (y(t)/y0)^e ) | ≤ B/t.

Hence, if we let t grow larger and larger, B/t gets smaller and smaller, and in fact

lim_{t→∞} (1/t) | ln( (x(t)/x0)^d (y(t)/y0)^e ) | ≤ lim_{t→∞} B/t = 0.
But the left hand side is always non-negative also, so we have

0 ≤ lim_{t→∞} (1/t) | ln( (x(t)/x0)^d (y(t)/y0)^e ) | ≤ 0,

which tells us that

lim_{t→∞} (1/t) | ln( (x(t)/x0)^d (y(t)/y0)^e ) | = 0.

Finally, the above also implies

lim_{t→∞} (1/t) ln( (x(t)/x0)^d (y(t)/y0)^e ) = 0.

Now let t go to infinity in Eq. 11.1 to get

lim_{t→∞} (1/t) ln( (x(t)/x0)^d (y(t)/y0)^e ) = (a d − c e) − (e f + b d) lim_{t→∞} (1/t) Y(t).

The term Y(t)/t is actually (1/t) ∫_0^t y(s) ds, which is the average of the solution y
on the interval [0, t]. Since the left hand side limit is 0, it therefore follows that

0 = (a d − c e) − (e f + b d) lim_{t→∞} (1/t) ∫_0^t y(s) ds,  i.e.,  lim_{t→∞} (1/t) ∫_0^t y(s) ds = (a d − c e) / (e f + b d).

But the term on the right hand side is exactly the y coordinate of the intersection
of the nullclines, y∗. We conclude the limiting average value of the solution y is
given by

lim_{t→∞} (1/t) ∫_0^t y(s) ds = y∗.     (11.2)

We can do a similar analysis (although there are differences in approach) to show
that the limiting average value of the solution x is given by
lim_{t→∞} (1/t) ∫_0^t x(s) ds = x∗.     (11.3)

These two results are similar to what we saw in the Predator–Prey model without
self-interaction. Of course, we only had to consider the averages over the period
before, whereas in the self-interaction case, we must integrate over all time. It is
instructive to compare these results:

Model                  Average x                       Average y
No Self-Interaction    c/d                             a/b
Self-Interaction       x∗ = (a f + b c)/(e f + b d)    y∗ = (a d − e c)/(e f + b d)
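These limiting averages are easy to test numerically. The minimal sketch below uses Matlab's built-in ode45 and trapz (rather than the text's FixedRK routine), with illustrative parameter values satisfying c/d < a/e.

a = 4; b = 5; c = 6; d = 2; e = 1; f = 1;    % illustrative; c/d = 3 < a/e = 4
rhs = @(t,z) [a*z(1) - b*z(1)*z(2) - e*z(1)^2; -c*z(2) + d*z(1)*z(2) - f*z(2)^2];
[t,z] = ode45(rhs,[0 200],[1;1]);
xbar = trapz(t,z(:,1))/t(end);               % (1/T) integral of x over [0,T]
ybar = trapz(t,z(:,2))/t(end);
xstar = (a*f + b*c)/(e*f + b*d);
ystar = (a*d - e*c)/(e*f + b*d);
[xbar ybar; xstar ystar]                     % rows nearly agree for large T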

11.6 Quadrant 4 Intersection Trajectories

Look back at the signs we see in Fig. 11.9. It is clear that trajectories that start to the
left of c/d go up and to the left until they enter the (−, −) region. The analysis we
did for trajectories starting on the x axis or y axis in the crossing nullclines case is
still appropriate. So we know once a trajectory is in the (−, −) region, it can't hit the x
axis except at a/e. Similarly, a trajectory that starts in (+, −) moves right and down
towards the x axis, but can't hit the x axis except at a/e. We can look at the details of
the (+, −) trajectories by reusing the material we figured out in the limiting averages
discussion. Since this trajectory is bounded, as t grows arbitrarily large, the x(t) and
y(t) values must approach fixed values. We will call these asymptotic x and y values
x∞ and y∞ for convenience. The trajectory must satisfy

(1/t) ln( (x(t)/x0)^d (y(t)/y0)^e ) = (a d − c e) − (e f + b d) (1/t) Y(t).     (11.4)

with the big difference that the term a d − c e is now negative. Exponentiate to obtain

(x(t)/x0)^d (y(t)/y0)^e = e^{(a d − c e) t} e^{−(e f + b d) Y(t)}.     (11.5)

Now note
• The term e^{−(e f + b d) Y(t)} is bounded by 1.
• The term e^{(a d − c e) t} goes to zero as t gets large because a d − c e is negative.
Hence, as t increases to infinity, we find

lim_{t→∞} (x(t)/x0)^d (y(t)/y0)^e = (x∞/x0)^d (y∞/y0)^e = 0.

Since it is easy to see that x∞ is positive, we must have y∞ = 0. We now know
a trajectory starting in the (+, −) region hits the x axis as t goes to infinity. Now suppose
x∞ < a/e. Then, there is also a trajectory on the positive x axis that starts at x∞
and moves towards a/e. Looking backwards from the point (x∞, 0), we thus see two
trajectories springing out of that point: the first, our (+, −) region trajectory, and the
second, our positive x axis trajectory. This is not possible. So x∞ must be a/e.
Now is this model biologically plausible? It seems not! It doesn’t seem reason-
able for the predator population to shrink to 0 while the food population converges
to some positive number x ∞ ! So adding more biological detail actually leads to a
loss of biologically reasonable predictive power; food for thought!
The discussion for the middle case where the nullclines intersect on the x axis is
essentially the same so we won’t go over it again.

11.6.1 Homework

Draw suitable trajectories for the following Predator–Prey models with self interac-
tion in great detail.
Exercise 11.6.1

x′(t) = 15 x(t) − 2 x(t) y(t) − 2 x(t)²
y′(t) = −8 y(t) + 30 x(t) y(t) − 3 y(t)²

Exercise 11.6.2

x′(t) = 6 x(t) − 20 x(t) y(t) − 10 x(t)²
y′(t) = −28 y(t) + 30 x(t) y(t) − 3 y(t)²

Exercise 11.6.3

x′(t) = 7 x(t) − 6 x(t) y(t) − 2 x(t)²
y′(t) = −10 y(t) + 3 x(t) y(t) − 3 y(t)²

Exercise 11.6.4

x′(t) = 8 x(t) − 2 x(t) y(t) − 2 x(t)²
y′(t) = −9 y(t) + 3 x(t) y(t) − 1.5 y(t)²

Exercise 11.6.5

x′(t) = 6 x(t) − 2 x(t) y(t) − 2 x(t)²
y′(t) = −9 y(t) + 5 x(t) y(t) − 3 y(t)²

11.7 Summary: Working Out a Predator–Prey Self-Interaction Model in Detail

We can now summarize how you would completely solve a typical Predator–Prey
self-interaction model problem from first principles. These are the steps you need to
do:

1. Draw the nullclines reasonably carefully in multiple colors to make your teacher
happy.
2. Determine if the nullclines cross as this makes a big difference in the kind of
trajectories we will see. Find the place where the nullclines cross if they do.
3. Once you know what the nullclines do, you can solve these problems completely.
Draw a few trajectories in each of the regions determined by the nullclines.
4. From our work in Sect. 11.5, we know the solutions to the Predator–Prey model
with self-interaction having initial conditions in Quadrant 1 are always positive.
You can then use this fact to derive the amazingly true statement that a solution
pair, x(t) and y(t), satisfies

ln( (x(t)/x0)^d (y(t)/y0)^e ) = (a d − c e) t − (e f + b d) ∫_0^t y(s) ds.

This derivation is the same whether the nullclines cross or not!

11.8 Self Interaction Numerically

From our theoretical investigations, we know if the ratio c/d exceeds the ratio a/e,
the solutions should approach the point (a/e, 0) as time gets large. Let's see if we get that
result numerically.
Let's try this problem,

x′(t) = 2 x(t) − 3 x(t) y(t) − 3 x(t)²
y′(t) = −4 y(t) + 5 x(t) y(t) − 3 y(t)²

We can generate a full phase plane as follows

Listing 11.1: Phase Plane for x′ = 2x − 3xy − 3x², y′ = −4y + 5xy − 3y²

f = @(t,x) [2*x(1) - 3*x(1).*x(2) - 3*x(1).^2; -4*x(2) + 5*x(1).*x(2) - 3*x(2).^2];
AutoPhasePlanePlot(f,.01,0,3,4,3,3,.1,4,.1,4);

We generate the plot as shown in Fig. 11.16. Note that here c/d = 4/5 and a/e = 2/3
so c/d > a/e which tells us the nullcline intersection is in Quadrant 4. Hence, all
trajectories should go toward a/e = 2/3 on the x-axis.

Fig. 11.16 Predator–Prey system: x′ = 2x − 3xy − 3x², y′ = −4y + 5xy − 3y²

Now let's look at what happens when the nullclines cross. We now use the model

x′(t) = 2 x(t) − 3 x(t) y(t) − 1.5 x(t)²
y′(t) = −4 y(t) + 5 x(t) y(t) − 1.5 y(t)²

The Matlab session is now

Listing 11.2: Phase Plane for x′ = 2x − 3xy − 1.5x², y′ = −4y + 5xy − 1.5y²

f = @(t,x) [2*x(1) - 3*x(1).*x(2) - 1.5*x(1).^2; -4*x(2) + 5*x(1).*x(2) - 1.5*x(2).^2];
AutoPhasePlanePlot(f,.01,0,3,4,3,3,.1,4,.1,4);

Since a/e = 2/1.5 and c/d = 4/5, the nullclines cross in Quadrant 1 and the
trajectories should converge to

x∗ = (a f + b c)/(e f + b d) = (2(1.5) + 3(4))/(1.5(1.5) + 3(5)) = 15/17.25 = 0.87
y∗ = (a d − e c)/(e f + b d) = (2(5) − 1.5(4))/17.25 = 4/17.25 = 0.23

which is what we see in the plot shown in Fig. 11.17.

11.8.1 Homework

Generate phase plane plots for the following models.



Fig. 11.17 Predator–Prey system: x′ = 2x − 3xy − 1.5x², y′ = −4y + 5xy − 1.5y². In this example, the nullclines cross, so the trajectories move towards the fixed point (0.87, 0.23) as shown

Exercise 11.8.1

x′(t) = 8 x(t) − 4 x(t) y(t) − 2 x(t)²
y′(t) = −10 y(t) + 2 x(t) y(t) − 1.5 y(t)²

Exercise 11.8.2

x′(t) = 12 x(t) − 4 x(t) y(t) − 2 x(t)²
y′(t) = −10 y(t) + 2 x(t) y(t) − 1.5 y(t)²

Exercise 11.8.3

x′(t) = 6 x(t) − 4 x(t) y(t) − 2 x(t)²
y′(t) = −3 y(t) + 2 x(t) y(t) − 1.5 y(t)²

Exercise 11.8.4

x′(t) = 6 x(t) − 4 x(t) y(t) − 2 x(t)²
y′(t) = −12 y(t) + 2 x(t) y(t) − 1.5 y(t)²

Exercise 11.8.5

x′(t) = 15 x(t) − 4 x(t) y(t) − 2 x(t)²
y′(t) = −10 y(t) + 2 x(t) y(t) − 1.5 y(t)²

11.9 Adding Fishing!

Let's do this in general. First, let's look at the general model

x′ = ax − bxy − ex²
y′ = −cy + dxy − fy²

The term f y² models how much is lost to self interaction between the predators.
It seems reasonable that this loss should be less than the amount of food fish that
are being eaten by the predators. Hence, we will assume in this model that f < b
always so that we get a biologically reasonable model. Then note adding fishing can
be handled in the same way as before: a is replaced by a − r and c by c + r. The role of the average values will now be
played by the value of the intersection of the nullclines. We have
   
(x∗_no, y∗_no) = ( (a f + b c)/(e f + b d), (a d − e c)/(e f + b d) )

(x∗_r, y∗_r) = ( ((a − r) f + b (c + r))/(e f + b d), ((a − r) d − e (c + r))/(e f + b d) )
            = ( (a f + b c)/(e f + b d), (a d − e c)/(e f + b d) ) + ( (−r f + b r)/(e f + b d), (−r d − e r)/(e f + b d) )
            = (x∗_no, y∗_no) + r ( (b − f)/(e f + b d), −(d + e)/(e f + b d) )

(x∗_{r/2}, y∗_{r/2}) = (x∗_no, y∗_no) + (r/2) ( (b − f)/(e f + b d), −(d + e)/(e f + b d) )

Now compare.

x∗_{r/2} = x∗_no + (r/2) (b − f)/(e f + b d)
        = x∗_no + r (b − f)/(e f + b d) − (r/2) (b − f)/(e f + b d)
        = x∗_r − (r/2) (b − f)/(e f + b d)

which shows us x∗_{r/2} goes down with the reduction in fishing as b > f . Similarly,

y∗_{r/2} = y∗_no − (r/2) (d + e)/(e f + b d)
        = y∗_no − r (d + e)/(e f + b d) + (r/2) (d + e)/(e f + b d)
        = y∗_r + (r/2) (d + e)/(e f + b d)


which shows us y∗_{r/2} goes up with the reduction in fishing since d + e > 0. This is the same
behavior we saw in the original Predator–Prey model without self-interaction.
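A minimal numerical sketch of this comparison follows; the parameter values are illustrative (note b > f) and the fishing rate r is hypothetical.

a = 4; b = 5; c = 6; d = 2; e = 1; f = 1; r = 0.5;   % illustrative values
star = @(aa,cc) [(aa*f + b*cc)/(e*f + b*d); (aa*d - e*cc)/(e*f + b*d)];
p_no   = star(a, c);               % no fishing: (x*_no, y*_no)
p_r    = star(a - r, c + r);       % fishing at rate r
p_half = star(a - r/2, c + r/2);   % fishing reduced to r/2
[p_no p_r p_half]                  % x* drops and y* rises as fishing is reduced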

11.10 Learned Lessons

In our study of the Predator–Prey model, we have seen the model without self-
interaction was very successful at giving us insight into the fishing catch data during
World War I in the Mediterranean sea. This was despite the gross nature of the model.
No attempt was made to separate the food fish category into multiple classes of food
fish; no attempt made to break down the predatory category into various types of
predators. Yet, the modeling was ultimately successful as it provided illumination
into a biological puzzle. However, the original model lacked the capacity for self-
interaction and so it seemed plausible to add this feature. The self-interaction terms
we use in this chapter seemed quite reasonable, but our analysis has shown it leads
to completely wrong biological consequences. This tells the way we model self-
interaction is wrong. The self-interaction model, in general, would be

x  = a x − b x y − e u(x, y)
y  = −c y + d x y − f v(x, y)

where u(x, y) and v(x, y) are functions of both x and y that determine the self-
interaction. To analyze this new model, we would proceed as before. We determine
the nullclines

0 = x′ = a x − b x y − e u(x, y)
0 = y′ = −c y + d x y − f v(x, y)

and begin our investigations. We will have to decide if the model generates trajectories
that remain in Quadrant 1 if they start in Quadrant 1 so that they are biologically
reasonable. This will require a lot of hard work!
Note, we can simply compute solutions using MatLab or some other tool. We will
not know what the true solution is, nor even have much idea of what general appearance it
might have. You should be able to see that a balanced blend of mathematical analysis,
computational study using a tool and intuition from the underlying science must be
used together to solve these problems.
Chapter 12
Disease Models

We will now build a simple model of an infectious disease called the SIR model.
Assume the total population we are studying is fixed at N individuals. This population
is then divided into three separate pieces: we have individuals
• that are susceptible to becoming infected are called Susceptible and are labeled
by the variable S. Hence, S(t) is the number that are capable of becoming infected
at time t.
• that can infect others. They are called Infectious and the number that are infectious
at time t is given by I (t).
• that have been removed from the general population. These are called Removed
and their number at time t is labeled by R(t).
We make a number of key assumptions about how these population pools interact.
• Individuals stop being infectious at a positive rate γ which is proportional to the
number of individuals that are in the infectious pool. If an individual stops being
infectious, this means this individual has been removed from the population. This
could mean they have died, the infection has progressed to the point where they
can no longer pass the infection on to others or they have been put into quarantine
in a hospital so that further interactions with the general population is not possible.
In all of these cases, these individuals are not infectious or can’t cause infections
and so they have been removed from the part of the population N which can be
infected or is susceptible. Mathematically, this means we assume

I′loss = −γ I.

• Susceptible individuals are those capable of catching an infection. We model the


interaction of infectious and susceptible individuals in the same way we handled the
interaction of food fish and predator fish in the Predator–Prey model. We assume
this interaction is proportional to the product of their population sizes: i.e. S I .
We assume the rate of change of Infectious is proportional to this interaction with
positive proportionality constant r . Hence, mathematically, we assume



I′gain = r S I.

We can then figure out the net rates of change of the three populations. The infectious
population gains at the rate r S I and loses at the rate γ I . Hence, the net gain is
I′gain + I′loss or

I′ = r S I − γ I.

The net change of Susceptibles is that of simple decay. Susceptibles are lost at the
rate −r S I . Thus, we have

S′ = − r S I.

Finally, the removed population increases at the same rate the infectious population
decreases. We have

R′ = γ I.

We also know that R(t) + S(t) + I (t) = N for all time t because our population is
constant. So only two of the three variables here are independent. We will focus on
the variables I and S from now on. Our complete Infectious Disease Model is then

I′ = r S I − γ I     (12.1)
S′ = −r S I     (12.2)
I (0) = I0     (12.3)
S(0) = S0 .     (12.4)

where we can compute R(t) as N − I (t) − S(t).
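Before analyzing the model, it can help to look at a trajectory. The minimal Matlab sketch below uses the built-in ode45 with illustrative values of r, γ, N and the initial conditions.

r = 5; gamma = 25; N = 20;                          % illustrative parameters
rhs = @(t,z) [r*z(1)*z(2) - gamma*z(1); -r*z(1)*z(2)];   % z = [I; S]
[t,z] = ode45(rhs,[0 2],[5; 10]);                   % I(0) = 5, S(0) = 10
R = N - z(:,1) - z(:,2);                            % removed individuals
plot(t,z(:,1),t,z(:,2),t,R);
legend('I','S','R');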

12.1 Disease Model Nullclines

When we set I′ and S′ to zero, we obtain the usual nullcline equations.

I′ = 0 = I (r S − γ)
S′ = 0 = − r S I.

We see I′ = 0 when I = 0 or when S = γ/r . This nullcline divides the I –S plane
into the regions shown in Fig. 12.1.
The S′ nullcline is a little simpler. S′ is zero on either the S or the I axis. However, the
minus sign complicates things a bit. The algebraic signs of S′ in all relevant regions
are shown in Fig. 12.2.

Fig. 12.1 Finding where I′ < 0 and I′ > 0 for the disease model

Fig. 12.2 Finding where S′ < 0 and S′ > 0 regions for the disease model

The nullcline information for I′ = 0 and S′ = 0 can be combined into one picture
which we show in Fig. 12.3.

Fig. 12.3 Finding the (I′, S′) algebraic sign regions for the disease model

12.1.1 Homework

For the following disease models, do the I′ and S′ nullcline analysis separately and
then assemble.
Exercise 12.1.1

S′(t) = −15 S(t) I (t)
I′(t) = 15 S(t) I (t) − 50 I (t)
S(0) = S0
I (0) = I0

Exercise 12.1.2

S′(t) = −5 S(t) I (t)
I′(t) = 5 S(t) I (t) − 15 I (t)
S(0) = S0
I (0) = I0

Exercise 12.1.3

S′(t) = −2.4 S(t) I (t)
I′(t) = 2.4 S(t) I (t) − 8.5 I (t)
S(0) = S0
I (0) = I0

Exercise 12.1.4

S′(t) = −1.7 S(t) I (t)
I′(t) = 1.7 S(t) I (t) − 1.2 I (t)
S(0) = S0
I (0) = I0

12.2 Only Quadrant 1 is Relevant

Consider a trajectory that starts at a point on the positive I axis. Hence, I0 > 0 and
S0 = 0. It is easy to see that if we choose S(t) = 0 for all time t and I satisfying

I′ = −γ I
I (0) = I0

then the pair (S, I ) satisfying

S(t) = 0 and I (t) = I0 e^{−γ t}

is a trajectory. Since trajectories cannot cross, we now know that a trajectory starting
in Quadrant 1 with biologically reasonable values of I0 > 0 and S0 > 0 must remain
on the right side of the I –S plane. Next, if we look at a trajectory which starts on
the positive S axis at the point S0 > 0 and I0 = 0, we see immediately that the
pair S(t) = S0 and I (t) = 0 for all time t satisfies the disease model by direct
calculation:

(S0)′ = 0 = −r S0 · 0.

In fact, any trajectory with starting point on the positive S axis just stays there. This
makes biological sense as since I0 is 0, there is no infection and hence no disease
dynamics at all. On the other hand, for the point I0 > 0 and S0 > γ/r , the algebraic
signs we see in Fig. 12.3 tell us the trajectory goes to the left and upwards until it
hits the line S = γ/r and then it decays downward toward the S axis. The trajectory

Fig. 12.4 The disease model in Quadrant 1

can't hit the I axis as that would cross a trajectory, so it must head downward until
it hits the positive S axis. This intersection will be labeled as (S∞, 0) and it is easy
to see S∞ < γ/r . At the point (S∞, 0), the trajectory will stop as both the I′ and S′
derivatives become 0 there. Hence, we conclude we only need to look at trajectories
starting in Quadrant 1 with I0 > 0 as shown in Fig. 12.4.

12.2.1 Homework

For the following disease models, analyze the trajectories on the positive I and S
axis and show why this means disease trajectories that start in Q1+ stay there and
end on the positive S axis.
Exercise 12.2.1

S′(t) = −15 S(t) I (t)
I′(t) = 15 S(t) I (t) − 50 I (t)
S(0) = S0
I (0) = I0

Exercise 12.2.2

S′(t) = −5 S(t) I (t)
I′(t) = 5 S(t) I (t) − 15 I (t)
S(0) = S0
I (0) = I0

Exercise 12.2.3

S′(t) = −2.4 S(t) I (t)
I′(t) = 2.4 S(t) I (t) − 8.5 I (t)
S(0) = S0
I (0) = I0

Exercise 12.2.4

S′(t) = −1.7 S(t) I (t)
I′(t) = 1.7 S(t) I (t) − 1.2 I (t)
S(0) = S0
I (0) = I0

12.3 The I Versus S Curve

We know that biologically reasonable solutions occur with initial conditions starting
in Quadrant 1 and we know that our solutions satisfy S′ < 0 always with both S and
I positive until we hit the S axis. Let the time where we hit the S axis be given by t∗.
Then, we can manipulate the disease model as follows. For any t < t∗, we can divide
to obtain

I′(t)/S′(t) = ( r S(t) I (t) − γ I (t) ) / ( −r S(t) I (t) )
           = −1 + (γ/r) (1/S(t)).

Thus,

dI/dS = −1 + (γ/r) (1/S)

or integrating, we find

I (t) − I0 = −( S(t) − S0 ) + (γ/r) ln( S(t)/S0 ).

We can simplify this to find

I (t) = I0 + S0 − S(t) + (γ/r) ln( S(t)/S0 ).

Dropping the dependence on time t for convenience of notation, we see in Eq. 12.5,
the functional dependence of I on S.

I = I0 + S0 − S + (γ/r) ln( S/S0 ).     (12.5)

It is clear that this curve has a maximum at the critical value γ/r . This value is very
important in infectious disease modeling and we call it the infectious to susceptible
rate ρ. We can use ρ to introduce the idea of an epidemic.

Definition 12.3.1 (A Disease Epidemic)
For the disease model

I′ = r S I − γ I
S′ = − r S I
I (0) = I0
S(0) = S0

the dependence of I on S is given by

I = I0 + S0 − S + ρ ln( S/S0 ).

For this model, we say the infection becomes an epidemic if the initial value of
susceptibles, S0, exceeds the critical infectious to susceptible ratio ρ = γ/r because
the number of infections increases to its maximum before it begins to drop. This
behavior is easy to interpret as an infection going out of control; i.e. it has entered
an epidemic phase.
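A minimal Matlab sketch of this curve, with illustrative parameter values, makes the epidemic criterion concrete.

r = 5; gamma = 25; rho = gamma/r;        % illustrative rates
S0 = 10; I0 = 5;                         % S0 > rho here, so an epidemic
S = linspace(0.01,S0,400);
I = I0 + S0 - S + rho*log(S/S0);         % Eq. 12.5
plot(S,I); xlabel('S axis'); ylabel('I axis');
[Imax,k] = max(I);
S(k)                                     % the maximum sits near S = rho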

12.4 Homework

We are now ready to do some exercises. For the following disease models

1. Explain the meaning of the variables S and I .


2. What is the variable R and how is it related to S and I ?
3. Divide the first quadrant into regions corresponding to the algebraic signs of I′
and S′ and explain why I and S cannot become negative. Draw a nice picture of
this like Fig. 12.4.
4. What is the meaning of the constants r and γ?
5. Derive the dI/dS equation and solve it.
6. Draw the (S, I ) phase plane for this problem with a number of trajectories for
the cases S0 > ρ and S0 < ρ.
7. Explain why if S0 is larger than ρ, theorists interpret this as an epidemic.
8. Explain why if S0 is smaller than ρ, it is not interpreted as an epidemic.

Exercise 12.4.1 For the specific model

S′(t) = −150 S(t) I (t)
I′(t) = 150 S(t) I (t) − 50 I (t)
S(0) = S0
I (0) = 12.7

1. Is there an epidemic if S0 is 1.8?


2. Is there an epidemic if S0 is 0.2?

Exercise 12.4.2 For the specific model

S′(t) = −5 S(t) I (t)
I′(t) = 5 S(t) I (t) − 25 I (t)
S(0) = S0
I (0) = 120

1. Is there an epidemic if S0 is 4.9?


2. Is there an epidemic if S0 is 10.3?

Exercise 12.4.3 For the specific model

S′(t) = −15 S(t) I (t)
I′(t) = 15 S(t) I (t) − 35 I (t)
S(0) = S0
I (0) = 120

1. Is there an epidemic if S0 is 1.9?


2. Is there an epidemic if S0 is 4.3?

Exercise 12.4.4 For the specific model

S′(t) = −100 S(t) I (t)
I′(t) = 100 S(t) I (t) − 400 I (t)
S(0) = S0
I (0) = 120

1. Is there an epidemic if S0 is 2.9?


2. Is there an epidemic if S0 is 10.3?

Exercise 12.4.5 For the specific model

S′(t) = −5 S(t) I (t)
I′(t) = 5 S(t) I (t) − 250 I (t)
S(0) = S0
I (0) = 120

1. Is there an epidemic if S0 is 40?


2. Is there an epidemic if S0 is 70?

12.5 Solving the Disease Model Using Matlab

Here is a typical session to plot an SIR disease model trajectory for

S′ = −5S I, I′ = 5S I − 25I, S(0) = 10, I (0) = 5.


Listing 12.1: Solving S′ = −5S I, I′ = 5S I − 25I, S(0) = 10, I (0) = 5: Epidemic!

f = @(t,x) [-5*x(1).*x(2); 5*x(1).*x(2) - 25*x(2)];
T = 2.0;
h = .005;
N = ceil(T/h);
x0 = [10;5];
[ht,rk] = FixedRK(f,0,x0,h,4,N);
X = rk(1,:);
Y = rk(2,:);
xmin = min(X);
xmax = max(X);
xtop = max(abs(xmin),abs(xmax));
ymin = min(Y);
ymax = max(Y);
ytop = max(abs(ymin),abs(ymax));
D = max(xtop,ytop)
x = linspace(0,D,101);
GoverR = 25/5;
clf
hold on
plot([GoverR GoverR],[0 D]);
plot(X,Y,'-k');
xlabel('S axis');
ylabel('I axis');
title('Phase Plane for Disease Model S'' = -5 S I, I'' = 5 S I - 25 I, S(0) = 10, I(0) = 5: Epidemic!');
legend('S = 25/5','S vs I','Location','Best');

This generates the plot you see in Fig. 12.5.

Fig. 12.5 Solution to S′ = −5S I, I′ = 5S I − 25I, S(0) = 10, I (0) = 5

12.5.1 Homework

For the following disease models, do the single plot corresponding to an initial
condition that gives an epidemic and also draw a nice phase plane plot using
AutoPhasePlanePlot.
Exercise 12.5.1

S′(t) = −150 S(t) I (t)
I′(t) = 150 S(t) I (t) − 50 I (t)
S(0) = S0
I (0) = I0

Exercise 12.5.2

S′(t) = −5 S(t) I (t)
I′(t) = 5 S(t) I (t) − 25 I (t)
S(0) = S0
I (0) = I0

Exercise 12.5.3

S′(t) = −3 S(t) I (t)
I′(t) = 3 S(t) I (t) − 4 I (t)
S(0) = S0
I (0) = I0

12.6 Estimating Parameters

During an epidemic, it is impossible to accurately determine the number of newly
infected people each day or week. This is because infectious people are only recog-
nized and removed from circulation if they seek medical attention. Indeed, we only
see data on the number of people admitted to hospitals each day or week. That is,
we have data on the number of newly removed people which is an estimate of R′. So
to compare the results predicted by the model to data from real epidemics, we must
find R′ as a function of time t. Now we know

R′ = γ I = γ (N − R − S).

Further, we know

dS/dR = (dS/dt) / (dR/dt) = (−r S I) / (γ I) = −S/ρ.

This equation we can solve to obtain

S(R) = S0 e^{−R/ρ}.     (12.6)

Hence,

R′ = γ (N − R − S0 e^{−R/ρ}).     (12.7)

This differential equation is not solvable directly, so we will try some estimates.

12.6.1 Approximating the dR/dt Equation

To estimate the solution to Eq. 12.7, we would like to replace the term e^{−R/ρ}, which makes
our integration untenable, with a quadratic approximation like that of Eq. 3.5 from
Sect. 3.1.3. We would approximate around the point R = 0, giving

Q(R/ρ) = 1 − R/ρ + (1/2) (R/ρ)²

with attendant error

|E_Q(R, 0)| ≤ (1/3) |R/ρ|³.

We need to see if this error is not too large. Recall the I and S solution satisfies

I + S = I0 + S0 + ρ ln( S/S0 ).

An epidemic would start with the number of removed individuals R0 being 0. Hence,
we know initially N = I0 + S0 and so since R = N − I − S, we have

N − R = N + ρ ln( S/S0 )

or

R = −ρ ln( S/S0 ) = ρ ln( S0/S ).

We know S always decreases from its initial value of S0, so the fraction S0/S is larger
than one; hence, the logarithm is positive. We conclude

R/ρ = ln( S0/S ) < 1.

Thus, the error we make in replacing e^{−R/ρ} by Q(R/ρ) is reasonably small as

|E_Q(R, 0)| ≤ (1/3) ( ln( S0/S ) )³ ≪ 1.

Now, let's use this approximation in Eq. 12.7 to derive an approximation to R(t). The
approximate differential equation to solve is

R′ = γ ( N − R − S0 Q(R/ρ) )     (12.8)
   = γ ( N − R − S0 ( 1 − R/ρ + (1/2)(R/ρ)² ) ).     (12.9)

This can be rewritten as follows (we will go through all the steps because it is
intense!):

R′ = γ ( N − S0 + ( S0/ρ − 1 ) R − ( S0/(2ρ²) ) R² )

   = −γ ( S0/(2ρ²) ) ( R² − (2ρ²/S0)( S0/ρ − 1 ) R − (2ρ²/S0)( N − S0 ) )

   = −γ ( S0/(2ρ²) ) ( R² − 2ρ ( (S0 − ρ)/S0 ) R − 2ρ² ( (N − S0)/S0 ) ).

Now the next step is truly complicated. We complete the square on the quadratic.
This gives

R′ = −γ ( S0/(2ρ²) ) ( R² − 2ρ ( (S0 − ρ)/S0 ) R + ( (S0 − ρ)/S0 )² ρ² − ( (S0 − ρ)/S0 )² ρ² − 2ρ² ( (N − S0)/S0 ) )

   = −γ ( S0/(2ρ²) ) ( ( R − ( (S0 − ρ)/S0 ) ρ )² − ( (S0 − ρ)/S0 )² ρ² − 2ρ² ( (N − S0)/S0 ) ).

Egads! Now we will simplify by defining the constants α and β by

α² = ( (S0 − ρ)/S0 )² ρ² + 2ρ² ( (N − S0)/S0 )

and

β = ( (S0 − ρ)/S0 ) ρ.

Then we can rewrite our equation in a notationally simpler form:

R′ = −γ ( S0/(2ρ²) ) ( (R − β)² − α² ).

Now we can go about the business of solving the differential equation. We will use a
new approach (rather than the integration by partial fraction decomposition we have
already used). After separating variables, we have

dR / ( (R − β)² − α² ) = −γ ( S0/(2ρ²) ) dt.

Substitute u = R − β to obtain

du / ( u² − α² ) = −γ ( S0/(2ρ²) ) dt.

We will do these integrations using what are called hyperbolic trigonometric func-
tions. We make the following definitions:

Definition 12.6.1 (Hyperbolic Functions)
The main hyperbolic functions of the real number x are defined to be

sinh(x) = (e^x − e^{−x})/2, the hyperbolic sine,
cosh(x) = (e^x + e^{−x})/2, the hyperbolic cosine,
tanh(x) = sinh(x)/cosh(x) = (e^x − e^{−x})/(e^x + e^{−x}), the hyperbolic tangent,

sech(x) = 1/cosh(x) = 2/(e^x + e^{−x}), the hyperbolic secant.

It is straightforward to calculate that

cosh²(x) − sinh²(x) = 1,
1 − tanh²(x) = sech²(x).

Note that these definitions are similar, but different, from the ones you are used to
with the standard trigonometric functions sin(x), cos(x) and so forth.
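Matlab has these functions built in (sinh, cosh, tanh, sech), so the identities above are easy to verify numerically; a quick sketch:

x = linspace(-3,3,13);
max(abs(cosh(x).^2 - sinh(x).^2 - 1))     % essentially zero
max(abs(1 - tanh(x).^2 - sech(x).^2))     % essentially zero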
And then there are the derivatives:
Definition 12.6.2 (Hyperbolic Function Derivatives)
The hyperbolic functions are continuous and differentiable for all real x. We have

( sinh(x) )′ = cosh(x)
( cosh(x) )′ = sinh(x)
( tanh(x) )′ = sech²(x)

Now let's go back to the differential equation we need to solve. Make the substitution
u = α tanh(z). Then, we have du = α sech²(z) dz. Making the substitution, we
find

α sech²(z) dz / ( α² (tanh²(z) − 1) ) = −γ ( S0/(2ρ²) ) dt.

But tanh²(z) − 1 = −sech²(z). Thus, the above simplifies to

α sech²(z) dz / ( −α² sech²(z) ) = −(1/α) dz = −γ ( S0/(2ρ²) ) dt.

The integration is now simple. We have

dz = α γ ( S0/(2ρ²) ) dt.

Integrating, we obtain

z(t) − z(0) = α γ ( S0/(2ρ²) ) t.

Just like there is an inverse tangent for trigonometric functions, there is an inverse
for the hyperbolic tangent also.

Definition 12.6.3 (Inverse Hyperbolic Function)
It is straightforward to see that tanh(x) is always increasing and hence it has a nicely
defined inverse function. We call this inverse the inverse hyperbolic tangent function
and denote it by the symbol tanh⁻¹(x).
We can show using rather messy calculations the following sum and difference
formulae for tanh. We will be using these in a bit.

tanh(u + v) = ( tanh(u) + tanh(v) ) / ( 1 + tanh(u) tanh(v) )
tanh(u − v) = ( tanh(u) − tanh(v) ) / ( 1 − tanh(u) tanh(v) ).

Hence, since u = α tanh(z),

z = tanh⁻¹( u/α ) = tanh⁻¹( (R − β)/α ).

Our differential equation can thus be rewritten as

tanh⁻¹( (R(t) − β)/α ) − tanh⁻¹( (R0 − β)/α ) = α γ ( S0/(2ρ²) ) t.
α α 2 ρ2

Hence, since R0 = 0 and tanh⁻¹ is an odd function, we find

tanh( tanh⁻¹( (R(t) − β)/α ) + tanh⁻¹( β/α ) ) = tanh( α γ ( S0/(2ρ²) ) t ).

Applying Definition 12.6.3, we have

( (R(t) − β)/α + β/α ) / ( 1 + ( (R(t) − β)/α )( β/α ) ) = tanh( α γ ( S0/(2ρ²) ) t ).

Now we want R as a function of t, so we have some algebra to suffer our way through.
Grab another cup of coffee as this is going to be a rocky ride!

(R(t) − β)/α + β/α = tanh( α γ ( S0/(2ρ²) ) t ) ( 1 + ( (R(t) − β)/α )( β/α ) ).

It then follows that

( (R(t) − β)/α ) ( 1 − ( β/α ) tanh( α γ ( S0/(2ρ²) ) t ) ) = −β/α + tanh( α γ ( S0/(2ρ²) ) t ).

Finally, after a division, we have

(R(t) − β)/α = ( tanh( α γ ( S0/(2ρ²) ) t ) − β/α ) / ( 1 − ( β/α ) tanh( α γ ( S0/(2ρ²) ) t ) ).

Thus, letting the number φ be defined by tanh(φ) = β/α, we have

R(t) = β + α ( tanh( α γ S0 t/(2ρ²) ) − tanh(φ) ) / ( 1 − tanh(φ) tanh( α γ S0 t/(2ρ²) ) )
     = β + α tanh( α γ S0 t/(2ρ²) − φ )

using the addition formula for tanh from Definition 12.6.3. We can then find the long
sought formula for R′. It is

R′(t) = ( α² γ S0/(2ρ²) ) sech²( α γ S0 t/(2ρ²) − φ ).
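As a sanity check, we can compare this closed-form approximation with a direct numerical solution of Eq. 12.7. The minimal sketch below uses Matlab's ode45 and illustrative values chosen so that R/ρ stays below 1, where the quadratic approximation is reasonable.

gamma = 25; r = 5; rho = gamma/r;          % illustrative rates
N = 6.5; S0 = 6; R0 = 0;                   % S0 just above rho
rhs = @(t,R) gamma*(N - R - S0*exp(-R/rho));
[t,Rnum] = ode45(rhs,[0 1],R0);            % the exact R' equation, Eq. 12.7
alpha = sqrt(((S0-rho)/S0)^2*rho^2 + 2*rho^2*(N-S0)/S0);
beta  = rho*(S0-rho)/S0;
phi   = atanh(beta/alpha);
Rapprox = beta + alpha*tanh(alpha*gamma*S0/(2*rho^2)*t - phi);
plot(t,Rnum,t,Rapprox,'--');
legend('numerical R(t)','tanh approximation');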

12.6.2 Using dR/dt to Estimate ρ

Assume we have collected data for the rate of change of R with respect to time during
an infectious incident. The general R′ model is of the form

R′(t) = A sech²( a t − b )

for some choice of positive constants a, b and A. We fit our R′ data by choosing a,
b and A carefully using some technique (these sorts of tools would be discussed in
another class). We know

α² = β² + 2ρ² ( (N − S0)/S0 ).     (12.10)

Our model tells us that

a = α γ S0/(2ρ²)
A = α² γ S0/(2ρ²) = α a.

So we have α = A/a. Further, since b = tanh⁻¹(β/α), we have β = α tanh(b).
Thus, from Eq. 12.10, it follows that

(A/a)² = ( α tanh(b) )² + 2ρ² ( (N − S0)/S0 ).

After some manipulation, we have

2ρ² ( (N − S0)/S0 ) = ( A/a )² ( 1 − tanh²(b) ).

The right hand side is known from our data fit and we can assume we have an estimate
of the total population N also. In addition, if we can estimate the initial susceptible
value S0, we will have an estimate for the critical value ρ from our data:

ρ² = (1/2) ( S0/(N − S0) ) ( A/a )² ( 1 − tanh²(b) ).

It is a lot of work to generate the approximate value of R′ but the payoff is that
we obtain an estimate of the γ/r ratio which determines whether we have an
epidemic or not.
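In practice, once A, a and b have been fit to the removal-rate data, the estimate is a one-line computation. The fitted values and the estimates of N and S0 in this short sketch are hypothetical.

A = 14.6; a = 6.6; b = 0.40;   % hypothetical fit of R'(t) = A*sech(a*t - b)^2
N = 6.5; S0 = 6;               % assumed population size and initial susceptibles
rho2 = 0.5*(S0/(N - S0))*(A/a)^2*(1 - tanh(b)^2);
rho  = sqrt(rho2)              % the estimated gamma/r ratio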
Chapter 13
A Cancer Model

We are going to examine a relatively simple cancer model as discussed in Novak


(2006). Our model is a small fraction of what Novak shows is possible, but it should
help you to continue to grow in the art of modeling. This chapter is adapted from
Novak’s work, so you should really go and look at his full book after you work
your way through this chapter. However, we are more careful in the underlying
mathematical analysis than Novak and after this chapter is completed, you should
have a better understanding of why the mathematics and the science need to work
together. You are not ready for more sophisticated mathematical models of cancer,
but it wouldn’t hurt you to look at a few to get the lay of the land, so to speak. Check
out Ribba et al. (2006) and Tatabe et al. (2001) when you can. In Peterson (2015),
we also discussed this cancer model, but we only looked at part of it as we had not
yet learned how to handle systems of ODEs. Now we are fully prepared and we will
finish our analysis. Let’s review the basic facts again so that everything is easy to
refer to.
We are built of individual cells that have their own reproductive machinery. These
cells can sometimes revert to uncontrolled self-replication; this is, of course, a change
from their normal programming. Cancer is a disease of organisms that have more than
one cell. In order for organisms to have evolved multicellularity, individual cells had
to establish and maintain cooperation with each other. From one point of view then,
cancer is a breakdown of cellular cooperation; cells must divide when needed by the
development program, but not otherwise. Complicated genetic controls on networks
of cells had to evolve to make this happen. Indeed, many genes are involved to
maintain integrity of the genome, to make sure cell division does not make errors
and to help establish the development program that tells cells when to divide. Some
of these genes monitor the cell’s progress and, if necessary, induce cell death, a
process called apoptosis. Hence, most cells listen to many signals from other cells
telling them they are ok. If the signals saying this fail to arrive, the default program is
for the cell to commit suicide; i.e. to trigger apoptosis. hence, apoptosis is a critical
defense against cancer. Here is a rough timeline (adapted from Novak) that shows
the important events that are relevant to our simple cancer modeling exercise.


1890 David von Hansermann noted cancer cells have abnormal cell division events.
1914 Theodore Boveri sees that something is wrong in the chromosomes of can-
cer cells: they are aneuploid. That is, they do not have the normal number of
chromosomes.
1916 Ernst Tyzzer first applied the term somatic mutation to cancer.
1927 Herman Muller discovered ionizing radiation which was known to cause can-
cer (i.e. was carcinogenic) was also able to cause genetic mutations (i.e. was
mutagenic).
1951 Herman Muller proposed cancer requires a single cell to receive multiple
mutations.
1950–1959 Mathematical modeling of cancer begins. It is based on statistics.
1971 Alfred Knudson proposes the concept of a Tumor Suppressor Gene or TSG.
The idea is that it takes a two point mutation to inactivate a TSG. TSG’s play a
central role in regulatory networks that determine the rate of cell cycling. Their
inactivation modifies regulatory networks and can lead to increased cell prolifer-
ation.
1986 A Retinoblastoma TSG is identified which is a gene involved in a childhood
eye cancer.

Since 1986, about 30 more TSG’s have been found. An important TSG is p53.
This is mutated in more than 50 % of all human cancers. This gene is at the center of a
control network that monitors genetic damage such as double stranded breaks (DSB)
of DNA. In a single stranded break (SSB), at some point the double stranded helix
of DNA breaks apart on one strand only. In a DSB, the DNA actually separates into
different pieces giving a complete gap. DSB’s are often due to ionizing radiation. If
a certain amount of damage is achieved, cell division is paused and the cell is given
time for repair. If there is too much damage, the cell will undergo apoptosis. In many
cancer cells, p53 is inactivated. This allows these cells to divide in the presence of
substantial genetic damage. In 1976, Michael Bishop and Harold Varmus introduced
the idea of oncogenes. These are another class of genes involved in cancer. These
genes increase cell proliferation if they are mutated or inappropriately expressed.
Now a given gene that occupies a certain position on a chromosome (this position is
called the locus of the gene) can have a number of alternate forms. These alternate
forms are called alleles. The number of alleles a gene has for an individual is called
that individual’s genotype for that gene. Note, the number of alleles a gene has is
therefore the number of viable DNA codings for that gene. We see then that mutations
of a TSG and an oncogene increase the net reproductive rate or somatic fitness of a
cell. Further, mutations in genetic instability genes also increase the mutation rate.
For example, mutations in mismatch repair genes lead to 50–100 fold increases in
point mutation rates. These usually occur in repetitive stretches of short sequences of
DNA. Such regions are called micro satellite regions of the genome. These regions are
used as genetic markers to track inheritance in families. They are short sequences of
nucleotides (i.e. ATCG) which are repeated over and over. Changes can occur such
as increasing or decreasing the number of repeats. This type of instability is thus
called a micro satellite or MIN instability. It is known that 15 % of colon cancer cells
have MIN.

Another instability is chromosomal instability or CIN. This means an increase or


decrease in the rate of gaining or losing whole chromosomes or large fractions of
chromosomes during cell division.
Let’s look at what can happen to a typical TSG. If the first allele of a TSG is
inactivated by a point mutation while the second allele is inactivated by a loss of
one parent’s contribution to part of the cell’s genome, the second allele is inactivated
because it is lost in the copying process. This is an example of CIN and in this case
is called loss of heterozygosity or LOH. It is known that 85 % of colon cancer cells
have CIN.

13.1 Two Allele TSG Models

Let’s look now at colon cancer itself. Look at Fig. 13.1. In this figure, you see a
typical colon crypt. Stem cells at the bottom of the crypt differentiate and move up
the walls of the crypt to the colon lining where they undergo apoptosis.
At the bottom of the crypt, a small number of stem cells slowly divide to produce
differentiated cells. These differentiated cells divide a few times while migrating
to the top of the crypt where they undergo apoptosis. This architecture means only
a small subset of cells are at risk of acquiring mutations that become fixed in the
permanent cell lineage. Many mutations that arise in the differentiated cells will be
removed by apoptosis.
Colorectal cancer is thought to arise as follows. A mutation inactivates the
Adenomatous Polyposis Coli or APC TSG pathway. Ninety-five percent of colorectal
cancer cells have this mutation with other mutations accounting for the other
5 %. The crypt in which the APC mutant cell arises becomes dysplastic; i.e. has
abnormal growth and produces a polyp. Large polyps seem to require additional
oncogene activation. Then 10–20 % of these large polyps progress to cancer.

Fig. 13.1 Typical colon crypts

A general model of cancer based on TSG inactivation is as follows. The tumor


starts with the inactivation of a TSG called A, in a small compartment of cells. A
good example is the inactivation of the APC gene in a colonic crypt, but it could be
another gene. Initially, all cells have two active alleles of the TSG. We will denote
this by A+/+ where the superscript “+/+” indicates both alleles are active. One of
the alleles becomes inactivated at mutation rate u1 to generate a cell type denoted
by A+/− . The superscript +/− tells us one allele is inactivated. The second allele
becomes inactivated at rate û2 to become the cell type A−/− . In addition, A+/+ cells
can also receive mutations that trigger CIN. This happens at the rate uc resulting in
the cell type A+/+ CIN . This kind of a cell can inactivate the first allele of the TSG
with normal mutation rate u1 to produce a cell with one inactivated allele (i.e. a +/−)
which started from a CIN state. We denote these cells as A+/− CIN . We can also get
a cell of type A+/− CIN when a cell of type A+/− receives a mutation which triggers
CIN. We will assume this happens at the same rate uc as before. The A+/− CIN cell
then rapidly undergoes LOH at rate û3 to produce cells of type A−/− CIN . Finally,
A−/− cells can experience CIN at rate uc to generate A−/− CIN cells. We show this
information in Fig. 13.2.
Let N be the population size of the compartment. For colonic crypts, the typical
value of N is 1000–4000. The first allele is inactivated by a point mutation. The rate
at which this occurs is modeled by the rate u1 as shown in Fig. 13.2. We make the
following assumptions:

Assumption 13.1.1 (Mutation rates for u1 and uc are population independent)


The mutations governed by the rates u1 and uc are neutral. This means that these
rates do not depend on the size of the population N.

Assumption 13.1.2 (Mutation rates û2 and û3 give selective advantage)
The events governed by û2 and û3 give what is called selective advantage. This
means that the size of the population size does matter.

Using these assumptions, we will model û2 and û3 like this:

û2 = N u2   and   û3 = N u3 ,

where u2 and u3 are neutral rates. We can thus redraw our figure as Fig. 13.3.

Fig. 13.2 The pathways for the TSG allele losses: in the top row (without CIN), A+/+ moves to A+/− at rate u1 and A+/− moves to A−/− at rate û2 ; each of the states A+/+ , A+/− and A−/− moves to the corresponding CIN state at rate uc ; in the bottom row (with CIN), A+/+ CIN moves to A+/− CIN at rate u1 and A+/− CIN moves to A−/− CIN at rate û3 .
The mathematical model is then setup as follows. Let
X0 (t) is the probability a cell is in cell type A+/+ at time t.
X1 (t) is the probability a cell is in cell type A+/− at time t.
X2 (t) is the probability a cell is in cell type A−/− at time t.
Y0 (t) is the probability a cell is in cell type A+/+ CIN at time t.
Y1 (t) is the probability a cell is in cell type A+/− CIN at time t.
Y2 (t) is the probability a cell is in cell type A−/− CIN at time t.
Looking at Fig. 13.3, we can generate rate equations. First, let’s rewrite Fig. 13.3
using our variables as Fig. 13.4.
To generate the equations we need, note each box has arrows coming into it and
arrows coming out of it. The arrows in are growth terms for the net change of the
variable in the box and the arrows out are the decay or loss terms. We model growth
as exponential growth and loss as exponential decay. So X0 only has arrows going
out which tells us it only has loss terms. So we would say X0 loss = −u1 X0 − uc X0 ,
which implies X0' = −(u1 + uc) X0 . Further, X1 has arrows going in and out which
tells us it has growth and loss terms. So we would say X1 loss = −N u2 X1 − uc X1
and X1 growth = u1 X0 , which implies X1' = u1 X0 − (N u2 + uc) X1 . We can continue
in this way to find all the model equations.

Fig. 13.3 The pathways for the TSG allele losses rewritten using selective advantage: the rates û2 and û3 of Fig. 13.2 are replaced by N u2 and N u3 .

Fig. 13.4 The pathways for the TSG allele losses rewritten using mathematical variables: in the top row (without CIN), X0 moves to X1 at rate u1 and X1 to X2 at rate N u2 ; in the bottom row (with CIN), Y0 moves to Y1 at rate u1 and Y1 to Y2 at rate N u3 ; each Xi moves to the corresponding Yi at rate uc .

We can then see the Cancer Model rate
equations are
X0' = −(u1 + uc) X0                         (13.1)
X1' = u1 X0 − (uc + N u2) X1                (13.2)
X2' = N u2 X1 − uc X2                       (13.3)
Y0' = uc X0 − u1 Y0                         (13.4)
Y1' = uc X1 + u1 Y0 − N u3 Y1               (13.5)
Y2' = N u3 Y1 + uc X2                       (13.6)

Initially, at time 0, all the cells are in the state X0 , so we have

X0 (0) = 1, X1 (0) = 0, X2 (0) = 0 (13.7)


Y0 (0) = 0, Y1 (0) = 0, Y2 (0) = 0. (13.8)

The problem we want to address is this one:


Question Under what circumstances is the CIN pathway to cancer the dominant
one?
In order to answer this, we need to analyze the trajectories of this model. Note, if
we were interested in the asymptotic behavior of this model as t goes to infinity,
then it is clear everything ends up with value 0. However, our interest is over the
typical lifetime of a human being and thus, we never reach the asymptotic state.
Thus, our analysis in the sections that follow is always concerned with the values of
our six variables at the end of a human life span. In Sect. 13.6 we do this for the cell
populations without CIN and then we use those results to develop approximations in
Sect. 13.7 to cell populations with CIN. Along the way, we develop approximations
to the true solutions that help us answer our question. But, make no mistake, the
analysis is tricky and requires a fair amount of intellectual effort. Still, we think at
the end of it all, you will understand and appreciate how we use mathematics and
science together to try to come to grips with a difficult problem. Now, let’s look at
the results first, before we do the detailed mathematical analysis. This will let you
see the larger picture.

13.2 Model Assumptions

Since our interest in these variables is over the typical lifetime of a human being, we
need to pick a maximum typical lifetime.
Assumption 13.2.1 (Average human lifetime)
The average human life span is 100 years. We also assume that cells divide once per
day and so a good choice of time unit is days. The final time for our model will be
denoted by T and hence

T = 3.65 × 10^4 days.

Next, recall that for our colonic crypt, N is from 1000 to 4000 cells. For estimation purposes,
we often think of N as the upper value, N = 4 × 10^3 .

Assumption 13.2.2 (Loss of allele rates for neutral mutations)


We assume

u1 ≈ 10−7
u2 ≈ 10−7 .

We will assume the rate N u3 is quite rapid and so it is close to 1. We will set u3 as
follows:

Assumption 13.2.3 (Losing the second allele due to CIN is close to probability one)
We assume

N u3 ≈ 1 − r.

for small positive values of r.

Hence, once a cell reaches the Y1 state, it will rapidly transition to the end state Y2 if
r is sufficiently small.
We are not yet sure how to set the magnitude of uc , but it certainly is at least u1 .
For convenience, we will assume

Assumption 13.2.4 (uc is proportional to u1 )


We assume

uc = R u1 .

where R is a number at least 1. For example, if uc = 10−5 , this would mean R = 100.

13.3 Preliminary Analysis

The mathematical model can be written in matrix vector form as usual. This gives
X0'      −(u1+uc)       0          0     0      0      0     X0
X1'       u1       −(uc+N u2)      0     0      0      0     X1
X2'  =    0            N u2      −uc     0      0      0     X2
Y0'       uc            0          0   −u1      0      0     Y0
Y1'       0            uc          0    u1   −N u3     0     Y1
Y2'       0             0         uc     0    N u3     0     Y2

with initial condition

X0(0)      1
X1(0)      0
X2(0)  =   0
Y0(0)      0
Y1(0)      0
Y2(0)      0
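Before working through the analysis by hand, it can be reassuring to assemble this coefficient matrix numerically and let MatLab report its eigenvalues. The sketch below does this; the choice R = 10 and the value r = 1.0e-4 used to set u3 are illustrative assumptions, not values fixed by the model.

% Sketch: assemble the 6 x 6 coefficient matrix and inspect its eigenvalues.
% Parameters follow Sect. 13.2; R = 10 and r = 1.0e-4 are illustrative choices.
u1 = 1.0e-7; u2 = u1; N = 4000; R = 10; uc = R*u1; u3 = (1 - 1.0e-4)/N;
A = [ -(u1+uc)       0         0    0     0     0; ...
       u1      -(uc+N*u2)      0    0     0     0; ...
       0           N*u2      -uc    0     0     0; ...
       uc            0         0  -u1     0     0; ...
       0            uc         0   u1  -N*u3    0; ...
       0             0        uc    0   N*u3    0 ];
eig(A)   % the matrix is lower triangular, so the eigenvalues are its diagonal entries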

We can find eigenvalues and associated eigenvectors for this system as we have done
for 2 × 2 models in the past. However, it is clear, we can solve the X0 , X1 system
directly and then use that solution to find X2 . Here are the details.

13.4 Solving the Top Pathway Exactly

We will now solve the top pathway model exactly using the tools we have developed
in this course.

13.4.1 The X0 –X1 Subsystem

The X0 –X1 subsystem is a familiar 2 × 2 system.

X0'      −(u1 + uc)        0          X0
X1'  =      u1        −(uc + N u2)    X1

X0(0)      1
X1(0)  =   0

The characteristic equation is

( r + (u1 + uc) ) ( r + (uc + N u2) ) = 0.

Hence, the two eigenvalues are r1 = −(u1 +uc ) and r2 = −(uc +Nu2 ). The associated
eigenvectors are straightforward to find:

E1 = [ 1 ; u1/(N u2 − u1) ]   and   E2 = [ 0 ; 1 ]

Applying the initial conditions, we find

[ X0(t) ; X1(t) ] = [ 1 ; u1/(N u2 − u1) ] e^{−(u1+uc)t} − (u1/(N u2 − u1)) [ 0 ; 1 ] e^{−(uc+N u2)t}

From this, we see

X0(t) = e^{−(u1+uc)t}                                              (13.9)
X1(t) = (u1/(N u2 − u1)) ( e^{−(u1+uc)t} − e^{−(uc+N u2)t} )       (13.10)

Note, we don’t have to use all the machinery of eigenvalues and eigenvectors here.
It is clear that solving X0 is a simple integration.

X0' = −(u1 + uc) X0 ,   X0(0) = 1   ⇒   X0(t) = e^{−(u1+uc)t}.

The next solutions all use the integrating factor method. The next four variables all
satisfy models of the form u'(t) = −a u(t) + f(t), u(0) = 0, where f is the external
data. Using the integrating factor approach, we find

e^{at} ( u'(t) + a u(t) ) = e^{at} f(t)
( e^{at} u(t) )' = e^{at} f(t)
e^{at} u(t) = ∫_0^t e^{as} f(s) ds

Hence,

u(t) = e^{−at} ∫_0^t e^{as} f(s) ds.
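As a quick numerical sanity check of this formula, we can pick a simple a and f, evaluate the integral with a trapezoid rule and compare against the known solution. The choices a = 2 and f(t) = sin(t) below are purely illustrative.

% Sketch: check u(t) = e^{-a t} * integral_0^t e^{a s} f(s) ds against the
% exact solution of u' = -a u + sin(t), u(0) = 0, for the illustrative choice a = 2.
a = 2;
t = linspace(0, 5, 2001);
I = cumtrapz(t, exp(a*t).*sin(t));            % running approximation of the integral
u = exp(-a*t).*I;                             % integrating factor formula
uexact = (a*sin(t) - cos(t) + exp(-a*t))/(a^2 + 1);
max(abs(u - uexact))                          % small, so the formula checks out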

So, we can find X1 directly. We solve X1' = u1 X0 − (uc + N u2) X1 with X1(0) = 0.
Using the integrating factor formula with f(t) = u1 X0(t) and a = uc + N u2 , we
have X1' = −(uc + N u2) X1 + u1 X0 which implies

X1(t) = e^{−(uc+N u2)t} ∫_0^t e^{(uc+N u2)s} u1 X0(s) ds
      = e^{−(uc+N u2)t} ∫_0^t e^{(uc+N u2)s} u1 e^{−(u1+uc)s} ds
      = u1 e^{−(uc+N u2)t} ∫_0^t e^{(N u2−u1)s} ds
      = (u1/(N u2 − u1)) e^{−(uc+N u2)t} [ e^{(N u2−u1)s} ]_0^t
      = (u1/(N u2 − u1)) e^{−(uc+N u2)t} ( e^{(N u2−u1)t} − 1 )
      = (u1/(N u2 − u1)) ( e^{−(uc+u1)t} − e^{−(uc+N u2)t} )

Using the X1 solution, we have to solve X2' = N u2 X1 − uc X2 . This leads to

X2(t) = (N u1 u2/(N u2 − u1)) [ (1/u1) ( e^{−uc t} − e^{−(u1+uc)t} ) − (1/(N u2)) ( e^{−uc t} − e^{−(uc+N u2)t} ) ]

The same integrating factor technique applied to Y0' = −u1 Y0 + uc X0 with Y0(0) = 0,
using a = u1 and f(t) = uc X0(t), leads to

Y0(t) = e^{−u1 t} ∫_0^t e^{u1 s} uc X0(s) ds.

However, the solutions for the models Y1 and Y2 are very messy and so we will try
to see what is happening approximately.

13.5 Approximation Ideas

To see how to approximate these solutions, let's recall ideas from Sect. 3.1 and apply
them to the approximation of the difference of two exponentials. Let’s look at the
function f (t) = e−rt − e−(r+a)t for positive r and a. To approximate this difference,
we expand each exponential function into the second order approximation plus the
error as usual.

e^{−rt} = 1 − rt + r^2 t^2/2 − r^3 (t^3/6) e^{−r c1}
e^{−(r+a)t} = 1 − (r+a)t + (r+a)^2 t^2/2 − (r+a)^3 (t^3/6) e^{−(r+a) c2}

for some c1 and c2 between 0 and t. Subtracting, we have
for some c1 and c2 between 0 and t. Subtracting, we have

e^{−rt} − e^{−(r+a)t} = ( 1 − rt + r^2 t^2/2 − r^3 (t^3/6) e^{−r c1} )
                      − ( 1 − (r+a)t + (r+a)^2 t^2/2 − (r+a)^3 (t^3/6) e^{−(r+a) c2} )
                      = at − (a^2 + 2ar) t^2/2 + (t^3/6) ( −r^3 e^{−r c1} + (r+a)^3 e^{−(r+a) c2} )

We conclude

e^{−rt} − e^{−(r+a)t} = at − (a^2 + 2ar) t^2/2 + (t^3/6) ( −r^3 e^{−r c1} + (r+a)^3 e^{−(r+a) c2} )   (13.11)

We can also approximate the function g(t) = e−(r+a)t − e−(r+b)t for positive r, a and
b. Using a first order tangent line type approximation, we have

e^{−(r+a)t} = 1 − (r+a)t + (r+a)^2 (t^2/2) e^{−(r+a) c1}
e^{−(r+b)t} = 1 − (r+b)t + (r+b)^2 (t^2/2) e^{−(r+b) c2}
for some c1 and c2 between 0 and t. Subtracting, we find

e^{−(r+a)t} − e^{−(r+b)t} = ( 1 − (r+a)t + (r+a)^2 (t^2/2) e^{−(r+a) c1} )
                          − ( 1 − (r+b)t + (r+b)^2 (t^2/2) e^{−(r+b) c2} )
                          = (−a + b)t + (t^2/2) ( (r+a)^2 e^{−(r+a) c1} − (r+b)^2 e^{−(r+b) c2} )

We conclude

e^{−(r+a)t} − e^{−(r+b)t} = (−a + b)t + (t^2/2) ( (r+a)^2 e^{−(r+a) c1} − (r+b)^2 e^{−(r+b) c2} )   (13.12)

13.5.1 Example

Example 13.5.1 Approximate e−1.0t − e−1.1t using Eq. 13.11.

Solution We know

e^{−rt} − e^{−(r+a)t} = at − (a^2 + 2ar) t^2/2 + (t^3/6) ( −r^3 e^{−r c1} + (r+a)^3 e^{−(r+a) c2} )

and here r = 1.0 and a = 0.1. So we have

e^{−1.0t} − e^{−1.1t} ≈ 0.1t − (0.01 + 0.2) t^2/2 = 0.1t − 0.105 t^2

and the error is on the order of t 3 which we write as O(t 3 ) where O stands for order.

Example 13.5.2 Approximate e−2.0t − e−2.1t using Eq. 13.11.



Solution Again, we use our equation

e^{−rt} − e^{−(r+a)t} = at − (a^2 + 2ar) t^2/2 + (t^3/6) ( −r^3 e^{−r c1} + (r+a)^3 e^{−(r+a) c2} )

and here r = 2.0 and a = 0.1. So we have

e^{−2.0t} − e^{−2.1t} ≈ 0.1t − (0.01 + 2(0.1)(2.0)) t^2/2 = 0.1t − 0.205 t^2

and the error is O(t 3 ).

Example 13.5.3 Approximate e−2.1t − e−2.2t using Eq. 13.12.

Solution Now, we use our equation

e^{−(r+a)t} − e^{−(r+b)t} = (−a + b)t + (t^2/2) ( (r+a)^2 e^{−(r+a) c1} − (r+b)^2 e^{−(r+b) c2} )

and here r = 2.0, a = 0.1 and b = 0.2. So we have

e^{−2.1t} − e^{−2.2t} ≈ (0.2 − 0.1)t = 0.1t

and the error is O(t 2 ).

Example 13.5.4 Approximate ( e^{−2.1t} − e^{−2.2t} ) − ( e^{−1.1t} − e^{−1.2t} ) using Eq. 13.12.

Solution We have

( e^{−2.1t} − e^{−2.2t} ) − ( e^{−1.1t} − e^{−1.2t} ) ≈ 0.1t − 0.1t = 0

plus O(t^2) which is not very useful. Of course, if the numbers had been a little
different, we would not have gotten 0. If we instead approximate using Eq. 13.11 we
find

( e^{−2.1t} − e^{−2.2t} ) − ( e^{−1.1t} − e^{−1.2t} ) ≈ ( 0.1t − (0.01 + 2(0.1)(2.1)) t^2/2 )
                                                     − ( 0.1t − (0.01 + 2(0.1)(1.1)) t^2/2 )
                                                     = −0.215 t^2 + 0.115 t^2 = −0.1 t^2

plus O(t^3) which is better. Note if the numbers are just right lots of stuff cancels!
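A quick numerical check of these expansions is easy to do in MatLab; the sketch below compares the exact difference from Example 13.5.1 with its second order approximation on a small time interval.

% Sketch: compare e^{-1.0 t} - e^{-1.1 t} with the Eq. 13.11 approximation
% 0.1 t - 0.105 t^2 from Example 13.5.1.
t      = linspace(0, 0.5, 6);
diffex = exp(-1.0*t) - exp(-1.1*t);
approx = 0.1*t - 0.105*t.^2;
[t' diffex' approx' abs(diffex - approx)']   % last column behaves like O(t^3)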

13.5.2 Homework

Exercise 13.5.1 Approximate e−3.0t − e−3.2t using Eq. 13.11.

Exercise 13.5.2 Approximate e−0.8t − e−0.85t using Eq. 13.11.

Exercise 13.5.3 Approximate e−1.3t − e−1.5t using Eq. 13.11.

Exercise 13.5.4 Approximate e−1.1t − e−1.3t using Eq. 13.12.

Exercise 13.5.5 Approximate e−0.07t − e−0.09t using Eq. 13.12.

Exercise 13.5.6 Approximate ( e^{−1.1t} − e^{−1.3t} ) − ( e^{−0.7t} − e^{−0.8t} ) using Eqs. 13.11 and 13.12.

Exercise 13.5.7 Approximate ( e^{−2.2t} − e^{−2.4t} ) − ( e^{−1.8t} − e^{−1.9t} ) using Eqs. 13.11 and 13.12.

Exercise 13.5.8 Approximate ( e^{−0.03t} − e^{−0.035t} ) − ( e^{−0.02t} − e^{−0.025t} ) using Eqs. 13.11 and 13.12.

13.6 Approximation of the Top Pathway

We will want to solve the Y0 –Y2 models in addition to solving the equations for the
top pathway. To do this, we can take advantage of some key approximations.

13.6.1 Approximating X0

We have

X0(t) = 1 − (u1 + uc) t + (u1 + uc)^2 (t^2/2) e^{−(uc+u1) c1}

for some c1 between 0 and t. Hence, X0(t) ≈ 1 − (u1 + uc) t with error E0(t) given by

E0(t) = (u1 + uc)^2 (t^2/2) e^{−(uc+u1) c1} ≤ (u1 + uc)^2 T^2/2.

Thus, the maximum error made in approximating X0(t) by 1 − (u1 + uc) t is E0 given
by

E0 = (u1 + uc)^2 T^2/2 = ( 10^{−7} (1 + R) )^2 ( 6.67 × 10^8 ) = 6.67 (1 + R)^2 × 10^{−6}.

We want this estimate for X0 to be reasonable; i.e. first, it should give a positive number over the
human life time range and second, the discrepancy between the true X0(t) and this
approximation should be small. Hence, we will assume we want the maximum error in X0
to be at most 0.01, which implies

E0 ≤ (1 + R)^2 6.67 × 10^{−6} < 0.01.

This implies that

(1 + R)^2 < 15.0 × 10^2

or

R < 38.7.

Since uc = R u1 , we see

uc < 3.87 × 10^{−6}

to have a good X0 estimate.
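These bounds are easy to explore numerically; the sketch below evaluates the error bound as a function of R and reports the largest integer R that keeps it below 0.01.

% Sketch: the X0 error bound E0(R) = 6.67 (1+R)^2 x 10^-6 and the largest
% integer R for which it stays below 0.01.
E0    = @(R) 6.67e-6*(1 + R).^2;
R     = 1:100;
Rmax  = max(R(E0(R) < 0.01))    % about 37, consistent with R < 38.7
ucmax = Rmax*1.0e-7             % the corresponding bound on uc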

13.6.2 Approximating X1

Recall

X1(t) = (u1/(N u2 − u1)) ( e^{−(uc+u1)t} − e^{−(uc+N u2)t} )

Since X1 (t) is written as the difference of two exponentials, we can use a first order
approximation as discussed in Eq. 13.12, to find

X1(t) = (u1/(N u2 − u1)) [ (N u2 − u1) t + (t^2/2) ( (uc + u1)^2 e^{−(uc+u1) c1} − (uc + N u2)^2 e^{−(uc+N u2) c2} ) ]
      = u1 t + (u1/(N u2 − u1)) (t^2/2) ( (uc + u1)^2 e^{−(uc+u1) c1} − (uc + N u2)^2 e^{−(uc+N u2) c2} ).

Hence, approximating with X1(t) ≈ u1 t gives a maximum error of

E1 = max_{0≤t≤T} (u1/(N u2 − u1)) (t^2/2) | (uc + u1)^2 e^{−(uc+u1) c1} − (uc + N u2)^2 e^{−(uc+N u2) c2} |
   ≤ (u1/(N u2 − u1)) ( (uc + u1)^2 + (uc + N u2)^2 ) T^2/2.

We have already found that R < 39 and for N at most 4000 with u1 = u2 = 10^{−7},
we see N u2 − u1 ≈ N u2 . Further, uc = R u1 ≤ 10^{−5} and N u2 ≤ 4 × 10^{−4} so that in
the second term, N u2 is dominant. We therefore have

E1 ≤ (u1/(N u2)) ( (1 + R)^2 (u1)^2 + (N u2)^2 ) T^2/2.

But the term (1 + R)^2 (u1)^2 is negligible compared to (N u2)^2 , so we have

E1 ≈ (u1/(N u2)) (N u2)^2 T^2/2
   = N u1 u2 T^2/2 = 4000 × 10^{−14} × 6.67 × 10^8 = 0.027.

13.6.3 Approximating X2

Now here comes the messy part. Apply the second order difference of exponen-
tials approximation from Eq. 13.11 above to our X2 solution. To make our notation
somewhat more manageable, we will define the error term E(r, a, t) by

E(r, a, t) = (t^3/6) ( −r^3 e^{−r c1} + (r + a)^3 e^{−(r+a) c2} ) ≤ 2 (r + a)^3 t^3/6

Note the maximum error over a human life time is thus E(r, a) which is

E(r, a) = 2 (r + a)^3 T^3/6.   (13.13)
6
Now let’s try to find an approximation for X2 (t). We have

Nu1 u2 1 −uc t 1
X2 (t) = e − e−(u1 +uc )t − e−uc t − e−(uc +Nu2 )t
Nu2 − u1 u1 Nu2
Nu1 u2 1 t2
= u1 t − (u12 + 2u1 uc ) + E(uc , u1 , t)
Nu2 − u1 u1 2
416 13 A Cancer Model

Nu1 u2 1 t2
− Nu2 t − ((Nu2 )2 + 2Nu2 uc ) + E(uc , Nu2 , t)
Nu2 − u1 Nu2 2

Now do the divisions above to obtain

X2(t) = (N u1 u2/(N u2 − u1)) [ t − (u1 + 2 uc) t^2/2 + E(uc, u1, t)/u1 ]
      − (N u1 u2/(N u2 − u1)) [ t − (N u2 + 2 uc) t^2/2 + E(uc, N u2, t)/(N u2) ]

We can then simplify a bit to get

X2(t) = (N u1 u2/(N u2 − u1)) [ t − (u1 + 2 uc) t^2/2 + E(uc, u1, t)/u1 − t
      + (N u2 + 2 uc) t^2/2 − E(uc, N u2, t)/(N u2) ]
      = (N u1 u2/(N u2 − u1)) [ (N u2 − u1) t^2/2 + E(uc, u1, t)/u1 − E(uc, N u2, t)/(N u2) ]
      = N u1 u2 t^2/2 + (N u2/(N u2 − u1)) E(uc, u1, t) − (u1/(N u2 − u1)) E(uc, N u2, t).
Hence, we see X2(t) ≈ N u1 u2 t^2/2 with maximum error E2 over the human life time T
given by

E2 = max_{0≤t≤T} | (N u2/(N u2 − u1)) E(uc, u1, t) − (u1/(N u2 − u1)) E(uc, N u2, t) |
   ≤ (N u2/(N u2 − u1)) E(uc, u1) + (u1/(N u2 − u1)) E(uc, N u2)
   = ( (N u2/(N u2 − u1)) 2 (u1 + uc)^3 + (u1/(N u2 − u1)) 2 (uc + N u2)^3 ) T^3/6

From our model assumptions, an upper bound on N is 4000. Since u1 = u2 = 10^{−7},
we see the term N u2 − u1 ≈ N u2 = 4 × 10^{−4}. Also, u1 + uc is dominated by uc .
Thus,

E2 ≈ ( 2 uc^3 + (u1/(N u2 − u1)) (N u2)^3 ) T^3/6
   ≈ ( 2 uc^3 + (u1/(N u2)) (N u2)^3 ) T^3/6
   ≈ ( 2 uc^3 + u1 (N u2)^2 ) T^3/6

Table 13.1 The Non CIN Pathway Approximations with error estimates

Approximation                     Maximum error
X0(t) ≈ 1 − (u1 + uc) t           (u1 + uc)^2 T^2/2
X1(t) ≈ u1 t                      (u1/(N u2 − u1)) ( (uc + u1)^2 + (uc + N u2)^2 ) T^2/2
X2(t) ≈ N u1 u2 t^2/2             2 uc^3 T^3/6

But the term u1 (N u2)^2 is very small (≈ 1.6 × 10^{−14}) and can also be neglected. So, we
have

E2 ≈ 2 uc^3 T^3/6.

Thus, if R ≤ 39, we have R^3 = 5.9 × 10^4 and

E2 ≈ 2 × (5.9 × 10^{−17}) × (8.1 × 10^{12}) ≈ 0.00094.

We summarize our approximation results for the top pathway in Table 13.1.
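It is worth checking these approximations numerically at the final time T; the sketch below evaluates the exact top pathway solutions and the Table 13.1 approximations side by side. The choice R = 10 is an illustrative one.

% Sketch: exact top pathway values at T versus the Table 13.1 approximations;
% parameters as in Sect. 13.2 with the illustrative choice R = 10.
u1 = 1.0e-7; u2 = u1; N = 4000; R = 10; uc = R*u1; T = 3.65e4;
X0exact = exp(-(u1+uc)*T);
X1exact = (u1/(N*u2-u1))*( exp(-(u1+uc)*T) - exp(-(uc+N*u2)*T) );
X2exact = (N*u1*u2/(N*u2-u1))*( (1/u1)*(exp(-uc*T) - exp(-(u1+uc)*T)) ...
          - (1/(N*u2))*(exp(-uc*T) - exp(-(uc+N*u2)*T)) );
X0approx = 1 - (u1+uc)*T;
X1approx = u1*T;
X2approx = N*u1*u2*T^2/2;
[X0exact X0approx; X1exact X1approx; X2exact X2approx]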

13.7 Approximating the CIN Pathway Solutions

We now know that

X0(t) ≈ 1 − (u1 + uc) t          (13.14)
X1(t) ≈ u1 t                     (13.15)
X2(t) ≈ N u1 u2 t^2/2.           (13.16)
We can use these results to approximate the CIN cell populations Y0 , Y1 and Y2 . Let’s
recall the relevant equations.

Y0' = uc X0 − u1 Y0                    (13.17)
Y1' = uc X1 + u1 Y0 − N u3 Y1          (13.18)
Y2' = N u3 Y1 + uc X2 .                (13.19)

Now, replace X0 , X1 and X2 in the CIN cell population dynamics equations to give
us the approximate equations to solve.

Y0' = uc ( 1 − (u1 + uc) t ) − u1 Y0          (13.20)
Y1' = uc (u1 t) + u1 Y0 − N u3 Y1             (13.21)
Y2' = N u3 Y1 + uc ( N u1 u2 t^2/2 ).         (13.22)

13.7.1 The Y0 Estimate

First, if we replace X0 by its approximation, we have Y0' = uc ( 1 − (u1 + uc) t +
E(t, 0) ) − u1 Y0 where E(t, 0) is the error we make at time t. The approximate model
is then Y0' = uc ( 1 − (u1 + uc) t ) − u1 Y0 . The difference of these two models gives
us a way of understanding how much error we make in replacing the full solution X0
by its approximation. Let Δ be this difference. We find

Δ'(t) = uc E(t, 0),   Δ(0) = 0

is our model of the difference. We know E(t, 0) = (u1 + uc)^2 e^{−(u1+uc) β} t^2/2 here, so
the difference model is bounded by the envelope models:

Δ'(t) = ± uc (u1 + uc)^2 t^2/2,   Δ(0) = 0.
The maximum error we can have due to this difference over human life time is then
found by integrating. We have

Δ(T) = uc (u1 + uc)^2 T^3/6,
which is about 0.0005. Hence, this contribution to the error is a bit small. Next, let’s
think about the other approximation error. After rearranging, the Y0 approximate
dynamics are

Y0' + u1 Y0 = uc − uc (u1 + uc) t.

We solve this equation using the integrating factor method, with factor eu1 t . This
yields

( Y0(t) e^{u1 t} )' = uc e^{u1 t} − uc (u1 + uc) t e^{u1 t}.

Integrating and using the fact that Y0(0) = 0, we have

Y0(t) e^{u1 t} = uc ∫_0^t e^{u1 s} ds − uc (u1 + uc) ∫_0^t s e^{u1 s} ds
             = (uc/u1) ( e^{u1 t} − 1 ) − uc ((u1 + uc)/u1) s e^{u1 s} |_0^t + uc ∫_0^t ((u1 + uc)/u1) e^{u1 s} ds
             = (uc/u1) ( e^{u1 t} − 1 ) − uc ((u1 + uc)/u1) t e^{u1 t} + uc ((u1 + uc)/u1^2) ( e^{u1 t} − 1 ).

Now group the e^{u1 t} and t e^{u1 t} terms together. We have

Y0(t) e^{u1 t} = ((2 u1 uc + uc^2)/u1^2) e^{u1 t} − ((u1 uc + uc^2)/u1) t e^{u1 t} − (2 u1 uc + uc^2)/u1^2 .

Now multiply through by e^{−u1 t} to find

Y0(t) = (2 u1 uc + uc^2)/u1^2 − ((u1 uc + uc^2)/u1) t − ((2 u1 uc + uc^2)/u1^2) e^{−u1 t}.

To see what is going on here, we split this into two pieces as follows:

Y0(t) = (uc/u1) ( 1 − e^{−u1 t} ) + ((u1 uc + uc^2)/u1^2) ( 1 − u1 t − e^{−u1 t} ).   (13.23)

The critical thing to recall now is that

e−u1 t ≈ 1 − u1 t

with error E(t, 0) given by

E(t, 0) = u1^2 e^{−u1 β} t^2/2

where β is some number between 0 and t. Then, as before, the largest possible error
over a human lifetime uses maximum time T = 3.65 × 10^4 giving

|E(t, 0)| ≤ u1^2 T^2/2.
Hence, we can rewrite Eq. 13.23 as

Y0(t) = (uc/u1) ( u1 t − E(t, 0) ) + ((u1 uc + uc^2)/u1^2) ( −E(t, 0) )   (13.24)
      = uc t − ((2 u1 uc + uc^2)/u1^2) E(t, 0).                          (13.25)

We conclude that Y0(t) ≈ uc t with maximum error given by

|Y0(t) − uc t| ≤ ((2 u1 uc + uc^2)/u1^2) u1^2 T^2/2 = (2 u1 uc + uc^2) T^2/2.   (13.26)

For our chosen value of R ≤ 39, we see the magnitude of the Y0 error is given by

E = (2 + R) R (6.67 × 10−6 ).

For R < 39, we have

E < 1.067 × 10−2 = 0.01067.

If we add the error due to replacing the true dynamics by the approximation, we find
the total error is about 0.01067 + 0.0005 ≈ 0.011.
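A quick numerical comparison makes the size of this error concrete; the sketch below evaluates the closed form solution of the approximate Y0 dynamics (Eq. 13.23) against uc t at the final time T, again with the illustrative choice R = 10.

% Sketch: Eq. 13.23 versus the approximation uc t at T, with the illustrative
% choice R = 10.
u1 = 1.0e-7; R = 10; uc = R*u1; T = 3.65e4;
Y0eq23   = (uc/u1)*(1 - exp(-u1*T)) ...
         + ((u1*uc + uc^2)/u1^2)*(1 - u1*T - exp(-u1*T));
Y0approx = uc*T;
[Y0eq23 Y0approx abs(Y0eq23 - Y0approx)]   % the gap sits inside the Eq. 13.26 bound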

13.7.2 The Y1 Estimate

We can do the error due to the replacement of the true solutions by their approxima-
tions here too, but the story is similar; the error is small. We will focus on how to
approximate Y1 using the approximate dynamics. The Y1 dynamics are

Y1' = uc X1 + u1 Y0 − N u3 Y1 .

Over the effective lifetime of a human being, we can use our approximations for X1
and Y0 to obtain the dynamics that are relevant. This yields

Y1' = uc (u1 t) + u1 (uc t) − N u3 Y1 = 2 u1 uc t − N u3 Y1 .

This can be integrated using the integrating factor e^{N u3 t}. Hence,

Y1(t) e^{N u3 t} = ∫_0^t 2 u1 uc s e^{N u3 s} ds
               = 2 u1 uc ( (t/(N u3)) e^{N u3 t} − (1/(N u3)^2) ( e^{N u3 t} − 1 ) )
               = (2 u1 uc/(N u3)^2) ( (N u3 t − 1) e^{N u3 t} + 1 ).

Multiplying through by the integrating factor, we have

Y1(t) = (2 u1 uc/(N u3)^2) ( (N u3 t − 1) + e^{−N u3 t} ).

Now group the terms as follows:

Y1(t) = (2 u1 uc/(N u3)^2) ( N u3 t + ( e^{−N u3 t} − 1 ) )
      = (2 u1 uc/(N u3)) t + (2 u1 uc/(N u3)^2) ( e^{−N u3 t} − 1 ).

We know that we can approximate e−N u3 t as usual as

e−N u3 t = 1 + E(t, 0).

The error term E is then given by

E(t, 0) = −N u3 e−N u3 β t

for some β in [0, t]. As usual, this error is largest when β = 0 and t is the lifetime
of our model. Thus

|E(t, 0)| ≤ N u3 T .

Thus,

| Y1(t) − (2 u1 uc/(N u3)) t | ≤ (2 u1 uc/(N u3)^2) (N u3) T
                              = (2 u1 uc/(N u3)) T.

However, we know N u3 ≈ 1 and so

| Y1(t) − (2 u1 uc/(N u3)) t | ≤ 2 u1 uc T ≤ 2 (10^{−7}) R (10^{−7}) T,

using our value for u1 and our model for uc . We already know that to have reasonable
error in X0 we must have R < 39. Since our average human lifetime is T = 36,500
days, we see

| Y1(t) − (2 u1 uc/(N u3)) t | ≤ 2 (10^{−7}) 39 (10^{−7}) 3.65 × 10^4 ≤ 2.85 × 10^{−8}.

Thus, the error in the Y1 estimate is quite small.



13.7.3 The Y2 Estimate

All that remains is to estimate Y2 . The dynamics are

Y2' = N u3 Y1 + uc N u1 u2 t^2/2.

Using our approximations, we obtain

Y2' = N u3 (2 u1 uc/(N u3)) t + uc N u1 u2 t^2/2
    = 2 u1 uc t + (N u1 u2 uc/2) t^2 .
The integration (for a change!) is easy. We find

Y2(t) = u1 uc t^2 + (N u1 u2 uc/6) t^3 .
Note that

| Y2(t) − u1 uc t^2 | = (N u1 u2 uc/6) t^3 .

The error term is largest at the effective lifetime of a human being. Thus, we can say
(using our assumptions on the sizes of u1 , u2 and uc )

| Y2(t) − u1 uc t^2 | ≤ (N u1 u2 uc) T^3/6.

The magnitude estimate for Y2 can now be calculated. For R < 39, we have

N R 8.11 × 10−9 < 316.3 N × 10−9 .

For N ≤ 4000, this becomes

N R 8.11 × 10−9 < 1.265 × 10−3 .

We summarize our approximation results for the CIN pathway in Table 13.2.

Table 13.2 The CIN Model Approximations with error estimates

Approximation                     Maximum error
Y0(t) ≈ uc t                      (2 u1 uc + uc^2) T^2/2
Y1(t) ≈ (2 u1 uc/(N u3)) t        (2 u1 uc/(N u3)) T
Y2(t) ≈ u1 uc t^2                 N u1 u2 uc T^3/6
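The same kind of numerical check can be run for the CIN approximations; the sketch below evaluates the Table 13.2 approximations and their error bounds at T for an illustrative choice of R.

% Sketch: the Table 13.2 approximations and error bounds evaluated at T;
% parameters as in Sect. 13.2 with the illustrative choices R = 10 and
% r = 1.0e-4 so that N u3 = 1 - r.
u1 = 1.0e-7; u2 = u1; N = 4000; R = 10; uc = R*u1; T = 3.65e4;
Nu3 = 1 - 1.0e-4;
Y0approx = uc*T;              Y0err = (2*u1*uc + uc^2)*T^2/2;
Y1approx = 2*u1*uc*T/Nu3;     Y1err = 2*u1*uc*T/Nu3;
Y2approx = u1*uc*T^2;         Y2err = N*u1*u2*uc*T^3/6;
[Y0approx Y0err; Y1approx Y1err; Y2approx Y2err]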

13.8 Error Estimates

In Sect. 13.4, we solved for X0 , X1 and X2 exactly and then we developed their
approximations in Sect. 13.6. The approximations to X0 , X1 and X2 then let us develop
approximations to the CIN solutions. We have found that for reasonable errors in
X0 (t), we need R < 38.7. We were then able to calculate the error bounds for all the
variables. For convenience, since u1 and u2 are equal, let’s set them to be the common
value u. Then, by assumption uc = R u for some R ≥ 1. Our error estimates can then
be rewritten as seen in Table 13.3. This uses our assumptions that u1 = u2 = 10−7
and uc = Ru1 ≈ 4 × 10−6 .
The error magnitude estimates are summarized in Table 13.4.

Table 13.3 The Non CIN and CIN Model Approximations with error estimates using u1 = u2 = u and uc = R u

Approximation                     Maximum error
X0(t) ≈ 1 − (u1 + uc) t           0.01
X1(t) ≈ u1 t                      0.027
X2(t) ≈ N u1 u2 t^2/2             0.0009
Y0(t) ≈ uc t                      0.0016
Y1(t) ≈ (2 u1 uc/(N u3)) t        2.9 × 10^{−8}
Y2(t) ≈ u1 uc t^2                 0.0013

Table 13.4 The Non CIN and CIN Model Approximations Dependence on population size N and the CIN rate for R ≈ 39 with u1 = 10^{−7} and uc = R u1

Approximation                     Maximum error
X0(t) ≈ 1 − (u1 + uc) t           (1 + R)^2 6.67 × 10^{−6} < 0.01
X1(t) ≈ u1 t                      (u1^2/N) ( (1 + R)^2 + (R + N)^2 ) T^2/2 < 0.027
X2(t) ≈ N u1 u2 t^2/2             ( 3 N R + ((N − 1)/N) (N^2 + 1) e^{−R u T} ) 8.11 × 10^{−9} < 0.0009
Y0(t) ≈ uc t                      (2 + R) R 6.67 × 10^{−6} < 0.0016
Y1(t) ≈ (2 u1 uc/(N u3)) t        2 R u1^2 T < 2.9 × 10^{−8}
Y2(t) ≈ u1 uc t^2                 R N u1^3 T^3/6 = R N 8.11 × 10^{−9} < 0.0013

13.9 When Is the CIN Pathway Dominant?

We think of N as about 4000 for a colon cancer model, but the loss of two allele
model is equally valid for different population sizes. Hence, we can think of N as
a variable also. In the estimates we generated in Sects. 13.6 and 13.7, we used the
value 4000 which generated even larger error magnitudes than what we would have
gotten for smaller N. To see if the CIN pathway dominates, we can look at the ratio
of the Y2 output to the X2 output. The ratio of Y2 to X2 tells us how likely the loss of
both alleles is due to CIN or without CIN. We have, for R < 39, that

Y2(T)/X2(T) = ( u1 uc T^2 + E(T) ) / ( (1/2) N u1 u2 T^2 + F(T) )

where E(T) and F(T) are the errors associated with our approximations for X2 and
Y2 . We assume u1 = u2 and so we can rewrite this as

Y2(T)/X2(T) = ( (2R/N) + (2/(N u1 u2 T^2)) E(T) ) / ( 1 + (2/(N u1 u2 T^2)) F(T) )

For N = 4000 and u1 = u2 = 10^{−7} and T = 36,500 days, we find 2/(N u1 u2 T^2) =
37.5 and hence we have

Y2(T)/X2(T) = ( (2R/N) + 37.5 E(T) ) / ( 1 + 37.5 F(T) ).

Now E(T) ≈ 0.0009 and F(T) ≈ 0.0013 and so

Y2(T)/X2(T) ≈ ( (2R/N) + 0.0388 ) / 1.0488.

We can estimate how close this is to the ratio 2R/N. We find

| ( (2R/N) + 0.04 ) / 1.05 − 2R/N | ≈ | ( 0.04 − 0.05 (2R/N) ) / 1.05 |.

Here R < 39 and N is at least 1000, so (2R/N) ≈ 0.08. Hence, the numerator is
about |0.04 − 0.0004| ≈ 0.04. Thus, we see the error we make in using (2R/N) as an
estimate for Y2 (T )/X2 (T ) is about 0.04 which is fairly small. Hence, we can be rea-
sonably confident that the critical ratio (2R)/N is the same as the ratio Y2 (T )/X2 (T )
as the error over human life time is small. Our analysis only works for R < 39 though
so we should be careful in applying it. Hence, we can say

Y2(T)/X2(T) ≈ 2R/N.

Table 13.5 The CIN rates uc required for CIN dominance, with u1 = u2 = 10^{−7} and uc = R u1 for R ≥ 1

N        Permissible uc for CIN dominance     R value
100      > 5.0 × 10^{−6}                      50
170      > 8.5 × 10^{−6}                      85
200      > 1.0 × 10^{−5}                      100
500      > 2.5 × 10^{−5}                      250
800      > 4.0 × 10^{−5}                      400
1000     > 5.0 × 10^{−5}                      500
2000     > 10.0 × 10^{−5}                     1000
4000     > 20.0 × 10^{−5}                     2000

The third column shows the R value needed for CIN dominance

Hence, the pathway to Y2 is the most important if 2R > N. This implies the CIN
pathway is dominant if

R > N/2.   (13.27)

For the fixed value of u1 = 10−7 , we calculate in Table 13.5 possible uc values for
various choices of N. We have CIN dominance if R > N2 and the approximations are
valid if R < 39.
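The entries of Table 13.5 are easy to regenerate; the sketch below computes the uc threshold (N/2) u1 for each compartment size.

% Sketch: regenerate Table 13.5. CIN dominance requires R > N/2, that is
% uc > (N/2) u1 with u1 = 10^-7.
u1 = 1.0e-7;
N  = [100 170 200 500 800 1000 2000 4000];
R  = N/2;          % R value needed for CIN dominance
uc = R*u1;         % corresponding uc threshold
[N' uc' R']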
Our estimates have been based on a cell population size N of 1000–4000. We
also know for a good X0 (t) estimate we want R < 39. Hence, as long as uc <
3.9 × 10−6 , we have a good X0 (t) approximation and our other estimates are valid.
From Table 13.5, it is therefore clear we will not have the CIN pathway dominant.
We would have to drop the population size to 70 or so to find CIN dominance. Now
our model development was based on the loss of alleles in a TSG with two possible
alleles, but the mathematical model is equally valid in another setting other than the
colon cancer one. If the population size is smaller than what we see in our colon
cancer model, it is therefore possible for the CIN pathway to dominate! However,
in our cancer model, it is clear that the non CIN pathway dominates.
Note we can’t even do this ratio analysis unless we are confident that our approx-
imations to Y2 and X2 are reasonable. The only way we know if they are is to do a
careful analysis using the approximation tools developed in Sect. 3.1. The moral is
that we need to be very careful about how we use estimated solutions when we
try to do science! However, with caveats, it is clear that our simple model gives an
interesting inequality, 2uc > N u2 , which helps us understand when the CIN pathway
dominates the formation of cancer.

13.9.1 Homework

We are assuming u1 = u2 = 10−7 and uc = Ru1 where R is a parameter which needs


to be less than 39 or so in order for our approximations to be ok. We have the human

life time is T = 36,500 days and, finally, we have N, the total number of cells in the
population which is 1000–4000. Now let’s do some calculations using different T ,
N, u1 and R values. For the choices below,
• Find the approximate values of all variables at the given T .
• Interpret the number at T as the number of cells in that state after T days. This
means take your values for say X2 (T ) and multiply by N to get the number of cells.
• Determine if the top or bottom pathway to cancer is dominant. Recall, if our approx-
imations are good, the inequality 2R/N > 1 implies the bottom (CIN) pathway is dominant.
As you answer these questions, note that we can easily use the equation above for
R and N values for which our approximations are probably not good. Hence, it is
always important to know the guideline we use to answer our question can be used!
So for example, most of the work in the book suggests that R ≤ 39 or so, so when
we use these equations for R = 110 etc. we can’t be sure our approximations are
accurate to let us do that!

Exercise 13.9.1 R = 32, N = 1500, u1 = u2 = 6.3 × 10−8 and T = 100 years but
express in days.

Exercise 13.9.2 R = 70, N = 4000, u1 = u2 = 8.3 × 10−8 and T = 100 years but
express in days.

Exercise 13.9.3 R = 110, N = 500, u1 = u2 = 1 × 10−7 and T = 105 years but


express in days.

Exercise 13.9.4 R = 20, N = 180, u1 = u2 = 1 × 10−7 and T = 95 years but


express in days.

13.10 A Little Matlab

Let’s look at how we might solve a cancer model using Matlab. Our first attempt
might be something like this. First, we set up the dynamics as

Listing 13.1: The cancer model dynamics


f = @(t,x) [ -(u1+uc)*x(1); ...
             u1*x(1) - (uc+N*u2)*x(2); ...
             N*u2*x(2) - uc*x(3); ...
             uc*x(1) - u1*x(4); ...
             uc*x(2) + u1*x(4) - N*u3*x(5); ...
             N*u3*x(5) + uc*x(3) ];

But to make this work, we also must initialize all of our parameters before we try to
use the function. We have annotated the code lightly as most of it is pretty common

place to us now. We set the final time, as usual, to be human lifetime of 100 years
or 36,500 days. Our mutation rate u1 = 1.0 × 10−7 is in units of base pairs/day and
this requires many steps. If we set the step size to 0.5 as we do here, 73,000 steps
are required and even on a good laptop it will take a long time. Let’s do the model
for a value of R = 80 which is much higher than we can handle with our estimates!

Listing 13.2: A First Cancer Model Attempt


% Set the parameters
u1 = 1.0e-7;
u2 = u1;
N = 1000;
R = 80;
uc = R*u1;
r = 1.0e-4;
u3 = (1-r)/N;
% set the dynamics
f = @(t,x) [ -(u1+uc)*x(1); ...
             u1*x(1) - (uc+N*u2)*x(2); ...
             N*u2*x(2) - uc*x(3); ...
             uc*x(1) - u1*x(4); ...
             uc*x(2) + u1*x(4) - N*u3*x(5); ...
             N*u3*x(5) + uc*x(3) ];
% set the final time and step size
T = 36500;
h = .5;
% find the RK4 approximations
M = ceil(T/h);
xinit = [1;0;0;0;0;0];
[htime, rk, frk] = FixedRK(f, 0, xinit, h, 4, M);
% save the approximations in variables named like the text
X0 = rk(1,:);
X1 = rk(2,:);
X2 = rk(3,:);
Y0 = rk(4,:);
Y1 = rk(5,:);
Y2 = rk(6,:);
% plot the number of A-- and A--CIN cells
plot(htime, N*X2, htime, N*Y2);
legend('A--', 'A--CIN', 'location', 'west');
xlabel('Time days');
ylabel('A--, A--CIN cells');
title('N = 1000, R = 80; Top always dominant');
% find the last entry in each variable
[rows, cols] = size(rk);
% find Y2 at final time and X2 at final time
Y2(cols)
X2(cols)
% Find Y2(T)/X2(T)
Y2(cols)/X2(cols)

The plot of the number of cells in the A−− and A−−CIN state is seen in Fig. 13.5.

Fig. 13.5 The number of cells in A−− and A−−CIN state versus time

If we could use our theory, it would tell us that since 2R/N = 160/1000 = 0.16, the
top pathway to cancer is dominant. Our numerical results give us

Listing 13.3: Our First Results


cols =
   73000
Y2(cols) =
   9.2256e-04
X2(cols) =
   0.0020
Y2(cols)/X2(cols) =
   0.4624

so the numerical results, while giving a Y2(T)/X2(T) different from 2R/N, still
predict the top pathway to cancer is dominant. So it seems our general rule, that the
top pathway dominates when 2R/N < 1, which we derived using complicated
approximation machinery, is working well. We can change our time units to years
to cut down on the number of steps we need to take. To do this, we convert base
pairs/day to base pairs/year by multiplying u1 by 365. The new code is then

Listing 13.4: Switching to time units to years: step size is one half year
u1 = 1.0e-7*365;
u2 = u1;
N = 1000;
R = 80;
uc = R*u1;
r = 1.0e-4;
u3 = (1-r)/N;
f = @(t,x) [ -(u1+uc)*x(1); ...
             u1*x(1) - (uc+N*u2)*x(2); ...
             N*u2*x(2) - uc*x(3); ...
             uc*x(1) - u1*x(4); ...
             uc*x(2) + u1*x(4) - N*u3*x(5); ...
             N*u3*x(5) + uc*x(3) ];
T = 100;
h = .5;
M = ceil(T/h);
[htime, rk, frk] = FixedRK(f, 0, [1;0;0;0;0;0], h, 4, M);
X0 = rk(1,:);
X1 = rk(2,:);
X2 = rk(3,:);
Y0 = rk(4,:);
Y1 = rk(5,:);
Y2 = rk(6,:);
[rows, cols] = size(rk)
N*Y2(cols)
N*X2(cols)
% Find Y2(T)/X2(T)
Y2(cols)/X2(cols)
2*R/N
Y2(cols)
   9.0283e-04
X2(cols)
   0.0020
Y2(cols)/X2(cols)
   0.4548

We get essentially the same results using a time unit of years rather than days! The
example above uses a time step of h = 0.5 which is a half year step. We can do
equally well using a step size of h = 1; that is, each step is now one full year.
Listing 13.5: Switching the step size to one year
h = 1; M = ceil(T/h);   % M = 100
[htime, rk, frk] = FixedRK(f, 0, [1;0;0;0;0;0], h, 4, M);
X0 = rk(1,:); X1 = rk(2,:); X2 = rk(3,:);
Y0 = rk(4,:); Y1 = rk(5,:); Y2 = rk(6,:);
[rows, cols] = size(rk);
Y2(cols)/X2(cols)
ans = 0.45291
2*R/N
ans = 0.16000
Y2(cols)
ans = 8.9435e-04
N*Y2(cols)
ans = 0.89435
X2(cols)
ans = 0.0019747
N*X2(cols)
ans = 1.9747

We can do similar things for different choices of N, R and h.
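For example, a small loop lets us sweep over several R values in one session; the sketch below reuses the year-scale setup of Listing 13.4 (including the FixedRK call) and prints 2R/N next to the computed ratio Y2(T)/X2(T). The particular R values chosen are just for illustration.

% Sketch: sweep over several R values using the year-scale setup of
% Listing 13.4 and compare 2R/N with the computed Y2(T)/X2(T).
u1 = 1.0e-7*365; u2 = u1; N = 1000; r = 1.0e-4; u3 = (1-r)/N;
T = 100; h = 1; M = ceil(T/h);
for R = [20 40 80 160]
  uc = R*u1;
  f = @(t,x) [ -(u1+uc)*x(1); ...
               u1*x(1) - (uc+N*u2)*x(2); ...
               N*u2*x(2) - uc*x(3); ...
               uc*x(1) - u1*x(4); ...
               uc*x(2) + u1*x(4) - N*u3*x(5); ...
               N*u3*x(5) + uc*x(3) ];
  [htime, rk, frk] = FixedRK(f, 0, [1;0;0;0;0;0], h, 4, M);
  [rows, cols] = size(rk);
  disp([R, 2*R/N, rk(6,cols)/rk(3,cols)]);
end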



13.10.1 Homework

We again assume u1 = u2 = 10−7 and uc = Ru1 where R is a parameter. We have


the human life time is T = 36,500 days and, finally, we have N, the total number of
cells in the population which is 1000–4000. Now let’s do some calculations using
different N and R values. For the choices below,
• Solve the problem in MatLab using time measured in days. This is the hardest
case as it is slow. Calculate 2R/N and Y2(T)/X2(T) in your MatLab session and
determine which pathway to cancer is dominant.
• Do the same but now measure time in years and use h = 1. The calculations are
much faster now and the results are similar.
As you answer these questions, note the rule 2R/N seems valid for a much larger
value of R than our approximations suggest. But remember, our estimates give worst
case errors, so it is not surprising if the rule performs well even if R exceeds 39.

Exercise 13.10.1 R = 100, N = 1500.

Exercise 13.10.2 R = 200, N = 4000.

Exercise 13.10.3 R = 250, N = 400.

Exercise 13.10.4 R = 40, N = 60.

13.11 Insight Is Difficult to Achieve

We have had to work hard to develop some insight into the relative dominance of
the CIN pathway in this two allele cancer model. It has been important to solve our
model for arbitrary parameters u1 , u2 , u3 , uc and N. This need to do everything in
terms of parameters treated as variables complicated much of our analysis. Could
we do it another way? Well, we could perform a parametric study. We could solve
the model for say 10 choices of each of the 5 parameters using MatLab for the
values X2 (T ) and Y2 (T ) which, of course, depend on the values of u1 , u2 , u3 , uc
and N used in the computation. Hence, we should label these final time values as
X2(T, u1, u2, u3, uc, N) and Y2(T, u1, u2, u3, uc, N) to denote this dependence. After
10^5 separate MatLab computations, we would then have a data set consisting of 10^5
values of the variables X2(T, u1, u2, u3, uc, N) and Y2(T, u1, u2, u3, uc, N). We could
then try to do statistical modeling to see if we could tease out the CIN dominance
relation 2 uc > N u2 . But, choosing only 10 values for each parameter might be too
coarse for our purposes. If we choose 100 values for each parameter, we would have
to do 10^10 computations to develop an immense table of 10^10 entries of X2(T) and
Y2(T) values! You should be able to see that the theoretical approach we have taken
here, while hard to work through, has some benefits!

With all the we have said so far, look at the following exposition of our cancer
model; you might read the following in a research report. Over the typical lifetime
of a human being, the variables in our model have functional dependencies on time
(denoted by the symbol ∼) given as follows:

X0(t) ∼ 1 − (u1 + uc) t
X1(t) ∼ u1 t
X2(t) ∼ N u1 u2 t^2/2
Y0(t) ∼ uc t
Y1(t) ∼ (2 u1 uc/(N u3)) t
Y2(t) ∼ u1 uc t^2 .

It is then noted that the ratio Y2 to X2 gives interesting information about the domi-
nance of the CIN pathway. CIN dominance requires the ratio exceeds one giving us
the fundamental inequality

2 uc / (N u2) > 1.

Note there is no mention in this derivation about the approximation error magnitudes
that must be maintained for the ratio tool to be valid! So because no details are
presented, perhaps we should be wary of accepting this model! If we used it without
being sure it was valid to make decisions, we could be quite wrong.

Part V
Nonlinear Systems Again
Chapter 14
Nonlinear Differential Equations

We are now ready to solve systems of nonlinear differential equations using
our new tools. Our new tools will include
1. The use of linearization of nonlinear ordinary differential equation systems to gain
insight into their long term behavior. This requires the use of partial derivatives.
2. More extended qualitative graphical methods.
Recall, this means we are using the ideas of approximating functions of two variables
by tangent planes which results in tangent plane error. This error is written in terms
of Hessian like terms. The difference now is that x' has its dynamics f and y' has
its dynamics g and so we have to combine tangent plane approximations to both f
and g into one Hessian like term to estimate the error. So the discussions below are
similar to the ones in the past, but a bit different.

14.1 Linear Approximations to Nonlinear Models

Consider the nonlinear system

x' = f(x, y)
y' = g(x, y)
x(0) = x0
y(0) = y0

where both f and g possess continuous partial derivatives up to the second order.
For any fixed Δx and Δy, define the function h by

h(t) = [ f(x0 + tΔx, y0 + tΔy) ; g(x0 + tΔx, y0 + tΔy) ]


Then we can expand h in a second order approximation to find

h(t) = h(0) + h'(0) t + h''(c) t^2/2.

Using the chain rule, we find

h'(t) = [ fx(x0 + tΔx, y0 + tΔy) Δx + fy(x0 + tΔx, y0 + tΔy) Δy ;
          gx(x0 + tΔx, y0 + tΔy) Δx + gy(x0 + tΔx, y0 + tΔy) Δy ]

and in terms of the Hessians of f and g

h''(t) = [ [Δx Δy] [ fxx(x0+tΔx, y0+tΔy)  fyx(x0+tΔx, y0+tΔy) ; fxy(x0+tΔx, y0+tΔy)  fyy(x0+tΔx, y0+tΔy) ] [Δx ; Δy] ;
           [Δx Δy] [ gxx(x0+tΔx, y0+tΔy)  gyx(x0+tΔx, y0+tΔy) ; gxy(x0+tΔx, y0+tΔy)  gyy(x0+tΔx, y0+tΔy) ] [Δx ; Δy] ]

or

h''(t) = [ [Δx Δy] H_f(x0 + tΔx, y0 + tΔy) [Δx ; Δy] ;
           [Δx Δy] H_g(x0 + tΔx, y0 + tΔy) [Δx ; Δy] ]

Thus, since the approximation is known to be

h(1) = h(0) + h'(0)(1 − 0) + h''(c)/2

for some c between 0 and 1, we have

[ f(x0 + Δx, y0 + Δy) ; g(x0 + Δx, y0 + Δy) ]
  = [ f(x0, y0) + fx(x0, y0) Δx + fy(x0, y0) Δy + (1/2) [Δx Δy] H_f(x*, y*) [Δx ; Δy] ;
      g(x0, y0) + gx(x0, y0) Δx + gy(x0, y0) Δy + (1/2) [Δx Δy] H_g(x*, y*) [Δx ; Δy] ]

where x* = x0 + cΔx and y* = y0 + cΔy. Hence, we can rewrite our nonlinear
model as

x' = f(x0, y0) + fx(x0, y0) Δx + fy(x0, y0) Δy + E_f(x*, y*)
y' = g(x0, y0) + gx(x0, y0) Δx + gy(x0, y0) Δy + E_g(x*, y*)

where the error terms E_f(x*, y*) and E_g(x*, y*) are the usual Hessian based terms

E_f(x*, y*) = (1/2) [Δx Δy] H_f(x*, y*) [Δx ; Δy]
E_g(x*, y*) = (1/2) [Δx Δy] H_g(x*, y*) [Δx ; Δy]

With sufficient knowledge of the Hessian terms, we can have some understanding
of how much error we make but, of course, it is difficult in interesting problems to
make this very exact. Roughly speaking though, for our functions with continuous
second order partial derivatives, in any closed circle of radius R around the base
point (x0, y0), there is a constant B_R so that

E_f(x*, y*) ≤ (1/2) B_R (|Δx| + |Δy|)^2   and   E_g(x*, y*) ≤ (1/2) B_R (|Δx| + |Δy|)^2.

It is clear that the simplest approximation arises at those special points (x0, y0) where
both f(x0, y0) = 0 and g(x0, y0) = 0. These points are called equilibrium points of
the model. At such points, we can use a linear approximation to the true dynamics of
this form

x' = fx(x0, y0) Δx + fy(x0, y0) Δy
y' = gx(x0, y0) Δx + gy(x0, y0) Δy

and this linearization of the model is going to give us trajectories that are close to the
true trajectories when our deviations Δx and Δy are small enough! We can rewrite
our linearizations again using x − x0 = Δx and y − y0 = Δy as follows

[ x' ; y' ] = [ fx(x0, y0)  fy(x0, y0) ; gx(x0, y0)  gy(x0, y0) ] [ x − x0 ; y − y0 ].

The special matrix of first order partials of f and g is called the Jacobian of our
model and is denoted by J(x, y). Hence, in general

J(x, y) = [ fx(x, y)  fy(x, y) ; gx(x, y)  gy(x, y) ]

so that the linearization of the model at the equilibrium point (x0, y0) has the form

[ x' ; y' ] = J(x0, y0) [ x − x0 ; y − y0 ]

Next, if we make the change of variables u = x − x0 and v = y − y0 , we find at each
equilibrium point, there is a standard linear model (which we know how to study) of
the form

[ u' ; v' ] = J(x0, y0) [ u ; v ]

as J(x0, y0) is simply a 2 × 2 real matrix which will have real distinct, repeated
or complex conjugate pair eigenvalues which we know how to deal with after our
discussions in Chap. 8. We are now ready to study interesting nonlinear models using
the tools of linearization.

14.1.1 A First Nonlinear Model

Let’s consider this model


x' = (1 − x) x − 2xy/(1 + x)
y' = ( 1 − y/(1 + x) ) y

which has multiple equilibrium points. Clearly, we need to stay away from the line
x = −1 for initial conditions as there the dynamics themselves are not defined!
However, we can analyze at other places. We define the nonlinear functions f and g
then by

f(x, y) = (1 − x) x − 2xy/(1 + x)
g(x, y) = ( 1 − y/(1 + x) ) y

We find that f(x, y) = 0 when either x = 0 or 2y = 1 − x^2. When x = 0, we find
g(0, y) = (1 − y) y which is 0 when y = 0 or y = 1. So we have equilibrium points
at (0, 0) and (0, 1). Next, when 2y = 1 − x^2, we find g becomes

g(x, (1 − x^2)/2) = (1/4) (1 + x)^2 (1 − x).

We need to discard x = −1 as the dynamics are not defined there. So the last equi-
librium point is at (1, 0). We can then use MatLab to find the Jacobians and the
associated eigenvalues and eigenvectors at each equilibrium point. We encode the
Jacobian

J(x, y) = [ 1 − 2x − 2y/(1 + x)^2    −2x/(1 + x) ;
            y^2/(1 + x)^2            1 − 2y/(1 + x) ]

in the file Jacobian.m as

Listing 14.1: Jacobian


function A = Jacobian(x,y)
% the partial of 2xy/(1+x) with respect to x is 2y/(1+x)^2
A = [1-2*x - 2*y/((1+x)*(1+x)), -2*x/(1+x); ...
     y*y/((1+x)*(1+x)), 1-2*y/(1+x)];
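As a quick check on the hand computed partial derivatives, we can compare this Jacobian against a central difference approximation; the anonymous functions fg and NumJ below are our own helper constructions, not part of the text's code base.

% Sketch: compare the analytical Jacobian with a central difference
% approximation at one of the equilibrium points. fg and NumJ are
% illustrative helper names.
fg   = @(x,y) [ (1-x)*x - 2*x*y/(1+x); (1 - y/(1+x))*y ];
NumJ = @(x,y,h) [ (fg(x+h,y)-fg(x-h,y))/(2*h), (fg(x,y+h)-fg(x,y-h))/(2*h) ];
NumJ(0,1,1.0e-6)     % should agree with Jacobian(0,1) to high accuracy
Jacobian(0,1)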

We can then find the local linear systems at each equilibrium.

14.1.1.1 Local Analysis

We will use MatLab to help with our analysis of these equilibrium points. Here are
the MatLab sessions for all the equilibrium points. The MatLab command eig is used
to find the eigenvalues and eigenvectors of a matrix.
Equilibrium Point (0,0) For the first equilibrium point (0, 0), we find the Jacobian
at (0, 0) and the associated eigenvalues and eigenvectors with the following MatLab
commands.

Listing 14.2: Jacobian at equilibrium point (0; 0)


J0 = Jacobian(0,0)
J0 =
   1   0
   0   1
[V0, D0] = eig(J0)
V0 =
   1   0
   0   1
D0 =
   1   0
   0   1

Hence, there is a repeated eigenvalue, r = 1, but there are two different eigenvectors:

E1 = [ 1 ; 0 ]   and   E2 = [ 0 ; 1 ]

with general local solution near (0, 0) of the form

[ x(t) ; y(t) ] = a [ 1 ; 0 ] e^t + b [ 0 ; 1 ] e^t

where a and b are arbitrary. Hence, trajectories move away from the origin locally.
Recall, the local linear system is

[ x'(t) ; y'(t) ] = [ 1  0 ; 0  1 ] [ x(t) ; y(t) ]

which is the same as the local variable system using the change of variables u = x
and v = y:

[ u'(t) ; v'(t) ] = [ 1  0 ; 0  1 ] [ u(t) ; v(t) ]

Equilibrium Point (1,0) For the second equilibrium point (1, 0), we find the Jacobian
at (1, 0) and the associated eigenvalues and eigenvectors in a similar way.

Listing 14.3: Jacobian at equilibrium point (1; 0)


J1 = Jacobian(1,0)
J1 =
  -1  -1
   0   1
[V1, D1] = eig(J1)
V1 =
   1.0000  -0.4472
        0   0.8944
D1 =
  -1   0
   0   1

Now there are two different eigenvalues, r1 = −1 and r2 = 1, with associated eigen-
vectors

E1 = [ 1 ; 0 ]   and   E2 = [ −0.4472 ; 0.8944 ]

with general local solution near (1, 0) of the form

[ x(t) ; y(t) ] = a [ 1 ; 0 ] e^{−t} + b [ −0.4472 ; 0.8944 ] e^t

where a and b are arbitrary. Hence, trajectories move away from (1, 0) locally for
all trajectories except those that start on E1 . Recall, the local linear system is

[ x'(t) ; y'(t) ] = [ −1  −1 ; 0  1 ] [ x(t) − 1 ; y(t) ]

which, using the change of variables u = x − 1 and v = y, is the local variable
system

[ u'(t) ; v'(t) ] = [ −1  −1 ; 0  1 ] [ u(t) ; v(t) ]

Equilibrium Point (0,1) For the third equilibrium point (0, 1), we again find the
Jacobian at (0, 1) and the associated eigenvalues and eigenvectors in a similar way.

Listing 14.4: Jacobian at equilibrium point (0; 1)


J2 = Jacobian(0,1)
J2 =
  -1   0
   1  -1
[V2, D2] = eig(J2)
V2 =
        0   0.0000
   1.0000  -1.0000
D2 =
  -1   0
   0  -1

Now there is again a repeated eigenvalue, r1 = −1. If you look at the V2 matrix,
you see both columns are essentially the same eigenvector (the second is just the
negative of the first). In this case, MatLab does not give us useful information. We can
use the first column as our eigenvector E1 , but we still must find the other vector F.
Recall, the local linear system is

[ x'(t) ; y'(t) ] = [ −1  0 ; 1  −1 ] [ x(t) ; y(t) − 1 ]

which, using the change of variables u = x and v = y − 1, is the local variable
system

[ u'(t) ; v'(t) ] = [ −1  0 ; 1  −1 ] [ u(t) ; v(t) ]

Recall, the general solution to a model with a repeated eigenvalue with only one
eigenvector is given by

[ x(t) ; y(t) ] = a E1 e^{−t} + b ( F e^{−t} + E1 t e^{−t} )

where F solves (−I − A) F = −E1 . Here, that gives

F = [ 1 ; 0 ]

and so the local solution near (0, 1) has the form

[ x(t) ; y(t) ] = a [ 0 ; 1 ] e^{−t} + b ( [ 1 ; 0 ] e^{−t} + [ 0 ; 1 ] t e^{−t} )

where a and b are arbitrary. Hence, trajectories move toward (0, 1) locally.
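We can confirm the choice of F quickly in MatLab by checking that it satisfies (A + I)F = E1 ; this is just a verification of the hand computation above.

% Sketch: verify the generalized eigenvector F at the equilibrium (0,1).
A  = [-1 0; 1 -1];
E1 = [0; 1];
F  = [1; 0];
(A + eye(2))*F    % returns [0; 1] = E1, so (-I - A)F = -E1 as required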

14.1.1.2 Generating a Phase Plane Portrait

Piecing together the global behavior from the local trajectories is difficult, so it is
helpful to write scripts in MatLab to help us. We can use the AutoPhasePlanePlot()
function from before, but this time we use different dynamics. The dynamics are
stored in the file autonomousfunc.m which encodes the right hand side of the model

x' = (1 − x) x − 2xy/(1 + x)
y' = ( 1 − y/(1 + x) ) y

in the usual way.

Listing 14.5: Autonomousfunc


function f = autonomousfunc(t,x)
f = [(1-x(1)).*x(1) - 2.*x(1).*x(2)./(1+x(1)); ...
     x(2) - x(2).*x(2)./(1+x(1))];

Then, to generate a nice phase plane portrait, we try a variety of [xmin, xmax] ×
[ymin, ymax] initial condition boxes until it looks right! Here, we avoid any initial
conditions that have negative values as for those the trajectories go off to infinity and
the plots are not manageable. We show the plot in Fig. 14.1.

Listing 14.6: Sample Phase Plane Plot


AutoPhasePlanePlot('autonomousfunc',0.1,0.0,3.9,4,8,8,0.0,2.5,0.0,2.5);

Fig. 14.1 Nonlinear phase plot for multiple initial conditions!

You should play with this function. You’ll see it involves a lot of trial and error. Any
box [xmin, xmax] × [ymin, ymax] which includes trajectories whose x or y values
increase exponentially causes the overall plot’s x and y ranges to be skewed toward
those large numbers. This causes a huge loss in resolution on all other trajectories!
So it takes time and a bit of skill to generate a nice collection of phase plane plots!

14.1.2 Finding Equilibrium Points Numerically

We can find the equilibrium points using the root finding methods called bisection
and Newton’s method.

14.1.2.1 The Bisection Method

We need a simple function to find the root of a nice function f of the real variable
x using what is called bisection. The method is actually quite simple. We know that
if f is a continuous function on the finite interval [a, b] then f must have a zero
inside the interval [a, b] if f has a different algebraic sign at the endpoints a and b.
This means the product f (a) f (b) is not zero. So we assume we can find an interval
[a, b] on which this change in sign satisfies f (a) f (b) ≤ 0 (which we can do by
switching to − f if we have to!) and then if we divide the interval [a, b] into two
equal pieces [a, m] and [m, b], f (m) can’t have the same sign as both f (a) and f (b)
because of the assumed sign difference. So at least one of the two halves has a sign
change.
Note that if f(a) and f(b) were both zero then we still have f(a) f(b) ≤ 0 and either a
or b could be our chosen root and either half interval works fine. If only one of the

endpoint function values is zero, then the bisection of [a, b] into the two halves still
finds the one half interval that has the root.
So our prototyping Matlab code should use tests like f (x) f (y) ≤ 0 rather than
f (x) f (y) < 0 to make sure we catch the root.
Simple Bisection MatLab Code Here is a simple Matlab function to perform the
Bisection routine.

Listing 14.7: Bisection Code


function root = Bisection(fname, a, b, delta)
%
% fname  is a string that gives the name of a
%        continuous function f(x)
% a,b    this is the interval [a,b] on
%        which f is defined and for which
%        we assume that the product
%        f(a)*f(b) <= 0
% delta  this is a non negative real number
%
% root   this is the midpoint of the interval
%        [alpha,beta] having the property that
%        f(alpha)*f(beta) <= 0 and
%        |beta - alpha| <= delta + eps*max(|alpha|,|beta|)
%        and eps is machine zero
%
disp(' ')
disp('  k    |    a(k)      |    b(k)      |  b(k) - a(k) ')
k = 1;
disp(sprintf(' %6d | %12.7f | %12.7f | %12.7f', k, a, b, b-a));
fa = feval(fname, a);
fb = feval(fname, b);
while abs(a-b) > delta + eps*max(abs(a), abs(b))
  mid  = (a+b)/2;
  fmid = feval(fname, mid);
  if fa*fmid <= 0
    % there is a root in [a,mid]
    b  = mid;
    fb = fmid;
  else
    % there is a root in [mid,b]
    a  = mid;
    fa = fmid;
  end
  k = k + 1;
  disp(sprintf(' %6d | %12.7f | %12.7f | %12.7f', k, a, b, b-a));
end
root = (a+b)/2;
end

We should look at some of these lines more closely. First, to use this routine, we need
to write a function definition for the function we want to apply bisection to. We will
do this in a file called func.m (Inspired Name, eh?) An example would be the one
we wrote for the function
f(x) = tan(x/4) − 1
which is coded in Matlab by

Listing 14.8: Function Definition In MatLab


function y = func(x)
%
% x  real input
% y  real output
%
y = tan(x/4) - 1;
end

So to apply bisection to this function on the interval [2, 4] with a stopping tolerance
of say 10−4 , in Matlab, we would type the command

Listing 14.9: Applying bisection


root = Bisection('func', 2, 4, 10^-4)

Note that the name of our supplied function, the uninspired choice func, is passed in
as the first argument in single quotes as it is a string. Also, in the Bisection routine,
we have added the code to print out what is happening at each iteration of the while
loop. Matlab handles prints to the screen a little funny, so to set up a table of printed
values we use this syntax:

Listing 14.10: Bisection step by step code


% this prints a blank line and then a table heading.
% note disp prints a string only
disp(' ')
disp('  k    |    a(k)      |    b(k)      |  b(k) - a(k) ')
% now to print the k, a, b and b-a, we must first
% put their values into a string using the c like function
% sprintf and then use disp to display that string.
% so we do this
% disp(sprintf('output specifications here', variables here))
% so inside the while loop we use
disp(sprintf(' %6d | %12.7f | %12.7f | %12.7f', k, a, b, b-a));

Running the Code As mentioned above, we will test this code on the function
f(x) = tan(x/4) − 1

on the interval [2, 4] with a stopping tolerance of δ = 10−6 . Our function has been
written as the Matlab function func supplied in the file func.m. The Matlab run
time looks like this:

Listing 14.11: Sample Bisection run step by step


r o o t = B i s e c t i o n ( ’func’ , 2 , 4 , 1 0 ˆ − 6 )

k | a(k) | b(k) | b(k) − a(k)


1 | 2.0000000 | 4.0000000 | 2.0000000
5 2 | 3.0000000 | 4.0000000 | 1.0000000
3 | 3.0000000 | 3.5000000 | 0.5000000
4 | 3.0000000 | 3.2500000 | 0.2500000
5 | 3.1250000 | 3.2500000 | 0.1250000
6 | 3.1250000 | 3.1875000 | 0.0625000
10 7 | 3.1250000 | 3.1562500 | 0.0312500
8 | 3.1406250 | 3.1562500 | 0.0156250
9 | 3.1406250 | 3.1484375 | 0.0078125
10 | 3.1406250 | 3.1445312 | 0.0039062
11 | 3.1406250 | 3.1425781 | 0.0019531
15 12 | 3.1406250 | 3.1416016 | 0.0009766
13 | 3.1411133 | 3.1416016 | 0.0004883
14 | 3.1413574 | 3.1416016 | 0.0002441
15 | 3.1414795 | 3.1416016 | 0.0001221
16 | 3.1415405 | 3.1416016 | 0.0000610
20 17 | 3.1415710 | 3.1416016 | 0.0000305
18 | 3.1415863 | 3.1416016 | 0.0000153
19 | 3.1415863 | 3.1415939 | 0.0000076
20 | 3.1415901 | 3.1415939 | 0.0000038
21 | 3.1415920 | 3.1415939 | 0.0000019
25 22 | 3.1415920 | 3.1415930 | 0.0000010
root =
3.1416

Homework Well, you have to practice this stuff to see what is going on. So here are
two problems to sink your teeth into!

Exercise 14.1.1 Use bisection to find the first five positive solutions of the equation
x = tan(x). You can see where this is roughly by graphing tan(x) and x simultane-
ously. Do this for tolerances {10−1 , 10−2 , 10−3 , 10−4 , 10−5 , 10−6 , 10−7 }. For each
root, choose a reasonable bracketing interval [a, b], explain why you chose it, pro-
vide a table of the number of iterations to achieve the accuracy and a graph of this
number versus accuracy.

Exercise 14.1.2 Use the Bisection Method to find the largest real root of the function
f (x) = x 6 − x − 1. Do this for tolerances {10−1 , 10−2 , 10−3 , 10−4 , 10−5 , 10−6 ,
10−7 }. Choose a reasonable bracketing interval [a, b], explain why you chose it,
provide a table of the number of iterations to achieve the accuracy and a graph of
this number versus accuracy.

14.1.2.2 Newton’s Method

Newton’s method is based on the tangent line and rapidly converges to a zero of the
function f if the original guess is reasonable. Of course, that is the problem. A bad
initial guess is a great way to generate random numbers! So usually, we find a good

interval where the root might reside by first using bisection. The following code uses
a simple test to see which we should do in our zero finding routine.

Listing 14.12: Should We Do A Newton Step?


f u n c t i o n ok = S t e p I s I n ( x , fx , fpx , a , b )
%
3 % x current approximate root
% fx function f value at approximate root
% fpx d e r i v a t i v e f ’ value at approximate root
% a,b root i s in t h i s i n t e r v a l
%
8 % ok 1 i f t h e Newton S t e p x − f x / f p x i s i n [ a , b ]
% 0 i f not
%
i f fpx > 0
ok = ( ( a−x ) ∗ f p x <= −f x ) & (− f x <= ( b−x ) ∗ f p x ) ;
13 e l s e i f fpx < 0
ok = ( ( a−x ) ∗ f p x >= −f x ) & (− f x >= ( b−x ) ∗ f p x ) ;
else
ok = 0 ;
end
18
end

Then, once the bisection steps have given us an interval where the root might be,
we switch to Newton's method. This takes the current guess, say x0, and finds the
tangent line, T (x), to the function at that point. This gives

T(x) = f(x0) + f'(x0)(x - x0).

If we find the value of x where the tangent line crosses the x axis, this becomes our
next guess x1 . We find

x1 = x0 - f(x0)/f'(x0).

In essence this is Newton's Method, which can be rephrased for the scalar function
case as

xn+1 = xn - f(xn)/f'(xn),

and it is clear the method fails if f'(xn) = 0 or is close to 0 at any iteration. A bare-bones
sketch of this iteration appears below; the MatLab code for the full globalized method is given after that.
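Here is such a sketch. The function name NewtonSketch and its argument list are our own choices for illustration; this routine is not one of the codes used later in the chapter.

function root = NewtonSketch(fname, fpname, x0, tol, maxits)
  % a minimal sketch of the bare Newton iteration x <- x - f(x)/f'(x)
  % fname, fpname are strings naming f and f'
  % x0 is the initial guess; tol is the stopping tolerance on |f(x)|
  x = x0;
  for k = 1:maxits
    fx  = feval(fname, x);
    fpx = feval(fpname, x);
    if abs(fx) < tol
      break;
    end
    % the update fails if fpx is zero or very small
    x = x - fx/fpx;
  end
  root = x;
end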
A Global Newton Method The code for a global Newton method is pretty straight-
forward. Here is the listing.

Listing 14.13: Global Newton Method


1 f u n c t i o n [ x , fx , nEvals , aF , bF ] = . . .
GlobalNewton ( fName , fpName , a , b , t o l x , t o l f , nEvalsMax )
%
% fName a s t r i n g t h a t i s t h e name o f t h e f u n c t i o n f ( x )
% fpName a s t r i n g t h a t i s t h e name o f t h e f u n c t i o n s d e r i v a t i v e
6 % f ’(x)
% a,b we l o o k f o r t h e r o o t i n t h e i n t e r v a l [ a , b ]
% tolx t o l e r a n c e on t h e s i z e o f t h e i n t e r v a l
% tolf tolerance of f ( current approximation to root )
% nEvalsMax Maximum Number o f d e r i v a t i v e E v a l u a t i o n s
11 %
% x Approximate z e r o of f
% fx The v a l u e o f f a t t h e a p p r o x i m a t e z e r o
% nEvals The Number o f D e r i v a t i v e E v a l u a t i o n s Needed
% aF , bF t h e f i n a l i n t e r v a l t h e a p p r o x i m a t e r o o t l i e s in ,
16 % [ aF , bF ]
%
% T e r m i n a t i o n I n t e r v a l [ a , b ] has s i z e < t o l x
% | f ( approximate root ) | < t o l f
% Have e x c e e d e d nEvalsMax d e r i v a t i v e E v a l u a t i o n s
21 %
f a = f e v a l ( fName , a ) ;
f b = f e v a l ( fName , b ) ;
x = a;
f x = f e v a l ( fName , x ) ;
26 f p x = f e v a l ( fpName , x ) ;

nEvals = 1;
k = 1;
disp (’ ’)
31 d i s p ( ’ Step | k | a(k) | x(k) | b(k) ’ )
d i s p ( s p r i n t f ( ’Start | %6d | %12.7f | %12.7f | %12.7f’ , k , a , x , b ) ) ;
while ( abs ( a−b )>t o l x ) && ( abs ( f x )> t o l f ) && ( nEvals<nEvalsMax ) | | ( n E v a l s ==1)
%[ a , b ] b r a c k e t s a r o o t and x=a o r x=b
check = S t e p I s I n ( x , fx , fpx , a , b ) ;
36 i f check
%Take Newton S t e p
x = x − fx / fpx ;
else
%Take a B i s e c t i o n S t e p :
41 x = ( a+b ) / 2 ;
end
f x = f e v a l ( fName , x ) ;
f p x = f e v a l ( fpName , x ) ;
n E v a l s = nE v a l s + 1;
46 i f f a ∗ fx <=0
%t h e r e i s a r o o t i n [ a , x ] . Use r i g h t e n d p o i n t .
b = x;
fb = fx ;
else
51 %t h e r e i s a r o o t i n [ x , b ] . Bring in l e f t e n d p o i n t .
a = x;
fa = fx ;
end
k = k + 1;
56 i f ( check )
d i s p ( s p r i n t f ( ’Newton | %6d | %12.7f | %12.7f | %12.7f’ , k , a , x , b ) ) ;
else
d i s p ( s p r i n t f ( ’Bisection | %6d | %12.7f | %12.7f | %12.7f’ , k , a , x , b ) ) ;
end
61 end
aF = a ;
bF = b ;

end

A Run Time Example We will apply our global Newton method root finding code
to a simple example: find a root for f(x) = sin(x) in the interval [-7π/2, 15π + 0.1].

We code the function and its derivative in two simple Matlab files; f1.m and f1p.m.
These are

Listing 14.14: Global Newton Function


function y = f1 ( x )
y = sin (x) ;
end

and
Listing 14.15: Global Newton Function Derivative
f u n c t i o n y = f1p ( x )
2 y = cos ( x ) ;
end

To run this code on this example, we would then type a phrase like the one below:
Listing 14.16: Global Newton Sample
[ x , fx , nEvals , aLast , bLast ] = GlobalNewton ( ’f1’ , ’f1p’ , −7∗ p i / 2 , . . .
2 15∗ p i + . 1 , 1 0 ˆ − 6 , 1 0 ˆ − 8 , 2 0 0 )

Here is the runtime output:

Listing 14.17: Global Newton runtime results


[ x , fx , nEvals , aLast , bLast ] = GlobalNewton ( ’f1’ , ’f1p’ , . . .
−7∗ p i / 2 , 1 5 ∗ p i + . 1 , . . .
3 10ˆ −6 ,10ˆ −8 ,200)
Step | k | a(k) | x(k) | b(k)
Start | 1 | −10.9955743 | −10.9955743 | 47.2238898
Bisection | 2 | −10.9955743 | 18.1141578 | 18.1141578
Bisection | 3 | −10.9955743 | 3.5592917 | 3.5592917
8 Newton | 4 | 3.1154761 | 3.1154761 | 3.5592917
Newton | 5 | 3.1154761 | 3.1415986 | 3.1415986
Newton | 6 | 3.1415927 | 3.1415927 | 3.1415986
x =
3.1416
13 fx =
1 . 2 2 4 6 e −16
nEvals =
6
aLast =
18 3.1416
bLast =
3.1416

Homework
Exercise 14.1.3 Use the Global Newton Method to find the first five positive solu-
tions of the equation x = tan(x). You can see where this is roughly by graphing
tan(x) and x simultaneously. Do this for tolerances {10−1 , 10−2 , 10−3 , 10−4 , 10−5 ,
10−6 , 10−7 }. For each root, choose a reasonable bracketing interval [a, b], explain
why you chose it, provide a table of the number of iterations to achieve the accuracy
and a graph of this number versus accuracy.

Exercise 14.1.4 Use the Global Newton Method to find the largest real root of the
function f (x) = x 6 − x − 1. Do this for tolerances {10−1 , 10−2 , 10−3 , 10−4 , 10−5 ,
10−6 , 10−7 }. Choose a reasonable bracketing interval [a, b], explain why you chose
it, provide a table of the number of iterations to achieve the accuracy and a graph of
this number versus accuracy.

14.1.2.3 Adding Finite Difference Approximations to the Derivative

We can also choose to replace the derivative function for f with a finite difference
approximation. We will use

f'(xc) ≈ ( f(xc + δc) - f(xc) ) / δc

to approximate the value of the derivative at the point xc . As we have discussed


earlier, some care is required to pick a size for δc so that round-off errors do not
destroy the accuracy of our finite difference approximation to f  .
The simple Matlab code to implement this is given below:

Listing 14.18: Finite difference approximation to the derivative


f v a l = f e v a l ( fname , x ) ;
f p v a l = ( f e v a l ( fname , x+ d e l t a ) − f v a l ) / d e l t a ;

We can also use a secant approximation as follows:

f'(xc) ≈ ( f(xc) - f(x-) ) / ( xc - x- )

where x− is the previous iterate from our routine. The Matlab fragment we need is
then:

Listing 14.19: Secant approximation to the derivative


f p c = ( f c − f ) / ( xc − x ) ;

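To see how this fragment might sit inside an iteration, here is a minimal sketch of a secant loop. The variable names xold, fname, maxits and tolf are assumptions for illustration only; this is not one of the routines used below.

% a minimal sketch of a secant iteration; xold and x are two starting
% guesses near the root, fname names the function f
fold = feval(fname, xold);
f    = feval(fname, x);
for k = 1:maxits
  fpc  = (f - fold)/(x - xold);   % secant slope replaces f'(x)
  xnew = x - f/fpc;               % Newton-like update
  xold = x;  fold = f;
  x = xnew;  f = feval(fname, x);
  if abs(f) < tolf
    break;
  end
end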
14.1.2.4 A Finite Difference Global Newton Method

We add the finite difference routines into our Global Newton’s Method as follows:

Listing 14.20: Finite Difference Global Newton Method


f u n c t i o n [ x , fx , nEvals , aF , bF ] = . . .
GlobalNewtonFD ( fName , a , b , t o l x , t o l f , nEvalsMax )
%
4 % fName a s t r i n g t h a t i s t h e name o f t h e f u n c t i o n f ( x )
% a,b we l o o k f o r t h e r o o t i n t h e i n t e r v a l [ a , b ]
% tolx t o l e r a n c e on t h e s i z e o f t h e i n t e r v a l
% tolf tolerance of f ( current approximation to root )
% nEvalsMax Maximum Number o f d e r i v a t i v e E v a l u a t i o n s
9 %
% x Approximate z er o of f
% fx The v a l u e o f f a t t h e a p p r o x i m a t e z e r o
% nEvals The Number o f D e r i v a t i v e E v a l u a t i o n s Needed
% aF , bF t h e f i n a l i n t e r v a l t h e a p p r o x i m a t e r o o t l i e s in ,
14 % [ aF , bF ]
%
% T e r m i n a t i o n I n t e r v a l [ a , b ] has s i z e < t o l x
% | f ( approximate root ) | < t o l f
% Have e x c e e d e d nEvalsMax d e r i v a t i v e E v a l u a t i o n s
19 %
f a = f e v a l ( fName , a ) ;
f b = f e v a l ( fName , b ) ;
x = a;
f x = f e v a l ( fName , x ) ;
24 d e l t a = s q r t ( e p s ) ∗ abs ( x ) ;
f p v a l = f e v a l ( fName , x+ d e l t a ) ;
f p x = ( f p v a l −f x ) / d e l t a ;

nEvals = 1;
29 k = 1;
%d i s p ( ’ ’ )
%d i s p ( ’ S t e p | k | a(k) | x(k) | b ( k ) ’)
%d i s p ( s p r i n t f ( ’ S t a r t | %6d | %12.7 f | %12.7 f | %12.7 f ’ , k , a , x , b ) ) ;
while ( abs ( a−b )>t o l x ) && ( abs ( f x )> t o l f ) && ( nEvals<nEvalsMax ) | | ( n E v a l s ==1)
34 %[ a , b ] b r a c k e t s a r o o t and x=a o r x=b
check = S t e p I s I n ( x , fx , fpx , a , b ) ;
i f check
%Take Newton S t e p
x = x − fx / fpx ;
39 else
%Take a B i s e c t i o n S t e p :
x = ( a+b ) / 2 ;
end
f x = f e v a l ( fName , x ) ;
44 f p v a l = f e v a l ( fName , x+ d e l t a ) ;
f p x = ( f p v a l −f x ) / d e l t a ;
n E v a ls = n E v a ls +1;
i f f a ∗ fx <=0
%t h e r e i s a r o o t i n [ a , x ] . Use r i g h t e n d p o i n t .
49 b = x;
fb = fx ;
else
%t h e r e i s a r o o t i n [ x , b ] . Bring in l e f t e n dpoi nt .
a = x;
54 fa = fx ;
end
k = k + 1;
%i f ( c h e c k )
% d i s p ( s p r i n t f ( ’ Newton | %6d | %12.7 f | %12.7 f | %12.7 f ’ , k , a , x , b ) ) ;
59 %e l s e
% d i s p ( s p r i n t f ( ’ B i s e c t i o n | %6d | %12.7 f | %12.7 f | %12.7 f ’ , k , a , x , b ) ) ;
%end
end
aF = a ;
64 bF = b ;

end


Note, for our finite difference stepsize we use sqrt(eps) |x|, where eps is machine precision.

A Run Time Example We will apply our finite difference global Newton method
root finding code to the same simple example: find a root for f(x) = sin(x) in the
interval [-7π/2, 15π + 0.1]. We only need the code for the function now, which is as
usual in the file f1.m.
To run this code on this example, we would then type a phrase like the one below:

Listing 14.21: Sample GlobalNewton Finite Difference solution


[ x , fx , nEvals , aLast , bLast ] = GlobalNewtonFD ( ’f1’ , −7∗ p i / 2 , 1 5 ∗ p i + . 1 , . . .
10ˆ −6 ,10ˆ −8 ,200)

Here is the runtime output:

Listing 14.22: Global Newton Finite Difference runtime results


[ x , fx , nEvals , aLast , bLast ] = GlobalNewtonFD ( ’f1’ , −7∗ p i / 2 , 1 5 ∗ p i + . 1 , . . .
10ˆ −6 ,10ˆ −8 ,200)
3 Step | k | a(k) | x(k) | b(k)
Start | 1 | −10.9955743 | −10.9955743 | 47.2238898
Bisection | 2 | −10.9955743 | 18.1141578 | 18.1141578
Bisection | 3 | −10.9955743 | 3.5592917 | 3.5592917
Newton | 4 | 3.1154761 | 3.1154761 | 3.5592917
8 Newton | 5 | 3.1154761 | 3.1415986 | 3.1415986
Newton | 6 | 3.1154761 | 3.1415927 | 3.1415927
x =
3.1416
fx =
13 −4.3184 e −15
nEvals =
6
aLast =
3.1155
18 bLast =
3.1416

Homework

Exercise 14.1.5 Use the Finite Difference Global Newton Method to find the second
positive solution of the equation x = tan(x). Do this for the tolerance 10^-8. This time
alter the GlobalNewtonFD code to allow the finite difference step size delta to be
a parameter and do a parametric study on the effects of delta. Note that the code
now uses the reasonable choice of sqrt(eps) |x|, but you need to use the additional δ
choices {10−4 , 10−6 , 10−8 , 10−10 }. This will give you five δ choices. Provide a table
and a graph of δ versus accuracy of the root approximation.

Exercise 14.1.6 Use the Finite Difference Global Newton Method to find the largest
real root of the function f(x) = x^6 - x - 1. Do this for the tolerance 10^-8. Again
use the altered GlobalNewtonFD code with the finite difference step size delta as a
parameter and do a parametric study on the effects of delta. Note that the code
now uses the reasonable choice of sqrt(eps) |x|, but you need to use the additional δ

choices {10−4 , 10−6 , 10−8 , 10−10 }. This will give you five δ choices. Provide a table
and a graph of δ versus accuracy of the root approximation.

Exercise 14.1.7 Do the same thing for the problems above, but replace the Finite
Difference Global Newton Code with a Secant Global Newton Code. This will only
require a few lines of code to change really, so don’t freak out!

14.1.3 A Second Nonlinear Model

Now let’s look at this model

x' = 0.5 (-h(x) + y)
y' = 0.2 (-x - 1.5 y + 1.2)

for

h(x) = 17.76 x - 103.79 x^2 + 229.62 x^3 - 226.31 x^4 + 83.72 x^5.

This is a model of how an electrical component called a diode behaves, called the
trigger model; the details are not really important as we are just investigating
how to use our code and theoretical ideas. The equilibrium points are the solutions
to the simultaneous equations

y = 17.76 x - 103.79 x^2 + 229.62 x^3 - 226.31 x^4 + 83.72 x^5
y = -(2/3) x + 4/5.

Fig. 14.2 Equilibrium points graphically for the trigger model

We can see these solutions graphically by plotting the two curves simultaneously
and using the cursor to locate the roots and read off the (x, y) values from the plot.
This is not quite accurate so a better way is to find the roots numerically. The plot
which shows the equilibrium points graphically is shown in Fig. 14.2.
We do this with the following MatLab/Octave session.

Listing 14.23: Finding equilibrium points numerically


g = @( x ) ( 1 2 − 10∗ x ) / 1 5 ;
h = @( x ) 1 7 . 7 6 ∗ x −103.79∗ x . ˆ 2 . . .
+ 229.62∗ x . ˆ 3 − 226.31∗ x . ˆ 4 + 83.72∗ x . ˆ 5 ;
X = linspace (0 ,1 ,101) ;
5 Y1 = h (X) ;
Y2 = g (X) ;
p l o t (X, Y1 , X, Y2 ) ;
q = @( x ) h ( x ) − g ( x ) ;
[ x1 , fx1 , Nevals1 , aF1 , bF1 ] = . . .
10 GlobalNewtonFD ( q , 0 , . 1 , 1 0 ˆ − 6 , 1 0 ˆ − 8 , 1 0 ) ;
[ x2 , fx2 , Nevals2 , aF2 , bF2 ] = . . .
GlobalNewtonFD ( q , . 2 , . 4 , 1 0 ˆ − 6 , 1 0 ˆ − 8 , 1 0 ) ;
[ x3 , fx3 , Nevals3 , aF3 , bF3 ] = . . .
GlobalNewtonFD ( q , . 4 , . 9 , 1 0 ˆ − 6 , 1 0 ˆ − 8 , 1 0 ) ;
15 y1 = h ( x1 ) ;
y2 = h ( x2 ) ;
y3 = h ( x3 ) ;
EP = { [ x1 , y1 ] , [ x2 , y2 ] , [ x3 , y3 ] } ;
[ x3 , fx3 , Nevals3 , aF3 , bF3 ] = . . .
20 GlobalNewtonFD ( q , . 4 , . 9 , 1 0 ˆ − 6 , 1 0 ˆ − 8 , 1 0 ) ;
y3 = h ( x3 ) ;
EP = { [ x1 , y1 ] , [ x2 , y2 ] , [ x3 , y3 ] } ;
EP
EP =
25
[1 ,1] =
0.062695 0.758672
[1 ,2] =
0.28537 0.60975
30 [1 ,3] =
0.88443 0.21038
}

There are now three equilibrium points

Q 1 = (0.062695, 0.758672)
Q 2 = (0.28537, 0.60975)
Q 3 = (0.88443, 0.21038)

The Jacobian is

J(x, y) = [ -0.5 h'(x), 0.5; -0.2, -0.3 ]

Now let’s find the linearizations.



14.1.3.1 Equilibrium Point Q1

For the equilibrium point Q1, we have

Listing 14.24: Jacobian at Q1


J ( x1 , y1 )
ans =
3
−3.61840 0.50000
−0.20000 −0.30000
[ V1 , D1 ] = e i g ( J ( x1 , y1 ) )
V1 =
8
−0.998155 −0.150341
−0.060715 −0.988634

D1 =
13
D i a g o n a l Matrix

−3.58798 0
0 −0.33041

Hence, there are two real distinct eigenvalues, r1 = -3.58798 and r2 = -0.33041.
The eigenvectors are

E1 = [ -0.998155; -0.060715 ]   and   E2 = [ -0.150341; -0.988634 ]

The dominant eigenvector is E2, since its eigenvalue -0.33041 is the algebraically largest, and it is easy to plot the resulting trajectories. We
modify our function linearsystem to the new function linearsystemep so
that we can plot the trajectories centered at the equilibrium point Q1.

Listing 14.25: Linearsystemep


function f = linearsystemep (p , t , x )
a = p(1) ;
b = p(2) ;
c = p(3) ;
5 d = p(4) ;
ex = p ( 5 ) ;
ey = p ( 6 ) ;
u = x ( 1 ) − ex ;
v = x ( 2 ) − ey ;
10 f = [ a ∗u+b∗v ; c ∗u+d∗ v ] ;

We set up the plot as follows. We define the coefficient matrix A1 of our linearization
and set up the parameter p1 by listing all the entries of A1 followed by the coordinates
of Q1. We then generate the automatic phase plane plot.

Listing 14.26: Phase plane plot for Q1 linearization


A1 = J ( x1 , y1 ) ;
p1 = [ A1 ( 1 , 1 ) , A1 ( 1 , 2 ) , A1 ( 2 , 1 ) , A1 ( 2 , 2 ) , x1 , y1 ]
p1 =

5 −3.618397 0.500000 −0.200000 −0.300000 0.062695 0.758672

AutoPhasePlanePlotLinearSystemRKF5 ( ’x axis’ , ’y axis’ , . . .


’Q1 Plot’ , . . .
1 . 0 e −6 ,1.0 e − 6 , . 0 1 , . 2 , . . .
10 ’linearsystemep’ , p1 , . 0 1 , 0 , . 4 , 1 2 , 1 2 , 0 , 1 , 0 , 1 ) ;

This generates the plot shown in Fig. 14.3.

14.1.3.2 Equilibrium Point Q2

For equilibrium point Q2, we find the linearization just like we did for Q1. First, we
find the Jacobian at this point.
Listing 14.27: Jacobian at Q2
J ( x2 , y2 )
ans =

1.82012 0.50000
5 −0.20000 −0.30000
[ V2 , D2 ] = e i g ( J ( x2 , y2 ) )
V2 =
0 . 9 9 5 3 7 3 −0.234595
−0.096085 0.972093
10 D2 =
1.77185 0
0 −0.25173

Fig. 14.3 Trigger model linearized about Q1

Hence, there are two real distinct eigenvalues, r1 = 1.77185 and r2 = −0.25173.
The eigenvectors are

E1 = [ 0.995373; -0.096085 ]   and   E2 = [ -0.234595; 0.972093 ]

The dominant eigenvector is E1, corresponding to the eigenvalue 1.77185. We set the coefficient matrix A2 of our


linearization and the parameter p2 of our model and generate the automatic phase
plane plot.

Listing 14.28: Phase plane for Q2 linearization


A2 = J ( x2 , y2 ) ;
p2 = [ A2 ( 1 , 1 ) , A2 ( 1 , 2 ) , A2 ( 2 , 1 ) , A2 ( 2 , 2 ) , x2 , y2 ]
p2 =

5 1.82012 0 . 5 0 0 0 0 −0.20000 −0.30000 0.28537 0.60975


AutoPhasePlanePlotLinearSystemRKF5 ( ’x axis’ , . . .
’y axis’ , ’Q1 Plot’ , . . .
1 . 0 e −6 ,1.0 e − 6 , . 0 1 , . 2 , . . .
’linearsystemep’ , p2 , . 0 1 , 0 , . 4 , 1 2 , 1 2 , − 0 . 5 , 1 , 0 , 1 ) ;

This generates the plot shown in Fig. 14.4.

14.1.3.3 Equilibrium Point Q3

Finally, we analyze the model near the point Q3. The Jacobian is now

Fig. 14.4 Trigger model linearized about Q2

Listing 14.29: Jacobian at Q3


1 J ( x3 , y3 )
ans =
−1.43702 0.50000
−0.20000 −0.30000
[ V3 , D3 ] = e i g ( J ( x3 , y3 ) )
6 V3 =
−0.98204 −0.43297
−0.18868 −0.90141
D3 =
−1.34096 0
11 0 −0.39607

Hence, there are two real distinct eigenvalues, r1 = −1.34096 and r2 = −0.39607.
The eigenvectors are

E1 = [ -0.98204; -0.18868 ]   and   E2 = [ -0.43297; -0.90141 ]

The dominant eigenvector is now E2 . We set the coefficient matrix A3 of our lin-
earization and the parameter p3 of our model and generate the automatic phase plane
plot.

Listing 14.30: Phase plane for Q3 linearization


A3 = J ( x3 , y3 ) ;
p3 = [ A3 ( 1 , 1 ) , A3 ( 1 , 2 ) , A3 ( 2 , 1 ) , A3 ( 2 , 2 ) , x3 , y3 ]
p3 =
−1.43702 0 . 5 0 0 0 0 −0.20000 −0.30000 0.88443 0.21038
5 AutoPhasePlanePlotLinearSystemRKF5 ( ’x axis’ , . . .
’y axis’ , ’Q1 Plot’ , . . .
1 . 0 e − 6 , 1 . 0 e − 6 , . 0 1 , . 2 , ’linearsystemep’ , . . .
p3 , . 0 1 , 0 , . 4 , 1 2 , 1 2 , 0 , 1 , 0 , 1 ) ;

This generates the plot shown in Fig. 14.5.

14.1.3.4 The Full Phase Plane

We can also generate the full plot of the original system using the function
triggermodel

Fig. 14.5 Trigger model linearized about Q3

Listing 14.31: Trigger model


function y = triggermodel ( t , x )
%
u = x (1) ;
v = x (2) ;
5 y = [ − 0 . 5 ∗ ( 1 7 . 7 6 ∗ u −103.79∗u . ˆ 2 + 2 2 9 . 6 2 ∗ u . ˆ 3 − 2 2 6 . 3 1 ∗ u . ˆ 4 + 8 3 . 7 2 ∗ u . ˆ 5 −
v) ; . . .
0.2∗( − u − 1 . 5 ∗ v + 1 . 2 ) ] ;

We then plot the trajectories using AutoPhasePlanePlotRKF5NoP. This gives


us Fig. 14.6. You can see in this plot how trajectories do not stay near Q2; instead,
they move to Q1 or to Q3. Hence, you can trigger a move from Q1 to Q3 or vice
versa by choosing the right initial condition! So this gives us a way to implement
computer memory. Choose say Q 1 to represent a binary “1” and Q 3 to represent a
binary “0” or to implement a model of how emotional states can switch quickly.

Listing 14.32: Phase plane plot for trigger model


AutoPhasePlanePlotRKF5NoP ( ’x axis’ , . . .
’y axis’ , ’Trigger Model Phase Plane’ , . . .
1 . 0 e −6 ,1.0 e − 6 , . 0 1 , . 0 5 , . . .
’triggermodel’ , . 0 1 , 0 , 2 , 2 0 , 2 0 , − 0 . 2 , 1 , 0 , 1 ) ;

14.1.4 The Predator–Prey Model

The next nonlinear system is the familiar Predator–Prey model. Consider the follow-
ing example

Fig. 14.6 Trigger model phase plane plot

x'(t) = 3 x(t) - 4 x(t) y(t)
y'(t) = -5 y(t) + 7 x(t) y(t)

The equilibrium points (x, y) solve the equations

x (3 − 4y) = 0
y (−5 + 7x) = 0

which has the familiar solutions Q1 = (0, 0) and Q2 = (5/7, 3/4). The Jacobian here is

J(x, y) = [ 3 - 4y, -4x; 7y, -5 + 7x ]
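The session below calls J(0, 0), so it assumes the Jacobian and the equilibrium coordinates have been entered first. A minimal way to do that (our assumption; the text does not show this step) is:

% Jacobian of the Predator-Prey dynamics
J = @(x,y) [3 - 4*y, -4*x; 7*y, -5 + 7*x];
% coordinates of the two equilibrium points
x1 = 0;   y1 = 0;     % Q1
x2 = 5/7; y2 = 3/4;   % Q2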

14.1.4.1 Equilibrium Point Q1

For the equilibrium point Q1, we have


Listing 14.33: Jacobian at Q1
1 A1 = J ( 0 , 0 )
A1 =
3 −0
0 −5
[ V1 , D1 ] = e i g ( A1 )
6 V1 =
0 1
1 0
D1 =
−5 0
11 0 3

Hence, there are two real distinct eigenvalues, r1 = −5 and r2 = 3. The eigenvectors
are

E1 = [ 0; 1 ]   and   E2 = [ 1; 0 ]

The dominant eigenvector is thus E2 and it is easy to plot the resulting trajectories.
Using the function linearsystemep we set up the plot. Using the coefficient
matrix A1 of our linearization and the coordinates of Q 1 , we set the parameter p1
by listing all the entries of A1 followed by the coordinates of Q1. We then generate
the automatic phase plane plot.

Listing 14.34: Phase plane plot for Q1 linearization


A1 = J ( x1 , y1 ) ;
p1 = [ A1 ( 1 , 1 ) , A1 ( 1 , 2 ) , A1 ( 2 , 1 ) , A1 ( 2 , 2 ) , x1 , y1 ]
p1 =

5 3 −0 0 −5 0 0

AutoPhasePlanePlotLinearSystemRKF5 ( ’x axis’ , . . .
’y axis’ , ’Q1 Plot’ , . . .
1 . 0 e −6 ,1.0 e − 6 , . 0 1 , . 2 , . . .
10 ’linearsystemep’ , p1 , . 0 1 , 0 , . 4 , 1 2 , 1 2 , − 1 , 1 , − 1 , 1 ) ;

This generates the plot shown in Fig. 14.7.


Note the linearization shows a set of trajectories that moves out from the origin along
the x axis depending on the quadrant the initial condition comes from. We also know
that no trajectory of the Predator–Prey model that starts in Quadrant One can cross
the x and y axis, so the lesson that the linearization is an approximation to the true
trajectories really comes home here.

Fig. 14.7 Predator–Prey model linearized about Q1

14.1.4.2 Equilibrium Point Q2

For equilibrium point Q2, we find the Jacobian at this point.

Listing 14.35: Jacobian at Q2


A2 = J ( x2 , y2 )
A2 =
0 . 0 0 0 0 0 −2.85714
5.25000 0.00000
5 [ V2 , D2 ] = e i g ( A2 )
V2 =
0.00000 + 0.59365 i 0.00000 − 0.59365 i
0.80472 + 0.00000 i 0.80472 − 0.00000 i
D2 =
10 0.0000 + 3.8730 i 0
0 0.0000 − 3.8730 i

Hence, there is now a complex conjugate eigenvalue pair which has zero real part.
Thus, these trajectories will be closed curves (ellipses) about Q2. The eigenvalues are r1 = 3.8730i
and r2 = −3.8730i. The eigenvectors are not really needed for our phase plane. We
set the coefficient matrix A2 of our linearization and the parameter p2 of our model
and generate the automatic phase plane plot.

Listing 14.36: Phase plane for Q2 linearization


A2 = J ( x2 , y2 ) ;
p2 = [ A2 ( 1 , 1 ) , A2 ( 1 , 2 ) , A2 ( 2 , 1 ) , A2 ( 2 , 2 ) , x2 , y2 ]
p2 =

5 0 . 0 0 0 0 0 −2.85714 5.25000 0.00000 0.71429 0.75000


AutoPhasePlanePlotLinearSystemRKF5 ( ’x axis’ , . . .
’y axis’ , ’Q2 Plot’ , . . .
1 . 0 e −6 ,1.0 e − 6 , . 0 1 , . 2 , . . .
’linearsystemep’ , p2 , . 0 1 , 0 , 2 . 4 , 1 2 , 1 2 , − 0 . 5 , . 1 , 0 , . 1 ) ;

This generates the plot shown in Fig. 14.8.

14.1.4.3 The Full Phase Plane

The full plot of the original system uses the standard function PredPrey modified
for the model at hand.

Listing 14.37: Predator–Prey dynamics


f u n c t i o n f = PredPrey ( t , x )
f = [ 3 ∗ x ( 1 ) −4∗x ( 1 ) ∗ x ( 2 ) ;−5∗x ( 2 ) +7∗ x ( 1 ) ∗ x ( 2 ) ] ;

Fig. 14.8 Predator–Prey model linearized about Q2

We then plot the trajectories using AutoPhasePlanePlotRKF5NoPMultiple.


This gives us Fig. 14.9. We had to write a new function to generate the plot as the
equilibrium point Q 1 generates trajectories with large x and y values which com-
plicate the calculations. So the new function manually builds four separate plots,
one for each quadrant, and stays carefully away from the trajectories on the x and
y axis. It is pretty hard to automate this kind of plot! In each quadrant, the number
of trajectories and the size of the box from which the initial conditions come must
be chosen inside the code. Also, the length of time the calculations run in a given
quadrant must be handpicked. Hence, there is a lot of tweaking here!

Fig. 14.9 Predator–Prey model phase plane plot

Listing 14.38: AutoPhasePlanePlotRKF5NoPMultiple


f u n c t i o n AutoPhasePlanePlotRKF5NoPMultiple ( Hlabel , V l a b e l , T l a b e l , e r r o r t o l , s t e p t o l ,
minstep , maxstep , . . .
fname , h i n i t , Q1 , Q2 , Q3 , Q4 )
% fname i s t h e name o f t h e model dy n a m i c s
% e r r o r t o l i s t o l e r a n c e i n RKF5 a l g o r i t h m
5 % s t e p t o l i s a n o t h e r t o l e r a n c e i n RKF5 a l g o r i t h m
% m i n s t e p i s t h e minimum s t e p s i z e a l l o w e d
% ma xs t e p i s t h e maximum s t e p s i z e a l l o w e d
% h i n i t i s the i n i t i a l step s i z e
% Q1 i s 1 i f we want t o g e n e r a t e t h e p l o t i n q u a d r a n t 1
10 % Q2 i s 1 i f we want t o g e n e r a t e t h e p l o t i n q u a d r a n t 2
% Q3 i s 1 i f we want t o g e n e r a t e t h e p l o t i n q u a d r a n t 3
% Q4 i s 1 i f we want t o g e n e r a t e t h e p l o t i n q u a d r a n t 4
% tend i s the f i n a l time
%
15 %h o l d p l o t and c y c l e l i n e c o l o r s
newplot ;
hold a l l ;
i f ( Q3 == 1 )
endLL = 2 ;
20 uLL = l i n s p a c e ( − . 2 , − . 1 , endLL ) ;
vLL = l i n s p a c e ( − . 2 , − . 1 , endLL ) ;
u = uLL ;
v = vLL ;
f o r i =1: endLL
25 f o r j =1: endLL
y0 = [ u ( i ) ; v ( j ) ] ;
[ t v a l s , y v a l s , f v a l s , h v a l s ] = RKF5NoP ( e r r o r t o l , s t e p t o l , minstep , maxstep , . . .
fname , h i n i t , 0 , 0 . 5 , y0 ) ;
plot ( yvals (1 ,:) , yvals (2 ,:) ) ;
30 end
end
end
i f ( Q2 == 1 )
endUL = 2 ;
35 uUL = l i n s p a c e ( − . 2 , − . 1 , endUL ) ;
vUL = l i n s p a c e ( . 1 , . 2 , endUL ) ;
u = uUL;
v = vUL ;
f o r i =1: endUL
40 f o r j =1: endUL
y0 = [ u ( i ) ; v ( j ) ] ;
[ t v a l s , y v a l s , f v a l s , h v a l s ] = RKF5NoP ( e r r o r t o l , s t e p t o l , minstep , maxstep , . . .
fname , h i n i t , 0 , 0 . 6 , y0 ) ;
plot ( yvals (1 ,:) , yvals (2 ,:) ) ;
45 end
end
end
i f ( Q4 == 1 )
endLR = 2 ;
50 uLR = l i n s p a c e ( . 1 , . 2 , endLR ) ;
vLR = l i n s p a c e ( − . 2 , − . 1 , endLR ) ;
u = uLR
v = vLR
f o r i =1: endLR
55 f o r j =1: endLR
y0 = [ u ( i ) ; v ( j ) ] ;
[ t v a l s , y v a l s , f v a l s , h v a l s ] = RKF5NoP ( e r r o r t o l , s t e p t o l , minstep , maxstep , . . .
fname , h i n i t , 0 , 0 . 5 , y0 ) ;
plot ( yvals (1 ,:) , yvals (2 ,:) ) ;
60 end
end
end
i f ( Q1 == 1 )
endUR = 4 ;
65 uUR = l i n s p a c e ( . 1 , 2 . 0 , endUR ) ;
vUR = l i n s p a c e ( . 1 , 2 . 0 , endUR ) ;
u = uUR;
v = vUR;
f o r i =1:endUR
70 f o r j =1:endUR
y0 = [ u ( i ) ; v ( j ) ] ;
[ t v a l s , y v a l s , f v a l s , h v a l s ] = RKF5NoP ( e r r o r t o l , s t e p t o l , minstep , maxstep , . . .
fname , h i n i t , 0 , 2 . 4 , y0 ) ;
plot ( yvals (1 ,:) , yvals (2 ,:) ) ;
75 end
end
end
x l a b e l ( Hlabel ) ;
ylabel ( Vlabel ) ;
80 t i t l e ( Tlabel ) ;
hold o f f ;

Listing 14.39: A sample phase plane plot


AutoPhasePlanePlotRKF5NoPMultiple ( ’x axis’ , . . .
’y axis’ , ’Predator - Prey Phase Plane’ , . . .
1 . 0 e − 6 , 1 . 0 e − 6 , . 0 1 , . 0 5 , ’PredPrey’ , . 0 1 , 1 , 1 , 1 , 1 ) ;

14.1.5 The Predator–Prey Model with Self Interaction

The next nonlinear system is the Predator–Prey model with self interaction. Consider
the following example

x'(t) = 3 x(t) - 4 x(t) y(t) - e x(t)^2
y'(t) = -5 y(t) + 7 x(t) y(t) - f y(t)^2

where e and f are positive numbers. From our earlier discussions, we know that there
are essentially two interesting cases here. The x' and y' nullclines cross in the first
quadrant leading to spiral in trajectories towards that common point or they cross
in the fourth quadrant which leads to all trajectories converging to a point on the
positive x axis. We can now look at this model using our new tools. The equilibrium
points (x, y) solve the equations

x (3 − 4y − ex) = 0
y (−5 + 7x − f y) = 0

Three of the equilibrium points are then Q 1 = (0, 0), Q 2 = (0, −5/ f ) and Q 3 =
(3/e, 0). These equilibrium points coincide with trajectories that we do not see if we
choose positive initial conditions in Quadrant I.

14.1.5.1 The Nullclines Intersect in Quadrant I

This situation occurs when 3/e > 5/7 so as an example, let’s choose the value e = 1.
The value of f is not very important, so let’s choose f = 1 also just to make it easy.
Then, the intersection occurs when

x + 4y = 3
7x − y = 5

We can find this point of intersection many ways. The old fashioned way is by elimi-
nation. We find x = 23/29 and y = 16/29. Let Q 4 = (23/29, 16/29). The Jacobian
in general is
J(x, y) = [ 3 - 4y - 2ex, -4x; 7y, -5 + 7x - 2fy ].

Thus, for e = 1 and f = 1, we have

J(x, y) = [ 3 - 4y - 2x, -4x; 7y, -5 + 7x - 2y ].

We will only look at the local linearizations for equilibrium points Q 4 and then Q 3 .
The other two are similar to what we have done before.
Equilibrium Point Q4 For the equilibrium point Q4, we can find the Jacobian at Q 4
and the corresponding eigenvalues and eigenvectors. As expected, the eigenvalues
are complex with negative real part implying the local linearization gives spiral in
trajectories.

Listing 14.40: Jacobian at Q4


J = @( x , y ) [ 3 − 4∗ y−2∗x , −4∗ x ; 7 ∗ y , −5 + 7∗ x − 2∗ y ] ;
2 J (23/29 ,16/29)
ans =
−0.79310 −3.17241
3 . 8 6 2 0 7 −0.55172
A4 = J ( 2 3 / 2 9 , 1 6 / 2 9 ) ;
7 [ V4 , D4 ] = e i g ( A4 )
V4 =
−0.02315 + 0 . 6 7 1 1 5 i −0.02315 − 0 . 6 7 1 1 5 i
0.74096 + 0.00000 i 0.74096 − 0.00000 i
D4 =
12 D i a g o n a l Matrix
−0.6724 + 3 . 4 9 8 2 i 0
0 −0.6724 − 3 . 4 9 8 2 i

Hence, there is a complex eigenvalue pair, r = −0.6724 ± 3.4982i. Using linear


systemep we generate the plot using the coefficient matrix A4 of our linearization
and the coordinates of Q 4 . We set the parameter p4 by listing all the entries of A4
followed by the coordinates of Q4. We then generate the automatic phase plane plot.

Listing 14.41: Phase plane for Q4 linearization


A4 = J ( x1 , y1 ) ;
p4 = [ A4 ( 1 , 1 ) , A4 ( 1 , 2 ) , A4 ( 2 , 1 ) , A4 ( 2 , 2 ) , 2 3 / 2 9 , 1 6 / 2 9 ] ;
AutoPhasePlanePlotLinearSystemRKF5 ( ’x axis’ , . . .
’y axis’ , ’Q4 Plot’ , . . .
5 1 . 0 e −6 ,1.0 e − 6 , . 0 1 , . 2 , . . .
’linearsystemep’ , p4 , . 0 1 , 0 , . 4 , 1 2 , 1 2 , − 1 , 1 , − 1 , 1 ) ;

This generates the plot shown in Fig. 14.10.



Fig. 14.10 Predator–Prey model with self interaction with nullcline intersection Q4 in Quadrant I, linearized about Q4

Equilibrium Point Q3 For equilibrium point Q3 = (3/e, 0) = (3, 0) here, we find


the Jacobian, eigenvalues and eigenvectors at this point.

Listing 14.42: Jacobian at Q3


A3 = J ( 3 , 0 ) ;
[ V3 , D3 ] = e i g ( A3 )
V3 =
4 1 . 0 0 0 0 0 −0.53399
0.00000 0.84549
D3 =
D i a g o n a l Matrix
−3 0
9 0 16

The eigenvalues are both real, r1 = −3 and r2 = 16. Hence, the dominant eigenvector
is E2 where

E2 = [ -0.53399; 0.84549 ]

So we will see all the trajectories moving parallel or towards the E2 line. We set the
coefficient matrix A3 of our linearization and the parameter p3 of our model and
generate the automatic phase plane plot. It took a bit of experimentation to generate
this plot so that it looked reasonably good. The eigenvalue of 16 causes very fast
growth!

Fig. 14.11 Predator–Prey model linearized about Q3

Listing 14.43: Phase plane for Q3 linearization


p3 = [ A3 ( 1 , 1 ) , A3 ( 1 , 2 ) , A3 ( 2 , 1 ) , A3 ( 2 , 2 ) , 3 , 0 ] ;
AutoPhasePlanePlotLinearSystemRKF5 ( ’x axis’ , ’y axis’ , ’Q3 Plot’ , . . .
1 . 0 e − 6 ,1.0 e − 6 , . 0 0 1 , . 0 1 , ’linearsystemep’ , . . .
p3 , . 0 0 1 , 0 , . 1 , 8 , 8 , 2 . 9 , 3 . 1 , − . 1 , . 1 ) ;

This generates the plot shown in Fig. 14.11.


The Full Phase Plane The full plot of the original system uses the standard function
PredPrey modified for the model at hand.

Listing 14.44: PredPreySelf


function f = PredPreySelf ( t , x )
f = [ 3 ∗ x ( 1 ) −4∗x ( 1 ) ∗x ( 2 )−x ( 1 ) ∗x ( 1 ) ;−5∗x ( 2 ) +7∗ x ( 1 ) ∗x ( 2 )−x ( 2 ) ∗x ( 2 ) ] ;

To generate the full phase plane in all quadrants is difficult due to the growth rates
in all areas of the plane other than quadrant I.

Listing 14.45: Actual Phase plane


AutoPhasePlanePlotRKF5NoPMultiple ( ’x axis’ , ’y axis’ , . . .
’Predator - Prey Self Interaction Phase Plan Plot’ , . . .
3 1 . 0 e −6 ,1.0 e − 6 , . 0 1 , . 1 , . . .
’PredPreySelf’ , . 0 1 , 1 , 1 , 1 , 1 ) ;

This generates the plot shown in Fig. 14.12.



Fig. 14.12 Predator–Prey self interaction model with intersection in quadrant I

14.1.5.2 The Nullclines Intersect in Quadrant IV

This situation occurs when 3/e < 5/7 so as an example, let’s choose the value e = 5.
The value of f is not very important, so let’s choose f = 5 also just to make it easy.
Then, the intersection occurs when

5x + 4y = 3
7x − 5y = 5

We find x = 35/53 and y = −4/53. Let Q 4 = (35/53, −4/53). The Jacobian for
e = 5 and f = 5 is

J(x, y) = [ 3 - 4y - 10x, -4x; 7y, -5 + 7x - 10y ].

The other equilibrium points are Q 1 = (0, 0), Q 2 = (0, −5/ f ) and Q 3 = (3/e, 0).
Equilibrium points coincide with trajectories that we do not see if we choose positive
initial conditions in Quadrant I. However, equilibrium point Q 3 is different now. All
the trajectories that start in the positive first quadrant will converge to Q 3 . Again, we
will only look at the local linearizations for equilibrium points Q 3 and Q 4 .
Equilibrium Point Q4 For the equilibrium point Q4, we can find the Jacobian at
Q4 and the corresponding eigenvalues and eigenvectors.

Listing 14.46: Jacobian at Q4


1 J = @(x,y) [3 - 4*y - 10*x, -4*x; 7*y, -5 + 7*x - 10*y];
J (35/53 , −4/53)
ans =
−3.30189 0.30189
−0.52830 0.37736
6 A4 = J ( 3 5 / 5 3 , − 4 / 5 3 ) ;
[ V4 , D4 ] = e i g ( A4 )
V4 =
−0.989605 −0.082757
−0.143812 −0.996570
11 D4 =
−3.25802 0
0 0.33349

Hence, there are two real roots that are distinct and the dominant eigenvector is E2 .
Using the coefficient matrix A4 of our linearization and the coordinates of Q 4 , we
set the parameter p4 and generate the automatic phase plane plot.

Listing 14.47: Phase plane plot for Q4 linearization


A4 = J ( x1 , y1 ) ;
p4 = [ A4 ( 1 , 1 ) , A4 ( 1 , 2 ) , A4 ( 2 , 1 ) , A4 ( 2 , 2 ) , 3 5 / 5 3 , − 4 / 5 3 ] ;
AutoPhasePlanePlotLinearSystemRKF5 ( ’x axis’ , . . .
’y axis’ , ’Q4 Plot’ , . . .
5 1 . 0 e − 6 , 1 . 0 e − 6 , . 0 1 , . 2 , ’linearsystemep’ , . . .
p4 , . 0 1 , 0 , . 4 , 1 2 , 1 2 , − 1 , 1 , − 1 , 1 ) ;

This generates the plot shown in Fig. 14.13.

Fig. 14.13 Predator–Prey model with self interaction with nullcline intersection Q4 in Quadrant IV, linearized about Q4

Equilibrium Point Q3 For equilibrium point Q3 = (3/e, 0) = (3/5, 0) here, we


find the Jacobian, eigenvalues and eigenvectors at this point.

Listing 14.48: Jacobian at Q3


A3 = J ( . 6 , 0 ) ;
[ V3 , D3 ] = e i g ( A3 )
V3 =
4 1 0
0 1
D3 =
−3.00000 0
0 −0.80

The eigenvalues are now both real and negative, r1 = −3 and r2 = −0.8. The dom-
inant eigenvector is again E2 . So we will see all the trajectories moving parallel or
towards the E2 line.

Listing 14.49: Phase plane for Q3 linearization


p3 = [ A3 ( 1 , 1 ) , A3 ( 1 , 2 ) , A3 ( 2 , 1 ) , A3 ( 2 , 2 ) , . 6 , 0 ] ;
2 AutoPhasePlanePlotLinearSystemRKF5 ( ’x axis’ , ’y axis’ , ’Q3 Plot’ , . . .
, 1 . 0 e − 6 , 1 . 0 e − 6 , . 0 0 1 , . 0 1 , ’linearsystemep’ , p3 , . . .
.001 ,0 ,.1 ,8 ,8 ,2.9 ,3.1 , −.1 ,.1) ;

This generates the plot shown in Fig. 14.14.


The Full Phase Plane The full plot of the original system uses the standard function
PredPrey modified for the model at hand.

Fig. 14.14 Predator–Prey model with self interaction with nullcline intersection in Quadrant IV, linearized about Q3

Listing 14.50: PredPreySelf


function f = PredPreySelf ( t , x )
f = [ 3 ∗ x ( 1 ) −4∗x ( 1 ) ∗ x ( 2 ) −5∗x ( 1 ) ∗ x ( 1 ) ;−5∗x ( 2 ) +7∗ x ( 1 ) ∗ x ( 2 ) −5∗x ( 2 ) ∗ x ( 2 ) ] ;

To generate the full phase plane in all quadrants is as always difficult and we
have to play with the settings in the function AutoPhasePlanePlotRKF5NoP
Multiple.

Listing 14.51: The actual phase plane


AutoPhasePlanePlotRKF5NoPMultiple ( ’x axis’ , ’y axis’ , . . .
’Predator - Prey Self Interaction Model with null
cline intersection in Quadrant IV Phase Plan Plot’ , . . .
1 . 0 e −6 ,1.0 e − 6 , . 0 1 , . 1 , . . .
5 ’PredPreySelf’ , . 0 0 5 , 1 , 1 , 1 , 1 ) ;

This generates the plot shown in Fig. 14.15.

14.1.6 Problems

For each of the following systems, do this:

1. Graph the nullclines x' = 0 and y' = 0 and show on the x-y plane the regions
where x' and y' take on their various algebraic signs.
2. Find the equilibrium points.

Fig. 14.15 Predator–Prey self interaction model with nullcline intersection in Quadrant IV phase plane plot

3. At each equilibrium point, find the Jacobian of the system and analyze the lin-
earized system we have discussed in class. This means:
• find eigenvalues and eigenvectors if the system has real eigenvalues. You don’t
need the eigenvectors if the eigenvalues are complex.
• sketch a graph of the linearized solutions near the equilibrium point.
4. Use 1 and 3 to combine all this information into a full graph of the system.

Exercise 14.1.8

x' = y
y' = -x + x^3/6 - y
Exercise 14.1.9

x' = -x + y
y' = 0.1 x - 2y - x^2 - 0.1 x^3

Exercise 14.1.10

x' = y
y' = -x + y (1 - 3x^2 - 2y^2)
Chapter 15
An Insulin Model

We are now going to discuss a very nice model of diabetes detection or equivalently,
a model of insulin regulation, which was presented in the classic text on applied
mathematical modeling by Braun, Differential Equations and Their Applications
(Braun 1978).
In diabetes there is too much sugar in the blood and the urine. This is a metabolic
disease and if a person has it, they are not able to use up all the sugars, starches
and various carbohydrates because they don’t have enough insulin. Diabetes can be
diagnosed by a glucose tolerance test (GTT). If you are given this test, you do an
overnight fast and then you are given a large dose of sugar in a form that appears in
the bloodstream. This sugar is called glucose. Measurements are made over about
five hours or so of the concentration of glucose in the blood. These measurements
are then used in the diagnosis of diabetes. It has always been difficult to interpret
these results as a means of diagnosing whether a person has diabetes or not. Hence,
different physicians interpreting the same data can come up with a different diagnosis,
which is a pretty unacceptable state of affairs!
In this chapter, we are going to discuss a criterion developed in the 1960s by
doctors at the Mayo Clinic and the University of Minnesota that was fairly reliable.
It showcases a lot of our modeling in this course and will give you another example
of how we use our tools. We start with a simple model of the blood glucose regulatory
system.
Glucose plays an important role in vertebrate metabolism because it is a source of
energy. For each person, there is an optimal blood glucose concentration and large
deviations from this leads to severe problems including death. Blood glucose levels
are autoregulated via standard forward and backward interactions like we see in many
biological systems. An example is the signal that is used to activate the creation of
a protein which we discussed earlier. The signaling molecules are typically either
bound to another molecule in the cell or are free. The equilibrium concentration of
free signal is due to the fact that the rate at which signaling molecules bind equals the
rate at which they split apart from their binding substrate. When an external message
comes into the cell called a trigger, it induces a change in this careful balance which
temporarily upgrades or degrades the equilibrium signal concentration. This then

influences the protein concentration rate. Blood glucose concentrations work like this
too, although the details differ. The blood glucose concentration is influenced by a
variety of signaling molecules just like the protein creation rates can be. Here are some
of them. The hormone that decreases blood glucose concentration is insulin. Insulin
is a hormone secreted by the β cells of the pancreas. After we eat carbohydrates, our
gastrointestinal tract sends a signal to the pancreas to secrete insulin. Also, the glucose
in our blood directly stimulates the β cells to secrete insulin. We think insulin helps
cells pull in the glucose needed for metabolic activity by attaching itself to membrane
walls that are normally impenetrable. This attachment increases the ability of glucose
to pass through to the inside of the cell where it can be used as fuel. So, if there is not
enough insulin, cells don’t have enough energy for their needs. The other hormones
we will focus on all tend to change blood glucose concentrations also.

• Glucagon is a hormone secreted by the α cells of the pancreas. Excess glucose is


stored in the liver in the form of Glycogen. There is the usual equilibrium amount
of storage caused by the rate of glycogen formation being equal to the rate of the
reverse reaction that moves glycogen back to glucose. Hence the glycogen serves as
a reservoir for glucose and when the body needs glucose, the rate balance is tipped
towards conversion back to glucose to release needed glucose to the cells. The
hormone glucagon increases the rate of the reaction that converts glycogen back
to glucose and so serves an important regulatory function. Hypoglycemia (low
blood sugar) and fasting tend to increase the secretion of the hormone glucagon.
On the other hand, if the blood glucose levels increase, this tends to suppress
glucagon secretion; i.e. we have another back and forth regulatory tool.
• Epinephrine also called adrenalin is a hormone secreted by the adrenal medulla.
It is part of an emergency mechanism to quickly increase the blood glucose con-
centration in times of extremely low blood sugar levels. Hence, epinephrine also
increases the rate at which glycogen converts to glucose. It also directly inhibits
how much glucose is able to be pulled into muscle tissue because muscles use a
lot of energy and this energy is needed elsewhere more urgently. It also acts on the
pancreas directly to inhibit insulin production which keeps glucose in the blood.
There is also another way to increase glucose by converting lactate into glucose
in the liver. Epinephrine increases this rate also so the liver can pump this extra
glucose back into the blood stream.
• Glucocorticoids are hormones like cortisol which are secreted by the adrenal
cortex. They influence how carbohydrates are metabolized, which in turn increases
glucose if the metabolic rate goes up.
• Thyroxin is a hormone secreted by the thyroid gland and it helps the liver form
glucose from sources which are not carbohydrates such as glycerol, lactate and
amino acids. So another way to up glucose!
• Somatotropin is called the growth hormone and it is secreted by the anterior pitu-
itary gland. This hormone directly affects blood glucose levels (i.e. an increase in
Somatotropin increases blood glucose levels and vice versa) but it also inhibits the
effect of insulin on muscle and fat cell’s permeability which diminishes insulin’s

ability to help those cells pull glucose out of the blood stream. These actions can
therefore increase blood glucose levels.
Now net hormone concentration is the sum of insulin plus the others. Let H denote
this net hormone concentration. At normal conditions, call this concentration H0 .
There have been studies performed that show that under close to normal conditions,
the interaction of the one hormone insulin with blood glucose completely dominates
the net hormonal activity. That is normal blood sugar levels primarily depend on
insulin-glucose interactions.
So if insulin increases from normal levels, it increases net hormonal concentration
to H0 + H and decreases glucose blood concentration. On the other hand, if other
hormones such as cortisol increased from base levels, this will make blood glucose
levels go up. Since insulin dominates all activity at normal conditions, we can think
of this increase in cortisol as a decrease in insulin with a resulting drop in blood
glucose levels. A decrease in insulin from normal levels corresponds to a drop in net
hormone concentration to H0 − H. Now let G denote blood glucose level. Hence,
in our model an increase in H means a drop in G and a decrease in H means an
increase in G! Note our lumping of all the hormone activity into a single net activity
is very much like how we modeled food fish and predator fish in the predator–prey
model.
The idea of our model for diagnosing diabetes from the GTT is to find a simple
dynamical model of this complicated blood glucose regulatory system in which
the values of two parameters would give a nice criterion for distinguishing normal
individuals from those with mild diabetes or those who are pre diabetic. Here is what
we will do. We describe the model as

G'(t) = F1(G, H) + J(t)
H'(t) = F2(G, H)

where the function J is the external rate at which blood glucose concentration is
being increased. There are two nonlinear interaction functions F1 and F2 because
we know G and H have complicated interactions.
Let’s assume G and H have achieved optimal values G 0 and H0 by the time
the fasting patient has arrived at the hospital. Hence, we don’t expect to have any
contribution to G'(0) and H'(0); i.e. F1(G0, H0) = 0 and F2(G0, H0) = 0.
We are interested in the deviation of G and H from their optimal values G 0 and
H0 , so let g = G − G 0 and h = H − H0 . We can then write G = G 0 + g and
H = H0 + h. The model can then be rewritten as

(G0 + g)'(t) = F1(G0 + g, H0 + h) + J(t)
(H0 + h)'(t) = F2(G0 + g, H0 + h)

or

g'(t) = F1(G0 + g, H0 + h) + J(t)
h'(t) = F2(G0 + g, H0 + h)

We know that near the point (x0, y0) a function F(x, y) agrees with its tangent plane up to an error term E_F:

F(x, y) = F(x0, y0) + Fx(x0, y0)(x - x0) + Fy(x0, y0)(y - y0) + E_F.

We use this idea on our functions F1 and F2 at the optimal
values G 0 and H0 . We have

F1(G0 + g, H0 + h) = F1(G0, H0) + (∂F1/∂g)(G0, H0) g + (∂F1/∂h)(G0, H0) h + E_F1
F2(G0 + g, H0 + h) = F2(G0, H0) + (∂F2/∂g)(G0, H0) g + (∂F2/∂h)(G0, H0) h + E_F2

but the terms F1(G0, H0) = 0 and F2(G0, H0) = 0, so we can simplify to

F1(G0 + g, H0 + h) = (∂F1/∂g)(G0, H0) g + (∂F1/∂h)(G0, H0) h + E_F1
F2(G0 + g, H0 + h) = (∂F2/∂g)(G0, H0) g + (∂F2/∂h)(G0, H0) h + E_F2

It seems reasonable to assume that since we are so close to ordinary operating con-
ditions, the errors E_F1 and E_F2 will be negligible. Thus our model approximation is

g'(t) = (∂F1/∂g)(G0, H0) g + (∂F1/∂h)(G0, H0) h + J(t)
h'(t) = (∂F2/∂g)(G0, H0) g + (∂F2/∂h)(G0, H0) h

We can reason out the algebraic signs of the four partial derivatives to be

(∂F1/∂g)(G0, H0) is negative,
(∂F1/∂h)(G0, H0) is negative,
(∂F2/∂g)(G0, H0) is positive,
(∂F2/∂h)(G0, H0) is negative.

The arguments for these algebraic signs come from our understanding of the phys-
iological processes that are going on here. Let's look at a small positive deviation g
from the optimal value G 0 while letting the net hormone concentration be fixed at
H0 . At this point, we are not adding an external input, so here J(t) = 0. Then our
model approximation is

g'(t) = (∂F1/∂g)(G0, H0) g

At a state where we have an increase in blood sugar levels over optimal, i.e. g > 0,
the other hormones such as cortisol and glucagon will try to regulate the blood sugar
level down by increasing their concentrations and for example storing more sugar
into glycogen. Hence, the term (∂F1/∂g)(G0, H0) should be negative, as here g' is negative
since g should be decreasing. So we model this as (∂F1/∂g)(G0, H0) = -m1 for some
positive number m1. Now consider a positive change in h from the optimal level
while keeping G at the optimal level G0. Then the model is

g'(t) = (∂F1/∂h)(G0, H0) h

and since h > 0, this means the net hormone concentration is up, which we interpret
as insulin above normal. This means blood sugar levels go down, which implies g'
is negative again. Thus, (∂F1/∂h)(G0, H0) must be negative, which means we model it as
(∂F1/∂h)(G0, H0) = -m2 for some positive m2.
Now look at the h model in these two cases. If we have a small positive deviation
g from the optimal value G 0 while letting the net hormone concentration be fixed at
H0 , we have

h'(t) = (∂F2/∂g)(G0, H0) g.

Again, since g is positive, this means we are above normal blood sugar levels which
implies mechanisms are activated to bring the level down. Hence h' > 0 as we have
increasing net hormone levels. Thus, we must have (∂F2/∂g)(G0, H0) = m3 for some
positive m 3 . Finally, if we have a positive deviation h from optimal while blood
sugar levels are optimal, the model is

h'(t) = (∂F2/∂h)(G0, H0) h.

Since h is positive, the concentrations of the hormones that pull glucose
out of the blood stream are above optimal. This means that too much sugar is being
removed and so the regulatory mechanisms will act to stop this action, implying h' < 0.
This tells us (∂F2/∂h)(G0, H0) = -m4 for some positive constant m4. Hence, the four

partial derivatives at the optimal points can be defined by four positive numbers m 1 ,
m 2 , m 3 and m 4 as follows:

(∂F1/∂g)(G0, H0) = -m1
(∂F1/∂h)(G0, H0) = -m2
(∂F2/∂g)(G0, H0) = +m3
(∂F2/∂h)(G0, H0) = -m4
Our model dynamics are thus approximated by

g'(t) = -m1 g - m2 h + J(t)
h'(t) = m3 g - m4 h

This implies

g''(t) = -m1 g'(t) - m2 h'(t) + J'(t)

Now plug in the formula for h' to get

g''(t) = -m1 g' - m2 (m3 g - m4 h) + J'(t)
       = -m1 g' - m2 m3 g + m2 m4 h + J'(t).

But we can use the g' equation to solve for m2 h. This gives

m2 h = -g'(t) - m1 g + J(t)

which leads to
 
g''(t) = -m1 g' - m2 m3 g + m4 ( -g'(t) - m1 g + J(t) ) + J'(t)
       = -(m1 + m4) g' - (m1 m4 + m2 m3) g + m4 J(t) + J'(t).

So our final model is

g''(t) + (m1 + m4) g' + (m1 m4 + m2 m3) g = m4 J(t) + J'(t).

Let α = (m1 + m4)/2 and ω^2 = m1 m4 + m2 m3 and we can rewrite this as

g''(t) + 2α g' + ω^2 g = S(t).

where S(t) = m4 J(t) + J'(t). Now the right hand side here is zero except for the
very short time interval when the glucose load is being ingested. Hence, we can
simply search for the solution to the homogeneous model

g''(t) + 2α g' + ω^2 g = 0.

The roots of the characteristic equation here are



r = ( -2α ± √(4α^2 - 4ω^2) ) / 2 = -α ± √(α^2 - ω^2).

The most interesting case is if we have complex roots. In that case, α^2 - ω^2 < 0.
Let Ω^2 = |α^2 - ω^2|. Then, the general phase shifted solution has the form
g = R e^(-αt) cos(Ωt - δ), which implies

G = G0 + R e^(-αt) cos(Ωt - δ).

Hence, our model has five unknowns to find: G0, R, α, Ω and δ. The easiest way to
do this is to measure G 0 , the patient’s initial blood glucose concentration, when the
patient arrives. Then measure the blood glucose concentration N more times giving
the data pairs (t1 , G 1 ), (t2 , G 2 ) and so on out to (t N , G N ). Then form the least squares
error function


E = Σ_{i=1}^N ( Gi - G0 - R e^(-α ti) cos(Ω ti - δ) )^2

and find the five parameter values that make this error a minimum. This can be
done in MatLab using some tools that are outside the scope of our text. Numerous
experiments have been done with this model and if we let T0 = 2π/Ω, it has been
found that if T0 < 4 h, the patient is normal and if T0 is much larger than that, the
patient has mild diabetes.
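As a small illustration of how the fitted model would be used, the sketch below evaluates G(t) = G0 + R e^(-αt) cos(Ωt - δ) and computes the period T0 = 2π/Ω to apply the criterion. The parameter values here are made up purely for illustration; they are not fitted values.

% hypothetical parameter values, for illustration only
G0 = 95; R = 110; alpha = 0.9; Omega = 1.8; delta = 0.7;
G = @(t) G0 + R*exp(-alpha*t).*cos(Omega*t - delta);
t = linspace(0, 5, 201);
plot(t, G(t));
xlabel('time (hours)'); ylabel('blood glucose');
T0 = 2*pi/Omega;        % natural period of the fitted response
if T0 < 4
  disp('T0 < 4 hours: consistent with a normal response');
else
  disp('T0 >= 4 hours: consistent with mild diabetes');
end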

15.1 Fitting the Data

Here is some typical glucose versus time data.

Time   Glucose Level
 0      95
 1      180
 2      155
 3      140

We will now try to find the parameter values which minimize the nonlinear least
squares problem we have here. This appears to be a simple problem, but you will see
all numerical optimization problems are actually fairly difficult. Our problem is to
find the free parameters G0, R, α, Ω and δ which minimize


E(G0, R, α, Ω, δ) = Σ_{i=1}^N ( Gi - G0 - R e^(-α ti) cos(Ω ti - δ) )^2

For convenience, let X = [G0, R, α, Ω, δ] and fi(X) = Gi - G0 - R e^(-α ti) cos(Ω ti - δ);
then we can rewrite the error function as E(X) = Σ_{i=1}^N fi^2(X). Gradient
descent requires the gradient of this error function. This is just a messy calculation;

∂E/∂G0 = 2 Σ_{i=1}^N fi(X) ∂fi/∂G0
∂E/∂R  = 2 Σ_{i=1}^N fi(X) ∂fi/∂R
∂E/∂α  = 2 Σ_{i=1}^N fi(X) ∂fi/∂α
∂E/∂Ω  = 2 Σ_{i=1}^N fi(X) ∂fi/∂Ω
∂E/∂δ  = 2 Σ_{i=1}^N fi(X) ∂fi/∂δ

where the f i partials are given by

∂fi/∂G0 = -1
∂fi/∂R  = -e^(-α ti) cos(Ω ti - δ)
∂fi/∂α  = ti R e^(-α ti) cos(Ω ti - δ)
∂fi/∂Ω  = ti R e^(-α ti) sin(Ω ti - δ)
∂fi/∂δ  = -R e^(-α ti) sin(Ω ti - δ)

and so
∂E/∂G0 = -2 Σ_{i=1}^N fi(X)
∂E/∂R  = -2 Σ_{i=1}^N fi(X) e^(-α ti) cos(Ω ti - δ)
∂E/∂α  =  2 Σ_{i=1}^N fi(X) ti R e^(-α ti) cos(Ω ti - δ)
∂E/∂Ω  =  2 Σ_{i=1}^N fi(X) ti R e^(-α ti) sin(Ω ti - δ)
∂E/∂δ  = -2 Σ_{i=1}^N fi(X) R e^(-α ti) sin(Ω ti - δ)
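Listing 15.1 below calls two helper functions, DiabetesError and ErrorGradient, which are not shown in this section. A minimal sketch of what such helpers might look like, transcribing the formulae above directly, is given here with our own names; each would live in its own .m file, and the actual helpers used for the runs may differ in detail (for instance, Listing 15.1 models the decay rate as a^2 rather than α).

function E = DiabetesErrorSketch(Time, G, G0, R, alpha, Omega, delta)
  % least squares error; residuals are fi = Gi - G0 - R e^(-alpha ti) cos(Omega ti - delta)
  res = G - G0 - R*exp(-alpha*Time).*cos(Omega*Time - delta);
  E = sum(res.^2);
end

function [gradE, normgrad] = ErrorGradientSketch(Time, G, G0, R, alpha, Omega, delta)
  % gradient of the least squares error with respect to [G0, R, alpha, Omega, delta]
  ct  = cos(Omega*Time - delta);
  st  = sin(Omega*Time - delta);
  ex  = exp(-alpha*Time);
  res = G - G0 - R*ex.*ct;                  % the fi values
  gradE = zeros(5,1);
  gradE(1) = -2*sum(res);                   % dE/dG0
  gradE(2) = -2*sum(res.*ex.*ct);           % dE/dR
  gradE(3) =  2*sum(res.*Time.*R.*ex.*ct);  % dE/dalpha
  gradE(4) =  2*sum(res.*Time.*R.*ex.*st);  % dE/dOmega
  gradE(5) = -2*sum(res.*R.*ex.*st);        % dE/ddelta
  normgrad = norm(gradE);
end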

Now suppose we are at the point X₀ and we want to know how much of the descent vector D to use. Note, if we use the amount ξ of the descent vector at X₀, we compute the new error value E(X₀ − ξ D(X₀)). Let g(ξ) = E(X₀ − ξ D(X₀)). We see g(0) = E(X₀) and, given a first choice of ξ = λ, we have g(λ) = E(X₀ − λ D(X₀)). Next, let Y = X₀ − ξ D(X₀). Then, using the chain rule, we can calculate the derivative of g at 0. First, we have

$$ g'(\xi) = -\langle \nabla E(Y), D(X_0)\rangle $$

and using the normalized gradient of E as the descent vector, we find

$$ g'(0) = -||\nabla E(X_0)||. $$

Now let's approximate g using a quadratic model. Since we are trying for a minimum, in general we take a step in the direction of the negative gradient, which should make the error function go down. Even when the trial value g(λ) = E(X₀ − λ D(X₀)) turns out to be larger than g(0) = E(X₀), the directional derivative still gives g'(0) = −||∇E|| < 0. Hence, if we approximate g by a simple quadratic model, g(ξ) = A + Bξ + Cξ², and the fitted coefficient C is positive, this model has a unique minimizer and we can use the value of ξ where the minimum occurs as our next choice of descent step. This technique is called a Line Search Method and it is quite useful. To summarize, we fit our g model and find

$$ g(0) = E(X_0) \Longrightarrow A = E(X_0), $$
$$ g'(0) = -||\nabla E|| \Longrightarrow B = -||\nabla E||, $$
$$ g(\lambda) = A + B\lambda + C\lambda^2 \Longrightarrow C = \frac{E(X_0 - \lambda D(X_0)) - E(X_0) + ||\nabla E||\,\lambda}{\lambda^2}. $$

The minimum of this quadratic occurs at λ* = −B/(2C) and this gives us our next iterate X₀ − λ* D(X₀).
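As a sketch of how a single quadratic line search step plays out numerically, here is the fit of A, B and C in Octave; the numbers echo the magnitudes printed in the line search run later in this section and are meant only as an illustration.

% One quadratic line-search step with illustrative numbers
EStart    = 635.38;    % E at the base point X0
EFullStep = 635.31;    % E at the trial point X0 - lambda*D(X0)
normgrad  = 595.35;    % ||grad E(X0)||
lambda    = 5.0e-4;    % trial step size
A = EStart;
B = -normgrad;
C = (EFullStep - A - B*lambda)/lambda^2;
lambdastar = -B/(2*C)  % minimizer of the quadratic model when C > 0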

15.1.1 Gradient Descent Implementation

Let's get started on how to find the optimal parameters numerically. Along the way, we will show you how hard this is. We start with a minimal implementation. We have already discussed some root finding codes in Chap. 14, so we have seen code somewhat like this before. But this will be a little different and it is good to have you see a bit about it. What is complicated here is that we have many functions that depend on the data we are trying to fit, so the number of functions depends on the size of the data set, which makes the code harder to set up.

Listing 15.1: Nonlinear LS For Diabetes Model: Version One


function [Error,G0,R,alpha,Omega,delta,normgrad,update] = DiabetesGradOne(Initial,lambda,maxiter,data)
%
% Initial = initial guess for G0, R, alpha, Omega, delta
% data    = collection of (time, glucose) pairs
% maxiter = maximum number of iterations to use
% lambda  = how much of the descent vector to use
%
% setup least squares error function
N = length(data);
E = 0.0;
Time = data(:,1);
G = data(:,2);
% Initial = [initial G, initial R, initial alpha, initial Omega, initial delta]
g = Initial(1); r = Initial(2); a = Initial(3); o = Initial(4); d = Initial(5);
f = @(t,equilG,r,a,o,d) equilG + r*exp(-a^2*t).*(cos(o*t-d));
E = DiabetesError(f,g,r,a,o,d,Time,G)
for i = 1:maxiter
  % calculate error
  E = DiabetesError(f,g,r,a,o,d,Time,G);
  if (i == 1)
    Error(1) = E;
  end
  Error = [Error; E];
  % find grad of E
  [gradE,normgrad] = ErrorGradient(Time,G,g,r,a,o,d);
  % find descent direction D
  if (normgrad > 1)
    Descent = gradE/normgrad;
  else
    Descent = gradE;
  end
  Del = [g;r;a;o;d] - lambda*Descent;
  g = Del(1); r = Del(2); a = Del(3); o = Del(4); d = Del(5);
end
G0 = g; R = r; alpha = a; Omega = o; delta = d;
E = DiabetesError(f,g,r,a,o,d,Time,G)
update = [G0;R;alpha;Omega;delta];
end

Note that inside this function, we call another function to calculate the gradient of the error and its norm. This is given below and implements the formulae we presented earlier for these partial derivatives.

Listing 15.2: The Error Gradient Function


function [gradE,normgrad] = ErrorGradient(Time,G,g,r,a,o,d)
%
% Time is the time data
% G is the glucose data
% g = current equilG
% r = current R
% a = current alpha
% o = current Omega
% d = current delta
%
% Calculate Error gradient
ferror = @(t,G,g,r,a,o,d) G - g - r*exp(-a^2*t).*(cos(o*t-d));
gerror = @(t,r,a,o,d) r*exp(-a^2*t).*(cos(o*t-d));
herror = @(t,r,a,o,d) r*exp(-a^2*t).*(sin(o*t-d));
N = length(Time);
sum = 0;
for k = 1:N
  sum = sum + ferror(Time(k),G(k),g,r,a,o,d);
end
pEpequilg = -2*sum;
sum = 0;
for k = 1:N
  sum = sum + ferror(Time(k),G(k),g,r,a,o,d)*gerror(Time(k),r,a,o,d)/r;
end
pEpR = -2*sum;
sum = 0;
for k = 1:N
  sum = sum + ferror(Time(k),G(k),g,r,a,o,d)*Time(k)*2*a*Time(k)*gerror(Time(k),r,a,o,d);
end
pEpa = 2*sum;
sum = 0;
for k = 1:N
  sum = sum + ferror(Time(k),G(k),g,r,a,o,d)*Time(k)*herror(Time(k),r,a,o,d);
end
pEpo = 2*sum;
sum = 0;
for k = 1:N
  sum = sum + ferror(Time(k),G(k),g,r,a,o,d)*herror(Time(k),r,a,o,d);
end
pEpd = -2*sum;
gradE = [pEpequilg; pEpR; pEpa; pEpo; pEpd];
normgrad = norm(gradE);
end

We also need code for the error calculations which is given here.
Listing 15.3: Diabetes Error Calculation
function E = DiabetesError(f,g,r,a,o,d,Time,G)
%
% Time = time values
% G = glucose values
% f = nonlinear insulin model
% g, r, a, o, d = parameters in diabetes nonlinear model
% N = size of data
N = length(G);
E = 0.0;
% calculate error function
for i = 1:N
  E = E + (G(i) - f(Time(i),g,r,a,o,d))^2;
end
end

Let’s look at some run time results using this code.



Listing 15.4: Run time results for gradient descent on the original data
Data = [0,95;1,180;2,155;3,140];
Time = Data(:,1);
G = Data(:,2);
f = @(t,equilG,r,a,o,d) equilG + r*exp(-a^2*t).*(cos(o*t-d));
time = linspace(0,3,41);
RInitial = 53.64;
GOInitial = 95 + RInitial;
AInitial = sqrt(log(17/5));
OInitial = pi;
dInitial = -pi;
Initial = [GOInitial;RInitial;AInitial;OInitial;dInitial];
[Error,G0,R,alpha,Omega,delta,normgrad,update] = DiabetesGrad(Initial,5.0e-4,20000,Data,0);
InitialE = 463.94
E = 376.40
octave:16> [Error,G0,R,alpha,Omega,delta,normgrad,update] = DiabetesGrad(Initial,5.0e-4,40000,Data,0);
InitialE = 463.94
E = 377.77
octave:145> [Error,G0,R,alpha,Omega,delta,normgrad,update] = DiabetesGrad(Initial,5.0e-4,100000,Data,0);
InitialE = 463.94
E = 377.77

After 100,000 iterations we still do not have a good fit. Note we start with a
small constant λ = 5.0e − 4 here. Try it yourself. If you let this value be larger,
the optimization spins out of control. Also, we have not said how we chose our
initial values. We actually looked at the data on a sheet of paper and did some rough
calculations to try for some decent values. We will leave that to you to figure out. If
the initial values are poorly chosen, gradient descent optimization is a great way to
generate really bad values! So be warned. You will have to exercise due diligence to
find a sweet starting spot.
We can see how we did by looking at the resulting curve fit in Fig. 15.1.

Fig. 15.1 The diabetes model curve fit: no line search

15.1.2 Adding Line Search

Now let’s add line search and see if it gets better. We will also try scaling the data
so all the variables in question are roughly the same size. For us, a good choice is
to scale the G 0 and the R value by 50, although we could try other choices. We
have already discussed line search for our problem, but here it is again in a quick
nutshell. If we are minimizing a function of M variables, say f(X), then if we are at the point X₀, we can look at the slice of this function as we move out from the base point X₀ in the direction of the negative gradient, ∇f(X₀) = ∇f₀. Define a function of the single variable ξ as g(ξ) = f(X₀ − ξ ∇f₀). Then, we can try to approximate g as a quadratic, g(ξ) ≈ A + Bξ + Cξ². Of course, the actual function might not be approximated nicely by such a quadratic, but it is worth a shot! Once we fit the parameters A, B and C, we see this quadratic model is minimized at λ* = −B/(2C). The code now adds the line search code, which is contained in the block below.

Listing 15.5: Line Search Code


function [lambdastar,DelOptimal] = LineSearch(Del,EStart,EFullStep,Base,Descent,normgrad,lambda)
%
% Del is the update due to the step lambda
% EStart is E at the base value
% EFullStep is E at the given lambda value
% Base is the current parameter vector
% Descent is the current descent vector
% normgrad is the norm of grad E at the base value
% lambda is the current step
%
% we have enough information to do a line search here
A = EStart;
BCheck = -normgrad;
C = (EFullStep - A - BCheck*lambda)/(lambda^2);
lambdastar = -BCheck/(2*C);
if (C < 0 || lambdastar < 0)
  % we are going to a maximum on the line search; reject
  DelOptimal = Del;
else
  % we have a minimum on the line search
  DelOptimal = Base - lambdastar*Descent;
end
end

Listing 15.6: Nonlinear LS For Diabetes Model


function [Error,G0,R,alpha,Omega,delta,normgrad,update] = DiabetesGrad(Initial,lambda,maxiter,data,dolinesearch)
%
% Initial = initial guess for G0, R, alpha, Omega, delta
% data    = collection of (time, glucose) pairs
% maxiter = maximum number of iterations to use
% lambda  = how much of the descent vector to use
% dolinesearch = 1 to use the quadratic line search, 0 to skip it
%
% setup least squares error function
N = length(data);
E = 0.0;
Time = data(:,1);
G = data(:,2);
% Initial = [initial G, initial R, initial alpha, initial Omega, initial delta]
g = Initial(1);
r = Initial(2);
a = Initial(3);
o = Initial(4);
d = Initial(5);
f = @(t,equilG,r,a,o,d) equilG + r*exp(-a^2*t).*(cos(o*t-d));
InitialE = DiabetesError(f,g,r,a,o,d,Time,G)
for i = 1:maxiter
  % calculate error
  E = DiabetesError(f,g,r,a,o,d,Time,G);
  if (i == 1)
    Error(1) = E;
  end
  Error = [Error; E];
  % Calculate Error gradient
  [gradE,normgrad] = ErrorGradient(Time,G,g,r,a,o,d);
  % find descent direction
  if (normgrad > 1)
    Descent = gradE/normgrad;
  else
    Descent = gradE;
  end
  Base = [g;r;a;o;d];
  Del = Base - lambda*Descent;
  newg = Del(1); newr = Del(2); newa = Del(3); newo = Del(4); newd = Del(5);
  EStart = E;
  EFullStep = DiabetesError(f,newg,newr,newa,newo,newd,Time,G);
  if (dolinesearch == 1)
    [lambdastar,DelOptimal] = LineSearch(Del,EStart,EFullStep,Base,Descent,normgrad,lambda);
    g = DelOptimal(1); r = DelOptimal(2); a = DelOptimal(3); o = DelOptimal(4); d = DelOptimal(5);
    EOptimalStep = DiabetesError(f,g,r,a,o,d,Time,G);
  else
    g = Del(1); r = Del(2); a = Del(3); o = Del(4); d = Del(5);
  end
end
G0 = g; R = r; alpha = a; Omega = o; delta = d;
E = DiabetesError(f,g,r,a,o,d,Time,G)
update = [G0;R;alpha;Omega;delta];
end

We can see how this is working by letting some of the temporary calculations print. Here are two iterations of line search printing out A, B and C and the relevant error values. Our initial values don't matter much here as we are just checking out the line search algorithm.

Listing 15.7: Some Details of the Line Search


Data = [0,95;1,180;2,155;3,140];
Time = Data(:,1);
G = Data(:,2);
f = @(t,equilG,r,a,o,d) equilG + r*exp(-a^2*t).*(cos(o*t-d));
time = linspace(0,3,41);
RInitial = 85*17/21;
GOInitial = 95 + RInitial;
AInitial = sqrt(log(17/4));
OInitial = pi;
dInitial = -pi;
Initial = [GOInitial;RInitial;AInitial;OInitial;dInitial];
[Error,G0,R,alpha,Omega,delta,normgrad,update] = DiabetesGrad(Initial,5.0e-4,2,Data,1);
InitialE = 635.38
EFullStep = 635.31
A = 635.38
BCheck = -595.35
C = 9.1002e+05
lambdastar = 3.2711e-04
EFullStep = 635.27
A = 635.33
BCheck = -592.34
C = 9.0725e+05
lambdastar = 3.2645e-04
E = 635.29

Now let's remove those prints and let it run for a while. We are using the original data here to try to find the fit.

Listing 15.8: The Full Run with Line Search


octave:90> [Error,G0,R,alpha,Omega,delta,normgrad,update] = DiabetesGrad(Initial,5.0e-4,20000,Data,1);
InitialE = 635.38
E = 389.07
octave:93> [Error,G0,R,alpha,Omega,delta,normgrad,update] = DiabetesGrad(Initial,5.0e-4,30000,Data,1);
InitialE = 635.38
E = 0.067171

We have success! The line search got the job done in 30,000 iterations while the attempt using just gradient descent without line search failed. But remember, we do additional processing at each step. We show the resulting curve fit in Fig. 15.2.
The qualitative look of this fit is a bit different. We leave it to you to think about how we are supposed to choose which fit is better; i.e. which fit is a better one to use for the biological reality we are trying to model? This is a really hard question. Finally, the optimal values of the parameters are

Fig. 15.2 The diabetes model curve fit with line search on unscaled data

Listing 15.9: Optimal Parameter Values


octave:14> update
update =

   154.37240
    62.73116
     0.57076
     3.77081
    -3.47707

The critical value here is T₀ = 2π/Ω = 1.6663, which is less than 4, so this patient is normal!
Also, note these sorts of optimizations are very frustrating. If we use the scaled version of the first Initial = [GOInitial;RInitial;AInitial;OInitial;dInitial]; we make no progress even though we run for 60,000 iterations, and these iterations are a bit more expensive because we use line search. So let's perturb the starting point a bit and see what happens.

Listing 15.10: Runtime Results


% use scaled data
Data = [0,95/50;1,180/50;2,155/50;3,140/50];
Time = Data(:,1);
G = Data(:,2);
RInitial = 53.64/50;
GOInitial = 95/50 + RInitial;
AInitial = sqrt(log(17/5));
OInitial = pi;
dInitial = -pi;
% perturb the start point a bit
Initial = [GOInitial;.7*RInitial;AInitial;1.1*OInitial;.9*dInitial];
[Error,G0,R,alpha,Omega,delta,normgrad,update] = DiabetesGrad(Initial,5.0e-4,40000,Data,1);
InitialE = 0.36485
E = 1.5064e-06

Fig. 15.3 The diabetes model curve fit with line search on scaled data

We have success! We see the fit to the data in Fig. 15.3.


The moral here is that it is quite difficult to automate our investigations. This is
why truly professional optimization code is so complicated. In our problem here,
we have a hard time finding a good starting point and we even find that scaling the
data—which seems like a good idea—is not as helpful as we thought it would be.
Breathe a deep sigh and accept this as our lot!



Part VI
Series Solutions to PDE Models
Chapter 16
Series Solutions

Let’s look at another tool we can use to solve models. Sometimes our models involve
partial derivatives instead of normal derivatives; we call such models partial differ-
ential equation models or PDE models. They are pretty important and you should
have a beginning understanding of them. Let’s get started with a common tool called
the Separation of Variables Method.

16.1 The Separation of Variables Method

A common PDE model is the general cable model, which is given below in fairly abstract form:

$$ \beta^2 \frac{\partial^2 \Phi}{\partial x^2} - \Phi - \alpha \frac{\partial \Phi}{\partial t} = 0, \quad \text{for } 0 \le x \le L,\; t \ge 0, $$
$$ \frac{\partial \Phi}{\partial x}(0, t) = 0, \qquad \frac{\partial \Phi}{\partial x}(L, t) = 0, \qquad \Phi(x, 0) = f(x), $$

for positive constants α and β. The domain is the usual half infinite [0, L] × [0, ∞)
where the spatial part of the domain corresponds to the length of the dendritic cable
in an excitable nerve cell. We won’t worry too much about the details of where
this model comes from as we will discuss that in another volume. The boundary
conditions Φ_x(0, t) = 0 and Φ_x(L, t) = 0 are called Neumann Boundary conditions. The conditions Φ(0, t) = 0 and Φ(L, t) = 0 are known as Dirichlet Boundary conditions. The solution to a model such as this is a function Φ(x, t) which is
sufficiently smooth to have partial derivatives with respect to the needed variables
continuous for all the orders required. For these problems, the highest order we need


is the second order partials. One way to find the solution is to assume we can separate the variables so that we can write Φ(x, t) = u(x)w(t). If we make this separation assumption, we will find solutions that must be written as what are called infinite series and, to solve the boundary conditions, we will have to be able to express boundary functions as series expansions. Hence, we will have to introduce some new ideas in order to understand these things. Let's motivate what we need to do by applying the separation of variables technique to the cable equation. This will show the ideas we need to use in a specific example. Then we will step back and go over the new mathematical ideas of series and then return to the cable model and finish finding the solution.
We assume a solution of the form Φ(x, t) = u(x) w(t) and compute the needed partials. This leads to the new equation

$$ \beta^2 \frac{d^2u}{dx^2}\, w(t) - u(x)w(t) - \alpha\, u(x)\frac{dw}{dt} = 0. $$

Rewriting, we find for all x and t, we must have

$$ w(t)\left(\beta^2\frac{d^2u}{dx^2} - u(x)\right) = \alpha\, u(x)\frac{dw}{dt}. $$

This tells us

$$ \frac{\beta^2\frac{d^2u}{dx^2} - u(x)}{u(x)} = \frac{\alpha\frac{dw}{dt}}{w(t)}, \quad 0 \le x \le L,\; t > 0. $$

The only way this can be true is if both the left and right hand side are equal to a constant that is usually called the separation constant Λ. This leads to the decoupled Eqs. 16.1 and 16.2:

$$ \alpha\frac{dw}{dt} = \Lambda\, w(t), \quad t > 0, \qquad (16.1) $$
$$ \beta^2\frac{d^2u}{dx^2} = (1 + \Lambda)\, u(x), \quad 0 \le x \le L. \qquad (16.2) $$
We also have boundary conditions. Our assumption leads to the following boundary
conditions in x:
$$ \frac{du}{dx}(0)\, w(t) = 0, \quad t > 0, $$
$$ \frac{du}{dx}(L)\, w(t) = 0, \quad t > 0. $$

Since these equations must hold for all t, this forces

$$ \frac{du}{dx}(0) = 0, \qquad (16.3) $$
$$ \frac{du}{dx}(L) = 0. \qquad (16.4) $$

Equations 16.1–16.4 give us the boundary value problem in u(x) we need to solve.
Then, we can find w.

16.1.1 Determining the Separation Constant

The model is then

$$ u'' - \frac{1+\Lambda}{\beta^2}\, u = 0, \qquad \frac{du}{dx}(0) = 0, \qquad \frac{du}{dx}(L) = 0. $$

We are looking for nonzero solutions, so any choice of separation constant Λ that leads to a zero solution will be rejected.

16.1.1.1 Case I: 1 + Λ = ω², ω ≠ 0

The model to solve is

$$ u'' - \frac{\omega^2}{\beta^2}\, u = 0, \qquad u'(0) = 0, \qquad u'(L) = 0, $$

with characteristic equation r² − ω²/β² = 0, which has the real roots ±ω/β. The general solution of this second order model is given by

$$ u(x) = A\cosh\left(\frac{\omega}{\beta}x\right) + B\sinh\left(\frac{\omega}{\beta}x\right), $$

which tells us

$$ u'(x) = A\frac{\omega}{\beta}\sinh\left(\frac{\omega}{\beta}x\right) + B\frac{\omega}{\beta}\cosh\left(\frac{\omega}{\beta}x\right). $$

Next, apply the boundary conditions u'(0) = 0 and u'(L) = 0. Hence,

$$ u'(0) = 0 = B\frac{\omega}{\beta}, \qquad u'(L) = 0 = A\frac{\omega}{\beta}\sinh\left(\frac{\omega}{\beta}L\right). $$

Hence, B = 0 and $A\sinh\left(\frac{\omega}{\beta}L\right) = 0$. Since sinh is never zero when ω is not zero, we see A = 0 also. Hence, the only u solution is the trivial one and we can reject this case.

16.1.1.2 Case II: 1 + Λ = 0

The model to solve is now

$$ u'' = 0, \qquad u'(0) = 0, \qquad u'(L) = 0, $$

with characteristic equation r² = 0 with the double root r = 0. Hence, the general solution is now

$$ u(x) = A + Bx. $$

Applying the boundary conditions u'(0) = 0 and u'(L) = 0 and noting u'(x) = B, we have

$$ u'(0) = 0 = B, \qquad u'(L) = 0 = B. $$

Hence, B = 0 but the value of A can't be determined. Hence, any arbitrary constant which is not zero is a valid nonzero solution. Choosing A = 1, let u₀(x) = 1 be our chosen nonzero solution for this case.
We now need to solve for w in this case. Since Λ = −1, the model to solve is

$$ \frac{dw}{dt} = -\frac{1}{\alpha}\, w(t), \quad t > 0. $$

The general solution is $w(t) = Ce^{-\frac{1}{\alpha}t}$ for any value of C. Choose C = 1 and we set

$$ w_0(t) = e^{-\frac{1}{\alpha}t}. $$

Hence, the product φ₀(x, t) = u₀(x) w₀(t) solves the boundary conditions. That is,

$$ \phi_0(x, t) = e^{-\frac{1}{\alpha}t} $$

is a solution.

16.1.1.3 Case III: 1 + Λ = −ω², ω ≠ 0

$$ u'' + \frac{\omega^2}{\beta^2}\, u = 0, \qquad u'(0) = 0, \qquad u'(L) = 0. $$

The general solution is given by

$$ u(x) = A\cos\left(\frac{\omega}{\beta}x\right) + B\sin\left(\frac{\omega}{\beta}x\right) $$

and hence

$$ u'(x) = -A\frac{\omega}{\beta}\sin\left(\frac{\omega}{\beta}x\right) + B\frac{\omega}{\beta}\cos\left(\frac{\omega}{\beta}x\right). $$

Next, apply the boundary conditions to find

$$ u'(0) = 0 = B\frac{\omega}{\beta}, \qquad u'(L) = 0 = -A\frac{\omega}{\beta}\sin\left(\frac{\omega}{\beta}L\right). $$

Hence, B = 0 and $A\sin\left(\frac{\omega}{\beta}L\right) = 0$. Thus, we can determine a unique value of A only if $\sin\left(\frac{\omega}{\beta}L\right) \neq 0$. If ω ≠ nπβ/L, we can solve for A and find A = 0, but otherwise A can't be determined. So the only solutions are the trivial or zero solutions unless ωL = nπβ. Letting ω_n = nπβ/L, we find a nonzero solution for each nonzero value of A of the form

$$ u_n(x) = A\cos\left(\frac{\omega_n}{\beta}x\right) = A\cos\left(\frac{n\pi}{L}x\right). $$

For convenience, let's choose all the constants A = 1. Then we have an infinite family of nonzero solutions $u_n(x) = \cos\left(\frac{n\pi}{L}x\right)$ and an infinite family of separation constants $\Lambda_n = -1 - \omega_n^2 = -1 - \frac{n^2\pi^2\beta^2}{L^2}$.

We can then solve the w equation. We must solve

$$ \frac{dw}{dt} = -\frac{1+\omega_n^2}{\alpha}\, w(t), \quad t \ge 0. $$

The general solution is

$$ w(t) = B_n\, e^{-\frac{1+\omega_n^2}{\alpha}t} = B_n\, e^{-\frac{1+n^2\pi^2\beta^2/L^2}{\alpha}t}. $$

Choosing the constants B_n = 1, we obtain the w_n functions

$$ w_n(t) = e^{-\frac{1+n^2\pi^2\beta^2/L^2}{\alpha}t}. $$

Hence, any product

$$ \phi_n(x, t) = u_n(x)\, w_n(t) $$

will solve the model with the x boundary conditions, and any finite sum of the form, for arbitrary constants A_n,

$$ \Phi_N(x, t) = \sum_{n=1}^{N} A_n\,\phi_n(x, t) = \sum_{n=1}^{N} A_n\, u_n(x)\, w_n(t) = \sum_{n=1}^{N} A_n\cos\left(\frac{n\pi}{L}x\right) e^{-\frac{1+n^2\pi^2\beta^2/L^2}{\alpha}t}. $$

Adding in the 1 + Λ = 0 case, we find the most general finite term solution has the form

$$ \Phi_N(x, t) = A_0\,\phi_0(x, t) + \sum_{n=1}^{N} A_n\,\phi_n(x, t) = A_0\, u_0(x) w_0(t) + \sum_{n=1}^{N} A_n\, u_n(x)\, w_n(t) $$
$$ \phantom{\Phi_N(x, t)} = A_0\, e^{-\frac{1}{\alpha}t} + \sum_{n=1}^{N} A_n\cos\left(\frac{n\pi}{L}x\right) e^{-\frac{1+n^2\pi^2\beta^2/L^2}{\alpha}t}. $$

Now these finite term solutions do solve the boundary conditions $\frac{\partial \Phi}{\partial x}(0, t) = 0$ and $\frac{\partial \Phi}{\partial x}(L, t) = 0$, but how do we solve the remaining condition Φ(x, 0) = f(x)? To do this, we note that since we can assemble the finite term solutions for any value of N, no matter how large, it is clear we should let N → ∞ and express the solution as


 ∞

(x, t) = A0 φ0 (x, t) + An φn (x, t) = A0 u 0 (x)w0 (t) + An u n (x) wn (t)
n=1 n=1

  
nπ 1+n 2 π 2 β 2
= A0 e− α t + x e− αL 2 t .
1
An cos
n=1
L

This is the form that will let us solve the remaining boundary condition. We need to
step back now and talk more about this idea of a series solution to our model.

16.2 Infinite Series

Let's look at sequences of functions made up of building blocks of the form $u_n(x) = \cos\left(\frac{n\pi}{L}x\right)$ or $v_n(x) = \sin\left(\frac{n\pi}{L}x\right)$ for various values of the integer n. The number L is a fixed value here. We can combine these functions into finite sums: let U_N(x) and V_N(x) be defined as follows:

$$ U_N(x) = \sum_{n=1}^{N} a_n \sin\left(\frac{n\pi}{L}x\right) $$

and

$$ V_N(x) = b_0 + \sum_{n=1}^{N} b_n \cos\left(\frac{n\pi}{L}x\right). $$

If we fix the value of x to be, say, x₀, the collections of numbers

{U₁(x₀), U₂(x₀), U₃(x₀), . . . , U_n(x₀), . . . }

and

{V₀(x₀), V₁(x₀), V₂(x₀), V₃(x₀), . . . , V_n(x₀), . . . }

are the partial sums formed from the sequences of cosine and sine numbers. However, the underlying sequences can be negative, so these are not sequences of non negative terms like we previously discussed. These sequences of partial sums may or may not have a finite supremum value. Nevertheless, we still represent the supremum using the same notation: i.e. the supremum of $\left(U_i(x_0)\right)_{i=1}^{\infty}$ and the supremum of $\left(V_i(x_0)\right)_{i=0}^{\infty}$ can be written as $\sum_{n=1}^{\infty} a_n\sin\left(\frac{n\pi}{L}x_0\right)$ and $b_0 + \sum_{n=1}^{\infty} b_n\cos\left(\frac{n\pi}{L}x_0\right)$.

Let's consider the sequence

{U₁(x₀), U₂(x₀), U₃(x₀), . . . , U_n(x₀), . . . }.

This sequence of real numbers converges to a possibly different number for each x₀; hence, let's call this possible limit S(x₀). Now the limit may not exist, of course. We will write $\lim_{n\to\infty} U_n(x_0) = S(x_0)$ when the limit exists. If the limit does not exist for some value of x₀, we will understand that the value S(x₀) is not defined in some way. Note, from our discussion above, this could mean the limiting value flips between a finite set of possibilities, the limit approaches ∞ or the limit approaches −∞. In any case, the value S(x₀) is not defined as a finite value. We would say convergence precisely as follows: given any positive tolerance ε, there is a positive integer N so that

$$ n > N \Longrightarrow \left|\sum_{i=1}^{n} a_i \sin\left(\frac{i\pi}{L}x_0\right) - S(x_0)\right| < \epsilon. $$

We use the notation of the previous section and write this as

$$ \lim_{n\to\infty} \sum_{i=1}^{n} a_i \sin\left(\frac{i\pi}{L}x_0\right) = S(x_0), $$

with the limiting value S(x₀) written as

$$ S(x_0) = \sum_{i=1}^{\infty} a_i \sin\left(\frac{i\pi}{L}x_0\right). $$

As before, this symbol is called an infinite series and we see we get a potentially different series at each point x₀. The error term S(x₀) − U_n(x₀) is then written as

$$ S(x_0) - \sum_{i=1}^{n} a_i \sin\left(\frac{i\pi}{L}x_0\right) = \sum_{i=n+1}^{\infty} a_i \sin\left(\frac{i\pi}{L}x_0\right), $$

which you must remember is just a short hand for this error.
Now that we have an infinite series notation defined, we note the term U_n(x₀), which is the sum of n terms, is also called the nth partial sum of the series $\sum_{i=1}^{\infty} a_i \sin\left(\frac{i\pi}{L}x_0\right)$. Note we can define the convergence at a point x₀ for the partial

sums of the cos functions in a similar manner.



16.3 Independent Objects

Let's go back and think about vectors in ℝ². As you know, we think of these as arrows with a tail fixed at the origin of the two dimensional coordinate system we call the x–y plane. They also have a length or magnitude and this arrow makes an angle with the positive x axis. Suppose we look at two such vectors, E and F. Each vector has an x and a y component so that we can write

$$ E = \begin{bmatrix} a \\ b \end{bmatrix}, \qquad F = \begin{bmatrix} c \\ d \end{bmatrix}. $$

The cosine of the angle between them is proportional to the inner product ⟨E, F⟩ = ac + bd. If this angle is 0 or π, the two vectors lie along the same line. In any case, the angle associated with E is tan⁻¹(b/a) and for F, tan⁻¹(d/c).
Hence, if the two vectors lie on the same line, E must be a multiple of F. This means
there is a number β so that

E = β F.

We can rewrite this as

$$ \begin{bmatrix} a \\ b \end{bmatrix} = \beta \begin{bmatrix} c \\ d \end{bmatrix}, \quad\text{that is,}\quad 1\cdot E - \beta\, F = 0. $$

Now let the number 1 in front of E be called −α. Then the fact that E and F lie on
the same line implies there are 2 constants α and β, both not zero, so that

α E + β F = 0.

Note we could argue this way for vectors in ℝ³ and even in ℝⁿ. Of course, our ability
to think of these things in terms of lying on the same line and so forth needs to be
extended to situations we can no longer draw, but the idea is essentially the same.
Instead of thinking of our two vectors as lying on the same line or not, we can rethink
what is happening here and try to identify what is happening in a more abstract
way. If our two vectors lie on the same line, they are not independent things in the
sense one is a multiple of the other. As we saw above, this implies there was a linear
equation connecting the two vectors which had to add up to 0. Hence, we might say
the vectors were not linearly independent or simply, they are linearly dependent.
Phrased this way, we are on to a way of stating this idea which can be used in many
more situations. We state this as a definition.

Definition 16.3.1 (Two Linearly Independent Objects)


Let E and F be two mathematical objects for which addition and scalar multiplication
is defined. We say E and F are linearly dependent if we can find non zero constants
α and β so that

α E + β F = 0.

Otherwise, we say they are linearly independent.


We can then easily extend this idea to any finite collection of such objects as follows.
Definition 16.3.2 (Finitely many Linearly Independent Objects)
Let {E_i : 1 ≤ i ≤ N} be N mathematical objects for which addition and scalar multiplication is defined. We say E₁, . . . , E_N are linearly dependent if we can find constants α₁ to α_N, not all 0, so that

α₁ E₁ + · · · + α_N E_N = 0.

Otherwise, we say they are linearly independent.

Note we have changed the way we define the constants a bit. When there are more
than two objects involved, we can’t say, in general, that all of the constants must be
non zero.

16.3.1 Independent Functions

Now let’s apply these ideas to functions f and g defined on some interval I . By this
we mean either
• I is all of ℝ, i.e. a = −∞ and b = ∞,
• I is half-infinite. This means a = −∞ and b is finite with I of the form (−∞, b) or (−∞, b]. Similarly, I could have the form (a, ∞) or [a, ∞),
• I is an interval of the form (a, b), [a, b), (a, b] or [a, b] for finite a < b.
We would say f and g are linearly independent on the interval I if the equation

α1 f (t) + α2 g(t) = 0, for all t ∈ I.

implies α1 and α2 must both be zero. Here is an example. The functions sin(t) and
cos(t) are linearly independent on ℝ because

α1 cos(t) + α2 sin(t) = 0, for all t,

also implies the above equation holds for the derivative of both sides giving

−α1 sin(t) + α2 cos(t) = 0, for all t,



This can be written as the system

$$ \begin{bmatrix} \cos(t) & \sin(t) \\ -\sin(t) & \cos(t) \end{bmatrix}\begin{bmatrix} \alpha_1 \\ \alpha_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} $$

for all t. The determinant of the matrix here is cos²(t) + sin²(t) = 1 and so, picking any t we like, we find the unique solution is α₁ = α₂ = 0. Hence, these two functions are linearly independent on ℝ. In fact, they are linearly independent on any interval I.
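A quick numerical check of this argument, offered as a sketch: evaluate the determinant above, the Wronskian of cos and sin, at a few points in Octave and confirm it never vanishes.

% Wronskian of f = cos and g = sin evaluated at several points
f  = @(t) cos(t);  fp = @(t) -sin(t);
g  = @(t) sin(t);  gp = @(t) cos(t);
W  = @(t) f(t).*gp(t) - fp(t).*g(t);   % determinant of [f g; f' g']
W(linspace(-10,10,5))                  % identically 1, so never zero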
This leads to another important idea. Suppose f and g are linearly independent differentiable functions on an interval I. Then, we know the system

$$ \begin{bmatrix} f(t) & g(t) \\ f'(t) & g'(t) \end{bmatrix}\begin{bmatrix} \alpha_1 \\ \alpha_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} $$

only has the unique solution α₁ = α₂ = 0 for all t in I. This tells us

$$ \det\begin{bmatrix} f(t) & g(t) \\ f'(t) & g'(t) \end{bmatrix} \neq 0 $$

for all t in I. This determinant comes up a lot and it is called the Wronskian of the two functions f and g and it is denoted by the symbol W(f, g). Hence, we have the implication: if f and g are linearly independent differentiable functions, then W(f, g) ≠ 0 for all t in I. What about the converse? If the Wronskian is never zero
    
f (t) g(t) α1 0
=
f  (t) g  (t) α2 0

must have the unique solution α1 = α2 = 0 at each t in I also. So the converse is


true: if the Wronskian is not zero on I , then the differentiable functions f and g are
linearly independent on I . We can state this formally as a theorem.

Theorem 16.3.1 (Two Functions are Linearly Independent if and only if their Wronskian is not zero) If f and g are differentiable functions on I, the Wronskian of f and g is defined to be

$$ W(f, g) = \det\begin{bmatrix} f(t) & g(t) \\ f'(t) & g'(t) \end{bmatrix}, $$

where W(f, g) is the symbol for the Wronskian of f and g. Sometimes, this is just written as W, if the context is clear. Then f and g are linearly independent on I if and only if W(f, g) is non zero on I.

Proof See the discussions above. 

If f , g and h are twice differentiable on I , the Wronskian uses a third row of second
derivatives and the statement that these three functions are linearly independent on
I if and only if their Wronskian is non zero on I is proved essentially the same way.
The appropriate theorem is

Theorem 16.3.2 (Three Functions are Linearly Independent if and only if their Wronskian is not zero) If f, g and h are twice differentiable functions on I, the Wronskian of f, g and h is defined to be

$$ W(f, g, h) = \det\begin{bmatrix} f(t) & g(t) & h(t) \\ f'(t) & g'(t) & h'(t) \\ f''(t) & g''(t) & h''(t) \end{bmatrix}, $$

where W(f, g, h) is the symbol for the Wronskian of f, g and h. Then f, g and h are linearly independent on I if and only if W(f, g, h) is non zero on I.

Proof The arguments are similar, although messier. 

For example, to show the three functions f(t) = t, g(t) = sin(t) and h(t) = e²ᵗ are linearly independent on ℝ, we could form their Wronskian

$$ W(f, g, h) = \det\begin{bmatrix} t & \sin(t) & e^{2t} \\ 1 & \cos(t) & 2e^{2t} \\ 0 & -\sin(t) & 4e^{2t} \end{bmatrix} = t\det\begin{bmatrix} \cos(t) & 2e^{2t} \\ -\sin(t) & 4e^{2t} \end{bmatrix} - \det\begin{bmatrix} \sin(t) & e^{2t} \\ -\sin(t) & 4e^{2t} \end{bmatrix} $$
$$ = t\, e^{2t}\big(4\cos(t) + 2\sin(t)\big) - e^{2t}\big(4\sin(t) + \sin(t)\big) = e^{2t}\big(4t\cos(t) + 2t\sin(t) - 5\sin(t)\big). $$

Since e²ᵗ is never zero, the question becomes: is

$$ 4t\cos(t) + 2t\sin(t) - 5\sin(t) $$

zero for all t? If so, that would mean the functions t sin(t), t cos(t) and sin(t) are linearly dependent. We could then form another Wronskian for these functions, which would be rather messy. To see these three new functions are linearly independent, it is easier to just pick three points t from ℝ and solve the resulting linear dependence equations. Since t = 0 does not give any information, let's try t = −π, t = π/4 and t = π/2. This gives the system

$$ \begin{bmatrix} -4\pi & 0 & 0 \\ \frac{\pi\sqrt{2}}{2} & \frac{\pi\sqrt{2}}{4} & -\frac{5\sqrt{2}}{2} \\ 0 & \pi & -5 \end{bmatrix}\begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \alpha_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} $$

in the unknowns α1 , α2 and α3 . We see√


immediately α − 1 = 0 and the remaining
two by two system has determinant 2 (−10 π4 ) + 10 π2 = 0. Hence, α2 = α3 =
2

0 too. This shows t sin(t), t cos(t) and sin(t) are linearly independent and show
the line 4t cos(t) + 2t sin(t) − 5 sin(t) is not zero for all t. Hence, the functions
f (t) = t, g(t) = sin(t) and h(t) = e2t are linearly independent. As you can see,
these calculations become messy quickly. Usually, the Wronskian approach for more
than two functions is too hard and we use the “pick three suitable points ti , from I”
approach and solve the resulting linear system. If we can show the solution is always
0, then the functions are linearly independent.
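Here is a sketch of the pick three points test carried out in Octave for t sin(t), t cos(t) and sin(t); if the 3 × 3 matrix of sampled values has full rank, the dependence equations only have the zero solution.

% "Pick three points" independence test for t*sin(t), t*cos(t) and sin(t)
f1 = @(t) t.*sin(t);  f2 = @(t) t.*cos(t);  f3 = @(t) sin(t);
tpts = [-pi; pi/4; pi/2];              % the three sample points
M = [f1(tpts), f2(tpts), f3(tpts)];    % each row is one dependence equation
rank(M)                                % rank 3, so alpha1 = alpha2 = alpha3 = 0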

16.3.2 Homework

Exercise 16.3.1 Prove e^t and e^{-t} are linearly independent on ℝ.

Exercise 16.3.2 Prove e^t and e^{2t} are linearly independent on ℝ.

Exercise 16.3.3 Prove f(t) = 1 and g(t) = t² are linearly independent on ℝ.

Exercise 16.3.4 Prove e^t, e^{2t} and e^{3t} are linearly independent on ℝ. Use the pick three points approach here.

Exercise 16.3.5 Prove sin(t), sin(2t) and sin(3t) are linearly independent on ℝ. Use the pick three points approach here.

Exercise 16.3.6 Prove 1, t and t² are linearly independent on ℝ. Use the pick three points approach here.

16.4 Vector Spaces and Basis

We can make the ideas we have been talking about more formal. If we have a set
of objects u with a way to add them to create new objects in the set and a way to
scale them to make new objects, this is formally called a Vector Space with the set
denoted by V . For our purposes, we scale such objects with either real or complex
numbers. If the scalars are real numbers, we say V is a vector space over the reals;
otherwise, it is a vector space over the complex field.
Definition 16.4.1 (Vector Space) Let V be a set of objects u with an additive operation ⊕ and a scaling method ⊙. Formally, this means
1. Given any u and v, the operation of adding them together is written u ⊕ v and
results in the creation of a new object w in the vector space. This operation is
commutative which means the order of the operation is not important; so u ⊕ v
and v ⊕ u give the same result. Also, this operation is associative as we can group
any two objects together first, perform this addition ⊕ and then do the others and
the order of the grouping does not matter.

2. Given any u and any number c (either real or complex, depending on the type of vector space we have), the operation c ⊙ u creates a new object. We call such numbers scalars.
3. The scaling and additive operations are nicely compatible in the sense that order and grouping is not important. These are called the distributive laws for scaling and addition. They are

   c ⊙ (u ⊕ v) = (c ⊙ u) ⊕ (c ⊙ v),
   (c + d) ⊙ u = (c ⊙ u) ⊕ (d ⊙ u).

4. There is a special object called o which functions as a zero so we always have


o ⊕ u = u ⊕ o = u.
5. There are additive inverses which means to each u there is a unique object u† so
that u ⊕ u† = o.
Comment 16.4.1 These laws imply

(0 + 0) ⊙ u = (0 ⊙ u) ⊕ (0 ⊙ u),

which tells us 0 ⊙ u = o. A little further thought then tells us that since

o = (1 − 1) ⊙ u = (1 ⊙ u) ⊕ (−1 ⊙ u),

we have the additive inverse u† = −1 ⊙ u.


Comment 16.4.2 We usually say this much simpler. The set of objects V is a vector
space over its scalar field if there are two operations which we denote by u + v and
cu which generate new objects in the vector space for any u, v and scalar c. We
then just add that these operations satisfy the usual commutative, associative and
distributive laws and there are unique additive inverses.
Comment 16.4.3 The objects are often called vectors and sometimes we denote
them by u although this notation is often too cumbersome.
Comment 16.4.4 To give examples of vector spaces, it is usually enough to specify
how the additive and scaling operations are done.
• Vectors in ℝ², ℝ³ and so forth are added and scaled by components.
• Matrices of the same size are added and scaled by components.
• A set of functions of similar characteristics uses as its additive operator,
pointwise addition. The new function ( f ⊕ g) is defined pointwise by ( f ⊕g)(t) =
f (t) + g(t). Similarly, the new function c f is defined by c f is the function
whose value at t is (c f )(t) = c f (t). Classic examples are
1. C[a, b] is the set of all functions whose domain is [a, b] that are continuous on
the domain.

2. C 1 [a, b] is the set of all functions whose domain is [a, b] that are continuously
differentiable on the domain.
3. R[a, b] is the set of all functions whose domain is [a, b] that are Riemann inte-
grable on the domain.
There are many more, of course.
Vector spaces have two other important ideas associated with them. We have already
talked about linearly independent objects. Clearly, the kinds of objects we were
focusing on were from some vector space V . The first idea is that of the span of a set.
Definition 16.4.2 (The Span Of A Set Of Vectors) Given a finite set of vectors in a vector space V, W = {u₁, . . . , u_N} for some positive integer N, the span of W is the collection of all new vectors of the form $\sum_{i=1}^{N} c_i u_i$ for any choices of scalars c₁, . . . , c_N. It is easy to see the span of W is a vector space itself and, since it is a subset of V, we call it a vector subspace. The span of the set W is denoted by Sp W. If the set of vectors W is not finite, the definition is similar but we say the span of W is the set of all vectors which can be written as $\sum_{i=1}^{N} c_i u_i$ for some finite set of vectors u₁, . . . , u_N from W.
Then there is the notion of a basis for a vector space. First, we need to extend the
idea of linear independence to sets that are not necessarily finite.
Definition 16.4.3 (Linear Independence For Non Finite Sets) Given a set of vectors
in a vector space V , W , we say W is a linearly independent subset if every finite set
of vectors from W is linearly independent in the usual manner.
Definition 16.4.4 (A Basis For A Vector Space) Given a set of vectors in a vector
space V , W , we say W is a basis for V if the span of W is all of V and if the vectors
in W are linearly independent. Hence, a basis is a linearly independent spanning set
for V. The number of vectors in W is called the dimension of V. If W is not finite in size, then we say V is an infinite dimensional vector space.
Comment 16.4.5 In a vector space like ℝⁿ, the maximum size of a set of linearly
independent vectors is n, the dimension of the vector space.
Comment 16.4.6 Let’s look at the vector space C[0, 1], the set of all continuous
functions on [0, 1]. Let W be the set of all powers of t, {1, t, t 2 , t 3 , . . .}. We can
use the derivative technique to show this set is linearly independent even though
it is infinite in size. Take any finite subset from W . Label the resulting powers as
{n 1 , n 2 , . . . , n p }. Write down the linear dependence equation

$$ c_1 t^{n_1} + c_2 t^{n_2} + \cdots + c_p t^{n_p} = 0. $$

Take n p derivatives to find c p = 0 and then backtrack to find the other constants
are zero also. Hence C[0, 1] is an infinite dimensional vector space. It is also clear
that W does not span C[0, 1] as if this was true, every continuous function on [0, 1]
would be a polynomial of some finite degree. This is not true as sin(t), e−2t and many
others are not finite degree polynomials.

16.4.1 Inner Products in Function Spaces

Now there is an important result that we use a lot in applied work. If we have an object u in a Vector Space V, we often want to approximate u using an element from a given subspace W of the vector space. To do this, we need to add another property
to the vector space. This is the notion of an inner product. We already know what
an inner product is in a simple vector space like n . Many vector spaces can have
an inner product structure added easily. For example, in C[a, b], since each object is
continuous, each object is Riemann integrable. Hence, given two functions f and g
from C[a, b], the real number given by $\int_a^b f(s)g(s)\,ds$ is well-defined. It satisfies all the usual properties that the inner product for finite dimensional vectors in ℝⁿ does
also. These properties are so common we will codify them into a definition for what
an inner product for a vector space V should behave like.
Definition 16.4.5 (Real Inner Product) Let V be a vector space with the reals as
the scalar field. Then a mapping ω which assigns a pair of objects to a real number
is called an inner product on V if
1. ω(u, v) = ω(v, u); that is, the order is not important for any two objects.
2. ω(c ⊙ u, v) = c ω(u, v); that is, scalars in the first slot can be pulled out.
3. ω(u ⊕ w, v) = ω(u, v) + ω(w, v), for any three objects.
4. ω(u, u) ≥ 0 and ω(u, u) = 0 if and only if u = o.
These properties imply that ω(u, c ⊙ v) = c ω(u, v) as well. A vector space V with
an inner product is called an inner product space.
Comment 16.4.7 The inner product is usually denoted with the symbol <, > instead
of ω( , ). We will use this notation from now on.
Comment 16.4.8 When we have an inner product, we can measure the size or magnitude of an object as follows. We define the analogue of the euclidean norm of an object u, using the usual || · || symbol, as

$$ ||u|| = \sqrt{\langle u, u\rangle}. $$

This is called the norm induced by the inner product of the object. In C[a, b], with the inner product $\langle f, g\rangle = \int_a^b f(s)g(s)\,ds$, the norm of a function f is thus $||f|| = \sqrt{\int_a^b f^2(s)\,ds}$. This is called the L² norm of f.
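As a sketch, not from the text, of how this inner product and its induced norm can be evaluated numerically in Octave with quadrature:

% L2 inner product and induced norm on C[a,b] via numerical quadrature
a = 0; b = 1;
ip  = @(f,g) quad(@(s) f(s).*g(s), a, b);  % <f,g> is the integral of f*g over [a,b]
nrm = @(f) sqrt(ip(f,f));                  % the induced L2 norm
f = @(s) sin(pi*s);  g = @(s) cos(pi*s);
ip(f,g)   % essentially 0: these two functions are orthogonal on [0,1]
nrm(f)    % sqrt(1/2)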
It is possible to prove the Cauchy–Schwartz inequality in this more general setting
also.
Theorem 16.4.1 (General Cauchy–Schwartz Inequality)
If V is an inner product space with inner product < , > and induced norm || ||, then

| < u, v > | ≤ ||u|| ||v||

with equality occurring if and only if u and v are linearly dependent.



Proof The proof is different than the one you would see in a Calculus text for ℝ²,
of course, and is covered in a typical course on beginning linear analysis. 
Comment 16.4.9 We can use the Cauchy–Schwartz inequality to define a notion of angle between objects exactly like we would do in ℝ². We define the angle θ between u and v via its cosine as usual:

$$ \cos(\theta) = \frac{\langle u, v\rangle}{||u||\,||v||}. $$

Hence, objects can be perpendicular or orthogonal even if we can not interpret them as vectors in ℝ². We see two objects are orthogonal if their inner product is 0.
Comment 16.4.10 If W is a finite dimensional subspace, a basis for W is said to be
an orthonormal basis if each object in the basis has L 2 norm 1 and all of the objects
are mutually orthogonal. This means < ui , u j > is 1 if i = j and 0 otherwise. We
typically let the Kronecker delta symbol δi j be defined by δi j = 1 if i = j and 0
otherwise so that we can say this more succinctly as < ui , u j >= δi j .

16.5 Fourier Series

A general trigonometric series S(x) has the following form

$$ S(x) = b_0 + \sum_{i=1}^{\infty}\left( a_i\sin\left(\frac{i\pi}{L}x\right) + b_i\cos\left(\frac{i\pi}{L}x\right)\right) $$

for any numbers a_n and b_n. Of course, there is no guarantee that this series will converge at any x! If we start with a function f which is continuous on the interval [0, L], we can define the trigonometric series associated with f as follows:

$$ S(x) = \frac{1}{L}\langle f, 1\rangle + \sum_{i=1}^{\infty}\left( \frac{2}{L}\left\langle f(x), \sin\left(\frac{i\pi}{L}x\right)\right\rangle \sin\left(\frac{i\pi}{L}x\right) + \frac{2}{L}\left\langle f(x), \cos\left(\frac{i\pi}{L}x\right)\right\rangle \cos\left(\frac{i\pi}{L}x\right)\right), $$

where the symbol ⟨ , ⟩ is the inner product in the set of functions C[0, L] defined by $\langle u, v\rangle = \int_0^L u(s)v(s)\,ds$. The coefficients in the Fourier series for f are called the Fourier coefficients of f. Since these coefficients are based on inner products with scaled sin and cos functions, we call these the normalized Fourier coefficients. Let's be clear about this and a bit more specific. The nth Fourier sin coefficient, n ≥ 1, of f is as follows:

$$ a_n(f) = \frac{2}{L}\int_0^L f(x)\sin\left(\frac{n\pi}{L}x\right)\,dx. $$

The nth Fourier cos coefficients, n ≥ 0, of f are defined similarly:

$$ b_0(f) = \frac{1}{L}\int_0^L f(x)\,dx, \qquad b_n(f) = \frac{2}{L}\int_0^L f(x)\cos\left(\frac{n\pi}{L}x\right)\,dx, \quad n \ge 1. $$
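A brief sketch of how these normalized coefficients can be computed numerically in Octave for a particular f on [0, L]; the choice of f here is purely an illustration.

% Normalized Fourier coefficients of f on [0,L] computed by quadrature
L = 5;
f = @(x) x.*(L - x);      % an illustrative choice of f
b0 = (1/L)*quad(f, 0, L);
N = 4;
a = zeros(N,1); b = zeros(N,1);
for n = 1:N
  a(n) = (2/L)*quad(@(x) f(x).*sin(n*pi*x/L), 0, L);  % sine coefficients
  b(n) = (2/L)*quad(@(x) f(x).*cos(n*pi*x/L), 0, L);  % cosine coefficients
end
[b0; b]', a'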

16.5.1 The Cable Model Infinite Series Solution

We now know series of the form

$$ A_0\, e^{-\frac{1}{\alpha}t} + \sum_{n=1}^{N} A_n\cos\left(\frac{n\pi}{L}x\right) e^{-\frac{1+n^2\pi^2\beta^2/L^2}{\alpha}t} $$

are like Fourier series although in terms of two variables. We can show these series
converge pointwise for x in [0, L] and all t. We can also show that we can take the
partial derivative of this series solutions term by term (see the discussions in Peterson
(2015) for details) to obtain


$$ -\sum_{n=1}^{N} A_n\, \frac{n\pi}{L}\sin\left(\frac{n\pi}{L}x\right) e^{-\frac{1+n^2\pi^2\beta^2/L^2}{\alpha}t}. $$

This series evaluated at x = 0 and x = L gives 0 and hence the Neumann conditions are satisfied. Hence, the solution Φ(x, t) given by

$$ \Phi(x, t) = A_0\, e^{-\frac{1}{\alpha}t} + \sum_{n=1}^{N} A_n\cos\left(\frac{n\pi}{L}x\right) e^{-\frac{1+n^2\pi^2\beta^2/L^2}{\alpha}t} $$

for the arbitrary sequence of constants (A_n) is a well-behaved solution on our domain. The remaining boundary condition is

$$ \Phi(x, 0) = f(x), \quad \text{for } 0 \le x \le L, $$

and

$$ \Phi(x, 0) = A_0 + \sum_{n=1}^{\infty} A_n \cos\left(\frac{n\pi}{L}x\right). $$

Rewriting in terms of the series solution, for 0 ≤ x ≤ L, we find

$$ A_0 + \sum_{n=1}^{\infty} A_n \cos\left(\frac{n\pi}{L}x\right) = f(x). $$

The Fourier series for f is given by

$$ f(x) = B_0 + \sum_{n=1}^{\infty} B_n \cos\left(\frac{n\pi}{L}x\right) $$

with

$$ B_0 = \frac{1}{L}\int_0^L f(x)\,dx, \qquad B_n = \frac{2}{L}\int_0^L f(x)\cos\left(\frac{n\pi}{L}x\right)\,dx. $$

Then, setting these series equal, we find that the solution is given by A_n = B_n for all n ≥ 0. The full details of all of this are outside the scope of our work here, but this will give you a taste of how these powerful tools can help us solve PDE models.
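To make the recipe concrete, here is a sketch, not from the text, that assembles a truncated version of the series solution for one illustrative choice of f, α and β, computing the coefficients A_n = B_n by quadrature.

% Truncated series solution of the cable model for an illustrative f(x) = x(L-x)
L = 5; alpha = 1; beta = 1; N = 10;
f  = @(x) x.*(L - x);
A0 = (1/L)*quad(f, 0, L);
A  = zeros(1,N);
for n = 1:N
  A(n) = (2/L)*quad(@(x) f(x).*cos(n*pi*x/L), 0, L);
end
% truncated series evaluated at a single point (x,t)
Phi = @(x,t) A0*exp(-t/alpha) + ...
      sum(A.*cos((1:N)*pi*x/L).*exp(-(1 + ((1:N)*pi*beta/L).^2)*t/alpha));
Phi(2.5, 0.0)   % at t = 0 this approximates f(2.5) = 6.25
Phi(2.5, 0.5)   % the solution decays as t increases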

Reference

J. Peterson. Calculus for Cognitive Scientists: Partial Differential Equation Models, Springer Series
on Cognitive Science and Technology (Springer Science+Business Media Singapore Pte Ltd.,
Singapore, 2015 in press)
Part VII
Summing It All Up
Chapter 17
Final Thoughts

How can we train people to think in an interdisciplinary manner using these sorts
of tools? It is our belief we must foster a new mindset within practicing scientists.
We can think of this as the development of a new integrative discipline we could
call How It All Fits Together Science. Of course, this name is not formal enough
to serve as a new disciplinary title, but keeping to its spirit, we could use the name
Integrative Science. Now the name Integrative Biology has occurred here and there
in the literature over the last few decades and it is not clear to us it has the right
tone. So, we will begin calling our new integrative point of view Building Scientific
Bridges or BSB.

17.1 Fostering Interdisciplinary Appreciation

We would like the typical scientist to have a deeper appreciation of the use of the
triad of biology, mathematics and computer science (BMC) in this attempt to build
bridges between the many disparate areas of biology, cognitive sciences and the
other sciences. The sharp disciplinary walls that have been built in academia hurt
everyone’s chances at developing an active and questing mind that is able to both
be suspicious of the status quo and also have the tools to challenge it effectively. Indeed, we have long believed that all research requires a rebellious mind. If you
revere the expert opinion of others too much, you will always be afraid to forge a
new path for yourself. So respect and disrespect are both part of the toolkit of our
budding scientists.
There is currently an attempt to create a new Astrobiology program at the Uni-
versity of Washington which is very relevant to our discussion. Astrobiology is an
excellent example of how science and a little mathematics can give insight into issues
such as the creation of life and its abundance in the universe. Read its primer edited
by Linda Billings and others, (Billings 2006) for an introduction to this field. Careful
arguments using chemistry, planetary science and many other fields inform what is
probable. It is a useful introduction not only to artificial life issues, but also to the

process by which we marshal ideas form disparate fields to form interesting models.
Indeed, if you look at the new textbook, Planets and Life: The Emerging Science
of Astrobiology, (Sullivan and Baross 2007) you will be inspired at the wealth of
knowledge such a field must integrate. This integration is held back by students who
are not trained in both BMC and BSB. We paraphrase some of the important points
made by the graduate students enrolled in this new program about interdisciplinary
training and research. Consider what the graduate students in this new program have
said Sullivan and Baross (2007, p. 548):
…some of the ignorance exposed by astrobiological questions reveals not the boundaries
of scientific knowledge, but instead the boundaries of individual disciplines. Furthermore,
collaboration by itself does not address this ignorance, but instead compounds it by encour-
aging scientists to rely on each other’s authority. Thus, anachronistic disciplinary borders
are reinforced rather than overrun. In contrast astrobiology can motivate challenges to dis-
ciplinary isolation and the appeals to authority that such isolation fosters.

Indeed, studying problems that require points of view from many places is of great
importance to our society and from Sullivan and Baross (2007, p. 548) we hear that
many different disciplines should now be applied to a class of questions perceived as
broadly unified and that such an amalgamation justifies a new discipline (or even meta
discipline) such as astrobiology.

Now simply replace the key word astrobiology by biomathematics or BSB or cognitive science and reread the sentence. Aren't we trying to do just this

cognitive science and reread the sentence again. Aren’t we trying to do just this
when we design any textual material such as this book and its earlier companion
(Peterson 2015) to integrate separate disciplines? Since we believe we can bring
additional illumination to problems we wish to solve in cognitive sciences by adding
new ideas from mathematics and computer science to the mix, we are asking explicitly
for such an amalgamation and interdisciplinary blending. We would like to create a
small cadre of like minded people who believe as the nascent astrobiology graduate
students do (Sullivan and Baross 2007, p. 549) that
Dissatisfaction with disciplinary approaches to fundamentally interdisciplinary questions
also led many of us to major in more than one field. This is not to deny the importance
of reductionist approaches or the advances stimulated by them. Rather, as undergraduates
we wanted to integrate the results of reductionist science. Such synthesis is often poorly
accommodated by disciplines that have evolved, especially in academia, to become insular,
autonomous departments. Despite the importance of synthesis for many basic scientific
questions, it is rarely attempted in research…

We believe, as the astrobiology students (Sullivan and Baross 2007, p. 550) do that
[BSB enriched by BMC] can change this by challenging the ignorance fostered by disciplinary structure while pursuing the creative ignorance underlying genuine inquiry. Because of its integrative questions, interdisciplinary nature,…, [BSB enriched by BMC] emerges
as an ideal vehicle for scientific education at the graduate, undergraduate and even high
school levels. [It] permits treatment of traditionally disciplinary subjects as well as areas
where those subjects converge (and, sometimes, fall apart!) At the same time, [it] is well
suited to reveal the creative ignorance at scientific frontiers that drives discovery.

where in the original quote, we have replaced astrobiology by the phrase BSB enriched by BMC and enclosed our changes in brackets. The philosophy of this course has been to encourage thinking that is suitable to the BSB enriched by BMC.
As you have seen, we have used the following point of view:

• All mathematical concepts are tied to real biological and scientific need. Hence,
after preliminary calculus concepts have been discussed, there is a careful devel-
opment of the exponential and natural logarithm functions. This is done in such
a way that all properties are derived so there are no mysteries, no references to
“this is how it is” and we don’t go through the details because the mathematical
derivation is beyond you. We insist that learning to think with the language of
mathematics is an important skill. Once the exponential function is understood,
it is immediately tied to the simplest type of real biological model: exponential
growth and decay.
• Mathematics is subordinate to the science in the sense that the texts build the math-
ematical knowledge needed to study interesting nonlinear biological and cognitive
models. We emphasize that to add more interesting science always requires more
difficult mathematics and concomitant intellectual resources.
• Standard mathematical ideas from linear algebra such as eigenvalues and eigen-
vectors can be used to study systems of linear differential equations using both
numerical tools (MatLab based currently) and graphical tools. We always stress
how the mathematics and the biology interact.
• Nonlinear models are important for studying real scientific questions. We begin with the logistic equation and progress to the Predator–Prey model and a standard SIR disease model. We emphasize how we must abstract out of biological and scientific complexity the variables necessary to build the model and how we can be wrong. We
show how the original Predator–Prey model, despite being a gross approximation
of the biological reality of fish populations in the sea, gives great explanatory
insight. Then, we equally stress that adding self-interaction to the model leads to
erroneous biological predictions. The SIR disease model is also clearly explained
as a gross simplification of the immunological reality which nevertheless illumi-
nates the data we see.
• Difficult questions can be hard to formulate quantitatively but if we persevere, we
can get great insights. We illustrate this point of view with a model of cancer which
requires 6 variables but which is linear. We show how much knowledge is both used
and thrown away when we make the abstractions necessary to create the model.
At this point, the students also know that models can easily have more variables
in them than we can graph comfortably. Also, numerical answers delivered via MatLab are, by themselves, not very useful. We need to derive information about the functional relationships between the parameters of the model and that requires a nice blend
of mathematical theory and biology.

The use of BMC is therefore fostered in this text and its earlier companion by
introducing the students to concepts from the traditional Math-Two-Engineering (just
a little as needed), Linear Algebra (a junior level mathematics course) and Differential

Equations (a sophomore level mathematics course) as well as the rudiments of using


a language such as MatLab for computational insight. We deliberately try to use
scientific examples from biology and behavior in many places to foster the B SB
point of view.
We believe just as the astrobiology students do (Sullivan and Baross 2007, p. 550)
that
The ignorance motivating scientific inquiry will never wholly be separated from the igno-
rance hindering it. The disciplinary organization of scientific education encourages scientists
to be experts on specialized subjects and silent on everything else. By creating scientists
dependent of each other’s specializations, this approach is self-reinforcing. A discipline
of [B SB enriched by BMC] should attempt something more ambitious: it should instead
encourage scientists to master for themselves what formerly they deferred to their peers.

Indeed, there is more that can be said (Sullivan and Baross 2007, p. 552)
What [B SB enriched by BMC] can mean as a science and discipline is yet to be decided,
for it must face the two-fold challenge of cross-disciplinary ignorance that disciplinary
education itself enforces. First, ignorance cannot be skirted by deferral to experts, or by
other implicit invocations of the disciplinary mold that [B SB enriched by BMC] should
instead critique. Second, ignorance must actually be recognized. This is not trivial: how do
you know what you do not know? Is it possible to understand a general principle without
also understanding the assumptions and caveats underlying it? Knowledge superficially
“understood” is self-affirming. For example, the meaning of the molecular tree of life may
appear unproblematic to an astronomer who has learned that the branch lengths represent
evolutionary distance, but will the astronomer even know to consider the hidden assumptions
about rate constancy by which the tree is derived? Similarly, images from the surface of Mars
showing evidence of running water are prevalent in the media, yet how often will a biologist
be exposed to alternative explanations for these geologic forms, or to the significant evidence
to the contrary? [There is a need] for a way to discriminate between science and … uncritically
accepted results of science.

A first attempt at developing a first year curriculum for the graduate program in
astrobiology led to an integrative course in which specialists from various disciplines
germane to the study of astrobiology gave lectures in their own areas of expertise
and then left as another expert took over. This was disheartening to the students in
the program. They said that (Sullivan and Baross 2007, p. 553)
As a group, we realized that we could not speak the language of the many disciplines
in astrobiology and that we lacked the basic information to consider their claims critically.
Instead, this attempt at an integrative approach provided only a superficial introduction to
the major contributions of each discipline to astrobiology. How can critical science be built
on a superficial foundation? Major gaps in our backgrounds still needed to be addressed. In
addition, we realized it was necessary to direct ourselves toward a more specific goal. What
types of scientific information did we most need? What levels of mastery should we aspire
to? At the same time, catalyzed by our regular interactions in the class, we students realized
that we learned the most (and enjoyed ourselves the most) in each other’s interdisciplinary
company. While each of us had major gaps in our basic knowledge, as a group we could
begin to fill many of them.

We now paraphrase the students' conclusions as follows:


Students cannot speak the language of the many disciplines B SB enriched by BMC
requires and they lack the basic information to consider their claims critically. An attempt
at an integrative approach that provided only a superficial introduction to the major contributions
of each discipline cannot lead to the ability to do critical science. Still, major gaps
in their backgrounds need to be addressed. What types of scientific information are most
needed? What levels of mastery should they aspire to? It is clear the students learned the
most (and enjoyed themselves the most) in each other’s interdisciplinary company. While
each has major gaps in basic knowledge, as a group they can begin to fill many of them.

Given the above, it is clear we must introduce and tie disparate threads of material
together with care. In many things, we favor theoretical approaches that attempt to
put an overarching theory of everything together into a discipline. Since biology and
cognitive science are so complicated, we will always have to do this. You can see
examples of this way of thinking explored in many areas of current research. For
example, it is very hard to understand how organs form in the development process. A
very good book on these developmental issues is the one by Davies (2005). Davies
talks about the complexity that emerges from local interactions. This is, of course,
difficult to define precisely, yet it is a useful principle that helps us understand the
formation of a kidney and other large scale organs. Read this book and then read the
one by Schmidt-Rhaesa (2007) and you will see that the second book makes more
sense given the ideas you have mastered from Davies.
Finally, it is good to always keep in mind (Sullivan and Baross 2007, p. 552)
Too often, the scientific community and academia facilely redress ignorance by appealing
to the testimony of experts. This does not resolve ignorance. It fosters it.

This closing comment sums it up nicely. We want to give the students and ourselves
an atmosphere that fosters solutions and our challenge is to find a way to do this.
So how do we become a good user of the threefold way of mathematics, science
and computer tools? We believe it is primarily a question of deep respect for the
balance between these disciplines. The basic idea is that once we abstract from
biology or some other science how certain quantities interact, we begin to phrase
these interactions in terms of mathematics. It is very important to never forget that
once the mathematical choices have been made, the analysis of the mathematics
alone will lead you to conclusions which may or may not be biologically relevant.
You must always be willing to give up a mathematical model or a computer science
model if it does not lead to useful insights into the original science.
We can quote from Randall (2005, pp. 70–71). She works in Particle Physics
which is the experimental arm that gives us data to see if string theory or loop gravity
is indeed a useful model of physical reality. It is not necessary to know physics here
to get the point. Notice what she says:
The term “model” might evoke a small scale battleship or castle you built in your child-
hood. Or you might think of simulations on a computer that are meant to reproduce known
dynamics—how a population grows, for example, or how water moves in the ocean. Model-
ing in particle physics is not the same as either of these definitions. Particle physics models
are guesses at alternate physical theories that might underlie the standard model…Different

assumptions and physical concepts distinguish theories, as do the distance or energy scales
at which a theory’s principles apply. Models are a way at getting at the heart of such dis-
tinguishing features. They let you explore a theory’s potential implications. If you think of
a theory as general instructions for making a cake, a model would be a precise recipe. The
theory would say to add sugar, a model would specify whether to add half a cup or two cups.

Now substitute Biological for Particle Physics and so forth and you can get a feel for
what a model is trying to do. Of course, biological models are much more complicated
than physics ones!
The primary message of this course is thus to teach you to think deeply and
carefully. The willingness to attack hard problems with multiple tools is what we
need in young scientists. We hope this course teaches you a bit more about that.
We believe that we learn all the myriad things we need to build reasonable models
over a lifetime of effort. Each model we design which pulls in material from disparate
areas of learning enhances our ability to develop the kinds of models that give insight.
As Pierrehumbert (2010, p. xi) says about climate modeling
When it comes to understanding the whys and wherefores of climate, there is an infinite
amount one needs to know, but life affords only a finite time in which to learn it…It is a
lifelong process. [We] attempt to provide the student with a sturdy scaffolding upon which
a deeper understanding may be built later.
The climate system [and other biological systems we may study] is made up of building
blocks which in themselves are based on elementary…principles, but which have surprising
and profound collective behavior when allowed to interact on the [large] scale. In this sense,
the “climate game” [the biological modeling game] is rather like the game of Go, where
interesting structure emerges from the interaction of simple rules on a big playing field,
rather than complexity in the rules themselves.

References

L. Billings, Education paper: the astrobiology primer: an outline of general knowledge-version 1, 2006. Astrobiology 6(5), 735–813 (2006)
J. Davies, Mechanisms of Morphogenesis (Academic Press Elsevier, Massachusetts, 2005)
J. Peterson, Calculus for Cognitive Scientists: Derivatives, Integration and Modeling, Springer
Series on Cognitive Science and Technology (Springer Science+Business Media Singapore Pte
Ltd, Singapore, 2015 in press)
R. Pierrehumbert, Principles of Planetary Climate (Cambridge University Press, Cambridge, 2010)
L. Randall, Warped Passages: Unraveling the Mysteries of the Universe’s Hidden Dimensions
(Harper Collins, New York, 2005)
A. Schmidt-Rhaesa, The Evolution of Organ Systems (Oxford University Press, Oxford, 2007)
W. Sullivan, J. Baross (eds.), Planets and Life: The Emerging Science of Astrobiology (Cambridge
University Press, Cambridge, 2007)
Part VIII
Advice to the Beginner
Chapter 18
Background Reading

To learn more about how the combination of mathematics, computational tools and
science have been used with profit, we have some favorite books that have tried to
do this. Reading these books has helped us design biologically inspired algorithms
and models and has inspired how we wrote this text! A selection of these books
would include treatment of regulatory systems in genomics such as Davidson (2001,
2006). An interesting book on how the architecture of genomes may have developed
which gives lots of food for thought for model abstraction is found in Lynch (2007).
Since the study of how to build cognitive systems is our big goal, the encyclopedic
treatment of how nervous systems evolved given in J. Kaas's four volumes (Kaas and
Bullock 2007a, b; Kaas and Krubitzer 2007; Kaas and Preuss 2007) is very useful.
To build a cognitive system that would be deployable as an autonomous device in a
robotic system requires us to think carefully about how to build a computer system
using computer architectural choices and hardware that has some plasticity. Clues as
to how to do this can be found by looking at how simple nervous systems came to
be. After all, the complexity of the system we can build will of necessity be far less
than that of a real neural system that controls a body; hence, any advice we can get
from existing biological systems is welcome! With that said, as humans we process
a lot of environmental inputs and the thalamus plays a vital role in this. If you read
Murray Sherman’s discussion of the thalamus in Sherman and Guillery (2006), you
will see right away how you could begin to model some of the discussed modules in
the thalamus which will be useful in later work in the next two volumes.
Let’s step back now and talk about our ultimate goal which is to study cognitive
systems. You now have two self study texts under your belt (ahem, Peterson 2015a, b),
and you have been exposed to a fair amount of coding in MatLab. You have seen a
programming language is just another tool we can use to gain insight into answering
very difficult questions. The purpose of this series of four books (the next ones are
Peterson 2015c, d) is to prime you to always think about attacking complicated and
difficult questions using a diverse toolkit. You have been trained enough now that
you can look at all that you read and have a part of you begin the model building
process for the area you are learning about in your reading.


We believe firmly that the more you are exposed to ideas from multiple disciplines,
the better you will be able to find new ways to solve our most challenging problems.
Using a multidisciplinary toolkit does not always go over big with your peers, so
you must not let that hold you back. A great glimpse into how hard it is to get ideas
published (even though they are great) is to look at how this process went for others.
We studied Hamilton’s models of altruism in the previous book, so it might surprise
you to know he had a hard time getting them noticed. His biography (Segerstrale
2013) is fascinating reading as is the collection of stories about biologists who thought
differently (Harman and Dietrich 2013). Also, although everyone knows who Francis
Crick was, reading his biography (Olby 2009) gives a lot of insight into the process
of thinking creatively outside of the established paradigms. Crick believed, as we do,
that theoretical models are essential to developing a true understanding of a problem.
The process of piecing together a story or model of a biological process indirectly
from many disparate types of data and experiments is very hard. You can learn much
about this type of work by reading carefully the stories of people who are doing
this. Figuring out what dinosaurs were actually like even though they are not alive
now requires that we use a lot of indirect information. The book on trace fossils (i.e.
footprints and coprolites—fossilized poop, etc.) Martin (2014) shows you how much
can be learned from those sources of data. In another reconstruction process, we know
enough now about how hominids construct their bodies; i.e. how thick the fat is on top
of muscle at various places on the face and so forth, to make a good educated guess at
building a good model of extinct hominids such as Australopithecus. This process
is detailed in Gurchie (2013). Gurchie is an artist and it is fascinating to read how he
uses the knowledge scientists can give him and his own creative insight to create the
stunning models you can see at the Smithsonian now in the Hall of Humanity. We
also know much more about how to analyze DNA samples found in very old fossils
and the science behind how we can build a reasonable genome for a Neanderthal as
detailed in Pääbo (2014) is now well understood. Read Martin's, Gurchie's, and Pääbo's
discussions and apply what you have been learning about mathematics, computation
and science to what they are saying. You should be seeing it all come together much
better now.
Since our ultimate aim is to build cognitive models with our tools, it is time for
you to read some about the brains that have been studied over the years. Kaas’s books
mentioned above are nice, but almost too intense. Study some of these others and apply
the new intellectual lens we have been working to develop.
• Stiles (2008) has written a great book on brain development which can be read in
conjunction with another evolutionary approach by Streidter (2005). Read them
both and ask yourself how you could design and implement even a minimal model
that can handle such modular systems interacting in complex ways. Learning the
right language to do that is the province of what we do in Peterson (2015c, d),
so think of this reading as whetting your appetite! Another lower level text, a bit
more layman in tone, is the nice one by Allman (2000) which is good to read as
an overview of what the others are saying. The last one you should tackle is Allen's
(2009) book about the evolution of the brain. Compare and contrast what all these
books are saying: learn to think critically!
• On a slightly different note, Taylor (2012) tells you about the various indirect ways
we use to measure what is going on in a functioning neural system. At the bottom
of all these tools is a lot of mathematics, computation and science which you are
much better prepared to read about now.
• It is very hard to understand how to model altruism as we have found out in
this book. It is also very hard to model sexual differences quantitatively. As you
are reading about the brain in the books we have been mentioning, you can dip
into Jordan-Young (2010) to see just how complicated the science behind sexual
difference is and how difficult it is to get good data. Further, there are many
mysteries concerning various aspects of human nature which are detailed in Barash
(2012). Ask yourself, how would you build models of such things? Another really
good read here is by Ridley (1993), which is on sex and how human nature
evolved. It is a bit old now, but still very interesting.
• Once you have read a bit about the brain in general and from an evolutionary
viewpoint, you are ready to learn more about how our behavior is modulated by
neurotransmitters and drugs. S. Snyder has a nice layman's discussion of those ideas
in Snyder (1996).
To understand cognition is in some ways to try to understand how biological
processes evolved to handle information flow. Hence, reading about evolution in
general—going beyond what we touched on in our discussions of altruism—is a
good thing. An important part of our developmental chain is controlled by stem cells
and a nice, relatively nontechnical introduction to those ideas is in Fox (2007), which
is also very interesting because it includes a lot of discussion on how stem cell ma-
nipulation is a technology we have to come to grips with. A general treatment of
human development can be found in Davies (2014) which you should read as well.
Also, the ways in which organ development is orchestrated by regulatory genes helps
you again see the larger picture in which many modules of computation interact. You
should check out A. Schmidt-Rhaesa's book on how organ systems evolved as well as
J. Davies’s treatment of how an organism determines its shape in Davies (2005).
The intricate ways that genes work together is hard to model and we must
make many abstractions and approximations to make progress. G. Wagner has done
a terrific job of explaining some of the key ideas in evolutionary innovation in
Wagner (2014) and J. Archibald shows how symbiosis is a key element in evolu-
tion in Archibald (2014).
Processing books like these will help hone your skills and acquaint you with lots
of material outside of your usual domain. This is a good thing! So keep reading and
we hope you join us in the next volumes!
All of these readings have helped us to see the big picture!

References

J. Allen, The Lives of the Brain: Human Evolution and the Organ of Mind (The Belknap Press of
Harvard University Press, Cambridge, 2009)
J. Allman, Evolving Brains (Scientific American Library, New York, 2000)
J. Archibald, One Plus One Equals One (Oxford University Press, Oxford, 2014)
D. Barash, Homo Mysterious: Evolutionary Puzzles of Human Nature (Oxford University Press,
Oxford, 2012)
E. Davidson, Genomic Regulatory Systems: Development and Evolution (Academic Press, San
Diego, 2001)
E. Davidson, The Regulatory Genome: Gene Regulatory Networks in Development and Evolution
(Academic Press Elsevier, Burlington, 2006)
J. Davies, Mechanisms of Morphogenesis (Academic Press Elsevier, Boston, 2005)
J. Davies, Life Unfolding: How the Human Body Creates Itself (Oxford University Press, Oxford,
2014)
C. Fox, Cell of Cells: The Global Race To Capture and Control the Stem Cell (W. H. Norton and
Company, New York, 2007)
J. Gurchie, Shaping Humanity: How Science, Art, and Imagination Help Us Understand Our Origins
(Yale University Press, New Haven, 2013)
O. Harman, M. Dietrich, Outsider Scientists: Routes to Innovation in Biology (University of Chicago
Press, Chicago, 2013)
R. Jordan-Young, Brainstorm: The Flaws in the Science of Sex Differences (Harvard University Press, Cambridge, 2010)
J. Kaas, T. Bullock (eds.), Evolution of Nervous Systems: A Comprehensive Reference (Editor J. Kaas), Volume 1: Theories, Development, Invertebrates (Academic Press Elsevier, Amsterdam, 2007a)
J. Kaas, T. Bullock (eds.), Evolution of Nervous Systems: A Comprehensive Reference (Editor J. Kaas), Volume 2: Non-Mammalian Vertebrates (Academic Press Elsevier, Amsterdam, 2007b)
J. Kaas, L. Krubitzer (eds.), Evolution of Nervous Systems: A Comprehensive Reference (Editor J. Kaas), Volume 3: Mammals (Academic Press Elsevier, Amsterdam, 2007)
J. Kaas, T. Preuss (eds.), Evolution of Nervous Systems: A Comprehensive Reference (Editor J. Kaas), Volume 4: Primates (Academic Press Elsevier, Amsterdam, 2007)
M. Lynch, The Origins of Genome Architecture (Sinauer Associates, Inc., Sunderland, 2007)
A. Martin, Dinosaurs Without Bones: Dinosaur Lives Revealed By Their Trace Fossils (Pegasus
Books, New York, 2014)
S. Murray Sherman, R. Guillery, Exploring The Thalamus and Its Role in Cortical Function (The
MIT Press, Cambridge, 2006)
R. Olby, Francis Crick: Hunter of Life's Secrets (Cold Spring Harbor Laboratory Press, New York,
2009)
S. Pääbo, Neanderthal Man: In Search of Lost Genomes (Basic Books, New York, 2014)
J. Peterson, Calculus for Cognitive Scientists: Derivatives, Integration and Modeling, Springer
Series on Cognitive Science and Technology (Springer Science+Business Media Singapore Pte
Ltd, Singapore, 2015a in press)
J. Peterson, Calculus for Cognitive Scientists: Higher Order Models and Their Analysis, Springer
Series on Cognitive Science and Technology (Springer Science+Business Media Singapore Pte
Ltd., Singapore, 2015b in press)
J. Peterson, Calculus for Cognitive Scientists: Partial Differential Equation Models, Springer Series
on Cognitive Science and Technology (Springer Science+Business Media Singapore Pte Ltd.,
Singapore, 2015c in press)
J. Peterson, BioInformation Processing: A Primer On Computational Cognitive Science, Springer
Series on Cognitive Science and Technology (Springer Science+Business Media Singapore Pte
Ltd., Singapore, 2015d in press)

M. Ridley, The Red Queen: Sex and the Evolution of Human Nature (Harper Perennial, New York,
1993)
U. Segerstrale, Nature’s Oracle: The Life and Work of W.D. Hamilton (Oxford University Press,
Oxford, 2013)
S. Snyder, Drugs and the Brain (Scientific American Library, New York, 1996)
J. Stiles, The Fundamentals of Brain Development (Harvard University Press, Cambridge, 2008)
G. Streidter, Principles of Brain Evolution (Sinauer Associates, Inc., Sunderland, 2005)
K. Taylor, The Brain Supremacy: Notes from the frontiers of neuroscience (Oxford University Press,
Oxford, 2012)
G. Wagner, Homology, Genes and Evolutionary Innovation (Princeton University Press, Oxford,
2014)
Glossary

Biological Modeling This is the study of biological systems using a combination


of mathematical, scientific and computational approaches, p. 4.

Cancer Model A general model of cancer based on TSG inactivation is as follows.


The tumor starts with the inactivation of a TSG called A, in a small compartment
of cells. A good example is the inactivation of the APC gene in a colonic crypt,
but it could be another gene. Initially, all cells have two active alleles of the TSG.
We will denote this by A+/+ where the superscript “+/+” indicates both alleles
are active. One of the alleles becomes inactivated at mutation rate u 1 to generate a
cell type denoted by A+/− . The superscript +/− tells us one allele is inactivated.
The second allele becomes inactivated at rate û 2 to become the cell type A−/− .
In addition, A+/+ cells can also receive mutations that trigger CIN. This happens
at the rate u c resulting in the cell type A+/+ CIN . This kind of a cell can inactivate
the first allele of the TSG with normal mutation rate u 1 to produce a cell with one
inactivated allele (i.e. a +/−) which started from a CIN state. We denote these
cells as A+/− CIN . We can also get a cell of type A+/+ CIN when a cell of type
A+/− receives a mutation which triggers CIN. We will assume this happens at the
same rate u c as before. The A+/− CIN cell then rapidly undergoes LOH at rate û 3
to produce cells of type A−/− CIN . Finally, A−/− cells can experience CIN at rate
u c to generate A−/− CIN cells. The first allele is inactivated by a point mutation.
The rate at which this occurs is modeled by the rate u 1 . We assume the mutations
governed by the rates u 1 and u c are neutral. This means that these rates do not
depend on the size of the population N . The events governed by û 2 and û 3 give


what is called selective advantage. This means that the size of the population
size does matter. Using these assumptions, we therefore model û 2 and û 3 as

û 2 = N u 2

and

û 3 = N u 3 .

where u_2 and u_3 are neutral rates. The mathematical model is then set up as follows. Let
X_0(t) be the probability a cell is in cell type A+/+ at time t.
X_1(t) be the probability a cell is in cell type A+/− at time t.
X_2(t) be the probability a cell is in cell type A−/− at time t.
Y_0(t) be the probability a cell is in cell type A+/+ CIN at time t.
Y_1(t) be the probability a cell is in cell type A+/− CIN at time t.
Y_2(t) be the probability a cell is in cell type A−/− CIN at time t.
We can then derive the rate equations

X_0' = −(u_1 + u_c) X_0
X_1' = u_1 X_0 − (u_c + N u_2) X_1
X_2' = N u_2 X_1 − u_c X_2
Y_0' = u_c X_0 − u_1 Y_0
Y_1' = u_c X_1 + u_1 Y_0 − N u_3 Y_1
Y_2' = N u_3 Y_1 + u_c X_2

We are interested in analyzing this model over a typical human life span of 100
years, p. 406.
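A minimal MatLab/Octave sketch of integrating these six rate equations is given below. The rates, the population size N and the time window are hypothetical placeholders chosen only to show how the model is coded; they are not the calibrated values discussed in the text.

% Hypothetical rates and population size, for illustration only.
u1 = 1e-7; u2 = 1e-7; u3 = 1e-7; uc = 1e-6; N = 1000;
% Right hand sides of the six linear rate equations for X0, X1, X2, Y0, Y1, Y2.
f = @(t,p) [ -(u1 + uc)*p(1); ...
              u1*p(1) - (uc + N*u2)*p(2); ...
              N*u2*p(2) - uc*p(3); ...
              uc*p(1) - u1*p(4); ...
              uc*p(2) + u1*p(4) - N*u3*p(5); ...
              N*u3*p(5) + uc*p(3) ];
p0 = [1; 0; 0; 0; 0; 0];               % all cells start in type A+/+
[t, P] = ode45(f, [0 5000], p0);       % integrate over an illustrative time window
plot(t, P(:,3), t, P(:,6));            % compare X2 (no CIN) with Y2 (CIN pathway)
legend('X_2', 'Y_2'); xlabel('t');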

Cauchy Fundamental Theorem of Calculus Let G be any antiderivative of the


Riemann integrable function f on the interval [a, b]. Then G(b) − G(a) = ∫_a^b f(t) dt, p. 129.

Characteristic Equation of a linear second order differential equation For a


linear second order differential equation,

a u''(t) + b u'(t) + c u(t) = 0

assume that e^{rt} is a solution and try to find what values of r might work. We see for u(t) = e^{rt}, we find

0 = a u''(t) + b u'(t) + c u(t)
  = a r^2 e^{rt} + b r e^{rt} + c e^{rt}
  = (a r^2 + b r + c) e^{rt}.

Since e^{rt} can never be 0, we must have

0 = a r^2 + b r + c.

The roots of the quadratic equation above are the only values of r that will work as the solution e^{rt}. We call this quadratic equation the characteristic equation for
this differential equation, p. 152.
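In MatLab/Octave the roots of the characteristic equation can be checked directly; the coefficients below are an illustrative choice, not tied to any particular model in the text.

% Roots of a r^2 + b r + c = 0 for the illustrative equation u'' + 3u' + 2u = 0.
a = 1; b = 3; c = 2;
r = roots([a b c])     % returns -2 and -1, so u(t) = A e^{-2t} + B e^{-t}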
Characteristic Equation of the linear system For a linear system
   
[x'(t); y'(t)] = A [x(t); y(t)],   [x(0); y(0)] = [x0; y0],

assume the solution is V e^{rt} for some nonzero vector V. This implies that r and V must satisfy

(r I − A) V e^{rt} = [0; 0].

Since e^{rt} is never zero, we must have

(r I − A) V = [0; 0].

The only way we can get nonzero vectors V as solutions is to choose the values of r so that

det(r I − A) = 0.

The second order polynomial we obtain is called the characteristic equation as-
sociated with this linear system. Its roots are called the eigenvalues of the system
and any nonzero vector V associated with an eigenvalue r is an eigenvector for
r , p. 178.
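The eigenvalues and eigenvectors can be computed numerically with eig; the coefficient matrix A below is only an illustrative example.

A = [1 2; 3 2];        % an illustrative coefficient matrix
[V, D] = eig(A)        % columns of V are eigenvectors, diag(D) the eigenvalues
% Here det(rI - A) = r^2 - 3r - 4, so the eigenvalues are 4 and -1 and the
% general solution is c1 V(:,1) e^{4t} + c2 V(:,2) e^{-t}.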
Complex number This is a number which has the form of a + b i where a and
b are arbitrary real numbers and the letter i represents a very abstract concept: a
number whose square is i^2 = −1! We usually draw a complex number in a standard x–y Cartesian plane with the y axis labeled as i y instead of the usual y. Then the number 5 + 4 i would be graphed just like the two dimensional coordinate (5, 4), p. 141.
Continuity A function f is continuous at a point p if for all positive tolerances
ε, there is a positive δ so that | f(t) − f(p) | < ε if t is in the domain of f and
| t − p | < δ. You should note continuity is something that is only defined at a
point and so functions in general can have very few points of continuity. Another
way of defining the continuity of f at the point p is to say the lim_{t → p} f(t) exists
and equals f ( p), p. 129.

Differentiability A function f is differentiable at a point p if there is a number L


so that for all positive tolerances ε, there is a positive δ so that

| (f(t) − f(p))/(t − p) − L | < ε if t is in the domain of f and | t − p | < δ.

You should note differentiability is something that is only defined at a point and so functions in general can have very few points of differentiability. Another way of defining the differentiability of f at the point p is to say the lim_{t → p} (f(t) − f(p))/(t − p) exists. At each point p where this limit exists, we can define a new function called the derivative of f at p. This is usually denoted by f'(p) or df/dt(p), p. 129.

Exponential growth Some biological systems can be modeled using the idea of
exponential growth. This means the variable of interest, x, has a rate of change proportional to its current value. Mathematically, this means x' = r x for some proportionality constant r, p. 149.

Fundamental Theorem of Calculus Let f be Riemann Integrable on [a, b]. Then the function F defined on [a, b] by F(x) = ∫_a^x f(t) dt satisfies
1. F is continuous on all of [a, b]
2. F is differentiable at each point x in [a, b] where f is continuous and F'(x) = f(x), p. 129.

Half life The amount of time it takes a substance x to lose half its original value
under exponential decay. It is denoted by t_{1/2} and can also be expressed as t_{1/2} = ln(2)/r where r is the decay rate in the differential equation x'(t) = −r x(t), p. 149.
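A quick numerical check of this formula, with a hypothetical decay rate, is:

r = 0.3;               % an illustrative decay rate
thalf = log(2)/r;      % the half life t_{1/2}
x = @(t) exp(-r*t);    % the solution of x' = -r x with x(0) = 1
x(thalf)               % returns 0.5: half of the starting value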

Infectious Disease Model Assume the total population we are studying is fixed
at N individuals. This population is then divided into three separate pieces: we
have individuals
• that are susceptible to becoming infected are called Susceptible and are labeled
by the variable S. Hence, S(t) is the number that are capable of becoming
infected at time t.
• that can infect others. They are called Infectious and the number that are in-
fectious at time t is given by I (t).
• that have been removed from the general population. These are called Removed
and their number at time t is labeled by R(t).
We make a number of key assumptions about how these population pools interact.
• Individuals stop being infectious at a positive rate γ which is proportional to the
number of individuals that are in the infectious pool. If an individual stops being
infectious, this means this individual has been removed from the population.
This could mean they have died, the infection has progressed to the point where
they can no longer pass the infection on to others or they have been put into
quarantine in a hospital so that further interactions with the general population
is not possible. In all of these cases, these individuals are not infectious or can’t
cause infections and so they have been removed from the part of the population N
which can be infected or is susceptible. Mathematically, this means we assume

I' = −γ I.

• Susceptible individuals are those capable of catching an infection. We model the


interaction of infectious and susceptible individuals in the same way we handled
the interaction of food fish and predator fish in the Predator–Prey model. We
assume this interaction is proportional to the product of their population sizes:
i.e. S I. We assume the rate of change of Susceptibles due to this interaction is proportional to S I with positive proportionality constant r. Since Susceptibles are lost when they become infected, mathematically we assume

S' = −r S I.

We can then figure out the net rates of change of the three populations. The
infectious population gains at the rate r S I and loses at the rate γ I. Hence

I' = r S I − γ I.

The net change of Susceptibles is that of simple decay. Susceptibles are lost at the rate r S I. Thus, we have

S' = −r S I.

Finally, the removed population increases at the same rate the infectious population
decreases. We have

R' = γ I.

We also know that R(t) + S(t) + I (t) = N for all time t because our population
is constant. So only two of the three variables here are independent. We typically
focus on the variables I and S for that reason. Our complete Infectious Disease
Model is then

I' = r S I − γ I
S' = −r S I
I(0) = I_0
S(0) = S_0.

where we can compute R(t) as N − I (t) − S(t), p. 382.
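A minimal MatLab/Octave sketch of this model follows; the values of r, γ, N and the initial pools are hypothetical and only meant to show how the equations are coded.

r = 5e-4; gam = 0.1; N = 1000; I0 = 5; S0 = N - I0;   % illustrative values only
f = @(t,u) [ r*u(2)*u(1) - gam*u(1);    % I' = r S I - gamma I
            -r*u(2)*u(1) ];             % S' = -r S I
[t, U] = ode45(f, [0 100], [I0; S0]);
R = N - U(:,1) - U(:,2);                % the removed population
plot(t, U(:,1), t, U(:,2), t, R); legend('I', 'S', 'R'); xlabel('t');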

Insulin regulation Glucose plays an important role in vertebrate metabolism be-


cause it is a source of energy. For each person, there is an optimal blood glucose
concentration and large deviations from this leads to severe problems including
death. Blood glucose levels are autoregulated via standard forward and backward
interactions like we see in many biological systems. The blood glucose concentra-
tion is influenced by a variety of signaling molecules just like the protein creation
rates can be. Here are some of them. The hormone that decreases blood glucose
concentration is insulin. Insulin is a hormone secreted by the β cells of the pan-
creas. After we eat carbohydrates, our gastrointestinal tract sends a signal to the
pancreas to secrete insulin. Also, the glucose in our blood directly stimulates the
β cells to secrete insulin. We think insulin helps cells pull in the glucose needed
for metabolic activity by attaching itself to membrane walls that are normally
impenetrable. This attachment increases the ability of glucose to pass through to
the inside of the cell where it can be used as fuel. So, if there is not enough insulin,
cells don’t have enough energy for their needs. There are other hormones that tend
to change blood glucose concentrations also, such as glucagon, epinephrine, glu-
cocorticoids, thyroxin and somatotropin. Net hormone concentration is the sum

of insulin plus the others and we let H denote the net hormone concentration. At
normal conditions, call this concentration H0 . Under close to normal conditions,
the interaction of the one hormone insulin with blood glucose completely dom-
inates the net hormonal activity; so normal blood sugar levels primarily depend
on insulin–glucose interactions. Hence, if insulin increases from normal levels, it increases net hormonal concentration to H_0 + ΔH and decreases blood glucose concentration. On the other hand, if other hormones such as cortisol increased from base levels, this will make blood glucose levels go up. Since insulin dominates all activity at normal conditions, we can think of this increase in cortisol as a decrease in insulin with a resulting rise in blood glucose levels. A decrease in insulin from normal levels corresponds to a drop in net hormone concentration to H_0 − ΔH. Now let G denote blood glucose level. Hence, in our model an increase
in H means a drop in G and a decrease in H means an increase in G! Note our
lumping of all the hormone activity into a single net activity is very much like how
we modeled food fish and predator fish in the predator–prey model. We describe
the model as

G  (t) = F1 (G, H) + J(t)


H  (t) = F2 (G, H)

where the function J is the external rate at which blood glucose concentration is
being increased in a glucose tolerance test. There are two nonlinear interaction
functions F1 and F2 because we know G and H have complicated interactions.
Let’s assume G and H have achieved optimal values G 0 and H0 by the time the
fasting patient has arrived at the hospital. Hence, we don't expect to have any contribution to G'(0) and H'(0); i.e. F_1(G_0, H_0) = 0 and F_2(G_0, H_0) = 0. We
are interested in the deviation of G and H from their optimal values G 0 and H0 ,
so let g = G − G 0 and h = H − H0 . We can then write G = G 0 + g and
H = H0 + h. The model can then be rewritten as

g'(t) = F_1(G_0 + g, H_0 + h) + J(t)
h'(t) = F_2(G_0 + g, H_0 + h)

and we can then approximate these dynamics using tangent plane approximations
to F1 and F2 giving

g'(t) ≈ (∂F_1/∂g)(G_0, H_0) g + (∂F_1/∂h)(G_0, H_0) h + J(t)
h'(t) ≈ (∂F_2/∂g)(G_0, H_0) g + (∂F_2/∂h)(G_0, H_0) h

It is this linearized system of equations we can analyze to give some insight into
how to interpret the results of a glucose tolerance test, p. 475.
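A sketch of this linearized system in MatLab/Octave, with hypothetical coefficient values standing in for the unknown partial derivatives (only their algebraic signs follow the usual reasoning), might look like this:

a = 2.9; b = 4.3; c = 0.2; d = 1.1;     % hypothetical positive constants
J = @(t) 0;                              % no external glucose input after the test dose
f = @(t,z) [ -a*z(1) - b*z(2) + J(t);    % g' approximated by F1_g g + F1_h h + J
              c*z(1) - d*z(2) ];         % h' approximated by F2_g g + F2_h h
[t, Z] = ode45(f, [0 6], [1.5; 0]);      % start with an elevated glucose deviation
plot(t, Z(:,1)); xlabel('hours'); ylabel('g = G - G_0');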

Linear Second Order Differential Equations These have the general form

a u''(t) + b u'(t) + c u(t) = 0
u(0) = u_0,  u'(0) = u_1

where we assume a is not zero, p. 149.


Linear Systems of differential equations These are systems of differential equa-
tions of the form

x'(t) = a x(t) + b y(t)
y'(t) = c x(t) + d y(t)
x(0) = x_0
y(0) = y_0

which can be written in matrix–vector form as


      
[x'(t); y'(t)] = [a b; c d] [x(t); y(t)],   [x(0); y(0)] = [x_0; y_0].

We typically call the coefficient matrix A so that the system is, p. 171.
   
[x'(t); y'(t)] = A [x(t); y(t)],   [x(0); y(0)] = [x_0; y_0],

Predator–Prey This is a classical model developed to model the behavior of the


interacting population of a food source and its predator. Let the variable x(t)
denote the population of food and y(t), the population of predators at time t.
This is, of course, a coarse model. For example, food fish could be divided into
categories like halibut, mackerel with a separate variable for each and the predators
could be divided into different classes like sharks, squids and so forth. Hence,
instead of dozens of variables for both the food and predator population, everything
is lumped together. We then make the following assumptions:

1. The food population grows exponentially. Letting xg denote the growth rate of
the food, we have

xg = a x

for some positive constant a.


2. The number of contacts per unit time between predators and prey is propor-
tional to the product of their populations. We assume the food are eaten by the
predators at a rate proportional to this contact rate. Letting the decay rate of
the food be denoted by xd , we see

xd = −b x y

for some positive constant b.

Thus, the net rate of change of food is x' = xg + xd giving

x' = a x − b x y.

for some positive constants a and b. We then make assumptions about the predators
as well.

1. Predators naturally die following an exponential decay; letting this decay rate
be given by yd , we have

yd = −c y

for some positive constant c.


2. We assume the predators grow proportional to how much they eat. In turn, how
much they eat is assumed to be proportional to the rate of contact between food
and predator fish. We model the contact rate just like before and let yg be the
growth rate of the predators. We find

yg = d x y

for some positive constant d.

Thus, the net rate of change of predators is y' = yg + yd giving

y' = −c y + d x y.

for some positive constants c and d. The full model is thus, p. 292.

x' = a x − b x y
y' = −c y + d x y
x(0) = x_0
y(0) = y_0,
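A minimal MatLab/Octave integration of this model, with hypothetical rates a, b, c and d, is:

a = 1.0; b = 0.1; c = 0.5; d = 0.02;     % illustrative values only
f = @(t,u) [ a*u(1) - b*u(1)*u(2);       % x' = a x - b x y
            -c*u(2) + d*u(1)*u(2) ];     % y' = -c y + d x y
[t, U] = ode45(f, [0 50], [40; 9]);
plot(U(:,1), U(:,2)); xlabel('food x'); ylabel('predators y');   % a closed orbit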

Predator–Prey Self Interaction The original Predator–Prey model does not in-
clude self-interaction terms. These are terms that model how the food and predator
populations interaction with themselves. We can model these effects by assuming
their magnitude is proportional to the interaction. Mathematically, we assume
these are both decay terms giving us

xsel f = −e x x

ysel f = − f y y.

for positive constants e and f . We are thus led to the new self-interaction model
given below, p. 355:

x  (t) = a x(t) − b x(t) y(t) − e x(t)2


y  (t) = −c y(t) + d x(t) y(t) − f y(t)2 ,

Predator–Prey with fishing If we add fishing at the rate r to the Predator–Prey


model, we obtain, p. 338:

x'(t) = a x(t) − b x(t) y(t) − r x(t)
y'(t) = −c y(t) + d x(t) y(t) − r y(t),

Primitives The primitive of a function f is any function F which is differentiable


and satisfies F'(t) = f(t) at all points in the domain of f, p. 129.
Protein Modeling The gene Y is a string of nucleotides (A, C, T and G) with a
special starting string in front of it called the promoter. The nucleotides in the
gene Y are read three at a time to create the amino acids which form the protein Y ∗
corresponding to the gene. The process is this: a special RNA polymerase, RNAp,
which is a complex of several proteins, binds to the promoter region. Once RNAp
binds to the promoter, messenger RNA, mRNA, is synthesized that corresponds to
the specific nucleotide triplets in the gene Y . The process of forming this mRNA
is called transcription. Once the mRNA is formed, the protein Y ∗ is then made.
The protein creation process is typically regulated. A single regulator works like
this. An activator called X is a protein which increases the rate of mRNA creation
when ti binds to the promoter. The activator X switches between and active and
inactive version due to a signal SX . We let the active form be denoted by X ∗ . If X ∗
binds in front of the promoter, mRNA creation increases implying an increase
in the creation of the protein Y ∗ also. Once the signal SX appears, X rapidly
transitions to its state X ∗ , binds with the front of the promoter and protein Y ∗

begins to accumulate. We let β denote the rate of protein accumulation which is


constant once the signal SX begins. However, proteins also degrade due to two
processes:

• proteins are destroyed by other proteins in the cell. Call this rate of destruction
α_des.
• the concentration of protein in the cell goes down because the cell grows and
therefore its volume increases. Protein is usually measured as a concentration
and the concentration goes down as the volume goes up. Call this rate α_dil—the
dil is for dilation.
The net or total loss of protein is called α and hence

α = α_des + α_dil

The net rate of change of the protein concentration is then our familiar model

dY*/dt = β − α Y*

where β is the constant growth term and α Y* is the loss term.

We usually do not make a distinction between the gene Y and its transcribed
protein Y ∗ . We usually treat the letters Y and Y ∗ as the same even though it is not
completely correct. Hence, we just write as our model

Y' = β − α Y
Y(0) = Y_0

and then solve it using the integrating factor method even though, strictly speaking,
Y is the gene!, p. 149.
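A small sketch of this model's solution, using hypothetical values of β, α and Y_0, is:

beta = 2.0; alpha = 0.5; Y0 = 0;          % hypothetical rates and initial value
Yss = beta/alpha;                          % the steady state concentration
Y = @(t) Yss + (Y0 - Yss)*exp(-alpha*t);   % solution via the integrating factor method
t = linspace(0, 12, 200);
plot(t, Y(t), t, Yss*ones(size(t)), '--');
tr = log(2)/alpha                          % the response time: halfway to steady state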

Response Time In a protein synthesis model

y'(t) = −α y(t) + β
y(0) = y_0

the time it takes the solution to go from its initial concentration y0 to a value
halfway between the initial amount and the steady state value is called the re-
sponse time. It is denoted by tr and tr = ln(2)/α so it is functionally the same as
the half life in an exponential decay model, p. 149.

Riemann Integral If a function on the finite interval [a, b] is bounded, we can


define a special limit which, if it exists, is called the Riemann Integral of the func-
tion on the interval [a, b]. Select a finite number of points from the interval [a, b],
{t0 , t1 , . . . , tn−1 , tn }. We don’t know how many points there are, so a different
selection from the interval would possibly gives us more or less points. But for
convenience, we will just call the last point tn and the first point t0 . These points
are not arbitrary—t0 is always a, tn is always b and they are ordered like this:

t0 = a < t1 < t2 < · · · < tn−1 < tn = b

The collection of points from the interval [a, b] is called a Partition of [a, b] and is
denoted by some letter—here we will use the letter P. So if we say P is a partition
of [a, b], we know it will have n + 1 points in it, they will be labeled from t0 to
tn and they will be ordered left to right with strict inequalities. But, we will not
know what value the positive integer n actually is. The simplest Partition P is the
two point partition {a, b}. Note these things also:

1. Each partition of n + 1 points determines n subintervals of [a, b]


2. The lengths of these subintervals always adds up to the length of [a, b] itself,
b − a.
3. These subintervals can be represented as

{[t0 , t1 ], [t1 , t2 ], . . . , [tn−1 , tn ]}

or more abstractly as [ti , ti+1 ] where the index i ranges from 0 to n − 1.


4. The length of each subinterval is ti+1 − ti for the indices i in the range 0 to
n − 1.
Now from each subinterval [ti , ti+1 ] determined by the Partition P, select any point
you want and call it si . This will give us the points s0 from [t0 , t1 ], s1 from [t1 , t2 ]
and so on up to the last point, sn−1 from [tn−1 , tn ]. At each of these points, we can
evaluate the function f to get the value f (s j ). Call these points an Evaluation
Set for the partition P. Let’s denote such an evaluation set by the letter E. If
the function f was nice enough to be positive always and continuous, then the
product f (si ) × (ti+1 − ti ) can be interpreted as the area of a rectangle; in general,
though, these products are not areas. Then, if we add up all these products, we
get a sum which is useful enough to be given a special name: the Riemann sum
for the function f associated with the Partition P and our choice of evaluation set
E = {s0 , . . . , sn−1 }. This sum is represented by the symbol S( f, P, E) where the
things inside the parenthesis are there to remind us that this sum depends on our
choice of the function f , the partition P and the evaluation set E. The Riemann
sum is normally written as

S(f, P, E) = Σ_{i ∈ P} f(s_i) (t_{i+1} − t_i)

and we just remember that the choice of P will determine the size of n. Each
partition P has a maximum subinterval length—let’s use the symbol || P || to
denote this length. We read the symbol || P || as the norm of P. Each par-
tition P and evaluation set E determines the number S( f, P, E) by a simple
calculation. So if we took a collection of partitions P1 , P2 and so on with associ-
ated evaluation sets E 1 , E 2 etc., we would construct a sequence of real numbers
{S( f, P1 , E 1 ), S( f, P2 , E 2 ), . . . , S( f, Pn , E n ), . . .}. Let’s assume the norm of the
partition P_n gets smaller all the time; i.e. lim_{n → ∞} || P_n || = 0. We could then
ask if this sequence of numbers converges to something. What if the sequence
of Riemann sums we construct above converged to the same number I no matter
what sequence of partitions whose norm goes to zero and associated evaluation
sets we chose? Then, we would have that the value of this limit is independent of
the choices above. This is what we mean by the Riemann Integral of f on the
interval [a, b]. If there is a number I so that

lim_{n → ∞} S(f, P_n, E_n) = I

no matter what sequence of partitions {P_n} with associated sequence of evaluation sets {E_n} we choose as long as lim_{n → ∞} || P_n || = 0, we say that the Riemann
Integral of f on [a, b] exists and equals the value I . The value I is dependent on
the choice of f and interval [a, b]. So we often denote this value by I ( f, [a, b])
or more simply as, I ( f, a, b). Historically, the idea of the Riemann integral was
developed using area approximation as an application, so the summing nature of
the Riemann Sum was denoted by the 16th century letter S which resembled an elongated or stretched letter S which looked like what we call the integral sign ∫. Hence, the common notation for the Riemann Integral of f on [a, b], when this value exists, is ∫_a^b f. We usually want to remember what the independent variable of f is also and we want to remind ourselves that this value is obtained as we let the norm of the partitions go to zero. The symbol dt for the independent variable t is used as a reminder that t_{i+1} − t_i is going to zero as the norm of the partitions goes to zero. So it has been very convenient to add to the symbol ∫_a^b f this information and use the augmented symbol ∫_a^b f(t) dt instead. Hence, if the independent variable was x instead of t, we would use ∫_a^b f(x) dx. Since for a function f, the name we give to the independent variable is a matter of personal choice, we see that the choice of variable name we use in the symbol ∫_a^b f(t) dt is very arbitrary. Hence, it is common to refer to the independent variable we use in the symbol ∫_a^b f(t) dt as the dummy variable of integration, p. 129.
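A short MatLab/Octave sketch of Riemann sums converging, for an illustrative f, interval and uniform midpoint partitions, is:

f = @(t) t.^2; a = 0; b = 2;             % an illustrative function and interval
for n = [10 100 1000]
  t = linspace(a, b, n+1);               % the partition points t_0, ..., t_n
  s = (t(1:end-1) + t(2:end))/2;         % one evaluation point per subinterval
  S = sum(f(s).*diff(t));                % the Riemann sum S(f, P, E)
  fprintf('n = %5d   S = %.6f\n', n, S);
end
exact = (b^3 - a^3)/3                    % the limiting value, 8/3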

Separation of Variables Method A common PDE model is the general cable


model which is given below in fairly abstract form:

β^2 ∂^2Φ/∂x^2 − Φ − α ∂Φ/∂t = 0,  for 0 ≤ x ≤ L, t ≥ 0,
∂Φ/∂x(0, t) = 0,
∂Φ/∂x(L, t) = 0,
Φ(x, 0) = f(x).

The domain is the usual half infinite [0, L] × [0, ∞) where the spatial part of
the domain corresponds to the length of the dendritic cable in an excitable nerve
cell. We won’t worry too much about the details of where this model comes from
as we will discuss that in another volume. The boundary conditions u x (0, t) =
0 and u x (L , t) = 0 are called Neumann Boundary conditions. The conditions
u(0, t) = 0 and u(L , t) = 0 are known as Dirichlet Boundary conditions. One
way to find the solution is to assume we can separate the variables so that we can
write Φ(x, t) = u(x)w(t). We assume a solution of the form Φ(x, t) = u(x)w(t)
and compute the needed partials. This leads to the new equation

β^2 (d^2u/dx^2) w(t) − u(x) w(t) − α u(x) (dw/dt) = 0.

Rewriting, we find for all x and t, we must have

w(t) ( β^2 d^2u/dx^2 − u(x) ) = α u(x) (dw/dt).

This tells us

( β^2 d^2u/dx^2 − u(x) ) / u(x) = ( α dw/dt ) / w(t),  0 ≤ x ≤ L, t > 0.

The only way this can be true is if both the left and right hand side are equal
to a constant that is usually called the separation constant Λ. This leads to the
decoupled equations

α dw/dt = Λ w(t),  t > 0,
β^2 d^2u/dx^2 = (1 + Λ) u(x),  0 ≤ x ≤ L.

We also have boundary conditions

du/dx(0) = 0
du/dx(L) = 0.
This gives us a second order ODE to solve in x and a first order ODE to solve in
t. We have a lot of discussion about this in the text which you should study. In
general, we find there is an infinite family of solutions that solve these coupled
ODE models which we can label u_n(x) and w_n(t). Thus, any finite combination Φ_N(x, t) = Σ_{n=0}^{N} a_n u_n(x) w_n(t) will solve these ODE models, but we are still left with satisfying the last condition that Φ(x, 0) = f(x). We do this by finding a series solution. We can show that the data function f can be written as a series f(x) = Σ_{n=0}^{∞} b_n u_n(x) for a set of constants {b_0, b_1, . . .} and we can also show that the series Φ(x, t) = Σ_{n=0}^{∞} a_n u_n(x) w_n(t) solves the last boundary condition Φ(x, 0) = Σ_{n=0}^{∞} a_n u_n(x) w_n(0) = f(x) as long as we choose a_n = b_n for all n.
The idea of a series and the mathematical machinery associated with that takes a
while to explain, so Chap. 16 is devoted to that, p. 495.
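A sketch of a truncated series solution is given below. It assumes that the Neumann conditions lead to the cosine family u_n(x) = cos(nπx/L) with Λ_n = −1 − β^2(nπ/L)^2 and w_n(t) = e^{Λ_n t/α}, and it uses an illustrative data function f; both of these choices are assumptions made for the sketch.

beta = 1; alpha = 1; L = 5; Nterms = 40;
f = @(x) x.*(L - x);                        % an illustrative data function f(x)
x = linspace(0, L, 400); t = 0.5;           % evaluate the solution at one time
b0 = trapz(x, f(x))/L;                      % the n = 0 cosine coefficient of f
Phi = b0*exp(-t/alpha)*ones(size(x));       % n = 0 term: u_0 = 1, Lambda_0 = -1
for n = 1:Nterms
  bn  = 2*trapz(x, f(x).*cos(n*pi*x/L))/L;  % remaining cosine coefficients of f
  Lam = -1 - beta^2*(n*pi/L)^2;             % assumed separation constant for mode n
  Phi = Phi + bn*cos(n*pi*x/L)*exp(Lam*t/alpha);
end
plot(x, f(x), x, Phi); legend('\Phi(x,0) = f(x)', '\Phi(x,0.5)');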

Tangent plane error We can characterize the error made when a function of
two variables is replaced by its tangent plane at a point better if we have ac-
cess to the second order partial derivatives of f . The value of f at the point
(x_0 + Δx, y_0 + Δy) can be expressed as follows:

f(x_0 + Δx, y_0 + Δy) = f(x_0, y_0) + f_x(x_0, y_0) Δx + f_y(x_0, y_0) Δy
  + (1/2) [Δx; Δy]^T H(x_0 + cΔx, y_0 + cΔy) [Δx; Δy]

where c is between 0 and 1, so that the tangent plane error is given by, p. 435

E(x_0, y_0, Δx, Δy) = (1/2) [Δx; Δy]^T H(x_0 + cΔx, y_0 + cΔy) [Δx; Δy],
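A small numerical illustration of the tangent plane approximation and its error, for a hypothetical function f and point, is:

f  = @(x,y) x.^2.*y + 3*y;                 % an illustrative function of two variables
fx = @(x,y) 2*x.*y;  fy = @(x,y) x.^2 + 3; % its first order partial derivatives
x0 = 1; y0 = 2; dx = 0.1; dy = -0.05;
T = f(x0,y0) + fx(x0,y0)*dx + fy(x0,y0)*dy % the tangent plane value, 8.2 here
E = f(x0+dx, y0+dy) - T                    % the error E, second order in (dx, dy)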
Index

A Experimental Evidence, 401


Advice on MatLab and Octave, 5 First allele inactivated by CIN at rate u c ,
404
First allele inactivated by point mutation
B at rate u 1 , 404
Background Reading Flawed modeling choices, 430
A selection of books to read to continue Is the CIN or non CIN pathway domi-
your journey, 525 nant?, 406
Losing the second allele due to CIN is
close to probability one, 407
C Loss of allele rates for neutral mutations,
Calculus for Cognitive Science 407
Roadmap, 4 Loss of heterozygosity, 403
Cancer Modeling Loss of second allele rates, 404
u c is proportional to u 1 , 407 Maximum human lifetime T , 406
Average human lifetime, 406 Micro satellite instability (MIN), 402
Chromosomal instability (CIN), 403 Mutation rates û 2 and û 3 give selective
CIN decay rates, u c required for CIN advantage, 404
dominance. with u 1 = u 2 = 10−7 and Mutation rates for u 1 and u c are popula-
u c = Ru 1 , 425 tion independent, 404
CIN Model Approximations with error Mutations with selective advantage de-
estimates in terms of parameters, 422 pend on population size, 404
Colon crypt, 403 Neutral mutation, 404
Colon Rectal Cancer Non CIN and CIN Model Approxima-
Mutation inactivates Adenomatous tions, 424
Polyposis Coli (APG) TSG path- Non CIN and CIN Model Approxima-
way, 403 tions Dependence on population size N
Estimating the error for the X 0 approxi- and the CIN rate u c for u 1 = u 2 10−7
mation, 414 and u c = Ru for some R ≥ 1, 423
Estimating the error for the X 1 approxi- Non CIN and CIN Model Approxima-
mation, 414 tions with error estimates using u 1 =
Estimating the error for the Y0 approxi- u 2 u and u c = Ru, 423
mation, 420 Non CIN Model Approximations with
Estimating the error for the Y1 approxi- error estimates in terms of parameters,
mation, 421 417
Estimating the error for the Y2 approxi- The N u 3 model, 407
mation, 422 The u 1 and u 2 values, 407

The u c model, 407 Real Inner Product, 510


The Y0 approximation equations, 418 The Span Of A Set Of Vectors, 509
The Y1 approximation equations, 420 Diabetes Detection Model
The Y2 approximation equations, 422 general discussion, 475
The Mathematical Model Graphically, hormones that chance blood glucose con-
405 centrations, 476
The Mathematical Model Initial Condi- net hormones and blood glucose, 477
tions, 406 the generic nonlinear net hormone H and
The Mathematical Model Rate Equa- blood glucose G dynamics, 477
tions, 406 an example for fitting data to the g
The population size N , 404 model, 481
Tumor Suppressor Gene (TSG), 402 converting the approximate system
Two allele model for TSG inactivation, into a second order ODE in the tran-
404 sient blood glucose concentration
Complex Functions g, 480
e(a+bi)t , 146 fitting the data using gradient descent,
Complex Numbers 482
e ib , 146 gradient descent is helped with line
Argument, 142 searching, 483
Complex conjugate, 143 linearizing the model using tangent
Complex Plane, 142 plane approximations, 478
Graph of a complex number, 143 reasoning out the algebraic signs of
Graph of a complex number’s conjugate, the partial derivatives in the approx-
143 imation, 479
Polar coordinate form, 142 rewriting as deviations from nominal
Roots of quadratic equations, 141 values, 477
the g model solution, 481
the approximate model incorporating
D the algebraic sign information, 480
Definition the data from the glucose tolerance
Disease Model test is fitted to the g model using a
Epidemic, 388 least squares error function, 481
Infectious to Susceptible Rate ρ, 388 the linear approximation model, 478
Eigenvalues and Eigenvectors of a 2 by the role of glucose in vertebrate
2 Matrix, 51 metabolism, 475
Eigenvalues and Eigenvectors of a n by Disease Model
n Matrix, 56 nullclines, 382
Extension Of eb to eib , 146 only quadrant one is relevant biologi-
Extension of et to e(a+bi)t , 146 cally, 385
Functions Satisfying Lipschitz Condi- the basics of the SIR model, 381
tions, 74 the population divided into three pieces:
Hyperbolic Functions, 395 susceptible, infectious and removed,
Derivatives, 396 381
Inverse Hyperbolic Tangent, 397 the rates of change of the three sub pop-
Linear Independence, 504 ulations, 382
The Angle Between n Dimensional Vec- The SIR model assumptions, 381
tors, 17 Disease Models
The Determinant Of A 2 × 2 Matrix, 19 Estimating the parameters from data, 392
The Inner Product Of Two Vectors, 12 estimating ρ, 398
Vector Space, 507 estimating d R/dt, 393
Vector Spaces
A Basis For A Vector Space, 509
Linear Independence For Non Finite E
Sets, 509 Euler’s Method

  dividing the time domain of the solution x into finite steps: introducing the idea of the step size h, 69
  the algorithm, 79
    examples, 79
  the error estimate at step 2h, 71
  the error estimate at step 3h, 73
  the error estimate at step h, 70
  the error estimate at step Nh, 73
  the euler approximate to the true solution x(h), 70
  the global error is proportional to h, 73
  the tangent line approximation to the true solution x, 69
  the tangent line approximation to the true solution x(2h) and the second Euler approximate at 2h, 70
  the tangent line approximation to the true solution x(2h) and the second Euler approximate at 3h, 72
  the tangent line approximation to the true solution x(h), 70

F
Fourier Series
  Fourier Cosine Coefficients for f: normalized, 512
  Fourier Series for f, 511
  Fourier Sine Coefficients for f: normalized, 511
Function Approximation
  first order Taylor Polynomial
    derivation, 64
    examples, 66
  second order Taylor Polynomial
    derivation, 67
    examples, 68
  Taylor Polynomials defined, 61
  zeroth order Taylor Polynomial
    derivation, 63

I
Integration
  Integration By Parts, 130
  Partial Fraction Decomposition, 135

L
Lemma
  Estimating The Exponential Function, 74
Linear Equations
  Coefficient Matrix, 23
  Consistent, 32
  Cramer's Rule, 28
  Data Vector, 23
  Inconsistent Example, 33
  Inconsistent implies data and columns not collinear, 33
  Matrix–Vector Form, 23
  Two Equation in Two Unknowns Form, 22
  Vector Variable, 23
  Zero Data
    Coefficient Matrix Has Nonzero Determinant, 36
    Consistent System, 36
    Consistent System Example, 37
Linear ODE Systems
  r can not satisfy det(rI − A) ≠ 0, 175
  r must satisfy det(rI − A) = 0, 175
  r satisfying det(rI − A) = 0 are called eigenvalues of the system, 176
  x′ = 0 lines, 192
  y′ = 0 lines, 192
  Assume solution has form V e^{rt}, 173
  Can trajectories cross, 198
  Characteristic Equation, 178
  Combining the x′ = 0 and y′ lines, 193
  Complex Eigenvalues, 222
    Canonical Real Solution in Phase Shifted Form, 229
    First Real Solution, 223
    General Complex Solution, 222
    General Real Solution, 223
    Matrix Representation of the Real Solution, 226
    Phase Shifted Form Of Real Solution Example, 231
    Representing the Coefficient Matrix for the General Real Solution, 227
    Rewriting the Real Solution, 225
    Second Real Solution, 223
    The Canonical Real Solution, 228
    The Phase Shifted Form for the Real Solution, 230
    The Transformed Real Solution, 227
    Theoretical Analysis, 222
    Worked Out Example, 223
  Derivation of the characteristic equation, 173
  Dominant solution, 185
  eigenvalues, 183
  eigenvectors, 184
  Factored form matrix system r and V must solve, 175
  Formal rules for drawing trajectories, 200
  General form, 171
  General graphical analysis for a linear ODE systems problem, 208
  General solution, 176, 178, 182, 185
  Graphical analysis, 192
  Graphing all trajectories, 202
  Graphing Region I trajectories, 196
  Graphing Region II trajectories, 200
  Graphing Region III trajectories, 201
  Graphing Region IV trajectories, 202
  Graphing the eigenvector lines, 194
  magnified Region II trajectories, 201
  Matrix system r and V must solve, 174
  Matrix system solution V e^{rt} must solve, 174
  Matrix–vector form, 182
  Nonzero V satisfying (rI − A)V = 0 are called eigenvectors corresponding to the eigenvalue r of the system, 178
  nullclines, 192
  Numerical Methods
    Converting second order linear problems to ODE systems, 238
    MatLab implementation, 246
    MatLab implementation: linear second order ODE system dynamics, 242
  Repeated Eigenvalues, 216
    One Independent Eigenvector: Finding F, 219
    One Independent Eigenvector: General Solution, 219
    One Independent Eigenvector: Second Solution, 218
    Only One Independent Eigenvector, 217
    Two Independent Eigenvectors: General Solution, 217
    Two Linearly Independent Eigenvectors, 216
  Rewriting in matrix–vector form, 172
  Solution to IVP, 185
  The derivative of V e^{rt}, 174
  The Eigenvalues Of A Linear System Of ODEs, 178
  Two negative eigenvalues case, 205
  Vector form of the solution, 185
Linear PDE Models
  Neumann and Dirichlet Boundary Conditions, 495
  The Cable Equation, 495
  Separation of Variables
    Definition, 496
Linear Second Order ODE
  Definition, 149
  Derivation of the characteristic equation, 152
  Derivative of e^{(a+bi)t}, 162
  Deriving the second solution to the repeated roots case, 158
  Factoring Operator forms, 151
  General complex solution, 163
  General real solution, 163
  General solution for distinct roots case, 154
  Graphing distinct roots case in MatLab, 155
  Graphing real solution to complex root case in MatLab, 165
  Graphing repeated roots solution in MatLab, 159
  Operator form of au″ + bu′ + cu = 0, 150
  Operator form of u′ = ru, 150

M
MatLab
  A Predator–Prey Model
    session code for full phase plane portrait, 465
    session for full phase plane portrait: the dynamics, 462
    session for phase plane portrait for equilibrium point Q1, 461
    session for phase plane portrait for equilibrium point Q2, 462
    session for the Jacobian for equilibrium point Q1, 460
    session for the Jacobian for equilibrium point Q2, 462
  A Predator–Prey Model with Self Interaction
    the nullclines intersect in Quadrant 1: Jacobian for equilibrium point Q3, 467
    the nullclines intersect in Quadrant 1: Jacobian for equilibrium point Q4, 466
    the nullclines intersect in Quadrant 1: session code for full phase plane portrait, 468
    the nullclines intersect in Quadrant 1: session code for phase plane portrait for equilibrium point Q4, 466
    the nullclines intersect in Quadrant 4: session code for phase plane plot at equilibrium point Q4, 470
    the nullclines intersect in Quadrant 4: session code for the dynamics, 471
    the nullclines intersect in Quadrant 4: session code for the full phase plot, 472
    the nullclines intersect in Quadrant 4: session code for the Jacobian at equilibrium point Q3, 471
    the nullclines intersect in Quadrant 4: session code for the Jacobian at equilibrium point Q4, 469
    the nullclines intersect in Quadrant 4: session code for the phase plot at equilibrium point Q3, 471
  Automated Phase Plane Plot for x′ = 6x − 5xy, y′ = −7y + 4xy, 349
  AutoPhasePlanePlot Arguments, 252
  AutoPhasePlanePlot.m, 252
  AutoPhasePlanePlotLinearModel, 261
  Bisection Code, 444
    a sample bisection session, 445
    bisection code that lists each step in the search, 445
    finding roots of tan(x/4) = 1 showing each step, 445
  Diabetes Model
    Nonlinear Least Squares code version one, 484
    Nonlinear Least Squares code version one: implementing the error calculation, 485
    Nonlinear Least Squares code version one: runtime results with no line search, 486
    Nonlinear Least Squares code version two: line search added, 488
    Nonlinear Least Squares code version two: results from perturbing the initial data, 490
    Nonlinear Least Squares code version two: run time results with line search, 489
    Nonlinear Least Squares code version two: the line search function, 487
    Nonlinear Least Squares code version two: the optimal parameter values, 490
    Nonlinear Least Squares code version two: typical line search details, 489
    Nonlinear Least Squares: code version one: finding the gradient, 485
  Disease Model
    Solving S′ = −5SI, I′ = 5SI + 25I, S(0) = 10, I(0) = 5: Epidemic, 391
  DrawSimpleSurface, 86
  DrawTangentPlanePackage, 104
  Eigenvalues in Matlab: eig, 254
  Eigenvectors and Eigenvalues in Matlab: eig, 255
  Example 5×5 Eigenvalue and Eigenvector Calculation in Matlab, 255
  Find the Eigenvalues and Eigenvectors in Matlab, 258
  finding equilibrium points with Global Newton code for the trigger model, 454
  Finite Difference Global Newton Method, 450
  FixedRK.m: The Runge–Kutta Solution, 83
  Function Definition In MatLab, 444
  Global Newton Code
    a sample session for finding the roots of sin(x), 449
  Global Newton Function, 449
  Global Newton Function Derivative, 449
  Global Newton Method, 447
  Inner Products in Matlab: dot, 256
  Linear Second Order Dynamics, 242
  LowerTriangular Solver, 43
  LU Decomposition of A With Pivoting, 48
  LU Decomposition of A Without Pivoting, 47
  Newton's Method Code
    a sample finite difference session, 452
    finite difference approximations to the derivatives, 450
    sample finite difference session output, 452
    secant approximations to the derivatives, 450
  Nonlinear ODE Models
    encoding a Jacobian, 439
    first example: a sample phase plane portrait and the problems we have in generating it, 442
    first example: finding the eigenvalues and eigenvectors of the Jacobian at equilibrium point (0, 0), 439
    first example: finding the eigenvalues and eigenvectors of the Jacobian at equilibrium point (0, 1), 441
    first example: finding the eigenvalues and eigenvectors of the Jacobian at equilibrium point (1, 0), 440
    first example: setting up the dynamics, 442
  Phase Plane for x′ = 4x + 9y, y′ = −x − 6y, x(0) = 4, y(0) = −2, 250
  Plotting the full phase plane – this is harder to automate
    AutoPhasePlanePlotRKF5NoPMultiple, 463
  RKstep.m: Runge–Kutta Codes, 82
  Sample Linear Model: x′ = 4x − y, y′ = 8x − 5y, 257
  Should We Do A Newton Step?, 447
  Solving x′ = −3x + 4y, y′ = −x + 2y, x(0) = −1, y(0) = 1, 248
  Solving x″ + 4x′ − 5x = 0, x(0) = −1, x′(0) = 1, 244
  Solving x″ + 4x′ − 5x = 0, x(0) = −1, x′(0) = 1, 243
  Solving x′ = −3x + 4y, y′ = −x + 2y, x(0) = −1, y(0) = 1, 246
  the trigger model
    session for equilibrium point Q1, 455
    session for equilibrium point Q1, 456
    session for phase plane portrait for equilibrium point Q1, 455
    session for phase plane portrait for equilibrium point Q1: defining the linearized dynamics, 455
    session for phase plane portrait for equilibrium point Q2, 457
    session for phase plane portrait for equilibrium point Q3, 457, 458
    session for phase plane portrait of entire model, 459
    session for phase plane portrait of entire model: defining the dynamics, 458
  Upper Triangular Solver, 44
Matrix
  2 × 2 Determinant, 19
  Derivation of the characteristic equation for eigenvalues, 51
  Eigenvalue equation, 50
  Nonzero vectors which solve the eigenvalue equation for an eigenvalue are called its eigenvectors, 52
  Roots of the characteristic equation are called eigenvalues, 52
MultiVariable Calculus
  2D Vectors in terms of i and j, 85
  3D Vectors in terms of i, j and k, 86
  Continuity, 89
  Continuous Partials Imply Differentiability, 114
  Drawing a surface
    A sample drawing session, 87
    Putting the pieces together as a utility function, 86
  Error Form of Differentiability For Two Variables, 106
  Extremals
    manipulating the tangent plane error equation as a quadratic: completing the square, 119
    Minima and maxima can occur where the tangent plane is flat, 118
  Hessian, 115
  Mean Value Theorem for Two Variables, 112
  Partial Derivatives, 93
    1D derivatives on traces, 90
    Defining the partial derivative using the traces, 93
    Drawing the traces tangent vectors in 3D, 91
    Notations, 94
  Planes, 98
  Planes Again, 98
  Second Order Partials, 114
  Smoothness
    2D differentiable implies 2D continuity, 107
    A review of 1D continuity ideas, 88
    An example of a 2D function that is not continuous at a point, 89
    Continuity in 2D, 89
    Defining differentiability for functions of two variables, 106
    Nonzero values occur in neighborhoods, 120
    The 2D chain rule, 108
  Surface Approximation
    Deriving the hessian error result, 116
    Looking at the algebraic signs of the terms in the tangent plane quadratic error, 120
    The new tangent plane equation with hessian error, 117
  The Gradient, 103
  Vector Cross Product, 100

N
Nonlinear ODE Models
  A Predator–Prey Model
    the analysis for equilibrium point Q1: eigenvalues and eigenvectors, 461
    the analysis for equilibrium point Q2, 462
  a first example: x′ = (1 − x)x − 2xy/(1 + x) and y′ = (1 − y/(1 + x))y, 438
  a first example
    equilibrium points and the Jacobians, 439
    the local linear system at equilibrium point (0, 0), 440
    the local linear system at equilibrium point (0, 1), 441
    the local linear system at equilibrium point (1, 0), 441
  A Predator–Prey Model, 459
    the analysis for equilibrium point Q1, 460
    the analysis for equilibrium point Q2: eigenvalues and eigenvectors, 462
    the Jacobian, 460
  A Predator–Prey Model with Self Interaction, 465
    equilibrium point Q3: eigenvalues and eigenvectors, 467
    the nullclines intersect in Quadrant 1, 465
    the nullclines intersect in Quadrant 1: equilibrium point Q3, 467
    the nullclines intersect in Quadrant 1: equilibrium point Q4, 466
    the nullclines intersect in Quadrant 1: equilibrium point Q4: eigenvalues and eigenvectors, 466
    the nullclines intersect in Quadrant 1: the Jacobian, 465
    the nullclines intersect in Quadrant 4, 469
    the nullclines intersect in Quadrant 4: equilibrium point Q3, 471
    the nullclines intersect in Quadrant 4: equilibrium point Q3: eigenvalues and eigenvectors, 471
    the nullclines intersect in Quadrant 4: equilibrium point Q4, 469
    the nullclines intersect in Quadrant 4: equilibrium point Q4: eigenvalues and eigenvectors, 470
  a standard change of variable at the equilibrium point gives a simple 2D linear model, 438
  basic approximations, 435
  equilibrium point dynamic approximations are simple, 437
  equilibrium points (x_0, y_0) are where the dynamics satisfy both f(x_0, y_0) = 0 and g(x_0, y_0) = 0, 437
  expressing the nonlinear model dynamics in terms of the tangent plane approximations, 436
  finding equilibrium points numerically, 443
    the bisection method, 443
  generating phase plane portraits at equilibrium points, 442
  linearization in terms of the Jacobian at an equilibrium point, 438
  the error made in replacing the nonlinear dynamics by the tangent plane approximations is given in terms of a Hessian, 437
  the Jacobian, 437
  the trigger model, 453
    Analysis for equilibrium point Q1, 455
    Analysis for equilibrium point Q1: eigenvalues and eigenvectors, 455
    Analysis for equilibrium point Q2, 456
    Analysis for equilibrium point Q2: eigenvalues and eigenvectors, 457
    Analysis for equilibrium point Q3, 457
    Analysis for equilibrium point Q3: eigenvalues and eigenvectors, 458
    Jacobian, 454
Numerical Solution Techniques
  Euler's Method, 79
  Runge–Kutta Methods
    description, 80
    MatLab implementation: FixedRK.m, 83
    MatLab implementation: RKstep.m, 81
    Order one to four, 81

O
ODE Systems
  Nonhomogeneous Model, 213
Outside reading
  Theoretical biology, 525

P
Predator–Prey Models
  f and g growth functions, 309
  A sample Predator–Prey analysis, 336
  Adding fishing rates, 338
  Distinct trajectories do not cross, 299
  Food fish assumptions, 290
  Nonlinear conservation law in Region II, 301
  Nonlinear conservation law Region IV, 307
  nullclines, 295
  Numerical solutions, 341
  Original Mediterranean sea data, 289
  Phrasing the nonlinear conservation law in terms of f and g, 312
  Plotting one point, 326
  Plotting three points, 330
  Plotting two points, 328
  Positive x axis trajectory, 300
  Positive y axis trajectory, 299
  Predator assumptions, 291
  Quadrant 1 trajectory possibilities, 309
  Showing a trajectory is bounded in x: general argument, 319
  Showing a trajectory is bounded in x: specific example, 315
  Showing a trajectory is bounded in y: general argument, 320
  Showing a trajectory is bounded in y: specific example, 317
  Showing a trajectory is bounded: general argument, 322
  Showing a trajectory is bounded: specific example, 318
  Showing a trajectory is periodic, 324
  The average value of x and y on a trajectory, 332
  The Predator–Prey model explains the Mediterranean Sea data, 339
Predator–Prey Self Interaction Models
  x′ = 0 line, 356
  y′ = 0 line, 357
  Asymptotic values, 374
  How to analyze a model, 375
  Limiting average values, 370
  Modeling food–food and predator–predator interaction, 355
  nullclines, 356
  Q1 Intersection point of the nullclines, 364
  Q4 intersection of nullclines, 366
  Quadrant 1 x′ = 0 line, 360
  Quadrant 1 y′ = 0 line, 360
  Solving numerically, 375
  The model is not biologically useful, 379
  The nullclines cross: c/d < a/e, 364
  Trajectory starting on positive y axis, 361
Proposition
  Complex Number Properties, 144

R
Root Finding Algorithms
  bisection, 443
  Newton's Method
    secant approximations to derivatives, 450
  Newton's Method with finite difference approximations for the derivatives, 450
  Newton's Method with the coded derivatives, 447

S
Separation of Variables
  The Cable Equation
    1 + the Separation Constant is Zero, 498
    1 + the Separation Constant is Negative, 499
    Determining the Separation Constant, 497
    Handling the Applied Boundary Data Function, 513
    Series Coefficient Matching, 513
    The 1 + the Separation Constant is Positive, 497
    The Family Of Nonzero Solutions for u, 1 + Separation Constant is Negative, 499
    The Family Of Nonzero Solutions for w, 1 + Separation Constant is Negative, 500
    The Family Of Nonzero Solutions for w, Separation Constant is −1, 499
    The Infinite Series Solution, 512
    The Separated ODEs, 496
    The Separation Constant, 496
  The Heat Equation
    The Family Of Nonzero Solutions for u, Separation Constant is −1, 498
Series
  Trigonometric Series, 511
    Convergence of a sequence of sums of functions, 502
    Finite sums of sine and cosine functions, 501
    Infinite Series notation for the remainder after n terms, ∑_{i=n+1}^∞ a_i sin(iπx_0/L), 502
    Partial sums, 502
    Writing limit S(x_0) using the series notation ∑_{i=1}^∞, 502

T
The Inverse of the 2 × 2 matrix A, 40
The Inverse of the matrix A, 39
Theorem
  Cauchy–Schwartz Inequality, 510
  Chain Rule For Two Variables, 109
  Cramer's Rule, 28
  Differentiability Implies Continuous For Two Variables, 107
  Error Estimates For Euler's Method, 76
  Linear ODE Systems
    Drawing, 200
  MultiVariable Calculus
    Nonzero values occur in neighborhoods, 120
    Second Order Test for Extrema, 121
  Rolle's Theorem, 62
  Taylor Polynomials
    First Order, 66, 68
    Zeroth Order, 63
  The Mean Value Theorem, 62
  Three Functions are Linearly Independent if and only if their Wronskian is not zero, 506
  Two Functions are Linearly Independent if and only if their Wronskian is not zero, 505
Time Dependent Euler's Method
  error estimates analysis, 76
  finding bounds for |x′(t)| on the time domain, 75
  needed ideas: exponential function bounds, 74
  needed ideas: functions satisfying Lipschitz conditions, 74
Traditional engineering calculus course, 3
Training Interdisciplinary Students
  Fostering the triad: biology, mathematics and computational, 520
  Overcoming disciplinary boundaries, 517

V
Vector
  Angle Between A Pair Of Two Dimensional Vectors, 16
  Inner product of two vectors, 12
  n Dimensional Cauchy Schwartz Theorem, 16
  Orthogonal, 26
  Two Dimensional Cauchy Schwartz Theorem, 15
  Two Dimensional Collinear, 15
  What does the inner product mean?, 14

W
Worked Out Solutions, 20
  Adding fishing rates to the predator–prey model, 339
  Angle Between, 18
  Collinearity Of, 20, 21
  Complex Functions
    e^{(−1+2i)t}, 147
    e^{(−2+8i)t}, 147
    e^{2it}, 147
  Complex Numbers
    −2 + 8i, 144
    z = 2 + 4i, 143
  Convert to a matrix–vector system, 172
  Converting into a matrix–vector system, 172
  Converting to a matrix–vector system, 239, 240
  Converting to a vector–matrix system, 239
  Cramer's Rule, 28, 29
  Deriving the characteristic equation, 179, 180
  Determine Consistency, 33
  Euler's Method, 79, 80
  Inner Product, 17
  Integration
    By Parts
      ∫ 10/((2t − 3)(8t + 5)) dt, 137
      ∫ 5/((t + 3)(t − 4)) dt, 136
      ∫ 6/((4 − t)(9 + t)) dt, 138
      ∫ ln(t) dt, 130
      ∫ t^3 sin(t) dt, 133
      ∫_4^{17} −6/((t − 2)(2t + 8)) dt, 138
      ∫ t ln(t) dt, 131
      ∫ t e^{2t} dt, 131
      ∫ t^2 sin(t) dt, 133
      ∫ t/e^{3t} dt, 132
      ∫ t ln(t) dt, 131
  Linear second order ODE: Characteristic equation complex roots case, derivation, 164
  Linear second order ODE: Characteristic equation derivation repeated roots case, 159
  Linear second order ODE: Characteristic equation distinct roots case, derivation, 154
  Linear second order ODE: Solution complex roots case, 164
  Linear second order ODE: Solution distinct roots case, 154
  Linear second order ODE: Solution repeated roots case, 159
  Matrix Vector Equation, 23, 24
  Partial Derivatives, 95
  Solving numerically, 249
  Solving The Consistent System, 37
  Solving the IVP completely, 186, 189
  Solving the Predator–Prey model, 336
