CST, Calculus For Cognitive Scientists Higher Order Models and Their Analysis
CST, Calculus For Cognitive Scientists Higher Order Models and Their Analysis
James K. Peterson
Calculus for
Cognitive
Scientists
Higher Order Models and Their Analysis
Cognitive Science and Technology
Series editor
David M.W. Powers, Adelaide, Australia
123
James K. Peterson
Department of Mathematical Sciences
Clemson University
Clemson, SC
USA
We would like to thank all the students who have used the various iterations
of these notes as they have evolved from handwritten to the fourth fully typed
version here. We particularly appreciate your interest as this course is required and
uses mathematics; a combination that causes fear in many biological science
majors. We have been pleased by the enthusiasm you have brought to this inter-
esting combination of ideas from many disciplines. Finally, we gratefully
acknowledge the support of Hap Wheeler in the Department of Biological Sciences
during the years 2006 to 2014 for believing that this material would be useful to
biology students.
For this new text on a follow-up course to the first course on calculus for
cognitive scientists, we would like to thank all of the students from Spring 2006 to
Fall 2014 for their comments and patience with the inevitable typographical errors,
mistakes in the way we explained topics, and organizational flaws as we have
taught second semester of calculus ideas to them. This new text starts assuming you
know something the equivalent of a first semester course in calculus and particu-
larly know about exponential and logarithm functions, first-order models and the
MATLAB tools needed to solve the models numerically. In addition, you need to
know a fair bit of a start into calculus for functions of more than one variable and
the ideas of approximation to functions of one and two variables. These are not
really standard topics in just one course in calculus, which is why our first volume
was written to provide coverage of all those things. In addition, all of the mathe-
matics subserve ideas from biological models so that everything is wrapped toge-
ther in a pleasing package!
With that background given, in this text, we add new material on linear and
nonlinear systems models and more biological models. We also cover a useful way
of solving what are called linear partial differential equations using the technique
vii
viii Acknowledgments
named Separation of Variables. To make sense of all this, we naturally have to dip
into mathematical waters at appropriate points and we are not shy about that! But
rest assured, everything we do is carefully planned because it is of great use to you
in your attempts to forge an alliance between cognitive science, mathematics, and
computation.
Contents
Part I Introduction
1 Introductory Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.1 A Roadmap to the Text. . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.2 Code. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.3 Final Thoughts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
Part II Review
2 Linear Algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.1 The Inner Product of Two Column Vectors . . . . . . . . . . . . . . 11
2.1.1 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.2 Interpreting the Inner Product. . . . . . . . . . . . . . . . . . . . . . . . 13
2.2.1 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.2.2 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.3 Determinants of 2 2 Matrices . . . . . . . . . . . . . . . . . . . . . . 19
2.3.1 Worked Out Problems . . . . . . . . . . . . . . . . . . . . . . 20
2.3.2 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.4 Systems of Two Linear Equations. . . . . . . . . . . . . . . . . . . . . 21
2.4.1 Worked Out Examples. . . . . . . . . . . . . . . . . . . . . . 23
2.4.2 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.5 Solving Two Linear Equations in Two Unknowns . . . . . . . . . 25
2.5.1 Worked Out Examples. . . . . . . . . . . . . . . . . . . . . . 28
2.5.2 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.6 Consistent and Inconsistent Systems . . . . . . . . . . . . . . . . . . . 31
2.6.1 Worked Out Examples. . . . . . . . . . . . . . . . . . . . . . 33
2.6.2 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
2.7 Specializing to Zero Data . . . . . . . . . . . . . . . . . . . . . . . . . . 35
2.7.1 Worked Out Examples. . . . . . . . . . . . . . . . . . . . . . 37
2.7.2 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
ix
x Contents
Glossary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 531
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 547
List of Figures
xvii
xviii List of Figures
xxiii
List of Code Examples
xxv
xxvi List of Code Examples
This book tries to show how mathematics, computer science, and biology can be
usefully and pleasurably intertwined. The first volume (J. Peterson, Calculus for
Cognitive Scientists: Derivatives, Integration and Modeling (Springer, Singapore,
2015 in press)) discussed the necessary one- and two-variable calculus tools as well
as first-order ODE models. In this volume, we explicitly focus on two-variable
ODE models both linear and nonlinear and learn both theoretical and computational
tools using MATLAB to help us understand their solutions. We also go over
carefully on how to solve cable models using separation of variables and Fourier
series. And we must always caution you to be careful to make sure the use of
mathematics gives you insight. These cautionary words about the modeling of the
physics of stars from 1938 should be taken to heart:
Technical journals are filled with elaborate papers on conditions in the interiors of model
gaseous spheres, but these discussions have, for the most part, the character of exercises in
mathematical physics rather than astronomical investigations, and it is difficult to judge the
degree of resemblance between the models and actual stars. Differential equations are like
servants in livery: it is honourable to be able to command them, but they are “yes” men,
loyally giving support and amplification to the ideas entrusted to them by their master—
Paul W. Merrill, The Nature of Variable Stars, 1938, quoted in Arthur I. Miller Empire
of the Stars: Obsession, Friendship, and Betrayal in the Quest for Black Holes, 2005.
where we have taken the liberty to replace physics with our domain here of biology.
We should never forget the last line
xxxi
xxxii Abstract
We must always take our modeling results and go back to the scientists to make
sure they retain relevance.
History
Based On:
Notes On MTHSC 108 for Biologists developed during the
Spring 2006 and Fall 2006,
Spring 2007 and Fall 2007 courses
The first edition of this text was used in
Spring 2008 and Fall 2008,
The course was then relabeled at MTHSC 111
and the text was used in
Spring 2009 and Fall 2009 courses
The second edition of this text was used in
Spring 2010 and Fall 2010 courses
The third edition was used in the
Spring 2011 and Fall 2011,
Spring 2012 and Summer Session I courses
The fourth edition was used in
Fall 2012, Spring 2013, and Fall 2013 courses
The fifth edition was used in the Spring 2014 course
Also, we have used material from notes on
Partial Differential Equation Models
which has been taught to small numbers of students since 2008.
xxxiii
Part I
Introduction
Chapter 1
Introductory Remarks
In this course, we will try to introduce beginning cognitive scientists to more of the
kinds of mathematics and mathematical reasoning that will be useful to them if they
continue to live, grow and learn within this area. In our twenty first century world,
there is a tight integration between the areas of mathematics, computer science and
science. Now a traditional Calculus course for the engineering and physical sciences
consists of a four semester sequence as shown in Table 1.1.
Unfortunately, this sequence of courses is heavily slanted towards the needs of the
physical sciences. For example, many of our examples come from physics, chemistry
and so forth. As long as everyone in the class has the common language to under-
stand the examples, the examples are a wonderful tool for adding insight. However,
the typical students who are interested in cognitive science often find the language
of physics and engineering to be outside their comfort zone. Hence, the examples
lose their explanatory power. Our first course starts a different way of teaching this
material and this text will continue that journey. Our philosophy, as usual, is that all
parts of the course must be integrated, so we don’t want to use mathematics, science
or computer approaches for their own intrinsic value. Experts in these separate fields
must work hard to avoid this. This is the time to be generalists and always look for
connective approaches. Also, models are carefully chosen to illustrate the basic idea
that we know far too much detail about virtually any biologically based system we
can think of. Hence, we must learn to throw away information in the search of the
appropriate abstraction. The resulting ideas can then be phrased in terms of mathe-
matics and simulated or solved with computer based tools. However, the results are
not useful, and must be discarded and the model changed, if the predictions and illu-
minating insights we gain from the model are incorrect. We must always remember
that throwing away information allows for the possibility of mistakes. This is a hard
lesson to learn, but important. Note that models from population biology, genetics,
cognitive dysfunction, regulatory gene circuits and many others are good examples
to work with. All require massive amounts of abstraction and data pruning to get
anywhere, but the illumination payoffs are potentially quite large.
In this course, we introduce enough relevant mathematics and the beginnings of useful
computational tools so that you can begin to understand a fair bit about Biological
Modeling. We present a selection of nonlinear biological models and slowly build you
to the point where you can begin to have a feel for the model building process. We start
our model discussion with the classical Predator–Prey model in Chap. 10. We try to
talk about it as completely as possible and we use it as a vehicle to show how graphical
analysis coupled with careful mathematical reasoning can give us great insight. We
discuss completely the theory of the original Predator–Prey model in Sect. 10.1 and
its qualitative analysis in Sect. 10.5. We then introduce the use of computational tools
to solve the Predator–Prey model using MatLab in Sect. 10.11. While this model is
very successful at modeling biology, the addition of self-interaction terms is not.
The self-interaction models are analyzed in Chap. 11 and computational tools are
discussed in Sect. 11.8.
In Chap. 12, we show you a simple infectious disease model. The nullclines for this
model are developed in Sect. 12.1 and our reasoning why only trajectories that start
with positive initial conditions are biologically relevant are explained in Sect. 12.2.
1.1 A Roadmap to the Text 5
The infectious versus susceptible curve is then derived in Sect. 12.3. We finish this
Chapter with a long discussion of how we use a bit of mathematical wizardry to
develop a way to estimate the value of ρ in these disease models by using data
gathered on the value of R . This analysis in Sect. 12.6, while complicated, is well
worth your effort to peruse!
In Chap. 13, we show you a simple model of colon cancer which while linear
is made more complicated by its higher dimensionality—there are 6 variables of
interest now and graphical analysis is of less help. We try hard to show you how
we can use this model to get insight as to when point mutations or chromosomal
instability are the dominant pathway to cancer.
In Chap. 15, we go over a simple model of insulin detection using second order
models which have complex roots. We use the phase shifted form to try to detect
insulin from certain types of data.
1.2 Code
The code for much of this text is in the directory ODE in our code folder which you
can download from Biological Information Processing (http://www.ces.clemson.
edu/~petersj/CognitiveModels.html). These code samples can then be downloaded
as the zipped tar ball CognitiveCode.tar.gz and unpacked where you wish. If you
have access to MatLab, just add this folder with its sub folders to your MatLab path.
If you don’t have such access, download and install Octave on your laptop. Now
Octave is more of a command line tool, so the process of adding paths is a bit more
tedious. When we start up an Octave session, we use the following trick. We write
up our paths in a file we call MyPath.m. For us, this code looks like this
The paths we want to add are setup as strings, here called s1 etc., and to use this,
we start up Octave like so. We copy MyPath.m into our working directory and then
do this
We agree it is not as nice as working in MatLab, but it is free! You still have
to think a bit about how to do the paths. For example, in Peterson (2015c), we
develop two different ways to handle graphs in MatLab. The first is in the direc-
tory GraphsGlobal and the second is in the directory Graphs. They are not
to be used together. So if we wanted to use the setup of Graphs and noth-
ing else, we would edit the MyPath.m file to set s = [s11]; only. If we
wanted to use the GraphsGlobal code, we would edit MyPath.m so that
s11 = ’/home/petersj/MatLabFiles/BioInfo/GraphsGlobal:’;
and then set s = [s11];. Note the directories in the MyPath.m are ours: the main
directory is ’/home/petersj/MatLabFiles/BioInfo/ and of course,
you will have to edit this file to put your directory information in there instead
of ours.
All the code will work fine with Octave. So pull up a chair, grab a cup of coffee
or tea and let’s get started.
References
J. Peterson, Calculus for Cognitive Scientists: Derivatives, Integration and Modeling, Springer
Series on Cognitive Science and Technology (Springer Science+Business Media Singapore Pte
Ltd, Singapore, 2015a in press)
J. Peterson, Calculus for Cognitive Scientists: Partial Differential Equation Models, Springer Series
on Cognitive Science and Technology (Springer Science+Business Media Singapore Pte Ltd,
Singapore, 2015b in press)
J. Peterson, BioInformation Processing: A Primer On Computational Cognitive Science, Springer
Series on Cognitive Science and Technology (Springer Science+Business Media Singapore Pte
Ltd, Singapore, 2015c in press)
Part II
Review
Chapter 2
Linear Algebra
We need to use both vector and matrix ideas in this course. This was covered already
in the first text (Peterson 2015), so we will assume you can review that material
before you start into this chapter. Here we will introduce some new ideas as well
as tools in MatLab we can use to solve what are called linear algebra problems; i.e.
systems of equations. Let’s begin by looking at inner products more closely.
We can also define the inner product of two vectors. If V and W are two column
vectors of size n × 1, then the product V T W is a matrix of size 1 × 1 which we
identify with a real number. We see if
⎡ ⎤ ⎡ ⎤
V1 W1
⎢ V2 ⎥ ⎢ W2 ⎥
⎢ ⎥ ⎢ ⎥
⎢ ⎥ ⎢ ⎥
V = ⎢ V3 ⎥ and W = ⎢ W3 ⎥
⎢ .. ⎥ ⎢ .. ⎥
⎣.⎦ ⎣ . ⎦
Vn Wn
V T W = W T V = V , W [V1 W1 + V2 W2 + V3 W3 + · · · + Vn Wn ]
and we identify this one by one matrix with the real number
V1 W1 + V2 W2 + V3 W3 + · · · + Vn Wn
V1 W1 + V2 W2 + V3 W3 + · · · + Vn Wn
2.1.1 Homework
Exercise 2.1.1 Find the dot product of the vectors V and W given by
6 7
V = and W = .
1 2
Exercise 2.1.2 Find the dot product of the vectors V and W given by
−6 2
V = and W = .
−8 6
Exercise 2.1.3 Find the dot product of the vectors V and W given by
10 2
V = and W = .
−4 80
We add, subtract and scalar multiply vectors and matrices as usual. We also suggest
you review how to do matrix–vector multiplications. Multiplication of matrices is
more complex as we discussed in the volume (Peterson 2015). Let’s go through it
again a bit more abstractly. Recall the dot product of two vectors V and V W is
defined to be
n
V , W = Vi Wi
i=1
where n is the number of components in the vectors. Using this we can define the
multiplication of the matrix A of size n × p with the matrix B of size p × m as
follows.
2.1 The Inner Product of Two Column Vectors 13
⎡ ⎤
Row 1 of A
⎢ Row 2 of A⎥
⎢ ⎥
⎢ .. ⎥ Column 1 of B | · · · | Column n of B
⎣ . ⎦
Row n of A
⎡ ⎤
< Row 1 of A, Column 1 of B > . . . < Row 1 of A, Column n of B >
⎢< Row 2 of A, Column 1 of B > . . . < Row 2 of A, Column n of B >⎥
⎢ ⎥
=⎢ .. .. .. ⎥
⎣ . . . ⎦
< Row n of A, Column 1 of B > . . . < Row n of A, Column n of B >
We can write this more succinctly with we let Ai denote the rows of A and B i be the
columns of B. Note the use of subscripts for the rows and superscripts for the columns.
Then, we can rewrite the matrix multiplication algorithm more compactly as
⎡ ⎤ ⎡ ⎤
A1 A1 , B 1 . . . A1 , B n
⎢ A2 ⎥ ⎢ A2 , B 1 . . . A2 , B n ⎥
⎢ ⎥ ⎢ ⎥
⎢ .. ⎥ B1 | · · · | Bn =⎢ .. .. .. ⎥
⎣ . ⎦ ⎣ . . . ⎦
An An , B . . . An , B
1 1
AB i j = Ai , B j .
Comment 2.1.1 If A is a matrix of any size and 0 is the appropriate zero matrix
of the same size, then both 0 + A and A + 0 are nicely defined operations and the
result is just A.
Comment 2.1.2 Matrix multiplication is not communicative: i.e. for square matri-
ces A and B, the matrix product A B is not necessarily the same as the product B A.
What could this number < V, W > possibly mean? To figure this out, we have to
do some algebra. Let’s specialize to nonzero column vectors with only 2 compo-
nents. Let
a b
V = and W =
c d
Since these vectors are not zero, only one of the terms in (a, c) and in (b, d) can be
zero because otherwise both components would be zero and we are assuming these
vectors are not the zero vector. We will use this fact in a bit. Now here < V, W > =
ab + cd. So
14 2 Linear Algebra
|| V ||2 || W ||2 = a 2 + c2 b2 + d 2
= a 2 b2 + a 2 d 2 + c2 b2 + c2 d 2
Thus,
Now, this does look complicated, doesn’t it? But this last term is something squared
and so it must be non-negative! Hence, taking square roots, we have shown that
|< V, W >| ≤ || V || || W ||
Note, since a real number is always less than or equal to it absolute value, we can
also say
< V, W > ≤ || V || || W ||
And we can say more. If it turned out that the term (ad − bc)2 was zero, then
ad − bc = 0. There are then a few cases to look at.
1. If all the terms a, b, c and d are not zero, then we can write ad = bc implies a/c =
b/d. We know the vector V can be interpreted as the line segment starting at (0, 0)
on the line with equation y = (a/c)x. Similarly, the vector W can be interpreted
as the line segment connecting (0, 0) and (b, d) on the line y = (b/d)x. Since
a/c = b/d, these lines are the same. So both points (a, c) and (b, d) line on the
same line. Thus, we see these vectors lay on top of each other or point directly
opposite each other in the x − y plane; i.e. the angle between these vectors is 0
or π radians (that is 0◦ or 180◦ ).
2. If a = 0, then bc must be 0 also. Since we know the vector V is not the zero
vector, we can’t have c = 0 also. Thus, b must be zero. This tells us V has
components (0, c) for some non zero c and V has components (0, d) for some
non zero d. These components also determine two lines like in the case above
which either point in the same direction or opposite one another. Hence, again,
the angle between the lines determined by these vectors is either 0 or π radians.
3. We can argue just like the case above if d = 0. We would find the angle between
the lines determined by the vectors is either 0 or π radians.
Moreover,
|< V, W >| = || V || || W ||
if and only the quantity ad − bc = 0. Further, this quantity is equal to 0 if and only
if the angle between the line segments determined by the vectors V and W is 0◦
or 180◦ .
Here is yet another way to look at this: assume there is a non zero value of t so
that the equation below is true.
a b 0
V +t W = +t =
c d 0
This implies
a b
= −t
c d
Since these two vectors are equal, their components must match. Thus, we must have
a = −t b
c = −t d
Thus,
c
a d = (−t b) =bc
−t
and we are back to ad − bc = 0! Hence, another way of saying that the vectors V
and W are either 0◦ or 180◦ apart is to say that as vectors they are multiples of one
another! Such vectors are called collinear vectors to save writing. In general, we
say two n dimensional vectors are collinear if there is a nonzero constant t so that
V = t W although, of course, we can’t really figure out a way to visualize these
vectors!
Now, the scaled vectors E = ||VV || and F = ||W W
||
have magnitudes of 1. Their
components are (a/ || V ||, c/ || V || and (b/ || W ||, d/ || W ||. These points
lie on a circle of radius 1 centered at the origin. Let θ1 be the angle E makes with
the positive x-axis. Then, since the hypotenuse distance that defines the cos(θ1 ) and
sin(θ1 ) is 1, we must have
16 2 Linear Algebra
a
cos(θ1 ) =
|| V ||
c
sin(θ1 ) =
|| V ||
We can do the same thing for the angle θ2 that F makes with the positive x axis to see
b
cos(θ2 ) =
|| W ||
d
sin(θ2 ) =
|| W ||
The angle between vectors V and W is the same as between vectors E and F. Call
this angle θ. Then θ = θ1 − θ2 and using the formula for the cos of the difference
of angles
cos(θ) = cos(θ1 − θ2 )
= cos(θ1 ) cos(θ2 ) + sin(θ1 ) sin(θ2 )
a b c d
= +
|| V || || W || || V || || W ||
ab + cd
=
|| V || || W ||
< V, W >
=
|| V || || W ||
Hence, the ratio < V, W > /(|| V || || W ||) is the same as cos(θ)! So we can use
this simple calculation to find the angle between a pair two dimensional vectors.
The more general proof of the Cauchy Schwartz Theorem for n dimensional vec-
tors is a journey you can take in another mathematics class! We will state it though
so we can use it later if we need it.
Moreover,
|< V, W >| = || V || || W ||
Theorem 2.2.2 then tells us that if the vectors V and W are not zero, then
< V, W >
−1 ≤ ≤1
|| V || || W ||
and by analogy to what works for two dimensional vectors, we can use this ratio to
define the cos of the angle between two n dimensional vectors even though we can’t
see them at all! We do this in Definition 2.2.1.
Definition 2.2.1 (The Angle Between n Dimensional Vectors)
If V and W are two non zero n dimensional column vectors with components
(V1 , . . . , Vn ) and (W1 , . . . , Wn ) respectively, the angle θ between these vectors is
defined by
< V, W >
cos(θ) =
|| V || || W ||
Moreover, the angle between the vectors is 0◦ if < V, W > = 1 and is 180◦ if
< V, W > = −1.
2.2.1 Examples
Example 2.2.1 Find the angle between the vectors V and W given by
−6 −8
V = and W = .
13 1
< V, W > 61
cos(θ) = =√ √
|| V || || W || 205 65
= 0.5284
tors, we know
Example 2.2.3 Find the angle between the vectors V and W given by
6 8
V = and W = .
−13 1
Solution Compute the inner product < V , W >=(6)(8) + (−13)(1) =√35. Next,
√ vectors: || V || = (6) + (−13) = 205 and
find the magnitudes of these 2 2
|| W || = (8) + (1) = 65. Then, if θ is the angle between the vectors, we know
2 2
< V, W > 35
cos(θ) = =√ √
|| V || || W || 205 65
= 0.3032
2.2.2 Homework
Exercise 2.2.1 Find the angle between the vectors V and W given by
5 7
V = and W = .
4 2
Exercise 2.2.2 Find the angle between the vectors V and W given by
−6 9
V = and W = .
−8 8
2.2 Interpreting the Inner Product 19
Exercise 2.2.3 Find the angle between the vectors V and W given by
10 2
V = and W = .
−4 8
Exercise 2.2.4 Find the angle between the vectors V and W given by
6 7
V = and W = .
1 2
Exercise 2.2.5 Find the angle between the vectors V and W given by
3 2
V = and W = .
−5 −3
Exercise 2.2.6 Find the angle between the vectors V and W given by
1 −2
V = and W = .
−4 −3
Since the number ad − bc is so important is all of our discussions about the rela-
tionship between the two dimensional vectors V and W with components (a, c) and
(b, d) respectively, we will define this number to be the determinant of the matrix
A formed by using V for column 1 and W for column 2 of A. That is
ab a b
A= = V W =
cd c d
Notice that det AT is (a)(d) − (b)(c) also. Hence, if det AT is zero, it means
that Y and Z are collinear. Hence, it the det ( A) is zero, both the vectors determined
by the rows of A and the columns of A are collinear. Let’s summarize what we know
about this new thing called the determinant of A.
1. If | A | is 0, then the vectors determined by the columns of A are collinear.
This also means that the vectors determined by the columns are multiples of one
another. Also, the vectors determined by the columns of AT are also collinear.
2. If | A | is not 0, then the vectors determined by the columns of A are not collinear
which means these vectors point in different directions. Another way of saying
this is that these vectors are not multiples of one another. The same is true for the
columns of the transpose of A.
are collinear.
Solution Form the matrix A using these vectors as the columns. This gives
4 −2
A=
5 3
2.3 Determinants of 2 × 2 Matrices 21
The calculate | A | = (4)(3) − (−2)(5). Since this value is 22 which is not zero,
these vectors are not collinear.
are collinear.
Solution Form the matrix A using these vectors as the columns. This gives
−6 3
A=
4 −2
The calculate | A | = (−6)(−2) − (3)(4). Since this value is 0, these vectors are
collinear. You should graph them in the x−y plane to see this visually.
2.3.2 Homework
We can use all of this material to understand simple two linear equations in two
unknowns x and y. Consider the problem
2x +4 y = 7 (2.1)
3 x + 4 y = −8 (2.2)
22 2 Linear Algebra
Using the standard ways of multiplying vectors by scalars and adding vectors, we
see the above can be rewritten as
2x 4y 7
+ =
3x 4y −8
or
2x + 4y 7
=
3x + 4y −8
This last vector equation is clearly the same as the original Eqs. 2.1 and 2.2:
2x + 4y = 7
3x + 4y = −8
we see Eqs. 2.1 and 2.2 are equivalent to the vector equation
x V + y W = D.
We can also write the system Eqs. 2.1 and 2.2 in an equivalent matrix–vector form.
Recall the original system which is written below:
2x +4 y = 7
3 x + 4 y = −8
xV+yW = D
where
2 4 7
V = , W= and D =
3 4 −8
2.4 Systems of Two Linear Equations 23
Now use V and W as column one and column two of the matrix A
24
A= V W =
34
We call the matrix A the coefficient matrix of the system given by Eqs. 2.1 and 2.2.
Now we can introduce a new type of notation. Think of the column vector
x
y
as being a vector variable. We will use a bold font and a capital letter for this and set
x
X=
y
A X = D.
We typically refer to the vector D as the data vector associated with the system given
by Eqs. 2.1 and 2.2.
1x + 2y =9
−5 x + 12 y = −1
A X = D.
7x + 5y =2
−3 x + −4 y = 1
A X = D.
2.4.2 Homework
1α + 2β = 3
4α + 5β = 6
2.4 Systems of Two Linear Equations 25
We now know how to write the system of two linear equations in two unknowns
given by Eqs. 2.3 and 2.4
a x + b y = D1 (2.3)
c x + d y = D2 (2.4)
xV+yW = D
where
a b D1
V = , W= and D =
c d D2
Finally, using V and W as column one and column two of the matrix A
ab
A= V W =
cd
Then, the original system was written in vector and matrix–vector form as
ab x D1
xV+yW = =
cd y D2
26 2 Linear Algebra
Now, we can solve this system very easily as follows. We have already discussed the
inner product of two vectors. So we could compute the inner product of both sides
of x V + y W = D with any vector U we want and get
< U, x V + y W > = < U, D >
Since this is true for any vector U, let’s try to find useful ones! Any vector U that
satisfies < U, W > = 0 would be great as then the y would drop out and we could
solve for x. The angle between such vector U and W would then be 90◦ or 270◦ .
We will call such vectors orthogonal as the lines associated with the vectors are
perpendicular.
We can easily find such a vector. Since W defines a line through the origin with
slope d/b, from our usual algebra and pre-calculus courses, we know the line through
the origin which is perpendicular to it has negative reciprocal slope: i.e. −b/d. A
line with the slope −b/d corresponds with a vector with components (d, −b). The
usual symbol for perpendicularity is ⊥ so we will label our vector orthogonal to W
as W ⊥ . We see that
⊥ d
W =
−b
and as expected
Thus, we have
This looks complicated, but it can be written in terms of things we understand. Let’s
actually calculate the inner products. We find
and
D1 b
< W ⊥ , D > = (d)(D1 ) + (−b)(D2 ) = det .
D2 d
2.5 Solving Two Linear Equations in Two Unknowns 27
Hence, by taking the inner product of both sides with W ⊥ , we find the y term drops
out and we have
ab D1 b
x det = det
cd D2 d
We can do a similar thing to find out what the variable y is by taking the inner
product of both sides of of x V + y W = D with the vector V ⊥ and get
where
⊥ c
V =
−a
and as expected
Going through the same steps as before, we would find that if det ( A) is non zero,
we could solve for y to get
a D1
det
c D2 det V D
y= =
ab det V W
det
cd
Let’s summarize:
1. Given any system of two linear equations in two unknowns, there is a coefficient
matrix A with first column V and second column W that is associated with it.
Further, the right hand side of the system defines a data vector D.
2. If det ( A) is not zero, we can solve for the unknowns x and y as follows:
28 2 Linear Algebra
det DW
x=
det V W
det V D
y=
det V W
−2 x + 4 y = 6
8 x + −1 y = 2
Solution Solve
−2 x + 4 y = 6
8 x + −1 y = 2
−5 x + 1 y = 8
9 x + −10 y = 2
Solution We have
−5 1 9
V = , W= and D =
9 −10 −2
30 2 Linear Algebra
det DW
x=
det V W
8 1
det
2 −10
=
−5 1
det
9 −10
−82 −82
= =− .
41 41
det V D
y=
det V W
−5 8
det
9 2
=
−5 1
det
9 −10
−82
=
41
2.5.2 Homework
−3 x + 4 y = 6
8 x + 9 y = −1
2x + 3y =6
−4 x + 0 y = 8
18 x + 1 y = 1
−9 x + 3 y = 17
2.5 Solving Two Linear Equations in Two Unknowns 31
−7 x + 6 y = −4
8x + 1y =1
−90 x + 1 y = 1
80 x + −1 y = 1
So what happens if det ( A) = 0? By the remark above, we know that the vectors V
and W are collinear. We also know from our discussions in Sect. 2.3 that the columns
of AT are collinear. Hence, there is a non zero constant r so that
a c
=r
b d
r c x + r d y = D1
c x + d y = D2
or
r (c x + d y) = D1
c x + d y = D2
You can see we do not really have two equations in two unknowns since the top
equation on the left hand side is just a multiple of the left hand side of the bottom
32 2 Linear Algebra
c x + d y = D1 /r
c x + d y = D2
Now subtract the top equation from the bottom equation. You find
0 x + 0 y = 0 = D2 − D1 /r
This equation only makes sense if when you subtract the top from the bottom equa-
tion, the new right hand side is 0! We call such systems consistent if the right hand
side becomes 0 and inconsistent if not zero. So we have a great test for inconsistency.
We scale the top or bottom equation just right to make them identical and subtract
the two equations. If we get 0 = α for a nonzero α, the system is inconsistent.
Here is an example. Consider the system
2x +3y = 8
4x +6y = 9
2 x + 3 y = 8 = D1
2 (2 x + 3 y) = 9 = D2
This system would be consistent if the bottom equation was exactly two times the top
equation. For this to happen, we need D2 = 2 D1 ; i.e., we need 9 = 2 × 8 which is
impossible. So these equations are inconsistent. As mentioned earlier, an even better
way to see these equations are inconsistent is to subtract the top equation from the
bottom equation to get
0x + 0 y = 0 = 1
2.6 Consistent and Inconsistent Systems 33
D=xV+yW
= x sW + y W
This says that the data vector D = (xs + y) W . Hence, if there is a solution x and y,
it will only happen in D is a multiple of W . This says D is collinear with W which
in turn is collinear with V . Going back to our sample
2x +3y = 8
4 x + 6 y = 9.
4 x + 5 y = 11
−8 x − 10 y = −22
This system would be consistent if the bottom equation was exactly minus two times
the top equation. For this to happen, we need D2 = −2 D1 ; i.e., we need −22 =
−2 × 11 which is true. So these equations are consistent.
6 x + 8 y = 14
18 x + 24 y = 48
Solution We see immediately that the determinant of the coefficient matrix A is zero.
So again, the question of consistency is reasonable to ask. Here, the column vectors
of AT are
6 18
Y = and Z =
8 24
6 x + 8 y = 14 = D1
3 (6 x + 8 y) = 48 = D2
This system would be consistent if the bottom equation was exactly three times the
top equation. For this to happen, we need D2 = 3 D1 ; i.e., we need −48 = 3 × 14
which is not true. So these equations are inconsistent.
2.6.2 Homework
2x +5y = 1
8 x + 20 y = 4
60 x + 80 y = 120
6 x + 8 y = 13
−2 x + 7 y = 10
20 x − 70 y = 4
x+y=1
2x +2 y = 3
−11 x − 3 y = −2
33 x + 9 y = 6
If the system we want to solve has zero data, then we must solve a system of equa-
tions like
a x +b y = 0
c x + d y = 0.
36 2 Linear Algebra
Define the vectors V and W as usual. Note D is now the zero vector
a b 0
V = , W= , and D =
c d 0
are collinear and so there is a non zero constant r so that a = r c and b = r d. This
gives the system
r (c x + b y) = 0
c x + d y = 0.
Now if you multiply the bottom equation by r and subtract from the top equation,
you get 0. This tells us the system is consistent. The original system of two equations
is thus only one equation. We can choose to use either the original top or bottom
equation to solve. Say we choose the original top equation. Then we need to find x
and y choices so that
a x +b y = 0
There are in finitely many solutions here! It is easiest to see how to solve this kind
of problem using some examples.
2.7 Specializing to Zero Data 37
−2 x + 7 y = 0
20 x − 70 y = 0
Solution First, note the determinant of the coefficient matrix of the system is zero.
Also, since the bottom equation is −10 times the top equation, we see the system is
also consistent. We solve using the top equation:
−2 x + 7 y = 0
Thus,
7y =2x
y = (2/7) x
will always work. There is a lot of ambiguity here as the multiplier x is completely
arbitrary. For example,if we let x = 7 c for an arbitrary c, then solving for y, we
find y = 2 c. We can then rewrite the solution vector as
2
c
7
in terms of the arbitrary multiplier c. It does not really matter what form we pick,
however we often try to pick a form which has integers as components.
4x +5y = 0
8 x + 10 y = 0
Solution First, note the determinant of the coefficient matrix of the system is zero.
Also, since the bottom equation is 2 times the top equation, we see the system is also
consistent. We solve using the bottom equation this time:
8 x + 10 y = 0
38 2 Linear Algebra
Thus,
10 y = −8 x
y = −(4/5) x
will always work. Again, there is a lot of ambiguity here as the multiplier x is
completely arbitrary. For example,if we let x = 10 d for an arbitrary d, then solving
for y, we find y = −8 d. We can then rewrite the solution vector as
10
d
−8
in terms of the arbitrary multiplier d. Again, it is important to note that it does not
really matter what form we pick, however we often try to pick a form which has
integers as components.
2.7.2 Homework
x +3y = 0
6 x + 18 y = 0
−3 x + 4 y = 0
9 x − 12 y = 0
2x +7y = 0
1 x + (3/2) y = 0
−10 x + 5 y = 0
20 x − 10 y = 0
2.7 Specializing to Zero Data 39
−12 x + 5 y = 0
4 x − (5/3) y = 0
BA= AB=I
which is the solution to our system! The matrix B is, of course, special and clearly
plays the role of an inverse for the matrix A. When such a matrix B exists, it is called
the inverse of A and is denoted by A−1 .
Definition 2.8.1 (The Inverse of the matrix A)
If there is a matrix B of the same size as the square matrix A, B is said to be inverse
of A is
BA= AB=I
find A−1 .
and hence,
x 8
= A−1
y 7
−12
−1 40 − 28
= = 22
22 −24 − 14 38
22
2.8.2 Homework
Let’s look at how we can use MatLab/Octave to solve the general linear system of
equations
Ax =b
⎡ ⎤ ⎡ ⎤ ⎡ ⎤
1 −2 3 x 9
⎣0 4 1⎦ ⎣ y ⎦ = ⎣8⎦
0 0 6 z 2
this is easily solve by starting at the last equation and working backwards. This is
called backsolving. Here, we have
2
z=
6
1 23
4y = 8 − z = 8 − =
3 3
23
y=
12
23 71
x = 9 + 2y − 3z = 9 + −1=
6 6
It is easy to write code to do this in MatLab as we do below.
2.9 Computational Linear Algebra 43
To use this function, we would enter the following commands at the Matlab prompt.
For now, we are assuming that you are running Matlab in a local directory which
contains your Matlab code LTriSol.m. So we fire up Matlab and enter these com-
mands:
Here is a simple function to solve a similar system where this time A is upper
triangular. The code is essentially the same although the solution process starts at
the top and sweeps down.
44 2 Linear Algebra
As usual, to use this function, we would enter the following commands at the Matlab
prompt. We are still assuming that you are running Matlab in a local directory and
that your Matlab code UTriSol.m is also in this directory.
So we enter these commands in Matlab.
Start in the row 1 and column 1 position in A. The entry there is the pivoting element.
Divide the entries below it by the 8 and store them in the rest of column 1. This gives
the new matrix A∗
⎡ ⎤
8 23
A ∗ = ⎣− 8 3 2 ⎦
4
7
8
89
If we took the original row 1 and multiplied it by the − 48 and subtracted it from row
2, we would have the new second row
0 4 3.5
If we took the 78 , multiplied the original row 1 by it and subtracted it from the original
row 3, we would have
50 51
0 8 8
With these operations done, we have the matrix A∗ taking the form
⎡ ⎤
8 2 3
A∗ = ⎣− 8 4 3.5⎦
4
7 25 51
8 4 8
The multipliers in the lower part of column 1 are important to what we are doing, so
we are saving them in the parts of column 1 we have made zero. In MatLab, what
we have just done could be written like this
The code above does what we just did by hand. Now do the same thing again, but
start in the column 2 and row 2 position in the new matrix A∗ . The new pivoting
element is 4, so below it in column 2, we divide the rest of the elements of column
2 by 4 and store the results. This gives
⎡ ⎤
8 2 3
A = ⎣− 8 4 3.5⎦
∗ 4
7 25 51
8 16 8
7 25 29
8 16 32
7 25 29
8 16 32
Let this final matrix be called B. We can extract the lower triangular part of B using
the MatLab command tril(B,-1) and the lower triangular matrix L formed from
A is then made by adding a main diagonal of 1’s to this. The upper triangular part of
A is then U which we find by using triu(B). In code this is
7 25 29
8 16
1 0 0 32
[ n , n ] = s i z e (A) ;
f o r k =1: n−1
13 % find multiplier
A( k +1 : n , k ) = A( k +1: n , k ) /A( k , k ) ;
% z e r o o u t column
A( k +1: n , k +1: n ) = A( k +1: n , k +1: n ) − A( k +1 : n , k ) ∗A( k , k +1: n ) ;
end
18 L = e y e ( n , n ) + t r i l (A, −1) ;
U = t r i u (A) ;
end
35 1.6471
4.7859
−0.4170
0.9249
x = UTriSol (U, y )
40 x =
0.0103
0.0103
0.3436
45 0.0103
0.0103
c = A∗x
c =
1.0000
50 3.0000
5.0000
7.0000
9.0000
2 1 5 3 4
y = LTriSol (L, b ( piv ) ) ;
y
37 y =
3.0000
−1.2174
8.5011
0.3962
42 −0.2279
x = UTriSol (U, y ) ;
x
x =
0.0103
47 0.0103
0.3436
0.0103
0.0103
c = A∗x
52 c =
1.0000
3.0000
5.0000
7.0000
57 9.0000
A v = r v? (2.5)
There are many ways to interpret what such a number and vector pair means, but for
the moment, we will concentrate on finding such a pair (r, v). Now, if this was true,
we could rewrite the equation as
r v− Av = 0 (2.6)
and it acts like multiplying by one with numbers; i.e. I v = v for any vector v. Thus,
instead of saying r v, we could say r I v. We can therefore write Eq. 2.6 as
r I v− Av = 0 (2.7)
We know that we can factor the vector v out of the left hand side and rewrite again
as Eq. 2.8.
r I−A v=0 (2.8)
Now recall that we want the vector v to be non zero. Note, in solving this system,
there are two possibilities:
2.10 Eigenvalues and Eigenvectors 51
(i): the determinant of B is non zero which implies the only solution is v = 0.
(ii): the determinant of B is zero which implies the there are infinitely many solutions
for v all of the form a constant c times some non zero vector E.
Here the matrix B = r I − A. Hence, if we want a non zero solution v, we must
look for the values of r that force det (r I − A) = 0. Thus, we want
0 = det (r I − A)
r − a −b
= det
−c r − d
= (r − a) (r − d) − b c
= r 2 − (a + d) r + ad − bc.
det (r I − A) = 0.
Av =r v
for the eigenvalue r is then called an eigenvector associated with the eigenvalue r
for the matrix A.
Comment 2.10.1 Since this is a quadratic equation, there are always two roots
which take the forms below:
(i): the roots r1 and r2 are real and distinct,
(ii): the roots are repeated r1 = r2 = c for some real number c,
(iii): the roots are complex conjugate pairs; i.e. there are real numbers α and β so
that r1 = α + β i and r2 = α − β i.
52 2 Linear Algebra
or
r + 3 −4
0 = det
1r −2
= (r + 3)(r − 2) + 4
= r2 + r − 2
= (r + 2)(r − 1)
This gives
1 −4
1 −4
The two rows of this matrix should be multiples of one another. If not, we made
a mistake and we have to go back and find it. Our rows are indeed multiples, so
pick one row to solve for the eigenvector. We need to solve
1 −4 v1 0
=
1 −4 v2 0
v1 − 4 v2 = 0
1
v2 = v1
4
2.10 Eigenvalues and Eigenvectors 53
The vector
1
1/4
This gives
4 −4
1 −1
Again, the two rows of this matrix should be multiples of one another. If not, we
made a mistake and we have to go back and find it. Our rows are indeed multiples,
so pick one row to solve for the eigenvector. We need to solve
4 −4 v1 0
=
1 −1 v2 0
v1 − v2 = 0
v2 = v1
The vector
1
1
or
r − 4 −9
0 = det
1r +6
= (r − 4)(r + 6) + 9
= r 2 + 2 r − 15
= (r + 5)(r − 3)
This gives
−9 −9
1 1
The two rows of this matrix should be multiples of one another. If not, we made
a mistake and we have to go back and find it. Our rows are indeed multiples, so
pick one row to solve for the eigenvector. We need to solve
−9 −9 v1 0
=
1 1 v2 0
v1 + v2 = 0
v2 = − v1
2.10 Eigenvalues and Eigenvectors 55
The vector
1
−1
This gives
−1 −9
1 9
Again, the two rows of this matrix should be multiples of one another. If not, we
made a mistake and we have to go back and find it. Our rows are indeed multiples,
so pick one row to solve for the eigenvector. We need to solve
−1 −9 v1 0
=
1 9 v2 0
v1 + 9 v2 = 0
−1
v2 = v1
9
Letting v1 = B, we find the solutions have the form
v1 1
=B −1
v2 9
The vector
1
−1
9
2.10.1 Homework
det (r I − A) = 0.
Av =r v
for the eigenvalue r is then called an eigenvector associated with the eigenvalue r
for the matrix A.
Comment 2.10.2 Since this is a polynomial equation, there are always n roots some
of which are real numbers which are distinct, some might be repeated and some might
be complex conjugate pairs (and they can be repeated also!). An example will help.
Suppose we started with a 5 × 5 matrix. Then, the roots could be
1. All the roots are real and distinct; for example, 1, 2, 3, 4 and 5.
2. Two roots are the same and three roots are distinct; for examples, 1, 1, 3, 4 and
5.
3. Three roots are the same and two roots are distinct; for examples, 1, 1, 1, 4 and
5.
4. Four roots are the same and one roots is distinct from that; for examples, 1, 1,
1, 1 and 5.
5. Five roots are the same; for examples, 1, 1, 1, 1 and 1.
6. Two pairs of roots are the same and one roots is different from them; for examples,
1, 1, 3, 3 and 5.
7. One triple root and one pair of real roots; for examples, 1, 1, 1, 3 and 3.
8. One triple root and one complex conjugate pair of roots; for examples, 1, 1, 1,
3 + 4i and 3 − 4i.
9. One double root and one complex conjugate pair of roots and one different real
root; for examples, 1, 1, 2, 3 + 4i and 3 − 4i.
10. Two complex conjugate pair of roots and one different real root; for examples,
−2, 1 + 6i, 1 − 6i, 3 + 4i and 3 − 4i.
We will now discuss certain ways to compute eigenvalues and eigenvectors for a
square matrix in MatLab. For a given A, we can compute its eigenvalues as follows:
58 2 Linear Algebra
So we have found the eigenvalues of this small 3 × 3 matrix. To get the eigenvectors,
we do this:
Note the eigenvalues are not returned in ranked order. The eigenvalue/eigenvector
pairs are thus
λ1 = −0.3954
⎡ ⎤
0.7530
V1 = ⎣ −0.6525 ⎦
0.0847
λ2 = 11.8161
⎡ ⎤
−0.3054
V2 = ⎣ −0.7238 ⎦
−0.6187
λ3 = −6.4206
⎡ ⎤
−0.2580
V3 = ⎣ −0.3770 ⎦
0.8896
It is possible to show that the eigenvalues of a symmetric matrix will be real and
eigenvectors corresponding to distinct eigenvalues will be 90◦ apart. Such vectors
are called orthogonal and recall this means their inner product is 0. Let’s check it
out. The eigenvectors of our matrix are the columns of W above. So their dot product
should be 0!
Well, the dot product is not actually 0 because we are dealing with floating point
numbers here, but as you can see it is close to machine zero (the smallest number
our computer chip can detect). Welcome to the world of computing!
Reference
J. Peterson, Calculus for Cognitive Scientists: Derivatives, Integration and Modeling, Springer
Series on Cognitive Science and Technology (Springer Science + Business Media Singapore Pte
Ltd., Singapore, 2015 in press)
Chapter 3
Numerical Methods Order One ODEs
Now that you are taking this course on More Calculus for Cognitive Scientists,
we note that in the previous course, you were introduced to continuity, derivatives,
integrals and models using derivatives. You were also taught about functions of two
variables and partial derivatives along with more interesting models. You were also
introduced to how to solve models using Euler’s method. We use these ideas a lot,
so there is a much value in reviewing this material. So let’s dive into it again. When
we try to solve systems like
dy
= f (t, y) (3.1)
dt
y(t0 ) = y0 (3.2)
where f is continuous in the variables t and y, and y0 is some value the solution is
to have at the time point t0 , we will quickly find that it is very hard in general to do
this by hand. So it is time to begin looking at how the MatLab environment can help
us. We will use MatLab to solve these differential equations with what are called
numerical methods. First, let’s discuss how to approximate functions in general.
f (x) = f ( p) + E 0 (x, p)
where E 0 (x, p) is the error. On the other hand, we could try to find the best straight
line that does the job. We would find
where E 1 (x, p) is the error now. This straight line is the first order Taylor polyno-
mial but we know it also as the tangent line. We can continue finding polynomials
of higher and higher degree and their respective errors. In this class, our interests
stop with the quadratic case. We would find
where E 2 (x, p) is the error. This is called the second order Taylor polynomial or
quadratic approximation. Now let’s dig into the theory behind this so that we can
better understand the error terms.
Let’s consider a function which is defined locally at the point p. This means there
is at least an interval (a, b) containing p where f is defined. Of course, this interval
could be the whole x axis!. Let’s also assume f exists locally at p in this same
interval. Now pick any x is the interval [ p, b) (we can also pick a point in the left
hand interval (a, p] but we will leave that discussion to you!). From Calculus I,
recall Rolle’s Theorem and the Mean Value Theorem. These are usually discussed in
Calculus I, but we really prove them carefully in course called Mathematical Analysis
(but that is another story).
and
f (b) − f (a)
= f (c).
b−a
3.1 Taylor Polynomials 63
Our function f on the interval [ p, x] satisfies all the requirements of the Mean Value
Theorem. So we know there is a point cx with p < cx < x so that
f (x) − f ( p)
= f (cx ).
x−p
P0 ( p, x) = f ( p).
We’ll call this the 0th order Taylor Polynomial for f at the point p. Next, let the 0th
order error term be defined by
The error or remainder term is clearly the difference or discrepancy between the
actual function value at x and the 0th order Taylor Polynomial. Since f (x) − f ( p) =
f (cx )(x − p), we can write all we have above as
We can interpret what we have done by saying f ( p) is the best choice of 0th order
polynomial or constant to approximate f (x) near p. Of course, for most functions,
this is a horrible approximation! So the next step is to find the best straight line that
approximates f near p. Let’s try our usual tangent line to f at p. We summarize this
result as a theorem.
3.1.2.1 Examples
Example 3.1.1 If f (t) = t 3 , by the theorem above, we know on the interval [1, 3]
that at 1 f (t) = f (1) + f (c)(t − 1) where c is some point between 1 and t. Thus,
t 3 = 1 + (3c2 )(t − 1) for some 1 < c < t. So here the zeroth order Taylor
Polynomial is P0 (t, 1) = 1 and the error is E 0 (t, 1) = (3c2 )(t − 1).
Example 3.1.2 If f (t) = e−1.2t , by the theorem above, we know that at 0 f (t) =
f (0) + f (c)(t − 0) where c is some point between 0 and t. Thus, e−1.2t = 1 +
(−1.2)e−1.2c (t −0) for some 0 < c < t or e−1.2t = 1−1.2e−1.2c t. So here the zeroth
order Taylor Polynomial is P0 (t, 1) = 1 and the error is E 0 (t, 0) = −1.2e−1.2c t.
T (x) = f ( p) + f ( p) (x − p)
and the term E 1 (x, p) represents the error between the true function value f (x) and
the tangent line value T (x). That is
Another way to look at this is that the tangent line is the best straight line or linear
approximation to f at the point p. We all know from our first calculus course how
these pictures look. If the function f is curved near p, then the tangent line is not
a very good approximation to f at p unless x is very close to p. Now, let’s assume
f is actually two times differentiable on the local interval (a, b) also. Define the
constant M by
f (x) − f ( p) − f ( p)(x − p)
M= .
(x − p)2
3.1 Taylor Polynomials 65
In this discussion, this really is a constant value because we have fixed our value of
x and p already. We can rewrite this equation as
Then,
Then,
and
g( p) = f ( p) − f ( p) − f ( p) ( p − p) − M( p − p)2 = 0.
We thus know g(x) = g( p) = 0. Also, from the Mean Value Theorem, there is a
point cx0 between p and x so that
g(x) − g( p)
= g (cx0 ).
p−x
Since the numerator is g(x) − g( p), we now know g (cx0 ) = 0. But we also have
g ( p) = f ( p) − f ( p) − 2M( p − p) = 0.
Next, we can apply Rolle’s Theorem to the function g . This tells us there is a point
cx1 between p and cx0 so that g (cx1 ) = 0. Thus,
1 1
M= f (cx ).
2
66 3 Numerical Methods Order One ODEs
f (x) − f ( p) − f ( p)(x − p) 1
= f (cx1 ), some cx1 with p < cx1 < cx0 .
(x − p)2 2
P1 (x, p) = f ( p) + f ( p)(x − p)
E 1 (x, p) = f (x) − P1 (x, p) = f (x) − f ( p) − f ( p)(x − p)
1
= f (cx1 )(x − p)2 .
2
Thus, we have shown, E 1 (x, p) satisfies
(x − p)2
E 1 (x, p) = f (cx1 ) (3.4)
2
where cx1 is some point between x and p. Note the usual Tangent line is the same as
the first order Taylor Polynomial, P1 ( f, p, x) and we have a nice representation of
our error. We can state this as our next theorem:
3.1.3.1 Example
Example 3.1.4 For f (t) = e−1.2t on the interval [0, 5] find the tangent line approx-
imation, the error and maximum the error can be on the interval.
3.1 Taylor Polynomials 67
where c is some point between 0 and t. Hence, c is between 0 and 5 also. The first
order Taylor Polynomial is P1 (t, 0) = 1 − 1.2t which is also the tangent line to
e−1.2t at 0. The error is (1/2)(−1.2)2 e−1.2c t 2 .
Now let AE(t) denote absolute value of the actual error at t and ME be maximum
absolute error on the interval. The largest the error can be on [0, 5] is when f (c)
is the biggest it can be on the interval. Here,
Problem Two: Let’s find the tangent line approximations for a simple exponential
decay function again but let’s do it a bit more generally.
Example 3.1.5 If f (t) = e−βt , for β = 1.2 × 10−5 , find the tangent line approxi-
mation, the error and the maximum error on [0, 5].
Solution At any t
1
f (t) = f (0) + f (0)(t − 0) + f (c)(t − 0)2
2
1
= 1 + (−β)(t − 0) + (−β)2 e−βc (t − 0)2
2
1
= 1 − βt + β 2 e−βc t 2 .
2
where c is some point between 0 and t which means c is between 0 and 5. The first
order Taylor Polynomial is P1 (t, 0) = 1 − βt which is also the tangent line to e−βt
at 0. The error is 21 β 2 e−βc t 2 . The largest the error can be on [0, 5] is when f (c) is
the biggest it can be on the interval. Here,
−5
AE(t) = |(1/2)(1.2 × 10−5 )2 e−1.2×10 c t 2 | ≤ (1/2)(1.2 × 10−5 )2 (1)(5)2
= (1/2)1.44 × 10−10 (25) = ME
We could also ask what quadratic function Q fits f best near p. Let the quadratic
function Q be defined by
68 3 Numerical Methods Order One ODEs
(x − p)2
Q(x) = f ( p) + f ( p) (x − p) + f ( p) . (3.5)
2
The new error is called E Q (x, p) and is given by
If f is three times differentiable, we can argue like we did in the tangent line approx-
imation (using the Mean Value Theorem and Rolle’s theorem on an appropriately
defined function g) to show there is a new point cx2 between p and cx1 with
(x − p)3
E Q (x, p) = f (cx2 ) (3.6)
6
So if f looks like a quadratic locally near p, then Q and f match nicely and the
error is pretty small. On the other hand, if f is not quadratic at all near p, the error
will be large. We then define the second order Taylor polynomial, P2 ( f, p, x) and
second order error, E 2 (x, p) = E Q (x, p) by
1
P2 (x, p) = f ( p) + f ( p)(x − p) + f ( p)(x − p)2
2
1
E 2 (x, p) = f (x) − P2 (x, p) = f (x) − f ( p) − f ( p)(x − p) − f ( p)(x − p)2
2
1 2
= f (cx ) (x − p)3 .
6
Theorem 3.1.5 (Second Order Taylor Polynomial)
Let f : [a, b] → be continuous on [a, b] and be at least three times differentiable
on (a, b). Given p in [a, b], for each x, there is at least one point c, between p and x, so
that f (x) = f ( p) + f ( p)(x − p) + (1/2) f ( p)(x − p)2 + (1/6) f (c)(x − p)3 .
The quadratic f ( p) + f ( p)(x − p) + (1/2) f ( p)(x − p)2 is called the second
order Taylor Polynomial for f at p and we denote it by P2 (x, p). The point p
is again called the base point. Note we are approximating f (x) by the quadratic
f ( p) + f ( p)(x − p) + (1/2) f ( p)(x − p)2 and the error we make is E 2 (x, p) =
(1/6) f (c)(x − p).
3.1.4.1 Examples
Example 3.1.6 If f (t) = e−βt , for β = 1.2 × 10−5 , find the second order approxi-
mation, the error and the maximum error on [0, 5].
3.1 Taylor Polynomials 69
Solution For each t in the interval [0, 5], then there is some 0 < c << t < 5 so that
We can find higher order Taylor polynomials and remainders using these arguments
as long as f has higher order derivatives. But, for our purposes, we can stop here.
Let’s try to approximate the solution to the model x = f (x) with x(0) = x0 . Note
the dynamics does not depend on time t. The solution x(t) can be written
where ch is between 0 and h. We can rewrite this more. Note x = f (x) tells
us we can replace x (0) by f (x(0)) = f (x0 ). Also, since x = f (x), the chain
rule tells us x = f (x) x = (d f /d x) f where we let f (x) = (d f /d x)(x). So
x (ch ) = f (x(ch )) f (x(ch )) and we have
Let x1 be the true solution x(h) and let x̂0 be the starting or zeroth Euler approximate
which is defined by x̂0 = x0 . Hence, we make no error at first. Further, let the first
Euler approximate x̂1 be defined by x̂1 = x0 + f (x0 ) h = x̂0 + f (x̂0 ) h. Which is
the tangent line approximation to x at the point t = 0! Then we have
We are almost there! Next, we can apply the Mean Value Theorem to the difference
f (x1 ) − f (x̂1 ) and find f (x1 ) − f (x̂1 ) = f (xd )(x1 − x̂1 ) with xd between x1 and
x̂1 . Plugging this in, we have
Thus
eut = 1 + ut + u 2 euc t 2 /2
for some c between 0 and u. But the error term is positive so we know eut ≥ 1 + ut.
Letting u = Ch and using t = 1, we have
72 3 Numerical Methods Order One ODEs
1 + (1 + Ch) ≤ 1 + eCh .
and so
E 2 ≤ (C h 2 /2) (1 + eCh )
Now let’s do the approximation for x(3h). We will let x3 = x(3h) and we will
define the third Euler approximate by x̂3 = x̂2 + f (x̂2 ) h. The tangent line approxi-
mation to x at 2h gives
But we can apply the Mean Value Theorem to the difference f (x2 ) − f (x̂2 ). We
find f (x2 ) − f (x̂2 ) = f (xu )(x2 − x̂2 ) with xu between x2 and x̂2 . Plugging this
in, we find
Thus
We have already shown that 1 + u ≤ eu for any u. It follows (1 + u)2 ≤ (eu )2 = e2u .
So we have
E 3 ≤ C 1 + (1 + Ch) + (1 + Ch)2 h 2 /2
≤ C 1 + eCh + e2Ch h 2 /2.
e3Ch − 1 2
E3 ≤ C h /2.
eCh − 1
e N Ch − 1 2
EN ≤ C h /2.
eCh − 1
eC T − 1 2
EN ≤ C h /2.
eCh − 1
The total or global error thus has the form of a constant times h and hence, the
solution on the entire interval is of order h. Of course, the local error at each step is
on the order of h 2 .
If the dynamics function depends on time, we will still be able to look carefully at the
Euler approximates. Let’s look carefully at the errors we make when we use Euler’s
method in this case. We start with a preliminary result called a Lemma.
Proof We know if the base point is 0, then since e^x is twice differentiable, there is
a number ξ between 0 and x so that e^x = 1 + x + e^ξ x^2/2. However, the last term is
always nonnegative and so dropping that term we obtain e^x ≥ 1 + x, from which the
final result follows.
We need another basic idea about functions: the Lipschitz condition. Here is the
definition: we say f satisfies a Lipschitz condition on [a, b] with Lipschitz constant K if

| f(x) − f(y)| ≤ K |x − y|

for all x and y in [a, b]. Many functions satisfy a Lipschitz condition. For example, if f has a derivative that
is bounded by the constant K on [a, b], this means | f'(x)| ≤ K for all x in [a, b].
We can then apply the Mean Value Theorem to f on any interval [x, y] in [a, b] to
see there is a number ξ between x and y so that f(x) − f(y) = f'(ξ)(x − y), and hence
| f(x) − f(y)| ≤ K |x − y|.
on a rectangle D which is the set of (t, x) with t ∈ [a, b] and x ∈ [c, d] with the point
(t0 , x0 ) in D for some finite intervals [a, b] and [c, d]. The theory of the solutions to
our models allows us to make sure the value of d is large enough to hold the image
of the solution x; i.e., x(t) ∈ [c, d] for all t ∈ [t0 , b]. Also, for convenience, we
are assuming x is a scalar variable although the arguments we present can easily be
modified to handle x being a vector. Using the Taylor Remainder theorem, this gives
for a given time point t:
x(t) = x(t_0) + (dx/dt)(t_0)(t − t_0) + (1/2)(d^2x/dt^2)(t_0)(t − t_0)^2 + (1/6)(d^3x/dt^3)(ξ)(t − t_0)^3
where ξ is some point in the interval [t_0, t]. We also know by the chain rule that, since
x' is f,

d^2x/dt^2 = ∂f/∂t + (∂f/∂x)(dx/dt)
          = ∂f/∂t + (∂f/∂x) f
which implies, switching to a standard subscript notation for partial derivatives (yes,
these calculations are indeed yucky!)

d^3x/dt^3 = ( f_tt + f_tx f ) + ( f_xt + f_xx f ) f + f_x ( f_t + f_x f ).
The important thing to note is that this third order derivative is made up of algebraic
combinations of f and its various partial derivatives. We typically assume that all of
these functions are continuous and bounded. Thus, letting ||g||_∞ represent the maximum
value of the continuous function g(s, u) on the rectangle [t_0, b] × [c, d], we know there
is a constant C so that

|x'''(ξ)| ≤ (|| f_tt ||_∞ + || f_tx ||_∞ || f ||_∞) + (|| f_xt ||_∞ + || f_xx ||_∞ || f ||_∞) || f ||_∞
          + || f_x ||_∞ (|| f_t ||_∞ + || f_x ||_∞ || f ||_∞)
          = C.
That is, ||x'''||_∞ ≤ C on [t_0, b]. Of course, if f has sufficiently smooth higher order
partial derivatives, we can find bounds on higher derivatives of x as well. Now, using
the standard abbreviations f_0 = f(t_0, x_0), f_t^0 = (∂f/∂t)(t_0, x_0) and f_x^0 = (∂f/∂x)(t_0, x_0),
with similar notations for the second order partials, we see our solution can be written
as
x(t) = x_0 + f_0 (t − t_0) + (1/2)( f_t^0 + f_x^0 f_0 )(t − t_0)^2 + (1/6)(d^3x/dt^3)(ξ)(t − t_0)^3
     = x_0 + f_0 (t − t_0) + (1/2)( f_t^0 + f_x^0 f_0 )(t − t_0)^2
       + (1/6)[ ( f_tt + f_tx f ) + ( f_xt + f_xx f ) f + f_x ( f_t + f_x f ) ]|_ξ (t − t_0)^3.
We can now state a result which tells us how much error we make with Euler's
method. From the remarks above, the assumption that ||x''||_∞ is bounded is not an
unreasonable one for many models we wish to solve. Our discussions above allow us to
be fairly quantitative about how much local error and global error we make using
Euler's approximations. We state this as a theorem.
where x_n is the value of the true solution x(t_n) and e_0 = x_0 − x̂_0. If we also know
e_0 = 0 (the usual state of affairs), then, letting the constant B be defined by

B = ((e^{(b−a)K} − 1)/(2K)) ||x''||_∞,

we can say |x(b) − x̂_{N(h)}| ≤ B h where N(h) is the index at which t_{N(h)} = b.
Proof From our remarks earlier, our assumptions on f and its first order partials
tell us f satisfies a Lipschitz condition with Lipschitz constant K in D and that the
solution x has a bounded second derivative on the interval [t_0, b]. Let e_n = x_n − x̂_n
and τ_n = (h/2) x''(ξ_n). Then, the usual Taylor series expansion gives

x(t_{n+1}) = x(t_n) + h f(t_n, x(t_n)) + (h^2/2) x''(ξ_n)
          = x(t_n) + h f(t_n, x(t_n)) + h τ_n.
Subtracting the Euler update x̂_{n+1} = x̂_n + h f(t_n, x̂_n) from this, we find

x_{n+1} − x̂_{n+1} = x_n − x̂_n + h ( f(t_n, x(t_n)) − f(t_n, x̂_n) ) + h τ_n.

Thus,

e_{n+1} = e_n + h ( f(t_n, x(t_n)) − f(t_n, x̂_n) ) + h τ_n,

where

|τ_n| = (h/2) |x''(ξ_n)| ≤ (h/2) ||x''||_∞.
This is a recursion relation. We can easily see what is happening by working out a
few terms.
Using the Lipschitz condition, |e_{n+1}| ≤ (1 + hK)|e_n| + h|τ(h)|, where |τ(h)| denotes
the largest of the |τ_n|. Iterating this inequality, we find

|e_n| ≤ |e_0|(1 + hK)^n + h|τ(h)| Σ_{i=0}^{n−1} (1 + hK)^i.

The geometric sum satisfies

1 + r + r^2 + r^3 + · · · + r^{n−1} = (1 − r^n)/(1 − r),

so with r = 1 + hK,

|e_n| ≤ |e_0|(1 + hK)^n + |τ(h)| ((1 + hK)^n − 1)/K.
Then, using Lemma 3.3.1 in the form 1 + hK ≤ e^{hK}, we have

|e_n| ≤ |e_0| e^{nhK} + h|τ(h)| ((1 + hK)^n − 1)/(hK).

But since t_n = t_0 + nh, we know nh ≤ b − t_0, leading to

|e_n| ≤ |e_0| e^{(b−t_0)K} + |τ(h)| ((1 + hK)^n − 1)/K.

Applying the Lemma again and using nh ≤ b − t_0 ≤ b − a, we obtain

|e_n| ≤ |e_0| e^{(b−t_0)K} + |τ(h)| (e^{(b−a)K} − 1)/K.
Now if e_0 = 0 (as it normally would be), then since |τ(h)| ≤ (h/2) ||x''||_∞ we have

|e_n| ≤ ((e^{(b−a)K} − 1)/(2K)) ||x''||_∞ h

and letting

B = ((e^{(b−a)K} − 1)/(2K)) ||x''||_∞,

we have |e_{N(h)}| ≤ B h as required.
Comment 3.3.1 Note the local error we make at each step is proportional to h 2 but
the global error after we reach t = b is proportional to h. Hence, Euler’s method is
an order 1 method.
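To see this order one behavior concretely, here is a small sketch (ours, not part of the text's code base) that applies Euler's method to x' = −2x with x(0) = 3 on [0, 1] for a sequence of halved step sizes and prints the global error at t = 1; you should see the error drop by roughly a factor of two each time h is halved.

% Sketch: the global error of Euler's method is proportional to h.
% Model: x' = -2x, x(0) = 3, true solution x(t) = 3 exp(-2t).
f     = @(t,x) -2*x;
xtrue = @(t) 3*exp(-2*t);
T = 1.0;
for h = [0.1, 0.05, 0.025, 0.0125]
  N = round(T/h);
  x = 3;                       % x(0)
  t = 0;
  for n = 1:N
    x = x + h*f(t,x);          % Euler update
    t = t + h;
  end
  fprintf('h = %7.4f   error at t = 1 is %10.6f\n', h, abs(x - xtrue(T)));
end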
3.4 Euler’s Algorithm 79
3.4.1 Examples
Example 3.4.1 Find the first three Euler approximates for x' = −2x, x(0) = 3 using
h = 0.3. Find the true solution values and errors also.
Solution Here f(x) = −2x and the true solution is x(t) = 3 e^{−2t}.
Step 0: x̂0 = x0 = 3 so E 0 = 0.
Step 1:
Step 2:
Step 3:
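Since this arithmetic is easy to mangle by hand, here is a small sketch (ours) that carries out the three Euler updates of Example 3.4.1 and compares them to the true solution; the numbers it prints are the x̂_n, x(t_n) and error values asked for above.

% Sketch: first three Euler approximates for x' = -2x, x(0) = 3, h = 0.3.
f     = @(x) -2*x;
xtrue = @(t) 3*exp(-2*t);
h    = 0.3;
xhat = 3;                                  % xhat_0 = x_0, so the error at step 0 is 0
for n = 1:3
  xhat = xhat + h*f(xhat);                 % xhat_n = xhat_{n-1} + f(xhat_{n-1}) h
  tn   = n*h;
  fprintf('Step %d: xhat = %8.5f, true x(%.1f) = %8.5f, error = %8.5f\n', ...
          n, xhat, tn, xtrue(tn), abs(xtrue(tn) - xhat));
end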
Example 3.4.2 Find the first three Euler approximates for x' = 2x, x(0) = 4 using
h = 0.2. Find the true solution values and errors also.
Step 2:
Step 3:
These methods are based on more sophisticated ways of approximating the solution
y. These methods use multiple function evaluations at different time points around a
given t* to approximate y(t*). In more advanced classes, we can show this technique
generates a sequence {y_n} starting at y_0 using the following recursion equation:
y_{n+1} = y_n + h × F^o(t_n, y_n, h, f),   y_0 given,

where h is the step size we use for our underlying partition of the time space, giving

t_i = t_0 + i × h
for appropriate indices, and F^o is a fairly complicated function of the previous approximate
solution, the step size and the right hand side function f. The Runge–Kutta
methods are available for various choices of the superscript o, which is called the
order of the method. We will not discuss much about F^o in this course, as it is best
served up in a more advanced class. What we can say is this: for order o, the local
error is like h^{o+1}. So
Order One: Local error is h^2 and this method is the same as the Euler Method.
The global error then goes down linearly with h.
Order Two: Local error is h^3 and this method is better than the Euler Method. If
the global error for a given stepsize h is E, then halving the stepsize to h/2 gives
a new global error of E/4. Thus, the global error goes down quadratically. This
means halving the stepsize has a dramatic effect on the global error.
Order Three: Local error is h^4 and this method is better than the Euler Method. If
the global error for a given stepsize h is E, then halving the stepsize to h/2 gives a
new global error of E/8. Thus, the global error goes down as a cubic power. This
means halving the stepsize has an even more dramatic effect on the global error.
Order Four: Local error is h^5 and this method is better than the order three
method. If the global error for a given stepsize h is E, then halving the stepsize
to h/2 gives a new global error of E/16. Thus, the global error goes down as
a fourth power! This means halving the stepsize has a huge effect on the global error.
We will now look at MatLab code that allows us to solve our differential equation
problems using the Runge–Kutta method instead of the Euler method of Sect. 3.3.
The basic code to implement the Runge–Kutta methods is broken into two pieces.
The first one, RKstep.m, implements the evaluation of the next approximate solution
at the point (t_n, y_n) given the old approximation at (t_{n−1}, y_{n−1}). We then loop
through all the steps to get to the chosen final time using the code in FixedRK.m.
The details of these algorithms are beyond the scope of this text and so we will not go
into them here. In this code, we are allowing for the dynamic functions to depend on
time also. Previously, we have used dynamics like f = @(x) 3*x and we expect
the dynamics functions to have that form in DoEuler. However, we want to have
more complicated dynamics now—at least the possibility of it!—so we will adapt
what we have done before. We will now define our dynamics as if they depend on
time. So from now on, we would write f=@(t,x) 3*x even though there is no
time dependence. We then rewrite our DoEuler to DoEulerTwo so that we can
use these more general dynamics. This code is in DoEulerTwo.m and we have
discussed it in the first text on starting your calculus journey. You can review this
function in that text. The Runge–Kutta code uses the new dynamics functions. We
have gone over this code in the previous text, but we will show it to you again for
completeness. In this code, you see the lines like feval(fname,t,x) which
means take the function fname passed in as an argument and evaluate it at the pair
(t,x). Hence, fname(t,x) is the same as f(t,x).
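We do not reproduce the actual RKstep.m here. As a rough guide only, the sketch below (with the hypothetical name RK4step) shows what a single step of the classic order four Runge–Kutta method looks like; the real RKstep.m handles orders one through four and has its own argument list.

function [tnew,ynew,fnew] = RK4step(fname,tc,yc,fc,h)
% Sketch of one classic fourth order Runge-Kutta step.
% fname : dynamics function handle, called as fname(t,y)
% tc,yc : current time and current approximate solution
% fc    : fname(tc,yc), passed in so it is not recomputed
% h     : step size
  k1 = fc;
  k2 = feval(fname, tc + h/2, yc + (h/2)*k1);
  k3 = feval(fname, tc + h/2, yc + (h/2)*k2);
  k4 = feval(fname, tc + h,   yc + h*k3);
  ynew = yc + (h/6)*(k1 + 2*k2 + 2*k3 + k4);
  tnew = tc + h;
  fnew = feval(fname, tnew, ynew);
end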
The code above does all the work. It manages all of the multiple tangent line
calculations that Runge–Kutta needs at each step. We loop through all the steps to
get to the chosen final time using the code in FixedRK.m which is shown below.
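Again as a rough guide only, a fixed step size driver has the shape sketched below (hypothetical name FixedRK4, built on the RK4step sketch above); the text's FixedRK.m plays this role for all four orders.

function [tvals,yvals] = FixedRK4(fname,t0,y0,h,N)
% Sketch of a fixed step size driver: take N RK4 steps starting at (t0,y0).
  tvals = zeros(1,N+1);  yvals = zeros(1,N+1);
  tvals(1) = t0;         yvals(1) = y0;
  fc = feval(fname,t0,y0);
  for n = 1:N
    [tvals(n+1), yvals(n+1), fc] = RK4step(fname, tvals(n), yvals(n), fc, h);
  end
end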
Here is an example where we solve a specific model using all four Runge–Kutta
choices and plot them all together. Note when we use RKstep, we only return the
first two outputs; that is, for our returned variables, we write [htime1,xhat1] =
FixedRK(f,0,20,0.06,1,N1); instead of returning the full list of outputs
which includes function evaluations [htime1,xhat1,fhat1] = FixedRK
(f,0,20,0.06,1,N1);. We can do this as it is all right to not return the third
output. However, the outputs are still returned in the order stated when the
function is defined. For example, if we used the command [htime1,fhat1] =
FixedRK(f,0,20,0.06,1,N1);, this would return the approximate solution values
and place them in the variable fhat1, which is not what we would want to do.
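A session along these lines might look like the sketch below. We are assuming, based on the calls shown above, that the arguments of FixedRK are the dynamics function, the initial time, the initial value, the step size, the order and the number of steps; the dynamics function used here is just a made-up example.

% Sketch of a comparison session; the dynamics below is a made-up logistic model.
f  = @(t,x) 0.1*x.*(1 - x/50);
N1 = 300;
[htime1,xhat1] = FixedRK(f,0,20,0.06,1,N1);   % order 1 (Euler)
[htime2,xhat2] = FixedRK(f,0,20,0.06,2,N1);   % order 2
[htime3,xhat3] = FixedRK(f,0,20,0.06,3,N1);   % order 3
[htime4,xhat4] = FixedRK(f,0,20,0.06,4,N1);   % order 4
plot(htime1,xhat1,htime2,xhat2,htime3,xhat3,htime4,xhat4);
xlabel('t'); ylabel('x(t)');
legend('order 1','order 2','order 3','order 4');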
Now let’s review functions of more than one variable since it may have been some
time since you looked at these ideas. This is gone over carefully in the first text on
starting calculus ideas, but it is a good idea to talk about them again.
Let’s start by looking at the x−y plane as a collection of two dimensional vectors.
Each vector is rooted at the origin and the head of the vector corresponds to our
usual coordinate pair (x, y). The set of all such x and y determines the x−y plane
which we will also call ℝ^2. The superscript two is used because we are now explicitly
acknowledging that we can think of these ordered pairs as vectors also with just a
slight identification on our part. Since we know about vectors, note if we have a
vector we can rewrite it, using our standard rules for vector arithmetic and scaling
of vectors as
[6, 7]^T = 6 [1, 0]^T + 7 [0, 1]^T
A little thought will let you see we can do this for any vector and so we define special
vectors i = e1 and j = e2 as follows:
i = e_1 = [1, 0]^T   and   j = e_2 = [0, 1]^T
[x, y]^T = x e_1 + y e_2 = x i + y j
Now let’s start looking at functions that map each ordered pair (x, y) into a number.
Let’s begin with an example. Consider the function f (x, y) = x 2 + y 2 defined for
all x and y. Hence, for each x and y we pick, we calculate a number we can denote by
z whose value is f (x, y) = x 2 + y 2 . Using the same ideas we just used for the x−y
plane, we see the set of all such triples (x, y, z) = (x, y, x 2 + y 2 ) defines a surface
in ℝ^3 which is the collection of all ordered triples (x, y, z). Each of these triples can
be identified with a three dimensional vector whose tail is the origin and whose head
is the triple (x, y, z). We note any three dimensional vector can be written as
[x, y, z]^T = x e_1 + y e_2 + z e_3 = x i + y j + z k

where the standard basis vectors are
i = e_1 = [1, 0, 0]^T,   j = e_2 = [0, 1, 0]^T   and   k = e_3 = [0, 0, 1]^T.
We can plot this surface in MatLab with fairly simple code. As discussed in the first
volume, the utility function DrawSimpleSurface manages how to draw
such a surface using boolean variables like DoGrid to turn a given piece of the graph
on or off. The surface is drawn using a grid, a mesh, traces, patches and columns and
a base, all of which contribute to a somewhat cluttered figure. Hence, the boolean
variables allow us to select how much clutter we want to see! So if the boolean variable
DoGrid is set to one, the grid is drawn. The code is self-explanatory so we just lay it out here.
We haven’t shown all the code for the individual drawing functions, but we think
you’ll find it interesting to see how we manage the pieces in this one piece of code.
So check this out.
Hence, to draw everything for this surface, we would use the session:
This surface has circular cross sections for different positive values of z and it is
called a circular paraboloid. If you used f(x, y) = 4x^2 + 3y^2, the cross sections
for positive z would be ellipses and we would call the surface an elliptical paraboloid.
Now this code is not perfect. However, as an exploratory tool it is not bad! Now it is
time for you to play with it a bit in the exercises below.
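If you do not have DrawSimpleSurface handy, a bare-bones sketch using only the built-in meshgrid and surf commands produces a perfectly usable picture of this paraboloid:

% Sketch: plot the circular paraboloid z = x^2 + y^2 with built-in commands.
[X,Y] = meshgrid(linspace(-2,2,41), linspace(-2,2,41));
Z = X.^2 + Y.^2;
surf(X,Y,Z);
xlabel('x'); ylabel('y'); zlabel('z = x^2 + y^2');
rotate3d on;    % spin the surface around with the mouse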
4.1.1 Homework
Exercise 4.1.1 Explore the surface graph of the circular paraboloid f(x, y) = x^2 +
y^2 for different values of (x_0, y_0), Δx and Δy. Experiment with the 3D rotated
view to make sure you see everything of interest.
Exercise 4.1.2 Explore the surface graph of the elliptical paraboloid f(x, y) =
2x^2 + y^2 for different values of (x_0, y_0), Δx and Δy. Experiment with the 3D
rotated view to make sure you see everything of interest.
Exercise 4.1.3 Explore the surface graph of the elliptical paraboloid f(x, y) =
2x^2 + 3y^2 for different values of (x_0, y_0), Δx and Δy. Experiment with the 3D
rotated view to make sure you see everything of interest.
4.2 Continuity
Let’s recall the ideas of continuity for a function of one variable. Consider these three
versions of a function f defined on [0, 2].
f(x) = { x^2,             if 0 ≤ x < 1
       { 10,              if x = 1
       { 1 + (x − 1)^2,   if 1 < x ≤ 2.
The first version is not continuous at x = 1 because although the lim x→1 f (x) exists
and equals 1 (lim x→1− f (x) = 1 and lim x→1+ f (x) = 1), the value of f (1) is 10
which does not match the limit. Hence, we know f here has a removable discontinuity
at x = 1. Note continuity failed because the limit existed but the value of the function
did not match it. The second version of f is given below.
f(x) = { x^2,         if 0 ≤ x ≤ 1
       { (x − 1)^2,   if 1 < x ≤ 2.
In this case, lim_{x→1^−} f(x) = 1 and f(1) = 1, so f is continuous from the left. However,
lim_{x→1^+} f(x) = 0, which does not match f(1), and so f is not continuous from the
right. Also, since the right and left hand limits do not match at x = 1, we know
lim_{x→1} f(x) does not exist. Here, the function fails to be continuous because the limit
does not exist. The final example is below:
f(x) = { x^2,              if 0 ≤ x ≤ 1
       { x + (x − 1)^2,    if 1 < x ≤ 2.
Here, the limit and the function value at 1 both match and so f is continuous at
x = 1. To extend these ideas to two dimensions, the first thing we need to do is
to look at the meaning of the limiting process. What does lim(x,y)→(x0 ,y0 ) mean?
Clearly, in one dimension we can approach a point x_0 from x in essentially two ways: from
the left or from the right (or by jumping back and forth between left and right). Now, it is apparent
that we can approach a given point (x0 , y0 ) in an infinite number of ways. Draw
a point on a piece of paper and convince yourself that there are many ways you
can draw a curve from another point (x, y) so that the curve ends up at (x0 , y0 )!
We still want to define continuity in the same way; i.e. f is continuous at the point
(x0 , y0 ) if lim(x,y)→(x0 ,y0 ) f (x, y) = f (x0 , y0 ). If you look at the graphs of the surface
z = x 2 + y 2 we have done previously, we clearly see that we have this kind of
behavior. There are no jumps, tears or gaps in the surface we have drawn. Let’s make
this formal.
Here is an example of a function which is not continuous at the point (0, 0). Let
If we show the limit as we approach (0, 0) does not exist, then we will know f is not
continuous at (0, 0). If this limit exists, we should get the same value for the limit
no matter what path we take to reach (0, 0). Let the first path be given by x(t) = t
and y(t) = 2t. Then, as t → 0, (x(t), y(t)) → (0, 0) as desired. Plugging in to f,
we find for t ≠ 0, f(t, 2t) = 2t/√(t^2 + 4t^2) = 2/√5 and hence the limit along this
path is this constant value 2/√5. On the other hand, along the path x(t) = t and
y(t) = −3t, for t ≠ 0, we have f(t, −3t) = 2/3 which is not the same. Since the
limiting value differs on two paths, the limit can't exist. Hence, f is not continuous
at (0, 0).
Let’s go back to our simple surface example and look at the traces again. In Fig. 4.1, we
show the traces for the base point x0 = 0.5 and y0 = 0.5. We have also drawn vertical
lines down from the traces to the x−y plane to further emphasize the placement of
Fig. 4.1 The traces f (x0 , y) and f (x, y0 ) for the surface z = x 2 + y 2 for x0 = 0.5 and y0 = 0.5
the traces on the surface. The surface itself is not shown as it is somewhat distracting
and makes the illustration too busy.
You can generate this type of graph yourself with the function DrawFullTraces
as follows:
Note, that each trace has a well-defined tangent line and derivative at the points x0
and y0 . We have
(d/dx) f(x, y_0) = (d/dx)(x^2 + y_0^2) = 2x
as the value y_0 in this expression is a constant and hence its derivative with respect to
x is zero. We denote this new derivative as ∂f/∂x, which we read as the partial derivative
of f with respect to x. Its value at the point (x_0, y_0) is 2x_0 here. For any value of
(x, y), we would have ∂f/∂x = 2x. We also have
(d/dy) f(x_0, y) = (d/dy)(x_0^2 + y^2) = 2y.
We then denote this new derivative as ∂f/∂y, which we read as the partial derivative of
f with respect to y. Its value at the point (x_0, y_0) is then 2y_0 here. For any value of
(x, y), we would have ∂f/∂y = 2y.
The tangent lines for these two traces are then
T(x, y_0) = f(x_0, y_0) + (d/dx) f(x, y_0)|_{x_0} (x − x_0) = (x_0^2 + y_0^2) + 2x_0 (x − x_0)
T(x_0, y) = f(x_0, y_0) + (d/dy) f(x_0, y)|_{y_0} (y − y_0).
We can also write these tangent line equations like this using our new notation for
partial derivatives.
T(x, y_0) = f(x_0, y_0) + (∂f/∂x)(x_0, y_0)(x − x_0) = (x_0^2 + y_0^2) + 2x_0 (x − x_0)
T(x_0, y) = f(x_0, y_0) + (∂f/∂y)(x_0, y_0)(y − y_0) = (x_0^2 + y_0^2) + 2y_0 (y − y_0).
We can draw these tangent lines in 3D. To draw T (x, y0 ), we fix the y value to be y0
and then we draw the usual tangent line in the x−z plane. This is a copy of the x−z
plane translated over to the value y0 ; i.e. it is parallel to the x−z plane we see at the
value y = 0. We can do the same thing for the tangent line T(x_0, y); we fix the x
value to be x_0 and then draw the tangent line in the copy of the y−z plane translated
to the value x_0. We show this in Fig. 4.3. Note the T(x, y_0) and the T(x_0, y) lines
are determined by vectors as shown below.
A = [1, 0, (d/dx) f(x, y_0)|_{x_0}]^T = [1, 0, 2x_0]^T   and   B = [0, 1, (d/dy) f(x_0, y)|_{y_0}]^T = [0, 1, 2y_0]^T.
Note that if we connect the lines determined by the vectors A and B, we determine
a flat sheet which you can interpret as a piece of paper laid on top of these two lines.
Fig. 4.2 The traces f (x0 , y) and f (x, y0 ) for the surface z = x 2 + y 2 for x0 = 0.5 and y0 = 0.5
with added tangent lines. We have added the tangent plane determined by the tangent lines
Of course, we can only envision a small finite subset of this sheet of paper as you
can see in Fig. 4.2. Imagine that the sheet extends infinitely in all directions! The
sheet of paper we are plotting is called the tangent plane to our surface at the point
(x0 , y0 ). We will talk about this more formally later.
To draw this picture with the tangent lines, the traces and the tangent plane, we use
DrawTangentLines which has arguments (f,fx,fy,delx,nx,dely,ny,
r,x0,y0). There are three new arguments: fx which is ∂ f /∂x, fy which is ∂ f /∂ y
and r which is the size of the tangent plane that is plotted. For the picture shown in
Fig. 4.3, we’ve removed the tangent plane because the plot was getting pretty busy.
We did this by commenting out the line that plots the tangent plane. It is easy for
you to go into the code and add it back in if you want to play around. The MatLab
command line is
If you want to see the tangent plane as well as the tangent lines, all you have to
do is look at the following lines in DrawTangentLines.m.
Fig. 4.3 The traces f (x0 , y) and f (x, y0 ) for the surface z = x 2 + y 2 for x0 = 0.5 and y0 = 0.5
with added tangent lines
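We do not list all of DrawTangentLines.m, but the tangent plane portion looks roughly like the sketch below; the names U, V, W and the surf(U,V,W,'EdgeColor','blue') call are the ones referred to in the next paragraph, while the surrounding lines are our guess at reasonable code.

% Sketch of the tangent plane lines in a DrawTangentLines style function.
f  = @(x,y) x.^2 + y.^2;   fx = @(x,y) 2*x;   fy = @(x,y) 2*y;
x0 = 0.5;  y0 = 0.5;  r = 0.4;            % base point and size of the plotted plane
[U,V] = meshgrid(linspace(x0-r, x0+r, 21), linspace(y0-r, y0+r, 21));
W = f(x0,y0) + fx(x0,y0)*(U - x0) + fy(x0,y0)*(V - y0);   % tangent plane heights
surf(U,V,W,'EdgeColor','blue');           % put a % in front of this line to turn the plane off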
These lines set up the tangent plane, and the tangent plane is turned off if there is
a % in front of surf(U,V,W,'EdgeColor','blue');. We edited the file to
take the % out so we can see the tangent plane. We then see the plane in Fig. 4.4 as
we saw before.
The ideas we have been discussing can be made more general. When we take the
derivative with respect to one variable while holding the other variable constant (as
we do when we find the normal derivative along a trace), we say we are taking a
partial derivative of f. Here there are two flavors: the partial derivative with respect
to x and the partial derivative with respect to y. We can now state some formal
definitions and introduce the notations and symbols we use for these things. We
define the process of partial differentiation carefully below.
Fig. 4.4 The traces f (x0 , y) and f (x, y0 ) for the surface z = x 2 + y 2 for x0 = 0.5 and y0 = 0.5
with added tangent lines
lim_{x→x_0, y=y_0} ( f(x, y) − f(x_0, y_0) ) / (x − x_0)

lim_{x=x_0, y→y_0} ( f(x, y) − f(x_0, y_0) ) / (y − y_0)
If these limits exist, they are called the partial derivatives of f with respect to x and
y at (x_0, y_0), respectively.
Comment 4.3.2 We often use another notation for partial derivatives. The function
f of two variables x and y can be thought of as having two arguments or slots into
which we place values. So another useful notation is to let the symbol D1 f be f x and
D2 f be f y . We will be using this notation later when we talk about the chain rule.
Comment 4.3.3 It is easy to take partial derivatives. Just imagine the one variable
held constant and take the derivative of the resulting function just like you did in
your earlier calculus courses.
Example 4.3.1 Let z = f(x, y) = x^2 + 4y^2 be a function of two variables. Find ∂z/∂x
and ∂z/∂y.
Solution Thinking of y as a constant, we take the derivative in the usual way with
respect to x. This gives
∂z/∂x = 2x.

Thinking of x as a constant instead, we find

∂z/∂y = 8y.
∂z/∂x = 8x y^3,   ∂z/∂y = 12x^2 y^2.
∂f/∂x = 4x^3/(y^3 + 2),   ∂f/∂y = −(x^4 + 1)(3y^2)/(y^3 + 2)^2.
Example 4.3.4 f(x, y) = (x^4 y^2 + 2)/(y^3 x^5 + 20).
Solution
Solution
∂f/∂x = e^{−(x^2 + y^4)} (−2x),   ∂f/∂y = e^{−(x^2 + y^4)} (−4y^3).
∂f/∂y = (1/2) · 4y/(x^2 + 2y^2).
4.3.1 Homework
These are for you: for each of these functions, find f x and f y .
First, functions with no cross terms.
Exercise 4.3.1 f (x, y) = x 2 + 3y 2 .
Exercise 4.3.2 f (x, y) = 4x 2 + 5y 4 .
Exercise 4.3.3 f (x, y) = −3x + 2y 8 .
Exercise 4.3.9 f(x, y) = (x^2 + 2y)/(5x + y^3).

Exercise 4.3.11 f(x, y) = (x^2 + 2)/(5 + y).
Before we discuss tangent planes to a function f again, let’s digress to the ideas of
planes in general in 3D. We define a plane as follows.
Definition 4.4.1 (Planes)
A plane in 3D through the point (x_0, y_0, z_0) is defined as the set of all points (x, y, z)
so that the vector D is perpendicular to the vector N, where D is the vector we get
by connecting the point (x_0, y_0, z_0) to the point (x, y, z). Hence, for

D = [x − x_0, y − y_0, z − z_0]^T   and   N = [N_1, N_2, N_3]^T,

the plane is the set of points (x, y, z) so that < D, N > = 0. The vector N is called
the normal vector to the plane.
Comment 4.4.1 A little thought shows that any plane crossing through the origin
is a two dimensional subspace of ℝ^3.
Example 4.4.1 The equation 2x + 3y − 5z = 0 defines the plane whose normal vec-
tor is N = [2, 3, −5]T which passes through the origin (0, 0, 0).
Example 4.4.2 The equation 2(x − 2) + 3(y − 1) − 5(z + 3) = 0 defines the plane
whose normal vector is N = [2, 3, −5]T which passes through the point (2, 1, −3).
Note this can be rewritten as 2x + 3y − 5z = 4 + 3 + 15 = 22 after a simple manip-
ulation.
Example 4.4.3 The equation 2x + 3y − 5z = 11 corresponds to a plane with normal
vector N = [2, 3, −5]T which passes through some point (x0 , y0 , z 0 ). There are an
infinite number of choices for this base point: any triple which solves 2x0 + 3y0 −
5z 0 = 11 will do the job. An easy way to pick one is to pick two and solve for the
third. So for example, if z 0 = 0 and y0 = 4, we find 2x0 + 12 = 11 which gives
x0 = −1/2. Thus, this plane could be rewritten as 2(x + 1/2) + 3(y − 4) − 5z = 0.
There is another very useful way to define a plane which we did not discuss in the first
volume. As long as the vectors A and B point in different directions, they determine
a new vector A × B which is perpendicular to both of them and can serve as the
normal to a plane. Note, saying the vectors A and B point in different directions is
the same as saying they are linearly independent.
Definition 4.4.2 (Planes Again)
The plane in 3D determined by the vectors A and B containing the point (x0 , y0 , z 0 )
is defined as the plane whose normal vector is N = A × B.
To find a formula for such a normal, suppose the vector C = [C_1, C_2, C_3]^T is perpendicular to both A and B. Then

A_1 C_1 + A_2 C_2 + A_3 C_3 = 0
B_1 C_1 + B_2 C_2 + B_3 C_3 = 0.

Solving each equation for C_1, we find

C_1 = −(A_2/A_1) C_2 − (A_3/A_1) C_3 = −(B_2/B_1) C_2 − (B_3/B_1) C_3.

Setting these two expressions for C_1 equal and solving for C_2 in terms of C_3, we find

C_2 = ( (A_3/A_1) − (B_3/B_1) ) / ( (B_2/B_1) − (A_2/A_1) ) C_3
    = (B_1 A_3 − A_1 B_3)/(A_1 B_2 − B_1 A_2) C_3.

Substituting this back into the expression for C_1 gives

C_1 = −(A_2/A_1) (B_1 A_3 − A_1 B_3)/(A_1 B_2 − B_1 A_2) C_3 − (A_3/A_1) C_3
    = −( A_2 (B_1 A_3 − A_1 B_3) + A_3 (A_1 B_2 − B_1 A_2) ) / ( A_1 (A_1 B_2 − B_1 A_2) ) C_3
    = −( A_2 B_1 A_3 − A_1 A_2 B_3 + A_3 A_1 B_2 − A_3 B_1 A_2 ) / ( A_1 (A_1 B_2 − B_1 A_2) ) C_3
    = −( −A_1 A_2 B_3 + A_3 A_1 B_2 ) / ( A_1 (A_1 B_2 − B_1 A_2) ) C_3
    = (A_2 B_3 − A_3 B_2)/(A_1 B_2 − B_1 A_2) C_3.
To summarize,

C_1 = (A_2 B_3 − A_3 B_2)/(A_1 B_2 − B_1 A_2) C_3
C_2 = (B_1 A_3 − A_1 B_3)/(A_1 B_2 − B_1 A_2) C_3.

Choosing C_3 = A_1 B_2 − B_1 A_2, we find that the vector we are looking for has the
components

C_1 = A_2 B_3 − A_3 B_2
C_2 = B_1 A_3 − A_1 B_3
C_3 = A_1 B_2 − B_1 A_2
which can be written compactly using 2 × 2 determinants:

C_1 = det [ A_2  A_3 ; B_2  B_3 ],   C_2 = −det [ A_1  A_3 ; B_1  B_3 ],   C_3 = det [ A_1  A_2 ; B_1  B_2 ],

where [ a  b ; c  d ] denotes the 2 × 2 matrix with rows (a, b) and (c, d).
Then using the standard basis vectors for ℝ^3, i, j and k, we see the vector C = A × B
can be written as

A × B = i det [ A_2  A_3 ; B_2  B_3 ] − j det [ A_1  A_3 ; B_1  B_3 ] + k det [ A_1  A_2 ; B_1  B_2 ].
It is convenient to organize this calculation with the 3 × 3 symbolic matrix C whose first row is (i, j, k), second row is (A_1, A_2, A_3) and third row is (B_1, B_2, B_3), and we define the determinant of this matrix to coincide with the definition of A × B.
Comment 4.4.2 This is easy to remember. Start with the i in row one. Cross out the
first row and first column of C. The first term in the cross product is then i times the
determinant of the 2 × 2 submatrix that is left over. This is the matrix

[ A_2  A_3 ; B_2  B_3 ].

The second term in row one is j. Associate this term with a minus sign, and since it is
in row one, column two, cross that row and column out of C to obtain the submatrix

[ A_1  A_3 ; B_1  B_3 ].

The last term is associated with the row one, column three entry k. Cross out that
row and column in C to obtain the submatrix

[ A_1  A_2 ; B_1  B_2 ].

The cross product is then the sum of these three terms, with the signs +, −, + as above. This is also called expanding
a 3 × 3 determinant by the first row, but that is another story.
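A quick numerical check of all this is easy in MatLab/Octave with the built-in cross and dot commands; the sketch below builds A × B for two example vectors and verifies the result is perpendicular to both.

% Sketch: cross product of two example vectors and a perpendicularity check.
A = [1; 2; 3];
B = [4; 0; -1];
C = cross(A,B);               % C = A x B = [-2; 13; -8]
disp(C');
disp([dot(A,C), dot(B,C)]);   % both inner products should be zero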
4.4.1.1 Homework
For each of these problems, graph the two vectors as well as the cross product.
Recall the tangent plane to a surface z = f (x, y) at the point (x0 , y0 ) was the plane
determined by the tangent lines T (x, y0 ) and T (x0 , y). The T (x, y0 ) line was deter-
mined by the vector
A = [1, 0, (d/dx) f(x, y_0)|_{x_0}]^T = [1, 0, 2x_0]^T

and the T(x_0, y) line was determined by the vector

B = [0, 1, (d/dy) f(x_0, y)|_{y_0}]^T = [0, 1, 2y_0]^T.

In general, for a surface z = f(x, y), these direction vectors are

A = [1, 0, (∂f/∂x)(x_0, y_0)]^T   and   B = [0, 1, (∂f/∂y)(x_0, y_0)]^T.
The normal to the tangent plane is therefore

A × B = det [ i  j  k ; 1  0  f_x(x_0, y_0) ; 0  1  f_y(x_0, y_0) ]

or

A × B = i det [ 0  f_x(x_0, y_0) ; 1  f_y(x_0, y_0) ] − j det [ 1  f_x(x_0, y_0) ; 0  f_y(x_0, y_0) ] + k det [ 1  0 ; 0  1 ]
      = −f_x(x_0, y_0) i − f_y(x_0, y_0) j + k
      = [ −f_x(x_0, y_0), −f_y(x_0, y_0), 1 ]^T.
The tangent plane to the surface z = f(x, y) at the point (x_0, y_0) is then given by

−f_x(x_0, y_0)(x − x_0) − f_y(x_0, y_0)(y − y_0) + (z − f(x_0, y_0)) = 0,

or, solving for z,

z = f(x_0, y_0) + f_x(x_0, y_0)(x − x_0) + f_y(x_0, y_0)(y − y_0).
We can use another compact definition at this point. We can define the gradient of
the function f to be the vector ∇ f . The gradient is defined as follows.
∇f(x_0, y_0) = [ f_x(x_0, y_0), f_y(x_0, y_0) ]^T.
Note the gradient takes a scalar function argument and returns a vector answer.
Solution
∇f(x, y) = [ 2x + 4y, 4x + 18y ]^T.
Note this can also be written as 10x + 40y + z = 55 which is also a standard form.
However, in this form, the attachment point (1, 2, 45) is hidden from view.
4.4.3 Homework
Exercise 4.4.7 Find the gradient of f (x, y) = sin(x y) and the equation of the tan-
gent plane to this surface at the point (π/4, −π/4).
We can use MatLab/Octave to draw tangent planes and tangent lines to a surface.
Consider the function DrawTangentPlanePackage. The source code is similar
to what we have done in previous functions. This time, we send in the function f and
the two partial derivatives fx and fy. First, we plot the traces and draw vertical lines
from the traces to the x−y plane. Note this code will not do very well on surfaces
where the z values become negative! But then, this code is just for exploration and
it is easy enough to alter it for other jobs. And it is a good exercise! After the traces
and their shadow lines are drawn, we draw the tangent lines. Finally, we draw the
tangent plane. The tangent plane calculation uses the partial derivatives we sent into
this function as arguments.
The illustrations this code produces have already been used in Fig. 4.2. Practice with
this code and draw other pictures! A typical session to generate this figure would
look like
4.4.5 Homework
Exercise 4.4.8 Draw tangent lines and planes for the surface f (x, y) = x 2 + 3y 2
for various points (x0 , y0 ).
Exercise 4.4.9 Draw tangent lines and planes for the surface f (x, y) = −x 2 − 3y 2
for various points (x0 , y0 ). You will need to modify the code to make this work!
Exercise 4.4.10 Draw tangent lines and planes for the surface f (x, y) = x 2 − 3y 2
for various points (x0 , y0 ). You will need to modify the code to make this work! Make
sure you try the point (0, 0).
Let’s look at the partial derivatives of f (x, y). As long as f (x, y) is defined locally
at (x0 , y0 ), we can say f x (x0 , y0 ) and f y (x0 , y0 ) exist if and only if there are error
functions E 1 (x, y, x0 , y0 ) and E 2 (x, y, x0 , y0 ) so that
From Definition 4.5.1, we can show if f is differentiable at the point (x0 , y0 ), then
L 1 = f x (x0 , y0 ) and L 2 = f y (x0 , y0 ). The argument goes like this: since f is differ-
entiable at (x0 , y0 ), we can say
lim_{(x,y)→(x_0,y_0)} ( f(x, y) − f(x_0, y_0) − L_1 (x − x_0) − L_2 (y − y_0) ) / √( (x − x_0)^2 + (y − y_0)^2 ) = 0.
Thus, the right hand partial derivative f_x(x_0, y_0)^+ exists and equals L_1. On the other
hand, if Δx < 0, then √( (Δx)^2 ) = −Δx and we find, with a little manipulation, that
we still have
So the left hand partial derivative f x (x0 , y0 )− exists and equals L 1 also. Combining,
we see f x (x0 , y0 ) = L 1 . A similar argument shows that f y (x0 , y0 ) = L 2 . Hence, we
can say if f is differentiable at (x0 , y0 ) then f x and f y exist at this point and we have
Now that we know a bit about two dimensional derivatives, let’s go for gold and
figure out the new version of the chain rule. The argument we make here is very
similar in spirit to the one dimensional one. You should go back and check it out! We
will do this argument carefully but without tedious rigor. At least that is our hope.
You’ll have to let us know how we did!
We assume there are two functions u(x, y) and v(x, y) defined locally about
(x0 , y0 ) and that there is a third function f (u, v) which is defined locally around
(u 0 = u(x0 , y0 ), v0 = v(x0 , y0 )). Now assume f (u, v) is differentiable at (u 0 , v0 )
and u(x, y) and v(x, y) are differentiable at (x0 , y0 ). Then we can say
where all the error terms behave as usual as (x, y) → (x0 , y0 ) and (u, v) →
(u 0 , v0 ). Note that as (x, y) → (x0 , y0 ), u(x, y) → u 0 = u(x0 , y0 ) and v(x, y) →
v0 = v(x0 , y0 ) as u and v are continuous at the (u 0 , v0 ) since they are differentiable
there. Let’s consider the partial of f with respect to x. Let u = u(x0 + x, y0 ) −
u(x0 , y0 ) and v = v(x0 + x, y0 ) − v(x0 , y0 ). Thus, u 0 + u = u(x0 + x, y0 )
and v0 + v = v(x0 + x, y0 ). Hence
( f(u_0 + Δu, v_0 + Δv) − f(u_0, v_0) ) / Δx
  = ( f_u(u_0, v_0) Δu + f_v(u_0, v_0) Δv + E_f(u, v, u_0, v_0) ) / Δx
  = f_u(u_0, v_0) (Δu/Δx) + f_v(u_0, v_0) (Δv/Δx) + E_f(u, v, u_0, v_0)/Δx
  = f_u(u_0, v_0) ( u_x(x_0, y_0) Δx + E_u(Δx, x_0, y_0) ) / Δx
    + f_v(u_0, v_0) ( v_x(x_0, y_0) Δx + E_v(Δx, x_0, y_0) ) / Δx + E_f(u, v, u_0, v_0)/Δx
  = f_u(u_0, v_0) u_x(x_0, y_0) + f_v(u_0, v_0) v_x(x_0, y_0)
    + f_u(u_0, v_0) E_u(Δx, x_0, y_0)/Δx + f_v(u_0, v_0) E_v(Δx, x_0, y_0)/Δx + E_f(u, v, u_0, v_0)/Δx.

Letting Δx → 0, the error terms vanish and we conclude
∂f/∂x = (∂f/∂u)(∂u/∂x) + (∂f/∂v)(∂v/∂x).

A similar argument shows

∂f/∂y = (∂f/∂u)(∂u/∂y) + (∂f/∂v)(∂v/∂y).

In summary, the chain rule for f(u(x, y), v(x, y)) is

∂f/∂x = (∂f/∂u)(∂u/∂x) + (∂f/∂v)(∂v/∂x)
∂f/∂y = (∂f/∂u)(∂u/∂y) + (∂f/∂v)(∂v/∂y).
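If you have the Symbolic Math Toolbox available, you can let MatLab check a chain rule computation for you. The sketch below uses a made-up example, u = x^2 + y^2, v = x y and f(u, v) = u v^2, and verifies that differentiating the composition directly agrees with the chain rule formula above.

% Sketch: verify the two variable chain rule symbolically (Symbolic Math Toolbox assumed).
syms x y U V
u = x^2 + y^2;          % u(x,y)
v = x*y;                % v(x,y)
F = U*V^2;              % f written in terms of its two slots U and V
lhs = diff(subs(F,[U V],[u v]), x);                 % differentiate the composition directly
rhs = subs(diff(F,U),[U V],[u v])*diff(u,x) + ...
      subs(diff(F,V),[U V],[u v])*diff(v,x);        % chain rule formula
simplify(lhs - rhs)     % should print 0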
4.6.1 Examples
∂f/∂r = (∂f/∂x)(∂x/∂r) + (∂f/∂y)(∂y/∂r)
∂f/∂θ = (∂f/∂x)(∂x/∂θ) + (∂f/∂y)(∂y/∂θ)
This becomes
∂f/∂r = (2x + 2) cos(θ) + 20y^3 sin(θ)
∂f/∂θ = (2x + 2)(−r sin(θ)) + 20y^3 r cos(θ)
You can then substitute in for x and y to get the final answer in terms of r and θ (kind
of ugly though!).
∂f/∂x = (∂f/∂u)(∂u/∂x) + (∂f/∂v)(∂v/∂x)
∂f/∂y = (∂f/∂u)(∂u/∂y) + (∂f/∂v)(∂v/∂y)
This becomes
∂f/∂x = 20uv^4 (2x) + 40u^2 v^3 (8x)
∂f/∂y = 20uv^4 (4y) + 40u^2 v^3 (−10y)
You can then substitute in for u and v to get the final answer in terms of x and y
(even more ugly though!).
Example 4.6.3 In the discussion of Hamilton’s Rule from the first course on calculus
for biologists, we discuss a fitness function w for a model of altruism which depends
on P which is the probability of giving aid and Q which is the probability of receiving
aid. The model is
w = w0 + b Q − c P
∂w/∂P = (∂w/∂P)(∂P/∂P) + (∂w/∂Q)(∂Q/∂P)
      = −c + b (∂Q/∂P)
Let ∂Q/∂P be denoted by r, the coefficient of relatedness. The parameter r is very
hard to understand even though it was introduced in 1964 to study altruism. Altruism
occurs if fitness increases, i.e. ∂w/∂P > 0. So altruism occurs if −c + br > 0 or rb > c.
This inequality is Hamilton's Rule, but what counts is that we understand what these
terms mean biologically.
4.6.1.1 Homework
We are now ready to give you a whirlwind tour of what you can call second order
ideas in calculus for two variables. Or as some would say, let’s drink from the fountain
of knowledge with a fire hose! Well, maybe not that intense....
We will use these ideas for some practical things. Recall from the first class on
calculus for biologists, we used these ideas to find the minimum and maximum
of functions of two variables and we applied those ideas to the problem of finding
the best straight line that fits a collection of data points. This regression line is of
great importance to you in your career as biologists! We also introduced the ideas of
average or mean, covariance and variance when we worked out how to find the
regression line. The slope of the regression line has many important applications and
we showed you some of them in our Hamilton's rule model.
Once we have the chain rule, we can quickly develop other results such as how
much error we make when we approximate our surface f (x, y) using a tangent plane
at a point (x0 , y0 ). To finish our arguments, we need an analog of the Mean Value
Theorem from Calculus. The first thing we need is to know when a function of two
variables is differentiable. Just because its partials exist at a point is not enough to
guarantee that! But we can prove that if the partials are continuous around that point,
then the derivative does exist. And that means we can write the function in terms of
its tangent plane plus an error. The arguments to do this are not terribly hard, so let’s
go through them. We will need a version of the Mean Value Theorem for functions
of two variables. Here it is:
Proof The argument that shows this is pretty straightforward. We apply the chain
rule using the simpler functions u(t) = x_0 + t(x − x_0) and v(t) = y_0 + t(y − y_0).
Then u and v are differentiable with u'(t) = x − x_0 and v'(t) = y − y_0. Hence, if
h(t) = f(u(t), v(t)), we have h'(t) = f_x(u(t), v(t))(x − x_0) + f_y(u(t), v(t))(y − y_0).
Now it is not true that just because a function f has partial derivatives at a point
(x0 , y0 ) that f is differentiable. There are many examples where partials can exist
at a point and the function itself does not satisfy the definition of differentiability.
However, if we know the partials are themselves continuous locally at the point
(x0 , y0 ) then it is true that f is differentiable there. Once we know f is differentiable
there we can apply chain rule type ideas. Let’s assume f is defined locally around
(x0 , y0 ) and consider the difference
E(Δx, Δy, x_0, y_0) = ( f_x(x_0 + t_2 Δx, y_0) − f_x(x_0, y_0) ) Δx
                    + ( f_y(x_0 + Δx, y_0 + t_1 Δy) − f_y(x_0, y_0) ) Δy.

We know as (Δx, Δy) → (0, 0), the numbers (t_1, t_2) we found using the Mean
Value Theorem will also go to (0, 0), and so (x_0 + t_2 Δx, y_0) → (x_0, y_0) and (x_0 +
Δx, y_0 + t_1 Δy) → (x_0, y_0). Then the continuity of f_x and f_y at (x_0, y_0) tells us

( f_x(x_0 + t_2 Δx, y_0) − f_x(x_0, y_0) ) Δx + ( f_y(x_0 + Δx, y_0 + t_1 Δy) − f_y(x_0, y_0) ) Δy → 0.

But the terms |Δx| / √( (Δx)^2 + (Δy)^2 ) ≤ 1 and |Δy| / √( (Δx)^2 + (Δy)^2 ) ≤ 1, and so as
(Δx, Δy) → (0, 0), we must have E(Δx, Δy, x_0, y_0) / √( (Δx)^2 + (Δy)^2 ) → 0 as well.
These two limits show that f is differentiable at (x0 , y0 ). We can state this as a
theorem. We use this idea a lot in two dimensional calculus.
Now let’s go back to the old idea of a tangent plane to a surface. For the surface
z = f (x, y) if its partials are continuous functions (they usually are for our work!)
then f is differentiable and hence we know that
We can characterize the error much better if we have access to what are called the
second order partial derivatives of f. Roughly speaking, we take the partials of f_x
and f_y to obtain the second order terms. We can make this discussion brief. Assuming
f is defined locally as usual near (x_0, y_0), we can ask about the partial derivatives
of the functions f_x and f_y with respect to x and y also. We define the second order
partials of f as follows.
lim_{x→x_0, y=y_0} ( f_x(x, y) − f_x(x_0, y_0) ) / (x − x_0) = ∂_x( f_x )

lim_{x=x_0, y→y_0} ( f_x(x, y) − f_x(x_0, y_0) ) / (y − y_0) = ∂_y( f_x )
lim_{x→x_0, y=y_0} ( f_y(x, y) − f_y(x_0, y_0) ) / (x − x_0) = ∂_x( f_y )

lim_{x=x_0, y→y_0} ( f_y(x, y) − f_y(x_0, y_0) ) / (y − y_0) = ∂_y( f_y )
Comment 4.8.1 When these second order partials exist at (x_0, y_0), we use the following
notations interchangeably: f_xx = ∂_x( f_x ), f_xy = ∂_y( f_x ), f_yx = ∂_x( f_y ) and
f_yy = ∂_y( f_y ).
The second order partials are often organized into a matrix called the Hessian.
H(x_0, y_0) = [ f_xx(x_0, y_0)  f_xy(x_0, y_0) ; f_yx(x_0, y_0)  f_yy(x_0, y_0) ].
Comment 4.8.2 It is possible to prove that if the second order partials are continuous
locally near (x_0, y_0), then the mixed partials f_xy and f_yx must match at the
point (x_0, y_0). Most of our surfaces have this property. Hence, for these smooth
surfaces, the Hessian is a symmetric matrix!
Example 4.8.1 Let f (x, y) = 2x − 8x y. Find the first and second order partials of
f and its Hessian.
f x (x, y) = 2 − 8y
f y (x, y) = −8x
f x x (x, y) = 0
f x y (x, y) = −8
f yx (x, y) = −8
f yy (x, y) = 0.
H(x, y) = [ f_xx(x, y)  f_xy(x, y) ; f_yx(x, y)  f_yy(x, y) ] = [ 0  −8 ; −8  0 ].
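If the Symbolic Math Toolbox is available, the gradient and hessian commands provide a quick check on hand computations like this one; the sketch below does Example 4.8.1.

% Sketch: first and second order partials of f(x,y) = 2x - 8xy (Symbolic Math Toolbox assumed).
syms x y
f = 2*x - 8*x*y;
gradf = gradient(f, [x y])    % should be [2 - 8y; -8x]
H     = hessian(f, [x y])     % should be [0 -8; -8 0]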
4.8.1 Homework
Exercise 4.8.1 Let f (x, y) = 5x − 2x y. Find the first and second order partials of
f and its Hessian.
Exercise 4.8.2 Let f (x, y) = −8y + 9x y − 2y 2 . Find the first and second order
partials of f and its Hessian.
Exercise 4.8.3 Let f (x, y) = 4x − 6x y − x 2 . Find the first and second order par-
tials of f and its Hessian.
Exercise 4.8.4 Let f (x, y) = 4x 2 − 6x y − x 2 . Find the first and second order par-
tials of f and its Hessian.
We can now explain the most common approximation result for tangent planes. Let
h(t) = f(x_0 + tΔx, y_0 + tΔy). Then the one variable Taylor expansion gives

h(t) = h(0) + h'(0) t + h''(c) t^2/2.
Using the chain rule, we find
and
h''(t) = ∂_x( f_x(x_0 + tΔx, y_0 + tΔy) Δx + f_y(x_0 + tΔx, y_0 + tΔy) Δy ) Δx
       + ∂_y( f_x(x_0 + tΔx, y_0 + tΔy) Δx + f_y(x_0 + tΔx, y_0 + tΔy) Δy ) Δy

which we can organize as

h''(t) = [Δx  Δy] [ f_xx  f_yx ; f_xy  f_yy ](x_0 + tΔx, y_0 + tΔy) [Δx ; Δy]
       = [Δx ; Δy]^T H(x_0 + tΔx, y_0 + tΔy) [Δx ; Δy].
h(1) = h(0) + h'(0)(1 − 0) + h''(c)/2
for some c between 0 and 1. Substituting for the h terms, we find
Clearly, we have shown how to express the error in terms of second order partials.
There is a point c between 0 and 1 so that
E(x_0, y_0, Δx, Δy) = (1/2) [Δx ; Δy]^T H(x_0 + cΔx, y_0 + cΔy) [Δx ; Δy].
To understand how to think about finding places where the minimum and maximum
of a function of two variables might occur, all you have to do is realize it is a common
sense thing. We already know that the tangent plane attached to the surface which
represents our function of two variables is a way to approximate the function near
the point of attachment. We have seen in our pictures what happens when the tangent
plane is flat. This flatness occurs at the minimum and maximum of the function. It
also occurs in other situations, but we will leave that more complicated event for other
courses. The functions we want to deal with are quite nice and have great minima
and maxima. However, we do want you to know there are more things in the world
and we will touch on them only briefly.
To see what to do, just recall the equation of the tangent plane error to our function
of two variables f (x, y).
where c is some number between 0 and 1 that is different for each (x, y). We also know
that the equation of the tangent plane to f(x, y) at the point (x_0, y_0) is
Now let’s assume the tangent plane is flat at (x0 , y0 ). Then the gradient ∇ f is the
zero vector and we have ∂∂xf (x0 , y0 ) = 0 and ∂∂ yf (x0 , y0 ) = 0. So the tangent plane
error equation simplifies to
Now let’s simplify this. The Hessian is just a 2 × 2 matrix whose components are
the second order partials of f . Let
A(c) = (∂^2 f/∂x^2)(x_0 + c(x − x_0), y_0 + c(y − y_0))
B(c) = (∂^2 f/∂x∂y)(x_0 + c(x − x_0), y_0 + c(y − y_0))
     = (∂^2 f/∂y∂x)(x_0 + c(x − x_0), y_0 + c(y − y_0))
D(c) = (∂^2 f/∂y^2)(x_0 + c(x − x_0), y_0 + c(y − y_0))
Then, we have
f(x, y) = f(x_0, y_0) + (1/2) [x − x_0, y − y_0] [ A(c)  B(c) ; B(c)  D(c) ] [x − x_0 ; y − y_0].
We can multiply this out (a nice simple pencil and paper exercise!) to find
f(x, y) = f(x_0, y_0) + (1/2)( A(c)(x − x_0)^2 + 2B(c)(x − x_0)(y − y_0) + D(c)(y − y_0)^2 ).

To deal with the cross term, we group terms to form a perfect square and combine what
is left over into one term. For example, consider u^2 + 3uv + 36v^2. Adding and subtracting
(3/2)^2 v^2, we have

u^2 + 3uv + 36v^2 = u^2 + 3uv + (3/2)^2 v^2 + ( 36 − (3/2)^2 ) v^2.

The first three terms are a perfect square, (u + (3/2)v)^2. Simplifying, we find

u^2 + 3uv + 36v^2 = ( u + (3/2)v )^2 + (135/4) v^2.
This is called completing the square! Now let's do this with the Hessian quadratic
we have. First, factor out the A(c). We will assume it is not zero so the divisions are
fine to do. Also, for convenience, we will replace x − x_0 by Δx and y − y_0 by Δy.
This gives
f(x, y) = f(x_0, y_0) + (A(c)/2) ( (Δx)^2 + 2 (B(c)/A(c)) Δx Δy + (D(c)/A(c)) (Δy)^2 ).
Adding and subtracting (B(c)/A(c))^2 (Δy)^2 inside the parentheses, this becomes

f(x, y) = f(x_0, y_0) + (A(c)/2) ( (Δx)^2 + 2 (B(c)/A(c)) Δx Δy + (B(c)/A(c))^2 (Δy)^2
          − (B(c)/A(c))^2 (Δy)^2 + (D(c)/A(c)) (Δy)^2 ).
Now group the first three terms together—the perfect square and combine the last
two terms into one. We have
f(x, y) = f(x_0, y_0) + (A(c)/2) ( ( Δx + (B(c)/A(c)) Δy )^2 + ( ( A(c) D(c) − (B(c))^2 ) / (A(c))^2 ) (Δy)^2 ).
Now we need a common sense result which says that if a function g is continuous
at a point (x_0, y_0) and positive (or negative) there, then it is positive (or negative) in a circle
of radius r centered at (x_0, y_0). Here is the formal statement.
Now getting back to our problem. We have at this point where the partials are zero,
the following expansion
f(x, y) = f(x_0, y_0) + (A(c)/2) ( Δx + (B(c)/A(c)) Δy )^2
         + (A(c)/2) ( ( A(c) D(c) − (B(c))^2 ) / (A(c))^2 ) (Δy)^2.
The algebraic sign of the terms after the function value f(x_0, y_0) is completely
determined by the coefficients A(c) and A(c) D(c) − (B(c))^2, since the remaining factors are squares. We have two simple cases:
• A(c) > 0 and A(c) D(c) − (B(c))2 > 0 which implies the term after f (x0 , y0 ) is
positive.
• A(c) < 0 and A(c) D(c) − (B(c))2 > 0 which implies the term after f (x0 , y0 ) is
negative.
Now let’s assume all the second order partials are continuous at (x0 , y0 ). We know
A(c) = ∂∂x 2f (x0 + c(x − x0 ), y0 + c(y − y0 )) and from Theorem 4.9.1, if ∂∂x 2f (x0 , y0 )
2 2
> 0, then so is A(c) in a circle around (x0 , y0 ). The other term A(c) D(c) −
(B(c))2 > 0 will also be positive is a circle around (x0 , y0 ) as long as ∂∂x 2f (x0 , y0 )
2
∂2 f ∂2 f
∂ y2
(x0 , y0 ) − (x , y0 ) > 0. We can say similar things about the negative case.
∂x∂ y 0
Now to save typing let ∂∂x 2f (x0 , y0 ) = f x0x , ∂∂ y 2f (x0 , y0 ) = f yy ∂2 f
2 2
0
and ∂x∂ (x , y0 ) = f x0y .
y 0
So we can restate our two cases as

• f_xx^0 > 0 and f_xx^0 f_yy^0 − ( f_xy^0 )^2 > 0, which corresponds to a minimum at (x_0, y_0);
• f_xx^0 < 0 and f_xx^0 f_yy^0 − ( f_xy^0 )^2 > 0, which corresponds to a maximum at (x_0, y_0);

where, for convenience, we use a superscript 0 to denote we are evaluating the partials
at (x_0, y_0). So we have come up with a great condition to verify if a place where
the partials are zero is a minimum or a maximum. If you think about it a bit, you'll
notice we left out the case where f_xx^0 f_yy^0 − ( f_xy^0 )^2 < 0, which is important but which we
will not pursue in depth in this class. That is for later courses to pick up; however, it is the
test for the analog of the behavior we see in the cubic y = x^3. The derivative is 0 but
there is neither a minimum nor a maximum at x = 0. In two dimensions, the situation is
more interesting of course. This kind of behavior is called a saddle. We have another
Theorem!
Now the second order test fails if det(H(x0 , y0 )) = 0 at the critical point as a
few examples show. First, the function f (x, y) = x 4 + y 4 has a global minimum at
(0, 0) but at that point
H(x, y) = [ 12x^2  0 ; 0  12y^2 ]   which means   det(H(x, y)) = 144 x^2 y^2,

and this is 0 at (0, 0), so the test gives no information there.
To understand the saddle case, go back to the expansion

f(x, y) = f(x_0, y_0) + (A(c)/2) ( ( Δx + (B(c)/A(c)) Δy )^2 + ( ( A(c) D(c) − (B(c))^2 ) / (A(c))^2 ) (Δy)^2 ).
Now suppose we knew A(c) D(c) − (B(c))^2 < 0 when c = 0; this is the same as saying
det(H(x_0, y_0)) < 0. Then, using the usual continuity argument, we know that there is a
circle around the critical point (x_0, y_0) on which A(c) D(c) − (B(c))^2 < 0. But notice that
on the line going through the critical point having Δy = 0, the expansion gives

f(x, y) = f(x_0, y_0) + (A(c)/2) (Δx)^2.
Now, if A(c) > 0, this gives f(x, y) = f(x_0, y_0) + a positive number,
showing f has a minimum on that trace. However, along the direction where Δx + (B(c)/A(c))Δy = 0,
only the (Δy)^2 term survives and its coefficient is negative, so f(x, y) =
f(x_0, y_0) − a positive number, which shows f has a maximum on that trace. The
fact that f is minimized in one direction and maximized in another direction gives
rise to the expression that we consider f to behave like a saddle at this critical point.
The analysis is virtually the same if A(c) < 0, except the first trace has the maximum
and the second trace has the minimum. Hence, the test for a saddle point is to see if
det(H(x0 , y0 )) < 0 as we stated in Theorem 4.9.2.
4.9.1 Examples
Example 4.9.1 Use our tests to show f (x, y) = x 2 + 3y 2 has a minimum at (0, 0).
Solution The partials here are f x = 2x and f y = 6y. These are zero at x = 0 and
y = 0. The Hessian at this critical point is
H(x, y) = [ 2  0 ; 0  6 ] = H(0, 0)
as H is constant here. Our second order test says the point (0, 0) corresponds
to a minimum because f x x (0, 0) = 2 > 0 and f x x (0, 0) f yy (0, 0) − ( f x y (0, 0))2 =
12 > 0.
Solution The partials here are f x = 2x + 6y and f y = 6x + 6y. These are zero at
when
2x + 6y = 0
6x + 6y = 0
as H is again constant here. Our second order test says the point (0, 0) corresponds to
a saddle because f x x (0, 0) = 2 > 0 and f x x (0, 0) f yy (0, 0) − ( f x y (0, 0))2 = 12 −
36 < 0.
Example 4.9.3 Show our tests fail on f (x, y) = 2x 4 + 4y 6 even though we know
there is a minimum value at (0, 0).
Solution For f (x, y) = 2x 4 + 4y 6 , you find that the critical point is (0, 0) and all
the second order partials are 0 there. So all the tests fail. Of course, a little common
sense tells you (0, 0) is indeed the place where this function has a minimum value.
Just think about how it’s surface looks. But the tests just fail. This is much like the
curve f (x) = x 4 which has a minimum at x = 0 but all the tests fail on it also.
Example 4.9.4 Show our tests fail on f (x, y) = 2x 2 + 4y 3 and the surface does not
have a minimum or maximum at the critical point (0, 0).
Solution For f (x, y) = 2x 2 + 4y 3 , the critical point is again (0, 0) and f x x (0, 0) =
4, f yy (0, 0) = 0 and f x y (0, 0) = f yx (0, 0) = 0. So f x x (0, 0) f yy (0, 0) − ( f x y (0, 0))2
= 0 so the test fails. Note the x = 0 trace is 4y 3 which is a cubic and so is nega-
tive below y = 0 and positive above y = 0. Not much like a minimum or maximum
behavior on this trace! But the trace for y = 0 is 2x 2 which is a nice parabola which
does reach its minimum at x = 0. So the behavior of the surface around (0, 0) is not
a maximum or a minimum. The surface acts a lot like a cubic. Do this in MatLab;
a sketch of the commands you might use is given after this example. This will give you
a surface. In the plot that is shown, go to the tool menu and click
on the rotate 3D option and you can spin it around. Clearly like a cubic! You can see
the plot in Fig. 4.5.
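The sketch below shows the sort of commands we have in mind (built-ins only, no utility functions); spin the resulting surface with the rotate tool and the cubic behavior in the y direction is clear.

% Sketch: the surface z = 2x^2 + 4y^3 near the critical point (0,0).
[X,Y] = meshgrid(linspace(-1,1,41), linspace(-1,1,41));
Z = 2*X.^2 + 4*Y.^3;
surf(X,Y,Z);
xlabel('x'); ylabel('y'); zlabel('z');
rotate3d on;    % spin it: a parabola in the x direction, a cubic in the y direction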
4.9.2 Homework
Exercise 4.9.1 Use our tests to show f (x, y) = 4x 2 + 2y 2 has a minimum at (0, 0).
Feel free to draw a surface plot to help you see what is going on.
4.9 Extrema Ideas 125
Exercise 4.9.5 Show our tests fail on f (x, y) = 6x 4 + 8y 8 even though we know
there is a minimum value at (0, 0). Feel free to draw a surface plot to help you see
what is going on.
Exercise 4.9.6 Show our tests fail on f (x, y) = 10x 2 + 5y 5 and the surface does
not have a minimum or maximum at the critical point (0, 0). Feel free to draw a
surface plot to help you see what is going on.
Part III
The Main Event
Chapter 5
Integration
To help us with our modeling tasks, we need to explore two new ways to compute
antiderivatives and definite integrals. These methods are called Integration By Parts
and Partial Fraction Decompositions. You should also recall our discussions about
antiderivatives or primitives and Riemann integration from Peterson (2015) where
we go over topics such as how we define the Riemann Integral, the Fundamental
Theorem of Calculus and the use of the Cauchy Fundamental Theorem of Calculus. You
should also review the basic ideas of continuity and differentiability.
This technique is based on the product rule for differentiation. Let’s assume that
the functions f and g are both differentiable on the finite interval [a, b]. Then the
product f g is also differentiable on [a, b] and ( f g )' = f' g + f g'. Integrating both sides
from a to b and rearranging gives

∫_a^b f(t) g'(t) dt = f(t) g(t) |_a^b − ∫_a^b f'(t) g(t) dt.   (5.1)

We usually write this in an even more abbreviated form. If we let u(t) = f(t),
then du = f'(t) dt. Also, if v(t) = g(t), then dv = g'(t) dt. Then, we can rephrase
Eq. 5.1 as Eq. 5.2.
∫_a^b u dv = uv |_a^b − ∫_a^b v du   (5.2)

We can also develop the integration by parts formula as an indefinite integral. When
we do that we obtain the version in Eq. 5.3

∫ u dv = uv − ∫ v du + C   (5.3)
Equation 5.2 gives what is commonly called the Integration By Parts formula.
Let’s work through some problems carefully. As usual, we will give many details at
first and gradually do the problems faster with less written down. You need to work
hard at understanding this technique.
Example 5.1.1 Evaluate ∫ ln(t) dt.
Solution Let u(t) = ln(t) and dv = dt. Then du = (1/t) dt and v = ∫ dt = t. When
we find the antiderivative v, at this stage we don't need to carry around an arbitrary
constant C as we will add one at the end. Applying Integration by Parts, we have
∫ ln(t) dt = ∫ u dv = uv − ∫ v du
           = ln(t) · t − ∫ t · (1/t) dt
           = t ln(t) − ∫ dt
           = t ln(t) − t + C.
Example 5.1.2 Evaluate ∫ t ln(t) dt.
Solution Let u(t) = ln(t) and dv = t dt. Then du = (1/t) dt and v = ∫ t dt = t^2/2.
Applying Integration by Parts, we have

∫ t ln(t) dt = ∫ u dv = uv − ∫ v du
            = ln(t) · (t^2/2) − ∫ (t^2/2)(1/t) dt
            = (t^2/2) ln(t) − ∫ (t/2) dt
            = (t^2/2) ln(t) − t^2/4 + C.
Example 5.1.3 Evaluate ∫ t^3 ln(t) dt.

Solution Let u(t) = ln(t) and dv = t^3 dt. Then du = (1/t) dt and v = ∫ t^3 dt = t^4/4.
Applying Integration by Parts, we have

∫ t^3 ln(t) dt = ∫ u dv = uv − ∫ v du
              = ln(t) · (t^4/4) − ∫ (t^4/4)(1/t) dt
              = (t^4/4) ln(t) − ∫ (t^3/4) dt
              = (t^4/4) ln(t) − t^4/16 + C.
Example 5.1.4 Evaluate ∫ t e^t dt.

Solution Let u(t) = t and dv = e^t dt. Then du = dt and v = ∫ e^t dt = e^t. Applying
Integration by Parts, we have

∫ t e^t dt = ∫ u dv = uv − ∫ v du
          = e^t · t − ∫ e^t dt
          = t e^t − e^t + C.
Example 5.1.5 Evaluate ∫ t^2 e^t dt.

Solution Let u(t) = t^2 and dv = e^t dt. Then du = 2t dt and v = ∫ e^t dt = e^t. Applying
Integration by Parts, we have

∫ t^2 e^t dt = ∫ u dv = uv − ∫ v du
            = e^t · t^2 − ∫ e^t · 2t dt
            = t^2 e^t − ∫ 2t e^t dt.

Now the integral ∫ 2t e^t dt also requires the use of integration by parts. So we
integrate again using this technique. Let u(t) = 2t and dv = e^t dt. Then du = 2 dt
and v = ∫ e^t dt = e^t. Applying Integration by Parts again, we have

∫ 2t e^t dt = ∫ u dv = uv − ∫ v du
           = e^t · 2t − ∫ e^t · 2 dt
           = 2t e^t − 2 ∫ e^t dt
           = 2t e^t − 2 e^t + C.
It is very awkward to do these multiple integration by parts in two separate steps like
we just did. It is much more convenient to repackage the computation like this:

∫ t^2 e^t dt = uv − ∫ v du    [ u = t^2, dv = e^t dt; du = 2t dt, v = e^t ]
            = e^t · t^2 − ∫ e^t · 2t dt
            = t^2 e^t − ∫ 2t e^t dt    [ u = 2t, dv = e^t dt; du = 2 dt, v = e^t ]
            = t^2 e^t − ( e^t · 2t − ∫ e^t · 2 dt )
            = t^2 e^t − 2t e^t + 2 e^t + C.
The framed boxes are convenient for our explanation, but this is still a bit awkward
(and long!) to write out for our problem solution. So let's try this:

∫ t^2 e^t dt = e^t t^2 − ∫ e^t 2t dt    ( u = t^2; dv = e^t dt; du = 2t dt; v = e^t )
            = t^2 e^t − ∫ 2t e^t dt
Solution We will do this one the short way: first do the indefinite integral just like
the last problem.
∫ t^2 sin(t) dt = −t^2 cos(t) − ∫ (−cos(t)) 2t dt = −t^2 cos(t) + 2t sin(t) + 2 cos(t) + C.
Then, we see

∫_1^3 t^2 sin(t) dt = { −t^2 cos(t) + 2t sin(t) + 2 cos(t) }(3)
                    − { −t^2 cos(t) + 2t sin(t) + 2 cos(t) }(1)
                   = { −9 cos(3) + 6 sin(3) + 2 cos(3) }
                    − { −cos(1) + 2 sin(1) + 2 cos(1) }.
And it is not clear we can do much to simplify this expression except possibly just
use our calculator to actually compute a value!
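Since the closed form is not very illuminating, it is worth checking it numerically; the sketch below evaluates both the antiderivative expression found above and MatLab's built-in integral command, and the two numbers should agree.

% Sketch: numerical check of the definite integral of t^2 sin(t) from 1 to 3.
F = @(t) -t.^2.*cos(t) + 2*t.*sin(t) + 2*cos(t);   % antiderivative found above
byParts  = F(3) - F(1)
numeric  = integral(@(t) t.^2.*sin(t), 1, 3)       % should match byParts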
5.1.2 Homework
Exercise 5.1.1 Evaluate ∫ ln(5t) dt.
Exercise 5.1.2 Evaluate ∫ 2t ln(t^2) dt.
Exercise 5.1.3 Evaluate ∫ (t + 1)^2 ln(t + 1) dt.
Exercise 5.1.4 Evaluate ∫ t^2 e^{2t} dt.
Exercise 5.1.5 Evaluate ∫_0^2 t^2 e^{−3t} dt.
Exercise 5.1.6 Evaluate ∫ 10t sin(4t) dt.
Exercise 5.1.7 Evaluate ∫ 6t cos(8t) dt.
Exercise 5.1.8 Evaluate ∫ (6t + 4) cos(8t) dt.
Exercise 5.1.9 Evaluate ∫_2^5 (t^2 + 5t + 3) ln(t) dt.
Exercise 5.1.10 Evaluate ∫ (t^2 + 5t + 3) e^{2t} dt.
Suppose we wanted to integrate a function like 1/( (x + 2)(x − 3) ). This does not fit into a
simple substitution method at all. The way we do this kind of problem is to split the
fraction 1/( (x + 2)(x − 3) ) into the sum of the two simpler fractions 1/(x + 2) and 1/(x − 3). This is
called the Partial Fractions Decomposition approach. Hence, we want to find numbers
A and B so that

1/( (x + 2)(x − 3) ) = A/(x + 2) + B/(x − 3).
If we multiply both sides of this equation by the term (x + 2) (x − 3), we get the
new equation
1 = A (x − 3) + B (x + 2).

Since this equation holds for all x, including x = 3 and x = −2, we can evaluate the equation twice
to get

1 = { A (x − 3) + B (x + 2) } |_{x=3} = 5 B
1 = { A (x − 3) + B (x + 2) } |_{x=−2} = −5 A.

Thus, A = −1/5 and B = 1/5, and so

1/( (x + 2)(x − 3) ) = (−1/5)/(x + 2) + (1/5)/(x − 3),
where it is hard to say which of these equivalent forms is the most useful. In general, in
later chapters, as we work out various modeling problems, we will choose whichever
of the forms above is best for our purposes.
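MatLab can confirm a partial fraction decomposition with the built-in residue command, which works from the coefficients of the numerator and denominator polynomials; the sketch below recovers the 1/5 and −1/5 coefficients found above.

% Sketch: partial fractions for 1/((x+2)(x-3)) = 1/(x^2 - x - 6) using residue.
num = 1;                 % numerator coefficients
den = [1 -1 -6];         % x^2 - x - 6 = (x+2)(x-3)
[r, p, k] = residue(num, den)
% r holds the coefficients (1/5 and -1/5) and p the corresponding poles (3 and -2).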
5/( (t + 3)(t − 4) ) = A/(t + 3) + B/(t − 4)

5 = A (t − 4) + B (t + 3)

5 = ( A (t − 4) + B (t + 3) ) |_{t=4} = 7 B
5 = ( A (t − 4) + B (t + 3) ) |_{t=−3} = −7 A.

Thus, we know A = −5/7 and B = 5/7. We can now evaluate the integral:

∫ 5/( (t + 3)(t − 4) ) dt = ∫ ( (−5/7)/(t + 3) + (5/7)/(t − 4) ) dt
  = ∫ (−5/7)/(t + 3) dt + ∫ (5/7)/(t − 4) dt
  = −(5/7) ∫ 1/(t + 3) dt + (5/7) ∫ 1/(t − 4) dt
  = −(5/7) ln(| t + 3 |) + (5/7) ln(| t − 4 |) + C
  = (5/7) ln( | t − 4 | / | t + 3 | ) + C
  = ln( ( | t − 4 | / | t + 3 | )^{5/7} ) + C
10/( (2t − 3)(8t + 5) ) = A/(2t − 3) + B/(8t + 5)

10 = A (8t + 5) + B (2t − 3).

Evaluating at t = 3/2 gives 10 = 17 A, and evaluating at t = −5/8 gives 10 = −(17/4) B.
Thus, we know B = −80/34 = −40/17 and A = 10/17. We can now evaluate the
integral:

∫ 10/( (2t − 3)(8t + 5) ) dt = ∫ ( (10/17)/(2t − 3) + (−40/17)/(8t + 5) ) dt
  = ∫ (10/17)/(2t − 3) dt + ∫ (−40/17)/(8t + 5) dt
  = (10/17) ∫ 1/(2t − 3) dt − (40/17) ∫ 1/(8t + 5) dt
  = (10/17)(1/2) ln(| 2t − 3 |) − (40/17)(1/8) ln(| 8t + 5 |) + C
  = (5/17) ln( | 2t − 3 | / | 8t + 5 | ) + C
  = ln( ( | 2t − 3 | / | 8t + 5 | )^{5/17} ) + C
Next, consider ∫ 6/((4 − t)(9 + t)) dt. Write
6/((4 − t)(9 + t)) = A/(4 − t) + B/(9 + t)
so that
6 = A (9 + t) + B (4 − t).
Evaluating twice, we get
6 = ( A (9 + t) + B (4 − t) ) |_{t=4} = 13 A
6 = ( A (9 + t) + B (4 − t) ) |_{t=−9} = 13 B.
Thus, we know A = 6/13 and B = 6/13. We can now evaluate the integral:
∫ 6/((4 − t)(9 + t)) dt = ∫ ( (6/13)/(4 − t) + (6/13)/(9 + t) ) dt
= ∫ ( (−6/13)/(t − 4) + (6/13)/(t + 9) ) dt
= −(6/13) ∫ 1/(t − 4) dt + (6/13) ∫ 1/(t + 9) dt
= −(6/13) ln(| t − 4 |) + (6/13) ln(| t + 9 |) + C
= (6/13) ln( | t + 9 | / | t − 4 | ) + C
= ln( ( | t + 9 | / | t − 4 | )^{6/13} ) + C.
Finally, consider ∫ −6/((t − 2)(2t + 8)) dt. Write
−6/((t − 2)(2t + 8)) = A/(t − 2) + B/(2t + 8)
so that
−6 = A (2t + 8) + B (t − 2).
Evaluating at t = 2 gives −6 = 12 A and evaluating at t = −4 gives −6 = −6 B. Thus, we know A = −1/2 and B = 1. We can now evaluate the indefinite integral:
∫ −6/((t − 2)(2t + 8)) dt = ∫ ( (−1/2)/(t − 2) + 1/(2t + 8) ) dt
= ∫ (−1/2)/(t − 2) dt + ∫ 1/(2t + 8) dt
= −(1/2) ∫ 1/(t − 2) dt + ∫ 1/(2t + 8) dt
= −(1/2) ln(| t − 2 |) + (1/2) ln(| 2t + 8 |) + C
= (1/2) ln( | 2t + 8 | / | t − 2 | ) + C.
Note that an evaluation of this integral on an interval [a, b] would not make sense if either of these two natural logarithm functions were undefined at some point in [a, b]. Here, both natural logarithm functions are nicely defined on [4, 7].
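As a quick Octave/MATLAB sanity check of this evaluation on [4, 7] (not part of the original development):
f = @(t) -6./((t-2).*(2*t+8));
F = @(t) 0.5*log(abs(2*t+8)./abs(t-2));   % the antiderivative found above
F(7) - F(4)                               % evaluation of the antiderivative
quad(f, 4, 7)                             % numerical quadrature; should agree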
5.2.2 Homework
Chapter 6
Complex Numbers
In the chapters to come, we will need to use the idea of a complex number. When we
use the quadratic equation to find the roots of a polynomial like f (t) = t 2 + t + 1,
we find
t = ( −1 ± √(1 − 4) ) / 2 = −1/2 ± (1/2) √(−3).
Since it is well known that there are no numbers in our world whose squares can be negative, it was easy to think that the term √(−3) represented some sort of imaginary quantity. But it seemed reasonable that the usual properties of the square root function should hold. Thus, we can write
√(−3) = √(−1) √3
and the term √(−1) had the amazing property that when you squared it, you got back −1! Thus, the square root of any negative number √(−c) for a positive c could be rewritten as √(−1) × √c. It became clear to the people studying the roots of polynomials such as our simple one above, that if the set of real numbers was augmented to include numbers of the form √(−1) × √c, there would be a nice way to represent any root of a polynomial. Since a number like √(−1) × 4 or 2 × √(−1) is also possible, it seemed like two copies of the real numbers were needed: one that was the usual real numbers and another copy which was any real number times this strange quantity √(−1). It became very convenient to label the set of all traditional real numbers as the x axis and the set of numbers prefixed by √(−1) as the y axis.
Since the prefix √(−1) was the only real difference between the usual real numbers and the new numbers with the prefix √(−1), this prefix √(−1) seemed the quintessential representative of this difference. Historically, since these new prefixed numbers were already thought of as imaginary, it was decided to start labeling √(−1) as the simple letter i where i is short for imaginary! Thus, a number of this
sort could be represented as a + b i where a and b are any ordinary real numbers.
In particular, the roots of our polynomial could be written as
t = −1/2 ± (√3/2) i.
The angle measured from the positive x axis to this vector is called the angle asso-
ciated with the complex number z and is commonly denoted by the symbol θ or
Arg(z). Hence, there are two equivalent ways we can represent a complex number
z. We can use coordinate information and write z = a + b i or we can use magnitude
and angle information. In this case, if you look at Fig. 6.1, you can clearly see that
a = r cos(θ) and b = r sin(θ). Thus,
z = | z | (cos(θ) + i sin(θ))
= r (cos(θ) + i sin(θ))
(Figure: A complex number a + b i has real part a and imaginary part b. Its complex conjugate is z̄ = a − b i. The coordinate (a, −b) is graphed in the usual Cartesian manner as an ordered pair in the complex plane. The magnitude of z̄ is √(a² + b²), which is shown on the graph as r. The angle associated with z̄ is drawn as an arc of angle −θ.)
We can interpret the number cos(θ) + i sin(θ) in a different way. Given a complex
number z = a + b i, we define the complex conjugate of z to be z̄ = a − b i. It is
easy to see that
z z̄ = | z |²
and
z⁻¹ = z̄ / | z |².
Now look at Fig. 6.1 again. In this figure, z is graphed in Quadrant 1 of the complex
plane. Now imagine that we replace z by z̄. Then the imaginary component changes
to −5 which is a reflection across the positive x axis. The magnitude of z̄ and z will
then be the same but Arg(z̄) is −θ. We see this illustrated in Fig. 6.2.
(Figure: A complex number z = −2 + 8 i has real part −2 and imaginary part 8. The coordinate (−2, 8) is graphed in the usual Cartesian manner as an ordered pair in the complex plane, with axes Re(z) and Im(z). The magnitude of z is √((−2)² + (8)²), which is shown on the graph as r. The angle associated with z is drawn as an arc of angle θ.)
6.1.2 Homework
Let z = r (cos(θ) + i sin(θ)) and let's think of θ as a variable now. We have seen the function f(θ) = r (cos(θ) + i sin(θ)) arises when we interpret a complex number in terms of the triangle formed by its angle and its magnitude. Let's find f′(θ). We have
f′(θ) = r ( −sin(θ) + i cos(θ) ) = i r ( cos(θ) + i sin(θ) ) = i f(θ).
So f′(θ) = i f(θ), or f′(θ)/f(θ) = i. Taking the antiderivative of both sides, this suggests ln( f(θ) ) = i θ + C; since f(0) = r, we get f(θ) = r e^{iθ}.
Thus, we can rephrase the polar coordinate form of the complex number z = a + b i
as z = r e i θ .
6.2 Complex Functions
For a complex exponential we write e^{(c + d i)t} = e^{ct} ( cos(dt) + i sin(dt) ); for example, e^{(−2 + 8 i)t} = e^{−2t} ( cos(8t) + i sin(8t) ).
Solution We have e^{(−1 + 2 i)t} = e^{−t} ( cos(2t) + i sin(2t) ), so | e^{(−1 + 2 i)t} | = e^{−t} | cos(2t) + i sin(2t) | = e^{−t}.
Solution We have e^{2 i t} = cos(2t) + i sin(2t), and hence | e^{2 i t} | = 1.
6.2.2 Homework
For each of the following complex numbers, find its magnitude, write it in the form
r eiθ using radians, and graph it in the complex plane showing angle in degrees.
Exercise 6.2.1 z = −3 + 6 i.
Exercise 6.2.2 z = −3 − 6 i.
Exercise 6.2.3 z = 3 − 6 i.
Exercise 6.2.4 z = 2 + 8 i.
Exercise 6.2.5 z = 5 + 1 i.
For each of the following complex functions, find its magnitude and write it in its
fully expanded form.
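A hint for checking your answers: Octave/MATLAB's abs and angle functions return the magnitude and angle of a complex number directly. For instance, using the number from Exercise 6.2.1:
z = -3 + 6i;
r = abs(z)                 % magnitude of z
theta = angle(z)           % angle in radians
theta_deg = theta*180/pi   % the same angle in degrees for the graph
% z should then equal r*exp(i*theta) up to roundoff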
We now turn our attention to Linear Second Order Differential Equations. The way
we solve these is built from our understanding of exponential growth and the models
built out of that idea. The idea of a half life is also important though it is not used
as much in these second order models. A great example also comes from Protein
Modeling and its version of half life called Response Time.
These have the general form
a u″(t) + b u′(t) + c u(t) = 0,
where we assume a is not zero. Here are some examples just so you can get the feel of these models.
We have already seen that a first order problem like u′(t) = r u(t) with u(0) = u₀ has the solution u(t) = u₀ e^{rt}. It turns out that the solutions to Eqs. 7.1 and 7.2 will also have this kind of form. To see how this happens, let's look at a new concept called an operator. For us, an operator is something that takes as input a function and then transforms the function into another one. A great example is the indefinite integral operator we might call I which takes a nice continuous function f and outputs the new function ∫ f. Another good one is the differentiation operator D which takes a differentiable function f and creates the new function f′. Hence, if we let D denote differentiation with respect to the independent variable and c be the multiply by a constant operator defined by c( f ) = c f, we could rewrite u′(t) = r u(t) as
Du = r u (7.3)
where we suppress the (t) notation for simplicity of exposition. In fact, we could
rewrite the model again as
(D − r) u = 0 (7.4)
where we let D − r act on u to create u′ − r u. We can apply this idea to Eq. 7.1. Next, let D² be the second derivative operator: this means D² u = u″. Then we can rewrite Eq. 7.1 as
a D² u + b D u + c u = 0 (7.5)
where we again suppress the time variable t. For example, if we had the problem u″ + 5 u′ + 6 u = 0, it could be rewritten as
D² u + 5 D u + 6 u = 0.
A little more practice is good at this point. These models convert to the operator form indicated. We ignore the initial conditions for the moment. In fact, we should begin to think of the models and their operator forms as interchangeable. The model most useful to us now is the first order linear model. We now can write
u′ − r u = 0 ⟺ (D − r)(u) = 0
and
y″ + 5 y′ + 6 y = 0 ⟺ (D² + 5 D + 6)(y) = 0.
Let's try factoring: consider a function f and do the computations with the factors in both orders.
(D + 2)(D + 3)( f ) = (D + 2)( f′ + 3 f )
= D( f′ + 3 f ) + 2 ( f′ + 3 f )
= f″ + 3 f′ + 2 f′ + 6 f = f″ + 5 f′ + 6 f
and
(D + 3)(D + 2)( f ) = (D + 3)( f′ + 2 f )
= D( f′ + 2 f ) + 3 ( f′ + 2 f )
= f″ + 2 f′ + 3 f′ + 6 f = f″ + 5 f′ + 6 f.
We see that (D + 2)(D + 3) and (D + 3)(D + 2) applied to f give the same result, namely ( D² + 5 D + 6 )( f ).
Now we can figure out how to find the most general solution to this model.
• The general solution to ( D + 3)(y) = 0 is y(t) = Ae−3t .
• The general solution to ( D + 2)(y) = 0 is y(t) = Be−2t .
• Let our most general solution be y(t) = Ae−3t + Be−2t .
We know from our study of first order equations that a problem of the form
( D + r)u = 0 has a solution of the form er t . This suggests that there are two
possible solutions to the problem above. One satisfies ( D + 3)u = 0 and the other
( D + 2)u = 0. Hence, it seems that any combination of the functions e−3t and
e−2t should work. Thus, a general solution to our problem would have the form
A e−3t + B e−2t for arbitrary constants A and B. With this intuition established, let’s
try to solve this more formally.
For the problem Eq. 7.1, let's assume that e^{rt} is a solution and try to find what values of r might work. For u(t) = e^{rt} we have u′ = r e^{rt} and u″ = r² e^{rt}, so plugging in and cancelling the common nonzero factor e^{rt}, we find
0 = a r² + b r + c.
The roots of the quadratic equation above are the only values of r that will work
as the solution er t . We call this quadratic equation the Characteristic Equation of a
linear second order differential equation Eq. 7.1. To find these values of r , we can
either factor the quadratic or use the quadratic formula. If you remember from your
earlier algebra course, there are three types of roots:
(i): the roots are both real and distinct. We let r1 be the smallest root and r2 the
bigger one.
(ii): the roots are both the same. We let r1 = r2 = r in this case.
(iii): the roots are a complex pair of the form a ± b i.
(D² + 7 D + 10)(x) = 0.
Let’s derive the characteristic equation. We assume the model has a solution of the
form er t for some value of r . Then, plugging u(t) = er t into the model, we find
(r² + 7 r + 10)( e^{rt} ) = 0.
Since er t is never 0 no matter what r ’s value is, we see this implies we need values
of r that satisfy
r 2 + 7r + 10 = 0.
This factors as
(r + 2)(r + 5) = 0.
Thus, the roots of the model’s characteristic equation are r1 = −5 and r2 = −2.
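If you want to confirm such roots numerically, Octave/MATLAB's roots command takes the coefficient vector of the characteristic polynomial; for the quadratic above:
roots([1 7 10])   % returns -5 and -2, the roots of r^2 + 7r + 10 = 0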
7.1 Homework
Exercise 7.1.1
Exercise 7.1.2
Exercise 7.1.3
Exercise 7.1.4
Exercise 7.1.5
0 = a r² + b r + c,
factors as
0 = (r − r₁)(r − r₂).
We are told that initially u(0) = u₀ and u′(0) = u₁. Taking the derivative of the general solution u(t) = A e^{r₁ t} + B e^{r₂ t}, we find
u′(t) = r₁ A e^{r₁ t} + r₂ B e^{r₂ t}.
Hence, to satisfy the initial conditions, we must find A and B to satisfy the system of two equations in two unknowns:
A + B = u₀,
r₁ A + r₂ B = u₁.
Solution (i): To derive the characteristic equation, we assume the solution has the form e^{rt} and plug that into the problem. We find
( r² − 3 r − 10 ) e^{rt} = 0.
(ii): Since e^{rt} is never zero, we need
r² − 3 r − 10 = 0,
which factors as
(r + 2)(r − 5) = 0.
(iii): The roots are r₁ = −2 and r₂ = 5, so the general solution is u(t) = A e^{−2t} + B e^{5t}.
(iv): Next, we find the values of A and B which will let the solution satisfy the initial conditions. We have
u(0) = A e⁰ + B e⁰ = A + B = −10
u′(0) = −2 A e⁰ + 5 B e⁰ = −2 A + 5 B = 10.
This gives the system of two equations in the two unknowns A and B
A + B = −10
−2 A + 5 B = 10.
Multiplying the first equation by 2 and adding, we get 7 B = −10, so B = −10/7. It then follows that A = −60/7. Thus, the solution to this initial value problem is
u(t) = −(60/7) e^{−2t} − (10/7) e^{5t}.
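As a sketch of how you might double check such a solution numerically (this assumes the IVP was u″ − 3u′ − 10u = 0 with u(0) = −10 and u′(0) = 10, values consistent with the A and B found above since the problem statement itself is not repeated here):
f = @(t,w) [w(2); 3*w(2) + 10*w(1)];           % w = [u; u'], so u'' = 3u' + 10u
[t, w] = ode45(f, [0 1], [-10; 10]);
uexact = -(60/7)*exp(-2*t) - (10/7)*exp(5*t);  % the closed form found above
max(abs(w(:,1) - uexact))                      % should be small compared to max(abs(uexact))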
7.2.1 Homework
0 = a r² + b r + c,
factors as
0 = (r − r₁)².
We have one solution u₁(t) = e^{r₁ t}, but we don't know if there are others. Let's assume that another solution is a product f(t) e^{r₁ t}. We know we want
0 = (D − r₁)(D − r₁)( f(t) e^{r₁ t} )
= (D − r₁)( ( f′(t) + r₁ f(t) − r₁ f(t) ) e^{r₁ t} )
= (D − r₁)( f′(t) e^{r₁ t} )
= ( f″(t) + r₁ f′(t) − r₁ f′(t) ) e^{r₁ t}
= f″(t) e^{r₁ t}.
Since e^{r₁ t} is never zero, we must have f″ = 0. This tells us that f(t) = α t + β for any α and β. Thus, a second solution has the form
v(t) = (α t + β) e^{r₁ t} = α t e^{r₁ t} + β e^{r₁ t}.
The only new function in this second solution is u₂(t) = t e^{r₁ t}. Hence, our general solution in the case of repeated roots will be
u(t) = A e^{r₁ t} + B t e^{r₁ t}.
We are told that initially u(0) = u₀ and u′(0) = u₁. Taking the derivative of u, we find
u′(t) = B e^{r₁ t} + r₁ ( A + B t ) e^{r₁ t}.
Hence, to satisfy the initial conditions, we must find A and B to satisfy the two equations in two unknowns below:
u(0) = ( A + B (0) ) e^{r₁ (0)} = A = u₀
u′(0) = B e^{r₁ (0)} + r₁ ( A + B (0) ) e^{r₁ (0)} = B + r₁ A = u₁.
Thus to find the appropriate A and B, we must solve the system:
A = u₀,
B + r₁ A = u₁.
Example 7.3.1 Now let's look at a problem with repeated roots. We want to solve u″ + 16 u′ + 64 u = 0 with u(0) = 1 and u′(0) = 8.
Solution (i): To find the characteristic equation, we assume the solution has the form e^{rt} and plug that into the problem. We find
( r² + 16 r + 64 ) e^{rt} = 0.
(ii): Since e^{rt} is never zero, we need
r² + 16 r + 64 = 0,
which factors as
(r + 8)(r + 8) = 0.
Hence, the roots of this characteristic equation are repeated: r₁ = −8 and r₂ = −8.
(iii): The general solution is thus
u(t) = A e^{−8t} + B t e^{−8t}.
(iv): Next, we find the values of A and B which will let the solution satisfy the initial conditions. We have
u(0) = A e⁰ + B (0) e⁰ = A = 1
u′(0) = ( −8 A e^{−8t} + B e^{−8t} − 8 B t e^{−8t} ) |_{t=0} = −8 A + B = 8.
This gives the system of two equations in the two unknowns A and B
A = 1
−8 A + B = 8.
This tells us that B = 16. Thus, the solution to this initial value problem is
u(t) = e^{−8t} + 16 t e^{−8t}.
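A quick Octave/MATLAB plot of this solution shows the fast decay typical of a repeated negative root:
t = linspace(0, 1.5, 200);
u = exp(-8*t) + 16*t.*exp(-8*t);   % the solution found above
plot(t, u); xlabel('t'); ylabel('u(t)');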
7.3.1 Homework
Exercise 7.3.1
Exercise 7.3.2
Exercise 7.3.3
Exercise 7.3.4
Exercise 7.3.5
0 = a r² + b r + c,
factors as
0 = a ( r − (c + d i) ) ( r − (c − d i) )
because the roots are complex. Now we suspect the solutions are e^{(c + d i)t} and e^{(c − d i)t}.
We have already seen how to interpret the complex functions e(c+di)t and e(c−di)t
(see Chap. 6, Definition 6.2.2). Let’s try to find out what the derivative of such a
function might be. First, it seems reasonable that if f (t) has a derivative f (t) at t,
then multiplying by i to get i f(t) only changes the derivative to i f′(t). In fact, the derivative of (c + i d) f(t) should be (c + i d) f′(t). Thus,
( e^{(c + i d)t} )′ = ( e^{ct} ( cos(dt) + i sin(dt) ) )′
= e^{ct} cos(dt) (c + i d) + i e^{ct} sin(dt) (c + i d)
= (c + i d) e^{ct} ( cos(dt) + i sin(dt) )
= (c + i d) e^{(c + i d)t}.
We conclude that we can now take the derivative of e^{(c ± i d)t} to get (c ± i d) e^{(c ± i d)t}. We can now test to see if A e^{(c + i d)t} + B e^{(c − i d)t} solves our problem. We see
( e^{(c ± i d)t} )″ = (c ± i d)² e^{(c ± i d)t}.
Thus,
a ( A e^{(c+id)t} + B e^{(c−id)t} )″ + b ( A e^{(c+id)t} + B e^{(c−id)t} )′ + c ( A e^{(c+id)t} + B e^{(c−id)t} )
= A e^{(c+id)t} ( a (c+id)² + b (c+id) + c ) + B e^{(c−id)t} ( a (c−id)² + b (c−id) + c ).
But c + i d and c − i d are the roots of the characteristic equation, so
a (c+id)² + b (c+id) + c = 0
a (c−id)² + b (c−id) + c = 0.
Thus,
a ( A e^{(c+id)t} + B e^{(c−id)t} )″ + b ( A e^{(c+id)t} + B e^{(c−id)t} )′ + c ( A e^{(c+id)t} + B e^{(c−id)t} )
= A e^{(c+id)t} (0) + B e^{(c−id)t} (0) = 0.
In fact, you can see that all of the calculations above would work even if the constants
A and B were complex numbers. So we have shown that any combination of the two
complex solutions e(c+id)t and e(c−id)t is a solution of our problem. Of course, a
solution that actually has complex numbers in it doesn’t seem that useful for our
world. After all, we can’t even graph it! So we have to find a way to construct
solutions which are always real valued and use them as our solution. Note
u₁(t) = (1/2) ( e^{(c+id)t} + e^{(c−id)t} )
= (1/2) e^{ct} ( cos(dt) + i sin(dt) + cos(dt) − i sin(dt) )
= (1/2) ( 2 cos(dt) ) e^{ct}
= e^{ct} cos(dt)
and
u₂(t) = (1/(2i)) ( e^{(c+id)t} − e^{(c−id)t} )
= (1/(2i)) e^{ct} ( cos(dt) + i sin(dt) − cos(dt) + i sin(dt) )
= (1/(2i)) ( 2 i sin(dt) ) e^{ct}
= e^{ct} sin(dt)
are both real valued solutions! So we will use as the general solution to this problem
u(t) = A e^{ct} cos(dt) + B e^{ct} sin(dt),
where the constants A and B are now restricted to be real numbers. To solve the initial value problem, note that u(0) = A and u′(0) = c A + d B. Hence, to solve the initial value problem, we find A and B by solving the two equations in two unknowns below:
A = u₀
c A + d B = u₁.
Solution (i): To find the characteristic equation, we assume the solution has the form e^{rt} and plug that into the problem. We find
( r² + 8 r + 25 ) e^{rt} = 0.
(ii): Since e^{rt} is never zero, we need
r² + 8 r + 25 = 0.
(iii): By the quadratic formula, the roots are r = ( −8 ± √(64 − 100) )/2 = −4 ± 3 i.
(iv): The real solutions we want are then e^{−4t} cos(3t) and e^{−4t} sin(3t). The general real solution is thus
u(t) = A e^{−4t} cos(3t) + B e^{−4t} sin(3t)
for arbitrary real numbers A and B. (v): Next, we find the values of A and B which will let the solution satisfy the initial conditions. We have
u(0) = ( A e^{−4t} cos(3t) + B e^{−4t} sin(3t) ) |_{t=0} = A = 3,
u′(0) = ( −4 A e^{−4t} cos(3t) − 3 A e^{−4t} sin(3t) − 4 B e^{−4t} sin(3t) + 3 B e^{−4t} cos(3t) ) |_{t=0} = −4 A + 3 B = 4.
This gives the system of two equations in the two unknowns A and B
A = 3
−4 A + 3 B = 4.
It then follows that B = 16/3. Thus, the solution to this initial value problem is
u(t) = 3 e^{−4t} cos(3t) + (16/3) e^{−4t} sin(3t).
(iv): Next, we find the values of A and B which will let the solution satisfy the initial conditions. We have u(0) = 3 and u′(0) = 4 A + 2 B = 4. This gives the system of two equations in the two unknowns A and B
A = 3
4 A + 2 B = 4 ⇒ 2 B = 4 − 4 A = −8.
It then follows that B = −4. Thus, the solution to this initial value problem is u(t) = 3 e^{4t} cos(2t) − 4 e^{4t} sin(2t).
Of course, we have to judge the time interval to choose for the linspace command. We see the plot in Fig. 7.4. Note as t gets large, u(t) oscillates out of control.
7.4.1 Homework
Exercise 7.4.1
Exercise 7.4.2
Exercise 7.4.3
Exercise 7.4.4
We can also write these solutions in another form. Our solutions here look like u(t) = A e^{ct} cos(dt) + B e^{ct} sin(dt). Let R = √(A² + B²). Rewrite the solution as
u(t) = R e^{ct} ( (A/R) cos(dt) + (B/R) sin(dt) ).
Define the angle δ by tan(δ) = B/A. Then the angle's value will depend on the signs of A and B, just like when we find angles for complex numbers and vectors. So cos(δ) = A/R and sin(δ) = B/R. Now, there is a trigonometric identity cos(E − F) = cos(E) cos(F) + sin(E) sin(F). Here we have
u(t) = R e^{ct} ( cos(δ) cos(dt) + sin(δ) sin(dt) ) = R e^{ct} cos(dt − δ).
The angle δ is called the phase shift. When written in this form, the solution is said to be in phase shifted cosine form.
Example 7.4.3 Consider the solution u(t) = 3 e^{−4t} cos(3t) + (16/3) e^{−4t} sin(3t), which gives u(0) = 3 and u′(0) = 4. Find the phase shifted cosine solution.
Solution Let R = √( (3)² + (16/3)² ) = √337 / 3. A and B are positive so they are in Quadrant 1. So δ = tan⁻¹(16/9). We have u(t) = (√337 / 3) e^{−4t} cos(3t − δ).
To draw this solution by hand, do the following:
• On your graph, draw the curve (√337/3) e^{−4t}. This is the top curve that bounds our solution, called the top envelope.
• On your graph, draw the curve −(√337/3) e^{−4t}. This is the bottom curve that bounds our solution, called the bottom envelope.
• Draw the point (0, 3) and from it draw an arrow pointing up as the initial slope is
positive.
• The solution starts at t = 0 and points up. It hits the top curve when the cos term
hits its maximum of 1. It then flips and moves towards its minimum value of −1
where it hits the bottom curve.
• Keep drawing the curve as it hits top and bottom in a cycle. This graph is expo-
nential decay that is oscillating towards zero.
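A short Octave/MATLAB sketch of this picture (the solution from Example 7.4.3 together with its envelope curves) is below; the value of R is computed rather than typed in:
A = 3; B = 16/3; R = sqrt(A^2 + B^2);               % amplitude of the phase shifted form
t = linspace(0, 2, 300);
u = A*exp(-4*t).*cos(3*t) + B*exp(-4*t).*sin(3*t);  % the solution
plot(t, u, t, R*exp(-4*t), 'r--', t, -R*exp(-4*t), 'r--');   % solution plus envelopes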
Example 7.4.4 Consider the solution u(t) = 3 e^{2t} cos(4t) − 5 e^{2t} sin(4t). This gives u(0) = 3 and u′(0) = 6 − 20 = −14. Convert to phase shifted form.
Solution Let R = √( (3)² + (−5)² ) = √34. Since A = 3 and B = −5, this is Quadrant 4. So we use δ = 2π − tan⁻¹(5/3). We have u(t) = √34 e^{2t} cos(4t − δ).
To draw the solution by hand, do this:
• On your graph, draw the curve √34 e^{2t}. This is the top curve that bounds our solution.
• On your graph, draw the curve −√34 e^{2t}. This is the bottom curve that bounds our solution.
• Draw the point (0, 3) and from it draw an arrow pointing down as the initial slope
is negative.
• The solution starts at t = 0 and points down. It hits the bottom curve when the cos
term hits its minimum of −1. It then flips and moves towards its maximum value
of 1 where it hits the top curve.
• Keep drawing the curve as it hits bottom and top in a cycle. This graph is expo-
nential growth that is oscillating out of control.
Example 7.4.5 Consider the solution u(t) = −8 e−2t cos(2t) − 6 e−2t sin(2t). Here
u(0) = −8 and u (0) = 4. Find the phase shifted form.
7.4.2.1 Homework
Exercise 7.4.5
Exercise 7.4.6
Exercise 7.4.7
We are now ready to solve what are called Linear Systems of differential equations.
These have the form
x′(t) = a x(t) + b y(t) (8.1)
y′(t) = c x(t) + d y(t) (8.2)
x(0) = x₀ (8.3)
y(0) = y₀ (8.4)
for any numbers a, b, c and d and initial conditions x0 and y0 . The full problem is
called, as usual, an Initial Value Problem or IVP for short. The two initial conditions
are just called the IC’s for the problem to save writing. For example, we might be
interested in the system
Here the IC’s are x(0) = 5 and y(0) = −3. Another sample problem might be the
one below.
For linear first order problems like u′ = 3u and so forth, we have found the solution has the form u(t) = α e^{3t} for some number α. We would then determine the value of α to use by looking at the initial condition. To see what to do with Eqs. 8.1 and 8.2, first let's rewrite the problem in terms of matrices and vectors. In this form, Eqs. 8.1 and 8.2 can be written as
[x′(t); y′(t)] = [a b; c d] [x(t); y(t)].
The initial conditions Eqs. 8.3 and 8.4 can then be redone in vector form as
[x(0); y(0)] = [x₀; y₀].
Here are some examples of the conversion of a system of two linear differential
equations into matrix–vector form.
Example 8.1.1 Convert
x′(t) = 6 x(t) + 9 y(t)
y′(t) = −10 x(t) + 15 y(t)
x(0) = 8
y(0) = 9
into matrix–vector form. The result is
[x′(t); y′(t)] = [6 9; −10 15] [x(t); y(t)], [x(0); y(0)] = [8; 9].
Now that we know how to do this conversion, it seems reasonable to believe that since a constant times e^{rt} solves a first order linear problem like u′ = r u, perhaps a vector times e^{rt} will work here. Let's make this formal. We'll work with a specific system first because numbers are always easier to make sense of in the initial exposure to a technique. So let's look at the problem below:
x′(t) = 3 x(t) + 2 y(t)
y′(t) = −4 x(t) + 5 y(t)
x(0) = 2
y(0) = −3.
Let’s assume the solution has the form V er t because by our remarks above since
this is a vector system it seems reasonable to move to using a vector rather than a
constant. Let's denote the components of V as follows:
V = [V₁; V₂].
Let's plug in our possible solution into the original problem. That is, we assume the solution is
[x(t); y(t)] = V e^{rt}.
Hence,
[x′(t); y′(t)] = r V e^{rt}.
When we plug these terms into the matrix–vector form of the problem, we find
r V e^{rt} = [3 2; −4 5] V e^{rt}.
Since one of these terms is a matrix and one is a vector, we need to write all the terms
in terms of matrices if possible. Recall, the two by two identity matrix is
I = [1 0; 0 1].
Even though we don’t know yet what values of r will work for this problem, we do
know that the term er t is never zero no matter what value r has. Hence, we can say
that we are looking for a value of r and a vector V so that
( r [1 0; 0 1] − [3 2; −4 5] ) V = [0; 0].
For convenience, let the matrix of coefficients determined by our system of differential equations be denoted by A, i.e.
A = [3 2; −4 5].
Finally, noting the vector V is common, we factor again to get our last equation
( r I − A ) V = [0; 0].
We can then plug in the value of I and A to get the system of equations that r and V must satisfy in order for V e^{rt} to be a solution:
[r − 3  −2; 4  r − 5] [V₁; V₂] = [0; 0].
To finish this discussion, note that for any value of r , this is a system of two linear
equations in the two unknowns V1 and V2 . If we choose a value of r for which
det (r I − A) was non zero, the theory we have so carefully gone over in Sect. 2.4 tells
us the two lines determined by row 1 and row 2 of this system have different slopes.
This means this system of equations has only one solution. Since both equations
cross through the origin, this unique solution must be V1 = 0 and V2 = 0. But, of
course, this tells us the solution is x(t) = 0 and y(t) = 0! We will not be able to
solve for the initial conditions x(0) = 2 and y(0) = −3 with this solution. So we
must reject any choice of r for which det (r I − A) is nonzero.
This leaves only one choice: the values of r where det (r I − A) = 0. Now, go
back to Sect. 2.10 where we discussed the eigenvalues and eigenvectors of a matrix
A. The values of r where det (r I − A) = 0 are what we called the eigenvalues of
our matrix A and for these values of r , we must find non zero vectors V (non zero
because otherwise, we can’t solve the IC’s!) so that
[r − 3  −2; 4  r − 5] [V₁; V₂] = [0; 0].
Then, for each eigenvalue we find, we should have a solution of the form
[x(t); y(t)] = [V₁; V₂] e^{rt}.
In general, for a system of two linear models like this, there are three choices for the
eigenvalues.
• Two real and distinct eigenvalues r₁ and r₂ with the eigenvectors E₁ and E₂. This has been discussed thoroughly. We can now say more about this type of solution. The two eigenvectors E₁ and E₂ are linearly independent vectors in ℝ² and the two solutions e^{r₁ t} and e^{r₂ t} are linearly independent functions. Hence, the set of all possible solutions, which is described by the general solution
[x(t); y(t)] = a E₁ e^{r₁ t} + b E₂ e^{r₂ t},
represents the span of these two linearly independent functions. In fact, these
two linearly independent solutions to the model are the basis vectors of the two
dimensional vector space that consists of the solutions to this model. Note, we are
not saying anything new here, but we are saying it with new terminology and a
higher level of abstraction. Thus, the general solution will be
[x(t); y(t)] = a E₁ e^{r₁ t} + b E₂ e^{r₂ t},
where E 1 is the eigenvector for eigenvalue r1 and E 2 is the eigenvector for eigen-
value r2 and a and b are arbitrary real numbers chosen to satisfy the IC’s.
• The eigenvalues are repeated so that r1 = r2 = α for some real number. We are
not yet sure what to do in this case. There are two possibilities:
1. The eigenvalue of value α, when plugged into the eigenvalue–eigenvector equation
[α − a  −b; −c  α − d] [V₁; V₂] = [0; 0],
turns out to behave as usual: the two rows of this matrix are multiples of one another. Hence, we use either the top or bottom row to find our choice of nonzero eigenvector E₁. For example, if
A = [3 1; −1 1],
the characteristic equation is (r − 2)² = 0 and, at the repeated eigenvalue r = 2, the top row and the bottom row of 2 I − A are multiples; we find E₁ = [1, −1]ᵀ. Note in this case, the set of all V₁ and V₂ we can use are all multiples of E₁. Hence, this set of numbers forms a line through the origin in ℝ². Another way of saying this is that the set of all possible V₁ and V₂ here is a one dimensional subspace of ℝ². We know one solution to our model is E₁ e^{2t} but what is the other one?
2. The other possibility is that A is a multiple of the identity, say A = 2 I. Then, the characteristic equation is the same as the first case: (r − 2)² = 0. However, the eigenvalue–eigenvector equation is very different. We find
[2 − 2  0; 0  2 − 2] [V₁; V₂] = [0; 0],
which is a very strange system as both the top and bottom equation give 0 V₁ + 0 V₂ = 0. This says there are no restrictions on the values of V₁ and V₂: they can be picked independently. So pick V₁ = 1 and V₂ = 0 to give one choice of eigenvector: E₁ = [1, 0]ᵀ. Then pick V₁ = 0 and V₂ = 1 to give a second choice of eigenvector: E₂ = [0, 1]ᵀ. Another way of looking at this is that the set of all possible V₁ and V₂ is just ℝ², and so we are free to pick any basis of ℝ² for our eigenvectors we want. Hence, we might as well pick the simplest one: E₁ = i and E₂ = j. We actually have two linearly independent solutions to our model in this case. They are E₁ e^{2t} and E₂ e^{2t}.
• In the last case, the eigenvalues are complex numbers. If we let the eigenvalue be r = α + β i, note the corresponding eigenvector could be a complex vector. So let's write it as V = E + i F where E and F have only real valued components. Then we know
[a b; c d] (E + i F) = r (E + i F).
But all the entries of the matrix are real, so complex conjugation does not change them. The other conjugations then give
[a b; c d] (E − i F) = (α − β i) (E − i F).
Hence the general complex solution has the form
[x(t); y(t)] = c₁ (E + i F) e^{(α + β i)t} + c₂ (E − i F) e^{(α − β i)t},
where c₁ and c₂ are complex numbers. Since we are interested in real solutions, from this general complex solution, we will extract two linearly independent real solutions which will form the basis for our two dimensional subspace of solutions. We will return to this case later.
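In Octave/MATLAB, the eig command carries out the eigenvalue–eigenvector computation for all three cases at once; a small check using the matrix from Example 8.1.4:
A = [-10 -7; 8 5];
[V, D] = eig(A)
% the diagonal of D holds the eigenvalues (-2 and -3 here) and the
% columns of V are corresponding eigenvectors (scaled to unit length)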
We are now ready for a definition of the Characteristic Equation of the linear system.
Definition 8.1.1 (The Characteristic Equation of a Linear System of ODEs)
For the system [x′(t); y′(t)] = A [x(t); y(t)], the characteristic equation is det (r I − A) = 0.
For example, consider the system x′(t) = 8 x(t) + 9 y(t), y′(t) = 3 x(t) − 2 y(t). We assume the solution has the form V e^{rt} and plug this into the system. This gives
r V e^{rt} − [8 9; 3 −2] V e^{rt} = [0; 0].
Then, since er t can never be zero no matter what value r is, we find the values of r
and the vectors V we seek satisfy
( r I − [8 9; 3 −2] ) V = [0; 0].
Now, if r is chosen so that det (r I − A) = 0, the only solution to this system of two
linear equations in the two unknowns V1 and V2 is V1 = 0 and V2 = 0. This leads
to the solution x(t) = 0 and y(t) = 0 always and this solution does not satisfy the
initial conditions. Hence, we must find values r which give det (r I − A) = 0. The
resulting polynomial is
det ( r I − [8 9; 3 −2] ) = det [r − 8  −9; −3  r + 2] = (r − 8)(r + 2) − 27 = r² − 6r − 43.
The next one is very similar. We will expect you to be able to do this kind of
derivation also.
Example 8.1.4 Derive the characteristic equation for the system below:
x′(t) = −10 x(t) − 7 y(t)
y′(t) = 8 x(t) + 5 y(t).
Assume the solution has the form V e^{rt} and plug this into the system giving
r V e^{rt} − [−10 −7; 8 5] V e^{rt} = [0; 0].
Then, since e^{rt} can never be zero no matter what value r is, we find the values of r and the vectors V we seek satisfy
( r I − [−10 −7; 8 5] ) V = [0; 0].
Again, if r is chosen so that det (r I − A) = 0, the only solution to this system of two
linear equations in the two unknowns V1 and V2 is V1 = 0 and V2 = 0. This gives
us the solution x(t) = 0 and y(t) = 0 always and this solution does not satisfy the
initial conditions. Hence, we must find values r which give det (r I − A) = 0. The
resulting polynomial is
det ( r I − [−10 −7; 8 5] ) = det [r + 10  7; −8  r − 5] = (r + 10)(r − 5) + 56 = r² + 5r + 6.
8.1.4 Homework
Exercise 8.1.1
x′ = 2 x + 3 y
y′ = 8 x − 2 y
x(0) = 3
y(0) = 5.
Exercise 8.1.2
x′ = −4 x + 6 y
y′ = 9 x + 2 y
x(0) = 4
y(0) = −6.
Next, let’s do the simple case of two distinct real eigenvalues. Before you look at these
calculations, you should review how we found eigenvectors in Sect. 2.10. We worked
out several examples there, however, now let’s put them in this system context. Each
eigenvalue r has a corresponding eigenvector E. Since in this course, we want to
concentrate on the situation where the two roots of the characteristic equation are
distinct real numbers, we will want to find the eigenvector, E 1 , corresponding to
eigenvalue r1 and the eigenvector, E 2 , corresponding to eigenvalue r2 . The general
solution will then be of the form
[x(t); y(t)] = a E₁ e^{r₁ t} + b E₂ e^{r₂ t},
where we will use the IC's to choose the correct values of a and b. Let's do a complete example now. We start with the system
x′(t) = −3 x(t) + 4 y(t)
y′(t) = −x(t) + 2 y(t)
x(0) = 2
y(0) = −4.
Here A = [−3 4; −1 2], so det (r I − A) = (r + 3)(r − 2) + 4 = r² + r − 2 = (r + 2)(r − 1). Thus, the eigenvalues of the coefficient matrix A are r₁ = −2 and r₂ = 1. The general solution will then be of the form
[x(t); y(t)] = a E₁ e^{−2t} + b E₂ eᵗ.
For r₁ = −2, this gives
(−2) I − A = [1 −4; 1 −4].
The two rows of this matrix should be multiples of one another. If not, we made a mistake and we have to go back and find it. Our rows are indeed multiples, so pick one row to solve for the eigenvector. We need to solve
[1 −4; 1 −4] [v₁; v₂] = [0; 0]
or
v₁ − 4 v₂ = 0, i.e. v₂ = (1/4) v₁.
Letting v₁ = a, we find the solutions have the form
[v₁; v₂] = a [1; 1/4].
The vector
E₁ = [1; 1/4]
is our choice of eigenvector for r₁ = −2.
For r₂ = 1, this gives
(1) I − A = [4 −4; 1 −1].
Again, the two rows of this matrix should be multiples of one another. If not, we made a mistake and we have to go back and find it. Our rows are indeed multiples, so pick one row to solve for the eigenvector. We need to solve
[4 −4; 1 −1] [v₁; v₂] = [0; 0]
or
v₁ − v₂ = 0, i.e. v₂ = v₁.
The vector
E₂ = [1; 1]
is our choice of eigenvector for r₂ = 1.
Finally, we solve the IVP. Given the IC's, we find two equations in two unknowns for a and b:
[x(0); y(0)] = [2; −4] = a [1; 1/4] e⁰ + b [1; 1] e⁰ = [a + b; (1/4) a + b].
Thus
a + b = 2
(1/4) a + b = −4.
This easily solves to give a = 8 and b = −6. Hence, the solution to this IVP is
[x(t); y(t)] = a [x₁(t); y₁(t)] + b [x₂(t); y₂(t)] = 8 [1; 1/4] e^{−2t} − 6 [1; 1] eᵗ.
Note when t is very large, the only terms that matter are the ones which grow fastest. Hence, we could say
x(t) ≈ −6 eᵗ
y(t) ≈ −6 eᵗ,
or in vector form
[x(t); y(t)] ≈ −6 [1; 1] eᵗ.
This is just a multiple of the eigenvector E 2 ! Note the graph of x(t) and y(t) on an
x–y plane will get closer and closer to the straight line determined by this eigenvector.
So we will call E 2 the dominant eigenvector direction for this system.
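Here is a hedged Octave/MATLAB sketch that reproduces this example numerically; note that eig scales its eigenvectors to unit length, so the coefficients it produces differ from a and b above by those scalings, even though the trajectory is the same:
A  = [-3 4; -1 2];
[V, D] = eig(A);                         % eigenvalues -2 and 1
c  = V \ [2; -4];                        % expansion of the IC in the eigenvector basis
t  = linspace(0, 2, 100);
Z  = V * (diag(c) * [exp(D(1,1)*t); exp(D(2,2)*t)]);   % [x(t); y(t)] at each time
plot(Z(1,:), Z(2,:));                    % the trajectory in the x-y plane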
You need some additional practice. Let's work out a few more. Here is the first one:
x′(t) = −20 x(t) + 12 y(t)
y′(t) = −13 x(t) + 5 y(t)
x(0) = −1
y(0) = 2.
The characteristic equation is
0 = det [r + 20  −12; 13  r − 5]
= (r + 20)(r − 5) + 156
= r² + 15r + 56
= (r + 8)(r + 7),
so the eigenvalues are r₁ = −8 and r₂ = −7.
For r₁ = −8, this gives
(−8) I − A = [12 −12; 13 −13].
Again, the two rows of this matrix should be multiples of one another. If not, we made a mistake and we have to go back and find it. Our rows are indeed multiples, so pick one row to solve for the eigenvector. We need to solve
[12 −12; 13 −13] [v₁; v₂] = [0; 0]
or
12 v₁ − 12 v₂ = 0, i.e. v₂ = v₁.
The vector
E₁ = [1; 1]
is our choice of eigenvector for r₁ = −8.
For r₂ = −7, this gives
(−7) I − A = [13 −12; 13 −12].
Picking the top row, we need
13 v₁ − 12 v₂ = 0, i.e. v₂ = (13/12) v₁.
The vector
E₂ = [1; 13/12]
is our choice of eigenvector for r₂ = −7.
We solve the IVP by finding the a and b that will give the desired initial conditions.
This gives
[−1; 2] = a [1; 1] + b [1; 13/12]
or
−1 = a + b
2 = a + (13/12) b.
This is easily solved using elimination to give a = −37 and b = 36. The solution
to the IVP is therefore
[x(t); y(t)] = −37 [1; 1] e^{−8t} + 36 [1; 13/12] e^{−7t}
= [−37 e^{−8t} + 36 e^{−7t}; −37 e^{−8t} + 36 (13/12) e^{−7t}].
Note when t is very large, the only terms that matter are the ones which grow fastest
or, in this case, the ones which decay the slowest. Hence, we could say
[x(t); y(t)] ≈ 36 [1; 13/12] e^{−7t}.
This is just a multiple of the eigenvector E 2 ! Note the graph of x(t) and y(t) on an
x–y plane will get closer and closer to the straight line determined by this eigenvector.
So we will call E 2 the dominant eigenvector direction for this system.
We now have all the information needed to analyze the solutions to this system
graphically.
Here is another example in great detail. Again, remember you will have to know
how to do these steps yourselves.
Consider the system
x′(t) = 4 x(t) + 9 y(t)
y′(t) = −x(t) − 6 y(t)
x(0) = 4
y(0) = −2.
The characteristic equation is
0 = det [r − 4  −9; 1  r + 6]
= (r − 4)(r + 6) + 9
= r² + 2 r − 15
= (r + 5)(r − 3),
so the eigenvalues are r₁ = −5 and r₂ = 3.
For r₁ = −5, this gives
(−5) I − A = [−9 −9; 1 1].
We need to solve
[−9 −9; 1 1] [v₁; v₂] = [0; 0]
or
v₁ + v₂ = 0, i.e. v₂ = −v₁.
The vector
E₁ = [1; −1]
is our choice of eigenvector for r₁ = −5.
For r₂ = 3, this gives
(3) I − A = [−1 −9; 1 9].
Picking the bottom row, we need
v₁ + 9 v₂ = 0, i.e. v₂ = −(1/9) v₁.
Letting v₁ = b, we find the solutions have the form
[v₁; v₂] = b [1; −1/9].
The vector
E₂ = [1; −1/9]
is our choice of eigenvector for r₂ = 3.
We solve the IVP by finding the a and b that will give the desired initial conditions.
This gives
[4; −2] = a [1; −1] + b [1; −1/9]
or
4 = a + b
−2 = −a − (1/9) b.
This is easily solved using elimination to give a = 7/4 and b = 9/4. The solution to the IVP is therefore
[x(t); y(t)] = (7/4) [1; −1] e^{−5t} + (9/4) [1; −1/9] e^{3t}
= [(7/4) e^{−5t} + (9/4) e^{3t}; −(7/4) e^{−5t} − (1/4) e^{3t}].
8.2.2 Homework
Exercise 8.2.1
x′ = 3 x + y
y′ = 5 x − y
x(0) = 4
y(0) = −6.
Exercise 8.2.2
x′ = x + 4 y
y′ = 5 x + 2 y
x(0) = 4
y(0) = −5.
Exercise 8.2.3
x′ = −3 x + y
y′ = −4 x + 2 y
x(0) = 1
y(0) = 6.
Let's try to analyze these systems graphically. We are interested in what these solutions look like for many different initial conditions. So let's look at the problem
x′(t) = −3 x(t) + 4 y(t)
y′(t) = −x(t) + 2 y(t).
The set of (x, y) pairs where x′ = 0 is called the nullcline for x; similarly, the points where y′ = 0 form the nullcline for y. The x′ equation can be set equal to zero to get −3x + 4y = 0. This is the same as the straight line y = (3/4) x. This straight line divides the x–y plane into three pieces: the part where x′ > 0; the part where x′ = 0; and the part where x′ < 0. In Fig. 8.1, we show the part of the x–y plane where x′ > 0 with one shading and the part where it is negative with another. Similarly, the y′ equation can be set to 0 to give the equation of the line −x + 2y = 0. This gives the straight line y = (1/2) x. In Fig. 8.2, we show how this line also divides the x–y plane into three pieces.
The shaded areas shown in Figs. 8.1 and 8.2 can be combined into Fig. 8.3. In this figure, we divide the x–y plane into four regions marked with a I, II, III or IV. In each region, x′ and y′ are either positive or negative. Hence, each region can be marked with an ordered pair of signs, (x′ ±, y′ ±).
8.2.3.2 Homework
Exercise 8.2.4
x′ = 2 x + 3 y
y′ = 8 x − 2 y
x(0) = 3
y(0) = 5.
Exercise 8.2.5
x′ = −4 x + 6 y
y′ = 9 x + 2 y
x(0) = 4
y(0) = −6.
Now we add the eigenvector lines. In Sect. 8.2, we found that this system has eigen-
values r1 = −2 and r2 = 1 with associated eigenvectors
E₁ = [1; 1/4], E₂ = [1; 1].
A vector with components [a, b]ᵀ determines a straight line through the origin with slope b/a. Hence, these eigenvectors each determine a straight line. The E₁ line has slope 1/4 and the E₂ line has slope 1. We can graph these two lines overlaid on the graph shown in Fig. 8.3.
8.2.3.4 Homework
Fig. 8.4 Drawing the nullclines and the eigenvector lines on the same graph
Exercise 8.2.6
x = 2 x + 3 y
y = 8 x − 2 y
x(0) = 3
y(0) = 5.
Exercise 8.2.7
x = −4 x + 6 y
y = 9 x + 2 y
x(0) = 4
y(0) = −6.
In each of the four regions, we know the algebraic signs of the derivatives x′ and y′. If we are given an initial condition (x₀, y₀) which is in one of these regions, we can use this information to draw the set of points (x(t), y(t)) corresponding to the solution to our system
x′(t) = −3 x(t) + 4 y(t)
y′(t) = −x(t) + 2 y(t)
x(0) = x₀
y(0) = y₀.
This set of points is called the trajectory corresponding to this solution. The first
point on the trajectory is the initial point (x0 , y0 ) and the rest of the points follow
from the solution
[x(t); y(t)] = a [1; 1/4] e^{−2t} + b [1; 1] eᵗ,
where
x₀ = a + b
y₀ = (1/4) a + b
and
x(t) = a e^{−2t} + b eᵗ
y(t) = (1/4) a e^{−2t} + b eᵗ.
Hence,
dy/dx = y′(t) / x′(t) = ( (−2/4) a e^{−2t} + b eᵗ ) / ( −2 a e^{−2t} + b eᵗ ).
When t is large, as long as b is not zero, the terms involving e^{−2t} are negligible and so we have
dy/dx = y′(t) / x′(t) ≈ ( b eᵗ ) / ( b eᵗ ) = 1,
the slope of the E₂ line. Hence, when t is large, the slopes of the trajectory approach 1, the slope of E₂. So, we can conclude that for large t, as long as b is not zero, the trajectory either parallels the line determined by E₂ or approaches it asymptotically.
Of course, if an initial condition is chosen that lies on the line determined by E₁, then a little thought will tell you that b is zero in this case and we have
dy/dx = y′(t) / x′(t) = ( (−2/4) a e^{−2t} ) / ( −2 a e^{−2t} ) = 1/4,
the slope of the E₁ line. In this case,
x(t) = a e^{−2t}
y(t) = (1/4) a e^{−2t},
and so the coordinates (x(t), y(t)) go to (0, 0) along the line determined by E₁.
We conclude that unless an initial condition is chosen exactly on the line determined by E₁, all trajectories eventually begin to either parallel the line determined by E₂ or approach it asymptotically. If the initial condition is chosen on the line determined by E₁, then the trajectories stay on this line and approach the origin, where they stop as that is a place where both x′ and y′ become 0. In Fig. 8.5, we show three trajectories which begin in Region I. They all have a (+, +) sign pattern for x′ and y′, so the x and y components should both increase. We draw the trajectories with the concavity as shown because that is the only way they can smoothly approach the eigenvector line E₂. We show this in Fig. 8.5.
Is it possible for two trajectories to cross? Consider the trajectories shown in Fig. 8.6.
These two trajectories cross at some point. The two trajectories correspond to dif-
ferent initial conditions which means that the a and b associated with them will be
different. Further, these initial conditions don’t start on eigenvector E 1 or eigenvec-
tor E 2 , so the a and b values for both trajectories will be non zero. If we label these
trajectories by (x1 , y1 ) and (x2 , y2 ), we see
x₁(t) = a₁ e^{−2t} + b₁ eᵗ
y₁(t) = (1/4) a₁ e^{−2t} + b₁ eᵗ
and
x₂(t) = a₂ e^{−2t} + b₂ eᵗ
y₂(t) = (1/4) a₂ e^{−2t} + b₂ eᵗ.
(Figure: two trajectories drawn against the nullclines x′ = 0 and y′ = 0, the eigenvector lines E₁ and E₂, and the sign regions I (+, +) and II (−, +).)
Since we assume they cross, there has to be a time point, t*, so that (x₁(t*), y₁(t*)) and (x₂(t*), y₂(t*)) match. This means, using vector notation, that
[x₁(t); y₁(t)] = a₁ [1; 1/4] e^{−2t} + b₁ [1; 1] eᵗ,
[x₂(t); y₂(t)] = a₂ [1; 1/4] e^{−2t} + b₂ [1; 1] eᵗ
agree at t*. Setting these equal at t* forces (a₁ − a₂) [1; 1/4] e^{−2t*} = (b₂ − b₁) [1; 1] e^{t*}, which would make E₁ a multiple of E₂. This is clearly not possible, so we have to conclude that trajectories can't cross. We can do this sort of analysis for trajectories that start in any region, whether it is I, II, III or IV. Further, a similar argument shows that a trajectory can't cross an eigenvector line, as if it did, the argument above would again lead us to the conclusion that E₁ is a multiple of E₂, which it is not.
We can state the results here as formal rules for drawing trajectories. For the linear system
[x′(t); y′(t)] = A [x(t); y(t)],
assume the eigenvalues r₁ and r₂ are different with either both negative or one negative and one positive. Let E₁ and E₂ be the associated eigenvectors. Then, the trajectories of this system corresponding to different initial conditions cannot cross each other. In particular, trajectories cannot cross eigenvector lines.
In region II, trajectories start where x′ < 0 and y′ > 0. Hence, the x values must decrease and the y values increase in this region. We draw the trajectory in this way, making sure it curves in such a way that it has no corners or kinks, until it hits the nullcline x′ = 0. At that point, the trajectory moves into region I. Now x′ > 0 and y′ > 0, so the trajectory moves upward along the eigenvector E₂ line like we showed in the Region I trajectories. We show this in Fig. 8.7. Note although the trajectories seem to overlap near the E₂ line, they actually do not because trajectories cannot cross, as was explained in Sect. 8.2.3.6.
Next, we examine trajectories that begin in Region III. Here x′ and y′ are negative, so the x and y values will decrease and the trajectories will approach the dominant eigenvector E₂ line from the right side as is shown in Fig. 8.8. The initial condition that starts in Region III above the eigenvector E₂ line will move towards the y′ = 0 line following x′ < 0 and y′ < 0, until it hits the line x′ = 0 using x′ < 0 and y′ > 0. Then it moves upward towards the eigenvector E₂ line as shown. It is easier to see this in a magnified view as shown in Fig. 8.9.
Finally, we examine trajectories that begin in region IV. Here x′ is positive and y′ is negative, so the x values will grow and the y values will decrease. The trajectories will behave in this manner until they intersect the x′ = 0 nullcline. Then, they will cross into Region III and approach the dominant eigenvector E₂ line from the left side as is shown in Fig. 8.10.
In Fig. 8.11, we show all the region trajectories on one plot. We can draw more, but
these should be enough to give you an idea of how to draw them. In addition, there
is a type of trajectory we haven’t drawn yet. Recall, the general solution is
[x₁(t); y₁(t)] = a [1; 1/4] e^{−2t} + b [1; 1] eᵗ.
These are the trajectories with b = 0. Thus, these trajectories start somewhere on the eigenvector E₁ line and then, as t increases, x(t) and y(t) go to (0, 0) along this eigenvector. You can easily imagine these trajectories by placing a dot on the E₁ line with an arrow pointing towards the origin.
We can do this sort of qualitative analysis for the three cases:
• One eigenvalue negative and one eigenvalue positive: example r1 = −2 and r2 = 1
which we have just completed.
• Both eigenvalues negative: example r1 = −2 and r2 = −1 which we have not
done.
• Both eigenvalues positive: example r1 = 1 and r2 = 2 which we have not done.
• In each case, we have two eigenvectors E1 and E2 . The way we label our eigenval-
ues will always make most trajectories approach the E2 line as t increases because
r2 is always the largest eigenvalue.
Here we have one negative eigenvalue and one positive eigenvalue. The positive one
is the dominant one: example, r1 = −2 and r2 = 1 so the dominant eigenvalue is
r2 = 1.
• Trajectories that start on the E1 line go towards (0, 0) along that line.
• Trajectories that start on the E2 line move outward along that line.
• All other ICs give trajectories that move outward from (0, 0) and approach the dominant eigenvector line, the E2 line, as t increases.
• The (+, +), (+, −), (−, +) and (−, −) regions tell us the details of how this is done. We find these regions using the nullcline analysis.
Here we have two negative eigenvalues. The least negative one is the dominant one:
example, r1 = −2 and r2 = −1 so the dominant eigenvalue is r2 = −1.
• Trajectories that start on the E1 line go towards (0, 0) along that line.
• Trajectories that start on the E2 line go towards (0, 0) along that line.
• All other ICs give trajectories that move towards (0, 0) and approach the dominant eigenvector line, the E2 line, as t increases.
• The (+, +), (+, −), (−, +) and (−, −) regions tell us the details of how this is done. We find these regions using the nullcline analysis.
Now we have two positive eigenvalues. The larger one is the dominant one: for example, r1 = 2 and r2 = 3, so the dominant eigenvalue is r2 = 3.
• Trajectories that start on the E1 line move outward along that line.
• Trajectories that start on the E2 line move outward along that line.
• All other ICs give trajectories that move outward and approach the dominant eigenvector line, the E2 line, as t increases. This case where both eigenvalues are positive is the hardest one to draw and many times these trajectories become parallel to the dominant line rather than approaching it.
• The (+, +), (+, −), (−, +) and (−, −) regions tell us the details of how this is done. We find these regions using the nullcline analysis.
8.2.3.14 Examples
Finally, here is how we would work out a problem by hand in Figs. 8.12, 8.13 and
8.14; note the wonderful handwriting displayed on these pages.
8.2.3.15 Homework
You are now ready to do some problems on your own. For the problems below
Exercise 8.2.8
[x′(t); y′(t)] = [1 3; 3 1] [x(t); y(t)], [x(0); y(0)] = [−3; 1].
Exercise 8.2.9
[x′(t); y′(t)] = [3 12; 2 1] [x(t); y(t)], [x(0); y(0)] = [6; 1].
Exercise 8.2.10
[x′(t); y′(t)] = [−1 1; −2 −4] [x(t); y(t)], [x(0); y(0)] = [3; 8].
Exercise 8.2.11
[x′(t); y′(t)] = [3 4; −7 −8] [x(t); y(t)], [x(0); y(0)] = [−2; 4].
Exercise 8.2.12
[x′(t); y′(t)] = [−1 1; −3 −5] [x(t); y(t)], [x(0); y(0)] = [2; −4].
Exercise 8.2.13
[x′(t); y′(t)] = [−5 2; −4 1] [x(t); y(t)], [x(0); y(0)] = [21; 5].
We call the vector whose components are u and v the inputs to this model. In general, if the external inputs are both zero, we call the model a homogeneous model and otherwise, it is a nonhomogeneous model. Now we don't know how to solve this, but let's try a guess. Let's assume there is a solution x_inputs(t) = x* and y_inputs(t) = y*, where x* and y* are constants. Then,
[x′_inputs(t); y′_inputs(t)] = [0; 0]
since the derivative of a constant is zero. Now plugging this assumed solution into the original model, we find
[0; 0] = [−2 3; 4 5] [x*; y*] + [u; v].
Since det(A) = −22, which is not zero, we know A⁻¹ exists. Manipulating a bit, our original equation becomes
[−2 3; 4 5] [x*; y*] = −[u; v], i.e. [x*; y*] = −A⁻¹ [u; v],
for any inputs u and v. For this sample problem, the characteristic equation is
det (r I − A) = r 2 − 3r − 22
for arbitrary a and b. A little thought then shows that adding [xno input (t), yno input (t)]T
to the solution [xinputs (t), yinputs (t)]T will always solve the model with inputs. So
the most general solution to the model with constant inputs must be of the form
[x(t); y(t)] = [x_no input(t); y_no input(t)] + [x_inputs(t); y_inputs(t)]
= a E₁ e^{−3.42t} + b E₂ e^{6.42t} − A⁻¹ [u; v],
where A is the coefficient matrix of our model. The solution [xno input (t), yno input (t)]T
occurs so often it is called the homogeneous solution to the model (who knows why!)
and the solution [xinputs (t), yinputs (t)]T because it actually works for the model
with these particular inputs, is called the particular solution. To save subscript-
ing, we label the homogeneous solution [x h (t), yh (t)]T and the particular solution
[x_p(t), y_p(t)]ᵀ. Finally, any model with inputs that are not zero is called a nonhomogeneous model. We are thus ready for a definition.
A solution to the model with inputs,
x′(t) = a x(t) + b y(t) + f(t)
y′(t) = c x(t) + d y(t) + g(t),
for input functions f(t) and g(t) is called the particular solution and is labeled [x_p(t), y_p(t)]ᵀ. The model with no inputs is called the homogeneous model and its solutions are called homogeneous solutions and labeled [x_h(t), y_h(t)]ᵀ. The general solution to the model with nonzero inputs is then
[x(t); y(t)] = [x_h(t); y_h(t)] + [x_p(t); y_p(t)].
For a linear model, from our discussions above, we know how to find a complete
solution.
Example 8.2.6
[x′(t); y′(t)] = [2 −5; 4 −7] [x(t); y(t)] + [2; 1], [x(0); y(0)] = [1; 5].
Here det(A) = 6, so the constant particular solution is [x_p; y_p] = −A⁻¹ [2; 1] = [3/2; 1].
Note, this particular solution is biologically reasonable if x and y are proteins and
the dynamics represent their interaction since x p and y p are positive.
The homogeneous solution is
[x_h(t); y_h(t)] = a E₁ e^{−3t} + b E₂ e^{−2t}
with E₁ = [1; 1] and E₂ = [1; 0.8]. Requiring x(0) = 1 and y(0) = 5 gives two equations in two unknowns which we solve in the usual way. We find
1 = a + b + 3/2
5 = a + 0.8 b + 1
or
−1/2 = a + b
4 = a + 0.8 b.
Subtracting, 0.2 b = −9/2, so b = −45/2 and a = 22, and the full solution is
[x(t); y(t)] = 22 [1; 1] e^{−3t} − (45/2) [1; 0.8] e^{−2t} + [3/2; 1].
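A short Octave/MATLAB check of the particular solution and the IC equations for this example (the eigenvector columns below use the same scalings as in the text):
A  = [2 -5; 4 -7];
zp = -A \ [2; 1]               % constant particular solution, [3/2; 1]
E  = [1 1; 1 0.8];             % columns E1 (for r = -3) and E2 (for r = -2)
ab = E \ ([1; 5] - zp)         % the constants a and b from the initial conditions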
8.2.3.18 Homework
Exercise 8.2.14
[x′(t); y′(t)] = [1 3; 3 1] [x(t); y(t)] + [3; 5], [x(0); y(0)] = [−3; 1].
Exercise 8.2.15
[x′(t); y′(t)] = [3 12; 2 1] [x(t); y(t)] + [6; 8], [x(0); y(0)] = [6; 1].
Exercise 8.2.16
[x′(t); y′(t)] = [−1 1; −2 −4] [x(t); y(t)] + [1; 10], [x(0); y(0)] = [3; 8].
Exercise 8.2.17
[x′(t); y′(t)] = [3 4; −7 −8] [x(t); y(t)] + [0.3; 6], [x(0); y(0)] = [−2; 4].
Exercise 8.2.18
[x′(t); y′(t)] = [−1 1; −3 −5] [x(t); y(t)] + [7; 0.5], [x(0); y(0)] = [2; −4].
Exercise 8.2.19
[x′(t); y′(t)] = [−5 2; −4 1] [x(t); y(t)] + [23; 11], [x(0); y(0)] = [21; 5].
In this case, the two roots of the characteristic equation are both the same real value.
For us, there are two cases where this can happen.
Let the matrix A be a multiple of the identity. For example, we have the linear system
[x′(t); y′(t)] = [−2 0; 0 −2] [x(t); y(t)].
The characteristic equation is then (r + 2)² = 0 with repeated roots r = −2. When we solve the eigenvalue–eigenvector equation, we find when we substitute in the eigenvalue r = −2 that we must solve
[0 0; 0 0] [V₁; V₂] = [0; 0],
which is a strange looking system of equations which we have not seen before. The
way to interpret this is to go back to looking at it as two equations in the unknowns
V1 and V2 . We have
0 V1 + 0 V2 = 0
0 V1 + 0 V2 = 0
As usual both rows of the eigenvalue–eigenvector equation are the same and so
choosing the top row to work with, the equation says there are no constraints on the
values of V1 and V2 . Hence, any nonzero vector [a, b]T will work. This gives us the
two parameter family
V1 a 1 0
= a +b .
V2 b 0 1
We choose as the first eigenvector E 1 = [1, 0]T and as the second eigenvector
E 2 = [0, 1]T . Hence, the two linearly independent solutions to this system are
E 1 e−2t and E 2 e−2t with the general solution
[x(t); y(t)] = A [1; 0] e^{−2t} + B [0; 1] e^{−2t} = [A e^{−2t}; B e^{−2t}].
Clearly, these two rows are equivalent and hence we only need to choose one to solve
for V2 in terms of V1 . The first row gives V1 + V2 = 0. Letting V1 = a, we find
V2 = −a. Hence, the eigenvectors have the form
V = a [1; −1].
We now look for a second linearly independent solution of the form [x₂(t); y₂(t)] = (F + G t) e^{3t} and find conditions on the vectors F and G that make this true. If this is a solution, then we must have
[x₂′(t); y₂′(t)] = A [x₂(t); y₂(t)].
Rewriting, we obtain
A F e^{3t} + A G t e^{3t} = (3 F + G) e^{3t} + 3 t G e^{3t},
and matching the e^{3t} and t e^{3t} terms gives
A G = 3 G
A F = 3 F + G.
The first equation says G is an eigenvector, so we take G = E; the second can then easily be solved for F. The next linearly independent solution therefore has the form (F + E t) e^{3t} where F solves the system
[1 1; −1 −1] [F₁; F₂] = −E = [−1; 1].
The contribution to the general solution from b E can be lumped in with the first solution E e^{3t} as we have discussed, so we only need choose
F = [0; −1].
Note for large t, the (x, y) trajectory is essentially parallel to the eigenvector line for
E which has slope −1. Hence, if we scale by e3t to obtain
[x(t)/e^{3t}; y(t)/e^{3t}] = [2 − 7t; 5 + 7t],
the phase plane trajectories will look like a series of parallel lines. If we keep in the
e3t factor, the phase plane trajectories will exhibit some curvature. Without scaling, a
typical phase plane plot for multiple trajectories looks like what we see in Fig. 8.15.
If we do the scaling, we see the parallel lines as we mentioned. This is shown in
Fig. 8.16.
From our discussions for this case of repeated eigenvalues, we can see the general
rule that if r = α is a repeated eigenvalue with only one eigenvector E, the two
linearly independent solutions are
[x₁(t); y₁(t)] = E e^{αt}
[x₂(t); y₂(t)] = (F + E t) e^{αt}.
8.3.2.1 Homework
We can now do some problems. Find the solution to the following models.
Exercise 8.3.1
Exercise 8.3.2
Exercise 8.3.3
Exercise 8.3.4
Let's begin with a theoretical analysis for a change of pace. If the real valued matrix A has a complex eigenvalue r = α + iβ, then there is a nonzero vector G so that
A G = (α + iβ) G.
However, since A has real entries, its complex conjugate is simply A back. Thus, after taking complex conjugates, we find
A Ḡ = (α − iβ) Ḡ.
Writing G = E + i F with E and F real, the general complex solution is
[φ(t); ψ(t)] = c₁ (E + i F) e^{(α + iβ)t} + c₂ (E − i F) e^{(α − iβ)t}
for arbitrary complex numbers c₁ and c₂. We can reorganize this solution into a more convenient form as follows:
[φ(t); ψ(t)] = e^{αt} ( c₁ (E + i F) e^{(iβ)t} + c₂ (E − i F) e^{(−iβ)t} )
= e^{αt} ( ( c₁ e^{(iβ)t} + c₂ e^{(−iβ)t} ) E + i ( c₁ e^{(iβ)t} − c₂ e^{(−iβ)t} ) F ).
The first real solution is found by choosing c₁ = 1/2 and c₂ = 1/2. This gives
[x₁(t); y₁(t)] = e^{αt} ( (1/2)( e^{(iβ)t} + e^{(−iβ)t} ) E + i (1/2)( e^{(iβ)t} − e^{(−iβ)t} ) F ).
However, we know that (1/2)( e^{(iβ)t} + e^{(−iβ)t} ) = cos(βt) and (1/2)( e^{(iβ)t} − e^{(−iβ)t} ) = i sin(βt). Thus, we have
[x₁(t); y₁(t)] = e^{αt} ( E cos(βt) − F sin(βt) ).
The second real solution is found by setting c₁ = 1/(2i) and c₂ = −1/(2i), which gives
[x₂(t); y₂(t)] = e^{αt} ( (1/(2i))( e^{(iβ)t} − e^{(−iβ)t} ) E + i (1/(2i))( e^{(iβ)t} + e^{(−iβ)t} ) F )
= e^{αt} ( E sin(βt) + F cos(βt) ).
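A small Octave/MATLAB sketch of this recipe, using the matrix [3 2; −4 5] from Sect. 8.1 (which happens to have complex eigenvalues), shows how E, F, α and β come out of eig and how the two real solutions are then assembled:
A = [3 2; -4 5];
[V, D] = eig(A);
r  = D(1,1);                          % one complex eigenvalue, alpha + i*beta
al = real(r);  be = imag(r);
E  = real(V(:,1));  F = imag(V(:,1)); % the eigenvector is G = E + i*F
t  = linspace(0, 2, 200);
x1 = (E*cos(be*t) - F*sin(be*t)) .* [exp(al*t); exp(al*t)];   % first real solution
x2 = (E*sin(be*t) + F*cos(be*t)) .* [exp(al*t); exp(al*t)];   % second real solution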
Although it is not immediately apparent, the second row is a multiple of row one (multiply row one by a suitable complex constant to see this). So even though it is harder to see, these two rows are equivalent and hence we only need to choose one to solve for V₂ in terms of V₁. The first row gives (1 + 2i) V₁ + 5 V₂ = 0. Letting V₁ = a, we find V₂ = a (−1 − 2i)/5. Hence, the eigenvectors have the form
G = a [1; −(1 + 2i)/5] = a ( [1; −1/5] + i [0; −2/5] ).
Hence,
E = [1; −1/5] and F = [0; −2/5].
8.4.1.1 Homework
We can now do some problems. Find the solution to the following models.
Exercise 8.4.1
Exercise 8.4.2
Exercise 8.4.3
Exercise 8.4.4
We know
[x′(t); y′(t)] = A [x(t); y(t)].
Let [ab] denote the 2 × 2 matrix whose columns are a E + b F and b E − a F, i.e. [ab] = [ a E + b F, b E − a F ].
Whew! But wait, we can do more! With a bit more factoring, we have
[x′(t); y′(t)] = e^{αt} [ab] [α cos(βt) − β sin(βt); β cos(βt) + α sin(βt)]
= e^{αt} [ab] [α −β; β α] [cos(βt); sin(βt)].
Hence, since
[x(t); y(t)] = e^{αt} [ab] [cos(βt); sin(βt)],
we see, letting [u(t); v(t)] = [ab]⁻¹ [x(t); y(t)], that
[u′(t); v′(t)] = [ab]⁻¹ [x′(t); y′(t)] = [ab]⁻¹ A [x(t); y(t)].
But, we know
[ab]⁻¹ A = [α −β; β α] [ab]⁻¹.
Thus, we have
[u′(t); v′(t)] = [α −β; β α] [ab]⁻¹ [x(t); y(t)] = [α −β; β α] [u(t); v(t)].
This transformed system in the variables u and v also has the eigenvalues α ± iβ but it is simpler to solve. For the eigenvalue α + iβ, the eigenvector equation gives iβ V₁ + β V₂ = 0. Letting V₁ = 1, we have V₂ = −i. Thus,
[V₁; V₂] = [1; −i] = [1; 0] + i [0; −1].
Thus,
E = [1; 0] and F = [0; −1].
Let the angle δ be defined by tan(δ) = b/a. This tells us cos(δ) = a/R and sin(δ) = b/R, where R = √(a² + b²). Then, cos(π/2 − δ) = b/R and sin(π/2 − δ) = a/R. Plugging these values into our expression, we find
[u(t); v(t)] = R e^{αt} [cos(δ) cos(βt) + sin(δ) sin(βt); −cos(π/2 − δ) cos(βt) + sin(π/2 − δ) sin(βt)]
= R e^{αt} [cos(δ) cos(βt) + sin(δ) sin(βt); −sin(δ) cos(βt) + cos(δ) sin(βt)]
= R e^{αt} [cos(βt − δ); sin(βt − δ)].
8.4.5.1 Homework
Exercise 8.4.5
Exercise 8.4.6
Exercise 8.4.7
Exercise 8.4.8
Then we have
[x(t); y(t)] = e^{αt} [λ₁₁ cos(βt) + λ₁₂ sin(βt); λ₂₁ cos(βt) + λ₂₂ sin(βt)].
Let R₁ = √(λ₁₁² + λ₁₂²) and R₂ = √(λ₂₁² + λ₂₂²). Then
[x(t); y(t)] = e^{αt} [ R₁ ( (λ₁₁/R₁) cos(βt) + (λ₁₂/R₁) sin(βt) ); R₂ ( (λ₂₁/R₂) cos(βt) + (λ₂₂/R₂) sin(βt) ) ].
Let the angles δ1 and δ2 be defined by tan(δ1 ) = λ12 /λ11 and tan(δ2 ) = λ22 /λ21
where in the case where we divide by zero, these angles are assigned the value of
±π/2 as needed. For convenience of exposition, we will assume here that all the
entries of ab are nonzero although, of course, what we really know is that this
matrix is invertible and so it is possible for some entries to be zero. But that is just
a messy complication and easy enough to fix with a little thought. So once we have
our angles δ1 and δ2 , we can rewrite the solution as
⎡ ⎤
R cos(δ ) cos(βt) + sin(δ ) sin(βt)
⎢ 1 1 1 ⎥
x(t)
= eαt ⎢ ⎥ = eαt R1 cos(βt − δ1 )
y(t) ⎣ ⎦ R2 cos(βt − δ2 )
R2 cos(δ2 ) cos(βt) + sin(δ2 ) sin(βt)
This is a little confusing as is, so let’s do an example with numbers. Suppose our
solution was
x(t) αt 2 cos(βt − π/6)
=e
y(t) 3 cos(βt − π/3)
Then this solution is clearly periodic since the cosine functions are periodic. Note that x hits its maximum and minimum values of 2 and −2 at the times t_1 where βt_1 − π/6 = 0 and t_2 with βt_2 − π/6 = π. This gives t_1 = π/(6β) and t_2 = 7π/(6β). At these values of time, the y values are 3 cos(βt_1 − π/3) = 3 cos(−π/6) = 3√3/2 and 3 cos(βt_2 − π/3) = 3 cos(5π/6) = −3√3/2. Thus, the points (2, 3√3/2) and (−2, −3√3/2) are on this trajectory. The trajectory reaches the point (2, 3√3/2) at time π/(6β) and then hits the point (−2, −3√3/2) at time π/β + π/(6β).
The extremal y values are ±3. The maximum y of 3 is obtained at βt_3 − π/3 = 0, or t_3 = π/(3β). The corresponding x value is then 2 cos(βt_3 − π/6) = 2 cos(π/6) = √3. So at time t_3, the trajectory passes through (√3, 3). Finally, the minimum y value of −3 is achieved at βt_4 − π/3 = π, or βt_4 = 4π/3. At this time, the corresponding x value is 2 cos(4π/3 − π/6) = 2 cos(7π/6) = −√3.
This is probably not very helpful; however, what we have shown here is that this trajectory is a rotated ellipse. Take a sheet of paper and mark off the x and y axes. Choose a small angle to represent π/(6β). Draw a line through the origin at angle π/(6β) and on this line mark the two points (2, 3√3/2) (angle π/(6β)) and (−2, −3√3/2) (angle 7π/(6β)). This line is the horizontal axis of the ellipse. Now draw another line through the origin at angle π/(3β), which is double the first angle. This line is the vertical axis of the ellipse. On this line plot the two points (√3, 3) (angle π/(3β)) and (−√3, −3) (angle 4π/(3β)). This is a phase shifted ellipse. At time π/(6β), we start at the farthest positive x value of the ellipse
on the horizontal axis. Then in π/(6β) additional time, we hit the largest y value
of the vertical axis. Next, in an additional 5π/(6β) we reach the most negative x
value of the ellipse on the horizontal axis. After another π/(6β) we arrive at the most
negative y value on the vertical axis. Finally, we arrive back at the start point after
another 5π/(6β). Try drawing it!
Maybe a little Matlab/Octave will help. Consider the following quick plot. We
won’t bother to label axes and so forth as we just want to double check all of the
complicated arithmetic above.
Listing 8.1: Checking our arithmetic!
beta = 3;
x = @(t) 2*cos(beta*t - pi/6);
y = @(t) 3*cos(beta*t - pi/3);
t = linspace(0,1,21);
u = x(t);
v = y(t);
plot(u,v);
This generates part of this ellipse as we can see in Fig. 8.17. We can close the plot
by plotting for a longer time.
Listing 8.2: Plotting for more time
beta = 3;
x = @(t) 2*cos(beta*t - pi/6);
y = @(t) 3*cos(beta*t - pi/3);
t = linspace(0,3,42);
u = x(t);
v = y(t);
plot(u,v);
This fills in the rest of the ellipse as we can see in Fig. 8.18.
We can plot the axes of this ellipse easily as well using the MatLab/Octave session
below.
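That session is not reproduced above; here is a minimal sketch of what it might contain. The line slopes come from the axis points computed earlier and are the only assumptions made here.

% hedged sketch: plot the ellipse and its two (non-perpendicular) axis lines
beta = 3;
x = @(t) 2*cos(beta*t - pi/6);
y = @(t) 3*cos(beta*t - pi/3);
t = linspace(0, 3, 101);
axis1 = @(s) (3*sqrt(3)/4)*s;   % line through the origin and (2, 3*sqrt(3)/2)
axis2 = @(s) sqrt(3)*s;         % line through the origin and (sqrt(3), 3)
s = linspace(-2.5, 2.5, 101);
plot(x(t), y(t), '-k', s, axis1(s), '-r', s, axis2(s), '-b');
legend('ellipse', 'horizontal axis', 'vertical axis', 'Location', 'Best');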
We can now see the axes lines we were talking about earlier in Fig. 8.19. Note these
axes are not perpendicular like we would usually see in an ellipse! Now, this picture
does not include the exponential decay or growth we would get by multiplying by
the scaling factor eαt for various α’s. A typical spiral out would look like Fig. 8.20.
To determine the direction of motion in these trajectories, we do the usual nullcline analysis to get the algebraic sign pairs for (x', y') as usual.
8.4.6.1 Homework
Write MatLab/Octave code to graph the following solutions and their axes. We are
using the same models we have solved before but with different initial conditions.
Exercise 8.4.9
Exercise 8.4.10
Exercise 8.4.11
Exercise 8.4.12
We now want to learn how to solve systems of differential equations. A typical system
is the following
where x and y are our variables of interest which might represent populations of two
competing species or other quantities of biological interest. The system starts at time
t0 (which for us is usually 0) and we specify the values the variables x and y start at
as x0 and y0 respectively. The functions f and g are “nice” meaning that as functions
of three arguments (t, x, y) they do not have jumps and corners. The exact nature of
“nice” here is a bit beyond our ability to discuss in this introductory course, so we
will leave it at that. For example, we could be asked to solve the system given by Eqs. 9.4 and 9.5,

\[
f(t, x, y) = 3x + 2xy - 3y^2, \qquad g(t, x, y) = -2x + 2x^2 y + 5y,
\]

or the system given by Eqs. 9.7 and 9.8.
The functions f and g in these examples are not linear in the variables x and y; hence the system of Eqs. 9.4 and 9.5 and the system of Eqs. 9.7 and 9.8 are what are called nonlinear systems. In general, we can also add arbitrary functions of time, φ and ψ, to the model giving
The functions φ and ψ are what we could call data functions. For example, if φ(t) = sin(t) and ψ(t) = t e^{−t}, the system of Eqs. 9.10 and 9.11 would become

\[
f(t, x, y) = 3x + 2xy - 3y^2 + \sin(t), \qquad
g(t, x, y) = -2x + 2x^2 y + 5y + t\, e^{-t}.
\]
How do we solve such a system of differential equations? There are some things
we can do if the functions f and g are linear in x and y, but many times we will be
forced to look at the solutions using numerical techniques. We explored how to solve
first order equations in Chap. 3 and we will now adapt the tools developed in that
chapter to systems of differential equations. First, we will show you how to write
a system in terms of matrices and vectors and then we will solve some particular
systems.
\[
x(t) = \begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix} = \begin{bmatrix} u(t) \\ u'(t) \end{bmatrix}
\]

Then,
Let the matrix above be called A. Then we have converted the original system into the matrix–vector equation x'(t) = A x(t), x(0) = x_0.
Example 9.1.3 Convert to a matrix–vector system

\[
u(t) = \begin{bmatrix} u_1(t) \\ u_2(t) \end{bmatrix} = \begin{bmatrix} x(t) \\ y(t) \end{bmatrix}.
\]
9.1.1 Homework
Now let's consider how to adapt our previous code to handle these systems of differential equations. We will begin with the linear second order problems because we know how to do those already. First, let's consider a general second order linear problem

\[
a\, u''(t) + b\, u'(t) + c\, u(t) = g(t),
\]

where we assume a is not zero, so that we really do have a second order problem! As usual, we let the vector x be given by

\[
x(t) = \begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix} = \begin{bmatrix} u(t) \\ u'(t) \end{bmatrix}
\]
Then,

\[
\begin{aligned}
x_1'(t) &= u'(t) = x_2(t),\\
x_2'(t) &= u''(t) = -(c/a)\, u(t) - (b/a)\, u'(t) + (1/a)\, g(t).
\end{aligned}
\]

Let the matrix above be called A. Then we have converted the original system into the matrix–vector equation x'(t) = A x(t), x(0) = x_0. For our purposes of using MatLab, we need to write this in terms of vectors. We have

\[
\begin{bmatrix} x_1'(t) \\ x_2'(t) \end{bmatrix}
= \begin{bmatrix} x_2 \\ -(c/a)\, x_1 - (b/a)\, x_2 + (1/a)\, g(t) \end{bmatrix}
\]
\[
x'' + 4x' - 5x = t\, e^{-0.03t}, \qquad x(0) = -1.0, \qquad x'(0) = 1.0
\]

\[
A + B = -1, \qquad -5A + B = 1
\]

It is straightforward to see that A = −1/3 and B = −2/3. Then we solve the system
using MatLab with this session:
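The session itself is not reproduced here. The following is only a minimal sketch consistent with the description below, assuming for the purposes of the sketch that this first example is the unforced problem x'' + 4x' − 5x = 0 (the author's session may have also handled the forcing term shown above); the constants A and B computed above are exactly the ones that satisfy the initial conditions for the unforced problem.

% hedged sketch, not the author's exact session
a = 1; b = 4; c = -5;
f = @(t,y) [y(2); -(c/a)*y(1) - (b/a)*y(2)];
y0 = [-1; 1];
h = 0.2; T = 3; N = ceil(T/h);
[time, rkapprox] = FixedRK(f, 0, y0, h, 4, N);
% true is vector valued: row 1 is the solution, row 2 its derivative
A = -1/3; B = -2/3;
true = @(t) [A*exp(-5*t) + B*exp(t); -5*A*exp(-5*t) + B*exp(t)];
ytrue = true(time);
plot(time, rkapprox(1,:), 'o', time, ytrue(1,:), '-');
xlabel('Time'); ylabel('x'); legend('RK4', 'True', 'Location', 'Best');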
One comment about this code. The function true has vector values; the first com-
ponent is the true solution and the second component is the true solution’s derivative.
Since we want to plot only the true solution, we need to extract it from true. We
do this with the command ytrue=true(time) which saves the vector of true
values into the new variable ytrue. Then in the plot command, we plot only the
solution by using ytrue(1,:). This generates a plot as shown in Fig. 9.1.
Now let's add an external input, g(t) = 10 sin(5t) e^{−0.03t}. In a more advanced class, we could find the true solution for this external input, but if we changed to g(t) = 10 sin(5t) e^{−0.03 t²} we would not be able to do that. So in general, there are many models we can not find the true solution to. However, the Runge–Kutta methods work quite well. Still, we always have the question in the back of our minds: is this plot accurate?
% define the dynamics for x'' + 4x' - 5x = 10 sin(5t) e^{-.03t}
a = 1; b = 4; c = -5;
B = -(b/a); A = -(c/a); C = 1/a;
g = @(t) 10*sin(5*t).*exp(-.03*t);
f = @(t,y) [y(2); A*y(1) + B*y(2) + C*g(t)];
y0 = [-1; 1];
h = .2;
T = 3;
N = ceil(T/h);
[htime1, rkapprox1] = FixedRK(f, 0, y0, h, 1, N);
yhat1 = rkapprox1(1,:);
[htime2, rkapprox2] = FixedRK(f, 0, y0, h, 2, N);
yhat2 = rkapprox2(1,:);
[htime3, rkapprox3] = FixedRK(f, 0, y0, h, 3, N);
yhat3 = rkapprox3(1,:);
[htime4, rkapprox4] = FixedRK(f, 0, y0, h, 4, N);
yhat4 = rkapprox4(1,:);
plot(htime1, yhat1, 'o', htime2, yhat2, '*', ...
     htime3, yhat3, '+', htime4, yhat4, '-');
xlabel('Time');
ylabel('Approx y');
title('Solution to x'''' + 4x'' - 5x = 10 sin(5t) e^{-.03t}, x(0) = -1, x''(0) = 1 on [0,3]');
legend('RK1', 'RK2', 'RK3', 'RK4', 'Location', 'Best');
On all of these problems, choose an appropriate stepsize h, time interval [0, T ] for
some positive T and
• find the true solution and write this as MatLab code.
• find the Runge–Kutta order 1 through 4 solutions.
• Write this up with attached plots.
Exercise 9.2.1
For these models, find the Runge–Kutta 1 through 4 solutions and do the write up
with plot as usual.
Exercise 9.2.4
We now turn our attention to solving systems of linear ODEs numerically. We will
show you how to do it in two worked out problems. We then generate the plot of y
versus x, the two lines representing the eigenvectors of the problem, and the x' = 0 and y' = 0 nullclines on the same plot. A typical MatLab session would look like this:
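The session itself is not shown above; here is a hedged sketch of such a session, using for illustration the system x' = 4x − y, y' = 8x − 5y that reappears in Sect. 9.5 below (the first worked example here may be a different system).

% hedged sketch of a typical session for x' = 4x - y, y' = 8x - 5y, x(0) = -3, y(0) = 1
f = @(t,x) [4*x(1) - x(2); 8*x(1) - 5*x(2)];
x0 = [-3; 1];
h = 0.01; T = 2; N = ceil(T/h);
[time, rk] = FixedRK(f, 0, x0, h, 4, N);
X = rk(1,:); Y = rk(2,:);
E1 = @(x) 8*x;          % eigenvector line for eigenvalue -4: y = 8x
E2 = @(x) x;            % eigenvector line for eigenvalue  3: y = x
xp = @(x) 4*x;          % x' = 0 nullcline: y = 4x
yp = @(x) (8/5)*x;      % y' = 0 nullcline: y = 8x/5
D = max(max(abs(X)), max(abs(Y)));
x = linspace(-D, D, 201);
plot(x, E1(x), '-r', x, E2(x), '-m', x, xp(x), '-b', x, yp(x), '-c', X, Y, '-k');
xlabel('x'); ylabel('y');
legend('E1', 'E2', 'x''=0', 'y''=0', 'y vs x', 'Location', 'Best');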
This generates Fig. 9.3. This matches the kind of qualitative analysis we have done
by hand, although when we do the plots by hand we get a more complete picture.
Here, we see only one plot instead of the many trajectories we would normally sketch.
We solve the IVP by finding the A and B that will give the desired initial conditions. This gives

\[
\begin{bmatrix} 4 \\ -2 \end{bmatrix}
= A \begin{bmatrix} 1 \\ -1 \end{bmatrix} + B \begin{bmatrix} 1 \\ -\tfrac{1}{9} \end{bmatrix}
\]

or

\[
4 = A + B, \qquad -2 = -A - \tfrac{1}{9}\,B.
\]

This is easily solved using elimination to give A = 7/4 and B = 9/4. The solution to the IVP is therefore

\[
\begin{bmatrix} x(t) \\ y(t) \end{bmatrix}
= \frac{7}{4} \begin{bmatrix} 1 \\ -1 \end{bmatrix} e^{-5t}
+ \frac{9}{4} \begin{bmatrix} 1 \\ -\tfrac{1}{9} \end{bmatrix} e^{3t}
= \begin{bmatrix} \tfrac{7}{4}\, e^{-5t} + \tfrac{9}{4}\, e^{3t} \\[4pt] -\tfrac{7}{4}\, e^{-5t} - \tfrac{1}{4}\, e^{3t} \end{bmatrix}
\]
We now have all the information needed to solve this numerically. The MatLab session
for this problem is then
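The session itself is not reproduced above. A minimal sketch follows; the coefficient matrix used in it is reconstructed from the eigenvalues −5, 3 and eigenvectors [1; −1], [1; −1/9] used above (which work out to A = [4 9; −1 −6]), so it is an inference, not a quote of the author's code.

% hedged sketch; A reconstructed from the eigen-data above
f = @(t,x) [4*x(1) + 9*x(2); -x(1) - 6*x(2)];
x0 = [4; -2];
h = 0.01; T = 1.2; N = ceil(T/h);   % T and h chosen by trial and error
[time, rk] = FixedRK(f, 0, x0, h, 4, N);
X = rk(1,:); Y = rk(2,:);
E1 = @(x) -x;                        % eigenvector line for eigenvalue -5
E2 = @(x) -x/9;                      % eigenvector line for eigenvalue  3
xp = @(x) -(4/9)*x;                  % x' = 0 nullcline
yp = @(x) -x/6;                      % y' = 0 nullcline
D = max(max(abs(X)), max(abs(Y)));
x = linspace(-D, D, 201);
plot(x, E1(x), '-r', x, E2(x), '-m', x, xp(x), '-b', x, yp(x), '-c', X, Y, '-k');
xlabel('x'); ylabel('y');
legend('E1', 'E2', 'x''=0', 'y''=0', 'y vs x', 'Location', 'Best');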
This generates Fig. 9.4. Again this matches the kind of qualitative analysis we
have done by hand, but for only one plot instead of the many trajectories we would
normally sketch. We had to choose the final time T and the step size h by trial and
error to generate the plot you see. If T is too large, the growth term in the solution
generates x and y values that are too big and the trajectory just looks like it lies on
top of the dominant eigenvector line.
9.3.1 Homework
Exercise 9.3.1
\[
\begin{bmatrix} x'(t) \\ y'(t) \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ -1 & -3/2 \end{bmatrix} \begin{bmatrix} x(t) \\ y(t) \end{bmatrix},
\qquad
\begin{bmatrix} x(0) \\ y(0) \end{bmatrix} = \begin{bmatrix} -5 \\ -2 \end{bmatrix}
\]

Exercise 9.3.2

\[
\begin{bmatrix} x'(t) \\ y'(t) \end{bmatrix} = \begin{bmatrix} 4 & 2 \\ -9 & -5 \end{bmatrix} \begin{bmatrix} x(t) \\ y(t) \end{bmatrix},
\qquad
\begin{bmatrix} x(0) \\ y(0) \end{bmatrix} = \begin{bmatrix} -1 \\ 2 \end{bmatrix}
\]

Exercise 9.3.3

\[
\begin{bmatrix} x'(t) \\ y'(t) \end{bmatrix} = \begin{bmatrix} 3 & 7 \\ 4 & 6 \end{bmatrix} \begin{bmatrix} x(t) \\ y(t) \end{bmatrix},
\qquad
\begin{bmatrix} x(0) \\ y(0) \end{bmatrix} = \begin{bmatrix} -2 \\ -3 \end{bmatrix}
\]
Now let’s try to generate a real phase plane portrait by automating the phase plane
plots for a selection of initial conditions. Consider the code below which is saved in
the file AutoPhasePlanePlot.m.
There are some new elements here. We set up vectors u and v to construct our
initial conditions from. Each initial condition is of the form (u_i, v_j) and we use that
to set the initial condition x0 we pass into FixedRK() as usual. We start by telling
MatLab the plot we are going to build is a new one; so the previous plot should be
erased. The command hold all then tells MatLab to keep all the plots we generate
as well as the line colors and so forth until a hold off is encountered. So here we
generate a bunch of plots and we then see them on the same plot at the end! A typical
session usually requires a lot of trial and error. In fact, you should find the analysis by
hand is actually more informative! As discussed, the AutoPhasePlanePlot.m
script is used by filling in values for the inputs it needs. Again the script has these
inputs
So let’s start using some more MatLab tools. As always, with power comes an
increased need for responsible behavior!
We will now discuss certain ways to compute eigenvalues and eigenvectors for a
square matrix in MatLab. For a given A, we can compute its eigenvalues as follows:
A =
   1   2   3
   4   5   6
   7   8  -1

E = eig(A)
E =
  -0.3954
  11.8161
  -6.4206
So we have found the eigenvalues of this small 3 × 3 matrix. Note, in general they are
not returned in any sorted order like small to large. Bummer! To get the eigenvectors,
we do this:
[V, D] = eig(A)
V =
   0.7530  -0.3054  -0.2580
  -0.6525  -0.7238  -0.3770
   0.0847  -0.6187   0.8896

D =
  -0.3954        0        0
        0  11.8161        0
        0        0  -6.4206

This tells us the eigenvalue–eigenvector pairs are

\[
\lambda_1 = -0.3954, \quad V_1 = \begin{bmatrix} 0.7530 \\ -0.6525 \\ 0.0847 \end{bmatrix}; \qquad
\lambda_2 = 11.8161, \quad V_2 = \begin{bmatrix} -0.3054 \\ -0.7238 \\ -0.6187 \end{bmatrix}; \qquad
\lambda_3 = -6.4206, \quad V_3 = \begin{bmatrix} -0.2580 \\ -0.3770 \\ 0.8896 \end{bmatrix}.
\]
Here is a symmetric example:

B =
   1   2   3   4   5
   2   5   6   7   9
   3   6   1   2   3
   4   7   2   8   9
   5   9   3   9   6

[W, Z] = eig(B)
W =
   (the columns of W are the eigenvectors of B; the printed values are not reproduced here)
Z =
   0.1454        0        0        0        0
        0   2.4465        0        0        0
        0        0  -2.2795        0        0
        0        0        0  -5.9321        0
        0        0        0        0  26.6197
It is possible to show that the eigenvalues of a symmetric matrix will be real and
eigenvectors corresponding to distinct eigenvalues will be 90◦ apart. Such vectors
are called orthogonal and recall this means their inner product is 0. Let’s check it
out. The eigenvectors of our matrix are the columns of W above. So their dot product
should be 0!
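The command that produced the output below is not shown; presumably it was something like the following line (a guess, not the author's exact code):

% hedged guess at the check: inner product of the first two eigenvectors of B
C = W(:,1)' * W(:,2)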
C =
   1.3336e-16
Well, the dot product is not actually 0 because we are dealing with floating point
numbers here, but as you can see it is close to machine zero (the smallest number
our computer chip can detect). Welcome to the world of computing!
We have already solved linear models using MatLab tools. Now we will learn to do a bit more. We begin with a sample problem. Note that to analyze a linear systems model, we can do everything by hand, and we can then try to emulate the hand work using computational tools. A sketch of the process is thus:
1. For the system below, first do the work by hand.
\[
\begin{bmatrix} x'(t) \\ y'(t) \end{bmatrix} = \begin{bmatrix} 4 & -1 \\ 8 & -5 \end{bmatrix} \begin{bmatrix} x(t) \\ y(t) \end{bmatrix},
\qquad
\begin{bmatrix} x(0) \\ y(0) \end{bmatrix} = \begin{bmatrix} -3 \\ 1 \end{bmatrix}
\]
D = max(xtop, ytop);
x = linspace(-D, D, 201);
plot(x, E1(x), '-r', x, E2(x), '-m', x, xp(x), '-b', x, yp(x), '-c', X, Y, '-k');
xlabel('x');
ylabel('y');
title('Phase Plane for Linear System x'' = 4x-y, y''=8x-5y, x(0) = -3, y(0) = 1');
legend('E1', 'E2', 'x''=0', 'y''=0', 'y vs x', 'Location', 'Best');
A =
   4  -1
   8  -5

[V, D] = eigs(A)
V =
   0.1240   0.7071
   0.9923   0.7071

D =
  -4   0
   0   3
you should be able to see that the eigenvectors we got before by hand are the
same as these except they are written as vectors of length one.
4. Now plot many trajectories at the same time. We discussed this a bit earlier. It is
very important to note that the hand analysis is in many ways easier. A typical
MatLab session usually requires a lot of trial and error! So try not to get too
frustrated. The AutoPhasePlanePlot.m script is used by filling in values
for the inputs it needs.
Now some attempts.
9.5.3 Project
For this project, follow the outline just discussed above to solve any one of these
models. Then
Solve the Model By Hand: Do this and attach to your project report.
Plot One Trajectory Using MatLab: Follow the outline above. This part of the
report is done in a word processor with appropriate comments, discussion etc.
Make sure you document all of your MatLab work thoroughly! Show your MatLab
code and sessions as well as plots.
Find the Eigenvalues and Eigenvectors in MatLab: Explain how the MatLab
work connects to the calculations we do by hand.
Plot Many Trajectories Simultaneously Using MatLab: This part of the report
is also done in a word processor with appropriate comments, discussion etc. Show
your MatLab code and sessions as well as plots with appropriate documentation.
Exercise 9.5.1

\[
\begin{bmatrix} x'(t) \\ y'(t) \end{bmatrix} = \begin{bmatrix} 1 & 5 \\ 5 & 1 \end{bmatrix} \begin{bmatrix} x(t) \\ y(t) \end{bmatrix},
\qquad
\begin{bmatrix} x(0) \\ y(0) \end{bmatrix} = \begin{bmatrix} 4 \\ 10 \end{bmatrix}
\]

Exercise 9.5.2

\[
\begin{bmatrix} x'(t) \\ y'(t) \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ -1 & -3/2 \end{bmatrix} \begin{bmatrix} x(t) \\ y(t) \end{bmatrix},
\qquad
\begin{bmatrix} x(0) \\ y(0) \end{bmatrix} = \begin{bmatrix} -4 \\ -6 \end{bmatrix}
\]

Exercise 9.5.3

\[
\begin{bmatrix} x'(t) \\ y'(t) \end{bmatrix} = \begin{bmatrix} 4 & 2 \\ -9 & -5 \end{bmatrix} \begin{bmatrix} x(t) \\ y(t) \end{bmatrix},
\qquad
\begin{bmatrix} x(0) \\ y(0) \end{bmatrix} = \begin{bmatrix} 2 \\ -5 \end{bmatrix}
\]

Exercise 9.5.4

\[
\begin{bmatrix} x'(t) \\ y'(t) \end{bmatrix} = \begin{bmatrix} 3 & 7 \\ 4 & 6 \end{bmatrix} \begin{bmatrix} x(t) \\ y(t) \end{bmatrix},
\qquad
\begin{bmatrix} x(0) \\ y(0) \end{bmatrix} = \begin{bmatrix} -2 \\ 5 \end{bmatrix}
\]

\[
\begin{bmatrix} x'(t) \\ y'(t) \end{bmatrix} = \begin{bmatrix} 1 & 5 \\ 5 & 1 \end{bmatrix} \begin{bmatrix} x(t) \\ y(t) \end{bmatrix},
\qquad
\begin{bmatrix} x(0) \\ y(0) \end{bmatrix} = \begin{bmatrix} 2 \\ 3 \end{bmatrix}
\]
Here is an enhanced version of the automatic phase plane plot tool. It would be nice
to automate the plotting of the eigenvector lines, the nullclines and the trajectories so
that we didn’t have to do so much work by hand. Consider the new function in Listing
9.17. In this function, we pass in vecfunc into the argument fname. We evaluate
this function using the command feval(fname,0,[a time; an x]);. The
linear model has a coefficient matrix A of the form
\[
A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}
\]
xp = @(x) -(a/b)*x;
yp = @(x) -(c/d)*x;
% clear out any old pictures
clf
% set up x and y initial condition box
xic = linspace(xmin, xmax, xboxsize);
yic = linspace(ymin, ymax, yboxsize);
% find all the trajectories and store them
Approx = {};
for i = 1:xboxsize
  for j = 1:yboxsize
    x0 = [xic(i); yic(j)];
    [ht, rk] = FixedRK(fname, 0, x0, stepsize, 4, n);
    Approx{i,j} = rk;
    U = Approx{i,j};
    X = U(1,:);
    Y = U(2,:);
    % get the plotting square for each trajectory
    umin = min(X);
    umax = max(X);
    utop = max(abs(umin), abs(umax));
    vmin = min(Y);
    vmax = max(Y);
    vtop = max(abs(vmin), abs(vmax));
    D(i,j) = max(utop, vtop);
  end
end
% get the largest square to put all the plots into
E = max(max(D))
% setup the x linspace for the plot
x = linspace(-E, E, 201);
% start the hold
hold on
% plot the eigenvector lines and then the nullclines
plot(x, E1line(x), '-r', x, E2line(x), '-m', x, xp(x), '-b', x, yp(x), '-c');
% loop over all the ICs and get all trajectories
for i = 1:xboxsize
  for j = 1:yboxsize
    U = Approx{i,j};
    X = U(1,:);
    Y = U(2,:);
    plot(X, Y, '-k');
  end
end
% set labels and so forth
xlabel('x');
ylabel('y');
title('Phase Plane');
legend('E1', 'E2', 'x''=0', 'y''=0', 'y vs x', 'Location', 'BestOutside');
% set zoom for plot using mag
axis([-E*mag E*mag -E*mag E*mag]);
% finish the hold
hold off;
end
Recall we pass the dynamics function in the argument fname. To extract A, we use the following lines of code. Note, for a linear model, f = @(t,x) [a*x(1) + b*x(2); c*x(1)+d*x(2)];. Hence, we can do evaluations to find the coefficients: f(0,[1;0]) = [a;c] and f(0,[0;1]) = [b;d].
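In code, this extraction might look like the following sketch; the variable names here are illustrative, not the script's actual names.

% hedged sketch: recover the coefficient matrix A from the linear dynamics in fname
col1 = feval(fname, 0, [1; 0]);   % equals [a; c] for a linear model
col2 = feval(fname, 0, [0; 1]);   % equals [b; d] for a linear model
A = [col1, col2];                 % so A = [a b; c d]
[V, DD] = eig(A);                 % eigenvectors are the columns of V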
Next, we set the first column of V to be the eigenvector E1 and the second column of V to be the eigenvector E2. Then we set up the lines we need to plot the eigenvectors and the nullclines. The x' = 0 nullcline is ax + by = 0, or y = −(a/b)x, and the y' = 0 nullcline is cx + dy = 0, or y = −(c/d)x.
Then clear any previous figures with clf before we get started with the plot.
The initial conditions are chosen by dividing the interval [xmin, xmax] into
xboxsize points. Similarly, we divide [ymin, ymax] into yboxsize points.
We use these xboxsi ze × yboxsi ze possible pairs as our initial conditions.
Listing 9.21: Set up Initial conditions, find trajectories and the bounding boxes
% set up possible x coordinates of the ICs
xic = linspace(xmin, xmax, xboxsize);
% set up possible y coordinates of the ICs
yic = linspace(ymin, ymax, yboxsize);
% set up a data structure called a cell to store
% each trajectory
Approx = {};
% loop over all possible initial conditions
for i = 1:xboxsize
  for j = 1:yboxsize
    % set the IC
    x0 = [xic(i); yic(j)];
    % solve the model using RK4;
    % return the approximate values in rk
    [ht, rk] = FixedRK(fname, 0, x0, stepsize, 4, n);
    % store rk as the i,j th entry in the cell Approx
    Approx{i,j} = rk;
    % set U to be the current Approx cell entry
    % which is the same as the returned rk
    U = Approx{i,j};
    % set the first row of U to be X
    % set the second row of U to be Y
    X = U(1,:);
    Y = U(2,:);
    % find the square the trajectory fits inside
    umin = min(X);
    umax = max(X);
    utop = max(abs(umin), abs(umax));
    vmin = min(Y);
    vmax = max(Y);
    vtop = max(abs(vmin), abs(vmax));
    % store the size of the square that fits the trajectory
    % for this IC
    D(i,j) = max(utop, vtop);
  end
end
Now that we have all these squares for the possible trajectories, we find the biggest
one possible and set up the linspace command for this box. All of our trajectories
will be drawn inside the square [−E, E] × [−E, E].
Listing 9.22: Set up the bounding box for all the trajectories
E = max(max(D))
x = linspace(-E, E, 201);
Next, plot all the trajectories and set labels and so forth. The last thing we do is to set the axis as axis([-E*mag E*mag -E*mag E*mag]); which zooms in on the square [−E·mag, E·mag] × [−E·mag, E·mag].
9.6.1 Project
Exercise 9.6.1
\[
\begin{bmatrix} x'(t) \\ y'(t) \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ -1 & -3/2 \end{bmatrix} \begin{bmatrix} x(t) \\ y(t) \end{bmatrix}
\]

Exercise 9.6.2

\[
\begin{bmatrix} x'(t) \\ y'(t) \end{bmatrix} = \begin{bmatrix} 4 & 2 \\ -9 & -5 \end{bmatrix} \begin{bmatrix} x(t) \\ y(t) \end{bmatrix}
\]

Exercise 9.6.3

\[
\begin{bmatrix} x'(t) \\ y'(t) \end{bmatrix} = \begin{bmatrix} 3 & 7 \\ 4 & 6 \end{bmatrix} \begin{bmatrix} x(t) \\ y(t) \end{bmatrix}
\]

Exercise 9.6.4

\[
\begin{bmatrix} x'(t) \\ y'(t) \end{bmatrix} = \begin{bmatrix} 1 & 5 \\ 5 & 1 \end{bmatrix} \begin{bmatrix} x(t) \\ y(t) \end{bmatrix}
\]
In general, although it is easy to write down a two protein system, it is a bit harder to
make sure you get positive values for the long term protein concentration levels. So as
we were making up problems, we used a MatLab script to help get a good coefficient
matrix. This is an interesting use of Matlab and for those of you who would like to dig
more deeply into this aspect of modeling, we encourage you to study closely what
we do. The terms that inhibit or enhance the other protein’s production in general
are a c1 y in the x dynamics and a c2 x in the y dynamics. Note, we can always think
of a two dimensional vector like this
\[
\begin{bmatrix} c_1 \\ c_2 \end{bmatrix} = c_1 \begin{bmatrix} 1 \\ c_2/c_1 \end{bmatrix}
\]
We can do a similar thing for the constant production terms d_1 for x and d_2 for y to write

\[
\begin{bmatrix} d_1 \\ d_2 \end{bmatrix} = d_1 \begin{bmatrix} 1 \\ d_2/d_1 \end{bmatrix} = d_1 \begin{bmatrix} 1 \\ s \end{bmatrix} = \begin{bmatrix} \beta \\ s\beta \end{bmatrix}
\]

for d_1 = β and s = d_2/d_1. Finally, we can handle the two decay rates for x and y similarly. If these two rates are −α_1 for x and −α_2 for y, we can model that as

\[
\begin{bmatrix} \alpha_1 \\ \alpha_2 \end{bmatrix} = \alpha_1 \begin{bmatrix} 1 \\ \alpha_2/\alpha_1 \end{bmatrix} = \alpha_1 \begin{bmatrix} 1 \\ u \end{bmatrix} = \begin{bmatrix} \alpha \\ u\alpha \end{bmatrix}
\]

for α_1 = α and u = α_2/α_1. So, if u is a lot more than 1, y will decay fast and we
would expect x to take a lot longer to reach equilibrium. Our general two protein model will then have the form

\[
\begin{aligned}
x' &= -\alpha\, x - \gamma\, y + \beta\\
y' &= -u\alpha\, y + s\beta - t\gamma\, x
\end{aligned}
\]

and we want to choose these parameters so we get positive protein levels. The coefficient matrix A here is

\[
A = \begin{bmatrix} -\alpha & -\gamma \\ -t\gamma & -u\alpha \end{bmatrix}
\]
For A to have negative eigenvalues, so the homogeneous solutions decay, we need det(A) = uα² − tγ² > 0 (the trace −α − uα is already negative). This implies

\[
t < \frac{u\alpha^2}{\gamma^2}.
\]
Next, we set up the growth levels. We want the particular solution to have positive components. If we let the particular solution be X_p, we know X_p = −A^{-1} F where F is the vector of constant external inputs. Hence, we want −A^{-1} F to have positive components. Thus,

\[
-\frac{1}{\det(A)} \begin{bmatrix} -u\alpha & \gamma \\ t\gamma & -\alpha \end{bmatrix} \begin{bmatrix} \beta \\ s\beta \end{bmatrix}
> \begin{bmatrix} 0 \\ 0 \end{bmatrix}.
\]

Since det(A) > 0, this requires

\[
-u\alpha\beta + s\beta\gamma < 0, \qquad t\beta\gamma - s\alpha\beta < 0.
\]

Cancelling β, we find

\[
-u\alpha + s\gamma < 0 \;\Rightarrow\; s\gamma < u\alpha, \qquad
t\gamma - s\alpha < 0 \;\Rightarrow\; t\gamma < s\alpha.
\]

Plugging in t, we want

\[
s\gamma < u\alpha \qquad\text{and}\qquad 0.5\, u\, \frac{\alpha^2}{\gamma} < s\alpha.
\]

Hence, any s strictly between 0.5 uα/γ and uα/γ will work. Parameterize this range as s = u(α/γ)(1 − 0.5z), which at z = 0 gives uα/γ and at z = 1 gives 0.5 uα/γ. We will choose z = 1/3. The commands in Matlab are then
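Those commands are not shown above; a minimal sketch of the parameter choices just described, using the values from the example later in this section, might be:

% hedged sketch of the parameter choices described above
alpha = 0.001; gamma = 0.001; u = 8; beta = 10;
t = 0.5*alpha^2*u/(gamma^2);      % guarantees t < u*alpha^2/gamma^2
z = 1/3;
s = (1 - 0.5*z)*alpha*u/gamma;    % s between 0.5*u*alpha/gamma and u*alpha/gamma
A = [-alpha, -gamma; -t*gamma, -u*alpha];
F = [beta; s*beta];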
We can easily write MatLab code to do all of this in a function we'll call twoprotsGetAF. It will return a good A and F for our choices of α, γ, u and β.
Listing 9.26: Finding the A and F for a two protein model: uncommented
function [A, F] = twoprotsGetAF(alpha, gamma, u, beta)
%
% -alpha is decay rate for protein x
% -gamma is decay rate for protein y influencing protein x
% -t*gamma is decay rate for protein x influencing protein y
% -u*alpha is decay rate for protein y
% beta is the growth rate for protein x
% s*beta is the growth rate for protein y
% A is the dynamics
%
t = 0.5*alpha^2*u/(gamma^2)
A = [-alpha, -gamma; -t*gamma, -u*alpha];
% set up growth levels
z = 1/3;
s = (1 - 0.5*z)*alpha*u/gamma
F = [beta; s*beta];
The section above shows us how to design the two protein system so the protein levels are always positive, so let's try it out. Let α = γ = a for any positive a and u = 8. Then for our nice choice of t, we could do the work by hand and find

\[
A = \begin{bmatrix} -a & -a \\ -4a & -8a \end{bmatrix}
\]

But let's be smarter and let MatLab do this for us! We can do this for any positive a and generate a reasonable two protein model to play with. In MatLab, we begin by setting up the matrix A with twoprotsGetAF. We will do this for β = 10.
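The call that produced the output below is not shown; consistent with that output, it would have been something like:

% hedged reconstruction of the call, consistent with the printed A and F below
[A, F] = twoprotsGetAF(0.001, 0.001, 8, 10)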
A =
  -0.0010000  -0.0010000
  -0.0040000  -0.0080000

F
F =
   10.000
   66.667
Then we find the inverse of A using the formulae we developed in class for the inverse of a 2 × 2 matrix of numbers:

\[
A^{-1} = \begin{bmatrix} -2000 & 250 \\ 1000 & -250 \end{bmatrix}.
\]

The particular solution is then

\[
X_P = -A^{-1} F = \begin{bmatrix} 3333.67 \\ 6666.67 \end{bmatrix}.
\]
Now we find the eigenvalues and eigenvectors of the matrix A using the eig command as we have done before. Recall, this command returns the eigenvectors as columns of a matrix V and the eigenvalues as the diagonal entries of the matrix D.

V =
   0.88316   0.13163
  -0.46907   0.99130

D =
Diagonal Matrix
  -4.6887e-04            0
            0  -8.5311e-03
Notice that since eigenvector one and eigenvector two are the columns of the matrix V, we can rewrite the general solution in matrix–vector form as

\[
\begin{bmatrix} x(t) \\ y(t) \end{bmatrix}
= V \begin{bmatrix} C_1 e^{\lambda_1 t} \\ C_2 e^{\lambda_2 t} \end{bmatrix} + X_P.
\]

Our protein models start with zero levels of proteins, so to satisfy the initial conditions, we have to solve

\[
\begin{bmatrix} 0 \\ 0 \end{bmatrix}
= C_1 \begin{bmatrix} 0.88316 \\ -0.46907 \end{bmatrix}
+ C_2 \begin{bmatrix} 0.13163 \\ 0.99130 \end{bmatrix}
+ \begin{bmatrix} 3333.67 \\ 6666.67 \end{bmatrix},
\]

which gives

\[
\begin{bmatrix} C_1 \\ C_2 \end{bmatrix} = \begin{bmatrix} -2589.4 \\ -7950.4 \end{bmatrix}.
\]
Then we can construct the solutions x(t) and y(t) so we can plot the protein concen-
trations as a function of time. First, store the eigenvalues in a more convenient form
by grabbing the diagonal entries of D using Ev = diag(D). Then, we construct
the solutions like this:
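The construction itself is not reproduced above; a minimal sketch, assuming V, D, C and the particular solution (stored here as Xp) are as computed above, is:

% hedged sketch of constructing the solutions from V, D, C and Xp above
Ev = diag(D);                     % eigenvalues as a column vector
x = @(t) V(1,1)*C(1)*exp(Ev(1)*t) + V(1,2)*C(2)*exp(Ev(2)*t) + Xp(1);
y = @(t) V(2,1)*C(1)*exp(Ev(1)*t) + V(2,2)*C(2)*exp(Ev(2)*t) + Xp(2);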
In summary, to solve the two protein problem, you would just type a few lines in
MatLab as follows:
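Those few lines are not reproduced; a compact hedged sketch of the whole pipeline, under the same assumptions as above, might be:

% hedged summary sketch of the two protein calculation
[A, F] = twoprotsGetAF(0.001, 0.001, 8, 10);
Xp = -A\F;              % long term protein levels
[V, D] = eig(A);
C = V \ (-Xp);          % constants giving zero initial protein levels
Ev = diag(D);
% x(t) and y(t) are then built from V, C, Ev and Xp as in the sketch above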
The last thing is to think about response times. Here x grows a lot slower by design, as we set u = 8, and so we could temporarily think of x as ≈ 0, giving the y dynamics y' ≈ −uα y + sβ. Thus, the response time for y is about t_r^y = ln(2)/(uα), which is about 86.6 here.
Note the large difference in time scales! The protein levels we converge to are x = 3333.67 and y = 6666.67, but the response time for the x protein is much longer than the response time for the y protein. So if we plot over different time scales, we should see primarily y protein until we reach a suitable fraction of the x protein response time. Figure 9.11 shows a plot over a short time scale, 10 t_r^y. Over this short time scale, protein y is essentially saturated but protein x is only about 1/3 of its equilibrium value.
Figure 9.12 shows a plot over a long time scale, 10 t_r^x, which is about 80 t_r^y. This allows x to get close to its asymptotic value. Note the long term value of y drops some from its earlier peak. Our response times here are just approximations and the
Fig. 9.11 Phase plane for two interacting proteins on a short time scale
Fig. 9.12 Phase plane for two interacting proteins on a long time scale
protein interactions of the model eventually take effect and the equilibrium value of y drops to its final particular solution value.
9.7.3 Project
Now we are ready for the project. For the model choice of α = 0.003, γ = 0.005
and u = 8 and β = 0.01, we expect x to grow a lot slower. Your job is to follow the discussion above and generate a nice report. The report has this format.
Introduction (5 Points) Explain what we are trying to do here with this model.
Contrast this model with two proteins to our earlier model for one protein.
Description of Model (10 Points) Describe carefully what each term in the model
represents in terms of protein transcription. Since these models are abstractions
of reality, explain how these models are a simplified version of the protein tran-
scription process.
Annotated Solution Discussion (27 Points) In this part, you solve the model using
MatLab as I did in the example above. As usual, explain your steps nicely. Cal-
culate the approximate response times for protein x and protein y and use them
to generate appropriate plots. For each plot you generate, provide useful labels,
legends and titles and include them in your document.
Conclusion (5 Points) Discuss what you have done here. Do you think you could
do something similar for more than two proteins using MatLab?
References (3 Points) Put any reference material you use in here.
Now do it all again, but this time set β = 2 with u = 0.1. This should switch the
short term and long term protein behavior.
This is a simple model of how the ocean responds to the various inputs that give rise
to greenhouse warming. The ocean is modeled as two layers: the top layer is shallow
and is approximately 100 m in depth while the second layer is substantially deeper
as it is on average 4000 m in depth. The top layer is called the mixed layer as it
interacts with the atmosphere and so there is a transfer of energy back and forth from
the atmosphere to the top layer. The bottom layer is the deep layer and it exchanges
energy only with the mixed layer. The model we use is as follows:
\[
\begin{aligned}
C_m\, T_m' &= F - \lambda_1 T_m - \lambda_2 (T_m - T_d)\\
C_d\, T_d' &= \lambda_2 (T_m - T_d)
\end{aligned}
\]
There are a lot of parameters here and they all have a physical meaning.
t: This is time.
Tm : This is the temperature of the mixed layer. We assume it is measured as the
deviation from the mixed layer’s equilibrium temperature. Hence, Tm = 1 would
mean the temperature has risen 1◦ from its equilibrium temperature. We therefore
set Tm = 0 initially and when we solve our model, we will know how much Tm
has gone up.
Td : This is the temperature of the deep layer. Again, this is measured as the
difference from the equilibrium temperature of the deep layer. We set Td = 0
initially also.
F: This is the external input which represents all the outside things that contribute
to global warming such as CO2 release and so forth. It does not have to be a constant
but in our project we will use a constant value for F.
Cm : This is the heat capacity of the mixed layer.
Cd : This is the heat capacity of the deep layer.
λ1 : This is the exchange coefficient which determines the rate at which heat is
transferred from the mixed layer to the atmosphere.
λ2 : This is the exchange coefficient which determines the rate at which heat is
transferred from the mixed layer to the deep layer.
Looking at the mixed layer dynamics, we see there are two loss terms for the mixed
layer temperature. The first is based on the usual exponential decay model using the
exchange coefficient λ1 which specifies how fast heat is transferred from the mixed
layer to the atmosphere. The second loss term models how heat is transferred from the
mixed layer to the deep layer. We assume this rate is proportional to the temperature
difference between the layers which is why this loss is in terms of Tm − Td . The
deep layer temperature dynamics is all growth as the deep layer is picking up energy
from the mixed layer above it. Note this type of modeling—a loss in Tm and the
loss written as a gain for Td —is exactly what we will do in the SIR disease model
that comes later. Our reference for this model is the nice book on climate modeling
(Vallis 2012). It is a book all of you can read with profit as it uses mathematics you
now know very well. We can rewrite the dynamics into the standard form with a little
manipulation.
\[
\begin{bmatrix} T_m' \\ T_d' \end{bmatrix}
= \begin{bmatrix} -(\lambda_1 + \lambda_2)/C_m & \lambda_2/C_m \\ \lambda_2/C_d & -\lambda_2/C_d \end{bmatrix}
\begin{bmatrix} T_m \\ T_d \end{bmatrix}
+ \begin{bmatrix} F/C_m \\ 0 \end{bmatrix}
\]
Hence, this model is another of our standard linear models with an external input such as we solved in the two protein model. The A matrix here is

\[
A = \begin{bmatrix} -(\lambda_1 + \lambda_2)/C_m & \lambda_2/C_m \\ \lambda_2/C_d & -\lambda_2/C_d \end{bmatrix}
\]

and letting F denote the external input vector and T denote the vector of layer temperatures, we see the two box climate model is represented by our usual dynamics T' = A T + F which we know how to solve.
Note the particular solution here is T_P = −A^{-1} F:

\[
T_P = -\frac{1}{\det(A)}
\begin{bmatrix} -\lambda_2/C_d & -\lambda_2/C_m \\ -\lambda_2/C_d & -(\lambda_1 + \lambda_2)/C_m \end{bmatrix}
\begin{bmatrix} F/C_m \\ 0 \end{bmatrix}
\]
So, if both eigenvalues of this model are negative, the long term equilibrium of both the mixed and deep layers is the same: T_d^∞ = T_m^∞ = F/λ_1. Next, we show the eigenvalues are indeed negative here.
For convenience of exposition, let α = λ1 /Cm , β = λ2 /Cm and γ = λ2 /Cd .
Also, it is helpful to express Cd as a multiple of Cm , so we write Cd = ρCm where
ρ is our multiplier. With these changes, the model can be rewritten. We find

\[
\begin{bmatrix} T_m' \\ T_d' \end{bmatrix}
= \begin{bmatrix} -(\alpha + \beta) & \beta \\ \gamma & -\gamma \end{bmatrix}
\begin{bmatrix} T_m \\ T_d \end{bmatrix}
+ \begin{bmatrix} F/C_m \\ 0 \end{bmatrix}
\]

But since γ = λ_2/C_d and C_d = ρ C_m, we have γ = β/ρ. This gives the new form

\[
\begin{bmatrix} T_m' \\ T_d' \end{bmatrix}
= \begin{bmatrix} -(\alpha + \beta) & \beta \\ \beta/\rho & -\beta/\rho \end{bmatrix}
\begin{bmatrix} T_m \\ T_d \end{bmatrix}
+ \begin{bmatrix} F/C_m \\ 0 \end{bmatrix}
\]

Hence, any two box climate model can be represented by the dynamics T' = A T + F where

\[
A = \begin{bmatrix} -\alpha - \beta & \beta \\ \beta/\rho & -\beta/\rho \end{bmatrix}
\]
The characteristic equation of A is

\[
\lambda^2 + \Bigl(\alpha + \beta + \frac{\beta}{\rho}\Bigr)\lambda + \frac{\alpha\beta}{\rho} = 0,
\qquad\text{so}\qquad
\lambda = \frac{-\bigl(\alpha + \beta + \beta/\rho\bigr) \pm \sqrt{D}}{2}.
\]

The eigenvalue corresponding to the minus sign is clearly negative. Next, look at the other root. The discriminant D is the term inside the square root. Let's show it is positive and that will help us show the other root is negative also. We have

\[
\begin{aligned}
D &= \bigl(\alpha + \beta + \beta/\rho\bigr)^2 - 4\alpha\beta/\rho\\
&= (\alpha + \beta)^2 + 2(\alpha + \beta)(\beta/\rho) + \beta^2/\rho^2 - 4\alpha\beta/\rho\\
&= (\alpha + \beta)^2 - 2\alpha\beta/\rho + 2\beta^2/\rho + \beta^2/\rho^2\\
&= \alpha^2 + 2\alpha\beta + \beta^2 + 2\beta^2/\rho + \beta^2/\rho^2 - 2\alpha\beta/\rho.
\end{aligned}
\]

So we see D > 0: the terms 2αβ − 2αβ/ρ = 2αβ(1 − 1/ρ) are nonnegative when ρ ≥ 1 (the deep layer has the larger heat capacity), and every other term is positive. Moreover, D = (α + β + β/ρ)² − 4αβ/ρ < (α + β + β/ρ)², so √D < α + β + β/ρ and the root with the plus sign is negative as well. Hence, the eigenvalues r_1 and r_2 are both negative and the general solution is
\[
\begin{bmatrix} T_m \\ T_d \end{bmatrix}
= a\, E_1\, e^{r_1 t} + b\, E_2\, e^{r_2 t} + \begin{bmatrix} F/\lambda_1 \\ F/\lambda_1 \end{bmatrix}
\]
This is our familiar protein synthesis model with steady state value T̂_m^∞ = F/(λ_1 + λ_2) and response time t_R^m = C_m ln(2)/(λ_1 + λ_2). After sufficient response times have passed for the mixed layer temperature to reach quasi equilibrium, we will have T_m' ≈ 0 as T_m is no longer changing by much. Hence, setting T_m' = 0, we find a relationship between T_m and T_d once this new equilibrium is achieved. We have

\[
-(\lambda_1 + \lambda_2) T_m + F + \lambda_2 T_d = 0,
\]

which tells us that T_m = (λ_2 T_d + F)/(λ_1 + λ_2). Substitute this value of T_m into the T_d dynamics and we will get the final equilibrium value of T_d for time much larger than t_R^m. The deep layer temperature dynamics now become

\[
C_d\, T_d' = \lambda_2 \Bigl( \frac{\lambda_2 T_d + F}{\lambda_1 + \lambda_2} - T_d \Bigr)
= \frac{\lambda_2}{\lambda_1 + \lambda_2}\bigl( F - \lambda_1 T_d \bigr).
\]
This is also a standard protein synthesis problem with a steady state value of

\[
\hat{T}_d^\infty = F\, \frac{\lambda_2}{C_d(\lambda_1 + \lambda_2)} \Bigm/ \frac{\lambda_1 \lambda_2}{C_d(\lambda_1 + \lambda_2)}
= \frac{F}{\lambda_1}
\]

with response time

\[
t_R^d = \frac{\ln(2)\, C_d\, (\lambda_1 + \lambda_2)}{\lambda_1 \lambda_2},
\]
and as t gets large, both temperatures approach the common steady state value of
F/λ1 .
So with a little reasoning and a lot of algebra we find the two response times of
the climate model are t Rm and t Rd as given above. To see these results in a specific
problem, we wrote a MatLab script to solve a typical climate problem. For the values
λ1 = 0.1, λ2 = 0.15000, Cm = 1.0820, Cd = 4.328 = 4Cm so that ρ = 4 and an
external input of F = 0.5, we can compute the solutions to the climate model as
follows
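The script itself is not included above; here is a minimal sketch consistent with those parameter values (assumptions: variable names and the plotting window are illustrative).

% hedged sketch of a two box climate session for the stated parameter values
lambda1 = 0.1; lambda2 = 0.15; Cm = 1.0820; Cd = 4.328; F = 0.5;
A = [-(lambda1+lambda2)/Cm, lambda2/Cm; lambda2/Cd, -lambda2/Cd];
Fvec = [F/Cm; 0];
Tp = -A\Fvec;                 % long term temperatures, both equal to F/lambda1
[V, D] = eig(A);
C = V \ (-Tp);                % zero initial temperature deviations
Ev = diag(D);
Tm = @(t) V(1,1)*C(1)*exp(Ev(1)*t) + V(1,2)*C(2)*exp(Ev(2)*t) + Tp(1);
Td = @(t) V(2,1)*C(1)*exp(Ev(1)*t) + V(2,2)*C(2)*exp(Ev(2)*t) + Tp(2);
tRm = Cm*log(2)/(lambda1+lambda2)                       % about 3 years
tRd = log(2)*Cd*(lambda1+lambda2)/(lambda1*lambda2)     % about 50 years
t = linspace(0, 200, 401);
plot(t, Tm(t), '-b', t, Td(t), '-r');
xlabel('years'); ylabel('temperature rise');
legend('T_m', 'T_d', 'Location', 'Best');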
We see the approximate response time of the mixed layer is only 3 years but the
approximate response time for the deep layer is about 50 years. This MatLab session
generates three plots as shown in Figs. 9.13, 9.14 and 9.15.
9.8.1 Project
Now we are ready for the project. Solve the two box climate model with λ1 = 0.01,
λ2 = 0.02000, Cm = 0.34625, ρ = 2.2222 (so that Cd = ρCm ) and an external
input of F = 0.06. For this model, follow what I did in the example and generate a
report in word as follows:
Introduction (8 Points) Explain what we are trying to do here with this model.
The project description here tells you how this model works. Make sure you
understand all of the steps so that you can explain what this model really means
for policy decisions on global warming.
Description of Model (12 Points) Find additional references and use them to build
a two page description of this kind of simple climate model. Pay particular atten-
tion to a discussion of what the parameters in the model mean and what the
There is talk currently about strategies to reduce carbon loading via a genetically
engineered bacteria or a nano machine. Such a strategy is called carbon sequestering
and it is not clear if this is possible. If you have read or watched much science fiction,
you’ll recognize carbon sequestering as a type of terraforming. This is certainly
both ambitious and no doubt a plan having a lot of risk that needs to be thought
about carefully. However, if we assume such a control strategy for carbon loading
is implemented say 25 years in the future, what does our simple model say? The analysis is straightforward. At 25 years, our model reaches values T_m^{25} and T_d^{25} which we know are far from the true equilibrium values of ≈ 5° of global warming in our example. If the sequestering strategy were instantly successful, this would correspond to setting F = 0 in our external data. This gives a straight exponential decay system

\[
\begin{bmatrix} T_m' \\ T_d' \end{bmatrix}
= \begin{bmatrix} -(\lambda_1 + \lambda_2)/C_m & \lambda_2/C_m \\ \lambda_2/C_d & -\lambda_2/C_d \end{bmatrix}
\begin{bmatrix} T_m \\ T_d \end{bmatrix},
\qquad
\begin{bmatrix} T_m(25) \\ T_d(25) \end{bmatrix}
= \begin{bmatrix} T_m^{25} \\ T_d^{25} \end{bmatrix}.
\]
The eigenvalues and eigenvectors for this system are the same as before, and we know the slow response time is a half life of about 50 years:

\[
t_R^d \approx \frac{\ln(2)\, C_d\, (\lambda_1 + \lambda_2)}{\lambda_1 \lambda_2} \approx 50.
\]
It doesn’t really matter at what point carbon loading stops. So we can do this analysis
whether carbon loading stops 25 years in the future or 50 years. The deep layer still
takes a long time to return to a temperature of 0 which for us represents current
temperature and thus no global warming. Note the half life t Rd ≈ 50 is what drives
this. The constants λ1 and λ2 represent exchange rates between the mixed layer and
the atmosphere and the deep layer and the mixed layer. The constant Cd is the heat
capacity of the deep layer. If we also assumed carbon sequestering continued even
after carbon loading was shut off, perhaps this would correspond to increasing the
two critical parameters λ1 and λ2 and decreasing Cd . Since
\[
t_R^d \approx \ln(2)\, C_d \Bigl( \frac{1}{\lambda_1} + \frac{1}{\lambda_2} \Bigr),
\]
such changes would decrease t Rd and bring the deep layer temperature to 0 faster. It
is conceivable administering nano machines and gene engineered bacteria might do
this but it is a strategy that would alter fundamental balances that we have had in
place for millions of years. Hence, it is hard to say if this is wise; certainly further
study is needed. But the bottom line is the carbon sequestering strategies will not
easily return the deep layer to a state of no global increase in temperature quickly.
Note we could also assume the carbon sequestering control strategy alters carbon loading as a simple exponential decay F e^{−εt} for some positive ε. Then our model becomes

\[
\begin{bmatrix} T_m' \\ T_d' \end{bmatrix}
= \begin{bmatrix} -(\lambda_1 + \lambda_2)/C_m & \lambda_2/C_m \\ \lambda_2/C_d & -\lambda_2/C_d \end{bmatrix}
\begin{bmatrix} T_m \\ T_d \end{bmatrix}
+ \begin{bmatrix} F e^{-\epsilon t} \\ 0 \end{bmatrix},
\qquad
\begin{bmatrix} T_m(25) \\ T_d(25) \end{bmatrix}
= \begin{bmatrix} T_m^{25} \\ T_d^{25} \end{bmatrix},
\]
which we could solve numerically. Here, the carbon loading would not instanta-
neously drop to zero at some time like 25; instead it decays gracefully. However, this
still does not change the fact that the response time is determined by the coefficient
matrix A and so the return to a no global warming state will be slow. Finally, it is
sobering to think about how all of this analysis plays out in the backdrop of political
structures that remain in authority for about 4 years or so. We can see how hard it is
to elicit change when the results won’t show up for 10–40 administrations!
Reference
G. Vallis (ed.), Climate and the Oceans, Princeton Primers on Climate (Princeton University Press,
Princeton, 2012)
Part IV
Interesting Models
Chapter 10
Predator–Prey Models
In the 1920s, the Italian biologist Umberto D’Ancona studied population variations
of various species of fish that interact with one another. He came across the data
shown in Table 10.1.
Here, we interpret the percentage we see in column two of Table 10.1 as predator
fish, such as sharks, skates and so forth. Also, the catches used to calculate these
percentages were reported from all over the Mediterranean. The tonnage from all
the different catches for the entire year were then added and used to calculate the
percentages in the table. Thus, we can also calculate the percentage of catch that was
food by subtracting the predator percentages from 100%. This leads to
what we see in Table 10.2.
D’Ancona noted the time period coinciding with World War One, when fishing
was drastically cut back due to military actions, had puzzling data. Let’s highlight
this in Table 10.3. D’Ancona expected both food fish and predator fish to increase
when the rate of fishing was cut back. But in these war years, there is a substantial
increase in the percentage of predators caught at the same time the percentage of
food fish went down. Note, we are looking at percentages here. Of course, the raw
tonnage of fish caught went down during the war years, but the expectation was
that since there is reduced fishing, there should be a higher percentage of food fish
because they have not been harvested. D’Ancona could not understand this, so he
asked the mathematician Vito Volterra for help.
10.1 Theory
Volterra approached the modeling this way. He let the variable x(t) denote the pop-
ulation of food fish and y(t), the population of predator fish at time t. He was
constructing what you might call a coarse model. The food fish are not divided into categories like halibut and mackerel with a separate variable for each, and the predators are also not divided into different classes like sharks, squids and so forth. Hence,
instead of dozens of variables for both the food and predator population, everything
was lumped together. Following Volterra, we make the following assumptions:
1. The food population grows exponentially. Letting x_g' denote the growth rate of the food fish, we must have

\[
x_g' = a\, x.
\]

2. The number of contacts per unit time between predators and prey is proportional to the product of their populations. We assume the food fish are eaten by the predators at a rate proportional to this contact rate. Letting the decay rate of the food be denoted by x_d', we see

\[
x_d' = -b\, x\, y.
\]

Combining, the food fish dynamics are

\[
x' = a\, x - b\, x\, y
\]
for some positive constants a and b. He made assumptions about the predators as
well.
1. Predators naturally die following an exponential decay; letting this decay rate be given by y_d', we have

\[
y_d' = -c\, y.
\]

2. Predator growth is proportional to the number of contacts per unit time with the food fish; letting this growth rate be y_g', we have

\[
y_g' = d\, x\, y.
\]

Combining, the predator dynamics are

\[
y' = -c\, y + d\, x\, y
\]

for some positive constants c and d. The full Volterra model is thus

\[
\begin{aligned}
x' &= a\, x - b\, x\, y \qquad &(10.1)\\
y' &= -c\, y + d\, x\, y \qquad &(10.2)\\
x(0) &= x_0 \qquad &(10.3)\\
y(0) &= y_0 \qquad &(10.4)
\end{aligned}
\]
Equations 10.1 and 10.2 give the dynamics of this system. Note these are nonlinear
dynamics, the first we have seen since the logistics model. Equations 10.3 and 10.4
are the initial conditions for the system. Together, these four equations are called a
Predator–Prey system. Since Volterra’s work, this model has been applied in many
other places. A famous example is the wolf–moose predator–prey system which has
been extensively modeled for Isle Royale in Lake Superior. We are now going to analyze this model. We have been inspired by the analysis given in Braun (1978), but
Braun can use a bit more mathematics in his explanations and we will try to use only
calculus ideas.
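Before the qualitative analysis, here is a hedged numerical sketch (not from the text) of how such a system can be simulated with the FixedRK tool from the earlier chapters; the parameter values match the worked example used later in this chapter and the initial populations are purely illustrative.

% hedged sketch: simulate a Predator-Prey model numerically
a = 2; b = 5; c = 6; d = 3;                  % values from the worked example below
f = @(t,p) [a*p(1) - b*p(1)*p(2); -c*p(2) + d*p(1)*p(2)];
p0 = [1; 1];                                 % hypothetical initial populations
h = 0.01; T = 10; N = ceil(T/h);
[time, rk] = FixedRK(f, 0, p0, h, 4, N);
plot(rk(1,:), rk(2,:));                      % trajectory in the x-y (phase) plane
xlabel('food fish x'); ylabel('predators y');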
Once we obtain a solution (x, y) to the Predator–Prey problem, we have two nice
curves x(t) and y(t) defined for all non negative time t. As we did in Chap. 8, if we
graph in the x–y plane the ordered pairs (x(t), y(t)), we will draw a curve C where
any point on C corresponds to an ordered pair (x(t), y(t)) for some time t. At t = 0,
we are at the point (x0 , y0 ) on C . Hence, the initial conditions for the Predator–Prey
problem determine the starting point on C . As time increases, the pairs (x(t), y(t))
move in the direction of the tangent line to C . If we knew the algebraic sign of the
derivatives x and y at any point on C , we could decide the direction in which we
are moving along the curve C . So we begin our analysis by looking at the curves in
the x–y plane where x and y become 0. From these curves, we will be able to find
out the different regions in the plane where each is positive or negative. From that,
we will be able to decide in which direction a point moves along the curve.
out the different regions in the plane where each derivative is positive or negative.

The pairs (x(t^*), y(t^*)) satisfying 0 = a x(t^*) − b x(t^*) y(t^*), or

\[
0 = x(t^*)\bigl( a - b\, y(t^*) \bigr),
\]

are the ones where the rate of change of the food fish will be zero. Now these pairs can correspond to many different time values t^*, so what we really need to do is to find all the (x, y) pairs where this happens. Since this is a product, there are two possibilities: x = 0, the y axis, and y = a/b, a horizontal line.
In a similar way, the pairs (x, y) where y' becomes zero satisfy the equation

\[
0 = y\bigl( -c + d\, x \bigr).
\]

Again, there are two possibilities: y = 0, the x axis, and x = c/d, a vertical line.
10.2.2.1 An Example
Just like we did in Chap. 8, we find the parts of the x–y plane where the algebraic signs of x' and y' are (+, +), (+, −), (−, +) and (−, −). As usual, the set of (x, y) pairs where x' = 0 is called the nullcline for x; similarly, the points where y' = 0 form the nullcline for y. The x' = 0 equation gives us the y axis and the horizontal line y = a/b, while the y' = 0 equation gives the x axis and the vertical line x = c/d. The x dynamics thus divide the plane into three pieces: the part where x' > 0; the part where x' = 0; and the part where x' < 0.
We have already determined the nullclines for this model as shown in Figs. 10.1 and 10.2. We combine the x' and y' nullcline information to create a map of how x' and y' change sign in the x–y plane.
• Regions I, II, III and IV divide Quadrant 1.
• We will show there are trajectories moving down the positive y axis and out along
the positive x axis.
• Thus, a trajectory that starts in Quadrant 1 with positive initial conditions can’t
cross the trajectory on the positive x axis or the trajectory on the positive y axis.
• Thus, the Predator–Prey trajectories that start in Quadrant 1 with positive initial
conditions will stay in Quadrant 1.
This is shown in Fig. 10.3.
Fig. 10.4 Finding where x' < 0 and x' > 0 for the predator–prey model
Fig. 10.5 Finding where y' < 0 and y' > 0 for the predator–prey model
• The factor −c + d x is positive when −c + d x > 0 or when x > c/d. Hence, the
factor is negative when x < c/d.
• The factor y is positive when y > 0 and negative when y < 0.
• So the combination y(−c + d x) has a sign that can also be determined easily.
In Fig. 10.5, we show how the y nullcline divides the x–y plane into three pieces as
well.
The shaded areas shown in Figs. 10.4 and 10.5 can be combined into Fig. 10.6. In each region, x' and y' are either positive or negative. Hence, each region can be marked with an ordered pair, (x' ±, y' ±).
We drew trajectories for the linear system models already without a lot of background
discussion. Now we’ll go over it again in more detail. We use the algebraic signs
of x' and y' to determine this. For example, if we are in Region I, the sign of x' is negative and the sign of y' is positive. Thus, the variable x decreases and the variable y increases in this region. So if we graphed the ordered pairs (x(t), y(t)) in the x–y plane for all t > 0, we would plot a y versus x curve. That is, we would have y = f(x) for some function of x. Note that, by the chain rule,

\[
\frac{dy}{dt} = f'(x)\, \frac{dx}{dt}.
\]

Hence, as long as x' is not zero (and this is true in Region I!), we have at each time t that the slope of the curve y = f(x) is given by

\[
\frac{df}{dx}(t) = \frac{y'(t)}{x'(t)}.
\]
Since our pair (x, y) is the solution to a differential equation, we expect that x and y
both are continuously differentiable with respect to t. So if we draw the curve for y
versus x in the x–y plane, we do not expect to see a corner in it (as a corner means
the derivative fails to exist). So we can see three possibilities:
• a straight line as x equals y at each t meaning the slope is always the same,
To analyze this nonlinear model, we need a fact from more advanced courses. For
these kinds of nonlinear models, trajectories that start at different initial conditions
can not cross.
\[
\begin{aligned}
x' &= a\, x - b\, x\, y \qquad &(10.5)\\
y' &= -c\, y + d\, x\, y \qquad &(10.6)\\
x(0) &= x_0 \qquad &(10.7)\\
y(0) &= y_0 \qquad &(10.8)
\end{aligned}
\]

Let's begin by looking at a trajectory that starts on the positive y axis. We therefore need to solve the system

\[
\begin{aligned}
x' &= a\, x - b\, x\, y \qquad &(10.9)\\
y' &= -c\, y + d\, x\, y \qquad &(10.10)\\
x(0) &= 0 \qquad &(10.11)\\
y(0) &= y_0 > 0 \qquad &(10.12)
\end{aligned}
\]
It is easy to guess the solution is the pair (x(t), y(t)) with x(t) = 0 always and y(t) satisfying y' = −c y(t). Hence,

\[
y(t) = y_0\, e^{-ct}
\]

and the trajectory moves down the positive y axis, always decreasing as t increases. Next, consider a trajectory that starts on the positive x axis. We need to solve the system
\[
\begin{aligned}
x' &= a\, x - b\, x\, y \qquad &(10.13)\\
y' &= -c\, y + d\, x\, y \qquad &(10.14)\\
x(0) &= x_0 > 0 \qquad &(10.15)\\
y(0) &= 0 \qquad &(10.16)
\end{aligned}
\]

Again, it is easy to guess the solution is the pair (x(t), y(t)) with y(t) = 0 always and x(t) satisfying x' = a x(t). Hence,

\[
x(t) = x_0\, e^{at}
\]
and the trajectory moves along the positive x axis always increasing as t increases.
Since trajectories can't cross other trajectories, this tells us a trajectory that begins in Quadrant 1 with a positive (x_0, y_0) can't hit the x axis or the y axis in a finite amount of time, because if it did, we would have two trajectories crossing.
This is good news for our biological model. Since we are trying to model food
and predator interactions in a real biological system, we always start with initial
conditions (x0 , y0 ) that are in Quadrant One. It is very comforting to know that these
solutions will always remain positive and, therefore, biologically realistic. In fact, it
doesn’t seem biologically possible for the food or predators to become negative, so
if our model permitted that, it would tell us our model is seriously flawed! Hence, for
our modeling purposes, we need not consider initial conditions that start in Regions
V–IX. Indeed, if you look at Fig. 10.7, you can see that a solution trajectory could
only hit the y axis from Region II. But that can’t happen as if it did, two trajectories
would cross! Also, a trajectory could only hit the x axis from a start in Region III.
Again, since trajectories can’t cross, this is not possible either. So, a trajectory that
starts in Quadrant 1, stays in Quadrant 1—kind of has a Las Vegas feel, doesn't it?
10.3.4 Homework
For the following problems, find the x and y nullclines and sketch using multiple
colors, the algebraic sign pairs (x', y') the nullclines determine in the x–y plane.
Exercise 10.3.1
Exercise 10.3.2
Exercise 10.3.3
Exercise 10.3.4
Exercise 10.3.5
So we can assume that for a start in Quadrant 1, the solution pair is always positive.
Let’s see how far we can get with a preliminary mathematical analysis. We can
analyze these trajectories like this. For convenience, assume we start in Region II
and the resulting trajectory hits the y = a/b line at some time t^*. At that time, we will have x'(t^*) = 0 and y'(t^*) < 0. We show this situation in Fig. 10.8.
Look at the Predator–Prey model dynamics for 0 ≤ t < t^*. Since all variables are positive and their derivatives are not zero for these times, we can look at the fraction y'(t)/x'(t).
10.4.1 Example
Consider the model x' = 2x − 5xy, y' = −6y + 3xy, so a = 2, b = 5, c = 6 and d = 3.

• Rewrite as y'/x':

\[
\frac{y'(t)}{x'(t)} = \frac{y(t)\bigl(-6 + 3x(t)\bigr)}{x(t)\bigl(2 - 5y(t)\bigr)}.
\]

• Put all the y stuff on the left and all the x stuff on the right:

\[
\frac{2 - 5y(t)}{y(t)}\; y'(t) = \frac{-6 + 3x(t)}{x(t)}\; x'(t).
\]

• Integrate both sides from 0 to t and evaluate:

\[
2 \ln\Bigl( \frac{y(t)}{y_0} \Bigr) - 5\bigl(y(t) - y_0\bigr)
= -6 \ln\Bigl( \frac{x(t)}{x_0} \Bigr) + 3\bigl(x(t) - x_0\bigr).
\]

• Combine ln terms:

\[
\ln\Bigl( \frac{x(t)}{x_0} \Bigr)^{6} + \ln\Bigl( \frac{y(t)}{y_0} \Bigr)^{2}
= 3\bigl(x(t) - x_0\bigr) + 5\bigl(y(t) - y_0\bigr).
\]

• Put all function terms on the left and all constant terms on the right, then exponentiate:

\[
\frac{x(t)^6\, y(t)^2}{e^{3x(t)}\, e^{5y(t)}} = \frac{x_0^6\, y_0^2}{e^{3x_0}\, e^{5y_0}}.
\]

• We did this analysis for Region II, but it works in all the regions. So this relation holds along the entire trajectory.
• The equation

\[
f(x)\, g(y) = f(x_0)\, g(y_0),
\]

for f(x) = x^6/e^{3x} and g(y) = y^2/e^{5y}, is called the Nonlinear Conservation Law or NLCL for the Predator–Prey model.
Recall, we were looking at the Predator–Prey model dynamics for 0 ≤ t < t ∗ . Since
all variables are positive and their derivatives are not zero for these times, we can
look at the fraction y (t)/x (t).
The equation above will not hold at t^*, however, because at that point x'(t^*) = 0. But for t below that critical value, it is fine to look at this fraction.
• Switching to the variable s for 0 ≤ s < t, for any value t strictly less than our special value t^*, we have

\[
\frac{a - b\, y(s)}{y(s)}\; y'(s) = \frac{-c + d\, x(s)}{x(s)}\; x'(s).
\]

• Now we simplify a lot (remember x_0 and y_0 are positive so absolute values are not needed around them). First, integrate from 0 to t and use a standard logarithm property:

\[
a \ln\Bigl( \frac{y(t)}{y_0} \Bigr) - b\bigl( y(t) - y_0 \bigr)
= -c \ln\Bigl( \frac{x(t)}{x_0} \Bigr) + d\bigl( x(t) - x_0 \bigr).
\]

Then, put all the logarithm terms on the left side and pull the powers a and c inside the logarithms:

\[
\ln\Bigl( \frac{y(t)}{y_0} \Bigr)^{a} + \ln\Bigl( \frac{x(t)}{x_0} \Bigr)^{c}
= b\bigl( y(t) - y_0 \bigr) + d\bigl( x(t) - x_0 \bigr).
\]

• Now exponentiate both sides and use the properties of the exponential function to simplify. We find

\[
\Bigl( \frac{y(t)}{y_0} \Bigr)^{a} \Bigl( \frac{x(t)}{x_0} \Bigr)^{c}
= \frac{e^{b\, y(t)}\, e^{d\, x(t)}}{e^{b\, y_0}\, e^{d\, x_0}}.
\]

The equation

\[
f(x)\, g(y) = f(x_0)\, g(y_0),
\]

for f(x) = x^c/e^{d x} and g(y) = y^a/e^{b y}, is called the Nonlinear Conservation Law or NLCL for the general Predator–Prey model.
10.4.2.1 Approaching t ∗
Now the right hand side is a positive number which for convenience we will call α.
Hence, we have the equation
\[
\frac{(y(t))^a\, (x(t))^c}{e^{b\, y(t)}\, e^{d\, x(t)}} = \alpha
\]

holds for all time t strictly less than t^*. Thus, as we allow t to approach t^* from below, the continuity of our solutions x(t) and y(t) allows us to say

\[
\lim_{t \to t^{*-}} \frac{(y(t))^a\, (x(t))^c}{e^{b\, y(t)}\, e^{d\, x(t)}}
= \frac{(y(t^*))^a\, (x(t^*))^c}{e^{b\, y(t^*)}\, e^{d\, x(t^*)}} = \alpha.
\]
We can do a similar analysis for a trajectory that starts in Region IV and moves up until it hits the y = a/b line where x' = 0. This one will start at an initial point (x_0, y_0) in Region IV and terminate on the y = a/b line at the point (x(t^*), a/b) for some time t^*. In this case, we continue the analysis as before. For any time t < t^*, the variables x(t) and y(t) are positive and their derivatives nonzero. Hence, we can manipulate the Predator–Prey equations just like before to end up with
\[
a \int_0^t \frac{y'(s)}{y(s)}\, ds - b \int_0^t y'(s)\, ds
= -c \int_0^t \frac{x'(s)}{x(s)}\, ds + d \int_0^t x'(s)\, ds.
\]

We integrate in the same way and apply the initial conditions to obtain Eq. 10.17 again:

\[
\frac{(y(t))^a\, (x(t))^c}{e^{b\, y(t)}\, e^{d\, x(t)}} = \frac{y_0^a\, x_0^c}{e^{b\, y_0}\, e^{d\, x_0}}.
\]
Then, taking the limit as t goes to t^*, we see this equation holds at t^* also. Again, label the right hand side as the positive constant α. We then have

\[
\frac{(y(t))^a\, (x(t))^c}{e^{b\, y(t)}\, e^{d\, x(t)}} = \alpha.
\]

We conclude Eq. 10.17 holds for trajectories that start in regions that terminate on the x' = 0 line y = a/b. Since trajectories that start in Regions I and III never have x' become 0, all of the analysis we did above works perfectly. Hence, we can conclude that Eq. 10.17 holds for all trajectories starting at a positive initial point (x_0, y_0) in Quadrant 1.
We know the pairs (x(t), y(t)) are on the trajectory that corresponds to the initial start of (x_0, y_0). Hence, we can drop the time dependence (t) above and write Eq. 10.18, which holds for any (x, y) pair that is on the trajectory:

\[
\frac{y^a\, x^c}{e^{b y}\, e^{d x}} = \frac{y_0^a\, x_0^c}{e^{b y_0}\, e^{d x_0}}. \tag{10.18}
\]

Equation 10.18 is called the Nonlinear Conservation Law associated with the Predator–Prey model.
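As a quick sanity check (not part of the original discussion), one can verify numerically that this conserved quantity stays essentially constant along a computed trajectory; here is a hedged Matlab/Octave sketch using the example values from above.

% hedged sketch: check the NLCL numerically along an RK4 trajectory
a = 2; b = 5; c = 6; d = 3;
f = @(t,p) [a*p(1) - b*p(1)*p(2); -c*p(2) + d*p(1)*p(2)];
p0 = [1; 1];
h = 0.001; T = 5; N = ceil(T/h);
[time, rk] = FixedRK(f, 0, p0, h, 4, N);
X = rk(1,:); Y = rk(2,:);
NLCL = (Y.^a .* X.^c) ./ (exp(b*Y) .* exp(d*X));
max(abs(NLCL - NLCL(1)))      % should be small if the conservation law holds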
Although we have assumed trajectories can't cross, and therefore a trajectory starting in Region II can't hit the y axis for that reason, we can also see this using the nonlinear conservation law. We can do the same derivation for a trajectory starting in Region II with a positive x_0 and y_0, and this time assume the trajectory hits the y axis at a time t^* at the point (0, y_1) with y_1 > 0. We can repeat all of the integration steps to obtain

\[
\frac{(y(t))^a\, (x(t))^c}{e^{b\, y(t)}\, e^{d\, x(t)}} = \frac{y_0^a\, x_0^c}{e^{b\, y_0}\, e^{d\, x_0}}.
\]

This equation holds for all t before t^*. Taking the limit as t goes to t^*, we obtain

\[
\frac{(y(t^*))^a\, (x(t^*))^c}{e^{b\, y(t^*)}\, e^{d\, x(t^*)}}
= \frac{y_1^a \cdot 0^c}{e^{b\, y_1}\, e^{0}} = 0
= \frac{y_0^a\, x_0^c}{e^{b\, y_0}\, e^{d\, x_0}}.
\]
This is not possible, since the right hand side is positive, so we have another way of seeing that a trajectory can't hit the y axis. A similar argument shows a trajectory in Region III can't hit the x axis. We will leave the details of that argument to you!
10.4.4 Homework
For the following Predator–Prey models, derive the nonlinear conservation law. Since our discussions have shown us that the times when x' = 0 in the fraction y'/x' do not give us any trouble, you can derive this law by integrating in the way we have described in this section for the particular values of a, b, c and d in the given model. So you should derive the equation

\frac{y^{a}\, x^{c}}{e^{b y}\, e^{d x}} = \frac{(y(0))^{a}\,(x(0))^{c}}{e^{b\, y(0)}\, e^{d\, x(0)}}.
Exercise 10.4.1
Exercise 10.4.2
Exercise 10.4.3
Exercise 10.4.4
Exercise 10.4.5
From the discussions above, we now know that given an initial start (x_0, y_0) in Quadrant 1 of the x–y plane, the solution to the Predator–Prey system will not leave Quadrant 1. If we piece the various trajectories together for Regions I, II, III and IV, the solution trajectories will either be periodic, spiral in to some center or spiral out to give unbounded motion. These possibilities are shown in Fig. 10.9 (periodic), Fig. 10.10 (spiraling out) and Fig. 10.11 (spiraling in). We want to find out which of these possibilities actually occurs.
We have already defined the functions f and g for all non-negative real numbers by
f(x) = \frac{x^{c}}{e^{d x}}, \qquad g(y) = \frac{y^{a}}{e^{b y}}.
These functions have a very specific look. We can figure this out using a bit of
common sense and some first semester calculus.
10.5.1.1 An Example
We know for any (x_0 > 0, y_0 > 0), the trajectory (x(t), y(t)) satisfies the NLCL where

f(x) = \frac{x^{7}}{e^{3x}}, \qquad g(y) = \frac{y^{8}}{e^{6y}}.
What do f and g look like? Let’s look at f first. Recall L’Hôpital’s rule.
\lim_{x \to \infty} \frac{x^{7}}{e^{3x}} = \frac{\infty}{\infty}

and so

\lim_{x \to \infty} \frac{x^{7}}{e^{3x}} = \lim_{x \to \infty} \frac{(x^{7})'}{(e^{3x})'} = \lim_{x \to \infty} \frac{7x^{6}}{3e^{3x}}.
But this limit is also ∞/∞ and so we can apply L'Hôpital's rule again:

\lim_{x \to \infty} \frac{7x^{6}}{3e^{3x}} = \lim_{x \to \infty} \frac{42x^{5}}{9e^{3x}}.

Repeating this a total of seven times removes the power of x entirely and we find

\lim_{x \to \infty} \frac{x^{7}}{e^{3x}} = \lim_{x \to \infty} \frac{7!}{3^{7} e^{3x}} = 0.
Differentiating, f'(x) = x^{6}(7 - 3x)/e^{3x}, and since e^{3x} is never zero, f'(x) = 0 when x = 0 or when x = 7/3. Note this is c/d for our Predator–Prey model. A similar analysis holds for g(y) = y^{8}/e^{6y}. We find that as y → ∞, g(y) → 0 from above since g(y) is always positive, and since g(0) = 0, we know g has a maximum. We use calculus to show the maximum occurs at y = 8/6, which is a/b for our Predator–Prey model.
We can do the same sort of analysis for a general Predator–Prey model. From our
specific example, it is easy to infer that f and g have the same generic form which
are shown in Figs. 10.12 and 10.13.
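To see these shapes concretely, a short MATLAB sketch (ours, not from the text) can plot the two growth functions of the example above and mark the maxima at x = 7/3 and y = 8/6:

% Sketch: plot f(x) = x^7/e^(3x) and g(y) = y^8/e^(6y) and mark their peaks.
x = linspace(0, 6, 400);
y = linspace(0, 4, 400);
f = x.^7 ./ exp(3*x);
g = y.^8 ./ exp(6*y);
subplot(1,2,1); plot(x, f); hold on
plot(7/3, (7/3)^7/exp(7), 'ro');     % maximum of f at x = c/d = 7/3
xlabel('x'); ylabel('f(x)'); hold off
subplot(1,2,2); plot(y, g); hold on
plot(8/6, (8/6)^8/exp(8), 'ro');     % maximum of g at y = a/b = 8/6
xlabel('y'); ylabel('g(y)'); hold off

Both curves rise from 0, peak once, and decay back toward 0, exactly the generic shape used in the arguments below.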
10.5.2.1 Homework
For the following Predator–Prey models, state what the f and g growth functions are, use calculus to derive where their maximum occurs (you can do either f or g as the derivation is the same for both) and sketch their graphs nicely.
Exercise 10.5.1
Exercise 10.5.2
Exercise 10.5.3
Exercise 10.5.4
Exercise 10.5.5
We can write the nonlinear conservation law using the growth functions f and g in the form of Eq. 10.19:

f(x(t))\, g(y(t)) = f(x_0)\, g(y_0). \qquad (10.19)
The trajectories formed by the solutions of the Predator–Prey model that start in Quadrant 1 are powerfully shaped by these growth functions. It is easy to see that if we choose (x_0 = c/d, y_0 = a/b), i.e. we start at the places where f and g have their maximums, the resulting trajectory is very simple. It is the single point (x(t) = c/d, y(t) = a/b) for all time t. The solution to this Predator–Prey model with this initial condition is thus to simply stay at the point where we start. If x_0 = c/d and y_0 = a/b, the NLCL says

f(x(t))\, g(y(t)) = f_{max}\, g_{max}.

But f_{max} g_{max} is just a number so this trajectory is the constant x(t) = c/d and y(t) = a/b for all time. Otherwise, there are two cases: if x_0 ≠ c/d, then f(x_0) < f_{max}. So we can write f(x_0) = r_1 f_{max} where r_1 < 1. We don't know where y_0 is, but we do know g(y_0) ≤ g_{max}. So we can write g(y_0) = r_2 g_{max} with r_2 ≤ 1. So in this case, the NLCL gives

f(x(t))\, g(y(t)) = r_1 r_2\, f_{max}\, g_{max} = \mu\, f_{max}\, g_{max}

where μ = r_1 r_2 < 1. Finally, if y_0 ≠ a/b, then g(y_0) < g_{max}. We can thus write g(y_0) = r_2 g_{max} where r_2 < 1. Although we don't know where x_0 is, we do know f(x_0) ≤ f_{max}. So we can write f(x_0) = r_1 f_{max} with r_1 ≤ 1. So in this case, the NLCL again gives

f(x(t))\, g(y(t)) = r_1 r_2\, f_{max}\, g_{max} = \mu\, f_{max}\, g_{max}

where μ = r_1 r_2 < 1. We conclude all trajectories with x_0 > 0 and y_0 > 0 have an associated μ ≤ 1 so that the NLCL can be written

f(x(t))\, g(y(t)) = \mu\, f_{max}\, g_{max},

and μ < 1 for any trajectory with ICs different from the pair (c/d, a/b).
The next step is to examine what happens if we choose a value of μ < 1.
Let’s assume an IC corresponding to μ = 0.7. The arguments work for any μ but it
is nice to be able to pick a number and work off of it.
• Step 1: draw the f curve and the horizontal line of value 0.7 f max . The horizontal
line will cross the f curve twice giving two corresponding x values. Label these
x1 and x2 as shown. Also label the point c/d = 7/3 and draw the vertical lines
from these x values to the f curve itself.
• Also pick a point x1∗ < x1 and a point x2∗ > x2 and draw them in along with their
vertical lines that go up to the f curve.
• We show all this in Fig. 10.14.
• Step 2: Is it possible for the trajectory to contain the point x1∗ ? If so, there is a
corresponding y value so the NLCL holds: f (x1∗ )g(y) = 0.7 f max gmax . But this
implies that
g(y) = \frac{0.7\, f_{max}}{f(x_1^*)}\, g_{max} > g_{max}

as the bottom of the fraction 0.7 f_{max}/f(x_1^*) is smaller than the top, making the fraction larger than 1. But no y value can give g(y) larger than g_{max}. Hence, the point x_1^* is not on the trajectory.
• Step 3: Is it possible for x1 to be on the trajectory? If so, there is a y value
so the NLCL holds giving f (x1 )g(y) = 0.7 f max gmax . But f (x1 ) = 0.7 f max , so
cancelling, we find g(y) = gmax . Thus y = a/b = 8/6 and (x1 , a/b = 8/6) is on
the trajectory.
• Step 4: Is it possible for the trajectory to contain the point x2∗ ? If so, there is a
corresponding y value so the NLCL holds: f (x2∗ )g(y) = 0.7 f max gmax . But this
implies that
Fig. 10.14 The conservation law f (x) g(y) = 0.7 f max gmax implies there are two critical points
x1 and x2 of interest
g(y) = \frac{0.7\, f_{max}}{f(x_2^*)}\, g_{max} > g_{max}

as the bottom of the fraction 0.7 f_{max}/f(x_2^*) is smaller than the top, making the fraction larger than 1. But no y value can give g(y) larger than g_{max}. Hence, the point x_2^* is not on the trajectory.
• Step 5: Is it possible for x2 to be on the trajectory? If so, there is a y value
so the NLCL holds giving f (x2 )g(y) = 0.7 f max gmax . But f (x2 ) = 0.7 f max , so
cancelling, we find g(y) = gmax . Thus y = a/b = 8/6 and (x2 , a/b = 8/6) is on
the trajectory. We conclude if (x(t), y(t)) is on the trajectory, then x1 ≤ x(t) ≤ x2 .
We show this in Fig. 10.15.
Fig. 10.15 Predator–prey trajectories with initial conditions from Quadrant 1 are bounded in x
We can bound y in the same way using the g curve: draw the horizontal line of value 0.7 g_{max}, which crosses the g curve at two values y_1 and y_2, and also pick points y_1^* < y_1 and y_2^* > y_2. If the trajectory contained the point y_1^*, there would be a corresponding x value so the NLCL holds: f(x) g(y_1^*) = 0.7 f_{max} g_{max}. But this implies that

f(x) = \frac{0.7\, g_{max}}{g(y_1^*)}\, f_{max} > f_{max}

as the bottom of the fraction 0.7 g_{max}/g(y_1^*) is smaller than the top, making the fraction larger than 1. But no x value can give f(x) larger than f_{max}. Hence, the point y_1^* is not on the trajectory.
• Step 3: Is it possible for y_1 to be on the trajectory? If so, there is an x value so the NLCL holds giving f(x) g(y_1) = 0.7 f_{max} g_{max}. But g(y_1) = 0.7 g_{max}, so cancelling, we find f(x) = f_{max}. Thus x = c/d = 7/3 and (c/d = 7/3, y_1) is on the trajectory.
Fig. 10.16 The conservation law f(x) g(y) = 0.7 f_{max} g_{max} implies there are two critical points y_1 and y_2 of interest
Fig. 10.17 Predator–prey trajectories with initial conditions from Quadrant 1 are bounded in y
• Step 4: Is it possible for the trajectory to contain the point y_2^*? If so, there is a corresponding x value so the NLCL holds: f(x) g(y_2^*) = 0.7 f_{max} g_{max}. But this implies that
f(x) = \frac{0.7\, g_{max}}{g(y_2^*)}\, f_{max} > f_{max}

as the bottom of the fraction 0.7 g_{max}/g(y_2^*) is smaller than the top, making the fraction larger than 1. But no x value can give f(x) larger than f_{max}. Hence, the point y_2^* is not on the trajectory.
• Step 5: Is it possible for y2 to be on the trajectory? If so, there is a x value
so the NLCL holds giving f (x)g(y2 ) = 0.7 f max gmax . But g(y2 ) = 0.7gmax , so
cancelling, we find f (x) = f max . Thus x = c/d = 7/3 and (c/d = 7/3, y2 ) is on
the trajectory. We conclude if (x(t), y(t)) is on the trajectory, then y1 ≤ y(t) ≤ y2 .
We show these bounds in Fig. 10.17.
Combining, we see trajectories are bounded in Quadrant 1. We show this in Fig. 10.18.
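The bounds x_1, x_2, y_1, y_2 can also be located numerically. The sketch below is an illustration only (not the text's code), using MATLAB's fzero on the example f(x) = x^7/e^{3x}, g(y) = y^8/e^{6y} with μ = 0.7:

% Sketch: find the bounding box for mu = 0.7 in the worked example.
mu   = 0.7;
f    = @(x) x.^7 ./ exp(3*x);
g    = @(y) y.^8 ./ exp(6*y);
fmax = f(7/3);  gmax = g(8/6);            % peaks at x = c/d and y = a/b
% x1 and x2 solve f(x) = mu*fmax, one root on each side of 7/3
x1 = fzero(@(x) f(x) - mu*fmax, [0.1, 7/3]);
x2 = fzero(@(x) f(x) - mu*fmax, [7/3, 10]);
% y1 and y2 solve g(y) = mu*gmax, one root on each side of 4/3
y1 = fzero(@(y) g(y) - mu*gmax, [0.1, 8/6]);
y2 = fzero(@(y) g(y) - mu*gmax, [8/6, 10]);
fprintf('x in [%.4f, %.4f], y in [%.4f, %.4f]\n', x1, x2, y1, y2);

Any trajectory with this μ must stay inside the rectangle [x_1, x_2] × [y_1, y_2] found this way.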
We can do the analysis we did for the specific Predator–Prey model for the general
one. Some people like to see these arguments with the parameters a, b, c and d and
Fig. 10.18 Predator–prey trajectories with initial conditions from Quadrant 1 are bounded in x
and y
others like to see the argument with numbers. However, learning how to see things
abstractly is a skill that is honed by thinking with general terms and not specific
numbers. So as we redo the arguments from the specific example in this general way, reflect on how similar they are even though you don't see numbers! We now look at the general model

x' = x\,(a - b\,y), \qquad y' = y\,(-c + d\,x).

Let's again assume an IC corresponding to μ = 0.7. Again, the arguments work for any μ but it is nice to be able to pick a number and work off of it. So our argument is a bit of a hybrid: general parameter values and a specific μ value.
• Step 1: draw the f curve and the horizontal line of value 0.7 f max . The horizontal
line will cross the f curve twice giving two corresponding x values. Label these
x1 and x2 as shown. Also label the point c/d and draw the vertical lines from these
x values to the f curve itself.
• Also pick a point x1∗ < x1 and a point x2∗ > x2 and draw them in along with their
vertical lines that go up to the f curve.
• We show all this in Fig. 10.19.
• Step 2: Is it possible for the trajectory to contain the point x1∗ ? If so, there is a
corresponding y value so the NLCL holds: f (x1∗ )g(y) = 0.7 f max gmax . But this
implies that
g(y) = \frac{0.7\, f_{max}}{f(x_1^*)}\, g_{max} > g_{max}

as the bottom of the fraction 0.7 f_{max}/f(x_1^*) is smaller than the top, making the fraction larger than 1. But no y value can give g(y) larger than g_{max}. Hence, the point x_1^* is not on the trajectory.
Fig. 10.19 The conservation law f (x) g(y) = 0.7 f max gmax implies there are two critical points
x1 and x2 of interest
• Step 3: Is it possible for x_1 to be on the trajectory? If so, there is a y value so the NLCL holds giving f(x_1) g(y) = 0.7 f_{max} g_{max}. But f(x_1) = 0.7 f_{max}, so cancelling, we find g(y) = g_{max}. Thus y = a/b and (x_1, a/b) is on the trajectory.
• Step 4: Is it possible for the trajectory to contain the point x_2^*? If so, there is a corresponding y value so the NLCL holds: f(x_2^*) g(y) = 0.7 f_{max} g_{max}. But this implies that

g(y) = \frac{0.7\, f_{max}}{f(x_2^*)}\, g_{max} > g_{max}

as the bottom of the fraction 0.7 f_{max}/f(x_2^*) is smaller than the top, making the fraction larger than 1. But no y value can give g(y) larger than g_{max}. Hence, the point x_2^* is not on the trajectory.
• Step 5: Is it possible for x2 to be on the trajectory? If so, there is a y value
so the NLCL holds giving f (x2 )g(y) = 0.7 f max gmax . But f (x2 ) = 0.7 f max , so
cancelling, we find g(y) = gmax . Thus y = a/b and (x2 , a/b) is on the trajectory.
We conclude if (x(t), y(t)) is on the trajectory, then x1 ≤ x(t) ≤ x2 . We show this
in Fig. 10.20.
Fig. 10.20 Predator–Prey trajectories with initial conditions from Quadrant 1 are bounded in x
• Step 1: draw the g curve and the horizontal line of value 0.7gmax . The horizontal
line will cross the g curve twice giving two corresponding y values. Label these
y1 and y2 as shown. Also label the point a/b and draw the vertical lines from these
y values to the g curve itself.
• Also pick a point y1∗ < y1 and a point y2∗ > y2 and draw them in along with their
vertical lines that go up to the g curve.
• We show all this in Fig. 10.21.
• Step 2: Is it possible for the trajectory to contain the point y1∗ ? If so, there is a
corresponding x value so the NLCL holds: f (x)g(y1∗ ) = 0.7 f max gmax . But this
implies that
f(x) = \frac{0.7\, g_{max}}{g(y_1^*)}\, f_{max} > f_{max}
Fig. 10.21 The conservation law f (x) g(y) = 0.7 f max gmax implies there are two critical points
y1 and y2 of interest
as the bottom of the fraction 0.7 g_{max}/g(y_1^*) is smaller than the top, making the fraction larger than 1. But no x value can give f(x) larger than f_{max}. Hence, the point y_1^* is not on the trajectory.
• Step 3: Is it possible for y1 to be on the trajectory? If so, there is a x value
so the NLCL holds giving f (x)g(y1 ) = 0.7 f max gmax . But g(y1 ) = 0.7gmax , so
cancelling, we find f (x) = f max . Thus x = c/d and (c/d, y1 ) is on the trajectory.
• Step 4: Is it possible for the trajectory to contain the point y_2^*? If so, there is a corresponding x value so the NLCL holds: f(x) g(y_2^*) = 0.7 f_{max} g_{max}. But this implies that
f(x) = \frac{0.7\, g_{max}}{g(y_2^*)}\, f_{max} > f_{max}

as the bottom of the fraction 0.7 g_{max}/g(y_2^*) is smaller than the top, making the fraction larger than 1. But no x value can give f(x) larger than f_{max}. Hence, the point y_2^* is not on the trajectory.
• Step 5: Is it possible for y2 to be on the trajectory? If so, there is a x value
so the NLCL holds giving f (x)g(y2 ) = 0.7 f max gmax . But g(y2 ) = 0.7gmax , so
cancelling, we find f (x) = f max . Thus x = c/d and (c/d, y2 ) is on the trajectory.
We conclude if (x(t), y(t)) is on the trajectory, then y1 ≤ y(t) ≤ y2 . We show
these bounds in Fig. 10.22.
Combining, we see trajectories are bounded in Quadrant 1. We show this in Fig. 10.23.
Now that we have discussed these two cases, note that we could have just done
the x variable case and said that a similar thing happens for the y variable. In many
texts, it is very common to do this. Since you are beginners at this kind of reasoning,
we have presented both cases in detail. But you should start training your mind to see that presenting one case is actually enough!
Fig. 10.22 Predator–prey trajectories with initial conditions from Quadrant 1 are bounded in y
Fig. 10.23 Predator–prey trajectories with initial conditions from Quadrant 1 are bounded in x and y
10.5.10 Homework
For these Predator–Prey models, follow the analysis of the section above to show that the trajectories must be bounded.
Exercise 10.5.6
Exercise 10.5.7
Exercise 10.5.8
Exercise 10.5.9
Exercise 10.5.10
If the trajectory was not periodic, then there would be horizontal and vertical lines
that would intersect the trajectory in more than two places. We will show that we can
have at most two intersections, which tells us the trajectory must be periodic. We go back to our specific example

x' = x\,(8 - 6\,y), \qquad y' = y\,(-7 + 3\,x),

whose growth functions are f(x) = x^7/e^{3x} and g(y) = y^8/e^{6y}.
This time we will show the argument for this specific case only and not do a general
example. But it is easy to infer from this specific example how to handle this argument
for any other Predator–Prey model!
• We draw the same figure as before for f but we don’t need the points x1∗ and x2∗ .
This time we add a point x ∗ between x1 and x2 . We’ll draw it so that it is between
c/d = 7/3 and x2 but just remember it could have been chosen on the other side.
We show this in Fig. 10.24.
Fig. 10.24 The f curve with the point x1 < c/d = 7/3 < x ∗ < x2 added
• At the point x ∗ , the NLCL says the corresponding y values satisfy f (x ∗ )g(y) =
0.7 f max gmax . This tells us
g(y) = \frac{0.7\, f_{max}}{f(x^*)}\, g_{max}.
• The biggest the ratio 0.7 f max / f (x ∗ ) can be is when the bottom f (x ∗ ) is the
smallest. This occurs when x ∗ is chosen to be x1 or x2 . Then the ratio is
0.7 f max /(0.7 f max ) = 1.
• The smallest the ratio 0.7 f max / f (x ∗ ) can be is when the bottom f (x ∗ ) is the largest.
This occurs when x^* is chosen to be c/d = 7/3. Then the ratio is 0.7 f_{max}/f_{max} = 0.7.
• So the ratio 0.7 f max / f (x ∗ ) is between 0.7 and 1.
• Draw the g curve now, adding in a horizontal line for r g_{max} where r = 0.7 f_{max}/f(x^*).
The lowest this line can be is on the line 0.7gmax and the highest it can be is the
line of value gmax . This is shown in Fig. 10.25.
• The figure above shows that there are at most two intersections with the g curve.
• The case of spiral in or spiral out trajectories implies there are points x ∗ with more
than two corresponding y values. Hence, spiral in and spiral out trajectories are
not possible and the only possibility is that the trajectory is periodic.
• So there is a smallest positive number T called the period of the trajectory, which means

x(t + T) = x(t) \quad \text{and} \quad y(t + T) = y(t) \quad \text{for all } t.
Fig. 10.25 The g curve with the r gmax line added showing the y values for the chosen x ∗
10.6.1 Homework
For the following problems, show the details of the periodic nature of the Predator–
Prey trajectories by mimicking the analysis in the section above.
Exercise 10.6.1
Exercise 10.6.2
Exercise 10.6.3
Exercise 10.6.4
Exercise 10.6.5
Now that we know the trajectory is periodic, let’s look at the plot more carefully.
We know the trajectories must lie within the rectangle [x_1, x_2] × [y_1, y_2]. Mathematically, this means there is a smallest positive number T so that x(0) = x(T) and y(0) = y(T). This number T is called the period of the Predator–Prey model. We can see the periodicity of the trajectory by doing a more careful analysis of the trajectories. We know the trajectory hits the points (x_1, a/b), (x_2, a/b), (c/d, y_1) and (c/d, y_2). What happens when we look at x points u with x_1 < u < x_2? For convenience, let's look at the case x_1 < u < c/d and the case u = c/d separately.
10.7.1 Case 1: u = c/d

At u = c/d, the NLCL requires f(c/d) g(v) = μ f_{max} g_{max} for any corresponding trajectory value v. Since f(c/d) = f_{max}, this is

f_{max}\, g(v) = \mu\, f_{max}\, g_{max},

or

g(v) = \mu\, g_{max}.
Since μ is less than 1, we draw the μ gmax horizontal line on the g graph as usual to
obtain the figure we previously drew as Fig. 10.21. Hence, there are two values of v
that give the value μ gmax ; namely, v = y1 and v = y2 . We conclude there are two
possible points on the trajectory, (c/d, v = y_1) and (c/d, v = y_2). This gives the usual points shown as the vertical points in Fig. 10.23.
The analysis is very similar to the one we just did for u = c/d. First, for this choice of
u, we can draw a new graph as shown in Fig. 10.26.
Here, the conservation law gives

g(v) = \frac{f_{max}}{f(u)}\, \mu\, g_{max}.

Fig. 10.26 The predator–prey f growth graph trajectory analysis for x_1 < u < c/d
Fig. 10.27 The predator–prey g growth analysis for one point x_1 < u < c/d
Here the ratio f_{max}/f(u) is larger than 1 (just look at Fig. 10.26 to see this). Call this ratio r. Hence, μ < μ (f_{max}/f(u)) and so μ g_{max} < μ (f_{max}/f(u)) g_{max}. Also from Fig. 10.26, we see μ f_{max} < f(u), which tells us (μ f_{max}/f(u)) g_{max} < g_{max}. Now look at Fig. 10.27. The inequalities above show us we must draw the horizontal line μ r g_{max} above the line μ g_{max} and below the line g_{max}. So we seek v
values that satisfy
\mu\, g_{max} < g(v) = \frac{f_{max}}{f(u)}\, \mu\, g_{max} = \mu\, r\, g_{max} < g_{max}.
We already know the values of v that satisfy g(v) = μ g_{max}: they are labeled in Fig. 10.27 as v = y_1 and v = y_2. Since the number μ r is larger than μ, we see from Fig. 10.27 there are two values of v, v = z_1 and v = z_2, for which g(v) = μ r g_{max} and y_1 < z_1 < a/b < z_2 < y_2 as shown.
From the above, we see that in the case x_1 < u < c/d, there are always 2 and only 2 possible v values on the trajectory. These points are (u, z_1) and (u, z_2).
What happens if we pick two points, x_1 < u_1 < u_2 < c/d? The f curve analysis is
essentially the same but now there are two vertical lines that we draw as shown in
Fig. 10.28.
Fig. 10.28 The predator–prey f growth graph trajectory analysis for the points x_1 < u_1 < u_2 < c/d
This implies we are searching for v values in the following two cases:

g(v) = \frac{f_{max}}{f(u_1)}\, \mu\, g_{max} \qquad \text{and} \qquad g(v) = \frac{f_{max}}{f(u_2)}\, \mu\, g_{max}.

Since f(u_1) < f(u_2) < f_{max}, we have

\mu < \mu\, \frac{f_{max}}{f(u_2)} = \mu\, r_2 < \mu\, \frac{f_{max}}{f(u_1)} = \mu\, r_1 < 1.
Fig. 10.29 The spread of the trajectory through fixed lines on the x axis gets smaller as we move away from the center point c/d
Now look at Fig. 10.29. The inequalities above show us we must draw the horizontal
line μ r1 gmax above the line μ r2 gmax which is above the line μ gmax . We already
know the values of v that satisfy g(v) = μ gmax which are labeled in Fig. 10.27 as
v = y_1 and v = y_2. Since the number μ r_2 is larger than μ, we see from Fig. 10.29 there are two values of v, v = z_{21} and v = z_{22}, for which g(v) = μ r_2 g_{max} and y_1 < z_{21} < a/b < z_{22} < y_2 as shown. But we can also do this for the line μ r_1 g_{max} to find two more points z_{11} and z_{12} satisfying

y_1 < z_{21} < z_{11} < \frac{a}{b} < z_{12} < z_{22} < y_2
as seen in Fig. 10.29 also.
We also see that the largest spread in the y direction is at x = c/d, giving the two points (c/d, y_1) and (c/d, y_2), which corresponds to the line segment [y_1, y_2] drawn at the x = c/d location. If we pick the point x_1 < u_2 < c/d, the two points on the trajectory give a line segment [z_{21}, z_{22}] drawn at the x = u_2 location. Note this line segment is smaller and contained in the largest one [y_1, y_2]. The corresponding line segment for the point u_1 is [z_{11}, z_{12}] which is smaller yet.
If you think about it a bit, if we picked three points x_1 < u_1 < u_2 < u_3 < c/d and three more points c/d < u_4 < u_5 < u_6 < x_2, we would find line segments as follows:
Point   Spread
x_1     One point (x_1, a/b)
u_1     [z_{11}, z_{12}]
u_2     [z_{21}, z_{22}] contains [z_{11}, z_{12}]
u_3     [z_{31}, z_{32}] contains [z_{21}, z_{22}]
c/d     [y_1, y_2] contains [z_{31}, z_{32}]
u_4     [z_{41}, z_{42}] inside [y_1, y_2]
u_5     [z_{51}, z_{52}] inside [z_{41}, z_{42}]
u_6     [z_{61}, z_{62}] inside [z_{51}, z_{52}]
x_2     One point (x_2, a/b) inside [z_{61}, z_{62}]
We draw these line segments in Fig. 10.30. We know the Predator–Prey trajectory
must go through these points. Every time the trajectory hits the x value c/d, the corresponding y spread is [y_1, y_2]. If the trajectory was spiraling inwards, then the first time we hit c/d, the spread would be [y_1, y_2] and the next time, the spread would have to be less so that the trajectory moved inwards. This can't happen as the second time we hit c/d, the spread is exactly the same. The points shown in Fig. 10.30 are always the same. Again, note that since the trajectory is periodic there is a smallest positive number T so that

x(t + T) = x(t) \quad \text{and} \quad y(t + T) = y(t)
for all values of t. This is the behavior we are seeing in Fig. 10.30. Note the value
of this period is really determined by the initial values (x0 , y0 ) as they determine the
bounding box [x1 , x2 ] × [y1 , y2 ] since the initial condition determines μ.
If we had a positive function h defined on an interval [a, b], we can define the average
value of h over [a, b] by the integral
\bar{h} = \frac{1}{b-a} \int_a^b h(t)\, dt. \qquad (10.20)
To motivate this definition, let’s look at the Riemann sums of some nice function
f on the interval [1, 3]. Take a uniform partition [1, 3] with 5 points. P4 = {1, 1 +
h, 1 + 2h, 1 + 3h, 3} where h = (3 − 1)/4 = 0.5. The evaluation set is the left hand
endpoints: E_4 = {1, 1 + h, 1 + 2h, 1 + 3h}. Note 4h = 3 − 1 = 2 which is the
length of [1, 3]. The Riemann sum is
RS = \bigl( f(1) + f(1+h) + f(1+2h) + f(1+3h) \bigr)\, h
   = \frac{f(1) + f(1+h) + f(1+2h) + f(1+3h)}{4}\,(4h)
   = \frac{f(1) + f(1+h) + f(1+2h) + f(1+3h)}{4}\,(2).

Note that \bigl( \sum_{j=0}^{3} f(1 + jh) \bigr)/4 is an estimate of the average value of f on [1, 3] using
4 values of the function.
Now cut h in half. The new partition is {1, 1 + h/2, 1 + 2(h/2), 1 + 3(h/2), ..., 3} which has 8 subintervals with 9 points. The evaluation set is the left hand endpoints again. Note 8(h/2) = 3 − 1 = 2 which is the length of [1, 3]. Dividing the new Riemann sum by 8 again estimates the average value of f, and as we keep halving h these estimates converge to the integral average in Eq. 10.20.
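A tiny numerical illustration of this idea (our own sketch; the function f(t) = t^2 is an arbitrary choice of a "nice function") shows the left-endpoint estimates approaching the integral average:

% Sketch: left-endpoint Riemann estimates of the average value of f on [1,3].
f = @(t) t.^2;                           % arbitrary illustrative function
exact = (1/(3-1)) * integral(f, 1, 3);   % exact average value, 13/3
for n = [4 8 16 32]
    h = (3-1)/n;
    left = 1 + (0:n-1)*h;                % left-hand endpoints of the partition
    approx = sum(f(left))/n;             % (Riemann sum)*h/(b-a) = sum(f(left))/n
    fprintf('n = %2d: average estimate = %.6f (exact %.6f)\n', n, approx, exact);
end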
Here are the details. Now, recall the Predator–Prey model is given by

x' = x\,(a - b\,y), \qquad y' = y\,(-c + d\,x), \qquad x(0) = x_0, \quad y(0) = y_0.

Hence,

\frac{x'(s)}{x(s)} = a - b\, y(s)
for all 0 ≤ s ≤ T where T is the period for this trajectory. Now integrate from s = 0
to s = T to get
\int_0^T \frac{x'(s)}{x(s)}\, ds = \int_0^T \bigl(a - b\, y(s)\bigr)\, ds.
Hence, we have
\ln x(s)\Big|_0^T = a\,T - b \int_0^T y(s)\, ds.
Simplifying, we find
\ln\frac{x(T)}{x_0} = a\,T - b \int_0^T y(s)\, ds.
However, since T is the period for this trajectory, we know x(T ) must equal x(0).
Hence, ln(x(T )/x0 ) = ln(1) = 0. Rearranging, we conclude
0 = a\,T - b \int_0^T y(s)\, ds,
\qquad b \int_0^T y(s)\, ds = a\,T,
\qquad \frac{1}{T} \int_0^T y(s)\, ds = \frac{a}{b}.
The term on the left hand side is the average value of the solution y over the one
period of time, [0, T ]. Using the usual average notation, we will call this ȳ. Thus,
we have
\bar{y} = \frac{1}{T} \int_0^T y(s)\, ds = \frac{a}{b}. \qquad (10.21)
We can do a similar analysis for the average value of the x component of the solution.
We find
\frac{y'(s)}{y(s)} = -c + d\,x(s), \qquad 0 \le s \le T,
\qquad \int_0^T \frac{y'(s)}{y(s)}\, ds = \int_0^T \bigl(-c + d\,x(s)\bigr)\, ds,
\qquad \ln y(s)\Big|_0^T = -c\,T + d \int_0^T x(s)\, ds,
\qquad \ln\frac{y(T)}{y_0} = -c\,T + d \int_0^T x(s)\, ds.
However, since T is the period for this trajectory, we know y(T ) must equal y(0).
Hence, ln(y(T )/y0 ) = ln(1) = 0. Rearranging, we conclude
0 = -c\,T + d \int_0^T x(s)\, ds,
\qquad d \int_0^T x(s)\, ds = c\,T,
\qquad \frac{1}{T} \int_0^T x(s)\, ds = \frac{c}{d}.
The term on the left hand side is the average value of the solution x over the one
period of time, [0, T ]. Using the usual average notation, we will call this x̄. Thus,
we have
\bar{x} = \frac{1}{T} \int_0^T x(s)\, ds = \frac{c}{d}. \qquad (10.22)
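These average formulas are easy to check against a numerical solution. The sketch below is ours (not the text's code); it uses MATLAB's ode45 with the illustrative parameter values a = 2, b = 10, c = 3, d = 18, the same model that reappears later in this chapter, and averages over many periods:

% Sketch: compare long-run averages of a numerical solution to c/d and a/b.
a = 2; b = 10; c = 3; d = 18;
rhs = @(t,u) [ u(1)*(a - b*u(2)); u(2)*(-c + d*u(1)) ];
[t,u] = ode45(rhs, [0 100], [0.1; 0.1], odeset('RelTol',1e-8));
xbar = trapz(t, u(:,1)) / (t(end) - t(1));   % approximate average of x over [0,100]
ybar = trapz(t, u(:,2)) / (t(end) - t(1));   % approximate average of y over [0,100]
fprintf('xbar = %.4f (c/d = %.4f), ybar = %.4f (a/b = %.4f)\n', xbar, c/d, ybar, a/b);

Because 100 time units is not an exact whole number of periods, the computed averages only approximate c/d and a/b, but the agreement improves as the averaging window grows.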
10.8.1 Homework
For the following Predator–Prey models, derive the average x and y equations.
Exercise 10.8.1
Exercise 10.8.2
Exercise 10.8.3
Exercise 10.8.4
Exercise 10.8.5
Solution For any choice of initial conditions (x_0, y_0), we can solve this as discussed in the previous sections. We find a = 2, b = 10 so that a/b = 0.2, and c = 3, d = 18 so that c/d = 1/6 ≈ 0.1667. We know a lot about these solutions now.
1. The solution (x(t), y(t)) has an average x value \bar{x} = c/d ≈ 0.1667 and an average y value \bar{y} = a/b = 0.2.
2. The initial condition (x0 , y0 ) is some point on the curve.
3. For each choice of initial condition (x0 , y0 ), there is a corresponding period T
so that (x(t), y(t)) = (x(t + T ), y(t + T )) for all time t.
4. Looking at Fig. 10.30, we can connect the dots so to speak to generate the trajec-
tory shown in Fig. 10.31.
Now that you know how to analyze the predator–prey models, you can look in
the literature and see how they are used. We will leave it up to you to find the many
references on how this model is used to study the wolf–moose population on Isle Royale in Lake Superior and instead point you to a different one. In Axelsen et al.
(2001), the predators are Atlantic Puffins and the prey are juvenile herring and the
research is trying to understand the shapes that schools of herring take in the wild
under predation. This study is primarily descriptive with no mathematics at all but
the references point to other papers where simulations are carried out. You have
enough training now to follow this paper trail and see how the simulation papers and
the descriptive papers work hand in hand. But we leave the details and hard work
to do this to you. Happy hunting! If you look at the references in this paper, you’ll
note that one of the papers there is by Hamilton, the same biologist whose work on
altruism we studied in Peterson (2015).
10.9.1 Homework
Do these analyses for some specific value of μ; for example, μ = 0.8 or something similar. Of course, the specific value doesn't matter that much, but the graphical analysis is easier to see if μ f_{max} is not too close to the peak of f. This time also draw the bounding boxes that we get for different values of μ. You will see that when μ is close to 1, the bounding box is very small, and as μ gets close to 0, the bounding boxes get very large. Also, you will see they are nested inside each other.
Fig. 10.31 The theoretical trajectory for x' = 2x − 10xy, y' = −3y + 18xy. We do not know the actual trajectory as we cannot solve for x and y explicitly as functions of time. However, our analysis tells us the trajectory has the qualitative features shown
The Predator–Prey model we have looked at so far did not help Volterra explain the
food and predator fish data seen in the Mediterranean sea during World War I. The
model must also handle changes in fishing rates. War activities had decreased the
rate of fishing from 1915 to 1919 or so as shown in Table 10.2. To understand this
data, Volterra added a new decay rate to the model. He let the positive constant r
represent the rate of fishing and assumed that −r x would be removed from food fish
due to fishing and also assumed that the same rate would apply to predator removal.
Hence, −r y would be removed from the predators. This led to the Predator–Prey
model with fishing given by

x' = a\,x - b\,x\,y - r\,x, \qquad y' = -c\,y + d\,x\,y - r\,y.

We don't have to work too hard to understand what adding the fishing does to our model results. We can rewrite the model as

x' = (a - r)\,x - b\,x\,y, \qquad y' = -(c + r)\,y + d\,x\,y.

We see immediately that it doesn't make sense for the fishing rate to exceed a as we want a − r to be positive. We also know the new averages are
\bar{x}_r = \frac{c + r}{d}, \qquad \bar{y}_r = \frac{a - r}{b},
where we label the new averages with a subscript r to denote their dependence on
the fishing rate r. What happens if we halve the fishing rate r? Replacing r by r/2, the new averages are

\bar{x}_{r/2} = \frac{c + r/2}{d}, \qquad \bar{y}_{r/2} = \frac{a - r/2}{b}.
Note that as long as we use a feasible r value (i.e. r < a), we have the following inequality relationships:

\bar{x}_{r/2} = \frac{c + r/2}{d} < \bar{x}_r = \frac{c + r}{d}, \qquad \bar{y}_{r/2} = \frac{a - r/2}{b} > \bar{y}_r = \frac{a - r}{b}.
Hence, if we decrease the fishing rate r , the predator percentage goes up and the food
percentage goes down. Now look at Table 10.2 rewritten with the percentages listed
as fractions and interpreted as x̄ and ȳ. We show this in Table 10.4.
Note that Volterra’s Predator–Prey model with fishing rates added has now
explained this data. During the war years, predator amounts went up and food fish
amounts went down. A wonderful use of modeling, don’t you think? Insight was
gained from the modeling that could not be achieved using other types of analysis.
Let’s do an example to set this in place. Consider the following Predator–Prey
model with fishing added.
Example 10.10.1
Table 10.4 The average food and predator fish caught in the Mediterranean Sea
Year   x̄   ȳ   Fishing rate change   Direction of change in (x̄, ȳ)
1914 0.881 0.119 Starting value No change yet
1915 0.786 0.214 Down relative to 1914 (−, +)
1916 0.779 0.221 Down relative to 1914 (−, +)
1917 0.788 0.212 Down relative to 1914 (−, +)
1918 0.636 0.364 Down relative to 1914 (−, +)
1919 0.727 0.273 Increased relative to 1918 (+, −)
1920 0.840 0.160 Increased relative to 1918 (+, −)
1921 0.841 0.159 Increased relative to 1918 (+, −)
1922 0.852 0.148 Increased relative to 1918 (+, −)
1923 0.893 0.107 Back to normal 1914 rate Back to normal
We see that halving the fishing rate decreases the food fish amounts (0.2381 down to 0.1905) and increases the predator amounts (0.1111 up to 0.1667). We could also show this graphically by drawing all three average pairs on the same x–y plane
but we will leave that to you in the exercises.
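A short sketch of that comparison is below. The parameter values here are assumed for illustration only (they are not the example's actual values); any choice with r < a shows the same pattern.

% Sketch with assumed, hypothetical parameters: average pairs with no fishing,
% fishing at rate r, and fishing at rate r/2.
a = 2; b = 10; c = 3; d = 18; r = 0.5;     % assumed values; any r < a works
avg = @(rr) [ (c + rr)/d, (a - rr)/b ];    % [xbar, ybar] for fishing rate rr
A0 = avg(0); Ar = avg(r); Ah = avg(r/2);
fprintf('no fishing: (%.4f, %.4f)\n', A0);
fprintf('rate r    : (%.4f, %.4f)\n', Ar);
fprintf('rate r/2  : (%.4f, %.4f)\n', Ah);
plot([A0(1) Ar(1) Ah(1)], [A0(2) Ar(2) Ah(2)], 'o');
xlabel('xbar'); ylabel('ybar');            % halving r moves the point left and up

Running this shows the average pair moving left (less food fish) and up (more predators) as the fishing rate is reduced, exactly the pattern in Table 10.4.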
10.10.1 Homework
For the following problems, add fishing to the model at some rate r which is given. Find the new average solutions (\bar{x}, \bar{y}) and explain what happens if we halve the fishing rate and how this relates to the way Volterra explained the Mediterranean Sea fishing data from World War I. Draw a simple picture showing these three averages on the same x–y graph: the original (\bar{x}, \bar{y}), the (\bar{x}, \bar{y}) when the fishing is added and the (\bar{x}, \bar{y}) when the fishing is halved. You should clearly see that halving the fishing rate leads to the average predator value going up and the average food fish value going down.
Exercise 10.10.1
Exercise 10.10.2
Exercise 10.10.3
Exercise 10.10.4
Exercise 10.10.5
Let’s try to solve a typical predator–prey system such as the one given below numer-
ically.
x (t) = a x(t) − b x(t) y(t)
y (t) = −c y(t) + d x(t) y(t)
and we can no longer find the true solution, although our theoretical investigations
have told us a lot about the behavior that the true solution must have.
Let’s solve a Predator–Prey Model with Runge–Kutta Order 4.
This gives the plot of Fig. 10.32. Let’s annotate this code.
% set x and y labels
xlabel('x');
ylabel('y');
% set title
title('Phase Plane for Predator-Prey model x'' = 12x - 5xy, y'' = -6y + 3xy, x(0) = 0.2, y(0) = 8.6');
% set legend
legend('x1','x2','y1','y2','y vs x','Location','Best');
% cancel hold
hold off
We’ll estimate the period for our sample problem. We start with a small final time T
and move it up until the trajectory is almost closed.
This gives us Fig. 10.33 and we can see the period T > 1.01.
This gives us Fig. 10.34 and we can see the trajectory is now closed. So the period
T ≤ 1.02. Hence, we know 1.01 < T ≤ 1.02.
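The bracketing can also be automated. The sketch below is our own illustration, not the text's scripts: it integrates the model x' = 12x − 5xy, y' = −6y + 3xy named in the plot title above with MATLAB's built-in ode45 and reports the first time the trajectory returns close to its starting point. The closeness tolerance 0.05 and the output time grid are ad hoc choices.

% Sketch: estimate the period by detecting the first return near the start.
rhs = @(t,u) [ u(1)*(12 - 5*u(2)); u(2)*(-6 + 3*u(1)) ];
u0  = [0.2; 8.6];
[t,u] = ode45(rhs, 0:0.0005:2, u0, odeset('RelTol',1e-9,'AbsTol',1e-11));
dist = sqrt((u(:,1) - u0(1)).^2 + (u(:,2) - u0(2)).^2);   % distance back to start
k = find(dist(2:end) < 0.05 & t(2:end) > 0.5, 1) + 1;     % first return after leaving
fprintf('estimated period T is about %.3f\n', t(k));

Since the orbit passes exactly through the starting point once per period, the reported time is a slight underestimate of T, and tightening the tolerance sharpens the estimate.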
10.11 Numerical Solutions 345
We can also write code to generate x versus t plots and y versus t plots. From these, we can also estimate the period T. The x versus t code is shown below and right after it is the one line modification needed to generate the plot of y versus t. The code below shows a little bit more than one period for x and y.
This is much more compact! We can use it to generate our graphs much faster.
We can use this to show you how the choice of step size is crucial to generating a
decent plot. We show what happens with too large a step size in Fig. 10.37 and what
we see with a better step size choice in Fig. 10.38.
Next, we can generate a real phase plane portrait by automating the phase plane plots
for a selection of initial conditions. This uses the code AutoPhasePlanePlot.m
which we discussed in Sect. 9.4.
We generate a very nice phase plane plot as shown in Fig. 10.39 for the model
for initial conditions from the box [0.1, 4.5] × [0.1, 4.5] using a fairly small step
size of 0.2.
10.11.5 Homework
Exercise 10.11.1
1. Use our Runge–Kutta codes for h sufficiently small to generate a periodic orbit
using initial conditions:
(a) (x_0, y_0) = (2, 1)
(b) (x_0, y_0) = (5, 2)
Exercise 10.11.2
1. Use our Runge–Kutta codes for h sufficiently small to generate a periodic orbit
using initial conditions:
(a) (x_0, y_0) = (4, 12)
(b) (x_0, y_0) = (5, 20)
Exercise 10.11.3
1. Use our Runge–Kutta codes for h sufficiently small to generate a periodic orbit
using initial conditions:
(a) (x_0, y_0) = (40, 2)
(b) (x_0, y_0) = (5, 25)
Exercise 10.11.4
1. Use our Runge–Kutta codes for h sufficiently small to generate a periodic orbit
using initial conditions:
(a) (x_0, y_0) = (7, 12)
(b) (x_0, y_0) = (0.2, 2)
Exercise 10.11.5
1. Use our Runge–Kutta codes for h sufficiently small to generate a periodic orbit using initial conditions:
(a) (x_0, y_0) = (0.1, 18)
(b) (x_0, y_0) = (6, 0.1)
We now have quite a few tools for analyzing Predator–Prey models. Let’s look at
a sample problem. We can analyze by hand or with computational tools. Here is a
sketch of the process on a sample problem.
1. For the system below, first do the work by hand. For the model
4. Now plot many trajectories at the same time. A typical session usually requires
a lot of trial and error. The AutoPhasePlanePlot.m script is used by filling
in values for the inputs it needs. We generate the plot seen in Fig. 10.41.
10.12.1 Project
Solve the Model By Hand: Do this and attach to your project report.
Plot One Trajectory Using MatLab: Follow the outline above. This part of the
report is done in a word processor with appropriate comments, discussion etc.
Show your MatLab code and sessions as well as plots.
Estimate The Period T: Estimate the period T using the x versus time plot and
then fine tune your estimate using the phase plane plot—keep increasing the
final time until the trajectories touch for the first time. Pick an interesting initial
condition, of course!
Plot Many Trajectories Simultaneously Using MatLab: Follow the outline
above. This part of the report is also done in a word processor with appropri-
ate comments, discussion etc. Show your MatLab code and sessions as well as
plots.
References
Many biologists of Volterra’s time criticized his Predator–Prey model because it did
not include self-interaction terms. These are terms that model how food fish interactions with other food fish and shark interactions with other predators affect their populations. We can model these effects by assuming their magnitude is proportional
to the interaction. Mathematically, we assume these are both decay terms giving us
the Predator–Prey Self Interaction model
x'_{self} = -e\, x \cdot x, \qquad y'_{self} = -f\, y \cdot y,
for positive constants e and f . We are thus led to the new self-interaction model
given below:
x'(t) = a\, x(t) - b\, x(t)\, y(t) - e\, x(t)^2
\qquad
y'(t) = -c\, y(t) + d\, x(t)\, y(t) - f\, y(t)^2
The nullclines for the self-interaction model are a bit more complicated, but still
straightforward to work with. First, we can factor the dynamics to obtain
x' = x\,(a - b\,y - e\,x), \qquad y' = y\,(-c + d\,x - f\,y).

Looking at the predator–prey self interaction dynamics equations, we see the (x, y)
pairs in the x–y plane where
0 = x\,(a - b\,y - e\,x)
are the ones where the rate of change of the food fish will be zero. Now these pairs
can correspond to many different time values so what we really need to do is to
find all the (x, y) pairs where this happens. Since this is a product, there are two
possibilities:
In a similar way, the pairs (x, y) where y' becomes zero satisfy the equation

0 = y\,(-c + d\,x - f\,y).
Just like we did in Chap. 8, we find the parts of the x–y plane where the algebraic signs of x' and y' are (+, +), (+, −), (−, +) and (−, −). As usual, the set of (x, y) pairs where x' = 0 is called the nullcline for x; similarly, the set of points where y' = 0 is the nullcline for y. The x' = 0 equation gives us the y axis and the line y = a/b − (e/b)x, while the y' = 0 equation gives the x axis and the line y = −c/f + (d/f)x. The x and y nullclines thus divide the plane into the usual three pieces: the part where the derivative is positive, zero or negative. In Fig. 11.1, we show the part of the x–y plane where x' > 0 with one shading and the part where it is negative with another. In Fig. 11.2, we show how the y nullcline divides the x–y plane into three pieces as well. For x', in each region of interest, we know the term x' has the two factors x and a − by − ex.
The second factor is positive when
a - b\,y - e\,x > 0 \quad \Longleftrightarrow \quad y < \frac{a}{b} - \frac{e}{b}\,x.
Fig. 11.1 Finding where x' < 0 and x' > 0 for the Predator–Prey self interaction model. Here x' = x(a − by − ex); setting this to 0 gives x = 0 and y = a/b − (e/b)x, whose graphs are shown, and the algebraic signs of x' and of each of its factors are indicated region by region.
Fig. 11.2 Finding where y' < 0 and y' > 0 for the Predator–Prey self interaction model. Here y' = y(−c + dx − fy); setting this to 0 gives y = 0 and y = −c/f + (d/f)x, whose graphs are shown, and the algebraic signs of y' and of each of its factors are indicated region by region.
So below the line, the factor is positive. In a similar way, the term y' has the two factors y and −c + dx − fy. Here the second factor is positive when

-c + d\,x - f\,y > 0 \quad \Longleftrightarrow \quad y < -\frac{c}{f} + \frac{d}{f}\,x.
So below the line, the factor is positive. We then use this information to determine
the algebraic signs in each region. In Fig. 11.1, we show these four regions (think of
them as Upper Left (UL), Upper Right (UR), Lower Left (LL) and Lower Right (LR)
for convenience) with the x' equation shown in each region along with the algebraic signs for each of the two factors. The y' signs are shown in Fig. 11.2.
The areas shown in Figs. 11.1 and 11.2 can be combined into one drawing. To do
this, we divide the x–y plane into as many regions as needed and in each region, label
x' and y' as either positive or negative. Hence, each region can be marked with an ordered pair, (x' ±, y' ±). In this self-interaction case, there are three separate cases: the one where c/d < a/e which gives an intersection in Quadrant 1, the one where c/d = a/e which gives an intersection on the x axis and the one where c/d > a/e which gives an
intersection in Quadrant 4. We are interested in biologically reasonable solutions so
if the initial conditions start in Quadrant 1, we would like to know the trajectories
stay in Quadrant 1 away from the x and y axes.
Solution • For x', in each region of interest, we know the term x' has the two factors x and 4 − 5y − ex. The second factor is positive when

4 - 5\,y - e\,x > 0 \quad \Longrightarrow \quad y < \frac{4}{5} - \frac{e}{5}\,x.
So below the line, the factor is positive.
• the term y' has the two factors y and −6 + 2x − fy. Here the second factor is positive when

-6 + 2\,x - f\,y > 0 \quad \Longrightarrow \quad y < -\frac{6}{f} + \frac{2}{f}\,x.
11.1.4 Homework
For these models, do the complete nullcline analysis for x' = 0 and y' = 0 separately with all details.
Exercise 11.1.1
(Sign diagram for x' = x(4 − 5y − ex): x' = 0 on x = 0 and on y = 4/5 − (e/5)x; the algebraic signs of x' and of each factor are shown region by region in the figure.)
(Sign diagram for y' = y(−6 + 2x − fy): y' = 0 on y = 0 and on y = −6/f + (2/f)x; the algebraic signs of y' and of each factor are shown region by region in the figure.)
Exercise 11.1.2
Exercise 11.1.3
Exercise 11.1.4
To prepare for our Quadrant 1 analysis, let's combine the nullclines, but only in Quadrant 1. First, let's redraw the derivative sign analysis just in Quadrant 1. In Fig. 11.5 we show the x' + and x' − regions in Quadrant 1 only. We will show that we only need to look at the model in Quadrant 1. To do this, we will show that trajectories starting on the positive y axis move down towards the origin. Further, we will show trajectories starting on the positive x axis move towards the point (a/e, 0). Then, since trajectories cannot cross, a trajectory that starts in Quadrant 1 with positive ICs cannot cross the y axis and cannot cross the positive x axis, though it can end up at the point (a/e, 0).
In Fig. 11.6 we then show the Quadrant 1 analysis for the y' + and y' − regions.
We know different trajectories can not cross, so if we can show there are trajecto-
ries that stay on the x and y axes, we will know that trajectories starting in Quadrant
1 stay in Quadrant 1.
Fig. 11.5 The x' < 0 and x' > 0 signs in Quadrant 1 for the Predator–Prey self interaction model. In Quadrant 1, x' = x(a − by − ex); setting this to 0 gives x = 0 and y = a/b − (e/b)x, whose graphs are shown, and the algebraic signs of x' are indicated in the picture.
Fig. 11.6 The y' < 0 and y' > 0 signs in Quadrant 1 for the Predator–Prey self interaction model. In Quadrant 1, y' = y(−c + dx − fy); setting this to 0 gives y = 0 and y = −c/f + (d/f)x, whose graphs are shown, and the algebraic signs of y' are indicated in the picture.
Let’s look at the trajectories that start on the positive y axis for this model.
• On the positive y axis, x = 0, so y' = y(−6 − f y) = −y(6 + f y). Rewriting,

\frac{y'}{y\,(6 + f\,y)} = -1.

• To integrate the left hand side, use the partial fraction decomposition

\frac{1}{u\,(6 + f\,u)} = \frac{\alpha}{u} + \frac{\beta}{6 + f\,u}.

• We want

1 = \alpha\,(6 + f\,u) + \beta\,u,

which gives α = 1/6 and β = −f/6.
• Integrate from 0 to t, apply the initial condition and exponentiate to obtain

\frac{y(t)}{6 + f\,y(t)} = e^{-6t}\, \frac{y_0}{6 + f\,y_0}.

Since the right hand side goes to 0 as t → ∞, it follows that y(t) → 0: trajectories that start on the positive y axis slide down toward the origin.
Let’s look at the trajectories that start on the positive x axis for the same model.
• Hence, the x equation is a logistics model with L = 4/e and α = e. So x(t) → 4/e
as t → ∞. If x0 > 4/e, x(t) goes down toward 4/e and if x0 < 4/e, x(t) goes
up to 4/e.
• This argument works for any e and any other model. So trajectories that start on
the positive x axis move towards a/e.
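A quick numerical check of this claim (our own sketch; the value e = 1 is an arbitrary choice) integrates the x-axis dynamics from initial points on either side of 4/e:

% Sketch: on the positive x axis the dynamics reduce to the logistic x' = x(4 - e*x).
e = 1;                               % arbitrary illustrative value
rhs = @(t,x) x.*(4 - e*x);
[t1,x1] = ode45(rhs, [0 5], 6);      % start above 4/e = 4
[t2,x2] = ode45(rhs, [0 5], 0.5);    % start below 4/e
plot(t1, x1, t2, x2, [0 5], [4/e 4/e], '--');
xlabel('t'); ylabel('x(t)');         % both curves approach 4/e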
11.2.3 Homework
Analyze the x and y positive axis trajectories as we have done in the above discus-
sions.
Exercise 11.2.1
Exercise 11.2.2
Exercise 11.2.3
Exercise 11.2.4
We now know that if a trajectory starts at x_0 > 0 and y_0 > 0, it cannot cross the positive x or y axis. There are three cases to consider. If c/d < a/e, the nullclines will cross somewhere in Quadrant 1. This intersection point will play the same role as the average x and average y in the Predator–Prey model without self interaction. If c/d = a/e, the two nullclines intersect on the x axis at a/e. Finally, if c/d > a/e, the intersection occurs in Quadrant 4, which is not biologically reasonable and is not even accessible as the trajectory cannot cross the x axis. Let's work out the details of these possibilities.
We now combine Figs. 11.5 and 11.6 to create the combined graph for the case of
the intersection in Quadrant 1. We show this in Fig. 11.7. The two lines cross when
e\,x + b\,y = a, \qquad d\,x - f\,y = c.
Fig. 11.7 The Quadrant 1 nullcline regions for the Predator–Prey self interaction model when c/d < a/e. The combined (x', y') algebraic sign graph in Quadrant 1 shows four regions of interest, bounded by the x' = 0 nullcline y = a/b − (e/b)x and the y' = 0 nullcline y = −c/f + (d/f)x.
By Cramer's rule, the intersection point is

x^* = \frac{\det\begin{pmatrix} a & b \\ c & -f \end{pmatrix}}{\det\begin{pmatrix} e & b \\ d & -f \end{pmatrix}} = \frac{a f + b c}{e f + b d},
\qquad
y^* = \frac{\det\begin{pmatrix} e & a \\ d & c \end{pmatrix}}{\det\begin{pmatrix} e & b \\ d & -f \end{pmatrix}} = \frac{a d - e c}{e f + b d}.
The second case is the one where the nullclines touch on the x axis. We show this situation in Fig. 11.8. This occurs when c/d = a/e.
The third case is the one where the nullclines do not cross in Quadrant 1. We show this situation in Fig. 11.9. This occurs when c/d > a/e. The two lines now cross at a negative y value, but since in this model also, trajectories that start in Quadrant 1
Fig. 11.8 The qualitative nullcline regions for the Predator–Prey self interaction model when c/d = a/e. The combined (x', y') algebraic sign graph in Quadrant 1: the nullclines y = a/b − (e/b)x and y = −c/f + (d/f)x do not cross in the open quadrant; they touch on the x axis at x = a/e = c/d.
Fig. 11.9 The qualitative nullcline regions for the Predator–Prey self interaction model when c/d > a/e. The combined (x', y') algebraic sign graph in Quadrant 1: the nullclines y = a/b − (e/b)x and y = −c/f + (d/f)x do not cross in Quadrant 1.
can’t cross the x or y axis, we only draw the situation in Quadrant 1. By Cramer’s
rule, the solution to

e\,x + b\,y = a, \qquad d\,x - f\,y = c

is

x^* = \frac{a f + b c}{e f + b d}, \qquad y^* = \frac{a d - e c}{e f + b d}.
In this case, we have a/e < c/d or a d − e c < 0 and so y ∗ is negative and not
biologically interesting.
11.3.4 Example
Fig. 11.10 The Quadrant 1 nullcline regions for the Predator–Prey self interaction model when c/d < a/e. The combined (x', y') algebraic sign graph in Quadrant 1 for the case c/d = 3 < a/e = 4: the x' = 0 nullcline is y = 4/5 − (1/5)x and the y' = 0 nullcline is y = −6/f + (2/f)x.
Fig. 11.11 The qualitative nullcline regions for the Predator–Prey self interaction model when c/d = a/e. The combined (x', y') algebraic sign graph in Quadrant 1 for the case c/d = 3 = a/e = 4/(4/3): the x' = 0 nullcline y = 4/5 − ((4/3)/5)x and the y' = 0 nullcline y = −6/f + (2/f)x touch on the x axis at x = a/e = 3.
Fig. 11.12 The qualitative nullcline regions for the Predator–Prey self interaction model when c/d > a/e. The combined (x', y') algebraic sign graph in Quadrant 1 for the case a/e = 4/2 = 2 < c/d = 6/2 = 3: the x' = 0 nullcline y = 4/5 − (2/5)x and the y' = 0 nullcline y = −6/f + (2/f)x do not cross in Quadrant 1.
Fig. 11.14 Sample Predator–Prey model with self interaction: the nullclines cross on the x axis
Finally, for e = 2, the nullclines do not cross in Quadrant I and the trajectories
move toward the point (4/2, 0). We show the trajectories in Fig. 11.15.
Fig. 11.15 Sample Predator–Prey model with self interaction: the nullclines cross in Quadrant 4
have to work as hard as we did before to establish this! However, it is also clear there
is not a true average x and y value here as the trajectory is not periodic. However,
there is a notion of an asymptotic average value which we now discuss.
Now the discussions below will be complicated, but all of you can wade through it as
it does not really use any more mathematics than we have seen before. It is, however,
very messy and looks quite intimidating! Still, mastering these kind of things brings
rewards: your ability to think through complicated logical problems is enhanced! So
grab a cup of tea or coffee and let’s go for a ride. We are going to introduce the idea
of limiting average x and y values.
We know at any time t, the solutions x(t) and y(t) must be positive. Rewrite the
model as follows:
\frac{x'}{x} + e\,x = a - b\,y, \qquad \frac{y'}{y} + f\,y = -c + d\,x.
We obtain
\ln\frac{x(t)}{x_0} + e \int_0^t x(s)\, ds = a\,t - b \int_0^t y(s)\, ds,
\qquad
\ln\frac{y(t)}{y_0} + f \int_0^t y(s)\, ds = -c\,t + d \int_0^t x(s)\, ds.
The functions X(t) = \int_0^t x(s)\, ds and Y(t) = \int_0^t y(s)\, ds are also continuous by the Fundamental Theorem of Calculus. Using the new variables X and Y, we can rewrite these integrations as

\ln\frac{x(t)}{x_0} = a\,t - e\,X(t) - b\,Y(t),
\qquad
\ln\frac{y(t)}{y_0} = -c\,t + d\,X(t) - f\,Y(t).
Hence,

d \ln\frac{x(t)}{x_0} = a\,d\,t - e\,d\,X(t) - b\,d\,Y(t),
\qquad
e \ln\frac{y(t)}{y_0} = -c\,e\,t + d\,e\,X(t) - e\,f\,Y(t).

Adding these two equations, the X(t) terms cancel and we find

\ln\left[\left(\frac{x(t)}{x_0}\right)^{d} \left(\frac{y(t)}{y_0}\right)^{e}\right] = (a\,d - c\,e)\,t - (e\,f + b\,d)\,Y(t).
From Fig. 11.7, it is easy to see that no matter what (x0 , y0 ) we choose in Quadrant
1, the trajectories are bounded and so there is a positive constant we will call B so
that
\left| \ln\left[\left(\frac{x(t)}{x_0}\right)^{d} \left(\frac{y(t)}{y_0}\right)^{e}\right] \right| \le B.
Hence, if we let t grow larger and larger, B/t gets smaller and smaller, and in fact

\lim_{t \to \infty} \frac{1}{t} \left| \ln\left[\left(\frac{x(t)}{x_0}\right)^{d} \left(\frac{y(t)}{y_0}\right)^{e}\right] \right| \le \lim_{t \to \infty} \frac{B}{t} = 0.
But the left hand side is always non-negative also, so we have
0 \le \lim_{t \to \infty} \frac{1}{t} \left| \ln\left[\left(\frac{x(t)}{x_0}\right)^{d} \left(\frac{y(t)}{y_0}\right)^{e}\right] \right| \le 0,

so the limit is 0. The term Y(t)/t is actually (1/t)\int_0^t y(s)\, ds, which is the average of the solution y on the interval [0, t]. Dividing the identity above by t and letting t → ∞, it therefore follows that

\lim_{t \to \infty} \frac{1}{t} \int_0^t y(s)\, ds = \frac{a\,d - c\,e}{e\,f + b\,d}.
But the term on the right hand side is exactly the y coordinate of the intersection
of the nullclines, y ∗ . We conclude the limiting average value of the solution y is
given by
\lim_{t \to \infty} \frac{1}{t} \int_0^t y(s)\, ds = y^*. \qquad (11.2)
These two results are similar to what we saw in the Predator–Prey model without
self-interaction. Of course, we only had to consider the averages over the period
before, whereas in the self-interaction case, we must integrate over all time. It is
instructive to compare these results:
Look back at the signs we see in Fig. 11.9. It is clear that trajectories that start to the left of c/d go up and to the left until they enter the (−, −) region. The analysis we did for trajectories starting on the x axis or y axis in the crossing nullclines case is still appropriate. So we know that once a trajectory is in the (−, −) region, it can't hit the x axis except at a/e. Similarly, a trajectory that starts in (+, −) moves right and down
towards the x axis, but can’t hit the x axis except at a/e. We can look at the details of
the (+, −) trajectories by reusing the material we figured out in the limiting averages
discussion. Since this trajectory is bounded, as t grows arbitrarily large, the x(t) and
y(t) values must approach fixed values. We will call these asymptotic x and y values
x ∞ and y ∞ for convenience. The trajectory must satisfy
\frac{1}{t} \ln\left[\left(\frac{x(t)}{x_0}\right)^{d} \left(\frac{y(t)}{y_0}\right)^{e}\right] = (a\,d - c\,e) - (e\,f + b\,d)\,\frac{1}{t}\,Y(t), \qquad (11.4)
with the big difference that the term a d − c e is now negative. Exponentiate to obtain
\left(\frac{x(t)}{x_0}\right)^{d} \left(\frac{y(t)}{y_0}\right)^{e} = e^{(a d - c e)\, t}\, e^{-(e f + b d)\, Y(t)}. \qquad (11.5)
Now note
• The term e−(e f +b d)Y (t) is bounded by 1.
• The term e(a d−c e) t goes to zero as t gets large because a d − c e is negative.
Hence, as t increases to infinity, we find
\lim_{t \to \infty} \left(\frac{x(t)}{x_0}\right)^{d} \left(\frac{y(t)}{y_0}\right)^{e} = \left(\frac{x^{\infty}}{x_0}\right)^{d} \left(\frac{y^{\infty}}{y_0}\right)^{e} = 0.
11.6.1 Homework
Draw suitable trajectories for the following Predator–Prey models with self interac-
tion in great detail.
Exercise 11.6.1
Exercise 11.6.2
Exercise 11.6.3
Exercise 11.6.4
Exercise 11.6.5
We can now summarize how you would completely solve a typical Predator–Prey
self-interaction model problem from first principles. These are the steps you need to
do:
1. Draw the nullclines reasonably carefully in multiple colors to make your teacher happy.
2. Determine if the nullclines cross as this makes a big difference in the kind of
trajectories we will see. Find the place where the nullclines cross if they do.
3. Once you know what the nullclines do, you can solve these problems completely.
Draw a few trajectories in each of the regions determined by the nullclines.
4. From our work in Sect. 11.5, we know the solutions to the Predator–Prey model
with self-interaction having initial conditions in Quadrant 1 are always positive.
You can then use this fact to derive the amazingly true statement that a solution
pair, x(t) and y(t), satisfies
\ln\left[\left(\frac{x(t)}{x_0}\right)^{d} \left(\frac{y(t)}{y_0}\right)^{e}\right] = (a\,d - c\,e)\,t - (e\,f + b\,d) \int_0^t y(s)\, ds.
From our theoretical investigations, we know that if the ratio c/d exceeds the ratio a/e, the solutions should approach the point (a/e, 0) as time gets large. Let's see if we get that result numerically. Let's try this problem,
Let’s try this problem,
We generate the plot as shown in Fig. 11.16. Note that here c/d = 4/5 and a/e = 2/3
so c/d > a/e which tells us the nullcline intersection is in Quadrant 4. Hence, all
trajectories should go toward a/e = 2/3 on the x-axis.
Now let’s look at what happens when the nullclines cross. We now use the model
Since a/e = 2/1.5 and c/d = 4/5, the nullclines cross in Quadrant 1 and the
trajectories should converge to
x^* = \frac{a f + b c}{e f + b d} = \frac{2(1.5) + 3(4)}{1.5(1.5) + 3(5)} = \frac{15}{17.25} \approx 0.87,
\qquad
y^* = \frac{a d - e c}{e f + b d} = \frac{2(5) - 1.5(4)}{17.25} = \frac{4}{17.25} \approx 0.23.
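We can check this numerically. The sketch below is ours, not the text's, and it assumes the model implied by the parameter values quoted above, x' = 2x − 3xy − 1.5x², y' = −4y + 5xy − 1.5y²; under that assumption the trajectory should settle near (x*, y*):

% Sketch (assumed parameters a=2, b=3, c=4, d=5, e=1.5, f=1.5 as above).
a = 2; b = 3; c = 4; d = 5; e = 1.5; f = 1.5;
xs = (a*f + b*c)/(e*f + b*d);          % x* from Cramer's rule
ys = (a*d - e*c)/(e*f + b*d);          % y* from Cramer's rule
rhs = @(t,u) [ u(1)*(a - b*u(2) - e*u(1)); u(2)*(-c + d*u(1) - f*u(2)) ];
[t,u] = ode45(rhs, [0 200], [0.5; 0.5]);
fprintf('(x*, y*) = (%.4f, %.4f); final state = (%.4f, %.4f)\n', xs, ys, u(end,1), u(end,2));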
11.8.1 Homework
Exercise 11.8.1
Exercise 11.8.2
Exercise 11.8.3
Exercise 11.8.4
Exercise 11.8.5
x' = a\,x - b\,x\,y - e\,x^2, \qquad y' = -c\,y + d\,x\,y - f\,y^2.
The term f y 2 models how much is lost to self interaction between the predators.
It seems reasonable that this loss should be less than the amount of food fish that
are being eaten by the predators. Hence, we will assume in this model that f < b
always so that we get a biologically reasonable model. Then note adding fishing can
be handled in the same way as before. The role of the average values will now be
played by the value of the intersection of the nullclines. We have
∗ ∗ a f + bc ad − ec
xno , yno = ,
e f + bd e f + bd
(a − r ) f + b(c + r ) (a − r )d − e(c + r )
xr∗ , yr∗ = ,
e f + bd e f + bd
a f + bc ad − ec −r f + br −r d − er )
= , + ,
e f + bd e f + bd e f + bd e f + bd
∗ ∗ b − f d + e
= xno , yno +r , −
e f + bd e f + bd
∗ ∗ ∗ ∗ r b − f d +e
xr/2 , yr/2 = xno , yno + , −
2 e f + bd e f + bd
Now compare:

x^*_{r/2} = x^*_{no} + \frac{r}{2}\, \frac{b - f}{e f + b d}
= x^*_{no} + r\, \frac{b - f}{e f + b d} - \frac{r}{2}\, \frac{b - f}{e f + b d}
= x^*_{r} - \frac{r}{2}\, \frac{b - f}{e f + b d},

which shows us x^*_{r/2} goes down with the reduction in fishing as b > f. Similarly,
y^*_{r/2} = y^*_{no} - \frac{r}{2}\, \frac{d + e}{e f + b d}
= y^*_{no} - r\, \frac{d + e}{e f + b d} + \frac{r}{2}\, \frac{d + e}{e f + b d}
= y^*_{r} + \frac{r}{2}\, \frac{d + e}{e f + b d},

which shows us y^*_{r/2} goes up with the reduction in fishing since d + e > 0. This is the same
behavior we saw in the original Predator–Prey model without self-interaction.
In our study of the Predator–Prey model, we have seen the model without self-
interaction was very successful at giving us insight into the fishing catch data during
World War I in the Mediterranean sea. This was despite the gross nature of the model.
No attempt was made to separate the food fish category into multiple classes of food
fish; no attempt made to break down the predatory category into various types of
predators. Yet, the modeling was ultimately successful as it provided illumination
into a biological puzzle. However, the original model lacked the capacity for self-
interaction and so it seemed plausible to add this feature. The self-interaction terms
we use in this chapter seemed quite reasonable, but our analysis has shown it leads
to completely wrong biological consequences. This tells us the way we model self-
interaction is wrong. The self-interaction model, in general, would be
x' = a\,x - b\,x\,y - e\,u(x, y), \qquad y' = -c\,y + d\,x\,y - f\,v(x, y)
where u(x, y) and v(x, y) are functions of both x and y that determine the self-
interaction. To analyze this new model, we would proceed as before. We determine
the nullclines
0 = x' = a\,x - b\,x\,y - e\,u(x, y), \qquad 0 = y' = -c\,y + d\,x\,y - f\,v(x, y)
and begin our investigations. We will have to decide if the model generates trajectories
that remain in Quadrant 1 if they start in Quadrant 1 so that they are biologically
reasonable. This will require a lot of hard work!
Note, we can simply compute solutions using MatLab or some other tool. We will
not know what the true solution is or even any ideas as to what general appearance it
might have. You should be able to see that a balanced blend of mathematical analysis,
computational study using a tool and intuition from the underlying science must be
used together to solve the problems.
Chapter 12
Disease Models
We will now build a simple model of an infectious disease called the SIR model.
Assume the total population we are studying is fixed at N individuals. This population
is then divided into three separate pieces: we have individuals
• that are susceptible to becoming infected are called Susceptible and are labeled
by the variable S. Hence, S(t) is the number that are capable of becoming infected
at time t.
• that can infect others. They are called Infectious and the number that are infectious
at time t is given by I (t).
• that have been removed from the general population. These are called Removed
and their number at time t is labeled by R(t).
We make a number of key assumptions about how these population pools interact.
• Individuals stop being infectious at a positive rate γ which is proportional to the
number of individuals that are in the infectious pool. If an individual stops being
infectious, this means this individual has been removed from the population. This
could mean they have died, the infection has progressed to the point where they
can no longer pass the infection on to others or they have been put into quarantine
in a hospital so that further interactions with the general population is not possible.
In all of these cases, these individuals are not infectious or can’t cause infections
and so they have been removed from the part of the population N which can be
infected or is susceptible. Mathematically, this means we assume
Iloss = −γ I.
Igain = r S I.
We can then figure out the net rates of change of the three populations. The infectious
populations gains at the rate r S I and loses at the rate γ I . Hence, the net gain is
Igain + Iloss or
I = r S I − γ I.
The net change of Susceptible’s is that of simple decay. Susceptibles are lost at the
rate −r S I . Thus, we have
S = − r S I.
Finally, the removed population increases at the same rate the infectious population
decreases. We have
R = γ I.
We also know that R(t) + S(t) + I (t) = N for all time t because our population is
constant. So only two of the three variables here are independent. We will focus on
the variables I and S from now on. Our complete Infectious Disease Model is then
I = r S I − γ I (12.1)
S = −r S I (12.2)
I (0) = I0 (12.3)
S(0) = S0 . (12.4)
I = 0 = I (r S − γ)
S = 0 = − r S I.
Fig. 12.1 Finding where I < 0 and I > 0 for the disease model
Fig. 12.2 Finding where S < 0 and S > 0 regions for the disease model
The nullcline information for I = 0 and S = 0 can be combined into one picture
which we show in Fig. 12.3.
384 12 Disease Models
Fig. 12.3 Finding the (I , S ) algebraic sign regions for the disease model
12.1.1 Homework
For the following disease models, do the I and S nullcline analysis separately and
then assemble.
Exercise 12.1.1
Exercise 12.1.2
Exercise 12.1.3
Exercise 12.1.4
Consider a trajectory that starts at a point on the positive I axis. Hence, I0 > 0 and
S0 = 0. It is easy to see that if we choose S(t) = 0 for all time t and I satisfying
I = −γ I
I (0) = I0
is trajectory. Since trajectories can not cross, we now know that a trajectory starting
in Quadrant 1 with biologically reasonable values of I0 > 0 and S0 > 0 must remain
on the right side of the I –S plane. Next, if we look at a trajectory which starts on
the positive S axis at the point S0 > 0 and I0 = 0, we see immediately that the
pair S(t) = S0 and I (t) = 0 for all time t satisfies the disease model by direct
calculation:
(S0 ) = 0 = −r S0 0.
In fact, any trajectory with starting point on the positive S axis just stays there. This
makes biological sense as since I0 is 0, there is no infection and hence no disease
dynamics at all. On the other hand, for the point I0 > 0 and S0 > γ/r , the algebraic
signs we see in Fig. 12.3 tell us the trajectory goes to the left and upwards until it
hits the line S = γ/r and then it decays downward toward the S axis. The trajectory
386 12 Disease Models
can’t hit the I axis as that would cross a trajectory, so it must head downward until
it hits the positive S axis. This intersection will be labeled as (S ∞ , 0) and it is easy
to see S ∞ < γ/r . At the point (S ∞ , 0), the trajectory will stop as both the I and S
derivatives become 0 there. Hence, we conclude we only need to look at trajectories
starting in Quadrant 1 with I0 > 0 as shown in Fig. 12.4.
12.2.1 Homework
For the following disease models, analyze the trajectories on the positive I and S
axis and show why this means disease trajectories that start in Q1+ stay there and
end on the positive S axis.
Exercise 12.2.1
Exercise 12.2.2
Exercise 12.2.3
Exercise 12.2.4
We know that biologically reasonable solutions occur with initial conditions starting
in Quadrant 1 and we know that our solutions satisfy S < 0 always with both S and
I positive until we hit the S axis. Let the time where we hit the S axis be given by t ∗ .
Then, we can manipulate the disease model as follows. For any t < t ∗ , we can divide
to obtain
Thus,
dI γ 1
= −1 +
dS r S
or integrating, we find
γ S(t)
I (t) − I0 = − S(t) − S0 + ln .
r S0
Dropping the dependence on time t for convenience of notation, we see in Eq. 12.5,
the functional dependence of I on S.
γ S
I = I0 + S0 − S ln . (12.5)
r S0
It is clear that this curve has a maximum at the critical value γ/r . This value is very
important in infectious disease modeling and we call it the infectious to susceptible
rate ρ. We can use ρ to introduce the idea of an epidemic.
For this model, we say the infection becomes an epidemic if the initial value of
susceptibles, S0 exceeds the critical infectious to susceptible ratio ρ = γr because
the number of infections increases to its maximum before it begins to drop. This
12.3 The I Versus S Curve 389
behavior is easy to interpret as an infection going out of control; i.e. it has entered
an epidemic phase.
12.4 Homework
We are now ready to do some exercises. For the following disease models
12.5.1 Homework
For the following disease models, do the single plot corresponding to an initial
condition that gives an epidemic and also draw a nice phase plane plot using
AutoPhasePlanePlot.
Exercise 12.5.1
Exercise 12.5.2
Exercise 12.5.3
R = γ I = γ (N − R − S).
12.6 Estimating Parameters 393
Further, we know
dS
dS −r S I
= dt
=
dR dR
dt
−γ I
S
=− .
ρ
Hence,
R = γ (N − R − S0 e−R/ρ ). (12.7)
This differential equation is not solvable directly, so we will try some estimates.
To estimate the solution to Eq. 12.7, we would like to replace the term e which makes
our integration untenable with a quadratic approximation like that of Eq. 3.5 from
Sect. 3.1.3. We would approximate around the point R = 0, giving
R R 1 R 2
Q = 1− +
ρ ρ 2 ρ
We need to see if this error is not too large. Recall the I and S solution satisfies
S
I + S = I0 + S0 + ρ ln .
S0
An epidemic would start with the number of removed individuals R0 being 0. Hence,
we know initially N = I0 + S0 and so since R = N − I − S, we have
S
N − R = N + ρ ln
S0
394 12 Disease Models
or
S S0
R = −ρ ln = ρ ln .
S0 S
We know S always decreases from its initial value of S0 , so the fraction S0 /S is larger
than one; hence, the logarithm is positive. We conclude
R S
− = ln < 1.
ρ S0
Now, let’s use this approximation in Eq. 3.5 to derive an approximation to R(t). The
approximate differential equation to solve is
R
R = γ N − R − S0 Q (12.8)
ρ
R 1 R 2
= γ N − R − S0 1 − + . (12.9)
ρ 2 ρ
This can be rewritten as follows (we will go through all the steps because it is
intense!):
S0 2 ρ2 S0 2 ρ2
R = γ N − S0 + −1 R − R2
2 ρ2 S0 ρ S0
S0 S0 2ρ 2
2 ρ2
= −γ R −
2
−1 R − N − S0
2 ρ2 ρ S0 S0
S0 S0 − ρ N − S0
= −γ R 2
− 2 ρ R − 2 ρ 2
.
2 ρ2 S0 S0
Now the next step is truly complicated. We complete the square on the quadratic.
This gives
2 2
S0 S0 − ρ S0 − ρ S0 − ρ
R = −γ R −
2
2ρ R+ ρ −
2
ρ2
2 ρ2 S0 S0 S0
N − S0
− 2 ρ2
S0
12.6 Estimating Parameters 395
2 2
S0 S0 − ρ S0 − ρ N − S0
= −γ R− ρ − ρ2 − 2 ρ2 .
2 ρ2 S0 S0 S0
and
S0 − ρ
β= ρ.
S0
Now we can go about the business of solving the differential equation. We will use a
new approach (rather than the integration by partial fraction decomposition we have
already used). After separating variables, we have
dR S0
= −γ dt.
(R − β)2 − α2 2 ρ2
Substitute u = R − β to obtain
du S0
= −γ dt.
u2 −α 2 2 ρ2
We will do these integrations using what are called hyperbolic trigonometric func-
tions. We make the following definitions:
e x − e−x
sinh(x) = , the hyperbolic sine,
2
e x + e−x
cosh(x) = , the hyperbolic cosine,
2
sinh(x)
tanh(x) = ,
cosh(x)
e x − e−x
= x , the hyperbolic tangent,
e + e−x
396 12 Disease Models
1
sech(x) = ,
cosh(x)
2
= x , the hyperbolic secant,
e + e−x
Note that these definitions are similar, but different, from the ones you are used to
with the standard trigonometric functions sin(x), cos(x) and so forth.
And then there are the derivatives:
Definition 12.6.2 (Hyperbolic Function Derivatives)
The hyperbolic functions are continuous and differentiable for all real x. We have
sinh(x) = cosh(x)
cosh(x) = sinh(x)
tanh(x) = sech 2 (x)
Now let’s go back to the differential equation we need to solve. Make the substitution
u = α tanh(z). Then, we have du = α sech 2 (z) dz. Making the substitution, we
find
α sech 2 (z) dz S0
= −γ dt.
α (tanh (z) − 1)
2 2 2 ρ2
α sech 2 (z) dz −1 S0
= dz = −γ dt.
−α sech (z)
2 2 α 2 ρ2
S0
dz = α γ dt.
2 ρ2
Integrating, we obtain
S0
z(t) − z(0) = α γ t.
2 ρ2
12.6 Estimating Parameters 397
Just like there is an inverse tangent for trigonometric functions, there is an inverse
for the hyperbolic tangent also.
Definition 12.6.3 (Inverse Hyperbolic Function)
It is straightforward to see that tanh(x) is always increasing and hence it has a nicely
defined inverse function. We call this inverse the inverse hyperbolic tangent function
and denote it by the symbol tanh−1 (x).
We can show using rather messy calculations the following sum and difference
formulae for tanh. We will be using these in a bit.
tanh(u) + tanh(v)
tanh(u + v) =
1 + tanh(u) tanh(v)
tanh(u) − tanh(v)
tanh(u − v) = .
1 − tanh(u) tanh(v)
We have
−1 R(t) − β −1 R0 − β S0
tanh − tanh = αγ t.
α α 2 ρ2
Now we want R as a function of t, so we have some algebra to suffer our way through.
Grab another cup of coffee as this is going to be a rocky ride!
R(t) − β β S0 R(t) − β β
+ = tanh α γ t 1+ .
α α 2 ρ2 α α
398 12 Disease Models
using the addition formula for tanh from Definition 12.6.3. We can then find the long
sought formula for R . It is
α2 γ S0 2 α γ S0
R (t) = sech t − φ
2ρ2 2ρ2
Assume we have collected data for the rate of change of R with respect to time during
an infectious incident. The general R model is of the form
R (t) = A sech 2 (a t − b)
for some choice of positive constants a, b and A. We fit our R data by choosing a,
b and A carefully using some technique (these sorts of tools would be discussed in
another class). We know
N − S0
α2 = β 2 + 2 ρ2 . (12.10)
S0
12.6 Estimating Parameters 399
The right hand side is known from our data fit and we can assume we have an estimate
of the total population N also. In addition, if we can estimate the initial susceptible
value S0 , we will have an estimate for the critical value ρ from our data:
2
1 S0 A
ρ =
2
1 − tanh (b) .
2
2 N − S0 a
It is a lot of work to generate the approximate value of R but the payoff is that
we obtain an estimate of the γ/r ratio which determines whether we have an
epidemic or not.
Chapter 13
A Cancer Model
1890 David von Hansermann noted cancer cells have abnormal cell division events.
1914 Theodore Boveri sees that something is wrong in the chromosomes of can-
cer cells: they are aneuploid. That is, they do not have the normal number of
chromosomes.
1916 Ernst Tyzzer first applied the term somatic mutation to cancer.
1927 Herman Muller discovered ionizing radiation which was known to cause can-
cer (i.e. was carcinogenic) was also able to cause genetic mutations (i.e. was
mutagenic).
1951 Herman Muller proposed cancer requires a single cell to receive multiple
mutations.
1950–1959 Mathematical modeling of cancer begins. It is based on statistics.
1971 Alfred Knudson proposes the concept of a Tumor Suppressor Gene or TSG.
The idea is that it takes a two point mutation to inactivate a TSG. TSG’s play a
central role in regulatory networks that determine the rate of cell cycling. Their
inactivation modifies regulatory networks and can lead to increased cell prolifer-
ation.
1986 A Retinoblastoma TSG is identified which is a gene involved in a childhood
eye cancer.
Since 1986, about 30 more TSG’s have been found. An important TSG is p53.
This is mutated in more than 50 % of all human cancers. This gene is at the center of a
control network that monitors genetic damage such as double stranded breaks (DSB)
of DNA. In a single stranded break (SSB), at some point the double stranded helix
of DNA breaks apart on one strand only. In a DSB, the DNA actually separates into
different pieces giving a complete gap. DSB’s are often due to ionizing radiation. If
a certain amount of damage is achieved, cell division is paused and the cell is given
time for repair. If there is too much damage, the cell will undergo apoptosis. In many
cancer cells, p53 is inactivated. This allows these cells to divide in the presence of
substantial genetic damage. In 1976, Michael Bishop and Harold Varmus introduced
the idea of oncogenes. These are another class of genes involved in cancer. These
genes increase cell proliferation if they are mutated or inappropriately expressed.
Now a given gene that occupies a certain position on a chromosome (this position is
called the locus of the gene) can have a number of alternate forms. These alternate
forms are called alleles. The number of alleles a gene has for an individual is called
that individual’s genotype for that gene. Note, the number of alleles a gene has is
therefore the number of viable DNA codings for that gene. We see then that mutations
of a TSG and an oncogene increase the net reproductive rate or somatic fitness of a
cell. Further, mutations in genetic instability genes also increase the mutation rate.
For example, mutations in mismatch repair genes lead to 50–100 fold increases in
point mutation rates. These usually occur in repetitive stretches of short sequences of
DNA. Such regions are called micro satellite regions of the genome. These regions are
used as genetic markers to track inheritance in families. They are short sequences of
nucleotides (i.e. ATCG) which are repeated over and over. Changes can occur such
as increasing or decreasing the number of repeats. This type of instability is thus
called a micro satellite or MIN instability. It is known that 15 % of colon cancer cells
have MIN.
13 A Cancer Model 403
Let’s look now at colon cancer itself. Look at Fig. 13.1. In this figure, you see a
typical colon crypt. Stem cells at the bottom of the crypt differentiate and move up
the walls of the crypt to the colon lining where they undergo apoptosis.
At the bottom of the crypt, a small number of stem cells slowly divide to produce
differentiated cells. These differentiated cells divide a few times while migrating
to the top of the crypt where they undergo apoptosis. This architecture means only
a small subset of cells are at risk of acquiring mutations that become fixed in the
permanent cell lineage. Many mutations that arise in the differentiated cells will be
removed by apoptosis.
Colon rectal cancer is thought to arise as follows. A mutation inactivates the
Adenomatous Polyposis Coli or APC TSG pathway. Ninety- five percent of col-
orectal cancer cells have this mutation with other mutations accounting for the other
5 %. The crypt in which the APC mutant cell arises becomes dyplastic; i.e. has
abnormal growth and produces a polyp. Large polyps seem to require additional
oncogene activation. Then 10–20 % of these large polyps progress to cancer.
Assumption 13.1.2 (Mutation rates û2 and û3 give selective advantage)
The events governed by û2 and û3 give what is called selective advantage. This
means that the size of the population size does matter.
Using these assumptions, we will model û2 and û3 like this:
û2 = N u2
u1 û2
Without CIN A+/+ A+/− A−/−
uc uc uc
u1 û3
With CIN A+/+ CIN A+/− CIN A−/− CIN
and
û3 = N u3 .
where u2 and u3 are neutral rates. We can thus redraw our figure as Fig. 13.3.
The mathematical model is then setup as follows. Let
X0 (t) is the probability a cell in cell type A+/+ at time t.
X1 (t) is the probability a cell in cell type A+/− at time t.
X2 (t) is the probability a cell in cell type A−/− at time t.
Y0 (t) is the probability a cell in cell type A+/+ CIN at time t.
Y1 (t) is the probability a cell in cell type A+/− CIN at time t.
Y2 (t) is the probability a cell in cell type A−/− CIN at time t.
Looking at Fig. 13.3, we can generate rate equations. First, let’s rewrite Fig. 13.3
using our variables as Fig. 13.4.
To generate the equations we need, note each box has arrows coming into it and
arrows coming out of it. The arrows in are growth terms for the net change of the
variable in the box and the arrows out are the decay or loss terms. We model growth
as exponential growth and loss as exponential decay. So X0 only has arrows going
out which tells us it only has loss terms. So we would say X0 loss = −u1 X0 − uc X0
which implies X0 = −(u1 + uc )X0 . Further, X1 has arrows going in and out which
tells us it has growth and loss terms. So we would say X1 loss = −Nu2 X1 − uc X1
and X1 growth = u1 X0 which implies X1 = u1 X0 − (Nu2 + uc )X1 . We can continue
u1 N u2
Without CIN A+/+ A+/− A−/−
uc uc uc
u1 N u3
A+/+ CIN A+/− CIN A−/− CIN
With CIN
Fig. 13.3 The pathways for the TSG allele losses rewritten using selective advantage
u1 N u2
Without CIN X0 X1 X2
uc uc uc
u1 N u3
Y0 Y1 Y2
With CIN
Fig. 13.4 The pathways for the TSG allele losses rewritten using mathematical variables
406 13 A Cancer Model
in this way to find all the model equations. We can then see the Cancer Model rate
equations are
X0 = −(u1 + uc ) X0 (13.1)
X1 = u1 X0 − (uc + N u2 ) X1 (13.2)
X2 = N u2 X1 − uc X2 (13.3)
Y0 = uc X0 − u1 Y0 (13.4)
Y1 = uc X1 + u1 Y0 − N u3 Y1 (13.5)
Y2 = N u3 Y1 + uc X2 (13.6)
Since our interest in these variables is over the typical lifetime of a human being, we
need to pick a maximum typical lifetime.
Assumption 13.2.1 (Average human lifetime)
The average human life span is 100 years. We also assume that cells divide once per
day and so a good choice of time unit is days. The final time for our model will be
denoted by T and hence
13.2 Model Assumptions 407
Next, recall our colonic crypt, N is from 1000 to 4000 cells. For estimation purposes,
we often think of N as the upper value, N = 4 × 103 .
u1 ≈ 10−7
u2 ≈ 10−7 .
We will assume the rate N u3 is quite rapid and so it is close to 1. We will set u3 as
follows:
Assumption 13.2.3 (Losing the second allele due to CIN is close to probability one)
We assume
N u3 ≈ 1 − r.
Hence, once a cell reaches the Y1 state, it will rapidly transition to the end state Y2 if
r is sufficiently small.
We are not yet sure how to set the magnitude of uc , but it certainly is at least u1 .
For convenience, we will assume
uc = R u1 .
where R is a number at least 1. For example, if uc = 10−5 , this would mean R = 100.
The mathematical model can be written in matrix vector form as usual. This gives
⎡ ⎤ ⎡ ⎤ ⎡ ⎤
X0 −(u1 + uc ) 0 0 0 0 0 X0
⎢X ⎥ ⎢ u −(u + Nu2 ) 0 0 0 0⎥ ⎢X1 ⎥
⎢ 1 ⎥ ⎢ 1 c ⎥ ⎢ ⎥
⎢X ⎥ ⎢ 0 Nu −uc 0 0 0⎥ ⎢X2 ⎥
⎢ 2 ⎥ = ⎢ 2 ⎥ ⎢ ⎥
⎢Y ⎥ ⎢ uc 0 0 −u1 0 0⎥ ⎢Y0 ⎥
⎢ 0 ⎥ ⎢ ⎥ ⎢ ⎥
⎣Y1 ⎦ ⎣ 0 uc 0 u1 −Nu3 0⎦ ⎣Y1 ⎦
Y2 0 0 uc 0 Nu3 0 Y2
408 13 A Cancer Model
⎡ ⎤ ⎡ ⎤
X0 (0) 1
⎢X1 (0)⎥ ⎢0⎥
⎢ ⎥ ⎢ ⎥
⎢X2 (0)⎥ ⎢0⎥
⎢ ⎥ ⎢ ⎥
⎢Y0 (0)⎥ = ⎢0⎥
⎢ ⎥ ⎢ ⎥
⎣Y1 (0)⎦ ⎣0⎦
Y2 (0) 0
We can find eigenvalues and associated eigenvectors for this system as we have done
for 2 × 2 models in the past. However, it is clear, we can solve the X0 , X1 system
directly and then use that solution to find X2 . Here are the details.
We will now solve the top pathway model exactly using the tools we have developed
in this course.
X0 −(u1 + uc ) 0 X0
=
X1 u1 −(uc + Nu2 ) X1
X0 (0) 1
=
X1 (0) 0
Hence, the two eigenvalues are r1 = −(u1 +uc ) and r2 = −(uc +Nu2 ). The associated
eigenvectors are straightforward to find:
1 0
E1 = u1 and E2 =
Nu2 −u1 1
Note, we don’t have to use all the machinery of eigenvalues and eigenvectors here.
It is clear that solving X0 is a simple integration.
The next solutions all use the integrating factor method. The next four variables all
satisfy models of the form. u (t) = −au(t) + f (t); u(t) = 0 where f is the external
data. Using the Integrating factor approach, we find
Hence,
t
u(t) = e−at eas f (s)ds.
0
u1
= e−(uc +u1 )t − e(uc +Nu2 )t
Nu2 − u1
Nu1 u2 1 −uc t
X2 (t) = e − e−(u1 +uc )t
Nu2 − u1 u1
1
− e−uc t − e−(uc +Nu2 )t
Nu2
The same integrating factor technique with Y0 = −u1 Y0 + uc X0 with Y (0) = 0 using
a = u1 and f (t) = uc X0 (t) leads to
t
Y0 (t) = e−u1 t eu1 s uc X0 (s) ds.
0
However, the solutions for the models Y1 and Y2 are very messy and so we will try
to see what is happening approximately.
To see how to approximate these solution, let’s recall ideas from Sect. 3.1 and apply
them to the approximation of the difference of two exponentials. Let’s look at the
function f (t) = e−rt − e−(r+a)t for positive r and a. To approximate this difference,
we expand each exponential function into the second order approximation plus the
error as usual.
t2 t3
e−rt = 1 − rt + r 2 − r 3 e−rc1
2 6
2
t t3
e−(r+a)t = 1 − (r + a)t + (r + a)2 − (r + a)3 e−(r+a)c2
2 6
for some c1 and c2 between 0 and t. Subtracting, we have
t2 t3
e−rt − e−(r+a)t = 1 − rt + r 2 − r 3 e−rc1
2 6
t2 t3
− 1 − (r + a)t + (r + a)2 (r + a)3 e−(r+a)c2
2 6
t2 t3
= at − (a2 + 2ar) + −r 3 e−rc1 + (r + a)3 e−(r+a)c2
2 6
13.5 Approximation Ideas 411
We conclude
t2 t3
e−rt − e−(r+a)t = at − (a2 + 2ar) + −r 3 e−rc1 + (r + a)3 e−(r+a)c2 (13.11)
2 6
We can also approximate the function g(t) = e−(r+a)t − e−(r+b)t for positive r, a and
b. Using a first order tangent line type approximation, we have
t2
e−(r+a)t = 1 − (r + a)t + (r + a)2 e−(r+a)c1
2
2
2 −(r+a)c2 t
e−(r+b)t = 1 − (r + b)t + (r + b) e
2
for some c1 and c2 between 0 and t. Subtracting, we find
t2
e−(r+a)t − e−(r+b)t = 1 − (r + a)t + (r + a)2 e−(r+a)c1
2
t2
− 1 − (r + b)t + (r + b)2 e−(r+b)c2
2
t2
= (−a + b)t + (r + a)2 e−(r+a)c1 − (r + b)2 e−(r+b)c2
2
We conclude
t2
e−(r+a)t − e−(r+b)t = (−a + b)t + (r + a)2 e−(r+a)c1 − (r + b)2 e−(r+b)c2 (13.12)
2
13.5.1 Example
Solution We know
t2 t3
e−rt − e−(r+a)t = at − (a2 + 2ar) + −r 3 e−rc1 + (r + a)3 e−(r+a)c2
2 6
t2
e−1.0t − e−1.1t ≈ 0.1t − (0.01 + 0.2) = 0.1t − 0.105t 2
2
and the error is on the order of t 3 which we write as O(t 3 ) where O stands for order.
t2 t3
e−rt − e−(r+a)t = at − (a2 + 2ar) + −r 3 e−rc1 + (r + a)3 e−(r+a)c2
2 6
t2
e−2.0t − e−2.1t ≈ 0.1t − (0.01 + 2(0.1)(2)) = 0.1t − 0.21t 2
2
t2
e−(r+a)t − e−(r+b)t = (−a + b)t + (r + a)2 e−(r+a)c1 − (r + b)2 e−(r+b)c2
2
Example 13.5.4 Approximate e−2.1t − e−2.2t − e−1.1t − e−1.2t using Eq. 13.12.
Solution We have
plus O(t 2 ) which is not very useful. Of course, if the numbers had been a little
different, we would have not gotten 0. If we instead approximate using Eq. 13.11 we
find
plus O(t 3 ) which is better. Note if the numbers are just right lots of stuff cancels!
13.5 Approximation Ideas 413
13.5.2 Homework
Exercise 13.5.6 Approximate e−1.1t − e−1.3t − e−0.7t − e−0.8t using Eqs. 13.11
and 13.12.
Exercise 13.5.7 Approximate e−2.2t − e−2.4t − e−1.8t − e−1.9t using Eqs. 13.11
and 13.12.
We will want to solve the Y0 –Y2 models in addition to solving the equations for the
top pathway. To do this, we can take advantage of some key approximations.
13.6.1 Approximating X0
We have
t2
X0 (t) = 1 − (u1 + uc )t + +(u1 + uc )2 e−(uc +u1 )c1
2
for some c1 between 0 and t. Hence, X0 (t) ≈ 1 − (u1 + uc )t with error E0 (t)
t2
E0 (t) = (u1 + uc )2 e−(uc +u1 )c1
2
T2
≤ (u1 + uc )2 .
2
414 13 A Cancer Model
We want this estimate for X0 be reasonable; i.e. first, give a positive number over the
human life time range and second, the discrepancy between the true X0 (t) and this
approximation is small. Hence, we will assume we want the maximum error in X0
to be 0.05, we which implies
or
R < 38.7
Since uc = R u1 , we se
13.6.2 Approximating X1
Recall
u1
X1 (t) = e−(uc +u1 )t − e−(uc +Nu2 )t
Nu2 − u1
Since X1 (t) is written as the difference of two exponentials, we can use a first order
approximation as discussed in Eq. 13.12, to find
u1
X1 (t) = (Nu2 − u1 )t + (uc + u1 )2 e−(uc +u1 )c1
Nu2 − u1
t2
− (uc + Nu2 )2 e−(uc +Nu2 )c2
2
u1 t2
= u1 t + (uc + u1 )2 e−(uc +u1 )c1 − (uc + Nu2 )2 e−(uc +Nu2 )c2 .
Nu2 − u1 2
13.6 Approximation of the Top Pathway 415
u1 t2
E1 = max (uc + u1 )2 e−(uc +u1 )c1 − (uc + Nu2 )2 e−(uc +Nu2 )c2
0≤t≤T Nu2 − u1 2
u1 T2
≤ (uc + u1 )2 + (uc + Nu2 )2 .
Nu2 − u1 2
We have already found the R < 39 and for N at most 4000 with u1 = u2 = 10−7 ,
we see Nu2 − u1 ≈ Nu2 . Further, uc = Ru1 ≤ 10−5 and Nu2 ≤ 4 × 10−4 so that in
the second term, Nu2 is dominant. We therefore have
u1 T2
E1 ≤ (1 + R)2 (u1 )2 + (Nu2 )2 .
Nu2 2
u1 T2
E1 ≈ (Nu2 )2
Nu2 2
2
T
= Nu1 u2 = 4000 × 10−14 × 6.67 × 108 = 0.027.
2
13.6.3 Approximating X2
Now here comes the messy part. Apply the second order difference of exponen-
tials approximation from Eq. 13.11 above to our X2 solution. To make our notation
somewhat more manageable, we will define the error term E(r, a, t) by
t3 t3
E(r, a, t) = −r 3 e−rc1 + (r + a)3 e−(r+a)c2 ≤ 2(r + a)3
6 6
Note the maximum error over human life time is thus E(r, a) which is
T3
E(r, a) = 2(r + a)3 . (13.13)
6
Now let’s try to find an approximation for X2 (t). We have
Nu1 u2 1 −uc t 1
X2 (t) = e − e−(u1 +uc )t − e−uc t − e−(uc +Nu2 )t
Nu2 − u1 u1 Nu2
Nu1 u2 1 t2
= u1 t − (u12 + 2u1 uc ) + E(uc , u1 , t)
Nu2 − u1 u1 2
416 13 A Cancer Model
Nu1 u2 1 t2
− Nu2 t − ((Nu2 )2 + 2Nu2 uc ) + E(uc , Nu2 , t)
Nu2 − u1 Nu2 2
Nu1 u2 t2 E(uc , u1 , t)
X2 (t) = t − (u1 + 2uc ) +
Nu2 − u1 2 u1
Nu1 u2 t2 E(uc , Nu2 , t)
− t − (Nu2 + 2uc ) +
Nu2 − u1 2 Nu2
Nu1 u2 t2 E(uc , u1 , t)
X2 (t) = t − (u1 + 2uc ) + −t
Nu2 − u1 2 u1
t2 E(uc , Nu2 , t)
+ (Nu2 + 2uc ) −
2 Nu2
Nu1 u2 t2 E(uc , u1 , t) E(uc , Nu2 , t)
= (Nu2 − u1 ) + −
Nu2 − u1 2 u1 Nu2
t2 Nu2 u1
= Nu1 u2 + E(uc , u1 , t) − E(uc , Nu2 , t).
2 Nu2 − u1 Nu2 − u1
2
Hence, we see X2 (t) ≈ Nu1 u2 t2 with maximum error E2 over human life time T
given by
Nu2 u1
E2 = max E(uc , u1 , t) − E(uc , Nu2 , t).
0≤t≤T Nu2 − u1 Nu2 − u1
Nu2 u1
≤ E(uc , u1 ) + E(uc , Nu2 )
Nu2 − u1 Nu2 − u1
Nu2 u1 T3
= 2(u1 + uc )3 + 2(uc + Nu2 )3
Nu2 − u1 Nu2 − u1 6
u1 T3
E2 ≈ 2uc3 + (Nu2 )3
Nu2 − u1 6
3
u 1 T
≈ 2uc3 + (Nu2 )3
Nu2 6
3
T
≈ 2uc3 + u1 (Nu2 )2
6
13.6 Approximation of the Top Pathway 417
Table 13.1 The Non CIN Pathway Approximations with error estimates
Approximation Maximum error
T2
X0 (t) ≈ 1 − (u1 + uc ) t (u1 + uc )2 2
u1 T2
X1 (t) ≈ u1 t Nu2 −u1 (uc + u1 )2 + (uc + Nu2 )2 2
t2 3
X2 (t) ≈ N u1 u2 2 2uc3 T6
But the term u1 (Nu2 )2 is very small (≈1.6e − 14) and can also be neglected. So, we
have
T3
E2 ≈ 2uc3
6
We summarize our approximation results for the top pathway in Table 13.1.
Y0 = uc X0 − u1 Y0 (13.17)
Y1 = uc X1 + u1 Y0 − N u3 Y1 (13.18)
Y2 = N u3 Y1 + uc X2 . (13.19)
Now, replace X0 , X1 and X2 in the CIN cell population dynamics equations to give
us the approximate equations to solve.
418 13 A Cancer Model
is our model of the difference. We know E(t, 0) = (u1 + uc )2 e−(u1 +uc )β t2 here, so
2
t2
(t) = ± uc (u1 + uc )2 , (0) = 0
2
The maximum error we can have due to this difference over human life time is then
found by integrating. We have
T3
(T ) = uc (u1 + uc )2 ,
6
which is about 0.0005. Hence, this contribution to the error is a bit small. Next, let’s
think about the other approximation error. After rearranging, the Y0 approximate
dynamics are
Y0 + u1 Y0 = uc − uc (u1 + uc ) t.
We solve this equation using the integrating factor method, with factor eu1 t . This
yields
Y0 (t) eu1 t = uc eu1 t − uc (u1 + uc ) t eu1 t .
To see what is going on here, we split this into two pieces as follows:
uc u1 uc + uc2
Y0 (t) = 1 − e−u1 t + 1 − ut t − e−u1 t . (13.23)
u1 u12
e−u1 t ≈ 1 − u1 t
t2
E(t, 0) = u12 e−u1 β
2
where β is some number between 0 and t. Then, as before, the largest possible error
over a human lifetime uses maximum time T = 3.65 × 104 giving
T2
|E(t, 0)| ≤ u12 .
2
Hence, we can rewrite Eq. 13.23 as
uc u1 uc + uc2
Y0 (t) = u1 t − E(t, 0) + − E(t, 0) (13.24)
u1 u12
2 u1 uc + uc2
= uc t − E(t, 0). (13.25)
u12
420 13 A Cancer Model
2 u1 uc + uc2 2 T 2 2 T
2
|Y0 (t) − uc t| ≤ u1 = (2 u1 uc + uc ) . (13.26)
u12 2 2
For our chosen value of R ≤ 39, we see the magnitude of the Y0 error is given by
E = (2 + R) R (6.67 × 10−6 ).
If we add the error due to replacing the true dynamics by the approximation, we find
the total error is about 0.01067 + 0.0005 = 0.0105.
We can do the error due to the replacement of the true solutions by their approxima-
tions here too, but the story is similar; the error is small. We will focus on how to
approximate Y1 using the approximate dynamics. The Y1 dynamics are
Y1 = uc X1 + u1 Y0 − N u3 Y1 .
Over the effective lifetime of a human being, we can use our approximations for X1
and Y0 to obtain the dynamics that are relevant. This yields
2 u1 uc
Y1 (t) = (N u3 t − 1) + e−N u3 t .
(N u3 )2
13.7 Approximating the CIN Pathway Solutions 421
2 u1 uc
Y1 (t) = N u3 t + (e−N u3 t − 1)
(N u3 )2
2 u1 uc 2 u1 uc −N u3 t
= t+ (e − 1).
N u3 (N u3 )2
E(t, 0) = −N u3 e−N u3 β t
for some β in [0, t]. As usual, this error is largest when β = 0 and t is the lifetime
of our model. Thus
|E(t, 0)| ≤ N u3 T .
Thus,
Y1 (t) − 2 u1 uc t ≤
2 u1 uc
(N u3 ) T
N u3 (N u3 )2
2 u1 uc
= T
N u3
2 u1 uc
|Y1 (t) − t| ≤ 2u1 uc T
N u3
≤ 2 (10−7 ) R (10−7 ) T ,
using our value for u1 and our model for uc . We already know to have reasonable
error in X0 we must have R < 39. Since our average human lifetime is T = 36,500
days, we see
2 u1 uc
|Y1 (t) − t| ≤ 2 (10−7 ) 39 (10−7 ) 3.65 (104 )
N u3
≤ 2.85 × 10−8 .
t2
Y2 = N u3 Y1 + uc N u1 u2 .
2
2 u1 uc t2
Y2 = N u3 t + uc N u1 u2
N u3 2
N u1 u2 uc 2
= 2 u1 uc t + t .
2
The integration (for a change!) is easy. We find
N u1 u2 uc 3
Y2 (t) = u1 uc t 2 + t .
6
Note that
Y2 (t) − u1 uc t 2 = N u1 u2 uc t 3 .
6
The error term is largest at the effective lifetime of a human being. Thus, we can say
(using our assumptions on the sizes of u1 , u2 and uc )
3
Y2 (t) − u1 uc t 2 ≤ (N u1 u2 uc ) T .
6
The magnitude estimate for Y2 can now be calculated. For R < 39, we have
We summarize our approximation results for the top pathway in Table 13.2.
13.8 Error Estimates 423
In Sect. 13.4, we solved for X0 , X1 and X2 exactly and then we developed their
approximations in Sect. 13.6. The approximations to X0 , X1 and X2 then let us develop
approximations to the CIN solutions. We have found that for reasonable errors in
X0 (t), we need R < 38.7. We were then able to calculate the error bounds for all the
variables. For convenience, since u1 and u2 are equal, let’s set them to be the common
value u. Then, by assumption uc = R u for some R ≥ 1. Our error estimates can then
be rewritten as seen in Table 13.3. This uses our assumptions that u1 = u2 = 10−7
and uc = Ru1 ≈ 4 × 10−6 .
The error magnitude estimates are summarized in Table 13.4.
Table 13.3 The Non CIN and CIN Model Approximations with error estimates using u1 = u2 = u
and uc = R u
Approximation Maximum error
X0 (t) ≈ 1 − (u1 + uc ) t 0.01
X1 (t) ≈ u1 t 0.027
t2
X2 (t) ≈ N u1 u2 2 0.0009
Y0 (t) ≈ uc t 0.0016
Y1 (t) ≈ 2Nu1uu3 c t 2.9 × 10−8
Y2 (t) ≈ u1 uc t 2 0.0013
Table 13.4 The Non CIN and CIN Model Approximations Dependence on population size N and
the CIN rate for R ≈ 39 with u1 = 10−7 and uc = R u1
Approximation Maximum error
X0 (t) ≈ 1 − (u1 + uc ) t (1 + R)2 6.67 × 10−6 < 0.01
u12 T2
X1 (t) ≈ u1 t N (1 + R)2 + (R + N)2 2 < 0.027
t2
X2 (t) ≈ N u1 u2 2 (3 N R + N−1
N
(N 2 + 1) e−RuT ) 8.11 × 10−9 < 0.0009
Y0 (t) ≈ uc t (2 + R) R 6.67 × 10−6 < 0.0016
Y1 (t) ≈ 2Nu1uu3 c t 2Ru12 T < 2.9 × 10−8
T3
Y2 (t) ≈ u1 uc t 2 RN u13 6 = RN 8.11 × 10−9 < 0.0013
424 13 A Cancer Model
We think of N as about 4000 for a colon cancer model, but the loss of two allele
model is equally valid for different population sizes. Hence, we can think of N as
a variable also. In the estimates we generated in Sects. 13.6 and 13.7, we used the
value 4000 which generated even larger error magnitudes than what we would have
gotten for smaller N. To see if the CIN pathway dominates, we can look at the ratio
of the Y2 output to the X2 output. The ratio of Y2 to X2 tells us how likely the loss of
both alleles is due to CIN or without CIN. We have, for R < 39, that
Y2 (T ) u1 uc T 2 + E(T )
=
X2 (T ) (1/2)Nu1 u2 T 2 + F(T )
where E(T ) and F(T ) are the errors associated with our approximations for X2 and
Y2 . We assume u1 = u2 and so we can rewrite this as
For N = 4000 and u + 1 = u2 = 10−7 and T = 36, 500 days, we find 2/Nu1 u2 T 2 =
37.5 and hence we have
Y2 (T ) (2R/N) + 37.5 E(T )
=
X2 (T ) 1 + 37.5 F(T )
Y2 (T ) (2R/N) + 0.0388
≈
X2 (T ) 1.0488
We can estimate how close this is to the ratio 2R/N and find Now E(T ) ≈ 0.00009
and F(T ) ≈ 0.0013 and so
(2R/N) + 0.04 2R 0.04 − 0.05(2R/N)
− ≈
1.05 N 1.05
Here R < 39 and N is at least 1000, so (2R/N) ≈ 0.08. Hence, the numerator is
about |0.04 − 0.0004| ≈ 0.04. Thus, we see the error we make in using (2R/N) as an
estimate for Y2 (T )/X2 (T ) is about 0.04 which is fairly small. Hence, we can be rea-
sonably confident that the critical ratio (2R)/N is the same as the ratio Y2 (T )/X2 (T )
as the error over human life time is small. Our analysis only works for R < 39 though
so we should be careful in applying it. Hence, we can say
Y2 (T ) 2R
≈ .
X2 (T N
13.9 When Is the CIN Pathway Dominant? 425
Table 13.5 The CIN decay rates, uc required for CIN dominance. with u1 = u2 = 10−7 and
uc = R u1 for R ≥ 1
N Permissible uc For CIN dominance R value
100 >5.0 × 10−6 50
170 >8.5 × 10−6 85
200 >1.0 × 10−5 100
500 >2.5 × 10−5 250
800 >4.0 × 10−5 400
1000 >5.0 × 10−5 500
2000 >10.0 × 10−5 1000
4000 >20.0 × 10−5 2000
The third column shows the R value needed for a good CIN dominance
Hence, the pathway to Y2 is the most important if 2 R > N. This implies the CIN
pathway is dominant if
N
R> . (13.27)
2
For the fixed value of u1 = 10−7 , we calculate in Table 13.5 possible uc values for
various choices of N. We have CIN dominance if R > N2 and the approximations are
valid if R < 39.
Our estimates have been based on a cell population size N of 1000–4000. We
also know for a good X0 (t) estimate we want R < 39. Hence, as long as uc <
3.9 × 10−6 , we have a good X0 (t) approximation and our other estimates are valid.
From Table 13.5, it is therefore clear we will not have the CIN pathway dominant.
We would have to drop the population size to 70 or so to find CIN dominance. Now
our model development was based on the loss of alleles in a TSG with two possible
alleles, but the mathematical model is equally valid in another setting other than the
colon cancer one. If the population size is smaller than what we see in our colon
cancer model, it is therefore possible for the CIN pathway to dominate! However,
in our cancer model, it is clear that the non CIN pathway dominates.
Note we can’t even do this ratio analysis unless we are confident that our approx-
imations to Y2 and X2 are reasonable. The only way we know if they are is to do a
careful analysis using the approximation tools developed in Sect. 3.1. The moral is
that we need to be very careful about how we use estimated solutions when we
try to do science! However, with caveats, it is clear that our simple model gives an
interesting inequality, 2uc > N u2 , which helps us understand when the CIN pathway
dominates the formation of cancer.
13.9.1 Homework
life time is T = 36,500 days. and finally, we have N, the total number of cells in the
population which is 1000–4000. Now let’s do some calculations using different T ,
N, u1 and R values. For the choices below,
• Find the approximate values of all variables at the given T .
• Interpret the number at T as the number of cells in the that state after T days. This
means take your values for say X2 (T ) and multiple by N to get the number of cells.
• Determine if the top or bottom pathway to cancer is dominant Recall, if our approx-
imations are good, the equation 2R/N > 1 implies the top pathway is dominant.
As you answer these questions, note that we can easily use the equation above for
R and N values for which our approximations are probably not good. Hence, it is
always important to know the guideline we use to answer our question can be used!
So for example, most of the work in the book suggests that R ≤ 39 or so, so when
we use these equations for R = 110 etc. we can’t be sure our approximations are
accurate to let us do that!
Exercise 13.9.1 R = 32, N = 1500, u1 = u2 = 6.3 × 10−8 and T = 100 years but
express in days.
Exercise 13.9.2 R = 70, N = 4000, u1 = u2 = 8.3 × 10−8 and T = 100 years but
express in days.
Let’s look at how we might solve a cancer model using Matlab. Our first attempt
might be something like this. First, we set up the dynamics as
But to make this work, we also must initialize all of our parameters before we try to
use the function. We have annotated the code lightly as most of it is pretty common
13.10 A Little Matlab 427
place to us now. We set the final time, as usual, to be human lifetime of 100 years
or 36,500 days. Our mutation rate u1 = 1.0 × 10−7 is in units of base pairs/day and
this requires many steps. If we set the step size to 0.5 as we do here, 73,000 steps
are required and even on a good laptop it will take a long time. Let’s do the model
for a value of R = 80 which is much higher than we can handle with our estimates!
The plot of the number of cells in the A−− and A−−CIN state is seen in Fig. 13.5.
428 13 A Cancer Model
Fig. 13.5 The number of cells in A−− and A−−CIN state versus time
If we could use our theory, it would tell us that since 2R/N = 160/1000 = 0.16, the
top pathway to cancer is dominant. Our numerical results give us
so the numerical results, while giving a Y 2(T )/X2(T ) different from 2R/N still
predict the top pathway to cancer is dominant. So it seems our general rule that
2R/N < 1 which we derived using complicated approximation machinery is working
well. We can change our time units to years to cut down on the number of steps we
need to take. To do this, we convert base pairs/day to base pairs/year by multiplying
u1 by 365. The new code is then
13.10 A Little Matlab 429
Listing 13.4: Switching to time units to years: step size is one half year
u1 = 1 . 0 e −7∗365;
2 u2 = u1 ;
N = 1000;
R = 80;
uc = R∗u1 ;
r = 1 . 0 e −4;
7 u3 = (1− r ) /N;
f = @( t , x ) [ −( u1+uc ) ∗x ( 1 ) ; . . .
u1∗x ( 1 ) −(uc+N∗u2 ) ∗x ( 2 ) ; . . .
N∗u2∗x ( 2 )−uc ∗x ( 3 ) ; . . .
uc ∗x ( 1 )−u1∗x ( 4 ) ; . . .
12 uc ∗x ( 2 ) +u1∗x ( 4 )−N∗u3∗x ( 5 ) ; . . .
N∗u3∗x ( 5 ) +uc ∗x ( 3 ) ] ;
T = 100;
h = .5;
M = c e i l (T / h ) ;
17 [ htime , rk , f r k ] = FixedRK ( f , 0 , [ 1 ; 0 ; 0 ; 0 ; 0 ; 0 ] , h , 4 ,M) ;
X0 = rk ( 1 , : ) ;
X1 = rk ( 2 , : ) ;
X2 = rk ( 3 , : ) ;
Y0 = rk ( 4 , : ) ;
22 Y1 = rk ( 5 , : ) ;
Y2 = rk ( 6 , : ) ;
[ rows , c o l s ] = s i z e ( rk )
N∗Y2 ( c o l s )
N∗X2 ( c o l s )
27 % Find Y2 ( T ) / X2 ( T )
Y2 ( c o l s ) / X2 ( c o l s )
2∗R /N
Y2 ( c o l s )
9 . 0 2 8 3 e −04
32 X2 ( c o l s )
0.0020
Y2 ( c o l s ) / X2 ( c o l s )
0.4548
We get essentially the same results using a time unit of years rather than days! The
example above uses a time step of h = 0.5 which is a half year step. We can do
equally well using a step size of h = 1 which is a year step. We can do it again using
h = 1 (i.e. the step is one year now)
Listing 13.5: Switching the step size to one year
h = 1 ; M = c e i l (T / h ) ; %ans = 100
[ htime , rk , f r k ] = FixedRK ( f , 0 , [ 1 ; 0 ; 0 ; 0 ; 0 ; 0 ] , h , 4 ,M) ;
X0 = rk ( 1 , : ) ; X1 = rk ( 2 , : ) ; X2 = rk ( 3 , : ) ;
Y0 = rk ( 4 , : ) ; Y1 = rk ( 5 , : ) ; Y2 = rk ( 6 , : ) ;
5 [ rows , c o l s ] = s i z e ( rk ) ;
Y2 ( c o l s ) / X2 ( c o l s )
ans = 0 . 4 5 2 9 1
2∗R /N
ans = 0 . 1 6 0 0 0
10 Y2 ( c o l s )
ans = 8 . 9 4 3 5 e −04
N∗Y2 ( c o l s )
ans = 0 . 8 9 4 3 5
X2 ( c o l s )
15 ans = 0 . 0 0 1 9 7 4 7
N∗X2 ( c o l s )
ans = 1 . 9 7 4 7
13.10.1 Homework
We have had to work hard to develop some insight into the relative dominance of
the CIN pathway in this two allele cancer model. It has been important to solve our
model for arbitrary parameters u1 , u2 , u3 , uc and N. This need to do everything in
terms of parameters treated as variables complicated much of our analysis. Could
we do it another way? Well, we could perform a parametric study. We could solve
the model for say 10 choices of each of the 5 parameters using MatLab for the
values X2 (T ) and Y2 (T ) which, of course, depend on the values of u1 , u2 , u3 , uc
and N used in the computation. Hence, we should label this final time values as
X2 (T , u1 , u2 , u3 , uc , N) and Y2 (T , u1 , u2 , u3 , uc , N) to denote this dependence. After
105 separate MatLab computations, we would then have a data set consisting of 105
values of the variables X2 (T , u1 , u2 , u3 , uc , N) and Y2 (T , u1 , u2 , u3 , uc , N). We could
then try to do statistical modeling to see if we could tease out the CIN dominance
relation 2 uc > N u2 . But, choosing only 10 values for each parameter might be too
coarse for our purposes. If we choose 100 values for each parameter, we would have
to do 1010 computations to develop an immense table of 1010 entries of X2 (T ) and
Y2 (T ) values! You should be able to see that the theoretical approach we have taken
here, while hard to work through, has some benefits!
13.11 Insight Is Difficult to Achieve 431
With all the we have said so far, look at the following exposition of our cancer
model; you might read the following in a research report. Over the typical lifetime
of a human being, the variables in our model have functional dependencies on time
(denoted by the symbol ∼) given as follows:
X0 (t) ∼ 1 − (u1 + uc ) t
X1 (t) ∼ u1 t
t2
X2 (t) ∼ N u1 u2
2
Y0 (t) ∼ uc t
2 u1 uc
Y1 (t) ∼ t
N u3
Y2 (t) ∼ u1 uc t 2 .
It is then noted that the ratio Y2 to X2 gives interesting information about the domi-
nance of the CIN pathway. CIN dominance requires the ratio exceeds one giving us
the fundamental inequality
2 uc
> 1.
N u2
Note there is no mention in this derivation about the approximation error magnitudes
that must be maintained for the ratio tool to be valid! So because no details are
presented, perhaps we should be wary of accepting this model! if we used it without
being sure it was valid to make decisions, we could be quite wrong.
References
M. Novak, Evolutionary Dynamics: Exploring the Equations of Life (Belknap Press, Cambridge,
2006)
J. Peterson, Calculus for Cognitive Scientists: Derivatives, Integration and Modeling, Springer
Series on Cognitive Science and Technology (Springer Science+Business Media Singapore Pte
Ltd, Singapore, 2015 In press)
B. Ribba, T. Colin, S. Schnell, A multiscale mathematical model of cancer, and its use in analyzing
irradiation therapies. Theor. Biol. Med. Model. 3, 1–19 (2006)
Y. Tatabe, S. Tavare, D. Shibata, Investigating stem cells in human colon using methylation patterns.
Proc. Natl. Acad. Sci. 98, 10839–10844 (2001)
Part V
Nonlinear Systems Again
Chapter 14
Nonlinear Differential Equations
We are now ready to solve nonlinear systems of nonlinear differential equations using
our new tools. Our new tools will include
1. The use of linearization of nonlinear ordinary differential equation systems to gain
insight into their long term behavior. This requires the use of partial derivatives.
2. More extended qualitative graphical methods.
Recall, this means we are using the ideas of approximating functions of two variables
by tangent planes which results in tangent plane error. This error is written in terms
of Hessian like terms. The difference now is that x has its dynamics f and y has
its dynamics g and so we have to combine tangent plane approximations to both f
and g into one Hessian like term to estimate the error. So the discussions below are
similar to the ones in the past, but a bit different.
x = f (x, y)
y = g(x, y)
x(0) = x0
y(0) = y0
where both f and g possess continuous partial derivatives up to the second order.
For any fixed x and y, define the function h by
f (x0 + tx, y0 + ty)
h(t) =
g(x0 + tx, y0 + ty)
t2
h(t) = h(0) + h (0)t + h (c) .
2
Using the chain rule, we find
f (x + tx, y0 + ty)tx + f y (x0 + tx, y0 + ty)ty
h (t) = x 0
gx (x0 + tx, y0 + ty)tx + g y (x0 + tx, y0 + ty)ty
or
⎡ ⎤
T tx
⎢ tx ty H f (x0 + tx, y0 + ty) ty ⎥
⎢ ⎥
h (t) = ⎢
⎢
⎥
⎥
⎣ T tx ⎦
tx ty Hg (x0 + tx, y0 + ty)
ty
1
h(1) = h(0) + h (0)(1 − 0) + h (c)
2
for some c between 0 and 1, we have
⎡ ⎤
f (x0 + x, y0 + y)
⎣ ⎦
g(x0 + x, y0 + y)
⎡ T ⎤
1 x ∗ ∗ x
⎢ f (x0 , y0 ) + f x (x0 , y0 )x + f y (x0 , y0 )y + 2 y H f (x , y )
y ⎥
⎢ ⎥
=⎢⎢
⎥
⎥
⎣ T ⎦
1 x ∗ ∗ x
g(x0 , y0 ) + gx (x0 , y0 )x + g y (x0 , y0 )y + 2 Hg (x , y )
y y
where the error terms E f (x ∗ , y ∗ ) and Eg (x ∗ , y ∗ ) are the usual Hessian based terms
∗ ∗ 1 x T ∗ ∗ x
E f (x , y ) = H f (x , y )
2 y y
T
1 x x
Eg (x ∗ , y ∗ ) = Hg (x ∗ , y ∗ )
2 y y
With sufficient knowledge of the Hessian terms, we can have some understanding
of how much error we make but, of course, it is difficult in interesting problems to
make this very exact. Roughly speaking though, for our functions with continuous
second order partial derivatives, in any closed circle of radius R around the base
point (x0 , y0 ), there is a constant B R so that
1 1
E f (x ∗ , y ∗ ) ≤ B R (|x| + |y|)2 and E f (x ∗ , y ∗ ) ≤ B R (|x| + |y|)2 .
2 2
It is clear that the simplest approximation arises at those special points (x0 , y0 ) where
both f (x0 , y0 ) = 0 and g(x0 , y0 ) = 0. The points are called equilibrium points of
the model. At such points, we can a linear approximation to the true dynamics of
this form
x = f x (x0 , y0 )x + f y (x0 , y0 )y
y = gx (x0 , y0 )x + g y (x0 , y0 )y
and this linearization of the model is going to give us trajectories that are close to the
true trajectories when are deviations x and y are small enough! We can rewrite
our linearizations again using x − x0 = x and y − y0 = y as follows
x f x (x0 , y0 ) f y (x0 , y0 ) x − x0
= .
y gx (x0 , y0 ) g y (x0 , y0 ) y − y0
The special matrix of first order partials of f and g is called the Jacobian of our
model and is denoted by J(x, y). Hence, in general
f (x, y) f y (x, y)
J(x, y) = x
gx (x, y) g y (x, y)
438 14 Nonlinear Differential Equations
so that the linearization of the model at the equilibrium point (x0 , y0 ) has the form
x x − x0
= J(x0 , y0 )
y y − y0
as J(x0 , y0 ) is simply as 2 × 2 real matrix which will have real distinct, repeated
or complex conjugate pair eigenvalues which we know how to deal with after our
discussions in Chap. 8. We are now ready to study interesting nonlinear models using
the tools of linearization.
which has multiple equilibrium points. Clearly, we need to stay away from the line
x = −1 for initial conditions as there the dynamics themselves are not defined!
However, we can analyze at other places. We define the nonlinear function f and g
then by
2x y
f (x, y) = (1 − x)x −
1+x
y
g(x, y) = 1 − y
1+x
We need to discard x = −1 as the dynamics are not defined there. So the last equi-
librium point is at (1, 0). We can then use MatLab to find the Jacobians and the
associated eigenvalues and eigenvectors are each equilibrium point. We encode the
Jacobian
1 − 2x − 2 (1+x)
y
2 −2 (1+x)
x
J (x, y) = y2
(1+x)2 )
1 − 2 1+x
y
We will use MatLab to help with our analysis of these equilibrium points. Here are
the MatLab sessions for all the equilibrium points. The MatLab command eig is used
to find the eigenvalues and eigenvectors of a matrix.
Equilibrium Point (0,0) For the first equilibrium point (0, 0), we find the Jacobian
at (0, 0) and the associated eigenvalues and eigenvectors with the following MatLab
commands.
Hence, there is a repeated eigenvalue, r = 1 but there are two different eigenvectors:
1 0
E1 = , E2 =
0 1
440 14 Nonlinear Differential Equations
where a and b are arbitrary. Hence, trajectories move away from the origin locally.
Recall, the local linear system is
x (t) 10 x (t)
=
y (t) 01 y (t)
which is the same as the local variable system using the change of variables u = x
and v = y.
u (t) 1 0 u (t)
=
v (t) 0 1 v (t)
Equilibrium Point (1,0) For the second equilibrium point (1, 0), we find the Jacobian
at (1, 0) and the associated eigenvalues and eigenvectors in a similar way.
where a and b are arbitrary. Hence, trajectories move away from (1, 0) locally for
all trajectories except those that start on E 1 . Recall, the local linear system is
x (t) −1 −1 x (t) − 1
=
y (t) 0 1 y (t)
Equilibrium Point (0,1) For the third equilibrium point (0, 1), we again find the
Jacobian at (0, 1) and the associated eigenvalues and eigenvectors in a similar way.
Now there is again a repeated eigenvalue, r1 = −1. If you look at the D2 matrix,
you see both the columns are the same. In this case, MatLab does not give us useful
information. We can use the first column as our eigenvector E 1 , but we still must
find the other vector F.
Recall, the local linear system is
x (t) −1 0 x (t)
=
y (t) 1 −1 y (t) − 1
Recall, the general solution to a model with a repeated eigenvalue with only one
eigenvector is given by
442 14 Nonlinear Differential Equations
x (t)
= a E 1 e−t + b F e−t + E 1 t e−t
y (t)
where a and b are arbitrary. Hence, trajectories move toward from (0, 1) locally.
Piecing together the global behavior from the local trajectories is difficult, so it is
helpful to write scripts in MatLab to help us. We can use the AutoPhasePlanePlot()
function from before, but this time we use different dynamics. The dynamics are
stored in the file autonomousfunc.m which encodes the right hand side of the model
2x y
x = (1 − x)x −
1+x
y
y = 1 − y
1+x
Then, to generate a nice phase plane portrait, we try a variety of [xmin, xmax] ×
[ymin, ymax] initial condition boxes until it looks right! Here, we avoid any initial
conditions that have negative values as for those the trajectories go off to infinity and
the plots are not manageable. We show the plot in Fig. 14.1.
You should play with this function. You’ll see it involves a lot of trial and error. Any
box [xmin, xmax] × [ymin, ymax] which includes trajectories whose x or y values
increase exponentially causes the overall plot’s x and y ranges to be skewed toward
those large numbers. This causes a huge loss in resolution on all other trajectories!
So it takes time and a bit of skill to generate a nice collection of phase plane plots!
We can find the equilibrium points using the root finding methods called bisection
and Newton’s method.
We need a simple function to find the root of a nice function f of the real variable
x using what is called bisection. The method is actually quite simple. We know that
if f is a continuous function on the finite interval [a, b] then f must have a zero
inside the interval [a, b] if f has a different algebraic sign at the endpoints a and b.
This means the product f (a) f (b) is not zero. So we assume we can find an interval
[a, b] on which this change in sign satisfies f (a) f (b) ≤ 0 (which we can do by
switching to − f if we have to!) and then if we divide the interval [a, b] into two
equal pieces [a, m] and [m, b], f (m) can’t have the same sign as both f (a) and f (b)
because of the assumed sign difference. So at least one of the two halves has a sign
change.
Note that if f (a) and f (b) was zero then we still have f (a) f (b) ≤ 0 and either a
or b could be our chosen root and either half interval works fine. If only one of the
444 14 Nonlinear Differential Equations
endpoint function values is zero, then the bisection of [a, b] into the two halves still
finds the one half interval that has the root.
So our prototyping Matlab code should use tests like f (x) f (y) ≤ 0 rather than
f (x) f (y) < 0 to make sure we catch the root.
Simple Bisection MatLab Code Here is a simple Matlab function to perform the
Bisection routine.
We should look at some of these lines more closely. First, to use this routine, we need
to write a function definition for the function we want to apply bisection to. We will
do this in a file called func.m (Inspired Name, eh?) An example would be the one
we wrote for the function
x
f (x) = tan − 1;
4
which is coded in Matlab by
14.1 Linear Approximations to Nonlinear Models 445
So to apply bisection to this function on the interval [2, 4] with a stopping tolerance
of say 10−4 , in Matlab, we would type the command
Note that the name of our supplied function, the uninspired choice func, is passed in
as the first argument in single quotes as it is a string. Also, in the Bisection routine,
we have added the code to print out what is happening at each iteration of the while
loop. Matlab handles prints to the screen a little funny, so do set up a table of printed
values we use this syntax:
Running the Code As mentioned above, we will test this code on the function
x
f (x) = tan − 1;
4
on the interval [2, 4] with a stopping tolerance of δ = 10−6 . Our function has been
written as the Matlab function func supplied in the file func.m. The Matlab run
time looks like this:
446 14 Nonlinear Differential Equations
Homework Well, you have to practice this stuff to see what is going on. So here are
two problems to sink your teeth into!
Exercise 14.1.1 Use bisection to find the first five positive solutions of the equation
x = tan(x). You can see where this is roughly by graphing tan(x) and x simultane-
ously. Do this for tolerances {10−1 , 10−2 , 10−3 , 10−4 , 10−5 , 10−6 , 10−7 }. For each
root, choose a reasonable bracketing interval [a, b], explain why you chose it, pro-
vide a table of the number of iterations to achieve the accuracy and a graph of this
number versus accuracy.
Exercise 14.1.2 Use the Bisection Method to find the largest real root of the function
f (x) = x 6 − x − 1. Do this for tolerances {10−1 , 10−2 , 10−3 , 10−4 , 10−5 , 10−6 ,
10−7 }. Choose a reasonable bracketing interval [a, b], explain why you chose it,
provide a table of the number of iterations to achieve the accuracy and a graph of
this number versus accuracy.
Newton’s method is based on the tangent line and rapidly converges to a zero of the
function f if the original guess is reasonable. Of course, that is the problem. A bad
initial guess is a great way to generate random numbers! So usually, we find a good
14.1 Linear Approximations to Nonlinear Models 447
interval where the root might reside by first using bisection. The following code uses
a simple test to see which we should do in our zero finding routine.
Then, once the bisection steps have given us an interval where the root might be,
we switch to Newton’s method. This takes the current guess, say x1 , and finds the
tangent line, T (x), to the function at that point. This gives
If we find the value of x where the tangent line crosses the x axis, this becomes our
next guess x1 . We find
f (x0 )
x1 = x0 − .
f (x0 )
In essence this is Newton’s Method which can be rephrased for the scalar function
case as
f (xn )
xn+1 = x N − .
f (xn )
and it is clear the method fails if f (xn ) = 0 or is close to 0 at any iteration. MatLab
code to implement this method is given next.
A Global Newton Method The code for a global Newton method is pretty straight-
forward. Here is the listing.
448 14 Nonlinear Differential Equations
function [x, fx, nEvals, aF, bF] = GlobalNewton(fName, fpName, a, b, tolx, tolf, nEvalsMax)
% initialize the bracket endpoints and the first iterate
fa = feval(fName, a);
fb = feval(fName, b);
x  = a;
fx = feval(fName, x);
fpx = feval(fpName, x);
nEvals = 1;
k = 1;
disp(' ')
disp('Step      |   k    |    a(k)      |    x(k)      |    b(k)')
disp(sprintf('Start     | %6d | %12.7f | %12.7f | %12.7f', k, a, x, b));
while ( (abs(a-b) > tolx) && (abs(fx) > tolf) && (nEvals < nEvalsMax) ) || (nEvals == 1)
  % [a,b] brackets a root and x = a or x = b
  check = StepIsIn(x, fx, fpx, a, b);
  if check
    % Take Newton Step
    x = x - fx/fpx;
  else
    % Take a Bisection Step
    x = (a+b)/2;
  end
  fx  = feval(fName, x);
  fpx = feval(fpName, x);
  nEvals = nEvals + 1;
  if fa*fx <= 0
    % there is a root in [a,x]. Use right endpoint.
    b = x;
    fb = fx;
  else
    % there is a root in [x,b]. Bring in left endpoint.
    a = x;
    fa = fx;
  end
  k = k + 1;
  if (check)
    disp(sprintf('Newton    | %6d | %12.7f | %12.7f | %12.7f', k, a, x, b));
  else
    disp(sprintf('Bisection | %6d | %12.7f | %12.7f | %12.7f', k, a, x, b));
  end
end
aF = a;
bF = b;
end
A Run Time Example We will apply our global Newton method root finding code
to a simple example: find a root for f(x) = sin(x) in the interval [−7π/2, 15π + 0.1].
We code the function and its derivative in two simple Matlab files; f1.m and f1p.m.
These are
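A minimal f1.m consistent with the derivative file f1p.m shown next would be

function y = f1(x)
  y = sin(x);
end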
and
Listing 14.15: Global Newton Function Derivative
function y = f1p(x)
  y = cos(x);
end
To run this code on this example, we would then type a phrase like the one below:
Listing 14.16: Global Newton Sample
[x, fx, nEvals, aLast, bLast] = GlobalNewton('f1', 'f1p', -7*pi/2, ...
      15*pi+.1, 10^-6, 10^-8, 200)
Homework
Exercise 14.1.3 Use the Global Newton Method to find the first five positive solu-
tions of the equation x = tan(x). You can see where this is roughly by graphing
tan(x) and x simultaneously. Do this for tolerances {10^−1, 10^−2, 10^−3, 10^−4, 10^−5,
10^−6, 10^−7}. For each root, choose a reasonable bracketing interval [a, b], explain
why you chose it, provide a table of the number of iterations to achieve the accuracy
and a graph of this number versus accuracy.
Exercise 14.1.4 Use the Global Newton Method to find the largest real root of the
function f(x) = x⁶ − x − 1. Do this for tolerances {10^−1, 10^−2, 10^−3, 10^−4, 10^−5,
10^−6, 10^−7}. Choose a reasonable bracketing interval [a, b], explain why you chose
it, provide a table of the number of iterations to achieve the accuracy and a graph of
this number versus accuracy.
We can also choose to replace the derivative function for f with a finite difference
approximation. We will use either
f′(x) ≈ ( f(x_c + δ_c) − f(x_c) ) / δ_c
or
f′(x) ≈ ( f(x_c) − f(x_−) ) / (x_c − x_−)
where x_− is the previous iterate from our routine. The Matlab fragment we need is
then:
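A fragment along these lines, mirroring the update used inside the finite difference listing below (the variables fName and x live inside the solver), is

delta = sqrt(eps)*abs(x);           % finite difference step size, discussed below
fx    = feval(fName, x);
fpval = feval(fName, x + delta);
fpx   = (fpval - fx)/delta;         % forward difference approximation to f'(x)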
We add the finite difference routines into our Global Newton’s Method as follows:
function [x, fx, nEvals, aF, bF] = GlobalNewtonFD(fName, a, b, tolx, tolf, nEvalsMax)
% initialize the bracket endpoints, the first iterate and the finite difference derivative
fa = feval(fName, a);
fb = feval(fName, b);
x  = a;
fx = feval(fName, x);
delta = sqrt(eps)*abs(x);      % finite difference step size
fpval = feval(fName, x + delta);
fpx   = (fpval - fx)/delta;
nEvals = 1;
k = 1;
%disp(' ')
%disp('Step      |   k    |    a(k)      |    x(k)      |    b(k)')
%disp(sprintf('Start     | %6d | %12.7f | %12.7f | %12.7f', k, a, x, b));
while ( (abs(a-b) > tolx) && (abs(fx) > tolf) && (nEvals < nEvalsMax) ) || (nEvals == 1)
  % [a,b] brackets a root and x = a or x = b
  check = StepIsIn(x, fx, fpx, a, b);
  if check
    % Take Newton Step
    x = x - fx/fpx;
  else
    % Take a Bisection Step
    x = (a+b)/2;
  end
  fx    = feval(fName, x);
  fpval = feval(fName, x + delta);
  fpx   = (fpval - fx)/delta;
  nEvals = nEvals + 1;
  if fa*fx <= 0
    % there is a root in [a,x]. Use right endpoint.
    b = x;
    fb = fx;
  else
    % there is a root in [x,b]. Bring in left endpoint.
    a = x;
    fa = fx;
  end
  k = k + 1;
  %if (check)
  %  disp(sprintf('Newton    | %6d | %12.7f | %12.7f | %12.7f', k, a, x, b));
  %else
  %  disp(sprintf('Bisection | %6d | %12.7f | %12.7f | %12.7f', k, a, x, b));
  %end
end
aF = a;
bF = b;
end
Note, for our finite difference step size we use δ = sqrt(ε_machine) |x|, where ε_machine is machine precision (eps in Matlab/Octave).
A Run Time Example We will apply our finite difference global Newton method
root finding code to the same simple example: find a root for f(x) = sin(x) in the
interval [−7π/2, 15π + 0.1]. We only need the code for the function now which is as
usual in the file f1.m.
To run this code on this example, we would then type a phrase like the one below:
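The original call is not listed; assuming GlobalNewtonFD takes the same arguments as GlobalNewton with the derivative file name dropped, a session might look like

[x, fx, nEvals, aLast, bLast] = GlobalNewtonFD('f1', -7*pi/2, ...
      15*pi+.1, 10^-6, 10^-8, 200)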
Homework
Exercise 14.1.5 Use the Finite Difference Global Newton Method to find the second
positive solution of the equation x = tan(x). Do this for the tolerance 10^−8. This time
alter the GlobalNewtonFD code to allow the finite difference step size delta to be
a parameter and do a parametric study on the effects of delta. Note that the code
now uses the reasonable choice of δ = sqrt(ε_machine) |x|, but you need to use the additional δ
choices {10^−4, 10^−6, 10^−8, 10^−10}. This will give you five δ choices. Provide a table
and a graph of δ versus accuracy of the root approximation.
Exercise 14.1.6 Use the Finite Difference Global Newton Method to find the largest
real root of the function f(x) = x⁶ − x − 1. Do this for the tolerance 10^−8. Again
use the altered GlobalNewtonFD code with the finite difference step size delta as a
parameter and do a parametric study on the effects of delta. Note that the code
now uses the reasonable choice of δ = sqrt(ε_machine) |x|, but you need to use the additional δ
choices {10^−4, 10^−6, 10^−8, 10^−10}. This will give you five δ choices. Provide a table
and a graph of δ versus accuracy of the root approximation.
Exercise 14.1.7 Do the same thing for the problems above, but replace the Finite
Difference Global Newton Code with a Secant Global Newton Code. This will only
require a few lines of code to change really, so don’t freak out!
Consider the nonlinear system
x′ = 0.5 (−h(x) + y)
y′ = 0.2 (−x − 1.5y + 1.2)
for a given nonlinear function h(x). This is a model, called the trigger model, of how
an electrical component called a diode behaves; the details are not really important
as we are just investigating how to use our code and theoretical ideas. The equilibrium
points are the solutions to the simultaneous equations
0.5 (−h(x) + y) = 0 and 0.2 (−x − 1.5y + 1.2) = 0,
that is, y = h(x) and x + 1.5y = 1.2.
We can see these solutions graphically by plotting the two curves simultaneously
and using the cursor to locate the roots and read off the (x, y) values from the plot.
This is not quite accurate so a better way is to find the roots numerically. The plot
which shows the equilibrium points graphically is shown in Fig. 14.2.
We do this with the following MatLab/Octave session.
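The session is not shown here, but a minimal sketch of the idea, assuming the diode nonlinearity h(x) is available as a function handle (its formula is not repeated), uses the two nullclines y = h(x) and x + 1.5y = 1.2 together with the built-in root finder fzero:

function [xs, ys] = TriggerEquilibria(h)
  % equilibria satisfy y = h(x) and x + 1.5*y = 1.2, so find the zeros of g below
  g = @(x) h(x) - (1.2 - x)/1.5;
  guesses = [0.05, 0.30, 0.90];                 % rough locations read from Fig. 14.2
  xs = arrayfun(@(x0) fzero(g, x0), guesses);
  ys = arrayfun(h, xs);
end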
Q 1 = (0.062695, 0.758672)
Q 2 = (0.28537, 0.60975)
Q 3 = (0.88443, 0.21038)
The Jacobian is
J(x, y) = [ −0.5 h′(x)   0.5;   −0.2   −0.3 ].
Evaluating it at Q1 and computing the eigenvalues gives the diagonal matrix
D1 =
Diagonal Matrix
  −3.58798         0
        0    −0.33041
Hence, there are two real distinct eigenvalues, r1 = −3.58798 and r2 = −0.33041.
The eigenvectors are
E1 = [ −0.998155; −0.060715 ]   and   E2 = [ −0.150341; −0.988634 ].
We set up the plot as follows. We define the coefficient matrix A1 of our linearization
and set up the parameter p1 by listing all the entries of A1 followed by the coordinates
of Q1. We then generate the automatic phase plane plot.
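The session itself is not listed; a sketch consistent with the data above and with the plotting call used later for the predator–prey model is given next. The (1,1) entry of A1 is inferred from the eigenvalues just quoted (it equals −0.5 h′(x) at Q1), and the window settings simply copy the later call.

A1 = [-3.6184, 0.5; -0.2, -0.3];                         % Jacobian at Q1
p1 = [A1(1,1); A1(1,2); A1(2,1); A1(2,2); 0.062695; 0.758672];
AutoPhasePlanePlotLinearSystemRKF5('x axis', 'y axis', 'Q1 Plot', ...
    1.0e-6, 1.0e-6, .01, .2, ...
    'linearsystemep', p1, .01, 0, .4, 12, 12, -1, 1, -1, 1);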
For equilibrium point Q2, we find the linearization just like we did for Q1. First, we
find the Jacobian at this point.
Listing 14.27: Jacobian at Q2
J(x2, y2)
ans =
   1.82012   0.50000
  -0.20000  -0.30000
[V2, D2] = eig(J(x2, y2))
V2 =
   0.995373  -0.234595
  -0.096085   0.972093
D2 =
   1.77185         0
         0  -0.25173

Hence, there are two real distinct eigenvalues, r1 = 1.77185 and r2 = −0.25173.
The eigenvectors are
E1 = [ 0.995373; −0.096085 ]   and   E2 = [ −0.234595; 0.972093 ].
Finally, we analyze the model near the point Q3. The Jacobian is now
Hence, there are two real distinct eigenvalues, r1 = −1.34096 and r2 = −0.39607.
The eigenvectors are
E1 = [ −0.98204; −0.18868 ]   and   E2 = [ −0.43297; −0.90141 ].
The dominant eigenvector is now E2 . We set the coefficient matrix A3 of our lin-
earization and the parameter p3 of our model and generate the automatic phase plane
plot.
We can also generate the full plot of the original system using the function
triggermodel
The next nonlinear system is the familiar Predator–Prey model. Consider the following example:
x′ = x (3 − 4y)
y′ = y (−5 + 7x).
The equilibrium points solve
x (3 − 4y) = 0
y (−5 + 7x) = 0
which has the familiar solutions Q1 = (0, 0) and Q2 = (5/7, 3/4). The Jacobian here is
J(x, y) = [ 3 − 4y   −4x;   7y   −5 + 7x ].
At Q1 = (0, 0) this reduces to J(0, 0) = [ 3   0;   0   −5 ].
Hence, there are two real distinct eigenvalues, r1 = −5 and r2 = 3. The eigenvectors
are
E1 = [ 0; 1 ]   and   E2 = [ 1; 0 ].
The dominant eigenvector is thus E2 and it is easy to plot the resulting trajectories.
Using the function linearsystemep we set up the plot. Using the coefficient
matrix A1 of our linearization and the coordinates of Q 1 , we set the parameter p1
by listing all the entries of A1 followed by the coordinates of Q1. We then generate
the automatic phase plane plot.
p1 = [3; 0; 0; -5; 0; 0];
AutoPhasePlanePlotLinearSystemRKF5('x axis', ...
   'y axis', 'Q1 Plot', ...
   1.0e-6, 1.0e-6, .01, .2, ...
   'linearsystemep', p1, .01, 0, .4, 12, 12, -1, 1, -1, 1);
At Q2 = (5/7, 3/4), the Jacobian is J(5/7, 3/4) = [ 0   −20/7;   21/4   0 ]. Hence, there is now a complex conjugate eigenvalue pair which has zero real part.
Thus, these trajectories will be circles about Q 2 . The eigenvalues are r1 = 3.8730i
and r2 = −3.8730i. The eigenvectors are not really needed for our phase plane. We
set the coefficient matrix A2 of our linearization and the parameter p2 of our model
and generate the automatic phase plane plot.
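A sketch of that setup, following the same calling pattern as the Q1 plot and using the Jacobian evaluated at Q2 as computed above, is

A2 = [0, -20/7; 21/4, 0];                                % Jacobian at Q2 = (5/7, 3/4)
p2 = [A2(1,1); A2(1,2); A2(2,1); A2(2,2); 5/7; 3/4];
AutoPhasePlanePlotLinearSystemRKF5('x axis', 'y axis', 'Q2 Plot', ...
    1.0e-6, 1.0e-6, .01, .2, ...
    'linearsystemep', p2, .01, 0, .4, 12, 12, -1, 1, -1, 1);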
The full plot of the original system uses the standard function PredPrey modified
for the model at hand.
The next nonlinear system is the Predator–Prey model with self interaction. Consider
the following example:
x′ = x (3 − 4y − e x)
y′ = y (−5 + 7x − f y)
where e and f are positive numbers. From our earlier discussions, we know that there
are essentially two interesting cases here. The x and y nullclines cross in the first
quadrant leading to spiral in trajectories towards that common point or they cross
in the fourth quadrant which leads to all trajectories converging to a point on the
positive x axis. We can now look at this model using our new tools. The equilibrium
points (x, y) solve the equations
x (3 − 4y − ex) = 0
y (−5 + 7x − f y) = 0
Three of the equilibrium points are then Q 1 = (0, 0), Q 2 = (0, −5/ f ) and Q 3 =
(3/e, 0). These equilibrium points coincide with trajectories that we do not see if we
choose positive initial conditions in Quadrant I.
Assume first that the nullclines cross in the first quadrant. This situation occurs when 3/e > 5/7, so as an example, let's choose the value e = 1.
The value of f is not very important, so let’s choose f = 1 also just to make it easy.
Then, the intersection occurs when
x + 4y = 3
7x − y = 5
We can find this point of intersection many ways. The old fashioned way is by elimination. We find x = 23/29 and y = 16/29. Let Q4 = (23/29, 16/29). The Jacobian
in general is
J(x, y) = [ 3 − 4y − 2ex   −4x;   7y   −5 + 7x − 2fy ].
We will only look at the local linearizations for equilibrium points Q 4 and then Q 3 .
The other two are similar to what we have done before.
Equilibrium Point Q4 For the equilibrium point Q4, we can find the Jacobian at Q 4
and the corresponding eigenvalues and eigenvectors. As expected, the eigenvalues
are complex with negative real part implying the local linearization gives spiral in
trajectories.
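A minimal Octave/Matlab sketch of that computation, with our own variable names, is

e = 1; f = 1;
J = @(x, y) [3 - 4*y - 2*e*x, -4*x; 7*y, -5 + 7*x - 2*f*y];
[V4, D4] = eig(J(23/29, 16/29))     % complex eigenvalues with negative real part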
Equilibrium Point Q3 For the equilibrium point Q3 = (3, 0), the eigenvalues are both real, r1 = −3 and r2 = 16. Hence, the dominant eigenvector
is E2 where
E2 = [ −0.53399; 0.84549 ].
So we will see all the trajectories moving parallel or towards the E2 line. We set the
coefficient matrix A3 of our linearization and the parameter p3 of our model and
generate the automatic phase plane plot. It took a bit of experimentation to generate
this plot so that it looked reasonably good. The eigenvalue of 16 causes very fast
growth!
To generate the full phase plane in all quadrants is difficult due to the growth rates
in all areas of the plane other than quadrant I.
Now assume instead that the nullclines cross in the fourth quadrant. This situation occurs when 3/e < 5/7, so as an example, let's choose the value e = 5.
The value of f is not very important, so let’s choose f = 5 also just to make it easy.
Then, the intersection occurs when
5x + 4y = 3
7x − 5y = 5
We find x = 35/53 and y = −4/53. Let Q 4 = (35/53, −4/53). The Jacobian for
e = 5 and f = 5 is
J(x, y) = [ 3 − 4y − 10x   −4x;   7y   −5 + 7x − 10y ].
The other equilibrium points are Q 1 = (0, 0), Q 2 = (0, −5/ f ) and Q 3 = (3/e, 0).
Equilibrium points coincide with trajectories that we do not see if we choose positive
initial conditions in Quadrant I. However, equilibrium point Q 3 is different now. All
the trajectories that start in the positive first quadrant will converge to Q 3 . Again, we
will only look at the local linearizations for equilibrium points Q 3 and Q 4 .
Equilibrium Point Q4 For the equilibrium point Q4, we can find the Jacobian at
Q4 and the corresponding eigenvalues and eigenvectors.
Hence, there are two real roots that are distinct and the dominant eigenvector is E2 .
Using the coefficient matrix A4 of our linearization and the coordinates of Q 4 , we
set the parameter p4 and generate the automatic phase plane plot.
Equilibrium Point Q3 For the equilibrium point Q3 = (3/5, 0), the eigenvalues are now both real and negative, r1 = −3 and r2 = −0.8. The dominant eigenvector is again E2. So we will see all the trajectories moving parallel or
towards the E2 line.
To generate the full phase plane in all quadrants is as always difficult and we
have to play with the settings in the function AutoPhasePlanePlotRKF5NoPMultiple.
14.1.6 Problems
1. Graph the nullclines x′ = 0 and y′ = 0 and show on the x–y plane the regions
where x′ and y′ take on their various algebraic signs.
2. Find the equilibrium points.
3. At each equilibrium point, find the Jacobian of the system and analyze the lin-
earized system we have discussed in class. This means:
• find eigenvalues and eigenvectors if the system has real eigenvalues. You don’t
need the eigenvectors if the eigenvalues are complex.
• sketch a graph of the linearized solutions near the equilibrium point.
4. Use 1 and 3 to combine all this information into a full graph of the system.
Exercise 14.1.8
x′ = y
y′ = −x + x³/6 − y
Exercise 14.1.9
x′ = −x + y
y′ = 0.1x − 2y − x² − 0.1x³
Exercise 14.1.10
x′ = y
y′ = −x + y (1 − 3x² − 2y²)
Chapter 15
An Insulin Model
We are now going to discuss a very nice model of diabetes detection or equivalently,
a model of insulin regulation, which was presented in the classic text on applied
mathematical modeling by Braun, Differential Equations and Their Applications
(Braun 1978).
In diabetes there is too much sugar in the blood and the urine. This is a metabolic
disease and if a person has it, they are not able to use up all the sugars, starches
and various carbohydrates because they don’t have enough insulin. Diabetes can be
diagnosed by a glucose tolerance test (GTT). If you are given this test, you do an
overnight fast and then you are given a large dose of sugar in a form that appears in
the bloodstream. This sugar is called glucose. Measurements are made over about
five hours or so of the concentration of glucose in the blood. These measurements
are then used in the diagnosis of diabetes. It has always been difficult to interpret
these results as a means of diagnosing whether a person has diabetes or not. Hence,
different physicians interpreting the same data can come up with a different diagnosis,
which is a pretty unacceptable state of affairs!
In this chapter, we are going to discuss a criterion developed in the 1960s by
doctors at the Mayo Clinic and the University of Minnesota that was fairly reliable.
It showcases a lot of our modeling in this course and will give you another example
of how we use our tools. We start with a simple model of the blood glucose regulatory
system.
Glucose plays an important role in vertebrate metabolism because it is a source of
energy. For each person, there is an optimal blood glucose concentration and large
deviations from this leads to severe problems including death. Blood glucose levels
are autoregulated via standard forward and backward interactions like we see in many
biological systems. An example is the signal that is used to activate the creation of
a protein which we discussed earlier. The signaling molecules are typically either
bound to another molecule in the cell or are free. The equilibrium concentration of
free signal is due to the fact that the rate at which signaling molecules bind equals the
rate at which they split apart from their binding substrate. When an external message
comes into the cell called a trigger, it induces a change in this careful balance which
temporarily upgrades or degrades the equilibrium signal concentration. This then
influences the protein concentration rate. Blood glucose concentrations work like this
too, although the details differ. The blood glucose concentration is influenced by a
variety of signaling molecules just like the protein creation rates can be. Here are some
of them. The hormone that decreases blood glucose concentration is insulin. Insulin
is a hormone secreted by the β cells of the pancreas. After we eat carbohydrates, our
gastrointestinal tract sends a signal to the pancreas to secrete insulin. Also, the glucose
in our blood directly stimulates the β cells to secrete insulin. We think insulin helps
cells pull in the glucose needed for metabolic activity by attaching itself to membrane
walls that are normally impenetrable. This attachment increases the ability of glucose
to pass through to the inside of the cell where it can be used as fuel. So, if there is not
enough insulin, cells don’t have enough energy for their needs. The other hormones
we will focus on all tend to change blood glucose concentrations also; roughly speaking, they act by interfering with insulin's
ability to help those cells pull glucose out of the blood stream. These actions can
therefore increase blood glucose levels.
Now net hormone concentration is the sum of insulin plus the others. Let H denote
this net hormone concentration. At normal conditions, call this concentration H0 .
There have been studies performed that show that under close to normal conditions,
the interaction of the one hormone insulin with blood glucose completely dominates
the net hormonal activity. That is, normal blood sugar levels primarily depend on
insulin-glucose interactions.
So if insulin increases from normal levels, it increases net hormonal concentration
to H0 + H and decreases glucose blood concentration. On the other hand, if other
hormones such as cortisol increased from base levels, this will make blood glucose
levels go up. Since insulin dominates all activity at normal conditions, we can think
of this increase in cortisol as a decrease in insulin with a resulting drop in blood
glucose levels. A decrease in insulin from normal levels corresponds to a drop in net
hormone concentration to H0 − H. Now let G denote blood glucose level. Hence,
in our model an increase in H means a drop in G and a decrease in H means an
increase in G! Note our lumping of all the hormone activity into a single net activity
is very much like how we modeled food fish and predator fish in the predator–prey
model.
The idea of our model for diagnosing diabetes from the GTT is to find a simple
dynamical model of this complicated blood glucose regulatory system in which
the values of two parameters would give a nice criterion for distinguishing normal
individuals from those with mild diabetes or those who are pre diabetic. Here is what
we will do. We describe the model as
G′(t) = F1(G, H) + J(t)
H′(t) = F2(G, H)
where the function J is the external rate at which blood glucose concentration is
being increased. There are two nonlinear interaction functions F1 and F2 because
we know G and H have complicated interactions.
Let’s assume G and H have achieved optimal values G 0 and H0 by the time
the fasting patient has arrived at the hospital. Hence, we don’t expect to have any
contribution to G′(0) and H′(0); i.e. F1(G0, H0) = 0 and F2(G0, H0) = 0.
We are interested in the deviation of G and H from their optimal values G 0 and
H0 , so let g = G − G 0 and h = H − H0 . We can then write G = G 0 + g and
H = H0 + h. The model can then be rewritten as
(G0 + g)′(t) = F1(G0 + g, H0 + h) + J(t)
(H0 + h)′(t) = F2(G0 + g, H0 + h)
or, since G0 and H0 are constants,
g′(t) = F1(G0 + g, H0 + h) + J(t)
h′(t) = F2(G0 + g, H0 + h)
We know the tangent plane to a function F(x, y) at the point (x0, y0) is given by
F(x0 + Δx, y0 + Δy) = F(x0, y0) + ∂F/∂x (x0, y0) Δx + ∂F/∂y (x0, y0) Δy + E_F,
where the error is E_F. We use this idea on our functions F1 and F2 at the optimal
values G0 and H0. We have
F1(G0 + g, H0 + h) = F1(G0, H0) + ∂F1/∂g (G0, H0) g + ∂F1/∂h (G0, H0) h + E_F1
F2(G0 + g, H0 + h) = F2(G0, H0) + ∂F2/∂g (G0, H0) g + ∂F2/∂h (G0, H0) h + E_F2
Since F1(G0, H0) = 0 and F2(G0, H0) = 0, this reduces to
F1(G0 + g, H0 + h) = ∂F1/∂g (G0, H0) g + ∂F1/∂h (G0, H0) h + E_F1
F2(G0 + g, H0 + h) = ∂F2/∂g (G0, H0) g + ∂F2/∂h (G0, H0) h + E_F2
It seems reasonable to assume that since we are so close to ordinary operating conditions, the errors E_F1 and E_F2 will be negligible. Thus our model approximation is
g′(t) = ∂F1/∂g (G0, H0) g + ∂F1/∂h (G0, H0) h + J(t)
h′(t) = ∂F2/∂g (G0, H0) g + ∂F2/∂h (G0, H0) h
We can reason out the algebraic signs of the four partial derivatives to be
∂F1/∂g (G0, H0) = −
∂F1/∂h (G0, H0) = −
∂F2/∂g (G0, H0) = +
∂F2/∂h (G0, H0) = −
The arguments for these algebraic signs come from our understanding of the physiological processes that are going on here. Let's look at a small positive deviation g
from the optimal value G0 while letting the net hormone concentration be fixed at
H0. At this point, we are not adding an external input, so here J(t) = 0. Then our
model approximation is
g′(t) = ∂F1/∂g (G0, H0) g.
At a state where we have an increase in blood sugar levels over optimal, i.e. g > 0,
the other hormones such as cortisol and glucagon will try to regulate the blood sugar
level down by increasing their concentrations and for example storing more sugar
into glycogen. Hence, the term ∂F1/∂g (G0, H0) should be negative, as here g′ is negative
since g should be decreasing. So we model this as ∂F1/∂g (G0, H0) = −m1 for some
positive number m1. Now consider a positive change in h from the optimal level
while keeping G at the optimal level G0. Then the model is
g′(t) = ∂F1/∂h (G0, H0) h
and since h > 0, this means the net hormone concentration is up which we interpret
as insulin above normal. This means blood sugar levels go down which implies g′
is negative again. Thus, ∂F1/∂h (G0, H0) must be negative which means we model it as
∂F1/∂h (G0, H0) = −m2 for some positive m2.
Now look at the h model in these two cases. If we have a small positive deviation
g from the optimal value G0 while letting the net hormone concentration be fixed at
H0, we have
h′(t) = ∂F2/∂g (G0, H0) g.
Again, since g is positive, this means we are above normal blood sugar levels which
implies mechanisms are activated to bring the level down. Hence h′ > 0 as we have
increasing net hormone levels. Thus, we must have ∂F2/∂g (G0, H0) = m3 for some
positive m3. Finally, if we have a positive deviation h from optimal while blood
sugar levels are optimal, the model is
h′(t) = ∂F2/∂h (G0, H0) h.
Since h is positive, we have the concentrations of the hormones that pull glucose
out of the blood stream are above optimal. This means that too much sugar is being
removed and so the regulatory mechanisms will act to stop this action implying h′ < 0.
This tells us ∂F2/∂h (G0, H0) = −m4 for some positive constant m4. Hence, the four
partial derivatives at the optimal points can be defined by four positive numbers m1,
m2, m3 and m4 as follows:
∂F1/∂g (G0, H0) = −m1
∂F1/∂h (G0, H0) = −m2
∂F2/∂g (G0, H0) = +m3
∂F2/∂h (G0, H0) = −m4
Our model dynamics are thus approximated by
g′(t) = −m1 g − m2 h + J(t)
h′(t) = m3 g − m4 h
This implies
g″(t) = −m1 g′ − m2 h′ + J′(t)
      = −m1 g′ − m2 (m3 g − m4 h) + J′(t)
      = −m1 g′ − m2 m3 g + m2 m4 h + J′(t).
From the first equation we also have
m2 h = −g′(t) − m1 g + J(t)
which leads to
g″(t) = −m1 g′ − m2 m3 g + m4 ( −g′(t) − m1 g + J(t) ) + J′(t)
      = −(m1 + m4) g′ − (m1 m4 + m2 m3) g + m4 J(t) + J′(t).
Setting 2α = m1 + m4 and ω² = m1 m4 + m2 m3, we can rewrite this as
g″(t) + 2α g′ + ω² g = S(t),
where S(t) = m4 J(t) + J′(t). Now the right hand side here is zero except for the
very short time interval when the glucose load is being ingested. Hence, we can
simply search for the solution to the homogeneous model
g″(t) + 2α g′ + ω² g = 0.
The most interesting case is if we have complex roots. In that case, α² − ω² < 0.
Let Ω² = |α² − ω²|. Then, the general phase shifted solution has the form g(t) =
R e^(−αt) cos(Ωt − δ), which implies
G(t) = G0 + R e^(−αt) cos(Ωt − δ).
Hence, our model has five unknowns to find: G0, R, α, Ω and δ. The easiest way to
do this is to measure G0, the patient's initial blood glucose concentration, when the
patient arrives. Then measure the blood glucose concentration N more times giving
the data pairs (t1, G1), (t2, G2) and so on out to (tN, GN). Then form the least squares
error function
E = Σ_{i=1}^N ( Gi − G0 − R e^(−α ti) cos(Ω ti − δ) )²
and find the five parameter values that make this error a minimum. This can be
done in MatLab using some tools that are outside the scope of our text. Numerous
experiments have been done with this model and, if we let T0 = 2π/Ω, it has been
found that if T0 < 4 h, the patient is normal and if T0 is much larger than that, the
patient has mild diabetes.
We will now try to find the parameter values which minimize the nonlinear least
squares problem we have here. This appears to be a simple problem, but you will see
all numerical optimization problems are actually fairly difficult. Our problem is to
find the free parameters G0, R, α, Ω and δ which minimize
E(G0, R, α, Ω, δ) = Σ_{i=1}^N ( Gi − G0 − R e^(−α ti) cos(Ω ti − δ) )².
Let X = (G0, R, α, Ω, δ) and write fi(X) = Gi − G0 − R e^(−α ti) cos(Ω ti − δ), so that
E(X) = Σ_{i=1}^N fi(X)². Then the chain rule gives
∂E/∂G0 = 2 Σ_{i=1}^N fi(X) ∂fi/∂G0
∂E/∂R  = 2 Σ_{i=1}^N fi(X) ∂fi/∂R
∂E/∂α  = 2 Σ_{i=1}^N fi(X) ∂fi/∂α
∂E/∂Ω  = 2 Σ_{i=1}^N fi(X) ∂fi/∂Ω
∂E/∂δ  = 2 Σ_{i=1}^N fi(X) ∂fi/∂δ
and the individual partials are
∂fi/∂G0 = −1
∂fi/∂R  = −e^(−α ti) cos(Ω ti − δ)
∂fi/∂α  = ti R e^(−α ti) cos(Ω ti − δ)
∂fi/∂Ω  = ti R e^(−α ti) sin(Ω ti − δ)
∂fi/∂δ  = −R e^(−α ti) sin(Ω ti − δ)
and so
∂E/∂G0 = −2 Σ_{i=1}^N fi(X)
∂E/∂R  = −2 Σ_{i=1}^N fi(X) e^(−α ti) cos(Ω ti − δ)
∂E/∂α  = 2 Σ_{i=1}^N fi(X) ti R e^(−α ti) cos(Ω ti − δ)
∂E/∂Ω  = 2 Σ_{i=1}^N fi(X) ti R e^(−α ti) sin(Ω ti − δ)
∂E/∂δ  = −2 Σ_{i=1}^N fi(X) R e^(−α ti) sin(Ω ti − δ)
Now suppose we are at the point X 0 and we want to know how much of the descent
vector D to use. Note, if we use the amount ξ of the descent vector at X 0 , we
compute the new error value E(X 0 −ξ D(X 0 )). Let g(ξ) = E(X 0 −ξ D(X 0 )). We see
g(0) = E(X 0 ) and given a first choice of ξ = λ, we have g(λ) = E(X 0 − λD(X 0 )).
Next, let Y = X0 − ξ D(X0). Then, using the chain rule, we can calculate the
derivative of g. First, we have
g′(ξ) = − < ∇E(Y), D(X0) >.
Now let's approximate g using a quadratic model. Since we are trying for a minimum,
in general we try to take a step in the direction of the negative gradient which makes
the error function go down. Then, we have g(0) = E(X0) is less than g(λ) =
E(X0 − λD(X0)) and the directional derivative gives g′(0) = −||∇(E)|| < 0.
Hence, if we approximate g by a simple quadratic model, g(ξ) = A + Bξ + Cξ²,
this model will have a unique minimizer and we can use the value of ξ where the
minimum occurs as our next choice of descent step. This technique is called a Line
Search Method and it is quite useful. To summarize, we fit our g model and find
g(0) = E(X0) = A,
g′(0) = −||∇(E)|| = B,
g(λ) = A + Bλ + Cλ²  ⟹  C = ( E(X0 − λD(X0)) − E(X0) + ||∇(E)|| λ ) / λ².
Let’s get started on how to find the optimal parameters numerically. Along the way,
we will show you how hard this is. We start with a minimal implementation. We have
already discussed some root finding codes in Chap. 14 so we have seen code kind of
similar. But this will be a little different and it is good to have you see a bit about it.
What is complicated here is that we have lots of functions that depend on the data
we are trying to fit. So the number of functions depends on the size of the data set
which makes it harder to set up.
Note inside this function, we call another function to calculate the gradient of
the norm. This is given below and implements the formulae we presented earlier for
these partial derivatives.
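That helper listing is not reproduced; a sketch that implements the five partial derivative formulas above, with our own function name and the e^(−αt) form of the model used in the derivation, is

function gradE = DiabetesGradient(X, Time, G)
  % X = [G0; R; alpha; Omega; delta];  Time, G = the GTT data
  G0 = X(1); R = X(2); a = X(3); o = X(4); d = X(5);
  gradE = zeros(5,1);
  for i = 1:length(G)
    t  = Time(i);
    fi = G(i) - G0 - R*exp(-a*t)*cos(o*t - d);
    % partial derivatives of f_i with respect to G0, R, alpha, Omega, delta
    dfi = [ -1;
            -exp(-a*t)*cos(o*t - d);
             t*R*exp(-a*t)*cos(o*t - d);
             t*R*exp(-a*t)*sin(o*t - d);
            -R*exp(-a*t)*sin(o*t - d) ];
    gradE = gradE + 2*fi*dfi;
  end
end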
We also need code for the error calculations which is given here.
Listing 15.3: Diabetes Error Calculation
function E = DiabetesError(f, g, r, a, o, d, Time, G)
%
% Time = measurement times
% G = glucose values
% f = nonlinear insulin model
% g, r, a, o, d = parameters in diabetes nonlinear model
% N = size of data
N = length(G);
E = 0.0;
% calculate error function
for i = 1:N
  E = E + (G(i) - f(Time(i), g, r, a, o, d))^2;
end
Listing 15.4: Run time results for gradient descent on the original data
Data = [ 0 , 9 5 ; 1 , 1 8 0 ; 2 , 1 5 5 ; 3 , 1 4 0 ] ;
2 Time = Data ( : , 1 ) ;
G = Data ( : , 2 ) ;
f = @( t , equilG , r , a , o , d ) e qu i l G + r ∗ exp(−a ˆ 2 ∗ t ) . ∗ ( c o s ( o ∗ t−d ) ) ;
time = l i n s p a c e ( 0 , 3 , 4 1 ) ;
RInitial = 53.64;
7 G O I n i t i a l = 95 + R I n i t i a l ;
AInitial = sqrt ( log (17/5) ) ;
OInitial = pi ;
d I n i t i a l = −p i ;
I n i t i a l = [ GOInitial ; R I n i t i a l ; A I n i t i a l ; OInitial ; d I n i t i a l ] ;
12 [ Error , G0 , R, alpha , Omega , d e l t a , normgrad , update ] = DiabetesGrad ( I n i t i a l , 5 . 0 e
−4 ,20000 , Data , 0 ) ;
I n i t i a l E = 463.94
E = 376.40
o c t a v e :16> [ Error , G0 , R, alpha , Omega , d e l t a , normgrad , update ] = DiabetesGrad (
I n i t i a l , 5 . 0 e −4 ,40000 , Data , 0 ) ;
I n i t i a l E = 463.94
17 E = 377.77
o c t a v e :145 > [ Error , G0 , R, alpha , Omega , d e l t a , normgrad , update ] = DiabetesGrad (
I n i t i a l , 5 . 0 e −4 ,100000 , Data , 0 ) ;
I n i t i a l E = 463.94
E = 377.77
After 100,000 iterations we still do not have a good fit. Note we start with a
small constant λ = 5.0e − 4 here. Try it yourself. If you let this value be larger,
the optimization spins out of control. Also, we have not said how we chose our
initial values. We actually looked at the data on a sheet of paper and did some rough
calculations to try for some decent values. We will leave that to you to figure out. If
the initial values are poorly chosen, gradient descent optimization is a great way to
generate really bad values! So be warned. You will have to exercise due diligence to
find a sweet starting spot.
We can see how we did by looking at the resulting curve fit in Fig. 15.1.
Now let’s add line search and see if it gets better. We will also try scaling the data
so all the variables in question are roughly the same size. For us, a good choice is
to scale the G 0 and the R value by 50, although we could try other choices. We
have already discussed line search for our problem, but here it is again in a quick
nutshell. If we are minimizing a function of M variables, say f(X), then if we are
at the point X0, we can look at the slice of this function we get if we move out from the
base point X0 in the direction of the negative gradient, −∇f(X0) = −∇f0. Define
a function of the single variable ξ as g(ξ) = f(X0 − ξ ∇f0). Then, we can try
to approximate g as a quadratic, g(ξ) ≈ A + Bξ + Cξ². Of course, the actual
function might not be approximated nicely by such a quadratic, but it is worth a shot!
Once we fit the parameters A, B and C, we see this quadratic model is minimized
at λ* = −B/(2C). The code now adds the line search code which is contained in the block below.
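The block is not reproduced here; a self-contained sketch of the quadratic fit step, with our own names (ErrFun is a stand-in for whatever routine evaluates the least squares error), is

function [Xnew, lambdastar] = QuadraticLineSearch(ErrFun, X0, D, normgrad, lambda)
  % fit g(xi) = A + B*xi + C*xi^2 to the slice g(xi) = ErrFun(X0 - xi*D)
  A = ErrFun(X0);                              % g(0)
  B = -normgrad;                               % slope used in the text for g'(0)
  Elambda = ErrFun(X0 - lambda*D);             % one trial evaluation at xi = lambda
  C = (Elambda - A + normgrad*lambda)/lambda^2;
  if C > 0
    lambdastar = -B/(2*C);                     % minimizer of the quadratic model
  else
    lambdastar = lambda;                       % no interior minimum; keep the trial step
  end
  Xnew = X0 - lambdastar*D;
end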
We can see how this is working by letting some of the temporary calculations
print. Here are two iterations of line search printing out the A, B and C and the relevant energy values. Our initial values don't matter much here as we are just checking
out the line search algorithm.
Now let’s remove those prints and let it run for awhile. We are using the original
data here to try to find the fit.
We have success! The line search got the job done in 30,000 iterations while the
attempt using just gradient descent without line search failed. But remember, we do
additional processing at each step. We show the resulting curve fit in Fig. 15.2.
The qualitative look of this fit is a bit different. We leave it to you to think about how
we are supposed to choose which fit is better; i.e. which fit is a better one to use for
the biological reality we are trying to model? This is a really hard question. Finally,
the optimal values of the parameters are
The critical value of 2π/Ω = 1.6663 here, which is less than 4, so this patient is
normal!
Also, note these sorts of optimizations are very frustrating. If we use the
scaled version of the first Initial = [GOInitial; RInitial; AInitial;
OInitial; dInitial]; we make no progress even though we run for 60,000
iterations, and these iterations are a bit more expensive because we use line search. So
let's perturb the starting point a bit and see what happens.
Reference
M. Braun, Differential Equations and Their Applications (Springer, New York, 1978)
Chapter 16
Series Solutions
Let's look at another tool we can use to solve models. Sometimes our models involve
partial derivatives instead of normal derivatives; we call such models partial differ-
ential equation models or PDE models. They are pretty important and you should
have a beginning understanding of them. Let’s get started with a common tool called
the Separation of Variables Method.
A common PDE model is the general cable model which is given below in fairly
abstract form; we write the unknown function as Φ(x, t):
β² ∂²Φ/∂x² − Φ − α ∂Φ/∂t = 0,   for 0 ≤ x ≤ L, t ≥ 0,
∂Φ/∂x (0, t) = 0,
∂Φ/∂x (L, t) = 0,
Φ(x, 0) = f(x),
for positive constants α and β. The domain is the usual half infinite [0, L] × [0, ∞)
where the spatial part of the domain corresponds to the length of the dendritic cable
in an excitable nerve cell. We won’t worry too much about the details of where
this model comes from as we will discuss that in another volume. The boundary
conditions u x (0, t) = 0 and u x (L , t) = 0 are called Neumann Boundary conditions.
The conditions u(0, t) = 0 and u(L , t) = 0 are known as Dirichlet Boundary
conditions. The solution to a model such as this is a function Φ(x, t) which is
sufficiently smooth to have partial derivatives with respect to the needed variables
continuous for all the orders required. For these problems, the highest order we need
is the second order partials. One way to find the solution is to assume we can separate
the variables so that we can write Φ(x, t) = u(x)w(t). If we make this separation
assumption, we will find solutions that must be written as what are called infinite
series and to solve the boundary conditions, we will have to be able to express
boundary functions as series expansions. Hence, we will have to introduce some
new ideas in order to understand these things. Let’s motivate what we need to do by
applying the separation of variables technique to the cable equation. This will show
the ideas we need to use in a specific example. Then we will step back and go over
the new mathematical ideas of series and then return to the cable model and finish
finding the solution.
We assume a solution of the form Φ(x, t) = u(x) w(t) and compute the needed
partials. This leads to the new equation
β² (d²u/dx²) w(t) − u(x) w(t) − α u(x) (dw/dt) = 0.
Rewriting, we find for all x and t, we must have
w(t) ( β² d²u/dx² − u(x) ) = α u(x) dw/dt.
This tells us
( β² d²u/dx² − u(x) ) / u(x) = α (dw/dt) / w(t),   0 ≤ x ≤ L, t > 0.
The only way this can be true is if both the left and right hand side are equal to a
constant that is usually called the separation constant, which we denote by Λ. This leads to the decoupled
Eqs. 16.1 and 16.2.
α dw/dt = Λ w(t),   t > 0,   (16.1)
β² d²u/dx² = (1 + Λ) u(x),   0 ≤ x ≤ L.   (16.2)
We also have boundary conditions. Our assumption leads to the following boundary
conditions in x:
(du/dx)(0) w(t) = 0,   t > 0,
(du/dx)(L) w(t) = 0,   t > 0.
Since w(t) is not identically zero, this forces
(du/dx)(0) = 0,   (16.3)
(du/dx)(L) = 0.   (16.4)
Equations 16.1–16.4 give us the boundary value problem in u(x) we need to solve.
Then, we can find w.
16.1.1.1 Case I: 1 + Λ = ω², ω ≠ 0
Here the u equation is
u″ − (ω²/β²) u = 0,
u′(0) = 0,
u′(L) = 0,
with characteristic equation r² − ω²/β² = 0 with the real roots ±ω/β. The general
solution is u(x) = A cosh(ωx/β) + B sinh(ωx/β), which tells us
u′(x) = A (ω/β) sinh(ωx/β) + B (ω/β) cosh(ωx/β),
u′(0) = 0 = B (ω/β),
u′(L) = 0 = A (ω/β) sinh(ωL/β).
Hence, B = 0 and A sinh(ωL/β) = 0. Since sinh is never zero when ω is not zero,
we see A = 0 also. Hence, the only u solution is the trivial one and we can reject
this case.
Case II: 1 + Λ = 0
The u equation is now
u″ = 0,
u′(0) = 0,
u′(L) = 0,
with characteristic equation r² = 0 with the double root r = 0. Hence, the general
solution is now
u(x) = A + B x.
Applying the boundary conditions u′(0) = 0 and u′(L) = 0, since u′(x) = B,
we have
u′(0) = 0 = B,
u′(L) = 0 = B.
Hence, B = 0 but the value of A can't be determined. Hence, any arbitrary constant
which is not zero is a valid non zero solution. Choosing A = 1, let u0(x) = 1 be
our chosen nonzero solution for this case.
We now need to solve for w in this case. Since Λ = −1, the model to solve is
dw/dt = −(1/α) w(t),   t > 0.
The general solution is w(t) = C e^(−t/α) for any value of C. Choose C = 1 and we
set
w0(t) = e^(−t/α).
Hence, the product φ0(x, t) = u0(x) w0(t) solves the boundary conditions. That is
φ0(x, t) = e^(−t/α)
is a solution.
Case III: 1 + Λ = −ω², ω ≠ 0
The u equation is now
u″ + (ω²/β²) u = 0,
u′(0) = 0,
u′(L) = 0.
The general solution is u(x) = A cos(ωx/β) + B sin(ωx/β) and hence
u′(x) = −A (ω/β) sin(ωx/β) + B (ω/β) cos(ωx/β),
u′(0) = 0 = B (ω/β),
u′(L) = 0 = −A (ω/β) sin(ωL/β).
Hence, B = 0 and A sin(ωL/β) = 0. If sin(ωL/β) ≠ 0, we are forced to take A = 0
as well, so the only solutions are the trivial or zero solutions unless
ωL = nπβ for some integer n. Letting ωn = nπβ/L, we find a non zero solution for each nonzero value
of A of the form
un(x) = A cos(ωn x/β) = A cos(nπ x/L).
The corresponding w equation is
dw/dt = −((1 + ωn²)/α) w(t),   t ≥ 0.
The general solution is
w(t) = Bn exp( −(1 + ωn²) t/α ) = Bn exp( −(1 + n²π²β²/L²) t/α ).
Hence, each product φn(x, t) = un(x) wn(t), with
wn(t) = exp( −(1 + n²π²β²/L²) t/α ),
will solve the model with the x boundary conditions, and so will any finite sum of the form,
for arbitrary constants An,
ΦN(x, t) = Σ_{n=1}^N An φn(x, t) = Σ_{n=1}^N An un(x) wn(t)
         = Σ_{n=1}^N An cos(nπx/L) exp( −(1 + n²π²β²/L²) t/α ).
Adding in the 1 + Λ = 0 case, we find the most general finite term solution has the
form
ΦN(x, t) = A0 φ0(x, t) + Σ_{n=1}^N An φn(x, t) = A0 u0(x) w0(t) + Σ_{n=1}^N An un(x) wn(t)
         = A0 exp(−t/α) + Σ_{n=1}^N An cos(nπx/L) exp( −(1 + n²π²β²/L²) t/α ).
Letting N grow, we are led to the infinite series
Φ(x, t) = A0 φ0(x, t) + Σ_{n=1}^∞ An φn(x, t) = A0 u0(x) w0(t) + Σ_{n=1}^∞ An un(x) wn(t)
        = A0 exp(−t/α) + Σ_{n=1}^∞ An cos(nπx/L) exp( −(1 + n²π²β²/L²) t/α ).
This is the form that will let us solve the remaining boundary condition. We need to
step back now and talk more about this idea of a series solution to our model.
Let's look at sequences of functions made up of building blocks of the form un(x) =
cos(nπx/L) or vn(x) = sin(nπx/L) for various values of the integer n. The number L is
a fixed value here. We can combine these functions into finite sums: let UN(x) and
VN(x) be defined as follows:
UN(x) = Σ_{n=1}^N an sin(nπx/L)
and
VN(x) = b0 + Σ_{n=1}^N bn cos(nπx/L).
These are the partial sums formed from the sequences of cosine and sine numbers. However,
the underlying sequences can be negative, so these are not sequences of non negative
terms like we previously discussed. These sequences of partial sums may or may
not have a finite supremum value. Nevertheless, we still represent the supremum
using the same notation: i.e. the supremum of ( Ui(x0) )_{i=1}^∞ and the supremum of
( Vi(x0) )_{i=0}^∞ can be written as Σ_{n=1}^∞ an sin(nπx0/L) and b0 + Σ_{n=1}^∞ bn cos(nπx0/L).
This sequence of real numbers converges to a possibly different number for each x0;
hence, let's call this possible limit S(x0). Now the limit may not exist, of course. We
will write lim_{n→∞} Un(x0) = S(x0) when the limit exists. If the limit does not exist for
some value of x0, we will understand that the value S(x0) is not defined in some way.
Note, from our discussion above, this could mean the limiting value flips between
a finite set of possibilities, the limit approaches ∞ or the limit approaches −∞. In
any case, the value S(x0) is not defined as a finite value. We would say this precisely
as follows: given any positive tolerance ε, there is a positive integer N so that
n > N  ⟹  | Σ_{i=1}^n ai sin(iπx0/L) − S(x0) | < ε.
When this happens, we write
lim_{n→∞} Σ_{i=1}^n ai sin(iπx0/L) = S(x0).
As before, this symbol is called an infinite series and we see we get a potentially
different series at each point x0. The error term S(x0) − Un(x0) is then written as
S(x0) − Σ_{i=1}^n ai sin(iπx0/L) = Σ_{i=n+1}^∞ ai sin(iπx0/L),
which you must remember is just a short hand for this error.
Now that we have an infinite series notation defined, we note the term Un(x0),
which is the sum of n terms, is also called the nth partial sum of the series
Σ_{i=1}^∞ ai sin(iπx0/L). Note we can define the convergence at a point x0 for the partial
sums VN(x0) in exactly the same way.
Let's go back and think about vectors in ℝ². As you know, we think of these as
arrows with a tail fixed at the origin of the two dimensional coordinate system we
call the x–y plane. They also have a length or magnitude and this arrow makes an
angle with the positive x axis. Suppose we look at two such vectors, E and F. Each
vector has an x and a y component so that we can write
E = [a; b]   and   F = [c; d].
The cosine of the angle between them is proportional to the inner product
< E, F >= ac + bd. If this angle is 0 or π, the two vectors lie along the same
line. In any case, the angle associated with E is tan⁻¹(b/a) and for F, tan⁻¹(d/c).
Hence, if the two vectors lie on the same line, E must be a multiple of F. This means
there is a number β so that
E = β F.
Now let the number 1 in front of E be called −α. Then the fact that E and F lie on
the same line implies there are 2 constants α and β, both not zero, so that
α E + β F = 0.
Note we could argue this way for vectors in ℝ³ and even in ℝⁿ. Of course, our ability
to think of these things in terms of lying on the same line and so forth needs to be
extended to situations we can no longer draw, but the idea is essentially the same.
Instead of thinking of our two vectors as lying on the same line or not, we can rethink
what is happening here and try to identify what is happening in a more abstract
way. If our two vectors lie on the same line, they are not independent things in the
sense one is a multiple of the other. As we saw above, this implies there was a linear
equation connecting the two vectors which had to add up to 0. Hence, we might say
the vectors were not linearly independent or simply, they are linearly dependent.
Phrased this way, we are on to a way of stating this idea which can be used in many
more situations. We state this as a definition.
We say E and F are linearly dependent if there are constants α and β, not both zero, so that
α E + β F = 0;
otherwise, they are linearly independent. More generally, the objects E1, . . . , EN are linearly dependent if there are constants α1, . . . , αN, not all zero, so that
α1 E1 + · · · + αN EN = 0.
Note we have changed the way we define the constants a bit. When there are more
than two objects involved, we can’t say, in general, that all of the constants must be
non zero.
Now let’s apply these ideas to functions f and g defined on some interval I . By this
we mean either
• I is all of ℝ, i.e. a = −∞ and b = ∞,
• I is half-infinite. This means a = −∞ and b is finite with I of the form (−∞, b)
or (−∞, b]. Similarly, I could have the form (a, ∞) or [a, ∞),
• I is an interval of the form (a, b), [a, b), (a, b] or [a, b] for finite a < b.
We would say f and g are linearly independent on the interval I if the equation
α1 f(t) + α2 g(t) = 0,   for all t in I,
implies α1 and α2 must both be zero. Here is an example. The functions sin(t) and
cos(t) are linearly independent on ℝ because
α1 sin(t) + α2 cos(t) = 0,   for all t,
also implies the above equation holds for the derivative of both sides giving
α1 cos(t) − α2 sin(t) = 0,   for all t.
In matrix form, these two equations are
[ cos(t)   −sin(t);   sin(t)   cos(t) ] [α1; α2] = [0; 0]
for all t. The determinant of the matrix here is cos²(t) + sin²(t) = 1 and so picking
any t we like, we find the unique solution is α1 = α2 = 0. Hence, these two
functions are linearly independent on ℝ. In fact, they are linearly independent on any
interval I.
This leads to another important idea. Suppose f and g are linearly independent
differentiable functions on an interval I. Then, we know the system
[ f(t)   g(t);   f′(t)   g′(t) ] [α1; α2] = [0; 0]
has only the zero solution α1 = α2 = 0 for all t in I, and this happens when the determinant of the
coefficient matrix is not zero. This determinant comes up a lot and it is called the Wronskian of
the two functions f and g and it is denoted by the symbol W(f, g). Hence, we have
the implication: if f and g are linearly independent differentiable functions, then
W(f, g) ≠ 0 for all t in I. What about the converse? If the Wronskian is never zero
on I, then the system
[ f(t)   g(t);   f′(t)   g′(t) ] [α1; α2] = [0; 0]
forces α1 = α2 = 0, and so f and g are linearly independent on I. We can state this formally.
Theorem 16.3.1 (Two Functions are Linearly Independent if and only if their
Wronskian is not zero) If f and g are differentiable functions on I , the Wronskian of
f and g is defined to be
W(f, g) = det [ f(t)   g(t);   f′(t)   g′(t) ],
where W ( f, g) is the symbol for the Wronskian of f and g. Sometimes, this is just
written as W , if the context is clear. Then f and g are linearly independent on I if
and only if W ( f, g) is non zero on I .
If f , g and h are twice differentiable on I , the Wronskian uses a third row of second
derivatives and the statement that these three functions are linearly independent on
I if and only if their Wronskian is non zero on I is proved essentially the same way.
The appropriate theorem is
Theorem 16.3.2 (Three Functions are Linearly Independent if and only if their
Wronskian is not zero) If f , g and h are twice differentiable functions on I , the
Wronskian of f , g and h is defined to be
⎛⎡ ⎤⎞
f (t) g(t) h(t)
W ( f, g, h) = det ⎝⎣ f (t) g (t) h (t) ⎦⎠ .
f (t) g (t) h (t)
where W(f, g, h) is the symbol for the Wronskian of f, g and h. Then f, g and h are
linearly independent on I if and only if W ( f, g, h) is non zero on I .
For example, to show the three functions f(t) = t, g(t) = sin(t) and h(t) = e^(2t)
are linearly independent on ℝ, we could form their Wronskian
W(f, g, h) = det [ t   sin(t)   e^(2t);   1   cos(t)   2e^(2t);   0   −sin(t)   4e^(2t) ]
           = t det [ cos(t)   2e^(2t);   −sin(t)   4e^(2t) ] − det [ sin(t)   e^(2t);   −sin(t)   4e^(2t) ]
           = t e^(2t) (4 cos(t) + 2 sin(t)) − e^(2t) (4 sin(t) + sin(t))
           = e^(2t) ( 4t cos(t) + 2t sin(t) − 5 sin(t) ).
Is the expression 4t cos(t) + 2t sin(t) − 5 sin(t) zero for all t? If so, that would mean the functions t sin(t), t cos(t) and sin(t) are
linearly dependent. We could then form another Wronskian for these functions which
would be rather messy. To see these three new functions are linearly independent, it is
easier to just pick three points t from ℝ and solve the resulting linear dependence
equations. Since t = 0 does not give any information, let's try t = −π, t = π/4 and
t = π/2. This gives the system
[ −4π   0   0;   π√2/2   π√2/4   −5√2/2;   0   2π   −5 ] [α1; α2; α3] = [0; 0; 0].
The first equation forces α1 = 0, and the last two equations then force α2 and α3 to be
0 too. This shows t sin(t), t cos(t) and sin(t) are linearly independent and shows
the expression 4t cos(t) + 2t sin(t) − 5 sin(t) is not zero for all t. Hence, the functions
f(t) = t, g(t) = sin(t) and h(t) = e^(2t) are linearly independent. As you can see,
these calculations become messy quickly. Usually, the Wronskian approach for more
than two functions is too hard and we use the "pick three suitable points ti from I"
approach and solve the resulting linear system. If we can show the solution is always
0, then the functions are linearly independent.
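As a quick numerical illustration of this "pick three points" idea, the sketch below evaluates t sin(t), t cos(t) and sin(t) directly at the three chosen points (so the entries differ from the matrix above by the constant factors 4, 2 and −5) and checks that the resulting coefficient matrix is nonsingular:

ts = [-pi; pi/4; pi/2];                        % the three test points
M  = [ts.*sin(ts), ts.*cos(ts), sin(ts)];      % each row is one linear dependence equation
d  = det(M)                                    % nonzero, so only alpha1 = alpha2 = alpha3 = 0 works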
16.3.2 Homework
We can make the ideas we have been talking about more formal. If we have a set
of objects u with a way to add them to create new objects in the set and a way to
scale them to make new objects, this is formally called a Vector Space with the set
denoted by V . For our purposes, we scale such objects with either real or complex
numbers. If the scalars are real numbers, we say V is a vector space over the reals;
otherwise, it is a vector space over the complex field.
Definition 16.4.1 (Vector Space) Let V be a set of objects u with an additive
operation ⊕ and a scaling method. Formally, this means
1. Given any u and v, the operation of adding them together is written u ⊕ v and
results in the creation of a new object w in the vector space. This operation is
commutative which means the order of the operation is not important; so u ⊕ v
and v ⊕ u give the same result. Also, this operation is associative as we can group
any two objects together first, perform this addition ⊕ and then do the others and
the order of the grouping does not matter.
2. Given any u and any number c (either real or complex, depending on the type
of vector space we have), the operation c u creates a new object. We call such
numbers scalars.
3. The scaling and additive operations are nicely compatible in the sense that order
and grouping is not important. These are called the distributive laws for scaling
and addition. They are
c (u ⊕ v) = (c u) ⊕ (c v)
(c + d) u = (c u) ⊕ (d u).
From these rules it follows, for example, that
(0 + 0) u = (0 u) ⊕ (0 u),
so 0 u must be the zero object of the space, and
0 = (1 − 1) u = (1 u) ⊕ (−1 u),
so (−1) u acts as the additive inverse of u. Some standard examples of vector spaces of functions are:
1. C[a, b] is the set of all functions whose domain is [a, b] that are continuous on the domain.
2. C¹[a, b] is the set of all functions whose domain is [a, b] that are continuously
differentiable on the domain.
3. R[a, b] is the set of all functions whose domain is [a, b] that are Riemann integrable on the domain.
There are many more, of course.
Vector spaces have two other important ideas associated with them. We have already
talked about linearly independent objects. Clearly, the kinds of objects we were
focusing on were from some vector space V . The first idea is that of the span of a set.
Definition 16.4.2 (The Span Of A Set Of Vectors) Given a finite set of vectors in
a vector space V, W = {u1, . . . , uN} for some positive integer N, the span of W
is the collection of all new vectors of the form Σ_{i=1}^N ci ui for any choices of scalars
c1, . . . , cN. It is easy to see the span of W is a vector space itself and since it is a subset of V,
we call it a vector subspace. The span of the set W is denoted by Sp(W). If the set
of vectors W is not finite, the definition is similar but we say the span of W is the
set of all vectors which can be written as Σ_{i=1}^N ci ui for some finite set of vectors
u1, . . . , uN from W.
Then there is the notion of a basis for a vector space. First, we need to extend the
idea of linear independence to sets that are not necessarily finite.
Definition 16.4.3 (Linear Independence For Non Finite Sets) Given a set of vectors
in a vector space V , W , we say W is a linearly independent subset if every finite set
of vectors from W is linearly independent in the usual manner.
Definition 16.4.4 (A Basis For A Vector Space) Given a set of vectors in a vector
space V , W , we say W is a basis for V if the span of W is all of V and if the vectors
in W are linearly independent. Hence, a basis is a linearly independent spanning set
for V . The number of vectors in W is called the dimension of V . If W is not finite
in size, then we say V is an infinite dimensional vector space.
Comment 16.4.5 In a vector space like ℝⁿ, the maximum size of a set of linearly
independent vectors is n, the dimension of the vector space.
Comment 16.4.6 Let’s look at the vector space C[0, 1], the set of all continuous
functions on [0, 1]. Let W be the set of all powers of t, {1, t, t 2 , t 3 , . . .}. We can
use the derivative technique to show this set is linearly independent even though
it is infinite in size. Take any finite subset from W . Label the resulting powers as
{n 1 , n 2 , . . . , n p }. Write down the linear dependence equation
c1 t^(n1) + c2 t^(n2) + · · · + cp t^(np) = 0.
Take n p derivatives to find c p = 0 and then backtrack to find the other constants
are zero also. Hence C[0, 1] is an infinite dimensional vector space. It is also clear
that W does not span C[0, 1] as if this was true, every continuous function on [0, 1]
would be a polynomial of some finite degree. This is not true as sin(t), e−2t and many
others are not finite degree polynomials.
Now there is an important result that we use a lot in applied work. If we have an object
u in a Vector Space V, we often want to approximate u using an element from
a given subspace W of the vector space. To do this, we need to add another property
to the vector space. This is the notion of an inner product. We already know what
an inner product is in a simple vector space like ℝⁿ. Many vector spaces can have
an inner product structure added easily. For example, in C[a, b], since each object is
continuous, each object is Riemann integrable. Hence, given two functions f and g
from C[a, b], the real number given by ∫_a^b f(s) g(s) ds is well-defined. It satisfies all
the usual properties that the inner product for finite dimensional vectors in ℝⁿ does
also. These properties are so common we will codify them into a definition for what
an inner product for a vector space V should behave like.
an inner product for a vector space V should behave like.
Definition 16.4.5 (Real Inner Product) Let V be a vector space with the reals as
the scalar field. Then a mapping ω which assigns a pair of objects to a real number
is called an inner product on V if
1. ω(u, v) = ω(v, u); that is, the order is not important for any two objects.
2. ω(c u, v) = cω(u, v); that is, scalars in the first slot can be pulled out.
3. ω(u ⊕ w, v) = ω(u, v) + ω(w, v), for any three objects.
4. ω(u, u) ≥ 0 and ω(u, u) = 0 if and only if u = 0.
These properties imply that ω(u, c v) = cω(u, v) as well. A vector space V with
an inner product is called an inner product space.
Comment 16.4.7 The inner product is usually denoted with the symbol <, > instead
of ω( , ). We will use this notation from now on.
Comment 16.4.8 When we have an inner product, we can measure the size or
magnitude of an object, as follows. We define the analogue of the euclidean norm of
an object u using the usual || · || symbol as
||u|| = √( < u, u > ).
This is called the norm induced by the inner product of the object. In C[a, b], with
the inner product < f, g > = ∫_a^b f(s) g(s) ds, the norm of a function f is thus
||f|| = √( ∫_a^b f²(s) ds ).
This is called the L² norm of f.
It is possible to prove the Cauchy–Schwartz inequality in this more general setting
also.
Theorem 16.4.1 (General Cauchy–Schwartz Inequality)
If V is an inner product space with inner product < , > and induced norm || · ||, then
| < u, v > | ≤ ||u|| ||v||   for all u and v in V.
Proof The proof is different than the one you would see in a Calculus text for ℝ²,
of course, and is covered in a typical course on beginning linear analysis.
Comment 16.4.9 We can use the Cauchy–Schwartz inequality to define a notion of
angle between objects exactly like we would do in ℝ². We define the angle θ between
u and v via its cosine as usual:
cos(θ) = < u, v > / ( ||u|| ||v|| ).
Hence, objects can be perpendicular or orthogonal even if we can not interpret them
as vectors in ℝ². We see two objects are orthogonal if their inner product is 0.
Comment 16.4.10 If W is a finite dimensional subspace, a basis for W is said to be
an orthonormal basis if each object in the basis has L 2 norm 1 and all of the objects
are mutually orthogonal. This means < ui , u j > is 1 if i = j and 0 otherwise. We
typically let the Kronecker delta symbol δi j be defined by δi j = 1 if i = j and 0
otherwise so that we can say this more succinctly as < ui , u j >= δi j .
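To see this orthogonality concretely for the cosine building blocks used below, here is a small Octave/Matlab check; the interval length L and the grid size are arbitrary illustrative choices.

L  = 2;
x  = linspace(0, L, 2001);
u  = @(n) cos(n*pi*x/L);
ip = @(n, m) trapz(x, u(n).*u(m));             % approximates the C[0,L] inner product
[ip(2,3), ip(2,2), ip(3,3)]                    % approximately 0, L/2, L/2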
A trigonometric series is a sum of the form
S(x) = b0 + Σ_{i=1}^∞ [ ai sin(iπx/L) + bi cos(iπx/L) ]
for any numbers an and bn. Of course, there is no guarantee that this series will
converge at any x! If we start with a function f which is continuous on the interval
[0, L], we can define the trigonometric series associated with f as follows:
S(x) = (1/L) < f, 1 >
     + Σ_{i=1}^∞ [ (2/L) < f(x), sin(iπx/L) > sin(iπx/L) + (2/L) < f(x), cos(iπx/L) > cos(iπx/L) ],
where the symbol < , > is the inner product in the set of functions C[0, L] defined
by < u, v > = ∫_0^L u(s) v(s) ds. The coefficients in the Fourier series for f are called
the Fourier coefficients of f. Since these coefficients are based on inner products
with scaled sin and cos functions, we call these the normalized Fourier coefficients.
Let's be clear about this and a bit more specific. The nth Fourier sin coefficient,
n ≥ 1, of f is as follows:
an(f) = (2/L) ∫_0^L f(x) sin(nπx/L) dx.
The finite sums from the cable problem,
ΦN(x, t) = A0 exp(−t/α) + Σ_{n=1}^N An cos(nπx/L) exp( −(1 + n²π²β²/L²) t/α ),
are like Fourier series although in terms of two variables. We can show these series
converge pointwise for x in [0, L] and all t. We can also show that we can take the
partial derivative of this series solution term by term (see the discussions in Peterson
(2015) for details) to obtain
∂ΦN/∂x (x, t) = − Σ_{n=1}^N An (nπ/L) sin(nπx/L) exp( −(1 + n²π²β²/L²) t/α ).
This series evaluated at x = 0 and x = L gives 0 and hence the Neumann conditions
are satisfied. Hence, the solution Φ(x, t) given by
Φ(x, t) = A0 exp(−t/α) + Σ_{n=1}^∞ An cos(nπx/L) exp( −(1 + n²π²β²/L²) t/α )
for the arbitrary sequence of constants (An) is a well-behaved solution on our domain.
The remaining boundary condition is Φ(x, 0) = f(x), and from the series form above
Φ(x, 0) = A0 + Σ_{n=1}^∞ An cos(nπx/L).
So we need to choose the constants An so that
A0 + Σ_{n=1}^∞ An cos(nπx/L) = f(x).
We can expand f in its Fourier cosine series on [0, L],
f(x) = B0 + Σ_{n=1}^∞ Bn cos(nπx/L),
with
B0 = (1/L) ∫_0^L f(x) dx,
Bn = (2/L) ∫_0^L f(x) cos(nπx/L) dx.
Then, setting these series equal, we find that the solution is given by An = Bn for
all n ≥ 0. The full details of all of this are outside the scope of our work here, but this
will give you a taste of how these powerful tools can help us solve PDE models.
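As a final sketch, here is how the recipe An = Bn could be carried out numerically; the initial data f, the constants and the truncation level are arbitrary illustrative choices, and trapz is used for the coefficient integrals.

L = 5; alpha = 1; beta = 1; N = 40;
f  = @(x) x.*(L - x);                          % hypothetical initial condition f(x)
xq = linspace(0, L, 4001);
B0 = trapz(xq, f(xq))/L;
B  = zeros(1, N);
for n = 1:N
  B(n) = 2*trapz(xq, f(xq).*cos(n*pi*xq/L))/L;
end
% truncated series solution Phi_N at a single point (x, t)
Phi = @(x, t) B0*exp(-t/alpha) + ...
      sum( B .* cos((1:N)*pi*x/L) .* exp(-(1 + ((1:N)*pi*beta/L).^2)*t/alpha) );
Phi(1.7, 0.5)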
Reference
J. Peterson. Calculus for Cognitive Scientists: Partial Differential Equation Models, Springer Series
on Cognitive Science and Technology (Springer Science+Business Media Singapore Pte Ltd.,
Singapore, 2015 in press)
Part VII
Summing It All Up
Chapter 17
Final Thoughts
How can we train people to think in an interdisciplinary manner using these sorts
of tools? It is our belief we must foster a new mindset within practicing scientists.
We can think of this as the development of a new integrative discipline we could
call How It All Fits Together Science. Of course, this name is not formal enough
to serve as a new disciplinary title, but keeping to its spirit, we could use the name
Integrative Science. Now the name Integrative Biology has occurred here and there
in the literature over the last few decades and it is not clear to us it has the right
tone. So, we will begin calling our new integrative point of view Building Scientific
Bridges or BSB.
We would like the typical scientist to have a deeper appreciation of the use of the
triad of biology, mathematics and computer science (BMC) in this attempt to build
bridges between the many disparate areas of biology, cognitive sciences and the
other sciences. The sharp disciplinary walls that have been built in academia hurt
everyone’s chances at developing an active and questing mind that is able to both
be suspicious of the status quo and also have the tools to challenge it effectively.
Indeed, we have long believed that all research requires a rebellious mind. If you
revere the expert opinion of others too much, you will always be afraid to forge a
new path for yourself. So respect and disrespect are both part of the toolkit of our
budding scientists.
There is currently an attempt to create a new Astrobiology program at the Uni-
versity of Washington which is very relevant to our discussion. Astrobiology is an
excellent example of how science and a little mathematics can give insight into issues
such as the creation of life and its abundance in the universe. Read its primer edited
by Linda Billings and others, (Billings 2006) for an introduction to this field. Careful
arguments using chemistry, planetary science and many other fields inform what is
probable. It is a useful introduction not only to artificial life issues, but also to the
process by which we marshal ideas from disparate fields to form interesting models.
Indeed, if you look at the new textbook, Planets and Life: The Emerging Science
of Astrobiology, (Sullivan and Baross 2007) you will be inspired at the wealth of
knowledge such a field must integrate. This integration is held back by students who
are not trained in both BMC and BSB. We paraphrase some of the important points
made by the graduate students enrolled in this new program about interdisciplinary
training and research. Consider what the graduate students in this new program have
said Sullivan and Baross (2007, p. 548):
…some of the ignorance exposed by astrobiological questions reveals not the boundaries
of scientific knowledge, but instead the boundaries of individual disciplines. Furthermore,
collaboration by itself does not address this ignorance, but instead compounds it by encour-
aging scientists to rely on each other’s authority. Thus, anachronistic disciplinary borders
are reinforced rather than overrun. In contrast astrobiology can motivate challenges to dis-
ciplinary isolation and the appeals to authority that such isolation fosters.
Indeed, studying problems that require points of view from many places is of great
importance to our society and from Sullivan and Baross (2007, p. 548) we hear that
many different disciplines should now be applied to a class of questions perceived as
broadly unified and that such an amalgamation justifies a new discipline (or even meta
discipline) such as astrobiology.
We believe, as the astrobiology students (Sullivan and Baross 2007, p. 550) do that
[BSB enriched by BMC] can change this by challenging the ignorance fostered by disci-
plinary structure while pursuing the creative ignorance underlying genuine inquiry. Because
of its integrative questions, interdisciplinary nature,…, [BSB enriched by BMC] emerges
as an ideal vehicle for scientific education at the graduate, undergraduate and even high
school levels. [It] permits treatment of traditionally disciplinary subjects as well as areas
where those subjects converge (and, sometimes, fall apart!) At the same time, [it] is well
suited to reveal the creative ignorance at scientific frontiers that drives discovery.
17.1 Fostering Interdisciplinary Appreciation
• All mathematical concepts are tied to real biological and scientific need. Hence,
after preliminary calculus concepts have been discussed, there is a careful devel-
opment of the exponential and natural logarithm functions. This is done in such
a way that all properties are derived so there are no mysteries and no references to
“this is how it is and we don’t go through the details because the mathematical
derivation is beyond you.” We insist that learning to think with the language of
mathematics is an important skill. Once the exponential function is understood,
it is immediately tied to the simplest type of real biological model: exponential
growth and decay.
• Mathematics is subordinate to the science in the sense that the texts build the math-
ematical knowledge needed to study interesting nonlinear biological and cognitive
models. We emphasize that to add more interesting science always requires more
difficult mathematics and concomitant intellectual resources.
• Standard mathematical ideas from linear algebra such as eigenvalues and eigen-
vectors can be used to study systems of linear differential equations using both
numerical tools (MatLab based currently) and graphical tools. We always stress
how the mathematics and the biology interact.
• Nonlinear models are important for studying real scientific questions. We begin with
the logistic equation and progress to the Predator–Prey model and a standard SIR disease
model. We emphasize how we must abstract out of biological and scientific com-
plexity the variables necessary to build the model and how we can be wrong. We
show how the original Predator–Prey model, despite being a gross approximation
of the biological reality of fish populations in the sea, gives great explanatory
insight. Then, we equally stress that adding self-interaction to the model leads to
erroneous biological predictions. The SIR disease model is also clearly explained
as a gross simplification of the immunological reality which nevertheless illumi-
nates the data we see.
• Difficult questions can be hard to formulate quantitatively but if we persevere, we
can get great insights. We illustrate this point of view with a model of cancer which
requires 6 variables but which is linear. We show how much knowledge is both used
and thrown away when we make the abstractions necessary to create the model.
At this point, the students also know that models can easily have more variables
in them than we can graph comfortably. Also, numerical answers delivered via
MATLAB alone are not very useful. We need to derive information about the functional
relationships between the parameters of the model and that requires a nice blend
of mathematical theory and biology.
The use of BMC is therefore fostered in this text and its earlier companion by
introducing the students to concepts from the traditional Math-Two-Engineering course (just
a little as needed), Linear Algebra (a junior level mathematics course) and Differential
Equations.
Indeed, there is more that can be said (Sullivan and Baross 2007, p. 552)
What [BSB enriched by BMC] can mean as a science and discipline is yet to be decided,
for it must face the two-fold challenge of cross-disciplinary ignorance that disciplinary
education itself enforces. First, ignorance cannot be skirted by deferral to experts, or by
other implicit invocations of the disciplinary mold that [BSB enriched by BMC] should
instead critique. Second, ignorance must actually be recognized. This is not trivial: how do
you know what you do not know? Is it possible to understand a general principle without
also understanding the assumptions and caveats underlying it? Knowledge superficially
“understood” is self-affirming. For example, the meaning of the molecular tree of life may
appear unproblematic to an astronomer who has learned that the branch lengths represent
evolutionary distance, but will the astronomer even know to consider the hidden assumptions
about rate constancy by which the tree is derived? Similarly, images from the surface of Mars
showing evidence of running water are prevalent in the media, yet how often will a biologist
be exposed to alternative explanations for these geologic forms, or to the significant evidence
to the contrary? [There is a need] for a way to discriminate between science and … uncritically
accepted results of science.
A first attempt at developing a first year curriculum for the graduate program in
astrobiology led to an integrative course in which specialists from various disciplines
germane to the study of astrobiology gave lectures in their own areas of expertise
and then left as another expert took over. This was disheartening to the students in
the program. They said that (Sullivan and Baross 2007, p. 553)
As a group, we realized that we could not speak the language of the many disciplines
in astrobiology and that we lacked the basic information to consider their claims critically.
Instead, this attempt at an integrative approach provided only a superficial introduction to
the major contributions of each discipline to astrobiology. How can critical science be built
on a superficial foundation? Major gaps in our backgrounds still needed to be addressed. In
addition, we realized it was necessary to direct ourselves toward a more specific goal. What
types of scientific information did we most need? What levels of mastery should we aspire
to? At the same time, catalyzed by our regular interactions in the class, we students realized
that we learned the most (and enjoyed ourselves the most) in each other’s interdisciplinary
company. While each of us had major gaps in our basic knowledge, as a group we could
begin to fill many of them.
Given the above, it is clear we must introduce and tie disparate threads of material
together with care. In many things, we favor theoretical approaches that attempt to
put an overarching theory of everything together into a discipline. Since biology and
cognitive science are so complicated, we will always have to do this. You can see
examples of this way of thinking explored in many areas of current research. For
example, it is very hard to understand how organs form in the development process. A
very good book on these development issues is the one by Davies (2005). Davies
talks about the complexity that emerges from local interactions. This is, of course,
difficult to define precisely, yet, it is a useful principle that helps us understand the
formation of a kidney and other large scale organs. Read this book and then read the
one by Schmidt-Rhaesa (2007) and you will see that the second book makes more
sense given the ideas you have mastered from Davies.
Finally, it is good to always keep in mind (Sullivan and Baross 2007, p. 552)
Too often, the scientific community and academia facilely redress ignorance by appealing
to the testimony of experts. This does not resolve ignorance. It fosters it.
This closing comment sums it up nicely. We want to give the students and ourselves
an atmosphere that fosters solutions and our challenge is to find a way to do this.
So how do we become a good user of the three fold way of mathematics, science
and computer tools? We believe it is primarily a question of deep respect for the
balance between these disciplines. The basic idea is that once we abstract from
biology or some other science how certain quantities interact, we begin to phrase
these interactions in terms of mathematics. It is very important to never forget that
once the mathematical choices have been made, the analysis of the mathematics
alone will lead you to conclusions which may or may not be biologically relevant.
You must always be willing to give up a mathematical model or a computer science
model if it does not lead to useful insights into the original science.
We can quote from Randall (2005, pp. 70–71). She works in Particle Physics
which is the experimental arm that gives us data to see if string theory or loop gravity
is indeed a useful model of physical reality. It is not necessary to know physics here
to get the point. Notice what she says:
The term “model” might evoke a small scale battleship or castle you built in your child-
hood. Or you might think of simulations on a computer that are meant to reproduce known
dynamics—how a population grows, for example, or how water moves in the ocean. Model-
ing in particle physics is not the same as either of these definitions. Particle physics models
are guesses at alternate physical theories that might underlie the standard model…Different
assumptions and physical concepts distinguish theories, as do the distance or energy scales
at which a theory’s principles apply. Models are a way at getting at the heart of such dis-
tinguishing features. They let you explore a theory’s potential implications. If you think of
a theory as general instructions for making a cake, a model would be a precise recipe. The
theory would say to add sugar, a model would specify whether to add half a cup or two cups.
Now substitute Biological for Particle Physics and so forth and you can get a feel for
what a model is trying to do. Of course, biological models are much more complicated
than physics ones!
The primary message of this course is thus to teach you to think deeply and
carefully. The willingness to attack hard problems with multiple tools is what we
need in young scientists. We hope this course teaches you a bit more about that.
We believe that we learn all the myriad things we need to build reasonable models
over a lifetime of effort. Each model we design which pulls in material from disparate
areas of learning enhances our ability to develop the kinds of models that give insight.
As Pierrehumbert (2010, p. xi) says about climate modeling
When it comes to understanding the whys and wherefores of climate, there is an infinite
amount one needs to know, but life affords only a finite time in which to learn it…It is a
lifelong process. [We] attempt to provide the student with a sturdy scaffolding upon which
a deeper understanding may be built later.
The climate system [and other biological systems we may study] is made up of building
blocks which in themselves are based on elementary…principles, but which have surprising
and profound collective behavior when allowed to interact on the [large] scale. In this sense,
the “climate game” [the biological modeling game] is rather like the game of Go, where
interesting structure emerges from the interaction of simple rules on a big playing field,
rather than complexity in the rules themselves.
Chapter 18
Background Reading
To learn more about how the combination of mathematics, computational tools and
science has been used with profit, we have some favorite books that have tried to
do this. Reading these books has helped us design biologically inspired algorithms
and models and has inspired how we wrote this text! A selection of these books
would include treatment of regulatory systems in genomics such as Davidson (2001,
2006). An interesting book on how the architecture of genomes may have developed
which gives lots of food for thought for model abstraction is found in Lynch (2007).
Since the study of how to build cognitive systems is our big goal, the encyclopedic
treatment of how nervous systems evolved given in J. Kaas’s four volumes (Kaas and
Bullock 2007a, b; Kaas and Krubitzer 2007; Kaas and Preuss 2007) is very useful.
To build a cognitive system that would be deployable as an autonomous device in a
robotic system requires us to think carefully about how to build a computer system
using computer architectural choices and hardware that has some plasticity. Clues as
to how to do this can be found by looking at how simple nervous systems came to
be. After all, the complexity of the system we can build will of necessity be far less
than that of a real neural system that controls a body; hence, any advice we can get
from existing biological systems is welcome! With that said, as humans we process
a lot of environmental inputs and the thalamus plays a vital role in this. If you read
Murray Sherman’s discussion of the thalamus in Sherman and Guillery (2006), you
will see right away how you could begin to model some of the discussed modules in
the thalamus which will be useful in later work in the next two volumes.
Let’s step back now and talk about our ultimate goal which is to study cognitive
systems. You now have two self study texts under your belt (ahem, Peterson 2015a, b),
and you have been exposed to a fair amount of coding in MATLAB. You have seen that a
programming language is just another tool we can use to gain insight into answering
very difficult questions. The purpose of this series of four books (the next ones are
Peterson 2015c, d) is to prime you to always think about attacking complicated and
difficult questions using a diverse toolkit. You have been trained enough now that
you can look at all that you read and have a part of you begin the model building
process for the area you are learning about in your reading.
We believe firmly that the more you are exposed to ideas from multiple disciplines,
the better you will be able to find new ways to solve our most challenging problems.
Using a multidisciplinary toolkit does not always go over big with your peers, so
you must not let that hold you back. A great glimpse into how hard it is to get ideas
published (even though they are great) is to look at how this process went for others.
We studied Hamilton’s models of altruism in the previous book, so it might surprise
you to know he had a hard time getting them noticed. His biography (Segerstrale
2013) is fascinating reading as is the collection of stories about biologists who thought
differently (Harman and Dietrich 2013). Also, although everyone knows who Francis
Crick was, reading his biography (Olby 2009) gives a lot of insight into the process
of thinking creatively outside of the established paradigms. Crick believed, as we do,
that theoretical models are essential to developing a true understanding of a problem.
The process of piecing together a story or model of a biological process indirectly
from many disparate types of data and experiments is very hard. You can learn much
about this type of work by reading carefully the stories of people who are doing
this. Figuring out what dinosaurs were actually like even though they are not alive
now requires that we use a lot of indirect information. The book on trace fossils (i.e.
footprints and coprolites—fossilized poop, etc.) Martin (2014) shows you how much
can be learned from those sources of data. In another reconstruction process, we know
enough now about how hominids construct their bodies; i.e. how thick the fat is on top
of muscle at various places on the face and so forth, to make a good educated guess at
building a good model of extinct hominids such as Australopithecus. This process
is detailed in Gurchie (2013). Gurchie is an artist and it is fascinating to read how he
uses the knowledge scientists can give him and his own creative insight to create the
stunning models you can see at the Smithsonian now in the Hall of Humanity. We
also know much more about how to analyze DNA samples found in very old fossils,
and the science behind how we can build a reasonable genome for a Neanderthal, as
detailed in Pääbo (2014), is now well understood. Read Martin’s, Gurchie’s and Pääbo’s
discussions and apply what you have been learning about mathematics, computation
and science to what they are saying. You should be seeing it all come together much
better now.
Since our ultimate aim is to build cognitive models with our tools, it is time for
you to read some about the brains that have been studied over the years. Kaas’s books
mentioned above are nice, but almost too intense. Study some of these others and apply
the new intellectual lens we have been working to develop.
• Stiles (2008) has written a great book on brain development which can be read in
conjunction with another evolutionary approach by Streidter (2005). Read them
both and ask yourself how you could design and implement even a minimal model
that can handle such modular systems interacting in complex ways. Learning the
right language to do that is the province of what we do in Peterson (2015c, d),
so think of this reading as whetting your appetite! Another text, lower level and a bit
more layman in tone, is the nice one by Allman (2000) which is good to read as
an overview of what the others are saying. The last one you should tackle is Allen’s
(2009) book about the evolution of the brain. Compare and contrast what all these
books are saying: learn to think critically!
• On a slightly different note, Taylor (2012) tells you about the various indirect ways
we use to measure what is going on in a functioning neural system. At the bottom
of all these tools is a lot of mathematics, computation and science which you are
much better prepared to read about now.
• It is very hard to understand how to model altruism as we have found out in
this book. It is also very hard to model sexual differences quantitatively. As you
are reading about the brain in the books we have been mentioning, you can dip
into Jordan-Young (2010) to see just how complicated the science behind sexual
difference is and how difficult it is to get good data. Further, there are many
mysteries concerning various aspects of human nature which are detailed in Barash
(2012). Ask yourself, how would you build models of such things? Another really
good read here on sex is by Ridley (1993) which is on sex and how human nature
evolved. It is a bit old now, but still very interesting.
• Once you have read a bit about the brain in general and from an evolutionary
viewpoint, you are ready to learn more about how our behavior is modulated by
neurotransmitters and drugs. S. Snyder has a nice layman discussion of those ideas
in Snyder (1996).
To understand cognition is in some ways to try to understand how biological
processes evolved to handle information flow. Hence, reading about evolution in
general—going beyond what we touched on in our discussions of altruism—is a
good thing. An important part of our developmental chain is controlled by stem cells
and a nice relatively non technical introduction to those ideas is in Fox (2007) which
is very interesting also because it includes a lot of discussion on how stem cell ma-
nipulation is a technology we have to come to grips with. A general treatment of
human development can be found in Davies (2014) which you should read as well.
Also, the ways in which organ development is orchestrated by regulatory genes helps
you again see the larger picture in which many modules of computation interact. You
should check out A. Schmidt-Rhaesa’s book on how organ systems evolved as well as
J. Davies’s treatment of how an organism determines its shape in Davies (2005).
The intricate ways that genes work together are hard to model and we must
make many abstractions and approximations to make progress. G. Wagner has done
a terrific job of explaining some of the key ideas in evolutionary innovation in
Wagner (2014) and J. Archibald shows how symbiosis is a key element in evolu-
tion in Archibald (2014).
Processing books like these will help hone your skills and acquaint you with lots
of material outside of your usual domain. This is a good thing! So keep reading and
we hope you join us in the next volumes!
All of these readings have helped us to see the big picture!
References
J. Allen, The Lives of the Brain: Human Evolution and the Organ of Mind (The Belknap Press of
Harvard University Press, Cambridge, 2009)
J. Allman, Evolving Brains (Scientific American Library, New York, 2000)
J. Archibald, One Plus One Equals One (Oxford University Press, Oxford, 2014)
D. Barash, Homo Mysterious: Evolutionary Puzzles of Human Nature (Oxford University Press,
Oxford, 2012)
E. Davidson, Genomic Regulatory Systems: Development and Evolution (Academic Press, San
Diego, 2001)
E. Davidson, The Regulatory Genome: Gene Regulatory Networks in Development and Evolution
(Academic Press Elsevier, Burlington, 2006)
J. Davies, Mechanisms of Morphogenesis (Academic Press Elsevier, Boston, 2005)
J. Davies, Life Unfolding: How the Human Body Creates Itself (Oxford University Press, Oxford,
2014)
C. Fox, Cell of Cells: The Global Race To Capture and Control the Stem Cell (W. H. Norton and
Company, New York, 2007)
J. Gurchie, Shaping Humanity: How Science, Art, and Imagination Help Us Understand Our Origins
(Yale University Press, New Haven, 2013)
O. Harman, M. Dietrich, Outsider Scientists: Routes to Innovation in Biology (University of Chicago
Press, Chicago, 2013)
R. Jordan-Young, Brainstorm: The Flaws in the Science of Sex Differences (Harvard University
Press, Cambridge, 2010)
J. Kaas, T. Bullock, (eds). Evolution of Nervous Systems: A Comprehensive Reference Editor J.
Kaas (Volume 1: Theories, Development, Invertebrates). (Academic Press Elsevier, Amsterdam,
2007a)
J. Kaas, T. Bullock, (eds). Evolution of Nervous Systems: A Comprehensive Reference Editor J.
Kaas (Volume 2: Non-Mammalian Vertebrates). (Academic Press Elsevier, Amsterdam, 2007b)
J. Kaas, L. Krubitzer, (eds). Evolution of Nervous Systems: A Comprehensive Reference Editor J.
Kaas (Volume 3: Mammals). (Academic Press Elsevier, Amsterdam, 2007)
J. Kaas, T. Preuss, (eds). Evolution of Nervous Systems: A Comprehensive Reference Editor J. Kaas
(Volume 4: Primates). (Academic Press Elsevier, Amsterdam, 2007)
M. Lynch, The Origins of Genome Architecture (Sinauer Associates, Inc., Sunderland, 2007)
A. Martin, Dinosaurs Without Bones: Dinosaur Lives Revealed By Their Trace Fossils (Pegasus
Books, New York, 2014)
S. Murray Sherman, R. Guillery, Exploring The Thalamus and Its Role in Cortical Function (The
MIT Press, Cambridge, 2006)
R. Olby, Francis Crick: Hunter of Life’s Secrets (Cold Spring Harbor Laboratory Press, New York,
2009)
S. Pääbo, Neanderthal Man: In Search of Lost Genomes (Basic Books, New York, 2014)
J. Peterson, Calculus for Cognitive Scientists: Derivatives, Integration and Modeling, Springer
Series on Cognitive Science and Technology (Springer Science+Business Media Singapore Pte
Ltd, Singapore, 2015a in press)
J. Peterson, Calculus for Cognitive Scientists: Higher Order Models and Their Analysis, Springer
Series on Cognitive Science and Technology (Springer Science+Business Media Singapore Pte
Ltd., Singapore, 2015b in press)
J. Peterson, Calculus for Cognitive Scientists: Partial Differential Equation Models, Springer Series
on Cognitive Science and Technology (Springer Science+Business Media Singapore Pte Ltd.,
Singapore, 2015c in press)
J. Peterson, BioInformation Processing: A Primer On Computational Cognitive Science, Springer
Series on Cognitive Science and Technology (Springer Science+Business Media Singapore Pte
Ltd., Singapore, 2015d in press)
M. Ridley, The Red Queen: Sex and the Evolution of Human Nature (Harper Perennial, New York,
1993)
U. Segerstrale, Nature’s Oracle: The Life and Work of W.D. Hamilton (Oxford University Press,
Oxford, 2013)
S. Snyder, Drugs and the Brain (Scientific American Library, New York, 1996)
J. Stiles, The Fundamentals of Brain Development (Harvard University Press, Cambridge, 2008)
G. Streidter, Principles of Brain Evolution (Sinauer Associates, Inc., Sunderland, 2005)
K. Taylor, The Brain Supremacy: Notes from the frontiers of neuroscience (Oxford University Press,
Oxford, 2012)
G. Wagner, Homology, Genes and Evolutionary Innovation (Princeton University Press, Oxford,
2014)
Glossary
what is called selective advantage. This means that the size of the population
does matter. Using these assumptions, we therefore model û 2 and û 3 as
û 2 = N u 2
and
û 3 = N u 3 .
where u_2 and u_3 are neutral rates. The mathematical model is then set up as follows. Let
X_0(t) be the probability a cell is in cell type A+/+ at time t,
X_1(t) be the probability a cell is in cell type A+/− at time t,
X_2(t) be the probability a cell is in cell type A−/− at time t,
Y_0(t) be the probability a cell is in cell type A+/+ CIN at time t,
Y_1(t) be the probability a cell is in cell type A+/− CIN at time t, and
Y_2(t) be the probability a cell is in cell type A−/− CIN at time t.
We can then derive rate equations to be
X_0' = −(u_1 + u_c) X_0
X_1' = u_1 X_0 − (u_c + N u_2) X_1
X_2' = N u_2 X_1 − u_c X_2
Y_0' = u_c X_0 − u_1 Y_0
Y_1' = u_c X_1 + u_1 Y_0 − N u_3 Y_1
Y_2' = N u_3 Y_1 + u_c X_2
We are interested in analyzing this model over a typical human life span of 100
years, p. 406.
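Because this system is linear in the six probabilities, it is straightforward to integrate numerically. Here is a minimal MATLAB sketch; the rate constants and the effective population size N below are illustrative placeholders rather than the values used in the text's discussion of this model.

% Linear CIN model with state z = [X0 X1 X2 Y0 Y1 Y2]'
u1 = 1e-4; uc = 1e-4; u2 = 1e-4; u3 = 1e-2; N = 1000;   % illustrative yearly rates
A = [ -(u1+uc)        0           0     0       0    0;
        u1      -(uc+N*u2)        0     0       0    0;
        0            N*u2       -uc     0       0    0;
        uc            0           0   -u1       0    0;
        0            uc           0    u1   -N*u3    0;
        0             0          uc     0    N*u3    0 ];
rhs = @(t, z) A*z;
z0 = [1 0 0 0 0 0]';                    % start as a normal A+/+ cell
[t, z] = ode45(rhs, [0 100], z0);       % 100 year horizon
plot(t, z(:,3), t, z(:,6)), legend('X_2', 'Y_2'), xlabel('years')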
Characteristic Equation of a second order ODE For the linear second order differential
equation a u''(t) + b u'(t) + c u(t) = 0, assume that e^{rt} is a solution and try to find
what values of r might work. We see for u(t) = e^{rt}, we find
0 = a r² + b r + c.
The roots of the quadratic equation above are the only values of r that will work
as the solution e^{rt}. We call this quadratic equation the characteristic equation for
this differential equation, p. 152.
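As a quick illustration, MATLAB's roots command returns the two values of r directly; the coefficients below are illustrative.

% Characteristic equation a*r^2 + b*r + c = 0 for a*u'' + b*u' + c*u = 0
a = 1; b = 4; c = 13;          % illustrative coefficients
r = roots([a b c])             % gives -2 + 3i and -2 - 3i here
% so u(t) = exp(-2t)*(c1*cos(3t) + c2*sin(3t)) solves u'' + 4u' + 13u = 0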
Characteristic Equation of the linear system For a linear system
\[ \begin{pmatrix} x'(t) \\ y'(t) \end{pmatrix} = A \begin{pmatrix} x(t) \\ y(t) \end{pmatrix}, \qquad \begin{pmatrix} x(0) \\ y(0) \end{pmatrix} = \begin{pmatrix} x_0 \\ y_0 \end{pmatrix}, \]
assume the solution is V e^{rt} for some nonzero vector V. This implies that r and
V must satisfy
\[ \bigl(r I - A\bigr) V e^{rt} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}. \]
The only way we can get nonzero vectors V as solutions is to choose the values
of r so that
\[ \det\bigl(r I - A\bigr) = 0. \]
The second order polynomial we obtain is called the characteristic equation as-
sociated with this linear system. Its roots are called the eigenvalues of the system
and any nonzero vector V associated with an eigenvalue r is an eigenvector for
r, p. 178.
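A short MATLAB check of this recipe: build A, form the characteristic polynomial det(rI − A), and compare its roots with the output of eig. The matrix below is an illustrative example, not one taken from the text.

% Eigenvalues and eigenvectors of a 2x2 linear system x' = A x
A = [ 1  2;
     -3 -4 ];                    % illustrative coefficient matrix
p = poly(A);                     % coefficients of det(r*I - A)
r = roots(p)                     % eigenvalues from the characteristic equation: -1 and -2
[V, D] = eig(A);                 % columns of V are eigenvectors, diag(D) the eigenvalues
% each pair gives a solution V(:,k)*exp(D(k,k)*t) of the linear system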
Complex number This is a number which has the form a + b i where a and
b are arbitrary real numbers and the letter i represents a very abstract concept: a
number whose square is i² = −1! We usually draw a complex number in a standard
x–y Cartesian plane with the y axis labeled as i y instead of the usual y. Then
the number 5 + 4 i would be graphed just like the two dimensional coordinate
(5, 4), p. 141.
Continuity A function f is continuous at a point p if for all positive tolerances
ε, there is a positive δ so that |f(t) − f(p)| < ε if t is in the domain of f and
|t − p| < δ. You should note continuity is something that is only defined at a
point and so functions in general can have very few points of continuity. Another
way of defining the continuity of f at the point p is to say the limit lim_{t→p} f(t) exists
and equals f(p), p. 129.
Differentiability A function f is differentiable at a point p if there is a number L
so that for all positive tolerances ε, there is a positive δ so that
\[ \left| \frac{f(t) - f(p)}{t - p} - L \right| < \epsilon \quad \text{if } t \text{ is in the domain of } f \text{ and } |t - p| < \delta. \]
You should note differentiability is something that is only defined at a point and
so functions in general can have very few points of differentiability. Another way
of defining the differentiability of f at the point p is to say the limit lim_{t→p} (f(t) − f(p))/(t − p)
exists. At each point p where this limit exists, we can define a new function called
the derivative of f at p. This is usually denoted by f'(p) or df/dt(p), p. 129.
Exponential growth Some biological systems can be modeled using the idea of
exponential growth. This means the rate of change of the variable of interest, x, is
proportional to its current value. Mathematically, this means x' = r x for some
proportionality constant r, p. 149.
Half life The amount of time it takes a substance x to lose half its original value
under exponential decay. It is denoted by t_{1/2} and can also be expressed as t_{1/2} =
ln(2)/r where r is the decay rate in the differential equation x'(t) = −r x(t),
p. 149.
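A tiny MATLAB check of this formula with an illustrative decay rate:

% Exponential decay x'(t) = -r*x(t), so x(t) = x0*exp(-r*t)
r = 0.3; x0 = 10;              % illustrative decay rate and initial amount
thalf = log(2)/r;              % half life t_{1/2} = ln(2)/r
x = @(t) x0*exp(-r*t);
x(thalf)                       % returns 5, half the original value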
Infectious Disease Model Assume the total population we are studying is fixed
at N individuals. This population is then divided into three separate pieces: we
have individuals
• that are susceptible to becoming infected are called Susceptible and are labeled
by the variable S. Hence, S(t) is the number that are capable of becoming
infected at time t.
• that can infect others. They are called Infectious and the number that are in-
fectious at time t is given by I (t).
• that have been removed from the general population. These are called Removed
and their number at time t is labeled by R(t).
We make a number of key assumptions about how these population pools interact.
• Individuals stop being infectious at a positive rate γ which is proportional to the
number of individuals that are in the infectious pool. If an individual stops being
infectious, this means this individual has been removed from the population.
This could mean they have died, the infection has progressed to the point where
they can no longer pass the infection on to others or they have been put into
quarantine in a hospital so that further interactions with the general population
are not possible. In all of these cases, these individuals are not infectious or can’t
cause infections and so they have been removed from the part of the population N
which can be infected or is susceptible. Mathematically, this means we assume
I' = −γ I.
• Susceptible individuals become infected through contact with infectious individuals
at a rate proportional to the product of the two pools; with proportionality constant r,
this produces new infections at the rate
r S I.
We can then figure out the net rates of change of the three populations. The
infectious population gains at the rate r S I and loses at the rate γ I. Hence
I' = r S I − γ I.
The net change of the Susceptibles is that of simple decay. Susceptibles are lost at
the rate r S I. Thus, we have
S' = −r S I.
Finally, the removed population increases at the same rate the infectious population
decreases. We have
R' = γ I.
We also know that R(t) + S(t) + I(t) = N for all time t because our population
is constant. So only two of the three variables here are independent. We typically
focus on the variables I and S for that reason. Our complete Infectious Disease
Model is then
I' = r S I − γ I
S' = −r S I
I(0) = I_0
S(0) = S_0.
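A minimal MATLAB sketch of this model using ode45; the contact rate r, the removal rate gamma, and the initial pools are illustrative numbers only.

% SIR model: I' = r*S*I - gamma*I, S' = -r*S*I, with R = N - S - I
r = 0.0005; gamma = 0.1;                      % illustrative rates
N = 1000; I0 = 1; S0 = N - I0;                % illustrative population split
rhs = @(t, z) [ r*z(2)*z(1) - gamma*z(1);     % z(1) = I
               -r*z(2)*z(1) ];                % z(2) = S
[t, z] = ode45(rhs, [0 100], [I0; S0]);
R = N - z(:,1) - z(:,2);                      % removed pool from the conservation law
plot(t, z(:,1), t, z(:,2), t, R), legend('I', 'S', 'R'), xlabel('t')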
of insulin plus the others and we let H denote the net hormone concentration. At
normal conditions, call this concentration H0 . Under close to normal conditions,
the interaction of the one hormone insulin with blood glucose completely dom-
inates the net hormonal activity; so normal blood sugar levels primarily depend
on insulin–glucose interactions. Hence, if insulin increases from normal levels, it
increases net hormonal concentration to H0 + ΔH and decreases glucose blood
concentration. On the other hand, if other hormones such as cortisol increased
from base levels, this will make blood glucose levels go up. Since insulin domi-
nates all activity at normal conditions, we can think of this increase in cortisol as
a decrease in insulin with a resulting rise in blood glucose levels. A decrease in
insulin from normal levels corresponds to a drop in net hormone concentration to
H0 − ΔH. Now let G denote blood glucose level. Hence, in our model an increase
in H means a drop in G and a decrease in H means an increase in G! Note our
lumping of all the hormone activity into a single net activity is very much like how
we modeled food fish and predator fish in the predator–prey model. We describe
the model as
G'(t) = F1(G, H) + J(t)
H'(t) = F2(G, H)
where the function J is the external rate at which blood glucose concentration is
being increased in a glucose tolerance test. There are two nonlinear interaction
functions F1 and F2 because we know G and H have complicated interactions.
Let’s assume G and H have achieved optimal values G0 and H0 by the time the
fasting patient has arrived at the hospital. Hence, we don’t expect to have any
contribution to G'(0) and H'(0); i.e. F1(G0, H0) = 0 and F2(G0, H0) = 0. We
are interested in the deviation of G and H from their optimal values G0 and H0,
so let g = G − G0 and h = H − H0. We can then write G = G0 + g and
H = H0 + h. The model can then be rewritten as
g'(t) = F1(G0 + g, H0 + h) + J(t)
h'(t) = F2(G0 + g, H0 + h)
and we can then approximate these dynamics using tangent plane approximations
to F1 and F2 giving
\[ g'(t) \approx \frac{\partial F_1}{\partial g}(G_0, H_0)\, g + \frac{\partial F_1}{\partial h}(G_0, H_0)\, h + J(t) \]
\[ h'(t) \approx \frac{\partial F_2}{\partial g}(G_0, H_0)\, g + \frac{\partial F_2}{\partial h}(G_0, H_0)\, h \]
It is this linearized system of equations we can analyze to give some insight into
how to interpret the results of a glucose tolerance test, p. 475.
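Once numerical values for the four partial derivatives are in hand, the linearized model is just a two dimensional linear system whose eigenvalues we can inspect. In the MATLAB sketch below the partial derivative values are made up for illustration; they stand in for values a glucose tolerance test would let us estimate.

% Linearized glucose-hormone model: [g'; h'] is approximately J*[g; h] plus the input
a = -2; b = -4;                 % illustrative values of dF1/dg and dF1/dh at (G0, H0)
c =  1; d = -1;                 % illustrative values of dF2/dg and dF2/dh at (G0, H0)
J = [a b; c d];
lambda = eig(J)                 % complex eigenvalues here, so g and h oscillate and decay
omega0 = sqrt(det(J));          % undamped natural frequency, omega0^2 = det(J)
% the period 2*pi/omega0 of the glucose deviation g(t) is often used as a
% diagnostic quantity when interpreting a glucose tolerance test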
Linear Second Order Differential Equations These have the general form
a x''(t) + b x'(t) + c x(t) = 0
for constants a, b and c. Such an equation can be converted into a matrix–vector
system; we typically call the coefficient matrix A so that the system is, p. 171,
\[ \begin{pmatrix} x'(t) \\ y'(t) \end{pmatrix} = A \begin{pmatrix} x(t) \\ y(t) \end{pmatrix}, \qquad \begin{pmatrix} x(0) \\ y(0) \end{pmatrix} = \begin{pmatrix} x_0 \\ y_0 \end{pmatrix}. \]
Predator–Prey Model We make assumptions about how the food population x and
the predator population y change.
1. The food population grows exponentially. Letting x_g denote the growth rate of
the food, we have
x_g' = a x.
2. The food population declines due to interactions with the predators; letting x_d
denote this decay rate, we have
x_d' = −b x y.
Combining, the net rate of change of the food is
x' = a x − b x y
for some positive constants a and b. We then make assumptions about the predators
as well.
1. Predators naturally die following an exponential decay; letting this decay rate
be given by y_d, we have
y_d' = −c y.
2. Predators grow due to interactions with the food; letting this growth rate be y_g,
we have
y_g' = d x y.
Combining, the net rate of change of the predators is
y' = −c y + d x y
for some positive constants c and d. The full model is thus, p. 292.
x' = a x − b x y
y' = −c y + d x y
x(0) = x_0
y(0) = y_0.
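A minimal MATLAB sketch of this model; the constants a, b, c, d and the starting populations are illustrative values, not ones from the worked examples in the text.

% Predator-Prey model: x' = a*x - b*x*y (food), y' = -c*y + d*x*y (predators)
a = 1.1; b = 0.4; c = 0.4; d = 0.1;        % illustrative positive constants
rhs = @(t, z) [ a*z(1) - b*z(1)*z(2);
               -c*z(2) + d*z(1)*z(2) ];
[t, z] = ode45(rhs, [0 50], [10; 5]);      % x(0) = 10, y(0) = 5
plot(t, z(:,1), t, z(:,2)), legend('food x', 'predators y'), xlabel('t')
% the populations cycle: a peak in the food is followed by a peak in the predators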
Predator–Prey Self Interaction The original Predator–Prey model does not in-
clude self-interaction terms. These are terms that model how the food and predator
populations interact with themselves. We can model these effects by assuming
their magnitude is proportional to the interaction. Mathematically, we assume
these are both decay terms giving us
x_self' = −e x²
y_self' = −f y²
for positive constants e and f. We are thus led to the new self-interaction model
given below, p. 355:
x' = a x − b x y − e x²
y' = −c y + d x y − f y².
• proteins are destroyed by other proteins in the cell. Call this rate of destruction
α_des.
• the concentration of protein in the cell goes down because the cell grows and
therefore its volume increases. Protein is usually measured as a concentration
and the concentration goes down as the volume goes up. Call this rate α_dil; the
dil is for dilution.
The net or total loss of protein is called α and hence
α = α_des + α_dil.
The net rate of change of the protein concentration is then our familiar model
dY*/dt = β − α Y*
where β is the constant growth term and α Y* is the loss term.
We usually do not make a distinction between the gene Y and its transcribed
protein Y*. We usually treat the letters Y and Y* as the same even though it is not
completely correct. Hence, we just write as our model
Y' = β − α Y
Y(0) = Y_0
and then solve it using the integrating factor method even though, strictly speaking,
Y is the gene!, p. 149.
Response time For the model
y'(t) = −α y(t) + β
y(0) = y_0
the time it takes the solution to go from its initial concentration y_0 to a value
halfway between the initial amount and the steady state value is called the re-
sponse time. It is denoted by t_r and t_r = ln(2)/α so it is functionally the same as
the half life in an exponential decay model, p. 149.
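We can confirm the response time claim directly from the exact solution; the rates below are illustrative.

% y' = -alpha*y + beta, y(0) = y0; the steady state is beta/alpha
alpha = 2; beta = 6; y0 = 0;                              % illustrative values
y  = @(t) beta/alpha + (y0 - beta/alpha)*exp(-alpha*t);   % exact solution
tr = log(2)/alpha;                                        % claimed response time
y(tr)                          % returns 1.5, halfway between y0 = 0 and beta/alpha = 3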
The collection of points from the interval [a, b] is called a Partition of [a, b] and is
denoted by some letter—here we will use the letter P. So if we say P is a partition
of [a, b], we know it will have n + 1 points in it, they will be labeled from t0 to
tn and they will be ordered left to right with strict inequalities. But, we will not
know what value the positive integer n actually is. The simplest Partition P is the
two point partition {a, b}. Note these things also: given a partition P = {t_0, t_1, . . . , t_n}
and an evaluation set E = {s_0, s_1, . . . , s_{n−1}} with t_i ≤ s_i ≤ t_{i+1}, we can form the
Riemann sum
\[ S(f, P, E) = \sum_{i=0}^{n-1} f(s_i)\,(t_{i+1} - t_i) \]
and we just remember that the choice of P will determine the size of n. Each
partition P has a maximum subinterval length—let’s use the symbol || P || to
denote this length. We read the symbol || P || as the norm of P. Each par-
tition P and evaluation set E determines the number S( f, P, E) by a simple
calculation. So if we took a collection of partitions P1 , P2 and so on with associ-
ated evaluation sets E 1 , E 2 etc., we would construct a sequence of real numbers
{S( f, P1 , E 1 ), S( f, P2 , E 2 ), . . . , S( f, Pn , E n ), . . .}. Let’s assume the norm of the
partition P_n gets smaller all the time; i.e. lim_{n→∞} ||P_n|| = 0. We could then
ask if this sequence of numbers converges to something. What if the sequence
of Riemann sums we construct above converged to the same number I no matter
what sequence of partitions whose norm goes to zero and associated evaluation
sets we chose? Then, we would have that the value of this limit is independent of
the choices above. This is what we mean by the Riemann Integral of f on the
interval [a, b]. If there is a number I so that
\[ \lim_{n \to \infty} S(f, P_n, E_n) = I \]
for every such choice of partitions and evaluation sets, then I is the Riemann Integral
of f on [a, b], which we denote by ∫_a^b f(t) dt.
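A short MATLAB sketch of this limiting process: build uniform partitions with more and more points, use the midpoints as the evaluation sets, and watch the Riemann sums settle down. The integrand is an illustrative choice.

% Riemann sums S(f,P,E) for f on [a,b] with uniform partitions and midpoint evaluation sets
f = @(t) t.^2;  a = 0;  b = 2;           % illustrative integrand; the exact integral is 8/3
for n = [10 100 1000 10000]
  t = linspace(a, b, n+1);               % partition P with n subintervals
  s = (t(1:end-1) + t(2:end))/2;         % evaluation set E: the midpoints
  S = sum(f(s) .* diff(t));              % the Riemann sum
  fprintf('n = %6d   S = %.6f\n', n, S);
end
% the sums approach 2.666667 as ||P|| goes to zero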
Separation of Variables Consider the cable equation model
\[ \beta^2 \frac{\partial^2 \Phi}{\partial x^2} - \Phi - \alpha \frac{\partial \Phi}{\partial t} = 0, \quad 0 \le x \le L, \; t \ge 0, \]
\[ \frac{\partial \Phi}{\partial x}(0, t) = 0, \qquad \frac{\partial \Phi}{\partial x}(L, t) = 0, \qquad \Phi(x, 0) = f(x). \]
The domain is the usual half infinite [0, L] × [0, ∞) where the spatial part of
the domain corresponds to the length of the dendritic cable in an excitable nerve
cell. We won’t worry too much about the details of where this model comes from
as we will discuss that in another volume. The boundary conditions u x (0, t) =
0 and u x (L , t) = 0 are called Neumann Boundary conditions. The conditions
u(0, t) = 0 and u(L , t) = 0 are known as Dirichlet Boundary conditions. One
way to find the solution is to assume we can separate the variables so that we can
write (x, t) = u(x)w(t). We assume a solution of the form (x, t) = u(x)w(t)
and compute the needed partials. This leads to a the new equation
d 2u dw
β2 w(t) − u(x)w(t) − αu(x) = 0.
dx2 dt
Rewriting, we find for all x and t, we must have
2
2d u dw
w(t) β − u(x) = αu(x) .
dx2 dt
This tells us
2
β 2 dd xu2 − u(x) α dw
= dt , 0 ≤ x ≤ L , t > 0.
u(x) w(t)
The only way this can be true is if both the left and right hand sides are equal
to a constant, usually called the separation constant Λ. This leads to the
decoupled equations
\[ \alpha \frac{dw}{dt} = \Lambda\, w(t), \quad t > 0, \]
\[ \beta^2 \frac{d^2 u}{dx^2} = (1 + \Lambda)\, u(x), \quad 0 \le x \le L, \]
\[ \frac{du}{dx}(0) = 0, \qquad \frac{du}{dx}(L) = 0. \]
This gives us a second order ODE to solve in x and a first order ODE to solve in
t. We have a lot of discussion about this in the text which you should study. In
general, we find there is an infinite family of solutions that solve these coupled
ODE models which we can label u_n(x) and w_n(t). Thus, any finite combination
\[ \Phi_N(x, t) = \sum_{n=0}^{N} a_n u_n(x) w_n(t) \]
will solve these ODE models, but we are still left with satisfying the last condition
that Φ(x, 0) = f(x). We do this by finding a series solution. We can show that the
data function f can be written as a series f(x) = \sum_{n=0}^{\infty} b_n u_n(x) for a set of
constants {b_0, b_1, . . .} and we can also show that the series
Φ(x, t) = \sum_{n=0}^{\infty} a_n u_n(x) w_n(t) solves the last boundary condition
\[ \Phi(x, 0) = \sum_{n=0}^{\infty} a_n u_n(x) w_n(0) = f(x) \]
as long as we choose a_n = b_n for all n. The idea of a series and the mathematical
machinery associated with that takes a while to explain, so Chap. 16 is devoted to
that, p. 495.
Tangent plane error We can characterize the error made when a function of
two variables is replaced by its tangent plane at a point better if we have ac-
cess to the second order partial derivatives of f. The value of f at the point
(x_0 + Δx, y_0 + Δy) can be expressed as follows:
\[ f(x_0 + \Delta x, y_0 + \Delta y) = f(x_0, y_0) + \frac{\partial f}{\partial x}(x_0, y_0)\,\Delta x + \frac{\partial f}{\partial y}(x_0, y_0)\,\Delta y + E(x_0, y_0, \Delta x, \Delta y), \]
where there is a c between 0 and 1 so that the tangent plane error is given by, p. 435,
\[ E(x_0, y_0, \Delta x, \Delta y) = \frac{1}{2} \begin{pmatrix} \Delta x \\ \Delta y \end{pmatrix}^{T} H(x_0 + c\,\Delta x,\; y_0 + c\,\Delta y) \begin{pmatrix} \Delta x \\ \Delta y \end{pmatrix}, \]
where H denotes the Hessian matrix of second order partial derivatives of f.
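Here is a small MATLAB check of this error characterization for an illustrative function of two variables: compare the true value of f near (x0, y0) with the tangent plane value and note that the error shrinks like the square of the step size, as the Hessian term predicts.

% Tangent plane approximation and its error for the illustrative f(x,y) = exp(x)*sin(y)
f  = @(x,y) exp(x).*sin(y);
fx = @(x,y) exp(x).*sin(y);   fy = @(x,y) exp(x).*cos(y);   % first order partials
x0 = 0.5; y0 = 1.0;
for h = [0.1 0.05 0.025]
  dx = h; dy = h;
  plane = f(x0,y0) + fx(x0,y0)*dx + fy(x0,y0)*dy;           % tangent plane value
  E = f(x0+dx, y0+dy) - plane;                              % tangent plane error
  fprintf('h = %.3f   error = %.2e   error/h^2 = %.3f\n', h, E, E/h^2);
end
% error/h^2 stays roughly constant, consistent with the second order error formula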