
Mathematical Methods for the Physical Sciences

Two Semester Course

Shawn D. Ryan, Ph.D.


Copyright
© 2015-2016 Dr. Shawn D. Ryan

PUBLISHED ONLINE

Special thanks to Mathias Legrand ([email protected]) with modifications by Vel
([email protected]) for creating and modifying the template used for these lecture notes. Also,
thank you to the students in the Mathematical Methods I-II class at Kent State University in the
2015-2016 academic year including K. Bittinger, R. Dovishaw, T. Dubensky, M. Grose, K. Khanal,
J. Krusinski, E. McMasters, J. Paltani, J. Sobieski, J. Taylor, C. Zickel, B. Zimmerman.
Licensed under the Creative Commons Attribution-NonCommercial 3.0 Unported License (the
“License”). You may not use this file except in compliance with the License. You may obtain a
copy of the License at http://creativecommons.org/licenses/by-nc/3.0. Unless required
by applicable law or agreed to in writing, software distributed under the License is distributed on an
“AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and limitations under the License.

First published online, February 2016


Contents

I Part One: Complex Numbers

1 Fundamentals of Complex Numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . 11


1.1 Introduction 11
1.2 Real and Imaginary Parts of a Complex Number 14
1.3 The Complex Plane 15
1.3.1 Review of Unit Circle in Radians . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
1.3.2 Going Deeper: Understanding Euler’s Identity . . . . . . . . . . . . . . . . . . . . . . . . . 19
1.4 Terminology and Notation 20
1.4.1 Complex Conjugation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
1.5 Complex Algebra 24
1.5.1 Simplifying to Standard Form x + iy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
1.5.2 Complex Conjugation of an Expression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
1.5.3 Finding the Absolute Value |z| . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
1.5.4 Complex Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
1.5.5 Graphs of Complex Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
1.5.6 Physical Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
1.6 Complex Infinite Series 33
1.6.1 Review from Calculus: Tests for Convergence . . . . . . . . . . . . . . . . . . . . . . . . . . 34
1.6.2 Examples with Complex Series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
1.7 Complex Power Series and Disk of Convergence 35
1.8 Elementary Functions of Complex Numbers 37
1.9 Euler’s Formula 38
1.10 Powers and Roots of Complex Numbers 40
1.11 The Exponential and Trigonometric Functions 42
1.12 Hyperbolic Functions 44

II Part Two: Linear Algebra

2 Fundamentals of Linear Algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47


2.1 Systems of Linear Equations 47
2.1.1 Matrix Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
2.1.2 Elementary Row Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
2.1.3 Fundamental Questions In Linear Algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
2.2 Row Reduction and Echelon Forms 49
2.2.1 Solutions of Linear Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
2.3 Determinants and Cramer’s Rule 51
2.3.1 Special Case: Upper and Lower Triangular Matrices . . . . . . . . . . . . . . . . . . . . 53
2.3.2 Properties of Determinants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
2.3.3 Cramer’s Rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
2.4 Vectors 55
2.4.1 Scalar Product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
2.4.2 Vector (Cross) Product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
2.4.3 Orthogonality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
2.5 Lines, Planes, and Geometric Applications 58
2.6 Matrix Operations 62
2.6.1 Scalar Multiplication of Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
2.6.2 Addition and Subtraction of Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
2.6.3 Multiplication and Division of Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
2.6.4 Matrix Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
2.6.5 Solution Sets of Linear Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
2.6.6 Inverse Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
2.6.7 Ways to Compute A−1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
2.6.8 Rotation Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
2.6.9 Functions of Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
2.7 Linear Combinations, Functions, and Operators 69
2.7.1 Linear Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
2.7.2 Linear Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
2.8 Matrix Operations and Linear Transformations 72
2.9 Linear Dependence and Independence 73
2.9.1 Special Cases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
2.9.2 Linear Independence of Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
2.9.3 Basis Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
2.10 Special Matrices 76
2.11 Eigenvalues and Eigenvectors 76
2.11.1 The Characteristic Equation: Finding Eigenvalues . . . . . . . . . . . . . . . . . . . . . . . 77
2.11.2 Similarity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
2.12 Diagonalization 79
2.12.1 Physical Interpretation of Eigenvalues, Eigenvectors, and Diagonalization . . . 83
III Part Three: Multivariable Calculus

3 Partial Differentiation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
3.1 Introduction and Notation 87
3.1.1 Review of Product, Quotient, and Chain Rule . . . . . . . . . . . . . . . . . . . . . . . . . . 88
3.2 Power Series in Two Variables 90
3.3 Total Differentials 92
3.4 Approximations Using Differentials 93
3.5 Chain Rule or Differentiating a Function of a Function 95
3.6 Implicit Differentiation 97
3.7 More Chain Rule 99
3.7.1 Using Cramer’s Rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
3.8 Maximum and Minimum Problems with Constraints 101
3.9 Lagrange Multipliers 105

4 Multivariable Integration and Applications . . . . . . . . . . . . . . . . . . . . 111


4.1 Introduction 111
4.2 Double Integrals Over General Regions 114
4.2.1 Integrals Over Subregions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
4.2.2 Area Between Curves . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
4.3 Triple Integrals 118
4.3.1 Volume Between Surfaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
4.4 Applications of Integration 120
4.4.1 Mass . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
4.4.2 Moments and Center of Mass . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
4.4.3 Moment of Inertia . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
4.4.4 Generalization of Physical Quantities to 3D . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
4.4.5 Applications to Probability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
4.5 Change of Variables in Integrals 125
4.5.1 Changing to Polar Coordinates in a Double Integral . . . . . . . . . . . . . . . . . . . 126
4.5.2 Arc Length in Polar Coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
4.6 Cylindrical Coordinates 128
4.7 Cylindrical Coordinates 129
4.7.1 Jacobians . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
4.8 Surface Integrals 132

5 Vector Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135


5.1 Applications of Vector Multiplication 135
5.1.1 Dot and Cross Products . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
5.2 Triple Products 137
5.2.1 Triple Scalar Product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
5.2.2 Triple Vector Product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
5.2.3 Applications of Triple Scalar Products . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
5.2.4 Application of Triple Vector Product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
5.3 Fields 140
5.4 Differentiation of Vectors 143
5.4.1 Differentiation in Polar Coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
5.5 Directional Derivative and Gradient 145
5.5.1 Gradients in Other Coordinate Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
5.5.2 Physical Significance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
5.6 Some Other Expressions Involving ∇ 148
5.6.1 Divergence, ∇ · V . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
5.6.2 Physical Interpretation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
5.6.3 Curl ∇ × V . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
5.6.4 Solenoidal and Irrotational . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
5.6.5 Divergence and Laplacian in Other Coordinate Systems . . . . . . . . . . . . . . . . 152
5.7 Line Integrals 152
5.7.1 Potentials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
5.7.2 Alternate Approach to Finding Scalar Potential φ . . . . . . . . . . . . . . . . . . . . . . 156
5.8 Green’s Theorem in the Plane 156
5.9 The Divergence (Gauss) Theorem 160
5.9.1 Gauss Law for Electricity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
5.10 The Stokes (Curl) Theorem 162
5.10.1 Ampere’s Law . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
5.10.2 Conservative Fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164

IV Part Four: Ordinary Differential Equations

6 Ordinary Differential Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169


6.1 Introduction to ODEs 169
6.1.1 Some Basic Mathematical Models; Direction Fields . . . . . . . . . . . . . . . . . . . . 169
6.1.2 Solutions of Some Differential Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
6.1.3 Classifications of Differential Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
6.2 Separable Equations 175
6.3 Linear First-Order Equations, Method of Integrating Factors 178
6.3.1 REVIEW: Integration By Parts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180
6.3.2 Modeling With First Order Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181
6.4 Existence and Uniqueness 187
6.4.1 Linear Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188
6.4.2 Nonlinear Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189
6.5 Other Methods for First-Order Equations 190
6.5.1 Autonomous Equations with Population Dynamics . . . . . . . . . . . . . . . . . . . . . 190
6.5.2 Bernoulli Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
6.5.3 Exact Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
6.5.4 Homogeneous Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198
6.6 Second-Order Linear Equations with Constant Coefficients and Zero Right-
Hand Side 198
6.6.1 Basic Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198
6.6.2 Homogeneous Equations With Constant Coefficients . . . . . . . . . . . . . . . . . . . 200
6.7 Complex Roots of the Characteristic Equation 200
6.7.1 Review Real, Distinct Roots . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200
6.7.2 Complex Roots . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201
6.8 Repeated Roots of the Characteristic Equation and Reduction of Order 204
6.8.1 Repeated Roots . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204
6.8.2 Reduction of Order . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
6.9 Second-Order Linear Equations with Constant Coefficients and Non-zero
Right-Hand Side 209
6.9.1 Nonhomogeneous Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209
6.9.2 Undetermined Coefficients . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 210
6.9.3 The Basic Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 210
6.9.4 Products . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213
6.9.5 Sums . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215
6.9.6 Method of Undetermined Coefficients . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218
6.10 Mechanical and Electrical Vibrations 219
6.10.1 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219
6.10.2 Free, Undamped Motion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221
6.10.3 Free, Damped Motion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 224
6.10.4 Forced Vibrations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227
6.11 Two-Point Boundary Value Problems and Eigenfunctions 231
6.11.1 Boundary Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231
6.11.2 Eigenvalue Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233
6.12 Systems of Differential Equations 235
6.13 Homogeneous Linear Systems with Constant Coefficients 236
6.13.1 The Phase Plane . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 237
6.13.2 Real, Distinct Eigenvalues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 237

V Part Five: PDEs and Fourier Series

7 Fourier Series and Transforms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 247


7.1 Introduction to Fourier Series 247
7.1.1 Simple Harmonic Motion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 247
7.2 Fourier Coefficients 249
7.3 Fourier Coefficients 249
7.3.1 A Basic Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 251
7.3.2 Derivation of Euler Formulas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 252
7.4 Dirichlet Conditions 255
7.5 Convergence and Sum of a Fourier series 255
7.5.1 Gibbs Phenomenon . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 257
7.6 Complex Form of Fourier Series 257
7.7 Complex Fourier Series 257
7.7.1 General Complex Fourier Series for Intervals (0, L) . . . . . . . . . . . . . . . . . . . . . . 259
7.8 General Fourier Series for Functions of Any Period p = 2L 260
7.9 Even and Odd Functions 265
7.10 Even and Odd Functions, Half-Range Expansions 265
7.10.1 Half-Range Expansions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 268
7.10.2 Fourier Sine Series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269
7.10.3 Fourier Cosine Series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 272

8 Partial Differential Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 275


8.1 Introduction to Basic Classes of PDEs 275
8.2 Introduction to PDEs 275
8.2.1 Basics of Partial Differential Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 275
8.2.2 Laplace’s Equation - Type: Elliptical . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 276
8.2.3 Poisson’s Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 276
8.2.4 Diffusion/Heat Equation - Type: Parabolic . . . . . . . . . . . . . . . . . . . . . . . . . . . . 276
8.2.5 Wave Equation - Type: Hyperbolic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 276
8.2.6 Helmholtz Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 276
8.2.7 Schrödinger Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 276
8.2.8 Solutions to PDEs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 277
8.3 Laplace’s Equations and Steady State Temperature Problems 277
8.3.1 Dirichlet Problem for a Rectangle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 278
8.3.2 Dirichlet Problem For A Circle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 279
8.4 Heat Equation and Schrödinger Equation 281
8.4.1 Derivation of the Heat Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 282
8.5 Separation of Variables and Heat Equation IVPs 283
8.5.1 Initial Value Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 283
8.5.2 Separation of Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 284
8.5.3 Neumann Boundary Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 286
8.5.4 Other Boundary Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 287
8.6 Heat Equation Problems 287
8.6.1 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 289
8.7 Other Boundary Conditions 291
8.7.1 Mixed Homogeneous Boundary Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . 291
8.7.2 Nonhomogeneous Dirichlet Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 292
8.7.3 Other Boundary Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 295
8.8 The Schrödinger Equation 295
8.9 Wave Equations and the Vibrating String 296
8.9.1 Derivation of the Wave Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 296
8.9.2 The Homogeneous Dirichlet Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 297
8.9.3 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 299
8.9.4 D’Alembert’s Solution of the Wave Equation, Characteristics . . . . . . . . . . . . 300

Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 303
I
Part One: Complex Numbers

1 Fundamentals of Complex Numbers . . 11


1.1 Introduction
1.2 Real and Imaginary Parts of a Complex Number
1.3 The Complex Plane
1.4 Terminology and Notation
1.5 Complex Algebra
1.6 Complex Infinite Series
1.7 Complex Power Series and Disk of Convergence
1.8 Elementary Functions of Complex Numbers
1.9 Euler’s Formula
1.10 Powers and Roots of Complex Numbers
1.11 The Exponential and Trigonometric Functions
1.12 Hyperbolic Functions
1. Fundamentals of Complex Numbers

1.1 Introduction
The two-course sequence Mathematical Methods in the Physical Sciences I and II is designed to
condense many courses in higher-level mathematics into the essential information needed to study
upper-level undergraduate physics courses. Our main focus is to develop mathematical intuition for
solving real-world problems while building our toolbox of useful methods. Topics in this course
are drawn from five principal subjects in mathematics:

(i) Complex Numbers (Math 42048, Boas Ch. 2) → Quantum Mechanics

(ii) Linear Algebra (Math 21001, Boas Ch. 3) → Transformations, Change of Coor.,
Stability

(iii) Multivariable Calculus (Math 22005, Boas Ch. 4-6) → Forces, Inertia, Volume, Area

(iv) Introduction to Ordinary Differential Equations (Math 32044, Boas Ch. 7-8) →
Particle Motion, Dynamics

(v) Introduction to Partial Differential Equations (Math 42045, Boas Ch. 13) → Signal
Analysis, Heat Conduction, Waves, Equilibrium Physics

Each class individually goes deeper into its subject, but we will cover the basic tools needed to
handle problems arising in physics, materials science, and the life sciences. Your upper-level
courses will introduce the physical motivation for the problems; here we develop the solution
methods for those problems. In Math Methods I we will cover Chapters 2-5, the first half of
our list.
Recall the first place you most likely saw a complex number: solving a quadratic equation
ax² + bx + c = 0 with the quadratic formula

x = (−b ± √(b² − 4ac)) / (2a).

When the so-called discriminant b² − 4ac is negative, the square root produces an imaginary
number. For example, to solve x² + 1 = 0 the quadratic formula gives ±√(−1) = ±i.
A quadratic equation must have two roots or solutions, and the imaginary number i was introduced
to handle such cases.
Consider some easy examples for how to handle negative square roots.

 Example 1.1
i) √(−64) = 8√(−1) = 8i
ii) √(−5) = √5 √(−1) = √5 i
iii) Powers of i: i = √(−1), i² = √(−1) √(−1) = −1, i³ = i² i = −i, i⁴ = i² i² = 1. Any other power of i
can be found by dividing the exponent by 4 and considering only the remainder: i⁵³ = i⁴⁽¹³⁾ i¹ = i. 
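The cycle of powers of i is easy to check numerically. A minimal sketch in Python, where 1j denotes i (the helper name power_of_i is ours for illustration, not from the text):

```python
# Powers of i repeat with period 4: i, -1, -i, 1, i, ...
# Reducing the exponent mod 4 mirrors the remainder trick in Example 1.1 iii).
def power_of_i(n: int) -> complex:
    return 1j ** (n % 4)

print(power_of_i(53))  # same as i^1 = i, since 53 = 4(13) + 1
print(1j ** 53)        # direct exponentiation agrees
```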
Just as a refresher, solve the following quadratic equation using the quadratic formula.
 Example 1.2 Solve x² − x + 1 = 0. The quadratic formula gives

x = (1 ± √(1 − 4)) / 2 = (1 ± √(−3)) / 2 = 1/2 ± (√3/2) i.


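The same computation can be checked with Python's cmath module, whose sqrt happily takes a negative radicand. The helper quadratic_roots is a hypothetical name for this sketch:

```python
import cmath

# Quadratic formula using cmath.sqrt, which returns a complex square root
# even when the discriminant b^2 - 4ac is negative.
def quadratic_roots(a, b, c):
    d = cmath.sqrt(b * b - 4 * a * c)
    return (-b + d) / (2 * a), (-b - d) / (2 * a)

r1, r2 = quadratic_roots(1, -1, 1)  # x^2 - x + 1 = 0, as in Example 1.2
print(r1, r2)                       # approximately 1/2 +/- (sqrt(3)/2) i
```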
We have built up this intuition from the past, but what exactly is a complex number? Let’s
make an analogy with negative numbers (thanks Kalid Azad for the insight). Imagine a time before
negative numbers were accepted, around the 1700s in Europe. Given two numbers 7 and 8 we can
easily write 8 − 7 = 1. Starting with 8 sheep, if I give you 7 I will only have one left. What about
7 − 8? How can I have less than nothing? The problem is trying to think about this question with
concrete objects. The easiest way to understand it is with money. If I owe you $50 and I am paid
only $10 to teach this course, then at the end of the day I have lost $40, hence the negative sign. In
this case −40 represents a debt, something I owe. The negative sign was invented to keep track
of which direction I am going (positive, I earned money; negative, I owe money).

Figure 1.1: Thanks to Kalid Azad for the figures

Now, how do we handle a square root of a number less than zero? Suppose we want to solve
x² = 9. This means finding a number x such that 1 × x × x = 9. What can I apply to 1 twice so that I
receive 9? The answers are 3 and −3. We can scale 1 by 3 and then scale by 3 again, or we can
scale 1 by negative 3 (scale by 3 and reflect to the negative side) and do the same again.
Now, try to solve x² = −1, or 1 × x × x = −1. What can we apply twice to turn 1 into −1? We
cannot multiply by a positive or a negative number twice, because the result will be positive. What if
we rotated it by 90◦ (see Figure 1.1)? This works, but what does it mean? Summary:

Figure 1.2: Thanks to Kalid Azad for the figures

1) i can be thought of as a “new dimension” in which to measure a number

2) Multiplying by i is a rotation of a number by 90◦ counter-clockwise

3) Multiplying by −i is a rotation of a number by 90◦ clockwise
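Points 2) and 3) are easy to see with Python's built-in complex type, where 1j plays the role of i; a minimal sketch:

```python
# Multiplying by i rotates a number 90 degrees counter-clockwise in the
# complex plane without changing its magnitude.
z = 3 + 0j                  # the real number 3 on the positive real axis
print(z * 1j)               # 3j: a quarter turn onto the imaginary axis
print(z * 1j * 1j)          # (-3+0j): two quarter turns land on -3
print(abs(z), abs(z * 1j))  # 3.0 3.0: rotation preserves magnitude
```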

Complex numbers are very similar to real numbers. Can we make sense of arithmetic operations
on complex numbers (e.g., +, −, ×, ÷)? What about functions of complex numbers such as e^(iz),
sin(iz), and cos(iz)? Also, in the upcoming sections we will consider graphing, power series of
complex functions and the radius of convergence, and distances or magnitudes of complex numbers.

1.2 Real and Imaginary Parts of a Complex Number


So according to the last section numbers can be two-dimensional! Can a number be both real and
imaginary? Yes! Take for example 1 + i, a root of x² − 2x + 2 = 0. This number has
both a real part 1 and a purely imaginary part i, but together they form a complex number of the
form x + yi.
Definition 1.2.1 A complex number is any number of the form z = x + iy where x and y are
real numbers. x is called the real part and y is called the imaginary part.

R Notice the imaginary part y of a complex number z = x + iy is in fact real! It is the real
number coefficient for i.

 Example 1.3 Find the real and imaginary parts of:

i) 5 + 6i, Re{5 + 6i} = 5 and Im{5 + 6i} = 6.

ii) −1 + 3i, Re{−1 + 3i} = −1 and Im{−1 + 3i} = 3.

iii) 6i, Re{6i} = 0 and Im{6i} = 6.

iv) 7, Re{7} = 7 and Im{7} = 0. 
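Python's built-in complex type exposes these parts directly; consistent with the remark above, .imag returns a real number. A quick sketch:

```python
# Real and imaginary parts as in Example 1.3, read off with Python's
# complex type; .imag is the real (float) coefficient of i.
z = 5 + 6j
print(z.real, z.imag)        # 5.0 6.0
print((6j).real, (6j).imag)  # 0.0 6.0
print(complex(7).imag)       # 0.0: real numbers have zero imaginary part
```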

All real numbers are complex numbers with zero imaginary part. Therefore the real numbers
are a subset of the complex numbers; however, there is a more useful observation. All complex
numbers can be written as z = x + yi. If we associate this with the point (x, y) in two-dimensional
space, then we can plot complex numbers (see Figure 1.3.2). In the next section we will investigate
graphing complex numbers further.

Figure 1.3: Thanks to Kalid Azad for the figures



1.3 The Complex Plane


As mentioned briefly before, two-dimensional real space can be thought of as equivalent
to the complex plane, R² ≃ C. Any complex number can be associated to a point we can plot
in the xy-plane with a traditional Cartesian coordinate system. Consider the complex number
z = 2 + 3i, which corresponds to the point (2, 3).

 Example 1.4 Plot 4 + 3i, 3i, 5, and −1 − i. These correspond to the points (4, 3), (0, 3),
(5, 0), and (−1, −1), respectively. 

Recall from calculus another form of coordinates in two dimensions: polar coordinates,
(x, y) 7→ (r, θ). Can we use the same idea to identify complex numbers with their associated
polar coordinates? Yes!
Definition 1.3.1 (Polar Coordinates of Complex Numbers) Any complex number z = x + iy can
be written in polar form using the same relations as two-dimensional Cartesian coordinates

r = √(x² + y²)
θ = tan⁻¹(y/x)

or

x = r cos(θ)
y = r sin(θ).

Thus, z = x + iy = r cos(θ) + i r sin(θ) = r [cos(θ) + i sin(θ)]. Note that all the quantities
involved are real (e.g., x, y, r, θ)!
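These conversion formulas are exactly what Python's cmath.polar and cmath.rect compute; note that cmath uses atan2(y, x) rather than a bare tan⁻¹(y/x), which handles the quadrant bookkeeping automatically. A minimal sketch:

```python
import cmath
import math

z = 2 + 3j
r, theta = cmath.polar(z)    # r = sqrt(x^2 + y^2), theta = atan2(y, x)
print(r, theta)              # sqrt(13) and atan2(3, 2)
print(cmath.rect(r, theta))  # x = r cos(theta), y = r sin(theta): back to ~2+3j
```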

[Diagram: the point (x, y) at distance r from the origin, with angle θ measured from the x-axis, so x and y are the legs of a right triangle with hypotenuse r.]

We can actually simplify this expression further with the help of Euler’s Identity

Definition 1.3.2 (Euler’s Identity) The polar form of a complex number can be written using

e^(iθ) = cos(θ) + i sin(θ). (1.1)

This will be taken as a fact for now and will be shown explicitly in a few sections when we study
complex power series.

R Traditionally if asked for the polar form of a complex number z the expectation is that it is
written z = re^(iθ).
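Taking equation (1.1) on faith for now, we can at least check it numerically at a few angles; a quick sketch:

```python
import cmath
import math

# Check e^{i theta} = cos(theta) + i sin(theta) at some sample angles.
for theta in (0.0, math.pi / 6, 2 * math.pi / 3, -math.pi / 4):
    lhs = cmath.exp(1j * theta)
    rhs = complex(math.cos(theta), math.sin(theta))
    assert abs(lhs - rhs) < 1e-12
print("identity verified at all sampled angles")
```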

Key idea: For solving a lot of problems with complex numbers the main task is to identify
whether it would be easier to tackle the problem in Cartesian (x, y) or polar (r, θ ) coordinates.

1.3.1 Review of Unit Circle in Radians

You need to be very familiar with the standard right triangles, 45-45-90 and 30-60-90 in each
quadrant in order to effectively use the polar form of a complex number. Even though the unit
circle is familiar we must also be able to scale to any size right triangle of these two forms.

[Unit circle diagram: the standard angles 0◦, 30◦, 45◦, 60◦, 90◦, …, 360◦ with their radian measures π/6, π/4, π/3, π/2, …, 2π and coordinates (cos θ, sin θ), e.g. (√3/2, 1/2) at 30◦, (√2/2, √2/2) at 45◦, and (1/2, √3/2) at 60◦, in all four quadrants.]

 Example 1.5 For each of the following find the polar form and plot the result in two dimensions.

i) z = −1 + √3 i

Step 1: Find r = √(x² + y²) = √(1 + 3) = 2.

Step 2: Find θ = tan⁻¹(√3/(−1)). Recall that tangent is opposite over adjacent. Thus, we need
a triangle where the opposite side has length √3 and the adjacent side is along −1. If x < 0 and
y > 0 we are in quadrant II with a 30-60-90 right triangle. Therefore θ = 2π/3.

Step 3: Write in polar form z = 2e^(i2π/3).

Step 4: Plot the result!



[Plot: the point (−1, √3) in quadrant II, at radius r = 2 and angle θ = 2π/3 from the positive x-axis.]

ii) z = 1 − i

Step 1: Find r = √(x² + y²) = √(1 + 1) = √2.

Step 2: Find θ = tan⁻¹(−1/1). The tangent is opposite over adjacent. Thus, we need a tri-
angle where the opposite side is along −1 and the adjacent side is along 1. If x > 0 and y < 0 we are
in quadrant IV with a 45-45-90 right triangle. Therefore θ = 7π/4 or −π/4. Note that it is important to
observe the sign of both x and y to be in the correct quadrant.

Step 3: Write in polar form z = √2 e^(i7π/4).

Step 4: Plot the result!

[Plot: the point (1, −1) in quadrant IV, at radius r = √2 and angle θ = −π/4.]

iii) z = 3i

Step 1: Find r = √(x² + y²) = √(0 + 9) = 3.

Step 2: Find θ = tan⁻¹(3/0). We are looking for an angle whose tangent is ∞. Recall that
tangent is sine over cosine; the tangent is infinite when cos(θ) = 0, i.e. θ = π/2 or θ = −π/2. Since
y > 0, then θ = π/2.

Step 3: Write in polar form z = 3e^(iπ/2).

Step 4: Plot the result!

[Plot: the point (0, 3) on the positive imaginary axis, at angle θ = π/2.]
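The three answers above can be cross-checked with cmath.polar; note that cmath reports angles in (−π, π], so part ii) comes out as −π/4, equivalent to 7π/4:

```python
import cmath
import math

# Polar forms of the numbers from Example 1.5.
for z in (-1 + math.sqrt(3) * 1j, 1 - 1j, 3j):
    r, theta = cmath.polar(z)
    print(z, "->", r, theta)
# -1 + sqrt(3) i -> r = 2,       theta =  2 pi / 3
# 1 - i          -> r = sqrt(2), theta = -pi / 4
# 3i             -> r = 3,       theta =  pi / 2
```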

1.3.2 Going Deeper: Understanding Euler’s Identity

Figure 1.4: Thanks to Kalid Azad for the figures.



Euler’s Identity e^(iθ) = cos(θ) + i sin(θ) is a formula which explains how to move around the
unit circle. Consider a point confined to the unit circle traveling x radians. The horizontal distance
traveled is cos(x) and the vertical distance traveled is sin(x). To take these two coordinates and
combine them into one number we make it complex: z = cos(x) + i sin(x). Thus, the right side of
Euler’s formula/identity describes motion on a circle.
The left-hand side of Euler’s Identity contains the exponential function. For real numbers the
function e^x arises in problems involving growth or decay at a fast rate. So what do we mean
by imaginary growth, e^(iθ)? Imaginary growth is different from normal exponential growth. The
growth is in a different direction: instead of going forward we grow along the imaginary axis
(the y-direction, or 90◦). Instead of speeding up or slowing down, a point begins to rotate (multiplying a
number by i does not change its magnitude; it only rotates it).

Figure 1.5: Thanks to Kalid Azad for the figures.

Thinking Question For real numbers the exponential function keeps growing larger and larger, so in the case of “imaginary growth” should we rotate faster and faster?

Since we are constrained to the unit circle, instead of growing larger and larger a point moves further along the circle. For example, compare e^{iθ} and e^{2iθ}: the magnitude does not change (still 1), but we rotate twice as far (or travel twice as long if θ is thought of as time).

Interesting Case: Complex Growth What if the growth rate is complex ex+iy ?

The real part ex grows like normal while the imaginary part eiy rotates. Thus, one can expect
a spiral shape. This will be seen later when finding complex solutions to equations of motion!!
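This spiral picture is easy to check numerically (a Python sketch with arbitrary sample values, not part of the text): in e^{x+iy} the real part x sets the magnitude e^x and the imaginary part y sets the rotation angle.

```python
import cmath
import math

z = cmath.exp(0.5 + 1.2j)  # complex growth: the exponent's real and imaginary parts act separately

print(abs(z))          # magnitude comes only from the real part: e^0.5
print(cmath.phase(z))  # angle comes only from the imaginary part: 1.2 rad
```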

1.4 Terminology and Notation



In this class we will always use i to denote the pure imaginary number i := √−1. Be aware that in many physics textbooks j is also used. Often this is seen when studying electricity, where current is denoted by i, to avoid confusion. An additional point regarding notation is that a complex number z = x + iy is one number, so when labeling points in the complex plane usually a single letter is used (e.g., A, B, P, etc.).
Recall from the last lecture the polar form of a complex number using Euler’s identity.

z = x + iy = r(cos(θ) + i sin(θ)) = re^{iθ} (1.2)



Figure 1.6: Thanks to Kalid Azad for the figures.

where the real part of z is Re{z} = x and the imaginary part of z is Im{z} = y. In addition, the magnitude or length associated with the complex number z is r = |z| = √(x² + y²) and the angle is θ = tan⁻¹(y/x).


 Example 1.6 Write z = −√3 − 3i in polar form and plot it.

Step 1: Find r = √(x² + y²) = √(3 + 9) = 2√3.

Step 2: Find θ = tan⁻¹(−3/(−√3)). Recall that tangent is opposite over adjacent. Thus, we need a triangle where the opposite side is along −3 and the adjacent side is along −√3. If x < 0 and y < 0 we are in quadrant III with a 30-60-90 right triangle. Therefore θ = 4π/3.

Step 3: Write in polar form z = 2√3 e^{i4π/3}.

Step 4: Plot the result!



[Plot: the point (−√3, −3) in quadrant III, with the principal angle θ and the reference angle θ_ref labeled.]


Note that due to periodicity the true answer for the angle is θ = 4π/3 + 2πn, where n is an integer. The first component, θ_principal := 4π/3, is known as the principal angle and must lie in the standard interval 0 ≤ θ_p < 2π. Another angle of importance is the reference angle, 0 ≤ θ_ref ≤ π/2, which gives the magnitudes of the sides of the 30-60-90 or 45-45-90 right triangle. Observe that the reference angle has nothing to do with the sign of each side of the triangle, because it is independent of the quadrant the point lies in.

Figure 1.7: Difference between the principal angle θ_p (black) and the reference angle θ_ref (red).

R When working with complex numbers make sure that the angle θ you find is in the same
quadrant as the complex number itself.

1.4.1 Complex Conjugation


Consider two complex numbers z1 = x + iy and z2 = x − iy. The only difference is the sign of the imaginary part, ±iy. These two complex numbers are known as complex conjugates. Specifically, z2 is the complex conjugate of z1, denoted z2 = z̄1, where the bar indicates “complex conjugate” (some textbooks use a star instead: z2 = z1*). Given any complex number we can always find its conjugate by changing the sign of the imaginary part. You may have come across these pairs

before when solving quadratic equations, because complex solutions to equations always come as
conjugate pairs. In other words, if z = 2 + 3i is a solution, then z = 2 − 3i must also be a solution.
 Example 1.7 Find the complex conjugate of each of the following complex numbers:

i) z = 1 + i, then z̄ = 1 − i

ii) z = −2 − 4i, then z̄ = −2 + 4i

iii) z = −5i, then z̄ = 5i

iv) z = −3, then z̄ = −3

v) z = 0, then z̄ = 0 

It is easy to blindly remember to change the sign of the imaginary part, but let’s look at a pair of complex conjugates plotted on the same coordinate plane to see if there is any relationship. Let’s look back at Example 1.7 i) and ii).

[Plot: the conjugate pairs (1, 1), (1, −1) and (−2, 4), (−2, −4) in the complex plane.]

The complex conjugate of a number is just its reflection across the x-axis (real axis).

Thinking Question: A complex conjugate is just a reflection across the x-axis, so the change in (x, y) coordinates is simply y ↦ −y. How does a complex conjugate affect the polar form of a complex number?

Answer: The magnitude of a complex number and its conjugate are identical, so r remains the same. However, θ ↦ −θ. What does this mean in terms of the principal angle and the reference angle? The reference angle θ_ref remains unchanged, but the principal angle changes sign. So if θ = π/4, then θ_z̄ = −π/4, or equivalently 7π/4.

We can also see this directly from the polar form of a complex number:

z̄ = re^{−iθ} = r [cos(−θ) + i sin(−θ)] = r [cos(θ) − i sin(θ)] = x − iy. (1.3)
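Python's complex type has conjugation built in, so the reflection picture can be verified directly (a sketch with an arbitrary sample point, not from the text):

```python
import cmath

z = 2 + 3j
zbar = z.conjugate()              # reflection across the real axis: (x, y) -> (x, -y)

r, theta = cmath.polar(z)
rbar, thetabar = cmath.polar(zbar)
print(r, rbar, theta, thetabar)   # same magnitude, opposite angle
```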

1.5 Complex Algebra

With real numbers we can perform various algebraic operations to combine them into something
new. These include, but are not limited to

i) Basic Operations: Addition +, Subtraction −, Multiplication ×, and Division ÷

ii) Magnitude and Distance | · |, v = |v|, or |a − b|.

iii) Solving Equations 4x = 6

Now we will consider the complex analogue of each of these as well as some physical applications
for complex numbers.

1.5.1 Simplifying to Standard Form x + iy

As we have seen before complex numbers can be written in two equivalent forms z = x + iy = reiθ .
The first form is referred to as standard form and will be useful for the basic operations.

Addition of Complex Numbers

Definition 1.5.1 Given two complex numbers z1 = x1 + iy1 and z2 = x2 + iy2 , their sum z1 + z2
is defined as

(x1 + iy1 ) + (x2 + iy2 ) = (x1 + x2 ) + i(y1 + y2 ) (1.4)

Just add the real parts and add the imaginary parts.

 Example 1.8 i) (1 + i) + (2 − 3i) = (1 + 2) + i(1 + (−3)) = 3 − 2i

ii) (1 + 5i) + (−4i) = (1 + 0) + i(5 + (−4)) = 1 + i

iii) (1 + 0i) + (0 + 2i) = (1 + 0) + i(0 + 2) = 1 + 2i 

Now, let’s visualize Example 1.8 i).



[Plot: z1, z2, and the sum z1 + z2 = z2 + z1 drawn as vector (parallelogram) addition.]

This visualization shows two key ideas:

1. The addition of complex numbers behaves exactly as vector addition in two-dimensions (recall
the analogy between R2 ∼= C).

2. Addition of real numbers is commutative, a + b = b + a. Here we see the addition of complex numbers is also commutative. In other words, the order in which one adds them does not matter.

Subtraction of Complex Numbers

Definition 1.5.2 Given two complex numbers z1 = x1 + iy1 and z2 = x2 + iy2 , their difference
z1 − z2 is defined as

(x1 + iy1 ) − (x2 + iy2 ) = (x1 − x2 ) + i(y1 − y2 ) (1.5)

Just subtract the real parts and the imaginary parts.

 Example 1.9 i) (1 + i) − (2 − 3i) = (1 − 2) + i(1 − (−3)) = −1 + 4i

ii) (1 + 5i) − (−4i) = (1 − 0) + i(5 − (−4)) = 1 + 9i

iii) (1 + 0i) − (0 + 2i) = (1 − 0) + i(0 − 2) = 1 − 2i 

Now, let’s visualize Example 1.9 i).



[Plot: the differences z1 − z2 and z2 − z1 drawn as vectors; they have the same length but opposite directions.]

This visualization shows two key ideas:

1. The subtraction of complex numbers behaves exactly as vector subtraction in two-dimensions.

2. Subtraction of real numbers is NOT commutative, a − b ≠ b − a; the results are the same except for the sign. Here we see also that the subtraction of complex numbers is NOT commutative. In other words, the order in which one subtracts does matter!
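Both componentwise rules, and the commutativity observations, can be confirmed with Python's complex literals, where j plays the role of i (a sketch reusing Examples 1.8 i) and 1.9 i)):

```python
z1, z2 = 1 + 1j, 2 - 3j

print(z1 + z2)  # Example 1.8 i): add real parts and imaginary parts
print(z1 - z2)  # Example 1.9 i): subtract real parts and imaginary parts
```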
Multiplication of Complex Numbers
Definition 1.5.3 Given two complex numbers z1 = x1 + iy1 and z2 = x2 + iy2 , their product z1 z2
is defined as

(x1 + iy1 )(x2 + iy2 ) = (x1 x2 − y1 y2 ) + i(x2 y1 + x1 y2 ) (1.6)

Just FOIL as you would any product of binomials! Once the four terms are found, combine the two real terms into x and the two imaginary terms into y.

R The most common mistake made is that one forgets i2 = −1 so when multiplying the two
imaginary parts you receive a real number and a sign change.

 Example 1.10 i) (2 + 3i)² = (2 + 3i)(2 + 3i) = 4 + 6i + 6i − 9 = −5 + 12i

ii) (1 + i)(2 − 3i) = 2 − 3i + 2i + 3 = 5 + i



iii) (1 + 5i)(−4i) = −4i − 20i2 = 20 − 4i

iv) (1 − i)2 = 1 − i − i + i2 = −2i

v) (1 + i)(1 − i) = 1 − i + i − i2 = 2

vi) (1 + i)(1 + 2i)(2 − i) = [1 + 2i + i + 2i²](2 − i) = [−1 + 3i](2 − i) = −2 + i + 6i − 3i² = 1 + 7i 


Now, let’s visualize Example 1.10 v).

[Plot: z1, z2, and the product z1 z2 in the complex plane.]

This visualization shows three key ideas:

1. What is special about multiplying two complex conjugates? Example 1.10 v) shows that this always results in a real number:

(x1 + iy1)(x1 − iy1) = (x1x1 + y1y1) + i(x1y1 − x1y1) = x1² + y1² = r² (1.7)

2. The multiplication of complex numbers behaves exactly as FOIL in the case of two binomials.

3. Multiplication of real numbers is commutative, ab = ba. Here we see also that the multiplication of complex numbers is commutative. In other words, the order in which one multiplies is not important.
In some cases multiplication may be easier to carry out in polar form (while polar form is
clearly not a good choice for addition/subtraction). To multiply two complex numbers in polar form
we simply multiply the magnitudes, r, and add the angles

z1 z2 = r1 eiθ1 r2 eiθ2 = r1 r2 ei(θ1 +θ2 ) . (1.8)

Thus, visually multiplication amounts to rotating the first complex number z1 by angle θ2 and
extending its length by a factor of r2 .
 Example 1.11 i) (1 − i)² = (1 − i)(1 − i) = 1 − i − i − 1 = −2i

In Polar Form: 1 − i = √2 e^{−iπ/4}

(1 − i)² = √2 e^{−iπ/4} · √2 e^{−iπ/4} = 2e^{−iπ/2} = −2i

[Plot: z1 = (1, −1) rotated by θ2 to give the product z1 z2 = (0, −2).]

Division of Complex Numbers


Definition 1.5.4 Given two complex numbers z1 = x1 + iy1 and z2 = x2 + iy2, their quotient z1/z2 is defined as

z1/z2 = (x1 + iy1)/(x2 + iy2) (1.9)

To write this in standard form, x + iy, one must follow two steps:

Step 1: Multiply the top and bottom by the complex conjugate of the denominator (resulting in a real number in the denominator).

Step 2: Separate the real and imaginary parts to find x and y in the standard form.

 Example 1.12 i) (1 + i)/(2 − 3i) = (1 + i)/(2 − 3i) · (2 + 3i)/(2 + 3i) = (2 + 3i + 2i + 3i²)/(4 + 6i − 6i − 9i²) = (2 + 5i − 3)/(4 + 9) = (−1 + 5i)/13 = −1/13 + 5/13 i

CHECK: (2 − 3i)(−1/13 + 5/13 i) = −2/13 + 10/13 i + 3/13 i − 15/13 i² = 1 + i

ii) (1 + 5i)/(−4i) = (1 + 5i)/(−4i) · (4i)/(4i) = (4i + 20i²)/(−16i²) = (−20 + 4i)/16 = −5/4 + 1/4 i

iii) (2 − i)/(2 + i) = (2 − i)/(2 + i) · (2 − i)/(2 − i) = (4 − 2i − 2i + i²)/(4 + 2i − 2i − i²) = (3 − 4i)/5 = 3/5 − 4/5 i 

Unlike multiplication, using the polar form for division is not a good choice unless the angles of both the numerator and denominator are easy to find (e.g., from a 30-60-90 or 45-45-90 right triangle). To divide two complex numbers in polar form we simply divide the magnitudes, r, and subtract the angle in the denominator from the angle in the numerator:

z1/z2 = (r1 e^{iθ1})/(r2 e^{iθ2}) = (r1/r2) e^{i(θ1−θ2)}. (1.10)

Thus, visually, division amounts to rotating the first complex number z1 by the angle −θ2 and reducing its length by a factor of r2.
 Example 1.13 i) (1 − i)/(1 + i).

Method 1: (1 − i)/(1 + i) · (1 − i)/(1 − i) = (1 − i − i + i²)/(1 − i + i − i²) = −2i/2 = −i

Method 2: (√2 e^{−iπ/4})/(√2 e^{iπ/4}) = e^{i(−π/4 − π/4)} = e^{−iπ/2} = cos(−π/2) + i sin(−π/2) = −i

[Plot: z1 = (1, −1), z2 = (1, 1), the product z1 z2, and the quotient z1/z2 in the complex plane.]

R Multiplication and Division amount to rotating complex numbers in one direction or another
and scaling them. Most times it is useful to avoid the polar form and carry out these operations
on the standard form. No matter which method is used the final answer should always be
reported in standard form, x + iy.
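The conjugate-over-itself recipe and the polar rule can both be checked against Python's built-in complex division (a sketch reusing Example 1.12 i)):

```python
import cmath

z1, z2 = 1 + 1j, 2 - 3j
quotient = z1 / z2                    # hand computation gave -1/13 + 5/13 i

r1, t1 = cmath.polar(z1)
r2, t2 = cmath.polar(z2)
polar = cmath.rect(r1 / r2, t1 - t2)  # divide the r's, subtract the angles

print(quotient, polar)
```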

1.5.2 Complex Conjugation of an Expression


Definition 1.5.5 The conjugate of a sum of two complex numbers is the sum of the conjugates. Given z1 = x1 + iy1 and z2 = x2 + iy2,

\overline{z1 + z2} = \overline{(x1 + x2) + i(y1 + y2)} = (x1 + x2) − i(y1 + y2) = (x1 − iy1) + (x2 − iy2) = z̄1 + z̄2. (1.11)

In addition, the conjugate of a difference, product, or quotient is equal to the difference, product, or quotient of the conjugates (e.g., \overline{z1 − z2} = z̄1 − z̄2, \overline{z1 z2} = z̄1 z̄2, and \overline{z1/z2} = z̄1/z̄2).

 Example 1.14 i) \overline{(1 + i)(2 − 3i)} = (1 − i)(2 + 3i) = 2 + 3i − 2i − 3i² = 2 + i + 3 = 5 + i

OR: (1 + i)(2 − 3i) = 2 − 3i + 2i − 3i² = 5 − i, and \overline{5 − i} = 5 + i.

ii) \overline{(1 + i)/(3 − 4i)} = (1 − i)/(3 + 4i) = (1 − i)/(3 + 4i) · (3 − 4i)/(3 − 4i) = (3 − 4i − 3i + 4i²)/(9 + 16) = (−1 − 7i)/25 = −1/25 − 7/25 i

OR: (1 + i)/(3 − 4i) = (1 + i)/(3 − 4i) · (3 + 4i)/(3 + 4i) = (3 + 4i + 3i + 4i²)/(9 + 16) = (−1 + 7i)/25, whose conjugate is −1/25 − 7/25 i. 

Notice if z = f + ig where f, g are complex numbers, then z̄ = f̄ − iḡ, NOT f − ig.


 Example 1.15 Let f = 1 + i and g = 2 − i, then find z̄ where z = f + ig.

Thus, z̄ = \overline{(1 + i) + i(2 − i)} = (1 − i) − i(2 + i) = 1 − i − 2i + 1 = 2 − 3i. As a check, find z = (1 + i) + i(2 − i) = 2 + 3i, then z̄ = 2 − 3i ≠ f − ig = −i. 

 Example 1.16 Show the conjugate of the quotient is the quotient of the conjugates.

\overline{z1/z2} = \overline{(r1 e^{iθ1})/(r2 e^{iθ2})} = \overline{(r1/r2) e^{i(θ1−θ2)}} = (r1/r2) e^{i(θ2−θ1)} = (r1 e^{−iθ1})/(r2 e^{−iθ2}) = z̄1/z̄2. 

Observe that if the rule works for a sum then it automatically works for a difference, since A − B = A + (−B). Similarly, if it works for a product (and for reciprocals), then it works for a quotient, since A/B = A × (1/B).
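These distribution rules are easy to spot-check numerically (a sketch with arbitrary sample values, not from the text):

```python
z1, z2 = 1 + 1j, 3 - 4j

# conjugation distributes over sums, differences, products, and quotients
print((z1 + z2).conjugate(), z1.conjugate() + z2.conjugate())
print((z1 * z2).conjugate(), z1.conjugate() * z2.conjugate())
```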

1.5.3 Finding the Absolute Value |z|


Definition 1.5.6 The magnitude or length of a complex number z is r = |z| = √(x² + y²). We take the positive square root since distance is positive.

 Example 1.17 Given a complex number z = x + iy, find zz̄.

zz̄ = (x + iy)(x − iy) = x² − ixy + ixy − i²y² = x² + y² = r² = |z|², or in polar form zz̄ = re^{iθ} · re^{−iθ} = r² = |z|². Thus, |z| = r = √(zz̄). 

 Example 1.18 Find:

i) |1 + i| = √(1² + 1²) = √2.

ii) |4i| = √(0² + 4²) = √16 = 4.

iii) |1 + 2i| = √(1² + 2²) = √5. 

R The absolute value of a product or quotient is the product or quotient of the absolute values.

 Example 1.19 i) |(1 + i)/(1 − i)|: (1 + i)/(1 − i) = (1 + i)/(1 − i) · (1 + i)/(1 + i) = (1 + i + i + i²)/(1 + 1) = 2i/2 = i, so |(1 + i)/(1 − i)| = |i| = 1.

OR: |1 + i|/|1 − i| = √(1² + 1²)/√(1² + (−1)²) = √2/√2 = 1.

ii) |(2 − 3i)/(5 + 6i)| = |2 − 3i|/|5 + 6i| = √(2² + (−3)²)/√(5² + 6²) = √13/√61.

iii) |(2 + 4i)/(1 + i)| = |2 + 4i|/|1 + i| = √(2² + 4²)/√(1² + 1²) = √20/√2 = √10. 
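Python's abs() computes √(x² + y²) for a complex argument, so the rule |z1/z2| = |z1|/|z2| can be verified on each part (a sketch, not from the text):

```python
import math

print(abs((1 + 1j) / (1 - 1j)))  # Example i): 1
print(abs((2 - 3j) / (5 + 6j)))  # Example ii): sqrt(13)/sqrt(61)
print(abs((2 + 4j) / (1 + 1j)))  # Example iii): sqrt(10)
```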

1.5.4 Complex Equations


The main idea is to remember that a complex number is associated with a pair of real numbers (the
real and imaginary parts). Thus, a complex equation really contains two equations, one for the real
parts and one for the imaginary parts.
Definition 1.5.7 Two complex numbers, z1 and z2 , are equal if and only if x1 = x2 (real parts)
and y1 = y2 (imaginary parts).

For example, 2 + i ≠ 2 − i. What does this say about complex equations? When solving a complex equation we really need to solve two equations at once (one for the real parts and one for the imaginary parts). Knowing that an equation is complex therefore gives a relationship between each part.
 Example 1.20 Find z = x + iy if z² = 4i.

(x + iy)² = 4i
FOIL: x² + 2ixy − y² = 4i
Split into two equations. Real: x² − y² = 0. Imaginary: 2xy = 4.

Solving the first equation gives x² = y², so either x = −y or x = y. In the first case (x = −y), the second equation gives −2x² = 4, which implies x = ±√(−2) = ±i√2; but x must be a real number, so this case cannot hold!

In the second case (x = y), the second equation gives 2x² = 4, or x = ±√2. Thus, the two solutions are (√2, √2) and (−√2, −√2). 
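Squaring the two candidate solutions confirms the algebra (a quick numerical sketch):

```python
import math

root = math.sqrt(2) * (1 + 1j)  # z = sqrt(2) + sqrt(2) i

print(root ** 2)                # should be 4i
print((-root) ** 2)             # the other solution squares to 4i as well
```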

 Example 1.21 i) #39 Solve x + iy = y + ix.

Matching real and imaginary parts, this is always true if x = y. So there are infinitely many
solutions that lie on the line y = x in the complex plane.

ii) #44 Solve x + iy = (1 − i)².

FOIL the right-hand side:

x + iy = 1 − i − i + i² = 1 − 2i − 1 = −2i ⇒ x = 0, y = −2.
iii) #45 Solve (x + iy)² = (x − iy)².

FOIL both sides:

x² + 2xyi − y² = x² − 2xyi − y²
2xyi = −2xyi
xy = −xy.

Thus, xy = 0, so either x = 0 or y = 0. Therefore, z = x (purely real) or z = iy (purely imaginary). 

Now for a harder example! If you can solve this you can handle most quadratic equations and have demonstrated you follow all the necessary steps.

 Example 1.22 Solve (x + iy + 2 + 3i)/(x + iy − 3) = i + 2. Let z = x + iy and rewrite the equation as (z + 2 + 3i)/(z − 3) = i + 2.

z + 2 + 3i = (i + 2)(z − 3)
z + 2 + 3i = zi − 3i + 2z − 6
Rearrange terms with z: z − (2 + i)z = −3i − 6 − (2 + 3i)
(−1 − i)z = −8 − 6i.

Thus, z = (−8 − 6i)/(−1 − i) = (−8 − 6i)/(−1 − i) · (−1 + i)/(−1 + i) = (8 − 8i + 6i − 6i²)/(1 + 1) = (14 − 2i)/2 = 7 − i. 
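Substituting the answer back into the original equation is a quick sanity check (sketch):

```python
z = 7 - 1j

lhs = (z + 2 + 3j) / (z - 3)  # should reproduce the right-hand side i + 2
print(lhs)
```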

1.5.5 Graphs of Complex Equations


What is the curve made up of points in the complex plane satisfying |z| = 1?

In other words, find x and y such that x² + y² = 1. This is just the equation of a circle of radius 1 centered at the origin.

[Plot: the unit circle |z| = 1 in the complex plane.]

 Example 1.23 Describe the plots of each of these complex equations:

i) |z − 3| = 4, square both sides |z − 3|2 = 16 ⇒ (x − 3)2 + y2 = 16. This is a circle centered


at (3,0) with radius 4.

ii) |z − 3| ≥ 4. This is the area outside the circle centered at (3,0) of radius 4 including the
circle boundary.

iii) |z − 3| < 4. This is the interior of the circle centered at (3,0) of radius 4. 

Recall the three basic conic sections and their equations:

a) Circle centered at (a, b) of radius r: (x − a)² + (y − b)² = r²

b) Ellipse centered at (a, b) with radius c in x and radius d in y: (x − a)²/c² + (y − b)²/d² = 1

c) Hyperbola centered at (a, b): (x − a)²/c² − (y − b)²/d² = 1

Plot the solution to the equation θ = π/6: it is the ray from the origin making angle π/6 with the positive x-axis.

[Plot: the ray θ = π/6.]

 Example 1.24 Plot each of the following:

i) Re{z} < 2

ii) Re{z} ≥ −1

iii) Im{z} ≥ 3. 

1.5.6 Physical Applications


Complex equations appear in all sorts of physical applications, primarily adding analytical power for problems in two dimensions. Classical examples are 2D fluid flow and superconductivity in a wire, among others. Understanding how complex equations work will provide the basis for learning advanced methods. If interested, look up the techniques of conformal mapping or contour integration, which are very useful in physics.
 Example 1.25 A particle moves in the (x, y) plane so that its position as a function of time t is given by

z = x + iy = (i + 3t)/(t − 2i).

Find the magnitudes of the velocity and the acceleration as functions of time.

Answer: First recall the definitions of position, velocity, and acceleration as well as their relationships:
Position: z = x + iy
Velocity: dz/dt = dx/dt + i dy/dt
Acceleration: d²z/dt² = d²x/dt² + i d²y/dt²

First find the velocity using the Quotient Rule: dz/dt = [3(t − 2i) − (i + 3t)]/(t − 2i)² = (3t − 6i − i − 3t)/(t − 2i)² = −7i/(t − 2i)². We need its magnitude, so consider |dz/dt| = |−7i|/|t² − 4it − 4| = 7/√((t² − 4)² + (4t)²) = 7/√(t⁴ − 8t² + 16 + 16t²) = 7/√(t⁴ + 8t² + 16) = 7/√((t² + 4)²) = 7/(t² + 4).

Now find the acceleration by taking one more derivative: d²z/dt² = 14i/(t − 2i)³. Last, find its magnitude:

a = |d²z/dt²| = |14i|/|t − 2i|³ = 14/(t² + 4)^{3/2},

since |t − 2i| = √(t² + 4). 
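The closed-form speed 7/(t² + 4) can be sanity-checked against a centered finite difference of the position (a numerical sketch; the step h and the sample times are arbitrary choices):

```python
def position(t):
    # z(t) = (i + 3t)/(t - 2i), the particle's position in the complex plane
    return (1j + 3 * t) / (t - 2j)

h = 1e-6
for t in (0.0, 1.0, 2.5):
    speed_fd = abs((position(t + h) - position(t - h)) / (2 * h))
    speed_formula = 7 / (t ** 2 + 4)
    print(t, speed_fd, speed_formula)
```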

1.6 Complex Infinite Series


Recall that for a function of a real variable, f(x), we can find an approximation locally using a Taylor expansion. Any function of one variable can be expanded about a point x = a as follows:

f(x) = f(a) + f′(a)(x − a) + (f″(a)/2!)(x − a)² + ... + (f⁽ᵏ⁾(a)/k!)(x − a)ᵏ + ...

Key Questions: Does this series converge? If so, then for what values of x is the expansion valid? We will explore these questions for complex Taylor series.

Convergence in Section 1.6

Radius/Disk of Convergence in Section 1.7

For a real series ∑ aₖ we can define the partial sum of the first n terms, Sₙ := ∑_{k=1}^{n} aₖ. We say the series converges if lim_{n→∞} Sₙ = S, where S is the sum.

R In future courses, you may see convergence defined differently. Rigorously, a series is said to converge if the partial sums get closer and closer together: |Sₘ − Sₙ| → 0 as m, n → ∞ (the Cauchy criterion).

Analogously, for complex numbers the partial sum Sₙ = Xₙ + iYₙ consists of a sum of the real parts and a sum of the imaginary parts. The series converges if both expressions approach some limit:

lim_{n→∞} [Xₙ + iYₙ] = lim_{n→∞} Xₙ + i lim_{n→∞} Yₙ = X + iY.

Thus, Xₙ → X and Yₙ → Y. In other words, the real and imaginary parts of the series each converge as a series of real numbers.
First, let’s review the definition of absolute convergence and convergence tests for series of real
numbers.
Definition 1.6.1 If the series of absolute values satisfies ∑_{n=1}^∞ |zₙ| < ∞, then the series is called absolutely convergent.

There is also a special type of series whose convergence is easy to decide, known as a geometric series.
Definition 1.6.2 A geometric series has the form ∑_{n=0}^∞ arⁿ. If |r| < 1 this series converges. Useful formulas for the partial sums and the infinite sum are

∑_{n=0}^{N} arⁿ = a(1 − r^{N+1})/(1 − r),   ∑_{n=0}^∞ arⁿ = a/(1 − r).

1.6.1 Review from Calculus: Tests for Convergence


For more detail please consult Chapter 1 in the textbook by Boas.
Convergence Test
If the terms aₙ do not tend to 0, then the series must diverge.

 Example 1.26 ∑_{i=1}^∞ i/(i + 1) must diverge since each term tends to 1 ≠ 0. 

Comparison Test
Consider two series a1 + a2 + a3 + ... and b1 + b2 + b3 + .... If |an | ≤ |bn | for all n and the series
∑ bn converges, then the series for an is absolutely convergent OR if |an | ≥ dn and the series for dn
diverges, then the series for an diverges.
 Example 1.27 ∑_{n=1}^∞ 1/n! = 1 + 1/2 + 1/6 + .... Let bₙ = 1/2ⁿ; then |aₙ| ≤ bₙ for all n ≥ 4 (which is all the comparison test requires) and ∑ bₙ < ∞ (geometric series). Thus ∑ aₙ converges! 

Integral Test
If 0 < a_{n+1} ≤ aₙ for n > N, then ∑_{n=1}^∞ aₙ converges/diverges if ∫^∞ aₙ dn converges/diverges.

 Example 1.28 ∑_{n=1}^∞ 1/n. Using the Integral Test: ∫₁^∞ (1/n) dn = ln(n)|₁^∞ = ln(∞) − 0 → ∞. So the original series diverges! 

Ratio Test
Take the ratio of two consecutive terms in the series, ρₙ = |a_{n+1}/aₙ|, and consider ρ = lim_{n→∞} ρₙ. If:
i) ρ < 1 the series converges
ii) ρ > 1 the series diverges
iii) ρ = 1 there is not enough info to conclude whether the series converges or diverges.

 Example 1.29 ∑_{n=1}^∞ 1/n!, then ρₙ = (1/(n + 1)!)/(1/n!) = n!/(n + 1)! = 1/(n + 1) → 0. Thus, the original series converges. 

Root Test
Consider the nth root of the summand, L := lim_{n→∞} ⁿ√|aₙ|. If:
i) L < 1 the series is absolutely convergent
ii) L > 1 the series is divergent
iii) L = 1 there is not enough info to conclude whether the series converges or diverges.

 Example 1.30 ∑_{n=0}^∞ ((5n − 3n³)/(7n³ + 2))ⁿ, then L = lim_{n→∞} |(5n − 3n³)/(7n³ + 2)| = |−3/7| = 3/7 < 1. The series converges! 

Alternating Series
An alternating series is a series where the terms have the form an = (−1)n bn or an = (−1)n+1 bn .
An alternating series converges if the limit of the absolute value of the terms converges to zero and
the terms are decreasing: |an+1 | < |an | and limn→∞ an = 0.
 Example 1.31 1 − 1/2 + 1/3 − 1/4 + 1/5 − 1/6 + ... converges by the alternating series test. 

Definition 1.6.3 If a series converges, but not absolutely, then it is said to be conditionally
convergent. This is a weaker form of convergence. In particular, the terms in the sum can be
rearranged to form any total. In contrast, for a series that is absolutely convergent, rearranging
the terms does not change the sum.

1.6.2 Examples with Complex Series


 Example 1.32 i) 1 + (2 + i)/3 + (2 + i)²/9 + (2 + i)³/27 + ... + (2 + i)ⁿ/3ⁿ + ....

By the Ratio Test: lim_{n→∞} |ρₙ| = lim_{n→∞} |(2 + i)^{n+1} 3ⁿ / (3^{n+1} (2 + i)ⁿ)| = |2 + i|/3 = √(2² + 1²)/3 = √5/3 < 1. The series converges!

ii) ∑_{n=1}^∞ iⁿ/√n.

Consider the real part, ∑_{k=1}^∞ (−1)ᵏ/√(2k), and the imaginary part, ∑_{k=0}^∞ (−1)ᵏ/√(2k + 1). Both series converge by the alternating series test. Thus, the complex series converges!

iii) ∑_{n=0}^∞ (z + 1)ⁿ.

Using the Root Test: L := lim_{n→∞} ⁿ√|(z + 1)ⁿ| = |z + 1|, which gives convergence for |z + 1| < 1, or (x + 1)² + y² < 1. Thus, the series converges for z inside the circle centered at (−1, 0) of radius 1, not including the boundary. 
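Part i) is a geometric series with ratio q = (2 + i)/3 and |q| = √5/3 < 1, so its partial sums should approach 1/(1 − q); a numerical sketch (the cutoff of 200 terms is an arbitrary choice):

```python
q = (2 + 1j) / 3

partial = sum(q ** n for n in range(200))  # S_200: terms shrink like (sqrt(5)/3)^n
closed_form = 1 / (1 - q)                  # geometric series sum for |q| < 1
print(partial, closed_form)
```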

1.7 Complex Power Series and Disk of Convergence


Recall from calculus a power series for a function of a real variable (centered at zero),

f(x) = ∑_{n=0}^∞ aₙxⁿ = ∑_{n=0}^∞ (f⁽ⁿ⁾(0)/n!) xⁿ,

or centered at a point x = a,

f(x) = ∑_{n=0}^∞ bₙ(x − a)ⁿ = ∑_{n=0}^∞ (f⁽ⁿ⁾(a)/n!)(x − a)ⁿ.

Definition 1.7.1 (Interval of Convergence) The values of x where the series converges.

 Example 1.33 Given the power series ∑ xⁿ. By the ratio test, ρ := |x^{n+1}/xⁿ| = |x|. For convergence we need ρ < 1. Thus, |x| < 1 is the interval of convergence. 

Before defining a complex power series, let’s discuss some facts for real power series (review
from Calculus):

1. A power series can be differentiated or integrated term by term. The resulting series con-
verges to the derivative or integral of the original function within the same interval of convergence.
 Example 1.34 Consider the function f(x) = e^x, which has power series ∑_{n=0}^∞ xⁿ/n!.

a) Differentiating term by term: ∑_{n=1}^∞ nxⁿ⁻¹/n! = ∑_{n=1}^∞ xⁿ⁻¹/(n − 1)! = (with k := n − 1) ∑_{k=0}^∞ xᵏ/k! = e^x.

b) Integrating term by term: ∑_{n=0}^∞ x^{n+1}/(n + 1)! = (with k := n + 1) ∑_{k=1}^∞ xᵏ/k! = e^x − 1. Note ∫ e^x dx = e^x + C.

c) The Interval of Convergence (I.O.C.) can be found using the ratio test: ρ := lim_{n→∞} |(x^{n+1}/(n + 1)!) · (n!/xⁿ)| = lim_{n→∞} |x|/(n + 1) → 0. Thus, the interval of convergence is all real numbers. 

2. Two power series can be added, subtracted, multiplied. The result converges in the common
interval of convergence.

3. One series can be substituted into another if the substituted series values are in the inter-
val of convergence of the series it is being plugged into.

4. The power series of a function is unique! Only one power series of the form ∑n an xn con-
verges to a given function.

R Properties 1.-4. still hold for complex power series!

Definition 1.7.2 A complex power series has the form ∑ₙ aₙzⁿ where z = x + iy. The real power series is just a special case of the complex power series when y = 0.

 Example 1.35 i) 1 + z + z²/2 + z³/6 + ... = ∑_{n=0}^∞ zⁿ/n!

ii) 1 − i(z + 1) + (i[z + 1])²/2 + (i[z + 1])³/6 + (i[z + 1])⁴/24 + ...

iii) ∑_{n=1}^∞ (z − 2 + 2i)ⁿ/(6ⁿn³). 

Definition 1.7.3 The complex analogue of the radius of convergence is the disk of convergence
(in the 2D complex plane).

 Example 1.36 Find the Disk of Convergence (D.O.C.) for each complex power series in the previous example.

i) For ∑_{n=0}^∞ zⁿ/n!, use the ratio test: ρ := lim_{n→∞} |(z^{n+1}/(n + 1)!) · (n!/zⁿ)| = lim_{n→∞} |z|/(n + 1) → 0. Thus, the series converges for all z in the complex plane. Therefore, the disk of convergence is the entire complex plane, C.

ii) For 1 + ∑_{n=1}^∞ (−1)ⁿ(i[z + 1])ⁿ/n, by the ratio test ρ = lim_{n→∞} |(i[z + 1])^{n+1}(−1)^{n+1}/(n + 1) · n/((i[z + 1])ⁿ(−1)ⁿ)| = lim_{n→∞} (n/(n + 1))|i(z + 1)| → |z + 1|. Thus, ρ < 1 if |z + 1| < 1, and the disk of convergence is the interior of the circle centered at (−1, 0) with radius 1.

[Plot: the disk of convergence centered at (−1, 0) with radius 1, passing through (−2, 0) and the origin.]

iii) For ∑_{n=1}^∞ (z − 2 + 2i)ⁿ/(6ⁿn³), use the ratio test: ρ = lim_{n→∞} |(z − 2 + 2i)^{n+1}/(6^{n+1}(n + 1)³) · 6ⁿn³/(z − 2 + 2i)ⁿ| = lim_{n→∞} (n³/(6(n + 1)³))|z − 2 + 2i| = |z − 2 + 2i|/6. Thus, the series converges if ρ < 1, or |z − 2 + 2i| < 6. The disk of convergence is centered at (2, −2) with radius 6. 

 Example 1.37 iv) 1 − z²/3! + z⁴/5! + .... General form: ∑_{n=0}^∞ (−1)ⁿz^{2n}/(2n + 1)!. Then by the ratio test, ρ = lim_{n→∞} |(−1)^{n+1}z^{2(n+1)}/(2n + 3)! · (2n + 1)!/((−1)ⁿz^{2n})| = lim_{n→∞} |z|²/((2n + 3)(2n + 2)) → 0. Thus, the disk of convergence is the entire complex plane, C.

v) ∑_{n=0}^∞ 2^{n+1}(z + i − 3)^{2(n+1)}. Then by the ratio test, ρ = lim_{n→∞} |2^{n+2}(z + i − 3)^{2(n+2)}/(2^{n+1}(z + i − 3)^{2(n+1)})| = |2(z + i − 3)²|. The disk of convergence is where |z + i − 3|² < 1/2, i.e., the disk centered at (3, −1) of radius 1/√2. 

1.8 Elementary Functions of Complex Numbers


In principle we can consider any function we have traditionally used (e.g., exponential, trig functions,
polynomials). In the previous section we saw complex polynomials in the form of power series.
We start this section with the next level of complexity, rational functions (ratios of polynomials):
f(z) = (a₀ + a₁z + a₂z² + ... + a_N z^N)/(b₀ + b₁z + b₂z² + ... + b_M z^M).
 Example 1.38 Given the complex function f(z) = (z³ − 1)/(z + 2), find f(i − 1).

Step 1: Substitute the value of z into the function:

f(i − 1) = ((i − 1)³ − 1)/((i − 1) + 2).

Step 2: Simplify:

f(i − 1) = ((i − 1)(i² − 2i + 1) − 1)/(i + 1) = ((i − 1)(−2i) − 1)/(i + 1) = (−2i² + 2i − 1)/(i + 1) = (1 + 2i)/(i + 1)

Step 3: Rationalize the denominator (multiply by the complex conjugate over itself):

f(i − 1) = (1 + 2i)/(i + 1) · (1 − i)/(1 − i) = (1 + 2i − i − 2i²)/2 = 3/2 + 1/2 i

More Examples in Class! 
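Since Python evaluates rational functions of complex numbers natively, Example 1.38 can be checked in a couple of lines (sketch):

```python
def f(z):
    # the rational function from Example 1.38
    return (z ** 3 - 1) / (z + 2)

value = f(-1 + 1j)
print(value)  # hand computation gave 3/2 + 1/2 i
```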
Recall the power series e^x = ∑_{n=0}^∞ xⁿ/n! for real numbers. Can we make a similar definition for the complex exponential function?
Definition 1.8.1 Using the definition of the power series expansion of the exponential function of a real variable, replace x with z:

e^z = ∑_{n=0}^∞ zⁿ/n! (1.12)

Where does it converge (Disk of Convergence)?

By the Ratio Test: ρ = lim_{n→∞} |(z^{n+1}/(n + 1)!) · (n!/zⁿ)| = lim_{n→∞} |z|/(n + 1) → 0. Thus, ρ < 1 for all z in the complex plane, and the disk of convergence must be all of C.
Operations with Complex Exponential Functions
 Example 1.39 i) e^{z1} e^{z2} = (1 + z1 + z1²/2 + ...)(1 + z2 + z2²/2 + ...) = 1 + (z1 + z2) + (z1 + z2)²/2 + ... = e^{z1+z2}

ii) Note: d/dz[zⁿ] = nzⁿ⁻¹ (just like normal derivatives of real numbers!)

d/dz[e^z] = d/dz[1 + z + z²/2 + ... + zⁿ/n! + ...] = 0 + 1 + z + z²/2 + ... + nzⁿ⁻¹/n! + ... = 1 + z + z²/2 + ... + zⁿ⁻¹/(n − 1)! + ... = e^z 
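Both properties carry over to cmath.exp, which can be checked numerically (a sketch with arbitrary complex exponents):

```python
import cmath

z1, z2 = 0.3 + 1.1j, -0.7 + 2.4j

product = cmath.exp(z1) * cmath.exp(z2)
combined = cmath.exp(z1 + z2)   # the law e^{z1} e^{z2} = e^{z1+z2}
print(product, combined)
```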

1.9 Euler’s Formula


Recall the Taylor expansions for the basic trig functions of one real variable:

sin(x) = x − x³/3! + x⁵/5! − x⁷/7! + ...

cos(x) = 1 − x²/2! + x⁴/4! − x⁶/6! + ...

Can we do something similar for complex trig functions? First, consider the complex Taylor series for the exponential function:

e^{iθ} = 1 + iθ + (iθ)²/2! + (iθ)³/3! + (iθ)⁴/4! + (iθ)⁵/5! + ...
 = 1 + iθ − θ²/2! − iθ³/3! + θ⁴/4! + iθ⁵/5! + ...
 = (1 − θ²/2! + θ⁴/4! + ...) + i(θ − θ³/3! + θ⁵/5! + ...)
 = cos(θ) + i sin(θ).

This result gives us Euler’s Formula!

eiθ = cos(θ ) + i sin(θ ) (1.13)

We have been using this formula since Section 1.3, but now we can see why it holds. We have also verified

z = x + iy = r(cos(θ) + i sin(θ)) = re^{iθ}. (1.14)
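The derivation can be replayed numerically by truncating the series: summing (iθ)ⁿ/n! term by term should reproduce cos θ + i sin θ (a sketch; 30 terms is far more than needed for θ of order 1):

```python
import math

theta = 0.7
series = sum((1j * theta) ** n / math.factorial(n) for n in range(30))

euler = complex(math.cos(theta), math.sin(theta))
print(series, euler)
```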



 Example 1.40 Find the values of 3e^{iπ/3}, e^{iπ/2}, 2e^{−iπ/6}, e^{2nπi}.

i) 3e^{iπ/3} ⇒ r = 3, θ = π/3. Recall from polar coordinates x = r cos(θ) = 3 cos(π/3) = 3(1/2) = 3/2 and y = r sin(θ) = 3 sin(π/3) = 3(√3/2) = 3√3/2. Thus, z = 3/2 + (3√3/2)i.

ii) e^{iπ/2} ⇒ r = 1, θ = π/2. Recall from polar coordinates x = r cos(θ) = cos(π/2) = 0 and y = r sin(θ) = sin(π/2) = 1. Thus, z = i.

iii) 2e^{−iπ/6} ⇒ r = 2, θ = −π/6. Recall from polar coordinates x = r cos(θ) = 2 cos(−π/6) = 2 cos(π/6) = 2(√3/2) = √3 and y = r sin(θ) = 2 sin(−π/6) = −2 sin(π/6) = 2(−1/2) = −1. Thus, z = √3 − i.

iv) e^{2nπi} ⇒ r = 1, θ = 2nπ. Recall from polar coordinates x = r cos(θ) = 1 and y = r sin(θ) = 0. Thus, z = 1 for all n. 

Recall that Euler’s Formula/Identity is especially useful for multiplying and dividing complex numbers:

z1 z2 = r1 e^{iθ1} r2 e^{iθ2} = r1 r2 e^{i(θ1+θ2)}

z1/z2 = (r1 e^{iθ1})/(r2 e^{iθ2}) = (r1/r2) e^{i(θ1−θ2)}
 Example 1.41 Evaluate (1 − i)²/(1 + i).

Step 1: Write in Polar Form: z1 = (1 − i)² = [√2 e^{−iπ/4}]² = 2e^{−iπ/2} and z2 = 1 + i = √2 e^{iπ/4}.

Step 2: Carry out the multiplication or division:

z1/z2 = (2e^{−iπ/2})/(√2 e^{iπ/4}) = (2/√2) e^{i(−π/2−π/4)} = √2 e^{−i3π/4}.

Step 3: Write in Standard Form, z = x + iy: z = √2 e^{−i3π/4} = −1 − i. 

 Example 1.42 Evaluate (2 + 2√3 i)(1 + i).

Step 1: Write in Polar Form: z1 = 2 + 2√3 i = 4e^{iπ/3} and z2 = 1 + i = √2 e^{iπ/4}.

Step 2: Carry out the multiplication or division:

z1 z2 = 4√2 e^{i(π/3+π/4)} = 4√2 e^{i7π/12}.

Step 3: Write in Standard Form, z = x + iy, if the angle is an easy one (e.g., from a 30-60-90 or 45-45-90 triangle). Here 7π/12 is not an angle that can be handled easily by hand, so we will leave the answer in polar form. 

 Example 1.43 Evaluate (1 + i)(1 − i).

Step 1: Write in Polar Form: z1 = 1 + i = √2 e^{iπ/4} and z2 = 1 − i = √2 e^{−iπ/4}.

Step 2: Carry out the multiplication or division:

z1 z2 = 2e^{i(π/4 − π/4)} = 2e^{i0} = 2.

Step 3: Write in Standard Form, z = x + iy, if the angle is an easy one. As a check, direct multiplication gives (1 + i)(1 − i) = 1 + i − i − i^2 = 2. 
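The three-step recipe above maps directly onto Python's cmath module; a sketch of Example 1.41 (cmath.polar and cmath.rect are the standard rectangular/polar conversions):

```python
import cmath

z1 = (1 - 1j) ** 2          # = 2 e^{-i pi/2}
z2 = 1 + 1j                 # = sqrt(2) e^{i pi/4}

r1, t1 = cmath.polar(z1)    # modulus and angle of z1
r2, t2 = cmath.polar(z2)

# Divide moduli and subtract angles, exactly as in Step 2 of Example 1.41.
quotient = cmath.rect(r1 / r2, t1 - t2)
print(quotient)             # numerically -1 - i
```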

1.10 Powers and Roots of Complex Numbers

We have clear definitions for powers and roots (fractional powers) of real numbers. Can we define
the analogous notions for complex numbers?
Given a complex number z, consider it raised to the nth power.

Definition 1.10.1 To raise a complex number to the nth power one needs to raise the modulus,
r, to the nth power and multiply the angle by n.
z^n = [re^{iθ}]^n = r^n e^{inθ} (1.15)

Another useful result following from this definition is De Moivre's Theorem:

Theorem 1.10.1 (De Moivre's Theorem) When r = 1, the nth power can be expressed in the following way:

[e^{iθ}]^n = (cos(θ) + i sin(θ))^n = cos(nθ) + i sin(nθ). (1.16)

 Example 1.44 Evaluate (1 + i)^4.

Step 1: Write in Polar Form: 1 + i = √2 e^{iπ/4}.

Step 2: Carry out the calculation using the definition:

(1 + i)^4 = [√2 e^{iπ/4}]^4 = (√2)^4 e^{iπ} = 4 [cos(π) + i sin(π)] = 4[−1 + 0i] = −4. (1.17)
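Definition 1.10.1 can be checked against direct repeated multiplication; a small sketch:

```python
import cmath

z = 1 + 1j
n = 4
r, theta = cmath.polar(z)               # r = sqrt(2), theta = pi/4
power = cmath.rect(r ** n, n * theta)   # z^n = r^n e^{i n theta}, Equation (1.15)

print(power)      # about -4, matching Example 1.44
print(z ** n)     # direct multiplication agrees
```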

Now we want to consider taking the nth root. Recall that taking the nth root of a real number is equivalent to raising that number to the 1/n power. Similarly, for a complex number, the nth root is z^{1/n}.

Definition 1.10.2 To take the nth root of a complex number one needs to take the nth root of the modulus, r, and divide the angle by n.

z^{1/n} = [re^{iθ}]^{1/n} = r^{1/n} e^{iθ/n} = r^{1/n} [cos(θ/n) + i sin(θ/n)]. (1.18)

 Example 1.45 Find the cube roots of 64. In other words, find z so that z^3 = 64. Let's attack this problem using the polar form of the complex number z = 64. Thus, r = 64 and θ = 0, 2π, 4π, ..., 2πn. Now, by the definition of the root: z^{1/3} = r^{1/3} e^{iθ/3} = r^{1/3} e^{i(2πn)/3} ⇒ r^{1/3} = 4, θ = 0, 2π/3, 4π/3, 6π/3, .... Observe that 6π/3 = 2π = 0 (on the complex plane). Thus, the three roots are: 4e^{i0} = 4, 4e^{i2π/3}, 4e^{i4π/3}, or z = 4, −2 + 2√3 i, −2 − 2√3 i.

As a check: (−2 + 2√3 i)^3 = (−2 + 2√3 i)(4 − 8√3 i − 12) = (−2 + 2√3 i)(−8 − 8√3 i) = 16 + 16√3 i − 16√3 i + 48 = 64.
1.10 Powers and Roots of Complex Numbers 41

[Plot: the three cube roots of 64 on the complex plane: (4, 0), (−2, 2√3), (−2, −2√3).]
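The recipe of Definition 1.10.2 (take the nth root of the modulus, step the angle by 2π/n) can be sketched in Python; nth_roots is our own helper name:

```python
import cmath
import math

def nth_roots(z, n):
    """All n distinct nth roots: r^{1/n} e^{i(theta + 2*pi*k)/n}, k = 0..n-1."""
    r, theta = cmath.polar(z)
    return [cmath.rect(r ** (1.0 / n), (theta + 2 * math.pi * k) / n)
            for k in range(n)]

for w in nth_roots(64, 3):
    print(w, w ** 3)    # each cube returns (numerically) to 64
```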

R Remember that the complex roots of a polynomial with real coefficients always come in conjugate pairs!

 Example 1.46 Find the 4th roots of −81. In other words, find z so that z^4 = −81. Use the polar form of the complex number z = −81. Thus, r = 81 and θ = π, 3π, 5π, 7π, ..., π + 2πn.

Now, by the definition of the root: z^{1/4} = r^{1/4} e^{iθ/4} = r^{1/4} e^{i(π + 2πn)/4} ⇒ r^{1/4} = 3, θ = π/4, 3π/4, 5π/4, 7π/4, .... Observe that 9π/4 = π/4 (on the complex plane).

Thus, the four roots are: 3e^{iπ/4}, 3e^{i3π/4}, 3e^{i5π/4}, 3e^{i7π/4}, or z = 3√2/2 + i 3√2/2, −3√2/2 + i 3√2/2, −3√2/2 − i 3√2/2, 3√2/2 − i 3√2/2.

As a check: (3√2/2 + i 3√2/2)^4 = (18/4 + (36/4)i − 18/4)(18/4 + (36/4)i − 18/4) = (9i)(9i) = −81.

[Plot: the four 4th roots of −81 on the complex plane: (3√2/2, 3√2/2), (−3√2/2, 3√2/2), (−3√2/2, −3√2/2), (3√2/2, −3√2/2).]



 Example 1.47 Find and plot the values of (−64)^{1/6}.

Thus, we need to find r, θ such that (re^{iθ})^6 = −64. Consider the polar form of −64, where r = 64 and θ = π, 3π, 5π, 7π, 9π, 11π.

Now, by the definition of the root: z^{1/6} = r^{1/6} e^{iθ/6} = r^{1/6} e^{i(π + 2πn)/6} ⇒ r^{1/6} = 2, θ = π/6, π/2, 5π/6, 7π/6, 3π/2, 11π/6, .... Observe that 13π/6 = π/6 (on the complex plane).

Thus, the six roots are: 2e^{iπ/6}, 2e^{iπ/2}, 2e^{i5π/6}, 2e^{i7π/6}, 2e^{i3π/2}, 2e^{i11π/6}, or z = √3 + i, 2i, −√3 + i, −√3 − i, −2i, √3 − i.
[Plot: the six 6th roots of −64 on the complex plane: (√3, 1), (0, 2), (−√3, 1), (−√3, −1), (0, −2), (√3, −1).]

1.11 The Exponential and Trigonometric Functions


Recall from a previous section the power series expansion for e^z = 1 + z + z^2/2! + ... + z^n/n! + .... This can be written in another form:

e^z = e^{x+iy} = e^x e^{iy} = e^x [cos(y) + i sin(y)].

This new form using Euler's Formula may be easier to use in some instances.
 Example 1.48 i) e^{3+iπ/2} = e^3 e^{iπ/2} = e^3 [cos(π/2) + i sin(π/2)] = e^3 [0 + i] = e^3 i.

ii) e^{3 ln 3 − iπ/2} = e^{ln 27} e^{−iπ/2} = 27 [cos(−π/2) + i sin(−π/2)] = 27[0 − i] = −27i. 

Recall Euler’s Formula:


e^{iθ} = cos(θ) + i sin(θ) (1.19)
e^{−iθ} = cos(θ) − i sin(θ) (1.20)

Subtracting (1.20) from (1.19):

e^{iθ} − e^{−iθ} = 2i sin(θ) ⇒ sin(θ) = (e^{iθ} − e^{−iθ}) / (2i).

Adding (1.19) and (1.20) gives:

e^{iθ} + e^{−iθ} = 2 cos(θ) ⇒ cos(θ) = (e^{iθ} + e^{−iθ}) / 2.
These expressions hold for real θ, but can be extended to all complex numbers z by replacing θ with z.

Definition 1.11.1 (Complex Trigonometric Functions)

sin(z) = (e^{iz} − e^{−iz}) / (2i),  cos(z) = (e^{iz} + e^{−iz}) / 2. (1.21)

The remaining trigonometric functions can be derived using the usual relations:

tan(z) = sin(z)/cos(z), cot(z) = cos(z)/sin(z), csc(z) = 1/sin(z), sec(z) = 1/cos(z). (1.22)

 Example 1.49 Find sin(i).

Using the definition:

sin(i) = (e^{i(i)} − e^{−i(i)}) / (2i) = (e^{−1} − e^{1}) / (2i) = i(e − e^{−1})/2 ≈ 1.1752i.
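Definition 1.11.1 agrees with the library implementation of the complex sine; a quick sketch:

```python
import cmath

z = 1j
# sin(z) = (e^{iz} - e^{-iz}) / (2i), Definition 1.11.1
by_definition = (cmath.exp(1j * z) - cmath.exp(-1j * z)) / 2j
print(by_definition)    # about 1.1752i, with magnitude larger than 1
print(cmath.sin(z))     # the built-in complex sine agrees
```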


R One interesting difference from the real case is the range of sine and cosine. For real x, |sin(x)| ≤ 1 and |cos(x)| ≤ 1. This bound does not hold for the complex forms of sine and cosine, as seen in the previous example.

We can recover some of the familiar trig identities for the complex versions.

 Example 1.50 Does sin^2(z) + cos^2(z) = 1?

Check: sin^2(z) = [(e^{iz} − e^{−iz})/(2i)]^2 = (e^{2iz} − 2 + e^{−2iz}) / (−4).

Check: cos^2(z) = [(e^{iz} + e^{−iz})/2]^2 = (e^{2iz} + 2 + e^{−2iz}) / 4.

So, sin^2(z) + cos^2(z) = [−(e^{2iz} − 2 + e^{−2iz}) + (e^{2iz} + 2 + e^{−2iz})] / 4 = 4/4 = 1. 

 Example 1.51 Show the double angle formula: sin(2z) = 2 cos(z) sin(z).

sin(2z) = (e^{2iz} − e^{−2iz}) / (2i) = (e^{iz} + e^{−iz})(e^{iz} − e^{−iz}) / (2i) = 2 · [(e^{iz} + e^{−iz})/2] · [(e^{iz} − e^{−iz})/(2i)] = 2 cos(z) sin(z). 

What about the derivatives of the sine and cosine? Are they the same or very different?

 Example 1.52 i) d/dz sin(z) = d/dz [(e^{iz} − e^{−iz})/(2i)] = (ie^{iz} + ie^{−iz})/(2i) = (e^{iz} + e^{−iz})/2 = cos(z). Same!

ii) d/dz cos(z) = d/dz [(e^{iz} + e^{−iz})/2] = (ie^{iz} − ie^{−iz})/2 = −(e^{iz} − e^{−iz})/(2i) = −sin(z). Same! 

1.12 Hyperbolic Functions


What do sine and cosine look like when a complex number is purely imaginary, z = iy?

sin(iy) = (e^{i(iy)} − e^{−i(iy)}) / (2i) = (e^{−y} − e^{y}) / (2i) = i (e^{y} − e^{−y}) / 2
cos(iy) = (e^{i(iy)} + e^{−i(iy)}) / 2 = (e^{−y} + e^{y}) / 2 = (e^{y} + e^{−y}) / 2.
These are special functions and come up when solving dynamic problems (differential equations,
more in Math Methods II!).
Definition 1.12.1 (Hyperbolic Trig Functions)

sinh(z) = (e^z − e^{−z}) / 2,  cosh(z) = (e^z + e^{−z}) / 2. (1.23)

Similarly,

tanh(z) = sinh(z)/cosh(z), coth(z) = cosh(z)/sinh(z), sech(z) = 1/cosh(z), csch(z) = 1/sinh(z). (1.24)

Thus, observe that sin(iy) = i sinh(y) and cos(iy) = cosh(y). Now consider some trig identities
with hyperbolic trig functions.
 Example 1.53 Show: cosh^2(z) − sinh^2(z) = 1.

Using the definition, cosh^2(z) = [(e^z + e^{−z})/2]^2 = (e^{2z} + 2 + e^{−2z}) / 4.

Also, using the definition: sinh^2(z) = [(e^z − e^{−z})/2]^2 = (e^{2z} − 2 + e^{−2z}) / 4. So, cosh^2(z) − sinh^2(z) = 4/4 = 1.


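The identity can be spot-checked numerically; a sketch with hand-rolled hyperbolic functions (named sinh_ and cosh_ to avoid shadowing the math module's versions):

```python
import math

def sinh_(z):
    return (math.exp(z) - math.exp(-z)) / 2

def cosh_(z):
    return (math.exp(z) + math.exp(-z)) / 2

for z in (-2.0, 0.0, 0.5, 3.0):
    # The cross terms cancel, leaving 4/4 = 1 for every z.
    print(cosh_(z) ** 2 - sinh_(z) ** 2)
```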
We can also consider the derivatives of the hyperbolic trig functions:

 Example 1.54 i) d/dz sinh(z) = d/dz [(e^z − e^{−z})/2] = (e^z + e^{−z})/2 = cosh(z).

ii) d/dz cosh(z) = d/dz [(e^z + e^{−z})/2] = (e^z − e^{−z})/2 = sinh(z). 

R Observe that there is no sign change when taking the derivative of the hyperbolic cosine. This is in contrast to the ordinary trig functions, where d/dz cos(z) = −sin(z).

Exercise 1.1 Why are the complex roots of a quadratic equation with real coefficients always found in conjugate pairs?

Hint: Look at the Quadratic Formula, which is valid for any quadratic equation. 
II
Part Two: Linear Algebra

2 Fundamentals of Linear Algebra


2.1 Systems of Linear Equations
2.2 Row Reduction and Echelon Forms
2.3 Determinants and Cramer’s Rule
2.4 Vectors
2.5 Lines, Planes, and Geometric Applications
2.6 Matrix Operations
2.7 Linear Combinations, Functions, and Operators
2.8 Matrix Operations and Linear Transformations
2.9 Linear Dependence and Independence
2.10 Special Matrices
2.11 Eigenvalues and Eigenvectors
2.12 Diagonalization
2. Fundamentals of Linear Algebra

Linear Algebra basically refers to linear relationships between objects. Can we think of examples
of linear functions we have seen in the past?
Definition 2.0.2 A function f (x) is linear if:

1. f (x + y) = f (x) + f (y).

2. f (cx) = c f (x) for any real number c.

Linear algebra takes this idea to the next level of abstraction by introducing the idea of a linear operation. The idea is to take a system of linear equations and solve them simultaneously using objects called matrices. This section will start by introducing the relationship between matrices and systems of linear equations. After the basic definitions are known, we will begin to explore how to work with these objects to solve real problems.

2.1 Systems of Linear Equations


First we must define what is meant by a single Linear Equation.
Definition 2.1.1 (Linear Equation) A linear equation is any equation of the form

a1 x1 + a2 x2 + ... + an xn = b,

where a1 , ..., an , b are constant real numbers and x1 , ..., xn are the unknown variables.

 Example 2.1 Are the following equations linear?

i) 4x1 − 5x2 + 2 = x1

If we rearrange we find: 3x1 − 5x2 = −2, so yes!


48 Chapter 2. Fundamentals of Linear Algebra

ii) x2 = 2(√6 − x1) + x3

If we rearrange we find: 2x1 + x2 − x3 = 2√6, so yes! 

Sometimes it is easier to check whether an equation falls into one of these common nonlinear cases, which immediately rules out linearity:

i) Products of Variables: x1 x2 + x3 = 4 is Nonlinear

ii) Trig Functions: sin(x1) + x2 = 2 is Nonlinear

iii) Powers/Roots: x1^2, x1^{1/2} are Nonlinear.


Definition 2.1.2 (A System of Linear Equations) A system of linear equations is a collection of
one or more linear equations involving the same set of variables (e.g., x1 , ..., xn ).

Definition 2.1.3 (Solution of a Linear System) A list (s1 , s2 , ..., sn ) of numbers that makes
each equation in the system true when the values s1 , s2 , ..., sn are substituted for x1 , x2 , ..., xn
respectively.

 Example 2.2 Possible solutions for two equations in two variables:

i) One Unique Solution (Consistent):


x1 + x2 = 10
−x1 + x2 = 0
ii) No Solution (Inconsistent):
x1 − 2x2 = −3
2x1 − 4x2 = 8
iii) Infinitely Many Solutions (Consistent):
x1 + x2 = 3
−2x1 − 2x2 = −6


Definition 2.1.4 (Equivalent Systems) Two linear systems with the same solution set.

2.1.1 Matrix Notation


Given a linear system in Standard Form:
x1 − 2x2 = −1
−x1 + 3x2 = 3
we can define two associated matrices. The first is the coefficient matrix, made up of the coefficients of each variable:

[ 1 −2 ]
[ −1 3 ]

and the augmented matrix, composed of the coefficients and the righthand side:

[ 1 −2 | −1 ]
[ −1 3 | 3 ]

2.1.2 Elementary Row Operations


There are three basic operations we can perform on an augmented matrix without changing its
solution set:
1. (Replacement) Add a multiple of one row to another row.
2. (Interchange) Switch two rows.
3. (Scaling) Multiply all entries in a row by a nonzero constant.
These three operations are your only valid tools in attacking problems involving matrices. We
will learn more advanced methods throughout the course, but for now we will stick with these three
rules.
Definition 2.1.5 (Row Equivalent Matrices) Two matrices where one matrix can be transformed
into the other matrix by a sequence of elementary row operations.

Fact: If the augmented matrices of two linear systems are row equivalent, then the
two systems have the same solution set.

Definition 2.1.6 (Size of a Matrix) We say a matrix with m rows and n columns is an m × n
matrix. Thus, a 2 × 3 matrix has two rows and three columns. In fact the Matrix A is composed
of elements (numbers) ai j where i corresponds to the row and j the column.

 Example 2.3 Use the three elementary row operations to solve the following linear system:

x1 − 2x2 + x3 = 0
2x2 − 8x3 = 8
−4x1 + 5x2 + 9x3 = −9,

Step 1: Put the equations into Standard Form


Step 2: Find the augmented matrix for the system of linear equations
Step 3: Row reduce using elementary row operations
Step 4: Back Substitute the values into the system to solve.

Solution: (29,16,3)

Final Step: Check by plugging the solution back into the original system. 
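Steps 1-4 can be sketched as code; the following is a minimal Gaussian-elimination solver (solve_linear_system is our own helper, not from the text) applied to the augmented matrix of Example 2.3:

```python
def solve_linear_system(aug):
    """Gaussian elimination with back substitution on an n x (n+1) augmented matrix."""
    n = len(aug)
    A = [list(map(float, row)) for row in aug]
    for i in range(n):
        # Interchange: bring a nonzero pivot into row i (fails if the system is singular).
        pivot = next(r for r in range(i, n) if abs(A[r][i]) > 1e-12)
        A[i], A[pivot] = A[pivot], A[i]
        # Replacement: add a multiple of the pivot row to clear entries below it.
        for r in range(i + 1, n):
            m = A[r][i] / A[i][i]
            A[r] = [a - m * b for a, b in zip(A[r], A[i])]
    # Back substitute from the last equation upward.
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (A[i][n] - sum(A[i][j] * x[j] for j in range(i + 1, n))) / A[i][i]
    return x

# Augmented matrix for Example 2.3.
aug = [[1, -2, 1, 0],
       [0, 2, -8, 8],
       [-4, 5, 9, -9]]
print(solve_linear_system(aug))   # [29.0, 16.0, 3.0]
```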

2.1.3 Fundamental Questions In Linear Algebra


1. Is the system consistent? (Does a solution exist?)

2. If a solution exists, is it unique? (Is there one and only one solution)?
These questions are answered during the course of elementary row operations.

If the augmented matrix ever has a row that is all zeros except for a nonzero last element, [0 0 ... 0 | b] with b ≠ 0, then the system is inconsistent and there is no solution!

More on uniqueness in the next section (hint: it will have to do with the concept of pivot variables).

2.2 Row Reduction and Echelon Forms


Definition 2.2.1 (Echelon Form). A matrix is in echelon form if:

1. All nonzero rows are above any rows of all zeros.


2. Each leading entry (i.e., the leftmost nonzero entry, also called the pivot) of a row is in a column to the right of the leading entry of the row above it.
3. All entries in a column below a leading entry are zero.

Definition 2.2.2 (Reduced Echelon Form). A matrix is in reduced echelon form if in addition to
1.-3.:

4. The leading entry in each nonzero row is 1.


5. Each leading 1 is the only nonzero entry in its column.

 Example 2.4 i) Row reduce the following matrix to echelon form and locate the pivot columns

(columns which contain a pivot).


 
[ 0 −3 −6 4 9 ]
[ −1 −2 −1 3 1 ]
[ −2 −3 0 3 −1 ]
[ 1 4 5 −9 −7 ]

Row reduce to see that the pivot columns are 1, 2, and 4. There can be no more than 1 pivot in any
row.

ii) Row reduce the following matrix to reduced echelon form.


 
[ 0 −3 −6 6 4 −5 ]
[ 3 −7 8 −5 8 9 ]
[ 3 −9 12 −9 6 12 ]


2.2.1 Solutions of Linear Systems


Definition 2.2.3 A basic variable is any variable that corresponds to a pivot column in the
augmented matrix of a system.

A free variable is any variable that is not a basic variable.

The final step in solving any linear system is writing all the basic variables in terms of any free
variables.
 Example 2.5 Consider the reduced augmented matrix

[ 1 6 0 3 0 | 0 ]
[ 0 0 1 −8 0 | 5 ]
[ 0 0 0 0 1 | 7 ]

which corresponds to the general solution

x1 = −6x2 − 3x4
x2 is free
x3 = 5 + 8x4
x4 is free
x5 = 7.


Definition 2.2.4 The general solution of a system of linear equations provides a parametric
description of the solution set.

Thinking Question: The above example has infinitely many solutions. Why is this true?
Definition 2.2.5 The Transpose of a matrix, denoted A^T, is the matrix formed when the rows and columns of A are switched: (A^T)_{ij} = A_{ji}. Thus the transpose of an m × n matrix is an n × m matrix.
 
 Example 2.6 Given the matrix A =
[ 1 2 3 ]
[ 4 5 6 ],
find its transpose.

A^T =
[ 1 4 ]
[ 2 5 ]
[ 3 6 ]. 

Definition 2.2.6 (Rank of a Matrix) The number of nonzero rows remaining when a matrix has
been row reduced is the rank of the matrix.
If we know the rank of a matrix, we know how many solutions to expect. Suppose M is the m × n coefficient matrix and A is the row-reduced augmented matrix, and consider the general problem of solving m equations in n unknowns:
1. If rank(M) < rank(A), the equations are inconsistent and there is no solution.
2. If rank(M) = rank(A) = n (the number of unknowns), there is exactly one solution.
3. If rank(M) = rank(A) = R < n, then there are R basic variables and n − R free variables, resulting in infinitely many solutions.
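The rank test above can be sketched with a small forward-elimination routine (rank is our own helper name); applied to the inconsistent system of Example 2.2 ii), it exhibits rank(M) < rank(A):

```python
def rank(mat, tol=1e-9):
    """Row reduce and count the nonzero rows left over (Definition 2.2.6)."""
    A = [list(map(float, row)) for row in mat]
    rows, cols = len(A), len(A[0])
    r = 0                                   # next pivot row to fill
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if abs(A[i][c]) > tol), None)
        if pivot is None:
            continue                        # no pivot in this column
        A[r], A[pivot] = A[pivot], A[r]
        for i in range(r + 1, rows):
            m = A[i][c] / A[r][c]
            A[i] = [a - m * b for a, b in zip(A[i], A[r])]
        r += 1
    return r

# Example 2.2 ii): coefficient matrix M vs. augmented matrix.
M   = [[1, -2], [2, -4]]
aug = [[1, -2, -3], [2, -4, 8]]
print(rank(M), rank(aug))   # 1 2 -> rank(M) < rank(A): inconsistent, no solution
```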

2.3 Determinants and Cramer’s Rule


Recall in the last lecture we introduced the concept of a matrix to provide a convenient and
organized way to solve a system of linear equations. The main concepts from the last section were
the elementary row operations and deciding whether a matrix has none, one or infinitely many
solutions. An arbitrary m × n matrix is just a display of coefficients for m equations in n unknowns.
Through this next section we focus on a special type of matrix called a square matrix. In this case
the matrix has the exact same number of rows and columns (e.g., n × n).
Now that we have these matrices, we want to use them to do more advanced things than simply solving linear systems. The first basic quantity associated to any square matrix that will be useful going forward is the determinant. The determinant can be thought of as a measure of volume in some sense (more to come on this later). The loose analog for real numbers is the absolute value, | · |.
Start with the simplest case of a 2 × 2 matrix of the form

A =
[ a b ]
[ c d ]. (2.1)

Definition 2.3.1 The determinant of a 2 × 2 matrix A, denoted |A|, is defined to be

|A| = | a b ; c d | = ad − bc. (2.2)

Note that the determinant of a 1 × 1 matrix, A = [a], is a trivial extension of this idea: |A| = a.
 
 Example 2.7 i) Find the determinant of A =
[ 1 2 ]
[ 3 4 ].

det(A) = 1(4) − 2(3) = 4 − 6 = −2.

ii) Find the determinant of A =
[ 1 2 ]
[ 2 4 ].

det(A) = 1(4) − 2(2) = 4 − 4 = 0. 
In order to introduce a formula for the determinant of an arbitrary n × n matrix we first must
introduce some notation. Given a matrix A we can define a sub-matrix Ai j where the ith row and
jth column have been deleted.
 Example 2.8

A =
[ 1 2 3 4 ]
[ 5 6 7 8 ]
[ 9 10 11 12 ]
[ 13 14 15 16 ],

A23 =
[ 1 2 4 ]
[ 9 10 12 ]
[ 13 14 16 ]. 


Definition 2.3.2 (Determinant of an n × n Matrix) For n ≥ 2, the determinant of an n × n matrix A is given by

det(A) = a11 det(A11) − a12 det(A12) + ... + (−1)^{1+n} a1n det(A1n) = Σ_{j=1}^{n} (−1)^{1+j} a1j det(A1j). (2.3)

This process is called Cofactor Expansion.

Thus, for an n × n matrix we keep applying cofactor expansion until all the remaining determi-
nants are 2 × 2.
 
 Example 2.9 i) Compute the determinant of A =
[ 1 2 0 ]
[ 3 −1 2 ]
[ 2 0 1 ].

Solution: 1

ii) Compute the determinant of A =
[ 1 0 0 ]
[ 0 2 0 ]
[ 0 0 3 ].

Solution: 6 

R Cofactor expansion can actually be done about any row or column, not just the first one:

|A| = det(A) = Σ_{j=1}^{n} (−1)^{i+j} a_{ij} det(A_{ij})  (expand about row i)
             = Σ_{i=1}^{n} (−1)^{i+j} a_{ij} det(A_{ij})  (expand about column j). (2.4)

 
 Example 2.10 i) Compute the determinant of A =
[ 1 2 0 ]
[ 3 −1 2 ]
[ 2 0 1 ]
using cofactor expansion about the third column.

Solution: 1
 
ii) Compute the determinant of A =
[ 1 2 3 4 ]
[ 0 2 1 5 ]
[ 0 0 2 1 ]
[ 0 0 3 5 ].

Solution: 14 
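Cofactor expansion translates directly into a short recursive routine; a sketch (minor and det are our own helper names) that reproduces the determinants of Examples 2.9 and 2.10:

```python
def minor(A, i, j):
    """The sub-matrix A_ij with row i and column j deleted (Example 2.8)."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(A) if r != i]

def det(A):
    """Cofactor expansion about the first row (Definition 2.3.2)."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(len(A)))

print(det([[1, 2, 0], [3, -1, 2], [2, 0, 1]]))   # 1
print(det([[1, 2, 3, 4],
           [0, 2, 1, 5],
           [0, 0, 2, 1],
           [0, 0, 3, 5]]))                       # 14
```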

2.3.1 Special Case: Upper and Lower Triangular Matrices

Definition 2.3.3 An n × n matrix A is said to be upper triangular if all the elements below the main diagonal, a_{ij} for i > j, are zero. An n × n matrix A is said to be lower triangular if all the elements above the main diagonal, a_{ij} for i < j, are zero. Finally, a matrix is said to be diagonal if the only nonzero elements are in positions (i, i) for i = 1, ..., n.

FACT: If a matrix A is one of the three cases of triangular matrices (e.g., upper, lower, diagonal),
then the determinant is just the product of the diagonal elements.
 
 Example 2.11 i) Compute the determinant of A =
[ 1 2 0 ]
[ 0 −1 2 ]
[ 0 0 1 ].

Solution: −1

ii) Compute the determinant of A =
[ 2 3 4 5 ]
[ 0 1 2 3 ]
[ 0 0 −3 5 ]
[ 0 0 0 4 ].

Solution: −24 

2.3.2 Properties of Determinants

Facts:
1. If each element of one row or one column of a determinant is multiplied by a number k, the value
of the determinant is multiplied by k.

2. The value of a determinant is zero if one of the following occurs:


(a) All elements of one row or column are zero.
(b) Two rows or two columns are identical.
(c) Two rows or two columns are proportional.

3. If two rows or two columns of a determinant are interchanged, the value of the determi-
nant changes sign.

4. The value of a determinant is unchanged if:


(a) Rows are written as columns and columns as rows (e.g., det(A) = det(A^T)).
(b) We add to each element of one row, k times the corresponding element of another row, where k
is any number (and a similar statement for columns).
 
 Example 2.12 i) Find the determinant of A =
[ 1 2 3 4 ]
[ 0 5 0 0 ]
[ 2 7 6 10 ]
[ 2 9 7 11 ].

Solution: −10.

ii) Find the determinant of A =
[ 2 4 6 ]
[ 5 6 7 ]
[ 7 6 10 ].

Solution: −40.

iii) Find the determinant of A =
[ 2 3 0 1 ]
[ 4 7 0 3 ]
[ 7 9 −2 4 ]
[ 1 2 0 4 ].

Solution: −12. 

Example 2.13 (Application) Find the equation of a plane through (0, 0, 0), (1, 0, 1), (1, 2, 0). Recall the equation of a plane has the form ax + by + cz + d = 0. Treat a, b, c, d as the unknowns. We can set up the following determinant problem to find the equation of the plane:

| x y z 1 |
| 0 0 0 1 |
| 1 0 1 1 | = 0. (2.5)
| 1 2 0 1 |

Using cofactor expansion about the first row and simplifying, we find −2x + y + 2z = 0 is the equation for the plane. 

2.3.3 Cramer’s Rule


Recall the original purpose of linear algebra was to solve linear systems more efficiently. Can
we use the determinant to solve a system of linear equations? The answer is yes and the solution
method is called Cramer's Rule.

Theorem 2.3.1 (Cramer's Rule) Given the following linear system with n unknowns x1, ..., xn and coefficients a_{ij}:

a11 x1 + ... + a1n xn = b1
...
an1 x1 + ... + ann xn = bn.

Also, define D := det(A), the determinant of the coefficient matrix consisting of the a_{ij}, and D_j := det(A_j), where A_j is the matrix whose jth column is replaced by the righthand side b1, ..., bn. Using these quantities we can find the solution:

x1 = D1/D, ..., xj = Dj/D, ..., xn = Dn/D. (2.6)

Observe that if the determinant D = 0, Cramer's Rule does not apply; the system then has either no solution or infinitely many solutions.

 Example 2.14 Use Cramer's Rule to solve the following linear systems:

i)
x + 2y = 1
−x + 3y = 4

Solution: x = −1, y = 1.

ii)
x + y + z = 3
3y − z = −2
2x − z = 0

Solution: x = 1, y = 0, z = 2. 
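Cramer's Rule is short to sketch in code; det and cramer below are our own helper names (det repeats the cofactor expansion of Definition 2.3.2), and the two calls reproduce the systems of Example 2.14:

```python
def det(A):
    # Cofactor expansion about the first row.
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] *
               det([row[:j] + row[j + 1:] for row in A[1:]])
               for j in range(len(A)))

def cramer(A, b):
    """Solve Ax = b via x_j = D_j / D, where A_j has column j replaced by b."""
    D = det(A)
    if D == 0:
        raise ValueError("D = 0: Cramer's Rule does not apply")
    return [det([row[:j] + [b[i]] + row[j + 1:] for i, row in enumerate(A)]) / D
            for j in range(len(A))]

print(cramer([[1, 2], [-1, 3]], [1, 4]))                        # [-1.0, 1.0]
print(cramer([[1, 1, 1], [0, 3, -1], [2, 0, -1]], [3, -2, 0]))  # x = 1, y = 0, z = 2
```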

2.4 Vectors
This section should be a review from calculus or a brief introduction to vectors if you are unfamiliar.
In terms of matrices, a vector is a matrix with only one column.
Definition 2.4.1 An n-dimensional vector has the form

u =
[ u1 ]
[ u2 ]
[ ⋮ ]
[ un ]

where ui is referred to as the ith component of the vector u.

A vector describes physical quantities such as velocity which require a magnitude and a
direction.
Definition 2.4.2 The magnitude or norm of a vector is denoted by

|u| = ‖u‖ = √(u1^2 + ... + un^2).

Definition 2.4.3 (Unit Vector) There is a special kind of vector, which will be useful in the coming lectures, that has length 1. Any vector with this property is called a unit vector. If a vector v does not have unit length, it can easily be scaled to have length 1 by dividing each element by its norm |v|: v̂ = v/|v|.

In 2D, we have two basis vectors from which all other vectors can be constructed: î = [1, 0], ĵ = [0, 1]. In 3D, we have three basis vectors: î = [1, 0, 0], ĵ = [0, 1, 0], k̂ = [0, 0, 1].
This idea can be extended to any dimension n, resulting in n basis vectors êi having a zero in
every component except for the ith component which is 1.

Definition 2.4.4 (Zero Vector) There is another special kind of vector which has magnitude
zero. The zero vector 0 = [0, 0, ..., 0].

There are two basic operations among vectors:


1. Vector addition, u + v = (u1 + v1 , u2 + v2 , ..., un + vn )

2. Scalar Multiplication, cu = (cu1, cu2, ..., cun).



R Vector addition is commutative, in other words u + v = v + u and associative u + (v + w) =


(u + v) + w.

 
 Example 2.15 Let u = [1, 2]. Express u, 2u, and −(3/2)u on a graph. 

2.4.1 Scalar Product


Unlike in the previous section on complex numbers, the notion of multiplication of vectors is not well defined. Instead of a traditional product of vectors, the useful notion of a scalar product was introduced.
Definition 2.4.5 (Scalar Product) Given two vectors of the same length, u, v, the scalar product
is defined as

u · v = u1 v1 + u2 v2 + ... + un vn .

The scalar product has the following properties:


1. u · (v + w) = u · v + u · w
2. (u + v) · w = u · w + v · w.

 Example 2.16 i) Let u = [1, 2, 3] and v = [−1, 0, 1]. Then u · v = 1(−1) + 2(0) + 3(1) = 2.

ii) Let u = [1, 4] and v = [−1, −2]. Then u · v = 1(−1) + 4(−2) = −9.

iii) In 3D, î · ĵ = 1(0) + 0(1) + 0(0) = 0. 

The scalar product is a very useful quantity: it gives information about the angle between the two vectors involved and their magnitudes. This will be crucial for applications that require one to determine when vectors are parallel or perpendicular.

Theorem 2.4.1 Given two vectors of equal length u, v, then the scalar product can also be
expressed as

u · v = |u||v| cos(θ), (2.7)

where θ is the angle between the two vectors. In particular we see that if the two vectors are
parallel (e.g., θ = 0), then the scalar product is just the product of the magnitudes (and positive!).
If the two vectors are perpendicular (e.g., θ = π/2), then the scalar product is zero.

Notice that in the previous example all the unit basis vectors in any dimension are perpendicular.

R If two vectors are parallel, then every component of one of the vectors is proportional to the
same component in the other vector (e.g., u1 /v1 = u2 /v2 = u3 /v3 ). In other words, they are
scalar multiples of each other. For example, u = [1, 1] and v = [2, 2].

 Example 2.17 i) Take the scalar product of a vector with itself: v · v = |v|^2 cos(0) = |v|^2. In particular, we find an alternate definition for the norm of a vector: |v| = √(v · v).

ii) Find the angle between u = [1, 0] and v = [1, 1]. Using the alternate definition of the scalar product, u · v = |u||v| cos(θ). Plugging in the appropriate values we find 1 = √2 cos(θ). Thus, cos(θ) = 1/√2 and therefore θ = π/4. 
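Theorem 2.4.1 gives a direct recipe for the angle between two vectors; a sketch (angle_between is our own helper name):

```python
import math

def angle_between(u, v):
    """Solve u . v = |u||v| cos(theta) for theta (Theorem 2.4.1)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return math.acos(dot / (norm_u * norm_v))

print(angle_between([1, 0], [1, 1]))        # pi/4, matching Example 2.17 ii)
print(angle_between([1, 0, 0], [0, 1, 0]))  # pi/2: the basis vectors are orthogonal
```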

2.4.2 Vector (Cross) Product

In addition to the scalar product we can define another form of product that results in a vector.

Definition 2.4.6 (Cross Product) To find a vector which is perpendicular to two given three-dimensional vectors, compute w = u × v:

        | î  ĵ  k̂ |
u × v = | ux uy uz | = î(uy vz − uz vy) + ĵ(uz vx − ux vz) + k̂(ux vy − uy vx). (2.8)
        | vx vy vz |

Similar to the scalar product there is an alternate way to find the magnitude of the cross product
|u × v| = |u||v| sin(θ ) where θ is the angle between u and v.

The resulting vector w is perpendicular to the plane containing u and v. Its direction
is determined by the “righthand rule".

 Example 2.18 Let u = [0, 1, 0] and v = [1, 0, 0]. Find u × v.

Solution: u × v = [0, 0, 1]. 

There are a few special cases of the cross product we should highlight before moving on:
1. If u × v = 0, then u and v are parallel or anti-parallel (opposite directions).
2. |u × u| = |u|^2 sin(0) = 0, since the angle between u and itself is θ = 0.
3. u × v = −v × u.
4. u × (v + w) = u × v + u × w.

 Example 2.19 i) Let u = 2î − ĵ + k̂ and v = 3ĵ + k̂. Find u × v.

Solution: [−4, −2, 6].

ii) Let u = [0, 3, −1] and v = [1, 2, 3]. Find u × v.

Solution: [11, −1, −3]. 
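The component formula in Definition 2.4.6 is easy to transcribe; a sketch reproducing Example 2.19 (cross is our own helper name):

```python
def cross(u, v):
    """Component formula for u x v from Definition 2.4.6."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

print(cross([2, -1, 1], [0, 3, 1]))   # [-4, -2, 6], Example 2.19 i)
print(cross([0, 3, -1], [1, 2, 3]))   # [11, -1, -3], Example 2.19 ii)
```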

2.4.3 Orthogonality
Definition 2.4.7 (Orthogonal) If two vectors are perpendicular we say they are orthogonal.
Orthogonal vectors are characterized by vectors whose scalar product is zero. If, in addition, the
vectors have unit length then they are called orthonormal.

Next, we want to define the notion of distance between vectors.

Definition 2.4.8 The distance between u and v is:

dist(u, v) = ‖u − v‖ = √((u1 − v1)^2 + ... + (un − vn)^2). (2.9)

In general, ‖u − v‖^2 = ‖u‖^2 + ‖v‖^2 − 2u · v and ‖u + v‖^2 = ‖u‖^2 + ‖v‖^2 + 2u · v.

If ‖u − v‖^2 = ‖u + v‖^2 = ‖u‖^2 + ‖v‖^2, then u · v = 0 and the vectors are orthogonal.



Returning to Matrices
Not all linear systems Ax = b have solutions. For example:

[ 1 2 ] [ x1 ]   [ 3 ]
[ 2 4 ] [ x2 ] = [ 2 ]. (2.10)

The lefthand side Ax is always a multiple of [1, 2]^T, and the righthand side is not such a multiple. Thus, many times when we solve linear systems in physical applications we have to find the closest thing to a solution, x̂. This is defined to be the point for which the distance ‖Ax̂ − b‖ is minimized.

In particular, the orthogonal projection of the righthand side onto the attainable set will give the best approximation.
Definition 2.4.9 (Orthogonal Projection) The projection of vector u onto v is

proj_v u = ((u · v) / (v · v)) v. (2.11)

Using the orthogonal projection, the closest righthand side which has a solution is [1.4, 2.8], and the x̂ which produces this is x̂ = [1.4, 0].
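Equation (2.11) applied to the system above recovers the closest attainable righthand side; a sketch (proj is our own helper name):

```python
def proj(u, v):
    """Orthogonal projection of u onto v: ((u . v) / (v . v)) v."""
    c = sum(a * b for a, b in zip(u, v)) / sum(b * b for b in v)
    return [c * b for b in v]

# Project the unattainable righthand side (3, 2) onto the column direction (1, 2).
print(proj([3, 2], [1, 2]))   # [1.4, 2.8], the closest attainable righthand side
```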

2.5 Lines, Planes, and Geometric Applications


A common problem in physics is finding the vector between two points. This can be done by taking
the difference of the vectors (the direction of the result will depend on which vector is subtracted).
 Example 2.20 Find the vector from u = [1, 2] to v = [1, 0].

Solution: Compute v − u = [1 − 1, 0 − 2] = [0, −2]. Notice u − v = [1 − 1, 2 − 0] = [0, 2] has the same magnitude but the opposite sign. This indicates that it is the vector from v pointing toward u. 

In two dimensions another common problem is finding the line from a point (x0, y0) in the direction of a given vector v = [a, b]. This general line has the form

x − x0 = î(x − x0) + ĵ(y − y0).

If the line must be parallel to the vector v, then the components of the line must be proportional to the components of v (recall from Section 2.4 that if vectors are parallel their components are proportional):

(x − x0)/a = (y − y0)/b ⇒ (y − y0)/(x − x0) = b/a ⇒ y = (b/a)(x − x0) + y0. (2.12)

This is exactly the familiar slope-intercept form of a line. Another way to write this is in parametric form, where x − x0 is a scalar multiple of v:

x − x0 = vt ⇒ x = x0 + vt. (2.13)

This form has a physical meaning: x0 is the starting point of a particle and x is the location of the particle at time t if it moves with velocity v.
We can perform an analogous procedure in three dimensions. Find the line that passes through x0 = (x0, y0, z0) in the direction of v = (a, b, c). Using the first approach of parallel vectors, we know the components are proportional:

(x − x0)/a = (y − y0)/b = (z − z0)/c (if a, b, c ≠ 0). (2.14)

For example, if c = 0 then z = z0. Using the second approach we can find the parametric form:

x = x0 + vt:  x = x0 + at,  y = y0 + bt,  z = z0 + ct. (2.15)

 Example 2.21 Given the point (1, 0, 1), find the equation for the line through this point and parallel to (1, 2, 3).

x = x0 + vt:  x = 1 + t,  y = 2t,  z = 1 + 3t. (2.16)

It is possible to ask a similar question: given a point, find the line through this point but perpendicular to a vector v = (a, b). Recall that two lines are perpendicular (orthogonal) if their scalar product is zero:

(x − x0) · v = 0 ⇒ a(x − x0) + b(y − y0) = 0 ⇒ y = −(a/b)(x − x0) + y0. (2.17)

 Example 2.22 Given the point (1, 1), find the equation for the line through this point and orthogonal to (1, 2).

y = −(1/2)(x − 1) + 1. (2.18)


In three dimensions, this can be used to find the equation of a plane: (x − x0) · v = a(x − x0) + b(y − y0) + c(z − z0) = 0, or after rearranging, ax + by + cz = d = ax0 + by0 + cz0.

 Example 2.23 Find the equation of the plane through u = (1, 0, 0), v = (1, 1, 1), and w = (0, 1, 1). First we need to find a normal vector! To do this we find the vector from u to v, v − u = (0, 1, 1), and the vector from u to w, w − u = (−1, 1, 1). Now that we have two vectors in the plane containing u, v, w, we can compute the normal to this plane using the cross product:

                        | î  ĵ  k̂ |
N = (v − u) × (w − u) = | 0  1  1 | = î(1 − 1) + ĵ(−1 − 0) + k̂(0 + 1) = (0, −1, 1). (2.19)
                        | −1 1  1 |

So the equation of the plane is

0(x − x0) − 1(y − y0) + (z − z0) = 0. (2.20)

Plugging in one of the three points for (x0, y0, z0) (e.g., v) we find

−1(y − 1) + (z − 1) = 0 ⇒ −y + z = 0. (2.21)

 Example 2.24 Find the equation of the line through u = (−1, 1, 0), and orthogonal to the plane
in the previous example.
60 Chapter 2. Fundamentals of Linear Algebra

Since N = (0, −1, 1) is perpendicular to the plane, then it must be parallel to the vector we
want. Thus, returning to the previous set of examples we need to find the equation for the line
through u and parallel to N.

x = u + Nt:    x = −1,    y = 1 − t,    z = t.    (2.22)

 Example 2.25 Find the closest distance from a point P = (1, 2, 3) to the plane defined by
x + 2y + 2z − 1 = 0.

What we need to solve this problem is the normal vector n = (1, 2, 2) and a point on the plane (e.g., Q = (1, 0, 0)). We now construct the vector from Q to P, PQ = P − Q = (0, 2, 3). The distance is then

dist = |PQ · n/|n||.    (2.23)

Now we find |n| = √(1² + 2² + 2²) = √9 = 3. Then n/|n| = (1/3, 2/3, 2/3) and

dist = |PQ · n/|n|| = |0(1/3) + 2(2/3) + 3(2/3)| = |4/3 + 6/3| = 10/3.
Or there is an alternative method using the cross product instead of the scalar product. The distance
from a point to a plane will be the magnitude of the vector from the point P to the nearest spot on
the plane denoted R. Thus,

|PR| = |PQ| sin(θ) = |PQ × v/|v||,    (2.24)

where θ is the angle between PQ and RQ and v = RQ. To find R we must use the previous example
to find the line through P, but orthogonal to the plane

x = P + nt:    x = 1 + t,    y = 2 + 2t,    z = 3 + 2t.    (2.25)

Now plug the expressions for x, y, z into the equation of the plane and solve for t

(1 + t) + 2(2 + 2t) + 2(3 + 2t) − 1 = 0


9t + 10 = 0
t = −10/9.

Substitution of this value of t into (2.25) gives R = (−1/9, −2/9, 7/9), the point on the plane that is also on the line through P. Now RQ = R − Q = (−10/9, −2/9, 7/9). To complete the computation we need to find

|PQ| sin(θ) = |PQ × RQ|/|RQ| = |(20/9, −30/9, 20/9)|/|(−10/9, −2/9, 7/9)| = ((10/9)√17)/((1/3)√17) = 10/3.

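The point-to-plane formula can be packaged as a small pure-Python sketch (the function name is my own; it assumes the plane is given by a point on it and its normal):

```python
import math

def point_plane_distance(p, q, n):
    """|PQ . n| / |n|: distance from point p to the plane through q with normal n."""
    pq = [pi - qi for pi, qi in zip(p, q)]
    return abs(sum(a*b for a, b in zip(pq, n))) / math.sqrt(sum(c*c for c in n))

# plane x + 2y + 2z - 1 = 0: normal (1, 2, 2), and Q = (1, 0, 0) lies on it
print(point_plane_distance((1, 2, 3), (1, 0, 0), (1, 2, 2)))  # 3.333... = 10/3
```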

 Example 2.26 Find the distance from the point P = (1, 2, 2) to the line joining Q = (1, 0, 0) and
R = (−1, 1, 0).

Observe here that R is not necessarily the closest point to P, so we will use the formula (2.24). First, define v = R − Q = (−2, 1, 0). Then |v| = √((−2)² + 1² + 0²) = √5 and v/|v| = (−2/√5, 1/√5, 0). Also, PQ = P − Q = (0, 2, 2). Then from (2.24)

dist = |PQ × v/|v|| = |(0, 2, 2) × (−2/√5, 1/√5, 0)| = (1/√5)|(0, 2, 2) × (−2, 1, 0)|    (2.26)
     = (1/√5)|(−2, −4, 4)| = √((−2)² + (−4)² + 4²)/√5 = √36/√5 = 6/√5 = 6√5/5.    (2.27)


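The cross-product version of the distance formula is just as short in code. A sketch under the same conventions as before (helper names are mine):

```python
import math

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def point_line_distance(p, q, r):
    """|PQ x v| / |v| with v = r - q: distance from p to the line through q and r."""
    v = tuple(ri - qi for ri, qi in zip(r, q))
    pq = tuple(pi - qi for pi, qi in zip(p, q))
    num = math.sqrt(sum(c*c for c in cross(pq, v)))
    return num / math.sqrt(sum(c*c for c in v))

d = point_line_distance((1, 2, 2), (1, 0, 0), (-1, 1, 0))
print(abs(d - 6/math.sqrt(5)) < 1e-12)  # True
```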
 Example 2.27 Find the distance between the lines x1 = −î + 2ĵ + (î − k̂)t and x2 = ĵ − 2k̂ + (ĵ −
î)t.

Step 1: Write each line in the parametric form x = x0 + vt


 
x1 = x0,1 + v1 t:    x1 = −1 + t,    y1 = 2,    z1 = −t;
x2 = x0,2 + v2 t:    x2 = −t,    y2 = 1 + t,    z2 = −2.    (2.28)
 

To use (2.23), we need to identify P, Q and n. First, P and Q are just x0,1 and x0,2 respectively. Thus, P = (−1, 2, 0) and Q = (0, 1, −2) and PQ = P − Q = (1, −1, −2). To find the normal vector n we use the only tool we have for finding a vector orthogonal to another: the cross product

              | î   ĵ   k̂ |
n = v1 × v2 = | 1   0  −1 | = î + ĵ + k̂ = (1, 1, 1),    |n| = √3.    (2.29)
              | −1  1   0 |

Therefore, using (2.23), we find

dist = |PQ · n/|n|| = |(1, −1, −2) · (1/√3, 1/√3, 1/√3)| = |1/√3 − 1/√3 − 2/√3| = 2/√3 = 2√3/3.    (2.30)

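Distances like this can also be sanity-checked numerically by brute force, without any cross products. A sketch (the grid bounds and step count are arbitrary choices of mine, and this is a check, not an exact method):

```python
import math

def line_line_distance(p1, v1, p2, v2, steps=200, span=4.0):
    """Brute-force the minimum of |x1(t) - x2(s)| over a grid of t, s values."""
    best = float("inf")
    for i in range(steps + 1):
        t = -span + 2*span*i/steps
        a = [p + t*v for p, v in zip(p1, v1)]
        for j in range(steps + 1):
            s = -span + 2*span*j/steps
            b = [p + s*v for p, v in zip(p2, v2)]
            best = min(best, math.dist(a, b))
    return best

d = line_line_distance((-1, 2, 0), (1, 0, -1), (0, 1, -2), (-1, 1, 0))
print(abs(d - 2/math.sqrt(3)) < 1e-2)  # True
```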
 Example 2.28 i) Find the direction of the line of intersection of the two planes −x + 2y + z = 3

and x + y − 3z = 1.

Since the intersection of the two planes must lie in both planes (by definition), then the line
we are looking for must be orthogonal/perpendicular to the normal vector for each plane

n1 = (−1, 2, 1) n2 = (1, 1, −3).

So the direction of the line is



          | î   ĵ   k̂ |
n1 × n2 = | −1  2   1 | = −7î − 2ĵ − 3k̂ = (−7, −2, −3).    (2.31)
          | 1   1  −3 |

ii) Find the cosine of the angle between the two planes.

This is equivalent to finding the cosine of the angle between the normal vectors. Using the definition of the scalar product,

n1 · n2 = |n1||n2| cos(θ)
(−1, 2, 1) · (1, 1, −3) = √6 √11 cos(θ)
cos(θ) = −2/√66.


2.6 Matrix Operations


In this section we consider different ways of manipulating matrices to solve problems:
• Matrix Equations, Ax = b
• Matrix Algebra (+, −, ×, ÷)
• Inverse of a Matrix, x = A−1 b
• Functions of Matrices (e.g., eA )
Definition 2.6.1 (Equal Matrices) Two matrices A and B are equal if and only if every single
element is equal. In particular they must be the same size (e.g., both are m × n).
 Example 2.29 Let A = [ x² − 3   2 ; 3   4 ] and B = [ 1   2 ; 3   4 ]. Find the values of x so that A = B.

Solution: By definition the two matrices are equal if all the elements are equal. So x² − 3 = 1 ⇒ x² = 4 ⇒ x = ±2.

Can a 2 × 3 matrix equal a 3 × 2 matrix? No! The dimensions must be the same! 

2.6.1 Scalar Multiplication of Matrices


 
For any real number c, the matrix

cA = [ ca11   ca12   ···   ca1n ; ⋮ ; can1   can2   ···   cann ].

It is important to notice that the scalar c multiplies every entry.

Recall Section 3.3, and consider how the determinant is affected by scalar multiplication.
 
 Example 2.30 Let A = [ 1  2 ; 3  4 ], which has determinant det(A) = 1(4) − 2(3) = 4 − 6 = −2. Let c = 2 and find the determinant of cA.

Solution: First, compute cA = [ 2  4 ; 6  8 ], then det(cA) = 2(8) − 4(6) = 16 − 24 = −8. It turns out in general det(cA) = cⁿ det(A), where n is the size of the square matrix (A is n × n). 

2.6.2 Addition and Subtraction of Matrices


Before considering any algebraic operations on matrices one must make sure they are the same size! Addition and subtraction of two matrices are similar: just add/subtract the matching components.

A + B = [ a11  a12 ; a21  a22 ] + [ b11  b12 ; b21  b22 ] = [ a11 + b11   a12 + b12 ; a21 + b21   a22 + b22 ]
   
 Example 2.31 Let A = [ 1  2 ; 3  4 ] and let B = [ 0  −1 ; −2  4 ]. Find 5A, A + B, and A − B.

5A = [ 5(1)  5(2) ; 5(3)  5(4) ] = [ 5  10 ; 15  20 ]

A + B = [ 1 + 0   2 + (−1) ; 3 + (−2)   4 + 4 ] = [ 1  1 ; 1  8 ]

A − B = [ 1 − 0   2 − (−1) ; 3 − (−2)   4 − 4 ] = [ 1  3 ; 5  0 ]

Also, observe that 2A = A + A. In order for addition and subtraction to make sense the
dimensions of the matrices must be the same.

2.6.3 Multiplication and Division of Matrices


First, we must define what we mean by AB = C for matrices A, B, and C. Given two matrices A = [ a  b ; c  d ] and B = [ e  f ; g  h ], then the product

AB = [ a  b ; c  d ][ e  f ; g  h ] = [ ae + bg   af + bh ; ce + dg   cf + dh ].    (2.32)

The (i, j)th entry of the resulting matrix C is the scalar product of the ith row of A with the jth
column of B.
   
 Example 2.32 Given two matrices A = [ 1  0 ; 2  3 ] and B = [ 0  −1 ; 2  1 ], then the product

AB = [ 1(0) + 0(2)   1(−1) + 0(1) ; 2(0) + 3(2)   2(−1) + 3(1) ] = [ 0  −1 ; 6  1 ].    (2.33)


R Key Observation: In order for matrix multiplication to work, the number of columns of
matrix A must be equal to the number of rows in matrix B (for the scalar products to be
well-defined). Thus, if A is m × n, then B must be n × d for multiplication to work! In addition,
the resulting matrix will have dimension m × d.

So the inner dimensions of AB must be equal (e.g., n) and the outer dimensions give the size
of the resulting matrix (m × d).

 Example 2.33 Multiply

A = [ 4  −2 ; 3  −5 ; 0  −1 ]        B = [ 2  −3 ; 6  −7 ]    (2.34)

Step 1: Check that the dimensions are valid for matrix multiplication.

Matrix A is 3 × 2 and matrix B is 2 × 2. The inner dimensions agree (e.g., 2) so we can mul-
tiply and the resulting matrix should be 3 × 2.

Thinking Question: What if we wanted to multiply BA? We cannot! The inner dimensions do not agree in this case (2 ≠ 3).

Step 2: Carry Out the Multiplication


   
AB = [ 4(2) − 2(6)    4(−3) − 2(−7) ; 3(2) − 5(6)    3(−3) − 5(−7) ; 0(2) + (−1)(6)    0(−3) + (−1)(−7) ] = [ −4  2 ; −24  26 ; −6  7 ]


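The row-by-column rule, including the inner-dimension check, fits in a few lines of pure Python (the function name `matmul` and the list-of-rows representation are my own choices):

```python
def matmul(A, B):
    """Row-by-column product of an m x n and an n x d matrix (lists of rows)."""
    if len(A[0]) != len(B):
        raise ValueError("inner dimensions must agree")
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[4, -2], [3, -5], [0, -1]]   # 3 x 2
B = [[2, -3], [6, -7]]            # 2 x 2
print(matmul(A, B))  # [[-4, 2], [-24, 26], [-6, 7]]
```

Calling `matmul(B, A)` raises the dimension error, matching the Thinking Question above.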
What is the determinant of a product of matrices?

det(AB) = det(A) det(B) = det(B) det(A) = det(BA).

This is only valid for square (n × n) matrices.

2.6.4 Matrix Equation


A fundamental problem in Linear Algebra is finding a vector x such that Ax = b. Consider

[ 1  2  3 ; 1  −1  0 ; 0  0  1 ][ x ; y ; z ] = [ 2 ; 3 ; −1 ].

This matrix equation represents the following system of linear equations.



x + 2y + 3z = 2

x−y = 3

z = −1

How can we solve this system using matrix operations?

2.6.5 Solution Sets of Linear Systems


The most basic type of matrix equation to solve is that of a homogeneous linear system

Ax = 0.

 Example 2.34 Solve the following linear system with row reduction

x1 + 10x2 = 0
2x1 + 20x2 = 0

corresponding to the matrix equation Ax = 0


    
[ 1  10 ; 2  20 ][ x ; y ] = [ 0 ; 0 ]

A homogeneous equation always has the zero or trivial solution x = [0, 0]. A non-zero solution is
called a non-trivial solution.

Do any non-trivial solutions exist for this problem?



After row reduction we see that


   
[ 1  10 | 0 ; 2  20 | 0 ] → [ 1  10 | 0 ; 0  0 | 0 ].

Using back substitution we find that x = x2 [ −10 ; 1 ], where x2 is the free variable. Thus, there are infinitely many solutions of this form. 

R A homogeneous equation Ax = 0 has nontrivial solutions if and only if the system of equations
has at least one free variable.

 Example 2.35 Determine if the following homogeneous system has nontrivial solutions and then

describe the solution set


2x1 + 4x2 − 6x3 = 0
4x1 + 8x2 − 10x3 = 0
corresponding to the matrix equation Ax = 0
    
[ 2  4  −6 ; 4  8  −10 ][ x ; y ; z ] = [ 0 ; 0 ]
Notice first that there must be at least one free variable. Why? There are more columns than
rows in the matrix A. Carry out the row reduction to find
   
[ 2  4  −6 | 0 ; 4  8  −10 | 0 ] → [ 1  2  0 | 0 ; 0  0  1 | 0 ].
 
Using back substitution we find that x = x2 [ −2 ; 1 ; 0 ], where x2 is the free variable. Thus, there are infinitely many solutions of this form. 

We can also use the determinant in an alternate method for determining the existence of
nontrivial solutions.
Theorem 2.6.1 A system of n homogeneous linear equations in n unknowns has a nontrivial
solution if and only if the determinant of the coefficient matrix, det(A) = 0.

We see from the first example (which is 2 × 2) that det(A) = 1(20) − 2(10) = 0 and it had
non-trivial solutions.
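This determinant test is a one-liner to check numerically. A sketch (the helper name `det2` is my own):

```python
def det2(M):
    """Determinant of a 2 x 2 matrix given as a list of rows."""
    return M[0][0]*M[1][1] - M[0][1]*M[1][0]

print(det2([[1, 10], [2, 20]]))  # 0  -> nontrivial solutions exist
print(det2([[1, 2], [3, 4]]))    # -2 -> only the trivial solution
```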
 Example 2.36 For what values of λ do we have non-trivial solutions for the following linear system?

(1 − λ)x + 3y = 0
3x + (1 − λ)y = 0

To solve this problem consider the coefficient matrix A = [ 1 − λ   3 ; 3   1 − λ ] and compute its determinant:

0 = det(A) = (1 − λ)(1 − λ) − 3(3) = λ² − 2λ − 8 = (λ + 2)(λ − 4).

So the system has non-trivial solutions when λ = −2, 4. We will see problems like this again in future sections when talking about eigenvalues, eigenvectors, and diagonalization of matrices! 

Now that we understand how to solve homogeneous linear systems, what about non-homogeneous or inhomogeneous matrix equations?
 Example 2.37 Determine the solution set of

2x1 + 4x2 − 6x3 = 0


4x1 + 8x2 − 10x3 = 4

Carry out the row reduction to find


   
[ 2  4  −6 | 0 ; 4  8  −10 | 4 ] → [ 1  2  0 | 6 ; 0  0  1 | 2 ].

Using back substitution we find

x = [ 6 ; 0 ; 2 ] + x2 [ −2 ; 1 ; 0 ].


So the solution set of the inhomogeneous equation is the homogeneous solution xc = x2 [−2, 1, 0]
added to a particular solution x p = [6, 0, 2]. If we change the righthand side in the original problem
the only thing that will change is the particular solution x p .
Summary: The solution to a matrix equation Ax = b is the sum of the homogeneous solution
xc (to Ax = 0) and a particular solution x p . One can think of the solution to the inhomogeneous
equation as a translation of the solution set to the homogeneous equation by x p .

Theorem 2.6.2 Suppose the equation Ax = b is consistent for some given b, and let p be a
solution. Then the solution set of Ax = b is the set of all vectors of the form x = p + xc where xc
is a solution to the homogeneous equation Ax = 0.

 Example 2.38 Determine the solution set of

2x1 − 4x2 − 4x3 = 0

and compare it to the solution set of

2x1 − 4x2 − 4x3 = 6.

This is just one equation in three unknowns so there is no row reduction to be done. In both cases
we can solve for x1 in terms of the free variables x2 , x3 . For the first equation
   
x = x2 [ 2 ; 1 ; 0 ] + x3 [ 2 ; 0 ; 1 ],
and for the second equation

x = [ 3 ; 0 ; 0 ] + x2 [ 2 ; 1 ; 0 ] + x3 [ 2 ; 0 ; 1 ].
So the inhomogeneous equation has a solution set composed of the homogeneous solution added to
a particular solution x p = [3, 0, 0]. Solving another inhomogeneous system

2x1 − 4x2 − 4x3 = 4



we find
    
x = [ 2 ; 0 ; 0 ] + x2 [ 2 ; 1 ; 0 ] + x3 [ 2 ; 0 ; 1 ],

which has the same homogeneous part, but the particular solution is now x p = [2, 0, 0]. 

2.6.6 Inverse Matrix


First, recall what we do for one equation and one unknown ax = b. We divide by a and find x = b/a.
The matrix equivalent of this operation is to multiply both sides by the inverse Ax = b ⇒ x = A−1 b.
So if we can develop an algorithm for finding the inverse of a matrix A, then we can easily solve a
system of linear equations by just multiplying both sides by the inverse A−1 .
Definition 2.6.2 (Matrix Inverse) The inverse of a matrix only makes sense for a square n × n
matrix A. The inverse, denoted A−1 , is the unique matrix such that AA−1 = In = A−1 A. (For
comparison to real numbers aa−1 = a/a = 1).

If a matrix A has an inverse, then it is said to be invertible or non-singular.

R Observe that 1 = det(I) = det(AA−1) = det(A) det(A−1). Thus, det(A−1) = 1/det(A). Also, if det(A) = 0, then the matrix is singular and has no inverse.

2.6.7 Ways to Compute A−1


 
Case I: 2 × 2. Let A = [ a  b ; c  d ]. If det(A) ≠ 0, i.e. ad − bc ≠ 0, then

A−1 = 1/(ad − bc) [ d  −b ; −c  a ].

If det(A) = ad − bc = 0, then the matrix is not invertible.


 
 Example 2.39 Let A = [ −7  3 ; 5  −2 ]. Find A−1.

Step 1: Find det(A) = −7(−2) − 3(5) = −1.

Step 2: Use the formula for A−1:

A−1 = (1/−1) [ −2  −3 ; −5  −7 ] = [ 2  3 ; 5  7 ].

Step 3: You can check your work, AA−1 = I2 . 
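The 2 × 2 cofactor formula translates directly into code. A sketch (the function name `inv2` is my own):

```python
def inv2(M):
    """Inverse of a 2 x 2 matrix via the cofactor formula; raises if singular."""
    (a, b), (c, d) = M
    det = a*d - b*c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d/det, -b/det], [-c/det, a/det]]

print(inv2([[-7, 3], [5, -2]]))  # [[2.0, 3.0], [5.0, 7.0]]
```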

This formula works great for 2 × 2 matrices, but as soon as the dimension becomes n × n where n ≥ 3 the corresponding formula becomes unmanageable. Thus, we need to come up with a general method for computing the inverse of a matrix.

Case II: n × n. Given a matrix A we can set up the augmented matrix [A | In], then row reduce A to In:

[A | In] → [In | A−1].

As we apply the row operations to In, the righthand side will transform into the inverse.
 
−7 3
 Example 2.40 Let A = and check that we get the same answer as the formula.
5 −2
Solution: In class! 

 Example 2.41 Solve the matrix equation


    
[ 1  1  0 ; 1  0  −2 ; 0  −1  0 ][ x ; y ; z ] = [ 3 ; −5 ; −2 ].    (2.35)

by finding the inverse, A−1 .

Solution: In class! 

2.6.8 Rotation Matrices

Given a vector in two dimensions, r = [ x ; y ], we can rotate it about the origin by an angle θ using a rotation matrix

R = [ cos θ   − sin θ ; sin θ   cos θ ].    (2.36)

Then the new coordinates are


  
r′ = Rr = [ cos θ   − sin θ ; sin θ   cos θ ][ x ; y ].    (2.37)

 Example 2.42 If r = [1, 0] and we want to rotate it by θ = π/2, then

r′ = Rr = [ cos θ   − sin θ ; sin θ   cos θ ][ x ; y ] = [ 0  −1 ; 1  0 ][ 1 ; 0 ] = [ 0 ; 1 ].    (2.38)

This is a common example of a linear transformation. More on the definition and use of linear
transformations soon! 
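The rotation (2.36)-(2.37) is a few lines of pure Python (the helper name `rotate` is my own):

```python
import math

def rotate(point, theta):
    """Apply the 2-D rotation matrix R(theta) to a point about the origin."""
    x, y = point
    c, s = math.cos(theta), math.sin(theta)
    return (c*x - s*y, s*x + c*y)

x, y = rotate((1, 0), math.pi/2)
print(round(x, 12), round(y, 12))  # 0.0 1.0
```

Rounding hides the ~1e-16 floating-point residue in cos(π/2).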

2.6.9 Functions of Matrices


We can take the power of a matrix. For example, raising a square n × n matrix A to the kth power is equivalent to multiplying A by itself k times:

Aᵏ = AA···A ≠ [ a11ᵏ  ···  a1nᵏ ; ⋮ ; an1ᵏ  ···  annᵏ ].    (2.39)

We cannot just raise each element to the kth power; the matrix multiplication must be performed.
 
 Example 2.43 Let A = [ 1  0 ; −1  2 ]. Find A².

A² = [ 1  0 ; −1  2 ][ 1  0 ; −1  2 ] = [ 1  0 ; −3  4 ] ≠ [ 1²  0² ; (−1)²  2² ].    (2.40)


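The inequality in (2.40) is easy to see numerically; a sketch comparing the true matrix square with the (wrong) elementwise square:

```python
def matmul(A, B):
    """Row-by-column matrix product (lists of rows)."""
    return [[sum(A[i][k]*B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[1, 0], [-1, 2]]
print(matmul(A, A))                         # [[1, 0], [-3, 4]]
print([[e**2 for e in row] for row in A])   # [[1, 0], [1, 4]] -- not the same
```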

What about an arbitrary function of a matrix, f(A)? To get an idea of how these work we can expand the function in a Taylor series:

e^{kA} = I + kA + k²A²/2! + · · ·    (2.41)

Observe that both sides of this equation are matrices. So the exponential function of a matrix has a meaning.

Two other quick observations:

1. (A + B)² = A² + AB + BA + B². The middle terms may not combine since matrix multiplication is not commutative! (In general AB ≠ BA.)

2. eA+B 6= eA eB .

2.7 Linear Combinations, Functions, and Operators


In this section we will study a more general class of operations that matrices are a particular
example of, Linear Transformations. However, we must first understand the basic definition of
linearity and different operations associated with this definition.
Definition 2.7.1 (Linear Combination) Given two vectors u and v as well as two scalars a, b
then a linear combination of u and v is any sum of the form

au + bv.

 Example 2.44 Find 3u − v for u = [2, 1] and v = [−2, 2].

Solution: Using scalar multiplication of vectors and subtraction we find


     
3u − v = [ 6 ; 3 ] − [ −2 ; 2 ] = [ 8 ; 1 ].


R Any position vector r = (x, y, z) = xî + yĵ + zk̂ is a linear combination of the three unit basis
vectors î, ĵ, k̂.

 Example 2.45 Determine if w = [−4, 1] is a linear combination of u = [2, 1] and v = [−2, 2].

Need: To find scalars a, b so that au + bv = w. This can be rewritten as

[ u1  v1 ; u2  v2 ][ a ; b ] = [ w1 ; w2 ].    (2.42)

So determining if some vector is a linear combination of other vectors is equivalent to solving (2.42). For our particular problem we can set up the augmented matrix and row reduce to solve

[ 2  −2 | −4 ; 1  2 | 1 ].

After row reduction we find

[ 2  −2 | −4 ; 0  3 | 3 ].

Using back substitution we find that b = 1 and then 2a − 2b = −4 → a = −1. Therefore, [−4, 1] =
−u + v. 
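For 2-D vectors, solving au + bv = w can also be done in closed form with Cramer's rule. A sketch (the function name `solve2` is my own):

```python
def solve2(u, v, w):
    """Solve a*u + b*v = w for scalars a, b (2-D vectors) via Cramer's rule."""
    det = u[0]*v[1] - v[0]*u[1]
    if det == 0:
        raise ValueError("u and v are parallel")
    a = (w[0]*v[1] - v[0]*w[1]) / det
    b = (u[0]*w[1] - w[0]*u[1]) / det
    return a, b

print(solve2((2, 1), (-2, 2), (-4, 1)))  # (-1.0, 1.0)
```

The output reproduces the relation [−4, 1] = −u + v found by row reduction.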

Definition 2.7.2 (Span) Given a set of vectors {v1 , · · · , v p } in Rn , then the span denoted,
span{v1 , · · · , v p } is the set of all linear combinations of the vectors or equivalently it is all
vectors that can be written as

u = a1 v1 + · · · + a p v p .

 Example 2.46 i) What is the span of the vector v = [1, 2].

Solution: The span is any scalar multiple of this vector, u = c[1, 2]. Visually this is a line in
two dimensions through the point (1, 2) with slope m = y/x = 2/1 = 2.

ii) Find the span of v1 = [1, 2] and v2 = [−1, 0].

The span is all vectors of the form av1 + bv2. In two dimensions this is the entire plane. Two vectors will always span the two-dimensional plane as long as they are not scalar multiples of each other (v1 = cv2 ). If they are, then we are back in case i), where the span is only a line. 

2.7.1 Linear Functions


Definition 2.7.3 A function of a vector, f (v), is called linear if

f (v1 + v2 ) = f (v1 ) + f (v2 ) and f (av1 ) = a f (v1 )

for any scalar a.

 Example 2.47 a. Let u = (1, 2, 3) and v = (x, y, z). Is f (v) = u · v = x + 2y + 3z linear?

Check the properties:

i) f (v1 +v2 ) = (x1 +x2 )+2(y1 +y2 )+3(z1 +z2 ) = x1 +2y1 +3z1 +x2 +2y2 +3z2 = f (v1 )+ f (v2 ).

ii) f (av) = ax + 2(ay) + 3(az) = a(x + 2y + 3z) = a f (v). So yes it is linear!

b. Is f(v) = v · v = x² + y² + z² linear?

Check the properties:

i) f(v1 + v2) = (x1 + x2)² + (y1 + y2)² + (z1 + z2)² = x1² + y1² + z1² + x2² + y2² + z2² + 2(x1x2 + y1y2 + z1z2) = f(v1) + f(v2) + 2(x1x2 + y1y2 + z1z2). So no, it is not linear. 
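Linearity checks like these can be spot-checked numerically. A sketch (the helper `is_linear` and the sample data are my own; passing the test does not prove linearity, but failing it disproves it):

```python
def is_linear(f, trials):
    """Spot-check additivity and homogeneity of a scalar-valued f on sample data."""
    add = lambda u, v: tuple(a + b for a, b in zip(u, v))
    scale = lambda c, u: tuple(c * a for a in u)
    return all(f(add(u, v)) == f(u) + f(v) and f(scale(c, u)) == c * f(u)
               for u, v, c in trials)

trials = [((1, 2, 3), (4, 5, 6), 2), ((0, -1, 1), (2, 0, 5), -3)]
dot = lambda v: v[0] + 2*v[1] + 3*v[2]          # u . v with u = (1, 2, 3)
norm2 = lambda v: v[0]**2 + v[1]**2 + v[2]**2   # v . v
print(is_linear(dot, trials), is_linear(norm2, trials))  # True False
```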

The key to determine if a function is linear is to look for typical nonlinearities:


i) Powers of variables, v · v = |v|2 .
ii) Trig functions, sin(v).
iii) Multiplication of different components, f (v) = [xy, y].

Question: What if the function is a vector (e.g., magnetic field B(x)).


Definition 2.7.4 F(x) is a linear vector function if:
i) F(x1 + x2 ) = F(x1 ) + F(x2 ).

ii) F(ax) = aF(x) for any scalar a.

 Example 2.48 a. Is F(x) = 3x − v, where v = [1, 1, 1], linear?

Check the properties:

i) F(x1 + x2) = 3(x1 + x2) − v = 3x1 + 3x2 − v, but F(x1) + F(x2) = 3x1 + 3x2 − 2v. These are not equal. No! Not linear.

b. Is F(x) = 3x linear?

Check the properties:

i) F(x1 + x2 ) = 3(x1 + x2 ) = 3x1 + 3x2 = F(x1 ) + F(x2 ).

ii) F(ax) = 3(ax) = a(3x) = aF(x). Yes, linear! 

 Example 2.49 Consider a rotation by θ = π/2 in two dimensions. Recall the rotation matrix R = [ cos θ  − sin θ ; sin θ  cos θ ] = [ 0  −1 ; 1  0 ]. Is F(x) = Rπ/2 x = [ 0  −1 ; 1  0 ][ x ; y ] = [ −y ; x ] linear?
Check the properties:

i) F(x1 + x2 ) = Rπ/2 (x1 + x2 ) = Rπ/2 x1 + Rπ/2 x2 = F(x1 ) + F(x2 ).

ii) F(ax) = Rπ/2 (ax) = aRπ/2 (x) = aF(x). Yes, linear! 

A matrix applied to a vector is referred to as a linear operator.

2.7.2 Linear Operators


Definition 2.7.5 An operator is a rule or instruction for how to act on some scalar or vector. An
operator L is linear if:
i) L(u + v) = Lu + Lv.
ii) L(cv) = cL(v) for any real scalar c.
Here u and v can be scalars, vectors, matrices, functions, etc.

R Every example above that was shown to be linear is a linear operator.

 Example 2.50 Important for Math Methods II, is differentiation d/dx a linear operator?

Check the properties:

i) d/dx [f(x) + g(x)] = df/dx + dg/dx.

ii) d/dx [c f(x)] = c df/dx. Yes, derivatives are linear operators! 

 Example 2.51 Is f (x) = x1/n a linear operator? If so, when (for what values of n)?
Solution: In general, (x + y)^{1/n} ≠ x^{1/n} + y^{1/n} unless n = 1, in which case f(x) = x. For all other powers this function is NOT a linear operator. 

2.8 Matrix Operations and Linear Transformations


Recall the equivalence of a system of linear equations to a matrix equation, Ax = b:

{ ax + by = e,  cx + dy = f }    or    [ a  b ; c  d ][ x ; y ] = [ e ; f ].

Every point x = [x, y] is moved to a new point [e, f ] (mapping/transformation). All the necessary
information is built into the matrix A. Observe that:
i) A(x1 + x2 ) = Ax1 + Ax2
ii) A(cx) = cAx.
Both hold by general matrix properties, so all matrices are linear transformations.
 
 Example 2.52 Let A = [ 1  1 ; 1  −1 ] and x = [1, 0]. Then Ax = x′ = [1, 1]. So |x| = 1, but |x′| = √2. So lengths and distances are not preserved by this matrix, but it is still a linear transformation. 

R If a linear transformation does preserve lengths and distance then it is called orthogonal. A
classic example of this is a rotation matrix, which clearly does not change the length of a
vector. A matrix of an orthogonal transformation is an orthogonal matrix and has the property
that M −1 = M T .
 
 Example 2.53 i) Let Rπ/2 = [ 0  −1 ; 1  0 ]; then one can check that R⁻¹π/2 = Rᵀπ/2.

ii) This also holds for reflections S = [ −1  0 ; 0  1 ]. 
Observe also that if a matrix M is orthogonal, then I = M −1 M = M T M and thus, 1 = det(I) =
det(M T M) = det(M)2 . This implies that det(M) = ±1. If the determinant is positive it is a pure
rotation and if it is negative then the transformation involves some form of reflection.

R In 3D, if you want to rotate an object you have to pick an axis to rotate about. The rotation matrices about the x-axis, y-axis, and z-axis respectively are

Rx = [ 1  0  0 ; 0  cos θ  − sin θ ; 0  sin θ  cos θ ],
Ry = [ cos θ  0  − sin θ ; 0  1  0 ; sin θ  0  cos θ ],
Rz = [ cos θ  − sin θ  0 ; sin θ  cos θ  0 ; 0  0  1 ].

Are the 3D rotation matrices orthogonal? Yes! You can check by finding Rᵢᵀ = Rᵢ⁻¹.

Question: Are all matrices that have det(M) = 1 orthogonal?


 Example 2.54 No. For example,

A = [ 1  1 ; 0  1 ],    Aᵀ = [ 1  0 ; 1  1 ] ≠ A⁻¹ = [ 1  −1 ; 0  1 ].

Here det(A) = 1, but A is not orthogonal.


Theorem 2.8.1 Every linear transformation can be expressed with a matrix.

One of the most useful techniques in linear algebra is to take a mapping or transformation and
express it as a matrix in order to analyze it. A mapping T (x) : Rn 7→ Rm can be written as an m × n
matrix A where each column of A is the output of the set of unit basis vectors (e.g., î, ĵ, k̂)
 
A = [ T(î)   T(ĵ)   T(k̂) ].    (2.43)
 
 Example 2.55 Find the matrix associated to the transformation T(x) = [ x + y ; 3y ; z ].
Step 1: First check that the transformation is indeed linear!

Step 2: Since we are in three dimensions, compute how it acts on the three basis vectors î, ĵ, k̂:
i) T (î) = [1, 0, 0]
ii) T (ĵ) = [1, 3, 0]
iii) T (k̂) = [0, 0, 1].

Step 3: Each of these solutions is a column of A


 
A = [ 1  1  0 ; 0  3  0 ; 0  0  1 ].

Observe that finding which point x maps to b = [2, 3, 1] is equivalent to computing x = A−1 b. We can check that x = [1, 1, 1]. 
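Building the matrix from the images of the basis vectors can be sketched directly (the helper name `matrix_of` is my own):

```python
def matrix_of(T, n):
    """Matrix whose columns are T applied to the standard basis vectors of R^n."""
    cols = [T(tuple(1 if i == j else 0 for i in range(n))) for j in range(n)]
    return [[cols[j][i] for j in range(n)] for i in range(n)]  # columns -> rows

T = lambda v: (v[0] + v[1], 3*v[1], v[2])
print(matrix_of(T, 3))  # [[1, 1, 0], [0, 3, 0], [0, 0, 1]]
```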

In a future section we can define one-to-one and onto linear transformations by looking at their
corresponding matrices.

2.9 Linear Dependence and Independence


A homogeneous system such as
    
[ 1  2  −3 ; 3  5  9 ; 5  9  3 ][ x1 ; x2 ; x3 ] = [ 0 ; 0 ; 0 ]

can be viewed as a linear combination of vectors


       
x1 [ 1 ; 3 ; 5 ] + x2 [ 2 ; 5 ; 9 ] + x3 [ −3 ; 9 ; 3 ] = [ 0 ; 0 ; 0 ]

This equation has the trivial solution (x1 = 0, x2 = 0, x3 = 0), but is this the only solution?
Definition 2.9.1 A set of vectors {v1 , v2 , ..., v p } in Rn is said to be linearly independent if the
vector equation

x1 v1 + · · · + x p v p = 0, (2.44)

has only the trivial solution. The set {v1 , v2 , ..., v p } is said to be linearly dependent if there exist weights c1 , ..., c p , not all zero, such that

c1 v1 + · · · + c p v p = 0. (2.45)
     
 Example 2.56 Let v1 = [ 1 ; 3 ; 5 ], v2 = [ 2 ; 5 ; 9 ], v3 = [ −3 ; 9 ; 3 ].
a. Determine if {v1 , v2 , v3 } is linearly independent.

b. If possible, find a linear dependence relation.

Step 1: Set up an augmented matrix with columns v1 , v2 , v3 and righthand side 0.

Step 2: Row reduce the matrix to echelon form:


   
[ 1  2  −3 | 0 ; 3  5  9 | 0 ; 5  9  3 | 0 ] → [ 1  2  −3 | 0 ; 0  −1  18 | 0 ; 0  −1  18 | 0 ].

Step 3: See if there are any free variables. If there are then the vectors are linearly dependent
and the dependence relation has coefficients c1 = x1 = −33x3 , c2 = x2 = 18x3 , c3 = x3 for any real
number x3 . One possible dependence relation is −33v1 + 18v2 + v3 = 0 by choosing x3 = 1. 
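The dependence relation just found is a one-line check in code:

```python
v1, v2, v3 = (1, 3, 5), (2, 5, 9), (-3, 9, 3)
combo = tuple(-33*a + 18*b + c for a, b, c in zip(v1, v2, v3))
print(combo)  # (0, 0, 0): the vectors are linearly dependent
```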

R The linear dependence relation among the columns of a matrix A corresponds to a nontrivial
solution to Ax = 0.

2.9.1 Special Cases


1. Consider the set containing one nonzero vector {v1 }. The only solution to x1 v1 = 0 is x1 = 0.
Thus, a set of only one vector is automatically linearly independent as long as it is not the zero
vector v1 6= 0.

2. Consider a set of two vectors


       
u1 = [ 2 ; 1 ],  u2 = [ 4 ; 2 ],  v1 = [ 2 ; 1 ],  v2 = [ 2 ; 3 ].

Notice that u2 = 2u1 and therefore

2u1 + (−1)u2 = 0.

Thus, the set {u1 , u2 } is linearly dependent.

Since v2 is not a multiple of v1 a similar relationship cannot be found and the set {v1 , v2 } must be
linearly independent.

R A set of two vectors is linearly dependent if one of the vectors is a scalar multiple of the other.

A set of two vectors is linearly independent if and only if neither of the vectors is a multiple
of the other.

3. Consider a set of vectors containing the zero vector, {v1 , v2 , ..., v p−1 , 0}. Then

0v1 + 0v2 + ... + 0v p−1 + 1 · 0 = 0.

Thus, any set of vectors containing the zero vector, 0, must be linearly dependent.

4. “A set containing too many vectors".



Theorem 2.9.1 If a set contains more vectors than there are entries in each vector, then the set
is linearly dependent (e.g., any set {v1 , ..., v p } in Rn where p > n).

2.9.2 Linear Independence of Functions

Consider a set of functions { f1 (x), f2 (x), ..., fn (x)}. This set is linearly dependent if there exists
constants k1 , ..., kn such that

k1 f1 + ... + kn fn = 0.

Definition 2.9.2 If functions f1 , ..., fn have derivatives up to order n − 1 and

W ( f1 , ..., fn ) = | f1          ···   fn
                      ⋮                  ⋮
                      f1^(n−1)   ···   fn^(n−1) | ≠ 0,

then the functions are linearly independent. This determinant, W , is called the Wronskian of the functions (we will see this again when solving differential equations!).

 Example 2.57 a) Is the set of functions {cos(x), sin(x)} linearly independent?

Solution: The Wronskian W = cos2 (x) + sin2 (x) = 1 6= 0. Yes!

b) Is the set of functions {1, x, 2 + 5x} linearly independent?

Solution: The Wronskian W = 0. No!

c) Is the set of functions {1, cos(x), cos(2x)} linearly independent?

Solution: The Wronskian W = 4 sin(x) cos(2x) − 2 cos(x) sin(2x) = −4 sin³(x) ≠ 0. Yes!

d) Is the set of functions {1, x2 , cos(2x)} linearly independent?


Solution: W = −8x cos(2x) + 4 sin(2x) ≠ 0. Yes! 
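For two functions the Wronskian is just f g′ − g f′, which can be evaluated at a point as a quick check. A sketch (the helper name is my own; the derivatives are supplied by hand):

```python
import math

def wronskian2(f, fp, g, gp, x):
    """2 x 2 Wronskian f*g' - g*f' at x, with the derivatives supplied by hand."""
    return f(x)*gp(x) - g(x)*fp(x)

# {cos, sin}: W = cos^2 + sin^2 = 1 at every x
w = wronskian2(math.cos, lambda t: -math.sin(t), math.sin, math.cos, 0.7)
print(round(w, 12))  # 1.0
```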

2.9.3 Basis Functions

Recall that the span of a set of vectors {v1 , v2 , ..., v p } is the set of all their linear combinations, while a linearly independent set contains no vector that can be written as a linear combination of the others.

Definition 2.9.3 A set of vectors {v1 , v2 , ..., v p } is called a basis for Rn if the set is linearly
independent and the linear combinations of the vectors span all of Rn .

 Example 2.58 The standard basis in R3 , {î, ĵ, k̂}. 



2.10 Special Matrices


Matrix                          Symbol                  Condition
Transpose Matrix                A^T                     (A^T)_ij = a_ji
Conjugate Matrix                A* (or Ā)               a_ij → a*_ij
Hermitian (Self-adjoint)        A† = (A*)^T = A         a_ij = a*_ji
Anti-Hermitian Matrix           A† = −A                 a_ij = −a*_ji
Inverse Matrix                  A^−1                    [A|I] → [I|A^−1]
Real Matrix                     A* = A                  a_ij = a*_ij
Imaginary Matrix                A* = −A                 a_ij = −a*_ij
Symmetric Matrix                A = A^T                 a_ij = a_ji , A real
Skew Symmetric                  A = −A^T                a_ij = −a_ji , A real
Orthogonal                      A^−1 = A^T
Unitary Matrix                  A^−1 = A†
Normal Matrix                   AA† = A†A

2.11 Eigenvalues and Eigenvectors


The basic concepts here – eigenvalues and eigenvectors – are used through mathematics, physics
and chemistry.
     
 Example 2.59 Let A = [ 0  −2 ; −4  2 ], u = [ 1 ; 1 ] and v = [ −1 ; 1 ]. Examine the images of u and v under multiplication by A.

Au = [ 0  −2 ; −4  2 ][ 1 ; 1 ] = [ −2 ; −2 ] = −2 [ 1 ; 1 ] = −2u

Av = [ 0  −2 ; −4  2 ][ −1 ; 1 ] = [ −2 ; 6 ] ≠ λv.

Here u is called an eigenvector of A. v is not an eigenvector of A since Av is not a multiple of v. 

Definition 2.11.1 An eigenvector of an n × n matrix A is a nonzero vector x such that Ax = λ x


for some scalar λ . A scalar λ is called an eigenvalue if there is a nontrivial solution x for
Ax = λ x, such an x is called the eigenvector corresponding to λ .
 
 Example 2.60 Show that 4 is an eigenvalue of A = [ 0  −2 ; −4  2 ] and find the corresponding eigenvectors.

Solution: The scalar 4 is an eigenvalue of A if and only if Ax = 4x has a nontrivial solution.


This is equivalent to (A − 4I)x = 0 having a nontrivial solution. To solve this problem, we first find
A − 4I:
     
A − 4I = [ 0  −2 ; −4  2 ] − [ 4  0 ; 0  4 ] = [ −4  −2 ; −4  −2 ].    (2.46)

Now solve (A − 4I)x = 0 using row reduction


   
[ −4  −2 | 0 ; −4  −2 | 0 ] → [ 1  1/2 | 0 ; 0  0 | 0 ].    (2.47)

Thus, using back substitution we find x1 = −(1/2) x2, or in vector form

x = [ x1 ; x2 ] = x2 [ −1/2 ; 1 ].    (2.48)

Thus, each vector of the form x2 [ −1/2 ; 1 ] is an eigenvector for the eigenvalue λ = 4. 

R The method just used to find eigenvectors cannot be used to find eigenvalues. We must come up with a procedure for finding each eigenvalue and then implement the above strategy for finding the corresponding eigenvectors.

 Example 2.61 Suppose that λ is an eigenvalue of A. Determine an eigenvalue of A2 and A3 . In


general, what is the eigenvalue of An .

Solution: Since λ is an eigenvalue of A, there is a nonzero vector x such that Ax = λx. Apply A to both sides:

A2 x = A(λ x)
A2 x = λ Ax
A2 x = λ 2 x.

In general, λ n is an eigenvalue of An . 
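This can be verified numerically with the matrix and eigenvector from Example 2.59 (the helper name `matvec` is my own):

```python
def matvec(A, x):
    return tuple(sum(a*xi for a, xi in zip(row, x)) for row in A)

A = [[0, -2], [-4, 2]]   # from Example 2.59, where Au = -2u for u = (1, 1)
u = (1, 1)
Au = matvec(A, u)
AAu = matvec(A, Au)
print(Au, AAu)  # (-2, -2) (4, 4): A^2 u = (-2)^2 u
```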

Theorem 2.11.1 The eigenvalues of a triangular matrix are the entries on the diagonal.

Theorem 2.11.2 If v1 , ..., vr are eigenvectors that correspond to distinct eigenvalues λ1 , ..., λr of
an n × n matrix A, then v1 , ..., vr are linearly independent.

2.11.1 The Characteristic Equation: Finding Eigenvalues


To find eigenvectors we need to solve (A − λ I)x = 0 and find nontrivial solutions, but how do we
find the eigenvalues, λ ? There are two ways to think about it:

1. (A−λ I)x = 0 must have nontrivial solutions. Then (A−λ I) is not invertible. Thus det(A−λ I) =
0.

2. For there to be nontrivial solutions of (A − λ I)x = 0, Cramer’s rule must fail. It fails when the
det(A − λ I) = 0.
Definition 2.11.2 (Characteristic Equations) To find the eigenvalues of a matrix one must solve
the characteristic equation

det(A − λ I) = 0. (2.49)
 
 Example 2.62 Find the eigenvalues of A = [ 0  1 ; −6  5 ].

Solution: Since

A − λI = [ 0  1 ; −6  5 ] − [ λ  0 ; 0  λ ] = [ −λ  1 ; −6  5 − λ ],

the characteristic equation becomes

−λ (5 − λ ) + 6 = 0
λ 2 − 5λ + 6 = 0
(λ − 2)(λ − 3) = 0.

Thus, the eigenvalues are λ = 2 and λ = 3. 
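For a 2 × 2 matrix the characteristic equation is the quadratic λ² − tr(A)λ + det(A) = 0, which can be solved directly. A sketch for the real-eigenvalue case (the function name `eig2` is my own):

```python
import math

def eig2(M):
    """Real eigenvalues of a 2 x 2 matrix from lambda^2 - tr*lambda + det = 0."""
    (a, b), (c, d) = M
    tr, det = a + d, a*d - b*c
    disc = tr*tr - 4*det
    if disc < 0:
        raise ValueError("eigenvalues are complex")
    r = math.sqrt(disc)
    return sorted([(tr - r)/2, (tr + r)/2])

print(eig2([[0, 1], [-6, 5]]))  # [2.0, 3.0]
```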

R For a 3 × 3 matrix or larger, recall that the determinant can be computed by cofactor expansion.

 Example 2.63 Find the eigenvalues of A = [ 1  2  1 ; 0  −5  0 ; 1  8  1 ].

Solution: Since

A − λI = [ 1 − λ   2   1 ; 0   −5 − λ   0 ; 1   8   1 − λ ].

Then, expanding along the second row,

det(A − λI) = (−5 − λ) | 1 − λ   1 ; 1   1 − λ |,

the characteristic equation becomes

(−5 − λ )[(1 − λ 2 ) − 1] = 0
(−5 − λ )[λ 2 − 2λ ] = 0
(−5 − λ )λ (λ − 2) = 0.

Thus, the eigenvalues are λ = −5, λ = 0, and λ = 2. 


 
 Example 2.64 Find the eigenvalues of A = [3 2 3; 0 6 10; 0 0 2].
Solution: Since
A − λI = [3−λ 2 3; 0 6−λ 10; 0 0 2−λ].

Then

det(A − λ I) = (3 − λ )(6 − λ )(2 − λ ) = 0.

Thus, the eigenvalues are λ = 3, λ = 6, and λ = 2. 

2.11.2 Similarity
Definition 2.11.3 For n × n matrices A and B, we say A is similar to B if there is an invertible
matrix P such that

P−1 AP = B or A = PBP−1 .

Theorem 2.11.3 If n × n matrices A and B are similar, then they have the same characteristic
polynomial and hence the same eigenvalues!

Definition 2.11.4 A square matrix A is diagonalizable if A is similar to a diagonal matrix, i.e.
if A = PDP⁻¹ where P is invertible and D is diagonal.

R If we consider the matrix P as a change of coordinates, then a diagonalizable matrix describes
the same linear transformation in two coordinate systems. In particular, working with D is much
simpler than working with A.

2.12 Diagonalization
One of the goals of this section is to develop a useful factorization A = PDP⁻¹ when A is n × n.
We can use this to find Aᵏ quickly for large k. The matrix D is diagonal, and Dᵏ is trivial
to compute (each diagonal entry is raised to the kth power). Thus, Aᵏ = (PDP⁻¹)ᵏ = PDᵏP⁻¹, which is
easier to compute. But how do we find the matrix P?

Theorem 2.12.1 (Diagonalization Theorem) An n × n matrix A is diagonalizable if and only if


A has n linearly independent eigenvectors. In fact, A = PDP−1 with D a diagonal matrix, if and
only if the columns of P are n linearly independent eigenvectors of A. In this case, the diagonal
entries of D are eigenvalues of A that correspond, respectively, to the eigenvectors in P.
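The payoff Aᵏ = PDᵏP⁻¹ can be demonstrated numerically (a sketch assuming NumPy; the 2×2 matrix here is a hypothetical example with eigenvalues 5 and 2, not one from the text):

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])          # eigenvalues 5 and 2

lam, P = np.linalg.eig(A)           # columns of P are eigenvectors
D = np.diag(lam)

# the Diagonalization Theorem factorization A = P D P⁻¹
assert np.allclose(A, P @ D @ np.linalg.inv(P))

# powers: Aᵏ = P Dᵏ P⁻¹, where Dᵏ just raises the diagonal entries
k = 8
assert np.allclose(np.linalg.matrix_power(A, k),
                   P @ np.diag(lam**k) @ np.linalg.inv(P))
```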


 Example 2.65 Diagonalize the following matrix (if possible): A = [2 0 0; 1 2 1; −1 0 1].
Step 1: Find the eigenvalues of A. Use the characteristic equation!
0 = det(A − λI) = det[2−λ 0 0; 1 2−λ 1; −1 0 1−λ] = (2 − λ)²(1 − λ).

Thus, the eigenvalues are λ = 1 and λ = 2. Even though we do not have three unique eigenvalues,
we hopefully will find three linearly independent eigenvectors.

Step 2: Find three linearly independent eigenvectors of A. To find the eigenvectors we must
solve (A − λ I)x = 0, for each value of λ .

Case 1 (λ = 1): Solve (A − I)x = 0 by writing the augmented matrix and row-reducing:
   
1 0 0 0 1 0 0 0
 1 1 1 0  →  0 1 1 0 .
−1 0 0 0 0 0 0 0

Thus, from back substitution, we see x2 + x3 = 0 or x2 = −x3 with x3 free. Also, from the first row,
x1 = 0. So the eigenvector corresponding to λ = 1 has the form v1 = x3(0, −1, 1)^T.

Case 2 (λ = 2): Solve (A − 2I)x = 0 by writing the augmented matrix and row-reducing:
   
0 0 0 0 1 0 1 0
 1 0 1 0  →  0 0 0 0 .
−1 0 −1 0 0 0 0 0

Thus, from back substitution, we see x1 + x3 = 0 or x1 = −x3 with x2, x3 free. So the eigenvectors
corresponding to λ = 2 are v2 = x3(−1, 0, 1)^T, and we must pick another linearly independent
eigenvector; since x2 is free, just change its value: take x2 = 1 and x3 = 0 to get v3 = (0, 1, 0)^T.
Step 3: Construct P from the vectors in Step 2.
 
0 0 −1
P =  −1 1 0 
1 0 1

Step 4: Construct D from the corresponding eigenvalues.


 
1 0 0
D= 0 2 0 
0 0 2

Note that the eigenvector in the first column of P must be associated to the eigenvalue in the first
column of D.

Step 5: Check your work by verifying AP = PD. It is easier to check this than to compute
P⁻¹. 
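Step 5 can be carried out numerically for this example (a sketch assuming NumPy), using the P and D constructed above:

```python
import numpy as np

A = np.array([[2.0, 0.0, 0.0],
              [1.0, 2.0, 1.0],
              [-1.0, 0.0, 1.0]])
P = np.array([[0.0, 0.0, -1.0],
              [-1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0]])
D = np.diag([1.0, 2.0, 2.0])

assert np.allclose(A @ P, P @ D)      # the Step 5 check: AP = PD
assert abs(np.linalg.det(P)) > 1e-12  # P is invertible, so A = PDP⁻¹
assert np.allclose(A, P @ D @ np.linalg.inv(P))
```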
 
 Example 2.66 Diagonalize the following matrix (if possible): A = [2 4 6; 0 2 2; 0 0 4].
Step 1: Find the eigenvalues of A. Use the characteristic equation!
0 = det(A − λI) = det[2−λ 4 6; 0 2−λ 2; 0 0 4−λ] = (2 − λ)²(4 − λ).

Thus, the eigenvalues are λ = 4 and λ = 2. Even though we do not have three unique eigenvalues,
we hopefully will find three linearly independent eigenvectors.

Step 2: Find three linearly independent eigenvectors of A. To find the eigenvectors we must
solve (A − λ I)x = 0, for each value of λ .

Case 1 (λ = 4): Solve (A − 4I)x = 0 by writing the augmented matrix and row-reducing:
 
−2 4 6 0
 0 −2 2 0  .
0 0 0 0

Thus, from back substitution, we see −2x2 + 2x3 = 0 or x2 = x3 with x3 free. Also, from the first
row, −2x1 + 4x2 + 6x3 = 0 or x1 = 2x2 + 3x3 = 5x3. So the eigenvector corresponding to λ = 4 has
the form v1 = x3(5, 1, 1)^T.
Case 2 (λ = 2): Solve (A − 2I)x = 0 by writing the augmented matrix and row-reducing:
   
0 4 6 0 0 4 6 0
 0 0 2 0  →  0 0 2 0 .
0 0 2 0 0 0 0 0
Thus, from back substitution, we see x3 = 0. From the first equation, 4x2 + 6x3 = 0 or x2 = 0, with
x1 free. So the eigenvector corresponding to λ = 2 is v2 = x1(1, 0, 0)^T, and we cannot find another
linearly independent eigenvector. Thus, A is not diagonalizable. 
 
 Example 2.67 Why is A = [2 0 0; 2 6 0; 3 2 1] diagonalizable?
First, find the eigenvalues with the characteristic equation:
0 = det(A − λI) = det[2−λ 0 0; 2 6−λ 0; 3 2 1−λ] = (2 − λ)(6 − λ)(1 − λ).
Thus, the eigenvalues are λ = 1, λ = 2, and λ = 6. Since the eigenvalues are distinct, they will
each have at least one linearly independent eigenvector. Thus, we are guaranteed to have enough
eigenvectors to build the matrices P and D. 

R In the special case that the matrix A is real and symmetric, then it can always be diagonalized
and the eigenvectors have an additional property. They are no longer just linearly independent,
they are also orthogonal (e.g., v1 · v2 = 0).
 
 Example 2.68 Diagonalize the following matrix (if possible): A = [5 0 2; 0 3 0; 2 0 5].
Step 1: Find the eigenvalues of A. Use the characteristic equation!
0 = det(A − λI) = det[5−λ 0 2; 0 3−λ 0; 2 0 5−λ] = (3 − λ)·det[5−λ 2; 2 5−λ]
= (3 − λ)[(5 − λ)² − 4] = (3 − λ)(3 − λ)(7 − λ).

Step 2: Find three linearly independent eigenvectors of A. To find the eigenvectors we must
solve (A − λ I)x = 0, for each value of λ .

Case 1 (λ = 7): Solve (A − 7I)x = 0 by writing the augmented matrix and row-reducing:
   
−2 0 2 0 −2 0 2 0
 0 −4 0 0  →  0 −4 0 0  .
2 0 −2 0 0 0 0 0

Thus, from back substitution, we see −4x2 = 0 or x2 = 0. Also, from the first row, −2x1 + 2x3 = 0
or x1 = x3 with x3 free. So the eigenvector corresponding to λ = 7 has the form v1 = x3(1, 0, 1)^T.
Case 2 (λ = 3): Solve (A − 3I)x = 0 by writing the augmented matrix and row-reducing:
   
2 0 2 0 2 0 2 0
 0 0 0 0  →  0 0 0 0 .
2 0 2 0 0 0 0 0
Thus, from back substitution, we see 2x1 + 2x3 = 0 or x1 = −x3 with x2, x3 free. So the eigenvectors
corresponding to λ = 3 are v2 = x3(−1, 0, 1)^T, and we must pick another linearly independent
eigenvector; since x2 is free, just change its value: take x2 = 1 and x3 = 0 to get v3 = (0, 1, 0)^T.
Step 3: Construct P from the vectors in Step 2.
 
1 −1 0
P= 0 0 1 
1 1 0
Step 4: Construct D from the corresponding eigenvalues.
 
7 0 0
D= 0 3 0 
0 0 3
Note that the eigenvector in the first column of P must be associated to the eigenvalue in the first
column of D.

Observe that since the matrix is real and symmetric, the eigenvectors are all orthogonal! 
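The eigenvalue equations and the pairwise orthogonality of the eigenvectors found in this example can be checked directly (a sketch assuming NumPy):

```python
import numpy as np

A = np.array([[5.0, 0.0, 2.0],
              [0.0, 3.0, 0.0],
              [2.0, 0.0, 5.0]])
assert np.allclose(A, A.T)              # A is real and symmetric

v1 = np.array([1.0, 0.0, 1.0])          # eigenvector for λ = 7
v2 = np.array([-1.0, 0.0, 1.0])         # eigenvector for λ = 3
v3 = np.array([0.0, 1.0, 0.0])          # eigenvector for λ = 3
assert np.allclose(A @ v1, 7 * v1)
assert np.allclose(A @ v2, 3 * v2)
assert np.allclose(A @ v3, 3 * v3)

# pairwise orthogonality of the eigenvectors
assert v1 @ v2 == 0 and v1 @ v3 == 0 and v2 @ v3 == 0
```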
 
 Example 2.69 Diagonalize the following matrix (if possible): A = [2 1 1; 1 2 1; 1 1 2].
Step 1: Find the eigenvalues of A. Use the characteristic equation!

0 = det(A − λI) = det[2−λ 1 1; 1 2−λ 1; 1 1 2−λ] = · · · = (1 − λ)(1 − λ)(4 − λ).
Thus, the eigenvalues are λ = 1 and λ = 4. Even though we do not have three unique eigenvalues,
we hopefully will find three linearly independent eigenvectors.

Step 2: Find three linearly independent eigenvectors of A. To find the eigenvectors we must
solve (A − λ I)x = 0, for each value of λ .

Case 1 (λ = 4): Solve (A − 4I)x = 0 by writing the augmented matrix and row-reducing:
   
[−2 1 1 0; 1 −2 1 0; 1 1 −2 0] → [1 −2 1 0; 0 −3 3 0; 0 0 0 0].
2.12 Diagonalization 83

Thus, from back substitution, we see −3x2 + 3x3 = 0 or x2 = x3 with x3 free. Also, from the first row,
x1 − 2x2 + x3 = 0 or x1 = x3. So the eigenvector corresponding to λ = 4 has the form v1 = x3(1, 1, 1)^T.
Case 2 (λ = 1): Solve (A − I)x = 0 by writing the augmented matrix and row-reducing:
   
1 1 1 0 1 1 1 0
 1 1 1 0  →  0 0 0 0 .
1 1 1 0 0 0 0 0
Thus, from back  we see x1 + x2 + x3 = 0 or x1 = −x2 − x3 with x2 ,x3 free.
 substitution,
  So
−1 −1 −1
x = x2  1  + x3  0 . So the eigenvectors corresponding to λ = 3 are v2 =  1  and
0 1 0
 
−1
v3 =  0 .
1
Step 3: Construct P from the vectors in Step 2.
 
1 −1 −1
P= 1 1 0 
1 0 1
Step 4: Construct D from the corresponding eigenvalues.
 
4 0 0
D= 0 1 0 
0 0 1
Note that the eigenvector in the first column of P must be associated to the eigenvalue in the first
column of D.

Observe that since the matrix is real and symmetric, the eigenvector v1 for the distinct eigenvalue λ = 4 is orthogonal to both v2 and v3; within the degenerate λ = 1 eigenspace, v2 and v3 can also be chosen orthogonal (e.g., by Gram–Schmidt). 

2.12.1 Physical Interpretation of Eigenvalues, Eigenvectors, and Diagonalization


In physics it is important to track the deformation of a material. To do so, we consider an initial position
at the point (x, y); the system is then stretched, rotated, reflected, etc., until the original point is at a
new position (X, Y). This deformation can be described by a matrix M.
The first natural question is whether there exist deformations for which a vector just gets stretched or
shrunk along the same direction; in other words,
(X, Y)^T = λ(x, y)^T.
Such vectors with this property are the eigenvectors of the transformation and the special values λ
are the eigenvalues of the transformation. If there exists a matrix P such that P−1 MP = D, then
we say that we have diagonalized M by a similarity transformation. Physically, this amounts to a
simplification of the problem using a better choice of variables.
Now consider the physical meaning of P and D. Consider a set of two axes, the traditional (x, y)
and a set of axes (x′, y′) rotated by angle θ. The relation of one coordinate system to the other can
be expressed as a system of linear equations:
x = x′cos(θ) − y′sin(θ)
y = x′sin(θ) + y′cos(θ).

or, in matrix notation,
r = Pr′,  where P = [cos(θ) −sin(θ); sin(θ) cos(θ)].

Recall that M is the matrix that described the deformation in the (x, y)-plane. Then R = Mr shows
that the vector r becomes the vector R after the deformation.

Thinking Question: How can we describe the deformation in the (x′, y′) system? In other words,
what matrix takes r′ to R′?

By using the above relations we find:

R = Mr
PR′ = MPr′
R′ = P⁻¹MPr′.

Thus, D = P⁻¹MP is the matrix which describes in the (x′, y′) system the same deformation that M
describes in the (x, y) system.

Thinking Question: What happens in the case that P is chosen to make D a diagonal matrix?

If this is the case, then the new axes (x′, y′) are along the directions of the eigenvectors of M. If the
eigenvectors are orthogonal, then the new axes will be orthogonal as well. In principle, if P is not an
orthogonal matrix (composed of orthogonal eigenvectors), then the new axes will not be orthogonal.
The only case where we are guaranteed orthogonal eigenvectors is if the original deformation M is
real and symmetric.
When D is diagonal, in the (x′, y′) coordinate system the material is simply stretched or shrunk
along the axes, no matter how complicated the original deformation M was.
Definition 2.12.1 If two or more eigenvalues are the same, then this eigenvalue is called
degenerate. Degeneracy means that two independent eigenvectors correspond to the same
eigenvalue.
III  Part Three: Multivariable Calculus

3 Partial Differentiation . . . . . . . . . . . . . . . . . 87
3.1 Introduction and Notation
3.2 Power Series in Two Variables
3.3 Total Differentials
3.4 Approximations Using Differentials
3.5 Chain Rule or Differentiating a Function of a Function
3.6 Implicit Differentiation
3.7 More Chain Rule
3.8 Maximum and Minimum Problems with Constraints
3.9 Lagrange Multipliers

4 Multivariable Integration and Applications . . . 111
4.1 Introduction
4.2 Double Integrals Over General Regions
4.3 Triple Integrals
4.4 Applications of Integration
4.5 Change of Variables in Integrals
4.6 Cylindrical Coordinates
4.7 Cylindrical Coordinates
4.8 Surface Integrals

5 Vector Analysis . . . . . . . . . . . . . . . . . . . . . 135


5.1 Applications of Vector Multiplication
5.2 Triple Products
5.3 Fields
5.4 Differentiation of Vectors
5.5 Directional Derivative and Gradient
5.6 Some Other Expressions Involving ∇
5.7 Line Integrals
5.8 Green’s Theorem in the Plane
5.9 The Divergence (Gauss) Theorem
5.10 The Stokes (Curl) Theorem
3. Partial Differentiation

3.1 Introduction and Notation


In Calculus I-II we focus on functions of one variable y = f (x). In reality we need to be able to take
derivatives in many dimensions (e.g., velocity, acceleration, etc.). Other applications of derivatives
we have seen before include:
• Power Series Expansions (e.g., f(x) = f(0) + f′(0)x + (f″(0)/2)x² + ...)
• Max/Min Problems, Extrema
We start by considering functions of several real variables, z = f(x, y) (we will stay in real space
for now, so we can consider both two and three dimensions).
 Example 3.1 The volume of a cylinder V (r, h) = πr2 h depends on the radius and the height. 

Example 3.2 Suppose z = f(x, y) but x = 5; then z = f(5, y) is a 2D curve in 3D space, given
by the intersection of the plane x = 5 with the surface z = f(x, y). 

Definition 3.1.1 (Level Curves) A level curve is made up of points (x, y) where f (x, y) = const.

 Example 3.3 Let z = f (x, y) = x2 + y2 , then the level curves have the form x2 + y2 = const and
are circles! 

In two dimensions the derivatives are simple. If y = f(x) = x², then dy/dx = y′ = 2x and
d²y/dx² = y″ = 2. Is there an analogue in three dimensions, say for z = f(x, y) = x²y?

We have several options for a first derivative:
• Take the derivative with respect to x holding y constant (fixed), denoted ∂f/∂x = 2xy.
• Take the derivative with respect to y holding x constant, denoted ∂f/∂y = x².
• Take the gradient, a vector in the direction of the maximal change in both x and y, composed
of the previous two: ∇f = (∂f/∂x, ∂f/∂y)^T = (2xy, x²)^T.
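The two partial derivatives above can be spot-checked with central differences, holding the other variable fixed (a sketch in plain Python; the sample point is arbitrary):

```python
def f(x, y):
    return x * x * y          # the example function z = x²y

x, y, h = 1.5, -0.7, 1e-6

# hold y fixed and difference in x: approximates ∂f/∂x = 2xy
fx = (f(x + h, y) - f(x - h, y)) / (2 * h)
# hold x fixed and difference in y: approximates ∂f/∂y = x²
fy = (f(x, y + h) - f(x, y - h)) / (2 * h)

assert abs(fx - 2 * x * y) < 1e-6
assert abs(fy - x * x) < 1e-6
```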

R Notice that the partial derivative of f with respect to x is not the same as the total derivative:
∂f/∂x ≠ df/dx. More on this later!

There are also many ways to take a second derivative:
• Take the derivative with respect to x a second time holding y constant (fixed), denoted
∂²f/∂x² = 2y.
• Take the derivative with respect to y a second time holding x constant, denoted ∂²f/∂y² = 0.
• Take one derivative of each (while holding the other fixed), denoted ∂²f/∂x∂y = ∂²f/∂y∂x = 2x. This
is called the “Mixed Derivative". Observe that we can switch the order and get the same
result for “nice functions" (e.g., continuous, differentiable, etc.; see Clairaut’s Theorem).

R There are additional notations for partial derivatives. The partial derivative of z = f(x, y) with
respect to x can be denoted zx, fx, ∂f/∂x, or f1, among others.

3.1.1 Review of Product, Quotient, and Chain Rule


Definition 3.1.2 (Product Rule) Given two differentiable functions f (x) and g(x), the derivative
of the product

d
[ f (x)g(x)] = f 0 (x)g(x) + g0 (x) f (x).
dx

Definition 3.1.3 (Quotient Rule) Given two differentiable functions f (x) and g(x), the derivative
of the quotient

f 0 (x)g(x) − f (x)g0 (x)


 
d f (x)
= .
dx g(x) [g(x)]2

Definition 3.1.4 (Chain Rule) Given two differentiable functions f (x) and g(x), the derivative
of the composition g ◦ f is

d
[g( f (x))] = g0 ( f (x)) f 0 (x).
dx

Definition 3.1.5 (Derivatives with Log or Exp) Given a differentiable function f(x),
d/dx e^{f(x)} = f′(x)e^{f(x)}
d/dx ln(f(x)) = f′(x)/f(x)
d/dy x^y = x^y ln(x)
 Example 3.4 Let z = f(x, y) = −x²y³ + e^{−x²y}. Find fx, fy, fxx, fyy, fxy = fyx.

Solution:
fx = −2xy³ − 2xy e^{−x²y}
fy = −3x²y² − x² e^{−x²y}
fxx = −2y³ − 2y e^{−x²y} + 4x²y² e^{−x²y}
fyy = −6x²y + x⁴ e^{−x²y}
fxy = −6xy² − 2x e^{−x²y} + 2x³y e^{−x²y}

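The mixed partial fxy derived above can be verified against a central-difference approximation of ∂²f/∂x∂y (a sketch in plain Python; the sample point is arbitrary):

```python
import math

def f(x, y):
    return -x**2 * y**3 + math.exp(-x**2 * y)

def fxy(x, y):
    # the mixed partial computed in Example 3.4
    e = math.exp(-x**2 * y)
    return -6*x*y**2 - 2*x*e + 2*x**3*y*e

x, y, h = 0.7, 0.3, 1e-5
# central-difference approximation of ∂²f/∂x∂y
num = (f(x+h, y+h) - f(x+h, y-h) - f(x-h, y+h) + f(x-h, y-h)) / (4*h*h)
assert abs(num - fxy(x, y)) < 1e-4
```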
 Example 3.5 Let z = f(x, y) = 2x²y/(3x + 1) and find fx, fy, fxy.

Solution:
fx = [4xy(3x + 1) − 6x²y]/(3x + 1)²
fy = 2x²/(3x + 1)
fxy = [4x(3x + 1) − 6x²]/(3x + 1)² = (6x² + 4x)/(3x + 1)².


 Example 3.6 Let z = f (x, y) = ln(3x + 1) and find fx .

Solution:
fx = 3/(3x + 1).

 
 Example 3.7 If f(x, y) = sin(x/(1 + y)), find fx and fy.

Solution: Compute
fx = ∂f/∂x = cos(x/(1 + y)) · [1/(1 + y)]
fy = ∂f/∂y = cos(x/(1 + y)) · [−x/(1 + y)²].


 Example 3.8 If f(x, y) = x³ + x²y³ − 2y², find fx(2, 1) and fy(2, 1).

Solution: Compute
fx(2, 1) = [3x² + 2xy³]|(x,y)=(2,1) = 3(4) + 2(2)(1) = 16
fy(2, 1) = [3x²y² − 4y]|(x,y)=(2,1) = 3(4)(1) − 4(1) = 8.



 Example 3.9 If f(x, y) = 4 − x² − y², find fx(1, 1) and fy(1, 1).

Solution: Compute
fx(1, 1) = −2x|(x,y)=(1,1) = −2
fy(1, 1) = −2y|(x,y)=(1,1) = −2.

 Example 3.10 If f(x, y, z) = e^{xy} ln(z), find fx, fy, and fz.

Solution: Compute
fx = ∂f/∂x = y e^{xy} ln(z)
fy = ∂f/∂y = x e^{xy} ln(z)
fz = ∂f/∂z = e^{xy}/z.


We can even work in other coordinate systems such as polar (r, θ ).


 Example 3.11 If f(x, y) = 3x² − y² = 3r²cos²(θ) − r²sin²(θ) = g(r, θ), find fr and fθ.

Solution: Compute
fr = 2r[3cos²(θ) − sin²(θ)]
fθ = r²[−6cos(θ)sin(θ) − 2sin(θ)cos(θ)] = −8r²sin(θ)cos(θ).

There are several physical scalar quantities that are functions of more than one variable. For
example, the temperature in a material depends on space and time: T = T(x, y, z, t). In physics,
these quantities have physical meaning. One way physicists denote taking a derivative while other
parameters are held constant is (∂T/∂p)V. This indicates that we take the derivative of the temperature,
T, with respect to the pressure, p, leaving the volume V a fixed constant.

3.2 Power Series in Two Variables


Recall the power series for a function of one real variable:
f(x) = f(a) + f′(a)(x − a) + (f″(a)/2!)(x − a)² + ... + (f^{(k)}(a)/k!)(x − a)^k + ...
Apply a similar idea to functions of multiple variables.

Case I: Separable. A function of two variables, f(x, y), is separable if it can be written as a
product of a function of x and a function of y: f(x, y) = g(x)h(y). In this case you can expand each
factor in a power series in one variable and multiply to get the power series of f.

Step 1: Expand each function in a 1D Taylor series.

Step 2: Multiply the terms and group by increasing total power α + β in x^α y^β.


 Example 3.12 Step 1:
f(x, y) = e^x sin(y) = (1 + x + x²/2 + ...)(y − y³/3! + ...).
Step 2:
f(x, y) = e^x sin(y) = y + xy + x²y/2 − y³/3! − xy³/3! + x³y/6 + ...


 Example 3.13 Step 1:
f(x, y) = cos(x)cos(y) = (1 − x²/2! + ...)(1 − y²/2! + ...).
Step 2:
f(x, y) = cos(x)cos(y) = 1 − x²/2 − y²/2 + x²y²/4 + ...


Definition 3.2.1 The Taylor Series Expansion of f (x, y) about the point (a, b) uses powers of
(x − a) and (y − b) in the form

f (x, y) = a00 + a10 (x − a) + a01 (y − b) + a20 (x − a)2 + a11 (x − a)(y − b) + a02 (y − b)2 + ...

where f (a, b) = a00 and

fx = a10 + 2a20 (x − a) + a11 (y − b) + ... if x = a and y = b, then fx (a, b) = a10


fy = a01 + a11 (x − a) + 2a02 (y − b) + ... if x = a and y = b, then fy (a, b) = a01
fxx = 2a20 + ... if x = a and y = b, then fxx (a, b) = 2a20
fyy = 2a02 + ... if x = a and y = b, then fyy (a, b) = 2a02
fxy = fyx = a11 + ... if x = a and y = b, then fxy (a, b) = a11

Therefore,
f(x, y) = f(a, b) + fx(a, b)(x − a) + fy(a, b)(y − b) + (fxx(a, b)/2!)(x − a)² + fxy(a, b)(x − a)(y − b) + (fyy(a, b)/2!)(y − b)² + ...
If we let h = (x − a) and k = (y − b), then the second order terms have the form
(1/2!)[fxx(a, b)h² + 2fxy(a, b)hk + fyy(a, b)k²] = (1/2!)(h ∂/∂x + k ∂/∂y)² f(a, b).
The Taylor Series Expansion can then be written in the general form
f(x, y) = Σ_{n=0}^∞ (1/n!)(h ∂/∂x + k ∂/∂y)^n f(a, b),   (3.1)
where the terms of (h ∂/∂x + k ∂/∂y)^n carry the binomial coefficients (see Pascal’s Triangle!).

R This is a Maclaurin Series if a = b = 0. This procedure works on all functions (not just
separable functions).

R Compare the series representation for the 2D series with that of the 1D series for a function
of one real variable:
f(x) = Σ_{n=0}^∞ (1/n!)(h ∂/∂x)^n f(a).   (3.2)

 Example 3.14 Find the Taylor Series Expansion of f(x, y) = e^{x+y}:
e^{x+y} = 1 + x + y + (1/2)(x² + 2xy + y²) + ...

 Example 3.15 Find the Taylor Series Expansion of f(x, y) = e^x sin(y):
e^x sin(y) = y + xy + ..., which matches a previous example.

 Example 3.16 Find the Taylor Series Expansion of f (x, y) = sin(x − y):

sin(x − y) = x − y + ...
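A truncated 2D Taylor polynomial really does approximate the function near the expansion point; here is a numerical spot-check of the e^x sin(y) expansion through total degree 3 (a sketch in plain Python):

```python
import math

def taylor3(x, y):
    # terms of e^x·sin(y) through total degree 3: y + xy + x²y/2 − y³/6
    return y + x*y + x*x*y/2 - y**3/6

x, y = 0.1, 0.2
exact = math.exp(x) * math.sin(y)
# the truncation error is governed by the omitted degree-4 terms
assert abs(taylor3(x, y) - exact) < 1e-3
```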

3.3 Total Differentials


Definition 3.3.1 The differential dx of the independent variable x is dx = ∆x, but dy ≠ ∆y. The
change in y, ∆y, is the actual change in the y value, whereas dy is the change in y along the tangent
as x changes to x + ∆x.

R Of course, if dx is small, then ∆y ≈ dy and dy/dx = lim_{∆x→0} ∆y/∆x. This follows from the fact that
y′ = dy/dx ⇒ dy = y′ dx.

In the case of a multi-variable function z = f(x, y) we see that
dz = (∂z/∂x)dx + (∂z/∂y)dy.

This is called the total differential of the function. Also, ∂z/∂x and ∂z/∂y are the slopes of the tangent
lines in each direction. The total differential is different from the partial derivative in that we no longer
assume one of the variables is constant (fixed). Observe that if y is held constant, then dy = 0
and the total differential reduces to dz = (∂z/∂x)dx, i.e. dz/dx = ∂z/∂x.
The total differential can always be taken no matter how many variables are present. Let
u = f(x1, x2, x3, x4, ..., xn). Then the total differential is
du = (∂f/∂x1)dx1 + (∂f/∂x2)dx2 + (∂f/∂x3)dx3 + ... + (∂f/∂xn)dxn.

 Example 3.17 Recall the concept of Implicit Differentiation from Calculus I. If we are given a
function f(x, y) = x⁴ + 2y² = 8, it is hard to solve for y′, so we use the idea of implicit differentiation:
4x³ + 4y(dy/dx) = 0 ⇒ dy/dx = −4x³/(4y) = −x³/y.   (3.3)

The left hand side can be written as
(1/dx)[(∂f/∂x)dx + (∂f/∂y)dy] = 0.

The term inside the square brackets is the total differential of f . So even back in Calculus 1 we
were using this concept without knowing it. 

3.4 Approximations Using Differentials


 Example 3.18 Find the total differential of f(x, y) = 10x³ − 8x²y + 4y³.

Solution:
df = (∂f/∂x)dx + (∂f/∂y)dy = [30x² − 16xy]dx + [−8x² + 12y²]dy.

 Example 3.19 Find the approximate value of the change √(.5 + 10⁻¹⁹) − √.5 using total differentials.

Solution: Let f(x) = x^{1/2}; then ∆f = f(.5 + 10⁻¹⁹) − f(.5) ≈ df. Here x = .5 and dx = 10⁻¹⁹.
Thus,
df = (∂f/∂x)dx = (1/2)(.5)^{−1/2}(10⁻¹⁹) = (1/2)(1.41)(10⁻¹⁹) ≈ 7 × 10⁻²⁰.


 Example 3.20 Find the approximate value of 1/(n + 1)² − 1/(n − 1)² using total differentials.

Solution: Let f(x) = 1/(x + 1)²; then ∆f = f(n) − f(n − 2) ≈ df. Here x = n and dx = −2. Thus,
df = (∂f/∂x)dx = [−2/(x + 1)³](−2) = 4/(n + 1)³.
This has physical meaning: two forces that decay like n⁻² can sum to produce something with the extra
decay n⁻³. 
 Example 3.21 Let z = f(x, y) = 2√(x² + y²).
a) Use the total differential to approximate ∆z when x changes from 3 → 2.98 and y changes from
4 → 4.01.
b) Calculate the actual change ∆z.

Solution: First, using the total differential,
∆z ≈ dz = (∂f/∂x)dx + (∂f/∂y)dy
= [2x(x² + y²)^{−1/2}]dx + [2y(x² + y²)^{−1/2}]dy
= [2(3)/(3² + 4²)^{1/2}](−.02) + [2(4)/(3² + 4²)^{1/2}](.01) = −.008.
Next, compute the actual change:
∆z = f(x + ∆x, y + ∆y) − f(x, y)
= f(2.98, 4.01) − f(3, 4) = 2√(2.98² + 4.01²) − 2√(3² + 4²) ≈ −.007903.


R Observe that
f(x + ∆x, y + ∆y) − f(x, y) = [f(x + ∆x, y + ∆y) − f(x + ∆x, y)] + [f(x + ∆x, y) − f(x, y)]
≈ (∂f/∂y)dy + (∂f/∂x)dx.
∂y ∂x

 Example 3.22 Find the approximate value of (1.92² + 2.1²)^{1/3} using total differentials.

Solution: Let z = f(x, y) = (x² + y²)^{1/3}. Here x = 2, y = 2, dx = −.08, and dy = .1.
Thus,
dz = (∂f/∂x)dx + (∂f/∂y)dy
= [2x/(3(x² + y²)^{2/3})]dx + [2y/(3(x² + y²)^{2/3})]dy
= [2(2)/(3(2² + 2²)^{2/3})](−.08) + [2(2)/(3(2² + 2²)^{2/3})](.1) ≈ .0067.
So f(1.92, 2.1) ≈ f(2, 2) + dz = 2 + .0067 = 2.0067; actual value: 2.008. 

 Example 3.23 Approximate the change in volume of a beverage can in the shape of a right
circular cylinder as the radius changes from 3 to 2.5 and the height changes from 14 to 14.2.

Solution: Let V = f(r, h) = πr²h. Here r = 3, h = 14, dr = −.5, and dh = .2. Thus,
dV = (∂f/∂r)dr + (∂f/∂h)dh
= [2πrh]dr + [πr²]dh
= 2π(3)(14)(−.5) + π(3²)(.2) ≈ −126.2920.
Thus, decreasing the radius by .5 and increasing the height by .2 results in a decrease in the volume
of about 126.29 cubic units. 
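The differential estimate and the exact volume change can be compared directly (a sketch in plain Python):

```python
import math

def V(r, h):
    return math.pi * r * r * h

r, h = 3.0, 14.0
dr, dh = -0.5, 0.2

# total differential: dV = 2πrh·dr + πr²·dh
dV = 2 * math.pi * r * h * dr + math.pi * r * r * dh
actual = V(r + dr, h + dh) - V(r, h)

assert abs(dV - (-126.2920)) < 1e-3
# the differential is a first-order estimate of the true change
assert abs(dV - actual) < 0.1 * abs(actual)
```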

 Example 3.24 Making a profit depends on the level of inventory x and the floor space y in the
following way (in thousands)

P(x, y) = −.02x2 − 15y2 + xy + 39x + 25y − 20000.

Currently, we have 4,000,000 in inventory and 150,000 sq. feet; in these units, x = 4000 and y = 150.
Find the expected change in profit if management decides to increase the inventory by 500, 000 and
decrease the floor space by 10000 sq. feet.

Solution: Here x = 4000, y = 150, dx = 500, and dy = −10. Thus,
dP = (∂P/∂x)dx + (∂P/∂y)dy
= [−.04x + y + 39]dx + [−30y + x + 25]dy
= [−.04(4000) + 150 + 39](500) + [−30(150) + 4000 + 25](−10) = $19250

Thus, management will have made a good decision that results in an increase in the profits! 

 Example 3.25 The reduced mass µ of a system of two bodies satisfies µ⁻¹ = 1/m1 + 1/m2, so µ = m1m2/(m1 + m2).

From Newton’s 2nd Law: F21 = m1a1 and F12 = m2a2. From Newton’s 3rd Law: F21 = −F12 ⇒
m1a1 = −m2a2 ⇒ a2 = −(m1/m2)a1.

The relative acceleration is arel = a1 − a2 = (1 + m1/m2)a1 = [(m2 + m1)/(m1m2)]m1a1 = F21/µ. Thus, µ·arel = F21.

If m1 increases by 2%, then what is the percent change in m2 so that the reduced mass µ remains
unchanged?

Solution: Using total differentials with dm1 = .02m1 and dµ = 0, we see
0 = −m1⁻²dm1 − m2⁻²dm2 ⇒ dm2/m2² = −dm1/m1² = −.02m1/m1².
Thus,
dm2/m2 = −.02(m2/m1) ⇒ dm2 = −.02(m2/m1)m2.
Therefore, the reduced mass remains unchanged if m2 decreases by 2(m2/m1)%. 

3.5 Chain Rule or Differentiating a Function of a Function


Recall the Chain Rule for functions of a single variable.
 Example 3.26 If y = f (x) and x = g(t), then y(t) = f (g(t)). We then can take the derivative of
y with respect to t
dy dy dx
= = f 0 (g(t))g0 (t).
dt dx dt


Now, recall the total differential for a multi-variable function z = f(x, y):
dz = (∂f/∂x)dx + (∂f/∂y)dy.

If x and y are functions of t, then divide by dt to find
dz/dt = (∂f/∂x)(dx/dt) + (∂f/∂y)(dy/dt).   (Chain Rule) (3.4)

 Example 3.27 Let z = x²y + 3xy⁴ where x = sin(2t) and y = cos(t). Find dz/dt when t = 0.

Solution: By the Chain Rule,
dz/dt = (∂f/∂x)(dx/dt) + (∂f/∂y)(dy/dt) = [2xy + 3y⁴][2cos(2t)] + [x² + 12xy³][−sin(t)].

Note that when t = 0, then x = 0, y = 1. Thus,
dz/dt|_{t=0} = [0 + 3][2] + [0 + 0][0] = 6.
This is interpreted as the rate of change of z as (x, y) moves along a curve C. 
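The chain-rule value dz/dt|₀ = 6 can be verified with a one-variable central difference (a sketch in plain Python):

```python
import math

def z(t):
    x, y = math.sin(2*t), math.cos(t)
    return x*x*y + 3*x*y**4

h = 1e-6
# central-difference approximation of dz/dt at t = 0
dz_num = (z(h) - z(-h)) / (2*h)
assert abs(dz_num - 6.0) < 1e-6
```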

 Example 3.28 Let z = x² + y² + xy where x = sin(t) and y = e^t. Find dz/dt.

Solution: By the Chain Rule,
dz/dt = (∂f/∂x)(dx/dt) + (∂f/∂y)(dy/dt) = [2x + y][cos(t)] + [2y + x][e^t]
= 2sin(t)cos(t) + e^t cos(t) + 2e^{2t} + e^t sin(t).

 Example 3.29 Let w = ln√(x² + y² + z²) = (1/2)ln(x² + y² + z²) where x = sin(t), y = cos(t), and
z = tan(t). Find dw/dt.

Solution: By the Chain Rule,
dw/dt = (∂w/∂x)(dx/dt) + (∂w/∂y)(dy/dt) + (∂w/∂z)(dz/dt)
= [x/(x² + y² + z²)][cos(t)] + [y/(x² + y² + z²)][−sin(t)] + [z/(x² + y² + z²)][sec²(t)]
= [cos(t)sin(t) − cos(t)sin(t) + tan(t)sec²(t)]/(1 + tan²(t)) = tan(t)sec²(t)/sec²(t) = tan(t).


 Example 3.30 (Application) The pressure P (kPa), volume V (L), and temperature T (K) of a mole
of ideal gas are related by the equation PV = 8.31T (Ideal Gas Law). Find the rate at which the
pressure is changing when the temperature is 300 K and increasing at a rate of .1 K/s, and the volume
is 100 L and increasing at a rate of .2 L/s.

Solution: Here T = 300, dT/dt = .1, V = 100, dV/dt = .2, and P = 8.31T/V. By the Chain Rule:
dP/dt = (∂P/∂T)(dT/dt) + (∂P/∂V)(dV/dt) = (8.31/V)(dT/dt) − (8.31T/V²)(dV/dt)
= (8.31/100)(.1) − (8.31(300)/100²)(.2) = −0.04155.


Question: What if x(t) and y(t) were functions of two variables, x = x(t, s) and y = y(t, s)?

Then z = f(x(t, s), y(t, s)) and by the Chain Rule:
∂z/∂s = (∂f/∂x)(∂x/∂s) + (∂f/∂y)(∂y/∂s)
∂z/∂t = (∂f/∂x)(∂x/∂t) + (∂f/∂y)(∂y/∂t).

 Example 3.31 Let z = e^x sin(y) where x = st² and y = s²t. Find ∂z/∂s and ∂z/∂t.

Solution: Thus,
∂z/∂s = (∂f/∂x)(∂x/∂s) + (∂f/∂y)(∂y/∂s) = [e^x sin(y)][t²] + [e^x cos(y)][2st] = t²e^{st²}sin(s²t) + 2st·e^{st²}cos(s²t)
∂z/∂t = (∂f/∂x)(∂x/∂t) + (∂f/∂y)(∂y/∂t) = [e^x sin(y)][2st] + [e^x cos(y)][s²] = 2st·e^{st²}sin(s²t) + s²e^{st²}cos(s²t).


 Example 3.32 Let z = e^{x+2y} where x = s/t and y = t/s. Find ∂z/∂s and ∂z/∂t.

Solution: Thus,
∂z/∂s = (∂f/∂x)(∂x/∂s) + (∂f/∂y)(∂y/∂s) = [e^{x+2y}][1/t] + [2e^{x+2y}][−t/s²] = e^{s/t + 2t/s}(1/t − 2t/s²)
∂z/∂t = (∂f/∂x)(∂x/∂t) + (∂f/∂y)(∂y/∂t) = [e^{x+2y}][−s/t²] + [2e^{x+2y}][1/s] = e^{s/t + 2t/s}(−s/t² + 2/s).


3.6 Implicit Differentiation


Recall from Calculus that the method of Implicit Differentiation is a special case of the Chain Rule.
The method is “implicit" due to the fact that often y cannot be solved as a function of x.
 Example 3.33 Let x² + y² = 25 ⇒ y = ±√(25 − x²). Thus,
y′ = x/√(25 − x²)  or  y′ = −x/√(25 − x²).

Implicit Differentiation lets one find the derivative of y WITHOUT writing y explicitly as a function
of x. Take the derivative of each side with respect to x

2x + 2y(dy/dx) = 0
2y(dy/dx) = −2x
dy/dx = −2x/(2y) = −x/y = −x/(±√(25 − x²)).


This method is usually used when we want to know the value of the derivative at a point (x0 , y0 ).
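For an implicitly defined curve F(x, y) = 0, implicit differentiation is equivalent to the standard relation dy/dx = −Fx/Fy; here is a numerical spot-check on the circle above at the point (3, 4) (a sketch in plain Python):

```python
def F(x, y):
    return x*x + y*y - 25.0        # the circle x² + y² = 25

x0, y0 = 3.0, 4.0                  # a point on the circle
h = 1e-6
Fx = (F(x0 + h, y0) - F(x0 - h, y0)) / (2*h)
Fy = (F(x0, y0 + h) - F(x0, y0 - h)) / (2*h)

# slope from implicit differentiation: dy/dx = -x/y = -3/4
assert abs(-Fx/Fy - (-x0/y0)) < 1e-6
```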

 Example 3.34 x² + sin(x) = t. Find dx/dt.

Solution: This can be solved using total differentiation or implicit differentiation. First, with
TD we rewrite 0 = x² + sin(x) − t = f(x, t). Then
0 = (∂f/∂x)(dx/dt) + (∂f/∂t)(dt/dt) = [2x + cos(x)](dx/dt) − 1 ⇒ dx/dt = 1/(2x + cos(x)).
To use implicit differentiation we think of x as a function of t; then by the Chain Rule,
2x(dx/dt) + cos(x)(dx/dt) = 1 ⇒ dx/dt = 1/(2x + cos(x)).


For higher derivatives we do not use differentials! We only use implicit differentiation.
 Example 3.35 x² + sin(x) = t. Find d²x/dt².

Solution: To use implicit differentiation we think of x as a function of t, then apply the Chain Rule
to the result of the prior example:
2(dx/dt)² + 2x(d²x/dt²) + cos(x)(d²x/dt²) − sin(x)(dx/dt)² = 0
[2x + cos(x)](d²x/dt²) + [2 − sin(x)](dx/dt)² = 0
d²x/dt² = −(dx/dt)²[2 − sin(x)]/(2x + cos(x))
d²x/dt² = −(2 − sin(x))/(2x + cos(x))³.


Suppose we want the value of the derivative at a point. In the previous two examples, if we
want the values at x = π (so that t = π² + sin(π) = π²), then from the implicit differentiation we find
dx/dt = 1/(2π − 1) and d²x/dt² = −2/(2π − 1)³.
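Both derivative values can be verified by numerically inverting t = x² + sin(x) near x = π and differencing (a sketch in plain Python; Newton’s method is used for the inversion):

```python
import math

def x_of_t(t):
    # invert t = x² + sin(x) near x ≈ π by Newton's method
    x = 3.0
    for _ in range(50):
        x -= (x*x + math.sin(x) - t) / (2*x + math.cos(x))
    return x

t0 = math.pi**2                    # the t-value corresponding to x = π
h = 1e-4
xp = (x_of_t(t0 + h) - x_of_t(t0 - h)) / (2*h)
xpp = (x_of_t(t0 + h) - 2*x_of_t(t0) + x_of_t(t0 - h)) / (h*h)

assert abs(xp - 1/(2*math.pi - 1)) < 1e-6
assert abs(xpp - (-2/(2*math.pi - 1)**3)) < 1e-4
```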

The most prevalent application of implicit differentiation is to find the equation of a tangent
line to a curve, which also gives the slope at a point.
 Example 3.36 Find the equation of the tangent line to the curve x²y³ + x³y² − y = 0 at the point
(1, 1).

Solution: Using implicit differentiation we find
2xy³ + 3x²y²(dy/dx) + 3x²y² + 2x³y(dy/dx) − dy/dx = 0
(dy/dx)[3x²y² + 2x³y − 1] = −2xy³ − 3x²y²
dy/dx = −(2xy³ + 3x²y²)/(3x²y² + 2x³y − 1)
Plugging in (1, 1): dy/dx = −5/4.
Thus, the tangent line through (1, 1) has slope m = dy/dx|(1,1) = −5/4, so
y = −(5/4)(x − 1) + 1. 
 Example 3.37 Find the equation of the tangent line to the curve 3 − x + √(x² + y²) = 0 at the
point (0, 2).

Solution: Using implicit differentiation we find
−1 + (1/2)(x² + y²)^{−1/2}(2x + 2y(dy/dx)) = 0
x(x² + y²)^{−1/2} + y(x² + y²)^{−1/2}(dy/dx) = 1
dy/dx = [1 − x(x² + y²)^{−1/2}]/[y(x² + y²)^{−1/2}]
Plugging in (0, 2): dy/dx = 1.
Thus, the tangent line through (0, 2) has slope m = 1, so
y = x + 2. 

3.7 More Chain Rule


Continuing from the last section, we consider functions z = f (x, y) where x = x(s,t) and y = y(s,t).
 Example 3.38 Let $z = xy$ where $x = \sin(s+t)$ and $y = s - t$. Find $\frac{\partial z}{\partial t}$ and $\frac{\partial z}{\partial s}$.

Solution: Using the Chain Rule we find

$$\frac{\partial z}{\partial t} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial t} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial t} = [y][\cos(s+t)] + [x](-1) = y\cos(s+t) - x$$
$$\frac{\partial z}{\partial s} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial s} = [y][\cos(s+t)] + [x](1) = y\cos(s+t) + x$$
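A quick check of this example: substituting $x(s,t)$ and $y(s,t)$ into $z$ and differentiating directly must agree with the Chain Rule answers. A minimal sketch (sympy assumed available; not part of the original notes):

```python
# Direct-substitution check of Example 3.38 (sympy assumed available).
import sympy as sp

s, t = sp.symbols('s t')
x = sp.sin(s + t)
y = s - t
z = x*y  # z = xy with x and y written in terms of s and t

# Direct partials of z(s, t) must match the Chain Rule results above.
assert sp.simplify(sp.diff(z, t) - (y*sp.cos(s + t) - x)) == 0
assert sp.simplify(sp.diff(z, s) - (y*sp.cos(s + t) + x)) == 0
```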


 Example 3.39 Let $u = x^2 + 2xy - y\ln(z)$ where $x = s + t^2$, $y = s - t^2$, and $z = 2t$. Find $\frac{\partial u}{\partial s}$ and $\frac{\partial u}{\partial t}$.

Solution: Using the Chain Rule we find

$$\frac{\partial u}{\partial s} = \frac{\partial u}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial u}{\partial y}\frac{\partial y}{\partial s} + \frac{\partial u}{\partial z}\frac{\partial z}{\partial s} = [2x+2y](1) + [2x - \ln(z)](1) + [-y/z](0) = 4x + 2y - \ln(z)$$
$$\frac{\partial u}{\partial t} = \frac{\partial u}{\partial x}\frac{\partial x}{\partial t} + \frac{\partial u}{\partial y}\frac{\partial y}{\partial t} + \frac{\partial u}{\partial z}\frac{\partial z}{\partial t} = [2x+2y](2t) + [2x - \ln(z)](-2t) + [-y/z](2) = 4yt + 2t\ln(z) - \frac{2y}{z}.$$


Notation: Sometimes it is useful to write the Chain Rule formulas in matrix form. If $u = f(x,y,z)$
where $x = x(s,t)$, $y = y(s,t)$, and $z = z(s,t)$, recall from linear algebra
$$\begin{pmatrix} \dfrac{\partial u}{\partial s} & \dfrac{\partial u}{\partial t} \end{pmatrix} = \begin{pmatrix} \dfrac{\partial u}{\partial x} & \dfrac{\partial u}{\partial y} & \dfrac{\partial u}{\partial z} \end{pmatrix} \begin{pmatrix} \dfrac{\partial x}{\partial s} & \dfrac{\partial x}{\partial t} \\[4pt] \dfrac{\partial y}{\partial s} & \dfrac{\partial y}{\partial t} \\[4pt] \dfrac{\partial z}{\partial s} & \dfrac{\partial z}{\partial t} \end{pmatrix}. \quad (3.5)$$

One recovers the Chain Rule through matrix multiplication.
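The matrix form (3.5) can be carried out concretely with symbolic Jacobians. The sketch below (sympy assumed available; not part of the original notes) uses the functions of Example 3.39:

```python
# Matrix-form Chain Rule, Eq. (3.5), checked with sympy Jacobians (a sketch).
import sympy as sp

s, t, x, y, z = sp.symbols('s t x y z')
u = x**2 + 2*x*y - y*sp.log(z)             # u(x, y, z) from Example 3.39
sub = {x: s + t**2, y: s - t**2, z: 2*t}   # inner functions x(s,t), y(s,t), z(s,t)

grad_u = sp.Matrix([[sp.diff(u, v) for v in (x, y, z)]])   # row (u_x, u_y, u_z)
J = sp.Matrix([s + t**2, s - t**2, 2*t]).jacobian([s, t])  # 3x2 Jacobian of the inner map
row = grad_u.subs(sub) * J                                  # (u_s, u_t) via Eq. (3.5)

# Differentiating u(s, t) directly gives the same row vector.
direct = sp.Matrix([[sp.diff(u.subs(sub), s), sp.diff(u.subs(sub), t)]])
assert all(sp.simplify(e) == 0 for e in (row - direct))
```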



3.7.1 Using Cramer’s Rule


One can use Cramer’s Rule to solve for dx, dy, etc.
 Example 3.40 Find $\frac{dz}{dt}$ given that $z = x^2 - y^2$ where $x$ and $y$ are defined implicitly by $x^2 + y^2 = t^2$ and
$x\sin(t) = ye^y$.

Solution: Here we cannot solve for $x$ and $y$ in terms of $t$ explicitly! Instead find $dx$ and $dy$
first, then use the Chain Rule to get the desired derivative. From total differentiation we have
$$2x\,dx + 2y\,dy = 2t\,dt$$
$$\sin(t)\,dx + x\cos(t)\,dt = (ye^y + e^y)\,dy.$$
Simplify and rearrange:
$$x\,dx + y\,dy = t\,dt$$
$$\sin(t)\,dx - (y+1)e^y\,dy = -x\cos(t)\,dt.$$

This can be written as a product of matrices:
$$\begin{pmatrix} x & y \\ \sin(t) & -(y+1)e^y \end{pmatrix}\begin{pmatrix} dx \\ dy \end{pmatrix} = \begin{pmatrix} t\,dt \\ -x\cos(t)\,dt \end{pmatrix}.$$
This is just a linear system of the form $A\mathbf{x} = \mathbf{b}$. Recall from Cramer's Rule that as long as the
determinant of $A$ is not zero, $\det(A) \neq 0$, then the solution to $A\mathbf{x} = \mathbf{b}$ is $x = \frac{\det(A_x)}{\det(A)}$ and $y = \frac{\det(A_y)}{\det(A)}$,
where $A_x$ is $A$ with the first column replaced by the righthand side $\mathbf{b}$. Thus,
$$\det(A) = \begin{vmatrix} x & y \\ \sin(t) & -(y+1)e^y \end{vmatrix} = -x(y+1)e^y - y\sin(t)$$
$$\det(A_x) = \begin{vmatrix} t\,dt & y \\ -x\cos(t)\,dt & -(y+1)e^y \end{vmatrix} = -t(y+1)e^y\,dt + xy\cos(t)\,dt$$
$$\det(A_y) = \begin{vmatrix} x & t\,dt \\ \sin(t) & -x\cos(t)\,dt \end{vmatrix} = -x^2\cos(t)\,dt - t\sin(t)\,dt.$$
Therefore, Cramer's Rule gives:
$$dx = \frac{t(y+1)e^y - xy\cos(t)}{x(y+1)e^y + y\sin(t)}\,dt$$
$$dy = \frac{x^2\cos(t) + t\sin(t)}{x(y+1)e^y + y\sin(t)}\,dt.$$
Thus, we can now compute the differential $dz$ for $z = x^2 - y^2$:
$$dz = \frac{\partial z}{\partial x}\,dx + \frac{\partial z}{\partial y}\,dy = 2x\left[\frac{t(y+1)e^y - xy\cos(t)}{x(y+1)e^y + y\sin(t)}\right]dt - 2y\left[\frac{x^2\cos(t) + t\sin(t)}{x(y+1)e^y + y\sin(t)}\right]dt.$$
To find the derivative $\frac{dz}{dt}$, just divide the above expression by $dt$ on both sides:
$$\frac{dz}{dt} = \frac{\partial z}{\partial x}\frac{dx}{dt} + \frac{\partial z}{\partial y}\frac{dy}{dt} = \frac{2x\left[t(y+1)e^y - xy\cos(t)\right] - 2y\left[x^2\cos(t) + t\sin(t)\right]}{x(y+1)e^y + y\sin(t)}.$$


3.8 Maximum and Minimum Problems with Constraints


Finding the maximum and minimum values of a function has wide-ranging physical applications. For
example, a fundamental law of physics says that a system wants to be in the state that minimizes
energy. Maxima also have many applications such as a system wanting to be in a maximum state
of entropy (disorder). If we can view the function (through graphing) we observe local extrema
(maxima and minima) in the form of peaks and valleys in the graph. But how can we determine
which points (x, y) give rise to the maximum and minimum values without plotting them? The goal
of this section is to develop a method to answer this question.
Definition 3.8.1 (Local Extrema) A function of two variables, z = f (x, y), has a local maximum
at (a, b) if f (x, y) ≤ f (a, b) when (x, y) is near (a, b). If f (x, y) ≥ f (a, b) for all (x, y) near (a, b),
then f (a, b) is a local minimum.

Definition 3.8.2 (Global Extrema) A function of two variables, z = f (x, y), has a global max-
imum at (a, b) if f (x, y) ≤ f (a, b) for every point (x, y). If f (x, y) ≥ f (a, b) for all (x, y), then
f (a, b) is a global minimum.

Theorem 3.8.1 If a function $f(x,y)$ has a local minimum/maximum at a point $(a,b)$, then
the first order partial derivatives vanish there: $\frac{\partial f}{\partial x}(a,b) = f_x(a,b) = 0$ and $\frac{\partial f}{\partial y}(a,b) = f_y(a,b) = 0$. This is
analogous to the 1D case where $f'(x) = 0$ at all extrema.

Definition 3.8.3 Any location (x, y) where fx (x, y) = fy (x, y) = 0 is called a critical point.

 Example 3.41 Let f (x, y) = x2 + y2 − 2x − 6y + 14. Find the critical points.

Solution: All that must be done is take the first order partial derivatives fx , fy , set them equal to
zero and solve for x, y.
0 = fx = 2x − 2 ⇒ x=1
0 = fy = 2y − 6 ⇒ y = 3.
Thus, the only critical point is (1, 3). 

 Example 3.42 Let f (x, y) = y2 − x2 . Find the critical points.

Solution: All that must be done is take the first order partial derivatives fx , fy , set them equal to
zero and solve for x, y.
0 = fx = −2x ⇒ x=0
0 = fy = 2y ⇒ y = 0.
Thus, the only critical point is (0, 0). 

Finding the critical points only gives us candidates for extrema. Being a critical point is
necessary for being a maximum or minimum, but it is not sufficient. Recall the three relevant cases
from 1D and how we determine if we have a maximum, minimum, or point of inflection (all of
which are critical points with f 0 (x) = 0).

Theorem 3.8.2 (Second Derivative Test) Suppose that the first order partial derivatives are zero,
$\frac{\partial f}{\partial x}(a,b) = 0$ and $\frac{\partial f}{\partial y}(a,b) = 0$ (critical point at $(a,b)$). In addition, $f_{xx}, f_{xy}, f_{yy}$ exist. Define
$$D = D(a,b) = f_{xx}(a,b)f_{yy}(a,b) - \left[f_{xy}(a,b)\right]^2. \quad (3.6)$$



Type of Critical Point    Conditions Needed
Maximum                   $f'(x) = 0$ and $f''(x) < 0$
Minimum                   $f'(x) = 0$ and $f''(x) > 0$
Point of Inflection       $f'(x) = 0$ and $f''(x) = 0$

Then:
a) If D > 0 and fxx > 0, then f (a, b) is a local minimum.
b) If D > 0 and fxx < 0, then f (a, b) is a local maximum.
c) If D < 0, then f (a, b) is a saddle point.
d) If D = 0, then the second derivative test is inconclusive.
Is there a nice way to remember the formula for $D$? Yes! With determinants:
$$D = \begin{vmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{vmatrix} = f_{xx}f_{yy} - f_{xy}^2.$$

 Example 3.43 Find the local maximum, minimum, and saddle points of $f(x,y) = x^4 + y^4 - 4xy + 1$.

Solution:
Step 1: Find all possible critical points.

0 = fx = 4x3 − 4y ⇒ y = x3
0 = fy = 4y3 − 4x

By substituting the first expression into the second we find

0 = 4x9 − 4x = 4x(x8 − 1) = 4x(x4 + 1)(x4 − 1) = 4x(x4 + 1)(x2 − 1)(x2 + 1).

Thus, the real roots are x = 0, 1, −1. Using the relation that y = x3 , then the three critical points are
(0, 0), (1, 1), (−1, −1).

Step 2: Find the 2nd order partial derivatives

fxx = 12x2
fyy = 12y2
fxy = fyx = −4.

Step 3: Compute D(a, b) for each critical point.

D(x, y) = (12x2 )(12y2 ) − (−4)2 = 144x2 y2 − 16

For the critical point $(0,0)$, $D(0,0) = -16 < 0$, so it is a saddle point. For the critical point
$(1,1)$, $D(1,1) = 144 - 16 > 0$ and $f_{xx}(1,1) = 12 > 0$, so it is a local minimum. Finally, for the
critical point $(-1,-1)$, $D(-1,-1) = 144 - 16 > 0$ and $f_{xx}(-1,-1) = 12 > 0$, so it is also a local
minimum. 
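The three-step procedure above can be automated. A minimal sketch (sympy assumed available; not part of the original notes) that finds and classifies the real critical points of this example:

```python
# Second Derivative Test for f = x^4 + y^4 - 4xy + 1, automated (sympy assumed).
import sympy as sp

x, y = sp.symbols('x y')
f = x**4 + y**4 - 4*x*y + 1

# Step 1: critical points from f_x = f_y = 0 (solve also returns complex roots).
crit = sp.solve([sp.diff(f, x), sp.diff(f, y)], [x, y], dict=True)

# Step 2: second order partials and the discriminant D of Eq. (3.6).
fxx, fyy, fxy = sp.diff(f, x, 2), sp.diff(f, y, 2), sp.diff(f, x, y)
D = fxx*fyy - fxy**2

# Step 3: classify each real critical point (D = 0 would be inconclusive).
results = {}
for p in crit:
    if not (p[x].is_real and p[y].is_real):
        continue  # skip complex critical points
    d = D.subs(p)
    results[(p[x], p[y])] = 'saddle' if d < 0 else ('min' if fxx.subs(p) > 0 else 'max')
```

This reproduces the classification found by hand: a saddle at the origin and local minima at $(\pm 1, \pm 1)$.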

 Example 3.44 Find the shortest distance from the point (1, 0, −2) to the plane x + 2y + z = 4.

Solution: Recall that the distance between a point $(x,y,z)$ and $(1,0,-2)$ is
$$d = \sqrt{(x-1)^2 + y^2 + (z+2)^2}.$$

Using the equation for the plane, $z = 4 - x - 2y$, we have
$$d^2 = (x-1)^2 + y^2 + (6 - x - 2y)^2.$$

Observe that if we minimize d 2 we will also minimize d.

Step 1: Find all possible critical points.

0 = fx = 2(x − 1) + 2(6 − x − 2y)(−1) = −14 + 4x + 4y


0 = fy = 2y + 2(6 − x − 2y)(−2) = −24 + 4x + 10y

By solving this system we find that $x = 11/6$ and $y = 5/3$. Thus, the critical point is $(11/6, 5/3)$.

Step 2: Find the 2nd order partial derivatives

fxx = 4
fyy = 10
fxy = fyx = 4.

Step 3: Compute D(a, b) for each critical point.

D(x, y) = 4(10) − (4)2 = 24 > 0.

Since $D > 0$ and $f_{xx} > 0$, the point $(11/6, 5/3)$ is a local minimum. Plugging this point back into the
distance formula gives $d = \frac{5}{6}\sqrt{6}$. 

 Example 3.45 A cardboard box without a lid is to be made from 12m2 of cardboard. Find the
maximum volume of such a box.

Solution: Recall the volume of a box

V = xyz

for a box with side lengths x, y, z.

Step 0: Express the desired function as a function of only two variables using another relation, such
as the surface area $A = xy + 2xz + 2yz = 12$. Solving for $z$:
$$z = \frac{12 - xy}{2x + 2y}, \qquad V = xyz = \frac{12xy - x^2y^2}{2x + 2y}.$$
Step 1: Find all possible critical points.

$$0 = V_x = \frac{y^2(12 - 2xy - x^2)}{2(x+y)^2} \quad\Rightarrow\quad 0 = y^2(12 - 2xy - x^2)$$
$$0 = V_y = \frac{x^2(12 - 2xy - y^2)}{2(x+y)^2} \quad\Rightarrow\quad 0 = x^2(12 - 2xy - y^2)$$
Observe that if either $x$ or $y$ is zero, then we get the minimum volume, $V = 0$. Ignoring these cases,
subtracting the two equations gives $x^2 = y^2$, so either $x = y$ or $x = -y$. In the real world, side lengths cannot be negative, so $x = y$. By
substitution we find $x = 2$ and $y = 2$. Thus, the critical point is $(2, 2)$.

Step 2: Classify the critical point. In this case there is only one critical point left, and it
is not the minimum, so it must be the maximum. Then $z = \frac{12 - 4}{8} = 1$ and $V_{max} = xyz = 2(2)(1) = 4$. 

 Example 3.46 A cardboard box without a lid has a volume of 5m3 . Find the minimum surface
area of such a box.

Solution: Recall the volume of a box


V = xyz = 5
for a box with side lengths x, y, z. The surface area is A = xy + 2xz + 2yz

Step 0: Express the desired function as a function of only two variables using another relation.
Solving for $z$:
$$z = \frac{5}{xy}, \qquad A = xy + 2xz + 2yz = xy + \frac{10}{y} + \frac{10}{x}.$$
Step 1: Find all possible critical points.
$$0 = A_x = y - \frac{10}{x^2} \quad\Rightarrow\quad y = \frac{10}{x^2}$$
$$0 = A_y = x - \frac{10}{y^2}$$
By substituting the first equation into the second we find $0 = x - \frac{10x^4}{100} = x\left(1 - \frac{x^3}{10}\right)$. Thus,
$x = 10^{1/3}$ and then $y = 10^{1/3}$.

Step 2: Classify the critical point. Since there is only one critical point and $A \to \infty$ on the boundary of the domain, it gives the minimum. Then $z = \frac{5}{xy} = \frac{5}{10^{2/3}} = \frac{1}{2}10^{1/3}$. Thus the sur-
face area is minimized when the height is half the length $x$ and width $y$. 

 Example 3.47 A trapezoidal gutter is bent from a sheet of width 24 cm. Find the angle of the sides $\theta$ so that
the cross-sectional area is maximized.

Solution: Recall the area of a trapezoid (where $x$ is the base and $y$ is the slanted side):
$$A = \frac{x + (x + 2y\cos(\theta))}{2}\,y\sin(\theta) = (x + y\cos(\theta))\,y\sin(\theta).$$
We also know that the width of the sheet is $24 = x + 2y$.

Step 0: Express the desired function as a function of only two variables using another relation.
Solving for $x$:
$$x = 24 - 2y, \qquad A = \left(24y - 2y^2 + y^2\cos(\theta)\right)\sin(\theta).$$
Step 1: Find all possible critical points.
$$0 = A_\theta = -y^2\sin^2(\theta) + 24y\cos(\theta) - 2y^2\cos(\theta) + y^2\cos^2(\theta) = 24y\cos(\theta) - 2y^2\cos(\theta) + 2y^2\cos^2(\theta) - y^2$$
$$0 = A_y = \left(24 - 4y + 2y\cos(\theta)\right)\sin(\theta)$$
From the second equation either $\theta = 0$ or $24 - 4y + 2y\cos(\theta) = 0 \Rightarrow \cos(\theta) = \frac{2y - 12}{y}$. If $\theta = 0$ we have a
minimum cross-sectional area, so disregard this case. Substituting $\cos(\theta) = \frac{2y-12}{y}$ into $A_\theta = 0$:
$$0 = (24y - 2y^2)\frac{2y-12}{y} + 2y^2\left(\frac{2y-12}{y}\right)^2 - y^2 = (24 - 2y)(2y - 12) + 2(2y - 12)^2 - y^2 = 3y^2 - 24y = 3y(y - 8).$$
Thus, $y = 8$. Then $x = 24 - 2y = 8$. Also, $\cos(\theta) = \frac{2y - 12}{y} = \frac{4}{8} = \frac{1}{2} \Rightarrow \theta = \frac{\pi}{3}$. Thus, the maximal
area is
$$A = \left(24y - 2y^2 + y^2\cos(\theta)\right)\sin(\theta) = \left(24(8) - 2(64) + 64\cos\left(\tfrac{\pi}{3}\right)\right)\sin\left(\tfrac{\pi}{3}\right) = (192 - 128 + 32)\frac{\sqrt{3}}{2} = 48\sqrt{3}.$$
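The optimum can also be confirmed by a brute-force scan of the area function over a grid of $(y, \theta)$ values. A minimal sketch using only the standard library (not part of the original notes):

```python
# Brute-force check that y = 8, theta = pi/3 maximizes the gutter area (stdlib only).
import math

def area(y, theta):
    x = 24 - 2*y                       # remaining base after bending two sides of length y
    return (x + y*math.cos(theta)) * y * math.sin(theta)

best = max(((area(y/10, th/100), y/10, th/100)
            for y in range(1, 120)     # 0.1 <= y < 12
            for th in range(1, 157)),  # 0.01 <= theta < pi/2
           key=lambda r: r[0])
area_max, y_max, theta_max = best
# The grid maximum sits near (48*sqrt(3), y = 8, theta = pi/3).
```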
3 3 2


3.9 Lagrange Multipliers


In the last section we studied a problem of maximizing the volume V = xyz of a rectangular box
subject to the constraint that the area of the surfaces A = xy + 2xz + 2yz = 12.

Before: We had to use substitution and solve multiple equations.

Now: Solve a max/min problem with a constraint simultaneously using Lagrange Multipliers.
This method was developed long ago to solve a classical problem, the so-called Milkmaid
Problem. Imagine you are on a farm and it is time to get milk. The maid has to get the day’s
milk from the cow. The sun is setting and she has a date with a handsome shepherd and wants to
complete the task as quickly as possible. Before she can get the milk she must rinse her bucket in
the nearby river. She wants to take the shortest possible path from her location M to the river, and then to
the cow C.

Thinking Question: What point P along the river should she rinse her bucket?

This question can be restated as finding a point P for which the distance from M to P and from P to
C is minimum. If she only needed to go to the cow the obvious solution is a straight line, but the
problem is not as simple with 3 points. We need to satisfy the constraint that P is on the river bank.
Suppose the shape of the river is described as a curve, g(x, y) = 0 where g(x, y) = y − x2
(parabola) or g(x, y) = x2 + y2 − r2 (circle). We want to minimize the function

F(P) = dist(M, P) + dist(P,C) subject to g(P) = 0.

Graphically: For every point P on an ellipse the distance from a focus to P to the other focus is
constant. Take M,C as the foci of an ellipse, any point on increasing ellipses has the same distance
from both. To find the point P, find the smallest ellipse that intersects the curve defining the river.
This occurs when the smallest ellipse and river are tangent.
Algebraically: Usually to find a maximum or minimum we need to set derivatives equal to zero,
$\frac{\partial f}{\partial x} = \frac{\partial f}{\partial y} = 0$ or $\nabla f = 0$, but we must pair this with our constraint. How to do this? Add a new
variable and define a new function!

F(P, λ ) := f (P) − λ g(P).

To find the critical points we set all first derivatives equal to zero, ∇F = 0 ⇔ ∇ f = λ ∇g

$$0 = \frac{\partial f}{\partial x}(P) - \lambda\frac{\partial g}{\partial x}(P)$$
$$0 = \frac{\partial f}{\partial y}(P) - \lambda\frac{\partial g}{\partial y}(P)$$
$$0 = g(P).$$

The first two equations are used to find the critical point and the last equation enforces the constraint.

The variable λ is a dummy variable used to get a system of equations; we really only care about
x, y, z. Once you have found all the critical points, plug them into f to see which are
maxima and which are minima. Solving this system of equations can be hard! Some tricks:

1. Since we do not care what λ is, you can solve first for λ in terms of x, y, z to remove λ
from the equations.

2. Try first solving for one variable in terms of the others.

3. Remember when taking the square root to consider both the positive and negative root.

4. Remember when dividing an equation by some expression, you must be sure that the ex-
pression is not zero. Often it is helpful to consider two cases: first solve the equations assuming
that a variable is 0, and then solve the equations assuming that it is not zero.
In physics, the Lagrange multiplier is the relative weight of the constraint on the problem. In
economics, it represents the fact that the maximum profit is subject to limited resources where λ is
the marginal value. This can also represent the rate at which the optimal value of f (P) changes if
you change the constraint.
 Example 3.48 Parabolic River. Assume the maid is at (0, 5), the cow is at (8, 0) and the river
curve is g(x, y) = (y − 2) − (x − 4)2 .

Solution: We want to minimize the distance functional
$$F(x,y,\lambda) := \sqrt{x^2 + (y-5)^2} + \sqrt{(x-8)^2 + y^2} - \lambda\left[(y-2) - (x-4)^2\right]. \quad (3.7)$$
We can make the problem easier by replacing each distance (square root) with its square (if the
square of the distance is minimized, then so is the distance):
$$\tilde{F}(x,y,\lambda) := x^2 + (y-5)^2 + (x-8)^2 + y^2 - \lambda\left[(y-2) - (x-4)^2\right].$$
We now take the gradient and set it equal to zero, ∇F = 0


$$0 = \frac{\partial \tilde F}{\partial x} = 2x + 2(x-8) + 2\lambda(x-4)$$
$$0 = \frac{\partial \tilde F}{\partial y} = 2(y-5) + 2y - \lambda$$
$$0 = \frac{\partial \tilde F}{\partial \lambda} = (x-4)^2 - (y-2)$$
Step 1: Solve for λ in terms of x, y. Thus, λ = 4y − 10.

Step 2: Plug this value into $\frac{\partial \tilde F}{\partial x} = 0$:
$$0 = \frac{\partial \tilde F}{\partial x} = 4x - 16 + 2(4y - 10)(x - 4) = (x-4)(8y - 16) = 8(x-4)(y-2).$$
Using the constraint $y - 2 = (x-4)^2$: $\ 0 = 8(x-4)^3$.

Thus, the critical point is $x = 4$. This implies (from the constraint) that $y = 2$ and $\lambda = -2$.
Plugging this back into the original distance equation (3.7) gives $F(4,2,-2) = 5 + \sqrt{20} \approx 9.472$.
Just to check that this is the minimal path, try another point on the river (e.g., $(2,6) \Rightarrow \lambda = 14$):
$F(2,6,14) = \sqrt{5} + \sqrt{72} \approx 10.721$ is a larger distance to travel. 
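The Lagrange system of this example can be handed to a symbolic solver. A minimal sketch (sympy assumed available; not part of the original notes):

```python
# Solving the Milkmaid Lagrange system of Example 3.48 symbolically (sympy assumed).
import sympy as sp

x, y, lam = sp.symbols('x y lam', real=True)
# Squared-distance functional with constraint g = (y - 2) - (x - 4)^2.
F = x**2 + (y - 5)**2 + (x - 8)**2 + y**2 - lam*((y - 2) - (x - 4)**2)

sols = sp.solve([sp.diff(F, v) for v in (x, y, lam)], [x, y, lam], dict=True)
assert any(s[x] == 4 and s[y] == 2 and s[lam] == -2 for s in sols)

# Total (un-squared) distance M -> P -> C at the optimum P = (4, 2):
dist = sp.sqrt(4**2 + (2 - 5)**2) + sp.sqrt((4 - 8)**2 + 2**2)   # 5 + 2*sqrt(5)
```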
Definition 3.9.1 (Method of Lagrange Multipliers) To find the maximum or minimum values of
a function f(x, y, z) subject to the constraint g(x, y, z) = k:
a) Find all values of x, y, z, λ such that

∇ f (x, y, z) = λ ∇g(x, y, z), g(x, y, z) = k.

b) Evaluate f at all critical points from (a). The largest value is the maximum of f and the
smallest value is the minimum of f .

Let’s recall some examples solved in the last section.


 Example 3.49 Maximize the volume V = xyz of a cardboard box subject to the constraint that

we only have 12m2 of cardboard.

Solution: Solve using Lagrange Multipliers.


Step 1: Define F(x, y, z, λ ) := f (x, y, z) − λ g(x, y, z)
F(x, y, z, λ ) = xyz − λ [xy + 2xz + 2yz − 12] .
Step 2: Find the partial derivatives
$$0 = \frac{\partial F}{\partial x} = yz - \lambda[y + 2z]$$
$$0 = \frac{\partial F}{\partial y} = xz - \lambda[x + 2z]$$
$$0 = \frac{\partial F}{\partial z} = xy - \lambda[2x + 2y]$$
$$0 = \frac{\partial F}{\partial \lambda} = xy + 2xz + 2yz - 12.$$
We have to be a little clever to solve this problem! Multiply the first equation by x, the second
equation by y, and the third equation by z:
xyz = λ [xy + 2xz]
xyz = λ [xy + 2yz]
xyz = λ [2xz + 2yz] .
Observe that $\lambda \neq 0$, or every equation would be false (since none of the three dimensions can be
zero!). Combining the equations we find
0 = λ [xy + 2xz] − λ [xy + 2yz] ⇒ xz = yz ⇒ x=y
0 = λ [xy + 2yz] − λ [2xz + 2yz] ⇒ xy = 2xz ⇒ y = 2z.
Thus, x = y = 2z. Plugging these into the constraint we find
0 = 4z2 + 4z2 + 4z2 − 12 ⇒ z2 = 1 ⇒ z = 1.
Therefore, x = y = 2. Matching the previous answer! 
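The same system can be solved mechanically. A minimal sketch (sympy assumed available; not part of the original notes), restricting to positive side lengths:

```python
# The box problem of Example 3.49 handed to sympy's solver (sympy assumed).
import sympy as sp

x, y, z, lam = sp.symbols('x y z lam', positive=True)
F = x*y*z - lam*(x*y + 2*x*z + 2*y*z - 12)

# Set all four partial derivatives of F to zero and solve the system.
sols = sp.solve([sp.diff(F, v) for v in (x, y, z, lam)], [x, y, z, lam], dict=True)
assert any(s[x] == 2 and s[y] == 2 and s[z] == 1 for s in sols)
```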

 Example 3.50 Find the extremal values of the function $f(x,y) = x^2 + 2y^2$ on the circle $x^2 + y^2 = 1$.

Solution: Solve using Lagrange Multipliers.


Step 1: Define $F(x,y,\lambda) := f(x,y) - \lambda g(x,y)$:
$$F(x,y,\lambda) = x^2 + 2y^2 - \lambda\left[x^2 + y^2 - 1\right].$$

Step 2: Find the partial derivatives

$$0 = \frac{\partial F}{\partial x} = 2x - \lambda[2x]$$
$$0 = \frac{\partial F}{\partial y} = 4y - \lambda[2y]$$
$$0 = \frac{\partial F}{\partial \lambda} = x^2 + y^2 - 1.$$
From the first equation we see that either x = 0 or λ = 1. If x = 0, then y = ±1. If λ = 1, then y = 0
and x = ±1. So there are four critical points (0, 1), (0, −1), (1, 0), (−1, 0). To find the extrema
(maximum or minimum) plug in the critical points to f (x, y):

f (0, 1) = 2
f (0, −1) = 2
f (1, 0) = 1
f (−1, 0) = 1.

Therefore, the max value is 2 occurring at (0, 1) and (0, −1). The minimum value is 1 occurring at
(1, 0) and (−1, 0). 

 Example 3.51 Find the points on the sphere x2 + y2 + z2 = 4 that are the closest and furthest
from (3, 1, −1).

Solution: Solve using Lagrange Multipliers. If the distance is minimized so is the distance
squared.
Step 1: Define F(x, y, z, λ ) := f (x, y, z) − λ g(x, y, z)

$$F(x,y,z,\lambda) = (x-3)^2 + (y-1)^2 + (z+1)^2 - \lambda\left[x^2 + y^2 + z^2 - 4\right].$$
Step 2: Find the partial derivatives

$$0 = \frac{\partial F}{\partial x} = 2(x-3) - \lambda[2x] \quad\Rightarrow\quad x = \frac{3}{1-\lambda}$$
$$0 = \frac{\partial F}{\partial y} = 2(y-1) - \lambda[2y] \quad\Rightarrow\quad y = \frac{1}{1-\lambda}$$
$$0 = \frac{\partial F}{\partial z} = 2(z+1) - \lambda[2z] \quad\Rightarrow\quad z = \frac{-1}{1-\lambda}$$
$$0 = \frac{\partial F}{\partial \lambda} = x^2 + y^2 + z^2 - 4.$$

Plugging the values of $x, y, z$ into the constraint gives $\lambda = 1 \pm \frac{\sqrt{11}}{2}$. So there are two critical points:
$\left(\frac{6}{\sqrt{11}}, \frac{2}{\sqrt{11}}, \frac{-2}{\sqrt{11}}\right)$ (min distance) and $\left(\frac{-6}{\sqrt{11}}, \frac{-2}{\sqrt{11}}, \frac{2}{\sqrt{11}}\right)$ (max distance). 

What happens if we have more than one constraint? Then we add all constraints on as different
Lagrange Multipliers.
Definition 3.9.2 To maximize or minimize a function f (x, y, z) with two constraints g(x, y, z) =
C1 and h(x, y, z) = C2 we set up the functional to minimize as

$$F(x,y,z,\lambda,\mu) := f(x,y,z) - \lambda\left[g(x,y,z) - C_1\right] - \mu\left[h(x,y,z) - C_2\right].$$

Then solve for unknowns (x, y, z, λ , µ).



 Example 3.52 Find the maximum value of the function f (x, y, z) = x + 2y + 3z on the curve of
intersection of the plane x − y + z = 1 and the cylinder x2 + y2 = 1.

Solution: Solve using Lagrange Multipliers.


Step 1: Define $F(x,y,z,\lambda,\mu) := f(x,y,z) - \lambda\left[g(x,y,z) - C_1\right] - \mu\left[h(x,y,z) - C_2\right]$:
$$F(x,y,z,\lambda,\mu) := x + 2y + 3z - \lambda\left[x - y + z - 1\right] - \mu\left[x^2 + y^2 - 1\right].$$

Step 2: Find the partial derivatives

$$0 = \frac{\partial F}{\partial x} = 1 - \lambda - 2\mu x$$
$$0 = \frac{\partial F}{\partial y} = 2 + \lambda - 2\mu y$$
$$0 = \frac{\partial F}{\partial z} = 3 - \lambda \quad\Rightarrow\quad \lambda = 3$$
$$0 = \frac{\partial F}{\partial \lambda} = x - y + z - 1$$
$$0 = \frac{\partial F}{\partial \mu} = x^2 + y^2 - 1.$$
Since $\lambda = 3$, the first equation gives $x = \frac{-1}{\mu}$ and the second equation gives $y = \frac{5}{2\mu}$. Using
these relations and the $\mu$ constraint we see that $\mu = \pm\frac{\sqrt{29}}{2}$, resulting in $x = \mp\frac{2}{\sqrt{29}}$, $y = \pm\frac{5}{\sqrt{29}}$. From
the constraint $g$ we find that $z = 1 \pm \frac{7}{\sqrt{29}}$.

Step 3: Plug the critical points into $f$ to determine the maximum and minimum: $\mp\frac{2}{\sqrt{29}} + 2\left(\pm\frac{5}{\sqrt{29}}\right) + 3\left(1 \pm \frac{7}{\sqrt{29}}\right) = 3 \pm \sqrt{29}$ (max with +). 
4. Multivariable Integration and Applications

4.1 Introduction
Recall how integration works in one dimension. Given a function $f(x)$ defined in the interval
$a \le x \le b$, we can approximate the integral value via a Riemann sum. Divide the interval $[a,b]$ into
$n$ sub-intervals $[x_{i-1}, x_i]$ of equal width $\Delta x = \frac{b-a}{n}$. Then we can multiply the width of each interval
by the height $f(x_i^*)$ to form the Riemann sum:
$$\sum_{i=1}^n f(x_i^*)\,\Delta x \to \int_a^b f(x)\,dx \quad\text{as } \Delta x \to 0,$$

where xi−1 ≤ xi∗ ≤ xi is a point in the ith interval. The integral is obtained by taking the limit as
n → ∞ (∆x → 0). Thus, the integral represents the “area under the curve" (see Fig. 4.1).
We can use a similar procedure to define integration in two dimensions. Instead of intervals
[xi−1 , xi ] we have little rectangles, R = [a, b] × [c, d], with area ∆A = ∆x∆y. Approximate the volume
under the surface by a sum of these boxes.

Figure 4.1: The typical way to envision integration is the sum of a bunch of tiny rectangles under
the desired curve.

Definition 4.1.1 (Double Integral) The volume under the curve f (x, y) is defined as
$$\sum_{i=1}^m\sum_{j=1}^n f(x_{ij}^*, y_{ij}^*)\,\Delta A \to \iint_R f(x,y)\,dA \quad\text{as } \Delta A \to 0.$$

Figure 4.2: As in 1D we can approximate a curve by rectangles, only here, in 2D, we use rectangular
prisms.

Properties of Double Integrals:


1. Additive: $\iint_R \left[f(x,y) + g(x,y)\right]dA = \iint_R f(x,y)\,dA + \iint_R g(x,y)\,dA$.
2. Scalar Multiple: $\iint_R c f(x,y)\,dA = c\iint_R f(x,y)\,dA$.
3. Bounds: If $f(x,y) \ge g(x,y)$ in a region R, then $\iint_R f(x,y)\,dA \ge \iint_R g(x,y)\,dA$.
Recall that in 1D we do not use the Riemann sums in practice, rather we use the Fundamental
Theorem of Calculus
Z b
f (x)dx = F(b) − F(a), (4.1)
a

where F is the indefinite integral of f or F 0 (x) = f (x).


In two dimensions, suppose f (x, y) is a function of two variables and is integrable on the
rectangle R = [a, b] × [c, d]. First, integrate in one of the variables (for example y first):
$$A(x) = \int_c^d f(x,y)\,dy.$$

Then integrate in the remaining variable:


$$V = \int_a^b A(x)\,dx = \int_a^b\left[\int_c^d f(x,y)\,dy\right]dx = \int_c^d\left[\int_a^b f(x,y)\,dx\right]dy.$$

V is referred to as an iterated integral. The iterated integral implies that we integrate with respect
to y treating x as a constant, then integrate the resulting function with respect to x.

R The fact that we can integrate in either variable first is the result of Fubini’s Theorem. One
can find many references online if interested.
 Example 4.1 Evaluate: $\int_0^3\int_1^2 x^2y\,dy\,dx$.

Solution: First, integrate in $y$, then $x$:
$$\int_0^3\int_1^2 x^2y\,dy\,dx = \int_0^3 \left[\frac{1}{2}x^2y^2\right]_1^2 dx = \int_0^3 2x^2 - \frac{1}{2}x^2\,dx = \int_0^3 \frac{3}{2}x^2\,dx = \left[\frac{1}{2}x^3\right]_0^3 = \frac{1}{2}\left[27 - 0\right] = \frac{27}{2}.$$
As a check, recompute by first integrating in $x$, then $y$:
$$\int_1^2\int_0^3 x^2y\,dx\,dy = \int_1^2 \left[\frac{1}{3}x^3y\right]_0^3 dy = \int_1^2 9y\,dy = \left[\frac{9}{2}y^2\right]_1^2 = \frac{36}{2} - \frac{9}{2} = \frac{27}{2}.$$
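The order-independence checked by hand above can also be confirmed symbolically. A minimal sketch (sympy assumed available; not part of the original notes):

```python
# Fubini check for Example 4.1: both orders of integration agree (sympy assumed).
import sympy as sp

x, y = sp.symbols('x y')
v1 = sp.integrate(x**2*y, (y, 1, 2), (x, 0, 3))   # integrate y first, then x
v2 = sp.integrate(x**2*y, (x, 0, 3), (y, 1, 2))   # integrate x first, then y
assert v1 == v2 == sp.Rational(27, 2)
```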


 Example 4.2 Evaluate: $\iint_R (x - 3y^2)\,dA$ where $R = [0,2] \times [1,2]$.

Solution: First, integrate in $y$, then $x$:
$$\int_0^2\int_1^2 \left(x - 3y^2\right)dy\,dx = \int_0^2 \left[xy - y^3\right]_1^2 dx = \int_0^2 2x - 8 - x + 1\,dx = \int_0^2 x - 7\,dx = \left[\frac{1}{2}x^2 - 7x\right]_0^2 = 2 - 14 = -12.$$

 Example 4.3 Evaluate: $\iint_R y\sin(xy)\,dA$ where $R = [1,2] \times [0,\pi]$.

Solution: First, integrate in $x$, then $y$:
$$\int_0^\pi\int_1^2 y\sin(xy)\,dx\,dy = \int_0^\pi \left[-\cos(xy)\right]_1^2 dy = \int_0^\pi -\cos(2y) + \cos(y)\,dy = \left[-\frac{1}{2}\sin(2y) + \sin(y)\right]_0^\pi = 0.$$


 Example 4.4 Find the volume of the solid S that is bounded by the elliptic paraboloid $x^2 + 2y^2 + z = 16$
and the planes $x = 2$, $y = 2$ as well as $x = 0$, $y = 0$, $z = 0$.

Solution: Set up the volume integral by finding the appropriate bounds. Also, we need to
integrate $z$ as a function of $x$ and $y$, $z = 16 - x^2 - 2y^2$:
$$V = \int_0^2\int_0^2 (16 - x^2 - 2y^2)\,dx\,dy = \int_0^2 \left[16x - \frac{1}{3}x^3 - 2y^2x\right]_0^2 dy = \int_0^2 32 - \frac{8}{3} - 4y^2\,dy$$
$$= \int_0^2 \frac{88}{3} - 4y^2\,dy = \left[\frac{88}{3}y - \frac{4}{3}y^3\right]_0^2 = \frac{144}{3} = 48.$$


The double integral simplifies in the special case that the function $z = f(x,y)$ is separable (e.g.,
$f(x,y) = g(x)h(y)$):
$$\int_a^b\int_c^d f(x,y)\,dy\,dx = \int_a^b\int_c^d g(x)h(y)\,dy\,dx = \int_a^b g(x)\left[\int_c^d h(y)\,dy\right]dx = \left[\int_a^b g(x)\,dx\right]\left[\int_c^d h(y)\,dy\right]. \quad (4.2)$$
 Example 4.5 Evaluate: $\iint_R \sin(x)\cos(y)\,dA$ where $R = [0,\pi/2] \times [0,\pi/2]$.

Solution: The function is separable, so split the integral:
$$\int_0^{\pi/2}\int_0^{\pi/2} \sin(x)\cos(y)\,dy\,dx = \left[\int_0^{\pi/2}\sin(x)\,dx\right]\left[\int_0^{\pi/2}\cos(y)\,dy\right] = \left[-\cos(x)\Big|_0^{\pi/2}\right]\left[\sin(y)\Big|_0^{\pi/2}\right] = [0+1][1-0] = 1.$$


 Example 4.6 Evaluate: $\int_0^1\int_0^1 \sqrt{s+t}\,ds\,dt$.

Solution: First, integrate in $s$, then $t$:
$$\int_0^1\int_0^1 \sqrt{s+t}\,ds\,dt = \int_0^1\int_0^1 (s+t)^{1/2}\,ds\,dt = \int_0^1 \left[\frac{2}{3}(s+t)^{3/2}\right]_0^1 dt = \int_0^1 \frac{2}{3}(1+t)^{3/2} - \frac{2}{3}t^{3/2}\,dt$$
$$= \left[\frac{4}{15}(1+t)^{5/2} - \frac{4}{15}t^{5/2}\right]_0^1 = \frac{4}{15}\left(2^{5/2} - 1 - 1\right) = \frac{8\left(2\sqrt{2} - 1\right)}{15}.$$

 Example 4.7 Evaluate: $\int_0^1\int_0^3 e^{x+3y}\,dx\,dy$.

Solution: First, integrate in $x$, then $y$:
$$\int_0^1\int_0^3 e^{x+3y}\,dx\,dy = \int_0^1 \left[e^{x+3y}\right]_0^3 dy = \int_0^1 e^{3+3y} - e^{3y}\,dy = \int_0^1 e^{3y}\left(e^3 - 1\right)dy = \left(e^3 - 1\right)\left[\frac{1}{3}e^{3y}\right]_0^1 = \frac{1}{3}\left(e^3 - 1\right)^2.$$


4.2 Double Integrals Over General Regions


There are two basic cases to consider: (i) area is between two functions of x or (ii) the area is
between two functions of y.

Case I: A region D in between the graphs of two continuous functions of x:

D := {(x, y) | a ≤ x ≤ b, g1 (x) ≤ y ≤ g2 (x)} .

Thus, the integral over this region can be computed as


$$\iint_D f(x,y)\,dA = \int_a^b\int_{g_1(x)}^{g_2(x)} f(x,y)\,dy\,dx.$$

Case II: A region D in between the graphs of two continuous functions of y:

D := {(x, y) | h1 (y) ≤ x ≤ h2 (y), c ≤ y ≤ d} .

Thus, the integral over this region can be computed as


$$\iint_D f(x,y)\,dA = \int_c^d\int_{h_1(y)}^{h_2(y)} f(x,y)\,dx\,dy.$$

 Example 4.8 Evaluate $\iint_D (x + 2y)\,dA$ where $D$ is bounded by $y = 2x^2$ and $y = 1 + x^2$.

Solution:
Step 1: Determine the bounds of D by graphing the two given curves and finding the points of
intersection.

To find the points of intersection set the two curves equal to each other and solve for x

2x2 = 1 + x2 ⇒ x2 − 1 = 0 ⇒ x = ±1.

If x = 1, then y = 2x2 = 2 and if x = −1, then y = 1 + x2 = 2.

Step 2: Set up the integral, then solve. Observe that it is very important to figure out which
curve is on top!!

$$\iint_D (x+2y)\,dA = \int_{-1}^1\int_{2x^2}^{1+x^2} (x+2y)\,dy\,dx = \int_{-1}^1 \left[xy + y^2\right]_{2x^2}^{1+x^2} dx$$
$$= \int_{-1}^1 x(1+x^2) + (1+x^2)^2 - 2x^3 - 4x^4\,dx = \int_{-1}^1 x + x^3 + 1 + 2x^2 + x^4 - 2x^3 - 4x^4\,dx$$
$$= \int_{-1}^1 -3x^4 - x^3 + 2x^2 + x + 1\,dx = \left[-\frac{3}{5}x^5 - \frac{1}{4}x^4 + \frac{2}{3}x^3 + \frac{1}{2}x^2 + x\right]_{-1}^1 = \frac{32}{15}.$$

R You must draw a diagram to find the bounds correctly! Question: What if you accidentally
switch $g_1(x)$ and $g_2(x)$? The magnitude of the answer will be the same, but with the wrong
sign: you get $-$(correct answer).
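Both the value of Example 4.8 and the sign-flip warning can be verified symbolically. A minimal sketch (sympy assumed available; not part of the original notes):

```python
# Example 4.8 symbolically, plus the bound-switching warning from the remark.
import sympy as sp

x, y = sp.symbols('x y')
correct = sp.integrate(x + 2*y, (y, 2*x**2, 1 + x**2), (x, -1, 1))
flipped = sp.integrate(x + 2*y, (y, 1 + x**2, 2*x**2), (x, -1, 1))  # g1, g2 switched

assert correct == sp.Rational(32, 15)
assert flipped == -correct   # same magnitude, wrong sign
```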

 Example 4.9 Find the volume of the solid S that is bounded by the paraboloid $z = x^2 + y^2$ and
above the region D bounded by $y = 2x$ and $y = x^2$.

Solution:
Step 1: Determine the bounds of D by graphing the two given curves and finding the points of
intersection.

To find the points of intersection set the two curves equal to each other and solve for x

2x = x2 ⇒ x2 − 2x = 0 ⇒ x = 0, x = 2.

Step 2: Set up the integral, then solve. Observe that it is very important to figure out which curve
is on top!!
$$\iint_D x^2 + y^2\,dA = \int_0^2\int_{x^2}^{2x} x^2 + y^2\,dy\,dx = \int_0^2 \left[x^2y + \frac{1}{3}y^3\right]_{x^2}^{2x} dx$$
$$= \int_0^2 2x^3 + \frac{8}{3}x^3 - x^4 - \frac{1}{3}x^6\,dx = \int_0^2 \frac{14}{3}x^3 - x^4 - \frac{1}{3}x^6\,dx = \left[\frac{14}{12}x^4 - \frac{1}{5}x^5 - \frac{1}{21}x^7\right]_0^2 = \frac{216}{35}.$$


R In the previous example, we could also write the domain D in terms of functions of $y$, $x = y/2$
and $x = \sqrt{y}$. Then the points of intersection are $(0,0)$ and $(2,4)$ with integral
$$\int_0^4\int_{y/2}^{\sqrt{y}} x^2 + y^2\,dx\,dy = \frac{216}{35}.$$

 Example 4.10 Evaluate $\iint_D xy\,dA$ where $D$ is bounded by $y = x - 1$ and $y^2 = 2x + 6$.

Solution:
Step 1: Determine the bounds of D by graphing the two given curves and finding the points of
intersection.

To find the points of intersection set the two curves equal to each other and solve for x
1
x = y + 1 = y2 − 3 ⇒ y2 − 2y − 8 = (y − 4)(y + 2) = 0 ⇒ y = 4, y = −2.
2
Step 2: Set up the integral, then solve. Observe that it is very important to figure out which curve
is on top!!
" #
1 2 y+1
ZZ Z 4 Z y+1 Z 4
xydA = xydxdy = x y dy
D −2 12 y2 −3 −2 2 1 2
y −3 2
Z 4 Z 4
1 1 1 1
= − ( y2 − 3)2 y + (y + 1)2 ydy = − y5 + 2y3 + y2 − 4ydy
−2 2 2 2 −2 8
4
1 6 1 4 1 3 2

= − y + y + y − 2y = 36.
48 2 3 −2

R If we wanted to integrate the previous example in y and define the domain between two
functions of x, first we would need to break the integral into two parts:
$$\iint_D xy\,dA = \int_{-3}^{-1}\int_{-\sqrt{2x+6}}^{\sqrt{2x+6}} xy\,dy\,dx + \int_{-1}^{5}\int_{x-1}^{\sqrt{2x+6}} xy\,dy\,dx.$$

 Example 4.11 Find the volume of a tetrahedron that is bounded by the planes x + 2y + z = 2,
x = 2y, x = 0, and z = 0.

Solution:
Step 1: Determine the bounds of D by graphing the two given curves and finding the points of
intersection.

To find the points of intersection set the two curves equal to each other and solve for x

$$y = 1 - \frac{x}{2} = \frac{x}{2} \quad\Rightarrow\quad x = 1.$$

Step 2: Set up the integral, then solve. Observe that it is very important to figure out which curve
is on top!!

$$\iint_D z\,dA = \int_0^1\int_{x/2}^{1-x/2} (2 - x - 2y)\,dy\,dx = \int_0^1 \left[2y - xy - y^2\right]_{x/2}^{1-x/2} dx = \int_0^1 x^2 - 2x + 1\,dx = \left[\frac{1}{3}x^3 - x^2 + x\right]_0^1 = \frac{1}{3} - 1 + 1 = \frac{1}{3}.$$


 Example 4.12 Evaluate: $\int_0^1\int_x^1 \sin(y^2)\,dy\,dx$.

Solution: This would be too hard to evaluate as written. Instead, it may be easier to integrate as a
function of x first.

Step 1: Determine the bounds of D by graphing the two given curves and finding the points
of intersection.

To find the points of intersection set the two curves equal to each other and solve for x

x=y=1 ⇒ x = 1.

Step 2: Set up the integral, then solve. Observe that it is very important to figure out which curve
is on top!!
$$\int_0^1\int_x^1 \sin(y^2)\,dy\,dx = \int_0^1\int_0^y \sin(y^2)\,dx\,dy = \int_0^1 \left[\sin(y^2)\,x\right]_0^y dy = \int_0^1 y\sin(y^2)\,dy = \left[-\frac{1}{2}\cos(y^2)\right]_0^1 = \frac{1}{2}\left[1 - \cos(1)\right].$$
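The swapped order of integration is exactly what makes this problem tractable, and it can be checked symbolically. A minimal sketch (sympy assumed available; not part of the original notes):

```python
# Example 4.12 after swapping the order of integration (sympy assumed).
import sympy as sp

x, y = sp.symbols('x y')
# Same triangular region {0 <= x <= y <= 1}, integrated dx first, then dy.
val = sp.integrate(sp.sin(y**2), (x, 0, y), (y, 0, 1))
assert sp.simplify(val - (sp.S(1)/2 - sp.cos(1)/2)) == 0
```

Note that sympy cannot express the original $dy\,dx$ order in elementary terms (it involves Fresnel integrals), which mirrors why the swap is necessary by hand.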

4.2.1 Integrals Over Subregions


A double integral can always be broken into subregions. If D is broken into two pieces $D_1$ and $D_2$
such that $D = D_1 \cup D_2$ and $D_1 \cap D_2 = \emptyset$ (except possibly along their shared boundary), then
$$\iint_D f(x,y)\,dA = \iint_{D_1} f(x,y)\,dA + \iint_{D_2} f(x,y)\,dA. \quad (4.3)$$

4.2.2 Area Between Curves


We can also calculate the area between two curves using a double integral
$$\iint_D 1\,dA = \text{Area}(D). \quad (4.4)$$

 Example 4.13 Calculate the area of the circle x2 + y2 = 1.


$$\int_{-1}^1\int_{-\sqrt{1-x^2}}^{\sqrt{1-x^2}} 1\,dy\,dx = \int_{-1}^1 \sqrt{1-x^2} + \sqrt{1-x^2}\,dx = 2\int_{-1}^1 \sqrt{1-x^2}\,dx.$$
Substitute $x = \sin(\theta)$, $dx = \cos(\theta)\,d\theta$:
$$= 2\int_{-\pi/2}^{\pi/2} \sqrt{1 - \sin^2(\theta)}\,\cos(\theta)\,d\theta = 2\int_{-\pi/2}^{\pi/2} \cos^2(\theta)\,d\theta = 2\int_{-\pi/2}^{\pi/2} \frac{1}{2} + \frac{1}{2}\cos(2\theta)\,d\theta = 2\left[\frac{1}{2}\theta + \frac{1}{4}\sin(2\theta)\right]_{-\pi/2}^{\pi/2} = 2\left[\frac{\pi}{4} + 0 + \frac{\pi}{4} + 0\right] = \pi.$$


4.3 Triple Integrals


Now we consider integrating a function of three variables over three-dimensional space. This
requires defining the triple integral. Given a box
B := {(x, y, z) | a ≤ x ≤ b, c ≤ y ≤ d, r ≤ z ≤ s} ,
the triple integral can be defined as
$$\iiint_B f(x,y,z)\,dV = \int_r^s\int_c^d\int_a^b f(x,y,z)\,dx\,dy\,dz. \quad (4.5)$$
 Example 4.14 Evaluate the triple integral $\iiint_B xyz^2\,dV$ where $B = \{0 \le x \le 1,\ -1 \le y \le 2,\ 0 \le z \le 3\}$.

Solution: In 3D there are now 6 possible orders of integration! So it is important to choose
the one that seems to result in the easiest integrals to compute.
$$\int_0^3\int_{-1}^2\int_0^1 xyz^2\,dx\,dy\,dz = \int_0^3\int_{-1}^2 \left[\frac{1}{2}x^2yz^2\right]_0^1 dy\,dz = \int_0^3\int_{-1}^2 \frac{1}{2}yz^2\,dy\,dz$$
$$= \int_0^3 \left[\frac{1}{4}y^2z^2\right]_{-1}^2 dz = \int_0^3 z^2 - \frac{1}{4}z^2\,dz = \int_0^3 \frac{3}{4}z^2\,dz = \left[\frac{1}{4}z^3\right]_0^3 = \frac{27}{4}.$$


R The Fubini Theorem still holds so we can take the three integrals in any order we see fit!
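Because the integrand in Example 4.14 separates as x · y · z², the box integral factors into three one-dimensional integrals. A quick exact-arithmetic check of that factorization (a verification sketch, not part of the original notes):

```python
from fractions import Fraction

# The box integral of x*y*z^2 factors into one-dimensional pieces.
ix = Fraction(1, 2)                   # integral of x   on [0, 1]
iy = Fraction(2 ** 2 - (-1) ** 2, 2)  # integral of y   on [-1, 2] = 3/2
iz = Fraction(3 ** 3, 3)              # integral of z^2 on [0, 3]  = 9
value = ix * iy * iz
print(value)
```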
Definition 4.3.1 (Iterated Integrals) Suppose a region is bounded between two surfaces. Then
we can compute the three-dimensional iterated integrals
\[
\int_a^b\!\int_{g_1(x)}^{g_2(x)}\!\int_{u_1(x,y)}^{u_2(x,y)} f(x,y,z)\,dz\,dy\,dx \quad\text{or}\quad \int_c^d\!\int_{h_1(y)}^{h_2(y)}\!\int_{u_1(x,y)}^{u_2(x,y)} f(x,y,z)\,dz\,dx\,dy. \tag{4.6}
\]

 Example 4.15 Evaluate: $\iiint_E z\,dV$ where E is the solid tetrahedron bounded by x = 0, y = 0, z = 0, x + y + z = 1.

Solution:

Step 1: Draw Two Diagrams! One for the 3D surfaces and one for the two-dimensional area
D we will integrate over. These will be helpful in finding the curves and points of intersection. The
points of intersection in the plane z = 0 are the lines y = 0 and y = 1 − x.

Step 2: Set up the Iterated Integrals:
\[
\int_0^1\!\int_0^{1-x}\!\int_0^{1-x-y} z\,dz\,dy\,dx = \int_0^1\!\int_0^{1-x} \left[\frac{1}{2}z^2\right]_0^{1-x-y} dy\,dx = \int_0^1\!\int_0^{1-x} \frac{1}{2}(1-x-y)^2\,dy\,dx
\]
\[
= \int_0^1 \left[-\frac{1}{6}(1-x-y)^3\right]_0^{1-x} dx = \int_0^1 \frac{1}{6}(1-x)^3\,dx = \left[-\frac{1}{24}(1-x)^4\right]_0^1 = \frac{1}{24}.
\]
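A numerical check of Example 4.15 (a sketch, not from the original notes): after the inner z- and y-integrations are done analytically, only ∫₀¹ (1−x)³/6 dx remains, which a midpoint rule confirms equals 1/24.

```python
# Midpoint rule for the last remaining single integral in Example 4.15.
n = 20000
total = 0.0
for i in range(n):
    x = (i + 0.5) / n
    total += (1.0 - x) ** 3 / 6.0 / n
print(total, 1.0 / 24.0)
```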
24 0 24

 Example 4.16 Evaluate: $\iiint_E 6xy\,dV$ where E is the solid bounded below by z = 0 and above by the plane z = 1 + x + y, over the region enclosed by the curve y = √x, the line y = 0, and x = 1.

Solution:

Step 1: Draw Two Diagrams! One for the 3D surfaces and one for the two-dimensional area
D we will integrate over. These will be helpful in finding the curves and points of intersection. The

bounding curves in the plane z = 0 are y = 0 and y = √x, where x runs from 0 to 1.

Step 2: Set up the Iterated Integrals:
\[
\int_0^1\!\int_0^{\sqrt{x}}\!\int_0^{1+x+y} 6xy\,dz\,dy\,dx = \int_0^1\!\int_0^{\sqrt{x}} 6xy(1+x+y)\,dy\,dx = \int_0^1\!\int_0^{\sqrt{x}} 6xy+6x^2y+6xy^2\,dy\,dx
\]
\[
= \int_0^1 \left[3xy^2+3x^2y^2+2xy^3\right]_0^{\sqrt{x}} dx = \int_0^1 3x^2+3x^3+2x^{5/2}\,dx = \left[x^3+\frac{3}{4}x^4+\frac{4}{7}x^{7/2}\right]_0^1 = \frac{65}{28}.
\]
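The final integration step of Example 4.16 can be double-checked with exact rational arithmetic (a verification sketch, not part of the original notes): ∫₀¹ 3x² dx = 1, ∫₀¹ 3x³ dx = 3/4, and ∫₀¹ 2x^(5/2) dx = 4/7.

```python
from fractions import Fraction

# Sum the three antiderivative values at x = 1: 1 + 3/4 + 4/7.
value = Fraction(1) + Fraction(3, 4) + Fraction(4, 7)
print(value)
```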


4.3.1 Volume Between Surfaces


We can also calculate the volume of a solid region B using a triple integral:
\[
\iiint_B 1\,dV = \text{Volume}(B). \tag{4.7}
\]

4.4 Applications of Integration


There are many real-world physical applications that can be found using double and triple integrals. In the previous section, we observed that these integrals can be used to compute the area $A = \iint 1\,dx\,dy$ and the volume $V = \iiint 1\,dx\,dy\,dz$. Typical questions one may ask have the form: Given the curve y = x − x² from x = 0 to x = 1, find
a) Area under the curve
b) Mass of a sheet of material cut in the shape of this area with a given density ρ(x, y)
c) Arc Length of the curve
d) Centroid of the area
e) Centroid of the arc
f) Moments of Inertia

We must first define these quantities.

4.4.1 Mass
Suppose a plate occupies a region D in the xy-plane with variable density ρ(x, y).
Definition 4.4.1 (Mass) In physics, the density is defined as the mass per unit of volume, ρ = m/V. We can define the mass even when the density is non-uniform, ρ = ρ(x, y):
\[
m = \iint_D \rho(x,y)\,dx\,dy. \tag{4.8}
\]

Similarly, if an electric charge is distributed over a region D with a charge density (charge/area)
given by σ (x, y), then the total charge is
\[
Q := \iint_D \sigma(x,y)\,dA.
\]

 Example 4.17 Suppose the charge is distributed over a triangular region D between x = 1, y = 1,
y = 1 − x so that the charge density at (x, y) is σ (x, y) = xy (C/m2 ). Find the charge Q.

Solution: As in the last section we need to find the points of intersection before defining the
bounds of the integral. Here, the lines intersect at (1, 0), (0, 1), and (1, 1).

Then we need to set up the appropriate integral:
\[
Q = \iint_D xy\,dA = \int_0^1\!\int_{1-x}^1 xy\,dy\,dx = \int_0^1 \left[\frac{1}{2}xy^2\right]_{1-x}^1 dx = \int_0^1 \frac{1}{2}x-\frac{1}{2}x(1-x)^2\,dx = \int_0^1 -\frac{1}{2}x^3+x^2\,dx
\]
\[
= \left[-\frac{1}{8}x^4+\frac{1}{3}x^3\right]_0^1 = -\frac{1}{8}+\frac{1}{3} = \frac{5}{24}.
\]
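The charge computation reduces, after the inner y-integration, to ∫₀¹ (x² − x³/2) dx; an exact check (a sketch, not from the original notes):

```python
from fractions import Fraction

# Q = integral of x^2 - x^3/2 on [0, 1] = 1/3 - 1/8.
Q = Fraction(1, 3) - Fraction(1, 8)
print(Q)
```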


4.4.2 Moments and Center of Mass


In physics, the moment of force (often referred to as just moment) is a measure of its tendency to
cause a body to rotate about a specific point or axis.
Definition 4.4.2 (Moments) The moment of an object about an axis is the product of the mass
and the directed distance from the axis
\[
M_x := \iint_D y\,\rho(x,y)\,dx\,dy, \qquad M_y := \iint_D x\,\rho(x,y)\,dx\,dy. \tag{4.9}
\]

In physics, the center of mass for a distribution of mass in space is the unique point where the weighted relative position of the distributed mass sums to zero. In other words, it is the point where, if a force is applied, the object will move in the direction of the force without rotating.
Definition 4.4.3 (Centers of Mass) The center of mass of an object

\[
\bar{x} := \frac{1}{m}\iint_D x\,\rho(x,y)\,dx\,dy = \frac{M_y}{m}, \qquad \bar{y} := \frac{1}{m}\iint_D y\,\rho(x,y)\,dx\,dy = \frac{M_x}{m}, \tag{4.10}
\]
where the mass is $m = \iint_D \rho(x,y)\,dy\,dx$.

In mathematics and physics, the centroid or geometric center of a two-dimensional region (area)
is the arithmetic mean ("average") position of all the points in the shape.
Definition 4.4.4 (Centroid) The centroid of an object is the point where it would balance on the end of a rod if the density were uniform.
\[
x_{\text{cent}} := \frac{\iint_D x\,\rho(x,y)\,dx\,dy}{\iint_D \rho(x,y)\,dy\,dx} \ \overset{\rho=\text{const.}}{=}\ \frac{1}{A}\iint_D x\,dA, \qquad
y_{\text{cent}} := \frac{\iint_D y\,\rho(x,y)\,dx\,dy}{\iint_D \rho(x,y)\,dy\,dx} \ \overset{\rho=\text{const.}}{=}\ \frac{1}{A}\iint_D y\,dA. \tag{4.11}
\]

Example 4.18 Find the mass and the center of mass of a triangular plate with vertices (0, 0), (1, 0), (0, 2) and density ρ(x, y) = 1 + 3x + y.

Solution: First we need to find the boundary curves, in particular the line L forming the hypotenuse of the triangular plate. Using point-slope form: $(y-2) = \frac{2-0}{0-1}(x-0) \Rightarrow y = -2x+2$.

Now compute the total mass using the formula:

\[
m = \iint_D \rho(x,y)\,dA = \int_0^1\!\int_0^{-2x+2} 1+3x+y\,dy\,dx = \int_0^1 \left[y+3xy+\frac{1}{2}y^2\right]_0^{-2x+2} dx = \int_0^1 -4x^2+4\,dx = \left[-\frac{4}{3}x^3+4x\right]_0^1 = \frac{8}{3}.
\]

Now we must compute the centers of mass x̄, ȳ.

\[
\bar{x} = \frac{1}{m}\int_0^1\!\int_0^{-2x+2} x(1+3x+y)\,dy\,dx = \frac{3}{8}\int_0^1 \left[xy+3x^2y+\frac{1}{2}xy^2\right]_0^{-2x+2} dx = \frac{3}{8}\int_0^1 -4x^3+4x\,dx = \frac{3}{8}\left[-x^4+2x^2\right]_0^1 = \frac{3}{8},
\]
and
\[
\bar{y} = \frac{1}{m}\int_0^1\!\int_0^{-2x+2} y(1+3x+y)\,dy\,dx = \frac{3}{8}\int_0^1 \left[\frac{1}{2}y^2+\frac{3}{2}xy^2+\frac{1}{3}y^3\right]_0^{-2x+2} dx
\]
\[
= \frac{3}{8}\int_0^1 6x^3-10x^2+2x+2+\frac{1}{3}(2-2x)^3\,dx = \frac{3}{8}\left[\frac{3}{2}x^4-\frac{10}{3}x^3+x^2+2x-\frac{1}{24}(2-2x)^4\right]_0^1 = \frac{3}{8}\cdot\frac{11}{6} = \frac{11}{16}.
\]
In addition, let’s compute the centroid to show how it differs from the center of mass. First we must
compute the area
\[
A := \int_0^1\!\int_0^{-2x+2} 1\,dy\,dx = \int_0^1 -2x+2\,dx = \left[-x^2+2x\right]_0^1 = 1.
\]

Now, compute xcent , ycent :


\[
x_{\text{cent}} = \frac{1}{A}\int_0^1\!\int_0^{-2x+2} x\,dy\,dx = \int_0^1 \left[xy\right]_0^{-2x+2} dx = \int_0^1 -2x^2+2x\,dx = \left[-\frac{2}{3}x^3+x^2\right]_0^1 = \frac{1}{3},
\]
\[
y_{\text{cent}} = \frac{1}{A}\int_0^1\!\int_0^{-2x+2} y\,dy\,dx = \int_0^1 \left[\frac{1}{2}y^2\right]_0^{-2x+2} dx = \int_0^1 2x^2-4x+2\,dx = \left[\frac{2}{3}x^3-2x^2+2x\right]_0^1 = \frac{2}{3}.
\]
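A numerical cross-check of the mass and center of mass of this plate (a sketch with an arbitrary grid size n, not from the original notes), accumulating m, Mx, and My with midpoint Riemann sums:

```python
# Midpoint Riemann sums over the triangle 0 <= x <= 1, 0 <= y <= -2x + 2
# with density rho = 1 + 3x + y.
n = 800
m = Mx = My = 0.0
for i in range(n):
    x = (i + 0.5) / n
    ytop = 2.0 - 2.0 * x                          # hypotenuse y = -2x + 2
    for j in range(n):
        y = (j + 0.5) * ytop / n
        w = (1.0 + 3.0 * x + y) * (ytop / n) / n  # rho * dA
        m += w
        My += x * w
        Mx += y * w
print(m, My / m, Mx / m)   # approx 8/3, 3/8, 11/16
```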
0 0 0 2 0 0 3 0 3


4.4.3 Moment of Inertia


The moment of inertia, otherwise known as the angular mass or rotational inertia, of a rigid body
determines the torque needed for a desired angular acceleration about a rotational axis. It depends
on the body’s mass distribution and the axis chosen, with larger moments requiring more torque to
change the body’s rotation. Typically the moment of inertia requires a mass and radius of rotation,
I = mr2 .
Definition 4.4.5 (Moment of Inertia)
\[
I_x := \iint_D y^2\rho(x,y)\,dA, \qquad I_y := \iint_D x^2\rho(x,y)\,dA, \qquad I_z := \iint_D (x^2+y^2)\rho(x,y)\,dA, \tag{4.12}
\]
where $I_z$ is the moment about the axis perpendicular to the plane.

 Example 4.19 Find the moments of inertia Ix, Iy, Iz of a homogeneous rectangular plate with corners (0, 0), (0, 1), (2, 1), (2, 0) and constant density ρ(x, y) = ρ.
\[
I_x = \int_0^2\!\int_0^1 y^2\rho\,dy\,dx = \int_0^2 \frac{\rho}{3}\,dx = \left[\frac{\rho}{3}x\right]_0^2 = \frac{2}{3}\rho,
\]
\[
I_y = \int_0^2\!\int_0^1 x^2\rho\,dy\,dx = \int_0^2 x^2\rho\,dx = \left[\frac{\rho}{3}x^3\right]_0^2 = \frac{8}{3}\rho,
\]
\[
I_z = \int_0^2\!\int_0^1 (x^2+y^2)\rho\,dy\,dx = \int_0^2 \left[\left(x^2y+\frac{1}{3}y^3\right)\rho\right]_0^1 dx = \int_0^2 \rho x^2+\frac{\rho}{3}\,dx = \left[\frac{\rho}{3}x^3+\frac{\rho}{3}x\right]_0^2 = \frac{10}{3}\rho.
\]
Notice along the way we proved the Perpendicular Axis Theorem giving a relation between the moments, $I_z = I_x + I_y$. 
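With ρ = 1 the separated one-dimensional integrals give the three moments directly, and the perpendicular axis relation can be checked exactly (a verification sketch, not part of the original notes):

```python
from fractions import Fraction

# Plate [0, 2] x [0, 1] with rho = 1.
Ix = Fraction(2, 1) * Fraction(1, 3)   # (int_0^2 dx) * (int_0^1 y^2 dy)
Iy = Fraction(8, 3) * Fraction(1, 1)   # (int_0^2 x^2 dx) * (int_0^1 dy)
Iz = Ix + Iy                           # perpendicular axis theorem: Iz = Ix + Iy
print(Ix, Iy, Iz)
```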

4.4.4 Generalization of Physical Quantities to 3D


\[
\text{Mass:}\quad m := \iiint_E \rho(x,y,z)\,dV
\]
\[
\text{Moments:}\quad M_{yz} := \iiint_E x\,\rho(x,y,z)\,dV, \quad M_{xz} := \iiint_E y\,\rho(x,y,z)\,dV, \quad M_{xy} := \iiint_E z\,\rho(x,y,z)\,dV
\]
\[
\text{Center of Mass:}\quad \bar{x} := \frac{M_{yz}}{m}, \quad \bar{y} := \frac{M_{xz}}{m}, \quad \bar{z} := \frac{M_{xy}}{m}
\]
\[
\text{Moments of Inertia:}\quad I_x := \iiint_E (y^2+z^2)\rho\,dV, \quad I_y := \iiint_E (x^2+z^2)\rho\,dV, \quad I_z := \iiint_E (x^2+y^2)\rho\,dV.
\]

4.4.5 Applications to Probability


Definition 4.4.6 (Prob. Density) A function f (x) can be taken as a probability density if it
satisfies
\[
f(x)\ge 0 \qquad\text{and}\qquad \int_{-\infty}^{\infty} f(x)\,dx = 1. \tag{4.13}
\]

In one dimension, the probability of finding a value in the interval [a, b] given a probability density f is
\[
P(a\le x\le b) = \int_a^b f(x)\,dx.
\]
The analogous calculation in two dimensions refers to a joint probability density, f(x, y), in two variables:
\[
P(a\le x\le b,\ c\le y\le d) = \int_a^b\!\int_c^d f(x,y)\,dy\,dx, \qquad \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} f(x,y)\,dy\,dx = 1.
\]

 Example 4.20 The joint density function for X and Y is given by:
\[
f(x,y) := \begin{cases} C(x+2y), & 0\le x\le 10,\ 0\le y\le 10, \\ 0, & \text{otherwise.} \end{cases}
\]

a) Find the constant C such that f (x, y) is a probability density.



b) Find the probability that x ≤ 7 and y ≥ 2.

Solution: First, we must find C by integrating f in both variables from −∞ to ∞:
\[
1 = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} f(x,y)\,dy\,dx = \int_0^{10}\!\int_0^{10} C(x+2y)\,dy\,dx = \int_0^{10} \left[Cxy+Cy^2\right]_0^{10} dx = \int_0^{10} 10Cx+100C\,dx = \left[5Cx^2+100Cx\right]_0^{10} = 1500C.
\]
Thus, $C = \frac{1}{1500}$.

Next, the probability that x ≤ 7 and y ≥ 2 can be computed:
\[
P(x\le 7,\ y\ge 2) = \int_{-\infty}^{7}\!\int_2^{\infty} f(x,y)\,dy\,dx = \int_0^7\!\int_2^{10} \frac{1}{1500}(x+2y)\,dy\,dx = \frac{1}{1500}\int_0^7 \left[xy+y^2\right]_2^{10} dx
\]
\[
= \frac{1}{1500}\int_0^7 8x+96\,dx = \frac{1}{1500}\left[4x^2+96x\right]_0^7 = \frac{868}{1500} \approx 0.5787.
\]
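Both numbers in Example 4.20 can be verified with exact rational arithmetic (a sketch, not from the original notes); after the inner y-integration the probability is C ∫₀⁷ (8x + 96) dx:

```python
from fractions import Fraction

C = Fraction(1, 1500)            # normalization constant
P = C * (4 * 7 ** 2 + 96 * 7)    # antiderivative 4x^2 + 96x at x = 7
print(P, float(P))
```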
1500 0 1500 0 1500


Another widely used concept from probability is the concept of an expected value. This is the
value of x and y one should expect to see on average if many trials are run.
Definition 4.4.7 (Expected Value) Given a joint probability density f (x, y), the expected values,
µx , µy , can be computed as
\[
\mu_x := \iint x\,f(x,y)\,dA = m\bar{x}, \qquad \mu_y := \iint y\,f(x,y)\,dA = m\bar{y}. \tag{4.14}
\]
Observe the relationship between the expected values and the centers of mass defined earlier.

 Example 4.21 Given the curve y = x − x2 from x = 0 to x = 1.


a) Find the area under the curve.
b) Find the mass of the plane sheet cut to fit this area with density ρ(x, y) = xy.
c) Find the Center of Mass of the sheet.
d) Find the Volume of Revolution.
e) Find an expression for the Surface Area of Revolution.

Solution: a) First, compute the area under the curve:


\[
A := \int_0^1\!\int_0^{x-x^2} 1\,dy\,dx = \int_0^1 \Big[y\Big]_0^{x-x^2} dx = \int_0^1 x-x^2\,dx = \left[\frac{1}{2}x^2-\frac{1}{3}x^3\right]_0^1 = \frac{1}{2}-\frac{1}{3} = \frac{1}{6}.
\]

b) Next, find the total mass of the sheet


\[
m = \iint_D \rho(x,y)\,dA = \int_0^1\!\int_0^{x-x^2} xy\,dy\,dx = \int_0^1 \left[\frac{1}{2}xy^2\right]_0^{x-x^2} dx = \int_0^1 \frac{1}{2}x^5-x^4+\frac{1}{2}x^3\,dx = \left[\frac{1}{12}x^6-\frac{1}{5}x^5+\frac{1}{8}x^4\right]_0^1 = \frac{1}{120}.
\]

c) Next, find the center of mass:
\[
\bar{x} = \frac{1}{m}\iint_D x\,\rho(x,y)\,dA = 120\int_0^1\!\int_0^{x-x^2} x^2y\,dy\,dx = 120\int_0^1 \left[\frac{1}{2}x^2y^2\right]_0^{x-x^2} dx
\]
\[
= 120\int_0^1 \frac{1}{2}x^6-x^5+\frac{1}{2}x^4\,dx = 120\left[\frac{1}{14}x^7-\frac{1}{6}x^6+\frac{1}{10}x^5\right]_0^1 = \frac{4}{7},
\]
\[
\bar{y} = \frac{1}{m}\iint_D y\,\rho(x,y)\,dA = 120\int_0^1\!\int_0^{x-x^2} xy^2\,dy\,dx = 120\int_0^1 \left[\frac{1}{3}xy^3\right]_0^{x-x^2} dx
\]
\[
= 120\int_0^1 -\frac{1}{3}x^7+x^6-x^5+\frac{1}{3}x^4\,dx = 120\left[-\frac{1}{24}x^8+\frac{1}{7}x^7-\frac{1}{6}x^6+\frac{1}{15}x^5\right]_0^1 = \frac{1}{7}.
\]

d) Find the volume of revolution. Recall that the volume of a thin cylindrical disk is πr² times its thickness; here the curve y(x) is the radius and the x interval [0, 1] provides the height:
\[
V = \int_0^1 \pi y^2\,dx = \pi\int_0^1 (x-x^2)^2\,dx = \pi\int_0^1 x^4-2x^3+x^2\,dx = \pi\left[\frac{1}{5}x^5-\frac{1}{2}x^4+\frac{1}{3}x^3\right]_0^1 = \frac{\pi}{30}.
\]

e) The expression for the Surface Area of Revolution is:


\[
A := \int 2\pi y\,ds = \int_0^1 2\pi(x-x^2)\sqrt{1+\left(\frac{dy}{dx}\right)^2}\,dx = \int_0^1 2\pi(x-x^2)\sqrt{1+(1-2x)^2}\,dx.
\]


 Example 4.22 Find the moments of inertia for the same density, but under the curve f(x) = x² (see Book).

Solution: Compute the moments:
\[
I_x = \int_0^1\!\int_0^{x^2} xy^3\,dy\,dx = \int_0^1 \left[\frac{1}{4}xy^4\right]_0^{x^2} dx = \int_0^1 \frac{1}{4}x^9\,dx = \left[\frac{1}{40}x^{10}\right]_0^1 = \frac{1}{40},
\]
\[
I_y = \int_0^1\!\int_0^{x^2} x^3y\,dy\,dx = \int_0^1 \left[\frac{1}{2}x^3y^2\right]_0^{x^2} dx = \int_0^1 \frac{1}{2}x^7\,dx = \left[\frac{1}{16}x^8\right]_0^1 = \frac{1}{16},
\]
\[
I_z = \int_0^1\!\int_0^{x^2} (x^2+y^2)xy\,dy\,dx = \int_0^1\!\int_0^{x^2} xy^3\,dy\,dx + \int_0^1\!\int_0^{x^2} x^3y\,dy\,dx = \frac{1}{40}+\frac{1}{16} = \frac{7}{80}.
\]


4.5 Change of Variables in Integrals


The fundamental question in this section is: can we study multiple integrals in other coordinate systems? The three basic systems to consider are:
• (2D) Polar Coordinates (r, θ)
• (3D) Cylindrical Coordinates (r, θ, z)
• (3D) Spherical Coordinates (ρ, θ, φ)
A typical problem arises when we want to compute $\iint_R f(x,y)\,dA$ where R is a disk or a ring. This is much easier in polar coordinates! Recall the definition of polar coordinates.

Definition 4.5.1 (Polar Coordinates) Recall the relationship between (x, y) and (r, θ ):

r2 = x2 + y2 , θ = tan−1 (y/x)
x = r cos(θ ), y = r sin(θ ).

For normal integration in Cartesian coordinates, we divide the region into rectangles of area A = dxdy; not so in polar coordinates! Instead we have a "polar rectangle" whose area is not drdθ, but A = rdrdθ, since
\[
\text{Area} = \frac{1}{2}r_2^2\,\Delta\theta - \frac{1}{2}r_1^2\,\Delta\theta = \frac{1}{2}(r_2^2-r_1^2)\,\Delta\theta = \frac{1}{2}(r_2-r_1)(r_2+r_1)\,\Delta\theta = r^*\,\Delta r\,\Delta\theta \to r\,dr\,d\theta
\]
as Δr → 0, Δθ → 0.

4.5.1 Changing to Polar Coordinates in a Double Integral


Given a region R and a function f(x, y) we can convert the double integral to polar coordinates:
\[
\iint_R f(x,y)\,dA = \int_\alpha^\beta\!\int_a^b f(r\cos(\theta), r\sin(\theta))\,r\,dr\,d\theta. \tag{4.15}
\]

 Example 4.23 Evaluate $\iint_{R_2} (3x+4y^2)\,dA$ where $R_2 = \{1\le r\le 2,\ 0\le\theta\le\pi\}$.

Solution: Compute
\[
\iint_{R_2} (3x+4y^2)\,dA = \int_0^\pi\!\int_1^2 [3r\cos(\theta)+4r^2\sin^2(\theta)]\,r\,dr\,d\theta = \int_0^\pi\!\int_1^2 3r^2\cos(\theta)+4r^3\sin^2(\theta)\,dr\,d\theta
\]
\[
= \int_0^\pi \left[r^3\cos(\theta)+r^4\sin^2(\theta)\right]_1^2 d\theta = \int_0^\pi 7\cos(\theta)+15\sin^2(\theta)\,d\theta
\]
\[
= \int_0^\pi 7\cos(\theta)+\frac{15}{2}-\frac{15}{2}\cos(2\theta)\,d\theta = \left[7\sin(\theta)+\frac{15}{2}\theta-\frac{15}{4}\sin(2\theta)\right]_0^\pi = \frac{15\pi}{2}.
\]
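A two-dimensional midpoint Riemann sum in polar coordinates (a sketch with arbitrary grid sizes, not from the original notes) confirms the value 15π/2; note the Jacobian factor r in the area element:

```python
import math

# Midpoint sum over theta in [0, pi], r in [1, 2] of f * r dr dtheta,
# where f = 3x + 4y^2 = 3 r cos(theta) + 4 r^2 sin^2(theta).
n = 1000
total = 0.0
for i in range(n):
    th = (i + 0.5) * math.pi / n
    for j in range(n):
        r = 1.0 + (j + 0.5) / n
        f = 3.0 * r * math.cos(th) + 4.0 * r ** 2 * math.sin(th) ** 2
        total += f * r * (1.0 / n) * (math.pi / n)
print(total, 15.0 * math.pi / 2.0)
```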
0 2 2 2 4 0 2

 Example 4.24 Find the volume of the solid bounded by the plane z = 0 and the paraboloid z = 1 − x² − y².

Solution: Compute
\[
V = \iint_D (1-x^2-y^2)\,dA = \int_0^{2\pi}\!\int_0^1 (1-r^2)\,r\,dr\,d\theta = \int_0^{2\pi} \left[\frac{1}{2}r^2-\frac{1}{4}r^4\right]_0^1 d\theta = \int_0^{2\pi} \frac{1}{4}\,d\theta = \left[\frac{\theta}{4}\right]_0^{2\pi} = \frac{\pi}{2}.
\]

Example 4.25 Given a semicircular sheet of material of radius a, θ ∈ [−π/2, π/2], and constant density ρ, find (a) the center of mass, (b) the moments.

Solution: (a) By symmetry ȳ = 0, and x̄ = M_y/m. First compute the moment:
\[
M_y = \iint x\rho\,dA = \int_{-\pi/2}^{\pi/2}\!\int_0^a (r\cos\theta)\,\rho\,r\,dr\,d\theta = \int_{-\pi/2}^{\pi/2} \left[\frac{\rho r^3}{3}\cos\theta\right]_0^a d\theta = \int_{-\pi/2}^{\pi/2} \frac{a^3}{3}\rho\cos\theta\,d\theta = \frac{a^3}{3}\rho\Big[\sin\theta\Big]_{-\pi/2}^{\pi/2} = \frac{2a^3}{3}\rho.
\]
The mass is
\[
m = \iint \rho\,dA = \int_{-\pi/2}^{\pi/2}\!\int_0^a \rho\,r\,dr\,d\theta = \int_{-\pi/2}^{\pi/2} \left[\frac{1}{2}\rho r^2\right]_0^a d\theta = \int_{-\pi/2}^{\pi/2} \frac{\rho a^2}{2}\,d\theta = \frac{\pi a^2}{2}\rho,
\]
so
\[
\bar{x} = \frac{M_y}{m} = \frac{2a^3\rho/3}{\pi a^2\rho/2} = \frac{4a}{3\pi}.
\]
(b) By symmetry the moment $M_x = \iint y\rho\,dA = 0$. For the moment of inertia about the y-axis,
\[
I_y = \iint x^2\rho\,dA = \rho\int_{-\pi/2}^{\pi/2}\!\int_0^a r^2\cos^2\theta\,r\,dr\,d\theta = \rho\int_{-\pi/2}^{\pi/2} \frac{a^4}{4}\cos^2\theta\,d\theta = \frac{\rho a^4}{4}\left[\frac{\theta}{2}+\frac{1}{4}\sin(2\theta)\right]_{-\pi/2}^{\pi/2} = \frac{\pi a^4}{8}\rho = \frac{a^2}{4}\,m.
\]

We can also compute double integrals over arbitrary polar regions bounded by continuous curves:
\[
\iint_R f(x,y)\,dA = \int_\alpha^\beta\!\int_{h_1(\theta)}^{h_2(\theta)} f(r\cos(\theta), r\sin(\theta))\,r\,dr\,d\theta. \tag{4.16}
\]

 Example 4.26 Use the double integral to find the area of one loop of the 4-leaved rose r = cos(2θ).

Solution: The area can be found as
\[
A(D) = \iint_D dA = \int_{-\pi/4}^{\pi/4}\!\int_0^{\cos(2\theta)} r\,dr\,d\theta = \int_{-\pi/4}^{\pi/4} \left[\frac{1}{2}r^2\right]_0^{\cos(2\theta)} d\theta = \int_{-\pi/4}^{\pi/4} \frac{1}{2}\cos^2(2\theta)\,d\theta
\]
\[
= \frac{1}{4}\int_{-\pi/4}^{\pi/4} 1+\cos(4\theta)\,d\theta = \frac{1}{4}\left[\theta+\frac{1}{4}\sin(4\theta)\right]_{-\pi/4}^{\pi/4} = \frac{\pi}{8}.
\]

 Example 4.27 Find the volume of the solid that lies under the paraboloid z = x² + y², above the xy-plane, and inside the cylinder x² + y² = 2x ⟺ (x − 1)² + y² = 1.

Solution: In polar coordinates x² + y² = 2x ⟺ r = 2cos(θ), and z = x² + y² = r². Then the volume is
\[
V = \iint_D (x^2+y^2)\,dA = \int_{-\pi/2}^{\pi/2}\!\int_0^{2\cos(\theta)} r^3\,dr\,d\theta = \int_{-\pi/2}^{\pi/2} \left[\frac{1}{4}r^4\right]_0^{2\cos(\theta)} d\theta = \int_{-\pi/2}^{\pi/2} 4\cos^4(\theta)\,d\theta
\]
\[
= 4\int_{-\pi/2}^{\pi/2} \left(\frac{1+\cos(2\theta)}{2}\right)^2 d\theta = \int_{-\pi/2}^{\pi/2} 1+2\cos(2\theta)+\cos^2(2\theta)\,d\theta = \int_{-\pi/2}^{\pi/2} 1+2\cos(2\theta)+\frac{1}{2}+\frac{1}{2}\cos(4\theta)\,d\theta
\]
\[
= \left[\theta+\sin(2\theta)+\frac{\theta}{2}+\frac{1}{8}\sin(4\theta)\right]_{-\pi/2}^{\pi/2} = \frac{3\pi}{2}.
\]

4.5.2 Arc Length in Polar Coordinates


From the Pythagorean Theorem, $ds^2 = dr^2 + r^2\,d\theta^2$. Then
\[
ds = \sqrt{\left(\frac{dr}{d\theta}\right)^2 + r^2}\,d\theta = \sqrt{1 + r^2\left(\frac{d\theta}{dr}\right)^2}\,dr.
\]
As a check we compute the arc length of a known curve, a circle of radius r (so dr/dθ = 0):
\[
s = \int_0^{2\pi} \sqrt{r^2}\,d\theta = \int_0^{2\pi} r\,d\theta = \Big[r\theta\Big]_0^{2\pi} = 2\pi r.
\]

4.6 Cylindrical Coordinates


Definition 4.6.1 (Cylindrical Coordinates) Recall the relationship between Cartesian (x, y, z)
and Cylindrical Coordinates (r, θ , z)

r2 = x2 + y2 , θ = tan−1 (y/x), z=z


x = r cos(θ ), y = r sin(θ ), z = z.

A typical volume element is dV = rdrdθ dz and arc length ds2 = dr2 + r2 dθ 2 + dz2 .

 Example 4.28 (a) Find the point (2, 2π/3, 1) in Cartesian coordinates.
\[
x = r\cos(\theta) = 2\cos\!\left(\frac{2\pi}{3}\right) = 2(-1/2) = -1
\]
\[
y = r\sin(\theta) = 2\sin\!\left(\frac{2\pi}{3}\right) = 2(\sqrt{3}/2) = \sqrt{3}
\]
\[
z = 1.
\]
(b) Find the cylindrical coordinates of the point (3, −3, −7).
\[
r = \sqrt{x^2+y^2} = \sqrt{9+9} = 3\sqrt{2}
\]
\[
\tan(\theta) = \frac{y}{x} = \frac{-3}{3} = -1 \ \Rightarrow\ \theta = \frac{7\pi}{4}+2n\pi
\]
\[
z = -7.
\]



R Cylindrical coordinates are most useful in problems with symmetry about some axis (e.g., a
cylinder). In Cartesian coordinates a cylinder is x2 + y2 = c2 and in Cylindrical coordinates
r = c.

 Example 4.29 Describe the surface with cylindrical coordinates z = r.

Solution: In Cartesian coordinates this is z2 = x2 + y2 , which is concentric circles of radius z


or a cone. 

Definition 4.6.2 (Triple Integrals in Cylindrical Coordinates)


ZZZ Z β Z h2 (θ ) Z u2 (r cos(θ ),r sin(θ ))
f (x, y, z)dV = f (r cos(θ ), r sin(θ ), z)rdzdrdθ . (4.17)
E α h1 (θ ) u1 (r cos(θ ),r sin(θ ))

 Example 4.30 A solid E is inside the cylinder x² + y² = 1, below the plane z = 4, and above the paraboloid z = 1 − x² − y². Find the mass of E given the density ρ(x, y, z) = √(x² + y²).

Solution: Recall the formula for the mass:
\[
m = \iiint_E \rho(x,y,z)\,dV = \int_0^{2\pi}\!\int_0^1\!\int_{1-r^2}^4 r\cdot r\,dz\,dr\,d\theta = \int_0^{2\pi}\!\int_0^1 \left[r^2z\right]_{1-r^2}^4 dr\,d\theta = \int_0^{2\pi}\!\int_0^1 r^2(4-1+r^2)\,dr\,d\theta
\]
\[
= \int_0^{2\pi}\!\int_0^1 r^4+3r^2\,dr\,d\theta = \int_0^{2\pi} \left[\frac{1}{5}r^5+r^3\right]_0^1 d\theta = \int_0^{2\pi} \frac{6}{5}\,d\theta = \frac{12\pi}{5}.
\]

 Example 4.31 Evaluate $\int_{-2}^2\!\int_{-\sqrt{4-x^2}}^{\sqrt{4-x^2}}\!\int_{\sqrt{x^2+y^2}}^2 (x^2+y^2)\,dz\,dy\,dx$.

Solution: Convert to cylindrical coordinates:
\[
\int_{-2}^2\!\int_{-\sqrt{4-x^2}}^{\sqrt{4-x^2}}\!\int_{\sqrt{x^2+y^2}}^2 (x^2+y^2)\,dz\,dy\,dx = \int_0^{2\pi}\!\int_0^2\!\int_r^2 r^2\cdot r\,dz\,dr\,d\theta = \int_0^{2\pi}\!\int_0^2 \left[r^3z\right]_r^2 dr\,d\theta
\]
\[
= \int_0^{2\pi}\!\int_0^2 -r^4+2r^3\,dr\,d\theta = \int_0^{2\pi} \left[-\frac{1}{5}r^5+\frac{1}{2}r^4\right]_0^2 d\theta = \int_0^{2\pi} -\frac{32}{5}+8\,d\theta = \int_0^{2\pi} \frac{8}{5}\,d\theta = \frac{16\pi}{5}.
\]


4.7 Spherical Coordinates


Definition 4.7.1 (Spherical Coordinates) Recall the relationship between Cartesian (x, y, z) and Spherical Coordinates (ρ, θ, φ):
\[
\rho^2 = x^2+y^2+z^2, \qquad \theta = \cos^{-1}(z/\rho), \qquad \phi = \cos^{-1}\!\big(x/(\rho\sin\theta)\big)
\]
\[
x = \rho\sin(\theta)\cos(\phi), \qquad y = \rho\sin(\theta)\sin(\phi), \qquad z = \rho\cos(\theta).
\]
A typical volume element is $dV = \rho^2\sin(\theta)\,d\rho\,d\theta\,d\phi$ and the arc length is $ds^2 = d\rho^2 + \rho^2\,d\theta^2 + \rho^2\sin^2(\theta)\,d\phi^2$.

There are many domains which are easier to describe in spherical coordinates, such as (i) a sphere ρ = const, (ii) a half-plane φ = c, and (iii) the upper half cone θ = c for c < π/2 and the lower half cone θ = c for c > π/2.

 Example 4.32 (a) Find the point (ρ, θ, φ) = (2, π/3, π/4) in Cartesian coordinates.
\[
x = \rho\sin(\theta)\cos(\phi) = 2\sin(\pi/3)\cos(\pi/4) = 2(\sqrt{3}/2)(1/\sqrt{2}) = \frac{\sqrt{3}}{\sqrt{2}}
\]
\[
y = \rho\sin(\theta)\sin(\phi) = 2\sin(\pi/3)\sin(\pi/4) = 2(\sqrt{3}/2)(1/\sqrt{2}) = \frac{\sqrt{3}}{\sqrt{2}}
\]
\[
z = \rho\cos(\theta) = 2\cos(\pi/3) = 2(1/2) = 1.
\]
(b) Find the spherical coordinates of the point (0, 2√3, −2).
\[
\rho = \sqrt{x^2+y^2+z^2} = \sqrt{0+12+4} = \sqrt{16} = 4
\]
\[
\cos(\theta) = \frac{z}{\rho} = \frac{-2}{4} = -\frac{1}{2} \ \Rightarrow\ \theta = \frac{2\pi}{3}
\]
\[
\cos(\phi) = \frac{x}{\rho\sin(\theta)} = \frac{0}{4\sin(2\pi/3)} = 0 \ \Rightarrow\ \phi = \frac{\pi}{2}.
\]


Definition 4.7.2 (Triple Integrals in Spherical Coordinates)

\[
\iiint_E f(x,y,z)\,dV = \int_c^d\!\int_\alpha^\beta\!\int_a^b f(\rho\sin\theta\cos\phi,\ \rho\sin\theta\sin\phi,\ \rho\cos\theta)\,\rho^2\sin(\theta)\,d\rho\,d\phi\,d\theta. \tag{4.18}
\]

 Example 4.33 Evaluate $\iiint_B e^{(x^2+y^2+z^2)^{3/2}}\,dV$ on the unit ball $B = \{x^2+y^2+z^2\le 1\}$.

Solution: Set up the appropriate integral:
\[
\iiint_B e^{(x^2+y^2+z^2)^{3/2}}\,dV = \int_0^\pi\!\int_0^{2\pi}\!\int_0^1 e^{\rho^3}\rho^2\sin(\theta)\,d\rho\,d\phi\,d\theta = \int_0^\pi \sin(\theta)\,d\theta\int_0^{2\pi} d\phi\int_0^1 \rho^2e^{\rho^3}\,d\rho
\]
\[
= \Big[-\cos(\theta)\Big]_0^\pi\,\Big[\phi\Big]_0^{2\pi}\,\left[\frac{1}{3}e^{\rho^3}\right]_0^1 = 2(2\pi)\left(\frac{1}{3}e-\frac{1}{3}\right) = \frac{4\pi}{3}(e-1).
\]

 Example 4.34 ("Ice Cream") Use spherical coordinates to find the volume of the solid that lies above the cone z = √(x² + y²) and below the sphere x² + y² + z² = z.

Solution: First we observe that the equation for the sphere can be written as ρ² = ρcos(θ) ⇒ ρ = cos(θ), and the equation of the cone becomes
\[
\rho\cos(\theta) = \sqrt{\rho^2\sin^2(\theta)\cos^2(\phi)+\rho^2\sin^2(\theta)\sin^2(\phi)} = \rho\sin(\theta).
\]
Thus, θ = π/4.
\[
V(E) = \iiint_E dV = \int_0^{\pi/4}\!\int_0^{2\pi}\!\int_0^{\cos(\theta)} \rho^2\sin(\theta)\,d\rho\,d\phi\,d\theta = \int_0^{\pi/4}\!\int_0^{2\pi} \left[\frac{1}{3}\rho^3\sin(\theta)\right]_0^{\cos(\theta)} d\phi\,d\theta
\]
\[
= \int_0^{\pi/4}\!\int_0^{2\pi} \frac{1}{3}\cos^3(\theta)\sin(\theta)\,d\phi\,d\theta = \frac{2\pi}{3}\int_0^{\pi/4} \cos^3(\theta)\sin(\theta)\,d\theta = \frac{2\pi}{3}\left[-\frac{1}{4}\cos^4(\theta)\right]_0^{\pi/4} = \frac{2\pi}{3}\left(-\frac{1}{16}+\frac{1}{4}\right) = \frac{2\pi}{3}\cdot\frac{3}{16} = \frac{\pi}{8}.
\]



4.7.1 Jacobians
Jacobians describe how a basic area element is scaled when changing coordinates. Consider a transformation T from the xy-plane to the uv-plane, T(x, y) = (u, v). The rectangular area element in (x, y), dA = dxdy, will become distorted in the uv-plane and have a new area. The scaling of one area to another after a transformation is the Jacobian.
Definition 4.7.3 (2D Jacobian) In 2D, when we go from (x, y) to some new coordinates (s, t) we compute the Jacobian using partial derivatives and determinants:
\[
J = J\!\left(\frac{x,y}{s,t}\right) = \frac{\partial(x,y)}{\partial(s,t)} := \begin{vmatrix} \dfrac{\partial x}{\partial s} & \dfrac{\partial x}{\partial t} \\[6pt] \dfrac{\partial y}{\partial s} & \dfrac{\partial y}{\partial t} \end{vmatrix}. \tag{4.19}
\]
The area element dA = dydx is replaced by |J| ds dt. Notice the absolute value.

 Example 4.35 (Polar Coordinates)
\[
J\!\left(\frac{x,y}{r,\theta}\right) := \begin{vmatrix} \dfrac{\partial x}{\partial r} & \dfrac{\partial x}{\partial\theta} \\[6pt] \dfrac{\partial y}{\partial r} & \dfrac{\partial y}{\partial\theta} \end{vmatrix} = \begin{vmatrix} \cos(\theta) & -r\sin(\theta) \\ \sin(\theta) & r\cos(\theta) \end{vmatrix} = r\cos^2(\theta)+r\sin^2(\theta) = r.
\]

Definition 4.7.4 (3D Jacobian) In 3D, when we go from (x, y, z) to some new coordinates (r, s, t) we compute the Jacobian using partial derivatives and determinants:
\[
J = J\!\left(\frac{x,y,z}{r,s,t}\right) = \frac{\partial(x,y,z)}{\partial(r,s,t)} := \begin{vmatrix} \dfrac{\partial x}{\partial r} & \dfrac{\partial x}{\partial s} & \dfrac{\partial x}{\partial t} \\[6pt] \dfrac{\partial y}{\partial r} & \dfrac{\partial y}{\partial s} & \dfrac{\partial y}{\partial t} \\[6pt] \dfrac{\partial z}{\partial r} & \dfrac{\partial z}{\partial s} & \dfrac{\partial z}{\partial t} \end{vmatrix}. \tag{4.20}
\]
The volume element dV = dzdydx is replaced by |J| dr ds dt, so $\iiint f(x,y,z)\,dx\,dy\,dz = \iiint f(r,s,t)\,|J|\,dr\,ds\,dt$.

 Example 4.36 (Cylindrical Coordinates)
\[
J\!\left(\frac{x,y,z}{r,\theta,z}\right) := \begin{vmatrix} \dfrac{\partial x}{\partial r} & \dfrac{\partial x}{\partial\theta} & \dfrac{\partial x}{\partial z} \\[6pt] \dfrac{\partial y}{\partial r} & \dfrac{\partial y}{\partial\theta} & \dfrac{\partial y}{\partial z} \\[6pt] \dfrac{\partial z}{\partial r} & \dfrac{\partial z}{\partial\theta} & \dfrac{\partial z}{\partial z} \end{vmatrix} = \begin{vmatrix} \cos(\theta) & -r\sin(\theta) & 0 \\ \sin(\theta) & r\cos(\theta) & 0 \\ 0 & 0 & 1 \end{vmatrix} = r\cos^2(\theta)+r\sin^2(\theta) = r.
\]

 Example 4.37 (Spherical Coordinates)
\[
J\!\left(\frac{x,y,z}{\rho,\theta,\phi}\right) := \begin{vmatrix} \dfrac{\partial x}{\partial\rho} & \dfrac{\partial x}{\partial\theta} & \dfrac{\partial x}{\partial\phi} \\[6pt] \dfrac{\partial y}{\partial\rho} & \dfrac{\partial y}{\partial\theta} & \dfrac{\partial y}{\partial\phi} \\[6pt] \dfrac{\partial z}{\partial\rho} & \dfrac{\partial z}{\partial\theta} & \dfrac{\partial z}{\partial\phi} \end{vmatrix} = \begin{vmatrix} \sin\theta\cos\phi & \rho\cos\theta\cos\phi & -\rho\sin\theta\sin\phi \\ \sin\theta\sin\phi & \rho\cos\theta\sin\phi & \rho\sin\theta\cos\phi \\ \cos\theta & -\rho\sin\theta & 0 \end{vmatrix}
\]
\[
= \cos\theta\left(\rho^2\cos^2\phi\,\cos\theta\sin\theta+\rho^2\sin^2\phi\,\cos\theta\sin\theta\right) + \rho\sin\theta\left(\rho\sin^2\theta\cos^2\phi+\rho\sin^2\theta\sin^2\phi\right)
\]
\[
= \cos\theta\left(\rho^2\cos\theta\sin\theta\right)+\rho\sin\theta\left(\rho\sin^2\theta\right) = \rho^2\sin\theta.
\]
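The determinant above can be checked numerically by differentiating the spherical map with central finite differences at a sample point (a verification sketch, not from the original notes; the point and step size are arbitrary choices):

```python
import math

# Check det(Jacobian) = rho^2 sin(theta) for the spherical map
# x = rho sin(theta) cos(phi), y = rho sin(theta) sin(phi), z = rho cos(theta).
rho, th, ph, h = 1.3, 0.7, 0.4, 1e-6

def F(r, t, p):
    return (r * math.sin(t) * math.cos(p),
            r * math.sin(t) * math.sin(p),
            r * math.cos(t))

def column(d):
    # central-difference derivative of F along coordinate direction d
    a = F(rho + h * d[0], th + h * d[1], ph + h * d[2])
    b = F(rho - h * d[0], th - h * d[1], ph - h * d[2])
    return [(ai - bi) / (2.0 * h) for ai, bi in zip(a, b)]

c1, c2, c3 = column((1, 0, 0)), column((0, 1, 0)), column((0, 0, 1))
cross = (c2[1] * c3[2] - c2[2] * c3[1],
         c2[2] * c3[0] - c2[0] * c3[2],
         c2[0] * c3[1] - c2[1] * c3[0])
det = sum(a * b for a, b in zip(c1, cross))   # det with columns c1, c2, c3
print(det, rho ** 2 * math.sin(th))
```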


R Express the velocity of a particle in spherical coordinates:
\[
v^2 = \left(\frac{ds}{dt}\right)^2 = \left(\frac{d\rho}{dt}\right)^2 + \rho^2\left(\frac{d\theta}{dt}\right)^2 + \rho^2\sin^2(\theta)\left(\frac{d\phi}{dt}\right)^2.
\]

4.8 Surface Integrals


Question: What if we want to compute the surface area of an arbitrary object (even if it is not a
surface of revolution)?
Let S be a surface defined by z = f(x, y) where f has continuous partial derivatives. We take a small area element defined by the vectors a, b, each starting at a point Pij and lying along the sides of a parallelogram with area ΔAij. Then
\[
\Delta A_{ij} = |\mathbf{a}\times\mathbf{b}|, \qquad \mathbf{a} = \Delta x\,\hat{i}+f_x(x_i,y_j)\,\Delta x\,\hat{k}, \qquad \mathbf{b} = \Delta y\,\hat{j}+f_y(x_i,y_j)\,\Delta y\,\hat{k},
\]
where the partial derivatives are the slopes of the tangent lines through the point Pij. Thus,
\[
\mathbf{a}\times\mathbf{b} = \begin{vmatrix} \hat{i} & \hat{j} & \hat{k} \\ \Delta x & 0 & f_x\,\Delta x \\ 0 & \Delta y & f_y\,\Delta y \end{vmatrix} = -f_x\,\Delta x\,\Delta y\,\hat{i}-f_y\,\Delta x\,\Delta y\,\hat{j}+\Delta x\,\Delta y\,\hat{k}. \tag{4.21}
\]
Therefore the area is
\[
\Delta A = |\mathbf{a}\times\mathbf{b}| = \sqrt{[f_x]^2+[f_y]^2+1}\,\Delta x\,\Delta y. \tag{4.22}
\]

Definition 4.8.1 (Area of Surface) The area of the surface with equation z = f(x, y) for (x, y) ∈ D, where fx, fy are continuous, is
\[
A = \iint_D \sqrt{[f_x(x,y)]^2+[f_y(x,y)]^2+1}\,dA = \iint_D \sqrt{1+\left(\frac{\partial z}{\partial x}\right)^2+\left(\frac{\partial z}{\partial y}\right)^2}\,dA. \tag{4.23}
\]

R The formula for the area of a surface is the 3D analogue of the formula for arc length, $s = \int_a^b \sqrt{1+\left(\frac{dy}{dx}\right)^2}\,dx$.

 Example 4.38 Find the surface area of the part of the surface z = x² + 2y that lies above the triangular region in the xy-plane with vertices (0, 0), (1, 0), (1, 1).

Solution: Find the necessary partial derivatives and use the formula for the area:
\[
A = \iint_D \sqrt{f_x^2+f_y^2+1}\,dA = \int_0^1\!\int_0^x \sqrt{(2x)^2+(2)^2+1}\,dy\,dx = \int_0^1\!\int_0^x \sqrt{4x^2+5}\,dy\,dx
\]
\[
= \int_0^1 \left[\sqrt{4x^2+5}\,y\right]_0^x dx = \int_0^1 x\sqrt{4x^2+5}\,dx = \int_5^9 \frac{1}{8}u^{1/2}\,du \quad (u = 4x^2+5,\ du = 8x\,dx)
\]
\[
= \left[\frac{1}{12}u^{3/2}\right]_5^9 = \frac{27}{12}-\frac{5\sqrt{5}}{12} = \frac{1}{12}\left(27-5\sqrt{5}\right).
\]
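The single integral left at the end of Example 4.38 is easy to confirm numerically (a sketch, not from the original notes):

```python
import math

# Midpoint rule for int_0^1 x * sqrt(4x^2 + 5) dx = (27 - 5*sqrt(5)) / 12.
n = 100000
total = 0.0
for i in range(n):
    x = (i + 0.5) / n
    total += x * math.sqrt(4.0 * x * x + 5.0) / n

exact = (27.0 - 5.0 * math.sqrt(5.0)) / 12.0
print(total, exact)
```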
12 5 12 12 12


 Example 4.39 Find the area of the part of the paraboloid z = x² + y² from z = 0 to z = 9 (this surface is above the disk D with center (0, 0) and radius 3).

Solution: Find the necessary partial derivatives and use the formula for the area:
\[
A = \iint_D \sqrt{f_x^2+f_y^2+1}\,dA = \iint_D \sqrt{(2x)^2+(2y)^2+1}\,dA = \iint_D \sqrt{4x^2+4y^2+1}\,dA
\]
\[
= \int_0^{2\pi}\!\int_0^3 \sqrt{4r^2+1}\,r\,dr\,d\theta = \int_0^{2\pi} d\theta\int_0^3 \frac{1}{8}(4r^2+1)^{1/2}\,8r\,dr = 2\pi\left[\frac{1}{12}(4r^2+1)^{3/2}\right]_0^3 = 2\pi\left(\frac{1}{12}(37)^{3/2}-\frac{1}{12}\right) = \frac{\pi}{6}\left(37\sqrt{37}-1\right).
\]

 Example 4.40 Find the area of the part of the plane z = 2 + 3x + 4y that lies above the rectangle [0, 5] × [1, 4].

Solution: Find the necessary partial derivatives and use the formula for the area:
\[
A = \iint_D \sqrt{f_x^2+f_y^2+1}\,dA = \iint_D \sqrt{3^2+4^2+1}\,dA = \int_1^4\!\int_0^5 \sqrt{26}\,dx\,dy = \int_1^4 \left[\sqrt{26}\,x\right]_0^5 dy = \int_1^4 5\sqrt{26}\,dy = 15\sqrt{26}.
\]

 Example 4.41 Find the area of the part of the surface $z = \frac{2}{3}\left(x^{3/2}+y^{3/2}\right)$ that lies above the rectangle [0, 1] × [0, 1].

Solution: Find the necessary partial derivatives and use the formula for the area:
\[
A = \iint_D \sqrt{f_x^2+f_y^2+1}\,dA = \iint_D \sqrt{(x^{1/2})^2+(y^{1/2})^2+1}\,dA = \int_0^1\!\int_0^1 \sqrt{x+y+1}\,dy\,dx
\]
\[
= \int_0^1 \left[\frac{2}{3}(x+y+1)^{3/2}\right]_0^1 dx = \int_0^1 \frac{2}{3}(x+2)^{3/2}-\frac{2}{3}(x+1)^{3/2}\,dx
\]
\[
= \left[\frac{4}{15}(x+2)^{5/2}-\frac{4}{15}(x+1)^{5/2}\right]_0^1 = \frac{4}{15}\left(3^{5/2}-2^{5/2}-2^{5/2}+1\right) = \frac{4}{15}\left(9\sqrt{3}-8\sqrt{2}+1\right).
\]


 Example 4.42 Find the area of the part of the paraboloid z = 4 − x² − y² above the xy-plane.

Solution: Find the necessary partial derivatives and use the formula for the area:
\[
A = \iint_D \sqrt{f_x^2+f_y^2+1}\,dA = \iint_D \sqrt{(-2x)^2+(-2y)^2+1}\,dA = \iint_D \sqrt{4x^2+4y^2+1}\,dA
\]
\[
= \int_0^{2\pi}\!\int_0^2 \sqrt{4r^2+1}\,r\,dr\,d\theta = \int_0^{2\pi}\!\int_0^2 \frac{1}{8}(4r^2+1)^{1/2}\,8r\,dr\,d\theta = \int_0^{2\pi} \left[\frac{1}{12}(4r^2+1)^{3/2}\right]_0^2 d\theta
\]
\[
= \int_0^{2\pi} \frac{1}{12}(17^{3/2}-1)\,d\theta = \frac{\pi}{6}\left(17\sqrt{17}-1\right).
\]

5. Vector Analysis

5.1 Applications of Vector Multiplication


Before returning to applications of vectors to physical systems we first briefly summarize everything
we know about vectors from Chapter 3.
A vector x = (x1 , x2 , ..., xn ) is an ordered sequence of real numbers. The number xn is the nth
component of the vector x. The collection of all vectors is called a vector space or linear space.
For concreteness consider two three dimensional vectors x = (x1 , x2 , x3 ) and y = (y1 , y2 , y3 ). Every
vector space has the following properties:

i) Vector Equality: If x = y, then xi = yi for all i.

ii) Vector Addition: x + y = z = y + x where zi = xi + yi (Commutative).

iii) Scalar Multiplication: αx = (αx1 , αx2 , αx3 ).

iv) Zero Vector: 0 = (0, 0, 0).

Figure 5.1: Illustration of vector addition with forces.



v) Associative Addition: (x + y) + z = x + (y + z).


vi) Distributive Law for Scalar Mult.: α(x + y) = αx + αy.
vii) Associative Scalar Multiplication: (αβ )x = α(β x).

5.1.1 Dot and Cross Products


Also, we recall the definitions of the scalar (dot) product and the vector (cross) product:

• Scalar Product: $\mathbf{x}\cdot\mathbf{y} = |\mathbf{x}||\mathbf{y}|\cos(\theta) = x_1y_1+x_2y_2+x_3y_3$
• Cross Product: $\mathbf{x}\times\mathbf{y} = \begin{vmatrix} \hat{i} & \hat{j} & \hat{k} \\ x_1 & x_2 & x_3 \\ y_1 & y_2 & y_3 \end{vmatrix} = \hat{i}(x_2y_3-x_3y_2)+\hat{j}(x_3y_1-x_1y_3)+\hat{k}(x_1y_2-x_2y_1)$, and $|\mathbf{x}\times\mathbf{y}| = |\mathbf{x}||\mathbf{y}|\sin(\theta)$.
We now consider various applications of these vector quantities.
Work
A force does work if, when acting on a body, there is a displacement of the point of application
in the direction of the force. In general, work is force times displacement. If the force is applied parallel to the displacement then the work is the magnitude of the force times the distance traveled, but what happens if the force and displacement are not parallel? In physics, we know the component of the force perpendicular to the displacement does no work. So
\[
W = |\mathbf{F}||\mathbf{d}|\cos(\theta) = \mathbf{F}\cdot\mathbf{d}.
\]
If we want to study a dynamic problem we may need the work done by an applied force over an infinitesimally small distance, $dW = \mathbf{F}\cdot d\mathbf{r}$, resulting in a total work $W = \int_a^b \mathbf{F}\cdot d\mathbf{r}$.
Torque
The torque or moment is the tendency of a force to rotate an object about an axis. Think about a lever
balanced on a fulcrum at the origin. Here the torque is the force times the distance. Analogously in
vector quantities, the lever arm is the perpendicular distance from the origin to the point the force F
is applied. Since we need something perpendicular the torque is
τ := r × F, |τ| = |F||r| sin(θ ).
The torque will act in the direction perpendicular to r and F as indicated by the righthand rule.
Angular Velocity
The angular velocity, ω, is defined as the rate of change of angular displacement. This is a vector
quantity acting in the direction along the axis of rotation (by the righthand rule). Consider a point P
in a rigid body rotating with angular velocity ω. The linear translational velocity v of the point P is
v = ω × r.
Shortest Distance
What is the shortest distance of a rocket traveling at a constant velocity v = (1, 2, 3) from an
observer at x0 = (2, 1, 3)? The rocket is launched at time t = 0 at the point x1 = (1, 1, 1).

Solution: The rocket (ignoring gravity and air resistance) will follow a straight line
\[
\mathbf{x}(t) = \mathbf{x}_1+\mathbf{v}t = \begin{cases} x(t) = 1+t \\ y(t) = 1+2t \\ z(t) = 1+3t. \end{cases} \tag{5.1}
\]


We now want to minimize the distance d = |x − x₀| from the observer at x₀ = (2, 1, 3) to the current position of the rocket at time t, x(t). Equivalently we can minimize the square of the distance, (x − x₀)². To minimize we differentiate the equation of motion with respect to t:
\[
\frac{d}{dt}(\mathbf{x}-\mathbf{x}_0)^2 = 2(\mathbf{x}-\mathbf{x}_0)\cdot\dot{\mathbf{x}} = 2[\mathbf{x}_1-\mathbf{x}_0+t\mathbf{v}]\cdot\mathbf{v} = 0, \qquad \mathbf{v} = \dot{\mathbf{x}}. \tag{5.2}
\]
Since ẋ = v is the tangent vector of the line, geometrically we say that the shortest distance vector through a point x₀ is perpendicular to the line. Now solving for the time t when the rocket is closest we find
\[
t = -\frac{(\mathbf{x}_1-\mathbf{x}_0)\cdot\mathbf{v}}{v^2} = \frac{1}{2}.
\]
Now substituting t back into (5.1) yields x(1/2) = (3/2, 2, 5/2) as the point where the rocket is closest. So the shortest distance is $d = |\mathbf{x}_0-(3/2,2,5/2)| = |(1/2,-1,1/2)| = \sqrt{3/2}$.
Law of Cosines
Let vector C = A + B and take a dot product with itself:
\[
C^2 = \mathbf{C}\cdot\mathbf{C} = (\mathbf{A}+\mathbf{B})\cdot(\mathbf{A}+\mathbf{B}) = \mathbf{A}\cdot\mathbf{A}+\mathbf{B}\cdot\mathbf{B}+2\mathbf{A}\cdot\mathbf{B} = A^2+B^2+2|\mathbf{A}||\mathbf{B}|\cos(\theta). \tag{5.3}
\]
This is exactly the Law of Cosines!

5.2 Triple Products


In the previous sections we reviewed the definitions of the scalar (dot) and vector (cross) products.
Now we use these operations in combination to define two useful quantities:
i) Triple Scalar Product, A · (B ×C)
ii) Triple Vector Product, A × (B ×C)
The names come from the resulting quantity (scalar, vector respectively).

R Observe that the other possible quantities do not make sense: 1. (A · B) ×C (number × vector)
and 2. (A · B) ·C (number · vector).

5.2.1 Triple Scalar Product


First, we give a geometric interpretation of the triple scalar product A · (B × C) as the volume of a parallelepiped with sides A, B, C. The area of the base is |B × C| = |B||C| sin θ and the height is |A| cos φ. Thus, the volume is
\[
V = |\mathbf{A}||\mathbf{B}||\mathbf{C}|\sin\theta\cos\phi = |\mathbf{B}\times\mathbf{C}||\mathbf{A}|\cos\phi = \mathbf{A}\cdot(\mathbf{B}\times\mathbf{C}). \tag{5.4}
\]
Observe that if two sides are parallel, so that B × C = 0, then both the Triple Scalar Product and the volume of the parallelepiped would be zero.
Observe that the quantities can be rearranged:
A · (B ×C) = Ax (ByCz − BzCy ) + Ay (BzCx − BxCz ) + Az (BxCy − ByCx )
= Bx (Cy Az −Cz Ay ) + By (Cz Ax −Cx Az ) + Bz (Cx Ay −Cy Ax ) = B · (C × A)
= Cx (Ay Bz − Az By ) +Cy (Az Bx − Ax Bz ) +Cz (Ax By − Ay Bx ) = C · (A × B)

Since the cross product changes sign when the order is reversed, we can only rotate the three
quantities A, B,C together clockwise or counter clockwise.

Figure 5.2: Image from Weber et al. Essential Math Methods for Physicists

 Example 5.1 A parallelepiped has sides A = î + 2ĵ − k̂, B = ĵ + k̂, C = î − ĵ. Find the volume.

Solution: Using the Triple Scalar Product:
\[
\mathbf{A}\cdot(\mathbf{B}\times\mathbf{C}) = \mathbf{A}\cdot\begin{vmatrix} \hat{i} & \hat{j} & \hat{k} \\ 0 & 1 & 1 \\ 1 & -1 & 0 \end{vmatrix} = \mathbf{A}\cdot\big[\hat{i}(0-(-1))+\hat{j}(1-0)+\hat{k}(0-1)\big] = (1,2,-1)\cdot(1,1,-1) = 1(1)+2(1)-1(-1) = 4.
\]

There is an alternate definition of the Triple Scalar Product one can use involving determinants (cf. Linear Algebra Ch. 3):
\[
\mathbf{A}\cdot(\mathbf{B}\times\mathbf{C}) = \begin{vmatrix} A_x & A_y & A_z \\ B_x & B_y & B_z \\ C_x & C_y & C_z \end{vmatrix} = A_x(B_yC_z-B_zC_y)+A_y(B_zC_x-B_xC_z)+A_z(B_xC_y-B_yC_x). \tag{5.5}
\]

 Example 5.2 Apply the alternate definition of the triple scalar product to the last example.

Solution: Using Cofactor Expansion (Laplace Development)



              | 1   2  −1 |
A · (B × C) = | 0   1   1 | = 1(−1)^(1+1) | 1   1 | + 0 + 1(−1)^(3+1) | 2  −1 |
              | 1  −1   0 |               |−1   0 |                   | 1   1 |
            = 1(1) + 1(3) = 1 + 3 = 4.

 Example 5.3 Find the volume of a parallelepiped defined by A = (0, 1, 2), B = (1, 2, 3), C =
(−1, −1, −1).

Solution: Compute

              |  0   1   2 |
A · (B × C) = |  1   2   3 | = 0 + 1(−1)^(1+2) |  1   3 | + 2(−1)^(1+3) |  1   2 |
              | −1  −1  −1 |                   | −1  −1 |               | −1  −1 |
            = −(−1 + 3) + 2(−1 + 2) = −2 + 2 = 0.

Wait! There is no volume? It turns out that A − B = C, so the three vectors are linearly dependent. Thus, we do
not have a basis: the three vectors A, B, C lie in the same plane, giving a volume of zero. 
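These three examples can be checked numerically with a short Python sketch (the helper names dot, cross, and triple_scalar are mine, not notation from the text):

```python
# Numerical check of the triple scalar product A . (B x C).
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def triple_scalar(a, b, c):
    """A . (B x C): the signed volume of the parallelepiped with sides A, B, C."""
    return dot(a, cross(b, c))

# Examples 5.1-5.2: A = i + 2j - k, B = j + k, C = i - j
print(triple_scalar((1, 2, -1), (0, 1, 1), (1, -1, 0)))   # 4
# Example 5.3: A - B = C, so the vectors are coplanar and the volume is 0
print(triple_scalar((0, 1, 2), (1, 2, 3), (-1, -1, -1)))  # 0
# A cyclic rotation B . (C x A) leaves the value unchanged
print(triple_scalar((0, 1, 1), (1, -1, 0), (1, 2, -1)))   # 4
```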

Figure 5.3: Image from Weber et al. Essential Math Methods for Physicists

5.2.2 Triple Vector Product


We now consider the Triple Vector product A × (B × C). The first thing to observe is that the
location of the parentheses is important! For example,
0 = A × (B × B) ≠ (A × B) × B.
Begin with the geometric interpretation of A × (B × C). The vector B × C is perpendicular to both B and C,
so the triple product is perpendicular to both A and B × C; being perpendicular to
B × C means it lies in the plane spanned by B and C.
So, A × (B ×C) = αB + βC for constants α, β . Take a dot product of both sides with A to find:
0 = α(B · A) + β (C · A) ⇒ α = γ(C · A), β = −γ(B · A). (5.6)
Choosing γ = 1 gives A × (B × C) = B(A · C) − C(A · B). The so-called BAC-CAB Rule for the
Triple Vector Product.
 Example 5.4 Find A × (B ×C) for A = (1, 2, −1), B = (0, 1, 1), and C = (1, −1, 0).

Solution: First compute B × C, then cross the result with A:

                  | î   ĵ   k̂ |                       | î   ĵ   k̂ |
A × (B × C) = A × | 0   1   1 | = A × (1, 1, −1) = | 1   2  −1 | = (−1, 0, −1).
                  | 1  −1   0 |                       | 1   1  −1 |

Try to change the order of the parentheses:

              | î   ĵ   k̂ |                           | î   ĵ   k̂ |
(A × B) × C = | 1   2  −1 | × C = (3, −1, 1) × C = | 3  −1   1 | = (1, 1, −2),
              | 0   1   1 |                           | 1  −1   0 |

a different vector, confirming that the placement of the parentheses matters. (As a check of the
BAC-CAB rule: B(A · C) − C(A · B) = −B − C = (−1, 0, −1), in agreement with the first result.)
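The BAC-CAB rule and the fact that the parentheses matter can both be checked numerically with this Python sketch (the helper names dot and cross are mine):

```python
# Checking the BAC-CAB rule A x (B x C) = B(A.C) - C(A.B) on Example 5.4.
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

A, B, C = (1, 2, -1), (0, 1, 1), (1, -1, 0)

left = cross(A, cross(B, C))
right = tuple(b * dot(A, C) - c * dot(A, B) for b, c in zip(B, C))
print(left, right)            # both (-1, 0, -1)
print(cross(cross(A, B), C))  # (1, 1, -2): the parentheses matter
```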


5.2.3 Applications of Triple Scalar Products


Consider the torque of a force about an axis τ = r × F where r, F are in a plane perpendicular to
the axis of rotation. The torque at the origin is just r × F

We want to find the torque produced by the force F about any axis (line) L. Let r be the vector
from a point on the line L to the point where F is applied. For simplicity, let the line L lie along k̂ (the z-axis). Then the torque
about the line L is:
τ L := n̂ · (r × F) (5.7)

where n̂ is the unit normal in the direction of L. We can think of this as the projection of the torque
in the direction of L.
Can this be simplified further? Break the vectors r and F into components parallel (∥) and
perpendicular (⊥) to L. Then

r × F = (r∥ + r⊥ ) × (F∥ + F⊥ )
      = r∥ × F∥ + r⊥ × F∥ + r∥ × F⊥ + r⊥ × F⊥
      = 0 + r⊥ × F∥ + r∥ × F⊥ + r⊥ × F⊥ ,

since r∥ × F∥ = 0 (both are parallel to L). Then we compute

n̂ · (r × F) = n̂ · (r⊥ × F∥ ) + n̂ · (r∥ × F⊥ ) + n̂ · (r⊥ × F⊥ )
            = n̂ · (r⊥ × F⊥ ).

The last line follows because any cross product involving r∥ or F∥ (both parallel to n̂) is perpendicular
to n̂, so its dot product with n̂ vanishes. Thus, the torque about the line L is determined entirely by the
components perpendicular to L.
 Example 5.5 If a force F = î + 2ĵ − k̂ acts at the point P = (1, 2, 3), find the torque of F about
the line x = 2î + ĵ + (î + 2ĵ + 3k̂)t.

Solution: First, find the vector torque about a point on the line (observe x0 = (2, 1, 0), v = (1, 2, 3)).
This is τ = r × F where r = P − x0 = (1, 2, 3) − (2, 1, 0) = (−1, 1, 3) and thus the torque is

            |  î   ĵ   k̂ |
τ = r × F = | −1   1   3 | = (−7, 2, −3).
            |  1   2  −1 |

Now we can compute the torque about the line L using the unit vector n̂ = v/|v| = (1, 2, 3)/√(1² + 2² + 3²) = (1/√14)(1, 2, 3):

τL = n̂ · (r × F) = (1/√14)(1, 2, 3) · (−7, 2, −3) = (1/√14)(−7 + 4 − 9) = −12/√14.

Observe that the negative sign indicates that the torque acts in the direction −n̂. 
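Example 5.5 can be reproduced numerically; in this Python sketch the variable names (P, x0, v) mirror the quantities above and the helpers dot and cross are mine:

```python
import math

# Torque of F about a line L: tau_L = n . (r x F), reproducing Example 5.5.
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

P = (1, 2, 3)    # point where the force is applied
x0 = (2, 1, 0)   # a point on the line L
v = (1, 2, 3)    # direction vector of the line L
F = (1, 2, -1)

r = tuple(p - q for p, q in zip(P, x0))         # (-1, 1, 3)
tau = cross(r, F)                               # (-7, 2, -3)
n = tuple(c / math.sqrt(dot(v, v)) for c in v)  # unit vector along L
tau_L = dot(n, tau)
print(tau)               # (-7, 2, -3)
print(round(tau_L, 4))   # -3.2071  (= -12/sqrt(14))
```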

5.2.4 Application of Triple Vector Product


 Example 5.6 Suppose a particle of mass m is at rest on a rotating rigid body. Then the angular

momentum L of the particle about the origin, O, is

L := r × (mv) = mr × v, (5.8)

where the linear velocity v = ω × r for angular velocity ω. Thus, the angular momentum becomes

L = mr × (ω × r) . (5.9)

Centripetal acceleration can also be defined likewise, a = ω × (ω × r). 

5.3 Fields
A field is a physical quantity that has a (possibly different) value at each point in space. Common
examples are the temperature T and the gravitational force of the Earth on a satellite, |F| = Gm₁m₂/r².
In each example, a physical quantity is defined at every point of some region.

Figure 5.4: Images from Stewart Calculus.

i) If the physical quantity is a scalar, then we have a scalar field, e.g., Temperature T .

ii) If the physical quantity is a vector, then we have a vector field, e.g., Electric Field E,
Magnetic Field B, Force F, velocity v.

Definition 5.3.1 Let D be a region in 2D. A vector field is a function F that assigns to each
point (x, y) in D a two-dimensional vector F(x, y) or in 3D F(x, y, z). To draw a vector field,
place an arrow at equally spaced points (x, y) representing the force F(x, y).

R One can always write it in terms of component functions

F(x, y) = P(x, y)î + Q(x, y)ĵ = (P(x, y), Q(x, y)) (5.10)
where P, Q are scalar functions of two variables, scalar fields. In 3D it can be written as

F(x, y, z) = P(x, y, z)î + Q(x, y, z)ĵ + R(x, y, z)k̂ = (P(x, y, z), Q(x, y, z), R(x, y, z)). (5.11)

Below are sample vector fields:


 Example 5.7 A vector field in 2D is defined by F = −yî + xĵ. Describe F by sketching some of
the vectors in the vector field (in class). Each arrow is tangent to a circle with its center at the origin
(Check: F · x = 0 implying that the vector field F is perpendicular to the location x). 

 Example 5.8 Sketch the vector field in 3D given by F(x, y, z) = zk̂. 

 Example 5.9 Fluid flows along a pipe. Let V(x, y, z) be the velocity vector at a point. Then the

velocity field V assigns a vector to each point in a certain domain (interior of the pipe). Observe
that the velocity spreads out and has a smaller magnitude when the pipe diameter is larger. 

Figure 5.5: Images from Stewart Calculus.

 Example 5.10 Recall Newton’s Law of Gravitation, |F| = GmM/r². Assume a mass M is at the
origin and a little mass m is at location (x, y, z). The gravitational field acts in the direction of the
unit vector −x/|x| with a force

F = −(GmM/|x|³) x = −(GmM/(x² + y² + z²)^(3/2)) (x, y, z).   (5.12)


Figure 5.6: Image from Stewart Calculus.

 Example 5.11 Assume there is an electric charge, Q, at the origin. By Coulomb’s Law, the
electric force exerted by this charge on a charge q located at (x, y, z) is

F(x, y, z) = (εqQ/|x|³) x   (5.13)

where ε is a constant. For like charges qQ > 0 (repulsive) and for unlike charges qQ < 0 (attractive).


5.4 Differentiation of Vectors


Consider the vector

v = vx î + vy ĵ + vz k̂ = (vx , vy , vz ) (5.14)

where each component is a function of time, t. We denote the derivative in time


 
dv/dt := (dvx /dt, dvy /dt, dvz /dt).   (5.15)

The derivative of the vector v is the vector whose components are the derivatives of the components
of v.
 Example 5.12 Let (x, y, z) be the coordinates of a particle at time t.

Displacement   r = (x, y, z)
Velocity       v = dr/dt = (dx/dt, dy/dt, dz/dt)
Acceleration   a = d²r/dt² = dv/dt = (d²x/dt², d²y/dt², d²z/dt²).


One can also show the following relations for vectors u = (ux , uy , uz ) and v by working with components:
i) d/dt (au) = (da/dt) u + a (du/dt)
ii) d/dt (u · v) = (du/dt) · v + u · (dv/dt)
iii) d/dt (u × v) = (du/dt) × v + u × (dv/dt)

 Example 5.13 The position vector of a particle is r = (4 + 3t)î + t 3 ĵ − 5t k̂


1. At what time does it pass through the point (1, −1, 5)?
2. Find the velocity at this time.
3. Find the equation of the line tangent to its path and plane normal to its path at (1, −1, 5).

Solutions: 1. To find the time the particle passes through the point (1, −1, 5) we set each of
these values equal to the corresponding component of r(t) and solve for t. Thus,

4 + 3t = 1
t 3 = −1
−5t = 5.

All the equations are satisfied when t = −1.

2. The velocity is v = dr/dt = (3, 3t², −5). At time t = −1, the velocity is (3, 3, −5).

3. The line tangent has equation x = x0 + vt where x0 = (1, −1, 5) and v = (3, 3, −5). The

plane normal has equation ax + by + cz + d = 0, where (a, b, c) is the normal to the plane (parallel
to the velocity v). Thus,

3x + 3y − 5z + d = 0 ⇒ Point on plane is (x, y, z) = (1, −1, 5) ⇒ d = 25. (5.16)

 Example 5.14 The position of a particle is r(t) = (cos(t), sin(t),t). Show that the speed |v| and
the acceleration |a| are constant.

Solution: Compute the velocity and acceleration vectors:

v = dr/dt = (− sin(t), cos(t), 1)  ⇒  |v| = √((− sin(t))² + (cos(t))² + 1) = √2
a = d²r/dt² = (− cos(t), − sin(t), 0)  ⇒  |a| = √((− cos(t))² + (− sin(t))² + 0) = 1.
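This example can also be checked by finite differences; in this Python sketch the helpers deriv and norm are mine, and the step size h is an arbitrary choice:

```python
import math

# Finite-difference check that |v| and |a| are constant for r(t) = (cos t, sin t, t).
def r(t):
    return (math.cos(t), math.sin(t), t)

def deriv(f, t, h=1e-5):
    # central-difference derivative of a vector-valued function
    return tuple((a - b) / (2 * h) for a, b in zip(f(t + h), f(t - h)))

def norm(v):
    return math.sqrt(sum(x * x for x in v))

for t in (0.0, 1.0, 2.5):
    v = deriv(r, t)
    a = deriv(lambda s: deriv(r, s), t)
    print(round(norm(v), 4), round(norm(a), 3))  # 1.4142 1.0 at every t
```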


5.4.1 Differentiation in Polar Coordinates


Observe that the unit vectors in polar coordinates are:

er = (cos(θ ), sin(θ )),   eθ = (− sin(θ ), cos(θ )).   (5.17)

Thus, their corresponding time derivatives are:

der /dt = (− sin(θ ) dθ /dt, cos(θ ) dθ /dt) = (dθ /dt) eθ
deθ /dt = (− cos(θ ) dθ /dt, − sin(θ ) dθ /dt) = −(dθ /dt) er

 Example 5.15 Express a vector u in polar coordinates, u = ur er + uθ eθ , and find du/dt.

Solution: Take the time derivative and use the relationships above:

du/dt = (dur /dt) er + ur (der /dt) + (duθ /dt) eθ + uθ (deθ /dt)
      = (dur /dt − uθ dθ /dt) er + (ur dθ /dt + duθ /dt) eθ .   (5.18)

 Example 5.16 Let r = rer . Find the velocity and acceleration.

Solution: Use the relations above to perform the computations:

v = dr/dt = (dr/dt) er + r (dθ /dt) eθ
a = d²r/dt² = [d²r/dt² − r (dθ /dt)²] er + [r (d²θ /dt²) + 2 (dr/dt)(dθ /dt)] eθ .



5.5 Directional Derivative and Gradient


As discussed previously, temperature T (x, y, z) is a typical example of a scalar field. Suppose one
turns on a Jacuzzi with a heater in the center. We want to know how the temperature changes as we
move through the water. This depends greatly on the direction one moves. If we move toward the
center it gets hotter and the temperature increases; however, if we move away from the center the
temperature will decrease.
This leads to two natural questions:
i) What is the temperature change in the specific direction one is heading, dT /ds?

ii) Which direction produces the largest/smallest temperature change?

Since heat flows from hot to cold, the heat would follow the direction of the maximal rate of
decrease.

Problem: Consider a scalar function φ (x, y, z) (e.g., temperature). We want to find its derivative
in a given direction s, dφ /ds, at a given point (x0 , y0 , z0 ).

Suppose u = (a, b, c) is a unit vector in the s direction. Move a distance s in the direction of
u: (x, y, z) = (x0 , y0 , z0 ) + s(a, b, c) = x0 + su. Along this line, one can think of x, y, z as functions
of only a single variable s. Thus, using the chain rule we find

dφ /ds = (∂ φ /∂ x)(dx/ds) + (∂ φ /∂ y)(dy/ds) + (∂ φ /∂ z)(dz/ds)
       = (∂ φ /∂ x) a + (∂ φ /∂ y) b + (∂ φ /∂ z) c
       = (∂ φ /∂ x, ∂ φ /∂ y, ∂ φ /∂ z) · (a, b, c)
       = ∇φ · u

 
Definition 5.5.1 The vector ∇φ = (∂ φ /∂ x, ∂ φ /∂ y, ∂ φ /∂ z) is called the gradient of φ and may also be
denoted grad(φ ):

∇φ := (∂ φ /∂ x) î + (∂ φ /∂ y) ĵ + (∂ φ /∂ z) k̂   (5.19)

Definition 5.5.2 The directional derivative in the direction of a unit vector u is

dφ /ds = ∇φ · u   (5.20)

R If ∇φ is in the direction of u, then the directional derivative is maximized/minimized: writing ∇φ = au,

dφ /ds = ∇φ · u = au · u = a|u|² = a.

If the gradient is not in this direction (instead in the direction of a unit vector v), then

dφ /ds = ∇φ · u = av · u = a|v||u| cos(θ ) = a cos(θ ) < a.

 Example 5.17 Find the directional derivative of φ = xy2 + 3yz at (1, 0, 2) in the direction
v = (1, 2, 2).

Solution:
Step 1: Obtain the unit vector u in the direction of v

u = v/|v| = (1/√(1² + 2² + 2²)) (1, 2, 2) = (1/3)(1, 2, 2)

Step 2: Compute the gradient


 
∇φ = (∂ φ /∂ x, ∂ φ /∂ y, ∂ φ /∂ z) = (y², 2xy + 3z, 3y)

and evaluate at the point (1, 0, 2): ∇φ |(1,0,2) = (0, 6, 0).

Step 3: Compute the directional derivative

dφ /ds = ∇φ · u = (0, 6, 0) · (1/3)(1, 2, 2) = 0 + 6(2/3) + 0 = 4.   (5.21)
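Example 5.17 can be verified without computing any partial derivatives by hand, using a central-difference gradient (the helper names phi and grad are mine):

```python
import math

# Numerical check of Example 5.17: grad(phi) at (1,0,2) dotted with the
# unit vector along (1,2,2) should give 4.
def phi(x, y, z):
    return x * y**2 + 3 * y * z

def grad(f, p, h=1e-6):
    # central-difference gradient
    g = []
    for i in range(3):
        hi = [c + h if j == i else c for j, c in enumerate(p)]
        lo = [c - h if j == i else c for j, c in enumerate(p)]
        g.append((f(*hi) - f(*lo)) / (2 * h))
    return g

p = (1.0, 0.0, 2.0)
v = (1.0, 2.0, 2.0)
u = [c / math.sqrt(sum(x * x for x in v)) for c in v]
dphi_ds = sum(gi * ui for gi, ui in zip(grad(phi, p), u))
print(round(dphi_ds, 6))  # 4.0
```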


The geometric or physical interpretation of the directional derivative requires knowledge of


the dot product. The directional derivative dφ ds = ∇φ · u = |∇φ ||u| cos(θ ). Thus, the directional
derivative is the projection of the gradient ∇φ onto the line in the direction of u. Obviously, the
projection is largest if ∇φ in the direction of u, just like the gradient is largest if it is in the direction
of u.
 Example 5.18 Suppose the temperature T (x, y, z) is given by T = z3 − x3 + xyz + 10. In which
direction is the temperature change the greatest at (−1, 1, 2) and at what rate?

Solution: First, the greatest temperature change is in the direction of the gradient

∇T = (−3x² + yz, xz, 3z² + xy)|(−1,1,2) = (−1, −2, 11).

The rate of increase/decrease is

|∇T | = √((−1)² + (−2)² + 11²) = √(1 + 4 + 121) = √126 = 3√14.

Suppose u is tangent to the surface φ = const at the point P = (x0 , y0 , z0 ). Consider ∆φ /∆s along
chords PA, PB, PC approaching the tangent direction u. Since φ = const and P, A, B,C are on the surface, ∆φ = 0. Thus,

∆φ /∆s = 0  →(∆s → 0)  dφ /ds = 0  ⇒  ∇φ · u = 0  ⇒  ∇φ ⊥ u.   (5.22)

Definition 5.5.3 The vector ∇φ is normal to the surface φ = const.

Since |∇φ | is the value of the directional derivative normal to the surface, the normal
derivative is dφ /dn = |∇φ |. In temperature problems, the direction of largest change in temperature is
normal to the isothermal lines (lines of constant temperature).

 Example 5.19 Given the surface xyz2 = 4, find the equation of the tangent plane and normal line
at the point (2, 2, −1).

Solution: The level surface is w = xyz², so the normal direction is the direction of the gradient

∇w = (yz², xz², 2xyz)|(2,2,−1) = (2, 2, −8).

The tangent plane has the following equation (since the gradient is normal to the surface):

(∂ w/∂ x) x + (∂ w/∂ y) y + (∂ w/∂ z) z + d = 0  ⇒  2x + 2y − 8z + d = 0  ⇒  x + y − 4z = 8,

where d is found by plugging in the point (x, y, z) = (2, 2, −1). The normal line has equation

(x − 2)/2 = (y − 2)/2 = (z + 1)/(−8),   or   x = 2 + 2t,  y = 2 + 2t,  z = −1 − 8t.

5.5.1 Gradients in Other Coordinate Systems


Cylindrical (Polar if z = 0):   ∇ f = (∂ f /∂ r) er + (1/r)(∂ f /∂ θ ) eθ + (∂ f /∂ z) ez
Spherical:                      ∇ f = (∂ f /∂ r) er + (1/r)(∂ f /∂ θ ) eθ + (1/(r sin θ ))(∂ f /∂ φ ) eφ .

5.5.2 Physical Significance


The gradient of a scalar, ∇φ , is extremely important in physics and engineering. Given a potential
energy U, the associated force is F = −∇U (e.g., gravity, electrostatics, etc.). If a force can be
described by a single scalar function U, we call U the potential.

R Since F = −∇U, we can find U by integrating the force along a path. Along dr the change in
potential is dU = ∇U · dr = −F · dr, so

∫_{r1}^{r2} dU = −∫_{r1}^{r2} F · dr = U(r2 ) −U(r1 ).

Definition 5.5.4 Forces that behave in this way are called conservative. (More on this in the
coming sections.)
 Example 5.20 Find the gradient for a function of position r (Central Forces), r = √(x² + y² + z²).

Solution: First, ∂ r/∂ x = x/√(x² + y² + z²) = x/r, and similarly for y and z. Then

∇ f (r) = (∂ f /∂ x) î + (∂ f /∂ y) ĵ + (∂ f /∂ z) k̂
        = (∂ f /∂ r)(∂ r/∂ x) î + (∂ f /∂ r)(∂ r/∂ y) ĵ + (∂ f /∂ r)(∂ r/∂ z) k̂
        = (∂ f /∂ r) [ (x/r) î + (y/r) ĵ + (z/r) k̂ ] = (r/|r|) (∂ f /∂ r),

which is the unit vector in the radial direction multiplied by the directional derivative in
the radial direction. 

If a vector function depends on space (x, y, z) and time t, then from the total differential we see
∂F ∂F ∂F ∂F ∂F
dF = dx + dy + dz + dt = (dr · ∇)F + dt. (5.23)
∂x ∂y ∂z ∂t ∂t
Divide the result by dt to get the so-called material derivative
 
dF/dt = (dr/dt · ∇) F + ∂ F/∂t.   (5.24)
This expression comes up a lot in physics, but most famously in the Navier-Stokes equations for
fluid flow
du/dt = ∂ u/∂t + (v · ∇)u   (5.25)
where u is the fluid velocity and v = dr/dt is the velocity of an individual fluid particle.

5.6 Some Other Expressions Involving ∇


In this section we take a deeper look at the gradient operator and where else it can appear.
Definition 5.6.1 The symbol ∇ is used to represent a vector operator and is defined as

∇ = î ∂ /∂ x + ĵ ∂ /∂ y + k̂ ∂ /∂ z.   (5.26)
This can be compared and contrasted with the scalar differential operator d/dx.

So far we have considered the gradient of a scalar function, ∇φ . Here an operation is performed
on the scalar φ resulting in a vector. Can the gradient operator, ∇, be applied to a vector?

Given a vector function V(x, y, z) = (Vx ,Vy ,Vz ). Question: How can we now compute the vector
operator ∇ applied to the vector V in different ways?

5.6.1 Divergence, ∇ · V
Definition 5.6.2 The divergence of a vector is defined as the scalar quantity

∇ · V = ∂Vx /∂ x + ∂Vy /∂ y + ∂Vz /∂ z.   (5.27)

This represents the outward flow: if the divergence is positive there is a net outward flow; if
negative, there is a net inward flow. In the special case where the divergence is zero, the medium is
referred to as incompressible or solenoidal.

Special Case: The divergence of a central force field. Let r be the position vector and φ a
scalar function. Then

∇ · (rφ ) = ∂ /∂ x [xφ ] + ∂ /∂ y [yφ ] + ∂ /∂ z [zφ ]
(Product Rule) = φ + x ∂ φ /∂ x + φ + y ∂ φ /∂ y + φ + z ∂ φ /∂ z
= 3φ + r · (∇φ )
Definition 5.6.3 In general, we have the following “Product Rule for Divergence”:

∇ · (φ v) = ∇φ · v + φ (∇ · v).   (5.28)

5.6.2 Physical Interpretation


Consider the quantity ∇ · (ρv), where v is the fluid velocity and ρ is the density of the fluid. Consider
a small volume dxdydz as a rectangular prism ABCDEFGH. The total flow per unit time in through the face
EFGH (at x = 0) is

(ρvx )|x=0 dydz;

only the normal component ρvx contributes here, while ρvy , ρvz are tangential to this face and
contribute nothing. The flow out of the opposite face ABCD (at x = dx) is

(ρvx )|x=dx dydz = [ρvx + (∂ /∂ x)(ρvx ) dx]|x=0 dydz.

Thus, the net rate of flow in the x-direction is the flow in minus the flow out:

(ρvx )|x=0 dydz − [ρvx + (∂ /∂ x)(ρvx ) dx]|x=0 dydz = −(∂ /∂ x)(ρvx ) dxdydz.

Similarly, we can find the net rates of flow in the y and z directions:

y-direction: −(∂ /∂ y)(ρvy ) dxdydz
z-direction: −(∂ /∂ z)(ρvz ) dxdydz.

Therefore, the net flow per unit time is:

−[(∂ /∂ x)(ρvx ) + (∂ /∂ y)(ρvy ) + (∂ /∂ z)(ρvz )] dxdydz = −∇ · (ρv) dxdydz.

A direct application of this idea results in the continuity equation

∂ ρ/∂t + ∇ · (ρv) = 0.   (5.29)
This equation says that the net flow out of a volume results in a decreased density inside that
volume.
 Example 5.21 If F = (xz, xyz, −y2 ). Find ∇ · F = div(F).

Solution: Compute

∂ ∂ ∂
div(F) = (xz) + (xyz) + (−y2 )
∂x ∂y ∂z
= z + xz + 0 = z + xz.

5.6.3 Curl ∇ × V
Definition 5.6.4 The curl of a vector field is defined as


∇ × V = curl(V) = î (∂Vz /∂ y − ∂Vy /∂ z) + ĵ (∂Vx /∂ z − ∂Vz /∂ x) + k̂ (∂Vy /∂ x − ∂Vx /∂ y)

                  | î       ĵ       k̂     |
                = | ∂ /∂ x  ∂ /∂ y  ∂ /∂ z |,   (5.30)
                  | Vx      Vy      Vz     |

resulting in a vector.

 Example 5.22 Find ∇ × V where V = (xz, xyz, −y²).

        | î       ĵ       k̂     |
∇ × V = | ∂ /∂ x  ∂ /∂ y  ∂ /∂ z | = (−2y − xy, x, yz).
        | xz      xyz     −y²    |
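Again a finite-difference spot check is easy; this Python sketch (the helpers V and curl are mine, and the sample point is arbitrary) confirms Example 5.22:

```python
# Finite-difference check of Example 5.22: curl V = (-2y - xy, x, yz)
# for V = (xz, xyz, -y^2).
def V(x, y, z):
    return (x * z, x * y * z, -y**2)

def curl(field, p, h=1e-6):
    def d(i, j):
        # dF_i/dx_j by central differences
        hi = [c + h if k == j else c for k, c in enumerate(p)]
        lo = [c - h if k == j else c for k, c in enumerate(p)]
        return (field(*hi)[i] - field(*lo)[i]) / (2 * h)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))

x, y, z = 1.0, 2.0, 3.0
approx = curl(V, (x, y, z))
exact = (-2 * y - x * y, x, y * z)
print([round(c, 6) for c in approx], exact)  # [-6.0, 1.0, 6.0] (-6.0, 1.0, 6.0)
```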

 Example 5.23 Compute the curl of the position vector r = (x, y, z):

        | î       ĵ       k̂     |
∇ × r = | ∂ /∂ x  ∂ /∂ y  ∂ /∂ z | = 0.
        | x       y       z      |

The physical significance can be seen by considering the circulation of a fluid around a rectangle in
the xy-plane with corners 1: (x0 , y0 ), 2: (x0 + dx, y0 ), 3: (x0 + dx, y0 + dy), 4: (x0 , y0 + dy):

Circ1234 = vx (x0 , y0 ) dx + [vy (x0 , y0 ) + (∂ vy /∂ x) dx] dy + [vx (x0 , y0 ) + (∂ vx /∂ y) dy](−dx) + vy (x0 , y0 )(−dy)
         = (∂ vy /∂ x − ∂ vx /∂ y) dxdy.

Dividing by the area dxdy gives Circ/area = (∇ × v)|z-component . 

Special Case: When the curl is zero, ∇ × v = 0, the flow is called irrotational or conservative.
Physically this means the field points purely in the radial direction with no rotation.
Examples of this include the gravitational and electrostatic central forces F = C r/|r|³, where C = −Gm₁m₂ for
gravitational forces and C = q₁q₂/(4πε₀) for electrostatic Coulomb forces.
Consider the quantity ∇ × (φ v). Then

           | î       ĵ       k̂     |
∇ × (φ v) = | ∂ /∂ x  ∂ /∂ y  ∂ /∂ z |
           | φ vx    φ vy    φ vz   |
         = î (vz ∂ φ /∂ y + φ ∂ vz /∂ y − vy ∂ φ /∂ z − φ ∂ vy /∂ z) + ĵ (vx ∂ φ /∂ z + φ ∂ vx /∂ z − vz ∂ φ /∂ x − φ ∂ vz /∂ x)
           + k̂ (vy ∂ φ /∂ x + φ ∂ vy /∂ x − vx ∂ φ /∂ y − φ ∂ vx /∂ y)
         = φ (∇ × v) + (∇φ ) × v.

The last line represents the so-called “Product Rule for Curl”.



 Example 5.24 Consider the vector potential of a constant B-field In electrodynamics, if ∇ · B = 0,


then the B-field can be represented as B = ∇ × A where A is a vector potential. This is seen to be
true by doing the following computation:
∇ · B = ∇ · (∇ × A) = A · (∇ × ∇) = 0.
One possible form is A = (1/2)(B × r). Thus,

2(∇ × A) = ∇ × (B × r) = B(∇ · r) − (B · ∇)r = 3B − B = 2B  ⇒  B = ∇ × A.   (5.31)
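The identity B = ∇ × A for A = (1/2)(B × r) can be spot-checked with finite differences; in this Python sketch the helper names, the sample B, and the evaluation point are all my own choices:

```python
# Numerical check that curl(A) = B for the vector potential A = (1/2) B x r.
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

B = (0.3, -1.1, 2.0)  # an arbitrary constant field

def A(x, y, z):
    return tuple(0.5 * c for c in cross(B, (x, y, z)))

def curl(field, p, h=1e-6):
    def d(i, j):
        hi = [c + h if k == j else c for k, c in enumerate(p)]
        lo = [c - h if k == j else c for k, c in enumerate(p)]
        return (field(*hi)[i] - field(*lo)[i]) / (2 * h)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))

print([round(c, 6) for c in curl(A, (0.7, -0.2, 1.5))])  # [0.3, -1.1, 2.0]
```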


The other interesting observation is that the BAC −CAB Rule still holds when using the gradient
operator, ∇.
 Example 5.25 Evaluate

∇ × (∇ × V) = ∇(∇ · V) − (∇ · ∇)V = ∇(∇ · V) − ∇2 V




Definition 5.6.5 (Laplacian) The Laplacian operator is defined as

∆ = ∇² = ∂²/∂ x² + ∂²/∂ y² + ∂²/∂ z².   (5.32)

The Laplacian of a scalar, ∆φ = ∂²φ /∂ x² + ∂²φ /∂ y² + ∂²φ /∂ z², is a scalar. If we apply the Laplacian
to a vector,

∆V = ∇²V = (∂²Vx /∂ x² + ∂²Vx /∂ y² + ∂²Vx /∂ z², ∂²Vy /∂ x² + ∂²Vy /∂ y² + ∂²Vy /∂ z², ∂²Vz /∂ x² + ∂²Vz /∂ y² + ∂²Vz /∂ z²),

the result is a vector with each component the Laplacian of the corresponding scalar component.

There are famous equations involving the Laplacian we will study in detail in Chapter 13:
1. Laplace’s Equation ∆φ = 0 (Elasticity)
2. Heat Equation ∆φ = (1/a²) ∂ φ /∂t (Temperature Distribution, Diffusion, Schrödinger)
3. Wave Equation ∆φ = (1/a²) ∂²φ /∂t² (Vibration, Waves).

5.6.4 Solenoidal and Irrotational


If a vector v is solenoidal, then ∇ · v = 0 and there exists a vector A such that v = ∇ × A.

If a vector v is irrotational or conservative, then ∇ × v = 0 and there exists a scalar function


φ such that v = ∇φ .
Some vectors always fit into one of these categories. For example, the curl of a vector curl(v) is
always solenoidal since ∇ · curl(v) = 0. Also, every vector can be written as a conservative vector
field added to a solenoidal vector field. Given a vector F, then
F = −∇φ + ∇ × A
for some scalar φ and vector A.
We can also combine these operations to get other useful relationship (no need to memorize):
∇(A · B) = (B · ∇)A + (A · ∇)B + B × (∇ × A) + A × (∇ × B)
A × (∇ × B) = ∇(A · B) − (A · ∇)B
B × (∇ × A) = ∇(A · B) − (B · ∇)A.

5.6.5 Divergence and Laplacian in Other Coordinate Systems


In Cylindrical (r, θ , z) coordinates we have

∇ · v = (1/r) ∂ /∂ r (r vr ) + (1/r) ∂ vθ /∂ θ + ∂ vz /∂ z
∆ f = (1/r) ∂ /∂ r (r ∂ f /∂ r) + (1/r²) ∂² f /∂ θ ² + ∂² f /∂ z².

In Spherical (ρ, θ , φ ) coordinates we have

∇ · v = (1/ρ²) ∂ /∂ ρ (ρ² vρ ) + (1/(ρ sin θ )) ∂ /∂ θ (vθ sin θ ) + (1/(ρ sin θ )) ∂ vφ /∂ φ
∆ f = (1/ρ²) ∂ /∂ ρ (ρ² ∂ f /∂ ρ) + (1/(ρ² sin θ )) ∂ /∂ θ (sin θ ∂ f /∂ θ ) + (1/(ρ² sin²θ )) ∂² f /∂ φ ².

5.7 Line Integrals


Recall that infinitesimal work can be written dW = F · dr. Suppose that the object is moving
along some path (from A to B). Along this curve there is only one independent variable (the
parameterization of the curve). Therefore, the force field F and dr = dx î + dy ĵ + dz k̂ are functions
of a single variable. One can think of breaking the curve into equal arcs of length ∆s such that
the work done is summed over each segment:

W = ∑i F(xi , yi ) · ∆si  →(∆s → 0)  ∫C F · ds.

Definition 5.7.1 (Line Integral) The line integral (in this case for work) can be expressed as
W = ∫C F · dr,   (5.33)

for the curve C traversed counterclockwise. If C is traversed clockwise, then a minus sign appears in front of
the integral.

Figure 5.7: Depiction of Line Integral for a force field F.



 Example 5.26 Given the force F = (x2 , −xy) find the work done by F along the paths between
(0, 0) and (2, 1): 1. Line, 2. parabola, 3. broken line (vertical up then right), and 4. x = 2t 3 and
y = t 2.

Solution: Since r = (x, y) and dr = (dx, dy), then F · dr = x² dx − xy dy and the total work is

W = ∫C (x² dx − xy dy).

Thus, the work along path 1 (the line y = (1/2)x ⇒ dy = (1/2)dx) gives

W1 = ∫₀² [x² − (1/4)x²] dx = (3/4) ∫₀² x² dx = (1/4)x³ |₀² = 2.

Thus, the work along path 2 (the parabola y = (1/4)x² ⇒ dy = (1/2)x dx) gives

W2 = ∫₀² [x² − (1/8)x⁴] dx = [(1/3)x³ − (1/40)x⁵]₀² = 8/3 − 32/40 = 28/15.

Thus, the work along path 3 (the broken line: up with dx = 0, x = 0, then right with dy = 0, y = 1) gives

W3 = ∫₀¹ 0 dy + ∫₀² x² dx = (1/3)x³ |₀² = 8/3.

Thus, the work along path 4 (x = 2t³ ⇒ dx = 6t² dt and y = t² ⇒ dy = 2t dt for 0 ≤ t ≤ 1) gives

W4 = ∫₀¹ [(2t³)²(6t²) − 2t³ · t² (2t)] dt = ∫₀¹ (24t⁸ − 4t⁶) dt = [(24/9)t⁹ − (4/7)t⁷]₀¹ = 8/3 − 4/7 = 44/21.

Observe that the most work is required to move the object along the broken path 3 and the least
work is along the parabola of path 2. 
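The four path integrals can be checked with a simple midpoint-rule quadrature; in this Python sketch the work helper and the parameterizations of paths 1 and 2 are my own choices (the midpoint estimates should approach the exact values 2, 28/15, and 44/21):

```python
# Midpoint-rule evaluation of W = integral of x^2 dx - x y dy along
# parameterized paths from (0,0) to (2,1), checking Example 5.26.
def work(x, y, dx, dy, n=100000):
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        total += (x(t)**2 * dx(t) - x(t) * y(t) * dy(t)) * h
    return total

# Path 1: the line y = x/2, parameterized x = 2t, y = t
w1 = work(lambda t: 2*t, lambda t: t, lambda t: 2.0, lambda t: 1.0)
# Path 2: the parabola y = x^2/4, parameterized x = 2t, y = t^2
w2 = work(lambda t: 2*t, lambda t: t*t, lambda t: 2.0, lambda t: 2*t)
# Path 4: x = 2t^3, y = t^2
w4 = work(lambda t: 2*t**3, lambda t: t*t, lambda t: 6*t*t, lambda t: 2*t)
print(round(w1, 6), round(w2, 6), round(w4, 6))  # 2.0 1.866667 2.095238
```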

R An analogous procedure can be carried out in 3D to compute the work

W = ∫ F · dr = ∫ Fx dx + ∫ Fy dy + ∫ Fz dz.   (5.34)

 Example 5.27 (Path Dependent Work) The force exerted on a body is F = (−y, x). Find the work
required to move an object from (0, 0) to (1, 1), first moving along the broken line right then up:

W = ∫ F · dr = ∫ (−y dx + x dy) = −∫₀¹ y dx + ∫₀¹ x dy = 0 + ∫₀¹ 1 dy = 1

(y = 0 on the horizontal leg, x = 1 on the vertical leg), then along the broken line up then right:

W = ∫ F · dr = ∫ (−y dx + x dy) = −∫₀¹ y dx + ∫₀¹ x dy = −∫₀¹ 1 dx + 0 = −1

(x = 0 on the vertical leg, y = 1 on the horizontal leg). The amount of work required depends on the choice of path! 



 Example 5.28 (Line Integral for Work) Find the work done on the unit circle clockwise from 0
to −π for the force F = (−y/(x² + y²), x/(x² + y²)).

Solution: Parameterize the circle: x = cos(ϕ), y = sin(ϕ); then dx = − sin(ϕ)dϕ, dy = cos(ϕ)dϕ,
and F = (− sin(ϕ), cos(ϕ)). Thus

W = −∫C (x dy − y dx)/(x² + y²) = ∫₀^{−π} (− sin²(ϕ) − cos²(ϕ)) dϕ = −ϕ |₀^{−π} = π.

Now, using the same force, compute the integral around the square path (1, 0) → (1, −1) →
(−1, −1) → (−1, 0):

W = −∫ F · dr = −∫₀^{−1} Fy |x=1 dy − ∫₁^{−1} Fx |y=−1 dx − ∫_{−1}^{0} Fy |x=−1 dy
  = −∫₀^{−1} dy/(1 + y²) − ∫₁^{−1} dx/(x² + (−1)²) − ∫_{−1}^{0} (−1) dy/((−1)² + y²)
  = ∫_{−1}^{0} dy/(1 + y²) + ∫_{−1}^{1} dx/(x² + 1) + ∫_{−1}^{0} dy/(1 + y²)
  = tan⁻¹(y)|_{−1}^{0} + tan⁻¹(x)|_{−1}^{1} + tan⁻¹(y)|_{−1}^{0} = π/4 + π/2 + π/4 = π.

This is the same result as the circular path!! 

Question: What is special about this force field F that allows the work to be the same regardless
of path?

Answer: Conservative vector fields!! In Example 5.27 the answer depended on the path, while in
Example 5.28 it was independent of the path. For the path-dependent case, consider a person loading
a box onto a truck along two routes: 1. drag the box along the ground, then lift it to the desired point;
2. lift it immediately, then carry it to the desired point.

In physics, if there is friction, then the work depends on the path. This is “non-conservative”:
energy is dissipated by friction. Without friction the only work done is the lifting, the same on either
route; this is “conservative”: no energy is dissipated by friction.

R Thus, to be path-independent the force field must be curl free, ∇ × F = curl(F) = 0.


 
 Example 5.29 Let the electric field E = (−y/(x² + y²), x/(x² + y²)). Is this field conservative,
and therefore any line integral path independent?

Solution: Compute

          | î             ĵ            k̂     |
curl(E) = | ∂ /∂ x        ∂ /∂ y       ∂ /∂ z | = (0, 0, (x² + y² − x(2x))/(x² + y²)² − (−(x² + y²) + 2y²)/(x² + y²)²) = (0, 0, 0).
          | −y/(x² + y²)  x/(x² + y²)  0      |

Yes, the electric field is conservative (everywhere away from the origin, where E is undefined)! 

Consider another case. Suppose that there exists a function W (x, y, z) such that F = ∇W . This
implies that
∂ Fx /∂ y = ∂ Fy /∂ x,   ∂ Fx /∂ z = ∂ Fz /∂ x,   ∂ Fy /∂ z = ∂ Fz /∂ y

and

curl(F) = (∂ Fz /∂ y − ∂ Fy /∂ z, ∂ Fx /∂ z − ∂ Fz /∂ x, ∂ Fy /∂ x − ∂ Fx /∂ y) = 0.

Then the quantity

∫_A^B F · dr = ∫_A^B ∇W · dr = ∫_A^B [(∂W /∂ x) dx + (∂W /∂ y) dy + (∂W /∂ z) dz] = ∫_A^B dW = W (B) −W (A).

In words this says that when a vector field is conservative, then the total work done is the work at
the end point minus the work at the starting point independent of the path. This is referred to as
The Fundamental Theorem of Line Integrals.

R In the previous explanation we used the exact differential of W , dW = (∂W /∂ x) dx + (∂W /∂ y) dy + (∂W /∂ z) dz.
In other words, the exact differential exists if and only if the curl of the force field F is zero,
curl(F) = ∇ × F = 0.

 Example 5.30 Is F = (yzexz , exz , xyexz ) an exact differential?

Solution: First computing the curl



î ĵ k̂
= î(xexz −xexz )+ ĵ(yexz +xyzexz −yexz −xyzexz )+ k̂(zexz −zexz ) = 0.
∂ ∂ ∂

0 = ∇×F = ∂ x

∂y ∂z
yzexz exz xyexz

Yes! This force field is an exact differential. 

 
R In two-dimensions we only need that ∂∂W
x∂ y = ∂W
∂ y∂ x . So if F = ∂W ∂W
,
∂x ∂y , 0 , then F is an exact
x x
differential. For example F = (e sin(y), e cos(y)) is conservative.

5.7.1 Potentials
In mechanics, if F = ∇W (conservative), then W is the work done by the force F. If a mass falls a
distance z, then the work done is W = mgz. If a mass is lifted a distance z, then the work done is
W = −mgz (direction opposite the force of gravity). The total increase in potential energy when
lifting the object is φ = mgz implying that φ = −W . Thus, the force F = −∇φ where φ is the
potential energy or scalar potential function.

R In general, if curl(v) = 0, then there exists a scalar function φ such that v = −∇φ . One
special case where the sign is opposite is hydrodynamics where v = ∇φ , but we ignore this
case for now.

Now, suppose that curl(F) = 0 ⇒ F = ∇W . Question: How can we find the function W ?

Solution: We calculate the line integral from A to B along a convenient path (since the integral is
path independent).
 Example 5.31 Show that the force F = (3 + 2xy, x2 − 3y2 , 0) is conservative, then find the scalar
potential φ such that F = −∇φ .

Solution: First, show the force field is conservative by computing the curl


            | î        ĵ         k̂     |
0 = ∇ × F = | ∂ /∂ x   ∂ /∂ y    ∂ /∂ z | = î(0 − 0) + ĵ(0 − 0) + k̂(2x − 2x) = 0.
            | 3 + 2xy  x² − 3y²  0      |

So the force field is conservative!

Now consider the path in 3D from (0, 0, 0) → (x, 0, 0) → (x, y, 0) → (x, y, z) and compute the
work:

W = ∫_A^B F · dr = ∫_A^B (3 + 2xy) dx + (x² − 3y²) dy + 0 dz
  = (Integral over Path 1) + (Path 2) + (Path 3)
  = ∫₀ˣ 3 dx + ∫₀ʸ (x² − 3y²) dy + ∫₀ᶻ 0 dz
  = 3x + x²y − y³ + 0.

Thus, W = 3x + x²y − y³ and φ = −W = −3x − x²y + y³. 

5.7.2 Alternate Approach to Finding Scalar Potential φ


2 3z 3z
 Example 5.32 Consider the force field F = (y , 2xy + e , 3ye ). Show that F is conservative

and find the scalar potential φ such that F = −∇φ .

Solution: First, show the force field is conservative by computing the curl:

            | î       ĵ             k̂       |
0 = ∇ × F = | ∂ /∂ x  ∂ /∂ y        ∂ /∂ z  | = î(3e^{3z} − 3e^{3z}) + ĵ(0 − 0) + k̂(2y − 2y) = 0.
            | y²      2xy + e^{3z}  3ye^{3z} |

So the force field is conservative!

Now we need to find W . We know that ∂W /∂ x = Fx = y². Integrating in x gives W = xy² + g(y, z).
Now take the y derivative: ∂W /∂ y = 2xy + g_y (y, z) = 2xy + e^{3z} = Fy .
This implies that g_y (y, z) = e^{3z}; integrating this in y gives g(y, z) = ye^{3z} + h(z).
Now we have W = xy² + ye^{3z} + h(z). Take a derivative in z: ∂W /∂ z = 3ye^{3z} + h′(z) = 3ye^{3z} = Fz .
Thus, h′(z) = 0 ⇒ h(z) = C₁ (a constant). Therefore, W = xy² + ye^{3z} + C₁ for any
constant C₁ and φ = −W . 
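The recovered potential can be verified numerically: the central-difference gradient of W = xy² + ye^{3z} should match F at any sample point (the helper names and the test point in this Python sketch are mine):

```python
import math

# Checking Example 5.32: F should equal grad(W) for W = x y^2 + y e^{3z}.
def W(x, y, z):
    return x * y**2 + y * math.exp(3 * z)

def F(x, y, z):
    return (y**2, 2 * x * y + math.exp(3 * z), 3 * y * math.exp(3 * z))

def grad(f, p, h=1e-6):
    # central-difference gradient
    g = []
    for i in range(3):
        hi = [c + h if j == i else c for j, c in enumerate(p)]
        lo = [c - h if j == i else c for j, c in enumerate(p)]
        g.append((f(*hi) - f(*lo)) / (2 * h))
    return g

p = (0.5, -1.2, 0.3)
print(all(abs(g - f) < 1e-4 for g, f in zip(grad(W, p), F(*p))))  # True
```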

5.8 Green’s Theorem in the Plane


Recall the Fundamental Theorem of Calculus:
∫_a^b (d f /dt) dt = f (b) − f (a).   (5.35)
We want to now generalize this idea to multiple dimensions in the form of the celebrated Divergence
and Stokes Theorems in 3D. We start, however, with the 2D versions of these theorems known as
Green’s Theorem. The idea is to relate an area integral to the line integral around its boundary.
Let P(x, y) and Q(x, y) be continuous functions with continuous first derivatives. Let x = a
and x = b be the left and right most x-coordinates of area A. Let yu describe the upper curve and

Figure 5.8: Positively oriented (counter-clockwise) curve around boundary. Figure from Stewart
Calculus.

yl describe the lower curve between a and b. We want to show that the integral over the area is
equivalent to a line integral around the boundary:

∬_A (∂ P/∂ y) dA = −∮_C P dx.   (5.36)

Starting from the left-hand side:


Z b Z yu 
∂P ∂ P(x, y)
ZZ
dydx = dy dx
A ∂y a yl ∂y
Z b
= [P(x, yu ) − P(x, yl )] dx
a
Z a Z b I
=− P(x, yu )dx − P(x, yl )dx = − Pds.
b a C

RR ∂ q H
Repeating the calculation, but using functions of y instead of x gives: A ∂ x dxdy = C Qdy.

Theorem 5.8.1 (Green’s Theorem in the Plane (2D)) Let P(x, y) and Q(x, y) be continuous
functions with continuous first derivatives defined on the area A, then
ZZ  
∂Q ∂P
I
− dxdy = (Pdx + Qdy), (5.37)
A ∂x ∂y ∂A

with the line integral oriented counter-clockwise around the boundary ∂ A.

Example 5.33 Evaluate ∮_C x^4 dx + xy dy where C is the triangle with vertices (0, 0), (1, 0), (0, 1).

Solution: First note the region under consideration can be written as a Type I or Type II region (see the chapter on Multiple Integration). Given the vertices, the triangle is bounded above by the line y = 1 − x and below by y = 0, for x between 0 and 1. From the integrand we see that P(x, y) = x^4 and Q(x, y) = xy. Thus, using Green’s Theorem,

∮_C x^4 dx + xy dy = ∫∫_A (∂Q/∂x − ∂P/∂y) dx dy = ∫_0^1 ∫_0^{1−x} (y − 0) dy dx
  = ∫_0^1 [½ y^2]_0^{1−x} dx = ∫_0^1 ½ (1 − x)^2 dx = [−⅙ (1 − x)^3]_0^1 = 1/6. ∎
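As a check on Green’s Theorem itself, the line integral side can be computed directly. The helper below is our own (not from the text): it approximates ∮ P dx + Q dy with a midpoint rule along each parametrized edge of the triangle.

```python
# P and Q from Example 5.33; numerically compare the line integral around the
# triangle with the Green's-theorem value 1/6.
def line_integral(path, n=20000):
    P = lambda x, y: x**4
    Q = lambda x, y: x * y
    total, h = 0.0, 1.0 / n
    for i in range(n):
        t = (i + 0.5) * h
        x0, y0 = path(t - 0.5 * h)   # segment start
        x1, y1 = path(t + 0.5 * h)   # segment end
        xm, ym = path(t)             # midpoint sample
        total += P(xm, ym) * (x1 - x0) + Q(xm, ym) * (y1 - y0)
    return total

# Triangle (0,0) -> (1,0) -> (0,1) -> (0,0), traversed counter-clockwise.
edges = [
    lambda t: (t, 0.0),        # bottom: y = 0
    lambda t: (1.0 - t, t),    # hypotenuse: y = 1 - x
    lambda t: (0.0, 1.0 - t),  # left side: x = 0
]
work = sum(line_integral(e) for e in edges)
print(round(work, 4))  # 0.1667, i.e. 1/6
```

Both sides of (5.37) agree, as they must.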

Example 5.34 Evaluate ∮_C (3y − e^{sin(x)}) dx + (7x + √(y^4 + 1)) dy where C is the circle defined by x^2 + y^2 = 9.

Solution: From the integrand we see that P(x, y) = 3y − e^{sin(x)} and Q(x, y) = 7x + √(y^4 + 1). Thus, using Green’s Theorem,

∮_C (3y − e^{sin(x)}) dx + (7x + √(y^4 + 1)) dy = ∫∫_A (∂Q/∂x − ∂P/∂y) dx dy = ∫∫_A (7 − 3) dx dy
  = ∫∫_D 4 dA =_{Polar} ∫_0^{2π} ∫_0^3 4r dr dθ = ∫_0^{2π} [2r^2]_0^3 dθ = ∫_0^{2π} 18 dθ = 36π. ∎

R  In the prior two examples it is easier to do the double integral than the line integral. Sometimes the line integral is easier, so remember the theorem can be used in both directions!

This theorem can be used to find the area A of some regions. Consider A = ∫∫_A 1 dA. Then one can choose P, Q such that ∂Q/∂x − ∂P/∂y = 1. For example: a) P = 0, Q = x; b) P = −y, Q = 0; c) P = −½y, Q = ½x. Then using Green’s Theorem we derive Green’s Theorem for Areas:

A = ∮_C x dy = −∮_C y dx = ½ ∮_C (x dy − y dx).     (5.38)

Example 5.35 Find the area enclosed by the ellipse x^2/a^2 + y^2/b^2 = 1.

Solution: Use Green’s Theorem for Areas with the parametrization x = a cos(t), y = b sin(t), so dx = −a sin(t) dt, dy = b cos(t) dt:

A = ½ ∮_C (x dy − y dx) = ½ ∫_0^{2π} (ab cos^2(t) + ab sin^2(t)) dt = (ab/2) ∫_0^{2π} dt = πab. ∎
2 0
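Green’s Theorem for Areas can be checked numerically: connecting sample points of the ellipse by straight segments and applying ½∮(x dy − y dx) to the polygon is exactly the shoelace formula. The values a = 3, b = 2 below are our own choices.

```python
import math

# Green's Theorem for Areas, A = (1/2) ∮ (x dy - y dx), applied to sample
# points of the ellipse x = a cos t, y = b sin t with a = 3, b = 2.
def ellipse_area(a, b, n=100000):
    pts = [(a * math.cos(2 * math.pi * k / n), b * math.sin(2 * math.pi * k / n))
           for k in range(n)]
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:] + pts[:1]):
        area += x0 * y1 - x1 * y0   # (x dy - y dx) on a straight segment
    return area / 2.0

area = ellipse_area(3.0, 2.0)
print(round(area, 4))  # 18.8496, i.e. pi*a*b = 6*pi
```

The same routine computes the area of any simple closed curve given boundary points, which is how planimeters and GIS software use (5.38) in practice.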


Example 5.36 Let F = (x^2, −xy) and consider the area bounded above by y = 1 and below by y = ¼x^2. Find the work done in moving around this curve. (From the last section it is W_2 − W_3 = −4/5.)

Solution: We can use Green’s Theorem to compute the work!

W = ∮_{∂A} x^2 dx − xy dy = ∫∫_A (∂(−xy)/∂x − ∂(x^2)/∂y) dx dy
  = ∫∫_A −y dx dy = ∫_0^1 ∫_0^{2√y} −y dx dy = ∫_0^1 [−xy]_0^{2√y} dy
  = ∫_0^1 −2y^{3/2} dy = [−(4/5) y^{5/2}]_0^1 = −4/5. ∎

Example 5.37 Let’s examine conservative forces: ∇ × F = 0, or in 2D, ∂F_y/∂x − ∂F_x/∂y = 0. From Green’s Theorem we compute the work

W = ∮_{∂A} F_x dx + F_y dy = ∫∫_A (∂F_y/∂x − ∂F_x/∂y) dx dy = 0.

The work required to move an object around any closed path in a conservative force field is zero! This results from the fact that the work in moving an object from one point to another is independent of the path. ∎

Example 5.38 Let Q = V_x, P = −V_y where V = (V_x, V_y). Then ∂Q/∂x − ∂P/∂y = ∂V_x/∂x + ∂V_y/∂y = div(V) (with V_z = 0).
The infinitesimal tangent at any point is dr = (dx, dy) and the normal is n ds = (dy, −dx), with ds = √(dx^2 + dy^2). Thus, P dx + Q dy = −V_y dx + V_x dy = (V_x, V_y) · (dy, −dx) = V · n ds. Therefore,

2D Divergence Theorem:  ∫∫_A div(V) dA = ∮_{∂A} V · n ds.     (5.39)

Example 5.39 Let Q = V_y, P = V_x where V = (V_x, V_y). Then ∂Q/∂x − ∂P/∂y = ∂V_y/∂x − ∂V_x/∂y = curl(V) · k̂ (with V_z = 0).
Thus, P dx + Q dy = V_x dx + V_y dy = (V_x, V_y) · (dx, dy) = V · dr. Therefore,

2D Stokes Theorem:  ∫∫_A curl(V) · k̂ dA = ∮_{∂A} V · dr.     (5.40)
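The 2D divergence theorem (5.39) is easy to test on the unit circle. The field V below is our own test case: with V = (x + y^2, xy) we have div V = 1 + x, and ∫∫_disk (1 + x) dA = π by symmetry.

```python
import math

# Outward flux ∮ V·n ds through the unit circle, using V·n ds = Vx dy - Vy dx
# on each small boundary segment; compare with ∫∫ div V dA = pi.
def boundary_flux(V, n=200000):
    total = 0.0
    for k in range(n):
        t0 = 2 * math.pi * k / n
        t1 = 2 * math.pi * (k + 1) / n
        tm = 0.5 * (t0 + t1)
        x, y = math.cos(tm), math.sin(tm)   # midpoint of the segment
        dx = math.cos(t1) - math.cos(t0)
        dy = math.sin(t1) - math.sin(t0)
        vx, vy = V(x, y)
        total += vx * dy - vy * dx          # V · n ds on the segment
    return total

flux = boundary_flux(lambda x, y: (x + y**2, x * y))
print(round(flux, 4))  # 3.1416, i.e. pi
```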

Question: What about non-simple regions made up of many areas?

Example 5.40 Evaluate ∮_C y^2 dx + 3xy dy where C is the curve around the half annulus (in the upper half-plane) with outer radius 2 and inner radius 1.

Solution: Compute

∮_C y^2 dx + 3xy dy = ∫∫_A (∂(3xy)/∂x − ∂(y^2)/∂y) dA = ∫∫_A y dA = ∫_0^π ∫_1^2 r sin(θ) r dr dθ
  = ∫_0^π [⅓ r^3 sin(θ)]_1^2 dθ = ∫_0^π (7/3) sin(θ) dθ = [−(7/3) cos(θ)]_0^π = 14/3. ∎
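The polar double integral above can be verified with a plain midpoint Riemann sum (the helper is our own sketch, not from the text):

```python
import math

# Midpoint Riemann sum for ∫_0^π ∫_1^2 (r sin θ) r dr dθ from Example 5.40;
# the exact value is 14/3.
def polar_sum(f, r0, r1, th0, th1, nr=400, nt=400):
    hr, ht = (r1 - r0) / nr, (th1 - th0) / nt
    total = 0.0
    for i in range(nr):
        r = r0 + (i + 0.5) * hr
        for j in range(nt):
            th = th0 + (j + 0.5) * ht
            total += f(r, th) * r * hr * ht   # f(r, θ) r dr dθ
    return total

val = polar_sum(lambda r, th: r * math.sin(th), 1.0, 2.0, 0.0, math.pi)
print(round(val, 3))  # 4.667, i.e. 14/3
```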

R  What if the region under consideration has a hole? Consider an outer boundary curve C_1 while the inner hole has boundary C_2. Then the total line integral around the boundary is

∮_{C_1 − C_2} P dx + Q dy = ∮_{C_1} P dx + Q dy − ∮_{C_2} P dx + Q dy.     (5.41)

Here we just subtract the contribution of the hole from the total!

Example 5.41 Let F = (y^2, 3xy) be the force and compute the work done around the region between the two circles x^2 + y^2 = 1 and x^2 + y^2 = 4.

Solution: Compute

∮_{C_1} y^2 dx + 3xy dy − ∮_{C_2} y^2 dx + 3xy dy = ∫_0^{2π} ∫_0^2 r^2 sin(θ) dr dθ − ∫_0^{2π} ∫_0^1 r^2 sin(θ) dr dθ
  = ∫_0^{2π} ∫_1^2 r^2 sin(θ) dr dθ = ∫_0^{2π} sin(θ) dθ ∫_1^2 r^2 dr = [−cos(θ)]_0^{2π} [⅓ r^3]_1^2 = 0. ∎

5.9 The Divergence (Gauss) Theorem

Recall the divergence of a vector field v(x, y, z):

div(v) = ∇ · v = ∂v_x/∂x + ∂v_y/∂y + ∂v_z/∂z.

The idea behind the divergence theorem relies on fluid flow in a region R. Let v be the fluid velocity at a point inside R. The divergence measures the amount of substance flowing out of a given volume. Consider a cross-sectional area A. The amount of water flowing in time t through a region of area A_0 perpendicular to the flow is the water in a cylindrical cross-section of area A_0 and length vt, for a total amount vtA_0ρ, where ρ is the density. For a cross-section A (inclined at an angle θ to v) we have vtA_0ρ = vtρA cos(θ). This gives, per unit area in unit time, vρ cos(θ) = ρ v · n, where n is the unit normal.

Figure 5.9: Depiction of fluid flow in and out of a region. Image from Weber, Essential Math Methods for Physicists.

Imagine a volume V subdivided into parallelepipeds. For each parallelepiped,

∑_{six faces} v · dA = ∇ · v dV.

Adding up this quantity over all parallelepipeds leaves just the flux through the exterior boundary (the amounts flowing in/out across interior faces cancel, having the same magnitude in opposite directions):

∑_{all parallelepipeds} v · dA = ∑_{exterior surface} v · dA = ∑_{volumes} ∇ · v dV.

Theorem 5.9.1 (Gauss (Divergence) Theorem) Given a vector field v we have the following relation between the volume and surface integrals:

∫∫_A v · dA = ∫∫∫_V ∇ · v dV,   i.e.   ∫∫_A v · n dA = ∫∫∫_V ∇ · v dV.     (5.42)
Here V is the volume, A is the associated surface area, and n is the unit normal to the surface.
This holds for simple solid regions with no holes.
Example 5.42 Let B = ∇ × A. Show that ∫∫_S B · dA = 0.

Solution: Compute

∫∫_S (∇ × A) · dA =_{Div.Thm} ∫∫∫_V ∇ · (∇ × A) dV = ∫∫∫_V 0 dV = 0. ∎

Example 5.43 Over a volume V, let ψ solve Laplace’s equation (∆ψ = 0). Show that the integral over a closed surface in V of the normal derivative of ψ (∂ψ/∂n = ∇ψ · n) is zero:

∫∫_S (∂ψ/∂n) dA = ∫∫_S ∇ψ · n dA =_{Div.Thm} ∫∫∫_V ∇ · (∇ψ) dV = ∫∫∫_V ∆ψ dV = 0. ∎


Definition 5.9.1 The flux of a vector field through the surface of an object is given via the divergence theorem by

∫∫_S F · dA = ∫∫∫_V ∇ · F dV.     (5.43)

Example 5.44 Find the flux of the vector field F(x, y, z) = (z, y, x) over the unit sphere x^2 + y^2 + z^2 = 1.

Solution: First compute the divergence: ∇ · F = 1. Then the flux

Flux = ∫∫_S F · dA = ∫∫∫_V ∇ · F dV = ∫∫∫_V 1 dV = Vol(V) = 4π/3. ∎
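The surface-integral side of this example can also be computed directly in spherical coordinates, which is a good independent check of the divergence theorem (the quadrature helper is ours):

```python
import math

# Direct surface integral of F·n over the unit sphere for F = (z, y, x):
# outward normal n = (x, y, z), area element dA = sin(θ) dθ dφ (θ = polar
# angle). The divergence theorem predicts 4π/3.
def sphere_flux(F, n_th=200, n_ph=400):
    h_th, h_ph = math.pi / n_th, 2 * math.pi / n_ph
    total = 0.0
    for i in range(n_th):
        th = (i + 0.5) * h_th
        for j in range(n_ph):
            ph = (j + 0.5) * h_ph
            x = math.sin(th) * math.cos(ph)
            y = math.sin(th) * math.sin(ph)
            z = math.cos(th)
            fx, fy, fz = F(x, y, z)
            total += (fx * x + fy * y + fz * z) * math.sin(th) * h_th * h_ph
    return total

flux = sphere_flux(lambda x, y, z: (z, y, x))
print(round(flux, 2))  # 4.19, i.e. 4*pi/3
```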


Example 5.45 Find the flux of F = (xy, y^2 + e^{xz^2}, sin(xy)) through the surface bounding the region enclosed by the parabolic cylinder z = 1 − x^2 and the planes z = 0, y = 0, y + z = 2.

Solution: First compute the divergence: ∇ · F = y + 2y + 0 = 3y. Then the flux

Flux = ∫∫_S F · dA =_{Div.Thm} ∫∫∫_V ∇ · F dV = ∫_{−1}^1 ∫_0^{1−x^2} ∫_0^{2−z} 3y dy dz dx
  = 3 ∫_{−1}^1 ∫_0^{1−x^2} [½ y^2]_0^{2−z} dz dx = (3/2) ∫_{−1}^1 ∫_0^{1−x^2} (2 − z)^2 dz dx
  = (3/2) ∫_{−1}^1 [−⅓ (2 − z)^3]_0^{1−x^2} dx = −½ ∫_{−1}^1 ((1 + x^2)^3 − 8) dx
  = −½ ∫_{−1}^1 (x^6 + 3x^4 + 3x^2 + 1 − 8) dx = −½ [ (1/7)x^7 + (3/5)x^5 + x^3 − 7x ]_{−1}^1 = 184/35. ∎

R The divergence theorem is also true for unions of simple regions!



5.9.1 Gauss Law for Electricity

Recall Coulomb’s Law, E = (q/(4πε_0 r^2)) e_r, where ε_0 is the permittivity of free space, and the electric displacement D = ε_0 E = (q/(4πr^2)) e_r. Let S be a closed surface surrounding a charge q at the origin. Then

∮_S D · n dA = (q/4π) ∮ dΩ = q,

where dΩ is the element of solid angle and the total solid angle around a point is 4π.

Definition 5.9.2 (Gauss Law) The total charge inside a closed surface satisfies

∮ D · n dA = ∫∫∫_V ρ dV,

where ρ is the charge density (charge distribution). For a collection of isolated charges,

∮_S D · n dA = ∑_i ∮_{S_i} D · n dA = ∑_i q_i,     (5.44)

i.e. the total charge inside the region bounded by S is the sum of the isolated charges. Using the divergence theorem we find

∫∫∫_V ∇ · D dV = ∫∫∫_V ρ dV,     (5.45)

which is Maxwell’s equation ∇ · D = ρ in non-local (integral) form.

Example 5.46 Let D = (q/(4πr^2)) e_r, where e_r = r/|r|. Show that the electric flux through any closed surface S surrounding the origin is

∫∫_S D · dA = q.

Solution: Let S̃ be a small sphere of radius a around the origin. Then

∫∫_S D · dA = ∫∫_{S̃} D · dA = ∫∫_{S̃} D · n dA
  = ∫∫_{S̃} (q/(4π|r|^3)) r · (r/|r|) dA = ∫∫_{S̃} (q/(4π|r|^2)) dA
  = ∫∫_{S̃} (q/(4πa^2)) dA = (q/(4πa^2)) Area(S̃) = (q/(4πa^2)) 4πa^2 = q. ∎


5.10 The Stokes (Curl) Theorem

Recall the definition of the curl of a vector field v:

curl(v) = ∇ × v = det | î  ĵ  k̂ ;  ∂/∂x  ∂/∂y  ∂/∂z ;  v_x  v_y  v_z |.

For a rigid body we have seen that the translational velocity is v = ω × r, where ω is the angular velocity. Then

curl(v) = ∇ × (ω × r) =_{BAC−CAB} (∇ · r)ω − (ω · ∇)r = 3ω − ω = 2ω.

Figure 5.10: Depiction of local cells from Weber Essential Math Methods.

Thus, the angular velocity is ω = ½(∇ × v). (Sometimes curl(v) is denoted rot(v).) Also, recall that if a vector field is conservative, then it is curl free and called irrotational.

Consider four basic flow patterns: a) vortex, b) parallel flow, c) parallel flow with variable velocity (shear), d) flow around a corner. We can determine the degree to which the fluid is rotating by computing the circulation.

Definition 5.10.1 (Circulation) The circulation of a fluid with velocity field v around a closed curve C is

∮_C v · dr.     (5.46)

In flow a) ∇ × v ≠ 0 at the center; in b) ∇ × v = 0; in c) ∇ × v ≠ 0; in d) ∇ × v = 0.

Consider the circulation around the sub-cells of a surface:

∑_{4 sides of cell} v · dr = ∑_{exterior} v · dr = ∑_{rectangles} (∇ × v) · dA.

Taking the limit of both sides gives Stokes Theorem.

Theorem 5.10.1 (Stokes (Curl) Theorem) For a vector field v the following relation holds for a surface S with boundary C:

∮_C v · dr = ∫∫_S (∇ × v) · n dA.     (5.47)

Example 5.47 Evaluate ∮_C F · dr where F = (−y^2, x, z^2) and C is the intersection of the plane y + z = 2 with the cylinder x^2 + y^2 = 1.

Solution: Compute

∮_C F · dr =_{Stokes} ∫∫_S curl(F) · dS = ∫∫_S (1 + 2y) dA
  = ∫_0^{2π} ∫_0^1 (1 + 2r sin(θ)) r dr dθ = ∫_0^{2π} [½ r^2 + ⅔ r^3 sin(θ)]_0^1 dθ
  = ∫_0^{2π} (½ + ⅔ sin(θ)) dθ = [θ/2 − ⅔ cos(θ)]_0^{2π} = π.
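The line-integral side of Stokes Theorem can be evaluated directly for this curve. The sketch below (our own helper) parametrizes C as x = cos t, y = sin t, z = 2 − sin t and sums F · dr over small segments:

```python
import math

# ∮ F · dr around the ellipse where the plane y + z = 2 cuts the cylinder
# x^2 + y^2 = 1; Stokes Theorem predicts the value pi.
def loop_integral(F, r, n=100000):
    total, h = 0.0, 2 * math.pi / n
    for k in range(n):
        t = (k + 0.5) * h
        p0 = r(t - 0.5 * h)                 # segment start
        p1 = r(t + 0.5 * h)                 # segment end
        f = F(*r(t))                        # field at the midpoint
        total += sum(fi * (b - a) for fi, a, b in zip(f, p0, p1))
    return total

F = lambda x, y, z: (-y**2, x, z**2)
r = lambda t: (math.cos(t), math.sin(t), 2.0 - math.sin(t))
val = loop_integral(F, r)
print(round(val, 4))  # 3.1416, i.e. pi
```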

Example 5.48 Evaluate ∮_C F · dr where F = (x + y^2, y + z^2, z + x^2) and C is the triangle with vertices (1, 0, 0), (0, 1, 0), (0, 0, 1).

Solution: Here curl(F) = (−2z, −2x, −2y). The triangle lies in the plane z = 1 − x − y, with upward surface element dS = (1, 1, 1) dA, so curl(F) · dS = −2(x + y + z) dA = −2 dA on the surface. Compute

∮_C F · dr =_{Stokes} ∫∫_S curl(F) · dS = ∫_0^1 ∫_0^{1−x} −2 dy dx = ∫_0^1 −2(1 − x) dx = −1. ∎


5.10.1 Ampere’s Law

Ampere’s Law relates the current to the magnetic field:

∮_C H · dr = I,     (5.48)

where I is the current, H = B/µ_0, B is the magnetic field, and µ_0 is the (constant) permeability. For a circle C of radius r centered on a straight wire, Ampere’s Law gives

I = ∮_C H · dr = ∫_0^{2π} |H| r dθ = 2πr|H|  ⇒  |H| = I/(2πr).

Using the current density J we can find the total current, I = ∫∫_S J · n dA. Combining this with Ampere’s Law gives

∫∫_S J · n dA = ∮_C H · dr =_{Stokes} ∫∫_S (∇ × H) · n dA.

This implies that ∇ × H = J, another one of Maxwell’s equations.

5.10.2 Conservative Fields

Consider simply connected regions with no holes (a region is simply connected if any closed curve in it can be shrunk to a point without leaving the region).

Theorem 5.10.2 Let the vector field F be continuous with continuous first partial derivatives in a simply connected region S. Then the following statements are equivalent (either all true or all false):

a) curl(F) = 0 at every point;
b) ∮ F · dr = 0 around every simple closed curve in the region;
c) F is conservative, and any work done is path independent;
d) F · dr is an exact differential of a single-valued function;
e) F = −∇φ, where φ is a single-valued scalar potential field.

To briefly review:
1. Irrotational: ∇ × v = 0 ⇒ there exists a scalar field φ such that v = −∇φ.
2. Solenoidal: ∇ · v = 0 ⇒ there exists a vector field A such that v = ∇ × A.
Example 5.49 Given the vector field v = (x^2 − yz, −2yz, z^2 − 2xz), find A such that v = ∇ × A.

Solution: First check that v is solenoidal: ∇ · v = 2x − 2z + 2z − 2x = 0. Yes!

Since only the curl of A is prescribed, there are infinitely many choices of A; we need to find one. Start by assuming A_x = 0. Then v_y = −∂A_z/∂x gives ∂A_z/∂x = 2yz, and v_z = ∂A_y/∂x gives ∂A_y/∂x = z^2 − 2xz. Integrating each with respect to x gives

A_y = z^2 x − x^2 z + f_1(y, z),
A_z = 2xyz + f_2(y, z).

Now use the first component of v:

x^2 − yz = ∂A_z/∂y − ∂A_y/∂z = 2xz + ∂f_2/∂y − 2xz + x^2 − ∂f_1/∂z.

In particular, pick f_1, f_2 to satisfy ∂f_2/∂y − ∂f_1/∂z = −yz. Some examples: 1. f_1 = ½yz^2, f_2 = 0; 2. f_1 = 0, f_2 = −½y^2z. Using the latter we find A = (0, xz^2 − x^2z, 2xyz − ½y^2z). ∎
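A vector potential found this way is easy to verify numerically. The sketch below (our own helper, with an arbitrary test point) computes ∇ × A by central differences and compares it with v:

```python
# Finite-difference check that the A found above satisfies ∇ × A = v.
def curl(A, p, h=1e-5):
    def d(i, j):                      # ∂A_i/∂x_j at p, central difference
        q1, q2 = list(p), list(p)
        q1[j] += h
        q2[j] -= h
        return (A(*q1)[i] - A(*q2)[i]) / (2 * h)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))

A = lambda x, y, z: (0.0, x * z**2 - x**2 * z, 2 * x * y * z - 0.5 * y**2 * z)
v = lambda x, y, z: (x**2 - y * z, -2 * y * z, z**2 - 2 * x * z)

p = (1.3, -0.7, 2.1)                  # arbitrary test point
diffs = [abs(c - w) for c, w in zip(curl(A, p), v(*p))]
print(max(diffs) < 1e-8)  # True
```

Replacing A by A + ∇u for any smooth scalar u leaves the printed result unchanged, illustrating the remark that follows.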

R If we have one A, then all others have the form A + ∇u for any scalar function u. This is true
since ∇ × (A + ∇u) = ∇ × A + ∇ × (∇u) = ∇ × A.
IV

Part Four: Ordinary Differential Equations

6 Ordinary Differential Equations . . . . . . 169
6.1 Introduction to ODEs
6.2 Separable Equations
6.3 Linear First-Order Equations, Method of Integrating Factors
6.4 Existence and Uniqueness
6.5 Other Methods for First-Order Equations
6.6 Second-Order Linear Equations with Constant Coefficients and Zero Right-Hand Side
6.7 Complex Roots of the Characteristic Equation
6.8 Repeated Roots of the Characteristic Equation and Reduction of Order
6.9 Second-Order Linear Equations with Constant Coefficients and Non-zero Right-Hand Side
6.10 Mechanical and Electrical Vibrations
6.11 Two-Point Boundary Value Problems and Eigenfunctions
6.12 Systems of Differential Equations
6.13 Homogeneous Linear Systems with Constant Coefficients
6. Ordinary Differential Equations

6.1 Introduction to ODEs

6.1.1 Some Basic Mathematical Models; Direction Fields

Definition 6.1.1 A differential equation is an equation containing derivatives.

Definition 6.1.2 A differential equation that describes some physical process is often called a mathematical model.

Example 6.1 (Falling Object)

(Free body diagram: the drag force γv points upward, the weight mg points downward, and the downward direction is taken as positive.)

Consider an object falling from the sky. From Newton’s Second Law we have

F = ma = m dv/dt.     (6.1)

When we consider the forces from the free body diagram we also have

F = mg − γv,     (6.2)

where γ is the drag coefficient. Combining the two,

m dv/dt = mg − γv.     (6.3)

Suppose m = 10 kg and γ = 2 kg/s. Then we have

dv/dt = 9.8 − v/5.     (6.4)

Figure 6.1: Direction field for the above example.

It looks like the direction field tends towards v = 49 m/s. We plot the direction field by plugging in the values for v and t and letting dv/dt be the slope of a short line at that point. ∎
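Following the local slope of the direction field step by step is exactly Euler’s method. The short sketch below (our own, not from the text) does this for the falling-object model and recovers the terminal velocity 49 m/s:

```python
# Explicit Euler steps for dv/dt = 9.8 - v/5 with v(0) = 0; the numerical
# solution settles at the terminal velocity suggested by the direction field.
def euler(f, t0, y0, h, steps):
    t, y = t0, y0
    for _ in range(steps):
        y += h * f(t, y)   # follow the local slope of the direction field
        t += h
    return y

v_end = euler(lambda t, v: 9.8 - v / 5.0, 0.0, 0.0, 0.01, 10000)  # up to t = 100 s
print(round(v_end, 3))  # 49.0
```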

Direction Fields are valuable tools in studying the solutions of differential equations of the form

dy/dt = f(t, y),     (6.5)

where f is a given function of the two variables t and y, sometimes referred to as a rate function. At each point on a grid, a short line is drawn whose slope is the value of f at that point. This technique provides a good picture of the overall behavior of a solution.

Two things to keep in mind:

1. In constructing a direction field we never have to solve the differential equation, only evaluate it at points.

2. This method is most useful if one has access to a computer, because a computer can generate the plots well.
Example 6.2 (Population Growth) Consider a population of field mice. Assuming there is nothing to eat the field mice, the population will grow at a constant rate. Denote time by t (in months) and the mouse population by p(t); then we can express the model as

dp/dt = rp,     (6.6)

where the proportionality factor r is called the rate constant or growth constant. Now suppose owls are killing mice (15 per day). The model becomes

dp/dt = 0.5p − 450;     (6.7)

note that we subtract 450 rather than 15 because time is measured in months. In general,

dp/dt = rp − k,     (6.8)

where the growth rate is r and the predation rate k is unspecified. Note the equilibrium solution would be p = k/r.

Definition 6.1.3 The equilibrium solution is the value of p(t) at which the system no longer changes, dp/dt = 0.

In this example solutions above equilibrium will increase, while solutions below will decrease. ∎

Figure 6.2: Direction field for the above example.

Steps to Constructing Mathematical Models:

1. Identify the independent and dependent variables and assign letters to represent them. Often the independent variable is time.
2. Choose the units of measurement for each variable.
3. Articulate the basic principle involved in the problem.
4. Express the principle in the variables chosen above.
5. Make sure each term has the same physical units.
6. We will be dealing with models in this chapter which are single differential equations.

Example 6.3 Draw the direction field for the following, describe the behavior of y as t → ∞, and describe the dependence on the initial value:

y' = 2y + 3.     (6.9)

Ans: For y > −1.5 the slopes are positive, and hence the solutions increase. For y < −1.5 the slopes are negative, and hence the solutions decrease. All solutions appear to diverge away from the equilibrium solution y(t) = −1.5. ∎

Example 6.4 Write down a DE of the form dy/dt = ay + b whose solutions have the required behavior as t → ∞: they must approach 2/3.

Answer: For solutions to approach the equilibrium solution y(t) = 2/3, we must have y' < 0 for y > 2/3, and y' > 0 for y < 2/3. The required rates are satisfied by the DE y' = 2 − 3y. ∎

Example 6.5 Find the direction field for y' = y(y − 3). ∎




Figure 6.3: Direction field for above example

6.1.2 Solutions of Some Differential Equations

Last Time: We derived two formulas:

m dv/dt = mg − γv     (Falling Bodies)     (6.10)
dp/dt = rp − k     (Population Growth)     (6.11)

Both equations have the form

dy/dt = ay − b.     (6.12)

Example 6.6 (Field Mice / Predator-Prey Model) Consider

dp/dt = 0.5p − 450;     (6.13)

we want to now solve this equation. Rewrite equation (6.13) as

dp/dt = (p − 900)/2.     (6.14)

Note p = 900 is an equilibrium solution, for which the system does not change. If p ≠ 900,

(dp/dt)/(p − 900) = 1/2.     (6.15)

By the Chain Rule we can rewrite this as

(d/dt) ln |p − 900| = 1/2.     (6.16)

So by integrating both sides we find

ln |p − 900| = t/2 + C.     (6.17)

Therefore,

p = 900 + Ce^{t/2}.     (6.18)

Thus we have infinitely many solutions, where a different arbitrary constant C produces a different solution. What if the initial population of mice was 850? How do we account for this? ∎
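A candidate solution like (6.18) can always be checked by substituting it back into the equation. The sketch below (pure Python, central differences) does this for the initial population 850, which forces C = −50:

```python
import math

# Check that p(t) = 900 + C e^{t/2} really solves p' = 0.5 p - 450; the
# initial population 850 gives C = 850 - 900 = -50.
C = 850 - 900
p = lambda t: 900 + C * math.exp(t / 2)

h = 1e-6
ok = all(
    abs((p(t + h) - p(t - h)) / (2 * h) - (0.5 * p(t) - 450)) < 1e-3
    for t in (0.0, 1.0, 3.0)
)
print(ok, round(p(0), 1))  # True 850.0
```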

Definition 6.1.4 The additional condition, p(0) = 850, that is used to determine C is an example of an initial condition.

Definition 6.1.5 The differential equation together with the initial condition form the initial value problem.

Consider the general problem

dy/dt = ay − b,     (6.19)
y(0) = y_0.     (6.20)

The solution has the form

y = (b/a) + [y_0 − (b/a)]e^{at}.     (6.21)

When a ≠ 0 this contains all possible solutions to the general equation and is thus called the general solution. The geometric representation of the general solution is an infinite family of curves called integral curves.

Example 6.7 (Dropping a ball) The system under consideration:

dv/dt = 9.8 − v/5,     (6.22)
v(0) = 0.     (6.23)

From the formula above (with a = −1/5, b = −9.8) we have

v = (−9.8)/(−1/5) + [0 − (−9.8)/(−1/5)]e^{−t/5},     (6.24)

and the general solution is

v = 49 + Ce^{−t/5},     (6.25)

with the IC giving C = −49. ∎

6.1.3 Classifications of Differential Equations

Last Time: We solved some basic differential equations, discussed IVPs, and defined the general solution.

Now we want to classify two main types of differential equations.

Definition 6.1.6 If the unknown function depends on a single independent variable and only ordinary derivatives appear, the equation is said to be an ordinary differential equation. Example:

y'(x) = xy.     (6.26)

Definition 6.1.7 If the unknown function depends on several variables, and the derivatives are partial derivatives, it is said to be a partial differential equation.

One can also have a system of differential equations:

dx/dt = ax − αxy,     (6.27)
dy/dt = −cy + γxy.     (6.28)

Note: Questions from this section are common on exams.

Definition 6.1.8 The order of a differential equation is the order of the highest derivative that appears in the equation.

Ex 1: y''' + 2e^t y'' + yy' = 0 has order 3.

Ex 2: y^{(4)} + (y')^2 + 4y''' = 0 has order 4. Look at derivatives, not powers.

Another way to classify equations is whether they are linear or nonlinear:

Definition 6.1.9 A differential equation F(t, y, y', y'', ..., y^{(n)}) = 0 is said to be linear if F is a linear function in the variables y, y', y'', ..., y^{(n)}, i.e. none of these terms is raised to a power or placed inside a sin or cos.

Example 6.8 a) y' + y = 2
b) y'' = 4y − 6
c) y^{(4)} + 3y' + sin(t)y = 0 ∎

Definition 6.1.10 An equation which is not linear is nonlinear.

Example 6.9 a) y' + t^4 y^2 = 0
b) y'' + sin(y) = 0
c) y^{(4)} − tan(y) + (y''')^3 = 0 ∎

Example 6.10 d^2θ/dt^2 + (g/L) sin(θ) = 0.

The above equation can be approximated by a linear equation if we replace sin(θ) by θ. This process is called linearization. ∎

Definition 6.1.11 A solution of an ODE on the interval α < t < β is a function φ that satisfies

φ^{(n)}(t) = f[t, φ(t), ..., φ^{(n−1)}(t)].     (6.29)

Common Questions:
1. (Existence) Does a solution exist? Not all Initial Value Problems (IVPs) have solutions.

2. (Uniqueness) If a solution exists, how many are there? There can be none, one, or infinitely many solutions to an IVP.

3. How can we find the solution(s) if they exist? This is the key question in this course. We will develop many methods for solving differential equations; the key will be to identify which method to use in which situation.
6.2 Separable Equations

In general, we want to solve first order differential equations of the form

dy/dt = f(t, y).     (6.30)

We begin with equations that have a special form, referred to as separable equations:

dy/dx = f(y)g(x).     (6.31)

The General Solution Method:

Step 1: (Separate)  (1/f(y)) dy = g(x) dx     (6.32)
Step 2: (Integrate)  ∫ (1/f(y)) dy = ∫ g(x) dx     (6.33)
Step 3: (Solve for y)  F(y) = G(x) + c     (6.34)

Note we only need a constant of integration on one side; we could just combine the constants we get on each side. Also, we only solve for y if it is possible; if not, leave the solution in implicit form.

Definition 6.2.1 An equilibrium solution is a value of y which makes dy/dx = 0; y remains this constant forever.

Example 6.11 (Newton’s Law of Cooling) Consider the ODE, where E is a constant:

dB/dt = κ(E − B),     (6.35)

with initial condition (IC) B(0) = B_0. This is separable:

∫ dB/(E − B) = ∫ κ dt     (6.36)
−ln |E − B| = κt + c     (6.37)
E − B = e^{−κt+c} = Ae^{−κt}     (6.38)
B(t) = E − Ae^{−κt}     (6.39)
B(0) = E − A     (6.40)
A = E − B_0     (6.41)
B(t) = E − (E − B_0)e^{−κt}.     (6.42)
∎
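A quick numerical check of the cooling solution is shown below; the values of E, B_0, and κ are arbitrary choices of ours, not from the text:

```python
import math

# Verify B(t) = E - (E - B0) e^{-kappa t} against dB/dt = kappa (E - B).
E, B0, kappa = 20.0, 90.0, 0.3          # hypothetical temperatures and rate
B = lambda t: E - (E - B0) * math.exp(-kappa * t)

h = 1e-6
for t in (0.0, 2.0, 10.0):
    lhs = (B(t + h) - B(t - h)) / (2 * h)   # numerical B'(t)
    rhs = kappa * (E - B(t))
    assert abs(lhs - rhs) < 1e-4
print(round(B(0.0), 1), round(B(50.0), 1))  # 90.0 20.0  (decays toward E)
```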
Example 6.12

dy/dx = 6y^2 x,  y(1) = 1/3.     (6.43)

Separate and solve:

∫ dy/y^2 = ∫ 6x dx     (6.44)
−1/y = 3x^2 + c     (6.45)
y(1) = 1/3     (6.46)
−3 = 3(1) + c ⇒ c = −6     (6.47)
−1/y = 3x^2 − 6     (6.48)
y(x) = 1/(6 − 3x^2)     (6.49)

What is the interval of validity for this solution? There is a problem when 6 − 3x^2 = 0, i.e. when x = ±√2. So the possible intervals of validity are (−∞, −√2), (−√2, √2), (√2, ∞). We want to choose the one containing the initial value x = 1, so the interval of validity is (−√2, √2). ∎
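Substituting the answer back into the ODE is a habit worth forming; the sketch below does it numerically at a point inside the interval of validity (the helper is ours, not from the text):

```python
# y(x) = 1/(6 - 3x^2) should satisfy y' = 6 x y^2 and y(1) = 1/3.
y = lambda x: 1.0 / (6.0 - 3.0 * x**2)

h = 1e-6
x = 0.5                                  # any point inside (-sqrt(2), sqrt(2))
lhs = (y(x + h) - y(x - h)) / (2 * h)    # numerical y'(x)
rhs = 6.0 * x * y(x)**2
print(abs(y(1.0) - 1.0 / 3.0) < 1e-12, abs(lhs - rhs) < 1e-6)  # True True
```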

Example 6.13

y' = (3x^2 + 2x − 4)/(2y − 2),  y(1) = 3.     (6.50)

There are no equilibrium solutions.

∫ (2y − 2) dy = ∫ (3x^2 + 2x − 4) dx     (6.51)
y^2 − 2y = x^3 + x^2 − 4x + c     (6.52)
y(1) = 3 ⇒ c = 5     (6.53)
y^2 − 2y + 1 = x^3 + x^2 − 4x + 6     (Complete the Square)     (6.54)
(y − 1)^2 = x^3 + x^2 − 4x + 6     (6.55)
y(x) = 1 ± √(x^3 + x^2 − 4x + 6)     (6.56)

There are two solutions; we must choose the appropriate one. Use the IC to determine that only the positive root is correct:

y(x) = 1 + √(x^3 + x^2 − 4x + 6).     (6.57)

We need the term under the square root to be non-negative, so the interval of validity is the set of x where x^3 + x^2 − 4x + 6 ≥ 0. Note x = 1 is in this set, so the IC is in the interval of validity. ∎

Example 6.14

dy/dx = xy^3/(1 + x^2),  y(0) = 1.     (6.58)

There is one equilibrium solution, y(x) = 0, which is not our case (since it does not meet the IC). So separate:

∫ dy/y^3 = ∫ x/(1 + x^2) dx     (6.59)
−1/(2y^2) = ½ ln(1 + x^2) + c     (6.60)
y(0) = 1 ⇒ c = −1/2     (6.61)
y^2 = 1/(1 − ln(1 + x^2))     (6.62)
y(x) = 1/√(1 − ln(1 + x^2))     (6.63)

Determine the interval of validity. We need

ln(1 + x^2) < 1 ⇒ x^2 < e − 1,     (6.64)

so the interval of validity is −√(e − 1) < x < √(e − 1). ∎

Example 6.15

dy/dx = (y − 1)/(x^2 + 1).     (6.65)

The equilibrium solution is y(x) = 1 and our IC is y(0) = 1, so in this case the solution is the constant function y(x) = 1. ∎

Example 6.16 (Review IBP)

dy/dt = e^{y−t} sec(y)(1 + t^2),  y(0) = 0.     (6.66)

Separate by rewriting, and use Integration By Parts (IBP):

dy/dt = (e^y e^{−t}/cos(y))(1 + t^2)     (6.67)
∫ e^{−y} cos(y) dy = ∫ e^{−t}(1 + t^2) dt     (6.68)
(e^{−y}/2)(sin(y) − cos(y)) = −e^{−t}(t^2 + 2t + 3) + 5/2     (6.69)

We won’t be able to find an explicit solution, so leave it in implicit form. In implicit form it is difficult to find the interval of validity, so we will stop here. ∎

Example 6.17 Solve the differential equation xy' = y + 1.

Solution: Separate variables:

y'/(y + 1) = 1/x
Write y' = dy/dx and rearrange:  dy/(y + 1) = dx/x
Integrate both sides:  ∫ 1/(y + 1) dy = ∫ 1/x dx
Simplify:  ln(y + 1) = ln(x) + C
Let C = ln(a) for a constant a:  ln(y + 1) = ln(x) + ln(a) = ln(ax)
Exponentiate both sides:  y + 1 = ax.

Thus, we have a family of solutions y = ax − 1, one curve for each value of the constant a. This is more commonly referred to as the general solution. Finding a particular solution means choosing a value of a so that only one curve remains. ∎

6.3 Linear First-Order Equations, Method of Integrating Factors

Consider the general equation

dy/dt + p(t)y = g(t).     (6.70)

We saw earlier that if p(t) and g(t) are constants we can solve the equation explicitly. Unfortunately this is not the case when they are not constants. We need the method of integrating factors, developed by Leibniz (who also invented calculus): we multiply (6.70) by a certain function µ(t), chosen so that the resulting equation is integrable. µ(t) is called the integrating factor, and the challenge of this method is finding it.

Summary of Method:
1. Rewrite the equation as (IT MUST BE IN THIS FORM)

y' + ay = f.     (6.71)

2. Find an integrating factor, which is any function

µ(t) = e^{∫ a(t) dt}.     (6.72)

3. Multiply both sides of (6.71) by the integrating factor:

µ(t)y' + aµ(t)y = f µ(t).     (6.73)

4. Rewrite the left side as a derivative:

(µy)' = µf.     (6.74)

5. Integrate both sides to obtain

µ(t)y(t) = ∫ µ(t)f(t) dt + C,     (6.75)

and thus

y(t) = (1/µ(t)) ∫ µ(t)f(t) dt + C/µ(t).     (6.76)

Now let’s see some examples.
Example 6.18 Find the general solution of

y' = y + e^{−t}.     (6.77)

Step 1:

y' − y = e^{−t}.     (6.78)

Step 2:

µ(t) = e^{∫ −1 dt} = e^{−t}.     (6.79)

Step 3:

e^{−t}(y' − y) = e^{−2t}.     (6.80)

Step 4:

(e^{−t}y)' = e^{−2t}.     (6.81)

Step 5:

e^{−t}y = ∫ e^{−2t} dt = −½e^{−2t} + C.     (6.82)

Solve for y:

y(t) = −½e^{−t} + Ce^{t}.     (6.83)
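As with separable equations, the answer can be checked by substitution. The sketch below verifies the general solution numerically for one (arbitrary) choice of the constant:

```python
import math

# The general solution y(t) = -e^{-t}/2 + C e^{t} of y' = y + e^{-t},
# checked by central differences; C = 2 is an arbitrary choice.
C = 2.0
y = lambda t: -0.5 * math.exp(-t) + C * math.exp(t)

h = 1e-6
for t in (0.0, 0.7, 1.5):
    lhs = (y(t + h) - y(t - h)) / (2 * h)   # numerical y'
    rhs = y(t) + math.exp(-t)
    assert abs(lhs - rhs) < 1e-4
print("general solution verified for C =", C)
```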


Example 6.19 Find the solution of

y' = y sin t + 2te^{−cos t},     (6.84)

with y(0) = 1.
Step 1:

y' − y sin t = 2te^{−cos t}.     (6.85)

Step 2:

µ(t) = e^{∫ −sin t dt} = e^{cos t}.     (6.86)

Step 3:

e^{cos t}(y' − y sin t) = 2t.     (6.87)

Step 4:

(e^{cos t}y)' = 2t.     (6.88)

Step 5:

e^{cos t}y = t^2 + C.     (6.89)

So the general solution is

y(t) = (t^2 + C)e^{−cos t}.     (6.90)

With the IC, C = e, so

y(t) = (t^2 + e)e^{−cos t}.     (6.91)
∎

Example 6.20 Find the general solution of

y' = y tan t + sin t,     (6.92)

with y(0) = 2. Note the integrating factor is

µ(t) = e^{∫ −tan t dt} = e^{ln(cos t)} = cos t.     (6.93)

Final answer:

y(t) = −(cos t)/2 + 5/(2 cos t).     (6.94)
∎
Example 6.21 Solve

2y' + ty = 2,     (6.95)

with y(0) = 1. The integrating factor is

µ(t) = e^{t^2/4}.     (6.96)

Final answer:

y(t) = e^{−t^2/4} ∫_0^t e^{s^2/4} ds + e^{−t^2/4}.     (6.97)

6.3.1 REVIEW: Integration By Parts

This is the most important integration technique learned in Calculus 2. We will derive the method. Consider the product rule for two functions of t:

(d/dt)(uv) = u (dv/dt) + v (du/dt).     (6.98)

Integrate both sides from a to b:

[uv]_a^b = ∫_a^b u (dv/dt) dt + ∫_a^b v (du/dt) dt.     (6.99)

Rearrange the resulting terms:

∫_a^b u (dv/dt) dt = [uv]_a^b − ∫_a^b v (du/dt) dt.     (6.100)

Practicing this many times will be helpful on the homework. Consider two examples.
Example 6.22 Find the integral ∫_1^9 ln(t) dt. First define u, du, dv, and v:

u = ln(t),  dv = dt     (6.101)
du = (1/t) dt,  v = t     (6.102)

Thus

∫_1^9 ln(t) dt = [t ln(t)]_1^9 − ∫_1^9 1 dt     (6.103)
  = 9 ln(9) − [t]_1^9     (6.104)
  = 9 ln(9) − 9 + 1     (6.105)
  = 9 ln(9) − 8.     (6.106)
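A numerical quadrature gives a quick independent check of this IBP result (the Simpson helper below is ours, not from the text):

```python
import math

# Compare the IBP answer 9 ln 9 - 8 with a direct numerical quadrature of
# the integral of ln t from 1 to 9 (composite Simpson's rule; n must be even).
def simpson(f, a, b, n=1000):
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3

numeric = simpson(math.log, 1.0, 9.0)
exact = 9 * math.log(9) - 8
print(abs(numeric - exact) < 1e-6)  # True
```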


Example 6.23 Find the integral ∫ e^x cos(x) dx. First define u, du, dv, and v:

u = cos(x),  dv = e^x dx     (6.107)
du = −sin(x) dx,  v = e^x     (6.108)

Thus

∫ e^x cos(x) dx = e^x cos(x) + ∫ e^x sin(x) dx.     (6.109)

Do Integration By Parts again:

u = sin(x),  dv = e^x dx     (6.110)
du = cos(x) dx,  v = e^x     (6.111)

So

∫ e^x cos(x) dx = e^x cos(x) + ∫ e^x sin(x) dx     (6.112)
  = e^x cos(x) + e^x sin(x) − ∫ e^x cos(x) dx     (6.113)
2 ∫ e^x cos(x) dx = e^x (cos(x) + sin(x))     (6.114)
∫ e^x cos(x) dx = ½ e^x (cos(x) + sin(x)) + C.     (6.115)
∎

Notice that when we do not have limits of integration we need to include the arbitrary constant of integration C.

6.3.2 Modeling With First Order Equations


Last Time: We solved separable ODEs and now we want to look at some applications to real world
situations

There are two key questions to keep in mind throughout this section:
1. How do we write a differential equation to model a given situation?
2. What can the solution tell us about that situation?

 Example 6.24 (Radioactive Decay)

dN
= −λ N(t), (6.116)
dt
182 Chapter 6. Ordinary Differential Equations

where N(t) is the number of atoms of a radioactive isotope and λ > 0 is the decay constant. The
equation is separable, and if the initial data is N(0) = N0 , the solution is

N(t) = N0 e−λt . (6.117)

so we can see that radioactive decay is exponential. 

 Example 6.25 (Newton’s Law of Cooling) If we immerse a body in an environment with a


constant temperature E, then if B(t) is the temperature of the body we have

dB
= κ(E − B), (6.118)
dt
where κ > 0 is a constant related to the material of the body and how it conducts heat. This equation
is separable. We solved it before with the initial condition B(0) = B0 to get

E − B0
B(t) = E − . (6.119)
eκt


Approaches to writing down a model describing a situation:


1. Remember the derivative is the rate of change. It’s possible that the description of the problem
tells us directly what the rate of change is. Newton’s Law of Cooling tells us the rate of change of
the body’s temperature was proportional to the difference in temperature between the body and the
environment. All we had to do was set the relevant terms equal.

2. There are also cases where we are not explicitly given the formula for the rate of change.
But we may be able to use the physical description to define the rate of change and then set the
derivative equal to that. Note: The derivative = increase - decrease. This type of thinking is only
applicable to first order equations since higher order equations are not formulated as rate of change
equals something.

3. We may just be adapting a known differential equation to a particular situation, e.g. Newton's Second Law F = ma. It is either a first or second order equation depending on whether we write it in terms of velocity or position. Combine all forces and substitute the result for F to yield the differential equation. This approach is used for falling bodies, harmonic motion, and pendulums.

4. The last possibility is to determine two different expressions for the same quantity and set them equal to derive a differential equation. This will be useful when we discuss PDEs later in the course.

The first thing one must do when approaching a modeling problem is to determine which of the four situations we are in. It is crucial to practice this identification now, as it will be useful on exams and in later sections. Second, your differential equation should not depend on the initial condition: the IC only gives the starting state and should not affect how the system evolves.

Type I: (Interest)

Suppose there is a bank account that gives r% interest per year. If I withdraw a constant w
dollars per month, what is the differential equation modeling this?

Ans: Let t be time in years, and denote the balance after t years as B(t). B'(t) is the rate of
change of my account balance from year to year, so it will be the difference between the amount

added and the amount withdrawn. The amount added is interest and the amount withdrawn is 12w.
Thus
$$B'(t) = \frac{r}{100}B(t) - 12w \qquad (6.120)$$
This is a linear equation, so we can solve by integrating factor. Note: UNITS ARE IMPORTANT,
w is withdrawn each month, but 12w is withdrawn per year.
 Example 6.26 Bill wants to take out a 25 year loan to buy a house. He knows that he can afford

maximum monthly payments of $400. If the going interest rate on housing loans is 4%, what is the
largest loan Bill can take out so that he will be able to pay it off in time?

Ans: Measure time t in years. The amount Bill owes will be B(t). We want B(25) = 0. The
4% interest rate will take the form of .04B added. He can make payments of 12 × 400 = 4800 each
year. So the IVP will be

$$B'(t) = 0.04B(t) - 4800, \quad B(25) = 0 \qquad (6.121)$$

This is a linear equation; write it in standard form and use an integrating factor:

$$B'(t) - 0.04B(t) = -4800 \qquad (6.122)$$
$$\mu(t) = e^{\int -0.04\,dt} = e^{-0.04t} \qquad (6.123)$$
$$\left(e^{-0.04t}B(t)\right)' = -4800e^{-0.04t} \qquad (6.124)$$
$$e^{-0.04t}B(t) = \int -4800e^{-0.04t}\,dt = 120000e^{-0.04t} + c \qquad (6.125)$$
$$B(t) = 120000 + ce^{0.04t} \qquad (6.126)$$
$$B(25) = 0 = 120000 + ce^{1} \;\Rightarrow\; c = -120000e^{-1} \qquad (6.127)$$
$$B(t) = 120000 - 120000e^{0.04(t-25)} \qquad (6.128)$$

We want the size of the loan, which is the amount Bill begins with B(0):

$$B(0) = 120000 - 120000e^{-1} = 120000(1 - e^{-1}) \qquad (6.129)$$
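A numerical cross-check of this answer (a sketch; the function and variable names are mine): the candidate solution should satisfy B' = 0.04B − 4800 with B(25) = 0, and B(0) then gives the largest affordable loan:

```python
import math

r, payments = 0.04, 4800.0         # 4% interest, $400/month = $4800/year

def B(t):
    # Candidate solution B(t) = 120000 - 120000*exp(0.04*(t - 25)).
    return 120000.0 - 120000.0 * math.exp(r * (t - 25.0))

# ODE residual at a sample time via a central finite difference.
h = 1e-6
t0 = 10.0
residual = abs((B(t0 + h) - B(t0 - h)) / (2 * h) - (r * B(t0) - payments))

loan_size = B(0.0)                 # = 120000*(1 - exp(-1)), roughly $75,854
```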

Type II: (Mixing Problems)

[Figure: a mixing tank with an inflow ("In") at the top and an outflow ("Out") at the bottom.]
We have a mixing tank containing some liquid inside. Contaminant is being added to the tank at
some constant rate and the mixed solution is drained out at a (possibly different) rate. We will want
to find the amount of contaminant in the tank at a given time.
How do we write the DE to model this process? Let P(t) be the amount of pollutant (Note: Amount

of pollutant, not the concentration) in the tank at time t. We know the amount of pollutant that is
entering and leaving the tank each unit of time. So we can use the second approach

Rate of change of P(t) = Rate of entry of contaminant − Rate of exit of contaminant (6.130)

The rate of entry can arise in different ways: (1) contaminant added directly, e.g. a pipe adding food coloring to water; (2) solution with a known concentration of contaminant added to the tank (amount = concentration × volume).
What is the rate of exit? Suppose that we are draining the tank at a rate of rout . The amount of
contaminant leaving the tank will be the amount contained in the drained solution, that is given
by rate x concentration. We know the rate, and we need the concentration. This will just be the
concentration of the solution in the tank, which is in turn given by the amount of contaminant in
the tank divided by the volume.

$$\text{Rate of exit of contaminant} = \text{rate of drained solution} \times \frac{\text{amount of contaminant}}{\text{volume of tank}} \qquad (6.131)$$
or
$$\text{Rate of exit of contaminant} = r_{out}\,\frac{P(t)}{V(t)}. \qquad (6.132)$$

What is V(t)? The volume decreases at the rate r_out. Is anything being added to the volume? That depends: if we are adding solution to the tank at some rate r_in, that will add to the in-tank volume; if we directly add contaminant not in solution, nothing is added. Determine which situation applies by reading the problem. In the first case, if the initial volume is V_0, we get V(t) = V_0 + t(r_in − r_out); in the second, V(t) = V_0 − t r_out.
 Example 6.27 Suppose a 120 gallon well-mixed tank initially contains 90 lbs. of salt mixed

with 90 gal. of water. Salt water (with a concentration of 2 lb/gal) comes into the tank at a rate of
4 gal/min. The solution flows out of the tank at a rate of 3 gal/min. How much salt is in the tank
when it is full?

Ans: We can immediately write down the expression for volume V (t). How much liquid is
entering each minute? 4 gallons. How much is leaving the tank in the same minute? 3 gallons. So
each minute the Volume increases by 1 gallon, and we have V (t) = 90 + (4 − 3)t = 90 + t. This
tells us the tank will be full at t = 30.
We let P(t) be the amount of salt (in pounds) in the tank at time t. Ultimately, we want to determine
P(30), since this is when the tank will be full. We need to determine the rates at which salt is
entering and leaving the tank. How much salt is entering? 4 gallons of salt water enter the tank
each minute, and each of those gallons has 2lb. of salt dissolved in it. Hence we are adding 8 lbs.
of salt to the tank each minute. How much is exiting the tank? 3 gallons leave each minute, and the
concentration in each of those gallons is P(t)/V (t). Recall

$$\text{Rate of change of } P(t) = \text{rate of entry of contaminant} - \text{rate of exit of contaminant} \qquad (6.133)$$
$$\text{Rate of exit of contaminant} = \text{rate of drained solution} \times \frac{\text{amount of contaminant}}{\text{volume of tank}} \qquad (6.134)$$
$$\frac{dP}{dt} = (4\ \text{gal/min})(2\ \text{lb/gal}) - (3\ \text{gal/min})\,\frac{P(t)\ \text{lb}}{V(t)\ \text{gal}} = 8 - \frac{3P(t)}{90+t} \qquad (6.135)$$

This is the ODE for the salt in the tank, what is the IC? P(0) = 90 as given by the problem. Now we
have an IVP so solve (since linear) using integrating factor
$$\frac{dP}{dt} + \frac{3}{90+t}P(t) = 8 \qquad (6.136)$$
$$\mu(t) = e^{\int \frac{3}{90+t}\,dt} = e^{3\ln(90+t)} = (90+t)^3 \qquad (6.137)$$
$$\left((90+t)^3 P(t)\right)' = 8(90+t)^3 \qquad (6.138)$$
$$(90+t)^3 P(t) = \int 8(90+t)^3\,dt = 2(90+t)^4 + c \qquad (6.139)$$
$$P(t) = 2(90+t) + \frac{c}{(90+t)^3} \qquad (6.140)$$
$$P(0) = 90 = 2(90) + \frac{c}{90^3} \;\Rightarrow\; c = -90^4 \qquad (6.141)$$
$$P(t) = 2(90+t) - \frac{90^4}{(90+t)^3} \qquad (6.142)$$
Remember we wanted P(30) which is the amount of salt when the tank is full. So

$$P(30) = 240 - \frac{90^4}{120^3} = 240 - 90\left(\frac{3}{4}\right)^3 = 240 - 90\left(\frac{27}{64}\right) \approx 202.03. \qquad (6.143)$$
We could ask for the amount of salt at any time before overflow; everything would be the same except the last step, where we would replace 30 with the desired time. 
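The closed-form answer above can be cross-checked by brute-force time stepping (a sketch; the RK4 helper and the step count are arbitrary choices, not from the text):

```python
def f(t, P):
    # dP/dt = 8 - 3P/(90+t) from Example 6.27.
    return 8.0 - 3.0 * P / (90.0 + t)

def rk4(P, t_end, n):
    # Classical 4th-order Runge-Kutta from t = 0 to t = t_end in n steps.
    t, h = 0.0, t_end / n
    for _ in range(n):
        k1 = f(t, P)
        k2 = f(t + h / 2, P + h / 2 * k1)
        k3 = f(t + h / 2, P + h / 2 * k2)
        k4 = f(t + h, P + h * k3)
        P += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return P

P_numeric = rk4(90.0, 30.0, 3000)                         # P(0) = 90, run to t = 30
P_exact = 2 * (90.0 + 30.0) - 90.0**4 / (90.0 + 30.0)**3  # = 240 - 90*(27/64)
```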

Exercise: What is the concentration of the tank when the tank is full?
 Example 6.28 A full 20 liter tank has 30 grams of yellow food coloring dissolved in it. If a

yellow food coloring solution (with concentration of 2 grams/liter) is piped into the tank at a rate of
3 liters/minute while the well mixed solution is drained out of the tank at a rate of 3 liters/minute,
what is the limiting concentration of yellow food coloring solution in the tank?

Ans: The ODE would be


$$\frac{dP}{dt} = (3\ \text{L/min})(2\ \text{g/L}) - (3\ \text{L/min})\,\frac{P(t)\ \text{g}}{V(t)\ \text{L}} = 6 - \frac{3P}{20} \qquad (6.144)$$
Note that volume is constant since we are adding and removing the same amount at each time step.
Use the method of integrating factor.
$$\mu(t) = e^{\int \frac{3}{20}\,dt} = e^{\frac{3}{20}t} \qquad (6.145)$$
$$\left(e^{\frac{3}{20}t}P(t)\right)' = 6e^{\frac{3}{20}t} \qquad (6.146)$$
$$e^{\frac{3}{20}t}P(t) = \int 6e^{\frac{3}{20}t}\,dt = 40e^{\frac{3}{20}t} + c \qquad (6.147)$$
$$P(t) = 40 + ce^{-\frac{3}{20}t} \qquad (6.148)$$
$$P(0) = 30 = 40 + c \;\Rightarrow\; c = -10 \qquad (6.149)$$
$$P(t) = 40 - 10e^{-\frac{3}{20}t}. \qquad (6.150)$$
Now what will happen to the concentration in the limit, or as t → ∞. We know the volume will
always be 20 liters.
$$\lim_{t\to\infty} \frac{P(t)}{V(t)} = \lim_{t\to\infty} \frac{40 - 10e^{-\frac{3}{20}t}}{20} = 2 \qquad (6.151)$$

So the limiting concentration is 2g/L. Why does this make physical sense? After a period of time
the concentration of the mixture will be exactly the same as the concentration of the incoming
solution. It turns out that the same process will work if the concentration of the incoming solution
is variable. 
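With the stated initial amount of 30 g, the solution of dP/dt = 6 − 3P/20 is P(t) = 40 − 10e^{−3t/20}, and the concentration P/V tends to the inflow concentration of 2 g/L. A quick numerical check (a sketch; the helper names are mine):

```python
import math

V = 20.0                           # constant tank volume (L)

def P(t):
    # Solution of dP/dt = 6 - 3P/20 with P(0) = 30 g.
    return 40.0 - 10.0 * math.exp(-3.0 * t / 20.0)

# ODE residual at a sample time, and the long-time concentration.
h = 1e-6
residual = abs((P(2 + h) - P(2 - h)) / (2 * h) - (6.0 - 3.0 * P(2) / 20.0))
conc_late = P(200.0) / V
```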

 Example 6.29 A 150 gallon tank has 60 gallons of water with 5 pounds of salt dissolved in it.

Water with a concentration of 2 + cos(t) lbs/gal comes into the tank at a rate of 9 gal/hr. If the well
mixed solution leaves the tank at a rate of 6 gal/hour, how much salt is in the tank when it overflows?

Ans: The only difference is that the incoming concentration is variable. The volume starts at 60 gal and increases at a rate of 9 − 6 = 3 gal/hr, so V(t) = 60 + 3t.
$$\frac{dP}{dt} = 9(2 + \cos(t)) - \frac{6P}{60+3t} \qquad (6.152)$$
Our IC is P(0) = 5 and use the method of integrating factor
$$\mu(t) = e^{\int \frac{6}{60+3t}\,dt} = e^{2\ln(20+t)} = (20+t)^2 \qquad (6.153)$$
$$\left((20+t)^2 P(t)\right)' = 9(2+\cos(t))(20+t)^2 \qquad (6.154)$$
$$(20+t)^2 P(t) = \int 9(2+\cos(t))(20+t)^2\,dt \qquad (6.155)$$
$$= 9\left(\tfrac{2}{3}(20+t)^3 + (20+t)^2\sin(t) + 2(20+t)\cos(t) - 2\sin(t)\right) + c \qquad (6.156)$$
$$P(t) = 9\left(\tfrac{2}{3}(20+t) + \sin(t) + \frac{2\cos(t)}{20+t} - \frac{2\sin(t)}{(20+t)^2}\right) + \frac{c}{(20+t)^2} \qquad (6.157)$$
$$P(0) = 5 = 9\left(\tfrac{2}{3}(20) + \tfrac{2}{20}\right) + \frac{c}{400} = 120 + \frac{9}{10} + \frac{c}{400} \qquad (6.158)$$
$$c = -46360 \qquad (6.159)$$

We want to know how much salt is in the tank when it overflows. This happens when the volume
hits 150, or at t = 30.
$$P(30) = 300 + 9\sin(30) + \frac{18\cos(30)}{50} - \frac{18\sin(30)}{2500} - \frac{46360}{2500} \qquad (6.160)$$
So P(30) ≈ 272.63 pounds. 
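Because the algebra in this example is easy to get wrong, it is worth cross-checking P(30) numerically (a sketch; the RK4 helper and step count are arbitrary choices):

```python
import math

def f(t, P):
    # dP/dt = 9*(2 + cos t) - 6P/(60 + 3t) from Example 6.29.
    return 9.0 * (2.0 + math.cos(t)) - 6.0 * P / (60.0 + 3.0 * t)

def rk4(P, t_end, n):
    # Classical 4th-order Runge-Kutta from t = 0 to t = t_end in n steps.
    t, h = 0.0, t_end / n
    for _ in range(n):
        k1 = f(t, P)
        k2 = f(t + h / 2, P + h / 2 * k1)
        k3 = f(t + h / 2, P + h / 2 * k2)
        k4 = f(t + h, P + h * k3)
        P += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return P

P_overflow = rk4(5.0, 30.0, 5000)  # P(0) = 5 lbs, tank overflows at t = 30
```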

We could make the problem more complicated by assuming that there will be a change in the
situation if the solution ever reached a critical concentration. The process would still be the same,
we would just need to solve two different but limited IVPs.

Type III: (Falling Bodies)


Lets consider an object falling to the ground. This body will obey Newton’s Second Law of Motion,

$$m\frac{dv}{dt} = F(t, v) \qquad (6.161)$$
where m is the object’s mass and F is the net force acting on the body. We will look at the situation
where the only forces are air resistance and gravity. It is crucial to be careful with the signs.
Throughout this course downward displacements and forces are positive. Hence the force due
to gravity is given by FG = mg, where g ≈ 10 m/s² is the acceleration due to gravity.
Air resistance acts against velocity: if the object is moving up, air resistance acts downward, always in the opposite direction. We will assume air resistance depends linearly on velocity (i.e.

FA = −αv, where FA is the force due to air resistance). This is not realistic, but it simplifies the problem. So F(t, v) = FG + FA = mg − αv, and our ODE is
$$m\frac{dv}{dt} = 10m - \alpha v \qquad (6.162)$$
 Example 6.30 A 50 kg object is shot from a cannon straight up with an initial velocity of 10 m/s
off the very tip of a bridge. If the air resistance is given by 5v, determine the velocity of the mass at
any time t and compute the rock’s terminal velocity.

Ans: Two parts: 1. When the object is moving upwards and 2. When the object is moving
downwards. If we look at the forces it turns out we get the same DE

$$50v' = 500 - 5v \qquad (6.163)$$

The IC is v(0) = −10, since we shot the object upwards. Our DE is linear and we can use integrating
factor
$$v' + \frac{1}{10}v = 10 \qquad (6.164)$$
$$\mu(t) = e^{t/10} \qquad (6.165)$$
$$\left(e^{t/10}v(t)\right)' = 10e^{t/10} \qquad (6.166)$$
$$e^{t/10}v(t) = \int 10e^{t/10}\,dt = 100e^{t/10} + c \qquad (6.167)$$
$$v(t) = 100 + ce^{-t/10} \qquad (6.168)$$
$$v(0) = -10 = 100 + c \;\Rightarrow\; c = -110 \qquad (6.169)$$
$$v(t) = 100 - 110e^{-t/10}. \qquad (6.170)$$
What is the terminal velocity of the rock? The terminal velocity is given by the limit of the velocity
as t → ∞, which is 100. We could also have computed the velocity of the rock when it hit the
ground if we knew the height of the bridge (integrate to get position). 
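A quick check of this example (a sketch; helper names are mine): v(t) = 100 − 110e^{−t/10} should satisfy 50v' = 500 − 5v, start at −10 m/s, and level off at the terminal velocity of 100 m/s:

```python
import math

def v(t):
    # Candidate solution of 50 v' = 500 - 5v with v(0) = -10 (downward positive).
    return 100.0 - 110.0 * math.exp(-t / 10.0)

h = 1e-6
t0 = 3.0
residual = abs(50.0 * (v(t0 + h) - v(t0 - h)) / (2 * h) - (500.0 - 5.0 * v(t0)))
v_terminal = v(300.0)              # effectively the t -> infinity limit
```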

 Example 6.31 A 60kg skydiver jumps out of a plane with no initial velocity. Assuming the
magnitude of air resistance is given by 0.8|v|, what is the appropriate initial value problem modeling
his velocity?

Ans: Air Resistance is an upward force, while gravity is acting downward. So our force should be

F(t, v) = mg − .8v (6.171)

thus our IVP is

$$60v' = 60g - 0.8v, \quad v(0) = 0 \qquad (6.172)$$

6.4 Existence and Uniqueness


Last Time: We developed 1st Order ODE models for physical systems and solved them using the
methods of Integrating Factor and Separable Equations.

In Section 1.3 we noted three common questions we would be concerned with this semester.

1. (Existence) Given an IVP, does a solution exist?


2. (Uniqueness) If a solution exists, is it unique?
3. If a solution exists, how do we find it?

We have spent a lot of time developing methods; now we will spend time on the first two questions. Without solving an IVP, what information can we derive about the existence and uniqueness of solutions? We will also note strong differences between linear and nonlinear equations.

6.4.1 Linear Equations


While we will focus on first order linear equations, the same basic ideas work for higher order
linear equations.

Theorem 6.4.1 (Fundamental Theorem of Existence and Uniqueness for Linear Equations)
Consider the IVP

$$y' + p(t)y = q(t), \quad y(t_0) = y_0. \qquad (6.173)$$

If p(t) and q(t) are continuous functions on an open interval α < t0 < β , then there exists a
unique solution to the IVP defined on the interval (α, β ).

R The same result holds for general IVPs. If we have the IVP

$$y^{(n)} + a_{n-1}(t)y^{(n-1)} + \cdots + a_1(t)y' + a_0(t)y = g(t), \quad y(t_0) = y_0, \;\ldots,\; y^{(n-1)}(t_0) = y_0^{(n-1)} \qquad (6.174)$$

then if ai (t) (for i = 0, ..., n − 1) and g(t) are continuous on an open interval α < t0 < β , there
exists a unique solution to the IVP defined on the interval (α, β ).

What does Theorem 1 tell us?


(1) If the given linear differential equation is nice, not only does a solution exist, but EXACTLY ONE solution exists. In most applications, knowing a solution is unique is more important than knowing a solution exists.
(2) If the interval (α, β) is the largest interval on which p(t) and q(t) are continuous, then (α, β) is the interval of validity of the unique solution guaranteed by the theorem. Thus, given a "nice" IVP there is no need to solve the equation to find the interval of validity. The interval depends only on t0, since the interval must contain it, but does not depend on y0.

 Example 6.32 Without solving, determine the interval of validity for the solution to the following
IVP

$$(t^2 - 9)y' + 2y = \ln|20 - 4t|, \quad y(4) = -3 \qquad (6.175)$$

Ans: If we look at Theorem 1, we need to write our equation in the form given there (i.e., the coefficient of y' is 1). So rewrite as
$$y' + \frac{2}{t^2 - 9}\,y = \frac{\ln|20 - 4t|}{t^2 - 9} \qquad (6.176)$$
Next we identify where either of the two other coefficients are discontinuous. By removing those
points we find all intervals of validity. Then the last step is to identify which interval of validity
contains t0 .

Using the notation in Theorem 1, p(t) is discontinuous when t = ±3, since at those points we would be dividing by zero. q(t) is discontinuous at t = 5, since ln|20 − 4t| is undefined there (the natural log is only defined on (0, ∞)). This yields four intervals of validity where both p(t) and q(t) are continuous:

(−∞, −3), (−3, 3), (3, 5), (5, ∞) (6.177)

Notice the endpoints are where p(t) and q(t) are discontinuous, guaranteeing within each interval
both are continuous. Now all that is left is to identify which interval contains t0 = 4. Thus our
interval of validity is (3, 5). 

R The other intervals of validity we found are intervals of validity for the same differential
equation, but for different initial conditions. For example, if our IC was y(2) = 5 then the
interval of validity must contain 2, so the answer would be (−3, 3).

What happens if our IC is at one of the bad points where p(t) and q(t) are discontinuous? Unfortunately we are unable to conclude anything, since the theorem does not apply. On the other hand, we cannot say that a solution does not exist just because the hypotheses are not met, so the bottom line is that we cannot conclude anything.

6.4.2 Nonlinear Equations


We saw that in the linear case every "nice enough" equation has a unique solution, except when the initial condition is ill-posed. But even the seemingly simple nonlinear equation
$$\left(\frac{dy}{dx}\right)^2 + x^2 + 1 = 0 \qquad (6.178)$$
has no real solutions.

So we have the following revision of Theorem 1 that applies to nonlinear equations as well.
Since this is applied to a broader class the conclusions are expected to be weaker.

Theorem 6.4.2 Consider the IVP

$$y' = f(t, y), \quad y(t_0) = y_0. \qquad (6.179)$$

If f and ∂f/∂y are continuous functions on some rectangle α < t < β, γ < y < δ containing the point (t0, y0), then there is a unique solution to the IVP defined on some interval (a, b) satisfying α ≤ a < t0 < b ≤ β.

OBSERVATION:
(1) Unlike Theorem 1, Theorem 2 does not tell us the interval of validity of the unique solution it guarantees. Instead, it tells us the largest possible interval on which the solution could exist; we would need to actually solve the IVP to get the interval of validity.
(2) For nonlinear differential equations, the value of y0 may affect the interval of validity, as we will see in a later example. We want our IC to NOT lie on the boundary of a region where f or its partial derivative is discontinuous. Then we find the largest t-interval on the line y = y0 containing t0 where everything is continuous.

R Theorem 2 refers to the partial derivative ∂f/∂y of the function of two variables f(t, y). We will talk extensively about this later, but for now we treat t as a constant and take a normal

derivative with respect to y. For example

$$f(t, y) = t^2 - 2y^3 t, \quad \text{then} \quad \frac{\partial f}{\partial y} = -6y^2 t. \qquad (6.180)$$

 Example 6.33 Determine the largest possible interval of validity for the IVP

$$y' = x\ln(y), \quad y(2) = e \qquad (6.181)$$

We have f(x, y) = x ln(y), so ∂f/∂y = x/y. f is discontinuous when y ≤ 0, and f_y (the partial derivative with respect to y) is discontinuous when y = 0. Since our IC is y(2) = e > 0, there is no problem: y_0 is not in the discontinuous region. Since there are no discontinuities involving x, the rectangle is −∞ < x < ∞, 0 < y < ∞. Thus the theorem concludes that the unique solution exists somewhere inside (−∞, ∞). 

R Note that this basically told us nothing; nonlinear problems are much harder to deal with than linear ones. What can happen if the conditions of Theorem 2 are NOT met?

 Example 6.34 Determine all possible solutions to the IVP


$$y' = y^{1/3}, \quad y(0) = 0. \qquad (6.182)$$

First note this does not satisfy the conditions of the theorem, since f_y = 1/(3y^{2/3}) is not continuous at y = 0. Now solve the equation; it is separable. Notice the equilibrium solution y = 0 satisfies the IC, but let's solve the equation anyway.
$$\int y^{-1/3}\,dy = \int dt \qquad (6.183)$$
$$\frac{3}{2}y^{2/3} = t + c \qquad (6.184)$$
$$y(0) = 0 \;\Rightarrow\; c = 0 \qquad (6.185)$$
$$y(t) = \pm\left(\frac{2}{3}t\right)^{3/2} \qquad (6.186)$$

The IC does not rule out either of these possibilities, so we end up with three possible solutions
(these two and the equilibrium solution y(t) ≡ 0). 
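The non-uniqueness can be seen concretely (a sketch; helper names are mine): both y ≡ 0 and y = (2t/3)^{3/2} pass the same finite-difference check of y' = y^{1/3} with y(0) = 0:

```python
def y_zero(t):
    return 0.0

def y_plus(t):
    return (2.0 * t / 3.0) ** 1.5

def residual(y, t, h=1e-6):
    # |y'(t) - y(t)^(1/3)| via a central finite difference.
    return abs((y(t + h) - y(t - h)) / (2 * h) - y(t) ** (1.0 / 3.0))

res_zero = residual(y_zero, 2.0)
res_plus = residual(y_plus, 2.0)
```

Both residuals are essentially zero, yet the two functions are different solutions through the same initial point.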

6.5 Other Methods for First-Order Equations


6.5.1 Autonomous Equations with Population Dynamics
First order differential equations relate the slope of a function to the values of the function and the
independent variable. We can visualize this using direction fields. This in principle can be very
complicated and it might be hard to determine which initial values correspond to which outcomes.
However, there is a special class of equations, called autonomous equations, where this process is
simplified. The first thing to note is autonomous equations do not depend on t

$$y' = f(y) \qquad (6.188)$$

R Notice that all autonomous equations are separable.



What we need to know to study the equation qualitatively is which values of y make y' zero, positive, or negative. The values of y making y' = 0 are the equilibrium solutions. They are constant solutions and are indicated on the ty-plane by horizontal lines.
After we establish the equilibrium solutions we can study the positivity of f (y) on the interme-
diate intervals, which will tell us whether the equilibrium solutions attract nearby initial conditions
(in which case they are called asymptotically stable), repel them (unstable), or some combination
of them (semi-stable).
 Example 6.35 Consider

$$y' = y^2 - y - 2 \qquad (6.189)$$

Start by finding the equilibrium solutions, values of y such that y0 = 0. In this case we need to
solve y² − y − 2 = (y − 2)(y + 1) = 0. So the equilibrium solutions are y = −1 and y = 2. These are constant solutions, indicated by horizontal lines. We want to understand their stability. If
we plot y2 − y − 2 versus y, we can see that on the interval (−∞, −1), f (y) > 0. On the interval
(−1, 2), f (y) < 0 and on (2, ∞), f (y) > 0. Now consider the initial condition.

(1) If the IC y(t0) = y0 < −1, then y' = f(y) > 0 and y(t) will increase towards −1.
(2) If the IC −1 < y0 < 2, then y' = f(y) < 0, so the solution will decrease towards −1. Since the solutions below −1 go to −1 and the solutions above −1 go to −1, we conclude y(t) = −1 is an asymptotically stable equilibrium.
(3) If y0 > 2, then y' = f(y) > 0, so the solution increases away from 2. At y(t) = 2, solutions above and below move away, so this is an unstable equilibrium. 

 Example 6.36 Consider

$$y' = (y - 4)(y + 1)^2 \qquad (6.190)$$

The equilibrium solutions are y = −1 and y = 4. To classify them, we graph f (y) = (y − 4)(y + 1)2 .
(1) If y < −1, we can see that f (y) < 0, so solutions starting below -1 will tend towards −∞.
(2) If −1 < y0 < 4, f(y) < 0, so solutions starting here tend downward to −1. Solutions above −1 approach it while solutions below −1 move away, so y(t) = −1 is
semistable.
(3) If y > 4, f(y) > 0, so solutions starting above 4 will increase to ∞, and y(t) = 4 is
unstable since no nearby solutions converge to it. 
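The sign-based classification used in these two examples can be mechanized (a sketch; the classify helper and the offset eps are my own choices):

```python
def classify(f, y_eq, eps=1e-3):
    # Classify an equilibrium of y' = f(y) from the sign of f just below/above it.
    below, above = f(y_eq - eps), f(y_eq + eps)
    if below > 0 and above < 0:
        return "asymptotically stable"
    if below < 0 and above > 0:
        return "unstable"
    return "semistable"

f1 = lambda y: y**2 - y - 2              # Example 6.35
f2 = lambda y: (y - 4.0) * (y + 1.0)**2  # Example 6.36

kinds = (classify(f1, -1.0), classify(f1, 2.0),
         classify(f2, -1.0), classify(f2, 4.0))
```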

Populations
The best examples of autonomous equations come from population dynamics. The most naive model is the "Population Bomb," in which the population grows without any deaths:

$$P'(t) = rP(t) \qquad (6.191)$$

with r > 0. The solution to this differential equation is P(t) = P0 ert , which indicates that the
population would increase exponentially to ∞. This is not realistic at all.
A better and more accurate model is the "Logistic Model"

$$P'(t) = rP\left(1 - \frac{P}{N}\right) = rP - \frac{r}{N}P^2 \qquad (6.192)$$

where N > 0 is some constant. With this model we have a birth rate of rP and a mortality rate of
(r/N)P². The equation is separable, so let's solve it.

$$\int \frac{dP}{P(1 - P/N)} = \int r\,dt \qquad (6.193)$$
$$\int \left(\frac{1}{P} + \frac{1/N}{1 - P/N}\right) dP = \int r\,dt \qquad (6.194)$$
$$\ln|P| - \ln\left|1 - \frac{P}{N}\right| = rt + c \qquad (6.195)$$
$$\frac{P}{1 - P/N} = Ae^{rt} \qquad (6.196)$$
$$P = Ae^{rt} - \frac{1}{N}Ae^{rt}P \qquad (6.197)$$
$$P(t) = \frac{Ae^{rt}}{1 + \frac{A}{N}e^{rt}} = \frac{AN}{Ne^{-rt} + A} \qquad (6.198)$$

If P(0) = P_0, then A = \frac{P_0 N}{N - P_0}, which yields

$$P(t) = \frac{P_0 N}{(N - P_0)e^{-rt} + P_0} \qquad (6.199)$$

In its present form its hard to analyze what is going on so let’s apply the methods from the first
section to analyze the stability.
Looking at the logistic equation, we can see that our equilibrium solutions are P = 0 and P = N.
Graphing f(P) = rP(1 − P/N), we see that
(1) If P < 0, f (P) < 0
(2) If 0 < P < N, f (P) > 0
(3) If P > N, f (P) < 0
Thus 0 is unstable while N is asymptotically stable, so we can conclude that for any initial population P0 > 0

$$\lim_{t\to\infty} P(t) = N \qquad (6.200)$$

So what is N? It is the carrying capacity for the environment. If the population exists, it will grow
towards N, but the closer it gets to N the slower the population will grow. If the population starts
off greater than the carrying capacity of the environment, P0 > N, then the population will die off
until it reaches that stable equilibrium position. And if the population starts off at N, the births and
deaths will balance out perfectly and the population will remain exactly at P0 = N.
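A numerical look at the logistic solution (6.199) confirms the approach to the carrying capacity from either side (a sketch; the values of r, N, and the initial populations are made up):

```python
import math

r, N = 0.8, 500.0                  # hypothetical growth rate and carrying capacity

def P(t, P0):
    # Logistic solution P(t) = P0*N / ((N - P0)*exp(-r*t) + P0).
    return P0 * N / ((N - P0) * math.exp(-r * t) + P0)

from_below = P(50.0, 30.0)         # starts below N, grows toward N
from_above = P(50.0, 800.0)        # starts above N, dies off toward N
```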

Note: It is possible to construct similar models that have unstable equilibria above 0.

Exercise 6.1 Show that the equilibrium population P(t) = N is unstable for the autonomous
equation
$$P'(t) = rP\left(\frac{P}{N} - 1\right). \qquad (6.201)$$


6.5.2 Bernoulli Equations


The Bernoulli equation is a special first order equation of the form
$$y' + Py = Qy^n, \qquad (6.202)$$
where P, Q are functions of x. Even though the equation is not linear, we can reduce it to a linear equation with the transformation z = y^{1−n}, where by the chain rule z' = (1 − n)y^{−n}y'. Now multiply (6.202) by (1 − n)y^{−n} to find
$$(1-n)y^{-n}y' + (1-n)Py^{1-n} = (1-n)Q,$$
$$z' + (1-n)Pz = (1-n)Q.$$
We now have a first order linear equation that can be solved using the method of integrating factors.
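As a concrete illustration (my own example, not one from the text): take y' + y = y², i.e. P = Q = 1 and n = 2. The substitution z = y^{1−n} = 1/y gives the linear equation z' − z = −1, with solution z = 1 + ce^t, i.e. y = 1/(1 + ce^t); taking y(0) = 1/2 forces c = 1. A finite-difference check:

```python
import math

def y(t):
    # y = 1/(1 + e^t): solution of y' + y = y^2 with y(0) = 1/2, obtained
    # via the Bernoulli substitution z = 1/y.
    return 1.0 / (1.0 + math.exp(t))

h = 1e-6
t0 = 0.7
residual = abs((y(t0 + h) - y(t0 - h)) / (2 * h) + y(t0) - y(t0) ** 2)
```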

6.5.3 Exact Equations


The final category of first order differential equations we will consider is exact equations. These nonlinear equations have the form
$$M(x, y) + N(x, y)\frac{dy}{dx} = 0 \qquad (6.203)$$
where y = y(x) is a function of x and the coefficients satisfy
$$\frac{\partial M}{\partial y} = \frac{\partial N}{\partial x}, \qquad (6.204)$$
where these two derivatives are partial derivatives.
Multivariable Differentiation
If we want a partial derivative of f (x, y) with respect to x we treat y as a constant and differentiate
normally with respect to x. On the other hand, if we want a partial derivative of f (x, y) with respect
to y we treat x as a constant and differentiate normally with respect to y.
 Example 6.37 Let f(x, y) = x²y + y². Then
$$\frac{\partial f}{\partial x} = 2xy \qquad (6.205)$$
$$\frac{\partial f}{\partial y} = x^2 + 2y. \qquad (6.206)$$


 Example 6.38 Let f(x, y) = y sin(x). Then
$$\frac{\partial f}{\partial x} = y\cos(x) \qquad (6.207)$$
$$\frac{\partial f}{\partial y} = \sin(x) \qquad (6.208)$$


We also need the crucial tool of the multivariable chain rule. If we have a function Φ(x, y(x))
depending on some variable x and a function y depending on x, then
$$\frac{d\Phi}{dx} = \frac{\partial \Phi}{\partial x} + \frac{\partial \Phi}{\partial y}\frac{dy}{dx} = \Phi_x + \Phi_y y' \qquad (6.209)$$

Solving Exact Equations


Start with an example to illustrate the method.
 Example 6.39 Consider

$$2xy - 9x^2 + (2y + x^2 + 1)\frac{dy}{dx} = 0 \qquad (6.210)$$
The first step in solving an exact equation is to find a certain function Φ(x, y). Finding Φ(x, y) is
most of the work. For this example it turns out

$$\Phi(x, y) = y^2 + (x^2 + 1)y - 3x^3 \qquad (6.211)$$

Notice if we compute the partial derivatives of Φ, we obtain

$$\Phi_x(x, y) = 2xy - 9x^2 \qquad (6.212)$$
$$\Phi_y(x, y) = 2y + x^2 + 1. \qquad (6.213)$$

Looking back at the differential equation, we can rewrite it as

$$\Phi_x + \Phi_y \frac{dy}{dx} = 0. \qquad (6.214)$$
Thinking back to the chain rule, we can express this as
$$\frac{d\Phi}{dx} = 0. \qquad (6.215)$$
Thus if we integrate, Φ = c, where c is a constant. So the general solution is

$$y^2 + (x^2 + 1)y - 3x^3 = c \qquad (6.216)$$

for some constant c. If we had an initial condition, we could use it to find the particular solution to
the initial value problem. 

Let’s investigate the last example further. An exact equation has the form

$$M(x, y) + N(x, y)\frac{dy}{dx} = 0 \qquad (6.217)$$
with My (x, y) = Nx (x, y). The key is to construct Φ(x, y) such that the DE turns into


$$\frac{d\Phi}{dx} = 0 \qquad (6.218)$$
by using the multivariable chain rule. Thus we require Φ(x, y) satisfy

Φx (x, y) = M(x, y) (6.219)


Φy (x, y) = N(x, y) (6.220)

R A standard fact from multivariable calculus is that mixed partial derivatives commute. That
is why we want My = Nx , so My = Φxy and Nx = Φyx , and so these should be equal for Φ to
exist. Make sure you check the function is exact before wasting time on the wrong solution
process.

Once we have found Φ, then dΦ/dx = 0, and so

Φ(x, y) = c (6.221)

yielding an implicit general solution to the differential equation.


So the majority of the work is computing Φ(x, y). How can we find this desired function? Let's redo Example 6.39, filling in the details.
 Example 6.40 Solve the initial value problem

$$2xy - 9x^2 + (2y + x^2 + 1)\frac{dy}{dx} = 0, \quad y(0) = 2 \qquad (6.222)$$
Let’s begin by checking the equation is in fact exact.

$$M(x, y) = 2xy - 9x^2 \qquad (6.223)$$
$$N(x, y) = 2y + x^2 + 1 \qquad (6.224)$$

Then My = 2x = Nx, so the equation is exact.


Now how do we find Φ(x, y)? We have Φx = M and Φy = N. Thus we could compute Φ in one
of two ways
$$\Phi(x, y) = \int M\,dx \quad \text{or} \quad \Phi(x, y) = \int N\,dy. \qquad (6.226)$$

In general it does not usually matter which you choose, one may be easier to integrate than the
other. In this case
$$\Phi(x, y) = \int (2xy - 9x^2)\,dx = x^2 y - 3x^3 + h(y). \qquad (6.227)$$

Notice that since we only integrate with respect to x, the "constant" of integration can be an arbitrary function depending only on y: if we differentiate h(y) with respect to x we still get 0, just like an arbitrary constant c. So to be fully general we include an arbitrary function of y. Note that if we had integrated N with respect to y, we would get an arbitrary function of x. DO NOT FORGET THIS!
Now all we need is to find h(y). If we differentiate Φ with respect to x, then h(y) will vanish, which is unhelpful. So instead differentiate with respect to y: since Φy = N in order for the equation to be exact, any terms in N that are not in Φy must be h'(y).
So Φy = x² + h'(y) and N = x² + 2y + 1. Since these are equal, we have h'(y) = 2y + 1, and so
$$h(y) = \int h'(y)\,dy = y^2 + y \qquad (6.228)$$

R We will drop the constant of integration we get from integrating h since it will combine with
the constant c that we get in the solution process.

Thus, we have

$$\Phi(x, y) = x^2 y - 3x^3 + y^2 + y = y^2 + (x^2 + 1)y - 3x^3, \qquad (6.229)$$

which is precisely the Φ that we used in Example 6.39. Observe
$$\frac{d\Phi}{dx} = 0 \qquad (6.230)$$

and thus Φ(x, y) = y2 + (x2 + 1)y − 3x3 = c for some constant c. To compute c, we’ll use our initial
condition y(0) = 2

$$2^2 + 2 = c \;\Rightarrow\; c = 6 \qquad (6.231)$$

and so we have a particular solution of

$$y^2 + (x^2 + 1)y - 3x^3 = 6 \qquad (6.232)$$

This is a quadratic equation in y, so we can complete the square or use quadratic formula to get an
explicit solution, which is the goal when possible.

$$y^2 + (x^2+1)y - 3x^3 = 6 \qquad (6.233)$$
$$y^2 + (x^2+1)y + \frac{(x^2+1)^2}{4} = 6 + 3x^3 + \frac{(x^2+1)^2}{4} \qquad (6.234)$$
$$\left(y + \frac{x^2+1}{2}\right)^2 = \frac{x^4 + 12x^3 + 2x^2 + 25}{4} \qquad (6.235)$$
$$y(x) = \frac{-(x^2+1) \pm \sqrt{x^4 + 12x^3 + 2x^2 + 25}}{2} \qquad (6.236)$$
Now we use the initial condition to figure out whether we want the + or − solution. Since y(0) = 2
we have

$$2 = y(0) = \frac{-1 \pm \sqrt{25}}{2} = \frac{-1 \pm 5}{2} = 2, -3 \qquad (6.237)$$
2 2
Thus we see we want the + so our particular solution is

$$y(x) = \frac{-(x^2+1) + \sqrt{x^4 + 12x^3 + 2x^2 + 25}}{2} \qquad (6.238)$$


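A check of the final answer (a sketch; helper names are mine): the explicit branch should reproduce the initial condition and keep Φ(x, y(x)) pinned at the constant 6:

```python
import math

def y(x):
    # Explicit "+" branch of the implicit solution y^2 + (x^2+1)y - 3x^3 = 6.
    return (-(x**2 + 1) + math.sqrt(x**4 + 12 * x**3 + 2 * x**2 + 25)) / 2.0

def Phi(x, yy):
    return yy**2 + (x**2 + 1) * yy - 3 * x**3

ic_error = abs(y(0.0) - 2.0)
max_residual = max(abs(Phi(x, y(x)) - 6.0) for x in (0.0, 0.5, 1.0, 2.0, 3.0))
```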
 Example 6.41 Solve the initial value problem

$$2xy^2 + 2 = 2(3 - x^2 y)y', \quad y(-1) = 1. \qquad (6.239)$$

First we need to put it in the standard form for exact equations

$$2xy^2 + 2 - 2(3 - x^2 y)y' = 0. \qquad (6.240)$$

Now, M(x, y) = 2xy2 + 2 and N(x, y) = −2(3 − x2 y). So My = 4xy = Nx and the equation is exact.
The next step is to compute Φ(x, y). We choose to integrate N this time
$$\Phi(x, y) = \int N\,dy = \int (2x^2 y - 6)\,dy = x^2 y^2 - 6y + h(x). \qquad (6.241)$$

To find h(x), we compute Φx = 2xy2 + h0 (x) and notice that for this to be equal to M, h0 (x) = 2.
Hence h(x) = 2x and we have an implicit solution of

$$x^2 y^2 - 6y + 2x = c. \qquad (6.242)$$

Now, we use the IC y(−1) = 1:

1 − 6 − 2 = c ⇒ c = −7 (6.243)

So our implicit solution is

$$x^2 y^2 - 6y + 2x + 7 = 0. \qquad (6.244)$$

Again complete the square or use quadratic formula


$$y(x) = \frac{6 \pm \sqrt{36 - 4x^2(2x+7)}}{2x^2} \qquad (6.245)$$
$$= \frac{3 \pm \sqrt{9 - 2x^3 - 7x^2}}{x^2} \qquad (6.246)$$
and using the IC, we see that we want − solution, so the explicit particular solution is

$$y(x) = \frac{3 - \sqrt{9 - 2x^3 - 7x^2}}{x^2} \qquad (6.247)$$


 Example 6.42 Solve the IVP

$$\frac{2ty}{t^2+1} - 2t - (4 - \ln(t^2+1))y' = 0, \quad y(2) = 0 \qquad (6.248)$$
and find the solution’s interval of validity.
This is already in the right form. Check whether it is exact: M(t, y) = 2ty/(t² + 1) − 2t and N(t, y) = ln(t² + 1) − 4, so My = 2t/(t² + 1) = Nt. Thus the equation is exact. Now compute Φ(t, y). Integrate M:
$$\Phi = \int M\,dt = \int \left(\frac{2ty}{t^2+1} - 2t\right) dt = y\ln(t^2+1) - t^2 + h(y). \qquad (6.249)$$
$$\Phi_y = \ln(t^2+1) + h'(y) = \ln(t^2+1) - 4 = N \qquad (6.250)$$

so we conclude h0 (y) = −4 and thus h(y) = −4y. So our implicit solution is then

$$y\ln(t^2+1) - t^2 - 4y = c \qquad (6.251)$$

and using the IC we find c = −4. Thus the particular solution is

$$y\ln(t^2+1) - t^2 - 4y = -4 \qquad (6.252)$$

Solve explicitly to obtain

$$y(t) = \frac{t^2 - 4}{\ln(t^2+1) - 4}. \qquad (6.253)$$

Now let's find the interval of validity. We do not have to worry about the natural log, since t² + 1 > 0 for all t. Thus we only need to avoid division by 0:

ln(t^2 + 1) − 4 = 0  (6.254)
ln(t^2 + 1) = 4  (6.255)
t^2 = e^4 − 1  (6.256)
t = ±√(e^4 − 1)  (6.257)
So there are three possible intervals of validity; we want the one containing t = 2, so (−√(e^4 − 1), √(e^4 − 1)).
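A hedged numeric check of this example (names are illustrative): with y(t) = (t^2 − 4)/(ln(t^2 + 1) − 4), a central-difference derivative should make the left-hand side of the equation vanish at points inside the interval of validity.

```python
import math

def y(t):
    return (t**2 - 4) / (math.log(t**2 + 1) - 4)

def residual(t, h=1e-6):
    # left-hand side of the ODE with y' approximated by a central difference
    yp = (y(t + h) - y(t - h)) / (2 * h)
    return 2*t*y(t) / (t**2 + 1) - 2*t - (4 - math.log(t**2 + 1)) * yp

assert abs(y(2)) < 1e-12              # initial condition y(2) = 0
for t in [0.5, 1.5, 2.0, 3.0]:
    assert abs(residual(t)) < 1e-5    # ODE holds numerically
```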


 Example 6.43 Solve

3y3 e3xy − 1 + (2ye3xy + 3xy2 e3xy )y0 = 0, y(1) = 2 (6.258)


We have
My = 9y2 e3xy + 9xy3 e3xy = Nx (6.259)
Thus the equation is exact. Integrate M
Φ = ∫ M dx = ∫ (3y^3 e^{3xy} − 1) dx = y^2 e^{3xy} − x + h(y)  (6.260)

and
Φy = 2ye3xy + 3xy2 e3xy + h0 (y) (6.261)
Comparing Φy to N, we see that they are already identical, so h0 (y) = 0 and h(y) = 0. So
y2 e3xy − x = c (6.262)
and using the IC gives c = 4e6 − 1. Thus our implicit particular solution is
y2 e3xy − x = 4e6 − 1, (6.263)
and we are done because we will not be able to solve this explicitly. 

6.5.4 Homogeneous Equations


A homogeneous function of x and y of degree n is one that can be written as x^n f(y/x).
For example
x3 − xy2 = x3 [1 − (y/x)2 ]
is a homogeneous function of degree 3. An equation of the form
P(x, y)dx + Q(x, y)dy = 0 (6.264)
where P, Q are homogeneous functions of the same degree and the equation can be rewritten as
y′ = dy/dx = −P(x, y)/Q(x, y) = f(y/x).  (6.265)
This suggests that homogeneous equations of this form can be solved by the change of variables v = y/x, i.e. y = xv. Then y′ = v + x v′ = f(v), so x dv/dx = f(v) − v, and the equation becomes separable.
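To see the substitution in action, take the illustrative homogeneous equation y′ = (x^2 + y^2)/(xy), so f(v) = v + 1/v (this particular equation is our example, not the text's). Separating x dv/dx = f(v) − v = 1/v gives v^2 = 2 ln x + C; with y(1) = 1 this is y = x√(2 ln x + 1), which we can check numerically:

```python
import math

def y(x):
    # solution of y' = (x^2 + y^2)/(x*y) with y(1) = 1, from v^2 = 2 ln x + 1
    return x * math.sqrt(2*math.log(x) + 1)

def f(v):
    # right-hand side written as f(y/x) = v + 1/v
    return v + 1/v

for x in [1.0, 1.5, 2.0, 3.0]:
    h = 1e-6
    yp = (y(x + h) - y(x - h)) / (2*h)   # central difference for y'
    assert abs(yp - f(y(x)/x)) < 1e-5
```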

6.6 Second-Order Linear Equations with Constant Coefficients and Zero Right-Hand Side
6.6.1 Basic Concepts
An example of a second order equation which we have seen many times before is Newton's Second Law, which when expressed in terms of the position s(t) reads

m d^2 s/dt^2 = F(t, s′, s)  (6.266)
One of the most basic 2nd order equations is y′′ = −y. By inspection, we might notice that this has two obvious nonzero solutions: y1(t) = cos(t) and y2(t) = sin(t). But what about 9 cos(t) − 2 sin(t)? This is also a solution, as is anything of the form y(t) = c1 cos(t) + c2 sin(t), where c1 and c2 are arbitrary constants. If no conditions are present, every solution has this form.
 Example 6.44 Find all of the solutions to y′′ = 9y.

We need a function whose second derivative is 9 times the original function. What function comes back to itself without a sign change after two derivatives? Always think of the exponential function when situations like this arise. Two possible solutions are y1(t) = e^{3t} and y2(t) = e^{−3t}. In fact, so is any combination of the two. This is the principle of linear superposition. So y(t) = c1 e^{3t} + c2 e^{−3t} gives infinitely many solutions.

EXERCISE: Check that y1 (t) = e3t and y2 (t) = e−3t are solutions to y00 = 9y. 
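The exercise can also be done numerically. This sketch (our own check, not part of the text) approximates y′′ by central differences and confirms y′′ = 9y for both candidate solutions:

```python
import math

def second_deriv(y, t, h=1e-4):
    # central-difference approximation of y''(t)
    return (y(t + h) - 2*y(t) + y(t - h)) / h**2

for t in [-1.0, 0.0, 0.5, 2.0]:
    for y in (lambda s: math.exp(3*s), lambda s: math.exp(-3*s)):
        assert abs(second_deriv(y, t) - 9*y(t)) < 1e-2
```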

The general form of a second order linear differential equation is


p(t)y00 + q(t)y0 + r(t)y = g(t). (6.267)
We call the equation homogeneous if g(t) = 0 and nonhomogeneous if g(t) ≠ 0.

Theorem 6.6.1 (Principle of Superposition) If y1 (t) and y2 (t) are solutions to a second order
linear homogeneous differential equation, then so is any linear combination

y(t) = c1 y1 (t) + c2 y2 (t). (6.268)

This follows from the homogeneity and the fact that a derivative is a linear operator. So given
any two solutions to a homogeneous equation we can find infinitely more by combining them. The
main goal is to be able to write down a general solution to a differential equation, so that with
some initial conditions we could uniquely solve an IVP. We want to find two different solutions y1(t) and y2(t) so that the general solution to the differential equation is y(t) = c1 y1(t) + c2 y2(t). By different we mean solutions which are not constant multiples of each other.
Now reconsider y00 = −y. We found two different solutions y1 (t) = cos(t) and y2 (t) = sin(t)
and any solution to this equation can be written as a linear combination of these two solutions,
y(t) = c1 cos(t) + c2 sin(t). Since we have two constants and a 2nd order equation we need two
initial conditions to find a particular solution. We are generally given these conditions in the form
of y and y0 defined at a particular t0 . So a typical problem might look like
p(t)y00 + q(t)y0 + r(t)y = 0, y0 (t0 ) = y00 , y(t0 ) = y0 (6.269)
 Example 6.45 Find a particular solution to the initial value problem

y00 + y = 0, y(0) = 2, y0 (0) = −1 (6.270)


We have established the general solution to this equation is
y(t) = c1 cos(t) + c2 sin(t) (6.271)
To apply the initial conditions, we’ll need to know the derivative
y0 (t) = −c1 sin(t) + c2 cos(t) (6.272)
Plugging in the initial conditions yields
2 = c1 (6.273)
−1 = c2 (6.274)
so the particular solution is
y(t) = 2 cos(t) − sin(t). (6.275)


Sometimes when applying initial conditions we will have to solve a system of equations, other
times it is as easy as the previous example.

6.6.2 Homogeneous Equations With Constant Coefficients


We will start with the easiest class of second order linear homogeneous equations, where the
coefficients p(t), q(t), and r(t) are constants. The equation will have the form

ay′′ + by′ + cy = 0.  (6.276)

How do we find solutions to this equation? From calculus we can find a function that is linked to
its derivatives by a multiplicative constant, y(t) = ert . Now that we have a candidate plug it into the
differential equation. First calculate the derivatives y0 (t) = rert and y00 (t) = r2 ert .

a(r2 ert ) + b(rert ) + cert = 0 (6.277)


rt 2
e (ar + br + c) = 0 (6.278)

What can we conclude? If y(t) = e^{rt} is a solution to the differential equation, then e^{rt}(ar^2 + br + c) = 0. Since e^{rt} ≠ 0, y(t) = e^{rt} will solve the differential equation as long as r is a solution to

ar2 + br + c = 0. (6.279)

This equation is called the characteristic equation for ay′′ + by′ + cy = 0.


Thus, to find a solution to a linear second order homogeneous constant coefficient equation, we
begin by writing down the characteristic equation. Then we find the roots r1 and r2 (not necessarily
distinct or real). So we have the solutions

y1 (t) = er1t , y2 (t) = er2t . (6.280)

Of course, it is also possible these are the same, since we might have a repeated root. We will see
in a future section how to handle these. In fact, we have three cases.
 Example 6.46 Find two solutions to the differential equation y00 − 9y = 0 (Example 1). The
characteristic equation is r2 − 9 = 0, and this has roots r = ±3. So we have two solutions y1 (t) = e3t
and y2 (t) = e−3t , which agree with what we found earlier. 

The three cases are the same as the three possibilities for types of roots of quadratic equations:
(1) Real, distinct roots r1 ≠ r2.
(2) Complex roots r1 , r2 = α ± β i.
(3) A repeated real root r1 = r2 = r.
We’ll look at each case more closely in the lectures to come.

6.7 Complex Roots of the Characteristic Equation


Last Time: We considered the Wronskian and used it to determine when we have solutions to
a second order linear equation or if given one solution we can find another which is linearly
independent.

6.7.1 Review Real, Distinct Roots


Recall that a second order linear homogeneous differential equation with constant coefficients

ay00 + by0 + cy = 0 (6.281)

is solved by y(t) = ert , where r solves the characteristic equation

ar2 + br + c = 0 (6.282)

So when there are two distinct roots r1 ≠ r2, we get two solutions y1(t) = e^{r1 t} and y2(t) = e^{r2 t}.
Since they are distinct we can immediately conclude the general solution is

y(t) = c1 er1t + c2 er2t (6.283)

Then given initial conditions we can solve c1 and c2 .


Exercises:
(1) y′′ + 3y′ − 18y = 0, y(0) = 0, y′(0) = −1.
ANS: y(t) = (1/9) e^{−6t} − (1/9) e^{3t}.
(2) y′′ − 7y′ + 10y = 0, y(0) = 3, y′(0) = 2.
ANS: y(t) = −(4/3) e^{5t} + (13/3) e^{2t}.
(3) 2y′′ − 5y′ + 2y = 0, y(0) = −3, y′(0) = 3.
ANS: y(t) = −6 e^{t/2} + 3 e^{2t}.
(4) y′′ + 5y′ = 0, y(0) = 2, y′(0) = −5.
ANS: y(t) = 1 + e^{−5t}.
(5) y′′ − 2y′ − 8y = 0, y(2) = 1, y′(2) = 0.
ANS: y(t) = (1/(3e^8)) e^{4t} + (2e^4/3) e^{−2t}.
(6) y′′ + y′ − 3y = 0.
ANS: y(t) = c1 e^{(−1+√13)t/2} + c2 e^{(−1−√13)t/2}.
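Exercise (1) can be checked end to end: the claimed answer should satisfy both initial conditions and the equation itself (a numerical sketch; helper names are ours):

```python
import math

# claimed answer to exercise (1): y'' + 3y' - 18y = 0, y(0) = 0, y'(0) = -1
def y(t):
    return math.exp(-6*t)/9 - math.exp(3*t)/9

h = 1e-4
assert abs(y(0)) < 1e-12                          # y(0) = 0
assert abs((y(h) - y(-h))/(2*h) + 1) < 1e-6       # y'(0) = -1
for t in [0.0, 0.3, 1.0]:
    ypp = (y(t + h) - 2*y(t) + y(t - h)) / h**2   # central differences
    yp = (y(t + h) - y(t - h)) / (2*h)
    assert abs(ypp + 3*yp - 18*y(t)) < 1e-2       # ODE residual
```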

6.7.2 Complex Roots


Now suppose the characteristic equation has complex roots of the form r1,2 = α ± iβ . This means
we have two solutions to our differential equation

y1 (t) = e(α+iβ )t , y2 (t) = e(α−iβ )t (6.284)

This is a problem since y1 (t) and y2 (t) are complex-valued. Since our original equation was both
simple and had real coefficients, it would be ideal to find two real-valued "different" enough
solutions so that we can form a real-valued general solution. There is a way to do this.

Theorem 6.7.1 (Euler’s Formula)

eiθ = cos(θ ) + i sin(θ ) (6.285)

In other words, we can write an imaginary exponential as a sum of sin and cos. How do we establish
this fact? There are two ways:

(1) Differential Equations: First we want to write eiθ = f (θ ) + ig(θ ). We also have

f′ + ig′ = (d/dθ)[e^{iθ}] = i e^{iθ} = i f − g.  (6.286)

Thus f 0 = −g and g0 = f , so f 00 = − f and g00 = −g. Since e0 = 1, we know that f (0) = 1 and
g(0) = 0. We conclude that f (θ ) = cos(θ ) and g(θ ) = sin(θ ), so

eiθ = cos(θ ) + i sin(θ ) (6.287)

(2) Taylor Series: Recall that the Taylor series for ex is



e^x = Σ_{n=0}^∞ x^n/n! = 1 + x + x^2/2! + x^3/3! + ...  (6.288)

while the Taylor series for sin(x) and cos(x) are



sin(x) = Σ_{n=0}^∞ (−1)^n x^{2n+1}/(2n+1)! = x − x^3/3! + x^5/5! − ...  (6.289)
cos(x) = Σ_{n=0}^∞ (−1)^n x^{2n}/(2n)! = 1 − x^2/2! + x^4/4! − ...  (6.290)

If we set x = iθ in the first series, we get



e^{iθ} = Σ_{n=0}^∞ (iθ)^n/n!  (6.292)
 = 1 + iθ − θ^2/2! − iθ^3/3! + θ^4/4! + iθ^5/5! − ...  (6.293)
 = (1 − θ^2/2! + θ^4/4! − ...) + i(θ − θ^3/3! + θ^5/5! − ...)  (6.294)
 = Σ_{n=0}^∞ (−1)^n θ^{2n}/(2n)! + i Σ_{n=0}^∞ (−1)^n θ^{2n+1}/(2n+1)!  (6.295)
 = cos(θ) + i sin(θ)  (6.296)
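Python's cmath module gives an immediate numerical confirmation of Euler's formula (an editor-added check, not part of the derivation):

```python
import cmath, math

for theta in [0.0, 0.5, math.pi/3, math.pi, 2.0]:
    lhs = cmath.exp(1j * theta)
    rhs = complex(math.cos(theta), math.sin(theta))
    assert abs(lhs - rhs) < 1e-12

# the famous special case: e^{i*pi} + 1 = 0
assert abs(cmath.exp(1j * math.pi) + 1) < 1e-12
```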

So we can write our two complex exponentials as

e(α+iβ )t = eαt eiβt = eαt (cos(βt) + i sin(βt)) (6.297)


e^{(α−iβ)t} = e^{αt} e^{−iβt} = e^{αt}(cos(βt) − i sin(βt))  (6.298)

where the minus sign pops out of the sign in the second equation since sin is odd and cos is even.
Notice our new expression is still complex-valued. However, by the Principle of Superposition, we
can obtain the following solutions
y1(t) = (1/2)(e^{αt}(cos(βt) + i sin(βt))) + (1/2)(e^{αt}(cos(βt) − i sin(βt))) = e^{αt} cos(βt)  (6.299)
y2(t) = (1/(2i))(e^{αt}(cos(βt) + i sin(βt))) − (1/(2i))(e^{αt}(cos(βt) − i sin(βt))) = e^{αt} sin(βt)  (6.300)
EXERCISE: Check that y1 (t) = eαt cos(βt) and y2 (t) = eαt sin(βt) are in fact solutions to the
beginning differential equation when the roots are α ± iβ .
So now we have two real-valued solutions y1 (t) and y2 (t). It turns out they are linearly
independent, so if the roots of the characteristic equation are r1,2 = α ± iβ , we have the general
solution

y(t) = c1 eαt cos(βt) + c2 eαt sin(βt) (6.301)

Let’s consider some examples:


 Example 6.47 Solve the IVP

y00 − 4y0 + 9y = 0, y(0) = 0, y0 (0) = −2 (6.302)

The characteristic equation is

r2 − 4r + 9 = 0 (6.303)

which has roots r1,2 = 2 ± i√5. Thus the general solution and its derivative are

y(t) = c1 e^{2t} cos(√5 t) + c2 e^{2t} sin(√5 t)  (6.304)
y′(t) = 2c1 e^{2t} cos(√5 t) − √5 c1 e^{2t} sin(√5 t) + 2c2 e^{2t} sin(√5 t) + √5 c2 e^{2t} cos(√5 t).  (6.305)

If we apply the initial conditions, we get

0 = c1  (6.306)
−2 = 2c1 + √5 c2  (6.307)

which is solved by c1 = 0 and c2 = −2/√5. So the particular solution is

y(t) = −(2/√5) e^{2t} sin(√5 t).  (6.308)


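The particular solution of Example 6.47 can be checked with finite differences (an illustrative sketch, with our own helper names):

```python
import math

s5 = math.sqrt(5)

def y(t):
    # particular solution from Example 6.47
    return -2/s5 * math.exp(2*t) * math.sin(s5 * t)

h = 1e-4
assert abs(y(0)) < 1e-12                        # y(0) = 0
assert abs((y(h) - y(-h))/(2*h) + 2) < 1e-6     # y'(0) = -2
for t in [0.0, 0.4, 1.0]:
    ypp = (y(t + h) - 2*y(t) + y(t - h)) / h**2
    yp = (y(t + h) - y(t - h)) / (2*h)
    assert abs(ypp - 4*yp + 9*y(t)) < 1e-2      # y'' - 4y' + 9y = 0
```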
 Example 6.48 Solve the IVP

y00 − 8y0 + 17y = 0, y(0) = 2, y0 (0) = 5. (6.309)

The characteristic equation is

r2 − 8r + 17 = 0 (6.310)

which has roots r1,2 = 4 ± i. Hence the general solution and its derivatives are

y(t) = c1 e^{4t} cos(t) + c2 e^{4t} sin(t)  (6.311)


y′(t) = 4c1 e^{4t} cos(t) − c1 e^{4t} sin(t) + 4c2 e^{4t} sin(t) + c2 e^{4t} cos(t)  (6.312)

and plugging in initial conditions yields the system

2 = c1 (6.313)
5 = 4c1 + c2 (6.314)

so we conclude c1 = 2 and c2 = −3 and the particular solution is

y(t) = 2e4t cos(t) − 3e4t sin(t) (6.315)

 Example 6.49 Solve the IVP

4y00 + 12y0 + 10y = 0, y(0) = −1, y0 (0) = 3 (6.316)

The characteristic equation is

4r2 + 12r + 10 = 0 (6.317)

which has roots r1,2 = −3/2 ± (1/2)i. So the general solution and its derivative are

y(t) = c1 e^{−3t/2} cos(t/2) + c2 e^{−3t/2} sin(t/2)  (6.318)
y′(t) = −(3/2) c1 e^{−3t/2} cos(t/2) − (1/2) c1 e^{−3t/2} sin(t/2) − (3/2) c2 e^{−3t/2} sin(t/2) + (1/2) c2 e^{−3t/2} cos(t/2)  (6.319)

Plugging in the initial conditions yields

−1 = c1  (6.320)
3 = −(3/2) c1 + (1/2) c2  (6.321)

which has solution c1 = −1 and c2 = 3. The particular solution is

y(t) = −e^{−3t/2} cos(t/2) + 3 e^{−3t/2} sin(t/2)  (6.322)


 Example 6.50 Solve the IVP


y′′ + 4y = 0,  y(π/4) = −10,  y′(π/4) = 4.  (6.323)
The characteristic equation is

r2 + 4 = 0 (6.324)

which has roots r1,2 = ±2i. The general solution and its derivatives are
y(t) = c1 cos(2t) + c2 sin(2t) (6.325)
y′(t) = −2c1 sin(2t) + 2c2 cos(2t).  (6.326)
The initial conditions give the system
−10 = c2 (6.327)
4 = −2c1 (6.328)
so we conclude that c1 = −2 and c2 = −10 and the particular solution is

y(t) = −2 cos(2t) − 10 sin(2t). (6.329)

6.8 Repeated Roots of the Characteristic Equation and Reduction of Order


Last Time: We considered cases of homogeneous second order equations where the roots of the
characteristic equation were complex.

6.8.1 Repeated Roots


The last case of the characteristic equation to consider is when the characteristic equation has
repeated roots r1 = r2 = r. This is a problem since our usual solution method produces the same
solution twice

y1 (t) = er1t = er2t = y2 (t) (6.330)

But these are the same and are not linearly independent. So we will need to find a second solution
which is "different" from y1 (t) = ert . What should we do?
Start by recalling that if the quadratic equation ar^2 + br + c = 0 has a repeated root r, it must be r = −b/(2a). Thus our solution is y1(t) = e^{−bt/(2a)}. We know any constant multiple of y1(t) is also a solution. These will still be linearly dependent to y1(t). Can we find a solution of the form

y2(t) = v(t) y1(t) = v(t) e^{−bt/(2a)}  (6.331)

i.e. y2 is the product of a function of t and y1 .


Differentiate y2(t):

y2′(t) = v′(t) e^{−bt/(2a)} − (b/(2a)) v(t) e^{−bt/(2a)}  (6.332)
y2″(t) = v″(t) e^{−bt/(2a)} − (b/(2a)) v′(t) e^{−bt/(2a)} − (b/(2a)) v′(t) e^{−bt/(2a)} + (b^2/(4a^2)) v(t) e^{−bt/(2a)}  (6.333)
 = v″(t) e^{−bt/(2a)} − (b/a) v′(t) e^{−bt/(2a)} + (b^2/(4a^2)) v(t) e^{−bt/(2a)}.  (6.334)
Plug into the differential equation:

a(v″ e^{−bt/(2a)} − (b/a) v′ e^{−bt/(2a)} + (b^2/(4a^2)) v e^{−bt/(2a)}) + b(v′ e^{−bt/(2a)} − (b/(2a)) v e^{−bt/(2a)}) + c v e^{−bt/(2a)} = 0  (6.335)
e^{−bt/(2a)} (a v″ + (−b + b) v′ + (b^2/(4a) − b^2/(2a) + c) v) = 0  (6.336)
e^{−bt/(2a)} (a v″ − (1/(4a))(b^2 − 4ac) v) = 0  (6.337)
4a
Since we are in the repeated root case, we know the discriminant b2 − 4ac = 0. Since exponentials
are never zero, we have
a v″ = 0 ⇒ v″ = 0  (6.338)
We can drop the a since it cannot be zero; if a were zero it would be a first order equation! So what does v look like?
v(t) = c1 t + c2  (6.339)
for constants c1 and c2. Thus for any such v(t), y2(t) = v(t) e^{−bt/(2a)} will be a solution. The most general possible v(t) that will work for us is c1 t + c2. Take c1 = 1 and c2 = 0 to get a specific v(t) and our second solution is

y2(t) = t e^{−bt/(2a)}  (6.340)
and the general solution is
y(t) = c1 e^{−bt/(2a)} + c2 t e^{−bt/(2a)}  (6.341)

R Here’s another way of looking at the choice of constants. Suppose we do not make a choice.
Then we have the general solution
y(t) = c1 e^{−bt/(2a)} + c2 (ct + k) e^{−bt/(2a)}  (6.342)
 = c1 e^{−bt/(2a)} + c2 c t e^{−bt/(2a)} + c2 k e^{−bt/(2a)}  (6.343)
 = (c1 + c2 k) e^{−bt/(2a)} + c2 c t e^{−bt/(2a)}  (6.344)
since they are all constants we just get
y(t) = c1 e^{−bt/(2a)} + c2 t e^{−bt/(2a)}  (6.345)

To summarize: if the characteristic equation has repeated roots r1 = r2 = r, the general solution
is
y(t) = c1 e^{rt} + c2 t e^{rt}  (6.346)
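It is worth confirming numerically that the new solution t e^{rt} really satisfies the equation. Taking the repeated root r = 2 of y′′ − 4y′ + 4y = 0 as a test case (a sketch with our own names):

```python
import math

def y(t):
    # candidate second solution t * e^{rt} with repeated root r = 2
    return t * math.exp(2*t)

h = 1e-4
for t in [-1.0, 0.0, 0.7, 1.5]:
    ypp = (y(t + h) - 2*y(t) + y(t - h)) / h**2   # central differences
    yp = (y(t + h) - y(t - h)) / (2*h)
    assert abs(ypp - 4*yp + 4*y(t)) < 1e-2        # y'' - 4y' + 4y = 0
```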
Now for examples:

 Example 6.51 Solve the IVP

y00 − 4y0 + 4y = 0, y(0) = −1, y0 (0) = 6 (6.347)

The characteristic equation is

r2 − 4r + 4 = 0 (6.348)
(r − 2)^2 = 0  (6.349)

so we see that we have a repeated root r = 2. The general solution and its derivative are

y(t) = c1 e2t + c2te2t (6.350)


y′(t) = 2c1 e^{2t} + c2 e^{2t} + 2c2 t e^{2t}  (6.351)

and plugging in initial conditions yields

−1 = c1 (6.352)
6 = 2c1 + c2 (6.353)

so we have c1 = −1 and c2 = 8. The particular solution is

y(t) = −e^{2t} + 8 t e^{2t}  (6.354)

 Example 6.52 Solve the IVP

16y00 + 40y0 + 25y = 0, y(0) = −1, y0 (0) = 2. (6.355)

The characteristic equation is

16r2 + 40r + 25 = 0 (6.356)


(4r + 5)^2 = 0  (6.357)

and so we conclude that we have a repeated root r = −5/4 and the general solution and its derivative are

y(t) = c1 e^{−5t/4} + c2 t e^{−5t/4}  (6.358)
y′(t) = −(5/4) c1 e^{−5t/4} + c2 e^{−5t/4} − (5/4) c2 t e^{−5t/4}  (6.359)
Plugging in the initial conditions yields

−1 = c1  (6.360)
2 = −(5/4) c1 + c2  (6.361)

so c1 = −1 and c2 = 3/4. The particular solution is

y(t) = −e^{−5t/4} + (3/4) t e^{−5t/4}  (6.362)


6.8.2 Reduction of Order


We have spent the last few lectures analyzing second order linear homogeneous equations with
constant coefficients, i.e. equations of the form

ay00 + by0 + cy = 0 (6.363)

Let’s now consider the case when the coefficients are not constants

p(t)y00 + q(t)y0 + r(t)y = 0 (6.364)

In general this is not easy, but if we can guess a solution, we can use the techniques developed
in the repeated roots section to find another solution. This method will be called Reduction Of
Order. Consider a few examples
 Example 6.53 Find the general solution to

2t 2 y00 + ty0 − 3y = 0 (6.365)

given that y1 (t) = t −1 is a solution.


ANS: Think back to repeated roots. We knew we had a solution y1(t) and needed to find a distinct solution. What did we do? We asked for a nonconstant function v(t) that makes y2(t) = v(t)y1(t) also a solution. The y2 derivatives are

y2 = v t^{−1}  (6.366)
y2′ = v′ t^{−1} − v t^{−2}  (6.367)
y2″ = v″ t^{−1} − v′ t^{−2} − v′ t^{−2} + 2v t^{−3} = v″ t^{−1} − 2v′ t^{−2} + 2v t^{−3}  (6.368)

The next step is to plug into the original equation so we can solve for v:

2t^2 (v″ t^{−1} − 2v′ t^{−2} + 2v t^{−3}) + t(v′ t^{−1} − v t^{−2}) − 3v t^{−1} = 0  (6.369)
2v″ t − 4v′ + 4v t^{−1} + v′ − v t^{−1} − 3v t^{−1} = 0  (6.370)
2t v″ − 3v′ = 0  (6.371)

Notice that the only terms left involve v00 and v0 , not v. This also happened in the repeated root case.
The v term should always disappear at this point, so we have a check on our work. If there is a v
term left we have done something wrong.
Now we know that if y2 is a solution, the function v must satisfy

2tv00 − 3v0 = 0 (6.372)

But this is a second order linear homogeneous equation with nonconstant coefficients. Let w(t) =
v0 (t). By changing variables our equation becomes

w′ − (3/(2t)) w = 0.  (6.373)
So by Integrating Factor
µ(t) = e^{∫ −3/(2t) dt} = e^{−(3/2) ln(t)} = t^{−3/2}  (6.374)
(t^{−3/2} w)′ = 0  (6.375)
t^{−3/2} w = c  (6.376)
w(t) = c t^{3/2}  (6.377)

So we know the w(t) that solves the equation. But to solve our original differential equation, we do not need w(t), we need v(t). Since v′(t) = w(t), integrating w will give our v:

v(t) = ∫ w(t) dt  (6.378)
 = ∫ c t^{3/2} dt  (6.379)
 = (2/5) c t^{5/2} + k  (6.380)

Now this is the general form of v(t). Pick c = 5/2 and k = 0. Then v(t) = t^{5/2}, so y2(t) = v(t) y1(t) = t^{3/2}, and the general solution is

y(t) = c1 t^{−1} + c2 t^{3/2}  (6.381)

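A numeric check (our sketch, not the text's method) that the second solution found by reduction of order really satisfies 2t^2 y′′ + t y′ − 3y = 0:

```python
def y2(t):
    # second solution t^{3/2} from the reduction-of-order example
    return t ** 1.5

h = 1e-4
for t in [0.5, 1.0, 2.0, 4.0]:
    ypp = (y2(t + h) - 2*y2(t) + y2(t - h)) / h**2   # central differences
    yp = (y2(t + h) - y2(t - h)) / (2*h)
    assert abs(2*t**2*ypp + t*yp - 3*y2(t)) < 1e-3   # 2t^2 y'' + t y' - 3y = 0
```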
Reduction of Order is a powerful method for finding a second solution to a differential equation
when we do not have any other method, but we need to have a solution to begin with. Sometimes
even finding the first solution is difficult.
We have to be careful with these problems: sometimes the algebra is tedious, and one can make sloppy mistakes. Make sure the v term disappears when we plug in the derivatives for y2, and check the solution we obtain at the end in case there was an algebra mistake made in the solution process.
 Example 6.54 Find the general solution to

t 2 y00 + 2ty0 − 2y = 0 (6.382)

given that

y1 (t) = t (6.383)

is a solution.
Start by setting y2 (t) = v(t)y1 (t). So we have

y2 = t v  (6.384)
y2′ = t v′ + v  (6.385)
y2″ = t v″ + v′ + v′ = t v″ + 2v′.  (6.386)

Next, we plug in and arrange terms

t^2 (t v″ + 2v′) + 2t(t v′ + v) − 2t v = 0  (6.387)
t^3 v″ + 2t^2 v′ + 2t^2 v′ + 2t v − 2t v = 0  (6.388)
t^3 v″ + 4t^2 v′ = 0.  (6.389)

Notice the v drops out as desired. We make the change of variables w(t) = v0 (t) to obtain

t 3 w0 + 4t 2 w = 0 (6.390)

which has integrating factor µ(t) = t 4 .

(t^4 w)′ = 0  (6.391)
t^4 w = c  (6.392)
w(t) = c t^{−4}  (6.393)
So we have
v(t) = ∫ w(t) dt  (6.394)
 = ∫ c t^{−4} dt  (6.395)
 = −(c/3) t^{−3} + k.  (6.396)
A nice choice for the constants is c = −3 and k = 0, so v(t) = t −3 , which gives a second solution
of y2 (t) = v(t)y1 (t) = t −2 . So our general solution is

y(t) = c1t + c2t −2 (6.397)

6.9 Second-Order Linear Equations with Constant Coefficients and Non-zero Right-Hand Side
Last Time: We considered cases of homogeneous second order equations where the roots of the
characteristic equation were repeated real roots. Then we looked at the method of reduction of
order to produce a second solution to an equation given the first solution.

6.9.1 Nonhomogeneous Equations


A second order nonhomogeneous equation has the form

p(t)y00 + q(t)y0 + r(t)y = g(t) (6.398)

where g(t) ≠ 0. How do we get the general solution to these?


Suppose we have two solutions Y1 (t) and Y2 (t). The Principle of Superposition no longer holds
for nonhomogeneous equations. We cannot just take a linear combination of the two to get another
solution. Consider the equation

p(t)y00 + q(t)y0 + r(t)y = 0 (6.399)

which we will call the associated homogeneous equation.

Theorem 6.9.1 Suppose that Y1 (t) and Y2 (t) are two solutions to equation (6.398) and that
y1 (t) and y2 (t) are a fundamental set of solutions to (6.399). Then Y1 (t) −Y2 (t) is a solution to
Equation (6.399) and has the form

Y1 (t) −Y2 (t) = c1 y1 (t) + c2 y2 (t) (6.400)

Notice the notation used; it will be standard: uppercase letters denote solutions to the nonhomogeneous equation and lowercase letters denote solutions to the homogeneous equation.
Let’s verify the theorem by plugging in Y1 −Y2 to (6.399)

p(t)(Y1 − Y2)″ + q(t)(Y1 − Y2)′ + r(t)(Y1 − Y2) = 0  (6.401)
[p(t)Y1″ + q(t)Y1′ + r(t)Y1] − [p(t)Y2″ + q(t)Y2′ + r(t)Y2] = 0  (6.402)
g(t) − g(t) = 0  (6.403)
0 = 0  (6.404)

So we have that Y1 (t) −Y2 (t) solves equation (6.399). We know that y1 (t) and y2 (t) are a fundamen-
tal set of solutions to equation (6.399) and so any solution can be written as a linear combination of
them. Thus for constants c1 and c2

Y1 (t) −Y2 (t) = c1 y1 (t) + c2 y2 (t) (6.405)

So the difference of any two solutions of (6.398) is a solution to (6.399). Suppose we have a
solution to (6.398), which we denote by Yp (t). Let Y (t) denote the general solution. We have seen

Y (t) −Yp (t) = c1 y1 (t) + c2 y2 (t) (6.406)

or

Y (t) = c1 y1 (t) + c2 y2 (t) +Yp (t) (6.407)

where y1 and y2 are a fundamental set of solutions to (6.399). We will call

yc (t) = c1 y1 (t) + c2 y2 (t) (6.408)

the complementary solution and Yp(t) a particular solution. So, the general solution can be
expressed as

Y (t) = yc (t) +Yp (t). (6.409)

Thus, to find the general solution of (6.398), we’ll need to find the general solution to (6.399) and
then find some solution to (6.398). Adding these two pieces together give the general solution to
(6.398).
If we vary a solution to (6.398) by just adding in some solution to Equation (6.399), it will still
solve Equation (6.398). Now the goal of this section is to find some particular solution Yp (t) to
Equation (6.398). We have two methods. The first is the method of Undetermined Coefficients,
which reduces the problem to an algebraic problem, but only works in a few situations. The other
called Variation of Parameters is a much more general method that always works but requires
integration which may or may not be tedious.

6.9.2 Undetermined Coefficients


The major disadvantage of this solution method is that it is only useful for constant coefficient
differential equations, so we will focus on

ay00 + by0 + cy = g(t) (6.410)

for g(t) ≠ 0. The other disadvantage is it only works for a small class of g(t)'s.
Recall that we are trying to find some particular solution Yp (t) to Equation (6.410). The idea
behind the method is that for certain classes of nonhomogeneous terms, we’re able to make a good
educated guess as to how Yp (t) should look, up to some unknown coefficients. Then we plug our
guess into the differential equation and try to solve for the coefficients. If we can, our guess was
correct and we have determined Yp (t). If we cannot solve for the coefficients, then we guessed
incorrectly and we will need to try again.

6.9.3 The Basic Functions


There are three basic types of nonhomogeneous terms g(t) that can be used with this method: exponentials, trig functions (sin and cos), and polynomials. Once we know how they work individually, combinations will be similar.
Exponentials
Let’s walk through an example where g(t) is an exponential and see how to proceed.
 Example 6.55 Determine a particular solution to

y00 − 4y0 − 12y = 2e4t . (6.411)

How can we guess the form of Yp (t)? When we plug Yp (t) into the equation, we should get
g(t) = 2e4t . We know that exponentials never appear or disappear during differentiation, so try

Yp (t) = Ae4t (6.412)

for some coefficient A. Differentiate, plug in, and see if we can determine A. Plugging in we get
16Ae4t − 4(4Ae4t ) − 12Ae4t = 2e4t (6.413)
−12A e^{4t} = 2 e^{4t}  (6.414)
For these to be equal we need A to satisfy
−12A = 2 ⇒ A = −1/6.  (6.415)
So with this choice of A, our guess works, and the particular solution is
Yp(t) = −(1/6) e^{4t}.  (6.416)


Consider the following full problem:


 Example 6.56 Solve the IVP
y′′ − 4y′ − 12y = 2e^{4t},  y(0) = −13/6,  y′(0) = 7/3.  (6.417)
We know the general solution has the form

y(t) = yc (t) +Yp (t) (6.418)

where the complementary solution yc(t) is the general solution to the associated homogeneous
equation

y00 − 4y0 − 12y = 0 (6.419)

and Yp (t) is the particular solution to the original differential equation. From the previous example
we know
Yp(t) = −(1/6) e^{4t}.  (6.420)
What is the complementary solution? Our associated homogeneous equation has constant coefficients, so we need to find roots of the characteristic equation.
r2 − 4r − 12 = 0 (6.421)
(r − 6)(r + 2) = 0 (6.422)
So we conclude that r1 = 6 and r2 = −2. These are distinct roots, so the complementary solution
will be

yc (t) = c1 e6t + c2 e−2t (6.423)



We must be careful to remember the initial conditions are for the nonhomogeneous equation, not
the associated homogeneous equation. Do not apply them at this stage to yc , since that is not a
solution to the original equation.
So our general solution is the sum of yc (t) and Yp (t). We’ll need it and its derivative to apply
the initial conditions
y(t) = c1 e^{6t} + c2 e^{−2t} − (1/6) e^{4t}  (6.424)
y′(t) = 6c1 e^{6t} − 2c2 e^{−2t} − (2/3) e^{4t}  (6.425)
Now apply the initial conditions
−13/6 = y(0) = c1 + c2 − 1/6  (6.426)
7/3 = y′(0) = 6c1 − 2c2 − 2/3  (6.427)
This system is solved by c1 = −1/8 and c2 = −15/8, so our solution is

y(t) = −(1/8) e^{6t} − (15/8) e^{−2t} − (1/6) e^{4t}.  (6.428)


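The full solution of Example 6.56 can be verified directly: it should meet both initial conditions and leave residual 2e^{4t} under the differential operator (an illustrative sketch):

```python
import math

def y(t):
    # complementary solution with c1 = -1/8, c2 = -15/8, plus the particular solution
    return -math.exp(6*t)/8 - 15*math.exp(-2*t)/8 - math.exp(4*t)/6

h = 1e-4
assert abs(y(0) + 13/6) < 1e-12                    # y(0) = -13/6
assert abs((y(h) - y(-h))/(2*h) - 7/3) < 1e-6      # y'(0) = 7/3
for t in [0.0, 0.5, 1.0]:
    ypp = (y(t + h) - 2*y(t) + y(t - h)) / h**2
    yp = (y(t + h) - y(t - h)) / (2*h)
    assert abs(ypp - 4*yp - 12*y(t) - 2*math.exp(4*t)) < 1e-2
```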
Trig Functions
The second class of nonhomogeneous terms for which we can use this method are trig functions,
specifically sin and cos.
 Example 6.57 Find a particular solution to the following equation
y′′ − 4y′ − 12y = 13 cos(4t).  (6.429)
In the first example the nonhomogeneous term was exponential, and we know when we
differentiate exponentials they persist. In this case, we’ve got a cosine function. When we
differentiate a cosine, we get sine. So we expect an initial guess to require a sine term in addition to
cosine. Try
Yp (t) = A cos(4t) + B sin(4t). (6.430)
Now differentiate and plug in

−16A cos(4t) − 16B sin(4t) − 4(−4A sin(4t) + 4B cos(4t)) − 12(A cos(4t) + B sin(4t)) = 13 cos(4t)  (6.431)
(−16A − 16B − 12A) cos(4t) + (−16B + 16A − 12B) sin(4t) = 13 cos(4t)  (6.432)
(−28A − 16B) cos(4t) + (16A − 28B) sin(4t) = 13 cos(4t)  (6.433)
To solve for A and B set the coefficients equal. Note that the coefficient for sin(4t) on the right
hand side is 0. So we get the system of equations
cos(4t) : −28A − 16B = 13 (6.434)
sin(4t) : 16A − 28B = 0. (6.435)
This system is solved by A = −7/20 and B = −1/5. So a particular solution is

Yp(t) = −(7/20) cos(4t) − (1/5) sin(4t)  (6.436)


Note that the guess would have been the same if g(t) had been sine instead of cosine.
Polynomials
The third and final class of nonhomogeneous term we can use with this method are polynomials.
 Example 6.58 Find a particular solution to

y00 − 4y0 − 12y = 3t 3 − 5t + 2. (6.437)

In this case, g(t) is a cubic polynomial. When differentiating polynomials the order decreases. So
if our initial guess is a cubic, we should capture all terms that will arise. Our guess

Yp (t) = At 3 + Bt 2 +Ct + D. (6.438)

Note that we have a t 2 term in our equation even though one does not appear in g(t)! Now
differentiate and plug in
6At + 2B − 4(3At^2 + 2Bt + C) − 12(At^3 + Bt^2 + Ct + D) = 3t^3 − 5t + 2  (6.439)
−12At^3 + (−12A − 12B)t^2 + (6A − 8B − 12C)t + (2B − 4C − 12D) = 3t^3 − 5t + 2  (6.440)
We obtain a system of equations by setting coefficients equal:
t^3: −12A = 3 ⇒ A = −1/4  (6.441)
t^2: −12A − 12B = 0 ⇒ B = 1/4  (6.442)
t: 6A − 8B − 12C = −5 ⇒ C = 1/8  (6.443)
1: 2B − 4C − 12D = 2 ⇒ D = −1/6  (6.444)
So a particular solution is
Yp(t) = −(1/4) t^3 + (1/4) t^2 + (1/8) t − 1/6  (6.445)


Summary
Given each of the basic types, we make the following guess:
a e^{αt} ⇒ A e^{αt}  (6.446)
a cos(αt) ⇒ A cos(αt) + B sin(αt)  (6.447)
a sin(αt) ⇒ A cos(αt) + B sin(αt)  (6.448)
a_n t^n + a_{n−1} t^{n−1} + ... + a_1 t + a_0 ⇒ A_n t^n + A_{n−1} t^{n−1} + ... + A_1 t + A_0  (6.449)
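Once a guess is chosen, matching coefficients always reduces to a small linear system. For instance, the 2×2 system from the cosine example above (−28A − 16B = 13, 16A − 28B = 0) can be solved exactly with rational arithmetic; this Cramer's-rule sketch uses our own variable names:

```python
from fractions import Fraction

# coefficients of the 2x2 system  a*A + b*B = e,  c*A + d*B = f
a, b, e = Fraction(-28), Fraction(-16), Fraction(13)
c, d, f = Fraction(16), Fraction(-28), Fraction(0)

det = a*d - b*c            # Cramer's rule
A = (e*d - b*f) / det
B = (a*f - e*c) / det

assert A == Fraction(-7, 20)
assert B == Fraction(-1, 5)
```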

6.9.4 Products
The idea for products is to take products of our forms above.
 Example 6.59 Find a particular solution to

y00 − 4y0 − 12y = te4t (6.450)

Start by writing the guess for the individual pieces. g(t) is the product of a polynomial and an exponential. The guess for the polynomial is At + B, while the guess for the exponential is Ce^{4t}.
So the guess for the product should be

Ce4t (At + B) (6.451)



We want to minimize the number of constants, so


Ce4t (At + B) = e4t (ACt + BC). (6.452)
Rewrite with two constants
Yp (t) = e4t (At + B) (6.453)
Notice this is the same guess as if g(t) were just t, multiplied by the exponential. Differentiate and plug in
16e4t (At + B) + 8Ae4t − 4(4e4t (At + B) + Ae4t ) − 12e4t (At + B) = te4t (6.454)
(16A − 16A − 12A) t e^{4t} + (16B + 8A − 16B − 4A − 12B) e^{4t} = t e^{4t}  (6.455)
−12A t e^{4t} + (4A − 12B) e^{4t} = t e^{4t}  (6.456)
Then we set the coefficients equal
t e^{4t}: −12A = 1 ⇒ A = −1/12  (6.457)
e^{4t}: 4A − 12B = 0 ⇒ B = −1/36  (6.458)
So, a particular solution for this differential equation is
Yp(t) = e^{4t}(−(1/12) t − 1/36) = −(e^{4t}/36)(3t + 1).  (6.459)


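As before, the result can be sanity-checked numerically (our sketch; the residual should match the forcing term t e^{4t}):

```python
import math

def Yp(t):
    # particular solution from Example 6.59
    return -math.exp(4*t)/36 * (3*t + 1)

h = 1e-4
for t in [0.0, 0.5, 1.0]:
    ypp = (Yp(t + h) - 2*Yp(t) + Yp(t - h)) / h**2   # central differences
    yp = (Yp(t + h) - Yp(t - h)) / (2*h)
    assert abs(ypp - 4*yp - 12*Yp(t) - t*math.exp(4*t)) < 1e-2
```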
Basic Rule: If we have a product with an exponential write down the guess for the other piece
and multiply by an exponential without any leading coefficient.
 Example 6.60 Find a particular solution to

y00 − 4y0 − 12y = 29e5t sin(3t). (6.460)


We try the following guess
Yp (t) = e5t (A cos(3t) + B sin(3t)). (6.461)
So differentiate and plug in
25e^{5t}(A cos(3t) + B sin(3t)) + 30e^{5t}(−A sin(3t) + B cos(3t))
+ 9e^{5t}(−A cos(3t) − B sin(3t)) − 4(5e^{5t}(A cos(3t) + B sin(3t))
+ 3e^{5t}(−A sin(3t) + B cos(3t))) − 12e^{5t}(A cos(3t) + B sin(3t)) = 29e^{5t} sin(3t)  (6.462)
Gather like terms
(−16A + 18B)e5t cos(3t) + (−18A − 16B)e5t sin(3t) = 29e5t sin(3t) (6.463)
Set the coefficients equal
e^{5t} cos(3t): -16A + 18B = 0 (6.464)
e^{5t} sin(3t): -18A - 16B = 29 (6.465)
This is solved by A = -9/10 and B = -4/5. So a particular solution to this differential equation is
Yp(t) = e^{5t}(-(9/10) cos(3t) - (4/5) sin(3t)) = -(e^{5t}/10)(9 cos(3t) + 8 sin(3t)). (6.466)
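The 2x2 system for A and B can be checked mechanically. A small sketch using Cramer's rule (plain Python, no libraries assumed):

```python
# System from matching coefficients:  -16A + 18B = 0,  -18A - 16B = 29,
# written as M [A, B]^T = [0, 29]^T with M = [[-16, 18], [-18, -16]].
det = (-16) * (-16) - 18 * (-18)      # determinant = 256 + 324 = 580
A = (0 * (-16) - 18 * 29) / det       # numerator: first column replaced by RHS
B = ((-16) * 29 - (-18) * 0) / det    # numerator: second column replaced by RHS
print(A, B)  # -0.9 -0.8, i.e. A = -9/10 and B = -4/5
```

Cramer's rule is a reasonable choice here because the coefficient matrix is only 2x2; for larger systems, Gaussian elimination is preferred.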

 Example 6.61 Write down the form of the particular solution to

y'' - 4y' - 12y = g(t) (6.467)

for the following g(t):


(1) g(t) = (9t^2 - 103t) cos(t)

Here we have a product of a quadratic and a cosine. The guess for the quadratic is

At^2 + Bt + C (6.468)

and the guess for the cosine is

D cos(t) + E sin(t). (6.469)

Multiplying the two guesses gives

(At^2 + Bt + C)(D cos(t)) + (At^2 + Bt + C)(E sin(t)) (6.470)

(ADt^2 + BDt + CD) cos(t) + (AEt^2 + BEt + CE) sin(t). (6.471)

Each of the coefficients is a product of two constants, which is another constant. Simplify to get our
final guess

Yp(t) = (At^2 + Bt + C) cos(t) + (Dt^2 + Et + F) sin(t) (6.472)

This is indicative of the general rule for a product of a polynomial and a trig function: write
down the guess for the polynomial, multiply by cosine, then add to that the guess for the polynomial
multiplied by sine.

(2) g(t) = e^{-2t}(3 - 5t) cos(9t)

This nonhomogeneous term has all three types of special functions. Combining the two general
rules above, we get

Yp(t) = e^{-2t}(At + B) cos(9t) + e^{-2t}(Ct + D) sin(9t). (6.473)

6.9.5 Sums
We have the following important fact. If Y1 satisfies

p(t)y'' + q(t)y' + r(t)y = g1(t) (6.474)

and Y2 satisfies

p(t)y'' + q(t)y' + r(t)y = g2(t), (6.475)

then Y1 + Y2 satisfies

p(t)y'' + q(t)y' + r(t)y = g1(t) + g2(t). (6.476)

This means that if our nonhomogeneous term g(t) is a sum of terms, we can write down the
guesses for each of those terms and add them together to form our guess.

 Example 6.62 Find a particular solution to

y'' - 4y' - 12y = e^{7t} + 12. (6.477)

Our nonhomogeneous term g(t) = e^{7t} + 12 is the sum of an exponential g1(t) = e^{7t} and a degree-0
polynomial g2(t) = 12. The guess is

Yp(t) = Ae^{7t} + B (6.478)

This cannot be simplified, so this is our guess. Differentiate and plug in

49Ae^{7t} - 28Ae^{7t} - 12Ae^{7t} - 12B = e^{7t} + 12 (6.479)

9Ae^{7t} - 12B = e^{7t} + 12. (6.480)

Setting the coefficients equal gives A = 1/9 and B = -1, so our particular solution is

Yp(t) = (1/9)e^{7t} - 1. (6.481)
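Because the coefficients here are exact rationals, the bookkeeping can be confirmed with exact arithmetic. A sketch using Python's fractions module:

```python
from fractions import Fraction

A, B = Fraction(1, 9), Fraction(-1)
# Plug Yp = A e^{7t} + B into y'' - 4y' - 12y = e^{7t} + 12:
# e^{7t} coefficient: 49A - 28A - 12A must equal 1
assert 49 * A - 28 * A - 12 * A == 1
# constant term: -12B must equal 12
assert -12 * B == 12
print("A = 1/9 and B = -1 check out exactly")
```

Exact rational arithmetic avoids the rounding questions a floating-point check would raise.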


 Example 6.63 Write down the form of a particular solution to

y'' - 4y' - 12y = g(t) (6.482)

for each of the following g(t):


(1) g(t) = 2 cos(3t) - 9 sin(3t)

Our guess for the cosine is

A cos(3t) + B sin(3t) (6.483)

Additionally, our guess for the sine is

C cos(3t) + D sin(3t) (6.484)

So if we add the two of them together, we obtain

A cos(3t) + B sin(3t) + C cos(3t) + D sin(3t) = (A + C) cos(3t) + (B + D) sin(3t) (6.485)

But A + C and B + D are just constants, so we can replace them with the guess

Yp(t) = A cos(3t) + B sin(3t). (6.486)

(2) g(t) = sin(t) − 2 sin(14t) − 5 cos(14t)

Start with a guess for the sin(t)

A cos(t) + B sin(t). (6.487)

Since they have the same argument, the previous example showed we can combine the guesses for
cos(14t) and sin(14t) into

C cos(14t) + D sin(14t) (6.488)


So the final guess is
Yp(t) = A cos(t) + B sin(t) + C cos(14t) + D sin(14t) (6.489)
(3) g(t) = 7 sin(10t) - 5t^2 + 4t
Here we have the sum of a trig function and a quadratic, so the guess will be
Yp(t) = A cos(10t) + B sin(10t) + Ct^2 + Dt + E. (6.490)
(4) g(t) = 9e^t + 3te^{-5t} - 5e^{-5t}
This can be rewritten as 9e^t + (3t - 5)e^{-5t}. So our guess will be
Yp(t) = Ae^t + (Bt + C)e^{-5t} (6.491)
(5) g(t) = t^2 sin(t) + 4 cos(t)
So our guess will be
Yp(t) = (At^2 + Bt + C) cos(t) + (Dt^2 + Et + F) sin(t). (6.492)
(6) g(t) = 3e^{-3t} + e^{-3t} sin(3t) + cos(3t)
Our guess is
Yp(t) = Ae^{-3t} + e^{-3t}(B cos(3t) + C sin(3t)) + D cos(3t) + E sin(3t). (6.493)
This seems simple, right? There is one problem which can arise that you need to be aware of.
 Example 6.64 Find a particular solution to
y'' - 4y' - 12y = e^{6t} (6.494)
This seems straightforward, so try Yp(t) = Ae^{6t}. If we differentiate and plug in,
36Ae^{6t} - 24Ae^{6t} - 12Ae^{6t} = e^{6t} (6.495)
0 = e^{6t} (6.496)
Exponentials are never zero, so this is impossible. Did we make a mistake in our original
guess? Yes: if we go through the normal process and find the complementary solution, in this
case
yc(t) = c1 e^{6t} + c2 e^{-2t}, (6.497)
we see that our guess for the particular solution was actually part of the complementary solution. So we
need a different guess. Think back to repeated-root solutions and try Yp(t) = Ate^{6t}:
(36Ate^{6t} + 12Ae^{6t}) - 4(6Ate^{6t} + Ae^{6t}) - 12Ate^{6t} = e^{6t} (6.498)
(36A - 24A - 12A)te^{6t} + (12A - 4A)e^{6t} = e^{6t} (6.499)
8Ae^{6t} = e^{6t} (6.500)
Setting the coefficients equal, we conclude that A = 1/8, so
Yp(t) = (1/8)te^{6t}. (6.501)


NOTE: This situation also arises when the complementary solution has a repeated root, so that it has the
form
yc(t) = c1 e^{rt} + c2 te^{rt}; (6.502)
then our guess for the particular solution should be
Yp(t) = At^2 e^{rt}. (6.503)
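The repeated-root fix above can be double-checked by differentiating Yp(t) = (1/8)te^{6t} in closed form and plugging it back in. A quick sketch in plain Python:

```python
import math

def residual(t):
    e = math.exp(6 * t)
    Yp = t * e / 8               # the corrected guess A t e^{6t} with A = 1/8
    dYp = e * (1 + 6 * t) / 8    # product rule
    d2Yp = e * (12 + 36 * t) / 8
    # y'' - 4y' - 12y - e^{6t}: should be identically zero
    return d2Yp - 4 * dYp - 12 * Yp - e

for t in (0.0, 0.5, 1.0):
    assert abs(residual(t)) < 1e-9
print("Yp(t) = t e^{6t} / 8 works")
```

Using exact derivative formulas (rather than finite differences) makes the residual zero up to floating-point rounding.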


6.9.6 Method of Undetermined Coefficients


We want to construct the general solution y(t) = yc(t) + Yp(t) by following these steps:
(1) Find the general solution of the corresponding homogeneous equation.
(2) Make sure g(t) belongs to the special set of basic functions described above.
(3) If g(t) = g1(t) + ... + gn(t) is the sum of n terms, form n subproblems, each of which
contains only one gi(t), where the ith subproblem is

ay'' + by' + cy = gi(t). (6.504)

(4) For the ith subproblem, assume a particular solution built from the appropriate functions (exponential,
sine, cosine, polynomial). If there is duplication between Yi(t) and a solution of the homogeneous
problem, multiply Yi(t) by t (or, if necessary, t^2).
(5) Find the particular solution Yi(t) for each subproblem. Then the sum of the Yi is a particular
solution of the full nonhomogeneous problem.
(6) Form the general solution by summing the complementary solution of the homogeneous
equation and the n particular solutions.
(7) Use the initial conditions to determine the values of the arbitrary constants remaining in the
general solution.

Now for more examples; write down the guess for the particular solution.
(1) y'' - 3y' - 28y = 6t + e^{-4t} - 2
First we find the complementary solution using the characteristic equation:

yc(t) = c1 e^{7t} + c2 e^{-4t} (6.505)

Now look at the nonhomogeneous term, which is a polynomial plus an exponential, 6t - 2 + e^{-4t}. So
our initial guess should be

At + B + Ce^{-4t} (6.506)

The first two terms are fine, but the last term is in the complementary solution. Since Cte^{-4t} does
not show up in the complementary solution, our guess should be

Yp(t) = At + B + Cte^{-4t}. (6.507)

(2) y'' - 64y = t^2 e^{8t} + cos(t)

The complementary solution is

yc(t) = c1 e^{8t} + c2 e^{-8t}. (6.508)

Our initial guess for a particular solution is

(At^2 + Bt + C)e^{8t} + D cos(t) + E sin(t) (6.509)

Again we have a Ce^{8t} term which is also in the complementary solution, so we need to multiply the
entire first term by t. Our final guess is

Yp(t) = (At^3 + Bt^2 + Ct)e^{8t} + D cos(t) + E sin(t). (6.510)

(3) y'' + 4y = e^{-t} cos(2t) + t sin(2t)

The complementary solution is

yc(t) = c1 cos(2t) + c2 sin(2t) (6.511)



Our first guess for a particular solution would be

e^{-t}(A cos(2t) + B sin(2t)) + (Ct + D) cos(2t) + (Et + F) sin(2t) (6.512)

We notice the second and third terms contain parts of the complementary solution, so we need to
multiply them by t. Our final guess is

Yp(t) = e^{-t}(A cos(2t) + B sin(2t)) + (Ct^2 + Dt) cos(2t) + (Et^2 + Ft) sin(2t). (6.513)

(4) y'' + 2y' + 5y = e^{-t} cos(2t) + t sin(2t)

Notice the nonhomogeneous term in this example is the same as in the previous one, but the
equation has changed. Now the complementary solution is

yc(t) = c1 e^{-t} cos(2t) + c2 e^{-t} sin(2t) (6.514)

So our initial guess for the particular solution is the same as in the last example:

e^{-t}(A cos(2t) + B sin(2t)) + (Ct + D) cos(2t) + (Et + F) sin(2t) (6.515)

This time the first term causes the problem, so multiply the first term by t to get the final guess

Yp(t) = te^{-t}(A cos(2t) + B sin(2t)) + (Ct + D) cos(2t) + (Et + F) sin(2t) (6.516)

So even though the nonhomogeneous parts are the same, the guess also depends critically on the
complementary solution and thus on the differential equation itself.
(5) y'' + 4y' + 4y = t^2 e^{-2t} + 2e^{-2t}
The complementary solution is

yc(t) = c1 e^{-2t} + c2 te^{-2t} (6.517)

Notice that we can factor e^{-2t} out of the nonhomogeneous term, which becomes (t^2 + 2)e^{-2t}.
This is the product of a polynomial and an exponential, so our initial guess is

(At^2 + Bt + C)e^{-2t} (6.518)

But the Ce^{-2t} term is in yc(t), and so is Cte^{-2t}. So we must multiply by t^2 to get our final
guess

Yp(t) = (At^4 + Bt^3 + Ct^2)e^{-2t}. (6.519)

6.10 Mechanical and Electrical Vibrations


Last Time: We studied the method of undetermined coefficients, focusing mostly on
determining guesses for particular solutions once we have solved for the complementary solution.

6.10.1 Applications
The first application is mechanical vibrations. Consider an object of mass m hanging from
a spring of natural length l; there are a number of applications in engineering with the same
general setup as this.
We establish the convention that downward displacements and forces are always
positive, while upward displacements and forces are negative. BE CONSISTENT. We also measure
all displacements from the equilibrium position. Thus if our displacement is u(t), then u = 0 corresponds
to the object's center of gravity as it hangs at rest from the spring.

We need to develop a differential equation to model the displacement u of the object. Recall
Newton’s Second Law

F = ma (6.520)

where m is the mass of the object. We want our equation to be for displacement, so we'll replace a
by u'', and Newton's Second Law becomes

F(t, u, u') = mu''. (6.521)

What are the various forces acting on the object? We will consider four different forces, some of
which may or may not be present in a given situation.

(1) Gravity, Fg
The gravitational force always acts on an object. It is given by

Fg = mg (6.522)

where g is the acceleration due to gravity. For simpler computations, you may take g = 10 m/s^2.
Notice gravity is always positive since it acts downward.

(2) Spring, Fs
We attach an object to a spring, and the spring will exert a force on the object. Hooke’s Law
governs this force. The spring force is proportional to the displacement of the spring from its
natural length. What is the displacement of the spring? When we attach an object to a spring, the
spring gets stretched: it stretches by some amount L beyond its natural length. Once the object is further
displaced by u, the total displacement of the spring from its natural length is L + u.
So the spring force is

Fs = −k(L + u) (6.523)

where k > 0 is the spring constant. Why is it negative? It is to make sure the force is in the correct
direction. If u > −L, i.e. the spring has been stretched beyond its natural length, then u + L > 0
and so Fs < 0, which is what we expect because the spring would pull upward on the object in this
situation. If u < −L, so the spring is compressed, then the spring force would push the object back
downwards and we expect to find Fs > 0.

(3) Damping, Fd
We will consider some situations where the system experiences damping. This will not always
be present, but always notice if damping is involved. Dampers work to counteract motion (example:
shocks on a car), so this will oppose the direction of the object’s velocity.
In other words, if the object has downward velocity u' > 0, we want the damping force
to act in the upward direction, so that Fd < 0. Similarly, if u' < 0, we want Fd > 0. We assume
all damping is linear:

Fd = -γu' (6.524)

where γ > 0 is the damping constant.

(4) External Force, F(t)


This encompasses all other forces present in a problem. An example is a spring hooked up to
a piston that exerts an extra force on it. We call F(t) the forcing function; it is just the sum
of any external forces we have in a particular problem.

The most important part of any problem is identifying all the forces involved in the problem.
Some may not be present. The forces will change depending on the particular situation. Let’s
consider the general form of our differential equation modeling a spring system. We have

F(t, u, u') = Fg + Fs + Fd + F(t), (6.525)

so that Newton’s Second Law becomes

mu'' = mg - k(L + u) - γu' + F(t), (6.526)

or upon reordering it becomes

mu'' + γu' + ku = mg - kL + F(t). (6.527)

What happens when the object is at rest? At equilibrium, u = 0, and there are only two forces acting on
the object: gravity and the spring force. Since the object is at rest, these two forces must balance, so
Fg + Fs = 0. In other words,

mg = kL. (6.528)

So our equation simplifies to

mu'' + γu' + ku = F(t), (6.529)

and this is the most general form of our equation, with all forces present. We have the corresponding
initial conditions

u(0) = u0 (initial displacement from the equilibrium position) (6.530)

u'(0) = u0' (initial velocity) (6.531)

Before we discuss individual examples, we need to touch on how we might figure out the
constants k and γ if they are not explicitly given. Consider the spring constant k. If the
spring is attached to an object with mass m, the object stretches the spring by some length L
when it is at rest, and at equilibrium mg = kL. Thus, if we know how much an object of
known mass stretches the spring when at rest, we can compute

k = mg/L. (6.532)

How do we compute γ? If we do not know the damping coefficient from the beginning, we may
know how much force a damper exerts to oppose motion of a given speed. Then set |Fd| = γ|u'|,
where |Fd| is the magnitude of the damping force and |u'| is the speed of motion, so γ = |Fd|/|u'|.
We will see how to compute these in the examples on damped motion. Now let's consider specific
spring-mass systems.
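These two formulas translate directly into code. A small sketch (plain Python; g = 10 m/s^2 as the text suggests, and the sample numbers match the examples that follow):

```python
def spring_constant(m, L, g=10.0):
    """k = m g / L, from the stretch L that an object of mass m produces at rest."""
    return m * g / L

def damping_coefficient(force, speed):
    """gamma = |Fd| / |u'|, from a measured damper force at a known speed."""
    return abs(force) / abs(speed)

print(spring_constant(2, 5 / 8))      # 32.0, the spring in Example 6.65
print(damping_coefficient(48, 3))     # 16.0, the damper in Example 6.66
```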

6.10.2 Free, Undamped Motion


We start with free systems: no damping or external forces. This is the simplest situation, since
γ = 0 and F(t) = 0. Our differential equation is

mu'' + ku = 0, (6.533)

where m, k > 0. Solve by considering the characteristic equation

mr^2 + k = 0, (6.534)

which has roots

r_{1,2} = ±i√(k/m). (6.535)

We'll write

r_{1,2} = ±iω0, (6.536)

where we've substituted

ω0 = √(k/m). (6.537)
ω0 is called the natural frequency of the system, for reasons that will be clear shortly.
Since the roots of our characteristic equation are imaginary, the form of our general solution is

u(t) = c1 cos(ω0t) + c2 sin(ω0t) (6.538)

This is why we called ω0 the natural frequency of the system: it is the frequency of motion when
the spring-mass system has no interference from dampers or external forces.
Given initial conditions we can solve for c1 and c2 . This is not the ideal form of the solution
though since it is not easy to read off critical information. After we solve for the constants rewrite
as

u(t) = R cos(ω0t − δ ), (6.539)

where R > 0 is the amplitude of displacement and δ is the phase angle of displacement, sometimes
called the phase shift.
Before determining how to rewrite the general solution in this desired form, let's compare the
two forms. The general solution makes it easier to find the constants c1 and c2, but the amplitude-phase
form is easier to work with, since we can immediately read off the amplitude, which makes graphing
much easier. So ideally we will find the general solution, solve for c1 and c2, and then convert to
the final form.
Assuming we have c1 and c2, how do we find R and δ? Using a trig identity, we can rewrite
Equation (6.539) as

u(t) = R cos(δ ) cos(ω0t) + R sin(δ ) sin(ω0t). (6.540)

Comparing this to the general solution, we see that

c1 = R cos(δ ), c2 = R sin(δ ). (6.541)

Notice

c1^2 + c2^2 = R^2(cos^2(δ) + sin^2(δ)) = R^2, (6.542)

so that, assuming R > 0,

R = √(c1^2 + c2^2). (6.543)

Also,

c2/c1 = sin(δ)/cos(δ) = tan(δ), (6.544)

which we can use to find δ.

 Example 6.65 A 2kg object is attached to a spring, which it stretches by 5/8 m. The object is
given an initial displacement of 1m upward and an initial downward velocity of 4m/sec.
Assuming there are no other forces acting on the spring-mass system, find the displacement of the
object at time t and express it as a single cosine.
The first step is to write down the initial value problem for this setup. We'll need m and k.
m is easy: the mass of the object is 2kg. How about k? We know

k = mg/L = (2)(10)/(5/8) = 32. (6.545)
So our differential equation is
2u'' + 32u = 0. (6.546)
The initial conditions are given by
u(0) = -1, u'(0) = 4. (6.547)
The characteristic equation is
2r^2 + 32 = 0, (6.548)
and this has roots r_{1,2} = ±4i. Hence ω0 = 4. Check: ω0 = √(k/m) = √(32/2) = 4. So our general
solution is
u(t) = c1 cos(4t) + c2 sin(4t). (6.549)
Using our initial conditions, we see
-1 = u(0) = c1 (6.550)
4 = u'(0) = 4c2 ⇒ c2 = 1. (6.551)
So the solution is
u(t) = -cos(4t) + sin(4t). (6.552)
We want to write this as a single cosine. Compute R:
R = √(c1^2 + c2^2) = √2. (6.553)
Now consider δ:
tan(δ) = c2/c1 = -1. (6.554)
So δ is in Quadrant II or IV. To decide which, look at the signs of cos(δ) and sin(δ). Since R > 0, we have
sin(δ) = c2/R > 0 (6.555)
cos(δ) = c1/R < 0. (6.556)
So δ must be in Quadrant II, since there sin > 0 and cos < 0. If we take arctan(-1) = -π/4, this
has a value in Quadrant IV. Since tan is π-periodic, however, -π/4 + π = 3π/4 is in Quadrant II and
also has a tangent of -1. Thus our desired phase angle is
δ = arctan(c2/c1) + π = arctan(-1) + π = 3π/4, (6.557)
and our solution has the final form
u(t) = √2 cos(4t - 3π/4). (6.558)


6.10.3 Free, Damped Motion


Now, let’s consider what happens if we add a damper into the system with damping coefficient γ.
We still consider free motion, so F(t) = 0, and our differential equation becomes

mu'' + γu' + ku = 0. (6.559)

The characteristic equation is

mr^2 + γr + k = 0, (6.560)

and it has solutions

r_{1,2} = (-γ ± √(γ^2 - 4km)) / (2m). (6.561)
There are three different cases we need to consider, corresponding to the discriminant being positive,
zero, or negative.

(1) γ^2 - 4mk = 0

This case gives a double root r = -γ/(2m), and so the general solution to our equation is

u(t) = c1 e^{-γt/(2m)} + c2 te^{-γt/(2m)}. (6.562)

Notice that lim_{t→∞} u(t) = 0, which is good, since this signifies damping. This is called critical
damping and occurs when
γ^2 - 4mk = 0 (6.563)
γ = √(4mk) = 2√(mk). (6.564)

This value, γ_CR = 2√(mk), is called the critical damping coefficient. Since
this case separates the other two, it is generally useful to be able to calculate this coefficient for a
given spring-mass system, which we can do using this formula. Critically damped systems may
cross u = 0 once but will never cross more than that: there is no oscillation.

(2) γ^2 - 4mk > 0

In this case, the discriminant is positive and so we will get two distinct real roots r1 and r2.
Hence our general solution is

u(t) = c1 e^{r1 t} + c2 e^{r2 t}. (6.565)

But what is the behavior of this solution? Since we have damping, the solution should die out; we
need to check that lim_{t→∞} u(t) = 0. Rewrite the roots:

r_{1,2} = (-γ ± √(γ^2 - 4mk)) / (2m) (6.566)
= (-γ ± γ√(1 - 4mk/γ^2)) / (2m) (6.567)
= -(γ/(2m))(1 ∓ √(1 - 4mk/γ^2)). (6.568)

By assumption, we have γ^2 > 4mk. Hence

0 < 1 - 4mk/γ^2 < 1 (6.569)

and so

√(1 - 4mk/γ^2) < 1, (6.570)

so the quantity in parentheses above is guaranteed to be positive, which means both of our roots are
negative.
Thus the damping in this case has the desired effect, and the vibration will die out in the limit.
This case, which occurs when γ > γ_CR, is called overdamping. The solution won't oscillate around
equilibrium, but settles back into place: the overdamping kills all oscillation.

(3) γ^2 < 4mk

The final case is when γ < γ_CR. In this case, the characteristic equation has complex roots

r_{1,2} = (-γ ± √(γ^2 - 4mk)) / (2m) = α ± iβ. (6.571)

The displacement is
u(t) = c1 e^{αt} cos(βt) + c2 e^{αt} sin(βt) (6.572)
= e^{αt}(c1 cos(βt) + c2 sin(βt)). (6.573)
In analogy to the free undamped case, we can rewrite this as
u(t) = Re^{αt} cos(βt - δ). (6.574)
We know α < 0, so the displacement will settle back to equilibrium. The difference is that
solutions will oscillate even as the oscillations have smaller and smaller amplitude. This is called
underdamping.
Notice that the solution u(t) is not quite periodic: it has the form of a cosine, but its amplitude is
not constant. Such a function is called quasi-periodic, since it oscillates with a constant frequency
but a varying amplitude; β is called the quasi-frequency of the oscillation.
So when we have free, damped vibrations, we have one of these three cases. A good example
to keep in mind when considering damping is car shocks. If the shocks are new, the system is overdamped:
when you hit a bump in the road, the car settles back into place. As the shocks wear, there is more
of an initial bump, but the car still settles and does not bounce around. Eventually, when your shocks
are worn and you hit a bump, the car bounces up and down for a while and then settles, like
underdamping. The critical point where the car goes from overdamped to underdamped is the
critically damped case.
Another example is a washing machine. A new washing machine does not vibrate significantly
due to the presence of good dampers; old washing machines vibrate a lot.
In practice we want to avoid underdamping: we do not want cars to bounce around on the road
or buildings to sway in the wind. With critical damping we have the right behavior, but it is too hard
to achieve exactly: if the dampers wear a little, we are then underdamped. So in practice we want to stay
overdamped.
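The three cases depend only on how γ compares to γ_CR = 2√(mk), which makes for a one-line classifier. A sketch (plain Python; the sample values are the m = 2, k = 32 system used in the examples that follow):

```python
import math

def classify_damping(m, gamma, k):
    gamma_cr = 2 * math.sqrt(m * k)   # critical damping coefficient
    if gamma < gamma_cr:
        return "underdamped"
    if gamma > gamma_cr:
        return "overdamped"
    return "critically damped"

print(classify_damping(2, 16, 32))   # critically damped
print(classify_damping(2, 20, 32))   # overdamped
print(classify_damping(2, 8, 32))    # underdamped
```

In floating point, exact equality with γ_CR is fragile; real code would compare within a tolerance, but the exact comparison suffices for these textbook values.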
 Example 6.66 A 2kg object stretches a spring by 5/8 m. A damper is attached that exerts a resistive
force of 48N when the speed is 3m/sec. If the initial displacement is 1m upward and the initial
velocity is 2m/sec downward, find the displacement u(t) at any time t.
This is the previous example with damping added and different initial
conditions. We already know k = 32. What is the damping coefficient γ? We know |Fd| = 48 when
the speed is |u'| = 3, so the damping coefficient is

γ = |Fd|/|u'| = 48/3 = 16. (6.575)

Thus the initial value problem is

2u'' + 16u' + 32u = 0, u(0) = -1, u'(0) = 2. (6.576)

Before we solve it, let's see which case we're in by calculating the critical damping coefficient:

γ_CR = 2√(mk) = 2√64 = 16. (6.577)

So we are critically damped, since γ = γ_CR. This means we will get a double root. Solving the
characteristic equation, we get r1 = r2 = -4, and the general solution is

u(t) = c1 e^{-4t} + c2 te^{-4t}. (6.578)

The initial conditions give coefficients c1 = -1 and c2 = -2. So the solution is

u(t) = -e^{-4t} - 2te^{-4t}. (6.579)

Notice there are no oscillations in this case. 

 Example 6.67 For the same spring-mass system as in the previous example, attach a damper

that exerts a force of 40N when the speed is 2m/s. Find the displacement at any time t.
The only difference from the previous example is the damping force. Let's compute γ:

γ = |Fd|/|u'| = 40/2 = 20. (6.580)

Since we computed γ_CR = 16, we are overdamped, and the characteristic equation should
give us distinct real roots. The IVP is

2u'' + 20u' + 32u = 0, u(0) = -1, u'(0) = 2. (6.581)

The characteristic equation has roots r1 = -8 and r2 = -2, so the general solution is

u(t) = c1 e^{-8t} + c2 e^{-2t}. (6.582)

The initial conditions give c1 = 0 and c2 = -1, so the displacement is

u(t) = -e^{-2t}. (6.583)

Notice that here we do not actually have a "vibration" as we normally think of one: the damper is
strong enough to force the vibrations to die out so quickly that we do not notice much, if any, of
them. 

 Example 6.68 For the same spring-mass system as in the previous two examples, add a damper
that exerts a force of 16N when the speed is 2m/s.
In this case, the damping coefficient is

γ = 16/2 = 8, (6.584)

which tells us that this case is underdamped, as γ < γ_CR = 16. We should expect complex roots of
the characteristic equation. The IVP is

2u'' + 8u' + 32u = 0, u(0) = -1, u'(0) = 3. (6.585)


The characteristic equation has roots

r_{1,2} = (-8 ± √(-192)) / 4 = -2 ± i√12. (6.586)

Thus our general solution is

u(t) = c1 e^{-2t} cos(√12 t) + c2 e^{-2t} sin(√12 t). (6.587)

The initial conditions give the constants c1 = -1 and c2 = 1/√12, so we have

u(t) = -e^{-2t} cos(√12 t) + (1/√12) e^{-2t} sin(√12 t). (6.588)

Let's write this as a single cosine:

R = √((-1)^2 + (1/√12)^2) = √(13/12), (6.589)

tan(δ) = -1/√12. (6.590)

As in the undamped case, we look at the signs of c1 and c2 to figure out which quadrant δ is in. By
doing so, we see that δ has negative cosine and positive sine, so it is in Quadrant II. Hence we need
to take the arctangent and add π to it:

δ = arctan(-1/√12) + π. (6.591)

Thus our displacement is

u(t) = √(13/12) e^{-2t} cos(√12 t - arctan(-1/√12) - π). (6.592)

In this case, we actually get a vibration, even though its amplitude steadily decreases until it is
negligible. The vibration has quasi-frequency √12. 

6.10.4 Forced Vibrations


Last Time: We studied unforced vibrations, with and without damping, and the four forces
acting on an object: gravity, the spring force, damping, and external forces.
Forced, Undamped Motion
What happens when the external force F(t) is allowed to act on our system? The function F(t) is
called the forcing function. We will consider the undamped case

mu'' + ku = F(t). (6.593)

This is a nonhomogeneous equation, so we will need to find both the complementary and particular
solutions:

u(t) = uc(t) + Up(t), (6.594)

where uc(t) is the solution to the associated homogeneous equation. We will use undetermined
coefficients to find the particular solution Up(t) (if F(t) has an appropriate form) or variation of
parameters.

We restrict our attention to a case which appears frequently in applications:

F(t) = F0 cos(ωt) or F(t) = F0 sin(ωt). (6.595)

The force we are applying to our spring-mass system is a simple periodic function with frequency
ω. For now we assume F(t) = F0 cos(ωt), but everything is analogous if it is a sine function. So
consider

mu'' + ku = F0 cos(ωt), (6.596)

where the complementary solution (the general solution of the analogous free undamped equation) is

uc(t) = c1 cos(ω0 t) + c2 sin(ω0 t), (6.597)

with ω0 = √(k/m) the natural frequency.
We can use the method of undetermined coefficients for this nonhomogeneous term F(t). The
initial guess for the particular solution is

Up(t) = A cos(ωt) + B sin(ωt). (6.598)

We need to be careful: this guess is fine as long as ω0 ≠ ω, but if the frequency of the forcing
function is the same as the natural frequency, then this guess is the complementary solution uc(t).
Thus, if ω0 = ω, we need to multiply by a factor of t. So there are two cases.

(1) ω ≠ ω0
In this case, our initial guess is not the complementary solution, so the particular solution will
be

Up(t) = A cos(ωt) + B sin(ωt). (6.599)

Differentiating and plugging in, we get

mω^2(-A cos(ωt) - B sin(ωt)) + k(A cos(ωt) + B sin(ωt)) = F0 cos(ωt) (6.600)

(-mω^2 A + kA) cos(ωt) + (-mω^2 B + kB) sin(ωt) = F0 cos(ωt). (6.601)

Setting the coefficients equal, we get

cos(ωt): (-mω^2 + k)A = F0 ⇒ A = F0/(k - mω^2) (6.602)
sin(ωt): (-mω^2 + k)B = 0 ⇒ B = 0. (6.603)

So our particular solution is

Up(t) = F0/(k - mω^2) cos(ωt) (6.604)
= F0/(m(k/m - ω^2)) cos(ωt) (6.605)
= F0/(m(ω0^2 - ω^2)) cos(ωt). (6.606)

Notice that the amplitude of the particular solution depends on the amplitude of the forcing
function F0 and on the difference between the natural frequency and the forcing frequency.

We can write our displacement function in two forms, depending on which form we use for the
complementary solution:

u(t) = c1 cos(ω0 t) + c2 sin(ω0 t) + F0/(m(ω0^2 - ω^2)) cos(ωt) (6.607)
u(t) = R cos(ω0 t - δ) + F0/(m(ω0^2 - ω^2)) cos(ωt) (6.608)

Again, we get an analogous solution if the forcing function is F(t) = F0 sin(ωt).
The key feature of this case can be seen in the second form: we have two cosine functions
with different frequencies, which interfere with each other, causing the net oscillation to vary
between large and small amplitude. This phenomenon has a name, "beats," derived from musical
terminology. Think of hitting a tuning fork after it has already been struck: the volume rises
and falls. The waves one hears have exactly the form of our solution.
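The beat envelope comes from the sum-to-product identity cos(ω0 t) + cos(ωt) = 2 cos((ω0 - ω)t/2) cos((ω0 + ω)t/2): a slowly varying envelope times a fast oscillation. A quick numerical sketch (plain Python; the sample frequencies are arbitrary choices):

```python
import math

w0, w = 10.0, 9.0        # two nearby frequencies (arbitrary choices)
for t in (0.0, 0.7, 2.3, 5.1):
    lhs = math.cos(w0 * t) + math.cos(w * t)
    envelope = 2 * math.cos((w0 - w) * t / 2)   # slow: frequency (w0 - w)/2
    fast = math.cos((w0 + w) * t / 2)           # fast: frequency (w0 + w)/2
    assert abs(lhs - envelope * fast) < 1e-12
print("sum of cosines = slow envelope x fast oscillation")
```

The closer ω is to ω0, the slower the envelope, i.e. the longer the swell-and-fade of each beat.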
(2) ω = ω0
If the frequency of the forcing function is the same as the natural frequency, the guess for
the particular solution is

Up(t) = At cos(ω0 t) + Bt sin(ω0 t). (6.609)

Differentiate and plug in:

(-mω0^2 + k)At cos(ω0 t) + (-mω0^2 + k)Bt sin(ω0 t)
+ 2mω0 B cos(ω0 t) - 2mω0 A sin(ω0 t) = F0 cos(ω0 t). (6.610)

To begin simplification, recall that ω0^2 = k/m, so mω0^2 = k. This means the first two terms vanish
(expected, since there are no analogous terms on the right side), and we get

2mω0 B cos(ω0 t) - 2mω0 A sin(ω0 t) = F0 cos(ω0 t). (6.611)

Now set the coefficients equal:

cos(ω0 t): 2mω0 B = F0 ⇒ B = F0/(2mω0) (6.612)
sin(ω0 t): -2mω0 A = 0 ⇒ A = 0 (6.613)

Thus the particular solution is

Up(t) = F0/(2mω0) t sin(ω0 t), (6.614)

and the displacement is

u(t) = c1 cos(ω0 t) + c2 sin(ω0 t) + F0/(2mω0) t sin(ω0 t) (6.615)

or

u(t) = R cos(ω0 t - δ) + F0/(2mω0) t sin(ω0 t). (6.616)
What stands out most about this equation? Notice that as t → ∞, u(t) → ∞ due to the form of
the particular solution. Thus, in the case where the forcing frequency is the same as the natural
frequency, the oscillation will have an amplitude that continues to increase for all time since the
external force adds energy to the system in a way that reinforces the natural motion of the system.

This phenomenon is called resonance. Resonance is the principle behind microwave
ovens: the microwave radiation strikes the water molecules in what's being heated at their natural
frequency, causing them to vibrate faster and faster, which generates heat. A similar effect is seen
in the Bay of Fundy, where tidal forces cause the ocean to resonate, yielding larger and larger tides.
Resonance in the ear enables us to distinguish between tones in sound.
A commonly cited example is the Tacoma Narrows Bridge, but this is actually incorrect: the oscillation
that led to the collapse of the bridge came from a far more complicated phenomenon than the simple
resonance we're considering here. In general, for engineering purposes, resonance is something we
would like to avoid unless we understand the situation and its effect on the system.
In summary: when we drive our system at a frequency different from the natural frequency, the
two frequencies interfere and we observe beats in the motion. When the system is driven at its natural
frequency, the natural motion of the system is reinforced, causing the amplitude of the motion to
increase without bound.
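Resonant growth is easy to see numerically. The sketch below integrates mu'' + ku = F0 cos(ωt) with a hand-rolled classical RK4 step (plain Python; m = k = F0 = 1 are arbitrary choices, so ω0 = 1) and drives the system at its natural frequency; the late-time amplitude dwarfs the early one:

```python
import math

def integrate(m, k, F0, w, u0, v0, dt, T):
    """Classical RK4 for the first-order system u' = v, v' = (F0 cos(w t) - k u)/m."""
    def f(t, u, v):
        return v, (F0 * math.cos(w * t) - k * u) / m
    t, u, v, traj = 0.0, u0, v0, [(0.0, u0)]
    while t < T:
        k1u, k1v = f(t, u, v)
        k2u, k2v = f(t + dt/2, u + dt/2 * k1u, v + dt/2 * k1v)
        k3u, k3v = f(t + dt/2, u + dt/2 * k2u, v + dt/2 * k2v)
        k4u, k4v = f(t + dt, u + dt * k3u, v + dt * k3v)
        u += dt/6 * (k1u + 2*k2u + 2*k3u + k4u)
        v += dt/6 * (k1v + 2*k2v + 2*k3v + k4v)
        t += dt
        traj.append((t, u))
    return traj

# Drive at the natural frequency w = w0 = sqrt(k/m) = 1, starting from rest.
traj = integrate(m=1.0, k=1.0, F0=1.0, w=1.0, u0=0.0, v0=0.0, dt=0.01, T=40.0)
early = max(abs(u) for t, u in traj if t <= 10.0)
late = max(abs(u) for t, u in traj if t >= 30.0)
print(early, late)   # the exact solution is u = (t/2) sin t, so late >> early
```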
 Example 6.69 A 3kg object is attached to a spring, which it stretches by 40cm. There is no
damping, but the system is forced with the forcing function

F(t) = 10 cos(ωt) (6.617)

such that the system will experience resonance. If the object is initially displaced 20cm downward
and given an initial upward velocity of 10cm/s, find the displacement at any time t.
We need to be careful with units: convert all lengths to meters. First, find the spring constant k (using g = 10 m/s²):
k = mg/L = (3)(10)/0.4 = 75 (6.618)
Next, we are told the system experiences resonance. Thus the forcing frequency ω must be the natural frequency ω0:
ω = ω0 = √(k/m) = √(75/3) = 5 (6.619)
Thus our initial value problem is
3u′′ + 75u = 10 cos(5t), u(0) = 0.2, u′(0) = −0.1 (6.620)
The complementary solution is the general solution of the associated free, undamped case. Since we have computed the natural frequency already, the complementary solution is just
uc(t) = c1 cos(5t) + c2 sin(5t). (6.621)
The particular solution (using the formula derived above) is
(1/3) t sin(5t), (6.622)
and so the general solution is
u(t) = c1 cos(5t) + c2 sin(5t) + (1/3) t sin(5t). (6.623)
The initial conditions give c1 = 1/5 and c2 = −1/50, so the displacement is
u(t) = (1/5) cos(5t) − (1/50) sin(5t) + (1/3) t sin(5t) (6.624)
Let’s convert the first two terms to a single cosine.
R = √((1/5)² + (−1/50)²) = √(101/2500) (6.625)
tan(δ) = (−1/50)/(1/5) = −1/10 (6.626)
Looking at the signs of c1 and c2, we see that cos(δ) > 0 and sin(δ) < 0. Thus δ is in Quadrant IV, and so we can just take the arctangent.
δ = arctan(−1/10) (6.627)
The displacement is then
u(t) = √(101/2500) cos(5t − arctan(−1/10)) + (1/3) t sin(5t) (6.628)


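As a sanity check, the solution (6.624) can be verified numerically against the initial value problem (6.620). This is a sketch using central differences; the tolerances are loose to absorb floating-point error.

```python
import math

# u(t) from (6.624): (1/5)cos(5t) - (1/50)sin(5t) + (1/3) t sin(5t)
def u(t):
    return 0.2 * math.cos(5*t) - 0.02 * math.sin(5*t) + (t/3) * math.sin(5*t)

def residual(t, h=1e-5):
    # Central differences approximate u''(t).
    upp = (u(t + h) - 2*u(t) + u(t - h)) / h**2
    return 3*upp + 75*u(t) - 10*math.cos(5*t)   # ~0 if u solves (6.620)

up0 = (u(1e-6) - u(-1e-6)) / 2e-6               # approximate u'(0)
print(abs(u(0) - 0.2) < 1e-12)                  # True: u(0) = 0.2
print(abs(up0 + 0.1) < 1e-4)                    # True: u'(0) = -0.1
print(all(abs(residual(t)) < 1e-3 for t in [0.5, 1.0, 2.0, 3.7]))  # True
```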
6.11 Two-Point Boundary Value Problems and Eigenfunctions


6.11.1 Boundary Conditions
Up until now, we have studied ordinary differential equations and initial value problems. Now we
shift to partial differential equations and boundary value problems. Partial differential equations are
much more complicated, but are essential in modeling many complex systems found in nature. We
need to specify how the solution should behave on the boundary of the region our equation is defined
on. The data we prescribe are the boundary values or boundary conditions, and a combination
of a differential equation and boundary conditions is called a boundary value problem.
Boundary conditions depend on the domain of the problem. For an ordinary differential equation, our domain was usually some interval on the real line. With a partial differential equation, our domain might be an interval, or it might be a square in the two-dimensional plane. To see how boundary conditions affect an equation, let's examine how they affect the solution of an ordinary differential equation.
 Example 6.70 Let’s consider the second order differential equation y′′ + y = 0. Specifying boundary conditions for this equation involves specifying the values of the solution (or its derivatives) at two points; recall this is because the equation is second order. Consider the interval (0, 2π) and specify the boundary conditions y(0) = 0 and y(2π) = 0. From the characteristic equation, we know the solutions have the form
y(x) = A cos(x) + B sin(x). (6.629)
Applying the first boundary condition we see 0 = y(0) = A.
Applying the second condition gives 0 = y(2π) = B sin(2π), but sin(2π) is already zero so B can
be any number. So the solutions to this boundary value problem are any functions of the form
y(x) = B sin(x). (6.630)


 Example 6.71 Consider y′′ + y = 0 with boundary conditions y(0) = y(6) = 0. This seems similar to the previous problem; the solutions still have the general form
y(x) = A cos(x) + B sin(x) (6.631)
and the first condition still tells us A = 0. The second condition tells us that 0 = y(6) = B sin(6). Since sin(6) ≠ 0, we must have B = 0, and the entire solution is y(x) = 0. 
232 Chapter 6. Ordinary Differential Equations

Boundary value problems occur in nature all the time. Let's examine these examples physically. We know from previous chapters that y′′ + y = 0 models an oscillator such as a rock hanging from a spring. The rock will oscillate with frequency 1/(2π). The condition y(0) = 0 just means that when we start observing, we want the rock to be at the equilibrium spot. If we specify y(2π) = 0, this will automatically happen, since the motion is 2π-periodic. On the other hand, it is impossible for the rock to return to the equilibrium point after 6 seconds; it will come back in 2π seconds, which is more than 6. So the only possible way the rock can be at equilibrium after 6 seconds is if it does not leave, which is why the only solution is the zero solution.
The previous examples are homogeneous boundary value problems. We say that a boundary value problem is homogeneous if the equation is homogeneous and the two boundary conditions involve zero. That is, homogeneous boundary conditions might be one of these types:
y(x1) = 0, y(x2) = 0 (6.632)
y′(x1) = 0, y(x2) = 0 (6.633)
y(x1) = 0, y′(x2) = 0 (6.634)
y′(x1) = 0, y′(x2) = 0. (6.635)
On the other hand, if the equation is nonhomogeneous or any of the boundary conditions do not equal zero, then the boundary value problem is nonhomogeneous or inhomogeneous. Let's look at some examples of nonhomogeneous boundary value problems.
 Example 6.72 Take y′′ + 9y = 0 with boundary conditions y(0) = 2 and y(π/6) = 1. The general solution to the differential equation is
y(x) = A cos(3x) + B sin(3x). (6.636)
The two conditions give
2 = y(0) = A (6.637)
1 = y(π/6) = B (6.638)
so that the solution is
y(x) = 2 cos(3x) + sin(3x) (6.639)

 Example 6.73 Take y′′ + 9y = 0 with boundary conditions y(0) = 2 and y(2π) = 2. The general solution to the differential equation is
y(x) = A cos(3x) + B sin(3x). (6.640)
The two conditions give
2 = y(0) = A (6.641)
2 = y(2π) = A. (6.642)
This time the second condition gives no new information, as in Example 6.70, and B does not affect whether the solution satisfies the boundary conditions. We then have infinitely many solutions of the form
y(x) = 2 cos(3x) + B sin(3x) (6.643)


6.11 Two-Point Boundary Value Problems and Eigenfunctions 233

 Example 6.74 Take y′′ + 9y = 0 with boundary conditions y(0) = 2 and y(2π) = 4. The general solution to the differential equation is
y(x) = A cos(3x) + B sin(3x). (6.644)
The two conditions give
2 = y(0) = A (6.645)
4 = y(2π) = A. (6.646)
The first equation gives A = 2, while the second gives A = 4. This is impossible, and so this boundary value problem has no solutions. 

These examples illustrate that a small change to the boundary conditions can dramatically
change the problem, unlike small changes in the initial data for initial value problems.
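Each boundary condition in Examples 6.72–6.74 is one linear equation in the unknowns A and B, so existence and uniqueness reduce to a 2×2 linear system. A minimal sketch (the helper name and tolerance are my own choices, not from the text):

```python
import math

# y(x) = A cos(3x) + B sin(3x); each condition y(x0) = y0 contributes one row:
#   A cos(3 x0) + B sin(3 x0) = y0
def row(x0):
    return (math.cos(3 * x0), math.sin(3 * x0))

def classify(bc1, bc2, tol=1e-12):
    (x1, y1), (x2, y2) = bc1, bc2
    (a, b), (c, d) = row(x1), row(x2)
    det = a * d - b * c
    if abs(det) > tol:
        return "unique"          # invertible 2x2 system: exactly one solution
    # Singular system: rows are parallel.  Consistent -> infinitely many
    # solutions; inconsistent -> none.  Compare cross products with the RHS.
    consistent = abs(a * y2 - c * y1) < tol and abs(b * y2 - d * y1) < tol
    return "infinite" if consistent else "none"

print(classify((0, 2), (math.pi / 6, 1)))   # unique   (Example 6.72)
print(classify((0, 2), (2 * math.pi, 2)))   # infinite (Example 6.73)
print(classify((0, 2), (2 * math.pi, 4)))   # none     (Example 6.74)
```

The three outcomes mirror the three examples exactly: an invertible system, a consistent singular system, and an inconsistent one.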

6.11.2 Eigenvalue Problems


Recall the system studied extensively in previous chapters

Ax = λ x (6.647)

where for certain values of λ , called eigenvalues, there are nonzero solutions called eigenvectors.
We have a similar situation with boundary value problems.
Consider the problem

y′′ + λy = 0 (6.648)

with boundary conditions y(0) = 0 and y(π) = 0. The values of λ where we get nontrivial (nonzero)
solutions will be eigenvalues. The nontrivial solutions themselves are called eigenfunctions.
We need to consider three cases separately.
(1) If λ > 0, then it is convenient to let λ = µ² and rewrite the equation as
y′′ + µ²y = 0 (6.649)
The characteristic polynomial is r² + µ² = 0 with roots r = ±iµ. So the general solution is
y(x) = A cos(µx) + B sin(µx) (6.650)

Note that µ ≠ 0 since λ > 0. Recall the boundary conditions are y(0) = 0 and y(π) = 0. So the first boundary condition gives A = 0. The second boundary condition reduces to
B sin(µπ) = 0 (6.651)
For nontrivial solutions B ≠ 0, so sin(µπ) = 0. Thus µ = 1, 2, 3, ..., and the eigenvalues λn are 1, 4, 9, ..., n², .... The eigenfunctions are only determined up to an arbitrary constant, so the convention is to choose that constant to be 1. Thus the eigenfunctions are
y1(x) = sin(x), y2(x) = sin(2x), ..., yn(x) = sin(nx) (6.652)

(2) If λ < 0, let λ = −µ². So the above equation becomes
y′′ − µ²y = 0 (6.653)
The characteristic equation is r² − µ² = 0 with roots r = ±µ, so its general solution can be written as
y(x) = A cosh(µx) + B sinh(µx) = Ce^{µx} + De^{−µx} (6.654)
The first boundary condition, applied to the first form, gives A = 0. The second boundary condition gives B sinh(µπ) = 0. Since µ ≠ 0, sinh(µπ) ≠ 0, and therefore B = 0. So for λ < 0 the only solution is y = 0; there are no nontrivial solutions and thus no negative eigenvalues.
(3) If λ = 0, then the equation above becomes
y′′ = 0 (6.655)
and the general solution, if we integrate twice, is
y(x) = Ax + B (6.656)
The boundary conditions are only satisfied when A = 0 and B = 0. So there is only the trivial solution y = 0, and λ = 0 is not an eigenvalue.
To summarize, we only obtain nontrivial solutions, and hence eigenvalues, when λ > 0 (one can also show this problem has no complex eigenvalues). A basic problem studied later in the chapter is
y′′ + λy = 0, y(0) = 0, y(L) = 0 (6.657)
Its eigenvalues and eigenfunctions are
λn = n²π²/L², yn(x) = sin(nπx/L) for n = 1, 2, 3, ... (6.658)
This is the classical Euler Buckling Problem.
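The eigenvalues λn = n²π²/L² can be checked numerically by discretizing −y′′ = λy on (0, L) with second differences, which turns the boundary value problem into a matrix eigenvalue problem. A sketch (the grid size and L are illustrative choices):

```python
import numpy as np

L, n = 1.0, 400
h = L / n
# Tridiagonal second-difference matrix for -y'' with y(0) = y(L) = 0.
main = (2.0 / h**2) * np.ones(n - 1)
off = (-1.0 / h**2) * np.ones(n - 2)
A = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)

lam = np.sort(np.linalg.eigvalsh(A))[:3]            # smallest three eigenvalues
exact = np.array([(k * np.pi / L)**2 for k in (1, 2, 3)])
print(np.max(np.abs(lam - exact) / exact) < 1e-3)   # True: matches n^2 pi^2 / L^2
```

The discrete eigenvalues agree with n²π²/L² to within the O(h²) error of the difference scheme, which ties the boundary value problem directly back to the matrix problem Ax = λx above.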
Review Euler’s Equations:
 Example 6.75 Consider equation of the form

t²y′′ + ty′ + y = 0 (6.659)


and let x = ln(t). Then, by the chain rule,
dy/dt = (dy/dx)(dx/dt) = (1/t)(dy/dx) (6.660)
d²y/dt² = d/dt[(1/t)(dy/dx)] = (1/t²)(d²y/dx²) − (1/t²)(dy/dx) (6.661)
Plug these back into the original equation:
t²y′′ + ty′ + y = (d²y/dx² − dy/dx) + dy/dx + y = d²y/dx² + y = 0 (6.664)
Thus, in the new variable, the characteristic equation is r² + 1 = 0, which has roots r = ±i. So the general solution is
ŷ(x) = c1 cos(x) + c2 sin(x) (6.666)
Recalling that x = ln(t), our final solution is
y(t) = c1 cos(ln(t)) + c2 sin(ln(t)) (6.667)

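One can verify numerically (a central-difference sketch, not a proof) that y(t) = cos(ln t) really does satisfy the Euler equation t²y′′ + ty′ + y = 0:

```python
import math

# Check that y(t) = cos(ln t) satisfies t^2 y'' + t y' + y = 0.
def y(t):
    return math.cos(math.log(t))

def residual(t, h=1e-5):
    yp = (y(t + h) - y(t - h)) / (2 * h)            # central-difference y'
    ypp = (y(t + h) - 2 * y(t) + y(t - h)) / h**2   # central-difference y''
    return t**2 * ypp + t * yp + y(t)

print(all(abs(residual(t)) < 1e-3 for t in [0.5, 1.0, 2.0, 5.0]))  # True
```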
6.12 Systems of Differential Equations


To this point we have focused on solving a single equation, but many real-world systems are given as a system of differential equations. An example is population dynamics: normally the death rate of a species is not a constant but depends on the population of predators. An example of a system of first order linear equations is
x1′ = 3x1 + x2 (6.668)
x2′ = 2x1 − 4x2 (6.669)

We call a system like this coupled because we need to know what x1 is to know what x2 is and vice
versa. It is important to note that there will be a lot of similarities between our discussion and the
previous sections on second and higher order linear equations. This is because any higher order
linear equation can be written as a system of first order linear differential equations.
 Example 6.76 Write the following second order differential equation as a system of first order
linear differential equations

y′′ + 4y′ − y = 0, y(0) = 2, y′(0) = −2 (6.670)

All that is required to rewrite this equation as a first order system is a very simple change of
variables. In fact, this is ALWAYS the change of variables to use for a problem like this. We set

x1 (t) = y(t) (6.671)


0
x2 (t) = y (t) (6.672)

Then we have

x10 = y0 = x2 (6.673)
x20 00 0
= y = y − 4y = x1 − 4x2 (6.674)

Notice how we used the original differential equation to obtain the second equation. The first
equation, x10 = x2 , is always something you should expect to see when doing this. All we have left
to do is to convert the initial conditions.

x1(0) = y(0) = 2 (6.675)
x2(0) = y′(0) = −2 (6.676)
Thus our original initial value problem has been transformed into the system
x1′ = x2, x1(0) = 2 (6.677)
x2′ = x1 − 4x2, x2(0) = −2 (6.678)

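One payoff of the first-order form is that standard numerical integrators apply directly. Below is a sketch (the minimal RK4 stepper and the step size are my own choices) that integrates the system (6.677)–(6.678) and compares with the closed-form solution of y′′ + 4y′ − y = 0, whose characteristic roots are −2 ± √5.

```python
import math

# Integrate x1' = x2, x2' = x1 - 4*x2 (Example 6.76) with classical RK4.
def f(x):
    x1, x2 = x
    return (x2, x1 - 4 * x2)

def rk4(x, h, steps):
    for _ in range(steps):
        k1 = f(x)
        k2 = f((x[0] + h/2 * k1[0], x[1] + h/2 * k1[1]))
        k3 = f((x[0] + h/2 * k2[0], x[1] + h/2 * k2[1]))
        k4 = f((x[0] + h * k3[0], x[1] + h * k3[1]))
        x = (x[0] + h/6 * (k1[0] + 2*k2[0] + 2*k3[0] + k4[0]),
             x[1] + h/6 * (k1[1] + 2*k2[1] + 2*k3[1] + k4[1]))
    return x

r1, r2 = -2 + math.sqrt(5), -2 - math.sqrt(5)      # characteristic roots
c1, c2 = 1 + 1/math.sqrt(5), 1 - 1/math.sqrt(5)    # from y(0) = 2, y'(0) = -2
T = 1.0
exact = c1 * math.exp(r1 * T) + c2 * math.exp(r2 * T)
approx = rk4((2.0, -2.0), 1e-3, 1000)[0]
print(abs(approx - exact) < 1e-8)   # True: x1(T) matches y(T)
```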
Let’s do an example for higher order linear equations.


 Example 6.77 Write
y⁽⁴⁾ + ty′′′ − 2y′′ − 3y′ − y = 0 (6.679)
as a system of first order differential equations.


We want to use a similar change of variables as the previous example. The only difference is
that since our equation in this example is fourth order we will need four new variables instead of
two.
x1 = y (6.680)
x2 = y′ (6.681)
x3 = y′′ (6.682)
x4 = y′′′ (6.683)
Then we have
x1′ = y′ = x2 (6.684)
x2′ = y′′ = x3 (6.685)
x3′ = y′′′ = x4 (6.686)
x4′ = y⁽⁴⁾ = y + 3y′ + 2y′′ − ty′′′ = x1 + 3x2 + 2x3 − tx4 (6.687)
as our system of equations. To be able to solve these, we need to review some facts about systems
of equations and linear algebra. 
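For constant coefficients, the change of variables in Examples 6.76–6.77 is exactly the "companion matrix" construction, and the eigenvalues of that matrix are the characteristic roots of the original equation. A sketch (the helper name is my own):

```python
import numpy as np

# y^(n) + a_{n-1} y^(n-1) + ... + a_0 y = 0  becomes  x' = Cx.
def companion(a):                      # a = [a_0, a_1, ..., a_{n-1}]
    n = len(a)
    C = np.zeros((n, n))
    C[:-1, 1:] = np.eye(n - 1)         # encodes x_k' = x_{k+1}
    C[-1, :] = -np.asarray(a, float)   # last row comes from the ODE itself
    return C

# Example 6.76, y'' + 4y' - y = 0, has a = [-1, 4]; the eigenvalues of C are the
# roots of the characteristic equation r^2 + 4r - 1 = 0, namely -2 +/- sqrt(5).
C = companion([-1, 4])
print(np.allclose(sorted(np.linalg.eigvals(C)),
                  sorted([-2 + 5**0.5, -2 - 5**0.5])))   # True
```

Note that Example 6.77 has a t-dependent coefficient, so the construction of the matrix still works pointwise, but the eigenvalue interpretation below requires constant coefficients.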

6.13 Homogeneous Linear Systems with Constant Coefficients


A two-dimensional system has the form
x′ = ax + by (6.688)
y′ = cx + dy (6.689)
Suppose we have written our system in matrix form
x′ = Ax (6.690)
How do we solve this equation? If A were a 1 × 1 matrix, i.e. a constant, and x were a vector with one component, the differential equation would be the separable equation
x′ = ax (6.691)
We know this is solved by
x(t) = ce^{at}. (6.692)

One might guess, then, that in the n × n case, instead of a we have some other constant in the
exponential, and instead of the constant of integration c we have some constant vector η. So our
guess for the solution will be

x(t) = ηe^{rt}. (6.693)

Plugging the guess into the differential equation gives
rηe^{rt} = Aηe^{rt} (6.694)
(Aη − rη)e^{rt} = 0 (6.695)
(A − rI)ηe^{rt} = 0. (6.696)
Since e^{rt} ≠ 0, we end up with the requirement that
(A − rI)η = 0 (6.697)
This should seem familiar: it is the condition for η to be an eigenvector of A with eigenvalue r. Thus, we conclude that for (6.693) to be a solution of the original differential equation, we must have η an eigenvector of A with eigenvalue r.
That tells us how to get some solutions to systems of differential equations: we find the eigenvalues and eigenvectors of the coefficient matrix A, then form solutions using (6.693). But how will we form the general solution?
Thinking back to the second/higher order linear case, we need enough linearly independent
solutions to form a fundamental set. As we noticed last lecture, if we have all simple eigenvalues,
then all the eigenvectors are linearly independent, and so the solutions formed will be as well. We
will handle the case of repeated eigenvalues later.
So we will find the fundamental solutions of the form (6.693), then take their linear combina-
tions to get our general solution.

6.13.1 The Phase Plane


We are going to rely on a qualitative understanding of what solutions to a linear system of differential equations look like; this will be important when considering nonlinear equations. We know the trivial solution x = 0 is always a solution to our homogeneous system x′ = Ax. x = 0 is an example of an equilibrium solution, i.e., it satisfies
x′ = Ax = 0 (6.698)
and is a constant solution. We will assume our coefficient matrix A is nonsingular (det(A) ≠ 0); thus x = 0 is the only equilibrium solution.
The question we want to ask is whether other solutions move towards or away from this constant solution as t → ±∞, so that we can understand the long-term behavior of the system. This is no different from what we did when we classified equilibrium solutions for first order autonomous equations; we will generalize those ideas to systems of equations.
When we drew solution spaces then, we did so on the ty-plane. To do something analogous we
would require three dimensions, since we would have to sketch both x1 and x2 vs. t. Instead, what
we do is ignore t and think of our solutions as trajectories on the x1 x2 -plane. Then our equilibrium
solution is the origin. The x1 x2 -plane is called the phase plane. We will see examples where we
sketch solutions, called phase portraits.

6.13.2 Real, Distinct Eigenvalues


Let's get back to the equation x′ = Ax. Suppose λ1 and λ2 are real and distinct eigenvalues of the 2 × 2 coefficient matrix A, associated with eigenvectors η(1) and η(2), respectively. We know from above that η(1) and η(2) are linearly independent, as λ1 and λ2 are simple. Thus the solutions obtained from them using (6.693) will also be linearly independent, and in fact will form a fundamental set of solutions. The general solution is
x(t) = c1 e^{λ1 t} η(1) + c2 e^{λ2 t} η(2) (6.699)

So if we have real, distinct eigenvalues, all that we have to do is find the eigenvectors, form the
general solution as above, and use any initial conditions that may exist.
 Example 6.78 Solve the following initial value problem
x′ = [ −2 2 ; 2 1 ] x, x(0) = [ 5 ; 0 ] (6.700)
(matrices and column vectors are written here in bracket form, with rows separated by semicolons)
The first thing we need to do is to find the eigenvalues of the coefficient matrix.
0 = det(A − λI) = det[ −2−λ 2 ; 2 1−λ ] (6.701)
= λ² + λ − 6 (6.702)
= (λ − 2)(λ + 3) (6.703)
So the eigenvalues are λ1 = 2 and λ2 = −3. Next we need the eigenvectors.
(1) λ1 = 2
(A − 2I)η = 0 (6.704)
[ −4 2 ; 2 −1 ][ η1 ; η2 ] = [ 0 ; 0 ] (6.705)

So we will want to find solutions to the system
−4η1 + 2η2 = 0 (6.706)
2η1 − η2 = 0. (6.707)
Using either equation we find η2 = 2η1, and so any eigenvector has the form
η = [ η1 ; η2 ] = [ η1 ; 2η1 ] (6.708)
Choosing η1 = 1 we obtain the first eigenvector
η(1) = [ 1 ; 2 ]. (6.709)

(2) λ2 = −3
(A + 3I)η = 0 (6.710)
[ 1 2 ; 2 4 ][ η1 ; η2 ] = [ 0 ; 0 ] (6.711)

So we will want to find solutions to the system
η1 + 2η2 = 0 (6.712)
2η1 + 4η2 = 0. (6.713)
Using either equation we find η1 = −2η2, and so any eigenvector has the form
η = [ η1 ; η2 ] = [ −2η2 ; η2 ]. (6.714)
Choosing η2 = 1 we obtain the second eigenvector
η(2) = [ −2 ; 1 ]. (6.715)
Thus our general solution is
x(t) = c1 e^{2t} [ 1 ; 2 ] + c2 e^{−3t} [ −2 ; 1 ]. (6.716)
Now let's use the initial condition to solve for c1 and c2. The condition says
[ 5 ; 0 ] = x(0) = c1 [ 1 ; 2 ] + c2 [ −2 ; 1 ]. (6.717)
All that's left is to write out the matrix equation as a system of equations and then solve:
c1 − 2c2 = 5 (6.718)
2c1 + c2 = 0 ⇒ c1 = 1, c2 = −2 (6.719)
Thus the particular solution is
x(t) = e^{2t} [ 1 ; 2 ] − 2e^{−3t} [ −2 ; 1 ]. (6.720)


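The eigenvalue computation above can be checked with a linear algebra library, and the solution (6.720) verified directly against x′ = Ax; a sketch:

```python
import numpy as np

A = np.array([[-2.0, 2.0], [2.0, 1.0]])
vals = np.linalg.eigvals(A)
print(sorted(round(float(v), 8) for v in vals))      # [-3.0, 2.0]

def x(t):  # the particular solution (6.720)
    return (np.exp(2*t) * np.array([1.0, 2.0])
            - 2 * np.exp(-3*t) * np.array([-2.0, 1.0]))

h = 1e-6
xp = (x(0.7 + h) - x(0.7 - h)) / (2 * h)             # central-difference x'(0.7)
print(np.allclose(xp, A @ x(0.7), atol=1e-4))        # True: x' = Ax holds
print(np.allclose(x(0), [5.0, 0.0]))                 # True: initial condition
```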
 Example 6.79 Sketch the phase portrait of the system from Example 6.78.
In the last example we saw that the eigenvalue/eigenvector pairs for the coefficient matrix were
λ1 = 2, η(1) = [ 1 ; 2 ] (6.721)
λ2 = −3, η(2) = [ −2 ; 1 ]. (6.722)
The starting point for the phase portrait involves sketching solutions corresponding to the eigenvectors (i.e., with c1 or c2 = 0). We know that if x(t) is one of these solutions,
x′(t) = A ci e^{λi t} η(i) = ci λi e^{λi t} η(i). (6.723)
This is just, for any t, a constant times the eigenvector, which indicates that these solutions trace out lines in the direction of the eigenvector. These are called eigensolutions of the system.
Next, we need to consider the direction that these solutions move in. Let’s start with the first
eigensolution, which corresponds to the solution with c2 = 0. The first eigenvalue is λ1 = 2 > 0.
This indicates that this eigensolution will grow exponentially, as the exponential in the solution has
a positive exponent. The second eigensolution corresponds to λ2 = −3 < 0, so the exponential in
the appropriate solution is negative. Hence this solution will decay and move towards the origin.
What does the typical trajectory do (i.e., a trajectory where both c1, c2 ≠ 0)? The general solution is
x(t) = c1 e^{2t} η(1) + c2 e^{−3t} η(2). (6.724)

Thus as t → ∞, this solution will approach the positive eigensolution, as the component correspond-
ing to the negative eigensolution will decay away. On the other hand, as t → −∞, the trajectory
will asymptotically reach the negative eigensolution, as the positive eigensolution component will
be tiny. The end result is the phase portrait in Figure 6.4. When the phase portrait looks like this (which happens in all cases with eigenvalues of mixed signs), the equilibrium solution at the origin is classified as a saddle point and is unstable.

Figure 6.4: Phase portrait of the saddle point in Example 6.78.

 Example 6.80 Solve the following initial value problem.
x1′ = 4x1 + x2, x1(0) = 6 (6.725)
x2′ = 3x1 + 2x2, x2(0) = 2 (6.726)
Before we can solve anything, we need to convert this system into matrix form. Doing so converts the initial value problem to
x′ = [ 4 1 ; 3 2 ] x, x(0) = [ 6 ; 2 ]. (6.727)
To solve, the first thing we need to do is to find the eigenvalues of the coefficient matrix.
0 = det(A − λI) = det[ 4−λ 1 ; 3 2−λ ] (6.728)
= λ² − 6λ + 5 (6.729)
= (λ − 1)(λ − 5) (6.730)
So the eigenvalues are λ1 = 1 and λ2 = 5. Next, we find the eigenvectors.
(1) λ1 = 1
(A − I)η = 0 (6.731)
[ 3 1 ; 3 1 ][ η1 ; η2 ] = [ 0 ; 0 ] (6.732)
So we will want to find solutions to the system
3η1 + η2 = 0 (6.733)
3η1 + η2 = 0. (6.734)
Using either equation we find η2 = −3η1, and so any eigenvector has the form
η = [ η1 ; η2 ] = [ η1 ; −3η1 ] (6.735)
Choosing η1 = 1 we obtain the first eigenvector
η(1) = [ 1 ; −3 ]. (6.736)

(2) λ2 = 5
(A − 5I)η = 0 (6.737)
[ −1 1 ; 3 −3 ][ η1 ; η2 ] = [ 0 ; 0 ] (6.738)
So we will want to find solutions to the system
−η1 + η2 = 0 (6.739)
3η1 − 3η2 = 0. (6.740)
Using either equation we find η1 = η2, and so any eigenvector has the form
η = [ η1 ; η2 ] = [ η2 ; η2 ]. (6.741)
Choosing η2 = 1 we obtain the second eigenvector
η(2) = [ 1 ; 1 ]. (6.742)

Thus our general solution is
x(t) = c1 e^{t} [ 1 ; −3 ] + c2 e^{5t} [ 1 ; 1 ]. (6.743)
Now using our initial conditions we solve for c1 and c2. The condition gives
[ 6 ; 2 ] = x(0) = c1 [ 1 ; −3 ] + c2 [ 1 ; 1 ]. (6.744)
All that is left is to write out this matrix equation as a system of equations and then solve:
c1 + c2 = 6 (6.745)
−3c1 + c2 = 2 ⇒ c1 = 1, c2 = 5 (6.746)
Thus the particular solution is
x(t) = e^{t} [ 1 ; −3 ] + 5e^{5t} [ 1 ; 1 ]. (6.747)


 Example 6.81 Sketch the phase portrait of the system from Example 6.80.
In the last example, we saw that the eigenvalue/eigenvector pairs for the coefficient matrix were
λ1 = 1, η(1) = [ 1 ; −3 ] (6.748)
λ2 = 5, η(2) = [ 1 ; 1 ]. (6.749)
Figure 6.5: Phase portrait of the unstable node in Example 6.80.

We begin by sketching the eigensolutions (these are straight lines in the directions of the eigenvec-
tors). Both of these trajectories move away from the origin, though, as the eigenvalues are both
positive.
Since |λ2| > |λ1|, we call the second eigensolution the fast eigensolution and the first one the slow eigensolution. The terms come from the fact that the eigensolution corresponding to the eigenvalue with larger magnitude will either grow or decay more quickly than the other one.
As both grow in forward time, asymptotically, as t → ∞, the fast eigensolution will dominate the typical trajectory, as it gets larger much more quickly than the slow eigensolution does. So in forward time, other trajectories will get closer and closer to the eigensolution corresponding to η(2). On the other hand, as t → −∞, the fast eigensolution will decay more quickly than the slow one, and so the eigensolution corresponding to η(1) will dominate in backwards time.
Thus the phase portrait will look like Figure 6.5. Whenever we have two positive eigenvalues, every solution moves away from the origin. We call the equilibrium solution at the origin, in this case, a node and classify it as being unstable.


 Example 6.82 Solve the following initial value problem.
x1′ = −5x1 + x2, x1(0) = 2 (6.750)
x2′ = 2x1 − 4x2, x2(0) = −1 (6.751)
We convert this system into matrix form:
x′ = [ −5 1 ; 2 −4 ] x, x(0) = [ 2 ; −1 ]. (6.752)

To solve, the first thing we need to do is to find the eigenvalues of the coefficient matrix.
0 = det(A − λI) = det[ −5−λ 1 ; 2 −4−λ ] (6.753)
= λ² + 9λ + 18 (6.754)
= (λ + 3)(λ + 6) (6.755)
So the eigenvalues are λ1 = −3 and λ2 = −6. Next, we find the eigenvectors.


(1) λ1 = −3
(A + 3I)η = 0 (6.756)
[ −2 1 ; 2 −1 ][ η1 ; η2 ] = [ 0 ; 0 ] (6.757)
So we will want to find solutions to the system
−2η1 + η2 = 0 (6.758)
2η1 − η2 = 0. (6.759)
Using either equation we find η2 = 2η1, and so any eigenvector has the form
η = [ η1 ; η2 ] = [ η1 ; 2η1 ] (6.760)
Choosing η1 = 1 we obtain the first eigenvector
η(1) = [ 1 ; 2 ]. (6.761)

(2) λ2 = −6
(A + 6I)η = 0 (6.762)
[ 1 1 ; 2 2 ][ η1 ; η2 ] = [ 0 ; 0 ] (6.763)
So we will want to find solutions to the system
η1 + η2 = 0 (6.764)
2η1 + 2η2 = 0. (6.765)
Using either equation we find η1 = −η2, and so any eigenvector has the form
η = [ η1 ; η2 ] = [ −η2 ; η2 ]. (6.766)
Choosing η2 = 1 we obtain the second eigenvector
η(2) = [ −1 ; 1 ]. (6.767)

Thus our general solution is
x(t) = c1 e^{−3t} [ 1 ; 2 ] + c2 e^{−6t} [ −1 ; 1 ]. (6.768)
Now using our initial conditions we solve for c1 and c2. The condition gives
[ 2 ; −1 ] = x(0) = c1 [ 1 ; 2 ] + c2 [ −1 ; 1 ]. (6.769)
Figure 6.6: Phase portrait of the stable node in Example 6.82.

All that is left is to write out this matrix equation as a system of equations and then solve:
c1 − c2 = 2 (6.770)
2c1 + c2 = −1 ⇒ c1 = 1/3, c2 = −5/3 (6.771)
Thus the particular solution is
x(t) = (1/3)e^{−3t} [ 1 ; 2 ] − (5/3)e^{−6t} [ −1 ; 1 ]. (6.772)


 Example 6.83 Sketch the phase portrait of the system from Example 6.82.
In the last example, we saw that the eigenvalue/eigenvector pairs for the coefficient matrix were
λ1 = −3, η(1) = [ 1 ; 2 ] (6.773)
λ2 = −6, η(2) = [ −1 ; 1 ]. (6.774)
We begin by sketching the eigensolutions. Both of these trajectories decay towards the origin, since both eigenvalues are negative. Since |λ2| > |λ1|, the second eigensolution is the fast eigensolution and the first one the slow eigensolution. In the general solution, both exponentials are negative, and so every solution will decay and move towards the origin. Asymptotically, as t → ∞, the trajectory gets closer and closer to the origin; the slow eigensolution will dominate the typical trajectory, as it dies out less quickly than the fast eigensolution. So in forward time, other trajectories will get closer and closer to the eigensolution corresponding to η(1). On the other hand, as t → −∞, the fast eigensolution will grow more quickly than the slow one, and so the eigensolution corresponding to η(2) will dominate in backwards time.
Thus the phase portrait will look like Figure 6.6. Whenever we have two negative eigenvalues, every solution moves toward the origin. We call the equilibrium solution at the origin, in this case, a node and classify it as being asymptotically stable.
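The classification used in Examples 6.79, 6.81, and 6.83 depends only on the signs of the eigenvalues, so it can be automated. A small sketch (the helper name and return strings are my own, not from the text):

```python
import numpy as np

# Classify the origin for x' = Ax when A has real, distinct eigenvalues,
# as in Examples 6.78-6.83.
def classify_origin(A):
    vals = np.linalg.eigvals(A)
    if np.iscomplexobj(vals) and np.any(vals.imag != 0):
        return "not covered here"          # complex eigenvalues: spirals/centers
    lo, hi = sorted(vals.real)
    if lo < 0 < hi:
        return "saddle (unstable)"
    if hi < 0:
        return "node (asymptotically stable)"
    if lo > 0:
        return "node (unstable)"
    return "degenerate"

print(classify_origin(np.array([[-2, 2], [2, 1]])))    # saddle (unstable)
print(classify_origin(np.array([[4, 1], [3, 2]])))     # node (unstable)
print(classify_origin(np.array([[-5, 1], [2, -4]])))   # node (asymptotically stable)
```

The three calls reproduce the classifications of Examples 6.79, 6.81, and 6.83 respectively.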

Part Five: PDEs and Fourier Series

7 Fourier Series and Transforms . . . . . . . . 247


7.1 Introduction to Fourier Series
7.2 Fourier Coefficients
7.3 Fourier Coefficients
7.4 Dirichlet Conditions
7.5 Convergence and Sum of a Fourier series
7.6 Complex Form of Fourier Series
7.7 Complex Fourier Series
7.8 General Fourier Series for Functions of Any Period p = 2L
7.9 Even and Odd Functions
7.10 Even and Odd Functions, Half-Range Expansions

8 Partial Differential Equations . . . . . . . . . 275


8.1 Introduction to Basic Classes of PDEs
8.2 Introduction to PDEs
8.3 Laplace’s Equations and Steady State Temperature
Problems
8.4 Heat Equation and Schrödinger Equation
8.5 Separation of Variables and Heat Equation IVPs
8.6 Heat Equation Problems
8.7 Other Boundary Conditions
8.8 The Schrödinger Equation
8.9 Wave Equations and the Vibrating String

Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 303
7. Fourier Series and Transforms

7.1 Introduction to Fourier Series


Fourier series have many applications in the physical sciences, for example in vibrations and oscillations → Think Frequency! The key idea behind Fourier series is to provide an alternative to the traditional power series representation of a function, f(x) = Σ_{n=0}^∞ f⁽ⁿ⁾(0)xⁿ/n!. Instead, let us express a function as a sum of sines and cosines:
f(x) = Σ_{m=0}^∞ [am cos(mπx) + bm sin(mπx)]. (7.1)

This does remarkably well at approximating functions. Notice though that the resulting function is
periodic. Fourier series are also a key tool in solving PDEs (more later).

7.1.1 Simple Harmonic Motion


Imagine a particle at a point p moving around a circle at constant speed. Let the mass be the projection of p onto a vertical line (as in a spring-mass system). Also, let ω be the angular velocity of p. Thus, θ = ωt, and the position of the mass is c = sin(θ) = sin(ωt). The motion of the point c traces out a sine curve, and this motion is called Simple Harmonic Motion.
Definition 7.1.1 (Simple Harmonic Motion) Simple harmonic motion can take the forms:

sin(ωt), cos(ωt), sin(ωt + φ ), (7.2)

where ω is the angular velocity and φ is the phase (horizontal displacement). Traditional
examples include a hanging mass from a spring a pendulum, and a tuning fork.

The location of the point is p = (A cos(ωt), A sin(ωt)) = Ae^{iωt} if we identify p = x + iy. Here A is the amplitude, or the maximum displacement of the object. We can also write down the equation for the motion of the point c:
dy/dt = d/dt (A sin(ωt)) = Aω cos(ωt) = B cos(ωt),
248 Chapter 7. Fourier Series and Transforms

Figure 7.1: Convergence of Fourier Series to a step function (square wave). K is the number of
terms.

Figure 7.2: Depiction of simple harmonic motion using a point rotating around the unit circle or a
spring mass system.

where B = Aω is the maximum velocity achieved by the object. In physics we can define the kinetic energy:
KE = (1/2)mv² = (1/2)m(dy/dt)² = (1/2)mB² cos²(ωt) ≤ (1/2)mB².
Therefore, the maximum energy of the system, (1/2)mB², is proportional to the square of the maximum velocity B (and therefore to the square of the amplitude A).
Since sine and cosine are periodic, once we know the values on the interval [0, 2π) (or [0, L)), we know the values everywhere.
 Example 7.1 Consider the general function representing simple harmonic motion
y(x, t) := A cos[2π(x/λ − f t)] = A cos[(2π/λ)(x − vt)]. (7.3)
Here A is the amplitude, λ is the wavelength, f = ω/(2π) is the frequency, T = 1/f = 2π/ω is the period, and v = f λ is the velocity. 

From an ODE perspective (think last chapter) we have an equation for simple harmonic motion:
F_net = m d²x/dt² = −kx (Hooke's Law) ⇒ x′′ + (k/m)x = 0. (7.4)
Solving this equation with the characteristic polynomial gives r² + k/m = 0, which implies that r = ±i√(k/m). Thus, the general solution is
x(t) = C1 cos(ωt) + C2 sin(ωt) = A cos(ωt − φ), (7.5)
where the angular velocity is ω = √(k/m), the amplitude is A = √(C1² + C2²) (which depends on initial/boundary conditions), and the phase is φ = tan⁻¹(C2/C1). The frequency is f = ω/(2π) = (1/(2π))√(k/m) and the period is T = 1/f = 2π√(m/k).
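The amplitude-phase identity used in (7.5) is easy to check numerically; a sketch (the values of C1, C2, ω are illustrative, not from the text, and atan2 is used so the phase lands in the correct quadrant):

```python
import math

# Check C1*cos(w*t) + C2*sin(w*t) = A*cos(w*t - phi),
# with A = sqrt(C1^2 + C2^2) and phi = atan2(C2, C1).
C1, C2, w = 3.0, -4.0, 2.0
A = math.hypot(C1, C2)
phi = math.atan2(C2, C1)
ok = all(abs(C1 * math.cos(w * t) + C2 * math.sin(w * t) - A * math.cos(w * t - phi)) < 1e-12
         for t in (0.0, 0.3, 1.7, 4.2))
print(A, ok)   # 5.0 True
```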
Many common periodic functions are not continuous or differentiable, such as the square wave, the sawtooth, or the rectified half wave (semi-circle wave). Problem: Given a function f(x), how can we expand it into a series of sines and cosines?

7.2 Fourier Coefficients


7.3 Fourier Coefficients
This section follows the outline and layout of Kreyszig, Chapter 11. Recall that Fourier series are infinite series designed to represent general periodic functions in terms of simpler ones (sines/cosines).
The immense theory behind Fourier series can seem complicated, but the application of Fourier
series to real problems is much simpler. Fourier series have a distinct advantage over Taylor series
in that even discontinuous functions have a Fourier series representation while they do not possess
a Taylor series.
The main use of Fourier series are in representing periodic functions.
Definition 7.3.1 (Periodic Functions) A function f (x) is called a periodic function if f (x) is
defined for all real x and if there is some positive number p called the period of f (x), such that

f (x + p) = f (x) (7.6)

for all x. The graph of such a function is obtained from periodic repetition of its graph over any
interval of length p.

Examples of periodic functions are f(x) = sin(x) and cos(x); examples that are not periodic are f(x) = xⁿ, eˣ, cosh(x), and ln(x).

R If f(x) has period p, then it also has period 2p (and np for any integer n > 0), since f(x + 2p) = f([x + p] + p) = f(x + p) = f(x); iterating gives f(x + np) = f(x).
Furthermore, if functions f(x) and g(x) have period p, then a f(x) + bg(x) with any constants a and b also has period p.
Figure 7.3: Periodic function. Image from Kreyszig, Advanced Engineering Mathematics.

Problem: We want to find a representation of a 2π-periodic function in terms of the simple functions
1, cos(x), sin(x), cos(2x), sin(2x), ..., cos(nx), sin(nx), ..., (7.7)

which are all 2π-periodic (see Figure 7.4). The associated series made up of these terms is called a trigonometric series and has the form

a0 + a1 cos(x) + b1 sin(x) + a2 cos(2x) + b2 sin(2x) + ... = a0 + ∑_{n=1}^{∞} [an cos(nx) + bn sin(nx)], (7.8)

where a0 , a1 , b1 , a2 , b2 , ... are all constants called the coefficients of the series. Since each term
has a period of 2π, then if the series converges the result must also have a period of 2π! Given a
function f (x) of period 2π and such that it can be represented by a series of this form, that series
converges, and has the sum f (x), then we can use equality to write

f(x) = a0 + ∑_{n=1}^{∞} [an cos(nx) + bn sin(nx)]. (7.9)

The right-hand side is then called the Fourier series of f(x).

Figure 7.4: Periodic Trig Function. Image from Kreyszig Adv. Engineering Math

Question: How do we find the coefficients ai , bi for i = 0, ..., n?

Solution: These constants are the so-called Fourier coefficients of f (x), given by Euler for-
mulas

Theorem 7.3.1 (Euler Formulas for Fourier Coefficients)

a0 = (1/(2π)) ∫_{−π}^{π} f(x) dx

an = (1/π) ∫_{−π}^{π} f(x) cos(nx) dx,  n = 1, 2, ...

bn = (1/π) ∫_{−π}^{π} f(x) sin(nx) dx,  n = 1, 2, ...

R The Fourier coefficient a0 is actually the average value of the function over its period! In other
words, if we only take the first term in the Fourier series the function f (x) is approximated by
its average value over its period. We can think of this as a “First Approximation".
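These formulas are easy to sanity-check numerically. The sketch below is an illustration assumed here (not part of the text): it approximates a0, an, bn with a simple midpoint rule and compares against the known coefficients of f(x) = x², namely a0 = π²/3, an = 4(−1)ⁿ/n², bn = 0.

```python
import math

def fourier_coeffs(f, n_max, m=20000):
    """Approximate the Euler-formula coefficients of a 2*pi-periodic f
    with a midpoint rule on (-pi, pi)."""
    h = 2 * math.pi / m
    xs = [-math.pi + (j + 0.5) * h for j in range(m)]
    a0 = sum(f(x) for x in xs) * h / (2 * math.pi)
    a = [sum(f(x) * math.cos(n * x) for x in xs) * h / math.pi
         for n in range(1, n_max + 1)]
    b = [sum(f(x) * math.sin(n * x) for x in xs) * h / math.pi
         for n in range(1, n_max + 1)]
    return a0, a, b

# Exact values for f(x) = x^2: a0 = pi^2/3, a_n = 4(-1)^n/n^2, b_n = 0.
a0, a, b = fourier_coeffs(lambda x: x * x, 3)
print(a0, a, b)
```

Note how a0 comes out to the average value π²/3, exactly as the remark above predicts.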

7.3.1 A Basic Example

Start by considering some basic questions when it comes to Fourier series:


1. How are continuous functions able to represent a given discontinuous function?
2. How does the quality of the approximation increase as one takes more and more terms in the
Fourier series?

 Example 7.2 (Periodic Rectangular Wave) Find the Fourier coefficients of the periodic function

f(x) = { −k if −π < x < 0;  k if 0 < x < π }.

This is a typical function representing an external force acting on a mechanical system or electric
circuit (see Fig. 7.5).

Figure 7.5: Square wave periodic function. Image from Kreyszig Adv. Engineering Math

Solution: Start by using the formulas for the Fourier coefficients:

a0 = (1/(2π)) ∫_{−π}^{π} f(x) dx = (1/(2π)) ∫_{−π}^{0} (−k) dx + (1/(2π)) ∫_{0}^{π} k dx
   = (1/(2π)) [−kx]_{−π}^{0} + (1/(2π)) [kx]_{0}^{π} = (1/(2π)) [−kπ + kπ] = 0

an = (1/π) ∫_{−π}^{π} f(x) cos(nx) dx = (1/π) ∫_{−π}^{0} (−k) cos(nx) dx + (1/π) ∫_{0}^{π} k cos(nx) dx
   = (1/π) [−(k/n) sin(nx)]_{−π}^{0} + (1/π) [(k/n) sin(nx)]_{0}^{π} = (1/π) [0 + 0] = 0

bn = (1/π) ∫_{−π}^{π} f(x) sin(nx) dx = (1/π) ∫_{−π}^{0} (−k) sin(nx) dx + (1/π) ∫_{0}^{π} k sin(nx) dx
   = (1/π) [(k/n) cos(nx)]_{−π}^{0} + (1/π) [−(k/n) cos(nx)]_{0}^{π}
   = (k/(nπ)) [1 − cos(−nπ) − cos(nπ) + 1] = (2k/(nπ)) (1 − cos(nπ)).

Hence, the Fourier sine coefficients are b_{2n} = 0 and b_{2n+1} = 4k/((2n+1)π). Thus, the Fourier series is

f(x) = a0 + ∑_{n=1}^{∞} [an cos(nx) + bn sin(nx)] = (4k/π) [ sin(x) + (1/3) sin(3x) + (1/5) sin(5x) + ... ].

A picture depicting the approximation with the partial sums is shown in Fig. 7.6.
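The partial sums can also be explored numerically. The following sketch (an illustration assumed here, not from the text) evaluates the series just derived with k = 1 at x = ±π/2 and shows the sum settling toward ±k.

```python
import math

def square_wave_partial_sum(x, k, terms):
    """Partial sum of the series (4k/pi) * sum sin((2n+1)x)/(2n+1)
    derived above for the +/- k square wave."""
    return (4 * k / math.pi) * sum(
        math.sin((2 * n + 1) * x) / (2 * n + 1) for n in range(terms))

# With more terms the sum settles toward k = 1 on (0, pi):
for terms in (1, 5, 50):
    print(terms, square_wave_partial_sum(math.pi / 2, 1, terms))
```

At the jump x = 0 every partial sum is exactly 0, the average of the one-sided limits −k and k, foreshadowing the convergence theorem of Section 7.5.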


7.3.2 Derivation of Euler Formulas

In order to derive the formulas for the Fourier coefficients we heavily rely on the following result
about the orthogonality of trigonometric functions.

Theorem 7.3.2 (Orthogonality of Trigonometric Functions) On the interval (−π ≤ x ≤ π) the


following relations hold:
∫_{−π}^{π} cos(nx) cos(mx) dx = 0  (n ≠ m) (7.10)

∫_{−π}^{π} sin(nx) sin(mx) dx = 0  (n ≠ m) (7.11)

∫_{−π}^{π} sin(nx) cos(mx) dx = 0  (n ≠ m or n = m). (7.12)

This is proved by transforming each product into a sum of single trigonometric functions using the product-to-sum (sum/difference) formulas.
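These relations can also be checked numerically. The sketch below (an assumed illustration, not from the text) integrates a few of the products with a midpoint rule; note that for n = m the first two integrals equal π rather than 0, which is exactly the fact the derivation below uses.

```python
import math

def integral(f, m=4000):
    """Midpoint-rule integral of f over (-pi, pi)."""
    h = 2 * math.pi / m
    return sum(f(-math.pi + (j + 0.5) * h) for j in range(m)) * h

# The three orthogonality relations, checked for a few (n, m) pairs:
print(integral(lambda x: math.cos(2 * x) * math.cos(3 * x)))  # ~ 0
print(integral(lambda x: math.sin(2 * x) * math.sin(5 * x)))  # ~ 0
print(integral(lambda x: math.sin(4 * x) * math.cos(4 * x)))  # ~ 0
print(integral(lambda x: math.cos(3 * x) ** 2))               # ~ pi
```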

Figure 7.6: Convergence of Fourier series to a periodic function. Image from Kreyszig Adv. Engineering Math

Now apply Theorem 7.3.2 to Fourier series.


f(x) = a0 + ∑_{n=1}^{∞} [an cos(nx) + bn sin(nx)].

Integrate both sides over one period:

∫_{−π}^{π} f(x) dx = ∫_{−π}^{π} ( a0 + ∑_{n=1}^{∞} [an cos(nx) + bn sin(nx)] ) dx

∫_{−π}^{π} f(x) dx = ∫_{−π}^{π} a0 dx + ∑_{n=1}^{∞} [ an ∫_{−π}^{π} cos(nx) dx + bn ∫_{−π}^{π} sin(nx) dx ]

∫_{−π}^{π} f(x) dx = 2π a0 + 0.

This gives the formula a0 = (1/(2π)) ∫_{−π}^{π} f(x) dx, since all the integrals in the sum are zero.

Now to get the formula for an we repeat this process, but multiply both sides of the expression for a Fourier series by cos(mx) before integrating.

f(x) = a0 + ∑_{n=1}^{∞} [an cos(nx) + bn sin(nx)]

∫_{−π}^{π} f(x) cos(mx) dx = ∫_{−π}^{π} ( a0 + ∑_{n=1}^{∞} [an cos(nx) + bn sin(nx)] ) cos(mx) dx

∫_{−π}^{π} f(x) cos(mx) dx = a0 ∫_{−π}^{π} cos(mx) dx + ∑_{n=1}^{∞} [ an ∫_{−π}^{π} cos(nx) cos(mx) dx + bn ∫_{−π}^{π} sin(nx) cos(mx) dx ]

By orthogonality, only the term with n = m survives:

∫_{−π}^{π} f(x) cos(nx) dx = an ∫_{−π}^{π} cos²(nx) dx = an π.

This gives the formula an = (1/π) ∫_{−π}^{π} f(x) cos(nx) dx.
Now to get the formula for bn we repeat this process, but multiply both sides of the expression
for a Fourier series by sin(mx) before integrating.

f(x) = a0 + ∑_{n=1}^{∞} [an cos(nx) + bn sin(nx)]

∫_{−π}^{π} f(x) sin(mx) dx = ∫_{−π}^{π} ( a0 + ∑_{n=1}^{∞} [an cos(nx) + bn sin(nx)] ) sin(mx) dx

∫_{−π}^{π} f(x) sin(mx) dx = a0 ∫_{−π}^{π} sin(mx) dx + ∑_{n=1}^{∞} [ an ∫_{−π}^{π} cos(nx) sin(mx) dx + bn ∫_{−π}^{π} sin(nx) sin(mx) dx ]

∫_{−π}^{π} f(x) sin(nx) dx = bn ∫_{−π}^{π} sin²(nx) dx = bn π.

This gives the formula bn = (1/π) ∫_{−π}^{π} f(x) sin(nx) dx.
 Example 7.3 The fundamental period is the smallest positive period. Find it for: cos(x), cos(2x),

sin(πx), sin(2πx).

Solution: For cos(x) we know the period is 2π. To find the fundamental period of cos(kx) or sin(kx) we need the smallest positive x with kx = 2π, i.e. x = 2π/k. So the fundamental period of cos(2x) is 2π/2 = π, the fundamental period of sin(πx) is 2π/π = 2, and that of sin(2πx) is 2π/(2π) = 1.

 Example 7.4 Sketch three periods of the 2π-periodic function defined on the interval −π ≤ x ≤ π
as
a) f(x) = x,  b) f(x) = π − |x|,  c) f(x) = { 1 if −π < x < 0;  cos(x/2) if 0 < x < π }.

 Example 7.5 Find the Fourier coefficients of the periodic function


f(x) = { 0 if −π < x < 0;  1 if 0 < x < π }.

Solution: Start by using the formulas for the Fourier coefficients:

a0 = (1/(2π)) ∫_{−π}^{π} f(x) dx = (1/(2π)) ∫_{0}^{π} 1 dx = (1/(2π)) [x]_{0}^{π} = 1/2

an = (1/π) ∫_{−π}^{π} f(x) cos(nx) dx = (1/π) ∫_{0}^{π} cos(nx) dx = (1/(nπ)) [sin(nx)]_{0}^{π} = 0

bn = (1/π) ∫_{−π}^{π} f(x) sin(nx) dx = (1/π) ∫_{0}^{π} sin(nx) dx = −(1/(nπ)) [cos(nx)]_{0}^{π} = (1/(nπ)) (1 − cos(nπ)).

Hence, the Fourier sine coefficients are b_{2n} = 0 and b_{2n+1} = 2/((2n+1)π). Thus, the Fourier series is

f(x) = a0 + ∑_{n=1}^{∞} [an cos(nx) + bn sin(nx)] = 1/2 + (2/π) [ sin(x) + (1/3) sin(3x) + (1/5) sin(5x) + ... ].


 Example 7.6 Find the Fourier coefficients of the periodic function


f(x) = { −x if −π < x < 0;  x if 0 < x < π }.

Solution: Start by using the formulas for the Fourier coefficients (worked in class using integration by parts).

7.4 Dirichlet Conditions


7.5 Convergence and Sum of a Fourier series
This section follows the outline and layout of Kreyszig Chapter 11. There is a wide range of functions that can be represented by a Fourier series (unlike a Taylor series).

Theorem 7.5.1 (Representation by Fourier series) Let f (x) be a periodic function with period
2π and piecewise continuous on the interval −π ≤ x ≤ π. Furthermore, let f (x) have a left-hand
and right-hand derivative at each point of that interval. Then the Fourier series

f(x) = a0 + ∑_{n=1}^{∞} [an cos(nx) + bn sin(nx)] (7.13)

converges. Its sum is f(x) except at points x0 where f(x) is discontinuous; there the sum of the series is the average of the left- and right-hand limits of f(x) at x0.
Example 7.7 What will the Fourier Series of f(x) = { 1 if −3 ≤ x ≤ 0;  2x if 0 < x ≤ 3 } converge to at x = −2, 0, 3, 5, 6?

Solution: The first two points are inside the original interval of definition of f (x), so we can

Figure 7.7: Illustration of sum of the first few terms of a Fourier series for a function with a jump
discontinuity.

just read the function value directly. The only discontinuity of f (x) occurs at x = 0. So at x = −2,
f (x) is nice and continuous. The Fourier Series will converge to f (−2) = 1. On the other hand,
at x = 0 we have a jump discontinuity, so the Fourier Series will converge to the average of the
one-sided limits: f(0⁺) = lim_{x→0⁺} f(x) = 0 and f(0⁻) = lim_{x→0⁻} f(x) = 1, so the Fourier Series will converge to (1/2)[f(0⁺) + f(0⁻)] = 1/2.

What happens at the other points? Here we consider where f (x) or its periodic extension,
f per (x), have jump discontinuities. These can only occur either at x = x0 + 2Lm where −L < x0 < L
is a jump discontinuity of f (x) or at endpoints x = ±L + 2Lm, since the periodic extension might
not "sync up" at these points, producing a jump discontinuity.

At x = 3, we are at one of these "boundary points": the left-sided limit is 6 while the right-sided limit is 1. Thus the Fourier Series will converge here to (6 + 1)/2 = 7/2. x = 5 is a point of continuity for fper(x) and so the Fourier Series will converge to fper(5) = f(−1) = 1. x = 6 is a jump discontinuity (corresponding to x = 0), so the Fourier Series will converge to 1/2.

Example 7.8 Where does the Fourier Series for f(x) = { 2 if −2 ≤ x < −1;  1 − x if −1 ≤ x ≤ 2 } converge at x = −7, −1, 6?

Solution: None of the points are inside (−2, 2) where f (x) is discontinuous. The only points where
the periodic extension might be discontinuous are the "boundary points" x = ±2 + 4k. In fact, since
f(−2) ≠ f(2), these will be points of discontinuity. So fper(x) is continuous at x = −7, since it is not a boundary point, and we have fper(−7) = f(1) = 0, which is what the Fourier Series will converge to. Similarly, f is continuous at x = −1, so the Fourier Series converges to f(−1) = 2.

For x = 6 we are at an endpoint. The left-sided limit is −1, while the right-sided limit is 2, so the Fourier Series will converge to their average, 1/2.

 Example 7.9 Plot the function the Fourier series will converge to for each of the following

functions defined on the interval −π ≤ x ≤ π: (Plots in class!)

a) f(x) = { 1 if −π/2 < x < π/2;  0 otherwise }.

b) f(x) = { x + π if −π < x < 0;  −x + π if 0 < x < π }.

c) f(x) = { 0 if −π < x < 0;  x if 0 < x < π }.

d) f(x) = x if −π < x < π.

e) f(x) = { −1 if −π < x < −π/2;  x if −π/2 < x < π/2;  1 if π/2 < x < π }.

f) f(x) = x² if −π < x < π.

7.5.1 Gibbs Phenomenon


The Gibbs phenomenon, described by J. Willard Gibbs (1899), concerns how the Fourier series of a piecewise continuously differentiable periodic function behaves at a jump discontinuity. The nth partial sum of the Fourier series has large oscillations near the jump, which may push the maximum of the partial sum above (or its minimum below) that of the function itself. The overshoot does not die out as n increases; it approaches a fixed fraction (about 9%) of the size of the jump.
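The persistent overshoot is easy to see numerically. The sketch below (an illustration assumed here, not from the text) evaluates partial sums of the ±1 square-wave series from Example 7.2 near the jump at x = 0; the peak stays near (2/π)·Si(π) ≈ 1.179 no matter how many terms are kept.

```python
import math

def partial_sum(x, terms):
    # Partial sum of the +/-1 square-wave series (4/pi) * sum sin((2n+1)x)/(2n+1).
    return (4 / math.pi) * sum(
        math.sin((2 * n + 1) * x) / (2 * n + 1) for n in range(terms))

# Search for the maximum just to the right of the jump at x = 0.  The
# overshoot does not shrink as terms grow; it tends to (2/pi)*Si(pi) ~ 1.179.
for terms in (10, 50, 200):
    peak = max(partial_sum(math.pi * j / (200 * terms), terms)
               for j in range(1, 400))
    print(terms, round(peak, 4))
```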

Figure 7.8: Illustration of Gibbs Phenomenon showing an “overshoot" near the jump discontinuities.

7.6 Complex Form of Fourier Series


7.7 Complex Fourier Series
This section follows the outline and layout of Kreyszig Chapter 11. We want to use the knowledge
derived from our first chapter on Complex Numbers to write the Fourier series in complex form.
The complex form is very useful for physical applications and can be easier to use when solving
some differential equations.
In order to write the Fourier series in complex form we must first recall the Euler identity:

Definition 7.7.1 (Euler Identity) eiθ = cos(θ ) + i sin(θ ) and similarly e−iθ = cos(θ ) − i sin(θ ).

Also, recall that by adding and subtracting these identities we get the complex form of the sine and cosine functions

cos(t) = (1/2)(e^{it} + e^{−it}),  sin(t) = (1/(2i))(e^{it} − e^{−it}). (7.14)
Recall 1/i = −i and let t = nx to find

an cos(nx) + bn sin(nx) = (an/2)(e^{inx} + e^{−inx}) + (bn/(2i))(e^{inx} − e^{−inx})
                        = (1/2)(an − i bn) e^{inx} + (1/2)(an + i bn) e^{−inx}.

Inserting this expression into the Fourier series while writing a0 = c0, (1/2)(an − i bn) = cn, (1/2)(an + i bn) = kn gives

f(x) = a0 + ∑_{n=1}^{∞} ( cn e^{inx} + kn e^{−inx} ), (7.15)

where the coefficients cn, kn are defined as

cn = (1/2)(an − i bn) = (1/(2π)) ∫_{−π}^{π} f(x) [cos(nx) − i sin(nx)] dx = (1/(2π)) ∫_{−π}^{π} f(x) e^{−inx} dx

kn = (1/2)(an + i bn) = (1/(2π)) ∫_{−π}^{π} f(x) [cos(nx) + i sin(nx)] dx = (1/(2π)) ∫_{−π}^{π} f(x) e^{inx} dx.
We simplify the formula further by setting c_{−n} = kn; then the Fourier series can be written as

f(x) = ∑_{n=−∞}^{∞} cn e^{inx},  cn = (1/(2π)) ∫_{−π}^{π} f(x) e^{−inx} dx,  n = 0, ±1, ±2, ... (7.16)

This is the complex form of the Fourier series, or the complex Fourier series, of f(x). The cn are the complex Fourier coefficients of f(x).
 Example 7.10 Find the complex Fourier series of f (x) = ex if −π < x < π and f (x + 2π) = f (x)
and then from it obtain the usual Fourier series as a check.

Solution: Start by computing the complex Fourier coefficients:

cn = (1/(2π)) ∫_{−π}^{π} e^x e^{−inx} dx = (1/(2π)) [ e^{(1−in)x} / (1 − in) ]_{−π}^{π} = (1/(2π(1 − in))) (e^π − e^{−π}) (−1)^n.

On the right side, 1/(1 − in) = (1 + in)/(1 + n²) and e^π − e^{−π} = 2 sinh(π). Hence the complex Fourier series is

e^x = (sinh(π)/π) ∑_{n=−∞}^{∞} (−1)^n ((1 + in)/(1 + n²)) e^{inx}. (7.17)

Now, derive the traditional Fourier series. Notice that

(1 + in) e^{inx} = (1 + in)[cos(nx) + i sin(nx)] = [cos(nx) − n sin(nx)] + i[n cos(nx) + sin(nx)]
(1 − in) e^{−inx} = (1 − in)[cos(nx) − i sin(nx)] = [cos(nx) − n sin(nx)] − i[n cos(nx) + sin(nx)]

Adding these two equations causes the imaginary parts to cancel:

(1 + in) e^{inx} + (1 − in) e^{−inx} = 2[cos(nx) − n sin(nx)],  n = 1, 2, ...,

and for n = 0 we get 1. Thus, the real Fourier series is

e^x = (2 sinh(π)/π) [ 1/2 − (1/(1 + 1²))[cos(x) − sin(x)] + (1/(1 + 2²))[cos(2x) − 2 sin(2x)] − ... ].


Figure 7.9: Plot of the partial sum for the Fourier series of ex .
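The closed form for cn can be verified numerically. The sketch below (an assumed illustration, not from the text) approximates cn = (1/2π)∫ f(x)e^{−inx}dx with a midpoint rule for f(x) = eˣ and compares with the formula from Example 7.10.

```python
import cmath
import math

def complex_coeff(f, n, m=20000):
    """Midpoint-rule approximation of c_n = (1/(2*pi)) * integral of
    f(x) * exp(-i*n*x) over (-pi, pi)."""
    h = 2 * math.pi / m
    xs = (-math.pi + (j + 0.5) * h for j in range(m))
    return sum(f(x) * cmath.exp(-1j * n * x) for x in xs) * h / (2 * math.pi)

# Compare with c_n = (-1)^n sinh(pi) (1 + i n) / (pi (1 + n^2)) from Example 7.10.
for n in (0, 1, 2):
    exact = (-1) ** n * math.sinh(math.pi) * (1 + 1j * n) / (math.pi * (1 + n * n))
    print(n, complex_coeff(math.exp, n), exact)
```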

 Example 7.11 Find the complex Fourier series of f (x) = sin(x) if −π < x < π and f (x + 2π) =
f (x).

Solution: Start by computing the complex Fourier coefficients:

cn = (1/(2π)) ∫_{−π}^{π} sin(x) e^{−inx} dx = (1/(2π)) ( (e^{inπ} − e^{−inπ}) / (n² − 1) ).

The coefficients are zero unless n = ±1, since e^{±inπ} = cos(nπ) ± i sin(nπ) = cos(nπ) = (−1)^n. Applying L'Hospital's rule twice at n = ±1 gives c₁ = 1/(2i) and c₋₁ = −1/(2i). Thus, the complex Fourier series is

f(x) = ∑_{n=−∞}^{∞} cn e^{inx} = (e^{ix} − e^{−ix}) / (2i),

which is exactly the complex representation of the sine function. 

 Example 7.12 Find the complex Fourier series of f (x) = 1 if 0 < x < T , f (x) = 0 otherwise,

and f (x + 2π) = f (x).

Solution: Start by computing the complex Fourier coefficients: for n ≠ 0,

cn = (1/(2π)) ∫_{−π}^{π} f(x) e^{−inx} dx = (1/(2π)) ∫_{0}^{T} e^{−inx} dx = (1/(2π)) [ e^{−inx} / (−in) ]_{0}^{T} = (i/(2πn)) (e^{−inT} − 1).

Also, c0 = (1/(2π)) ∫_{0}^{T} dx = T/(2π). Thus, the complex Fourier series is

f(x) = ∑_{n=−∞}^{∞} cn e^{inx} = (1/(2π)) [ T + ∑_{n≠0} (i/n) (e^{−inT} − 1) e^{inx} ].

7.7.1 General Complex Fourier Series for Intervals (0, L)


Definition 7.7.2 The complex Fourier series for an interval (0, L) has the form

f(x) = ∑_{n=−∞}^{∞} cn e^{2πinx/L},  cn = (1/L) ∫_{0}^{L} e^{−2πinx/L} f(x) dx. (7.18)

The quantity 2π/L is often written as ω₀ and called the fundamental angular frequency.

 Example 7.13 Find the complex Fourier series of f (x) = 1 if 0 < x < L/2, f (x) = −1 if
L/2 < x < L, and f (x + 2π) = f (x).

Solution: Start by computing the complex Fourier coefficients:

1 L −inx 2π
Z L/2 Z L 
1 i  −2inπ
Z
2π 2π
−inx −inx
+ 1 − 2−inπ ,

cn = e L f (x)dx = e L dx − e L dx = e
L 0 L 0 L/2 2πin

when n 6= 0. Also, c0 = 0 since the mean of the function is zero. Thus, the complex Fourier series is
∞ ∞
[1 − e−inπ ] inx 2π
f (x) = ∑ cn einx = ∑ e L.
n=−∞ n=−∞,n6=0 inπ

 Example 7.14 Find the complex Fourier series for the following functions a) f (x) = 1 for

−π < x < π, b) f (x) = −2x for −π < x < π.

Solution: For a),

cn = (1/(2π)) ∫_{−π}^{π} e^{−inx} dx = −(1/(2πin)) [ e^{−inx} ]_{−π}^{π} = −(1/(2πin)) ( e^{−inπ} − e^{inπ} ),

which is 0 when n ≠ 0. Thus the only term in the Fourier series is c0 = 1.

For b), first compute the coefficients of g(x) = x; by linearity, the coefficients of f(x) = −2x are −2 times these. Integration by parts gives, for n ≠ 0,

cn = (1/(2π)) ∫_{−π}^{π} x e^{−inx} dx = (1/(2π)) [ x e^{−inx} / (−in) ]_{−π}^{π} + (1/(2πin)) ∫_{−π}^{π} e^{−inx} dx = (−1)^n / (−in) = (i/n) e^{inπ},

while c0 = (1/(2π)) ∫_{−π}^{π} x dx = 0. Thus, the complex Fourier series is

f(x) = −2 ∑_{n≠0} (i/n) e^{inπ} e^{inx} = −2 ∑_{n≠0} (i/n) e^{in(π+x)}.

7.8 General Fourier Series for Functions of Any Period p = 2L


This section follows the outline and layout of Kreyszig Chapter 11. So far all the functions considered have had period p = 2π, simplifying the formulas for the Fourier coefficients and the Fourier
series. In general for applications, most functions do not have period 2π, but rather possess a period
of arbitrary length we will call p = 2L. The good news is that the Fourier series and formulas for
the coefficients for a general function of period 2L have a very similar form. We use the notation
p = 2L where L is the length of the object under consideration (such as a spring or rod). We will
see this application in the Chapter on PDEs.

Key Idea: Find and use a change of scale that transforms a 2π-periodic function into a function of period 2L. Recall the form of the Fourier series for a function of period 2π:

g(v) = a0 + ∑_{n=1}^{∞} [an cos(nv) + bn sin(nv)] (7.19)

with coefficients

a0 = (1/(2π)) ∫_{−π}^{π} g(v) dv (7.20)

an = (1/π) ∫_{−π}^{π} g(v) cos(nv) dv,  n = 1, 2, 3, ... (7.21)

bn = (1/π) ∫_{−π}^{π} g(v) sin(nv) dv,  n = 1, 2, 3, .... (7.22)
Using a change of scale, let v = kx with k chosen so that the old period v = 2π corresponds to the new period x = 2L. Thus 2π = k · 2L ⇒ k = π/L. Therefore v = kx = πx/L and dv = (π/L) dx. Now writing g(v) = f(x), the Fourier series and coefficients become

f(x) = a0 + ∑_{n=1}^{∞} [ an cos(nπx/L) + bn sin(nπx/L) ] (7.23)

with coefficients

a0 = (1/(2L)) ∫_{−L}^{L} f(x) dx (7.24)

an = (1/L) ∫_{−L}^{L} f(x) cos(nπx/L) dx,  n = 1, 2, 3, ... (7.25)

bn = (1/L) ∫_{−L}^{L} f(x) sin(nπx/L) dx,  n = 1, 2, 3, .... (7.26)
Also, the complex Fourier series can be expressed on an arbitrary interval (−L, L):

f(x) = ∑_{n=−∞}^{∞} cn e^{inπx/L},  cn = (1/(2L)) ∫_{−L}^{L} f(x) e^{−inπx/L} dx. (7.27)
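As a sanity check of the period-2L formulas, the following sketch (my own illustration; the choice f(x) = 1 + x with L = 2 is arbitrary) approximates the real coefficients with a midpoint rule and compares with the exact values a0 = 1, an = 0, bn = 2L(−1)^{n+1}/(nπ), which are worked out by hand in Example 7.18 below.

```python
import math

L = 2.0  # half-period; an arbitrary choice for this demonstration

def coeffs_2L(f, n_max, m=20000):
    """Euler formulas for period p = 2L, via a midpoint rule on (-L, L)."""
    h = 2 * L / m
    xs = [-L + (j + 0.5) * h for j in range(m)]
    a0 = sum(f(x) for x in xs) * h / (2 * L)
    a = [sum(f(x) * math.cos(n * math.pi * x / L) for x in xs) * h / L
         for n in range(1, n_max + 1)]
    b = [sum(f(x) * math.sin(n * math.pi * x / L) for x in xs) * h / L
         for n in range(1, n_max + 1)]
    return a0, a, b

# For f(x) = 1 + x: a0 = 1, a_n = 0, b_n = 2L(-1)^(n+1)/(n*pi).
a0, a, b = coeffs_2L(lambda x: 1 + x, 2)
print(a0, a, b)
```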

 Example 7.15 (Periodic Rectangular Wave) Find the Fourier series of the function

0
 if − 2 < x < −1
f (x) = k if − 1 < x < 1 ,

0 if 1 < x < 2

where p = 2L = 4 and L = 2.

Solution: Using the formulas just derived we can find the Fourier coefficients

a0 = (1/(2L)) ∫_{−L}^{L} f(x) dx = (1/4) ∫_{−1}^{1} k dx = (k/4) [x]_{−1}^{1} = k/2

an = (1/L) ∫_{−L}^{L} f(x) cos(nπx/L) dx = (1/2) ∫_{−1}^{1} k cos(nπx/2) dx
   = (k/(nπ)) [ sin(nπx/2) ]_{−1}^{1} = (2k/(nπ)) sin(nπ/2)

bn = (1/L) ∫_{−L}^{L} f(x) sin(nπx/L) dx = (1/2) ∫_{−1}^{1} k sin(nπx/2) dx
   = −(k/(nπ)) [ cos(nπx/2) ]_{−1}^{1} = −(k/(nπ)) [ cos(nπ/2) − cos(−nπ/2) ] = 0.

Observe that an = 0 if n is even. Hence the Fourier series is

f(x) = k/2 + (2k/π) [ cos(πx/2) − (1/3) cos(3πx/2) + (1/5) cos(5πx/2) − ... ].


Figure 7.10: Periodic Rectangular Wave from the first example.

 Example 7.16 (Periodic Rectangular Wave) Find the Fourier series of the function
f(x) = { −k if −2 < x < 0;  k if 0 < x < 2 },

where p = 2L = 4 and L = 2.

Solution: Using the formulas just derived we can find the Fourier coefficients

a0 = (1/(2L)) ∫_{−L}^{L} f(x) dx = (1/4) [ ∫_{−2}^{0} (−k) dx + ∫_{0}^{2} k dx ] = (1/4) (−2k + 2k) = 0

an = (1/2) [ ∫_{−2}^{0} (−k) cos(nπx/2) dx + ∫_{0}^{2} k cos(nπx/2) dx ]
   = (1/2) [ −(2k/(nπ)) sin(nπx/2)|_{−2}^{0} + (2k/(nπ)) sin(nπx/2)|_{0}^{2} ] = 0

bn = (1/2) [ ∫_{−2}^{0} (−k) sin(nπx/2) dx + ∫_{0}^{2} k sin(nπx/2) dx ]
   = (1/2) [ (2k/(nπ)) cos(nπx/2)|_{−2}^{0} − (2k/(nπ)) cos(nπx/2)|_{0}^{2} ]
   = (k/(nπ)) [ 1 − cos(nπ) − cos(nπ) + 1 ] = { 4k/(nπ) if n is odd;  0 if n is even }.

Hence the Fourier series is

f(x) = (4k/π) [ sin(πx/2) + (1/3) sin(3πx/2) + (1/5) sin(5πx/2) + ... ].


 Example 7.17 (Half-wave Rectifier) Find the Fourier series of the function
f(t) = { 0 if −L < t < 0;  E sin(ωt) if 0 < t < L },

Figure 7.11: Periodic Rectangular Wave from the second example.

where p = 2L = 2π/ω and L = π/ω.

Solution: Using the formulas just derived we can find the Fourier coefficients

a0 = (1/(2L)) ∫_{−L}^{L} f(t) dt = (ω/(2π)) ∫_{0}^{π/ω} E sin(ωt) dt = E/π

an = (1/L) ∫_{−L}^{L} f(t) cos(nπt/L) dt = (ω/π) ∫_{0}^{π/ω} E sin(ωt) cos(nωt) dt
   = (ωE/(2π)) ∫_{0}^{π/ω} [ sin((1 + n)ωt) + sin((1 − n)ωt) ] dt
   = −(ωE/(2π)) [ cos((1 + n)ωt)/((1 + n)ω) + cos((1 − n)ωt)/((1 − n)ω) ]_{0}^{π/ω}
   = (E/(2π)) [ (−cos((1 + n)π) + 1)/(1 + n) + (−cos((1 − n)π) + 1)/(1 − n) ]

bn = (1/L) ∫_{−L}^{L} f(t) sin(nπt/L) dt = (ω/π) ∫_{0}^{π/ω} E sin(ωt) sin(nωt) dt.

Observe that a1 = 0 and an = 0 if n is odd (n > 1). For n even,

an = (E/(2π)) ( 2/(1 + n) + 2/(1 − n) ) = −2E/((n − 1)(n + 1)π).

Also, b1 = E/2 and all other bn = 0. Hence the Fourier series is

f(t) = E/π + (E/2) sin(ωt) − (2E/π) [ (1/(1 · 3)) cos(2ωt) + (1/(3 · 5)) cos(4ωt) + ... ].

Figure 7.12: Half-wave rectifier example.

 Example 7.18 Compute the Fourier Series of f (x) = 1 + x on the interval (−L, L).

Solution: Using the above formulas we have

a0 = (1/(2L)) ∫_{−L}^{L} (1 + x) dx = 1

am = (1/L) ∫_{−L}^{L} (1 + x) cos(mπx/L) dx
   = [ ((1 + x)/(mπ)) sin(mπx/L) + (L/(m²π²)) cos(mπx/L) ]_{−L}^{L}
   = (L/(m²π²)) (cos(mπ) − cos(−mπ)) = 0,  m ≠ 0

bm = (1/L) ∫_{−L}^{L} (1 + x) sin(mπx/L) dx
   = [ −((1 + x)/(mπ)) cos(mπx/L) + (L/(m²π²)) sin(mπx/L) ]_{−L}^{L}
   = −(2L/(mπ)) cos(mπ) = (2L/(mπ)) (−1)^{m+1}.

So the full Fourier series of f(x) is

1 + x = 1 + (2L/π) ( sin(πx/L) − (1/2) sin(2πx/L) + (1/3) sin(3πx/L) − ... ) (7.28)
      = 1 + (2L/π) ∑_{n=1}^{∞} [ (1/(2n − 1)) sin((2n − 1)πx/L) − (1/(2n)) sin(2nπx/L) ]. (7.29)

Example 7.19 Compute the Fourier Series for f(x) = { 2 if −2 ≤ x < −1;  1 − x if −1 ≤ x < 2 } on the interval (−2, 2).

Solution: We start by using the Euler-Fourier formulas. For the cosine terms we find

a0 = (1/4) ∫_{−2}^{2} f(x) dx = (1/4) [ ∫_{−2}^{−1} 2 dx + ∫_{−1}^{2} (1 − x) dx ] = (1/4) (2 + 3/2) = 7/8

and

an = (1/2) ∫_{−2}^{2} f(x) cos(nπx/2) dx
   = (1/2) [ ∫_{−2}^{−1} 2 cos(nπx/2) dx + ∫_{−1}^{2} (1 − x) cos(nπx/2) dx ]
   = (1/2) [ (4/(nπ)) sin(nπx/2)|_{−2}^{−1} + (2(1 − x)/(nπ)) sin(nπx/2)|_{−1}^{2} − (4/(n²π²)) cos(nπx/2)|_{−1}^{2} ]
   = (1/2) [ −(4/(nπ)) sin(nπ/2) + (4/(nπ)) sin(nπ/2) − (4/(n²π²)) (cos(nπ) − cos(nπ/2)) ]
   = { 2/(n²π²) if n is odd;  0 if n = 4m;  −4/(n²π²) if n = 4m + 2 }.

Also, for the sine terms

bn = (1/2) ∫_{−2}^{2} f(x) sin(nπx/2) dx
   = (1/2) [ ∫_{−2}^{−1} 2 sin(nπx/2) dx + ∫_{−1}^{2} (1 − x) sin(nπx/2) dx ]
   = (1/2) [ −(4/(nπ)) cos(nπx/2)|_{−2}^{−1} − (2(1 − x)/(nπ)) cos(nπx/2)|_{−1}^{2} − (4/(n²π²)) sin(nπx/2)|_{−1}^{2} ]
   = (1/2) [ (6/(nπ)) cos(nπ) − (4/(n²π²)) sin(nπ/2) ]
   = { 3/(nπ) if n is even;  −3/(nπ) − 2/(n²π²) if n = 4m + 1;  −3/(nπ) + 2/(n²π²) if n = 4m + 3 }.

So we have

f(x) = 7/8 + ∑_{m=0}^{∞} [ (2/((4m + 1)²π²)) cos((4m + 1)πx/2) + ( −3/((4m + 1)π) − 2/((4m + 1)²π²) ) sin((4m + 1)πx/2)
     − (4/((4m + 2)²π²)) cos((4m + 2)πx/2) + (3/((4m + 2)π)) sin((4m + 2)πx/2)
     + (2/((4m + 3)²π²)) cos((4m + 3)πx/2) + ( −3/((4m + 3)π) + 2/((4m + 3)²π²) ) sin((4m + 3)πx/2)
     + (3/((4m + 4)π)) sin((4m + 4)πx/2) ],

where the index starts at m = 0 (with the remaining even-n sine term written as n = 4m + 4) so that every nonzero coefficient appears.


This example represents a worst-case scenario: there are a lot of Fourier coefficients to keep track of. Notice that for each value of m, the summand specifies four different Fourier terms (for n = 4m, 4m + 1, 4m + 2, 4m + 3). This can often happen and, depending on L, even more terms may be required.

7.9 Even and Odd Functions


7.10 Even and Odd Functions, Half-Range Expansions
Recall that an even function is a function satisfying

g(−x) = g(x). (7.30)

This means that the graph y = g(x) is symmetric with respect to the y-axis. An odd function
satisfies

g(−x) = −g(x) (7.31)

meaning that its graph y = g(x) is symmetric with respect to the origin.
 Example 7.20 A monomial xn is even if n is even and odd if n is odd. cos(x) is even and sin(x)
is odd. Note tan(x) is odd. 

There are some general rules for how products and sums behave:
(1) If g(x) is odd and h(x) is even, their product g(x)h(x) is odd.
(2) If g(x) and h(x) are either both even or both odd, g(x)h(x) is even.

Figure 7.13: Even function (Left) and Odd function (Right).

(3) The sum of two even functions or two odd functions is even or odd, respectively.
To remember the rules consider how many negative signs come out of the arguments.
(4) The sum of an even and an odd function can be anything. In fact, any function on (−L, L) can
be written as a sum of an even function, called the even part, and an odd function, called the odd
part.
(5) Differentiation and integration can change the parity of a function: if f(x) is even, then df/dx and ∫₀ˣ f(s) ds are both odd, and vice versa.
The graph of an odd function g(x) must pass through the origin by definition. This also tells us
that if g(x) is even, as long as g0 (0) exists, then g0 (0) = 0.

Theorem 7.10.1 Definite integrals of odd and even functions over symmetric intervals have useful properties:

∫_{−L}^{L} (odd) dx = 0  and  ∫_{−L}^{L} (even) dx = 2 ∫_{0}^{L} (even) dx. (7.32)
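These properties are easy to illustrate numerically. The sketch below (assumed here; the midpoint rule and the test integrands are my own choices) checks both identities for an odd product x³ cos(x) and an even product x² cos(x).

```python
import math

def integral(f, a, b, m=10000):
    """Simple midpoint-rule integral of f over (a, b)."""
    h = (b - a) / m
    return sum(f(a + (j + 0.5) * h) for j in range(m)) * h

L = 2.0
odd_f = lambda x: x ** 3 * math.cos(x)   # odd * even = odd
even_f = lambda x: x ** 2 * math.cos(x)  # even * even = even

print(integral(odd_f, -L, L))                               # ~ 0
print(integral(even_f, -L, L), 2 * integral(even_f, 0, L))  # these agree
```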

Given a function f (x) defined on (0, L), there is only one way to extend it to (−L, L) to an even
or odd function. The even extension of f (x) is
feven(x) = { f(x) for 0 < x < L;  f(−x) for −L < x < 0 }. (7.33)

This is just its reflection across the y-axis. Notice that the even extension is not necessarily defined
at the origin.
The odd extension of f(x) is

fodd(x) = { f(x) for 0 < x < L;  −f(−x) for −L < x < 0;  0 for x = 0 }. (7.34)

This is just its reflection through the origin.

R Since the cosine terms in a Fourier series are even and the sine terms are odd, then it should
not be surprising that an even function is given by a series of cosine terms and an odd function
by a series of sine terms.

Theorem 7.10.2 (Fourier Cosine Series, Fourier Sine Series) The Fourier series of an even function of period 2L is a Fourier Cosine Series

f(x) = a0 + ∑_{n=1}^{∞} an cos(nπx/L) (7.35)

with coefficients (NOTE: integration is only over the half-interval (0, L)!)

a0 = (1/L) ∫_{0}^{L} f(x) dx

an = (2/L) ∫_{0}^{L} f(x) cos(nπx/L) dx,  n = 1, 2, 3, ...

The Fourier series of an odd function of period 2L is a Fourier Sine Series

f(x) = ∑_{n=1}^{∞} bn sin(nπx/L) (7.36)

with coefficients

bn = (2/L) ∫_{0}^{L} f(x) sin(nπx/L) dx.
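A quick numerical check of these half-range formulas (an assumed illustration, not from the text): for f(x) = x on (0, 1), the cosine coefficients are an = 2((−1)ⁿ − 1)/(n²π²) and the sine coefficients are bn = 2(−1)ⁿ⁺¹/(nπ).

```python
import math

L = 1.0  # an assumed interval length for this demonstration

def half_range(f, n, m=20000):
    """Half-range coefficients on (0, L): cosine a_n and sine b_n,
    approximated with a midpoint rule."""
    h = L / m
    xs = [(j + 0.5) * h for j in range(m)]
    an = sum(f(x) * math.cos(n * math.pi * x / L) for x in xs) * 2 * h / L
    bn = sum(f(x) * math.sin(n * math.pi * x / L) for x in xs) * 2 * h / L
    return an, bn

# For f(x) = x on (0, 1): a_n = 2((-1)^n - 1)/(n*pi)^2, b_n = 2(-1)^(n+1)/(n*pi).
for n in (1, 2):
    print(n, half_range(lambda x: x, n))
```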

Theorem 7.10.3 (Sum and Scalar Multiple) The Fourier coefficients of a sum f1 + f2 are the
sums of the corresponding Fourier coefficients of f1 and f2 . The Fourier coefficients of c f are c
times the Fourier coefficients of f .

Figure 7.14: Sawtooth function.

 Example 7.21 (Sawtooth Wave) Find the Fourier series of the function f (x) = x + π if −π <
x < π and f (x + 2π) = f (x).

Solution: Here we have f = f1 + f2 where f1 = x and f2 = π. The Fourier coefficients of f2


are zero except for the first one (the constant term, a0 = π). Thus, the Fourier coefficients an , bn are
those for f1 except for a0 , which is π. Since f1 is odd, then an = 0 for n = 1, 2, ... and

bn = (2/π) ∫_{0}^{π} f1(x) sin(nx) dx = (2/π) ∫_{0}^{π} x sin(nx) dx
   = (2/π) [ −(x/n) cos(nx) ]_{0}^{π} + (2/(nπ)) ∫_{0}^{π} cos(nx) dx = −(2/n) cos(nπ).

Thus, b1 = 2, b2 = −2/2 = −1, b3 = 2/3, b4 = −2/4 = −1/2, ... and the Fourier series of f (x) is

 
1 1
f (x) = π + 2 sin(x) − sin(2x) + sin(3x) − ... .
2 3

Figure 7.15: First few partial sums.

7.10.1 Half-Range Expansions


Half-range expansions are Fourier series. The basic idea is to represent a function f(x) by a Fourier series when the function is only defined on the interval (0, L). This could represent a violin string or the temperature distribution in a metal bar. We could extend f(x) to a function of period 2L and develop the extended function into a Fourier series. This series would in general contain both sine and cosine terms, requiring many computations. We know that if the function were even on (−L, L) we would only have to compute the cosine terms. Likewise, if the function were odd on the interval (−L, L), then we would only need to compute the sine terms.

Definition 7.10.1 An even periodic extension is a function of period 2L which is even, but
coincides with the given function f (x) on the interval (0, L) (see f1 in figure).

An odd periodic extension is a function of period 2L which is odd, but coincides with
the given function f (x) on the interval (0, L) (see f2 in figure).

R Both extensions have period 2L. This is where the term half-range expansion comes from:
f is given on half the range (0, L) giving only half the periodicity of the length 2L.

 Example 7.22 Find the two half-range expansions for the function

f(x) = { (2k/L) x if 0 < x < L/2;  (2k/L)(L − x) if L/2 < x < L }.

Solution: a) Even Periodic Extension: Find the Fourier Cosine series, which converges to the even

Figure 7.16: Extensions to even and odd functions. f1 (x) is the even periodic extension and f2 (x)
is the odd periodic extension.

periodic extension. Start by finding the Fourier coefficients:

a0 = (1/L) [ (2k/L) ∫_{0}^{L/2} x dx + (2k/L) ∫_{L/2}^{L} (L − x) dx ] = k/2

an = (2/L) [ (2k/L) ∫_{0}^{L/2} x cos(nπx/L) dx + (2k/L) ∫_{L/2}^{L} (L − x) cos(nπx/L) dx ].

Integration by parts and combining the terms gives an = (4k/(n²π²)) ( 2 cos(nπ/2) − cos(nπ) − 1 ), which vanishes unless n = 2, 6, 10, 14, .... Hence, the first half-range expansion of f(x) is

f(x) = k/2 − (16k/π²) [ (1/2²) cos(2πx/L) + (1/6²) cos(6πx/L) + ... ].

b) Odd Periodic Extension: Find the Fourier Sine series, which converges to the odd periodic extension. Integration by parts gives the Fourier coefficients

bn = (8k/(n²π²)) sin(nπ/2).

Thus, the other half-range expansion of f(x) is

f(x) = (8k/π²) [ (1/1²) sin(πx/L) − (1/3²) sin(3πx/L) + (1/5²) sin(5πx/L) − ... ].


7.10.2 Fourier Sine Series


Each of the terms in the Fourier Sine Series for f(x), sin(nπx/L), is odd. As with the full Fourier Series, each of these terms also has period 2L. So we can think of the Fourier Sine Series as the expansion of an odd function of period 2L defined on the entire line which coincides with f(x) on (0, L).

Figure 7.17: Even and odd extensions.

One can show that the full Fourier Series of fodd is the same as the Fourier Sine Series of f(x). Let

a0 + ∑_{n=1}^{∞} [ an cos(nπx/L) + bn sin(nπx/L) ] (7.37)

be the Fourier Series for fodd(x), with coefficients given in Section 10.3:

an = (1/L) ∫_{−L}^{L} fodd(x) cos(nπx/L) dx = 0, (7.38)

since fodd is odd and cos is even, so their product is again odd. On the other hand,

bn = (1/L) ∫_{−L}^{L} fodd(x) sin(nπx/L) dx. (7.39)

Both fodd and sin are odd, so their product is even, and therefore

bn = (2/L) ∫_{0}^{L} fodd(x) sin(nπx/L) dx (7.40)
   = (2/L) ∫_{0}^{L} f(x) sin(nπx/L) dx, (7.41)

which are just the Fourier Sine coefficients of f(x). Thus, as the Fourier Sine Series of f(x) is the full Fourier Series of fodd(x), the 2L-periodic odd function that the Fourier Sine Series expands is just the periodic extension fodd.
This goes both ways. If we want to compute a Fourier Series for an odd function on (−L, L), we can just compute the Fourier Sine Series of the function restricted to (0, L). It will converge to the original function on (−L, L), with the only issues occurring at any jump discontinuities. This only works for odd functions: do not use the formula for the coefficients of the Sine Series unless you are working with an odd function.
 Example 7.23 Write down the odd extension of f (x) = L − x on (0, L) and compute its Fourier
Series.

Solution: To get the odd extension of f(x) we will need to see how to reflect f across the origin. What we end up with is the function

fodd(x) = { L − x for 0 < x < L;  −L − x for −L < x < 0 }. (7.42)

Now, what is the Fourier Series of fodd(x)? By the previous discussion, we know it will be identical to the Fourier Sine Series of f(x), as this converges on (−L, 0) to fodd. So we have

fodd(x) = ∑_{n=1}^{∞} bn sin(nπx/L), (7.43)

where

bn = (2/L) ∫_{0}^{L} (L − x) sin(nπx/L) dx (7.44)
   = (2/L) [ −(L(L − x)/(nπ)) cos(nπx/L) − (L²/(n²π²)) sin(nπx/L) ]_{0}^{L} (7.45)
   = 2L/(nπ). (7.46)

Thus the desired Fourier Series is

fodd(x) = (2L/π) ∑_{n=1}^{∞} (1/n) sin(nπx/L). (7.47)


Can we compute the Fourier Sine Series of a constant function like $f(x) = 1$, which is even? Yes: if we are computing the Fourier Sine Series for $f(x)$, it only needs to converge to $f(x)$ on $(0, L)$, where issues of evenness and oddness do not arise. The Fourier Sine Series will converge to the odd extension of $f(x)$ on $(-L, L)$.
 Example 7.24 Find the Fourier Series for the odd extension of
\[ f(x) = \begin{cases} \frac{3}{2} & 0 < x < \frac{3}{2} \\ x - \frac{3}{2} & \frac{3}{2} < x < 3 \end{cases} \tag{7.48} \]
on $(-3, 3)$.

Solution: The Fourier Series for fodd (x) on (−3, 3) will just be the Fourier Sine Series for f (x) on
(0, 3). The Fourier Sine coefficients for f (x) are
\[ b_n = \frac{2}{3}\int_{0}^{3} f(x)\sin\left(\frac{n\pi x}{3}\right)dx \tag{7.49} \]
\[ = \frac{2}{3}\left[\int_{0}^{3/2} \frac{3}{2}\sin\left(\frac{n\pi x}{3}\right)dx + \int_{3/2}^{3} \left(x - \frac{3}{2}\right)\sin\left(\frac{n\pi x}{3}\right)dx\right] \tag{7.50} \]
\[ = \frac{2}{3}\left[-\frac{9}{2n\pi}\cos\left(\frac{n\pi x}{3}\right)\Big|_0^{3/2} - \frac{3(x-\frac{3}{2})}{n\pi}\cos\left(\frac{n\pi x}{3}\right)\Big|_{3/2}^{3} + \frac{9}{n^2\pi^2}\sin\left(\frac{n\pi x}{3}\right)\Big|_{3/2}^{3}\right] \tag{7.51} \]
\[ = \frac{2}{3}\left[-\frac{9}{2n\pi}\left(\cos\left(\frac{n\pi}{2}\right) - 1\right) - \frac{9}{2n\pi}\cos(n\pi) - \frac{9}{n^2\pi^2}\sin\left(\frac{n\pi}{2}\right)\right] \tag{7.52} \]
\[ = \frac{2}{3}\left[\frac{9}{2n\pi}\left(1 - \cos\left(\frac{n\pi}{2}\right) + (-1)^{n+1}\right) - \frac{9}{n^2\pi^2}\sin\left(\frac{n\pi}{2}\right)\right] \tag{7.53} \]
\[ = \frac{3}{n\pi}\left[1 - \cos\left(\frac{n\pi}{2}\right) + (-1)^{n+1} - \frac{2}{n\pi}\sin\left(\frac{n\pi}{2}\right)\right] \tag{7.54} \]
and the Fourier Series is
\[ f_{odd}(x) = \frac{3}{\pi}\sum_{n=1}^{\infty}\frac{1}{n}\left[1 - \cos\left(\frac{n\pi}{2}\right) + (-1)^{n+1} - \frac{2}{n\pi}\sin\left(\frac{n\pi}{2}\right)\right]\sin\left(\frac{n\pi x}{3}\right). \tag{7.55} \]

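Multi-step coefficient computations like this are easy to check numerically: the closed form (7.54) should agree with direct numerical integration of (7.49). A minimal Python sketch (the integration step count is an arbitrary choice):

```python
import math

def f(x):
    # The function from Example 7.24 on (0, 3)
    return 1.5 if x < 1.5 else x - 1.5

def bn_formula(n):
    # Closed form (7.54)
    return (3 / (n * math.pi)) * (
        1 - math.cos(n * math.pi / 2) + (-1) ** (n + 1)
        - (2 / (n * math.pi)) * math.sin(n * math.pi / 2)
    )

def bn_numeric(n, steps=20000):
    # Midpoint rule for (2/3) * integral_0^3 f(x) sin(n pi x / 3) dx
    h = 3.0 / steps
    return (2 / 3) * h * sum(
        f((k + 0.5) * h) * math.sin(n * math.pi * (k + 0.5) * h / 3)
        for k in range(steps)
    )

for n in range(1, 6):
    print(n, bn_formula(n), bn_numeric(n))
```

Agreement to several digits for the first few $n$ is strong evidence that the algebra above is right.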

7.10.3 Fourier Cosine Series


Now consider what happens for the Fourier Cosine Series of f (x) on (0, L). This is analogous to
the Sine Series case. Every term in the Cosine Series has the form
\[ a_n \cos\left(\frac{n\pi x}{L}\right) \tag{7.56} \]
and hence is even, so the entire Cosine Series is even. Thus, the Cosine Series must converge on
(−L, L) to an even function which coincides on (0, L) with f (x). This must be the even extension
\[ f_{even}(x) = \begin{cases} f(x) & 0 < x < L \\ f(-x) & -L < x < 0 \end{cases} \tag{7.57} \]

Notice that this definition does not specify the value of the function at zero; the only restriction on an even function at zero is that, if it exists, the derivative should be zero.
It is straightforward to show that the Fourier coefficients of $f_{even}(x)$ coincide with the Fourier Cosine coefficients of $f(x)$. The Euler-Fourier formulas give

\[ a_n = \frac{1}{L}\int_{-L}^{L} f_{even}(x)\cos\left(\frac{n\pi x}{L}\right)dx \tag{7.58} \]
\[ = \frac{2}{L}\int_{0}^{L} f_{even}(x)\cos\left(\frac{n\pi x}{L}\right)dx \quad \text{since } f_{even}(x)\cos\left(\tfrac{n\pi x}{L}\right) \text{ is even} \tag{7.59} \]
\[ = \frac{2}{L}\int_{0}^{L} f(x)\cos\left(\frac{n\pi x}{L}\right)dx, \tag{7.60} \]
which are the Fourier Cosine coefficients of $f(x)$ on $(0, L)$. Moreover,


\[ b_n = \frac{1}{L}\int_{-L}^{L} f_{even}(x)\sin\left(\frac{n\pi x}{L}\right)dx = 0 \tag{7.61} \]
since $f_{even}(x)\sin(\frac{n\pi x}{L})$ is odd. Thus the Fourier Cosine Series of $f(x)$ on $(0, L)$ can be considered as the Fourier expansion of $f_{even}(x)$ on $(-L, L)$, and therefore also as the expansion of the periodic extension of $f_{even}(x)$. It will converge, as in the Fourier Convergence Theorem, to this periodic extension.
This also means that if we want to compute the Fourier Series of an even function, we can just
compute the Fourier Cosine Series of its restriction to (0, L). It is very important that this only be
attempted if the function we are starting with is even.
 Example 7.25 Write down the even extension of f (x) = L − x on (0, L) and compute its Fourier
Series.

Solution: The even extension will be


\[ f_{even}(x) = \begin{cases} L - x & 0 < x < L \\ L + x & -L < x < 0 \end{cases} \tag{7.62} \]

Its Fourier Series is the same as the Fourier Cosine Series of f (x), by the previous discussion. So
we can just compute the coefficients. Thus we have

\[ f_{even}(x) = a_0 + \sum_{n=1}^{\infty} a_n \cos\left(\frac{n\pi x}{L}\right), \tag{7.63} \]

where

\[ a_0 = \frac{1}{L}\int_{0}^{L} f(x)dx = \frac{1}{L}\int_{0}^{L} (L - x)dx = \frac{L}{2} \tag{7.64} \]
\[ a_n = \frac{2}{L}\int_{0}^{L} f(x)\cos\left(\frac{n\pi x}{L}\right)dx \tag{7.65} \]
\[ = \frac{2}{L}\int_{0}^{L} (L - x)\cos\left(\frac{n\pi x}{L}\right)dx \tag{7.66} \]
\[ = \frac{2}{L}\left[\frac{L(L-x)}{n\pi}\sin\left(\frac{n\pi x}{L}\right) - \frac{L^2}{n^2\pi^2}\cos\left(\frac{n\pi x}{L}\right)\right]_0^L \tag{7.67} \]
\[ = \frac{2}{L}\cdot\frac{L^2}{n^2\pi^2}\left(-\cos(n\pi) + \cos(0)\right) \tag{7.68} \]
\[ = \frac{2L}{n^2\pi^2}\left((-1)^{n+1} + 1\right). \tag{7.69} \]

So we have
\[ f_{even}(x) = \frac{L}{2} + \sum_{n=1}^{\infty} \frac{2L}{n^2\pi^2}\left((-1)^{n+1} + 1\right)\cos\left(\frac{n\pi x}{L}\right). \tag{7.70} \]

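Because these coefficients decay like $1/n^2$, the partial sums of (7.70) converge quickly, even at $x = 0$ where the even extension is continuous. A short numerical check ($L = 1$ is our choice):

```python
import math

def cosine_series(x, L=1.0, N=1000):
    """Partial sum of the Fourier Cosine Series (7.70) of f(x) = L - x."""
    s = L / 2
    for n in range(1, N + 1):
        s += (2 * L / (n * math.pi) ** 2) * ((-1) ** (n + 1) + 1) \
             * math.cos(n * math.pi * x / L)
    return s

# The even extension is continuous, so the series converges everywhere,
# including at x = 0 where the odd extension of the same f has a jump.
for x in (0.0, 0.3, 0.8):
    print(x, cosine_series(x), 1.0 - x)
```

Contrast this with the $1/n$ decay of the sine coefficients in Example 7.23: making the extension continuous buys a faster-converging series.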
 Example 7.26 Write down the even extension of

\[ f(x) = \begin{cases} \frac{3}{2} & 0 \le x < \frac{3}{2} \\ x - \frac{3}{2} & \frac{3}{2} \le x \le 3 \end{cases} \tag{7.71} \]

and compute its Fourier Series.

Solution: Using Equation (7.57) we see that the even extension is


\[ f_{even}(x) = \begin{cases} x - \frac{3}{2} & \frac{3}{2} < x < 3 \\ \frac{3}{2} & 0 \le x < \frac{3}{2} \\ \frac{3}{2} & -\frac{3}{2} < x < 0 \\ -x - \frac{3}{2} & -3 \le x \le -\frac{3}{2} \end{cases} \tag{7.72} \]

We just need to compute the Fourier Cosine coefficients of the original f (x) on (0, 3).
\[ a_0 = \frac{1}{3}\int_{0}^{3} f(x)dx \tag{7.73} \]
\[ = \frac{1}{3}\left[\int_{0}^{3/2} \frac{3}{2}\,dx + \int_{3/2}^{3} \left(x - \frac{3}{2}\right)dx\right] \tag{7.74} \]
\[ = \frac{1}{3}\left(\frac{9}{4} + \frac{9}{8}\right) = \frac{9}{8} \tag{7.75} \]
\[ a_n = \frac{2}{3}\int_{0}^{3} f(x)\cos\left(\frac{n\pi x}{3}\right)dx \tag{7.76} \]
\[ = \frac{2}{3}\left[\int_{0}^{3/2} \frac{3}{2}\cos\left(\frac{n\pi x}{3}\right)dx + \int_{3/2}^{3} \left(x - \frac{3}{2}\right)\cos\left(\frac{n\pi x}{3}\right)dx\right] \tag{7.77} \]
\[ = \frac{2}{3}\left[\frac{9}{2n\pi}\sin\left(\frac{n\pi x}{3}\right)\Big|_0^{3/2} + \frac{3(x - \frac{3}{2})}{n\pi}\sin\left(\frac{n\pi x}{3}\right)\Big|_{3/2}^{3} + \frac{9}{n^2\pi^2}\cos\left(\frac{n\pi x}{3}\right)\Big|_{3/2}^{3}\right] \tag{7.78} \]
\[ = \frac{2}{3}\left[\frac{9}{2n\pi}\sin\left(\frac{n\pi}{2}\right) + \frac{9}{n^2\pi^2}\left(\cos(n\pi) - \cos\left(\frac{n\pi}{2}\right)\right)\right] \tag{7.79} \]
\[ = \frac{6}{n\pi}\left[\frac{1}{2}\sin\left(\frac{n\pi}{2}\right) + \frac{(-1)^n}{n\pi} - \frac{1}{n\pi}\cos\left(\frac{n\pi}{2}\right)\right] \tag{7.80} \]
\[ = \frac{6}{n\pi}\left[\frac{1}{n\pi}\left((-1)^n - \cos\left(\frac{n\pi}{2}\right)\right) + \frac{1}{2}\sin\left(\frac{n\pi}{2}\right)\right]. \tag{7.81} \]
So the Fourier Series is
\[ f_{even}(x) = \frac{9}{8} + \frac{6}{\pi}\sum_{n=1}^{\infty}\frac{1}{n}\left[\frac{1}{n\pi}\left((-1)^n - \cos\left(\frac{n\pi}{2}\right)\right) + \frac{1}{2}\sin\left(\frac{n\pi}{2}\right)\right]\cos\left(\frac{n\pi x}{3}\right). \tag{7.82} \]

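As with the sine series example, a partial-sum check of (7.82) against the even extension (7.72) catches algebra slips; a Python sketch (the helper names, sample points, and truncation level are ours):

```python
import math

def feven(x):
    # Even extension (7.72) on (-3, 3)
    x = abs(x)
    return 1.5 if x < 1.5 else x - 1.5

def series(x, N=2000):
    # Partial sum of (7.82)
    s = 9 / 8
    for n in range(1, N + 1):
        s += (6 / (n * math.pi)) * (
            ((-1) ** n - math.cos(n * math.pi / 2)) / (n * math.pi)
            + 0.5 * math.sin(n * math.pi / 2)
        ) * math.cos(n * math.pi * x / 3)
    return s

# Check at points away from the jump of f at x = 3/2.
for x in (-2.0, 0.5, 2.5):
    print(x, series(x), feven(x))
```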
8. Partial Differential Equations

8.1 Introduction to Basic Classes of PDEs


8.2 Introduction to PDEs
Many physical and geometric problems are modeled by partial differential equations (PDEs). Here the unknown functions depend on more than one variable, for example space and time. In previous chapters we studied ODEs, which have limited use in modeling physical systems; usually they are restricted to the simplest situations, such as spring-mass systems or population dynamics. Now we wish to consider problems from a wider range of fields, such as elasticity, thermodynamics, electrostatics, quantum mechanics, and population dynamics.
Throughout the remainder of the Chapter we focus on initial value problems (IVP) and boundary
value problems (BVP) for common physical systems such as a vibrating string, temperature
distribution in a material, or an elastic membrane.
In this section, we briefly outline the basic concepts involved in studying PDEs and the six
most basic PDEs common to math and physics and describe the physical situations where each
comes about.

8.2.1 Basics of Partial Differential Equations


Definition 8.2.1 A partial differential equation (PDE) is an equation involving one or more partial derivatives of an unknown function that depends on more than one variable.

Definition 8.2.2 The order of a PDE is the order of the highest derivative.

Definition 8.2.3 A PDE is linear if it is of the first degree in the unknown function and its
partial derivatives, otherwise we call it nonlinear.

R The remainder of the course focuses on second order linear PDEs, which have a surprisingly
wide range of applications.

 Example 8.1 Determine if the following PDEs are linear and what their order is:

• $u_{xx} + 2uu_x = 3$
• $u_{xxx} + \sin(u) = 0$
• $u_x + 3u = 5\sin(x)$
• $(u_x)^3 + u_{xx} = x^3$


Definition 8.2.4 We call a linear PDE homogeneous if each of its terms contains either u or
one of its partial derivatives, otherwise we call the equation nonhomogeneous.

 Example 8.2 In the previous example determine which of the equations is homogeneous. 

8.2.2 Laplace’s Equation - Type: Elliptical


∆u = 0 (8.1)
Applications include gravitational potential u in a region with no mass, electrostatic potential in
a charge free region, the steady state temperature distribution in a region without sources or sinks,
or the velocity of an incompressible fluid with no vortices or sinks.

8.2.3 Poisson’s Equation


∆u = f (x, y, z) (8.2)
Applications include everything listed for Laplace's Equation, except that $f$ represents a source term, such as a charge, force, or heat source.

8.2.4 Diffusion/Heat Equation - Type: Parabolic


\[ \Delta u = \frac{1}{\alpha^2}\frac{\partial u}{\partial t} \tag{8.3} \]
The quantity $u$ can represent a non-steady-state temperature distribution in a region without heat sources, or the concentration of a diffusing substance. Here $\alpha^2$ is known as the thermal diffusivity.

8.2.5 Wave Equation - Type: Hyperbolic


\[ \Delta u = \frac{1}{v^2}\frac{\partial^2 u}{\partial t^2} \tag{8.4} \]
The quantity u represents displacement from equilibrium of a vibrating string or membrane, in
electrostatics it can be the current or potential along a transmission line, or u can be the component
of the electric or magnetic field in an electromagnetic wave.

8.2.6 Helmholtz Equation


\[ \Delta F + k^2 F = 0 \tag{8.5} \]
The time-independent form of the wave equation.

8.2.7 Schrödinger Equation


\[ -\frac{\hbar^2}{2m}\Delta\Psi + V\Psi = i\hbar\frac{\partial\Psi}{\partial t} \tag{8.6} \]
The wave equation of quantum mechanics, where $m$ is the particle mass, $\hbar$ is the reduced Planck constant, $i = \sqrt{-1}$, and $V$ is the potential energy of the particle. The wave function $\Psi$ is complex, and its absolute value squared is proportional to the position probability of the particle.

8.2.8 Solutions to PDEs


Definition 8.2.5 A solution of a PDE in some region R of the space of the independent variables
is a function that has all the partial derivatives appearing in the PDE in some domain D containing
R, and satisfies the PDE everywhere in R.

R Solutions to the same equation can look very different. Consider the PDE
\[ \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0. \]
The following five functions are all solutions (verify): 1. $u = x^2 - y^2$, 2. $u = e^x\cos(y)$, 3. $u = \sin(x)\cosh(y)$, 4. $u = 5$, 5. $u = \ln(x^2 + y^2)$. A unique solution is obtained by combining the PDE with initial and/or boundary conditions.

Definition 8.2.6 If a condition prescribes the values of the unknown function $u$ on the boundary of the domain $R$, we call it a boundary condition. When $t$ is one of the variables and we prescribe the unknown function $u$ or its derivatives at time $t = 0$, we call these initial conditions.

Theorem 8.2.1 (Principle of Superposition) If u1 and u2 are solutions of a homogeneous linear


PDE in some region R, then

u = c1 u1 + c2 u2 (8.7)

with any constants c1 , c2 is also a solution of that PDE in the region R.

 Example 8.3 (Similar to ODE) Find solutions u of the PDE uxx − u = 0 where u = u(x, y).

Solution: Since there are no $y$-derivatives, we can solve this PDE as we would $u'' - u = 0$. Using the characteristic equation we find $r^2 - 1 = 0 \Rightarrow r = \pm 1$. Thus, this ODE has solution $u(x) = c_1 e^x + c_2 e^{-x}$ for constants $c_1, c_2$. To solve the PDE we must remember that these constants could also be functions of $y$, so the solution of the PDE is
\[ u(x, y) = c_1(y)e^x + c_2(y)e^{-x} \]
for arbitrary functions $c_1(y), c_2(y)$. 
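We can spot-check this conclusion numerically with arbitrary sample choices for $c_1(y)$ and $c_2(y)$ (ours below), confirming that $u_{xx} - u$ vanishes regardless of the $y$-dependence:

```python
import math

def u(x, y):
    # c1(y) = sin(y) and c2(y) = y**2 are arbitrary sample choices
    return math.sin(y) * math.exp(x) + y**2 * math.exp(-x)

def residual(x, y, h=1e-4):
    # Centered difference for u_xx, then form u_xx - u
    uxx = (u(x + h, y) - 2 * u(x, y) + u(x - h, y)) / h**2
    return uxx - u(x, y)

print(residual(0.4, 1.1))   # near zero
print(residual(-1.0, 2.3))  # near zero
```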

 Example 8.4 (Similar to ODE) Find solutions $u = u(x, y)$ of the PDE $u_{xy} = -u_x$.

Solution: Let $u_x = p$; then $p_y = u_{xy} = -u_x = -p$. Solving this equation for $p$ gives $p = c(x)e^{-y}$; then integrate with respect to $x$ to get $u$:
\[ u(x, y) = f(x)e^{-y} + g(y), \qquad f(x) = \int c(x)dx. \]

8.3 Laplace’s Equations and Steady State Temperature Problems


We will consider the two-dimensional and three-dimensional Laplace Equations
(2D) : uxx + uyy = 0, (8.8)
(3D) : uxx + uyy + uzz = 0. (8.9)

8.3.1 Dirichlet Problem for a Rectangle


We want to find the function u satisfying Laplace’s Equation

uxx + uyy = 0 (8.10)

in the rectangle 0 < x < a, 0 < y < b, and satisfying the boundary conditions
u(x, 0) = 0, u(x, b) = 0, 0 < x < a, (8.11)
u(0, y) = 0, u(a, y) = f (y), 0 ≤ y ≤ b. (8.12)
We need four boundary conditions for the four spatial derivatives.
Start by using Separation of Variables and assume $u(x, y) = X(x)Y(y)$. Substituting $u$ into Equation (8.10) yields
\[ \frac{X''}{X} = -\frac{Y''}{Y} = \lambda, \]
where $\lambda$ is a constant. We obtain the following system of ODEs:
\[ X'' - \lambda X = 0 \tag{8.13} \]
\[ Y'' + \lambda Y = 0. \tag{8.14} \]
From the boundary conditions we find
X(0) = 0 (8.15)
Y (0) = 0,Y (b) = 0. (8.16)
We first solve the ODE for $Y$, which we have seen numerous times before. Using the BCs, we find there are nontrivial solutions if and only if $\lambda$ is an eigenvalue,
\[ \lambda = \left(\frac{n\pi}{b}\right)^2, \quad n = 1, 2, 3, ... \]
with corresponding eigenfunction $Y_n(y) = \sin(\frac{n\pi y}{b})$. Now, substituting in for $\lambda$, we want to solve the ODE for $X$. This is another problem we have seen regularly, and the solution is
\[ X_n(x) = c_1 \cosh\left(\frac{n\pi x}{b}\right) + c_2 \sinh\left(\frac{n\pi x}{b}\right). \]
The BC $X(0) = 0$ implies that $c_1 = 0$. So the fundamental solution to the problem is
\[ u_n(x, y) = \sinh\left(\frac{n\pi x}{b}\right)\sin\left(\frac{n\pi y}{b}\right). \]
By linear superposition the general solution is
\[ u(x, y) = \sum_{n=1}^{\infty} c_n u_n(x, y) = \sum_{n=1}^{\infty} c_n \sinh\left(\frac{n\pi x}{b}\right)\sin\left(\frac{n\pi y}{b}\right). \]
Using the last boundary condition $u(a, y) = f(y)$, solve for the coefficients $c_n$:
\[ u(a, y) = \sum_{n=1}^{\infty} c_n \sinh\left(\frac{n\pi a}{b}\right)\sin\left(\frac{n\pi y}{b}\right) = f(y). \]
Using the Fourier Sine Series coefficients, we find
\[ c_n = \frac{2}{b\sinh(\frac{n\pi a}{b})}\int_{0}^{b} f(y)\sin\left(\frac{n\pi y}{b}\right)dy. \]
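The whole recipe can be exercised numerically. In the sketch below $a$, $b$, and the boundary data $f(y) = y(b - y)$ are our arbitrary choices; the truncated series should reproduce $f$ on the edge $x = a$ while vanishing on $x = 0$:

```python
import math

a, b = 2.0, 1.0
f = lambda y: y * (b - y)        # sample boundary data with f(0) = f(b) = 0

def cn(n, steps=4000):
    # Midpoint rule for the Fourier Sine integral, divided by sinh(n pi a / b)
    h = b / steps
    integral = h * sum(
        f((k + 0.5) * h) * math.sin(n * math.pi * (k + 0.5) * h / b)
        for k in range(steps)
    )
    return 2 * integral / (b * math.sinh(n * math.pi * a / b))

coeffs = [cn(n) for n in range(1, 31)]

def u(x, y):
    return sum(
        c * math.sinh(n * math.pi * x / b) * math.sin(n * math.pi * y / b)
        for n, c in enumerate(coeffs, start=1)
    )

print(u(a, 0.5), f(0.5))  # nonhomogeneous boundary condition recovered
print(u(0.0, 0.5))        # homogeneous edge: exactly 0
```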

8.3.2 Dirichlet Problem For A Circle


Consider solving Laplace’s Equation in a circular region r < a subject to the boundary condition

u(a, θ ) = f (θ )

where $f$ is a given function on $0 \le \theta \le 2\pi$. In polar coordinates Laplace's Equation becomes
\[ u_{rr} + \frac{1}{r}u_r + \frac{1}{r^2}u_{\theta\theta} = 0. \tag{8.17} \]
Try Separation of Variables in Polar Coordinates

\[ u(r, \theta) = R(r)\Theta(\theta), \]
and plug into the differential equation, Equation (8.17). This yields
\[ R''\Theta + \frac{1}{r}R'\Theta + \frac{1}{r^2}R\Theta'' = 0 \]
or
\[ r^2\frac{R''}{R} + r\frac{R'}{R} = -\frac{\Theta''}{\Theta} = \lambda, \]
where λ is a constant. We obtain the following system of ODEs

r2 R00 + rR0 − λ R = 0, (8.18)


00
Θ +λθ = 0. (8.19)

Since we have no homogeneous boundary conditions, we must instead use the fact that the solutions must be bounded and also periodic in $\theta$ with period $2\pi$. It can be shown that we need $\lambda$ to be real. Consider the three cases $\lambda < 0$, $\lambda = 0$, $\lambda > 0$.
If $\lambda < 0$, let $\lambda = -\mu^2$, where $\mu > 0$. The equation for $\Theta$ becomes $\Theta'' - \mu^2\Theta = 0$, so
\[ \Theta(\theta) = c_1 e^{\mu\theta} + c_2 e^{-\mu\theta}. \]
$\Theta$ can only be periodic if $c_1 = c_2 = 0$, so $\lambda$ cannot be negative (since we do not get any nontrivial solutions).
If λ = 0, then the equation for Θ becomes Θ00 = 0 and thus

Θ(θ ) = c1 + c2 θ

For $\Theta$ to be periodic, $c_2 = 0$. Then the equation for $R$ becomes
\[ r^2 R'' + rR' = 0. \]
This is an Euler equation and has solution
\[ R(r) = k_1 + k_2\ln(r). \]
Since we also need the solution bounded as $r \to 0$, we must have $k_2 = 0$. So $u(r, \theta)$ is a constant, and thus proportional to the solution $u_0(r, \theta) = 1$.
If λ > 0, we let λ = µ 2 , where µ > 0. Then the system of equations becomes

\[ r^2 R'' + rR' - \mu^2 R = 0 \tag{8.20} \]
\[ \Theta'' + \mu^2\Theta = 0. \tag{8.21} \]
The equation for $R$ is an Euler equation and has the solution
\[ R(r) = k_1 r^{\mu} + k_2 r^{-\mu}, \]
and the equation for $\Theta$ has the solution
\[ \Theta(\theta) = c_1\sin(\mu\theta) + c_2\cos(\mu\theta). \]
For $\Theta$ to be periodic we need $\mu$ to be a positive integer $n$, so $\mu = n$. The solution $r^{-\mu}$ is unbounded as $r \to 0$, so $k_2 = 0$. So the solutions to the original problem are
\[ u_n(r, \theta) = r^n\cos(n\theta), \quad v_n(r, \theta) = r^n\sin(n\theta), \quad n = 1, 2, 3, ... \]

Together with $u_0(r, \theta) = 1$, by linear superposition we find
\[ u(r, \theta) = \frac{c_0}{2} + \sum_{n=1}^{\infty} r^n\left(c_n\cos(n\theta) + k_n\sin(n\theta)\right). \]
Using the boundary condition from the beginning,
\[ u(a, \theta) = \frac{c_0}{2} + \sum_{n=1}^{\infty} a^n\left(c_n\cos(n\theta) + k_n\sin(n\theta)\right) = f(\theta) \]
for $0 \le \theta \le 2\pi$. We compute the coefficients using our previous Fourier Series equations:
\[ c_n = \frac{1}{\pi a^n}\int_{0}^{2\pi} f(\theta)\cos(n\theta)d\theta, \quad n = 0, 1, 2, ... \tag{8.22} \]
\[ k_n = \frac{1}{\pi a^n}\int_{0}^{2\pi} f(\theta)\sin(n\theta)d\theta, \quad n = 1, 2, 3, ... \tag{8.23} \]
Note we need both terms since sine and cosine terms remain throughout the general solution.
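A quick consistency check of (8.22)-(8.23): if $f(\theta)$ is already a finite trigonometric sum, the integrals must pick out exactly its coefficients. A Python sketch with $f(\theta) = 3\cos(2\theta) - \sin(\theta)$ and $a = 2$ (our choices):

```python
import math

a = 2.0
f = lambda t: 3 * math.cos(2 * t) - math.sin(t)

def coeff(n, kind, steps=10000):
    # Midpoint rule for (1 / (pi a^n)) * integral_0^{2 pi} f(t) cos/sin(n t) dt
    h = 2 * math.pi / steps
    trig = math.cos if kind == "cos" else math.sin
    return sum(f((k + 0.5) * h) * trig(n * (k + 0.5) * h)
               for k in range(steps)) * h / (math.pi * a**n)

print(coeff(2, "cos") * a**2)  # recovers the cos(2 theta) amplitude, 3
print(coeff(1, "sin") * a)     # recovers the sin(theta) amplitude, -1
print(coeff(3, "cos") * a**3)  # absent mode: 0
```

The factors $a^n$ cancel when the series is evaluated on the boundary $r = a$, which is exactly why they appear in (8.22)-(8.23).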
 Example 8.5 Find the solution u(x, y) of Laplace’s Equation in the rectangle 0 < x < a, 0 < y < b,
that satisfies the boundary conditions

u(0, y) = 0, u(a, y) = 0, 0<y<b (8.24)


u(x, 0) = h(x), u(x, b) = 0, 0≤x≤a (8.25)

Solution: Using the method of Separation of Variables, write $u(x, y) = X(x)Y(y)$. We get the following system of ODEs:
\[ X'' + \lambda X = 0, \quad X(0) = X(a) = 0 \tag{8.26} \]
\[ Y'' - \lambda Y = 0, \quad Y(b) = 0 \tag{8.27} \]
It follows that $\lambda_n = (\frac{n\pi}{a})^2$ and $X_n(x) = \sin(\frac{n\pi x}{a})$. The solution of the second ODE gives
\[ Y(y) = d_1\cosh\left(\frac{n\pi(b - y)}{a}\right) + d_2\sinh\left(\frac{n\pi(b - y)}{a}\right). \]
Using $Y(b) = 0$, we find that $d_1 = 0$. Therefore the fundamental solutions are
\[ u_n(x, y) = \sin\left(\frac{n\pi x}{a}\right)\sinh\left(\frac{n\pi(b - y)}{a}\right), \]
and the general solution is
\[ u(x, y) = \sum_{n=1}^{\infty} c_n\sin\left(\frac{n\pi x}{a}\right)\sinh\left(\frac{n\pi(b - y)}{a}\right). \]

Using the remaining boundary condition,
\[ h(x) = \sum_{n=1}^{\infty} c_n\sin\left(\frac{n\pi x}{a}\right)\sinh\left(\frac{n\pi b}{a}\right). \]
The coefficients are calculated using the equation from the Fourier Sine Series:
\[ c_n = \frac{2}{a\sinh(\frac{n\pi b}{a})}\int_{0}^{a} h(x)\sin\left(\frac{n\pi x}{a}\right)dx. \]


 Example 8.6 Consider the problem of finding a solution u(x, y) of Laplace’s Equation in the
rectangle 0 < x < a, 0 < y < b, that satisfies the boundary conditions

ux (0, y) = 0, ux (a, y) = f (y), 0 < y < b, (8.28)


uy (x, 0) = 0, uy (x, b) = 0, 0≤x≤a (8.29)

This is an example of a Neumann Problem. We want to find the fundamental set of solutions. Separation of variables gives
\[ X'' - \lambda X = 0, \quad X'(0) = 0 \tag{8.30} \]
\[ Y'' + \lambda Y = 0, \quad Y'(0) = Y'(b) = 0. \tag{8.31} \]
The solution to the equation for $Y$ is
\[ Y(y) = c_1\cos(\lambda^{1/2}y) + c_2\sin(\lambda^{1/2}y), \]
with $Y'(y) = -c_1\lambda^{1/2}\sin(\lambda^{1/2}y) + c_2\lambda^{1/2}\cos(\lambda^{1/2}y)$. Using the boundary conditions, we find $c_2 = 0$, and the eigenvalues are $\lambda_n = \frac{n^2\pi^2}{b^2}$ for $n = 0, 1, 2, ...$ (the case $n = 0$ gives the constant eigenfunction). The corresponding eigenfunctions are $Y_n(y) = \cos(\frac{n\pi y}{b})$. The solution of the equation for $X$ becomes $X(x) = d_1\cosh(\frac{n\pi x}{b}) + d_2\sinh(\frac{n\pi x}{b})$, with
\[ X'(x) = d_1\frac{n\pi}{b}\sinh\left(\frac{n\pi x}{b}\right) + d_2\frac{n\pi}{b}\cosh\left(\frac{n\pi x}{b}\right). \]
Using the boundary condition $X'(0) = 0$, $X(x) = d_1\cosh(\frac{n\pi x}{b})$. So the fundamental set of solutions is
\[ u_n(x, y) = \cosh\left(\frac{n\pi x}{b}\right)\cos\left(\frac{n\pi y}{b}\right), \quad n = 0, 1, 2, ... \]
The general solution is given by
\[ u(x, y) = \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n\cosh\left(\frac{n\pi x}{b}\right)\cos\left(\frac{n\pi y}{b}\right). \]


8.4 Heat Equation and Schrödinger Equation


We will soon see that partial differential equations can be far more complicated than ordinary differential equations. For PDEs there is no general theory; the methods need to be adapted to smaller groups of equations. This course gives only an introduction; you can find out much more in advanced courses. We will focus on a single solution method called Separation of Variables, which is pervasive in engineering and mathematics.

Figure 8.1: Heat Flux across the boundary of a small slab with length ∆x. The graph is the graph
of temperature at a given time t. In accordance with Fourier’s Law, the heat leaves or enters the
boundary by flowing from hot to cold; hence at x the flux is opposing the sign of ux , while at x + ∆x
it is agreeing.

The first partial differential equation to consider is the famous heat equation which models the
temperature distribution in some object. We will focus on the one-dimensional heat equation, where
we want to find the temperature distributions in a one-dimensional bar of length l. In particular we
will assume that our bar corresponds to the interval (0, l) on the real line.
This assumption is made purely for simplicity. For a real bar, the one-dimensional assumption is equivalent to assuming that at every lateral cross-section and every instant of time the temperature is constant. While unrealistic, this is not a terrible assumption: if the length is much larger than the width, the width is such a small fraction of the length that it can be taken to be zero. We also assume the bar is perfectly insulated, so the only way heat can enter or leave the bar is through the ends $x = 0$ and $x = l$. So any heat transfer will be one-dimensional.

8.4.1 Derivation of the Heat Equation


Many PDEs come from basic physical laws. Let u(x,t) denote the temperature at a point x at time t.
c will be the specific heat of the material the bar is made from (which is the amount of heat needed
to raise one unit of mass of this material by one temperature unit) and ρ is the density of the rod.
Note that in general the specific heat and density of the rod do not have to be constants; they may vary with $x$. We greatly simplify the problem by taking them to be constant.
Let’s consider a small slab of length ∆x. We will let H(t) be the amount of heat contained in
this slab. The mass of the slab is ρ∆x and the heat energy contained in this small region is given by

H(t) = cuρ∆x (8.32)

On the other hand, within the slab, heat will flow from hot to cold (this is Fourier’s Law). The
only way heat can leave is by leaving through the boundaries, which are at x and x + ∆x (This is
the Law of Conservation of Energy). So the change of heat energy of the slab is equal to the heat
flux across the boundary. If κ is the conductivity of the bar’s material

\[ \frac{dH}{dt} = \kappa u_x(x + \Delta x, t) - \kappa u_x(x, t). \]
This is illustrated in Figure 8.1. Setting the derivative of $H(t)$ from above equal to the previous equation, we find
\[ (cu(x, t)\rho\Delta x)_t = \kappa u_x(x + \Delta x, t) - \kappa u_x(x, t) \]
or
\[ c\rho u_t(x, t) = \frac{\kappa u_x(x + \Delta x, t) - \kappa u_x(x, t)}{\Delta x}. \]

Figure 8.2: Temperature versus position on a bar. The arrows show time dependence in accordance with the heat equation. Where the temperature graph is concave up, on the left, the bar is warming up; where the temperature graph is concave down, on the right, the bar is cooling down.

If we take the limit as $\Delta x \to 0$, the right-hand side is just the $x$-derivative of $\kappa u_x(x, t)$, or
\[ c\rho u_t(x, t) = \kappa u_{xx}(x, t). \]
Setting $k = \frac{\kappa}{c\rho} > 0$, we have the heat equation
\[ u_t = k u_{xx}. \]

Notice that the heat equation is a linear PDE, since all of the derivatives of u are only multiplied
by constants. What is the constant k? It is called the Thermal Diffusivity of the bar and is a
measure of how quickly heat spreads through a given material.
How do we interpret the heat equation? Graph the temperature of the bar at a fixed time, and suppose it looks like Figure 8.2. On the left side the graph is concave up, which means that the second derivative of the temperature (with respect to position $x$) is positive. The heat equation then tells us that the time derivative of the temperature at any of the points on the left side of the bar is positive: the left side of the bar is warming up. Similarly, on the right side of the bar the graph is concave down. Thus the second $x$-derivative of the temperature is negative, and so is the first $t$-derivative, and we can conclude that the right side of the bar is cooling down.
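This interpretation is exactly what an explicit finite-difference scheme implements: replace $u_{xx}$ by a centered difference and step forward in time, so concave-up regions warm and concave-down regions cool. A minimal sketch (grid sizes and the initial profile $\sin(\pi x/l)$ are our choices), compared against the separated solution $e^{-\pi^2 kt}\sin(\pi x/l)$:

```python
import math

k, l, nx = 1.0, 1.0, 50
dx = l / nx
dt = 0.4 * dx**2 / k   # explicit scheme is stable for dt <= dx^2 / (2k)
u = [math.sin(math.pi * i * dx / l) for i in range(nx + 1)]  # u = 0 at both ends

def step(u):
    new = u[:]
    for i in range(1, nx):
        # u_t = k u_xx with a centered second difference
        new[i] = u[i] + k * dt / dx**2 * (u[i + 1] - 2 * u[i] + u[i - 1])
    return new

for _ in range(200):
    u = step(u)

t = 200 * dt
exact = math.exp(-math.pi**2 * k * t) * math.sin(math.pi * 0.5)
print(u[nx // 2], exact)  # the two agree closely
```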

8.5 Separation of Variables and Heat Equation IVPs


8.5.1 Initial Value Problems
Partial differential equations generally have many solutions. To specify a unique one, we need additional conditions. These conditions are motivated by physics and are initial or boundary conditions. An IVP for a PDE consists of the heat equation, initial conditions, and boundary conditions.
An initial condition specifies the physical state at a given time $t_0$. For example, an initial condition for the heat equation would be the starting temperature distribution
\[ u(x, 0) = f(x). \]
This is the only condition required because the heat equation is first order with respect to time. The wave equation, considered in a future section, is second order in time and needs two initial conditions.
PDEs are only valid on a given domain, and boundary conditions specify how the solution behaves on the boundary of that domain. These need to be specified because the solution is not defined beyond the boundary, so we might have problems with differentiability there.
Our heat equation was derived for a one-dimensional bar of length l, so the relevant domain in
question can be taken to be the interval 0 < x < l and the boundary consists of the two points x = 0

and x = l. We could have derived a two-dimensional heat equation, for example, in which case the
domain would be some region in the xy-plane with the boundary being some closed curve.
It will be clear from the physical description of the problem what the appropriate boundary conditions are. We might know that at the endpoints $x = 0$ and $x = l$ the temperatures $u(0, t)$ and $u(l, t)$ are fixed. Boundary conditions that give the value of the solution are called Dirichlet Boundary Conditions. Or we might insulate the ends of the bar, meaning there should be no heat flow out of the boundary; this yields the boundary conditions $u_x(0, t) = u_x(l, t) = 0$. Boundary conditions that specify the derivative at the boundary are called Neumann Conditions. The boundary conditions might also specify that one end is insulated while at the other we control the temperature; this is an example of a Mixed Boundary Condition.
As we have seen, changing boundary conditions can significantly change the solution. Initially, we will work with homogeneous Dirichlet conditions $u(0, t) = u(l, t) = 0$, giving us the following initial value problem:
(DE) : ut = kuxx (8.33)
(BC) : u(0,t) = u(l,t) = 0 (8.34)
(IC) : u(x, 0) = f (x) (8.35)
After we have seen the general method, we will see what happens with homogeneous Neumann
conditions. We will discuss nonhomogeneous equations later.

8.5.2 Separation of Variables


Above we derived the heat equation for a bar of length $l$. Suppose we have an initial value problem such as Equations (8.33)-(8.35). How should we proceed? We want to build a general solution out of smaller solutions which are easier to find.
We start by assuming we have a separated solution, where

u(x,t) = X(x)T (t).

Our solution is the product of a function that depends only on x and a function that depends only on
t. We can then try to write down an equation depending only on x and another solution depending
only on t before using our knowledge of ODEs to try and solve them.
It should be noted that this is a very special situation and will not occur in general. Even when
we can use it sometimes it is hard to move beyond the first step. However, it works for all equations
we will be considering in this class, and is a good starting point.
How does this method work? Plug the separated solution into the heat equation:
\[ \frac{\partial}{\partial t}[X(x)T(t)] = k\frac{\partial^2}{\partial x^2}[X(x)T(t)] \tag{8.36} \]
\[ X(x)T'(t) = kX''(x)T(t). \tag{8.37} \]
Now notice that we can move everything depending on $x$ to one side and everything depending on $t$ to the other:
\[ \frac{T'(t)}{kT(t)} = \frac{X''(x)}{X(x)}. \]
This equation says that both sides are equal for any $x$ and $t$ we choose. Thus both must equal a constant: if either side depended on $x$ or $t$, the two sides could not agree for all $x$ and $t$. So
\[ \frac{T'(t)}{kT(t)} = \frac{X''(x)}{X(x)} = -\lambda. \]

We have written the minus sign for convenience. It will turn out that λ > 0.
The equation above contains a pair of separate ordinary differential equations

\[ X'' + \lambda X = 0 \tag{8.38} \]
\[ T' + \lambda kT = 0. \tag{8.39} \]

Notice that our boundary conditions become $X(0) = 0$ and $X(l) = 0$. The second equation can easily be solved, since we have $T' = -\lambda kT$, so that
\[ T(t) = Ae^{-\lambda kt}. \]

The first equation gives a boundary value problem

\[ X'' + \lambda X = 0 \quad X(0) = 0 \quad X(l) = 0. \]

This should look familiar: it is the basic eigenfunction problem we have studied before. As in that example, it turns out our eigenvalues have to be positive. Letting $\lambda = \mu^2$ for $\mu > 0$, our general solution is
\[ X(x) = B\cos(\mu x) + C\sin(\mu x). \]
The first boundary condition says $B = 0$. The second condition says that $X(l) = C\sin(\mu l) = 0$. To avoid only having the trivial solution, we must have $\mu l = n\pi$. In other words,
\[ \lambda_n = \left(\frac{n\pi}{l}\right)^2 \quad \text{and} \quad X_n(x) = \sin\left(\frac{n\pi x}{l}\right) \]
for $n = 1, 2, 3, ...$
So we end up having found infinitely many solutions to our boundary value problem, one for each positive integer $n$. They are
\[ u_n(x, t) = A_n e^{-(\frac{n\pi}{l})^2 kt}\sin\left(\frac{n\pi x}{l}\right). \]
The heat equation is linear and homogeneous, so the Principle of Superposition still holds: a linear combination of solutions is again a solution. So any function of the form
\[ u(x, t) = \sum_{n=1}^{N} A_n e^{-(\frac{n\pi}{l})^2 kt}\sin\left(\frac{n\pi x}{l}\right) \tag{8.40} \]
is also a solution to our problem.


Notice we have not used our initial condition yet. We have
\[ f(x) = u(x, 0) = \sum_{n=1}^{N} A_n\sin\left(\frac{n\pi x}{l}\right). \]
So if our initial condition has this form, the superposition (8.40) is in a good form to use the IC, the coefficients $A_n$ just being the associated coefficients from $f(x)$.
 Example 8.7 Find the solutions to the following heat equation problem on a rod of length 2.

\[ u_t = u_{xx} \tag{8.41} \]
\[ u(0, t) = u(2, t) = 0 \tag{8.42} \]
\[ u(x, 0) = \sin\left(\frac{3\pi x}{2}\right) - 5\sin(3\pi x) \tag{8.43} \]

In this problem, we have k = 1. Now we know that our solution will have the form like
Equation (8.40), since our initial condition is just the difference of two sine functions. We just need
to figure out which terms are represented and what the coefficients An are.
Our initial condition is
\[ f(x) = \sin\left(\frac{3\pi x}{2}\right) - 5\sin(3\pi x). \]
Looking at (8.40) with $l = 2$, we can see that the first term corresponds to $n = 3$ and the second to $n = 6$, and there are no other terms. Thus we have $A_3 = 1$ and $A_6 = -5$, and all other $A_n = 0$. Our solution is then
\[ u(x, t) = e^{-\frac{9\pi^2}{4}t}\sin\left(\frac{3\pi x}{2}\right) - 5e^{-9\pi^2 t}\sin(3\pi x). \]


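A final numerical check (a sketch) that this solution satisfies the PDE and the boundary conditions, using finite differences for the derivatives:

```python
import math

def u(x, t):
    # Solution from Example 8.7
    return (math.exp(-9 * math.pi**2 / 4 * t) * math.sin(3 * math.pi * x / 2)
            - 5 * math.exp(-9 * math.pi**2 * t) * math.sin(3 * math.pi * x))

def residual(x, t, h=1e-4):
    # u_t - u_xx should vanish for a solution of the heat equation with k = 1
    ut = (u(x, t + h) - u(x, t - h)) / (2 * h)
    uxx = (u(x + h, t) - 2 * u(x, t) + u(x - h, t)) / h**2
    return ut - uxx

print(residual(0.7, 0.05))       # PDE residual: near zero
print(u(0.0, 0.1), u(2.0, 0.1))  # boundary values: both near zero
```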
There is no reason to suppose that our initial distribution is a finite sum of sine functions.
Physically, such situations are special. What do we do if we have a more general initial temperature
distribution?
Let's consider what happens if we take an infinite sum of our separated solutions. Then our solution is
\[ u(x, t) = \sum_{n=1}^{\infty} A_n e^{-(\frac{n\pi}{l})^2 kt}\sin\left(\frac{n\pi x}{l}\right). \]
Now the initial condition gives
\[ f(x) = \sum_{n=1}^{\infty} A_n\sin\left(\frac{n\pi x}{l}\right). \]

This idea is due to the French mathematician Joseph Fourier, and the expansion is called the Fourier Sine Series of $f(x)$.
There are several important questions that arise. Why should we believe that our initial condition $f(x)$ ought to be expressible as an infinite sum of sines? Why should we believe that such a sum would converge to anything?

8.5.3 Neumann Boundary Conditions


Now let’s consider a heat equation problem with homogeneous Neumann conditions

(DE) : ut = uxx (8.44)


(BC) : ux (0,t) = ux (l,t) = 0 (8.45)
(IC) : u(x, 0) = f (x) (8.46)

We will start by again supposing that our solution to Equation (8.44) is separable, so we have
u(x,t) = X(x)T (t) and we obtain a pair of ODEs, which are the same as before

\[ X'' + \lambda X = 0 \tag{8.47} \]
\[ T' + \lambda kT = 0. \tag{8.48} \]

The solution to the second equation is still
\[ T(t) = Ae^{-\lambda kt}. \]

Now we need to determine the boundary conditions for the second equation. Our boundary conditions are on $u_x(0, t)$ and $u_x(l, t)$; thus they are conditions on $X'(0)$ and $X'(l)$, since the $t$-derivative is not controlled at all. So we have the boundary value problem
\[ X'' + \lambda X = 0 \quad X'(0) = 0 \quad X'(l) = 0. \]
Along the lines of the analogous computation last lecture, this has eigenvalues and eigenfunctions
\[ \lambda_n = \left(\frac{n\pi}{l}\right)^2 \tag{8.49} \]
\[ y_n(x) = \cos\left(\frac{n\pi x}{l}\right) \tag{8.50} \]
for $n = 0, 1, 2, ...$ So the individual solutions to Equation (8.44) have the form
\[ u_n(x, t) = A_n e^{-(\frac{n\pi}{l})^2 kt}\cos\left(\frac{n\pi x}{l}\right). \]
Taking finite linear combinations of these works similarly to the Dirichlet case (and gives the solution to Equation (8.44) when $f(x)$ is a finite linear combination of constants and cosines), but in general we are interested in knowing when we can take infinite sums, i.e.
\[ u(x, t) = \frac{1}{2}A_0 + \sum_{n=1}^{\infty} A_n e^{-(\frac{n\pi}{l})^2 kt}\cos\left(\frac{n\pi x}{l}\right). \]
Notice how we wrote the $n = 0$ case as $\frac{1}{2}A_0$; the reason will be clear when talking about Fourier Series. The initial condition means we need
\[ f(x) = \frac{1}{2}A_0 + \sum_{n=1}^{\infty} A_n\cos\left(\frac{n\pi x}{l}\right). \]

An expression of the form above is called the Fourier Cosine Series of f (x).

8.5.4 Other Boundary Conditions


It is also possible for certain boundary conditions to require the "full" Fourier Series of the initial data, which is an expression of the form
\[ f(x) = \frac{1}{2}A_0 + \sum_{n=1}^{\infty}\left[A_n\cos\left(\frac{n\pi x}{l}\right) + B_n\sin\left(\frac{n\pi x}{l}\right)\right], \]

but in most cases we will work with Dirichlet or Neumann conditions. However, in the process of
learning about Fourier sine and cosine series, we will also learn how to compute the full Fourier
series of a function.

8.6 Heat Equation Problems


In the previous lecture on the Heat Equation, we saw that the product solutions to the heat equation with homogeneous Dirichlet boundary conditions,
\[ u_t = ku_{xx} \tag{8.51} \]
\[ u(0, t) = u(l, t) = 0 \tag{8.52} \]
\[ u(x, 0) = f(x), \tag{8.53} \]
had the form
\[ u_n(x, t) = B_n e^{-(\frac{n\pi}{l})^2 kt}\sin\left(\frac{n\pi x}{l}\right), \quad n = 1, 2, 3, ... \]

Taking linear combinations of these (over each $n$) gives a general solution to the above problem,
\[ u(x, t) = \sum_{n=1}^{\infty} B_n e^{-(\frac{n\pi}{l})^2 kt}\sin\left(\frac{n\pi x}{l}\right). \tag{8.54} \]
Setting $t = 0$, this implies that we must have
\[ f(x) = \sum_{n=1}^{\infty} B_n\sin\left(\frac{n\pi x}{l}\right). \]
In other words, the coefficients in the general solution for the given initial condition are the Fourier Sine coefficients of $f(x)$ on $(0, l)$, which are given by
\[ B_n = \frac{2}{l}\int_{0}^{l} f(x)\sin\left(\frac{n\pi x}{l}\right)dx. \]
We also saw that if we instead have a problem with homogeneous Neumann boundary conditions

u_t = k u_{xx}, \quad 0 < x < l, \; t > 0 \tag{8.55}
u_x(0,t) = u_x(l,t) = 0 \tag{8.56}
u(x,0) = f(x) \tag{8.57}

the product solutions had the form


u_n(x,t) = A_n e^{-\left(\frac{n\pi}{l}\right)^2 kt} \cos\left(\frac{n\pi x}{l}\right), \quad n = 1, 2, 3, ...

and the general solution has the form

u(x,t) = \frac{1}{2}A_0 + \sum_{n=1}^{\infty} A_n e^{-\left(\frac{n\pi}{l}\right)^2 kt} \cos\left(\frac{n\pi x}{l}\right). \tag{8.58}

With t = 0 this means that the initial condition must satisfy



f(x) = \frac{1}{2}A_0 + \sum_{n=1}^{\infty} A_n \cos\left(\frac{n\pi x}{l}\right),

and so the coefficients for a particular initial condition are the Fourier Cosine coefficients of f (x),
given by
A_n = \frac{2}{l} \int_0^l f(x) \cos\left(\frac{n\pi x}{l}\right) dx.
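Both sets of Euler-Fourier integrals are easy to sanity-check numerically. The sketch below (the function names and the test function f(x) = x are our own choices, not from the text) compares midpoint-rule quadrature against the closed forms B_n = 2l(−1)^{n+1}/(nπ) and A_n = 2l((−1)^n − 1)/(n²π²) for f(x) = x on (0, l):

```python
import math

def sine_coeff(f, n, l, m=20000):
    """B_n = (2/l) * integral_0^l f(x) sin(n pi x / l) dx, by the midpoint rule."""
    h = l / m
    return (2 / l) * sum(f((i + 0.5) * h) * math.sin(n * math.pi * (i + 0.5) * h / l)
                         for i in range(m)) * h

def cosine_coeff(f, n, l, m=20000):
    """A_n = (2/l) * integral_0^l f(x) cos(n pi x / l) dx, by the midpoint rule."""
    h = l / m
    return (2 / l) * sum(f((i + 0.5) * h) * math.cos(n * math.pi * (i + 0.5) * h / l)
                         for i in range(m)) * h

l = 2.0
for n in (1, 2, 3):
    Bn = 2 * l * (-1) ** (n + 1) / (n * math.pi)          # closed form for f(x) = x
    An = 2 * l * ((-1) ** n - 1) / (n * math.pi) ** 2     # closed form for f(x) = x
    assert abs(sine_coeff(lambda x: x, n, l) - Bn) < 1e-6
    assert abs(cosine_coeff(lambda x: x, n, l) - An) < 1e-6
```

The same two helpers can be reused to check any of the worked examples that follow.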
One way to think about this difference is that given the initial data u(x, 0) = f (x), the Dirichlet
conditions specify the odd extension of f (x) as the desired periodic solution, while the Neumann
conditions specify the even extension. This should make sense since odd functions must have
f(0) = 0, while even functions must have f'(0) = 0.
So to solve a homogeneous heat equation problem, we begin by identifying the type of boundary
conditions we have. If we have Dirichlet conditions, we know our solution will have the form of
Equation (8.54). All we then have to do is compute the Fourier Sine coefficients of f (x). Similarly,
if we have Neumann conditions, we know the solution has the form of Equation (8.58) and we
have to compute the Fourier Cosine coefficients of f (x).

R Observe that for any homogeneous Dirichlet problem, the temperature distribution (8.54)
will go to 0 as t → ∞. This should make sense because these boundary conditions have a
physical interpretation where we keep the ends of our rod at freezing temperature without
regulating the heat flow in and out of the endpoints. As a result, if the interior of the rod is
initially above freezing, that heat will radiate towards the endpoints and into our reservoirs at
the endpoints. On the other hand, if the interior of the rod is below freezing, heat will come
from the reservoirs at the endpoints and warm it up until the temperature is uniform.
For the Neumann problem, the temperature distribution (8.58) will converge to (1/2)A_0. Again,
this should make sense because these boundary conditions correspond to a situation where
we have insulated ends, since we are preventing any heat from escaping the bar. Thus all heat
energy will move around inside the rod until the temperature is uniform.

8.6.1 Examples
 Example 8.8 Solve the initial value problem

u_t = 3u_{xx}, \quad 0 < x < 2, \; t > 0 \tag{8.59}
u(0,t) = u(2,t) = 0 \tag{8.60}
u(x,0) = 20 \tag{8.61}

This problem has homogeneous Dirichlet conditions, so by (8.54) our general solution is

u(x,t) = \sum_{n=1}^{\infty} B_n e^{-3\left(\frac{n\pi}{2}\right)^2 t} \sin\left(\frac{n\pi x}{2}\right).

The coefficients for the particular solution are the Fourier Sine coefficients of u(x, 0) = 20, so we
have

B_n = \frac{2}{2} \int_0^2 20 \sin\left(\frac{n\pi x}{2}\right) dx \tag{8.62}
    = \frac{40}{n\pi} \left[ -\cos\left(\frac{n\pi x}{2}\right) \right]_0^2 \tag{8.63}
    = -\frac{40}{n\pi} \left( \cos(n\pi) - \cos(0) \right) \tag{8.64}
    = \frac{40}{n\pi} \left( 1 + (-1)^{n+1} \right) \tag{8.65}
and the solution to the problem is

u(x,t) = \frac{40}{\pi} \sum_{n=1}^{\infty} \frac{1 + (-1)^{n+1}}{n} e^{-\frac{3n^2\pi^2}{4} t} \sin\left(\frac{n\pi x}{2}\right).
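As a quick numerical sanity check (a stdlib-only sketch; the helper name u is ours), the partial sums of this series reproduce the constant initial temperature in the interior, vanish at the frozen ends, and decay to zero as t grows:

```python
import math

def u(x, t, N=2000):
    """Partial sum of the series solution to Example 8.8."""
    return sum(40 / (n * math.pi) * (1 + (-1) ** (n + 1))
               * math.exp(-3 * (n * math.pi / 2) ** 2 * t)
               * math.sin(n * math.pi * x / 2) for n in range(1, N + 1))

assert abs(u(1.0, 0.0) - 20) < 0.1   # recovers u(x,0) = 20 in the interior
assert abs(u(0.0, 0.5)) < 1e-9       # boundary condition u(0,t) = 0
assert abs(u(1.0, 2.0)) < 1e-4       # the rod cools toward 0 as t grows
```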


 Example 8.9 Solve the initial value problem

u_t = 3u_{xx}, \quad 0 < x < 2, \; t > 0 \tag{8.66}
u_x(0,t) = u_x(2,t) = 0 \tag{8.67}
u(x,0) = 3x \tag{8.68}

This problem has homogeneous Neumann conditions, so by (8.58) our general solution is

u(x,t) = \frac{1}{2}A_0 + \sum_{n=1}^{\infty} A_n e^{-3\left(\frac{n\pi}{2}\right)^2 t} \cos\left(\frac{n\pi x}{2}\right).

The coefficients for the particular solution are the Fourier Cosine coefficients of u(x, 0) = 3x, so we
have

A_0 = \frac{2}{2} \int_0^2 3x\, dx = 6 \tag{8.69}

A_n = \frac{2}{2} \int_0^2 3x \cos\left(\frac{n\pi x}{2}\right) dx \tag{8.70}
    = \left[ \frac{6x}{n\pi} \sin\left(\frac{n\pi x}{2}\right) + \frac{12}{n^2\pi^2} \cos\left(\frac{n\pi x}{2}\right) \right]_0^2 \tag{8.71}
    = \frac{12}{n^2\pi^2} \left( \cos(n\pi) - 1 \right) \tag{8.72}
    = \frac{12}{n^2\pi^2} \left( (-1)^n - 1 \right) \tag{8.73}

and the solution to the problem is

u(x,t) = 3 + \frac{12}{\pi^2} \sum_{n=1}^{\infty} \frac{(-1)^n - 1}{n^2} e^{-\frac{3n^2\pi^2}{4} t} \cos\left(\frac{n\pi x}{2}\right).
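A numerical cross-check (our own sketch): build the coefficients directly from the defining integral A_n = (2/2)∫₀² 3x cos(nπx/2) dx by midpoint quadrature, and verify that the resulting cosine series reproduces u(x, 0) = 3x and relaxes to (1/2)A_0 = 3:

```python
import math

def An(n, m=4000):
    """(2/2) * integral_0^2 3x cos(n pi x / 2) dx by the midpoint rule (A_0 = 6)."""
    h = 2 / m
    return sum(3 * (i + 0.5) * h * math.cos(n * math.pi * (i + 0.5) * h / 2)
               for i in range(m)) * h

A = [An(n) for n in range(201)]  # A[0] should come out as 6

def u(x, t):
    return 0.5 * A[0] + sum(A[n] * math.exp(-3 * (n * math.pi / 2) ** 2 * t)
                            * math.cos(n * math.pi * x / 2) for n in range(1, 201))

for x in (0.5, 1.0, 1.5):
    assert abs(u(x, 0.0) - 3 * x) < 0.02   # series matches the initial data
assert abs(u(1.0, 50.0) - 3.0) < 1e-9      # long-time limit is (1/2)A_0 = 3
```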

 Example 8.10 Solve the initial value problem

u_t = 4u_{xx}, \quad 0 < x < 2\pi, \; t > 0 \tag{8.74}
u(0,t) = u(2\pi,t) = 0 \tag{8.75}
u(x,0) = \begin{cases} 1 & 0 < x < \pi \\ x & \pi < x < 2\pi \end{cases} \tag{8.76}

This problem has homogeneous Dirichlet conditions, so our general solution is



u(x,t) = \sum_{n=1}^{\infty} B_n e^{-n^2 t} \sin\left(\frac{nx}{2}\right).

The coefficients for the particular solution are the Fourier Sine coefficients of u(x, 0), so we have

B_n = \frac{2}{2\pi} \left[ \int_0^{\pi} \sin\left(\frac{nx}{2}\right) dx + \int_{\pi}^{2\pi} x \sin\left(\frac{nx}{2}\right) dx \right] \tag{8.77}
    = -\frac{2}{n\pi} \cos\left(\frac{nx}{2}\right)\Big|_0^{\pi} - \frac{2x}{n\pi} \cos\left(\frac{nx}{2}\right)\Big|_{\pi}^{2\pi} + \frac{4}{n^2\pi} \sin\left(\frac{nx}{2}\right)\Big|_{\pi}^{2\pi} \tag{8.78}
    = -\frac{2}{n\pi}\left(\cos\left(\frac{n\pi}{2}\right) - \cos(0)\right) - \frac{4}{n}\cos(n\pi) + \frac{2}{n}\cos\left(\frac{n\pi}{2}\right) - \frac{4}{n^2\pi}\sin\left(\frac{n\pi}{2}\right) \tag{8.79}
    = -\frac{2}{n\pi}\left(\cos\left(\frac{n\pi}{2}\right) - 1\right) + \frac{4}{n}(-1)^{n+1} + \frac{2}{n}\cos\left(\frac{n\pi}{2}\right) - \frac{4}{n^2\pi}\sin\left(\frac{n\pi}{2}\right) \tag{8.80}
    = \frac{2}{n}\left[ -\frac{1}{\pi}\left(\cos\left(\frac{n\pi}{2}\right) - 1\right) + 2(-1)^{n+1} + \cos\left(\frac{n\pi}{2}\right) - \frac{2}{n\pi}\sin\left(\frac{n\pi}{2}\right) \right] \tag{8.81}

and the solution to the problem is

u(x,t) = 2 \sum_{n=1}^{\infty} \frac{1}{n}\left[ -\frac{1}{\pi}\left(\cos\left(\frac{n\pi}{2}\right) - 1\right) + 2(-1)^{n+1} + \cos\left(\frac{n\pi}{2}\right) - \frac{2}{n\pi}\sin\left(\frac{n\pi}{2}\right) \right] e^{-n^2 t} \sin\left(\frac{nx}{2}\right).
n=1 n π 2 2 nπ 2 2



8.7 Other Boundary Conditions


So far, we have used the technique of separation of variables to produce solutions to the heat
equation

ut = kuxx

on 0 < x < l with either homogeneous Dirichlet boundary conditions [u(0,t) = u(l,t) = 0] or
homogeneous Neumann boundary conditions [u_x(0,t) = u_x(l,t) = 0]. What about other
physically relevant boundary conditions?

8.7.1 Mixed Homogeneous Boundary Conditions


We could have the following boundary conditions

u(0,t) = ux (l,t) = 0

Physically, this might correspond to keeping the end of the rod where x = 0 in a bowl of ice water,
while the other end is insulated.
Use Separation of Variables: let u(x,t) = X(x)T(t), and we get the pair of ODEs

T' = -k\lambda T \tag{8.82}
X'' = -\lambda X. \tag{8.83}

Thus

T(t) = Be^{-k\lambda t}.

We now have a boundary value problem for X to deal with, where the boundary conditions are
X(0) = X'(l) = 0. There are only positive eigenvalues, which are given by

\lambda_n = \left( \frac{(2n-1)\pi}{2l} \right)^2

and their associated eigenfunctions are

X_n(x) = \sin\left( \frac{(2n-1)\pi x}{2l} \right).

The separated solutions are then given by

u_n(x,t) = B_n e^{-\left(\frac{(2n-1)\pi}{2l}\right)^2 kt} \sin\left( \frac{(2n-1)\pi x}{2l} \right)
and the general solution is

u(x,t) = \sum_{n=1}^{\infty} B_n e^{-\left(\frac{(2n-1)\pi}{2l}\right)^2 kt} \sin\left( \frac{(2n-1)\pi x}{2l} \right). \tag{8.84}

With an initial condition u(x, 0) = f(x), we have that

f(x) = \sum_{n=1}^{\infty} B_n \sin\left( \frac{(2n-1)\pi x}{2l} \right).

This is an example of a specialized sort of Fourier Series; the coefficients are given by

B_n = \frac{2}{l} \int_0^l f(x) \sin\left( \frac{(2n-1)\pi x}{2l} \right) dx.

R The convergence for a series like the one above is different than that of our standard Fourier
Sine or Cosine series, which converge to the periodic extension of the odd or even extensions
of the original function, respectively. Notice that the terms in the sum above are periodic with
period 4l (as opposed to the 2l-periodic series we have seen before). In this case, we need to
first extend our function f (x), given on (0, l), to a function on (0, 2l) symmetric around x = l.
Then, as our terms are all sines, the convergence on (−2l, 2l) will be to the odd extension of
this extended function, and the periodic extension of this will be what the series converges to
on the entire real line.

 Example 8.11 Solve the following heat equation problem

u_t = 25u_{xx} \tag{8.85}
u(0,t) = 0, \quad u_x(10,t) = 0 \tag{8.86}
u(x,0) = 5 \tag{8.87}
By (8.84) our general solution is

u(x,t) = \sum_{n=1}^{\infty} B_n e^{-25\left(\frac{(2n-1)\pi}{20}\right)^2 t} \sin\left( \frac{(2n-1)\pi x}{20} \right).

The coefficients for the particular solution are given by

B_n = \frac{2}{10} \int_0^{10} 5 \sin\left( \frac{(2n-1)\pi x}{20} \right) dx \tag{8.88}
    = -\frac{20}{(2n-1)\pi} \cos\left( \frac{(2n-1)\pi x}{20} \right)\Big|_0^{10} \tag{8.89}
    = -\frac{20}{(2n-1)\pi} \left( \cos\left( \frac{(2n-1)\pi}{2} \right) - \cos(0) \right) \tag{8.90}
    = \frac{20}{(2n-1)\pi}, \tag{8.91}

since \cos((2n-1)\pi/2) = 0, and the solution to the problem is

u(x,t) = \frac{20}{\pi} \sum_{n=1}^{\infty} \frac{1}{2n-1} e^{-\frac{(2n-1)^2\pi^2}{16} t} \sin\left( \frac{(2n-1)\pi x}{20} \right).
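Numerically (our own sketch), the partial sums reproduce the constant initial temperature at interior points; note that the antiderivative of sin((2n−1)πx/20) contributes the factor 20/((2n−1)π) used below:

```python
import math

def u(x, t, N=3000):
    """Partial sum of the mixed-boundary-condition solution above."""
    s = 0.0
    for n in range(1, N + 1):
        a = (2 * n - 1) * math.pi / 20
        s += (20 / ((2 * n - 1) * math.pi)) * math.exp(-25 * a * a * t) * math.sin(a * x)
    return s

assert abs(u(5.0, 0.0) - 5) < 0.05   # recovers u(x,0) = 5 in the interior
assert abs(u(0.0, 1.0)) < 1e-12      # the iced end stays at 0
assert abs(u(5.0, 100.0)) < 1e-9     # all heat eventually drains out the x = 0 end
```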


8.7.2 Nonhomogeneous Dirichlet Conditions


The next type of boundary conditions we will look at are Dirichlet conditions, which fix the value of
u at the endpoints x = 0 and x = l. For the heat equation, this corresponds to fixing the temperature
at the ends of the rod. We have already looked at homogeneous conditions where the ends of the
rod had fixed temperature 0. Now consider the nonhomogeneous Dirichlet conditions

u(0,t) = T1 , u(l,t) = T2

This problem is slightly more difficult than the homogeneous Dirichlet condition problem we
have studied. Recall that for separation of variables to work, the differential equations and the
boundary conditions must be homogeneous. When we have nonhomogeneous conditions we need
to try to split the problem into one involving homogeneous conditions, which we know how to
solve, and another dealing with the nonhomogeneity.

R We used a similar approach when we applied the method of Undetermined Coefficients to


nonhomogeneous linear ordinary differential equations.

How can we separate the core homogeneous problem from what is causing the inhomogeneity?
Consider what happens as t → ∞. We should expect that, since we fix the temperatures at the
endpoints and allow free heat flux at the boundary, at some point the temperature will stabilize and
we will be at equilibrium. Such a temperature distribution would clearly not depend on time, and
we can write

\lim_{t\to\infty} u(x,t) = v(x).

Notice that v(x) must still satisfy the boundary conditions and the heat equation, but we should
not expect it to satisfy the initial conditions (since for large t we are far from where we initially
started). A solution such as v(x) which does not depend on t is called a steady-state or equilibrium
solution.
For a steady-state solution the boundary value problem becomes

0 = kv'', \quad v(0) = T_1, \quad v(l) = T_2.

It is easy to see that solutions to this second order differential equation are

v(x) = c1 x + c2

and applying the boundary conditions, we have


v(x) = T_1 + \frac{T_2 - T_1}{l}\, x.
Now, let

w(x,t) = u(x,t) − v(x)

so that

u(x,t) = w(x,t) + v(x).

This function w(x,t) represents the transient part of u(x,t) (since v(x) is the equilibrium part).
Taking derivatives we have

ut = wt + vt = wt and uxx = wxx + vxx = wxx .

Here we use the fact that v(x) is independent of t and must satisfy the differential equation; in
particular, the equilibrium equation gives v'' = v_xx = 0.
Thus w(x,t) must satisfy the heat equation, as the relevant derivatives of it are identical to those
of u(x,t), which is known to satisfy the equation. What are the boundary and initial conditions?
w(0,t) = u(0,t) − v(0) = T1 − T1 = 0 (8.92)
w(l,t) = u(l,t) − v(l) = T2 − T2 = 0 (8.93)
w(x, 0) = u(x, 0) − v(x) = f (x) − v(x) (8.94)
where f (x) = u(x, 0) is the given initial condition for the nonhomogeneous problem. Now, even
though our initial condition is slightly messier, we now have homogeneous boundary conditions,
since w(x,t) must solve the problem
wt = kwxx (8.95)
w(0,t) = w(l,t) = 0 (8.96)
w(x, 0) = f (x) − v(x) (8.97)

This is just a homogeneous Dirichlet problem. We know the general solution is



w(x,t) = \sum_{n=1}^{\infty} B_n e^{-\left(\frac{n\pi}{l}\right)^2 kt} \sin\left(\frac{n\pi x}{l}\right),

where the coefficients are given by


B_n = \frac{2}{l} \int_0^l \left( f(x) - v(x) \right) \sin\left(\frac{n\pi x}{l}\right) dx.
Notice that limt→∞ w(x,t) = 0, so that w(x,t) is transient.
Thus, the solution to the nonhomogeneous Dirichlet problem

ut = kuxx (8.98)
u(0,t) = T1 , u(l,t) = T2 (8.99)
u(x, 0) = f (x) (8.100)

is u(x,t) = w(x,t) + v(x), or



u(x,t) = \sum_{n=1}^{\infty} B_n e^{-\left(\frac{n\pi}{l}\right)^2 kt} \sin\left(\frac{n\pi x}{l}\right) + T_1 + \frac{T_2 - T_1}{l}\, x

with coefficients

B_n = \frac{2}{l} \int_0^l \left( f(x) - T_1 - \frac{T_2 - T_1}{l}\, x \right) \sin\left(\frac{n\pi x}{l}\right) dx.

R Do not memorize the formulas, but remember what problem w(x,t) has to solve and that the
final solution is u(x,t) = v(x) + w(x,t). The formula for v(x) is not hard, but if one is not
sure of it, remember that v_xx = 0 and that v satisfies the same boundary conditions as u(x,t).
This will recover it.

 Example 8.12 Solve the following heat equation problem

u_t = 3u_{xx} \tag{8.101}
u(0,t) = 20, \quad u(40,t) = 100 \tag{8.102}
u(x,0) = 40 - 3x \tag{8.103}

We start by writing

u(x,t) = v(x) + w(x,t)

where v(x) = 20 + 2x. Then w(x,t) must satisfy the problem

w_t = 3w_{xx} \tag{8.104}
w(0,t) = w(40,t) = 0 \tag{8.105}
w(x,0) = 40 - 3x - (20 + 2x) = 20 - 5x \tag{8.106}

This is a homogeneous Dirichlet problem, so the general solution for w(x,t) will be

w(x,t) = \sum_{n=1}^{\infty} B_n e^{-3\left(\frac{n\pi}{40}\right)^2 t} \sin\left(\frac{n\pi x}{40}\right).

The coefficients are given by

B_n = \frac{2}{40} \int_0^{40} (20 - 5x) \sin\left(\frac{n\pi x}{40}\right) dx \tag{8.107}
    = \frac{1}{20} \left[ -\frac{40(20 - 5x)}{n\pi} \cos\left(\frac{n\pi x}{40}\right) - \frac{8000}{n^2\pi^2} \sin\left(\frac{n\pi x}{40}\right) \right]_0^{40} \tag{8.108}
    = \frac{1}{20} \left( \frac{7200}{n\pi} \cos(n\pi) + \frac{800}{n\pi} \cos(0) \right) \tag{8.109}
    = \frac{40}{n\pi} \left( 1 + 9(-1)^n \right). \tag{8.110}

So the solution is

u(x,t) = 20 + 2x + \frac{40}{\pi} \sum_{n=1}^{\infty} \frac{1 + 9(-1)^n}{n} e^{-\frac{3n^2\pi^2}{1600} t} \sin\left(\frac{n\pi x}{40}\right).
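With v(x) = 20 + 2x, the transient initial data is w(x, 0) = (40 − 3x) − (20 + 2x) = 20 − 5x, whose sine coefficients work out to B_n = (40/(nπ))(1 + 9(−1)^n). A stdlib-only sketch (helper names are ours) cross-checks that closed form against direct quadrature and then verifies the assembled solution:

```python
import math

Bn = lambda n: 40 / (n * math.pi) * (1 + 9 * (-1) ** n)

def Bn_quad(n, m=20000):
    """(2/40) * integral_0^40 (20 - 5x) sin(n pi x / 40) dx, by the midpoint rule."""
    h = 40 / m
    return (2 / 40) * sum((20 - 5 * (i + 0.5) * h)
                          * math.sin(n * math.pi * (i + 0.5) * h / 40)
                          for i in range(m)) * h

for n in range(1, 6):
    assert abs(Bn(n) - Bn_quad(n)) < 1e-3

def u(x, t, N=4000):
    return 20 + 2 * x + sum(Bn(n) * math.exp(-3 * (n * math.pi / 40) ** 2 * t)
                            * math.sin(n * math.pi * x / 40) for n in range(1, N + 1))

assert abs(u(20.0, 0.0) - (40 - 3 * 20)) < 0.1      # recovers the initial data
assert abs(u(20.0, 3000.0) - (20 + 2 * 20)) < 1e-9  # relaxes to the steady state v(x)
```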


8.7.3 Other Boundary Conditions


There are many other boundary conditions one could use, most of which have a physical interpreta-
tion. For example the boundary conditions

u(0,t) + ux (0,t) = 0 u(l,t) + ux (l,t) = 0

say that the heat flux at the end points should be proportional to the temperature. We could also
have had nonhomogeneous Neumann conditions

ux (0,t) = F1 ux (l,t) = F2

which would specify allowing a certain heat flux at the boundaries. These conditions are not
necessarily well suited for the method of separation of variables though and are left for future
classes.

8.8 The Schrödinger Equation


Recall the Schrödinger equation

-\frac{\hbar^2}{2m} \Delta\Psi + V\Psi = i\hbar \frac{\partial\Psi}{\partial t}. \tag{8.111}
To approach this problem we begin by separating variables, assuming Ψ = ψ(x, y, z)T(t). Substituting
into (8.111) and dividing by ψT gives

-\frac{\hbar^2}{2m} \frac{1}{\psi}\Delta\psi + V = i\hbar \frac{1}{T}\frac{dT}{dt} = E

where E is the separation constant (E is the energy of the particle in quantum mechanics). Integrating
the time equation for T gives

i\hbar \frac{1}{T}\frac{dT}{dt} = E \quad\Rightarrow\quad T(t) = e^{-iEt/\hbar},

and the space equation (the time-independent Schrödinger equation) gives, after multiplication by ψ,

-\frac{\hbar^2}{2m} \Delta\psi + V\psi = E\psi.

We will only consider the simplest situation here: a 1D problem with V = 0,

-\frac{\hbar^2}{2m} \frac{d^2\psi}{dx^2} = E\psi \quad\Leftrightarrow\quad \frac{d^2\psi}{dx^2} + \frac{2mE}{\hbar^2}\psi = 0.

The last equation is the Helmholtz equation with k^2 = \frac{2mE}{\hbar^2}. Thus, the solutions are

\Psi = \psi(x)T(t) = \sin(kx)\, e^{-iEt/\hbar}, \quad \cos(kx)\, e^{-iEt/\hbar}.


 Example 8.13 The "particle in a box" problem in quantum mechanics requires the solution of
the Schrödinger equation with V = 0 on (0, ℓ) and Ψ = 0 at the endpoints x = 0, ℓ for all t. The
Dirichlet BCs allow only the sine solutions. The basis functions for this problem are the
eigenfunctions

\Psi_n = \sin\left(\frac{n\pi x}{\ell}\right) e^{-iE_n t/\hbar}

and the general solution is a linear combination of these solutions

\Psi(x,t) = \sum_{n=1}^{\infty} b_n \sin\left(\frac{n\pi x}{\ell}\right) e^{-iE_n t/\hbar}.
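A small numerical illustration of the stationary states. The non-dimensional choices ħ = m = ℓ = 1 are our own assumption (so that E_n = (nπ)²/2); the checks confirm the boundary conditions, the time-independence of |Ψ_n|, and the eigenvalue relation −(ħ²/2m)ψ″ = E_nψ:

```python
import cmath
import math

HBAR = M = ELL = 1.0  # illustrative non-dimensional choices (our assumption)

def E(n):
    """Energy eigenvalue E_n = (hbar n pi / ell)^2 / (2m)."""
    return (HBAR * n * math.pi / ELL) ** 2 / (2 * M)

def Psi(n, x, t):
    """Stationary state sin(n pi x / ell) * exp(-i E_n t / hbar)."""
    return math.sin(n * math.pi * x / ELL) * cmath.exp(-1j * E(n) * t / HBAR)

# Dirichlet BCs: Psi vanishes at both ends for all t
assert abs(Psi(3, 0.0, 0.7)) < 1e-12
assert abs(Psi(3, ELL, 0.7)) < 1e-12

# |Psi_n| is time-independent: the stationary probability density does not evolve
assert abs(abs(Psi(1, 0.4, 0.0)) - abs(Psi(1, 0.4, 5.0))) < 1e-12

# -(hbar^2/2m) psi'' = E_n psi, via a centered finite difference
h, x, n = 1e-4, 0.3, 2
psi = lambda x: math.sin(n * math.pi * x / ELL)
lap = (psi(x - h) - 2 * psi(x) + psi(x + h)) / h ** 2
assert abs(-0.5 * lap / psi(x) - E(n)) < 1e-3
```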


8.9 Wave Equations and the Vibrating String


8.9.1 Derivation of the Wave Equation
Consider a completely flexible string of length l and constant density ρ. We will assume that the
string will only undergo relatively small vertical vibrations, so that points do not move from side
to side. An example might be a plucked guitar string. Thus we can let u(x,t) be its displacement
from equilibrium at time t. The assumption of complete flexibility means that the tension force is
tangent to the string, and the string itself provides no resistance to bending. This means the tension
force only depends on the slope of the string.
Take a small piece of string going from x to x + ∆x. Let Θ(x,t) be the angle from the horizontal
of the string. Our goal is to use Newton’s Second Law F = ma to describe the motion. What forces
are acting on this piece of string?
(a) Tension pulling to the right, which has magnitude T (x + ∆x,t) and acts at an angle of Θ(x + ∆x,t)
from the horizontal.
(b) Tension pulling to the left, which has magnitude T (x,t) and acts at an angle of Θ(x,t) from the
horizontal.
(c) Any external forces, which we denote by F(x,t).
Initially, we will assume that F(x,t) = 0. The length of the piece of string is essentially \sqrt{(\Delta x)^2 + (\Delta u)^2},
so the vertical component of Newton's Law says that

\rho \sqrt{(\Delta x)^2 + (\Delta u)^2}\; u_{tt}(x,t) = T(x+\Delta x,t)\sin(\Theta(x+\Delta x,t)) - T(x,t)\sin(\Theta(x,t)). \tag{8.112}
Dividing by ∆x and taking the limit as ∆x → 0, we get

\rho \sqrt{1 + (u_x)^2}\; u_{tt}(x,t) = \frac{\partial}{\partial x}\left[ T(x,t)\sin(\Theta(x,t)) \right]. \tag{8.113}
We assumed our vibrations were relatively small. This means that Θ(x,t) is very close to zero. As a
result, sin(Θ(x,t)) ≈ tan(Θ(x,t)). Moreover, tan(Θ(x,t)) is just the slope of the string, u_x(x,t). We
conclude, since Θ(x,t) is small, that u_x(x,t) is also very small, so \sqrt{1 + (u_x)^2} ≈ 1 and the above
equation becomes

\rho u_{tt}(x,t) = \left( T(x,t)\, u_x(x,t) \right)_x. \tag{8.114}

We have not used the horizontal component of Newton’s Law yet. Since we assume there are only
vertical vibrations, our tiny piece of string can only move vertically. Thus the net horizontal force
is zero.

T (x + ∆x,t) cos(Θ(x + ∆x,t)) − T (x,t) cos(Θ(x,t)) = 0. (8.115)

Dividing by ∆x and taking the limit as ∆x → 0 yields

\frac{\partial}{\partial x}\left[ T(x,t)\cos(\Theta(x,t)) \right] = 0. \tag{8.116}
Since Θ(x,t) is very close to zero, cos(Θ(x,t)) is close to one. Thus we have that \frac{\partial T}{\partial x}(x,t) is close
to zero, so T(x,t) is constant along the string and independent of x. We will also assume that T is
independent of t. Then Equation (8.114) becomes the one-dimensional wave equation

u_{tt} = c^2 u_{xx} \tag{8.117}

where c^2 = T/\rho.

8.9.2 The Homogeneous Dirichlet Problem


Now that we have derived the wave equation, we can use Separation of Variables to obtain basic
solutions. We will consider homogeneous Dirichlet conditions, but if we had homogeneous
Neumann conditions the same techniques would give us a solution. The wave equation is second
order in t, unlike the heat equation which was first order in t. We will need to initial conditions in
order to obtain a solution, one for the initial displacement and the other for the initial speed.
The relevant wave equation problem we will study is
utt = c2 uxx (8.118)
u(0,t) = u(l,t) = 0 (8.119)
u(x, 0) = f (x), ut (x, 0) = g(x) (8.120)
The physical interpretation of the boundary conditions is that the ends of the string are fixed in
place. They might be attached to guitar pegs.
We start by assuming our solution has the form

u(x,t) = X(x)T (t). (8.121)

Plugging this into the equation gives

T''(t)X(x) = c^2 T(t)X''(x). \tag{8.122}

Separating variables, we have

\frac{X''}{X} = \frac{T''}{c^2 T} = -\lambda \tag{8.123}

where λ is a constant. This gives a pair of ODEs

T'' + c^2\lambda T = 0 \tag{8.124}
X'' + \lambda X = 0. \tag{8.125}
The boundary conditions transform into
u(0,t) = X(0)T (t) = 0 ⇒ X(0) = 0 (8.126)
u(l,t) = X(l)T (t) = 0 ⇒ X(l) = 0. (8.127)

This is the same boundary value problem that we saw for the heat equation, and thus the eigenvalues
and eigenfunctions are

\lambda_n = \left(\frac{n\pi}{l}\right)^2 \tag{8.128}
X_n(x) = \sin\left(\frac{n\pi x}{l}\right) \tag{8.129}

for n = 1, 2, ... The first ODE (8.124) is then

T'' + \left(\frac{cn\pi}{l}\right)^2 T = 0, \tag{8.130}

and since the coefficient of T is clearly positive this has general solution

T_n(t) = A_n \cos\left(\frac{n\pi ct}{l}\right) + B_n \sin\left(\frac{n\pi ct}{l}\right). \tag{8.131}
There is no reason to think either of these are zero, so we end up with separated solutions

u_n(x,t) = \left[ A_n \cos\left(\frac{n\pi ct}{l}\right) + B_n \sin\left(\frac{n\pi ct}{l}\right) \right] \sin\left(\frac{n\pi x}{l}\right) \tag{8.132}

and the general solution is

u(x,t) = \sum_{n=1}^{\infty} \left[ A_n \cos\left(\frac{n\pi ct}{l}\right) + B_n \sin\left(\frac{n\pi ct}{l}\right) \right] \sin\left(\frac{n\pi x}{l}\right). \tag{8.133}

We can directly apply our first initial condition, but to apply the second we will need to differentiate
with respect to t. This gives us

u_t(x,t) = \sum_{n=1}^{\infty} \left[ -\frac{n\pi c}{l} A_n \sin\left(\frac{n\pi ct}{l}\right) + \frac{n\pi c}{l} B_n \cos\left(\frac{n\pi ct}{l}\right) \right] \sin\left(\frac{n\pi x}{l}\right). \tag{8.134}

Plugging in the initial conditions then yields the pair of equations

u(x,0) = f(x) = \sum_{n=1}^{\infty} A_n \sin\left(\frac{n\pi x}{l}\right) \tag{8.135}

u_t(x,0) = g(x) = \sum_{n=1}^{\infty} \frac{n\pi c}{l} B_n \sin\left(\frac{n\pi x}{l}\right). \tag{8.136}

These are both Fourier Sine series. The first is directly the Fourier Sine series for f(x) on (0, l).
The second equation is the Fourier Sine series for g(x) on (0, l) with a slightly messy coefficient.
The Euler-Fourier formulas then tell us that

A_n = \frac{2}{l} \int_0^l f(x) \sin\left(\frac{n\pi x}{l}\right) dx \tag{8.137}
\frac{n\pi c}{l} B_n = \frac{2}{l} \int_0^l g(x) \sin\left(\frac{n\pi x}{l}\right) dx \tag{8.138}

so that

A_n = \frac{2}{l} \int_0^l f(x) \sin\left(\frac{n\pi x}{l}\right) dx \tag{8.139}
B_n = \frac{2}{n\pi c} \int_0^l g(x) \sin\left(\frac{n\pi x}{l}\right) dx. \tag{8.140}

8.9.3 Examples
 Example 8.14 Find the solution (displacement u(x,t)) for the problem of an elastic string of
length L whose ends are held fixed. The string has no initial velocity (ut (x, 0) = 0) from an initial
position

u(x,0) = f(x) = \begin{cases} \dfrac{4x}{L} & 0 \le x \le \dfrac{L}{4} \\[4pt] 1 & \dfrac{L}{4} < x < \dfrac{3L}{4} \\[4pt] \dfrac{4(L-x)}{L} & \dfrac{3L}{4} \le x \le L \end{cases} \tag{8.141}

By the formulas above, if we separate variables we have the following equation for T:

T'' + \left(\frac{cn\pi}{L}\right)^2 T = 0 \tag{8.142}

with general solution

T_n(t) = A_n \cos\left(\frac{n\pi ct}{L}\right) + B_n \sin\left(\frac{n\pi ct}{L}\right). \tag{8.143}
Since the initial velocity is zero, we find T'(0) = 0 and thus B_n = 0. Therefore the general solution is

u(x,t) = \sum_{n=1}^{\infty} A_n \cos\left(\frac{n\pi ct}{L}\right) \sin\left(\frac{n\pi x}{L}\right), \tag{8.144}

where the coefficients are the Fourier Sine coefficients of f(x). So

A_n = \frac{2}{L} \int_0^L f(x) \sin\left(\frac{n\pi x}{L}\right) dx \tag{8.145}
    = \frac{2}{L} \left[ \int_0^{L/4} \frac{4x}{L} \sin\left(\frac{n\pi x}{L}\right) dx + \int_{L/4}^{3L/4} \sin\left(\frac{n\pi x}{L}\right) dx + \int_{3L/4}^{L} \frac{4(L-x)}{L} \sin\left(\frac{n\pi x}{L}\right) dx \right] \tag{8.146}
    = 8\, \frac{\sin\left(\frac{n\pi}{4}\right) + \sin\left(\frac{3n\pi}{4}\right)}{n^2\pi^2} \tag{8.147}
Thus the displacement of the string will be

u(x,t) = \frac{8}{\pi^2} \sum_{n=1}^{\infty} \frac{\sin\left(\frac{n\pi}{4}\right) + \sin\left(\frac{3n\pi}{4}\right)}{n^2} \cos\left(\frac{n\pi ct}{L}\right) \sin\left(\frac{n\pi x}{L}\right). \tag{8.148}
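A stdlib-only cross-check of the coefficient formula (8.147), comparing it with midpoint quadrature of the defining integral. Taking L = 1 is our own choice (L scales out of the coefficients):

```python
import math

L = 1.0  # our choice for the check

def f(x):
    """The trapezoidal initial position (8.141)."""
    if x <= L / 4:
        return 4 * x / L
    if x < 3 * L / 4:
        return 1.0
    return 4 * (L - x) / L

def An_quad(n, m=20000):
    """(2/L) * integral_0^L f(x) sin(n pi x / L) dx by the midpoint rule."""
    h = L / m
    return (2 / L) * sum(f((i + 0.5) * h) * math.sin(n * math.pi * (i + 0.5) * h / L)
                         for i in range(m)) * h

def An_closed(n):
    """Closed form (8.147)."""
    return 8 * (math.sin(n * math.pi / 4) + math.sin(3 * n * math.pi / 4)) / (n * math.pi) ** 2

for n in (1, 2, 3, 4, 5):
    assert abs(An_quad(n) - An_closed(n)) < 1e-6
```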


 Example 8.15 Find the solution (displacement u(x,t)) for the problem of an elastic string of
length L whose ends are held fixed. The string has no initial velocity (ut (x, 0) = 0) from an initial
position

u(x,0) = f(x) = \frac{8x(L-x)^2}{L^3} \tag{8.149}
By the formulas above, if we separate variables we have the following equation for T:

T'' + \left(\frac{cn\pi}{L}\right)^2 T = 0 \tag{8.150}

with general solution

T_n(t) = A_n \cos\left(\frac{n\pi ct}{L}\right) + B_n \sin\left(\frac{n\pi ct}{L}\right). \tag{8.151}

Since the initial velocity is zero, we find T'(0) = 0 and thus B_n = 0. Therefore the general solution is

u(x,t) = \sum_{n=1}^{\infty} A_n \cos\left(\frac{n\pi ct}{L}\right) \sin\left(\frac{n\pi x}{L}\right), \tag{8.152}

where the coefficients are the Fourier Sine coefficients of f(x). So

A_n = \frac{2}{L} \int_0^L f(x) \sin\left(\frac{n\pi x}{L}\right) dx \tag{8.153}
    = \frac{2}{L} \int_0^L \frac{8x(L-x)^2}{L^3} \sin\left(\frac{n\pi x}{L}\right) dx \tag{8.154}
    = 32\, \frac{2 + \cos(n\pi)}{n^3\pi^3} \qquad \text{(integrating by parts)} \tag{8.155}
Thus the displacement of the string will be

u(x,t) = \frac{32}{\pi^3} \sum_{n=1}^{\infty} \frac{2 + \cos(n\pi)}{n^3} \cos\left(\frac{n\pi ct}{L}\right) \sin\left(\frac{n\pi x}{L}\right). \tag{8.156}
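Since the coefficients decay like 1/n³, a short partial sum already reproduces the initial shape. A sketch with L = 1 (our choice; at t = 0 the value of c is irrelevant):

```python
import math

L = 1.0
f = lambda x: 8 * x * (L - x) ** 2 / L ** 3  # the initial position (8.149)

def u0(x, N=400):
    """Series (8.156) evaluated at t = 0."""
    s = sum((2 + math.cos(n * math.pi)) / n ** 3 * math.sin(n * math.pi * x / L)
            for n in range(1, N + 1))
    return 32 / math.pi ** 3 * s

for x in (0.25, 0.5, 0.75):
    assert abs(u0(x) - f(x)) < 1e-3
```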


8.9.4 D’Alembert’s Solution of the Wave Equation, Characteristics


Another approach to solving the wave equation

\frac{\partial^2 u}{\partial t^2} = c^2 \frac{\partial^2 u}{\partial x^2} \tag{8.157}
can be seen by transforming the equation. Introduce new independent variables:

v = x + ct, w = x − ct (8.158)

We can now think of u as a function of v, w instead of x, t. Now compute the appropriate partial
derivatives of u using the chain rule:

\frac{\partial u}{\partial x} = \frac{\partial u}{\partial v}\frac{\partial v}{\partial x} + \frac{\partial u}{\partial w}\frac{\partial w}{\partial x} = u_v + u_w

\frac{\partial^2 u}{\partial x^2} = (u_v + u_w)_x = (u_v + u_w)_v v_x + (u_v + u_w)_w w_x = u_{vv} + 2u_{vw} + u_{ww}

\frac{\partial^2 u}{\partial t^2} = c^2 \left( u_{vv} - 2u_{vw} + u_{ww} \right).
Inserting these into the wave equation gives u_{vw} = 0. Solve using two successive integrations, first
with respect to w and then with respect to v, giving

\frac{\partial u}{\partial v} = h(v), \quad u(v,w) = \int h(v)\, dv + \psi(w) = \phi(v) + \psi(w).
Thus, replacing v and w by their definitions

u(x,t) = φ (x + ct) + ψ(x − ct) (8.159)

This is known as d'Alembert's solution. In general, given initial conditions u(x, 0) = f(x) and
u_t(x, 0) = g(x), d'Alembert's solution becomes

u(x,t) = \frac{1}{2}\left[ f(x+ct) + f(x-ct) \right] + \frac{1}{2c} \int_{x-ct}^{x+ct} g(s)\, ds. \tag{8.160}
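A quick sketch checking formula (8.160) against a case where everything is known in closed form; the choices c = 2, f = sin, g = cos are ours:

```python
import math

c = 2.0
f, g = math.sin, math.cos

def dalembert(x, t, m=20000):
    """u(x,t) = (f(x+ct) + f(x-ct))/2 + (1/2c) * integral_{x-ct}^{x+ct} g(s) ds."""
    a, b = x - c * t, x + c * t
    h = (b - a) / m
    integral = sum(g(a + (i + 0.5) * h) for i in range(m)) * h  # midpoint rule
    return 0.5 * (f(b) + f(a)) + integral / (2 * c)

# For f = sin and g = cos the integral can be done exactly, giving the solution
# u(x,t) = sin(x)cos(ct) + cos(x)sin(ct)/c.
exact = lambda x, t: math.sin(x) * math.cos(c * t) + math.cos(x) * math.sin(c * t) / c

for x, t in [(0.3, 0.7), (1.2, 0.1), (-2.0, 1.5)]:
    assert abs(dalembert(x, t) - exact(x, t)) < 1e-6
```

Note that at t = 0 the integral term vanishes, so the formula visibly satisfies u(x, 0) = f(x).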

R The idea of d’Alembert’s solution is just a special case of the method of characteristics.
This concerns PDEs of the form

A u_{xx} + 2B u_{xy} + C u_{yy} = F(x, y, u, u_x, u_y). \tag{8.161}

PDEs of this form fall into three general types (hyperbolic, parabolic, and elliptic), determined by the
coefficients A, B, C through the discriminant AC − B².

Figure 8.3: Three general types of PDEs (from Kreyszig, Advanced Engineering Mathematics).
Index

complex plane, 9
complex variables, 9

imaginary part, 9

real part, 9
