TEXTBOOKS in MATHEMATICS

ORDINARY
DIFFERENTIAL
EQUATIONS
An Introduction to the Fundamentals

Kenneth B. Howell
University of Alabama in Huntsville, USA
TEXTBOOKS in MATHEMATICS
Series Editors: Al Boggess and Ken Rosen

PUBLISHED TITLES
ABSTRACT ALGEBRA: AN INQUIRY-BASED APPROACH
Jonathan K. Hodge, Steven Schlicker, and Ted Sundstrom
ADVANCED LINEAR ALGEBRA
Nicholas Loehr
ADVANCED LINEAR ALGEBRA
Hugo Woerdeman
ADVANCED LINEAR ALGEBRA, SECOND EDITION
Bruce N. Cooperstein
APPLIED ABSTRACT ALGEBRA WITH MAPLE™ AND MATLAB®, THIRD EDITION
Richard Klima, Neil Sigmon, and Ernest Stitzinger
COMPUTATIONAL MATHEMATICS: MODELS, METHODS, AND ANALYSIS WITH MATLAB® AND MPI,
SECOND EDITION
Robert E. White
ELEMENTARY NUMBER THEORY
James S. Kraft and Lawrence C. Washington
EXPLORING LINEAR ALGEBRA: LABS AND PROJECTS WITH MATHEMATICA®
Crista Arangala
GRAPHS & DIGRAPHS, SIXTH EDITION
Gary Chartrand, Linda Lesniak, and Ping Zhang
INTRODUCTION TO ABSTRACT ALGEBRA, SECOND EDITION
Jonathan D. H. Smith
INTRODUCTION TO MATHEMATICAL PROOFS: A TRANSITION TO ADVANCED MATHEMATICS, SECOND EDITION
Charles E. Roberts, Jr.
INTRODUCTION TO NUMBER THEORY, SECOND EDITION
Marty Erickson, Anthony Vazzana, and David Garth
LINEAR ALGEBRA, GEOMETRY AND TRANSFORMATION
Bruce Solomon
MATHEMATICAL MODELLING WITH CASE STUDIES: USING MAPLE™ AND MATLAB®, THIRD EDITION
B. Barnes and G. R. Fulford
MATHEMATICS IN GAMES, SPORTS, AND GAMBLING–THE GAMES PEOPLE PLAY, SECOND EDITION
Ronald J. Gould
THE MATHEMATICS OF GAMES: AN INTRODUCTION TO PROBABILITY
David G. Taylor
MEASURE THEORY AND FINE PROPERTIES OF FUNCTIONS, REVISED EDITION
Lawrence C. Evans and Ronald F. Gariepy
NUMERICAL ANALYSIS FOR ENGINEERS: METHODS AND APPLICATIONS, SECOND EDITION
Bilal Ayyub and Richard H. McCuen
RISK ANALYSIS IN ENGINEERING AND ECONOMICS, SECOND EDITION
Bilal M. Ayyub
TRANSFORMATIONAL PLANE GEOMETRY
Ronald N. Umble and Zhigang Han

Also from the same author and published by CRC Press:

PRINCIPLES OF FOURIER ANALYSIS, 2001. The Second Edition is forthcoming.
CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742
© 2016 by Taylor & Francis Group, LLC
CRC Press is an imprint of Taylor & Francis Group, an Informa business

No claim to original U.S. Government works

Version Date: 20151012

International Standard Book Number-13: 978-1-4987-3384-7 (eBook - PDF)

This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been
made to publish reliable data and information, but the author and publisher cannot assume responsibility for the
validity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright
holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this
form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may
rectify in any future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or
utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including
photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the
publishers.

For permission to photocopy or use material electronically from this work, please access www.copyright.com
(http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923,
978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For
organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged.

Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for
identification and explanation without intent to infringe.
Visit the Taylor & Francis Web site at
http://www.taylorandfrancis.com
and the CRC Press Web site at
http://www.crcpress.com

6.1 Basic Notions
6.2 Linear Substitutions
6.3 Homogeneous Equations
6.4 Bernoulli Equations
Additional Exercises

7 The Exact Form and General Integrating Factors
7.1 The Chain Rule
7.2 The Exact Form, Defined
7.3 Solving Equations in Exact Form
7.4 Testing for Exactness — Part I
7.5 “Exact Equations”: A Summary
7.6 Converting Equations to Exact Form
7.7 Testing for Exactness — Part II
Additional Exercises

8 Slope Fields: Graphing Solutions Without the Solutions
8.1 Motivation and Basic Concepts
8.2 The Basic Procedure
8.3 Observing Long-Term Behavior in Slope Fields
8.4 Problem Points in Slope Fields, and Issues of Existence and Uniqueness
8.5 Tests for Stability
Additional Exercises

9 Euler’s Numerical Method
9.1 Deriving the Steps of the Method
9.2 Computing Via Euler’s Method (Illustrated)
9.3 What Can Go Wrong
9.4 Reducing the Error
9.5 Error Analysis for Euler’s Method
Additional Exercises

10 The Art and Science of Modeling with First-Order Equations
10.1 Preliminaries
10.2 A Rabbit Ranch
10.3 Exponential Growth and Decay
10.4 The Rabbit Ranch, Again
10.5 Notes on the Art and Science of Modeling
10.6 Mixing Problems
10.7 Simple Thermodynamics
10.8 Appendix: Approximations That Are Not Approximations
Additional Exercises

III Second- and Higher-Order Equations

11 Higher-Order Equations: Extending First-Order Concepts
11.1 Treating Some Second-Order Equations as First-Order
11.2 The Other Class of Second-Order Equations “Easily Reduced” to First-Order
11.3 Initial-Value Problems
11.4 On the Existence and Uniqueness of Solutions
Additional Exercises

12 Higher-Order Linear Equations and the Reduction of Order Method
12.1 Linear Differential Equations of All Orders
12.2 Introduction to the Reduction of Order Method
12.3 Reduction of Order for Homogeneous Linear Second-Order Equations
12.4 Reduction of Order for Nonhomogeneous Linear Second-Order Equations
12.5 Reduction of Order in General
Additional Exercises

13 General Solutions to Homogeneous Linear Differential Equations
13.1 Second-Order Equations (Mainly)
13.2 Homogeneous Linear Equations of Arbitrary Order
13.3 Linear Independence and Wronskians
Additional Exercises

14 Verifying the Big Theorems and an Introduction to Differential Operators
14.1 Verifying the Big Theorem on Second-Order, Homogeneous Equations
14.2 Proving the More General Theorems on General Solutions and Wronskians
14.3 Linear Differential Operators
Additional Exercises

15 Second-Order Homogeneous Linear Equations with Constant Coefficients
15.1 Deriving the Basic Approach
15.2 The Basic Approach, Summarized
15.3 Case 1: Two Distinct Real Roots
15.4 Case 2: Only One Root
15.5 Case 3: Complex Roots
15.6 Summary
Additional Exercises

16 Springs: Part I
16.1 Modeling the Action
16.2 The Mass/Spring Equation and Its Solutions
Additional Exercises

17 Arbitrary Homogeneous Linear Equations with Constant Coefficients
17.1 Some Algebra
17.2 Solving the Differential Equation
17.3 More Examples
17.4 On Verifying Theorem 17.2
17.5 On Verifying Theorem 17.3
Additional Exercises

18 Euler Equations
18.1 Second-Order Euler Equations
18.2 The Special Cases
18.3 Euler Equations of Any Order
18.4 The Relation Between Euler and Constant Coefficient Equations
Additional Exercises

19 Nonhomogeneous Equations in General
19.1 General Solutions to Nonhomogeneous Equations
19.2 Superposition for Nonhomogeneous Equations
19.3 Reduction of Order
Additional Exercises

20 Method of Undetermined Coefficients (aka: Method of Educated Guess)
20.1 Basic Ideas
20.2 Good First Guesses for Various Choices of g
20.3 When the First Guess Fails
20.4 Method of Guess in General
20.5 Common Mistakes
20.6 Using the Principle of Superposition
20.7 On Verifying Theorem 20.1
Additional Exercises

21 Springs: Part II
21.1 The Mass/Spring System
21.2 Constant Force
21.3 Resonance and Sinusoidal Forces
21.4 More on Undamped Motion under Nonresonant Sinusoidal Forces
Additional Exercises

22 Variation of Parameters (A Better Reduction of Order Method)
22.1 Second-Order Variation of Parameters
22.2 Variation of Parameters for Even Higher Order Equations
22.3 The Variation of Parameters Formula
Additional Exercises

IV The Laplace Transform

23 The Laplace Transform (Intro)
23.1 Basic Definition and Examples
23.2 Linearity and Some More Basic Transforms
23.3 Tables and a Few More Transforms
23.4 The First Translation Identity (And More Transforms)
23.5 What Is “Laplace Transformable”? (and Some Standard Terminology)
23.6 Further Notes on Piecewise Continuity and Exponential Order
23.7 Proving Theorem 23.5
Additional Exercises

24 Differentiation and the Laplace Transform
24.1 Transforms of Derivatives
24.2 Derivatives of Transforms
24.3 Transforms of Integrals and Integrals of Transforms
24.4 Appendix: Differentiating the Transform
Additional Exercises

25 The Inverse Laplace Transform
25.1 Basic Notions
25.2 Linearity and Using Partial Fractions
25.3 Inverse Transforms of Shifted Functions
Additional Exercises

26 Convolution
26.1 Convolution: The Basics
26.2 Convolution and Products of Transforms
26.3 Convolution and Differential Equations (Duhamel’s Principle)
Additional Exercises

27 Piecewise-Defined Functions and Periodic Functions
27.1 Piecewise-Defined Functions
27.2 The “Translation Along the T-Axis” Identity
27.3 Rectangle Functions and Transforms of More Piecewise-Defined Functions
27.4 Convolution with Piecewise-Defined Functions
27.5 Periodic Functions
27.6 An Expanded Table of Identities
27.7 Duhamel’s Principle and Resonance
Additional Exercises

28 Delta Functions
28.1 Visualizing Delta Functions
28.2 Delta Functions in Modeling
28.3 The Mathematics of Delta Functions
28.4 Delta Functions and Duhamel’s Principle
28.5 Some “Issues” with Delta Functions
Additional Exercises

V Power Series and Modified Power Series Solutions

29 Series Solutions: Preliminaries
29.1 Infinite Series
29.2 Power Series and Analytic Functions
29.3 Elementary Complex Analysis
29.4 Additional Basic Material That May Be Useful
Additional Exercises

36 Critical Points, Direction Fields and Trajectories
36.1 The Systems of Interest and Some Basic Notation
36.2 Constant/Equilibrium Solutions
36.3 “Graphing” Standard Systems
36.4 Sketching Trajectories for Autonomous Systems
36.5 Critical Points, Stability and Long-Term Behavior
36.6 Applications
36.7 Existence and Uniqueness of Trajectories
36.8 Proving Theorem 36.2
Additional Exercises

Appendix: Author’s Guide to Using This Text
A.1 Overview
A.2 Chapter-by-Chapter Guide

Answers to Selected Exercises

Index

Preface
(with Important Information for the Reader)

This textbook reflects my personal views on how an “ideal” introductory course in ordinary
differential equations should be taught, tempered by such practical constraints as the time available, and
the level and interests of the students. It also reflects my beliefs that a good text should (1) be a
useful resource beyond the course, and (2) back up its claims with solid and clear proofs (even if
some of those proofs are ignored by most readers). Moreover, I hope, it reflects the fact that such a
book should be written for those normally taking such a course; namely, students who are reasonably
acquainted with the differentiation and integration of functions of one variable, but who might not
yet be experts and may, on occasion, need to return to their elementary calculus texts for review.
Most of these students are not mathematicians and probably have no desire to become professional
mathematicians. Still, most are interested in fields of study where a fundamental understanding of
the mathematics and applications of differential equations is extremely useful. Some may have gone
beyond the basic single-variable calculus and be acquainted with multivariable calculus. If so, great.
They can delve into a few more topics. And those who’ve had a course in linear algebra or real
analysis are even luckier. They can be on the lookout for points where the theory from those more
advanced courses can be applied to and may even simplify some of the discussion.
If you are any one of these students, I hope you find this text readable and informative; after all,
I wrote it for you. And if you are an instructor of some of these students, then I hope you find this
text helpful as well. While I wrote this text mainly for students, the needs of the instructors were
kept firmly in mind. After all, this is the text my colleagues and I have been using for the last several
years.
Whether you are a student, instructor, or just a casual reader, there are a number of things you
should be aware of before starting the first chapter:

1. Extra material: There is more material in this text than can be reasonably covered in a
“standard” one-semester introductory course. In part, this is to provide the material for a
variety of “standard” courses which may or may not cover such topics as Laplace transforms,
series solutions, and systems. Beyond that, though, there are expanded discussions of topics
normally covered, as well as topics rarely covered, but which are still elementary enough
and potentially useful enough to merit discussion, as well as the proofs that are not simple
and illuminating enough to be included in the basic exposition, but should still be there to
keep the author honest and to serve as a reference for others. Because of this extra material,
there is an appendix, Author’s Guide to Using This Text, with advice on which sections must
be covered, which are optional, and which are best avoided by the first-time reader. It also
contains a few opinionated comments.

2. Computer math packages: At several points in the text, the use of a “computer math package”
is advised or, in exercises, required. By a “computer math package”, I mean one of those
powerful software packages such as Maple or Mathematica that can do symbolic calculations,
graphing and so forth. Unfortunately, software changes over time, new products emerge, and
companies providing this software can be bought and sold. In addition, you may be able to
find other computational resources on the Internet (but be aware that websites can be much
more fickle and untrustworthy than major software providers). For these reasons, details on
using such software are not included in this text. You will have to figure that out yourself (it’s
not that hard). I will tell you this: Extensive use of Maple was made in preparing this text.
In fact, most of the graphs were generated in Maple and then cleaned up using commercial
graphics software.

On the subject of computer math packages: Please become reasonably proficient in at
least one. If you are reading this, you are probably working in or will be working in a field
in which this sort of knowledge is invaluable. But don’t think this software can replace a
basic understanding of the mathematics you are using. Even a simple calculator is useless to
someone who doesn’t understand just what + and × mean. Mindlessly using this software
can lead to serious and costly mistakes (as discussed in section 9.3).

3. Additional chapters: By the way, I do not consider this text as being complete. Additional
chapters on systems of differential equations, numerical methods beyond Euler’s method,
boundary-value problems and so on are being written, mainly for a follow-up text that I hope
to eventually have published. As these chapters become written (and rewritten), they will
become available at the website for this text (see below).
4. Text website: While this edition remains in publication, I intend to maintain a website for
this text containing at least the following:

• A lengthy solution manual

• The aforementioned chapters extending the material in this text

• A list of known errata discovered since the book’s publication

At the time I was writing this, the text’s website was at http://howellkb.uah.edu/DEtext/.
With luck, that will still be the website’s location when you need it. Unfortunately, I cannot
guarantee that my university will not change its website policies and conventions, forcing
you to search for the current location of the text’s website. If you must search for this site, I
would suggest starting with the website of the Department of Mathematical Sciences of the
University of Alabama in Huntsville.

Finally, I must thank the many students and fellow faculty who have used earlier versions of
this text and have provided the feedback that I have found invaluable in preparing this edition. Those
comments are very much appreciated. And, if you, the reader, should find any errors or would like
to make any suggestions or comments regarding this text, please let me know. That, too, would be
very much appreciated.

Dr. Kenneth B. Howell

([email protected])

Part I
The Basics

1
The Starting Point:
Basic Concepts and Terminology

Let us begin our study of “differential equations” with a few basic questions — questions that any
beginner should ask:
What are “differential equations”?
What can we do with them? Solve them? If so, what do we solve for? And how?
and, of course,
What good are they, anyway?
In this chapter, we will try to answer these questions (along with a few you would not yet think
to ask), at least well enough to begin our studies. With luck we will even raise a few questions
that cannot be answered now, but which will justify continuing our study. In the process, we will
also introduce and examine some of the basic concepts, terminology and notation that will be used
throughout this book.

1.1 Differential Equations: Basic Definitions and Classifications
A differential equation is an equation involving some function of interest along with a few of its
derivatives. Typically, the function is unknown, and the challenge is to determine what that function
could possibly be.
Differential equations can be classified either as “ordinary” or as “partial”. An ordinary
differential equation is a differential equation in which the function in question is a function of only one
variable. Hence, its derivatives are the “ordinary” derivatives encountered early in calculus. For the
most part, these will be the sort of equations we’ll be examining in this text. For example,

\[ \frac{dy}{dx} = 4x^3 \]

\[ \frac{dy}{dx} + \frac{4}{x}\,y = x^2 \]

\[ \frac{d^2y}{dx^2} - 2\frac{dy}{dx} - 3y = 65\cos(2x) \]

\[ 4x^2\frac{d^2y}{dx^2} + 4x\frac{dy}{dx} + [4x^2 - 1]y = 0 \]

and

\[ \frac{d^4y}{dx^4} = 81y \]

are some differential equations that we will later deal with. In each, y denotes a function that is
given by some, yet unknown, formula of x . Of course, there is nothing sacred about our choice of
symbols. We will use whatever symbols are convenient for the variables and functions, especially if
the problem comes from an application and the symbols help remind us of what they denote (such
as when we use t for a measurement of time).1
A partial differential equation is a differential equation in which the function of interest depends
on two or more variables. Consequently, the derivatives of this function are the partial derivatives
developed in the later part of most calculus courses.2 Because the methods for studying partial
differential equations often involve solving ordinary differential equations, it is wise to first become
reasonably adept at dealing with ordinary differential equations before tackling partial differential
equations.
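
For contrast with the ordinary differential equations listed above, a standard example of a partial differential equation is the one-dimensional heat equation. It is offered here only as an illustration and is not one of the equations treated in this part of the text; the symbols $u$, $x$, $t$ and the positive constant $\kappa$ are notation assumed for this sketch:

\[ \frac{\partial u}{\partial t} = \kappa\,\frac{\partial^2 u}{\partial x^2} . \]

Here the unknown function $u = u(x,t)$ depends on two variables, so the derivatives that appear are partial derivatives.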
As already noted, this text is mainly concerned with ordinary differential equations. So let us
agree that, unless otherwise indicated, the phrase “differential equation” in this text means “ordinary
differential equation”. If you wish to further simplify the phrasing to “DE” or even to something like
“Diffy-Q”, go ahead. This author, however, will not be so informal.
Differential equations are also classified by their “order”. The order of a differential equation is
simply the order of the highest order derivative explicitly appearing in the equation. The equations
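\[ \frac{dy}{dx} = 4x^3 , \qquad \frac{dy}{dx} + \frac{4}{x}\,y = x^2 \qquad\text{and}\qquad y\,\frac{dy}{dx} = -9.8x \]

are all first-order equations. So is

\[ \frac{dy}{dx} + 3y^2 = y\left(\frac{dy}{dx}\right)^4 , \]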

despite the appearance of the higher powers — dy/dx is still the highest order derivative in this
equation, even if it is multiplied by itself a few times.
The equations

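\[ \frac{d^2y}{dx^2} - 2\frac{dy}{dx} - 3y = 65\cos(2x) \qquad\text{and}\qquad 4x^2\frac{d^2y}{dx^2} + 4x\frac{dy}{dx} + [4x^2 - 1]y = 0 \]

are second-order equations, while

\[ \frac{d^3y}{dx^3} = e^{4x} \qquad\text{and}\qquad \frac{d^3y}{dx^3} - \frac{d^2y}{dx^2} + \frac{dy}{dx} - y = x^2 \]
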
are third-order equations.

?Exercise 1.1: What is the order of each of the following equations?

\[ \frac{d^2y}{dx^2} + 3\frac{dy}{dx} - 7y = \sin(x) \]

\[ \frac{d^5y}{dx^5} - \cos(x)\frac{d^3y}{dx^3} = y^2 \]

\[ \frac{d^5y}{dx^5} - \cos(x)\frac{d^3y}{dx^3} = y^6 \]

\[ \frac{d^{42}y}{dx^{42}} = \left(\frac{d^3y}{dx^3}\right)^2 \]

1 On occasion, we may write “y = y(x)” to explicitly indicate that, in some expression, y denotes a function given by
some formula of x with y(x) denoting that “formula of x”. More often, it will simply be understood that y is a function
given by some formula of whatever variable appears in our expressions.
2 A brief introduction to partial derivatives is given in section 3.7 for those who are interested and haven’t yet seen partial
derivatives.


In practice, higher-order differential equations are usually more difficult to solve than lower-
order equations. This, of course, is not an absolute rule. There are some very difficult first-order
equations, as well as some very easily solved twenty-seventh-order equations.
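
As one illustration of an easily solved high-order equation (an example supplied here for concreteness, not one taken from the text), consider the twenty-seventh-order equation

\[ \frac{d^{27}y}{dx^{27}} = 0 . \]

Integrating both sides twenty-seven times shows that its solutions are exactly the polynomials of degree at most 26,

\[ y(x) = c_0 + c_1 x + c_2 x^2 + \cdots + c_{26}x^{26} , \]

where $c_0, c_1, \ldots, c_{26}$ are arbitrary constants.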

Solutions: The Basic Notions∗

Any function that satisfies a given differential equation is called a solution to that differential equation.
“Satisfies the equation” means that, if you plug the function into the differential equation and compute
the derivatives, then the result is an equation that is true no matter what real value we replace the
variable with. And if that resulting equation is not true for some real values of the variable, then that
function is not a solution to that differential equation.

!Example 1.1: Consider the differential equation

\[ \frac{dy}{dx} - 3y = 0 . \]

If, in this differential equation, we let $y(x) = e^{3x}$ (i.e., if we replace $y$ with $e^{3x}$), we get

\[ \frac{d}{dx}\left[e^{3x}\right] - 3e^{3x} = 0 \]

\[ \longrightarrow\quad 3e^{3x} - 3e^{3x} = 0 \]

\[ \longrightarrow\quad 0 = 0 , \]

which certainly is true for every real value of $x$. So $y(x) = e^{3x}$ is a solution to our differential
equation.
On the other hand, if we let $y(x) = x^3$ in this differential equation, we get

\[ \frac{d}{dx}\left[x^3\right] - 3x^3 = 0 \]

\[ \longrightarrow\quad 3x^2 - 3x^3 = 0 \]

\[ \longrightarrow\quad 3x^2(1 - x) = 0 , \]

which is true only if $x = 0$ or $x = 1$. But our interest is not in finding values of $x$ that make
the equation true, but in finding functions of $x$ (i.e., $y(x)$) that make the equation true for all
values of $x$. So $y(x) = x^3$ is not a solution to our differential equation. (And it makes no sense,
whatsoever, to refer to either $x = 0$ or $x = 1$ as solutions, here.)
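
The check carried out by hand in example 1.1 can also be approximated numerically. The following is only a rough sketch, not part of the text's method: it assumes a central-difference approximation to the derivative (the helper names `derivative` and `residual`, the step size, and the sample points are all illustrative choices), and it can only suggest, not prove, that a function satisfies the equation for every real $x$.

```python
import math

def derivative(f, x, h=1e-5):
    """Central-difference approximation to f'(x) (an assumed numerical stand-in)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def residual(y, x):
    """Value of dy/dx - 3y at x for a candidate function y; zero means 'satisfied at x'."""
    return derivative(y, x) - 3 * y(x)

# The two candidates tried in example 1.1.
candidates = {
    "e^(3x)": lambda x: math.exp(3 * x),
    "x^3": lambda x: x ** 3,
}
sample_points = [-1.0, 0.0, 0.5, 1.0, 2.0]

for name, y in candidates.items():
    worst = max(abs(residual(y, x)) for x in sample_points)
    print(f"{name}: largest |dy/dx - 3y| over the sample points = {worst:.2e}")
```

For $e^{3x}$ the residual stays at the level of the approximation error at every sample point, while for $x^3$ it is small only at $x = 0$ and $x = 1$, matching the conclusion of the example.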
∗ Warning: The discussion of “solutions” here is rather incomplete so that we can get to the basic, intuitive concepts quickly.
We will refine our notion of “solutions” in section 1.3 starting on page 14.


Typically, a differential equation will have many different solutions. Any formula (or set of
formulas) that describes all possible solutions is called a general solution to the equation.

!Example 1.2: Consider the differential equation

\[ \frac{dy}{dx} = 6x . \]

All possible solutions can be obtained by just taking the indefinite integral of both sides,

\[ \int \frac{dy}{dx}\,dx = \int 6x\,dx \]

\[ \longrightarrow\quad y(x) + c_1 = 3x^2 + c_2 \]

\[ \longrightarrow\quad y(x) = 3x^2 + c_2 - c_1 \]

where $c_1$ and $c_2$ are arbitrary constants. Since the difference of two arbitrary constants is just
another arbitrary constant, we can replace the above $c_2 - c_1$ with a single arbitrary constant $c$
and rewrite our last equation more succinctly as

\[ y(x) = 3x^2 + c . \]

This formula for $y$ describes all possible solutions to our original differential equation; it is a
general solution to the differential equation in this example. To obtain an individual solution to
our differential equation, just replace $c$ with any particular number. For example, respectively
letting $c = 1$, $c = -3$, and $c = 827$ yield the following three solutions to our differential
equation:

\[ 3x^2 + 1 , \quad 3x^2 - 3 \quad\text{and}\quad 3x^2 + 827 . \]

As just illustrated, general solutions typically involve arbitrary constants. In many applications,
we will find that the values of these constants are not truly arbitrary but are fixed by additional
conditions imposed on the possible solutions (so, in these applications at least, it would be more
accurate to refer to the “arbitrary” constants in the general solutions of the differential equations as
“yet undetermined” constants).
Normally, when given a differential equation and no additional conditions, we will want to
determine all possible solutions to the given differential equation. Hence, “solving a differential
equation” often means “finding a general solution” to that differential equation. That will be the
default meaning of the phrase “solving a differential equation” in this text. Notice, however, that the
resulting “solution” is not a single function that satisfies the differential equation (which is what we
originally defined “a solution” to be), but is a formula describing all possible functions satisfying the
differential equation (i.e., a “general solution”). Such ambiguity often arises in everyday language,
and we’ll just have to live with it. Simply remember that, in practice, the phrase “a solution to a
differential equation” can refer either to
any single function that satisfies the differential equation,
or
any formula describing all the possible solutions (more correctly called a general solution).
In practice, it is usually clear from the context just which meaning of the word “solution” is being
used. On occasions where it might not be clear, or when we wish to be very precise, it is standard
to call any single function satisfying the given differential equation a particular solution. So, in the
last example, the formulas

\[ 3x^2 + 1 , \quad 3x^2 - 3 \quad\text{and}\quad 3x^2 + 827 \]

describe particular solutions to

\[ \frac{dy}{dx} = 6x . \]

Initial-Value Problems
One set of auxiliary conditions that often arises in applications is a set of “initial values” for the desired
solution. This is a specification of the values of the desired solution and some of its derivatives at
a single point. To be precise, an $N$th-order set of initial values for a solution $y$ consists of an
assignment of values to

\[ y(x_0) , \quad y'(x_0) , \quad y''(x_0) , \quad y'''(x_0) , \quad \ldots \quad\text{and}\quad y^{(N-1)}(x_0) \]

where $x_0$ is some fixed number (in practice, $x_0$ is often 0) and $N$ is some nonnegative integer.3
Note that there are exactly $N$ values being assigned and that the highest derivative in this set is of
order $N - 1$.
We will find that $N$th-order sets of initial values are especially appropriate for $N$th-order differential
equations. Accordingly, the term $N$th-order initial-value problem will always mean a problem
consisting of
1. an $N$th-order differential equation, and

2. an $N$th-order set of initial values.

For example,

\[ \frac{dy}{dx} - 3y = 0 \quad\text{with}\quad y(0) = 4 \]

is a first-order initial-value problem. “$dy/dx - 3y = 0$” is the first-order differential equation, and
“$y(0) = 4$” is the first-order set of initial values. On the other hand, the third-order differential
equation

\[ \frac{d^3y}{dx^3} + \frac{dy}{dx} = 0 \]

along with the third-order set of initial conditions

\[ y(1) = 3 , \quad y'(1) = -4 \quad\text{and}\quad y''(1) = 10 \]

makes up a third-order initial-value problem.

A solution to an initial-value problem is a solution to the differential equation that also satisfies
the given initial values. The usual approach to solving such a problem is to first find the general
solution to the differential equation (via any of the methods we’ll develop later), and then determine
the values of the ‘arbitrary’ constants in the general solution so that the resulting function also satisfies
each of the given initial values.

3 Remember, if $y = y(x)$, then

\[ y' = \frac{dy}{dx} , \quad y'' = \frac{d^2y}{dx^2} , \quad y''' = \frac{d^3y}{dx^3} , \quad \ldots \quad\text{and}\quad y^{(k)} = \frac{d^ky}{dx^k} . \]

We will use the ‘prime’ notation for derivatives when the d/dx notation becomes cumbersome.


!Example 1.3: Consider the initial-value problem

\[ \frac{dy}{dx} = 6x \quad\text{with}\quad y(1) = 8 . \]

From example 1.2, we know that the general solution to the above differential equation is

\[ y(x) = 3x^2 + c \]

where $c$ is an arbitrary constant. Combining this formula for $y$ with the requirement that
$y(1) = 8$, we have

\[ 8 = y(1) = 3 \cdot 1^2 + c = 3 + c , \]

which, in turn, requires that

\[ c = 8 - 3 = 5 . \]

So the solution to the initial-value problem is given by

\[ y(x) = 3x^2 + c \quad\text{with}\quad c = 5 ; \]

that is,

\[ y(x) = 3x^2 + 5 . \]
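
The two-step approach of example 1.3 (start from the general solution, then fix the constant from the initial value) can be mirrored in a few lines of code. This is a minimal sketch for this one example only; the names `general_solution` and `particular_solution` are illustrative and do not come from the text.

```python
def general_solution(x, c):
    """General solution y(x) = 3x^2 + c from example 1.2."""
    return 3 * x**2 + c

x0, y0 = 1, 8                       # the initial condition y(1) = 8
c = y0 - general_solution(x0, 0)    # solve 8 = 3*1^2 + c for c

def particular_solution(x):
    """The general solution with c fixed by the initial condition."""
    return general_solution(x, c)

print("c =", c)
print("y(1) =", particular_solution(x0))
```

Because $c$ enters the general solution simply as an added constant, subtracting the $c = 0$ value at $x_0$ from the required value $y_0$ is enough to determine it here.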

By the way, the terms “initial values”, “initial conditions”, and “initial data” are essentially
synonymous and, in practice, are used interchangeably.

1.2 Why Care About Differential Equations? Some Illustrative Examples
Perhaps the main reason to study differential equations is that they naturally arise when we attempt
to mathematically describe “real-world” processes that vary with, say, time or position. Let us look
at one well-known process: the falling of some object towards the earth. To illustrate some of the
issues involved, we’ll develop two different sets of mathematical descriptions for this process.
By the way, any collection of equations and formulas describing some process is called a
(mathematical) model of the process, and the process of developing a mathematical model is called,
unsurprisingly, modeling.

The Situation to Be Modeled:

Let us concern ourselves with the vertical position and motion of an object dropped from a plane at a
height of 1,000 meters. Since it’s just being dropped, we may assume its initial downward velocity is
0 meters per second. The precise nature of the object — whether it’s a falling marble, a frozen duck
(live, unfrozen ducks don’t usually fall) or some other familiar falling object — is not important at
this time. Visualize it as you will.
The first two things one should do when developing a model are to sketch the process (if possible)
and to assign symbols to quantities that may be relevant. A crude sketch of the process is in figure
1.1 (I’ve sketched the object as a ball since a ball is easy to sketch). Following ancient traditions,
let’s make the following symbolic assignments:
$m$ = the mass (in grams) of the object

$t$ = time (in seconds) since the object was dropped

$y(t)$ = vertical distance (in meters) between the object and the ground at time $t$

$v(t)$ = vertical velocity (in meters/second) of the object at time $t$

$a(t)$ = vertical acceleration (in meters/second$^2$) of the object at time $t$

[Figure 1.1: Rough sketch of a falling object of mass $m$, showing the initial height $y(0)$ (1,000 meters), the height $y(t)$ and velocity $v(t)$ at time $t$, and the ground (where $y = 0$).]

Where convenient, we will use $y$, $v$ and $a$ as shorthand for $y(t)$, $v(t)$ and $a(t)$. Remember that,
by the definition of velocity and acceleration,

\[ v = \frac{dy}{dt} \quad\text{and}\quad a = \frac{dv}{dt} = \frac{d^2y}{dt^2} . \]

From our assumptions regarding the object’s position and velocity at the instant it was dropped,
we have that

\[ y(0) = 1{,}000 \quad\text{and}\quad \left.\frac{dy}{dt}\right|_{t=0} = v(0) = 0 . \tag{1.1} \]

These will be our initial values. (Notice how appropriate it is to call these the “initial values”:
$y(0)$ and $v(0)$ are, indeed, the initial position and velocity of the object.)
As time goes on, we expect the object to be falling faster and faster downwards, so we expect
both the position and velocity to vary with time. Precisely how these quantities vary with time might
be something we don’t yet know. However, from Newton’s laws, we do know

F = ma

where $F$ is the sum of the (vertically acting) forces on the object. Replacing $a$ with either the
corresponding derivative of velocity or position, this equation becomes

\[ F = m\frac{dv}{dt} \tag{1.2} \]

or, equivalently,

\[ F = m\frac{d^2y}{dt^2} . \tag{1.2$'$} \]
If we can adequately describe the forces acting on the falling object (i.e., the F ), then the velocity,
v(t) , and vertical position, y(t) , can be found by solving the above differential equations, subject
to the initial conditions in line (1.1).


The Simplest Falling Object Model

The Earth’s gravity is the most obvious force acting on our falling object. Checking a convenient
physics text, we ﬁnd that the force of the Earth’s gravity acting on an object of mass m is given by

\[ F_{\text{grav}} = -gm \quad\text{where}\quad g = 9.8 \text{ meters/second}^2 . \]

Of course, the value for $g$ is an approximation and assumes that the object is not too far above
the Earth’s surface. It also assumes that we’ve chosen “up” to be the positive direction (hence the
negative sign).
For this model, let us suppose the Earth’s gravity, $F_{\text{grav}}$, is the only significant force involved.
Assuming this (and keeping in mind that we are measuring distance in meters and time in seconds),
we have

\[ F = F_{\text{grav}} = -9.8m \]

in the “$F = ma$” equation. In particular, equation (1.2$'$) becomes

\[ -9.8m = m\frac{d^2y}{dt^2} . \]

The mass conveniently divides out, leaving us with

\[ \frac{d^2y}{dt^2} = -9.8 . \]

Taking the indefinite integral with respect to $t$ of both sides of this equation yields

\[ \int \frac{d^2y}{dt^2}\,dt = \int -9.8\,dt \]

\[ \longrightarrow\quad \int \frac{d}{dt}\left[\frac{dy}{dt}\right] dt = \int -9.8\,dt \]

\[ \longrightarrow\quad \frac{dy}{dt} + c_1 = -9.8t + c_2 \]

\[ \longrightarrow\quad \frac{dy}{dt} = -9.8t + c \]

where $c_1$ and $c_2$ are the “arbitrary constants of integration” and $c = c_2 - c_1$. This gives us our
formula for $dy/dt$ up to an unknown constant $c$. But recall that the initial velocity is zero,

\[ \left.\frac{dy}{dt}\right|_{t=0} = v(0) = 0 . \]

On the other hand, setting $t$ equal to zero in the formula just derived for $dy/dt$ yields

\[ \left.\frac{dy}{dt}\right|_{t=0} = -9.8 \cdot 0 + c . \]

Combining these two expressions for $y'(0)$ yields

\[ 0 = \left.\frac{dy}{dt}\right|_{t=0} = -9.8 \cdot 0 + c . \]

Thus, $c = 0$ and our formula for $dy/dt$ reduces to

\[ \frac{dy}{dt} = -9.8t . \]


Again, we have a differential equation that is easily solved by simple integration,

\[ \int \frac{dy}{dt}\,dt = \int -9.8t\,dt \]

\[ \longrightarrow\quad y(t) + C_1 = -9.8\left[\tfrac{1}{2}t^2\right] + C_2 \]

\[ \longrightarrow\quad y(t) = -4.9t^2 + C \]

where, again, $C_1$ and $C_2$ are the “arbitrary constants of integration” and $C = C_2 - C_1$.4 Combining
this last equation with the initial condition for $y(t)$ (from line (1.1)), we get

\[ 1{,}000 = y(0) = -4.9 \cdot 0^2 + C . \]

Thus, $C = 1{,}000$ and the vertical position (in meters) at time $t$ is given by

\[ y(t) = -4.9t^2 + 1{,}000 . \]
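
As a quick sanity check on this result, one can confirm numerically that the formula satisfies both the differential equation $d^2y/dt^2 = -9.8$ and the initial values in line (1.1). The sketch below assumes central-difference approximations for the derivatives; the helper names, step size, and sample times are illustrative choices rather than anything prescribed by the text.

```python
def height(t):
    """Height in meters from the simplest model: y(t) = -4.9 t^2 + 1000."""
    return -4.9 * t**2 + 1000.0

def first_derivative(f, t, h=1e-3):
    """Central-difference approximation to f'(t)."""
    return (f(t + h) - f(t - h)) / (2.0 * h)

def second_derivative(f, t, h=1e-3):
    """Central-difference approximation to f''(t)."""
    return (f(t + h) - 2.0 * f(t) + f(t - h)) / h**2

# The differential equation: y'' should be -9.8 at every time we try.
for t in [0.0, 1.0, 5.0, 10.0]:
    print(f"t = {t:4.1f}: approximate y''(t) = {second_derivative(height, t):.4f}")

# The initial values: y(0) = 1000 and y'(0) = 0.
print("y(0)  =", height(0.0))
print("y'(0) =", first_derivative(height, 0.0))
```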

A Better Falling Object Model

The above model does not take into account the resistance of the air to the falling object — a very
important force if the object is relatively light or has a parachute. Let us add this force to our model.
That is, for our “ F = ma ” equation, we’ll use

\[ F = F_{\text{grav}} + F_{\text{air}} \]

where $F_{\text{grav}}$ is the force of gravity discussed above, and $F_{\text{air}}$ is the force due to the air resistance
acting on this particular falling body.
Part of our problem now is to determine a good way of describing $F_{\text{air}}$ in terms relevant to
our problem. To do that, let us list a few basic properties of air resistance that should be obvious to
anyone who has stuck their hand out of a car window:
1. The force of air resistance does not depend on the position of the object, only on the relative
velocity between it and the surrounding air. So, for us, $F_{\text{air}}$ will just be a function of $v$,
$F_{\text{air}} = F_{\text{air}}(v)$. (This assumes, of course, that the air is still, with no up- or downdrafts, and
that the density of the air remains fairly constant throughout the distance this object falls.)

2. This force is zero when the object is not moving, and its magnitude increases as the speed
increases (remember, speed is the magnitude of the velocity). Hence, $F_{\text{air}}(v) = 0$ when
$v = 0$, and $|F_{\text{air}}(v)|$ gets bigger as $|v|$ gets bigger.

3. Air resistance acts against the direction of motion. This means that the direction of the force
of air resistance is opposite to the direction of motion. Thus, the sign of $F_{\text{air}}(v)$ will be
opposite that of $v$.

While there are many formulas for $F_{\text{air}}(v)$ that would satisfy the above conditions, common sense
suggests that we first use the simplest. That would be

\[ F_{\text{air}}(v) = -\gamma v \]

4 Note that slightly different symbols are being used to denote the different constants. This is highly recommended to
prevent confusion when (and if) we ever review our computations.


where γ is some positive value. The actual value of γ will depend on such parameters as the
object’s size, shape, and orientation, as well as the density of the air through which the object is
moving. For any given object, this value could be determined by experiment (with the aid of the
equations we will soon derive).

?Exercise 1.2: Convince yourself that

a: this formula for $F_{\text{air}}(v)$ does satisfy the above three conditions, and
b: no simpler formula would work.

We are now ready to derive the appropriate differential equations for our improved model of a
falling object. The total force is given by

\[ F = F_{\text{grav}} + F_{\text{air}} = -9.8m - \gamma v . \]

Since this formula explicitly involves $v$ instead of $dy/dt$, let us use the equation (1.2) version of
“$F = ma$” from page 9,

\[ F = m\frac{dv}{dt} . \]

Combining the last two equations,

\[ m\frac{dv}{dt} = F = -9.8m - \gamma v . \]

Cutting out the middle and dividing through by the mass gives the slightly simpler equation

\[ \frac{dv}{dt} = -9.8 - \kappa v \quad\text{where}\quad \kappa = \frac{\gamma}{m} . \tag{1.3} \]
Remember that γ , m and, hence, κ are positive constants, while v = v(t) is a yet unknown
function that satisfies the initial condition v(0) = 0 . After solving this initial-value problem for
v(t) , we could then find the corresponding formula for height at time t , y(t) , by solving the simple
initial-value problem

\[ \frac{dy}{dt} = v(t) \quad\text{with}\quad y(0) = 1{,}000 . \]

Unfortunately, we cannot solve equation (1.3) by simply integrating both sides with respect to $t$,

\[ \int \frac{dv}{dt}\,dt = \int [-9.8 - \kappa v]\,dt . \]

The first integral is not a problem. By the relation between derivatives and integrals, we still have

\[ \int \frac{dv}{dt}\,dt = v(t) + c_1 \]

where $c_1$ is an arbitrary constant. It’s the other side that is a problem. Since $\kappa$ is a constant, but
$v = v(t)$ is an unknown function of $t$, the best we can do with the right-hand side is

\[ \int [-9.8 - \kappa v]\,dt = -\int 9.8\,dt - \kappa\int v(t)\,dt = -9.8t + c_2 - \kappa\int v(t)\,dt . \]

Again, $c_2$ is an arbitrary constant. However, since $v(t)$ is an unknown function, its integral is
simply another unknown function of $t$. Thus, letting $c = c_2 - c_1$ and “integrating the equation”
simply gives us the rather unhelpful formula

\[ v(t) = -9.8t + c - (\kappa \cdot \text{some unknown function of } t) . \]


Fortunately, this is a text on differential equations, and methods for solving equations such as
equation (1.3) will be discussed in chapters 4 and 5. But there’s no need to rush things. The main
goal here is just to see how differential equations arise in applications. Of course, now that we have
equation (1.3), we also have a good reason to continue on and learn how to solve it.
By the way, if we replace $v$ in equation (1.3) with $dy/dt$, we get the second-order differential
equation

\[ \frac{d^2y}{dt^2} = -9.8 - \kappa\frac{dy}{dt} . \]

This can be integrated, yielding

\[ \frac{dy}{dt} = -9.8t - \kappa y + c \]

where $c$ is an arbitrary constant. Again, this is a first-order differential equation that we cannot
solve until we delve more deeply into the various methods for solving these equations. And if, in
this last equation, we again use the fact that $v = dy/dt$, all we get is

\[ v = -9.8t - \kappa y + c \tag{1.4} \]

which is another not-very-helpful equation relating the unknown functions $v(t)$ and $y(t)$.5
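
Even without a formula for $v(t)$, relation (1.4) can be explored numerically. The sketch below is not a method from this section: it assumes a simple step-by-step approximation (advance $v$ and $y$ over small time steps using $dv/dt = -9.8 - \kappa v$ and $dy/dt = v$), and the value of $\kappa$, the step size, and the number of steps are hypothetical choices made purely for illustration. It then checks that the combination $v + 9.8t + \kappa y$, which equation (1.4) says equals the constant $c$, does not change along the computed values.

```python
KAPPA = 0.2      # hypothetical value of kappa (1/second); not taken from the text
DT = 0.01        # time step in seconds (an arbitrary choice)
STEPS = 1000     # number of steps, i.e. about 10 seconds of falling

t, y, v = 0.0, 1000.0, 0.0           # initial values from line (1.1)
invariant_start = v + 9.8 * t + KAPPA * y
max_drift = 0.0

for _ in range(STEPS):
    dv = (-9.8 - KAPPA * v) * DT     # equation (1.3): dv/dt = -9.8 - kappa*v
    dy = v * DT                      # dy/dt = v
    t, y, v = t + DT, y + dy, v + dv
    # Equation (1.4) says v + 9.8t + kappa*y should stay equal to one constant.
    drift = abs(v + 9.8 * t + KAPPA * y - invariant_start)
    max_drift = max(max_drift, drift)

print(f"after {t:.1f} s: v = {v:.2f} m/s, y = {y:.1f} m")
print(f"largest change in v + 9.8t + kappa*y: {max_drift:.2e}")
```

The approximate values of $v$ and $y$ depend on the step size, but the combination in (1.4) stays fixed up to rounding, which is consistent with it holding for any solution of the model.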

Summary of Our Models and the Related Initial Value Problems

For the first model of a falling body, we had the second-order differential equation

\[ \frac{d^2y}{dt^2} = -9.8 \]

along with the initial conditions

\[ y(0) = 1{,}000 \quad\text{and}\quad y'(0) = 0 . \]

In other words, we had a second-order initial-value problem. This problem, as we saw, was rather
easy to solve.
For the second model, we still had the initial conditions

\[ y(0) = 1{,}000 \quad\text{and}\quad y'(0) = 0 , \]

but we found it a little more convenient to write the differential equation as

\[ \frac{dv}{dt} = -9.8 - \kappa v \quad\text{where}\quad \frac{dy}{dt} = v \]

and $\kappa$ was some positive constant. There are a couple of ways we can view this collection of
equations. First of all, we could simply replace the $v$ with $dy/dt$ and say we have the second-order
initial-value problem

\[ \frac{d^2y}{dt^2} = -9.8 - \kappa\frac{dy}{dt} \]

with

\[ y(0) = 1{,}000 \quad\text{and}\quad y'(0) = 0 . \]

5 Well, not completely useless; see exercise 1.10 b on page 20.


Recall that the domain of a function is the set of all numbers that can be plugged into the
function. Naturally, if a function is a solution to a differential equation over some interval, then that
function’s domain must include that interval.7
Since we’ve refined our definition of particular solutions, we should make the corresponding
refinement to our definition of a general solution. A general solution to a differential equation over
an interval of interest is a formula or set of formulas describing all possible particular solutions over
that interval.

Describing Particular Solutions

Let us get somewhat technical for a moment. Suppose we have a solution y to some differential
equation over some interval of interest. Remember, we’ve defined y to be a “function”. If you look
up the basic definition of “function” in your calculus text, you’ll find that, strictly speaking, y is a
mapping of one set of numbers (the domain of y ) onto another set of numbers (the range of y ). This
means that, for each value x in the function’s domain, y assigns a corresponding number which
we usually denote $y(x)$ and call “the value of $y$ at $x$”. If we are lucky, the function $y$ is described
by some formula, say, $x^2$. That is, the value of $y(x)$ can be determined for each $x$ in the domain
by the equation

\[ y(x) = x^2 . \]

Strictly speaking, the function $y$, its value at $x$ (i.e., $y(x)$), and any formula describing how to
compute $y(x)$ are different things. In everyday usage, however, the fine distinctions between these
concepts are often ignored, and we say things like
consider the function $x^2$ or consider $y = x^2$
instead of the more correct statement
consider the function $y$ where $y(x) = x^2$ for each $x$ in the domain of $y$.
For our purposes, “everyday usage” will usually suffice, and we won’t worry that much about
the differences between y , y(x) , and a formula describing y . This will save ink and paper, simplify
the English, and, frankly, make it easier to follow many of our computations.
In particular, when we seek a particular solution to a differential equation, we will usually be
quite happy to find a convenient formula describing the solution. We will then probably mildly abuse
terminology by referring to that formula as “the solution”. Please keep in mind that, in fact, any such
formula is just one description of the solution — a very useful description since it tells you how to
compute y(x) for every x in the interval of interest. But other formulas can also describe the same
function. For example, you can easily verify that

\[ x^2 , \quad (x + 3)(x - 3) + 9 \quad\text{and}\quad \int_{t=0}^{x} 2t\,dt \]

are all formulas describing the same function on the real line.
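
A small numerical spot check can make this concrete. The sketch below evaluates the three formulas at a few sample points; the function names are illustrative, and the integral is approximated with a midpoint rule (an assumption made here for convenience, which happens to be exact for a linear integrand such as $2t$, apart from rounding).

```python
def formula_a(x):
    """x^2"""
    return x**2

def formula_b(x):
    """(x + 3)(x - 3) + 9"""
    return (x + 3) * (x - 3) + 9

def formula_c(x, n=1000):
    """Midpoint-rule approximation of the integral of 2t from t = 0 to t = x."""
    width = x / n
    return sum(2 * (k + 0.5) * width for k in range(n)) * width

for x in [-2.5, 0.0, 1.0, 4.0]:
    print(x, formula_a(x), formula_b(x), formula_c(x))
```

Agreement at a handful of points is, of course, only evidence; the algebra and the fundamental theorem of calculus are what show the three formulas agree for every real $x$.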
There will also be differential equations for which we simply cannot ﬁnd a convenient formula
describing the desired solution (or solutions). On those occasions we will have to ﬁnd some alternative
way to describe our solutions. Some of these will involve using the differential equations to sketch
approximations to the graphs of their solutions. Other alternative descriptions will involve formulas
that approximate the solutions and allow us to generate lists of values approximating a solution at
different points. These alternative descriptions may not be as convenient or as accurate as explicit
formulas for the solutions, but they will provide usable information about the solutions.
7 In theory, it makes sense to restrict the domain of a solution to the interval of interest so that irrelevant questions regarding
the behavior of the function off the interval have no chance of arising. At this point of our studies, let us just be sure that a
function serving as a solution makes sense at least over whatever interval we have interest in.


Additional Exercises

1.3. For each differential equation given below, three choices for a possible solution $y = y(x)$
are given. Determine whether each choice is or is not a solution to the given differential equation.
(In each case, assume the interval of interest is the entire real line $(-\infty, \infty)$.)
a. $\dfrac{dy}{dx} = 3y$
i. $y(x) = e^{3x}$ ii. $y(x) = x^3$ iii. $y(x) = \sin(3x)$
b. $x\dfrac{dy}{dx} = 3y$
i. $y(x) = e^{3x}$ ii. $y(x) = x^3$ iii. $y(x) = \sin(3x)$
c. $\dfrac{d^2y}{dx^2} = 9y$
i. $y(x) = e^{3x}$ ii. $y(x) = x^3$ iii. $y(x) = \sin(3x)$
d. $\dfrac{d^2y}{dx^2} = -9y$
i. $y(x) = e^{3x}$ ii. $y(x) = x^3$ iii. $y(x) = \sin(3x)$
e. $x\dfrac{dy}{dx} - 2y = 6x^4$
i. $y(x) = x^4$ ii. $y(x) = 3x^4$ iii. $y(x) = 3x^4 + 5x^2$
f. $\dfrac{d^2y}{dx^2} - 2x\dfrac{dy}{dx} - 2y = 0$
i. $y(x) = \sin(x)$ ii. $y(x) = x^3$ iii. $y(x) = e^{x^2}$
g. $\dfrac{d^2y}{dx^2} + 4y = 12x$
i. $y(x) = \sin(2x)$ ii. $y(x) = 3x$ iii. $y(x) = \sin(2x) + 3x$
h. $\dfrac{d^2y}{dx^2} - 6\dfrac{dy}{dx} + 9y = 0$
i. $y(x) = e^{3x}$ ii. $y(x) = xe^{3x}$ iii. $y(x) = 7e^{3x} - 4xe^{3x}$

1.4. For each initial-value problem given below, three choices for a possible solution $y = y(x)$
are given. Determine whether each choice is or is not a solution to the given initial-value
problem.
a. $\dfrac{dy}{dx} = 4y$ with $y(0) = 5$
i. $y(x) = e^{4x}$ ii. $y(x) = 5e^{4x}$ iii. $y(x) = e^{4x} + 1$
b. $x\dfrac{dy}{dx} = 2y$ with $y(2) = 20$
i. $y(x) = x^2$ ii. $y(x) = 10x$ iii. $y(x) = 5x^2$


c. $\dfrac{d^2y}{dx^2} - 9y = 0$ with $y(0) = 1$ and $y'(0) = 9$
i. $y(x) = 2e^{3x} - e^{-3x}$ ii. $y(x) = e^{3x}$ iii. $y(x) = e^{3x} + 1$
d. $x^2\dfrac{d^2y}{dx^2} - 4x\dfrac{dy}{dx} + 6y = 36x^6$ with $y(1) = 1$ and $y'(1) = 12$
i. $y(x) = 10x^3 - 9x^2$ ii. $y(x) = 3x^6 - 2x^2$ iii. $y(x) = 3x^6 - 2x^3$
1.5. For the following, let

\[ y(x) = \sqrt{x^2 + c} \]

where $c$ is an arbitrary constant.
a. Verify that this $y$ is a solution to

\[ \frac{dy}{dx} = \frac{x}{y} \]

no matter what value $c$ is.
b. What value should $c$ be so that the above $y$ satisfies the initial condition
i. $y(0) = 3$ ? ii. $y(2) = 3$ ?
c. Using your results for the above, give a solution to each of the following initial-value
problems:
i. $\dfrac{dy}{dx} = \dfrac{x}{y}$ with $y(0) = 3$
ii. $\dfrac{dy}{dx} = \dfrac{x}{y}$ with $y(2) = 3$
1.6. For the following, let

\[ y(x) = Ae^{x^2} - 3 \]

where $A$ is an arbitrary constant.
a. Verify that this $y$ is a solution to

\[ \frac{dy}{dx} - 2xy = 6x \]

no matter what value $A$ is.
b. In fact, it can be verified (using methods that will be developed later) that the above
formula for $y$ is a general solution to the above differential equation. Using this fact,
finish solving each of the following initial-value problems:
i. $\dfrac{dy}{dx} - 2xy = 6x$ with $y(0) = 1$
ii. $\dfrac{dy}{dx} - 2xy = 6x$ with $y(1) = 0$
1.7. For the following, let

\[ y(x) = A\cos(2x) + B\sin(2x) \]

where $A$ and $B$ are arbitrary constants.
a. Verify that this $y$ is a solution to

\[ \frac{d^2y}{dx^2} + 4y = 0 \]

no matter what values $A$ and $B$ are.


b. Again, it can be verified that the above formula for $y$ is a general solution to the above
differential equation. Using this fact, finish solving each of the following initial-value
problems:
i. $\dfrac{d^2y}{dx^2} + 4y = 0$ with $y(0) = 3$ and $y'(0) = 8$
ii. $\dfrac{d^2y}{dx^2} + 4y = 0$ with $y(0) = 0$ and $y'(0) = 1$
1.8. It was stated (on page 7) that “$N$th-order sets of initial values are especially appropriate for
$N$th-order differential equations.” The following problems illustrate one reason this is true.
In particular, they demonstrate that, if $y$ satisfies some $N$th-order initial-value problem,
then it automatically satisfies particular higher-order sets of initial values. Because of this,
specifying the initial values for $y^{(m)}$ with $m \ge N$ is unnecessary and may even lead to
problems with no solutions.
a. Assume $y$ satisfies the first-order initial-value problem

\[ \frac{dy}{dx} = 3xy + x^2 \quad\text{with}\quad y(1) = 2 . \]

i. Using the differential equation along with the given value for $y(1)$, determine what
value $y'(1)$ must be.
ii. Is it possible to have a solution to

\[ \frac{dy}{dx} = 3xy + x^2 \]

that also satisfies both $y(1) = 2$ and $y'(1) = 4$ ? (Give a reason.)
iii. Differentiate the given differential equation to obtain a second-order differential equation.
Using the equation obtained along with the now known values for $y(1)$ and $y'(1)$,
find the value of $y''(1)$.
iv. Can we continue and find $y'''(1)$, $y^{(4)}(1)$, …?
b. Assume $y$ satisfies the second-order initial-value problem

\[ \frac{d^2y}{dx^2} + 4\frac{dy}{dx} - 8y = 0 \quad\text{with}\quad y(0) = 3 \quad\text{and}\quad y'(0) = 5 . \]

i. Find the value of $y''(0)$ and of $y'''(0)$.

ii. Is it possible to have a solution to

\[ \frac{d^2y}{dx^2} + 4\frac{dy}{dx} - 8y = 0 \]

that also satisfies all of the following:

\[ y(0) = 3 , \quad y'(0) = 5 \quad\text{and}\quad y''(0) = 0 \ ? \]
1.9. Consider the simplest model we developed for a falling object (see page 10). In that, we
derived

\[ y(t) = -4.9t^2 + 1{,}000 \]

as the formula for the height $y$ above ground of some falling object at time $t$.
a. Find $T_{\text{hit}}$, the time the object hits the ground.
b. What is the velocity of the object when it hits the ground?


c. Suppose that, instead of being dropped at t = 0 , the object is tossed up with an initial
velocity of 2 meters per second. If this is the only change to our problem, then:
i. How does the corresponding initial-value problem change?
ii. What is the solution y(t) to this initial value problem?
iii. What is the velocity of the object when it hits the ground?

1.10. Consider the “better” falling object model (see page 11), in which we derived the differential
equation

\[ \frac{dv}{dt} = -9.8 - \kappa v \tag{1.5} \]

for the velocity. In this, $\kappa$ is some positive constant used to describe the air resistance felt
by the falling object.
a. This differential equation was derived assuming the air was still. What differential equation
would we have derived if, instead, we had assumed there was a steady updraft of 2
meters per second?
b. Recall that, from equation (1.5) we derived the equation

\[ v = -9.8t - \kappa y + c \]

relating the velocity $v$ to the distance above ground $y$ and the time $t$ (see page 13). In the
following, you will show that it, along with experimental data, can be used to determine
the value of $\kappa$.
i. Determine the value of the constant of integration, $c$, in the above equation using the
given initial values (i.e., $y(0) = 1{,}000$ and $v(0) = 0$).
ii. Suppose that, in an experiment, the object was found to hit the ground at $t = T_{\text{hit}}$ with
a speed of $v = v_{\text{hit}}$. Use this, along with the above equation, to find $\kappa$ in terms of $T_{\text{hit}}$
and $v_{\text{hit}}$.

1.11. For the following, let

\[ y(x) = Ax + Bx\ln|x| \]

where $A$ and $B$ are arbitrary constants.
a. Verify that this $y$ is a solution to

\[ x^2\frac{d^2y}{dx^2} - x\frac{dy}{dx} + y = 0 \quad\text{on the intervals } (0, \infty) \text{ and } (-\infty, 0) , \]

no matter what values $A$ and $B$ are.

b. Again, we will later be able to show that the above formula for $y$ is a general solution
for the above differential equation. Given this, find the solution to the above differential
equation satisfying $y(1) = 3$ and $y'(1) = 8$.
c. Why should your answer to 1.11 b not be considered a valid solution to

\[ x^2\frac{d^2y}{dx^2} - x\frac{dy}{dx} + y = 0 \]

over the entire real line, $(-\infty, \infty)$ ?


2
Integration and Differential Equations

Often, when attempting to solve a differential equation, we are naturally led to computing one or
more integrals — after all, integration is the inverse of differentiation. Indeed, we have already
solved one simple second-order differential equation by repeated integration (the one arising in the
simplest falling object model, starting on page 10). Let us now briefly consider the general case
where integration is immediately applicable, and also consider some practical aspects of using both
the indefinite integral and the definite integral.

2.1 Directly-Integrable Equations

We will say that a given first-order differential equation is directly integrable if (and only if) it can
be (re)written as

\[ \frac{dy}{dx} = f(x) \tag{2.1} \]

where $f(x)$ is some known function of just $x$ (no $y$’s). More generally, any $N$th-order differential
equation will be said to be directly integrable if and only if it can be (re)written as

\[ \frac{d^Ny}{dx^N} = f(x) \tag{2.1$'$} \]

where, again, $f(x)$ is some known function of just $x$ (no $y$’s or derivatives of $y$).

!Example 2.1: Consider the equation

\[ x^2\frac{dy}{dx} - 4x = 6 . \tag{2.2} \]

Solving this equation for the derivative:

\[ x^2\frac{dy}{dx} = 4x + 6 \]

\[ \longrightarrow\quad \frac{dy}{dx} = \frac{4x + 6}{x^2} . \]

Since the right-hand side of the last equation depends only on $x$, we do have

\[ \frac{dy}{dx} = f(x) \quad\text{with}\quad f(x) = \frac{4x + 6}{x^2} . \]

So equation (2.2) is directly integrable.


!Example 2.2: Consider the equation

\[ x^2\frac{dy}{dx} - 4xy = 6 . \tag{2.3} \]

Solving this equation for the derivative:

\[ x^2\frac{dy}{dx} = 4xy + 6 \]

\[ \longrightarrow\quad \frac{dy}{dx} = \frac{4xy + 6}{x^2} . \]

Here, the right-hand side of the last equation depends on both $x$ and $y$, not just $x$. So equation
(2.3) is not directly integrable.

Solving a directly-integrable equation is easy. First solve for the derivative to get the equation
into form (2.1) or (2.1$'$), then integrate both sides as many times as needed to eliminate the derivatives,
and, finally, do whatever simplification seems appropriate.

!Example 2.3: Again, consider

\[ x^2\frac{dy}{dx} - 4x = 6 . \tag{2.4} \]

In example 2.1, we saw that it is directly integrable and can be rewritten as

\[ \frac{dy}{dx} = \frac{4x + 6}{x^2} . \]

Integrating both sides of this equation with respect to $x$ (and doing a little algebra):

\[ \int \frac{dy}{dx}\,dx = \int \frac{4x + 6}{x^2}\,dx \tag{2.5} \]

\[ \longrightarrow\quad y(x) + c_1 = \int \left[\frac{4}{x} + \frac{6}{x^2}\right] dx \]

\[ = 4\int x^{-1}\,dx + 6\int x^{-2}\,dx \]

\[ = 4\ln|x| + c_2 - 6x^{-1} + c_3 \]

where $c_1$, $c_2$, and $c_3$ are arbitrary constants. Rearranging things slightly and letting
$c = c_2 + c_3 - c_1$, this last equation simplifies to

\[ y(x) = 4\ln|x| - 6x^{-1} + c . \tag{2.6} \]

This is our general solution to differential equation (2.4). Since both $\ln|x|$ and $x^{-1}$ are discontinuous
at $x = 0$, the solution can be valid over any interval not containing $x = 0$.
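
One way to gain confidence in a general solution like (2.6) is to differentiate it numerically and compare with the right-hand side $f(x) = (4x + 6)/x^2$. The following is only a sketch under stated assumptions: the derivative is approximated by a central difference, and the function names, the sample values of $c$, and the sample points (chosen on both sides of $x = 0$, never at it) are illustrative.

```python
import math

def general_solution(x, c):
    """y(x) = 4 ln|x| - 6/x + c, the formula from equation (2.6)."""
    return 4.0 * math.log(abs(x)) - 6.0 / x + c

def right_side(x):
    """f(x) = (4x + 6) / x^2, the right-hand side after solving for dy/dx."""
    return (4.0 * x + 6.0) / x**2

def derivative(f, x, h=1e-6):
    """Central-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

for c in [0.0, -3.0, 12.5]:
    for x in [-4.0, -0.5, 0.5, 2.0]:
        slope = derivative(lambda s: general_solution(s, c), x)
        print(f"c = {c:5.1f}, x = {x:5.1f}: dy/dx = {slope:.6f}, f(x) = {right_side(x):.6f}")
```

The two columns agree for every choice of $c$, as they should, since the constant disappears on differentiation.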

?Exercise 2.1: Consider the differential equation in example 2.2 and explain why the $y$, which
is an unknown function of $x$, makes it impossible to completely integrate both sides of

\[ \frac{dy}{dx} = \frac{4xy + 6}{x^2} \]

with respect to $x$.


2.2 On Using Indeﬁnite Integrals

This is a good point to observe that, whenever we take the indeﬁnite integrals of both sides of an
equation, we obtain a bunch of arbitrary constants — c1 , c2 , . . . (one constant for each integral)
— that can be combined into a single arbitrary constant c . In the future, rather than note all the
arbitrary constants that arise and how they combine into a single arbitrary constant c that is added to
the right-hand side in the end, let us agree to simply add that c at the end. Let’s not explicitly note
all the intermediate arbitrary constants. If, for example, we had agreed to this before doing the last
example, then we could have replaced all that material from equation (2.5) to equation (2.6) with

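\[ \begin{aligned}
\int \frac{dy}{dx}\,dx &= \int \frac{4x + 6}{x^2}\,dx \\
\Longrightarrow\quad y(x) &= \int \left[ \frac{4}{x} + \frac{6}{x^2} \right] dx \\
&= 4 \int x^{-1}\,dx + 6 \int x^{-2}\,dx \\
&= 4 \ln|x| - 6x^{-1} + c .
\end{aligned} \]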
This should simplify our computations a little.
This convention of “implicitly combining all the arbitrary constants” also allows us to write

\[ y(x) = \int \frac{dy}{dx}\,dx \tag{2.7} \]
instead of
\[ y(x) + \text{some arbitrary constant} = \int \frac{dy}{dx}\,dx . \]
By our new convention, that “some arbitrary constant” is still in equation (2.7) — it’s just been moved
to the right-hand side of the equation and combined with the constants arising from the integral there.
Finally, like you, this author will get tired of repeatedly saying “where c is an arbitrary constant”
when it is obvious that the c (or the c1 or the A or …) that just appeared in the previous line is,
indeed, some arbitrary constant. So let us not feel compelled to constantly repeat the obvious, and
agree that, when a new symbol suddenly appears in the computation of an indeﬁnite integral, then,
yes, that is an arbitrary constant. Remember, though, to use different symbols for the different
constants that arise when integrating a function already involving an arbitrary constant.

!Example 2.4: Consider solving

\[ \frac{d^2 y}{dx^2} = 18x^2 . \tag{2.8} \]
Clearly, this is directly integrable and will require two integrations. The first integration yields
\[ \frac{dy}{dx} = \int \frac{d^2 y}{dx^2}\,dx = \int 18x^2\,dx = \frac{18}{3}x^3 + c_1 . \]
Cutting out the middle leaves
\[ \frac{dy}{dx} = 6x^3 + c_1 . \]
Integrating this, we have
\[ y(x) = \int \frac{dy}{dx}\,dx = \int \left[ 6x^3 + c_1 \right] dx = \frac{6}{4}x^4 + c_1 x + c_2 . \]


So the general solution to equation (2.8) is

\[ y(x) = \frac{3}{2}x^4 + c_1 x + c_2 . \]

In practice, rather than use the same letter with different subscripts for different arbitrary
constants (as we did in the above example), you might just want to use different letters, say, writing
\[ y(x) = \frac{3}{2}x^4 + ax + b \]
instead of
\[ y(x) = \frac{3}{2}x^4 + c_1 x + c_2 . \]

This sometimes prevents dumb mistakes due to bad handwriting.
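
The two integrations in example 2.4 can also be mimicked mechanically. The following Python sketch (an illustration only; the coefficient-list representation and the helper name integrate_poly are assumptions made here, not anything from the text) stores a polynomial as its list of coefficients and introduces one new constant per integration, just as we did by hand.

```python
def integrate_poly(coeffs, constant):
    """Integrate sum(coeffs[k] * x**k) term by term.

    `constant` plays the role of the arbitrary constant of integration
    and becomes the new x**0 coefficient.
    """
    return [constant] + [a / (k + 1) for k, a in enumerate(coeffs)]

c1, c2 = 2.0, -5.0   # sample values standing in for the arbitrary constants

dy = integrate_poly([0.0, 0.0, 18.0], c1)   # 18x^2  ->  6x^3 + c1
y = integrate_poly(dy, c2)                  # ->  (3/2)x^4 + c1*x + c2

print(dy)  # [2.0, 0.0, 0.0, 6.0]
print(y)   # [-5.0, 2.0, 0.0, 0.0, 1.5]
```

Each call adds exactly one new constant, which is why a second-order directly-integrable equation ends up with two arbitrary constants in its general solution.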

2.3 On Using Deﬁnite Integrals

Basic Ideas
We have been using the indefinite integral to recover y(x) from dy/dx via the relation
\[ \int \frac{dy}{dx}\,dx = y(x) + c . \]

Here, c is some constant (which we’ve agreed to automatically combine with other constants from
other integrals).
We could just about as easily have used the corresponding definite integral relation
\[ \int_a^x \frac{dy}{ds}\,ds = y(x) - y(a) \tag{2.9} \]

to recover y(x) from its derivative. Note that, here, we’ve used s instead of x to denote the variable
of integration. This prevents the confusion that can arise when using the same symbol for both the
variable of integration and the upper limit in the integral. The lower limit, a , can be chosen to be
any convenient value. In particular, if we are also dealing with initial values, then it makes sense to
set a equal to the point at which the initial values are given. That way (as we will soon see) we will
obtain a general solution in which the undetermined constant is simply the initial value.
Aside from getting it into the form
\[ \frac{dy}{dx} = f(x) , \]
there are two simple steps that should be taken before using the definite integral to solve a first-order,
directly-integrable differential equation:
1. Pick a convenient value for the lower limit of integration a . In particular, if the value of
y(x_0) is given for some point x_0 , set a = x_0 .
2. Rewrite the differential equation with s denoting the variable instead of x (i.e., replace x
with s ),
\[ \frac{dy}{ds} = f(s) . \tag{2.10} \]


After that, simply integrate both sides of equation (2.10) with respect to s from a to x :
\[ \int_a^x \frac{dy}{ds}\,ds = \int_a^x f(s)\,ds
\quad\Longrightarrow\quad y(x) - y(a) = \int_a^x f(s)\,ds . \]
Then solve for y(x) by adding y(a) to both sides,
\[ y(x) = \int_a^x f(s)\,ds + y(a) . \tag{2.11} \]

This is a general solution to the given differential equation. It should be noted that the integral here
is a deﬁnite integral. Its evaluation does not lead to any arbitrary constants. However, the value of
y(a) , until speciﬁed, can be anything; so y(a) is the “arbitrary constant” in this general solution.

!Example 2.5: Consider solving the initial-value problem

\[ \frac{dy}{dx} = 3x^2 \quad\text{with}\quad y(2) = 12 . \]
Since we know the value of y(2) , we will use 2 as the lower limit for our integrals. Rewriting
the differential equation with s replacing x gives
\[ \frac{dy}{ds} = 3s^2 . \]
Integrating this with respect to s from 2 to x :
\[ \int_2^x \frac{dy}{ds}\,ds = \int_2^x 3s^2\,ds
\quad\Longrightarrow\quad y(x) - y(2) = s^3 \Big|_2^x = x^3 - 2^3 . \]
Solving for y(x) (and computing 2^3 ) then gives us
\[ y(x) = x^3 - 8 + y(2) . \]
This is a general solution to our differential equation. To find the particular solution that also
satisfies y(2) = 12 , as desired, we simply replace the y(2) in the general solution with its given
value,
\[ y(x) = x^3 - 8 + y(2) = x^3 - 8 + 12 = x^3 + 4 . \]
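
Formula (2.11) translates almost directly into code. The sketch below is a minimal Python illustration, assuming we approximate the definite integral with composite Simpson's rule; the function names simpson and solve_by_definite_integral are hypothetical, chosen here for readability. It reproduces the solution y(x) = x^3 + 4 of example 2.5.

```python
def simpson(f, a, b, n=200):
    """Composite Simpson's rule for the integral of f from a to b (n even)."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0

def solve_by_definite_integral(f, a, y_a, x):
    """Formula (2.11): y(x) = (integral of f from a to x) + y(a)."""
    return simpson(f, a, x) + y_a

f = lambda s: 3.0 * s**2          # right-hand side of dy/ds = 3s^2

# Initial value y(2) = 12, so a = 2; compare with the exact answer x^3 + 4.
for x in (-1.0, 2.0, 3.5):
    approx = solve_by_definite_integral(f, 2.0, 12.0, x)
    assert abs(approx - (x**3 + 4.0)) < 1e-9
```

The same function works for x to the left of a , since the step h simply becomes negative. No arbitrary constant ever appears; the initial value y(a) is passed in directly.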

Of course, rather than go through the procedure just outlined to solve

\[ \frac{dy}{dx} = f(x) , \]
we could, after determining a and f (s) , just plug these into equation (2.11),
\[ y(x) = \int_a^x f(s)\,ds + y(a) , \]

and compute the integral. That is, after all, what we derived for any choice of f .


Advantages of Using Deﬁnite Integrals

By using definite integrals instead of indefinite integrals, we avoid dealing with arbitrary constants
and end up with expressions explicitly involving initial values. This is sometimes convenient.
A much more important advantage of using definite integrals is that they result in concrete,
computable formulas even when the corresponding indefinite integrals cannot be evaluated. Let us
look at a classic example.

!Example 2.6: Consider solving the initial-value problem

\[ \frac{dy}{dx} = e^{-x^2} \quad\text{with}\quad y(0) = 0 . \]

In particular, determine the value of y(x) when x = 10 .

Using indeﬁnite integrals yields

\[ y(x) = \int \frac{dy}{dx}\,dx = \int e^{-x^2}\,dx . \]

Unfortunately, this integral was not one you learned to evaluate in calculus.1 And if you check
the tables, you will discover that no one else has discovered a usable formula for this integral.
Consequently, the above formula for y(x) is not very usable. Heck, we can’t even isolate an
arbitrary constant or see how the solution depends on the initial value.
On the other hand, using deﬁnite integrals, we get
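\[ \begin{aligned}
\int_0^x \frac{dy}{ds}\,ds &= \int_0^x e^{-s^2}\,ds \\
\Longrightarrow\quad y(x) - y(0) &= \int_0^x e^{-s^2}\,ds \\
\Longrightarrow\quad y(x) &= \int_0^x e^{-s^2}\,ds + y(0) .
\end{aligned} \]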

This last formula explicitly describes how y(x) depends on the initial value y(0) . Since we are
assuming y(0) = 0 , this reduces to
\[ y(x) = \int_0^x e^{-s^2}\,ds . \]

We still cannot ﬁnd a computable formula for this integral, but, if we choose a speciﬁc value for
x , say, x = 10 , this expression becomes
\[ y(10) = \int_0^{10} e^{-s^2}\,ds . \]

The value of this integral can be very accurately approximated using any of a number of numerical
integration methods such as the trapezoidal rule or Simpson’s rule. In practice, of course, we’ll
just use the numerical integration command in our favorite computer math package (Maple,
Mathematica, etc.). Using any such package, you will ﬁnd that
\[ y(10) = \int_0^{10} e^{-s^2}\,ds \approx 0.886 . \]
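
For readers without Maple or Mathematica at hand, here is a minimal Python sketch of the same computation (the helper simpson and its step count are illustrative choices, not anything prescribed by the text). It approximates the integral with composite Simpson's rule and compares the result against the standard identity relating this integral to the error function, \( \int_0^x e^{-s^2}\,ds = \tfrac{\sqrt{\pi}}{2}\,\mathrm{erf}(x) \), using math.erf from the standard library.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson's rule for the integral of f from a to b (n even)."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0

# y(10) = integral of exp(-s^2) from 0 to 10
value = simpson(lambda s: math.exp(-s * s), 0.0, 10.0)

# Cross-check via the error function: (sqrt(pi)/2) * erf(10)
reference = 0.5 * math.sqrt(math.pi) * math.erf(10.0)

print(round(value, 3))   # 0.886
assert abs(value - reference) < 1e-8
```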

1 Well, you could expand e^{-x^2} in a Taylor series and integrate the series.


models the energy production of this device when it’s kept in the dark until a light bulb (of unit
intensity) is suddenly switched on at t = 0 .
Computing the integrals of such functions is simply a matter of computing the integrals of the
various “pieces”, and then putting the integrated pieces together appropriately. Precisely how you do
that depends on whether you are using indefinite integrals or definite integrals. Either can be used,
but there is a good reason to prefer definite integrals: They automatically yield continuous solutions
(if such solutions exist). With indefinite integrals you must do extra work to ensure the necessary
continuity. To illustrate the basic ideas, let us solve the differential equation given at the start of this
section both ways: first using definite integrals, then using indefinite integrals.

!Example 2.8 (using deﬁnite integrals): We seek a general solution to

\[ \frac{dy}{dx} = f(x) \quad\text{where}\quad
f(x) = \begin{cases} x^2 & \text{if } x < 2 \\ 1 & \text{if } 2 \le x \end{cases} . \]

Taking the definite integral (starting, for no good reason, at 0 ), we have
\[ y(x) = \int_0^x f(s)\,ds + y(0) \quad\text{where}\quad
f(s) = \begin{cases} s^2 & \text{if } s < 2 \\ 1 & \text{if } 2 \le s \end{cases} . \]

Now, if x ≤ 2 , then f (s) = s^2 for every value of s in the interval (0, x) . So, when
x ≤ 2 ,
\[ \int_0^x f(s)\,ds = \int_0^x s^2\,ds = \frac{1}{3}s^3 \Big|_{s=0}^{x} = \frac{1}{3}x^3 . \]
(Notice that this integral is valid for x = 2 even though the formula used for f (s) , s^2 , was only
valid for s < 2 .)
On the other hand, if 2 < x , we must break the integral into two pieces, the one over (0, 2)
and the one over (2, x) :
\[ \begin{aligned}
\int_0^x f(s)\,ds &= \int_0^2 f(s)\,ds + \int_2^x f(s)\,ds \\
&= \int_0^2 s^2\,ds + \int_2^x 1\,ds \\
&= \frac{1}{3}s^3 \Big|_{s=0}^{2} + s \Big|_{s=2}^{x} \\
&= \left[ \frac{1}{3}\cdot 2^3 - 0 \right] + [x - 2] = x + \frac{2}{3} .
\end{aligned} \]

Figure 2.1: Three piecewise defined functions: (a) the step function, (b) the ramp function, (c)
f (x) from example 2.8.


Thus, our general solution is

\[ y(x) = \int_0^x f(s)\,ds + y(0) =
\begin{cases} \frac{1}{3}x^3 + y(0) & \text{if } x \le 2 \\[4pt]
x + \frac{2}{3} + y(0) & \text{if } 2 < x \end{cases} . \]
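
The piecewise formula just obtained is easy to test numerically. The following Python sketch (an illustration under stated assumptions: the midpoint rule, the step size 0.001, and the function names are choices made here) compares the closed form against a direct numerical integration of f , and also checks that the two pieces agree at x = 2 .

```python
def f(s):
    """Piecewise right-hand side from example 2.8."""
    return s * s if s < 2 else 1.0

def y(x, y0=0.0):
    """Closed-form solution from example 2.8, with y0 = y(0)."""
    if x <= 2:
        return x**3 / 3.0 + y0
    return x + 2.0 / 3.0 + y0

def midpoint_integral(g, a, b, n):
    """Midpoint rule for the integral of g from a to b with n cells."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

for x in (1.0, 2.0, 3.0, 5.0):
    n = int(round(x * 1000))   # cell width 0.001, so the jump at s = 2 sits on a cell edge
    assert abs(midpoint_integral(f, 0.0, x, n) - y(x)) < 1e-5

# The two formulas meet at x = 2: no jump in y even though f jumps there.
assert abs(y(2.0) - y(2.0 + 1e-9)) < 1e-8
```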

Keep in mind that solutions to differential equations are required to be continuous. After
checking the above formulas, it should be obvious that the y(x) obtained in the last example is
continuous everywhere except, possibly, at x = 2 . With a little work we could also verify that, in
fact, we also have continuity at x = 2 . We simply have to recall the limit definition of continuity,
and verify that the appropriate requirements are satisfied. But we won’t bother because, in a little
bit, it will be seen that solutions so obtained via definite integration are guaranteed to be continuous,
provided the discontinuities in the function being integrated are not too bad.
On the other hand, as the next example illustrates, the continuity of the solution is an issue when
we use indefinite integrals.

!Example 2.9 (using indeﬁnite integrals): Again, our differential equation is

\[ \frac{dy}{dx} = f(x) \quad\text{where}\quad
f(x) = \begin{cases} x^2 & \text{if } x < 2 \\ 1 & \text{if } 2 \le x \end{cases} . \]

The indefinite integral of f (x) is computed by simply finding the indefinite integral of each
“piece”, noting the values of the variable for which the integration is valid. Thus,
\[ y(x) = \int f(x)\,dx =
\begin{cases} \int x^2\,dx & \text{if } x < 2 \\[4pt] \int 1\,dx & \text{if } 2 \le x \end{cases}
= \begin{cases} \frac{1}{3}x^3 + c_1 & \text{if } x < 2 \\[4pt] x + c_2 & \text{if } 2 \le x \end{cases} . \]

Again, I remind you that solutions to differential equations are required to be continuous. And,
again, it should be obvious that the y(x) just obtained is continuous everywhere except, possibly,
at x = 2 . Now, recall what it means to say “ y(x) is continuous at x = 2 ”: it means
\[ \lim_{x \to 2} y(x) = y(2) . \]
Here, however, y(x) is given by different formulas on either side of x = 2 . So we will have to
consider both the left- and the right-hand limits, and require that
\[ \lim_{x \to 2^-} y(x) = y(2) = \lim_{x \to 2^+} y(x) . \]
Using the above set of formulas for y(x) , we see that
\[ y(2) = 2 + c_2 , \]
\[ \lim_{x \to 2^-} y(x) = \lim_{x \to 2} \left[ \frac{1}{3}x^3 + c_1 \right]
= \frac{1}{3}\cdot 2^3 + c_1 = \frac{8}{3} + c_1 \]
and
\[ \lim_{x \to 2^+} y(x) = \lim_{x \to 2} [x + c_2] = 2 + c_2 . \]
So our requirement for continuity at x = 2 ,
\[ \lim_{x \to 2^-} y(x) = y(2) = \lim_{x \to 2^+} y(x) , \]


becomes
\[ \frac{8}{3} + c_1 = 2 + c_2 . \]
This, in turn, means that the “arbitrary constants” c_1 and c_2 are not completely arbitrary; they
must be related by
\[ \frac{8}{3} + c_1 = 2 + c_2 . \]
Consequently, we must insist that
\[ c_1 = c_2 + 2 - \frac{8}{3} = c_2 - \frac{2}{3} \]
or, equivalently, that
\[ c_2 = c_1 + \frac{2}{3} . \]
Choosing the latter, we finally get a valid general solution to our differential equation, namely,
\[ y(x) =
\begin{cases} \frac{1}{3}x^3 + c_1 & \text{if } x < 2 \\[4pt] x + c_2 & \text{if } 2 \le x \end{cases}
= \begin{cases} \frac{1}{3}x^3 + c_1 & \text{if } x < 2 \\[4pt] x + \frac{2}{3} + c_1 & \text{if } 2 \le x \end{cases} . \]
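
To see concretely why the two constants cannot be chosen independently, consider the following small Python sketch (illustrative only; the names y_indefinite and jump_at_2 and the sample value of c1 are assumptions made for this demonstration). It measures the jump in y at x = 2 for a pair of constants.

```python
def y_indefinite(x, c1, c2):
    """Piecewise antiderivative from example 2.9, before relating c1 and c2."""
    return x**3 / 3.0 + c1 if x < 2 else x + c2

def jump_at_2(c1, c2, eps=1e-9):
    """Approximate y(2) minus the left-hand limit of y at 2."""
    return y_indefinite(2.0, c1, c2) - y_indefinite(2.0 - eps, c1, c2)

c1 = 5.0
print(jump_at_2(c1, c1))               # about -2/3: unrelated constants leave a jump
print(jump_at_2(c1, c1 + 2.0 / 3.0))   # about 0: c2 = c1 + 2/3 restores continuity
```

Only the relation c2 = c1 + 2/3 removes the jump, which is exactly the condition derived above.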

In practice, a given piecewise deﬁned function may have more than two “pieces”, and the
differential equation may have order higher than one. For example, you may be called upon to solve
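\[ \frac{d^2 y}{dx^2} = f(x) \quad\text{where}\quad
f(x) = \begin{cases} 0 & \text{if } x < 1 \\ 1 & \text{if } 1 \le x < 2 \\ 0 & \text{if } 2 \le x \end{cases} \]
or even something involving infinitely many pieces, such as
\[ \frac{d^4 y}{dx^4} = \operatorname{stair}(x) \quad\text{where}\quad
\operatorname{stair}(x) = \begin{cases} 0 & \text{if } x < 0 \\ 1 & \text{if } 0 \le x < 1 \\
2 & \text{if } 2 \le x < 3 \\ 3 & \text{if } 3 \le x < 4 \\ \;\vdots \end{cases} . \tag{2.14} \]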

The methods illustrated in the two examples can still be applied; you just have more integrals to keep
track of, and the accompanying bookkeeping becomes more involved. If you use indeﬁnite integrals,
make sure to relate all the “arbitrary” constants to each other so that your solution is continuous. If
you use deﬁnite integrals, then any concerns about the continuity of your solutions can probably be
alleviated by the discussion in the next subsection.

Continuity of the Integrals

x = 0, 1, 2, 3, 4, 5, . . . .

However, on the finite interval (−N , N ) , where N is any positive integer, the only discontinuities
of stair(x) are at
x = 0, 1, 2, 3, 4, . . . , N − 1 .
So stair(x) has only a finite number of discontinuities on (−N , N ) , and each of these is a finite-jump
discontinuity. The theorem then tells us that
\[ g(x) = \int_0^x \operatorname{stair}(s)\,ds \]

is continuous at each x in (−N , N ) . Since N can be made as large as we wish, we can conclude
that, in fact, g(x) is continuous at every x in (−∞, ∞) .
Now let’s prove our theorem. For the proof, we will use facts based on “area arguments” that
you should recall from your elementary calculus course.

PROOF (of theorem 2.1): First of all, note that the two requirements placed on f ensure
\[ g(x) = \int_a^x f(s)\,ds \]
is well defined for any x in (α, β) using any of the definitions for the integral found in most
calculus texts (check this out yourself, using the definition in your calculus text). They also prevent
f (x) from “blowing up” on any closed subinterval [α′, β′] of (α, β) . Thus, for each such closed
subinterval [α′, β′] , there is a corresponding finite constant M such that5
\[ |f(s)| \le M \quad\text{whenever}\quad \alpha' \le s \le \beta' . \]
Now, to verify the claimed continuity of g , we must show that
\[ \lim_{x \to x_0} g(x) = g(x_0) \tag{2.16} \]

5 The constant M can be the maximum value of | f (s)| on [α′, β′] , provided that maximum exists. It may change if either
endpoint α′ or β′ is changed.


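for any x_0 in (α, β) . But by the definition of g and well-known properties of integration,
\[ \begin{aligned}
\lim_{x \to x_0} g(x) &= \lim_{x \to x_0} \int_a^x f(s)\,ds \\
&= \lim_{x \to x_0} \left[ \int_a^{x_0} f(s)\,ds + \int_{x_0}^x f(s)\,ds \right] \\
&= \lim_{x \to x_0} \left[ g(x_0) + \int_{x_0}^x f(s)\,ds \right]
= g(x_0) + \lim_{x \to x_0} \int_{x_0}^x f(s)\,ds .
\end{aligned} \]
So, to show equation (2.16) holds, it suffices to confirm that
\[ \lim_{x \to x_0} \int_{x_0}^x f(s)\,ds = 0 , \]
which, in turn, is equivalent to confirming that
\[ \lim_{x \to x_0^+} \left| \int_{x_0}^x f(s)\,ds \right| = 0 \quad\text{and}\quad
\lim_{x \to x_0^-} \left| \int_{x_0}^x f(s)\,ds \right| = 0 . \tag{2.17} \]
To do this, pick any two finite values α′ and β′ satisfying α < α′ < x_0 < β′ < β . As noted,
there is some finite constant M bigger than | f (s)| on [α′, β′] . So, if x_0 ≤ x ≤ β′ ,
\[ 0 \le \left| \int_{x_0}^x f(s)\,ds \right| \le \int_{x_0}^x |f(s)|\,ds \le \int_{x_0}^x M\,ds = M[x - x_0] . \]
Similarly, if α′ < x < x_0 , then
\[ \begin{aligned}
0 \le \left| \int_{x_0}^x f(s)\,ds \right| &= \left| -\int_x^{x_0} f(s)\,ds \right| = \left| \int_x^{x_0} f(s)\,ds \right| \\
&\le \int_x^{x_0} |f(s)|\,ds \le \int_x^{x_0} M\,ds = M[x_0 - x] .
\end{aligned} \]
Hence,
\[ 0 \le \lim_{x \to x_0^+} \left| \int_{x_0}^x f(s)\,ds \right| \le \lim_{x \to x_0^+} M[x - x_0] = M[x_0 - x_0] = 0 \]
and
\[ 0 \le \lim_{x \to x_0^-} \left| \int_{x_0}^x f(s)\,ds \right| \le \lim_{x \to x_0^-} M[x_0 - x] = M[x_0 - x_0] = 0 , \tag{2.18} \]
which, of course, means that equation set (2.17) holds.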

Additional Exercises

2.2. Determine whether each of the following differential equations is or is not directly
integrable:
a. \( \frac{dy}{dx} = 3 - \sin(x) \)
b. \( \frac{dy}{dx} = 3 - \sin(y) \)
c. \( \frac{dy}{dx} + 4y = e^{2x} \)
d. \( x\frac{dy}{dx} = \arcsin(x^2) \)
e. \( y\frac{dy}{dx} = 2x \)
f. \( \frac{d^2 y}{dx^2} = \frac{x + 1}{x - 1} \)
g. \( x^2 \frac{d^2 y}{dx^2} = 1 \)
h. \( y^2 \frac{d^2 y}{dx^2} = 8x^2 \)
i. \( \frac{d^2 y}{dx^2} + 3\frac{dy}{dx} + 8y = e^{-x^2} \)
j. \( x^2 \frac{d^2 y}{dx^2} + 3x\frac{dy}{dx} = 0 \)

2.3. Find a general solution for each of the following directly integrable equations. (Use
indefinite integrals on these.)
a. \( \frac{dy}{dx} = 4x^3 \)
b. \( \frac{dy}{dx} = 20e^{-4x} \)
c. \( x\frac{dy}{dx} + \sqrt{x} = 2 \)
d. \( \sqrt{x + 4}\,\frac{dy}{dx} = 1 \)
e. \( \frac{dy}{dx} = x\cos\left(x^2\right) \)
f. \( \frac{dy}{dx} = x\cos(x) \)
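g. \( x\frac{dy}{dx} = x^2 - 9 \)
h. \( 1 = \left(x^2 - 9\right)\frac{dy}{dx} \)
i. \( 1 = x^2 - 9\frac{dy}{dx} \)
j. \( \frac{d^2 y}{dx^2} = \sin(2x) \)
k. \( \frac{d^2 y}{dx^2} - 3 = x \)
l. \( \frac{d^4 y}{dx^4} = 1 \)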
2.4. Solve each of the following initial-value problems (using the indefinite integral). Also, state
the largest interval over which the solution is valid (i.e., the maximal possible interval of
interest).
a. \( \frac{dy}{dx} = 4x + 10e^{2x} \) with y(0) = 4
b. \( \sqrt[3]{x + 6}\,\frac{dy}{dx} = 1 \) with y(2) = 10
c. \( \frac{dy}{dx} = \frac{x - 1}{x + 1} \) with y(0) = 8
d. \( x\frac{dy}{dx} + 2 = \sqrt{x} \) with y(1) = 6
e. \( \cos(x)\frac{dy}{dx} - \sin(x) = 0 \) with y(0) = 3
f. \( \left(x^2 + 1\right)\frac{dy}{dx} = 1 \) with y(0) = 3
g. \( x\frac{d^2 y}{dx^2} + 2 = \sqrt{x} \) with y(1) = 8 and y′(1) = 6

2.5 a. Using definite integrals (as in example 2.5 on page 25), find the general solution to
\[ \frac{dy}{dx} = \sin\left(\frac{x}{2}\right) \]
with y(0) acting as the arbitrary constant.
b. Using the formula just found for y(x) :
i. Find y(π) when y(0) = 0 .
ii. Find y(π) when y(0) = 3 .
iii. Find y(2π) when y(0) = 3 .
2.6 a. Using definite integrals (as in example 2.5 on page 25), find the general solution to
\[ \frac{dy}{dx} = 3\sqrt{x + 3} \]
with y(1) acting as the arbitrary constant.
b. Using the formula just found for y(x) :
i. Find y(6) when y(1) = 16 .
ii. Find y(6) when y(1) = 20 .
iii. Find y(−2) when y(1) = 0 .
2.7. Using definite integrals (as in example 2.5 on page 25), find the solution to each of the
following initial-value problems. (In some cases, you may want to use the error function
or the sine-integral function.)
a. \( \frac{dy}{dx} = x e^{-x^2} \) with y(0) = 3
b. \( \frac{dy}{dx} = \frac{x}{x^2 + 5} \) with y(2) = 7
c. \( \frac{dy}{dx} = \frac{1}{x^2 + 1} \) with y(1) = 0
d. \( \frac{dy}{dx} = e^{-9x^2} \) with y(0) = 1
e. \( x\frac{dy}{dx} = \sin(x) \) with y(0) = 4
f. \( x\frac{dy}{dx} = \sin\left(x^2\right) \) with y(0) = 0
2.8. Using an appropriate computer math package (such as Maple or Mathematica), graph each
of the following over the interval 0 ≤ x ≤ 10 :
a. the error function, erf(x) .
b. the sine integral function, Si(x) .
c. the solution to
\[ \frac{dy}{dx} = \ln\left| 2 + x^2 \sin(x) \right| \quad\text{with}\quad y(0) = 0 . \]
2.9. Each of the following differential equations involves a function that is (or can be) piecewise
defined. Sketch the graph of each of these piecewise defined functions, and find the general
solution of each differential equation. If an initial value is also given, then also solve the
given initial-value problem:
a. \( \frac{dy}{dx} = \operatorname{step}(x) \) with y(0) = 0 and step(x) as defined on page 28
b. \( \frac{dy}{dx} = f(x) \) with y(0) = 2 and
\( f(x) = \begin{cases} 0 & \text{if } x < 1 \\ 1 & \text{if } 1 \le x \end{cases} \)
c. \( \frac{dy}{dx} = f(x) \) with y(0) = 0 and
\( f(x) = \begin{cases} 0 & \text{if } x < 1 \\ 1 & \text{if } 1 \le x < 2 \\ 0 & \text{if } 2 \le x \end{cases} \)
d. \( \frac{dy}{dx} = |x - 2| \)
e. \( \frac{dy}{dx} = \operatorname{stair}(x) \) for x < 4 with y(0) = 0 and stair(x) as defined on page 31

Part II
First-Order Equations

3
Some Basics about First-Order Equations

For the next few chapters, our attention will be focused on first-order differential equations. We will
discover that these equations can often be solved using methods developed directly from the tools
of elementary calculus. And even when these equations cannot be explicitly solved, we will still be
able to use fundamental concepts from elementary calculus to obtain good approximations to the
desired solutions.
But first, let us discuss a few basic ideas that will be relevant throughout our discussion of
first-order differential equations.

3.1 Algebraically Solving for the Derivative

Here are some of the ﬁrst-order differential equations that we have seen or will see in the next few
chapters:
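\[ x^2 \frac{dy}{dx} - 4x = 6 , \]
\[ \frac{dy}{dx} - x^2 y^2 = x^2 , \]
\[ \frac{dy}{dx} + 4xy = 2xy^2 , \]
and
\[ x\frac{dy}{dx} + 4y = x^3 . \]
One thing we can do with each of these equations is to algebraically solve for the derivative. Doing
this with the first equation:
\[ x^2 \frac{dy}{dx} - 4x = 6
\quad\Longrightarrow\quad x^2 \frac{dy}{dx} = 6 + 4x
\quad\Longrightarrow\quad \frac{dy}{dx} = \frac{4x + 6}{x^2} . \]
For the second equation:
\[ \frac{dy}{dx} - x^2 y^2 = x^2
\quad\Longrightarrow\quad \frac{dy}{dx} = x^2 + x^2 y^2 . \]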


Solving for the derivative is often a good first step towards solving a first-order differential equation.
For example, the first equation above is directly integrable — solving for the derivative yielded
\[ \frac{dy}{dx} = \frac{4x + 6}{x^2} , \]
and y(x) can now be found by simply integrating both sides with respect to x .
Even when the equation is not directly integrable and we get
\[ \frac{dy}{dx} = \text{``a formula of both } x \text{ and } y\text{''} , \]
as in our second equation above,
\[ \frac{dy}{dx} = x^2 + x^2 y^2 , \]
that formula on the right can still give us useful information about the possible solutions and
can help us determine which method is appropriate for obtaining the general solution. Observe, for
example, that the right-hand side of the last equation can be factored into a formula of x and a
formula of y ,
\[ \frac{dy}{dx} = x^2 \left( 1 + y^2 \right) . \]
In the next chapter, we will find that this means the equation is “separable” and can be solved by a
procedure developed for just such equations.
For convenience, let us say that a first-order differential equation is in derivative formula form
if it is written as
\[ \frac{dy}{dx} = F(x, y) \tag{3.1} \]
where F(x, y) is some (known) formula of x and/or y . Remember, to convert a given first-order
differential equation to derivative formula form, simply use a little algebra to solve the differential
equation for the derivative.

?Exercise 3.1: Verify that the derivative formula forms of

\[ \frac{dy}{dx} + 4y = 3y^3 \quad\text{and}\quad x\frac{dy}{dx} + 4xy = 2y^2 \]
are
\[ \frac{dy}{dx} = 3y^3 - 4y \quad\text{and}\quad \frac{dy}{dx} = \frac{2y^2 - 4xy}{x} , \]
respectively.

Keep in mind that the right side of equation (3.1), F(x, y) , need not always be a formula of
both x and y . As we saw in an example above, the equation might be directly integrable. In this
case, the right side of the above derivative formula form reduces to some f (x) , a formula involving
only x ,
\[ \frac{dy}{dx} = f(x) . \]
Alternatively, the right side may end up being a formula involving only y , F(x, y) = g(y) . We
have a word for such differential equations; that word is “autonomous”. That is, an autonomous
ﬁrst-order differential equation is a differential equation that can be written as
\[ \frac{dy}{dx} = g(y) \]

Since the derivative of a constant function is zero, plugging in this function, y = 2 , into
\[ \frac{dy}{dx} = 2xy^2 - 4xy \]
gives
\[ 0 = 2x \cdot 2^2 - 4x \cdot 2 , \]
which, after a little arithmetic and algebra, reduces further to
\[ 0 = 0 . \]
Hence, our constant function satisfies our differential equation, and, so, is a constant solution to
that differential equation.
On the other hand, plugging the constant function
\[ y(x) = 3 \quad\text{for all } x \]
into
\[ \frac{dy}{dx} = 2xy^2 - 4xy \]
gives
\[ 0 = 2x \cdot 3^2 - 4x \cdot 3 . \]
This only reduces to
\[ 0 = 6x , \]
which is not valid for all values of x on any nontrivial interval. Thus, y = 3 is not a constant
solution to our differential equation.
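
The test used in this example (plug in the constant and see whether the right-hand side vanishes for every x ) can be sketched in a few lines of Python. This is only an illustration: the function name is_constant_solution and the handful of sample x values are assumptions made here, and checking finitely many points is of course evidence rather than proof.

```python
def F(x, y):
    """Right-hand side of dy/dx = 2xy^2 - 4xy."""
    return 2.0 * x * y**2 - 4.0 * x * y

def is_constant_solution(y0, sample_xs=(-2.0, -0.5, 0.0, 1.0, 3.0)):
    """A constant y0 can solve dy/dx = F(x, y) only if F(x, y0) = 0 for all x.

    Here we merely test a few sample values of x.
    """
    return all(abs(F(x, y0)) < 1e-12 for x in sample_xs)

print(is_constant_solution(2.0))   # True:  0 = 8x - 8x holds for every x
print(is_constant_solution(3.0))   # False: 0 = 6x fails unless x = 0
```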

Admittedly, constant functions are not usually considered particularly exciting. The graph of a
constant function,
\[ y(x) = y_0 \quad\text{for all } x \]
is just a horizontal line (at y = y_0 ), and its derivative (as noted in the above example) is zero. But
the fact that its derivative is zero is what simplifies the task of finding all possible constant solutions
to a given differential equation, especially if the equation is in derivative formula form. After all, if
we plug a constant function
\[ y(x) = y_0 \quad\text{for all } x \]
into an equation of the form
\[ \frac{dy}{dx} = F(x, y) , \]
then, since the derivative of a constant is zero, this equation reduces to

3.3 On the Existence and Uniqueness of Solutions

Unfortunately, not all problems are solvable, and those that are solvable sometimes have several
solutions. This is true in mathematics just as it is true in real life.
Before attempting to solve a problem involving some given differential equation and auxiliary
condition (such as an initial value), it would certainly be nice to know that the given differential
equation actually has a solution satisfying the given auxiliary condition. This would be especially
true if the given differential equation looks difficult and we expect that considerable effort will be
required in solving it (effort which would be wasted if that solution did not exist). And even if we
can find a solution, we normally would like some assurance that it is the only solution.
The following theorem is the standard theorem quoted in most elementary differential equation
texts addressing these issues for fairly general first-order initial-value problems.

Theorem 3.1 (on existence and uniqueness)

Consider a first-order initial-value problem
\[ \frac{dy}{dx} = F(x, y) \quad\text{with}\quad y(x_0) = y_0 \]
in which both F and ∂F/∂y are continuous functions on some open region of the XY–plane
containing the point (x_0 , y_0) .2 The initial-value problem then has exactly one solution over some open
interval (α, β) containing x_0 . Moreover, this solution and its derivative are continuous over that
interval.

This theorem assures us that, if we can write a first-order differential equation in the derivative
formula form,
\[ \frac{dy}{dx} = F(x, y) , \]
and if F(x, y) is a ‘reasonably well-behaved’ formula on some region of interest, then our
differential equation has solutions; with luck and skill, we will be able to find them. Moreover, if we
can find a solution to this equation that also satisfies some initial value y(x_0) = y_0 corresponding
to a point at which F is ‘reasonably well-behaved,’ then that solution is unique (i.e., it is the only
solution), so there is no need to worry about alternative solutions, at least over some interval
(α, β) . Just what that interval (α, β) is, however, is not explicitly described in this theorem. It turns
out to depend in subtle ways on just how well behaved F(x, y) is. More will be said about this in
a few paragraphs.

!Example 3.3: Consider the initial-value problem

\[ \frac{dy}{dx} - x^2 y^2 = x^2 \quad\text{with}\quad y(0) = 3 . \]
As derived earlier, the derivative formula form for this equation is
\[ \frac{dy}{dx} = x^2 + x^2 y^2 . \]
So
\[ F(x, y) = x^2 + x^2 y^2 \]

2 The ∂F/∂y is a “partial derivative”. If you are not acquainted with partial derivatives, see the appendix on page 61.


and
\[ \frac{\partial F}{\partial y} = \frac{\partial}{\partial y}\left[ x^2 + x^2 y^2 \right] = 0 + x^2 \cdot 2y = 2x^2 y . \]
It should be clear that these two functions are continuous everywhere on the XY –plane. Hence,
we can take the entire plane to be that “open region” in the above theorem, which then assures
us that the above initial-value problem has one (and only one) solution valid over some interval
(a, b) with a < 0 < b . Unfortunately, the theorem doesn’t tell us what that solution is nor what
that interval (a, b) might be. We will have to wait until we develop a method for solving this
differential equation.
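
If partial derivatives are new to you, the formula for ∂F/∂y in this example can be spot-checked numerically: hold x fixed and difference F in y alone. The Python sketch below does just that (the function names, the step h, and the sample points are illustrative assumptions).

```python
def F(x, y):
    """F(x, y) = x^2 + x^2 y^2 from example 3.3."""
    return x**2 + x**2 * y**2

def dF_dy(x, y):
    """The partial derivative computed by hand: 2 x^2 y."""
    return 2.0 * x**2 * y

def dF_dy_numeric(x, y, h=1e-4):
    """Central difference in y with x held fixed."""
    return (F(x, y + h) - F(x, y - h)) / (2.0 * h)

for x, y in ((0.0, 3.0), (1.5, -2.0), (-3.0, 0.25)):
    assert abs(dF_dy_numeric(x, y) - dF_dy(x, y)) < 1e-6
```

Such a check says nothing about the interval on which the solution exists; it only confirms the hypothesis of the theorem that we verified by hand.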

The proof of the above theorem is nontrivial and can be safely skipped by most beginning
readers. In fact, despite the importance of the above theorem, we will rarely explicitly refer to it in
the chapters that follow. The main explicit references will be a “graphical” discussion of the theorem
in chapter 8 using methods developed there3 , and to note that analogous theorems can be proven for
higher-order differential equations. Nonetheless, it is an important theorem whose proof should be
included in this text if only to assure you that I’m not making it up. Besides, the basic core of the
proof is fairly accessible to most readers and contains some clever and interesting ideas. We will
go over that basic core in the next section (section 3.4), leaving the more challenging details for the
section after that (section 3.5).
Part of the proof will be to identify the interval (α, β) mentioned in the above theorem. In fact,
the interval (α, β) can be easily determined if F and ∂ F/∂ y are sufficiently well behaved. That is
what the next theorem gives us. Its proof requires just a few modifications of the proof of the above,
and will be briefly discussed after that proof.

Theorem 3.2
Consider a first-order initial-value problem
\[ \frac{dy}{dx} = F(x, y) \quad\text{with}\quad y(x_0) = y_0 \]
over an interval (α, β) containing x_0 , and with F = F(x, y) being a continuous function on the
infinite strip
\[ R = \{\, (x, y) : \alpha < x < \beta \ \text{and} \ -\infty < y < \infty \,\} . \]
Further suppose that, on R , the partial derivative ∂F/∂y is continuous and is a function of x only.4
Then the initial-value problem has exactly one solution over (α, β) . Moreover, this solution and its
derivative are continuous on that interval.

In practice, many of our ﬁrst-order differential equations will not satisfy the conditions described
in the last theorem. So this theorem is of relatively limited value for now. However, it leads to higher-
order analogs that will be used in developing the theory needed for important classes of higher-order
differential equations. That is why theorem 3.2 is mentioned here.

3 which you may ﬁnd more illuminating than the proof given here
4 More generally, the theorem remains true if we replace the phrase “a function of x only” with “a bounded function on
R ”. Our future interest, however, will be with the theorem as stated.


3.4 Conﬁrming the Existence of Solutions (Core Ideas)

So let us consider the first-order initial-value problem
\[ \frac{dy}{dx} = F(x, y) \quad\text{with}\quad y(x_0) = y_0 , \]
assuming that both F and ∂ F/∂ y are continuous on some open region in the XY –plane containing
the point (x 0 , y0 ) . Our goal is to verify that a solution y exists over some interval. (This is the
existence claim of theorem 3.1. The uniqueness claim of that theorem will be left as an exercise
using material developed in the next section — see exercise 3.2 on page 58.)
The gist of our proof consists of three steps:
1. Observe that the initial-value problem is equivalent to a corresponding integral equation.
2. Derive a sequence of functions — ψ0 , ψ1 , ψ2 , ψ3 , . . . — using a formula inspired by
that integral equation.
3. Show that this sequence of functions converges on some interval to a solution y of the
original initial-value problem.
The “hard” part of the proof is in the details of the last step. We can skip over these details initially,
returning to them in the next section.
Two comments should be made here:
1. The ψk ’s end up being approximations to the solution y , and, in theory at least, the method
we are about to describe can be used to ﬁnd approximate solutions to an initial-value problem.
Other methods, however, are often more practical.
2. This method was developed by the French mathematician Emile Picard and is often referred
to as the (Picard’s) method of successive approximations or as Picard’s iterative method
(because of the way the ψk ’s are generated).
To simplify discussion let us assume x_0 = 0 , so that our initial-value problem is
\[ \frac{dy}{dx} = F(x, y) \quad\text{with}\quad y(0) = y_0 . \tag{3.3} \]
There is no loss of generality here. After all, if x_0 ≠ 0 , we can apply the change of variable
s = x − x_0 and convert our original problem into problem (3.3) (with x replaced by s ).

Converting to an Integral Equation

Suppose y = y(x) is a solution to initial-value problem (3.3) on some interval (α, β) with α <
0 < β . Renaming x as s , our differential equation becomes
\[ \frac{dy}{ds} = F(s, y(s)) \quad\text{for each } s \text{ in } (\alpha, \beta) . \]
Integrating this from 0 to any x in (α, β) and remembering that y(0) = y_0 , we get
\[ \begin{aligned}
\int_0^x \frac{dy}{ds}\,ds &= \int_0^x F(s, y(s))\,ds \\
\Longrightarrow\quad y(x) - y(0) &= \int_0^x F(s, y(s))\,ds \\
\Longrightarrow\quad y(x) - y_0 &= \int_0^x F(s, y(s))\,ds .
\end{aligned} \]


That is, y satisfies the integral equation
\[ y(x) = y_0 + \int_0^x F(s, y(s))\,ds \quad\text{whenever}\quad \alpha < x < \beta . \]
On the other hand, if y is any continuous function on (α, β) satisfying this integral equation, then
basic calculus tells us that, on this interval, y is differentiable with
\[ \frac{dy}{dx} = \frac{d}{dx}\left[ y_0 + \int_0^x F(s, y(s))\,ds \right]
= 0 + \frac{d}{dx}\int_0^x F(s, y(s))\,ds = F(x, y(x)) \]
and
\[ y(0) = y_0 + \underbrace{\int_0^0 F(s, y(s))\,ds}_{0} = y_0 . \]
Thus, y also satisfies our original initial-value problem.

We should note that, in the above, we implicitly assumed F(x, y) was a reasonably behaved
function at each point (x, y) where α < x < β and y = y(x) . In particular, if F is continuous at
each of these points, then this continuity, the continuity of y , and the fact that y′ = F(x, y) ensure
that y is not only differentiable on (α, β) but that y′ is continuous on (α, β) .
In summary, we have the following theorem:

Theorem 3.3
Let y be any continuous function on some interval (α, β) containing 0 , and assume F is a
function of two variables continuous at every (x, y) with α < x < β and y = y(x) . Then y has
a continuous derivative on (α, β) and satisﬁes the initial-value problem
\[ \frac{dy}{dx} = F(x, y) \quad\text{with}\quad y(0) = y_0 \quad\text{on } (\alpha, \beta) \]
if and only if y satisfies the integral equation
\[ y(x) = y_0 + \int_0^x F(s, y(s))\,ds \quad\text{whenever } \alpha < x < \beta . \]
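As a rough numerical illustration of theorem 3.3, the sketch below checks that a known solution also satisfies the integral equation. The test problem dy/dx = y with y(0) = 1 (solution e^x) is an assumption chosen for illustration, not an example from the text, and the helper names `trapezoid` and `integral_equation_residual` are hypothetical.

```python
import math

def trapezoid(f, a, b, n=2000):
    """Composite trapezoid rule for the integral of f from a to b (b < a is allowed)."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

def integral_equation_residual(y, F, y0, x):
    """Return y(x) - [y0 + integral from 0 to x of F(s, y(s)) ds]."""
    return y(x) - (y0 + trapezoid(lambda s: F(s, y(s)), 0.0, x))

# Assumed test problem: dy/dx = y with y(0) = 1, whose solution is exp(x).
F = lambda x, y: y
residuals = [integral_equation_residual(math.exp, F, 1.0, x) for x in (-1.0, 0.5, 2.0)]
```

Each residual should be zero up to the quadrature error, on both sides of 0. This only spot-checks a few points; it is not a substitute for the argument above.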

Generating a Sequence of “Approximate Solutions”

Begin with any continuous function ψ0 . For example, we could simply choose ψ0 to be the constant
function
ψ0 (x) = y0 for all x .
(Later, we will place some additional restrictions on ψ0 , but the above constant function will still
be a valid choice for ψ0 .)
Next, let ψ1 be the function constructed from ψ0 by
\[ \psi_1(x) = y_0 + \int_0^x F(s, \psi_0(s))\,ds . \]
Then construct ψ2 from ψ1 via
\[ \psi_2(x) = y_0 + \int_0^x F(s, \psi_1(s))\,ds . \]


Continue the process, defining ψ3 , ψ4 , ψ5 , . . . by
\[
\begin{aligned}
\psi_3(x) &= y_0 + \int_0^x F(s, \psi_2(s))\,ds , \\
\psi_4(x) &= y_0 + \int_0^x F(s, \psi_3(s))\,ds , \\
&\;\;\vdots
\end{aligned}
\]
In general, once ψk is defined, we define ψk+1 by
\[ \psi_{k+1}(x) = y_0 + \int_0^x F(s, \psi_k(s))\,ds . \tag{3.4} \]

Since we apparently can continue this iterative process forever, we have an infinite sequence of
functions
\[ \psi_0 , \psi_1 , \psi_2 , \psi_3 , \psi_4 , \ldots . \]
In the future, we may refer to this sequence as the Picard sequence (based on ψ0 and F ). Note
that, for k = 1, 2, 3, . . . ,
\[ \psi_k(0) = y_0 + \underbrace{\int_0^0 F(s, \psi_{k-1}(s))\,ds}_{0} = y_0 . \]
So each of these ψk ’s satisfies the initial condition in our initial-value problem. Moreover, since
each of these ψk ’s is a constant added to an integral from 0 to x , each of these ψk ’s should be
continuous at least over the interval of x’s on which the integral is finite.
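To make iteration (3.4) concrete, here is a minimal sketch that carries it out exactly for one simple right-hand side. It assumes the illustrative problem dy/dx = y (so F(x, y) = y), which is not an example from the text, and stores each ψk as a list of polynomial coefficients; the function names are hypothetical.

```python
from fractions import Fraction

def integrate_from_zero(coeffs):
    """Integrate the polynomial sum(coeffs[i] * x**i) from 0 to x, term by term."""
    return [Fraction(0)] + [c / (i + 1) for i, c in enumerate(coeffs)]

def picard_step(coeffs, y0):
    """One step of (3.4) for F(x, y) = y: psi_{k+1}(x) = y0 + integral_0^x psi_k(s) ds."""
    result = integrate_from_zero(coeffs)
    result[0] += y0
    return result

def picard_sequence(y0, steps):
    """Return [psi_0, psi_1, ..., psi_steps], starting from the constant function psi_0 = y0."""
    psi = [Fraction(y0)]
    sequence = [psi]
    for _ in range(steps):
        psi = picard_step(psi, Fraction(y0))
        sequence.append(psi)
    return sequence

# With y0 = 1 the iterates are the Taylor polynomials of exp(x).
psi_3 = picard_sequence(1, 3)[3]
```

Here `psi_3` holds the coefficients of 1 + x + x²/2 + x³/6. For a general F the integrals rarely have closed forms, which is one reason the remark above notes that other methods are often more practical.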

(Naively) Taking the Limit

Now suppose there is an interval (α, β) containing 0 on which this sequence of ψk ’s converges to
some continuous function. Let y denote this function,
\[ y(x) = \lim_{k\to\infty} \psi_k(x) \quad\text{for } \alpha < x < \beta . \]
Now let x be any point in (α, β) . Blithely (and naively) taking the limit of both sides of equation
(3.4), we get
\[
\begin{aligned}
y(x) = \lim_{k\to\infty} \psi_k(x) &= \lim_{k\to\infty} \psi_{k+1}(x) \\
&= \lim_{k\to\infty} \left[ y_0 + \int_0^x F(s, \psi_k(s))\,ds \right] \\
&= y_0 + \lim_{k\to\infty} \int_0^x F(s, \psi_k(s))\,ds \\
&= y_0 + \int_0^x \lim_{k\to\infty} F(s, \psi_k(s))\,ds \\
&= y_0 + \int_0^x F(s, y(s))\,ds .
\end{aligned}
\]
Thus (assuming the above limits are valid) we see that y satisfies the integral equation
\[ y(x) = y_0 + \int_0^x F(s, y(s))\,ds \quad\text{for } \alpha < x < \beta . \]


As noted in theorem 3.3, this means the function y is a solution to our original initial-value problem,
thus verifying the claimed existence of such a solution.
That was the essence of Picard’s method of successive approximations.

3.5 Details in the Proof of Theorem 3.1

Confirming the Existence of Solutions
What are the Remaining Details?
Before proclaiming that we have rigorously verified the existence of a solution to our initial-value
problem via the Picard method, we need to rigorously verify the assumptions made in the last
subsection. If you check carefully, you will see that we still need to rigorously confirm the following
three statements involving the functions ψ1 , ψ2 , . . . generated by the Picard iteration method:
1. There is an interval (α, β) containing 0 such that
\[ \lim_{k\to\infty} \psi_k(x) \]
exists for each x in (α, β) .

2. The function given by
\[ y(x) = \lim_{k\to\infty} \psi_k(x) \]
is continuous on the interval (α, β) .

3. The above defined function y satisfies
\[ y(x) = y_0 + \int_0^x F(s, y(s))\,ds \quad\text{whenever } \alpha < x < \beta . \]

Conﬁrming these claims under the assumptions in theorem 3.1 on page 44 will be the main goal of
this section.5

Some Preliminary Bounds

In carrying out our analysis, we will make use of a number of facts normally discussed in standard
introductory calculus sequences. For example, we will use without comment the fact that, for any
summation,
\[ \left| \sum_k c_k \right| \le \sum_k |c_k| . \]
This was called the triangle inequality. Recall, also, that “the absolute value of an integral is less
than or equal to the integral of the absolute value”. We will need to be a little careful about this
because the lower limits on our integrals will not always be less than our upper limits. If σ < τ ,
then we do have
\[ \left| \int_\sigma^\tau g(s)\,ds \right| \le \int_\sigma^\tau |g(s)|\,ds . \]
5 Some of the analysis in this section can be shortened considerably using tools from advanced real analysis. Since the
typical reader is not expected to have yet had such a course, we will not use those tools. However, if you have had such
a course and are acquainted with such terms as “uniform convergence” and “Cauchy sequences”, then you should look to
see how your more advanced mathematics can shorten the analysis given here.


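On the other hand, if τ < σ , then
\[ \left| \int_\sigma^\tau g(s)\,ds \right| = \left| -\int_\tau^\sigma g(s)\,ds \right| = \left| \int_\tau^\sigma g(s)\,ds \right| \le \int_\tau^\sigma |g(s)|\,ds . \]
Suppose that, in addition, |g(s)| ≤ M for all s in some interval containing σ and τ . Then, if
σ < τ ,
\[ \left| \int_\sigma^\tau g(s)\,ds \right| \le \int_\sigma^\tau |g(s)|\,ds \le \int_\sigma^\tau M\,ds = M[\tau - \sigma] = M|\tau - \sigma| , \]
while, if τ < σ ,
\[ \left| \int_\sigma^\tau g(s)\,ds \right| \le \int_\tau^\sigma |g(s)|\,ds \le \int_\tau^\sigma M\,ds = M[-(\tau - \sigma)] = M|\tau - \sigma| . \]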
So, in general, we have the following little lemma:

Lemma 3.4
If |g(s)| ≤ M for all s in some interval containing σ and τ , then
\[ \left| \int_\sigma^\tau g(s)\,ds \right| \le M|\tau - \sigma| . \]

Other facts from calculus will be used and slightly expanded as needed. These facts will include
stuff on the absolute convergence of summations and the Taylor series for the exponentials.
The next lemma establishes the interval (α, β) mentioned in the existence theorem (theorem
3.1) along with some function bounds that will be useful in our analysis.

Lemma 3.5
Assume both F(x, y) and ∂F/∂y are continuous on some open region R in the XY–plane containing
the point (0, y0 ) . Then there are positive constants M and B , a closed interval [α, β] , and a finite
distance Y such that all the following hold:
1. α < 0 < β .

2. The open region R contains the closed rectangular region
\[ R_1 = \{\, (x, y) : \alpha \le x \le \beta \ \text{and}\ |y - y_0| \le Y \,\} . \]
3. For each (x, y) in R1 ,
\[ |F(x, y)| \le M \quad\text{and}\quad \left| \left.\frac{\partial F}{\partial y}\right|_{(x,y)} \right| \le B . \]
4. 0 < −αM ≤ Y and 0 < βM ≤ Y .

5. If φ is a continuous function on (α, β) satisfying
\[ |\phi(x) - y_0| \le Y \quad\text{for } \alpha \le x \le \beta , \]
then
\[ \psi(x) = y_0 + \int_0^x F(s, \phi(s))\,ds \]
defines the function ψ on the interval [α, β] . Moreover, ψ is continuous on [α, β] and
satisfies
\[ |\psi(x) - y_0| \le Y \quad\text{for } \alpha \le x \le \beta . \]


Figure 3.1: Rectangles contained in region R for the proof of lemma 3.5 (with |α0 | < X and
X < β0 ).

PROOF: The goal is to ﬁnd a rectangle R1 on which the above holds. We start by noting that,
because R is an open region containing the point (0, y0 ) , that point is not on the boundary of
R , and we can pick a negative value α0 and two positive values β0 and Y so that the closed
rectangular region
R0 = { (x, y) : α0 ≤ x ≤ β0 and |y − y0 | ≤ Y }

is contained in R , as in ﬁgure 3.1.

Since F and ∂F/∂y are continuous on R , they (and their absolute values) must be continuous
on that portion of R which is R0 . But recall that a continuous function of one variable on a closed
finite interval will always have a maximum value on that interval. Likewise, a continuous function
of two variables will always have a maximum value over a closed finite rectangle. Let M and B
be, respectively, the maximum values of |F| and |∂F/∂y| on R0 . Then, of course,
\[ |F(x, y)| \le M \quad\text{and}\quad \left| \left.\frac{\partial F}{\partial y}\right|_{(x,y)} \right| \le B \quad\text{for each } (x, y) \text{ in } R_0 . \]

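Now let us further restrict the possible values of x by first setting
\[ X = \frac{Y}{M} \quad\text{so}\quad M = \frac{Y}{X} , \]
and then defining the endpoints of the interval (α, β) by
\[
\alpha = \begin{cases} \alpha_0 & \text{if } -X < \alpha_0 \\ -X & \text{if } \alpha_0 \le -X \end{cases}
\quad\text{and}\quad
\beta = \begin{cases} X & \text{if } X < \beta_0 \\ \beta_0 & \text{if } \beta_0 \le X \end{cases}
\]
(again, see figure 3.1).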
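The two piecewise definitions above amount to clipping the interval [α0 , β0 ] to [−X, X] . A small sketch of that computation follows; the function name `lemma_interval` and the sample numbers are hypothetical, chosen only to illustrate the choice of α and β .

```python
def lemma_interval(alpha0, beta0, Y, M):
    """Return (alpha, beta) as chosen in the proof of lemma 3.5.

    alpha0 < 0 < beta0 bound the rectangle R0, Y is its half-height,
    and M > 0 bounds |F| on R0.
    """
    X = Y / M
    alpha = max(alpha0, -X)   # alpha0 if -X < alpha0, otherwise -X
    beta = min(beta0, X)      # X if X < beta0, otherwise beta0
    return alpha, beta

# Hypothetical values: R0 spans -3 <= x <= 0.25 with Y = 2 and M = 4, so X = 0.5.
alpha, beta = lemma_interval(-3.0, 0.25, 2.0, 4.0)
```

With these numbers, α = −0.5 and β = 0.25, and both −αM = 2 and βM = 1 are at most Y = 2, as the fourth claim of the lemma requires.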

By these choices,
α0 ≤ α < 0 < β ≤ β 0 ,

|x| ≤ X whenever α ≤ x ≤ β ,

0 < −α M ≤ X M = Y ,

0 < β M ≤ X M = Y ,


and the closed rectangle

R1 = { (x, y) : α ≤ x ≤ β and |y − y0 | ≤ Y }

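is contained in the closed rectangle R0 , ensuring that
\[ |F(x, y)| \le M \quad\text{and}\quad \left| \left.\frac{\partial F}{\partial y}\right|_{(x,y)} \right| \le B \quad\text{for each } (x, y) \text{ in } R_1 . \]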

This takes care of the ﬁrst four claims of the lemma.

To conﬁrm the lemma’s ﬁnal claim, let φ be a continuous function on (α, β) satisfying

|φ(x) − y0 | ≤ Y for α ≤ x ≤ β .

Then, (s, φ(s)) is a point in R1 for each s in the interval [α, β] . This, in turn, means that
F(s, φ(s)) exists and is bounded by M over the interval [α, β] . Moreover, it is easily veriﬁed
that the continuity of both F over R and φ over (α, β) ensures that F(s, φ(s)) is a bounded
continuous function of s over [α, β] . Consequently, the integral in
\[ \psi(x) = y_0 + \int_0^x F(s, \phi(s))\,ds \]
exists (and is finite) for each x in [α, β] .

To help confirm the claimed continuity of ψ , take any two points x and x1 in (α, β) . Using
lemma 3.4 and the fact that F is bounded by M on R1 , we have that
\[
\begin{aligned}
|\psi(x_1) - \psi(x)| &= \left| y_0 + \int_0^{x_1} F(s, \phi(s))\,ds - y_0 - \int_0^x F(s, \phi(s))\,ds \right| \\
&= \left| \int_x^{x_1} F(s, \phi(s))\,ds \right| \\
&\le M|x_1 - x| .
\end{aligned}
\]
Hence,
\[ \lim_{x\to x_1} |\psi(x_1) - \psi(x)| \le \lim_{x\to x_1} M|x_1 - x| = M \cdot 0 = 0 , \]
which, in turn, means that
\[ \lim_{x\to x_1} \psi(x) = \psi(x_1) , \]
confirming that ψ is continuous at each x1 in (α, β) . By almost identical arguments, we also have
\[ \lim_{x\to\alpha^+} \psi(x) = \psi(\alpha) \quad\text{and}\quad \lim_{x\to\beta^-} \psi(x) = \psi(\beta) . \]
Altogether, these limits tell us that ψ is continuous on the closed interval [α, β] .
Finally, let α ≤ x ≤ β . Again using lemma 3.4 and the boundedness of F , along with the
definition of X , we see that
\[ |\psi(x) - y_0| = \left| \int_0^x F(s, \phi(s))\,ds \right| \le M|x| \le MX = M \cdot \frac{Y}{M} = Y . \]


Convergence of the Picard Sequence

Let us now look more closely at the Picard sequence of functions,

ψ0 , ψ1 , ψ2 , ψ3 , . . .

with ψ0 being “some continuous function” and
\[ \psi_{k+1}(x) = y_0 + \int_0^x F(s, \psi_k(s))\,ds \quad\text{for } k = 0, 1, 2, 3, \ldots . \]

Remember, F and ∂ F/∂ y are continuous on some open region containing the point (0, y0 ) . This
means lemma 3.5 applies. Let [α, β] , M , B , and Y be the interval and constants from that
lemma. Let us also now impose an additional restriction on the choice for ψ0 : Let us insist that ψ0
be any continuous function on [α, β] such that

|ψ0 (x) − y0 | ≤ Y for α < x < β .

In particular, we could let ψ0 be the constant function ψ0 (x) = y0 for all x .

We now want to show that the sequence of ψk ’s converges to a function y on [α, β] . Our ﬁrst
step in this direction is to observe that, thanks to the additional requirement on ψ0 , lemma 3.5 can
be applied repeatedly to show that ψ1 , ψ2 , ψ3 , . . . are all well-deﬁned, continuous functions on
the interval [α, β] with each satisfying

|ψk (x) − y0 | ≤ Y for α ≤ x ≤ β .

Next, we need to establish useful bounds on the sequence

|ψ1 (x) − ψ0 (x)| , |ψ2 (x) − ψ1 (x)| , |ψ3 (x) − ψ2 (x)| , ...

when α ≤ x ≤ β . The ﬁrst is easy:

|ψ1 (x) − ψ0 (x)| = |ψ1 (x) − y0 − ψ0 (x) + y0 |

= |[ψ1 (x) − y0 ] + (−[ψ0 (x) − y0 ])|
≤ |ψ1 (x) − y0 | + |ψ0 (x) − y0 | ≤ 2Y .

To simplify the derivation of useful bounds on the others, let us observe that, if k ≥ 1 ,
\[
\begin{aligned}
|\psi_{k+1}(x) - \psi_k(x)| &= \left| \left[ y_0 + \int_0^x F(s, \psi_k(s))\,ds \right] - \left[ y_0 + \int_0^x F(s, \psi_{k-1}(s))\,ds \right] \right| \\
&= \left| \int_0^x \left[ F(s, \psi_k(s)) - F(s, \psi_{k-1}(s)) \right] ds \right| \\
&\le \left| \int_0^x \left| F(s, \psi_k(s)) - F(s, \psi_{k-1}(s)) \right| ds \right| .
\end{aligned}
\]
Now recall that, if f is any continuous and differentiable function on an interval I , and t1 and t2
are two points in I , then there is a point τ between t1 and t2 such that
\[ f(t_2) - f(t_1) = f'(\tau)\,[t_2 - t_1] . \]
This was the mean value theorem for derivatives. Consequently, if
\[ |f'(t)| \le B \quad\text{for each } t \text{ in } I , \]
then
\[ |f(t_2) - f(t_1)| = \left| f'(\tau)\,[t_2 - t_1] \right| = |f'(\tau)|\,|t_2 - t_1| \le B|t_2 - t_1| . \]
The same holds for partial derivatives. In particular, for each pair of points (x, y1 ) and (x, y2 ) in
the closed rectangle
\[ R_1 = \{\, (x, y) : \alpha \le x \le \beta \ \text{and}\ |y - y_0| \le Y \,\} , \]
we have a γ between y1 and y2 such that
\[ |F(x, y_2) - F(x, y_1)| = \left| \left.\frac{\partial F}{\partial y}\right|_{(x,\gamma)} \cdot [y_2 - y_1] \right| \le B|y_2 - y_1| . \]

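Thus, for 0 ≤ x ≤ β and k = 1, 2, 3, . . . ,
\[
\begin{aligned}
|\psi_{k+1}(x) - \psi_k(x)| &\le \int_0^x \left| F(s, \psi_k(s)) - F(s, \psi_{k-1}(s)) \right| ds \\
&\le \int_0^x B\,|\psi_k(s) - \psi_{k-1}(s)|\,ds .
\end{aligned}
\]
Repeatedly using this (with 0 ≤ x ≤ β ), we get
\[
\begin{aligned}
|\psi_2(x) - \psi_1(x)| &\le \int_0^x B\,|\psi_1(s) - \psi_0(s)|\,ds \le \int_0^x B \cdot 2Y\,ds = 2Y Bx , \\
|\psi_3(x) - \psi_2(x)| &\le \int_0^x B\,|\psi_2(s) - \psi_1(s)|\,ds \le \int_0^x B \cdot 2Y B s\,ds = 2Y \frac{(Bx)^2}{2} , \\
|\psi_4(x) - \psi_3(x)| &\le \int_0^x B\,|\psi_3(s) - \psi_2(s)|\,ds \le \int_0^x B \cdot 2Y B^2 \frac{s^2}{2}\,ds = 2Y \frac{(Bx)^3}{3 \cdot 2} , \\
|\psi_5(x) - \psi_4(x)| &\le \int_0^x B\,|\psi_4(s) - \psi_3(s)|\,ds \le \int_0^x B \cdot 2Y B^3 \frac{s^3}{3 \cdot 2}\,ds = 2Y \frac{(Bx)^4}{4!} , \\
&\;\;\vdots
\end{aligned}
\]
Continuing, we get
\[ |\psi_{k+1}(x) - \psi_k(x)| \le 2Y \frac{(Bx)^k}{k!} \quad\text{for } 0 \le x \le \beta \ \text{and}\ k = 1, 2, 3, \ldots . \]
Virtually the same arguments give us
\[ |\psi_{k+1}(x) - \psi_k(x)| \le 2Y \frac{(-Bx)^k}{k!} \quad\text{for } \alpha \le x \le 0 \ \text{and}\ k = 1, 2, 3, \ldots . \]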


More concisely, for α ≤ x ≤ β and k = 1, 2, 3, . . . ,
\[ |\psi_{k+1}(x) - \psi_k(x)| \le 2Y \frac{(B|x|)^k}{k!} . \tag{3.5} \]
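Inequality (3.5) can be spot-checked exactly on a simple case. The sketch below assumes the illustrative problem dy/dx = y with y(0) = 1 (not an example from the text) and the rectangle |y − 1| ≤ 1 , so that Y = 1 , B = 1 , M = 2 and X = Y/M = 1/2 ; the helper names are hypothetical.

```python
from fractions import Fraction
from math import factorial

def picard_polys(y0, steps):
    """Picard iterates for dy/dx = y, y(0) = y0, as polynomial coefficient lists."""
    psi = [Fraction(y0)]
    out = [psi]
    for _ in range(steps):
        # y0 plus the term-by-term integral from 0 to x of the previous iterate
        psi = [Fraction(y0)] + [c / (i + 1) for i, c in enumerate(psi)]
        out.append(psi)
    return out

def evaluate(coeffs, x):
    """Evaluate the polynomial with the given coefficients at x."""
    return sum(c * x**i for i, c in enumerate(coeffs))

Y, B = Fraction(1), Fraction(1)   # assumed constants for the rectangle |y - 1| <= 1
psis = picard_polys(1, 8)
violations = []
for x in (Fraction(-1, 2), Fraction(-1, 4), Fraction(1, 3), Fraction(1, 2)):
    for k in range(1, 8):
        gap = abs(evaluate(psis[k + 1], x) - evaluate(psis[k], x))
        bound = 2 * Y * (B * abs(x))**k / factorial(k)
        if gap > bound:
            violations.append((x, k))
```

The list `violations` stays empty for these sample points, which is consistent with (3.5) but of course only a check at finitely many x and k .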

At this point it is worth recalling that the Taylor series for e^X is
\[ \sum_{k=0}^{\infty} \frac{X^k}{k!} \]
and that this series converges for each real value X . In particular, for any x ,
\[ 2Y e^{B|x|} = 2Y \sum_{k=0}^{\infty} \frac{(B|x|)^k}{k!} . \]

Now consider the infinite series
\[ S(x) = \sum_{k=0}^{\infty} \left[ \psi_{k+1}(x) - \psi_k(x) \right] . \]
According to inequality (3.5), the absolute value of each term in this series is bounded by the
corresponding term in the Taylor series for 2Y e^{B|x|} . The comparison test then tells us that S(x)
converges absolutely for each x in [α, β] . And this means that the limit
\[ S(x) = \lim_{N\to\infty} \sum_{k=0}^{N} \left[ \psi_{k+1}(x) - \psi_k(x) \right] \]
exists for each x in the interval [α, β] . But
\[
\begin{aligned}
\sum_{k=0}^{N} \left[ \psi_{k+1}(x) - \psi_k(x) \right] &= [\psi_1(x) - \psi_0(x)] + [\psi_2(x) - \psi_1(x)] + [\psi_3(x) - \psi_2(x)] \\
&\qquad + \cdots + [\psi_N(x) - \psi_{N-1}(x)] + [\psi_{N+1}(x) - \psi_N(x)] \\
&= -\psi_0(x) + \psi_1(x) - \psi_1(x) + \psi_2(x) - \psi_2(x) + \psi_3(x) \\
&\qquad + \cdots - \psi_{N-1}(x) + \psi_N(x) - \psi_N(x) + \psi_{N+1}(x) .
\end{aligned}
\]
Most of the terms cancel out, leaving us with
\[ \sum_{k=0}^{N} \left[ \psi_{k+1}(x) - \psi_k(x) \right] = \psi_{N+1}(x) - \psi_0(x) . \tag{3.6} \]
So
\[ \lim_{N\to\infty} \psi_N(x) = \lim_{N\to\infty} \left[ \psi_0(x) + \sum_{k=0}^{N-1} \left[ \psi_{k+1}(x) - \psi_k(x) \right] \right] = \psi_0(x) + S(x) . \]
This shows that the limit
\[ y(x) = \lim_{N\to\infty} \psi_N(x) \]
exists for each x in [α, β] , confirming the first statement we wished to confirm at the beginning of
this section (see page 49).


At this point, let us observe that, for α ≤ x ≤ β , we have the formulas
\[ \psi_N(x) = \psi_0(x) + \sum_{k=0}^{N-1} \left[ \psi_{k+1}(x) - \psi_k(x) \right] \tag{3.7a} \]
and
\[ y(x) = \psi_0(x) + S(x) = \psi_0(x) + \sum_{k=0}^{\infty} \left[ \psi_{k+1}(x) - \psi_k(x) \right] . \tag{3.7b} \]
Let us also observe what we get when we combine the above formula for ψN with inequality (3.5)
(together with the bound |ψ1 (x) − ψ0 (x)| ≤ 2Y for the k = 0 term) and the observations regarding
the Taylor series of the exponential:
\[
\begin{aligned}
|\psi_N(x)| &\le |\psi_0(x)| + \sum_{k=0}^{N-1} |\psi_{k+1}(x) - \psi_k(x)| \\
&\le |\psi_0(x)| + 2Y \sum_{k=0}^{N-1} \frac{(B|x|)^k}{k!} \le |\psi_0(x)| + 2Y e^{B|x|} .
\end{aligned}
\tag{3.8a}
\]
Likewise
\[ |y(x)| \le |\psi_0(x)| + 2Y e^{B|x|} . \tag{3.8b} \]
These observations may later prove useful.

Continuity of the Limit

Now to confirm the continuity of y claimed by the second statement from the beginning of this
section. We start by picking any two points x1 and x in [α, β] , and any positive integer N , and
then observe that, because F is bounded by M ,
\[
\begin{aligned}
|\psi_N(x_1) - \psi_N(x)| &= \left| \left[ y_0 + \int_0^{x_1} F(s, \psi_{N-1}(s))\,ds \right] - \left[ y_0 + \int_0^x F(s, \psi_{N-1}(s))\,ds \right] \right| \\
&= \left| \int_x^{x_1} F(s, \psi_{N-1}(s))\,ds \right| \\
&\le M|x_1 - x| .
\end{aligned}
\]
Combined with the definition of y and some basic facts about limits, this gives us
\[ |y(x_1) - y(x)| = \lim_{N\to\infty} |\psi_N(x_1) - \psi_N(x)| \le M|x_1 - x| . \]
As demonstrated at the end of the proof of lemma 3.5, this immediately tells us that y is continuous
on [α, β] .

The Limit as a Solution

Finally, let us verify the third statement made at the beginning of this section, namely that the above
defined y satisfies
\[ y(x) = y_0 + \int_0^x F(s, y(s))\,ds \quad\text{whenever } \alpha < x < \beta . \]
This, according to theorem 3.3 on page 47, is equivalent to showing that y satisfies the differential
equation in our initial-value problem over the interval (α, β) .6
6 Yes, we’ve already shown that y is defined and continuous on [α, β] , not just (α, β) . However, the derivative of a
function is ill-defined at the endpoints of the interval over which it is defined, and that is why we are now limiting x to
being in (α, β) .


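We start by assuming α ≤ x ≤ β . Using equation set (3.7) and inequality (3.5), we see that
\[
\begin{aligned}
|y(x) - \psi_N(x)| &= \left| [y(x) - \psi_0(x)] - [\psi_N(x) - \psi_0(x)] \right| \\
&= \left| \sum_{k=0}^{\infty} \left[ \psi_{k+1}(x) - \psi_k(x) \right] - \sum_{k=0}^{N-1} \left[ \psi_{k+1}(x) - \psi_k(x) \right] \right| \\
&= \left| \sum_{k=N}^{\infty} \left[ \psi_{k+1}(x) - \psi_k(x) \right] \right| \\
&\le \sum_{k=N}^{\infty} |\psi_{k+1}(x) - \psi_k(x)| \\
&\le 2Y \sum_{k=N}^{\infty} \frac{(B|x|)^k}{k!} .
\end{aligned}
\]
Under the change of index k = N + n , this becomes
\[ |y(x) - \psi_N(x)| \le 2Y \sum_{n=0}^{\infty} \frac{(B|x|)^{N+n}}{(N+n)!} . \tag{3.9} \]
But
\[
\begin{aligned}
(N+n)! &= \underbrace{(N+n)}_{\ge N}\,\underbrace{(N+n-1)}_{\ge N-1}\,\underbrace{(N+n-2)}_{\ge N-2} \cdots \underbrace{(N+n-[N-1])}_{\ge 1}\,\underbrace{n(n-1) \cdots 2 \cdot 1}_{=\,n!} \\
&\ge N!\,n! .
\end{aligned}
\]
Thus,
\[ \frac{1}{(N+n)!} \le \frac{1}{N!\,n!} \]
and
\[ \sum_{n=0}^{\infty} \frac{(B|x|)^{N+n}}{(N+n)!} \le \sum_{n=0}^{\infty} \frac{(B|x|)^{N+n}}{N!\,n!} = \frac{(B|x|)^N}{N!} \sum_{n=0}^{\infty} \frac{(B|x|)^n}{n!} = \frac{(B|x|)^N}{N!}\,e^{B|x|} . \]
Combining this with inequality (3.9) yields
\[ |y(x) - \psi_N(x)| \le 2Y \frac{(B|x|)^N}{N!}\,e^{B|x|} . \]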
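The error bound just obtained can be compared with actual errors in a case where the limit is known. This sketch again assumes the illustrative problem dy/dx = y , y(0) = 1 with ψ0 = 1 (so that ψN is the degree-N Taylor polynomial of e^x and y(x) = e^x ), and the constants Y = 1 , B = 1 ; none of this comes from the text, and the function names are hypothetical.

```python
import math

def picard_iterate(N, x):
    """psi_N(x) for dy/dx = y, y(0) = 1, psi_0 = 1: the degree-N Taylor polynomial of exp."""
    return sum(x**j / math.factorial(j) for j in range(N + 1))

def error_bound(N, x, Y=1.0, B=1.0):
    """Right-hand side of the bound: 2Y (B|x|)^N / N! * exp(B|x|)."""
    return 2 * Y * (B * abs(x))**N / math.factorial(N) * math.exp(B * abs(x))

rows = []
for N in range(1, 11):
    for x in (-0.5, 0.25, 0.5):
        actual = abs(math.exp(x) - picard_iterate(N, x))
        rows.append((N, x, actual, error_bound(N, x)))
```

In every row the actual error sits below the bound, and both shrink rapidly as N grows. The bound is far from tight here, which is typical of estimates built for a proof rather than for computation.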

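Consequently,
\[
\begin{aligned}
\left| \psi_N(x) - y_0 - \int_0^x F(s, y(s))\,ds \right|
&= \left| y_0 + \int_0^x F(s, \psi_{N-1}(s))\,ds - y_0 - \int_0^x F(s, y(s))\,ds \right| \\
&\le \left| \int_0^x \left| F(s, \psi_{N-1}(s)) - F(s, y(s)) \right| ds \right| \\
&\le \left| \int_0^x B\,|\psi_{N-1}(s) - y(s)|\,ds \right| \\
&\le \left| \int_0^x B \cdot 2Y \frac{(B|s|)^{N-1}}{(N-1)!}\,e^{B|s|}\,ds \right| \\
&\le 2Y \frac{B^N e^{B|x|}}{(N-1)!} \left| \int_0^x |s|^{N-1}\,ds \right| .
\end{aligned}
\]
Computing the last integral leaves us with
\[ \left| \psi_N(x) - y_0 - \int_0^x F(s, y(s))\,ds \right| \le 2Y \frac{(B|x|)^N}{N!}\,e^{B|x|} . \]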

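But, as is well known,
\[ \frac{(B|x|)^N}{N!} \to 0 \quad\text{as } N \to \infty \]
for any finite value B|x| . Hence
\[ \left| \psi_N(x) - y_0 - \int_0^x F(s, y(s))\,ds \right| \to 0 \quad\text{as } N \to \infty . \]
That is,
\[
\begin{aligned}
0 &= \lim_{N\to\infty} \left| \psi_N(x) - y_0 - \int_0^x F(s, y(s))\,ds \right| \\
&= \left| \lim_{N\to\infty} \psi_N(x) - y_0 - \int_0^x F(s, y(s))\,ds \right| \\
&= \left| y(x) - y_0 - \int_0^x F(s, y(s))\,ds \right| ,
\end{aligned}
\]
verifying that
\[ y(x) = y_0 + \int_0^x F(s, y(s))\,ds \quad\text{whenever } \alpha < x < \beta , \]
as desired.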

Where Are We?

Let’s stop for a moment and review what we have done. We have just spent several pages rigorously
verifying the three statements made at the beginning of this section under the assumptions made in
theorem 3.1 on page 44. By verifying these statements, we’ve rigorously justiﬁed the computations
made in the previous section showing that the limit of a Picard sequence is a solution to the initial-
value problem in theorem 3.1. Consequently, we have now rigorously veriﬁed the claim in theorem
3.1 that a solution to the given initial-value problem exists on at least some interval (α, β) .
We now need to show that this y is the only solution on that interval.

The Uniqueness Claim in Theorem 3.1

If you’ve made it through this section up to this point, then you should have little difﬁculty in ﬁnishing
the proof of theorem 3.1 by doing the following exercises. Do make use of the work we’ve done in
the previous several pages.

?Exercise 3.2: Consider a first-order initial-value problem
\[ \frac{dy}{dx} = F(x, y) \quad\text{with}\quad y(0) = y_0 , \]
and with both F and ∂F/∂y being continuous functions on some open region containing the point
(0, y0 ) . Since lemma 3.5 applies, we can let [α, β] be the interval, and M , B and Y the
positive constants from that lemma. Using this interval and these constants:
a: i: Verify that
\[ 0 \le M|x| \le Y \quad\text{for } \alpha \le x \le \beta . \]
ii: Also verify that any solution y to the above initial-value problem satisfies
\[ |y(x) - y_0| \le M|x| \quad\text{for } \alpha < x < \beta . \]
Now observe that the last two inequalities yield
\[ |y(x) - y_0| \le M|x| \le Y \quad\text{for } \alpha \le x \le \beta \]
whenever y is a solution to the above initial-value problem.
b: For the following, let y1 and y2 be any two solutions to the above initial-value problem on
(α, β) , and let
\[ \psi_0 , \psi_1 , \psi_2 , \psi_3 , \ldots \quad\text{and}\quad \phi_0 , \phi_1 , \phi_2 , \phi_3 , \ldots \]
be the two Picard sequences of functions on (α, β) generated by setting
\[ \psi_{k+1}(x) = y_0 + \int_0^x F(s, \psi_k(s))\,ds \]
and
\[ \phi_{k+1}(x) = y_0 + \int_0^x F(s, \phi_k(s))\,ds \]
with
\[ \psi_0(x) = y_1(x) \quad\text{and}\quad \phi_0(x) = y_2(x) . \]
i: Using ideas similar to those used above to prove the convergence of the Picard sequence,
show that, for each x in (α, β) and each positive integer k ,
\[ |\psi_{k+1}(x) - \phi_{k+1}(x)| \le \left| \int_0^x B\,|\psi_k(s) - \phi_k(s)|\,ds \right| . \]
ii: Then verify that, for each x in (α, β) ,
\[ |\psi_0(x) - \phi_0(x)| \le 2Y \]
and
\[ \lim_{k\to\infty} |\psi_{k+1}(x) - \phi_{k+1}(x)| = 0 . \]
(Hint: This is very similar to our showing that |ψk+1 (x) − ψk (x)| → 0 as k → ∞ .)
iii: Verify that, for each x in (α, β) and positive integer k ,
\[ \psi_k(x) = y_1(x) \quad\text{and}\quad \phi_k(x) = y_2(x) . \]
iv: Combine the results of the last two parts to show that
\[ y_1(x) = y_2(x) \quad\text{for } \alpha < x < \beta . \]

The end result of the above set of exercises is that there cannot be two different solutions on the
interval (α, β) to the initial-value problem. That was the uniqueness claim of theorem 3.1.


3.6 On Proving Theorem 3.2

We could spend several more enjoyable pages redoing the work in the previous section, but under
the assumptions made in theorem 3.2 on page 45 instead of those in theorem 3.1. To avoid that, let
me just brieﬂy describe how you can modify that work, and, thereby, prove theorem 3.2.
First of all, recall that much of the initial effort in proving the convergence of the Picard sequence,
\[ \psi_0 , \psi_1 , \psi_2 , \psi_3 , \ldots \]
with
\[ \psi_{k+1}(x) = y_0 + \int_0^x F(s, \psi_k(s))\,ds \quad\text{for } k = 0, 1, 2, 3, \ldots , \]
was in showing that there is an interval (α, β) such that, as long as α ≤ s ≤ β , then ψk (s) is never
so large or so small that (s, ψk (s)) is outside a rectangular region on which F is “well-behaved”
(this was the main result of lemma 3.5 on page 50). However, if (as in theorem 3.2) F = F(x, y)
is a continuous function on the infinite strip
\[ R = \{\, (x, y) : \alpha < x < \beta \ \text{and}\ -\infty < y < \infty \,\} , \]
then, for any continuous function φ on (α, β) , F(s, φ(s)) is a well-defined, continuous function
of s over (α, β) , and the integral in
\[ \psi(x) = y_0 + \int_0^x F(s, \phi(s))\,ds \]
exists (and is finite) whenever α < x < β . Verifying that ψ is continuous requires a little more
thought than was needed in the proof of lemma 3.5, but is still pretty easy: simply appeal to the
continuity of F(s, φ(s)) as a function of s along with the fact that
\[ \psi(x_1) - \psi(x) = \int_x^{x_1} F(s, \phi(s))\,ds \]
to show that
\[ \lim_{x\to x_1} \psi(x) = \psi(x_1) \quad\text{for each } x_1 \text{ in } (\alpha, \beta) . \]
Consequently, all the functions in the Picard sequence ψ0 , ψ1 , ψ2 , . . . are continuous on (α, β)
(provided, of course, that we started with ψ0 being continuous).
Now choose finite values α1 and β1 so that α < α1 < 0 < β1 < β ; let Y be the maximum
value of
\[ \frac{1}{2}\,|\psi_1(x) - \psi_0(x)| \quad\text{for } \alpha_1 \le x \le \beta_1 , \]
and let R0 be the infinite strip
\[ R_0 = \{\, (x, y) : \alpha_1 < x < \beta_1 \ \text{and}\ -\infty < y < \infty \,\} . \]
By the assumptions in the theorem, we know that, on R , the continuous function ∂F/∂y depends
only on x . So we can treat it as a continuous function on the closed interval [α1 , β1 ] . But such
functions are bounded. Thus, for some positive constant B and every point in R0 ,
\[ \left| \frac{\partial F}{\partial y} \right| \le B . \]

Using this, the bounds on
\[ |\psi_{k+1}(x) - \psi_k(x)| \quad\text{for } \alpha_1 \le x \le \beta_1 \ \text{and}\ k = 1, 2, 3, \ldots \]
can now be rederived exactly as in the previous section (leading to inequality (3.5) on page 55 and
inequality set (3.8) on page 56), and we can then use arguments almost identical to those used in
the previous section to show that the Picard sequence converges on (α1 , β1 ) to a solution y of the
given initial-value problem. The only notable modification is that the bound M used to show the
continuity of y must be rederived. For this proof, let M be the maximum value of |F(x, y)| on the
closed rectangle
\[ \{\, (x, y) : \alpha_1 \le x \le \beta_1 \ \text{and}\ |y| \le H \,\} \]
where H is the maximum value of
\[ |\psi_0(x)| + 2Y e^{B|x|} \quad\text{for } \alpha_1 \le x \le \beta_1 . \]
Inequality set (3.8) then tells us that
\[ |\psi_k(s)| \le H \quad\text{for } \alpha_1 \le s \le \beta_1 \ \text{and}\ k = 0, 1, 2, 3, \ldots . \]
This, in turn, assures us that
\[ |F(s, \psi_k(s))| \le M \quad\text{for } \alpha_1 \le s \le \beta_1 \ \text{and}\ k = 0, 1, 2, 3, \ldots , \]
which is what we used in the previous section to prove the continuity of y .

Finally, since every point x in the interval (α, β) is also in some such subinterval (α1 , β1 ) , we
must have that the Picard sequence converges at every point x in (α, β) , and what it converges to,
y(x) , is a solution to the given initial-value problem. Straightforward modiﬁcations to the arguments
outlined in exercise 3.2 then show that this solution is the only solution.

3.7 Appendix: A Little Multivariable Calculus

There are a few places in our discussions where some knowledge of the calculus of functions of two
or more variables (i.e., “multivariable” calculus) is needed. These include the commentary about
existence and uniqueness in this chapter (theorems 3.1 and 3.2), and the use of the multivariable
version of the chain rule in chapter 7. This appendix is a brief introduction to those elements of
multivariable calculus that are needed for these discussions. It is for those who have not yet been
formally introduced to calculus of several variables, and contains just barely enough to get by.

Functions of Two Variables

At least while we are only concerned with first-order differential equations, the only multivariable
calculus we will need involves functions of just two variables, such as
\[ f(x, y) = x^2 + x^2 y^2 , \quad g(x, y) = \frac{x^3 + 4y}{x} \quad\text{and}\quad h(x, y) = \sqrt{x^3 + y^2} . \]
These functions will be defined on “regions” of the XY–plane.

Open and Closed Regions

Functions of one variable are typically deﬁned on intervals of the X–axis. For functions of two
variables, we must replace the concept of an interval with that of a “region”. For our purposes, a
region (in the X Y –plane) refers to the collection of all points enclosed by some curve or set of


curves on the plane (with the understanding that this curve or set of curves actually does enclose
some collection of points in the plane). If we include the curves with the enclosed points, then we
say the region is closed; if the curves are all excluded, then we refer to the region as open. This
corresponds to the distinction between a closed interval [a, b] (which does contain the endpoints),
and an open interval (a, b) (which does not contain the endpoints).

!Example 3.4: Consider the rectangular region R whose sides form the rectangle generated
from the vertical lines x = 1 and x = 4 along with the horizontal lines y = 2 and y = 6 . If
R is to be a closed region, then it must include this rectangle; that is,

R = { (x, y) : 1 ≤ x ≤ 4 and 2 ≤ y ≤ 6 } .

If R is to be an open region, then it must exclude this rectangle; that is,

R = { (x, y) : 1 < x < 4 and 2 < y < 6 } .

On the other hand, if R just includes one of its sides, say, its right side,

R = { (x, y) : 1 < x ≤ 4 and 2 < y < 6 } ,

then it is considered to be neither open nor closed.

Limits
The concept of limits for functions of two variables is a natural extension of the concept of limits
for functions of one variable.
Given a function f (x, y) of two variables, a point (x0 , y0 ) in the plane, and a finite value A ,
we say that

A is the limit of f (x, y) as (x, y) approaches (x0 , y0 ) ,

equivalently,
\[ \lim_{(x,y)\to(x_0,y_0)} f(x, y) = A \quad\text{or}\quad f(x, y) \to A \ \text{as}\ (x, y) \to (x_0, y_0) , \]
if and only if we can make the value of f (x, y) as close (but not necessarily equal) to A as we desire
by requiring (x, y) be sufficiently close (but not necessarily equal) to (x0 , y0 ) . More formally,
\[ \lim_{(x,y)\to(x_0,y_0)} f(x, y) = A \]
if and only if, for every positive value ε there is a corresponding positive distance δ such that
f (x, y) is within ε of A whenever (x, y) is within δ of (x0 , y0 ) . That is (in mathematical
shorthand), for each ε > 0 there is a δ > 0 such that
\[ \text{distance from } (x, y) \text{ to } (x_0, y_0) < \delta \quad\Longrightarrow\quad |f(x, y) - A| < \epsilon . \]

The rules for the existence and computation of these limits are straightforward extensions of
those for functions of one variable, and need not be discussed in detail here.

!Example 3.5: “Obviously”, if

f (x, y) = x² + x²y² ,

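\[ g(x, y) = x^2 y^3 \quad\text{and}\quad h(x, y) = \sin\left( x^2 + y^2 \right) . \]
Verify that
\[ \frac{\partial g}{\partial x} = 2x y^3 \quad\text{and}\quad \frac{\partial g}{\partial y} = 3x^2 y^2 , \]
while
\[ \frac{\partial h}{\partial x} = 2x \cos\left( x^2 + y^2 \right) \quad\text{and}\quad \frac{\partial h}{\partial y} = 2y \cos\left( x^2 + y^2 \right) . \]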
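The partial-derivative formulas for g and h above can be sanity-checked numerically. The sketch below approximates each partial derivative by a difference quotient in one variable while holding the other fixed; it uses a central difference rather than a one-sided quotient, which is an assumption made here for accuracy, and the helper names and sample point are hypothetical.

```python
import math

def partial_x(f, x, y, step=1e-6):
    """Central-difference approximation to df/dx at (x, y), with y held fixed."""
    return (f(x + step, y) - f(x - step, y)) / (2 * step)

def partial_y(f, x, y, step=1e-6):
    """Central-difference approximation to df/dy at (x, y), with x held fixed."""
    return (f(x, y + step) - f(x, y - step)) / (2 * step)

g = lambda x, y: x**2 * y**3
h = lambda x, y: math.sin(x**2 + y**2)

x0, y0 = 1.5, -0.7   # arbitrary sample point
errors = [
    partial_x(g, x0, y0) - 2 * x0 * y0**3,
    partial_y(g, x0, y0) - 3 * x0**2 * y0**2,
    partial_x(h, x0, y0) - 2 * x0 * math.cos(x0**2 + y0**2),
    partial_y(h, x0, y0) - 2 * y0 * math.cos(x0**2 + y0**2),
]
```

All four entries of `errors` come out very close to zero. A check at one point does not verify the formulas, but a large discrepancy would signal an algebra mistake.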

Functions of More than Two Variables

The notation can become a bit more cumbersome, and the pictures even harder to draw, but every-
thing discussed above for functions of two variables naturally extends to functions of three or more
variables. For example, we may have a function of three variables f = f (x, y, z) deﬁned on, say,
an open box-like region
\[ R = \{\, (x, y, z) : x_{\min} < x < x_{\max} ,\ y_{\min} < y < y_{\max} \ \text{and}\ z_{\min} < z < z_{\max} \,\} \]
where x_min , x_max , y_min , y_max , z_min and z_max are finite numbers. We will then say that, for any
given point (x0 , y0 , z0 ) and value A ,
\[ \lim_{(x,y,z)\to(x_0,y_0,z_0)} f(x, y, z) = A \]
if and only if there is a corresponding positive distance δ for every positive value ε such that
f (x, y, z) is within ε of A whenever (x, y, z) is within δ of (x0 , y0 , z0 ) . We will also say that
this function is continuous on R if and only if we can legitimately write
\[ \lim_{(x,y,z)\to(x_0,y_0,z_0)} f(x, y, z) = f(x_0, y_0, z_0) \]
for every point (x0 , y0 , z0 ) in R . Finally, the three (first) partial derivatives of this function are
given by
\[ \frac{\partial f}{\partial x} = \lim_{\Delta x\to 0} \frac{f(x + \Delta x, y, z) - f(x, y, z)}{\Delta x} , \]
\[ \frac{\partial f}{\partial y} = \lim_{\Delta y\to 0} \frac{f(x, y + \Delta y, z) - f(x, y, z)}{\Delta y} \]
and
\[ \frac{\partial f}{\partial z} = \lim_{\Delta z\to 0} \frac{f(x, y, z + \Delta z) - f(x, y, z)}{\Delta z} , \]

provided the limits exist. Again, in practice, the partial derivative with respect to any one of the three
variables is the derivative obtained by pretending the other variables are constants.


Additional Exercises

3.4. Rewrite each of the following in derivative formula form, and then find all constant
solutions. (In some cases, you may have to use the quadratic formula to find any constant
solutions.)
a. \( \frac{dy}{dx} + 3xy = 6x \)
b. \( \sin(x + y) - y\,\frac{dy}{dx} = 0 \)
c. \( \frac{dy}{dx} - y^3 = 8 \)
d. \( x^2\,\frac{dy}{dx} + x y^2 = x \)
e. \( \frac{dy}{dx} - y^2 = x \)
f. \( y^3 - 25y + \frac{dy}{dx} = 0 \)
g. \( (x - 2)\,\frac{dy}{dx} = y + 3 \)
h. \( (y - 2)\,\frac{dy}{dx} = x - 3 \)
i. \( \frac{dy}{dx} + 2y - y^2 = -2 \)
j. \( \frac{dy}{dx} + (8 - x)y - y^2 = -8x \)

3.5. Which of the equations in the above exercise set are autonomous?

3.6. Consider the first-order initial-value problem
\[ \frac{dy}{dx} = 2\sqrt{y} \quad\text{with}\quad y(1) = 0 . \]
a. Verify that each of the following is a solution on the interval (−∞, ∞) , and graph that
solution:
i. \( y(x) = 0 \) for −∞ < x < ∞ .
ii. \( y(x) = \begin{cases} 0 & \text{if } x < 1 \\ (x - 1)^2 & \text{if } 1 \le x \end{cases} \) .
iii. \( y(x) = \begin{cases} 0 & \text{if } x < 3 \\ (x - 3)^2 & \text{if } 3 \le x \end{cases} \) .
b. You’ve just verified three different functions as being solutions to the above initial-value
problem. Why does this not violate theorem 3.1?

3.7. Let ψ0 , ψ1 , ψ2 , ψ3 , . . . be the sequence of functions generated by the Picard iterative
method (as described in section 3.4) using the initial-value problem
\[ \frac{dy}{dx} = xy \quad\text{with}\quad y(0) = 2 \]
along with
\[ \psi_0(x) = 2 \quad\text{for all } x . \]
Using the formula for Picard’s method (formula (3.4) on page 48), compute the following:
a. ψ1 (x) b. ψ2 (x) c. ψ3 (x)


3.8. Let ψ0 , ψ1 , ψ2 , ψ3 , . . . be the sequence of functions generated by the Picard iterative
method (as described in section 3.4) using the initial-value problem
\[ \frac{dy}{dx} = 2x + y^2 \quad\text{with}\quad y(0) = 3 \]
along with
\[ \psi_0(x) = 3 \quad\text{for all } x . \]
Compute the following:
a. ψ1 (x) b. ψ2 (x)


4
Separable First-Order Equations

As we will see below, the notion of a differential equation being “separable” is a natural generalization
of the notion of a first-order differential equation being directly integrable. What’s more, a fairly
natural modification of the method for solving directly integrable first-order equations gives us the
basic approach to solving “separable” differential equations. However, it cannot be said that the
theory of separable equations is just a trivial extension of the theory of directly integrable equations.
Certain issues can arise that do not arise in solving directly integrable equations. Some of these
issues are pertinent to even more general classes of first-order differential equations than those that
are just separable, and may play a role later on in this text.
In this chapter we will, of course, learn how to identify and solve separable first-order differential
equations. We will also see what sort of issues can arise, examine those issues, and discuss some
ways to deal with them. Since many of these issues involve graphing, we will also draw a bunch of
pictures.

4.1 Basic Notions

Separability
A ﬁrst-order differential equation is said to be separable if, after solving it for the derivative,
\[ \frac{dy}{dx} = F(x, y) , \]

the right-hand side can then be factored as “a formula of just x ” times “a formula of just y ”,

F(x, y) = f (x)g(y) .

If this factoring is not possible, the equation is not separable.

More concisely, a ﬁrst-order differential equation is separable if and only if it can be written as
\[ \frac{dy}{dx} = f(x)g(y) \tag{4.1} \]

where f and g are known functions.

!Example 4.1: Consider the differential equation

\[ \frac{dy}{dx} - x^2 y^2 = x^2 . \tag{4.2} \]


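Solving for the derivative (by adding $x^2 y^2$ to both sides),

\[ \frac{dy}{dx} = x^2 + x^2 y^2 , \]

and then factoring out the $x^2$ on the right-hand side gives

\[ \frac{dy}{dx} = x^2 \left( 1 + y^2 \right) , \]

which is in the form
\[ \frac{dy}{dx} = f(x)g(y) \]
with
\[ f(x) = \underbrace{x^2}_{\text{no } y\text{'s}} \quad\text{and}\quad g(y) = \underbrace{1 + y^2}_{\text{no } x\text{'s}} . \]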

So equation (4.2) is a separable differential equation.

!Example 4.2: On the other hand, consider

\[ \frac{dy}{dx} - x^2 y^2 = 4 . \tag{4.3} \]

Solving for the derivative here yields

\[ \frac{dy}{dx} = x^2 y^2 + 4 . \]

The right-hand side of this clearly cannot be factored into a function of just x times a function
of just y . Thus, equation (4.3) is not separable.
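
The claim that a right-hand side "cannot be factored" can be spot-checked numerically. The sketch below relies on a simple necessary condition that is not stated in the text: if F(x, y) = f(x)g(y), then F(x1, y1)F(x2, y2) = F(x1, y2)F(x2, y1) for every choice of points. The helper name `cross_product_gap` and the sample points are our own choices for illustration.

```python
def cross_product_gap(F, x1, y1, x2, y2):
    """Return F(x1,y1)*F(x2,y2) - F(x1,y2)*F(x2,y1).

    This is zero for every choice of points whenever F(x, y) = f(x) g(y),
    so a single nonzero value shows that F is not separable.
    """
    return F(x1, y1) * F(x2, y2) - F(x1, y2) * F(x2, y1)


def F_example_41(x, y):
    """Right-hand side of example 4.1: x^2 + x^2 y^2."""
    return x**2 + x**2 * y**2


def F_example_42(x, y):
    """Right-hand side of example 4.2: x^2 y^2 + 4."""
    return x**2 * y**2 + 4


print(cross_product_gap(F_example_41, 1, 1, 2, 3))  # 0: consistent with separability
print(cross_product_gap(F_example_42, 1, 1, 2, 3))  # 96: not separable
```

A zero gap at one pair of points does not prove separability; a nonzero gap, however, rules it out.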

We should (briefly) note that any directly integrable first-order differential equation
\[ \frac{dy}{dx} = f(x) \]
can be viewed as also being the separable equation
\[ \frac{dy}{dx} = f(x)g(y) \]
with g(y) being the constant 1 . Likewise, a first-order autonomous differential equation
\[ \frac{dy}{dx} = g(y) \]
can also be viewed as being separable, this time with f(x) being 1 . Thus, directly integrable and autonomous differential equations are both special cases of separable differential equations.

Integrating Separable Equations

As just noted, a directly-integrable equation
\[ \frac{dy}{dx} = f(x) \]


which, after cutting out the middle, reduces to

\[ \int \frac{1}{1 + y^2}\,\frac{dy}{dx}\,dx = \int \frac{1}{1 + y^2}\,dy , \]

the very equation we would have obtained if we had yielded to temptation and naively “cancelled out the dx ’s ”.
Consequently, the equation obtained by integrating both sides of equation (4.4) with respect to x ,
\[ \int \frac{1}{1 + y^2}\,\frac{dy}{dx}\,dx = \int x^2\,dx , \]
is the same as
\[ \int \frac{1}{1 + y^2}\,dy = \int x^2\,dx . \]
Doing the indicated integration on both sides then yields
\[ \arctan(y) = \frac{1}{3}x^3 + c , \]

which, in turn, tells us that

\[ y = \tan\left( \frac{1}{3}x^3 + c \right) . \]
This is the general solution to our differential equation.
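
As a sanity check, the formula y = tan(x³/3 + c) can be tested against the differential equation dy/dx = x²(1 + y²) numerically. The following is only a sketch: it approximates the derivative with a central difference, and the step size, the sample points, and the values of c are arbitrary choices of ours.

```python
import math


def y(x, c):
    """General solution from the example: y = tan(x^3/3 + c)."""
    return math.tan(x**3 / 3 + c)


def rhs(x, y_val):
    """Right-hand side of dy/dx = x^2 (1 + y^2)."""
    return x**2 * (1 + y_val**2)


def max_residual(c, xs, h=1e-6):
    """Largest |dy/dx - x^2 (1 + y^2)| over xs, with dy/dx from a central difference."""
    worst = 0.0
    for x in xs:
        slope = (y(x + h, c) - y(x - h, c)) / (2 * h)
        worst = max(worst, abs(slope - rhs(x, y(x, c))))
    return worst


# Sample points in [-1, 1]; for the c values below the tangent stays finite there.
xs = [i / 10 for i in range(-10, 11)]
for c in (0.0, 0.5, -0.3):
    print(c, max_residual(c, xs))  # each residual should be tiny
```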

Two generally useful ideas were illustrated in the last example. One is that, whenever we have an integral of the form
\[ \int H(y)\,\frac{dy}{dx}\,dx \]
where y denotes some (differentiable) function of x , then this integral is more properly written as

\[ \int H(y(x))\, y'(x)\,dx , \]

which reduces to
\[ \int H(y)\,dy \]

via the substitution y = y(x) (even though we don’t yet know what y(x) is). Thus, in general,

\[ \int H(y)\,\frac{dy}{dx}\,dx = \int H(y)\,dy . \tag{4.5} \]

This equation is true whether you derive it rigorously, as we have, or obtain it naively by mechanically canceling out the dx’s.¹
The other idea seen in the example was that, if we divide an equation of the form
\[ \frac{dy}{dx} = f(x)g(y) \]
by g(y) , then (with the help of equation (4.5)) we can compute the integral with respect to x of each side of the resulting equation,
\[ \frac{1}{g(y)}\,\frac{dy}{dx} = f(x) . \]

This leads us to a basic procedure for solving separable ﬁrst-order differential equations:
1 One of the reasons our notation is so useful is that naive manipulations of the differentials often do lead to valid equations.
Just don’t be too naive and cancel out the d’s in dy/dx .


1. Get the differential equation into the form
\[ \frac{dy}{dx} = f(x)g(y) . \]
2. Divide through by g(y) to get
\[ \frac{1}{g(y)}\,\frac{dy}{dx} = f(x) . \]
(Note: At this point we’ve “separated the variables”, getting all the y’s and derivatives of y on one side, and all the x’s on the other.)
3. Integrate both sides with respect to x , making use of the fact that
\[ \int \frac{1}{g(y)}\,\frac{dy}{dx}\,dx = \int \frac{1}{g(y)}\,dy . \]
4. Solve the resulting equation for y .
There are a few issues that can arise in some of these steps, and we will have to slightly reﬁne this
procedure to address those issues. Before doing that, though, let us practice with another differential
equation for which the above approach can be applied without any difﬁculty.

!Example 4.4: Consider solving the initial-value problem

\[ \frac{dy}{dx} = -\frac{x}{y - 3} \quad\text{with}\quad y(0) = 1 . \]
Here,
\[ \frac{dy}{dx} = f(x)g(y) \quad\text{with}\quad f(x) = -x \quad\text{and}\quad g(y) = \frac{1}{y - 3} , \]
and “dividing through by g(y) ” is the same as multiplying through by y − 3 . Doing so, and then integrating both sides with respect to x , we get the following:
\[
\begin{aligned}
& [y - 3]\,\frac{dy}{dx} = -x \\
\longrightarrow\quad & \int [y - 3]\,\frac{dy}{dx}\,dx = -\int x\,dx \\
\longrightarrow\quad & \int [y - 3]\,dy = -\int x\,dx \\
\longrightarrow\quad & \frac{1}{2}y^2 - 3y = -\frac{1}{2}x^2 + c .
\end{aligned}
\]

Though hardly necessary, we can multiply through by 2 , obtaining the slightly simpler expression
\[ y^2 - 6y = -x^2 + 2c . \]
We are now faced with the less-than-trivial task of solving the last equation for y in terms of x . Since the left-hand side looks something like a quadratic for y , let us rewrite this equation as

\[ y^2 - 6y + \left[ x^2 - 2c \right] = 0 \]

so that we can apply the quadratic formula to solve for y . Applying that venerable formula, we get
\[ y = \frac{-(-6) \pm \sqrt{(-6)^2 - 4\left[ x^2 - 2c \right]}}{2} = 3 \pm \sqrt{9 - x^2 + 2c} , \]


which, since 9 + 2c is just another unknown constant, can be written a little more simply as

\[ y = 3 \pm \sqrt{a - x^2} . \tag{4.6} \]

This is the general solution to our differential equation.

Now for the initial-value problem. Combining the general solution just derived with the given initial value at x = 0 yields
\[ 1 = y(0) = 3 \pm \sqrt{a - 0^2} = 3 \pm \sqrt{a} . \]

So
\[ \pm\sqrt{a} = -2 . \]
This means that a = 4 , and that we must use the negative root in formula (4.6) for y . Thus, the solution to our initial-value problem is

\[ y = 3 - \sqrt{4 - x^2} . \]
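
It costs little to confirm this answer numerically. The sketch below checks the initial value and compares a central-difference slope of y = 3 − √(4 − x²) with the right-hand side −x/(y − 3). The step size and the sample points (kept well inside −2 < x < 2) are our own assumptions.

```python
import math


def y(x):
    """Candidate solution of the initial-value problem: y = 3 - sqrt(4 - x^2)."""
    return 3 - math.sqrt(4 - x**2)


def rhs(x, y_val):
    """Right-hand side of dy/dx = -x / (y - 3)."""
    return -x / (y_val - 3)


def residual(x, h=1e-6):
    """|dy/dx - rhs| at x, with dy/dx approximated by a central difference."""
    slope = (y(x + h) - y(x - h)) / (2 * h)
    return abs(slope - rhs(x, y(x)))


print(y(0))                                          # 1.0, the required initial value
print(max(residual(i / 10) for i in range(-15, 16)))  # should be tiny

# The other root, 3 + sqrt(4 - x^2), gives 5 at x = 0, so it fails y(0) = 1.
print(3 + math.sqrt(4 - 0**2))                        # 5.0
```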

4.2 Constant Solutions

Avoiding Division by Zero
In the above procedure for solving
\[ \frac{dy}{dx} = f(x)g(y) , \]
we divided both sides by g(y) . This requires, of course, that g(y) not be zero — which is often
not the case.

!Example 4.5: Consider solving

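\[ \frac{dy}{dx} = 2x(y - 5) . \]
As long as $y \ne 5$ , we can divide through by y − 5 and follow our basic procedure:
\[
\begin{aligned}
& \frac{1}{y - 5}\,\frac{dy}{dx} = 2x \\
\longrightarrow\quad & \int \frac{1}{y - 5}\,\frac{dy}{dx}\,dx = \int 2x\,dx \\
\longrightarrow\quad & \int \frac{1}{y - 5}\,dy = \int 2x\,dx \\
\longrightarrow\quad & \ln|y - 5| = x^2 + c \\
\longrightarrow\quad & |y - 5| = e^{x^2 + c} = e^{x^2} e^{c} \\
\longrightarrow\quad & y - 5 = \pm e^{x^2} e^{c} .
\end{aligned}
\]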


So, assuming $y \ne 5$ , we get

\[ y = 5 \pm e^{c} e^{x^2} . \]
Notice that, because $e^{c} \ne 0$ for every real value c , this formula for y never gives us y = 5 for any real choice of c and x .
But what about the case where y = 5 ?
Well, suppose y = 5 . To be more specific, let y be the constant function

\[ y(x) = 5 \quad\text{for every } x , \]

and plug this constant function into our differential equation

\[ \frac{dy}{dx} = 2x(y - 5) . \]
Recalling (again) that derivatives of constants are zero, we get

\[ 0 = 2x(5 - 5) , \]

which is certainly a true equation. So y = 5 is a solution. In fact, it is one of those “constant” solutions we discussed in the previous chapter.
Combining all the above, we see that the “general solution” to the given differential equation is actually the set consisting of the solutions
\[ y(x) = 5 \quad\text{and}\quad y(x) = 5 \pm e^{c} e^{x^2} . \]

Now consider the general case, where we seek all possible solutions to
\[ \frac{dy}{dx} = f(x)g(y) . \]
If $y_0$ is any single value for which
\[ g(y_0) = 0 , \]
then plugging the corresponding constant function

\[ y(x) = y_0 \quad\text{for all } x \]

into the differential equation gives, after a trivial bit of computation,

\[ 0 = 0 , \]

showing that
\[ y(x) = y_0 \quad\text{is a constant solution to}\quad \frac{dy}{dx} = f(x)g(y) , \]
just as we saw (in the above example) that
\[ y(x) = 5 \quad\text{is a constant solution to}\quad \frac{dy}{dx} = 2x(y - 5) . \]
Conversely, suppose $y = y_0$ is a constant solution to
\[ \frac{dy}{dx} = f(x)g(y) \]
(and f is not the zero function). Then the equation is valid with y replaced by the constant $y_0$ , giving us
\[ 0 = f(x)g(y_0) , \]


which, in turn, means that y0 must be a constant such that

g(y0 ) = 0 .

What all this shows is that our basic method for solving separable equations may miss the constant solutions because those solutions correspond to a division by zero in our basic method.²
Because constant solutions are often important in understanding the physical process the differential equation might be modeling, let us be careful to find them. Accordingly, we will insert the following step into our procedure on page 70 for solving separable equations:

• Identify all constant solutions by ﬁnding all values y0 , y1 , y2 , … such that

g(yk ) = 0 ,

and then write down

y(x) = y0 , y(x) = y1 , y(x) = y2 , ... .

(These are the constant solutions.)

(And we will renumber the other steps as appropriate.)

Sometimes, the formula obtained by our basic procedure for solving can be ‘tweaked’ to also
account for the constant solutions. A standard ‘tweak’ can be seen by reconsidering the general
solution obtained in our last example.

!Example 4.6: The general solution obtained in the previous example was the set containing
\[ y(x) = 5 \quad\text{and}\quad y(x) = 5 \pm e^{c} e^{x^2} . \]

If we let $A = \pm e^{c}$ , the second equation reduces to

\[ y(x) = 5 + Ae^{x^2} . \]

Remember, though, $A = \pm e^{c}$ can be any positive or negative number, but cannot be zero (because of the nature of the exponential function). So, by our definition of A , our general solution is

\[ y(x) = 5 \tag{4.7a} \]
and
\[ y(x) = 5 + Ae^{x^2} \quad\text{where } A \text{ can be any nonzero real number} . \tag{4.7b} \]

However, if we allow A to be zero, then equation (4.7b) reduces to equation (4.7a),

\[ y(x) = 5 + 0 \cdot e^{x^2} = 5 , \]

which means the entire set of possible solutions can be expressed more simply as
\[ y(x) = 5 + Ae^{x^2} \]

where A is an arbitrary constant with no restrictions on its possible values.
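
The single formula y = 5 + A·exp(x²) can be spot-checked for several values of A, including A = 0, which reproduces the constant solution. This is only a numerical sketch: the derivative is a central difference, and the values of A, the step size and the sample points are arbitrary choices of ours.

```python
import math


def y(x, A):
    """Proposed general solution: y = 5 + A exp(x^2)."""
    return 5 + A * math.exp(x**2)


def residual(x, A, h=1e-6):
    """|dy/dx - 2x(y - 5)| with dy/dx approximated by a central difference."""
    slope = (y(x + h, A) - y(x - h, A)) / (2 * h)
    return abs(slope - 2 * x * (y(x, A) - 5))


xs = [i / 10 for i in range(-10, 11)]    # sample points in [-1, 1]
for A in (-2.0, 0.0, 1.0, 3.5):          # A = 0.0 is the constant solution y = 5
    print(A, max(residual(x, A) for x in xs))
```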

2 Because $g(y_0) = 0$ is a ‘singular’ value for division, many authors refer to constant solutions of separable equations as singular solutions.


In the future, we will usually express our general solutions as simply as practical, with the trick
of letting
\[ A = \pm e^{c} \quad\text{or}\quad 0 \]
often being used without comment. Keep in mind, though, that the sort of tweaking just described
is not always possible.

?Exercise 4.1: Verify that the general solution to

\[ \frac{dy}{dx} = -y^2 \]

is given by the set consisting of

\[ y(x) = 0 \quad\text{and}\quad y(x) = \frac{1}{x + c} . \]

Is there any way to rewrite these two formulas for y(x) as a single formula using just one arbitrary
constant?

The Importance of Constant Solutions

Even if we can use the same general formula to describe all the solutions (constant and otherwise),
it is often worthwhile to explicitly identify any constant solutions. To see this, let us now solve
the differential equation from chapter 1 describing a falling object when we take into account air
resistance.

!Example 4.7: Let v = v(t) be the velocity (in meters per second) at time t of some object
of mass m plummeting towards the ground. In chapter 1, we decided that $F_{\text{air}}$ , the force of air resistance acting on the falling body, could be described by

\[ F_{\text{air}} = -\gamma v \]

where γ was some positive constant dependent on the size and shape of the object (and probably determined by experiment). Using this, we obtained the differential equation
\[ \frac{dv}{dt} = -9.8 - \kappa v \quad\text{where}\quad \kappa = \frac{\gamma}{m} . \]

This is a relatively simple separable equation. Assuming v equals a constant $v_0$ yields

\[ 0 = -9.8 - \kappa v_0 \quad\Longrightarrow\quad v_0 = -\frac{9.8}{\kappa} = -\frac{9.8m}{\gamma} . \]

So, we have one constant solution,

\[ v(t) = v_0 \quad\text{for all } t \]

where
\[ v_0 = -\frac{9.8}{\kappa} = -\frac{9.8m}{\gamma} . \]

For reasons that will soon become clear, $v_0$ is called the terminal velocity of the object that is falling.


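To find the other possible solutions, we assume $v \ne v_0$ and proceed:

\[
\begin{aligned}
& \frac{dv}{dt} = -9.8 - \kappa v \\
\longrightarrow\quad & \frac{1}{9.8 + \kappa v}\,\frac{dv}{dt} = -1 \\
\longrightarrow\quad & \int \frac{1}{9.8 + \kappa v}\,\frac{dv}{dt}\,dt = -\int 1\,dt \\
\longrightarrow\quad & \int \frac{1}{9.8 + \kappa v}\,dv = -\int dt \\
\longrightarrow\quad & \frac{1}{\kappa}\ln|9.8 + \kappa v| = -t + c \\
\longrightarrow\quad & \ln|9.8 + \kappa v| = -\kappa t + \kappa c \\
\longrightarrow\quad & 9.8 + \kappa v = \pm e^{-\kappa t + \kappa c} \\
\longrightarrow\quad & v(t) = \frac{1}{\kappa}\left[ -9.8 \pm e^{\kappa c} e^{-\kappa t} \right] .
\end{aligned}
\]

Since $v_0 = -9.8\kappa^{-1}$ , the last equation reduces to

\[ v(t) = v_0 + Ae^{-\kappa t} \quad\text{where}\quad A = \pm\frac{1}{\kappa}e^{\kappa c} . \]
This formula for v(t) yields the constant solution, $v = v_0$ , if we allow A = 0 . Thus, letting A be a completely arbitrary constant, we have that

\[ v(t) = v_0 + Ae^{-\kappa t} \tag{4.8a} \]

where
\[ v_0 = -\frac{9.8m}{\gamma} \quad\text{and}\quad \kappa = \frac{\gamma}{m} \tag{4.8b} \]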

describes all possible solutions to the differential equation of interest here. The graphs of some
possible solutions (assuming a terminal velocity of -10 meters/second) are sketched in ﬁgure 4.1.

Notice how the constant in the constant solution, v0 , appears in the general solution (equation
(4.8a)). More importantly, notice that the exponential term in this solution rapidly goes to zero
as t increases, so

\[ v(t) = v_0 + Ae^{-\kappa t} \quad\longrightarrow\quad v(t) = v_0 \quad\text{as}\quad t \to \infty . \]

This is graphically obvious in figure 4.1. Consequently, no matter what the initial velocity and
initial height were, eventually the velocity of this falling object will be very close to v0 (provided
it doesn’t hit the ground first). That is why v0 is called the terminal velocity. That is also why that
constant solution is so important here (and is appropriately also called the equilibrium solution).
It accurately predicts the final velocity of any object falling from a sufficiently high height. And
if you are that falling object, then that velocity³ is probably a major concern.

3 between 120 and 150 miles per hour for a typical human body
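
To see how quickly the solutions approach the terminal velocity, formula (4.8a) can be evaluated directly. In the sketch below the terminal velocity is −10 meters per second, as in figure 4.1; the initial velocities are our own illustrative choices (the values used for the figure are not stated), and A is fixed by the requirement v(0) = v_init, which gives A = v_init − v₀.

```python
import math

GRAVITY = 9.8  # the constant 9.8 appearing in dv/dt = -9.8 - kappa*v


def velocity(t, v_init, v_term):
    """Evaluate v(t) = v0 + A exp(-kappa t) from formula (4.8a).

    v_term is the terminal velocity v0 = -9.8/kappa (a negative number),
    so kappa = -9.8/v0.  Setting t = 0 in the formula gives A = v_init - v0.
    """
    kappa = -GRAVITY / v_term
    A = v_init - v_term
    return v_term + A * math.exp(-kappa * t)


V_TERM = -10.0
for v_init in (5.0, 0.0, -10.0, -20.0):   # hypothetical initial velocities
    row = [round(velocity(t, v_init, V_TERM), 3) for t in (0, 2, 4, 8)]
    print(v_init, row)
```

With these values κ = 0.98, so after 8 seconds the exponential factor is already below 0.001 and every curve lies within a few hundredths of a meter per second of −10.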


[Figure 4.1 plot: velocity V on the vertical axis (−20 to 5) against time T on the horizontal axis (0 to 8).]

Figure 4.1: Graphs of the velocity of a falling object during the ﬁrst 8 seconds of its fall assuming
a terminal velocity of −10 meters per second. Each graph corresponds to a different
initial velocity.

4.3 Explicit Versus Implicit Solutions

Thus far, we have been able to ﬁnd explicit formulas for all of our solutions; that is, we have been
able to carry out the last step in our basic procedure — that of solving the resulting (integrated)
equation for y in terms of x — obtaining

y = y(x) where y(x) is some formula of x (with no y’s ).

For example, as the general solution to

\[ \frac{dy}{dx} - x^2 y^2 = x^2 , \]

we obtained (in example 4.3)

\[ y = \underbrace{\tan\left( \frac{1}{3}x^3 + c \right)}_{y(x)} . \]

Unfortunately, this is not always possible.

!Example 4.8: Consider

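\[ \frac{dy}{dx} = \frac{x + 1}{8 + 2\pi\sin(\pi y)} . \]
In this case,
\[ g(y) = \frac{1}{8 + 2\pi\sin(\pi y)} , \]
which can never be zero. So there are no constant solutions, and we can blithely proceed with our procedure. Doing so:
\[
\begin{aligned}
& \frac{dy}{dx} = \frac{x + 1}{8 + 2\pi\sin(\pi y)} \\
\longrightarrow\quad & [8 + 2\pi\sin(\pi y)]\,\frac{dy}{dx} = x + 1 \\
\longrightarrow\quad & \int [8 + 2\pi\sin(\pi y)]\,\frac{dy}{dx}\,dx = \int (x + 1)\,dx \\
\longrightarrow\quad & \int [8 + 2\pi\sin(\pi y)]\,dy = \int (x + 1)\,dx \\
\longrightarrow\quad & 8y - 2\cos(\pi y) = \frac{1}{2}x^2 + x + c .
\end{aligned}
\]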

The next step would be to solve the last equation for y in terms of x . But look at that last
equation. Can you solve it for y as a formula of x ? Neither can anyone else. So we are not able
to obtain an explicit formula for y . At best, we can say that y = y(x) satisﬁes the equation
\[ 8y - 2\cos(\pi y) = \frac{1}{2}x^2 + x + c . \]

Still, this equation is not without value. It does implicitly describe the possible relations between
x and y . In particular, the graphs of this equation can be sketched for different values of c (we’ll
do this later on in this chapter). These graphs, in turn, give you the graphs you would obtain for
y(x) if you could actually ﬁnd the formula for y(x) .

In practice, we must deal with both “explicit” and “implicit” solutions to differential equations.
When we have an explicit formula for the solution in terms of the variable, that is, we have something
of the form
y = y(x) where y(x) is some formula of x (with no y’s ) , (4.9)

then we say that we have an explicit solution to our differential equation. Technically, it is that
“formula of x ” in equation (4.9) which is the explicit solution. In practice, though, it is common to
refer to the entire equation as “an explicit solution”. For example, we found that the solution to
\[ \frac{dy}{dx} - x^2 y^2 = x^2 \]
is explicitly given by
\[ y = \tan\left( \frac{1}{3}x^3 + c \right) . \]
Strictly speaking, the explicit solution here is the formula

\[ \tan\left( \frac{1}{3}x^3 + c \right) . \]

That, of course, is what is really meant when someone answers the question
\[ \text{What is the explicit solution to}\quad \frac{dy}{dx} - x^2 y^2 = x^2 \;? \]
with the equation
\[ y = \tan\left( \frac{1}{3}x^3 + c \right) . \]
If, on the other hand, we have an equation (other than something like (4.9)) involving the solution and the variable, then that equation is called an implicit solution. In trying to solve the differential equation in example 4.8,
\[ \frac{dy}{dx} = \frac{x + 1}{8 + 2\pi\sin(\pi y)} , \]
we derived the equation
\[ 8y - 2\cos(\pi y) = \frac{1}{2}x^2 + x + c . \]
This equation is an implicit solution for the given differential equation.⁴


Differential equations — be they separable or not — can have both implicit and explicit solutions.
Indeed, implicit solutions often arise in the process of deriving an explicit solution. For example, in
solving
\[ \frac{dy}{dx} - x^2 y^2 = x^2 , \]
we first obtained
\[ \arctan(y) = \frac{1}{3}x^3 + c . \]
This is an implicit solution. Fortunately, it could be easily solved for y , giving us the explicit solution
\[ y = \tan\left( \frac{1}{3}x^3 + c \right) . \]
As a general rule, explicit solutions are preferred over implicit solutions. Explicit solutions
usually give more information about the solutions, and are easier to use than implicit solutions
(even when you have sophisticated computer math packages). So, whenever you solve a differential
equation,
FIND AN EXPLICIT SOLUTION IF AT ALL PRACTICAL.
Do not be surprised, however, if you encounter a differential equation for which an explicit solution
is not obtainable. This is not a disaster; it just means a little more work may be needed to extract
useful information about the possible solutions.

4.4 Full Procedure for Solving Separable Equations

In light of the possibility of singular solutions and the possibility of not ﬁnding explicit solutions,
we should reﬁne our procedure for solving a separable differential equation to:
1. Get the differential equation into the form
\[ \frac{dy}{dx} = f(x)g(y) . \tag{4.10} \]

2. Identify all constant solutions by ﬁnding all values y0 , y1 , y2 , … such that

g(yk ) = 0 ,

and then write down

y(x) = y0 , y(x) = y1 , y(x) = y2 , ... .

(These are the constant solutions.)

3a. Divide equation (4.10) through by g(y) to get

\[ \frac{1}{g(y)}\,\frac{dy}{dx} = f(x) \]

(assuming y is not one of the constant solutions just found).

4 The fact that an explicit solution is a formula while an implicit solution is an equation may be a little confusing at ﬁrst. If it
helps, think of the phrase “implicit solution” as being shorthand for “an equation implicitly deﬁning the solution y = y(x) ”.


b. Integrate both sides of the equation just obtained with respect to x .

c. Solve the resulting equation for y , if practical (thus obtaining an explicit solution). If
not practical, use that resulting equation as an implicit solution, possibly rearranged or
simpliﬁed if appropriate.

4. If constant solutions were found, see if the formulas obtained for the other solutions can be
tweaked to also describe the constant solutions. In any case, be sure to write out all solution(s)
obtained.

The above yields the general solution. If initial values are also given, then use those initial conditions
with the general solution just obtained to derive the particular solutions satisfying the given initial-
value problems.

4.5 Existence, Uniqueness, and False Solutions

On the Existence and Uniqueness of Solutions
Let’s consider a generic initial-value problem involving a separable differential equation,
\[ \frac{dy}{dx} = f(x)g(y) \quad\text{with}\quad y(x_0) = y_0 . \]

Letting F(x, y) = f(x)g(y) , this is

\[ \frac{dy}{dx} = F(x, y) \quad\text{with}\quad y(x_0) = y_0 , \]

which was the initial-value problem considered in theorem 3.1 on page 44. That theorem assures us that there is exactly one solution to our initial-value problem on some interval (a, b) containing $x_0$ provided
\[ F(x, y) = f(x)g(y) \]
and
\[ \frac{\partial F}{\partial y} = \frac{\partial}{\partial y}\left[ f(x)g(y) \right] = f(x)g'(y) \]
are continuous in some open region containing the point $(x_0, y_0)$ . This means our initial-value problem will have exactly one solution on some interval (a, b) provided f(x) is continuous on some open interval containing $x_0$ , and both g(y) and g′(y) are continuous on some open interval containing $y_0$ . In practice, this is typically what we have.
Typically, also, one rarely worries about the existence and uniqueness of the solution to an
initial-value problem with a separable differential equation, at least not when one can carry out
the integration and algebra required by our procedure. After all, doesn’t our reﬁned procedure for
solving separable differential equations always lead us to “the solution”? Well, here are two reasons
to have at least a little concern about existence and uniqueness:

1. After the integration in step 3, the resulting equation may involve a nontrivial formula of
y . After applying the initial condition and solving for y , it is possible to end up with more
than one formula for y(x) . But as long as f , g and g′ are sufficiently continuous, the
above tells us that there is only one solution. Thus, only one of these formulas for y(x) can
be correct. The others are “false solutions” that should be identiﬁed and eliminated. (An
example is given in the next subsection.)


2. Suppose g(y0 ) = 0 . Our refined procedure tells us that the constant function y = y0 ,
which certainly satisfies the initial condition, is also a solution to the differential equation.
So y = y0 is immediately seen to be a solution to our initial-value problem. Do we then
need to go through the rest of our procedure to see if any other solutions to the differential
equation satisfy $y(x_0) = y_0$ ? The answer is No, not if f is continuous on an open interval containing $x_0$ , and both g and g′ are continuous on an open interval containing $y_0$ . If that continuity holds, then the above analysis assures us that there is only one solution. Thus, if we find a solution, we have found the solution.
It is possible to have an initial-value problem
\[ \frac{dy}{dx} = f(x)g(y) \quad\text{with}\quad y(x_0) = y_0 , \]

in which f or g or g′ is not suitably continuous. The problem in exercise 3.6 on page 65,
\[ \frac{dy}{dx} = 2\sqrt{y} \quad\text{with}\quad y(1) = 0 , \]
is one such problem. Here,
\[ y_0 = 0 , \quad f(x) = 1 , \quad g(y) = 2\sqrt{y} \quad\text{and}\quad g'(y) = \frac{1}{\sqrt{y}} . \]

Clearly, g and, especially, g′ are not continuous in any open interval containing $y_0 = 0$ . So the above results on existence and uniqueness cannot be assumed. Indeed, in this case there is not just the one constant solution y = 0 , but, as shown in that exercise, there are many different solutions, including

\[ y(x) = \begin{cases} 0 & \text{if } x < 1 \\ (x - 1)^2 & \text{if } 1 \le x \end{cases} \quad\text{and}\quad y(x) = \begin{cases} 0 & \text{if } x < 3 \\ (x - 3)^2 & \text{if } 3 \le x \end{cases} . \]
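
This failure of uniqueness is easy to confirm by direct substitution. The sketch below checks, at a grid of sample points, that each piecewise function of the above form satisfies both the differential equation and the initial condition y(1) = 0. The parameter k (the point where the parabola starts) generalizes the two displayed cases k = 1 and k = 3; taking k = 7.5, or k infinite to recover the constant solution y = 0, is our own illustration.

```python
import math


def make_solution(k):
    """Return the function that is 0 for x < k and (x - k)^2 for x >= k."""
    def y(x):
        return 0.0 if x < k else (x - k) ** 2
    return y


def dy_dx(k, x):
    """Exact derivative of the piecewise function above (it is 0 at x = k from both sides)."""
    return 0.0 if x < k else 2 * (x - k)


def satisfies_ode(k, xs, tol=1e-12):
    """Check dy/dx = 2 sqrt(y) at every sample point in xs."""
    y = make_solution(k)
    return all(abs(dy_dx(k, x) - 2 * math.sqrt(y(x))) <= tol for x in xs)


xs = [i / 4 for i in range(-20, 41)]      # sample points from -5 to 10
for k in (1, 3, 7.5, math.inf):           # math.inf gives the constant solution y = 0
    y = make_solution(k)
    print(k, y(1) == 0, satisfies_ode(k, xs))
```

Every choice of k with k ≥ 1 passes both checks, so the initial-value problem has infinitely many solutions.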

A Caution on False Solutions

It is always a good idea to verify that any ‘solution’ obtained in solving a differential equation really
is a solution. This is even more true when solving separable differential equations. Not only does
the extra algebra involved naturally increase the likelihood of human error, this algebra can, as noted
above, lead to ‘false solutions’ — formulas that are obtained as solutions but do not actually satisfy
the original problem.

!Example 4.9: Consider the initial-value problem

\[ \frac{dy}{dx} = 2\sqrt{y} \quad\text{with}\quad y(0) = 4 . \]
The differential equation does have one constant solution, y = 0 , but since that doesn’t satisfy the initial condition, it hardly seems relevant. To find the other solutions, let’s divide the differential equation by $\sqrt{y}$ and proceed with the basic procedure:
\[
\begin{aligned}
& \frac{1}{\sqrt{y}}\,\frac{dy}{dx} = 2 \\
\longrightarrow\quad & \int \frac{1}{\sqrt{y}}\,\frac{dy}{dx}\,dx = \int 2\,dx \\
\longrightarrow\quad & \int y^{-1/2}\,dy = \int 2\,dx \\
\longrightarrow\quad & 2y^{1/2} = 2x + c .
\end{aligned}
\]

Dividing by 2 and squaring (and letting a = c/2 ), we get

\[ y = (x + a)^2 . \tag{4.11} \]

Plugging this into the initial condition, we obtain

\[ 4 = y(0) = (0 + a)^2 = a^2 , \]

which means that

\[ a = \pm 2 . \]
Hence, we have two formulas for the solution to our initial-value problem,

\[ y_+(x) = (x + 2)^2 \quad\text{and}\quad y_-(x) = (x - 2)^2 . \]

Both satisfy the initial condition. Do both satisfy the differential equation
\[ \frac{dy}{dx} = 2\sqrt{y} \;? \]
Well, plugging
\[ y = y_\pm(x) = (x \pm 2)^2 \]
into the differential equation yields
\[
\begin{aligned}
& \frac{d}{dx}\left[ (x \pm 2)^2 \right] = 2\sqrt{(x \pm 2)^2} \\
\longrightarrow\quad & 2(x \pm 2) = 2\sqrt{(x \pm 2)^2} .
\end{aligned}
\]

So, for $y = y_\pm(x)$ to be solutions to our differential equation, we must have

\[ x \pm 2 = \sqrt{(x \pm 2)^2} \tag{4.12} \]

for all values of x ‘of interest’. In particular, this equation must be valid at the initial point x = 0 .
So, consider what happens to equation (4.12) at the initial point x = 0 . With $y = y_+(x)$ and x = 0 equation (4.12) becomes
\[ 0 + 2 = \sqrt{(0 + 2)^2} = \sqrt{4} , \]

which, of course, simplifies to the perfectly acceptable equation

\[ 2 = 2 . \]

But with $y = y_-(x)$ and x = 0 we get

\[ 0 - 2 = \sqrt{(0 - 2)^2} = \sqrt{4} = 2 , \]

which, of course, simplifies to

\[ -2 = 2 , \]


which is not acceptable. So we cannot accept $y = y_-(x)$ as a solution to our initial-value problem. It was a false solution.
While we are at it, let’s look a little more closely at equation (4.12) with $y = y_+(x)$ ,

\[ x + 2 = \sqrt{(x + 2)^2} . \]

Remember, if A is any real number, then

\[ \sqrt{A^2} = |A| . \]

So equation (4.12) with $y = y_+$ can be written as

\[ x + 2 = |x + 2| , \]

which is true if and only if x + 2 ≥ 0 (i.e., x ≥ −2 ). This means that our solution, $y = y_+(x)$ , is not valid for all values of x , only for those greater than or equal to −2 . Thus, the actual solution that we have is

\[ y = y_+(x) = (x + 2)^2 \quad\text{for}\quad -2 \le x . \]
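
The same conclusions can be reached by evaluating the residual dy/dx − 2√y for each candidate formula. In this sketch the derivative is the exact one, 2(x ± 2), and the function name `residual` and its `sign` argument are our own devices for selecting between the two candidates.

```python
import math


def residual(sign, x):
    """Return dy/dx - 2 sqrt(y) for the candidate y = (x + sign*2)^2.

    sign = +1 selects y_plus, sign = -1 selects y_minus.
    A true solution gives 0 at every x of interest.
    """
    y = (x + sign * 2) ** 2
    dydx = 2 * (x + sign * 2)
    return dydx - 2 * math.sqrt(y)


print(residual(+1, 0))    # 0.0  -> y_plus satisfies the equation at the initial point
print(residual(-1, 0))    # -8.0 -> y_minus fails there, so it is a false solution
print(residual(+1, -3))   # -4.0 -> y_plus itself fails for x < -2
print(residual(+1, 5))    # 0.0
```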

There was a lot of analysis done in the last example after obtaining the apparent solutions

\[ y = (x \pm 2)^2 . \]

Don’t be alarmed. In most of the problems you will be given, verifying that your formula is a solution
should be fairly easy. Still, take the moral of this example to heart: It is a good idea to verify that
any formulas derived as solutions truly are solutions.
By the way, in a later chapter we will develop some graphical techniques that would have
simpliﬁed our work in the above example.

4.6 On the Nature of Solutions to Differential Equations

When we solve a ﬁrst-order directly integrable differential equation,
\[ \frac{dy}{dx} = f(x) , \]

we get something of the form

y = F(x) + c
where F is any antiderivative of f and c is an arbitrary constant. Computationally, all we have to
do is ﬁnd a single antiderivative F for f and then add an arbitrary constant. Thus, also, the graph
of any possible solution is nothing more than the graph of F(x) shifted vertically by the value of c
(up if c > 0 , down if c < 0 ). What’s more, the interval for x over which

y = F(x) + c

is a valid solution depends only on the one function F . If F(x) is continuous for all x in an interval
(a, b) , then (a, b) is a valid interval for our solution. This interval does not depend on the choice
for c .


[Figure 4.2 plots: two panels, (a) and (b), each with Y on the vertical axis (−10 to 10) and X on the horizontal axis (−3 to 3).]

Figure 4.2: The graph of $y = \tan\left( \frac{1}{3}x^3 + c \right)$ (a) when y(0) = 0 and (b) when y(0) = 2 .

The situation can be much more complicated if our differential equation is not directly integrable.
First of all, finding an explicit solution can be impossible. And consider those explicit general
solutions we have found,

\[ y = \tan\left( \frac{1}{3}x^3 + c \right) \quad\text{(from example 4.3 on page 69)} \]
and
\[ y = 3 \pm \sqrt{a - x^2} \quad\text{(from example 4.4 on page 71)} . \]
In both of these, the arbitrary constants are not simply “added on” to some formula of x . Instead,
each solution formula combines the variable, x , with the arbitrary constant, c or a , in a very
nontrivial manner. There are two immediate consequences of this:
1. The graphs of the solutions are no longer simply vertically shifted copies of some single
function.
2. The possible intervals over which any solution is valid may depend on the arbitrary constant.
And since the value of that constant can be determined by the initial condition, the interval
of validity for our solutions may depend on the initial condition.
Both of these consequences are illustrated in figure 4.2, in which the graphs of two solutions to the
differential equation in example 4.3 have been sketched corresponding to two different initial values
(namely, y(0) = 0 and y(0) = 2 ). In these figures you can see how changing the initial condition
from y(0) = 0 to y(0) = 2 changes the interval over which the solution exists. Even more apparent
is that the graph corresponding to y(0) = 2 is not merely a ‘shift’ of the graph corresponding to
y(0) = 0 ; there is also a small but clear distortion in the shape of the graph.
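
The two intervals can also be computed. For y = tan(x³/3 + c), the condition y(0) = y₀ gives c = arctan(y₀), and the solution through x = 0 remains finite as long as −π/2 < x³/3 + c < π/2. The sketch below solves those two inequalities for x; the helper names are ours.

```python
import math


def real_cube_root(u):
    """Real cube root, valid for negative u as well."""
    return math.copysign(abs(u) ** (1 / 3), u)


def validity_interval(y0):
    """Endpoints of the interval around x = 0 on which tan(x^3/3 + c) is finite.

    c = arctan(y0) comes from the initial condition y(0) = y0, and the
    endpoints are where x^3/3 + c reaches -pi/2 and +pi/2.
    """
    c = math.atan(y0)
    left = real_cube_root(3 * (-math.pi / 2 - c))
    right = real_cube_root(3 * (math.pi / 2 - c))
    return left, right


print(validity_interval(0))   # symmetric about x = 0
print(validity_interval(2))   # shifted to the left and no longer symmetric
```

For y(0) = 0 the interval is roughly (−1.68, 1.68), while for y(0) = 2 it is roughly (−2.00, 1.12).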
The possible dependence of a solution’s interval of validity on the arbitrary constant is even better illustrated by the solutions obtained in example 4.4. There, the differential equation was
\[ \frac{dy}{dx} = -\frac{x}{y - 3} \]
and the general solution was found to be

\[ y = 3 \pm \sqrt{a - x^2} . \]
The arbitrary constant here, a , occurs in the square root. For this square root to be real, we must have
\[ a - x^2 \ge 0 . \]


That is,
\[ -\sqrt{a} \le x \le \sqrt{a} \]
is the maximal interval over which

\[ y = 3 + \sqrt{a - x^2} \quad\text{and}\quad y = 3 - \sqrt{a - x^2} \]
are valid solutions.
To properly indicate this dependence of the solution’s possible domain on the arbitrary constant or the initial value, we should state the maximal interval of validity along with any formula or equation describing our solution(s). For example 4.4, that would mean writing the general solution as
\[ y = 3 \pm \sqrt{a - x^2} \quad\text{for all}\quad -\sqrt{a} \le x \le \sqrt{a} . \]
When this is particularly convenient or noteworthy, we will attempt to remember to do so. Even when we don’t, keep in mind that there may be limits as to the possible values of x , and that these limits may depend on the values assumed by the arbitrary constants.
By the way, notice also that the above a cannot be negative (otherwise, $\sqrt{a}$ will not be a real
number). This points out that, in general, the ‘arbitrary’ constants appearing in general solutions are
not always completely arbitrary.

4.7 Using and Graphing Implicit Solutions

Outside of courses speciﬁcally geared towards learning about differential equations, the main reason
to solve an initial-value problem such as
\[ \frac{dy}{dx} = \frac{x + 1}{8 + 2\pi\sin(y\pi)} \quad\text{with}\quad y(0) = 2 \]
is so that we can predict what values y(x) will assume when x has values other than 0 . In practice,
of course, y(x) will represent something of interest (position, velocity, promises made, number of
ducks, etc.) that varies with whatever x represents (time, position, money invested, food available,
etc.). When the solution y is given explicitly by some formula y(x) , then those values are relatively
easily obtained by just computing that formula for different values of x , and a picture of how y(x)
varies with x is easily obtained by graphing y = y(x) . If, instead, the solution is given implicitly
by some equation, then the possible values of y(x) for different x’s , along with any graph of y(x) ,
must be extracted from that equation. It may be necessary to use advanced numerical methods to
extract the desired information, but that should not be a signiﬁcant problem — these methods are
probably already incorporated into your favorite computer math package.

!Example 4.10: Let’s consider the initial-value problem

\[ \frac{dy}{dx} = \frac{x + 1}{8 + 2\pi\sin(y\pi)} \quad\text{with}\quad y(0) = 2 . \]
In example 4.8, we saw that the general solution to the differential equation is given implicitly by
\[ 8y - 2\cos(y\pi) = \frac{1}{2}x^2 + x + c . \tag{4.13} \]
The initial condition y(0) = 2 tells us that y = 2 when x = 0 . With this assumed, our implicit solution reduces to
\[ 8 \cdot 2 - 2\cos(2\pi) = \frac{1}{2}0^2 + 0 + c . \]


[Figure 4.3 plot: Y on the vertical axis, with the value y(8) marked, against X on the horizontal axis (0 to 10).]

Figure 4.3: Graph of the implicit solution to the initial-value problem of example 4.10.

So
\[ c = 8 \cdot 2 - 2\cos(2\pi) - \frac{1}{2}0^2 - 0 = 16 - 2 = 14 . \]
Plugging this back into equation (4.13) gives
\[ 8y - 2\cos(y\pi) = \frac{1}{2}x^2 + x + 14 \tag{4.14} \]

as an implicit solution for our initial-value problem.

Replacing c with 14 does not make it any easier for us to convert this equation relating y
and x into a formula y(x) for y . Still, y = y(x) must satisfy equation (4.14), and the graph of
that equation can be generated by invoking the appropriate command(s) in a suitable computer
math package. That is how the graph in ﬁgure 4.3 was created. From this graph, we see that the
value of y(8) is between 6 and 7 . For a more precise determination of y(8) , set x = 8 in
equation (4.14). This gives us
\[ 8y - 2\cos(y\pi) = \frac{1}{2} \cdot 8^2 + 8 + 14 , \]

which, after a little arithmetic, reduces to

8y − 2 cos(yπ ) = 54 .

Now apply some numerical method (such as the Newton-Raphson method for finding roots⁵)
to find, approximately, the corresponding value of y . Again, we need not do the tedious
computations ourselves; we can go to our favorite computer math package, look up the appropriate
commands, and let it compute that value for y . Doing so, we find that y(8) ≈ 6.642079594 .
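The root-finding step just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `g`, `g_prime` and `newton`, the starting guess 6.5 (read off the graph, which puts y(8) between 6 and 7), and the tolerance are all choices made here, not anything prescribed by the text.

```python
import math

def g(y):
    # Residual of the equation 8y - 2cos(pi*y) = 54; a root of g is y(8)
    return 8.0 * y - 2.0 * math.cos(math.pi * y) - 54.0

def g_prime(y):
    # Derivative of g; it is always positive because 2*pi < 8
    return 8.0 + 2.0 * math.pi * math.sin(math.pi * y)

def newton(y0, tol=1e-12, max_iter=50):
    # Newton-Raphson iteration: y <- y - g(y)/g'(y)
    y = y0
    for _ in range(max_iter):
        step = g(y) / g_prime(y)
        y -= step
        if abs(step) < tol:
            return y
    raise RuntimeError("Newton iteration did not converge")

y8 = newton(6.5)  # starting guess taken from the graph
print(f"y(8) is approximately {y8:.9f}")
```

Because the derivative of the left side never vanishes, the equation has exactly one root, so the starting guess affects only how quickly the iteration settles.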

Any curve that is at least part of the graph of an implicit solution for a differential equation is
called an integral curve for the differential equation. Remember, this is the graph of an equation.
If a function y(x) is a solution to that differential equation, then y = y(x) must also satisfy any
equation serving as an implicit solution, and, thus, the graph of that y(x) (which we will call a
solution curve) must be at least a portion of one of the integral curves for that differential equation.
Sometimes an integral curve will be a solution curve. That is “clearly” the case in ﬁgure 4.3, because
that curve is “clearly” the graph of a function (more on that later).

5 It should be in your calculus text.


Sometimes though, there are two (or more) different functions y1 and y2 such that both
y = y1 (x) and y = y2 (x) satisfy the same equation for the same values of x . If that equation is
an implicit solution to some differential equation, then its graph (the integral curve) will contain the
graphs of both y = y1 (x) and y = y2 (x) . In such a case, the integral curve is not a solution curve
but contains two or more solution curves.
To illuminate these comments, let us look at the solution curves and integral curves for one
equation we’ve already solved. At the same time, we will discover that, at least occasionally, the use
of implicit solutions can simplify our work, even when explicit solutions are available.

!Example 4.11: Consider graphing all the solutions to

\[ \frac{dy}{dx} = -\frac{x}{y - 3} \]

and the particular solution satisfying

y(0) = 1 .

If F(x, y) is a ﬁnite number for each point (x, y) in C , then C is the graph of a function satisfying
the given differential equation (i.e., C is a solution curve).

?Exercise 4.2: Explain why the integral curve graphed in ﬁgure 4.3 is “clearly” a solution curve.

4.8 On Using Deﬁnite Integrals with Separable Equations

Just as with any directly integrable differential equation, a separable differential equation
\[ \frac{dy}{dx} = f(x)g(y) , \]
once separated to the form
\[ \frac{1}{g(y)} \frac{dy}{dx} = f(x) , \]
can be integrated using deﬁnite integrals instead of the indeﬁnite integrals we’ve been using. The
basic ideas are pretty much the same as for directly integrable differential equations:
1. Pick a convenient value for the lower limit of integration, a . In particular, if the value of
y(x₀) is given for some point x₀ , set a = x₀ .
2. Rewrite the differential equation with s denoting the variable instead of x . This means that
we rewrite our separable equation as
\[ \frac{dy}{ds} = f(s)g(y) , \]
which ‘separates’ to
\[ \frac{1}{g(y)} \frac{dy}{ds} = f(s) . \]
3. Then integrate each side with respect to s from s = a to s = x .
The integral on the left-hand side will be of the form
\[ \int_{s=a}^{x} \frac{1}{g(y)} \frac{dy}{ds} \, ds . \]

Keep in mind that, here, y is some unknown function of s , and that the limits in the integral are
limits on s . Using the substitution y = y(s) , we see that
\[ \int_{s=a}^{x} \frac{1}{g(y)} \frac{dy}{ds} \, ds = \int_{y=y(a)}^{y(x)} \frac{1}{g(y)} \, dy . \]

Do not forget to convert the limits to being the corresponding limits on y , instead of s .
Once the integration is done, we attempt to solve the resulting equation for y(x) just as before.
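
As a brief illustration of these three steps, consider the initial-value problem dy/dx = 2xy with y(0) = 3 . This problem is not taken from the text; it is chosen here only because every integral can be computed by hand. Taking a = 0 :

```latex
\begin{align*}
  \frac{1}{y} \frac{dy}{ds} &= 2s \\
  \int_{s=0}^{x} \frac{1}{y} \frac{dy}{ds} \, ds &= \int_{0}^{x} 2s \, ds \\
  \int_{y=3}^{y(x)} \frac{1}{y} \, dy &= x^2 \\
  \ln|y(x)| - \ln 3 &= x^2 \\
  y(x) &= 3e^{x^2} .
\end{align*}
```

The last step uses the fact that y(0) = 3 is positive, so the absolute value can be dropped. Note that the initial condition enters through the lower limit y(a) = 3 , and no arbitrary constant appears.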

!Example 4.12: Let us solve

2. Even if a ‘nice’ formula for
\[ \int_a^x f(s) \, ds \]
cannot be found, the value of this integral can be closely approximated for specific values
of x using standard methods (which are already in many computer math packages). Using
these values for this integral, it is then often possible to find the corresponding values for
y(x) for specific values of x .

Unfortunately, we still have a serious problem if we cannot ﬁnd a usable formula for
\[ \int_{y(a)}^{y(x)} \frac{1}{g(y)} \, dy \]

since the numerical methods for computing this integral require knowing the value of y(x) for the
desired choice of x , and that y(x) is exactly what we do not know.

Additional Exercises

4.3. Determine whether each of the following differential equations is or is not separable, and,
if it is separable, rewrite the equation in the form
\[ \frac{dy}{dx} = f(x)g(y) . \]
a. \( \frac{dy}{dx} = 3y^2 - y^2 \sin(x) \)
b. \( \frac{dy}{dx} = 3x - y \sin(x) \)
c. \( x \frac{dy}{dx} = (x - y)^2 \)
d. \( \frac{dy}{dx} = 1 + x^2 \)
e. \( \frac{dy}{dx} + 4y = 8 \)
f. \( \frac{dy}{dx} + xy = 4x \)
g. \( \frac{dy}{dx} + 4y = x^2 \)
h. \( \frac{dy}{dx} = xy - 3x - 2y + 6 \)
i. \( \frac{dy}{dx} = \sin(x + y) \)
j. \( y \frac{dy}{dx} = e^{x - 3y^2} \)

4.4. Using the basic procedure, find the general solution to each of the following separable
equations:
a. \( \frac{dy}{dx} = \frac{x}{y} \)
b. \( \frac{dy}{dx} = y^2 + 9 \)
c. \( xy \frac{dy}{dx} = y^2 + 9 \)
d. \( \frac{dy}{dx} = \frac{y^2 + 1}{x^2 + 1} \)
e. \( \cos(y) \frac{dy}{dx} = \sin(x) \)
f. \( \frac{dy}{dx} = e^{2x - 3y} \)


4.5. Using the basic procedure, find the solution to each of the following initial-value problems:
a. \( \frac{dy}{dx} = \frac{x}{y} \) with y(1) = 3
b. \( \frac{dy}{dx} = 2x - 1 + 2xy - y \) with y(0) = 2
c. \( y \frac{dy}{dx} = xy^2 + x \) with y(0) = −2
d. \( y \frac{dy}{dx} = 3\sqrt{xy^2 + 9x} \) with y(1) = 4
dx

4.6. Find all the constant solutions (and only the constant solutions) to each of the following.
If no constant solution exists, say so.
a. \( \frac{dy}{dx} = xy - 4x \)
b. \( \frac{dy}{dx} - 4y = 2 \)
c. \( y \frac{dy}{dx} = xy^2 - 9x \)
d. \( \frac{dy}{dx} = \sin(y) \)
e. \( \frac{dy}{dx} = e^{x + y^2} \)
f. \( \frac{dy}{dx} = 200y - 2y^2 \)

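4.7. Find the general solution for each of the following. Where possible, write your answer as
an explicit solution.
a. \( \frac{dy}{dx} = xy - 4x \)
b. \( \frac{dy}{dx} = xy - 3x - 2y + 6 \)
c. \( \frac{dy}{dx} = 3y^2 - y^2 \sin(x) \)
d. \( \frac{dy}{dx} = \tan(y) \)
e. \( \frac{dy}{dx} = \frac{y}{x} \)
f. \( \frac{dy}{dx} = \frac{6x^2 + 4}{3y^2 - 4y} \)
g. \( (x^2 + 1) \frac{dy}{dx} = y^2 + 1 \)
h. \( (y^2 - 1) \frac{dy}{dx} = 4xy^2 \)
i. \( \frac{dy}{dx} = e^{-y} \)
j. \( \frac{dy}{dx} = e^{-y} + 1 \)
k. \( \frac{dy}{dx} = 3xy^3 \)
l. \( \frac{dy}{dx} = \frac{2 + \sqrt{x}}{2 + \sqrt{y}} \)
m. \( \frac{dy}{dx} - 3x^2 y^2 = -3x^2 \)
n. \( \frac{dy}{dx} - 3x^2 y^2 = 3x^2 \)
o. \( \frac{dy}{dx} = 200y - 2y^2 \)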

4.8. Solve each of the following initial-value problems. If possible, express each solution as an
explicit solution.
a. \( \frac{dy}{dx} - 2y = -10 \) with y(0) = 8
b. \( y \frac{dy}{dx} = \sin(x) \) with y(0) = −4
c. \( \frac{dy}{dx} = 2x - 1 + 2xy - y \) with y(0) = −1
d. \( x \frac{dy}{dx} = y^2 - y \) with y(2) = 1
e. \( x \frac{dy}{dx} = y^2 - y \) with y(1) = 2
f. \( \frac{dy}{dx} = \frac{y^2 - 1}{xy} \) with y(1) = −2
g. \( (y^2 - 1) \frac{dy}{dx} = 4xy \) with y(0) = 1
dx

4.9. In chapter 10, when studying population growth, we will obtain the “logistic equation”
\[ \frac{dy}{dx} = \beta y - \gamma y^2 \]

with β and γ being positive constants.

a. What are the constant solutions to this equation?
b. Find the general solution to this equation.

4.10. For each of the following initial-value problems, find the largest interval over which the
solution is valid. (Note: You’ve already solved these initial-value problems in exercise set
4.8 or at least found the general solution to the differential equation in 4.7.)
a. \( \frac{dy}{dx} - 2y = -10 \) with y(0) = 8
b. \( x \frac{dy}{dx} = y^2 - y \) with y(1) = 0
c. \( x \frac{dy}{dx} = y^2 - y \) with y(1) = 2
d. \( \frac{dy}{dx} = e^{-y} \) with y(0) = 1
e. \( \frac{dy}{dx} = 3xy^3 \) with y(0) = 1/2


5
Linear First-Order Equations

“Linear” first-order differential equations make up another important class of differential equations
that commonly arise in applications and are relatively easy to solve (in theory). As with the notion of
‘separability’, the idea of ‘linearity’ for first-order equations can be viewed as a simple generalization
of the notion of direct integrability, and a relatively straightforward (though, perhaps, not so intuitively
obvious) method will allow us to put any first-order linear equation into a form that can be relatively
easily integrated. We will derive this method in a short while (after, of course, describing just what
it means for a first-order equation to be “linear”).
By the way, the criteria given here for a differential equation being linear will be extended later
to higher-order differential equations, and a rather extensive theory will be developed to handle linear
differential equations of any order. That theory is not needed here; in fact, it would be of very limited
value. And, to be honest, the basic techniques we’ll develop in this chapter are only of limited use
when it comes to solving higher-order linear equations. However, these basic techniques involve an
“integrating factor”, which is something we’ll be able to generalize a little bit later (in chapter 7) to
help solve much more general first-order differential equations.

5.1 Basic Notions

Definitions
A first-order differential equation is said to be linear if and only if it can be written as
\[ \frac{dy}{dx} = f(x) - p(x)y \tag{5.1} \]
or, equivalently, as
\[ \frac{dy}{dx} + p(x)y = f(x) \tag{5.2} \]
where p(x) and f (x) are known functions of x only.
Equation (5.2) is normally considered to be the standard form for first-order linear equations.
Note that the only appearance of y in a linear equation (other than in the derivative) is in a term
where y alone is multiplied by some formula of x . If there are any other functions of y appearing
in the equation after you’ve isolated the derivative, then the equation is not linear.

!Example 5.1: Consider the differential equation

\[ x \frac{dy}{dx} + 4y - x^3 = 0 . \]


Solving for the derivative, we get
\[ \frac{dy}{dx} = \frac{x^3 - 4y}{x} = x^2 - \frac{4}{x} y , \]
which is
\[ \frac{dy}{dx} = f(x) - p(x)y \]
with
\[ p(x) = \frac{4}{x} \quad\text{and}\quad f(x) = x^2 . \]
So this first-order differential equation is linear. Adding (4/x) · y to both sides, we then get the
equation in standard form,
\[ \frac{dy}{dx} + \frac{4}{x} y = x^2 . \]
On the other hand
\[ \frac{dy}{dx} + \frac{4}{x} y^2 = x^2 \]
is not linear because of the y² .

In testing whether a given ﬁrst-order differential equation is linear, it does not matter whether
you attempt to rewrite the equation as
\[ \frac{dy}{dx} = f(x) - p(x)y \]
or as
\[ \frac{dy}{dx} + p(x)y = f(x) . \]

If you can put it into either form, the equation is linear. You may prefer the ﬁrst, simply because it is
a natural form to look for after solving the equation for the derivative. However, because the second
form (the standard form) is more suited for the methods normally used for solving these equations,
more experienced workers typically prefer that form.

!Example 5.2: Consider the equation

\[ x^2 \frac{dy}{dx} + x^3 [y - \sin(x)] = 0 . \]

Dividing through by x² and doing a little multiplication and addition convert the equation to
\[ \frac{dy}{dx} + xy = x \sin(x) , \]

which is the standard form for a linear equation. So this differential equation is linear.

It is possible for a linear equation

\[ \frac{dy}{dx} + p(x)y = f(x) \]

to also be a type of equation we’ve already studied. For example, if p(x) = 0 then the equation is
\[ \frac{dy}{dx} = f(x) , \]
which is directly integrable. If, instead, f (x) = 0 , the equation can be rewritten as
\[ \frac{dy}{dx} = -p(x)y , \]
showing that it is separable. In addition, you can easily verify that a linear equation is separable if
f (x) is any constant multiple of p(x) .
If a linear equation is also directly integrable or separable, then it can be solved using methods
already discussed. Otherwise, a small trick turns out to be very useful.
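
To see why a linear equation is separable when f is a constant multiple of p , here is the short computation, with k denoting the constant (a symbol introduced only for this check) for which f (x) = k p(x) :

```latex
\[
  \frac{dy}{dx} = f(x) - p(x)y = k\,p(x) - p(x)y = p(x)\,[k - y] .
\]
```

The right side is a formula of x alone times a formula of y alone, which is exactly the separable form.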

Deriving the Trick for Solving

Suppose we want to solve some ﬁrst-order linear equation
\[ \frac{dy}{dx} + py = f \tag{5.3} \]
(for brevity, p = p(x) and f = f (x) ). To avoid triviality, let’s assume p(x) is not always 0 .
Whether f (x) vanishes or not will not be relevant.
The small trick to solving equation (5.3) comes from the product rule for derivatives: If μ and
y are two functions of x , then
\[ \frac{d}{dx}[\mu y] = \frac{d\mu}{dx} y + \mu \frac{dy}{dx} . \]
Rearranging the terms on the right side, we get
\[ \frac{d}{dx}[\mu y] = \mu \frac{dy}{dx} + \frac{d\mu}{dx} y , \]
and the right side of this equation looks a little like the left side of equation (5.3). To get a better
match, let’s multiply equation (5.3) by μ ,
\[ \mu \frac{dy}{dx} + \mu p y = \mu f . \]
With luck, the left side of this equation will match the right side of the last equation for the product
rule, and we will have
\[
\begin{aligned}
  \frac{d}{dx}[\mu y] &= \mu \frac{dy}{dx} + \frac{d\mu}{dx} y \\
                      &= \mu \frac{dy}{dx} + \mu p y = \mu f .
\end{aligned}
\tag{5.4}
\]
This, of course, requires that
\[ \frac{d\mu}{dx} = \mu p . \]
Assuming this requirement is met, the equations in (5.4) hold. Cutting out the middle of that (and
recalling that f and μ are functions of x only), we see that the differential equation reduces to
\[ \frac{d}{dx}[\mu y] = \mu(x) f(x) . \tag{5.5} \]
The advantage of having our differential equation in this form is that we can actually integrate both
sides with respect to x , with the left side being especially easy since it is just a derivative with
respect to x .
The function μ is called an integrating factor for the differential equation. As noted in the
derivation, it must satisfy
\[ \frac{d\mu}{dx} = \mu p . \tag{5.6} \]


This is a simple separable differential equation for μ (remember, p = p(x) is a known function).
Any nonzero solution to this can be used as an integrating factor (the zero solution, μ = 0 , would
simplify matters too much!). Applying the approach we learned for separable differential equations,
we divide through by μ , integrate, and solve the resulting equation for μ :

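\[
\begin{aligned}
  & \int \frac{1}{\mu} \frac{d\mu}{dx} \, dx = \int p(x) \, dx \\
  \longrightarrow\quad & \ln|\mu| = \int p(x) \, dx \\
  \longrightarrow\quad & \mu = \pm e^{\int p(x) \, dx} .
\end{aligned}
\]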

Since we only need one function μ(x) satisfying requirement (5.6), we can drop both the “±” and
any arbitrary constant arising from the integration of p(x) . This leaves us with a relatively simple
formula for our integrating factor, namely,
\[ \mu(x) = e^{\int p(x) \, dx} \tag{5.7} \]

where it is understood that we can let the constant of integration be zero.

5.2 Solving First-Order Linear Equations

As we just derived, the real ‘trick’ to solving a first-order linear equation is to reduce it to an easily
integrated form via the use of an integrating factor. Here is a procedure for actually carrying out the
necessary steps. To illustrate these steps, we will immediately use them to find the general solution
to the equation from example 5.1,
\[ x \frac{dy}{dx} + 4y = x^3 . \]
The Procedure:
1. Get the equation into the standard form for first-order linear differential equations,
\[ \frac{dy}{dx} + p(x)y = f(x) . \]
For our example, we just divide through by x , obtaining
\[ \frac{dy}{dx} + \frac{4}{x} y = x^2 . \]
As noted in example 5.1, this is the desired form with
\[ p(x) = \frac{4}{x} \quad\text{and}\quad f(x) = x^2 . \]

2. Compute an integrating factor
\[ \mu(x) = e^{\int p(x) \, dx} . \]
Remember, since we only need one integrating factor, we can let the constant of integration
be zero here.


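For our example,
\[ \mu(x) = e^{\int p(x) \, dx} = e^{\int \frac{4}{x} \, dx} = e^{4 \ln|x|} . \]
Applying some basic identities for the natural logarithm, we can rewrite this last
expression in a much more convenient form:
\[ \mu(x) = e^{4 \ln|x|} = e^{\ln|x|^4} = |x|^4 = x^4 . \]

3a. Multiply the differential equation (in standard form) by the integrating factor,
\[ \mu \left[ \frac{dy}{dx} + p(x)y = f(x) \right] \quad\longrightarrow\quad \mu \frac{dy}{dx} + \mu p y = \mu f , \]

b. and observe that, via the product rule and choice of μ , the left side can be written as the
derivative of the product of μ and y ,
\[ \underbrace{\mu \frac{dy}{dx} + \mu p y}_{\frac{d}{dx}[\mu y]} = \mu f , \]

c. and then rewrite the differential equation as
\[ \frac{d}{dx}[\mu y] = \mu f . \]

For our example, μ = x⁴ . Multiplying our equation by this and proceeding
through the three substeps above yields
\[
\begin{aligned}
  & x^4 \left[ \frac{dy}{dx} + \frac{4}{x} y = x^2 \right] \\
  \longrightarrow\quad & \underbrace{x^4 \frac{dy}{dx} + 4x^3 y}_{\frac{d}{dx}[x^4 y]} = x^6 \\
  \longrightarrow\quad & \frac{d}{dx}[x^4 y] = x^6 .
\end{aligned}
\]

4. Integrate with respect to x both sides of the last equation obtained,
\[
\begin{aligned}
  & \int \frac{d}{dx}[\mu y] \, dx = \int \mu(x) f(x) \, dx \\
  \longrightarrow\quad & \mu y = \int \mu(x) f(x) \, dx .
\end{aligned}
\]
Don’t forget the arbitrary constant here!

Integrating the last equation in our example,
\[
\begin{aligned}
  & \int \frac{d}{dx}[x^4 y] \, dx = \int x^6 \, dx \\
  \longrightarrow\quad & x^4 y = \frac{1}{7} x^7 + c .
\end{aligned}
\]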


5. Finally, solve for y by dividing through by μ .

For our example,
\[ y = x^{-4} \left[ \frac{1}{7} x^7 + c \right] = \frac{1}{7} x^3 + c x^{-4} . \]

Later, we will use the above procedure to derive an explicit formula for computing y from
p(x) and f (x) . Unfortunately, it is not a particularly simple formula, and those who attempt to
memorize it typically make more mistakes than those who simply remember the above procedure.
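
A quick numerical sanity check of the general solution found for the worked example, y = x³/7 + c x⁻⁴ , can be sketched as follows. This is only an illustration: the helper names `y` and `residual`, the sample points, and the finite-difference step are choices made here, and a small residual is evidence of correctness rather than a proof.

```python
def y(x, c):
    # General solution found for the worked example: y = x^3/7 + c*x^(-4)
    return x**3 / 7.0 + c * x**-4

def residual(x, c, h=1e-6):
    # Central-difference approximation to dy/dx
    dydx = (y(x + h, c) - y(x - h, c)) / (2.0 * h)
    # Left side minus right side of  x*dy/dx + 4y = x^3
    return x * dydx + 4.0 * y(x, c) - x**3

# Largest residual over a few sample points x and constants c
worst = max(abs(residual(x, c))
            for x in (0.5, 1.0, 2.0, 3.0)
            for c in (-2.0, 0.0, 5.0))
print(f"largest residual: {worst:.2e}")
```

The residual stays near zero for every value of c tried, consistent with c being an arbitrary constant. The sample points avoid x = 0 , where the term c x⁻⁴ is undefined.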

!Example 5.3: Consider

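\[ e^x \frac{dy}{dx} = 20 + 3e^x y , \qquad y(0) = 7 . \]

Subtracting \( 3e^x y \) from both sides and then multiplying through by \( e^{-x} \) puts this linear differential
equation into the desired form,
\[ \frac{dy}{dx} - 3y = 20e^{-x} . \]
So p(x) = −3 , and our integrating factor is
\[ \mu = \mu(x) = e^{\int -3 \, dx} = e^{-3x} . \]

Multiplying the differential equation by μ and following the rest of the steps in our procedure
gives us the following:
\[
\begin{aligned}
  & e^{-3x} \left[ \frac{dy}{dx} - 3y = 20e^{-x} \right] \\
  \longrightarrow\quad & \underbrace{e^{-3x} \frac{dy}{dx} - 3e^{-3x} y}_{\frac{d}{dx}[e^{-3x} y]} = 20e^{-4x} \\
  \longrightarrow\quad & \frac{d}{dx}[e^{-3x} y] = 20e^{-4x} \\
  \longrightarrow\quad & \int \frac{d}{dx}[e^{-3x} y] \, dx = \int 20e^{-4x} \, dx \\
  \longrightarrow\quad & e^{-3x} y = -5e^{-4x} + c \\
  \longrightarrow\quad & y = e^{3x} \left[ -5e^{-4x} + c \right] .
\end{aligned}
\]

So the general solution to our differential equation is
\[ y(x) = -5e^{-x} + ce^{3x} . \]

Using this formula for y(x) with the initial condition gives us
\[ 7 = y(0) = -5e^{-0} + ce^{3 \cdot 0} = -5 + c . \]

Thus,
\[ c = 7 + 5 = 12 , \]
and the solution to the given initial-value problem is
\[ y(x) = -5e^{-x} + 12e^{3x} . \]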
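
As an independent check on this result, the initial-value problem can also be integrated numerically and compared with the formula. The sketch below uses a classical fourth-order Runge-Kutta stepper, which is not a method from this chapter; the function names `rhs`, `rk4` and `exact`, the endpoint x = 1 , and the step count are all assumptions made for illustration.

```python
import math

def rhs(x, y):
    # The equation in standard form, solved for the derivative:
    # dy/dx = 3y + 20 e^{-x}
    return 3.0 * y + 20.0 * math.exp(-x)

def rk4(f, x0, y0, x_end, n):
    # Classical fourth-order Runge-Kutta with n equal steps
    h = (x_end - x0) / n
    x, y = x0, y0
    for _ in range(n):
        k1 = f(x, y)
        k2 = f(x + h / 2, y + h * k1 / 2)
        k3 = f(x + h / 2, y + h * k2 / 2)
        k4 = f(x + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
    return y

def exact(x):
    # Solution obtained with the integrating factor
    return -5.0 * math.exp(-x) + 12.0 * math.exp(3.0 * x)

approx = rk4(rhs, 0.0, 7.0, 1.0, 1000)
print(approx, exact(1.0))
```

The two printed numbers should agree to many digits; a visible mismatch would point to an algebra slip in the hand computation.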


Let us brieﬂy get back to our requirement for μ = μ(x) being an integrating factor for
\[ \frac{dy}{dx} + py = f . \]
That requirement was equation (5.6),
\[ \frac{d\mu}{dx} = \mu p . \]
Now, in computing this μ , you will often get something like
\[ \mu(x) = |\mu_0(x)| \]
where μ₀(x) is a relatively simple continuous function (e.g., μ(x) = |sin(x)| ). Consequently, on
any interval over which the graph of μ₀(x) never crosses the X–axis,
\[ \mu_0(x) = \mu(x) \quad\text{or}\quad \mu_0(x) = -\mu(x) . \]
Either way,
\[ \frac{d\mu_0}{dx} = \frac{d[\pm\mu]}{dx} = \pm\frac{d\mu}{dx} = \pm\mu p = \mu_0 p . \]
So μ₀ also satisfies the requirement for being an integrating factor for the given differential equation.
This means that, if in computing μ you do get something like
\[ \mu(x) = |\mu_0(x)| \]
where μ₀(x) is a relatively simple function, then you can ignore the absolute value brackets and
just use μ₀ for your integrating factor.

!Example 5.4: Consider solving the linear differential equation

\[ \frac{dy}{dx} + \cot(x) y = x \csc(x) . \]
This equation is already in the desired form. In a case like this, it is often a good idea to see what
the equation looks like in terms of sines and cosines,
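\[ \frac{dy}{dx} + \frac{\cos(x)}{\sin(x)} y = \frac{x}{\sin(x)} . \]
To find \( \mu = e^{\int p \, dx} \) , first observe that, ignoring the constant of integration,
\[ \int p(x) \, dx = \int \frac{\cos(x)}{\sin(x)} \, dx = \int \frac{\frac{d}{dx}\sin(x)}{\sin(x)} \, dx = \ln|\sin(x)| . \]
So \( \mu(x) = e^{\ln|\sin(x)|} = |\sin(x)| \) and, as just discussed, we can ignore the absolute value brackets and
use sin(x) as our integrating factor. Multiplying the differential equation by this and following the
rest of our procedure gives
\[
\begin{aligned}
  & \underbrace{\sin(x) \frac{dy}{dx} + \cos(x) y}_{\frac{d}{dx}[\sin(x) y]} = x \\
  \longrightarrow\quad & \int \frac{d}{dx}[\sin(x) y] \, dx = \int x \, dx \\
  \longrightarrow\quad & \sin(x) y = \frac{1}{2} x^2 + c_1 \\
  \longrightarrow\quad & y = \frac{x^2 + c}{2 \sin(x)} \qquad (\text{with } c = 2c_1) .
\end{aligned}
\]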

5.3 On Using Deﬁnite Integrals with Linear Equations

Integration arises twice in our method for solving
\[ \frac{dy}{dx} + p(x)y = f(x) . \]
It first arises when we integrate p to get the integrating factor,
\[ \mu(x) = e^{\int p(x) \, dx} . \]

It is needed again when we then integrate both sides of the corresponding equation
\[ \frac{d}{dx}[\mu y] = \mu f . \]
At either point, of course, we could use definite integrals instead of indefinite integrals.
Let’s first look at what happens when we integrate both sides of the last equation using definite
integrals. Remember, everything is a function of x , so this equation can be written a bit more
explicitly as
\[ \frac{d}{dx}[\mu(x) y(x)] = \mu(x) f(x) . \]
As before, to avoid having x represent two different entities, we replace the x’s with another
variable, say, s , and rewrite our current differential equation as
\[ \frac{d}{ds}[\mu(s) y(s)] = \mu(s) f(s) . \]
Then we pick a convenient lower limit a for our integration and integrate each side of the above
with respect to s from s = a to s = x ,
\[ \int_a^x \frac{d}{ds}[\mu(s) y(s)] \, ds = \int_a^x \mu(s) f(s) \, ds . \tag{5.8} \]

But
\[ \int_a^x \frac{d}{ds}[\mu(s) y(s)] \, ds = \mu(s) y(s) \Big|_a^x = \mu(x) y(x) - \mu(a) y(a) . \]
So equation (5.8) reduces to
\[ \mu(x) y(x) - \mu(a) y(a) = \int_a^x \mu(s) f(s) \, ds . \]

Solving this for y(x) yields
\[ y(x) = \frac{1}{\mu(x)} \left[ \mu(a) y(a) + \int_a^x \mu(s) f(s) \, ds \right] . \tag{5.9} \]


This is not a simple enough formula to be worth memorizing (especially since you still have to
remember what μ is). Nonetheless, it is a formula worth knowing about for at least two good
reasons:

1. This formula can automatically take into account an initial value y(x₀) = y₀ . All we have
to do is to choose the lower limit a to be x₀ . Then formula (5.9) tells us that the solution to
\[ \frac{dy}{dx} + py = f \quad\text{with}\quad y(x_0) = y_0 \]
is
\[ y = \frac{1}{\mu(x)} \left[ \mu(x_0) y_0 + \int_{x_0}^x \mu(s) f(s) \, ds \right] . \tag{5.10} \]

2. Even if we cannot determine a relatively nice formula for the integral of μf (for a given
choice of μ and f ), the value of the integral in formula (5.9) can, in practice, still be
accurately computed for desired values of x using numerical integration routines found in
standard computer math packages. Indeed, using any of these packages and formula (5.9),
you could probably program a computer to accurately compute y(x) for a number of values
of x and use these values to produce a very accurate graph of y .
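
The second point can be sketched in code. The following is a minimal illustration, not a production routine: it evaluates formula (5.10) with composite Simpson's rule standing in for the "numerical integration routines" mentioned above, taking μ as the exponential of the definite integral of p from x₀ (so that μ(x₀) = 1 ). The function names `simpson` and `solve_linear_ivp`, the number of subintervals, and the sample problem dy/dx − 2xy = 4 with y(0) = 3 are choices made here for illustration.

```python
import math

def simpson(g, a, b, n=200):
    # Composite Simpson's rule with n (even) subintervals
    if n % 2:
        n += 1
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3.0

def solve_linear_ivp(p, f, x0, y0, x):
    # Value at x of the solution to dy/dx + p(x)y = f(x), y(x0) = y0,
    # computed from formula (5.10)
    def mu(t):
        # Integrating factor exp( integral of p from x0 to t ); mu(x0) = 1
        return math.exp(simpson(p, x0, t))
    integral = simpson(lambda s: mu(s) * f(s), x0, x)
    return (mu(x0) * y0 + integral) / mu(x)

# Sample problem: dy/dx - 2xy = 4 with y(0) = 3, so p(x) = -2x and f(x) = 4
approx = solve_linear_ivp(lambda x: -2.0 * x, lambda x: 4.0, 0.0, 3.0, 1.0)

# For comparison: here the integral of mu*f can be written with the error
# function, since the integral of exp(-s^2) from 0 to x is (sqrt(pi)/2)*erf(x)
exact = math.exp(1.0) * (3.0 + 2.0 * math.sqrt(math.pi) * math.erf(1.0))
print(approx, exact)
```

Calling `solve_linear_ivp` over a grid of x values gives the points needed for a graph of y . The nested quadrature is simple rather than efficient, which is acceptable for a sketch of this size.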

!Example 5.5: Consider solving

\[ \frac{dy}{dx} - 2xy = 4 \quad\text{with}\quad y(0) = 3 . \]

The differential equation is clearly linear and in the desired form for the ﬁrst step of our procedure.
Computing the integrating factor, we find that, here,
\[ \mu = e^{\int p(x) \, dx} = e^{\int [-2x] \, dx} = e^{-x^2 + c} . \]

Choosing, as we may, c to be zero, we then get
\[ \mu(x) = e^{-x^2} . \]

\[ \int_{x_0}^x p(s) \, ds \quad\text{and}\quad \int_{x_0}^x \mu(s) f(s) \, ds \]
to be well-defined, continuous functions of x in whatever interval of interest (α, β) we have. (Note
that this ensures
\[ \mu(x) = \exp\left( \int_{x_0}^x p(s) \, ds \right) \]

is never zero in this interval.) Certainly, p and f will be ‘sufficiently integrable’ if they are
continuous on (α, β) . But continuity is not necessary; p and f can have a few discontinuities
provided these discontinuities are not too bad. In particular, we can allow the same piecewise-defined
functions considered back in section 2.4. That (along with theorem 2.1 on page 32) gives us the
following existence and uniqueness theorem for initial-value problems involving first-order linear
differential equations.

Theorem 5.1 (existence and uniqueness)

Let p and f be functions that are continuous except for, at most, a finite number of finite-jump
discontinuities in an interval (α, β) . Also let x₀ and y₀ be any two numbers with α < x₀ < β .
Then the initial-value problem
\[ \frac{dy}{dx} + p(x)y = f(x) \quad\text{with}\quad y(x_0) = y_0 \]

has exactly one solution over the interval (α, β) , and that solution is given by
\[ y(x) = \frac{1}{\mu(x)} \left[ y_0 + \int_{x_0}^x \mu(s) f(s) \, ds \right] \quad\text{with}\quad \mu(x) = \exp\left( \int_{x_0}^x p(s) \, ds \right) . \]

Additional Exercises

5.1. Determine whether each of the following differential equations is or is not linear, and, if it
is linear, rewrite the equation in standard form,
\[ \frac{dy}{dx} + p(x)y = f(x) . \]
a. \( x^2 \frac{dy}{dx} + 3x^2 y = \sin(x) \)
b. \( y^2 \frac{dy}{dx} + 3x^2 y = \sin(x) \)
c. \( \frac{dy}{dx} - xy^2 = \sqrt{x} \)
d. \( \frac{dy}{dx} = 1 + (xy + 3y)^2 \)
e. \( \frac{dy}{dx} = 1 + xy + 3y \)
f. \( \frac{dy}{dx} = 4y + 8 \)
g. \( \frac{dy}{dx} - e^{2x} = 0 \)
h. \( \frac{dy}{dx} = \sin(x) \, y \)
i. \( \frac{dy}{dx} + 4y = y^3 \)
j. \( x \frac{dy}{dx} + \cos(x^2) = 827y \)


5.2. Using the methods developed in this chapter, find the general solution to each of the following
first-order linear differential equations:
a. \( \frac{dy}{dx} + 2y = 6 \)
b. \( \frac{dy}{dx} + 2y = 20e^{3x} \)
c. \( \frac{dy}{dx} = 4y + 16x \)
d. \( \frac{dy}{dx} - 2xy = x \)
e. \( x \frac{dy}{dx} + 3y - 10x^2 = 0 \)
f. \( x^2 \frac{dy}{dx} + 2xy = \sin(x) \)
g. \( x \frac{dy}{dx} = \sqrt{x} + 3y \)
h. \( \cos(x) \frac{dy}{dx} + \sin(x) \, y = \cos^2(x) \)
i. \( x \frac{dy}{dx} + (5x + 2)y = \frac{20}{x} \)
j. \( 2\sqrt{x} \frac{dy}{dx} + y = 2x e^{-\sqrt{x}} \)
5.3. Find the solution to each of the following initial-value problems using the methods developed
in this chapter:
a. \( \frac{dy}{dx} - 3y = 6 \) with y(0) = 5
b. \( \frac{dy}{dx} - 3y = 6 \) with y(0) = −2
c. \( \frac{dy}{dx} + 5y = e^{-3x} \) with y(0) = 0
d. \( x \frac{dy}{dx} + 3y = 20x^2 \) with y(1) = 10
e. \( x \frac{dy}{dx} = y + x^2 \cos(x) \) with y(π/2) = 0
f. \( (1 + x^2) \frac{dy}{dx} = x \left[ 3 + 3x^2 - y \right] \) with y(2) = 8
5.4. Express the answer to each of the following initial-value problems in terms of definite
integrals:
a. \( \frac{dy}{dx} + 6xy = \sin(x) \) with y(0) = 4
b. \( x^2 \frac{dy}{dx} + xy = \sqrt{x} \sin(x) \) with y(2) = 5
c. \( x \frac{dy}{dx} - y = x^2 e^{-x^2} \) with y(3) = 8
5.5. Let (α, β) be an interval, and let x₀ and y₀ be any two numbers with α < x₀ < β .
Assume p and f are functions continuous at all but, at most, a finite number of points in
(α, β) , and that each of these discontinuities is a finite-jump discontinuity. Define μ(x)
and y(x) by
\[ \mu(x) = \exp\left( \int_{x_0}^x p(s) \, ds \right) \]
and
\[ y(x) = \frac{1}{\mu(x)} \left[ y_0 + \int_{x_0}^x \mu(s) f(s) \, ds \right] . \]
Compute the first derivatives of μ and y , and then verify that y satisfies the initial condition
y(x₀) = y₀ as well as the differential equation
\[ \frac{dy}{dx} + p(x)y = f(x) \quad\text{for}\quad \alpha < x < \beta . \]


6
Simplifying Through Substitution

In previous chapters, we saw how certain types of first-order differential equations (directly
integrable, separable, and linear equations) can be identified and put into forms that can be integrated
with relative ease. In this chapter, we will see that, sometimes, we can start with a differential
equation that is not one of these desirable types and construct a corresponding separable or linear equation
whose solution can then be used to construct the solution to the original differential equation.

6.1 Basic Notions

There are many ﬁrst-order differential equations, such as
\[ \frac{dy}{dx} = (x + y)^2 , \]

that are neither linear nor separable, and which do not yield up their solutions by direct application
of the methods developed thus far. One way of attempting to deal with such equations is to replace
y with a cleverly chosen formula of x and “ u ” where u denotes another unknown function of x .
This results in a new differential equation with u being the function of interest. If the substitution
truly is clever, then this new differential equation will be separable or linear (or, maybe, even directly
integrable), and can be solved for u in terms of x using methods discussed in previous chapters.
Then the function of real interest, y , can be determined from the original ‘clever’ formula relating
u , y and x .
Here are the basic steps to this approach, described in a little more detail and illustrated by
being used to solve the above differential equation:

1. Identify what is hoped will be a good formula of x and u for y ,

y = F(x, u) .

This ‘good formula’ is our substitution for y . Here, u represents another unknown function
of x (so “ u = u(x) ”), and the above equation tells us how the two unknown functions y
and u are related. (Identifying that ‘good formula’ is the tricky part. We’ll discuss that
further in a little bit.)
Let’s try a substitution that reduces the right side of our differential equation,
\[ \frac{dy}{dx} = (x + y)^2 , \]
to u² . This means setting u = x + y . Solving this for y gives our substitution,
\[ y = u - x . \]

2. Replace every occurrence of y in the given differential equation with that formula of x and
u , including the y in the derivative. Keep in mind that u is a function of x , so the dy/dx
will become a formula of x , u , and du/dx (it may be wise to ﬁrst compute dy/dx separately).
Since we are using y = u − x (equivalently, u = x + y ), we have
\[ (x + y)^2 = u^2 , \]
and
\[ \frac{dy}{dx} = \frac{d}{dx}[u - x] = \frac{du}{dx} - \frac{dx}{dx} = \frac{du}{dx} - 1 . \]

So, under the substitution y = u − x ,
\[ \frac{dy}{dx} = (x + y)^2 \]
becomes
\[ \frac{du}{dx} - 1 = u^2 . \]

3. Solve the resulting differential equation for u (don’t forget the constant solutions!). If
possible, get an explicit solution for u in terms of x . (This assumes, of course, that the
differential equation for u is one we can solve. If it isn’t, then our substitution wasn’t that
clever, and we may have to try something else.)
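Adding 1 to both sides of the differential equation just derived for u yields
\[ \frac{du}{dx} = u^2 + 1 , \]
which we recognize as being a relatively easily solved separable equation with no
constant solutions. Dividing through by u² + 1 and integrating,
\[
\begin{aligned}
  & \frac{1}{u^2 + 1} \frac{du}{dx} = 1 \\
  \longrightarrow\quad & \int \frac{1}{u^2 + 1} \frac{du}{dx} \, dx = \int 1 \, dx \\
  \longrightarrow\quad & \arctan(u) = x + c \\
  \longrightarrow\quad & u = \tan(x + c) .
\end{aligned}
\]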

4. If you get an explicit solution u = u(x) , then just plug that formula u(x) into the original
substitution to get the explicit solution to the original equation,

y(x) = F(x, u(x)) .

If, instead, you only get an implicit solution for u , then go back to the original substitution,
y = F(x, u) , solve that to get a formula for u in terms of x and y (unless you already have
this formula for u ), and substitute that formula for u into the solution obtained to convert it
to the corresponding implicit solution for y .


Our original substitution was y = u − x . Combining this with the formula for
u just obtained, we get

y = u − x = tan(x + c) − x

as a general solution to our original differential equation,

\[ \frac{dy}{dx} = (x + y)^2 . \]
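
As a check on the formula y = tan(x + c) − x (a verification sketch added here, using only the identity sec² θ = 1 + tan² θ ), differentiate it and compare with the right side of the differential equation:

```latex
\begin{align*}
  \frac{dy}{dx} &= \frac{d}{dx}\left[ \tan(x + c) - x \right] = \sec^2(x + c) - 1 = \tan^2(x + c) , \\
  (x + y)^2 &= \left( x + \tan(x + c) - x \right)^2 = \tan^2(x + c) .
\end{align*}
```

Both expressions agree, so y = tan(x + c) − x does satisfy dy/dx = (x + y)² wherever tan(x + c) is defined.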

The key to this approach is, of course, in identifying a substitution, y = F(x, u) , that converts
the original differential equation for y to a differential equation for u that can be solved with
reasonable ease. Unfortunately, there is no single method for identifying such a substitution. At
best, we can look at certain equations and make good guesses at substitutions that are likely to work.
We will next look at three cases where good guesses can be made. In these cases the suggested
substitutions are guaranteed to lead to either separable or linear differential equations. As you may
suspect, though, they are not guaranteed to lead to simple separable or linear differential equations.

6.2 Linear Substitutions

If the given differential equation can be rewritten so that the derivative equals some formula of
Ax + By + C ,
\[ \frac{dy}{dx} = f(Ax + By + C) , \]
where A , B , and C are known constants, then a good substitution comes from setting
\[ u = Ax + By + C , \]
and then solving for y . For convenience, we’ll call this a linear substitution¹ .
We’ve already seen one case where a linear substitution works: the example above illustrating
the general substitution method. Here is another example, one in which we end up with an
implicit solution.

!Example 6.1: To solve

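\[ \frac{dy}{dx} = \frac{1}{2x - 4y + 7} , \]
we use the substitution based on setting
\[ u = 2x - 4y + 7 . \]

Solving this for y and then differentiating yields
\[ y = \frac{1}{4}[2x - u + 7] = \frac{x}{2} - \frac{u}{4} + \frac{7}{4} \]
and
\[ \frac{dy}{dx} = \frac{d}{dx}\left[ \frac{x}{2} - \frac{u}{4} + \frac{7}{4} \right] = \frac{1}{2} - \frac{1}{4} \frac{du}{dx} . \]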

1 because Ax + By + C = 0 is the equation for a straight line


So, the substitution based on u = 2x − 4y + 7 converts
\[ \frac{dy}{dx} = \frac{1}{2x - 4y + 7} \]
to
\[ \frac{1}{2} - \frac{1}{4} \frac{du}{dx} = \frac{1}{u} . \]
This differential equation for u looks manageable, especially since it contains no x’s . Solving
for the derivative in this equation, we get
\[ \frac{du}{dx} = -4\left[ \frac{1}{u} - \frac{1}{2} \right] = -4\left[ \frac{2}{2u} - \frac{u}{2u} \right] = -4 \, \frac{2 - u}{2u} , \]
which simplifies to
\[ \frac{du}{dx} = 2 \, \frac{u - 2}{u} . \tag{6.1} \]
Again, this is a separable equation. This time, though, the differential equation has a constant
solution,
\[ u = 2 . \tag{6.2} \]
To find the other solutions to our differential equation for u , we multiply both sides of equation
(6.1) by u and divide through by u − 2 , obtaining
\[ \frac{u}{u - 2} \frac{du}{dx} = 2 . \]
After noticing that
\[ \frac{u}{u - 2} = \frac{u - 2 + 2}{u - 2} = \frac{u - 2}{u - 2} + \frac{2}{u - 2} = 1 + \frac{2}{u - 2} , \]
we can integrate both sides of our last differential equation for u ,
\[
\begin{aligned}
  & \int \frac{u}{u - 2} \frac{du}{dx} \, dx = \int 2 \, dx \\
  \longrightarrow\quad & \int \left[ 1 + \frac{2}{u - 2} \right] du = 2x + c \\
  \longrightarrow\quad & u + 2 \ln|u - 2| = 2x + c .
\end{aligned}
\tag{6.3}
\]

Sadly, the last equation is not one we can solve to obtain an explicit formula for u in terms of x .
So we are stuck with using it as an implicit solution of our differential equation for u .
Together, formula (6.2) and equation (6.3) give us all the solutions to the differential equation
for u . To obtain all the solutions to our original differential equation for y , we must recall the
original (equivalent) relations between u and y ,
\[ u = 2x - 4y + 7 \quad\text{and}\quad y = \frac{x}{2} - \frac{u}{4} + \frac{7}{4} . \]
The latter with the constant solution u = 2 (formula (6.2)) yields
\[ y = \frac{x}{2} - \frac{2}{4} + \frac{7}{4} = \frac{x}{2} + \frac{5}{4} . \]
On the other hand, it is easier to combine the first relation between u and y with the implicit
solution for u in equation (6.3),
\[ u = 2x - 4y + 7 \quad\text{with}\quad u + 2 \ln|u - 2| = 2x + c , \]


obtaining
\[ [2x - 4y + 7] + 2 \ln\left| [2x - 4y + 7] - 2 \right| = 2x + c . \]
After a little algebra (cancel the 2x on each side, move −4y + 7 to the right, and divide by 2 ), this simplifies to
\[ \ln|2x - 4y + 5| = 2y + C \]
(with C = (c − 7)/2 ), which does not include the “constant u ” solution above. So, for y = y(x) to be a solution to
our original differential equation, it must either be given by
\[ y = \frac{x}{2} + \frac{5}{4} \]
or satisfy
\[ \ln|2x - 4y + 5| = 2y + C . \]

Let us see what happens whenever we have a differential equation of the form
\[ \frac{dy}{dx} = f(Ax + By + C) \]

(where A , B , and C are known constants), and we attempt the substitution based on setting

\[ u = Ax + By + C . \]

Solving for y and then differentiating yields

\[ y = \frac{1}{B}\,[u - Ax - C] \quad\text{and}\quad \frac{dy}{dx} = \frac{1}{B}\left[\frac{du}{dx} - A\right] . \]

Under these substitutions,

\[ \frac{dy}{dx} = f(Ax + By + C) \]
becomes
\[ \frac{1}{B}\left[\frac{du}{dx} - A\right] = f(u) . \]

After a little algebra, this can be rewritten as

\[ \frac{du}{dx} = A + B f(u) , \]

which is clearly a separable equation. Thus, we will always get a separable differential equation for
u . Moreover, the ease with which this differential equation can be solved clearly depends only on
the ease with which we can evaluate

\[ \int \frac{1}{A + B f(u)}\,du . \]
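
The implicit solution (6.3) found above can be sanity-checked numerically. The following is only a minimal sketch under stated assumptions: the helper names `slope`, `rk4`, and `invariant`, the initial condition y(0) = 0, and the step count are all chosen here for illustration and do not come from the text. It integrates dy/dx = 1/(2x − 4y + 7) with a classical fourth-order Runge-Kutta scheme and checks that u + 2 ln|u − 2| − 2x, with u = 2x − 4y + 7, keeps the same value along the computed solution.

```python
import math

def slope(x, y):
    """Right side of dy/dx = 1/(2x - 4y + 7)."""
    return 1.0 / (2.0 * x - 4.0 * y + 7.0)

def rk4(f, x0, y0, x1, steps):
    """Integrate dy/dx = f(x, y) from x0 to x1 with classical fourth-order Runge-Kutta."""
    h = (x1 - x0) / steps
    x, y = x0, y0
    for _ in range(steps):
        k1 = f(x, y)
        k2 = f(x + h / 2, y + h * k1 / 2)
        k3 = f(x + h / 2, y + h * k2 / 2)
        k4 = f(x + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
    return y

def invariant(x, y):
    """u + 2 ln|u - 2| - 2x with u = 2x - 4y + 7; constant along solutions by (6.3)."""
    u = 2.0 * x - 4.0 * y + 7.0
    return u + 2.0 * math.log(abs(u - 2.0)) - 2.0 * x

# Assumed initial condition y(0) = 0, chosen only for illustration.
c_start = invariant(0.0, 0.0)
y_end = rk4(slope, 0.0, 0.0, 1.0, 1000)
c_end = invariant(1.0, y_end)
print(abs(c_end - c_start))  # should be tiny (limited by the integration error)
```

A check like this does not prove the formula, but a drifting value of `invariant` would be a quick warning that an algebra slip occurred somewhere in the substitution.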


6.3 Homogeneous Equations

We now consider ﬁrst-order differential equations in which the derivative can be viewed as a formula
of the ratio y/x . In other words, we are now interested in any differential equation that can be
rewritten as
\[ \frac{dy}{dx} = f\left(\frac{y}{x}\right) \tag{6.4} \]
where f is some function of a single variable. Such equations are sometimes said to be homogeneous.² Unsurprisingly, the substitution based on setting
\[ u = \frac{y}{x} \qquad (\text{i.e., } y = xu\,) \]
is often useful in solving these equations. We will, in fact, discover that this substitution will always
transform an equation of the form (6.4) into a separable differential equation.

Example 6.2: Consider the differential equation

\[ x y^2 \frac{dy}{dx} = x^3 + y^3 . \]
Dividing through by \( x y^2 \) and doing a little factoring yields

\[ \frac{dy}{dx} = \frac{x^3 + y^3}{x y^2} = \frac{x^3\left[1 + \dfrac{y^3}{x^3}\right]}{x^3\left[\dfrac{y^2}{x^2}\right]} , \]

which simplifies to
\[ \frac{dy}{dx} = \frac{1 + \left(\dfrac{y}{x}\right)^3}{\left(\dfrac{y}{x}\right)^2} . \tag{6.5} \]
That is,
\[ \frac{dy}{dx} = f\left(\frac{y}{x}\right) \quad\text{with}\quad f(\text{whatever}) = \frac{1 + \text{whatever}^3}{\text{whatever}^2} . \]
So we should try letting
\[ u = \frac{y}{x} \]
or, equivalently,
\[ y = xu . \]
On the right side of equation (6.5), replacing y with xu is just the same as replacing each
y/x with u . Either way, the right side becomes

\[ \frac{1 + u^3}{u^2} . \]
On the left side of equation (6.5), the substitution y = xu is in the derivative. Keeping in mind
that u is also a function of x , we have
\[ \frac{dy}{dx} = \frac{d}{dx}[xu] = \frac{dx}{dx}\,u + x\frac{du}{dx} = u + x\frac{du}{dx} . \]
2 Warning: Later we will refer to a completely different type of differential equation as being “homogeneous”.


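So,
\[ \frac{dy}{dx} = \frac{1 + \left(\dfrac{y}{x}\right)^3}{\left(\dfrac{y}{x}\right)^2} \quad\overset{y = xu}{\Longrightarrow}\quad u + x\frac{du}{dx} = \frac{1 + u^3}{u^2} . \]
Solving the last equation for du/dx and doing a little algebra, we see that

\[ \frac{du}{dx} = \frac{1}{x}\left[\frac{1 + u^3}{u^2} - u\right] = \frac{1}{x}\left[\frac{1 + u^3}{u^2} - \frac{u^3}{u^2}\right] = \frac{1}{x}\left[\frac{1 + u^3 - u^3}{u^2}\right] = \frac{1}{x u^2} . \]

How nice! Our differential equation for u is the very simple separable equation
\[ \frac{du}{dx} = \frac{1}{x u^2} . \]

Multiplying through by \( u^2 \), integrating, and doing a little more algebra:

\[ \int u^2 \frac{du}{dx}\,dx = \int \frac{1}{x}\,dx \]

\[ \rightarrow\quad \frac{1}{3}u^3 = \ln|x| + c \]

\[ \rightarrow\quad u^3 = 3\ln|x| + 3c \]

\[ \rightarrow\quad u = \sqrt[3]{3\ln|x| + 3c} . \]

Combining this with our substitution y = xu gives

\[ y = xu = x\sqrt[3]{3\ln|x| + 3c} = x\sqrt[3]{3\ln|x| + C} \]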

as the general solution to our original differential equation.
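
As a quick plausibility check of this general solution, the sketch below plugs it back into the original equation numerically. It is illustrative only: the function names `y` and `residual`, the choice C = 1, the sample points, and the finite-difference step are assumptions made here, and the check is restricted to x > 0 with 3 ln|x| + C > 0 so that the cube root can be computed with a fractional power.

```python
import math

def y(x, C=1.0):
    """Candidate solution y = x * cuberoot(3 ln|x| + C), for x where 3 ln|x| + C > 0."""
    return x * (3.0 * math.log(abs(x)) + C) ** (1.0 / 3.0)

def residual(x, C=1.0, h=1e-6):
    """x y^2 y' - (x^3 + y^3), with y' estimated by a central difference."""
    dy = (y(x + h, C) - y(x - h, C)) / (2.0 * h)
    return x * y(x, C) ** 2 * dy - (x ** 3 + y(x, C) ** 3)

for x in (1.0, 2.0, 5.0):
    print(x, residual(x))  # each residual should be close to zero
```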

In practice, it may not be immediately obvious if a given ﬁrst-order differential equation can be
written in form (6.4), but it is usually fairly easy to ﬁnd out. First, algebraically solve the differential
equation for the derivative to get
dy
= “some formula of x and y ” .
dx
With a little luck, you’ll be able to do a little algebra (as we did in the above example) to see if that
“formula of x and y ” can be written as just a formula of y/x , f ( y/x ) .
If it’s still not clear, then just go ahead and try the substitution y = xu in that “formula of x
and y ”. If all the x’s cancel out and you are left with a formula of u , then that formula, f (u) ,
is the right side of (6.4) (remember, u = y/x ). So the differential equation can be written in the
desired form. Moreover, half the work in plugging the substitution into the differential equation is
now done.
On the other hand, if the x’s do not cancel out when you substitute xu for y , then the differential
equation cannot be written in form (6.4), and there is only a small chance that this substitution will
yield an ‘easily solved’ differential equation for u .

Example 6.3: Again, consider the differential equation

\[ x y^2 \frac{dy}{dx} = x^3 + y^3 , \]


which we had already studied in the previous example. Solving for the derivative again yields

\[ \frac{dy}{dx} = \frac{x^3 + y^3}{x y^2} . \]

Instead of factoring out \( x^3 \) from the numerator and denominator of the right side, let’s go ahead
and try the substitution y = xu and see if the x’s cancel out:

\[ \frac{x^3 + y^3}{x y^2} = \frac{x^3 + [xu]^3}{x[xu]^2} = \frac{x^3 + x^3u^3}{x^3u^2} = \frac{x^3\left[1 + u^3\right]}{x^3u^2} . \]

The x’s clearly do cancel out, leaving us with

\[ \frac{1 + u^3}{u^2} . \]

Thus (as we already knew), our differential equation can be put into form (6.4). What’s more,
getting our differential equation into that form and using y = xu will lead to

\[ \frac{1 + u^3}{u^2} \]

for the right side, just as we saw in the previous example.
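
The “do the x’s cancel?” test can also be tried numerically before doing any algebra. The sketch below is a hypothetical illustration (the names `rhs` and `rhs_after_substitution` and the sample values are assumptions): it substitutes y = xu into the right side for one fixed u and several different x, and compares the results with \( (1 + u^3)/u^2 \). Agreement at sample points is evidence rather than proof, but any dependence on x would show immediately that the equation is not of form (6.4).

```python
def rhs(x, y):
    """Right side of dy/dx = (x^3 + y^3) / (x y^2)."""
    return (x ** 3 + y ** 3) / (x * y ** 2)

def rhs_after_substitution(x, u):
    """The same formula with y replaced by x*u."""
    return rhs(x, x * u)

u = 0.7  # arbitrary sample value of u = y/x
values = [rhs_after_substitution(x, u) for x in (0.5, 1.0, 3.0, 10.0)]
expected = (1 + u ** 3) / u ** 2
print(values, expected)  # every entry should agree with (1 + u^3)/u^2
```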

When employing the substitution y = xu to solve

\[ \frac{dy}{dx} = f\left(\frac{y}{x}\right) , \]

do not forget to treat u as a function of x ! Thus, when we differentiate y , we have

\[ \frac{dy}{dx} = \frac{d}{dx}[xu] = \frac{dx}{dx}\,u + x\frac{du}{dx} = u + x\frac{du}{dx} . \]

This is not a formula worth memorizing — you shouldn’t even bother memorizing y = xu — it
should be quite enough to remember that u = u(x) with u = y/x .
However, it is worth noting that, if we plug these substitutions into
\[ \frac{dy}{dx} = f\left(\frac{y}{x}\right) , \]

we always get
\[ u + x\frac{du}{dx} = f(u) , \]
which is the same as
\[ \frac{du}{dx} = \frac{f(u) - u}{x} . \]
This confirms that we will always get a separable equation, just as with linear substitutions. This
time, the ease with which the differential equation for u can be solved depends on the ease with
which we can evaluate
\[ \int \frac{1}{f(u) - u}\,du . \]


6.4 Bernoulli Equations

A Bernoulli equation is a ﬁrst-order differential equation that can be written in the form
\[ \frac{dy}{dx} + p(x)y = f(x)y^n \tag{6.6} \]
where p(x) and f (x) are known functions of x only, and n is some real number. This looks
much like the standard form for linear equations. Indeed, a Bernoulli equation is linear if n = 0
or n = 1 (and is also separable if n = 1 ). Consequently, our main interest is in solving such an
equation when n is neither 0 nor 1 .
The above equation can be solved using a substitution, though a good choice for that substitution
might not be immediately obvious. You might suspect that setting \( u = y^n \) would help, but it doesn’t,
unless, that is, it leads you to try a substitution based on

\[ u = y^r \]

where r is some value yet to be determined. If you solve this for y in terms of u and plug the
resulting formula for y into the Bernoulli equation, you will then discover, after a bit of calculus
and algebra, that you have a linear differential equation for u if and only if r = 1 − n (see problem
6.8). So the substitution that does work is the one based on setting

\[ u = y^{1-n} . \]

In the future, you can either remember this, re-derive it as needed, or know where to look it up.
You should also observe that, if n > 0 , then the constant function

y(x) = 0 for all x

is a solution to equation (6.6). This particular solution is often overlooked when using the substitution
\( u = y^{1-n} \) for a reason noted in the next example.

Example 6.4: Consider the differential equation

\[ \frac{dy}{dx} + 6y = 30e^{3x}y^{2/3} . \]

This is in form (6.6), with n = 2/3 . Right off, let’s note that this Bernoulli equation has a constant
solution y = 0 .
Setting
\[ u = y^{1-n} = y^{1 - 2/3} = y^{1/3} , \]
we see that the substitution
\[ y = u^3 \]
is called for. Plugging this into our original differential equation, we get
\[ \frac{dy}{dx} + 6y = 30e^{3x}y^{2/3} \]

\[ \rightarrow\quad \frac{d}{dx}\left[u^3\right] + 6\left[u^3\right] = 30e^{3x}\left[u^3\right]^{2/3} \]

\[ \rightarrow\quad 3u^2\frac{du}{dx} + 6u^3 = 30e^{3x}u^2 . \]


Dividing this last equation through by \( 3u^2 \) gives

\[ \frac{du}{dx} + 2u = 10e^{3x} . \]
(This division assumes u ≠ 0 , corresponding to an assumption that y ≠ 0 . That is why the
y = 0 solution is often overlooked.)
The last equation is a relatively simple linear equation with integrating factor
\[ \mu = e^{\int 2\,dx} = e^{2x} . \]

Continuing as usual with such equations,

\[ e^{2x}\left[\frac{du}{dx} + 2u = 10e^{3x}\right] \]

\[ \rightarrow\quad e^{2x}\frac{du}{dx} + 2e^{2x}u = 10e^{5x} \]

\[ \rightarrow\quad \frac{d}{dx}\left[e^{2x}u\right] = 10e^{5x} . \]

Integrating both sides with respect to x then yields

\[ e^{2x}u = \int 10e^{5x}\,dx = 2e^{5x} + c , \]

which tells us that

\[ u = e^{-2x}\left[2e^{5x} + c\right] = 2e^{3x} + ce^{-2x} . \]
Finally, after recalling the substitution that led to the differential equation for u (and the fact that
y = 0 is a solution), we obtain our general solution to the given Bernoulli equation,
\[ y(x) = u^3 = \left[2e^{3x} + ce^{-2x}\right]^3 \quad\text{and}\quad y(x) = 0 . \]
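
The nonconstant family of solutions just obtained can be checked by substituting it back into the Bernoulli equation. The following sketch is only an illustration under assumptions made here: the names `y` and `relative_residual`, the value c = 1 (for which \( 2e^{3x} + ce^{-2x} \) stays positive, so the fractional power is unambiguous), the sample points, and the finite-difference step are not from the text.

```python
import math

def y(x, c=1.0):
    """Candidate solution y = (2 e^{3x} + c e^{-2x})^3 from Example 6.4."""
    return (2.0 * math.exp(3.0 * x) + c * math.exp(-2.0 * x)) ** 3

def relative_residual(x, c=1.0, h=1e-5):
    """Residual of dy/dx + 6y = 30 e^{3x} y^{2/3}, scaled by the size of the right side."""
    dy = (y(x + h, c) - y(x - h, c)) / (2.0 * h)
    right = 30.0 * math.exp(3.0 * x) * y(x, c) ** (2.0 / 3.0)
    return (dy + 6.0 * y(x, c) - right) / right

for x in (-0.5, 0.0, 0.5):
    print(x, relative_residual(x))  # each value should be close to zero
```

The constant solution y = 0 needs no numerical check: both sides of the equation vanish identically.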

Additional Exercises

6.1. Use linear substitutions (as described in section 6.2) to find a general solution to each of
the following:
a. \( \dfrac{dy}{dx} = \dfrac{1}{(3x + 3y + 2)^2} \)
b. \( \dfrac{dy}{dx} = \dfrac{(3x - 2y)^2 + 1}{3x - 2y} + \dfrac{3}{2} \)
c. \( \cos(4y - 8x + 3)\,\dfrac{dy}{dx} = 2 + 2\cos(4y - 8x + 3) \)
6.2. Using a linear substitution, solve the initial-value problem
\[ \frac{dy}{dx} = 1 + (y - x)^2 \quad\text{with}\quad y(0) = \frac{1}{4} . \]


6.3. Use substitutions appropriate to homogeneous first-order differential equations (as described
in section 6.3) to find a general solution to each of the following:
a. \( x^2\dfrac{dy}{dx} - xy = y^2 \)
b. \( \dfrac{dy}{dx} = \dfrac{y}{x} + \dfrac{x}{y} \)
c. \( \cos\left(\dfrac{y}{x}\right)\left[\dfrac{dy}{dx} - \dfrac{y}{x}\right] = 1 + \sin\left(\dfrac{y}{x}\right) \)

6.4. Again, use a substitution appropriate to homogeneous first-order differential equations, this
time to solve the initial-value problem

\[ \frac{dy}{dx} = \frac{x - y}{x + y} \quad\text{with}\quad y(0) = 3 . \]

6.5. Use substitutions appropriate to Bernoulli equations (as described in section 6.4) to find a
general solution to each of the following:

a. \( \dfrac{dy}{dx} + 3y = 3y^3 \)
b. \( \dfrac{dy}{dx} - \dfrac{3}{x}\,y = \left(\dfrac{y}{x}\right)^2 \)
c. \( \dfrac{dy}{dx} + 3\cot(x)\,y = 6\cos(x)\,y^{2/3} \)

6.6. Use a substitution appropriate to a Bernoulli equation to solve the initial-value problem

\[ \frac{dy}{dx} - \frac{1}{x}\,y = \frac{1}{y} \quad\text{with}\quad y(1) = 3 . \]

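6.7. For each of the following, determine a substitution that simplifies the given differential
equation, and, using that substitution, find a general solution. (Warning: The substitutions
for some of the later equations will not be substitutions already discussed.)
a. \( \dfrac{dy}{dx} = \dfrac{y}{x} + \left(\dfrac{x}{y}\right)^2 \)
b. \( 3\dfrac{dy}{dx} = -2 + \sqrt{2x + 3y + 4} \)
c. \( \dfrac{dy}{dx} + \dfrac{2}{x}\,y = 4\sqrt{y} \)
d. \( \dfrac{dy}{dx} = 4 + \dfrac{1}{\sin(4x - y)} \)
e. \( (y - x)\dfrac{dy}{dx} = 1 \)
f. \( (x + y)\dfrac{dy}{dx} = y \)
g. \( \left(2xy + 2x^2\right)\dfrac{dy}{dx} = x^2 + 2xy + 2y^2 \)
h. \( \dfrac{dy}{dx} + \dfrac{1}{x}\,y = x^2y^3 \)
i. \( \dfrac{dy}{dx} = 2\sqrt{2x + y - 3} - 2 \)
j. \( \dfrac{dy}{dx} = 2\sqrt{2x + y - 3} \)
k. \( x\dfrac{dy}{dx} - y = \sqrt{xy + x^2} \)
l. \( \dfrac{dy}{dx} + 3y = 28e^{2x}y^{-3} \)
m. \( \dfrac{dy}{dx} = (x - y + 3)^2 \)
n. \( \dfrac{dy}{dx} + 2x = 2\sqrt{y + x^2} \)
o. \( \cos(y)\dfrac{dy}{dx} = e^{-x} - \sin(y) \)
p. \( \dfrac{dy}{dx} = x\left[1 + 2\dfrac{y}{x^2} + \dfrac{y^2}{x^4}\right] \)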


6.8. Consider a generic Bernoulli equation

\[ \frac{dy}{dx} + p(x)y = f(x)y^n \]

where p(x) and f(x) are known functions of x and n is any real number other than
0 or 1 . Use the substitution \( u = y^r \) (equivalently, \( y = u^{1/r} \)) and derive that the above
Bernoulli equation for y reduces to a linear equation for u if and only if r = 1 − n . In
the process, also derive the resulting linear equation for u .


7 The Exact Form and General Integrating Factors

In the previous chapters, we’ve seen how separable and linear differential equations can be solved
using methods for converting them to forms that can be easily integrated. In this chapter, we will
develop a more general approach to converting a differential equation to a form (the “exact form”) that
can be integrated through a relatively straightforward procedure. We will see just what it means for a
differential equation to be in exact form and how to solve differential equations in this form. Because
it is not always obvious when a given equation is in exact form, a practical “test for exactness” will
also be developed. Finally, we will generalize the notion of integrating factors to help us find exact
forms for a variety of differential equations.
The theory and methods we will develop here are more general than those developed earlier
for separable and linear equations. In fact, the procedures developed here can be used to solve any
separable or linear differential equation (though you’ll probably prefer using the methods developed
earlier). More importantly, the methods developed in this chapter can, in theory at least, be used
to solve a great number of other first-order differential equations. As we will see though, practical
issues will reduce the applicability of these methods to a somewhat smaller (but still significant)
number of differential equations.
By the way, the theory, the computational procedures, and even the notation that we will
develop for equations in exact form are all very similar to that often developed in the later part of
many calculus courses for two-dimensional conservative vector fields. If you’ve seen that theory,
look for the parallels between it and what follows.

7.1 The Chain Rule

The exact form for a differential equation comes from one of the chain rules for differentiating
a composite function of two variables. Because of this, it may be wise to brieﬂy review these
differentiation rules.
First, suppose φ is a differentiable function of a single variable y (so φ = φ(y) ), and that y ,
itself, is a differentiable function of another variable t (so y = y(t) ). Then the composite function
φ(y(t)) is a differentiable function of t whose derivative is given by the (elementary) chain rule
\[ \frac{d}{dt}[\phi(y(t))] = \phi'(y(t))\,y'(t) . \]
A less precise (but more suggestive) description of this chain rule is
\[ \frac{d}{dt}[\phi(y(t))] = \frac{d\phi}{dy}\,\frac{dy}{dt} . \]


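Example 7.1: Let

\[ y(t) = t^2 \quad\text{and}\quad \phi(y) = \sin(y) . \]

Then
\[ \phi(y(t)) = \sin\left(t^2\right) , \]
and
\[ \frac{d}{dt}\sin\left(t^2\right) = \frac{d}{dt}[\phi(y(t))] = \frac{d\phi}{dy}\,\frac{dy}{dt} = \frac{d}{dy}[\sin(y)]\cdot\frac{d}{dt}\left[t^2\right] = \cos(y)\cdot 2t = \cos\left(t^2\right)2t . \]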

(In practice, of course, you probably do not explicitly write out all the steps listed above.)

Now suppose φ is a differentiable function of two variables x and y (so φ = φ(x, y) ), while
both x and y are differentiable functions of a single variable t (so x = x(t) and y = y(t) ).
Then the composite function φ(x(t), y(t)) is a differentiable function of t , and its derivative can
be computed using a chain rule typically encountered later in the study of calculus, namely,
\[ \frac{d}{dt}[\phi(x(t), y(t))] = \frac{\partial\phi}{\partial x}\frac{dx}{dt} + \frac{\partial\phi}{\partial y}\frac{dy}{dt} . \tag{7.1} \]

In practice, it is usually easier to compute this derivative by simply replacing the x and y in the
formula for φ(x, y) with the corresponding formulas x(t) and y(t) , and then differentiating the
resulting formula of t . Still, this chain rule (and other chain rules involving
functions of several variables) can be quite useful in more advanced applications. Our particular
interest is in the corresponding chain rule for computing
\[ \frac{d}{dx}[\phi(x, y(x))] , \]
which we can obtain from equation (7.1) by simply letting x = t . Then
\[ \frac{dx}{dt} = 1 , \qquad y = y(t) = y(x) , \qquad \frac{dy}{dt} = \frac{dy}{dx} , \]
and equation (7.1) reduces to
\[ \frac{d}{dx}[\phi(x, y(x))] = \frac{\partial\phi}{\partial x} + \frac{\partial\phi}{\partial y}\frac{dy}{dx} . \tag{7.2} \]

For brevity, we will henceforth refer to this formula as chain rule (7.2) (not very original, but better
than constantly repeating “the chain rule described in equation (7.2)”).
Don’t forget the difference between d/dx and ∂/∂x . If φ = φ(x, y) , then
\[ \frac{d\phi}{dx} = \text{the derivative of } \phi(x, y) \text{ assuming } x \text{ is the variable and } y \text{ is a function of } x , \]

while

\[ \frac{\partial\phi}{\partial x} = \text{the derivative of } \phi(x, y) \text{ assuming } x \text{ is the variable and } y \text{ is a constant.} \]

Example 7.2: Assume y is some function of x (i.e., y = y(x) ) and

\[ \phi(x, y) = y^2 + x^2y . \]


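Then
\[ \frac{\partial\phi}{\partial x} = \frac{\partial}{\partial x}\left[y^2 + x^2y\right] = 2xy , \qquad \frac{\partial\phi}{\partial y} = \frac{\partial}{\partial y}\left[y^2 + x^2y\right] = 2y + x^2 , \]

and, by chain rule (7.2),

\[ \frac{d}{dx}[\phi(x, y(x))] = \frac{\partial\phi}{\partial x} + \frac{\partial\phi}{\partial y}\frac{dy}{dx} = 2xy + \left[2y + x^2\right]\frac{dy}{dx} . \]

If, for example, y = sin(x) , then the above becomes

\[ \frac{d}{dx}[\phi(x, y(x))] = 2x\sin(x) + \left[2\sin(x) + x^2\right]\cos(x) . \]

On the other hand, if y = y(x) is some unknown function, then, after replacing φ with its
formula, we simply have
\[ \frac{d}{dx}\left[y^2 + x^2y\right] = 2xy + \left[2y + x^2\right]\frac{dy}{dx} . \tag{7.3} \]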

In our use of chain rule (7.2), y will be an unknown function of x , and the right side of equation
(7.2) will correspond to one side of whatever differential equation is being considered.
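
Chain rule (7.2) can be checked numerically on the function from example 7.2 with y = sin(x). The sketch below is illustrative only; the function names, the sample points, and the finite-difference step are assumptions introduced here. It compares a direct numerical derivative of φ(x, y(x)) with the value given by the chain rule.

```python
import math

def phi(x, y):
    """phi(x, y) = y^2 + x^2 y from example 7.2."""
    return y ** 2 + x ** 2 * y

def total_derivative_numeric(x, h=1e-6):
    """d/dx of phi(x, y(x)) with y(x) = sin(x), by a central difference."""
    return (phi(x + h, math.sin(x + h)) - phi(x - h, math.sin(x - h))) / (2.0 * h)

def total_derivative_chain_rule(x):
    """Right side of chain rule (7.2): phi_x + phi_y * dy/dx with y = sin(x)."""
    y, dy = math.sin(x), math.cos(x)
    return 2.0 * x * y + (2.0 * y + x ** 2) * dy

for x in (0.0, 1.0, 2.5):
    print(x, total_derivative_numeric(x), total_derivative_chain_rule(x))
```

The two columns printed for each x should agree to several decimal places.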

7.2 The Exact Form, Deﬁned

Let R be some region in the XY-plane. We will say that a first-order differential equation is in
exact form (on R ) if and only if both of the following hold:

1. The differential equation is written in the form

\[ M(x, y) + N(x, y)\frac{dy}{dx} = 0 \tag{7.4a} \]

where M(x, y) and N(x, y) are known functions of x and y ,

and
2. There is a differentiable function φ = φ(x, y) on R such that
\[ \frac{\partial\phi}{\partial x} = M(x, y) \quad\text{and}\quad \frac{\partial\phi}{\partial y} = N(x, y) \tag{7.4b} \]

at every point in R .

We will refer to the above φ(x, y) as a potential function for the differential equation.¹
In practice, the region R is often either the entire XY-plane or a significant portion of it.
There are a few technical issues regarding this region, but we can deal with these issues later. In
the meantime, little harm will be done by not explicitly stating the region. Just keep in mind that if
we say a certain equation is in exact form, then it is in exact form on some region R , and that the
graph of any solution y = y(x) derived will be restricted to being a curve in that region.

1 Referring to φ as a “potential function” comes from the theory of conservative vector ﬁelds. In fact, it is not common
terminology in other differential equation texts. Most other texts just refer to this function as “ φ ”.


Example 7.3: Consider the differential equation

\[ 2xy + \left[2y + x^2\right]\frac{dy}{dx} = 0 . \tag{7.5} \]

This equation is in the form

\[ M(x, y) + N(x, y)\frac{dy}{dx} = 0 \]
with
\[ M(x, y) = 2xy \quad\text{and}\quad N(x, y) = 2y + x^2 . \]

Moreover, if we glance back at example 7.2, we immediately see that, letting

\[ \phi(x, y) = y^2 + x^2y , \]
we have
\[ \frac{\partial\phi}{\partial x} = \frac{\partial}{\partial x}\left[y^2 + x^2y\right] = 2xy = M(x, y) \]
and
\[ \frac{\partial\phi}{\partial y} = \frac{\partial}{\partial y}\left[y^2 + x^2y\right] = 2y + x^2 = N(x, y) \]

everywhere in the XY-plane. So equation (7.5) is in exact form (and we can take R to be the
entire XY-plane).

In the next several sections, we will discuss how to determine when a differential equation is in
exact form, how to convert one not in exact form into exact form, how to ﬁnd a potential function,
and how to solve the differential equation using a potential function. However, we will develop the
material backwards, starting with solving a differential equation given a known potential function,
and working our way towards dealing with equations not in exact form. This ends up being the
natural (and least confusing) way to develop the material.
But ﬁrst, a few general comments regarding exact forms and potential functions:

1. Just being written as

\[ M(x, y) + N(x, y)\frac{dy}{dx} = 0 \]
does not guarantee that a differential equation is in exact form; there still might not be a
φ(x, y) with
\[ \frac{\partial\phi}{\partial x} = M(x, y) \quad\text{and}\quad \frac{\partial\phi}{\partial y} = N(x, y) . \]
For example, we will discover that there is no φ(x, y) satisfying
\[ \frac{\partial\phi}{\partial x} = 3y + 3y^3 \quad\text{and}\quad \frac{\partial\phi}{\partial y} = xy^2 - x . \]

So
\[ 3y + 3y^3 + \left[xy^2 - x\right]\frac{dy}{dx} = 0 \]
is not in exact form.

2. A single differential equation will have several potential functions. In particular, adding any
constant to a potential function yields another potential function. After all, if
\[ \frac{\partial\phi_0}{\partial x} = M(x, y) \quad\text{and}\quad \frac{\partial\phi_0}{\partial y} = N(x, y) \]


can be rewritten as
\[ \frac{\partial\phi}{\partial x} + \frac{\partial\phi}{\partial y}\frac{dy}{dx} = 0 . \]
Chain rule (7.2) then tells us that the left side of this equation is just the derivative of φ(x, y(x)) .
So this last equation can be written more concisely as
\[ \frac{d}{dx}[\phi(x, y)] = 0 \]
(with y = y(x) ). Not only is this concise, but it is easily integrated:

\[ \int \frac{d}{dx}[\phi(x, y)]\,dx = \int 0\,dx \]

\[ \rightarrow\quad \phi(x, y) = c . \]

Think about this last equation for a moment. It describes the relation between x and y = y(x)
assuming y satisfies the differential equation
\[ M(x, y) + N(x, y)\frac{dy}{dx} = 0 . \]
In other words, the equation φ(x, y) = c is an implicit solution to the differential equation for which
it is a potential function.
All this is important enough to be restated as a theorem.

Theorem 7.1 (importance of a potential function)

Let φ(x, y) be a potential function for a given ﬁrst-order differential equation. Then that differential
equation can be written as
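\[ \frac{\partial\phi}{\partial x} + \frac{\partial\phi}{\partial y}\frac{dy}{dx} = 0 . \]
This, in turn, can be rewritten as
\[ \frac{d}{dx}[\phi(x, y)] = 0 \quad\text{with}\quad y = y(x) , \]
which can be integrated,
\[ \int \frac{d}{dx}[\phi(x, y)]\,dx = \int 0\,dx , \]
to obtain the implicit solution
\[ \phi(x, y) = c \]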
where c is an arbitrary constant.

The above theorem contains all the steps for ﬁnding an implicit solution to a differential equation
that can be put in exact form, provided you have a corresponding potential function. Of course, you
have probably already noticed that this theorem can be shortened to the following:

Corollary 7.2
Let φ(x, y) be a potential function for a given ﬁrst-order differential equation. Then

φ(x, y) = c

is an implicit solution to that differential equation.


In solving at least the ﬁrst few differential equations in the exercises at the end of this chapter,
you should use all the steps described in theorem 7.1 simply to reinforce your understanding of why
φ(x, y) = c . Then feel free to cut out the intermediate steps (i.e., use the corollary). And, of course,
don’t forget to see if the implicit solution can be solved for y in terms of x , yielding an explicit
solution.

Example 7.4: Consider the differential equation

\[ 2xy + \left[2y + x^2\right]\frac{dy}{dx} = 0 . \]
From example 7.3, we know this is in exact form and has corresponding potential function
\[ \phi(x, y) = y^2 + x^2y . \]
Since
\[ \frac{\partial}{\partial x}\left[y^2 + x^2y\right] = 2xy \quad\text{and}\quad \frac{\partial}{\partial y}\left[y^2 + x^2y\right] = 2y + x^2 , \]
our differential equation can be rewritten as
\[ \frac{\partial}{\partial x}\left[y^2 + x^2y\right] + \frac{\partial}{\partial y}\left[y^2 + x^2y\right]\frac{dy}{dx} = 0 . \]

By chain rule (7.2), this reduces to

\[ \frac{d}{dx}\left[y^2 + x^2y\right] = 0 \quad\text{with}\quad y = y(x) . \]
Integrating this,
\[ \int \frac{d}{dx}\left[y^2 + x^2y\right]dx = \int 0\,dx , \]
yields the implicit solution
\[ y^2 + x^2y = c . \]
Rewriting this as
\[ y^2 + x^2y - c = 0 \]
and then solving for y via the quadratic formula provides the explicit solution

\[ y = \frac{-x^2 \pm \sqrt{x^4 + 4c}}{2} . \]
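
Both branches of this explicit solution can be checked against the implicit solution and the differential equation. The sketch below is a hypothetical illustration: the names `y`, `phi`, and `residual`, the value c = 1, the sample points, and the finite-difference step are assumptions made here, not part of the text.

```python
import math

def y(x, c=1.0, sign=1.0):
    """Explicit solution y = (-x^2 + sign * sqrt(x^4 + 4c)) / 2 from example 7.4."""
    return (-x ** 2 + sign * math.sqrt(x ** 4 + 4.0 * c)) / 2.0

def phi(x, y_value):
    """Potential function phi(x, y) = y^2 + x^2 y."""
    return y_value ** 2 + x ** 2 * y_value

def residual(x, c=1.0, sign=1.0, h=1e-6):
    """2xy + (2y + x^2) dy/dx, with dy/dx estimated by a central difference."""
    dy = (y(x + h, c, sign) - y(x - h, c, sign)) / (2.0 * h)
    return 2.0 * x * y(x, c, sign) + (2.0 * y(x, c, sign) + x ** 2) * dy

for sign in (1.0, -1.0):
    for x in (0.0, 1.0, 2.0):
        # phi should print as c = 1 and the residual as approximately 0
        print(sign, x, phi(x, y(x, 1.0, sign)), residual(x, 1.0, sign))
```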

Finding the Potential Function

Let us now consider a more difficult problem: Finding a potential function, φ(x, y) , for a given
differential equation that has been written in the form
\[ M(x, y) + N(x, y)\frac{dy}{dx} = 0 . \]
This means finding a function φ(x, y) satisfying both
\[ \frac{\partial\phi}{\partial x} = M(x, y) \quad\text{and}\quad \frac{\partial\phi}{\partial y} = N(x, y) . \]
A relatively straightforward procedure for finding this φ will be outlined in a moment. But first, let
us make a few observations regarding this pair of partial differential equations:


(Observe that this step yields a formula for φ(x, y) involving one yet unknown function of
y only. We now just have to determine h(y) to know the formula for φ(x, y) .)
3. Replace φ in the other partial differential equation,
\[ \frac{\partial\phi}{\partial y} = N(x, y) , \]
with the formula just derived for φ(x, y) , and compute the partial derivative. Keep in mind
that, because h(y) is a function of y only,
\[ \frac{\partial}{\partial y}[h(y)] = h'(y) . \]
Then algebraically solve the resulting equation for h′(y) .
For our example:
\[ \frac{\partial\phi}{\partial y} = N(x, y) \]

\[ \rightarrow\quad \frac{\partial}{\partial y}\left[x^2y + 2x + h(y)\right] = x^2 + 2 \]

\[ \rightarrow\quad x^2 + h'(y) = x^2 + 2 \]

\[ \rightarrow\quad h'(y) = 2 . \]

4. Look at the formula just obtained for h′(y) . Because h′(y) is a function of y only, its
formula must involve only the variable y ; no x may appear.
If the x’s do not cancel out, then we have an impossible equation. This means the naive
assumption made in the first step was wrong; the given differential equation was not in exact
form, and there is no φ(x, y) satisfying the desired pair of partial differential equations. In
this case, Stop! Go no further in this procedure!
On the other hand, if the previous step yields
\[ h'(y) = \text{a formula of } y \text{ only (no } x\text{’s)} , \]
then integrate both sides of this equation with respect to y to obtain h(y) . (Because h(y)
does not depend on x , the constant of integration here will truly be a constant, not a function
of the other variable.)
For our example, the last step yielded
\[ h'(y) = 2 . \]
The right side does not contain x , so we can continue and integrate to obtain

\[ h(y) = \int h'(y)\,dy = \int 2\,dy = 2y + c_1 . \]

(We will later see that the constant \( c_1 \) is not that important.)
5. Combine the formula just obtained for h with the formula obtained for φ in step 2.
In our example, combining the results from step 2 and the last step above yields
\[ \phi(x, y) = x^2y + 2x + h(y) = x^2y + 2x + 2y + c_1 \]
where \( c_1 \) is an arbitrary constant.


(Because of the arbitrary constant from the integration of h′(y) , the formula obtained actually
describes all possible φ’s satisfying the desired pair of partial differential equations. If you
look at our discussion above, it should be clear that this formula will always be of the form

φ(x, y) = φ0 (x, y) + c1

where φ0 (x, y) is a particular formula and c1 is an arbitrary constant. But, as noted earlier,
we only need to ﬁnd one potential function φ(x, y) . So you can set c1 equal to your favorite
constant, 0 , or keep it arbitrary and see what happens.)

If the above procedure does yield a formula φ(x, y) , then it immediately follows that the given
equation is in exact form and φ(x, y) is a potential function for the given differential equation. But
don’t forget that the goal is usually to solve the given differential equation,
\[ M(x, y) + N(x, y)\frac{dy}{dx} = 0 . \]

The function φ = φ(x, y) is not that solution. It is a function such that the differential equation can
be rewritten as
\[ \frac{d}{dx}[\phi(x, y)] = 0 . \]
Integrating this yields the implicit solution

φ(x, y) = c

from which, if the implicit solution is not too complicated, we can obtain an explicit solution
y = y(x) .

Example 7.5: Consider solving

\[ 2xy + 2 + \left[x^2 + 2\right]\frac{dy}{dx} = 0 . \]

As just illustrated, this equation is in exact form, and has a potential function

\[ \phi(x, y) = x^2y + 2x + 2y + c_1 \]

where \( c_1 \) can be any constant. So the differential equation can be rewritten as

\[ \frac{d}{dx}\left[x^2y + 2x + 2y + c_1\right] = 0 , \]

which integrates to
\[ x^2y + 2x + 2y + c_1 = c_2 . \]
This is an implicit solution for the differential equation. Solving this for y is easy, and (after
letting \( c = c_2 - c_1 \)) gives us the explicit solution
\[ y = \frac{c - 2x}{x^2 + 2} . \]

Note that, in the last example, the constants arising from the integration of h′(y) and dφ/dx
were combined at the end. It is easy to see that this will always be possible.
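
The explicit solution of the last example can be checked in the same spirit. In the sketch below (illustrative only; the names, the value c = 3, the sample points, and the step size are assumptions made here), M and N are taken to be the partial derivatives of the potential function x²y + 2x + 2y, that is, M = 2xy + 2 and N = x² + 2.

```python
def y(x, c=3.0):
    """Explicit solution y = (c - 2x) / (x^2 + 2)."""
    return (c - 2.0 * x) / (x ** 2 + 2.0)

def phi(x, y_value):
    """Potential function with c1 = 0: x^2 y + 2x + 2y."""
    return x ** 2 * y_value + 2.0 * x + 2.0 * y_value

def residual(x, c=3.0, h=1e-6):
    """M + N dy/dx with M = 2xy + 2 and N = x^2 + 2 (the partial derivatives of phi)."""
    dy = (y(x + h, c) - y(x - h, c)) / (2.0 * h)
    return 2.0 * x * y(x, c) + 2.0 + (x ** 2 + 2.0) * dy

for x in (-1.0, 0.0, 2.0):
    # phi should print as c = 3 and the residual as approximately 0
    print(x, phi(x, y(x)), residual(x))
```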


Other Ways to Find a Potential Function

There are other ways to find a function φ(x, y) satisfying both
\[ \frac{\partial\phi}{\partial x} = M(x, y) \quad\text{and}\quad \frac{\partial\phi}{\partial y} = N(x, y) . \]
Two, in particular, are worth mentioning:
1. The first is the obvious modification of the one already given in which the roles of ∂φ/∂x = M
and ∂φ/∂y = N are interchanged. That is, instead of
integrating ∂φ/∂x = M with respect to x , and then plugging the result into ∂φ/∂y = N ,
you
integrate ∂φ/∂y = N with respect to y , and then plug the result into ∂φ/∂x = M .
The integration of ∂φ/∂y = N will yield some formula involving x , y and an unknown function
of x , g(x) . Plugging this formula into ∂φ/∂x = M should yield an ordinary differential
equation for g(x) which does not contain y . If the y’s do not cancel out, the desired φ
does not exist. Otherwise, g(x) can be obtained by integration and then combined with the
formula just obtained for φ(x, y) .
This is usually the preferred method when \( \int N(x, y)\,dy \) is much easier to compute than
\( \int M(x, y)\,dx \). Indeed, it usually is a good idea to scan these two integrals and, if one looks
much easier to compute, compute that one, and plug the result into the partial differential
equation corresponding to the other integral. Don’t forget to check to see if the equation
resulting from that just involves the appropriate variable.
2. The other method is one often attempted by beginners who do not understand why it should
not be used: First independently integrate both ∂φ/∂ x = M and ∂φ/∂ y = N . Then stare at the
two different formulas obtained for φ(x, y) (each involving a different unknown function)
and try to guess what single formula for φ(x, y) (without unknown functions) matches the
results from the two integrations.
Fight any temptation to take this approach. Yes, with a little luck and skill, you can get
φ this way. But it is usually more work, it doesn’t easily warn you when φ(x, y) does not
exist, and is more likely to result in errors. Why use a method involving two integrations and
two unknown functions when you can use a method involving just one integration and one
unknown function with a straightforward way to determine that one function?

7.4 Testing for Exactness — Part I

The procedure just discussed for finding a potential function φ for
\[ M(x, y) + N(x, y)\frac{dy}{dx} = 0 \tag{7.6} \]
does not tell us whether such a φ even exists until step 4, after a possibly tricky integration. Fortunately,
there is a simple test that can often tell us when seeking that φ would be futile. This
test is based on the fact that, for any sufficiently differentiable φ(x, y) ,
\[ \frac{\partial^2\phi}{\partial y\,\partial x} = \frac{\partial^2\phi}{\partial x\,\partial y} . \]


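Now let R be any region in the XY-plane on which φ is sufficiently differentiable and satisfies
\[ \frac{\partial\phi}{\partial x} = M(x, y) \quad\text{and}\quad \frac{\partial\phi}{\partial y} = N(x, y) . \]

Then, at every point in R ,

\[ \frac{\partial M}{\partial y} = \frac{\partial}{\partial y}\left[\frac{\partial\phi}{\partial x}\right] = \frac{\partial^2\phi}{\partial y\,\partial x} = \frac{\partial^2\phi}{\partial x\,\partial y} = \frac{\partial}{\partial x}\left[\frac{\partial\phi}{\partial y}\right] = \frac{\partial N}{\partial x} . \]

So, for equation (7.6) to be in exact form over R , we must have

\[ \frac{\partial M}{\partial y} = \frac{\partial N}{\partial x} \tag{7.7} \]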

at every point in R . If this equation does not hold, differential equation (7.6) is not in exact form
over that region — no corresponding potential function φ exists.
What we have not shown is that equality (7.7) necessarily implies that differential equation
(7.6) is in exact form. In fact, equality (7.7) does imply that, given any point in R , the equation
is in exact form over some subregion of R containing that point. Unfortunately, showing that and
describing those regions takes more development than is appropriate here. For now, let us just say
that, in practice, the equality (7.7) implies that differential equation (7.6) is “probably” in exact form
over the given region, and it is worthwhile to seek a corresponding potential function φ via the
method outlined earlier.
Let us summarize what has just been derived, accepting the term “suitably differentiable” as
simply meaning that the necessary partial derivatives can be computed:

Theorem 7.3 (test for probable exactness)

Let M(x, y) and N(x, y) be two suitably differentiable functions of two variables over a region
R in the XY-plane, and consider the differential equation
\[ M(x, y) + N(x, y)\frac{dy}{dx} = 0 . \]

1. If
\[ \frac{\partial M}{\partial y} \neq \frac{\partial N}{\partial x} \quad\text{on } R , \]
then the above differential equation is not in exact form on R .
2. If
\[ \frac{\partial M}{\partial y} = \frac{\partial N}{\partial x} \quad\text{on } R , \]
then the above differential equation might be in exact form on R . It is worth seeking a
corresponding potential function.

Example 7.6: Consider the differential equation

\[ 3y + 3y^3 + \left[xy^2 - x\right]\frac{dy}{dx} = 0 . \]
Here
\[ \frac{\partial M}{\partial y} = \frac{\partial}{\partial y}\left[3y + 3y^3\right] = 3 + 9y^2 \]


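and
\[ \frac{\partial N}{\partial x} = \frac{\partial}{\partial x}\left[xy^2 - x\right] = y^2 - 1 . \]
So
\[ \frac{\partial M}{\partial y} \neq \frac{\partial N}{\partial x} , \]
telling us that the given differential equation is not in exact form over any region.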

Example 7.7: Consider the differential equation

\[ -\frac{y}{x^2 + y^2} + \frac{x}{x^2 + y^2}\frac{dy}{dx} = 0 . \]

Here
\[ M(x, y) = -\frac{y}{x^2 + y^2} \quad\text{and}\quad N(x, y) = \frac{x}{x^2 + y^2} \]
are well defined and differentiable at every point on the XY-plane except (x, y) = (0, 0) .
So let’s take R to be the entire XY-plane with the origin removed. Computing the partial
derivatives, we get

\[ \frac{\partial M}{\partial y} = \frac{\partial}{\partial y}\left[-\frac{y}{x^2 + y^2}\right] = -\frac{1(x^2 + y^2) - y(2y)}{(x^2 + y^2)^2} = \frac{y^2 - x^2}{(x^2 + y^2)^2} \]
and
\[ \frac{\partial N}{\partial x} = \frac{\partial}{\partial x}\left[\frac{x}{x^2 + y^2}\right] = \frac{1(x^2 + y^2) - x(2x)}{(x^2 + y^2)^2} = \frac{y^2 - x^2}{(x^2 + y^2)^2} . \]

So these two partial derivatives are equal throughout R , and our test for probable exactness tells
us that the given differential equation might be in exact form on R — it is worthwhile to try to
ﬁnd a corresponding potential function.
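
The comparison of ∂M/∂y with ∂N/∂x in examples 7.6 and 7.7 can be mimicked numerically at a sample point. The sketch below is only an illustration: the helper names `partial` and `exactness_gap`, the sample point (1, 2), and the step size are assumptions made here. A nonzero gap at even one point shows the equation is not in exact form; a zero gap at a few points is merely consistent with equality (7.7) and is not a proof.

```python
def partial(f, x, y, wrt, h=1e-6):
    """Central-difference estimate of the partial derivative of f(x, y) with respect to 'x' or 'y'."""
    if wrt == "x":
        return (f(x + h, y) - f(x - h, y)) / (2.0 * h)
    return (f(x, y + h) - f(x, y - h)) / (2.0 * h)

def exactness_gap(M, N, x, y):
    """dM/dy - dN/dx at one point; zero (up to rounding) wherever (7.7) holds."""
    return partial(M, x, y, "y") - partial(N, x, y, "x")

# Example 7.6: M = 3y + 3y^3, N = x y^2 - x
gap_76 = exactness_gap(lambda x, y: 3 * y + 3 * y ** 3,
                       lambda x, y: x * y ** 2 - x, 1.0, 2.0)

# Example 7.7: M = -y/(x^2 + y^2), N = x/(x^2 + y^2)
gap_77 = exactness_gap(lambda x, y: -y / (x ** 2 + y ** 2),
                       lambda x, y: x / (x ** 2 + y ** 2), 1.0, 2.0)

print(gap_76)  # far from zero, so not in exact form
print(gap_77)  # approximately zero, so (7.7) holds at this point
```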

For many, the test described above in theorem 7.3 will suffice. Those who wish a more complete
test should jump to section 7.7 starting on page 139 (where we will also finish solving the differential
equation in example 7.7).

7.5 “Exact Equations”: A Summary

To review:
If you suspect that a given differential equation,
\[ M(x, y) + N(x, y) \frac{dy}{dx} = 0 , \]
is in exact form, then you can quickly check for at least probable exactness by computing
∂M/∂y and ∂N/∂x and seeing if
\[ \frac{\partial M}{\partial y} = \frac{\partial N}{\partial x} . \]

If the two partial derivatives are equal, then follow the procedure for ﬁnding a potential
function φ(x, y) outlined on pages 126 to 128.

132 The Exact Form and General Integrating Factors

If that procedure is successful and yields a φ(x, y) , then finish solving the given
differential equation using the fact that the differential equation can be rewritten as
\[ \frac{d}{dx}\left[ \phi(x, y) \right] = 0 , \]
the integration of which yields the implicit solution
\[ \phi(x, y) = c . \]

If this equation can be solved for y in terms of x , do so.
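
As a small illustration of these steps, consider the equation 2xy + x² dy/dx = 0. This equation is a hypothetical example chosen here for brevity; it is not one of the text's examples, and the function h(y) is simply the usual "constant of integration" that may depend on y.

```latex
% Hypothetical worked example (not from the text): 2xy + x^2 y' = 0
\begin{aligned}
M &= 2xy , \quad N = x^2 , \qquad
  \frac{\partial M}{\partial y} = 2x = \frac{\partial N}{\partial x} \\
\frac{\partial \phi}{\partial x} &= 2xy
  \;\Longrightarrow\; \phi(x, y) = x^2 y + h(y) \\
\frac{\partial \phi}{\partial y} &= x^2 + h'(y) = x^2
  \;\Longrightarrow\; h'(y) = 0 , \text{ so take } h(y) = 0 \\
\phi(x, y) &= x^2 y = c
  \;\Longrightarrow\; y = \frac{c}{x^2}
\end{aligned}
```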

If the given differential equation is not in exact form, then there is a possibility that it can be
put into an exact form using appropriate “integrating factors”. We will discuss these next.
By the way, don’t forget that these equations may be solvable by other means. For example, the
equation used to illustrate the procedure for ﬁnding φ was a linear differential equation, and could
have been solved a bit more quickly using the methods from chapter 5.

7.6 Converting Equations to Exact Form

Basic Notions
Obviously, the first step to converting a given first-order differential equation to exact form is to get
it into the form
\[ M(x, y) + N(x, y) \frac{dy}{dx} = 0 . \]
Then apply the test for (probable) exactness. With luck, the test result will be positive. More likely,
it will not.
To see how we might further convert our equation to exact form, it may help to recall why we
want the exact form. It is so that the left side of the differential equation can be identified as an
ordinary derivative of some formula of x and y(x) ,

\[ \frac{d}{dx}\, \phi(x, y(x)) . \]

We had a similar situation with linear equations. Given a linear equation

\[ \frac{dy}{dx} + p(x) y = f(x) , \]

we found that, after multiplying it by an integrating factor μ to get

\[ \mu \frac{dy}{dx} + \mu p y = \mu f , \]

we could identify the equation’s left side as a complete derivative of a formula of x and y(x) ,
namely,
\[ \frac{d}{dx}\left[ \mu(x) y(x) \right] . \]
The same idea can be applied to convert an equation not in exact form to one that is in exact form.
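
The step behind that identification for linear equations is just the product rule. Written out (as a reminder, under the assumption that μ depends on x only):

```latex
\frac{d}{dx}\bigl[ \mu(x)\, y(x) \bigr]
  = \mu \frac{dy}{dx} + \frac{d\mu}{dx}\, y ,
\qquad\text{which equals}\quad
  \mu \frac{dy}{dx} + \mu p y
\quad\text{exactly when}\quad
  \frac{d\mu}{dx} = \mu p .
```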


!Example 7.8: Consider the differential equation

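\[ \left[ 3y + 3y^3 \right] + \left[ x y^2 - x \right] \frac{dy}{dx} = 0 . \]

In example 7.6, we saw that this equation is not in exact form. But look what happens after we
multiply through by μ = x^2 y^{-2} ,
\[ x^2 y^{-2} \left\{ \left[ 3y + 3y^3 \right] + \left[ x y^2 - x \right] \frac{dy}{dx} \right\} = 0 . \]

We get
\[ \underbrace{\left[ 3x^2 y^{-1} + 3x^2 y \right]}_{M_{\text{new}}(x,y)} + \underbrace{\left[ x^3 - x^3 y^{-2} \right]}_{N_{\text{new}}(x,y)} \frac{dy}{dx} = 0 \]

with
\[ \frac{\partial M_{\text{new}}}{\partial y} = \frac{\partial}{\partial y}\left[ 3x^2 y^{-1} + 3x^2 y \right] = -3x^2 y^{-2} + 3x^2 = 3x^2 - 3x^2 y^{-2} \]

and
\[ \frac{\partial N_{\text{new}}}{\partial x} = \frac{\partial}{\partial x}\left[ x^3 - x^3 y^{-2} \right] = 3x^2 - 3x^2 y^{-2} . \]
So
\[ \frac{\partial M_{\text{new}}}{\partial y} = \frac{\partial N_{\text{new}}}{\partial x} , \]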
telling us that the equation is now (probably) in exact form (over any region where y never equals
0 ).
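
Because every term in this example is a constant times a power of x times a power of y, the computation can be checked exactly with a few lines of code. The sketch below stores each expression as a dictionary of monomials; this representation and the helper names (`times_monomial`, `d_dx`, `d_dy`) are assumptions made for illustration only.

```python
# An expression in x and y is stored as {(a, b): coeff}, meaning the sum of
# coeff * x**a * y**b over all keys; negative exponents are allowed.

def times_monomial(expr, a, b):
    """Multiply every term by x**a * y**b."""
    return {(p + a, q + b): c for (p, q), c in expr.items()}

def d_dx(expr):
    """Partial derivative with respect to x, dropping terms that vanish."""
    return {(p - 1, q): c * p for (p, q), c in expr.items() if p != 0}

def d_dy(expr):
    """Partial derivative with respect to y, dropping terms that vanish."""
    return {(p, q - 1): c * q for (p, q), c in expr.items() if q != 0}

# Examples 7.6 and 7.8: M = 3y + 3y^3, N = x y^2 - x
M = {(0, 1): 3, (0, 3): 3}
N = {(1, 2): 1, (1, 0): -1}

print(d_dy(M) == d_dx(N))          # before multiplying by mu

# Multiply through by mu = x^2 y^(-2)
M_new = times_monomial(M, 2, -2)
N_new = times_monomial(N, 2, -2)
print(d_dy(M_new) == d_dx(N_new))  # after multiplying by mu
```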

We will refer to any nonzero function μ = μ(x, y) as an integrating factor for a ﬁrst-order
differential equation
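\[ M + N \frac{dy}{dx} = 0 \]
if and only if multiplying that equation through by μ ,
\[ \mu M + \mu N \frac{dy}{dx} = 0 , \]

yields a differential equation in exact form.² This integrating factor may be a function of x or of y
or of both x and y . Notice that, because
\[ \underbrace{\mu M}_{\text{“new” } M} + \underbrace{\mu N}_{\text{“new” } N} \frac{dy}{dx} = 0 \]

is exact, our test for exactness tells us that

\[ \frac{\partial}{\partial y}\left[ \mu M \right] = \frac{\partial}{\partial x}\left[ \mu N \right] . \]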
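
Expanding both sides of this condition with the product rule puts it in the form that is actually used when hunting for μ. The following is a routine expansion written out for convenience:

```latex
\frac{\partial \mu}{\partial y}\, M + \mu\, \frac{\partial M}{\partial y}
  = \frac{\partial \mu}{\partial x}\, N + \mu\, \frac{\partial N}{\partial x}
```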

2 You can show that the integrating factors found for linear equations are just special cases of the integrating factors considered
here. If there is any danger of confusion, we’ll refer to the integrating factors now being discussed as the “more general”
integrating factors.

\[ \longrightarrow\quad \mu = \pm e^{\ln x^2 + c} = A x^2 . \]


Since only one nonzero integrating factor is needed, we can take A = 1 , giving us

\[ \mu(x) = x^2 \]

as our integrating factor.

Unfortunately, there is always the possibility that the y’s will not cancel out.

!Example 7.10: It is easily veriﬁed that

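\[ 6xy + 5\left[ x^2 + y \right] \frac{dy}{dx} = 0 \]

is not in exact form. To be a corresponding integrating factor, μ must satisfy

\[ \frac{\partial}{\partial y}\Bigl[ \mu\, 6xy \Bigr] = \frac{\partial}{\partial x}\Bigl[ \mu\, 5\left( x^2 + y \right) \Bigr] . \]

Assume μ is a function of x only. Then

\[ \frac{\partial \mu}{\partial x} = \frac{d\mu}{dx} \quad\text{and}\quad \frac{\partial \mu}{\partial y} = 0 , \]

and, thus,
\[
\begin{aligned}
& \frac{\partial}{\partial y}\Bigl[ \mu\, 6xy \Bigr] = \frac{\partial}{\partial x}\Bigl[ \mu\, 5\left( x^2 + y \right) \Bigr] \\
\longrightarrow\quad & \frac{\partial \mu}{\partial y}\,[6xy] + \mu\, \frac{\partial}{\partial y}[6xy] = \frac{\partial \mu}{\partial x}\, 5\left( x^2 + y \right) + \mu\, \frac{\partial}{\partial x}\Bigl[ 5\left( x^2 + y \right) \Bigr] \\
\longrightarrow\quad & 0 \cdot 6xy + \mu\,[6x] = \frac{d\mu}{dx}\, 5\left( x^2 + y \right) + \mu\,[10x] \\
\longrightarrow\quad & \frac{d\mu}{dx} = \frac{6x\mu - 10x\mu}{5x^2 + 5y} = -\frac{4x\mu}{5x^2 + 5y} .
\end{aligned}
\]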

Here, the y ’s do not cancel out, as they should if our assumption that μ depended only on x
were true. Hence, that assumption was wrong. This equation does not have an integrating factor
that is a function of x only. We will have to try something else.
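
The failure of the y's to cancel can be seen concretely by evaluating the ratio (∂M/∂y − ∂N/∂x)/N, the quantity called P in exercise 7.6, at a fixed x and several values of y. The sketch below does this with exact fractions; the partial derivatives 6x and 10x are computed by hand, and the function name `P` and the sample values are illustrative.

```python
from fractions import Fraction

# Example 7.10: M = 6xy and N = 5(x^2 + y), so
#   dM/dy = 6x   and   dN/dx = 10x   (computed by hand).
def P(x, y):
    """The ratio (dM/dy - dN/dx) / N, evaluated exactly."""
    x, y = Fraction(x), Fraction(y)
    return (6 * x - 10 * x) / (5 * (x**2 + y))

# Hold x fixed and vary y. If an integrating factor depending on x only
# existed, these three values would have to agree.
values = [P(1, y) for y in (1, 2, 3)]
print(values)
```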

Second Case: μ Being a Function of y Only

Plugging
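\[ \mu(x, y) = x^\alpha y^\beta \]
into equation (7.8) for our differential equation yields:
\[
\begin{aligned}
& \frac{\partial}{\partial y}(\mu M) = \frac{\partial}{\partial x}(\mu N) \\
\longrightarrow\quad & \frac{\partial}{\partial y}\Bigl[ x^\alpha y^\beta \left( 3y + 3y^3 \right) \Bigr] = \frac{\partial}{\partial x}\Bigl[ x^\alpha y^\beta \left( x y^2 - x \right) \Bigr] \\
\longrightarrow\quad & \frac{\partial}{\partial y}\Bigl[ 3x^\alpha y^{\beta+1} + 3x^\alpha y^{\beta+3} \Bigr] = \frac{\partial}{\partial x}\Bigl[ x^{\alpha+1} y^{\beta+2} - x^{\alpha+1} y^\beta \Bigr] \\
\longrightarrow\quad & 3(\beta + 1) x^\alpha y^\beta + 3(\beta + 3) x^\alpha y^{\beta+2} = (\alpha + 1) x^\alpha y^{\beta+2} - (\alpha + 1) x^\alpha y^\beta .
\end{aligned}
\]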

Combining like terms then gives

\[ [3\beta + \alpha + 4]\, x^\alpha y^\beta + [3\beta - \alpha + 8]\, x^\alpha y^{\beta+2} = 0 , \]

which, in turn, holds if and only if

\[ 3\beta + \alpha + 4 = 0 \quad\text{and}\quad 3\beta - \alpha + 8 = 0 . \]

This last pair of equations constitutes a simple system of linear equations,

\[
\begin{aligned}
3\beta + \alpha + 4 &= 0 \\
3\beta - \alpha + 8 &= 0
\end{aligned}
\]

which can be easily solved in any of a number of ways, yielding

\[ \alpha = 2 \quad\text{and}\quad \beta = -2 . \]

Thus, the differential equation we started with,

\[ \left[ 3y + 3y^3 \right] + \left[ x y^2 - x \right] \frac{dy}{dx} = 0 , \]

does have an integrating factor of the form μ(x, y) = x^α y^β , and it is

\[ \mu(x, y) = x^2 y^{-2} \]

(just as was used in example 7.8 on page 133).
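
The small linear system for α and β can be solved in any way one likes; the sketch below uses Cramer's rule with exact fractions. The helper name `solve_2x2` is an illustrative choice, and the two equations are first rearranged so the unknowns are on the left.

```python
from fractions import Fraction

def solve_2x2(a1, b1, c1, a2, b2, c2):
    """Solve a1*u + b1*v = c1 and a2*u + b2*v = c2 by Cramer's rule."""
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("system has no unique solution")
    u = Fraction(c1 * b2 - c2 * b1, det)
    v = Fraction(a1 * c2 - a2 * c1, det)
    return u, v

# 3*beta + alpha + 4 = 0   ->    1*alpha + 3*beta = -4
# 3*beta - alpha + 8 = 0   ->   -1*alpha + 3*beta = -8
alpha, beta = solve_2x2(1, 3, -4, -1, 3, -8)
print(alpha, beta)
```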


7.7 Testing for Exactness — Part II

Simple Connectivity and the Complete Test for Exactness
A more complete test for exactness than given in theorem 7.3 can be described if we are more careful
about describing our situation. So suppose we have an equation
\[ M(x, y) + N(x, y) \frac{dy}{dx} = 0 , \]
and that, on some open region R of the XY–plane, all of the following hold:
1. The functions M(x, y) and N(x, y) , along with the derivatives ∂M/∂y and ∂N/∂x , are
continuous everywhere in R .
2. At each point (x, y) in R ,
\[ \frac{\partial M}{\partial y} = \frac{\partial N}{\partial x} . \]
That region R will be said to be simply connected if each and every simple closed curve (i.e.,
loop) in R encloses only points in R . If any simple closed curve in R encloses any point not in
R , then we will say that R is not simply connected. If you think about it, you will realize that
saying a region is simply connected is just a precise way of saying that the region has no “holes”.
And if you think a little more about the situation, you will realize that, if our open region R has
“holes” (i.e., is not simply connected), then it is probably because these are points where M(x, y)
or N (x, y) or their partial derivatives fail to exist.

!Example 7.13: Again, consider the differential equation

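\[ -\frac{y}{x^2 + y^2} + \frac{x}{x^2 + y^2} \frac{dy}{dx} = 0 . \]

As noted in example 7.7,

\[ M(x, y) = -\frac{y}{x^2 + y^2} \quad\text{and}\quad N(x, y) = \frac{x}{x^2 + y^2} \]

are well defined and differentiable and satisfy

\[ \frac{\partial M}{\partial y} = \frac{\partial N}{\partial x} \]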

everywhere in the region R consisting of the XY –plane with the origin removed. Removing
this point (the origin) creates a “hole” in R . This point is also a point not in R but which is
enclosed by any loop in R around the origin.

Now we can state the full test for exactness. (Its proof will be brieﬂy discussed at the end of
this section.)

Theorem 7.4 (complete test for exactness)

Let R be a simply-connected open region in the XY –plane, and let M(x, y) and N (x, y) be two
continuous functions on R whose partial derivatives are also continuous on R . Then
\[ M(x, y) + N(x, y) \frac{dy}{dx} = 0 \]
is in exact form on R if and only if
\[ \frac{\partial M}{\partial y} = \frac{\partial N}{\partial x} \]
at every point in R .

This theorem assures us that, if our region R is simply connected, then we can (in theory at
least) use the procedure outlined on pages 126 to 128 to ﬁnd the corresponding potential function
φ(x, y) on R , and from that, derive an implicit solution φ(x, y) = c to our differential equation.
Theorem 7.4 does not deﬁnitely say the differential equation is not in exact form if R is not
simply connected. Whether the equation is or is not in exact form over all of R is still uncertain.
What is certain, however, is the following immediate consequence of theorem 7.4.

Corollary 7.5
Assume M(x, y) and N (x, y) are two continuous functions on some open region R of the XY –
plane. Assume further that, on R , the partial derivatives of M and N are continuous and satisfy
\[ \frac{\partial M}{\partial y} = \frac{\partial N}{\partial x} . \]

Then
\[ M(x, y) + N(x, y) \frac{dy}{dx} = 0 \]
is in exact form on each simply-connected open subregion of R .

Thus, even if our original region is not simply connected, we can at least pick any open, simply
connected subregion R1 , and (in theory at least) use the procedure outlined in section 7.3 to ﬁnd
a corresponding potential function φ1 (x, y) on R1 , and from that, derive the implicit solution
φ1 (x, y) = c to our differential equation, valid on subregion R1 .
But then, why might there not be a potential function φ(x, y) valid on the entire region R ?
Let’s go back to an example started earlier to see.

!Example 7.14: Let us continue our consideration of the differential equation

\[ -\frac{y}{x^2 + y^2} + \frac{x}{x^2 + y^2} \frac{dy}{dx} = 0 \]

from example 7.7. As just noted in the previous example, the region R consisting of all the
XY–plane except for the origin (0, 0) is not simply connected. But we can partition it into the
left and right half-planes

\[ R_+ = \{ (x, y) : x > 0 \} \quad\text{and}\quad R_- = \{ (x, y) : x < 0 \} , \]

which are simply connected. Theorem 7.4 assures us that our differential equation is in exact
form over each of these half-planes, and, indeed, you can easily show that all the corresponding
potential functions on these regions for our differential equation are given by
\[ \phi_+(x, y) = \operatorname{Arctan}\left( \frac{y}{x} \right) + c_+ \quad\text{on } R_+ \]
and
\[ \phi_-(x, y) = \operatorname{Arctan}\left( \frac{y}{x} \right) + c_- \quad\text{on } R_- \]
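
One way to see why no single potential function can work on all of R (offered here as a sketch, and not necessarily the argument the text itself goes on to give) is to note that, if φ existed on all of R, the integral of M dx + N dy around any closed loop in R would be zero. The code below estimates that integral around circles centered at the origin; the function name `loop_integral` and the midpoint rule are illustrative choices.

```python
import math

def M(x, y):
    return -y / (x**2 + y**2)

def N(x, y):
    return x / (x**2 + y**2)

def loop_integral(radius=1.0, n=10_000):
    """Midpoint-rule estimate of the integral of M dx + N dy around the
    circle of the given radius centered at the origin (counterclockwise)."""
    total = 0.0
    dt = 2 * math.pi / n
    for k in range(n):
        t = (k + 0.5) * dt
        x, y = radius * math.cos(t), radius * math.sin(t)
        dx, dy = -radius * math.sin(t) * dt, radius * math.cos(t) * dt
        total += M(x, y) * dx + N(x, y) * dy
    return total

print(loop_integral())      # close to 2*pi rather than 0
print(loop_integral(3.0))   # the same value for a larger circle
```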

Additional Exercises

7.1. For each choice of φ(x, y) given below, ﬁnd a differential equation for which the given
φ is a potential function, and then solve the differential equation using the given potential
function.
a. \( \phi(x, y) = 3xy \)
b. \( \phi(x, y) = y^2 - 2x^3 y \)
c. \( \phi(x, y) = x^2 y - x y^3 \)
d. \( \phi(x, y) = x \operatorname{Arctan}(y) \)

7.2. The following concern the differential equation

\[ \frac{dy}{dx} = \frac{1}{y} - \frac{y}{2x} . \tag{7.9} \]

a. Verify that the above differential equation can be rewritten as

\[ \left[ y^2 - 2x \right] + 2xy \frac{dy}{dx} = 0 , \]

and then verify that this is an exact form for equation (7.9) by showing that

\[ \phi(x, y) = x y^2 - x^2 \]

is a corresponding potential function.

b. Solve equation (7.9) using the above potential function.
c. Note that we can also rewrite equation (7.9) as
\[ e^{x y^2 - x^2} \left[ y^2 - 2x \right] + e^{x y^2 - x^2}\, 2xy \frac{dy}{dx} = 0 . \]

Show that this is also an exact form by showing that

\[ \psi(x, y) = e^{x y^2 - x^2} \]

is a corresponding potential function.

7.3. Assume φ(x, y) is a potential function corresponding to

\[ M(x, y) + N(x, y) \frac{dy}{dx} = 0 . \]

Show that
\[ \psi_1(x, y) = e^{\phi(x, y)} \quad\text{and}\quad \psi_2(x, y) = \sin(\phi(x, y)) \]
are also potential functions for this differential equation, though corresponding to different
exact forms.

7.4. Each of the following differential equations is in exact form. Find a corresponding potential
function for each, and then ﬁnd a general solution to the differential equation using that
potential function (even if it can be solved by simpler means).
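a. \( \left[ 2xy + y^2 \right] + \left[ 2xy + x^2 \right] \frac{dy}{dx} = 0 \)
b. \( \left[ 2xy^3 + 4x^3 \right] + 3x^2 y^2 \frac{dy}{dx} = 0 \)
c. \( \left[ 2 - 2x \right] + 3y^2 \frac{dy}{dx} = 0 \)
d. \( \left[ 1 + 3x^2 y^2 \right] + \left[ 2x^3 y + 6y \right] \frac{dy}{dx} = 0 \)
e. \( 4x^3 y + \left[ x^4 - y^4 \right] \frac{dy}{dx} = 0 \)
f. \( \left[ 1 + \ln|xy| \right] + \frac{x}{y} \frac{dy}{dx} = 0 \)
g. \( \left[ 1 + e^y \right] + x e^y \frac{dy}{dx} = 0 \)
h. \( e^y + \left[ x e^y + 1 \right] \frac{dy}{dx} = 0 \)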
dx dx

7.5. For each of the following differential equations,

i. verify that the equation is not in exact form,

ii. ﬁnd an integrating factor, and

iii. solve the given differential equation (using the integrating factor just found).
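a. \( \left[ 1 + y^4 \right] + x y^3 \frac{dy}{dx} = 0 \)
b. \( \left[ y + y^4 \right] - 3x \frac{dy}{dx} = 0 \)
c. \( 2x^{-1} y + \left[ 4x^2 y - 3 \right] \frac{dy}{dx} = 0 \)
d. \( 1 + \left[ 1 - x \tan(y) \right] \frac{dy}{dx} = 0 \)
e. \( \left[ 3y + 3y^2 \right] + \left[ 2x + 4xy \right] \frac{dy}{dx} = 0 \)
f. \( 2x(y + 1) - \frac{dy}{dx} = 0 \)
g. \( 2y^3 + \left[ 4x^3 y^3 - 3x y^2 \right] \frac{dy}{dx} = 0 \)
h. \( 4xy + \left[ 3x^2 + 5y \right] \frac{dy}{dx} = 0 \) for y > 0
i. \( \left[ 6 + 12x^2 y^2 \right] + \left[ 7x^3 y + \frac{x}{y} \right] \frac{dy}{dx} = 0 \)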

7.6. The following problems concern the differential equation

\[ M(x, y) + N(x, y) \frac{dy}{dx} = 0 . \tag{7.10} \]

Assume M and N are continuous and have continuous partial derivatives over the entire
XY–plane, and let P and Q be the functions given by
\[ P = \frac{\dfrac{\partial M}{\partial y} - \dfrac{\partial N}{\partial x}}{N} \quad\text{and}\quad Q = \frac{\dfrac{\partial N}{\partial x} - \dfrac{\partial M}{\partial y}}{M} . \]

a. Show that, if P is a function of x only (so all the y ’s cancel out), then
\[ \mu(x) = e^{\int P(x)\, dx} \]

is an integrating factor for equation (7.10).

b. Show that, if Q is a function of y only (so all the x ’s cancel out), then
\[ \mu(y) = e^{\int Q(y)\, dy} \]

is an integrating factor for equation (7.10).


8 Slope Fields: Graphing Solutions Without the Solutions

Up to now, our efforts have been directed mainly towards ﬁnding formulas or equations describing
solutions to given differential equations. Then, sometimes, we sketched the graphs of these solutions
using those formulas or equations. In this chapter, we will do something quite different. Instead
of solving the differential equations, we will use the differential equations, directly, to sketch the
graphs of their solutions. No other formulas or equations describing the solutions will be needed.
The graphic techniques and underlying ideas that will be developed here are, naturally, especially
useful when dealing with differential equations that cannot be readily solved using the methods
already discussed. But these methods can be valuable even when we can solve a given differential
equation because they yield “pictures” describing the general behavior of the possible solutions.
Sometimes, these pictures can be even more enlightening than formulas for the solutions.

8.1 Motivation and Basic Concepts

Suppose we have a ﬁrst-order differential equation that, for motivational purposes, “just cannot be
solved” using the methods already discussed. For illustrative purposes, let’s pretend
\[ 16 \frac{dy}{dx} + x y^2 = 9x \]

is that differential equation. (True, this is really a simple separable differential equation. But it is
also a good differential equation for illustrating the ideas being developed.)
For our purposes, we need to algebraically solve the differential equation to get it into the
derivative formula form, y′ = F(x, y) . Doing so with the above differential equation, we get
\[ \frac{dy}{dx} = \frac{x}{16}\left( 9 - y^2 \right) . \tag{8.1} \]

Remember, there are inﬁnitely many particular solutions (with different particular solutions typically
corresponding to different values for the general solution’s ‘arbitrary’ constant). Let’s now pick some
point in the plane, say, (x, y) = (1, 2) , let y = y(x) be the particular solution to the differential
equation whose graph passes through that point, and consider sketching a short line tangent to this
graph at this point. From elementary calculus, we know the slope of this tangent line is given by
the derivative of y = y(x) at that point. And fortunately, equation (8.1) gives us a formula for
computing this very derivative without the bother of actually solving for y(x) ! So, for the graph of

this particular y(x) ,

Figure 8.3: A better slope field (based on a 19×16 grid) for y′(x) = (1/16) x (9 − y^2) , along with
four curves sketched “parallel” to the field.
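
The slopes drawn in such a figure come straight from the differential equation. The sketch below tabulates them on a 19×16 grid, assuming the grid points are spaced 1/3 apart over 0 ≤ x ≤ 6 and 0 ≤ y ≤ 5 (the spacing is an assumption; the figure only states the grid size). Exact fractions are used so the values can be compared with hand computations.

```python
from fractions import Fraction

def F(x, y):
    """Right side of dy/dx = (x/16)(9 - y^2)."""
    return Fraction(x) * (9 - Fraction(y) ** 2) / 16

# Assumed grid: x = 0, 1/3, ..., 6 (19 values) and y = 0, 1/3, ..., 5 (16 values).
xs = [Fraction(i, 3) for i in range(19)]
ys = [Fraction(j, 3) for j in range(16)]

# slopes[(x, y)] is the slope of the short line drawn at the grid point (x, y).
slopes = {(x, y): F(x, y) for x in xs for y in ys}

print(len(slopes))                                      # number of grid points
print(slopes[(Fraction(1), Fraction(2))])               # slope at (1, 2)
print(all(slopes[(x, Fraction(3))] == 0 for x in xs))   # horizontal along y = 3
```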

try to identify any curves that are horizontal lines. These are the graphs of constant solutions
and are likely to be particularly relevant. In fact, it’s often worthwhile to identify and sketch
the graphs of all constant solutions in the region, even if they do not pass through any of your
grid points.
Consider ﬁnding the values of y(4) and y(6) when y(x) is the solution to the
initial-value problem
\[ \frac{dy}{dx} = \frac{x}{16}\left( 9 - y^2 \right) \quad\text{with}\quad y(0) = 1 . \]

Since this differential equation was the one used to generate the slope fields in
figures 8.2 and 8.3, we can use the curve drawn through (0, 1) in either of these
figures as an approximate graph for y(x) . On this curve in the better slope field
of figure 8.3, we see that y ≈ 2.6 when x = 4 , and that y ≈ 3 when x = 6 .
Thus, according to our sketch, if y(x) satisfies the above initial-value problem,
then
y(4) ≈ 2.6 and y(6) ≈ 3 .
More generally, after looking at figure 8.3, it should be apparent that any curve
in the sketched region that “parallels” the slope field will approach y = 3 when
x becomes large. This strongly suggests that, if y(x) is any solution to our
differential equation with 0 ≤ y(0) ≤ 5 , then

\[ \lim_{x \to \infty} y(x) = 3 . \]

Do observe that y = 3 is a constant solution to our differential equation.
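
Values read off a hand sketch are rough, so it can be worth cross-checking them numerically. The sketch below applies the classical fourth-order Runge-Kutta method (a standard numerical scheme, not something introduced in this chapter) to the initial-value problem above; the function name `rk4` and the step count are illustrative choices. Agreement with a sketch to within a few tenths is all that should be expected.

```python
def F(x, y):
    """Right side of dy/dx = (x/16)(9 - y^2)."""
    return x * (9 - y**2) / 16

def rk4(f, x0, y0, x_end, steps=2000):
    """Classical fourth-order Runge-Kutta estimate of y(x_end)."""
    h = (x_end - x0) / steps
    x, y = x0, y0
    for _ in range(steps):
        k1 = f(x, y)
        k2 = f(x + h / 2, y + h * k1 / 2)
        k3 = f(x + h / 2, y + h * k2 / 2)
        k4 = f(x + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
    return y

y4 = rk4(F, 0.0, 1.0, 4.0)   # estimate of y(4) when y(0) = 1
y6 = rk4(F, 0.0, 1.0, 6.0)   # estimate of y(6) when y(0) = 1
print(y4, y6)
print(rk4(F, 0.0, 3.0, 6.0))  # starting on the constant solution y = 3
```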


Figure 8.4: (a) A slope field and some solutions for y′(x) = (1/4) x (3 − y) , and (b) a slope field
and some solutions for y′(x) = (1/3)(y − 3)^{1/3} .

8.3 Observing Long-Term Behavior in Slope Fields

Basic Notions
A slope ﬁeld of a differential equation gives a picture of the general behavior of the possible solutions
to that differential equation, at least in the region covered by that slope ﬁeld. In many cases, this
picture may even give you a good idea of the “long-term” behavior of the solutions.

!Example 8.1: Consider the differential equation

\[ \frac{dy}{dx} = \frac{x}{4}(3 - y) . \]
A slope field (and some solution curves) for this equation is sketched in figure 8.4a. Now let
y = y(x) be any solution to this differential equation, and look at the slope field. It clearly
suggests that
y(x) → 3 as x → ∞ .
On the other hand, the slope field sketched in figure 8.4b for
\[ \frac{dy}{dx} = \frac{1}{3}(y - 3)^{1/3} \]
suggests a rather different sort of behavior for this equation’s solutions as x gets large. Here, it
looks as if almost no solutions approach any constant value as x → ∞ . Instead, we appear to
have
\[ \lim_{x \to \infty} y(x) = +\infty \quad\text{if } y(0) > 3 \]
and
\[ \lim_{x \to \infty} y(x) = -\infty \quad\text{if } y(0) < 3 . \]

Of course, one should be cautious about using a slope field to predict the value of y(x) when x
is outside the range of x-values used in the slope field. In general, a slope field for a given differential


equation sketched on one region of the XY –plane can be quite different from a slope field for that
differential equation over a different region. So it is important to be sure that the general pattern of
slope lines on which you are basing your prediction does not significantly change as you consider
points outside the region of your slope field.

!Example 8.2: If you look at the differential equation for the slope ﬁeld in ﬁgure 8.4a,
\[ \frac{dy}{dx} = \frac{x}{4}(3 - y) , \]
you can see that the magnitude of the right side
\[ \frac{x}{4}(3 - y) \]
becomes larger as either |x| or |y − 3| becomes larger, but the sign of the right side remains
negative if x > 0 and y > 3 and positive if x > 0 and y < 3 .

Thus, the slope lines may become steeper as we increase x or as we go up or down with y , but
they continue to “direct” all sketched solution curves towards the line y = 3 as x → ∞ .

There is one class of differential equations whose slope fields are especially suited for making
long-term predictions: the autonomous first-order differential equations. Remember, such a
differential equation can be written as
\[ \frac{dy}{dx} = g(y) \]
where g(y) is a known formula of y only. The fact that the right side of this equation does not
depend on x means that the vertical column of slope lines at any one value of x is identically
repeated at every other value of x . So if the slope field tells you that the solution curve through,
say, the point (x, y) = (1, 4) has slope 1/2 , then you are certain that the solution curve through
any point (x, y) with y = 4 also has slope 1/2 . Moreover, if there is a horizontal slope line at a
point (x_0, y_0) , then there will be a horizontal slope line wherever y = y_0 ; that is, y = y_0 will be
a constant solution to the differential equation.

!Example 8.3: The differential equation for the slope field sketched in figure 8.4b,
\[ \frac{dy}{dx} = \frac{1}{3}(y - 3)^{1/3} , \]
is autonomous since this formula for the derivative does not explicitly involve x . So the pattern
of slope lines in any vertical column in the given slope field will be repeated identically in every
vertical column in any slope field covering a larger region (provided we use the same y -values).
Moreover, from the right side of the equation, we can see that the slopes of the slope lines
1. remain positive and steadily increase as y increases above y = 3 ,
and
2. remain negative and steadily decrease as y decreases below y = 3 .
Consequently, no matter how large a region we choose for the slope field, we will see that
1. the slope lines at points above y = 3 will be “directing” the solution curves more and
more steeply upwards as y increases
and

Of course, the above definitions assume the differential equation is “reasonably well-defined in a
region about the constant solution y = y0 ”.6
Often, the stability or instability of a constant solution is readily apparent from a given slope field,
with rigorous confirmation easily done by fairly simple analysis. Asymptotically stable constant
solutions are also often easily identified in slope fields, though rigorously verifying asymptotic
stability may require a bit more analysis.

!Example 8.4: Recall that the slope ﬁeld in ﬁgure 8.4a is for
\[ \frac{dy}{dx} = \frac{x}{4}(3 - y) . \]

From our discussions in examples 8.1 and 8.2, we already know y = 3 is a constant solution to
this differential equation, and that, if y = y(x) is any other solution satisfying y(0) ≈ 3 , then
y(x) ≈ 3 for all x > 0 . In fact, because the slope lines are all angled towards y = 3 as x
increases, it should be clear that, for every x > 0 , y(x) will be closer to 3 than is y(0) . So
y = 3 is a stable constant solution to the above differential equation.
Is y = 3 an asymptotically stable solution? That is, do we have

\[ \lim_{x \to \infty} y(x) = 3 \]

whenever y = y(x) is a solution with y(0) sufficiently close to 3 ? The slope field certainly
suggests so. Fortunately, this differential equation is a fairly simple separable equation which
you can easily solve to get
\[ y(x) = 3 + A e^{-x^2/8} \]
as a general solution. Taking the limit, we see that
\[ \lim_{x \to \infty} y(x) = \lim_{x \to \infty} \left[ 3 + A e^{-x^2/8} \right] = 3 + 0 , \]

no matter what y(0) is. So, yes, y = 3 is not just a stable constant solution to the above
differential equation; it is an asymptotically stable constant solution.

!Example 8.5: Now, again consider the slope ﬁeld in ﬁgure 8.4b, which is for
\[ \frac{dy}{dx} = \frac{1}{3}(y - 3)^{1/3} . \]

Again, we know y = 3 is a constant solution for this differential equation. However, from our
discussion in example 8.3, we also know that, if y = y(x) is any other solution, then

\[ \lim_{x \to \infty} y(x) = \pm\infty . \]

Clearly, then, even if y(0) is very close (but not equal) to 3 , y(x) will not remain close to 3 as
x increases. Thus, y = 3 is an unstable constant solution to this differential equation.

In the two examples given so far, all the solutions starting near a stable constant solution
converged to that solution, while all nonconstant solutions starting near an unstable solution diverged
to ±∞ as x → ∞ . The next two examples show that somewhat different behavior can occur.

6 e.g., that the differential equation can be written as y′ = F(x, y) where F is continuous at every (x, y) with x ≥ 0 and
|y − y0 | < δ for some δ > 0 .

Figure 8.5: Slope fields (a) for example 8.6, and (b) for example 8.7.

!Example 8.6: The slope ﬁeld and solution curves sketched in ﬁgure 8.5a are for
\[ \frac{dy}{dx} = \frac{y - 2}{6e^{x/2} - 2} . \]

Here, y = 2 is the only constant solution. Following the slope lines in this ﬁgure, it appears that,
although the graph of each nonconstant solution y = y(x) starts at x = 0 by moving away from
y = 2 as x increases, this graph quickly levels out so that y(x) approaches some constant as
x → ∞ . This behavior can be conﬁrmed by solving the differential equation. With a little work,
you can solve this differential equation and show that, if y is any solution to this differential
equation, then
\[ y(x) - 2 = \left[ 3 - e^{-x/2} \right] \left[ y(0) - 2 \right] . \]

You can also easily verify that

\[ 3 - e^{-x/2} < 3 \quad\text{for } x > 0 . \]

So,

\[ |y(x) - 2| = \left[ 3 - e^{-x/2} \right] |y(0) - 2| < 3\, |y(0) - 2| . \]
In other words, the distance between y(x) and y = 2 when x > 0 is never more than three
times the distance between y(x) and y = 2 when x = 0 . So, if we wish y(x) to stay within a
certain distance of y = 2 for all positive values of x , we merely need to be sure that y(0) is no
more than a third of that distance from 2 .
This conﬁrms that y = 2 is a stable constant solution. However, it is not asymptotically
stable because
\[ \lim_{x \to \infty} y(x) = 2 + 3\left[ y(0) - 2 \right] \neq 2 \quad\text{whenever } y(0) \neq 2 . \]

?Exercise 8.1: Let y(x) be a solution to the differential equation discussed in the last example.
Using the solution formula given above:
a: Show that
\[ |y(x) - 2| < 1 \quad\text{for all } x > 0 \]
whenever
\[ |y(0) - 2| < \frac{1}{3} . \]
b: How close should y(0) be to 2 so that
\[ |y(x) - 2| < \frac{1}{2} \quad\text{for all } x > 0 \, ? \]

In the next example, there are two constant solutions, and the analysis is done without the beneﬁt
of having a general solution to the given differential equation.

!Example 8.7: The slope ﬁeld and solution curves sketched in ﬁgure 8.5b are for
\[ \frac{dy}{dx} = \frac{1}{2}(4 - y)(y - 2)^{4/3} . \]

Technically, this separable equation can be solved for an implicit solution by the methods discussed
for separable equations, but the resulting equation is too complicated to be of much value here.
Fortunately, from a quick examination of the right side of this differential equation, we can see
that

1. There are two constant solutions, y = 2 and y = 4 .

2. The differential equation is autonomous. So the pattern of slope lines seen in ﬁgure 8.5b
continues unchanged throughout the entire horizontal strip with 0 ≤ y ≤ 5 .

Following the slope lines in ﬁgure 8.5b, it seems clear that y = 4 is a stable constant
solution. In fact, it appears that
\[ \lim_{x \to \infty} y(x) = 4 \]

whenever y is a solution satisfying

2 < y(0) < 5 .

This strongly suggests that y = 4 is an asymptotically stable constant solution.

On the other hand, if

\[ \lim_{x \to \infty} y(x) = 4 \quad\text{whenever}\quad 2 < y(0) < 5 , \]

then the constant solution y = 2 cannot be stable. True, it appears that

\[ \lim_{x \to \infty} y(x) = 2 \quad\text{whenever}\quad 0 < y(0) \le 2 , \]

but, if y(0) is just a tiny bit larger than 2 , then y(x) does not stay close to 2 as x increases —
it gets close to 4 . So we must consider this constant solution as being unstable. (We will later
see that this type of instability can cause serious problems when attempting to numerically solve
a differential equation.)
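
For an autonomous equation dy/dx = g(y), what the slope field shows near a constant solution y = y0 is governed by the sign of g just below and just above y0. The sketch below applies that sign check to this example. It is an informal check under the assumption that g does not change sign within `eps` of y0 other than at y0 itself; the function name `classify`, the value of `eps`, and the wording of the returned labels are illustrative, and this is not the rigorous test promised for section 8.5.

```python
def g(y):
    # (y - 2)**(4/3) is taken as the real value |y - 2|**(4/3), since a
    # negative float raised to a fractional power is complex in Python.
    return 0.5 * (4 - y) * abs(y - 2) ** (4 / 3)

def classify(g, y0, eps=1e-3):
    """Describe the behavior near the constant solution y = y0 of
    dy/dx = g(y) using the sign of g just below and just above y0."""
    below, above = g(y0 - eps), g(y0 + eps)
    if below > 0 and above < 0:
        return "solutions on both sides move toward y0"
    if below < 0 and above > 0:
        return "solutions on both sides move away from y0"
    return "solutions approach on one side and leave on the other"

print(classify(g, 4))
print(classify(g, 2))
```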

In the last example, we did not do the analysis to rigorously verify that y = 4 is an asymptotically
stable constant solution, and that y = 2 is an unstable constant solution. Still, you are probably
pretty conﬁdent that more rigorous analysis will conﬁrm this. If so, good — you are correct. We’ll
verify this in section 8.5 using the more rigorous tests developed there.
Finally, a few comments that should be made regarding, not stability, but our discussion of
“stability”:


1. Strictly speaking, we’ve been discussing the stability of solutions to initial-value problems
when the initial value of y(x) is given at x = 0 . To convert our discussion to a discussion
of the stability of solutions to initial-value problems with the initial value of y(x) given at
some other point x = x 0 , just repeat the above with x = 0 replaced by x = x 0 . There will
be no surprises.
2. Traditionally, discussions of “stability” only involve autonomous differential equations. We
did not do so here because there seemed little reason to do so (provided we are careful
about taking into account how the differential equation depends on x ). Admittedly, limiting
discussion to autonomous equations would have simpliﬁed things since the slope ﬁelds of
autonomous differential equations do not depend on x . In addition, constant solutions to
autonomous equations are traditionally called equilibrium solutions, and, to this author at
least, “stable and unstable equilibriums” sounds more interesting than “stable and unstable
constant solutions”. Still, that did not justify limiting our discussion to just autonomous
equations.

8.4 Problem Points in Slope Fields, and Issues of Existence and Uniqueness
In sketching and using a slope ﬁeld for
\[ \frac{dy}{dx} = F(x, y) \]
we have, up to this point, assumed F(x, y) is well deﬁned and continuous throughout the region
of interest. This will not always be the case. So let us look at what can happen when F is not so
well behaved at certain points. This, by the way, will naturally lead to a brief continuation of our
discussion of “existence” and “uniqueness” that we began in the later part of chapter 3.

Infinite Slopes
Often, a given F(x, y) becomes infinite at certain points in the XY –plane. This, in turn, means that
the corresponding slope lines have “infinite slope”, that is, they are vertical. One practical problem is
that the software you are using to create your slope fields might object to ‘division by zero’ and not
be able to deal with these points. On a more fundamental level, these infinite slopes may be warning
you that something very significant is occurring in the solutions whose graphs include or are near
these points.
In particular, these vertical slope lines may be telling you that solutions are, themselves,
becoming infinite for finite values of x .

!Example 8.8: A slope ﬁeld for

\[ \frac{dy}{dx} = \frac{1}{3 - x} \]
is sketched in figure 8.6a. Since
\[ \lim_{x \to 3} \frac{1}{3 - x} = \pm\infty , \]

there are vertical slope lines at every point (x, y) with x = 3 . This, along with the pattern of
the other nearby slope lines, suggests that the solutions to this differential equation are “blowing
up” as x approaches 3 . Fortunately, this differential equation is easily solved; just integrating
it yields
\[ y = c - \ln|3 - x| , \]
which does, indeed, “blow up” at x = 3 for any choice of c .

Figure 8.6: Slope fields (a) for y′(x) = (3 − x)^{−1} from example 8.8, and (b) for
y′(x) = (1/3)(x − 3)^{−2/3} from example 8.9.
Consequently, the vertical slope lines in ﬁgure 8.6a form a vertical asymptote for the graphs
of the solutions to the given differential equation. This further means that no solution to the
differential equation passes through a point (x, y) with x = 3 . In particular, if you are asked to
solve the initial-value problem
\[ \frac{dy}{dx} = \frac{1}{3 - x} \quad\text{with}\quad y(3) = 2 , \]
you have every right to respond: “Nonsense, there is no solution to this initial-value problem.”
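
The practical "division by zero" problem mentioned earlier can be handled by testing the denominator before dividing. The sketch below returns `math.inf` to mark a vertical slope line (a convention chosen here for illustration; real plotting software may use a different one) and also evaluates the solution formula near x = 3 to show the blow-up.

```python
import math

def slope(x, y):
    """Slope for dy/dx = 1/(3 - x); math.inf marks a vertical slope line."""
    denom = 3 - x
    if denom == 0:
        return math.inf
    return 1 / denom

def solution(x, c=0.0):
    """The solution y = c - ln|3 - x|, defined only for x != 3."""
    return c - math.log(abs(3 - x))

print(slope(3, 2))    # vertical slope line instead of a ZeroDivisionError
for x in (2.9, 2.99, 2.999):
    # Both the slope and the solution keep growing as x approaches 3.
    print(x, slope(x, 2), solution(x))
```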

On the other hand, the vertical slope lines might not be harbingers of particularly bad behavior
in our solutions. Instead, the solutions may be fairly ordinary functions whose graphs just happen
to have vertical tangent lines at a few points.

!Example 8.9: In ﬁgure 8.6b, we have a slope ﬁeld for

\[ \frac{dy}{dx} = \frac{1}{3(x - 3)^{2/3}} . \]
Again, “division by zero” when x = 3 gives us vertical slope lines at every (x, y) with x = 3 .
This time, however, integrating the differential equation yields

\[ y = (x - 3)^{1/3} + c . \]
For each value of c , this is a continuous function on the entire real line (including at x = 3 )
which just happens to have a vertical tangent when x = 3 .
In particular, as you can easily verify,
\[ y = (x - 3)^{1/3} + 2 \]
is the one and only solution on (−∞, ∞) to the initial-value problem
\[ \frac{dy}{dx} = \frac{1}{3(x - 3)^{2/3}} \quad\text{with}\quad y(3) = 2 . \]
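
This solution is easy to check numerically, with one caveat: in Python, raising a negative float to the power 1/3 gives a complex number, so a real cube root has to be written out. The sketch below defines one (the helper name `cbrt` is an illustrative choice) and compares a finite-difference derivative of the solution with the right side of the differential equation at points away from x = 3.

```python
import math

def cbrt(t):
    """Real cube root; t ** (1/3) would be complex for negative t."""
    return math.copysign(abs(t) ** (1 / 3), t)

def y(x):
    """The solution y = (x - 3)^(1/3) + 2."""
    return cbrt(x - 3) + 2

def F(x):
    """Right side of the differential equation, for x != 3."""
    return 1 / (3 * cbrt(x - 3) ** 2)

def dy_dx(x, h=1e-6):
    """Central-difference estimate of the derivative of y at x."""
    return (y(x + h) - y(x - h)) / (2 * h)

print(y(3))   # the initial condition y(3) = 2
for x in (0.0, 2.5, 3.5, 6.0):
    print(x, dy_dx(x), F(x))   # the last two columns should nearly agree
```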

Another possibility involving inﬁnite slopes is illustrated in the next example.

!Example 8.10: The slope ﬁeld in ﬁgure 8.7a is for

\[ \frac{dy}{dx} = \frac{x - 2}{2 - y} . \]

This time, the vertical slope lines occur wherever y = 2 (excluding the point (2, 2) , which we
will discuss later). It should be clear that these slope lines do not correspond to asymptotes of the
graphs of solutions that “blow up”, nor does it appear possible for a curve going from left to right
to pass through these points and still parallel the slope lines. Instead, if we carefully sketch the
curve that “follows the slope ﬁeld” through, say, the point (x, y) = (0, 2) , then we end up with
the circle sketched in the ﬁgure (which also has a vertical tangent at (x, y) = (4, 2) ). But such
a circle cannot be the graph of a function y = y(x) since it corresponds to two different values
for y(x) for each x in the interval (0, 4) .
Fortunately, again, our differential equation is a simple separable equation. Solving it (as
you can easily do), yields

\[ y = 2 \pm \sqrt{A - (x - 2)^2} . \]
In particular, if we further require that y(0) = 2 , then we obtain exactly two solutions,

\[ y = 2 + \sqrt{4 - (x - 2)^2} \quad\text{and}\quad y = 2 - \sqrt{4 - (x - 2)^2} , \]

with each defined and continuous on the closed interval [0, 4] . The first satisfies the differential
equation on the interval (0, 4) , and its graph is the upper half of the sketched circle. The second
also satisfies the differential equation on the interval (0, 4) , but its graph is the lower half of the
sketched circle.
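For readers who want the "easily done" separation of variables spelled out, the following is one way the solution formula in this example can be derived. It is only a sketch; the intermediate constant $c$ is introduced here for illustration and does not appear in the example itself.

```latex
\begin{align*}
(2 - y)\,\frac{dy}{dx} &= x - 2 \\
\int (2 - y)\,dy &= \int (x - 2)\,dx \\
-\tfrac{1}{2}(2 - y)^2 &= \tfrac{1}{2}(x - 2)^2 + c \\
(y - 2)^2 &= A - (x - 2)^2 \qquad\text{with } A = -2c \\
y &= 2 \pm \sqrt{A - (x - 2)^2}
\end{align*}
```

Requiring $y(0) = 2$ gives $0 = A - (0 - 2)^2$, so $A = 4$, which is the value used in the two half-circle solutions above.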

Undeﬁned and Indeterminant Slopes

Let’s now look at two examples involving points at which slope lines simply cannot be drawn because
F(x, y) is neither ﬁnite nor inﬁnite at those points.

!Example 8.11: Again, consider the slope ﬁeld in ﬁgure 8.7a for
\[ \frac{dy}{dx} = \frac{x - 2}{2 - y} . \]

If (x, y) = (2, 2) , this becomes the indeterminant expression

\[ \frac{dy}{dx} = \frac{0}{0} . \]

Moreover, the slopes of the slope lines at points near (x, y) = (2, 2) range from 0 to ±∞ . In
fact, the point (x, y) = (2, 2) appears to be the center of the circles made up of the graphs of the
solutions to this differential equation— a fact that can be conﬁrmed using the solution formulas


Figure 8.7: Slope fields (a) for $y'(x) = (x - 2)(2 - y)^{-1}$ from examples 8.10 and 8.11, and (b) for $y'(x) = (y - 2)(x - 2)^{-1}$ from example 8.12. [Plots not reproduced; both panels show $0 \le X \le 6$ and $0 \le Y \le 5$.]

from example 8.10. Clearly, no real curve can pass through the point (x, y) = (2, 2) and remain
parallel to the slope lines near this point. So if we really wanted a solution to

\[ \frac{dy}{dx} = \frac{x - 2}{2 - y} \quad\text{with}\quad y(2) = 2 , \]

which is valid on some interval (α, β) , then we would be disappointed. There is no such solution.

!Example 8.12: We also get

\[ \frac{dy}{dx} = \frac{0}{0} \]
when we let (x, y) = (2, 2) in
\[ \frac{dy}{dx} = \frac{y - 2}{x - 2} . \]
This time, however, the slope ﬁeld (sketched in ﬁgure 8.7b) suggests that every solution curve
passes through this point. And, indeed, solving this simple separable equation yields the formula

y = 2 + A(x − 2)

where A is an arbitrary constant. This formula gives y = 2 when x = 2 no matter what A is.
Consequently, the initial-value problem

\[ \frac{dy}{dx} = \frac{y - 2}{x - 2} \quad\text{with}\quad y(2) = 2 \]

has inﬁnitely many solutions.
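As a sketch of where the formula in this example comes from, the separable equation can be integrated as follows. The constant $c$ is introduced only for this illustration, and the case $A = 0$ corresponds to the constant solution $y = 2$, which the division by $y - 2$ would otherwise miss.

```latex
\begin{align*}
\frac{1}{y - 2}\,\frac{dy}{dx} &= \frac{1}{x - 2} \\
\ln|y - 2| &= \ln|x - 2| + c \\
y - 2 &= A(x - 2) \qquad\text{with } A = \pm e^{c} \\[4pt]
\text{Check, for } x \ne 2:\quad
\frac{dy}{dx} = A &= \frac{A(x - 2)}{x - 2} = \frac{y - 2}{x - 2}
\end{align*}
```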

In both of the above examples, the slope lines were all well defined (possibly with infinite slope)
at all but one point in the X Y –plane. They are fairly representative examples of what can happen
when F(x, y) is undefined at isolated points. Of course, we can easily give examples in which
F(x, y) is undefined on vast regions of the XY –plane. There isn’t much to be said about these
cases, but we’ll provide one example for the sake of completeness.


Figure 8.8: Slope fields (a) for $y'(x) = \tfrac{1}{2}(y - 2)^{1/3}$ from example 8.14, and (b) for the differential equation in example 8.15. [Plots not reproduced; both panels show $0 \le X \le 6$ and $0 \le Y \le 5$.]

!Example 8.13: Consider the differential equation

\[ \frac{dy}{dx} = \sqrt{1 - \left(x^2 + y^2\right)} . \]

The right side only makes sense if $x^2 + y^2 \le 1$. Obviously, there can be no “slope field” in any region outside the circle $x^2 + y^2 = 1$ (that’s why we didn’t attempt to sketch it), and it is just plain silly to ask for a solution to this differential equation satisfying, say, $y(x_0) = y_0$ whenever $(x_0, y_0)$ is a point outside the circle $x^2 + y^2 = 1$.

Curves Diverging From or Converging To a Point

In example 8.12 (figure 8.7b), we have solution curves converging to and diverging from the point
(2, 2) . In that case, F(x, y) was indeterminant at that point. As the next example illustrates, we
can have solution curves converging to and diverging from a point even though F(x, y) is a nice
well-defined, finite number at that point. Fortunately, for reasons to be explained, this is not very
common.

!Example 8.14: Consider

\[ \frac{dy}{dx} = \frac{1}{2}(y - 2)^{1/3} . \]
A slope ﬁeld and some solutions for this differential equation are sketched in ﬁgure 8.8a. Note
that we’ve sketched three curves diverging from the point (0, 2) . These curves are the graphs of
\[ y = 2 , \qquad y = 2 + \left(\frac{x}{3}\right)^{3/2} \qquad\text{and}\qquad y = 2 - \left(\frac{x}{3}\right)^{3/2} , \]

all of which are solutions on [0, ∞) to the initial-value problem

\[ \frac{dy}{dx} = \frac{1}{2}(y - 2)^{1/3} \quad\text{with}\quad y(0) = 2 . \]
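The claim that all three curves solve the same initial-value problem can be checked numerically. The sketch below is only an illustration: the helper names (`f`, `candidates`, `max_residual`) and the sample points are assumptions made here, and a small residual at sample points is supporting evidence rather than a proof.

```python
import math


def f(y):
    """Right side of dy/dx = (1/2) * (y - 2)^(1/3), using the real cube root."""
    u = y - 2.0
    return 0.5 * math.copysign(abs(u) ** (1.0 / 3.0), u)


# The three candidate solutions on [0, infinity) listed in the example.
candidates = {
    "constant": lambda x: 2.0,
    "upper": lambda x: 2.0 + (x / 3.0) ** 1.5,
    "lower": lambda x: 2.0 - (x / 3.0) ** 1.5,
}


def max_residual(y, xs, h=1e-6):
    """Largest |y'(x) - f(y(x))| over xs, with y' from a central difference."""
    worst = 0.0
    for x in xs:
        slope = (y(x + h) - y(x - h)) / (2.0 * h)
        worst = max(worst, abs(slope - f(y(x))))
    return worst


sample_points = [0.5 * k for k in range(1, 13)]  # x = 0.5, 1.0, ..., 6.0

for name, y in candidates.items():
    # Each candidate starts at y(0) = 2 and should leave a tiny residual.
    print(name, "y(0) =", y(0.0), "max residual =", max_residual(y, sample_points))
```

All three candidates start at the same point (0, 2), so the check illustrates that a finite, well-defined slope at a point does not by itself guarantee a unique solution through that point.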


Tests for Stability 165

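Figure 8.9: A slope field on the strip $y_0 \le y \le y_h$ for $y' = g(y)$ when $g(y_0) = 0$ and $g$ is an increasing function on $[y_0, y_h]$. [Plot not reproduced; it shows a solution curve $y = y(x)$, its tangent line $L$ at $x = 0$, and the points $x_h$ and $x_L$ on the $X$–axis.]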

8.5 Tests for Stability

In section 8.3, we discussed the stability of constant solutions, using slope fields to visually distinguish
between constant solutions that were stable, asymptotically stable or unstable. That was good for
developing a basic understanding of stability, but, as we saw in the examples, it is not always possible
to determine the stability of a given constant solution from just a slope field. So let us take a closer
look at the geometry of the solution curves to a first-order differential equation
\[ \frac{dy}{dx} = F(x, y) \]
which start out near the graph of a constant solution y = y0 , and see if we can derive some relatively
simple “computational” tests for verifying the stability or instability suggested by such slope fields
as in figures 8.9 and 8.10.
Throughout this section, we’ll assume we have three finite numbers y0 , yl and yh with
yl < y0 < yh .
The constant solution to our differential equation will be y = y0 , and the strips of interest will be
those strips bounded by the horizontal lines
y = y0 , y = yl and y = yh .
We will also assume F(x, y) is at least a continuous function of both x and y on these strips. This
ensures that we need not worry about any truly “bad” problem points in the strips and can safely
assume that no solution curve “ends” at a point in one of our strips.

Autonomous Equations
Since the analysis is much easier with autonomous equations, we will start with those. Accordingly,
we assume y = y0 is a constant solution to a differential equation of the form
\[ \frac{dy}{dx} = g(y) \]
where g is a continuous function on the closed interval [yl , yh ] .

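
Figure 8.10: Slope field for $y' = g(y)$ when $g(y_0) = 0$ and $g$ is a decreasing function on $[y_l, y_h]$ with $y_l < y_0 < y_h$. [Plot not reproduced; it shows the horizontal lines at $y_0$, $y_\epsilon$ and $y_h$, the line $L_\epsilon$, and the point $x_\epsilon$ on the $X$–axis.]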

The Single Crossing Point Lemma

We start by observing that no solution curve can cross a horizontal line y = yc more than once
if g(yc ) is a finite, nonzero value. In particular, suppose g(yc ) > 0 (as we have for yc = yh in
figure 8.9), and suppose y = y(x) is a solution to our autonomous differential equation whose graph
crosses the horizontal line y = yc at the point (x, y) = (x c , yc ) . At this point, the slope of the
solution curve is positive, telling us that the solution curve goes from below to above this horizontal
line as x goes from the left to the right of x c . And since g(y) > 0 at every point on the horizontal
line y = yc , there is no point where the solution curve can come back below this horizontal line as
x increases.
Likewise, if g(yc ) < 0 (see figure 8.10), then each solution curve crossing y = yc goes from
above to below y = yc and can never “come back up” to cross y = yc a second time.
We’ll use this observation several times in what follows, so let us dignify it as a lemma:

Lemma 8.1
Let $y = y(x)$ be a solution to
\[ \frac{dy}{dx} = g(y) \]
on some interval $(0, x_{\max})$ whose graph crosses a horizontal line $y = y_c$ when $x = x_c$. Suppose, further, that $g(y_c)$ is a finite, nonzero value. Then,
\[ g(y_c) > 0 \;\Longrightarrow\; y(x) > y_c \quad\text{whenever}\quad x_c < x < x_{\max} , \]
while
\[ g(y_c) < 0 \;\Longrightarrow\; y(x) < y_c \quad\text{whenever}\quad x_c < x < x_{\max} . \]

Instability
Consider the case illustrated in ﬁgure 8.9. Here, y = y0 is a constant solution to
\[ \frac{dy}{dx} = g(y) , \]
and the slope of the slope line at (x, y) (i.e., the value of g(y) ) increases as y increases from
y = y0 to y = yh . So if
y0 < y1 < y2 < yh ,


then
\[ 0 = g(y_0) < g(y_1) < g(y_2) < g(y_h) . \tag{8.4} \]
Now take any solution y = y(x) to
\[ \frac{dy}{dx} = g(y) \quad\text{with}\quad y_0 < y(0) < y_h , \]

and let L be the straight line tangent to the graph of this solution at the point where x = 0 (see
ﬁgure 8.9). From inequality set (8.4) (and ﬁgure 8.9), we see that:

1. The slope of tangent line $L$ is positive. Hence, $L$ crosses the horizontal line $y = y_h$ at some point $(x_L, y_h)$ with $0 < x_L < \infty$.

2. At each point in the strip, the slope of the tangent to the graph of $y = y(x)$ is at least as large as the slope of $L$. So, as $x$ increases, the graph of $y = y(x)$ goes upward faster than $L$. Consequently, this solution curve crosses the horizontal line $y = y_h$ at a point $(x_h, y_h)$ with $0 < x_h < x_L$.

From this and lemma 8.1, it follows that, if $x$ is a point in the domain of our solution $y = y(x)$, then
\[ y(x) > y_h \quad\text{whenever}\quad x > x_h . \]
That is,
\[ y(x) - y_0 > y_h - y_0 \]
whenever $x$ is a point in the domain of $y = y(x)$ with $x_h < x$.
This tells us that, no matter how close we pick $y(0)$ to $y_0$ (at least with $y(0) > y_0$), the graph of our solution will, as $x$ increases, diverge to a distance of at least $y_h - y_0$ from $y_0$. This means we cannot choose a distance $\epsilon$ with
\[ \epsilon < y_h - y_0 , \]
and find a solution $y = y(x)$ to
\[ \frac{dy}{dx} = g(y) \quad\text{with}\quad y(0) > y_0 \]
that remains within $\epsilon$ of $y_0$ for all values of $x$. In other words, $y = y_0$ is not a stable constant solution.
This, along with analogous arguments when g(y) is an increasing function on [yl , y0 ] , gives
us:

Theorem 8.2
Let $y = y_0$ be a constant solution to
\[ \frac{dy}{dx} = g(y) , \]
where $g$ is a continuous function on some interval $[y_l, y_h]$ with $y_l < y_0 < y_h$. Then $y = y_0$ is an unstable constant solution if either of the following holds:

1. g(y) is an increasing function on [yl , y0 ] for some yl < y0 .

2. g(y) is an increasing function on [y0 , yh ] for some y0 < yh .
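The hypotheses of theorem 8.2 can be explored numerically before attempting a proof. The sketch below samples g on a grid; the function names, the sample count, and the tolerance are assumptions made for this illustration, and a sampled check can only suggest that g is increasing. The sample g is taken from exercise 8.8c, 10 dy/dx = y(y − 5), which has the constant solutions y = 0 and y = 5.

```python
def looks_increasing(g, a, b, samples=200):
    """True if g is strictly increasing at evenly spaced sample points of [a, b].

    Sampling only suggests monotonicity; it is not a proof.
    """
    ys = [a + (b - a) * k / samples for k in range(samples + 1)]
    values = [g(y) for y in ys]
    return all(v0 < v1 for v0, v1 in zip(values, values[1:]))


def instability_suggested(g, y0, yl, yh, tol=1e-12):
    """Check the two conditions of theorem 8.2 on sampled points (illustrative only)."""
    if abs(g(y0)) > tol:
        raise ValueError("y = y0 is not a constant solution because g(y0) != 0")
    return looks_increasing(g, yl, y0) or looks_increasing(g, y0, yh)


def g(y):
    """Right side of dy/dx = y(y - 5)/10, from exercise 8.8c."""
    return y * (y - 5.0) / 10.0


print(instability_suggested(g, y0=5.0, yl=4.0, yh=6.0))   # g increases near y = 5
print(instability_suggested(g, y0=0.0, yl=-1.0, yh=1.0))  # g decreases near y = 0
```

A result of `False` does not mean the constant solution is stable; it only means this particular test does not apply on the chosen interval.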


Stability
Now consider the case illustrated in figure 8.10. Here, y = y0 is a constant solution to
\[ \frac{dy}{dx} = g(y) \]
when g(y) (the slope of the slope line at point (x, y) ) is a decreasing function on an interval
[yl , yh ] . So, if
\[ y_l < y_{-2} < y_{-1} < y_0 < y_1 < y_2 < y_h , \]
then
\[ g(y_l) > g(y_{-2}) > g(y_{-1}) > 0 > g(y_1) > g(y_2) > g(y_h) . \]
Thus, the slope lines just below the horizontal line y = y0 have positive slope, those just above
y = y0 have negative slope, and the slopes become steeper as the distance from the horizontal line
y = y0 increases.
The fact that $y = y_0$ is a stable constant solution should be obvious from the figure. After all, the slope lines are all angled toward $y = y_0$ as $x$ increases, “directing” the solution curves toward $y = y_0$ as $x$ increases.
Figure 8.10 also suggests that, if y = y(x) is any solution to
\[ \frac{dy}{dx} = g(y) \quad\text{with}\quad y_l < y(0) < y_h , \]
then
\[ \lim_{x\to\infty} y(x) = y_0 , \]
suggesting that y = y0 is also asymptotically stable. To rigorously confirm this, it is convenient to
separately consider the three cases

y(0) = y0 , y0 < y(0) < yh and yl < y(0) < y0 .

The ﬁrst case is easily taken care of. If y(0) = y0 , then our solution y = y(x) must be the
constant solution y = y0 (the already noted stability of this constant solution prevents any other
possible solutions). Hence,
\[ \lim_{x\to\infty} y(x) = \lim_{x\to\infty} y_0 = y_0 . \]
Next, assume $y = y(x)$ is a solution to
\[ \frac{dy}{dx} = g(y) \quad\text{with}\quad y_0 < y(0) < y_h . \]
To show
\[ \lim_{x\to\infty} y(x) = y_0 , \]
it helps to remember that the above limit is equivalent to saying that we can make $y(x)$ as close to $y_0$ as desired (say, within some small, positive distance $\epsilon$) by simply picking $x$ large enough.
So let $\epsilon$ be any small, positive value, and let us show that there is a corresponding “large enough value” $x_\epsilon$ so that $y(x)$ is within a distance $\epsilon$ of $y_0$ whenever $x$ is bigger than $x_\epsilon$. And since we are only concerned with $\epsilon$ being “small”, let’s go ahead and assume
\[ \epsilon < y_h - y_0 . \]
Now, for notational convenience, let $y_\epsilon = y_0 + \epsilon$, and let $L_\epsilon$ be the straight line through the point $(x, y) = (0, y_h)$ with the same slope as the slope lines along the horizontal line $y = y_\epsilon$ (see figure 8.10). Because $y_h > y_\epsilon > y_0$, the slope lines along the line $y = y_\epsilon$ have negative slope.


Hence, so does $L_\epsilon$. Consequently, the line $L_\epsilon$ goes downward from point $(0, y_h)$, intersecting the horizontal line $y = y_\epsilon$ at some point to the right of the $Y$–axis. Let $x_\epsilon$ be the $X$–coordinate of that point.
Next, consider the graph of our solution $y = y(x)$ when $0 \le x \le x_\epsilon$. Observe that:
1. This part of this solution curve starts at the point $(0, y(0))$, which is between the lines $L_\epsilon$ and $y = y_0$.

2. The slope at each point of this solution curve above $y = y_\epsilon$ is less than the slope of the line $L_\epsilon$. Hence, this part of the solution curve must go downward faster than $L_\epsilon$ as $x$ increases.

3. If $y(x) < y_\epsilon$ for some value of $x$, then $y(x) < y_\epsilon$ for all larger values of $x$. (This is from lemma 8.1.)

4. The graph of $y = y(x)$ cannot go below the horizontal line $y = y_0$ because the slope lines at points just below $y = y_0$ all have positive slope.
These observations tell us that, at least when $0 \le x \le x_\epsilon$, our solution curve must remain between the lines $L_\epsilon$ and $y = y_0$. In particular, since $L_\epsilon$ crosses the horizontal line $y = y_\epsilon$ at $x = x_\epsilon$, we must have
\[ y_0 \le y(x_\epsilon) \le y_\epsilon = y_0 + \epsilon . \]
From this along with lemma 8.1, it follows that
\[ y_0 \le y(x) \le y_0 + \epsilon \quad\text{for all}\quad x > x_\epsilon . \]
Equivalently,
\[ 0 \le y(x) - y_0 \le \epsilon \quad\text{for all}\quad x > x_\epsilon , \]
which tells us that $y(x)$ is within $\epsilon$ of $y_0$ whenever $x > x_\epsilon$. Hence, we can make $y(x)$ as close as desired to $y_0$ by choosing $x$ large enough. That is,
\[ \lim_{x\to\infty} y(x) = y_0 . \]

That leaves the veriﬁcation of

\[ \lim_{x\to\infty} y(x) = y_0 \]
when $y = y(x)$ satisfies
\[ \frac{dy}{dx} = g(y) \quad\text{with}\quad y_l < y(0) < y_0 . \]

This will be left to the interested reader (just use straightforward modifications of the arguments in
the last few paragraphs — start by vertically flipping figure 8.10).
To summarize our results:

Theorem 8.3
Let y = y0 be a constant solution to an autonomous differential equation
\[ \frac{dy}{dx} = g(y) . \]

Theorem 8.7
Let y = y0 be a constant solution to a differential equation
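\[ \frac{dy}{dx} = F(x, y) \]
in which $F(x, y)$ is differentiable with respect to both $x$ and $y$ at every point in some strip
\[ \{ (x, y) : 0 \le x \ \text{and}\ y_l \le y \le y_h \} \]
with $y_l < y_0 < y_h$. Further suppose that, at each point in this strip above the line $y = y_0$,
\[ \frac{\partial F}{\partial y} < 0 \quad\text{and}\quad \frac{\partial F}{\partial x} \le 0 , \]
and that, at each point in this strip below the line $y = y_0$,
\[ \frac{\partial F}{\partial y} < 0 \quad\text{and}\quad \frac{\partial F}{\partial x} \ge 0 . \]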

Then y = y0 is both a stable and asymptotically stable constant solution.

We’ll leave it to the interested reader to come up with corresponding analogs of the other
theorems on stability and instability.

Additional Exercises

8.2. For each of the following, construct the slope field for the given differential equation on the indicated 2×2 or 3×3 grid of listed points:
a. $\frac{dy}{dx} = \frac{1}{2}\left(x^2 + y^2\right)$ at (x, y) = (0, 0), (1, 0), (0, 1) and (1, 1)
b. $2\frac{dy}{dx} = x^2 - y^2$ at (x, y) = (0, 0), (1, 0), (0, 1) and (1, 1)
c. $\frac{dy}{dx} = \frac{y}{x}$ at (x, y) = (1, 1), (3/2, 1), (1, 3/2) and (3/2, 3/2)
d. $(2x + 1)\frac{dy}{dx} = x^2 - 2y^2$ at (x, y) = (0, 1), (1, 1), (0, 2) and (1, 2)
e. $2\frac{dy}{dx} = (x - y)^2$ at (x, y) = (0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2), (2, 0), (2, 1) and (2, 2)
f. $\frac{dy}{dx} = (1 - y)^3$ at (x, y) = (0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2), (2, 0), (2, 1) and (2, 2)

Several slope fields for unspecified first-order differential equations are given below. For sketching purposes, you may want to use an enlarged photocopy of each given slope field.


8.3. On the right is a slope field for some first-order differential equation.
a. Letting y = y(x) be the solution to this differential equation that satisfies y(0) = 3 :
i. Sketch the graph of this solution.
ii. Using your sketch, find (approximately) the value of y(8) .
b. Sketch the graphs of two other solutions to this unspecified differential equation.
[Slope field not reproduced; the axes cover 0 ≤ X ≤ 9 and −1 ≤ Y ≤ 7.]

8.4. On the right is a slope field for some first-order differential equation.
a. Sketch the graphs of the solutions to this differential equation that satisfy
i. y(0) = 2
ii. y(0) = 4
iii. y(0) = 4.5
b. What, approximately, is y(4) if y is the solution to this unspecified differential equation satisfying
i. y(0) = 2 ?
ii. y(0) = 4 ?
iii. y(0) = 4.5 ?
[Slope field not reproduced; the axes cover 0 ≤ X ≤ 9 and −1 ≤ Y ≤ 7.]

8.5. On the right is a slope field for some first-order differential equation.
a. Let y(x) be the solution to the differential equation with y(0) = 5 .
i. Sketch the graph of this solution.
ii. What (approximately) is the maximum value of y(x) on the interval (0, 9) , and where does it occur?
iii. What (approximately) is y(8) ?
b. Now let y(x) be the solution to the differential equation with y(0) = 0 .
i. Sketch the graph of this solution.
ii. What (approximately) is y(8) ?
[Slope field not reproduced; the axes cover 0 ≤ X ≤ 9 and −1 ≤ Y ≤ 7.]


8.6. On the right is a slope field for some first-order differential equation.
a. Let y(x) be the solution to the differential equation with y(0) = 2 .
i. Sketch the graph of this solution.
ii. What (approximately) is y(3) ?
b. Now let y(x) be the solution to the differential equation with y(3) = 1 .
i. Sketch the graph of this solution.
ii. What (approximately) is y(0) ?
[Slope field not reproduced; the axes cover 0 ≤ X ≤ 9 and −1 ≤ Y ≤ 7.]

8.7. On the right is a slope field for some first-order differential equation.
a. Let y(x) be the solution to the differential equation with y(0) = 4 .
i. Sketch the graph of this solution.
ii. What (approximately) is the maximum value of y(x) on the interval (0, 9) , and where does it occur?
b. Now let y(x) be the solution to the differential equation with y(2) = 0 .
i. Sketch the graph of this solution.
ii. What (approximately) is the maximum value of y(x) on the interval (0, 9) , and where does it occur?
[Slope field not reproduced; the axes cover 0 ≤ X ≤ 9 and −1 ≤ Y ≤ 7.]

8.8. Look up the commands for generating slope fields for first-order differential equations in
your favorite computer math package (they may be the same commands for generating
“direction fields”). Then:

i. Use the computer math package to sketch the indicated slope ﬁeld for each differ-
ential equation given below,

ii. and use the resulting slope ﬁeld to sketch (by hand) some of the solution curves for
the given differential equation.

a. $\frac{dy}{dx} = \sin(x + y)$ using a 25×25 grid on the region −2 ≤ x ≤ 10 and −2 ≤ y ≤ 10
b. $10\frac{dy}{dx} = y(5 - y)$ using a 25×25 grid on the region −2 ≤ x ≤ 10 and −2 ≤ y ≤ 10
c. $10\frac{dy}{dx} = y(y - 5)$ using a 25×25 grid on the region −2 ≤ x ≤ 10 and −2 ≤ y ≤ 10
d. $2\frac{dy}{dx} = y(y - 2)^2$ using a 25×17 grid on the region −2 ≤ x ≤ 10 and −1 ≤ y ≤ 3
e. $3\frac{dy}{dx} = (4 - y)(y - 1)^{4/3}$ using a 19×16 grid on the region 0 ≤ x ≤ 6 and 0 ≤ y ≤ 5
f. $3\frac{dy}{dx} = \sqrt[3]{x - y}$ using a 25×21 grid on the region −2 ≤ x ≤ 10 and −2 ≤ y ≤ 8

8.9. Slope fields for several (unspecified) first-order differential equations have been sketched below. Assume that each horizontal line is the graph of a constant solution to the corresponding differential equation. Identify each of these constant solutions, and, for each constant solution, decide whether the slope field is indicating that it is a stable, asymptotically stable, or unstable constant solution.

[Slope fields for parts a through f not reproduced. In parts a, b, c, d and f the axes cover 0 ≤ X ≤ 5 and 0 ≤ Y ≤ 5; in part e they cover 0 ≤ X ≤ 5 and −3 ≤ Y ≤ 3.]


Computing Via Euler’s Method (Illustrated) 179

with $y_k \approx y(x_k)$ for each $k$. Plotting the $(x_k, y_k)$ points and connecting the resulting dots with short straight lines leads to a piecewise straight approximation to the graph of the solution $y(x)$ as illustrated in figure 9.1b. For convenience, let us denote this approximation generated by the Euler method by $y_{E,\Delta x}$.
As already indicated, $N$ will denote the number of steps taken. It must be chosen along with $\Delta x$ to ensure that $x_N$ is the maximum value of $x$ of interest. In theory, both $N$ and the maximum value of $x$ can be infinite. In practice, they must be finite.
The precise steps of Euler’s method are outlined and illustrated in the next section.

9.2 Computing Via Euler’s Method (Illustrated)

Suppose we wish to find a numerical solution to some first-order differential equation with initial
data y(x 0 ) = y0 , say,
\[ 5\frac{dy}{dx} - y^2 = -x^2 \quad\text{with}\quad y(0) = 1 . \tag{9.3} \]
(As it turns out, this differential equation is not easily solved by any of the methods already discussed.
So if we want to find the value of, say, y(3) , then a numerical method may be our only choice.)
To use Euler’s method to find our numerical solution, we follow the steps given below. These
steps are grouped into two parts: the main part in which the values of the x k ’s and yk ’s are
iteratively computed, and the preliminary part in which the constants and formulas for those iterative
computations are determined.

The Steps in Euler’s Method

Part I (Preliminaries)
1. Get the differential equation into derivative formula form,

\[ \frac{dy}{dx} = f(x, y) . \]

For our example, solving for the derivative formula form yields

\[ \frac{dy}{dx} = \frac{1}{5}\left(y^2 - x^2\right) . \]

2. Set x 0 and y0 equal to the x and y values of the initial data.

In our example, the initial data is y(0) = 1 . So

x0 = 0 and y0 = 1 .

3. Pick a distance $\Delta x$ for the step size, a positive integer $N$ for the maximum number of steps, and a maximum value desired for $x$, $x_{\max}$. These quantities should be chosen so that

\[ x_{\max} = x_0 + N\Delta x . \]

Of course, you only choose two of these values and compute the third. Which two are chosen depends on the problem.


180 Euler’s Numerical Method

For no good reason whatsoever, let us pick
\[ \Delta x = \frac{1}{2} \quad\text{and}\quad N = 6 . \]
Then
\[ x_{\max} = x_0 + N\Delta x = 0 + 6 \cdot \frac{1}{2} = 3 . \]
4. Write out the equations
\[ x_{k+1} = x_k + \Delta x \tag{9.4a} \]
and
\[ y_{k+1} = y_k + \Delta x \cdot f(x_k, y_k) \tag{9.4b} \]
using the information from the previous steps.

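For our example,
\[ f(x, y) = \frac{1}{5}\left(y^2 - x^2\right) \quad\text{and}\quad \Delta x = \frac{1}{2} . \]
So, for our example, equation set (9.4) becomes
\[ x_{k+1} = x_k + \frac{1}{2} \tag{9.4a$'$} \]
and
\[ \begin{aligned} y_{k+1} &= y_k + \frac{1}{2} \cdot \frac{1}{5}\left(y_k^2 - x_k^2\right) \\ &= y_k + \frac{1}{10}\left(y_k^2 - x_k^2\right) . \end{aligned} \tag{9.4b$'$} \]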

Formula (9.4b) for $y_{k+1}$ is based on approximation (9.2). According to that approximation, if $y(x)$ is the solution to our initial-value problem and $y_k \approx y(x_k)$, then

\[ y(x_{k+1}) = y(x_k + \Delta x) \approx y_k + \Delta x \cdot f(x_k, y_k) = y_{k+1} . \]

Because of this, each $y_k$ generated by Euler’s method is an approximation of $y(x_k)$.
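Equation set (9.4) translates almost directly into code. The following is a minimal sketch of a single step, assuming floating-point arithmetic; the names `euler_step`, `f`, and `dx` are choices made for this illustration and are not notation from the text.

```python
def euler_step(f, x_k, y_k, dx):
    """Apply equation set (9.4) once and return (x_{k+1}, y_{k+1})."""
    return x_k + dx, y_k + dx * f(x_k, y_k)


def f(x, y):
    """Right side of the derivative formula form dy/dx = (y^2 - x^2) / 5."""
    return (y * y - x * x) / 5.0


# One step from the initial data y(0) = 1 with step size 1/2.
x1, y1 = euler_step(f, 0.0, 1.0, 0.5)
print(x1, y1)
```

Passing `f` as an argument keeps the step independent of the particular differential equation, so the same function can be reused once a different equation has been put into derivative formula form.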

Part II of Euler’s Method (Iterative Computations)

1. Compute $x_1$ and $y_1$ using equation set (9.4) with $k = 0$ and the values of $x_0$ and $y_0$ from the initial data.
For our example, using equation set (9.4′) with $k = 0$ and the initial values $x_0 = 0$ and $y_0 = 1$ gives us
\[ x_1 = x_{0+1} = x_0 + \Delta x = 0 + \frac{1}{2} = \frac{1}{2} , \]
and
\[ \begin{aligned} y_1 = y_{0+1} &= y_0 + \Delta x \cdot f(x_0, y_0) \\ &= y_0 + \frac{1}{10}\left(y_0^2 - x_0^2\right) \\ &= 1 + \frac{1}{10}\left(1^2 - 0^2\right) = \frac{11}{10} . \end{aligned} \]

2. Compute $x_2$ and $y_2$ using equation set (9.4) with $k = 1$ and the values of $x_1$ and $y_1$ from the previous step.
For our example, equation set (9.4′) with $k = 1$ and the above values for $x_1$ and $y_1$ yields
\[ x_2 = x_{1+1} = x_1 + \Delta x = \frac{1}{2} + \frac{1}{2} = 1 , \]
and
\[ \begin{aligned} y_2 = y_{1+1} &= y_1 + \Delta x \cdot f(x_1, y_1) \\ &= y_1 + \frac{1}{10}\left(y_1^2 - x_1^2\right) \\ &= \frac{11}{10} + \frac{1}{10}\left[\left(\frac{11}{10}\right)^2 - \left(\frac{1}{2}\right)^2\right] = \frac{299}{250} . \end{aligned} \]
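3. Compute $x_3$ and $y_3$ using equation set (9.4) with $k = 2$ and the values of $x_2$ and $y_2$ from the previous step.
For our example, equation set (9.4′) with $k = 2$ and the above values for $x_2$ and $y_2$ yields
\[ x_3 = x_{2+1} = x_2 + \Delta x = 1 + \frac{1}{2} = \frac{3}{2} , \]
and
\[ \begin{aligned} y_3 = y_{2+1} &= y_2 + \frac{1}{10}\left(y_2^2 - x_2^2\right) \\ &= \frac{299}{250} + \frac{1}{10}\left[\left(\frac{299}{250}\right)^2 - 1^2\right] = \frac{774{,}401}{625{,}000} . \end{aligned} \]
For future convenience, note that
\[ y_3 = \frac{774{,}401}{625{,}000} \approx 1.2390 . \]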
4, 5, … In each subsequent step, increase $k$ by 1, and compute $x_{k+1}$ and $y_{k+1}$ using equation set (9.4) with the values of $x_k$ and $y_k$ from the previous step. Continue until $x_N$ and $y_N$ are computed.
For our example (omitting many computational details):
With $k + 1 = 4$,
\[ x_4 = x_{3+1} = x_3 + \Delta x = \frac{3}{2} + \frac{1}{2} = 2 , \]
and
\[ y_4 = y_{3+1} = y_3 + \frac{1}{10}\left(y_3^2 - x_3^2\right) = \cdots \approx 1.1676 . \]
With $k + 1 = 5$,
\[ x_5 = x_{4+1} = x_4 + \Delta x = 2 + \frac{1}{2} = \frac{5}{2} , \]
and
\[ y_5 = y_{4+1} = y_4 + \frac{1}{10}\left(y_4^2 - x_4^2\right) = \cdots \approx 0.9039 . \]
With $k + 1 = 6$,
\[ x_6 = x_{5+1} = x_5 + \Delta x = \frac{5}{2} + \frac{1}{2} = 3 , \]
and
\[ y_6 = y_{5+1} = y_5 + \frac{1}{10}\left(y_5^2 - x_5^2\right) = \cdots \approx 0.3606 . \]
Since we had earlier chosen $N$, the maximum number of steps, to be 6, we can stop computing.
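The hand computations above can be reproduced with a short loop. The sketch below uses Python's `fractions.Fraction` so that the first few values come out as exact fractions; the function name `euler` and the choice of exact arithmetic are assumptions made for this illustration.

```python
from fractions import Fraction


def euler(f, x0, y0, dx, n_steps):
    """Return the lists [x_0, ..., x_N] and [y_0, ..., y_N] from equation set (9.4)."""
    xs, ys = [x0], [y0]
    for _ in range(n_steps):
        x_k, y_k = xs[-1], ys[-1]
        ys.append(y_k + dx * f(x_k, y_k))  # equation (9.4b)
        xs.append(x_k + dx)                # equation (9.4a)
    return xs, ys


def f(x, y):
    """Right side of dy/dx = (y^2 - x^2) / 5 for initial-value problem (9.3)."""
    return (y * y - x * x) / 5


xs, ys = euler(f, Fraction(0), Fraction(1), Fraction(1, 2), 6)
for k, (x_k, y_k) in enumerate(zip(xs, ys)):
    # Exact fractions for the first steps, four decimal places afterwards.
    print(k, x_k, y_k if k <= 3 else f"{float(y_k):.4f}")
```

Exact fractions grow quickly in size, so for more than a handful of steps ordinary floating-point numbers are the more practical choice.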


(a)
k    x_k    y_k
0    0      1
1    0.5    1.1000
2    1.0    1.1960
3    1.5    1.2390
4    2.0    1.1676
5    2.5    0.9039
6    3.0    0.3606

Figure 9.2: Results of Euler’s method to solve $5y' - y^2 = -x^2$ with $y(0) = 1$ using $\Delta x = 1/2$ and $N = 6$: (a) The numerical solution in which $y_k \approx y(x_k)$ (for $k \ge 3$, the values of $y_k$ are to the nearest 0.0001). (b) The graph of the corresponding approximate solution $y = y_{E,\Delta x}(x)$. [Graph (b) not reproduced; its axes cover $0 \le X \le 3$ and $0 \le Y \le 1$ and beyond.]

Using the Results of the Method

What you do with the results of your computations depends on why you are doing these computations.
If $N$ is not too large, it is usually a good idea to write the obtained values of

\[ \{ x_0, x_1, x_2, x_3, \ldots, x_N \} \quad\text{and}\quad \{ y_0, y_1, y_2, y_3, \ldots, y_N \} \]

in a table for convenient reference (with a note that $y_k \approx y(x_k)$ for each $k$) as done in figure 9.2a for our example. And, whatever the size of $N$, it is always enlightening to graph the corresponding piecewise straight approximation $y = y_{E,\Delta x}(x)$ to the graph of $y = y(x)$ by drawing straight lines between each $(x_k, y_k)$ and $(x_{k+1}, y_{k+1})$ (as done in figure 9.2b for our example).

On Doing the Computations

The ﬁrst few times you use Euler’s method, attempt to do all the computations by hand. If the
numbers become too awkward to handle, use a simple calculator and decimal approximations. This
will help you understand and appreciate the method. It will also help you appreciate the tremendous
value of programming a computer to do the calculations in the second part of the method. That, of
course, is how one should really carry out the computations in the second part of Euler’s method.
In fact, Euler’s method may already be one of the standard procedures in your favorite computer
math package. Still, writing your own version is enlightening and is highly recommended for the
good of your soul.

9.3 What Can Go Wrong

Do not forget that Euler’s method does not yield exact answers. Instead, it yields values

\[ \{ x_0, x_1, x_2, x_3, \ldots, x_N \} \quad\text{and}\quad \{ y_0, y_1, y_2, y_3, \ldots, y_N \} \]

with
\[ y_k \approx y(x_k) \quad\text{for}\quad k > 0 . \]


Figure 9.3: Catastrophic failure of Euler’s method in solving $y' = (y - 1)^2$ with $y(0) = -1.3$: (a) Graphs of the true solution and the approximate solution. (b) Same graphs with a slope field, the graph of the equilibrium solution, and the graph of the true solution to $y' = (y - 1)^2$ with $y(x_1) = y_1$. [Plots not reproduced; both panels show $0 \le X \le 4$ and $-1 \le Y \le 4$.]

What’s more, each $y_{k+1}$ is based on the approximation

\[ y(x_k + \Delta x) \approx y(x_k) + \Delta x \cdot f(x_k, y(x_k)) \]

with $y(x_k)$ being replaced with approximation $y_k$ when $k > 0$. So we are computing approximations based on previous approximations.
Because of this, the accuracy of the approximation yk ≈ y(x k ) , especially for larger values of
k , is a serious issue. Consider the work done in the previous section: just how well can we trust the
approximation
y(3) ≈ 0.3606
obtained for the solution to initial-value problem (9.3)? In fact, it can be shown that

y(3) = −.23699 to the nearest 0.00001 .

So our approximation is not very good!

To get an idea of how the errors can build up, look back at figure 9.1a on page 177. You can
see that, if the graphs of the true solutions to the differential equation are generally concave up (as
in the figure), then the tangent line approximations used in Euler’s method lie below the true graphs
and yield underestimates for the approximations. Over several steps, these underestimates can build
up so that the yk ’s are significantly below the actual values of the y(x k )’s .
Likewise, if the graphs of the true solutions are generally concave down, then the tangent line
approximations used in Euler’s method lie above the true graphs and yield overestimates for the
approximations.
Also keep in mind that most of the tangent line approximations used in Euler’s method are not
based on lines tangent to the true solution but on lines tangent to solution curves passing through the
(x k , yk )’s . This can lead to the “catastrophic failure” illustrated in figure 9.3a. In this figure, the
true solution to
\[ \frac{dy}{dx} = (y - 1)^2 \quad\text{with}\quad y(0) = -\frac{13}{10} , \]
is graphed along with the graph of the approximate solution generated from Euler’s method with


Figure 9.4: Two approximations $y_N$ of $y(x_{\max})$ where $y$ is the solution to $y' = f(x, y)$ with $y(x_0) = y_0$: (a) Using Euler’s method with $\Delta x$ equaling the distance from $x_0$ to $x_{\max}$. (b) Using Euler’s method with $\Delta x$ equaling half the distance from $x_0$ to $x_{\max}$. (Note: $\hat{y}$ is the solution to $y' = f(x, y)$ with $y(x_1) = y_1$.) [Plots not reproduced.]

$\Delta x = 1/2$. Exactly why the graphs appear so different becomes apparent when we superimpose the slope field in figure 9.3b. The differential equation has an unstable equilibrium solution $y = 1$. If $y(0) < 1$, as in the above initial-value problem, then the true solution $y(x)$ should converge to 1 as $x \to \infty$. Here, however, one step of Euler’s method overestimated the value of $y_1$ enough that $(x_1, y_1)$ ended up above equilibrium and in the region where the solutions diverge away from the equilibrium. The tangent lines to these solutions led to higher and higher values for the subsequently computed $y_k$’s. Thus, instead of correctly telling us that

\[ \lim_{x\to\infty} y(x) = 1 , \]

this application of Euler’s method suggests that

\[ \lim_{x\to\infty} y(x) = \infty . \]
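The jump over the equilibrium can be seen directly in the numbers. The sketch below applies Euler's method with a step size of 1/2 and compares each value with the true solution. The closed form used in `true_solution` is not quoted from the text; it is obtained here by separating variables and should be read as an assumption of this illustration, as should the function names.

```python
def euler(f, x0, y0, dx, n_steps):
    """Return the lists of x_k and y_k produced by Euler's method."""
    xs, ys = [x0], [y0]
    for _ in range(n_steps):
        x_k, y_k = xs[-1], ys[-1]
        ys.append(y_k + dx * f(x_k, y_k))
        xs.append(x_k + dx)
    return xs, ys


def f(x, y):
    """Right side of dy/dx = (y - 1)^2."""
    return (y - 1.0) ** 2


def true_solution(x):
    """y = 1 - 1/(x + c) with c = 1/2.3, which satisfies y(0) = -1.3."""
    c = 1.0 / 2.3
    return 1.0 - 1.0 / (x + c)


xs, ys = euler(f, 0.0, -1.3, 0.5, 8)
for x_k, y_k in zip(xs, ys):
    # The Euler values cross y = 1 on the first step; the true solution never does.
    print(f"x = {x_k:.1f}   Euler: {y_k:8.4f}   true: {true_solution(x_k):8.4f}")
```

The first step already lands at a value above 1, after which every computed slope is positive and the approximations keep rising, while the true solution stays below the equilibrium.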

A few other situations where blindly applying Euler’s method can lead to misleading results are illustrated in the exercises (see exercises 9.6, 9.7, 9.8, and 9.9). And these sorts of problems are
not unique to Euler’s method. Similar problems can occur with all numerical methods for solving
differential equations. Because of this, it is highly recommended that Euler’s method (or any other
numerical method) be used only as a last resort. Try the methods developed in the previous chapters
first. Use a numerical method only if the other methods fail to yield usable formulas or equations.
Unfortunately, the world is filled with first-order differential equations for which numerical
methods are the only practical choices. So be sure to skim the next section on improving the method.
Also, if you must use Euler’s method (or any other numerical method), be sure to do a reality check.
Graph the corresponding approximation on top of the slope field for the differential equation, and ask
yourself if the approximations are reasonable. In particular, watch out that your numerical solution
does not “jump” over an unstable equilibrium solution.

9.4 Reducing the Error

Smaller Step Sizes
Suppose we are applying Euler’s method to some initial-value problem over some interval [x 0 , x max ] .
The one parameter we can adjust is the step size, $\Delta x$ (or, equivalently, the number of steps, $N$, in

Figure 9.5: Graphs of the different piecewise straight line approximations of the solution to $5y' - y^2 = -x^2$ with $y(0) = 1$ obtained by using Euler’s method with different values for the step size $\Delta x$ (the curves are labeled $\Delta x = 1$, $\Delta x = 1/2$, $\Delta x = 1/4$ and $\Delta x = 1/8$). Also graphed is the true solution. [Plot not reproduced; the horizontal axis covers $0 \le X \le 3$.]

going from x 0 to x max ). By shrinking x (increasing N ), at least two good things are typically
accomplished:

1. The error in the underlying approximation

y(x k + x) ≈ y(x k ) + x · f (x k , y(x k ))

is reduced.

2. The slope in the piecewise straight approximation y = y E,x (x) is recomputed at more

points, which means that this approximation can better match the bends in the slope ﬁeld for
the differential equation.
Both of these are illustrated in ﬁgure 9.4.
Accordingly, we should expect that shrinking the step size in Euler’s method will yield numerical
solutions that more accurately approximate the true solution. We can experimentally test this
expectation by going back to our initial-value problem

\[ 5\frac{dy}{dx} - y^2 = -x^2 \quad\text{with}\quad y(0) = 1 , \]

computing (as you’ll be doing for exercise 9.5) the numerical solutions arising from Euler’s method
using, say,

\[ \Delta x = 1 , \quad \Delta x = \frac{1}{2} , \quad \Delta x = \frac{1}{4} \quad\text{and}\quad \Delta x = \frac{1}{8} , \]

and then graphing the corresponding piecewise straight approximations over the interval $[0, 3]$ along
with the graph of the true solution. Do this, and you will get the graphs in figure 9.5.¹ As expected,
the graphs of the approximate solutions steadily approach the graph of the true solution as $\Delta x$
gets smaller. It’s even worth observing that the distance between the true value for $y(3)$ and the
approximated value appears to be cut roughly in half each time $\Delta x$ is cut in half.
¹ The graph of the “true solution” in figure 9.5 is actually the graph of a very accurate approximation. The difference
between this graph and the graph of the true solution is less than the thickness of the curve used to sketch it.
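
The experiment just described can be sketched in a few lines of Python. This is only a minimal illustration, not the program asked for in the exercises: the names `euler`, `f` and `errors_at_3` are our own, and the reference value for $y(3)$ is the one quoted in exercise 9.5 rather than something computed here.

```python
def euler(f, x0, y0, dx, n_steps):
    """Return the lists of x_k and y_k produced by n_steps of Euler's method."""
    xs, ys = [x0], [y0]
    for k in range(n_steps):
        # Tangent-line step: y_{k+1} = y_k + dx * f(x_k, y_k)
        ys.append(ys[-1] + dx * f(xs[-1], ys[-1]))
        # x_{k+1} is computed as x0 + (k+1)*dx instead of by repeated addition
        xs.append(x0 + (k + 1) * dx)
    return xs, ys


def f(x, y):
    """Right-hand side of 5 dy/dx - y^2 = -x^2 after solving for dy/dx."""
    return (y**2 - x**2) / 5.0


# Reference value for y(3), taken from the statement of exercise 9.5.
Y3_REFERENCE = -0.23699


def errors_at_3(step_sizes=(1.0, 0.5, 0.25, 0.125)):
    """For each step size, return (dx, approximation to y(3), |error|)."""
    results = []
    for dx in step_sizes:
        n_steps = round(3.0 / dx)
        _, ys = euler(f, 0.0, 1.0, dx, n_steps)
        results.append((dx, ys[-1], abs(ys[-1] - Y3_REFERENCE)))
    return results


if __name__ == "__main__":
    for dx, y_n, err in errors_at_3():
        print(f"dx = {dx:<6} y_N = {y_n: .5f}   |error| = {err:.5f}")
```

Printing the three columns side by side makes it easy to compare successive errors and see whether each halving of the step size roughly halves the error.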


In fact, our expectations can be rigorously confirmed. In the next section, we will analyze
the error in using Euler’s method to approximate $y(x_{\max})$ where $y$ is the solution to a first-order
initial-value problem

\[ \frac{dy}{dx} = f(x, y) \quad\text{with}\quad y(x_0) = y_0 . \]

Assuming $f$ is a “reasonably smooth” function of $x$ and $y$, we will discover that there is a
corresponding constant $M$ such that

\[ |y(x_{\max}) - y_N| < M \cdot \Delta x \tag{9.5} \]

where $y_N$ is the approximation to $y(x_{\max})$ generated from Euler’s method with step size $\Delta x$.
Inequality (9.5) is an error bound. It describes the worst theoretical error in using $y_N$ for
$y(x_{\max})$. In practice, the error may be much less than suggested by this bound, but it cannot be any
worse (unless there are other sources of error). Since this bound shrinks to zero as $\Delta x$ shrinks to
zero, we are assured that the approximations to $y(x_{\max})$ obtained by Euler’s method will converge
to the correct value of $y(x_{\max})$ if we repeatedly use the method with step sizes shrinking to zero. In
fact, if we know the value of $M$ and wish to keep the error below some small positive value, we can
use error bound (9.5) to pick a step size, $\Delta x$, that will ensure the error is below that desired value.
Unfortunately,

1. M can be fairly large.

2. In practice (as we will see), M can be difficult to determine.
3. Error bound (9.5) does not take into account the round-off errors that normally arise in
computations.
Let’s briefly consider the problem of round-off errors. Inequality (9.5) is only the error bound
arising from the theoretically best implementation of Euler’s method. In a sense, it is an “ideal error
bound” because it is based on all the computations being done with infinite precision. This is rarely
practical, even when using a computer math package that can do infinite precision arithmetic — the
expressions for the numbers rapidly become too complicated to be usable, even by the computer
math packages themselves. In practice, the numbers must be converted to approximations with finite
precision, say, decimal approximations accurate to the nearest 0.0001 as done in the table on page
181.
Don’t forget that the computations in each step involve numbers from previous steps, and these
computations are affected by the round-off errors from those previous steps. So the ultimate error due
to round-off will increase as the number of steps increases. With modern computers, the round-off
error resulting from each computation is usually very small. Consequently, as long as the number
of steps N remains relatively small, the total error due to round-off will usually be insignificant
compared to the basic error in Euler’s method. But if we attempt to reduce the error in Euler’s method
by making the step size very, very small, then we must take many, many more steps to go from $x_0$
to the desired $x_{\max}$. It is quite possible to reach a point where the accumulated round-off error will
negate the theoretical improvement in accuracy of the Euler method described by inequality (9.5).
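
To see how round-off can defeat a very small step size, here is a small Python sketch. It is an illustration under assumptions of our own: the test problem $y' = y$ with $y(0) = 1$ (whose true solution gives $y(1) = e$) is not one from the text, and the rounding to four decimal places after every step deliberately exaggerates what a real computer does, in the spirit of the 0.0001 accuracy mentioned above.

```python
import math


def euler_rounded(dx, n_steps, digits=4):
    """Euler's method for y' = y, y(0) = 1, rounding y after every step.

    The rounding imitates computing with only `digits` decimal places.
    """
    y = 1.0
    for _ in range(n_steps):
        y = round(y + dx * y, digits)
    return y


# Moderate step size: 100 steps from x = 0 to x = 1.
coarse = euler_rounded(0.01, 100)

# Tiny step size: 100000 steps over the same interval. Each increment
# dx * y is smaller than half of 0.0001, so it is rounded away entirely.
fine = euler_rounded(0.00001, 100000)

if __name__ == "__main__":
    print("dx = 0.01    :", coarse, " error:", abs(coarse - math.e))
    print("dx = 0.00001 :", fine, " error:", abs(fine - math.e))
```

With the tiny step size the computed value never moves away from the initial value, so the "more accurate" computation is in fact far worse than the coarse one.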

Better Methods
Be aware that Euler’s method is a relatively primitive method for numerically solving first-order
initial-value problems. Refinements on the method can yield schemes in which the approximations
to $y(x_{\max})$ converge to the true value much faster as the step size decreases. For example, instead
of using the tangent line approximation in each step,

\[ y_{k+1} = y_k + \Delta x \cdot f(x_k, y_k) , \]

we might employ a “tangent parabola” approximation that better accounts for the bend in the graphs.
(However, writing a program to determine this “tangent parabola” can be tricky.)
In other approaches, the $f(x_k, y_k)$ in the above equation is replaced with a cleverly chosen
weighted average of values of $f(x, y)$ computed at cleverly chosen points near $(x_k, y_k)$. The
idea is that this yields a straight line approximation with the slope adjusted to reduce the over-
or undershooting noted a page or two ago. At least two of the more commonly used methods, the
“improved Euler method” and the “fourth-order Runge-Kutta method”, take this approach.
Numerous other methods may also be worth learning if you are going to make extensive use of
numerical methods. However, an extensive discussion of numerical methods beyond Euler’s would
take us beyond the brief introduction to numerical methods intended by this author for this chapter.
So let us save a more complete discussion of these alternative methods for the future.
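
For the curious, the following Python sketch compares one Euler step with one step of the improved Euler method in its usual textbook form, in which the slope is the average of the slopes at the start of the step and at the Euler-predicted end of the step. The text above does not spell out this formula, so treat it as the standard formulation rather than as this chapter’s own; the test problem $y' = y$, $y(0) = 1$ and all function names are likewise assumptions made for illustration.

```python
import math


def euler_step(f, x, y, dx):
    """One tangent-line step of Euler's method."""
    return y + dx * f(x, y)


def improved_euler_step(f, x, y, dx):
    """One step of the improved Euler method (average of two slopes)."""
    slope_start = f(x, y)
    # Slope at the end of the step, evaluated at the Euler prediction
    slope_end = f(x + dx, y + dx * slope_start)
    return y + dx * 0.5 * (slope_start + slope_end)


def integrate(step, f, x0, y0, dx, n_steps):
    """Apply the given one-step rule n_steps times and return the final y."""
    x, y = x0, y0
    for k in range(n_steps):
        y = step(f, x, y, dx)
        x = x0 + (k + 1) * dx
    return y


def growth(x, y):
    """Assumed test problem y' = y, whose solution with y(0) = 1 is e^x."""
    return y


if __name__ == "__main__":
    for name, step in (("Euler", euler_step), ("improved Euler", improved_euler_step)):
        approx = integrate(step, growth, 0.0, 1.0, 0.1, 10)
        print(f"{name:15s} y(1) ~ {approx:.6f}   |error| = {abs(approx - math.e):.6f}")
```

Both rules use the same step size and the same number of steps; only the choice of slope differs.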

9.5 Error Analysis for Euler’s Method ∗
The Problem and Assumptions
Throughout this section, we will be concerned with the accuracy of numerical solutions to some
first-order initial-value problem

\[ \frac{dy}{dx} = f(x, y) \quad\text{with}\quad y(x_0) = y_0 . \tag{9.6} \]

The precise results will be given in theorem 9.1, below. For this theorem, $L$ is some finite length,
and we will assume there is a corresponding rectangle in the $XY$–plane

\[ R = \{ (x, y) : x_0 \le x \le x_0 + L \ \text{ and } \ y_{\min} < y < y_{\max} \} \]

such that all of the following hold:

1. $f$ and its first partial derivatives are continuous, bounded functions on $R$. This “boundedness”
means there are finite constants $A$, $B$ and $C$ such that, at each point in $R$,

\[ |f| \le A , \quad \left| \frac{\partial f}{\partial x} \right| \le B \quad\text{and}\quad \left| \frac{\partial f}{\partial y} \right| \le C . \tag{9.7} \]

2. There is a unique solution, $y = y(x)$, to the given initial-value problem valid over the
interval $[x_0, x_0 + L]$. (We’ll refer to $y = y(x)$ as the “true solution” in what follows.)
3. The rectangle $R$ contains the graph over the interval $[x_0, x_0 + L]$ of the true solution.
4. If $x_0 \le x_k \le x_0 + L$ and $(x_k, y_k)$ is any point generated by any application of Euler’s
method to solve our problem, then $(x_k, y_k)$ is in $R$.

The rectangle $R$ may be the entire vertical strip

\[ \{ (x, y) : x_0 \le x \le x_0 + L \ \text{ and } \ -\infty < y < \infty \} \]

if $f$ and its partial derivatives are bounded on this strip. If $f$ and its partial derivatives are not
bounded on this strip, then finding the appropriate upper and lower limits for this rectangle is one of
the challenges in using the theorem.
∗ A somewhat advanced discussion for the “interested reader”.


Theorem 9.1 (Error bound for Euler’s method)

Let $f$, $x_0$, $y_0$, $L$ and $R$ be as above, and let $y = y(x)$ be the true solution to initial-value problem
(9.6). Then there is a finite constant $M$ such that

\[ |y(x_N) - y_N| < M \cdot \Delta x \tag{9.8} \]

whenever

\[ \{ x_0, x_1, x_2, x_3, \ldots, x_N \} \quad\text{and}\quad \{ y_0, y_1, y_2, y_3, \ldots, y_N \} \]

is a numerical solution to initial-value problem (9.6) obtained from Euler’s method with step spacing
$\Delta x$ and total number of steps $N$ satisfying

\[ 0 < \Delta x \cdot N \le L . \tag{9.9} \]

This theorem is only concerned with the error inherent in Euler’s method. Inequality (9.8) does
not take into account errors arising from rounding off numbers during computation. For a good
discussion of round-off errors in computations, the interested reader should consult a good text on
numerical analysis.
To prove this theorem, we will derive a constant $M$ that makes inequality (9.8) true. (The
impatient can look ahead to equation (9.16) on page 192.) Accordingly, for the rest of this section,
$y = y(x)$ will denote the true solution to our initial-value problem, and

\[ \{ x_0, x_1, x_2, x_3, \ldots, x_N \} \quad\text{and}\quad \{ y_0, y_1, y_2, y_3, \ldots, y_N \} \]

will be an arbitrary numerical solution to initial-value problem (9.6) obtained from Euler’s method
with step spacing $\Delta x$ and total number of steps $N$ satisfying inequality (9.9).
Also, to simplify discussion, let us agree that, in all the following, $k$ always denotes an arbitrary
nonnegative integer less than $N$.

Preliminary Bounds
Our derivation of a value for $M$ will be based on several basic inequalities and facts from calculus.
These include the inequalities

\[ |A + B| \le |A| + |B| \quad\text{and}\quad \left| \int_a^b \psi(s)\,ds \right| \le \int_a^b |\psi(s)|\,ds \]

when $a < b$. Of course, if $|\psi(s)| \le K$ for some constant $K$, then, whether or not $a < b$,

\[ \left| \int_a^b |\psi(s)|\,ds \right| \le K\,|b - a| . \]

Also remember that, if $\phi = \phi(x)$ is continuous and differentiable, then

\[ \phi(b) - \phi(a) = \int_a^b \frac{d\phi}{ds}\,ds . \]

Combining the above, we get

Corollary 9.2
Assume $\phi$ is a continuous differentiable function on some interval. Assume further that $|\phi'| \le K$
on this interval for some constant $K$. Then, for any two points $a$ and $b$ in this interval,

\[ |\phi(a) - \phi(b)| \le K\,|b - a| . \]


We will use this corollary twice.

First, we apply it with $\phi(x) = f(x, y(x))$. Recall that, by the chain rule in chapter 7,

\[ \frac{d}{dx} f(x, y(x)) = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\,\frac{dy}{dx} , \]

which we can rewrite as

\[ \frac{d}{dx} f(x, y(x)) = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\, f(x, y) \]

whenever $y = y(x)$ is a solution to $y' = f(x, y)$. Applying bounds (9.7), this then yields

\[ \left| \frac{df}{dx} \right| \le \left| \frac{\partial f}{\partial x} \right| + \left| \frac{\partial f}{\partial y} \right| |f(x, y)| \le B + CA \quad\text{at every point in } R . \]

The above corollary (with $\phi(x) = f(x, y(x))$ and $K = B + CA$) then tells us that

\[ |f(a, y(a)) - f(b, y(b))| \le (B + CA)(b - a) \tag{9.10} \]

whenever $x_0 \le a \le b \le x_0 + L$.
The second application of the above corollary is with $\phi(y) = f(x_k, y)$. Here, $y$ is the variable,
$x$ remains constant, and $\phi' = \partial f/\partial y$. Along with the fact that $|\partial f/\partial y| \le C$ on rectangle $R$, this
corollary immediately gives us

\[ |f(x_k, b) - f(x_k, a)| \le C\,|b - a| \tag{9.11} \]

whenever $a$ and $b$ are any two points in the interval $(y_{\min}, y_{\max})$.

Maximum Error in the Underlying Approximation

Now consider the error in the underlying approximation

\[ y(x_k + \Delta x) \approx y(x_k) + \Delta x \cdot f(x_k, y(x_k)) . \]

Let $\epsilon_{k+1}$ be the difference between $y(x_k + \Delta x)$ and the above approximation,

\[ \epsilon_{k+1} = y(x_k + \Delta x) - [y(x_k) + \Delta x \cdot f(x_k, y(x_k))] . \]

Note that this can be rewritten both as

\[ y(x_{k+1}) = y(x_k) + \Delta x \cdot f(x_k, y(x_k)) + \epsilon_{k+1} \tag{9.12} \]

and as

\[ \epsilon_{k+1} = [y(x_k + \Delta x) - y(x_k)] - f(x_k, y(x_k)) \cdot \Delta x . \]

From basic calculus, we know that

\[ f(x_k, y(x_k)) \cdot \Delta x = f(x_k, y(x_k)) \int_{x_k}^{x_k + \Delta x} dx = \int_{x_k}^{x_k + \Delta x} f(x_k, y(x_k))\,dx . \]

We also know $y = y(x)$ satisfies $y' = f(x, y)$. Hence,

\[ y(x_k + \Delta x) - y(x_k) = \int_{x_k}^{x_k + \Delta x} \frac{dy}{dx}\,dx = \int_{x_k}^{x_k + \Delta x} f(x, y(x))\,dx . \]


Taking the absolute value of $\epsilon_{k+1}$ and applying the last three observations yields

\[
\begin{aligned}
|\epsilon_{k+1}| &= \left| [y(x_k + \Delta x) - y(x_k)] - f(x_k, y(x_k)) \cdot \Delta x \right| \\
&= \left| \int_{x_k}^{x_k + \Delta x} f(x, y(x))\,dx - \int_{x_k}^{x_k + \Delta x} f(x_k, y(x_k))\,dx \right| \\
&= \left| \int_{x_k}^{x_k + \Delta x} \left[ f(x, y(x)) - f(x_k, y(x_k)) \right] dx \right| \\
&\le \int_{x_k}^{x_k + \Delta x} \left| f(x, y(x)) - f(x_k, y(x_k)) \right| dx .
\end{aligned}
\]

Remarkably, we’ve already found an upper bound for the integrand in the last line (inequality (9.10),
with $a = x_k$ and $b = x$). Replacing this integrand with this upper bound, and then doing a little
elementary integration, yields

\[ |\epsilon_{k+1}| \le \int_{x_k}^{x_k + \Delta x} (B + CA)(x - x_k)\,dx = \frac{1}{2}(B + CA)(\Delta x)^2 . \]

This last inequality combined with equation (9.12) means that we can rewrite the underlying
approximation more precisely as

\[ y(x_{k+1}) = y(x_k) + \Delta x \cdot f(x_k, y(x_k)) + \epsilon_{k+1} \tag{9.13a} \]

where

\[ |\epsilon_{k+1}| \le \frac{1}{2}(B + CA)(\Delta x)^2 . \tag{9.13b} \]

Ideal Maximum Error in Euler’s Method

Now let $E_k$ be the difference between $y(x_k)$ and $y_k$,

\[ E_k = y(x_k) - y_k . \]

Because $y_0 = y(x_0)$:

\[ E_0 = y(x_0) - y_0 = 0 . \]

More generally, using formula (9.13a) for $y(x_k + \Delta x)$ and the formula for $y_{k+1}$ from Euler’s
method, we have

\[
\begin{aligned}
E_{k+1} &= y(x_{k+1}) - y_{k+1} \\
&= y(x_k + \Delta x) - y_{k+1} \\
&= y(x_k) + \Delta x \cdot f(x_k, y(x_k)) + \epsilon_{k+1} - [y_k + \Delta x \cdot f(x_k, y_k)] .
\end{aligned}
\]

Cleverly rearranging the last line and taking the absolute value lead to

\[
\begin{aligned}
|E_{k+1}| &= \left| \epsilon_{k+1} + [y(x_k) - y_k] + \Delta x \cdot [f(x_k, y(x_k)) - f(x_k, y_k)] \right| \\
&= \left| \epsilon_{k+1} + E_k + \Delta x \cdot [f(x_k, y(x_k)) - f(x_k, y_k)] \right| \\
&\le |\epsilon_{k+1}| + |E_k| + \left| \Delta x \cdot [f(x_k, y(x_k)) - f(x_k, y_k)] \right| .
\end{aligned}
\]

Fortunately, from inequality (9.13b), we know

\[ |\epsilon_{k+1}| \le \frac{1}{2}(B + CA)(\Delta x)^2 , \]


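and from inequality (9.11) and the definition of $E_k$, we know

\[ |f(x_k, y(x_k)) - f(x_k, y_k)| \le C\,|y(x_k) - y_k| = C\,|E_k| . \]

Combining the last three inequalities, we get

\[
\begin{aligned}
|E_{k+1}| &\le |\epsilon_{k+1}| + |E_k| + \left| \Delta x \cdot [f(x_k, y(x_k)) - f(x_k, y_k)] \right| \\
&\le \frac{1}{2}(B + CA)(\Delta x)^2 + |E_k| + \Delta x \cdot C\,|E_k| \\
&\le \frac{1}{2}(B + CA)(\Delta x)^2 + (1 + \Delta x \cdot C)\,|E_k| .
\end{aligned}
\]

This is starting to look ugly. So let

\[ \alpha = \frac{1}{2}(B + CA) \quad\text{and}\quad \beta = 1 + \Delta x \cdot C , \]

just so that the above inequality can be written more simply as

\[ |E_{k+1}| \le \alpha(\Delta x)^2 + \beta\,|E_k| . \]

Remember, $E_0 = 0$. Repeatedly applying the last inequality, we then obtain the following:

\[
\begin{aligned}
|E_1| = |E_{0+1}| &\le \alpha(\Delta x)^2 + \beta\,|E_0| = \alpha(\Delta x)^2 , \\
|E_2| = |E_{1+1}| &\le \alpha(\Delta x)^2 + \beta\,|E_1| \\
&\le \alpha(\Delta x)^2 + \beta\alpha(\Delta x)^2 = (1 + \beta)\,\alpha(\Delta x)^2 , \\
|E_3| = |E_{2+1}| &\le \alpha(\Delta x)^2 + \beta\,|E_2| \\
&\le \alpha(\Delta x)^2 + \beta(1 + \beta)\,\alpha(\Delta x)^2 \\
&= \alpha(\Delta x)^2 + \left(\beta + \beta^2\right)\alpha(\Delta x)^2 = \left(1 + \beta + \beta^2\right)\alpha(\Delta x)^2 , \\
&\ \ \vdots
\end{aligned}
\]

Continuing, we eventually get

\[ |E_N| \le S_N\,\alpha(\Delta x)^2 \quad\text{where}\quad S_N = 1 + \beta + \beta^2 + \cdots + \beta^{N-1} . \tag{9.14} \]

You may recognize $S_N$ as a partial sum for a geometric series. Whether you do or not, we have

\[
\begin{aligned}
(\beta - 1) S_N &= \beta S_N - S_N \\
&= \beta\left[ 1 + \beta + \beta^2 + \cdots + \beta^{N-1} \right] - \left[ 1 + \beta + \beta^2 + \cdots + \beta^{N-1} \right] \\
&= \left[ \beta + \beta^2 + \cdots + \beta^{N} \right] - \left[ 1 + \beta + \beta^2 + \cdots + \beta^{N-1} \right] \\
&= \beta^N - 1 .
\end{aligned}
\]

Dividing through by $\beta - 1$ and recalling what $\alpha$ and $\beta$ represent then give us

\[
S_N\,\alpha = \frac{\beta^N - 1}{\beta - 1}\,\alpha
= \frac{(1 + \Delta x \cdot C)^N - 1}{1 + \Delta x \cdot C - 1} \cdot \frac{B + CA}{2}
= \frac{\left[ (1 + \Delta x \cdot C)^N - 1 \right](B + CA)}{\Delta x \cdot 2C} .
\]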


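So inequality (9.14) can be rewritten as

\[ |E_N| \le \frac{(1 + \Delta x \cdot C)^N - 1}{\Delta x \cdot C}\,\alpha(\Delta x)^2 . \]

Dividing out one $\Delta x$ leaves us with

\[ |E_N| \le M_{N,\Delta x} \cdot \Delta x \quad\text{where}\quad M_{N,\Delta x} = \frac{\left[ (1 + \Delta x \cdot C)^N - 1 \right](B + CA)}{2C} . \tag{9.15} \]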
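
As a sanity check on the algebra leading to (9.15), the following Python sketch iterates the worst case of the recursion $|E_{k+1}| \le \alpha(\Delta x)^2 + \beta|E_k|$ (that is, with equality at every step, starting from $E_0 = 0$) and compares the result with $M_{N,\Delta x} \cdot \Delta x$. The numerical values used for $A$, $B$, $C$, $\Delta x$ and $N$ are arbitrary choices of ours, made only for illustration.

```python
def worst_case_error(A, B, C, dx, n_steps):
    """Iterate e_{k+1} = alpha*dx^2 + beta*e_k from e_0 = 0, n_steps times."""
    alpha = 0.5 * (B + C * A)
    beta = 1.0 + dx * C
    e = 0.0
    for _ in range(n_steps):
        e = alpha * dx**2 + beta * e
    return e


def closed_form_bound(A, B, C, dx, n_steps):
    """Right-hand side of (9.15): M_{N,dx} * dx."""
    m = ((1.0 + dx * C) ** n_steps - 1.0) * (B + C * A) / (2.0 * C)
    return m * dx


if __name__ == "__main__":
    # Hypothetical constants, chosen only to exercise the formulas.
    A, B, C, dx, N = 1.0, 0.0, 1.0, 0.1, 10
    print(worst_case_error(A, B, C, dx, N))
    print(closed_form_bound(A, B, C, dx, N))
```

The two printed numbers should agree up to floating-point round-off, since the closed form is just the geometric sum $S_N\,\alpha(\Delta x)^2$ written out.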

The claim of theorem 9.1 is almost proven with inequality (9.15). All we need to do now is to
find a single constant $M$ such that $M_{N,\Delta x} \le M$ for all possible choices of $N$ and $\Delta x$. To this
end, recall the Taylor series for the exponential,

\[ e^X = \sum_{n=0}^{\infty} \frac{1}{n!} X^n = 1 + X + \frac{1}{2} X^2 + \frac{1}{6} X^3 + \cdots . \]

If $X > 0$ then

\[ 1 + X < 1 + X + \frac{1}{2} X^2 + \frac{1}{6} X^3 + \cdots = e^X . \]

Cutting out the middle and letting $X = \Delta x \cdot C$, this becomes

\[ 1 + \Delta x \cdot C < e^{\Delta x \cdot C} . \]

Thus,

\[ (1 + \Delta x \cdot C)^N < \left( e^{\Delta x \cdot C} \right)^N = e^{N \Delta x \cdot C} \le e^{LC} \]

where $L$ is that constant with $N \Delta x \le L$. So

\[ M_{N,\Delta x} = \frac{\left[ (1 + \Delta x \cdot C)^N - 1 \right](B + CA)}{2C} < M \]

where

\[ M = \frac{\left( e^{LC} - 1 \right)(B + CA)}{2C} . \tag{9.16} \]

And this (finally) completes our proof of theorem 9.1 on page 188.
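
To see formula (9.16) in action, here is a minimal Python sketch for the problem $dy/dx = \cos(y)$ with $y(0) = 0$ (the equation from exercise 9.2 d) over $[0, 1]$. The choices below are ours: since $|f| \le 1$, $\partial f/\partial x = 0$ and $|\partial f/\partial y| = |\sin(y)| \le 1$ on the whole vertical strip, we may take $A = 1$, $B = 0$, $C = 1$ and $L = 1$. The exact solution used for comparison, $y(x) = 2\arctan(\tanh(x/2))$, is a standard closed form that can be checked by differentiation; it is not derived in this chapter.

```python
import math


def euler_final(f, x0, y0, dx, n_steps):
    """Return y_N, the last value produced by n_steps of Euler's method."""
    y = y0
    for k in range(n_steps):
        y = y + dx * f(x0 + k * dx, y)
    return y


def error_constant(A, B, C, L):
    """The constant M of equation (9.16)."""
    return (math.exp(L * C) - 1.0) * (B + C * A) / (2.0 * C)


def f(x, y):
    return math.cos(y)


def exact(x):
    """Closed-form solution of y' = cos(y), y(0) = 0."""
    return 2.0 * math.atan(math.tanh(x / 2.0))


M = error_constant(A=1.0, B=0.0, C=1.0, L=1.0)

if __name__ == "__main__":
    for n_steps in (5, 10, 100):
        dx = 1.0 / n_steps
        err = abs(exact(1.0) - euler_final(f, 0.0, 0.0, dx, n_steps))
        print(f"dx = {dx:<5} |error| = {err:.6f}   bound M*dx = {M * dx:.6f}")
```

In runs like this the actual error is typically well below $M \cdot \Delta x$, in line with the earlier remark that the bound describes the worst theoretical error.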

Additional Exercises

9.1. Several initial-value problems are given below, along with values for two of the three
parameters in Euler’s method: step size $\Delta x$, number of steps $N$, and maximum variable
of interest $x_{\max}$. For each, find the corresponding numerical solution using Euler’s
method with the indicated parameter values. Do these problems without a calculator or
computer.
a. $\dfrac{dy}{dx} = \dfrac{y}{x}$ with $y(1) = -1$ ; $\Delta x = \dfrac{1}{3}$ and $N = 3$
b. $\dfrac{dy}{dx} = -8xy$ with $y(0) = 10$ ; $x_{\max} = 1$ and $N = 4$


c. $4x + \dfrac{dy}{dx} = y^2$ with $y(0) = 2$ ; $x_{\max} = 2$ and $\Delta x = \dfrac{1}{2}$
d. $\dfrac{dy}{dx} + \dfrac{y}{x} = 4$ with $y(1) = 8$ ; $\Delta x = \dfrac{1}{2}$ and $N = 6$
9.2. Again, several initial-value problems are given below, along with values for two of the
three parameters in Euler’s method: step size $\Delta x$, number of steps $N$, and maximum
variable of interest $x_{\max}$. For each, find the corresponding numerical solution using Euler’s
method with the indicated parameter values. Do these problems with a (nonprogrammable)
calculator.
a. $\dfrac{dy}{dx} = 2x + y$ with $y(0) = 0$ ; $\Delta x = \dfrac{1}{2}$ and $N = 6$
b. $(1 + y)\dfrac{dy}{dx} = x$ with $y(0) = 1$ ; $N = 6$ and $x_{\max} = 2$
c. $\dfrac{dy}{dx} = y^x$ with $y(1) = 2$ ; $\Delta x = 0.1$ and $x_{\max} = 1.5$
d. $\dfrac{dy}{dx} = \cos(y)$ with $y(0) = 0$ ; $\Delta x = \dfrac{1}{5}$ and $N = 5$
9.3 a. Using your favorite computer language or computer math package, write a program or
worksheet for finding the numerical solution to an arbitrary first-order initial-value problem
using Euler’s method. Make it easy to change the differential equation and the
computational parameters (step size, number of steps, etc.).²,³
b. Test your program/worksheet by using it to re-compute the numerical solutions for the
problems in exercise 9.2, above.

9.4. Using your program/worksheet from exercise 9.3 a with each of the following step sizes,
find an approximation for $y(5)$ where $y = y(x)$ is the solution to

\[ \frac{dy}{dx} = \sqrt[3]{x^2 + y^2 + 1} \quad\text{with}\quad y(0) = 0 . \]

a. $\Delta x = 1$   b. $\Delta x = 0.1$   c. $\Delta x = 0.01$   d. $\Delta x = 0.001$

9.5. Let $y$ be the (true) solution to the initial-value problem considered in section 9.2,

\[ 5\frac{dy}{dx} - y^2 = -x^2 \quad\text{with}\quad y(0) = 1 . \]

For each step size $\Delta x$ given below, use your program/worksheet from exercise 9.3 a to
find an approximation to $y(3)$. Also, for each, find the magnitude of the error (to the
nearest 0.0001) in using the approximation for $y(3)$, assuming the correct value of $y(3)$
is $-0.23699$.
a. $\Delta x = 1$   b. $\Delta x = \dfrac{1}{2}$   c. $\Delta x = \dfrac{1}{4}$   d. $\Delta x = \dfrac{1}{8}$
e. $\Delta x = 0.01$   f. $\Delta x = 0.001$   g. $\Delta x = 0.0001$

² If your computer math package uses infinite precision or symbolic arithmetic, you may have to include commands to
ensure your results are given as decimal approximations.
³ It may be easier to compute all the $x_k$’s first, and then compute the $y_k$’s.

9.6. Consider the initial-value problem

\[ \frac{dy}{dx} = (y - 1)^2 \quad\text{with}\quad y(0) = -\frac{13}{10} . \]

This is the problem discussed in section 9.3 in the illustration of a “catastrophic failure” of
Euler’s method.
a. Find the exact solution to this initial-value problem using methods developed in earlier
chapters. What, in particular, is the exact value of y(4) ?
b. Using your program/worksheet from exercise 9.3 a, find the numerical solution to the
above initial-value problem with $x_{\max} = 4$ and step size $\Delta x = 1/2$. (Also, confirm that
this numerical solution has been properly plotted in figure 9.3 on page 182.)
c. Find the approximation to $y(4)$ generated by Euler’s method with each of the following
step sizes (use your answer to the previous part or your program/worksheet from exercise
9.3 a). Also, compute the magnitude of the error in using this approximation for the exact
value found in the first part of this exercise.
i. $\Delta x = 1$   ii. $\Delta x = \dfrac{1}{2}$   iii. $\Delta x = \dfrac{1}{4}$   iv. $\Delta x = \dfrac{1}{10}$

9.7. Consider the following initial-value problem

dy
= −4y with y(0) = 3 .
dx

The following will illustrate the importance of choosing appropriate step sizes.
a. Find the numerical solution using Euler’s method with $\Delta x = 1/2$ and $N$ being any large
integer (this will be more easily done by hand than using a calculator!). Then do the
following:
i. There will be a pattern to the $y_k$’s. What is that pattern? What happens as $k \to \infty$?
ii. Plot the piecewise straight approximation corresponding to your numerical solution
along with a slope field for the above differential equation. Using these plots, decide
whether your numerical solution accurately describes the true solution, especially as $x$
gets large.
iii. Solve the above initial-value problem exactly using methods developed in earlier chapters.
What happens to $y(x)$ as $x \to \infty$? Compare this behavior to that of your
numerical solution. In particular, what is the approximate error in using $y_k$ for $y(x_k)$
when $x_k$ is large?
b. Now find the numerical solution to the above initial-value problem using Euler’s method
with $\Delta x = 1/10$ and $N$ being any large integer (do this by hand, looking for patterns in
the $y_k$’s). Then do the following:
i. Find a relatively simple formula describing the pattern in the $y_k$’s.
ii. Plot the piecewise straight approximation corresponding to this numerical solution along
with a slope field for the above differential equation. Does this numerical solution appear
to be significantly better (more accurate) than the one found in part 9.7 a?

9.8. In this problem we’ll see one danger of blindly applying a numerical method to solve an
initial-value problem. The initial-value problem is

\[ \frac{dy}{dx} = \frac{3}{7 - 3x} \quad\text{with}\quad y(0) = 0 . \]

a. Find the numerical solution to this using Euler’s method with step size $\Delta x = 1/2$ and
$x_{\max} = 5$. (Use your program/worksheet from exercise 9.3 a.)
b. Sketch the piecewise straight approximation corresponding to the numerical solution just
found.
c. Sketch the slope field for this differential equation, and find the exact solution to the above
initial-value problem by simple integration.
d. What happens in the true solution as $x \to 7/3$?
e. What can be said about the approximations to $y(x_k)$ obtained in the first part when
$x_k > 7/3$?

9.9. What goes wrong with attempting to find a numerical solution to

\[ (2y - 1)^2 \frac{dy}{dx} = 1 \quad\text{with}\quad y(0) = 0 \]

using Euler’s method with step size $\Delta x = 1/2$?


10
The Art and Science of Modeling with
First-Order Equations

For some, “modeling” is the building of small plastic replicas of famous ships; for others, “modeling”
means standing in front of cameras wearing silly clothing; for us, “modeling” is the process of
developing sets of equations and formulas describing some process of interest. This process may be
the falling of a frozen duck, the changes in a population over time, the consumption of fuel by a car
traveling various distances, the accumulation of wealth by one individual or company, the cooling of
a cup of coffee, the electronic transmission of sound and images from a television station to a home
television, or any of a huge number of other processes affecting us. A major goal of modeling, of
course, is to predict “how things will turn out” at some point of interest, be it a point of time in the
future or a position along the road. Along with this, naturally, is often a desire to use the model to
determine changes we can make to the process to force things to turn out as we desire.
Of course, some things are more easily modeled mathematically than others. For example, it will
certainly be easier to mathematically describe the number of rabbits in a field than to mathematically
describe the various emotions of these rabbits. Part of the art of modeling is the determination of
which quantities the model will deal with (e.g., “number of rabbits” instead of “emotional states”).
Another part of modeling is the balancing between developing as complete a model as possible
by taking into account all possible influences on the process as opposed to developing a simple and
easy to use model by the use of simplifying assumptions and simple approximations. Attempting
to accurately describe all possible influences usually leads to such a complicated set of equations
and formulas that the model (i.e., the set of equations and formulas we’ve developed) is unusable.
A model that is too simple, on the other hand, may lead to wildly inaccurate predictions, and, thus,
would also not be a useful model.
Here, we will examine various aspects of modeling using first-order differential equations. This
will be done mainly by looking at a few illustrative examples, though, in a few sections, we will also
discuss how to go about developing and using models with first-order differential equations more
generally.

10.1 Preliminaries
Suppose we have a situation in which some measurable quantity of interest (e.g.: velocity of a
falling duck, number of rabbits in a ﬁeld, gallons of fuel in a vehicle, amount of money in a bank,
temperature of a cup of coffee) varies as some basic parameter (such as time or position) changes.
For convenience, let’s assume the parameter is time and denote that parameter by t , as is traditional.


Recall that, if

\[ Q(t) = \text{the amount of that measurable quantity at time } t , \]

then

\[ \frac{dQ}{dt} = \text{the rate at which } Q \text{ varies as } t \text{ varies} . \]

If we can identify what controls this rate, and can come up with a formula $F(t, Q)$ describing how
this rate depends on $t$ and $Q$, then

\[ \frac{dQ}{dt} = F(t, Q) \]

gives us a first-order differential equation for $Q$ which, with a little luck, can be solved to obtain a
general formula for $Q$ in terms of $t$. At the very least, we will be able to construct this equation’s
slope field and sketch graphs of $Q(t)$.
Our development of the “improved falling object model” in chapter 1.2 is a good example of this
sort of modeling. Go back to page 11 and take a quick look at it. There, the ‘measurable quantity’ is
the velocity v (in meters/second); the rate at which it varies with time, dv/dt , is the acceleration, and
we were able to determine a formula F for this acceleration by determining and adding together the
accelerations due to gravity and air resistance,

\[
\begin{aligned}
F(t, v) &= \text{total acceleration} \\
&= \text{acceleration due to gravity} + \text{acceleration due to air resistance} \\
&= (-9.8) + (-\kappa v)
\end{aligned}
\]

where $\kappa$ is some positive constant that would have to be determined by experiment. This gave us
the first-order differential equation

\[ \frac{dv}{dt} = F(t, v) = -9.8 - \kappa v , \]

which we were later able to solve and analyze.
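
For readers who like to experiment, this model is easy to explore numerically with Euler’s method from chapter 9. The Python sketch below is only an illustration: the value $\kappa = 0.5$, the initial velocity $v(0) = 0$ and the step size are hypothetical choices of ours (the text says $\kappa$ must be found by experiment), and the closed-form expression used for comparison is the solution of this linear equation with $v(0) = 0$.

```python
import math

KAPPA = 0.5  # hypothetical air-resistance constant, in 1/second


def acceleration(t, v):
    """F(t, v) = -9.8 - kappa*v from the improved falling object model."""
    return -9.8 - KAPPA * v


def euler_velocity(t_end, dt, v0=0.0):
    """Approximate v(t_end) by Euler's method with time step dt."""
    n_steps = round(t_end / dt)
    v = v0
    for k in range(n_steps):
        v = v + dt * acceleration(k * dt, v)
    return v


def exact_velocity(t):
    """Solution of dv/dt = -9.8 - kappa*v with v(0) = 0."""
    return (9.8 / KAPPA) * (math.exp(-KAPPA * t) - 1.0)


if __name__ == "__main__":
    for t in (2.0, 10.0, 30.0):
        print(f"t = {t:5.1f}  Euler: {euler_velocity(t, 0.01): .4f}"
              f"  exact: {exact_velocity(t): .4f}")
    print("terminal velocity -9.8/kappa =", -9.8 / KAPPA)
```

For large $t$ both columns settle near $-9.8/\kappa$, the velocity at which the two accelerations cancel.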

In what follows, we will develop models for several other situations. We will also, in section
10.5, give further advice on developing your own models with ﬁrst-order differential equations. Be
sure to observe how we develop these models and to read the notes in section 10.5. You will be
developing more models in the exercises and, maybe later, in real life.

10.2 A Rabbit Ranch

The Situation to be Modeled
Pretend we’ve been given a breeding pair of rabbits along with acres and acres of prime rabbit range
with no predators. Let us further assume this rabbit range is fenced in so that no rabbits can escape
or come in, and so that no predators can come in. We release the rabbits, planning to return in a few
years (say, ﬁve) to harvest rabbits for the Easter trade.
An obvious question is How many rabbits will we have on our rabbit ranch in ﬁve years?


Setting Up the Model

Experienced modelers typically begin by drawing a simple, enlightening picture of the process (if
appropriate) and identifying the relevant variables (as we did for the “falling object model” — see
page 8). Since the author could not think of a particularly appropriate and enlightening picture, we
will skip the picture and go straight to identifying the obvious variables of interest. They are ‘time’
and ‘the number of rabbits’, which we will naturally denote using the symbols t and R , respectively.
The time t can be measured in seconds, days, months, years, centuries, etc. We will use months,
with t = 0 being the time the rabbits were released. So,

t = number of months since the rabbits were released

and
R = R(t) = number of rabbits at time t .

Since we started with a pair of rabbits, the initial condition is

R(0) = 2 . (10.1)

Now, since $t$ is being measured in months,

\[
\begin{aligned}
\frac{dR}{dt} &= \text{rate } R \text{ varies as } t \text{ varies} \\
&= \text{change in the number of rabbits per month} .
\end{aligned}
\]

Because the fence prevents rabbits escaping or coming in, the change in the number of rabbits is due
entirely to the number of births and deaths in our rabbit population. Thus,

\[
\begin{aligned}
\frac{dR}{dt} &= \text{change in the number of rabbits per month} \\
&= \text{number of births per month} - \text{number of deaths per month} .
\end{aligned}
\tag{10.2}
\]

Now we need to model the “number of births per month” and the “number of deaths per month”.
Starting with the birth process, and assuming that half the population are females, we note that

\[
\begin{aligned}
\text{number of births per month}
&= \text{number of births per female rabbit per month} \times \text{number of female rabbits that month} \\
&= \text{number of births per female rabbit per month} \times \frac{1}{2} R .
\end{aligned}
\]

(We are also assuming that all the females are capable of having babies, no matter what their age.
Well, these are rabbits; they marry young.)
It seems reasonable to assume the average number of births per female rabbit per month is a
constant. For future convenience, let

\[ \beta = \frac{1}{2} \times \text{number of births per female rabbit per month} . \]

This is the “monthly birth rate per rabbit” and allows us to write

\[ \text{number of births per month} = \beta R . \tag{10.3} \]


Checking a reliable reference on rabbits (any decent encyclopedia will do), it can be found that,
on the average, each female rabbit has 6 litters per year with 5 bouncy baby bunnies in each litter.
Hence, since there are 12 months in a year,

\[
\begin{aligned}
\beta &= \frac{1}{2} \times \text{number of births per female rabbit per month} \\
&= \frac{1}{2} \times \frac{1}{12} \times \text{number of births per female rabbit per year} \\
&= \frac{1}{2} \times \frac{1}{12} \times 6 \times 5 .
\end{aligned}
\]

That is,

\[ \beta = \frac{5}{4} . \tag{10.4} \]

What about the death rate? Since there are no predators and plenty of food, it seems reasonable
to assume old age is the main cause of death. Again checking a reliable reference on rabbits, it can
be found that the average life span for a rabbit is 10 years. Clearly, then, the number of deaths per
month will be negligible compared to the number of births. So we will assume

\[ \text{number of deaths per month} = 0 . \tag{10.5} \]

Combining equations (10.2), (10.3) and (10.5), we obtain

\[
\begin{aligned}
\frac{dR}{dt} &= \text{number of births per month} - \text{number of deaths per month} \\
&= \beta R - 0 .
\end{aligned}
\]

That is,

\[ \frac{dR}{dt} = \beta R \tag{10.6} \]

where $\beta$ is the average monthly birth rate per rabbit.¹
Of course, equation (10.6) does not just apply to the situation being considered here. The same
equation would have been obtained for the changing population of any creature having zero death
rate and a constant birth rate $\beta$ per unit time per creature. But the problem at hand involves rabbits,
and for rabbits, we derived $\beta = 5/4$. This, the above differential equation, and the fact that we started
with two rabbits means that $R(t)$ must satisfy

\[ \frac{dR}{dt} = \frac{5}{4} R \quad\text{with}\quad R(0) = 2 . \]

This is our “model”.

Using Our Model

Our differential equation is

\[ \frac{dR}{dt} = \beta R \quad\text{with}\quad \beta = \frac{5}{4} . \]

This is a simple separable and linear differential equation. You can easily show that its general
solution is

\[ R(t) = A e^{\beta t} . \]
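
Since $R(0) = A e^{0} = A$, the initial condition $R(0) = 2$ forces $A = 2$. The short Python sketch below simply evaluates the resulting formula $R(t) = 2e^{\beta t}$ at a few times; the function and constant names are our own, and the finite-difference check at the end is just a quick numerical way of confirming that the formula satisfies $dR/dt = \beta R$.

```python
import math

LITTERS_PER_YEAR = 6      # from the rabbit reference quoted in the text
BUNNIES_PER_LITTER = 5    # likewise
MONTHS_PER_YEAR = 12

# beta = (1/2) * (births per female per month), as in equation (10.4)
BETA = 0.5 * LITTERS_PER_YEAR * BUNNIES_PER_LITTER / MONTHS_PER_YEAR


def rabbits(t, r0=2.0):
    """R(t) = r0 * exp(beta * t), with t in months since release."""
    return r0 * math.exp(BETA * t)


def growth_rate_check(t, h=1e-6):
    """Return (central-difference estimate of dR/dt, beta * R(t))."""
    derivative = (rabbits(t + h) - rabbits(t - h)) / (2.0 * h)
    return derivative, BETA * rabbits(t)


if __name__ == "__main__":
    for months in (0, 6, 12, 60):
        print(f"t = {months:2d} months: R(t) = {rabbits(months):.4g}")
    print("dR/dt vs beta*R at t = 3:", growth_rate_check(3.0))
```

The value printed for $t = 60$ months (five years) is the model’s answer to the question posed at the start of this section, and its sheer size is a hint that the model’s assumptions cannot hold for that long.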
¹ In developing this equation, we equated an “instantaneous rate of change”, $dR/dt$, to a “change in the number of rabbits
per month”, and then found a formula for that “monthly change” based on the value of $R$ “at time $t$” instead of over the
entire month. If this bothers you, see appendix 10.8 on page 216.

2. the “average birth rate per individual per unit of time” $\beta_0$ is constant,
and
3. the “average death rate per individual per unit of time” $\delta_0$ is constant (i.e., a constant fraction
$\delta_0$ of the population dies off during each unit of time);

then

\[
\begin{aligned}
\frac{dP}{dt} &= \text{change in the number of individuals per unit time} \\
&= \text{number of births per unit time} - \text{number of deaths per unit time} \\
&= \beta_0 P(t) - \delta_0 P(t) .
\end{aligned}
\]

Letting $\beta$ be the “net birth rate per individual per unit time”,

\[ \beta = \beta_0 - \delta_0 , \]

this reduces to

\[ \frac{dP}{dt} = \beta P(t) , \tag{10.8} \]

the solution of which, as we already know, is

\[ P(t) = P_0 e^{\beta t} = P_0 e^{(\beta_0 - \delta_0) t} \quad\text{where}\quad P_0 = P(0) . \]

If $\beta_0 > \delta_0$, then the model predicts that the population will grow exponentially. If $\beta_0 < \delta_0$, then
the model predicts that the population will decline exponentially. And if $\beta_0 = \delta_0$, then the model
predicts that the population remains static.
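
The three cases can be seen directly by evaluating the solution formula. The Python sketch below does just that; the function name, the initial population of 100 and the particular birth and death rates are made-up values chosen only to exhibit growth, decline and a static population.

```python
import math


def population(t, p0, birth_rate, death_rate):
    """P(t) = P0 * exp((beta0 - delta0) * t), the solution of equation (10.8)."""
    return p0 * math.exp((birth_rate - death_rate) * t)


# Hypothetical rates (per individual per unit time), one set for each case.
CASES = {
    "growth  (beta0 > delta0)": (0.30, 0.10),
    "decline (beta0 < delta0)": (0.10, 0.30),
    "static  (beta0 = delta0)": (0.20, 0.20),
}

if __name__ == "__main__":
    for label, (beta0, delta0) in CASES.items():
        values = [population(t, 100.0, beta0, delta0) for t in (0, 5, 10)]
        print(label, [round(v, 2) for v in values])
```

Only the difference $\beta_0 - \delta_0$ matters: two populations with very different birth and death rates but the same net rate follow the same curve.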
This is a simple model whose accuracy depends on the validity of the three basic assumptions
made above. In many cases, these assumptions are reasonably acceptable during the early
stages of the process, and, initially, we do see exponential growth of populations, say, of yeast added
to grape juice or of a new species of plants or animals introduced to a region where it can thrive.
As illustrated in our “rabbit ranch” example, however, this is too simple a model to describe the
long-term growth of most biological populations.
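As a rough numerical illustration of the three cases just described, the following Python sketch evaluates P(t) = P0 e^((β0 − δ0)t). The function name `population` and the sample rates are assumptions chosen only for illustration; they are not values from the text.

```python
import math

def population(t, p0, birth_rate, death_rate):
    """Evaluate P(t) = P0 * exp((beta0 - delta0) * t)."""
    net_rate = birth_rate - death_rate   # beta = beta0 - delta0
    return p0 * math.exp(net_rate * t)

# Hypothetical rates (per unit time), chosen only to show the three cases.
growing = population(10, 100.0, 0.3, 0.1)     # beta0 > delta0: population grows
declining = population(10, 100.0, 0.1, 0.3)   # beta0 < delta0: population declines
static = population(10, 100.0, 0.2, 0.2)      # beta0 = delta0: population stays at P0
```

Only the difference β0 − δ0 enters the formula, which is why the text can work with the single "net birth rate" β.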

Natural Radioactive Decay

The effect of radioactive decay on the amount of some radioactive isotope can be described by
a model completely analogous to the general population model just discussed. Assume we start
with some amount (say, a kilogram) of some radioactive isotope of interest (say, uranium–235).
During any given length of time, there is a certain probability that any given atom of that material
will spontaneously decay into a smaller atom along with associated radiation and other atomic and
subatomic particles. Thus, the amount we have of that particular radioactive isotope will decrease

2 Precisely what “birth” or “death” means may depend on the creatures/plants in the population. For a microbe, “birth”
may be when a parent cell divides into two copies of itself. If the population is the set of people infected with a particular
disease, then “birth” occurs when a person contracts the disease.

as more and more of the atoms decay (provided there is not some other material that decays into the
isotope of interest).
Let’s assume we have some radioactive isotope of interest, and that there is no other radioactive
material decaying into that isotope. For convenience, let A(t) denote the amount of that radioactive
material at time t , and let δ be the fraction of the material that decays per unit time. In essence, the
decay of an atom is the death of that atom, and this δ is essentially the same as the δ0 in the above
population growth discussion. Virtually the same analysis done to obtain equation (10.8) (but using
A instead of P , and noting that β0 = 0 since no new atoms of the isotope are being “born”) then
yields
\[ \frac{dA}{dt} = -\delta A(t) . \]
Solving this differential equation then gives us

\[ A(t) = A_0 e^{-\delta t} \quad\text{with}\quad A_0 = A(0) . \tag{10.9} \]

Because radioactive decay is a probabilistic event, and because there are typically huge numbers
of atoms in any sample of radioactive material, the laws of probability and statistics ensure that this is
usually a very accurate model over long periods of time (unlike the case with biological populations).
The positive constant δ , called the decay rate, is different for each different isotope. It is large
if the isotope is very unstable and a large fraction of the atoms decay in a given time period, and it
is small if the isotope is fairly stable and only a small fraction of the atoms decay in the same time
period. In practice, the decay rate δ is usually described indirectly through the half-life τ1/2 of the
isotope, which is the time it takes for half of the original amount to decay. Using the above formula
for A(t) , you can easily verify that τ1/2 and δ are related by the equation

\[ \delta \times \tau_{1/2} = \ln 2 . \tag{10.10} \]

?Exercise 10.1: Derive equation (10.10). Use formula (10.9) for A(t) and the fact that, by the
definition of τ1/2 ,
\[ A(\tau_{1/2}) = \frac{1}{2} A(0) . \]

!Example 10.1: Cobalt-60 is a radioactive isotope of cobalt with a half-life of approximately
5.27 years.3 Using equation (10.10), we find that its (approximate) decay constant is given by
\[ \delta = \frac{\ln 2}{\tau_{1/2}} = \frac{\ln 2}{5.27 \text{ (years)}} \approx 0.1315 \text{ (per year)} . \]

Combining this with formula (10.9) gives us

\[ A(t) \approx A_0 e^{-0.1315 t} \quad\text{with}\quad A_0 = A(0) \]

as the formula for the amount of undecayed cobalt remaining after t years.
Suppose we initially have 10 grams of cobalt-60. At the end of one year, those 10 grams
would have decayed to approximately

\[ (10 \text{ gm.}) \times e^{-0.1315 \times 1} \approx 8.77 \text{ grams of cobalt-60} . \]

At the end of two years, those 10 grams would have decayed to approximately

\[ (10 \text{ gm.}) \times e^{-0.1315 \times 2} \approx 7.69 \text{ grams of cobalt-60} . \]

3 Cobalt-60 has numerous medical applications, as well as having the potential as an ingredient in some particularly nasty
nuclear weapons. It is produced by exposing cobalt-59 to “slow” neutrons, and decays to a stable nickel isotope after giving
off one electron and two gamma rays.

And at the end of ten years, those 10 grams would have decayed to approximately

\[ (10 \text{ gm.}) \times e^{-0.1315 \times 10} \approx 2.68 \text{ grams of cobalt-60} . \]
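The arithmetic in this example can be checked with a short Python sketch. It uses only equation (10.10) and formula (10.9); the names `HALF_LIFE_YEARS`, `decay_rate`, and `amount_remaining` are illustrative choices, and the decay rate is left unrounded rather than truncated to 0.1315.

```python
import math

HALF_LIFE_YEARS = 5.27                        # approximate half-life of cobalt-60
decay_rate = math.log(2) / HALF_LIFE_YEARS    # delta, from equation (10.10)

def amount_remaining(initial_grams, years):
    """Evaluate A(t) = A0 * exp(-delta * t), formula (10.9)."""
    return initial_grams * math.exp(-decay_rate * years)

# Print the amount left from 10 grams after 1, 2, and 10 years.
for years in (1, 2, 10):
    print(f"after {years:2d} years: {amount_remaining(10.0, years):.2f} g")
```

To two decimal places, the printed amounts should agree with the values computed by hand above.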

10.4 The Rabbit Ranch, Again

Back to wrangling rabbits.

The Situation (and Problem)

Recall that we imagined ourselves having a fenced-in ranch enclosing many acres of prime rabbit
range. We start with a breeding pair of rabbits, and plan to return in five years. The question is: How
many rabbits will we have then?
In section 10.2, we attempted to answer this question using a fairly simple model we had just
developed. However, the predicted number of rabbits after ﬁve years (which had a corresponding
mass a thousand times greater than that of the Sun) was clearly absurd. That model did not account
for the problems arising when a population of rabbits grows too large. Let us now see if we can
derive a more realistic model.

A Better Model
Again, we let
R(t) = number of rabbits after t months
with R(0) = 2 . We still have
\[
\begin{aligned}
\frac{dR}{dt} &= \text{change in the number of rabbits per month} \\
&= \text{number of births per month} - \text{number of deaths per month} .
\end{aligned}
\tag{10.11}
\]

However, the assumptions that

\[ \text{number of deaths per month} = 0 \]
and
\[ \text{number of births per month} = \beta R \]
where
\[ \beta = \text{monthly birth rate per rabbit} = \frac{5}{4} \]

are too simplistic. As the population increases, the amount of range land (and, hence, food) per
rabbit decreases. Eventually, the population may become too large for the available ﬁelds to support
all the rabbits. Some will starve to death, and those female rabbits that survive will be malnourished
and will give birth to fewer bunnies. In addition, overcrowding is conducive to the spread of diseases
which, in a population already weakened by hunger, can be devastating.

Clearly, we at least need to correct our formula for the number of deaths per month, because,
once overcrowding begins, we can expect a certain fraction of the population to die each month.
Letting δ denote that fraction,

number of deaths per month = fraction of the population that dies each month
× number of rabbits

= δR .

Keep in mind that this fraction δ , which we can call the monthly death rate per rabbit, will not be
constant. It will depend on just how overcrowded the rabbits are. In other words, δ will vary with
R , and, thus, should be treated as a function of R , δ = δ(R) . Just how δ varies with R is yet
unknown, but it should be clear that
if R is small, then overcrowding is not a problem and δ(R) should be close to zero,
and
as R increases, then overcrowding increases and more rabbits start dying. So, δ(R)
should increase as R increases.
The simplest function of R for δ satisfying the two above conditions is

\[ \delta = \delta(R) = \gamma_D R \]

where γ_D is some positive constant. This gives us

\[ \text{number of deaths per month} = \delta R = (\gamma_D R) R = \gamma_D R^2 . \]

What sort of “correction” should we now consider for

number of births per month = β R ?

Well, as with the monthly death rate δ , above, we should expect the monthly birth rate per rabbit,
β , to be a function of the number of rabbits, β = β(R) . Moreover:
If R is small, then overcrowding is not a problem and β(R) should be close to its ideal
value β0 = 5/4 ,
and
as R increases, then more rabbits become malnourished, and females have fewer babies
each month. So, β(R) should decrease from its ideal value as R increases.
A simple formula describing this is obtained by subtracting from the ideal birth rate a simple
correction term proportional to the number of rabbits,

\[ \beta = \beta(R) = \beta_0 - \gamma_B R \]

where β0 = 5/4 is the ideal monthly birth rate per rabbit and γ_B is some positive constant.4 This
then gives us

\[ \text{number of births per month} = \beta R = (\beta_0 - \gamma_B R) R = \beta_0 R - \gamma_B R^2 . \]
4 Yes, the birth rate becomes negative if R becomes large enough, and negative birth rates are not realistic. But we are still
trying for as simple a model as feasible; with luck, R will not get large enough that the birth rate becomes negative.

As with our simpler model, the one we are developing can be applied to populations of other
organisms by using the appropriate value for the ideal birth rate per organism, β0 . For rabbits, we
have β0 = 5/4 .
Combining the above formulas for the monthly number of births and deaths with the generic
differential equation (10.11) yields
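\[
\begin{aligned}
\frac{dR}{dt} &= \text{number of births per month} - \text{number of deaths per month} \\
&= \beta R - \delta R \\
&= \beta_0 R - \gamma_B R^2 - \gamma_D R^2 ,
\end{aligned}
\]

which, letting γ = γ_B + γ_D , simplifies to

\[ \frac{dR}{dt} = \beta_0 R - \gamma R^2 \tag{10.12} \]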

where β0 = 5/4 is the ideal monthly birth rate per rabbit and γ is some positive constant. Presumably,
γ could be determined by observation (if this new model does accurately describe the situation).

Using the Better Model

Equation (10.12) is called the logistic equation and was an important development in the study of
population dynamics. It is a relatively simple separable equation that can be solved without too
much difficulty. But let’s not, at least, not yet. Instead, let us first rewrite this equation by factoring
out γ R ,
\[ \frac{dR}{dt} = \gamma R \left( \frac{\beta_0}{\gamma} - R \right) . \]
From this it is obvious that our differential equation has two constant solutions,
\[ R = 0 \quad\text{and}\quad R = \frac{\beta_0}{\gamma} . \]

The ﬁrst tells us that, if we start with no rabbits, then we get no rabbits in the future (no surprise
there). The second is more interesting. For convenience, let κ = β0/γ . Our differential equation can
then be written as
\[ \frac{dR}{dt} = \gamma R (\kappa - R) \quad\text{with}\quad \gamma = \frac{\beta_0}{\kappa} , \tag{10.13} \]
and the two constant solutions are

R = 0 and R = κ .

(We probably should note that κ , as a ratio of positive constants, is a positive constant.)
While we are at it, let’s further observe that, if 0 < R < κ , then
\[ \frac{dR}{dt} = \underbrace{\gamma R}_{>0} \, \underbrace{(\kappa - R)}_{>0} > 0 . \]

In other words, if 0 < R < κ , then the population is increasing.

On the other hand, if κ < R , then
\[ \frac{dR}{dt} = \underbrace{\gamma R}_{>0} \, \underbrace{(\kappa - R)}_{<0} < 0 . \]

Figure 10.1: Slope Fields for Logistic Equation (10.13): (a) A minimal field for a generic logistic
equation and (b) a slope field for the logistic equation with κ = 100 and β0 = 1/10 .
The graphs of a few solutions have been roughly sketched on each. (Both panels plot R against T ;
slopes are positive for 0 < R < κ and negative for R > κ .)

That is, the population will be decreasing if κ < R .

We can graphically represent these observations using the crude slope field sketched in figure
10.1a. This figure suggests that, over time, the number of rabbits will stabilize around κ . If there
are initially fewer than κ rabbits (but at least some), then the rabbit population will increase towards
a total of κ rabbits. If there are initially more than κ rabbits, then the population will decrease
towards a total of κ rabbits. This prediction is reflected in the more carefully constructed slope
field in figure 10.1b. Because κ is the maximum number of rabbits that can exist in the long term
given the resources available, κ is often called the carrying capacity of the ‘system’ consisting of
the rabbits and their environment. (Of course, if the carrying capacity is too low, say, κ ≈ 0.5 then,
realistically, all the rabbits will die.)
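The behavior suggested by the slope fields can also be seen numerically. The sketch below applies a crude Euler method to equation (10.13) with the values used for figure 10.1b; the step size, the two starting populations, and the function names are assumptions made only for illustration.

```python
def logistic_rate(r, kappa, beta0):
    """Right side of equation (10.13): gamma * R * (kappa - R), gamma = beta0/kappa."""
    gamma = beta0 / kappa
    return gamma * r * (kappa - r)

def euler_logistic(r0, kappa, beta0, t_end, dt=0.01):
    """Crude Euler approximation of R(t_end) starting from R(0) = r0."""
    r = r0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        r += dt * logistic_rate(r, kappa, beta0)
    return r

KAPPA, BETA0 = 100.0, 0.1   # the values used for figure 10.1b

# Hypothetical starting populations, one below kappa and one above it.
from_below = euler_logistic(10.0, KAPPA, BETA0, t_end=60.0)
from_above = euler_logistic(125.0, KAPPA, BETA0, t_end=60.0)
```

Both approximations end close to κ = 100, the first approaching from below and the second from above, consistent with the sign analysis of dR/dt.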
Finding the precise formula for R(t) will be left to you (exercise 10.7). What you will show
is that, in terms of the carrying capacity κ , ideal birth rate β0 , and initial population R0 = R(0) ,
\[ R(t) = \frac{\kappa R_0}{R_0 + (\kappa - R_0) e^{-\beta_0 t}} . \tag{10.14} \]
This formula reflects the fact that there are three basic parameters in our model: the ideal monthly
birth rate β0 , the initial number of rabbits R(0) , and the carrying capacity of the system κ . The
first two we know or can figure out from basic biology. The last, κ , will have to be determined “from
experiment”. For example, we might return a year after releasing the original pair (at t = 12 ), count
the number of rabbits on the ranch, R(12) , and then use this value along with the known values for
R0 and β0 in formula (10.14) to create an equation for κ . Solving that equation will then give us
κ . This, of course, assumes that the model is fairly accurate — an assumption that would require
further experiment to verify or disprove. But, at least the model’s prediction regarding the population
growth seems a good deal more reasonable than that made by the simpler model in section 10.2.
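The procedure just described for finding κ can be sketched in Python. Solving formula (10.14) for κ gives κ = R(t) R0 (1 − e^(−β0 t)) / (R0 − R(t) e^(−β0 t)), which makes sense only when the observed count R(t) is smaller than R0 e^(β0 t). The function names and the observed count of 1000 rabbits are hypothetical, used only to show the computation.

```python
import math

def logistic_population(t, r0, kappa, beta0):
    """Formula (10.14): R(t) = kappa*R0 / (R0 + (kappa - R0)*exp(-beta0*t))."""
    return kappa * r0 / (r0 + (kappa - r0) * math.exp(-beta0 * t))

def carrying_capacity(t, r_t, r0, beta0):
    """Solve formula (10.14) for kappa, given one observed count R(t) = r_t."""
    decay = math.exp(-beta0 * t)
    return r_t * r0 * (1.0 - decay) / (r0 - r_t * decay)

R0, BETA0 = 2.0, 5.0 / 4.0   # values from the rabbit ranch model
observed = 1000.0            # hypothetical count at t = 12 months
kappa = carrying_capacity(12.0, observed, R0, BETA0)
```

Substituting the computed κ back into `logistic_population` at t = 12 reproduces the observed count, which is a quick check on the algebra.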

10.5 Notes on the Art and Science of Modeling

Our current interest is in modeling situations in which the rate at which some quantity varies is fairly
well understood. In these sorts of problems, it is often “relatively easy” to develop a ﬁrst-order
differential equation to serve as the basis for a mathematical model for the situation. We’ve seen
several examples already and will see more in the next few sections. But now, let us pause to discuss
some of the steps and issues in developing and using such models.

First Steps in the Modeling Process

Naturally, one of your very first steps in modeling something should be to learn whatever you believe
is needed for developing the model. Then identify and label the significant basic variables and decide
on the units associated with these variables. In our rabbit ranch problems, those variables were R
and t (with associated units rabbits and months, respectively); in the following, we’ll use Q for the
generic quantity of interest and assume it varies over time t .
Next, write out everything you know using these variables. This includes any initial values
you may have for any of the variables. In our rabbit ranch problem, we did not know much at first,
only the initial value for R , R(0) = 2 . If you can draw an illuminating picture representing the
situation, do so and label it for easy reference.
Then turn your attention to deriving a differential equation that accurately models the rate at
which Q varies with t , dQ/dt . Do not attempt to directly derive a formula for Q(t) , at least, not
with the sort of problems being considered here. We are now dealing with problems for which it is
much easier to first find a formula F(t, Q) for dQ/dt , and then find Q(t) by solving the resulting
differential equation.

Developing the Differential Equation for the Model

Coming up with a usable differential equation
\[ \frac{dQ}{dt} = F(t, Q) \]
that accurately models how Q’s rate of change depends on t and Q is the most important, and, for
many, the hardest part of the modeling process. After all, as you now know, anyone can solve
a first-order differential equation (or, at least, construct a slope field for one). Coming up with the
right differential equation can be much more challenging.
Here are a few things you can do to make it less challenging:

Identify and Describe the Processes Driving the Model

Keep in mind that dQ/dt is the rate at which Q(t) changes as t changes. This rate depends on
the processes driving the situation, not on the particular value of Q at some particular time. In
particular, the value of Q(0) is irrelevant in setting up the differential equation.
Once you’ve determined your variables and drawn your illuminating pictures, write out5
\[ \frac{dQ}{dt} = \text{the change in } Q \text{ per unit time} \]

5 As noted in an earlier footnote, we are equating an “instantaneous rate of change”, dQ/dt , to a “change in Q per unit time”,
which, in turn, will be based on the values of Q and t at a specific time t instead of over the unit interval of time. For a
more detailed analysis justifying this, see appendix 10.8 on page 216.

and then identify the different processes that cause that Q to change. In our examples with rabbits,
these processes were “birth” and “death”, and we initially observed that
\[
\begin{aligned}
\frac{dR}{dt} &= \text{change in the number of rabbits per month} \\
&= \text{number of births per month} - \text{number of deaths per month} .
\end{aligned}
\]

Then we worked out how many births and deaths should be expected each month given that we had
R rabbits that month.
In general, you want to identify the different processes that cause Q(t) to increase (e.g., births
and immigration into the region) and to decrease (e.g., deaths and emigration out of the region).
Each of these processes corresponds to a different term in F(t, Q) (remember to add those terms
corresponding to processes that increase Q , and subtract those terms corresponding to processes
that decrease Q ). For example, if Q(t) is the number of, say, people in a certain region at time t ,
we may have
\[ \frac{dQ}{dt} = F(t, Q) \]
where
\[ F(t, Q) = F_{\text{birth}} + F_{\text{immig}} - F_{\text{death}} - F_{\text{emig}} \]
with
Fbirth = number of births per unit time ,
Fimmig = number of people immigrating into the region per unit time ,
Fdeath = number of deaths per unit time ,
and
Femig = number of people emigrating out of the region per unit time .

Once you’ve identified the different terms making up F(t, Q) (e.g., the above Fbirth , Fimmig ,
etc.), take each term, consider the process it is supposed to describe, and try to come up with a
reasonable formula describing the change in Q during a unit time interval due to that process alone.
Often, that formula will involve just Q , itself. For example, in our first “rabbit ranch” example
(with R = Q ),
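\[
\begin{aligned}
F_{\text{birth}} &= \text{number of births per month} \\
&= \text{number of births per female rabbit per month} \times \text{number of female rabbits that month} \\
&= \cdots \\
&= \beta R \quad\text{with}\quad \beta = \frac{5}{4} .
\end{aligned}
\]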
Often, you will make ‘simplifications’ and ‘assumptions’ to keep the model from becoming too
complicated. In the above formula for Fbirth , for example, we did not attempt to account for seasonal
variations in birth rate, and we assumed that half the rabbit population were breeding females. We
also assumed a constant monthly birth rate and death rate per rabbit, no matter how many rabbits we
had.
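The bookkeeping described in this subsection (one term per process, added if the process increases Q and subtracted if it decreases Q) can be sketched in Python as follows. All of the rates and the helper name `rate_of_change` are hypothetical, chosen only to show the structure of F(t, Q).

```python
def rate_of_change(q, t, increases, decreases):
    """F(t, Q): add the terms that raise Q, subtract the terms that lower it."""
    gained = sum(term(t, q) for term in increases)
    lost = sum(term(t, q) for term in decreases)
    return gained - lost

# Hypothetical per-unit-time terms for the number of people Q in a region.
def births(t, q):
    return 0.02 * q      # assumed birth rate per person

def immigration(t, q):
    return 50.0          # assumed constant inflow of people

def deaths(t, q):
    return 0.01 * q      # assumed death rate per person

def emigration(t, q):
    return 0.005 * q     # assumed outflow rate per person

dq_dt = rate_of_change(10_000.0, 0.0,
                       increases=[births, immigration],
                       decreases=[deaths, emigration])
```

Keeping each process in its own function mirrors the advice above: every term can be examined, and later refined, on its own.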

Balance the Units

As already noted, one of the ﬁrst steps in modeling a situation is to decide on the main variables and
to choose the units for measuring these variables. The subsequent computations and derivations are
all in terms of these units, and we can often avoid embarrassing mistakes by just keeping track of our
units and being sure that their use is consistent. In particular, the units implicit in any equation must
remain balanced; that is, each term in any equation must have the same associated units as every
other term in that equation.
For example, the basic quantities in our rabbit ranch models are R and t . Even though we
treated these as numbers, we knew that

R = number of rabbits and t = time in months .

So the units associated with R and t are, respectively, rabbits and months. Consequently, the units
associated with
\[ \frac{dR}{dt} = \lim_{\Delta t \to 0} \frac{R(t + \Delta t) - R(t)}{\Delta t} = \lim_{\Delta t \to 0} \frac{\text{change in the number of rabbits}}{\text{change in time as measured in months}} \]
are rabbits/month (i.e., rabbits per month), and every term in any formula for dR/dt must also have
rabbits/month as the associated units. If someone suggested a term that was not “rabbits per month”,
then that term would be wrong and should be immediately rejected. Thus, for example,
\[ \frac{dR}{dt} = R t \]

is clearly wrong because the right side is describing rabbits × months, not rabbits/month .
The constants used in our derivations also have associated units. The monthly birth rate per
rabbit,
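\[
\begin{aligned}
\beta &= \text{number of rabbits born per month per rabbit on the ranch} \\
&= \frac{\text{number of rabbits born}}{\text{month}} \text{ per rabbit on the ranch} \\
&= \frac{\text{number of rabbits born}}{\text{month} \times \text{rabbit}} ,
\end{aligned}
\]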

has associated units 1/month (since the units of rabbits cancel out), and if we had wanted to be a bit
more explicit, we would have written equation (10.4) on page 200 as
\[ \beta = \frac{5}{4} \left( \frac{1}{\text{month}} \right) \]

instead of just
\[ \beta = \frac{5}{4} . \]
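The unit check described above can be mechanized in a small way. In the Python sketch below, a unit is represented as a mapping from base unit to exponent, and multiplying quantities adds exponents; this representation and the names `multiply`, `RABBITS`, `MONTHS`, and `PER_MONTH` are assumptions made for illustration, not a standard library facility.

```python
from collections import Counter

def multiply(*units):
    """Multiply units, each given as a mapping from base unit to exponent."""
    total = Counter()
    for unit in units:
        for base, power in unit.items():
            total[base] += power
    # Drop base units whose exponents cancelled to zero.
    return {base: power for base, power in total.items() if power != 0}

RABBITS = {"rabbit": 1}
MONTHS = {"month": 1}
PER_MONTH = {"month": -1}

dR_dt_units = multiply(RABBITS, PER_MONTH)    # rabbits/month
beta_R_units = multiply(PER_MONTH, RABBITS)   # beta has units 1/month
R_t_units = multiply(RABBITS, MONTHS)         # rabbits x months

print(beta_R_units == dR_dt_units)   # True: the term beta*R balances
print(R_t_units == dR_dt_units)      # False: the term R*t must be rejected
```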
Often, you will not see the units being explicitly noted throughout the development and use of
a model. There are several possible reasons for this:

1. If the formulas and equations are correctly developed, then the units in these formulas and
equations naturally remain balanced. The modeler knows this and trusts his or her skill in
modeling.

2. The writer assumes the readers can keep track of the units themselves.

3. The writer is lazy or needs to save space.

There is much to be said in favor of explicitly giving the units associated with every element of every
formula and equation. It helps prevent stupid mistakes and may help clarify the meaning of some
of the formulas for the reader (and for the model builder). We will do this, somewhat, in the next
major example. Beginning modelers are strongly encouraged to keep track of the units in every step
as they develop their own equations. At the very least, stop every so often and check that the units in
the equations balance. If not, you did something wrong — go back, ﬁnd your error, and correct it.
Oh yes, one more thing about “units”: Be sure that anyone who is going to read your work or
use any model you’ve developed knows the units you are using.6

Testing and Using the Model

Once you’ve developed a differential equation modeling the way a quantity of interest Q(t) varies,
you will normally want to solve that differential equation or otherwise analyze it to see what it
says about Q(t) for various choices of t . If the situation being modeled is fairly simple and
straightforward, and your modeling skills are adequate, then your model can probably be trusted to
give fairly accurate predictions.
In practice, it is usually wise to check and see if predictions based on this model are reasonable
before announcing your new model to the world. After all, it is quite possible that some of your
‘simplifications’ and ‘assumptions’ overly simplified your model and caused important issues to be
ignored. That certainly happened with our first rabbit ranch model, in which assuming constant
birth and death rates resulted in a model predicting far more rabbits in five years than is possible.
If your predictions are not reasonable, go back, revisit your derivations, and see where a more
careful modeling of the individual processes leads. If necessary, learn more about the processes
themselves.7 This should lead to a refined model for Q(t) that, in turn, leads to more reasonable
projections as to the behavior of Q(t) . The differential equation will probably be more complicated,
but that is the price you pay for a better, more accurate model.
Of course, you should not automatically assume that ‘apparently reasonable’ predictions are
accurate. If possible, compare results predicted by the model to real world data. You may need to do
this anyway to determine the values of some of the constants in your model. Hopefully, the results
predicted and the real world data will agree well enough that you can feel confident that your model
is sufficiently accurate for the desired applications. If not, refine your model further.
By the way, in using your model, keep in mind the simplifications and assumptions made in
deriving it so that you have some idea as to the limitations of this model.

10.6 Mixing Problems

In a “mixing problem”, some substance is continually being added to some container in which the
substance is mixed with some other material, and with the resulting mixture being constantly drained
off at some rate. This container may be a large tank, a lake, or the system of veins and arteries in a
body; and the substance being added may be some chemical, pollutant, or medicine being added to
the liquid in the tank, the water in the lake, or the blood in the body. These problems are favorites
of authors of differential equation texts because they can be modeled fairly easily using the basic

6 In 1999, the Mars Climate Orbiter crashed into Mars instead of orbiting the planet because the Orbiter’s software gave
instructions in terms of the imperial system (which measures force in pounds) while the hardware assumed the metric
system (which measures force in newtons – with 1 pound ≈ 4.45 newtons). This failure to communicate the units being
used caused an embarrassing end to a space project costing over 300 million dollars.
7 This author once read a paper describing a ‘new’ model for “laser interaction with a solid material”. Using this model, you
could then show that any solid can be chilled to absolute zero by suitably heating it with a laser — a rather dubious result.
That paper’s author should have better tested his model and learned more about thermodynamics.

Figure 10.2: Figure illustrating a simple mixing problem. (A 90% alcohol-water mix flows in at
2 gallons/minute to a tank with 500 gallons of mix, and the tank mixture flows out at 2 gallons/minute.)

observation that (usually)

the rate the amount of substance in the container changes

= the rate the substance is added − the rate the substance is drained off .

We will do one simple mixing problem, and then brieﬂy mention some possible variations.

A Simple Mixing Problem

The Situation to Be Modeled
We start out with a large tank containing 500 gallons of pure water. Each minute thereafter, two
gallons of an alcohol-water mix are added, and two gallons of the mixture in the tank are drained.
The alcohol-water mix being added is 90 percent alcohol. Throughout this entire process, we assume
the mixture in the tank is thoroughly and uniformly mixed. The problem is to develop a formula
describing the amount of alcohol in the tank at any given time. In particular, let’s determine if and
when the mixture in the tank is 50 percent alcohol.

Setting Up the Model

In this case, a simple, illustrative picture for the process is easily drawn. Just see ﬁgure 10.2. We
will let
t = number of minutes since we started adding the alcohol-water mix
and
y = y(t) = gallons of pure alcohol in the tank at time t .

Since we started with a tank containing pure water (no alcohol), the initial condition is

y(0) = 0 .

Our derivation of the differential equation modeling the change in y starts with the observation
that
\[
\begin{aligned}
\frac{dy}{dt} &= \text{change in the amount of alcohol in the tank per minute} \\
&= \text{rate alcohol is added to the tank} - \text{rate alcohol is drained from the tank} .
\end{aligned}
\]

Since we are adding 2 gallons per minute of a 90 percent alcohol mix,

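\[
\begin{aligned}
\text{rate alcohol is added to the tank}
&= 2 \, \frac{\text{gallons of input mix}}{\text{minute}} \times \frac{90}{100} \, \frac{\text{gallons of alcohol}}{\text{gallons of input mix}} \\
&= \frac{9}{5} \, \frac{\text{gallons of alcohol}}{\text{minute}} .
\end{aligned}
\]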

In determining how much is being drained away, we must determine the concentration of alcohol in
the tank’s mixture at any given time, which is simply the total amount of alcohol in the tank at that
time (i.e., y(t) gallons) divided by the total amount of the mixture in the tank (which, because we
drain off as much as we add, remains constant at 500 gallons). So,

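\[
\begin{aligned}
\text{rate alcohol is drained from the tank}
&= 2 \, \frac{\text{gallons of tank mix}}{\text{minute}} \times \text{amount of alcohol per gallon of tank mix} \\
&= 2 \, \frac{\text{gallons of tank mix}}{\text{minute}} \times \frac{y(t) \text{ (gallons of alcohol)}}{500 \text{ (gallons of tank mix)}} \\
&= \frac{y(t)}{250} \, \frac{\text{gallons of alcohol}}{\text{minute}} .
\end{aligned}
\]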

Combining the above gives us

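\[
\begin{aligned}
\frac{dy}{dt} &= \text{rate alcohol is added to the tank} - \text{rate alcohol is drained from the tank} \\
&= \left( \frac{9}{5} - \frac{y(t)}{250} \right) \frac{\text{gallons of alcohol}}{\text{minute}} .
\end{aligned}
\]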

Thus, the initial-value problem that y = y(t) must satisfy is

\[ \frac{dy}{dt} = \frac{9}{5} - \frac{y}{250} \quad\text{with}\quad y(0) = 0 . \tag{10.15} \]

Using the Model

Factoring out 1/250 on the right side of our differential equation yields

\[ \frac{dy}{dt} = \frac{1}{250} (450 - y) . \]
From this we see that
y = 450
is the only constant solution. Moreover,
\[ \frac{dy}{dt} = \frac{1}{250} (450 - y) > 0 \quad\text{if}\quad y < 450 , \]
and
\[ \frac{dy}{dt} = \frac{1}{250} (450 - y) < 0 \quad\text{if}\quad y > 450 . \]
So, we should expect the graphs of the possible solutions to this differential equation to be something
like the curves in ﬁgure 10.3. In other words, no matter what the initial condition is, we should expect
y(t) to approach 450 as t → ∞ .

Figure 10.3: Crude graphs of solutions to the simple mixing problem from figure 10.2 based on
the sign of dy/dt . (The graph plots Y against T ; dy/dt < 0 in the region above y = 450 and
dy/dt > 0 in the region below it.)

Fortunately, the differential equation at hand is fairly simple. It (the differential equation in
initial-value problem (10.15)) is both separable and linear, and, using either the method we developed
for separable equations or the method we developed for linear equations, you can easily show that

\[ y(t) = 450 - A e^{-t/250} \]

is the general solution. Note that, as t → ∞ ,

\[ y(t) = 450 - A e^{-t/250} \to 450 - A \cdot 0 = 450 , \]

just as ﬁgure 10.3 suggests. Consequently, no matter how much alcohol is originally in the tank,
eventually there will be nearly 450 gallons of alcohol in the tank. Since the tank holds 500 gallons
of mix, the concentration of the alcohol in the mix will eventually be nearly 450/500 = 9/10 (i.e., 90
percent of the liquid in the tank will be alcohol).
For our particular problem, y(0) = 0 . So,

\[ 0 = y(0) = 450 - A e^{-0/250} = 450 - A . \]

Hence, A = 450 and

\[ y(t) = 450 - 450 e^{-t/250} . \]
Finally, recall that we wanted to know when the mixture in the tank is 50 percent alcohol. This
will be the time when half the liquid in the tank (i.e., 250 gallons) is alcohol. Letting τ denote this
time, we must have
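\[
\begin{aligned}
& 250 = y(\tau) = 450 - 450 e^{-\tau/250} \\
\rightarrow\quad & 450 e^{-\tau/250} = 450 - 250 \\
\rightarrow\quad & e^{-\tau/250} = \frac{200}{450} = \frac{4}{9} \\
\rightarrow\quad & -\frac{\tau}{250} = \ln \frac{4}{9} = -\ln \frac{9}{4} .
\end{aligned}
\]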

So the mixture in the tank will be half alcohol at time

\[ \tau = 250 \ln \frac{9}{4} \approx 202.7 \text{ (minutes)} . \]
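As a check on this computation, the following Python sketch evaluates τ and compares the formula for y(t) with a crude Euler approximation of initial-value problem (10.15). The constant and function names and the step size are assumptions made for illustration.

```python
import math

INFLOW_RATE = 9.0 / 5.0   # gallons of alcohol added per minute
TANK_VOLUME = 500.0       # gallons of mix in the tank (constant)
DRAIN_RATE = 2.0          # gallons of tank mix drained per minute

def alcohol_in_tank(t):
    """Solution of (10.15): y(t) = 450 - 450*exp(-t/250)."""
    return 450.0 - 450.0 * math.exp(-t / 250.0)

def euler_alcohol(t_end, dt=0.01):
    """Euler approximation of y(t_end) for dy/dt = 9/5 - y/250, y(0) = 0."""
    y = 0.0
    for _ in range(int(round(t_end / dt))):
        y += dt * (INFLOW_RATE - DRAIN_RATE * y / TANK_VOLUME)
    return y

tau = 250.0 * math.log(9.0 / 4.0)   # time at which y = 250 gallons
print(f"tau = {tau:.1f} minutes")
print(f"formula: {alcohol_in_tank(tau):.3f} gal, Euler: {euler_alcohol(tau):.3f} gal")
```

Both values of y(τ) should be close to 250 gallons, that is, half of the 500 gallons in the tank.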

1. Instead of adding an alcohol-water mix, we may be adding a mixture of so many ounces of
some chemical (such as sugar or salt) dissolved in the water (or other solvent).

2. The ﬂow rate into the tank may be different from the drainage ﬂow rate. In this case, the
volume of the mixture in the tank will be changing, and that will affect how the concentration
in the tank is computed.

3. We may have the problem considered in our simple mixing problem, but with some of the
drained ﬂow being diverted to a machine that magically converts a certain fraction of the
alcohol to water, and the ﬂow from that machine being dumped back into the tank. (Think
of that machine as the tank’s ‘liver’.)

4. Instead of adding an alcohol-water mix, we may be adding a certain quantity of some
microorganism (yeast, E. coli bacteria, etc.) in a nutrient solution. Then we would have to consider a
mixture/population dynamics model to also account for the growth of the microorganism in
the tank, as well as the in-flow and drainage.

5. And so on … .

10.7 Simple Thermodynamics

Bring a hot cup of coffee into a cool room, and, in time, the coffee cools down to room temperature.
Put a similar hot cup of coffee into a refrigerator, and you will discover that the coffee cools down
faster. Let’s try to describe this cooling process a little more precisely.
To be a little more general, let us simply assume we have some object (such as a hot cup of
coffee or a cold glass of water) that we place in a room in which the air is at temperature Troom . To
keep matters simple, assume Troom remains constant. Let T = T (t) be the temperature at time t
of the object we placed in the room. As time t goes on, we expect T to approach Troom . Now
consider
\[ \frac{dT}{dt} = \text{rate at which } T \text{ approaches } T_{\text{room}} \text{ as time } t \text{ increases} . \]
It should seem reasonable that this rate at any instant of time t depends just on the difference between
the temperature of the object and the temperature of the room, T − Troom ; that is,
\[ \frac{dT}{dt} = F(T - T_{\text{room}}) \tag{10.16} \]
for some function F . Moreover,

1. If T − Troom = 0 , then the object is the same temperature as the room. In this case, we
do not expect the object’s temperature to change. Hence, we should have dT/dt = 0 when
T = Troom .

2. If T − Troom is a large positive value, then the object is much warmer than the room. We then
expect the object to be rapidly cooling; that is, T should be a rapidly decreasing function of
t . Hence dT/dt should be large and negative.

3. If T − Troom is a large negative value, then the object is much cooler than the room. We then
expect the object to be rapidly warming; that is, T should be a rapidly increasing function
of t . Hence dT/dt should be large and positive.

In terms of the function F on the right side of equation (10.16), these three observations mean

T − Troom = 0 ⇒ F(T − Troom ) = 0 ,

T − Troom is a large positive value ⇒ F(T − Troom ) is a large negative value

and
T − Troom is a large negative value ⇒ F(T − Troom ) is a large positive value .

The simplest choice of F satisfying these three conditions is

F(T − Troom ) = −κ(T − Troom )

where κ is some positive constant. Plugging this into equation (10.16) yields
\[ \frac{dT}{dt} = -\kappa (T - T_{\text{room}}) . \tag{10.17} \]

This equation is often known as Newton’s law of heating and cooling. The positive constant κ
describes how easily heat ﬂows between the object and the air, and must be determined by experiment.
Equation (10.17) states that the rate of change of the object’s temperature is proportional to the
difference in the temperatures of the object and the room. It’s not exactly the same as equation (10.8)
on page 202 (unless Troom = 0 ), but it is quite similar in spirit. We’ll leave its solution and further
discussion as exercises for the reader.
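
Readers who want to check their work on that exercise numerically may find the following sketch useful. It compares one candidate closed-form solution, T(t) = Troom + (T0 − Troom) e^(−κt), against a forward Euler integration of equation (10.17). The function names and the sample values (initial temperature 90, room temperature 20, κ = 0.1) are illustrative assumptions, not values from the text.

```python
import math

def closed_form_temperature(t, T0, T_room, kappa):
    """Candidate solution of dT/dt = -kappa (T - T_room) with T(0) = T0."""
    return T_room + (T0 - T_room) * math.exp(-kappa * t)

def euler_temperature(t_end, T0, T_room, kappa, steps=100_000):
    """Integrate equation (10.17) from 0 to t_end with the forward Euler method."""
    dt = t_end / steps
    T = T0
    for _ in range(steps):
        T += dt * (-kappa * (T - T_room))
    return T

# Illustrative values only (assumed, not from the text).
T0, T_ROOM, KAPPA = 90.0, 20.0, 0.1
for t in (0.0, 10.0, 30.0):
    print(t,
          closed_form_temperature(t, T0, T_ROOM, KAPPA),
          euler_temperature(t, T0, T_ROOM, KAPPA))
```

If the two columns agree to several digits, that is good evidence (though not a proof) that the candidate formula satisfies the differential equation and the initial condition.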

10.8 Appendix: Approximations That Are Not Approximations

In our first rabbit ranch model (after assuming a death rate of zero), our derivation of the model can,
essentially, be described by
\[ \frac{dR}{dt} = \text{number of births per month} = \beta R(t) \]
where
\[ \beta = \text{monthly birth rate per rabbit} . \]

Those who are comfortable with calculations involving rates should be comfortable with this. Others,
however, may be concerned that we have two approximations here: The ﬁrst is in approximating the
derivative dR/dt (an “instantaneous rate of change at time t ”) by the monthly rate of change. The
second is in describing this monthly rate of change in terms of R(t) , the number of rabbits at the
instant of time t , even though the number of rabbits clearly changes over a month.
Let us reassure those concerned readers by looking at this derivation a little more carefully. We
start by recalling the deﬁnition of the derivative of R at time t :
\[ \frac{dR}{dt} = \lim_{\Delta t \to 0} \frac{\Delta R}{\Delta t} \]

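where
\[ \Delta R = R(t + \Delta t) - R(t) = \text{change in } R \text{ as time changes from } t \text{ to } t + \Delta t . \]

Of course, the ‘ R(t + Δt) − R(t) ’ formula for ΔR is pretty useless since we don’t have the formula
for R . However, we can approximate ΔR via

\[ \begin{aligned}
\Delta R &= \text{bunnies born as time changes from } t \text{ to } t + \Delta t \\
&\le (\text{monthly birth rate per rabbit}) \\
&\qquad \times\, (\text{maximum number of rabbits at any one time between } t \text{ and } t + \Delta t) \\
&\qquad \times\, (\text{length of time (in months) between } t \text{ and } t + \Delta t) \\
&= \beta R_{\max} \Delta t
\end{aligned} \]
where
\[ R_{\max} = \text{maximum number of rabbits at any one time between } t \text{ and } t + \Delta t . \]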

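Note that

\[ \begin{aligned}
\lim_{\Delta t \to 0} R_{\max} &= \text{maximum number of rabbits at any one time between } t \text{ and } t + 0 \\
&= \text{number of rabbits at time } t \\
&= R(t) .
\end{aligned} \]

Consequently,
\[ \frac{dR}{dt} = \lim_{\Delta t \to 0} \frac{\Delta R}{\Delta t} \le \lim_{\Delta t \to 0} \beta R_{\max} = \beta R(t) . \]

Similar arguments with

\[ R_{\min} = \text{minimum number of rabbits at any one time between } t \text{ and } t + \Delta t \]

yield
\[ \frac{dR}{dt} \ge \lim_{\Delta t \to 0} \beta R_{\min} = \beta R(t) . \]

Together the two above inequalities involving dR/dt tell us that

\[ \beta R(t) \le \frac{dR}{dt} \le \beta R(t) \]

which, of course, means that

\[ \frac{dR}{dt} = \beta R(t) , \]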
just as we originally derived.
More generally, this sort of analysis can be used to justify letting

\[ \frac{dQ}{dt} = \frac{\Delta Q}{\Delta t} \]

where Δt is the unit time interval in whatever units we are using, and then deriving a formula for
ΔQ/Δt in terms of t and Q(t) , just as we do in our examples, and just as you should do in the
exercises.
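
The squeezing argument above can be watched numerically. The sketch below assumes the exponential solution R(t) = R0 e^(βt) of the simple model (with the hypothetical starting value R0 = 2 and the text's β = 5/4) and prints β Rmin, the difference quotient ΔR/Δt, and β Rmax for shrinking Δt. The helper names are invented for this illustration.

```python
import math

BETA = 5 / 4   # monthly birth rate per rabbit, as in the text
R0 = 2.0       # assumed starting population (hypothetical choice)

def R(t):
    """Exponential-growth solution of dR/dt = BETA * R with R(0) = R0."""
    return R0 * math.exp(BETA * t)

def squeeze(t, dt):
    """Return (BETA*R_min, difference quotient, BETA*R_max) on [t, t + dt]."""
    quotient = (R(t + dt) - R(t)) / dt
    # R is increasing, so its minimum and maximum on the interval
    # are attained at the left and right endpoints, respectively.
    return BETA * R(t), quotient, BETA * R(t + dt)

for dt in (1.0, 0.1, 0.001):
    low, mid, high = squeeze(3.0, dt)
    print(dt, low, mid, high)
```

As Δt shrinks, the outer two numbers close in on each other, trapping the difference quotient between them, which is the content of the two inequalities derived above.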

Additional Exercises

10.2. Do the following using formula (10.7) on page 201 from the simple model for the rabbit
population on our rabbit ranch:
a. Find the approximate number of rabbits on the ranch after one year.
b. How long does it take for the number of rabbits to increase
i. from 2 to 4 ? ii. from 4 to 8 ? iii. from 8 to 16 ?
c. How long does it take for the number of rabbits to increase
i. from 2 to 20 ? ii. from 5 to 50 ? iii. from 10 to 100 ?
d. Approximately how long does it take for the mass of the rabbits on the ranch to equal the
mass of the Earth?

10.3. (Epidemiology) Imagine the following situation:

A stranger infected with a particularly contagious strain of the sniffles enters
a city. Let I (t) be the number of people in the city infected with the sniffles
t days after the stranger entered the city. Assume that only the stranger has
the sniffles on day 0, and that the number of people with the sniffles increases
exponentially thereafter (as derived in the simple population growth model in
section 10.3). Assume further that 50 people have the sniffles on the tenth day
after the stranger entered the city.

Let I (t) be the number of people in the city with sniffles on day t .
a. What is the formula for I (t) ?
b. How many people have the sniffles on day 20 ?
c. Approximately how long until 250,000 people in the city have the sniffles?

10.4. Assume that A(t) = A0 e^(−δt) is the amount of some radioactive substance at time t having
a half-life τ1/2 .
a. Verify that, for each value of t (not just t = 0 ),
\[ A(t + \tau_{1/2}) = \frac{1}{2} A(t) . \]

b. Verify that the formula A(t) = A0 e^(−δt) can be rewritten as
\[ A(t) = A_0 \left(\frac{1}{2}\right)^{t/\tau_{1/2}} . \]

10.5. Cesium-137 is a radioactive isotope of cesium with a half-life of about 30 years.

a. Find the corresponding decay constant δ for cesium-137.
b. Suppose we have a bottle (which we never open) containing 20 grams of cesium-137.
Approximately how many grams of cesium-137 will still be in the bottle
i. after 10 years? ii. after 25 years? iii. after 100 years?

10.6. (Carbon-14 dating) A little background:

Most of the carbon in living tissue comes, directly or indirectly, from the carbon
dioxide in the air. A tiny fraction (about one part per trillion) of this carbon
is the radioactive isotope carbon-14 (which has a half-life of approximately
5,730 years). The rest of the carbon is not radioactive. As a result, about one
trillionth of the carbon in the tissues of a living plant or animal is that radioactive
form of carbon. This ratio of carbon-14 to nonradioactive carbon in the air and
living tissue has remained fairly constant⁸ because the rate at which carbon-14
is created (through an interaction of cosmic radiation with the nitrogen in the
upper atmosphere) matches the rate at which it decays.
At death, however, the plant or animal stops absorbing carbon, and the tiny
amount of carbon-14 in its tissues begins to decrease due to radioactive decay.
By measuring the current ratio of carbon-14 to the nonradioactive carbon in a
tissue sample (say, a piece of old bone or wood), and then comparing this ratio to
the ratio in comparable living tissue, a good estimate of the fraction of the carbon-14
that has decayed can be made. Using that and our model for radioactive decay,
the age of the bone or wood can then be approximated.
Using the above information:
a. Find the (approximate) decay constant δ for carbon-14.
b. Suppose a piece of wood came from a tree that died t years ago. Approximately what
percentage of the carbon-14 that was in the piece of wood when the tree died still remains
undecayed if
i. t = 10 years? ii. t = 100 years? iii. t = 1,000 years?
iv. t = 5,000 years? v. t = 10,000 years? vi. t = 50,000 years?
c. Suppose a skeleton of a person found in an ancient grave contains 30 percent of the
carbon-14 normally found in (equally sized) skeletons of living people. Approximately
how long ago did this person die?
d. The wood in the ornate funeral mask of the ancient ﬁctional ruler Rootietootiekoomin is
found to contain 60 percent of the carbon-14 originally in the wood. Approximately how
long ago did Rootietootiekoomin die?
e. Let A be the amount of carbon-14 measured in a tissue sample (e.g., an old bone or piece
of wood), and let A0 be the amount of carbon-14 in the tissue when the plant or creature
died. Derive a formula for the approximate length of time since that plant’s or creature’s
demise in terms of the ratio A/A0 .

⁸ but not perfectly constant; see a good article on carbon-14 dating.

10.7. Consider the “better model” for the rabbit population in section 10.4.
a. Solve the logistic equation derived there (equation (10.13) on page 206), and verify that
the solution can be written as given in formula (10.14) on page 207.
b. Assume the same values for the initial number of rabbits and ideal birth rate as assumed
in section 10.4,
\[ R(0) = 2 \quad\text{and}\quad \beta_0 = \frac{5}{4} . \]
Also assume that our rabbit ranch has a carrying capacity κ of 10,000,000 rabbits (it’s
a big ranch). How many rabbits (approximately) does our “better model” predict will be
on our ranch
i. at the end of the ﬁrst 6 months?
ii. at the end of the ﬁrst year? (Compare this to the number predicted by the simple model
in exercise 10.2 a, and to the carrying capacity.)
iii. at the end of the second year? (Compare this to the carrying capacity.)
c. Solve formula (10.14) on page 207 for the carrying capacity κ in terms of R0 , R(t) , β
and t .
d. Using the formula for the carrying capacity just derived (and assuming the ideal birth rate
β0 = 5/4 , as before), determine the approximate carrying capacity of a rabbit ranch under
each of the following conditions:
i. You have 1,000 rabbits 6 months after starting with a single breeding pair.
ii. You have 2,000 rabbits 6 months after starting with a single breeding pair.

10.8. Suppose we have a rabbit ranch and have begun harvesting rabbits. Let

R(t) = number of rabbits on the ranch t months after beginning harvesting

and assume the following:

1. The monthly birth rate per rabbit, β , is 5/4 (as we derived).

2. We have no problems with overpopulation (i.e., for all practical purposes, we can
assume the natural death rate is 0 ).

3. Each month we harvest 500 rabbits. (Assume this is done “over the month”, so the
rabbits are still reproducing as we are harvesting.)

a. Derive the differential equation for R(t) based on the above assumptions.
b. Find any equilibrium solutions to your differential equation (this may surprise you), and
analyze how the rabbit population varies over time based on how many we had when we
first began harvesting. (Feel free to use a crude slope field as done in section 10.4.)
c. Solve the differential equation. Get your final answer in terms of t and R0 = R(0) .

10.9. Repeat the previous problem, only, instead of harvesting 500 rabbits a month, assume we
harvest 50 percent of the rabbits on the ranch each month.

10.10. Again, assume we have a rabbit ranch, and let

R(t) = number of rabbits on the ranch after t months.

Taking into account the problems that arise when the population is too large, we obtained
the differential equation
\[ \frac{dR}{dt} = \beta R - \gamma R^2 \]
where β is the monthly birth rate per rabbit (which we figured was 5/4 ) and γ was some
positive constant that would have to be determined later.

This differential equation was obtained assuming we were not harvesting rabbits.
Assume, instead, that we are harvesting h rabbits each month. How do we change the
above differential equation to reﬂect this if
a. we harvest a constant number h0 of rabbits each month?
b. we harvest one fourth of all the rabbits on the ranch each month?

10.11. Consider the following situation:

Mullock the Barbarian begins a campaign of self-enrichment with a horde of 200
vicious warriors. Each week he loses 5 percent of his horde to the unavoidable
accidents that occur while sacking and pillaging. Unfortunately, the horde’s
lifestyle of wanton violence and mindless destruction attracts 50 new warriors
to the horde each week.

Let y(t) be the number of warriors in Mullock’s horde t weeks after starting the cam-
paign.
a. Derive the differential equation describing how y(t) changes each week. Is there also an
initial value given?
b. To what size does the horde eventually grow? (Don’t solve the initial-value problem to
answer this question. Instead, use equilibrium solutions and graphical methods.)
c. Now solve the initial-value problem from the ﬁrst part.
d. How long does it take Mullock’s horde to reach 90 percent of its ﬁnal size?

10.12. (mixing) Consider the following mixing problem:

We have a large tank initially containing 1,000 gallons of pure water. We begin
adding an alcohol-water mix at a rate of 3 gallons per minute. This alcohol-water
mix being added is 75 percent alcohol. At the same time, the mixture in the tank
is drained at a rate of 3 gallons per minute. Throughout this entire process, the
mixture in the tank is thoroughly and uniformly mixed.

Let y(t) be the number of gallons of pure alcohol in the tank t minutes after we started
adding the alcohol-water mix.

a. Find the differential equation for y(t) .

b. Sketch a crude slope field for the differential equation just obtained, and find any equilib-
rium solutions.
c. Using the differential equation just obtained, find the formula for y(t) .
d. Approximately how many gallons of alcohol are in the tank at
i. t = 10 ? ii. t = 60 ? iii. t = 1000 ?
e. When will the mixture in the tank be half alcohol?

10.13. Redo exercise 10.12, but assume the tank initially contains 900 gallons of pure water and
100 gallons of alcohol.

10.14. Consider the following mixing problem:

We have a tank initially containing 200 gallons of pure water, and start adding
saltwater (containing 3 ounces of salt per gallon of water) at the rate of 1/2 gallon
per minute. At the same time, the resulting mixture in the tank is drained at the
rate of 1/2 gallon per minute. As usual, the mixture in the tank is thoroughly
and uniformly mixed at all times.
Let y(t) be the number of ounces of salt in the tank at t minutes after we started adding
the saltwater.
a. Find the differential equation for y(t) .
b. Sketch a crude slope field for the differential equation just obtained, and find any equilib-
rium solutions.
c. Using the differential equation just obtained along with any given initial values, find the
formula for y(t) .
d. Approximately how many ounces of salt are in the tank at
i. t = 10 ? ii. t = 60 ? iii. t = 100 ?
e. What does the concentration of the salt in the tank approach as t → ∞ ?
f. When will the concentration of the salt in the tank be 2 ounces of salt per gallon of water?

10.15. Redo exercise 10.14, but assume that a device has been attached to the tank that, each minute,
ﬁlters out half the salt in a single gallon from the mixture in the tank.

10.16. Consider the following variation of the mixing problem in exercise 10.12:
We have a large tank initially containing 500 gallons of pure water, and start
adding saltwater (containing 2 ounces of salt per gallon of water) at the rate of 2
gallons per minute. At the same time, the resulting mixture in the tank is drained
at the rate of 3 gallons per minute. As usual, assume the mixture in the tank is
thoroughly and uniformly mixed at all times.
Note that the tank is being drained faster than it is being filled.
Let y(t) be the number of ounces of salt in the tank at t minutes after we started adding
the saltwater.
a. What is the formula for the volume of the liquid in the tank t minutes after we started
adding the saltwater?
b. Find the differential equation for y(t) . (Keep in mind that the concentration of salt in the
outflow at time t will depend on both the amount of salt and the volume of the liquid in
the tank at that time.)
c. Using the differential equation just obtained along with any given initial values, find the
formula for y(t) .
d. How many ounces of salt are in the tank at
i. t = 10 ? ii. t = 60 ? iii. t = 100 ?
e. i. When will there be exactly 1 gallon of saltwater in the tank?
ii. How much salt will be in that gallon of saltwater?

10.17. (heating/cooling) Consider the following situation:

At 2 o’clock in the afternoon, the butler reported discovering the dead body of
his master, Lord Hakky d’Sack, in the Lord’s personal wine cellar. The Lord had
apparently been bludgeoned to death with a bottle of Rip’le 04. At 4 o’clock,
the forensics expert arrived and measured the temperature of the body. It was 90
degrees at that time. One hour later, the body had cooled down to 80 degrees. It
was also noted that the wine cellar was maintained at a constant temperature of
50 degrees.

Should the butler be arrested for murder? (Base your answer on the time of death as
determined from the above information, Newton’s law of heating and cooling and the fact
that a reasonably healthy person’s body temperature is about 98.2 degrees.)

Part III
Second- and Higher-Order Equations

11 Higher-Order Equations: Extending First-Order Concepts

Let us switch our attention from ﬁrst-order differential equations to differential equations of order
two or higher. Our main interest will be with second-order differential equations, both because it is
natural to look at second-order equations after studying ﬁrst-order equations, and because second-
order equations arise in applications much more often than do third-, or fourth- or eighty-third-order
equations. Some examples of second-order differential equations are¹

\[ y'' + y = 0 , \]

\[ y'' + 2x y' - 5\sin(x)\, y = 30e^{3x} , \]
and
\[ (y + 1) y'' = (y')^2 . \]

Still, even higher order differential equations, such as

\[ 8y''' + 4y'' + 3y' - 83y = 2e^{4x} , \]

\[ x^3 y^{(iv)} + 6x^2 y'' + 3x y' - 83\sin(x)\, y = 2e^{4x} , \]

and
\[ y^{(83)} + 2y^3 y^{(53)} - x^2 y'' = 18 , \]

can arise in applications, at least on occasion. Fortunately, many of the ideas used in solving these
are straightforward extensions of those used to solve second-order equations. We will make use of
this fact extensively in the following chapters.
Unfortunately, though, the methods we developed to solve first-order differential equations
are of limited direct use in solving higher-order equations. Remember, most of those methods were
based on integrating the differential equation after rearranging it into a form that could be legitimately
integrated. This rarely is possible with higher-order equations, and that makes solving higher-order
equations more of a challenge. This does not mean that those ideas developed in previous chapters
are useless in solving higher-order equations, only that their use will tend to be subtle rather than
obvious.
Still, there are higher-order differential equations that, after the application of a simple substitu-
tion, can be treated and solved as first-order equations. While our knowledge of first-order equations
is still fresh, let us consider some of the more important situations in which this is possible. We will
also take a quick look at how the basic ideas regarding first-order initial-value problems extend to

¹ For notational brevity, we will start using the ‘prime’ notation for derivatives a bit more. It is still recommended, however,
that you use the ‘ d/dx ’ notation when finding solutions just to help keep track of the variables involved.

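Example 11.1: Consider the second-order differential equation

\[ \frac{d^2y}{dx^2} + 2\frac{dy}{dx} = 30e^{3x} . \]

Setting
\[ \frac{dy}{dx} = v \quad\text{and}\quad \frac{d^2y}{dx^2} = \frac{dv}{dx} , \]
as suggested above, the differential equation becomes

\[ \frac{dv}{dx} + 2v = 30e^{3x} . \]

This is a linear first-order differential equation with integrating factor

\[ \mu = e^{\int 2\,dx} = e^{2x} . \]

Proceeding as normal with linear first-order equations,

\[ \begin{aligned}
& e^{2x}\left[\frac{dv}{dx} + 2v\right] = e^{2x}\left[30e^{3x}\right] \\
\hookrightarrow\quad & e^{2x}\frac{dv}{dx} + 2e^{2x}v = 30e^{3x}e^{2x} \\
\hookrightarrow\quad & \frac{d}{dx}\left[e^{2x}v\right] = 30e^{5x} \\
\hookrightarrow\quad & \int \frac{d}{dx}\left[e^{2x}v\right] dx = \int 30e^{5x}\,dx \\
\hookrightarrow\quad & e^{2x}v = 6e^{5x} + c_0 .
\end{aligned} \]

Hence,
\[ v = e^{-2x}\left[6e^{5x} + c_0\right] = 6e^{3x} + c_0e^{-2x} . \]

But v = dy/dx , so the last equation can be rewritten as

\[ \frac{dy}{dx} = 6e^{3x} + c_0e^{-2x} , \]

which is easily integrated,

\[ y = \int \left[6e^{3x} + c_0e^{-2x}\right] dx = 2e^{3x} - \frac{c_0}{2}e^{-2x} + c_2 . \]

Thus (letting c1 = c0/2 ), the solution to our original differential equation is

\[ y(x) = 2e^{3x} - c_1e^{-2x} + c_2 . \]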
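
A quick numerical sanity check of this general solution is sketched below. It approximates y′ and y″ by central differences and evaluates the residual y″ + 2y′ − 30e^(3x), which should be close to zero for any choice of the constants. The step size h and the sample constants are arbitrary choices made for this illustration.

```python
import math

def y(x, c1, c2):
    """General solution from example 11.1."""
    return 2 * math.exp(3 * x) - c1 * math.exp(-2 * x) + c2

def residual(x, c1, c2, h=1e-4):
    """Approximate y'' + 2y' - 30 e^{3x} using central differences of step h."""
    y_minus, y_mid, y_plus = y(x - h, c1, c2), y(x, c1, c2), y(x + h, c1, c2)
    first = (y_plus - y_minus) / (2 * h)
    second = (y_plus - 2 * y_mid + y_minus) / (h * h)
    return second + 2 * first - 30 * math.exp(3 * x)

# Arbitrary sample constants; each residual should be near zero.
for c1, c2 in ((0.0, 0.0), (4.0, 7.0), (-3.0, 1.5)):
    print(c1, c2, [round(residual(x, c1, c2), 4) for x in (-1.0, 0.0, 0.5)])
```

The residuals are not exactly zero because finite differences only approximate derivatives, but they should be tiny compared with the size of 30e^(3x).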

If your differential equation for v is separable and you are solving as such, don’t forget to
check for the constant solutions to this differential equation and to take these “constant-v” solutions
into account when integrating y′ = v .

Example 11.2: Consider the second-order differential equation

\[ \frac{d^2y}{dx^2} = -\left(\frac{dy}{dx} - 3\right)^2 . \tag{11.1} \]

Letting
\[ \frac{dy}{dx} = v \quad\text{and}\quad \frac{d^2y}{dx^2} = \frac{dv}{dx} , \]
the differential equation becomes
\[ \frac{dv}{dx} = -(v - 3)^2 . \tag{11.2} \]

This equation has a constant solution,

\[ v = 3 , \]
which we can rewrite as
\[ \frac{dy}{dx} = 3 . \]
Integrating then gives us
\[ y(x) = 3x + c_0 . \]
This describes all the “constant-v ” solutions to our original differential equation.
To find the nonconstant solutions to equation (11.2), divide through by (v − 3)² and integrate:

\[ \begin{aligned}
& \frac{dv}{dx} = -(v - 3)^2 \\
\hookrightarrow\quad & (v - 3)^{-2}\frac{dv}{dx} = -1 \\
\hookrightarrow\quad & \int (v - 3)^{-2}\frac{dv}{dx}\,dx = -\int 1\,dx \\
\hookrightarrow\quad & -(v - 3)^{-1} = -x + c_1 \\
\hookrightarrow\quad & v = 3 + \frac{1}{x - c_1} .
\end{aligned} \]

But, since v = y′ , this last equation is the same as

\[ \frac{dy}{dx} = 3 + \frac{1}{x - c_1} , \]

which is easily integrated, yielding

\[ y(x) = 3x + \ln|x - c_1| + c_2 . \]

Gathering all the solutions we’ve found gives us the set consisting of

\[ y = 3x + \ln|x - c_1| + c_2 \quad\text{and}\quad y(x) = 3x + c_0 \tag{11.3} \]

describing all possible solutions to our original differential equation.
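
As a rough check on both families of solutions, the sketch below integrates equation (11.2) numerically with a classical fourth-order Runge-Kutta step and compares the result with the formula v = 3 + 1/(x − c1). The integrator `rk4`, the choice c1 = 0, and the interval from x = 1 to x = 2 are all assumptions made for this illustration.

```python
def rk4(f, x0, v0, x_end, steps=1000):
    """Classical fourth-order Runge-Kutta for dv/dx = f(x, v)."""
    h = (x_end - x0) / steps
    x, v = x0, v0
    for _ in range(steps):
        k1 = f(x, v)
        k2 = f(x + h / 2, v + h * k1 / 2)
        k3 = f(x + h / 2, v + h * k2 / 2)
        k4 = f(x + h, v + h * k3)
        v += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
    return v

def rhs(x, v):
    """Right side of equation (11.2)."""
    return -(v - 3) ** 2

C1 = 0.0  # hypothetical choice of the constant c1

def v_formula(x):
    """Nonconstant solution v = 3 + 1/(x - c1) found in the example."""
    return 3 + 1 / (x - C1)

# Start on the nonconstant solution at x = 1 and integrate to x = 2.
numeric = rk4(rhs, 1.0, v_formula(1.0), 2.0)
print(numeric, v_formula(2.0))

# Start on the constant solution v = 3; the numerical solution should stay there.
print(rk4(rhs, 1.0, 3.0, 2.0))
```

The second print illustrates why the constant solution has to be recorded separately: no finite value of c1 in the nonconstant formula ever gives v = 3.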

Equations of Even Higher Orders

With just a little imagination, the basic ideas discussed above can be applied to a few differential
equations of even higher order. Here is an example:

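Example 11.3: Consider the third-order equation

\[ 3\frac{d^3y}{dx^3} = \left(\frac{d^2y}{dx^2}\right)^{-2} . \]
Set
\[ v = \frac{d^2y}{dx^2} . \]
Then
\[ \frac{dv}{dx} = \frac{d}{dx}\left[\frac{d^2y}{dx^2}\right] = \frac{d^3y}{dx^3} , \]
and the original differential equation reduces to a simple separable first-order equation,
\[ 3\frac{dv}{dx} = v^{-2} . \]
Multiplying both sides by v² and proceeding as usual with such equations:
\[ \begin{aligned}
& 3v^2\frac{dv}{dx} = 1 \\
\hookrightarrow\quad & \int 3v^2\frac{dv}{dx}\,dx = \int 1\,dx \\
\hookrightarrow\quad & v^3 = x + c_1 \\
\hookrightarrow\quad & v = (x + c_1)^{1/3} .
\end{aligned} \]
So
\[ \frac{d^2y}{dx^2} = v = (x + c_1)^{1/3} . \]
Integrating once:

\[ \frac{dy}{dx} = \int \frac{d^2y}{dx^2}\,dx = \int (x + c_1)^{1/3}\,dx = \frac{3}{4}(x + c_1)^{4/3} + c_2 . \]

And once again:

\[ y = \int \frac{dy}{dx}\,dx = \int \left[\frac{3}{4}(x + c_1)^{4/3} + c_2\right] dx = \frac{9}{28}(x + c_1)^{7/3} + c_2x + c_3 . \]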

Converting a Differential Equation to a System∗

Consider what we actually have after taking a second-order differential equation (with y being
the yet unknown function and x being the variable) and converting it to a first-order equation for
v through the substitution v = y′ . We actually have a pair of first-order differential equations
involving the two unknown functions y and v . The first equation in the pair is simply the equation
for the substitution, y′ = v , and the second is what we obtain after using the substitution with the
original second-order equation. If we are lucky, we can directly solve the second equation of this
pair. But, as the next example illustrates, we have this pair whether or not either of the above cases
applies. Together, this pair forms a “system of first-order differential equations” whose solution is
the pair y(x) and v(x) .

∗ These comments relate the material in this chapter to more advanced concepts and methods that will be developed much
later in this text. This discussion won’t help you solve any differential equations now, but will give a little hint of an
approach to dealing with higher-order equations we could take (and will explore in the distant future). You might find them
interesting. Then, again, you may just want to ignore this discussion for now.

Example 11.4: Suppose our original differential equation is

\[ \frac{d^2y}{dx^2} + 2x\frac{dy}{dx} = 5\sin(x)\, y . \]
Setting
\[ v = \frac{dy}{dx} \quad\text{and}\quad \frac{dv}{dx} = \frac{d^2y}{dx^2} , \]
the original differential equation reduces to
\[ \frac{dv}{dx} + 2xv = 5\sin(x)\, y . \]
Thus, v = v(x) and y = y(x) , together, must satisfy both
\[ v = \frac{dy}{dx} \quad\text{and}\quad \frac{dv}{dx} + 2xv = 5\sin(x)\, y . \]
This is a system of two first-order equations. Traditionally, each equation is rewritten in derivative
formula form, and the system is then written as
\[ \begin{aligned}
\frac{dy}{dx} &= v \\
\frac{dv}{dx} &= 5\sin(x)\, y - 2xv .
\end{aligned} \]

As just illustrated, almost any second-order differential equation encountered in practice can be
converted to a system of two first-order equations involving two unknown functions. In fact, almost
any Nth-order differential equation can be converted using similar ideas to a system of N first-order
differential equations involving N unknown functions. This is significant because methods exist
for dealing with such systems. In many cases, these methods are analogous to methods we used with
first-order differential equations. We will discuss some of these methods in the future (the distant
future).
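
One practical payoff of the system form is that standard numerical integrators expect exactly this shape. The sketch below is a minimal illustration, not one of the methods developed later in the text: it advances the system from example 11.4 with a classical fourth-order Runge-Kutta step. The integrator `rk4_system` and the initial values y(0) = 1, v(0) = 0 are assumptions made for this illustration.

```python
import math

def rk4_system(f, x0, state, x_end, steps=2000):
    """Classical RK4 for a first-order system d(state)/dx = f(x, state)."""
    h = (x_end - x0) / steps
    x, s = x0, list(state)
    for _ in range(steps):
        k1 = f(x, s)
        k2 = f(x + h / 2, [si + h / 2 * ki for si, ki in zip(s, k1)])
        k3 = f(x + h / 2, [si + h / 2 * ki for si, ki in zip(s, k2)])
        k4 = f(x + h, [si + h * ki for si, ki in zip(s, k3)])
        s = [si + h / 6 * (a + 2 * b + 2 * c + d)
             for si, a, b, c, d in zip(s, k1, k2, k3, k4)]
        x += h
    return s

def example_11_4(x, state):
    """The system from example 11.4: y' = v, v' = 5 sin(x) y - 2 x v."""
    y, v = state
    return [v, 5 * math.sin(x) * y - 2 * x * v]

# Hypothetical initial values y(0) = 1, v(0) = 0, chosen only for illustration.
y_end, v_end = rk4_system(example_11_4, 0.0, [1.0, 0.0], 2.0)
print(y_end, v_end)
```

The same `rk4_system` function works unchanged for a system of N first-order equations; only the function returning the list of derivatives needs to change.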

11.2 The Other Class of Second-Order Equations “Easily Reduced” to First-Order†

As noted a few pages ago, the substitution v = y′ can also be useful when no formulas of x explicitly
appear in the given second-order differential equation. Such second-order differential equations are
said to be autonomous (this extends the definition of “autonomous” given for first-order differential
equations in chapter 3).
† The material in this section is interesting and occasionally useful, but not essential to the rest of this text. At least give this
section a quick skim before going to the discussion of initial-value problems in the next section (and promise to return to
this section when convenient or necessary).

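\[ \begin{aligned}
\hookrightarrow\quad & \int v\,\frac{dv}{dy}\,dy = -\int y\,dy \\
\hookrightarrow\quad & \frac{1}{2}v^2 = -\frac{1}{2}y^2 + c_0 \\
\hookrightarrow\quad & v = \pm\sqrt{2c_0 - y^2} .
\end{aligned} \]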

3. Rewrite the original substitution,
\[ \frac{dy}{dx} = v , \]
replacing the v with the formula just found for v . Then observe that this is another first-order
differential equation. In fact, you should notice that it is a separable first-order equation.
Replacing the v in
\[ \frac{dy}{dx} = v \]
with the formula just obtained in our example for v , we get
\[ \frac{dy}{dx} = \pm\sqrt{2c_0 - y^2} . \]
Why, yes, this is a separable first-order differential equation for y !
4. Solve the first-order differential equation just obtained. This gives the general solution to the
original second-order differential equation.
For our example,
\[ \begin{aligned}
& \frac{dy}{dx} = \pm\sqrt{2c_0 - y^2} \\
\hookrightarrow\quad & \frac{1}{\sqrt{2c_0 - y^2}}\frac{dy}{dx} = \pm 1 \\
\hookrightarrow\quad & \int \frac{1}{\sqrt{2c_0 - y^2}}\frac{dy}{dx}\,dx = \pm\int 1\,dx .
\end{aligned} \]
Evaluating these integrals (after, perhaps, consulting our old calculus text or a
handy table of integrals) yields
\[ \arcsin\left(\frac{y}{a}\right) = \pm x + b \]
where a and b are arbitrary constants with a² = 2c0 . (Note that c0 had to be
positive for the square root to make sense.)
Taking the sine of both sides and recalling that sine is an odd function, we see
that
\[ \frac{y}{a} = \sin(\pm x + b) = \pm\sin(x \pm b) . \]
Thus, letting c1 = ±a and c2 = ±b , we have
\[ y(x) = c_1\sin(x + c_2) \]
as a general solution to our original differential equation,
\[ \frac{d^2y}{dx^2} + y = 0 . \]
(Later, after developing more theory, we will find easier ways to solve this and
certain similar ‘linear’ equations.)
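
The relation between the constants in this example can be spot-checked numerically. Along y = c1 sin(x + c2), the derivative is v = c1 cos(x + c2), so v² + y² should equal the constant 2c0 = c1² at every x. The sketch below evaluates that combination at a few points; the sample values c1 = 2 and c2 = 0.3 are arbitrary choices for this illustration.

```python
import math

def y(x, c1, c2):
    """General solution y = c1 sin(x + c2) of y'' + y = 0."""
    return c1 * math.sin(x + c2)

def v(x, c1, c2):
    """v = dy/dx for the formula above."""
    return c1 * math.cos(x + c2)

def two_c0(x, c1, c2):
    """The combination v^2 + y^2, which the derivation says equals 2*c0."""
    return v(x, c1, c2) ** 2 + y(x, c1, c2) ** 2

# Arbitrary sample constants; every printed value should be (very nearly) c1**2 = 4.
for x in (-1.0, 0.0, 2.5):
    print(x, two_c0(x, 2.0, 0.3))
```

That the printed values do not depend on x reflects the step where v² = 2c0 − y² was obtained, and it matches the requirement a² = 2c0 with c1 = ±a.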

A Few Comments
Most of these should be pretty obvious:

1. Again, remember to check for the “constant-v” solutions of any separable differential equation
for v .
2. It may be that the original differential equation does not explicitly contain either x or y . If so,
then the approach just described and the approach described in the previous section may both
be appropriate. Which one you choose is up to you. Often, it makes little difference (though
the ﬁrst is usually at least a little more straightforward). But occasionally (as illustrated in
exercise 11.10) one approach may be much easier to carry out than the other.
3. To be honest, we won’t be using the method just outlined in later sections or chapters. Many
of the second-order autonomous equations arising in applications are also “linear”, and we
will develop better methods for dealing with these equations over the next few chapters
(where we will also learn just what it means to say that a differential equation is “linear”).
It should also be mentioned that, much later, we will develop clever ways to analyze the
possible solutions to fairly arbitrary autonomous differential equations after rewriting these
equations as systems.
Still, the method described here is invaluable for completely solving certain autonomous
differential equations that are not “linear”.

11.3 Initial-Value Problems

Initial Values with Higher-Order Equations
Remember, an Nth-order initial-value problem consists of an Nth-order differential equation along
with the set of assignments
\[ y(x_0) = y_0 , \quad y'(x_0) = y_1 , \quad y''(x_0) = y_2 , \quad \ldots \quad\text{and}\quad y^{(N-1)}(x_0) = y_{N-1} \]
where x0 is some single fixed number and the yk ’s are the desired values of the function and its
first few derivatives at position x0 .
In particular, a first-order initial-value problem consists of a first-order differential equation
with a y(x0) = y0 initial condition. For example,
\[ x\frac{dy}{dx} + 4y = x^3 \quad\text{with}\quad y(1) = 3 . \]
A second-order initial-value problem consists of a second-order differential equation along with
y(x0) = y0 and y′(x0) = y1 initial conditions. For example,
\[ \frac{d^2y}{dx^2} + 2\frac{dy}{dx} = 30e^{3x} \quad\text{with}\quad y(0) = 5 \quad\text{and}\quad y'(0) = 14 . \]
A third-order initial-value problem consists of a third-order differential equation along with y(x0) = y0 ,
y′(x0) = y1 and y″(x0) = y2 initial conditions. For example,
\[ 3\frac{d^3y}{dx^3} = \left(\frac{d^2y}{dx^2}\right)^{-2} \quad\text{with}\quad y(0) = 4 , \quad y'(0) = 6 \quad\text{and}\quad y''(0) = 8 . \]

And so on.

Solving Higher-Order Initial-Value Problems

The Basic Approach
The basic procedure for solving a typical higher-order initial-value problem is just about the same
as the procedure for solving a first-order initial-value problem. You just need to account for the
additional initial values.
First find the general solution to the differential equation. (Though we haven’t proven it, you
should expect the formula for the general solution to have as many arbitrary/undetermined constants
as you have initial conditions.) Use the formula found for the general solution with each initial
condition. This creates a system of algebraic equations for the yet-undetermined constants which
can be solved for those constants. Solve the system by whatever means you can, and use those
values for the constants in the formula for the differential equation’s general solution. The resulting
formula is the solution to the initial-value problem.
One example should suffice.

Example 11.5: Consider the second-order initial-value problem

\[ \frac{d^2y}{dx^2} + 2\frac{dy}{dx} = 30e^{3x} \quad\text{with}\quad y(0) = 5 \quad\text{and}\quad y'(0) = 14 . \]
From example 11.1 we already know that

\[ y(x) = 2e^{3x} - c_1e^{-2x} + c_2 \]

is the general solution to the differential equation. Since the initial conditions include the value
of y′(x) at x = 0 , we will also need the formula for y′ ,
\[ y'(x) = \frac{dy}{dx} = 6e^{3x} + 2c_1e^{-2x} , \]
which we can obtain by either differentiating the above formula for y or copying the formula for
y′ from the work done in the example. Combining these formulas with the given initial conditions
yields
\[ 5 = y(0) = 2e^{3\cdot 0} - c_1e^{-2\cdot 0} + c_2 = 2 - c_1 + c_2 \]
and
\[ 14 = y'(0) = 6e^{3\cdot 0} + 2c_1e^{-2\cdot 0} = 6 + 2c_1 . \]

That is,
\[ 5 = 2 - c_1 + c_2 \]
and
\[ 14 = 6 + 2c_1 . \]

Doing the obvious arithmetic, we get the system

\[ \begin{aligned}
-c_1 + c_2 &= 3 \\
2c_1 &= 8
\end{aligned} \]

of two equations and two unknowns. This is an easy system to solve. From the second equation
we immediately see that
\[ c_1 = \frac{8}{2} = 4 . \]
Then, solving the first equation for c2 and using the value just found for c1 , we see that

\[ c_2 = c_1 + 3 = 4 + 3 = 7 . \]

Thus, for y(x) to satisfy the given differential equation and the two given initial conditions, we
must have
\[ y(x) = 2e^{3x} - c_1 e^{-2x} + c_2 \quad\text{with}\quad c_1 = 4 \quad\text{and}\quad c_2 = 7 . \]

That is,
\[ y(x) = 2e^{3x} - 4e^{-2x} + 7 \]
is the solution to our initial-value problem.
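
The arithmetic in this example can be double-checked with a few lines of code. The following is only a sketch: the function names (`solve_constants`, `residual`, and so on) are made up for this illustration, and the derivative formulas are typed in by hand from the general solution rather than computed symbolically.

```python
import math

# Example 11.5:  y'' + 2y' = 30 e^{3x}  with  y(0) = 5  and  y'(0) = 14.
# General solution (from example 11.1):  y = 2 e^{3x} - c1 e^{-2x} + c2.

def solve_constants(y0, yp0):
    """Solve the system  -c1 + c2 = y0 - 2  and  2*c1 = yp0 - 6  for (c1, c2)."""
    c1 = (yp0 - 6.0) / 2.0      # second equation first: it involves only c1
    c2 = (y0 - 2.0) + c1        # then the first equation gives c2
    return c1, c2

def y(x, c1, c2):
    return 2.0 * math.exp(3 * x) - c1 * math.exp(-2 * x) + c2

def yp(x, c1):
    # first derivative of the general solution
    return 6.0 * math.exp(3 * x) + 2.0 * c1 * math.exp(-2 * x)

def ypp(x, c1):
    # second derivative of the general solution
    return 18.0 * math.exp(3 * x) - 4.0 * c1 * math.exp(-2 * x)

c1, c2 = solve_constants(5.0, 14.0)

def residual(x):
    """Left side minus right side of the differential equation at x."""
    return ypp(x, c1) + 2.0 * yp(x, c1) - 30.0 * math.exp(3 * x)

print("c1 =", c1, " c2 =", c2)
print("y(0) =", y(0.0, c1, c2), " y'(0) =", yp(0.0, c1))
print("residual at x = 1:", residual(1.0))
```

The residual should vanish up to rounding error at every $x$, and the two printed initial values should match the given ones.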

An Alternative Approach
At times, it may be a little easier to determine the values of the arbitrary/undetermined constants
“as they arise” in solving the differential equation. This is especially true when using the methods
discussed in sections 11.1 and 11.2, where we used the substitution $v = y'$ to convert the differential
equation to a first-order differential equation for $v$. For the sort of equation considered in section
11.1, this substitution immediately gives a first-order initial-value problem with

\[ v(x_0) = y'(x_0) = y_1 . \]

For the type of equation considered in section 11.2 (the autonomous differential equations), the initial
condition for $v = v(y)$ comes from combining $v = y'$ and the original initial conditions

\[ y(x_0) = y_0 \quad\text{and}\quad y'(x_0) = y_1 \]

into
\[ v(y_0) = y'(x_0) = y_1 . \]
Let’s do one simple example.

!Example 11.6: Consider the initial-value problem

\[ \frac{d^2 y}{dx^2} + 2\frac{dy}{dx} = 30e^{3x} \quad\text{with}\quad y(0) = 9 \quad\text{and}\quad y'(0) = 2 . \]

The differential equation is the same as in example 11.1 on page 229. Letting $v(x) = y'(x)$
yields the first-order initial-value problem
\[ \frac{dv}{dx} + 2v = 30e^{3x} \quad\text{with}\quad v(0) = y'(0) = 2 . \]
As shown in example 11.1, the general solution to the differential equation for $v$ is

\[ v = 6e^{3x} + c_0 e^{-2x} . \]

Combining this with the initial value yields

\[ 2 = v(0) = 6e^{3\cdot 0} + c_0 e^{-2\cdot 0} = 6 + c_0 . \]

So, $c_0 = 2 - 6 = -4$, and
\[ \frac{dy}{dx} = v = 6e^{3x} - 4e^{-2x} . \]
Integrating this,

\[ y(x) = \int \left[ 6e^{3x} - 4e^{-2x} \right] dx = 2e^{3x} + 2e^{-2x} + c_1 , \]

238 Higher-Order Equations: Extending First-Order Concepts

and using this formula for y with the initial condition y(0) = 9 gives us

\[ 9 = y(0) = 2e^{3\cdot 0} + 2e^{-2\cdot 0} + c_1 = 2 + 2 + c_1 = 4 + c_1 . \]

Thus, $c_1 = 9 - 4 = 5$, and
\[ y(x) = 2e^{3x} + 2e^{-2x} + 5 \]
is the solution to our initial-value problem.
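
The "one constant at a time" bookkeeping of this example can be sketched in code. This is an illustration only: the helper name `alternative_approach` is hypothetical, and the formulas for $v$ and its antiderivative are copied by hand from the example rather than derived by the program.

```python
import math

def alternative_approach(y0, v0):
    """Fix the constants of  y'' + 2y' = 30 e^{3x}  one at a time, from y(0) = y0, y'(0) = v0."""
    # Step 1: v = y' satisfies v = 6 e^{3x} + c0 e^{-2x}; the condition v(0) = v0 fixes c0.
    c0 = v0 - 6.0
    # Step 2: integrating v gives y = 2 e^{3x} - (c0/2) e^{-2x} + c1; y(0) = y0 fixes c1.
    c1 = y0 - 2.0 + c0 / 2.0

    def y(x):
        return 2.0 * math.exp(3 * x) - (c0 / 2.0) * math.exp(-2 * x) + c1

    return c0, c1, y

c0, c1, y = alternative_approach(9.0, 2.0)
print("c0 =", c0, " c1 =", c1)   # the example found c0 = -4 and c1 = 5
print("y(0) =", y(0.0))
```

Each constant is determined by a single equation in a single unknown, which is the advantage the discussion after the example points out.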

As the example illustrates, one advantage of this approach is that you only deal with one
unknown constant at a time. This approach also by-passes obtaining the general solution to the
original differential equation. Consequently, if the general solution is also desired, then the slight
advantages of this method are considerably reduced.

11.4 On the Existence and Uniqueness of Solutions

Second-Order Problems
When we were dealing with ﬁrst-order differential equations, we often found it useful to rewrite a
given ﬁrst-order equation in the derivative formula form,
\[ \frac{dy}{dx} = F(x, y) . \]

Extending this form to higher-order differential equations is straightforward. In particular, if we
algebraically solve a second-order differential equation for the second derivative, $y''$, in terms of
$x$, $y$ and the first derivative, $y'$, then we will have rewritten our differential equation in the form

\[ y'' = F(x, y, y') \]

for some function F of three variables. We will call this the second-derivative formula form for the
second-order differential equation.

!Example 11.7: Solving the equation

\[ y'' + 2x y' = 5 \sin(x)\, y \]

for the second derivative yields

\[ y'' = \underbrace{5 \sin(x)\, y - 2x y'}_{F(x, y, y')} . \]

Replacing the derivative on the right with the symbol $z$, we see that the formula for $F$ is

\[ F(x, y, z) = 5 \sin(x)\, y - 2x z . \]

To be honest, we won’t ﬁnd the second-derivative formula form particularly useful in solv-
ing second-order differential equations until we seriously start dealing with systems of differential
equations. It is mentioned here because it is the form used in the following basic theorems on the
existence and uniqueness of solutions to second-order initial-value problems:

Theorem 11.1 (existence and uniqueness for second-order initial-value problems)

Consider a second-order initial-value problem

\[ y'' = F(x, y, y') \quad\text{with}\quad y(x_0) = y_0 \quad\text{and}\quad y'(x_0) = z_0 \]

in which $F = F(x, y, z)$ and the corresponding partial derivatives $\partial F/\partial y$ and $\partial F/\partial z$ are all continuous
in some open region containing the point $(x_0, y_0, z_0)$. This initial-value problem then has exactly
one solution over some open interval $(\alpha, \beta)$ containing $x_0$. Moreover, this solution and its first and
second derivatives are continuous over that interval.

Theorem 11.2 (existence and uniqueness for second-order initial-value problems)

Consider a second-order initial-value problem

\[ y'' = F(x, y, y') \quad\text{with}\quad y(x_0) = y_0 \quad\text{and}\quad y'(x_0) = z_0 \]

over an interval $(\alpha, \beta)$ containing the point $x_0$, and with $F = F(x, y, z)$ being a continuous
function on the infinite slab

\[ R = \{\, (x, y, z) : \alpha < x < \beta , \ -\infty < y < \infty \ \text{and} \ -\infty < z < \infty \,\} . \]

Further suppose that, on $R$, the partial derivatives $\partial F/\partial y$ and $\partial F/\partial z$ are continuous and are functions
of $x$ only. Then the initial-value problem has exactly one solution on $(\alpha, \beta)$. Moreover, the solution
and its first and second derivatives are all continuous on this interval.

The above theorems are the second-order analogs of the theorems on existence and uniqueness
for ﬁrst-order differential equations given in section 3.3, and are also special cases of the two theorems
we’ll discuss next. They assure us that most of the second-order initial-value problems encountered
in practice are, in theory, solvable. We will ﬁnd the second theorem particularly important in
rigorously establishing some useful results concerning the general solutions to an important class of
second-order differential equations.

Problems of Any Order

The biggest difﬁculty in extending the above existence and uniqueness results for second-order
problems to problems of arbitrary order N is that we quickly run out of letters to denote the
variables and constants. So we will use subscripts.
Extending the idea of the “derivative formula form” for a first-order differential equation remains
trivial. If, given an $N$th-order differential equation, we can algebraically solve for the $N$th-order
derivative $y^{(N)}$ in terms of $x$, $y$, and the other derivatives of $y$, then we will say that we've gotten
the differential equation into the highest-order derivative formula form for the differential equation,

\[ y^{(N)} = F\left(x, y, y', \ldots, y^{(N-1)}\right) . \]

Note that $F$ will be a function of $N + 1$ variables, which we will denote by $x$, $s_1$, $s_2$, …, $s_N$.

!Example 11.8: Solving the equation

\[ y^{(4)} - x^2 y''' - y y'' + 2x y^3 y' = 0 \]

for the fourth derivative yields

\[ y^{(4)} = \underbrace{x^2 y''' + y y'' - 2x y^3 y'}_{F(x, y, y', y'', y''')} . \]

The formula for F is then

\[ F(x, s_1, s_2, s_3, s_4) = x^2 s_4 + s_1 s_3 - 2x (s_1)^3 s_2 . \]
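
As a small sanity check on the bookkeeping (with $s_1, s_2, s_3, s_4$ standing for $y, y', y'', y'''$), the formula for $F$ can be typed in and compared against the original equation. The sketch below uses arbitrary sample numbers, not values taken from any actual solution, and the function names are chosen just for this illustration.

```python
def F(x, s1, s2, s3, s4):
    """Right side of the highest-order derivative formula form in example 11.8."""
    return x**2 * s4 + s1 * s3 - 2 * x * s1**3 * s2

def original_left_side(x, y, y1, y2, y3, y4):
    """Left side of  y'''' - x^2 y''' - y y'' + 2x y^3 y' = 0  (y1..y4 are the derivatives)."""
    return y4 - x**2 * y3 - y * y2 + 2 * x * y**3 * y1

# Arbitrary sample values for x, y, y', y'', y'''.
x, y, y1, y2, y3 = 2, 3, -1, 5, 7

# If the fourth derivative is set equal to F, the original equation must be satisfied.
y4 = F(x, y, y1, y2, y3)
print(y4, original_left_side(x, y, y1, y2, y3, y4))
```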

Again, the main reason to mention this form is that it is used in the N th -order analogs of the
existence and uniqueness theorems given in section 3.3. These analogous theorems are

Theorem 11.3 (existence and uniqueness for N th -order initial-value problems)

Consider an N th -order initial-value problem

\[ y^{(N)} = F\left(x, y, y', \ldots, y^{(N-1)}\right) \]
with
\[ y(x_0) = \sigma_1 , \quad y'(x_0) = \sigma_2 , \quad \ldots \quad\text{and}\quad y^{(N-1)}(x_0) = \sigma_N \]

in which $F = F(x, s_1, s_2, \ldots, s_N)$ and the corresponding partial derivatives

\[ \frac{\partial F}{\partial s_1} , \quad \frac{\partial F}{\partial s_2} , \quad \ldots \quad\text{and}\quad \frac{\partial F}{\partial s_N} \]

are all continuous functions in some open region containing the point $(x_0, \sigma_1, \sigma_2, \ldots, \sigma_N)$. This
initial-value problem then has exactly one solution over some open interval $(\alpha, \beta)$ containing $x_0$.
Moreover, this solution and its derivatives up to order $N$ are continuous over that interval.

Theorem 11.4 (existence and uniqueness for Nth -order initial-value problems)
Consider an N th -order initial-value problem

\[ y^{(N)} = F\left(x, y, y', \ldots, y^{(N-1)}\right) \]
with
\[ y(x_0) = \sigma_1 , \quad y'(x_0) = \sigma_2 , \quad \ldots \quad\text{and}\quad y^{(N-1)}(x_0) = \sigma_N \]

over an interval $(\alpha, \beta)$ containing the point $x_0$, and with $F = F(x, s_1, s_2, \ldots, s_N)$ being a
continuous function on

\[ R = \{\, (x, s_1, s_2, \ldots, s_N) : \alpha < x < \beta \ \text{and} \ -\infty < s_k < \infty \ \text{for} \ k = 1, 2, \ldots, N \,\} . \]

Further suppose that, on $R$, the partial derivatives

\[ \frac{\partial F}{\partial s_1} , \quad \frac{\partial F}{\partial s_2} , \quad \ldots \quad\text{and}\quad \frac{\partial F}{\partial s_N} \]

are all continuous and are functions of $x$ only. Then the initial-value problem has exactly one
solution on $(\alpha, \beta)$. Moreover, the solution and its derivatives up to order $N$ are all continuous on
this interval.

A good way to prove the four theorems above is to use a “systems” approach. We’ll discuss
this further when we get to “systems of differential equations” in chapter 35.
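
To give a rough feel for what a “systems” viewpoint looks like (this is not the proof, only an illustration), the second-derivative formula form $y'' = F(x, y, y')$ can be rewritten as the pair of first-order equations $y' = z$, $z' = F(x, y, z)$ and stepped forward numerically. The sketch below assumes the classical fourth-order Runge-Kutta scheme, which is a choice made here and not something taken from this chapter; the function name and the step count are likewise hypothetical. It is applied to the initial-value problem of example 11.5 and compared with the exact solution found there.

```python
import math

def rk4_second_order(F, x0, y0, z0, x_end, steps):
    """Approximate y(x_end), y'(x_end) for y'' = F(x, y, y') via the system y' = z, z' = F(x, y, z)."""
    h = (x_end - x0) / steps
    x, y, z = x0, y0, z0
    for _ in range(steps):
        # Four slope samples for the pair (y, z), as in classical Runge-Kutta.
        k1y, k1z = z, F(x, y, z)
        k2y, k2z = z + 0.5 * h * k1z, F(x + 0.5 * h, y + 0.5 * h * k1y, z + 0.5 * h * k1z)
        k3y, k3z = z + 0.5 * h * k2z, F(x + 0.5 * h, y + 0.5 * h * k2y, z + 0.5 * h * k2z)
        k4y, k4z = z + h * k3z, F(x + h, y + h * k3y, z + h * k3z)
        y += h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6.0
        z += h * (k1z + 2 * k2z + 2 * k3z + k4z) / 6.0
        x += h
    return y, z

# Example 11.5 in second-derivative formula form:  y'' = 30 e^{3x} - 2 y'.
def F(x, y, z):
    return 30.0 * math.exp(3 * x) - 2.0 * z

y_num, z_num = rk4_second_order(F, 0.0, 5.0, 14.0, 0.5, 500)

# Exact values from the solution y = 2 e^{3x} - 4 e^{-2x} + 7 found in example 11.5.
y_exact = 2.0 * math.exp(1.5) - 4.0 * math.exp(-1.0) + 7.0
z_exact = 6.0 * math.exp(1.5) + 8.0 * math.exp(-1.0)
print(abs(y_num - y_exact), abs(z_num - z_exact))
```

The existence and uniqueness theorems are what justify speaking of "the" solution that such a scheme approximates; the code itself proves nothing.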

Additional Exercises

11.1. None of the following second-order equations explicitly contains $y$. Solve each using the
substitution $v = y'$ as described in section 11.1.
a. $x y'' + 4y' = 18x^2$    b. $x y'' = 2y'$
c. $y'' = y'$    d. $y'' + 2y' = 8e^{2x}$
e. $x y'' = y' - 2x^2 y'$    f. $(x^2 + 1)y'' + 2x y' = 0$

11.2. For each of the following, determine if the given differential equation explicitly contains
y . If it does not, solve it using the substitution v = y as described in section 11.1.

a. y = 4x y b. y y = 1
c. yy = −(y )2 d. x y = (y )2 − y
e. x y − y = 6x 5 f. yy − (y )2 = y
g. y = 2y − 6 h. (y − 3)y = (y )2

i. y + 4y = 9e−3x j. y = y y − 2

11.3. Solve the following higher-order differential equations using the basic ideas from section
11.1 (as done in example 11.3 on page 231):
a. y = y b. x y + 2y = 6x

c. y = 2 y d. y (4) = −2y

11.4. The following second-order equations are all autonomous. Solve each using the substitution
v = y as described in section 11.2.
a. yy = (y )2 b. 3yy = 2(y )2
c. sin(y)y + cos(y)(y )2 = 0 d. y = y
e. (y )2 + yy = 2yy f. y 2 y + y + 2y(y )2 = 0

11.5. For each of the following, determine if the given differential equation is autonomous. If it
is, then solve it using the substitution v = y as described in section 11.2.

a. y = 4x y b. yy = −(y )2
c. y y = 1 d. x y = (y )2 − y
e. x y − y = 6x 5 f. yy − (y )2 = y
g. yy = 2(y )2 h. (y − 3)y = (y )2

i. y + 4y = 9e−3x j. y = y y − 2

11.6. Solve the following initial-value problems. In several cases you can use the general solu-
tions already found for the corresponding differential equations in exercise sets 11.1 and
11.3:
a. x y + 4y = 18x 2 with y(1) = 8 and y (1) = −3

b. x y = 2y with y(−1) = 4 and y (−1) = 12

c. y = y with y(0) = 8 and y (0) = 5
d. y + 2y = 8e2x with y(0) = 0 and y (0) = 0
e. y = y with y(0) = 10 , y (0) = 5 and y (0) = 2
f. x y + 2y = 6x with y(1) = 2 , y (1) = 1 and y (1) = 4

g. x y + 2y = 6 with y(1) = 4 and y (1) = 5

√
h. 2x y y = (y )2 − 1 with y(1) = 0 and y (1) = 3

11.7. Solve the following initial-value problems. In several cases you can use the general solutions
already found for the corresponding differential equations in exercise set 11.4:
a. yy = (y )2 with y(0) = 5 and y (0) = 15
b. 3yy = 2(y )2 with y(0) = 8 and y (0) = 6
c. 3yy = 2(y )2 with y(1) = 1 and y (1) = 9
3
d. yy + 2(y )2 = 3yy with y(0) = 2 and y (0) =
4
e. y = −y e−y with y(0) = 0 and y (0) = 2

11.8. In solving a second-order differential equation using the methods described in this chapter,
we first solved a first-order differential equation for v = y , obtaining a formula for v = y
involving an arbitrary constant. Sometimes the value of the first ‘arbitrary’ constant affects
how we solve v = y for y . You will illustrate this using

y = −2x(y )2 (11.4)

in the following set of problems.

a. Using the “alternative approach” to solving initial-value problems (as illustrated in exam-
ple 11.6 on page 237), ﬁnd the solution to differential equation (11.4) satisfying each of
the following sets of initial values:
i. y(0) = 3 and y (0) = 4 ii. y(0) = 3 and y (0) = 0
1
iii. y(1) = 0 and y (1) = 1 iv. y(0) = − and y (1) = 5
4
(Observe how different the solutions to these different initial-value problems are, even
though they all involve the same differential equation.)
b. Find the set of all possible solutions to differential equation (11.4).

11.9. We will again illustrate the issue raised at the beginning of exercise 11.8, but using differ-
ential equation
y = 2yy . (11.5)

a. Using the “alternative approach” to solving initial-value problems (as illustrated in exam-
ple 11.6 on page 237), ﬁnd the solution to differential equation (11.5) satisfying each of
the following sets of initial values:
i. y(0) = 0 and y (0) = 1 ii. y(0) = 1 and y (0) = 1
iii. y(0) = 1 and y (0) = 0 iv. y(0) = 0 and y (0) = −1

(Again, observe how different the solutions to these different initial-value problems are,
even though they all involve the same differential equation.)
b. Find the set of all possible solutions to differential equation (11.5).

11.10. In exercise 11.2 g, you showed that the differential equation

\[ y'' = 2y' - 6 \]

is easily solved using the substitution

\[ v = \frac{dy}{dx} \quad\text{with}\quad \frac{d^2 y}{dx^2} = \frac{dv}{dx} . \]

(If you didn’t do exercise 11.2 g, go back and do it.) Since the differential equation is also
autonomous, the other substitution discussed in this chapter,

\[ v = \frac{dy}{dx} \quad\text{with}\quad \frac{d^2 y}{dx^2} = v \frac{dv}{dy} , \]

should also be considered as appropriate. Try it and see what you get for v as a function of
y . Can you, personally, actually go further and replace v with dy/dx and solve the resulting
differential equation? What is the moral of this exercise?

12
Higher-Order Linear Equations and the
Reduction of Order Method

We have just seen that some higher-order differential equations can be solved using methods for
ﬁrst-order equations after applying the substitution v = dy/dx . Unfortunately, this approach has
its limitations. Moreover, as we will later see, many of those differential equations that can be so
solved can also be solved more easily using the theory and methods that will be developed in the
next few chapters. This theory and methodology apply to the class of “linear” differential equations.
This is a rather large class that includes a great many differential equations arising in applications.
In fact, this class of equations is so important and the theory for dealing with these equations is so
extensive that we will not again seriously consider higher-order nonlinear differential equations for
many, many chapters.

12.1 Linear Differential Equations of All Orders

The Equations
Recall that a first-order differential equation is said to be linear if and only if it can be written as
\[ \frac{dy}{dx} + py = f \tag{12.1} \]
where $p = p(x)$ and $f = f(x)$ are known functions. Observe that this is the same as saying that
a first-order differential equation is linear if and only if it can be written as
\[ a\frac{dy}{dx} + by = g \tag{12.2} \]
where $a$, $b$, and $g$ are known functions of $x$. After all, the first equation is equation (12.2) with
$a = 1$, $b = p$ and $g = f$, and any equation in the form of equation (12.2) can be converted to one
looking like equation (12.1) by simply dividing through by $a$ (so $p = b/a$ and $f = g/a$).
Higher order analogs of either equation (12.1) or equation (12.2) can be used to define when a
higher-order differential equation is “linear”. We will find it slightly more convenient to use analogs
of equation (12.2) (which was the reason for the above observations). Second- and third-order linear
equations will first be described so you can start seeing the pattern. Then the general definition will
be given. For convenience (and because there are only so many letters in the alphabet), we may start
denoting different functions with subscripts.
A second-order differential equation is said to be linear if and only if it can be written as
\[ a_0 \frac{d^2 y}{dx^2} + a_1 \frac{dy}{dx} + a_2 y = g \tag{12.3} \]

where $a_0$, $a_1$, $a_2$, and $g$ are known functions of $x$. (In practice, generic second-order differential
equations are often denoted by
\[ a\frac{d^2 y}{dx^2} + b\frac{dy}{dx} + cy = g , \]
instead.) For example,

\[ \frac{d^2 y}{dx^2} + x^2 \frac{dy}{dx} - 6x^4 y = \sqrt{x + 1} \quad\text{and}\quad 3\frac{d^2 y}{dx^2} + 8\frac{dy}{dx} - 6y = 0 \]

are second-order linear differential equations, while

\[ \frac{d^2 y}{dx^2} + y^2 \frac{dy}{dx} = \sqrt{x + 1} \quad\text{and}\quad \frac{d^2 y}{dx^2} = \left(\frac{dy}{dx}\right)^2 \]
are not.
A third-order differential equation is said to be linear if and only if it can be written as

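\[ a_0 \frac{d^3 y}{dx^3} + a_1 \frac{d^2 y}{dx^2} + a_2 \frac{dy}{dx} + a_3 y = g \]

where $a_0$, $a_1$, $a_2$, $a_3$, and $g$ are known functions of $x$. For example,

\[ x^3 \frac{d^3 y}{dx^3} + x^2 \frac{d^2 y}{dx^2} + x\frac{dy}{dx} - 6y = e^x \quad\text{and}\quad \frac{d^3 y}{dx^3} - y = 0 \]

are third-order linear differential equations, while

\[ \frac{d^3 y}{dx^3} - y^2 = 0 \quad\text{and}\quad \frac{d^3 y}{dx^3} + y\frac{dy}{dx} = 0 \]
are not.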
Getting the idea?
In general, for any positive integer $N$, we refer to an $N$th-order differential equation as being
linear if and only if it can be written as

\[ a_0 \frac{d^N y}{dx^N} + a_1 \frac{d^{N-1} y}{dx^{N-1}} + \cdots + a_{N-2} \frac{d^2 y}{dx^2} + a_{N-1} \frac{dy}{dx} + a_N y = g \tag{12.4} \]

where $a_0$, $a_1$, …, $a_N$, and $g$ are known functions of $x$. For convenience, this equation will often
be written using the prime notation for derivatives,

\[ a_0 y^{(N)} + a_1 y^{(N-1)} + \cdots + a_{N-2} y'' + a_{N-1} y' + a_N y = g . \]

The function g on the right side of the above equation is often called the forcing function for the
differential equation (because it often describes a force affecting whatever phenomenon the equation
is modeling). If g = 0 (i.e., g(x) = 0 for every x in the interval of interest), then the equation is
said to be homogeneous.1 Conversely, if g is nonzero somewhere on the interval of interest, then
we say the differential equation is nonhomogeneous.
As we will later see, solving a nonhomogeneous equation

\[ a_0 y^{(N)} + a_1 y^{(N-1)} + \cdots + a_{N-2} y'' + a_{N-1} y' + a_N y = g \]

1 You may recall the term “homogeneous” from chapter 6. If you compare what “homogeneous” meant there with what it
means here, you will find absolutely no connection. The same term is being used for two completely different concepts.

is usually best done after ﬁrst solving the homogeneous equation generated from the original equation
by simply replacing g with 0 ,

\[ a_0 y^{(N)} + a_1 y^{(N-1)} + \cdots + a_{N-2} y'' + a_{N-1} y' + a_N y = 0 . \]

This equation is officially called either the corresponding homogeneous
equation or the associated homogeneous equation, depending on the author (we will use whichever
phrase we feel like at the time). Do observe that the zero function,

y(x) = 0 for all x ,

is always a solution to a homogeneous linear differential equation (verify this for yourself). This is
called the trivial solution and is not a very exciting solution. Invariably, the interest is in ﬁnding the
nontrivial solutions.

Intervals of Interest for Linear Equations

When attempting to solve a linear differential equation, be it second order,

\[ a_0 y'' + a_1 y' + a_2 y = g , \tag{12.5} \]

or of arbitrary order,

\[ a_0 y^{(N)} + a_1 y^{(N-1)} + \cdots + a_{N-2} y'' + a_{N-1} y' + a_N y = g , \tag{12.6} \]

we should keep in mind that we are seeking a solution (or general solution) valid over some “interval
of interest”, (α, β) . To ensure that the solutions exist and are reasonably well behaved on the interval,
we will usually require that

the g and the ak ’s in equation (12.5) or equation (12.6) (depending on the equation of
interest) must all be continuous functions on the interval (α, β) with a0 never being
zero at any point in this interval.
Often, in practice, this assumption is not explicitly stated, or the interval (α, β) is not explicitly
given. In these cases, you should usually assume that the interval of interest (α, β) is one over
which the above assumption holds for the given differential equation.
Why is this assumption important? It is because of the following two theorems:

Theorem 12.1 (existence and uniqueness for second-order linear initial-value problems)
Consider the initial-value problem

\[ ay'' + by' + cy = g \quad\text{with}\quad y(x_0) = A \quad\text{and}\quad y'(x_0) = B \]

over an interval $(\alpha, \beta)$ containing the point $x_0$. Assume, further, that $a$, $b$, $c$ and $g$ are continuous
functions on $(\alpha, \beta)$ with $a$ never being zero at any point in this interval. Then the initial-value
problem has exactly one solution on (α, β) . Moreover, the solution and its ﬁrst and second derivatives
are all continuous on this interval.

Theorem 12.2 (existence and uniqueness for Nth -order linear initial-value problems)
Consider the initial-value problem

\[ a_0 y^{(N)} + a_1 y^{(N-1)} + \cdots + a_{N-2} y'' + a_{N-1} y' + a_N y = g \]

with
\[ y(x_0) = A_1 , \quad y'(x_0) = A_2 , \quad \ldots \quad\text{and}\quad y^{(N-1)}(x_0) = A_N \]

over an interval (α, β) containing the point x 0 . Assume, further, that g and the ak ’s are continuous
functions on (α, β) with a0 never being zero at any point in this interval. Then the initial-value
problem has exactly one solution on (α, β) . Moreover, the solution and its derivatives up to order
N are all continuous on this interval.

These theorems assure us that the initial-value problems we’ll encounter are “completely solv-
able”, at least, in theory.
Both of the above theorems are actually corollaries of theorems from the previous chapter
(theorems 11.2 on page 239 and 11.4 on page 240). For those interested, here is the proof of one:

PROOF (of theorem 12.1): Start by rewriting the differential equation,

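\[ ay'' + by' + cy = g \]

as
\[ y'' = \frac{g}{a} - \frac{b}{a}\, y' - \frac{c}{a}\, y , \]
and observe that this is
\[ y'' = F(x, y, y') \quad\text{where}\quad F(x, y, z) = \frac{g(x)}{a(x)} - \frac{b(x)}{a(x)}\, z - \frac{c(x)}{a(x)}\, y . \]

The partial derivatives $\partial F/\partial y$ and $\partial F/\partial z$ are easily computed:

\[ \frac{\partial F}{\partial y} = -\frac{c(x)}{a(x)} \quad\text{and}\quad \frac{\partial F}{\partial z} = -\frac{b(x)}{a(x)} . \]

Note that these partial derivatives are functions of $x$ only. In addition, since $a$, $b$, $c$ and $g$ are
continuous on $(\alpha, \beta)$ and $a$ is never zero on $(\alpha, \beta)$, it is easy to see that $F$, $\partial F/\partial y$ and $\partial F/\partial z$
are all continuous on

\[ R = \{\, (x, y, z) : \alpha < x < \beta , \ -\infty < y < \infty \ \text{and} \ -\infty < z < \infty \,\} . \]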

The claims of our theorem now follow immediately from theorem 11.2 (using y0 = A and z 0 = B ).
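
To see the ingredients of this proof in a concrete case, consider (purely as an illustration, not as part of the proof) the sample equation $x^2 y'' - 3x y' + 4y = 0$, for which $a = x^2$, $b = -3x$, $c = 4$ and $g = 0$. The formulas above then give:

```latex
% Sample coefficients (an illustration only): a = x^2, b = -3x, c = 4, g = 0.
F(x, y, z) = \frac{0}{x^2} - \frac{-3x}{x^2}\, z - \frac{4}{x^2}\, y
           = \frac{3}{x}\, z - \frac{4}{x^2}\, y ,
\qquad
\frac{\partial F}{\partial y} = -\frac{4}{x^2} ,
\qquad
\frac{\partial F}{\partial z} = \frac{3}{x} .
```

Both partial derivatives depend on $x$ only and are continuous wherever $x \neq 0$. So the theorem applies on $(0, \infty)$ or on $(-\infty, 0)$, but not on any interval containing $x = 0$, where $a$ vanishes.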

12.2 Introduction to the Reduction of Order Method

The rest of this chapter is devoted to the reduction of order method. This is a method for converting any
linear differential equation to another linear differential equation of lower order, and then constructing
the general solution to the original differential equation using the general solution to the lower-order
equation. In general, to use this method with an N th -order linear differential equation

\[ a_0 y^{(N)} + a_1 y^{(N-1)} + \cdots + a_{N-2} y'' + a_{N-1} y' + a_N y = g , \]

we need to already have one known nontrivial solution $y_1 = y_1(x)$ to the corresponding homogeneous
differential equation. We then try a substitution of the form

\[ y = y_1 u \]

where $u = u(x)$ is a yet unknown function (and $y_1 = y_1(x)$ is the aforementioned known solution).
Plugging this substitution into the differential equation then leads to a linear differential equation for
$u$. As we will see, because $y_1$ satisfies the corresponding homogeneous equation, the differential
equation for $u$ ends up being of the form

\[ A_0 u^{(N)} + A_1 u^{(N-1)} + \cdots + A_{N-2} u'' + A_{N-1} u' = g . \]

Remarkably, there is no “$A_N u$” term. This means we can use the substitution

\[ v = u' , \]

as discussed in chapter 11, to rewrite the differential equation for $u$ as an $(N-1)$th-order differential
equation for $v$,

\[ A_0 v^{(N-1)} + A_1 v^{(N-2)} + \cdots + A_{N-2} v' + A_{N-1} v = g . \]

So we have reduced the order of the equation to be solved. If a general solution $v = v(x)$ for this
equation can be found, then the most general formula for $u$ can be obtained from $v$ by integration
(since $u' = v$). Finally then, by going back to the original substitution formula $y = y_1 u$, we can
obtain a general solution to the original differential equation.
This method is especially useful for solving second-order, homogeneous linear differential
equations since (as we will see) it reduces the problem to one of solving relatively simple first-order
differential equations. Accordingly, we will first concentrate on its use in finding general solutions
to second-order, homogeneous linear differential equations. Then we will briefly discuss using
reduction of order with second-order, nonhomogeneous linear equations, and with both homogeneous
and nonhomogeneous linear differential equations of higher orders.

12.3 Reduction of Order for Homogeneous Linear

Second-Order Equations
The Method
Here we lay out the details of the reduction of order method for second-order, homogeneous linear
differential equations. To illustrate the method, we’ll use the differential equation

\[ x^2 y'' - 3x y' + 4y = 0 . \]

Note that the first coefficient, $x^2$, vanishes when $x = 0$. From comments made earlier (see theorem
12.1 on page 247), we should suspect that $x = 0$ ought not be in any interval of interest for this
equation. So we will be solving over the intervals $(0, \infty)$ and $(-\infty, 0)$.
Before starting the reduction of order method, we need one nontrivial solution $y_1$ to our
differential equation. Ways for finding that first solution will be discussed in later chapters. For now
let us just observe that if
\[ y_1(x) = x^2 , \]

then
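\[
\begin{aligned}
x^2 y_1'' - 3x y_1' + 4y_1 &= x^2 \frac{d^2}{dx^2}\left[x^2\right] - 3x \frac{d}{dx}\left[x^2\right] + 4\left[x^2\right] \\
&= x^2 [2 \cdot 1] - 3x [2x] + 4x^2 \\
&= x^2 \underbrace{[2 - (3 \cdot 2) + 4]}_{0} = 0 .
\end{aligned}
\]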

Thus, one solution to the above differential equation is y1 (x) = x 2 .
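
The same "plug it in and see" check is easy to automate. In the sketch below, the derivative formulas are entered by hand, exact rational arithmetic is used so that a zero really is zero, and the second candidate $x^3$ is an extra function chosen here for contrast (it is not from the text). All names are hypothetical.

```python
from fractions import Fraction

def residual(y, yp, ypp, x):
    """Value of  x^2 y'' - 3x y' + 4y  at x, given a candidate y and its first two derivatives."""
    return x**2 * ypp(x) - 3 * x * yp(x) + 4 * y(x)

# Candidate from the text: y1(x) = x^2, with y1' = 2x and y1'' = 2.
def y1(x): return x**2
def y1p(x): return 2 * x
def y1pp(x): return 2

# Contrasting candidate (an assumption of this sketch): w(x) = x^3, with w' = 3x^2 and w'' = 6x.
def w(x): return x**3
def wp(x): return 3 * x**2
def wpp(x): return 6 * x

sample_points = [Fraction(-5, 2), Fraction(-1), Fraction(1, 3), Fraction(2), Fraction(7)]
y1_residuals = [residual(y1, y1p, y1pp, x) for x in sample_points]
w_residuals = [residual(w, wp, wpp, x) for x in sample_points]

print(y1_residuals)   # all zero, so x^2 passes the check at these points
print(w_residuals)    # nonzero, so x^3 is not a solution
```

A check at finitely many points is of course weaker than the algebraic verification above; it is only a quick way to catch arithmetic slips.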

As already stated, this method is for ﬁnding a general solution to some homogeneous linear
second-order differential equation

\[ ay'' + by' + cy = 0 \]

(where $a$, $b$, and $c$ are known functions with $a(x)$ never being zero on the interval of interest). We
will assume that we already have one nontrivial particular solution $y_1(x)$ to this generic differential
equation.
For our example (as already noted), we will seek a general solution to

\[ x^2 y'' - 3x y' + 4y = 0 . \tag{12.7} \]

The one (nontrivial) solution we know is $y_1(x) = x^2$.

Here are the details in using the reduction of order method to solve the above:
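1. Let
\[ y = y_1 u \]
where $u = u(x)$ is a function yet to be determined. To simplify “plugging into the differential
equation”, go ahead and compute the corresponding formulas for the derivatives $y'$ and $y''$
using the product rule:
\[ y' = (y_1 u)' = y_1' u + y_1 u' \]
and
\[
\begin{aligned}
y'' = (y')' &= \left(y_1' u + y_1 u'\right)' \\
&= \left(y_1' u\right)' + \left(y_1 u'\right)' \\
&= \left(y_1'' u + y_1' u'\right) + \left(y_1' u' + y_1 u''\right) \\
&= y_1'' u + 2y_1' u' + y_1 u'' .
\end{aligned}
\]

For our example,

\[ y = y_1 u = x^2 u \]
where $u = u(x)$ is the function yet to be determined. The derivatives of $y$ are

\[ y' = \left(x^2 u\right)' = 2xu + x^2 u' \]

and

\[
\begin{aligned}
y'' = (y')' &= \left(2xu + x^2 u'\right)' \\
&= (2xu)' + \left(x^2 u'\right)' \\
&= \left(2u + 2xu'\right) + \left(2xu' + x^2 u''\right) \\
&= 2u + 4xu' + x^2 u'' .
\end{aligned}
\]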

2. Plug the formulas just computed for $y$, $y'$ and $y''$ into the differential equation, group
together the coefficients for $u$ and each of its derivatives, and simplify as far as possible.
(We’ll do this with the example first and then look at the general case.)
Plugging the formulas just computed above for $y$, $y'$ and $y''$ into equation (12.7),
we get

\[
\begin{aligned}
0 &= x^2 y'' - 3x y' + 4y \\
&= x^2 \left[2u + 4xu' + x^2 u''\right] - 3x \left[2xu + x^2 u'\right] + 4\left[x^2 u\right] \\
&= 2x^2 u + 4x^3 u' + x^4 u'' - 6x^2 u - 3x^3 u' + 4x^2 u \\
&= x^4 u'' + \left[4x^3 - 3x^3\right] u' + \left[2x^2 - 6x^2 + 4x^2\right] u \\
&= x^4 u'' + x^3 u' + 0 \cdot u .
\end{aligned}
\]

Notice that the $u$ term drops out! So the resulting differential equation for $u$ is
simply
\[ x^4 u'' + x^3 u' = 0 , \]
which we can further simplify by dividing out $x^4$,
\[ u'' + \frac{1}{x} u' = 0 . \]

In general, plugging the formulas for $y$ and its derivatives into the given differential
equation yields

\[
\begin{aligned}
0 &= ay'' + by' + cy \\
&= a\left[y_1'' u + 2y_1' u' + y_1 u''\right] + b\left[y_1' u + y_1 u'\right] + c\left[y_1 u\right] \\
&= a y_1'' u + 2a y_1' u' + a y_1 u'' + b y_1' u + b y_1 u' + c y_1 u \\
&= a y_1 u'' + \left[2a y_1' + b y_1\right] u' + \left[a y_1'' + b y_1' + c y_1\right] u .
\end{aligned}
\]

That is, the differential equation becomes

\[ Au'' + Bu' + Cu = 0 \]

where

\[ A = a y_1 , \quad B = 2a y_1' + b y_1 \quad\text{and}\quad C = a y_1'' + b y_1' + c y_1 . \]

But remember, $y_1$ is a solution to the homogeneous equation

\[ ay'' + by' + cy = 0 . \]

Consequently,
\[ C = a y_1'' + b y_1' + c y_1 = 0 , \]
and the differential equation for $u$ automatically reduces to

\[ Au'' + Bu' = 0 . \]

The $u$ term always drops out!

3. Now find the general solution to the second-order differential equation just obtained for $u$,

\[ Au'' + Bu' = 0 , \]

via the substitution method discussed in section 11.1:

(a) Let $u' = v$ (and, thus, $u'' = v' = dv/dx$) to convert the second-order differential
equation for $u$ to the first-order differential equation for $v$,
\[ A\frac{dv}{dx} + Bv = 0 . \]
(It is worth noting that this first-order differential equation is both linear and separable.)
(b) Find the general solution $v(x)$ to this first-order equation. (Since it is both linear and
separable, you can solve it using either the procedure developed for first-order linear
equations or the approach developed for first-order separable equations.)
(c) Using the formula just found for $v$, integrate the substitution formula $u' = v$ to obtain
the formula for $u$,
\[ u(x) = \int v(x)\, dx . \]

Don’t forget all the arbitrary constants.

In our example, we obtained
\[ u'' + \frac{1}{x} u' = 0 . \]
Letting $v = u'$ and, thus, $v' = u''$, this becomes
\[ \frac{dv}{dx} + \frac{1}{x} v = 0 . \]
Equivalently,
\[ \frac{dv}{dx} = -\frac{1}{x} v . \]
This is a relatively simple separable first-order equation. It has one constant
solution, $v = 0$. To find the others, we divide through by $v$ and proceed as usual
with such equations:
\[
\begin{aligned}
& \frac{1}{v}\frac{dv}{dx} = -\frac{1}{x} \\
\Longrightarrow\quad & \int \frac{1}{v}\frac{dv}{dx}\, dx = -\int \frac{1}{x}\, dx \\
\Longrightarrow\quad & \ln|v| = -\ln|x| + c_0 \\
\Longrightarrow\quad & v = \pm e^{-\ln|x| + c_0} \\
\Longrightarrow\quad & v = \pm e^{c_0} x^{-1} .
\end{aligned}
\]

Letting $A = \pm e^{c_0}$, this simplifies to

\[ v = \frac{A}{x} , \]
which also accounts for the constant solution (when $A = 0$).

Since $u' = v$, it then follows that

\[ u(x) = \int v(x)\, dx = \int \frac{A}{x}\, dx = A \ln|x| + B . \]

4. Finally, plug the formula just obtained for u(x) into the ﬁrst substitution,

y = y1 u ,

used to convert the original differential equation for y to a differential equation for u . The
resulting formula for y(x) will be a general solution for that original differential equation.
(Sometimes that formula can be simpliﬁed a little. Feel free to do so.)

In our example, the solution we started with was $y_1(x) = x^2$. Combined with
the $u(x)$ just found, we have

\[ y = y_1 u = x^2 \left[A \ln|x| + B\right] . \]

That is,
\[ y(x) = Ax^2 \ln|x| + Bx^2 \]

is the general solution to equation (12.7).
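
The four steps above can be spot-checked numerically for the example. The sketch below is only an illustration under stated assumptions: the coefficient functions and derivative formulas are typed in by hand, the names are hypothetical, and the constants 3 and -2 used at the end are arbitrary sample choices for $A$ and $B$. It confirms that $C$ vanishes (so the $u$ term drops out), that $A$ and $B$ in $Au'' + Bu' = 0$ come out as $x^4$ and $x^3$, and that $y = x^2[A \ln|x| + B]$ satisfies equation (12.7) on both sides of $x = 0$.

```python
import math
from fractions import Fraction

# Coefficients of  x^2 y'' - 3x y' + 4y = 0  and the known solution y1 = x^2.
def a(x): return x**2
def b(x): return -3 * x
def c(x): return 4
def y1(x): return x**2
def y1p(x): return 2 * x
def y1pp(x): return 2

def coefficients(x):
    """A, B, C in  A u'' + B u' + C u = 0  obtained by substituting y = y1 u."""
    A = a(x) * y1(x)
    B = 2 * a(x) * y1p(x) + b(x) * y1(x)
    C = a(x) * y1pp(x) + b(x) * y1p(x) + c(x) * y1(x)
    return A, B, C

def general_solution_residual(x, A_const, B_const):
    """Residual of  x^2 y'' - 3x y' + 4y  for  y = x^2 (A ln|x| + B), derivatives done by hand."""
    u = A_const * math.log(abs(x)) + B_const
    y = x**2 * u
    yp = 2 * x * u + A_const * x      # product rule, using (ln|x|)' = 1/x
    ypp = 2 * u + 3 * A_const
    return x**2 * ypp - 3 * x * yp + 4 * y

x = Fraction(3, 2)                    # exact arithmetic for the coefficient check
A, B, C = coefficients(x)
print(A == x**4, B == x**3, C == 0)

for xv in (-4.0, -0.5, 0.25, 1.0, 3.0):
    print(xv, general_solution_residual(xv, 3.0, -2.0))
```

The residuals should be zero up to rounding error for every nonzero sample point, whatever values are chosen for the two constants.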

An Observation About the General Solution

In the above example, the general solution obtained was

\[ y(x) = Ax^2 \ln|x| + Bx^2 . \]

It will be worth noting that, after renaming the arbitrary constants $A$ and $B$ as $c_2$ and $c_1$, respectively,
we can rewrite this as
\[ y(x) = c_1 y_1(x) + c_2 y_2(x) \]
where
\[ y_1(x) = x^2 \]

is the one solution we already knew, and

\[ y_2(x) = x^2 \ln|x| \]

is another function that arose in the course of our procedure, and which, we should also note, can be
written as
\[ y_2(x) = y_1(x) u_0(x) \quad\text{with}\quad u_0(x) = \ln|x| . \]

Moreover:

1. Since y(x) = c1 y1 (x) + c2 y2 (x) is a solution to our differential equation for any choice of
constants c1 and c2 , it follows (by taking c1 = 0 and c2 = 1 ) that y2 is also a particular
solution to our differential equation.

2. Since u 0 (x) is NOT a constant, y2 is NOT a constant multiple of y1 .

It turns out that the above observation does not just hold for this one example. If you look care-
fully at the reduction of order method for solving any second-order linear homogeneous differential
equation,
\[ ay'' + by' + cy = 0 , \]
you will ﬁnd that this method always results in a general solution of the form

y(x) = c1 y1 (x) + c2 y2 (x)

where

1. c1 and c2 are arbitrary constants,

2. y1 is the one known particular solution needed to start the method, and

3. y2 is another particular solution to the given differential equation which is NOT a constant
multiple of y1 .
As noted, the above can be easily veriﬁed by looking at the general formulas that arise in using the
reduction of order method.2 For now though, let us take the above as something you should observe
in the solutions you obtain in doing the exercises for this section.
And why are the above observations important? Because they will serve as the starting point
for a more general discussion of “general solutions” in the next chapter, and which, in turn, will lead
to other methods for solving our differential equations, often (but not always) without the need to go
through the full reduction of order method.

12.4 Reduction of Order for Nonhomogeneous Linear Second-Order Equations

If you look back over our discussion in section 12.3, you will see that the reduction of order method applies almost as well in solving a nonhomogeneous equation

\[ ay'' + by' + cy = g , \]

provided that “one solution $y_1$” is a solution to the corresponding homogeneous equation

\[ ay'' + by' + cy = 0 . \]

Then, letting $y = y_1 u$ in the nonhomogeneous equation and then replacing $u'$ with $v$ leads to an equation of the form
\[ Av' + Bv = g \]
instead of
\[ Av' + Bv = 0 . \]

So the resulting first-order equation for v is not both separable and linear; it is just linear. Still,
we know how to solve such equations. Solving that first-order linear differential equation for v
and continuing with the method already described finally yields the general solution to the desired
nonhomogeneous differential equation.
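
As a sketch of where the coefficients $A$ and $B$ come from: the explicit formulas below are not stated in the text, but they follow from the product rule, assuming only that $y_1$ satisfies the homogeneous equation.

```latex
\begin{align*}
a\,(y_1 u)'' + b\,(y_1 u)' + c\,(y_1 u)
  &= a\left[ y_1 u'' + 2y_1' u' + y_1'' u \right]
     + b\left[ y_1 u' + y_1' u \right] + c\, y_1 u \\
  &= a y_1\, u'' + \left[ 2a y_1' + b y_1 \right] u'
     + \left[ a y_1'' + b y_1' + c y_1 \right] u \\
  &= a y_1\, u'' + \left[ 2a y_1' + b y_1 \right] u' + 0 \cdot u .
\end{align*}
```

So, with $v = u'$, one consistent reading is $A = a y_1$ and $B = 2a y_1' + b y_1$. For instance, with $a = x^2$, $b = -3x$ and $y_1 = x^2$, this gives $A = x^4$ and $B = 4x^3 - 3x^3 = x^3$.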
We will do one example. Then I’ll tell you why the method is rarely used in practice.
2 Actually, there are technical issues arising at points where $y_1$ is zero.


!Example 12.1: Let us try to solve the second-order nonhomogeneous linear differential equation
\[ x^2 y'' - 3x y' + 4y = \sqrt{x} \tag{12.8} \]

over the interval (0, ∞) .

As we saw in our main example in section 12.3, the corresponding homogeneous equation

\[ x^2 y'' - 3x y' + 4y = 0 \]

has $y_1(x) = x^2$ as one solution (in fact, from that example, we know the entire general solution to this homogeneous equation, but we only need this one particular solution for the method). Let

\[ y = y_1 u = x^2 u \]

where $u = u(x)$ is the function yet to be determined. The derivatives of $y$ are

\[ y' = \left( x^2 u \right)' = 2xu + x^2 u' \]

and

\[ \begin{aligned}
y'' = (y')' &= \left( 2xu + x^2 u' \right)' \\
&= 2u + 2xu' + 2xu' + x^2 u'' \\
&= 2u + 4xu' + x^2 u'' .
\end{aligned} \]

Plugging these into equation (12.8) yields

\[ \begin{aligned}
\sqrt{x} &= x^2 y'' - 3x y' + 4y \\
&= x^2 \left[ 2u + 4xu' + x^2 u'' \right] - 3x \left[ 2xu + x^2 u' \right] + 4 \left[ x^2 u \right] \\
&= 2x^2 u + 4x^3 u' + x^4 u'' - 6x^2 u - 3x^3 u' + 4x^2 u \\
&= x^4 u'' + \left[ 4x^3 - 3x^3 \right] u' + \left[ 2x^2 - 6x^2 + 4x^2 \right] u \\
&= x^4 u'' + x^3 u' + 0 \cdot u .
\end{aligned} \]

As before, the $u$ term drops out. In this case, we are left with
\[ x^4 u'' + x^3 u' = \sqrt{x} . \]

That is,
\[ x^4 v' + x^3 v = x^{1/2} \quad\text{with}\quad v = u' . \]
This is a relatively simple first-order linear equation. To help find the integrating factor, we now divide through by $x^4$, obtaining
\[ \frac{dv}{dx} + \frac{1}{x}\, v = x^{-7/2} . \]

So the integrating factor is
\[ \mu = e^{\int \frac{1}{x}\, dx} = e^{\ln|x|} = |x| . \]

Since we are just attempting to solve over the interval $(0, \infty)$, we really just have

\[ \mu = x . \]


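Multiplying the last differential equation for $v$ by this integrating factor and proceeding as usual when solving first-order linear differential equations:
\[ \begin{aligned}
& x \left[ \frac{dv}{dx} + \frac{1}{x}\, v \right] = x \cdot x^{-7/2} \\
\longrightarrow\quad & x \frac{dv}{dx} + v = x^{-5/2} \\
\longrightarrow\quad & \frac{d}{dx} \left[ xv \right] = x^{-5/2} \\
\longrightarrow\quad & \int \frac{d}{dx} \left[ xv \right] dx = \int x^{-5/2}\, dx \\
\longrightarrow\quad & xv = -\frac{2}{3} x^{-3/2} + c_1 \\
\longrightarrow\quad & v = -\frac{2}{3} x^{-5/2} + \frac{c_1}{x} .
\end{aligned} \]
Recalling that $v = u'$, we can rewrite the last line as
\[ \frac{du}{dx} = -\frac{2}{3} x^{-5/2} + \frac{c_1}{x} . \]
Thus,
\[ \begin{aligned}
u = \int \frac{du}{dx}\, dx &= \int \left[ -\frac{2}{3} x^{-5/2} + \frac{c_1}{x} \right] dx \\
&= \frac{2}{3} \cdot \frac{2}{3}\, x^{-3/2} + c_1 \ln(x) + c_2 \\
&= \frac{4}{9} x^{-3/2} + c_1 \ln(x) + c_2 ,
\end{aligned} \]
and the general solution to our nonhomogeneous equation is
\[ \begin{aligned}
y(x) = x^2 u(x) &= x^2 \left[ \frac{4}{9} x^{-3/2} + c_1 \ln(x) + c_2 \right] \\
&= \frac{4}{9} x^{1/2} + c_1 x^2 \ln(x) + c_2 x^2 .
\end{aligned} \]
For no obvious reason at this point, let’s observe that we can write this solution as
\[ y(x) = c_1 x^2 \ln(x) + c_2 x^2 + \frac{4}{9} \sqrt{x} . \tag{12.9} \]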
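
A quick numerical sanity check of formula (12.9) can be sketched as follows. The function names, sample points and constants are illustrative choices, not part of the text, and the derivatives are written out by hand from (12.9).

```python
import math


def y_general(x, c1, c2):
    """Return (y, y', y'') for y = c1 x^2 ln(x) + c2 x^2 + (4/9) sqrt(x), x > 0."""
    y = c1 * x**2 * math.log(x) + c2 * x**2 + (4.0 / 9.0) * math.sqrt(x)
    dy = c1 * (2 * x * math.log(x) + x) + 2 * c2 * x + (2.0 / 9.0) * x**-0.5
    d2y = c1 * (2 * math.log(x) + 3) + 2 * c2 - (1.0 / 9.0) * x**-1.5
    return y, dy, d2y


def residual(x, c1, c2):
    """Left side minus right side of x^2 y'' - 3x y' + 4y = sqrt(x)."""
    y, dy, d2y = y_general(x, c1, c2)
    return x**2 * d2y - 3 * x * dy + 4 * y - math.sqrt(x)


if __name__ == "__main__":
    # The residual should vanish (up to rounding) for every choice of c1, c2.
    for x in (0.5, 1.0, 2.0, 7.5):
        for c1, c2 in ((0.0, 0.0), (1.0, -2.0), (3.5, 0.25)):
            print(x, c1, c2, residual(x, c1, c2))
```

Such a check only samples a few points, so it supports the hand computation rather than replacing it.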

It should be observed that, in the above example, we only used one particular solution, $y_1(x) = x^2$, to the homogeneous differential equation
\[ x^2 y'' - 3x y' + 4y = 0 \]
even though we had already found the general solution
\[ Ax^2 \ln|x| + Bx^2 . \]
Later, in chapter 22, we will develop a reﬁnement of the reduction of order method for solving second-
order, nonhomogeneous linear differential equations that makes use of the entire general solution to
the corresponding homogeneous equation. This reﬁnement (the “variation of parameters” method)
has two distinct advantages over the reduction of order method when solving nonhomogeneous
differential equations:


1. The computations required for the reﬁned procedure tend to be simpler and more easily
carried out.

2. With a few straightforward modiﬁcations, the reﬁned procedure readily extends to being a
useful method for dealing with nonhomogeneous linear differential equations of any order.
For the reasons discussed in the next section, the same cannot be said about the basic reduction
of order method.

That is why, in practice, the basic reduction of order method is rarely used with nonhomogeneous
equations.

12.5 Reduction of Order in General

In theory, reduction of order can be applied to any linear equation of any order, homogeneous or not.
Whether its application is useful is another issue.

!Example 12.2: Consider the third-order homogeneous linear differential equation

\[ y''' - 8y = 0 . \tag{12.10} \]

If you rewrite this equation as

\[ y''' = 8y , \]
and think about what happens when you differentiate exponentials, you will realize that

\[ y_1(x) = e^{2x} \]

is ‘obviously’ a solution to our differential equation (verify it yourself). Letting

\[ y = y_1 u = e^{2x} u \]

and repeatedly using the product rule, we get

\[ y' = \left( e^{2x} u \right)' = 2e^{2x} u + e^{2x} u' , \]

\[ \begin{aligned}
y'' &= \left( 2e^{2x} u + e^{2x} u' \right)' \\
&= 4e^{2x} u + 2e^{2x} u' + 2e^{2x} u' + e^{2x} u'' \\
&= 4e^{2x} u + 4e^{2x} u' + e^{2x} u''
\end{aligned} \]

and
\[ \begin{aligned}
y''' &= \left( 4e^{2x} u + 4e^{2x} u' + e^{2x} u'' \right)' \\
&= 8e^{2x} u + 4e^{2x} u' + 8e^{2x} u' + 4e^{2x} u'' + 2e^{2x} u'' + e^{2x} u''' \\
&= 8e^{2x} u + 12e^{2x} u' + 6e^{2x} u'' + e^{2x} u''' .
\end{aligned} \]


So, using $y = e^{2x} u$,

\[ \begin{aligned}
& y''' - 8y = 0 \\
\longrightarrow\quad & \left[ 8e^{2x} u + 12e^{2x} u' + 6e^{2x} u'' + e^{2x} u''' \right] - 8 \left[ e^{2x} u \right] = 0 \\
\longrightarrow\quad & e^{2x} u''' + 6e^{2x} u'' + 12e^{2x} u' + \left[ 8e^{2x} - 8e^{2x} \right] u = 0 .
\end{aligned} \]

Again, the $u$ term vanishes, leaving us with

\[ e^{2x} u''' + 6e^{2x} u'' + 12e^{2x} u' = 0 . \]

Letting $v = u'$ and dividing out the exponential, this becomes the second-order differential equation
\[ v'' + 6v' + 12v = 0 . \tag{12.11} \]
Thus we have changed the problem of solving the third-order differential equation to one of solving a second-order differential equation. If we can now correctly guess a particular solution $v_1$ to that second-order differential equation, we could again use reduction of order to get the general solution $v(x)$ to that second-order equation, and then use that and the fact that $y = e^{2x} u$ with $v = u'$ to obtain the general solution to our original differential equation. Unfortunately, even though the order is less, “guessing” a solution to equation (12.11) is a good deal more difficult than was guessing a particular solution to the original differential equation, equation (12.10).
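
The algebra in this example can be spot-checked numerically. The sketch below assumes an arbitrary test function, $u(x) = \sin(x)$, which is not a solution of anything in the example; it only tests the identity $y''' - 8y = e^{2x}\left[ u''' + 6u'' + 12u' \right]$ for $y = e^{2x} u$. The helper names are made up for illustration.

```python
import cmath
import math


def lhs(x):
    """y''' - 8y for y = e^{2x} sin(x), computed via y = Im(exp((2+i)x))."""
    k = complex(2.0, 1.0)
    w = cmath.exp(k * x)          # w''' = k^3 w, and y is the imaginary part of w
    return (k**3 * w - 8 * w).imag


def rhs(x):
    """e^{2x} (u''' + 6u'' + 12u') with the test function u = sin(x)."""
    du, d2u, d3u = math.cos(x), -math.sin(x), -math.cos(x)
    return math.exp(2 * x) * (d3u + 6 * d2u + 12 * du)


if __name__ == "__main__":
    # The two columns should agree up to rounding error.
    for x in (-1.0, 0.0, 0.3, 1.5):
        print(x, lhs(x), rhs(x))
```

Agreement at a few sample points is consistent with the $u$ term dropping out as claimed; it does not help with guessing a solution to equation (12.11).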

As the example illustrates, even if we can, somehow, obtain one particular solution to a given $N^{\text{th}}$-order homogeneous linear differential equation, and then use it to reduce the problem to solving an $(N-1)^{\text{th}}$-order differential equation, that lower order differential equation may be just as hard to solve as the original differential equation (unless $N = 2$). In fact, we will learn how to solve differential equations such as equation (12.11), but those methods can also be used to find the general solution to the original differential equation, equation (12.10), as well.
Still, it does no harm to know that the problem of solving an $N^{\text{th}}$-order homogeneous linear differential equation can be reduced to that of solving an $(N-1)^{\text{th}}$-order differential equation, at least when we have one solution to the original equation. For the record, here is a theorem to that effect:

Theorem 12.3 (reduction of order in homogeneous equations)

Let $y$ be any solution to some $N^{\text{th}}$-order linear differential equation

\[ a_0 y^{(N)} + a_1 y^{(N-1)} + \cdots + a_{N-2} y'' + a_{N-1} y' + a_N y = g \tag{12.12} \]

where $g$ and the $a_k$’s are known functions on some interval $(\alpha, \beta)$, and let $y_1$ be a nontrivial particular solution to the corresponding homogeneous equation

\[ a_0 y^{(N)} + a_1 y^{(N-1)} + \cdots + a_{N-2} y'' + a_{N-1} y' + a_N y = 0 . \]

Set
\[ u = \frac{y}{y_1} \quad (\text{so that } y = y_1 u) . \]

Then $v = u'$ satisfies an $(N-1)^{\text{th}}$-order differential equation

\[ A_0 v^{(N-1)} + A_1 v^{(N-2)} + \cdots + A_{N-2} v' + A_{N-1} v = g \]


where the $A_k$’s are functions on the interval $(\alpha, \beta)$ that can be determined from the $a_k$’s along with $y_1$ and its derivatives.

The proof is relatively straightforward: You see what happens when you repeatedly use the
product rule with y = y1 u , and plug the results into the equation (12.12). You can ﬁll in the
details yourself (see exercise 12.4).

Additional Exercises

12.1. For each of the following differential equations, identify

i. the order of the equation,

ii. whether the equation is linear or not, and,

iii. if it is linear, whether the equation is homogeneous or not.

a. y + x 2 y − 4y = x 3 b. y + x 2 y − 4y = 0
c. y + x 2 y = 4y d. y + x 2 y + 4y = y 3
e. x y + 3y = e2x f. y + y = 0
g. (y + 1)y = (y )3 h. y = 2y − 5y + 30e3x
i. y (iv) + 6y + 3y − 83y − 25 = 0 j. yy + 6y + 3y = y
k. y + 3y = x 2 y l. y (55) = sin(x)
12.2. For each of the following, first verify that the given $y_1$ is a solution to the given differential equation, and then find the general solution to the differential equation using the given $y_1$ with the method of reduction of order.
a. $y'' - 5y' + 6y = 0$ , $y_1(x) = e^{2x}$
b. $y'' - 10y' + 25y = 0$ , $y_1(x) = e^{5x}$
c. $x^2 y'' - 6x y' + 12y = 0$ on $x > 0$ , $y_1(x) = x^3$
d. $2x^2 y'' - x y' + y = 0$ on $x > 0$ , $y_1(x) = x$
e. $4x^2 y'' + y = 0$ on $x > 0$ , $y_1(x) = \sqrt{x}$
f. $y'' - \left[ 4 + \frac{2}{x} \right] y' + \left[ 4 + \frac{4}{x} \right] y = 0$ on $x > 0$ , $y_1(x) = e^{2x}$
g. $(x + 1) y'' + x y' - y = 0$ , $y_1 = e^{-x}$
h. $y'' - \frac{1}{x} y' - 4x^2 y = 0$ , $y_1(x) = e^{-x^2}$
i. $y'' + y = 0$ , $y_1(x) = \sin(x)$
j. $x y'' + (2 + 2x) y' + 2y = 0$ on $x > 0$ , $y_1(x) = x^{-1}$
k. $\sin^2(x)\, y'' - 2\cos(x)\sin(x)\, y' + \left[ 1 + \cos^2(x) \right] y = 0$ , $y_1(x) = \sin(x)$


l. $x^2 y'' - 2x y' + \left[ x^2 + 2 \right] y = 0$ on $x > 0$ , $y_1(x) = x \sin(x)$
m. $x^2 y'' + x y' + y = 0$ on $x > 0$ , $y_1(x) = \sin(\ln|x|)$
n. $x^2 y'' + x y' + \left[ x^2 - \frac{1}{4} \right] y = 0$ on $x > 0$ , $y_1(x) = x^{-1/2} \cos(x)$

12.3. Several nonhomogeneous differential equations are given below. For each, ﬁrst verify that
the given y1 is a solution to the corresponding homogeneous differential equation, and then
ﬁnd the general solution to the given nonhomogeneous differential equation using reduction
of order with the given y1 .
a. $y'' - 4y' + 3y = 9e^{2x}$ , $y_1(x) = e^{3x}$
b. $y'' - 6y' + 8y = e^{4x}$ , $y_1(x) = e^{2x}$
c. $x^2 y'' + x y' - y = \sqrt{x}$ on $x > 0$ , $y_1(x) = x$
d. $x^2 y'' - 20y = 27x^5$ on $x > 0$ , $y_1(x) = x^5$
e. $x y'' + (2 + 2x) y' + 2y = 8e^{2x}$ on $x > 0$ , $y_1 = x^{-1}$
f. $(x + 1) y'' + x y' - y = (x + 1)^2$ , $y_1 = e^{-x}$

12.4. Prove the claims in theorem 12.3 assuming:

a. N = 3 b. N = 4 c. N is any positive integer

12.5. Each of the following is one of the relatively few third- and fourth-order differential equa-
tions that can be easily solved via reduction of order. For each, ﬁrst verify that the given
y1 is a solution to the given differential equation or to the corresponding homogeneous
equation (as appropriate), and then ﬁnd the general solution to the differential equation
using the given y1 with the method of reduction of order.
a. $y''' - 9y'' + 27y' - 27y = 0$ , $y_1 = e^{3x}$
b. $y''' - 9y'' + 27y' - 27y = e^{3x} \sin(x)$ , $y_1 = e^{3x}$
c. $y^{(4)} - 8y''' + 24y'' - 32y' + 16y = 0$ , $y_1 = e^{2x}$
d. x 3 y − 4y + 10y − 12y = 0 on x >0 , y1 = x 2


13 General Solutions to Homogeneous Linear Differential Equations

The reduction of order method is, admittedly, limited. First of all, you must already have one solution
to the given differential equation before you can start the method. Moreover, it’s just not that helpful
when the order of the equation is greater than two. We will be able to get around these limitations,
at least when dealing with certain important classes of differential equations, by using methods that
will be developed in later chapters. An important element of those methods is the construction of
general solutions from a “suitably chosen” collection of particular solutions. That element is the
topic of this chapter. We will discover how to choose that “suitably chosen” set and how to (easily)
construct a general solution from that set. Once you know that, you can proceed straight to chapter
15 (skipping, if you wish, chapter 14) and learn how to fully solve some of the more important
equations found in applications.
By the way, we will not abandon the reduction of order method. It will still be needed, and the
material in this chapter will help tell us when it is needed.

13.1 Second-Order Equations (Mainly)

The Problem and a Basic Question
Throughout this section, our interest will be in “ﬁnding” a general solution over some interval (α, β)
to a fairly arbitrary second-order, homogeneous linear differential equation

\[ ay'' + by' + cy = 0 . \]

As suggested in the last chapter, we will limit the interval (α, β) to being one over which the
functions a , b and c are all continuous with a never being zero.
At the end of section 12.3, we observed that the reduction of order method always seemed to
yield a general solution to the above differential equation in the form

y(x) = c1 y1 (x) + c2 y2 (x)

where $c_1$ and $c_2$ are arbitrary constants, and $y_1$ and $y_2$ are a pair of particular solutions that are not constant multiples of each other. Then, $y_1$ was a previously known solution to the differential equation, and $y_2$ was a solution that arose from the reduction of order method. You were not, however, given much advice on how to find that first particular solution. We will discover (in later
chapters) that there are relatively simple methods for finding “that one solution” at least for certain

common types of differential equations. We will further discover that, while these methods do not,
themselves, yield general solutions, they often do yield sets of particular solutions. Here is a simple
example:

!Example 13.1: Consider the homogeneous differential equation

\[ y'' + y = 0 \]

over the entire real line, $(-\infty, \infty)$. We can find at least two solutions by rewriting this as

\[ y'' = -y , \]

and then asking ourselves if we know of any basic functions (powers, exponentials, trigonometric functions, etc.) that satisfy this. It should not take long to recall that

\[ y(x) = \cos(x) \quad\text{and}\quad y(x) = \sin(x) \]

are two such functions: If $y(x) = \cos(x)$, then

\[ y''(x) = \frac{d^2}{dx^2} [\cos(x)] = \frac{d}{dx} [-\sin(x)] = -\cos(x) = -y(x) , \]

and if $y(x) = \sin(x)$, then

\[ y''(x) = \frac{d^2}{dx^2} [\sin(x)] = \frac{d}{dx} [\cos(x)] = -\sin(x) = -y(x) . \]

Thus, both
\[ y(x) = \cos(x) \quad\text{and}\quad y(x) = \sin(x) \]
are solutions to our differential equation. Moreover, it should be clear that neither is a constant multiple of the other. So, to find the general solution of our differential equation, should we use the reduction of order method with $y_1(x) = \cos(x)$ as the known solution, or the reduction of order method with $y_1(x) = \sin(x)$ as the known solution, or can we simply say

\[ y(x) = c_1 \cos(x) + c_2 \sin(x) \]

is the general solution, skipping the reduction of order method altogether?

Thus, we are led to the basic question:

Can the general solution to any second-order, homogeneous linear differential equation

\[ ay'' + by' + cy = 0 \]

be given by
\[ y(x) = c_1 y_1(x) + c_2 y_2(x) \]
where $c_1$ and $c_2$ are arbitrary constants, and $y_1$ and $y_2$ are any two solutions that are not constant multiples of each other?
Our goal is to answer this question by the end of this section.


Linear Combinations and the Principle of Superposition

Time to introduce a little terminology to simplify future discussion: Given any finite collection of functions $y_1$, $y_2$, ... and $y_N$, a linear combination of these $y_k$’s is any expression of the form
\[ c_1 y_1 + c_2 y_2 + \cdots + c_N y_N \]
where the $c_k$’s are constants. If these constants are all arbitrary, then the expression is, unsurprisingly, an arbitrary linear combination. In practice, we will often refer to some function $y$ as a linear combination of the $y_k$’s (over some interval $(\alpha, \beta)$) to indicate there are constants $c_1$, $c_2$, ..., and $c_N$ such that

\[ y(x) = c_1 y_1(x) + c_2 y_2(x) + \cdots + c_N y_N(x) \quad\text{for all } x \text{ in } (\alpha, \beta) . \]

!Example 13.2: Here are some linear combinations of cos(x) and sin(x) :

4 cos(x) + 2 sin(x) ,

3 cos(x) − 5 sin(x)
and
sin(x) (i.e., 0 cos(x) + 1 sin(x)) .

And
\[ c_1 \cos(x) + c_2 \sin(x) + c_3 e^x \]
is an arbitrary linear combination of the three functions $\cos(x)$, $\sin(x)$ and $e^x$.

Let us now assume that we have found two solutions $y_1$ and $y_2$ to our homogeneous differential equation
\[ ay'' + by' + cy = 0 \]
on $(\alpha, \beta)$. This means that

\[ a y_1'' + b y_1' + c y_1 = 0 \quad\text{and}\quad a y_2'' + b y_2' + c y_2 = 0 . \]

Let’s see what happens when we plug into our differential equation some linear combination of these two solutions, say,
\[ y = 2y_1 + 6y_2 . \]
By the fundamental properties of differentiation, we know that

\[ y' = [2y_1 + 6y_2]' = [2y_1]' + [6y_2]' = 2y_1' + 6y_2' \]

and
\[ y'' = [2y_1 + 6y_2]'' = [2y_1]'' + [6y_2]'' = 2y_1'' + 6y_2'' . \]

So,
\[ \begin{aligned}
ay'' + by' + cy &= a \left[ 2y_1'' + 6y_2'' \right] + b \left[ 2y_1' + 6y_2' \right] + c \left[ 2y_1 + 6y_2 \right] \\
&= 2a y_1'' + 6a y_2'' + 2b y_1' + 6b y_2' + 2c y_1 + 6c y_2 \\
&= 2 \left[ a y_1'' + b y_1' + c y_1 \right] + 6 \left[ a y_2'' + b y_2' + c y_2 \right] \\
&= 2 [0] + 6 [0] \\
&= 0 .
\end{aligned} \]


Thus, the linear combination $2y_1 + 6y_2$ is another solution to our homogeneous linear differential equation.
Of course, there was nothing special about the constants 2 and 6. If we had used any linear combination of $y_1$ and $y_2$
\[ y = c_1 y_1 + c_2 y_2 , \]
then the above computations would have yielded

\[ \begin{aligned}
ay'' + by' + cy &= a \left[ c_1 y_1'' + c_2 y_2'' \right] + b \left[ c_1 y_1' + c_2 y_2' \right] + c \left[ c_1 y_1 + c_2 y_2 \right] \\
&= \cdots \\
&= c_1 \left[ a y_1'' + b y_1' + c y_1 \right] + c_2 \left[ a y_2'' + b y_2' + c y_2 \right] \\
&= c_1 [0] + c_2 [0] \\
&= 0 .
\end{aligned} \]

Nor is there any reason to stop with two solutions. If $y$ had been any linear combination

\[ y = c_1 y_1 + c_2 y_2 + \cdots + c_K y_K \]

with each $y_k$ being a solution to our homogeneous differential equation,

\[ a y_k'' + b y_k' + c y_k = 0 , \]

then the above computations (expanded to account for the $K$ solutions) clearly would have yielded
\[ ay'' + by' + cy = c_1 [0] + c_2 [0] + \cdots + c_K [0] = 0 . \]
This is a major result, often called the “principle of superposition”.1 Being a major result, it deserves
its own theorem:

Theorem 13.1 (principle of superposition [for second-order equations])

Any linear combination of solutions to a second-order, homogeneous linear differential equation is,
itself, a solution to that homogeneous linear equation.

Note that this partially answers our basic question on page 262 by assuring us that

y = c1 y1 + c2 y2

is a solution to our differential equation for every choice of constants c1 and c2 . What remains is
to see whether this describes all possible solutions.
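
The principle of superposition can be illustrated numerically. The sketch below assumes the pair $y_1 = x^2$, $y_2 = x^2 \ln(x)$ of solutions to $x^2 y'' - 3x y' + 4y = 0$ on $(0, \infty)$ from chapter 12; representing a solution as a triple of functions and the helper names `combine` and `residual` are choices made for this illustration only.

```python
import math

# Each solution is stored as a triple of functions (y, y', y''), valid for x > 0.
Y1 = (lambda x: x**2,
      lambda x: 2 * x,
      lambda x: 2.0)
Y2 = (lambda x: x**2 * math.log(x),
      lambda x: 2 * x * math.log(x) + x,
      lambda x: 2 * math.log(x) + 3)


def combine(c1, s1, c2, s2):
    """Return the triple (y, y', y'') for the linear combination c1*s1 + c2*s2."""
    return tuple(
        (lambda x, f=f, g=g: c1 * f(x) + c2 * g(x)) for f, g in zip(s1, s2)
    )


def residual(sol, x):
    """Evaluate x^2 y'' - 3x y' + 4y at x for a solution triple."""
    y, dy, d2y = sol
    return x**2 * d2y(x) - 3 * x * dy(x) + 4 * y(x)


if __name__ == "__main__":
    y = combine(2.0, Y1, 6.0, Y2)   # the combination 2 y1 + 6 y2
    for x in (0.5, 1.0, 3.0):
        print(x, residual(y, x))    # each value should be zero up to rounding
```

Differentiating term by term inside `combine` mirrors the step in the derivation where the constants are pulled outside the derivatives.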

Linear Independence and Fundamental Sets

We now know that if we have, say, three solutions y1 , y2 and y3 to our homogeneous differential
equation, then any linear combination of these functions

y = c1 y1 + c2 y2 + c3 y3

1 The name comes from the fact that, geometrically, the graph of a linear combination of functions can be viewed as a
“superposition” of the graphs of the individual functions.


is also a solution. But what if one of these yk ’s is also a linear combination of the other yk ’s , say,

y3 = 4y1 + 2y2 .

Then we can simplify our expression for y by noting that

y = c1 y1 + c2 y2 + c3 y3
= c1 y1 + c2 y2 + c3 [4y1 + 2y2 ]
= [c1 + 4c3 ]y1 + [c2 + 2c3 ]y2 .

Since c1 + 4c3 and c2 + 2c3 are, themselves, just constants — call them a1 and a2 — our formula
for y reduces to
y = a1 y1 + a2 y2 .
Thus, our original formula for y did not require y3 at all. In fact, including this redundant function
gives us a formula with more constants than necessary. Not only is this a waste of ink, it will cause
difﬁculties when we use these formulas in solving initial-value problems.
This prompts even more terminology to simplify future discussion. Suppose

{ y1 , y2 , . . . , y M }

is a set of functions deﬁned on some interval. We will say this set is linearly independent (over the
given interval) if none of the yk ’s can be written as a linear combination of any of the others (over the
given interval). On the other hand, if at least one yk in the set can be written as a linear combination
of some of the others, then we will say the set is linearly dependent (over the given interval).

!Example 13.3: The set of functions

{ y1 (x), y2 (x), y3 (x) } = { cos(x), sin(x), 4 cos(x) + 2 sin(x) } .

is linearly dependent (over any interval) since the last function is clearly a linear combination of
the ﬁrst two.

By the way, we should observe the almost trivial fact that, whatever functions y1 , y2 , . . . and
y M may be,
0 = 0 · y1 + 0 · y2 + · · · + 0 · y M .
So the zero function can always be treated as a linear combination of other functions, and, hence,
cannot be one of the functions in any linearly independent set.

Linear Independence for Function Pairs

Matters simplify greatly when our set is just a pair of functions

{ y1 , y2 } .

In this case, the statement that one of these yk ’s is a linear combination of the other over the interval
(α, β) is just the statement that either y1 = c2 y2 for some constant c2 , that is,

y1 (x) = c2 y2 (x) for all x in (α, β) ,

or that y2 = c1 y1 for some constant c1 , that is,

y2 (x) = c1 y1 (x) for all x in (α, β) .


Either way, one function is simply a constant multiple of the other over the interval of interest. In fact,
unless c1 = 0 or c2 = 0 , then each function is a constant multiple of the other with c1 · c2 = 1 .
Thus, for a pair of functions, the concepts of linear independence and dependence reduce to the
following:

The set { y1 , y2 } is linearly independent.

⇐⇒ Neither y1 nor y2 is a constant multiple of the other.

and

The set { y1 , y2 } is linearly dependent.

⇐⇒ Either y1 or y2 is a constant multiple of the other.

In practice, this makes it relatively easy to determine when a pair of functions is linearly independent.

!Example 13.4: In example 13.1 we obtained

\[ \{ y_1(x), y_2(x) \} = \{ \cos(x), \sin(x) \} \]

as a pair of solutions for the homogeneous second-order linear differential equation

\[ y'' + y = 0 . \]

It should be clear that there is no constant $c_1$ or $c_2$ such that

\[ \cos(x) = c_2 \sin(x) \quad\text{for all } x \text{ in } (\alpha, \beta) \]

or
\[ \sin(x) = c_1 \cos(x) \quad\text{for all } x \text{ in } (\alpha, \beta) . \]

After all, if we were to believe, say, that $\sin(x) = c_1 \cos(x)$, then we would have to believe that there is a constant $c_1$ such that
\[ \frac{\sin(x)}{\cos(x)} = c_1 \quad\text{for all } x \text{ in } (\alpha, \beta) , \]

which, in turn, would require that we believe

\[ 0 = \frac{0}{1} = \frac{\sin(0)}{\cos(0)} = c_1 = \frac{\sin(\pi/4)}{\cos(\pi/4)} = \frac{\sqrt{2}/2}{\sqrt{2}/2} = 1 \; ! \]

So, clearly, neither $\sin(x)$ nor $\cos(x)$ is a constant multiple of the other over the real line. Hence, $\{\cos(x), \sin(x)\}$ is a linearly independent set (over the entire real line), and

\[ y(x) = c_1 \cos(x) + c_2 \sin(x) \]

not only describes many possible solutions to our differential equation, it cannot be simplified to an expression with fewer arbitrary constants.
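
The two-point argument used above can be turned into a small test. If one of $f$, $g$ were a constant multiple of the other on an interval, then $f(x_1) g(x_2) - f(x_2) g(x_1)$ would be zero for every pair of points in it, so a single nonzero value rules that out. The function name and the sample points below are illustrative assumptions, not from the text.

```python
import math


def cross_difference(f, g, x1, x2):
    """Return f(x1) g(x2) - f(x2) g(x1).

    If f = c*g or g = c*f on an interval containing x1 and x2, this is zero.
    A nonzero value therefore shows the pair {f, g} is linearly independent
    there; a zero value at one pair of points proves nothing either way.
    """
    return f(x1) * g(x2) - f(x2) * g(x1)


if __name__ == "__main__":
    # cos and sin, sampled at the two points used in the example
    print(cross_difference(math.cos, math.sin, 0.0, math.pi / 4))
    # cos and 3*cos form a dependent pair, so the difference is zero
    print(cross_difference(math.cos, lambda x: 3 * math.cos(x), 0.0, math.pi / 4))
```

With floating-point arithmetic, "nonzero" should be read as "clearly larger than rounding error".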


Fundamental Solution Sets and General Solutions

One final bit of terminology: Given any homogeneous linear differential equation

\[ a_0 y^{(N)} + a_1 y^{(N-1)} + \cdots + a_{N-2} y'' + a_{N-1} y' + a_N y = 0 \]

(over an interval $(\alpha, \beta)$), a fundamental set of solutions (for the given differential equation) is simply any linearly independent set of solutions

\[ \{ y_1, y_2, \ldots, y_M \} \]

such that the arbitrary linear combination of these solutions,

\[ y = c_1 y_1 + c_2 y_2 + \cdots + c_M y_M , \]

is a general solution to the differential equation.

While the above deﬁnition holds for any homogeneous linear differential equation, our current
interest is focused on the case where N = 2 , and, at this point, you are probably suspecting that
any linearly independent pair {y1 , y2 } of solutions to our second-order differential equation is a
fundamental set. This suspicion can be conﬁrmed by applying the principle of superposition and the
following lemma on the existence and uniqueness of solutions to the problems we are dealing with:

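Lemma 13.2 (existence and uniqueness for second-order homogeneous linear equations)
Consider the initial-value problem

\[ ay'' + by' + cy = 0 \quad\text{with}\quad y(x_0) = A \quad\text{and}\quad y'(x_0) = B \]

over an interval $(\alpha, \beta)$ containing the point $x_0$. Assume, further, that $a$, $b$ and $c$ are continuous functions on $(\alpha, \beta)$ with $a$ never being zero at any point in this interval. Then the initial-value problem has exactly one solution on $(\alpha, \beta)$.

This lemma is nothing more than theorem 12.1 on page 247 with “$g = 0$”. To see how it is relevant to us, let’s go back to the differential equation in an earlier example.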

!Example 13.5: We know that one linearly independent pair of solutions to

\[ y'' + y = 0 \]

on $(-\infty, \infty)$ is
\[ \{ y_1, y_2 \} = \{ \cos(x), \sin(x) \} . \]
In addition, the principle of superposition assures us that any linear combination of these solutions

\[ y(x) = c_1 \cos(x) + c_2 \sin(x) \]

is also a solution to the differential equation. To go further and verify that this pair is a fundamental set of solutions for the above differential equation, and that the above expression for $y$ is a general solution, we need to show that every solution to this differential equation can be written as the above linear combination for some choice of constants $c_1$ and $c_2$.
So let $\hat{y}$ be any single solution to the above differential equation, and consider the problem of solving the initial-value problem

\[ y'' + y = 0 \quad\text{with}\quad y(0) = A \quad\text{and}\quad y'(0) = B \tag{13.1a} \]


where
\[ A = \hat{y}(0) \quad\text{and}\quad B = \hat{y}'(0) . \tag{13.1b} \]

Obviously, $\hat{y}$ is a solution, but so is

\[ y(x) = c_1 \cos(x) + c_2 \sin(x) \]

provided we can find constants $c_1$ and $c_2$ such that the above and its derivative,
\[ y'(x) = \frac{d}{dx} \left[ c_1 \cos(x) + c_2 \sin(x) \right] = -c_1 \sin(x) + c_2 \cos(x) , \]
equal $A$ and $B$, respectively, when $x = 0$. This means we want to find $c_1$ and $c_2$ so that

\[ y(0) = c_1 \cos(0) + c_2 \sin(0) = A \]

and
\[ y'(0) = -c_1 \sin(0) + c_2 \cos(0) = B . \]

Since $\cos(0) = 1$ and $\sin(0) = 0$, this system reduces to

\[ c_1 \cdot 1 + c_2 \cdot 0 = A \]
and
\[ -c_1 \cdot 0 + c_2 \cdot 1 = B , \]

immediately telling us that $c_1 = A$ and $c_2 = B$. Thus, both

\[ \hat{y}(x) \quad\text{and}\quad A \cos(x) + B \sin(x) \]

are solutions over $(-\infty, \infty)$ to initial-value problem (13.1). But lemma 13.2 tells us that there is only one solution. So our two solutions must be the same; that is, we must have

\[ \hat{y}(x) = A \cos(x) + B \sin(x) \quad\text{for all } x \text{ in } (-\infty, \infty) . \]

Thus, not only is

\[ y(x) = c_1 \cos(x) + c_2 \sin(x) \]
a solution to
\[ y'' + y = 0 \]
for every pair of constants $c_1$ and $c_2$, this arbitrary linear combination describes every possible solution to this differential equation. In other words,

\[ y(x) = c_1 \cos(x) + c_2 \sin(x) \]

is a general solution to
\[ y'' + y = 0 , \]
and
\[ \{ \cos(x), \sin(x) \} \]
is a fundamental set of solutions to the above differential equation.
Before leaving this example, let us make one more observation; namely that, if $x_0$, $A$ and $B$ are any three real numbers, then lemma 13.2 tells us that there is exactly one solution to the initial-value problem

\[ y'' + y = 0 \quad\text{with}\quad y(x_0) = A \quad\text{and}\quad y'(x_0) = B . \]


Moreover, by the above analysis, we know that a solution is given by

\[ y(x) = c_1 \cos(x) + c_2 \sin(x) \]

for some single pair of constants $c_1$ and $c_2$. Hence, there is one and only one choice of constants $c_1$ and $c_2$ such that

\[ A = y(x_0) = c_1 \cos(x_0) + c_2 \sin(x_0) \]

and
\[ B = y'(x_0) = -c_1 \sin(x_0) + c_2 \cos(x_0) . \]

It turns out that much of the analysis just done in the last example for

\[ y'' + y = 0 \quad\text{on}\quad (-\infty, \infty) \]

can be repeated for any given

\[ ay'' + by' + cy = 0 \quad\text{on}\quad (\alpha, \beta) \]

provided a , b and c are all continuous functions on (α, β) with a never being zero. More
precisely, if you are given such a differential equation, along with a linearly independent pair of
solutions {y1 , y2 } , then by applying the principle of superposition and lemma 13.2 as done in the
above example, you can show:
1. The arbitrary linear combination

y(x) = c1 y1 (x) + c2 y2 (x) for all x in (α, β)

is a general solution to the differential equation (and, hence, {y1 , y2 } is a fundamental set of
solutions).
2. Given any point x 0 in (α, β) , and any two real numbers A and B , then there is exactly
one choice of constants c1 and c2 such that

y(x) = c1 y1 (x) + c2 y2 (x) for all x in (α, β)

is the solution to the initial-value problem

\[ ay'' + by' + cy = 0 \quad\text{with}\quad y(x_0) = A \quad\text{and}\quad y'(x_0) = B . \]
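
Finding the constants in the second statement amounts to solving two linear equations in $c_1$ and $c_2$. Here is a minimal sketch using Cramer's rule; the function name `fit_constants` and its argument order are hypothetical, and the code simply raises an error if the determinant of the system happens to vanish at $x_0$.

```python
import math


def fit_constants(y1, dy1, y2, dy2, x0, A, B):
    """Solve c1*y1(x0) + c2*y2(x0) = A and c1*y1'(x0) + c2*y2'(x0) = B."""
    a11, a12 = y1(x0), y2(x0)
    a21, a22 = dy1(x0), dy2(x0)
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("system is singular at x0")
    c1 = (A * a22 - a12 * B) / det   # Cramer's rule, first unknown
    c2 = (a11 * B - A * a21) / det   # Cramer's rule, second unknown
    return c1, c2


if __name__ == "__main__":
    # y'' + y = 0 with the pair {cos, sin} and conditions at x0 = 0
    print(fit_constants(math.cos, lambda x: -math.sin(x),
                        math.sin, math.cos, 0.0, 2.0, 5.0))
    # y'' - y = 0 with the pair {e^x, e^-x} and conditions at x0 = 0
    print(fit_constants(math.exp, math.exp,
                        lambda x: math.exp(-x), lambda x: -math.exp(-x),
                        0.0, 3.0, 1.0))
```

For the pair $\{\cos(x), \sin(x)\}$ at $x_0 = 0$ this reproduces $c_1 = A$ and $c_2 = B$, as found in example 13.5.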

Try it yourself:

?Exercise 13.1: Consider the homogeneous linear differential equation

\[ y'' - y = 0 . \]

a: Verify that
\[ \left\{ e^x , e^{-x} \right\} \]
is a linearly independent pair of solutions to the above differential equation on $(-\infty, \infty)$.
b: Verify that
\[ y(x) = c_1 e^x + c_2 e^{-x} \]
satisfies the given differential equation for any choice of constants $c_1$ and $c_2$.


c: Let $A$ and $B$ be any two real numbers, and find the values of $c_1$ and $c_2$ (in terms of $A$ and $B$) such that
\[ y(x) = c_1 e^x + c_2 e^{-x} \]
satisfies
\[ y'' - y = 0 \quad\text{with}\quad y(0) = A \quad\text{and}\quad y'(0) = B . \]
(Answer: $c_1 = (A + B)/2$ and $c_2 = (A - B)/2$ )
d: Let $\hat{y}$ be any solution to the given differential equation, and, using lemma 13.2 and the results from the last exercise (with $A = \hat{y}(0)$ and $B = \hat{y}'(0)$), show that

\[ \hat{y}(x) = c_1 e^x + c_2 e^{-x} \quad\text{for all } x \text{ in } (-\infty, \infty) \]

for some choice of constants $c_1$ and $c_2$.

e: Note that, by the above, it follows that

\[ y(x) = c_1 e^x + c_2 e^{-x} \]

is a general solution to
\[ y'' - y = 0 \quad\text{on}\quad (-\infty, \infty) . \]

Hence, $\{ e^x , e^{-x} \}$ is a fundamental set of solutions to this differential equation. Now convince yourself (preferably using lemma 13.2) that every initial-value problem

\[ y'' - y = 0 \quad\text{with}\quad y(x_0) = A \quad\text{and}\quad y'(x_0) = B \]

has a solution of the form

\[ y(x) = c_1 e^x + c_2 e^{-x} \]
(assuming, of course, that $x_0$, $A$ and $B$ are real numbers).

The Big Theorem on Second-Order Homogeneous Equations

With the principle of superposition and existence/uniqueness lemma 13.2, you should be able to verify
that any given linearly independent pair {y1 , y2 } of solutions to any given reasonable second-order,
homogeneous linear differential equation

\[ ay'' + by' + cy = 0 \]

is a fundamental set of solutions to the differential equation. In fact, it should seem reasonable that
we could prove a general theorem to this effect, allowing us to simply invoke that theorem without
going through the details we went through in the above example and exercise, and even without
knowing a particular set of solutions. That theorem would look something like the following:

Theorem 13.3 (general solutions to second-order, homogeneous linear differential equations)

Let $(\alpha, \beta)$ be some open interval, and suppose we have a second-order homogeneous linear differential equation
\[ ay'' + by' + cy = 0 \]
where, on $(\alpha, \beta)$, the functions $a$, $b$ and $c$ are continuous, and $a$ is never zero. Then the following statements all hold:
1. Fundamental sets of solutions for this differential equation (over (α, β) ) exist.


2. Every fundamental solution set consists of a pair of solutions.

3. If {y1 , y2 } is any linearly independent pair of particular solutions over (α, β) , then:
(a) {y1 , y2 } is a fundamental set of solutions.
(b) A general solution to the differential equation is given by

y(x) = c1 y1 (x) + c2 y2 (x)

where c1 and c2 are arbitrary constants.

(c) Given any point x 0 in (α, β) and any two ﬁxed values A and B , there is exactly one
ordered pair of constants {c1 , c2 } such that

y(x) = c1 y1 (x) + c2 y2 (x)

also satisfies the initial conditions

\[ y(x_0) = A \quad\text{and}\quad y'(x_0) = B . \]

This theorem can be considered as the “Big Theorem on Second-Order, Homogeneous Linear
Differential Equations”. It will be used repeatedly, often without comment, in the chapters that
follow. A full proof of this theorem is given in the next chapter for the dedicated reader. That proof
is, basically, an expansion of the discussion given in example 13.5.
By the way, the statement about “initial conditions” in the above theorem further assures us
that second-order sets of initial conditions are appropriate for second-order, homogeneous linear
differential equations. It also assures us that no linearly independent pair of solutions for a second-
order, homogeneous linear differential equation can become “degenerate” at any point in the interval
(α, β) . (To see why we might be worried about “degeneracy”, see exercises 13.3 and 13.4 at the end
of the chapter.)

!Example 13.6: Consider the differential equation

x^2 y'' + x y' − 4y = 0 .

This is
ay'' + by' + cy = 0
with
a(x) = x^2 , b(x) = x and c(x) = −4 .

These are all continuous functions everywhere, but

a(0) = 0^2 = 0 .

So, to apply the above theorem, our interval of interest (α, β) must not include x = 0 . Accordingly,
let’s consider solving

x^2 y'' + x y' − 4y = 0 on (0, ∞) .

You can easily verify that one pair of solutions² is

y1(x) = x^{−2} and y2(x) = x^2 .

² We’ll discuss a method for finding these particular solutions in chapter 18.


It should be obvious that neither is a constant multiple of the other. Theorem 13.3 now immediately
tells us that this linearly independent pair
{ x^{−2} , x^2 }

is a fundamental set of solutions for our differential equation, and that

y(x) = c1 x^{−2} + c2 x^2

is a general solution to this differential equation.

(By the way, note that one of the solutions, specifically y1(x) = x^{−2} , “blows up” at the
point where a(x) = x^2 is zero. This is something that often happens with solutions of

ay'' + by' + cy = 0

at a point where a is zero.)
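
The "easily verified" claim in example 13.6 can also be checked mechanically. The following is a minimal sketch, not part of the original text: it plugs hand-computed derivatives of x^{−2} and x^2 into the left side of the equation at a few arbitrarily chosen sample points in (0, ∞), using exact rational arithmetic. The helper names (residual, wronskian, sample_points) are illustrative assumptions.

```python
from fractions import Fraction

def residual(y, dy, d2y, x):
    """Left side x^2 y'' + x y' - 4y of the equation, evaluated at x."""
    return x**2 * d2y(x) + x * dy(x) - 4 * y(x)

# y1(x) = x^(-2) with its first and second derivatives (computed by hand)
y1 = lambda x: x**-2
dy1 = lambda x: -2 * x**-3
d2y1 = lambda x: 6 * x**-4

# y2(x) = x^2 with its first and second derivatives
y2 = lambda x: x**2
dy2 = lambda x: 2 * x
d2y2 = lambda x: 2 + 0 * x

# Arbitrary sample points in (0, infinity); Fractions keep the arithmetic exact
sample_points = [Fraction(1, 3), Fraction(1), Fraction(5, 2), Fraction(7)]
residuals_y1 = [residual(y1, dy1, d2y1, x) for x in sample_points]
residuals_y2 = [residual(y2, dy2, d2y2, x) for x in sample_points]

def wronskian(x):
    """W[y1, y2](x) = y1 y2' - y1' y2, which works out to 4/x for this pair."""
    return y1(x) * dy2(x) - dy1(x) * y2(x)
```

Every residual comes out exactly zero, and the Wronskian 4/x is nonzero throughout (0, ∞), which is consistent with the pair being linearly independent there. A check at sample points is only a sanity test, of course, not a proof.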

13.2 Homogeneous Linear Equations of Arbitrary Order

The discussion in the previous section can, naturally, be extended to an analogous discussion con-
cerning solutions to homogeneous linear equations of any order. Two results of this discussion that
should be noted are the generalizations of the principle of superposition and the big theorem on
second-order, homogeneous linear differential equations (theorem 13.3). Here are those generaliza-
tions:

Theorem 13.4 (principle of superposition)

Any linear combination of solutions to a homogeneous linear differential equation is, itself, a solution
to that homogeneous linear equation.

Theorem 13.5 (general solutions to homogeneous linear differential equations)

Let (α, β) be some open interval, and suppose we have an Nth-order homogeneous linear differential
equation
a0 y^(N) + a1 y^(N−1) + · · · + a_{N−2} y'' + a_{N−1} y' + a_N y = 0
where, on (α, β) , the ak ’s are all continuous functions with a0 never being zero. Then the following
statements all hold:
1. Fundamental sets of solutions for this differential equation (over (α, β) ) exist.
2. Every fundamental solution set consists of exactly N solutions.
3. If {y1 , y2 , . . . , y N } is any linearly independent set of N particular solutions over (α, β) ,
then:
(a) {y1 , y2 , . . . , y N } is a fundamental set of solutions.
(b) A general solution to the differential equation is given by

y(x) = c1 y1 (x) + c2 y2 (x) + · · · + c N y N (x)

where c1 , c2 , . . . and c N are arbitrary constants.


(c) Given any point x 0 in (α, β) and any N ﬁxed values A1 , A2 , . . . and A N , there is
exactly one ordered set of constants {c1 , c2 , . . . , c N } such that

y(x) = c1 y1 (x) + c2 y2 (x) + · · · + c N y N (x)

also satisﬁes the initial conditions

y(x0) = A1 , y'(x0) = A2 , y''(x0) = A3 , ... and y^(N−1)(x0) = AN .

The proof of the more general version of the principle of superposition is a straightforward
extension of the derivation of the original version. Proving theorem 13.5 is more challenging and
will be discussed in the next chapter after proving the theorem it generalizes.

13.3 Linear Independence and Wronskians

As we saw in section 13.1, determining whether a set of just two solutions {y1 , y2 } is linearly
independent or not is simply a matter of checking whether one function is a constant multiple of the
other. However, when the set has three or more solutions, {y1 , y2 , y3 , . . .} , then the basic method of
determining linear independence requires checking to see if any of the yk ’s is a linear combination
of the others. This can be a difﬁcult task. Fortunately, this task can be simpliﬁed by the use of
“Wronskians”.

Deﬁnition of Wronskians
Let {y1 , y2 , . . . , y N } be a set of N sufﬁciently differentiable functions on an interval (α, β) . The
corresponding Wronskian, denoted by either W or W[y1 , y2 , . . . , yN] , is the function on (α, β)
generated by the following determinant of a matrix of derivatives of the yk ’s :

\[
W = W[y_1, y_2, \ldots, y_N] =
\begin{vmatrix}
y_1 & y_2 & y_3 & \cdots & y_N \\
y_1' & y_2' & y_3' & \cdots & y_N' \\
y_1'' & y_2'' & y_3'' & \cdots & y_N'' \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
y_1^{(N-1)} & y_2^{(N-1)} & y_3^{(N-1)} & \cdots & y_N^{(N-1)}
\end{vmatrix} .
\]

In particular, if N = 2 ,

\[
W = W[y_1, y_2] =
\begin{vmatrix} y_1 & y_2 \\ y_1' & y_2' \end{vmatrix}
= y_1 y_2' - y_1' y_2 .
\]

!Example 13.7: Let’s ﬁnd W [y1 , y2 ] on the real line when

y1(x) = x^2 and y2(x) = x^3 .

In this case,
y1'(x) = 2x and y2'(x) = 3x^2 ,


and

\[
W[y_1, y_2] =
\begin{vmatrix} y_1(x) & y_2(x) \\ y_1'(x) & y_2'(x) \end{vmatrix}
= \begin{vmatrix} x^2 & x^3 \\ 2x & 3x^2 \end{vmatrix}
= x^2 \cdot 3x^2 - 2x \cdot x^3 = x^4 .
\]

Applications of Wronskians
One reason for our interest in Wronskians is that they naturally arise when solving initial-value
problems. For example, suppose we have a pair of functions y1 and y2 , and we want to find
constants c1 and c2 such that
y(x) = c1 y1(x) + c2 y2(x)

satisfies
y(x0) = 2 and y'(x0) = 5
for some given point x0 in our interval of interest. In solving for c1 and c2 , you can easily show
(and we will do it in section 14.1) that

c1 W(x0) = 2y2'(x0) − 5y2(x0) and c2 W(x0) = 5y1(x0) − 2y1'(x0) .

Thus, if W(x0) ≠ 0 , then there is exactly one possible value for c1 and one possible value for c2 ,
namely,
c1 = [2y2'(x0) − 5y2(x0)] / W(x0) and c2 = [5y1(x0) − 2y1'(x0)] / W(x0) .
However, if W(x0) = 0 , then the system reduces to

0 = 2y2'(x0) − 5y2(x0) and 0 = 5y1(x0) − 2y1'(x0)

which cannot be solved for c1 and c2 . (In practice, you probably don’t even notice that your
formulas involve the Wronskian.)
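
To make the role of the Wronskian concrete, here is a small sketch of the formulas above for general initial values y(x0) = A and y'(x0) = B. The function name solve_constants and the choice of the pair cos(2x), sin(2x) with x0 = 0 are assumptions made purely for illustration; the text itself leaves y1 and y2 unspecified.

```python
import math

def solve_constants(y1, dy1, y2, dy2, x0, A, B):
    """Return (c1, c2) so that y = c1*y1 + c2*y2 has y(x0) = A and y'(x0) = B.

    Uses c1*W = A*y2' - B*y2 and c2*W = B*y1 - A*y1', where
    W = y1*y2' - y1'*y2 is the Wronskian evaluated at x0.
    """
    W = y1(x0) * dy2(x0) - dy1(x0) * y2(x0)
    if W == 0:
        # A vanishing Wronskian means the formulas cannot be solved for c1, c2.
        raise ValueError("Wronskian vanishes at x0")
    c1 = (A * dy2(x0) - B * y2(x0)) / W
    c2 = (B * y1(x0) - A * dy1(x0)) / W
    return c1, c2

# Illustrative pair (an assumption): y1 = cos(2x), y2 = sin(2x), x0 = 0
c1, c2 = solve_constants(
    lambda x: math.cos(2 * x), lambda x: -2 * math.sin(2 * x),
    lambda x: math.sin(2 * x), lambda x: 2 * math.cos(2 * x),
    0.0, 2.0, 5.0,
)
```

With this pair the Wronskian at 0 is 2, so the constants are determined uniquely. In floating-point work an exact comparison with zero is fragile; a tolerance would be the more careful choice in practice.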
Another reason for our interest — a possibly more important reason — is that the vanishing
of a Wronskian of a set of solutions signals that the given set is not a good choice in constructing
solutions to initial-value problems. The value of this fact is enhanced by the following remarkable
theorem:

Theorem 13.6 (Wronskians and fundamental solution sets)

Let W be the Wronskian of any set {y1 , y2 , . . . , yN } of N particular solutions to an Nth-order,
homogeneous linear differential equation

a0 y^(N) + a1 y^(N−1) + · · · + a_{N−2} y'' + a_{N−1} y' + a_N y = 0

on some open interval (α, β) . Assume further that the ak ’s are continuous functions with a0 never
being zero on (α, β) . Then:
1. If W(x0) = 0 for any single point x0 in (α, β) , then W(x) = 0 for every point x in (α, β) ,
and the set {y1 , y2 , . . . , yN } is not linearly independent (and, hence, is not a fundamental
solution set) on (α, β) .
2. If W(x0) ≠ 0 for any single point x0 in (α, β) , then W(x) ≠ 0 for every point x in
(α, β) , and {y1 , y2 , . . . , yN } is a fundamental set of solutions for the given differential
equation on (α, β) .


This theorem (whose proof will be discussed in the next chapter) gives us a relatively easy way
to determine if a set of solutions to a linear homogeneous differential equation is a fundamental set
of solutions. This test is especially useful when the order of the differential equation is 3 or higher.

!Example 13.8: Consider the functions

y1(x) = 1 , y2(x) = cos(2x) and y3(x) = sin^2(x) .

You can easily verify that all are solutions (over the entire real line) to the homogeneous third-order
linear differential equation
y''' + 4y' = 0 .
So, is
{ 1 , cos(2x) , sin^2(x) }

a fundamental set of solutions for this differential equation? To check we compute the first-order
derivatives

y1'(x) = 0 , y2'(x) = −2 sin(2x) and y3'(x) = 2 sin(x) cos(x) ,

the second-order derivatives

y1''(x) = 0 , y2''(x) = −4 cos(2x) and y3''(x) = 2 cos^2(x) − 2 sin^2(x) ,

and form the corresponding Wronskian,

\[
W(x) = W[1, \cos(2x), \sin^2(x)] =
\begin{vmatrix}
1 & \cos(2x) & \sin^2(x) \\
0 & -2\sin(2x) & 2\sin(x)\cos(x) \\
0 & -4\cos(2x) & 2\cos^2(x) - 2\sin^2(x)
\end{vmatrix} .
\]

Rather than compute this determinant for all values of x (which would be very tedious), let us
simply pick a convenient value for x , say x = 0 , and compute the Wronskian at that point:

\[
W(0) =
\begin{vmatrix}
1 & \cos(2 \cdot 0) & \sin^2(0) \\
0 & -2\sin(2 \cdot 0) & 2\sin(0)\cos(0) \\
0 & -4\cos(2 \cdot 0) & 2\cos^2(0) - 2\sin^2(0)
\end{vmatrix}
=
\begin{vmatrix}
1 & 1 & 0 \\
0 & 0 & 0 \\
0 & -4 & 2
\end{vmatrix}
= 0 .
\]

Theorem 13.6 now assures us that, because this Wronskian vanishes at that one point, it must
vanish everywhere. More importantly for us, this theorem also tells us that {1, cos(2x) , sin^2(x)}
is not a fundamental set of solutions for our differential equation.

!Example 13.9: Now consider the functions

y1 (x) = 1 , y2 (x) = cos(2x) and y3 (x) = sin(2x) .

Again, you can easily verify that all are solutions (over the entire real line) to the homogeneous
third-order linear differential equation

y''' + 4y' = 0 .

So, is
{ 1, cos(2x) , sin(2x)}


a fundamental set of solutions for our differential equation, above? To check we compute the
appropriate derivatives and form the corresponding Wronskian,

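\[
W(x) = W[1, \cos(2x), \sin(2x)]
=
\begin{vmatrix}
y_1 & y_2 & y_3 \\
y_1' & y_2' & y_3' \\
y_1'' & y_2'' & y_3''
\end{vmatrix}
=
\begin{vmatrix}
1 & \cos(2x) & \sin(2x) \\
0 & -2\sin(2x) & 2\cos(2x) \\
0 & -4\cos(2x) & -4\sin(2x)
\end{vmatrix} .
\]

Letting x = 0 , we get

\[
W(0) =
\begin{vmatrix}
1 & \cos(2 \cdot 0) & \sin(2 \cdot 0) \\
0 & -2\sin(2 \cdot 0) & 2\cos(2 \cdot 0) \\
0 & -4\cos(2 \cdot 0) & -4\sin(2 \cdot 0)
\end{vmatrix}
=
\begin{vmatrix}
1 & 1 & 0 \\
0 & 0 & 2 \\
0 & -4 & 0
\end{vmatrix}
= 8 \neq 0 .
\]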

Theorem 13.6 assures us that, since this Wronskian is nonzero at one point, it is nonzero ev-
erywhere, and that {1, cos(2x) , sin(2x)} is a fundamental set of solutions for our differential
equation. Hence,
y(x) = c1 · 1 + c2 cos(2x) + c3 sin(2x)
is a general solution to our third-order differential equation.
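
The two Wronskian computations in examples 13.8 and 13.9 can be repeated numerically at points other than x = 0, which illustrates the "zero at one point means zero everywhere" dichotomy of theorem 13.6. This is only a sketch: det3 and the two wronskian_* functions are names invented here, and the matrix entries are the hand-computed derivatives from the examples (with y3'' = −4 sin(2x) for y3 = sin(2x)).

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def wronskian_13_8(x):
    """Wronskian of {1, cos(2x), sin^2(x)} at x."""
    s, c = math.sin(x), math.cos(x)
    return det3([
        [1.0, math.cos(2 * x), s * s],
        [0.0, -2 * math.sin(2 * x), 2 * s * c],
        [0.0, -4 * math.cos(2 * x), 2 * c * c - 2 * s * s],
    ])

def wronskian_13_9(x):
    """Wronskian of {1, cos(2x), sin(2x)} at x."""
    return det3([
        [1.0, math.cos(2 * x), math.sin(2 * x)],
        [0.0, -2 * math.sin(2 * x), 2 * math.cos(2 * x)],
        [0.0, -4 * math.cos(2 * x), -4 * math.sin(2 * x)],
    ])

# A few arbitrary evaluation points on the real line
points = [-2.0, 0.0, 0.3, 1.0, 4.5]
values_13_8 = [wronskian_13_8(x) for x in points]
values_13_9 = [wronskian_13_9(x) for x in points]
```

Up to rounding error, the first list is zero at every point and the second is 8 at every point, matching what the theorem predicts from the single evaluations at x = 0.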

Additional Exercises

13.2. Several initial-value problems are given below, each involving a second-order homogeneous
linear differential equation, and each with a pair of functions y1 (x) and y2 (x) . Verify that
each pair {y1 , y2 } is a fundamental set of solutions to the given differential equation (verify
both that y1 and y2 are solutions and that the pair is linearly independent), and then ﬁnd
a linear combination of these solutions that satisﬁes the given initial-value problem.
a. I.v. problem: y'' + 4y = 0 with y(0) = 2 and y'(0) = 6 .
Functions: y1(x) = cos(2x) and y2(x) = sin(2x) .
b. I.v. problem: y'' − 4y = 0 with y(0) = 0 and y'(0) = 12 .
Functions: y1(x) = e^{2x} and y2(x) = e^{−2x} .
c. I.v. problem: y'' + y' − 6y = 0 with y(0) = 8 and y'(0) = −9 .
Functions: y1(x) = e^{2x} and y2(x) = e^{−3x} .
d. I.v. problem: y'' − 4y' + 4y = 0 with y(0) = 1 and y'(0) = 6 .
Functions: y1(x) = e^{2x} and y2(x) = x e^{2x} .
e. I.v. problem: x^2 y'' − 4x y' + 6y = 0 with y(1) = 0 and y'(1) = 4 .
Functions: y1(x) = x^2 and y2(x) = x^3 .
f. I.v. problem: 4x^2 y'' + 4x y' − y = 0 with y(1) = 8 and y'(1) = 1 .
Functions: y1(x) = √x and y2(x) = 1/√x .


g. I.v. problem: x^2 y'' − x y' + y = 0 with y(1) = 5 and y'(1) = 3 .
Functions: y1(x) = x and y2(x) = x ln |x| .
h. I.v. problem: x y'' − y' + 4x^3 y = 0
with y(√π) = 3 and y'(√π) = 4 .
Functions: y1(x) = cos(x^2) and y2(x) = sin(x^2) .
i. I.v. problem: (x + 1)^2 y'' − 2(x + 1)y' + 2y = 0
with y(0) = 0 and y'(0) = 4 .
Functions: y1(x) = x^2 − 1 and y2(x) = x + 1 .

13.3. In exercise 13.2 e, above, you found that {x^2 , x^3} is a fundamental set of solutions to
x^2 y'' − 4x y' + 6y = 0 ,
at least over some interval containing x0 = 1 .

a. What is the largest interval containing x0 = 1 for which theorem 13.3 on page 270 assures
us {x^2 , x^3} is a fundamental set of solutions to the above differential equation?
b. Attempt to find constants c1 and c2 so that
y(x) = c1 x^2 + c2 x^3

satisfies the initial conditions

y(0) = 0 and y'(0) = −4 .

What goes wrong? Why does this not violate the claim in theorem 13.3 about initial-value
problems being solvable?
13.4. In exercise 13.2 h, above, you found that {cos(x^2) , sin(x^2)} is a fundamental set of
solutions to
x y'' − y' + 4x^3 y = 0 ,
at least over some interval containing x0 = √π .

a. What is the largest interval containing x0 = √π for which theorem 13.3 on page 270 assures
us {cos(x^2) , sin(x^2)} is a fundamental set of solutions to the above differential equation?
b. Attempt to find constants c1 and c2 so that

y(x) = c1 cos(x^2) + c2 sin(x^2)

satisfies the initial conditions

y(0) = 1 and y'(0) = 4 .
What goes wrong? Why does this not violate the claim in theorem 13.3 about initial-value
problems being solvable?
13.5. Some third- and fourth-order initial-value problems are given below, each involving a ho-
mogeneous linear differential equation, and each with a set of three or four functions y1 (x) ,
y2 (x) , . . . . Verify that these functions form a fundamental set of solutions to the given
differential equation, and then ﬁnd a linear combination of these solutions that satisﬁes the
given initial-value problem.


a. I.v. problem: y''' + 4y' = 0
with y(0) = 3 , y'(0) = 8 and y''(0) = 4 .
Functions: y1(x) = 1 , y2(x) = cos(2x) and y3(x) = sin(2x) .
b. I.v. problem: y''' + 4y' = 0
with y(0) = 3 , y'(0) = 8 and y''(0) = 4 .
Functions: y1(x) = 1 , y2(x) = sin^2(x) and y3(x) = sin(x) cos(x) .
c. I.v. problem: y^(4) − y = 0
with y(0) = 0 , y'(0) = 4 , y''(0) = 0 and y'''(0) = 0 .
Functions: y1(x) = cos(x) , y2(x) = sin(x) , y3(x) = cosh(x)
and y4(x) = sinh(x) .

13.6. Particular solutions to the differential equation in each of the following initial-value
problems can be found by assuming
y(x) = e^{rx}
where r is a constant to be determined. To determine these constants, plug this formula for
y into the differential equation, observe that the resulting equation miraculously simplifies
to a simple algebraic equation for r , and solve for the possible values of r .
Do that with each equation and use those solutions (with the big theorem on general
solutions to second-order, homogeneous linear equations, theorem 13.3 on page 270)
to construct a general solution to the differential equation. Then, finally, solve the given
initial-value problem.
a. y'' − 4y = 0 with y(0) = 1 and y'(0) = 0

b. y'' + 2y' − 3y = 0 with y(0) = 0 and y'(0) = 1

c. y'' − 10y' + 9y = 0 with y(0) = 8 and y'(0) = −24

d. y'' + 5y' = 0 with y(0) = 1 and y'(0) = 0

13.7. Find solutions of the form

y(x) = e^{rx}
where r is a constant (as in the previous exercise) and use the solutions found (along with
the results given in theorem 13.5 on page 272) to construct general solutions to the following
differential equations:
a. y''' − 9y' = 0 b. y^(4) − 10y'' + 9y = 0

13.8. Thus far, we’ve derived two general solutions to

y'' + y = 0 .

In chapter 11 we obtained
y(x) = A sin(x + B) ,
and in this chapter, we got
y(x) = c1 sin(x) + c2 cos(x) .

Using a well-known trigonometric identity, verify that these two solutions are equivalent,
and ﬁnd how c1 and c2 are related to A and B .


where x 0 is some point in (α, β) , and A and B are any two real numbers. Recall that lemma 13.2
on page 267 assures us that any such initial-value problem has one and only one solution.

Solving Initial-Value Problems

Until further notice, we will assume that we have a pair of solutions over (α, β)

{ y1 , y2 }

to our differential equation

ay'' + by' + cy = 0 .
By the principle of superposition (theorem 13.1 on page 264), we know that any linear combination
of these two solutions

y(x) = c1 y1 (x) + c2 y2 (x) for all x in (α, β)

is also a solution to our differential equation. Now let us consider solving initial-value problem
(14.1) using this linear combination.
As already noted, this linear combination satisﬁes our differential equation. So all we need to
do is to ﬁnd constants c1 and c2 such that

A = y(x0) = c1 y1(x0) + c2 y2(x0)
and
B = y'(x0) = c1 y1'(x0) + c2 y2'(x0) .

That is, we need to solve the linear algebraic system of two equations

c1 y1(x0) + c2 y2(x0) = A

c1 y1'(x0) + c2 y2'(x0) = B

for c1 and c2 . But this is easy. Start by multiplying each equation by y2'(x0) or y2(x0) , as
appropriate:

\[
\begin{aligned}
\bigl[\, c_1 y_1(x_0) + c_2 y_2(x_0) = A \,\bigr]\, y_2'(x_0)
  &\;\Longrightarrow\; c_1 y_1(x_0) y_2'(x_0) + c_2 y_2(x_0) y_2'(x_0) = A y_2'(x_0) \\
\bigl[\, c_1 y_1'(x_0) + c_2 y_2'(x_0) = B \,\bigr]\, y_2(x_0)
  &\;\Longrightarrow\; c_1 y_1'(x_0) y_2(x_0) + c_2 y_2'(x_0) y_2(x_0) = B y_2(x_0)
\end{aligned}
\]

Subtracting the second equation from the first (and looking carefully at the results) yields

\[
c_1 \underbrace{\bigl[\, y_1(x_0) y_2'(x_0) - y_1'(x_0) y_2(x_0) \,\bigr]}_{W(x_0)}
+ c_2 \underbrace{\bigl[\, y_2(x_0) y_2'(x_0) - y_2'(x_0) y_2(x_0) \,\bigr]}_{0}
= A y_2'(x_0) - B y_2(x_0) .
\]

y1(x0) y2'(x0) = y2(x0) y1'(x0) . (14.4)

Let us consider all the possibilities. For simplicity, we will start by assuming y2(x0) ≠ 0 :


1. If y2(x0) ≠ 0 and y1'(x0) ≠ 0 , then equation (14.4) yields

y1(x0) y2'(x0) = y2(x0) y1'(x0) ≠ 0 ,
implying that y1(x0) ≠ 0 and y2'(x0) ≠ 0 . Consequently, we can divide both sides of this
equation by y2(x0) and y2'(x0) and set
κ = y1(x0) / y2(x0) = y1'(x0) / y2'(x0) .
Thus, κ is a constant such that
y1(x0) = κ y2(x0) and y1'(x0) = κ y2'(x0) ,
and corollary 14.3 tells us that {y1 , y2 } is linearly dependent.
2. If y2(x0) ≠ 0 and y1'(x0) = 0 , then equation (14.4) yields
y1(x0) y2'(x0) = y2(x0) y1'(x0) = y2(x0) · 0 = 0 ,
telling us that y1(x0) = 0 or y2'(x0) = 0 .
(a) But if y1(x0) = 0 , then y1 is the solution to
ay'' + by' + cy = 0 with y(x0) = 0 and y'(x0) = 0 ,
which, as noted in corollary 14.2, means that y1 is the trivial solution, automatically
making {y1 , y2 } linearly dependent.
(b) On the other hand, if y2'(x0) = 0 , then we can set
κ = y1(x0) / y2(x0) .
This, along with the fact that y1'(x0) = 0 and y2'(x0) = 0 , gives us
y1(x0) = κ y2(x0) and y1'(x0) = κ y2'(x0) ,
and corollary 14.3 tells us that {y1 , y2 } is linearly dependent.
From the above it follows that {y1 , y2 } must be linearly dependent if we have both W(x0) = 0 and
y2(x0) ≠ 0 . Now let’s assume W(x0) = 0 and y2(x0) = 0 . Equation (14.4) still holds, but, this
time, yields
y1(x0) y2'(x0) = y2(x0) y1'(x0) = 0 · y1'(x0) = 0 ,
which means that y1(x0) = 0 or y2'(x0) = 0 .
1. If y2(x0) = 0 and y2'(x0) = 0 , then y2 satisfies
ay'' + by' + cy = 0 with y(x0) = 0 and y'(x0) = 0 .
Again, corollary 14.2 tells us that y2 is the trivial solution, automatically making {y1 , y2 }
linearly dependent.
2. On the other hand, if y2(x0) = 0 , y2'(x0) ≠ 0 and y1(x0) = 0 , then we can set
κ = y1'(x0) / y2'(x0)
and observe that, from this and the fact that y1(x0) = 0 = y2(x0) , we have
y1(x0) = κ y2(x0) and y1'(x0) = κ y2'(x0) ,
which, according to corollary 14.3, means that {y1 , y2 } is linearly dependent.


Thus, we have that {y1 , y2 } is linearly dependent whenever W(x0) = 0 , whether or not y2(x0) = 0 .

That is,
W(x0) = 0 for some x0 in (α, β) ⇒ {y1 , y2 } is linearly dependent over (α, β) . (14.5a)
Equivalently,
{y1 , y2 } is linearly independent over (α, β) ⇒ W(x) ≠ 0 for every x in (α, β) . (14.5b)

Linearly Independent Pairs and General Solutions

Now assume that our pair of solutions {y1 , y2 } is linearly independent over (α, β) , and let ŷ be any
single solution to our differential equation. Pick some x0 in (α, β) , and consider the initial-value
problem
ay'' + by' + cy = 0 with y(x0) = A and y'(x0) = B
where
A = ŷ(x0) and B = ŷ'(x0) .
Clearly, y = ŷ is a solution to the initial-value problem.
On the other hand, implication (14.5b) tells us that, since {y1 , y2 } is linearly independent, the
corresponding Wronskian is nonzero at x0 , and lemma 14.1 and the principle of superposition then
assure us that there are constants c1 and c2 such that
y = c1 y1 + c2 y2
is also a solution to this same initial-value problem. So we have “two” solutions to our initial-value
problem,
ŷ and y = c1 y1 + c2 y2 .
But (by lemma 13.2 on page 267) we know this initial-value problem only has one solution. This
means our two solutions must be the same,
ŷ(x) = c1 y1(x) + c2 y2(x) for every x in (α, β) .
Hence, any solution to our differential equation can be written as a linear combination of y1 and
y2 . This shows that
{y1 , y2 } is linearly independent over (α, β)
⇒ y = c1 y1 + c2 y2 is a general solution to our differential equation , (14.6)
which, incidentally, means that the linearly independent pair {y1 , y2 } is a fundamental set of solutions
for our differential equation.

Summarizing the Results So Far

Combining the results given in lemma 14.1 and implications (14.3), (14.5) and (14.6) gives us the
following lemma.

Lemma 14.4
Assume a , b and c are continuous functions on an interval (α, β) with a never being zero in
(α, β) , and let {y1 , y2 } be a pair of solutions on (α, β) to
ay'' + by' + cy = 0 .
Then all of the following statements are equivalent (i.e., if one holds, they all hold):


1. The pair {y1 , y2 } is linearly independent on (α, β) .

2. The Wronskian of {y1 , y2 } ,
W = y1 y2' − y2 y1' ,
is nonzero at one point in (α, β) .
3. The Wronskian of {y1 , y2 } ,
W = y1 y2' − y2 y1' ,
is nonzero at every point in (α, β) .
4. Given any point x0 in (α, β) and any two fixed values A and B , there is exactly one
ordered pair of constants {c1 , c2 } such that

y(x) = c1 y1(x) + c2 y2(x)

also satisfies the initial conditions

y(x0) = A and y'(x0) = B .

5. The arbitrary linear combination of y1 and y2 ,

y = c1 y1 + c2 y2 ,

is a general solution for the above differential equation.

6. The pair {y1 , y2 } is a fundamental set of solutions for the above differential equation.

If you check, you will see that most of our “big theorem”, theorem 13.3, follows immediately
from this lemma, as does the theorem on Wronskians, theorem 13.6, for the case where N = 2 .

Proving the Rest of Theorem 13.3

All that remains to achieving our goal of verifying theorem 13.3 is to verify that fundamental sets of
solutions exist, and that a fundamental set of solutions must contain exactly two solutions. Verifying
these two facts will be easy.

Existence of Fundamental Sets

Let x0 be some point in (α, β) , and let y1 and y2 be, respectively, the solutions to the initial-value
problems
ay'' + by' + cy = 0 with y(x0) = 1 and y'(x0) = 0
and
ay'' + by' + cy = 0 with y(x0) = 0 and y'(x0) = 1 .

(By lemma 13.2, we know these solutions exist.) Computing their Wronskian at x0 , we get

W(x0) = y1(x0) y2'(x0) − y2(x0) y1'(x0) = 1 · 1 − 0 · 0 ≠ 0 ,

which, according to lemma 14.4, means that {y1 , y2 } is a fundamental set of solutions to our
differential equation.


Size of Fundamental Sets of Solutions

It should be clear that we cannot solve every initial-value problem

ay'' + by' + cy = 0 with y(x0) = A and y'(x0) = B

using a linear combination of a single particular solution y1 of the differential equation. After all,
just try finding a constant c1 so that
y(x) = c1 y1(x)
satisfies
ay'' + by' + cy = 0 with y(x0) = A and y'(x0) = B
when
A = y1(x0) and B = 1 + y1'(x0) .

So a fundamental set of solutions for our differential equation must contain at least two solutions.
On the other hand, if anyone were to propose the existence of a fundamental solution set of
more than two solutions
{ y1 , y2 , y3 , . . . } ,
then the required linear independence of the set (i.e., that no solution in this set is a linear combination
of the others) would automatically imply that the set of just the ﬁrst two solutions,

{ y1 , y2 } ,

is also linearly independent. But then lemma 14.4 tells us that this smaller set is a fundamental
set of solutions for our differential equation, and that every other solution, including the y3 in the
set originally proposed, is a linear combination of y1 and y2 . Hence, the originally proposed set
of three or more solutions cannot be linearly independent, and, hence, is not a fundamental set of
solutions for our differential equation.
So, a fundamental set of solutions for a second-order, homogeneous linear differential equation
cannot contain fewer than two solutions or more than two solutions. It must contain exactly two
solutions.
And that completes our proof of theorem 13.3.

14.2 Proving the More General Theorems on General Solutions and Wronskians

Extending the discussion in the previous section into proofs of the more general theorems in chapter
13 (theorem 13.5 on page 272 and theorem 13.6 on page 274) is relatively straightforward provided
you make use of some basic facts normally developed in a good introductory course on linear algebra.
We will discuss this further (and brieﬂy) in section 35.6 in the context of “systems of differential
equations”.


14.3 Linear Differential Operators

The Operator Associated with a Linear Differential Equation
Sometimes, when given some Nth-order linear differential equation

\[
a_0 \frac{d^N y}{dx^N} + a_1 \frac{d^{N-1} y}{dx^{N-1}} + \cdots
+ a_{N-2} \frac{d^2 y}{dx^2} + a_{N-1} \frac{dy}{dx} + a_N y = g ,
\]

it is convenient to let L[y] denote the expression on the left side, whether or not y is a solution to
the differential equation. That is, for any sufficiently differentiable function y ,

\[
L[y] = a_0 \frac{d^N y}{dx^N} + a_1 \frac{d^{N-1} y}{dx^{N-1}} + \cdots
+ a_{N-2} \frac{d^2 y}{dx^2} + a_{N-1} \frac{dy}{dx} + a_N y .
\]

To emphasize that y is a function of x , we may also use L[y(x)] instead of L[y] . For much
of what follows, y need not be a solution to the given differential equation, but it does need to be
sufficiently differentiable on the interval of interest for all the derivatives in the formula for L[y] to
make sense.
While we defined L[y] as the left side of the above differential equation, the expression for
L[y] is completely independent of the equation’s right side. Because of this and the fact that the
choice of y is largely irrelevant to the basic definition, we will often just define “ L ” by stating

\[
L = a_0 \frac{d^N}{dx^N} + a_1 \frac{d^{N-1}}{dx^{N-1}} + \cdots
+ a_{N-2} \frac{d^2}{dx^2} + a_{N-1} \frac{d}{dx} + a_N
\]

where the ak ’s are functions of x on the interval of interest.¹

!Example 14.1: If our differential equation is

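\[
\frac{d^2 y}{dx^2} + x^2 \frac{dy}{dx} - 6y = \sqrt{x + 1} ,
\]

then
\[
L = \frac{d^2}{dx^2} + x^2 \frac{d}{dx} - 6 ,
\]
and, for any twice-differentiable function y = y(x) ,

\[
L[y(x)] = L[y] = \frac{d^2 y}{dx^2} + x^2 \frac{dy}{dx} - 6y .
\]

In particular, if y = sin(2x) , then

\[
\begin{aligned}
L[y] = L[\sin(2x)]
  &= \frac{d^2}{dx^2}\bigl[\sin(2x)\bigr] + x^2 \frac{d}{dx}\bigl[\sin(2x)\bigr] - 6\sin(2x) \\
  &= -4\sin(2x) + x^2 \cdot 2\cos(2x) - 6\sin(2x) \\
  &= 2x^2 \cos(2x) - 10\sin(2x) .
\end{aligned}
\]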
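
The view of L as something that takes a function in and returns a function out can be sketched in code. The following is an illustrative approximation only: apply_L is a hypothetical helper that estimates the derivatives with central finite differences (the step size h is an arbitrary choice), and the result is compared against the closed form 2x^2 cos(2x) − 10 sin(2x) derived in example 14.1.

```python
import math

def apply_L(y, x, h=1e-4):
    """Approximate L[y](x) for L = d^2/dx^2 + x^2 d/dx - 6.

    The derivatives are estimated with central finite differences,
    so the result carries a small discretization and rounding error.
    """
    d1 = (y(x + h) - y(x - h)) / (2 * h)            # approximates y'(x)
    d2 = (y(x + h) - 2 * y(x) + y(x - h)) / (h * h)  # approximates y''(x)
    return d2 + x**2 * d1 - 6 * y(x)

def expected(x):
    """Closed form of L[sin(2x)] from example 14.1."""
    return 2 * x**2 * math.cos(2 * x) - 10 * math.sin(2 * x)

y = lambda x: math.sin(2 * x)

# Largest discrepancy over a few arbitrary sample points
max_error = max(abs(apply_L(y, x) - expected(x)) for x in [-1.5, 0.0, 0.7, 2.0])
```

The discrepancy stays far below any visible digit of the closed form, which is what one would expect if the hand computation is right. Note that apply_L accepts any sufficiently smooth y, whether or not it solves the differential equation, mirroring the remark in the text.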
¹ If using “ L ” is just too much shorthand for you, observe that the formulas for L can be written in summation form:

\[
L[y] = \sum_{k=0}^{N} a_k \frac{d^{N-k} y}{dx^{N-k}}
\quad\text{and}\quad
L = \sum_{k=0}^{N} a_k \frac{d^{N-k}}{dx^{N-k}} .
\]

You can use these summation formulas instead of “ L ” if you wish.


Observe that L is something into which we plug a function (such as the sin(2x) in the above
example) and out of which pops another function (which, in the above example, ended up being
2x^2 cos(2x) − 10 sin(2x) ). Anything that so converts one function into another is often called
an operator (on functions), and since the general formula for computing L[y] looks like a linear
combination of differentiations up to order N ,
\[
L[y] = a_0 \frac{d^N y}{dx^N} + a_1 \frac{d^{N-1} y}{dx^{N-1}} + \cdots
+ a_{N-2} \frac{d^2 y}{dx^2} + a_{N-1} \frac{dy}{dx} + a_N y ,
\]
it is standard to refer to L as a linear differential operator (of order N ).
We should also note that our linear differential operators are “linear” in the sense normally
defined in linear algebra:

Lemma 14.5
Assume L is a linear differential operator
\[
L = a_0 \frac{d^N}{dx^N} + a_1 \frac{d^{N-1}}{dx^{N-1}} + \cdots
+ a_{N-2} \frac{d^2}{dx^2} + a_{N-1} \frac{d}{dx} + a_N
\]
where the ak ’s are functions on some interval (α, β) . If y1 and y2 are any two sufficiently
differentiable functions on (α, β) , and c1 and c2 are any two constants, then
L[c1 y1 + c2 y2] = c1 L[y1] + c2 L[y2] .

To prove this lemma, you basically go through the same computations as used to derive the
principle of superposition (see the derivations just before theorem 13.1 on page 264).
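
As a sketch of that computation, here it is written out for the second-order case only (taking N = 2 is an assumption made here to keep the display short; the general case has the same shape with more terms). It uses nothing beyond the linearity of differentiation and the fact that c_1 and c_2 are constants:

```latex
\begin{align*}
L[c_1 y_1 + c_2 y_2]
  &= a_0 \frac{d^2}{dx^2}\bigl[c_1 y_1 + c_2 y_2\bigr]
   + a_1 \frac{d}{dx}\bigl[c_1 y_1 + c_2 y_2\bigr]
   + a_2 \bigl[c_1 y_1 + c_2 y_2\bigr] \\
  &= a_0 \bigl[c_1 y_1'' + c_2 y_2''\bigr]
   + a_1 \bigl[c_1 y_1' + c_2 y_2'\bigr]
   + a_2 \bigl[c_1 y_1 + c_2 y_2\bigr] \\
  &= c_1 \bigl[a_0 y_1'' + a_1 y_1' + a_2 y_1\bigr]
   + c_2 \bigl[a_0 y_2'' + a_1 y_2' + a_2 y_2\bigr] \\
  &= c_1 L[y_1] + c_2 L[y_2] .
\end{align*}
```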

The Composition Product

Deﬁnition and Notation
The (composition) product L2 L1 of two linear differential operators L1 and L2 is the differential
operator given by
L2 L1 [φ] = L2 [ L1 [φ] ]
for every sufficiently differentiable function φ = φ(x) .²

!Example 14.2: Let

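\[
L_1 = \frac{d}{dx} + x^2 \quad\text{and}\quad L_2 = \frac{d}{dx} + 4 .
\]
For any twice-differentiable function φ = φ(x) , we have
\[
\begin{aligned}
L_2 L_1[\phi] = L_2\bigl[L_1[\phi]\bigr]
  &= L_2\left[\frac{d\phi}{dx} + x^2\phi\right] \\
  &= \frac{d}{dx}\left[\frac{d\phi}{dx} + x^2\phi\right] + 4\left[\frac{d\phi}{dx} + x^2\phi\right] \\
  &= \frac{d^2\phi}{dx^2} + \frac{d}{dx}\bigl[x^2\phi\bigr] + 4\frac{d\phi}{dx} + 4x^2\phi \\
  &= \frac{d^2\phi}{dx^2} + \left[2x\phi + x^2\frac{d\phi}{dx}\right] + 4\frac{d\phi}{dx} + 4x^2\phi \\
  &= \frac{d^2\phi}{dx^2} + \bigl(4 + x^2\bigr)\frac{d\phi}{dx} + \bigl(2x + 4x^2\bigr)\phi .
\end{aligned}
\]
² The notation L2 ∘ L1 , instead of L2 L1 , would also be correct.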


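Cutting out the middle yields

\[
L_2 L_1[\phi] = \frac{d^2\phi}{dx^2} + \bigl(4 + x^2\bigr)\frac{d\phi}{dx} + \bigl(2x + 4x^2\bigr)\phi
\]

for every sufficiently differentiable function φ . Thus

\[
L_2 L_1 = \frac{d^2}{dx^2} + \bigl(4 + x^2\bigr)\frac{d}{dx} + \bigl(2x + 4x^2\bigr) .
\]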

When we have formulas for our operators L 1 and L 2 , it will often be convenient to replace
the symbols “ L 1 ” and “ L 2 ” with their formulas enclosed in parentheses. We will also enclose any
function φ being “plugged into” the operators with square brackets, “ [φ] ”. This will be called the
product notation.³

!Example 14.3: Using the product notation, let us recompute L 2 L 1 for

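\[
L_1 = \frac{d}{dx} + x^2 \quad\text{and}\quad L_2 = \frac{d}{dx} + 4 .
\]

Letting φ = φ(x) be any twice-differentiable function,

\[
\begin{aligned}
\left(\frac{d}{dx} + 4\right)\left(\frac{d}{dx} + x^2\right)[\phi]
  &= \left(\frac{d}{dx} + 4\right)\left[\frac{d\phi}{dx} + x^2\phi\right] \\
  &= \frac{d}{dx}\left[\frac{d\phi}{dx} + x^2\phi\right] + 4\left[\frac{d\phi}{dx} + x^2\phi\right] \\
  &= \frac{d^2\phi}{dx^2} + \frac{d}{dx}\bigl[x^2\phi\bigr] + 4\frac{d\phi}{dx} + 4x^2\phi \\
  &= \frac{d^2\phi}{dx^2} + \left[2x\phi + x^2\frac{d\phi}{dx}\right] + 4\frac{d\phi}{dx} + 4x^2\phi \\
  &= \frac{d^2\phi}{dx^2} + \bigl(4 + x^2\bigr)\frac{d\phi}{dx} + \bigl(2x + 4x^2\bigr)\phi .
\end{aligned}
\]

So,
\[
L_2 L_1 = \left(\frac{d}{dx} + 4\right)\left(\frac{d}{dx} + x^2\right)
= \frac{d^2}{dx^2} + \bigl(4 + x^2\bigr)\frac{d}{dx} + \bigl(2x + 4x^2\bigr) ,
\]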

just as derived in the previous example.

Algebra of the Composite Product

The notation L 2 L 1 [φ] is convenient, but it is important to remember that it is shorthand for

compute L 1 [φ] and plug the result into L 2 .

³ Many authors do not enclose “the function being plugged in” in square brackets, and just write L2 L1 φ . We are avoiding
that because it does not explicitly distinguish between “ φ as a function being plugged in” and “ φ as an operator, itself”. For
the first, L2 L1 φ means the function you get from computing L2 [ L1 [φ] ] . For the second, L2 L1 φ means the operator
such that, for any sufficiently differentiable function ψ ,

L2 L1 [φ[ψ]] = L2 [ L1 [φψ] ] .
The two possible interpretations for L2 L1 φ are not the same.


The result of this can be quite different from

compute L 2 [φ] and plug the result into L 1 ,

which is what L 1 L 2 [φ] means. Thus, in general,

L2 L1 ≠ L1 L2 .

In other words, the composition product of differential operators is generally not commutative.

!Example 14.4: In the previous two examples, we saw that

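\[
\left(\frac{d}{dx} + 4\right)\left(\frac{d}{dx} + x^2\right)
= \frac{d^2}{dx^2} + \bigl(4 + x^2\bigr)\frac{d}{dx} + \bigl(2x + 4x^2\bigr) .
\]
On the other hand, switching the order of the two operators, and letting φ be any sufficiently
differentiable function gives
\[
\begin{aligned}
\left(\frac{d}{dx} + x^2\right)\left(\frac{d}{dx} + 4\right)[\phi]
  &= \left(\frac{d}{dx} + x^2\right)\left[\frac{d\phi}{dx} + 4\phi\right] \\
  &= \frac{d}{dx}\left[\frac{d\phi}{dx} + 4\phi\right] + x^2\left[\frac{d\phi}{dx} + 4\phi\right] \\
  &= \frac{d^2\phi}{dx^2} + 4\frac{d\phi}{dx} + x^2\frac{d\phi}{dx} + 4x^2\phi \\
  &= \frac{d^2\phi}{dx^2} + \bigl(4 + x^2\bigr)\frac{d\phi}{dx} + 4x^2\phi .
\end{aligned}
\]
Thus,
\[
\left(\frac{d}{dx} + x^2\right)\left(\frac{d}{dx} + 4\right)
= \frac{d^2}{dx^2} + \bigl(4 + x^2\bigr)\frac{d}{dx} + 4x^2 .
\]
After comparing this with the first equation in this example, we clearly see that
\[
\left(\frac{d}{dx} + x^2\right)\left(\frac{d}{dx} + 4\right)
\neq \left(\frac{d}{dx} + 4\right)\left(\frac{d}{dx} + x^2\right) .
\]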
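
The failure of commutativity in example 14.4 can be observed directly by applying both products to a concrete function. The sketch below restricts φ to polynomials, represented as coefficient lists, so that differentiation and multiplication by x^2 are exact integer operations. The representation and the helper names (trim, add, scale, deriv, times_x2) are assumptions made for illustration, not anything taken from the text.

```python
# A polynomial c0 + c1*x + c2*x^2 + ... is stored as the list [c0, c1, c2, ...].

def trim(p):
    """Drop trailing zero coefficients, keeping at least one entry."""
    while len(p) > 1 and p[-1] == 0:
        p = p[:-1]
    return p

def add(p, q):
    """Sum of two polynomials."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return trim([a + b for a, b in zip(p, q)])

def scale(c, p):
    """Constant multiple c*p."""
    return trim([c * a for a in p])

def deriv(p):
    """Derivative dp/dx."""
    return trim([k * p[k] for k in range(1, len(p))] or [0])

def times_x2(p):
    """Product x^2 * p."""
    return trim([0, 0] + p)

def L1(p):
    """L1 = d/dx + x^2 applied to p."""
    return add(deriv(p), times_x2(p))

def L2(p):
    """L2 = d/dx + 4 applied to p."""
    return add(deriv(p), scale(4, p))

phi = [0, 1]            # phi(x) = x
left = L2(L1(phi))      # (d/dx + 4)(d/dx + x^2)[x]
right = L1(L2(phi))     # (d/dx + x^2)(d/dx + 4)[x]
```

For φ(x) = x the two results differ by 2x^2, that is, by 2xφ, which is exactly the extra 2x term in the formula for L2 L1. One polynomial on which the products disagree is enough to show that the operators are different.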

?Exercise 14.1: Let

\[
L_1 = \frac{d}{dx} \quad\text{and}\quad L_2 = x ,
\]
and verify that
\[
L_2 L_1 = x\frac{d}{dx} \quad\text{while}\quad L_1 L_2 = x\frac{d}{dx} + 1 .
\]

Later (in chapters 17 and 20) we will be dealing with special situations in which the composition
product is commutative. In fact, the material we are now developing will be most useful verifying
certain theorems involving those situations. In the meantime, just remember that, in general,

L2 L1 ≠ L1 L2 .

Here are a few other short and easily veriﬁed notes about the composition product:
1. In the above examples, the operators L 2 and L 1 were all ﬁrst order differential operators.
This was not necessary. We could have used, say,

\[ L_2 = x^3\frac{d^3}{dx^3} + \sin(x)\frac{d^2}{dx^2} - xe^{3x}\frac{d}{dx} + 87\sqrt{x} \]


Linear Differential Operators 293

and
\[ L_1 = \frac{d^{26}}{dx^{26}} - x^3\frac{d^3}{dx^3} , \]

though we would have certainly needed many more pages for the calculations.

2. There is no need to limit ourselves to composition products of just two operators. Given any
number of linear differential operators — L 1 , L 2 , L 3 , . . . — the composition products
L 3 L 2 L 1 , L 4 L 3 L 2 L 1 , etc. are deﬁned to be the differential operators satisfying, for each
and every sufﬁciently differentiable function φ ,

\[ L_3 L_2 L_1[\phi] = L_3\Big[L_2\big[L_1[\phi]\big]\Big] , \]

\[ L_4 L_3 L_2 L_1[\phi] = L_4\bigg[L_3\Big[L_2\big[L_1[\phi]\big]\Big]\bigg] , \]
\[ \vdots \]

Naturally, the order of the operators is still important.

3. Any composition product of linear differential operators is, itself, a linear differential operator.
Moreover, the order of the product
L K · · · L2 L1

is the sum

(the order of L K ) + · · · + (the order of L 2 ) + (the order of L 1 ) .

4. Though not commutative, the composition product is associative. That is, if L 1 , L 2 and L 3
are three linear differential operators, and we ‘precompute’ the products L 2 L 1 and L 3 L 2 ,
and then compute
(L 3 L 2 )L 1 , L 3 (L 2 L 1 ) and L 3 L 2 L 1 ,

we will discover that

(L 3 L 2 )L 1 = L 3 (L 2 L 1 ) = L 3 L 2 L 1 .

5. Keep in mind that we are dealing with linear differential operators and that their products are
linear differential operators. In particular, if α is some constant and φ is any sufﬁciently
differentiable function, then

L K · · · L 2 L 1 [αφ] = αL K · · · L 2 L 1 [φ] .

And, of course,
L K · · · L 2 L 1 [0] = 0 .

Factoring
Now suppose we have some linear differential operator L . If we can ﬁnd other linear differential
operators L 1 , L 2 , L 3 , . . . , and L K such that

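\[ \frac{d^2}{dx^2} + \left(4 + x^2\right)\frac{d}{dx} + 4x^2 = \left(\frac{d}{dx} + x^2\right)\left(\frac{d}{dx} + 4\right) . \]

So our differential equation can be written as

\[ \left(\frac{d}{dx} + x^2\right)\left(\frac{d}{dx} + 4\right)[y] = 0 . \]

That is,
\[ \left(\frac{d}{dx} + x^2\right)\left[\frac{dy}{dx} + 4y\right] = 0 . \tag{14.7} \]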

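Now consider
\[ \frac{dy}{dx} + 4y = 0 . \]

This is a simple first-order linear and separable differential equation, whose general solution is
easily found to be $y = c_1 e^{-4x}$. In particular, $e^{-4x}$ is a solution. According to the above theorem,
$e^{-4x}$ is also a solution to our original differential equation. Let’s check to be sure:

\[
\begin{aligned}
\frac{d^2}{dx^2}\left[e^{-4x}\right] + \left(4 + x^2\right)\frac{d}{dx}\left[e^{-4x}\right] + 4x^2 e^{-4x}
  &= \left(\frac{d}{dx} + x^2\right)\left(\frac{d}{dx} + 4\right)\left[e^{-4x}\right] \\
  &= \left(\frac{d}{dx} + x^2\right)\left[\frac{d}{dx}e^{-4x} + 4e^{-4x}\right] \\
  &= \left(\frac{d}{dx} + x^2\right)\left[-4e^{-4x} + 4e^{-4x}\right] \\
  &= \left(\frac{d}{dx} + x^2\right)[0] \\
  &= \frac{d0}{dx} + x^2 \cdot 0 \\
  &= 0 .
\end{aligned}
\]

Keep in mind, though, that $e^{-4x}$ is simply one of the possible solutions, and that there will be
solutions not given by $c_1 e^{-4x}$.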
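
The check above can be repeated numerically. The sketch below is an illustration only (the function names `residual` and `inner` are hypothetical). It plugs $y = e^{-4x}$, together with its exact derivatives $y' = -4e^{-4x}$ and $y'' = 16e^{-4x}$, into the left side of the second-order equation at a few sample points.

```python
import math

def residual(x):
    """Left side of y'' + (4 + x**2) y' + 4 x**2 y = 0 for y = exp(-4x)."""
    y = math.exp(-4 * x)
    dy = -4 * y      # y'
    d2y = 16 * y     # y''
    return d2y + (4 + x**2) * dy + 4 * x**2 * y

def inner(x):
    """Result of applying the right-hand factor d/dx + 4 to exp(-4x)."""
    y = math.exp(-4 * x)
    return -4 * y + 4 * y

samples = [-1.0, 0.0, 0.5, 2.0]
residuals = [residual(x) for x in samples]
```

Up to floating-point rounding, every entry of `residuals` is zero. The function `inner` shows why: the right-hand factor already sends $e^{-4x}$ to zero, so the left-hand factor has nothing left to act on.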

Unfortunately, unless it is of an exceptionally simple type (such as considered in chapter 17),
factoring a linear differential operator is a very nontrivial problem. And even with those simple types
that we will be able to factor, we will find the main value of the above to be in deriving even simpler
methods for finding solutions. Consequently, in practice, you should not expect to be solving many
differential equations via “factoring”.


Additional Exercises

14.2 a. State the linear differential operator $L$ corresponding to the left side of
\[ \frac{d^2y}{dx^2} + 5\frac{dy}{dx} + 6y = 0 . \]
b. Using this $L$, compute each of the following:

i. $L[\sin(x)]$ ii. $L\left[e^{4x}\right]$ iii. $L\left[e^{-3x}\right]$ iv. $L\left[x^2\right]$

c. Based on the answers to the last part, what is one solution to the differential equation in
part a?

14.3 a. State the linear differential operator $L$ corresponding to the left side of
\[ \frac{d^2y}{dx^2} - 5\frac{dy}{dx} + 9y = 0 . \]
b. Using this $L$, compute each of the following:

i. $L[\sin(x)]$ ii. $L[\sin(3x)]$ iii. $L\left[e^{2x}\right]$ iv. $L\left[e^{2x}\sin(x)\right]$

14.4 a. State the linear differential operator $L$ corresponding to the left side of
\[ x^2\frac{d^2y}{dx^2} + 5x\frac{dy}{dx} + 6y = 0 . \]
b. Using this $L$, compute each of the following:

i. $L[\sin(x)]$ ii. $L\left[e^{4x}\right]$ iii. $L\left[x^3\right]$

14.5 a. State the linear differential operator $L$ corresponding to the left side of
\[ \frac{d^3y}{dx^3} - \sin(x)\frac{dy}{dx} + \cos(x)\,y = x^2 + 1 , \]
b. and then, using this $L$, compute each of the following:

i. $L[\sin(x)]$ ii. $L[\cos(x)]$ iii. $L\left[x^2\right]$

14.6. Several choices for linear differential operators $L_1$ and $L_2$ are given below. For each
choice, compute $L_2 L_1$ and $L_1 L_2$.
a. $L_1 = \frac{d}{dx} + x$ and $L_2 = \frac{d}{dx} - x$
b. $L_1 = \frac{d}{dx} + x^2$ and $L_2 = \frac{d}{dx} + x^3$
c. $L_1 = x\frac{d}{dx} + 3$ and $L_2 = \frac{d}{dx} + 2x$
d. $L_1 = \frac{d^2}{dx^2}$ and $L_2 = x$
e. $L_1 = \frac{d^2}{dx^2}$ and $L_2 = x^3$


f. $L_1 = \frac{d^2}{dx^2}$ and $L_2 = \sin(x)$

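14.7. Compute the following composition products:

a. $\left(\frac{d}{dx} + 2\right)\left(\frac{d}{dx} + 3\right)$
b. $\left(x\frac{d}{dx} + 2\right)\left(x\frac{d}{dx} + 3\right)$
c. $\left(x\frac{d}{dx} + 4\right)\left(\frac{d}{dx} + \frac{1}{x}\right)$
d. $\left(\frac{d}{dx} + 4x\right)\left(\frac{d}{dx} + \frac{1}{x}\right)$
e. $\left(\frac{d}{dx} + \frac{1}{x}\right)\left(\frac{d}{dx} + 4x\right)$
f. $\left(\frac{d}{dx} + 5x^2\right)^2$
g. $\left(\frac{d}{dx} + x^2\right)\left(\frac{d^2}{dx^2} + \frac{d}{dx}\right)$
h. $\left(\frac{d^2}{dx^2} + \frac{d}{dx}\right)\left(\frac{d}{dx} + x^2\right)$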

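14.8. Verify that

\[ \frac{d^2}{dx^2} + \big(\sin(x) - 3\big)\frac{d}{dx} - 3\sin(x) = \left(\frac{d}{dx} + \sin(x)\right)\left(\frac{d}{dx} - 3\right) , \]

and, using this factorization, find one solution to

\[ \frac{d^2y}{dx^2} + \big(\sin(x) - 3\big)\frac{dy}{dx} - 3\sin(x)\,y = 0 . \]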

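14.9. Verify that

\[ \frac{d^2}{dx^2} + x\frac{d}{dx} + \left(2 - 2x^2\right) = \left(\frac{d}{dx} - x\right)\left(\frac{d}{dx} + 2x\right) , \]

and, using this factorization, find one solution to

\[ \frac{d^2y}{dx^2} + x\frac{dy}{dx} + \left(2 - 2x^2\right)y = 0 . \]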

14.10. Verify that

\[ x^2\frac{d^2}{dx^2} - 7x\frac{d}{dx} + 16 = \left(x\frac{d}{dx} - 4\right)^2 , \]
and, using this factorization, find one solution to

\[ x^2\frac{d^2y}{dx^2} - 7x\frac{dy}{dx} + 16y = 0 . \]


15
Second-Order Homogeneous Linear
Equations with Constant Coefﬁcients

A very important class of second-order homogeneous linear equations consists of those with constant
coefﬁcients; that is, those that can be written as

\[ ay'' + by' + cy = 0 \]

where $a$, $b$ and $c$ are real-valued constants (with $a \neq 0$). Some examples are

\[ y'' - 5y' + 6y = 0 , \]

\[ y'' - 6y' + 9y = 0 \]
and
\[ y'' - 6y' + 13y = 0 . \]

There are two reasons these sorts of differential equations are important: First of all, they
often arise in applications. Secondly, as we will see, it is relatively easy to find fundamental sets of
solutions for these equations.
Do note that, because the coefficients are constants, they are, trivially, continuous functions on
the entire real line. Consequently, we can take the entire real line as the interval of interest, and be
confident that any solutions derived will be valid on all of (−∞, ∞) .
IMPORTANT: What we will derive and define here (e.g., “the characteristic equation”) is based
on the assumption that the coefficients in our differential equation— the a , b and c above — are
constants. Some of the results will even require that these constants be real valued. Do not, later,
try to blindly apply what we develop here to differential equations in which the coefficients are not
real-valued constants.

15.1 Deriving the Basic Approach

Seeking Inspiration
Let us look for clues on how to solve our second-order equations by first looking at solving a
first-order, homogeneous linear differential equation with constant coefficients, say,

\[ 2\frac{dy}{dx} + 6y = 0 . \]


From our experience with the first-order case, it seems reasonable to expect at least some of the
solutions to be exponentials. So let us find all such solutions by setting
\[ y = e^{rx} \]
where $r$ is a constant to be determined, plugging this formula into our differential equation, and
seeing if a constant $r$ can be determined.
For our example,
\[ y'' - 5y' + 6y = 0 . \]
Letting $y = e^{rx}$ yields
\[
\begin{aligned}
 & \frac{d^2}{dx^2}\left[e^{rx}\right] - 5\frac{d}{dx}\left[e^{rx}\right] + 6e^{rx} = 0 \\
 \rightarrow\quad & r^2 e^{rx} - 5re^{rx} + 6e^{rx} = 0 \\
 \rightarrow\quad & e^{rx}\left[r^2 - 5r + 6\right] = 0 .
\end{aligned}
\]

Since $e^{rx}$ can never be zero, we can divide it out, leaving the algebraic equation
\[ r^2 - 5r + 6 = 0 . \]
Before solving this for $r$, let us pause and consider the more general case.
More generally, letting $y = e^{rx}$ in
\[ ay'' + by' + cy = 0 \tag{15.1} \]
yields
\[
\begin{aligned}
 & a\frac{d^2}{dx^2}\left[e^{rx}\right] + b\frac{d}{dx}\left[e^{rx}\right] + ce^{rx} = 0 \\
 \rightarrow\quad & ar^2 e^{rx} + bre^{rx} + ce^{rx} = 0 \\
 \rightarrow\quad & e^{rx}\left[ar^2 + br + c\right] = 0 .
\end{aligned}
\]

Since $e^{rx}$ can never be zero, we can divide it out, leaving us with the algebraic equation
\[ ar^2 + br + c = 0 \tag{15.2} \]
(remember: $a$, $b$ and $c$ are constants). Equation (15.2) is called the characteristic equation for
differential equation (15.1). Note the similarity between the original differential equation and its
characteristic equation. The characteristic equation is nothing more than the algebraic equation
obtained by replacing the various derivatives of $y$ with corresponding powers of $r$ (treating $y$ as
being the zeroth derivative of $y$):

\[
\begin{aligned}
 & ay'' + by' + cy = 0 && \text{(original differential equation)} \\
 \rightarrow\quad & ar^2 + br + c = 0 && \text{(characteristic equation)}
\end{aligned}
\]
The nice thing is that the characteristic equation is easily solved for $r$ by either factoring the
polynomial or using the quadratic formula. These values for $r$ must then be the values of $r$
for which $y = e^{rx}$ are (particular) solutions to our original differential equation. Using what we
developed in previous chapters, we may then be able to construct a general solution to the differential
equation.


In our example, letting $y = e^{rx}$ in

\[ y'' - 5y' + 6y = 0 \]

led to the characteristic equation

\[ r^2 - 5r + 6 = 0 , \]

which factors to
\[ (r - 2)(r - 3) = 0 . \]
Hence,
\[ r - 2 = 0 \quad\text{or}\quad r - 3 = 0 . \]
So the possible values of $r$ are

\[ r = 2 \quad\text{and}\quad r = 3 , \]

which, in turn, means

\[ y_1 = e^{2x} \quad\text{and}\quad y_2 = e^{3x} \]

are solutions to our original differential equation. Clearly, neither of these functions
is a constant multiple of the other; so, after recalling the big theorem on solutions to
second-order, homogeneous linear differential equations, theorem 13.3 on page 270, we
know that
\[ \left\{ e^{2x} , e^{3x} \right\} \]
is a fundamental set of solutions and

\[ y(x) = c_1 e^{2x} + c_2 e^{3x} \]

is a general solution to our differential equation.
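
As a quick cross-check of the roots found above, the characteristic equation can be solved with the quadratic formula in a few lines of code. This is a minimal sketch, not the author's method. The function name `characteristic_roots` is hypothetical, and `cmath.sqrt` is used so that the same code would also cope with a negative value under the square root.

```python
import cmath

def characteristic_roots(a, b, c):
    """Roots of a*r**2 + b*r + c = 0 via the quadratic formula (a must be nonzero)."""
    if a == 0:
        raise ValueError("a must be nonzero for a second-order equation")
    root_disc = cmath.sqrt(b * b - 4 * a * c)
    return ((-b - root_disc) / (2 * a), (-b + root_disc) / (2 * a))

# Characteristic equation r**2 - 5r + 6 = 0 for y'' - 5y' + 6y = 0
r_minus, r_plus = characteristic_roots(1, -5, 6)
```

Here `r_minus` and `r_plus` come back as the complex numbers `2+0j` and `3+0j`, matching the factorization $(r - 2)(r - 3) = 0$.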

We will discover that we can always construct a general solution to any given homogeneous
linear differential equation with constant coefficients using the solutions to its characteristic equation.
But first, let us restate what we have just derived in a somewhat more concise and authoritative form,
and briefly consider the nature of the possible solutions to the characteristic equation.

15.2 The Basic Approach, Summarized

To solve a second-order homogeneous linear differential equation

\[ ay'' + by' + cy = 0 \]

in which $a$, $b$ and $c$ are constants, start with the assumption that

\[ y(x) = e^{rx} \]

where $r$ is a constant to be determined. Plugging this formula for $y$ into the differential equation
yields, after a little computation and simplification, the differential equation’s characteristic equation
for $r$,
\[ ar^2 + br + c = 0 . \]


Alternatively, the characteristic equation can simply be constructed by replacing the derivatives of
$y$ in the original differential equation with the corresponding powers of $r$.
(By the way, the polynomial on the left side of the characteristic equation,
\[ ar^2 + br + c , \]
is called the characteristic polynomial for the differential equation. Recall from algebra that a “root”
of a polynomial $p(r)$ is the same as a solution to $p(r) = 0$. So we can (and will) use the terms
“solution to the characteristic equation” and “root of the characteristic polynomial” interchangeably.)
Since the characteristic polynomial is only of degree two, solving the characteristic equation
for $r$ should present no problem. If this equation is simple enough, we can factor the polynomial
and find the values of $r$ by inspection. At worst, we must recall that the solution to the polynomial
equation
\[ ar^2 + br + c = 0 \]
can always be obtained via the quadratic formula,
\[ r = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} . \]
Notice how the nature of the value $r$ depends strongly on the value under the square root, $b^2 - 4ac$.
There are three possibilities:
1. If $b^2 - 4ac > 0$, then $\sqrt{b^2 - 4ac}$ is some positive value, and we have two distinct real
values for $r$,
\[ r_- = \frac{-b - \sqrt{b^2 - 4ac}}{2a} \quad\text{and}\quad r_+ = \frac{-b + \sqrt{b^2 - 4ac}}{2a} . \]
(In practice, we may denote these solutions by $r_1$ and $r_2$, instead.)
2. If $b^2 - 4ac = 0$, then
\[ r = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} = \frac{-b \pm \sqrt{0}}{2a} , \]
and we only have one real root for our characteristic equation, namely,
\[ r = -\frac{b}{2a} . \]
3. If $b^2 - 4ac < 0$, then the quantity under the square root is negative, and, thus, this square
root gives rise to an imaginary number. To be explicit,
\[ \sqrt{b^2 - 4ac} = \sqrt{-1 \cdot \left|b^2 - 4ac\right|} = i\sqrt{\left|b^2 - 4ac\right|} \]
where “$i = \sqrt{-1}$”.¹ Thus, in this case, we will get two distinct complex roots, $r_+$ and $r_-$
with
\[ r_\pm = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} = \frac{-b \pm i\sqrt{\left|b^2 - 4ac\right|}}{2a} = -\frac{b}{2a} \pm i\,\frac{\sqrt{\left|b^2 - 4ac\right|}}{2a} . \]
Whatever the case, if we find $r_0$ to be a root of the characteristic polynomial, then, by the very
steps leading to the characteristic equation, it follows that
\[ y_0(x) = e^{r_0 x} \]
is a solution to our original differential equation. As you can imagine, though, the nature of the
corresponding general solution to this differential equation depends strongly on which of the above
three cases we are dealing with. Let us consider each case.
1 Some people prefer to use $j$ instead of $i$ for $\sqrt{-1}$. Clearly, these people don’t know how to spell ‘imaginary’.


15.3 Case 1: Two Distinct Real Roots

Suppose the characteristic equation for

\[ ay'' + by' + cy = 0 \]

has two distinct (i.e., different) real solutions $r_1$ and $r_2$. Then we have that both

\[ y_1 = e^{r_1 x} \quad\text{and}\quad y_2 = e^{r_2 x} \]

are solutions to the differential equation. Since we are assuming $r_1$ and $r_2$ are not the same, it
should be clear that neither $y_1$ nor $y_2$ is a constant multiple of the other. Hence
\[ \left\{ e^{r_1 x} , e^{r_2 x} \right\} \]

is a linearly independent set of solutions to our second-order, homogeneous linear differential equation.
The big theorem on solutions to second-order, homogeneous linear differential equations,
theorem 13.3 on page 270, then tells us that

\[ y(x) = c_1 e^{r_1 x} + c_2 e^{r_2 x} \]

is a general solution to our differential equation.

We will later ﬁnd it convenient to have a way to refer back to the results of the above observations.
That is why those results are now restated in the following lemma:

Lemma 15.1
Let $a$, $b$ and $c$ be constants with $a \neq 0$. If the characteristic equation for

\[ ay'' + by' + cy = 0 \]

has two distinct real solutions $r_1$ and $r_2$, then

\[ y_1(x) = e^{r_1 x} \quad\text{and}\quad y_2(x) = e^{r_2 x} \]

are two solutions to this differential equation. Moreover, $\{e^{r_1 x}, e^{r_2 x}\}$ is a fundamental set for the
differential equation, and
\[ y(x) = c_1 e^{r_1 x} + c_2 e^{r_2 x} \]
is a general solution.

The example done while deriving the basic approach illustrated this case. Another example,
however, may not hurt.

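!Example 15.1: Consider the differential equation

\[ y'' + 7y' = 0 . \]

Assuming $y = e^{rx}$ in this equation gives

\[
\begin{aligned}
 & \frac{d^2}{dx^2}\left[e^{rx}\right] + 7\frac{d}{dx}\left[e^{rx}\right] = 0 \\
 \rightarrow\quad & r^2 e^{rx} + 7re^{rx} = 0 .
\end{aligned}
\]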


Dividing out $e^{rx}$ gives us the characteristic equation

\[ r^2 + 7r = 0 . \]

In factored form, this is

\[ r(r + 7) = 0 , \]
which means that
\[ r = 0 \quad\text{or}\quad r + 7 = 0 . \]
Consequently, the solutions to the characteristic equation are

\[ r = 0 \quad\text{and}\quad r = -7 . \]

The two corresponding solutions to the differential equation are

\[ y_1 = e^{0 \cdot x} = 1 \quad\text{and}\quad y_2 = e^{-7x} . \]

Thus, our fundamental set of solutions is

\[ \left\{ 1 , e^{-7x} \right\} , \]

and the corresponding general solution to our differential equation is

\[ y(x) = c_1 \cdot 1 + c_2 e^{-7x} , \]

which is slightly more simply written as

\[ y(x) = c_1 + c_2 e^{-7x} . \]
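
One way to gain confidence in a general solution like this is to plug it back into the differential equation numerically. The sketch below is only an illustration under stated assumptions: the helper `ode_residual`, the step size `h`, and the particular constants `2.0` and `-3.0` are arbitrary choices, and the derivatives are approximated by central differences rather than computed exactly.

```python
import math

def ode_residual(y, a, b, c, x, h=1e-4):
    """Approximate a*y'' + b*y' + c*y at x using central differences."""
    d1 = (y(x + h) - y(x - h)) / (2 * h)
    d2 = (y(x + h) - 2 * y(x) + y(x - h)) / (h * h)
    return a * d2 + b * d1 + c * y(x)

def general_solution(c1, c2):
    """y(x) = c1 + c2*exp(-7x), the general solution found in the example."""
    return lambda x: c1 + c2 * math.exp(-7 * x)

y = general_solution(2.0, -3.0)   # arbitrary constants for the check
worst = max(abs(ode_residual(y, 1, 7, 0, x)) for x in (-0.2, 0.0, 0.3, 1.0))

# For contrast, exp(x) is not a solution of y'' + 7y' = 0.
not_a_solution = ode_residual(math.exp, 1, 7, 0, 0.0)
```

With these choices `worst` is tiny (limited only by the finite-difference error), while `not_a_solution` is close to 8, the value of $y'' + 7y'$ for $y = e^x$ at $x = 0$.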

15.4 Case 2: Only One Root

Using Reduction of Order
If the characteristic polynomial only has one root $r$, then

\[ y_1(x) = e^{rx} \]

is one solution to our differential equation. This, alone, is not enough for a general solution, but we
can use this one solution with the reduction of order method to get the full general solution. Let us
do one example this way.

!Example 15.2: Consider the differential equation

\[ y'' - 6y' + 9y = 0 . \]

The characteristic equation is

\[ r^2 - 6r + 9 = 0 , \]
which factors nicely to
\[ (r - 3)^2 = 0 , \]


giving us $r = 3$ as the only root. Consequently, we have

\[ y_1(x) = e^{3x} \]

as one solution to our differential equation.

To find the general solution, we start the reduction of order method as usual by letting

\[ y(x) = y_1(x)u(x) = e^{3x}u(x) . \]

The derivatives are then computed,

\[ y'(x) = \left[e^{3x}u\right]' = 3e^{3x}u + e^{3x}u' \]

and

\[
\begin{aligned}
y''(x) &= \left[3e^{3x}u + e^{3x}u'\right]' \\
       &= 3 \cdot 3e^{3x}u + 3e^{3x}u' + 3e^{3x}u' + e^{3x}u'' \\
       &= 9e^{3x}u + 6e^{3x}u' + e^{3x}u'' ,
\end{aligned}
\]

and plugged into the differential equation,

\[
\begin{aligned}
0 &= y'' - 6y' + 9y \\
  &= \left[9e^{3x}u + 6e^{3x}u' + e^{3x}u''\right] - 6\left[3e^{3x}u + e^{3x}u'\right] + 9\left[e^{3x}u\right] \\
  &= e^{3x}\left\{9u + 6u' + u'' - 18u - 6u' + 9u\right\} .
\end{aligned}
\]

Dividing out the exponential and grouping together the coefficients for $u$, $u'$ and $u''$ yield

\[ 0 = u'' + [6 - 6]u' + [9 - 18 + 9]u = u'' . \]

As expected, the “$u$ term” drops out. Even nicer, though, is that the “$u'$ term” also drops out,
leaving us with $u'' = 0$; that is, to be a little more explicit,

\[ \frac{d^2u}{dx^2} = 0 . \]

No need to do anything fancy here: just integrate twice. The first time yields
\[ \frac{du}{dx} = \int \frac{d^2u}{dx^2}\,dx = \int 0\,dx = A . \]

Integrating again,
\[ u(x) = \int \frac{du}{dx}\,dx = \int A\,dx = Ax + B . \]
Thus,
\[ y(x) = e^{3x}u(x) = e^{3x}[Ax + B] = Axe^{3x} + Be^{3x} \]
is the general solution to the differential equation being considered here.

are two solutions to this differential equation. Moreover, $\{e^{rx}, xe^{rx}\}$ is a fundamental set for the
differential equation, and
\[ y(x) = c_1 e^{rx} + c_2 xe^{rx} \]
is a general solution.

Let’s redo example 15.2 using this lemma:

!Example 15.3: Consider the differential equation

\[ y'' - 6y' + 9y = 0 . \]

The characteristic equation is

\[ r^2 - 6r + 9 = 0 , \]
which factors nicely to
\[ (r - 3)^2 = 0 , \]
giving us $r = 3$ as the only root. Consequently,

\[ y_1(x) = e^{3x} \]

is one solution to our differential equation. By our work above, summarized in lemma 15.2, we
know a second solution is
\[ y_2(x) = xe^{3x} , \]
and a general solution is
\[ y(x) = c_1 e^{3x} + c_2 xe^{3x} . \]
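
The second solution $xe^{3x}$ is the less obvious one, so it is worth confirming directly. The following sketch (function names are hypothetical, chosen for illustration) uses the product-rule derivatives $y_2' = (1 + 3x)e^{3x}$ and $y_2'' = (6 + 9x)e^{3x}$ and evaluates the left side of the differential equation at a few points.

```python
import math

def y1(x):
    """First solution, exp(3x)."""
    return math.exp(3 * x)

def y2(x):
    """Second solution, x*exp(3x)."""
    return x * math.exp(3 * x)

def dy2(x):
    """First derivative of y2 by the product rule."""
    return (1 + 3 * x) * math.exp(3 * x)

def d2y2(x):
    """Second derivative of y2 by the product rule."""
    return (6 + 9 * x) * math.exp(3 * x)

def residual(x):
    """Left side of y'' - 6y' + 9y = 0 evaluated for y = y2."""
    return d2y2(x) - 6 * dy2(x) + 9 * y2(x)

samples = [-1.0, 0.25, 1.0]
residuals = [residual(x) for x in samples]
ratios = [y2(x) / y1(x) for x in samples]   # equals x, so it is not constant
```

The residuals vanish up to rounding, and the ratio $y_2/y_1 = x$ changes from point to point, which is the sense in which neither solution is a constant multiple of the other.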


15.5 Case 3: Complex Roots

Blindly Using Complex Roots
Let us start with an example.

!Example 15.4: Consider solving

\[ y'' - 6y' + 13y = 0 . \]

The characteristic equation is

\[ r^2 - 6r + 13 = 0 . \]
Factoring this is not easy for most, so we will resort to the quadratic formula for finding the
possible values of $r$:
\[
\begin{aligned}
r &= \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} \\
  &= \frac{-(-6) \pm \sqrt{(-6)^2 - 4 \cdot 1 \cdot 13}}{2 \cdot 1} \\
  &= \frac{6 \pm \sqrt{-16}}{2} \\
  &= \frac{6 \pm i4}{2} = 3 \pm i2 .
\end{aligned}
\]
So we get
\[ y_+(x) = e^{(3+i2)x} \quad\text{and}\quad y_-(x) = e^{(3-i2)x} \]
as two solutions to the differential equation. Since the factors in the exponents are different, we
can reasonably conclude that neither $e^{(3+i2)x}$ nor $e^{(3-i2)x}$ is a constant multiple of the other,
and so
\[ y(x) = c_+ e^{(3+i2)x} + c_- e^{(3-i2)x} \]
should be a general solution to our differential equation.

In general, if the roots to the characteristic equation for

\[ ay'' + by' + cy = 0 \]

are not real valued, then, as noted at the beginning of these notes, we will get a pair of complex-valued
roots
\[ r = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} = \frac{-b \pm i\sqrt{\left|b^2 - 4ac\right|}}{2a} = -\frac{b}{2a} \pm i\,\frac{\sqrt{\left|b^2 - 4ac\right|}}{2a} . \]
For convenience, let’s write these roots generically as

\[ r_+ = \lambda + i\omega \quad\text{and}\quad r_- = \lambda - i\omega \]

where $\lambda$ and $\omega$ are the real numbers

\[ \lambda = -\frac{b}{2a} \quad\text{and}\quad \omega = \frac{\sqrt{\left|b^2 - 4ac\right|}}{2a} . \]
Don’t bother memorizing these formulas for $\lambda$ and $\omega$, but do observe that these two values for $r$,
$\lambda + i\omega$ and $\lambda - i\omega$, form a “conjugate pair”; that is, they differ only by the sign on the imaginary
part.


It should also be noted that $\omega$ cannot be zero; otherwise we would be back in case 2 (one real
root). So $r_+ = \lambda + i\omega$ and $r_- = \lambda - i\omega$ are two distinct roots. As we will verify, this means

\[ y_+(x) = e^{r_+ x} = e^{(\lambda + i\omega)x} \quad\text{and}\quad y_-(x) = e^{r_- x} = e^{(\lambda - i\omega)x} \]

are two independent solutions, and, from a purely mathematical point of view, there is nothing wrong
with using
\[ y(x) = c_+ e^{(\lambda + i\omega)x} + c_- e^{(\lambda - i\omega)x} \]
as a general solution to our differential equation.
There are problems, however, with using $e^{rx}$ when $r$ is a complex number. For one thing, it
introduces complex numbers into problems that, most likely, should be entirely describable using
just real-valued functions. More importantly, you might not yet know what $e^{rx}$ means when $r$
is a complex number! (Quick, graph $e^{(3+i2)x}$!) To deal with these problems, you have several
choices:
1. Pray you never have to deal with complex roots when solving differential equations. (A very
bad choice since such equations turn out to be among the most important ones in applications.)
2. Blindly use $e^{(\lambda \pm i\omega)x}$, hoping that no one questions your results and that your results are never
used in a system the failure of which could result in death, injury, or the loss of substantial
sums of money and/or prestige.
3. Derive an alternative formula for the general solution, and hope that no one else ever springs
an $e^{(\lambda \pm i\omega)x}$ on you. (Another bad choice since these exponentials are used throughout the
sciences and engineering.)
4. Spend a little time learning what $e^{(\lambda \pm i\omega)x}$ means.
Guess which choice we take.

The Complex Exponential Function

Let us now consider the quantity $e^z$ where $z$ is any complex value. We, like everyone else, will call
this the complex exponential function. In our work,

\[ z = rx = (\lambda \pm i\omega)x = \lambda x \pm i\omega x \]

where $\lambda$ and $\omega$ are real constants. Keep in mind that our $r$ value came from a characteristic
equation that, in turn, came from plugging $y(x) = e^{rx}$ into a differential equation. In getting the
characteristic equation, we did our computations assuming the $e^{rx}$ behaved just as it would behave
if $r$ were a real value. So, in determining just what $e^{rx}$ means when $r$ is complex, we should
assume the complex exponential satisfies the rules satisfied by the exponential we already know. In
particular, we must have, for any constant $r$ (real or complex),
\[ \frac{d}{dx}e^{rx} = re^{rx} , \]
and, for any two constants $A$ and $B$,
\[ e^{A+B} = e^A e^B . \]

Thus,
\[ e^{rx} = e^{(\lambda \pm i\omega)x} = e^{\lambda x \pm i\omega x} = e^{\lambda x}e^{\pm i\omega x} . \]
The first factor, $e^{\lambda x}$, is just the ordinary, ‘real’ exponential, a function you should be able to graph
in your sleep. It is the second factor, $e^{\pm i\omega x}$, that we need to come to grips with.


To better understand $e^{\pm i\omega x}$, let us first examine

\[ \phi(t) = e^{it} . \]

Later, we’ll replace $t$ with $\pm\omega x$.

One thing we can do with this $\phi(t)$ is to compute it at $t = 0$,

\[ \phi(0) = e^{i \cdot 0} = e^0 = 1 . \]

We can also differentiate $\phi$, getting

\[ \phi'(t) = \frac{d}{dt}e^{it} = ie^{it} . \]
Letting $t = 0$ here gives
\[ \phi'(0) = ie^{i \cdot 0} = ie^0 = i . \]
Notice that the right side of the formula for $\phi'(t)$ is just $i$ times $\phi$. Differentiating $\phi'(t)$ again, we
get
\[ \phi''(t) = \frac{d}{dt}\left[ie^{it}\right] = i^2 e^{it} = -1 \cdot \phi(t) , \]
giving us the differential equation
\[ \phi'' = -\phi , \]
which is better written as
\[ \phi'' + \phi = 0 \]
simply because we’ve seen this equation before (using $y$ instead of $\phi$ for the unknown function).
What we have just derived is that $\phi(t) = e^{it}$ satisfies the initial-value problem

\[ \phi'' + \phi = 0 \quad\text{with}\quad \phi(0) = 1 \quad\text{and}\quad \phi'(0) = i . \]

We can solve this! From example 13.5 on page 267, we know

\[ \phi(t) = A\cos(t) + B\sin(t) \]

is a general solution to the differential equation. Using this formula for $\phi$ we also have
\[ \phi'(t) = \frac{d}{dt}\left[A\cos(t) + B\sin(t)\right] = -A\sin(t) + B\cos(t) . \]
Applying the initial conditions gives

\[ 1 = \phi(0) = A\cos(0) + B\sin(0) = A \cdot 1 + B \cdot 0 = A \]

and
\[ i = \phi'(0) = -A\sin(0) + B\cos(0) = -A \cdot 0 + B \cdot 1 = B . \]

Thus,
\[ e^{it} = \phi(t) = 1 \cdot \cos(t) + i \cdot \sin(t) = \cos(t) + i\sin(t) . \]
The last formula is so important that we should write it again with the middle cut out and give
it a reference number:
\[ e^{it} = \cos(t) + i\sin(t) . \tag{15.4} \]
Before replacing $t$ with $\pm\omega x$, it is worth noting what happens when $t$ is replaced by $-t$ in the
above formula. In doing this, remember that the cosine is an even function and that the sine is an
odd function, that is,

\[ \cos(-t) = \cos(t) \quad\text{while}\quad \sin(-t) = -\sin(t) . \]


So
\[ e^{-it} = e^{i(-t)} = \cos(-t) + i\sin(-t) = \cos(t) - i\sin(t) . \]
Cutting out the middle leaves us with
\[ e^{-it} = \cos(t) - i\sin(t) . \tag{15.5} \]
Formulas (15.4) and (15.5) are the (famous) Euler formulas for $e^{it}$ and $e^{-it}$. They really
should be written as a pair:
\[ e^{it} = \cos(t) + i\sin(t) \tag{15.6a} \]
\[ e^{-it} = \cos(t) - i\sin(t) \tag{15.6b} \]
That way, it is clear that when you add these two equations together, the sines cancel out and we get
\[ e^{it} + e^{-it} = 2\cos(t) . \]
On the other hand, subtracting equation (15.6b) from equation (15.6a) yields
\[ e^{it} - e^{-it} = 2i\sin(t) . \]
So, the sine and cosine functions can be rewritten in terms of complex exponentials,
\[ \cos(t) = \frac{e^{it} + e^{-it}}{2} \quad\text{and}\quad \sin(t) = \frac{e^{it} - e^{-it}}{2i} . \tag{15.7} \]
This is nice. This means that many computations involving trigonometric functions can be done
using exponentials instead, and this can greatly simplify those computations.

!Example 15.5: Consider deriving the trigonometric identity involving the product of the sine
and the cosine functions. Using equation set (15.7) and basic algebra,
\[
\begin{aligned}
\sin(t)\cos(t) &= \frac{e^{it} - e^{-it}}{2i} \cdot \frac{e^{it} + e^{-it}}{2} \\
  &= \frac{\left(e^{it}\right)^2 - \left(e^{-it}\right)^2}{2 \cdot 2i} \\
  &= \frac{e^{i2t} - e^{-i2t}}{2 \cdot 2i} \\
  &= \frac{1}{2} \cdot \frac{e^{i(2t)} - e^{-i(2t)}}{2i} = \frac{1}{2}\sin(2t) .
\end{aligned}
\]
Thus, we have (re)derived the trigonometric identity
\[ 2\sin(t)\cos(t) = \sin(2t) . \]
(Compare this to the derivation you did years ago without complex exponentials!)
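
Euler's formulas and the identity just derived are easy to spot-check numerically. The sketch below uses Python's standard `cmath` and `math` modules. The helper names and the sample values of `t` are arbitrary choices made for illustration.

```python
import cmath
import math

def euler_rhs(t):
    """cos(t) + i sin(t), the right side of Euler's formula (15.4)."""
    return complex(math.cos(t), math.sin(t))

def cos_from_exp(t):
    """cos(t) written with complex exponentials, as in (15.7)."""
    return ((cmath.exp(1j * t) + cmath.exp(-1j * t)) / 2).real

def sin_from_exp(t):
    """sin(t) written with complex exponentials, as in (15.7)."""
    return ((cmath.exp(1j * t) - cmath.exp(-1j * t)) / 2j).real

samples = [0.0, 0.7, math.pi / 3, 2.5, -1.2]

# Largest gap between exp(it) and cos(t) + i sin(t) over the samples
euler_gap = max(abs(cmath.exp(1j * t) - euler_rhs(t)) for t in samples)

# Largest gap in sin(t)cos(t) = sin(2t)/2, computed purely from exponentials
identity_gap = max(
    abs(sin_from_exp(t) * cos_from_exp(t) - 0.5 * sin_from_exp(2 * t))
    for t in samples
)
```

Both gaps come out at the level of floating-point rounding, which is consistent with (but of course no substitute for) the derivations above.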

But we digress. Our interest is not in rederiving trigonometric identities, but in figuring out
what to do when we get $e^{(\lambda \pm i\omega)x}$ as solutions to a differential equation. Using the law of exponents
and formulas (15.6a) and (15.6b) (with $\omega x$ replacing $t$), we see that
\[ e^{(\lambda + i\omega)x} = e^{\lambda x + i\omega x} = e^{\lambda x}e^{i\omega x} = e^{\lambda x}\left[\cos(\omega x) + i\sin(\omega x)\right] \]
and
\[ e^{(\lambda - i\omega)x} = e^{\lambda x - i\omega x} = e^{\lambda x}e^{-i\omega x} = e^{\lambda x}\left[\cos(\omega x) - i\sin(\omega x)\right] . \]

So now we know how to interpret $e^{(\lambda \pm i\omega)x}$, and now we can get back to solving our differential
equations.


Intelligently Using Complex Roots

Recall, we are interested in solving
\[ ay'' + by' + cy = 0 \]
when $a$, $b$ and $c$ are real-valued constants, and the solutions to the corresponding characteristic
equation are complex. Remember, also, that these complex roots will form a conjugate pair,
\[ r_+ = \lambda + i\omega \quad\text{and}\quad r_- = \lambda - i\omega \]
where $\lambda$ and $\omega$ are real numbers with $\omega \neq 0$. This gave us
\[ y_+(x) = e^{r_+ x} = e^{(\lambda + i\omega)x} \quad\text{and}\quad y_-(x) = e^{r_- x} = e^{(\lambda - i\omega)x} \]
as two solutions to our differential equation. From our discussion of the complex exponential, we
now know that
\[ y_+(x) = e^{(\lambda + i\omega)x} = e^{\lambda x}\left[\cos(\omega x) + i\sin(\omega x)\right] \tag{15.8a} \]
and
\[ y_-(x) = e^{(\lambda - i\omega)x} = e^{\lambda x}\left[\cos(\omega x) - i\sin(\omega x)\right] . \tag{15.8b} \]
Clearly, neither $y_+$ nor $y_-$ is a constant multiple of the other. So each of the equivalent formulas
\[ y(x) = c_+ y_+(x) + c_- y_-(x) , \]

\[ y(x) = c_+ e^{(\lambda + i\omega)x} + c_- e^{(\lambda - i\omega)x} \]

and
\[ y(x) = c_+ e^{\lambda x}\left[\cos(\omega x) + i\sin(\omega x)\right] + c_- e^{\lambda x}\left[\cos(\omega x) - i\sin(\omega x)\right] \]
can, legitimately, be used as a general solution to our differential equation.
Still, however, these solutions introduce complex numbers into formulas that, in applications,
should probably just involve real numbers. To avoid that, let us derive an alternative fundamental
pair of solutions by choosing the constants $c_+$ and $c_-$ appropriately. The basic idea is the same
as used to derive the formulas (15.7) for $\sin(t)$ and $\cos(t)$ in terms of complex exponentials. First
add equations (15.8a) and (15.8b) together, noting how the sine terms cancel out:

\[
\begin{aligned}
y_+(x) &= e^{\lambda x}\left[\cos(\omega x) + i\sin(\omega x)\right] \\
{} + \; y_-(x) &= e^{\lambda x}\left[\cos(\omega x) - i\sin(\omega x)\right] \\
\hline
y_+(x) + y_-(x) &= 2e^{\lambda x}\cos(\omega x)
\end{aligned}
\]

So
\[ y_1(x) = \frac{1}{2}y_+(x) + \frac{1}{2}y_-(x) = e^{\lambda x}\cos(\omega x) \]
is a solution to our differential equation. On the other hand, the cosine terms cancel out when we
subtract equation (15.8b) from equation (15.8a), leaving us with
\[ y_+(x) - y_-(x) = 2ie^{\lambda x}\sin(\omega x) . \]
So
\[ y_2(x) = \frac{1}{2i}y_+(x) - \frac{1}{2i}y_-(x) = e^{\lambda x}\sin(\omega x) \]
is another solution to our differential equation. Again, it should be clear that our latest solutions are
not constant multiples of each other. Consequently,
\[ y_1(x) = e^{\lambda x}\cos(\omega x) \quad\text{and}\quad y_2(x) = e^{\lambda x}\sin(\omega x) \]


form a fundamental set, and

\[ y(x) = c_1 e^{\lambda x}\cos(\omega x) + c_2 e^{\lambda x}\sin(\omega x) \]

is a general solution to our differential equation.

All this work should be summarized so we won’t forget:

Lemma 15.3
Let $a$, $b$ and $c$ be real-valued constants with $a \neq 0$. If the characteristic equation for

\[ ay'' + by' + cy = 0 \]

does not have one or two distinct real solutions, then it will have a conjugate pair of solutions

\[ r_+ = \lambda + i\omega \quad\text{and}\quad r_- = \lambda - i\omega \]

where $\lambda$ and $\omega$ are real numbers with $\omega \neq 0$.

Moreover, both
\[ \left\{ e^{(\lambda + i\omega)x} , e^{(\lambda - i\omega)x} \right\} \quad\text{and}\quad \left\{ e^{\lambda x}\cos(\omega x) , e^{\lambda x}\sin(\omega x) \right\} \]

are fundamental sets of solutions for the differential equation, and the general solution can be written
as either
\[ y(x) = c_+ e^{(\lambda + i\omega)x} + c_- e^{(\lambda - i\omega)x} \tag{15.9a} \]
or
\[ y(x) = c_1 e^{\lambda x}\cos(\omega x) + c_2 e^{\lambda x}\sin(\omega x) \tag{15.9b} \]
as desired.

In practice, formula (15.9b) is often preferred since it is a formula entirely in terms of real-valued
functions. On the other hand, if you are going to do a bunch of calculations with $y(x)$
involving differentiation and/or integration, you may prefer using formula (15.9a), since calculus
with exponentials, even complex exponentials, is much easier than calculus with products of
exponentials and trigonometric functions.
By the way, instead of “memorizing” the above lemma, it may be better to just remember that
you can get the $e^{\lambda x}\cos(\omega x)$ and $e^{\lambda x}\sin(\omega x)$ solutions from the real and imaginary parts of

\[
\begin{aligned}
e^{(\lambda \pm i\omega)x} &= e^{\lambda x}e^{\pm i\omega x} \\
  &= e^{\lambda x}\left[\cos(\omega x) \pm i\sin(\omega x)\right] = e^{\lambda x}\cos(\omega x) \pm ie^{\lambda x}\sin(\omega x) .
\end{aligned}
\]

!Example 15.6: Again, consider solving

\[ y'' - 6y' + 13y = 0 . \]

From example 15.4, we already know

\[ r_+ = 3 + i2 \quad\text{and}\quad r_- = 3 - i2 \]

are the two solutions to the characteristic equation,

\[ r^2 - 6r + 13 = 0 . \]

Thus, one fundamental set of solutions consists of

\[ e^{(3+i2)x} \quad\text{and}\quad e^{(3-i2)x} \]


with
\[
\begin{aligned}
e^{(3 \pm i2)x} &= e^{3x}e^{\pm i2x} \\
  &= e^{3x}\left[\cos(2x) \pm i\sin(2x)\right] = e^{3x}\cos(2x) \pm ie^{3x}\sin(2x) .
\end{aligned}
\]

Alternatively, we can use the pair of real-valued functions

\[ e^{3x}\cos(2x) \quad\text{and}\quad e^{3x}\sin(2x) \]

as a fundamental set. Using this, we have

\[ y(x) = c_1 e^{3x}\cos(2x) + c_2 e^{3x}\sin(2x) \]

as the general solution to the differential equation in terms of just real-valued functions.
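
The relationship between the complex and real-valued fundamental sets in this example can be illustrated with a short computation. This is a sketch under stated assumptions: the function names are hypothetical, and the residual check for $e^{3x}\cos(2x)$ uses derivatives worked out by hand with the product rule, $y' = e^{3x}[3\cos(2x) - 2\sin(2x)]$ and $y'' = e^{3x}[5\cos(2x) - 12\sin(2x)]$.

```python
import cmath
import math

r_plus = complex(3, 2)   # the root 3 + i2 of r**2 - 6r + 13 = 0

def y_plus(x):
    """Complex-valued solution exp((3 + 2i)x)."""
    return cmath.exp(r_plus * x)

def y1(x):
    """Real-valued solution exp(3x) cos(2x)."""
    return math.exp(3 * x) * math.cos(2 * x)

def y2(x):
    """Real-valued solution exp(3x) sin(2x)."""
    return math.exp(3 * x) * math.sin(2 * x)

def residual_y1(x):
    """Left side of y'' - 6y' + 13y = 0 for y = y1, using hand-computed derivatives."""
    e, c, s = math.exp(3 * x), math.cos(2 * x), math.sin(2 * x)
    y = e * c
    dy = e * (3 * c - 2 * s)
    d2y = e * (5 * c - 12 * s)
    return d2y - 6 * dy + 13 * y

# The root really does satisfy the characteristic equation.
char_value = r_plus * r_plus - 6 * r_plus + 13

samples = [-1.0, 0.0, 0.4, 1.0]
real_gap = max(abs(y_plus(x).real - y1(x)) for x in samples)
imag_gap = max(abs(y_plus(x).imag - y2(x)) for x in samples)
worst_residual = max(abs(residual_y1(x)) for x in samples)
```

The real and imaginary parts of $e^{(3+i2)x}$ agree with $e^{3x}\cos(2x)$ and $e^{3x}\sin(2x)$ to rounding error, and the residual for $e^{3x}\cos(2x)$ is zero to the same accuracy.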

15.6 Summary
Combining lemma 15.1 on page 304, lemma 15.2 on page 308, and lemma 15.3 on page 314, we get
the big theorem on solving second-order homogeneous linear differential equations with constant
coefﬁcients:

Theorem 15.4
Let $a$, $b$ and $c$ be real-valued constants with $a \neq 0$. Then the characteristic polynomial for

\[ ay'' + by' + cy = 0 \]

will have either one or two distinct real roots or will have two complex roots that are complex
conjugates of each other. Moreover:

1. If there are two distinct real roots $r_1$ and $r_2$, then

\[ \left\{ e^{r_1 x} , e^{r_2 x} \right\} \]

is a fundamental set of solutions to the differential equation, and

\[ y(x) = c_1 e^{r_1 x} + c_2 e^{r_2 x} \]

is a general solution.

2. If there is only one real root $r$, then

\[ \left\{ e^{rx} , xe^{rx} \right\} \]

is a fundamental set of solutions to the differential equation, and

\[ y(x) = c_1 e^{rx} + c_2 xe^{rx} \]

is a general solution.


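3. If there is a conjugate pair of roots $r = \lambda \pm i\omega$, then both

\[ \left\{ e^{(\lambda + i\omega)x} , e^{(\lambda - i\omega)x} \right\} \quad\text{and}\quad \left\{ e^{\lambda x}\cos(\omega x) , e^{\lambda x}\sin(\omega x) \right\} \]

are fundamental sets of solutions to the differential equation, and either

\[ y(x) = c_+ e^{(\lambda + i\omega)x} + c_- e^{(\lambda - i\omega)x} \]

or
\[ y(x) = c_1 e^{\lambda x}\cos(\omega x) + c_2 e^{\lambda x}\sin(\omega x) \]

can be used as a general solution.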
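
The three cases of the theorem translate naturally into a small routine. The following is a minimal sketch rather than a definitive implementation: the function name `general_solution`, the case labels, and the choice to return the real-valued form of the solution in the complex case are all assumptions made for illustration. It also compares the discriminant with zero exactly, which is only reliable when the coefficients are integers or otherwise exactly representable.

```python
import math

def general_solution(a, b, c):
    """Return (case, y) for a*y'' + b*y' + c*y = 0 with real constants, a != 0.

    y(x, c1, c2) evaluates the real-valued general solution from the theorem.
    """
    if a == 0:
        raise ValueError("a must be nonzero for a second-order equation")
    disc = b * b - 4 * a * c
    if disc > 0:
        r1 = (-b - math.sqrt(disc)) / (2 * a)
        r2 = (-b + math.sqrt(disc)) / (2 * a)
        return ("two distinct real roots",
                lambda x, c1, c2: c1 * math.exp(r1 * x) + c2 * math.exp(r2 * x))
    if disc == 0:
        r = -b / (2 * a)
        return ("one real root",
                lambda x, c1, c2: (c1 + c2 * x) * math.exp(r * x))
    lam = -b / (2 * a)
    omega = math.sqrt(-disc) / (2 * a)
    return ("complex conjugate roots",
            lambda x, c1, c2: math.exp(lam * x)
            * (c1 * math.cos(omega * x) + c2 * math.sin(omega * x)))

# The three examples used throughout this chapter
case1, ya = general_solution(1, -5, 6)    # y'' - 5y' + 6y = 0
case2, yb = general_solution(1, -6, 9)    # y'' - 6y' + 9y = 0
case3, yc = general_solution(1, -6, 13)   # y'' - 6y' + 13y = 0
```

Applied to the chapter's three running examples, the routine selects the first, second, and third case of the theorem, respectively.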

Additional Exercises

15.1. Find the general solution to each of the following:

a. y − 7y + 10y = 0 b. y + 2y − 24y = 0
c. y − 25y = 0 d. y + 3y = 0
e. 4y − y = 0 f. 3y + 7y − 6y = 0

15.2. Solve the following initial-value problems:


16
Springs: Part I

Second-order differential equations arise in a number of applications. We saw one involving a
falling object at the beginning of this text (the falling frozen duck example in section 1.2). In fact,
since acceleration is given by the second derivative of position, any application requiring Newton’s
equation F = ma has the potential to be modeled by a second-order differential equation.
In this chapter we will consider a class of applications involving masses bouncing up and down
at the ends of springs. This is a particularly good class of examples for us to examine. For one
thing, the basic model is relatively easy to derive and is given by a second-order differential equation
with constant coefﬁcients. So we will be able to apply what we learned in the last chapter to derive
reasonably accurate descriptions of the motion under a variety of situations. Moreover, most of
us already have an intuitive idea of how these “mass/spring systems” behave. Hopefully, what we
derive will correspond to what we expect, and may even reﬁne our intuitive understanding.
Another good point about the work we are about to begin is that many of the notions and
results we will develop here can carry over to the analysis of other applications involving things that
vibrate or oscillate in some manner. For example, the analysis of current in basic electric circuits is
completely analogous to the analysis we’ll carry out for masses on springs.

16.1 Modeling the Action

The Mass/Spring System
Imagine a horizontal spring with one end attached to an immobile wall and the other end attached to
some object of interest (say, a box of frozen ducks) which can slide along the floor, as in figure 16.1.
For brevity, this entire assemblage of spring, object, wall, etc. will be called a mass/spring system.
Let us assume that:
1. The object can only move back and forth in the one horizontal direction.
2. Newtonian physics applies.
3. The total force acting on the object is the sum of:
(a) The force from the spring responding to the spring being compressed and stretched.
(b) The forces resisting motion because of air resistance and friction between the box and
the floor.
(c) Any other forces acting on the object. (This term will usually be zero in this chapter.
We include it here for use in later chapters, so we don’t have to re-derive the equation
for the spring to include other forces.)


Figure 16.1: The mass/spring system with the direction of the spring force Fspring on the mass
(a) when the spring is extended ( y(t) > 0 ), and (b) when the spring is compressed
( y(t) < 0 ).

(All forces are directed parallel to the direction of the object’s motion.)

4. The spring is an “ideal spring” with no mass. It has some natural length at which it is neither
compressed nor stretched, and it can be both stretched and compressed. (So the coils are not
so tightly wound that they are pressed against each other, making compression impossible.)

Our goal is to describe how the position of the object varies with time, and to see how this object’s
motion depends on the different parameters of our mass/spring system (the object’s mass, the strength
of the spring, the slipperiness of the ﬂoor, etc.).
To set up the general formulas and equations, we’ll ﬁrst make the following traditional symbolic
assignments:

m = the mass (in kilograms) of the object ,

t = the time (in seconds) since the mass/spring system was set into motion ,
and
y = the position (in meters) of the object at time t , measured from where the object sits
when the spring is at its natural length .

This means our Y –axis is horizontal (nontraditional, maybe, but convenient for this application), and
positioned so that y = 0 is the “equilibrium position” of the object. Let us also direct the Y –axis
so that the spring is stretched when y > 0 , and compressed when y < 0 (again, see ﬁgure 16.1).

Modeling the Forces

The motion of the object is governed by Newton’s law F = ma with F being the force acting on
the box and
\[ a = a(t) = \text{acceleration of the box at time } t = \frac{d^2 y}{dt^2} . \]
By our assumptions,
F = Fresist + Fspring + Fother
where
Fresist = force due to the air resistance and friction ,

Fspring = force from the spring due to it being compressed or stretched ,

and
Fother = any other forces acting on the object .


Thus,
\[ F_{\text{resist}} + F_{\text{spring}} + F_{\text{other}} = F = ma = m \frac{d^2 y}{dt^2} , \]
which, for convenient reference later, we will rewrite as

\[ m \frac{d^2 y}{dt^2} - F_{\text{resist}} - F_{\text{spring}} = F_{\text{other}} . \tag{16.1} \]

The resistive force, Fresist , is basically the same as the force due to air resistance discussed in
the Better Falling Object Model in chapter 1 (see page 11) — we are just including friction from the
floor along with the friction with the air (or whatever medium surrounds our mass/spring system).
So let us model the total resistive force here the same way we modeled the force of air resistance in
chapter 1:
\[ F_{\text{resist}} = -\gamma \times \text{velocity of the box} = -\gamma \frac{dy}{dt} \tag{16.2} \]
where γ is some nonnegative constant. Because of the role it will play in determining how much
the resistive forces “dampen” the motion, we call γ the damping constant. It will be large if the air
resistance is substantial (possibly because the mass/spring system is submerged in water instead of
air) or if the object does not slide easily on the floor. It will be small if there is little air resistance
and the floor is very slippery. And it will be zero if there is no air resistance and no friction with the
floor (a very idealized situation).
Now consider what we know about the spring force, Fspring . At any given time t , this force
depends only on how much the spring is stretched or compressed at that time, and that, in turn,
is completely described by y(t) . Hence, we can describe the spring force as a function of y ,
Fspring = Fspring (y) . Moreover:

1. If y = 0 , then the spring is at its natural length, neither stretched nor compressed, and exerts
no force on the box. So Fspring (0) = 0 .

2. If y > 0 , then the spring is stretched and exerts a force on the box pulling it backwards. So
Fspring (y) < 0 whenever y > 0 .

3. Conversely, if y < 0 , then the spring is compressed and exerts a force on the box pushing it
forwards. So Fspring (y) > 0 whenever y < 0 .

Knowing nothing more about the spring force, we might as well model it using the simplest
mathematical formula satisfying the above:
\[ F_{\text{spring}}(y) = -\kappa y \tag{16.3} \]

where κ is some positive constant.

Formula (16.3) is the famous Hooke’s law for springs. Experiment has shown it to be a good
model for the spring force, provided the spring is not stretched or compressed too much. The constant
κ in this formula is called the spring constant. It describes the “stiffness” of the spring (i.e., how
strongly it resists being stretched), and can be determined by compressing or stretching the spring by
some amount y0 , and then measuring the corresponding force F0 at the end of the spring. Hooke’s
law then says that
\[ \kappa = -\frac{F_0}{y_0} . \]
And because κ is a positive constant, we can simplify things a little bit more to
\[ \kappa = \frac{|F_0|}{|y_0|} . \tag{16.4} \]


!Example 16.1: Assume that, when our spring is pulled out 2 meters beyond its natural length,
we measure that the spring is pulling back with a force F0 of magnitude

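\[ |F_0| = 18 \ \frac{\text{kg}\cdot\text{meter}}{\text{sec}^2} . \]

Then,
\[ \kappa = \frac{|F_0|}{|y_0|} = \frac{18 \ \text{kg}\cdot\text{meter}/\text{sec}^2}{2 \ \text{meter}} . \]
That is,
\[ \kappa = 9 \ \frac{\text{kg}}{\text{sec}^2} . \]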
(This is a pretty weak spring.)
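
The computation in this example is just formula (16.4). A minimal sketch in Python, with a function name of our own choosing and with the units left to the reader as in the text:

```python
def spring_constant(force, displacement):
    """Spring constant from formula (16.4): kappa = |F0| / |y0|.

    `force` is the measured spring force F0 and `displacement` is the
    amount y0 by which the spring is stretched or compressed (nonzero).
    """
    if displacement == 0:
        raise ValueError("the spring must be stretched or compressed")
    return abs(force) / abs(displacement)

# Example 16.1: a force of magnitude 18 at a displacement of 2.
print(spring_constant(-18, 2))
```

Because of the absolute values, the sign convention used for the measured force does not matter.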

A Note on Units
We defined m , t , and y to be numerical values describing mass, time, and position in terms
of kilograms, seconds, and meters. Consequently, all the quantities derived from these (velocity,
acceleration, the damping constant γ , which we will also call the resistance coefficient, and the
spring constant κ ) are numerical values describing physical parameters in terms of corresponding
units. Of course, velocity and acceleration are in terms of meters/second and meters/second² ,
respectively. And, because “F = ma”, any value for force should be interpreted as being in terms
of kilogram·meter/second² (also called newtons). As indicated in the example above, the
corresponding units associated with the spring constant are kilogram/second² , and as you can
readily verify, the resistance coefficient γ is in terms of kilograms/second.
In the above example, the units involved in every calculation were explicitly given.
In the future, we will not explicitly state the units in every calculation and trust that you, the reader,
can determine the units appropriate for the final results from the information given.
Indeed, for our purposes, the actual choice of the units is not important. The formulas we
have developed and illustrated (along with those we will later develop and illustrate) remain just
as valid if m , t , and y are in terms of, say, grams, weeks, and miles, respectively, provided it
is understood that the values of the corresponding velocities, accelerations, etc. are in terms of
miles/week, miles/week² , etc.

!Example 16.2: Pretend that, when our spring is pulled out 2 miles beyond its natural length,
we measure that the spring is pulling back with a force F0 of magnitude

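\[ |F_0| = 18 \ \frac{\text{gram}\cdot\text{mile}}{\text{week}^2} . \]

Then,
\[ \kappa = \frac{|F_0|}{|y_0|} = \frac{18 \ \text{gram}\cdot\text{mile}/\text{week}^2}{2 \ \text{mile}} . \]
That is,
\[ \kappa = 9 \ \frac{\text{gram}}{\text{week}^2} . \]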


16.2 The Mass/Spring Equation and Its Solutions

The Differential Equation
Replacing Fresist and Fspring in equation (16.1) with the formulas for these forces from equations
(16.2) and (16.3), we get
\[ m \frac{d^2 y}{dt^2} + \gamma \frac{dy}{dt} + \kappa y = F_{\text{other}} . \tag{16.5} \]
This is the differential equation for y(t) , the position y of the object in the system at time t .
For the rest of this chapter, let us assume the object is moving “freely” under the inﬂuence of no
forces except those from friction and from the spring’s compression and expansion.1 Thus, for the
rest of this chapter, we will restrict our interest to the above differential equation with Fother = 0 ,

\[ m \frac{d^2 y}{dt^2} + \gamma \frac{dy}{dt} + \kappa y = 0 . \tag{16.6} \]

This is a second-order, homogeneous, linear differential equation with constant coefficients; so we
can solve it by the methods discussed in the previous chapter. The precise functions in these solutions
(sine/cosines, exponentials, etc.) will depend on the coefﬁcients. We will go through all the possible
cases soon.
Keep in mind that the mass, m , and the spring constant, κ , are positive constants for a real
spring. On the other hand, the damping constant, γ , can be positive or zero. This is signiﬁcant.
Because γ = 0 when there is no resistive force to dampen the motion, we say the mass/spring
system is undamped when γ = 0 . We will see that the motion of the mass in this case is relatively
simple.
If, however, there is a nonzero resistive force to dampen the motion, then γ > 0 . Accordingly,
in this case, we say the mass/spring system is damped. We will see that there are three subcases to
consider, according to whether γ² − 4κm is negative, zero or positive.
Let’s now carefully examine, case by case, the solutions that can arise.

Undamped Systems
If γ = 0 , differential equation (16.6) reduces to

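\[ m \frac{d^2 y}{dt^2} + \kappa y = 0 . \tag{16.7} \]

The corresponding characteristic equation,

\[ m r^2 + \kappa = 0 , \]

has roots
\[ r_\pm = \pm \frac{\sqrt{-\kappa m}}{m} = \pm i\omega_0 \quad\text{where}\quad \omega_0 = \sqrt{\frac{\kappa}{m}} . \]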
From our discussions in the previous chapter, we know the general solution to our differential equation
is given by
y(t) = c1 cos(ω0 t) + c2 sin(ω0 t)
where c1 and c2 are arbitrary constants. However, for graphing purposes (and a few other purposes)
it is convenient to write our general solution in yet another form. To derive this form, plot (c1 , c2 ) as
a point on a Cartesian coordinate system, and let A and φ be the corresponding polar coordinates
of this point (see figure 16.2a). That is, let

\[ A = \sqrt{(c_1)^2 + (c_2)^2} \]

and let φ be the angle in the range [0, 2π ) with

\[ c_1 = A\cos(\phi) \quad\text{and}\quad c_2 = A\sin(\phi) . \]

Using this and the well-known trigonometric identity

\[ \cos(\theta \pm \phi) = \cos(\theta)\cos(\phi) \mp \sin(\theta)\sin(\phi) , \]

we get

\[ \begin{aligned}
c_1\cos(\omega_0 t) + c_2\sin(\omega_0 t) &= [A\cos(\phi)]\cos(\omega_0 t) + [A\sin(\phi)]\sin(\omega_0 t) \\
&= A\left[\cos(\omega_0 t)\cos(\phi) + \sin(\omega_0 t)\sin(\phi)\right] \\
&= A\cos(\omega_0 t - \phi) .
\end{aligned} \]

Figure 16.2: (a) Expressing c1 and c2 as A cos(φ) and A sin(φ) . (b) The graph of y(t) for the
undamped mass/spring system of example 16.3.

1 We will introduce other forces in later chapters.

Thus, our general solution is given by either

y(t) = c1 cos(ω0 t) + c2 sin(ω0 t) (16.8a)

or, equivalently,
y(t) = A cos(ω0 t − φ) (16.8b)

where
\[ \omega_0 = \sqrt{\frac{\kappa}{m}} \]
and the other constants are related by
\[ A = \sqrt{(c_1)^2 + (c_2)^2} , \quad \cos(\phi) = \frac{c_1}{A} \quad\text{and}\quad \sin(\phi) = \frac{c_2}{A} . \]
It is worth noting that ω0 depends only on the spring constant and the mass of the attached object.
The other constants are “arbitrary” and are determined by the initial position and velocity of the
attached object.
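
The relations between (c1 , c2 ) and (A , φ) amount to a conversion from Cartesian to polar coordinates, so they can be sketched with the standard two-argument arctangent. The function name below is our own, and reducing the angle modulo 2π is simply one way to land in the range [0, 2π ) used above.

```python
import math

def amplitude_phase(c1, c2):
    """Return (A, phi) with c1 = A*cos(phi), c2 = A*sin(phi), 0 <= phi < 2*pi."""
    A = math.hypot(c1, c2)                      # sqrt(c1**2 + c2**2)
    phi = math.atan2(c2, c1) % (2 * math.pi)    # atan2 picks the correct quadrant
    return A, phi

A, phi = amplitude_phase(2, 2 * math.sqrt(3))
print(A, phi, math.pi / 3)
```

Using `atan2` avoids the quadrant ambiguity that arises when φ is recovered from cos(φ) or sin(φ) alone.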
Either of the formulas from set (16.8) can be used for the position y of the box at time t .
One advantage of using formula (16.8a) is that the constants c1 and c2 are fairly easily determined
in many initial-value problems involving this differential equation. However, formula (16.8b)
gives an even simpler description of how the position varies with time. It tells us that the position is
completely described by a single shifted cosine function multiplied by the positive constant A and
shifted so that
\[ y(0) = A\cos(\phi) . \]
You should be well acquainted with such functions. The graph of one is sketched in ﬁgure 16.2b.
Take a look at it, and then read on.
Formula (16.8b) tells us that the object is oscillating back and forth from y = A to y = −A .
Accordingly, we call A the amplitude of the oscillation. The (natural) period p0 of the oscillation
is the time it takes the mass to go through one complete “cycle” of oscillation. Using formula (16.8b),
rewritten as
y(t) = A cos(X) with X = ω0 t − φ ,
we see that our system going through one cycle as t varies from t = t0 to t = t0 + p0 is the same
as cos(X) going through one complete cycle as X varies from

\[ X = X_0 = \omega_0 t_0 - \phi \]
to
\[ X = \omega_0 (t_0 + p_0) - \phi = X_0 + \omega_0 p_0 . \]

But, as we well know, cos(X) goes through one complete cycle as X goes from X = X0 to
X = X0 + 2π . Thus,
\[ \omega_0 p_0 = 2\pi , \tag{16.9} \]
and the natural period of our system is
\[ p_0 = \frac{2\pi}{\omega_0} . \]

This is the “time per cycle” for the oscillations in the mass/spring system. Its reciprocal,
\[ \nu_0 = \frac{1}{p_0} = \frac{\omega_0}{2\pi} , \]

then gives the “cycles per unit time” for the system (typically measured in terms of hertz, with
one hertz equaling one cycle per second). We call ν0 the (natural) frequency for the system. The
closely related quantity originally computed, ω0 (which can be viewed as describing “radians per unit
time”), will be called the (natural) angular frequency for the system.2 Because the natural frequency
ν0 is usually more easily measured than the natural angular frequency ω0 , it is sometimes more
convenient to express the formulas for position (formula set 16.8) with 2π ν0 replacing ω0 ,

\[ y(t) = c_1\cos(2\pi\nu_0 t) + c_2\sin(2\pi\nu_0 t) , \tag{16.8a$'$} \]

and, equivalently,
\[ y(t) = A\cos(2\pi\nu_0 t - \phi) . \tag{16.8b$'$} \]

By the way, the angle φ in all the formulas above is called the phase angle of the oscillations,
and any motion described by these formulas is referred to as simple harmonic motion.
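
As a quick illustration of how ω0 , p0 and ν0 are tied together, here is a small Python sketch. The function name is ours, and the sample values κ = 9 and m = 4 are the ones used in the examples of this chapter.

```python
import math

def natural_quantities(kappa, m):
    """Return (omega0, p0, nu0) for an undamped mass/spring system.

    omega0 = sqrt(kappa/m) is the natural angular frequency,
    p0 = 2*pi/omega0 the natural period, and nu0 = 1/p0 the natural frequency.
    """
    if kappa <= 0 or m <= 0:
        raise ValueError("kappa and m must be positive")
    omega0 = math.sqrt(kappa / m)
    p0 = 2 * math.pi / omega0
    nu0 = 1 / p0
    return omega0, p0, nu0

omega0, p0, nu0 = natural_quantities(9, 4)
print(omega0, p0, nu0)
```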

2 Many authors refer to ω as a circular frequency instead of an angular frequency.

!Example 16.3: Assume we have an undamped mass/spring system in which the spring’s spring
constant κ and the attached object’s mass m are

\[ \kappa = 9 \ \frac{\text{kg}}{\text{sec}^2} \quad\text{and}\quad m = 4 \ (\text{kg}) \]

(as in example 16.1). Let us try to find and graph the position y at time t of the attached object,
assuming the object’s initial position and velocity are
\[ y(0) = 2 \quad\text{and}\quad y'(0) = 3\sqrt{3} . \]

With the above values for γ , m and κ , the differential equation for y , equation (16.6),
becomes
\[ 4 \frac{d^2 y}{dt^2} + 9y = 0 . \]
As noted in our discussion, the general solution can be given by either

y(t) = c1 cos(ω0 t) + c2 sin(ω0 t)

or
y(t) = A cos(ω0 t − φ)

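where the natural angular frequency is

\[ \omega_0 = \sqrt{\frac{\kappa}{m}} = \sqrt{\frac{9}{4}} = \frac{3}{2} . \]

This means the natural period p0 and the natural frequency ν0 of the system are
\[ p_0 = \frac{2\pi}{\omega_0} = \frac{2\pi}{3/2} = \frac{4\pi}{3} \quad\text{and}\quad \nu_0 = \frac{1}{p_0} = \frac{3}{4\pi} . \]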

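To determine the other constants in the above formulas for y(t) , we need to consider the
given initial conditions. Using the first formula for y , we have

\[ y(t) = c_1\cos\left(\tfrac{3}{2}t\right) + c_2\sin\left(\tfrac{3}{2}t\right) \]
and
\[ y'(t) = -\tfrac{3}{2}\,c_1\sin\left(\tfrac{3}{2}t\right) + \tfrac{3}{2}\,c_2\cos\left(\tfrac{3}{2}t\right) . \]

Plugging these into the initial conditions yields

\[ 2 = y(0) = c_1\cos\left(\tfrac{3}{2}\cdot 0\right) + c_2\sin\left(\tfrac{3}{2}\cdot 0\right) = c_1 \]
and
\[ 3\sqrt{3} = y'(0) = -\tfrac{3}{2}\,c_1\sin\left(\tfrac{3}{2}\cdot 0\right) + \tfrac{3}{2}\,c_2\cos\left(\tfrac{3}{2}\cdot 0\right) = \tfrac{3}{2}\,c_2 . \]

So
\[ c_1 = 2 , \quad c_2 = \tfrac{2}{3}\cdot 3\sqrt{3} = 2\sqrt{3} , \]
and
\[ A = \sqrt{(c_1)^2 + (c_2)^2} = \sqrt{(2)^2 + (2\sqrt{3})^2} = \sqrt{4 + 12} = 4 . \]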
This gives us enough information to graph y(t) . From our computations, we know this
graph is a cosine shaped curve with amplitude A = 4 and period p0 = 4π/3 . It is shifted
horizontally so that the initial conditions
\[ y(0) = 2 \quad\text{and}\quad y'(0) = 3\sqrt{3} > 0 \]

are satisﬁed. In other words, the graph must cross the Y –axis at y = 2 , and the graph’s slope at
that crossing point must be positive. That is how the graph in ﬁgure 16.2b was constructed.


To find the phase angle φ , we must solve the pair of trigonometric equations
\[ \cos(\phi) = \frac{c_1}{A} = \frac{2}{4} = \frac{1}{2} \quad\text{and}\quad \sin(\phi) = \frac{c_2}{A} = \frac{2\sqrt{3}}{4} = \frac{\sqrt{3}}{2} \]

for 0 ≤ φ < 2π . From our knowledge of trigonometry, we know the first of these equations is
satisfied if and only if
\[ \phi = \frac{\pi}{3} \quad\text{or}\quad \phi = \frac{5\pi}{3} , \]
while the second is satisfied if and only if
\[ \phi = \frac{\pi}{3} \quad\text{or}\quad \phi = \frac{2\pi}{3} . \]

Hence, for both of the above trigonometric equations to hold, we must have
\[ \phi = \frac{\pi}{3} . \]

Finally, using the values just obtained, we can completely write out two equivalent formulas
for our solution:
\[ y(t) = 2\cos\left(\tfrac{3}{2}t\right) + 2\sqrt{3}\,\sin\left(\tfrac{3}{2}t\right) \]
and
\[ y(t) = 4\cos\left(\tfrac{3}{2}t - \frac{\pi}{3}\right) . \]
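
The two formulas just obtained can be checked numerically. The following Python sketch is only a sanity check under our own choices: the helper names, the sample times, and the finite-difference step sizes are all assumptions made for illustration, and the derivatives are approximated rather than computed exactly.

```python
import math

OMEGA0 = 3 / 2  # natural angular frequency from example 16.3

def y_sum(t):
    """Solution written as c1*cos + c2*sin with c1 = 2, c2 = 2*sqrt(3)."""
    return 2 * math.cos(OMEGA0 * t) + 2 * math.sqrt(3) * math.sin(OMEGA0 * t)

def y_shifted(t):
    """Solution written as A*cos(omega0*t - phi) with A = 4, phi = pi/3."""
    return 4 * math.cos(OMEGA0 * t - math.pi / 3)

def first_derivative(f, t, h=1e-5):
    """Central-difference estimate of f'(t)."""
    return (f(t + h) - f(t - h)) / (2 * h)

def second_derivative(f, t, h=1e-3):
    """Central-difference estimate of f''(t)."""
    return (f(t + h) - 2 * f(t) + f(t - h)) / (h * h)

sample_times = [0.1 * k for k in range(100)]

# Largest disagreement between the two forms of the solution.
max_form_gap = max(abs(y_sum(t) - y_shifted(t)) for t in sample_times)

# Largest value of |4*y'' + 9*y| at the sample times; it is nonzero only
# because of rounding and finite-difference error.
max_residual = max(abs(4 * second_derivative(y_sum, t) + 9 * y_sum(t))
                   for t in sample_times)

print(y_sum(0.0), first_derivative(y_sum, 0.0), 3 * math.sqrt(3))
print(max_form_gap, max_residual)
```

Both printed error measures should be small, and the first line compares the estimated initial velocity with the prescribed value 3√3.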

Damped Systems
If γ > 0 , then all coefﬁcients in our differential equation

\[ m \frac{d^2 y}{dt^2} + \gamma \frac{dy}{dt} + \kappa y = 0 \]

are positive. The corresponding characteristic equation is

\[ m r^2 + \gamma r + \kappa = 0 , \]

and its solutions are given by

\[ r = r_\pm = \frac{-\gamma \pm \sqrt{\gamma^2 - 4\kappa m}}{2m} . \tag{16.10} \]

As we saw in the last chapter, the nature of the differential equation’s solution, y = y(t) , depends on
whether γ² − 4κm is positive, negative or zero. And this, in turn, depends on the positive constants
γ , κ and mass m as follows:
\[ \gamma < 2\sqrt{\kappa m} \iff \gamma^2 - 4\kappa m < 0 , \]
\[ \gamma = 2\sqrt{\kappa m} \iff \gamma^2 - 4\kappa m = 0 , \]
and
\[ 2\sqrt{\kappa m} < \gamma \iff \gamma^2 - 4\kappa m > 0 . \]


For reasons that may (or may not) be clear by the end of this section, we say that a mass/spring
system is, respectively,

underdamped , critically damped or overdamped

if and only if
\[ 0 < \gamma < 2\sqrt{\kappa m} , \quad \gamma = 2\sqrt{\kappa m} \quad\text{or}\quad 2\sqrt{\kappa m} < \gamma . \]

Since we’ve already considered the case where γ = 0 , the first damped cases considered will
be the underdamped mass/spring systems (where 0 < γ < 2√(κm) ).
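
The classification just described can be sketched as a short Python function. The name and the returned strings are our own, and the comparison is made on γ² and 4κm directly (rather than on γ and 2√(κm) ) so that no square root is needed; with floating-point inputs that are not exactly representable, the test for critical damping should be treated with caution.

```python
def classify_damping(m, gamma, kappa):
    """Classify a mass/spring system by the sign of gamma**2 - 4*kappa*m."""
    if m <= 0 or kappa <= 0 or gamma < 0:
        raise ValueError("need m > 0, kappa > 0 and gamma >= 0")
    if gamma == 0:
        return "undamped"
    disc = gamma * gamma - 4 * kappa * m
    if disc < 0:
        return "underdamped"
    if disc == 0:
        return "critically damped"
    return "overdamped"

# With kappa = 9 and m = 4 the threshold is gamma = 2*sqrt(9*4) = 12.
for gamma in (0, 2, 12, 15):
    print(gamma, classify_damping(4, gamma, 9))
```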
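Underdamped Systems ( 0 < γ < 2√(κm) )
In this case,
\[ \sqrt{\gamma^2 - 4\kappa m} = \sqrt{-\left|\gamma^2 - 4\kappa m\right|} = i\sqrt{\left|\gamma^2 - 4\kappa m\right|} = i\sqrt{4\kappa m - \gamma^2} , \]

and formula (16.10) for the r± ’s can be written as

\[ r_\pm = -\alpha \pm i\omega \quad\text{where}\quad \alpha = \frac{\gamma}{2m} \quad\text{and}\quad \omega = \frac{\sqrt{4\kappa m - \gamma^2}}{2m} . \]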

Both α and ω are positive real values, and, from the discussion in the previous chapter, we know a
corresponding general solution to our differential equation is

\[ y(t) = c_1 e^{-\alpha t}\cos(\omega t) + c_2 e^{-\alpha t}\sin(\omega t) . \]

Factoring out the exponential and applying the same analysis to the linear combination of sines and
cosines as was done for the undamped case, we get that the position y of the box at time t is given
by any of the following:
\[ y(t) = e^{-\alpha t}\left[c_1\cos(\omega t) + c_2\sin(\omega t)\right] , \tag{16.11a} \]

\[ y(t) = A e^{-\alpha t}\cos(\omega t - \phi) \tag{16.11b} \]

and even
\[ y(t) = A e^{-\alpha t}\cos\left(\omega\left[t - \frac{\phi}{\omega}\right]\right) . \tag{16.11c} \]

These three formulas are equivalent, and the arbitrary constants are related, as before, by
\[ A = \sqrt{(c_1)^2 + (c_2)^2} , \quad \cos(\phi) = \frac{c_1}{A} \quad\text{and}\quad \sin(\phi) = \frac{c_2}{A} . \]

Note the similarities and differences in the motion of the undamped system and the underdamped
system. In both cases, a shifted cosine function plays a major role in describing the position of
the mass. In the underdamped system this cosine function has angular frequency ω and, hence,
corresponding period and frequency
\[ p = \frac{2\pi}{\omega} \quad\text{and}\quad \nu = \frac{\omega}{2\pi} . \]

However, in the underdamped system, this shifted cosine function is also multiplied by a decreasing
exponential, reflecting the fact that the motion is being damped, but not so damped as to completely
prevent oscillations in the box’s position. (You will further analyze how p and ν vary with γ in
exercise 16.7 b.)
Because the α in the formula set (16.11) determines the rate at which the maximum values of
y(t) are decreasing as t increases, let us call α the decay coefficient for our system. It is also tempting
to call ω , p and ν the angular frequency, period and frequency of the system, but, because y(t) is
not truly periodic, this terminology is not truly appropriate. Instead, let’s refer to these quantities as the
angular quasi-frequency, quasi-period and quasi-frequency of the system.3 And, if you must give
them names, call A the quasi-amplitude and $Ae^{-\alpha t}$ the time-varying amplitude of the system.
And, again, it is sometimes more convenient to express our formulas in terms of the
quasi-frequency ν instead of the angular quasi-frequency ω , with, for example, formulas (16.11a) and
(16.11b) being rewritten as

\[ y(t) = e^{-\alpha t}\left[c_1\cos(2\pi\nu t) + c_2\sin(2\pi\nu t)\right] \tag{16.11a$'$} \]

and
\[ y(t) = A e^{-\alpha t}\cos(2\pi\nu t - \phi) . \tag{16.11b$'$} \]

!Example 16.4: Again, assume the spring constant κ and the mass m in our mass/spring
system are
\[ \kappa = 9 \ \frac{\text{kg}}{\text{sec}^2} \quad\text{and}\quad m = 4 \ (\text{kg}) . \]
For the system to be underdamped, the resistance coefficient γ must satisfy
\[ 0 < \gamma < 2\sqrt{\kappa m} = 2\sqrt{9\cdot 4} = 12 . \]

For this example, assume γ = 2 . Then the position y at time t of the object is given by

\[ y(t) = A e^{-\alpha t}\cos(\omega t - \phi) \]

where
\[ \alpha = \frac{\gamma}{2m} = \frac{2}{2\times 4} = \frac{1}{4} \]
and
\[ \omega = \frac{\sqrt{4\kappa m - \gamma^2}}{2m} = \frac{\sqrt{(4\cdot 9\cdot 4) - 2^2}}{2\cdot 4} = \frac{\sqrt{35}}{4} . \]
The corresponding quasi-period for the system is
\[ p = \frac{2\pi}{\omega} = \frac{2\pi}{\sqrt{35}/4} = \frac{8\pi}{\sqrt{35}} \approx 4.25 . \]

To keep this example short, we won’t solve for A and φ from some set of initial conditions.
Instead, we’ll just set
\[ A = 4 \quad\text{and}\quad \phi = \frac{\pi}{3} , \]
and note that the resulting graph of y(t) is sketched in figure 16.3.

3 Some authors prefer using “pseudo” instead of “quasi”.

Figure 16.3: Graph of y(t) for the underdamped mass/spring system of example 16.4
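
The quantities in example 16.4 follow directly from the formulas for α , ω and p . A minimal Python sketch, with function names of our own and with A = 4 and φ = π/3 taken from the example:

```python
import math

def underdamped_parameters(m, gamma, kappa):
    """Return (alpha, omega, p) for an underdamped system.

    alpha = gamma/(2m) is the decay coefficient,
    omega = sqrt(4*kappa*m - gamma**2)/(2m) the angular quasi-frequency,
    and p = 2*pi/omega the quasi-period.
    """
    disc = 4 * kappa * m - gamma ** 2
    if gamma <= 0 or disc <= 0:
        raise ValueError("system is not underdamped")
    alpha = gamma / (2 * m)
    omega = math.sqrt(disc) / (2 * m)
    return alpha, omega, 2 * math.pi / omega

def position(t, A, phi, alpha, omega):
    """Formula (16.11b): y(t) = A*exp(-alpha*t)*cos(omega*t - phi)."""
    return A * math.exp(-alpha * t) * math.cos(omega * t - phi)

alpha, omega, p = underdamped_parameters(4, 2, 9)
print(alpha, omega, p)
print([round(position(t, 4, math.pi / 3, alpha, omega), 3) for t in range(6)])
```

The second line printed is a short table of positions at t = 0, 1, …, 5 , which can be compared against a hand sketch of the graph.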

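Critically Damped Systems ( γ = 2√(κm) )
In this case,
\[ \sqrt{\gamma^2 - 4\kappa m} = 0 \]
and
\[ r_\pm = \frac{-\gamma \pm \sqrt{\gamma^2 - 4\kappa m}}{2m} = \frac{-2\sqrt{\kappa m} \pm 0}{2m} = -\frac{\sqrt{\kappa m}}{m} = -\sqrt{\frac{\kappa}{m}} . \]
So the corresponding general solution to our differential equation is
\[ y(t) = c_1 e^{-\alpha t} + c_2 t e^{-\alpha t} \quad\text{where}\quad \alpha = \sqrt{\frac{\kappa}{m}} . \]
Factoring out the exponential yields
\[ y(t) = [c_1 + c_2 t]\, e^{-\alpha t} . \tag{16.12} \]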
The cosine factor in the underdamped case has now been replaced with a formula for a straight line,
c1 + c2 t . If y′(0) is positive, then y(t) will initially increase as t increases. However, at some
point, the decaying exponential will force the graph of y(t) back down towards 0 as t → ∞ . This
is illustrated in figure 16.4a.
If, on the other hand, y(0) is positive and y′(0) is negative enough that c2 = y′(0) + αy(0) is
negative, then the slope of the straight line is negative, and the graph will initially head downward
as t increases. Eventually, c1 + c2 t will be negative. And, again, the decaying exponential will
eventually force y(t) back (upward, this time) towards 0 as t → ∞ . This is illustrated in figure 16.4b.

!Example 16.5: Once again, assume the spring constant κ and the mass m in our mass/spring
system are
\[ \kappa = 9 \ \frac{\text{kg}}{\text{sec}^2} \quad\text{and}\quad m = 4 \ (\text{kg}) . \]
For the system to be critically damped, the resistance coefficient γ must satisfy
\[ \gamma = 2\sqrt{\kappa m} = 2\sqrt{9\cdot 4} = 12 . \]
Assuming this, the position y at time t of the object in this mass/spring system is given by
\[ y(t) = [c_1 + c_2 t]\, e^{-\alpha t} \quad\text{where}\quad \alpha = \sqrt{\frac{\kappa}{m}} = \sqrt{\frac{9}{4}} = \frac{3}{2} . \]
The graph of this with (c1 , c2 ) = (2, 8) is sketched in figure 16.4a; the graph of this with
(c1 , c2 ) = (2, −4) is sketched in figure 16.4b.

Figure 16.4: Graph of y(t) for the critically damped mass/spring system of example 16.5 (a)
with y(0) = 2 and y′(0) > 0 (b) with y(0) = 2 and y′(0) < 0 .
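
Differentiating formula (16.12) gives y′(0) = c2 − αc1 , so the constants for a critically damped system follow from the initial data as c1 = y(0) and c2 = y′(0) + αy(0) . The Python sketch below uses that relation. The function name is ours, and the initial velocities 5 and −7 are not stated in example 16.5; they are simply the values we inferred to be consistent with the pairs (c1 , c2 ) = (2, 8) and (2, −4) used there.

```python
import math

def critically_damped_solution(m, kappa, y0, v0):
    """Return (alpha, c1, c2, y) for y(t) = (c1 + c2*t)*exp(-alpha*t).

    y0 and v0 are the initial position y(0) and initial velocity y'(0).
    """
    alpha = math.sqrt(kappa / m)
    c1 = y0
    c2 = v0 + alpha * y0   # from y'(0) = c2 - alpha*c1

    def y(t):
        return (c1 + c2 * t) * math.exp(-alpha * t)

    return alpha, c1, c2, y

for v0 in (5, -7):
    alpha, c1, c2, y = critically_damped_solution(4, 9, 2, v0)
    print(alpha, c1, c2, [round(y(t), 4) for t in (0, 0.5, 1, 2, 5)])
```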

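Overdamped Systems ( 2√(κm) < γ )
In this case, it is first worth observing that
\[ \gamma > \sqrt{\gamma^2 - 4\kappa m} > 0 . \]

Consequently, the formula for the r± ’s (equation (16.10)),

\[ r_\pm = \frac{-\gamma \pm \sqrt{\gamma^2 - 4\kappa m}}{2m} \]
can be written as
\[ r_+ = -\alpha \quad\text{and}\quad r_- = -\beta \]
where α and β are the positive values

\[ \alpha = \frac{\gamma - \sqrt{\gamma^2 - 4\kappa m}}{2m} \quad\text{and}\quad \beta = \frac{\gamma + \sqrt{\gamma^2 - 4\kappa m}}{2m} . \]
Hence, the corresponding general solution to the differential equation is

\[ y(t) = c_1 e^{-\alpha t} + c_2 e^{-\beta t} , \]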

a linear combination of two decaying exponentials.

Some of the possible graphs for y are illustrated in ﬁgure 16.5.

!Example 16.6: Once again, assume the spring constant and the mass in our mass/spring system
are, respectively,
\[ \kappa = 9 \ \frac{\text{kg}}{\text{sec}^2} \quad\text{and}\quad m = 4 \ (\text{kg}) . \]
For the system to be overdamped, the resistance coefficient γ must satisfy
\[ \gamma > 2\sqrt{\kappa m} = 2\sqrt{9\cdot 4} = 12 . \]

In particular, the system is overdamped if the resistance coefficient γ is 15 . Assuming this, the
general position y at time t of the object in our system is given by

\[ y(t) = c_1 e^{-\alpha t} + c_2 e^{-\beta t} \]

where
\[ \alpha = \frac{\gamma - \sqrt{\gamma^2 - 4\kappa m}}{2m} = \frac{15 - \sqrt{15^2 - 4\cdot 9\cdot 4}}{2\cdot 4} = \cdots = \frac{3}{4} \]
and
\[ \beta = \frac{\gamma + \sqrt{\gamma^2 - 4\kappa m}}{2m} = \frac{15 + \sqrt{15^2 - 4\cdot 9\cdot 4}}{2\cdot 4} = \cdots = 3 . \]
Figures 16.5a, 16.5b, and 16.5c were drawn using this formula with, respectively,

\[ (c_1 , c_2) = (4, -2) , \quad (c_1 , c_2) = (1, 1) \quad\text{and}\quad (c_1 , c_2) = (-2, 4) . \]

Figure 16.5: Graphs of y(t) (with y(0) = 2 ) for the overdamped mass/spring system of
example 16.6. In (a) y′(0) > 0 . In (b) and (c) y′(0) < 0 with the magnitude of
y′(0) in (c) being significantly larger than in (b).
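
The steps elided by “· · ·” in example 16.6 are easy to confirm numerically, and the constants c1 and c2 can be recovered from initial data by solving c1 + c2 = y(0) and −αc1 − βc2 = y′(0) . In the Python sketch below the function names are ours, and the initial velocity 3 is our own inference: it is the value of y′(0) consistent with (c1 , c2 ) = (4, −2) , not a number given in the example.

```python
import math

def overdamped_rates(m, gamma, kappa):
    """Return the positive decay rates (alpha, beta) of an overdamped system."""
    disc = gamma ** 2 - 4 * kappa * m
    if disc <= 0:
        raise ValueError("system is not overdamped")
    root = math.sqrt(disc)
    return (gamma - root) / (2 * m), (gamma + root) / (2 * m)

def overdamped_constants(alpha, beta, y0, v0):
    """Solve c1 + c2 = y0 and -alpha*c1 - beta*c2 = v0 for (c1, c2)."""
    c1 = (beta * y0 + v0) / (beta - alpha)
    c2 = -(alpha * y0 + v0) / (beta - alpha)
    return c1, c2

alpha, beta = overdamped_rates(4, 15, 9)
print(alpha, beta)
print(overdamped_constants(alpha, beta, 2, 3))
```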

Additional Exercises

16.1. Assume we have a single undamped mass/spring system, and do the following:
a. Find the spring constant κ for the spring given that, when pulled out 1/2 meter beyond
its natural length, the spring pulls back with a force F0 of magnitude

\[ |F_0| = 2 \ \frac{\text{kg}\cdot\text{meter}}{\text{sec}^2} . \]

b. Find the natural angular frequency ω0 , the natural frequency ν0 , and the natural
period p0 of this system assuming the mass of the attached object is 16 kilograms.
c. Four different sets of initial conditions are given below for this mass/spring system. For
each, determine the corresponding amplitude A and phase angle φ for the system, and
sketch the graph of the position over time y = y(t) . (Use the values of κ and ω0 derived
above, and assume position and time are given in meters and seconds, respectively.):
i. y(0) = 2 and y′(0) = 0 ii. y(0) = 0 and y′(0) = 2
iii. y(0) = 0 and y′(0) = −2 iv. y(0) = 2 and y′(0) = √3


16.2. Assume we have a single undamped mass/spring system, and do the following:
a. Find the spring constant κ for the spring given that, when pulled out 1/4 meter beyond
its natural length, the spring pulls back with a force F0 of magnitude

\[ |F_0| = 72 \ \frac{\text{kg}\cdot\text{meter}}{\text{sec}^2} . \]

b. Find the natural angular frequency ω0 , the natural frequency ν0 , and the natural
period p0 of this system when the mass of the attached object is 2 kilograms.
c. Three different sets of initial conditions are given below for this mass/spring system. For
each, determine the corresponding amplitude A , and sketch the graph of the position over
time y = y(t) :
i. y(0) = 1 and y′(0) = 0 ii. y(0) = 0 and y′(0) = 1
iii. y(0) = 1 and y′(0) = 3

16.3. Suppose we have an undamped mass/spring system with natural angular frequency ω0 .
Let y0 and v0 be, respectively, the position and velocity of the object at t = 0 . Show that
the corresponding amplitude A is given by
\[ A = \sqrt{ y_0^{\,2} + \left(\frac{v_0}{\omega_0}\right)^{2} } , \]

and that the phase angle satisfies

\[ \tan(\phi) = \frac{v_0}{y_0\,\omega_0} . \]

16.4. Suppose that a particular undamped mass/spring system has natural period p0 = 3 seconds.
What is the spring constant κ of the spring if the mass m of the object is (in kilograms)
a. m = 1 b. m = 2 c. m = 1/2
16.5. Suppose we have an underdamped mass/spring system with decay coefﬁcient α and angular
quasi-frequency ω . Let y0 and v0 be, respectively, the position and velocity of the object
at t = 0 . Show that the corresponding amplitude A is given by
\[ A = \sqrt{ y_0^{\,2} + \left(\frac{v_0 + \alpha y_0}{\omega}\right)^{2} } , \]
while the phase angle satisfies
\[ \tan(\phi) = \frac{v_0 + \alpha y_0}{y_0\,\omega} . \]

16.6. Consider a damped mass/spring system with spring constant, mass and damping coefﬁcient
being, respectively,

κ = 37 , m = 4 and γ = 4 .

a. Verify that this is an underdamped system.

b. Find the decay coefficient α , the angular quasi-frequency ω , the quasi-period p and the
quasi-frequency ν of this system.

c. Four different sets of initial conditions are given below for this mass/spring system. For
each, determine the corresponding quasi-amplitude A for the system, and roughly sketch
the graph of the position over time y = y(t) .
i. y(0) = 1 and y′(0) = 0 ii. y(0) = 4 and y′(0) = −2
iii. y(0) = 0 and y′(0) = 1 iv. y(0) = 2 and y′(0) = 2

16.7 a. Let ω be the angular quasi-frequency of some underdamped mass/spring system. Show
that
\[ \omega = \sqrt{ (\omega_0)^2 - \left(\frac{\gamma}{2m}\right)^{2} } \]
where m is the mass of the object in the system, γ is the damping constant, and ω0 is
the natural angular frequency of the corresponding undamped system.
b. Suppose we have a mass/spring system in which we can adjust the damping coefficient
γ (the mass m and the spring constant κ remain constant). How do the quasi-frequency
ν and the quasi-period p vary as γ varies from γ = 0 up to γ = 2√(κm) ? (Compare
ν and p to the natural frequency and period of the corresponding undamped system, ν0
and p0 .)

16.8. Consider a damped mass/spring system with spring constant, mass and damping coefﬁcient
being, respectively,

κ = 4 , m = 1 and γ = 4 .

a. Verify that this system is critically damped.

b. Find and roughly graph the position of the object over time, y(t) , assuming that

y(0) = 2 and y′(0) = 0 .

c. Find and roughly graph the position of the object over time, y(t) , assuming that

y(0) = 0 and y′(0) = 2 .

16.9. Consider a damped mass/spring system with spring constant, mass and damping coefﬁcient
being, respectively,

κ = 4 , m = 1 and γ = 5 .

a. Verify that this system is overdamped.


336 Arbitrary Homogeneous Linear Equations with Constant Coefﬁcients

The idea of “factoring”, of course, extends to polynomials of higher degree. And to use this
idea with these polynomials, it will help to introduce the “completely factored form” for an arbitrary
N th -degree polynomial

\[ p(r) = a_0 r^N + a_1 r^{N-1} + \cdots + a_{N-2} r^2 + a_{N-1} r + a_N . \]

We will say that we’ve (re)written this polynomial into its completely factored form if and only if
we’ve factored it to an expression of the form

\[ p(r) = a_0 (r - r_1)^{m_1} (r - r_2)^{m_2} \cdots (r - r_K)^{m_K} \tag{17.1} \]

where
\[ \{ r_1 , r_2 , \ldots , r_K \} \]
is the set of all different (possibly complex) roots of the polynomial (i.e., values of r satisfying
p(r) = 0 ), and
\[ \{ m_1 , m_2 , \ldots , m_K \} \]
is some corresponding set of positive integers.
Let’s make a few simple observations regarding the above, and then look at a few examples.

1. It will be important for our discussion that

\[ \{ r_1 , r_2 , \ldots , r_K \} \]

is the set of all different roots of the polynomial. If j ≠ k , then $r_j \neq r_k$ .

2. Each $m_k$ is the largest integer such that $(r - r_k)^{m_k}$ is a factor of the original polynomial.
Consequently, for each $r_k$ , there is only one possible value for $m_k$ . We call $m_k$ the multiplicity
of $r_k$ .

3. As shorthand, we often say that $r_k$ is a simple root if its multiplicity is 1 , a double root if
its multiplicity is 2 , a triple root if its multiplicity is 3 , and so on.

4. If you multiply out all the factors in the completely factored form in line (17.1), you get a
polynomial of degree
\[ m_1 + m_2 + \cdots + m_K . \]
Since this polynomial is supposed to be p(r) , an N th -degree polynomial, we must have

\[ m_1 + m_2 + \cdots + m_K = N . \]

!Example 17.1: By straightforward multiplication, you can verify that

\[ 2(r - 4)^3 (r + 5) = 2r^4 - 14r^3 - 24r^2 + 352r - 640 . \]

This means
\[ p(r) = 2r^4 - 14r^3 - 24r^2 + 352r - 640 \]
can be written in completely factored form

\[ p(r) = 2(r - 4)^3 (r - [-5]) . \]

This polynomial has two distinct real roots, 4 and −5 . The root 4 has multiplicity 3 , and −5
is a simple root.


!Example 17.2: Straightforward multiplication also verifies that

\[ (r - 3)^5 = r^5 - 15r^4 + 90r^3 - 270r^2 + 405r - 243 . \]

Thus,
\[ r^5 - 15r^4 + 90r^3 - 270r^2 + 405r - 243 \]
has the completely factored form
\[ (r - 3)^5 . \]
Here, 3 is the only distinct root, and this root has multiplicity 5 .

!Example 17.3: As the last example, for now, you can show that

\[ \left( r - [3 + 4i] \right)^2 \left( r - [3 - 4i] \right)^2 = r^4 - 12r^3 + 86r^2 - 300r + 625 . \]

Hence,

\[ \left( r - [3 + 4i] \right)^2 \left( r - [3 - 4i] \right)^2 \]
is the completely factored form for

\[ r^4 - 12r^3 + 86r^2 - 300r + 625 . \]

This time we have two complex roots, $3 + 4i$ and $3 - 4i$, with each being a double root.
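The "straightforward multiplication" in examples 17.1 through 17.3 is easy to delegate to a few lines of code. This is a minimal sketch, not part of the text's method: polynomials are stored as coefficient lists (highest power first), and `poly_mul` and `poly_pow` are names invented here.

```python
def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (highest power first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def poly_pow(a, n):
    """Raise a polynomial to the nonnegative integer power n."""
    result = [1]
    for _ in range(n):
        result = poly_mul(result, a)
    return result


# Example 17.1: 2 (r - 4)^3 (r + 5)
ex1 = poly_mul([2], poly_mul(poly_pow([1, -4], 3), [1, 5]))
# Example 17.2: (r - 3)^5
ex2 = poly_pow([1, -3], 5)
# Example 17.3: (r - [3 + 4i])^2 (r - [3 - 4i])^2, using Python's complex type
ex3 = poly_mul(poly_pow([1, -(3 + 4j)], 2), poly_pow([1, -(3 - 4j)], 2))

print(ex1)
print(ex2)
print(ex3)
```

In the third case the coefficients come back as complex numbers whose imaginary parts are zero, which is the numerical counterpart of the conjugate-pair cancellation.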

Can every polynomial be written in completely factored form? The next theorem says “yes”.

Theorem 17.1 (complete factorization theorem)

Every polynomial can be written in completely factored form.

Note that we are not requiring the coefficients of the polynomial be real. This theorem is valid
for every polynomial with real or complex coefficients. Unfortunately, you will have to accept this
theorem on faith. Its proof requires developing more theory than is appropriate in this text.1
Unfortunately, also, this theorem does not tell us how to find the completely factored form. Of
course, if the polynomial is of degree 2 ,

\[ ar^2 + br + c , \]

then we can find the roots via the quadratic formula,

\[ r = r_\pm = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} . \]

Analogs of this formula do exist for polynomials of degrees 3 and 4 , but these analogs are rather
complicated and not often used unless the user is driven by great need. For polynomials of degrees
greater than 4 , it has been shown that no such analogs exist.
This means that ﬁnding the completely factored form may require some of those “tricks for
factoring” you learned long ago in your old algebra classes. We’ll review a few of those tricks later
in examples involving differential equations.

1 Those who are interested may want to look up the “Fundamental Theorem of Algebra” in a text on complex variables.
The complete factorization theorem given here is a corollary of that theorem.


17.2 Solving the Differential Equation

The Characteristic Equation
Suppose we have some N th -order differential equation of the form

\[ a_0 y^{(N)} + a_1 y^{(N-1)} + \cdots + a_{N-2} y'' + a_{N-1} y' + a_N y = 0 \tag{17.2} \]

where the $a_k$’s are all constants (and $a_0 \neq 0$). Since

\[ \left( e^{rx} \right)' = r e^{rx} \]

\[ \left( e^{rx} \right)'' = \left( r e^{rx} \right)' = r \cdot r e^{rx} = r^2 e^{rx} \]

\[ \left( e^{rx} \right)''' = \left( r^2 e^{rx} \right)' = r^2 \cdot r e^{rx} = r^3 e^{rx} \]
\[ \vdots \]

for any constant $r$, it is easy to see that plugging $y = e^{rx}$ into the differential equation yields

\[ a_0 r^N e^{rx} + a_1 r^{N-1} e^{rx} + \cdots + a_{N-2} r^2 e^{rx} + a_{N-1} r e^{rx} + a_N e^{rx} = 0 , \]

which, after dividing out $e^{rx}$, gives us the corresponding characteristic equation

\[ a_0 r^N + a_1 r^{N-1} + \cdots + a_{N-2} r^2 + a_{N-1} r + a_N = 0 . \tag{17.3} \]

As before, we refer to the polynomial on the left,

\[ p(r) = a_0 r^N + a_1 r^{N-1} + \cdots + a_{N-2} r^2 + a_{N-1} r + a_N , \]

as the characteristic polynomial for the differential equation. Also, as in a previous chapter, it should
be observed that the characteristic equation can be obtained from the original differential equation
by simply replacing the derivatives of y with the corresponding powers of r .
According to the complete factorization theorem, the above characteristic equation can be
rewritten in completely factored form,

\[ a_0 (r - r_1)^{m_1} (r - r_2)^{m_2} \cdots (r - r_K)^{m_K} = 0 \tag{17.4} \]

where the $r_k$’s are all the different roots of the characteristic polynomial, and the $m_k$’s are the
multiplicities of the corresponding roots. It turns out that, for each root $r_k$ with multiplicity $m_k$,
we can identify a corresponding linearly independent set of $m_k$ particular solutions to the original
differential equation. It will be obvious (once you see them) that no solution generated from one
root can be written as a linear combination of solutions generated from the other roots. Hence, the
set of all these particular solutions generated from all the $r_k$’s will be a linearly independent set
containing (according to our complete factorization theorem)

\[ m_1 + m_2 + \cdots + m_K = N \]

solutions. From the big theorem on solutions to homogeneous equations (theorem 13.5 on page
272), we then know that this big set is a fundamental set of solutions for the differential equation,
and that the general solution is given by an arbitrary linear combination of these particular solutions.
Exactly which particular solutions are generated from each individual root depends on the
multiplicity and whether the root is real valued or not.


Particular Solutions Corresponding to One Root

In the following, we will assume $r_k$ is a root of multiplicity $m_k$ to our characteristic polynomial.
That is,
\[ (r - r_k)^{m_k} \]
is one factor in equation (17.4). However, since the choice of $k$ will be irrelevant in this discussion,
we will, for simplicity, drop the subscripts.

The Basic Result

Assume $r$ is a root of multiplicity $m$ to our characteristic polynomial. Then, as before,

\[ e^{rx} \]

is one particular solution to the differential equation, and if $m = 1$, it is the only solution
corresponding to this root that we need to find.
So now assume $m > 1$. In the previous chapter, we found that

\[ x e^{rx} \]

is a second solution to the differential equation when $r$ is a repeated root and $N = 2$. This was
obtained via reduction of order. For the more general case being considered here, it can be shown
that $x e^{rx}$ is still a solution. In fact, it can be shown that the $m$ particular solutions to the differential
equation corresponding to root $r$ can be generated one after the other by simply multiplying the
previously found solution by $x$. That is, we have the following theorem:

Theorem 17.2
Let $r$ be a root of multiplicity $m$ to the characteristic polynomial for

\[ a_0 y^{(N)} + a_1 y^{(N-1)} + \cdots + a_{N-2} y'' + a_{N-1} y' + a_N y = 0 \]

where the $a_k$’s are all constants. Then

\[ \left\{ e^{rx} , \; x e^{rx} , \; x^2 e^{rx} , \; \ldots , \; x^{m-1} e^{rx} \right\} \]

is a linearly independent set of $m$ solutions to the differential equation.

The proof of this theorem will be discussed later, in section 17.4. (And it probably should be
noted that $x^m e^{rx}$ ends up not being a solution to the differential equation.)

Particular Solutions Corresponding to a Real Root

If $r$ is a real root of multiplicity $m$ to our characteristic polynomial, then theorem 17.2, above, tells
us that
\[ \left\{ e^{rx} , \; x e^{rx} , \; x^2 e^{rx} , \; \ldots , \; x^{m-1} e^{rx} \right\} \]

is the linearly independent set of $m$ solutions to the differential equation corresponding to that root.
No more need be said.

!Example 17.4: Consider the homogeneous differential equation

\[ y^{(5)} - 15y^{(4)} + 90y''' - 270y'' + 405y' - 243y = 0 . \]


Its characteristic equation is

\[ r^5 - 15r^4 + 90r^3 - 270r^2 + 405r - 243 = 0 . \]

The left side of this equation is the polynomial from example 17.2. Going back to that example,
we discover that this characteristic equation can be factored to

\[ (r - 3)^5 = 0 . \]

So 3 is the only root, and it has multiplicity 5. Theorem 17.2 then tells us that the linearly
independent set of 5 corresponding particular solutions to the differential equation is
\[ \left\{ e^{3x} , \; x e^{3x} , \; x^2 e^{3x} , \; x^3 e^{3x} , \; x^4 e^{3x} \right\} . \]

Since 5 is also the order of the differential equation, we know (via theorem 13.5 on page 272)
that the above set is a fundamental set of solutions to our homogeneous differential equation, and,
thus,
\[ y(x) = c_1 e^{3x} + c_2 x e^{3x} + c_3 x^2 e^{3x} + c_4 x^3 e^{3x} + c_5 x^4 e^{3x} \]
is a general solution for our differential equation.
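The claim that each $x^k e^{3x}$ with $k < 5$ solves the equation in example 17.4 can be checked exactly with integer arithmetic. The sketch below relies on the product rule, $(q e^{rx})' = (q' + rq) e^{rx}$, so that only the polynomial factor $q$ needs to be tracked; the function names and the constant-term-first coefficient layout are assumptions of this illustration.

```python
def deriv_poly_exp(q, r):
    """Polynomial factor of the derivative of q(x) * e^{r x}.

    q lists coefficients from the constant term up, so q[k] multiplies x^k.
    """
    dq = [k * q[k] for k in range(1, len(q))] + [0]
    return [dq[k] + r * q[k] for k in range(len(q))]


def apply_operator(a, q, r):
    """Apply a[0] D^N + a[1] D^(N-1) + ... + a[N] to q(x) e^{r x}.

    Returns the polynomial factor of the result (all zeros means a solution).
    """
    n = len(a) - 1
    derivs = [q]
    for _ in range(n):
        derivs.append(deriv_poly_exp(derivs[-1], r))
    total = [0] * len(q)
    for j, coeff in enumerate(a):
        total = [t + coeff * d for t, d in zip(total, derivs[n - j])]
    return total


# Coefficients of y^(5) - 15y^(4) + 90y''' - 270y'' + 405y' - 243y
a = [1, -15, 90, -270, 405, -243]
for k in range(6):
    x_to_k = [0] * k + [1]          # the polynomial x^k
    print(k, apply_operator(a, x_to_k, 3))
```

The rows for $k = 0, \ldots, 4$ come out as all zeros, while the row for $k = 5$ does not, in line with the remark after theorem 17.2 that $x^m e^{rx}$ is not a solution.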

Particular Solutions Corresponding to a Complex Root

In chapter 15, we observed that complex roots to a second-degree polynomial always occur as a
conjugate pair when the coefficients of the polynomial are real. With a little bit of work (see section
17.5), we can extend that observation to:

Theorem 17.3
Consider a polynomial

\[ p(r) = a_0 r^N + a_1 r^{N-1} + \cdots + a_{N-1} r + a_N \]

in which $a_0$, $a_1$, ..., and $a_N$ are all real numbers. Let $\lambda$ and $\omega$ be two real numbers, and let $m$
be some positive integer. Then

$r_0 = \lambda + i\omega$ is a root of multiplicity $m$ for polynomial $p(r)$

if and only if
$r_0^* = \lambda - i\omega$ is a root of multiplicity $m$ for polynomial $p(r)$.

Now assume $\lambda + i\omega$ is a complex root of multiplicity $m$ to our characteristic polynomial.

Theorems 17.2 and 17.3, together, tell us that
\[ \left\{ e^{(\lambda + i\omega)x} , \; x e^{(\lambda + i\omega)x} , \; x^2 e^{(\lambda + i\omega)x} , \; \ldots , \; x^{m-1} e^{(\lambda + i\omega)x} \right\} \]

and

\[ \left\{ e^{(\lambda - i\omega)x} , \; x e^{(\lambda - i\omega)x} , \; x^2 e^{(\lambda - i\omega)x} , \; \ldots , \; x^{m-1} e^{(\lambda - i\omega)x} \right\} \]
are linearly independent sets of $m$ solutions to the differential equation corresponding to roots $\lambda + i\omega$
and $\lambda - i\omega$.
For many problems, though, these are not particularly desirable sets of solutions because they
introduce complex values into computations we expect to yield real values. But recall how we dealt
with complex-exponential solutions for second-order equations, constructing alternative pairs of


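solutions via linear combinations. Unsurprisingly, the same idea works here, and we can construct
an alternative pair of solutions $\{ y_{k,1} , y_{k,2} \}$ from each pair
\[ \left\{ x^k e^{(\lambda + i\omega)x} , \; x^k e^{(\lambda - i\omega)x} \right\} \]
via the linear combinations
\[ y_{k,1}(x) = \frac{1}{2} x^k e^{(\lambda + i\omega)x} + \frac{1}{2} x^k e^{(\lambda - i\omega)x} \]
and
\[ y_{k,2}(x) = \frac{1}{2i} x^k e^{(\lambda + i\omega)x} - \frac{1}{2i} x^k e^{(\lambda - i\omega)x} . \]
Since
\[ e^{(\lambda \pm i\omega)x} = e^{\lambda x} \left[ \cos(\omega x) \pm i \sin(\omega x) \right] , \]
you can easily verify that
\[ y_{k,1} = x^k e^{\lambda x} \cos(\omega x) \quad\text{and}\quad y_{k,2} = x^k e^{\lambda x} \sin(\omega x) . \]
It is also “easily” verified that the set of these functions, with $k = 0, 1, 2, \ldots, m-1$, is linearly
independent.
Thus, instead of using
\[ \left\{ e^{(\lambda + i\omega)x} , \; x e^{(\lambda + i\omega)x} , \; x^2 e^{(\lambda + i\omega)x} , \; \ldots , \; x^{m-1} e^{(\lambda + i\omega)x} \right\} \]
and
\[ \left\{ e^{(\lambda - i\omega)x} , \; x e^{(\lambda - i\omega)x} , \; x^2 e^{(\lambda - i\omega)x} , \; \ldots , \; x^{m-1} e^{(\lambda - i\omega)x} \right\} \]
as the two linearly independent sets corresponding to roots $\lambda + i\omega$ and $\lambda - i\omega$, we can use the sets
of real-valued functions
\[ \left\{ e^{\lambda x} \cos(\omega x) , \; x e^{\lambda x} \cos(\omega x) , \; x^2 e^{\lambda x} \cos(\omega x) , \; \ldots , \; x^{m-1} e^{\lambda x} \cos(\omega x) \right\} \]
and
\[ \left\{ e^{\lambda x} \sin(\omega x) , \; x e^{\lambda x} \sin(\omega x) , \; x^2 e^{\lambda x} \sin(\omega x) , \; \ldots , \; x^{m-1} e^{\lambda x} \sin(\omega x) \right\} . \]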
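The passage from the complex exponentials to the real-valued pair can be spot-checked numerically. The following sketch uses Python's standard `cmath` module; the sample values of $k$, $\lambda$, $\omega$ and $x$ are arbitrary choices made here, and the function names are hypothetical.

```python
import cmath
import math


def combined_pair(k, lam, omega, x):
    """Form y_{k,1} and y_{k,2} as linear combinations of the complex solutions."""
    plus = x**k * cmath.exp(complex(lam, omega) * x)
    minus = x**k * cmath.exp(complex(lam, -omega) * x)
    y1 = plus / 2 + minus / 2
    y2 = plus / 2j - minus / 2j
    return y1, y2


def real_forms(k, lam, omega, x):
    """The claimed real-valued formulas x^k e^{lam x} cos/sin(omega x)."""
    envelope = x**k * math.exp(lam * x)
    return envelope * math.cos(omega * x), envelope * math.sin(omega * x)


k, lam, omega, x = 2, 3.0, 4.0, 0.7      # arbitrary sample values
y1, y2 = combined_pair(k, lam, omega, x)
c1, c2 = real_forms(k, lam, omega, x)
print(abs(y1 - c1), abs(y2 - c2))        # both differences should be at rounding level
```

A numerical agreement at one point is of course not a proof; it only guards against sign slips in Euler's formula.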

!Example 17.5: Consider the differential equation

17.4 On Verifying Theorem 17.2

Theorem 17.2 claims to give a linearly independent set of solutions to a linear homogeneous dif-
ferential equation with constant coefﬁcients corresponding to a repeated root for the equation’s
characteristic polynomial. Our task of verifying this claim will be greatly simpliﬁed if we slightly
expand our discussion of “factoring” linear differential operators from section 14.3. (You may want
to go back and quickly review that section.)

Linear Differential Operators with Constant Coefﬁcients

First, we need to expand our terminology a little. When we refer to $L$ as being an $N$th-order
linear differential operator with constant coefficients, we just mean that $L$ is an $N$th-order linear
differential operator

\[ L = a_0 \frac{d^N}{dx^N} + a_1 \frac{d^{N-1}}{dx^{N-1}} + \cdots + a_{N-2} \frac{d^2}{dx^2} + a_{N-1} \frac{d}{dx} + a_N \]

in which all the $a_k$’s are constants. Its characteristic polynomial $p(r)$ is simply the polynomial

\[ p(r) = a_0 r^N + a_1 r^{N-1} + \cdots + a_{N-2} r^2 + a_{N-1} r + a_N . \]

It turns out that factoring a linear differential operator with constant coefﬁcients is remarkably easy
if you already have the factorization for its characteristic polynomial.

!Example 17.8: Consider the linear differential operator

\[ L = \frac{d^2}{dx^2} - 5 \frac{d}{dx} + 6 . \]

Its characteristic polynomial is

\[ r^2 - 5r + 6 , \]
which factors to
\[ (r - 2)(r - 3) . \]


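Now, consider the analogous composition product

\[ \left( \frac{d}{dx} - 2 \right) \left( \frac{d}{dx} - 3 \right) . \]
Letting $\phi$ be any suitably differentiable function, we see that
\begin{align*}
\left( \frac{d}{dx} - 2 \right) \left( \frac{d}{dx} - 3 \right) [\phi]
  &= \left( \frac{d}{dx} - 2 \right) \left[ \frac{d\phi}{dx} - 3\phi \right] \\
  &= \frac{d}{dx} \left[ \frac{d\phi}{dx} - 3\phi \right] - 2 \left[ \frac{d\phi}{dx} - 3\phi \right] \\
  &= \frac{d^2\phi}{dx^2} - 3 \frac{d\phi}{dx} - 2 \frac{d\phi}{dx} + 6\phi \\
  &= \frac{d^2\phi}{dx^2} - 5 \frac{d\phi}{dx} + 6\phi \\
  &= L[\phi] .
\end{align*}

Thus,
\[ L = \left( \frac{d}{dx} - 2 \right) \left( \frac{d}{dx} - 3 \right) . \]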
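The operator identity of example 17.8 can be tried out on a concrete test function. In this sketch $\phi$ is taken to be a polynomial (an assumption made purely so the derivatives are exact), stored as a coefficient list with the constant term first; the helper names are invented for the illustration.

```python
def derivative(q):
    """Derivative of a polynomial; q[k] is the coefficient of x^k (length preserved)."""
    return [k * q[k] for k in range(1, len(q))] + [0]


def d_minus_const(q, c):
    """Apply the first-order operator (d/dx - c) to the polynomial q."""
    dq = derivative(q)
    return [dq[k] - c * q[k] for k in range(len(q))]


def apply_L(q):
    """Apply L = d^2/dx^2 - 5 d/dx + 6 directly."""
    d1 = derivative(q)
    d2 = derivative(d1)
    return [d2[k] - 5 * d1[k] + 6 * q[k] for k in range(len(q))]


phi = [1, 2, 3, 0, 0, 1]                 # phi(x) = 1 + 2x + 3x^2 + x^5, an arbitrary choice
composed = d_minus_const(d_minus_const(phi, 3), 2)   # (d/dx - 2)(d/dx - 3)[phi]
swapped = d_minus_const(d_minus_const(phi, 2), 3)    # factors applied in the other order
print(composed == apply_L(phi), swapped == apply_L(phi))
```

Both comparisons agree with $L[\phi]$; the second reflects the fact that constant-coefficient factors commute.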

Redoing this last example with the numbers 2 and 3 replaced by constants r1 and r2 leads
to the ﬁrst result of this section:

Together, these observations give us the actual result we will use.

Corollary 17.6
Let L be a linear differential operator with constant coefﬁcients; let

\[ p(r) = a (r - r_1)^{m_1} (r - r_2)^{m_2} \cdots (r - r_K)^{m_K} \]

be the completely factored form for the characteristic polynomial for $L$, and let $r_j$ be any one of
the roots of $p$. Suppose, further, that $y$ is a solution to
\[ \left( \frac{d}{dx} - r_j \right)^{m_j} [y] = 0 . \]

Then $y$ is also a solution to

\[ L[y] = 0 . \]

Proof of Theorem 17.2

Theorem 17.2 claims that, if $r$ is a root of multiplicity $m$ for the characteristic polynomial of some
linear homogeneous differential equation with constant coefficients, then
\[ \left\{ e^{rx} , \; x e^{rx} , \; x^2 e^{rx} , \; \ldots , \; x^{m-1} e^{rx} \right\} \]

is a linearly independent set of solutions to that differential equation. First we will verify that each of
these $x^k e^{rx}$’s is a solution to the differential equation. Then we will confirm the linear independence
of this set.

Verifying the Solutions

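If you look back at corollary 17.6, you will see that we need only show that
\[ \left( \frac{d}{dx} - r \right)^m \left[ x^k e^{rx} \right] = 0 \tag{17.5} \]

whenever $k$ is a nonnegative integer less than $m$. To expedite our main computations, we’ll do two
preliminary computations. And, since at least one may be useful in a later chapter, we’ll describe
the results in an easily referenced lemma.

Lemma 17.7
Let $r$, $\alpha$ and $\beta$ be constants with $\alpha$ being a positive integer and $\beta$ being real valued. Then
\[ \left( \frac{d}{dx} - r \right)^\alpha \left[ e^{rx} \right] = 0
   \quad\text{and}\quad
   \left( \frac{d}{dx} - r \right)^\alpha \left[ x^\beta e^{rx} \right] = \beta \left( \frac{d}{dx} - r \right)^{\alpha - 1} \left[ x^{\beta - 1} e^{rx} \right] . \]

PROOF: For the first:

\begin{align*}
\left( \frac{d}{dx} - r \right)^\alpha \left[ e^{rx} \right]
  &= \left( \frac{d}{dx} - r \right)^{\alpha - 1} \left( \frac{d}{dx} - r \right) \left[ e^{rx} \right] \\
  &= \left( \frac{d}{dx} - r \right)^{\alpha - 1} \left[ r e^{rx} - r e^{rx} \right]
   = \left( \frac{d}{dx} - r \right)^{\alpha - 1} [0] = 0 .
\end{align*}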


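For the second:

\begin{align*}
\left( \frac{d}{dx} - r \right)^\alpha \left[ x^\beta e^{rx} \right]
  &= \left( \frac{d}{dx} - r \right)^{\alpha - 1} \left( \frac{d}{dx} - r \right) \left[ x^\beta e^{rx} \right] \\
  &= \left( \frac{d}{dx} - r \right)^{\alpha - 1} \left[ \frac{d}{dx} \left[ x^\beta e^{rx} \right] - r x^\beta e^{rx} \right] \\
  &= \left( \frac{d}{dx} - r \right)^{\alpha - 1} \left[ \beta x^{\beta - 1} e^{rx} + x^\beta r e^{rx} - r x^\beta e^{rx} \right] \\
  &= \beta \left( \frac{d}{dx} - r \right)^{\alpha - 1} \left[ x^{\beta - 1} e^{rx} \right] .
\end{align*}

Now let $k$ be a positive integer less than $m$. (The case $k = 0$ is just the first identity in the
lemma with $\alpha = m$.) Using the above lemma, we see that
\begin{align*}
\left( \frac{d}{dx} - r \right)^m \left[ x^k e^{rx} \right]
  &= k \left( \frac{d}{dx} - r \right)^{m-1} \left[ x^{k-1} e^{rx} \right] \\
  &= k(k-1) \left( \frac{d}{dx} - r \right)^{m-2} \left[ x^{k-2} e^{rx} \right] \\
  &= k(k-1)(k-2) \left( \frac{d}{dx} - r \right)^{m-3} \left[ x^{k-3} e^{rx} \right] \\
  &\;\;\vdots \\
  &= k(k-1)(k-2) \cdots (k - [k-1]) \left( \frac{d}{dx} - r \right)^{m-k} \left[ x^{k-k} e^{rx} \right] \\
  &= k! \left( \frac{d}{dx} - r \right)^{m-k} \left[ e^{rx} \right] \\
  &= 0 ,
\end{align*}

verifying equation (17.5).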
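As a concrete instance of the chain of equalities above, take $m = 3$ and $k = 2$ (values chosen here only for illustration). Two applications of the second identity in lemma 17.7, followed by one application of the first, give

```latex
\begin{align*}
\left( \frac{d}{dx} - r \right)^3 \left[ x^2 e^{rx} \right]
  &= 2 \left( \frac{d}{dx} - r \right)^2 \left[ x e^{rx} \right] \\
  &= 2 \cdot 1 \left( \frac{d}{dx} - r \right) \left[ e^{rx} \right] \\
  &= 2 \cdot 0 = 0 .
\end{align*}
```

The exponent on the operator and the power of $x$ drop together, and since $k < m$ the power of $x$ reaches zero while at least one factor of the operator remains to annihilate $e^{rx}$.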

Verifying Linear Independence

To finish verifying the claim of theorem 17.2, we need only confirm that
\[ \left\{ e^{rx} , \; x e^{rx} , \; x^2 e^{rx} , \; \ldots , \; x^{m-1} e^{rx} \right\} \]

is a linearly independent set of functions on the real line. Well, let’s ask if this set could be, instead, a
linearly dependent set of functions on the real line. Then one of these functions, say, $x^\kappa e^{rx}$, would
be a linear combination of the others,

\[ x^\kappa e^{rx} = \text{linear combination of the other } x^k e^{rx}\text{’s} . \]

Subtract $x^\kappa e^{rx}$ from both sides, and you get

\[ 0 = \text{linear combination of the other } x^k e^{rx}\text{’s} - 1 \cdot x^\kappa e^{rx} , \]

which we can rewrite as

\[ 0 = c_0 e^{rx} + c_1 x e^{rx} + c_2 x^2 e^{rx} + \cdots + c_{m-1} x^{m-1} e^{rx} \]

where the $c_k$’s are constants with

\[ c_\kappa = -1 . \]


Dividing out $e^{rx}$ reduces the above to

\[ 0 = c_0 + c_1 x + c_2 x^2 + \cdots + c_{m-1} x^{m-1} . \tag{17.6} \]

Since this is supposed to hold for all $x$, it should hold for $x = 0$, giving us

\[ 0 = c_0 + c_1 \cdot 0 + c_2 \cdot 0^2 + \cdots + c_{m-1} \cdot 0^{m-1} = c_0 . \]

Now differentiate both sides of equation (17.6) and plug in $x = 0$:

\begin{align*}
 & \frac{d}{dx}[0] = \frac{d}{dx} \left[ c_0 + c_1 x + c_2 x^2 + \cdots + c_{m-1} x^{m-1} \right] \\
\Longrightarrow\quad & 0 = 0 + c_1 + 2c_2 x + \cdots + (m-1) c_{m-1} x^{m-2} \\
\Longrightarrow\quad & 0 = 0 + c_1 + 2c_2 \cdot 0 + \cdots + (m-1) c_{m-1} \cdot 0^{m-2} \\
\Longrightarrow\quad & 0 = c_1 .
\end{align*}

Differentiating both sides of equation (17.6) twice and plugging in $x = 0$:

\begin{align*}
 & \frac{d^2}{dx^2}[0] = \frac{d^2}{dx^2} \left[ c_0 + c_1 x + c_2 x^2 + c_3 x^3 + \cdots + c_{m-1} x^{m-1} \right] \\
\Longrightarrow\quad & 0 = \frac{d}{dx} \left[ 0 + c_1 + 2c_2 x + 3c_3 x^2 + \cdots + (m-1) c_{m-1} x^{m-2} \right] \\
\Longrightarrow\quad & 0 = 0 + 0 + 2c_2 + 6c_3 x + \cdots + (m-1)(m-2) c_{m-1} x^{m-3} \\
\Longrightarrow\quad & 0 = 0 + 0 + 2c_2 + 6c_3 \cdot 0 + \cdots + (m-1)(m-2) c_{m-1} \cdot 0^{m-3} \\
\Longrightarrow\quad & 0 = c_2 .
\end{align*}

Clearly, we can differentiate equation (17.6) again and again, plug in $x = 0$, and, eventually,
obtain
\[ 0 = c_k \quad\text{for}\quad k = 0, 1, 2, \ldots, m-1 . \]
But one of these $c_k$’s is $c_\kappa$, which we know is $-1$ (assuming our set of $x^k e^{rx}$’s is linearly
dependent). In other words, for our set of $x^k e^{rx}$’s to be linearly dependent, we must have

\[ 0 = c_\kappa = -1 , \]

which is impossible. So our set of $x^k e^{rx}$’s cannot be linearly dependent. It must be linearly
independent, just as theorem 17.2 claimed.


17.5 On Verifying Theorem 17.3

Theorem 17.3 is a theorem about complex conjugation in the algebra of complex numbers. So let’s
start with a brief discussion of that topic.

Algebra with Complex Conjugates

Recall that a complex number $z$ is something that can be written as
\[ z = x + iy \]
where $x$ and $y$ are real numbers, which we generally refer to as, respectively, the real and the
imaginary parts of $z$. Along these lines, we say $z$ is real if and only if $z = x$ (i.e., $y = 0$), and
we say $z$ is imaginary if and only if $z = iy$ (i.e., $x = 0$).
The corresponding complex conjugate of $z$, denoted $z^*$, is $z$ with the sign of its imaginary
part switched,
\[ z = x + iy \quad\Longrightarrow\quad z^* = x + i(-y) = x - iy . \]
Note that
\[ \left( z^* \right)^* = (x - iy)^* = x + iy = z , \]
and that
\[ z^* = z \quad\text{if $z$ is real} . \]
We will use these facts in a moment. We will also use formulas involving the complex conjugates
of sums and products. To derive them, let
\[ z = x + iy \quad\text{and}\quad c = a + ib \]
where $x$, $y$, $a$ and $b$ are all real, and compute out
\[ (c + z)^* , \quad c^* + z^* , \quad (cz)^* \quad\text{and}\quad c^* z^* \]
in terms of $a$, $b$, $x$ and $y$. You’ll quickly discover that
\[ (c + z)^* = c^* + z^* \quad\text{and}\quad (cz)^* = c^* z^* . \]
It then follows that
\begin{align*}
\left( z^2 \right)^* &= (z \cdot z)^* = z^* \cdot z^* = \left( z^* \right)^2 , \\
\left( z^3 \right)^* &= \left( z^2 \cdot z \right)^* = \left( z^2 \right)^* \cdot z^* = \left( z^* \right)^2 \cdot z^* = \left( z^* \right)^3 , \\
 &\;\;\vdots
\end{align*}
Continuing along these lines, it is a straightforward exercise to confirm that, given any polynomial
\[ c_0 z^N + c_1 z^{N-1} + \cdots + c_{N-1} z + c_N , \]
then
\[ \left( c_0 z^N + c_1 z^{N-1} + \cdots + c_{N-1} z + c_N \right)^*
   = c_0^* \left( z^* \right)^N + c_1^* \left( z^* \right)^{N-1} + \cdots + c_{N-1}^* z^* + c_N^* . \]


If, in addition, each $c_k$ is a real number, then $c_k^* = c_k$ and the above reduces even more.

Lemma 17.8
Let
\[ c_0 z^N + c_1 z^{N-1} + \cdots + c_{N-1} z + c_N \]
be a polynomial in which each $c_k$ is a real number. Then, for any complex number $z$,
\[ \left( c_0 z^N + c_1 z^{N-1} + \cdots + c_{N-1} z + c_N \right)^*
   = c_0 \left( z^* \right)^N + c_1 \left( z^* \right)^{N-1} + \cdots + c_{N-1} z^* + c_N . \]
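Lemma 17.8 is easy to probe numerically. The sketch below evaluates a real-coefficient polynomial by Horner's rule at a sample point and at its conjugate; the polynomial is the one from example 17.3, while the sample point and the function name `horner` are arbitrary choices made here. A second polynomial with a non-real coefficient is included to show that the hypothesis of real coefficients matters.

```python
def horner(coeffs, z):
    """Evaluate a polynomial (coefficients from the highest power down) at z."""
    value = 0
    for c in coeffs:
        value = value * z + c
    return value


z = complex(1.5, -2.25)                       # arbitrary sample point

real_coeffs = [1, -12, 86, -300, 625]         # polynomial from example 17.3
lhs = horner(real_coeffs, z).conjugate()      # (p(z))*
rhs = horner(real_coeffs, z.conjugate())      # p(z*)
print(abs(lhs - rhs))                         # at rounding level

complex_coeffs = [1, 1j]                      # z + i, a non-real coefficient
lhs_bad = horner(complex_coeffs, z).conjugate()
rhs_bad = horner(complex_coeffs, z.conjugate())
print(abs(lhs_bad - rhs_bad))                 # not small: the lemma does not apply
```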

Proof of Theorem 17.3

Let me remind you of the statement of theorem 17.3:
Consider a polynomial

\[ p(r) = a_0 r^N + a_1 r^{N-1} + \cdots + a_{N-2} r^2 + a_{N-1} r + a_N \]

in which $a_0$, $a_1$, ..., and $a_N$ are all real numbers. Let $\lambda$ and $\omega$ be two real numbers,
and let $m$ be some positive integer. Then

$r_0 = \lambda + i\omega$ is a root of multiplicity $m$ for polynomial $p(r)$

if and only if

$r_0^* = \lambda - i\omega$ is a root of multiplicity $m$ for polynomial $p(r)$.

To start our proof of this theorem, assume $r_0$ is a root of multiplicity $m$ of $p$. Then

\[ 0 = a_0 (r_0)^N + a_1 (r_0)^{N-1} + \cdots + a_{N-2} (r_0)^2 + a_{N-1} r_0 + a_N . \]

But
\begin{align*}
0 = 0^* &= \left( a_0 (r_0)^N + a_1 (r_0)^{N-1} + \cdots + a_{N-2} (r_0)^2 + a_{N-1} r_0 + a_N \right)^* \\
  &= a_0 \left( r_0^* \right)^N + a_1 \left( r_0^* \right)^{N-1} + \cdots + a_{N-2} \left( r_0^* \right)^2 + a_{N-1} r_0^* + a_N ,
\end{align*}

showing that $r_0^*$ is also a root of $p$. This also means that $r - r_0$ and $r - r_0^*$ are both factors of
$p(r)$, and, hence,
\[ p(r) = p_1(r) \, (r - r_0) \left( r - r_0^* \right) \]
where $p_1$ is the polynomial of degree $N - 2$ that can be obtained by dividing these two factors out
of $p$,
\[ p_1(r) = \frac{p(r)}{(r - r_0) \left( r - r_0^* \right)} . \]
Now

\[ (r - r_0) \left( r - r_0^* \right) = (r - [\lambda + i\omega]) (r - [\lambda - i\omega]) = \cdots = r^2 - 2\lambda r + \lambda^2 + \omega^2 . \]

So the coefficients of both the denominator and the numerator in the fraction defining $p_1(r)$ are
real-valued constants. If you think about how one actually computes this fraction (via, say, long
division), you will realize that all the coefficients of $p_1(r)$ must also be real.
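To see the long-division remark in action, here is a small sketch that divides the polynomial of example 17.3 by the real quadratic $r^2 - 2\lambda r + \lambda^2 + \omega^2$ with $\lambda = 3$ and $\omega = 4$. The routine `poly_divmod` is a hypothetical helper written for this illustration; coefficients are listed from the highest power down.

```python
def poly_divmod(num, den):
    """Polynomial long division; returns (quotient, remainder) as coefficient lists."""
    num = list(num)
    quotient = []
    while len(num) >= len(den):
        factor = num[0] / den[0]          # next quotient coefficient
        quotient.append(factor)
        for i, d in enumerate(den):       # subtract factor * den from the leading terms
            num[i] -= factor * d
        num.pop(0)                        # leading term is now zero; drop it
    return quotient, num


lam, omega = 3, 4
p = [1, -12, 86, -300, 625]                       # r^4 - 12r^3 + 86r^2 - 300r + 625
quadratic = [1, -2 * lam, lam**2 + omega**2]      # (r - r0)(r - r0*) = r^2 - 6r + 25

p1, remainder = poly_divmod(p, quadratic)
print(p1, remainder)
```

Every step uses only real arithmetic on real inputs, so the quotient $p_1$ necessarily has real coefficients. Here $p_1$ turns out to be $r^2 - 6r + 25$ again, consistent with $3 \pm 4i$ being double roots of $p$.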


If $m > 1$ then $(r - r_0)^{m-1}$, but not $(r - r_0)^m$, will be a factor of $p_1(r)$. Thus, $r_0$ will
be a root of multiplicity $m - 1$ for $p_1$. Repeating the above arguments with $p_1$ replacing $p$ leads
to the conclusions that

1. $r_0^*$ is also a root of $p_1$
and
2. there is an $(N - 4)$th-degree polynomial $p_2$ with real coefficients such that

\[ p(r) = p_1(r) \, (r - r_0) \left( r - r_0^* \right) = p_2(r) \, (r - r_0)^2 \left( r - r_0^* \right)^2 . \]

Clearly, we can continue repeating these arguments, ultimately obtaining the formula

\[ p(r) = p_m(r) \, (r - r_0)^m \left( r - r_0^* \right)^m \]

where $p_m$ is a polynomial of degree $N - 2m$ with just real coefficients and for which $r_0$ is not a
root.
Could $r_0^*$ be a root of $p_m$? If so, then the argument given at the start of this proof would show
that $(r_0^*)^*$ is also a root of $p_m$. But $(r_0^*)^* = r_0$ and we know $r_0$ is not a root of $p_m$. So it is not
possible for $r_0^*$ to be a root of $p_m$.
All this shows that

\[ r_0 \text{ is a root of multiplicity } m \text{ for } p(r)
   \quad\Longrightarrow\quad r_0^* \text{ is a root of multiplicity } m \text{ for } p(r) . \]

Replacing $r_0$ with $r_0^*$ then gives us

\[ r_0^* \text{ is a root of multiplicity } m \text{ for } p(r)
   \quad\Longrightarrow\quad \left( r_0^* \right)^* \text{ is a root of multiplicity } m \text{ for } p(r) . \]

Together with the fact that $(r_0^*)^* = r_0$, these two implications give us

\[ r_0 \text{ is a root of multiplicity } m \text{ for } p(r)
   \quad\Longleftrightarrow\quad r_0^* \text{ is a root of multiplicity } m \text{ for } p(r) , \]

completing our proof of theorem 17.3.

Additional Exercises

17.1. Using clever factoring of the characteristic polynomials (such as was done in example 17.6
on page 342), ﬁnd the general solution to each of the following:
a. y (4) − 4y (3) = 0
b. y (4) + 4y = 0
c. y (4) − 34y + 225y = 0
d. y (4) − 81y = 0


e. y (4) − 18y + 81y = 0

f. y (5) + 18y (3) + 81y = 0

17.2. For each of the following differential equations, one or more roots to the corresponding
characteristic polynomial can be found by “testing candidates” (as illustrated in example
17.7 on page 343). Using this fact, ﬁnd the general solution to each.
a. $y''' - y'' + y' - y = 0$
b. $y''' - 6y'' + 11y' - 6y = 0$
c. $y''' - 8y'' + 37y' - 50y = 0$
d. $y^{(4)} + 2y^{(3)} + 10y'' + 18y' + 9y = 0$

17.3. Find the solution to each of the following initial-value problems:

a. y + 4y = 0 with y(0) = 4 , y (0) = 6 and y (0) = 8
b. y − 6y + 12y − 8y = 0
with y(0) = 5 , y (0) = 13 and y (0) = 86
c. y (4) + 26y + 25y = 0
with y(0) = 6 , y (0) = −28 , y (0) = −102 and y (0) = 622

17.4. Find the general solution to each of the following:

a. y − 8y = 0
b. y (4) + 13y + 36y = 0
c. y (6) − 3y (4) + 3y − y = 0
d. y (6) − 2y (3) + y = 0

18
Euler Equations

We now know how to completely solve any equation of the form

\[ ay'' + by' + cy = 0 \]

or even
\[ a_0 y^{(N)} + a_1 y^{(N-1)} + \cdots + a_{N-2} y'' + a_{N-1} y' + a_N y = 0 \]
in which the coefficients are all real-valued constants (provided we can completely factor the
corresponding characteristic polynomial).
Let us now consider some equations of the form

\[ ay'' + by' + cy = 0 \]
or even
\[ a_0 y^{(N)} + a_1 y^{(N-1)} + \cdots + a_{N-2} y'' + a_{N-1} y' + a_N y = 0 \]

when the coefficients are not all constants. In particular, let us consider the “Euler equations”,
described more completely in the next section, in which the coefficients happen to be particularly
simple polynomials.1
As with the constant-coefficient equations, we will discuss the second-order Euler equations
(and their solutions) first, and then note how those ideas extend to corresponding higher order Euler
equations.

18.1 Second-Order Euler Equations

Basics
A second-order differential equation is called an Euler equation if it can be written as

\[ \alpha x^2 y'' + \beta x y' + \gamma y = 0 \]

where $\alpha$, $\beta$ and $\gamma$ are constants (in fact, we will assume they are real-valued constants). For
example,
\[ x^2 y'' - 6x y' + 10y = 0 , \]

\[ x^2 y'' - 9x y' + 25y = 0 \]
1 These differential equations are also called Cauchy–Euler equations, Euler–Cauchy equations and Cauchy equations. By
the way, “Euler” is pronounced “oi ler”.


and
\[ x^2 y'' - 3x y' + 20y = 0 \]
are the Euler equations we’ll solve to illustrate the methods we’ll develop below. In these equations,
the coefficients are not constants but are constants times the variable raised to the power equaling
the order of the corresponding derivative. Notice, too, that the first coefficient, αx 2 , vanishes at
x = 0 . This means we should not attempt to solve these equations over intervals containing 0 .
For convenience, we will use (0, ∞) as the interval of interest. You can easily verify that the same
formulas derived using this interval also work using the interval (−∞, 0) after replacing the x in
these formulas with either −x or |x| .
Euler equations are important for two or three good reasons:
1. They are easily solved.
2. They occasionally arise in applications, though not nearly as often as equations with constant
coefficients.
3. They are especially simple cases of a broad class of differential equations for which infinite
series solutions can be obtained using the “method of Frobenius”.2
The basic approach to solving Euler equations is similar to the approach used to solve constant-
coefficient equations: Assume a simple formula for the solution y involving one constant “to be
determined”, plug that formula for y into the differential equation, simplify and solve the resulting
equation for the constant, and then construct the general solution using the constants found and the
basic theory already developed.
The appropriate form for the solution to an Euler equation is not the exponential assumed for a
constant-coefficient equation. Instead, it is
\[ y(x) = x^r \]
where $r$ is a constant to be determined. This choice for $y(x)$ can be motivated by either first
considering the solutions to the corresponding first-order Euler equations
\[ \alpha x \frac{dy}{dx} + \beta y = 0 , \]
or by just thinking about what happens when you compute
\[ x^m \frac{d^m}{dx^m} \left[ x^r \right] . \]
We will outline the details of the method in a moment. Do not, however, bother memorizing
anything except for the first assumption about the form of the solution and general outline of the
method. The precise formulas that arise are not as easily memorized as the corresponding formulas
for differential equations with constant coefficients. Moreover, you probably won’t be using them
enough later on to justify memorizing these formulas.

The Steps in Solving Second-Order Euler Equations

Here are the basic steps for finding a general solution to any second-order Euler equation
\[ \alpha x^2 y'' + \beta x y' + \gamma y = 0 \quad\text{for}\quad x > 0 . \]
Remember $\alpha$, $\beta$ and $\gamma$ are real-valued constants. To illustrate the basic method, we will solve
\[ x^2 y'' - 6x y' + 10y = 0 \quad\text{for}\quad x > 0 . \]

2 We won’t start discussing the method of Frobenius until chapter 32.


1. Assume a solution of the form

\[ y = y(x) = x^r \]

where $r$ is a constant to be determined.

2. Plug the assumed formula for $y$ into the differential equation and simplify. Let’s do the
example first:
Replacing $y$ with $x^r$ gives

\begin{align*}
0 &= x^2 y'' - 6x y' + 10y \\
  &= x^2 \left[ x^r \right]'' - 6x \left[ x^r \right]' + 10 \left[ x^r \right] \\
  &= x^2 \left[ r(r-1) x^{r-2} \right] - 6x \left[ r x^{r-1} \right] + 10 \left[ x^r \right] \\
  &= (r^2 - r) x^r - 6r x^r + 10 x^r \\
  &= \left[ r^2 - r - 6r + 10 \right] x^r \\
  &= \left[ r^2 - 7r + 10 \right] x^r .
\end{align*}

Since we are solving on an interval where $x \neq 0$, we can divide out the $x^r$,
leaving us with the algebraic equation

\[ r^2 - 7r + 10 = 0 . \]

In general, replacing $y$ with $x^r$ gives

\begin{align*}
0 &= \alpha x^2 y'' + \beta x y' + \gamma y \\
  &= \alpha x^2 \left[ x^r \right]'' + \beta x \left[ x^r \right]' + \gamma \left[ x^r \right] \\
  &= \alpha x^2 \left[ r(r-1) x^{r-2} \right] + \beta x \left[ r x^{r-1} \right] + \gamma \left[ x^r \right] \\
  &= \alpha (r^2 - r) x^r + \beta r x^r + \gamma x^r \\
  &= \left[ \alpha r^2 - \alpha r + \beta r + \gamma \right] x^r \\
  &= \left[ \alpha r^2 + (\beta - \alpha) r + \gamma \right] x^r .
\end{align*}

Dividing out the $x^r$ leaves us with the second-degree polynomial equation

\[ \alpha r^2 + (\beta - \alpha) r + \gamma = 0 . \]

This equation, known as the indicial equation corresponding to the given Euler equation³, is
analogous to the characteristic equation for a second-order, homogeneous linear differential
equation with constant coefficients. (Don’t memorize this equation. It is easy enough to
simply rederive it each time. Besides, analogous equations for higher-order Euler equations
are significantly different.)⁴
3. Solve the polynomial equation for r .
In our example, we obtained the indicial equation

\[ r^2 - 7r + 10 = 0 , \]
3 Often, though, it’s just called “the equation for r ”.
4 However, there is a shortcut for ﬁnding the indicial equations which may be useful if you are solving large numbers of
Euler equations of different orders. See exercise 18.5 at the end of this chapter.


which factors to
\[ (r - 2)(r - 5) = 0 . \]

So $r = 2$ and $r = 5$ are the possible values of $r$.

4. Remember that, for each value of $r$ obtained, $x^r$ is a solution to the original Euler equation.
If there are two distinct real values $r_1$ and $r_2$ for $r$, then
\[ \left\{ x^{r_1} , \; x^{r_2} \right\} \]

is clearly a fundamental set of solutions to the differential equation, and

\[ y(x) = c_1 x^{r_1} + c_2 x^{r_2} \]

is a general solution. If there is only one value for $r$, then

\[ y_1(x) = x^r \]

is one solution to the differential equation, and the general solution can be obtained via
reduction of order. (The cases where there is only one value of $r$ and where the two values
of $r$ are complex will be examined more closely in the next section.)
In our example, we obtained two values for $r$, 2 and 5. So
\[ \left\{ x^2 , \; x^5 \right\} \]

is a fundamental set of solutions to the differential equation, and

\[ y(x) = c_1 x^2 + c_2 x^5 \]

is a general solution.
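The four steps above can be sketched in code. The following is an illustration under stated assumptions rather than a general solver: it handles only the second-order case, takes the indicial equation $\alpha r^2 + (\beta - \alpha) r + \gamma = 0$ as derived in step 2, and uses function names (`indicial_roots`, `euler_residual`) invented here. Using `cmath.sqrt` means complex roots are returned as well, although interpreting them is a separate matter.

```python
import cmath


def indicial_roots(alpha, beta, gamma):
    """Roots of alpha r^2 + (beta - alpha) r + gamma = 0, returned as complex numbers."""
    b = beta - alpha
    disc = cmath.sqrt(b * b - 4 * alpha * gamma)
    return (-b + disc) / (2 * alpha), (-b - disc) / (2 * alpha)


def euler_residual(alpha, beta, gamma, r, x):
    """alpha x^2 y'' + beta x y' + gamma y for y = x^r, evaluated at some x > 0."""
    y = x ** r
    dy = r * x ** (r - 1)
    d2y = r * (r - 1) * x ** (r - 2)
    return alpha * x**2 * d2y + beta * x * dy + gamma * y


# The worked example: x^2 y'' - 6x y' + 10y = 0
r1, r2 = indicial_roots(1, -6, 10)
print(r1, r2)
for r in (r1, r2):
    print(abs(euler_residual(1, -6, 10, r, 1.7)))   # rounding-level residuals
```

For the worked example the roots come back as 5 and 2 (with zero imaginary parts), and plugging $x^r$ back into the left side of the equation gives a residual at rounding level.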

18.2 The Special Cases

A Single Value for r
Let’s do an example and then discuss what happens in general.

!Example 18.1: Consider

\[ x^2 y'' - 9x y' + 25y = 0 \quad\text{for}\quad x > 0 . \]

will always be solutions to the differential equation. Since they are clearly not constant multiples of
each other, they form a fundamental set for the differential equation. Thus, in this case,

\[ y(x) = c_1 x^r + c_2 x^r \ln |x| \]

will always be a general solution to the given Euler equation. This may be worth remembering, if you
expect to be solving many Euler equations (which you probably won’t). Otherwise just remember
how to use reduction of order.
Verifying this claim is left to the interested reader (see exercise 18.3 on page 367).
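For the particular equation $x^2 y'' - 9x y' + 25y = 0$ on $x > 0$, the claim can be checked directly. The computation below is a check made here, not the text's own working: substituting $y = x^r$ gives the indicial equation $r^2 - 10r + 25 = (r - 5)^2 = 0$, so $r = 5$ is the single value, and the second solution to test is $x^5 \ln x$.

```latex
\begin{align*}
y &= x^5 \ln x , \qquad y' = 5x^4 \ln x + x^4 , \qquad y'' = 20x^3 \ln x + 9x^3 , \\
x^2 y'' - 9x y' + 25 y
  &= \left( 20x^5 \ln x + 9x^5 \right) - \left( 45x^5 \ln x + 9x^5 \right) + 25 x^5 \ln x \\
  &= (20 - 45 + 25) \, x^5 \ln x + (9 - 9) \, x^5 = 0 .
\end{align*}
```

Both the logarithmic terms and the pure power terms cancel, which is what the double root makes possible.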

Complex Values for r

Again, we start with an example.

!Example 18.2: Consider

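\[ x^2 y'' - 3x y' + 20y = 0 \quad\text{for}\quad x > 0 . \]

Using $y = x^r$, we get

\begin{align*}
0 &= x^2 y'' - 3x y' + 20y \\
  &= x^2 \left[ x^r \right]'' - 3x \left[ x^r \right]' + 20 \left[ x^r \right] \\
  &= x^2 \left[ r(r-1) x^{r-2} \right] - 3x \left[ r x^{r-1} \right] + 20 \left[ x^r \right] \\
  &= x^r \left[ r^2 - r - 3r + 20 \right] ,
\end{align*}

which simplifies to
\[ r^2 - 4r + 20 = 0 . \]
The solution to this is
\[ r = \frac{-(-4) \pm \sqrt{(-4)^2 - 4(20)}}{2} = \frac{4 \pm \sqrt{-64}}{2} = 2 \pm i4 . \]

Thus, we have two distinct values for $r$, $2 + i4$ and $2 - i4$. Presumably, then, we could construct
a general solution from
\[ x^{2 + i4} \quad\text{and}\quad x^{2 - i4} , \]
provided we had some idea as to just what “$x$ to a complex power” meant.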


So let’s ﬁgure out what “ x to a complex power” means.

For exactly the same reasons as when we were solving constant coefﬁcient equations, the
complex solutions to the indicial equation will occur as complex conjugate pairs

r+ = λ + iω and r− = λ − iω ,

which, formally at least, yield

\[ y_+(x) = x^{r_+} = x^{\lambda + i\omega} \quad\text{and}\quad y_-(x) = x^{r_-} = x^{\lambda - i\omega} \]

as solutions to the original Euler equation. Now, assuming the standard algebraic rules remain valid
for complex powers⁵,
\[ x^{\lambda \pm i\omega} = x^\lambda x^{\pm i\omega} , \]
and, for $x > 0$,
\[ x^{\pm i\omega} = \left( e^{\ln|x|} \right)^{\pm i\omega} = e^{\pm i\omega \ln|x|} = \cos(\omega \ln|x|) \pm i \sin(\omega \ln|x|) . \]

So our two solutions can be written as

\[ y_+(x) = x^\lambda \left[ \cos(\omega \ln|x|) + i \sin(\omega \ln|x|) \right] \]
and

\[ y_-(x) = x^\lambda \left[ \cos(\omega \ln|x|) - i \sin(\omega \ln|x|) \right] . \]

To get solutions in terms of only real-valued functions, essentially do what was done when we
had complex-valued roots to characteristic equations for constant-coefficient equations, namely, use
the fundamental set
\[ \{ y_1 , y_2 \} \]
where
\[ y_1(x) = \frac{1}{2} y_+(x) + \frac{1}{2} y_-(x) = \cdots = x^\lambda \cos(\omega \ln|x|) \]
and
\[ y_2(x) = \frac{1}{2i} y_+(x) - \frac{1}{2i} y_-(x) = \cdots = x^\lambda \sin(\omega \ln|x|) . \]

Note that these are just the real and the imaginary parts of the formulas for $y_\pm = x^{\lambda \pm i\omega}$.
If you really wish, you can memorize what we just derived, namely:
If you get
\[ r = \lambda \pm i\omega \]
when assuming $y = x^r$ is a solution to an Euler equation, then

\[ y_1(x) = x^\lambda \cos(\omega \ln|x|) \quad\text{and}\quad y_2(x) = x^\lambda \sin(\omega \ln|x|) \]

form a corresponding linearly independent pair of real-valued solutions to the differential
equation, and
\[ y(x) = c_1 x^\lambda \cos(\omega \ln|x|) + c_2 x^\lambda \sin(\omega \ln|x|) \]
is a general solution in terms of just real-valued functions.
Memorizing these formulas is not recommended. It’s easy enough (and safer) to simply re-derive
the formulas for $x^{\lambda \pm i\omega}$ as needed, and then just take the real and imaginary parts as our
two real-valued solutions.
5 They do.


Example 18.3: Let us finish solving

\[ x^2 y'' - 3x y' + 20y = 0 \quad\text{for}\quad x > 0 . \]

From above, we got the complex-power solutions

\[ y_\pm(x) = x^{2 \pm i4} . \]

Rewriting this using the corresponding complex exponential, we get

\[ \begin{aligned}
y_\pm(x) &= x^2 x^{\pm i4} = x^2 \left(e^{\ln|x|}\right)^{\pm i4} \\
&= x^2 e^{\pm i4\ln|x|} = x^2\left[\cos(4\ln|x|) \pm i\sin(4\ln|x|)\right] .
\end{aligned} \]

Taking the real and imaginary parts of this then yields the corresponding linearly independent
pair of real-valued solutions to the differential equation,

\[ y_1(x) = x^2\cos(4\ln|x|) \quad\text{and}\quad y_2(x) = x^2\sin(4\ln|x|) . \]

Thus,
\[ y(x) = c_1 x^2\cos(4\ln|x|) + c_2 x^2\sin(4\ln|x|) \]
is a general solution in terms of just real-valued functions.
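
As a sanity check on this example, one can substitute y = x^r with r = 2 + 4i into the left side of the equation and confirm that the result vanishes. The sketch below assumes only the derivative formulas for x^r; the function name `euler_residual` and the sample points are hypothetical choices made for illustration.

```python
def euler_residual(r, x, alpha=1.0, beta=-3.0, gamma=20.0):
    """Left side of alpha x^2 y'' + beta x y' + gamma y for y = x^r, x > 0."""
    y = x ** r
    dy = r * x ** (r - 1)               # derivative of x^r
    d2y = r * (r - 1) * x ** (r - 2)    # second derivative of x^r
    return alpha * x ** 2 * d2y + beta * x * dy + gamma * y

# Residuals for the complex root r = 2 + 4i at a few sample points (assumed values).
residuals = [euler_residual(complex(2, 4), x) for x in (0.5, 1.0, 3.0)]
```

Because the coefficients of the equation are real, the real and imaginary parts of a complex-valued solution are each solutions, so a vanishing complex residual covers both y1 and y2 at once.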

18.3 Euler Equations of Any Order

The deﬁnitions and ideas just described for second-order Euler equations are easily extended to
analogous differential equations of any order. The natural extension of the concept of a second-order
Euler differential equation is that of an Nth-order Euler equation, which is any differential equation
that can be written as

\[ \alpha_0 x^N y^{(N)} + \alpha_1 x^{N-1} y^{(N-1)} + \cdots + \alpha_{N-2} x^2 y'' + \alpha_{N-1} x y' + \alpha_N y = 0 \]

where the $\alpha_k$’s are all constants (and $\alpha_0 \neq 0$ ). We will further assume they are all real constants.
The basic ideas used to find the general solution to an Nth-order Euler equation over (0, ∞)
are pretty much the same as used to solve the second-order Euler equations:
1. Assume a solution of the form
\[ y = y(x) = x^r \]

where r is a constant to be determined.

2. Plug the assumed formula for y into the differential equation and simplify. The result will
be an Nth-degree polynomial equation

\[ A_0 r^N + A_1 r^{N-1} + \cdots + A_{N-1} r + A_N = 0 . \]

We’ll call this the indicial equation for the given Euler equation, and the polynomial on
the left will be called the indicial polynomial. It is easily shown that the $A_k$’s are all real
(assuming the $\alpha_k$’s are real) and that $A_0 = \alpha_0$ . However, the relation between the other
$A_k$’s and $\alpha_k$’s will depend on the order of the original differential equation.


3. Solve the indicial equation. The same tricks used to help solve the characteristic equations
in chapter 17 can be used here. And, as with those characteristic equations, we will obtain a
list of all the different roots of the indicial polynomial,

\[ r_1 , r_2 , r_3 , \ldots \text{ and } r_K , \]

along with their corresponding multiplicities,

\[ m_1 , m_2 , m_3 , \ldots \text{ and } m_K . \]

As noted in chapter 17,

\[ m_1 + m_2 + m_3 + \cdots + m_K = N . \]

What you do next with each $r_k$ depends on whether $r_k$ is real or complex, and on the
multiplicity $m_k$ of $r_k$ .

4. If $r = r_k$ is real, then there will be a corresponding linearly independent set of $m = m_k$
solutions to the differential equation. One of these, of course, will be $y = x^r$ . If this root’s
multiplicity m is greater than 1 , then a second corresponding solution to the Euler equation
is obtained by multiplying the first, $x^r$ , by ln |x| , just as in the second-order case. This
(multiplying the last solution found by ln |x| ) turns out to be the pattern for generating the
other solutions when $m = m_k > 2$ . That is, the set of solutions to the differential equation
corresponding to $r = r_k$ is
\[ \left\{ x^r , x^r\ln|x| , x^r(\ln|x|)^2 , \ldots , x^r(\ln|x|)^{m-1} \right\} \]

with $m = m_k$ . (We’ll verify this rigorously in the next section.)

5. If a root is complex, say, r = λ + iω , and has multiplicity m , then we know that this root’s
complex conjugate $r^* = \lambda - i\omega$ is another root of multiplicity m . By the same arguments
given for real roots, we have that the functions

\[ x^{\lambda+i\omega} , \; x^{\lambda+i\omega}\ln|x| , \; x^{\lambda+i\omega}(\ln|x|)^2 , \; \ldots \text{ and } x^{\lambda+i\omega}(\ln|x|)^{m-1} \]

along with

\[ x^{\lambda-i\omega} , \; x^{\lambda-i\omega}\ln|x| , \; x^{\lambda-i\omega}(\ln|x|)^2 , \; \ldots \text{ and } x^{\lambda-i\omega}(\ln|x|)^{m-1} \]

make up a linearly independent set of 2m solutions to the Euler equation. To obtain the
corresponding set of real-valued solutions, we again use the fact that, for x > 0 ,

\[ x^{\lambda \pm i\omega} = x^{\lambda} x^{\pm i\omega} = x^{\lambda} e^{\pm i\omega\ln|x|} = x^{\lambda}\left[\cos(\omega\ln|x|) \pm i\sin(\omega\ln|x|)\right] \tag{18.1} \]

to obtain the alternative set of 2m solutions

\[ \begin{aligned}
\big\{ \; & x^{\lambda}\cos(\omega\ln|x|) , \; x^{\lambda}\sin(\omega\ln|x|) , \\
& x^{\lambda}\cos(\omega\ln|x|)\ln|x| , \; x^{\lambda}\sin(\omega\ln|x|)\ln|x| , \\
& x^{\lambda}\cos(\omega\ln|x|)(\ln|x|)^2 , \; x^{\lambda}\sin(\omega\ln|x|)(\ln|x|)^2 , \\
& \ldots , \; x^{\lambda}\cos(\omega\ln|x|)(\ln|x|)^{m-1} , \; x^{\lambda}\sin(\omega\ln|x|)(\ln|x|)^{m-1} \; \big\}
\end{aligned} \]

for the Euler equation.


6. Now form the set of solutions to the Euler equation consisting of the $m_k$ solutions described
above for each real root $r_k$ , and the $2m_k$ real-valued solutions described above for each
conjugate pair of roots $r_k$ and $r_k^*$ . Since (as we saw in chapter 17) the sum of the multiplicities
equals N , and since the $r_k$’s are distinct, it will follow that this set will be a fundamental
set of solutions for our Euler equation. Thus, finally, a general solution to the given Euler
equation can be written out as an arbitrary linear combination of the functions in this set.
We will do two examples (skipping some of the tedious algebra).

Example 18.4: Consider the third-order Euler equation

\[ x^3 y''' - 6x^2 y'' + 19x y' - 27y = 0 \quad\text{for}\quad x > 0 . \]
Plugging in $y = x^r$ , we get
\[ x^3 r(r-1)(r-2)x^{r-3} - 6x^2 r(r-1)x^{r-2} + 19x r x^{r-1} - 27x^r = 0 , \]
which, after a bit of algebra, reduces to
\[ r^3 - 9r^2 + 27r - 27 = 0 . \]
This is the indicial equation for our Euler equation. You can verify that its factored form is
\[ (r - 3)^3 = 0 . \]
So the only root to our indicial polynomial is r = 3 , and it has multiplicity 3 . As discussed
above, the corresponding fundamental set of solutions to the Euler equation is
\[ \left\{ x^3 , \; x^3\ln|x| , \; x^3(\ln|x|)^2 \right\} , \]

and the corresponding general solution is

\[ y = c_1 x^3 + c_2 x^3\ln|x| + c_3 x^3(\ln|x|)^2 . \]
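
The least obvious member of this fundamental set is the one with the squared logarithm, so it is a natural candidate for a direct check. The sketch below plugs y = x^3 (ln x)^2 into the left side of the equation; the three derivative formulas were worked out by hand with the product rule for this illustration and are not taken from the text, and the function name is hypothetical.

```python
import math

def residual_third_order(x):
    """x^3 y''' - 6 x^2 y'' + 19 x y' - 27 y for y = x^3 (ln x)^2, x > 0."""
    L = math.log(x)
    y = x ** 3 * L ** 2
    dy = 3 * x ** 2 * L ** 2 + 2 * x ** 2 * L        # y'
    d2y = 6 * x * L ** 2 + 10 * x * L + 2 * x        # y''
    d3y = 6 * L ** 2 + 22 * L + 12                   # y'''
    return x ** 3 * d3y - 6 * x ** 2 * d2y + 19 * x * dy - 27 * y

# Sample points in (0, infinity), chosen arbitrarily.
residuals = [residual_third_order(x) for x in (0.5, 1.0, 2.5)]
```

Each residual should be zero up to rounding error, which agrees with the multiplicity-3 pattern described in step 4.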

Example 18.5: Consider the fourth-order Euler equation

\[ x^4 y^{(4)} + 6x^3 y''' + 25x^2 y'' + 19x y' + 81y = 0 \quad\text{for}\quad x > 0 . \]
Plugging in $y = x^r$ , we get

\[ \begin{aligned}
x^4 r(r-1)(r-2)(r-3)x^{r-4} &+ 6x^3 r(r-1)(r-2)x^{r-3} \\
&+ 25x^2 r(r-1)x^{r-2} + 19x r x^{r-1} + 81x^r = 0 ,
\end{aligned} \]
which simplifies to
\[ r^4 + 18r^2 + 81 = 0 . \]
Solving this yields
\[ r = \pm 3i \quad\text{with multiplicity } 2 , \]
and the four corresponding solutions to our Euler equation are
\[ \cos(3\ln|x|) , \quad \sin(3\ln|x|) , \quad \cos(3\ln|x|)\ln|x| \quad\text{and}\quad \sin(3\ln|x|)\ln|x| . \]
The general solution, then, is
\[ y = c_1\cos(3\ln|x|) + c_2\sin(3\ln|x|) + c_3\cos(3\ln|x|)\ln|x| + c_4\sin(3\ln|x|)\ln|x| . \]
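
The "tedious algebra" skipped in these two examples amounts to expanding products of the form r(r − 1)···(r − j + 1) and collecting powers of r. A minimal sketch of that bookkeeping follows, assuming the Euler coefficients are supplied as a list [α0, α1, ..., αN]; the function names are hypothetical and the code is only an illustration of step 2, not a method from the text.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists, highest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def falling_factorial(j):
    """Coefficients of r(r-1)...(r-j+1), highest degree first; j = 0 gives [1]."""
    poly = [1]
    for k in range(j):
        poly = poly_mul(poly, [1, -k])
    return poly

def indicial_coefficients(alphas):
    """Return [A0, ..., AN] for the Euler equation with coefficients [alpha0, ..., alphaN]."""
    n = len(alphas) - 1
    total = [0] * (n + 1)
    for k, alpha in enumerate(alphas):
        # alpha_k multiplies x^(n-k) y^(n-k), which contributes alpha_k r(r-1)...(r-(n-k)+1).
        ff = falling_factorial(n - k)
        offset = (n + 1) - len(ff)
        for i, c in enumerate(ff):
            total[offset + i] += alpha * c
    return total

third = indicial_coefficients([1, -6, 19, -27])       # Example 18.4
fourth = indicial_coefficients([1, 6, 25, 19, 81])    # Example 18.5
```

Here `third` reproduces the coefficients of r^3 − 9r^2 + 27r − 27 and `fourth` those of r^4 + 18r^2 + 81, matching the indicial equations found above.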


18.4 The Relation Between Euler and Constant Coefficient Equations
Let us suppose that
\[ A_0 r^N + A_1 r^{N-1} + \cdots + A_{N-1} r + A_N = 0 \tag{18.2} \]

is the indicial equation for some Nth-order Euler equation

\[ \alpha_0 x^N \frac{d^N y}{dx^N} + \alpha_1 x^{N-1} \frac{d^{N-1} y}{dx^{N-1}} + \cdots + \alpha_{N-2} x^2 \frac{d^2 y}{dx^2} + \alpha_{N-1} x \frac{dy}{dx} + \alpha_N y = 0 . \tag{18.3} \]

Observe that polynomial equation (18.2) is also the characteristic equation for the Nth-order constant
coefficient equation

\[ A_0 \frac{d^N Y}{dt^N} + A_1 \frac{d^{N-1} Y}{dt^{N-1}} + \cdots + A_{N-1} \frac{dY}{dt} + A_N Y = 0 . \tag{18.4} \]

This means that, if r is a solution to polynomial equation (18.2), then

\[ x^r \quad\text{and}\quad e^{rt} \]

are solutions, respectively, to the above Euler equation and the above constant coefficient equation.
This suggests that these two differential equations are related to each other, possibly through a
substitution of the form
\[ x^r = e^{rt} . \]
Taking the r-th root of both sides, this simplifies to

\[ x = e^t \quad\text{or, equivalently,}\quad \ln|x| = t . \]

Exploring this possibility further eventually leads to the following lemma about the solutions to the
above differential equations:

Lemma 18.1
Assume that two homogeneous linear differential equations of equal order are given, with one being
an Euler equation and the other having constant coefficients. Assume, further, that the indicial
equation of the Euler equation is the same as the characteristic equation of the other. Also, let y(x)
and Y (t) be two functions, with y defined on (0, ∞) and Y defined on (−∞, ∞) , and related
by the substitution $x = e^t$ (equivalently, ln |x| = t ); that is,

\[ y(x) = Y(t) \quad\text{where}\quad x = e^t \quad\text{and}\quad t = \ln|x| . \]

Then y is a solution to the given Euler equation if and only if Y is a solution to the given
constant-coefficient equation.

The proof of this lemma involves repeated chain rule computations such as
\[ \frac{dy}{dx} = \frac{d}{dx} Y(t) = \frac{dt}{dx}\frac{d}{dt} Y(t) = \frac{d\ln|x|}{dx}\frac{d}{dt} Y(t) = \frac{1}{x}\frac{dY}{dt} = e^{-t}\frac{dY}{dt} . \tag{18.5} \]

We’ll leave the details to the adventurous (see exercises 18.6, 18.7 and 18.8).
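
Before attempting the exercises, it can be reassuring to see relation (18.5) hold numerically for a specific pair of functions. The sketch below assumes the pair y(x) = x^3 ln x and Y(t) = t e^{3t}, which are related by x = e^t, and compares the two sides of (18.5) using central differences; the helper name `central_difference`, the step size and the sample point are illustrative assumptions.

```python
import math

def y(x):
    """A sample function of x on (0, infinity): y(x) = x^3 ln x."""
    return x ** 3 * math.log(x)

def Y(t):
    """The same function expressed in t = ln x: Y(t) = t e^{3t}."""
    return t * math.exp(3 * t)

def central_difference(f, s, h=1e-5):
    """Approximate f'(s) with a symmetric difference quotient."""
    return (f(s + h) - f(s - h)) / (2 * h)

t0 = 0.4                       # arbitrary sample point
x0 = math.exp(t0)              # corresponding x under x = e^t

lhs = central_difference(y, x0)                    # dy/dx at x0
rhs = math.exp(-t0) * central_difference(Y, t0)    # e^{-t} dY/dt at t0
```

The two values agree to within the accuracy of the difference quotients, as (18.5) predicts. This is a numerical illustration only and does not replace the chain rule argument.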


There are two noteworthy consequences of this lemma:

1. It gives us another way to solve Euler equations. To be specific: we can use the substitution
in the lemma to convert the Euler equation into a constant coefficient equation (with t as the
variable), solve that constant coefficient equation for its general solution (in terms of functions
of t ), and then use the substitution backwards to get the general solution to the Euler equation (in
terms of functions of x ) (see footnote 6).
2. We can now confirm the claim made (and used) in the previous section about solutions to the
Euler equation corresponding to a root r of multiplicity m to the indicial equation. After
all, if r is a solution of multiplicity m to equation (18.2), then we know that
\[ \left\{ e^{rt} , \; te^{rt} , \; t^2 e^{rt} , \; \ldots , \; t^{m-1} e^{rt} \right\} \]

is a set of solutions to constant coefficient equation (18.4). The lemma then assures us that
this set, with t = ln |x| , is the corresponding set of solutions to Euler equation (18.3). But,
using this substitution,
\[ t^k e^{rt} = \left(e^t\right)^r t^k = x^r(\ln|x|)^k . \]
So the set of solutions obtained to the Euler equation is
\[ \left\{ x^r , \; x^r\ln|x| , \; x^r(\ln|x|)^2 , \; \ldots , \; x^r(\ln|x|)^{m-1} \right\} , \]

just as claimed in the previous section.

Additional Exercises

18.1. Find the general solution to each of the following Euler equations on (0, ∞) :
a. $x^2 y'' - 5x y' + 8y = 0$        b. $x^2 y'' - 2y = 0$
c. $x^2 y'' - 2x y' = 0$             d. $2x^2 y'' - x y' + y = 0$
e. $x^2 y'' - 5x y' + 9y = 0$        f. $x^2 y'' + 5x y' + 4y = 0$
g. $4x^2 y'' + y = 0$                h. $x^2 y'' - x y' + 10y = 0$
i. $x^2 y'' + 5x y' + 29y = 0$       j. $x^2 y'' + x y' + y = 0$
k. $2x^2 y'' + 5x y' + y = 0$        l. $4x^2 y'' + 37y = 0$
m. $x^2 y'' + x y' = 0$              n. $x^2 y'' + x y' - 25y = 0$

18.2. Solve the following initial-value problems:

a. $x^2 y'' - 6x y' + 10y = 0$ with $y(1) = -1$ and $y'(1) = 7$

b. $4x^2 y'' + 4x y' - y = 0$ with $y(4) = 0$ and $y'(4) = 2$

c. $x^2 y'' - 11x y' + 36y = 0$ with $y(1) = 1/2$ and $y'(1) = 2$

6 It may be argued that this method, requiring the repeated use of the chain rule, is more tedious and error-prone than the
one developed earlier, which only requires algebra and differentiation of x r . That would be a good argument.


d. $x^2 y'' - 3x y' + 13y = 0$ with $y(1) = 9$ and $y'(1) = 3$

18.3. Suppose that the indicial equation for a second-order Euler equation only has one solution
r . Using reduction of order (or any other approach you think appropriate) show that both

\[ y_1(x) = x^r \quad\text{and}\quad y_2(x) = x^r\ln|x| \]

are solutions to the differential equation on (0, ∞) .

18.4. Find the general solution to each of the following third- and fourth-order Euler equations
on (0, ∞) :
a. $x^3 y''' + 2x^2 y'' - 4x y' + 4y = 0$
b. $x^3 y''' + 2x^2 y'' + x y' - y = 0$
c. $x^3 y''' - 5x^2 y'' + 14x y' - 18y = 0$
d. $x^4 y^{(4)} + 6x^3 y''' - 3x^2 y'' - 9x y' + 9y = 0$
e. $x^4 y^{(4)} + 2x^3 y''' + x^2 y'' - x y' + y = 0$
f. $x^4 y^{(4)} + 6x^3 y''' + 7x^2 y'' + x y' - y = 0$

18.5. While memorizing the indicial equations is not recommended, it must be admitted that there
is a simple, easily derived shortcut to finding these equations.
a. Show that the indicial equation for the second-order Euler equation

\[ \alpha x^2 y'' + \beta x y' + \gamma y = 0 \]
is given by
\[ \alpha r(r-1) + \beta r + \gamma = 0 . \]

b. Show that the indicial equation for the third-order Euler equation

\[ \alpha_0 x^3 y''' + \alpha_1 x^2 y'' + \alpha_2 x y' + \alpha_3 y = 0 \]
is given by
\[ \alpha_0 r(r-1)(r-2) + \alpha_1 r(r-1) + \alpha_2 r + \alpha_3 = 0 . \]

c. So what do you suspect is the general shortcut for finding the indicial equation of any
Euler equation?

18.6. Confirm that the claim of lemma 18.1 holds when N = 2 by considering the general
second-order Euler equation

\[ \alpha x^2 y'' + \beta x y' + \gamma y = 0 \]

and doing the following:

a. Find the corresponding indicial equation.
b. Using the substitution $x = e^t$ , convert the above Euler equation to a second-order,
constant coefficient differential equation, and write out the corresponding characteristic
equation. Remember, $x = e^t$ is equivalent to t = ln |x| . (You may want to glance back
at the chain rule computations in line (18.5).)
c. Confirm (by inspection!) that the characteristic equation for the constant coefficient
equation just obtained is identical to the indicial equation for the above Euler equation.


18.7. Confirm that the claim of lemma 18.1 holds when N = 3 by considering the general
third-order Euler equation

\[ \alpha_0 x^3 y''' + \alpha_1 x^2 y'' + \alpha_2 x y' + \alpha_3 y = 0 \]

and doing the following:

a. Find the corresponding indicial equation.
b. Convert the above Euler equation to a third-order, constant coefficient differential equation
using the substitution $x = e^t$ .
c. Confirm that the characteristic equation for the constant coefficient equation just obtained
is identical to the indicial equation for the above Euler equation.

18.8. Confirm that the claim of lemma 18.1 holds when N is any positive integer.


19
Nonhomogeneous Equations in General

Now that we are proficient at solving many homogeneous linear differential equations, including

\[ y'' - 4y = 0 , \]

it is time to expand our skills to solving nonhomogeneous linear equations, such as

\[ y'' - 4y = 5e^{3x} . \]

19.1 General Solutions to Nonhomogeneous Equations

In chapter 13, we saw that any linear combination

\[ y = c_1 y_1 + c_2 y_2 \]

of two solutions y1 and y2 to a second-order, homogeneous linear differential equation

\[ a y'' + b y' + c y = 0 \]

(on some interval) is another solution to that differential equation. However, for a second-order,
nonhomogeneous linear differential equation

\[ a y'' + b y' + c y = g , \]

the situation is not as simple. To see this, let us compute

\[ a y'' + b y' + c y \]

assuming y is some linear combination of any two sufficiently differentiable functions y1 and y2 ,
say,
\[ y = 2y_1 + 6y_2 . \]
By the fundamental properties of differentiation, we know that (see footnote 1)

\[ y' = [2y_1 + 6y_2]' = [2y_1]' + [6y_2]' = 2y_1' + 6y_2' \]

1 If the computations look familiar, it’s because we did very similar computations in deriving the principle of superposition
in chapter 13 (see page 263).


and
\[ y'' = [2y_1 + 6y_2]'' = [2y_1]'' + [6y_2]'' = 2y_1'' + 6y_2'' . \]
So,
\[ \begin{aligned}
a y'' + b y' + c y &= a\left[2y_1'' + 6y_2''\right] + b\left[2y_1' + 6y_2'\right] + c\left[2y_1 + 6y_2\right] \\
&= 2a y_1'' + 6a y_2'' + 2b y_1' + 6b y_2' + 2c y_1 + 6c y_2 \\
&= 2\left[a y_1'' + b y_1' + c y_1\right] + 6\left[a y_2'' + b y_2' + c y_2\right] .
\end{aligned} \]
Of course, there was nothing special about the constants 2 and 6 . If we had used any linear
combination of y1 and y2
\[ y = c_1 y_1 + c_2 y_2 , \tag{19.1a} \]
then the above computations would have yielded

\[ a y'' + b y' + c y = c_1\left[a y_1'' + b y_1' + c y_1\right] + c_2\left[a y_2'' + b y_2' + c y_2\right] . \tag{19.1b} \]
Now, suppose we have two particular solutions $y_p$ and $y_q$ to the nonhomogeneous equation
\[ a y'' + b y' + c y = g . \]
This means
\[ a y_p'' + b y_p' + c y_p = g \quad\text{and}\quad a y_q'' + b y_q' + c y_q = g . \]
From equation set (19.1) we see that if
\[ y = 2y_p + 6y_q , \]
then
\[ \begin{aligned}
a y'' + b y' + c y &= 2\left[a y_p'' + b y_p' + c y_p\right] + 6\left[a y_q'' + b y_q' + c y_q\right] \\
&= 2[g] + 6[g] \\
&= 8g \\
&\neq g ,
\end{aligned} \]
showing that this linear combination of solutions to our nonhomogeneous differential equation is
not a solution to our original nonhomogeneous equation (unless g is identically zero). So it is NOT
true that, in general, a linear combination of solutions to a nonhomogeneous differential equation is
another solution to that nonhomogeneous differential equation.
Notice, however, what happens when we use the difference between these two particular solutions
\[ y = y_q - y_p = 1 \cdot y_q + (-1) y_p . \]
Then
\[ \begin{aligned}
a y'' + b y' + c y &= 1\left[a y_q'' + b y_q' + c y_q\right] + (-1)\left[a y_p'' + b y_p' + c y_p\right] \\
&= g - g \\
&= 0 ,
\end{aligned} \]
which means that $y = y_q - y_p$ is a solution to the corresponding homogeneous equation
\[ a y'' + b y' + c y = 0 . \]
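
These two observations are easy to test on a concrete case. The sketch below assumes the operator y'' − 4y with g(x) = 5e^{3x}, and takes y_p = e^{3x} and y_q = e^{3x} + e^{2x} as two particular solutions; these choices, the helper names and the sample point are assumptions made purely for illustration.

```python
import math

def g(x):
    return 5 * math.exp(3 * x)

# Each function is stored as a triple of callables (y, y', y''), written out by hand.
y_p = (lambda x: math.exp(3 * x),
       lambda x: 3 * math.exp(3 * x),
       lambda x: 9 * math.exp(3 * x))
y_q = (lambda x: math.exp(3 * x) + math.exp(2 * x),
       lambda x: 3 * math.exp(3 * x) + 2 * math.exp(2 * x),
       lambda x: 9 * math.exp(3 * x) + 4 * math.exp(2 * x))

def combine(c1, f1, c2, f2):
    """Return the triple for c1*f1 + c2*f2, using linearity of differentiation."""
    return tuple((lambda x, u=u, v=v: c1 * u(x) + c2 * v(x)) for u, v in zip(f1, f2))

def apply_L(f, x):
    """Evaluate y'' - 4y at x (that is, a = 1, b = 0, c = -4)."""
    y, dy, d2y = f
    return d2y(x) - 4 * y(x)

x0 = 0.3                                              # arbitrary sample point
scaled = apply_L(combine(2, y_p, 6, y_q), x0)         # expected to equal 8 g(x0)
difference = apply_L(combine(-1, y_p, 1, y_q), x0)    # expected to equal 0
```

The first value matches 8g(x0) rather than g(x0), and the second vanishes, in line with the two computations above.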


Let me rephrase this:

If $y_p$ and $y_q$ are any two solutions to a given second-order nonhomogeneous linear
differential equation, then

\[ y_q = y_p + \text{a solution to the corresponding homogeneous equation} . \]

On the other hand, if

\[ y = y_p + y_h \]
where $y_p$ is any particular solution to the nonhomogeneous equation and $y_h$ is any solution to the
corresponding homogeneous equation (so that

\[ a y_p'' + b y_p' + c y_p = g \quad\text{and}\quad a y_h'' + b y_h' + c y_h = 0 \; ), \]

then equation set (19.1) yields

\[ \begin{aligned}
a y'' + b y' + c y &= \left[a y_p'' + b y_p' + c y_p\right] + \left[a y_h'' + b y_h' + c y_h\right] \\
&= g + 0 \\
&= g .
\end{aligned} \]

Thus:
If y p is a particular solution to a given second-order, nonhomogeneous linear differential
equation, and

y = y p + any solution to the corresponding homogeneous equation ,

then y is also a solution to the nonhomogeneous differential equation.

If you think about it, you will realize that we’ve just derived the form for a general solution to
any nonhomogeneous linear differential equation of order two; namely,

\[ y = y_p + y_h \]

where $y_p$ is any one particular solution to that nonhomogeneous differential equation and $y_h$ is a
general solution to the corresponding homogeneous linear differential equation. And if you think about
it a little more, you will realize that analogous computations can be done for any nonhomogeneous
linear differential equation, no matter what its order is. That gives us the following theorem:

Theorem 19.1 (general solutions to nonhomogeneous equations)

A general solution to any given nonhomogeneous linear differential equation is given by

\[ y = y_p + y_h \]

where $y_p$ is any particular solution to the given nonhomogeneous equation, and $y_h$ is a general
solution to the corresponding homogeneous differential equation (see footnote 2).

2 Many texts refer to the general solution of the corresponding homogeneous differential equation as “the complementary
solution” and denote it by yc instead of yh . We are using yh to help remind us that this is the general solution to the
corresponding homogeneous differential equation.


Example 19.1: Consider the nonhomogeneous differential equation

\[ y'' - 4y = 5e^{3x} . \tag{19.2} \]
Observe that
\[ \left[e^{3x}\right]'' - 4\left[e^{3x}\right] = 3^2 e^{3x} - 4e^{3x} = 5e^{3x} . \]
So one particular solution to our nonhomogeneous equation is
\[ y_p(x) = e^{3x} . \]
The corresponding homogeneous equation is
\[ y'' - 4y = 0 , \]
a linear equation with constant coefficients. Its characteristic equation,
\[ r^2 - 4 = 0 , \]
has solutions r = 2 and r = −2 . So this homogeneous equation has
\[ \{ y_1(x) , y_2(x) \} = \left\{ e^{2x} , e^{-2x} \right\} \]
as a fundamental set of solutions, and
\[ y_h(x) = c_1 e^{2x} + c_2 e^{-2x} \]
as a general solution.
As we saw in deriving theorem 19.1, the general solution to nonhomogeneous equation (19.2)
is then
\[ y(x) = y_p(x) + y_h(x) = e^{3x} + c_1 e^{2x} + c_2 e^{-2x} . \]
(Note that there are only two arbitrary constants, and that they are only in the formula for $y_h$ .
There is no arbitrary constant corresponding to $y_p$ !)
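
A quick numerical check makes the role of the two constants concrete: whatever values c1 and c2 take, the formula should still satisfy equation (19.2). The sketch below is a hedged illustration; the function name `residual` and the particular constants and sample points tried are arbitrary assumptions.

```python
import math

def residual(c1, c2, x):
    """y'' - 4y - 5e^{3x} for y = e^{3x} + c1 e^{2x} + c2 e^{-2x}."""
    y = math.exp(3 * x) + c1 * math.exp(2 * x) + c2 * math.exp(-2 * x)
    d2y = 9 * math.exp(3 * x) + 4 * c1 * math.exp(2 * x) + 4 * c2 * math.exp(-2 * x)
    return d2y - 4 * y - 5 * math.exp(3 * x)

# Try several arbitrary constant pairs at several arbitrary points.
checks = [residual(c1, c2, x)
          for (c1, c2) in ((0.0, 0.0), (1.5, -2.0), (-7.0, 3.0))
          for x in (-0.5, 0.0, 0.8)]
```

Every entry of `checks` should be zero up to rounding, whereas multiplying the e^{3x} term by a constant other than 1 would break the equation.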

This last example illustrates what happens when we limit ourselves to second-order equations.
More generally, if we recall how we construct general solutions to the corresponding homogeneous
equations, then we get the following corollary of theorem 19.1:

Corollary 19.2 (general solutions to nonhomogeneous second-order equations)

A general solution to a second-order, nonhomogeneous linear differential equation
\[ a y'' + b y' + c y = g \]
is given by
\[ y(x) = y_p(x) + c_1 y_1(x) + c_2 y_2(x) \tag{19.3} \]
where $y_p$ is any particular solution to the nonhomogeneous equation, and $\{y_1 , y_2\}$ is any
fundamental set of solutions for the corresponding homogeneous equation
\[ a y'' + b y' + c y = 0 . \]

Do note that there are only two arbitrary constants c1 and c2 in formula (19.3), and that they
are multiplying only particular solutions to the corresponding homogeneous equation. The particular
solution to the nonhomogeneous equation, y p , is NOT multiplied by an arbitrary constant!
Of course, if we don’t limit ourselves to second-order equations, but still recall how to construct
general solutions to homogeneous equations from a fundamental set of solutions to that homogeneous
equation, then we get the N th -order analog of the last corollary:


Corollary 19.3 (general solutions to nonhomogeneous N th -order equations)

A general solution to an Nth-order, nonhomogeneous linear differential equation

\[ a_0 y^{(N)} + a_1 y^{(N-1)} + \cdots + a_{N-2} y'' + a_{N-1} y' + a_N y = g \]

is given by
\[ y(x) = y_p(x) + c_1 y_1(x) + c_2 y_2(x) + \cdots + c_N y_N(x) \tag{19.4} \]
where $y_p$ is any particular solution to the nonhomogeneous equation, and $\{y_1 , y_2 , \ldots , y_N\}$ is any
fundamental set of solutions for the corresponding homogeneous equation

\[ a_0 y^{(N)} + a_1 y^{(N-1)} + \cdots + a_{N-2} y'' + a_{N-1} y' + a_N y = 0 . \]

19.2 Superposition for Nonhomogeneous Equations

Before discussing methods for finding particular solutions, we should note that equation (19.1) on
page 370 is describing a “principle of superposition for nonhomogeneous equations”; namely, that if
y1 , y2 , g1 and g2 are functions satisfying

\[ a y_1'' + b y_1' + c y_1 = g_1 \quad\text{and}\quad a y_2'' + b y_2' + c y_2 = g_2 \]

over some interval, and y is some linear combination of these two solutions

\[ y = c_1 y_1 + c_2 y_2 , \]

then, over that interval,

\[ \begin{aligned}
a y'' + b y' + c y &= c_1\left[a y_1'' + b y_1' + c y_1\right] + c_2\left[a y_2'' + b y_2' + c y_2\right] \\
&= c_1 g_1 + c_2 g_2 ,
\end{aligned} \]

showing that $y = c_1 y_1 + c_2 y_2$ is a solution to

\[ a y'' + b y' + c y = c_1 g_1 + c_2 g_2 . \]

Obviously, similar computations will yield similar results involving any number of “ $(y_j , g_j)$
pairs”, and using comparable nonhomogeneous linear differential equations of any order. This gives
us the following theorem (see footnote 3):

Theorem 19.4 (principle of superposition for nonhomogeneous equations)

Let $a_0 , a_1 , \ldots$ and $a_N$ be functions on some interval (α, β) , and let K be a positive integer.
Assume $\{y_1 , y_2 , \ldots , y_K\}$ and $\{g_1 , g_2 , \ldots , g_K\}$ are two sets of K functions related over (α, β) by

\[ a_0 y_k^{(N)} + \cdots + a_{N-2} y_k'' + a_{N-1} y_k' + a_N y_k = g_k \quad\text{for}\quad k = 1, 2, \ldots, K . \]

Then, for any set of K constants $\{c_1 , c_2 , \ldots , c_K\}$ , a particular solution to

\[ a_0 y^{(N)} + \cdots + a_{N-2} y'' + a_{N-1} y' + a_N y = G \]
3 You might want to compare this principle of superposition to the principle of superposition for homogeneous equations
that was described in theorem 13.1 on page 264.


where
G = c1 g1 + c2 g2 + · · · + c K g K

is given by
y p = c1 y1 + c2 y2 + · · · + c K y K .

This principle gives us a means for constructing solutions to certain nonhomogeneous equations
as linear combinations of solutions to simpler nonhomogeneous equations, provided, of course, we
have the solutions to those simpler equations.
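
To make the principle concrete, the sketch below builds y_p and G from two (y_k, g_k) pairs for the operator y'' − 4y. The pairs used (e^{3x} with 5e^{3x}, and x^2 with 2 − 4x^2), the constants, and all helper names are assumptions chosen for illustration; the second derivatives are written out by hand.

```python
import math

# Two (y_k, g_k) pairs for the operator y'' - 4y; y_k'' is supplied by hand.
pairs = [
    {"y": lambda x: math.exp(3 * x), "d2y": lambda x: 9 * math.exp(3 * x),
     "g": lambda x: 5 * math.exp(3 * x)},
    {"y": lambda x: x ** 2, "d2y": lambda x: 2.0,
     "g": lambda x: 2 - 4 * x ** 2},
]

def superpose(constants, key):
    """Return the function sum_k c_k * pairs[k][key]."""
    return lambda x: sum(c * p[key](x) for c, p in zip(constants, pairs))

constants = (2.0, 3.0)     # arbitrary illustrative constants c_1, c_2
y_p, d2y_p, G = (superpose(constants, k) for k in ("y", "d2y", "g"))

def left_side(x):
    """Evaluate y_p'' - 4 y_p at x."""
    return d2y_p(x) - 4 * y_p(x)

gaps = [abs(left_side(x) - G(x)) for x in (-1.0, 0.0, 0.7)]
```

Each gap should vanish up to rounding, meaning y_p = 2e^{3x} + 3x^2 solves y'' − 4y = 2g_1 + 3g_2 for this pair of right-hand sides.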

Example 19.2: From our last example, we know that

\[ y_1(x) = e^{3x} \quad\text{satisfies}\quad y_1'' - 4y_1 = 5e^{3x} . \]

The principle of superposition (with K = 1 ) then assures us that, for any constant $a_1$ ,

\[ y_p(x) = a_1 y_1(x) = a_1 e^{3x} \quad\text{satisfies}\quad y_p'' - 4y_p = a_1\left[5e^{3x}\right] . \]

For example, a particular solution to

\[ y'' - 4y = e^{3x} , \]

which we will rewrite as

\[ y'' - 4y = \frac{1}{5}\left[5e^{3x}\right] , \]

is given by
\[ y_p(x) = \frac{1}{5} y_1(x) = \frac{1}{5} e^{3x} . \]
And for the general solution, we simply add the general solution to the corresponding
homogeneous equation found in the previous example:

\[ y(x) = y_p(x) + y_h(x) = \frac{1}{5} e^{3x} + c_1 e^{2x} + c_2 e^{-2x} . \]

The basic use of superposition requires that we already know the appropriate “ yk ’s ”. At times,
we may not already know them, but, with luck, we can make good “guesses” as to appropriate yk ’s
and then, after computing the corresponding gk ’s , use the principle of superposition.

Example 19.3: Consider solving

\[ y'' - 4y = 2x^2 - 8x + 3 . \tag{19.5} \]

Let us “guess” that a particular solution can be given by a linear combination of

\[ y_1(x) = x^2 , \quad y_2(x) = x \quad\text{and}\quad y_3(x) = 1 . \]

Plugging these into the lefthand side of equation (19.5), we get

\[ g_1(x) = y_1'' - 4y_1 = \frac{d^2}{dx^2}\left[x^2\right] - 4\left[x^2\right] = 2 - 4x^2 , \]

\[ g_2(x) = y_2'' - 4y_2 = \frac{d^2}{dx^2}[x] - 4[x] = -4x , \]


and
\[ g_3(x) = y_3'' - 4y_3 = \frac{d^2}{dx^2}[1] - 4[1] = -4 . \]

Now set
\[ y_p(x) = a_1 y_1(x) + a_2 y_2(x) + a_3 y_3(x) . \]

By the principle of superposition,

\[ \begin{aligned}
y_p'' - 4y_p &= a_1 g_1(x) + a_2 g_2(x) + a_3 g_3(x) \\
&= a_1\left[2 - 4x^2\right] + a_2[-4x] + a_3[-4] \\
&= -4a_1 x^2 - 4a_2 x + [2a_1 - 4a_3] .
\end{aligned} \]

This means $y_p$ is a solution to our differential equation,

\[ y'' - 4y = 2x^2 - 8x + 3 , \]

if and only if

\[ -4a_1 = 2 , \quad -4a_2 = -8 \quad\text{and}\quad 2a_1 - 4a_3 = 3 . \]

Solving for the $a_k$’s yields

\[ a_1 = -\frac{1}{2} , \quad a_2 = 2 \quad\text{and}\quad a_3 = -1 . \]

Thus, a particular solution to our differential equation is given by

\[ y_p(x) = a_1 y_1(x) + a_2 y_2(x) + a_3 y_3(x) = -\frac{1}{2} x^2 + 2x - 1 , \]

and a general solution is

\[ y(x) = y_p(x) + y_h(x) = -\frac{1}{2} x^2 + 2x - 1 + c_1 e^{2x} + c_2 e^{-2x} . \]
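
The particular solution found in this example can be verified exactly by treating polynomials as lists of coefficients. The following is a small sketch under that assumption, using exact rational arithmetic from Python's standard library; the helper names `derivative` and `apply_operator` are hypothetical.

```python
from fractions import Fraction

def derivative(p):
    """Derivative of a polynomial stored as coefficients, lowest degree first."""
    return [k * c for k, c in enumerate(p)][1:] or [Fraction(0)]

def apply_operator(p):
    """Return the coefficients of p'' - 4p, lowest degree first."""
    d2 = derivative(derivative(p))
    d2 = d2 + [Fraction(0)] * (len(p) - len(d2))   # pad to the length of p
    return [a - 4 * b for a, b in zip(d2, p)]

# y_p(x) = -1 + 2x - (1/2) x^2, stored lowest degree first.
y_p = [Fraction(-1), Fraction(2), Fraction(-1, 2)]
result = apply_operator(y_p)
```

The list `result` holds the coefficients of 3 − 8x + 2x^2, which is the right-hand side of equation (19.5) written lowest degree first.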

By the way, we’ll further discuss the “art of making good guesses” in the next chapter, and
develop a somewhat more systematic method that uses superposition in a slightly more subtle way.
Unfortunately, as we will see, “guessing” is only suitable for relatively simple problems.

19.3 Reduction of Order

In practice, ﬁnding a particular solution to a nonhomogeneous linear differential equation can be
a challenge. One method, the basic reduction of order method for second-order, nonhomogeneous
linear differential equations, was brieﬂy discussed in section 12.4. If you haven’t already looked
at that section, or don’t remember the basic ideas discussed there, you can go back and skim that
section. Or not. Truth is, better methods will be developed in the next few sections.


Additional Exercises

19.1. What should g(x) be so that $y(x) = e^{3x}$ is a solution to

a. y + y = g(x) ?
b. x 2 y − 4y = g(x) ?
c. y (3) − 4y + 5y = g(x) ?

19.2. What should g(x) be so that $y(x) = x^3$ is a solution to

a. y + 4y + 4y = g(x) ?
b. x 2 y + 4x y + 4y = g(x) ?
3
c. y (4) + x y (3) + 4y − y = g(x) ?
x

19.3 a. Can y(x) = sin(x) be a solution to

\[ y'' + y = g(x) \]

for some nonzero function g ? (Give a reason for your answer.)

b. What should g(x) be so that y(x) = x sin(x) is a solution to

\[ y'' + y = g(x) \; ? \]

19.4 a. Can $y(x) = x^4$ be a solution on (0, ∞) to

\[ x^2 y'' - 6x y' + 12y = g(x) \]

for some nonzero function g ? (Give a reason for your answer.)

b. What should g(x) be so that $y(x) = x^4\ln|x|$ is a solution to

\[ x^2 y'' - 6x y' + 12y = g(x) \quad\text{for}\quad x > 0 \; ? \]

19.5. Consider the nonhomogeneous linear differential equation

\[ y'' + 4y = 24e^{2x} . \]

a. Verify that one particular solution to this nonhomogeneous differential equation is

\[ y_p(x) = 3e^{2x} . \]

b. What is $y_h$ , the general solution to the corresponding homogeneous equation?

c. What is the general solution to the above nonhomogeneous equation?
d. Find the solution to the above nonhomogeneous equation that also satisfies each of the
following sets of initial conditions:
i. $y(0) = 6$ and $y'(0) = 6$        ii. $y(0) = -2$ and $y'(0) = 2$


19.6. Consider the nonhomogeneous linear differential equation

\[ y'' + 2y' - 8y = 8x^2 - 3 . \]

a. Verify that one particular solution to this equation is

\[ y_p(x) = -x^2 - \frac{1}{2} x . \]

b. What is $y_h$ , the general solution to the corresponding homogeneous equation?

19.7. Consider the nonhomogeneous linear differential equation

\[ y'' - 9y = 36 . \]

a. Verify that one particular solution to this equation is

\[ y_p(x) = -4 . \]

b. Find the general solution to this nonhomogeneous equation.

c. Find the solution to the above nonhomogeneous equation that also satisfies

\[ y(0) = 8 \quad\text{and}\quad y'(0) = 6 . \]

19.8. Consider the nonhomogeneous linear differential equation

\[ y'' - 3y' - 10y = -6e^{4x} . \]

a. Verify that one particular solution to this equation is

\[ y_p(x) = e^{4x} . \]

b. Find the general solution to this nonhomogeneous equation.

c. Find the solution to the above nonhomogeneous equation that also satisfies

\[ y(0) = 6 \quad\text{and}\quad y'(0) = 8 . \]


Using this and the principle of superposition, find a particular solution $y_p$ to each of the
following:
a. $y'' - 3y' - 10y = e^{4x}$
b. $y'' - 3y' - 10y = e^{5x}$
c. $y'' - 3y' - 10y = -18e^{4x} + 14e^{5x}$
d. $y'' - 3y' - 10y = 35e^{5x} + 12e^{4x}$

19.14. In exercise 19.11, you verified that $y_1(x) = 5x + 2$ is a particular solution to

\[ x^2 y'' - 4x y' + 6y = 10x + 12 \quad\text{for}\quad x > 0 . \]

It should also be clear that $y_2(x) = 1$ is a particular solution to

\[ x^2 y'' - 4x y' + 6y = 6 \quad\text{for}\quad x > 0 . \]

Using these facts and the principle of superposition, find a particular solution $y_p$ to each
of the following on (0, ∞) :
a. $x^2 y'' - 4x y' + 6y = 1$        b. $x^2 y'' - 4x y' + 6y = x$
c. $x^2 y'' - 4x y' + 6y = 22x + 24$

19.15 a. What should g(x) be so that y(x) is a solution to

\[ x^2 y'' - 7x y' + 15y = g(x) \quad\text{for}\quad x > 0 \]

i. when $y(x) = x^2$ ?    ii. when $y(x) = x$ ?    iii. when $y(x) = 1$ ?

b. Using the results just derived and the principle of superposition, find a particular solution
$y_p$ to
\[ x^2 y'' - 7x y' + 15y = x^2 \quad\text{for}\quad x > 0 . \]
c. Using the results derived above and the principle of superposition, find a particular solution
$y_p$ to
\[ x^2 y'' - 7x y' + 15y = 4x^2 + 2x + 3 \quad\text{for}\quad x > 0 . \]

19.16 a. What should g(x) be so that y(x) is a solution to

\[ y'' - 2y' + y = g(x) \]

i. when $y(x) = \cos(2x)$ ?    ii. when $y(x) = \sin(2x)$ ?

b. Using the results just derived and the principle of superposition, find a particular solution
$y_p$ to
\[ y'' - 2y' + y = \cos(2x) . \]
c. Using the results just derived and the principle of superposition, find a particular solution
$y_p$ to
\[ y'' - 2y' + y = \sin(2x) . \]


20
Method of Undetermined Coefﬁcients
(aka: Method of Educated Guess)

In this chapter, we will discuss one particularly simple-minded, yet often effective, method for find-
ing particular solutions to nonhomogeneous differential equations. As the above title suggests, the
method is based on making “good guesses” regarding these particular solutions. And, as always,
“good guessing” is usually aided by a thorough understanding of the problem (being ‘educated’),
and usually works best if the problem is not too complicated. Fortunately, you have had the neces-
sary education, and a great many nonhomogeneous differential equations of interest are sufficiently
simple.
As usual, we will start with second-order equations, and then observe that everything developed
also applies, with little modification, to similar nonhomogeneous differential equations of any order.

20.1 Basic Ideas

Suppose we wish to ﬁnd a particular solution to a nonhomogeneous second-order differential equation

\[ a y'' + b y' + c y = g . \]

If g is a relatively simple function and the coefﬁcients — a , b and c — are constants, then, after
recalling what the derivatives of various basic functions look like, we might be able to make a good
guess as to what sort of function y p (x) yields g(x) after being plugged into the left side of the
above equation. Typically, we won’t be able to guess exactly what y p (x) should be, but we can
often guess a formula for y p (x) involving speciﬁc functions and some constants that can then be
determined by plugging the guessed formula for y p (x) into the differential equation and solving the
resulting algebraic equation(s) for those constants (provided the initial ‘guess’ was good).

Example 20.1: Consider

\[ y'' - 2y' - 3y = 36e^{5x} . \]
Since all derivatives of $e^{5x}$ equal some constant multiple of $e^{5x}$ , it should be clear that, if we let

\[ y(x) = \text{some multiple of } e^{5x} , \]

then
\[ y'' - 2y' - 3y = \text{some other multiple of } e^{5x} . \]


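So let us let A be some constant “to be determined”, and try

\[ y_p(x) = Ae^{5x} \]

as a particular solution to our differential equation:

\[ \begin{aligned}
& y_p'' - 2y_p' - 3y_p = 36e^{5x} \\
\longrightarrow\quad & \left[Ae^{5x}\right]'' - 2\left[Ae^{5x}\right]' - 3\left[Ae^{5x}\right] = 36e^{5x} \\
\longrightarrow\quad & \left[25Ae^{5x}\right] - 2\left[5Ae^{5x}\right] - 3\left[Ae^{5x}\right] = 36e^{5x} \\
\longrightarrow\quad & 25Ae^{5x} - 10Ae^{5x} - 3Ae^{5x} = 36e^{5x} \\
\longrightarrow\quad & 12Ae^{5x} = 36e^{5x} \\
\longrightarrow\quad & A = 3 .
\end{aligned} \]

So our “guess”, $y_p(x) = Ae^{5x}$ , satisfies the differential equation only if A = 3 . Thus,

\[ y_p(x) = 3e^{5x} \]

is a particular solution to our nonhomogeneous differential equation.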
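
The computation in this example generalizes directly: plugging A e^{kx} into a y'' + b y' + c y produces (a k^2 + b k + c) A e^{kx}, so A is the forcing amplitude divided by that quadratic evaluated at k. The sketch below encodes this observation; the function name and its error message are hypothetical, and the guard reflects the fact that the division is impossible when k happens to be a root of the characteristic polynomial.

```python
def exponential_guess_coefficient(a, b, c, k, amplitude):
    """Return A such that y_p = A e^{kx} solves a y'' + b y' + c y = amplitude e^{kx}."""
    # Applying a y'' + b y' + c y to e^{kx} multiplies it by p(k) = a k^2 + b k + c.
    p_of_k = a * k ** 2 + b * k + c
    if p_of_k == 0:
        raise ValueError("k is a root of the characteristic polynomial; "
                         "the guess A e^{kx} cannot work")
    return amplitude / p_of_k

# Example 20.1: y'' - 2y' - 3y = 36 e^{5x}.
A = exponential_guess_coefficient(1, -2, -3, 5, 36)
```

For the equation of this example the function returns 3.0, in agreement with the value of A found above.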

In the next section, we will determine the appropriate “first guesses” for particular solutions
corresponding to different choices of g in our differential equation. These guesses will involve
specific functions and initially unknown constants that can be determined as we determined A in
the last example. Unfortunately, as we will see, the first guesses will sometimes fail. So we will
discuss appropriate second (and, when necessary, third) guesses, as well as when to expect the first
(and second) guesses to fail.
Because all of the guesses will be linear combinations of functions in which the coefficients are
“constants to be determined”, this whole approach to finding particular solutions is formally called
the method of undetermined coefficients. Less formally, it is also called the method of (educated)
guess.
Keep in mind that this method only finds a particular solution for a differential equation. In
practice, we really need the general solution, which (as we know from our discussion in the previous
chapter) can be constructed from any particular solution along with the general solution to the
corresponding homogeneous equation (see theorem 19.1 and corollary 19.2 on page 371).

!Example 20.2: Consider finding the general solution to
\[ y'' - 2y' - 3y = 36e^{5x} . \]
From the last example, we know
\[ y_p(x) = 3e^{5x} \]
is a particular solution to the differential equation. The corresponding homogeneous equation is
\[ y'' - 2y' - 3y = 0 . \]
Its characteristic equation is
\[ r^2 - 2r - 3 = 0 , \]


which factors as
\[ (r + 1)(r - 3) = 0 . \]
So $r = -1$ and $r = 3$ are the possible values of $r$, and
\[ y_h(x) = c_1 e^{-x} + c_2 e^{3x} \]
is the general solution to the corresponding homogeneous differential equation.

As noted in corollary 19.2, it then follows that
\[ y(x) = y_p(x) + y_h(x) = 3e^{5x} + c_1 e^{-x} + c_2 e^{3x} \]
is a general solution to our nonhomogeneous differential equation.

Also keep in mind that you may not just want the general solution but also the one solution that
satisfies some particular initial conditions.

!Example 20.3: Consider the initial-value problem
\[ y'' - 2y' - 3y = 36e^{5x} \quad\text{with}\quad y(0) = 9 \quad\text{and}\quad y'(0) = 25 . \]
From above, we know the general solution to the differential equation is
\[ y(x) = 3e^{5x} + c_1 e^{-x} + c_2 e^{3x} . \]
Its derivative is
\[ y'(x) = \left[ 3e^{5x} + c_1 e^{-x} + c_2 e^{3x} \right]' = 15e^{5x} - c_1 e^{-x} + 3c_2 e^{3x} . \]
This, with our initial conditions, gives us
\[ 9 = y(0) = 3e^{5\cdot 0} + c_1 e^{-0} + c_2 e^{3\cdot 0} = 3 + c_1 + c_2 \]
and
\[ 25 = y'(0) = 15e^{5\cdot 0} - c_1 e^{-0} + 3c_2 e^{3\cdot 0} = 15 - c_1 + 3c_2 , \]
which, after a little arithmetic, becomes the system
\[
\begin{aligned}
c_1 + c_2 &= 6 \\
-c_1 + 3c_2 &= 10 .
\end{aligned}
\]
Solving this system by whatever means you prefer yields
\[ c_1 = 2 \quad\text{and}\quad c_2 = 4 . \]
So the solution to the given differential equation that also satisfies the given initial conditions is
\[ y(x) = 3e^{5x} + c_1 e^{-x} + c_2 e^{3x} = 3e^{5x} + 2e^{-x} + 4e^{3x} . \]
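
For readers who prefer to let a machine do the "little arithmetic", here is one possible sketch of solving the $2\times 2$ system for $c_1$ and $c_2$ by Cramer's rule. The helper name `solve_2x2` is an illustrative choice, not something from the text, and any other solution method would do just as well.

```python
from fractions import Fraction


def solve_2x2(a11, a12, b1, a21, a22, b2):
    """Solve  a11*u + a12*v = b1,  a21*u + a22*v = b2  by Cramer's rule.

    Hypothetical helper for illustration; assumes integer or rational inputs.
    """
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("the system has no unique solution")
    u = Fraction(b1 * a22 - a12 * b2, det)
    v = Fraction(a11 * b2 - b1 * a21, det)
    return u, v


# Initial conditions of example 20.3 applied to y = 3e^{5x} + c1 e^{-x} + c2 e^{3x}:
#   y(0)  =  3 + c1 +  c2 = 9    ->    c1 +  c2 = 6
#   y'(0) = 15 - c1 + 3c2 = 25   ->   -c1 + 3c2 = 10
c1, c2 = solve_2x2(1, 1, 9 - 3, -1, 3, 25 - 15)
print(c1, c2)
```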


20.2 Good First Guesses for Various Choices of g

In all of the following, we are interested in finding a particular solution $y_p(x)$ to
\[ ay'' + by' + cy = g \tag{20.1} \]
where $a$, $b$ and $c$ are constants and $g$ is the indicated type of function. In each subsection, we
will describe a class of functions for $g$ and the corresponding 'first guess' as to the formula for a
particular solution $y_p$. In each case, the formula will involve constants "to be determined". These
constants are then determined by plugging the guessed formula for $y_p$ into the differential equation
and solving the system of algebraic equations that results. Of course, if the resulting equations are
not solvable for those constants, then the first guess is not adequate, and you'll have to read the next
section to learn a good 'second guess'.

Exponentials
As illustrated in example 20.1,

If, for some constants $C$ and $\alpha$,
\[ g(x) = Ce^{\alpha x} \]
then a good first guess for a particular solution to differential equation (20.1) is
\[ y_p(x) = Ae^{\alpha x} \]
where $A$ is a constant to be determined.

Sines and Cosines

!Example 20.4: Consider
\[ y'' - 2y' - 3y = 65\cos(2x) . \]
A naive first guess for a particular solution might be
\[ y_p(x) = A\cos(2x) , \]
where $A$ is some constant to be determined. Unfortunately, here is what we get when we plug
this guess into the differential equation:
\[
\begin{aligned}
& y_p'' - 2y_p' - 3y_p = 65\cos(2x) \\
\to\quad & [A\cos(2x)]'' - 2[A\cos(2x)]' - 3[A\cos(2x)] = 65\cos(2x) \\
\to\quad & -4A\cos(2x) + 4A\sin(2x) - 3A\cos(2x) = 65\cos(2x) \\
\to\quad & A[-7\cos(2x) + 4\sin(2x)] = 65\cos(2x) .
\end{aligned}
\]
But there is no constant $A$ satisfying this last equation for all values of $x$. So our naive first
guess will not work.


Since our naive first guess resulted in an equation involving both sines and cosines, let us
add a sine term to the guess and see if we can get all the resulting sines and cosines in the resulting
equation to balance. That is, assume
\[ y_p(x) = A\cos(2x) + B\sin(2x) \]
where $A$ and $B$ are constants to be determined. Plugging this into the differential equation:
\[
\begin{aligned}
& y_p'' - 2y_p' - 3y_p = 65\cos(2x) \\
\to\quad & [A\cos(2x) + B\sin(2x)]'' - 2[A\cos(2x) + B\sin(2x)]' - 3[A\cos(2x) + B\sin(2x)] = 65\cos(2x) \\
\to\quad & -4A\cos(2x) - 4B\sin(2x) - 2[-2A\sin(2x) + 2B\cos(2x)] - 3[A\cos(2x) + B\sin(2x)] = 65\cos(2x) \\
\to\quad & (-7A - 4B)\cos(2x) + (4A - 7B)\sin(2x) = 65\cos(2x) .
\end{aligned}
\]
For the cosine terms on the two sides of the last equation to balance, we need
\[ -7A - 4B = 65 , \]
and for the sine terms to balance, we need
\[ 4A - 7B = 0 . \]
This gives us a relatively simple system of two equations in two unknowns. Its solution is easily
found. From the second equation, we have
\[ B = \frac{4}{7}A . \]
Combining this with the first equation yields
\[ 65 = -7A - 4\left[\frac{4}{7}A\right] = \left[-\frac{49}{7} - \frac{16}{7}\right]A = -\frac{65}{7}A . \]
Thus,
\[ A = -7 \quad\text{and}\quad B = \frac{4}{7}A = \frac{4}{7}(-7) = -4 , \]
and a particular solution to the differential equation is given by
\[ y_p(x) = A\cos(2x) + B\sin(2x) = -7\cos(2x) - 4\sin(2x) . \]

The last example illustrates the fact that, typically, if $g(x)$ is a sine or cosine function (or a
linear combination of a sine and cosine function with the same frequency) then a linear combination
of both the sine and cosine can be used for $y_p(x)$. Thus, we have the following rule:

If, for some constants $C_c$, $C_s$ and $\omega$,
\[ g(x) = C_c\cos(\omega x) + C_s\sin(\omega x) \]
then a good first guess for a particular solution to differential equation (20.1) is
\[ y_p(x) = A\cos(\omega x) + B\sin(\omega x) \]
where $A$ and $B$ are constants to be determined.
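
The balancing of cosine and sine terms can be carried out once and for all. Plugging $A\cos(\omega x) + B\sin(\omega x)$ into $ay'' + by' + cy$ and collecting terms gives the cosine equation $(c - a\omega^2)A + b\omega B = C_c$ and the sine equation $-b\omega A + (c - a\omega^2)B = C_s$. The sketch below solves that pair; the function name `sinusoid_coefficients` and the restriction to integer or rational inputs are assumptions made for the illustration.

```python
from fractions import Fraction


def sinusoid_coefficients(a, b, c, omega, Cc, Cs):
    """Return (A, B) so that y_p = A cos(omega x) + B sin(omega x) solves
    a y'' + b y' + c y = Cc cos(omega x) + Cs sin(omega x).

    Hypothetical helper; assumes integer or rational inputs.
    """
    d = c - a * omega**2  # from a*y'' + c*y acting on cos or sin
    e = b * omega         # from b*y', which swaps cos and sin
    det = d * d + e * e
    if det == 0:
        # cos(omega x) and sin(omega x) solve the homogeneous equation
        raise ValueError("the first guess fails for this omega")
    # cosine terms:  d*A + e*B = Cc      sine terms:  -e*A + d*B = Cs
    A = Fraction(d * Cc - e * Cs, det)
    B = Fraction(e * Cc + d * Cs, det)
    return A, B


# Example 20.4:  y'' - 2y' - 3y = 65 cos(2x)
A, B = sinusoid_coefficients(1, -2, -3, 2, 65, 0)
print(A, B)
```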


Polynomials

!Example 20.5: Let us find a particular solution to
\[ y'' - 2y' - 3y = 9x^2 + 1 . \]
Now consider, if $y$ is any polynomial of degree $N$, then $y$, $y'$ and $y''$ are also polynomials of
degree $N$ or less. So the expression "$y'' - 2y' - 3y$" would then be a polynomial of degree $N$.
Since we want this to match the right side of the above differential equation, which is a polynomial
of degree 2, it seems reasonable to try a polynomial of degree $N$ with $N = 2$. So we "guess"
\[ y_p(x) = Ax^2 + Bx + C . \]
In this case
\[ y_p'(x) = 2Ax + B \quad\text{and}\quad y_p''(x) = 2A . \]
Plugging these into the differential equation:
\[
\begin{aligned}
& y_p'' - 2y_p' - 3y_p = 9x^2 + 1 \\
\to\quad & 2A - 2[2Ax + B] - 3[Ax^2 + Bx + C] = 9x^2 + 1 \\
\to\quad & -3Ax^2 + [-4A - 3B]x + [2A - 2B - 3C] = 9x^2 + 1 .
\end{aligned}
\]
For the last equation to hold, the corresponding coefficients of the polynomials on the two sides
must be equal, giving us the following system:
\[
\begin{aligned}
x^2 \text{ terms:} &\quad -3A = 9 \\
x \text{ terms:} &\quad -4A - 3B = 0 \\
\text{constant terms:} &\quad 2A - 2B - 3C = 1
\end{aligned}
\]
So,
\[ A = -\frac{9}{3} = -3 , \]
\[ B = -\frac{4A}{3} = -\frac{4(-3)}{3} = 4 \]
and
\[ C = \frac{1 - 2A + 2B}{-3} = \frac{1 - 2(-3) + 2(4)}{-3} = \frac{15}{-3} = -5 . \]
And the particular solution is
\[ y_p(x) = Ax^2 + Bx + C = -3x^2 + 4x - 5 . \]

Generalizing from this example, we can see that the rule for the first guess for $y_p(x)$ when $g$
is a polynomial is:

If
\[ g(x) = \text{a polynomial of degree } K , \]
then a good first guess for a particular solution to differential equation (20.1) is a $K$th-degree
polynomial
\[ y_p(x) = A_0 x^K + A_1 x^{K-1} + \cdots + A_{K-1}x + A_K \]
where the $A_k$'s are constants to be determined.
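
Matching coefficients from the highest power of $x$ downward, as done in example 20.5, is a simple back-substitution. The following sketch does this for $ay'' + by' + cy = g$ under the assumption that $c \neq 0$ (so that the first guess works); the function name `polynomial_particular` and the convention of listing coefficients from the constant term upward are illustrative choices only.

```python
from fractions import Fraction


def polynomial_particular(a, b, c, g):
    """Return coefficients p (p[k] multiplies x**k) of a polynomial y_p with
    a y_p'' + b y_p' + c y_p = g, where g[k] multiplies x**k.

    Hypothetical helper; assumes c != 0 and integer or rational inputs.
    The x**k terms give
        c*p[k] + b*(k+1)*p[k+1] + a*(k+2)*(k+1)*p[k+2] = g[k].
    """
    if c == 0:
        raise ValueError("constants solve the homogeneous equation; first guess fails")
    K = len(g) - 1
    p = [Fraction(0)] * (K + 3)  # two extra zero slots stand in for p[K+1], p[K+2]
    for k in range(K, -1, -1):   # highest power first, as in the example
        rhs = g[k] - b * (k + 1) * p[k + 1] - a * (k + 2) * (k + 1) * p[k + 2]
        p[k] = Fraction(rhs) / c
    return p[: K + 1]


# Example 20.5:  y'' - 2y' - 3y = 9x^2 + 1,  so g = [1, 0, 9]
C, B, A = polynomial_particular(1, -2, -3, [1, 0, 9])
print(A, B, C)
```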


Products of Exponentials, Polynomials, and Sines and Cosines

If $g$ is a product of the simple functions discussed above, then the guess for $y_p$ must take into
account everything discussed above. That leads to the following rule:

If, for some pair of polynomials $P(x)$ and $Q(x)$, and some pair of constants $\alpha$ and $\omega$,
\[ g(x) = P(x)e^{\alpha x}\cos(\omega x) + Q(x)e^{\alpha x}\sin(\omega x) \]
then a good first guess for a particular solution to differential equation (20.1) is
\[
\begin{aligned}
y_p(x) = {} & \left[ A_0 x^K + A_1 x^{K-1} + \cdots + A_{K-1}x + A_K \right] e^{\alpha x}\cos(\omega x) \\
& + \left[ B_0 x^K + B_1 x^{K-1} + \cdots + B_{K-1}x + B_K \right] e^{\alpha x}\sin(\omega x)
\end{aligned}
\]
where the $A_k$'s and $B_k$'s are constants to be determined and $K$ is the highest power
of $x$ appearing in polynomial $P(x)$ or $Q(x)$.
(Note that the above includes the cases where $\alpha = 0$ or $\omega = 0$. In these cases the
formula for $y_p$ simplifies a bit.)

!Example 20.6: To find a particular solution to
\[ y'' - 2y' - 3y = 65x\cos(2x) , \]
we should start by assuming it is of the form
\[ y_p(x) = [A_0 x + A_1]\cos(2x) + [B_0 x + B_1]\sin(2x) . \]
With a bit of work, you can verify yourself that, with $y = y_p(x)$, the above differential equation
reduces to
\[
\begin{aligned}
& [-2A_0 - 7A_1 + 4B_0 - 4B_1]\cos(2x) + [-7A_0 - 4B_0]\,x\cos(2x) \\
& \quad + [-4A_0 + 4A_1 - 2B_0 - 7B_1]\sin(2x) + [4A_0 - 7B_0]\,x\sin(2x) = 65x\cos(2x) .
\end{aligned}
\]
Comparing the terms on either side of the last equation, we get the following system:
\[
\begin{aligned}
\cos(2x) \text{ terms:} &\quad -2A_0 - 7A_1 + 4B_0 - 4B_1 = 0 \\
x\cos(2x) \text{ terms:} &\quad -7A_0 - 4B_0 = 65 \\
\sin(2x) \text{ terms:} &\quad -4A_0 + 4A_1 - 2B_0 - 7B_1 = 0 \\
x\sin(2x) \text{ terms:} &\quad 4A_0 - 7B_0 = 0
\end{aligned}
\]
Solving this system yields
\[ A_0 = -7 , \quad A_1 = -\frac{158}{65} , \quad B_0 = -4 \quad\text{and}\quad B_1 = \frac{244}{65} . \]
So a particular solution to the differential equation is given by
\[ y_p(x) = \left[ -7x - \frac{158}{65} \right]\cos(2x) + \left[ -4x + \frac{244}{65} \right]\sin(2x) . \]
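
Systems of this size are tedious by hand, so it is reasonable to hand them to a small linear solver. The following is a minimal sketch using Gauss-Jordan elimination with exact fractions; the helper `solve_linear` is our own illustrative routine (any linear-algebra package would serve), and the ordering of the unknowns as $A_0, A_1, B_0, B_1$ is an arbitrary choice.

```python
from fractions import Fraction


def solve_linear(M, rhs):
    """Solve M u = rhs by Gauss-Jordan elimination in exact arithmetic.

    Hypothetical helper; assumes M is square and nonsingular
    (a singular M makes the pivot search raise StopIteration).
    """
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        # find a row at or below the diagonal with a nonzero entry in this column
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [v / pv for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


# System from example 20.6, unknowns ordered as A0, A1, B0, B1
M = [
    [-2, -7, 4, -4],  # cos(2x) terms
    [-7, 0, -4, 0],   # x cos(2x) terms
    [-4, 4, -2, -7],  # sin(2x) terms
    [4, 0, -7, 0],    # x sin(2x) terms
]
rhs = [0, 65, 0, 0]
A0, A1, B0, B1 = solve_linear(M, rhs)
print(A0, A1, B0, B1)
```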


20.3 When the First Guess Fails

!Example 20.7: Consider
\[ y'' - 2y' - 3y = 28e^{3x} . \]
Our first guess is
\[ y_p(x) = Ae^{3x} . \]
Plugging it into the differential equation:
\[
\begin{aligned}
& y_p'' - 2y_p' - 3y_p = 28e^{3x} \\
\to\quad & [Ae^{3x}]'' - 2[Ae^{3x}]' - 3[Ae^{3x}] = 28e^{3x} \\
\to\quad & 9Ae^{3x} - 2[3Ae^{3x}] - 3[Ae^{3x}] = 28e^{3x} \\
\to\quad & 9Ae^{3x} - 6Ae^{3x} - 3Ae^{3x} = 28e^{3x} .
\end{aligned}
\]
But when we add up the left side of the last equation, we get the impossible equation
\[ 0 = 28e^{3x} \; ! \]
No value for $A$ can make this equation true! So our first guess fails.
Why did it fail? Because the guess, $Ae^{3x}$, was already a solution to the corresponding
homogeneous equation
\[ y'' - 2y' - 3y = 0 , \]
which we would have realized if we had recalled the general solution to this homogeneous
differential equation. So the left side of our differential equation will have to vanish when we
plug in this guess, leaving us with an 'impossible' equation.

In general, whenever our first guess for a particular solution contains a term that is also a solution
to the corresponding homogeneous differential equation, then the contribution of that term to
\[ a y_p'' + b y_p' + c y_p = g \]
vanishes, and we are left with an equation or a system of equations with no possible solution. In
these cases, we can still attempt to solve the problem using the first guess with the reduction of order
method mentioned in the previous chapter. To save time, though, I will tell you what would happen.
You would discover that, if the first guess fails, then there is a particular solution of the form
\[ x \times \text{“the first guess”} \]
unless this formula also contains a term satisfying the corresponding homogeneous differential
equation, in which case there is a particular solution of the form
\[ x^2 \times \text{“the first guess”} . \]
Thus, instead of using reduction of order (or the method we'll learn in the next chapter), we can
apply the following rules for generating the appropriate guess for the form for a particular solution
$y_p(x)$ (given that we've already figured out the first guess using the rules in the previous section):


If the first guess for $y_p(x)$ contains a term that is also a solution to the corresponding
homogeneous differential equation, then consider
\[ x \times \text{“the first guess”} \]
as a "second guess". If this (after multiplying through by the $x$) does not contain a term
satisfying the corresponding homogeneous differential equation, then set
\[ y_p(x) = \text{“second guess”} = x \times \text{“the first guess”} . \]
If, however, the second guess also contains a term satisfying the corresponding homogeneous
differential equation, then set
\[ y_p(x) = \text{“the third guess”} \]
where
\[ \text{“third guess”} = x \times \text{“the second guess”} = x^2 \times \text{“the first guess”} . \]

It must be emphasized that the second guess is used only if the first fails (i.e., has a term that
satisfies the homogeneous equation). If the first guess works, then the second (and third) guesses
will not work. Likewise, if the second guess works, then the third guess is not only unnecessary, it
will not work. If, however, the first and second guesses fail, you can be sure that the third guess will
work.

!Example 20.8: Again, consider
\[ y'' - 2y' - 3y = 28e^{3x} . \]
Our first guess
\[ Ae^{3x} \]
was a solution to the corresponding homogeneous differential equation. So we try a second guess
of the form
\[ x \times \text{“first guess”} = x \times Ae^{3x} = Axe^{3x} . \]
Comparing this (our second guess) to the general solution
\[ y_h(x) = c_1 e^{-x} + c_2 e^{3x} \]
of the corresponding homogeneous equation (see example 20.2), we see that our second guess
is not a solution to the corresponding homogeneous differential equation, and, so, we can find a
particular solution to our nonhomogeneous differential equation by setting
\[ y_p(x) = \text{“second guess”} = Axe^{3x} . \]
The first two derivatives of this are
\[ y_p'(x) = Ae^{3x} + 3Axe^{3x} \]
and
\[ y_p''(x) = 3Ae^{3x} + 3Ae^{3x} + 9Axe^{3x} = 6Ae^{3x} + 9Axe^{3x} . \]
Using this:
\[
\begin{aligned}
& y_p'' - 2y_p' - 3y_p = 28e^{3x} \\
\to\quad & [Axe^{3x}]'' - 2[Axe^{3x}]' - 3[Axe^{3x}] = 28e^{3x}
\end{aligned}
\]


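\[
\begin{aligned}
\to\quad & \left[ 6Ae^{3x} + 9Axe^{3x} \right] - 2\left[ Ae^{3x} + 3Axe^{3x} \right] - 3Axe^{3x} = 28e^{3x} \\
\to\quad & [9 - 2(3) - 3]\,Axe^{3x} + [6 - 2]\,Ae^{3x} = 28e^{3x} \\
\to\quad & 4Ae^{3x} = 28e^{3x} .
\end{aligned}
\]
Thus,
\[ A = \frac{28}{4} = 7 \]
and
\[ y_p(x) = 7xe^{3x} . \]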
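
The pattern in the last two examples can be summarized in a short computation. The text finds $A$ by direct substitution; the sketch below instead uses closed forms that follow from that substitution, and which are not stated in the text. With $p(r) = ar^2 + br + c$, plugging in $xe^{\alpha x}$ gives $[p'(\alpha) + x\,p(\alpha)]e^{\alpha x}$, and plugging in $x^2e^{\alpha x}$ gives $[2a + 2x\,p'(\alpha) + x^2 p(\alpha)]e^{\alpha x}$. The function name `exponential_particular` is hypothetical.

```python
from fractions import Fraction


def exponential_particular(a, b, c, C, alpha):
    """Return (M, A) so that y_p = A * x**M * exp(alpha*x) solves
    a y'' + b y' + c y = C exp(alpha x), assuming a != 0.

    Hypothetical helper; M = 0, 1 or 2 corresponds to the first,
    second or third guess.  Assumes integer or rational inputs.
    """
    p0 = a * alpha**2 + b * alpha + c  # p(alpha)
    p1 = 2 * a * alpha + b             # p'(alpha)
    p2 = 2 * a                         # p''(alpha)
    if p0 != 0:
        return 0, Fraction(C, p0)      # first guess works
    if p1 != 0:
        return 1, Fraction(C, p1)      # alpha is a simple root: use x * first guess
    return 2, Fraction(C, p2)          # alpha is a double root: use x^2 * first guess


# Example 20.8:  y'' - 2y' - 3y = 28 e^{3x}
M, A = exponential_particular(1, -2, -3, 28, 3)
print(M, A)
```

Here the returned pair corresponds to the second guess $Axe^{3x}$ with the same value of $A$ found by hand.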

20.4 Method of Guess in General

If you think about why the method of (educated) guess works with second-order equations, you
will realize that this basic approach will work just as well with any linear differential equation with
constant coefficients,
\[ a_0 y^{(N)} + a_1 y^{(N-1)} + \cdots + a_{N-1}y' + a_N y = g , \]
provided the $g(x)$ is any of the types of functions already discussed. The appropriate first guesses
are exactly the same, and, if a term in one 'guess' happens to already satisfy the corresponding
homogeneous differential equation, then $x$ times that guess will be an appropriate 'next guess'. The
only modification in our method is that, with higher order equations, we may have to go to a fourth
guess or a fifth guess or . . . .

!Example 20.9: Consider the seventh-order nonhomogeneous differential equation
\[ y^{(7)} - 625y^{(3)} = 6e^{2x} . \]
An appropriate first guess for a particular solution is still
\[ y_p(x) = Ae^{2x} . \]
Plugging this guess into the differential equation:
\[
\begin{aligned}
& y_p^{(7)} - 625y_p^{(3)} = 6e^{2x} \\
\to\quad & [Ae^{2x}]^{(7)} - 625[Ae^{2x}]^{(3)} = 6e^{2x} \\
\to\quad & 2^7 Ae^{2x} - 625\cdot 2^3 Ae^{2x} = 6e^{2x} \\
\to\quad & 128Ae^{2x} - 5{,}000Ae^{2x} = 6e^{2x} \\
\to\quad & -4{,}872Ae^{2x} = 6e^{2x} \\
\to\quad & A = -\frac{6}{4{,}872} = -\frac{1}{812} .
\end{aligned}
\]


So a particular solution to our differential equation is
\[ y_p(x) = -\frac{1}{812}e^{2x} . \]
Fortunately, we dealt with the corresponding homogeneous equation,
\[ y^{(7)} - 625y^{(3)} = 0 , \]
in example 17.6 on page 342. Looking back at that example, we see that the general solution to
this homogeneous differential equation is
\[ y_h(x) = c_1 + c_2 x + c_3 x^2 + c_4 e^{5x} + c_5 e^{-5x} + c_6\cos(5x) + c_7\sin(5x) . \tag{20.2} \]
Thus, the general solution to our nonhomogeneous equation,
\[ y^{(7)} - 625y^{(3)} = 6e^{2x} , \]
is
\[
\begin{aligned}
y(x) &= y_p(x) + y_h(x) \\
&= -\frac{1}{812}e^{2x} + c_1 + c_2 x + c_3 x^2 + c_4 e^{5x} + c_5 e^{-5x} + c_6\cos(5x) + c_7\sin(5x) .
\end{aligned}
\]

!Example 20.10: Now consider the nonhomogeneous equation
\[ y^{(7)} - 625y^{(3)} = 300x + 50 . \]
Since the right side is a polynomial of degree one, the appropriate first guess for a particular
solution is
\[ y_p(x) = Ax + B . \]
However, the general solution to the corresponding homogeneous equation (formula (20.2), above)
contains both a constant term and a $cx$ term. So plugging this guess into the nonhomogeneous
differential equation will yield the impossible equation
\[ 0 = 300x + 50 . \]
Likewise, both terms of the second guess,
\[ x \times \text{“first guess”} = x \times (Ax + B) = Ax^2 + Bx , \]
and the last term of the third guess,
\[ x \times \text{“second guess”} = x \times (Ax^2 + Bx) = Ax^3 + Bx^2 , \]
satisfy the corresponding homogeneous differential equation, and, thus, would fail. The fourth
guess,
\[ x \times \text{“third guess”} = x \times (Ax^3 + Bx^2) = Ax^4 + Bx^3 , \]
has no terms in common with the general solution to the corresponding homogeneous equation
(formula (20.2), above). So the appropriate "guess" here is
\[ y_p(x) = \text{“fourth guess”} = Ax^4 + Bx^3 . \]


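Using this:
\[
\begin{aligned}
& y_p^{(7)} - 625y_p^{(3)} = 300x + 50 \\
\to\quad & [Ax^4 + Bx^3]^{(7)} - 625[Ax^4 + Bx^3]^{(3)} = 300x + 50 \\
\to\quad & 0 - 625[A\cdot 4\cdot 3\cdot 2\,x + B\cdot 3\cdot 2\cdot 1] = 300x + 50 \\
\to\quad & -15{,}000Ax - 3{,}750B = 300x + 50 .
\end{aligned}
\]
Thus,
\[ A = -\frac{300}{15{,}000} = -\frac{1}{50} \quad\text{and}\quad B = -\frac{50}{3{,}750} = -\frac{1}{75} , \]
and a particular solution to our nonhomogeneous differential equation is given by
\[ y_p(x) = -\frac{x^4}{50} - \frac{x^3}{75} . \]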

For the sake of completeness, let us end our development of the method of (educated) guess
(more properly called the method of undetermined coefﬁcients) with a theorem that does two
things:

1. It concisely summarizes the rules we’ve developed in this chapter. (But its conciseness may
make it too dense to be easily used — so just remember the rules we’ve developed instead
of memorizing this theorem.)

2. It assures us that the method we’ve just developed will always work.

Theorem 20.1
Suppose we have a nonhomogeneous linear differential equation with constant coefficients
\[ a_0 y^{(N)} + a_1 y^{(N-1)} + \cdots + a_{N-1}y' + a_N y = g \]
where
\[ g(x) = P(x)e^{\alpha x}\cos(\omega x) + Q(x)e^{\alpha x}\sin(\omega x) \]
for some pair of polynomials $P(x)$ and $Q(x)$, and some pair of constants $\alpha$ and $\omega$. Let $K$ be
the highest power of $x$ appearing in $P(x)$ or $Q(x)$, and let $M$ be the smallest nonnegative integer
such that
\[ x^M e^{\alpha x}\cos(\omega x) \]
is not a solution to the corresponding homogeneous differential equation.
Then there are constants $A_0$, $A_1$, . . . and $A_K$, and constants $B_0$, $B_1$, . . . and $B_K$ such that
\[
\begin{aligned}
y_p(x) = {} & x^M \left[ A_0 x^K + A_1 x^{K-1} + \cdots + A_{K-1}x + A_K \right] e^{\alpha x}\cos(\omega x) \\
& + x^M \left[ B_0 x^K + B_1 x^{K-1} + \cdots + B_{K-1}x + B_K \right] e^{\alpha x}\sin(\omega x)
\end{aligned}
\tag{20.3}
\]
is a particular solution to the given nonhomogeneous differential equation.

Proving this theorem is not that difﬁcult, provided you have the right tools. Those who are
interested can turn to section 20.7 for the details.
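
When $\omega = 0$, the integer $M$ in theorem 20.1 is just the multiplicity of $\alpha$ as a root of the characteristic polynomial (zero if $\alpha$ is not a root), and that multiplicity can be found by counting how many successive derivatives of the polynomial vanish at $\alpha$. The following sketch does this for integer or rational $\alpha$; the function names are illustrative, and the case $\omega \neq 0$ (where one would test the complex number $\alpha + i\omega$ instead) is deliberately left out.

```python
def derivative(coeffs):
    """Differentiate a polynomial given as [a0, a1, ..., aN], highest power first."""
    n = len(coeffs) - 1
    return [c * (n - i) for i, c in enumerate(coeffs[:-1])]


def evaluate(coeffs, r):
    """Evaluate the polynomial at r by Horner's rule."""
    value = 0
    for c in coeffs:
        value = value * r + c
    return value


def multiplicity(coeffs, alpha):
    """Number of times alpha occurs as a root (0 if it is not a root).

    Hypothetical helper; uses the fact that alpha has multiplicity M exactly
    when the polynomial and its first M-1 derivatives vanish at alpha but
    the M-th derivative does not.
    """
    M = 0
    while coeffs and evaluate(coeffs, alpha) == 0:
        coeffs = derivative(coeffs)
        M += 1
    return M


# Characteristic polynomial r^7 - 625 r^3 of examples 20.9 and 20.10
p = [1, 0, 0, 0, -625, 0, 0, 0]
print(multiplicity(p, 2))  # g = 6e^{2x}: the first guess works
print(multiplicity(p, 0))  # g = 300x + 50: how many factors of x are needed
```

For the polynomial right side of example 20.10 this count reproduces the "fourth guess" $x^3(Ax + B)$ reached there by trial.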


20.5 Common Mistakes

A Bad Alternative to Formula (20.3)
One common mistake is to use
\[ x^M \left[ A_0 x^K + A_1 x^{K-1} + \cdots + A_{K-1}x + A_K \right] e^{\alpha x} \left[ C_1\cos(\omega x) + C_2\sin(\omega x) \right] \]
for $y_p(x)$ instead of formula (20.3). These two formulas are not equivalent.

!Example 20.11: Let us suppose that the particular solution we are seeking is actually
\[ y_p(x) = [2x + 3]\cos(2x) + [4x - 5]\sin(2x) , \]
and that we are (incorrectly) trying to use the "guess"
\[ y_p(x) = [A_0 x + A_1]\,[C_1\cos(2x) + C_2\sin(2x)] \]
to find it. Setting the guess equal to the correct answer, and multiplying things out, we get
\[
\begin{aligned}
[2x + 3]\cos(2x) + [4x - 5]\sin(2x)
&= [A_0 x + A_1]\,[C_1\cos(2x) + C_2\sin(2x)] \\
&= [A_0C_1 x + A_1C_1]\cos(2x) + [A_0C_2 x + A_1C_2]\sin(2x) .
\end{aligned}
\]
Thus, we must have
\[ A_0C_1 = 2 , \quad A_1C_1 = 3 , \quad A_0C_2 = 4 \quad\text{and}\quad A_1C_2 = -5 . \]
But then,
\[ \frac{2}{3} = \frac{A_0C_1}{A_1C_1} = \frac{A_0}{A_1} = \frac{A_0C_2}{A_1C_2} = \frac{4}{-5} , \]
which is impossible. So we cannot find the correct formula for $y_p$ using
\[ y_p(x) = [A_0 x + A_1]\,[C_1\cos(2x) + C_2\sin(2x)] \]
instead of formula (20.3).

The problem here is that, while the products $A_0C_1$, $A_1C_1$, $A_0C_2$ and $A_1C_2$ do define
four constants
\[ A_0C_1 = D_1 , \quad A_1C_1 = D_2 , \quad A_0C_2 = D_3 \quad\text{and}\quad A_1C_2 = D_4 , \]
these constants are not independent constants: given any three of those constants, the fourth is
related to the other three by
\[ \frac{D_1}{D_2} = \frac{A_0C_1}{A_1C_1} = \frac{A_0C_2}{A_1C_2} = \frac{D_3}{D_4} . \]


Using Too Many Undetermined Coefﬁcients

It may be argued that there is no harm in using expressions with extra undetermined coefficients,
say,
\[ y_p(x) = \left[ A_0 x^3 + A_1 x^2 + A_2 x + A_3 \right]\cos(2x) + \left[ B_0 x^3 + B_1 x^2 + B_2 x + B_3 \right]\sin(2x) \]
when theorem 20.1 assures you that
\[ y_p(x) = \left[ A_0 x + A_1 \right]\cos(2x) + \left[ B_0 x + B_1 \right]\sin(2x) \]

will suffice. After all, won’t the extra coefficients just end up being zero? Well, yes, IF you do
all your calculations correctly. But, by including those extra terms, you have greatly increased the
difficulty and length of your calculations, thereby increasing your chances of making errors in your
calculations. And why complicate your calculation so much when you should already know that
those extra terms will all be zero?
So, make sure your “guess” contains the right number of coefficients to be determined — not
too many, and not too few.

20.6 Using the Principle of Superposition

Suppose we have a nonhomogeneous linear differential equation with constant coefficients
\[ a_0 y^{(N)} + a_1 y^{(N-1)} + \cdots + a_{N-1}y' + a_N y = g \]
where $g$ is the sum of functions
\[ g(x) = g_1(x) + g_2(x) + \cdots \]
with each of the $g_k$'s requiring a different 'guess' for $y_p$. One approach to finding a particular
solution $y_p(x)$ to this is to construct a big guess by adding together all the guesses suggested
by the $g_k$'s. This typically leads to rather lengthy formulas and requires keeping track of many
undetermined constants, and that often leads to errors in computations, errors that, themselves,
may be difficult to recognize or track down.
Another approach is to break down the differential equation to a collection of slightly simpler
differential equations,
\[
\begin{aligned}
a_0 y^{(N)} + a_1 y^{(N-1)} + \cdots + a_{N-1}y' + a_N y &= g_1 , \\
a_0 y^{(N)} + a_1 y^{(N-1)} + \cdots + a_{N-1}y' + a_N y &= g_2 , \\
&\;\;\vdots
\end{aligned}
\]
and, for each $g_k$, find a particular solution $y = y_{pk}$ to
\[ a_0 y^{(N)} + a_1 y^{(N-1)} + \cdots + a_{N-1}y' + a_N y = g_k . \]
By the principle of superposition for nonhomogeneous equations discussed in section 19.2, we know
that a particular solution to the differential equation of interest,
\[ a_0 y^{(N)} + a_1 y^{(N-1)} + \cdots + a_{N-1}y' + a_N y = g_1 + g_2 + \cdots , \]


can then be constructed by simply adding up the $y_{pk}$'s,
\[ y_p(x) = y_{p1}(x) + y_{p2}(x) + \cdots . \]

Typically, the total amount of computational work is essentially the same for either approach.
Still the approach of breaking the problem into simpler problems and using superposition is usually
considered to be easier to actually carry out since we are dealing with smaller formulas and fewer
variables at each step.

!Example 20.12: Consider
\[ y'' - 2y' - 3y = 65\cos(2x) + 9x^2 + 1 . \]
Because
\[ g_1(x) = 65\cos(2x) \quad\text{and}\quad g_2(x) = 9x^2 + 1 \]
lead to different initial guesses for $y_p(x)$, we will break this problem into the separate problems
of finding particular solutions to
\[ y'' - 2y' - 3y = 65\cos(2x) \]
and
\[ y'' - 2y' - 3y = 9x^2 + 1 . \]
Fortunately, these happen to be differential equations considered in previous examples. From
example 20.4 we know that a particular solution to the first of these two equations is
\[ y_{p1}(x) = -7\cos(2x) - 4\sin(2x) , \]
and from example 20.5 we know that a particular solution to the second of these two equations is
\[ y_{p2}(x) = -3x^2 + 4x - 5 . \]
So, by the principle of superposition, we have that a particular solution to
\[ y'' - 2y' - 3y = 65\cos(2x) + 9x^2 + 1 \]
is given by
\[
\begin{aligned}
y_p(x) &= y_{p1}(x) + y_{p2}(x) \\
&= -7\cos(2x) - 4\sin(2x) - 3x^2 + 4x - 5 .
\end{aligned}
\]
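
A quick numerical sanity check of a result like this is often worthwhile. The sketch below codes $y_p$ together with its first two derivatives (computed by hand from the formula above) and measures how far the left side of the differential equation is from the right side at a few sample points. The function names and the choice of sample points are arbitrary and only for illustration.

```python
import math


def y_p(x):
    return -7 * math.cos(2 * x) - 4 * math.sin(2 * x) - 3 * x**2 + 4 * x - 5


def y_p_prime(x):
    # derivative of y_p, computed by hand
    return 14 * math.sin(2 * x) - 8 * math.cos(2 * x) - 6 * x + 4


def y_p_double_prime(x):
    # second derivative of y_p, computed by hand
    return 28 * math.cos(2 * x) + 16 * math.sin(2 * x) - 6


def residual(x):
    """Left side minus right side of y'' - 2y' - 3y = 65 cos(2x) + 9x^2 + 1."""
    lhs = y_p_double_prime(x) - 2 * y_p_prime(x) - 3 * y_p(x)
    rhs = 65 * math.cos(2 * x) + 9 * x**2 + 1
    return lhs - rhs


sample_points = [-3.0, -1.5, 0.0, 0.7, 2.0, 3.0]
max_residual = max(abs(residual(x)) for x in sample_points)
print(max_residual)  # should be zero up to floating-point roundoff
```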

20.7 On Verifying Theorem 20.1

Theorem 20.1, which conﬁrms our “method of guess”, is the main theorem of this chapter. Its proof
follows relatively easily using some of the ideas developed in sections 14.3 and 17.4 on multiplying
and factoring linear differential operators.


A Useful Lemma
Rather than tackle the proof of theorem 20.1 directly, we will ﬁrst prove the following lemma. This
lemma contains much of our theorem, and its proof nicely illustrates the main ideas in the proof of
the main theorem. After this lemma’s proof, we’ll see about proving our main theorem.

Lemma 20.2
Let $L$ be a linear differential operator with constant coefficients, and assume $y_p$ is a function
satisfying
\[ L[y_p] = g \]
where, for some nonnegative integer $K$ and constants $\rho$, $b_0$, $b_1$, . . . and $b_K$,
\[ g(x) = b_0 x^K e^{\rho x} + b_1 x^{K-1}e^{\rho x} + \cdots + b_{K-1}xe^{\rho x} + b_K e^{\rho x} . \]
Then there are constants $A_0$, $A_1$, . . . and $A_K$ such that
\[ y_p(x) = x^M \left[ A_0 x^K + A_1 x^{K-1} + \cdots + A_{K-1}x + A_K \right] e^{\rho x} \]
where
\[
M = \begin{cases}
\text{multiplicity of } \rho & \text{if } \rho \text{ is a root of } L\text{'s characteristic polynomial} \\
0 & \text{if } \rho \text{ is not a root of } L\text{'s characteristic polynomial}
\end{cases} .
\]

PROOF: Let $y_q$ be any function on the real line satisfying
\[ L[y_q] = g . \]
(Theorem 11.4 on page 240 assures us that such a function exists.) Applying an equality from lemma
17.7 on page 347, you can easily verify that
\[ \left( \frac{d}{dx} - \rho \right)^{K+1} [g] = 0 . \]
Hence,
\[ \left( \frac{d}{dx} - \rho \right)^{K+1} L[y_q] = \left( \frac{d}{dx} - \rho \right)^{K+1} [g] = 0 . \]
That is, $y_q$ is a solution to the homogeneous linear differential equation with constant coefficients
\[ \left( \frac{d}{dx} - \rho \right)^{K+1} L[y] = 0 . \]
Letting $r_1$, $r_2$, . . . and $r_L$ be all the roots other than $\rho$ to the characteristic polynomial for $L$, we
can factor the characteristic equation for the last differential equation above to
\[ a(r - \rho)^{K+1}(r - r_1)^{m_1}(r - r_2)^{m_2}\cdots(r - r_L)^{m_L}(r - \rho)^M = 0 . \]
Equivalently,
\[ a(r - r_1)^{m_1}(r - r_2)^{m_2}\cdots(r - r_L)^{m_L}(r - \rho)^{M+K+1} = 0 . \]
From what we learned about the general solutions to homogeneous linear differential equations with
constant coefficients in chapter 17, we know that
\[ y_q(x) = Y_1(x) + Y_\rho(x) \]


where $Y_1$ is a linear combination of the $x^k e^{rx}$'s arising from the roots other than $\rho$, and
\[ Y_\rho(x) = C_0 e^{\rho x} + C_1 xe^{\rho x} + C_2 x^2 e^{\rho x} + \cdots + C_{M+K}x^{M+K}e^{\rho x} . \]
Now let $Y_{\rho,0}$ consist of the first $M$ terms of $Y_\rho$, and set
\[ y_p = y_q - Y_1 - Y_{\rho,0} . \]
Observe that, while $y_q$ is a solution to the nonhomogeneous differential equation $L[y] = g$, every
term in $Y_1(x)$ and $Y_{\rho,0}(x)$ is a solution to the corresponding homogeneous differential equation
$L[y] = 0$. Hence,
\[ L[y_p] = L[y_q - Y_1 - Y_{\rho,0}] = L[y_q] - L[Y_1] - L[Y_{\rho,0}] = g - 0 - 0 = g . \]
So $y_p$ is a solution to the nonhomogeneous differential equation in the lemma. Moreover,
\[
\begin{aligned}
y_p(x) &= y_q - Y_1 - Y_{\rho,0} \\
&= Y_\rho(x) - \text{the first } M \text{ terms of } Y_\rho(x) \\
&= C_M x^M e^{\rho x} + C_{M+1}x^{M+1}e^{\rho x} + C_{M+2}x^{M+2}e^{\rho x} + \cdots + C_{M+K}x^{M+K}e^{\rho x} \\
&= x^M \left[ C_M + C_{M+1}x + C_{M+2}x^2 + \cdots + C_{M+K}x^K \right] e^{\rho x} ,
\end{aligned}
\]
which, except for minor cosmetic differences, is the formula for $y_p$ claimed in the lemma.

Proving the Main Theorem

Take a look at theorem 20.1 on page 392. Observe that, if you set $\rho = \alpha$, then our lemma is just a
restatement of that theorem with the additional assumption that $\omega = 0$. So the claims of theorem
20.1 follow immediately from our lemma when $\omega = 0$.
Verifying the claims of theorem 20.1 when $\omega \neq 0$ requires just a little more work. Simply let
\[ \rho = \alpha + i\omega \]
and redo the lemma's proof (making the obvious modifications) with the double factor
\[ \left( \frac{d}{dx} - \rho \right)\left( \frac{d}{dx} - \rho^* \right) \]
replacing the single factor
\[ \left( \frac{d}{dx} - \rho \right) , \]
and keeping in mind what you know about the solutions to a homogeneous differential equation
with constant coefficients corresponding to the complex roots of the characteristic polynomial. The
details will be left to the interested reader.


Additional Exercises

20.1. Find both a particular solution y p (via the method of educated guess) and a general solution
y to each of the following:
a. y + 9y = 52e2x b. y − 6y + 9y = 27e6x
c. y + 4y − 5y = 30e−4x d. y + 3y = e x/2

20.2. Solve the initial-value problem

y − 3y − 10y = −5e3x with y(0) = 5 and y (0) = 3 .

20.3. Find both a particular solution y p (via the method of educated guess) and a general solution
y to each of the following:
a. y + 9y = 10 cos(2x) + 15 sin(2x) b. y − 6y + 9y = 25 sin(6x)
x x
c. y + 3y = 26 cos − 12 sin d. y + 4y − 5y = cos(x)
3 3

20.4. Solve the initial-value problem

y − 3y − 10y = −4 cos(x) + 7 sin(x) with y(0) = 8 and y (0) = −5 .

20.5. Find both a particular solution y p (via the method of educated guess) and a general solution
y to each of the following:
a. y − 3y − 10y = −200 b. y + 4y − 5y = x 3
c. y − 6y + 9y = 18x 2 + 3x + 4 d. y + 9y = 9x 4 − 9

20.6. Solve the initial-value problem

y + 9y = x 3 with y(0) = 0 and y (0) = 0 .

20.7. Find both a particular solution y p (via the method of educated guess) and a general solution
y to each of the following:
a. y + 9y = 25x cos(2x) b. y − 6y + 9y = e2x sin(x)
c. y + 9y = 54x 2 e3x d. y = 6xe x sin(x)
e. y − 2y + y = [−6x − 8] cos(2x) + [8x − 11] sin(2x)
f. y − 2y + y = [12x − 4]e−5x

20.8. Solve the initial-value problem

y + 9y = 39xe2x with y(0) = 1 and y (0) = 0 .

20.9. Find both a particular solution y p (via the method of educated guess) and a general solution
y to each of the following:
a. y − 3y − 10y = −3e−2x b. y + 4y = 20


c. y + 4y = x 2 d. y + 9y = 3 sin(3x)
e. y − 6y + 9y = 10e3x f. y + 4y = 4xe−4x

20.10. Find a general solution to each of the following, using the method of educated guess to ﬁnd
a particular solution.

a. y − 3y − 10y = 72x 2 − 1 e2x b. y − 3y − 10y = 4xe6x

c. y − 10y + 25y = 6e5x d. y − 10y + 25y = 6e−5x

3. κ is the spring constant, a positive quantity describing the “stiffness” of the spring (with
“stiffer” springs having larger values for κ ).

4. γ is the damping constant, a nonnegative quantity describing how much friction is in the
system resisting the motion (with γ = 0 corresponding to an ideal system with no friction
whatsoever).

5. F is the sum of all forces acting on the spring other than those due to the spring responding
to being compressed and stretched, and the frictional forces in the system resisting motion.

Since we are expanding on the results from chapter 16, let us recall some of the major results
derived there regarding the general solution $y_h$ to the corresponding homogeneous equation
\[ m\frac{d^2 y_h}{dt^2} + \gamma\frac{dy_h}{dt} + \kappa y_h = 0 . \tag{21.1} \]

If there is no friction in the system, then we say the system is undamped, and the solution to
equation (21.1) is
\[ y_h(t) = c_1\cos(\omega_0 t) + c_2\sin(\omega_0 t) \]
or, equivalently,
\[ y_h(t) = A\cos(\omega_0 t - \phi) \]
where
\[ \omega_0 = \sqrt{\frac{\kappa}{m}} \]
is the natural angular frequency of the system, and the other constants are related by
\[ A = \sqrt{(c_1)^2 + (c_2)^2} , \quad \cos(\phi) = \frac{c_1}{A} \quad\text{and}\quad \sin(\phi) = \frac{c_2}{A} . \]
When convenient, we can rewrite the above formulas for $y_h$ in terms of the system's natural frequency
$\nu_0$ by simply replacing each $\omega_0$ with $2\pi\nu_0$.
If there is friction resisting the object’s motion (i.e., if 0 < γ ), then we say the system
is damped, and we can further classify the system as being underdamped, critically damped and


overdamped, according to the precise relation between γ , κ and m . In these cases, different
solutions to equation (21.1) arise, but in each of these cases, every term of the solution yh (t) has an
exponentially decreasing factor. This factor ensures that

yh (t) → 0 as t → ∞ .

That is what will be particularly relevant in this chapter.

(At this point, you may want to go back and quickly review chapter 16 yourself, verifying the
above and ﬁlling in some of the details glossed over. In particular, you may want to glance back over
the brief note on ‘units’ starting on page 322.)

21.2 Constant Force

Let us first consider the case where the external force is constant. For example, the spring might be
hanging vertically and the external force is the force of gravity on the object. Letting $F_0$ be that
constant, the differential equation for $y = y(t)$ is
\[ m\frac{d^2 y}{dt^2} + \gamma\frac{dy}{dt} + \kappa y = F_0 . \]
From our development of the method of guess, we know the general solution is
\[ y(t) = y_h(t) + y_p(t) \]
where $y_h$ is as described in the previous section, and the particular solution, $y_p$, is some constant,
\[ y_p(t) = y_0 \quad\text{for all } t . \]
Plugging this constant solution into the differential equation, we get
\[ m\cdot 0 + \gamma\cdot 0 + \kappa y_0 = F_0 . \]
Hence,
\[ y_0 = \frac{F_0}{\kappa} . \tag{21.2} \]
If the system is undamped, then
\[ y(t) = y_h(t) + y_0 = c_1\cos(\omega_0 t) + c_2\sin(\omega_0 t) + y_0 , \]
which tells us that the object is oscillating about $y = y_0$. On the other hand, if the system is damped,
then
\[ \lim_{t\to\infty} y(t) = \lim_{t\to\infty} [y_h(t) + y_0] = 0 + y_0 . \]
In this case, $y = y_0$ is where the object finally ends up. Either way, the effect of this constant force
is to change the object's equilibrium point from $y = 0$ to $y = y_0$. Accordingly, if $L$ is the natural
length of the spring, then we call $L + y_0$ the equilibrium length of the spring in this mass/spring
system under the constant force $F_0$.
It's worth noting that, in practice, $y_0$ is a quantity that can often be measured. If we also know
the force, then relation (21.2) can be used to determine the spring constant $\kappa$.

404 Springs: Part II (Forced Vibrations)

Example 21.1: Suppose we have a spring whose natural length is 1 meter. We attach a 2
kilogram mass to its end and hang it vertically (as in figure 21.1c), letting the force of gravity
(near the Earth’s surface) act on the mass. After the mass stops bobbing up and down, we measure
the spring and find that its length is now 1.4 meters, 0.4 meters longer than its natural length.
This gives us y0 (as defined above), and since we are near the Earth’s surface,

\[ F_0 = \text{force of gravity on the mass} = mg = 2 \times 9.8 \ \frac{\text{kg}\cdot\text{meter}}{\text{sec}^2} . \]

Solving equation (21.2) for the spring constant and plugging in the above values, we get

\[ \kappa = \frac{F_0}{y_0} = \frac{2 \times 9.8}{0.4} = 49 \ \frac{\text{kg}}{\text{sec}^2} . \]

We should note that F0 and y0 in these calculations can each be positive or negative,
depending on the orientation of the system relative to the positive direction of the Y –axis. Still,
κ must be positive. So, simply to avoid having to keep track of the signs, let us rewrite the above
relation between κ , F0 and y0 as
\[ \kappa = \left| \frac{F_0}{y_0} \right| . \tag{21.3} \]
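
As a small illustration of relation (21.3), the computation in example 21.1 can be sketched in a few lines of Python. The function name `spring_constant` is an assumption made here for illustration; the numbers are those of the example.

```python
def spring_constant(force, displacement):
    """Return kappa = |F0 / y0|, relation (21.3). The name is hypothetical."""
    if displacement == 0:
        raise ValueError("displacement y0 must be nonzero")
    return abs(force / displacement)


G = 9.8          # gravitational acceleration near the Earth's surface, meter/sec^2
mass = 2.0       # kilograms, as in example 21.1
y0 = 1.4 - 1.0   # stretch beyond the natural length, meters

kappa = spring_constant(mass * G, y0)
print(f"kappa = {kappa:.1f} kg/sec^2")
```

Because of the absolute value, the same κ results whichever direction is taken as positive on the Y –axis.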

21.3 Resonance and Sinusoidal Forces

The mass/spring systems being considered here are but a small subset of all the things that naturally
vibrate or oscillate at or around fixed frequencies — consider the swinging of a pendulum after being
pushed, the vibrations of a guitar string or a steel beam after being plucked or struck — even an
ordinary drinking glass may vibrate when lightly struck. And if these vibrating/oscillating systems
are somehow forced to move using a force that, itself, varies periodically, then we may see resonance.
This is the tendency of the system’s vibrations or oscillations to become very large when the force
periodically fluctuates at certain frequencies. Sometimes, these oscillations can be so large that the
system breaks. Because of resonance, bridges have collapsed, singers have shattered glass, and small
but vital parts of motors have broken off at inconvenient moments. (On the other hand, if you are
in a swing, you use resonance in pumping the swing to swing as high as possible, and if you are a
musician, your instrument may well use resonance to amplify the mellow tones you want amplified.
So resonance is not always destructive.)
We can investigate the phenomenon of resonance in our mass/spring system by looking at the
solutions to
\[ m \frac{d^2 y}{dt^2} + \gamma \frac{dy}{dt} + \kappa y = F(t) \]
when F(t) is sinusoidal, that is,

F = F(t) = a cos(ηt) + b sin(ηt)

where a , b and η are constants with η > 0 . Naturally, we call η the forcing angular frequency,
and the corresponding frequency, μ = η/2π , the forcing frequency. To simplify our imagery, let us
use an appropriate trigonometric identity (see page 324), and rewrite this function as a shifted cosine
function,
F(t) = F0 cos(ηt − φ)


where
\[ F_0 = \sqrt{a^2 + b^2} , \quad \cos(\phi) = \frac{a}{F_0} \quad\text{and}\quad \sin(\phi) = \frac{b}{F_0} . \]

(Such a force can be generated by an unbalanced ﬂywheel on the object spinning with angular velocity
η about an axis perpendicular to the Y –axis. Alternatively, one could use a very well-trained ﬂapping
bird.)
The value of φ is relatively unimportant to our investigations, so let’s set φ = 0 and just
consider the system modeled by

\[ m \frac{d^2 y}{dt^2} + \gamma \frac{dy}{dt} + \kappa y = F_0 \cos(\eta t) . \tag{21.4} \]
You can easily verify, at your leisure, that completely analogous results are obtained using φ ≠ 0 .
The only change is that each particular solution y p will have a corresponding nonzero shift.
In all that follows, keep in mind that F0 and η are positive constants. You might even want to
observe that letting η → 0 leads to the constant force case just considered in the previous section.
It is convenient to consider the undamped and damped systems separately. We’ll start with an
ideal mass/spring system in which there is no friction to dampen the motion.

Sinusoidal Force in Undamped Systems

If the system is undamped, equation (21.4) reduces to

\[ m \frac{d^2 y}{dt^2} + \kappa y = F_0 \cos(\eta t) , \]
and the general solution to the corresponding homogeneous equation is
\[ y_h(t) = c_1 \cos(\omega_0 t) + c_2 \sin(\omega_0 t) \quad\text{with}\quad \omega_0 = \sqrt{\frac{\kappa}{m}} . \]

To save a little effort later, let’s observe that the equation for the natural angular frequency ω0 can
be rewritten as κ = m(ω0 )2 . This and a little algebra allow us to rewrite the above differential
equation as
\[ \frac{d^2 y}{dt^2} + (\omega_0)^2 y = \frac{F_0}{m} \cos(\eta t) . \tag{21.5} \]
As discussed in the previous two chapters, the general solution to this is

y(t) = yh (t) + y p (t) = c1 cos(ω0 t) + c2 sin(ω0 t) + y p (t)

where y p is of the form

\[ y_p(t) = \begin{cases} A \cos(\eta t) + B \sin(\eta t) & \text{if } \eta \neq \omega_0 \\ A t \cos(\omega_0 t) + B t \sin(\omega_0 t) & \text{if } \eta = \omega_0 \end{cases} . \]

We now have two cases to consider: the case where η = ω0 , and the case where η ≠ ω0 . Let’s
start with the most interesting of these two cases.

The Case Where η = ω0

If the forcing angular frequency η is the same as the natural angular frequency ω0 of our mass/spring
system, then
y p (t) = At cos(ω0 t) + Bt sin(ω0 t) .


Figure 21.2: Graph of a particular solution exhibiting the “runaway” resonance in an undamped
mass/spring system having natural angular frequency ω0 . (Horizontal axis: T , with 2π/ω0 marked.)

Right off, you can see that this is describing oscillations of larger and larger amplitude as time goes
on. To get a more precise picture of the motion, plug the above formula for y = y p into differential
equation (21.5). You can easily verify that the result is

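\[ \left[ 2B\omega_0 - At(\omega_0)^2 \right] \cos(\omega_0 t) + \left[ -2A\omega_0 - Bt(\omega_0)^2 \right] \sin(\omega_0 t) + (\omega_0)^2 \left[ At \cos(\omega_0 t) + Bt \sin(\omega_0 t) \right] = \frac{F_0}{m} \cos(\omega_0 t) , \]

which simplifies to
\[ 2B\omega_0 \cos(\omega_0 t) - 2A\omega_0 \sin(\omega_0 t) = \frac{F_0}{m} \cos(\omega_0 t) . \]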

Comparing the cosine terms and the sine terms on either side of this equation then gives us the pair
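\[ 2B\omega_0 = \frac{F_0}{m} \quad\text{and}\quad -2A\omega_0 = 0 . \]

Thus,
\[ B = \frac{F_0}{2m\omega_0} \quad\text{and}\quad A = 0 , \]
the particular solution is
\[ y_p(t) = \frac{F_0}{2m\omega_0} \, t \sin(\omega_0 t) , \tag{21.6} \]
and the general solution is
\[ y(t) = y_h(t) + y_p(t) = c_1 \cos(\omega_0 t) + c_2 \sin(\omega_0 t) + \frac{F_0}{2m\omega_0} \, t \sin(\omega_0 t) . \tag{21.7} \]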

The graph of y p is sketched in ﬁgure 21.2. Clearly we have true, “runaway” resonance here.
As time increases, the size of the oscillations is becoming steadily larger, dwarfing those in the
yh term. With each oscillation, the object moves further and further from its equilibrium point,
stretching and compressing the spring more and more (try visualizing that motion!). Wait long
enough, and, according to our model, the magnitude of the oscillations will exceed any size desired
. . . unless the spring breaks.
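
For readers who like a numerical sanity check, here is a minimal Python sketch that tests whether the particular solution (21.6) satisfies equation (21.5) when η = ω0 . The second derivative is approximated by a central difference, so the residual is only expected to be small, not exactly zero. The function names and the sample values of F0 , m and ω0 are assumptions chosen for illustration.

```python
import math


def second_derivative(f, t, h=1e-4):
    """Central-difference approximation of f''(t)."""
    return (f(t + h) - 2.0 * f(t) + f(t - h)) / (h * h)


def resonant_particular(t, F0, m, omega0):
    """Particular solution (21.6): F0 / (2 m omega0) * t * sin(omega0 t)."""
    return F0 / (2.0 * m * omega0) * t * math.sin(omega0 * t)


def residual(t, F0, m, omega0):
    """Left side minus right side of equation (21.5) with eta = omega0."""
    y = lambda s: resonant_particular(s, F0, m, omega0)
    lhs = second_derivative(y, t) + omega0**2 * y(t)
    return lhs - (F0 / m) * math.cos(omega0 * t)


F0, m, omega0 = 3.0, 2.0, 5.0   # illustrative values only
for t in (0.5, 2.0, 7.5):
    # The residual should be near zero; the envelope F0 t / (2 m omega0) grows with t.
    print(t, residual(t, F0, m, omega0), F0 * t / (2.0 * m * omega0))
```

The last printed column is the envelope of the oscillations, which grows linearly with t , in line with the “runaway” behavior described above.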


The Case Where η ≠ ω0

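Plugging
\[ y_p(t) = A \cos(\eta t) + B \sin(\eta t) \]
into equation (21.5) yields
\[ -\eta^2 \left[ A \cos(\eta t) + B \sin(\eta t) \right] + (\omega_0)^2 \left[ A \cos(\eta t) + B \sin(\eta t) \right] = \frac{F_0}{m} \cos(\eta t) , \]
which simplifies to
\[ \left[ (\omega_0)^2 - \eta^2 \right] A \cos(\eta t) + \left[ (\omega_0)^2 - \eta^2 \right] B \sin(\eta t) = \frac{F_0}{m} \cos(\eta t) . \]
Comparing the cosine terms and the sine terms on either side of this equation then gives us the pair
\[ \left[ (\omega_0)^2 - \eta^2 \right] A = \frac{F_0}{m} \quad\text{and}\quad \left[ (\omega_0)^2 - \eta^2 \right] B = 0 . \]
Thus,
\[ A = \frac{F_0}{m \left[ (\omega_0)^2 - \eta^2 \right]} \quad\text{and}\quad B = 0 , \]
the particular solution is
\[ y_p(t) = \frac{F_0}{m \left[ (\omega_0)^2 - \eta^2 \right]} \cos(\eta t) , \tag{21.8} \]
and the general solution is
\[ y(t) = y_h(t) + y_p(t) = c_1 \cos(\omega_0 t) + c_2 \sin(\omega_0 t) + \frac{F_0}{m \left[ (\omega_0)^2 - \eta^2 \right]} \cos(\eta t) . \tag{21.9} \]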

Here, the oscillations in the y p term are not increasing with time. However, if the forcing
angular frequency η is close to the natural angular frequency ω0 of the system (and F0 ≠ 0 ), then

\[ (\omega_0)^2 - \eta^2 \approx 0 \]

and, so, the amplitude of the oscillations in y p ,

\[ \left| \frac{F_0}{m \left[ (\omega_0)^2 - \eta^2 \right]} \right| , \]


will be very large. If we can adjust the forcing angular frequency η (but keeping F0 constant),
then we can make the amplitude of the oscillations in y p as large as we could wish. So, again, our
solutions are exhibiting “resonance” (perhaps we should call this “near resonance”).
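
To get a feel for how quickly this amplitude grows, the following Python sketch evaluates |F0 / (m[(ω0)² − η²])| for forcing frequencies approaching ω0 . The function name `forced_amplitude` and the sample values are assumptions made for illustration.

```python
def forced_amplitude(F0, m, omega0, eta):
    """Amplitude |F0 / (m (omega0^2 - eta^2))| of y_p in (21.8); needs eta != omega0."""
    if eta == omega0:
        raise ValueError("eta == omega0 is the true-resonance case; see formula (21.6)")
    return abs(F0 / (m * (omega0**2 - eta**2)))


F0, m, omega0 = 1.0, 1.0, 5.0   # illustrative values only
for ratio in (0.5, 0.9, 0.99, 0.999):
    # Each step closer to omega0 increases the amplitude.
    print(ratio, forced_amplitude(F0, m, omega0, ratio * omega0))
```

The same function applies for η > ω0 , since only the magnitude of (ω0)² − η² matters.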

Some Comments About What We’ve Just Derived

1. Relevance of the yh term: Because the oscillations in the y p term are not increasing with
time when η ≠ ω0 , every term in formula (21.9) can play a relatively significant role in
the long-term motion of the object in an undamped mass/spring system. In addition, the
oscillations in the yh term can “interfere” with the y p term to prevent y(t) from reaching
its maximum value within the first oscillation from when the object is initially still. In
fact, the interaction of the yh terms with the y p term can lead to some very interesting
motion. However, exploring how yh and y p can interact goes a little outside of our current
discussions of “resonance”. Accordingly, we will delay a more complete discussion of this
interaction to section 21.4, after finishing our discussion of resonance.


2. The limit as near resonance approaches true resonance: The resonant frequency of a system
is the forcing frequency at which resonance is most pronounced for that system. The above
analysis tells us that the resonant frequency for an undamped mass/spring system is the same
as the system’s natural frequency. At least, it tells us that when the forcing function is given
by a cosine function. It turns out that, using more advanced tools, we can show that we get
those ever-increasing oscillations whenever the force is given by a periodic function having
the same frequency as the natural frequency of that undamped mass/spring system.
Something you might expect is that, as η gets closer and closer to the natural angular
frequency ω0 , the corresponding solution y of equation (21.5) satisfying some given initial
values will approach that obtained when η = ω0 . This is, indeed, the case, and its veriﬁcation
will be left as an exercise (exercise 21.5 on page 414).

3. Limitations in our model: Keep in mind that our model for the mass/spring system was
based on certain assumptions regarding the behavior of springs. In particular, the κ term in
our differential equation came from Hooke’s law,

Fspring (y) = −κ y ,

relating the spring’s force to the object’s position. As we noted after deriving Hooke’s law
(page 321), this is a good model for the spring force, provided the spring is not stretched
or compressed too much. So if our formulas for y(t) have |y(t)| becoming too large for
Hooke’s law to remain valid, then these formulas are probably not that accurate after
|y(t)| becomes that large. Precisely what happens after the oscillations become so large that
our model is no longer valid will depend on the spring and the force.

Sinusoidal Force in Damped Systems

If the system is damped, then we need to consider equation (21.4),

\[ m \frac{d^2 y}{dt^2} + \gamma \frac{dy}{dt} + \kappa y = F_0 \cos(\eta t) , \tag{21.4$'$} \]

assuming 0 < γ . As noted a few pages ago, the terms in yh , the solution to the corresponding
homogeneous differential equation, all contain decaying exponential factors. Hence,

yh (t) → 0 as t → ∞ ,

and we can assume a particular solution of the form

y p (t) = A cos(ηt) + B sin(ηt) .

We can then write the general solution to our nonhomogeneous differential equation as

y(t) = yh (t) + y p (t) = yh (t) + A cos(ηt) + B sin(ηt) ,

and observe that, as t → ∞ ,

y(t) = yh (t) + y p (t) → 0 + y p (t) = A cos(ηt) + B sin(ηt) .

This tells us that any long-term behavior of y depends only on y p , and may explain why, in these
cases, we refer to yh (t) as the transient part of the solution and y p (t) as the steady-state part of the
solution.


The analysis of the particular solution,

y p (t) = A cos(ηt) + B sin(ηt) ,

is relatively straightforward, but a little tedious. We’ll leave the computational details to the interested
reader (exercise 21.6 on page 414), and quickly summarize the high points.
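Plugging the above formula for y p into our differential equation and solving for A and B
yields
\[ y_p(t) = A \cos(\eta t) + B \sin(\eta t) \tag{21.10a} \]
with
\[ A = \frac{\left( \kappa - m\eta^2 \right) F_0}{\left( \kappa - m\eta^2 \right)^2 + \eta^2 \gamma^2} \quad\text{and}\quad B = \frac{\eta \gamma F_0}{\left( \kappa - m\eta^2 \right)^2 + \eta^2 \gamma^2} . \tag{21.10b} \]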

Using a little trigonometry, we can rewrite this as

y p (t) = C cos(ηt − φ) (21.11a)

where the amplitude of these forced vibrations is

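\[ C = \frac{F_0}{\sqrt{\left( \kappa - m\eta^2 \right)^2 + \eta^2 \gamma^2}} \tag{21.11b} \]
and φ satisfies
\[ \cos(\phi) = \frac{A}{C} \quad\text{and}\quad \sin(\phi) = \frac{B}{C} . \tag{21.11c} \]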
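
The steady-state coefficients can be checked numerically. The Python sketch below takes A = (κ − mη²)F0/D and B = ηγF0/D with D = (κ − mη²)² + η²γ² (the values obtained by matching the cosine and sine terms after substituting y p into the differential equation), confirms that C = √(A² + B²), and evaluates the residual of the differential equation using central differences. Function names and parameter values are assumptions for illustration.

```python
import math


def steady_state_coefficients(m, gamma, kappa, F0, eta):
    """Return (A, B, C) for y_p = A cos(eta t) + B sin(eta t) = C cos(eta t - phi)."""
    k = kappa - m * eta**2
    D = k**2 + (eta * gamma)**2
    A = k * F0 / D
    B = eta * gamma * F0 / D
    C = F0 / math.sqrt(D)
    return A, B, C


def ode_residual(t, m, gamma, kappa, F0, eta, h=1e-4):
    """m y'' + gamma y' + kappa y - F0 cos(eta t), derivatives by central differences."""
    A, B, _ = steady_state_coefficients(m, gamma, kappa, F0, eta)
    y = lambda s: A * math.cos(eta * s) + B * math.sin(eta * s)
    d1 = (y(t + h) - y(t - h)) / (2.0 * h)
    d2 = (y(t + h) - 2.0 * y(t) + y(t - h)) / (h * h)
    return m * d2 + gamma * d1 + kappa * y(t) - F0 * math.cos(eta * t)


params = dict(m=2.0, gamma=1.5, kappa=49.0, F0=3.0, eta=4.0)   # illustrative values
A, B, C = steady_state_coefficients(**params)
print(A, B, C, math.hypot(A, B))     # last two entries should agree
print(ode_residual(1.3, **params))   # should be close to zero
```

As a further consistency check, letting γ → 0 in these formulas gives B → 0 and A → F0/(κ − mη²), which matches formula (21.8) for the undamped system.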

Note that the amplitude, C , does not blow up with time, nor does it become inﬁnite for any
forcing angular frequency η . So we do not have the “runaway” resonance exhibited by an undamped
mass/spring system. Still this amplitude does vary with the forcing frequency. With a little work,
you can show that the amplitude of the forced vibrations has a maximum value provided the friction
is not too great. To be specific, if
\[ \gamma < \sqrt{2\kappa m} , \]
then the maximum amplitude occurs when the forcing angular frequency is
\[ \eta_0 = \sqrt{\frac{\kappa}{m} - \frac{\gamma^2}{2m^2}} . \tag{21.12} \]
This is the resonant angular frequency for the given damped mass/spring system. Plugging this value
for η back into formula (21.11b) then yields (after a little algebra) the maximum amplitude
\[ C_{\max} = \frac{2m F_0}{\gamma \sqrt{4\kappa m - \gamma^2}} . \tag{21.13} \]
If, on the other hand, γ > √(2κm) , then there is no maximum value for the amplitude as η
varies over (0, ∞) . Instead, the amplitude steadily decreases as η increases.
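
Formulas (21.12) and (21.13) can be compared against a brute-force search over forcing frequencies. The Python sketch below is only a numerical illustration; the function names, the sample parameters and the search grid are assumptions.

```python
import math


def amplitude(eta, m, gamma, kappa, F0):
    """Amplitude C of the forced vibrations, formula (21.11b)."""
    return F0 / math.sqrt((kappa - m * eta**2)**2 + (eta * gamma)**2)


def resonant_angular_frequency(m, gamma, kappa):
    """eta_0 from (21.12), or None when there is no resonant frequency."""
    radicand = kappa / m - gamma**2 / (2.0 * m**2)
    return math.sqrt(radicand) if radicand > 0 else None


def max_amplitude(m, gamma, kappa, F0):
    """C_max from (21.13); assumes gamma < sqrt(2 kappa m)."""
    return 2.0 * m * F0 / (gamma * math.sqrt(4.0 * kappa * m - gamma**2))


m, gamma, kappa, F0 = 2.0, 1.5, 49.0, 3.0     # illustrative values only
eta0 = resonant_angular_frequency(m, gamma, kappa)

# Brute-force search for the maximizing eta on a grid from 0.001 to 20.
grid = [0.001 * i for i in range(1, 20001)]
eta_best = max(grid, key=lambda e: amplitude(e, m, gamma, kappa, F0))

print(eta0, eta_best)
print(max_amplitude(m, gamma, kappa, F0), amplitude(eta0, m, gamma, kappa, F0))
```

With heavier damping (for instance γ = 20 with the same m and κ ), `resonant_angular_frequency` returns `None`, corresponding to the case in which the amplitude simply decreases as η increases.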
You might recall that a damped mass/spring system given by

\[ m \frac{d^2 y}{dt^2} + \gamma \frac{dy}{dt} + \kappa y = F_0 \cos(\eta t) \]
is further classiﬁed, respectively, as

underdamped , critically damped or overdamped


according to whether
\[ 0 < \gamma < 2\sqrt{\kappa m} , \quad \gamma = 2\sqrt{\kappa m} \quad\text{or}\quad 2\sqrt{\kappa m} < \gamma . \]
Since “resonance” can only occur if γ < √(2κm) (and since √2 < 2 ), it should be clear that it
makes no sense to talk about resonant frequencies for critically or overdamped systems. Indeed,
we can even further subdivide the underdamped systems into those having resonant frequencies and
those that do not.
Finally, let us also observe that the formula for η0 can be rewritten as
\[ \eta_0 = \sqrt{(\omega_0)^2 - \frac{1}{2} \left( \frac{\gamma}{m} \right)^2} \]

where, as you should recall,
\[ \omega_0 = \sqrt{\frac{\kappa}{m}} \]
is the natural angular frequency of the corresponding undamped system. From this we see that the
resonant frequency of the damped system is always less than the natural frequency of the undamped
system. In fact, as γ increases from 0 to mω0 √2 , the damped system’s resonant angular frequency
shrinks from ω0 to 0 .

21.4 More on Undamped Motion under Nonresonant Sinusoidal Forces

When two or more sinusoidal functions of different frequencies are added together, they can
alternately amplify and interfere with each other to produce a graph that looks somewhat like a single
sinusoidal function whose amplitude varies in some regular fashion. This is illustrated in figure 21.3
in which graphs of
cos(ηt) − cos(ω0 t)
have been sketched using one value for ω0 and two values for η . The first figure (figure 21.3a)
illustrates what is commonly called the beat phenomenon, in which we appear to have a fairly high
frequency sinusoidal whose amplitude seems to be given by another, more slowly varying sinusoidal.
This slowly varying sinusoidal gives us the individual “beats” in which the high frequency function
intensifies and fades (figure 21.3a shows three beats).
This beat phenomenon is typical of the sum (or difference) of two sinusoidal functions of almost
the same frequency, and can be analyzed somewhat using trigonometric identities. For the functions
graphed in figure 21.3a we can use basic trigonometric identities to show that

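\[ \cos(\eta t) - \cos(\omega_0 t) = -2 \sin\left( \frac{\eta + \omega_0}{2} t \right) \sin\left( \frac{\eta - \omega_0}{2} t \right) . \]

Thus, we have

\[ \cos(\eta t) - \cos(\omega_0 t) = A(t) \sin\left( \omega_{\text{high}} t \right) \quad\text{with}\quad \omega_{\text{high}} = \frac{\eta + \omega_0}{2} \]

where
\[ A(t) = \pm 2 \sin\left( \omega_{\text{low}} t \right) \quad\text{with}\quad \omega_{\text{low}} = \frac{\left| \omega_0 - \eta \right|}{2} . \]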


Figure 21.3: Graphs of cos(ηt) − cos(ω0 t) with ω0 = 2 and with (a) η = 0.9 ω0 and (b)
η = 0.1 ω0 , each plotted as Y against T . (Drawn using the same horizontal scales for both graphs.)

The angular frequency of the high-frequency wiggles in figure 21.3a is approximately ωhigh , while
ωlow corresponds to the angular frequency of pairs of beats. (Visualizing A(t) as a slowly varying
amplitude only makes sense if A(t) varies much more slowly than sin(ωhigh t) . And, if you think
about it, you will realize that, if η ≈ ω0 , then

\[ \omega_{\text{high}} = \frac{\eta + \omega_0}{2} \approx \omega_0 \quad\text{and}\quad \omega_{\text{low}} = \frac{\left| \omega_0 - \eta \right|}{2} \approx 0 . \]
So this analysis is justiﬁed if the forcing frequency is close, but not equal, to the resonant frequency.)
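
The product form of cos(ηt) − cos(ω0 t) used above is easy to spot-check numerically. The Python sketch below compares the two sides on a grid of times, using the values ω0 = 2 and η = 0.9 ω0 from figure 21.3a. The function names are assumptions, and the “beat length” π/ωlow is an inference from the envelope ±2 sin(ωlow t), not a formula quoted from the text.

```python
import math


def difference_of_cosines(t, eta, omega0):
    return math.cos(eta * t) - math.cos(omega0 * t)


def product_form(t, eta, omega0):
    """-2 sin((eta + omega0) t / 2) sin((eta - omega0) t / 2)."""
    return -2.0 * math.sin(0.5 * (eta + omega0) * t) * math.sin(0.5 * (eta - omega0) * t)


omega0 = 2.0
eta = 0.9 * omega0

# Largest disagreement between the two forms for t = 0, 0.01, ..., 100.
worst = max(
    abs(difference_of_cosines(0.01 * i, eta, omega0) - product_form(0.01 * i, eta, omega0))
    for i in range(10001)
)
print("largest discrepancy:", worst)

omega_low = abs(omega0 - eta) / 2.0
beat_length = math.pi / omega_low   # time between consecutive zeros of the envelope
print("approximate length of one beat:", beat_length)
```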
The general phenomenon just described (with or without “beats”) occurs whenever we have a
linear combination of sinusoidal functions. In particular, it becomes relevant whenever describing
the behavior of an undamped mass/spring system with a sinusoidal forcing function not at resonant
frequency. Let’s do one general example:

Example 21.2: Consider an undamped mass/spring system having resonant angular frequency
ω0 under the influence of a force given by

\[ F(t) = F_0 \cos(\eta t) \]

where η ≠ ω0 . Assume further that the object in the system (with mass m ) is initially at rest.
In other words, we want to find the solution to the initial-value problem

\[ \frac{d^2 y}{dt^2} + (\omega_0)^2 y = \frac{F_0}{m} \cos(\eta t) \quad\text{with}\quad y(0) = 0 \quad\text{and}\quad y'(0) = 0 . \]
From our work a few pages ago (see equation (21.9)), we know
\[ y(t) = c_1 \cos(\omega_0 t) + c_2 \sin(\omega_0 t) + \frac{F_0}{m \left[ (\omega_0)^2 - \eta^2 \right]} \cos(\eta t) . \]

To satisfy the initial conditions, we must then have

\[ 0 = y(0) = c_1 \cos(0) + c_2 \sin(0) + \frac{F_0}{m \left[ (\omega_0)^2 - \eta^2 \right]} \cos(0) \]
and
\[ 0 = y'(0) = -c_1 \omega_0 \sin(0) + c_2 \omega_0 \cos(0) - \frac{F_0 \eta}{m \left[ (\omega_0)^2 - \eta^2 \right]} \sin(0) , \]

which simplifies to the pair

\[ 0 = c_1 + \frac{F_0}{m \left[ (\omega_0)^2 - \eta^2 \right]} \quad\text{and}\quad 0 = c_2 \omega_0 . \]


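So
\[ c_1 = -\frac{F_0}{m \left[ (\omega_0)^2 - \eta^2 \right]} , \quad c_2 = 0 , \]
and
\[ \begin{aligned} y(t) &= -\frac{F_0}{m \left[ (\omega_0)^2 - \eta^2 \right]} \cos(\omega_0 t) + \frac{F_0}{m \left[ (\omega_0)^2 - \eta^2 \right]} \cos(\eta t) \\ &= \frac{F_0}{m \left[ (\omega_0)^2 - \eta^2 \right]} \left[ \cos(\eta t) - \cos(\omega_0 t) \right] . \end{aligned} \]

If η = 0.9 ω0 , the last formula for y reduces to

\[ y(t) = \frac{100}{19} \cdot \frac{F_0}{m(\omega_0)^2} \left[ \cos(\eta t) - \cos(\omega_0 t) \right] , \]

and the graph of the object’s position at time t is the same as the graph in figure 21.3a with the
amplitude multiplied by
\[ \frac{100}{19} \cdot \frac{F_0}{m(\omega_0)^2} . \]
If η = 0.1 ω0 , then
\[ y(t) = \frac{100}{99} \cdot \frac{F_0}{m(\omega_0)^2} \left[ \cos(\eta t) - \cos(\omega_0 t) \right] , \]

and the graph of the object’s position at time t is the same as the graph in figure 21.3b with the
amplitude multiplied by
\[ \frac{100}{99} \cdot \frac{F_0}{m(\omega_0)^2} \]

(which, it should be noted, is approximately 1/5 the amplitude when η = 0.9ω0 ).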
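
The multipliers 100/19 and 100/99 in this example both come from the factor 1/(1 − (η/ω0)²). A short Python sketch, with hypothetical function names and illustrative values, makes the arithmetic explicit and confirms that the solution starts at rest.

```python
import math


def position(t, F0, m, omega0, eta):
    """y(t) = F0 / (m (omega0^2 - eta^2)) * (cos(eta t) - cos(omega0 t)), eta != omega0."""
    scale = F0 / (m * (omega0**2 - eta**2))
    return scale * (math.cos(eta * t) - math.cos(omega0 * t))


def scale_factor(ratio):
    """Multiplier of F0 / (m omega0^2) when eta = ratio * omega0 (ratio != 1)."""
    return 1.0 / (1.0 - ratio**2)


print(scale_factor(0.9), 100.0 / 19.0)          # the two values should agree
print(scale_factor(0.1), 100.0 / 99.0)          # likewise
print(scale_factor(0.1) / scale_factor(0.9))    # roughly one fifth
print(position(0.0, 1.0, 1.0, 2.0, 1.8))        # initial position
```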

Exercise 21.1: Consider the mass/spring system just discussed in the last example. Using the
graphs in ﬁgure 21.3, try to visualize the motion of the object in this system
a: when the forcing frequency is 0.9 the natural frequency.
b: when the forcing frequency is 0.1 the natural frequency.

Additional Exercises

21.2. A spring, whose natural length is 0.1 meter, is stretched to an equilibrium length of 0.12
meter when suspended vertically (near the Earth’s surface) with a 0.01 kilogram mass at
the end.
a. Find the spring constant κ for this spring.
b. Find the natural angular frequency ω0 and the natural frequency ν0 for this mass/spring
system, assuming the system is undamped.

21.3. All of the following concern a single spring of natural length 1 meter mounted vertically
with one end attached to the ﬂoor (as in ﬁgure 21.1b on page 401).


a. Suppose we place a 25 kilogram box of frozen ducks on top of the spring, and, after
moving the box down to its equilibrium point, we find that the length of the spring is now
0.9 meter.
i. What is the spring constant for this spring?
ii. What is the natural angular frequency of the mass/spring system assuming the system
is undamped?
iii. Approximately how many times per second will this box bob up and down assuming
the system is undamped and the box is moved from its equilibrium point and released?
(In other words, what is the natural frequency, ν0 ?)
b. Suppose we replace the box of frozen ducks with a single 2 kilogram chicken.
i. Now what is the equilibrium length of the spring?
ii. What is the natural angular frequency of the undamped chicken/spring system?
iii. Assuming the system is undamped, not initially at equilibrium, and the chicken is not
flapping its wings, how many times per second does this bird bob up and down?
c. Next, the chicken is replaced with a box of imported fruit. After the box stops bobbing
up and down, we find that the length of the spring is 0.85 meter. What is the mass of this
box of fruit?
d. Finally, everything is taken off the spring, and a bunch of red, helium-filled balloons is
tied onto the end of the spring, stretching it to a new equilibrium length of 1.02 meters.
What is the buoyant force of this bunch of balloons?

21.4. A live 2 kilogram chicken is securely attached to the top of the floor-mounted spring
of natural length 1 meter (similar to that described in exercise 21.3, above). Nothing else
is on the spring. Knowing that the spring will break if it is stretched or compressed by half
its natural length, and hoping to use the resonance of the system to stretch or compress the
spring to its breaking point, the chicken starts flapping its wings. The force generated by
the chicken’s flapping wings t seconds after it starts to flap is

F(t) = F0 cos(2π μt)

where μ is the frequency of the wing ﬂapping (ﬂaps/second) and

\[ F_0 = 3 \ \frac{\text{kg}\cdot\text{meter}}{\text{sec}^2} . \]

For the following exercises, also assume the following:

1. This chicken/spring system is undamped and has natural frequency ν0 = 6 (hertz).

2. The model given by differential equation (21.5) on page 405 is valid for this
chicken/spring system right up to the point where the spring breaks.

3. The chicken’s position at time t , y(t) is just given by the particular solution y p
found by the method of educated guess (formula (21.6) on page 406 or formula
(21.8) on page 407, depending on μ ).
a. Suppose the chicken ﬂaps at the natural frequency of the system.
i. What is the formula for the chicken’s position at time t ?


ii. When does the amplitude of the oscillations become large enough to break the
spring?
b. Suppose that the chicken manages to consistently flap its wings 3 times per second.
i. What is the formula for the chicken’s position at time t ?
ii. Does the chicken break the spring? If so, when?
c. What is the range of values for μ , the flap frequency, that the chicken can flap at, eventually
breaking the spring? (That is, find the minimum and maximum values of μ so that the
corresponding near resonance will stretch or compress the spring enough to break it.)

21.5. For each η > 0 , let yη be the solution to

\[ \frac{d^2 y_\eta}{dt^2} + (\omega_0)^2 y_\eta = \frac{F_0}{m} \cos(\eta t) \quad\text{with}\quad y_\eta(0) = 0 \quad\text{and}\quad y_\eta'(0) = 0 \]

where ω0 , m and F0 are all positive constants. (Note that this describes an undamped
mass/spring system in which the mass is initially at rest.)
a. Find yη (t) assuming η ≠ ω0 .
b. Find yη (t) assuming η = ω0 .
c. Verify that, for each t ,
\[ \lim_{\eta \to \omega_0} y_\eta(t) = y_{\omega_0}(t) . \]

d. Using a computer math package, sketch the graph of yη from t = 0 to t = 100 when
ω0 = 5 , F0/m = 1 , and
i. η = 0.1 ω0 ii. η = 0.5 ω0 iii. η = 0.75 ω0 iv. η = 0.9 ω0
v. η = 0.99 ω0 vi. η = ω0 vii. η = 1.1 ω0 viii. η = 2 ω0
In particular, observe what happens when η ≈ ω0 , and how these graphs illustrate the
result given in part c of this exercise.

21.6. Consider a damped mass/spring system given by

\[ m \frac{d^2 y}{dt^2} + \gamma \frac{dy}{dt} + \kappa y = F_0 \cos(\eta t) \]

where m , γ , κ and F0 are all positive constants. (This is the same as equation (21.4).)

a. Using the method of educated guess, derive the particular solution given by equation set
(21.10) on page 409.
b. Then show that the solution in the previous part can be rewritten as described by equation
set (21.11) on page 409; that is, verify that the solution can be written as

y p (t) = C cos(ηt − φ)

where the amplitude of these forced vibrations is

\[ C = \frac{F_0}{\sqrt{\left( \kappa - m\eta^2 \right)^2 + \eta^2 \gamma^2}} \]


and φ satisfies
\[ \cos(\phi) = \frac{A}{C} \quad\text{and}\quad \sin(\phi) = \frac{B}{C} . \]

c. Next, by “finding the maximum of C with respect to η ”, show that the angular resonant
frequency of the system is
\[ \eta_0 = \sqrt{\frac{\kappa}{m} - \frac{\gamma^2}{2m^2}} , \]
and that the corresponding maximum amplitude is
\[ C_{\max} = \frac{2m F_0}{\gamma \sqrt{4\kappa m - \gamma^2}} \]
provided γ < √(2κm) . What happens if, instead, γ ≥ √(2κm) ?
d. Assume that γ < √(2κm) . Then the system is underdamped and, as noted in chapter 16,
the general solution to the corresponding homogeneous equation is

\[ y_h(t) = c_1 e^{-\alpha t} \cos(\omega t) + c_2 e^{-\alpha t} \sin(\omega t) \]

where α is the decay coefﬁcient and ω is the angular quasi-frequency. Verify that this
ω , the resonant angular frequency η0 , and the natural angular frequency ω0 of the
corresponding undamped system are related by
\[ (\eta_0)^2 + \left( \frac{\gamma}{2m} \right)^2 = \omega^2 = (\omega_0)^2 - \left( \frac{\gamma}{2m} \right)^2 , \]

and, from this, conclude that η0 < ω < ω0 .


22
Variation of Parameters
(A Better Reduction of Order Method)

“Variation of parameters” is another way to solve nonhomogeneous linear differential equations, be
they second order,
\[ ay'' + by' + cy = g , \]
or even higher order,
\[ a_0 y^{(N)} + a_1 y^{(N-1)} + \cdots + a_{N-1} y' + a_N y = g . \]

One advantage of this method over the method of undetermined coefficients from chapter 20 is
that the differential equation does not have to be simple enough that we can ‘guess’ the form for
a particular solution. In theory, the method of variation of parameters will work whenever g and
the coefficients are reasonably continuous functions. As you may expect, though, it is not quite as
simple a method as the method of guess. So, for ‘sufficiently simple’ differential equations, you
may still prefer using the guess method instead of what we’ll develop here.
We will first develop the variation of parameters method for second-order equations. Then we
will see how to extend it to deal with differential equations of even higher order.1 As you will see,
the method is really just a very clever improvement on the reduction of order method for solving
nonhomogeneous equations.

22.1 Second-Order Variation of Parameters

Derivation of the Method
Suppose we want to solve a second-order nonhomogeneous differential equation

\[ ay'' + by' + cy = g \]

over some interval of interest, say,

\[ x^2 y'' - 2x y' + 2y = 3x^2 \quad\text{for}\quad x > 0 . \]

Let us also assume that the corresponding homogeneous equation,

\[ ay'' + by' + cy = 0 , \]

has already been solved. That is, we already have an independent pair of functions y1 = y1 (x) and
y2 = y2 (x) for which
\[ y_h(x) = c_1 y_1(x) + c_2 y_2(x) \]
is a general solution to the homogeneous equation.

1 It is possible to use a “variation of parameters” method to solve first-order nonhomogeneous linear equations, but that
would be just plain silly.

For our example,

\[ x^2 y'' - 2x y' + 2y = 3x^2 , \]
the corresponding homogeneous equation is the Euler equation

\[ x^2 y'' - 2x y' + 2y = 0 . \]

You can easily verify that this homogeneous equation is satisfied if y is either

\[ y_1 = x \quad\text{or}\quad y_2 = x^2 . \]

Clearly, the set {x, x²} is linearly independent, and, so, the general solution to the
corresponding homogeneous equation is

\[ y_h = c_1 x + c_2 x^2 . \]

Now, in using reduction of order to solve our nonhomogeneous equation

\[ ay'' + by' + cy = g , \]

we would ﬁrst assume a solution of the form

y = y0 u

where u = u(x) is an unknown function ‘to be determined’, and y0 = y0 (x) is any single solution
to the corresponding homogeneous equation. However, we do not just have a single solution to the
corresponding homogeneous equation; we have two: y1 and y2 (along with all linear combinations
of these two). So why don’t we use both of these solutions and assume, instead, a solution of
the form
y = y1 u + y2 v

where y1 and y2 are the two solutions to the corresponding homogeneous equation already found,
and u = u(x) and v = v(x) are two unknown functions to be determined.

For our example,

\[ x^2 y'' - 2x y' + 2y = 3x^2 , \]
we already have that
\[ y_1 = x \quad\text{and}\quad y_2 = x^2 \]
form a fundamental pair of solutions to the corresponding homogeneous differential
equation. So, in this case, the assumption that

\[ y = y_1 u + y_2 v \]
is
\[ y = xu + x^2 v \]

where u = u(x) and v = v(x) are two functions to be determined.


For reasons that will be clear by the end of this section, let us divide this equation
through by x² , giving us
\[ u' + 2xv' = 3 . \tag{22.2} \]
Keep in mind that this is what our differential equation reduces to if we start by letting

\[ y = xu + x^2 v \]

and requiring that

\[ xu' + x^2 v' = 0 . \]
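
For reference, the computation leading to equation (22.2) can be sketched as follows. This is a reconstruction that assumes only the substitution y = xu + x²v and the requirement xu′ + x²v′ = 0 stated above; the bracketed term in the second line is the one that this requirement sets to zero.

```latex
\begin{aligned}
y   &= xu + x^2 v , \\
y'  &= u + 2xv + \left[ xu' + x^2 v' \right] = u + 2xv , \\
y'' &= u' + 2v + 2xv' , \\
x^2 y'' - 2x y' + 2y
    &= x^2 \left[ u' + 2v + 2xv' \right] - 2x \left[ u + 2xv \right] + 2 \left[ xu + x^2 v \right] \\
    &= x^2 u' + 2x^3 v' .
\end{aligned}
```

Setting the last expression equal to 3x² and dividing through by x² gives u′ + 2xv′ = 3 .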

Now back to the general case, where our differential equation is

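\[ ay'' + by' + cy = g . \]

If we set
\[ y = y_1 u + y_2 v \]
where y1 and y2 are solutions to the corresponding homogeneous equation, and require that

\[ y_1 u' + y_2 v' = 0 , \]

then
\[ \begin{aligned} y' &= \left( y_1 u + y_2 v \right)' \\ &= \left( y_1 u \right)' + \left( y_2 v \right)' \\ &= y_1' u + y_1 u' + y_2' v + y_2 v' \\ &= y_1' u + y_2' v + \underbrace{y_1 u' + y_2 v'}_{0} . \end{aligned} \]

So
\[ y' = y_1' u + y_2' v , \]
and

\[ \begin{aligned} y'' &= \left( y_1' u + y_2' v \right)' \\ &= y_1'' u + y_1' u' + y_2'' v + y_2' v' \\ &= y_1' u' + y_2' v' + y_1'' u + y_2'' v . \end{aligned} \]

Remember, y1 and y2 , being solutions to the corresponding homogeneous equation, satisfy

\[ a y_1'' + b y_1' + c y_1 = 0 \quad\text{and}\quad a y_2'' + b y_2' + c y_2 = 0 . \]

Using all the above, we have

\[ ay'' + by' + cy = g \]

\[ \longrightarrow \quad a \left[ y_1' u' + y_2' v' + y_1'' u + y_2'' v \right] + b \left[ y_1' u + y_2' v \right] + c \left[ y_1 u + y_2 v \right] = g \]

\[ \longrightarrow \quad a \left[ y_1' u' + y_2' v' \right] + \underbrace{\left[ a y_1'' + b y_1' + c y_1 \right]}_{0} u + \underbrace{\left[ a y_2'' + b y_2' + c y_2 \right]}_{0} v = g . \]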

The vanishing of the u and v terms should not be surprising. A similar vanishing occurred in the
original reduction of order method. What we also have here, thanks to the ‘other equation’ that


we chose, is that no second-order derivatives of u or v occur either. Consequently, our original
differential equation,
\[ ay'' + by' + cy = g , \]
reduces to
\[ a \left[ y_1' u' + y_2' v' \right] = g . \]
Dividing this by a then yields
\[ y_1' u' + y_2' v' = \frac{g}{a} . \]
Keep in mind what the last equation is. It is what our original differential equation reduces to
after setting
\[ y = y_1 u + y_2 v \tag{22.3} \]

(where y1 and y2 are solutions to the corresponding homogeneous equation), and requiring that

\[ y_1 u' + y_2 v' = 0 . \]

This means that the derivatives u′ and v′ of the unknown functions in formula (22.3) must satisfy
the pair (or system) of equations
\[ \begin{aligned} y_1 u' + y_2 v' &= 0 \\ y_1' u' + y_2' v' &= \frac{g}{a} \end{aligned} \ . \]

This system can be easily solved for u′ and v′ . Integrating what we get for u′ and v′ then gives
us the formulas for u and v which we can plug back into formula (22.3) for y , the solution to our
nonhomogeneous differential equation.
Let’s ﬁnish our example:

We have
\[ y = xu + x^2 v \]
where u and v satisfy the system

\[ \begin{aligned} xu' + x^2 v' &= 0 \\ u' + 2xv' &= 3 \end{aligned} \ . \]

(The first was the equation we chose to require; the second was what the differential
equation reduced to.) From the first equation in this system, we have that

\[ u' = -xv' . \]

Combining this with the second equation:

\[ u' + 2xv' = 3 \quad\longrightarrow\quad -xv' + 2xv' = 3 \quad\longrightarrow\quad xv' = 3 \quad\longrightarrow\quad v' = \frac{3}{x} . \]


Hence, also,
\[ u' = -xv' = -x \cdot \frac{3}{x} = -3 . \]
Remember, the primes denote differentiation with respect to x . So we have
\[ \frac{du}{dx} = -3 \quad\text{and}\quad \frac{dv}{dx} = \frac{3}{x} . \]
Integrating yields

\[ u = \int \frac{du}{dx} \, dx = \int -3 \, dx = -3x + c_1 \]
and
\[ v = \int \frac{dv}{dx} \, dx = \int \frac{3}{x} \, dx = 3 \ln |x| + c_2 . \]

Plugging these into the formula for y , we get

\[ \begin{aligned} y &= xu + x^2 v \\ &= x \left[ -3x + c_1 \right] + x^2 \left[ 3 \ln |x| + c_2 \right] \\ &= -3x^2 + c_1 x + 3x^2 \ln |x| + c_2 x^2 \\ &= 3x^2 \ln |x| + c_1 x + (c_2 - 3) x^2 , \end{aligned} \]

which simplifies a little to

\[ y = 3x^2 \ln |x| + C_1 x + C_2 x^2 . \]

This, at long last, is our solution to

\[ x^2 y'' - 2x y' + 2y = 3x^2 . \]
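
The steps of this example can be mirrored in a short Python sketch. It solves the 2×2 system for u′ and v′ by Cramer's rule at a few sample points, then checks numerically (with central differences) that the final formula for y satisfies the differential equation. The function names, the sample points and the constants C1 = 1 , C2 = −2 are assumptions chosen for illustration.

```python
import math


def solve_for_derivatives(y1, y2, dy1, dy2, f):
    """Solve  y1 u' + y2 v' = 0,  y1' u' + y2' v' = f  by Cramer's rule."""
    W = y1 * dy2 - dy1 * y2          # determinant of the system
    if W == 0:
        raise ValueError("y1 and y2 are not independent at this point")
    return -y2 * f / W, y1 * f / W   # (u', v')


def general_solution(x, C1=0.0, C2=0.0):
    """y = 3 x^2 ln|x| + C1 x + C2 x^2, the solution found in the example."""
    return 3.0 * x**2 * math.log(abs(x)) + C1 * x + C2 * x**2


def residual(x, C1, C2, h=1e-3):
    """x^2 y'' - 2 x y' + 2 y - 3 x^2, derivatives by central differences."""
    y = lambda s: general_solution(s, C1, C2)
    d1 = (y(x + h) - y(x - h)) / (2.0 * h)
    d2 = (y(x + h) - 2.0 * y(x) + y(x - h)) / (h * h)
    return x**2 * d2 - 2.0 * x * d1 + 2.0 * y(x) - 3.0 * x**2


for x in (0.5, 1.0, 3.0):
    # Here y1 = x, y2 = x^2, and g/a = 3x^2 / x^2 = 3.
    du, dv = solve_for_derivatives(x, x**2, 1.0, 2.0 * x, 3.0)
    print(x, du, dv, residual(x, 1.0, -2.0))
```

At each sample point the computed u′ and v′ should agree with −3 and 3/x , and the residual should be close to zero.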

Before summarizing our work (and reducing it to a fairly simple procedure) let us make two
observations based on the above:
1. If we keep the arbitrary constants arising from the indefinite integrals of u′ and v′ , then
the resulting formula for y is a general solution to the nonhomogeneous equation. If we
drop the arbitrary constants (or use definite integrals), then we will end up with a particular
solution.
2. After plugging the formulas for u and v into y = y1 u + y2 v , some of the resulting terms
can often be absorbed into other terms. (In the above, for example, we absorbed the −3x²
and c2 x² terms into one C2 x² term.)

Summary: How to Do It
If you look back over our derivation, you will see that we have the following:

Equations
Take another quick look at part of our derivation in the previous section. In setting

y = y1 u + y2 v
and then requiring
\[ y_1 u' + y_2 v' = 0 , \]


we ensured that the formula for y′ ,

\[ \begin{aligned} y' = \left( y_1 u + y_2 v \right)' &= y_1' u + y_1 u' + y_2' v + y_2 v' \\ &= y_1' u + y_2' v + \left[ y_1 u' + y_2 v' \right] = y_1' u + y_2' v + 0 , \end{aligned} \]

contains no derivatives of the unknown functions u and v .

Suppose, instead, that we have three known functions $y_1$ , $y_2$ and $y_3$ , and we set
\[ y = y_1 u + y_2 v + y_3 w \]
where u , v and w are unknown functions. For the same reasons as before, requiring that
\[ y_1 u' + y_2 v' + y_3 w' = 0 \tag{22.6} \]
will ensure that the formula for $y'$ contains no derivatives of u , v and w , but will simply be
\[ y' = y_1' u + y_2' v + y_3' w . \]

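Differentiating this yields
\[
\begin{aligned}
y'' &= y_1'' u + y_1' u' + y_2'' v + y_2' v' + y_3'' w + y_3' w' \\
&= y_1'' u + y_2'' v + y_3'' w + \left[y_1' u' + y_2' v' + y_3' w'\right] ,
\end{aligned}
\]
which reduces to
\[ y'' = y_1'' u + y_2'' v + y_3'' w \]
provided we require that
\[ y_1' u' + y_2' v' + y_3' w' = 0 . \tag{22.7} \]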
Thus, requiring equations (22.6) and (22.7) prevents derivatives of the unknown functions from
appearing in the formulas for either $y'$ or $y''$ . As you can easily verify, differentiating the last
formula for $y''$ and plugging the above formulas for $y'$ , $y''$ and $y'''$ into a third-order differential
equation
\[ a_0 y''' + a_1 y'' + a_2 y' + a_3 y = g \]
then yields
\[ a_0\left[y_1'' u' + y_2'' v' + y_3'' w'\right] + [\cdots]u + [\cdots]v + [\cdots]w = g \tag{22.8} \]
where the coefficients in the u , v and w terms will vanish if $y_1$ , $y_2$ and $y_3$ are solutions to the
corresponding homogeneous differential equation.
Together equations (22.6), (22.7) and (22.8) form a system of three equations in three unknown
functions. If you look at this system, and recall the original formula for y , you'll see that we've
derived the variation of parameters method for solving third-order nonhomogeneous linear differential
equations:

To solve the nonhomogeneous differential equation
\[ a_0 y''' + a_1 y'' + a_2 y' + a_3 y = g , \]
first find a fundamental set of solutions $\{y_1, y_2, y_3\}$ to the corresponding homogeneous
equation
\[ a_0 y''' + a_1 y'' + a_2 y' + a_3 y = 0 . \]
Then set
\[ y = y_1 u + y_2 v + y_3 w \tag{22.9} \]


22.3 The Variation of Parameters Formula

Second-Order Version with Indeﬁnite Integrals
By solving system (22.5) for $u'$ and $v'$ using generic $y_1$ and $y_2$ , integrating, and then plugging
the result back into formula (22.4)
\[ y = y_1 u + y_2 v , \]
you can show that the solution to
\[ ay'' + by' + cy = g \tag{22.13} \]
is given by
\[ y(x) = -y_1(x)\int \frac{y_2(x) f(x)}{W(x)}\,dx + y_2(x)\int \frac{y_1(x) f(x)}{W(x)}\,dx \tag{22.14} \]
where
\[ f(x) = \frac{g(x)}{a(x)} , \qquad W(x) = y_1(x)y_2'(x) - y_1'(x)y_2(x) , \]
and $\{y_1, y_2\}$ is any fundamental set of solutions to the corresponding homogeneous equation. The
details will be left as an exercise (exercise 22.5 on page 431).
A few observations should be made about the elements in formula (22.14):

1. Back in section 12.1, we saw that solutions to second-order linear differential equations are
continuous and have continuous derivatives, at least over intervals on which a , b and c
are continuous functions and a is never zero. Consequently, the above W (x) will be a
continuous function on any such interval.

2. Moreover, if you recall the discussion about the “Wronskian” corresponding to the function
set {y1 , y2 } (see section 13.6), then you may have noticed that the W (x) in the above formula
is that very Wronskian. As noted in theorem 13.6 on page 274, W (x) will be nonzero at
every point in our interval of interest, provided a , b and c are continuous functions and a
is never zero on that interval.

Consequently, the integrands of the integrals in formula (22.14) will (theoretically at least) be nice
integrable functions over our interval of interest as long as a , b , c and g are continuous functions
and a is never zero over this interval.2 And this verifies that, in theory, the variation of parameters
method does always yield the solution to a nonhomogeneous linear second-order differential equation
over appropriate intervals.
In practice, this author discourages the use of formula (22.14), at least at first. For most, trying
to memorize and effectively use this formula is more difficult than remembering the basic system
from which it was derived. And the small savings in computational time gained by using this formula
is hardly worth the effort unless you are going to be solving many equations of the form
\[ ay'' + by' + cy = g \]
in which the left side remains the same, but you have several different choices for g .

2 In fact, f does not have to even be continuous. It just cannot have particularly bad discontinuities.


Second-Order Version with Deﬁnite Integrals

If you fix a point $x_0$ in the interval of interest and rederive formula (22.14) using definite integrals
instead of the indefinite ones used just above, you get that a particular solution to
\[ ay'' + by' + cy = g \]
is given by
\[ y_p(x) = -y_1(x)\int_{x_0}^{x} \frac{y_2(s) f(s)}{W(s)}\,ds + y_2(x)\int_{x_0}^{x} \frac{y_1(s) f(s)}{W(s)}\,ds \tag{22.15} \]
where, again,
\[ f(s) = \frac{g(s)}{a(s)} , \qquad W(s) = y_1(s)y_2'(s) - y_1'(s)y_2(s) , \]
and $\{y_1, y_2\}$ is any fundamental set of solutions to the corresponding homogeneous equation.
Then, of course,
\[ y(x) = y_p(x) + c_1 y_1(x) + c_2 y_2(x) \tag{22.16} \]
is the corresponding general solution to the original nonhomogeneous differential equation.
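
The definite-integral formula lends itself to numerical evaluation. The sketch below applies it to the earlier equation $x^2 y'' - 2xy' + 2y = 3x^2$ with $y_1 = x$ , $y_2 = x^2$ and the assumed base point $x_0 = 1$ . The helper `simpson` is a hand-written composite Simpson's rule, used here only as one possible numerical integrator, and `y_p_exact` is the closed form obtained by carrying out the same two integrals by hand.

```python
import math

def simpson(func, a, b, n=200):
    """Composite Simpson's rule on [a, b] using n (even) subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3

# Data for x^2 y'' - 2x y' + 2y = 3x^2 on an interval with x > 0.
y1 = lambda x: x
y2 = lambda x: x**2
W = lambda s: y1(s) * (2 * s) - 1.0 * y2(s)   # y1*y2' - y1'*y2 = s^2
f = lambda s: 3 * s**2 / s**2                  # g/a = 3

def y_p(x, x0=1.0):
    """Particular solution from the definite-integral formula (x0 is an assumed base point)."""
    I1 = simpson(lambda s: y2(s) * f(s) / W(s), x0, x)
    I2 = simpson(lambda s: y1(s) * f(s) / W(s), x0, x)
    return -y1(x) * I1 + y2(x) * I2

def y_p_exact(x):
    """Closed form of the same integrals when x0 = 1."""
    return -3 * x**2 + 3 * x + 3 * x**2 * math.log(x)

print(y_p(2.0), y_p_exact(2.0))
```

Here the integrals happen to have simple closed forms, so the numerical values can be compared against them; the same code would still work for a forcing function whose integrals cannot be found explicitly.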
There are two practical advantages to using deﬁnite integral formula (22.15) instead of the
corresponding indeﬁnite integral formula, formula (22.14):

1. Often, it is virtually impossible to find usable, explicit formulas for the integrals of
\[ \frac{y_2(x) f(x)}{W(x)} \quad\text{and}\quad \frac{y_1(x) f(x)}{W(x)} . \]
In these cases, formula (22.14), with its impossible-to-compute indefinite integrals, is of very
little practical value. However, the definite integrals in formula (22.15) can still be accurately
approximated for specific values of x using any decent numerical integration method. Thus,
while we may not be able to obtain a nice formula for $y_p(x)$ , we can still evaluate it at
desired points on any reasonable interval of interest, possibly using these values to generate
a table for $y_p(x)$ or to sketch its graph.

2. As you can easily show (exercise 22.6), the $y_p$ given by formula (22.15) satisfies the initial
conditions
\[ y(x_0) = 0 \quad\text{and}\quad y'(x_0) = 0 . \]
This makes it a little easier to find the constants $c_1$ and $c_2$ such that
\[ y(x) = y_p(x) + c_1 y_1(x) + c_2 y_2(x) \]

Part IV
The Laplace Transform

23
The Laplace Transform (Intro)

The Laplace transform is a mathematical tool based on integration that has a number of applications. In
particular, it can simplify the solving of many differential equations. We will find it particularly useful
when dealing with nonhomogeneous equations in which the forcing functions are not continuous.
This makes it a valuable tool for engineers and scientists dealing with “real-world” applications.
By the way, the Laplace transform is just one of many “integral transforms” in general use. Con-
ceptually and computationally, it is probably the simplest. If you understand the Laplace transform,
then you will ﬁnd it much easier to pick up the other transforms as needed.

23.1 Basic Deﬁnition and Examples

Deﬁnition, Notation and Other Basics
Let f be a 'suitable' function (more on that later). The Laplace transform of f , denoted by either
F or $\mathcal{L}[f]$ , is the function given by
\[ F(s) = \mathcal{L}[f]\big|_s = \int_0^\infty f(t)e^{-st}\,dt . \tag{23.1} \]

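!Example 23.1: For our first example, let us use
\[ f(t) = \begin{cases} 1 & \text{if } t \le 2 \\ 0 & \text{if } 2 < t \end{cases} . \]
This is the relatively simple discontinuous function graphed in figure 23.1a. To compute the
Laplace transform of this function, we need to break the integral into two parts:
\[
\begin{aligned}
F(s) = \mathcal{L}[f]\big|_s &= \int_0^\infty f(t)e^{-st}\,dt \\
&= \int_0^2 \underbrace{f(t)}_{1}\,e^{-st}\,dt + \int_2^\infty \underbrace{f(t)}_{0}\,e^{-st}\,dt \\
&= \int_0^2 e^{-st}\,dt + \int_2^\infty 0\,dt = \int_0^2 e^{-st}\,dt .
\end{aligned}
\]
So, if $s \ne 0$ ,
\[ F(s) = \int_0^2 e^{-st}\,dt = \left.\frac{e^{-st}}{-s}\right|_{t=0}^{2} = -\frac{1}{s}\left[e^{-s\cdot 2} - e^{-s\cdot 0}\right] = \frac{1}{s}\left[1 - e^{-2s}\right] . \]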


Figure 23.1: The graph of (a) the discontinuous function f (t) from example 23.1 and (b) its
Laplace transform F(s) .

And if $s = 0$ ,
\[ F(s) = F(0) = \int_0^2 e^{-0\cdot t}\,dt = \int_0^2 1\,dt = 2 . \]

This is the function sketched in ﬁgure 23.1b. (Using L’Hôpital’s rule, you can easily show that
F(s) → F(0) as s → 0 . So, despite our need to compute F(s) separately when s = 0 , F is
a continuous function.)
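
The formula just found can be checked numerically. The following sketch compares it with a direct approximation of the defining integral; the integrator `simpson` is a hand-written composite Simpson's rule and the sample values of s are arbitrary, so treat this as an illustration under those assumptions rather than a proof.

```python
import math

def simpson(func, a, b, n=200):
    """Composite Simpson's rule on [a, b] using n (even) subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3

def F_numeric(s):
    # f(t) = 1 on [0, 2] and 0 afterwards, so only [0, 2] contributes to the integral.
    return simpson(lambda t: math.exp(-s * t), 0.0, 2.0)

def F_formula(s):
    # Closed form from the example, with the separately computed value at s = 0.
    return 2.0 if s == 0 else (1.0 - math.exp(-2.0 * s)) / s

for s in (-1.0, 0.0, 0.5, 3.0):
    print(s, F_numeric(s), F_formula(s))
```

Note that negative values of s cause no trouble here, because this particular f vanishes for t > 2 and the integral is over a finite interval.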

As the example just illustrated, we really are ‘transforming’ the function f (t) into another
function F(s) . This process of transforming f (t) to F(s) is also called the Laplace transform
and, unsurprisingly, is denoted by L . Thus, when we say “the Laplace transform”, we can be
referring to either the transformed function F(s) or to the process of computing F(s) from f (t) .
Some other quick notes:

1. There are standard notational conventions that simplify bookkeeping. The functions 'to be
transformed' are (almost) always denoted by lower case Roman letters ( f , g , h , etc.),
and t is (almost) always used as the variable in the formulas for these functions (because, in
applications, these are typically functions of time). The corresponding 'transformed functions'
are (almost) always denoted by the corresponding upper case Roman letters ( F , G ,
H , etc.), and s is (almost) always used as the variable in the formulas for these functions.
Thus, if we happen to refer to functions f (t) and F(s) , it is a good bet that $F = \mathcal{L}[f]$ .

2. Observe that, in the integral for the Laplace transform, we are integrating the inputted function
f (t) multiplied by the exponential $e^{-st}$ over the positive T–axis. Because of the sign in
the exponential, this exponential is a rapidly decreasing function of t when s > 0 and is
a rapidly increasing function of t when s < 0 . This will help determine both the sort of
functions that are ‘suitable’ for the Laplace transform and the domains of the transformed
functions.

3. It is also worth noting that, because the lower limit in the integral for the Laplace transform
is t = 0 , the formula for f (t) when t < 0 is completely irrelevant. In fact, f (t) need
not even be deﬁned for t < 0 . For this reason, some authors explicitly limit the values for
t to being nonnegative. We won’t do this explicitly, but do keep in mind that the Laplace
transform of a function f (t) is only based on the values/formula for f (t) with t ≥ 0 .
This will become a little more relevant when we discuss inverting the Laplace transform (in
chapter 25).


4. As indicated by our discussion so far, we are treating the s in
\[ F(s) = \int_0^\infty f(t)e^{-st}\,dt \]
as a real variable; that is, we are assuming s denotes a relatively arbitrary real value. Be
aware, however, that in more advanced developments, s is often treated as a complex variable,
s = σ + iξ . This allows the use of results from the theory of analytic complex functions.
But we won’t need that theory (a theory which few readers of this text are likely to have
yet seen). So, in this text (with one very brief exception in chapter 25), s will always be
assumed to be real.

Transforms of Some Common Functions

Before we can make much use of the Laplace transform, we need to build a repertoire of common
functions whose transforms we know. It would also be a good idea to compute a number of transforms
simply to get a better grasp of this whole ‘Laplace transform’ idea.
So let’s get started.

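!Example 23.2 (transforms of favorite constants): Let f be the zero function, that is,
\[ f(t) = 0 \quad\text{for all } t . \]
Then its Laplace transform is
\[ F(s) = \mathcal{L}[0]\big|_s = \int_0^\infty 0\cdot e^{-st}\,dt = \int_0^\infty 0\,dt = 0 . \tag{23.2} \]
Now let h be the unit constant function, that is,
\[ h(t) = 1 \quad\text{for all } t . \]
Then
\[ H(s) = \mathcal{L}[1]\big|_s = \int_0^\infty 1\cdot e^{-st}\,dt = \int_0^\infty e^{-st}\,dt . \]
What comes out of this integral depends strongly on whether s is positive or not. If s < 0 , then
$0 < -s = |s|$ and
\[
\begin{aligned}
\int_0^\infty e^{-st}\,dt = \int_0^\infty e^{|s|t}\,dt &= \left.\frac{1}{|s|}e^{|s|t}\right|_{t=0}^{\infty} \\
&= \lim_{t\to\infty}\frac{1}{|s|}e^{|s|t} - \frac{1}{|s|}e^{|s|\cdot 0} = \infty - \frac{1}{|s|} = \infty .
\end{aligned}
\]
If s = 0 , then
\[ \int_0^\infty e^{-st}\,dt = \int_0^\infty e^{0\cdot t}\,dt = \int_0^\infty 1\,dt = t\,\Big|_{t=0}^{\infty} = \infty . \]
Finally, if s > 0 , then
\[ \int_0^\infty e^{-st}\,dt = \left.\frac{1}{-s}e^{-st}\right|_{t=0}^{\infty} = \lim_{t\to\infty}\frac{1}{-s}e^{-st} - \frac{1}{-s}e^{-s\cdot 0} = 0 + \frac{1}{s} = \frac{1}{s} . \]
So,
\[ \mathcal{L}[1]\big|_s = \int_0^\infty 1\cdot e^{-st}\,dt = \begin{cases} \dfrac{1}{s} & \text{if } 0 < s \\[1ex] \infty & \text{if } s \le 0 \end{cases} . \tag{23.3} \]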


As illustrated in the last example, a Laplace transform F(s) is often a well-defined (finite)
function of s only when s is greater than some fixed number $s_0$ ( $s_0 = 0$ in the example). This is a
result of the fact that the larger s is, the faster $e^{-st}$ goes to zero as t → ∞ (provided s > 0 ). In
practice, we will only give the formulas for the transforms over the intervals where these formulas
are well-defined and finite. Thus, in place of equation (23.3), we will write
\[ \mathcal{L}[1]\big|_s = \frac{1}{s} \quad\text{for } s > 0 . \tag{23.4} \]

As we compute Laplace transforms, we will note such restrictions on the values of s . To be honest,
however, these restrictions will usually not be that important in our work. What will be important is
that there is some ﬁnite value s0 such that our formulas are valid whenever s > s0 .
Keeping this in mind, let’s go back to computing transforms.

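!Example 23.3 (transforms of some powers of t ): We want to find
\[ \mathcal{L}\left[t^n\right]\big|_s = \int_0^\infty t^n e^{-st}\,dt \quad\text{for } n = 1, 2, 3, \ldots . \]
With a little thought, you will realize this integral will not be finite if s ≤ 0 . So we will assume
s > 0 in these computations. This, of course, means that
\[ \lim_{t\to\infty} e^{-st} = 0 . \]
It also means that, using L'Hôpital's rule, you can easily verify that
\[ \lim_{t\to\infty} t^n e^{-st} = 0 \quad\text{for } n \ge 0 . \]
Keeping the above in mind, consider the case where n = 1 ,
\[ \mathcal{L}[t]\big|_s = \int_0^\infty t e^{-st}\,dt . \]
This integral just cries out to be integrated "by parts":
\[
\begin{aligned}
\mathcal{L}[t]\big|_s &= \int_0^\infty \underbrace{t}_{u}\,\underbrace{e^{-st}\,dt}_{dv} \\
&= uv\,\Big|_{t=0}^{\infty} - \int_0^\infty v\,du \\
&= \left. t\,\frac{1}{-s}e^{-st}\right|_{t=0}^{\infty} - \int_0^\infty \frac{1}{-s}e^{-st}\,dt \\
&= -\frac{1}{s}\Big[\underbrace{\lim_{t\to\infty} te^{-st}}_{0} - \underbrace{0\cdot e^{-s\cdot 0}}_{0}\Big] - \int_0^\infty \frac{1}{-s}e^{-st}\,dt = \frac{1}{s}\int_0^\infty e^{-st}\,dt .
\end{aligned}
\]
Admittedly, this last integral is easy to compute, but why bother since we computed it in the
previous example! In fact, it is worth noting that combining the last computations with the
computations for $\mathcal{L}[1]$ yields
\[ \mathcal{L}[t]\big|_s = \frac{1}{s}\int_0^\infty e^{-st}\,dt = \frac{1}{s}\int_0^\infty 1\cdot e^{-st}\,dt = \frac{1}{s}\,\mathcal{L}[1]\big|_s = \frac{1}{s}\cdot\frac{1}{s} . \]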


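So,
\[ \mathcal{L}[t]\big|_s = \frac{1}{s^2} \quad\text{for } s > 0 . \tag{23.5} \]
Now consider the case where n = 2 . Again, we start with an integration by parts:
\[
\begin{aligned}
\mathcal{L}\left[t^2\right]\big|_s &= \int_0^\infty t^2 e^{-st}\,dt \\
&= \int_0^\infty \underbrace{t^2}_{u}\,\underbrace{e^{-st}\,dt}_{dv} \\
&= uv\,\Big|_{t=0}^{\infty} - \int_0^\infty v\,du \\
&= \left. t^2\,\frac{1}{-s}e^{-st}\right|_{t=0}^{\infty} - \int_0^\infty \frac{1}{-s}e^{-st}\,2t\,dt \\
&= \frac{1}{-s}\Big[\underbrace{\lim_{t\to\infty} t^2 e^{-st}}_{0} - \underbrace{0^2 e^{-s\cdot 0}}_{0}\Big] - \frac{2}{-s}\int_0^\infty te^{-st}\,dt = \frac{2}{s}\int_0^\infty te^{-st}\,dt .
\end{aligned}
\]
But remember,
\[ \int_0^\infty te^{-st}\,dt = \mathcal{L}[t]\big|_s . \]
Combining the above computations with this (and referring back to equation (23.5)), we end up
with
\[ \mathcal{L}\left[t^2\right]\big|_s = \frac{2}{s}\int_0^\infty te^{-st}\,dt = \frac{2}{s}\,\mathcal{L}[t]\big|_s = \frac{2}{s}\cdot\frac{1}{s^2} = \frac{2}{s^3} . \tag{23.6} \]
Clearly, a pattern is emerging. I'll leave the computation of $\mathcal{L}\left[t^3\right]$ to you.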

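?Exercise 23.1: Assuming s > 0 , verify (using integration by parts) that
\[ \mathcal{L}\left[t^3\right]\big|_s = \frac{3}{s}\,\mathcal{L}\left[t^2\right]\big|_s , \]
and from that and the formula for $\mathcal{L}\left[t^2\right]$ computed above, conclude that
\[ \mathcal{L}\left[t^3\right]\big|_s = \frac{3\cdot 2}{s^4} = \frac{3!}{s^4} . \]

?Exercise 23.2: More generally, use integration by parts to show that, whenever s > 0 and n
is a positive integer,
\[ \mathcal{L}\left[t^n\right]\big|_s = \frac{n}{s}\,\mathcal{L}\left[t^{n-1}\right]\big|_s . \]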
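
The recursion in the last exercise, started from the transform of the constant 1, can be unrolled mechanically. The sketch below does so and compares the result with the closed form $n!/s^{n+1}$ that the pattern suggests; the function name `laplace_power` and the sample value of s are illustrative choices, not notation from the text.

```python
import math

def laplace_power(n, s):
    """L[t^n] at s > 0, via L[t^n] = (n/s) L[t^(n-1)] starting from L[1] = 1/s."""
    if s <= 0:
        raise ValueError("the transform is finite only for s > 0")
    return 1.0 / s if n == 0 else (n / s) * laplace_power(n - 1, s)

s = 2.5
for n in range(6):
    # Recursion value next to the conjectured closed form n!/s^(n+1).
    print(n, laplace_power(n, s), math.factorial(n) / s ** (n + 1))
```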

Since sines and cosines oscillate between 1 and −1 as t → ∞ , the last limit does not exist
unless
\[ \lim_{t\to\infty} e^{-st} = 0 , \]
and this occurs if and only if s > 0 . In this case,
\[ \lim_{t\to\infty} e^{-(s-i3)t} = \lim_{t\to\infty} e^{-st}\left[\cos(3t) + i\sin(3t)\right] = 0 . \]
Thus, when s > 0 ,
\[ \mathcal{L}\left[e^{i3t}\right]\big|_s = \frac{-1}{s-i3}\left[\lim_{t\to\infty} e^{-(s-i3)t} - e^{-(s-i3)0}\right] = \frac{-1}{s-i3}\left[0 - 1\right] = \frac{1}{s-i3} . \]

Again, replacing 3 with any real number is trivial.

?Exercise 23.4 (transforms of complex exponentials): Let α be any real number and show
that
\[ \mathcal{L}\left[e^{i\alpha t}\right]\big|_s = \frac{1}{s-i\alpha} \quad\text{for } 0 < s . \tag{23.9} \]

23.2 Linearity and Some More Basic Transforms

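Suppose we have already computed the Laplace transforms of two functions f (t) and g(t) , and,
thus, already know the formulas for
\[ F(s) = \mathcal{L}[f]\big|_s \quad\text{and}\quad G(s) = \mathcal{L}[g]\big|_s . \]
Now look at what happens if we compute the transform of any linear combination of f and g .
Letting α and β be any two constants, we have
\[
\begin{aligned}
\mathcal{L}[\alpha f(t) + \beta g(t)]\big|_s &= \int_0^\infty \left[\alpha f(t) + \beta g(t)\right]e^{-st}\,dt \\
&= \int_0^\infty \left[\alpha f(t)e^{-st} + \beta g(t)e^{-st}\right]dt \\
&= \alpha\int_0^\infty f(t)e^{-st}\,dt + \beta\int_0^\infty g(t)e^{-st}\,dt \\
&= \alpha\,\mathcal{L}[f(t)]\big|_s + \beta\,\mathcal{L}[g(t)]\big|_s = \alpha F(s) + \beta G(s) .
\end{aligned}
\]
Thus, the Laplace transform is a linear transform; that is, for any two constants α and β , and any
two Laplace transformable functions f and g ,
\[ \mathcal{L}[\alpha f(t) + \beta g(t)] = \alpha\,\mathcal{L}[f] + \beta\,\mathcal{L}[g] . \]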

This fact will simplify many of our computations and is important enough to enshrine as a theorem.
While we are at it, let’s note that the above computations can be done with more functions than two,


and that we, perhaps, should have noted the values of s for which the integrals are ﬁnite. Taking all
that into account, we can prove:

Theorem 23.1 (linearity of the Laplace transform)

The Laplace transform is linear. That is,
\[ \mathcal{L}[c_1 f_1(t) + c_2 f_2(t) + \cdots + c_n f_n(t)] = c_1\mathcal{L}[f_1(t)] + c_2\mathcal{L}[f_2(t)] + \cdots + c_n\mathcal{L}[f_n(t)] \]
where each $c_k$ is a constant and each $f_k$ is a "Laplace transformable" function.
Moreover, if, for each $f_k$ , we have a value $s_k$ such that
\[ F_k(s) = \mathcal{L}[f_k(t)]\big|_s \quad\text{for } s_k < s , \]
then, letting $s_{\max}$ be the largest of these $s_k$ 's ,
\[ \mathcal{L}[c_1 f_1(t) + c_2 f_2(t) + \cdots + c_n f_n(t)]\big|_s = c_1 F_1(s) + c_2 F_2(s) + \cdots + c_n F_n(s) \quad\text{for } s_{\max} < s . \]

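!Example 23.6 (transform of the sine function): Let us consider finding the Laplace transform
of sin(ωt) for any real value ω . There are several ways to compute this, but the easiest starts
with using Euler's formula for the sine function along with the linearity of the Laplace transform:
\[
\begin{aligned}
\mathcal{L}[\sin(\omega t)]\big|_s &= \mathcal{L}\left[\frac{e^{i\omega t} - e^{-i\omega t}}{2i}\right]\bigg|_s \\
&= \frac{1}{2i}\,\mathcal{L}\left[e^{i\omega t} - e^{-i\omega t}\right]\big|_s = \frac{1}{2i}\left(\mathcal{L}\left[e^{i\omega t}\right]\big|_s - \mathcal{L}\left[e^{-i\omega t}\right]\big|_s\right) .
\end{aligned}
\]
From example 23.5 and exercise 23.4, we know
\[ \mathcal{L}\left[e^{i\omega t}\right]\big|_s = \frac{1}{s-i\omega} \quad\text{for } s > 0 . \]
Thus, also,
\[ \mathcal{L}\left[e^{-i\omega t}\right]\big|_s = \mathcal{L}\left[e^{i(-\omega)t}\right]\big|_s = \frac{1}{s-i(-\omega)} = \frac{1}{s+i\omega} \quad\text{for } s > 0 . \]
Plugging these into the computations for $\mathcal{L}[\sin(\omega t)]$ (and doing a little algebra) yields, for s > 0 ,
\[
\begin{aligned}
\mathcal{L}[\sin(\omega t)]\big|_s &= \frac{1}{2i}\left(\mathcal{L}\left[e^{i\omega t}\right]\big|_s - \mathcal{L}\left[e^{-i\omega t}\right]\big|_s\right) \\
&= \frac{1}{2i}\left[\frac{1}{s-i\omega} - \frac{1}{s+i\omega}\right] \\
&= \frac{1}{2i}\left[\frac{s+i\omega}{(s-i\omega)(s+i\omega)} - \frac{s-i\omega}{(s+i\omega)(s-i\omega)}\right] \\
&= \frac{1}{2i}\left[\frac{(s+i\omega) - (s-i\omega)}{s^2 - i^2\omega^2}\right] \\
&= \frac{1}{2i}\cdot\frac{2i\omega}{s^2 - i^2\omega^2} ,
\end{aligned}
\]
which immediately simplifies to
\[ \mathcal{L}[\sin(\omega t)]\big|_s = \frac{\omega}{s^2 + \omega^2} \quad\text{for } s > 0 . \tag{23.10} \]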
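
The algebra in this example is easy to double check with complex arithmetic. In the sketch below, the function name and the test values of ω and s are arbitrary illustrative choices; the code simply evaluates the Euler-formula expression and compares it with the simplified result.

```python
def sine_transform_via_euler(omega, s):
    """(1/2i) [ 1/(s - i*omega) - 1/(s + i*omega) ], the unsimplified expression."""
    difference = 1 / complex(s, -omega) - 1 / complex(s, omega)
    return difference / 2j

omega, s = 3.0, 2.0
value = sine_transform_via_euler(omega, s)
# The imaginary part should vanish (up to rounding) and the real part should
# agree with omega / (s^2 + omega^2).
print(value, omega / (s**2 + omega**2))
```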


?Exercise 23.5 (transform of the cosine function): Show that, for any real value ω ,
\[ \mathcal{L}[\cos(\omega t)]\big|_s = \frac{s}{s^2 + \omega^2} \quad\text{for } s > 0 . \tag{23.11} \]

23.3 Tables and a Few More Transforms

In practice, those using the Laplace transform in applications do not constantly recompute basic
transforms. Instead, they refer to tables of transforms (or use software) to look up commonly used
transforms, just as so many people use tables of integrals (or software) when computing integrals.
We, too, can use tables (or software) after
1. you have computed enough transforms on your own to understand the basic principles,
and
2. we have computed the transforms appearing in the table so we know our table is correct.
The table we will use is table 23.1, Laplace Transforms of Common Functions (Version 1), on page
444. Checking that table, we see that we have already verified all but two or three of the entries,
with those being the transforms of fairly arbitrary powers of t , $t^\alpha$ , and the "shifted step function",
$\operatorname{step}(t - \alpha)$ . So let's compute them now.

Arbitrary Powers (and the Gamma Function)

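Earlier, we saw that
\[ \mathcal{L}\left[t^n\right]\big|_s = \int_0^\infty t^n e^{-st}\,dt = \frac{n!}{s^{n+1}} \quad\text{for } s > 0 \tag{23.12} \]
when n is any nonnegative integer. Let us now consider computing
\[ \mathcal{L}\left[t^\alpha\right]\big|_s = \int_0^\infty t^\alpha e^{-st}\,dt \quad\text{for } s > 0 \]
when α is any real number greater than −1 . (When α ≤ −1 , you can show that $t^\alpha$ 'blows up'
too quickly near t = 0 for the integral to be finite.)
The method we used to find $\mathcal{L}\left[t^n\right]$ becomes awkward when we try to apply it to find $\mathcal{L}[t^\alpha]$
when α is not an integer. Instead, we will 'cleverly' simplify the above integral for $\mathcal{L}[t^\alpha]$ by using
the substitution u = st . Since t is the variable in the integral, this means
\[ t = \frac{u}{s} \quad\text{and}\quad dt = \frac{1}{s}\,du . \]
So, assuming s > 0 and α > −1 ,
\[
\begin{aligned}
\mathcal{L}\left[t^\alpha\right]\big|_s &= \int_0^\infty t^\alpha e^{-st}\,dt \\
&= \int_0^\infty \left(\frac{u}{s}\right)^{\alpha} e^{-u}\,\frac{1}{s}\,du \\
&= \int_0^\infty \frac{u^\alpha}{s^{\alpha+1}}\,e^{-u}\,du = \frac{1}{s^{\alpha+1}}\int_0^\infty u^\alpha e^{-u}\,du .
\end{aligned}
\]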


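Table 23.1: Laplace Transforms of Common Functions (Version 1)

In the following, α and ω are real-valued constants, and, unless otherwise noted, s > 0 .

f(t)                                              F(s) = L[f(t)]|_s                     Restrictions

$1$                                               $1/s$
$t$                                               $1/s^2$
$t^n$                                             $n!/s^{n+1}$                          $n = 1, 2, 3, \ldots$
$1/\sqrt{t}$                                      $\sqrt{\pi}/\sqrt{s}$
$t^\alpha$                                        $\Gamma(\alpha+1)/s^{\alpha+1}$       $-1 < \alpha$
$e^{\alpha t}$                                    $1/(s-\alpha)$                        $\alpha < s$
$e^{i\alpha t}$                                   $1/(s-i\alpha)$
$\cos(\omega t)$                                  $s/(s^2+\omega^2)$
$\sin(\omega t)$                                  $\omega/(s^2+\omega^2)$
$\operatorname{step}_\alpha(t)$, $\operatorname{step}(t-\alpha)$     $e^{-\alpha s}/s$                     $0 \le \alpha$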

Notice that the last integral depends only on the constant α ; we've 'factored out' any dependence
on the variable s . Thus, we can treat this integral as a constant (for each value of α ) and write
\[ \mathcal{L}\left[t^\alpha\right]\big|_s = \frac{C_\alpha}{s^{\alpha+1}} \quad\text{where}\quad C_\alpha = \int_0^\infty u^\alpha e^{-u}\,du . \]
It just so happens that the above formula for $C_\alpha$ is very similar to the formula for something
called the "Gamma function". This is a function that crops up in various applications (such as this)
and, for x > 0 , is given by
\[ \Gamma(x) = \int_0^\infty u^{x-1} e^{-u}\,du . \tag{23.13} \]
Comparing this with the formula for $C_\alpha$ , we see that
\[ C_\alpha = \int_0^\infty u^\alpha e^{-u}\,du = \int_0^\infty u^{(\alpha+1)-1} e^{-u}\,du = \Gamma(\alpha+1) . \]
So our formula for the Laplace transform of $t^\alpha$ (with α > −1 ) can be written as
\[ \mathcal{L}\left[t^\alpha\right]\big|_s = \frac{\Gamma(\alpha+1)}{s^{\alpha+1}} \quad\text{for } s > 0 . \tag{23.14} \]


Figure 23.2: The graph of the Gamma function over the interval (0, 4) . As $x \to 0^+$ or
$x \to +\infty$ , $\Gamma(x) \to +\infty$ very rapidly.

This is normally considered the preferred way to express $\mathcal{L}[t^\alpha]$ because the Gamma function is
considered to be a "well-known" function. Perhaps you don't yet consider it "well known", but you
can find tables for evaluating $\Gamma(x)$ , and it is probably one of the functions already defined in your
favorite computer math package. That makes graphing $\Gamma(x)$ , as done in figure 23.2, relatively easy.
As it is, we can readily determine the value of $\Gamma(x)$ when x is a positive integer by comparing
our two formulas for $\mathcal{L}\left[t^n\right]$ when n is a nonnegative integer, namely the one mentioned at the start of
our discussion (formula (23.12)), and the more general formula (formula (23.14)) just derived for
$\mathcal{L}[t^\alpha]$ with α = n :
\[ \frac{n!}{s^{n+1}} = \mathcal{L}\left[t^n\right]\big|_s = \frac{\Gamma(n+1)}{s^{n+1}} \quad\text{when } n = 0, 1, 2, 3, 4, \ldots . \]
Thus,
\[ \Gamma(n+1) = n! \quad\text{when } n = 0, 1, 2, 3, 4, \ldots . \]
Letting x = n + 1 , this becomes
\[ \Gamma(x) = (x-1)! \quad\text{when } x = 1, 2, 3, 4, \ldots . \tag{23.15} \]

In particular:
\[
\begin{aligned}
\Gamma(1) &= (1-1)! = 0! = 1 , \\
\Gamma(2) &= (2-1)! = 1! = 1 , \\
\Gamma(3) &= (3-1)! = 2! = 2 , \\
\Gamma(4) &= (4-1)! = 3! = 6 ,
\end{aligned}
\]
and
\[ \Gamma(12) = (12-1)! = 11! = 39{,}916{,}800 . \]

This shows that the Gamma function can be viewed as a generalization of the factorial. Indeed, you
will find texts where the factorial is redefined for all positive numbers (not just integers) by
\[ x! = \Gamma(x+1) . \]
We won't do that.
Computing $\Gamma(x)$ when x is not an integer is not so simple. It can be shown that
\[ \Gamma\left(\tfrac{1}{2}\right) = \sqrt{\pi} . \tag{23.16} \]


Also, using integration by parts (just as you did in exercise 23.2 on page 439), you can show that
\[ \Gamma(x+1) = x\,\Gamma(x) \quad\text{for } x > 0 , \tag{23.17} \]

which is analogous to the factorial identity (n + 1)! = (n + 1)n! . We will leave the veriﬁcation of
these to the more adventurous (see exercise 23.13 on page 462), and go on to the computation of a
few more transforms.

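!Example 23.7: Consider finding the Laplace transforms of
\[ \frac{1}{\sqrt{t}} , \quad \sqrt{t} \quad\text{and}\quad \sqrt[3]{t} . \]
For the first, we use formula (23.14) with α = −1/2 , along with equation (23.16):
\[ \mathcal{L}\left[\frac{1}{\sqrt{t}}\right]\bigg|_s = \mathcal{L}\left[t^{-1/2}\right]\big|_s = \frac{\Gamma\left(-\tfrac{1}{2}+1\right)}{s^{-1/2+1}} = \frac{\Gamma\left(\tfrac{1}{2}\right)}{s^{1/2}} = \frac{\sqrt{\pi}}{\sqrt{s}} . \]
For the second, formula (23.14) with α = 1/2 gives
\[ \mathcal{L}\left[\sqrt{t}\right]\big|_s = \mathcal{L}\left[t^{1/2}\right]\big|_s = \frac{\Gamma\left(\tfrac{1}{2}+1\right)}{s^{1/2+1}} = \frac{\Gamma\left(\tfrac{3}{2}\right)}{s^{3/2}} . \]
Using formulas (23.17) and (23.16), we see that
\[ \Gamma\left(\tfrac{3}{2}\right) = \Gamma\left(\tfrac{1}{2}+1\right) = \tfrac{1}{2}\,\Gamma\left(\tfrac{1}{2}\right) = \tfrac{1}{2}\sqrt{\pi} . \]
Thus
\[ \mathcal{L}\left[\sqrt{t}\right]\big|_s = \mathcal{L}\left[t^{1/2}\right]\big|_s = \frac{\Gamma\left(\tfrac{3}{2}\right)}{s^{3/2}} = \frac{\sqrt{\pi}}{2s^{3/2}} . \]
For the transform of $\sqrt[3]{t}$ , we simply have
\[ \mathcal{L}\left[\sqrt[3]{t}\right]\big|_s = \mathcal{L}\left[t^{1/3}\right]\big|_s = \frac{\Gamma\left(\tfrac{1}{3}+1\right)}{s^{1/3+1}} = \frac{\Gamma\left(\tfrac{4}{3}\right)}{s^{4/3}} . \]
Unfortunately, there is not a formula analogous to (23.16) for $\Gamma\left(\tfrac{4}{3}\right)$ or $\Gamma\left(\tfrac{1}{3}\right)$ . There is the
approximation
\[ \Gamma\left(\tfrac{4}{3}\right) \approx 0.8929795116 , \]
which can be found using either tables or a computer math package, but, since this is just an
approximation, we might as well leave our answer as
\[ \mathcal{L}\left[\sqrt[3]{t}\right]\big|_s = \mathcal{L}\left[t^{1/3}\right]\big|_s = \frac{\Gamma\left(\tfrac{4}{3}\right)}{s^{4/3}} . \]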
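
Most computer math packages do provide the Gamma function. As one illustration, Python's standard library exposes it as `math.gamma`, and the sketch below uses it to evaluate formula (23.14) and to spot-check the values quoted above. The wrapper name `laplace_t_alpha` and the sample value of s are assumptions made for this example.

```python
import math

def laplace_t_alpha(alpha, s):
    """Formula (23.14): Gamma(alpha + 1) / s^(alpha + 1), for alpha > -1 and s > 0."""
    if alpha <= -1 or s <= 0:
        raise ValueError("need alpha > -1 and s > 0")
    return math.gamma(alpha + 1) / s ** (alpha + 1)

s = 2.0
print(math.gamma(12))                                        # compare with 11!
print(math.gamma(0.5), math.sqrt(math.pi))                   # equation (23.16)
print(laplace_t_alpha(-0.5, s), math.sqrt(math.pi / s))      # transform of 1/sqrt(t)
print(laplace_t_alpha(0.5, s), math.sqrt(math.pi) / (2 * s**1.5))  # transform of sqrt(t)
print(math.gamma(4 / 3))                                     # no closed form; numeric only
```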


Figure 23.3: The graphs of (a) the basic step function $\operatorname{step}(t)$ and (b) a shifted step function
$\operatorname{step}_\alpha(t)$ with α > 0 .

The Shifted Unit Step Function

Step functions are the simplest discontinuous functions we can have. The (basic) unit step function,
which we will denote by $\operatorname{step}(t)$ , is defined by
\[ \operatorname{step}(t) = \begin{cases} 0 & \text{if } t < 0 \\ 1 & \text{if } 0 < t \end{cases} . \]

Its graph has been sketched in ﬁgure 23.3a.1

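For any real value α , the corresponding shifted unit step function, which we will denote by
$\operatorname{step}_\alpha$ , is given by
\[ \operatorname{step}_\alpha(t) = \operatorname{step}(t-\alpha) = \begin{cases} 0 & \text{if } t-\alpha < 0 \\ 1 & \text{if } 0 < t-\alpha \end{cases} = \begin{cases} 0 & \text{if } t < \alpha \\ 1 & \text{if } \alpha < t \end{cases} . \]
Its graph, with α > 0 , has been sketched in figure 23.3b. Do observe that the basic step function
and the step function at zero are the same, $\operatorname{step}(t) = \operatorname{step}_0(t)$ .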
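You may have noted that we've not defined the step functions at their points of discontinuity
( t = 0 for $\operatorname{step}(t)$ , and t = α for $\operatorname{step}_\alpha(t)$ ). That is because the value of a step function right
at its single discontinuity will be completely irrelevant in any of our computations or applications.
Observe this fact as we compute the Laplace transform of $\operatorname{step}_\alpha(t)$ when α ≥ 0 :
\[
\begin{aligned}
\mathcal{L}\left[\operatorname{step}_\alpha(t)\right]\big|_s &= \int_0^\infty \operatorname{step}_\alpha(t)e^{-st}\,dt \\
&= \int_0^\alpha \operatorname{step}_\alpha(t)e^{-st}\,dt + \int_\alpha^\infty \operatorname{step}_\alpha(t)e^{-st}\,dt \\
&= \int_0^\alpha 0\cdot e^{-st}\,dt + \int_\alpha^\infty 1\cdot e^{-st}\,dt = \int_\alpha^\infty e^{-st}\,dt .
\end{aligned}
\]
You can easily show that the above integral is infinite if s < 0 or s = 0 . But if s > 0 , then the
above becomes
\[
\begin{aligned}
\mathcal{L}\left[\operatorname{step}_\alpha(t)\right]\big|_s = \int_\alpha^\infty e^{-st}\,dt &= \left.\frac{1}{-s}e^{-st}\right|_{t=\alpha}^{\infty} \\
&= \lim_{t\to\infty}\frac{1}{-s}e^{-st} - \frac{1}{-s}e^{-s\alpha} = 0 + \frac{1}{s}e^{-\alpha s} .
\end{aligned}
\]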
t→∞ −s −s s
1 The unit step function is also called the Heaviside step function, and, in other texts, is often denoted by u and, occasionally,
by h or H .


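Thus,
\[ \mathcal{L}\left[\operatorname{step}_\alpha(t)\right]\big|_s = \frac{1}{s}e^{-\alpha s} \quad\text{for } s > 0 \text{ and } \alpha \ge 0 . \tag{23.18} \]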
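
Formula (23.18) can also be checked against a direct numerical approximation of the defining integral. In the sketch below, the infinite upper limit is replaced by a large cutoff `T` (an assumption that is harmless because the neglected tail is of size $e^{-sT}/s$ ), and `simpson` is a hand-written composite Simpson's rule.

```python
import math

def simpson(func, a, b, n=20000):
    """Composite Simpson's rule on [a, b] using n (even) subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3

def step_transform_numeric(alpha, s, T=60.0):
    # step_alpha(t) is 0 on (0, alpha) and 1 afterwards, so integrate e^{-st} on [alpha, T].
    return simpson(lambda t: math.exp(-s * t), alpha, T)

def step_transform_formula(alpha, s):
    # Formula (23.18), valid for s > 0 and alpha >= 0.
    return math.exp(-alpha * s) / s

alpha, s = 1.5, 2.0
print(step_transform_numeric(alpha, s), step_transform_formula(alpha, s))
```

Notice that the value of the step function at t = α never enters the computation, in keeping with the remark made before the derivation.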

23.4 The First Translation Identity (And More Transforms)
The linearity of the Laplace transform allows us to construct transforms from linear combinations
of known transforms. Other identities allow us to construct new transforms from other formulas
involving known transforms. One particularly useful identity is the “ﬁrst translation identity” (also
called the “translation along the S–axis identity” for reasons that will soon be obvious). The
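derivation of this identity starts with the observation that, in the expression
\[ F(s) = \mathcal{L}[f(t)]\big|_s = \int_0^\infty f(t)e^{-st}\,dt \quad\text{for } s > s_0 , \]
the s is simply a place holder. It can be replaced with any symbol, say, X , that does not involve the
integration variable, t ,
\[ F(X) = \mathcal{L}[f(t)]\big|_X = \int_0^\infty f(t)e^{-Xt}\,dt \quad\text{for } X > s_0 . \]
In particular, let X = s − α where α is any real constant. Using this for X in the above gives us
\[ F(s-\alpha) = \mathcal{L}[f(t)]\big|_{s-\alpha} = \int_0^\infty f(t)e^{-(s-\alpha)t}\,dt \quad\text{for } s-\alpha > s_0 . \]
But
\[ s - \alpha > s_0 \iff s > s_0 + \alpha \]
and
\[
\begin{aligned}
\int_0^\infty f(t)e^{-(s-\alpha)t}\,dt &= \int_0^\infty f(t)e^{-st}e^{\alpha t}\,dt \\
&= \int_0^\infty e^{\alpha t} f(t)e^{-st}\,dt = \mathcal{L}\left[e^{\alpha t} f(t)\right]\big|_s .
\end{aligned}
\]
So the expression above for F(s − α) can be written as
\[ F(s-\alpha) = \mathcal{L}\left[e^{\alpha t} f(t)\right]\big|_s \quad\text{for } s > s_0 + \alpha . \]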

This gives us the following identity:

Jump Discontinuities
A function f is said to have a jump discontinuity at a point $t_0$ if the left- and right-hand limits
\[ \lim_{t\to t_0^-} f(t) \quad\text{and}\quad \lim_{t\to t_0^+} f(t) \]
exist, but are different finite numbers. The jump at this discontinuity is the difference of the two
limits,
\[ \text{jump} = \lim_{t\to t_0^+} f(t) - \lim_{t\to t_0^-} f(t) , \]
and the average of the two limits is the Y-coordinate of the midpoint of the jump,
\[ y_{\text{midpoint}} = \frac{1}{2}\left[\lim_{t\to t_0^+} f(t) + \lim_{t\to t_0^-} f(t)\right] . \]

A generic example is sketched in figure 23.4a. And right beside that figure (in figure 23.4b) is the
graph of a function with multiple jump discontinuities.
The simplest example of a function with a jump discontinuity is the basic step function, step(t) .
Just looking at its graph (figure 23.3a on page 446) you can see that it has a jump discontinuity at
t = 0 with jump = 1 , and y = 1/2 as the Y -coordinate of the midpoint.
On the other hand, consider the functions
\[ f(t) = \frac{1}{(t-2)^2} \quad\text{and}\quad g(t) = \begin{cases} 0 & \text{if } t < 2 \\[0.5ex] \dfrac{1}{2(t-2)} & \text{if } 2 < t \end{cases} , \]
sketched in figures 23.5a and 23.5b, respectively. Both have discontinuities at t = 2 . In each case,
however, the limit of the function as t → 2 from the right is infinite. Hence, we do not view these
discontinuities as "jump" discontinuities.

Piecewise Continuity
We say that a function f is piecewise continuous on a finite open interval (a, b) if and only if both
of the following hold:
1. f is continuous on the interval except for, at most, a finite number of jump discontinuities
in (a, b) .
2. The endpoint limits
\[ \lim_{t\to a^+} f(t) \quad\text{and}\quad \lim_{t\to b^-} f(t) \]
exist and are finite.


Figure 23.5: Functions having at least one point with an infinite left- or right-hand limit at some
point.

We extend this concept to functions on infinite open intervals (such as (0, ∞) ) by defining a
function f to be piecewise continuous on an infinite open interval if and only if f is piecewise
continuous on every finite open subinterval. In particular then, a function f being piecewise
continuous on (0, ∞) means that
\[ \lim_{t\to 0^+} f(t) \]
is a finite value, and that, for every finite, positive value T , f (t) has at most a finite number of
discontinuities on the interval (0, T ) , with each of those being a jump discontinuity.
For some of our discussions, we will only need our function f to be piecewise continuous
on (0, ∞) . Strictly speaking, this says nothing about the possible value of f (t) when t = 0 .
If, however, we are dealing with initial-value problems, then we may require our function f to
be piecewise continuous on [0, ∞) , which simply means f is piecewise continuous on (0, ∞) ,
defined at t = 0 , and
\[ f(0) = \lim_{t\to 0^+} f(t) . \]
Keep in mind that “a finite number” of jump discontinuities can be zero, in which case f has
no discontinuities and is, in fact, continuous on that interval. What is important is that a piecewise
continuous function cannot ‘blow up’ at any (finite) point in or at the ends of the interval. At worst,
it has only ‘a few’ jump discontinuities in each finite subinterval.
The functions sketched in figure 23.4 are piecewise continuous, at least over the intervals in the
figures. And any step function is piecewise continuous on (0, ∞) . On the other hand, the functions
sketched in figures 23.5a and 23.5b, are not piecewise continuous on (0, ∞) because they both
"blow up" at t = 2 . Consider even the function
\[ f(t) = \frac{1}{t} , \]
sketched in figure 23.5c. Even though this function is continuous on the interval (0, ∞) , we do not
consider it to be piecewise continuous on (0, ∞) because
\[ \lim_{t\to 0^+} \frac{1}{t} = \infty . \]

Two simple observations will soon be important to us:

1. If f is piecewise continuous on (0, ∞) , and T is any positive finite value, then the integral
\[ \int_0^T f(t)\,dt \]

is well deﬁned and evaluates to a ﬁnite number. Remember, geometrically, this integral is the
“net area” between the graph of f and the T –axis over the interval (0, T ) . The piecewise


continuity of f assures us that f does not “blow up” at any point in (0, T ) , and that we
can divide the graph of f over (0, T ) into a finite number of fairly nicely behaved ‘pieces’
(see figure 23.4b) with each piece enclosing finite area.

2. The product of any two piecewise continuous functions f and g on (0, ∞) will, itself, be
piecewise continuous on (0, ∞) . You can easily verify this yourself using the fact that
\[ \lim_{t\to t_0^\pm} f(t)g(t) = \lim_{t\to t_0^\pm} f(t) \times \lim_{t\to t_0^\pm} g(t) . \]

Combining the above two observations with the obvious fact that, for any real value of s ,
g(t) = e−st is a piecewise continuous function of t on (0, ∞) gives us:

Lemma 23.3
Let f be a piecewise continuous function on (0, ∞) , and let T be any finite positive number. Then
the integral
\[ \int_0^T f(t)e^{-st}\,dt \]
is a well-defined finite number for each real value s .

Because of our interest in the Laplace transform, we will want to ensure that the above integral
converges to a ﬁnite number as T → ∞ . That is the next issue we will address.

Exponential Order∗
Let f be a function on (0, ∞) , and let s0 be some real number. We say that f is of exponential
order s0 if and only if there are ﬁnite constants M and T such that

| f (t)| ≤ Mes0 t whenever T ≤t . (23.21)

Often, the precise value of s0 is not particularly important. In these cases we may just say that f
is of exponential order to indicate that it is of exponential order s0 for some value s0 .
Saying that f is of exponential order is just saying that the graph of | f (t)| is bounded above
by the graph of some constant multiple of some exponential function on some interval of the form
[T, ∞) . Note that, if this is the case and s is any real number, then

\[ \left| f(t)e^{-st} \right| = |f(t)|\, e^{-st} \le M e^{s_0 t} e^{-st} = M e^{-(s-s_0)t} \quad\text{whenever}\quad T \le t . \]

Moreover, if s > s0 , then s − s0 is positive, and

\[ \left| f(t)e^{-st} \right| \le M e^{-(s-s_0)t} \to 0 \quad\text{as}\quad t \to \infty . \tag{23.22} \]

Thus, in the future, we will automatically know that

\[ \lim_{t \to \infty} f(t)e^{-st} = 0 \]

whenever f is of exponential order s0 and s > s0 .
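
To make the definition a little more concrete, here is a small numerical sketch (not part of the original development) checking that $t^2$ is of exponential order σ for a sample σ > 0 . The constant used is the calculus maximum of $t^\alpha e^{-\sigma t}$ on [0, ∞) , which occurs at t = α/σ ; the function name `exponential_bound_constant`, the sample values, and the sampling grid are all assumptions made purely for illustration.

```python
import math

def exponential_bound_constant(alpha, sigma):
    """Maximum of t**alpha * exp(-sigma*t) on [0, inf), attained at t = alpha/sigma."""
    return (alpha / (sigma * math.e)) ** alpha

sigma = 0.5                      # illustrative choice; any sigma > 0 works
M = exponential_bound_constant(2, sigma)

# Check |f(t)| <= M e^{sigma t} for f(t) = t^2 on a grid of sample points.
# The tiny relative slack only absorbs floating-point rounding at the
# point t = alpha/sigma, where the two sides coincide.
ts = [0.1 * k for k in range(1, 1001)]
bounded = all(t**2 <= M * math.exp(sigma * t) * (1 + 1e-12) for t in ts)

# With s = 1 > sigma, the products f(t) e^{-st} should shrink toward 0.
tail = [t**2 * math.exp(-1.0 * t) for t in (10, 50, 100)]
print(bounded, tail)
```

A finite grid cannot prove the inequality for all t , of course; it only illustrates what inequality (23.21) is asserting.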

is a continuous function on (s0 , ∞) and

\[ \lim_{s \to \infty} F(s) = 0 . \]

We will verify this theorem at the end of the chapter.

23.6 Further Notes on Piecewise Continuity and Exponential Order
Issues Regarding Piecewise Continuous Functions on (0, ∞)
In the next several chapters, we will be concerned mainly with functions that are piecewise continuous
on (0, ∞) . There are a few small technical issues regarding these functions that could become
signiﬁcant later if we don’t deal with them now. These issues concern the values of such functions
at jumps.

On the Value of a Function at a Jump

Take a look at ﬁgure 23.4b on page 450. Call the function sketched there f , and consider evaluating,
say,
\[ \int_0^{t_2} f(t)e^{-st}\,dt . \]

The obvious approach is to break up the integral into three pieces,

\[ \int_0^{t_2} f(t)e^{-st}\,dt = \int_0^{t_0} f(t)e^{-st}\,dt + \int_{t_0}^{t_1} f(t)e^{-st}\,dt + \int_{t_1}^{t_2} f(t)e^{-st}\,dt , \]

and use values/formulas for f over the intervals (0, t0 ) , (t0 , t1 ) and (t1 , t2 ) to compute the indi-
vidual integrals in the above sum. What you would not worry about would be the actual values of
f at the points of discontinuity, t0 , t1 and t2 . In particular, it would not matter if

\[ f(t_0) = \lim_{t \to t_0^-} f(t) \quad\text{or}\quad f(t_0) = \lim_{t \to t_0^+} f(t) \]
or
\[ f(t_0) = \text{the } Y\text{-coordinate of the midpoint of the jump} . \]

This extends an observation made when we computed the Laplace transform of the shifted
step function. There, we found that the precise value of stepα (t) at t = α was irrelevant to the
computation of $\mathcal{L}[\operatorname{step}_\alpha(t)]$ . And the pseudo-computations in the previous paragraph point out
that, in general, the value of any piecewise continuous function at a point of discontinuity will be
irrelevant to the integral computations we will be doing with these functions.
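
As a rough numerical illustration of this splitting (a sketch under stated assumptions, not the text's own method), the snippet below integrates $\operatorname{step}_2(t)e^{-st}$ over (0, 5) by handling each continuous piece separately. The helper names `simpson` and `integrate_pieces`, the choice s = 1 , and the use of Simpson's rule are assumptions made only for this example. Note that no value is ever assigned to the function at the jump t = 2 .

```python
import math

def simpson(func, a, b, n=2000):
    """Composite Simpson's rule on [a, b] using an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3

def integrate_pieces(pieces, s):
    """Sum the integrals of formula(t) * exp(-s*t) over each interval (a, b).

    `pieces` is a list of (a, b, formula) triples, where `formula` is the
    continuous formula for f on the open interval (a, b).
    """
    total = 0.0
    for a, b, formula in pieces:
        total += simpson(lambda t, g=formula: g(t) * math.exp(-s * t), a, b)
    return total

# step_2 on (0, 5): identically 0 on (0, 2) and identically 1 on (2, 5).
step2_pieces = [(0.0, 2.0, lambda t: 0.0), (2.0, 5.0, lambda t: 1.0)]
s = 1.0
approx = integrate_pieces(step2_pieces, s)
exact = (math.exp(-2 * s) - math.exp(-5 * s)) / s  # integral of e^{-st} over (2, 5)
print(approx, exact)
```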
Parallel to these observations are the observations of how we use functions with jump discon-
tinuities in applications. Typically, a function with a jump discontinuity at t = t0 is modeling
something that changes so quickly around t = t0 that we might as well pretend the change is
instantaneous. Consider, for example, the output of a one-lumen incandescent light bulb switched on
at t = 2 : Until it is switched on, the bulb’s light output is 0 lumen. For a brief period around
t = 2 the filament is warming up and the light output increases from 0 to 1 lumen, and remains at
1 lumen thereafter. In practice, however, the warm-up time is so brief that we don’t notice it, and
we are content to describe the light output by

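\[ \text{light output at time } t = \begin{cases} 0 \text{ lumen} & \text{if } t < 2 \\ 1 \text{ lumen} & \text{if } 2 < t \end{cases} = \operatorname{step}_2(t) \text{ lumen} \]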

without giving any real thought as to the value of the light output the very instant we are turning on
the bulb.4
What all this is getting to is that, for our work involving piecewise continuous functions on
(0, ∞) ,

the value of a function f at any point of discontinuity t0 in (0, ∞) is irrelevant.

What is important is not f (t0 ) but the one-sided limits

\[ \lim_{t \to t_0^-} f(t) \quad\text{and}\quad \lim_{t \to t_0^+} f(t) . \]

Because of this, we will not normally specify the value of a function at a discontinuity, at least
not while developing Laplace transforms. If this disturbs you, go ahead and assume that, unless
otherwise indicated, the value of a function at each jump discontinuity is given by the Y -coordinate
of the jump’s midpoint. It’s as good as any other value.

Equality of Piecewise Continuous Functions

Because of the irrelevance of the value of a function at a discontinuity, we need to slightly modify
what it means to say “ f = g on some interval”. Henceforth, let us say that

f = g on some interval (as piecewise continuous functions)

means
f (t) = g(t)

for every t in the interval at which f and g are continuous. We will not insist that f and g be
equal at the relatively few points of discontinuity in the functions. But do note that we will still have

\[ \lim_{t \to t_0^\pm} f(t) = \lim_{t \to t_0^\pm} g(t) \]

for every t0 in the interval. Consequently, the graphs of f and g will have the same ‘jumps’ in the
interval.
By the way, the phrase “as piecewise continuous functions” in the above deﬁnition is recom-
mended but is often forgotten.

!Example 23.10: The functions

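\[ \operatorname{step}_2(t) , \quad f(t) = \begin{cases} 0 & \text{if } t \le 2 \\ 1 & \text{if } 2 < t \end{cases} \quad\text{and}\quad g(t) = \begin{cases} 0 & \text{if } t < 2 \\ 1 & \text{if } 2 \le t \end{cases} \]

all satisfy
\[ \operatorname{step}_2(t) = f(t) = g(t) \]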

4 On the other hand, “What is the light output of a one-lumen light bulb the very instant the light is turned on?” may be a
nice question to meditate upon if you are studying Zen.

for all values of t in (0, ∞) except t = 2 , at which each has a jump. So, as piecewise continuous
functions,
step2 = f = g on (0, ∞) .

Conversely, if we know h = step2 on (0, ∞) (as piecewise continuous functions), then we
know
\[ h(t) = \begin{cases} 0 & \text{if } 0 < t < 2 \\ 1 & \text{if } 2 < t \end{cases} . \]
We do not know (nor do we care about) the value of h(t) when t = 2 (or when t < 0 ).

Testing for Exponential Order

Before deriving this test for exponential order, it should be noted that the “order” is not unique. After
all, if
\[ |f(t)| \le M e^{s_0 t} \quad\text{whenever}\quad T \le t , \]
and s0 ≤ σ , then
\[ |f(t)| \le M e^{s_0 t} \le M e^{\sigma t} \quad\text{whenever}\quad T \le t , \]
proving the following little lemma:

Lemma 23.6
If f is of exponential order s0 , then f is of exponential order σ for every σ ≥ s0 .

Now here is the test:

Lemma 23.7 (test for exponential order)

Let f be a function on (0, ∞) .
1. If there is a real value s0 such that

\[ \lim_{t \to \infty} f(t)e^{-s_0 t} = 0 , \]

then f is of exponential order s0 .

The above lemma will let us use the exponential bound $M_0 e^{s_0 t}$ over all of (0, ∞) , and not just
(T, ∞) . The next lemma is one you should either already be acquainted with or can easily confirm
on your own.

Lemma 23.9
If g is an integrable function on the interval (a, b) , then
\[ \left| \int_a^b g(t)\,dt \right| \le \int_a^b |g(t)|\,dt . \]

Proof of Theorem 23.5

Now we will prove the two claims of theorem 23.5. Keep in mind that f is a piecewise continuous
function on (0, ∞) of exponential order s0 , and that
\[ F(s) = \int_0^\infty f(t)e^{-st}\,dt \quad\text{for}\quad s > s_0 . \]
We will make repeated use of the fact, stated in lemma 23.8 just above, that there is a constant M0
such that
\[ |f(t)| \le M_0 e^{s_0 t} \quad\text{for}\quad 0 < t . \tag{23.23} \]
Since the second claim is a little easier to verify, we will start with that.

Proof of the Second Claim

The second claim is that
\[ \lim_{s \to \infty} F(s) = 0 , \]
which, of course, can be proven by showing
\[ \lim_{s \to \infty} |F(s)| \le 0 . \]

Now let s > s0 . Using inequality (23.23) with the integral inequality from lemma 23.9, we
have

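\[
\begin{aligned}
|F(s)| = \left| \int_0^\infty f(t)e^{-st}\,dt \right| &\le \int_0^\infty \left| f(t)e^{-st} \right| dt \\
&= \int_0^\infty |f(t)|\, e^{-st}\,dt \\
&\le \int_0^\infty M_0 e^{s_0 t} e^{-st}\,dt = M_0\, \mathcal{L}\!\left[e^{s_0 t}\right]\Big|_s = \frac{M_0}{s - s_0} .
\end{aligned}
\]
Thus,
\[ \lim_{s \to \infty} |F(s)| \le \lim_{s \to \infty} \frac{M_0}{s - s_0} = 0 , \]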
conﬁrming the claim.

Proof of the First Claim

The ﬁrst claim is that F is continuous on (s0 , ∞) . To prove this, we need to show that, for each
s1 > s0 ,
\[ \lim_{s \to s_1} F(s) = F(s_1) . \]
Note that this limit can be verified by showing
\[ \lim_{s \to s_1} |F(s) - F(s_1)| \le 0 . \]

Now let s and s1 be two different points in (s0 , ∞) (Hence, s − s0 > 0 and s1 − s0 > 0 ).
Using the integral inequality from lemma 23.9, we get
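\[
\begin{aligned}
|F(s) - F(s_1)| &= \left| \int_0^\infty f(t)e^{-st}\,dt - \int_0^\infty f(t)e^{-s_1 t}\,dt \right| \\
&= \left| \int_0^\infty f(t)\left[ e^{-st} - e^{-s_1 t} \right] dt \right| \le \int_0^\infty |f(t)| \left| e^{-st} - e^{-s_1 t} \right| dt .
\end{aligned}
\]
Applying inequality (23.23), we then have
\[ |F(s) - F(s_1)| \le \int_0^\infty M_0 e^{s_0 t} \left| e^{-st} - e^{-s_1 t} \right| dt . \tag{23.24} \]
Now observe that, if s ≤ s1 and t ≥ 0 , then
\[ e^{s_0 t} \left| e^{-st} - e^{-s_1 t} \right| = e^{s_0 t} \left[ e^{-st} - e^{-s_1 t} \right] = e^{-(s-s_0)t} - e^{-(s_1-s_0)t} . \]
Thus, when s ≤ s1 ,
\[
\begin{aligned}
\int_0^\infty e^{s_0 t} \left| e^{-st} - e^{-s_1 t} \right| dt &= \int_0^\infty \left[ e^{-(s-s_0)t} - e^{-(s_1-s_0)t} \right] dt \\
&= \frac{1}{s - s_0} - \frac{1}{s_1 - s_0} = \left| \frac{1}{s - s_0} - \frac{1}{s_1 - s_0} \right| .
\end{aligned}
\]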

Almost identical computations also show that, when s ≥ s1 ,

Additional Exercises

23.14. Several functions are given below. Sketch the graph of each over an appropriate interval,
and decide whether each is or is not piecewise continuous on (0, ∞) .

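a. $f(t) = 2 \operatorname{step}_3(t)$
b. $g(t) = \operatorname{step}_2(t) - \operatorname{step}_3(t)$
c. $\sin(t)$
d. $\dfrac{\sin(t)}{t}$
e. $\tan(t)$
f. $\sqrt{t}$
g. $\dfrac{1}{\sqrt{t}}$
h. $t^2 - 1$
i. $\dfrac{1}{t^2 - 1}$
j. $\dfrac{1}{t^2 + 1}$
k. The “ever increasing stair” function,
\[ \operatorname{stair}(t) = \begin{cases} 0 & \text{if } t < 0 \\ 1 & \text{if } 0 < t < 1 \\ 2 & \text{if } 2 < t < 3 \\ 3 & \text{if } 3 < t < 4 \\ 4 & \text{if } 4 < t < 5 \\ \ \vdots & \qquad \vdots \end{cases} \]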

23.15. Assume f and g are two piecewise continuous functions on an interval (a, b) containing
the point t0 . Assume further that f has a jump discontinuity at t0 while g is continuous
at t0 . Verify that the jump in the product f g at t0 is given by

“the jump in f at t0 ” × g(t0 ) .

23.16. Using either the basic deﬁnition or the test for exponential order (lemma 23.7 on page
457), determine which of the following are of exponential order, and, for each which is of
exponential order, determine the possible values for the order.

a. $e^{3t}$   b. $t^2$   c. $te^{3t}$   d. $e^{t^2}$   e. $\sin(t)$

23.17. For the following, let α and σ be any two positive numbers.

a. Using basic calculus, show that $t^\alpha e^{-\sigma t}$ has a maximum value $M_{\alpha,\sigma}$ on the interval
[0, ∞) . Also, find both where this maximum occurs and the value of $M_{\alpha,\sigma}$ .
b. Explain why this confirms that
i. $t^\alpha \le M_{\alpha,\sigma} e^{\sigma t}$ whenever t > 0 , and that
ii. $t^\alpha$ is of exponential order σ for any σ > 0 .

23.18. Assume f is a piecewise continuous function on (0, ∞) of exponential order s0 , and let
α and σ be any two positive numbers. Using the results of the last exercise, show that
$t^\alpha f(t)$ is piecewise continuous on (0, ∞) and of exponential order s0 + σ .

23.19. For this problem, let

\[ g(t, s) = 2st\, e^{-st^2} . \]
Your goal, in the following, is to show that
\[ \lim_{s \to \infty} \int_0^\infty g(t, s)\,dt \ne \int_0^\infty \lim_{s \to \infty} g(t, s)\,dt . \]

a. By simply computing the integral and then taking the limit, show that
\[ \lim_{s \to \infty} \int_0^\infty g(t, s)\,dt = 1 . \]

b. Then, using L’Hôpital’s rule, verify that, for each t ≥ 0 ,
\[ \lim_{s \to \infty} g(t, s) = 0 , \]
and observe that this means
\[ \int_0^\infty \lim_{s \to \infty} g(t, s)\,dt = 0 \ne 1 = \lim_{s \to \infty} \int_0^\infty g(t, s)\,dt . \]

24
Differentiation and the Laplace Transform

In this chapter, we will explore how the Laplace transform interacts with the basic operators of
calculus: differentiation and integration. The greatest interest will be in the ﬁrst identity that we
will derive. This relates the transform of a derivative of a function to the transform of the original
function, and will allow us to convert many initial-value problems to easily solved algebraic equations.
But there are other useful relations involving the Laplace transform and either differentiation or
integration. So we’ll look at them, too.

24.1 Transforms of Derivatives

The Main Identity
To see how the Laplace transform can convert a differential equation to a simple algebraic equation,
let us examine how the transform of a function’s derivative,
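\[ \mathcal{L}[f'(t)]\big|_s = \mathcal{L}\!\left[\frac{df}{dt}\right]\Big|_s = \int_0^\infty \frac{df}{dt}\, e^{-st}\,dt = \int_0^\infty e^{-st}\, \frac{df}{dt}\,dt , \]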

is related to the corresponding transform of the original function,

\[ F(s) = \mathcal{L}[f(t)]\big|_s = \int_0^\infty f(t)e^{-st}\,dt . \]

The last formula above for $\mathcal{L}[f'(t)]$ clearly suggests using integration by parts, and to ensure that
this integration by parts is valid, we need to assume f is continuous on [0, ∞) and f′ is at least
piecewise continuous on (0, ∞) . Assuming this,
\[
\begin{aligned}
\mathcal{L}[f'(t)]\big|_s &= \int_0^\infty \underbrace{e^{-st}}_{u}\, \underbrace{\frac{df}{dt}\,dt}_{dv} \\
&= uv \Big|_{t=0}^{\infty} - \int_0^\infty v\,du \\
&= e^{-st} f(t) \Big|_{t=0}^{\infty} - \int_0^\infty f(t) \left[ -s e^{-st} \right] dt \\
&= \lim_{t \to \infty} e^{-st} f(t) - e^{-s \cdot 0} f(0) - \int_0^\infty \left[ -s e^{-st} f(t) \right] dt \\
&= \lim_{t \to \infty} e^{-st} f(t) - f(0) + s \int_0^\infty f(t) e^{-st}\,dt .
\end{aligned}
\]

Now, if f is of exponential order s0 , then
\[ \lim_{t \to \infty} e^{-st} f(t) = 0 \quad\text{whenever}\quad s > s_0 \]
and
\[ F(s) = \mathcal{L}[f(t)]\big|_s = \int_0^\infty f(t)e^{-st}\,dt \quad\text{exists for}\quad s > s_0 . \]

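Thus, continuing the above computations for $\mathcal{L}[f'(t)]$ with s > s0 , we find that
\[
\begin{aligned}
\mathcal{L}[f'(t)]\big|_s &= \lim_{t \to \infty} e^{-st} f(t) - f(0) + s \int_0^\infty f(t)e^{-st}\,dt \\
&= 0 - f(0) + s\,\mathcal{L}[f(t)]\big|_s ,
\end{aligned}
\]
which is a little more conveniently written as
\[ \mathcal{L}[f'(t)]\big|_s = s\,\mathcal{L}[f(t)]\big|_s - f(0) \tag{24.1a} \]
or even as
\[ \mathcal{L}[f'(t)]\big|_s = sF(s) - f(0) . \tag{24.1b} \]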

This will be a very useful result, well worth preserving in a theorem.

Theorem 24.1 (transform of a derivative)

Let F = L[ f ] where f is a continuous function of exponential order s0 on [0, ∞) . If f′ is at
least piecewise continuous on (0, ∞) , then
\[ \mathcal{L}[f'(t)]\big|_s = sF(s) - f(0) \quad\text{for}\quad s > s_0 . \]
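
The identity can be spot-checked numerically. The following is only a sketch under stated assumptions: the test function f(t) = cos(3t) , the value s = 2 , the truncation of the integral at T = 40 , and the helper names `simpson` and `laplace` are all choices made for this illustration and do not come from the text.

```python
import math

def simpson(func, a, b, n=40000):
    """Composite Simpson's rule on [a, b] using an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3

def laplace(func, s, T=40.0):
    """Approximate the transform by truncating the integral at T.

    Assumption: the tail beyond T is negligible, which holds here because
    the integrand decays like e^{-2t}.
    """
    return simpson(lambda t: func(t) * math.exp(-s * t), 0.0, T)

f = lambda t: math.cos(3 * t)          # continuous, of exponential order 0
df = lambda t: -3 * math.sin(3 * t)    # its derivative
s = 2.0

lhs = laplace(df, s)                   # transform of f'
rhs = s * laplace(f, s) - f(0.0)       # s F(s) - f(0)
print(lhs, rhs)
```

Both quantities should agree with the closed-form value $-9/(s^2 + 9)$ at s = 2 , up to the error of the numerical integration.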

Extending these identities to formulas for the transforms of higher derivatives is easy. First, for
convenience, rewrite equation (24.1a) as
\[ \mathcal{L}[g'(t)]\big|_s = s\,\mathcal{L}[g(t)]\big|_s - g(0) \]
or, equivalently, as
\[ \mathcal{L}\!\left[\frac{dg}{dt}\right]\Big|_s = s\,\mathcal{L}[g(t)]\big|_s - g(0) . \]

(Keep in mind that this assumes g is a continuous function of exponential order, g′ is piecewise
continuous and s is larger than the order of g .) Now we simply apply this equation with g = f′ ,
g = f″ , etc. Assuming all the functions are sufficiently continuous and are of exponential order,
we see that
\[
\begin{aligned}
\mathcal{L}[f''(t)]\big|_s = \mathcal{L}\!\left[\frac{df'}{dt}\right]\Big|_s &= s\,\mathcal{L}[f'(t)]\big|_s - f'(0) \\
&= s\left[ sF(s) - f(0) \right] - f'(0) \\
&= s^2 F(s) - s f(0) - f'(0) .
\end{aligned}
\]

Using this, we then see that

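\[
\begin{aligned}
\mathcal{L}[f'''(t)]\big|_s = \mathcal{L}\!\left[\frac{df''}{dt}\right]\Big|_s &= s\,\mathcal{L}[f''(t)]\big|_s - f''(0) \\
&= s\left[ s^2 F(s) - s f(0) - f'(0) \right] - f''(0) \\
&= s^3 F(s) - s^2 f(0) - s f'(0) - f''(0) .
\end{aligned}
\]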

Clearly, if we continue, we will end up with the following corollary to theorem 24.1:

Corollary 24.2 (transforms of derivatives)

Let F = L[ f ] where f is a continuous function of exponential order s0 on [0, ∞) . If f′ is at
least piecewise continuous on (0, ∞) , then
\[ \mathcal{L}[f'(t)]\big|_s = sF(s) - f(0) \quad\text{for}\quad s > s_0 . \]

If, in addition, f′ is a continuous function of exponential order s0 , and f″ is at least piecewise
continuous, then
\[ \mathcal{L}[f''(t)]\big|_s = s^2 F(s) - s f(0) - f'(0) \quad\text{for}\quad s > s_0 . \]

More generally, if f , f′ , f″ , …, and $f^{(n-1)}$ are all continuous functions of exponential order s0
on [0, ∞) for some positive integer n , and $f^{(n)}$ is at least piecewise continuous on (0, ∞) , then,
for s > s0 ,
\[
\begin{aligned}
\mathcal{L}\!\left[ f^{(n)}(t) \right]\Big|_s = {} & s^n F(s) - s^{n-1} f(0) - s^{n-2} f'(0) \\
& - s^{n-3} f''(0) - \cdots - s f^{(n-2)}(0) - f^{(n-1)}(0) .
\end{aligned}
\]
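
The general formula is easy to mechanize. The sketch below is illustrative only: the function name `transform_of_nth_derivative` is hypothetical, and the test case f(t) = e^{3t} with s = 5 is an assumption chosen because its transform, 1/(s − 3) for s > 3 , and all of its derivatives are known exactly. Exact rational arithmetic is used so the comparison involves no rounding.

```python
from fractions import Fraction

def transform_of_nth_derivative(F_s, s, initial_values):
    """Evaluate s^n F(s) - s^(n-1) f(0) - ... - f^(n-1)(0).

    `initial_values` lists f(0), f'(0), ..., f^(n-1)(0); its length is n.
    `F_s` is the value of the transform F at the point s.
    """
    n = len(initial_values)
    result = s**n * F_s
    for k, value in enumerate(initial_values):
        result -= s**(n - 1 - k) * value
    return result

s = Fraction(5)
F_s = 1 / (s - 3)                                    # transform of e^{3t} at s = 5
initial_values = [Fraction(3)**k for k in range(3)]  # f(0), f'(0), f''(0) = 1, 3, 9
third = transform_of_nth_derivative(F_s, s, initial_values)
expected = Fraction(27) / (s - 3)                    # transform of f'''(t) = 27 e^{3t}
print(third, expected)
```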

Using the Main Identity

Let us now see how these identities can be used in solving initial-value problems. We’ll start with
something simple:

!Example 24.1: Consider the initial-value problem

\[ \frac{dy}{dt} - 3y = 0 \quad\text{with}\quad y(0) = 4 . \]

Observe what happens when we take the Laplace transform of the differential equation (i.e., we
take the transform of both sides). Initially, we just have
\[ \mathcal{L}\!\left[ \frac{dy}{dt} - 3y \right]\Big|_s = \mathcal{L}[0]\big|_s . \]

By the linearity of the transform and the fact that L[0] = 0 , this is the same as
\[ \mathcal{L}\!\left[ \frac{dy}{dt} \right]\Big|_s - 3\,\mathcal{L}[y]\big|_s = 0 . \]

Letting Y = L[y] and applying the “transform of the derivative identity” (theorem 24.1, above),
our equation becomes
\[ sY(s) - y(0) - 3Y(s) = 0 , \]

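\[
\begin{aligned}
\longrightarrow\quad & \left[ s^2 Y(s) - s y(0) - y'(0) \right] - 7\left[ sY(s) - y(0) \right] + 12 Y(s) = \frac{16}{s - 2} \\
\longrightarrow\quad & \left[ s^2 Y(s) - s6 - 4 \right] - 7\left[ sY(s) - 6 \right] + 12 Y(s) = \frac{16}{s - 2} \\
\longrightarrow\quad & s^2 Y(s) - 6s - 4 - 7sY(s) + 7 \cdot 6 + 12 Y(s) = \frac{16}{s - 2} \\
\longrightarrow\quad & \left[ s^2 - 7s + 12 \right] Y(s) - 6s + 38 = \frac{16}{s - 2} \\
\longrightarrow\quad & \left[ s^2 - 7s + 12 \right] Y(s) = \frac{16}{s - 2} + 6s - 38 .
\end{aligned}
\]

Thus,
\[ Y(s) = \frac{16}{(s - 2)\left( s^2 - 7s + 12 \right)} + \frac{6s - 38}{s^2 - 7s + 12} . \tag{24.2} \]
If desired, we can obtain a slightly more concise expression for Y (s) by finding the common
denominator and adding the two terms on the right,
\[ Y(s) = \frac{16}{(s - 2)\left( s^2 - 7s + 12 \right)} + \frac{(s - 2)(6s - 38)}{(s - 2)\left( s^2 - 7s + 12 \right)} , \]

obtaining
\[ Y(s) = \frac{6s^2 - 50s + 92}{(s - 2)\left( s^2 - 7s + 12 \right)} . \tag{24.3} \]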
We will finish solving the above initial-value problem in example 25.6 on page 488. At that time,
we will find the latter expression for Y (s) to be more convenient. At this point, though, there
is no significant advantage gained by reducing expression (24.2) to (24.3). When doing similar
problems in the exercises, go ahead and “find the common denominator and add” if the algebra
is relatively simple. Otherwise, leave your answers as the sum of two terms.
However, do observe that we did NOT multiply out the factors in the denominator, but left
them as
\[ (s - 2)\left( s^2 - 7s + 12 \right) . \]

Do the same in your own work. In the next chapter, we will see that leaving the denominator in
factored form will simplify the task of recovering y(t) from Y (s) .
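
Algebra of this sort is easy to get wrong by a sign, so a quick sanity check can be worthwhile. The sketch below (an illustration only; the function names and the sample points are assumptions) uses exact rational arithmetic to confirm that expressions (24.2) and (24.3) agree, and that both satisfy the last transformed equation above, at several values of s away from the zeros of the denominator.

```python
from fractions import Fraction

def Y_two_terms(s):
    """Expression (24.2): the sum of two terms."""
    q = s * s - 7 * s + 12
    return Fraction(16) / ((s - 2) * q) + (6 * s - 38) / q

def Y_single(s):
    """Expression (24.3): a single fraction with factored denominator."""
    q = s * s - 7 * s + 12
    return (6 * s * s - 50 * s + 92) / ((s - 2) * q)

# Sample points s = 16/3, 17/3, ..., 13; all exceed 4, so none is a zero
# of (s - 2)(s^2 - 7s + 12), whose zeros are 2, 3 and 4.
samples = [Fraction(n, 3) for n in range(16, 40)]

forms_agree = all(Y_two_terms(s) == Y_single(s) for s in samples)
equation_holds = all(
    (s * s - 7 * s + 12) * Y_single(s) - 6 * s + 38 == Fraction(16) / (s - 2)
    for s in samples
)
print(forms_agree, equation_holds)
```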

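!Example 24.3: Find the Laplace transform of t sin(3t) . Here, we have
\[ \mathcal{L}[t \sin(3t)]\big|_s = \mathcal{L}[t f(t)]\big|_s = -\frac{dF}{ds} \]

with f (t) = sin(3t) . From the tables (or memory), we find that
\[ F(s) = \mathcal{L}[f(t)]\big|_s = \mathcal{L}[\sin(3t)]\big|_s = \frac{3}{s^2 + 9} . \]

Applying the identity just derived (identity (24.5)) yields
\[
\begin{aligned}
\mathcal{L}[t \sin(3t)]\big|_s = \mathcal{L}[t f(t)]\big|_s &= -\frac{dF}{ds} \\
&= -\frac{d}{ds}\left[ \frac{3}{s^2 + 9} \right] = -\frac{-3 \cdot 2s}{\left( s^2 + 9 \right)^2} = \frac{6s}{\left( s^2 + 9 \right)^2} .
\end{aligned}
\]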

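Deriving corresponding identities involving higher order derivatives and higher powers of t
is straightforward. Simply use the identity in theorem 24.3 repeatedly, replacing f (t) with t f (t) ,
$t^2 f(t)$ , etc.:
\[
\begin{aligned}
\mathcal{L}\!\left[ t^2 f(t) \right]\Big|_s = \mathcal{L}\!\left[ t\,[t f(t)] \right]\Big|_s &= -\frac{d}{ds} \mathcal{L}[t f(t)]\big|_s \\
&= -\frac{d}{ds}\left[ -\frac{dF}{ds} \right] = (-1)^2 \frac{d^2 F}{ds^2} ,
\end{aligned}
\]
\[
\begin{aligned}
\mathcal{L}\!\left[ t^3 f(t) \right]\Big|_s = \mathcal{L}\!\left[ t\,[t^2 f(t)] \right]\Big|_s &= -\frac{d}{ds} \mathcal{L}\!\left[ t^2 f(t) \right]\Big|_s \\
&= -\frac{d}{ds}\left[ (-1)^2 \frac{d^2 F}{ds^2} \right] = (-1)^3 \frac{d^3 F}{ds^3} ,
\end{aligned}
\]
\[
\begin{aligned}
\mathcal{L}\!\left[ t^4 f(t) \right]\Big|_s = \mathcal{L}\!\left[ t\,[t^3 f(t)] \right]\Big|_s &= -\frac{d}{ds} \mathcal{L}\!\left[ t^3 f(t) \right]\Big|_s \\
&= -\frac{d}{ds}\left[ (-1)^3 \frac{d^3 F}{ds^3} \right] = (-1)^4 \frac{d^4 F}{ds^4} ,
\end{aligned}
\]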

and so on. Clearly, then, as a corollary to theorem 24.3, we have:

Corollary 24.4 (derivatives of transforms)

Let F = L[ f ] where f is a piecewise continuous function of exponential order s0 . Then F(s) is
infinitely differentiable for s > s0 , and
\[ \mathcal{L}\!\left[ t^n f(t) \right]\Big|_s = (-1)^n \frac{d^n F}{ds^n} \quad\text{for}\quad n = 1, 2, 3, \ldots . \]
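
For the case n = 1 with f(t) = sin(3t) , the identity can be checked three ways at a sample point. This is only a sketch: the point s = 2 , the finite-difference step, the truncation of the integral at T = 40 , and the helper name `simpson` are assumptions made for illustration.

```python
import math

def simpson(func, a, b, n=40000):
    """Composite Simpson's rule on [a, b] using an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3

def F(s):
    """Transform of sin(3t), taken from the tables: 3 / (s^2 + 9)."""
    return 3.0 / (s * s + 9.0)

s = 2.0
step = 1e-5

# (1) -dF/ds approximated by a central difference.
minus_dF = -(F(s + step) - F(s - step)) / (2 * step)

# (2) The closed form obtained in example 24.3.
closed_form = 6 * s / (s * s + 9) ** 2

# (3) The defining integral of L[t sin(3t)], truncated at T = 40.
numeric = simpson(lambda t: t * math.sin(3 * t) * math.exp(-s * t), 0.0, 40.0)

print(minus_dF, closed_form, numeric)
```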

For easy reference, all the Laplace transform identities we’ve derived so far are listed in table
24.1. Also in the table are two identities that will be derived in the next section.

Table 24.1: Commonly Used Identities (Version 1)

In the following, $F(s) = \mathcal{L}[f(t)]|_s$ .

h(t)                              H(s) = L[h(t)]|_s                                        Restrictions

$f(t)$                            $\int_0^\infty f(t)e^{-st}\,dt$

$e^{\alpha t} f(t)$               $F(s - \alpha)$                                          α is real

$\dfrac{df}{dt}$                  $sF(s) - f(0)$

$\dfrac{d^2 f}{dt^2}$             $s^2 F(s) - s f(0) - f'(0)$

$\dfrac{d^n f}{dt^n}$             $s^n F(s) - s^{n-1} f(0) - s^{n-2} f'(0)$                n = 1, 2, 3, . . .
                                  $\quad - s^{n-3} f''(0) - \cdots - f^{(n-1)}(0)$

$t f(t)$                          $-\dfrac{dF}{ds}$

$t^n f(t)$                        $(-1)^n \dfrac{d^n F}{ds^n}$                             n = 1, 2, 3, . . .

$\int_0^t f(\tau)\,d\tau$         $\dfrac{F(s)}{s}$

$\dfrac{f(t)}{t}$                 $\int_s^\infty F(\sigma)\,d\sigma$

24.3 Transforms of Integrals and Integrals of Transforms

Analogous to the differentiation identities

\[ \mathcal{L}[f'(t)]\big|_s = sF(s) - f(0) \quad\text{and}\quad \mathcal{L}[t f(t)]\big|_s = -F'(s) \]

are a pair of identities concerning transforms of integrals and integrals of transforms. These identities
will not be nearly as important to us as the differentiation identities, but they do have their uses and
are considered to be part of the standard set of identities for the Laplace transform.
Before we start, however, take another look at the above differentiation identities. They show
that, under the Laplace transform, the differentiation of one of the functions, f (t) or F(s) , corre-
sponds to the multiplication of the other by the appropriate variable. This may lead you to suspect
that the analogous integration identities show that, under the Laplace transform, integration of one
of the functions, f (t) or F(s) , corresponds to the division of the other by the appropriate variable.
Be suspicious. We will conﬁrm (and use) this suspicion.

Transform of an Integral
Let
\[ g(t) = \int_0^t f(\tau)\,d\tau \]
where f is a piecewise continuous function on (0, ∞) and of exponential order s0 . From calculus,
we (should) know the following:

1. g is continuous on [0, ∞) .2

2. g is differentiable at every point in (0, ∞) at which f is continuous, and

\[ \frac{dg}{dt} = \frac{d}{dt} \int_0^t f(\tau)\,d\tau = f(t) . \]

3. $g(0) = \int_0^0 f(\tau)\,d\tau = 0$ .
In addition, it is not that difﬁcult to show (see lemma 24.7 on page 476) that g is also of exponential
order s1 where s1 is any positive value greater than or equal to s0 . So both f and g have Laplace
transforms, which, as usual, will be denoted by F and G , respectively. Letting s > s1 , and using
the second and third facts listed above, along with our ﬁrst differentiation identity, we have
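\[
\begin{aligned}
& \frac{dg}{dt} = f(t) \\
\longrightarrow\quad & \mathcal{L}\!\left[ \frac{dg}{dt} \right]\Big|_s = \mathcal{L}[f(t)]\big|_s \\
\longrightarrow\quad & sG(s) - g(0) = F(s) \\
\longrightarrow\quad & sG(s) = F(s) .
\end{aligned}
\]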

Dividing through by s and recalling what G and g represent then give us the following theorem:

Theorem 24.5 (transform of an integral)

Let F = L[ f ] where f is any piecewise continuous function on (0, ∞) of exponential order s0 ,
and let s1 be any positive value greater than or equal to s0 . Then
\[ \int_0^t f(\tau)\,d\tau \]
is a continuous function of t on [0, ∞) of exponential order s1 , and
\[ \mathcal{L}\!\left[ \int_0^t f(\tau)\,d\tau \right]\Big|_s = \frac{F(s)}{s} \quad\text{for}\quad s > s_1 . \]

!Example 24.4: Let α be any nonnegative real number. The “ramp at α function” can be
defined by
\[ \operatorname{ramp}_\alpha(t) = \int_0^t \operatorname{step}_\alpha(\tau)\,d\tau . \]
2 If the continuity of g is not obvious, take a look at the discussion of theorem 2.1 on page 32.

Figure 24.1: The graphs of (a) the ramp function at α from example 24.4 and (b) the sinc function from example 24.5.

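If t ≥ α ,
\[
\begin{aligned}
\operatorname{ramp}_\alpha(t) &= \int_0^t \operatorname{step}_\alpha(\tau)\,d\tau \\
&= \int_0^\alpha \operatorname{step}_\alpha(\tau)\,d\tau + \int_\alpha^t \operatorname{step}_\alpha(\tau)\,d\tau \\
&= \int_0^\alpha 0\,d\tau + \int_\alpha^t 1\,d\tau = 0 + t - \alpha .
\end{aligned}
\]
If t < α ,
\[ \operatorname{ramp}_\alpha(t) = \int_0^t \operatorname{step}_\alpha(\tau)\,d\tau = \int_0^t 0\,d\tau = 0 . \]
So
\[ \operatorname{ramp}_\alpha(t) = \begin{cases} 0 & \text{if } t < \alpha \\ t - \alpha & \text{if } \alpha \le t \end{cases} , \]
which is the function sketched in figure 24.1a.
By the integral formula we gave for rampα (t) and the identity in theorem 24.5,
\[ \mathcal{L}\!\left[ \operatorname{ramp}_\alpha(t) \right]\Big|_s = \mathcal{L}\!\left[ \int_0^t \operatorname{step}_\alpha(\tau)\,d\tau \right]\Big|_s = \mathcal{L}\!\left[ \int_0^t f(\tau)\,d\tau \right]\Big|_s = \frac{F(s)}{s} \]
where f (t) = stepα (t) , s0 = 0 , and
\[ F(s) = \mathcal{L}[f(t)]\big|_s = \mathcal{L}\!\left[ \operatorname{step}_\alpha(t) \right]\Big|_s = \frac{e^{-\alpha s}}{s} \quad\text{for}\quad s > 0 . \]
So,
\[ \mathcal{L}\!\left[ \operatorname{ramp}_\alpha(t) \right]\Big|_s = \frac{F(s)}{s} = \frac{e^{-\alpha s}}{s \cdot s} = \frac{e^{-\alpha s}}{s^2} \quad\text{for}\quad s > 0 . \]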
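
The result of this example can be compared against a direct numerical evaluation of the defining integral. This is a sketch under stated assumptions only: the sample values α = 1.5 and s = 1 , the truncation at T = 60 , and the helper name `simpson` are illustrative choices rather than anything prescribed by the text.

```python
import math

def simpson(func, a, b, n=60000):
    """Composite Simpson's rule on [a, b] using an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3

alpha, s, T = 1.5, 1.0, 60.0

# ramp_alpha vanishes on (0, alpha), so only the interval (alpha, T)
# contributes; there ramp_alpha(t) = t - alpha.
numeric = simpson(lambda t: (t - alpha) * math.exp(-s * t), alpha, T)

F_s = math.exp(-alpha * s) / s     # transform of step_alpha at s
predicted = F_s / s                # theorem 24.5: divide by s
print(numeric, predicted)
```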

Integral of a Transform
The identity just derived should reinforce our suspicion that, under the Laplace transform, the division
of f (t) by t should correspond to some integral of F . To conﬁrm this suspicion and derive that
integral, let’s assume
\[ g(t) = \frac{f(t)}{t} \]
where f is some piecewise continuous function on (0, ∞) of exponential order s0 . Let us further
assume that
\[ \lim_{t \to 0^+} g(t) = \lim_{t \to 0^+} \frac{f(t)}{t} \]
converges to some finite value. Clearly then, g is also piecewise continuous on (0, ∞) and of
exponential order s0 .
Using an old trick along with the “derivative of a transform” identity (in theorem 24.3), we have

\[ F(s) = \mathcal{L}[f(t)]\big|_s = \mathcal{L}\!\left[ t \cdot \frac{f(t)}{t} \right]\Big|_s = \mathcal{L}[t g(t)]\big|_s = -\frac{dG}{ds} . \]
Cutting out the middle and renaming the variable as σ ,
\[ F(\sigma) = -\frac{dG}{d\sigma} , \]
allows us to use s as a limit when we integrate both sides,
\[ \int_a^s F(\sigma)\,d\sigma = -\int_a^s \frac{dG}{d\sigma}\,d\sigma = -G(s) + G(a) , \]
which we can then solve for G(s) :
\[ G(s) = G(a) - \int_a^s F(\sigma)\,d\sigma = G(a) + \int_s^a F(\sigma)\,d\sigma . \]
All this, of course, assumes a and s are any two real numbers greater than s0 . Since we are trying
to get a formula for G , we don’t know G(a) . What we do know (from theorem 23.5 on page 454)
is that G(a) → 0 as a → ∞ . This, along with the fact that s and a are independent of each other,
means that
\[ G(s) = \lim_{a \to \infty} G(s) = \lim_{a \to \infty} \left[ G(a) + \int_s^a F(\sigma)\,d\sigma \right] = 0 + \int_s^\infty F(\sigma)\,d\sigma . \]

After recalling what G and g originally denoted, we discover that we have veriﬁed:

Theorem 24.6 (integral of a transform)

Let F = L[ f ] where f is piecewise continuous on (0, ∞) and of exponential order s0 . Assume
further that
\[ \lim_{t \to 0^+} \frac{f(t)}{t} \]
converges to some finite value. Then
\[ \frac{f(t)}{t} \]
is also piecewise continuous on (0, ∞) and of exponential order s0 . Moreover,
\[ \mathcal{L}\!\left[ \frac{f(t)}{t} \right]\Big|_s = \int_s^\infty F(\sigma)\,d\sigma \quad\text{for}\quad s > s_0 . \]

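!Example 24.5: The sinc function (pronounced “sink”) is defined by
\[ \operatorname{sinc}(t) = \frac{\sin(t)}{t} \quad\text{for}\quad t \ne 0 . \]
Its limit as t → 0 is easily computed using L’Hôpital’s rule, and defines the value of sinc(t)
when t = 0 ,
\[ \operatorname{sinc}(0) = \lim_{t \to 0} \operatorname{sinc}(t) = \lim_{t \to 0} \frac{\sin(t)}{t} = \lim_{t \to 0} \frac{\frac{d}{dt}[\sin(t)]}{\frac{d}{dt}[t]} = \lim_{t \to 0} \frac{\cos(t)}{1} = 1 . \]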

The graph of the sinc function is sketched in ﬁgure 24.1b.

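Now, by definition,
\[ \operatorname{sinc}(t) = \frac{f(t)}{t} \quad\text{with}\quad f(t) = \sin(t) \quad\text{for}\quad t > 0 . \]
Clearly, this f satisfies all the requirements for f given in theorem 24.6 (with s0 = 0 ). Thus,
for s > 0 , we have
\[ \mathcal{L}[\operatorname{sinc}(t)]\big|_s = \mathcal{L}\!\left[ \frac{f(t)}{t} \right]\Big|_s = \int_s^\infty F(\sigma)\,d\sigma \]
with
\[ F(\sigma) = \mathcal{L}[f(t)]\big|_\sigma = \mathcal{L}[\sin(t)]\big|_\sigma = \frac{1}{\sigma^2 + 1} . \]
So, for s > 0 ,
\[
\begin{aligned}
\mathcal{L}[\operatorname{sinc}(t)]\big|_s &= \int_s^\infty \frac{1}{\sigma^2 + 1}\,d\sigma \\
&= \arctan(\sigma) \Big|_s^\infty \\
&= \lim_{\sigma \to \infty} \arctan(\sigma) - \arctan(s) = \frac{\pi}{2} - \arctan(s) .
\end{aligned}
\]
(By the way, you can derive the equivalent formula
\[ \mathcal{L}[\operatorname{sinc}(t)]\big|_s = \arctan\!\left( \frac{1}{s} \right) \]
using either arctangent identities or the substitution σ = 1/u in the last integral above.)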
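
Both forms of the answer, and the transform itself, can be compared numerically. What follows is a sketch under stated assumptions: the sample point s = 1 , the truncation of the integral at T = 60 , and the helper names `simpson` and `sinc` are illustrative choices, not part of the text.

```python
import math

def simpson(func, a, b, n=60000):
    """Composite Simpson's rule on [a, b] using an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3

def sinc(t):
    """sin(t)/t, with the value 1 at t = 0 supplied by the limit."""
    return 1.0 if t == 0.0 else math.sin(t) / t

s, T = 1.0, 60.0
numeric = simpson(lambda t: sinc(t) * math.exp(-s * t), 0.0, T)

formula_a = math.pi / 2 - math.atan(s)   # pi/2 - arctan(s)
formula_b = math.atan(1 / s)             # arctan(1/s), equivalent for s > 0
print(numeric, formula_a, formula_b)
```

At s = 1 all three values should be close to π/4 .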

Addendum
Here’s a little fact used in deriving the “transform of an integral” identity in theorem 24.5. We prove
it here because the proof could distract from the more important part of that derivation.

Lemma 24.7
Let f be any piecewise continuous function on (0, ∞) of exponential order s0 . Then the function
g , given by
\[ g(t) = \int_0^t f(\tau)\,d\tau , \]
is of exponential order s1 where s1 is any positive value greater than or equal to s0 .

PROOF: Since f is piecewise continuous on (0, ∞) and of exponential order s0 , lemma 23.8
on page 459 assures us that there is a constant M such that
\[ |f(t)| \le M e^{s_0 t} \quad\text{whenever}\quad 0 < t . \]
Now let s1 be any positive value greater than or equal to s0 , and let t > 0 . Using the above and
the integral inequality from lemma 23.9 on page 459, we have
\[
\begin{aligned}
|g(t)| = \left| \int_0^t f(\tau)\,d\tau \right| &\le \int_0^t |f(\tau)|\,d\tau \\
&\le \int_0^t M e^{s_0 \tau}\,d\tau \\
&\le \int_0^t M e^{s_1 \tau}\,d\tau = \frac{M}{s_1}\left[ e^{s_1 t} - e^{s_1 \cdot 0} \right] \le \frac{M}{s_1} e^{s_1 t} .
\end{aligned}
\]

So, letting $M_1 = M/s_1$ ,
\[ |g(t)| < M_1 e^{s_1 t} \quad\text{for}\quad t > 0 , \]
verifying that g is of exponential order s1 .

Figure 24.2: The graph of sin(st)/t for some s > 0 with $t_k = k\pi/s$ .

24.4 Appendix: Differentiating the Transform

The Main Issue
On page 470, we derived the “derivative of a transform” identity, F′ = −L[t f (t)] , naively using
the “fact” that
\[ \frac{d}{ds} \int_0^\infty g(t, s)\,dt = \int_0^\infty \frac{\partial}{\partial s}\left[ g(t, s) \right] dt . \tag{24.6} \]
The problem is that this “fact”, while often true, is not always true.

!Example 24.6: Let

\[ g(t, s) = \frac{\sin(st)}{t} \quad\text{for}\quad t > 0 \text{ and } s > 0 . \]

(Compare this function to the sinc function in example 24.5 on page 475.) It is easily veriﬁed
that the graph of this function over the positive T –axis is as sketched in ﬁgure 24.2. Recalling
the relation between integration and area, we also see that, for each s > 0 ,
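\[
\begin{aligned}
\int_0^\infty g(t, s)\,dt &= \lim_{T \to \infty} \int_0^T \frac{\sin(st)}{t}\,dt \\
&= A_1 - A_2 + A_3 - A_4 + A_5 - A_6 + \cdots = \sum_{k=1}^\infty (-1)^{k+1} A_k .
\end{aligned}
\]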

where each Ak is the area enclosed by the graph of g and the T –axis interval (tk−1 , tk ) described
in figure 24.2. Notice that this last summation is an alternating series whose terms are steadily
decreasing to zero. As you surely recall from your calculus course, any such summation is
convergent. Hence, so is the above integral of g . That is, this integral is well defined and finite
for each s > 0 . Unfortunately, this integral cannot be evaluated by elementary means. Still,
using the substitution t = τ/s , we can reduce this integral to a slightly simpler form:
\[ \int_0^\infty g(t, s)\,dt = \int_0^\infty \frac{\sin(st)}{t}\,dt = \int_0^\infty \frac{\sin(\tau)}{\tau/s}\, \frac{d\tau}{s} = \int_0^\infty \frac{\sin(\tau)}{\tau}\,d\tau . \]

Thus, in fact, this integral does not depend on s . Consequently,
\[ \frac{d}{ds} \int_0^\infty g(t, s)\,dt = \frac{d}{ds} \int_0^\infty \frac{\sin(\tau)}{\tau}\,d\tau = 0 . \]

On the other hand,

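\[
\begin{aligned}
\int_0^\infty \frac{\partial}{\partial s}\left[ g(t, s) \right] dt &= \int_0^\infty \frac{\partial}{\partial s}\left[ \frac{\sin(st)}{t} \right] dt \\
&= \int_0^\infty \frac{\cos(st) \cdot t}{t}\,dt \\
&= \int_0^\infty \cos(st)\,dt = \lim_{t \to \infty} \frac{\sin(st)}{s} ,
\end{aligned}
\]
which is not 0; it does not even converge! Thus, at least for this choice of g(t, s) ,
\[ \frac{d}{ds} \int_0^\infty g(t, s)\,dt \ne \int_0^\infty \frac{\partial}{\partial s}\left[ g(t, s) \right] dt . \]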

There are fairly reasonable conditions ensuring that equation (24.6) holds, and our use of it on
page 470 in deriving the “derivative of the transform” identity can be justiﬁed once we know those
“reasonable conditions”. But instead, let’s see if we can rigorously verify our identity just using basic
facts from elementary calculus.

The Rigorous Derivation

Our goal is to prove theorem 24.3 on page 470. That is, we want to rigorously derive the identity
\[ F'(s) = -\mathcal{L}[t f(t)]\big|_s \tag{24.7} \]
assuming F = L[ f ] with f being a piecewise continuous function of exponential order s0 . We
will also assume s > s0 .
First, you should verify that the results of exercise 23.18 on page 463 and lemma 23.8 on page
459 give us:

Lemma 24.8
If f (t) is of exponential order s0 , and n is any positive integer, then $t^n f(t)$ is a piecewise
continuous function on (0, ∞) of exponential order s0 + σ for any σ > 0 . Moreover, there is a
constant Mσ such that
\[ \left| t^n f(t) \right| \le M_\sigma e^{(s_0 + \sigma)t} \quad\text{for all}\quad t > 0 . \]

Since we can always find a positive σ such that s0 < s0 + σ < s , this lemma assures us that
$\mathcal{L}[t^n f(t)]|_s$ is well defined for s > s0 .
Now let’s consider F′(s) . By definition
\[ F'(s) = \lim_{\Delta s \to 0} \frac{F(s + \Delta s) - F(s)}{\Delta s} \]
provided the limit exists. Taking |Δs| small enough so that s + Δs is also greater than s0 (even if
Δs < 0 ), we have
\[
\begin{aligned}
\frac{F(s + \Delta s) - F(s)}{\Delta s} &= \frac{1}{\Delta s}\left[ F(s + \Delta s) - F(s) \right] \\
&= \frac{1}{\Delta s}\left[ \int_0^\infty f(t)\, e^{-(s + \Delta s)t}\,dt - \int_0^\infty f(t) e^{-st}\,dt \right] ,
\end{aligned}
\]
which, since $e^{-(s + \Delta s)t} = e^{-(\Delta s)t} e^{-st}$ , simplifies to
\[ \frac{F(s + \Delta s) - F(s)}{\Delta s} = \int_0^\infty f(t)\, \frac{1}{\Delta s}\left[ e^{-(\Delta s)t} - 1 \right] e^{-st}\,dt . \tag{24.8} \]

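To deal with the integral in the last equation, we will use the fact that, for any value x , the
exponential of x is given by its Taylor series,
\[ e^x = \sum_{k=0}^\infty \frac{1}{k!} x^k = 1 + x + \frac{1}{2!} x^2 + \frac{1}{3!} x^3 + \frac{1}{4!} x^4 + \cdots . \]

So
\[
\begin{aligned}
e^x - 1 &= x + \frac{1}{2!} x^2 + \frac{1}{3!} x^3 + \frac{1}{4!} x^4 + \cdots \\
&= x + x^2 E(x)
\end{aligned}
\]
where
\[ E(x) = \frac{1}{2!} + \frac{1}{3!} x + \frac{1}{4!} x^2 + \cdots . \]
Consequently (using x = −(Δs)t ),
\[
\begin{aligned}
\frac{1}{\Delta s}\left[ e^{-(\Delta s)t} - 1 \right] &= \frac{1}{\Delta s}\left[ -(\Delta s)t + \left[ -(\Delta s)t \right]^2 E(-(\Delta s)t) \right] \\
&= -t + (\Delta s)\, t^2 E(-(\Delta s)t) .
\end{aligned}
\]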

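Combined with equation (24.8), this yields
\[
\begin{aligned}
\frac{F(s + \Delta s) - F(s)}{\Delta s} &= \int_0^\infty f(t)\left[ -t + (\Delta s)\, t^2 E(-(\Delta s)t) \right] e^{-st}\,dt \\
&= \int_0^\infty f(t)\,[-t]\, e^{-st}\,dt + \int_0^\infty f(t)\, (\Delta s)\, t^2 E(-(\Delta s)t)\, e^{-st}\,dt .
\end{aligned}
\]

That is,
\[ \frac{F(s + \Delta s) - F(s)}{\Delta s} = -\mathcal{L}[t f(t)]\big|_s + \Delta s \int_0^\infty t^2 f(t)\, E(-(\Delta s)t)\, e^{-st}\,dt . \tag{24.9} \]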

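Obviously, the question now is What happens to the second term on the right when Δs → 0 ?
To help answer that, let us observe that, for all x ,
\[
\begin{aligned}
|E(x)| &= \left| \frac{1}{2!} + \frac{1}{3!} x + \frac{1}{4!} x^2 + \frac{1}{5!} x^3 + \cdots \right| \\
&\le \frac{1}{2!} + \frac{1}{3!} |x| + \frac{1}{4!} |x|^2 + \frac{1}{5!} |x|^3 + \cdots \\
&< 1 + |x| + \frac{1}{2!} |x|^2 + \frac{1}{3!} |x|^3 + \cdots = e^{|x|} .
\end{aligned}
\]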

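Moreover, as noted in lemma 24.8, for each σ > 0 , there is a positive constant Mσ such that
\[ \left| t^2 f(t) \right| \le M_\sigma e^{(s_0 + \sigma)t} \quad\text{for all}\quad t > 0 . \]

Remember, s > s0 . And since σ can be chosen as close to 0 as desired, and we are taking the
limit as Δs → 0 , we may assume that s − [σ + |Δs|] > s0 . Doing so and applying the above then
yields
\[
\begin{aligned}
\left| \Delta s \int_0^\infty t^2 f(t)\, E(-(\Delta s)t)\, e^{-st}\,dt \right| &\le |\Delta s| \int_0^\infty \left| t^2 f(t) \right| \left| E(-(\Delta s)t) \right| e^{-st}\,dt \\
&\le |\Delta s| \int_0^\infty M_\sigma e^{(s_0 + \sigma)t}\, e^{|-(\Delta s)t|}\, e^{-st}\,dt \\
&= |\Delta s|\, M_\sigma \int_0^\infty e^{-(s - s_0 - \sigma - |\Delta s|)t}\,dt \\
&\le |\Delta s|\, \frac{M_\sigma}{s - s_0 - \sigma - |\Delta s|} .
\end{aligned}
\]

Thus,
\[ \lim_{\Delta s \to 0} \Delta s \int_0^\infty t^2 f(t)\, E(-(\Delta s)t)\, e^{-st}\,dt = 0 \cdot \frac{M_\sigma}{s - s_0 - \sigma - 0} = 0 . \]

Combining this with equation (24.9), we finally obtain
\[ \lim_{\Delta s \to 0} \frac{F(s + \Delta s) - F(s)}{\Delta s} = -\mathcal{L}[t f(t)]\big|_s + 0 , \]

verifying both the differentiability of F at s , and equation (24.7).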

Additional Exercises

24.1. Find the Laplace transform Y (s) of the solution to each of the following initial-value
problems. Just ﬁnd Y (s) using the ideas illustrated in examples 24.1 and 24.2. Do NOT
solve the problem using methods developed before we started discussing Laplace transforms
and then computing the transform! Also, do not attempt to recover y(t) from each Y (s)
you obtain.
a. $y' + 4y = 0$ with $y(0) = 3$

b. $y' - 2y = t^3$ with $y(0) = 4$

c. $y' + 3y = \operatorname{step}_4(t)$ with $y(0) = 0$

d. $y'' - 4y = t^3$ with $y(0) = 1$ and $y'(0) = 3$

e. $y'' + 4y = 20e^{4t}$ with $y(0) = 3$ and $y'(0) = 12$

f. $y'' + 4y = \sin(2t)$ with $y(0) = 3$ and $y'(0) = 5$

g. $y'' + 4y = 3 \operatorname{step}_2(t)$ with $y(0) = 0$ and $y'(0) = 5$

h. $y'' + 5y' + 6y = e^{4t}$ with $y(0) = 1$ and $y'(0) = 0$

i. $y'' - 5y' + 6y = t^2 e^{4t}$ with $y(0) = 0$ and $y'(0) = 2$

j. $y'' - 5y' + 6y = 7$ with $y(0) = 2$ and $y'(0) = 4$

k. $y'' - 4y' + 13y = e^{2t} \sin(3t)$ with $y(0) = 4$ and $y'(0) = 3$

l. $y'' + 4y' + 13y = 4t + 2e^{2t} \sin(3t)$ with $y(0) = 4$ and $y'(0) = 3$

m. $y''' - 27y = e^{-3t}$ with $y(0) = 2$ , $y'(0) = 3$ and $y''(0) = 4$

24.2. Compute the Laplace transforms of the following functions using the given tables and the
‘derivative of the transforms identities’ from theorem 24.3 (and its corollary).
a. $t\cos(3t)$    b. $t^2\sin(3t)$    c. $te^{-7t}$
d. $t^3 e^{-7t}$    e. $t\operatorname{step}(t-3)$    f. $t^2\operatorname{step}_4(t)$

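24.3. Verify the following identities using the ‘derivative of the transforms identities’ from
theorem 24.3.
a. $\mathcal{L}[t\sin(\omega t)]\big|_s = \dfrac{2\omega s}{\left(s^2+\omega^2\right)^2}$    b. $\mathcal{L}[t\cos(\omega t)]\big|_s = \dfrac{s^2-\omega^2}{\left(s^2+\omega^2\right)^2}$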

24.4. For the following, let $y$ be the solution to

\[ t\frac{d^2y}{dt^2} + \frac{dy}{dt} + ty = 0 \quad\text{with}\quad y(0) = 1 \quad\text{and}\quad y'(0) = 0 . \]

The above differential equation, known as Bessel’s equation of order zero, is important in
many two-dimensional problems involving circular symmetry. The solution to this equation
with the above initial values turns out to be particularly important. It is called the Bessel
function (of the first kind) of order zero, and is universally denoted by $J_0$. Thus, in the
following,

\[ y(t) = J_0(t) \quad\text{and}\quad Y(s) = \mathcal{L}[y(t)]\big|_s = \mathcal{L}[J_0(t)]\big|_s . \]

a. Using the differentiation identities from this chapter, show that

\[ \left(s^2+1\right)\frac{dY}{ds} + sY = 0 . \]

b. The above differential equation for $Y$ is a simple first-order differential equation. Find
its general solution.
c. It can be shown (trust the author on this) that

\[ \int_0^\infty J_0(t)\,dt = 1 . \]

What does this tell you about $Y(0)$?

d. Using what you now know about $Y(s)$, find $Y(s) = \mathcal{L}[J_0(t)]\big|_s$.

24.5. Compute the Laplace transforms using the tables provided. You will have to apply two
different identities.
a. $te^{4t}\sin(3t)$    b. $te^{4t}\cos(3t)$    c. $te^{4t}\operatorname{step}(t-3)$    d. $e^{3t}t^2\operatorname{step}_1(t)$


24.6. The following concern the ramp at α function whose transform was computed in example
24.4 on page 473. Assume α > 0 .
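a. Verify that the “ramp-squared at $\alpha$” function,

\[ \operatorname{ramp}^2_\alpha(t) = \left(\operatorname{ramp}_\alpha(t)\right)^2 , \]

satisfies

\[ \operatorname{ramp}^2_\alpha(t) = 2\int_0^t \operatorname{ramp}_\alpha(\tau)\,d\tau . \]

b. Using the above and the “transform of an integral” identity, find $\mathcal{L}\left[\operatorname{ramp}^2_\alpha(t)\right]\big|_s$.

24.7. The sine-integral function, Si, is given by

\[ \operatorname{Si}(t) = \int_0^t \frac{\sin(\tau)}{\tau}\,d\tau . \]

In example 24.5, it is shown that

\[ \mathcal{L}\left[\frac{\sin(t)}{t}\right]\bigg|_s = \arctan\left(\frac{1}{s}\right) . \]

What is $\mathcal{L}[\operatorname{Si}(t)]\big|_s$?

24.8. Verify that the limit of each of the following functions as $t \to 0$ is a finite number, and then
find the Laplace transform of that function using the “integral of a transform” identity.
a. $\dfrac{1 - e^{-t}}{t}$    b. $\dfrac{e^{2t} - 1}{t}$    c. $\dfrac{e^{-2t} - e^{3t}}{t}$
d. $\dfrac{1 - \cos(t)}{t}$    e. $\dfrac{1 - \cosh(t)}{t}$    f. $\dfrac{\sin(3t)}{t}$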


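!Example 25.2: We have

\[ \mathcal{L}^{-1}\left[\frac{4}{s-3}\right]\bigg|_t = 4e^{3t} \]

because

\[ \frac{4}{s-3} = \mathcal{L}\left[4e^{3t}\right]\big|_s . \]

Likewise, since

\[ \mathcal{L}\left[t^3\right]\big|_s = \frac{6}{s^4} , \]

we have

\[ t^3 = \mathcal{L}^{-1}\left[\frac{6}{s^4}\right]\bigg|_t . \]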

The fact that

f (t) = L−1 [F(s)]|t ⇐⇒ L[ f (t)]|s = F(s)

means that any table of Laplace transforms (such as table 23.1 on page 444) is also a table of inverse
Laplace transforms. Instead of reading off the F(s) for each f (t) found, read off the f (t) for
each F(s) .
As you may have already noticed, we take inverse transforms of “functions of s that are denoted
by upper case Roman letters” and obtain “functions of t that are denoted by the corresponding lower
case Roman letter”. These notational conventions are consistent with the notational conventions laid
down for the Laplace transform early in chapter 23.
We should also note that the phrase “inverse Laplace transform” can refer to either the ‘inverse
transformed function’ f or to the process of computing f from F .
By the way, there is a formula for computing inverse Laplace transforms. If you must know, it
is

\[ \mathcal{L}^{-1}[F(s)]\big|_t = \frac{1}{2\pi}\lim_{Y\to+\infty}\int_{-Y}^{Y} e^{t(\sigma+i\xi)}F(\sigma+i\xi)\,d\xi . \]

The integral here is over a line in the complex plane, and σ is a suitably chosen positive value.
In deriving this formula, you actually verify uniqueness theorem 25.1. Unfortunately, deriving and
verifying this formula go beyond our current abilities.2
Don’t pretend to understand this formula, and don’t try to use it until you’ve had a course in
complex variables. Besides, it is not nearly as useful as a good table of transforms.

25.2 Linearity and Using Partial Fractions

Linearity of the Inverse Transform
The fact that the inverse Laplace transform is linear follows immediately from the linearity of the
Laplace transform. To see that, let us consider L−1 [α F(s) + βG(s)] where α and β are any two
constants and F and G are any two functions for which inverse Laplace transforms exist. Following
our conventions, we’ll denote those inverse transforms by f and g . That is,

f (t) = L−1 [F(s)]|t and g(t) = L−1 [G(s)]|t .

2 Two derivations can be found in the third edition of Transforms and Applications Handbook (Ed: A. Poularikas, CRC
Press). One, using Fourier transforms, is in section 2.4.6 of the chapter on Fourier transforms by Howell. The other, using
results from the theory of complex analytic functions, is in section 5.6 of the chapter on Laplace transforms by Poularikas
and Seely.


Remember, this is completely the same as stating that

L[ f (t)]|s = F(s) and L[g(t)]|s = G(s) .

Because we already know the Laplace transform is linear, we know

L[α f (t) + βg(t)]|s = αL[ f (t)]|s + βL[g(t)]|s = α F(s) + βG(s) .

This, along with the definition of the inverse transform and the above definitions of f and g , yields

L−1 [α F(s) + βG(s)]|t = α f (t) + βg(t) = αL−1 [F(s)]|t + βL−1 [G(s)]|t .

Redoing these little computations with as many functions and constants as desired then gives us the
next theorem:

Theorem 25.2 (linearity of the inverse Laplace transform)

The inverse Laplace transform is linear. That is,

\[ \mathcal{L}^{-1}[c_1F_1(s) + c_2F_2(s) + \cdots + c_nF_n(s)]
   = c_1\mathcal{L}^{-1}[F_1(s)] + c_2\mathcal{L}^{-1}[F_2(s)] + \cdots + c_n\mathcal{L}^{-1}[F_n(s)] \]

when each ck is a constant and each Fk is a function having an inverse Laplace transform.

Let’s now use the linearity to compute a few inverse transforms.

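!Example 25.3: Let’s find

\[ \mathcal{L}^{-1}\left[\frac{1}{s^2+9}\right]\bigg|_t . \]

We know (or found in table 23.1 on page 444) that

\[ \mathcal{L}^{-1}\left[\frac{3}{s^2+9}\right]\bigg|_t = \sin(3t) , \]

which is almost what we want. To use this in computing our desired inverse transform, we will
combine linearity with one of mathematics’ oldest tricks, multiplying by 1 (with, in this
case, $1 = 3/3$):

\[ \mathcal{L}^{-1}\left[\frac{1}{s^2+9}\right]\bigg|_t
   = \mathcal{L}^{-1}\left[\frac{1}{3}\cdot\frac{3}{s^2+9}\right]\bigg|_t
   = \frac{1}{3}\mathcal{L}^{-1}\left[\frac{3}{s^2+9}\right]\bigg|_t
   = \frac{1}{3}\sin(3t) . \]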

The use of linearity along with ‘multiplying by 1 ’ will be used again and again. Get used to it.
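
One way to gain confidence in an inverse transform is to push the answer back through the Laplace integral numerically. The sketch below is only an illustration: it truncates the integral at an assumed upper limit (`upper = 40.0`), applies a hand-written composite Simpson's rule, and compares the result with $1/(s^2+9)$ at the arbitrarily chosen point $s = 2$. The helper names are invented for this example.

```python
import math

def simpson(func, a, b, n=20000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3

def laplace_numeric(f, s, upper=40.0):
    """Approximate the Laplace integral of f at s, truncated at t = upper."""
    return simpson(lambda t: math.exp(-s * t) * f(t), 0.0, upper)

def candidate(t):
    """The inverse transform found in example 25.3: (1/3) sin(3t)."""
    return math.sin(3 * t) / 3

s = 2.0
approx = laplace_numeric(candidate, s)
exact = 1 / (s**2 + 9)
```

Agreement between `approx` and `exact` at a few values of $s$ does not prove the result, but a mismatch would immediately flag an arithmetic slip such as a forgotten factor of $1/3$.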

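!Example 25.4: Let’s find the inverse Laplace transform of

\[ \frac{30}{s^7} + \frac{8}{s-4} . \]

We know

\[ \mathcal{L}^{-1}\left[\frac{6!}{s^7}\right]\bigg|_t = t^6 \quad\text{and}\quad \mathcal{L}^{-1}\left[\frac{1}{s-4}\right]\bigg|_t = e^{4t} . \]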


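So,

\[
\begin{aligned}
\mathcal{L}^{-1}\left[\frac{30}{s^7} + \frac{8}{s-4}\right]\bigg|_t
  &= 30\,\mathcal{L}^{-1}\left[\frac{1}{s^7}\right]\bigg|_t + 8\,\mathcal{L}^{-1}\left[\frac{1}{s-4}\right]\bigg|_t \\
  &= 30\,\mathcal{L}^{-1}\left[\frac{1}{6!}\cdot\frac{6!}{s^7}\right]\bigg|_t + 8e^{4t} \\
  &= \frac{30}{6!}\,\mathcal{L}^{-1}\left[\frac{6!}{s^7}\right]\bigg|_t + 8e^{4t}
   = \frac{30}{6\cdot5\cdot4\cdot3\cdot2}\,t^6 + 8e^{4t} ,
\end{aligned}
\]

which, after a little arithmetic, reduces to

\[ \mathcal{L}^{-1}\left[\frac{30}{s^7} + \frac{8}{s-4}\right]\bigg|_t = \frac{1}{24}t^6 + 8e^{4t} . \]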
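
The “little arithmetic” above is just the division $30/6!$. A minimal sketch with exact fractions follows; the helper name `power_term_coefficient` is made up for this illustration, and it relies only on the table entry $\mathcal{L}[t^n]|_s = n!/s^{n+1}$ together with linearity.

```python
from fractions import Fraction
from math import factorial

def power_term_coefficient(c, n):
    """Coefficient of t**n in the inverse transform of c / s**(n + 1).

    Uses the table entry L[t^n] = n!/s^(n+1) together with linearity.
    """
    return Fraction(c, factorial(n))

# The inverse transform of 30/s^7 is coeff * t^6.
coeff = power_term_coefficient(30, 6)
```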

Partial Fractions
When using the Laplace transform with differential equations, we often get transforms that can
be converted via ‘partial fractions’ to forms that are easily inverse transformed using the tables
and linearity, as above. This means that the general method(s) of partial fractions are particularly
important. By now, you should be well-acquainted with using partial fractions — remember, the
basic idea is that, if we have a fraction of two polynomials

\[ \frac{Q(s)}{P(s)} \]

and $P(s)$ can be factored into two smaller polynomials

\[ P(s) = P_1(s)P_2(s) , \]

then two other polynomials $Q_1(s)$ and $Q_2(s)$ can be found so that

\[ \frac{Q(s)}{P(s)} = \frac{Q(s)}{P_1(s)P_2(s)} = \frac{Q_1(s)}{P_1(s)} + \frac{Q_2(s)}{P_2(s)} . \]

Moreover, if (as will usually be the case for us) the degree of Q(s) is less than the degree of P(s) ,
then the degree of each Q k (s) will be less than the degree of the corresponding Pk (s) .
You probably used partial fractions to compute some of the integrals in the earlier chapters of
this text. We’ll go through a few examples to both refresh our memories of this technique and to see
how it naturally arises in using the Laplace transform to solve differential equations.

!Example 25.5: In exercise 24.1 e on page 480, you found that the Laplace transform of the
solution to

\[ y'' + 4y = 20e^{4t} \quad\text{with}\quad y(0) = 3 \quad\text{and}\quad y'(0) = 12 \]

is

\[ Y(s) = \frac{3s^2 - 28}{(s-4)\left(s^2+4\right)} . \]

The partial fraction expansion of this is

\[ Y(s) = \frac{3s^2 - 28}{(s-4)\left(s^2+4\right)} = \frac{A}{s-4} + \frac{Bs+C}{s^2+4} \]


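for some constants $A$, $B$ and $C$. There are many ways to find these constants. The basic method
is to “undo” the partial fraction expansion by getting a common denominator and adding up the
fractions on the right:

\[
\begin{aligned}
\frac{3s^2 - 28}{(s-4)\left(s^2+4\right)}
  &= \frac{A}{s-4} + \frac{Bs+C}{s^2+4} \\
  &= \frac{A\left(s^2+4\right)}{(s-4)\left(s^2+4\right)} + \frac{(s-4)(Bs+C)}{(s-4)\left(s^2+4\right)} \\
  &= \cdots \\
  &= \frac{(A+B)s^2 + (C-4B)s + 4(A-C)}{(s-4)\left(s^2+4\right)} .
\end{aligned}
\]

Cutting out the middle and canceling out the common denominator lead to the equation

\[ 3\cdot s^2 + 0\cdot s - 28 = (A+B)s^2 + (C-4B)s + 4(A-C) , \]

which, in turn, means that our constants satisfy the three by three system

\[
\begin{aligned}
 3 &= A + B \\
 0 &= C - 4B \\
 -28 &= 4A - 4C .
\end{aligned}
\]

This is a relatively simple system. Solving it however you wish, you obtain

\[ A = 1 \quad\text{and}\quad B = 2 \quad\text{and}\quad C = 8 . \]

Hence

\[ Y(s) = \frac{A}{s-4} + \frac{Bs+C}{s^2+4} = \frac{1}{s-4} + \frac{2s+8}{s^2+4} , \]

and

\[
\begin{aligned}
y(t) = \mathcal{L}^{-1}[Y(s)]\big|_t
  &= \mathcal{L}^{-1}\left[\frac{1}{s-4} + \frac{2s+8}{s^2+4}\right]\bigg|_t \\
  &= \mathcal{L}^{-1}\left[\frac{1}{s-4}\right]\bigg|_t + 2\,\mathcal{L}^{-1}\left[\frac{s}{s^2+4}\right]\bigg|_t + 8\,\mathcal{L}^{-1}\left[\frac{1}{s^2+4}\right]\bigg|_t \\
  &= e^{4t} + 2\,\mathcal{L}^{-1}\left[\frac{s}{s^2+2^2}\right]\bigg|_t + 8\cdot\frac{1}{2}\,\mathcal{L}^{-1}\left[\frac{2}{s^2+2^2}\right]\bigg|_t \\
  &= e^{4t} + 2\cos(2t) + 4\sin(2t) .
\end{aligned}
\]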
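
The three by three system in this example can also be solved mechanically. The following is a minimal sketch, assuming exact rational arithmetic and a small hand-written Gauss-Jordan routine (`solve_linear` is a name invented here, not something from the text). It then spot-checks the resulting expansion against the original $Y(s)$ at a few arbitrary integer values of $s$.

```python
from fractions import Fraction

def solve_linear(matrix, rhs):
    """Solve a square linear system exactly by Gauss-Jordan elimination."""
    n = len(rhs)
    aug = [[Fraction(x) for x in row] + [Fraction(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and move it into place.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [x / aug[col][col] for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]

# Unknowns ordered (A, B, C); rows are the s^2, s^1 and s^0 coefficient equations.
A, B, C = solve_linear([[1, 1, 0], [0, -4, 1], [4, 0, -4]], [3, 0, -28])

def Y(s):
    """The transform from example 25.5, evaluated exactly at an integer s."""
    return Fraction(3 * s * s - 28, (s - 4) * (s * s + 4))

def expansion(s):
    """The partial fraction expansion with the constants just computed."""
    return A / (s - 4) + (B * s + C) / (s * s + 4)

checks = [Y(s) == expansion(s) for s in (0, 1, 5, -3)]
```

Comparing the two sides at sample points is a cheap guard against sign errors when the system is set up by hand.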

!Example 25.6: In example 24.2 on page 468 we obtained

\[ Y(s) = \frac{16}{(s-2)\left(s^2-7s+12\right)} + \frac{6s-38}{s^2-7s+12} \]

and, equivalently,

\[ Y(s) = \frac{6s^2 - 50s + 92}{(s-2)\left(s^2-7s+12\right)} \]

as the Laplace transform of the solution to some initial-value problem. While we could ﬁnd partial
fraction expansions for each term of the ﬁrst expression above, it will certainly be more convenient


to simply ﬁnd the single partial fraction expansion for the second expression for Y (s) . But before
attempting that, we should note that one factor in the denominator can be further factored,

\[ s^2 - 7s + 12 = (s-3)(s-4) , \]

giving us

\[ Y(s) = \frac{6s^2 - 50s + 92}{(s-2)(s-3)(s-4)} . \]

Now we can seek the partial fraction expansion of $Y(s)$:

\[
\begin{aligned}
\frac{6s^2 - 50s + 92}{(s-2)(s-3)(s-4)}
  &= \frac{A}{s-2} + \frac{B}{s-3} + \frac{C}{s-4} \\
  &= \cdots \\
  &= \frac{A(s-3)(s-4) + B(s-2)(s-4) + C(s-2)(s-3)}{(s-2)(s-3)(s-4)} .
\end{aligned}
\]

Cutting out the middle and canceling out the common denominator leave

\[ 6s^2 - 50s + 92 = A(s-3)(s-4) + B(s-2)(s-4) + C(s-2)(s-3) . \tag{25.1} \]

Rather than multiplying out the right side of this equation and setting up the system that $A$, $B$
and $C$ must satisfy for this equation to hold (as we did in the previous example), let’s find these
constants after making clever choices for the value of $s$ in this last equation.
Letting $s = 2$ in equation (25.1):

\[
\begin{aligned}
 & 6\cdot 2^2 - 50\cdot 2 + 92 = A(2-3)(2-4) + B(2-2)(2-4) + C(2-2)(2-3) \\
 \Longrightarrow\quad & 16 = 2A + 0B + 0C \quad\Longrightarrow\quad A = 8 .
\end{aligned}
\]

Letting $s = 3$ in equation (25.1):

\[
\begin{aligned}
 & 6\cdot 3^2 - 50\cdot 3 + 92 = A(3-3)(3-4) + B(3-2)(3-4) + C(3-2)(3-3) \\
 \Longrightarrow\quad & -4 = 0A - B + 0C \quad\Longrightarrow\quad B = 4 .
\end{aligned}
\]

Letting $s = 4$ in equation (25.1):

\[
\begin{aligned}
 & 6\cdot 4^2 - 50\cdot 4 + 92 = A(4-3)(4-4) + B(4-2)(4-4) + C(4-2)(4-3) \\
 \Longrightarrow\quad & -12 = 0A + 0B + 2C \quad\Longrightarrow\quad C = -6 .
\end{aligned}
\]

In the next section, we will discuss an easy way to ﬁnd the inverse transform of each of the terms
in this partial fraction expansion.
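
The “clever choices of $s$” used for equation (25.1) in example 25.6 can be automated when the denominator has distinct linear factors. The sketch below is an illustration under that assumption; `cover_up` and `numerator` are names invented here. For each root $r$, it evaluates the numerator at $r$ and divides by the product of $(r - \text{other root})$ over the remaining roots.

```python
from fractions import Fraction

def numerator(s):
    """Left side of equation (25.1): 6s^2 - 50s + 92."""
    return 6 * s**2 - 50 * s + 92

def cover_up(roots, numer):
    """Partial fraction constants for numer(s) / prod(s - r), distinct roots only."""
    coeffs = {}
    for r in roots:
        denom = Fraction(1)
        for other in roots:
            if other != r:
                denom *= (r - other)
        coeffs[r] = Fraction(numer(r)) / denom
    return coeffs

# Keys are the roots 2, 3 and 4; values are the constants A, B and C.
coeffs = cover_up([2, 3, 4], numerator)

def right_side(s):
    """Right side of equation (25.1) rebuilt from the computed constants."""
    return (coeffs[2] * (s - 3) * (s - 4)
            + coeffs[3] * (s - 2) * (s - 4)
            + coeffs[4] * (s - 2) * (s - 3))
```

Evaluating `right_side` at a value of $s$ that is not a root (say $s = 0$) and comparing with `numerator` gives an independent check of all three constants at once.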

25.3 Inverse Transforms of Shifted Functions

All the identities derived for the Laplace transform can be rewritten in terms of the inverse Laplace
transform. Of particular value to us is the ﬁrst shifting identity

\[ \mathcal{L}\left[e^{at}f(t)\right]\big|_s = F(s-a) \]

where $F = \mathcal{L}[f(t)]$ and $a$ is any fixed real number. In terms of the inverse transform, this is

\[ \mathcal{L}^{-1}[F(s-a)]\big|_t = e^{at}f(t) \]

where $f = \mathcal{L}^{-1}[F(s)]$ and $a$ is any fixed real number. Viewed this way, we have a nice way to
find inverse transforms of functions that can be written as “shifts” of functions in our tables.

!Example 25.8: Consider

\[ \mathcal{L}^{-1}\left[\frac{1}{(s-6)^3}\right]\bigg|_t . \]

Here, the ‘shift’ is clearly by $a = 6$, and we have, by the above identity,

\[ \mathcal{L}^{-1}\left[\frac{1}{(s-6)^3}\right]\bigg|_t = \mathcal{L}^{-1}[F(s-6)]\big|_t = e^{6t}f(t) . \tag{25.2} \]

We now need to figure out the $f(t)$ from the fact that

\[ F(s-6) = \frac{1}{(s-6)^3} . \]

Letting $X = s - 6$ in this equation, we have

\[ F(X) = \frac{1}{X^3} . \]


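Thus,

\[ F(s) = \frac{1}{s^3} , \]

and

\[
f(t) = \mathcal{L}^{-1}[F(s)]\big|_t = \mathcal{L}^{-1}\left[\frac{1}{s^3}\right]\bigg|_t
     = \mathcal{L}^{-1}\left[\frac{1}{2!}\cdot\frac{2!}{s^{2+1}}\right]\bigg|_t
     = \frac{1}{2!}\,\mathcal{L}^{-1}\left[\frac{2!}{s^{2+1}}\right]\bigg|_t
     = \frac{1}{2}t^2 .
\]

Plugging this back into equation (25.2), we obtain

\[ \mathcal{L}^{-1}\left[\frac{1}{(s-6)^3}\right]\bigg|_t = \cdots = e^{6t}f(t) = e^{6t}\,\frac{1}{2}t^2 = \frac{1}{2}t^2e^{6t} . \]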

In many cases, determining the shift is part of the problem.

!Example 25.9: Consider ﬁnding the inverse Laplace transform of

\[ \frac{1}{s^2 - 8s + 25} . \]

If the denominator could be factored nicely, we would use partial fractions. This denominator
does not factor nicely (unless we use complex numbers). When that happens, try “completing
the square” to rewrite the denominator in terms of “$s - a$” for some constant $a$. Here,

\[
s^2 - 8s + 25 = \underbrace{s^2 - 2\cdot 4s + 4^2}_{(s-4)^2} - 4^2 + 25 = (s-4)^2 + 9 .
\]

Hence,

\[
\mathcal{L}^{-1}\left[\frac{1}{s^2 - 8s + 25}\right]\bigg|_t
  = \mathcal{L}^{-1}\left[\frac{1}{(s-4)^2 + 9}\right]\bigg|_t
  = \mathcal{L}^{-1}[F(s-4)]\big|_t = e^{4t}f(t) . \tag{25.3}
\]

Again, we need to find $f(t)$ from a shifted version of its transform. Here,

\[ F(s-4) = \frac{1}{(s-4)^2 + 9} . \]

Letting $X = s - 4$ in this equation, we have

\[ F(X) = \frac{1}{X^2 + 9} , \]

which means the formula for $F(s)$ is

\[ F(s) = \frac{1}{s^2 + 9} . \]

Thus,

\[
f(t) = \mathcal{L}^{-1}[F(s)]\big|_t = \mathcal{L}^{-1}\left[\frac{1}{s^2+9}\right]\bigg|_t
     = \mathcal{L}^{-1}\left[\frac{1}{3}\cdot\frac{3}{s^2+9}\right]\bigg|_t
     = \frac{1}{3}\,\mathcal{L}^{-1}\left[\frac{3}{s^2+3^2}\right]\bigg|_t
     = \frac{1}{3}\sin(3t) .
\]

Plugging this back into equation (25.3), we get

\[ \mathcal{L}^{-1}\left[\frac{1}{s^2 - 8s + 25}\right]\bigg|_t = \cdots = e^{4t}f(t) = e^{4t}\,\frac{1}{3}\sin(3t) = \frac{1}{3}e^{4t}\sin(3t) . \]
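
The two steps of example 25.9 (complete the square, then shift) can be sketched in a few lines. This is an illustration only: `complete_square` is a hypothetical helper, the code assumes the leftover constant $k$ is positive so that a sine appears, and the numerical check truncates the Laplace integral at an assumed upper limit of $t = 30$ with $s = 6$ chosen arbitrarily (any $s > 4$ would do).

```python
import math

def complete_square(b, c):
    """Rewrite s^2 + b s + c as (s - a)^2 + k and return (a, k)."""
    a = -b / 2.0
    k = c - a * a
    return a, k

def simpson(func, lo, hi, n=20000):
    """Composite Simpson's rule on [lo, hi] with n (even) subintervals."""
    if n % 2:
        n += 1
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * func(lo + j * h)
    return total * h / 3

a, k = complete_square(-8.0, 25.0)   # s^2 - 8s + 25 = (s - 4)^2 + 9
omega = math.sqrt(k)                 # assumes k > 0

def y(t):
    """Candidate inverse transform: e^{a t} sin(omega t) / omega."""
    return math.exp(a * t) * math.sin(omega * t) / omega

s = 6.0
approx = simpson(lambda t: math.exp(-s * t) * y(t), 0.0, 30.0)
exact = 1.0 / (s * s - 8.0 * s + 25.0)
```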


Additional Exercises

25.1. Using the tables (mainly, table 23.1 on page 444) or your own memory, find the inverse
Laplace transform for each of the following:
a. $\dfrac{1}{s-6}$    b. $\dfrac{1}{s+2}$    c. $\dfrac{1}{s^2}$
d. $\dfrac{6}{s^4}$    e. $\dfrac{5}{s^2+25}$    f. $\dfrac{s}{s^2+3\pi^2}$

25.2. Using the tables and linearity, find the inverse Laplace transform for each of the
following:
a. $\dfrac{6}{s+2}$    b. $\dfrac{1}{s^4}$    c. $\dfrac{3}{\sqrt{s}} - \dfrac{8}{s-4}$
d. $\dfrac{4s^2-4}{s^5}$    e. $\dfrac{3s+1}{s^2+25}$    f. $\dfrac{1-e^{-4s}}{s}$

25.3. In exercise 24.3 on page 481, you found the transforms of $t\sin(\omega t)$ and $t\cos(\omega t)$. Now
verify the following inverse Laplace transforms assuming $\omega$ is any real constant:

a. $\mathcal{L}^{-1}\left[\dfrac{s}{\left(s^2+\omega^2\right)^2}\right]\bigg|_t = \dfrac{t}{2\omega}\sin(\omega t)$

b. $\mathcal{L}^{-1}\left[\dfrac{1}{\left(s^2+\omega^2\right)^2}\right]\bigg|_t = \dfrac{1}{2\omega^3}\left[\sin(\omega t) - \omega t\cos(\omega t)\right]$

25.4. Solve each of the following initial-value problems using the Laplace transform:
a. $y' + 9y = 0$ with $y(0) = 4$

b. $y'' + 9y = 0$ with $y(0) = 4$ and $y'(0) = 6$

25.5. Using the tables and partial fractions, find the inverse Laplace transform for each of the
following:
a. $\dfrac{7s+5}{(s+2)(s-1)}$    b. $\dfrac{s-1}{s^2-7s+12}$    c. $\dfrac{1}{s^2-4}$
d. $\dfrac{3s^2+6s+27}{s^3+9s}$    e. $\dfrac{1}{s^3-4s^2}$    f. $\dfrac{8s^3}{s^4-81}$
g. $\dfrac{5s^2+6s-40}{(s+6)\left(s^2+16\right)}$    h. $\dfrac{2s^3+3s^2+2s+27}{\left(s^2+9\right)\left(s^2+1\right)}$    i. $\dfrac{6s^2+62s+92}{(s+1)\left(s^2+10s+21\right)}$

25.6. Solve each of the following initial-value problems using the Laplace transform (and partial
fractions):
a. $y'' - 9y = 0$ with $y(0) = 4$ and $y'(0) = 9$

b. $y'' + 9y = 27t^3$ with $y(0) = 0$ and $y'(0) = 0$

c. $y'' + 8y' + 7y = 165e^{4t}$ with $y(0) = 8$ and $y'(0) = 1$


25.7. Using the translation identity (and the tables), find the inverse Laplace transform for each
of the following:
a. $\dfrac{1}{(s-7)^5}$    b. $\dfrac{1}{s^2-6s+45}$    c. $\dfrac{s}{s^2-6s+45}$    d. $\dfrac{1}{\sqrt{s+2}}$
e. $\dfrac{1}{s^2+8s+16}$    f. $\dfrac{s}{s^2-12s+40}$    g. $\dfrac{1}{s^2+12s+40}$    h. $\dfrac{s^2}{(s-3)^5}$

25.8. Using the Laplace transform with the translation identity, solve the following initial-value
problems:
a. $y'' - 8y' + 17y = 0$ with $y(0) = 3$ and $y'(0) = 12$

b. $y'' - 6y' + 9y = e^{3t}t^2$ with $y(0) = 0$ and $y'(0) = 0$

c. $y'' + 6y' + 13y = 0$ with $y(0) = 2$ and $y'(0) = 8$

d. $y'' + 8y' + 17y = 0$ with $y(0) = 3$ and $y'(0) = -12$

25.9. Using the Laplace transform, solve the following initial-value problems:
a. $y'' = e^{t}\sin(t)$ with $y(0) = 0$ and $y'(0) = 0$

b. $y'' - 4y' + 40y = 122e^{-3t}$ with $y(0) = 0$ and $y'(0) = 8$

c. $y'' - 9y = 24e^{-3t}$ with $y(0) = 6$ and $y'(0) = 2$

d. $y'' - 4y' + 13y = e^{2t}\sin(3t)$ with $y(0) = 4$ and $y'(0) = 3$

25.10. The inverse transforms of the following could be computed using partial fractions. Instead,
find the inverse transform of each using the appropriate integration identity from section
24.3.
a. $\dfrac{1}{s\left(s^2+9\right)}$    b. $\dfrac{1}{s(s-4)}$    c. $\dfrac{1}{s(s-3)^2}$


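\[ \frac{1}{\sqrt{t}} * t^2 \quad\text{when}\quad t = 4 . \]

Here,

\[ f(t) = \frac{1}{\sqrt{t}} \quad\text{and}\quad g(t) = t^2 . \]

So

\[ f(x) = \frac{1}{\sqrt{x}} \quad\text{and}\quad g(t-x) = (t-x)^2 , \]

and

\[
\begin{aligned}
\frac{1}{\sqrt{t}} * t^2 = f * g(t)
  &= \int_0^t \frac{1}{\sqrt{x}}(t-x)^2\,dx \\
  &= \int_0^t x^{-1/2}\left[t^2 - 2tx + x^2\right]dx \\
  &= \int_0^t \left[t^2x^{-1/2} - 2tx^{1/2} + x^{3/2}\right]dx \\
  &= \left[t^2\,2x^{1/2} - 2t\,\frac{2}{3}x^{3/2} + \frac{2}{5}x^{5/2}\right]_{x=0}^{t} \\
  &= 2t^2\cdot t^{1/2} - \frac{4}{3}t\cdot t^{3/2} + \frac{2}{5}t^{5/2} .
\end{aligned}
\]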


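After a little algebra and arithmetic, this reduces to

\[ \frac{1}{\sqrt{t}} * t^2 = \frac{16}{15}t^{5/2} . \tag{26.1} \]

Thus, to compute

\[ \frac{1}{\sqrt{t}} * t^2 \quad\text{when}\quad t = 4 , \]

we actually compute

\[ \frac{16}{15}t^{5/2} \quad\text{with}\quad t = 4 , \]

obtaining

\[ \frac{16}{15}\cdot 4^{5/2} = \frac{16}{15}\cdot 2^5 = \frac{512}{15} . \]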
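
The value $512/15$ can be cross-checked numerically. The sketch below is one possible approach, not the book's: because $1/\sqrt{x}$ blows up at $x = 0$, it first substitutes $x = u^2$ (so $dx = 2u\,du$), which turns the convolution integral into $\int_0^{\sqrt{t}} 2\,g(t - u^2)\,du$ with a smooth integrand, and then applies a hand-written Simpson's rule. The function names are invented for this illustration.

```python
import math

def simpson(func, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3

def conv_inv_sqrt_with(g, t, n=2000):
    """Approximate (1/sqrt(t)) * g at t, using x = u**2 to remove the singularity."""
    # integral_0^t x^(-1/2) g(t - x) dx  =  integral_0^sqrt(t) 2 g(t - u^2) du
    return simpson(lambda u: 2.0 * g(t - u * u), 0.0, math.sqrt(t), n)

value = conv_inv_sqrt_with(lambda tau: tau**2, 4.0)
exact = 512 / 15
```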

Basic Identities
Let us quickly note a few easily verified identities that can simplify the computation of some
convolutions.
The first identity is trivial to derive. Let $\alpha$ be a constant, and let $f$ and $g$ be two functions.
Then, of course,

\[ \int_0^t [\alpha f(x)]g(t-x)\,dx = \int_0^t f(x)[\alpha g(t-x)]\,dx = \alpha\int_0^t f(x)g(t-x)\,dx , \]

which we can rewrite as

\[ [\alpha f] * g = f * [\alpha g] = \alpha[f * g] . \]

In other words, we can “factor out constants”.
A more substantial identity comes from looking at how switching the roles of $f$ and $g$ changes
the convolution. That is, how does the result of computing

\[ g * f(t) = \int_0^t g(x)f(t-x)\,dx \]

compare to what we get by computing

\[ f * g(t) = \int_0^t f(x)g(t-x)\,dx \ ? \]

Well, in the last integral, let’s use the substitution $y = t - x$. Then $x = t - y$, $dx = -dy$ and

\[
\begin{aligned}
f * g(t) &= \int_{x=0}^{t} f(x)g(t-x)\,dx \\
         &= \int_{y=t-0}^{t-t} f(t-y)g(y)(-1)\,dy \\
         &= -\int_t^0 g(y)f(t-y)\,dy = \int_0^t g(y)f(t-y)\,dy .
\end{aligned}
\]

The last integral is exactly the same as the integral for computing $g * f(t)$, except for the cosmetic
change of denoting the variable of integration by $y$ instead of $x$. So that integral is the formula for
$g * f(t)$, and our computations just above reduce to

\[ f * g(t) = g * f(t) . \tag{26.2} \]

Thus we see that convolution is “commutative”.


!Example 26.3: Let’s consider the convolution

\[ t^2 * \frac{1}{\sqrt{t}} . \]

Since we just showed that convolution is commutative, we know that

\[ t^2 * \frac{1}{\sqrt{t}} = \frac{1}{\sqrt{t}} * t^2 . \]

What an incredible stroke of luck! We’ve already computed the convolution on the right in
example 26.2. Checking back to equation (26.1), we find

\[ \frac{1}{\sqrt{t}} * t^2 = \frac{16}{15}t^{5/2} . \]

Hence,

\[ t^2 * \frac{1}{\sqrt{t}} = \frac{1}{\sqrt{t}} * t^2 = \frac{16}{15}t^{5/2} . \]

In addition to being commutative, convolution is “distributive” and “associative”. That is, given
three functions $f$, $g$ and $h$,

\[ [f + g] * h = [f * h] + [g * h] , \tag{26.3} \]

\[ f * [g + h] = [f * g] + [f * h] \tag{26.4} \]

and

\[ f * [g * h] = [f * g] * h . \tag{26.5} \]

The first and second equations state that convolution distributes over addition. They are easily
confirmed using the basic definition of convolution. For the first:

\[
\begin{aligned}
[f + g] * h(t) &= \int_0^t [f(x) + g(x)]h(t-x)\,dx \\
               &= \int_0^t [f(x)h(t-x) + g(x)h(t-x)]\,dx \\
               &= \int_0^t f(x)h(t-x)\,dx + \int_0^t g(x)h(t-x)\,dx \\
               &= [f * h] + [g * h] .
\end{aligned}
\]

The second, equation (26.4), follows in a similar manner or by combining (26.3) with the
commutativity of the convolution. The last equation in the list, equation (26.5), states that convolution is
“associative”; that is, when convolving three functions together, it does not matter which two you
convolve first. Its verification requires showing that the two double integrals defining

\[ f * [g * h] \quad\text{and}\quad [f * g] * h \]

are equivalent. This is a relatively straightforward exercise in substitution, and will be left as a
challenge for the interested student (exercise 26.3 on page 507).
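
The commutative and distributive identities are easy to spot-check numerically. The following sketch is illustrative only: the three test functions, the evaluation point $t = 1.5$, and the helper `convolve` (a plain Simpson's-rule approximation of $\int_0^t f(x)g(t-x)\,dx$) are all choices made here rather than anything prescribed by the text.

```python
import math

def simpson(func, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3

def convolve(f, g, t, n=2000):
    """Approximate f*g(t) = integral_0^t f(x) g(t - x) dx."""
    return simpson(lambda x: f(x) * g(t - x), 0.0, t, n)

f = math.sin
g = lambda t: math.exp(-t)
h = lambda t: t * t
t = 1.5

# Equation (26.2): f*g = g*f.
commutative_gap = abs(convolve(f, g, t) - convolve(g, f, t))

# Equation (26.3): [f + g]*h = [f*h] + [g*h].
sum_fg = lambda x: f(x) + g(x)
distributive_gap = abs(convolve(sum_fg, h, t) - (convolve(f, h, t) + convolve(g, h, t)))
```

Both gaps should be at the level of floating-point rounding. A check of associativity would need nested integrals and is omitted here.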


Finally, just for fun, let’s make three more simple observations:

\[ 0 * g(t) = g * 0(t) = \int_0^t 0\cdot g(t-x)\,dx = 0 , \]

\[ f * 1(t) = 1 * f(t) = \int_0^t f(x)\cdot 1\,dx = \int_0^t f(x)\,dx , \]

\[ f * g(0) = \int_0^0 f(x)g(0-x)\,dx = 0 . \]

Observations on the Existence of the Convolution

The observant reader will have noted that, if f and g are at least piecewise continuous on (0, ∞) ,
then, for any positive value t , the product f (x)g(t − x) is a piecewise continuous function of x
on (0, t) . It then follows that the integral in
t
f ∗ g(t) = f (x)g(t − x) dx
0

is well defined and finite for every positive value of t . In other words, f ∗ g is a well-defined
function on (0, ∞) , at least whenever f and g are both piecewise continuous on (0, ∞) . (In fact,
it can then even be shown that f ∗ g(t) is a continuous function on [0, ∞) .)
But now observe that one of the functions in example 26.2, namely $t^{-1/2}$, ‘blows up’ at
$t = 0$ and, thus, is not piecewise continuous on $(0, \infty)$. So that example also demonstrates that,
sometimes, $f * g$ is well defined on $(0, \infty)$ even though $f$ or $g$ is not piecewise continuous.

26.2 Convolution and Products of Transforms

To see one reason convolution is important in the study of Laplace transforms, let us examine the
Laplace transform of the convolution of two functions $f(t)$ and $g(t)$. Our goal is a surprisingly
simple formula in terms of the corresponding transforms,

\[ F(s) = \mathcal{L}[f(t)]\big|_s = \int_0^\infty f(t)e^{-st}\,dt \]

and

\[ G(s) = \mathcal{L}[g(t)]\big|_s = \int_0^\infty g(t)e^{-st}\,dt . \]

(The impatient can turn to theorem 26.1 on page 501 for that formula.)
Keep in mind that we can rename the variable of integration in each of the above integrals. In
particular, note (for future reference) that

\[ F(s) = \int_0^\infty e^{-sx}f(x)\,dx \quad\text{and}\quad G(s) = \int_0^\infty e^{-sy}g(y)\,dy . \]


Figure 26.1: The region R for the transform of the convolution, shown in the X T–plane together
with the line t = x. Note that the coordinates of any point $(x_0, t_0)$ in R must satisfy
$0 < x_0 < t_0 < \infty$.

Now, simply writing out the integral formulas for the Laplace transform and for the convolution
yields

\[
\begin{aligned}
\mathcal{L}[f * g(t)]\big|_s &= \int_0^\infty e^{-st} f * g(t)\,dt \\
  &= \int_{t=0}^\infty e^{-st}\left[\int_{x=0}^{t} f(x)g(t-x)\,dx\right]dt \\
  &= \int_{t=0}^\infty \int_{x=0}^{t} e^{-st}f(x)g(t-x)\,dx\,dt .
\end{aligned}
\]

Combined with the observation that

\[ e^{-st} = e^{-st+sx-sx} = e^{-s(t-x)}e^{-sx} , \]

the above sequence becomes

\[
\begin{aligned}
\mathcal{L}[f * g(t)]\big|_s &= \int_{t=0}^\infty \int_{x=0}^{t} e^{-sx}f(x)\,e^{-s(t-x)}g(t-x)\,dx\,dt \\
  &= \int_{t=0}^\infty \int_{x=0}^{t} K(x,t)\,dx\,dt
\end{aligned}
\tag{26.6}
\]

where, simply to simplify expressions in the next few lines, we’ve set

\[ K(x,t) = e^{-sx}f(x)\,e^{-s(t-x)}g(t-x) . \]

It is now convenient to switch the order of integration in the last double integral. According to
the limits in that double integral, we are integrating over the region R in the X T–plane consisting
of all $(x, t)$ for which

\[ 0 < t < \infty \]

and, for each of these values of $t$,

\[ 0 < x < t . \]

As illustrated in figure 26.1, region R is the portion of the first quadrant of the X T–plane to the
left of the line $t = x$. Equivalently, as can also be seen in this figure, R is the portion of the first
quadrant above the line $t = x$. So R can also be described as the set of all $(x, t)$ for which

\[ 0 < x < \infty \]


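and, for each of these values of $x$,

\[ x < t < \infty . \]

Thus,

\[ \int_{t=0}^\infty \int_{x=0}^{t} K(x,t)\,dx\,dt = \iint_R K(x,t)\,dA = \int_{x=0}^\infty \int_{t=x}^\infty K(x,t)\,dt\,dx . \]

Combining this with equation (26.6), and then continuing, we have

\[
\begin{aligned}
\mathcal{L}[f * g(t)]\big|_s &= \int_{x=0}^\infty \int_{t=x}^\infty K(x,t)\,dt\,dx \\
  &= \int_{x=0}^\infty \int_{t=x}^\infty e^{-sx}f(x)\,e^{-s(t-x)}g(t-x)\,dt\,dx \\
  &= \int_{x=0}^\infty e^{-sx}f(x)\left[\int_{t=x}^\infty e^{-s(t-x)}g(t-x)\,dt\right]dx .
\end{aligned}
\]

Let us simplify the inner integral with the substitution $y = t - x$ (remembering that $t$ is the variable
of integration in this integral):

\[ \int_{t=x}^\infty e^{-s(t-x)}g(t-x)\,dt = \int_{y=x-x}^{\infty-x} e^{-sy}g(y)\,dy = \int_0^\infty e^{-sy}g(y)\,dy = G(s) . \]

Combining this with our last formula for $\mathcal{L}[f * g]$ then yields

\[
\begin{aligned}
\mathcal{L}[f * g(t)]\big|_s &= \int_0^\infty f(x)e^{-sx}G(s)\,dx \\
  &= \left[\int_0^\infty e^{-sx}f(x)\,dx\right]\cdot G(s) = F(s)\cdot G(s) .
\end{aligned}
\]

Thus,

\[ \mathcal{L}[f * g(t)]\big|_s = F(s)G(s) . \]

Equivalently,

\[ f * g(t) = \mathcal{L}^{-1}[F(s)G(s)]\big|_t . \]
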
If we had been a little more complete in our computations, we would have kept track of the
exponential order of all the functions involved (see exercise 26.9), and obtained all of the following
theorem.

Theorem 26.1 (Laplace convolution identities)

Assume f (t) and g(t) are two functions of exponential order s0 , and with Laplace transforms

F(s) = L[ f (t)]|s and G(s) = L[g(t)]|s .

Then the convolution f ∗ g(t) is of exponential order s1 for any s1 > s0 . Moreover,

L[ f ∗ g(t)]|s = F(s)G(s) for s > s0 (26.7)

and
L−1 [F(s)G(s)]|t = f ∗ g(t) . (26.8)

Do remember that identities (26.7) and (26.8) are equivalent. It is also worthwhile to rewrite
these identities as
\[ \mathcal{L}[f * g(t)]\big|_s = \mathcal{L}[f(t)]\big|_s \cdot \mathcal{L}[g(t)]\big|_s \tag{26.7$'$} \]


and
\[ \mathcal{L}^{-1}[F(s)G(s)]\big|_t = \mathcal{L}^{-1}[F(s)]\big|_t * \mathcal{L}^{-1}[G(s)]\big|_t , \tag{26.8$'$} \]
respectively. These forms, especially the latter, are sometimes a little more convenient in practice.

!Example 26.4: Consider ﬁnding the inverse Laplace transform of

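\[ \frac{1}{s^2 - 10s + 21} . \]

Factoring the denominator and applying the above, we get

\[
\begin{aligned}
\mathcal{L}^{-1}\left[\frac{1}{s^2 - 10s + 21}\right]\bigg|_t
  &= \mathcal{L}^{-1}\left[\frac{1}{(s-3)(s-7)}\right]\bigg|_t \\
  &= \mathcal{L}^{-1}\left[\frac{1}{s-3}\cdot\frac{1}{s-7}\right]\bigg|_t \\
  &= \mathcal{L}^{-1}\left[\frac{1}{s-3}\right]\bigg|_t * \mathcal{L}^{-1}\left[\frac{1}{s-7}\right]\bigg|_t = e^{3t} * e^{7t} .
\end{aligned}
\]

As luck would have it, this convolution was computed in example 26.1 on page 495,

\[ e^{3t} * e^{7t} = \frac{1}{4}\left[e^{7t} - e^{3t}\right] . \]

Thus,

\[ \mathcal{L}^{-1}\left[\frac{1}{s^2 - 10s + 21}\right]\bigg|_t = e^{3t} * e^{7t} = \frac{1}{4}\left[e^{7t} - e^{3t}\right] . \]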
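
Identity (26.7) can be sanity-checked on this example in two independent ways. The sketch below is an illustration with invented helper names. First, it compares $F(s)G(s)$ with the transform of $\frac14[e^{7t} - e^{3t}]$ exactly, using $\mathcal{L}[e^{at}]|_s = 1/(s-a)$ at a few arbitrary integers $s > 7$. Second, it approximates the convolution integral itself by Simpson's rule at the arbitrarily chosen point $t = 0.5$.

```python
import math
from fractions import Fraction

def simpson(func, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3

def product_of_transforms(s):
    """F(s) G(s) with F(s) = 1/(s - 3) and G(s) = 1/(s - 7)."""
    return Fraction(1, s - 3) * Fraction(1, s - 7)

def transform_of_result(s):
    """Transform of (e^{7t} - e^{3t})/4, using L[e^{at}] = 1/(s - a)."""
    return Fraction(1, 4) * (Fraction(1, s - 7) - Fraction(1, s - 3))

exact_match = all(product_of_transforms(s) == transform_of_result(s) for s in (8, 10, 25))

t = 0.5
numeric = simpson(lambda x: math.exp(3 * x) * math.exp(7 * (t - x)), 0.0, t)
closed_form = (math.exp(7 * t) - math.exp(3 * t)) / 4
```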

The inverse transform in the last example could also have been computed using partial fractions.
Indeed, many of the inverse transforms we computed using partial fractions can also be computed
using convolution. Whether one approach or the other is preferred depends on the opinion of the
person doing the computing. However, as the next example shows, there are cases where convolution
can be applied, but not partial fractions. We will also use this example to demonstrate how convolution
naturally arises when solving differential equations.

!Example 26.5: Consider solving the initial-value problem

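\[ y'' + 9y = \frac{1}{\sqrt{t}} \quad\text{with}\quad y(0) = 0 \quad\text{and}\quad y'(0) = 0 . \]

Taking the Laplace transform of both sides:

\[
\begin{aligned}
 & \mathcal{L}\left[y'' + 9y\right]\big|_s = \mathcal{L}\left[\frac{1}{\sqrt{t}}\right]\bigg|_s \\
 \Longrightarrow\quad & \mathcal{L}\left[y''\right]\big|_s + 9\mathcal{L}[y]\big|_s = \frac{\sqrt{\pi}}{\sqrt{s}} \\
 \Longrightarrow\quad & s^2Y(s) - s\underbrace{y(0)}_{0} - \underbrace{y'(0)}_{0} + 9Y(s) = \frac{\sqrt{\pi}}{\sqrt{s}} \\
 \Longrightarrow\quad & \left(s^2 + 9\right)Y(s) = \frac{\sqrt{\pi}}{\sqrt{s}} \\
 \Longrightarrow\quad & Y(s) = \frac{\sqrt{\pi}}{\sqrt{s}\left(s^2+9\right)} .
\end{aligned}
\]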


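Thus, $y(t)$ is the inverse Laplace transform of

\[ \frac{\sqrt{\pi}}{\sqrt{s}\left(s^2+9\right)} . \]

Because the denominator does not factor into two polynomials, we cannot use partial fractions;
we must use convolution,

\[
\mathcal{L}^{-1}\left[\frac{\sqrt{\pi}}{\sqrt{s}\left(s^2+9\right)}\right]\bigg|_t
  = \mathcal{L}^{-1}\left[\frac{\sqrt{\pi}}{\sqrt{s}}\cdot\frac{1}{s^2+9}\right]\bigg|_t
  = \mathcal{L}^{-1}\left[\frac{\sqrt{\pi}}{\sqrt{s}}\right]\bigg|_t * \mathcal{L}^{-1}\left[\frac{1}{s^2+9}\right]\bigg|_t .
\]

Reversing the transform made on the right side of the above equations, we have

\[ \mathcal{L}^{-1}\left[\frac{\sqrt{\pi}}{\sqrt{s}}\right]\bigg|_t = \frac{1}{\sqrt{t}} . \]

Using our tables, we find that

\[ \mathcal{L}^{-1}\left[\frac{1}{s^2+9}\right]\bigg|_t = \frac{1}{3}\,\mathcal{L}^{-1}\left[\frac{3}{s^2+3^2}\right]\bigg|_t = \frac{1}{3}\sin(3t) . \]

Combining the above and recalling that “constants factor out”, we then obtain

\[
\begin{aligned}
y(t) = \mathcal{L}^{-1}\left[\frac{\sqrt{\pi}}{\sqrt{s}\left(s^2+9\right)}\right]\bigg|_t
  &= \mathcal{L}^{-1}\left[\frac{\sqrt{\pi}}{\sqrt{s}}\right]\bigg|_t * \mathcal{L}^{-1}\left[\frac{1}{s^2+9}\right]\bigg|_t \\
  &= \frac{1}{\sqrt{t}} * \frac{1}{3}\sin(3t) \\
  &= \frac{1}{3}\,\frac{1}{\sqrt{t}} * \sin(3t) .
\end{aligned}
\]

That is,

\[ y(t) = \frac{1}{3}\int_0^t \frac{1}{\sqrt{x}}\sin(3[t-x])\,dx . \]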
Admittedly, this last integral is not easily evaluated by hand. But it is something that can be
accurately approximated for any speciﬁc (nonnegative) value of t using routines found in many
computer math packages. So it is still a usable formula.
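
To make the remark about numerical approximation concrete, here is a minimal sketch using only the Python standard library rather than a dedicated math package. It is one possible approach under stated assumptions: the substitution $x = u^2$ is used to remove the $1/\sqrt{x}$ singularity, a hand-written Simpson's rule does the integration, and a finite-difference test at the arbitrarily chosen point $t = 1$ checks that the computed values are consistent with $y'' + 9y = 1/\sqrt{t}$.

```python
import math

def simpson(func, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3

def y(t, n=2000):
    """Approximate y(t) = (1/3) * integral_0^t x^(-1/2) sin(3(t - x)) dx."""
    # Substituting x = u^2 (dx = 2u du) turns the integrand into the smooth
    # function 2 sin(3(t - u^2)) on 0 <= u <= sqrt(t).
    return simpson(lambda u: 2.0 * math.sin(3.0 * (t - u * u)), 0.0, math.sqrt(t), n) / 3.0

# Finite-difference check of the differential equation at t = 1.
t, step = 1.0, 1e-3
second_derivative = (y(t + step) - 2.0 * y(t) + y(t - step)) / step**2
residual = second_derivative + 9.0 * y(t) - 1.0 / math.sqrt(t)
```

The residual should be small but not zero, since both the quadrature and the difference quotient introduce error.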

26.3 Convolution and Differential Equations (Duhamel’s Principle)

As illustrated in our last example, convolution has a natural role in solving differential equations
when using the Laplace transform. However, if we look a little more carefully at the process of
solving differential equations using the Laplace transform, we will ﬁnd that convolution can play an
even more signiﬁcant role.

!Example 26.6: Let’s consider solving the nonhomogeneous initial-value problem

\[ y'' - 10y' + 21y = f(t) \quad\text{with}\quad y(0) = 0 \quad\text{and}\quad y'(0) = 0 \]


where f = f (t) is any Laplace transformable function. Naturally, we will use the Laplace
transform. So let
Y (s) = L[y(t)]|s and F(s) = L[ f (t)]|s .

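Because of our initial conditions, the “transform of the derivatives” identities simplify
considerably:

\[ \mathcal{L}\left[y''\right]\big|_s = s^2Y(s) - s\underbrace{y(0)}_{0} - \underbrace{y'(0)}_{0} = s^2Y(s) \]

and

\[ \mathcal{L}\left[y'\right]\big|_s = sY(s) - \underbrace{y(0)}_{0} = sY(s) . \]

Consequently,

\[
\begin{aligned}
 & \mathcal{L}\left[y'' - 10y' + 21y\right]\big|_s = \mathcal{L}[f(t)]\big|_s \\
 \Longrightarrow\quad & \mathcal{L}\left[y''\right]\big|_s - 10\mathcal{L}\left[y'\right]\big|_s + 21\mathcal{L}[y]\big|_s = F(s) \\
 \Longrightarrow\quad & s^2Y(s) - 10sY(s) + 21Y(s) = F(s) \\
 \Longrightarrow\quad & \left(s^2 - 10s + 21\right)Y(s) = F(s) .
\end{aligned}
\]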

Dividing through by the polynomial, we get

and then letting T → ∞ , you can show that y = h ∗ f is well deﬁned and satisﬁes the
initial-value problem even when f is merely piecewise continuous on (0, ∞) .
3. It is not hard to show that, if you want the solution to

\[ a_0y^{(N)} + a_1y^{(N-1)} + \cdots + a_{N-2}y'' + a_{N-1}y' + a_Ny = f(t) , \]

but satisfying nonzero initial conditions, then you simply need to add the solution obtained by
Duhamel’s principle to the solution to the corresponding homogeneous differential equation

\[ a_0y^{(N)} + a_1y^{(N-1)} + \cdots + a_{N-2}y'' + a_{N-1}y' + a_Ny = 0 \]

that satisfies the desired initial conditions.

Additional Exercises

26.2. Compute the convolution $f * g(t)$ of each of the following pairs of functions:
a. $f(t) = e^{3t}$ and $g(t) = e^{5t}$    b. $f(t) = \dfrac{1}{\sqrt{t}}$ and $g(t) = t^2$
c. $f(t) = \sqrt{t}$ and $g(t) = 6$    d. $f(t) = t$ and $g(t) = e^{3t}$
e. $f(t) = t^2$ and $g(t) = t^2$    f. $f(t) = \sin(t)$ and $g(t) = t$
g. $f(t) = \sin(t)$ and $g(t) = \sin(t)$    h. $f(t) = \sin(t)$ and $g(t) = e^{-3t}$

26.3. Verify the associative property of convolution. That is, verify equation (26.5) on page 498.


26.4. Using convolution, compute the inverse Laplace transform of each of the following:
a. $\dfrac{1}{(s-4)(s-3)}$    b. $\dfrac{1}{s(s-3)}$    c. $\dfrac{1}{s\left(s^2+4\right)}$
d. $\dfrac{1}{(s-3)\left(s^2+1\right)}$    e. $\dfrac{1}{\left(s^2+9\right)^2}$    f. $\dfrac{s^2}{\left(s^2+4\right)^2}$
g. $\dfrac{1}{\sqrt{s}\,(s-3)}$ (leave in integral form)    h. $\dfrac{1}{\sqrt{s}\left(s^2+4\right)}$ (leave in integral form)

26.5. For each of the following initial-value problems, find the corresponding transfer function
$H$ and the impulse response function $h$, and write down the corresponding convolution
integral formula for the solution:
a. $y'' + 4y = f(t)$ with $y(0) = 0$ and $y'(0) = 0$

b. $y'' - 4y = f(t)$ with $y(0) = 0$ and $y'(0) = 0$

c. $y'' - 6y' + 9y = f(t)$ with $y(0) = 0$ and $y'(0) = 0$

d. $y'' - 6y' + 18y = f(t)$ with $y(0) = 0$ and $y'(0) = 0$

e. $y'' + 16y = f(t)$ with $y(0) = 0$ and $y'(0) = 0$

26.6. Using the results from exercise 26.5 a, find the solution to
\[ y'' + 4y = f(t) \quad\text{with}\quad y(0) = 0 \quad\text{and}\quad y'(0) = 0 \]
for each of the following choices of $f$:
a. $f(t) = 1$    b. $f(t) = t$    c. $f(t) = e^{3t}$
d. $f(t) = \sin(2t)$    e. $f(t) = \sin(\alpha t)$ where $\alpha \ne 2$

26.7. Using the results from exercise 26.5 c, find the solution to
\[ y'' - 6y' + 9y = f(t) \quad\text{with}\quad y(0) = 0 \quad\text{and}\quad y'(0) = 0 \]
for each of the following choices of $f$:
a. $f(t) = 1$    b. $f(t) = t$    c. $f(t) = e^{3t}$
d. $f(t) = e^{-3t}$    e. $f(t) = e^{\alpha t}$ where $\alpha \ne 3$

26.8. Using the results from exercise 26.5 e, find the solution to
\[ y'' + 16y = f(t) \quad\text{with}\quad y(0) = 0 \quad\text{and}\quad y'(0) = 0 \]
for each of the following choices of $f$:
a. $f(t) = 1$    b. $f(t) = t$    c. $f(t) = e^{3t}$
d. $f(t) = \sin(4t)$    e. $f(t) = \sin(\alpha t)$ where $\alpha \ne 4$

26.9. Let $f$ and $g$ be two piecewise continuous functions on the positive real line satisfying, for
all $t > 0$,
\[ |f(t)| < M_f e^{s_0 t} \quad\text{and}\quad |g(t)| < M_g e^{s_0 t} \]
for some constants $M_f$, $M_g$ and $s_0$.
a. Show that $|f * g(t)| < M_f M_g\, t\, e^{s_0 t}$ whenever $t > 0$.
b. Why does this tell us that $f * g$ is of exponential order $s_1$ for any $s_1 > s_0$?


27 Piecewise-Defined Functions and Periodic Functions

At the start of our study of the Laplace transform, it was claimed that the Laplace transform is
“particularly useful when dealing with nonhomogeneous equations in which the forcing functions
are not continuous”. Thus far, however, we’ve done precious little with any discontinuous functions
other than step functions. Let us now rectify the situation by looking at the sort of discontinuous
functions (and, more generally, “piecewise-defined” functions) that often arise in applications, and
develop tools and skills for dealing with these functions.
We will also take a brief look at transforms of periodic functions other than sines and cosines.
As you will see, many of these functions are, themselves, piecewise defined. And finally, we will use
some of the material we’ve recently developed to re-examine the issue of resonance in mass/spring
systems.

27.1 Piecewise-Deﬁned Functions

Piecewise-Defined Functions, Defined
When we talk about a “discontinuous function f ” in the context of Laplace transforms, we usually
mean f is a piecewise continuous function that is not continuous on the interval (0, ∞) . Such a
function will have jump discontinuities at isolated points in this interval. Computationally, however,
the real issue is often not so much whether there is a nonzero jump in the graph of f at a point
t0 , but whether the formula for computing f (t) is the same on either side of t0 . So we really
should be looking at the more general class of “piecewise-defined” functions that, at worst, have
jump discontinuities.
Just what is a piecewise-defined function? It is any function given by different formulas on
different intervals. For example,
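\[
f(t) = \begin{cases} 0 & \text{if } t < 1 \\ 1 & \text{if } 1 < t < 2 \\ 0 & \text{if } 2 < t \end{cases}
\qquad\text{and}\qquad
g(t) = \begin{cases} 0 & \text{if } t \le 1 \\ t - 1 & \text{if } 1 < t < 2 \\ 1 & \text{if } 2 \le t \end{cases}
\]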

are two relatively simple piecewise-defined functions. The first (sketched in figure 27.1a) is discon-
tinuous because it has nontrivial jumps at t = 1 and t = 2 . However, the second function (sketched
in figure 27.1b) is continuous because t − 1 goes from 0 to 1 as t goes from 1 to 2 . There are
no jumps in the graph of g .
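As a quick illustration (a minimal Python sketch, not something from the text itself), the two sets of conditional formulas can be coded directly, and the size of a jump estimated by comparing values just to the right and just to the left of a point. The helper name `jump` and the offset `eps` are assumptions made only for this illustration.

```python
def f(t):
    """First example: 1 on (1, 2) and 0 outside [1, 2]."""
    if t < 1 or t > 2:
        return 0.0
    if 1 < t < 2:
        return 1.0
    return None  # the value at a jump (t = 1 or t = 2) is left unspecified


def g(t):
    """Second example: 0 up to t = 1, then t - 1, then 1 from t = 2 on."""
    if t <= 1:
        return 0.0
    if t < 2:
        return t - 1.0
    return 1.0


def jump(func, t0, eps=1e-9):
    """Rough estimate of the jump at t0: value just right minus value just left."""
    return func(t0 + eps) - func(t0 - eps)


if __name__ == "__main__":
    for name, func in (("f", f), ("g", g)):
        for t0 in (1.0, 2.0):
            print(name, "jump at", t0, "is about", round(jump(func, t0), 6))
```

The estimates for f come out close to 1 and −1, while those for g are essentially zero, which matches the observation that only f is discontinuous.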


Figure 27.1: The graphs of two piecewise-defined functions: (a) f and (b) g .

By the way, we may occasionally refer to the sort of lists used above to deﬁne f (t) and g(t) as
sets of conditional formulas for f and g , simply because these are sets of formulas with conditions
stating when each formula is to be used.
Do note that, in the above formula set for f , we did not specify the values of f (t) when t = 1
or t = 2 . This was because f has jump discontinuities at these points and, as we agreed in chapter
23 (see page 455), we are not concerned with the precise value of a function at its discontinuities.
On the other hand, using the formula set given above for g , you can easily verify that

\[ \lim_{t\to 1^-} g(t) = 0 = \lim_{t\to 1^+} g(t) \quad\text{and}\quad \lim_{t\to 2^-} g(t) = 1 = \lim_{t\to 2^+} g(t) ; \]

so there is not a true jump in g at these points. That is why we went ahead and speciﬁed that

g(1) = 0 and g(2) = 1 .

In the future, let us agree that, even if the value of a particular function f or g is not explicitly
specified at a particular point t0 , as long as the left- and right-hand limits of the function at t0 are
defined and equal, then the function is defined at t0 and is equal to those limits. That is, we’ll assume

\[ f(t_0) = \lim_{t\to t_0^-} f(t) = \lim_{t\to t_0^+} f(t) \]
whenever
\[ \lim_{t\to t_0^-} f(t) = \lim_{t\to t_0^+} f(t) . \]

This will simplify notation a little and may keep us from worrying about issues of continuity when
those issues are not important.

Step Functions, Again

Most people would probably consider the step functions to be the simplest piecewise-deﬁned func-
tions. These include the basic step function,

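\[ \operatorname{step}(t) = \begin{cases} 0 & \text{if } t < 0 \\ 1 & \text{if } 0 \le t \end{cases} \]

(sketched in figure 27.2a), as well as the step function at a point α ,

\[ \operatorname{step}_\alpha(t) = \operatorname{step}(t - \alpha) = \begin{cases} 0 & \text{if } t < \alpha \\ 1 & \text{if } \alpha < t \end{cases} \]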

(sketched in ﬁgure 27.2b).


The “Translation Along the T -Axis” Identity 513

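\[
\begin{aligned}
{} &= \int_{t=0}^{\alpha} f(t-\alpha)\operatorname{step}_\alpha(t)\,e^{-st}\,dt + \int_{t=\alpha}^{\infty} f(t-\alpha)\operatorname{step}_\alpha(t)\,e^{-st}\,dt \\
&= \int_{t=0}^{\infty} f(t-\alpha)\operatorname{step}_\alpha(t)\,e^{-st}\,dt \\
&= \mathcal{L}\bigl[f(t-\alpha)\operatorname{step}_\alpha(t)\bigr]\big|_s .
\end{aligned}
\]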

Combining the above computations with equation set (27.1) then gives us
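\[ e^{-\alpha s}F(s) = \cdots = \int_{\tau=\alpha}^{\infty} f(\tau-\alpha)\,e^{-s\tau}\,d\tau = \cdots = \mathcal{L}\bigl[f(t-\alpha)\operatorname{step}_\alpha(t)\bigr]\big|_s . \]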

Cutting out the middle, we get our second translation identity:

Theorem 27.1 (second translation identity [translation along the T –axis])

Let
F(s) = L[ f (t)]|s
where f is any Laplace transformable function. Then, for any positive constant α ,

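\[ \mathcal{L}\bigl[f(t-\alpha)\operatorname{step}_\alpha(t)\bigr]\big|_s = e^{-\alpha s}F(s) . \tag{27.2a} \]
Equivalently,
\[ \mathcal{L}^{-1}\bigl[e^{-\alpha s}F(s)\bigr]\big|_t = f(t-\alpha)\operatorname{step}_\alpha(t) . \tag{27.2b} \]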

Computing Inverse Transforms

The Basic Computations
Computing inverse transforms using the translation along the T –axis identity is usually straightfor-
ward.

!Example 27.2: Consider ﬁnding the inverse Laplace transform of

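\[ \frac{e^{-2s}}{s^2+1} . \]
Applying the identity, we have
\[
\mathcal{L}^{-1}\left[\frac{e^{-2s}}{s^2+1}\right]\bigg|_t
= \mathcal{L}^{-1}\Bigl[e^{-2s}\underbrace{\frac{1}{s^2+1}}_{F(s)}\Bigr]\bigg|_t
= \mathcal{L}^{-1}\bigl[e^{-2s}F(s)\bigr]\big|_t
= f(t-2)\operatorname{step}_2(t) .
\]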

Here the inverse transform of F is easily read off the tables:

\[ f(t) = \mathcal{L}^{-1}[F(s)]\big|_t = \mathcal{L}^{-1}\left[\frac{1}{s^2+1}\right]\bigg|_t = \sin(t) . \]

So, for any X ,

f (X) = sin(X) .
Using this with X = t − 2 in the above inverse transform computation then yields
\[ \mathcal{L}^{-1}\left[\frac{e^{-2s}}{s^2+1}\right]\bigg|_t = f(t-2)\operatorname{step}_2(t) = \sin(t-2)\operatorname{step}_2(t) . \]


Figure 27.3: The graph of sin(t − 2) step(t − 2) .

Keep in mind that

\[ \sin(t-2)\operatorname{step}_2(t) = \begin{cases} 0 & \text{if } t < 2 \\ \sin(t-2) & \text{if } 2 < t \end{cases} . \]
The graph of this function is sketched in ﬁgure 27.3.

Observe that, as illustrated in ﬁgure 27.3, the graph of

\[ \mathcal{L}^{-1}\bigl[e^{-\alpha s}F(s)\bigr]\big|_t = f(t-\alpha)\operatorname{step}_\alpha(t) \]
is always zero for t < α , and is the graph of f (t) on [0, ∞) shifted by α for α ≤ t . Remembering
this can simplify graphing these types of functions.
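The identity can also be checked numerically. The following is a minimal Python sketch, assuming a truncated Laplace integral evaluated with the composite Simpson rule; the helper names `simpson` and `laplace_numeric`, the truncation point and the number of subintervals are all choices made for this illustration. It compares the transform of sin(t − 2) step₂(t) with e^{−2s}/(s² + 1), as in example 27.2.

```python
import math


def step(t, alpha=0.0):
    """step_alpha(t): 0 for t < alpha and 1 for alpha < t."""
    return 1.0 if t > alpha else 0.0


def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b] using an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3


def laplace_numeric(func, s, upper=60.0, n=60000):
    """Approximate the Laplace integral of func at s, truncated at t = upper."""
    return simpson(lambda t: func(t) * math.exp(-s * t), 0.0, upper, n)


def shifted_sine(t):
    """f(t - 2) step_2(t) with f(t) = sin(t)."""
    return math.sin(t - 2.0) * step(t, 2.0)


def claimed_transform(s):
    """e^{-2s} F(s) with F(s) = 1 / (s^2 + 1)."""
    return math.exp(-2.0 * s) / (s * s + 1.0)


if __name__ == "__main__":
    for s in (0.5, 1.0, 2.0):
        print(s, laplace_numeric(shifted_sine, s), claimed_transform(s))
```

The two columns should agree to several decimal places for each positive s tried; the truncation at t = 60 is harmless here because the integrand decays like e^{−st}.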

Describing Piecewise-Defined Functions Arising From Inverse Transforms
Let us start with a simple, but illustrative, example.

!Example 27.3: Consider computing the inverse Laplace transform of

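\[ F(s) = \frac{1}{s^2}e^{-s} - \frac{1}{s^2}e^{-2s} . \]
Going to the tables, we see that
\[ G(s) = \frac{1}{s^2} \quad\Longrightarrow\quad g(t) = t . \]
Using this, along with linearity and the second translation identity, we get
\[
\begin{aligned}
f(t) = \mathcal{L}^{-1}[F(s)]\big|_t
&= \mathcal{L}^{-1}\left[\frac{1}{s^2}e^{-s} - \frac{1}{s^2}e^{-2s}\right]\bigg|_t \\
&= \mathcal{L}^{-1}\left[\frac{1}{s^2}e^{-s}\right]\bigg|_t - \mathcal{L}^{-1}\left[\frac{1}{s^2}e^{-2s}\right]\bigg|_t \\
&= (t-1)\operatorname{step}_1(t) - (t-2)\operatorname{step}_2(t) .
\end{aligned}
\]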
Note that the step functions tell us that ‘signiﬁcant changes’ occur in f (t) at the points t = 1
and t = 2 .
While the above is a valid answer, it is not a particularly convenient answer. It would be
much easier to graph and see what f really is if we go further and completely compute f (t) on
the intervals having t = 1 and t = 2 as endpoints:
For t < 1 ,
\[ f(t) = (t-1)\underbrace{\operatorname{step}_1(t)}_{0} - (t-2)\underbrace{\operatorname{step}_2(t)}_{0} = 0 - 0 = 0 . \]


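For 1 < t < 2 ,

\[ f(t) = (t-1)\underbrace{\operatorname{step}_1(t)}_{1} - (t-2)\underbrace{\operatorname{step}_2(t)}_{0} = (t-1) - 0 = t - 1 . \]

For 2 < t ,

\[ f(t) = (t-1)\underbrace{\operatorname{step}_1(t)}_{1} - (t-2)\underbrace{\operatorname{step}_2(t)}_{1} = (t-1) - (t-2) = 1 . \]

Thus,
\[ f(t) = \begin{cases} 0 & \text{if } t < 1 \\ t - 1 & \text{if } 1 < t < 2 \\ 1 & \text{if } 2 < t \end{cases} . \]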
(This is the function sketched in ﬁgure 27.1b on page 509.)

As just illustrated, piecewise-deﬁned functions naturally arise when computing inverse Laplace
transforms using the second translation identity. Typically, use of this identity leads to an expression
of the form

\[ f(t) = g_0(t) + g_1(t)\operatorname{step}_{\alpha_1}(t) + g_2(t)\operatorname{step}_{\alpha_2}(t) + g_3(t)\operatorname{step}_{\alpha_3}(t) + \cdots \tag{27.3} \]

where f is the function of interest, the gk (t)’s are various formulas, and the αk ’s are positive
constants. This expression is a valid formula for f , and the step functions tell us that ‘signiﬁcant
changes’ occur in f (t) at the points t = α1 , t = α2 , t = α3 , . . . . Still, to get a better picture of
the function f (t) , we will want to obtain the formulas for f (t) over each of the intervals bounded
by the αk ’s . Assuming we were reasonably intelligent and indexed the αk ’s so that

0 < α1 < α2 < α3 < · · · ,

we would have
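For $t < \alpha_1$ ,

\[
\begin{aligned}
f(t) &= g_0(t) + g_1(t)\underbrace{\operatorname{step}_{\alpha_1}(t)}_{0} + g_2(t)\underbrace{\operatorname{step}_{\alpha_2}(t)}_{0} + g_3(t)\underbrace{\operatorname{step}_{\alpha_3}(t)}_{0} + \cdots \\
&= g_0(t) + 0 + 0 + 0 + \cdots = g_0(t) .
\end{aligned}
\]

For $\alpha_1 < t < \alpha_2$ ,

\[
\begin{aligned}
f(t) &= g_0(t) + g_1(t)\underbrace{\operatorname{step}_{\alpha_1}(t)}_{1} + g_2(t)\underbrace{\operatorname{step}_{\alpha_2}(t)}_{0} + g_3(t)\underbrace{\operatorname{step}_{\alpha_3}(t)}_{0} + \cdots \\
&= g_0(t) + g_1(t) + 0 + 0 + \cdots = g_0(t) + g_1(t) .
\end{aligned}
\]

For $\alpha_2 < t < \alpha_3$ ,

\[
\begin{aligned}
f(t) &= g_0(t) + g_1(t)\underbrace{\operatorname{step}_{\alpha_1}(t)}_{1} + g_2(t)\underbrace{\operatorname{step}_{\alpha_2}(t)}_{1} + g_3(t)\underbrace{\operatorname{step}_{\alpha_3}(t)}_{0} + \cdots \\
&= g_0(t) + g_1(t) + g_2(t) + 0 + \cdots = g_0(t) + g_1(t) + g_2(t) .
\end{aligned}
\]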

And so on.


Thus, the function f described by formula (27.3), above, is also given by the conditional set of
formulas
\[
f(t) = \begin{cases}
f_0(t) & \text{if } t < \alpha_1 \\
f_1(t) & \text{if } \alpha_1 < t < \alpha_2 \\
f_2(t) & \text{if } \alpha_2 < t < \alpha_3 \\
\;\vdots
\end{cases}
\]
where
\[
\begin{aligned}
f_0(t) &= g_0(t) , \\
f_1(t) &= g_0(t) + g_1(t) , \\
f_2(t) &= g_0(t) + g_1(t) + g_2(t) , \\
&\;\;\vdots
\end{aligned}
\]
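The passage from the step-function form to the conditional formulas is just a running sum of the g's. Here is a minimal Python sketch of that bookkeeping, assuming the switching points are given as a sorted list; the function names `step_form` and `conditional_form` are hypothetical, and the sample data are taken from example 27.3.

```python
from bisect import bisect_left


def step(t, alpha):
    """step_alpha(t): 0 for t < alpha and 1 for alpha < t."""
    return 1.0 if t > alpha else 0.0


def step_form(g_list, alphas):
    """f(t) = g0(t) + sum over k of g_k(t) step_{alpha_k}(t), as in formula (27.3)."""
    def f(t):
        switched = sum(g(t) * step(t, a) for g, a in zip(g_list[1:], alphas))
        return g_list[0](t) + switched
    return f


def conditional_form(g_list, alphas):
    """Same function via the conditional formulas f_k = g0 + g1 + ... + gk."""
    def f(t):
        k = bisect_left(alphas, t)  # number of switching points strictly below t
        return sum(g(t) for g in g_list[:k + 1])
    return f


# Data from example 27.3: f(t) = (t - 1) step_1(t) - (t - 2) step_2(t)
g_list = [lambda t: 0.0, lambda t: t - 1.0, lambda t: -(t - 2.0)]
alphas = [1.0, 2.0]
via_steps = step_form(g_list, alphas)
via_cases = conditional_form(g_list, alphas)

if __name__ == "__main__":
    for t in (0.5, 1.5, 3.0):
        print(t, via_steps(t), via_cases(t))
```

Away from the switching points the two forms return the same values, here 0, t − 1 and 1 on the three intervals.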

Computing Transforms with the Identity

The translation along the T –axis identity is also helpful in computing the transforms of piecewise-
deﬁned functions. Here, though, the computations typically require a little more care. We’ll deal
with fairly simple cases here, and develop this topic further in the next section.

!Example 27.4: Consider ﬁnding L[g(t)]|s where

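\[ g(t) = \begin{cases} 0 & \text{if } t < 3 \\ t^2 & \text{if } 3 < t \end{cases} . \]

Remember, this function can also be written as

\[ g(t) = t^2 \operatorname{step}_3(t) . \]

Plugging this into the transform and applying our new translation identity gives

\[ \mathcal{L}[g(t)]\big|_s = \mathcal{L}\bigl[t^2\operatorname{step}_3(t)\bigr]\big|_s = \mathcal{L}\bigl[f(t-3)\operatorname{step}_3(t)\bigr]\big|_s = e^{-3s}F(s) \]

where
\[ f(t-3) = t^2 . \]
But we need the formula for f (t) , not f (t − 3) , to compute F(s) . To find that formula, let
X = t − 3 (hence, t = X + 3 ) in the formula for f (t − 3) . This gives

\[ f(X) = (X+3)^2 . \]

Thus,
\[ f(t) = (t+3)^2 = t^2 + 6t + 9 , \]
and

\[
\begin{aligned}
F(s) = \mathcal{L}[f(t)]\big|_s &= \mathcal{L}\bigl[t^2 + 6t + 9\bigr]\big|_s \\
&= \mathcal{L}\bigl[t^2\bigr]\big|_s + 6\,\mathcal{L}[t]\big|_s + 9\,\mathcal{L}[1]\big|_s = \frac{2}{s^3} + \frac{6}{s^2} + \frac{9}{s} .
\end{aligned}
\]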


Figure 27.4: Graph of the rectangle function $\operatorname{rect}_{(\alpha,\beta)}(t)$ with −∞ < α < β < ∞ .

Plugging this back into the above formula for L[g(t)]|s gives us

\[ \mathcal{L}[g(t)]\big|_s = e^{-3s}F(s) = e^{-3s}\left[\frac{2}{s^3} + \frac{6}{s^2} + \frac{9}{s}\right] . \]
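For polynomial factors, the substitution X = t − α used in example 27.4 can be mechanized. The sketch below is only an illustration under that assumption: a polynomial is stored as a list of coefficients (index k holding the coefficient of t^k), shifted with the binomial theorem, and transformed term by term using the table entry n!/s^{n+1}. The names `shift_polynomial` and `transform_poly_times_step` are hypothetical.

```python
from math import comb, exp, factorial


def shift_polynomial(coeffs, alpha):
    """Coefficients of p(X + alpha), where coeffs[k] multiplies t**k in p(t)."""
    shifted = [0.0] * len(coeffs)
    for k, c in enumerate(coeffs):
        for j in range(k + 1):
            # binomial expansion of c * (X + alpha)**k
            shifted[j] += c * comb(k, j) * alpha ** (k - j)
    return shifted


def transform_poly_times_step(coeffs, alpha, s):
    """L[p(t) step_alpha(t)] at s, computed as e^{-alpha s} F(s) with f(t) = p(t + alpha)."""
    shifted = shift_polynomial(coeffs, alpha)
    big_f = sum(c * factorial(k) / s ** (k + 1) for k, c in enumerate(shifted))
    return exp(-alpha * s) * big_f


if __name__ == "__main__":
    # p(t) = t^2 and alpha = 3, as in example 27.4
    print(shift_polynomial([0.0, 0.0, 1.0], 3.0))
    print(transform_poly_times_step([0.0, 0.0, 1.0], 3.0, 2.0))
```

For p(t) = t² and α = 3 the shifted coefficients correspond to t² + 6t + 9, in agreement with the hand computation.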

27.3 Rectangle Functions and Transforms of More Piecewise-Defined Functions
Rectangle Functions
“Rectangle functions” are slight generalizations of step functions. Given any interval (α, β) , the
rectangle function on (α, β) , denoted rect(α,β) , is the function given by
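\[ \operatorname{rect}_{(\alpha,\beta)}(t) = \begin{cases} 0 & \text{if } t < \alpha \\ 1 & \text{if } \alpha < t < \beta \\ 0 & \text{if } \beta < t \end{cases} . \]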

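The graph of $\operatorname{rect}_{(\alpha,\beta)}$ with −∞ < α < β < ∞ has been sketched in figure 27.4. You can see
why it is called a rectangle function: its graph looks rather “rectangular”, at least when α and β
are finite. If α = −∞ or β = ∞ , the corresponding rectangle functions simplify to

\[ \operatorname{rect}_{(-\infty,\beta)}(t) = \begin{cases} 1 & \text{if } t < \beta \\ 0 & \text{if } \beta < t \end{cases} \]
and
\[ \operatorname{rect}_{(\alpha,\infty)}(t) = \begin{cases} 0 & \text{if } t < \alpha \\ 1 & \text{if } \alpha < t \end{cases} . \]

And if both α = −∞ and β = ∞ , then we have

\[ \operatorname{rect}_{(-\infty,\infty)}(t) = 1 \quad\text{for all } t . \]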

All of these rectangle functions can be written as simple linear combinations of 1 and step
functions at α and/or β , with, again, the step functions acting as ‘switches’ — switching the
rectangle function ‘on’ (from 0 to 1 at α ), and switching it ‘off’ (from 1 back to 0 at β ). In
particular, we clearly have

\[ \operatorname{rect}_{(-\infty,\infty)}(t) = 1 \quad\text{and}\quad \operatorname{rect}_{(\alpha,\infty)}(t) = \operatorname{step}_\alpha(t) . \]


Somewhat more importantly (for us), we should observe that, for −∞ < α < β < ∞ ,

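\[
1 - \operatorname{step}_\beta(t)
= \begin{cases} 1 - 0 & \text{if } t < \beta \\ 1 - 1 & \text{if } \beta < t \end{cases}
= \begin{cases} 1 & \text{if } t < \beta \\ 0 & \text{if } \beta < t \end{cases}
= \operatorname{rect}_{(-\infty,\beta)}(t) ,
\]
and
\[
\begin{aligned}
\operatorname{step}_\alpha(t) - \operatorname{step}_\beta(t)
&= \begin{cases} 0 - 0 & \text{if } t < \alpha \\ 1 - 0 & \text{if } \alpha < t < \beta \\ 1 - 1 & \text{if } \beta < t \end{cases} \\
&= \begin{cases} 0 & \text{if } t < \alpha \\ 1 & \text{if } \alpha < t < \beta \\ 0 & \text{if } \beta < t \end{cases}
= \operatorname{rect}_{(\alpha,\beta)}(t) .
\end{aligned}
\]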

In summary, for −∞ < α < β < ∞ ,

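\[ \operatorname{rect}_{(\alpha,\beta)}(t) = \operatorname{step}_\alpha(t) - \operatorname{step}_\beta(t) , \tag{27.4a} \]

\[ \operatorname{rect}_{(-\infty,\beta)}(t) = 1 - \operatorname{step}_\beta(t) \tag{27.4b} \]

and
\[ \operatorname{rect}_{(\alpha,\infty)}(t) = \operatorname{step}_\alpha(t) . \tag{27.4c} \]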
These formulas allow us to quickly compute the Laplace transforms of rectangle functions using
the known transforms of 1 and the step functions.

!Example 27.5:

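\[
\begin{aligned}
\mathcal{L}\bigl[\operatorname{rect}_{(3,4)}(t)\bigr]\big|_s
&= \mathcal{L}\bigl[\operatorname{step}_3(t) - \operatorname{step}_4(t)\bigr]\big|_s \\
&= \mathcal{L}\bigl[\operatorname{step}_3(t)\bigr]\big|_s - \mathcal{L}\bigl[\operatorname{step}_4(t)\bigr]\big|_s = \frac{1}{s}e^{-3s} - \frac{1}{s}e^{-4s} .
\end{aligned}
\]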
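A short Python sketch (an illustration only, with hypothetical helper names) shows how equation set (27.4) translates into code. Using `math.inf` for an infinite endpoint makes the single difference-of-steps formula cover all three cases. The crude midpoint rule is included just to compare the closed-form transform from example 27.5 with a direct numerical integral.

```python
import math


def step(t, alpha=0.0):
    """step_alpha(t): 0 for t < alpha and 1 for alpha < t."""
    return 1.0 if t > alpha else 0.0


def rect(t, alpha, beta):
    """rect_(alpha,beta)(t) as step_alpha(t) - step_beta(t), equation (27.4a).

    With alpha = -math.inf the first step is always 1, and with beta = math.inf
    the second step is always 0, which reproduces (27.4b) and (27.4c).
    """
    return step(t, alpha) - step(t, beta)


def rect_transform(alpha, beta, s):
    """Transform of rect_(alpha,beta) for 0 <= alpha < beta < infinity and s > 0."""
    return (math.exp(-alpha * s) - math.exp(-beta * s)) / s


def midpoint_laplace(func, s, upper=40.0, n=40000):
    """Midpoint-rule approximation of the Laplace integral, truncated at t = upper."""
    h = upper / n
    return h * sum(func((i + 0.5) * h) * math.exp(-s * (i + 0.5) * h) for i in range(n))


if __name__ == "__main__":
    s = 1.0
    print(rect_transform(3.0, 4.0, s))
    print(midpoint_laplace(lambda t: rect(t, 3.0, 4.0), s))
```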

Transforming More General Piecewise-Deﬁned Functions

To help us deal with more general piecewise-deﬁned functions, let us make the simple observations
that
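\[
g(t)\operatorname{rect}_{(a,b)}(t)
= \begin{cases} g(t)\cdot 0 & \text{if } t < a \\ g(t)\cdot 1 & \text{if } a < t < b \\ g(t)\cdot 0 & \text{if } b < t \end{cases}
= \begin{cases} 0 & \text{if } t < a \\ g(t) & \text{if } a < t < b \\ 0 & \text{if } b < t \end{cases} ,
\]
and
\[
g(t)\operatorname{rect}_{(-\infty,b)}(t)
= \begin{cases} g(t)\cdot 1 & \text{if } t < b \\ g(t)\cdot 0 & \text{if } b < t \end{cases}
= \begin{cases} g(t) & \text{if } t < b \\ 0 & \text{if } b < t \end{cases} .
\]
So functions of the form
\[
f(t) = \begin{cases} 0 & \text{if } t < a \\ g(t) & \text{if } a < t < b \\ 0 & \text{if } b < t \end{cases}
\qquad\text{and}\qquad
h(t) = \begin{cases} g(t) & \text{if } t < b \\ 0 & \text{if } b < t \end{cases}
\]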


can be rewritten, respectively, as

\[ f(t) = g(t)\operatorname{rect}_{(a,b)}(t) \quad\text{and}\quad h(t) = g(t)\operatorname{rect}_{(-\infty,b)}(t) . \]
More generally, it should now be clear that anything of the form
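\[
f(t) = \begin{cases}
g_0(t) & \text{if } t < \alpha_1 \\
g_1(t) & \text{if } \alpha_1 < t < \alpha_2 \\
g_2(t) & \text{if } \alpha_2 < t < \alpha_3 \\
\;\vdots
\end{cases}
\tag{27.5a}
\]
can be rewritten as
\[ f(t) = g_0(t)\operatorname{rect}_{(-\infty,\alpha_1)}(t) + g_1(t)\operatorname{rect}_{(\alpha_1,\alpha_2)}(t) + g_2(t)\operatorname{rect}_{(\alpha_2,\alpha_3)}(t) + \cdots . \tag{27.5b} \]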
The second form (with the rectangle functions) is a bit more concise than the “conditional set of
formulas” used in form (27.5a), and is generally preferred by typesetters. Of course, there is a more
important advantage of form (27.5b): Assuming f is piecewise continuous and of exponential order,
its Laplace transform can now be taken by expressing the rectangle functions in formula (27.5b) as
the linear combinations of 1 and step functions given in equation set (27.4), and then using linearity
and what we learned in the previous section about taking transforms of functions multiplied by step
functions.

!Example 27.6: Consider ﬁnding F(s) = L[ f (t)]|s when

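\[ f(t) = \begin{cases} t^2 & \text{if } t < 3 \\ 0 & \text{if } 3 < t \end{cases} . \]
From the above, we see that
\[
\begin{aligned}
f(t) &= t^2 \operatorname{rect}_{(-\infty,3)}(t) \\
&= t^2\bigl[1 - \operatorname{step}_3(t)\bigr] = t^2 - t^2\operatorname{step}_3(t) .
\end{aligned}
\]
So

\[ F(s) = \mathcal{L}[f(t)]\big|_s = \mathcal{L}\bigl[t^2 - t^2\operatorname{step}_3(t)\bigr]\big|_s = \mathcal{L}\bigl[t^2\bigr]\big|_s - \mathcal{L}\bigl[t^2\operatorname{step}_3(t)\bigr]\big|_s . \]

The Laplace transform of $t^2$ is in the tables, while the transform of $t^2\operatorname{step}_3(t)$ just happened to
have been computed in example 27.4 a few pages ago. Using these transforms, the above formula
for F becomes
\[ F(s) = \frac{2}{s^3} - e^{-3s}\left[\frac{2}{s^3} + \frac{6}{s^2} + \frac{9}{s}\right] . \]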

!Example 27.7: Consider ﬁnding F(s) = L[ f (t)]|s when

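\[ f(t) = \begin{cases} 0 & \text{if } t < 2 \\ e^{3t} & \text{if } 2 < t < 4 \\ 0 & \text{if } 4 < t \end{cases} . \]
From the above, we see that
\[
\begin{aligned}
f(t) &= e^{3t}\operatorname{rect}_{(2,4)}(t) \\
&= e^{3t}\bigl[\operatorname{step}_2(t) - \operatorname{step}_4(t)\bigr] = e^{3t}\operatorname{step}_2(t) - e^{3t}\operatorname{step}_4(t) .
\end{aligned}
\]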


Thus,

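\[ \mathcal{L}[f(t)]\big|_s = \mathcal{L}\bigl[e^{3t}\operatorname{step}_2(t)\bigr]\big|_s - \mathcal{L}\bigl[e^{3t}\operatorname{step}_4(t)\bigr]\big|_s . \tag{27.6} \]
Both of the transforms on the right side of our last equation are easily computed via the
translation identity developed in this chapter. For the first, we have

\[ \mathcal{L}\bigl[e^{3t}\operatorname{step}_2(t)\bigr]\big|_s = \mathcal{L}\bigl[g(t-2)\operatorname{step}_2(t)\bigr]\big|_s = e^{-2s}G(s) \]

where
\[ g(t-2) = e^{3t} . \]
Letting X = t − 2 (so t = X + 2 ), the last expression becomes

\[ g(X) = e^{3(X+2)} = e^{3X+6} = e^{6}e^{3X} . \]

So
\[ g(t) = e^{6}e^{3t} \]
and

\[ G(s) = \mathcal{L}[g(t)]\big|_s = \mathcal{L}\bigl[e^{6}e^{3t}\bigr]\big|_s = e^{6}\,\mathcal{L}\bigl[e^{3t}\bigr]\big|_s = e^{6}\,\frac{1}{s-3} . \]

This, along with the first equation in this paragraph, gives us

\[ \mathcal{L}\bigl[e^{3t}\operatorname{step}_2(t)\bigr]\big|_s = e^{-2s}G(s) = e^{-2s}e^{6}\,\frac{1}{s-3} = \frac{e^{-2(s-3)}}{s-3} . \]

The transform of $e^{3t}\operatorname{step}_4(t)$ can be computed in the same manner, yielding

\[ \mathcal{L}\bigl[e^{3t}\operatorname{step}_4(t)\bigr]\big|_s = \frac{e^{-4(s-3)}}{s-3} . \]

(The details of this computation are left to you.)

Finally, by combining the formulas we just obtained for the transforms of $e^{3t}\operatorname{step}_2(t)$ and
$e^{3t}\operatorname{step}_4(t)$ with equation (27.6), we have

\[
\begin{aligned}
\mathcal{L}[f(t)]\big|_s &= \mathcal{L}\bigl[e^{3t}\operatorname{step}_2(t)\bigr]\big|_s - \mathcal{L}\bigl[e^{3t}\operatorname{step}_4(t)\bigr]\big|_s \\
&= \frac{e^{-2(s-3)}}{s-3} - \frac{e^{-4(s-3)}}{s-3} .
\end{aligned}
\]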

!Example 27.8: Let’s ﬁnd the Laplace transform F(s) of

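\[ f(t) = \begin{cases} 2 & \text{if } t < 1 \\ e^{3t} & \text{if } 1 < t < 3 \\ t^2 & \text{if } 3 < t \end{cases} . \]

To apply the Laplace transform, we first convert the above to an equivalent expression involving
step functions:

\[
\begin{aligned}
f(t) &= 2\operatorname{rect}_{(-\infty,1)}(t) + e^{3t}\operatorname{rect}_{(1,3)}(t) + t^2\operatorname{rect}_{(3,\infty)}(t) \\
&= 2\bigl[1 - \operatorname{step}_1(t)\bigr] + e^{3t}\bigl[\operatorname{step}_1(t) - \operatorname{step}_3(t)\bigr] + t^2\operatorname{step}_3(t) \\
&= 2 - 2\operatorname{step}_1(t) + e^{3t}\operatorname{step}_1(t) - e^{3t}\operatorname{step}_3(t) + t^2\operatorname{step}_3(t) .
\end{aligned}
\]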


Convolution with Piecewise-Deﬁned Functions 521

Using the tables and methods already discussed earlier in this chapter (as in examples 27.7 and
27.4), we discover that
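\[ \mathcal{L}[2]\big|_s = \frac{2}{s} , \qquad \mathcal{L}\bigl[2\operatorname{step}_1(t)\bigr]\big|_s = \frac{2e^{-s}}{s} , \]
\[ \mathcal{L}\bigl[e^{3t}\operatorname{step}_1(t)\bigr]\big|_s = \frac{e^{-(s-3)}}{s-3} , \qquad \mathcal{L}\bigl[e^{3t}\operatorname{step}_3(t)\bigr]\big|_s = \frac{e^{-3(s-3)}}{s-3} \]
and
\[ \mathcal{L}\bigl[t^2\operatorname{step}_3(t)\bigr]\big|_s = e^{-3s}\left[\frac{2}{s^3} + \frac{6}{s^2} + \frac{9}{s}\right] . \]

Combining the above and using the linearity of the Laplace transform, we obtain
\[
\begin{aligned}
F(s) &= \mathcal{L}[f(t)]\big|_s \\
&= \mathcal{L}\bigl[2 - 2\operatorname{step}_1(t) + e^{3t}\operatorname{step}_1(t) - e^{3t}\operatorname{step}_3(t) + t^2\operatorname{step}_3(t)\bigr]\big|_s \\
&= \frac{2}{s} - \frac{2e^{-s}}{s} + \frac{e^{-(s-3)}}{s-3} - \frac{e^{-3(s-3)}}{s-3} + e^{-3s}\left[\frac{2}{s^3} + \frac{6}{s^2} + \frac{9}{s}\right] .
\end{aligned}
\]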

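\[
\begin{aligned}
e^{-3t} * f(t) &= \int_0^t f(x)\,e^{-3(t-x)}\,dx \\
&= \int_0^t f(x)\,e^{-3t+3x}\,dx = e^{-3t}\int_0^t f(x)\,e^{3x}\,dx .
\end{aligned}
\]

Now, if t < 2 ,
\[ e^{-3t} * f(t) = e^{-3t}\int_0^t \underbrace{f(x)}_{x \ (\text{since } x < t < 2)} e^{3x}\,dx = e^{-3t}\int_0^t x e^{3x}\,dx . \]

This integral is easily computed using integration by parts, yielding

\[ e^{-3t} * f(t) = e^{-3t}\left[\frac{t}{3}e^{3t} - \frac{1}{9}e^{3t} + \frac{1}{9}\right] = \frac{1}{9}\bigl[3t - 1 + e^{-3t}\bigr] . \]

Thus,
\[ e^{-3t} * f(t) = \frac{1}{9}\bigl[3t - 1 + e^{-3t}\bigr] \quad\text{if } t < 2 . \tag{27.9} \]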

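On the other hand, when 2 < t < 4 ,

\[
\begin{aligned}
e^{-3t} * f(t) &= e^{-3t}\int_0^t f(x)\,e^{3x}\,dx \\
&= e^{-3t}\left[\int_0^2 \underbrace{f(x)}_{x \ (\text{since } x < 2)} e^{3x}\,dx + \int_2^t \underbrace{f(x)}_{2 \ (\text{since } 2 < x < t < 4)} e^{3x}\,dx\right] \\
&= e^{-3t}\left[\int_0^2 x e^{3x}\,dx + \int_2^t 2\cdot e^{3x}\,dx\right] \\
&= \cdots \\
&= e^{-3t}\left[\frac{1}{9}\bigl(5e^{6} + 1\bigr) + \frac{2}{3}\bigl(e^{3t} - e^{6}\bigr)\right] \\
&= \cdots \\
&= \frac{2}{3} + \frac{1}{9}\bigl[1 - e^{6}\bigr]e^{-3t} .
\end{aligned}
\]

Thus,
\[ e^{-3t} * f(t) = \frac{2}{3} + \frac{1}{9}\bigl[1 - e^{6}\bigr]e^{-3t} \quad\text{for } 2 < t < 4 . \tag{27.10} \]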
3 9
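Finally, when 4 < t ,
\[
\begin{aligned}
e^{-3t} * f(t) &= e^{-3t}\int_0^t e^{3x} f(x)\,dx \\
&= e^{-3t}\left[\int_0^2 \underbrace{f(x)}_{x \ (\text{since } x < 2)} e^{3x}\,dx + \int_2^4 \underbrace{f(x)}_{2 \ (\text{since } 2 < x < 4)} e^{3x}\,dx + \int_4^t \underbrace{f(x)}_{0 \ (\text{since } 4 < x)} e^{3x}\,dx\right] \\
&= e^{-3t}\left[\int_0^2 x e^{3x}\,dx + \int_2^4 2\cdot e^{3x}\,dx + \int_4^t 0\cdot e^{3x}\,dx\right] \\
&= \cdots \\
&= e^{-3t}\left[\frac{1}{9}\bigl(5e^{6} + 1\bigr) + \frac{2}{3}\bigl(e^{12} - e^{6}\bigr) + 0\right] \\
&= \cdots \\
&= \frac{1}{9}\bigl[6e^{12} + 1 - e^{6}\bigr]e^{-3t} .
\end{aligned}
\]
Thus,
\[ e^{-3t} * f(t) = \frac{1}{9}\bigl[6e^{12} + 1 - e^{6}\bigr]e^{-3t} \quad\text{for } 4 < t . \tag{27.11} \]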
Putting it all together, equations (27.9), (27.10) and (27.11) give us
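\[
e^{-3t} * f(t) = \begin{cases}
\dfrac{1}{9}\bigl[3t - 1 + e^{-3t}\bigr] & \text{if } t < 2 \\[2ex]
\dfrac{2}{3} + \dfrac{1}{9}\bigl[1 - e^{6}\bigr]e^{-3t} & \text{if } 2 < t < 4 \\[2ex]
\dfrac{1}{9}\bigl[6e^{12} + 1 - e^{6}\bigr]e^{-3t} & \text{if } 4 < t
\end{cases} .
\]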
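The three formulas can be compared against a direct numerical evaluation of the convolution integral. The following Python sketch is an illustration only: it assumes, as the computations above indicate, that f(x) is x for x < 2, is 2 for 2 < x < 4, and is 0 for 4 < x, and it integrates each piece separately with the composite Simpson rule so that no subinterval straddles a point where the formula for f changes. The names `PIECES`, `convolution_numeric` and `convolution_formula` are hypothetical.

```python
import math

# (left endpoint, right endpoint, formula) for each piece of f
PIECES = [
    (0.0, 2.0, lambda x: x),
    (2.0, 4.0, lambda x: 2.0),
    (4.0, math.inf, lambda x: 0.0),
]


def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b] using an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3


def convolution_numeric(t):
    """Integral from 0 to t of f(x) e^{-3(t - x)} dx, computed piece by piece."""
    total = 0.0
    for a, b, formula in PIECES:
        hi = min(b, t)
        if hi > a:
            total += simpson(lambda x: formula(x) * math.exp(-3.0 * (t - x)), a, hi)
    return total


def convolution_formula(t):
    """Closed forms (27.9), (27.10) and (27.11)."""
    if t < 2:
        return (3.0 * t - 1.0 + math.exp(-3.0 * t)) / 9.0
    if t < 4:
        return 2.0 / 3.0 + (1.0 - math.exp(6.0)) * math.exp(-3.0 * t) / 9.0
    return (6.0 * math.exp(12.0) + 1.0 - math.exp(6.0)) * math.exp(-3.0 * t) / 9.0


if __name__ == "__main__":
    for t in (1.0, 3.0, 5.0):
        print(t, convolution_numeric(t), convolution_formula(t))
```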

And what if both f and g are piecewise deﬁned? Then you must keep track of where formulas
of both f (t) and h(t −x) change. Fortunately, we will have little need to deal with such convolutions
at this time.

27.5 Periodic Functions

Basics
Often, a function of interest f is periodic with period p for some positive value p . This means
that the graph of the function remains unchanged when shifted to the left or right by p . This is
equivalent to saying
f (t + p) = f (t) for all t . (27.12)

You are well-acquainted with several periodic functions — the trigonometric functions, for example.
In particular, the basic sine and cosine functions

sin(t) and cos(t)

are periodic with period p = 2π . But other periodic functions, such as the “saw” function sketched
in figure 27.5a and the “square-wave” function sketched in figure 27.5b, can arise in applications.
Strictly speaking, a truly periodic function is defined on the entire real line, (−∞, ∞) . For
our purposes, though, it will suffice to have f “periodic on (0, ∞) ” with period p . This simply
means that f is that part of a periodic function along the positive T –axis. What f (t) is for t < 0
is irrelevant. Accordingly, for functions periodic on (0, ∞) , we modify requirement (27.12) to

f (t + p) = f (t) for all t > 0 . (27.13)

In what follows, however, it will usually be irrelevant as to whether a given function is truly periodic
or merely periodic on (0, ∞) . In either case, we will refer to the function as “periodic”, and specify
whether it is defined on all of (−∞, ∞) or just (0, ∞) only if necessary.


Figure 27.5: Two periodic functions: (a) a basic saw function, and (b) a basic square wave
function.

A convenient way to describe a periodic function f with period p is by

\[ f(t) = \begin{cases} f_0(t) & \text{if } 0 < t < p \\ f(t + p) & \text{in general} \end{cases} . \]

The f 0 (t) is the formula for f over the base period interval (0, p) . The second line is simply telling
us that the function is periodic and that equation (27.12) or (27.13) holds and can be used to compute
the function at points outside of the base period interval. (The value of f (t) at t = 0 and integral
multiples of p are determined — or ignored — following the conventions for piecewise-deﬁned
functions discussed in section 27.1.)

!Example 27.11: Let saw(t) denote the basic saw function sketched in ﬁgure 27.5a. It clearly
has period p = 1 , has jump discontinuities at integer values of t , and is given on (0, ∞) by

\[ \operatorname{saw}(t) = \begin{cases} t & \text{if } 0 < t < 1 \\ \operatorname{saw}(t + 1) & \text{in general} \end{cases} . \]

In this case, the formula for computing saw(τ ) when 0 < τ < 1 is

saw0 (τ ) = τ .

So, for example, saw(3/4) = 3/4 .
On the other hand, to compute saw(τ ) when τ > 1 (and not an integer), we must use

saw(t + 1) = saw(t)

repeatedly until we ﬁnally reach a value t in the base period interval (0, 1) . For example,

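\[
\begin{aligned}
\operatorname{saw}\left(\tfrac{8}{3}\right) &= \operatorname{saw}\left(\tfrac{5}{3} + 1\right) = \operatorname{saw}\left(\tfrac{5}{3}\right) \\
&= \operatorname{saw}\left(\tfrac{2}{3} + 1\right) = \operatorname{saw}\left(\tfrac{2}{3}\right) = \tfrac{2}{3} .
\end{aligned}
\]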

Often, the formula for the function over the base period interval is, itself, piecewise deﬁned.

!Example 27.12: Let sqwave(t) denote the square-wave function in ﬁgure 27.5b. This function
has period p = 2 , and, over its base period interval (0, 2) , is given by

\[ \operatorname{sqwave}(t) = \begin{cases} 1 & \text{if } 0 < t < 1 \\ 0 & \text{if } 1 < t < 2 \end{cases} . \]


So,
\[ \operatorname{sqwave}(t) = \begin{cases} 1 & \text{if } 0 < t < 1 \\ 0 & \text{if } 1 < t < 2 \\ \operatorname{sqwave}(t - 2) & \text{in general} \end{cases} . \]
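The repeated use of f(t + p) = f(t) to get back into the base period interval amounts to subtracting the largest multiple of p not exceeding t. A minimal Python sketch of that reduction follows; the wrapper name `periodic` is an assumption for this illustration, and the value returned at a jump (an integer multiple of the period, or t = 1 for the square wave) is simply whatever the base formula gives there, since those values are not pinned down by the definitions.

```python
import math


def periodic(base_formula, p):
    """Extend base_formula from the base period interval (0, p) to all t > 0."""
    def f(t):
        tau = t - p * math.floor(t / p)  # tau lies in [0, p)
        return base_formula(tau)
    return f


# The basic saw function of example 27.11 (period 1)
saw = periodic(lambda tau: tau, 1.0)

# The square wave of example 27.12 (period 2)
sqwave = periodic(lambda tau: 1.0 if tau < 1.0 else 0.0, 2.0)

if __name__ == "__main__":
    print(saw(3 / 4), saw(8 / 3))
    print(sqwave(0.5), sqwave(1.5), sqwave(2.5), sqwave(3.5))
```

Evaluating `saw(8 / 3)` reproduces, up to rounding, the value 2/3 found by hand above.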

Before discussing Laplace transforms of periodic functions, let’s make a couple of observations
concerning a function f which is periodic with period p over (0, ∞) . We won’t prove them.
Instead, you should think about why these statements are “obviously true”.
1. If f is piecewise continuous over (0, p) , then f is piecewise continuous over (0, ∞) .
2. If f is piecewise continuous over (0, p) , then f is of exponential order s0 = 0 .

Transforms of Periodic Functions

Suppose we want to ﬁnd the Laplace transform
\[ F(s) = \mathcal{L}[f(t)]\big|_s = \int_0^\infty f(t)\,e^{-st}\,dt \]
when f is piecewise continuous and periodic with period p . Because f (t) satisﬁes
\[ f(t) = f(t + p) \quad\text{for } t > 0 , \]
we should expect to (possibly) simplify our computations by partitioning the integral of the transform
into integrals over subintervals of length p ,
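\[
\begin{aligned}
F(s) &= \int_0^\infty f(t)\,e^{-st}\,dt \\
&= \int_0^{p} f(t)\,e^{-st}\,dt + \int_p^{2p} f(t)\,e^{-st}\,dt + \int_{2p}^{3p} f(t)\,e^{-st}\,dt \\
&\qquad + \int_{3p}^{4p} f(t)\,e^{-st}\,dt + \int_{4p}^{5p} f(t)\,e^{-st}\,dt + \cdots .
\end{aligned}
\]

For brevity, let’s rewrite this as

\[ F(s) = \sum_{k=0}^{\infty} \int_{kp}^{(k+1)p} f(t)\,e^{-st}\,dt . \tag{27.14} \]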

Now consider using the substitution τ = t −kp in the k th term of this summation. Then t = τ +kp ,
\[ e^{-st} = e^{-s(\tau + kp)} = e^{-kps}e^{-s\tau} , \]
and, by the periodicity of f ,
\[
\begin{aligned}
f(\tau + p) &= f(\tau) \\
f(\tau + 2p) &= f([\tau + p] + p) = f(\tau + p) = f(\tau) \\
f(\tau + 3p) &= f([\tau + 2p] + p) = f(\tau + 2p) = f(\tau) \\
&\;\;\vdots \\
f(\tau + kp) &= \cdots = f(\tau) .
\end{aligned}
\]


So,
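\[
\begin{aligned}
\int_{t=kp}^{(k+1)p} f(t)\,e^{-st}\,dt &= \int_{\tau=kp-kp}^{(k+1)p-kp} f(\tau + kp)\,e^{-s(\tau+kp)}\,d\tau \\
&= \int_0^p f(\tau)\,e^{-kps}e^{-s\tau}\,d\tau = e^{-kps}\int_0^p f(\tau)\,e^{-s\tau}\,d\tau .
\end{aligned}
\]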
Note that the last integral does not depend on k . Consequently, combining the last result with
equation (27.14), we have
\[ F(s) = \sum_{k=0}^{\infty} e^{-kps}\int_0^p f(\tau)\,e^{-s\tau}\,d\tau = \left[\sum_{k=0}^{\infty} e^{-kps}\right]\int_0^p f(\tau)\,e^{-s\tau}\,d\tau . \]

Here we have an incredible stroke of luck, at least if you recall what a geometric series is and
how to compute its sum. Assuming you do recall this, we have
\[ \sum_{k=0}^{\infty} e^{-kps} = \sum_{k=0}^{\infty} \bigl(e^{-ps}\bigr)^k = \frac{1}{1 - e^{-ps}} . \tag{27.15} \]

We also have this if you do not recall about geometric series, but then you will certainly want to go
to the brief review of geometric series in section 29.1 to see how we get this equation.
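For convenience, here is a brief sketch of the standard argument behind equation (27.15). It is only a reminder of the usual partial-sum computation, with the shorthand $r = e^{-ps}$ and the partial sums $S_N$ introduced just for this sketch, and it assumes p > 0 and s > 0 so that 0 < r < 1.

```latex
\[
S_N = \sum_{k=0}^{N} r^k
\quad\Longrightarrow\quad
(1 - r)\,S_N = \sum_{k=0}^{N} r^k - \sum_{k=1}^{N+1} r^k = 1 - r^{N+1}
\quad\Longrightarrow\quad
S_N = \frac{1 - r^{N+1}}{1 - r} .
\]
\[
\text{Since } 0 < r = e^{-ps} < 1 \text{, we have } r^{N+1} \to 0 \text{ as } N \to \infty \text{, and so}
\quad
\sum_{k=0}^{\infty} e^{-kps} = \lim_{N\to\infty} S_N = \frac{1}{1 - e^{-ps}} .
\]
```

This is also why the resulting formula for F(s) is stated for s > 0: the series fails to converge when $e^{-ps} \ge 1$.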
Whether or not you recall about geometric series, equation (27.15) combined with the last
formula for F (along with the observations made earlier regarding piecewise continuity and periodic
functions) gives us the following theorem.

Theorem 27.2
Let f be a piecewise continuous and periodic function with period p . Then its Laplace transform
F is given by
\[ F(s) = \frac{F_0(s)}{1 - e^{-ps}} \quad\text{for } s > 0 \]
where
\[ F_0(s) = \int_0^p f(t)\,e^{-st}\,dt . \]

There are at least two alternative ways of describing F0 in the above theorem. First of all, if
f is given by
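\[ f(t) = \begin{cases} f_0(t) & \text{if } 0 < t < p \\ f(t + p) & \text{in general} \end{cases} , \]
then, of course,
\[ F_0(s) = \int_0^p f_0(t)\,e^{-st}\,dt . \]
Also, using the fact that
\[ \int_0^p f_0(t)\,e^{-st}\,dt = \int_0^\infty f_0(t)\operatorname{rect}_{(0,p)}(t)\,e^{-st}\,dt , \]
we see that
\[ F_0(s) = \mathcal{L}\bigl[f_0(t)\operatorname{rect}_{(0,p)}(t)\bigr]\big|_s \]
or, equivalently, that
\[ F_0(s) = \mathcal{L}\bigl[f(t)\operatorname{rect}_{(0,p)}(t)\bigr]\big|_s . \]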
Whether any of the alternative descriptions of F0 (s) is useful may depend on what transforms you
have already computed.


!Example 27.13: Let’s find the Laplace transform of the saw function from example 27.11 and
sketched in figure 27.5a,
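\[ \operatorname{saw}(t) = \begin{cases} t & \text{if } 0 < t < 1 \\ \operatorname{saw}(t + 1) & \text{in general} \end{cases} . \]
Here, p = 1 , and the last theorem tells us that
\[ \mathcal{L}[\operatorname{saw}(t)]\big|_s = \frac{F_0(s)}{1 - e^{-1\cdot s}} = \frac{F_0(s)}{1 - e^{-s}} \quad\text{for } s > 0 \]
where (using each of the formulas discussed for $F_0$ )
\[
\begin{aligned}
F_0(s) &= \int_0^1 \operatorname{saw}(t)\,e^{-st}\,dt && (27.16\text{a}) \\
&= \int_0^1 t\,e^{-st}\,dt && (27.16\text{b}) \\
&= \mathcal{L}\bigl[t\operatorname{rect}_{(0,1)}(t)\bigr]\big|_s . && (27.16\text{c})
\end{aligned}
\]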

Had the author been sufficiently clever, $\mathcal{L}[t\operatorname{rect}_{(0,1)}(t)]|_s$ would have already been computed in a
previous example, and we could write out the final result using formula (27.16c). But he wasn’t,
so let’s just compute F0 (s) using formula (27.16b) and integration by parts:
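\[
\begin{aligned}
F_0(s) &= \int_0^1 t\,e^{-st}\,dt \\
&= \left. -\frac{t}{s}e^{-st}\right|_{t=0}^{1} - \int_0^1 \left(-\frac{1}{s}\right)e^{-st}\,dt \\
&= -\frac{1}{s}e^{-s\cdot 1} + 0 - \frac{1}{s^2}\bigl[e^{-s\cdot 1} - e^{-s\cdot 0}\bigr] = \frac{1}{s^2}\bigl[1 - e^{-s} - se^{-s}\bigr] .
\end{aligned}
\]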
s s s
Hence,
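\[
\begin{aligned}
\mathcal{L}[\operatorname{saw}(t)]\big|_s &= \frac{F_0(s)}{1 - e^{-s}} \\
&= \frac{1}{s^2}\cdot\frac{1 - e^{-s} - se^{-s}}{1 - e^{-s}} \\
&= \frac{1}{s^2}\left[1 - \frac{se^{-s}}{1 - e^{-s}}\right] = \frac{1}{s^2} - \frac{1}{s}\cdot\frac{e^{-s}}{1 - e^{-s}} .
\end{aligned}
\]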
This is our transform. If you wish, you can apply a little algebra and ‘simplify’ it to
\[ \mathcal{L}[\operatorname{saw}(t)]\big|_s = \frac{1}{s^2} - \frac{1}{s}\cdot\frac{1}{e^{s} - 1} , \]
though you may prefer to keep the formula with 1 − e− ps in the denominator to remind you that
this transform came from a periodic function with period p .
Just for fun, let’s go even further using the fact that
\[ \frac{e^{-s}}{1 - e^{-s}} = \frac{e^{-s}}{1 - e^{-s}}\cdot\frac{2e^{s/2}}{2e^{s/2}} = \frac{1}{2}\cdot\frac{2e^{-s/2}}{e^{s/2} - e^{-s/2}} = \frac{1}{2}\cdot\frac{e^{-s/2}}{\sinh(s/2)} . \]
Thus, the above formula for the Laplace transform of the saw function can also be written as
\[ \mathcal{L}[\operatorname{saw}(t)]\big|_s = \frac{1}{s^2} - \frac{1}{2s}\cdot\frac{e^{-s/2}}{\sinh(s/2)} . \]
This is significant only in that it demonstrates why hyperbolic trigonometric functions are some-
times found in tables of transforms.
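As a sanity check on example 27.13, the closed forms can be compared with a direct numerical evaluation of the Laplace integral, taken one period at a time. This Python sketch is an illustration only: the number of periods kept, the Simpson subdivision and all function names are choices made here, not part of the text.

```python
import math


def simpson(func, a, b, n=200):
    """Composite Simpson rule on [a, b] using an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3


def saw_transform_exponential(s):
    """1/s^2 - (1/s) e^{-s} / (1 - e^{-s}), the form with 1 - e^{-ps} in the denominator."""
    return 1.0 / s**2 - (1.0 / s) * math.exp(-s) / (1.0 - math.exp(-s))


def saw_transform_sinh(s):
    """1/s^2 - (1/(2s)) e^{-s/2} / sinh(s/2), the hyperbolic form."""
    return 1.0 / s**2 - math.exp(-s / 2.0) / (2.0 * s * math.sinh(s / 2.0))


def saw_transform_numeric(s, periods=200):
    """Sum of the integrals of saw(t) e^{-st} over the first `periods` periods."""
    total = 0.0
    for k in range(periods):
        # on (k, k + 1) the saw function equals t - k
        total += simpson(lambda t: (t - k) * math.exp(-s * t), k, k + 1)
    return total


if __name__ == "__main__":
    for s in (0.5, 1.0, 3.0):
        print(s, saw_transform_exponential(s), saw_transform_sinh(s), saw_transform_numeric(s))
```

The three columns should agree closely for each s > 0 tried, the small differences coming from the quadrature and from dropping the periods beyond the 200th.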


Table 27.1: Commonly Used Identities (Version 2)

In the following, $F(s) = \mathcal{L}[f(t)]|_s$ .

$h(t)$                               $H(s) = \mathcal{L}[h(t)]|_s$                                          Restrictions

$f(t)$                               $\int_0^\infty f(t)e^{-st}\,dt$

$e^{\alpha t}f(t)$                   $F(s - \alpha)$                                                        α is real

$f(t-\alpha)\operatorname{step}_\alpha(t)$   $e^{-\alpha s}F(s)$                                            α > 0

$\dfrac{df}{dt}$                     $sF(s) - f(0)$

$\dfrac{d^2f}{dt^2}$                 $s^2F(s) - sf(0) - f'(0)$

$\dfrac{d^nf}{dt^n}$                 $s^nF(s) - s^{n-1}f(0) - s^{n-2}f'(0) - s^{n-3}f''(0) - \cdots - f^{(n-1)}(0)$   n = 1, 2, 3, . . .

$tf(t)$                              $-\dfrac{dF}{ds}$

$t^nf(t)$                            $(-1)^n\dfrac{d^nF}{ds^n}$                                             n = 1, 2, 3, . . .

$\int_0^t f(\tau)\,d\tau$            $\dfrac{F(s)}{s}$

$\dfrac{f(t)}{t}$                    $\int_s^\infty F(\sigma)\,d\sigma$

$f * g(t)$                           $F(s)G(s)$

f is periodic with period p          $\dfrac{\int_0^p f(t)e^{-st}\,dt}{1 - e^{-ps}}$

27.6 An Expanded Table of Identities

For reference, let us write out a new table of Laplace transform identities containing the identities
listed in our ﬁrst table of Laplace transform identities, table 24.1 on page 472, along with some of
the more important identities derived after making that table. Our new table is table 27.1.


Figure 27.6: A mass/spring system with mass m and an outside force f acting on the mass.

27.7 Duhamel’s Principle and Resonance

The Problem
Now is a good time to re-examine some of those “forced” mass/spring systems originally discussed
in chapter 21, and diagrammed in ﬁgure 27.6. Recall that this system is modeled by
\[ m\frac{d^2y}{dt^2} + \gamma\frac{dy}{dt} + \kappa y = f \]
where y = y(t) is the position of the mass at time t (with y = 0 being the “equilibrium” position of
the mass when f = 0 ), m is the mass of the object attached to the spring, κ is the spring constant,
γ is the damping constant, and f = f (t) is the sum of all forces acting on the spring other than the
damping friction and the spring’s reaction to being stretched and compressed ( f was called Fother
in chapter 16 and F in chapter 21). Remember, also, that m and κ are positive constants.
Our main interest will be in the phenomenon of resonance in an undamped system. Accordingly,
we will assume γ = 0 , and restrict our attention to solving
\[ m\frac{d^2y}{dt^2} + \kappa y = f . \tag{27.17} \]
dt 2
Ultimately, we will further restrict our attention to cases in which f is periodic. But let’s wait on
that, and derive some basic formulas without assuming this periodicity.

Solutions Using Arbitrary f

The General Solution
As you know quite well by now, the general solution to our differential equation, equation (27.17),
is
y(t) = y p (t) + yh (t)
where yh is the general solution to the corresponding homogeneous differential equation, and y p
is any particular solution to the given nonhomogeneous differential equation.
The formula for yh is already known. In chapter 16, we found that
\[ y_h(t) = c_1\cos(\omega_0 t) + c_2\sin(\omega_0 t) \quad\text{where}\quad \omega_0 = \sqrt{\frac{\kappa}{m}} . \]
Recall that ω0 is the natural angular frequency of the mass/spring system, and is related to the
system’s natural frequency ν0 and natural period p0 via
\[ \nu_0 = \frac{\omega_0}{2\pi} \quad\text{and}\quad p_0 = \frac{1}{\nu_0} = \frac{2\pi}{\omega_0} . \]


For future use, note that yh is a periodic function with period p0 ; hence

yh (t + p0 ) − yh (t) = 0 for all t .

That leaves ﬁnding a particular solution y p . Let’s take this to be the solution to the initial-value
problem
\[ m\frac{d^2y}{dt^2} + \kappa y = f \quad\text{with}\quad y(0) = 0 \quad\text{and}\quad y'(0) = 0 . \]
This is easily found by either applying the Laplace transform and using the convolution identity in
taking the inverse transform, or by appealing directly to Duhamel’s principle. Either way, we get
\[ y_p(t) = h * f(t) = \int_0^t h(t - x)\,f(x)\,dx \]

where
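\[ h(\tau) = \mathcal{L}^{-1}\left[\frac{1}{ms^2 + \kappa}\right]\bigg|_\tau = \frac{1}{m}\,\mathcal{L}^{-1}\left[\frac{1}{s^2 + \kappa/m}\right]\bigg|_\tau . \]
Since $\omega_0 = \sqrt{\kappa/m}$ ,

\[ h(\tau) = \frac{1}{m}\,\mathcal{L}^{-1}\left[\frac{1}{s^2 + (\omega_0)^2}\right]\bigg|_\tau = \frac{1}{\omega_0 m}\sin(\omega_0\tau) . \]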

Thus, the above integral formula for y p can be written as

\[ y_p(t) = \frac{1}{\omega_0 m}\int_0^t \sin(\omega_0[t - x])\,f(x)\,dx . \tag{27.18} \]
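Formula (27.18) is easy to evaluate numerically for a given forcing function. The sketch below is a minimal Python illustration under stated assumptions: the integral is approximated with the composite Simpson rule, the values m = 2 and κ = 8 are arbitrary sample values, and the constant force f = 1 is chosen because the integral can then be done by hand, giving y_p(t) = [1 − cos(ω₀t)]/κ for comparison.

```python
import math


def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b] using an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3


def particular_solution(f, t, m, kappa):
    """y_p(t) from formula (27.18): the solution with y(0) = 0 and y'(0) = 0."""
    omega0 = math.sqrt(kappa / m)
    integral = simpson(lambda x: math.sin(omega0 * (t - x)) * f(x), 0.0, t)
    return integral / (omega0 * m)


def constant_force_solution(t, m, kappa):
    """Hand-computed y_p for f = 1: (1 - cos(omega0 t)) / kappa."""
    omega0 = math.sqrt(kappa / m)
    return (1.0 - math.cos(omega0 * t)) / kappa


if __name__ == "__main__":
    m, kappa = 2.0, 8.0  # sample values only; then omega0 = 2
    for t in (0.5, 1.3, 4.0):
        print(t, particular_solution(lambda x: 1.0, t, m, kappa),
              constant_force_solution(t, m, kappa))
```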

The Difference Formula and First Theorem

For our studies, we will want to see how any solution y varies “over a cycle” (i.e., as t increases
by p0 ). This variance in y over a cycle is given by the difference y(t + p0 ) − y(t) , and will be
especially meaningful when the forcing function is periodic with period p0 .
For now, let’s consider the difference y(t + p0 ) − y(t) assuming y = y p + yh is any solution
to our differential equation. Of course, the yh term is irrelevant because of its periodicity,

\[
\begin{aligned}
y(t + p_0) - y(t) &= \bigl[y_p(t + p_0) + y_h(t + p_0)\bigr] - \bigl[y_p(t) + y_h(t)\bigr] \\
&= y_p(t + p_0) - y_p(t) + \underbrace{y_h(t + p_0) - y_h(t)}_{0} .
\end{aligned}
\]

Now, using formula (27.18) for y p , we see that

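\[
\begin{aligned}
y_p(t + p_0) &= \frac{1}{\omega_0 m}\int_0^{t+p_0} \sin(\omega_0[(t + p_0) - x])\,f(x)\,dx \\
&= \frac{1}{\omega_0 m}\int_0^{t+p_0} \sin(\omega_0[t - x] + \underbrace{\omega_0 p_0}_{2\pi})\,f(x)\,dx \\
&= \frac{1}{\omega_0 m}\int_0^{t+p_0} \sin(\omega_0[t - x])\,f(x)\,dx \\
&= \frac{1}{\omega_0 m}\int_0^{t} \sin(\omega_0[t - x])\,f(x)\,dx + \frac{1}{\omega_0 m}\int_t^{t+p_0} \sin(\omega_0[t - x])\,f(x)\,dx .
\end{aligned}
\]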


But the ﬁrst integral in the last line is simply the integral formula for y p (t) given in equation (27.18).
So the above reduces to
\[ y(t + p_0) = y(t) + \frac{1}{\omega_0 m}\int_t^{t+p_0} \sin(\omega_0[t - x])\,f(x)\,dx . \tag{27.19} \]

To further “reduce” our difference formula, let us use a well-known trigonometric identity:
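\[
\begin{aligned}
\int_t^{t+p_0} &\sin(\omega_0[t - x])\,f(x)\,dx \\
&= \int_t^{t+p_0} \sin(\omega_0 t - \omega_0 x)\,f(x)\,dx \\
&= \int_t^{t+p_0} \bigl[\sin(\omega_0 t)\cos(\omega_0 x) - \cos(\omega_0 t)\sin(\omega_0 x)\bigr] f(x)\,dx \\
&= \sin(\omega_0 t)\int_t^{t+p_0} \cos(\omega_0 x)\,f(x)\,dx - \cos(\omega_0 t)\int_t^{t+p_0} \sin(\omega_0 x)\,f(x)\,dx .
\end{aligned}
\]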

Combining this result with the last equation for y(t + p0 ) and recalling the previous results derived
in this section then yield:

Theorem 27.3
Let m and κ be positive constants, and let f be any piecewise continuous function of exponential
order. Then, the general solution to
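\[ m\frac{d^2y}{dt^2} + \kappa y = f \]

is
\[ y(t) = y_p(t) + c_1\cos(\omega_0 t) + c_2\sin(\omega_0 t) \]

where
\[ \omega_0 = \sqrt{\frac{\kappa}{m}} \quad\text{and}\quad y_p(t) = \frac{1}{\omega_0 m}\int_0^t \sin(\omega_0[t - x])\,f(x)\,dx . \]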

Figure 27.8: (a) A basic saw function, and (b) the corresponding response of an undamped
mass/spring system with natural period 1 over 6 cycles.

and so on. In general,

y(tn ) = y(τ ) + n A cos(ω0 τ − φ) . (27.21)
Clearly, if A ≠ 0 and ω0 τ − φ is neither π/2 nor 3π/2 , then

y(tn ) → ±∞ as n→∞ .

This is clearly “runaway resonance”.

Thus, it is the A in difference formula (27.20) that determines if we have “runaway resonance”.
If A ≠ 0 , the solution contains an oscillating term with a steadily increasing amplitude. On the
other hand, if A = 0 , then the solution y is periodic and does not “blow up”.
By the way, for graphing purposes it may be convenient to use the periodicity of the cosine term
and rewrite equation (27.21) as

y(tn ) = y(τ ) + n A cos(ω0 tn − φ) .

Replacing tn with t , and recalling what n and τ represent, we see that this is the same as saying

\[ y(t) = y(\tau) + nA\cos(\omega_0 t - \phi) \tag{27.22} \]

where n is the largest integer such that np0 ≤ t and τ = t − np0 .

!Example 27.14: Let us use the theorems in this section to analyze the response of an undamped
mass/spring system with natural period p0 = 1 to a force f given by the basic saw function
sketched in ﬁgure 27.8a,

\[ f(t) = \operatorname{saw}(t) = \begin{cases} t & \text{if } 0 < t < 1 \\ \operatorname{saw}(t - 1) & \text{if } 1 < t \end{cases} . \]

The corresponding natural angular frequency is

\[ \omega_0 = \frac{2\pi}{p_0} = 2\pi . \]

The actual values of the mass m and spring constant κ are irrelevant provided they satisfy
$$2\pi = \omega_0 = \sqrt{\frac{\kappa}{m}} .$$

Also, since the solution to the corresponding homogeneous differential equation was pretty much
irrelevant in the discussion leading to our last theorem, let’s assume our solution satisﬁes

y(0) = 0 and y'(0) = 0 ,

536 Piecewise-Deﬁned Functions and Periodic Functions

so that the solution formula described in theorem 27.3 becomes

$$y(t) = y_p(t) = \frac{1}{2\pi m}\int_0^t \sin(\omega_0[t-x])\,\operatorname{saw}(x)\,dx .$$

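In particular, if 0 ≤ x ≤ t < 1 , then saw(x) = x , and we can complete our computations
of y(t) using integration by parts:
$$\begin{aligned}
y(t) &= \frac{1}{2\pi m}\int_0^t \sin(2\pi[t-x])\,x\,dx \\
&= \frac{1}{2\pi m}\left[\left.\frac{x}{2\pi}\cos(2\pi[t-x])\right|_{x=0}^{t} - \int_0^t \frac{1}{2\pi}\cos(2\pi[t-x])\,dx\right] \\
&= \frac{1}{2\pi m}\left[\frac{t}{2\pi}\cos(2\pi[t-t]) - \frac{0}{2\pi}\cos(2\pi[t-0]) + \frac{1}{(2\pi)^2}\sin(2\pi[t-t]) - \frac{1}{(2\pi)^2}\sin(2\pi[t-0])\right] .
\end{aligned}$$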

This simpliﬁes to
$$y(t) = \frac{1}{8\pi^3 m}\left[2\pi t - \sin(2\pi t)\right] \quad\text{when } 0 \le t < 1 . \tag{27.23}$$

In a similar manner, we ﬁnd that

$$I_S = \int_0^{p_0} \cos(\omega_0 x)\,f(x)\,dx = \int_0^1 \cos(2\pi x)\,x\,dx = \cdots = 0$$
and
$$I_C = -\int_0^{p_0} \sin(\omega_0 x)\,f(x)\,dx = -\int_0^1 \sin(2\pi x)\,x\,dx = \cdots = \frac{1}{2\pi} .$$

Thus,
$$A = \frac{1}{2\pi m}\sqrt{(I_S)^2 + (I_C)^2} = \frac{1}{4\pi^2 m} .$$
Since A ≠ 0 , we have resonance. There is an oscillatory term whose amplitude steadily increases
as t increases.
To actually graph our solution, we still need to ﬁnd the phase, φ , which (according to our
last theorem) is the value in [0, 2π ) such that
$$\cos(\phi) = \frac{I_C}{\sqrt{(I_S)^2 + (I_C)^2}} = 1 \quad\text{and}\quad \sin(\phi) = \frac{I_S}{\sqrt{(I_S)^2 + (I_C)^2}} = 0 .$$

Clearly φ = 0 .
So let t ≥ 0 . Then, employing formula (27.22) (derived just before this example),

$$\begin{aligned}
y(t) &= y(\tau) + nA\cos(\omega_0 t - \phi) \\
&= \frac{1}{8\pi^3 m}\left[2\pi\tau - \sin(2\pi\tau)\right] + \frac{n}{4\pi^2 m}\cos(2\pi t)
\end{aligned}$$

where (since p0 = 1 ) n is the largest integer with n ≤ t and τ = t − n . This is the function
graphed in ﬁgure 27.8b.
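
As a numerical cross-check of the formula just obtained, the following Python sketch (not part of the original text) compares the closed form with a direct midpoint-rule evaluation of the integral from theorem 27.3. The function names, the choice m = 1 and the quadrature rule are illustrative assumptions.

```python
import math

def saw(t):
    """Basic saw function: saw(t) = t on (0, 1), repeated with period 1."""
    return t - math.floor(t)

def y_closed(t, m=1.0):
    """Closed form from the example, with n = floor(t) and tau = t - n (p0 = 1)."""
    n = math.floor(t)
    tau = t - n
    periodic_part = (2 * math.pi * tau - math.sin(2 * math.pi * tau)) / (8 * math.pi**3 * m)
    growing_part = n * math.cos(2 * math.pi * t) / (4 * math.pi**2 * m)
    return periodic_part + growing_part

def y_numeric(t, m=1.0, steps=2000):
    """Midpoint rule for (1/(2 pi m)) * integral_0^t sin(2 pi (t - x)) saw(x) dx.

    The interval [0, t] is split at the integers, where saw has its jumps.
    """
    total = 0.0
    a = 0.0
    while a < t:
        b = min(math.floor(a) + 1.0, t)
        h = (b - a) / steps
        for k in range(steps):
            x = a + (k + 0.5) * h
            total += math.sin(2 * math.pi * (t - x)) * saw(x) * h
        a = b
    return total / (2 * math.pi * m)

for t in (0.25, 0.75, 1.5, 3.2, 5.9):
    print(f"t = {t:4.2f}   closed form = {y_closed(t):+.6f}   quadrature = {y_numeric(t):+.6f}")
```

The two columns should agree to several decimal places, and the growth of the cosine term with n is visible as t increases.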


Additional Exercises

27.2. Using the ﬁrst translation identity or one of the differentiation identities, compute each of
the following:

a. $\mathcal{L}\left[e^{4t}\,\mathrm{step}_6(t)\right]\big|_s$   b. $\mathcal{L}\left[t\,\mathrm{step}_6(t)\right]\big|_s$

27.3. Compute (using the translation along the T –axis identity) and then graph the inverse trans-
forms of the following functions:
a. $\dfrac{e^{-4s}}{s^3}$   b. $\dfrac{e^{-3s}}{s+2}$   c. $\sqrt{\pi}\,s^{-3/2}e^{-s}$

d. $\dfrac{\pi}{s^2+\pi^2}\,e^{-2s}$   e. $\dfrac{e^{-4s}}{(s-5)^3}$   f. $\dfrac{(s+2)e^{-5s}}{(s+2)^2+16}$

27.4. Finish solving the differential equation in example 27.1.

27.5. Compute and then graph the inverse transforms of the following functions (express your
answers as sets of conditional formulas):
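a. $\dfrac{1 - e^{-s}}{s^2}$   b. $\dfrac{e^{-s} + e^{-3s}}{s}$   c. $\dfrac{2}{s^3} - \dfrac{2+4s}{s^3}\,e^{-2s}$

d. $\dfrac{\pi\left(1 + e^{-s}\right)}{s^2+\pi^2}$   e. $\dfrac{(s+4)e^{-12} - 8e^{-3s}}{s^2-16}$   f. $\dfrac{e^{-2s} - 2e^{-4s} + e^{-6s}}{s^2}$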

27.6. Find and graph the solution to each of the following initial-value problems:

a. $y' = \mathrm{step}_3(t)$ with y(0) = 0

b. $y' = \mathrm{step}_3(t)$ with y(0) = 4

c. $y'' = \mathrm{step}_2(t)$ with y(0) = 0 and y'(0) = 0

d. $y'' = \mathrm{step}_2(t)$ with y(0) = 4 and y'(0) = 6

e. y + 9y = step10 (t) with y(0) = 0 and y (0) = 0

27.7. Compute the Laplace transforms of the following functions using the translation along the
T –axis identity. (Trigonometric identities may also be useful for some of these.)
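a. $f(t) = \begin{cases} 0 & \text{if } t < 6 \\ e^{4t} & \text{if } 6 < t \end{cases}$   b. $g(t) = \begin{cases} 0 & \text{if } t < 4 \\ \dfrac{1}{\sqrt{t-4}} & \text{if } 4 < t \end{cases}$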

c. t step6 (t) d. te step2 (t)

e. t 2 step6 (t) f. sin(2(t − 1)) step1 (t)

g. sin(2t) stepπ/2 (t) h. sin(2t) stepπ/4 (t)
i. sin(2t) stepπ/6 (t)


27.8. For each of the following choices of f :

i. Graph the given function over the positive T –axis.


542 Delta Functions

Figure 28.1: The graphs of (a) $\frac{1}{\epsilon}\,\mathrm{rect}_{(0,\epsilon)}(t)$ and (b) $\frac{1}{\epsilon}\,\mathrm{rect}_{(0,\epsilon)}(t-\alpha)$ (equivalently,
$\frac{1}{\epsilon}\,\mathrm{rect}_{(\alpha,\alpha+\epsilon)}(t)$ ) for ε = 1 , ε = 1/2 and ε = 1/4 .

of the delta function as “an inﬁnite spike enclosing unit area” is useful, just as it is useful in physics
to sometimes pretend that we can have a “point mass” (an inﬁnitesimally small particle of nonzero
mass).
The above describes “the” delta function. For any real number α , the delta function at α ,
δα (t) , is simply “the” delta function shifted by α ,
$$\delta_\alpha(t) = \delta(t-\alpha) = \lim_{\epsilon\to 0}\frac{1}{\epsilon}\,\mathrm{rect}_{(0,\epsilon)}(t-\alpha) .$$

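With a little thought (or a glance at figure 28.1b), you can see that the nonzero part of $\mathrm{rect}_{(0,\epsilon)}(t-\alpha)$
starts at t = α and ends at t = α + ε , and that
$$\delta_\alpha(t) = \delta(t-\alpha) = \lim_{\epsilon\to 0}\frac{1}{\epsilon}\,\mathrm{rect}_{(\alpha,\alpha+\epsilon)}(t) .$$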

Do notice that
δ0 (t) = δ(t − 0) = δ(t) .
This means that anything we derive concerning δα also holds for δ — just let α = 0 .

28.2 Delta Functions in Modeling

There are at least two general situations in which delta functions naturally arise when we attempt to
describe “real world” phenomena. One is when we attempt to model brief but strong forces. The
other is when we imagine physical objects as “point masses”. In both, the delta functions appear in
integrals. This will be signiﬁcant and is well worth observing in the models described below.
Since it will be especially useful to see how delta functions model “strong forces of brief
duration”, we’ll start with that.


Strong Forces of Brief Duration

Consider the motion of some object under a force that varies with time. We will assume the object’s
motion is one dimensional (say, along some X–axis), and, as usual, we’ll let

m = the mass (in kilograms) of the object (assumed constant) ,

t = time (in seconds) ,

v(t) = velocity (in meters/second) of the object at time t ,

and
F(t) = force (in kilogram·meters/second²) acting on the object at time t .

(Of course, any units for time, mass and distance can be used, as long as we are consistent.)
Newton’s famous law of force gives
$$F(t) = m \times \text{acceleration} = m\frac{dv}{dt} .$$
If we integrate this over an interval (t0 , t1 ) , we get
$$\int_{t_0}^{t_1} F(t)\,dt = \int_{t_0}^{t_1} m\frac{dv}{dt}\,dt = m\left[v(t_1) - v(t_0)\right] .$$

So the integral of F(t) from t = t0 to t = t1 is the object’s mass times the change in the
object’s velocity over that period of time. This integral of F is sometimes called the impulse of
the force over the interval (t0 , t1 ) (with the total impulse being this integral with t0 = −∞ and
t1 = ∞ ).¹ Note that, following our above conventions for units, the units associated with the impulse
are kilogram·meters/second.
Let’s now restrict ourselves to situations in which the force is zero except for a very short period
of time, during which the force is strong enough to significantly change the velocity of the object
under question. We may be talking about the force of a baseball bat striking a baseball, or the force
of some propellant (gunpowder, compressed air, etc.) forcing a bullet out of a gun, or even the force
of a baseball in flight striking some unfortunate bat that fluttered out over the field to catch flies.
For concreteness, let’s pretend we are studying the force of a baseball bat hitting a baseball at “time
t = α ”. If we are very precise, we may let t = α be the first instant the bat comes into contact with
the ball, and ε the length of time the bat remains in contact with the ball. Considering the situation,
this length of time, ε , must be positive, but very small.
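Before and after the bat touches the ball, this force is zero. So our F(t) must be some function,
such as
$$\frac{1}{\epsilon}\,\mathrm{rect}_{(\alpha,\alpha+\epsilon)}(t) ,$$
that satisfies
$$F(t) = 0 \quad\text{if } t < \alpha \text{ and if } \alpha+\epsilon < t .$$
Thus, if t0 < α and α + ε < t1 , then
$$m\left[v(\alpha) - v(t_0)\right] = \int_{t_0}^{\alpha} F(t)\,dt = \int_{t_0}^{\alpha} 0\,dt = 0$$
and
$$m\left[v(t_1) - v(\alpha+\epsilon)\right] = \int_{\alpha+\epsilon}^{t_1} F(t)\,dt = \int_{\alpha+\epsilon}^{t_1} 0\,dt = 0 .$$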
¹ Students of physics will observe that the impulse is actually equal to the change in the momentum, mv .


Since t0 can be any value less than α , and t1 can be any value greater than α + ε , the last two
equations tell us that the velocity is one constant vbefore before the bat hits the ball, and another
constant vafter afterwards, with
vbefore = v(α) and vafter = v(α + ε) .
(We are using vbefore and vafter because the expressions v(α) and v(α + ε) will become problematic
when we let ε → 0 .)
The precise formula for F(t) while the bat is in contact with the ball is typically both difﬁcult
to determine and of little interest. All we usually care about is describing F(t) well enough to get
the correct change in the velocity of the ball, vafter − vbefore . So let us pick
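$$F(t) = \frac{1}{\epsilon}\,\mathrm{rect}_{(\alpha,\alpha+\epsilon)}(t) ,$$
and see what the resulting change of velocity is as t changes from t0 to t1 (with t0 < α and
α + ε < t1 ):
$$\begin{aligned}
m\left[v_{\text{after}} - v_{\text{before}}\right] &= m\left[v(t_1) - v(t_0)\right] \\
&= \int_{t_0}^{t_1} F(t)\,dt \\
&= \int_{t_0}^{t_1} \frac{1}{\epsilon}\,\mathrm{rect}_{(\alpha,\alpha+\epsilon)}(t)\,dt = \int_{\alpha}^{\alpha+\epsilon} \frac{1}{\epsilon}\,dt = 1 .
\end{aligned}$$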

In other words,
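$$F(t) = \frac{1}{\epsilon}\,\mathrm{rect}_{(\alpha,\alpha+\epsilon)}(t)$$
describes a force of duration ε starting at t = α with a total impulse of 1 . Obviously, if we, instead,
wanted a force of duration ε starting at t = α with a total impulse of I , we could just multiply the
above by I . The corresponding velocity of the ball is then given by
$$v(t) = \begin{cases} v_{\text{before}} & \text{if } t < \alpha \\ v_{\text{after}} & \text{if } \alpha+\epsilon < t \end{cases}$$
where
$$m\left[v_{\text{after}} - v_{\text{before}}\right] = \int_{t_0}^{t_1} I\cdot\frac{1}{\epsilon}\,\mathrm{rect}_{(\alpha,\alpha+\epsilon)}(t)\,dt = I .$$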

There is just one little complication: Determining the length of time, ε , the bat is in contact
with the ball. And, naturally, because this length of time is so close to being zero, we will simplify
our computations by letting ε → 0 . Thus, for some constant I , we model the force by a delta
function force
$$F(t) = \lim_{\epsilon\to 0^+} I\cdot\frac{1}{\epsilon}\,\mathrm{rect}_{(\alpha,\alpha+\epsilon)}(t) = I\,\delta_\alpha(t) .$$
The resulting velocity of the ball v(t) is then given by two constants vbefore and vafter , with

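$$v(t) = \begin{cases} v_{\text{before}} & \text{if } t < \alpha \\ v_{\text{after}} & \text{if } \alpha < t \end{cases}$$
where
$$m\left[v_{\text{after}} - v_{\text{before}}\right] = \text{total impulse of } F = I .$$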
Observe that using a delta function force leads to the velocity changing instantly from one constant
to another. The velocity is no longer continuous, and the velocity right at t = α is no longer well
deﬁned. This is not physically possible but is still a very good approximation of what really happens.


We should also note that F(t) = δα (t) corresponds to a force acting instantaneously at t = α
with total impulse of 1 . For that reason, δα is also known as the (instantaneous) unit impulse
function at α .

!Example 28.1: A baseball of mass 0.145 kilograms is thrown with a speed of 35 meters per
second (about 78 miles per hour) towards a batter, who then hits the ball (with his bat), sending it
back toward the pitcher with a speed of 42 meters per second (about 94 miles per hour). We’ll simplify
matters slightly by assuming the ball travels along an X –axis both before and after it is hit, with
the initial direction of travel in the negative direction and the ﬁnal direction of travel in the positive
direction along the X –axis. So, (in meters/second)

vbefore = −35 and vafter = 42 .

Letting α be the time the bat hits the ball, we can model the force of the bat on the ball by

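$$F(t) = I\,\delta_\alpha(t) \quad \frac{\text{kg}\cdot\text{meter}}{\text{second}^2}$$
where the impulse of the force is
$$I = m\left[v_{\text{after}} - v_{\text{before}}\right] = 0.145\,[42 + 35] = 11.165 \quad \frac{\text{kg}\cdot\text{meter}}{\text{second}} .$$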
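
The arithmetic of this example, and the way the height of a rectangular force of duration ε must scale to keep the same impulse, can be sketched in a few lines of Python. The function names and the sample contact times are hypothetical and chosen only for illustration.

```python
def impulse(mass, v_before, v_after):
    """Total impulse I = m (v_after - v_before); kg*m/s when the inputs are in SI units."""
    return mass * (v_after - v_before)

def rect_force_height(total_impulse, eps):
    """Height I/eps of a rectangular force of duration eps carrying the given impulse."""
    return total_impulse / eps

mass = 0.145                      # kilograms, as in the example
I = impulse(mass, -35.0, 42.0)
print(f"impulse = {I:.3f} kg*m/s")

# Assumed contact times (seconds); the velocity change I/m does not depend on eps.
for eps in (1e-2, 1e-3, 1e-4):
    height = rect_force_height(I, eps)
    delta_v = height * eps / mass
    print(f"eps = {eps:g}   force height = {height:.1f} N   velocity change = {delta_v:.1f} m/s")
```

Shrinking ε raises the force height without changing the velocity change of 77 meters per second, which is the behavior the delta function force idealizes.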

As Density Functions for Point Masses

Suppose we have some material spread out along the X–axis. Recall that the linear density of the
material at position x , ρ(x) , is the “mass per unit length” of the material at point x . More precisely,
it is the function such that, if x 0 < x 1 , then
$$\int_{x_0}^{x_1} \rho(x)\,dx$$

gives the mass of the material between positions x = x 0 and x = x 1 .

Now think about what it means to have a density function
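$$\rho(x) = \frac{m}{\epsilon}\,\mathrm{rect}_{(\alpha,\alpha+\epsilon)}(x) = \begin{cases} 0 & \text{if } x < \alpha \\ \dfrac{m}{\epsilon} & \text{if } \alpha < x < \alpha+\epsilon \\ 0 & \text{if } \alpha+\epsilon < x \end{cases}$$
where α , m and ε are real numbers with m and ε being positive. Here, all the mass is uniformly
spread out in some object located between x = α and x = α + ε . Picking x0 < α and α + ε < x1 ,
we see that
$$\begin{aligned}
\text{total mass of the object} &= \int_{x_0}^{x_1} \rho(x)\,dx \\
&= \int_{x_0}^{x_1} \frac{m}{\epsilon}\,\mathrm{rect}_{(\alpha,\alpha+\epsilon)}(x)\,dx = \frac{m}{\epsilon}\int_{\alpha}^{\alpha+\epsilon} 1\,dx = m .
\end{aligned}$$
So we have an object of mass m occupying the X–axis from x = α to x = α + ε .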

In many applications, the width of the object, ε , is much smaller than the other dimensions
involved, and taking account of this width complicates computations without significantly affecting
the results of the computations. In these cases, it is common to simplify the mathematics by letting
ε → 0 and thereby converting

our object of mass m occupying the region between x = α and x = α + ε

to
an object of mass m occupying the point x = α .
In doing so, we see that
$$\rho(x) = \lim_{\epsilon\to 0^+} \frac{m}{\epsilon}\,\mathrm{rect}_{(\alpha,\alpha+\epsilon)}(x) = m \lim_{\epsilon\to 0^+} \frac{1}{\epsilon}\,\mathrm{rect}_{(\alpha,\alpha+\epsilon)}(x) = m\,\delta_\alpha(x) .$$
→0+ →0+

Thus, the delta function at α multiplied by m describes the linear density of a “point mass” at α
of mass m .

28.3 The Mathematics of Delta Functions

Integrals with Delta Functions
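While we used
$$\delta_\alpha(t) = \delta(t-\alpha) = \lim_{\epsilon\to 0^+} \frac{1}{\epsilon}\,\mathrm{rect}_{(\alpha,\alpha+\epsilon)}(t) \tag{28.1}$$
to visualize the delta function at α , it is mathematically better to view δα through the integral
equation
$$\int_{t_0}^{t_1} g(t)\,\delta_\alpha(t)\,dt = \lim_{\epsilon\to 0^+} \int_{t_0}^{t_1} g(t)\,\frac{1}{\epsilon}\,\mathrm{rect}_{(\alpha,\alpha+\epsilon)}(t)\,dt \tag{28.2}$$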

where (t0 , t1 ) can be any interval and g can be any function on (t0 , t1 ) continuous at α . This
means we are really viewing “ δα (t) ” as notation indicating a certain limiting process involving
integration. Remember, that’s how we actually used delta functions in modeling strong brief forces
and point masses.
Since our interest is mainly in using delta functions with the Laplace transform, let us simplify
matters a little and just consider the integral
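$$\int_0^\infty g(t)\,\delta_\alpha(t)\,dt$$
when α ≥ 0 and g is any function continuous at α and piecewise continuous on [0, ∞) . Before
applying equation (28.2), observe that, because 0 ≤ α and
$$g(t)\,\mathrm{rect}_{(\alpha,\alpha+\epsilon)}(t) = \begin{cases} 0 & \text{if } t < \alpha \\ g(t) & \text{if } \alpha < t < \alpha+\epsilon \\ 0 & \text{if } \alpha+\epsilon < t , \end{cases}$$
we have
$$\int_0^\infty g(t)\cdot\frac{1}{\epsilon}\,\mathrm{rect}_{(\alpha,\alpha+\epsilon)}(t)\,dt = \frac{1}{\epsilon}\int_{\alpha}^{\alpha+\epsilon} g(t)\,dt .$$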

Figure 28.2: The rectangle with area equal to $\int_{\alpha}^{\alpha+\epsilon} g(t)\,dt$ .

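Because we will be taking the limit of the above as ε → 0 , we can assume ε is small enough that
g is continuous on the closed interval [α, α + ε] , and then apply the fact (illustrated in figure 28.2)
that
$$\begin{aligned}
\int_{\alpha}^{\alpha+\epsilon} g(t)\,dt &= \text{“(net) area between } T\text{–axis and graph of } y = g(t) \text{ with } \alpha \le t \le \alpha+\epsilon \text{”} \\
&= \text{“(net) area of rectangle with base } [\alpha, \alpha+\epsilon] \text{ and (signed) height } g(t_\epsilon) \text{ for some } t_\epsilon \text{ in the interval } [\alpha, \alpha+\epsilon] \text{”} \\
&= \epsilon \times g(t_\epsilon) \quad\text{for some } t_\epsilon \text{ in } [\alpha, \alpha+\epsilon] .
\end{aligned}$$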

Combining the above and applying equation (28.2), we obtain

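$$\begin{aligned}
\int_0^\infty g(t)\,\delta_\alpha(t)\,dt &= \lim_{\epsilon\to 0} \int_0^\infty g(t)\cdot\frac{1}{\epsilon}\,\mathrm{rect}_{(\alpha,\alpha+\epsilon)}(t)\,dt \\
&= \lim_{\epsilon\to 0} \frac{1}{\epsilon}\int_{\alpha}^{\alpha+\epsilon} g(t)\,dt \\
&= \lim_{\epsilon\to 0} \frac{1}{\epsilon}\times\Big\{\epsilon \times g(t_\epsilon) \text{ for some } t_\epsilon \text{ in } [\alpha, \alpha+\epsilon]\Big\} \\
&= \lim_{\epsilon\to 0} \Big\{g(t_\epsilon) \text{ for some } t_\epsilon \text{ in } [\alpha, \alpha+\epsilon]\Big\} \\
&= g(t_0) \text{ for some } t_0 \text{ in } [\alpha, \alpha+0] .
\end{aligned}$$

But, of course, the only $t_0$ in [α, α + 0] is $t_0 = \alpha$ . So the above reduces to a simple result that is
important enough to place in a theorem.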

Theorem 28.1
Let α ≥ 0 and let g be any piecewise continuous function on [0, ∞) which is continuous at t = α .
Then
$$\int_0^\infty g(t)\,\delta_\alpha(t)\,dt = g(\alpha) . \tag{28.3}$$
In particular, since δ = δ0 ,
$$\int_0^\infty g(t)\,\delta(t)\,dt = g(0) . \tag{28.4}$$

!Example 28.2: Actually, two examples:

$$\int_0^\infty t^2\,\delta_3(t)\,dt = 3^2 = 9 ,$$
and
$$\int_0^\infty (5-t)^3\,\delta(t)\,dt = (5-0)^3 = 125 .$$

We derived the above theorem because it covers the cases of greatest interest to us. Still, it is
worth noting that with just a little more work, you can verify that
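$$\int_{t_0}^{t_1} g(t)\,\delta_\alpha(t)\,dt = \begin{cases} g(\alpha) & \text{if } t_0 \le \alpha < t_1 \\ 0 & \text{if } \alpha < t_0 \text{ or } t_1 \le \alpha \end{cases} \tag{28.5}$$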

whenever g is a function continuous at α and piecewise continuous on [t0 , t1 ) .

Equations (28.3) and (28.4) (and, more generally, equation (28.5)) are often used instead of
equation (28.2) as “fundamental descriptions” of the delta functions. Their simplicity belies their
significance.
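
The limiting process behind equation (28.3) can also be watched numerically. The sketch below is an illustration only: the helper name and the midpoint rule are assumptions, not part of the text. It evaluates (1/ε) times the integral of g over (α, α + ε) for shrinking ε, using the two integrands of example 28.2.

```python
def rect_average(g, alpha, eps, steps=2000):
    """Midpoint-rule value of (1/eps) * integral of g(t) over (alpha, alpha + eps)."""
    h = eps / steps
    total = sum(g(alpha + (k + 0.5) * h) for k in range(steps))
    return total * h / eps

def g_square(t):
    return t**2

def g_cubic(t):
    return (5 - t)**3

for eps in (1.0, 0.1, 0.01, 0.001):
    print(f"eps = {eps:<6g}"
          f"  t^2 at alpha = 3: {rect_average(g_square, 3.0, eps):.6f}"
          f"  (5 - t)^3 at alpha = 0: {rect_average(g_cubic, 0.0, eps):.6f}")
```

As ε decreases, the two columns approach g(3) = 9 and g(0) = 125, the values given by the theorem.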

Laplace Transforms of Delta Functions

Finding the Laplace transform of a delta function is easy. Just use the integral formula for the Laplace
transform along with an equation from theorem 28.1. Assuming α ≥ 0 , we have
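$$\mathcal{L}[\delta_\alpha(t)]\big|_s = \int_0^\infty \delta_\alpha(t)\,e^{-st}\,dt = e^{-s\alpha} ,$$
which we usually prefer to write as
$$\mathcal{L}[\delta_\alpha(t)]\big|_s = e^{-\alpha s} .$$
In particular,
$$\mathcal{L}[\delta(t)]\big|_s = \mathcal{L}[\delta_0(t)]\big|_s = e^{-0s} = 1 .$$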
These transforms are important enough to add to our table of common transforms, giving us
table 28.1.
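
The same transform can be reached from the limiting view of the delta function by transforming the rectangle approximation first. The following short computation is a sketch, assuming s > 0 and that the limit is taken after the integration, as in equation (28.2):

$$\mathcal{L}\left[\frac{1}{\epsilon}\,\mathrm{rect}_{(\alpha,\alpha+\epsilon)}(t)\right]\bigg|_s = \frac{1}{\epsilon}\int_{\alpha}^{\alpha+\epsilon} e^{-st}\,dt = e^{-\alpha s}\,\frac{1 - e^{-\epsilon s}}{\epsilon s} \;\longrightarrow\; e^{-\alpha s} \quad\text{as } \epsilon \to 0^+ ,$$

since $(1 - e^{-x})/x \to 1$ as $x \to 0$ .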

Differential Equations with Delta functions

Using the Laplace transform, it is relatively easy to solve many differential equations in which delta
functions act as forcing functions. Let us look at two examples.

!Example 28.3: Let’s ﬁnd the solution to

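$$\frac{dy}{dt} = \delta_\alpha(t) \quad\text{with}\quad y(0) = 0$$
where α is any positive real number.
Taking the Laplace transform of both sides:
$$\begin{aligned}
& \mathcal{L}\left[\frac{dy}{dt}\right]\bigg|_s = \mathcal{L}[\delta_\alpha(t)]\big|_s \\
\longrightarrow\quad & sY(s) - y(0) = e^{-\alpha s} \\
\longrightarrow\quad & sY(s) - 0 = e^{-\alpha s} \\
\longrightarrow\quad & Y(s) = \frac{e^{-\alpha s}}{s} .
\end{aligned}$$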


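Table 28.1: Laplace Transforms of Common Functions (Version 2)

In the following, α and ω are real-valued constants, and, unless otherwise noted, s > 0 .

$$\begin{array}{lll}
f(t) & F(s) = \mathcal{L}[f(t)]\big|_s & \text{Restrictions} \\ \hline
1 & \dfrac{1}{s} & \\[2ex]
t & \dfrac{1}{s^2} & \\[2ex]
t^n & \dfrac{n!}{s^{n+1}} & n = 1, 2, 3, \ldots \\[2ex]
\dfrac{1}{\sqrt{t}} & \dfrac{\sqrt{\pi}}{\sqrt{s}} & \\[2ex]
t^\alpha & \dfrac{\Gamma(\alpha+1)}{s^{\alpha+1}} & -1 < \alpha \\[2ex]
e^{\alpha t} & \dfrac{1}{s-\alpha} & \alpha < s \\[2ex]
e^{i\alpha t} & \dfrac{1}{s - i\alpha} & \\[2ex]
\cos(\omega t) & \dfrac{s}{s^2+\omega^2} & \\[2ex]
\sin(\omega t) & \dfrac{\omega}{s^2+\omega^2} & \\[2ex]
\mathrm{step}_\alpha(t),\ \mathrm{step}(t-\alpha) & \dfrac{e^{-\alpha s}}{s} & 0 \le \alpha \\[2ex]
\delta(t) & 1 & \\[2ex]
\delta_\alpha(t),\ \delta(t-\alpha) & e^{-\alpha s} & 0 \le \alpha
\end{array}$$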

Thus, the solution to our differential equation is

$$y(t) = \mathcal{L}^{-1}[Y(s)]\big|_t = \mathcal{L}^{-1}\left[\frac{e^{-\alpha s}}{s}\right]\bigg|_t = \mathrm{step}_\alpha(t) .$$

According to the last example, y(t) = stepα (t) is a solution to y'(t) = δα (t) . In other words,
$$\frac{d}{dt}\,\mathrm{step}_\alpha(t) = \delta_\alpha(t) .$$
This interesting fact may also be a disturbing fact for those of you who realize that step functions
are not differentiable, at least not in the sense normally taught in calculus courses. The truth is that
delta functions are somewhat exotic entities that are outside the classical theory of calculus. They
are really examples of things better referred to as “generalized functions”, and the above equation
about the derivative of the step function, while not valid in a strict classical sense, is valid using a
definition of differentiation appropriate for these generalized functions. (We will discuss this a little
further in section 28.5.)
But enough worrying about technicalities. Let’s solve another differential equation with a delta
function.

!Example 28.4: Now consider

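$$y'' - 10y' + 21y = \delta(t) \quad\text{with}\quad y(0) = 0 \ \text{ and } \ y'(0) = 0 .$$

Taking the Laplace transform of both sides:
$$\begin{aligned}
& \mathcal{L}\left[y'' - 10y' + 21y\right]\big|_s = \mathcal{L}[\delta(t)]\big|_s \\
\longrightarrow\quad & \mathcal{L}\left[y''\right]\big|_s - 10\,\mathcal{L}\left[y'\right]\big|_s + 21\,\mathcal{L}[y]\big|_s = 1 \\
\longrightarrow\quad & s^2 Y(s) - 10sY(s) + 21Y(s) = 1 \\
\longrightarrow\quad & \left[s^2 - 10s + 21\right] Y(s) = 1 .
\end{aligned}$$
So,
$$Y(s) = \frac{1}{s^2 - 10s + 21} ,$$
which just happens to be the function whose inverse transform was found in example 26.4 on
page 502. Using the result of that example, we can just write out
$$y(t) = \mathcal{L}^{-1}[Y(s)]\big|_t = \frac{1}{4}\left[e^{7t} - e^{3t}\right] .$$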
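
A quick way to gain confidence in this answer is to check it directly. The Python sketch below (helper names are illustrative assumptions) uses the hand-computed derivatives of the solution to confirm that it satisfies the homogeneous equation for t > 0 and starts from y(0) = 0. It also shows that the derivative just after t = 0 equals 1, which is the jump that a unit impulse produces in y' when the leading coefficient is 1.

```python
import math

def y(t):
    """Solution from the example: (e^{7t} - e^{3t}) / 4."""
    return (math.exp(7 * t) - math.exp(3 * t)) / 4

def dy(t):
    """First derivative of y, computed by hand."""
    return (7 * math.exp(7 * t) - 3 * math.exp(3 * t)) / 4

def d2y(t):
    """Second derivative of y, computed by hand."""
    return (49 * math.exp(7 * t) - 9 * math.exp(3 * t)) / 4

def residual(t):
    """Left side of y'' - 10y' + 21y, which should vanish for t > 0."""
    return d2y(t) - 10 * dy(t) + 21 * y(t)

for t in (0.1, 0.5, 1.0):
    print(f"t = {t}: residual = {residual(t):.2e}")
print("y(0) =", y(0.0), "  y'(0+) =", dy(0.0))
```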

28.4 Delta Functions and Duhamel’s Principle

If you compare the results of the last example with the results of example 26.6 on page 503, you’ll
notice that the solution y(t) to

y'' − 10y' + 21y = δ(t) with y(0) = 0 and y'(0) = 0

and the impulse response function h(t) for

y'' − 10y' + 21y = f (t)

are one and the same. Is this an amazing coincidence?

No.
In section 26.3 we saw that, for any real constants a , b and c , and any Laplace transformable
function f , the solution on (0, ∞) to the generic initial-value problem

ay'' + by' + cy = f (t) with y(0) = 0 and y'(0) = 0

is given by
y(t) = h ∗ f (t)


where
$$h = \mathcal{L}^{-1}[H] \quad\text{and}\quad H(s) = \frac{1}{as^2 + bs + c} .$$
Now consider the corresponding initial-value problem

ay'' + by' + cy = δ(t) with y(0) = 0 and y'(0) = 0 ,

which is just the generic initial-value problem above with f = δ . Taking the Laplace transform,
we get

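$$\begin{aligned}
& \mathcal{L}\left[ay'' + by' + cy\right]\big|_s = \mathcal{L}[\delta(t)]\big|_s \\
\longrightarrow\quad & a\,\mathcal{L}\left[y''\right]\big|_s + b\,\mathcal{L}\left[y'\right]\big|_s + c\,\mathcal{L}[y]\big|_s = 1 \\
\longrightarrow\quad & as^2 Y(s) + bsY(s) + cY(s) = 1 \\
\longrightarrow\quad & \left[as^2 + bs + c\right] Y(s) = 1 .
\end{aligned}$$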

Dividing by the polynomial and comparing the result with the above formula for H , we see that
$$Y(s) = \frac{1}{as^2 + bs + c} = H(s) .$$
Thus,
$$y(t) = \mathcal{L}^{-1}[Y(s)]\big|_t = \mathcal{L}^{-1}[H(s)]\big|_t = h(t) .$$
In other words, h(t) is the solution to the particular initial-value problem
$$ah'' + bh' + ch = \delta(t) \quad\text{with}\quad h(0) = 0 \ \text{ and } \ h'(0) = 0 .$$

This explains why h is commonly referred to as the “impulse response function” — well,
almost explains. Here’s a little background: In many applications, the solution to the initial-value
problem
ay'' + by' + cy = f (t) with y(0) = 0 and y'(0) = 0
describes how some physical system responds to an applied “force” f (actually, f might not be
an actual force). With this interpretation, h does give the response of the system to a delta function
force, and, as noted earlier, the delta function is also known as a unit impulse function. Hence the
term “impulse response function” for h .
Of course, the generic computations just done can be done with higher-order differential equations.
Combining this with theorem 26.2 on page 506 yields

Theorem 28.2 (Duhamel’s principle, version 2)

Let N be any positive integer, let $a_0$ , $a_1$ , . . . and $a_N$ be any collection of real-valued constants,
and let f (t) be any Laplace transformable function. Then, the solution to
$$a_0 y^{(N)} + a_1 y^{(N-1)} + \cdots + a_{N-2}\,y'' + a_{N-1}\,y' + a_N\,y = f(t)$$
satisfying the N th-order “zero” initial conditions,
$$y(0) = 0 , \quad y'(0) = 0 , \quad y''(0) = 0 , \quad \ldots \quad\text{and}\quad y^{(N-1)}(0) = 0 ,$$
is given by
$$y(t) = h * f(t) = \int_0^t h(x)\,f(t-x)\,dx$$
where h(t) is the solution to
$$a_0 h^{(N)} + a_1 h^{(N-1)} + \cdots + a_{N-2}\,h'' + a_{N-1}\,h' + a_N\,h = \delta(t)$$
with
$$h(0) = 0 , \quad h'(0) = 0 , \quad h''(0) = 0 , \quad \ldots \quad\text{and}\quad h^{(N-1)}(0) = 0 .$$

There is a practical consequence to h being the impulse response function. Suppose you have
a physical system in which you know the ‘output’ y(t) is related to an ‘input’ f (t) through a
differential equation of the form given in the above theorem. Suppose, further, that you do not know
exactly what that differential equation is. Maybe, for example, you have a mass/spring system some
of whose basic parameters — mass, spring constant or damping constant — are unknown and cannot
be easily measured. The above theorem tells us that, if we input the physical equivalent of a delta
function (say, we provide a unit impulse to the mass/spring system by carefully hitting the mass with
a hammer), then measuring the output over time will yield a description of the impulse response
function, h(t) . Save those values for h(t) over time in a computer, and you can then numerically
evaluate the output y(t) corresponding to any other input f (t) through the formula

y(t) = f ∗ h(t) .

In practice, generating and inputting the physical equivalent of δ(t) is usually impossible. What
is often possible is to generate and input a good approximation to the delta function, say,
$$\frac{1}{\epsilon}\,\mathrm{rect}_{(0,\epsilon)}(t)$$
for some small value of ε . The resulting measured output will not be h(t) exactly, but, if the errors
in measurement aren’t too bad, it will be a close approximation.
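
The "save the values of h and convolve" idea can be sketched as follows. Everything here is an illustrative assumption rather than a prescribed procedure: the stored samples are generated from h(t) = sin(t), which is the impulse response of y'' + y = f, the input is the constant f(t) = 1, and the convolution integral is approximated with the trapezoid rule. For this system the exact output is 1 − cos(t), which gives something to compare against.

```python
import math

def convolve_at(h_samples, f, t_index, dt):
    """Trapezoid-rule value of integral_0^t h(x) f(t - x) dx at t = t_index * dt.

    h_samples[k] holds the stored value of h at time k * dt.
    """
    if t_index == 0:
        return 0.0
    t = t_index * dt
    total = 0.0
    for k in range(t_index + 1):
        weight = 0.5 if k in (0, t_index) else 1.0
        total += weight * h_samples[k] * f(t - k * dt)
    return total * dt

def constant_input(t):
    return 1.0

dt = 0.001
n = 5000
# Stand-in for measured data: samples of h(t) = sin(t) on [0, 5].
h_samples = [math.sin(k * dt) for k in range(n + 1)]

y_end = convolve_at(h_samples, constant_input, n, dt)
print("convolution estimate of y(5):", y_end)
print("exact value 1 - cos(5):      ", 1 - math.cos(n * dt))
```

With real measurements the list of samples would come from the recorded response, and its accuracy would limit the accuracy of the computed output.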

28.5 Some “Issues” with Delta Functions

The astute reader may have noticed that we’ve glossed over a few troublesome issues in our discussion
of delta functions. Let’s deal with a few of these now.

Deﬁning the Delta Functions

You may have noticed that we have not yet deﬁned the delta function. In particular, I’ve not given
you any formula for computing the values of δ(t) or δα (t) for different values of t . Instead, you’ve
only been told to visualize δα (t) in terms of either the limit
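$$\delta_\alpha(t) = \delta(t-\alpha) = \lim_{\epsilon\to 0} \frac{1}{\epsilon}\,\mathrm{rect}_{(\alpha,\alpha+\epsilon)}(t) , \tag{28.6}$$
or the limit
$$\int_{t_0}^{t_1} g(t)\,\delta_\alpha(t)\,dt = \lim_{\epsilon\to 0^+} \int_{t_0}^{t_1} g(t)\,\frac{1}{\epsilon}\,\mathrm{rect}_{(\alpha,\alpha+\epsilon)}(t)\,dt . \tag{28.7}$$

If you check other texts, you’ll often find δα (with α ≥ 0 ) “defined” either as the limit in
(28.6) or as the ‘function’ such that
$$\int_0^\infty g(t)\,\delta_\alpha(t)\,dt = g(\alpha) \tag{28.8}$$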


whenever g is a function continuous at α . (This, recall, was something we derived from equation
(28.7).) Both of these are good ‘working’ deﬁnitions in that, properly interpreted, they tell you how
you should use the symbol δα in computations (provided you interpret the limit in (28.6) as really
meaning the limit in (28.7)).
Unfortunately, if you treat either as a rigorous definition for a classical function δα , then you
can rigorously derive
$$\delta_\alpha(t) = 0 \quad\text{whenever } t \ne \alpha .$$
Rigorously applying the classical theory of integration normally developed in undergraduate
mathematics, you then find that
$$\begin{aligned}
\int_0^\infty g(t)\,\delta_\alpha(t)\,dt &= \int_0^{\alpha} g(t)\,\delta_\alpha(t)\,dt + \int_{\alpha}^{\infty} g(t)\,\delta_\alpha(t)\,dt \\
&= \int_0^{\alpha} g(t)\cdot 0\,dt + \int_{\alpha}^{\infty} g(t)\cdot 0\,dt = 0 .
\end{aligned}$$
In particular, using g(t) = t² , α = 1 and both equation (28.7) and the last equation above, we get
$$1 = 1^2 = \int_0^\infty t^2\,\delta_1(t)\,dt = 0 \; !$$

The problem is that there is no classical function that satisfies either definition. Fortunately, there
is a way to ‘generalize’ the classical notion of ‘functions’ yielding a class of things called “generalized
functions”. Delta functions are members of this class. Unfortunately, a proper development of
“generalized functions” goes beyond the scope of this text. What can be said is that, if f is a
generalized function, then, for every sufficiently smooth and integrable function g and suitable
interval (t0 , t1 ) ,
$$\int_{t_0}^{t_1} g(t)\,f(t)\,dt$$
“makes sense” in some generalized sense. For f = δα , this integral can be defined by equation
(28.7).² Using the theory of generalized functions, along with the corresponding generalization of
the theory of calculus, everything developed in this chapter can be rigorously defined or derived,
including the observation that, “in a generalized sense”,
$$\frac{d}{dt}\,\mathrm{step}_\alpha(t) = \delta_\alpha(t) .$$
For now, however, it may be best to view the computations we are doing with δα as shorthand for doing
the same computations with
$$\frac{1}{\epsilon}\,\mathrm{rect}_{(\alpha,\alpha+\epsilon)} ,$$
and then letting ε → 0+ in the final result.

!Example 28.5: Let’s reconsider solving

$$\frac{dy}{dt} = \delta_\alpha(t) \quad\text{with}\quad y(0) = 0$$

² If you must know, “generalized functions” are actually “continuous linear functionals on a suitable space of test functions”,
and if you want to ﬁnd out what that means, see part IV of the author’s Principles of Fourier Analysis, or go to the library
and look up books on either generalized functions or distributional theory, or, possibly, do an internet search for these
terms.

Figure 28.3: The graph of the solution $y_\epsilon$ to initial-value problem (28.9) (a) when ε = a and
(b) when ε = b with 0 < b < a .

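where α is any positive real number. Doing the replacement suggested above, we’ll first solve
$$\frac{dy_\epsilon}{dt} = \frac{1}{\epsilon}\,\mathrm{rect}_{(\alpha,\alpha+\epsilon)} \quad\text{with}\quad y_\epsilon(0) = 0 , \tag{28.9}$$
assuming ε > 0 , and then take the limit of the result as ε → 0 .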
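Taking the Laplace transform of both sides of the last equation:
$$\begin{aligned}
& \mathcal{L}\left[\frac{dy_\epsilon}{dt}\right]\bigg|_s = \mathcal{L}\left[\frac{1}{\epsilon}\,\mathrm{rect}_{(\alpha,\alpha+\epsilon)}\right]\bigg|_s \\
\longrightarrow\quad & sY_\epsilon(s) - y_\epsilon(0) = \frac{1}{\epsilon}\,\mathcal{L}\left[\mathrm{rect}_{(\alpha,\alpha+\epsilon)}\right]\big|_s \\
\longrightarrow\quad & sY_\epsilon(s) - 0 = \frac{1}{\epsilon}\left[\frac{1}{s}\,e^{-\alpha s} - \frac{1}{s}\,e^{-(\alpha+\epsilon)s}\right] \\
\longrightarrow\quad & Y_\epsilon(s) = \frac{1}{\epsilon}\left[\frac{1}{s^2}\,e^{-\alpha s} - \frac{1}{s^2}\,e^{-(\alpha+\epsilon)s}\right] .
\end{aligned}$$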
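So,
$$\begin{aligned}
y_\epsilon(t) &= \mathcal{L}^{-1}\left[\frac{1}{\epsilon}\left(\frac{1}{s^2}\,e^{-\alpha s} - \frac{1}{s^2}\,e^{-(\alpha+\epsilon)s}\right)\right]\bigg|_t \\
&= \frac{1}{\epsilon}\left\{\mathcal{L}^{-1}\left[\frac{e^{-\alpha s}}{s^2}\right]\bigg|_t - \mathcal{L}^{-1}\left[\frac{e^{-(\alpha+\epsilon)s}}{s^2}\right]\bigg|_t\right\} \\
&= \frac{1}{\epsilon}\Big\{[t-\alpha]\,\mathrm{step}_\alpha(t) - [t-(\alpha+\epsilon)]\,\mathrm{step}_{\alpha+\epsilon}(t)\Big\} \\
&= \frac{1}{\epsilon}\begin{cases} 0 - 0 & \text{if } t < \alpha \\ [t-\alpha] - 0 & \text{if } \alpha < t < \alpha+\epsilon \\ [t-\alpha] - [t-(\alpha+\epsilon)] & \text{if } \alpha+\epsilon < t \end{cases} \\
&= \begin{cases} 0 & \text{if } t < \alpha \\ \dfrac{t-\alpha}{\epsilon} & \text{if } \alpha < t < \alpha+\epsilon \\ 1 & \text{if } \alpha+\epsilon < t \end{cases} .
\end{aligned}$$
Graphs of $y_\epsilon$ for two different values of ε are sketched in figure 28.3.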
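Finally, taking the limit, we get
$$y(t) = \lim_{\epsilon\to 0^+} y_\epsilon(t) = \lim_{\epsilon\to 0^+} \left\{\begin{array}{ll} 0 & \text{if } t < \alpha \\ \dfrac{t-\alpha}{\epsilon} & \text{if } \alpha < t < \alpha+\epsilon \\ 1 & \text{if } \alpha+\epsilon < t \end{array}\right\} = \begin{cases} 0 & \text{if } t < \alpha \\ 1 & \text{if } \alpha < t \end{cases} .$$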


That is,
y(t) = stepα (t) ,
just as obtained (with much less work!) in example 28.3.
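
The convergence of the ramp solutions to the step function can be tabulated directly. In the Python sketch below, the function names and the sample values of α, ε and t are illustrative assumptions. The ramp is the solution of the problem with the rectangle of width ε on the right side, and its values at fixed times are printed for shrinking ε.

```python
def y_eps(t, alpha, eps):
    """Ramp solution of y' = (1/eps) rect_(alpha, alpha+eps)(t) with y(0) = 0."""
    if t <= alpha:
        return 0.0
    if t < alpha + eps:
        return (t - alpha) / eps
    return 1.0

def step(t, alpha):
    """step_alpha(t) for t different from alpha."""
    return 1.0 if t > alpha else 0.0

alpha = 2.0
times = (1.9, 2.05, 2.5, 3.5)
for eps in (1.0, 0.1, 0.001):
    print(f"eps = {eps:<6g}", [round(y_eps(t, alpha, eps), 3) for t in times])
print("step        ", [step(t, alpha) for t in times])
```

For each fixed t other than α, the printed values settle on the value of the step function once ε is smaller than t − α.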

Continuity of Solutions and Problems with Initial Values

Early in this text, it was stated that solutions to first-order differential equations had to be continuous,
and solutions to second-order differential equations had to be continuous and have continuous
derivatives. But y = stepα (t) , the solution to
$$\frac{dy}{dt} = \delta_\alpha(t) \quad\text{with}\quad y(0) = 0$$
obtained in examples 28.3 and 28.5, is clearly not continuous. If you think about it, this may not
be so surprising. Our original insistence on the continuity of solutions assumed we were using
classical functions. The exotic nature of the delta functions takes us outside the classical theory to
the idealized cases where instantaneous change can occur.
Normally, this is not a problem. Indeed, it can be desirable, especially if you are modeling “brief
strong forces”. One place where this can cause some confusion is when the discontinuities occur
where initial data is given. In these cases, the confusion can be somewhat abated by remembering
that a delta function really indicates a limiting process.

!Example 28.6: Letting α = 0 , we see that the solution to

$$\frac{dy}{dt} = \delta(t) \quad\text{with}\quad y(0) = 0$$
is
$$y(t) = \mathrm{step}(t) .$$
However, step(t) has a jump at t = 0 , and its limit from the right at this point is 1 . So how
can we say this step function satisfies the given initial condition, y(0) = 0 ? By going back to
example 28.5, which showed that the above solution should be viewed as the limit as ε → 0 of
the function $y_\epsilon(t)$ graphed in figure 28.3 with α = 0 . For each ε > 0 , $y_\epsilon(t)$ is continuous at
t = 0 and satisfies $y_\epsilon(0) = 0$ . As ε becomes smaller, the values of $y_\epsilon(t)$ increase more rapidly
to 1 for positive values of t . So what we end up with after taking ε → 0 is that the left-hand
limit of y(t) at t = 0 is 0 , but y(t) “immediately” increases from 0 to 1 as t switches from
negative values to positive values.

What this last example demonstrates is that, when the differential equation has a δ(t) in its
forcing function, then initial conditions naively written as

$$y(0) = y_0 , \quad y'(0) = y_1 , \quad \ldots$$
are, well, naive. What is really meant is that these values give the left-hand limits,
$$\lim_{t\to 0^-} y(t) = y_0 , \quad \lim_{t\to 0^-} y'(t) = y_1 , \quad \ldots .$$


Additional Exercises

28.1. The speeds of the pitched and batted baseball given in example 28.1 are close to ‘typical’
speeds. However, a really good professional pitcher can throw a fastball at a speed of 45
meters per second (a little over 100 miles per hour), and a good batter can hit the ball
back at 49 meters per second (almost 110 miles per hour). Assuming these values (and a
0.145 kilogram baseball):
a. Find the magnitude of the impulse the pitcher initially imparts to the thrown ball.
b. Find the magnitude of the impulse the batter imparts to the ball.

28.2. For the following, assume an object of mass m kilograms is initially moving along the
X –axis with constant velocity vbefore meters/second until its velocity is changed to vafter
meters/second by a delta function force with impulse I kilogram·meters/second at time
t = α seconds.
a. Find vafter assuming m = 2 , vbefore = −10 and
i. I = 60 ii. I = 100 iii. I = 20
b. Assume m = 0.2 and vbefore = −40 . What impulse I is needed to obtain
i. vafter = 50 ii. vafter = 100 iii. vafter = 0
c. Assume I = 30 , and that the velocity of the object before and after t = α is determined
by radar. What is the mass of the object if
i. vbefore = −10 and vafter = 50 ii. vbefore = 0 and vafter = 15

28.3. Using the results given in theorem 28.1, compute the following integrals
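a. $\int_0^\infty t^2\,\delta_4(t)\,dt$   b. $\int_0^\infty t^2\,\delta(t)\,dt$

c. $\int_0^\infty \cos(t)\,\delta(t)\,dt$   d. $\int_0^\infty \sin(t)\,\delta_{\pi/6}(t)\,dt$

e. $\int_0^\infty t^2\,\mathrm{rect}_{(1,4)}(t)\,\delta_3(t)\,dt$   f. $\int_0^\infty t^2\,\mathrm{rect}_{(1,4)}(t)\,\delta_6(t)\,dt$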

28.4. Prove/derive equation (28.5) on page 548.

28.5. Show that

g ∗ δα (t) = g(t − α) stepα (t)
whenever α ≥ 0 , t > 0 and g is a piecewise continuous function on (0, ∞) .

28.6. Find and sketch the solution over [0, ∞) to each of the following:
a. $y' = 3\delta_2(t)$ with y(0) = 0

b. $y' = \delta_2(t) - \delta_4(t)$ with y(0) = 0
c. $y'' = \delta_3(t)$ with y(0) = 0 and y'(0) = 0
d. $y'' = \delta_1(t) - \delta_4(t)$ with y(0) = 0 and y'(0) = 0
e. $y' + 2y = 4\delta_1(t)$ with y(0) = 0

f. y + y = δ(t) + δπ (t) with y(0) = 0 and y (0) = 0
g. y + y = −2δπ/2 (t) with y(0) = 0 and y (0) = 0

28.7. Find the solution on t > 0 to each of the following initial-value problems:
a. y + 3y = δ2 (t) with y(0) = 2

b. y + 3y = δ(t) with y(0) = 0 and y (0) = 0

c. y + 3y = δ1 (t) with y(0) = 0 and y (0) = 1

d. y + 16y = δ2 (t) with y(0) = 0 and y (0) = 0

e. y − 16y = δ10 (t) with y(0) = 0 and y (0) = 0

f. y + y = δ(t) with y(0) = 0 and y (0) = −1

g. y + 4y − 12y = δ(t) with y(0) = 0 and y (0) = 0

h. y + 4y − 12y = δ3 (t) with y(0) = 0 and y (0) = 0

i. y + 6y + 9y = δ4 (t) with y(0) = 0 and y (0) = 0

j. y − 12y + 45y = δ(t) with y(0) = 0 and y (0) = 0

k. y + 9y = δ1 (t) with y(0) = 0 , y (0) = 0 and y (0) = 0

l. y (4) − 16y = δ(t) with y(0) = y (0) = y (0) = y (0) = 0


Part V
Power Series and Modiﬁed
Power Series Solutions


29
Series Solutions: Preliminaries
(A Brief Review of Inﬁnite Series, Power
Series and a Little Complex Variables)

At this point, you should have no problem in solving any differential equation of the form
\[ a\frac{d^2y}{dx^2} + b\frac{dy}{dx} + cy = 0 \qquad\text{or}\qquad ax^2\frac{d^2y}{dx^2} + bx\frac{dy}{dx} + cy = 0 \]
when a , b and c are all constants. You’ve even solved a few (in chapters 11 and 12) in which a ,
b and/or c were not constants. Unfortunately, the methods used in those chapters are somewhat
limited. More general methods do exist, and, in the next few chapters, we will discuss some of the
more important of these in which solutions are described in terms of “power series” and “modified
power series”.
Ideally, you are already well-enough acquainted with infinite series and power series to jump
right into the discussion of the next chapter. As a precaution, though, you may want to skim through
this chapter. It is a brief review of infinite series with an emphasis on power series, along with a
brief discussion of using complex variables in these series. As much as possible, we’ll limit our
discussion to topics that will be needed in the next few chapters, including a few that probably were
not emphasized during your first exposure to power series.

29.1 Inﬁnite Series

Basic Basics
Recall that, in the language of mathematics, an infinite series is a summation with infinitely many
terms. For example,
\[ \sum_{k=1}^{\infty} \frac{1}{k} = 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \cdots \]
is the infamous harmonic series. More generally, an infinite series is anything of the form
\[ \sum_{k=\gamma}^{\infty} \alpha_k \qquad\text{or (equivalently)}\qquad \alpha_\gamma + \alpha_{\gamma+1} + \alpha_{\gamma+2} + \alpha_{\gamma+3} + \alpha_{\gamma+4} + \cdots \]
where $\gamma$ , the starting index, is some integer (often, it's 0 or 1 ), and the $\alpha_k$'s are things that can be
added together. These $\alpha_k$'s may be numbers, functions or even matrices. For the moment, we will
assume them to be numbers (as in the harmonic series, above).


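Given an arbitrary infinite series
\[ \sum_{k=\gamma}^{\infty} \alpha_k = \alpha_\gamma + \alpha_{\gamma+1} + \alpha_{\gamma+2} + \alpha_{\gamma+3} + \alpha_{\gamma+4} + \cdots \]
and any integer $N \ge \gamma$ , we define the corresponding $N$th partial sum $S_N$ by¹
\[ S_N = \text{sum of all terms from } \alpha_\gamma \text{ to } \alpha_N = \sum_{k=\gamma}^{N} \alpha_k = \alpha_\gamma + \alpha_{\gamma+1} + \alpha_{\gamma+2} + \cdots + \alpha_N . \]
Observe that
\[ \lim_{N\to\infty} S_N = \lim_{N\to\infty} \sum_{k=\gamma}^{N} \alpha_k = \sum_{k=\gamma}^{\infty} \alpha_k = \alpha_\gamma + \alpha_{\gamma+1} + \alpha_{\gamma+2} + \alpha_{\gamma+3} + \alpha_{\gamma+4} + \cdots . \]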

Naturally, the usefulness of an infinite series usually depends on whether it actually adds up to
some finite value; that is, whether
\[ \lim_{N\to\infty} \sum_{k=\gamma}^{N} \alpha_k \]
is some finite value. If the above limit does exist as a finite value, then we say our series converges,
and call that value the sum of that series (freely using $\sum_{k=\gamma}^{\infty} \alpha_k$ to denote both the series and its
sum). On the other hand, if this limit does not exist as a finite value, then we say the series diverges.
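Recall the following simple facts:

1. If $\sum_{k=\gamma}^{\infty} \alpha_k$ converges, then we can closely approximate its sum by any $N$th partial sum,
provided we choose $N$ large enough.

2. If $\sum_{k=\gamma}^{\infty} \alpha_k$ converges, then its terms must shrink to zero as $k$ gets large,
\[ \alpha_k \to 0 \quad\text{as}\quad k \to \infty . \]
Consequently, $\sum_{k=\gamma}^{\infty} \alpha_k$ cannot converge (i.e., must diverge) if the terms do not shrink to
zero as $k$ gets large.
However, it is quite possible to have a series $\sum_{k=\gamma}^{\infty} \alpha_k$ that diverges even though
\[ \alpha_k \to 0 \quad\text{as}\quad k \to \infty . \]
The harmonic series, above, is one example. Even though
\[ \alpha_k = \frac{1}{k} \to 0 \quad\text{as}\quad k \to \infty , \]
you can easily show that the series diverges (to infinity) using the integral test.

3. If $\sum_{k=\gamma}^{\infty} \alpha_k$ and $\sum_{k=\gamma}^{\infty} \beta_k$ are both convergent series, and $A$ and $B$ are any two finite numbers,
then the series $\sum_{k=\gamma}^{\infty} [A\alpha_k]$ and $\sum_{k=\gamma}^{\infty} [A\alpha_k + B\beta_k]$ are also convergent. Moreover,
\[ \sum_{k=\gamma}^{\infty} [A\alpha_k] = A \sum_{k=\gamma}^{\infty} \alpha_k \qquad\text{and}\qquad \sum_{k=\gamma}^{\infty} [A\alpha_k + B\beta_k] = A \sum_{k=\gamma}^{\infty} \alpha_k + B \sum_{k=\gamma}^{\infty} \beta_k . \]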

To illustrate some of the above concepts, and to give us a first glimpse of “power series”, let’s
look at the “geometric series”.
1 It's also standard to define $S_N$ to be the sum of the first $N$ terms. Our choice will be slightly more convenient.


The Geometric Series

Let $x$ be any finite value and let $\gamma$ be any nonnegative integer. The corresponding geometric series
is
\[ \sum_{k=\gamma}^{\infty} x^k = x^\gamma + x^{\gamma+1} + x^{\gamma+2} + x^{\gamma+3} + x^{\gamma+4} + \cdots . \]
If $\gamma = 0$ , we may refer to this as a basic geometric series.²

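Letting $\gamma = 0$ and using, respectively, $x = 0$ , $1/2$ , $-1/2$ , $1$ , $-1$ , $2$ , and $-2$ gives us the
following geometric series:
\[ \sum_{k=0}^{\infty} 0^k = 1 + 0 + 0 + 0 + 0 + \cdots , \]
\[ \sum_{k=0}^{\infty} \left(\frac{1}{2}\right)^k = 1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \frac{1}{16} + \cdots , \]
\[ \sum_{k=0}^{\infty} \left(-\frac{1}{2}\right)^k = 1 - \frac{1}{2} + \frac{1}{4} - \frac{1}{8} + \frac{1}{16} - \cdots , \]
\[ \sum_{k=0}^{\infty} 1^k = 1 + 1 + 1 + 1 + 1 + \cdots , \]
\[ \sum_{k=0}^{\infty} (-1)^k = 1 - 1 + 1 - 1 + 1 - \cdots , \]
\[ \sum_{k=0}^{\infty} 2^k = 1 + 2 + 4 + 8 + 16 + \cdots , \]
and
\[ \sum_{k=0}^{\infty} (-2)^k = 1 - 2 + 4 - 8 + 16 - \cdots . \]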
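It should be obvious that a geometric series $\sum_{k=0}^{\infty} x^k$ converges if $x = 0$ and diverges whenever
$|x| \ge 1$ . It will also be worth noting that
\[ \sum_{k=\gamma}^{\infty} x^k = x^\gamma + x^{\gamma+1} + x^{\gamma+2} + x^{\gamma+3} + x^{\gamma+4} + \cdots = \sum_{k=0}^{\infty} x^{\gamma+k} = \sum_{k=0}^{\infty} x^\gamma x^k = x^\gamma \sum_{k=0}^{\infty} x^k . \]
That is,
\[ \sum_{k=\gamma}^{\infty} x^k = x^\gamma \sum_{k=0}^{\infty} x^k . \tag{29.1} \]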

2 Since $0^0$ is an indeterminate form, it may be argued that there is a problem with the $x^0$ term when $x = 0$ . However, in
every geometric series with $x \ne 0$ , $x^0 = 1$ . So, to be consistent, we automatically assume $x^0 = 1$ in a geometric series
when $x = 0$ .


Geometric series are unusual in that rather simple formulas can be derived for their partial sums.
To see this, let
\[ S_N = \sum_{k=0}^{N} x^k = x^0 + x^1 + x^2 + \cdots + x^N . \]
If $x = 1$ , then
\[ S_N = \sum_{k=0}^{N} 1^k = \underbrace{1 + 1 + 1 + \cdots + 1}_{N+1 \text{ terms}} = N + 1 . \]
If $x \ne 1$ , then
\[ \begin{aligned}
(1 - x)S_N &= S_N - xS_N \\
&= \left[x^0 + x^1 + x^2 + \cdots + x^N\right] - x\left[x^0 + x^1 + x^2 + \cdots + x^N\right] \\
&= \left[1 + x^1 + x^2 + \cdots + x^N\right] - \left[x^1 + x^2 + x^3 + \cdots + x^{N+1}\right] \\
&= 1 - x^{N+1} .
\end{aligned} \]
Dividing by $1 - x$ then gives us
\[ \sum_{k=0}^{N} x^k = S_N = \frac{1 - x^{N+1}}{1 - x} \qquad\text{for}\quad x \ne 1 . \tag{29.2} \]

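!Example 29.1: With $x = 1/2$ , the above formula for $S_N$ becomes
\[ S_N = \frac{1 - \left(\frac{1}{2}\right)^{N+1}}{1 - \frac{1}{2}} = 2\left[1 - \left(\frac{1}{2}\right)^{N+1}\right] . \]
In particular,
\[ S_4 = 2\left[1 - \left(\frac{1}{2}\right)^{4+1}\right] = 2\left[1 - \frac{1}{32}\right] = \frac{31}{16} . \]
Of greater interest is that
\[ \lim_{N\to\infty} S_N = \lim_{N\to\infty} 2\left[1 - \left(\frac{1}{2}\right)^{N+1}\right] = 2\left[1 - \lim_{N\to\infty} \left(\frac{1}{2}\right)^{N+1}\right] = 2[1 - 0] = 2 . \]
Thus, the geometric series with $x = 1/2$ converges, and
\[ \sum_{k=0}^{\infty} \left(\frac{1}{2}\right)^k = \lim_{N\to\infty} S_N = 2 . \]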
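The partial-sum formula (29.2) is easy to check by machine. The following is only a minimal sketch in Python (the function names are ours, chosen for illustration); it uses exact rational arithmetic so that the term-by-term sum and the closed form can be compared without rounding error.

```python
from fractions import Fraction

def partial_sum_direct(x, N):
    """Add x**0 + x**1 + ... + x**N term by term."""
    return sum(x**k for k in range(N + 1))

def partial_sum_formula(x, N):
    """Closed form (1 - x**(N+1)) / (1 - x) from equation (29.2); requires x != 1."""
    if x == 1:
        raise ValueError("equation (29.2) does not apply when x == 1")
    return (1 - x**(N + 1)) / (1 - x)

x = Fraction(1, 2)
for N in (0, 1, 4, 10):
    # Exact rational arithmetic, so the two results must agree exactly.
    assert partial_sum_direct(x, N) == partial_sum_formula(x, N)

print(partial_sum_formula(x, 4))           # 31/16, the value of S_4 with x = 1/2
print(float(partial_sum_formula(x, 50)))   # very close to the limit 2
```

Changing `x` to `Fraction(-1, 2)` gives partial sums that settle near 2/3 instead.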

?Exercise 29.1: Repeat the computations done in the last example, but using $x = -1/2$ . Show
that the corresponding geometric series converges with
\[ \sum_{k=0}^{\infty} \left(-\frac{1}{2}\right)^k = \frac{2}{3} . \]


As you can easily verify for yourself (and as illustrated in the above example and exercise),

\[ \lim_{N\to\infty} x^{N+1} = 0 \qquad\text{whenever}\quad |x| < 1 . \]

This, along with equations (29.2) and (29.1) (and some of the other comments above), leads to the
following:

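Theorem 29.1 (geometric series)
The basic geometric series $\sum_{k=0}^{\infty} x^k$ converges if $|x| < 1$ and diverges if $|x| \ge 1$ . Moreover,
\[ \sum_{k=0}^{\infty} x^k = \frac{1}{1 - x} \qquad\text{for}\quad |x| < 1 . \]
More generally, for any nonnegative integer $\gamma$ , the geometric series $\sum_{k=\gamma}^{\infty} x^k$ converges if and only
if $|x| < 1$ . Moreover,
\[ \sum_{k=\gamma}^{\infty} x^k = \frac{x^\gamma}{1 - x} \qquad\text{for}\quad |x| < 1 . \]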

Absolute Convergence and Convergence Tests

Absolute and Conditional Convergence
Recall that a series $\sum_{k=\gamma}^{\infty} \alpha_k$ can converge “absolutely” or “conditionally”. It converges absolutely
if and only if the corresponding series of absolute values
\[ \sum_{k=\gamma}^{\infty} |\alpha_k| \]
converges. Basically, as the index increases in an absolutely convergent series, the terms shrink
towards zero fast enough to ensure convergence. Consequently, it's easily verified that an absolutely
convergent series is, as the terminology suggests, convergent. Moreover, by repeatedly using the
triangle inequality,
\[ |a + b| \le |a| + |b| , \]
you can easily verify that
\[ \left| \sum_{k=\gamma}^{\infty} \alpha_k \right| \le \sum_{k=\gamma}^{\infty} |\alpha_k| . \]

If a series converges but is not absolutely convergent, then it is converging because each term
“cancels out” some of the previous terms, and the series is said to be conditionally convergent. Such
a convergence is somewhat unstable and can be upset by, say, rearranging the terms of the series in
a clever way. Because of this, we will much prefer series that converge absolutely.
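To see how fragile conditional convergence is, consider the alternating harmonic series $1 - \frac12 + \frac13 - \frac14 + \cdots$ , which converges to $\ln 2$ but not absolutely. The sketch below (Python; the function name and the greedy strategy are our own illustration, not something used elsewhere in this text) reorders the same terms so that the partial sums approach any target we like: take unused positive terms while the running sum is at or below the target, and unused negative terms otherwise.

```python
import math

def rearranged_alternating_harmonic(target, n_terms):
    """Partial sum of a greedy rearrangement of 1 - 1/2 + 1/3 - 1/4 + ...

    Positive terms 1, 1/3, 1/5, ... and negative terms -1/2, -1/4, ... are each
    used in their original order, but interleaved so the sum chases `target`.
    """
    total = 0.0
    next_odd, next_even = 1, 2
    for _ in range(n_terms):
        if total <= target:
            total += 1.0 / next_odd     # next unused positive term
            next_odd += 2
        else:
            total -= 1.0 / next_even    # next unused negative term
            next_even += 2
    return total

# In the usual order the partial sums approach ln 2.
usual = sum((-1) ** (k + 1) / k for k in range(1, 200001))
print(usual, math.log(2))

# The same terms, rearranged, can be steered toward other values.
print(rearranged_alternating_harmonic(1.5, 200000))
print(rearranged_alternating_harmonic(-1.0, 200000))
```

No such trick works for an absolutely convergent series, which is one reason we prefer them.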

Tests for Convergence and Divergence


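It then follows from the comparison test (theorem 29.2) using the above convergent geometric series
that $\sum_{k=\gamma}^{\infty} a_k X^k$ converges for this choice of $X$ . In other words,
\[ |X| < |r| \quad\text{and}\quad \sum_{k=\gamma}^{\infty} a_k r^k \ \text{converges} \quad\Longrightarrow\quad \sum_{k=\gamma}^{\infty} a_k X^k \ \text{converges absolutely} . \]
On the other hand,
\[ 0 < |\rho| < |X| \quad\text{and}\quad \sum_{k=\gamma}^{\infty} a_k \rho^k \ \text{diverges} \quad\Longrightarrow\quad \sum_{k=\gamma}^{\infty} a_k X^k \ \text{diverges} , \]
because if $\sum_{k=\gamma}^{\infty} a_k X^k$ did not diverge, then the very arguments just used in the previous paragraph
would falsely imply that $\sum_{k=\gamma}^{\infty} a_k \rho^k$ converges.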

Letting X = x − x 0 , and taking r as large as possible and/or ρ as small as possible then gives
the existence of the value R (which may be 0 or +∞ ) in the next theorem.

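Theorem 29.3
For each power series $\sum_{k=\gamma}^{\infty} a_k (x - x_0)^k$ , there is an $R$ (which is either 0 , a finite positive value
or $+\infty$ ) such that
\[ |x - x_0| < R \quad\Longrightarrow\quad \sum_{k=\gamma}^{\infty} a_k (x - x_0)^k \ \text{converges absolutely} , \]
while
\[ R < |x - x_0| \quad\Longrightarrow\quad \sum_{k=\gamma}^{\infty} a_k (x - x_0)^k \ \text{diverges} . \]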

The R in the above theorem is called the radius of convergence for the given power series. If
R = 0 , the power series only converges for x = x 0 (which means the series won’t be of much
use); if R = +∞ , the power series converges for all values of x (which is very nice). Otherwise,
the series converges absolutely at every point in the interval (x 0 − R, x 0 + R) . Whether we have
convergence when x = x 0 ± R depends on the particular series, and, frankly, will usually not be of
great concern to us.
The radius of convergence for a given power series can sometimes be determined through careful
use of the formulas in either the limit ratio test or the limit root test. You may recall doing this.
We, however, will discover that the radii of convergence for the power series of interest to us can be
determined much more easily from the “singularities” of whatever differential equation we will be
trying to solve.

Algebra with Power Series and Analytic Functions

Addition
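Adding two power series with the same center and starting index is trivial:
\[ \begin{aligned}
\sum_{k=\gamma}^{\infty} a_k (x - x_0)^k + \sum_{k=\gamma}^{\infty} b_k (x - x_0)^k
&= \sum_{k=\gamma}^{\infty} \left[ a_k (x - x_0)^k + b_k (x - x_0)^k \right] \\
&= \sum_{k=\gamma}^{\infty} [a_k + b_k] (x - x_0)^k .
\end{aligned} \]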

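\[ \sum_{k=0}^{\infty} (k + 1)a_k X^{k+2} = \sum_{n-2=0}^{\infty} ([n - 2] + 1)a_{n-2} X^n = \sum_{n=2}^{\infty} (n - 1)a_{n-2} X^n . \]

For the second, we use $n = k$ and pull out the first two terms,
\[ \sum_{k=0}^{\infty} a_k X^k = \sum_{n=0}^{\infty} a_n X^n = a_0 X^0 + a_1 X^1 + \sum_{n=2}^{\infty} a_n X^n . \]

Thus,
\[ \begin{aligned}
\sum_{k=0}^{\infty} (k + 1)a_k X^{k+2} + \sum_{k=0}^{\infty} a_k X^k
&= \sum_{n=2}^{\infty} (n - 1)a_{n-2} X^n + \left[ a_0 + a_1 X + \sum_{n=2}^{\infty} a_n X^n \right] \\
&= a_0 + a_1 X + \sum_{n=2}^{\infty} (n - 1)a_{n-2} X^n + \sum_{n=2}^{\infty} a_n X^n \\
&= a_0 + a_1 X + \sum_{n=2}^{\infty} \left[ (n - 1)a_{n-2} + a_n \right] X^n .
\end{aligned} \]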
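Index shifts like the one just carried out are a common source of off-by-one slips, so it can be worth checking a result on a handful of sample coefficients. Here is a small sketch in Python (the helper names and the sample coefficients are hypothetical) that collects the coefficient of each $X^n$ directly from the two original series and compares it with the combined formula $(n - 1)a_{n-2} + a_n$ .

```python
def combined_coefficients(a, n_max):
    """Coefficient of X^n, n = 0..n_max, in
    sum (k+1) a_k X^(k+2)  +  sum a_k X^k,
    collected directly from the two series as originally written."""
    c = [0] * (n_max + 1)
    for k, ak in enumerate(a):
        if k + 2 <= n_max:
            c[k + 2] += (k + 1) * ak   # term (k+1) a_k X^(k+2)
        if k <= n_max:
            c[k] += ak                 # term a_k X^k
    return c

def shifted_formula(a, n_max):
    """a_0 and a_1 for n = 0, 1, then (n-1) a_{n-2} + a_n for n >= 2."""
    return [a[n] if n < 2 else (n - 1) * a[n - 2] + a[n]
            for n in range(n_max + 1)]

a = [3, -1, 4, 1, -5, 9, 2, 6]   # arbitrary sample values for a_0, ..., a_7
n_max = len(a) - 1
assert combined_coefficients(a, n_max) == shifted_formula(a, n_max)
print(shifted_formula(a, n_max))
```

Only the coefficients up to $X^7$ are compared, since those are the only ones fully determined by the eight sample values.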

A Basic Equation
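We will often find ourselves with the equation
\[ \sum_{k=0}^{\infty} a_k (x - x_0)^k = 0 \qquad\text{for}\quad |x - x_0| < R , \]
which, in more explicit form (with $X = x - x_0$ ), is
\[ a_0 + a_1 X + a_2 X^2 + a_3 X^3 + \cdots = 0 \qquad\text{for}\quad -R < X < R . \]
Plugging in $X = 0$ gives
\[ a_0 + \underbrace{a_1 0 + a_2 0^2 + a_3 0^3 + \cdots}_{0} = 0 . \]
Hence,
\[ a_0 = 0 , \]
and, for $|X| < R$ ,
\[ \begin{aligned}
& a_0 + a_1 X + a_2 X^2 + a_3 X^3 + \cdots = 0 \\
\longrightarrow\quad & 0 + a_1 X + a_2 X^2 + a_3 X^3 + \cdots = 0 \\
\longrightarrow\quad & X\left[ a_1 + a_2 X + a_3 X^2 + \cdots \right] = 0 .
\end{aligned} \]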

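\[ \begin{aligned}
{} &= \frac{d}{dx} a_0 + \frac{d}{dx}\left[ a_1 (x - x_0) \right] + \frac{d}{dx}\left[ a_2 (x - x_0)^2 \right] + \frac{d}{dx}\left[ a_3 (x - x_0)^3 \right] + \cdots + \frac{d}{dx}\left[ a_k (x - x_0)^k \right] + \cdots \\
&= 0 + a_1 + 2a_2 (x - x_0) + 3a_3 (x - x_0)^2 + \cdots + k a_k (x - x_0)^{k-1} + \cdots \\
&= \sum_{k=1}^{\infty} k\, a_k (x - x_0)^{k-1} .
\end{aligned} \]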

Note that the derivative of the “ $k = 0$ term for $f(x)$ ” is 0 . That is why, in the last series above,
we dropped the $k = 0$ term and started with $k = 1$ . Strictly speaking, this is not necessary. Since
\[ k a_k (x - x_0)^{k-1} = 0 \qquad\text{when}\quad k = 0 , \]
the above series formula for $f'$ would still be valid if it started at $k = 0$ instead of $k = 1$ . Still,
dropping the $k = 0$ term in the above can help prevent some embarrassing mistakes in the sort of
computations we'll be doing in the next chapter.
Repeating the above (in abbreviated form) with the series obtained for $f'(x)$ , we get
\[ f''(x) = \frac{d}{dx} \sum_{k=1}^{\infty} k\, a_k (x - x_0)^{k-1} = \sum_{k=1}^{\infty} \frac{d}{dx}\left[ k\, a_k (x - x_0)^{k-1} \right] = \sum_{k=2}^{\infty} k(k - 1)\, a_k (x - x_0)^{k-2} . \]
Using this, we then have
\[ f'''(x) = \frac{d}{dx} \sum_{k=2}^{\infty} k(k - 1)\, a_k (x - x_0)^{k-2} = \sum_{k=2}^{\infty} \frac{d}{dx}\left[ k(k - 1)\, a_k (x - x_0)^{k-2} \right] = \sum_{k=3}^{\infty} k(k - 1)(k - 2)\, a_k (x - x_0)^{k-3} . \]
Continuing these computations, you end up getting
\[ f^{(n)}(x) = \sum_{k=n}^{\infty} k(k - 1)(k - 2) \cdots (k - n + 1)\, a_k (x - x_0)^{k-n} \]
for any nonnegative integer $n$ .

There is a technical issue with the above computations. The
derivative of a sum = sum of the derivatives
rule from elementary calculus was only shown to be true when the sum had finitely many terms.
Here we have infinitely many terms. In fact, there are infinite series of functions for which this rule
fails. Fortunately, it does not fail for power series, and the following theorem can be rigorously
confirmed (see any good calculus text).

Theorem 29.6 (differentiation of power series)
Suppose $f$ is a function given by a power series with a nonzero radius of convergence $R$ ,
\[ f(x) = \sum_{k=0}^{\infty} a_k (x - x_0)^k \qquad\text{for}\quad |x - x_0| < R . \]


Then, for any positive integer $n$ , the $n$th derivative of $f$ exists. Moreover, $R$ is also the radius of
convergence of the differentiated series, with
\[ f^{(n)}(x) = \sum_{k=n}^{\infty} k(k - 1)(k - 2) \cdots (k - n + 1)\, a_k (x - x_0)^{k-n} \qquad\text{for}\quad |x - x_0| < R . \]
In particular,
\[ f'(x) = \sum_{k=1}^{\infty} k\, a_k (x - x_0)^{k-1} \qquad\text{for}\quad |x - x_0| < R , \]
and
\[ f''(x) = \sum_{k=2}^{\infty} k(k - 1)\, a_k (x - x_0)^{k-2} \qquad\text{for}\quad |x - x_0| < R . \]
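Term-by-term differentiation translates directly into an operation on the list of coefficients: the coefficient $a_k$ of $X^k$ becomes the coefficient $k a_k$ of $X^{k-1}$ . The sketch below (Python; the helper names are ours, and a truncated series stands in for the infinite one) applies this to the familiar series for $e^x$ about 0 , whose coefficients are $1/k!$ , and checks that the differentiated series still evaluates to $e^x$ .

```python
import math

def differentiate(coeffs):
    """Map [a_0, a_1, a_2, ...] to [1*a_1, 2*a_2, 3*a_3, ...]."""
    return [k * a for k, a in enumerate(coeffs)][1:]

def evaluate(coeffs, X):
    """Evaluate the truncated series sum a_k X^k by Horner's rule."""
    total = 0.0
    for a in reversed(coeffs):
        total = total * X + a
    return total

# Truncated series for exp about x0 = 0 (30 terms is an arbitrary cutoff).
exp_coeffs = [1.0 / math.factorial(k) for k in range(30)]
first = differentiate(exp_coeffs)    # series for the first derivative
second = differentiate(first)        # series for the second derivative

X = 0.7
print(evaluate(exp_coeffs, X), evaluate(first, X), evaluate(second, X), math.exp(X))
```

All four printed values should agree to many digits, as expected since $e^x$ is its own derivative.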

Integral analogs to the above theorem also hold. In particular, if
\[ f(x) = \sum_{k=0}^{\infty} a_k (x - x_0)^k \qquad\text{for}\quad |x - x_0| < R , \]
then it can be verified that
\[ \int_{x_0}^{x} f(t)\, dt = \sum_{k=0}^{\infty} \int_{x_0}^{x} a_k (t - x_0)^k\, dt = \sum_{k=0}^{\infty} \frac{a_k}{k + 1} (x - x_0)^{k+1} \]
whenever $|x - x_0| < R$ . This can be a useful fact, though we won't have much need for it.

Power Series for Analytic Functions

As already noted, any function f given by a power series centered at x 0 in some open interval
containing x 0 is said to be analytic at x 0 . If, in addition, f is analytic at every point in some
interval, then we say f is analytic on that interval.
So suppose we have a function $f$ analytic at $x_0$ with
\[ f(x) = \sum_{k=0}^{\infty} a_k (x - x_0)^k \qquad\text{for}\quad |x - x_0| < R \]
for some $R > 0$ . Our ‘differentiation of power series’ theorem (theorem 29.6) tells us that $f$ is, in
fact, “infinitely differentiable” on the interval $(x_0 - R, x_0 + R)$ .⁴ That theorem also allows us to
derive a simple relationship between the $a_k$'s in the series and the derivatives of $f$ at $x_0$ .
Let's derive that relation. First, plugging $x = x_0$ into the above, we get
\[ \begin{aligned}
f(x_0) &= \sum_{k=0}^{\infty} a_k (x_0 - x_0)^k \\
&= a_0 + a_1 (x_0 - x_0) + a_2 (x_0 - x_0)^2 + a_3 (x_0 - x_0)^3 + \cdots \\
&= a_0 + a_1 \cdot 0 + a_2 \cdot 0^2 + a_3 \cdot 0^3 + \cdots \\
&= a_0 .
\end{aligned} \]
4 We say that a function f is infinitely differentiable at a point x if and only if f (n) (x) exists for every positive integer
n , and is infinitely differentiable on a given interval if and only if it is infinitely differentiable at each point in the interval.


Then, using formulas from theorem 29.6, we see that
\[ \begin{aligned}
f'(x_0) &= \sum_{k=1}^{\infty} k\, a_k (x_0 - x_0)^{k-1} \\
&= 1 a_1 + 2 a_2 \cdot 0 + 3 a_3 \cdot 0^2 + 4 a_4 \cdot 0^3 + \cdots \\
&= 1 a_1 ,
\end{aligned} \]
and
\[ \begin{aligned}
f''(x_0) &= \sum_{k=2}^{\infty} k(k - 1)\, a_k (x_0 - x_0)^{k-2} \\
&= 2 \cdot 1\, a_2 + 3 \cdot 2\, a_3 \cdot 0 + 4 \cdot 3\, a_4 \cdot 0^2 + 5 \cdot 4\, a_5 \cdot 0^3 + \cdots \\
&= (2 \cdot 1) a_2 .
\end{aligned} \]
More generally, for any positive integer $n$ ,
\[ \begin{aligned}
f^{(n)}(x_0) &= \sum_{k=n}^{\infty} k(k - 1)(k - 2) \cdots (k - n + 1)\, a_k (x_0 - x_0)^{k-n} \\
&= n(n - 1)(n - 2) \cdots 1\, a_n \\
&\qquad + (n + 1)(n)(n - 1) \cdots 2\, a_{n+1} \cdot 0 + (n + 2)(n + 1)(n) \cdots 3\, a_{n+2} \cdot 0^2 + \cdots \\
&= n!\, a_n .
\end{aligned} \]
Dividing the last relation through by $n!$ and observing that the result also holds when $n = 0$
(interpreting $f^{(0)}$ as $f$ ) gives us the next theorem.

Theorem 29.7
Let $f$ be analytic at $x_0$ . Then, for every $x$ in some open interval containing $x_0$ ,
\[ f(x) = \sum_{k=0}^{\infty} a_k (x - x_0)^k \qquad\text{with}\quad a_k = \frac{f^{(k)}(x_0)}{k!} . \]

As an immediate corollary, we have the following (which will be important when discussing
“power series solutions to initial-value problems”):

Corollary 29.8
Assume
\[ f(x) = \sum_{k=0}^{\infty} a_k (x - x_0)^k \qquad\text{for}\quad |x - x_0| < R \]
with $R > 0$ . Then,
\[ a_0 = f(x_0) \qquad\text{and}\qquad a_1 = f'(x_0) . \]
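For a polynomial, the relation $a_k = f^{(k)}(x_0)/k!$ can be carried out exactly, since a polynomial has only finitely many nonzero derivatives. The following sketch (Python; the helper names and the sample polynomial are hypothetical) re-expands $f(x) = 3 - 2x + 5x^2 + x^3$ in powers of $x - 2$ and confirms that the new series reproduces $f$ .

```python
from fractions import Fraction
from math import factorial

def derivative(coeffs):
    """Coefficients (in powers of x) of the derivative of a polynomial."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def evaluate(coeffs, x):
    """Evaluate sum coeffs[k] * x**k by Horner's rule."""
    total = 0
    for c in reversed(coeffs):
        total = total * x + c
    return total

def taylor_coefficients(coeffs, x0):
    """Return [a_0, a_1, ...] with a_k = f^(k)(x0) / k! for the polynomial f."""
    a, current, k = [], list(coeffs), 0
    while current:                       # stops once the derivative is identically 0
        a.append(Fraction(evaluate(current, x0), factorial(k)))
        current = derivative(current)
        k += 1
    return a

f = [3, -2, 5, 1]                        # f(x) = 3 - 2x + 5x^2 + x^3
a = taylor_coefficients(f, 2)            # expansion about x0 = 2
print(a)

# The re-centered series must agree with f at every x.
for x in (-1, 0, 2, 5):
    assert evaluate(a, x - 2) == evaluate(f, x)
```

Note that the first two coefficients returned are $f(2)$ and $f'(2)$ , in agreement with the corollary.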

You may recognize the series in theorem 29.7, written a little more simply as
\[ \sum_{k=0}^{\infty} \frac{f^{(k)}(x_0)}{k!} (x - x_0)^k , \]


as the Taylor series (formula) for $f(x)$ about $x_0$ , and you may recall having once computed Taylor
series for such functions as $e^x$ , $\sin(x)$ , $\cos(x)$ and $\ln|x|$ . In fact, the Taylor series about any point
$x_0$ can be computed for any function $f$ which is infinitely differentiable at that point. However, a
function can be infinitely differentiable at a point $x_0$ without being analytic there: its Taylor series
exists, but does not equal the function at any point other than $x_0$ . With luck you saw an example,
say,
\[ f(x) = \begin{cases} 0 & \text{if } x = 0 \\ e^{-1/x^2} & \text{if } x \ne 0 \end{cases} , \]
which can be shown to be infinitely differentiable but not analytic at $x_0 = 0$ (see exercise 29.10).
Still, you probably saw that many functions are analytic at many points. You may well have
already verified that such functions as

\[ e^x , \quad e^{-2x^2} , \quad \sin(x) \quad\text{and}\quad \cos(x) \]
are analytic at every point on the real line, and that functions such as
\[ \sqrt{x} \quad\text{and}\quad \ln x \]

are analytic at any point x 0 > 0 . You may have even been given the impression that most functions
typically encountered in “real life” are analytic at every point at which they are infinitely differen-
tiable. In a sense, this is true, though very difficult to confirm using the methods normally developed
in elementary calculus courses. (We’ll return to this issue in chapter 31.)

29.3 Elementary Complex Analysis

Up to now, we’ve acted as if we were only dealing with real numbers in our inﬁnite series. In fact,
just about everything said so far, up to the discussion of Calculus with Power Series, holds even if
the numbers are complex, provided we make some obvious changes in notation and phrasing.5 In
fact, we will later have particular interest in power series in which the variables are complex.

The Complex Plane

Recall that a complex number z is simply something that can be written as

z = x + iy

where $x$ and $y$ are real numbers, and $i$ is a constant satisfying $i^2 = -1$ . Because we'll be using
such expressions so often, let us agree that, unless otherwise noted, in any expression of the form
$z = x + iy$ , both $x$ and $y$ are real numbers.
As you are probably aware, each complex number $z = x + iy$ can be identified with the point
$(x, y)$ in the $XY$–plane. When we do so, we generally refer to the plane as the complex plane and
denote it by $\mathbb{C}$ . The distance between any two points
\[ z_1 = x_1 + iy_1 \qquad\text{and}\qquad z_2 = x_2 + iy_2 \]

5 And, after discussing “complex calculus” in chapter 31, we’ll discover that what was said in Calculus with Power Series
also holds for power series with a complex variable.


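\[ \sin(z) = \sum_{k=0}^{\infty} \frac{(-1)^k}{(2k + 1)!}\, z^{2k+1} \qquad\text{for}\quad |z| < \infty , \tag{29.5d} \]
and
\[ \ln|z| = \sum_{k=1}^{\infty} \frac{(-1)^{k-1}}{k}\, (z - 1)^k \qquad\text{for}\quad |z - 1| < 1 . \tag{29.5e} \]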

There is an issue here that may concern the thoughtful reader: some of the above functions
have formulas other than the above power series for computing their values at complex points. For
example, in chapter 15, we learned of another formula for $e^z$ when $z = x + iy$ , namely,
\[ e^z = e^{x+iy} = e^x \left[ \cos(y) + i \sin(y) \right] . \]
Can we be sure that this formula will give the same result as using the above power series for $e^z$ ,
\[ \sum_{k=0}^{\infty} \frac{1}{k!}\, z^k = \sum_{k=0}^{\infty} \frac{1}{k!}\, (x + iy)^k \ ? \]

Yes, we can. Trust the author on this. And if you don’t feel that trust, turn ahead to section 31.7
(starting on page 647) where we discuss the calculus of functions of a complex variable.
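A numerical experiment is no substitute for the proof, but it can be reassuring. The sketch below (Python; the function names, the 60-term cutoff and the sample point are our own choices) compares a partial sum of the power series for $e^z$ with the value given by $e^x[\cos(y) + i\sin(y)]$ .

```python
import math

def exp_series(z, n_terms=60):
    """Partial sum of sum_{k>=0} z**k / k!, with terms built up recursively."""
    total = 0j
    term = 1 + 0j                 # the k = 0 term, z**0 / 0!
    for k in range(n_terms):
        total += term
        term *= z / (k + 1)       # turns z**k / k! into z**(k+1) / (k+1)!
    return total

def exp_euler(z):
    """e**x * (cos y + i sin y) for z = x + iy."""
    return math.exp(z.real) * complex(math.cos(z.imag), math.sin(z.imag))

z = 1.5 + 2.0j                    # an arbitrary sample point
print(exp_series(z))
print(exp_euler(z))
print(abs(exp_series(z) - exp_euler(z)))   # difference at the level of rounding error
```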
6 If you’ve had a course in complex analysis, you may have seen a different definition for “analyticity”. In section 31.7, we
will find that the two definitions are equivalent.


29.4 Additional Basic Material That May Be Useful

The material in the previous sections will be needed in the next chapter. But there are some additional
facts about series and power series that will be useful later, especially when we get deeper into the
rigorous theory behind the computations that we will be developing. For convenience, we'll provide
some of the more basic general facts here, and develop the more advanced material as needed. It won't
hurt to skip this material initially, provided you return to it as needed.

Two More General Tests for Convergence

The well-known basic comparison test for determining if a given series converges or diverges was
described in theorem 29.2 on page 566 and was used in developing the radius of convergence for
power series. In chapter 31, we will ﬁnd the next test, a clever reﬁnement of the basic comparison
test, to be useful.

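Theorem 29.10 (the limit comparison test)
Let $\sum_{k=\gamma}^{\infty} \alpha_k$ and $\sum_{k=\mu}^{\infty} \beta_k$ be two infinite series, and suppose
\[ \lim_{k\to\infty} \left| \frac{\alpha_k}{\beta_k} \right| \]
exists as either a finite number or as $+\infty$ . Then
\[ \lim_{k\to\infty} \left| \frac{\alpha_k}{\beta_k} \right| < \infty \quad\text{and}\quad \sum_{k=\mu}^{\infty} |\beta_k| \ \text{converges} \quad\Longrightarrow\quad \sum_{k=\gamma}^{\infty} |\alpha_k| \ \text{converges} , \]
while
\[ \lim_{k\to\infty} \left| \frac{\alpha_k}{\beta_k} \right| > 0 \quad\text{and}\quad \sum_{k=\mu}^{\infty} |\beta_k| \ \text{diverges} \quad\Longrightarrow\quad \sum_{k=\gamma}^{\infty} |\alpha_k| \ \text{diverges} . \]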

Under certain conditions, you can use the limit of the ratio of the consecutive terms of a single
series to construct a geometric series that can serve as a second series in the above limit comparison
test. That leads to a third test, which will be used near the end of chapter 34.

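Theorem 29.11 (the limit ratio test)
Let $\sum_{k=\gamma}^{\infty} \alpha_k$ be an infinite series, and suppose
\[ \lim_{k\to\infty} \left| \frac{\alpha_{k+1}}{\alpha_k} \right| \]
exists as either a finite number or as $+\infty$ . Then
\[ \lim_{k\to\infty} \left| \frac{\alpha_{k+1}}{\alpha_k} \right| < 1 \quad\Longrightarrow\quad \sum_{k=\gamma}^{\infty} \alpha_k \ \text{converges absolutely} , \]
while
\[ \lim_{k\to\infty} \left| \frac{\alpha_{k+1}}{\alpha_k} \right| > 1 \quad\Longrightarrow\quad \sum_{k=\gamma}^{\infty} \alpha_k \ \text{diverges} . \]
(If the limit is 1 , there is no conclusion.)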


The derivations of the above tests can be found in any reasonable elementary calculus text.
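To get a feel for the limit ratio test, it can help to tabulate the ratio of consecutive terms for a few series and watch where it settles. The sketch below (Python; the function name and the three sample series are our own choices) only suggests the value of the limit numerically and proves nothing by itself.

```python
def consecutive_ratios(term, ks):
    """Return |term(k+1) / term(k)| for each index k in ks."""
    return [abs(term(k + 1) / term(k)) for k in ks]

ks = (10, 100, 400)

# alpha_k = k / 2^k : ratios approach 1/2 < 1, so the series converges absolutely.
print(consecutive_ratios(lambda k: k / 2**k, ks))

# alpha_k = 2^k / k^2 : ratios approach 2 > 1, so the series diverges.
print(consecutive_ratios(lambda k: 2**k / k**2, ks))

# alpha_k = 1 / k (harmonic series): ratios approach 1, so the test says nothing.
print(consecutive_ratios(lambda k: 1 / k, ks))
```

The last case is a reminder that a limit of 1 is genuinely inconclusive: the harmonic series diverges, while $\sum 1/k^2$ has the same limiting ratio and converges.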

More on Algebra with Power Series and Analytic Functions

Multiplication
The following — a straightforward extension of a basic formula for computing products of polyno-
mials — is worth a brief mention, especially since we will need it in chapter 31.

Theorem 29.12
The product of two power series centered at the same point is another power series whose radius
of convergence is at least as large as the smallest radius of convergence of the original two series.
Moreover,
\[ \left( \sum_{k=0}^{\infty} a_k (z - z_0)^k \right) \left( \sum_{k=0}^{\infty} b_k (z - z_0)^k \right) = \sum_{k=0}^{\infty} c_k (z - z_0)^k \]
with
\[ c_k = a_0 b_k + a_1 b_{k-1} + \cdots + a_k b_0 = \sum_{j=0}^{k} a_j b_{k-j} . \]
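The formula for $c_k$ is just the rule for multiplying polynomials, applied coefficient by coefficient. As a sanity check, the sketch below (Python; the function name is ours) multiplies the first few terms of the series for $e^z$ and $e^{-z}$ about 0 . Since $e^z e^{-z} = 1$ , every product coefficient after the first should vanish, and exact rational arithmetic lets us confirm this without rounding error.

```python
from fractions import Fraction
from math import factorial

def cauchy_product(a, b):
    """c_k = sum_{j=0}^{k} a_j * b_{k-j}, for as many k as both lists support."""
    n = min(len(a), len(b))
    return [sum(a[j] * b[k - j] for j in range(k + 1)) for k in range(n)]

n = 8                                                           # number of coefficients kept
exp_pos = [Fraction(1, factorial(k)) for k in range(n)]         # series for e^z
exp_neg = [Fraction((-1) ** k, factorial(k)) for k in range(n)] # series for e^(-z)

product = cauchy_product(exp_pos, exp_neg)
print(product)
assert product == [1] + [0] * (n - 1)
```

Because $c_k$ involves only $a_0, \ldots, a_k$ and $b_0, \ldots, b_k$ , truncating both series at the same length does not affect the coefficients that are computed.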

Factoring a Power Series/Analytic Function

On occasion, it will be convenient to “factor out” factors of the form $(z - z_0)^m$ from a function $f$
analytic at $z_0$ . Our ability to do this follows immediately from the fact that, being analytic at $z_0$ ,
$f$ is given by some power series with a nonzero radius of convergence $R$ ,
\[ f(z) = \sum_{k=0}^{\infty} a_k (z - z_0)^k \qquad\text{for}\quad |z - z_0| < R . \]
Note that
\[ f(z_0) = a_0 + a_1 (z_0 - z_0) + a_2 (z_0 - z_0)^2 + \cdots = a_0 , \]
telling us that $f(z_0) = 0$ if and only if $a_0 = 0$ .
Let's go a little further and assume $f(z_0) = 0$ . Then, as just noted, $a_0 = 0$ . Moreover,
\[ \begin{aligned}
f(z) &= a_0 + a_1 (z - z_0) + a_2 (z - z_0)^2 + a_3 (z - z_0)^3 + \cdots \\
&= 0 + (z - z_0) \left[ a_1 + a_2 (z - z_0) + a_3 (z - z_0)^2 + \cdots \right] \\
&= (z - z_0) \sum_{k=1}^{\infty} a_k (z - z_0)^{k-1} \qquad\text{for}\quad |z - z_0| < R .
\end{aligned} \]
Of course, we could also have $a_1 = 0$ , in which case we can repeat the above to obtain
\[ f(z) = (z - z_0)^2 \sum_{k=2}^{\infty} a_k (z - z_0)^{k-2} \qquad\text{for}\quad |z - z_0| < R . \]
Continuing until we finally reach a nonzero coefficient (assuming $f$ is nontrivial), we get, for some
positive integer $m$ ,
\[ f(z) = (z - z_0)^m \sum_{k=m}^{\infty} a_k (z - z_0)^{k-m} \qquad\text{with}\quad a_m \ne 0 , \]


which, after letting $b_n = a_{n+m}$ , can be written as
\[ f(z) = (z - z_0)^m \sum_{n=0}^{\infty} b_n (z - z_0)^n \qquad\text{with}\quad b_0 \ne 0 . \]
Thus, setting
\[ f_0(z) = \sum_{n=0}^{\infty} b_n (z - z_0)^n \]
and noting that, even if $a_0 \ne 0$ , it is trivially true that
\[ f(z) = (z - z_0)^0 f(z) , \]
we get

Lemma 29.13
Let $f$ be a nontrivial function analytic at $z_0$ . Then there is a nonnegative integer $m$ such that
\[ f(z) = (z - z_0)^m f_0(z) \]
where $f_0$ is a function analytic at $z_0$ with $f_0(z_0) \ne 0$ . Moreover:

1. $f(z_0) = 0$ if and only if $m > 0$ .

2. The power series for $f$ and $f_0$ about $z_0$ have the same radii of convergence.

It is standard to refer to any point $z_0$ as a zero for an analytic function $f$ if $f(z_0) = 0$ . It is also
standard to refer to the $m$ described in the last lemma as the multiplicity of the zero $z_0$ .

Quotients of Analytic Functions

There is a technical issue that may arise when deﬁning a function h as the quotient of two functions
analytic at a given point.

!Example 29.4: Consider defining $h = f/g$ when $f$ and $g$ are the polynomials
\[ f(z) = z^2 - 1 \qquad\text{and}\qquad g(z) = z - 1 . \]
For any value of $z$ other than $z = 1$ , we simply have
\[ h(z) = \frac{f(z)}{g(z)} = \frac{z^2 - 1}{z - 1} , \]
which we can rewrite more simply by dividing out the common factor,
\[ h(z) = \frac{z^2 - 1}{z - 1} = \frac{(z - 1)(z + 1)}{z - 1} = z + 1 . \]
These two formulas for $h(z)$ give identical results whenever $z \ne 1$ . However, we get two
different results if we plug in $z = 1$ :
\[ h(1) = \frac{1^2 - 1}{1 - 1} = \frac{0}{0} \qquad\text{and}\qquad h(1) = 1 + 1 = 2 . \]
The first expression is indeterminate, and is problematic in practical applications. The second is
a finite number, and is clearly what we want to use for $h(1)$ , especially since:


1. It comes from a simpler formula for $h(z)$ when $z \ne 1$ .

2. It is the same as the value we would obtain using the obvious limit,
\[ \lim_{z\to 1} h(z) = \lim_{z\to 1} \frac{f(z)}{g(z)} = \lim_{z\to 1} \frac{z^2 - 1}{z - 1} = \cdots = 2 , \]
computed either using L'Hôpital's rule or the formula obtained by dividing out the common
factors. Hence, $h$ is continuous at $z = 1$ .

More generally, when we define a function $h$ as the quotient of two other continuous functions,
$h = f/g$ , we automatically mean the function given by
\[ h(z_0) = \frac{f(z_0)}{g(z_0)} \]
at each point $z_0$ in the common domain of $f$ and $g$ at which $g(z_0) \ne 0$ ; and, provided the limit
exists as a finite number, by
\[ h(z_0) = \lim_{z\to z_0} \frac{f(z)}{g(z)} \]
at each point $z_0$ in the common domain of $f$ and $g$ at which $g(z_0) = 0$ . (If the above limit does
not exist, we simply accept that $h$ is not defined at that $z_0$ .)
We should note that using the limit when $f$ and $g$ are analytic and zero at $z_0$ yields exactly
the same as if we were to divide out any common factors of $z - z_0$ in the quotient. After all, if $f$
and $g$ are analytic, then (as shown in the previous subsection) we can rewrite $f(z)$ and $g(z)$ as
\[ f(z) = (z - z_0)^m f_0(z) \qquad\text{and}\qquad g(z) = (z - z_0)^n g_0(z) \]
where $m$ and $n$ are nonnegative integers, and $f_0$ and $g_0$ are functions analytic at $z_0$ with
$f_0(z_0) \ne 0$ and $g_0(z_0) \ne 0$ . Hence,
\[ \frac{f(z)}{g(z)} = \frac{(z - z_0)^m f_0(z)}{(z - z_0)^n g_0(z)} = (z - z_0)^{m-n}\, \frac{f_0(z)}{g_0(z)} , \tag{29.6} \]
and
\[ \lim_{z\to z_0} \frac{f(z)}{g(z)} = \lim_{z\to z_0} (z - z_0)^{m-n}\, \frac{f_0(z_0)}{g_0(z_0)} , \]
which is finite if and only if $m \ge n$ . Since this observation will later be used, let us make it a lemma.

Lemma 29.14
Let $f$ and $g$ be two functions analytic at $z_0$ . Then
\[ \lim_{z\to z_0} \frac{f(z)}{g(z)} \]
exists and is finite if and only if there is a nonnegative integer $N$ and an $R > 0$ such that
\[ \frac{f(z)}{g(z)} = (z - z_0)^N\, \frac{f_0(z)}{g_0(z)} \]
where $f_0$ and $g_0$ are functions analytic at $z_0$ with $f_0(z_0) \ne 0$ and $g_0(z_0) \ne 0$ .
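In terms of power series coefficients about $z_0$ , the multiplicity of a zero is simply the index of the first nonzero coefficient, and the limit of a quotient can be read off from the two multiplicities. The sketch below (Python; the function names and the list-of-coefficients representation are our own illustration) applies this to $f(z) = z^2 - 1$ and $g(z) = z - 1$ about $z_0 = 1$ , where $f(z) = 2(z - 1) + (z - 1)^2$ .

```python
from fractions import Fraction

def multiplicity(coeffs):
    """Index m of the first nonzero coefficient in [a_0, a_1, a_2, ...]."""
    for m, c in enumerate(coeffs):
        if c != 0:
            return m
    raise ValueError("the series is identically zero (trivial function)")

def quotient_limit(f, g):
    """Limit of f/g as z -> z0, from the coefficients of f and g about z0.

    Returns None when the limit is not finite (m < n), 0 when m > n,
    and a_m / b_n when the multiplicities m and n are equal.
    """
    m, n = multiplicity(f), multiplicity(g)
    if m < n:
        return None
    if m > n:
        return 0
    return Fraction(f[m], g[n])

f = [0, 2, 1]    # z^2 - 1 = 0 + 2(z - 1) + (z - 1)^2
g = [0, 1]       # z - 1
print(multiplicity(f), multiplicity(g))   # both zeros have multiplicity 1
print(quotient_limit(f, g))               # the value assigned to h(1)
```

With equal multiplicities the common factor $(z - z_0)^m$ cancels, leaving $f_0(z_0)/g_0(z_0) = a_m/b_n$ .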


Partial Sum Approximations with Taylor Series

Another approach to deriving the Taylor series formula for a function $f$ analytic at a point $x_0$ starts
with the well-known equality
\[ f(x) - f(x_0) = \int_{x_0}^{x} f'(s)\, ds . \]
Solving for $f(x)$ and using a “clever” integration by parts (with $v = s - x$ ) yields
\[ \begin{aligned}
f(x) &= f(x_0) + \int_{x_0}^{x} \underbrace{f'(s)}_{u}\, \underbrace{ds}_{dv} \\
&= f(x_0) + \Big[ \underbrace{f'(s)}_{u} \underbrace{(s - x)}_{v} \Big]_{s=x_0}^{x} - \int_{x_0}^{x} \underbrace{(s - x)}_{v}\, \underbrace{f''(s)\, ds}_{du} \\
&= f(x_0) + f'(x)(x - x) - f'(x_0)(x_0 - x) - \int_{x_0}^{x} (s - x) f''(s)\, ds \\
&= f(x_0) + 0 + f'(x_0)(x - x_0) - \int_{x_0}^{x} (s - x) f''(s)\, ds .
\end{aligned} \]

Repeating this again and again with similar “clever” uses of integration by parts ultimately leads to
\[ f(x) = P_N(x) + E_N(x) \qquad\text{for}\quad N = 1, 2, 3, \ldots \tag{29.7a} \]
where $P_N(x)$ is the $N$th degree Taylor polynomial,
\[ P_N(x) = \sum_{k=0}^{N} \frac{f^{(k)}(x_0)}{k!} (x - x_0)^k , \tag{29.7b} \]
and $E_N(x)$ is the corresponding remainder term,
\[ E_N(x) = (-1)^N \frac{1}{N!} \int_{x_0}^{x} f^{(N+1)}(s)\, (s - x)^N\, ds . \tag{29.7c} \]
Note that $P_N(x)$ is the $N$th partial sum for the power series for $f$ about $x_0$ , and $E_N(x)$ is the
error in using this partial sum in place of $f(x)$ .
Since we are assuming $f$ is analytic at $x_0$ , we know that
\[ f(x) = \sum_{k=0}^{\infty} \frac{f^{(k)}(x_0)}{k!} (x - x_0)^k = \lim_{N\to\infty} P_N(x) \qquad\text{whenever}\quad |x - x_0| < R , \]
where $R$ is the radius of convergence for the Taylor series. This, in turn, means that
\[ \lim_{N\to\infty} E_N(x) = \lim_{N\to\infty} \left[ f(x) - P_N(x) \right] = 0 \qquad\text{whenever}\quad |x - x_0| < R . \]
Conversely, in theory, you can verify that $f$ truly is analytic at $x_0$ and that its power series at that
point has a radius of convergence of at least $R$ by verifying that $E_N(x) \to 0$ as $N \to \infty$ whenever
$x_0 - R < x < x_0 + R$ .
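The shrinking of the error $E_N(x) = f(x) - P_N(x)$ is easy to observe numerically for a function whose derivatives are known in closed form. The sketch below (Python; the function name and the choice $f(x) = e^x$ with $x_0 = 0$ and $x = 2$ are ours) uses the fact that every derivative of $e^x$ at $x_0$ equals $e^{x_0}$ .

```python
import math

def taylor_polynomial_exp(x, N, x0=0.0):
    """P_N(x) for f = exp about x0; here f^(k)(x0) = exp(x0) for every k."""
    return sum(math.exp(x0) * (x - x0) ** k / math.factorial(k)
               for k in range(N + 1))

x = 2.0
degrees = (1, 2, 4, 8, 16)
errors = [abs(math.exp(x) - taylor_polynomial_exp(x, N)) for N in degrees]

for N, err in zip(degrees, errors):
    print(N, err)      # the error |E_N(2)| drops rapidly as N grows
```

Observing the error shrink for a few values of $N$ is, of course, only suggestive. Showing that $E_N(x) \to 0$ for every $x$ in an interval requires an actual estimate of the remainder.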
In practice, computing $E_N(x)$ is rarely practical. Because of this, a slightly more usable error
estimate in terms of upper bounds on the derivatives of $f$ is often described in textbooks. For our
purposes, let $[a, b]$ be some closed subinterval with
\[ x_0 - R < a < x_0 < b < x_0 + R . \]

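29.3. Verify each of the following equations:

a. $\sum_{k=0}^{\infty} \frac{1}{k+1}\, x^k + \sum_{k=2}^{\infty} \frac{1}{k-1}\, x^k = 1 + \frac{1}{2} x + \sum_{k=2}^{\infty} \frac{2k}{k^2 - 1}\, x^k$

b. $\sum_{k=0}^{\infty} \left( k^2 + 9 \right) x^k - 6 \sum_{k=1}^{\infty} k x^k = 9 + \sum_{k=1}^{\infty} (k - 3)^2 x^k$

c. $\sum_{k=2}^{\infty} (k - 1) x^{k-2} = \sum_{n=0}^{\infty} (n + 1) x^n$

d. $x \sum_{k=2}^{\infty} (k - 1) x^{k-2} = \sum_{n=1}^{\infty} n x^n$

e. $\sum_{k=0}^{\infty} (k + 1) x^{k+1} - \sum_{k=4}^{\infty} (k - 1) x^{k-1} = x + 2x^2$

f. $x^3 \sum_{k=0}^{\infty} a_k x^k = \sum_{n=3}^{\infty} a_{n-3} x^n$

g. $\left( x^2 + 5 \right) \sum_{k=0}^{\infty} a_k x^k = 5a_0 + 5a_1 x + \sum_{n=2}^{\infty} \left[ a_{n-2} + 5a_n \right] x^n$

h. $\sum_{k=2}^{\infty} k(k - 1) a_k x^{k-2} - 3 \sum_{k=0}^{\infty} a_k x^k = \sum_{n=0}^{\infty} \left[ (n + 2)(n + 1) a_{n+2} - 3a_n \right] x^n$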

29.4. Rewrite each of the following expressions as a single power series centered at a point $x_0$ ,
with the index being the order of each term. That is, if $n$ is the index, then each term should
be of the form
\[ [\text{formula not involving } x] \times (x - x_0)^n . \]
In most cases, $x_0 = 0$ . And, in some cases, the first few terms will have to be written
separately. Simplify your expressions as much as practical.

a. $\sum_{k=0}^{\infty} \frac{1}{k+1}\, x^k - \sum_{k=1}^{\infty} \frac{1}{k}\, x^k$ b. $x + \frac{1}{2} x^2 + \frac{1}{3} x^3 + \sum_{k=1}^{\infty} \frac{(-1)^k}{k}\, x^k$

c. $\sum_{k=1}^{\infty} 3k^2 (x - 5)^{k+3}$ d. $x \sum_{k=2}^{\infty} k(k - 1) x^{k-2}$

e. $(x - 3) \sum_{k=2}^{\infty} k(k - 1) x^{k-2}$ f. $x \sum_{k=2}^{\infty} k(k - 1)(x - 3)^{k-2}$

g. $\sum_{k=1}^{\infty} k^2 a_k x^{k+3}$ h. $(x - 1)^2 \sum_{k=0}^{\infty} a_k (x - 1)^k$

i. $\sum_{k=0}^{\infty} (k + 1) a_k x^{k+1} - \sum_{k=4}^{\infty} (k - 1) a_k x^{k-1}$ j. $\sum_{k=1}^{\infty} k a_k x^{k-1} + 5 \sum_{k=0}^{\infty} a_k x^k$


k. $x^2 \sum_{k=2}^{\infty} k(k - 1) a_k x^{k-2} - 4 \sum_{k=0}^{\infty} a_k x^k$ l. $\sum_{k=2}^{\infty} k(k - 1) a_k x^{k-2} - 3x^2 \sum_{k=0}^{\infty} a_k x^k$

29.5. On page 565, we saw that
\[ \frac{1}{1 - x} = \sum_{k=0}^{\infty} x^k \qquad\text{for}\quad |x| < 1 . \]
By differentiating this, find the power series about 0 for each of the following:
a. $\dfrac{1}{(1 - x)^2}$ b. $\dfrac{1}{(1 - x)^3}$

29.6. Find the Taylor series about $x_0$ for each of the following:
a. $e^x$ with $x_0 = 0$ b. $\cos(x)$ with $x_0 = 0$
c. $\sin(x)$ with $x_0 = 0$ d. $\ln|x|$ with $x_0 = 1$

29.7 a. Using the Taylor series formula from theorem 29.7, find the fourth partial sum of the
power series about 0 for
\[ f(x) = \sqrt{1 + x} . \]
b. Using the results from the previous part along with a simple substitution, find the first five
terms of the power series about 0 for
\[ g(x) = \sqrt{1 - x^2} . \]
c. Let
\[ g(x) = \sqrt{1 - x^2} \qquad\text{and}\qquad h(x) = \frac{1}{\sqrt{1 - x^2}} . \]
Verify that
\[ h(x) = -\frac{g'(x)}{x} , \]
and using this along with the results from the previous part, find the first four terms of the
power series about 0 for $h(x)$ .

29.8 a. Using your favorite computer mathematics package (e.g., Maple or Mathematica), along
with the Taylor series formula from theorem 29.7, write a program/worksheet that will find
the first N coefficients in the power series about x 0 for f where x 0 is any given point
on the real line, f is any function analytic at x 0 , and N is any given positive integer.
Also, have your program/worksheet write out the corresponding N th -degree partial sum
of this power series. Be sure to write your program/worksheet so that N , x 0 and f are
easily changed.
b. Use your program/worksheet with each of the following choices of f , x 0 and N to find
the N th -degree polynomial about x 0 for f .
i. $f(x) = e^{2x}$ with $x_0 = 0$ and $N = 9$
ii. $f(x) = \dfrac{1}{\cos(x)}$ with $x_0 = 0$ and $N = 11$
iii. $f(x) = \sqrt{2x^2 + 1}$ with $x_0 = 2$ and $N = 7$


29.9. We saw that
\[ \frac{1}{1-x} = \sum_{k=0}^{\infty} x^k \quad\text{for } |x| < 1 . \]
Replacing $x$ with $-x$,
\[ \frac{1}{1-(-x)} = \sum_{k=0}^{\infty} (-x)^k \quad\text{for } |-x| < 1 , \]
gives us a power series formula for $(1+x)^{-1}$,
\[ \frac{1}{1+x} = \sum_{k=0}^{\infty} (-x)^k \quad\text{for } |x| < 1 . \]
Find a power series representation (and its radius of convergence $R$) for each of the following by replacing the $x$ in some of the “known” power series from exercises 29.5 and 29.6, above, with a suitable formula of $x$, as just done above.
a. $\dfrac{1}{1 - 2x}$
b. $\dfrac{1}{1 + x^2}$
c. $\dfrac{2}{2 - x}$
d. $\dfrac{2}{(2 - x)^2}$
e. $e^{-x^2}$
f. $\sin\left(x^2\right)$

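29.10. Let
\[ f(x) = \begin{cases} 0 & \text{if } x = 0 \\ e^{-1/x^2} & \text{if } x \neq 0 \end{cases} . \]

a. Verify that
\[ f'(x) = \begin{cases} 0 & \text{if } x = 0 \\ \dfrac{2}{x^3}\, e^{-1/x^2} & \text{if } x \neq 0 \end{cases} . \]
Note: The derivative at $x = 0$ should be computed using the basic definition
\[ f'(0) = \lim_{\Delta x \to 0} \frac{f(0 + \Delta x) - f(0)}{\Delta x} . \]

b. Verify that
\[ f''(x) = \begin{cases} 0 & \text{if } x = 0 \\ e^{-1/x^2}\, \dfrac{1}{x^6} \left[ 4 - 6x^2 \right] & \text{if } x \neq 0 \end{cases} . \]

c. Continuing, it can be shown that, for any positive integer $k$,
\[ f^{(k)}(x) = \begin{cases} 0 & \text{if } x = 0 \\ e^{-1/x^2}\, \dfrac{1}{x^{3k}}\, p_k(x) & \text{if } x \neq 0 \end{cases} \]
where $p_k(x)$ is some nonzero polynomial. Using this fact, write out the Taylor series for $f$ about 0.

d. Why is $f$ not analytic at 0 even though it is infinitely differentiable at 0?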


30
Power Series Solutions I: Basic
Computational Methods

When a solution to a differential equation is analytic at a point, then that solution can be represented
by a power series about that point. In this and the next chapter, we will discuss when this can be
expected, and how we might employ this fact to obtain usable power series formulas for the solutions
to various differential equations. In this chapter, we will concentrate on two basic methods — an
“algebraic method” and a “Taylor series method” — for computing our power series. Our main
interest will be in the algebraic method. It is more commonly used and is the method we will extend
in chapter 32 to obtain “modiﬁed” power series solutions when we do not quite have the desired
analyticity. But the algebraic method is not well suited for solving all types of differential equations,
especially when the differential equations in question are not linear. For that reason (and others) we
will also introduce the Taylor series method near the end of this chapter.

30.1 Basics
General Power Series Solutions
If it exists, a power series solution for a differential equation is just a power series formula
\[ y(x) = \sum_{k=0}^{\infty} a_k (x - x_0)^k \]
for a solution $y$ to the given differential equation in some open interval containing $x_0$. The series is a general power series solution if it describes all possible solutions in that interval.
As noted in the last chapter (corollary 29.8 on page 574), if $y(x)$ is given by the above power series, then
\[ a_0 = y(x_0) \quad\text{and}\quad a_1 = y'(x_0) . \]
Because a general solution to a first-order differential equation normally has one arbitrary constant, we should expect a general power series solution to a first-order differential equation to also have a single arbitrary constant. And since that arbitrary constant can be determined by a given initial value $y(x_0)$, it makes sense to use $a_0$ as that arbitrary constant.
On the other hand, a general solution to a second-order differential equation usually has two arbitrary constants, and they are normally determined by initial values $y(x_0)$ and $y'(x_0)$. Consequently, we should expect the first two coefficients, $a_0$ and $a_1$, to assume the roles of the arbitrary constants in our general power series solutions for second-order differential equations.


The Two Methods, Brieﬂy

The basic ideas of both the “algebraic method” and the “Taylor series method” are fairly simple.

The Algebraic Method

The algebraic method starts by assuming the solution $y$ can be written as a power series
\[ y(x) = \sum_{k=0}^{\infty} a_k (x - x_0)^k \]
with the $a_k$’s being constants to be determined. This formula for $y$ is then plugged into the differential equation. By using a lot of algebra and only a little calculus, we then “simplify” the resulting equation until it looks something like
\[ \sum_{n=0}^{\infty} \left[ n\text{th formula of the } a_k\text{’s} \right] x^n = 0 . \]
As we saw in the last chapter, this means
\[ n\text{th formula of the } a_k\text{’s} = 0 \quad\text{for } n = 0, 1, 2, 3, \ldots , \]
which (as we will see) can be used to determine all the $a_k$’s in terms of one or two arbitrary constants. Plugging these $a_k$’s back into the series then gives the power series solution to our differential equation about the point $x_0$.
We will outline the details for this method in the next two sections for ﬁrst- and second-order
homogeneous linear differential equations

\[ a(x)y' + b(x)y = 0 \quad\text{and}\quad a(x)y'' + b(x)y' + c(x)y = 0 \]

in which the coefficients are rational functions. These are the equations for which the method is
especially well suited.1 For pedagogic reasons, we will deal with first-order equations first, and then
expand our discussion to include second-order equations. It should then be clear that this approach
can easily be extended to solve higher-order analogs of the equations discussed here.

The Taylor Series Method

The basic ideas behind the Taylor series method are even easier to describe. We simply use the given
differential equation to ﬁnd the values of all the derivatives of the solution y(x) when x = x 0 , and
then plug these values into the formula for the Taylor series for y about x 0 (see corollary 29.7 on
page 574). Details will be laid out in section 30.6.

1 Recall that a rational function is a function that can be written as one polynomial divided by another polynomial. Actually,
in theory at least, the algebraic method is “well suited” for a somewhat larger class of ﬁrst- and second-order linear
differential equations. We’ll discuss this in the next chapter.


(Remember that the derivative of the $a_0$ term is zero. Explicitly dropping this zero term in the series for $y'$ is not necessary, but can simplify bookkeeping later.)
Since we’ve already decided $x_0 = 0$, we assume
\[ y = y(x) = \sum_{k=0}^{\infty} a_k x^k , \tag{30.4} \]
and compute
\[ y' = \frac{d}{dx} \sum_{k=0}^{\infty} a_k x^k = \sum_{k=0}^{\infty} \frac{d}{dx}\left[ a_k x^k \right] = \sum_{k=1}^{\infty} k a_k x^{k-1} . \]

Step 2: Plug the series for $y$ and $y'$ back into the differential equation and “multiply things out”. (If $x_0 \neq 0$, see the comments on page 597.)
Some notes:

i. Absorb any $x$’s from $A(x)$ and $B(x)$ into the series.

ii. Your goal is to get an equation in which zero equals the sum of a few power series about $x_0$.

Using the above series with the given differential equation, we get
\[
\begin{aligned}
0 &= (x - 2)y' + 2y \\
  &= (x - 2) \sum_{k=1}^{\infty} k a_k x^{k-1} + 2 \sum_{k=0}^{\infty} a_k x^k \\
  &= x \sum_{k=1}^{\infty} k a_k x^{k-1} - 2 \sum_{k=1}^{\infty} k a_k x^{k-1} + 2 \sum_{k=0}^{\infty} a_k x^k \\
  &= \sum_{k=1}^{\infty} k a_k x^{k} + \sum_{k=1}^{\infty} (-2) k a_k x^{k-1} + \sum_{k=0}^{\infty} 2 a_k x^k .
\end{aligned}
\]

Step 3: For each series in your last equation, do a change of index⁴ so that each series looks like
\[ \sum_{n=\text{something}}^{\infty} \left[ \text{something not involving } x \right] (x - x_0)^n . \]
Be sure to appropriately adjust the lower limit in each series.

4 See Changing the Index on page 569.

The Algebraic Method with First-Order Equations 591

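In all but the second series in the example, the “change of index” is trivial ($n = k$). In the second series, we set $n = k - 1$ (equivalently, $k = n + 1$):
\[
\begin{aligned}
0 &= \underbrace{\sum_{k=1}^{\infty} k a_k x^{k}}_{n = k} + \underbrace{\sum_{k=1}^{\infty} (-2) k a_k x^{k-1}}_{n = k-1} + \underbrace{\sum_{k=0}^{\infty} 2 a_k x^k}_{n = k} \\
  &= \sum_{n=1}^{\infty} n a_n x^{n} + \sum_{n+1=1}^{\infty} (-2)(n+1) a_{n+1} x^{n} + \sum_{n=0}^{\infty} 2 a_n x^n \\
  &= \sum_{n=1}^{\infty} n a_n x^{n} + \sum_{n=0}^{\infty} (-2)(n+1) a_{n+1} x^{n} + \sum_{n=0}^{\infty} 2 a_n x^n .
\end{aligned}
\]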

Step 4: Convert the sum of series in your last equation into one big series. The first few terms will probably have to be written separately. Go ahead and simplify what can be simplified.
Since one of the series in the last equation begins with $n = 1$, we need to separate out the terms corresponding to $n = 0$ in the other series before combining series:
\[
\begin{aligned}
0 &= \sum_{n=1}^{\infty} n a_n x^{n} + \sum_{n=0}^{\infty} (-2)(n+1) a_{n+1} x^{n} + \sum_{n=0}^{\infty} 2 a_n x^n \\
  &= \sum_{n=1}^{\infty} n a_n x^{n} + \left[ (-2)(0+1) a_{0+1} x^0 + \sum_{n=1}^{\infty} (-2)(n+1) a_{n+1} x^{n} \right] + \left[ 2 a_0 x^0 + \sum_{n=1}^{\infty} 2 a_n x^n \right] \\
  &= \left[ -2a_1 + 2a_0 \right] x^0 + \sum_{n=1}^{\infty} \left[ n a_n - 2(n+1) a_{n+1} + 2 a_n \right] x^n \\
  &= 2\left[ a_0 - a_1 \right] x^0 + \sum_{n=1}^{\infty} \left[ (n+2) a_n - 2(n+1) a_{n+1} \right] x^n .
\end{aligned}
\]

Step 5: At this point, you have an equation basically of the form
\[ \sum_{n=0}^{\infty} \left[ n\text{th formula of the } a_k\text{’s} \right] (x - x_0)^n = 0 , \]
which is possible only if
\[ n\text{th formula of the } a_k\text{’s} = 0 \quad\text{for } n = 0, 1, 2, 3, \ldots . \]
Using this last equation:
(a) Solve for the $a_k$ with the highest index, obtaining
\[ a_{\text{highest index}} = \text{formula of } n \text{ and lower-indexed coefficients} . \]
A few of these equations may need to be treated separately, but you should obtain one relatively simple formula that holds for all indices above some fixed value. This formula is a recursion formula for computing each coefficient from the previously computed coefficients.


(b) To simplify things just a little, do another change of index so that the recursion formula just derived is rewritten as
\[ a_k = \text{formula of } k \text{ and lower-indexed coefficients} . \]

From the last step in our example, we have
\[ 2\left[ a_0 - a_1 \right] x^0 + \sum_{n=1}^{\infty} \left[ (n+2) a_n - 2(n+1) a_{n+1} \right] x^n = 0 . \]
So,
\[ 2\left[ a_0 - a_1 \right] = 0 , \tag{30.5a} \]
and, for $n = 1, 2, 3, 4, \ldots$,
\[ (n+2) a_n - 2(n+1) a_{n+1} = 0 . \tag{30.5b} \]
In equation (30.5a), $a_1$ is the highest indexed $a_k$; solving for it in terms of the lower-indexed $a_k$’s (i.e., $a_0$) yields
\[ a_1 = a_0 . \]
Equation (30.5b) also just contains two $a_k$’s: $a_n$ and $a_{n+1}$. Since $n + 1 > n$, we solve for $a_{n+1}$,
\[ a_{n+1} = \frac{n+2}{2(n+1)}\, a_n \quad\text{for } n = 1, 2, 3, 4, \ldots . \]
Letting $k = n + 1$ (equivalently, $n = k - 1$), this becomes
\[ a_k = \frac{k+1}{2k}\, a_{k-1} \quad\text{for } k = 2, 3, 4, 5, \ldots . \tag{30.6} \]
This is the recursion formula we will use.

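Step 6: Use the recursion formula (and any corresponding formulas for the lower-order terms) to find all the $a_k$’s in terms of $a_0$. Look for patterns!
In the last step, we saw that
\[ a_1 = a_0 . \]
Using this and recursion formula (30.6) with $k = 2, 3, 4, \ldots$ (and looking for patterns), we obtain the following:
\[
\begin{aligned}
a_2 &= \frac{2+1}{2 \cdot 2}\, a_{2-1} = \frac{3}{2 \cdot 2}\, a_1 = \frac{3}{2 \cdot 2}\, a_0 = \frac{3}{2^2}\, a_0 , \\
a_3 &= \frac{3+1}{2 \cdot 3}\, a_{3-1} = \frac{4}{2 \cdot 3}\, a_2 = \frac{4}{2 \cdot 3} \cdot \frac{3}{2^2}\, a_0 = \frac{4}{2^3}\, a_0 , \\
a_4 &= \frac{4+1}{2 \cdot 4}\, a_{4-1} = \frac{5}{2 \cdot 4}\, a_3 = \frac{5}{2 \cdot 4} \cdot \frac{4}{2^3}\, a_0 = \frac{5}{2^4}\, a_0 , \\
a_5 &= \frac{5+1}{2 \cdot 5}\, a_{5-1} = \frac{6}{2 \cdot 5}\, a_4 = \frac{6}{2 \cdot 5} \cdot \frac{5}{2^4}\, a_0 = \frac{6}{2^5}\, a_0 , \\
    &\;\;\vdots
\end{aligned}
\]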


The pattern here is obvious:
\[ a_k = \frac{k+1}{2^k}\, a_0 \quad\text{for } k = 2, 3, 4, \ldots . \]
Note that this formula even gives us our $a_1 = a_0$ equation,
\[ a_1 = \frac{1+1}{2^1}\, a_0 = \frac{2}{2}\, a_0 = a_0 , \]
and is even valid with $k = 0$,
\[ a_0 = \frac{0+1}{2^0}\, a_0 = a_0 . \]
So, in fact,
\[ a_k = \frac{k+1}{2^k}\, a_0 \quad\text{for } k = 0, 1, 2, 3, 4, \ldots . \tag{30.7} \]
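The bookkeeping in steps 5 and 6 is easy to check by machine. The following is a minimal sketch, not part of the text's method: the function names are made up for illustration, and exact rational arithmetic from Python's standard `fractions` module is assumed so that the comparison is not blurred by rounding. It generates coefficients from recursion formula (30.6) and compares them with the conjectured pattern (30.7).

```python
from fractions import Fraction


def coefficients_by_recursion(a0, count):
    """Return [a_0, ..., a_{count-1}] for (x - 2)y' + 2y = 0 about x0 = 0."""
    coeffs = [Fraction(a0)]
    if count > 1:
        coeffs.append(coeffs[0])  # a_1 = a_0, from equation (30.5a)
    for k in range(2, count):
        # recursion formula (30.6): a_k = (k + 1)/(2k) * a_{k-1}
        coeffs.append(Fraction(k + 1, 2 * k) * coeffs[k - 1])
    return coeffs[:count]


def coefficient_by_pattern(a0, k):
    """Conjectured closed form (30.7): a_k = (k + 1)/2^k * a_0."""
    return Fraction(k + 1, 2 ** k) * Fraction(a0)


# Compare the two ways of computing the first twelve coefficients.
coeffs = coefficients_by_recursion(1, 12)
assert all(coeffs[k] == coefficient_by_pattern(1, k) for k in range(12))
```

Agreement on finitely many coefficients is only evidence for the pattern, not a proof; induction is still needed to confirm it for every $k$.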

Step 7: Using the formulas just derived for the coefficients, write out the resulting series for $y(x)$. Try to simplify it and factor out $a_0$.
Plugging the formula just derived for the $a_k$’s into the power series assumed for $y$ yields
\[ y(x) = \sum_{k=0}^{\infty} a_k x^k = \sum_{k=0}^{\infty} \frac{k+1}{2^k}\, a_0 x^k = a_0 \sum_{k=0}^{\infty} \frac{k+1}{2^k}\, x^k . \]
So we have
\[
\begin{aligned}
y(x) &= a_0 \sum_{k=0}^{\infty} \frac{k+1}{2^k}\, x^k \\
     &= a_0 \left[ \frac{0+1}{2^0}\, x^0 + \frac{1+1}{2^1}\, x^1 + \frac{2+1}{2^2}\, x^2 + \frac{3+1}{2^3}\, x^3 + \cdots \right] \\
     &= a_0 \left[ 1 + x + \frac{3}{4}\, x^2 + \frac{1}{2}\, x^3 + \cdots \right]
\end{aligned}
\]
as the series solution for our first-order differential equation (assuming it converges).

Last Step: See if you recognize the series derived as the series for some well-known function (you probably won’t!).
By an amazing stroke of luck, in exercise 29.9 d on page 586 we saw that
\[ \frac{2}{(2-x)^2} = \frac{1}{2} \sum_{k=0}^{\infty} \frac{k+1}{2^k}\, x^k . \]
So our formula for $y$ simplifies considerably:
\[ y(x) = a_0 \sum_{k=0}^{\infty} \frac{k+1}{2^k}\, x^k = a_0 \left[ 2 \cdot \frac{2}{(2-x)^2} \right] = \frac{4a_0}{(2-x)^2} . \]
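A quick numerical sanity check of this identification can be sketched as follows. It is an illustration only: the function names are hypothetical, floating-point arithmetic is used, and the sample points are chosen (as an assumption) to lie inside $|x| < 2$, where the series is expected to converge.

```python
def partial_sum(a0, x, terms):
    """Sum the first `terms` terms of a0 * sum_{k>=0} (k + 1)/2^k * x^k."""
    return a0 * sum((k + 1) / 2 ** k * x ** k for k in range(terms))


def closed_form(a0, x):
    """The candidate closed form 4*a0 / (2 - x)^2."""
    return 4 * a0 / (2 - x) ** 2


# Partial sums should approach the closed form for sample points with |x| < 2.
for x in (-1.0, 0.5, 1.0):
    assert abs(partial_sum(3.0, x, 200) - closed_form(3.0, x)) < 1e-9
```

The number of terms needed for a given accuracy grows as $|x|$ approaches 2, so the tolerance and term count above are not universal choices.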


Practical Advice on Using the Method

General Comments
The method just described is a fairly straightforward procedure, at least up to the point where you are
trying to “ﬁnd a pattern” for the ak ’s . The individual steps are, for the most part, simple and only
involve elementary calculus and algebraic computations — but there are a lot of these elementary
computations, and an error in just one can throw off the subsequent computations with disastrous
consequences for your ﬁnal answer. So be careful, write neatly, and avoid shortcuts and doing too
many computations in your head. It may also be a good idea to do your work with your paper turned
sideways, just to have enough room for each line of formulas.

On Finding Patterns
In computing the $a_k$’s, we usually want to find some “pattern” described by some reasonably simple formula. In our above example, we found formula (30.7),
\[ a_k = \frac{k+1}{2^k}\, a_0 \quad\text{for } k = 0, 1, 2, 3, 4, \ldots . \]
Using this formula, it was easy to write out the power series solution.
More generally, we will soon verify that the $a_k$’s obtained by this method can all be simply related to $a_0$ by an expression of the form
\[ a_k = \alpha_k a_0 \quad\text{for } k = 0, 1, 2, 3, 4, \ldots \]
where $\alpha_0 = 1$ and the other $\alpha_k$’s are fixed numbers (hopefully given by some simple formula of $k$). In the example cited just above,
\[ \alpha_k = \frac{k+1}{2^k} \quad\text{for } k = 0, 1, 2, 3, 4, \ldots . \]
Finding that pattern and its formula (i.e., the above mentioned $\alpha_k$’s) is something of an art and requires a skill that improves with practice. One suggestion is to avoid multiplying factors out. In deriving formula (30.7), it was experience that led the author to leave $2^2$ and $2^3$ as $2^2$ and $2^3$, instead of as 4 and 8; he suspected a pattern would emerge. Another suggestion is to compute “many” of the $a_k$’s using the recursion formula before trying to identify the pattern. And once you believe you’ve found that pattern and derived that formula, say,
\[ a_k = \frac{k+1}{2^k}\, a_0 \quad\text{for } k = 0, 1, 2, 3, 4, \ldots , \]
test it by computing a few more $a_k$’s using both the recursion formula directly and your newly found formula. If the values computed using both methods don’t agree, your formula is wrong. Better yet, if you are acquainted with the method of induction, use that to rigorously confirm your formula.⁵
Unfortunately, in practice, it may not be so easy to find such a pattern for your $a_k$’s. In fact, it is quite possible to end up with a three (or more) term recursion formula, say,
\[ a_n = \frac{1}{n^2 + 1}\, a_{n-1} + \frac{2}{3n(n+3)}\, a_{n-2} , \]
which can make “finding patterns” quite difficult.
Even if you do see a pattern, it might be difficult to describe. In these cases, writing out a relatively simple formula for all the terms in the power series solution may not be practical. What we can still do, though, is to use the recursion formula to compute (or have a computer compute) as many terms as we think are needed for a reasonably accurate partial sum approximation.
5 And to learn about using induction, see section 30.7.
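Having a computer apply a recursion formula takes only a few lines. The sketch below uses the three-term recursion quoted above purely as an illustration; the text does not say which differential equation it comes from, so the starting values $a_0$ and $a_1$, the function names, and the choice of exact rational arithmetic are all assumptions made here.

```python
from fractions import Fraction


def three_term_coefficients(a0, a1, count):
    """Apply a_n = a_{n-1}/(n^2 + 1) + 2*a_{n-2}/(3n(n + 3)) for n >= 2."""
    coeffs = [Fraction(a0), Fraction(a1)]
    for n in range(2, count):
        coeffs.append(coeffs[n - 1] / (n * n + 1)
                      + Fraction(2, 3 * n * (n + 3)) * coeffs[n - 2])
    return coeffs[:count]


def partial_sum(coeffs, x, x0=0):
    """Evaluate sum_k coeffs[k] * (x - x0)^k by Horner's rule."""
    total = 0
    for c in reversed(coeffs):
        total = total * (x - x0) + c
    return total


# Tenth-degree partial sum at x = 1/2, taking a_0 = a_1 = 1 as sample values.
approx = partial_sum(three_term_coefficients(1, 1, 11), Fraction(1, 2))
```

How many terms count as "enough" depends on the point $x$ and on the radius of convergence, which this sketch does not estimate.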


Terminating Series
It’s worth checking your recursion formula
\[ a_k = \text{formula of } k \text{ and lower-indexed coefficients} \]
to see if the right side becomes zero for some value $K$ of $k$. Then
\[ a_K = 0 \]
and the computation of the subsequent $a_k$’s may become especially simple. In fact, you may well have
\[ a_k = 0 \quad\text{for all } k \geq K . \]
This, essentially, “terminates” the series and gives you a polynomial solution, something that’s usually easier to handle than a true infinite series solution.

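!Example 30.1: Consider finding the power series solution about $x_0 = 0$ to
\[ \left( x^2 + 1 \right) y' - 4xy = 0 . \]
It is already in the right form. So, following the procedure, we let
\[ y(x) = \sum_{k=0}^{\infty} a_k (x - x_0)^k = \sum_{k=0}^{\infty} a_k x^k , \]
and ‘compute’
\[ y'(x) = \frac{d}{dx} \sum_{k=0}^{\infty} a_k x^k = \sum_{k=0}^{\infty} \frac{d}{dx}\left[ a_k x^k \right] = \sum_{k=1}^{\infty} a_k k x^{k-1} . \]
Plugging this into the differential equation and carrying out the index manipulation and algebra of our method:
\[
\begin{aligned}
0 &= \left( x^2 + 1 \right) y' - 4xy \\
  &= \left( x^2 + 1 \right) \sum_{k=1}^{\infty} a_k k x^{k-1} - 4x \sum_{k=0}^{\infty} a_k x^k \\
  &= x^2 \sum_{k=1}^{\infty} a_k k x^{k-1} + 1 \sum_{k=1}^{\infty} a_k k x^{k-1} - 4x \sum_{k=0}^{\infty} a_k x^k \\
  &= \underbrace{\sum_{k=1}^{\infty} a_k k x^{k+1}}_{n = k+1} + \underbrace{\sum_{k=1}^{\infty} a_k k x^{k-1}}_{n = k-1} - \underbrace{\sum_{k=0}^{\infty} 4 a_k x^{k+1}}_{n = k+1} \\
  &= \sum_{n=2}^{\infty} a_{n-1} (n-1) x^n + \sum_{n=0}^{\infty} a_{n+1} (n+1) x^n - \sum_{n=1}^{\infty} 4 a_{n-1} x^n \\
  &= \sum_{n=2}^{\infty} a_{n-1} (n-1) x^n + \left[ a_{0+1} (0+1) x^0 + a_{1+1} (1+1) x^1 + \sum_{n=2}^{\infty} a_{n+1} (n+1) x^n \right] \\
  &\qquad - \left[ 4 a_{1-1} x^1 + \sum_{n=2}^{\infty} 4 a_{n-1} x^n \right] \\
  &= a_1 x^0 + \left[ 2a_2 - 4a_0 \right] x^1 + \sum_{n=2}^{\infty} \left[ a_{n-1} (n-1) + a_{n+1} (n+1) - 4 a_{n-1} \right] x^n \\
  &= a_1 x^0 + \left[ 2a_2 - 4a_0 \right] x^1 + \sum_{n=2}^{\infty} \left[ (n+1) a_{n+1} + (n-5) a_{n-1} \right] x^n .
\end{aligned}
\]
Remember, the coefficient in each term must be zero. From the $x^0$ term, we get
\[ a_1 = 0 . \]
From the $x^1$ term, we get
\[ 2a_2 - 4a_0 = 0 . \]
And for $n \geq 2$, we have
\[ (n+1) a_{n+1} + (n-5) a_{n-1} = 0 . \]
Solving each of the above for the $a_k$ with the highest index, we get
\[ a_1 = 0 , \qquad a_2 = 2a_0 \]
and
\[ a_{n+1} = \frac{5-n}{n+1}\, a_{n-1} \quad\text{for } n = 2, 3, 4, \ldots . \]
Letting $k = n + 1$ then converts the last equation to the recursion formula
\[ a_k = \frac{6-k}{k}\, a_{k-2} \quad\text{for } k = 3, 4, 5, \ldots . \]
Now, using our recursion formula, we see that
\[
\begin{aligned}
a_3 &= \frac{6-3}{3}\, a_{3-2} = \frac{3}{3}\, a_1 = 1 \cdot 0 = 0 , \\
a_4 &= \frac{6-4}{4}\, a_{4-2} = \frac{2}{4}\, a_2 = \frac{1}{2} \cdot 2a_0 = a_0 , \\
a_5 &= \frac{6-5}{5}\, a_{5-2} = \frac{1}{5}\, a_3 = \frac{1}{5} \cdot 0 = 0 , \\
a_6 &= \frac{6-6}{6}\, a_{6-2} = \frac{0}{6}\, a_4 = 0 , \\
a_7 &= \frac{6-7}{7}\, a_{7-2} = -\frac{1}{7}\, a_5 = -\frac{1}{7} \cdot 0 = 0 , \\
a_8 &= \frac{6-8}{8}\, a_{8-2} = -\frac{2}{8}\, a_6 = -\frac{1}{4} \cdot 0 = 0 , \\
    &\;\;\vdots
\end{aligned}
\]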


Clearly, the vanishing of both $a_5$ and $a_6$ means that the recursion formula will give us
\[ a_k = 0 \quad\text{whenever } k > 4 . \]
Thus,
\[
\begin{aligned}
y(x) &= \sum_{k=0}^{\infty} a_k x^k \\
     &= a_0 + a_1 x + a_2 x^2 + a_3 x^3 + a_4 x^4 + a_5 x^5 + a_6 x^6 + a_7 x^7 + \cdots \\
     &= a_0 + 0x + 2a_0 x^2 + 0x^3 + a_0 x^4 + 0x^5 + 0x^6 + 0x^7 + \cdots \\
     &= a_0 + 2a_0 x^2 + a_0 x^4 .
\end{aligned}
\]
That is, the power series for $y$ reduces to the polynomial
\[ y(x) = a_0 \left[ 1 + 2x^2 + x^4 \right] . \]
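The termination seen in this example can be confirmed mechanically. The following sketch (hypothetical function names; exact arithmetic via Python's `fractions` module is an assumption made for illustration) applies the recursion $a_k = \frac{6-k}{k} a_{k-2}$ and then substitutes the resulting polynomial back into $(x^2+1)y' - 4xy$.

```python
from fractions import Fraction


def example_coefficients(a0, count):
    """Coefficients a_0..a_{count-1} for (x^2 + 1)y' - 4xy = 0 about x0 = 0."""
    # a_1 = 0 and a_2 = 2*a_0 come from the x^0 and x^1 terms.
    coeffs = [Fraction(a0), Fraction(0), 2 * Fraction(a0)]
    for k in range(3, count):
        coeffs.append(Fraction(6 - k, k) * coeffs[k - 2])  # a_k = (6 - k)/k * a_{k-2}
    return coeffs[:count]


def residual(coeffs, x):
    """Evaluate (x^2 + 1)*y'(x) - 4*x*y(x) for y(x) = sum coeffs[k] * x^k."""
    y = sum(c * x ** k for k, c in enumerate(coeffs))
    dy = sum(k * c * x ** (k - 1) for k, c in enumerate(coeffs) if k > 0)
    return (x * x + 1) * dy - 4 * x * y


coeffs = example_coefficients(1, 10)
assert all(c == 0 for c in coeffs[5:])        # the series terminates after x^4
assert residual(coeffs, Fraction(3, 7)) == 0  # the polynomial satisfies the equation
```

Because the series terminates, the residual is exactly zero at every sample point; for a genuinely infinite series, a truncated sum would leave a small nonzero residual instead.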

If $x_0 \neq 0$
The computations in our procedure (and the others we’ll develop) tend to get a little messier when $x_0 \neq 0$, and greater care needs to be taken. In particular, before you “multiply things out” in step 2, you should rewrite your polynomials $A(x)$ and $B(x)$ in terms of $(x - x_0)$ instead of $x$ to better match the terms in the series. For example, if
\[ A(x) = x^2 + 2 \quad\text{and}\quad x_0 = 1 , \]
then rewrite $A(x)$ as follows:
\[
\begin{aligned}
A(x) &= \left[ (x - 1) + 1 \right]^2 + 2 \\
     &= \left[ (x-1)^2 + 2(x-1) + 1 \right] + 2 = (x-1)^2 + 2(x-1) + 3 .
\end{aligned}
\]
Alternatively (and probably better), you can just convert the differential equation
\[ A(x)y' + B(x)y = 0 \tag{30.8a} \]
using the change of variables $X = x - x_0$. That is, first set
\[ Y(X) = y(x) \quad\text{with}\quad X = x - x_0 \]
and then rewrite the differential equation for $y(x)$ in terms of $Y$ and $X$. After noting that $x = X + x_0$ and that (via the chain rule)
\[ y'(x) = \frac{d}{dx}\left[ y(x) \right] = \frac{d}{dx}\left[ Y(X) \right] = \frac{dY}{dX} \frac{dX}{dx} = \frac{dY}{dX} \frac{d}{dx}\left[ x - x_0 \right] = \frac{dY}{dX} = Y'(X) , \]
we see that this converted differential equation is simply
\[ A(X + x_0)Y' + B(X + x_0)Y = 0 . \tag{30.8b} \]
Consequently, if we can find a general power series solution
\[ Y(X) = \sum_{k=0}^{\infty} a_k X^k \]


to the converted differential equation (equation (30.8b)), we can then generate the corresponding general power series solution to the original equation (equation (30.8a)) by rewriting $X$ in terms of $x$,
\[ y(x) = Y(X) = Y(x - x_0) = \sum_{k=0}^{\infty} a_k (x - x_0)^k . \]

!Example 30.2: Consider the problem of finding the power series solution about $x_0 = 3$ for
\[ \left( x^2 - 6x + 10 \right) y' + (12 - 4x)y = 0 . \]
Proceeding as suggested, we let
\[ Y(X) = y(x) \quad\text{with}\quad X = x - 3 . \]
Then $x = X + 3$, and
\[
\begin{aligned}
& \left( x^2 - 6x + 10 \right) y' + (12 - 4x)y = 0 \\
\longrightarrow\quad & \left( [X + 3]^2 - 6[X + 3] + 10 \right) Y' + (12 - 4[X + 3])Y = 0 .
\end{aligned}
\]
After a bit of simple algebra, this last equation simplifies to
\[ \left( X^2 + 1 \right) Y' - 4XY = 0 , \]
which, by an amazing stroke of luck, is the differential equation we just dealt with in example 30.1 (only now written using capital letters). From that example, we know
\[ Y(X) = a_0 \left[ 1 + 2X^2 + X^4 \right] . \]
Thus,
\[ y(x) = Y(X) = Y(x - 3) = a_0 \left[ 1 + 2(x-3)^2 + (x-3)^4 \right] . \]
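Rewriting a polynomial in powers of $(x - x_0)$, as done by hand for $A(x) = x^2 + 2$ and in the example just completed, is the "bit of simple algebra" most prone to slips. One way to automate it is repeated synthetic division; the sketch below is an illustrative helper (the function name and the coefficient-list convention are assumptions, not notation from the text).

```python
def shift_polynomial(coeffs, x0):
    """Re-expand a polynomial in powers of (x - x0).

    coeffs[i] multiplies x**i.  The returned list c satisfies
    sum(coeffs[i] * x**i) == sum(c[i] * (x - x0)**i) for all x.
    """
    c = list(coeffs)
    n = len(c)
    for i in range(n):
        # One pass of synthetic division by (x - x0) fixes coefficient i.
        for j in range(n - 2, i - 1, -1):
            c[j] += x0 * c[j + 1]
    return c


# A(x) = x^2 + 2 about x0 = 1 gives 3 + 2(x - 1) + (x - 1)^2.
assert shift_polynomial([2, 0, 1], 1) == [3, 2, 1]
# The coefficients in example 30.2 about x0 = 3 become X^2 + 1 and -4X.
assert shift_polynomial([10, -6, 1], 3) == [1, 0, 1]
assert shift_polynomial([12, -4], 3) == [0, -4]
```

The returned list also gives the coefficients of $A(X + x_0)$ in powers of $X$, which is exactly what equation (30.8b) requires.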

Initial-Value Problems (and Finding Patterns, Again)

The method just described yields a power series solution
\[ y(x) = \sum_{k=0}^{\infty} a_k (x - x_0)^k = a_0 + a_1 (x - x_0) + a_2 (x - x_0)^2 + a_3 (x - x_0)^3 + \cdots \]
in which $a_0$ is an arbitrary constant. Remember,
\[ y(x_0) = a_0 . \]
So the above general series solution for
\[ A(x)y' + B(x)y = 0 \]
becomes the solution to the initial-value problem
\[ A(x)y' + B(x)y = 0 \quad\text{with}\quad y(x_0) = y_0 \]
if we simply replace the arbitrary constant $a_0$ with the value $y_0$.


Along these lines, it is worth recalling that we are dealing with first-order, homogeneous linear differential equations, and that the general solution to any such equation can be given as an arbitrary constant times any nontrivial solution. In particular, we can write the general solution $y$ to any given first-order, homogeneous linear differential equation as
\[ y(x) = a_0 y_1(x) \]
where $a_0$ is an arbitrary constant and $y_1$ is the particular solution satisfying the initial condition $y(x_0) = 1$. So if our solutions can be written as power series about $x_0$, then there is a particular power series solution
\[ y_1(x) = \sum_{k=0}^{\infty} \alpha_k (x - x_0)^k \]
where $\alpha_0 = y_1(x_0) = 1$ and the other $\alpha_k$’s are fixed numbers (hopefully given by some simple formula of $k$). It then follows that the general solution $y$ is given by
\[ y(x) = a_0 y_1(x) = a_0 \sum_{k=0}^{\infty} \alpha_k (x - x_0)^k = \sum_{k=0}^{\infty} a_k (x - x_0)^k \]
where
\[ a_k = \alpha_k a_0 \quad\text{for } k = 0, 1, 2, 3, \ldots , \]
just as was claimed a few pages ago when we discussed “finding patterns”. (This also confirms that we will always be able to factor out the $a_0$ in our series solutions.)
One consequence of these observations is that, instead of assuming a solution of the form
\[ \sum_{k=0}^{\infty} a_k (x - x_0)^k \quad\text{with } a_0 \text{ arbitrary} \]
in the first step of our method, we could assume a solution of the form
\[ \sum_{k=0}^{\infty} \alpha_k (x - x_0)^k \quad\text{with } \alpha_0 = 1 , \]
and then just multiply the series obtained by an arbitrary constant $a_0$. In practice, though, this approach is no simpler than that already outlined in the steps of our algebraic method.

30.3 Validity of the Algebraic Method for First-Order Equations
Our algebraic method will certainly lead to a general solution of the form
\[ y(x) = \sum_{k=0}^{\infty} a_k (x - x_0)^k \quad\text{with } a_0 \text{ arbitrary} , \]
provided such a general solution exists. But what assurance do we have that such solutions exist? And what about the radius of convergence? What good is a formula for a solution if we don’t know the interval over which that formula is valid? And while we are asking these sorts of questions, why do we insist that $A(x_0) \neq 0$ in pre-step 2?
Let’s see if we can at least partially answer these questions.


Non-Existence of Power Series Solutions

Let $a$ and $b$ be functions on an interval $(\alpha, \beta)$ containing some point $x_0$, and let $y$ be any function on that interval satisfying
\[ a(x)y' + b(x)y = 0 \quad\text{with}\quad y(x_0) \neq 0 . \]
For the moment, assume this differential equation has a general power series solution about $x_0$ valid on $(\alpha, \beta)$. This means there are finite numbers $a_0, a_1, a_2, \ldots$ such that
\[ y(x) = \sum_{k=0}^{\infty} a_k (x - x_0)^k \quad\text{for } \alpha < x < \beta . \]
In particular, there are finite numbers $a_0$ and $a_1$ with
\[ a_0 = y(x_0) \neq 0 \quad\text{and}\quad a_1 = y'(x_0) . \]
Also observe that we can algebraically solve our differential equation for $y'$, obtaining
\[ y'(x) = -\frac{b(x)}{a(x)}\, y(x) . \]
Thus,
\[ a_1 = y'(x_0) = -\frac{b(x_0)}{a(x_0)}\, y(x_0) = -\frac{b(x_0)}{a(x_0)}\, a_0 , \tag{30.9} \]
provided the above fraction is a finite number, which will certainly be the case if $a(x)$ and $b(x)$ are polynomials with $a(x_0) \neq 0$.
More generally, the fraction in equation (30.9) might be indeterminate. To get around this minor issue, we’ll take limits:
\[
\begin{aligned}
a_1 = y'(x_0) &= \lim_{x \to x_0} y'(x) = \lim_{x \to x_0} \left[ -\frac{b(x)}{a(x)}\, y(x) \right] \\
              &= -\left[ \lim_{x \to x_0} \frac{b(x)}{a(x)} \right] y(x_0) = -\left[ \lim_{x \to x_0} \frac{b(x)}{a(x)} \right] a_0 .
\end{aligned}
\]
Solving for the limit, we then have
\[ \lim_{x \to x_0} \frac{b(x)}{a(x)} = -\frac{a_1}{a_0} . \]
This means the above limit must exist and be a well-defined finite number whenever the solution $y$ can be given by the above power series. And if you think about what this means when the above limit does not exist as a finite number, you get:

Lemma 30.1 (nonexistence of a power series solution)

Let $a$ and $b$ be two functions on some interval containing a point $x_0$. If
\[ \lim_{x \to x_0} \frac{b(x)}{a(x)} \]
does not exist as a finite number, then
\[ a(x)y' + b(x)y = 0 \]
does not have a general power series solution about $x_0$ with an arbitrary constant term.
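As a small illustration of the lemma (this example is supplied here for concreteness and is not one of the text's numbered examples), take $a(x) = x$, $b(x) = 1$ and $x_0 = 0$:

```latex
\[
  x y' + y = 0 , \qquad
  \frac{b(x)}{a(x)} = \frac{1}{x} , \qquad
  \lim_{x \to 0} \frac{1}{x} \ \text{does not exist as a finite number.}
\]
\[
  \frac{d}{dx}\left[ x y \right] = x y' + y = 0
  \quad\Longrightarrow\quad
  y(x) = \frac{c}{x} .
\]
```

For $c \neq 0$ the solution $c/x$ is unbounded near $0$, so it cannot equal a power series about $0$. Only the trivial solution ($c = 0$) has such a series, and its constant term is $0$ rather than arbitrary, which is consistent with the lemma.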

!Example 30.4: To first illustrate the algebraic method, we used
\[ y' + \frac{2}{x - 2}\, y = 0 , \]
which we rewrote as
\[ (x - 2)y' + 2y = 0 . \]
Now
\[ A(z_s) = z_s - 2 = 0 \quad\Longleftrightarrow\quad z_s = 2 . \]
So this differential equation has just one singular point, $z_s = 2$. Any $x_0 \neq 2$ is then an ordinary point for the differential equation, and the corresponding radius of analyticity is
\[ R_{x_0} = \text{distance from } x_0 \text{ to } z_s = |z_s - x_0| = |2 - x_0| . \]
Theorem 30.2 then assures us that, about any $x_0 \neq 2$, the general solution to our differential equation has a power series formula, and its radius of convergence is at least equal to $|2 - x_0|$. In particular, the power series we found,
\[ y = a_0 \sum_{k=0}^{\infty} \frac{k+1}{2^k}\, x^k , \]
is centered at $x_0 = 0$. So the corresponding radius of analyticity is
\[ R = |2 - 0| = 2 \]
and our theorems assure us that our series solution is valid at least on the interval $(x_0 - R, x_0 + R) = (-2, 2)$.
In this regard, let us note the following:
1. If $|x| \geq 2$, then the terms of our power series solution
\[ y = a_0 \sum_{k=0}^{\infty} \frac{k+1}{2^k}\, x^k = a_0 \sum_{k=0}^{\infty} (k+1) \left( \frac{x}{2} \right)^k \]
clearly increase in magnitude as $k$ increases. Hence, this series diverges whenever $|x| \geq 2$. So, in fact, the radius of convergence is 2, and our power series solution is only valid on $(-2, 2)$.
2. As we observed on page 593, the above power series is the power series about $x_0 = 0$ for
\[ y(x) = \frac{4a_0}{(2 - x)^2} . \]
But you can easily verify that this simple formula gives us a valid general solution to our differential equation on any interval not containing the singular point $x = 2$, not just $(-2, 2)$.
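The first observation above can be seen numerically by looking at the individual terms $(k+1)(x/2)^k$ and at the ratio of consecutive terms. The sketch below is only an illustration (hypothetical function names, floating-point arithmetic, $a_0 = 1$ by default); a ratio settling near $|x|/2$ is what the ratio test would use to suggest a radius of convergence of 2.

```python
def term(k, x, a0=1.0):
    """k-th term a0 * (k + 1) * (x/2)^k of the series solution."""
    return a0 * (k + 1) * (x / 2) ** k


def ratio(k, x):
    """|term(k + 1) / term(k)|, which tends to |x|/2 as k grows."""
    return abs(term(k + 1, x) / term(k, x))


# Inside the interval (-2, 2) the ratio settles below 1 ...
assert abs(ratio(1000, 1.0) - 0.5) < 1e-3
# ... while at x = 2 the terms are simply k + 1 and keep growing.
assert term(50, 2.0) == 51.0
assert all(abs(term(k + 1, -3.0)) > abs(term(k, -3.0)) for k in range(20))
```

Terms that do not shrink to zero are enough to rule out convergence, which is why the series fails for every $|x| \geq 2$.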


The last example shows that a power series for a solution may be valid over a smaller interval
than the interval of validity for another formula for that solution. Of course, ﬁnding that more general
formula may not be so easy, especially after we start dealing with higher-order differential equations.
The next example illustrates something slightly different, namely that the radius of convergence
for a power series solution can, sometimes, be much larger than the corresponding radius of analyticity
for the differential equation.

!Example 30.5: In example 30.2, we considered the problem of finding the power series solution about $x_0 = 3$ for
\[ \left( x^2 - 6x + 10 \right) y' + (12 - 4x)y = 0 . \]
Any singular point $z$ for this differential equation is given by
\[ z^2 - 6z + 10 = 0 . \]
Using the quadratic formula, we see that we have two singular points $z_+$ and $z_-$ given by
\[ z_\pm = \frac{6 \pm \sqrt{(-6)^2 - 4 \cdot 10}}{2} = 3 \pm 1i . \]
The radius of analyticity about $x_0 = 3$ for our differential equation is the distance between each of these singular points and $x_0 = 3$,
\[ |z_\pm - x_0| = \left| [3 \pm 1i] - 3 \right| = |\pm i| = 1 . \]
So the radius of convergence for our series is at least 1, which means that our series solution is valid on at least the interval
\[ (x_0 - R, x_0 + R) = (3 - 1, 3 + 1) = (2, 4) . \]
Recall, however, that example 30.2 demonstrated the possibility of a “terminating series”, and that our series solution to the above differential equation actually reduced to the polynomial
\[ y(x) = a_0 \left[ 1 + 2(x-3)^2 + (x-3)^4 \right] , \]
which is easily verified to be a valid solution on the entire real line $(-\infty, \infty)$, not just $(2, 4)$.
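The singular points and the radius of analyticity in this example can be reproduced with a few lines of code. This is a sketch under stated assumptions: the helper names are invented for illustration, and the radius is taken to be the distance from $x_0$ to the nearest singular point, as in the two examples above.

```python
import cmath


def quadratic_roots(a, b, c):
    """Both (possibly complex) roots of a*z^2 + b*z + c = 0, with a != 0."""
    disc = cmath.sqrt(b * b - 4 * a * c)
    return ((-b + disc) / (2 * a), (-b - disc) / (2 * a))


def radius_of_analyticity(x0, singular_points):
    """Distance in the complex plane from x0 to the nearest singular point."""
    return min(abs(z - x0) for z in singular_points)


# Example 30.5: A(z) = z^2 - 6z + 10 has singular points 3 + i and 3 - i.
roots = quadratic_roots(1, -6, 10)
assert abs(radius_of_analyticity(3, roots) - 1.0) < 1e-12
# Example 30.4: the single singular point z = 2, viewed from x0 = 0.
assert radius_of_analyticity(0, [2]) == 2
```

The computed radius is only a guaranteed lower bound on the radius of convergence; as this example shows, the actual series may converge on a much larger interval.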

30.4 The Algebraic Method with Second-Order Equations

Extending the algebraic method to deal with second-order differential equations is straightforward.
The only real complication (aside from the extra computations required) comes from the fact that
our solutions will now involve two arbitrary constants instead of one, and that complication won’t
be particularly troublesome.


Details of the Method

Our goal, now, is to find a general power series solution to
\[ a(x)y'' + b(x)y' + c(x)y = 0 \]
assuming $a(x)$, $b(x)$ and $c(x)$ are rational functions. As hinted above, the procedure given here is very similar to that given in the previous section. Because of this, some of the steps will not be given in the same detail as before.
To illustrate the method, we will find a power series solution to
\[ y'' - xy = 0 . \tag{30.10} \]
This happens to be Airy’s equation. It is a famous equation and cannot be easily solved by any method we’ve discussed earlier in this text.
Again, we have two preliminary steps:
Pre-step 1: Get the differential equation into preferred form, which is
\[ A(x)y'' + B(x)y' + C(x)y = 0 \]
where $A(x)$, $B(x)$ and $C(x)$ are polynomials, preferably with no factors shared by all three.
Our example is already in the desired form.

Pre-step 2: If not already specified, choose a value for $x_0$ such that $A(x_0) \neq 0$. If initial conditions are given for $y(x)$ at some point, then use that point for $x_0$ (provided $A(x_0) \neq 0$). Otherwise, choose $x_0$ as convenient, which usually means choosing $x_0 = 0$.⁶
For our example, we have no initial values at any point, so we choose $x_0$ as simply as possible; namely, $x_0 = 0$.

Now for the basic method:

Step 1: Assume
\[ y = y(x) = \sum_{k=0}^{\infty} a_k (x - x_0)^k \tag{30.11} \]
with $a_0$ and $a_1$ being arbitrary and the other $a_k$’s “to be determined”, and then compute/write out the corresponding series for the first two derivatives,
\[ y' = \sum_{k=0}^{\infty} \frac{d}{dx}\left[ a_k (x - x_0)^k \right] = \sum_{k=1}^{\infty} k a_k (x - x_0)^{k-1} \]
and
\[ y'' = \sum_{k=1}^{\infty} \frac{d}{dx}\left[ k a_k (x - x_0)^{k-1} \right] = \sum_{k=2}^{\infty} k(k-1) a_k (x - x_0)^{k-2} . \]

Step 2: Plug these series for $y$, $y'$, and $y''$ back into the differential equation and “multiply things out” to get zero equalling the sum of a few power series about $x_0$.

6 Again, the requirement that $A(x_0) \neq 0$ is a simplification of requirements we’ll develop in the next section. But “$A(x_0) \neq 0$” will suffice for now, especially if $A$, $B$ and $C$ are polynomials with no factors shared by all three.


Step 3: For each series in your last equation, do a change of index so that each series looks like
\[ \sum_{n=\text{something}}^{\infty} \left[ \text{something not involving } x \right] (x - x_0)^n . \]

Step 4: Convert the sum of series in your last equation into one big series. The first few terms may have to be written separately. Simplify what can be simplified.
Since we’ve already decided $x_0 = 0$ in our example, we let
\[ y = y(x) = \sum_{k=0}^{\infty} a_k x^k , \tag{30.12} \]
and “compute”
\[ y' = \sum_{k=0}^{\infty} \frac{d}{dx}\left[ a_k x^k \right] = \sum_{k=1}^{\infty} k a_k x^{k-1} \]
and
\[ y'' = \sum_{k=1}^{\infty} \frac{d}{dx}\left[ k a_k x^{k-1} \right] = \sum_{k=2}^{\infty} k(k-1) a_k x^{k-2} . \]
Plugging these into the given differential equation and carrying out the other steps stated above then yields the following sequence of equalities:
\[
\begin{aligned}
0 &= y'' - xy \\
  &= \sum_{k=2}^{\infty} k(k-1) a_k x^{k-2} - x \sum_{k=0}^{\infty} a_k x^k \\
  &= \underbrace{\sum_{k=2}^{\infty} k(k-1) a_k x^{k-2}}_{n = k-2} + \underbrace{\sum_{k=0}^{\infty} (-1) a_k x^{k+1}}_{n = k+1} \\
  &= \sum_{n=0}^{\infty} (n+2)(n+1) a_{n+2} x^n + \sum_{n=1}^{\infty} (-1) a_{n-1} x^n \\
  &= (0+2)(0+1) a_{0+2} x^0 + \sum_{n=1}^{\infty} (n+2)(n+1) a_{n+2} x^n + \sum_{n=1}^{\infty} (-1) a_{n-1} x^n \\
  &= 2a_2 x^0 + \sum_{n=1}^{\infty} \left[ (n+2)(n+1) a_{n+2} - a_{n-1} \right] x^n .
\end{aligned}
\]

Step 5: At this point, you will have an equation of the basic form
\[ \sum_{n=0}^{\infty} \left[ n\text{th formula of the } a_k\text{’s} \right] (x - x_0)^n = 0 . \]
Now:


(a) Solve
\[ n\text{th formula of the } a_k\text{’s} = 0 \quad\text{for } n = 0, 1, 2, 3, 4, \ldots \]
for the $a_k$ with the highest index,
\[ a_{\text{highest index}} = \text{formula of } n \text{ and lower-indexed coefficients} . \]
Again, a few of these equations may need to be treated separately, but you will also obtain a relatively simple formula that holds for all indices above some fixed value. This is a recursion formula for computing each coefficient from previously computed coefficients.
(b) Using another change of index, rewrite the recursion formula just derived so that it looks like
\[ a_k = \text{formula of } k \text{ and lower-indexed coefficients} . \]

From the previous step in our example, we have
\[ 2a_2 x^0 + \sum_{n=1}^{\infty} \left[ (n+2)(n+1) a_{n+2} - a_{n-1} \right] x^n = 0 . \]
So
\[ 2a_2 = 0 , \]
and, for $n = 1, 2, 3, 4, \ldots$,
\[ (n+2)(n+1) a_{n+2} - a_{n-1} = 0 . \]
The first tells us that
\[ a_2 = 0 . \]
Solving the second for $a_{n+2}$ yields the recursion formula
\[ a_{n+2} = \frac{1}{(n+2)(n+1)}\, a_{n-1} \quad\text{for } n = 1, 2, 3, 4, \ldots . \]
Letting $k = n + 2$ (equivalently, $n = k - 2$), this becomes
\[ a_k = \frac{1}{k(k-1)}\, a_{k-3} \quad\text{for } k = 3, 4, 5, 6, \ldots . \tag{30.13} \]
This is the recursion formula we will use.
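Since recursion formula (30.13) reaches back three indices, it is convenient to let a computer generate the first several coefficients before hunting for patterns. A minimal sketch follows (the function name is hypothetical, and exact rational arithmetic is an assumption made for readability of the results).

```python
from fractions import Fraction


def airy_coefficients(a0, a1, count):
    """Coefficients a_0..a_{count-1} of a series solution to y'' - xy = 0 about 0."""
    coeffs = [Fraction(a0), Fraction(a1), Fraction(0)]  # a_2 = 0 from the x^0 term
    for k in range(3, count):
        # recursion formula (30.13): a_k = a_{k-3} / (k(k - 1))
        coeffs.append(coeffs[k - 3] / (k * (k - 1)))
    return coeffs[:count]


coeffs = airy_coefficients(1, 1, 8)
assert coeffs[3] == Fraction(1, 6) and coeffs[4] == Fraction(1, 12)
assert coeffs[5] == 0
```

Listing the coefficients this way makes it plain that they split into three independent families (indices $3n$, $3n+1$ and $3n+2$), with the third family identically zero.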

Step 6: Use the recursion formula (and any corresponding formulas for the lower-order terms) to find all the $a_k$’s in terms of $a_0$ and $a_1$. Look for patterns!
We already saw that
\[ a_2 = 0 . \]
Using this and recursion formula (30.13) with $k = 3, 4, \ldots$ (and looking for patterns), we see that
\[
\begin{aligned}
a_3 &= \frac{1}{3(3-1)}\, a_{3-3} = \frac{1}{3 \cdot 2}\, a_0 , \\
a_4 &= \frac{1}{4(4-1)}\, a_{4-3} = \frac{1}{4 \cdot 3}\, a_1 ,
\end{aligned}
\]


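\[
\begin{aligned}
&= \bigl[a_0 + a_3 x^{3} + a_6 x^{6} + a_9 x^{9} + \cdots\bigr] \\
&\quad + \bigl[a_1 x + a_4 x^{4} + a_7 x^{7} + a_{10} x^{10} + \cdots\bigr] \\
&\quad + \bigl[a_2 x^{2} + a_5 x^{5} + a_8 x^{8} + a_{11} x^{11} + \cdots\bigr] \\[4pt]
&= \bigl[a_0 + a_3 x^{3} + a_6 x^{6} + \cdots + a_{3n} x^{3n} + \cdots\bigr] \\
&\quad + \bigl[a_1 x + a_4 x^{4} + a_7 x^{7} + \cdots + a_{3n+1} x^{3n+1} + \cdots\bigr] \\
&\quad + \bigl[a_2 x^{2} + a_5 x^{5} + a_8 x^{8} + \cdots + a_{3n+2} x^{3n+2} + \cdots\bigr] \\[4pt]
&= \Bigl[a_0 + \frac{1}{3\cdot 2}\, a_0 x^{3} + \cdots + \frac{1}{(2\cdot 3)(5\cdot 6)\cdots([3n-1]\cdot 3n)}\, a_0 x^{3n} + \cdots\Bigr] \\
&\quad + \Bigl[a_1 x + \frac{1}{4\cdot 3}\, a_1 x^{4} + \cdots + \frac{1}{(3\cdot 4)(6\cdot 7)\cdots(3n\cdot[3n+1])}\, a_1 x^{3n+1} + \cdots\Bigr] \\
&\quad + \bigl[0x^{2} + 0x^{5} + 0x^{8} + 0x^{11} + \cdots\bigr] \\[4pt]
&= a_0 \Bigl[1 + \frac{1}{3\cdot 2}\, x^{3} + \cdots + \frac{1}{(2\cdot 3)(5\cdot 6)\cdots([3n-1]\cdot 3n)}\, x^{3n} + \cdots\Bigr] \\
&\quad + a_1 \Bigl[x + \frac{1}{4\cdot 3}\, x^{4} + \cdots + \frac{1}{(3\cdot 4)(6\cdot 7)\cdots(3n\cdot[3n+1])}\, x^{3n+1} + \cdots\Bigr] .
\end{aligned}
\]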

So,
\[
y(x) = a_0 y_1(x) + a_1 y_2(x) \tag{30.14a}
\]
where
\[
y_1(x) = 1 + \sum_{n=1}^{\infty} \frac{1}{(2\cdot 3)(5\cdot 6)\cdots([3n-1]\cdot 3n)}\, x^{3n} \tag{30.14b}
\]
and
\[
y_2(x) = x + \sum_{n=1}^{\infty} \frac{1}{(3\cdot 4)(6\cdot 7)\cdots(3n\cdot[3n+1])}\, x^{3n+1} . \tag{30.14c}
\]

Last Step: See if you recognize either of the series derived as the series for some well-known
function (you probably won’t!).
It is unlikely that you have ever seen the above series before. So we cannot rewrite
our power series solutions more simply in terms of better-known functions.
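
When no familiar function is recognized, the recursion formula can still be used directly. The following is a minimal sketch, assuming exact rational arithmetic is wanted; the function names are hypothetical and not from the text. It generates the Airy coefficients from recursion formula (30.13) with $a_2 = 0$ and compares them with the product formulas in (30.14b) and (30.14c).

```python
from fractions import Fraction


def airy_coefficients(a0, a1, count):
    """Return [a_0, ..., a_{count-1}] for y'' - x*y = 0 about x0 = 0."""
    a = [Fraction(a0), Fraction(a1), Fraction(0)]  # a_2 = 0 was found separately
    for k in range(3, count):
        a.append(a[k - 3] / (k * (k - 1)))  # recursion formula (30.13)
    return a[:count]


def closed_form_a3n(n):
    """Coefficient of x^(3n) in y_1: 1 / ((2*3)(5*6)...([3n-1]*3n))."""
    product = 1
    for m in range(1, n + 1):
        product *= (3 * m - 1) * (3 * m)
    return Fraction(1, product)


def closed_form_a3n1(n):
    """Coefficient of x^(3n+1) in y_2: 1 / ((3*4)(6*7)...(3n*[3n+1]))."""
    product = 1
    for m in range(1, n + 1):
        product *= (3 * m) * (3 * m + 1)
    return Fraction(1, product)


# y_1 corresponds to (a0, a1) = (1, 0); y_2 corresponds to (a0, a1) = (0, 1).
coeffs_y1 = airy_coefficients(1, 0, 13)
coeffs_y2 = airy_coefficients(0, 1, 14)
for n in range(1, 5):
    assert coeffs_y1[3 * n] == closed_form_a3n(n)
    assert coeffs_y2[3 * n + 1] == closed_form_a3n1(n)
print(coeffs_y1[:7])
```

Agreement for the first few indices is only a spot check of the observed pattern, not a proof of it.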


Practical Advice on Using the Method

The advice given for using this method with first-order equations certainly applies when using this
method for second-order equations. All that can be added is that even greater diligence is needed
in the individual computations. Typically, you have to deal with more power series terms when
solving second-order differential equations, and that, naturally, provides more opportunities for
error. That also leads to a greater probability that you will not succeed in finding “nice” formulas
for the coefficients and may have to simply use the recursion formula to compute as many terms as
you think necessary for a reasonably accurate partial sum approximation.

Initial-Value Problems (and Finding Patterns)

Observe that the solution obtained in our example (formula set 30.14) can be written as

\[
y(x) = a_0 y_1(x) + a_1 y_2(x)
\]

where $y_1(x)$ and $y_2(x)$ are power series about $x_0 = 0$ with

\[
y_1(x) = 1 + \text{a summation of terms of order 2 or more}
\]

and
\[
y_2(x) = 1\cdot(x - x_0) + \text{a summation of terms of order 2 or more} .
\]

In fact, we can derive this observation more generally after recalling that the general solution
to a second-order, homogeneous linear differential equation is given by

\[
y(x) = a_0 y_1(x) + a_1 y_2(x)
\]

where $a_0$ and $a_1$ are arbitrary constants, and $y_1$ and $y_2$ form a linearly independent pair of particular
solutions to the given differential equation. In particular, we can take $y_1$ to be the solution satisfying
initial conditions
\[
y_1(x_0) = 1 \quad\text{and}\quad y_1'(x_0) = 0 ,
\]
while $y_2$ is the solution satisfying initial conditions
\[
y_2(x_0) = 0 \quad\text{and}\quad y_2'(x_0) = 1 .
\]

If our solutions can be written as power series about $x_0$, then $y_1$ and $y_2$ can be written as particular
power series
\[
y_1(x) = \sum_{k=0}^{\infty} \alpha_k (x - x_0)^{k} \quad\text{and}\quad y_2(x) = \sum_{k=0}^{\infty} \beta_k (x - x_0)^{k}
\]

where
\[
\alpha_0 = y_1(x_0) = 1 , \quad \alpha_1 = y_1'(x_0) = 0 , \quad \beta_0 = y_2(x_0) = 0 \quad\text{and}\quad \beta_1 = y_2'(x_0) = 1 ,
\]

and the other $\alpha_k$'s and $\beta_k$'s are fixed numbers (hopefully given by relatively simple formulas of
$k$). Thus,
\[
\begin{aligned}
y_1(x) &= \alpha_0 + \alpha_1 (x - x_0) + \alpha_2 (x - x_0)^{2} + \alpha_3 (x - x_0)^{3} + \cdots \\
       &= 1 + 0\cdot(x - x_0) + \alpha_2 (x - x_0)^{2} + \alpha_3 (x - x_0)^{3} + \cdots \\
       &= 1 + \sum_{k=2}^{\infty} \alpha_k (x - x_0)^{k} ,
\end{aligned}
\]


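while
\[
\begin{aligned}
y_2(x) &= \beta_0 + \beta_1 (x - x_0) + \beta_2 (x - x_0)^{2} + \beta_3 (x - x_0)^{3} + \cdots \\
       &= 0 + 1\cdot(x - x_0) + \beta_2 (x - x_0)^{2} + \beta_3 (x - x_0)^{3} + \cdots \\
       &= 1\cdot(x - x_0) + \sum_{k=2}^{\infty} \beta_k (x - x_0)^{k} ,
\end{aligned}
\]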
verifying that the observation made at the start of this subsection holds in general.
With regard to initial-value problems, we should note that, with these power series for $y_1$ and
$y_2$,
\[
y(x) = a_0 y_1(x) + a_1 y_2(x)
\]
automatically satisfies the initial conditions
\[
y(x_0) = a_0 \quad\text{and}\quad y'(x_0) = a_1
\]
for any choice of constants $a_0$ and $a_1$.

Even and Odd Solutions (a Common Pattern)

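Things simplify slightly when your recursion formula is of the form
\[
a_k = \rho(k)\, a_{k-2} \quad\text{for}\quad k = 2, 3, 4, \ldots
\]
where $\rho(k)$ is some formula of $k$. This turns out to be a relatively common situation. When this
happens,
\[
\begin{aligned}
a_2 &= \rho(2)\, a_0 , & a_3 &= \rho(3)\, a_1 , \\
a_4 &= \rho(4)\, a_2 = \rho(4)\rho(2)\, a_0 , & a_5 &= \rho(5)\, a_3 = \rho(5)\rho(3)\, a_1 , \\
a_6 &= \rho(6)\, a_4 = \rho(6)\rho(4)\rho(2)\, a_0 , & a_7 &= \rho(7)\, a_5 = \rho(7)\rho(5)\rho(3)\, a_1 , \\
a_8 &= \rho(8)\, a_6 = \rho(8)\rho(6)\rho(4)\rho(2)\, a_0 , & a_9 &= \rho(9)\, a_7 = \rho(9)\rho(7)\rho(5)\rho(3)\, a_1 , \\
&\;\;\vdots & &\;\;\vdots
\end{aligned}
\]
Clearly then, each even-indexed coefficient $a_k = a_{2m}$ with $k \ge 2$ is given by
\[
a_{2m} = c_{2m}\, a_0 \quad\text{where}\quad c_{2m} = \rho(2m)\, c_{2m-2} ,
\]
while each odd-indexed coefficient $a_k = a_{2m+1}$ with $k \ge 3$ is given by
\[
a_{2m+1} = c_{2m+1}\, a_1 \quad\text{where}\quad c_{2m+1} = \rho(2m+1)\, c_{2m-1} .
\]
If we further define $c_0 = c_1 = 1$ (simply so that $a_0 = c_0 a_0$ and $a_1 = c_1 a_1$), and let $x_0 = 0$, then
\[
\begin{aligned}
y(x) &= a_0 + a_1 x + a_2 x^{2} + a_3 x^{3} + a_4 x^{4} + a_5 x^{5} + a_6 x^{6} + \cdots \\
     &= c_0 a_0 + c_1 a_1 x + c_2 a_0 x^{2} + c_3 a_1 x^{3} + c_4 a_0 x^{4} + c_5 a_1 x^{5} + c_6 a_0 x^{6} + \cdots \\
     &= a_0 \bigl[c_0 + c_2 x^{2} + c_4 x^{4} + c_6 x^{6} + \cdots\bigr] + a_1 \bigl[c_1 x + c_3 x^{3} + c_5 x^{5} + c_7 x^{7} + \cdots\bigr] \\
     &= a_0 \sum_{m=0}^{\infty} c_{2m} x^{2m} + a_1 \sum_{m=0}^{\infty} c_{2m+1} x^{2m+1} .
\end{aligned}
\]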
In other words, our power series solution is a linear combination of an even power series with an odd
power series. Since these observations may prove useful in a few situations (and in a few exercises
at the end of this chapter) let us summarize them in a little theorem (with x − x 0 replacing x ).


Theorem 30.5
Let
\[
y(x) = \sum_{k=0}^{\infty} a_k (x - x_0)^{k}
\]
be a power series whose coefficients are related via a recursion formula
\[
a_k = \rho(k)\, a_{k-2} \quad\text{for}\quad k = 2, 3, 4, \ldots
\]
in which $\rho(k)$ is some formula of $k$. Then
\[
y(x) = a_0 y_E(x) + a_1 y_O(x)
\]
where $y_E$ and $y_O$ are, respectively, the even- and odd-termed series
\[
y_E(x) = \sum_{m=0}^{\infty} c_{2m} (x - x_0)^{2m} \quad\text{and}\quad y_O(x) = \sum_{m=0}^{\infty} c_{2m+1} (x - x_0)^{2m+1}
\]
with $c_0 = c_1 = 1$ and
\[
c_k = \rho(k)\, c_{k-2} \quad\text{for}\quad k = 2, 3, 4, \ldots .
\]
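
The theorem translates directly into a short computation. The sketch below is only an illustration under stated assumptions: the helper names are hypothetical, and the sample $\rho(k) = -1/(k(k-1))$ is not from the text. It is the recursion that the algebraic method gives for $y'' + y = 0$ about $x_0 = 0$, chosen because the even and odd series are then the familiar series for $\cos x$ and $\sin x$.

```python
from fractions import Fraction


def even_odd_coefficients(rho, count):
    """Return [c_0, ..., c_{count-1}] with c_0 = c_1 = 1 and c_k = rho(k) * c_{k-2}."""
    c = [Fraction(1), Fraction(1)]
    for k in range(2, count):
        c.append(rho(k) * c[k - 2])
    return c[:count]


def rho_example(k):
    # Assumed example: y'' + y = 0 about x0 = 0 gives a_k = -a_{k-2} / (k(k-1)).
    return Fraction(-1, k * (k - 1))


c = even_odd_coefficients(rho_example, 8)
even_part = c[0::2]  # coefficients of 1, x^2, x^4, x^6 in y_E
odd_part = c[1::2]   # coefficients of x, x^3, x^5, x^7 in y_O
print(even_part)
print(odd_part)
```

Any other $\rho$ can be passed in the same way, for example the one obtained for a Hermite or Chebyshev equation in the exercises.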

30.5 Validity of the Algebraic Method for Second-Order Equations
Let’s start by deﬁning “ordinary” and “singular” points.

Ordinary and Singular Points, and the Radius of Analyticity

Given a differential equation
\[
a(x)y'' + b(x)y' + c(x)y = 0 \tag{30.15}
\]

we classify any point $z_0$ as a singular point if either of the limits
\[
\lim_{z\to z_0} \frac{b(z)}{a(z)} \quad\text{or}\quad \lim_{z\to z_0} \frac{c(z)}{a(z)}
\]

fails to exist as a finite number. Note that (just as with our definition of singular points for first-order
differential equations) we used “ z ” in this definition, indicating that we may be considering points
on the complex plane as well. This can certainly be the case when a , b and c are rational functions.
And if a , b and c are rational functions, then the nonsingular points (i.e., the points that are not
singular points) are traditionally referred to as ordinary points for the above differential equation.
The radius of analyticity for the above differential equation about any given point z 0 is defined
just as before. It is the distance between z 0 and the singular point z s closest to z 0 , provided the
differential equation has at least one singular point. If the equation has no singular points, then we
define the equation’s radius of analyticity (about z 0 ) to be +∞ .
Again, it will be important to remember that the singular point $z_s$ of most interest in a particular
situation (i.e., the one closest to $z_0$) might not be on the real line, but some point in the complex
plane off the real line.


Nonexistence of Power Series Solutions

The above definitions are inspired by the same sort of computations as led to the analogous definitions
for first-order differential equations in section 30.3. I’ll leave those computations to you. In particular,
rewriting differential equation (30.15) as
\[
y''(x) = -\frac{b(x)}{a(x)}\, y'(x) - \frac{c(x)}{a(x)}\, y(x) ,
\]

and using the relations between the values $y(x_0)$, $y'(x_0)$ and $y''(x_0)$, and the first three coefficients
in
\[
y(x) = \sum_{k=0}^{\infty} a_k (x - x_0)^{k} ,
\]
you should be able to prove the second-order analog to lemma 30.1:

Lemma 30.6 (nonexistence of a power series solution)

If $x_0$ is a singular point for
\[
a(x)y'' + b(x)y' + c(x)y = 0 ,
\]
then this differential equation does not have a power series solution $y(x) = \sum_{k=0}^{\infty} a_k (x - x_0)^{k}$ with
$a_0$ and $a_1$ being arbitrary constants.

?Exercise 30.1: Verify lemma 30.6.

(By the way, a differential equation might have a “modiﬁed” power series solution about a
singular point. We’ll consider this possibility starting in chapter 32.)

Validity of the Algebraic Method

Once again, we have a lemma telling us that our algebraic method for finding power series solutions
about $x_0$ will fail if $x_0$ is a singular point (only now we are considering second-order equations).
And, unsurprisingly, we also have a second-order analog of theorem 30.2 assuring us that the method
will succeed when $x_0$ is an ordinary point for our second-order differential equation, and even giving
us a good idea of the interval over which the general power series solution is valid. That theorem is

Theorem 30.7 (existence of power series solutions)

Let $x_0$ be an ordinary point on the real line for
\[
a(x)y'' + b(x)y' + c(x)y = 0
\]
where $a$, $b$ and $c$ are rational functions. Then this differential equation has a general power series
solution
\[
y(x) = \sum_{k=0}^{\infty} a_k (x - x_0)^{k}
\]
with $a_0$ and $a_1$ being the arbitrary constants. Moreover, this solution is valid at least on the interval
$(x_0 - R, x_0 + R)$ where $R$ is the radius of analyticity about $x_0$ for the differential equation.

And again, we will wait until the next chapter to prove this theorem (or a slightly more general
version of this theorem).


Identifying Singular and Ordinary Points

The basic approach to identifying a point $z_0$ as being either a singular or ordinary point for
\[
a(x)y'' + b(x)y' + c(x)y = 0
\]
is to look at the limits
\[
\lim_{z\to z_0} \frac{b(z)}{a(z)} \quad\text{and}\quad \lim_{z\to z_0} \frac{c(z)}{a(z)} .
\]

If the limits are both finite numbers, $z_0$ is an ordinary point; otherwise $z_0$ is a singular point. And if
you think about how these limits are determined by the values of $a(z)$, $b(z)$ and $c(z)$ as $z \to z_0$, you'll
derive the shortcuts listed in the next lemma.

Lemma 30.8 (tests for ordinary/singular points)

Let $z_0$ be a point on the complex plane, and consider a differential equation
\[
a(x)y'' + b(x)y' + c(x)y = 0
\]
in which $a$, $b$ and $c$ are all rational functions. Then:

1. If $a(z_0)$, $b(z_0)$ and $c(z_0)$ are all finite values with $a(z_0) \ne 0$, then $z_0$ is an ordinary point
for the differential equation.

2. If $a(z_0)$, $b(z_0)$ and $c(z_0)$ are all finite values with $a(z_0) = 0$, and either $b(z_0) \ne 0$ or
$c(z_0) \ne 0$, then $z_0$ is a singular point for the differential equation.

3. If $a(z_0)$ is a finite value but either
\[
\lim_{z\to z_0} |b(z)| = \infty \quad\text{or}\quad \lim_{z\to z_0} |c(z)| = \infty ,
\]
then $z_0$ is a singular point for the differential equation.

4. If $b(z_0)$ and $c(z_0)$ are finite numbers, and
\[
\lim_{z\to z_0} |a(z)| = \infty ,
\]
then $z_0$ is an ordinary point for the differential equation.

Applying the above to the differential equations of interest here (rewritten in the form recommended
for the algebraic method) gives us:

Corollary 30.9
Let $A$, $B$, and $C$ be polynomials with no factors shared by all three. Then a point $z_0$ on the
complex plane is a singular point for
\[
A(x)y'' + B(x)y' + C(x)y = 0
\]
if and only if $A(z_0) = 0$.

!Example 30.6: The coefﬁcients in Airy’s equation

\[
y'' - xy = 0
\]


are polynomials with the first coefficient being
\[
A(x) = 1 .
\]

Since there is no $z_s$ in the complex plane such that $A(z_s) = 0$, Airy's equation has no singular
point, and theorem 30.7 assures us that the power series solution we obtained in solving Airy's
equation (formula set (30.14) on page 609) is valid for all $x$.
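
Corollary 30.9 and the definition of the radius of analyticity reduce to a small computation once the zeros of the leading polynomial $A$ are known. The following is a rough sketch with hypothetical helper names; it assumes $A$, $B$ and $C$ share no common factor (as the corollary requires) and, to stay within the standard library, finds roots only for a quadratic $A$.

```python
import cmath


def quadratic_roots(p2, p1, p0):
    """Roots of p2*z^2 + p1*z + p0 (p2 != 0), allowing complex roots."""
    disc = cmath.sqrt(p1 * p1 - 4 * p2 * p0)
    return [(-p1 + disc) / (2 * p2), (-p1 - disc) / (2 * p2)]


def radius_of_analyticity(singular_points, x0):
    """Distance from x0 to the nearest singular point; infinity if there are none."""
    if not singular_points:
        return float("inf")
    return min(abs(z - x0) for z in singular_points)


# Airy's equation: A(x) = 1 has no zeros, so there are no singular points.
print(radius_of_analyticity([], 0))

# Leading coefficient A(x) = 1 + x^2 about x0 = 0: the zeros are i and -i.
print(radius_of_analyticity(quadratic_roots(1, 0, 1), 0))

# Leading coefficient A(x) = x^2 - 2x + 2 about x0 = 1: the zeros are 1 + i and 1 - i.
print(radius_of_analyticity(quadratic_roots(1, -2, 2), 1))
```

Note that in the last two cases the nearest singular points lie off the real line, which is exactly the situation the text warns about.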

30.6 The Taylor Series Method

The Basic Idea (Expanded)
In this approach to finding power series solutions, we compute the terms in the Taylor series for the
solution,
\[
y(x) = \sum_{k=0}^{\infty} \frac{y^{(k)}(x_0)}{k!}\, (x - x_0)^{k} ,
\]
in a manner reminiscent of the way you probably computed Taylor series in elementary calculus.
Unfortunately, this approach often fails to yield a useful general formula for the coefficients of the
power series solution (unless the original differential equation is very simple). Consequently, you
typically end up with a partial sum of the power series solution consisting of however many terms
you've had the time or desire to compute. But there are two big advantages to this method over the
algebraic method:
1. The computation of the individual coefficients of the power series solution is a little more
direct and may require a little less work than when using the algebraic method, at least for the
first few terms (provided you are proficient with product and chain rules of differentiation).

2. The method can be used on a much more general class of differential equations than described
so far. In fact, it can be used to formally find the Taylor series solution for any differential
equation that can be rewritten in the form
\[
y' = F_1(x, y) \quad\text{or}\quad y'' = F_2(x, y, y')
\]
where $F_1$ and $F_2$ are known functions that are sufficiently differentiable with respect to all
variables.
With regard to the last comment, observe that
\[
a(x)y' + b(x)y = 0 \quad\text{and}\quad a(x)y'' + b(x)y' + c(x)y = 0
\]
can be rewritten, respectively, as
\[
y' = \underbrace{-\frac{b(x)}{a(x)}\, y}_{F_1(x,y)} \quad\text{and}\quad y'' = \underbrace{-\frac{b(x)}{a(x)}\, y' - \frac{c(x)}{a(x)}\, y}_{F_2(x,y,y')} .
\]

So this method can be used on the same differential equations we used the algebraic method on in
the previous sections. Whether you would want to is a different matter.


The Steps in the Taylor Series Method

Here are the steps in our procedure for ﬁnding at least a partial sum for the Taylor series about a point
x 0 for the solution to a fairly arbitrary ﬁrst- or second-order differential equation. As an example,
let us use the differential equation
\[
y'' + \cos(x)\, y = 0 .
\]

As before we have two preliminary steps:

Pre-step 1: Depending on whether the original differential equation is first order or second order,
respectively, rewrite it as
\[
y' = F_1(x, y) \quad\text{or}\quad y'' = F_2(x, y, y') .
\]

For our example, we simply subtract $\cos(x)y$ from both sides, obtaining
\[
y'' = -\cos(x)\, y .
\]

Pre-step 2: Choose a value for x 0 . It should be an ordinary point if the differential equation
is linear. (More generally, F1 or F2 must be “differentiable enough” to carry out all the
subsequent steps in this procedure.) If initial values are given for y(x) at some point, then
use that point for x 0 . Otherwise, choose x 0 as convenient — which usually means choosing
x0 = 0 .
For our example, we choose x 0 = 0 .

Now for the steps in the Taylor series method. In going through this method, note that the last
step and some of the ﬁrst few steps depend on whether your original differential equation is ﬁrst or
second order.

Step 1: Set
\[
y(x_0) = a_0 .
\]
If an initial value for $y$ at $x_0$ has already been given, you can use that value for $a_0$. Otherwise,
treat $a_0$ as an arbitrary constant. It will be the first term in the power series solution.
For our example, we chose $x_0 = 0$, and have no given initial values. So we set
\[
y(0) = a_0
\]
with $a_0$ being an arbitrary constant.

Step 2: (a) If the original differential equation is first order, then compute $y'(x_0)$ by plugging
$x_0$ and the initial value set above into the differential equation from the first pre-step,
\[
y' = F_1(x, y) .
\]

Remember to take into account the fact that $y$ is shorthand for $y(x)$. Thus,
\[
y'(x_0) = F_1(x_0, y(x_0)) = F_1(x_0, a_0) .
\]

Our sample differential equation is second order. So we do nothing here.


(b) If the original differential equation is second order, then just set
\[
y'(x_0) = a_1 .
\]

If an initial value for $y'$ at $x_0$ has already been given, then use that value for $a_1$.
Otherwise, treat $a_1$ as an arbitrary constant.
Since our sample differential equation is second order, we set
\[
y'(0) = a_1
\]
with $a_1$ being an arbitrary constant.

Step 3: (a) If the original differential equation is first order, then differentiate both sides of
\[
y' = F_1(x, y)
\]
to obtain an expression of the form
\[
y'' = F_2(x, y, y') .
\]

Our sample differential equation is second order. So we do nothing here.

\[
y'''(0) = \sin(0)\, y(0) - \cos(0)\, y'(0) = 0\cdot a_0 - 1\cdot a_1 = -a_1 .
\]

For steps 5, 6 and so on, simply repeat step 4 with derivatives of increasing order. In general:

Step k: Using the formula
\[
y^{(k-1)} = F_{k-1}\bigl(x, y, y', y'', \ldots, y^{(k-2)}\bigr) ,
\]
obtained in the previous step, and values for $y(x_0)$, $y'(x_0)$, $y''(x_0)$, ... and $y^{(k-1)}(x_0)$
determined in the previous steps:
(a) Differentiate both sides of this formula for $y^{(k-1)}$ to obtain a corresponding formula
for $y^{(k)}$,
\[
y^{(k)} = F_k\bigl(x, y, y', y'', \ldots, y^{(k-1)}\bigr) .
\]

(b) Then compute $y^{(k)}(x_0)$ by plugging $x_0$ into the formula for $y^{(k)}(x)$ just derived,
\[
y^{(k)}(x_0) = F_k\bigl(x_0, y(x_0), y'(x_0), \ldots, y^{(k-1)}(x_0)\bigr) .
\]

Save this newly computed value for future use.

Finally, you need to stop. Assuming you’ve computed y (k) (x 0 ) for k up to N , where N is either
some predetermined integer or the order of the last derivative computed before you decided you’ve
done enough:

Last Step: If you can determine a general formula for $y^{(k)}(x_0)$ in terms of $a_0$ or in terms of $a_0$
and $a_1$ (depending on whether the original differential equation is first or second order), then
use that formula to write out the Taylor series for the solution,
\[
y(x) = \sum_{k=0}^{\infty} \frac{y^{(k)}(x_0)}{k!}\, (x - x_0)^{k} .
\]

Otherwise, just use the computed values of the $y^{(k)}(x_0)$'s to write out the $N$th partial sum
for this Taylor series, obtaining the solution's $N$th partial sum approximation
\[
y(x) \approx \sum_{k=0}^{N} \frac{y^{(k)}(x_0)}{k!}\, (x - x_0)^{k} .
\]

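Thus, the sixth partial sum of the Taylor series for $y$ about $0$ is
\[
\begin{aligned}
S_6(x) &= \sum_{k=0}^{6} \frac{y^{(k)}(0)}{k!}\, x^{k} \\
&= y(0) + y'(0)x + \frac{y''(0)}{2!}x^{2} + \frac{y'''(0)}{3!}x^{3} + \frac{y^{(4)}(0)}{4!}x^{4} + \frac{y^{(5)}(0)}{5!}x^{5} + \frac{y^{(6)}(0)}{6!}x^{6} \\
&= a_0 + a_1 x + \frac{-a_0}{2}x^{2} + \frac{-a_1}{3!}x^{3} + \frac{2a_0}{4!}x^{4} + \frac{4a_1}{5!}x^{5} + \frac{-9a_0}{6!}x^{6} \\
&= a_0\Bigl[1 - \frac{1}{2}x^{2} + \frac{1}{12}x^{4} - \frac{1}{80}x^{6}\Bigr] + a_1\Bigl[x - \frac{1}{6}x^{3} + \frac{1}{30}x^{5}\Bigr] .
\end{aligned}
\]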
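
The repeated differentiation in this example can be mechanized. The following is a minimal sketch for this one equation only, with hypothetical function names. It assumes the derivatives of $y'' = -\cos(x)\,y$ are expanded with the Leibniz product rule, so that $y^{(k+2)}(0) = -\sum_{j=0}^{k} \binom{k}{j} \cos^{(j)}(0)\, y^{(k-j)}(0)$, and it tracks each value as a pair $(p, q)$ meaning $p\,a_0 + q\,a_1$.

```python
from fractions import Fraction
from math import comb, factorial


def taylor_derivatives(order):
    """Return [(p_k, q_k)] with y^(k)(0) = p_k*a0 + q_k*a1 for y'' = -cos(x)*y."""
    cos_derivs_at_0 = [1, 0, -1, 0]  # cos, -sin, -cos, sin evaluated at 0 (period 4)
    d = [(1, 0), (0, 1)]             # y(0) = a0 and y'(0) = a1
    for k in range(order - 1):
        # Leibniz rule for the k-th derivative of -cos(x)*y(x), evaluated at x = 0.
        p = -sum(comb(k, j) * cos_derivs_at_0[j % 4] * d[k - j][0] for j in range(k + 1))
        q = -sum(comb(k, j) * cos_derivs_at_0[j % 4] * d[k - j][1] for j in range(k + 1))
        d.append((p, q))
    return d[:order + 1]


def partial_sum_coefficients(order):
    """Coefficient of x^k in the partial sum, as a pair (part times a0, part times a1)."""
    return [(Fraction(p, factorial(k)), Fraction(q, factorial(k)))
            for k, (p, q) in enumerate(taylor_derivatives(order))]


for k, (p, q) in enumerate(partial_sum_coefficients(6)):
    print(f"x^{k}: ({p})*a0 + ({q})*a1")
```

For other equations the Leibniz step would have to be replaced by whatever differentiation the right-hand side $F_2$ requires, so this is not a general implementation of the method.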

Validity of the Solutions

If the solutions to a given differential equation are analytic at a point x 0 , then the above method will
clearly ﬁnd the Taylor series (i.e., the power series) for these solutions about x 0 . Hence any valid
theorem stating the existence and radius of convergence of power series solutions to a given type of
differential equation about a point x 0 also assures us of the validity of the Taylor series method with
these equations. In particular, we can appeal to theorems 30.2 on page 602 and 30.7 on page 613,
or even to the more general versions of these theorems in the next chapter (theorems 31.9 and 31.10
starting on page 639). In particular, theorem 31.9 will assure us that our above use of the Taylor
series method in attempting to solve
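\[
y'' + \cos(x)\, y = 0
\]

is valid. Hence, it will assure us that, at least when $x \approx 0$,
\[
y(x) \approx a_0\Bigl[1 - \frac{1}{2}x^{2} + \frac{1}{12}x^{4} - \frac{1}{80}x^{6}\Bigr] + a_1\Bigl[x - \frac{1}{6}x^{3} + \frac{1}{30}x^{5}\Bigr] .
\]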
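
How close to $0$ the variable $x$ must be can be explored numerically. The sketch below is an informal check, not part of the method. It assumes a classical fourth-order Runge-Kutta integration of the equivalent first-order system as the reference solution, and the function names, step count, and sample values $a_0 = a_1 = 1$ are arbitrary choices.

```python
from math import cos


def s6(x, a0, a1):
    """Sixth partial sum from the text for y'' + cos(x) y = 0 about x0 = 0."""
    return (a0 * (1 - x**2 / 2 + x**4 / 12 - x**6 / 80)
            + a1 * (x - x**3 / 6 + x**5 / 30))


def rk4_solution(x_end, a0, a1, steps=1000):
    """Integrate y' = v, v' = -cos(x) y from 0 to x_end with classical RK4."""
    h = x_end / steps
    x, y, v = 0.0, float(a0), float(a1)
    for _ in range(steps):
        k1y, k1v = v, -cos(x) * y
        k2y, k2v = v + h / 2 * k1v, -cos(x + h / 2) * (y + h / 2 * k1y)
        k3y, k3v = v + h / 2 * k2v, -cos(x + h / 2) * (y + h / 2 * k2y)
        k4y, k4v = v + h * k3v, -cos(x + h) * (y + h * k3y)
        y += h / 6 * (k1y + 2 * k2y + 2 * k3y + k4y)
        v += h / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
        x += h
    return y


# Compare the partial sum with the numerical solution at a few points.
for x_end in (0.25, 0.5, 1.0, 2.0):
    approx = s6(x_end, 1, 1)
    reference = rk4_solution(x_end, 1, 1)
    print(x_end, approx, reference, abs(approx - reference))
```

The printed differences should grow as $x$ moves away from $0$, which is the expected behavior of a fixed-order partial sum.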

30.7 Appendix: Using Induction

The Basic Ideas
Suppose we have some sequence of numbers — A0 , A1 , A2 , A3 , . . . — with each of these
numbers (other than A0 ) related by some recursion formula to the previous number in the sequence.
Further suppose we have some other formula F(k) , and we suspect that

Ak = F(k) for k = 0, 1, 2, 3, . . . .

The obvious question is whether our suspicions are justiﬁed. Can we conﬁrm that the above equality
holds for every nonnegative integer k ?
Well, let’s make two more assumptions:

1. That we know
A0 = F(0) .


2. That we also know that, for every nonnegative integer $N$, we can rigorously derive the
equality
\[
A_{N+1} = F(N+1)
\]
from
\[
A_N = F(N)
\]
using the recursion formula for the $A_k$'s. For brevity, let's express this assumption as
\[
A_N = F(N) \;\Longrightarrow\; A_{N+1} = F(N+1) \quad\text{for}\quad N = 0, 1, 2, \ldots . \tag{30.16}
\]

These two assumptions are the "steps" in the basic principle of induction with a numeric
sequence. The first, that $A_0 = F(0)$, is the base step or anchor, and the second, implication
(30.16), is the inductive step. It is important to realize that, in implication (30.16), we are not
assuming $A_N = F(N)$ is actually true, only that we could derive $A_{N+1} = F(N+1)$ provided
$A_N = F(N)$.
However, if these assumptions hold, then we do know $A_0 = F(0)$, which is $A_N = F(N)$ with
$N = 0$. That, combined with implication (30.16), assures us that we could then derive the equality

A0+1 = F(0 + 1) .

So, in fact, we also know that

A1 = F(1) .
But this last equation, along with implication (30.16) (using N = 1 ) assures us that we could then
obtain
A1+1 = F(1 + 1) ,
assuring us that
A2 = F(2) .
And this, with implication (30.16) (this time using N = 2 ) tells us that we could then derive

A2+1 = F(2 + 1) ,

thus conﬁrming that

A3 = F(3) .
And, clearly, we can continue, successively conﬁrming that

A4 = F(4) , A5 = F(5) , A6 = F(6) , A7 = F(7) , ... .

Ultimately, we could conﬁrm that

Ak = F(k)
for any given positive integer k , thus assuring us that our original suspicions were justiﬁed.
To summarize:

Theorem 30.10 (basic principle of induction for numeric sequences)

Let A0 , A1 , A2 , A3 , . . . be a sequence of numbers related to each other via some recursion
formula, and let F be some function with F(k) being deﬁned and ﬁnite for each nonnegative
integer k . Assume further that

1. (base step) A0 = F(0) ,

and that


2. (inductive step) for each nonnegative integer $N$, the equality
\[
A_{N+1} = F(N+1)
\]
can be obtained from the equality
\[
A_N = F(N)
\]
using the recursion formula for the $A_k$'s.

Then
\[
A_k = F(k) \quad\text{for}\quad k = 0, 1, 2, 3, \ldots .
\]

To use the above, we must ﬁrst verify that the two assumptions in the theorem actually hold for
the case at hand. That is, we must verify both that

1. A0 = F(0)
and that
2. for each nonnegative integer $N$, $A_{N+1} = F(N+1)$ can be derived from $A_N = F(N)$ using
the recursion formula.
In practice, the formula F(k) comes from noting a “pattern” in computing the first few Ak ’s using
a recursion formula. Consequently, the verification of the base step is typically contained in those
computations. It is the verification of the inductive step that is most important. That step is where
we confirm that the “pattern” observed in computing the first few Ak ’s continues and can be used
to compute the rest of the Ak ’s .

!Example 30.8: For our $A_k$'s, let us use the coefficients of the power series solution obtained
in our first example of the algebraic method,
\[
A_k = a_k
\]
where $a_0$ is an arbitrary constant, and the other $a_k$'s are generated via recursion formula (30.6)
on page 592 (see footnote 7),
\[
a_k = \frac{k+1}{2k}\, a_{k-1} \quad\text{for}\quad k = 1, 2, 3, 4, \ldots . \tag{30.17}
\]
In particular, we obtained:
\[
a_1 = \frac{2}{2}\, a_0 , \quad a_2 = \frac{3}{2^{2}}\, a_0 , \quad a_3 = \frac{4}{2^{3}}\, a_0 , \quad a_4 = \frac{5}{2^{4}}\, a_0 , \quad a_5 = \frac{6}{2^{5}}\, a_0 ,
\]
and "speculated" that, in general,
\[
a_k = \frac{k+1}{2^{k}}\, a_0 \quad\text{for}\quad k = 0, 1, 2, 3, 4, \ldots . \tag{30.18}
\]
So, we want to show that, indeed,
\[
a_k = F(k) \quad\text{with}\quad F(k) = \frac{k+1}{2^{k}}\, a_0 \quad\text{for}\quad k = 0, 1, 2, 3, 4, \ldots .
\]
To apply the principle of induction, we first must verify that
\[
a_0 = F(0) .
\]
Footnote 7: Originally, recursion formula (30.6) was derived for $k \ge 2$. However, with $k = 1$, this recursion formula reduces to
$a_1 = a_0$, which is true for this power series solution. So this formula is valid for all positive integers $k$.


But this is easily done by just computing $F(0)$,
\[
F(0) = \frac{0+1}{2^{0}}\, a_0 = a_0 .
\]
Next comes the verification of the inductive step; that is, the verification that for any nonnegative
integer $N$, the recursion formula for the $a_k$'s yields the implication
\[
a_N = F(N) \;\Longrightarrow\; a_{N+1} = F(N+1) .
\]

Now,
\[
F(N) = \frac{N+1}{2^{N}}\, a_0 \quad\text{and}\quad F(N+1) = \frac{[N+1]+1}{2^{[N+1]}}\, a_0 = \frac{N+2}{2^{N+1}}\, a_0 .
\]
So what we really need to show is that the implication
\[
a_N = \frac{N+1}{2^{N}}\, a_0 \;\Longrightarrow\; a_{N+1} = \frac{N+2}{2^{N+1}}\, a_0 \tag{30.19}
\]
holds because of recursion formula (30.17), which, with $k = N + 1$, becomes
\[
a_{N+1} = \frac{[N+1]+1}{2[N+1]}\, a_{[N+1]-1} = \frac{N+2}{2[N+1]}\, a_N .
\]
Combining this with the first equation in implication (30.19) gives us
\[
a_{N+1} = \frac{N+2}{2[N+1]}\, a_N = \frac{N+2}{2[N+1]} \cdot \frac{N+1}{2^{N}}\, a_0 = \frac{N+2}{2^{N+1}}\, a_0 ,
\]
which is the second equation in implication (30.19). This confirms that implication (30.19) holds
for every nonnegative integer $N$.
With both steps of the inductive process successfully verified, the principle of induction
(theorem 30.10) assures us that yes, indeed,
\[
a_k = \frac{k+1}{2^{k}}\, a_0 \quad\text{for}\quad k = 0, 1, 2, 3, 4, \ldots
\]
just as we originally speculated.
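
Before attempting an induction proof, it can be worth checking a speculated formula against the recursion for many indices. The following is a small sketch with hypothetical function names. It compares recursion formula (30.17) with the closed form (30.18) using exact fractions, and also checks the algebra of the inductive step for a range of $N$. A finite check like this only guards against a wrong guess; it does not replace the proof.

```python
from fractions import Fraction


def coefficients_from_recursion(a0, count):
    """Generate a_0, ..., a_{count-1} from a_k = (k+1)/(2k) * a_{k-1}, formula (30.17)."""
    a = [Fraction(a0)]
    for k in range(1, count):
        a.append(Fraction(k + 1, 2 * k) * a[k - 1])
    return a


def closed_form(a0, k):
    """Speculated formula (30.18): a_k = (k+1)/2^k * a_0."""
    return Fraction(k + 1, 2**k) * a0


a0 = Fraction(3)  # arbitrary sample value for the arbitrary constant
a = coefficients_from_recursion(a0, 40)

# Base step and pattern check: the recursion and the closed form agree.
assert all(a[k] == closed_form(a0, k) for k in range(40))

# Inductive step check: (N+2)/(2[N+1]) * F(N) equals F(N+1) for each tested N.
for N in range(40):
    assert Fraction(N + 2, 2 * (N + 1)) * closed_form(a0, N) == closed_form(a0, N + 1)

print(a[:6])
```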

Usage Notes
In the above example, the Ak ’s were all the coefﬁcients in a particular power series. In other ex-
amples, especially examples involving second-order equations, you may have to separately consider
different subsequences of the coefﬁcients.
!Example 30.9: In deriving the power series solution $\sum_{k=0}^{\infty} a_k x^{k}$ to Airy's equation, we
obtained three patterns:
\[
a_{3n} = \frac{1}{(2\cdot 3)(5\cdot 6)\cdots([3n-1]\cdot 3n)}\, a_0 \quad\text{for}\quad n = 1, 2, 3, \ldots ,
\]
\[
a_{3n+1} = \frac{1}{(3\cdot 4)(6\cdot 7)\cdots(3n\cdot[3n+1])}\, a_1 \quad\text{for}\quad n = 1, 2, 3, \ldots
\]
and
\[
a_{3n+2} = 0 \quad\text{for}\quad n = 1, 2, 3, \ldots .
\]


To verify each of these formulas, we would need to apply the method of induction three times:

1. once with $A_n = a_{3n}$ and $F(n) = \dfrac{1}{(2\cdot 3)(5\cdot 6)\cdots([3n-1]\cdot 3n)}\, a_0$,

2. once with $A_n = a_{3n+1}$ and $F(n) = \dfrac{1}{(3\cdot 4)(6\cdot 7)\cdots(3n\cdot[3n+1])}\, a_1$, and

3. once with $A_n = a_{3n+2}$ and $F(n) = 0$.

It probably should be noted that there are slight variations on the method of induction. In
particular, it should be clear that, in the inductive step, we need not base our derivation of $a_{N+1} = F(N+1)$
on just $a_N = F(N)$; we could base it on any or all equations $a_k = F(k)$ with
$k \le N$. To be precise, a slight variation in our development of theorem 30.10 would have given us
the following:

Theorem 30.11 (alternative principle of induction for numeric sequences)

Let A0 , A1 , A2 , A3 , . . . be a sequence of numbers related to each other via some set of recursion
formulas, and let F be some function with F(k) being deﬁned and ﬁnite for each nonnegative
integer k . Assume further that
1. (base step) A0 = F(0) ,
and that
2. (inductive step) for each nonnegative integer $N$, the equality
\[
A_{N+1} = F(N+1)
\]
can be obtained from the equalities
\[
A_k = F(k) \quad\text{for}\quad k = 0, 1, 2, \ldots, N
\]
using the given set of recursion formulas.
Then
\[
A_k = F(k) \quad\text{for}\quad k = 0, 1, 2, 3, \ldots .
\]

This would be the version used with recursion formulas having two or more terms.

The General Principle of Induction

For the sake of completeness, it should be mentioned that the general principle of induction is a
fundamental principle of logic. Here’s the basic version:

Theorem 30.12 (basic general principle of induction)

Let $S_0$, $S_1$, $S_2$, $S_3$, ... be a sequence of logical statements. Assume further that
1. $S_0$ is true,
and that
2. for each nonnegative integer $N$, it can be shown that $S_{N+1}$ is true provided $S_N$ is true.
Then all the $S_k$'s are true.

You can ﬁnd discussions of this principle in just about any “introduction to mathematical logic”
or “introduction to abstract mathematics” textbook.

?Exercise 30.2: What is the logical statement Sk in theorem 30.10?


Additional Exercises

30.3. Using the algebraic method from section 30.2, find a general power series solution about
$x_0$ to each of the first-order differential equations given below. In particular:

i Identify the recursion formula for the coefficients.

ii Use the recursion formula to find the corresponding series solution with at least
four nonzero terms explicitly given (if the series terminates, just write out the
corresponding polynomial solution).

iii Unless otherwise instructed, find a general formula for the coefficients and write
out the resulting series.

Also, then solve each of these equations using more elementary methods.
a. $y' - 2y = 0$ with $x_0 = 0$

b. $y' - 2xy = 0$ with $x_0 = 0$

c. $y' + \dfrac{2}{2x - 1}\, y = 0$ with $x_0 = 0$

d. $(x - 3)y' - 2y = 0$ with $x_0 = 0$

e. $\bigl(1 + x^{2}\bigr)y' - 2xy = 0$ with $x_0 = 0$

f. $y' + \dfrac{1}{x - 1}\, y = 0$ with $x_0 = 0$

g. $y' + \dfrac{1}{x - 1}\, y = 0$ with $x_0 = 3$

h. $(1 - x)y' - 2y = 0$ with $x_0 = 5$

i. $(2 - x^{3})y' - 3x^{2}y = 0$ with $x_0 = 0$

j. $(2 - x^{3})y' + 3x^{2}y = 0$ with $x_0 = 0$

k. $(1 + x)y' - xy = 0$ with $x_0 = 0$
(Do not attempt finding a general formula for the coefficients.)

l. $(1 + x)y' + (1 - x)y = 0$ with $x_0 = 0$
(Do not attempt finding a general formula for the coefficients.)

30.4. For each of the equations in the previous set, ﬁnd each singular point z s , the radius of
analyticity R about the given x 0 , and the interval I over which the power series solution
is guaranteed to be valid according to theorem 30.2.

30.5. Find the general power series solution about the given $x_0$ for each of the differential
equations given below. In your work and answers:

i Identify the recursion formula for the coefficients.

ii Express your final answer as a linear combination of a linearly independent pair of
power series solutions. If a series terminates, write out the resulting polynomial.
Otherwise write out the series with at least four nonzero terms explicitly given.

iii Unless otherwise instructed, find a formula for the coefficients of each series, and
write out the resulting series.

a. $\bigl(1 + x^{2}\bigr)y'' - 2y = 0$ with $x_0 = 0$

b. $y'' + xy' + y = 0$ with $x_0 = 0$

c. $\bigl(4 + x^{2}\bigr)y'' + 2xy' = 0$ with $x_0 = 0$

d. $y'' - 3x^{2}y = 0$ with $x_0 = 0$
(Do not attempt finding a general formula for the coefficients.)

e. $\bigl(4 - x^{2}\bigr)y'' - 5xy' - 3y = 0$ with $x_0 = 0$
(Do not attempt finding a general formula for the coefficients.)

f. $\bigl(1 - x^{2}\bigr)y'' - xy' + 4y = 0$ with $x_0 = 0$ (a Chebyshev equation)
(Do not attempt finding a general formula for the coefficients.)

g. $y'' - 2xy' + 6y = 0$ with $x_0 = 0$ (a Hermite equation)
(Do not attempt finding a general formula for the coefficients.)

h. $\bigl(x^{2} - 6x\bigr)y'' + 4(x - 3)y' + 2y = 0$ with $x_0 = 3$

i. $y'' + (x + 2)y' + 2y = 0$ with $x_0 = -2$

j. $\bigl(x^{2} - 2x + 2\bigr)y'' + (1 - x)y' - 3y = 0$ with $x_0 = 1$
(Do not attempt finding a general formula for the coefficients.)

k. $y'' - 2y' - xy = 0$ with $x_0 = 0$
(Do not attempt finding a general formula for the coefficients.)

l. $y'' - xy' - 2xy = 0$ with $x_0 = 0$
(Do not attempt finding a general formula for the coefficients.)

30.6. For each of the equations in the previous set, ﬁnd each singular point z s , the radius of
analyticity R about the given x 0 , and the interval I over which the power series solution
is guaranteed to be valid according to theorem 30.7.

30.7. In describing the coefficients of power series solutions, experienced mathematicians often
condense lengthy formulas through clever use of the factorial. Develop some of that
experience by deriving the following expressions for the indicated products. Do not simply
multiply all the factors and compare the two sides of each equation. Instead, use clever
algebra to convert one side of the equation to the other.
a. Products of the first $m$ even integers:
i. $6\cdot 4\cdot 2 = 2^{3}\, 3!$ (Hint: $6\cdot 4\cdot 2 = (2\cdot 3)(2\cdot 2)(2\cdot 1)$)
ii. $8\cdot 6\cdot 4\cdot 2 = 2^{4}\, 4!$
iii. $(2m)(2m-2)(2m-4)\cdots 6\cdot 4\cdot 2 = 2^{m}\, m!$

b. Products of the first $m$ odd integers:
i. $5\cdot 3\cdot 1 = \dfrac{5!}{2^{2}\, 2!}$
ii. $7\cdot 5\cdot 3\cdot 1 = \dfrac{7!}{2^{3}\, 3!}$
iii. $(2m-1)(2m-3)(2m-5)\cdots 5\cdot 3\cdot 1 = \dfrac{(2m-1)!}{2^{m-1}\,(m-1)!}$

30.8. The following exercises concern the Hermite equation, which is

y − 2x y + λy = 0

where λ is a constant (as in exercise 30.5 g, above).

a. Using the algebraic method, derive the general
2∞ recursion formula (in terms of λ ) for the
general power series solution yλ (x) = k=0 ka x k to the above Hermite equation.

b. Over what interval are these power series solutions guaranteed to be valid according to
theorem 30.7?
c. Using the recursion formula just found, along with theorem 30.5 on page 612, verify that
the general power series solution yλ can be written as

\[ y_\lambda(x) = a_0\, y_{\lambda,E}(x) + a_1\, y_{\lambda,O}(x) \]

where $y_{\lambda,E}$ and $y_{\lambda,O}$ are, respectively, even- and odd-termed series

\[ y_{\lambda,E}(x) = \sum_{m=0}^{\infty} c_{2m}\, x^{2m} \quad\text{and}\quad y_{\lambda,O}(x) = \sum_{m=0}^{\infty} c_{2m+1}\, x^{2m+1} \]

with $c_0 = c_1 = 1$ and the other $c_k$'s satisfying some recursion formula. Write out that
recursion formula.
d. Assume N is a nonnegative integer, and find the one value $\lambda_N$ for λ such that the
above-found recursion formula yields $a_{N+2} = 0 \cdot a_N$ .
e. Using the above, show that:
i. If $\lambda = \lambda_N$ for some nonnegative integer N , then exactly one of the two power series
$y_{\lambda,E}(x)$ or $y_{\lambda,O}(x)$ reduces to an even or odd N th-degree polynomial $p_N$ , with

\[ p_N(x) = \begin{cases} y_{\lambda,E}(x) & \text{if } N \text{ is even} \\ y_{\lambda,O}(x) & \text{if } N \text{ is odd} \end{cases} , \]

and with the other power series not reducing to a polynomial. (The polynomials, multiplied
by suitable constants, are called the Hermite polynomials.)
ii. If $\lambda \ne \lambda_N$ for any nonnegative integer N , then neither of the two power series $y_{\lambda,E}(x)$
or $y_{\lambda,O}(x)$ reduces to a polynomial.
f. Find the polynomial solution p N (x) when
i. N = 0 ii. N = 1 iii. N = 2 iv. N = 3
v. N = 4 vi. N = 5


30.9. The following exercises concern the Chebyshev8 equation with parameter λ

\[ (1 - x^2)\, y'' - x y' + \lambda y = 0 \]

(as in exercise 30.5 f, above). The parameter λ may be any constant.

a. Using the algebraic method, derive the general recursion formula (in terms of λ ) for the
general power series solution $y_\lambda(x) = \sum_{k=0}^{\infty} a_k x^k$ to the above Chebyshev equation.

b. Using the recursion formula just found, along with theorem 30.5 on page 612, verify that
the general power series solution yλ can be written as

\[ y_\lambda(x) = a_0\, y_{\lambda,E}(x) + a_1\, y_{\lambda,O}(x) \]

where $y_{\lambda,E}$ and $y_{\lambda,O}$ are, respectively, even- and odd-termed series

\[ y_{\lambda,E}(x) = \sum_{m=0}^{\infty} c_{2m}\, x^{2m} \quad\text{and}\quad y_{\lambda,O}(x) = \sum_{m=0}^{\infty} c_{2m+1}\, x^{2m+1} \]

with $c_0 = c_1 = 1$ and the other $c_k$'s satisfying some recursion formula. Write out that
recursion formula.
c. Assume N is a nonnegative integer, and find the one value $\lambda_N$ for λ such that the
above-found recursion formula yields $a_{N+2} = 0 \cdot a_N$ .
d. Using the above, show that:
i. If $\lambda = \lambda_N$ for some nonnegative integer N , then exactly one of the two power series
$y_{\lambda,E}(x)$ or $y_{\lambda,O}(x)$ reduces to an even or odd N th-degree polynomial $p_N$ , with

\[ p_N(x) = \begin{cases} y_{\lambda,E}(x) & \text{if } N \text{ is even} \\ y_{\lambda,O}(x) & \text{if } N \text{ is odd} \end{cases} , \]

and with the other power series not reducing to a polynomial. (The polynomials, multiplied
by suitable constants, are the Chebyshev polynomials of the first type.)
ii. If $\lambda \ne \lambda_N$ for any nonnegative integer N , then neither of the two power series $y_{\lambda,E}(x)$
or $y_{\lambda,O}(x)$ reduces to a polynomial.
e. Now, find the following:
i. λ0 and p0 (x) ii. λ1 and p1 (x) iii. λ2 and p2 (x)
iv. λ3 and p3 (x) v. λ4 and p4 (x) vi. λ5 and p5 (x)
f. Now let λ be any constant (not necessarily λ N ).
i. What is the largest interval over which these power series solutions to the Chebyshev
equation are guaranteed to be valid according to theorem 30.7?
ii. Use the recursion formula along with the ratio test or limit ratio test to find the radius of
convergence and largest interval of convergence for yλ,E (x) and for yλ,O (x) , provided
the series does not terminate as polynomials.
g. Verify each of the following using work already done above:
i. If $\lambda = N^2$ for some nonnegative integer N , then the Chebyshev equation with parameter
λ has polynomial solutions, all of which are constant multiples of $p_N(x)$ .
8 also spelled Tschebyscheff.


ii. If $\lambda \ne N^2$ for every nonnegative integer N , then the Chebyshev equation with parameter
λ has no polynomial solutions (other than y = 0 ).
iii. If $y_\lambda$ is a nonpolynomial solution to a Chebyshev equation on (−1, 1) , then it is given
by a power series about $x_0 = 0$ with a radius of convergence of exactly 1 .

30.10. For any constant λ , the Legendre equation9 with parameter λ is

\[ (1 - x^2)\, y'' - 2x y' + \lambda y = 0 . \]

This equation is the object of study in the following exercises.

a. Using the algebraic method, derive the general recursion formula (in terms of λ ) for the
general power series solution $y_\lambda(x) = \sum_{k=0}^{\infty} a_k x^k$ to the above Legendre equation.

b. Using the recursion formula just found, along with theorem 30.5 on page 612, verify that
the general power series solution yλ can be written as

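\[ y_\lambda(x) = a_0\, y_{\lambda,E}(x) + a_1\, y_{\lambda,O}(x) \]

where $y_{\lambda,E}$ and $y_{\lambda,O}$ are, respectively, even- and odd-termed series

\[ y_{\lambda,E}(x) = \sum_{m=0}^{\infty} c_{2m}\, x^{2m} \quad\text{and}\quad y_{\lambda,O}(x) = \sum_{m=0}^{\infty} c_{2m+1}\, x^{2m+1} \]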

and with the other power series not reducing to a polynomial. (The polynomials, mul-
tiplied by suitable constants, are called the Legendre polynomials.)
ii. If $\lambda \ne \lambda_N$ for any nonnegative integer N , then neither of the two power series $y_{\lambda,E}(x)$
or $y_{\lambda,O}(x)$ reduces to a polynomial.
e. Based on the above, ﬁnd the following:
i. λ0 , p0 (x) , and y0,O (x) ii. λ1 , p1 (x) , and y1,E (x)
iii. λ2 and p2 (x) iv. λ3 and p3 (x)
v. λ4 and p4 (x) vi. λ5 and p5 (x)

9 The Legendre equations arise in problems involving three-dimensional spherical objects (such as the Earth). In section
33.5, we will continue the analysis begun in this exercise, discovering that the “polynomial solutions” found here are of
particular importance.


f. Now let λ be any constant (not necessarily λ N ).

i. What is the largest interval over which these power series solutions to the Legendre
equation are guaranteed to be valid according to theorem 30.7?
ii. Use the recursion formula along with the ratio test or limit ratio test to ﬁnd the radius of
convergence and largest interval of convergence for yλ,E (x) and for yλ,O (x) , provided
the series does not terminate as polynomials.
g. Verify each of the following using work already done above:
i. If $\lambda = N(N + 1)$ for some nonnegative integer N , then the Legendre equation with
parameter λ has polynomial solutions, and they are all constant multiples of $p_N(x)$ .
ii. If $\lambda \ne N(N + 1)$ for every nonnegative integer N , then $y_\lambda$ is not a polynomial.
iii. If $y_\lambda$ is not a polynomial, then it is given by a power series about $x_0 = 0$ with a radius
of convergence of exactly 1 .

30.11. For each of the following, use the Taylor series method to ﬁnd the N th -degree partial sum
of the power series solution about x 0 to the given differential equation. If either applies,
use theorems 30.2 on page 602 or 30.7 on page 613 to determine an interval I over which
the power series solutions are valid.
a. $y'' + 4y = 0$ with $x_0 = 0$ and N = 5
b. $y'' - x^2 y = 0$ with $x_0 = 0$ and N = 5
c. $y'' + e^{2x} y = 0$ with $x_0 = 0$ and N = 4
d. $\sin(x)\, y'' - y = 0$ with $x_0 = \pi/2$ and N = 4
e. $y'' + x y = \sin(x)$ with $x_0 = 0$ and N = 5
f. $y'' - \sin(x)\, y' - x y = 0$ with $x_0 = 0$ and N = 4
g. $y'' - y^2 = 0$ with $x_0 = 0$ and N = 5
h. $y' + \cos(y) = 0$ with $x_0 = 0$ and N = 3


31.2 Ordinary and Singular Points, the Radius of Analyticity, and the Reduced Form
Introducing Complex Variables
To properly address at least one of our questions, and to simplify the statements of our theorems,
it will help to start viewing the coefﬁcients of our differential equations as functions of a complex
variable z . We actually did this in the last chapter when we referred to a point z s in the complex
plane for which A(z s ) = 0 . But A was a polynomial then, and viewing a polynomial as a function
of a complex variable is so easy that we hardly noted doing so. Viewing other functions (such as
exponentials, logarithms and trigonometric functions) as functions of a complex variable may be a
bit more challenging.

Analyticity and Power Series

Let us start by recalling that we need not restrict the variable or the center in a power series to real
values; they can be complex,

\[ \sum_{k=0}^{\infty} a_k (z - z_0)^k \quad\text{for}\quad |z - z_0| < R , \]

in which case the radius of convergence R is the radius of the largest open disk in the complex plane
centered at $z_0$ on which the power series is convergent (see footnote 1).
Recall, too, that our definition of analyticity applies to functions of a complex variable; that
is, any function f of a complex variable is analytic at a point $z_0$ in the complex plane if and only
if f (z) can be expressed as a power series about $z_0$ ,

\[ f(z) = \sum_{k=0}^{\infty} a_k (z - z_0)^k \quad\text{for}\quad |z - z_0| < R \]

for some R > 0 . Moreover, as also noted in section 29.3, if f is any function of a real variable
given by a power series on the interval $(x_0 - R, x_0 + R)$ ,

\[ f(x) = \sum_{k=0}^{\infty} a_k (x - x_0)^k , \]

then we can view this function as a function of the complex variable z = x + i y on a disk of radius
R about $x_0$ by simply replacing the real variable x with the complex variable z ,

\[ f(z) = \sum_{k=0}^{\infty} a_k (z - x_0)^k . \]

We will do this automatically in all that follows.

1 If you don't recall this, quickly review section 29.3.

By the way, do observe that, if

\[ \lim_{z \to z_0} |f(z)| = \infty , \]

then f certainly is not analytic at $z_0$ !

Some Results from Complex Analysis

Useful insights regarding analytic functions can be gained from the theory normally developed in
an introductory course on “complex analysis”. Sadly, we do not have the time or space to properly
develop that theory here. As an alternative, a brief overview of the relevant parts of that theory is
given for the interested reader in an appendix near the end of this chapter (section 31.7). From that
appendix, we get the following two lemmas (both of which should seem reasonable):

Lemma 31.1
Assume F is a function analytic at $z_0$ with corresponding power series $\sum_{k=0}^{\infty} f_k (z - z_0)^k$ , and let
R be either some positive value or +∞ . Then

\[ F(z) = \sum_{k=0}^{\infty} f_k (z - z_0)^k \quad\text{whenever}\quad |z - z_0| < R \]

if and only if F is analytic at every complex point z satisfying

\[ |z - z_0| < R . \]

Lemma 31.2
Assume F(z) and A(z) are two functions analytic at a point $z_0$ . Then the quotient F/A is also
analytic at $z_0$ if and only if

\[ \lim_{z \to z_0} \frac{F(z)}{A(z)} \]

is finite.

Let us note the following immediate corollary of the ﬁrst lemma:

Corollary 31.3
Assume F(x) is some function on the real line, and $\sum_{k=0}^{\infty} f_k (x - x_0)^k$ is a power series with an
infinite radius of convergence. If

\[ F(x) = \sum_{k=0}^{\infty} f_k (x - x_0)^k \quad\text{for}\quad -\infty < x < \infty , \]

then F(z) is analytic at every point in the complex plane.


From this corollary and the series in set (29.4) on page 567, it immediately follows that the
sine and cosine functions, as well as the exponential functions, are all analytic on the entire complex
plane.
Let us also note that the second lemma extends some observations regarding quotients of
functions made in section 29.4 (see footnote 2).

Ordinary and Singular Points

Let $z_0$ be a point on the complex plane, and let a , b and c be functions on the complex plane. We
will say that $z_0$ is an ordinary point for the first-order differential equation

\[ a(x)\, y' + b(x)\, y = 0 \]

if and only if the quotient

\[ \frac{b(z)}{a(z)} \]

is analytic at $z_0$ . And we will say that $z_0$ is an ordinary point for the second-order differential
equation

\[ a(x)\, y'' + b(x)\, y' + c(x)\, y = 0 \]

if and only if the quotients

\[ \frac{b(z)}{a(z)} \quad\text{and}\quad \frac{c(z)}{a(z)} \]

are both analytic at $z_0$ .
Any point that is not an ordinary point (that is, any point at which the above quotients are not
analytic) is called a singular point for the differential equation.
Using lemma 31.2, you can easily verify the following shortcuts for determining whether a
point is a singular or ordinary point for a given differential equation. You can then use these lemmas
to verify that our new deﬁnitions reduce to those given in the last chapter when the coefﬁcients of
our differential equation are rational functions.

Lemma 31.4
Let $z_0$ be a point in the complex plane, and consider the differential equation

\[ a(x)\, y' + b(x)\, y = 0 \]

where a and b are functions analytic at $z_0$ . Then:

1. If $a(z_0) \ne 0$ , then $z_0$ is an ordinary point for the differential equation.
2. If $a(z_0) = 0$ and $b(z_0) \ne 0$ , then $z_0$ is a singular point for the differential equation.
3. The point $z_0$ is an ordinary point for this differential equation if and only if

\[ \lim_{z \to z_0} \frac{b(z)}{a(z)} \]

is finite.

2 If you haven’t already done so, now might be a good time to at least skim over the material in the subsection More on
Algebra with Power Series and Analytic Functions starting on page 579.


Lemma 31.5
Let $z_0$ be a point in the complex plane, and consider the differential equation

\[ a(x)\, y'' + b(x)\, y' + c(x)\, y = 0 \]

where a , b and c are functions analytic at $z_0$ . Then:
1. If $a(z_0) \ne 0$ , then $z_0$ is an ordinary point for the differential equation.
2. If $a(z_0) = 0$ , and either $b(z_0) \ne 0$ or $c(z_0) \ne 0$ , then $z_0$ is a singular point for the
differential equation.
3. The point $z_0$ is an ordinary point for this differential equation if and only if

\[ \lim_{z \to z_0} \frac{b(z)}{a(z)} \quad\text{and}\quad \lim_{z \to z_0} \frac{c(z)}{a(z)} \]

are both finite.
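
The third criterion matters when the leading coefficient and the other coefficients vanish together, so that neither of the first two shortcuts applies. As a small illustration (the equation below is chosen here for the purpose and is not one of the text's examples), consider $x y'' + \sin(x) y = 0$ at $z_0 = 0$:

```latex
% Illustrative equation (an assumption, not from the text):  x y'' + sin(x) y = 0
% Here a(z) = z, b(z) = 0, c(z) = sin(z), all analytic everywhere.
\[
  a(0) = 0 , \qquad b(0) = 0 , \qquad c(0) = \sin(0) = 0 ,
\]
% so parts 1 and 2 of the lemma say nothing about z_0 = 0.  Part 3 applies:
\[
  \lim_{z \to 0} \frac{b(z)}{a(z)} = \lim_{z \to 0} \frac{0}{z} = 0
  \qquad\text{and}\qquad
  \lim_{z \to 0} \frac{c(z)}{a(z)} = \lim_{z \to 0} \frac{\sin(z)}{z} = 1 .
\]
% Both limits are finite, so z_0 = 0 is an ordinary point.
```

Since $a(z) = z$ is nonzero at every other point, this particular equation would have no singular points at all.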

!Example 31.1: Consider the two differential equations

\[ y'' + \sin(x)\, y = 0 \quad\text{and}\quad \sin(x)\, y'' + y = 0 . \]

From corollary 31.3, we know that the sine function is analytic at every point on the complex
plane, and that

\[ \sin(z) = 0 \quad\text{if}\quad z = n\pi \ \text{ with } \ n = 0, \pm 1, \pm 2, \ldots . \]

Moreover, it's not hard to show (see exercise 31.4) that the above points are the only points in the
complex plane at which the sine is zero.
What this means is that both coefficients of

\[ y'' + \sin(x)\, y = 0 \]

are analytic everywhere, with the first coefficient (which is simply the constant 1 ) never being
zero. Thus, lemma 31.5 assures us that every point in the complex plane is an ordinary point for
this differential equation. It has no singular points.
On the other hand, while both coefficients of

\[ \sin(x)\, y'' + y = 0 \]

are analytic everywhere, the first coefficient is zero at $z_0 = 0$ (and at every other integer multiple
of π ). Since the second coefficient (again, the constant 1 ) is not zero at $z_0 = 0$ , lemma 31.5
tells us that $z_0 = 0$ (and every other integer multiple of π ) is a singular point for this differential
equation.

Radius of Analyticity
The Definition, Recycled
Why waste a perfectly good definition? Given
\[ a(x)\, y' + b(x)\, y = 0 \quad\text{or}\quad a(x)\, y'' + b(x)\, y' + c(x)\, y = 0 \]
we define the radius of analyticity (for the differential equation) about any given point z 0 to be the
distance between z 0 and the singular point closest to z 0 , unless the differential equation has no
singular points, in which case we define the radius of analyticity to be +∞ .
This is precisely the same definition as given (twice) in the previous chapter.
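
Once the singular points are known, this definition is a one-line computation. The sketch below assumes the caller supplies the relevant singular points as a finite list (for an equation with infinitely many, such as one with leading coefficient sin(x), only those near the chosen point need be listed). The function name and the sample equation $(1 + x^2) y'' + y = 0$, whose singular points are $\pm i$ by lemma 31.5, are illustrative choices rather than anything taken from the text.

```python
def radius_of_analyticity(z0, singular_points):
    """Distance from z0 to the nearest listed singular point.

    Returns +infinity when the list is empty, matching the convention
    for equations with no singular points.
    """
    if not singular_points:
        return float("inf")
    return min(abs(complex(zs) - complex(z0)) for zs in singular_points)

# Hypothetical example: (1 + x^2) y'' + y = 0 has singular points +i and -i.
singular = [1j, -1j]
r_about_0 = radius_of_analyticity(0, singular)   # distance from 0 to +-i
r_about_3 = radius_of_analyticity(3, singular)   # distance from 3 to +-i
r_none = radius_of_analyticity(0, [])            # no singular points
```

Note that the singular points are complex even though the expansion point is real, which is why the distance is measured in the complex plane.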


Is the Radius Well Deﬁned?

When the coefﬁcients of our differential equations were just polynomials, it should have been obvious
that there really was a “singular point closest to z 0 ” (provided the equation had singular points). But
a cynical reader — especially one who has seen some advanced analysis — may wonder if such a
singular point always exists with our more general equations, or if, instead, a devious mathematician
could construct a differential equation with an inﬁnite set of singular points, none of which are closest
to the given ordinary point. Don’t worry, no mathematician is devious enough.

Lemma 31.6
Let z 0 be an ordinary point for some ﬁrst- or second-order linear homogeneous differential equation.
Then, if the differential equation has singular points, there is at least one singular point z s such that
no other singular point is closer to z 0 than z s .

The z s in this lemma is a “singular point closest to z 0 ”. There may, in fact, be other singular
points at the same distance from z 0 , but none closer. Anyway, this ensures that “the radius of
analyticity” for a given differential equation about a given point is well deﬁned.
The proof of lemma 31.6 is subtle, and is discussed in an appendix (section 31.8).

31.3 The Reduced Forms

A Standard Way to Rewrite Our Equations
There is some beneﬁt in dividing a given differential equation

\[ a y' + b y = 0 \quad\text{or}\quad a y'' + b y' + c y = 0 \]

by the equation’s leading coefﬁcient, obtaining the equation’s corresponding reduced form3

\[ y' + P y = 0 \quad\text{or}\quad y'' + P y' + Q y = 0 \]

(with P = b/a and Q = c/a ). For one thing, it may reduce the number of products of inﬁnite series
to be computed. In addition, it will allow us to use the generic recursion formulas that we will be
deriving in a little bit. However, the advantages of using the reduced form depend somewhat on
the ease in ﬁnding and using the power series for P (and, in the second-order case, for Q ). If the
differential equation can be written as

\[ A y' + B y = 0 \quad\text{or}\quad A y'' + B y' + C y = 0 \]

where the coefficients are given by relatively simple known power series, then the extra effort in
finding and using the power series for the coefficients of the corresponding reduced equations

\[ y' + P y = 0 \quad\text{or}\quad y'' + P y' + Q y = 0 \]

may outweigh any supposed advantages of using these reduced forms. In particular, if A , B and
C are all relatively simple polynomials (with A not being a constant), then dividing

\[ A y'' + B y' + C y = 0 \]

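\[ \rightarrow \quad \sum_{k=1}^{\infty} k a_k x^{k-1} + \sum_{k=0}^{\infty} \left[ \sum_{j=0}^{k} a_j p_{k-j} \right] x^k = 0 \]

(letting n = k − 1 in the first sum and n = k in the second)

\[ \rightarrow \quad \sum_{n=0}^{\infty} (n + 1) a_{n+1} x^n + \sum_{n=0}^{\infty} \left[ \sum_{j=0}^{n} a_j p_{n-j} \right] x^n = 0 \]

\[ \rightarrow \quad \sum_{n=0}^{\infty} \left[ (n + 1) a_{n+1} + \sum_{j=0}^{n} a_j p_{n-j} \right] x^n = 0 . \]

Thus,

\[ (n + 1) a_{n+1} + \sum_{j=0}^{n} a_j p_{n-j} = 0 \quad\text{for}\quad n = 0, 1, 2, \ldots . \]

Solving for $a_{n+1}$ and letting k = n + 1 gives us

\[ a_k = -\frac{1}{k} \sum_{j=0}^{k-1} a_j p_{k-1-j} \quad\text{for}\quad k = 1, 2, 3, \ldots . \tag{31.1} \]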

Of course, we would have obtained the same recursion formula with x 0 being any ordinary point
for the given differential equation (just replace x in the above computations with X = x − x 0 ).
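
Recursion formula (31.1) is easy to mechanize. The following is a minimal sketch under stated assumptions: the coefficients $p_k$ of P are supplied as a Python function `p(k)` (an interface chosen here for illustration), exact rational arithmetic is used, and the two sample equations are not from the text. The first, $y' + y = 0$, has the known solution $a_0 e^{-x}$; the second, $y' + \frac{2x}{1 + x^2} y = 0$, has the known solution $a_0 / (1 + x^2)$.

```python
from fractions import Fraction

def first_order_coeffs(p, a0, n_terms):
    """Return [a_0, ..., a_{n_terms-1}] using a_k = -(1/k) * sum_{j=0}^{k-1} a_j p_{k-1-j}."""
    a = [Fraction(a0)]
    for k in range(1, n_terms):
        s = sum(a[j] * p(k - 1 - j) for j in range(k))
        a.append(-s / k)
    return a

def p_const(k):
    """Series coefficients of P(x) = 1."""
    return 1 if k == 0 else 0

def p_rational(k):
    """Series coefficients of P(x) = 2x/(1 + x^2) = 2x - 2x^3 + 2x^5 - ..."""
    return 2 * (-1) ** ((k - 1) // 2) if k % 2 == 1 else 0

exp_coeffs = first_order_coeffs(p_const, 1, 5)      # coefficients of e^{-x}
rat_coeffs = first_order_coeffs(p_rational, 1, 7)   # coefficients of 1/(1 + x^2)
```

In the second case the computed coefficients alternate between ±1 and 0, so the series converges only for |x| < 1, which is the distance from 0 to the zeros ±i of 1 + x².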

Second-Order Case
We will leave this derivation as an exercise.


?Exercise 31.1: Assume that, over some interval containing the point $x_0$ , P and Q are
functions given by power series

\[ P(x) = \sum_{k=0}^{\infty} p_k (x - x_0)^k \quad\text{and}\quad Q(x) = \sum_{k=0}^{\infty} q_k (x - x_0)^k , \]

and derive the recursion formula

\[ a_k = -\frac{1}{k(k - 1)} \sum_{j=0}^{k-2} \left[ (j + 1) a_{j+1} p_{k-2-j} + a_j q_{k-2-j} \right] \tag{31.2} \]

for the series solution

\[ y(x) = \sum_{k=0}^{\infty} a_k (x - x_0)^k \]

to

\[ y'' + P y' + Q y = 0 . \]

(For simplicity, start with the case in which $x_0 = 0$ .)

Validity of the Power Series Solutions

Here are the big theorems on the existence of power series solutions. They are also theorems on the
computation of these solutions since they contain the recursion formulas just derived.

Theorem 31.9 (ﬁrst-order series solutions)

Suppose $x_0$ is an ordinary point for a first-order homogeneous differential equation whose reduced
form is

\[ y' + P y = 0 . \]

Then P has a power series representation

\[ P(x) = \sum_{k=0}^{\infty} p_k (x - x_0)^k \quad\text{for}\quad |x - x_0| < R \]

where R is the radius of analyticity about $x_0$ for this differential equation.

Moreover, a general solution to the differential equation is given by

\[ y(x) = \sum_{k=0}^{\infty} a_k (x - x_0)^k \quad\text{for}\quad |x - x_0| < R \]

where $a_0$ is arbitrary, and the other $a_k$'s satisfy the recursion formula

\[ a_k = -\frac{1}{k} \sum_{j=0}^{k-1} a_j p_{k-1-j} . \tag{31.3} \]

Theorem 31.10 (second-order series solutions)

Suppose $x_0$ is an ordinary point for a second-order homogeneous differential equation whose reduced
form is

\[ y'' + P y' + Q y = 0 . \]

Then P and Q have power series representations

\[ P(x) = \sum_{k=0}^{\infty} p_k (x - x_0)^k \quad\text{for}\quad |x - x_0| < R \]

and

\[ Q(x) = \sum_{k=0}^{\infty} q_k (x - x_0)^k \quad\text{for}\quad |x - x_0| < R \]

where R is the radius of analyticity about $x_0$ for this differential equation.

Moreover, a general solution to the differential equation is given by

\[ y(x) = \sum_{k=0}^{\infty} a_k (x - x_0)^k \quad\text{for}\quad |x - x_0| < R \]

where $a_0$ and $a_1$ are arbitrary, and the other $a_k$'s satisfy the recursion formula

\[ a_k = -\frac{1}{k(k - 1)} \sum_{j=0}^{k-2} \left[ (j + 1) a_{j+1} p_{k-2-j} + a_j q_{k-2-j} \right] . \tag{31.4} \]

There are four major parts to the proof of each of these theorems:
1. Deriving the recursion formula. (Done!)
2. Assuring ourselves that the coefficient functions in the reduced forms have the stated power
series representations. (Done! See lemmas 31.7 and 31.8.)
3. Verifying that the radius of convergence for the power series generated from the given recur-
sion formula is at least R .
4. Noting that the calculations used to obtain each recursion formula also confirm that the
resulting series is the solution to the given differential equation over the interval (x 0 −
R, x 0 + R) . (So noted!)
Thus, all that remains to proving these two major theorems is the verification of the claimed radii of
convergence for the given series solutions. This verification is not difficult, but is a bit lengthy and
technical, and may not be as exciting to the reader as was the derivation of the recursion formulas.
Those who are interested should proceed to section 31.5.
But now, let us try using our new theorems.

!Example 31.2: Consider, again, the differential equation from example 30.7 on page 619,

\[ y'' + \cos(x)\, y = 0 . \]

Again, let us try to find at least a partial sum of the general power series solution about $x_0 = 0$ .
This time, however, we will use the results from theorem 31.10.
The equation is already in reduced form

\[ y'' + P y' + Q y = 0 \]

with P(x) = 0 and Q(x) = cos(x) . Since both of these functions are analytic on the entire
complex plane, the theorem assures us that there is a general power series solution

\[ y(x) = \sum_{k=0}^{\infty} a_k x^k \quad\text{for}\quad |x| < \infty \]


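with $a_0$ and $a_1$ being arbitrary, and with the other $a_k$'s being given through recursion formula
(31.4). And to use this recursion formula, we need the corresponding power series representations
for P and Q . The series for P , of course, is trivial,

\[ P(x) = 0 \quad\Longleftrightarrow\quad P(x) = \sum_{k=0}^{\infty} p_k x^k \ \text{ with } \ p_k = 0 \ \text{ for all } k . \]

Fortunately, the power series for Q is well-known and only needs to be slightly rewritten for use
in our recursion formula:

\[
\begin{aligned}
Q(x) &= \cos(x) \\
&= \sum_{m=0}^{\infty} (-1)^m \frac{1}{(2m)!}\, x^{2m} \\
&= 1 - \frac{1}{2!} x^2 + \frac{1}{4!} x^4 - \frac{1}{6!} x^6 + \cdots \\
&= (-1)^{0/2} x^0 + 0 x^1 + (-1)^{2/2} \frac{1}{2!} x^2 + 0 x^3 \\
&\qquad + (-1)^{4/2} \frac{1}{4!} x^4 + 0 x^5 + (-1)^{6/2} \frac{1}{6!} x^6 + 0 x^7 + \cdots .
\end{aligned}
\]

So,

\[ q_0 = 1 , \quad q_1 = 0 , \quad q_2 = -\frac{1}{2!} , \quad q_3 = 0 , \quad q_4 = \frac{1}{4!} , \quad \ldots . \]

In general,

\[ Q(x) = \sum_{k=0}^{\infty} q_k x^k \quad\text{with}\quad q_k = \begin{cases} (-1)^{k/2}\, \dfrac{1}{k!} & \text{if } k \text{ is even} \\ 0 & \text{if } k \text{ is odd} \end{cases} , \]

and recursion formula (31.4) becomes, for k ≥ 2 ,

\[
\begin{aligned}
a_k &= -\frac{1}{k(k - 1)} \sum_{j=0}^{k-2} \Big[ (j + 1) a_{j+1} \underbrace{p_{k-2-j}}_{0} + a_j q_{k-2-j} \Big] \\
&= -\frac{1}{k(k - 1)} \sum_{j=0}^{k-2} a_j \begin{cases} (-1)^{(k-2-j)/2}\, \dfrac{1}{(k - 2 - j)!} & \text{if } k - 2 - j \text{ is even} \\ 0 & \text{if } k - 2 - j \text{ is odd} \end{cases} .
\end{aligned}
\tag{31.5}
\]

However, since we are only attempting to find a partial sum and not the entire series, let us simply
use the recursion formula

\[
\begin{aligned}
a_k &= -\frac{1}{k(k - 1)} \sum_{j=0}^{k-2} \Big[ (j + 1) a_{j+1} \underbrace{p_{k-2-j}}_{0} + a_j q_{k-2-j} \Big] \\
&= -\frac{1}{k(k - 1)} \sum_{j=0}^{k-2} a_j q_{k-2-j} ,
\end{aligned}
\]

with the particular $q_n$'s given above, and with the factorials computed:

\[ q_0 = 1 , \quad q_1 = 0 , \quad q_2 = -\frac{1}{2} , \quad q_3 = 0 , \quad q_4 = \frac{1}{24} . \]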


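Doing so, we get

\[
\begin{aligned}
a_2 &= -\frac{1}{2(2 - 1)} \sum_{j=0}^{2-2} a_j q_{2-2-j} \\
&= -\frac{1}{2} \sum_{j=0}^{0} a_j q_{-j} = -\frac{1}{2} a_0 q_0 = -\frac{1}{2} a_0 \cdot 1 = -\frac{1}{2} a_0 ,
\end{aligned}
\]

\[
\begin{aligned}
a_3 &= -\frac{1}{3(3 - 1)} \sum_{j=0}^{3-2} a_j q_{3-2-j} \\
&= -\frac{1}{6} \sum_{j=0}^{1} a_j q_{1-j} \\
&= -\frac{1}{6} \left[ a_0 q_1 + a_1 q_0 \right] = -\frac{1}{6} \left[ a_0 \cdot 0 + a_1 \cdot 1 \right] = -\frac{1}{6} a_1 ,
\end{aligned}
\]

\[
\begin{aligned}
a_4 &= -\frac{1}{4(4 - 1)} \sum_{j=0}^{4-2} a_j q_{4-2-j} \\
&= -\frac{1}{12} \sum_{j=0}^{2} a_j q_{2-j} \\
&= -\frac{1}{12} \left[ a_0 q_2 + a_1 q_1 + a_2 q_0 \right] \\
&= -\frac{1}{12} \left[ a_0 \left( -\frac{1}{2} \right) + a_1 \cdot 0 + \left( -\frac{1}{2} a_0 \right) \cdot 1 \right] = \frac{1}{12} a_0 ,
\end{aligned}
\]

\[
\begin{aligned}
a_5 &= -\frac{1}{5(5 - 1)} \sum_{j=0}^{5-2} a_j q_{5-2-j} \\
&= -\frac{1}{20} \sum_{j=0}^{3} a_j q_{3-j} \\
&= -\frac{1}{20} \left[ a_0 q_3 + a_1 q_2 + a_2 q_1 + a_3 q_0 \right] \\
&= -\frac{1}{20} \left[ a_0 \cdot 0 + a_1 \left( -\frac{1}{2} \right) + a_2 \cdot 0 + \left( -\frac{1}{6} a_1 \right) \cdot 1 \right] = \frac{1}{30} a_1
\end{aligned}
\]

and

\[
\begin{aligned}
a_6 &= -\frac{1}{6(6 - 1)} \sum_{j=0}^{6-2} a_j q_{6-2-j} \\
&= -\frac{1}{30} \sum_{j=0}^{4} a_j q_{4-j} \\
&= -\frac{1}{30} \left[ a_0 q_4 + a_1 q_3 + a_2 q_2 + a_3 q_1 + a_4 q_0 \right] \\
&= -\frac{1}{30} \left[ a_0 \frac{1}{24} + a_1 \cdot 0 + \left( -\frac{1}{2} a_0 \right) \left( -\frac{1}{2} \right) + a_3 \cdot 0 + \left( \frac{1}{12} a_0 \right) \cdot 1 \right] = -\frac{1}{80} a_0 .
\end{aligned}
\]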


Thus, the sixth partial sum of the power series for y about 0 is

\[
\begin{aligned}
S_6(x) &= a_0 + a_1 x + a_2 x^2 + a_3 x^3 + a_4 x^4 + a_5 x^5 + a_6 x^6 \\
&= a_0 + a_1 x - \frac{1}{2} a_0 x^2 - \frac{1}{6} a_1 x^3 + \frac{1}{12} a_0 x^4 + \frac{1}{30} a_1 x^5 - \frac{1}{80} a_0 x^6 \\
&= a_0 \left[ 1 - \frac{1}{2} x^2 + \frac{1}{12} x^4 - \frac{1}{80} x^6 \right] + a_1 \left[ x - \frac{1}{6} x^3 + \frac{1}{30} x^5 \right] ,
\end{aligned}
\]

just as we had found, using the Taylor series method, in example 30.7 on page 619.

If you compare the work done in the last example with the work done in example 30.7, it
may appear that, while we obtained identical results, we may have expended more work in using
the recursion formula from theorem 31.10 than in using the Taylor series method. On the other
hand, all the computations done in the last example were fairly simple arithmetic computations —
computations we could easily program a computer to do, especially if we use recursion formula
(31.5). So there can be computational advantages to using our new results.
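
As a rough indication of how little programming this takes, here is a minimal sketch of recursion formula (31.4) applied to the example above. It assumes the series coefficients of P and Q are supplied as Python functions `p(k)` and `q(k)` (an interface chosen here for illustration) and uses exact fractions so the output can be compared directly with the hand computations.

```python
from fractions import Fraction
from math import factorial

def second_order_coeffs(p, q, a0, a1, n_terms):
    """Return [a_0, ..., a_{n_terms-1}] from recursion formula (31.4)."""
    a = [Fraction(a0), Fraction(a1)]
    for k in range(2, n_terms):
        s = sum((j + 1) * a[j + 1] * p(k - 2 - j) + a[j] * q(k - 2 - j)
                for j in range(k - 1))          # j = 0, 1, ..., k-2
        a.append(-s / (k * (k - 1)))
    return a[:n_terms]

def p_zero(k):
    """Series coefficients of P(x) = 0."""
    return 0

def q_cos(k):
    """Series coefficients of Q(x) = cos(x)."""
    return Fraction((-1) ** (k // 2), factorial(k)) if k % 2 == 0 else Fraction(0)

# Coefficients multiplying a_0 and a_1 in the sixth partial sum S_6.
even_part = second_order_coeffs(p_zero, q_cos, 1, 0, 7)
odd_part = second_order_coeffs(p_zero, q_cos, 0, 1, 7)
```

Running the recursion once with $(a_0, a_1) = (1, 0)$ and once with $(0, 1)$ separates the two bracketed series in $S_6$; raising `n_terms` gives higher partial sums with no extra hand work.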

31.5 Radius of Convergence for the Solution Series

To ﬁnish our proofs of theorems 31.10 and 31.9, we need to verify that the radius of convergence for
each of the given series solutions is at least the given value for R . We will do this for the solution
series in theorem 31.10, and leave the corresponding veriﬁcation for theorem 31.9 (which will be
slightly easier) as an exercise.

What We Have, and What We Need to Show

Recall: We have a positive value R and two power series

\[ \sum_{k=0}^{\infty} p_k X^k \quad\text{and}\quad \sum_{k=0}^{\infty} q_k X^k \]

that we know converge when |X| < R (for simplicity, we're letting $X = x - x_0$ ). We also have a
corresponding power series

\[ \sum_{k=0}^{\infty} a_k X^k \]

where $a_0$ and $a_1$ are arbitrary, and the other coefficients are given by the recursion formula

\[ a_k = -\frac{1}{k(k - 1)} \sum_{j=0}^{k-2} \left[ (j + 1) a_{j+1} p_{k-2-j} + a_j q_{k-2-j} \right] \quad\text{for}\quad k = 2, 3, 4, \ldots . \]

We now only need to show that $\sum_{k=0}^{\infty} a_k X^k$ converges whenever |X| < R , and to do that, we
will produce another power series $\sum_{k=0}^{\infty} b_k X^k$ whose convergence is “easily” shown using the limit
ratio test, and which is related to our first series by

\[ |a_k| \le b_k \quad\text{for}\quad k = 0, 1, 2, 3, \ldots . \]

By the comparison test, it then immediately follows that $\sum_{k=0}^{\infty} \left| a_k X^k \right|$ , and hence also $\sum_{k=0}^{\infty} a_k X^k$ ,
converges.
So let X be any value with |X| < R .


Constructing the Series for Comparison

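Our first step in constructing $\sum_{k=0}^{\infty} b_k X^k$ is to pick some value r between |X| and R ,

\[ 0 \le |X| < r < R . \]

Since |r| < R , we know the series

\[ \sum_{k=0}^{\infty} p_k r^k \quad\text{and}\quad \sum_{k=0}^{\infty} q_k r^k \]

both converge. But a series cannot converge if the terms in the series become arbitrarily large in
magnitude. So the magnitudes of these terms (the $p_k r^k$'s and $q_k r^k$'s) must be bounded; that
is, there must be a finite number M such that

\[ \left| p_k r^k \right| < M \quad\text{and}\quad \left| q_k r^k \right| < M \quad\text{for}\quad k = 0, 1, 2, 3, \ldots . \]

Equivalently (since r > 0 ),

\[ |p_k| < \frac{M}{r^k} \quad\text{and}\quad |q_k| < \frac{M}{r^k} \quad\text{for}\quad k = 0, 1, 2, 3, \ldots . \]

These inequalities, the triangle inequality and the recursion formula combine to give us, for
k = 2, 3, 4, . . . ,

\[
\begin{aligned}
|a_k| &= \left| -\frac{1}{k(k - 1)} \sum_{j=0}^{k-2} \left[ (j + 1) a_{j+1} p_{k-2-j} + a_j q_{k-2-j} \right] \right| \\
&\le \frac{1}{k(k - 1)} \sum_{j=0}^{k-2} \left[ (j + 1) \left| a_{j+1} \right| \left| p_{k-2-j} \right| + \left| a_j \right| \left| q_{k-2-j} \right| \right] \\
&\le \frac{1}{k(k - 1)} \sum_{j=0}^{k-2} \left[ (j + 1) \left| a_{j+1} \right| \frac{M}{r^{k-2-j}} + \left| a_j \right| \frac{M}{r^{k-2-j}} \right] ,
\end{aligned}
\]

which we will rewrite as

\[ |a_k| \le \frac{1}{k(k - 1)} \sum_{j=0}^{k-2} \frac{M \left[ (j + 1) \left| a_{j+1} \right| + \left| a_j \right| \right]}{r^{k-2-j}} . \]

Now let $b_0 = |a_0|$ , $b_1 = |a_1|$ and

\[ b_k = \sum_{j=0}^{k-2} \frac{M \left[ (j + 1) \left| a_{j+1} \right| + \left| a_j \right| \right]}{r^{k-2-j}} \quad\text{for}\quad k = 2, 3, 4, \ldots . \]

From the preceding inequality, it is clear that we've chosen the $b_k$'s so that

\[ |a_k| \le b_k \quad\text{for}\quad k = 0, 1, 2, 3, \ldots . \]

In fact, we even have

\[ |a_k| \le \frac{1}{k(k - 1)}\, b_k \quad\text{for}\quad k = 2, 3, \ldots . \tag{31.6} \]

Thus,

\[ \left| a_k X^k \right| \le b_k |X|^k \quad\text{for}\quad k = 2, 3, \ldots , \]

and (by the comparison test) we can confirm the convergence of $\sum_{k=0}^{\infty} a_k X^k$ by simply verifying
the convergence of $\sum_{k=0}^{\infty} b_k |X|^k$ .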


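From this and the fact that |X| < r , we see that

\[ \lim_{k \to \infty} \frac{\left| b_{k+1} X^{k+1} \right|}{\left| b_k X^k \right|} = \lim_{k \to \infty} \left[ \frac{1}{r} + \frac{M + 1}{k - 1} \right] |X| = \left[ \frac{1}{r} + 0 \right] |X| = \frac{|X|}{r} < 1 , \]

confirming (by the limit ratio test) that $\sum_{k=0}^{\infty} b_k X^k$ converges, and, thus, completing our proof of
theorem 31.10.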

To ﬁnish the proof of theorem 31.9, do the following exercise:

?Exercise 31.2: Let $\sum_{k=0}^{\infty} p_k X^k$ be a power series that converges for |X| < R , and let
$\sum_{k=0}^{\infty} a_k X^k$ be a power series where $a_0$ is arbitrary, and the other coefficients are given by the
recursion formula

\[ a_k = -\frac{1}{k} \sum_{j=0}^{k-1} a_j p_{k-1-j} \quad\text{for}\quad k = 1, 2, 3, \ldots . \]

Show that $\sum_{k=0}^{\infty} a_k X^k$ converges also for |X| < R .

(Suggestion: Go back to the start of this section and “redo” the computations step by step,
making the obvious modiﬁcations to deal with the given recursion formula.)

31.6 Singular Points and the Radius of Convergence

In the last section, we veriﬁed that the power series solutions obtained in this chapter are valid at
least over (x 0 − R, x 0 + R) where R is the distance from x 0 to the nearest singular point, provided
the differential equation has singular points. This fact can be reﬁned by the following theorem:

Theorem 31.11
Let

\[ y(x) = \sum_{k=0}^{\infty} c_k (x - x_0)^k \quad\text{for}\quad |x - x_0| < R \]

be a power series solution for some ﬁrst- or second-order homogeneous linear differential equation.
Assume, further, that R is ﬁnite and is the radius of convergence for the above power series. Then
this differential equation has a singular point z s with |z s − x 0 | = R .

The proof of this theorem, unfortunately, is nontrivial. The adventurous can read about it in
an appendix, section 31.9 (after reading sections 31.7 and 31.8). By the way, the above theorem
is actually a consequence of more general results obtained in the appendix, some of which will be
useful in some of our more advanced work in chapter 33.


31.7 Appendix: A Brief Overview of Complex Calculus

To properly address issues regarding the analyticity of our functions and the regions of convergence of
their power series, we need to delve deeper into the theory of analytic functions — much deeper than
normally presented in elementary calculus courses. Instead, we want the theory normally developed
in introductory courses in complex analysis. That’s because the complex-variable theory exposes a
much closer relation between “differentiability” and “analyticity” than does the real-variable theory
developed in elementary calculus. If you’ve had such a course, good; the following will be a review.
If you’ve not had such a course, think about taking one, and read on. What follows is a brief synopsis
of the relevant concepts and results from such a course, written assuming you have not had such a
course (but have, at least, skimmed the introductory section on complex variables in section 29.3,
starting on page 575).

Functions of a Complex Variable

In “complex analysis”, the basic concepts and theories developed in elementary calculus are extended
so that they apply to complex-valued functions of a complex variable. Thus, for example, where we
may have considered the “real” polynomial and “real” exponential

\[ p(x) = 3x^2 + 4x - 5 \quad\text{and}\quad h(x) = e^x \quad\text{for all } x \text{ in } \mathbb{R} \]

in elementary calculus, in complex analysis we consider the “complex” polynomial and “complex”
exponential

\[ p(z) = 3z^2 + 4z - 5 \quad\text{and}\quad h(z) = e^z \quad\text{for all } z = x + iy \text{ in } \mathbb{C} . \]

Note that we treat z as a single entity. Still, the complex variable z is just x + i y . Consequently,
much of complex analysis follows from what you already know about the calculus of functions of
two variables. In particular, the partial derivatives with respect to x and y are deﬁned just as they
were deﬁned back in your calculus course (and section 3.7 of this text), and when we say f is
continuous at z 0 , we mean that
\[ f(z_0) = \lim_{z \to z_0} f(z) \]

with the understanding that

\[ \lim_{z \to z_0} f(z) = \lim_{\substack{x \to x_0 \\ y \to y_0}} f(x + iy) \quad\text{with}\quad z_0 = x_0 + i y_0 . \]

Along these lines, you should be aware that, in complex variables, we normally assume that
functions are deﬁned over subregions of the complex plane, instead of subintervals of the real line.
In what follows, we will often require our region of interest to be open (as discussed in section 3.7).
For example, we will often refer to the disk of all z satisfying |z − z 0 | < R for some complex point
z 0 and positive value R . Any such disk is an open region.

Complex Differentiability
Given a function f and a point $z_0 = x_0 + i y_0$ in the complex plane, the complex derivative of f
at $z_0$ (denoted by $f'(z_0)$ or $\left. \frac{df}{dz} \right|_{z_0}$ ) is given by

\[ f'(z_0) = \left. \frac{df}{dz} \right|_{z_0} = \lim_{z \to z_0} \frac{f(z) - f(z_0)}{z - z_0} . \]


If this limit exists as a ﬁnite complex number, we will say that f is differentiable with respect to
the complex variable at z 0 (complex differentiable for short). Remember, z = x + i y ; so, for the
above limit to make sense, the formula for f must be such that f (x + i y) makes sense for every
x + i y in some open region about z 0 .
We further say that f is complex differentiable on a region of the complex plane if and only if
it is complex differentiable at every point in the region.
Naturally, we can extend the complex derivative to higher orders:

\[ f'' = \frac{d^2 f}{dz^2} = \frac{d}{dz}\frac{df}{dz} , \quad f''' = \frac{d^3 f}{dz^3} = \frac{d}{dz}\frac{d^2 f}{dz^2} , \quad \cdots . \]

As with functions of a real variable, if $f^{(k)}$ exists for every positive integer $k$ (at a point or in a
region), then we say $f$ is infinitely differentiable (at the point or in the region).
In many ways, the complex derivative is analogous to the derivative you learned in elementary
calculus (the real-variable derivative). The same basic computational formulas apply, giving
us, for example,
\[ \frac{d}{dz} z^k = k z^{k-1} , \quad \frac{d}{dz} e^{\alpha z} = \alpha e^{\alpha z} \quad\text{and}\quad \frac{d}{dz}\left[\alpha f(z) + \beta g(z)\right] = \alpha f'(z) + \beta g'(z) . \]
In addition, the well-known product and quotient rules can easily be veriﬁed, and, in verifying these
rules, you automatically verify the following:

Theorem 31.12
Assume f and g are complex differentiable on some open region of the complex plane. Then
their product f g is also complex differentiable on that region. Moreover, so is their quotient f /g ,
provided $g(z) \neq 0$ for every $z$ in this region.

Testing for Complex Differentiability

If f is complex differentiable in some open region of the complex plane, then, unsurprisingly, the
chain rule can be shown to hold. In particular,
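\[ \frac{\partial}{\partial x} f(x + iy) = f'(x + iy) \cdot \frac{\partial}{\partial x}[x + iy] = f'(x + iy) \cdot 1 = f'(x + iy) \]
and
\[ \frac{\partial}{\partial y} f(x + iy) = f'(x + iy) \cdot \frac{\partial}{\partial y}[x + iy] = f'(x + iy) \cdot i = i f'(x + iy) . \]

Combining these two equations, we get

\[ \frac{\partial}{\partial y} f(x + iy) = i f'(x + iy) = i \frac{\partial}{\partial x} f(x + iy) . \]

Thus, if $f$ is complex differentiable in some open region, then

\[ \frac{\partial}{\partial y} f(x + iy) = i \frac{\partial}{\partial x} f(x + iy) \tag{31.9} \]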

at every point $z = x + iy$ in that region.4 Right off, this gives us a test for “nondifferentiability”:
If equation (31.9) does not hold throughout some region, then $f$ is not complex differentiable on
that region. Remarkably, it can be shown that equation (31.9) can also be used to verify complex
differentiability. More precisely, the following theorem can be verified using tools developed in a
typical course in complex analysis.

4 The two equations you get by splitting equation (31.9) into its real and imaginary parts are the famous Cauchy-Riemann
equations.

Theorem 31.13
A function f is complex differentiable on an open region if and only if
\[ \frac{\partial}{\partial y} f(x + iy) = i \frac{\partial}{\partial x} f(x + iy) \]

at every point $z = x + iy$ in the region. Moreover, in any open region on which $f$ is complex
differentiable,
\[ \frac{d}{dz} f(z) = f'(z) = \frac{\partial}{\partial x} f(x + iy) . \]
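
As a rough numerical illustration of this test (a sketch only, not part of the text's development), the following Python snippet estimates the two partial derivatives by central differences and measures how far $\partial f/\partial y$ is from $i\,\partial f/\partial x$ at a sample point. The helper names `partials` and `cr_residual`, the step size `h`, and the sample point are all assumptions made for this illustration. A residual near zero at one point is consistent with equation (31.9) there but proves nothing about a whole region.

```python
import cmath

def partials(f, z, h=1e-6):
    """Central-difference estimates of df/dx and df/dy at z = x + iy."""
    dfdx = (f(z + h) - f(z - h)) / (2 * h)            # vary x only
    dfdy = (f(z + 1j * h) - f(z - 1j * h)) / (2 * h)  # vary y only
    return dfdx, dfdy

def cr_residual(f, z, h=1e-6):
    """Size of df/dy - i*df/dx; close to zero when (31.9) holds at z."""
    dfdx, dfdy = partials(f, z, h)
    return abs(dfdy - 1j * dfdx)

z = 0.3 + 0.7j  # arbitrary sample point (an assumption)
res_exp = cr_residual(cmath.exp, z)                  # e^z: complex differentiable
res_conj = cr_residual(lambda w: w.conjugate(), z)   # conjugation: not differentiable

print(res_exp, res_conj)
```

For the exponential the residual is at the level of rounding error, while for complex conjugation it stays near 2, so conjugation fails the test at this point.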

Differentiability of an Analytic Function

In the subsection starting on page 571 of section 29.2, we discussed differentiating power series
and analytic functions when the variable is real. That discussion remains true if we replace the real
variable x with the complex variable z and use the complex derivative. In particular, we have

Theorem 31.14 (differentiation of power series)

Suppose f is a function given by a power series,
\[ f(z) = \sum_{k=0}^{\infty} a_k (z - z_0)^k \quad\text{for}\quad |z - z_0| < R \]

for some $R > 0$. Then, for any positive integer $n$, the $n$th derivative of $f$ exists. Moreover,

\[ f^{(n)}(z) = \sum_{k=n}^{\infty} k(k-1)(k-2) \cdots (k-n+1)\, a_k (z - z_0)^{k-n} \quad\text{for}\quad |z - z_0| < R . \]
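
The coefficient bookkeeping in this theorem can be sketched in a few lines of Python. This is an illustration under stated assumptions: the function name `differentiate_series` is hypothetical, the series is truncated to a finite coefficient list, and exact fractions are used to avoid rounding. Writing $m = k - n$, the coefficient of $(z - z_0)^m$ in $f^{(n)}$ is $\frac{(m+n)!}{m!}\, a_{m+n}$.

```python
from fractions import Fraction
from math import factorial

def differentiate_series(a, n):
    """Coefficients of the n-th derivative of sum a[k] (z - z0)^k.

    The coefficient of (z - z0)^m is (m+n)!/m! * a[m+n], which equals
    k(k-1)...(k-n+1) * a[k] with k = m + n.
    """
    return [Fraction(factorial(m + n), factorial(m)) * a[m + n]
            for m in range(len(a) - n)]

# Truncated Taylor coefficients of e^z about 0: a_k = 1/k!
exp_coeffs = [Fraction(1, factorial(k)) for k in range(8)]
second = differentiate_series(exp_coeffs, 2)

print(second)
```

Since $e^z$ is its own derivative, the coefficients in `second` match the first six entries of `exp_coeffs`.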

As an immediate corollary, we have:

Corollary 31.15
Let f be analytic at z 0 with power series representation
\[ f(z) = \sum_{k=0}^{\infty} a_k (z - z_0)^k \quad\text{whenever}\quad |z - z_0| < R . \]

Then $f$ is infinitely complex differentiable on the disk of all $z$ with $|z - z_0| < R$. Moreover

\[ a_k = \frac{f^{(k)}(z_0)}{k!} \quad\text{for}\quad k = 0, 1, 2, \ldots . \]

Complex Differentiability and Analyticity

Despite the similarity between complex differentiation and real-variable differentiation, complex
differentiability is a much stronger condition on functions than is real-variable differentiability. The
next theorem illustrates this.

Theorem 31.16
Assume $f(z)$ is complex differentiable in some open region $\mathcal{R}$. Then $f$ is analytic at each point
$z_0$ in $\mathcal{R}$, and is given by its Taylor series

\[ f(z) = \sum_{k=0}^{\infty} \frac{f^{(k)}(z_0)}{k!} (z - z_0)^k \quad\text{whenever}\quad |z - z_0| < R \]

where $R$ is the radius of any open disk centered at $z_0$ and contained in region $\mathcal{R}$.

This remarkable theorem tells us that complex differentiability on an open region automatically
implies analyticity on that region, and tells us the region over which a function’s Taylor series
converges and equals the function. Proving this theorem is beyond this text. It is, in fact, a summary
of results normally derived over the course of many chapters of a typical text in complex variables.
Keep in mind that we already saw that analyticity implied complex differentiability (corollary
31.15). So as immediate corollaries to the above, we have:

Corollary 31.17
A function f is analytic at every point in an open region of the complex plane if and only if it is
complex differentiable at every point in that region.

Corollary 31.18
Assume $F$ is a function analytic at $z_0$ with corresponding power series $\sum_{k=0}^{\infty} f_k (z - z_0)^k$, and let
$R$ be either some positive value or $+\infty$. Then

\[ F(z) = \sum_{k=0}^{\infty} f_k (z - z_0)^k \quad\text{whenever}\quad |z - z_0| < R \]

if and only if $F$ is analytic at every complex point $z$ satisfying

\[ |z - z_0| < R . \]

The second corollary is especially of interest to us because it is the same as lemma 31.1 on page
633, which we used extensively in this chapter. And the other lemma that we used, lemma 31.2 on
page 633? Well, using, in order,

1. corollary 29.14 on page 581 on quotients of analytic functions,

2. theorem 31.12 on page 648 on the differentiation of products and quotients

and
3. corollary 31.17, above,

you can easily verify the following (which is the same as lemma 31.2):

Corollary 31.19
Assume $F(z)$ and $A(z)$ are two functions analytic at a point $z_0$. Then the quotient $F/A$ is also
analytic at $z_0$ if and only if
\[ \lim_{z \to z_0} \frac{F(z)}{A(z)} \]

is finite.

The details are left to you.

?Exercise 31.3: Prove corollary 31.19.

31.8 Appendix: The “Closest Singular Point”

Here we want to answer a subtle question: Is it possible to have a ﬁrst- or second-order linear
homogeneous differential equation whose singular points are arranged in such a manner that none
of them is the closest to some given ordinary point?
For example, could there be a differential equation having z 0 = 0 as an ordinary point, but
whose singular points form an inﬁnite sequence
\[ z_1 , z_2 , z_3 , \ldots \quad\text{with}\quad |z_k| = 1 + \frac{1}{k} , \]

possibly located in the complex plane so that they “spiral around” the circle of radius 1 about z 0 = 0
without converging to some single point? Each of these singular points is closer to z 0 = 0 than the
previous ones in the sequence, so not one of them can be called “a closest singular point”.
Lemma 31.6 on page 636 claims that this situation cannot happen. Let us see why we should
believe this lemma.

The Problem and Fundamental Theorem

We are assuming that we have some ﬁrst- or second-order linear homogeneous differential equation
having singular points. We also assume z 0 is not one of these singular points — it is an ordinary
point. Our goal is to show that
there is a singular point z s such that no other singular point is closer to z 0 .
If we can conﬁrm such a z s exists, then we’ve shown that the answer to this section’s opening
question is No (and proven lemma 31.6).
We start our search for this z s by rewriting our differential equation in reduced form

\[ y' + P(x)y = 0 \quad\text{or}\quad y'' + P(x)y' + Q(x)y = 0 \]

and recalling that a point z is an ordinary point for our differential equation if and only if the
coefﬁcient(s) ( P for the ﬁrst-order equation, and both P and Q for the second-order equation) are
all analytic at z (see lemmas 31.7 and 31.8). Consequently,

1. $z_s$ is a closest singular point to $z_0$ for the first-order differential equation if and only if $z_s$
is a point closest to $z_0$ at which $P$ is not analytic,

and

2. $z_s$ is a closest singular point to $z_0$ for the second-order differential equation if and only if
$z_s$ is a point closest to $z_0$ at which either $P$ or $Q$ is not analytic.

Either way, our problem of verifying the existence of a singular point z s “closest to z 0 ” is
reduced to the problem of verifying the existence of a point z s “closest to z 0 ” at which a given
function F is not analytic while still being analytic at z 0 . That is, to prove lemma 31.6, it will
sufﬁce to prove the following:

Theorem 31.20
Let F be a function that is analytic at some but not all points in the complex plane, and let z 0 be
one of the points at which F is analytic. Then there is a positive value R0 and a point z s in the
complex plane such that all the following hold:

1. R0 = |z s − z 0 | .

2. F is not analytic at z s .

3. F is analytic at every z with |z − z 0 | < R0 .

The point z s in the above theorem is a point closest to z 0 at which F is not analytic. There
may be other points the same distance from z 0 at which F is not analytic, but the last statement in
the theorem tells us that there is no point closer to z 0 than z s at which F is not analytic.
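
When the points of non-analyticity are known and finite in number, the conclusion of the theorem is easy to see computationally. The sketch below is only an illustration under that assumption (the theorem itself covers the general case, where no such finite list need be available). The function name `closest_singular_points` and the example $F(z) = 1/(z^2 + 1)$, which fails to be analytic exactly at $z = \pm i$, are choices made for this illustration.

```python
def closest_singular_points(z0, singular_points, tol=1e-12):
    """Return (R0, closest): the smallest distance from z0 to the listed
    points, and every listed point lying at that distance."""
    R0 = min(abs(zs - z0) for zs in singular_points)
    closest = [zs for zs in singular_points if abs(abs(zs - z0) - R0) <= tol]
    return R0, closest

# F(z) = 1/(z^2 + 1) is not analytic exactly at z = i and z = -i.
R0, closest = closest_singular_points(0, [1j, -1j])
print(R0, closest)

# From z0 = 2i, only z = i is a closest point.
R0_shifted, closest_shifted = closest_singular_points(2j, [1j, -1j])
print(R0_shifted, closest_shifted)
```

With $z_0 = 0$ both points are at distance 1, which illustrates the remark that a closest point need not be unique.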

Verifying Theorem 31.20

The Radius of Analyticity Function
Our proof of theorem 31.20 will rely on properties of the radius of analyticity function R A for F ,
which we deﬁne at each point z in the complex plane as follows:
• If F is not analytic at z , then R A (z) = 0 .

• If F is analytic at z , then R A (z) is the largest value of R such that

F is analytic on the open disk of radius R about z . (31.10)

(To see that this “largest value of $R$” exists when $F$ is analytic at $z$, first note that the set of all
positive values of $R$ for which (31.10) holds forms an interval with 0 as the lower endpoint. Since
we are assuming there are points at which $F$ is not analytic, this interval must be finite, and, hence,
has a finite upper endpoint. That endpoint is $R_A(z)$.)5
The properties of this function that will be used in our proof of theorem 31.20 are summarized
in the following lemmas.

Lemma 31.21
Let $R_A$ be the radius of analyticity function corresponding to a function $F$ analytic at some, but
not all, points of the complex plane, and let $z$ be a point at which $F$ is analytic. Then:
1. If $|\zeta - z| < R_A(z)$, then $F$ is analytic at $\zeta$.
2. If $F$ is not analytic at a point $\zeta$, then $|\zeta - z| \geq R_A(z)$.
3. If $R > R_A(z)$ then there is a point in the open disk about $z$ of radius $R$ at which $F$ is not
analytic.

5 In practice, $R_A(z)$ is usually the radius of convergence $R$ for the Taylor series for $F$ about $z$. In theory, though, one
can define $F$ to not equal its Taylor series at some points in the disk of radius $R$ about $z$. $R_A(z)$ is then the radius of the
largest open disk about $z$ on which $F$ is given by its Taylor series about $z$.

Lemma 31.22
Let F be a function which is analytic at some but not all points of the complex plane, and let R A
be the radius of analyticity function corresponding to F . Then, for each complex point z ,
F is analytic at z ⇐⇒ R A (z) > 0 .
Equivalently,
F is not analytic at z ⇐⇒ R A (z) = 0 .

Lemma 31.23
If F is a function analytic at some but not all points of the complex plane, then R A , the radius of
analyticity function corresponding to F , is a continuous function on the complex plane.

The claims in the ﬁrst lemma follow immediately from the deﬁnition of R A ; so let us concentrate
on proving the other two lemmas.

PROOF (lemma 31.22): First of all, by definition

\[ F \text{ is not analytic at } z \;\Rightarrow\; R_A(z) = 0 . \]
Hence, we also have
\[ R_A(z) > 0 \;\Rightarrow\; F \text{ is analytic at } z . \]
On the other hand, if $F$ is analytic at $z$, then there is a power series $\sum_{k=0}^{\infty} a_k (\zeta - z)^k$ and an
$R > 0$ such that

\[ F(\zeta) = \sum_{k=0}^{\infty} a_k (\zeta - z)^k \quad\text{whenever}\quad |\zeta - z| < R . \]

Corollary 31.18 immediately tells us that $F$ is analytic on the open disk of radius $R$ about $z$. Since
$R_A(z)$ is the largest such $R$, $R \leq R_A(z)$. And since $0 < R$, we now have
\[ F \text{ is analytic at } z \;\Rightarrow\; R_A(z) > 0 . \]
This also means
\[ R_A(z) = 0 \;\Rightarrow\; F \text{ is not analytic at } z . \]
Combining all the implications just listed yields the claims in the lemma.

PROOF (lemma 31.23): To verify the continuity of R A , we need to show that

\[ \lim_{z \to z_1} R_A(z) = R_A(z_1) \quad\text{for each complex value } z_1 . \]

There are two cases: the easy case with F not being analytic at z 1 , and the less-easy case with F
being analytic at z 1 . For the second case, we will use pictures.
Consider the ﬁrst case, where F is not analytic at z 1 (hence, R A (z 1 ) = 0 ). Then, as noted in
lemma 31.21,
\[ 0 \leq R_A(z) \leq |z_1 - z| \quad\text{for any } z \text{ in } \mathbb{C} . \]
Taking limits, we see that
\[ 0 \leq \lim_{z \to z_1} R_A(z) \leq \lim_{z \to z_1} |z_1 - z| = 0 , \]

Figure 31.2: Disks in the complex plane for theorem 31.24 and its proof.

Theorem 31.24
Let D A and D B be two open disks in the complex plane that intersect each other. Assume that f A
is a function analytic on D A , and that f B is a function analytic on D B . Further assume that there
is an open disk D0 contained in both D A and D B (see ﬁgure 31.2) such that

f A (z) = f B (z) for every z ∈ D0 .

Then
f A (z) = f B (z) for every z ∈ D A ∩ D B .

Think of f A as being the original function deﬁned on D A , and f B as some other analytic
function that we constructed on D B to match f A on D0 . This theorem tells us that we can deﬁne
a “new” function f on D A ∪ D B by

\[ f(z) = \begin{cases} f_A(z) & \text{if } z \text{ is in } D_A \\ f_B(z) & \text{if } z \text{ is in } D_B \end{cases} . \]

On the intersection, f is given both by f A and f B , but that is okay because the theorem assures us
that f A and f B are the same on that intersection. And since f A and f B are, respectively, analytic
at every point in D A and D B , it follows that f is analytic on the union of D A and D B , and
satisﬁes
f (z) = f A (z) for each z in D A .
That is, f is an “analytic extension” of f A from the domain D A to the domain D A ∪ D B .
The proof of the above theorem is not difﬁcult and is somewhat instructive.

PROOF (theorem 31.24): We need to show that f A (ζ ) = f B (ζ ) for every ζ in D A ∩ D B . So

let ζ be any point in D A ∩ D B .
If ζ ∈ D0 then, by our assumptions, we automatically have f A (ζ ) = f B (ζ ) .
On the other hand, if ζ is not in D0 , then, as illustrated in figure 31.2, we can clearly find a
finite sequence of open disks D1 , D2 , . . . and D M with respective centers z 1 , z 2 , . . . and z M
such that
1. each z k is also in Dk−1 ,
2. each Dk is in D A ∩ D B , and
3. the last disk, D M , contains ζ .

Now, because f A and f B are the same on D0 , so are all their derivatives. Consequently, the
Taylor series for f A and f B about the point z 1 in D0 will be the same. And since D1 is a disk
centered at z 1 and contained in both D A and D B , we have
\[ f_A(z) = \sum_{k=0}^{\infty} \frac{f_A^{(k)}(z_1)}{k!} (z - z_1)^k = \sum_{k=0}^{\infty} \frac{f_B^{(k)}(z_1)}{k!} (z - z_1)^k = f_B(z) \quad\text{for every } z \text{ in } D_1 . \]

Repeating these arguments using the Taylor series for f A and f B at the points z 2 , z 3 , and so
on, we eventually get
f A (z) = f B (z) for every z in D M .
In particular then,
f A (ζ ) = f B (ζ ) ,
just as we wished to show.

Ordinary and Singular Points for Power-Series Functions

It will help if we expand our notions of ordinary and singular points to any function given by a power
series,
\[ f(z) = \sum_{k=0}^{\infty} c_k (z - z_0)^k \quad\text{for}\quad |z - z_0| < R , \]

assuming, of course, that R > 0 . For convenience, let D be the open disk about z 0 of radius R .
Then for each z 1 either in D or on its boundary, we will say:

1. $z_1$ is an ordinary point for $f$ if and only if there is a function $f_1$ analytic on a disk $D_1$ of
positive radius about $z_1$ and which equals $f$ on the region where $D$ and $D_1$ overlap.

2. $z_1$ is a singular point for $f$ if and only if it is not an ordinary point for $f$.

Do note that theorem 31.16 on page 650 assures us that every point in D is an ordinary point
for f . So the only singular points must be on the boundary. And the next theorem tells us that there
must be a singular point on the boundary of D when R is ﬁnite and the radius of convergence for
the above power series.

Theorem 31.25
Let R be a positive ﬁnite number, and assume it is the radius of convergence for
\[ f(z) = \sum_{k=0}^{\infty} c_k (z - z_0)^k . \]

Then f must have a singular point on the circle |z − z 0 | = R .

PROOF: For convenience, let $D$ be the open disk of radius $R$ about $z_0$,

\[ D = \{ z : |z - z_0| < R \} , \]

and let $\overline{D}$ be the union of $D$ and its boundary,

\[ \overline{D} = \{ z : |z - z_0| \leq R \} . \]

Now, let’s define a function $R_C$ on $\overline{D}$ as follows:

1. For each ordinary point ζ , there is a function f 1 analytic on a disk D1 of positive radius
about ζ and which equals f on the region where D and D1 overlap. Since f 1 is analytic,
it can be given by a power series about ζ . Let RC (ζ ) be the radius of convergence for that
power series.

2. For each singular point ζ , let RC (ζ ) = 0 .

By definition, $R_C(\zeta) \geq 0$ for each $\zeta$ in $\overline{D}$. We should also note that $R_C(\zeta)$ cannot be infinite
for any $\zeta$ in $\overline{D}$ because there would then be a function $f_1$ analytic on all of the complex plane and
equaling $f$ on the disk $D$. And since $f_1$ is analytic everywhere, theorem 31.16 assures us that the
radius of convergence $R_1$ of the Taylor series of $f_1$ about $z_0$ would be infinite. But that Taylor
series for $f_1$ about $z_0$ would have to be the same as the Taylor series for $f$ about $z_0$ since $f = f_1$
on $D$. And that, in turn, would mean the two power series have the same radius of convergence,
giving us
\[ \infty = R_1 = R < \infty , \]
which is impossible. Thus, it is not possible for $R_C(\zeta)$ to be infinite at any point $\zeta$ in $\overline{D}$. In other
words, $R_C$ is a well-defined function on $\overline{D}$ with

\[ 0 \leq R_C(z) < \infty \quad\text{for each } z \text{ in } \overline{D} . \]

The function RC is very similar to the “radius of analyticity function” R A discussed in section
31.8, and, using arguments very similar to those used in that section for R A , you can verify all the
following:

1. $R_C(z) > 0$ if and only if $z$ is an ordinary point for $f$.

2. $R_C(z) = 0$ if and only if $z$ is a singular point for $f$.

3. RC is a continuous function on the circle |z − z 0 | = R .

4. RC (z) has a minimum value ρ at some point z s on the circle |z − z 0 | = R .

Now, if we can show ρ = 0 , then the above tells us that the corresponding point z s is a singular
point for the given power series, and our theorem is proven. And to show that, it will suffice to show
that ρ > 0 is impossible.
So, for the moment, assume $\rho > 0$. Then (as illustrated in figure 31.3) we can choose a finite
sequence of points $\zeta_1$, $\zeta_2$, ..., and $\zeta_N$ about the circle $|z - z_0| = R$ such that $|\zeta_N - \zeta_1| < \rho$
and
\[ |\zeta_{n+1} - \zeta_n| < \rho \quad\text{for}\quad n = 1, 2, 3, \ldots, N - 1 . \]
For each $\zeta_n$, let $D_n$ be the open disk of radius $\rho$ about $\zeta_n$, and observe that, by basic geometry,
the union of all the disks $D_1$, $D_2$, ..., and $D_N$ contains not just the boundary of our original disk
$D$ but the annulus of all $z$ satisfying

\[ R \leq |z - z_0| \leq R + \epsilon \]

for some positive value $\epsilon$.

Figure 31.3: Disks for the proof of theorem 31.25. The darker disk is the disk of radius R on
which y is originally defined.

By our choice of $\zeta_n$’s and $D_n$’s, we know that, for each integer $n$ from 1 to $N$, there is a
power series function
\[ f_n(z) = \sum_{k=0}^{\infty} c_{n,k} (z - \zeta_n)^k \]
defined on all of $D_n$ and which equals our original function $f$ in the overlap of $D$ and $D_n$.
Repeated use of theorem 31.24 then shows that any two functions from the set

{ f, f 1 , f 2 , . . . , f N }

equal each other wherever both are deﬁned. This allows us to deﬁne a “new” analytic function F
on the union of all of our disks via

\[ F(z) = \begin{cases} f(z) & \text{if } |z - z_0| < R \\ f_n(z) & \text{if } z \in D_n \text{ for } n = 1, 2, \ldots, N \end{cases} . \]

Now, because the union of all of the disks contains the disk of radius $R + \epsilon$ about $z_0$, theorem 31.16
on page 650 on the radius of convergence for Taylor series assures us that the Taylor series for $F$
about $z_0$ must have a radius of convergence of at least $R + \epsilon$. But, $f(z) = F(z)$ when $|z - z_0| < R$.
So $f$ and $F$ have the same Taylor series at $z_0$, and, hence, these two power series share the same
radius of convergence.
That is, if ρ > 0 , then

\[ \begin{aligned} R &= \text{radius of convergence for the power series of } f \text{ about } z_0 \\ &= \text{radius of convergence for the power series of } F \text{ about } z_0 > R , \end{aligned} \]

which is clearly impossible. So, it is not possible to have ρ > 0 .

Complex Power Series Solutions

Throughout most of these chapters, we’ve been tacitly assuming that the derivatives in our differential
equations are derivatives with respect to a real variable x ,

\[ y' = \frac{dy}{dx} \quad\text{and}\quad y'' = \frac{d^2 y}{dx^2} , \]

as in elementary calculus. In fact, we can also have derivatives with respect to the complex variable
$z$,
\[ y' = \frac{dy}{dz} \quad\text{and}\quad y'' = \frac{d^2 y}{dz^2} , \]
as described on page 647. Remember that, computationally, differentiation with respect to z is
completely analogous to the differentiation with respect to x learned in basic calculus. In particular,
if
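\[ y(z) = \sum_{k=0}^{\infty} a_k (z - z_0)^k \quad\text{for}\quad |z - z_0| < R \]

for some point $z_0$ in the complex plane and some $R > 0$, then
\[ y'(z) = \frac{d}{dz} \sum_{k=0}^{\infty} a_k (z - z_0)^k = \sum_{k=0}^{\infty} \frac{d}{dz}\left[ a_k (z - z_0)^k \right] = \sum_{k=1}^{\infty} k a_k (z - z_0)^{k-1} \quad\text{for}\quad |z - z_0| < R . \]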

Consequently, all of our computations in chapters 30 and 31 can be carried out using the complex
variable z instead of the real variable x , and using a point z 0 in the complex plane instead of a
point x 0 on the real line. In particular, we have the following complex-variable analogs of theorems
31.9 and 31.10:

Theorem 31.26 (ﬁrst-order series solutions)

Suppose z 0 is an ordinary point for a ﬁrst-order homogeneous differential equation whose reduced
form is
\[ y' + P y = 0 . \]
Then $P$ has a power series representation
\[ P(z) = \sum_{k=0}^{\infty} p_k (z - z_0)^k \quad\text{for}\quad |z - z_0| < R \]

where $R$ is the radius of analyticity about $z_0$ for this differential equation.

Moreover, a general solution to the differential equation is given by
\[ y(z) = \sum_{k=0}^{\infty} a_k (z - z_0)^k \quad\text{for}\quad |z - z_0| < R \]

where $a_0$ is arbitrary, and the other $a_k$’s satisfy the recursion formula

\[ a_k = -\frac{1}{k} \sum_{j=0}^{k-1} a_j\, p_{k-1-j} . \tag{31.11} \]
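
Recursion formula (31.11) translates directly into a short loop. The Python sketch below is a hedged illustration, not the text's own program: the function name `first_order_coeffs` and its argument layout (a list `p` holding $p_0, p_1, \ldots$) are assumptions, and exact fractions are used so the coefficients can be compared with known series.

```python
from fractions import Fraction

def first_order_coeffs(p, a0, N):
    """Coefficients a_0..a_N of a series solution to y' + P y = 0.

    p[k] is the k-th power series coefficient of P about z0 (at least N
    entries are needed); a0 is the arbitrary constant. Each step applies
    a_k = -(1/k) * sum_{j=0}^{k-1} a_j * p_{k-1-j}, formula (31.11).
    """
    a = [Fraction(a0)]
    for k in range(1, N + 1):
        total = sum(a[j] * p[k - 1 - j] for j in range(k))
        a.append(-total / k)
    return a

# Example: y' - y = 0, so P(z) = -1 and p = [-1, 0, 0, ...].
p = [-1, 0, 0, 0]
coeffs = first_order_coeffs(p, 1, 4)
print(coeffs)
```

With $P = -1$ and $a_0 = 1$ the loop reproduces the coefficients $1/k!$ of $e^z$, as expected for $y' - y = 0$.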

Theorem 31.27 (second-order series solutions)

Suppose z 0 is an ordinary point for a second-order homogeneous differential equation whose reduced
form is
\[ \frac{d^2 y}{dz^2} + P \frac{dy}{dz} + Q y = 0 . \]
Then $P$ and $Q$ have power series representations
\[ P(z) = \sum_{k=0}^{\infty} p_k (z - z_0)^k \quad\text{for}\quad |z - z_0| < R \]
and
\[ Q(z) = \sum_{k=0}^{\infty} q_k (z - z_0)^k \quad\text{for}\quad |z - z_0| < R \]

where $R$ is the radius of analyticity about $z_0$ for this differential equation.

Moreover, a general solution to the differential equation is given by
\[ y(z) = \sum_{k=0}^{\infty} a_k (z - z_0)^k \quad\text{for}\quad |z - z_0| < R \]

where $a_0$ and $a_1$ are arbitrary, and the other $a_k$’s satisfy the recursion formula

\[ a_k = -\frac{1}{k(k-1)} \sum_{j=0}^{k-2} \left[ (j+1)\, a_{j+1}\, p_{k-2-j} + a_j\, q_{k-2-j} \right] . \tag{31.12} \]
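
Formula (31.12) can be sketched the same way. Again this is an illustration under assumptions: the name `second_order_coeffs` and the convention that `p` and `q` are lists of the coefficients $p_k$ and $q_k$ are choices made here, not taken from the text.

```python
from fractions import Fraction

def second_order_coeffs(p, q, a0, a1, N):
    """Coefficients a_0..a_N of a series solution to y'' + P y' + Q y = 0.

    p[k] and q[k] are the power series coefficients of P and Q about z0
    (at least N - 1 entries each); a0 and a1 are the arbitrary constants.
    Each step applies formula (31.12).
    """
    a = [Fraction(a0), Fraction(a1)]
    for k in range(2, N + 1):
        total = sum((j + 1) * a[j + 1] * p[k - 2 - j] + a[j] * q[k - 2 - j]
                    for j in range(k - 1))  # j = 0, 1, ..., k - 2
        a.append(-total / (k * (k - 1)))
    return a

# Example: y'' + y = 0, so P = 0 and Q = 1.
p = [0, 0, 0]
q = [1, 0, 0]
cos_coeffs = second_order_coeffs(p, q, 1, 0, 4)  # a0 = 1, a1 = 0
sin_coeffs = second_order_coeffs(p, q, 0, 1, 4)  # a0 = 0, a1 = 1
print(cos_coeffs, sin_coeffs)
```

The two choices of $(a_0, a_1)$ give the leading Taylor coefficients of $\cos z$ and $\sin z$, the familiar fundamental pair for $y'' + y = 0$.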

By the way, it should also be noted that, if

\[ y(x) = \sum_{k=0}^{\infty} c_k (x - x_0)^k \]

is a solution to the real-variable differential equation for $|x - x_0| < R$, then

\[ y(z) = \sum_{k=0}^{\infty} c_k (z - x_0)^k \]

is a solution to the corresponding complex-variable differential equation for $|z - x_0| < R$. This
follows immediately from the last theorem and the relation between $dy/dx$ and $dy/dz$ (see the discussion
of the complex derivative in section 31.7).

Singular Points of Differential Equations and Solutions

So, let’s suppose we have a power series solution
\[ y(z) = \sum_{k=0}^{\infty} c_k (z - z_0)^k \quad\text{for}\quad |z - z_0| < R \]

to some first- or second-order linear homogeneous differential equation, and, as before, let $D$ and
$\overline{D}$ be, respectively, the open and closed disks of radius $R$ about $z_0$,

\[ D = \{ z : |z - z_0| < R \} \quad\text{and}\quad \overline{D} = \{ z : |z - z_0| \leq R \} . \]

Now consider any single point $z_1$ in $\overline{D}$. If $z_1$ is an ordinary point for the given differential equation,
then there is a disk $D_1$ of some positive radius $R_1$ about $z_1$ such that, on that disk, general solutions
to the differential equation exist and are given by power series about $z_1$. Using the material already
developed in this section, it’s easy to verify that, in particular, the above power series solution $y$ is
given by a power series about $z_1$ at least on the overlap of disks $D$ and $D_1$. And that means $z_1$
is also an ordinary point for the above power series solution. And, of course, this also means that
a point $z_1$ in $\overline{D}$ cannot be both an ordinary point for the differential equation and a singular point
for $y$.
To summarize:

Lemma 31.28
Let R be a positive ﬁnite number, and assume that, on the disk |z − z 0 | < R ,
\[ y(z) = \sum_{k=0}^{\infty} c_k (z - z_0)^k \]

is a power series solution to some ﬁrst- or second-order linear homogeneous differential equation.
Then, for each point ζ in the closed disk given by |z − z 0 | ≤ R :

1. If ζ is an ordinary point for the differential equation, then it is an ordinary point for the
above power series solution.

2. If ζ is a singular point for the above power series solution, then it is a singular point for the
differential equation.

Combining the last lemma with theorem 31.25 on singular points for power series functions,
we get the main result of this appendix:

Theorem 31.29
Let
\[ y(z) = \sum_{k=0}^{\infty} c_k (z - z_0)^k \quad\text{for}\quad |z - z_0| < R \]

be a power series solution for some ﬁrst- or second-order homogeneous linear differential equation.
Assume, further, that R is ﬁnite and is the radius of convergence for the above power series. Then
there is a point z s with |z s − z 0 | = R which is a singular point for both the above power series
solution and the given differential equation.

Theorem 31.11 now follows as a corollary of the last theorem.

Additional Exercises

31.4. In the following, you will determine all the points in the complex plane at which certain
common functions are zero. It may help to remember that, for a complex value Z = X +iY
to be zero, both its real part X and its imaginary part Y must be zero.

a. Using the fact that
\[ e^{x+iy} = e^x [\cos(y) + i \sin(y)] , \]
show that $e^z$ can never equal zero for any $z$ in the complex plane.
b. In chapter 15 we saw that
\[ \sin(z) = \frac{e^{iz} - e^{-iz}}{2i} \quad\text{and}\quad \cos(z) = \frac{e^{iz} + e^{-iz}}{2} , \]
at least when $z$ was a real value (see page 312). In fact, these formulas (along with
the definition of the complex exponential) can define the sine and cosine functions at all
values of $z$, real and complex. Using these formulas
i. Verify that
\[ \sin(x + iy) = \sin(x) \frac{e^y + e^{-y}}{2} + i \cos(x) \frac{e^y - e^{-y}}{2} \]
and
\[ \cos(x + iy) = \cos(x) \frac{e^y + e^{-y}}{2} - i \sin(x) \frac{e^y - e^{-y}}{2} . \]

ii. Using the above formulas, verify that

\[ \sin(z) = 0 \iff z = n\pi \quad\text{with}\quad n = 0, \pm 1, \pm 2, \ldots \]

and
\[ \cos(z) = 0 \iff z = \left( n + \frac{1}{2} \right) \pi \quad\text{with}\quad n = 0, \pm 1, \pm 2, \ldots . \]

c. The hyperbolic sine and cosine are defined on the complex plane by
\[ \sinh(z) = \frac{e^z - e^{-z}}{2} \quad\text{and}\quad \cosh(z) = \frac{e^z + e^{-z}}{2} . \]
Show that

\[ \sinh(z) = 0 \iff z = i n \pi \quad\text{with}\quad n = 0, \pm 1, \pm 2, \ldots \]

and
\[ \cosh(z) = 0 \iff z = i \left( n + \frac{1}{2} \right) \pi \quad\text{with}\quad n = 0, \pm 1, \pm 2, \ldots . \]

31.5. For each of the following differential equations, ﬁnd all singular points, as well as the radius
of analyticity R about the given point x 0 . You may have to use results from the previous
problem. You may even have to expand on some of those results. And you may certainly
need to rearrange a few equations.
a. y − e x y = 0 with x0 = 0

b. y − tan(x)y = 0 with x0 = 0
c. sin(x) y + x 2 y − e x y = 0 with x0 = 2
2 x
d. sinh(x) y + x y − e y = 0 with x0 = 2
e. sinh(x) y + x 2 y − sin(x) y = 0 with x0 = 2

f. e3x y + sin(x) y +
2

y = 0 with x0 = 0
x2 + 4

1 + ex
g. y + y = 0 with x0 = 3
(1 − e x )

h. x 2 − 4 y + x 2 + x − 6 y = 0 with x0 = 2

i. x y + 1 − e x y = 0

j. sin π x 2 y + x 2 y = 0 with x 0 = 0

31.6. Using recursion formula (31.3) on page 639, ﬁnd the 4th -degree partial sum of the general
power series solution for each of the following about the given point x 0 . Also state the
interval I over which you can be sure the full power series solution is valid.
a. $y' - e^x y = 0$ with $x_0 = 0$
b. $y' + e^{2x} y = 0$ with $x_0 = 0$

c. $y' + \cos(x)\, y = 0$ with $x_0 = 0$

d. $y' + \ln|x|\, y = 0$ with $x_0 = 1$
31.7. Using the recursion formula (31.4) on page 640, ﬁnd the 4th -degree partial sum of the
general power series solution for each of the following about the given point x 0 . Also state
the interval I over which you can be sure the full power series solution is valid.
a. $y'' - e^x y = 0$ with $x_0 = 0$
b. $y'' + 3x y' - e^x y = 0$ with $x_0 = 0$
c. $x y'' - 3x y' + \sin(x)\, y = 0$ with $x_0 = 0$ and $N = 4$
d. $y'' + \ln|x|\, y = 0$ with $x_0 = 1$
e. $\sqrt{x}\, y'' + y = 0$ with $x_0 = 1$

f. $y'' + \left( 1 + 2x + 6x^2 \right) y' + [2 + 12x]\, y = 0$ with $x_0 = 0$

31.8 a. Using your favorite computer mathematics package, along with recursion formula (31.3)
on page 639, write a program/worksheet that will find the N th partial sum of the power
series solution about x 0 to
\[ y' + P(x)y = 0 \]
for any given positive integer $N$, point $x_0$ and function $P(x)$ analytic at $x_0$. Finding the
appropriate partial sum of the corresponding power series for $P$ should be part of the
program/worksheet (see exercise 29.8 on page 585). Be sure to write your program/worksheet
so that $N$, $x_0$ and $P$ are easily changed.
b. Use your program/worksheet to find the N th -degree partial sum of the general power
series solution about x 0 for each of the following differential equations and choices for
N and x 0 .
i. $y' - e^x y = 0$ with $x_0 = 0$ and $N = 10$

ii. y + x 2 + 1y = 0 with x 0 = 0 and N = 8
iii. $\cos(x)\, y' + y = 0$ with $x_0 = 0$ and $N = 8$

iv. y + 2x 2 + 1y = 0 with x 0 = 2 and N = 5
31.9 a. Using your favorite computer mathematics package, along with recursion formula (31.4)
on page 640, write a program/worksheet that will find the N th partial sum of the power
series solution about x 0 to
\[ y'' + P(x)y' + Q(x)y = 0 \]

for any given positive integer N , point x 0 , and functions P(x) and Q(x) analytic at
x 0 . Finding the appropriate partial sum of the corresponding power series for P and Q
should be part of the program/worksheet (see exercise 29.8 on page 585). Be sure to write
your program/worksheet so that N , x 0 , P and Q are easily changed.
b. Use your program/worksheet to ﬁnd the N th -degree partial sum of the general power
series solution about x 0 for each of the following differential equations and choices for
N and x 0 .
i. $y'' - e^x y = 0$ with $x_0 = 0$ and $N = 8$

ii. $y'' + \cos(x)\, y = 0$ with $x_0 = 0$ and $N = 10$

iii. $y'' + \sin(x)\, y' + \cos(x)\, y = 0$ with $x_0 = 0$ and $N = 7$
iv. $\sqrt{x}\, y'' + y' + x y = 0$ with $x_0 = 1$ and $N = 5$

32
Modiﬁed Power Series Solutions and the
Basic Method of Frobenius

The partial sums of a power series solution about an ordinary point x 0 of a differential equation
provide fairly accurate approximations to the equation’s solutions at any point x near x 0 . This
is true even if relatively low-order partial sums are used (provided you are just interested in the
solutions at points very near x 0 ). However, these power series typically converge slower and slower
as x moves away from x 0 towards a singular point, with more and more terms then being needed
to obtain reasonably accurate partial sum approximations. As a result, the power series solutions
derived in the previous two chapters usually tell us very little about the solutions near singular points.
This is unfortunate because, in some applications, the behavior of the solutions near certain singular
points can be a rather important issue.
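
To get a feel for this slowdown, consider a simple case chosen here purely for illustration: $y = 1/(1 - x)$ satisfies $(1 - x)y' - y = 0$, which has $x = 1$ as a singular point, and its power series about $x_0 = 0$ is the geometric series $\sum x^k$. The Python sketch below counts how many terms of that series are needed to reach a fixed accuracy as $x$ approaches 1. The function name `terms_needed`, the tolerance, and the sample points are assumptions.

```python
def terms_needed(x, tol=1e-6, max_terms=100000):
    """Number of terms of sum x^k needed before the partial sum is within
    tol of 1/(1 - x); returns None if max_terms is not enough."""
    exact = 1.0 / (1.0 - x)
    partial, term = 0.0, 1.0
    for n in range(1, max_terms + 1):
        partial += term              # partial now holds the first n terms
        if abs(exact - partial) < tol:
            return n
        term *= x
    return None

counts = {x: terms_needed(x) for x in (0.1, 0.5, 0.9, 0.99)}
for x, n in counts.items():
    print(x, n)
```

The count grows from a handful of terms near $x_0$ to well over a thousand at $x = 0.99$, which is the behavior described above.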
Fortunately, in many of these applications, the singular point in question is not that “bad” a
singular point. In these applications, the differential equation “is similar to” an easily solved Euler
equation (which we discussed in chapter 18), at least in some interval about that singular point.
This fact will allow us to modify the algebraic method discussed in the previous chapters so as to
obtain “modified” power series solutions about these points. The basic process for generating these
modified power series solutions is typically called the method of Frobenius, and is what we will
develop and use in this and the next two chapters.
By the way, we will only consider second-order homogeneous linear differential equations.
One can extend the discussion here to first- and higher-order equations, but the important examples
are second-order.

32.1 Euler Equations and Their Solutions

As already indicated, the Euler equations from chapter 18 will play a fundamental role in our
discussions and are, in fact, the simplest examples of the sort of equations of interest in this chapter.
Let us quickly review them and their solutions, and take a look at what happens to their solutions
about their singular points.
Recall that a standard second-order Euler equation is a differential equation that can be written
as
\[ \alpha_0 x^2 y'' + \beta_0 x y' + \gamma_0 y = 0 \]
where $\alpha_0$, $\beta_0$ and $\gamma_0$ are real constants with $\alpha_0 \neq 0$. Recall, also, that the basic method for solving
such an equation begins with attempting a solution of the form $y = x^r$ where $r$ is a constant to be


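determined. Plugging $y = x^r$ into the differential equation, we get
\[
\begin{aligned}
& \alpha_0 x^2 \left[x^r\right]'' + \beta_0 x \left[x^r\right]' + \gamma_0 x^r = 0 \\
\rightarrow\quad & \alpha_0 x^2\, r(r-1)x^{r-2} + \beta_0 x\, r x^{r-1} + \gamma_0 x^r = 0 \\
\rightarrow\quad & x^r \left[\alpha_0 r(r-1) + \beta_0 r + \gamma_0\right] = 0 \\
\rightarrow\quad & \alpha_0 r(r-1) + \beta_0 r + \gamma_0 = 0 .
\end{aligned}
\]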

The last equation above is the indicial equation, which we typically rewrite as
\[ \alpha_0 r^2 + (\beta_0 - \alpha_0) r + \gamma_0 = 0 , \]
and from which we can easily determine the possible values of $r$ using basic algebra.
Generalizing slightly, we have the shifted Euler equation
\[ \alpha_0 (x - x_0)^2 y'' + \beta_0 (x - x_0) y' + \gamma_0 y = 0 \]
where $x_0$, $\alpha_0$, $\beta_0$ and $\gamma_0$ are real constants with $\alpha_0 \neq 0$. Notice that $x_0$ is the one and only
singular point of this equation. (Notice, also, that a standard Euler equation is a shifted Euler equation
with $x_0 = 0$.)
To solve this slight generalization of a standard Euler equation, use the obvious slight generalization
of the basic method for solving a standard Euler equation. First set
\[ y = (x - x_0)^r , \]
where $r$ is a yet unknown constant. Then plug this into the differential equation, compute, and
simplify. Unsurprisingly, you end up with the corresponding indicial equation
\[ \alpha_0 r(r-1) + \beta_0 r + \gamma_0 = 0 , \]
which you can rewrite as
\[ \alpha_0 r^2 + (\beta_0 - \alpha_0) r + \gamma_0 = 0 , \]
and then solve for $r$, just as with the standard Euler equation. And, as with a standard Euler equation,
there are then only three basic possibilities regarding the roots $r_1$ and $r_2$ to the indicial equation
and the corresponding solutions to the Euler equations:
1. The two roots can be two different real numbers, $r_1 \neq r_2$.
In this case, the general solution to the differential equation is
\[ y(x) = c_1 y_1(x) + c_2 y_2(x) \]
with, at least when $x > x_0$,¹
\[ y_1(x) = (x - x_0)^{r_1} \quad\text{and}\quad y_2(x) = (x - x_0)^{r_2} . \]
Observe that, for $j = 1$ or $j = 2$,
\[
\lim_{x \to x_0^+} |y_j(x)| = \lim_{x \to x_0^+} |x - x_0|^{r_j} =
\begin{cases}
0 & \text{if } r_j > 0 \\
1 & \text{if } r_j = 0 \\
+\infty & \text{if } r_j < 0
\end{cases} .
\]
¹ In all of these cases, the formulas and observations also hold when $x < x_0$, though we may wish to replace $x - x_0$ with
$|x - x_0|$ to avoid minor issues with $(x - x_0)^r$ for certain values of $r$ (such as $r = 1/2$).


2. The two roots can be the same real number, $r_2 = r_1$.

In this case, we can use reduction of order and find that the general solution to the
differential equation is (when $x > x_0$)
\[ y(x) = c_1 y_1(x) + c_2 y_2(x) \]
with
\[ y_1(x) = (x - x_0)^{r_1} \quad\text{and}\quad y_2(x) = (x - x_0)^{r_1} \ln|x - x_0| . \]
After recalling how $\ln|x - x_0|$ behaves when $x \approx x_0$, we see that
\[
\lim_{x \to x_0^+} |y_2(x)| = \lim_{x \to x_0^+} \left|(x - x_0)^{r_1} \ln|x - x_0|\right| =
\begin{cases}
0 & \text{if } r_1 > 0 \\
+\infty & \text{if } r_1 \le 0
\end{cases} .
\]

3. Finally, the two roots can be complex conjugates of each other,
\[ r_1 = \lambda + i\omega \quad\text{and}\quad r_2 = \lambda - i\omega \quad\text{with}\quad \omega > 0 . \]
After recalling that
\[ X^{\lambda \pm i\omega} = X^{\lambda}\left[\cos(\omega \ln|X|) \pm i \sin(\omega \ln|X|)\right] \quad\text{for}\quad X > 0 \]
(see the discussion of complex exponents in section 18.2), we find that the general solution
to the differential equation for $x > x_0$ can be given by
\[ y(x) = c_1 y_1(x) + c_2 y_2(x) \]
where $y_1$ and $y_2$ are the real-valued functions
\[ y_1(x) = (x - x_0)^{\lambda} \cos(\omega \ln|x - x_0|) \]
and
\[ y_2(x) = (x - x_0)^{\lambda} \sin(\omega \ln|x - x_0|) . \]
The behavior of these solutions as $x \to x_0$ is a bit more complicated. First observe that,
as $X$ goes from $1$ to $0$, $\ln|X|$ goes from $0$ to $-\infty$, which means that $\sin(\omega \ln|X|)$ and
$\cos(\omega \ln|X|)$ then must oscillate infinitely many times between their maximum and minimum
values of $1$ and $-1$, as illustrated in figure 32.1a. So $\sin(\omega \ln|X|)$ and $\cos(\omega \ln|X|)$ are
bounded, but do not approach any single value as $X \to 0$. Taking into account how $X^{\lambda}$
behaves (and replacing $X$ with $x - x_0$), we see that
\[ \lim_{x \to x_0^+} |y_i(x)| = 0 \quad\text{if}\quad \lambda > 0 , \]
and
\[ \lim_{x \to x_0^+} |y_i(x)| \ \text{does not exist} \quad\text{if}\quad \lambda \le 0 . \]

Notice that the behavior of these solutions as $x \to x_0$ depends strongly on the values of $r_1$
and $r_2$. Notice, also, that you can rarely arbitrarily assign the initial values $y(x_0)$ and $y'(x_0)$ for
these solutions.
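The three cases can be sorted out mechanically once the constants $\alpha_0$, $\beta_0$ and $\gamma_0$ are known. The following is only a minimal sketch, not part of the text's method: the function names `euler_exponents` and `classify_exponents` and the tolerance `tol` are illustrative assumptions, and the floating-point comparison against `tol` is a crude stand-in for the exact case distinction.

```python
import cmath


def euler_exponents(alpha0, beta0, gamma0):
    """Roots of the indicial equation alpha0*r*(r-1) + beta0*r + gamma0 = 0."""
    if alpha0 == 0:
        raise ValueError("alpha0 must be nonzero for an Euler equation")
    # Rewritten form: alpha0*r^2 + (beta0 - alpha0)*r + gamma0 = 0
    b = beta0 - alpha0
    root = cmath.sqrt(b * b - 4 * alpha0 * gamma0)
    return (-b + root) / (2 * alpha0), (-b - root) / (2 * alpha0)


def classify_exponents(alpha0, beta0, gamma0, tol=1e-12):
    """Return (case, data) for the three possibilities (hypothetical helper).

    "distinct": data = (r1, r2) with r1 > r2, both real
    "repeated": data = (r1,)
    "complex":  data = (lam, omega) with roots lam +/- i*omega, omega > 0
    """
    r1, r2 = euler_exponents(alpha0, beta0, gamma0)
    if abs(r1.imag) > tol:
        return "complex", (r1.real, abs(r1.imag))
    if abs(r1 - r2) <= tol:
        return "repeated", (r1.real,)
    return "distinct", (max(r1.real, r2.real), min(r1.real, r2.real))


if __name__ == "__main__":
    # Coefficient triples (alpha0, beta0, gamma0) chosen to hit each case.
    for triple in [(1, 0, -2), (1, -1, 1), (1, 1, 1)]:
        print(triple, classify_exponents(*triple))
```

With the case in hand, the limiting behavior as $x \to x_0^+$ can be read off from the sign of $r_j$ (or of $\lambda$ in the complex case), exactly as in the three limits above.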

Example 32.1: Consider the shifted Euler equation
\[ (x - 3)^2 y'' - 2y = 0 . \]


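[Figure 32.1: Graphs of (a) $\sin(\omega \ln|X|)$, and (b) $X^{\lambda} \sin(\omega \ln|X|)$ with $\omega = 10$ and $\lambda = 11/10$. Both panels are plotted against $X$ up to $X = 1$, with vertical ticks at $1$ and $-1$.]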

If $y = (x - 3)^r$ for any constant $r$, then
\[
\begin{aligned}
& (x - 3)^2 y'' - 2y = 0 \\
\rightarrow\quad & (x - 3)^2 \left[(x - 3)^r\right]'' - 2(x - 3)^r = 0 \\
\rightarrow\quad & (x - 3)^2\, r(r-1)(x - 3)^{r-2} - 2(x - 3)^r = 0 \\
\rightarrow\quad & (x - 3)^r \left[r(r-1) - 2\right] = 0 .
\end{aligned}
\]
Thus, for $y = (x - 3)^r$ to be a solution to our differential equation, $r$ must satisfy the indicial
equation
\[ r(r-1) - 2 = 0 . \]
After rewriting this as
\[ r^2 - r - 2 = 0 , \]
and factoring,
\[ (r - 2)(r + 1) = 0 , \]
we see that the two solutions to the indicial equation are $r = 2$ and $r = -1$. Thus, the general
solution to our differential equation is
\[ y = c_1 (x - 3)^2 + c_2 (x - 3)^{-1} . \]
This has one term that vanishes as $x \to 3$ and another that blows up as $x \to 3$. In particular,
we cannot insist that $y(3)$ be any particular nonzero number, say, “$y(3) = 2$”.

Example 32.2: Now consider
\[ x^2 y'' + x y' + y = 0 , \]
which is a shifted Euler equation, but with “shift” $x_0 = 0$.

The indicial equation for this is
\[ r(r-1) + r + 1 = 0 , \]


which simplifies to
\[ r^2 + 1 = 0 . \]
So,
\[ r = \pm\sqrt{-1} = \pm i . \]
In other words, $r = \lambda \pm i\omega$ with $\lambda = 0$ and $\omega = 1$. Consequently, the general solution to the
differential equation in this example is given by
\[ y = c_1 x^0 \cos(1 \ln|x|) + c_2 x^0 \sin(1 \ln|x|) , \]
which we naturally would prefer to write more simply as
\[ y = c_1 \cos(\ln|x|) + c_2 \sin(\ln|x|) . \]
Here, neither term vanishes or blows up. Instead, as $x$ goes from $1$ to $0$, we have $\ln|x|$ going
from $0$ to $-\infty$. This means that the sine and cosine terms oscillate infinitely many times as $x$
goes from $1$ to $0$, similar to the function illustrated in figure 32.1a. Again, there is no way we
can require that “$y(0) = 2$”.

The “blowing up” or “vanishing” of solutions illustrated above is typical behavior of solutions
about the singular points of their differential equations. Sometimes it is important to know just how
the solutions to a differential equation behave near a singular point x 0 . For example, if you know
a solution describes some process that does not blow up as x → x 0 , then you know that those
solutions that do blow up are irrelevant to your problem (this will become very important when we
study boundary-value problems).
The tools we will develop in this chapter will yield “modified power series solutions” around
singular points for at least some differential equations. The behavior of the solutions near these
singular points can then be deduced from these modified power series. For which equations will
this analysis be appropriate? Before answering that, I must first tell you the difference between a
“regular” and an “irregular” singular point.

32.2 Regular and Irregular Singular Points (and the Frobenius Radius of Convergence)
Basic Terminology
Assume $x_0$ is a singular point on the real line for some given homogeneous linear second-order
differential equation
\[ a y'' + b y' + c y = 0 . \tag{32.1} \]
We will say that $x_0$ is a regular singular point for this differential equation if and only if that
differential equation can be rewritten as
\[ (x - x_0)^2 \alpha(x) y'' + (x - x_0)\beta(x) y' + \gamma(x) y = 0 \tag{32.2} \]
where $\alpha$, $\beta$ and $\gamma$ are ‘suitable, well-behaved’ functions about $x_0$ with $\alpha(x_0) \neq 0$. The above
form will be called a quasi-Euler form about $x_0$ for the given differential equation, and the shifted
Euler equation
\[ (x - x_0)^2 \alpha_0 y'' + (x - x_0)\beta_0 y' + \gamma_0 y = 0 \tag{32.3a} \]


where
\[ \alpha_0 = \alpha(x_0) , \quad \beta_0 = \beta(x_0) \quad\text{and}\quad \gamma_0 = \gamma(x_0) , \tag{32.3b} \]
will be called the associated (shifted) Euler equation (about $x_0$).

Precisely what we mean above by “ ‘suitable, well-behaved’ functions about x 0 ” depends,
in practice, on the coefficients of the original differential equation. In general, it means that the
functions α , β and γ in equation (32.2) are all analytic at x 0 . However, if the coefficients of the
original equation (the a , b and c in equation (32.1)) are rational functions, then we will be able to
further insist that the α , β and γ in the quasi-Euler form be polynomials.
It is quite possible that our differential equation cannot be written in quasi-Euler form about x 0 .
Then we will say x 0 is an irregular singular point for our differential equation. Thus, every singular
point on the real line for our differential equation is classified as being regular or irregular depending
on whether the differential equation can be rewritten in quasi-Euler form about that singular point.
While we are at it, let’s also define the Frobenius radius of analyticity about any point z 0 for
our differential equation. It is simply the distance between z 0 and the closest singular point z s other
than z 0 , provided such a point exists. If no such z s exists, then the Frobenius radius of analyticity
is ∞ . The Frobenius radius of analyticity will play almost the same role as played by the radius of
analyticity in the previous two chapters. In fact, it is the radius of analyticity if z 0 happens to be an
ordinary point.

Example 32.3 (Bessel equations): Let $\nu$ be any positive real constant. Bessel’s equation of
order $\nu$ is the differential equation²
\[ y'' + \frac{1}{x} y' + \left[1 - \left(\frac{\nu}{x}\right)^2\right] y = 0 . \]
The coefficients of this,
\[ a(x) = 1 , \quad b(x) = \frac{1}{x} \quad\text{and}\quad c(x) = 1 - \left(\frac{\nu}{x}\right)^2 = \frac{x^2 - \nu^2}{x^2} , \]
are rational functions. Multiplying through by $x^2$ then gives us
\[ x^2 y'' + x y' + \left[x^2 - \nu^2\right] y = 0 . \tag{32.4} \]

Clearly, x 0 = 0 is the one and only singular point for this differential equation. This, in turn,
means that the Frobenius radius of analyticity about x 0 = 0 is ∞ (though the radius of analyticity,
as deﬁned in chapter 30, is 0 since x 0 = 0 is a singular point.)
Now observe that the last differential equation is
\[ (x - 0)^2 \alpha(x) y'' + (x - 0)\beta(x) y' + \gamma(x) y = 0 \tag{32.5a} \]
where $\alpha$, $\beta$ and $\gamma$ are the simple polynomials
\[ \alpha(x) = 1 , \quad \beta(x) = 1 \quad\text{and}\quad \gamma(x) = x^2 - \nu^2 , \tag{32.5b} \]
which are certainly analytic about $x_0 = 0$ (and every other point on the complex plane). Moreover,
since
\[ \alpha(0) = 1 \neq 0 , \quad \beta(0) = 1 \quad\text{and}\quad \gamma(0) = -\nu^2 , \]
we now have that:
² Bessel’s equations and their solutions often arise in two-dimensional problems involving circular objects.


1. The Bessel equation of order $\nu$ can be written in quasi-Euler form about $x_0 = 0$.

2. The singular point $x_0 = 0$ is a regular singular point.
and
3. The associated Euler equation about $x_0 = 0$ is
\[ (x - 0)^2 \cdot 1\, y'' + (x - 0) \cdot 1\, y' - \nu^2 y = 0 , \]
which, of course, is normally written
\[ x^2 y'' + x y' - \nu^2 y = 0 . \]

Two quick notes before going on:

1. Often, the point of interest is $x_0 = 0$, in which case we will write equation (32.2) more
simply as
\[ x^2 \alpha(x) y'' + x\beta(x) y' + \gamma(x) y = 0 \tag{32.6} \]
with the associated Euler equation being
\[ x^2 \alpha_0 y'' + x\beta_0 y' + \gamma_0 y = 0 \tag{32.7a} \]
where
\[ \alpha_0 = \alpha(0) , \quad \beta_0 = \beta(0) \quad\text{and}\quad \gamma_0 = \gamma(0) . \tag{32.7b} \]

2. In fact, any singular point on the complex plane can be classiﬁed as regular or irregular.
However, this won’t be particularly relevant to us. Our interest will only be in whether given
singular points on the real line are regular or not.

Testing for Regularity

As illustrated in our last example, if the coefficients in our original differential equation are relatively
simple rational functions, then it can be relatively straightforward to show that a given singular point
x 0 is or is not regular by seeing if we can or cannot rewrite the equation in quasi-Euler form about
x 0 . An advantage of deriving this quasi-Euler form (if possible) is that we will want this quasi-Euler
form in solving our differential equation. However, there are possible difficulties. If we cannot
rewrite our equation in quasi-Euler form, then we may be left with the question of whether x 0 truly
is an irregular singular point, or whether we just weren’t clever enough to get the equation into
quasi-Euler form. Also, if the coefficients in our original equation are not so simple, then the process
of attempting to convert it to quasi-Euler form may be quite challenging.
A useful test for regularity is easily derived by first assuming that $x_0$ is a regular singular point
for
\[ a(x) y'' + b(x) y' + c(x) y = 0 . \tag{32.8} \]
By definition, this means that this differential equation can be rewritten as
\[ (x - x_0)^2 \alpha(x) y'' + (x - x_0)\beta(x) y' + \gamma(x) y = 0 \tag{32.9} \]
where $\alpha$, $\beta$ and $\gamma$ are functions analytic at $x_0$ with $\alpha(x_0) \neq 0$. Dividing each of these two
equations by its first coefficient converts our differential equation, respectively, to the two forms
\[ y'' + \frac{b(x)}{a(x)} y' + \frac{c(x)}{a(x)} y = 0 \]


and
\[ y'' + \frac{\beta(x)}{(x - x_0)\alpha(x)} y' + \frac{\gamma(x)}{(x - x_0)^2 \alpha(x)} y = 0 . \]
But these last two equations describe the same differential equation, and have the same first coefficients.
Clearly, the other coefficients must also be the same, giving us
\[ \frac{b(x)}{a(x)} = \frac{\beta(x)}{(x - x_0)\alpha(x)} \quad\text{and}\quad \frac{c(x)}{a(x)} = \frac{\gamma(x)}{(x - x_0)^2 \alpha(x)} . \]
Equivalently,
\[ (x - x_0)\frac{b(x)}{a(x)} = \frac{\beta(x)}{\alpha(x)} \quad\text{and}\quad (x - x_0)^2 \frac{c(x)}{a(x)} = \frac{\gamma(x)}{\alpha(x)} . \]
From this and the fact that $\alpha$, $\beta$ and $\gamma$ are analytic at $x_0$ with $\alpha(x_0) \neq 0$ we get that the two
limits
\[ \lim_{x \to x_0} (x - x_0)\frac{b(x)}{a(x)} = \lim_{x \to x_0} \frac{\beta(x)}{\alpha(x)} = \frac{\beta(x_0)}{\alpha(x_0)} \]
and
\[ \lim_{x \to x_0} (x - x_0)^2 \frac{c(x)}{a(x)} = \lim_{x \to x_0} \frac{\gamma(x)}{\alpha(x)} = \frac{\gamma(x_0)}{\alpha(x_0)} \]
are finite.
So, x 0 being a regular singular point assures us that the above limits are finite. Consequently,
if those limits are not finite, then x 0 cannot be a regular singular point for our differential equation.
That gives us

Lemma 32.1 (test for irregularity)

Assume $x_0$ is a singular point on the real line for
\[ a(x) y'' + b(x) y' + c(x) y = 0 . \]
If either of the two limits
\[ \lim_{x \to x_0} (x - x_0)\frac{b(x)}{a(x)} \quad\text{and}\quad \lim_{x \to x_0} (x - x_0)^2 \frac{c(x)}{a(x)} \]
is not finite, then $x_0$ is an irregular singular point for the differential equation.

This lemma is just a test for irregularity. It can be expanded to a more complete test if we
make mild restrictions on the coefﬁcients of the original differential equation. In particular, using
properties of polynomials and rational functions, we can verify the following:

Theorem 32.2 (testing for regular singular points (ver.1))

Assume $x_0$ is a singular point on the real line for
\[ a(x) y'' + b(x) y' + c(x) y = 0 \]
where $a$, $b$ and $c$ are rational functions. Then $x_0$ is a regular singular point for this differential
equation if and only if the two limits
\[ \lim_{x \to x_0} (x - x_0)\frac{b(x)}{a(x)} \quad\text{and}\quad \lim_{x \to x_0} (x - x_0)^2 \frac{c(x)}{a(x)} \]


are both finite values. Moreover, if $x_0$ is a regular singular point for the differential equation, then
this differential equation can be written in quasi-Euler form
\[ (x - x_0)^2 \alpha(x) y'' + (x - x_0)\beta(x) y' + \gamma(x) y = 0 \]
where $\alpha$, $\beta$ and $\gamma$ are polynomials with $\alpha(x_0) \neq 0$.

The full proof of this (along with a similar theorem applicable when the coefﬁcients are merely
quotients of analytic functions) is discussed in an appendix, section 32.7.

Example 32.4: Consider the differential equation
\[ y'' + \frac{1}{x^2} y' + \left[1 - \frac{1}{x^2}\right] y = 0 . \]
Clearly, $x_0 = 0$ is a singular point for this equation. Writing out the limits given in the above
theorem (with $x_0 = 0$) yields
\[ \lim_{x \to 0} (x - 0)\frac{b(x)}{a(x)} = \lim_{x \to 0} x \cdot \frac{x^{-2}}{1} = \lim_{x \to 0} \frac{1}{x} \]
and
\[ \lim_{x \to 0} (x - 0)^2 \frac{c(x)}{a(x)} = \lim_{x \to 0} x^2 \cdot \frac{1 - x^{-2}}{1} = \lim_{x \to 0} \left[x^2 - 1\right] . \]
The first limit is certainly not finite. So our test for regularity tells us that $x_0 = 0$ is an irregular
singular point for this differential equation.

Example 32.5: Consider
\[ 2x y'' - 4y' - y = 0 . \]
Again, $x_0 = 0$ is clearly the only singular point. Now,
\[ \lim_{x \to 0} (x - 0)\frac{b(x)}{a(x)} = \lim_{x \to 0} x \cdot \frac{-4}{2x} = -2 \]
and
\[ \lim_{x \to 0} (x - 0)^2 \frac{c(x)}{a(x)} = \lim_{x \to 0} x^2 \cdot \frac{-1}{2x} = 0 , \]
both of which are finite. So $x_0 = 0$ is a regular singular point for our equation, and the given
differential equation can be written in quasi-Euler form about $0$. In fact, that form is obtained
by simply multiplying the original differential equation by $x$,
\[ 2x^2 y'' - 4x y' - x y = 0 . \]
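When the coefficients are polynomials (which can always be arranged for rational coefficients by clearing denominators), the two limits in theorem 32.2 are finite exactly when the zero of $a$ at $x_0$ has order at most one more than the zero of $b$ there, and at most two more than the zero of $c$. The sketch below applies that restatement. It is an illustration under stated assumptions, not the text's procedure: polynomials are passed as coefficient lists (highest degree first), and the names `zero_order` and `classify_point` are hypothetical.

```python
from fractions import Fraction


def zero_order(poly, x0):
    """Multiplicity of x0 as a root of poly (coefficients, highest degree first).

    Returns None for the zero polynomial.
    """
    coeffs = [Fraction(c) for c in poly]
    if all(c == 0 for c in coeffs):
        return None
    x0 = Fraction(x0)
    order = 0
    while True:
        # Synthetic division by (x - x0); the last value is the remainder.
        quotient = []
        carry = Fraction(0)
        for c in coeffs:
            carry = carry * x0 + c
            quotient.append(carry)
        remainder = quotient.pop()
        if remainder != 0:
            return order
        coeffs = quotient
        order += 1


def classify_point(a, b, c, x0):
    """Classify x0 for a(x) y'' + b(x) y' + c(x) y = 0 with polynomial a, b, c."""
    ka, kb, kc = (zero_order(p, x0) for p in (a, b, c))
    if ka is None:
        raise ValueError("a(x) must not be identically zero")
    inf = float("inf")
    kb = inf if kb is None else kb
    kc = inf if kc is None else kc
    # Discard any power of (x - x0) shared by all three coefficients.
    shared = min(ka, kb, kc)
    ka, kb, kc = ka - shared, kb - shared, kc - shared
    if ka == 0:
        return "ordinary"
    if ka - kb <= 1 and ka - kc <= 2:
        return "regular singular"
    return "irregular singular"


if __name__ == "__main__":
    # Example 32.4 multiplied by x^2:  x^2 y'' + y' + (x^2 - 1) y = 0
    print(classify_point([1, 0, 0], [1], [1, 0, -1], 0))
    # Example 32.5:  2x y'' - 4 y' - y = 0
    print(classify_point([2, 0], [-4], [-1], 0))
```

Exact rational arithmetic (`Fraction`) is used so that the root test is not affected by rounding; that choice is a convenience of this sketch.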


32.3 The (Basic) Method of Frobenius

Motivation and Preliminary Notes
Let’s suppose we have a second-order differential equation with $x_0$ as a regular singular point. Then,
as we just discussed, this differential equation can be written as
\[ (x - x_0)^2 \alpha(x) y'' + (x - x_0)\beta(x) y' + \gamma(x) y = 0 \]
where $\alpha$, $\beta$ and $\gamma$ are analytic at $x_0$ with $\alpha(x_0) \neq 0$. By continuity, when $x \approx x_0$, we have
\[ \alpha(x) \approx \alpha_0 , \quad \beta(x) \approx \beta_0 \quad\text{and}\quad \gamma(x) \approx \gamma_0 \]
where
\[ \alpha_0 = \alpha(x_0) , \quad \beta_0 = \beta(x_0) \quad\text{and}\quad \gamma_0 = \gamma(x_0) . \]
It should then seem reasonable that any solution $y = y(x)$ to
\[ (x - x_0)^2 \alpha(x) y'' + (x - x_0)\beta(x) y' + \gamma(x) y = 0 \]
can be approximated, at least when $x \approx x_0$, by a corresponding solution to the associated shifted
Euler equation
\[ (x - x_0)^2 \alpha_0 y'' + (x - x_0)\beta_0 y' + \gamma_0 y = 0 . \]
And since some solutions to this shifted Euler equation are of the form $a_0 (x - x_0)^r$ where $a_0$ is an
arbitrary constant and $r$ is a solution to the corresponding indicial equation,
\[ \alpha_0 r(r-1) + \beta_0 r + \gamma_0 = 0 , \]
it seems reasonable to expect at least some solutions to our original differential equation to be
approximated by this $a_0 (x - x_0)^r$,
\[ y(x) \approx a_0 (x - x_0)^r \quad\text{at least when}\quad x \approx x_0 . \]
Now, this is equivalent to saying
\[ \frac{y(x)}{(x - x_0)^r} \approx a_0 \quad\text{when}\quad x \approx x_0 , \]
which is more precisely stated as
\[ \lim_{x \to x_0} \frac{y(x)}{(x - x_0)^r} = a_0 . \tag{32.10} \]

At this point, there are a number of ways we might ‘guess’ at solutions satisfying the last
approximation. Let us try a trick similar to one we’ve used before; namely, let’s assume that $y$ is
the known approximate solution $(x - x_0)^r$ multiplied by some yet unknown function $a(x)$,
\[ y(x) = (x - x_0)^r a(x) . \]
To satisfy equation (32.10), we must have
\[ a(x_0) = \lim_{x \to x_0} a(x) = \lim_{x \to x_0} \frac{y(x)}{(x - x_0)^r} = a_0 , \]


telling us that $a(x)$ is reasonably well-behaved near $x_0$, perhaps even analytic. Well, let’s hope it
is analytic, because that means we can express $a(x)$ as a power series about $x_0$ with the arbitrary
constant $a_0$ as the constant term,
\[ a(x) = \sum_{k=0}^{\infty} a_k (x - x_0)^k . \]
Then, with luck and skill, we might be able to use the methods developed in the previous chapters
to find the $a_k$’s in terms of $a_0$.
That is the starting point for what follows. We will assume a solution of the form
\[ y(x) = (x - x_0)^r \sum_{k=0}^{\infty} a_k (x - x_0)^k \]
where $r$ and the $a_k$’s are constants to be determined, with $a_0$ being arbitrary. This will yield the
“modified power series” solutions alluded to in the title of this chapter.
In the next subsection, we will describe a series of steps, generally called the (basic) method
of Frobenius, for finding such solutions. You will discover that much of it is very similar to the
algebraic method for finding power series solutions in chapter 30.
Before we start that, however, there are a few things worth mentioning about this method:

1. We will see that the method of Frobenius always yields at least one solution of the form
\[ y(x) = (x - x_0)^r \sum_{k=0}^{\infty} a_k (x - x_0)^k \]
where $r$ is a solution to the appropriate indicial equation. If the indicial equation for the
corresponding Euler equation
\[ (x - x_0)^2 \alpha_0 y'' + (x - x_0)\beta_0 y' + \gamma_0 y = 0 \]
has two distinct solutions, then we will see that the method often (but not always) leads to an
independent pair of such solutions.

2. If, however, that indicial equation has only one solution $r$, then the fact that the corresponding
shifted Euler equation has a ‘second solution’ in the form
\[ (x - x_0)^r \ln|x - x_0| \]
may lead you to suspect that a ‘second solution’ to our original equation is of the form
\[ (x - x_0)^r \ln|x - x_0| \sum_{k=0}^{\infty} a_k (x - x_0)^k . \]
That turns out to be almost the case.

Just what can be done when the basic Frobenius method does not yield an independent pair of
solutions will be discussed in the next chapter.


The (Basic) Method of Frobenius

Suppose we wish to find the solutions about some regular singular point $x_0$ to some second-order
homogeneous linear differential equation
\[ a(x) y'' + b(x) y' + c(x) y = 0 \]
with $a(x)$, $b(x)$, and $c(x)$ being rational functions.³ In particular, let us seek solutions to Bessel’s
equation of order $1/2$,
\[ y'' + \frac{1}{x} y' + \left[1 - \frac{1}{4x^2}\right] y = 0 \tag{32.11} \]
near the regular singular point $x_0 = 0$.
As with the algebraic method for finding power series solutions, there are two preliminary
steps:
Pre-Step 1. If not already specified, choose the regular singular point $x_0$.
For our example, we choose $x_0 = 0$, which we know is the only regular singular
point from the discussion in the previous section.

Pre-Step 2. Get the differential equation into quasi-Euler form; that is, into the form
\[ (x - x_0)^2 \alpha(x) y'' + (x - x_0)\beta(x) y' + \gamma(x) y = 0 \tag{32.12} \]
where $\alpha$, $\beta$ and $\gamma$ are polynomials, with $\alpha(x_0) \neq 0$, and with no factors shared by all
three.⁴
To get the given differential equation into the form desired, we multiply equation
(32.11) by $4x^2$. That gives us the differential equation
\[ 4x^2 y'' + 4x y' + \left[4x^2 - 1\right] y = 0 . \tag{32.13} \]
(Yes, we could have just multiplied by $x^2$, but getting rid of any fractions will
simplify computation.)

Now for the basic method of Frobenius:

Step 1. (a) Assume a solution of the form

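\[
\begin{aligned}
0 &= \sum_{k=0}^{\infty} a_k 4(k+r)(k+r-1)x^k + \sum_{k=0}^{\infty} a_k 4(k+r)x^k \\
&\quad + \sum_{k=0}^{\infty} a_k 4x^{k+2} + \sum_{k=0}^{\infty} a_k (-1)x^k .
\end{aligned}
\]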

Step 4. For each series in your last equation, do a change of index so that each series looks like
\[ \sum_{n=\text{something}}^{\infty} \big[\text{something not involving } x\big] (x - x_0)^n . \]
Be sure to appropriately adjust the lower limit in each series.

In all but the third series in the example, the change of index is trivial, $n = k$.
In the third series, we will set $n = k + 2$ (equivalently, $n - 2 = k$). This
means, in the third series, replacing $k$ with $n - 2$, and replacing $k = 0$ with
$n = 0 + 2 = 2$:
\[
\begin{aligned}
0 &= \underbrace{\sum_{k=0}^{\infty} a_k 4(k+r)(k+r-1)x^k}_{n = k} + \underbrace{\sum_{k=0}^{\infty} a_k 4(k+r)x^k}_{n = k} \\
&\quad + \underbrace{\sum_{k=0}^{\infty} a_k 4x^{k+2}}_{n = k+2} + \underbrace{\sum_{k=0}^{\infty} a_k (-1)x^k}_{n = k} \\
&= \sum_{n=0}^{\infty} a_n 4(n+r)(n+r-1)x^n + \sum_{n=0}^{\infty} a_n 4(n+r)x^n \\
&\quad + \sum_{n=2}^{\infty} a_{n-2} 4x^n + \sum_{n=0}^{\infty} a_n (-1)x^n .
\end{aligned}
\]

formula of r = 0 .

This is the indicial equation for the differential equation. It will be a quadratic equation (we’ll
see why later). Solve this equation for $r$. You will get two solutions (sometimes called either
the exponents of the solution or the exponents of the singularity). Denote them by $r_1$ and
$r_2$. If the exponents are real (which is common in applications), label the exponents so that
$r_1 \ge r_2$. If the exponents are not real, then it does not matter which is labeled as $r_1$.¹¹
In our example, the first term in the “big series” is the first term in equation (32.16),
\[ a_0 \left[4r^2 - 1\right] x^0 . \]
Since this must be zero (and $a_0 \neq 0$ by assumption) the indicial equation is
\[ 4r^2 - 1 = 0 . \tag{32.17} \]
Thus,
\[ r = \pm\sqrt{\frac{1}{4}} = \pm\frac{1}{2} . \]
Following the convention given above (that $r_1 \ge r_2$),
\[ r_1 = \frac{1}{2} \quad\text{and}\quad r_2 = -\frac{1}{2} . \]

Step 7. Using $r_1$ (the larger $r$ if the exponents are real):

(a) Plug $r_1$ into the last series equation (and simplify, if possible). This will give you an
equation of the form
\[ \sum_{n=n_0}^{\infty} \big[n\text{th formula of } a_j\text{’s}\big] (x - x_0)^n = 0 . \]
Since each term must vanish, we must have
\[ n\text{th formula of } a_j\text{’s} = 0 \quad\text{for}\quad n_0 \le n . \]

(b) Solve this last set of equations for¹²
\[ a_{\text{highest index}} = \text{formula of } n \text{ and lower-indexed } a_j\text{’s} . \]
A few of these equations may need to be treated separately, but you will also obtain
a relatively simple formula that holds for all indices above some fixed value. This
formula is the recursion formula for computing each coefficient $a_n$ from the previously
computed coefficients.

¹¹ We are assuming the coefficients of our differential equation ($\alpha$, $\beta$ and $\gamma$) are real-valued functions on the real line. In
the very unlikely case they are not, then a more general convention should be used: If the solutions to the indicial equation
differ by an integer, then label them so that $r_1 - r_2 \ge 0$. Otherwise, it does not matter which you call $r_1$ and which you
call $r_2$. The reason for this convention will become apparent later (in section 32.5) after we further discuss the formulas
arising from the Frobenius method.
¹² If you did as suggested earlier and put the differential equation into quasi-Euler form, then $n$ will be the “highest index”
in this equation.


(c) (Optional) To simplify things just a little, do another change of indices so that the
recursion formula just derived is rewritten as
\[ a_k = \text{formula of } k \text{ and lower-indexed coefficients} . \]
Letting $r = r_1 = 1/2$ in equation (32.16) yields
\[
\begin{aligned}
0 &= a_0\left[4r^2 - 1\right]x^0 + a_1\left[4r^2 + 8r + 3\right]x^1 + \sum_{n=2}^{\infty} \left\{a_n\left[4(n+r)^2 - 1\right] + 4a_{n-2}\right\} x^n \\
&= a_0\left[4\left(\tfrac{1}{2}\right)^2 - 1\right]x^0 + a_1\left[4\left(\tfrac{1}{2}\right)^2 + 8\left(\tfrac{1}{2}\right) + 3\right]x^1 + \sum_{n=2}^{\infty} \left\{a_n\left[4\left(n + \tfrac{1}{2}\right)^2 - 1\right] + 4a_{n-2}\right\} x^n \\
&= a_0[0]x^0 + a_1 8x^1 + \sum_{n=2}^{\infty} \left\{a_n\left[4n^2 + 4n\right] + 4a_{n-2}\right\} x^n .
\end{aligned}
\]
The first term vanishes (as it should since $r = 1/2$ satisfies the indicial equation,
which came from making the first term vanish). Doing a little more simple
algebra, we see that, with $r = 1/2$, equation (32.16) reduces to
\[ 0a_0 x^0 + 8a_1 x^1 + \sum_{n=2}^{\infty} 4\left[n(n+1)a_n + a_{n-2}\right] x^n = 0 . \tag{32.18} \]
Since the individual terms in this series must vanish, we have
\[ 0a_0 = 0 , \quad 8a_1 = 0 \]
and
\[ n(n+1)a_n + a_{n-2} = 0 \quad\text{for}\quad n = 2, 3, 4, \ldots . \]
Solving for $a_n$ gives us the recursion formula
\[ a_n = \frac{-1}{n(n+1)} a_{n-2} \quad\text{for}\quad n = 2, 3, 4, \ldots . \]
Using the trivial change of index, $k = n$, this is
\[ a_k = \frac{-1}{k(k+1)} a_{k-2} \quad\text{for}\quad k = 2, 3, 4, \ldots . \tag{32.19} \]

(d) Use the recursion formula (and any corresponding formulas for the lower-order terms)
to find all the $a_k$’s in terms of $a_0$ and, possibly, one other $a_m$. Look for patterns!
From the first two terms in equation (32.18),
\[ 0a_0 = 0 \quad\Longrightarrow\quad a_0 \text{ is arbitrary,} \]
\[ 8a_1 = 0 \quad\Longrightarrow\quad a_1 = 0 . \]


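Using these values and recursion formula (32.19) with $k = 2, 3, 4, \ldots$ (and
looking for patterns):
\[
\begin{aligned}
a_2 &= \frac{-1}{2(2+1)} a_{2-2} = \frac{-1}{2 \cdot 3} a_0 = \frac{-1}{3 \cdot 2} a_0 , \\
a_3 &= \frac{-1}{3(3+1)} a_{3-2} = \frac{-1}{3 \cdot 4} a_1 = \frac{-1}{3 \cdot 4} \cdot 0 = 0 , \\
a_4 &= \frac{-1}{4(4+1)} a_{4-2} = \frac{-1}{4 \cdot 5} a_2 = \frac{-1}{5 \cdot 4} \cdot \frac{-1}{3 \cdot 2} a_0 = \frac{(-1)^2}{5!} a_0 , \\
a_5 &= \frac{-1}{5(5+1)} a_{5-2} = \frac{-1}{5 \cdot 6} \cdot 0 = 0 , \\
a_6 &= \frac{-1}{6(6+1)} a_{6-2} = \frac{-1}{6 \cdot 7} a_4 = \frac{-1}{7 \cdot 6} \cdot \frac{(-1)^2}{5!} a_0 = \frac{(-1)^3}{7!} a_0 , \\
&\ \ \vdots
\end{aligned}
\]
The patterns should be obvious here:
\[ a_k = 0 \quad\text{for}\quad k = 1, 3, 5, 7, \ldots , \]
and
\[ a_k = \frac{(-1)^{k/2}}{(k+1)!} a_0 \quad\text{for}\quad k = 2, 4, 6, 8, \ldots . \]
Using $k = 2m$, this can be written more conveniently as
\[ a_{2m} = (-1)^m \frac{a_0}{(2m+1)!} \quad\text{for}\quad m = 1, 2, 3, 4, \ldots . \]
Moreover, this last equation reduces to the trivially true statement “$a_0 = a_0$”
if $m = 0$. So, in fact, it gives all the even-indexed coefficients,
\[ a_{2m} = (-1)^m \frac{a_0}{(2m+1)!} \quad\text{for}\quad m = 0, 1, 2, 3, 4, \ldots . \]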

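(e) Using $r = r_1$ along with the formulas just derived for the coefficients, write out the
resulting series for $y$. Try to simplify it and factor out the arbitrary constant(s).
Plugging $r = 1/2$ and the formulas just derived for the $a_k$’s into the formula
originally assumed for $y$ (equation (32.15) on page 679), we get
\[
\begin{aligned}
y &= x^r \sum_{k=0}^{\infty} a_k x^k \\
&= x^r \left[\sum_{\substack{k=0 \\ k \text{ odd}}}^{\infty} a_k x^k + \sum_{\substack{k=0 \\ k \text{ even}}}^{\infty} a_k x^k\right] \\
&= x^{1/2} \left[\sum_{\substack{k=0 \\ k \text{ odd}}}^{\infty} 0 \cdot x^k + \sum_{m=0}^{\infty} (-1)^m \frac{a_0}{(2m+1)!} x^{2m}\right] \\
&= x^{1/2} \left[0 + a_0 \sum_{m=0}^{\infty} (-1)^m \frac{1}{(2m+1)!} x^{2m}\right] .
\end{aligned}
\]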


So one set of solutions to Bessel’s equation of order $1/2$ (equation (32.11) on
page 678) is given by
\[ y = a_0 x^{1/2} \sum_{m=0}^{\infty} \frac{(-1)^m}{(2m+1)!} x^{2m} \tag{32.20} \]
with $a_0$ being an arbitrary constant.
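The pattern found in step 7(d) can be checked numerically by running recursion (32.19) directly and comparing with the closed form $(-1)^m/(2m+1)!$. The sketch below does that, and also compares a partial sum of (32.20) with $\sin(x)/\sqrt{x}$. That comparison rests on an identification not made in the text above: since $\sum_{m} (-1)^m x^{2m+1}/(2m+1)!$ is the Maclaurin series for $\sin x$, the series in (32.20) with $a_0 = 1$ sums to $x^{-1/2}\sin x$ for $x > 0$. The function names are illustrative assumptions.

```python
import math


def coefficients_for_r1(count, a0=1.0):
    """First `count` coefficients from a_1 = 0 and recursion (32.19):
    a_k = -a_{k-2} / (k (k + 1)) for k >= 2."""
    a = [a0, 0.0]
    for k in range(2, count):
        a.append(-a[k - 2] / (k * (k + 1)))
    return a[:count]


def modified_partial_sum(x, r, coeffs):
    """Evaluate x**r * sum(a_k * x**k) for x > 0."""
    return x**r * sum(a_k * x**k for k, a_k in enumerate(coeffs))


if __name__ == "__main__":
    coeffs = coefficients_for_r1(30)
    for m in range(4):
        closed_form = (-1) ** m / math.factorial(2 * m + 1)
        print(2 * m, coeffs[2 * m], closed_form)
    x = 1.3
    print(modified_partial_sum(x, 0.5, coeffs), math.sin(x) / math.sqrt(x))
```

The two columns printed for each even index should agree to rounding error, as should the final pair of values.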

Step 8. If $r_2 = r_1$, skip this step and just go to the next. If the exponents are complex (which
is unlikely), see Dealing with Complex Exponents starting on page 698. But if the indicial
equation has two distinct real solutions, now try to repeat step 7 with the other exponent, $r_2$,
replacing $r_1$.
Before starting, however, you should be warned that, in attempting to redo step 7 with
r = r2 , one of three things will happen:

i. With luck, this step will lead to a solution of the form
\[ (x - x_0)^{r_2} \sum_{k=0}^{\infty} a_k (x - x_0)^k \]
with $a_0$ being the only arbitrary constant.

ii. This step can also lead to a series having two arbitrary constants, a0 and one other
coefﬁcient a M (our example will illustrate this). If you keep this second arbitrary
constant, you will then end up with two particular series solutions — one multiplied
by a0 and one multiplied by a M . However, it is easily shown (see exercise 32.8)
that the one multiplied by a M is simply the series solution already obtained at the
end of the previous step. Thus, in keeping a M arbitrary, you will rederive the one
solution you already have. Why bother? Go ahead and set a M = 0 and continue. It
will save you a lot of needless work.

iii. Finally, this step can lead to a contradiction. More precisely, the recursion formula
might “blow up” for one of the coefficients. This tells you that there is no solution of
the form
\[ (x - x_0)^{r_2} \sum_{k=0}^{\infty} a_k (x - x_0)^k \quad\text{with}\quad a_0 \neq 0 . \]
If this happens, make note of it, and skip to the next step.

(Why we have these three possibilities will be further discussed in Problems Possibly Arising
in Step 8 starting on page 693.)

Letting $r = r_2 = -1/2$ in equation (32.16) yields
\[ 0 = a_0\left[4r^2 - 1\right]x^0 + a_1\left[4r^2 + 8r + 3\right]x^1 + \sum_{n=2}^{\infty} \left\{a_n\left[4(n+r)^2 - 1\right] + 4a_{n-2}\right\} x^n \]


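\[
\begin{aligned}
&= a_0\left[4\left(-\tfrac{1}{2}\right)^2 - 1\right]x^0 + a_1\left[4\left(-\tfrac{1}{2}\right)^2 + 8\left(-\tfrac{1}{2}\right) + 3\right]x^1 + \sum_{n=2}^{\infty} \left\{a_n\left[4\left(n - \tfrac{1}{2}\right)^2 - 1\right] + 4a_{n-2}\right\} x^n \\
&= a_0 0x^0 + a_1 0x^1 + \sum_{n=2}^{\infty} \left\{a_n\left[4n^2 - 4n\right] + 4a_{n-2}\right\} x^n .
\end{aligned}
\]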

That is,
\[ 0a_0 x^0 + 0a_1 x^1 + \sum_{n=2}^{\infty} 4\left[a_n n(n-1) + a_{n-2}\right] x^n = 0 , \]
which means that
\[ 0a_0 = 0 , \quad 0a_1 = 0 \tag{32.21a} \]
and
\[ a_n n(n-1) + a_{n-2} = 0 \quad\text{for}\quad n = 2, 3, 4, \ldots . \tag{32.21b} \]
The first two equations hold for any values of $a_0$ and $a_1$. So, strictly speaking,
both $a_0$ and $a_1$ are arbitrary constants. But, as noted above, keeping $a_1$ arbitrary
in these calculations will simply lead to solution (32.20) with $a_1$ replacing $a_0$.
Since we don’t need to rederive this solution, we will simplify our work by setting
\[ a_1 = 0 . \]
From equation (32.21b) we get
\[ a_n = \frac{-1}{n(n-1)} a_{n-2} \quad\text{for}\quad n = 2, 3, 4, \ldots . \]
That is,
\[ a_k = \frac{-1}{k(k-1)} a_{k-2} \quad\text{for}\quad k = 2, 3, 4, \ldots . \tag{32.22} \]
This is the recursion formula for finding all the other $a_k$’s in terms of $a_0$ (which
is arbitrary) and $a_1$ (which we’ve set to $0$).
Why don’t you finish these computations as an exercise? You should have no
trouble in obtaining
\[ y = a_0 x^{-1/2} \sum_{m=0}^{\infty} \frac{(-1)^m}{(2m)!} x^{2m} . \tag{32.23} \]

as our second solution. Choosing $a_0 = 1$ and $a_1 = 0$, we get the second particular solution
\[ y_2(x) = x^{-1/2} \sum_{m=0}^{\infty} \frac{(-1)^m}{(2m)!} x^{2m} . \]
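As a hedged check on the computations left as an exercise, the sketch below builds the coefficients from recursion (32.22), differentiates the modified power series term by term, and evaluates the left side of equation (32.13), $4x^2y'' + 4xy' + [4x^2 - 1]y$, which should be close to zero for a long enough partial sum. It also illustrates possibility (ii) of step 8: starting from $a_0 = 0$, $a_1 = 1$ reproduces the first solution. The comparisons with $\cos(x)/\sqrt{x}$ and $\sin(x)/\sqrt{x}$ use the standard Maclaurin series for cosine and sine and are not identifications made in the text; all function names are illustrative assumptions.

```python
import math


def coefficients_for_r2(count, a0=1.0, a1=0.0):
    """First `count` coefficients from recursion (32.22):
    a_k = -a_{k-2} / (k (k - 1)) for k >= 2."""
    a = [a0, a1]
    for k in range(2, count):
        a.append(-a[k - 2] / (k * (k - 1)))
    return a[:count]


def series_value(x, r, coeffs):
    """y(x) = sum of a_k * x**(k + r), for x > 0."""
    return sum(a_k * x ** (k + r) for k, a_k in enumerate(coeffs))


def bessel_half_residual(x, r, coeffs):
    """Left side of (32.13) for the partial sum, differentiated term by term."""
    y = x_dy = x2_d2y = 0.0
    for k, a_k in enumerate(coeffs):
        term = a_k * x ** (k + r)
        y += term
        x_dy += (k + r) * term                  # x * y'
        x2_d2y += (k + r) * (k + r - 1) * term  # x^2 * y''
    return 4 * x2_d2y + 4 * x_dy + (4 * x * x - 1) * y


if __name__ == "__main__":
    x = 2.0
    even = coefficients_for_r2(40)                 # a0 = 1, a1 = 0
    odd = coefficients_for_r2(40, a0=0.0, a1=1.0)  # keeps only the a1 series
    print(bessel_half_residual(x, -0.5, even))
    print(series_value(x, -0.5, even), math.cos(x) / math.sqrt(x))
    print(series_value(x, -0.5, odd), math.sin(x) / math.sqrt(x))
```

The residual is not exactly zero because the series is truncated and evaluated in floating point; it only needs to be small relative to the size of the individual terms.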

While we are at it, let us agree that, for the rest of this chapter, $x_0$ always denotes a regular
singular point for some differential equation of interest, and that $r_1$ and $r_2$ always denote the
corresponding exponents; that is, the solutions to the corresponding indicial equation. Let us further
agree that these exponents are indexed according to the convention given in the basic Frobenius method,
with $r_1 \ge r_2$ if they are real.

32.4 Basic Notes on Using the Frobenius Method

The Obvious
One thing should be obvious: The method we’ve just outlined is even longer and more tedious than
the algebraic method used in chapter 30 to ﬁnd power series solutions to similar equations about
ordinary points. On the other hand, much of this method is based on that algebraic method, which,
by now, you have surely mastered.
Naturally, all the ‘practical advice’ given regarding the algebraic method in chapter 30 still
holds, including the recommendation that you use
\[ Y(X) = y(x) \quad\text{with}\quad X = x - x_0 \]
to simplify your calculations when $x_0 \neq 0$.

But there are a number of other things you should be aware of:

Solutions on Intervals with x < x0

On an interval with $x < x_0$, $(x - x_0)^r$ might be complex-valued (or even ambiguous) if $r$ is not
an integer. For example,
\[ (x - x_0)^{1/2} = (-|x - x_0|)^{1/2} = (-1)^{1/2}\, |x - x_0|^{1/2} = \pm i\, |x - x_0|^{1/2} . \]
More generally, we will have
\[ (x - x_0)^r = (-|x - x_0|)^r = (-1)^r\, |x - x_0|^r \quad\text{when}\quad x < x_0 . \]


But this is not a significant issue because $(-1)^r$ can be viewed as a constant (possibly complex)
and can be divided out of the final formula (or incorporated in the arbitrary constants). Thus, in our
final formulas for $y(x)$, we can replace
\[ (x - x_0)^r \quad\text{with}\quad |x - x_0|^r \]
to avoid having any explicitly complex-valued expressions (at least when $r$ is not complex). Keep
in mind that there is no reason to do this if $r$ is an integer.
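To see concretely that the two factors differ only by a constant when $x < x_0$, one can compare them numerically. This is a small sketch with illustrative values ($x_0 = 0$, $r = 1/2$) of our own choosing; it uses Python's principal-value complex power, which picks one of the two possible values of $(-1)^{1/2}$.

```python
x0, r = 0.0, 0.5  # illustrative values only, not from the text


def complex_power(x):
    """(x - x0)^r computed with Python's principal-value complex power."""
    return complex(x - x0) ** r


def real_power(x):
    """|x - x0|^r, which is real whenever r is real."""
    return abs(x - x0) ** r


# For several points to the left of x0, the ratio of the two factors is
# the same constant, (-1)^r (here, the principal value i).
ratios = [complex_power(x) / real_power(x) for x in (-0.25, -4.0, -9.0)]
for q in ratios:
    print(q)
```

Because the ratio does not depend on $x$, it can be absorbed into the arbitrary constant multiplying the solution, which is the point made above.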

Convergence of the Series

It probably won’t surprise you to learn that the Frobenius radius of analyticity serves as a lower bound
on the radius of convergence for the power series found in the Frobenius method. To be precise, we
have the following theorem:

Theorem 32.3
Let $x_0$ be a regular singular point for some second-order homogeneous linear differential equation
\[ a(x)y'' + b(x)y' + c(x)y = 0 \]
and let
\[ y(x) = |x - x_0|^r \sum_{k=0}^{\infty} a_k (x - x_0)^k \]
be a modified power series solution to the differential equation found by the basic method of
Frobenius. Then the radius of convergence for $\sum_{k=0}^{\infty} a_k (x - x_0)^k$ is at least equal to the
Frobenius radius of convergence about $x_0$ for this differential equation.

We will verify this claim in chapter 34. For now, let us simply note that this theorem assures
us that the given solution y is valid at least on the intervals

(x 0 − R, x 0 ) and (x 0 , x 0 + R)

where R is that Frobenius radius of convergence. Whether or not we can include the point x 0
depends on the value of the exponent r .
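When the singular points of an equation are already known, the radius $R$ in this theorem is easy to compute. The sketch below assumes the usual definition of the Frobenius radius of convergence (the distance in the complex plane from $x_0$ to the nearest singular point other than $x_0$, with $R = \infty$ if there is none); the function name and the sample singular points are ours.

```python
import math


def frobenius_radius(x0, singular_points, tol=1e-12):
    """Distance from x0 to the closest singular point other than x0.

    singular_points may contain real or complex numbers.
    Returns math.inf when x0 is the only singular point.
    """
    others = [z for z in singular_points if abs(z - x0) > tol]
    return min((abs(z - x0) for z in others), default=math.inf)


# x0 = 0 is the only singular point, so R is infinite.
print(frobenius_radius(0, [0]))

# Hypothetical equation with singular points at 0 and at +/- 2i.
print(frobenius_radius(0, [0, 2j, -2j]))
```

Complex singular points must be included in the list, since they limit the radius just as real ones do.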

!Example 32.6: To illustrate the Frobenius method, we found modified power series solutions
about $x_0 = 0$
\[ y_1(x) = x^{1/2} \sum_{m=0}^{\infty} \frac{(-1)^m}{(2m + 1)!}\, x^{2m} \quad\text{and}\quad y_2(x) = x^{-1/2} \sum_{m=0}^{\infty} \frac{(-1)^m}{(2m)!}\, x^{2m} \]
for Bessel's equation of order 1/2,
\[ y'' + \frac{1}{x}\, y' + \left[1 - \frac{1}{4x^2}\right] y = 0 . \]
Since there are no singular points for this differential equation other than $x_0 = 0$, the Frobenius
radius of convergence for this differential equation about $x_0 = 0$ is $R = \infty$. That means the
power series in the above formulas for $y_1$ and $y_2$ converge everywhere.
However, the $x^{1/2}$ and $x^{-1/2}$ factors multiplying these power series are not “well behaved”
at $x_0 = 0$: neither is differentiable there, and one becomes infinite as $x \to 0$. So, the above
formulas for $y_1$ and $y_2$ are valid only on intervals not containing $x = 0$, the largest of which
are $(0, \infty)$ and $(-\infty, 0)$. Of course, on $(-\infty, 0)$ the square roots yield imaginary values, and
we would prefer using the solutions
\[ y_1(x) = |x|^{1/2} \sum_{m=0}^{\infty} \frac{(-1)^m}{(2m + 1)!}\, x^{2m} \quad\text{and}\quad y_2(x) = |x|^{-1/2} \sum_{m=0}^{\infty} \frac{(-1)^m}{(2m)!}\, x^{2m} . \]

Variations of the Method

Naturally, there are several variations of “the basic method of Frobenius”. The one just given is
merely one the author ﬁnds convenient for initial discussion.
One variation you may want to consider is to find particular solutions without arbitrary constants
by setting $a_0 = 1$. That is, in step 1, assume
\[ y(x) = (x - x_0)^r \sum_{k=0}^{\infty} a_k (x - x_0)^k \quad\text{with}\quad a_0 = 1 . \]
Assuming $a_0$ is 1, instead of an arbitrary nonzero constant, leads to a first particular solution
\[ y_1(x) = (x - x_0)^{r_1} \sum_{k=0}^{\infty} a_k (x - x_0)^k \quad\text{with}\quad a_0 = 1 . \]
With a little thought, you will realize that this is exactly the same as you would have obtained at the
end of step 7, only not multiplied by an arbitrary constant. In particular, had we done this with the
Bessel's equation used to illustrate the method, we would have obtained
\[ y_1(x) = x^{1/2} \sum_{m=0}^{\infty} \frac{(-1)^m}{(2m + 1)!}\, x^{2m} \]
instead of formula (32.20) on page 685.

If $r_2 \neq r_1$, then, with luck, doing step 8 with $a_0 = 1$ will yield a second particular solution
\[ y_2(x) = (x - x_0)^{r_2} \sum_{k=0}^{\infty} a_k (x - x_0)^k \quad\text{with}\quad a_0 = 1 . \]
In particular, doing this with the illustrating example would have yielded
\[ y_2(x) = x^{-1/2} \sum_{m=0}^{\infty} \frac{(-1)^m}{(2m)!}\, x^{2m} , \]
instead of formula (32.23) on page 686.

Assuming the second particular solution can be found, this variant of the method yields a pair
of particular solutions $\{y_1, y_2\}$ that, because $r_1 \neq r_2$, is easily seen to be linearly independent over
any interval on which the formulas are valid. Thus,
\[ y(x) = c_1 y_1(x) + c_2 y_2(x) \]
is the general solution to the differential equation over this interval.


About the Indicial and Recursion Formulas 691

Using the Method When x0 is an Ordinary Point

It should be noted that we can also ﬁnd the power series solutions of a differential equation about an
ordinary point x 0 using the basic Frobenius method (ignoring, of course, the ﬁrst preliminary step).
In practice, though, it would be silly to go through the extra work in the Frobenius method when
you can use the shorter algebraic method from chapter 30. After all, if x 0 is an ordinary point for
our differential equation, then we already know the solution y can be written as

\[ y(x) = a_0 y_1(x) + a_1 y_2(x) \]
where $y_1(x)$ and $y_2(x)$ are power series about $x_0$ with
\[ y_1(x) = 1 + \text{a summation of terms of order 2 or more} = (x - x_0)^0 \sum_{k=0}^{\infty} b_k (x - x_0)^k \quad\text{with}\quad b_0 = 1 \]
and
\[ y_2(x) = 1 \cdot (x - x_0) + \text{a summation of terms of order 2 or more} = (x - x_0)^1 \sum_{k=0}^{\infty} c_k (x - x_0)^k \quad\text{with}\quad c_0 = 1 \]
(see Initial-Value Problems on page 610). Clearly then, $r_2 = 0$ and $r_1 = 1$, and the indicial
equation that would arise in using the Frobenius method would just be $r(r - 1) = 0$.
So why bother solving for the exponents r1 and r2 of a differential equation at a point x 0
when you don’t need to? It’s easy enough to determine whether a point x 0 is an ordinary point or
a regular singular point for your differential equation. Do so, and don’t waste your time using the
Frobenius method unless the point in question is a regular singular point.

32.5 About the Indicial and Recursion Formulas

In chapter 34, we will closely examine the formulas involved in the basic Frobenius method. Here
are a few things regarding the indicial equation and the recursion formulas that we will verify then
(and which you should observe in your own computations now).

The Indicial Equation and the Exponents

Remember that one of the preliminary steps has us rewriting the differential equation as
\[ (x - x_0)^2 \alpha(x) y'' + (x - x_0) \beta(x) y' + \gamma(x) y = 0 \]
where $\alpha$, $\beta$ and $\gamma$ are polynomials, with $\alpha(x_0) \neq 0$, and with no factors shared by all three. In
practice, these polynomials are almost always real (i.e., their coefficients are real numbers). Let us
assume this.
If you carefully follow the subsequent computations in the basic Frobenius method, you will
discover that the indicial equation is just as we suspected at the start of section 32.3, namely,
\[ \alpha_0 r(r - 1) + \beta_0 r + \gamma_0 = 0 , \]


where
\[ \alpha_0 = \alpha(x_0) , \quad \beta_0 = \beta(x_0) \quad\text{and}\quad \gamma_0 = \gamma(x_0) . \]
The exponents for our differential equation (i.e., the solutions to the indicial equation) can then be
found by rewriting the indicial equation as
\[ \alpha_0 r^2 + (\beta_0 - \alpha_0) r + \gamma_0 = 0 \]
and using basic algebra.

!Example 32.7: In describing the Frobenius method, we used Bessel's equation of order 1/2
with $x_0 = 0$. That equation was (on page 678) rewritten as
\[ 4x^2 y'' + 4x y' + [4x^2 - 1] y = 0 , \]
which is
\[ (x - 0)^2 \alpha(x) y'' + (x - 0) \beta(x) y' + \gamma(x) y = 0 \]
with
\[ \alpha(x) = 4 , \quad \beta(x) = 4 \quad\text{and}\quad \gamma(x) = 4x^2 - 1 . \]
So
\[ \alpha_0 = \alpha(0) = 4 , \quad \beta_0 = \beta(0) = 4 \quad\text{and}\quad \gamma_0 = \gamma(0) = -1 . \]
According to the above, the corresponding indicial equation should be
\[ 4r(r - 1) + 4r - 1 = 0 , \]
which simplifies to
\[ 4r^2 - 1 = 0 , \]
and which, indeed, is what we obtained (equation (32.17) on page 682) as the indicial equation
for Bessel's equation of order 1/2.
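The “basic algebra” mentioned above is just the quadratic formula, which can be sketched as follows. The function name is ours, and the ordering step (placing the root with the larger real part first) is our way of imitating the labeling convention $r_1 \ge r_2$ for real exponents.

```python
import cmath


def indicial_exponents(alpha0, beta0, gamma0):
    """Solve alpha0*r*(r-1) + beta0*r + gamma0 = 0, rewritten as
    alpha0*r^2 + (beta0 - alpha0)*r + gamma0 = 0.

    Returns (r1, r2) as complex numbers, with Re(r1) >= Re(r2).
    """
    if alpha0 == 0:
        raise ValueError("alpha0 must be nonzero at a regular singular point")
    b = beta0 - alpha0
    root_of_disc = cmath.sqrt(b * b - 4 * alpha0 * gamma0)
    roots = ((-b + root_of_disc) / (2 * alpha0),
             (-b - root_of_disc) / (2 * alpha0))
    r1, r2 = sorted(roots, key=lambda z: z.real, reverse=True)
    return r1, r2


# Bessel's equation of order 1/2: alpha0 = 4, beta0 = 4, gamma0 = -1.
print(indicial_exponents(4, 4, -1))

# A made-up case with complex conjugate exponents: r^2 + 1 = 0.
print(indicial_exponents(1, 1, 1))
```

Using `cmath.sqrt` rather than `math.sqrt` lets the same function handle a negative discriminant, in which case the two exponents come out as complex conjugates.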

From basic algebra, we know that, if the coefficients of the indicial equation
\[ \alpha_0 r^2 + (\beta_0 - \alpha_0) r + \gamma_0 = 0 \]
are all real numbers, then the two solutions to the indicial equation are either both real numbers
(possibly the same real number) or are complex conjugates of each other. This is usually the case
in practice (and will always be the case in the examples and exercises of this chapter).

The Recursion Formulas

After finding the solutions $r_1$ and $r_2$ to the indicial equation, the basic method of Frobenius has us
deriving the recursion formula corresponding to each of these $r$'s. In chapter 34, we'll discover that
each of these recursion formulas can always be written as
\[ a_k = \frac{F(k, r, a_0, a_1, \ldots, a_{k-1})}{(k + r - r_1)(k + r - r_2)} \quad\text{for}\quad k \ge \kappa \tag{32.25a} \]
where $\kappa$ is some positive integer, and $r$ is either $r_1$ or $r_2$, depending on whether this is the recursion
formula corresponding to $r_1$ or $r_2$, respectively. It also turns out that this formula is derived from
the seemingly equivalent relation
\[ (k + r - r_1)(k + r - r_2)\, a_k = F(k, r, a_0, a_1, \ldots, a_{k-1}) , \tag{32.25b} \]


which holds for every integer $k$ greater than or equal to $\kappa$. (This latter equation will be useful when
we discuss possible “degeneracies” in the recursion formula.)
At this point, all you need to know about $F(k, r, a_0, a_1, \ldots, a_{k-1})$ is that it is a formula that
yields a finite value for every possible choice of its variables. The actual formula is not that easily
described and would not be all that useful for the differential equations being considered here.
Of greater interest is the fact that the denominator in the recursion formula factors so simply
and predictably. The recursion formula obtained using $r = r_1$ will be
\[ a_k = \frac{F(k, r_1, a_0, a_1, \ldots, a_{k-1})}{(k + r_1 - r_1)(k + r_1 - r_2)} = \frac{F(k, r_1, a_0, a_1, \ldots, a_{k-1})}{k(k + r_1 - r_2)} , \]
while the recursion formula obtained using $r = r_2$ will be
\[ a_k = \frac{F(k, r_2, a_0, a_1, \ldots, a_{k-1})}{(k + r_2 - r_1)(k + r_2 - r_2)} = \frac{F(k, r_2, a_0, a_1, \ldots, a_{k-1})}{(k - [r_1 - r_2])k} . \]

!Example 32.8: In solving Bessel's equation of order 1/2 we obtained
\[ r_1 = \frac{1}{2} \quad\text{and}\quad r_2 = -\frac{1}{2} . \]
The recursion formulas corresponding to each of these were found to be
\[ a_k = \frac{-1}{k(k + 1)}\, a_{k-2} \quad\text{and}\quad a_k = \frac{-1}{k(k - 1)}\, a_{k-2} \quad\text{for}\quad k = 2, 3, 4, \ldots , \]
respectively (see formulas (32.19) and (32.22)). Looking at the recursion formula corresponding
to $r = 1/2$ we see that, indeed,
\[ a_k = \frac{-1}{k(k + 1)}\, a_{k-2} = \frac{-1}{k\left(k + \frac{1}{2} - \left[-\frac{1}{2}\right]\right)}\, a_{k-2} = \frac{-1}{k(k + r_1 - r_2)}\, a_{k-2} . \]
Likewise, looking at the recursion formula corresponding to $r = -1/2$ we see that
\[ a_k = \frac{-1}{(k - 1)k}\, a_{k-2} = \frac{-1}{\left(k + \left[-\frac{1}{2}\right] - \frac{1}{2}\right)k}\, a_{k-2} = \frac{-1}{(k + r_2 - r_1)k}\, a_{k-2} . \]
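The factoring seen in this example can also be confirmed mechanically. The sketch below uses the bracket $4(k + r)^2 - 1$ that multiplied $a_k$ in the series equation for Bessel's equation of order 1/2, and checks with exact fractions that it equals $4(k + r - r_1)(k + r - r_2)$ for both exponents. The helper names are ours.

```python
from fractions import Fraction

r1, r2 = Fraction(1, 2), Fraction(-1, 2)


def coefficient_of_a_k(k, r):
    """Bracket multiplying a_k in the series equation: 4(k + r)^2 - 1."""
    return 4 * (k + r) ** 2 - 1


def factored(k, r):
    """The same bracket in the predicted form 4 (k + r - r1)(k + r - r2)."""
    return 4 * (k + r - r1) * (k + r - r2)


for k in range(2, 50):
    # With r = r1 the denominator is (a multiple of) k (k + r1 - r2) = k (k + 1).
    assert coefficient_of_a_k(k, r1) == factored(k, r1) == 4 * k * (k + 1)
    # With r = r2 it is (a multiple of) (k + r2 - r1) k = (k - 1) k.
    assert coefficient_of_a_k(k, r2) == factored(k, r2) == 4 * k * (k - 1)

print("denominators factor as predicted for k = 2, ..., 49")
```

A check of this kind is the “partial error check” on your own recursion formulas in miniature: if the two expressions disagree for some $k$, an algebra slip has occurred somewhere.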

Knowing that the denominator in your recursion formulas should factor into either
\[ k(k + r_1 - r_2) \quad\text{or}\quad (k + r_2 - r_1)k \]
should certainly simplify your factoring of that denominator. And if your denominator does not so
factor, then you know you made an error in your computations. So it provides a partial error check.
It also leads us to our next topic.

Problems Possibly Arising in Step 8

In discussing step 8 of the basic Frobenius method, we indicated that there may be “problems” in
finding a modified series solution corresponding to $r_2$. Well, take a careful look at the recursion
formula obtained using $r_2$:
\[ a_k = \frac{F(k, r_2, a_0, a_1, \ldots, a_{k-1})}{(k - [r_1 - r_2])k} \quad\text{for}\quad k \ge \kappa \]


(remember, $\kappa$ is some positive integer). See the problem? That's correct, the denominator may be
zero for one of the $k$'s. And it occurs if (and only if) $r_1$ and $r_2$ differ by some integer $K$ greater
than or equal to $\kappa$,
\[ r_1 - r_2 = K . \]
Then, when we attempt to compute $a_K$, we get
\[ a_K = \frac{F(K, r_2, a_0, a_1, \ldots, a_{K-1})}{(K - [r_1 - r_2])K} = \frac{F(K, r_2, a_0, a_1, \ldots, a_{K-1})}{(K - K)K} = \frac{F(K, r_2, a_0, a_1, \ldots, a_{K-1})}{0} \; ! \]
Precisely what we can say about $a_K$ when this happens depends on whether the numerator in this
expression is zero or not. To clarify matters slightly, let's rewrite the above using recursion relation
(32.25b) which, in this case, reduces to
\[ 0 \cdot a_K = F(K, r_2, a_0, a_1, \ldots, a_{K-1}) . \tag{32.26} \]
Now:
1. If $F(K, r_2, a_0, a_1, \ldots, a_{K-1}) \neq 0$, then the recursion formula “blows up” at $k = K$,
\[ a_K = \frac{F(K, r_2, a_0, a_1, \ldots, a_{K-1})}{0} . \]
More precisely, $a_K$ must satisfy
\[ 0 \cdot a_K = F(K, r_2, a_0, a_1, \ldots, a_{K-1}) \neq 0 , \]
which is impossible. Hence, it is impossible for our differential equation to have a solution
of the form $(x - x_0)^{r_2} \sum_{k=0}^{\infty} a_k (x - x_0)^k$ with $a_0 \neq 0$.

2. If $F(K, r_2, a_0, a_1, \ldots, a_{K-1}) = 0$, then the recursion equation for $a_K$ reduces to
\[ a_K = \frac{0}{0} , \]
an indeterminate expression telling us nothing about $a_K$. More precisely, (32.26) reduces
to
\[ 0 \cdot a_K = 0 , \]
which is valid for any value of $a_K$; that is, $a_K$ is another arbitrary constant (in addition
to $a_0$). Moreover, a careful analysis will confirm that, in this case, we will re-derive the
modified power series solution corresponding to $r_1$ with $a_K$ being the arbitrary constant in
that series solution (see exercise 32.8).
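The second possibility can be watched in action with Bessel's equation of order 1/2, where $r_1 - r_2 = 1$ and the equation $0 \cdot a_1 = 0$ left $a_1$ arbitrary. The sketch below is ours, not the text's: it runs the $r_2$ recursion $a_k = -a_{k-2}/(k(k-1))$ with $a_0 = 0$ and $a_1 = 1$. Setting $a_0 = 0$ is used here only to isolate the part of the series generated by $a_1$, which is legitimate because the recursion is linear in the starting values.

```python
from fractions import Fraction
from math import factorial


def r2_coefficients(a0, a1, n_terms):
    """Coefficients from the r2 recursion a_k = -a_{k-2} / (k (k - 1))."""
    a = [Fraction(a0), Fraction(a1)]
    for k in range(2, n_terms):
        a.append(-a[k - 2] / (k * (k - 1)))
    return a


a = r2_coefficients(a0=0, a1=1, n_terms=10)

# The odd-indexed coefficients generated by a_1 ...
odd_part = [a[2 * m + 1] for m in range(5)]
# ... match the coefficients (-1)^m / (2m+1)! of the r1 solution.
first_solution = [Fraction((-1) ** m, factorial(2 * m + 1)) for m in range(5)]

print(odd_part == first_solution)
```

Since $x^{-1/2} \sum_m a_{2m+1} x^{2m+1} = x^{1/2} \sum_m a_{2m+1} x^{2m}$, matching coefficients means the terms generated by the arbitrary constant simply reproduce the solution already found for $r_1$.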
But why don't we worry about the denominator in the recursion formula obtained using $r_1$,
\[ a_k = \frac{F(k, r_1, a_0, a_1, \ldots, a_{k-1})}{k(k + r_1 - r_2)} \quad\text{for}\quad k \ge \kappa \; ? \]
Because this denominator is zero only if $r_1 - r_2$ is some negative integer, contrary to our labeling
convention of $r_1 \ge r_2$.

!Example 32.9: Consider
\[ 2x y'' - 4y' - y = 0 . \]


Multiplying this by $x$, we get it into the form
\[ 2x^2 y'' - 4x y' - x y = 0 . \]
As noted in example 32.5 on page 675, this equation has one regular singular point, $x_0 = 0$.
Setting
\[ y = y(x) = x^r \sum_{k=0}^{\infty} a_k x^k = \sum_{k=0}^{\infty} a_k x^{k+r} \]
and differentiating, we get (as before)
\[ y' = \frac{d}{dx} \sum_{k=0}^{\infty} a_k x^{k+r} = \sum_{k=0}^{\infty} a_k (k + r) x^{k+r-1} \]
and
\[ y'' = \frac{d^2}{dx^2} \sum_{k=0}^{\infty} a_k x^{k+r} = \sum_{k=0}^{\infty} a_k (k + r)(k + r - 1) x^{k+r-2} . \]
Plugging these into our differential equation:
\[ 0 = 2x^2 y'' - 4x y' - x y \]
\[ = 2x^2 \sum_{k=0}^{\infty} a_k (k + r)(k + r - 1) x^{k+r-2} - 4x \sum_{k=0}^{\infty} a_k (k + r) x^{k+r-1} - x \sum_{k=0}^{\infty} a_k x^{k+r} \]
\[ = \sum_{k=0}^{\infty} 2a_k (k + r)(k + r - 1) x^{k+r} - \sum_{k=0}^{\infty} 4a_k (k + r) x^{k+r} - \sum_{k=0}^{\infty} a_k x^{k+r+1} . \]

Dividing out the x r , reindexing, and grouping like terms:

\[ y(x) = c_+ y_+(x) + c_- y_-(x) \]
is a general solution to our differential equation.

However, the two solutions $y_+$ and $y_-$ will be complex valued. To avoid complex-valued
solutions, we can employ the trick used before to derive a corresponding linearly independent pair
$\{y_1, y_2\}$ of real-valued solutions to our differential equation by setting
\[ y_1(x) = \frac{1}{2}\left[y_+(x) + y_-(x)\right] \quad\text{and}\quad y_2(x) = \frac{1}{2i}\left[y_+(x) - y_-(x)\right] . \]
If the need arises, you can, with a little algebra, derive formulas for $y_1$ and $y_2$ in terms of the real
and imaginary parts of $(x - x_0)^{\lambda + i\omega}$ and the $\alpha_k$'s. To be honest, though, that need is unlikely to
ever arise.
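For a feel of what this combination does, it helps to look at the leading factors alone. The sketch below is a simplification of ours: it takes $x_0 = 0$, hypothetical values of $\lambda$ and $\omega$, and replaces each series by its first term 1, so that $y_\pm(x) = x^{\lambda \pm i\omega}$ (the situation for an Euler equation). The comparison relies on the standard identity $x^{i\omega} = \cos(\omega \ln x) + i \sin(\omega \ln x)$ for $x > 0$.

```python
import math

lam, omega = 0.5, 2.0  # hypothetical exponents lambda +/- i*omega


def y_plus(x):
    return complex(x) ** complex(lam, omega)


def y_minus(x):
    return complex(x) ** complex(lam, -omega)


def y1(x):
    """(1/2) [y_plus + y_minus]; real up to rounding."""
    return (y_plus(x) + y_minus(x)) / 2


def y2(x):
    """(1/(2i)) [y_plus - y_minus]; real up to rounding."""
    return (y_plus(x) - y_minus(x)) / 2j


x = 1.7
print(y1(x), x ** lam * math.cos(omega * math.log(x)))
print(y2(x), x ** lam * math.sin(omega * math.log(x)))
```

In the full Frobenius setting the series factors have complex coefficients as well, so the real-valued formulas are messier, but the mechanism is the same.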


Appendix: On Tests for Regular Singular Points 699

32.7 Appendix: On Tests for Regular Singular Points

Proof of Theorem 32.2
The validity of theorem 32.2 follows immediately from lemma 32.1 (which we’ve already proven)
and the next lemma.

Lemma 32.4
Let $a$, $b$ and $c$ be rational functions, and assume $x_0$ is a singular point on the real line for
\[ a(x)y'' + b(x)y' + c(x)y = 0 . \tag{32.28} \]
If the two limits
\[ \lim_{x \to x_0} (x - x_0) \frac{b(x)}{a(x)} \quad\text{and}\quad \lim_{x \to x_0} (x - x_0)^2 \frac{c(x)}{a(x)} \]
are both finite, then the differential equation can be written in quasi-Euler form
\[ (x - x_0)^2 \alpha(x) y'' + (x - x_0) \beta(x) y' + \gamma(x) y = 0 \]
where $\alpha$, $\beta$ and $\gamma$ are polynomials with $\alpha(x_0) \neq 0$.

PROOF: Because $a$, $b$ and $c$ are rational functions (i.e., quotients of polynomials), we can
multiply equation (32.28) through by the common denominators, obtaining a differential equation
in which all the coefficients are polynomials. Then factoring out the $(x - x_0)$-factors, and dividing
out the factors common to all terms, we obtain
\[ (x - x_0)^k A(x) y'' + (x - x_0)^m B(x) y' + (x - x_0)^n C(x) y = 0 \tag{32.29} \]
where $A$, $B$ and $C$ are polynomials with
\[ A(x_0) \neq 0 , \quad B(x_0) \neq 0 \quad\text{and}\quad C(x_0) \neq 0 , \]
and $k$, $m$ and $n$ are nonnegative integers, one of which must be zero. Moreover, since this is just a
rewrite of equation (32.28), and $x_0$ is, by assumption, a singular point for this differential equation,
we must have that $k \ge 1$. Hence $m = 0$ or $n = 0$.
Dividing equations (32.28) and (32.29) by their leading terms then yields, respectively,
\[ y'' + \frac{b(x)}{a(x)}\, y' + \frac{c(x)}{a(x)}\, y = 0 \]
and
\[ y'' + (x - x_0)^{m-k} \frac{B(x)}{A(x)}\, y' + (x - x_0)^{n-k} \frac{C(x)}{A(x)}\, y = 0 . \]
But these last two equations describe the same differential equation, and have the same first
coefficients. Consequently, the other coefficients must be the same, giving us
\[ \frac{b(x)}{a(x)} = (x - x_0)^{m-k} \frac{B(x)}{A(x)} \quad\text{and}\quad \frac{c(x)}{a(x)} = (x - x_0)^{n-k} \frac{C(x)}{A(x)} . \]
Thus,
\[ \lim_{x \to x_0} (x - x_0) \frac{b(x)}{a(x)} = \lim_{x \to x_0} (x - x_0)^{m+1-k} \frac{B(x)}{A(x)} = \lim_{x \to x_0} (x - x_0)^{m+1-k} \frac{B(x_0)}{A(x_0)} \]


and
\[ \lim_{x \to x_0} (x - x_0)^2 \frac{c(x)}{a(x)} = \lim_{x \to x_0} (x - x_0)^{n+2-k} \frac{C(x)}{A(x)} = \lim_{x \to x_0} (x - x_0)^{n+2-k} \frac{C(x_0)}{A(x_0)} . \]
By assumption, though, these limits are finite, and for the limits on the right to be finite, the exponents
must be zero or larger. This, combined with the fact that $k \ge 1$, means that $k$, $m$ and $n$ must satisfy
\[ m + 1 \ge k \ge 1 \quad\text{and}\quad n + 2 \ge k \ge 1 . \tag{32.30} \]
But remember, $m$, $n$ and $k$ are nonnegative integers with $m = 0$ or $n = 0$. That leaves us with
only three possibilities:
1. If $m = 0$, then the first inequality in set (32.30) reduces to
\[ 1 \ge k \ge 1 , \]
telling us that $k = 1$. Equation (32.29) then becomes
\[ (x - x_0)^1 A(x) y'' + B(x) y' + (x - x_0)^n C(x) y = 0 . \]
Multiplying through by $x - x_0$, this becomes
\[ (x - x_0)^2 \alpha(x) y'' + (x - x_0) \beta(x) y' + \gamma(x) y = 0 \]
where $\alpha$, $\beta$ and $\gamma$ are the polynomials
\[ \alpha(x) = A(x) , \quad \beta(x) = B(x) \quad\text{and}\quad \gamma(x) = (x - x_0)^{n+1} C(x) . \]
2. If $n = 0$, then the second inequality in set (32.30) reduces to
\[ 2 \ge k \ge 1 . \]
So $k = 1$ or $k = 2$, giving us two options:
(a) If $k = 1$, then equation (32.29) becomes
\[ (x - x_0)^1 A(x) y'' + (x - x_0)^m B(x) y' + C(x) y = 0 , \]
which can be rewritten as
\[ (x - x_0)^2 \alpha(x) y'' + (x - x_0) \beta(x) y' + \gamma(x) y = 0 \]
where $\alpha$, $\beta$ and $\gamma$ are the polynomials
\[ \alpha(x) = A(x) , \quad \beta(x) = (x - x_0)^m B(x) \quad\text{and}\quad \gamma(x) = (x - x_0) C(x) . \]
(b) If $k = 2$, then the other inequality in set (32.30) becomes
\[ m + 1 \ge 2 . \]
Hence $m \ge 1$. In this case, equation (32.29) becomes
\[ (x - x_0)^2 A(x) y'' + (x - x_0)^m B(x) y' + C(x) y = 0 \]
which is the same as
\[ (x - x_0)^2 \alpha(x) y'' + (x - x_0) \beta(x) y' + \gamma(x) y = 0 \]
where $\alpha$, $\beta$ and $\gamma$ are the polynomials
\[ \alpha(x) = A(x) , \quad \beta(x) = (x - x_0)^{m-1} B(x) \quad\text{and}\quad \gamma(x) = C(x) . \]


Additional Exercises 701

Note that, in each case, we can rewrite our differential equation as
\[ (x - x_0)^2 \alpha(x) y'' + (x - x_0) \beta(x) y' + \gamma(x) y = 0 \]
where $\alpha$, $\beta$ and $\gamma$ are polynomials with
\[ \alpha(x_0) = A(x_0) \neq 0 . \]
So, in each case, we have rewritten our differential equation in quasi-Euler form, as claimed in the
lemma.
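The bookkeeping in this proof amounts to counting factors of $(x - x_0)$, which suggests a simple mechanical test. The following sketch is ours and makes a simplifying assumption: the equation has already been multiplied through so that $a$, $b$ and $c$ are polynomials, given as coefficient lists with the highest power first. It finds the exponents $k$, $m$ and $n$ of equation (32.29) by repeated synthetic division and then applies inequalities (32.30).

```python
import math
from fractions import Fraction


def order_of_zero(coeffs, x0):
    """Number of factors (x - x0) in a polynomial (highest power first).
    Returns math.inf for the zero polynomial."""
    coeffs = [Fraction(c) for c in coeffs]
    x0 = Fraction(x0)
    if not any(coeffs):
        return math.inf
    order = 0
    while True:
        carry, partial = Fraction(0), []
        for c in coeffs:              # synthetic division by (x - x0)
            carry = carry * x0 + c
            partial.append(carry)
        if partial[-1] != 0:          # remainder = value of polynomial at x0
            return order
        coeffs = partial[:-1]         # quotient after removing one factor
        order += 1


def classify_point(a, b, c, x0):
    """Classify x0 for a(x) y'' + b(x) y' + c(x) y = 0, polynomial a, b, c."""
    if not any(a):
        raise ValueError("a(x) must not be identically zero")
    k, m, n = (order_of_zero(p, x0) for p in (a, b, c))
    shared = min(k, m, n)             # factors of (x - x0) common to all terms
    k, m, n = k - shared, m - shared, n - shared
    if k == 0:
        return "ordinary"
    if m + 1 >= k and n + 2 >= k:     # inequalities (32.30)
        return "regular singular"
    return "irregular singular"


# Bessel's equation of order 1/2: 4x^2 y'' + 4x y' + (4x^2 - 1) y = 0.
print(classify_point([4, 0, 0], [4, 0], [4, 0, -1], 0))
# Example 32.9: 2x y'' - 4 y' - y = 0.
print(classify_point([2, 0], [-4], [-1], 0))
```

A coefficient that is identically zero is given infinite order, which matches the fact that the corresponding limit is trivially finite.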

Testing for Regularity When the Coefﬁcients Are Not Rational

By using lemma 29.13 on page 580 on factoring analytic functions, or lemma 29.14 on page 581 on
factoring quotients of analytic functions, you can easily modify the arguments in the last proof to
obtain an analog of the above lemma appropriate to differential equations whose coefﬁcients are not
rational. Basically, just replace

“rational functions” with “quotients of functions analytic at x 0 ” ,

and replace
“polynomials” with “functions analytic at x 0 ” .
The resulting lemma, along with lemma 32.1, then yields the following analog to theorem 32.2.

Theorem 32.5 (testing for regular singular points (ver.2))

Assume $x_0$ is a singular point on the real line for the differential equation
\[ a(x)y'' + b(x)y' + c(x)y = 0 \]
where $a$, $b$ and $c$ are quotients of functions analytic at $x_0$. Then $x_0$ is a regular singular point for
this differential equation if and only if the two limits
\[ \lim_{x \to x_0} (x - x_0) \frac{b(x)}{a(x)} \quad\text{and}\quad \lim_{x \to x_0} (x - x_0)^2 \frac{c(x)}{a(x)} \]
are both finite.

Additional Exercises

32.2. Find a fundamental set of solutions $\{y_1, y_2\}$ for each of the following shifted Euler
equations, and sketch the graphs of $y_1$ and $y_2$ near the singular point.
a. $(x - 3)^2 y'' - 2(x - 3) y' + 2y = 0$
b. $2x^2 y'' + 5x y' + 1y = 0$
c. $(x - 1)^2 y'' - 5(x - 1) y' + 9y = 0$


d. $(x + 2)^2 y'' + (x + 2) y' = 0$
e. $3(x - 5)^2 y'' - 4(x - 5) y' + 2y = 0$
f. $(x - 5)^2 y'' + (x - 5) y' + 4y = 0$

32.3. Identify all of the singular points for each of the following differential equations, and
determine which of those on the real line are regular singular points, and which are irregular
singular points. Also, ﬁnd the Frobenius radius of convergence R for the given differential
equation about the given x 0 .
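a. $x^2 y'' + \dfrac{x}{x - 2}\, y' + \dfrac{2}{x + 2}\, y = 0$ , $x_0 = 0$
b. $x^3 y'' + x^2 y' + y = 0$ , $x_0 = 2$
c. $\left(x^3 - x^4\right) y'' + (3x - 1) y' + 827y = 0$ , $x_0 = 1$
d. $y'' + \dfrac{1}{x - 3}\, y' + \dfrac{1}{x - 4}\, y = 0$ , $x_0 = 3$
e. $y'' + \dfrac{1}{(x - 3)^2}\, y' + \dfrac{1}{(x - 4)^2}\, y = 0$ , $x_0 = 4$
f. $y'' + \left(\dfrac{1}{x} - \dfrac{1}{3}\right) y' + \left(\dfrac{1}{x} - \dfrac{1}{4}\right) y = 0$ , $x_0 = 0$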
2
1 − x2
g. 4x 2 − 1 y + 4 − y + y = 0 , x0 = 0
x 2 1+x
2
h. 4 + x 2 y + y = 0 , x0 = 0

32.4. Use the basic method of Frobenius to ﬁnd modiﬁed power series solutions about x 0 = 0
for each of the following differential equations. In particular:

i Find, identify and solve the corresponding indicial equation for the equation’s ex-
ponents r1 and r2 .

ii Find the recursion formula corresponding to each exponent.

iii Find and explicitly write out at least the ﬁrst four nonzero terms of all series solutions
about x 0 = 0 that can be found by the basic Frobenius method (if a series terminates,
ﬁnd all the nonzero terms).

iv Try to ﬁnd a general formula for all the coefﬁcients in each series.

v Finally, either state the general solution or, when a second particular solution cannot
be found by the basic method, give a reason that second solution cannot be found.

(Note: x 0 = 0 is a regular singular point for each equation. You need not verify it.)

a. $x^2 y'' - 2x y' + \left(x^2 + 2\right) y = 0$
b. $4x^2 y'' + (1 - 4x) y = 0$
c. $x^2 y'' + x y' + (4x - 4) y = 0$
d. $\left(x^2 - 9x^4\right) y'' - 6x y' + 10y = 0$


a. Using the fact that r1 > r2 , show that y1 and y2 cannot be constant multiples of each
other, and, from this, conclude that {y1 , y2 } must be a fundamental set of solutions to
that one differential equation.
b. What does this say about solution y3 ?
c. Show that, in fact, y3 = y1 and that r1 = r2 + M .
d. Why does the above verify the claim in step 8 on page 685 that we can set a M = 0 ?


33
The Big Theorem on the Frobenius
Method, with Applications

At this point, you may have a number of questions, including:

1. What do we do when the basic method does not yield the necessary linearly independent pair
of solutions?
2. Are there any shortcuts?
To properly answer these questions requires a good bit of analysis — some straightforward and some,
perhaps, not so straightforward. We will do that in the next chapter. Here, instead, we will present
a few theorems summarizing the results of that analysis, and we will see how those results can,
in turn, be applied to solve and otherwise gain useful information about solutions to some notable
differential equations.
By the way, in the following, it does not matter whether we are restricting ourselves to differential
equations with rational coefﬁcients or are considering the more general case. The discussion holds
for either.

33.1 The Big Theorems

The Theorems
The ﬁrst theorem simply restates the deﬁnition of a “regular singular point”, along with some results
discussed earlier in section 32.5.

Theorem 33.1 (the indicial equation and corresponding exponents)

Let $x_0$ be a point on the real line. Then $x_0$ is a regular singular point for a given second-order,
linear homogeneous differential equation if and only if that differential equation can be written as
\[ (x - x_0)^2 \alpha(x) y'' + (x - x_0) \beta(x) y' + \gamma(x) y = 0 \]
where $\alpha$, $\beta$ and $\gamma$ are all analytic at $x_0$ with $\alpha(x_0) \neq 0$. Moreover:
1. The indicial equation arising in the method of Frobenius to solve this differential equation is
\[ \alpha_0 r(r - 1) + \beta_0 r + \gamma_0 = 0 \]
where
\[ \alpha_0 = \alpha(x_0) , \quad \beta_0 = \beta(x_0) \quad\text{and}\quad \gamma_0 = \gamma(x_0) . \]


2. The indicial equation has exactly two solutions r1 and r2 (possibly identical). And, if
α(x 0 ) , β(x 0 ) and γ (x 0 ) are all real valued, then r1 and r2 are either both real valued or
are complex conjugates of each other.

The next theorem is “the big theorem” of the Frobenius method. It describes generic formulas
for solutions about regular singular points and gives the intervals over which these formulas are valid.
Proving it will be the major goal in the next chapter.

Theorem 33.2 (general solutions about regular singular points)

Assume $x_0$ is a regular singular point on the real line for some given second-order homogeneous
linear differential equation with real coefficients. Let $R$ be the corresponding Frobenius radius of
convergence, and let $r_1$ and $r_2$ be the two solutions to the corresponding indicial equation, with
$r_1 \ge r_2$ if they are real. Then, on the intervals $(x_0, x_0 + R)$ and $(x_0 - R, x_0)$, general solutions to
the differential equation are given by
\[ y(x) = c_1 y_1(x) + c_2 y_2(x) \]
where $c_1$ and $c_2$ are arbitrary constants, and $y_1$ and $y_2$ are solutions that can be written as
follows:¹
1. In general,
\[ y_1(x) = |x - x_0|^{r_1} \sum_{k=0}^{\infty} a_k (x - x_0)^k \quad\text{with}\quad a_0 = 1 . \tag{33.1} \]
2. If $r_1 - r_2$ is not an integer, then
\[ y_2(x) = |x - x_0|^{r_2} \sum_{k=0}^{\infty} b_k (x - x_0)^k \quad\text{with}\quad b_0 = 1 . \tag{33.2} \]
3. If $r_1 - r_2 = 0$ (i.e., $r_1 = r_2$), then
\[ y_2(x) = y_1(x) \ln|x - x_0| + |x - x_0|^{1 + r_1} \sum_{k=0}^{\infty} b_k (x - x_0)^k . \tag{33.3} \]
4. If $r_1 - r_2 = K$ for some positive integer $K$, then
\[ y_2(x) = \mu y_1(x) \ln|x - x_0| + |x - x_0|^{r_2} \sum_{k=0}^{\infty} b_k (x - x_0)^k \tag{33.4} \]
where $b_0 = 1$, $b_K$ is arbitrary and $\mu$ is some (nonarbitrary) constant (possibly zero).
Moreover,
\[ y_2(x) = y_{2,0}(x) + b_K y_1(x) \]
where $y_{2,0}(x)$ is given by formula (33.4) with $b_K = 0$.

¹ In this theorem, we are assigning convenient values to the coefficients, such as $a_0$ and $b_0$, that could, in fact, be considered
as arbitrary nonzero constants. Any coefficient not explicitly mentioned is not arbitrary.
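Deciding which of formulas (33.2), (33.3) and (33.4) applies depends only on the difference $r_1 - r_2$, so the choice is easy to automate. The sketch below is ours: the function name, the returned labels and the tolerance used to recognize integers in floating point are all illustrative choices, and it assumes the exponents are already labeled so that $r_1 \ge r_2$ when they are real.

```python
def second_solution_formula(r1, r2, tol=1e-12):
    """Return the number of the formula for y2 in theorem 33.2.

    r1 and r2 may be real or complex; for real exponents r1 >= r2 is assumed.
    """
    d = complex(r1) - complex(r2)
    if abs(d) < tol:
        return "33.3"   # equal exponents: the logarithm is always present
    is_integer = abs(d.imag) < tol and abs(d.real - round(d.real)) < tol
    if is_integer:
        if d.real < 0:
            raise ValueError("expected r1 >= r2 for real exponents")
        return "33.4"   # positive integer difference: log term if mu != 0
    return "33.2"       # non-integer difference: two modified power series


print(second_solution_formula(0.5, -0.5))   # Bessel's equation of order 1/2
print(second_solution_formula(1j, -1j))     # complex conjugate exponents
```

Note that the classification cannot tell whether $\mu$ is zero in formula (33.4); for Bessel's equation of order 1/2 it is, which is why both solutions found earlier were free of logarithms.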


Alternate Formulas
Solutions Corresponding to Integral Exponents
Remember that in developing the basic method of Frobenius, the first solution we obtained was
actually of the form
\[ y_1(x) = (x - x_0)^{r_1} \sum_{k=0}^{\infty} a_k (x - x_0)^k \]
and not
\[ y_1(x) = |x - x_0|^{r_1} \sum_{k=0}^{\infty} a_k (x - x_0)^k . \]
As noted on page 688, either solution is valid on both $(x_0, x_0 + R)$ and on $(x_0 - R, x_0)$. It's just
that the second formula yields a real-valued solution even when $x < x_0$ and $r_1$ is a real number
other than an integer.
Still, if $r_1$ is an integer (be it 8, 0, −2 or any other integer), then replacing
\[ (x - x_0)^{r_1} \quad\text{with}\quad |x - x_0|^{r_1} \]
is completely unnecessary, and is usually undesirable. This is especially true if $r_1 = n$ for some
nonnegative integer $n$, because then
\[ y_1(x) = (x - x_0)^n \sum_{k=0}^{\infty} a_k (x - x_0)^k = \sum_{k=0}^{\infty} a_k (x - x_0)^{k+n} \]
is actually a power series about $x_0$, and the proof of theorem 33.2 will show that this is a solution
on the entire interval $(x_0 - R, x_0 + R)$. It might also be noted that, in this case, $y_1$ is analytic at
$x_0$, a fact that might be useful in some applications.
Similar comments hold if the other exponent, r2 , is an integer. So let us make ofﬁcial:

Corollary 33.3 (solutions corresponding to integral exponents)

Let $r_1$ and $r_2$ be as in theorem 33.2.
1. If $r_1$ is an integer, then the solution given by formula (33.1) can be replaced by the solution
given by
\[ y_1(x) = (x - x_0)^{r_1} \sum_{k=0}^{\infty} a_k (x - x_0)^k \quad\text{with}\quad a_0 = 1 . \tag{33.5} \]
2. If $r_2$ is an integer while $r_1$ is not, then the solution given by formula (33.2) can be replaced
by the solution given by
\[ y_2(x) = (x - x_0)^{r_2} \sum_{k=0}^{\infty} b_k (x - x_0)^k \quad\text{with}\quad b_0 = 1 . \tag{33.6} \]
3. If $r_1 = r_2$ and are integers, then the solution given by formula (33.3) can be replaced by the
solution given by
\[ y_2(x) = y_1(x) \ln|x - x_0| + (x - x_0)^{1 + r_1} \sum_{k=0}^{\infty} b_k (x - x_0)^k . \tag{33.7} \]
4. If $r_1$ and $r_2$ are two different integers, then the solution given by formula (33.4) can be
replaced by the solution given by
\[ y_2(x) = \mu y_1(x) \ln|x - x_0| + (x - x_0)^{r_2} \sum_{k=0}^{\infty} b_k (x - x_0)^k . \tag{33.8} \]


when r1 − r2 = K for some positive integer K . In both cases, c0 = b0 .

Theorem 33.2 and the Method of Frobenius

Remember that, using the basic method of Frobenius, we can find every solution of the form
\[ y(x) = |x - x_0|^r \sum_{k=0}^{\infty} a_k (x - x_0)^k \quad\text{with}\quad a_0 \neq 0 , \]
provided, of course, such solutions exist. Fortunately for us, statement 1 in the above theorem assures
us that such solutions will exist corresponding to $r_1$. This means our basic method of Frobenius will
successfully lead to a valid first solution (at least when the coefficients of the differential equation
are rational). Whether there is a second solution $y_2$ of this form depends:

1. If $r_1 - r_2$ is not an integer, then statement 2 of the theorem states that there is such a second
solution. Consequently, step 8 of the method will successfully lead to the desired result.

2. If $r_1 - r_2$ is a positive integer, then there might be such a second solution, depending on
whether or not $\mu = 0$ in formula (33.4). If $\mu = 0$, then formula (33.4) for $y_2$ reduces to the
sort of modified power series we are seeking, and step 8 in the Frobenius method will give us
this solution. What's more, as indicated in statement 4 of the above theorem, in carrying out
step 8, we will also rederive the solution already obtained in step 7 (unless we set $b_K = 0$)!
On the other hand, if $\mu \neq 0$, then it follows from the above theorem that no such second
solution exists. As a result, all the work carried out in step 8 of the basic Frobenius method
will lead only to a disappointing end, namely, that the terms “blow up” (as discussed in the
subsection Problems Possibly Arising in Step 8 starting on page 693).²
3. Of course, if $r_2 = r_1$, then the basic method of Frobenius cannot yield a second solution
different from the first. But our theorem does assure us that we can use formula (33.3) for
$y_2$ (once we figure out the values of the $b_k$'s).
So what can we do if the basic method of Frobenius does not lead to the second solution y2 (x) ?
At this point, we have two choices, both using the formula for y1 (x) already found by the basic
Frobenius method:

1. Use the reduction of order method.

2 We will ﬁnd a formula for μ in the next chapter. Unfortunately, it is not a simple formula and is not of much help in
preventing us from attempting step 8 of the basic Frobenius method when that step does not lead to y2 .


33.3 Local Behavior of Solutions: Limits at Regular Singular Points
As just noted, a major concern in many applications is the behavior of a solution $y(x)$ as $x$
approaches a regular singular point $x_0$. In particular, it may be important to know whether
\[ \lim_{x \to x_0} y(x) \]
is zero, some finite nonzero value, infinite, or completely undefined.

We discussed this issue at the beginning of chapter 30 for the shifted Euler equation
\[ (x - x_0)^2 \alpha_0 y'' + (x - x_0) \beta_0 y' + \gamma_0 y = 0 . \tag{33.9} \]
Let us try using what we know about those solutions after comparing them with the solutions given
in our big theorem for
\[ (x - x_0)^2 \alpha(x) y'' + (x - x_0) \beta(x) y' + \gamma(x) y = 0 \tag{33.10} \]
assuming
\[ \alpha_0 = \alpha(x_0) , \quad \beta_0 = \beta(x_0) \quad\text{and}\quad \gamma_0 = \gamma(x_0) . \]
Remember, these two differential equations have the same indicial equation
\[ \alpha_0 r(r - 1) + \beta_0 r + \gamma_0 = 0 . \]
Naturally, we'll further assume $\alpha$, $\beta$ and $\gamma$ are analytic at $x_0$ with $\alpha(x_0) \neq 0$ so we can
use our theorems. Also, in the following, we will let $r_1$ and $r_2$ be the two solutions to the indicial
equation, with $r_1 \ge r_2$ if they are real.

Preliminary Approximations
Observe that each of the solutions described in theorem 33.2 involves one or more power series

\[ \sum_{k=0}^{\infty} c_k (x - x_0)^k = c_0 + c_1 (x - x_0) + c_2 (x - x_0)^2 + c_3 (x - x_0)^3 + \cdots \]

where $c_0 \neq 0$. As we’ve noted in earlier chapters,

\[ \lim_{x \to x_0} \sum_{k=0}^{\infty} c_k (x - x_0)^k = \lim_{x \to x_0} \left[ c_0 + c_1 (x - x_0) + c_2 (x - x_0)^2 + \cdots \right] = c_0 . \]

This means

\[ \sum_{k=0}^{\infty} c_k (x - x_0)^k \approx c_0 \quad\text{when}\quad x \approx x_0 . \]

Solutions Corresponding to r1
Now, recall that the solutions corresponding to $r_1$ for the shifted Euler equation are constant multiples
of

\[ y_{\mathrm{Euler},1}(x) = |x - x_0|^{r_1} , \]

while the corresponding solutions to equation (33.10) are constant multiples of something of the
form

\[ y_1(x) = |x - x_0|^{r_1} \sum_{k=0}^{\infty} a_k (x - x_0)^k \quad\text{with}\quad a_0 = 1 . \]

Using the approximation just noted for the power series, we immediately have

\[ y_1(x) \approx |x - x_0|^{r_1} a_0 = |x - x_0|^{r_1} = y_{\mathrm{Euler},1}(x) \quad\text{when}\quad x \approx x_0 , \]

which is exactly what we suspected back at the beginning of section 32.3. In particular, if $r_1$ is real,
then

\[ \lim_{x \to x_0} |y_1(x)| = \lim_{x \to x_0} |x - x_0|^{r_1} =
   \begin{cases} 0 & \text{if } r_1 > 0 \\ 1 & \text{if } r_1 = 0 \\ +\infty & \text{if } r_1 < 0 \end{cases} . \]

(For the case where $r_1$ is not real, see exercise 33.3.)

Solutions Corresponding to r2 when r2 ≠ r1

In this case, all the corresponding solutions to the shifted Euler equation are given by constant
multiples of

\[ y_{\mathrm{Euler},2}(x) = |x - x_0|^{r_2} . \]

If $r_1$ and $r_2$ do not differ by an integer, then the corresponding solutions to equation (33.10) are
constant multiples of something of the form

\[ y_2(x) = |x - x_0|^{r_2} \sum_{k=0}^{\infty} a_k (x - x_0)^k \quad\text{with}\quad a_0 = 1 , \]

and the same arguments given above with $r = r_1$ apply and confirm that

\[ y_2(x) \approx |x - x_0|^{r_2} a_0 = |x - x_0|^{r_2} = y_{\mathrm{Euler},2}(x) \quad\text{when}\quad x \approx x_0 . \]

On the other hand, if $r_1 - r_2 = K$ for some positive integer $K$, then the corresponding solutions
to (33.10) are constant multiples of something of the form

\[ y_2(x) = \mu y_1(x) \ln|x - x_0| + |x - x_0|^{r_2} \sum_{k=0}^{\infty} b_k (x - x_0)^k \quad\text{with}\quad b_0 = 1 . \]

Using approximations already discussed, we have, when $x \approx x_0$,

\begin{align*}
y_2(x) &\approx \mu |x - x_0|^{r_1} \ln|x - x_0| + |x - x_0|^{r_2} b_0 \\
       &= \mu |x - x_0|^{r_2 + K} \ln|x - x_0| + |x - x_0|^{r_2} \\
       &= |x - x_0|^{r_2} \left[ \mu |x - x_0|^{K} \ln|x - x_0| + 1 \right] .
\end{align*}

But $K$ is a positive integer, and you can easily show, via L’Hôpital’s rule, that

\[ \lim_{x \to x_0} (x - x_0)^K \ln|x - x_0| = 0 . \]

Thus, when $x \approx x_0$,

\[ y_2(x) \approx |x - x_0|^{r_2} \left[ \mu \cdot 0 + 1 \right] = |x - x_0|^{r_2} , \]

confirming that, whenever $r_2 \neq r_1$,

\[ y_2(x) \approx y_{\mathrm{Euler},2}(x) \quad\text{when}\quad x \approx x_0 . \]

In particular, if $r_2$ is real, then

\[ \lim_{x \to x_0} |y_2(x)| = \lim_{x \to x_0} |x - x_0|^{r_2} =
   \begin{cases} 0 & \text{if } r_2 > 0 \\ 1 & \text{if } r_2 = 0 \\ +\infty & \text{if } r_2 < 0 \end{cases} . \]
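If you would like some numerical reassurance that $(x - x_0)^K \ln|x - x_0|$ really does shrink to zero, a few evaluations will do. The following Python sketch is not part of the original development; the function name `log_term`, the variable `t` (standing for $x - x_0$) and the sample points are illustrative choices of ours.

```python
import math

def log_term(t, K):
    """Return t**K * ln|t|, where t plays the role of x - x_0.

    The text claims this tends to 0 as t -> 0 for every positive integer K.
    """
    return t**K * math.log(abs(t))

# Tabulate the term for K = 1, 2, 3 at points approaching 0 from the right.
for K in (1, 2, 3):
    samples = [log_term(10.0 ** (-p), K) for p in (1, 3, 6, 9)]
    print(K, samples)
```

The logarithm grows in magnitude as $t \to 0$, but only slowly, so any positive power of $t$ eventually wins. A table like this is evidence, not a proof; the proof is the L’Hôpital computation.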

Solutions Corresponding to r2 when r2 = r1

In this case, all the second solutions to the shifted Euler equation are constant multiples of

\[ y_{\mathrm{Euler},2}(x) = |x - x_0|^{r_2} \ln|x - x_0| , \]

and the corresponding solutions to our original differential equation are constant multiples of

\[ y_2(x) = y_1(x) \ln|x - x_0| + |x - x_0|^{1 + r_1} \sum_{k=0}^{\infty} b_k (x - x_0)^k . \]

If $x \approx x_0$, then

\begin{align*}
y_2(x) &\approx |x - x_0|^{r_1} \ln|x - x_0| + |x - x_0|^{1 + r_1} b_0 \\
       &= |x - x_0|^{r_1} \ln|x - x_0| \left[ 1 + \frac{|x - x_0|}{\ln|x - x_0|} \, b_0 \right] .
\end{align*}

But

\[ \lim_{x \to x_0} \frac{|x - x_0|}{\ln|x - x_0|} = 0 . \]

Consequently, when $x \approx x_0$,

\[ y_2(x) \approx |x - x_0|^{r_1} \ln|x - x_0| \left[ 1 + 0 \cdot b_0 \right] = |x - x_0|^{r_1} \ln|x - x_0| , \]

confirming that, again, we have

\[ y_2(x) \approx y_{\mathrm{Euler},2}(x) \quad\text{when}\quad x \approx x_0 . \]

Taking the limit (possibly using L’Hôpital’s rule), you can then easily show that

\[ \lim_{x \to x_0} |y_2(x)| = \lim_{x \to x_0} |a_0| \, |x - x_0|^{r_1} \, \bigl| \ln|x - x_0| \bigr| =
   \begin{cases} 0 & \text{if } r_1 > 0 \\ +\infty & \text{if } r_1 \leq 0 \end{cases} . \]

!Example 33.1 (Bessel’s equation of order 1): Suppose we are only interested in those solutions
on $(0, \infty)$ to Bessel’s equation of order 1,

\[ x^2 y'' + x y' + \left( x^2 - 1 \right) y = 0 , \]

that do not “blow up” as $x \to 0$.

First of all, observe that this equation is already in the form

\[ (x - x_0)^2 \alpha(x) y'' + (x - x_0) \beta(x) y' + \gamma(x) y = 0 \]

with $x_0 = 0$, $\alpha(x) = 1$, $\beta(x) = 1$ and $\gamma(x) = x^2 - 1$. So $x_0 = 0$ is a regular singular point
for this differential equation, and the indicial equation,

\[ \alpha_0 r(r - 1) + \beta_0 r + \gamma_0 = 0 , \]

is

\[ 1 \cdot r(r - 1) + 1 \cdot r - 1 = 0 . \]

This simplifies to

\[ r^2 - 1 = 0 , \]

and has solutions $r = \pm 1$. Thus, the exponents for our differential equation are

\[ r_1 = 1 \quad\text{and}\quad r_2 = -1 . \]

By the analysis given above, we know that all solutions corresponding to $r_1$ are constant
multiples of one particular solution $y_1$ satisfying

\[ \lim_{x \to x_0} |y_1(x)| = \lim_{x \to x_0} |x - x_0|^{1} = 0 . \]

So none of the solutions corresponding to $r = 1$ blow up as $x \to 0$.

On the other hand, the analysis given above for solutions corresponding to $r_2$ when $r_2 \neq r_1$
tells us that all the (nontrivial) second solutions are (nonzero) constant multiples of one particular
solution $y_2$ satisfying

\[ \lim_{x \to x_0} |y_2(x)| = \lim_{x \to x_0} |x - x_0|^{-1} = \infty . \]

So these do “blow up” as $x \to 0$.

Thus, since we are only interested in the solutions that do not blow up as x → 0 , we need
only concern ourselves with those solutions corresponding to r1 = 1 . Those corresponding to
r2 = −1 are not relevant, and we need not spend time and effort ﬁnding their formulas.
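The exponent computation in this example is just the solution of a quadratic, so it is easy to mechanize. The sketch below is ours, not the author’s: the helper name `indicial_roots` is hypothetical, and it simply applies the quadratic formula to $\alpha_0 r(r-1) + \beta_0 r + \gamma_0 = 0$, assuming $\alpha_0 \neq 0$.

```python
import cmath

def indicial_roots(alpha0, beta0, gamma0):
    """Solve alpha0*r*(r - 1) + beta0*r + gamma0 = 0 for r.

    Assumes alpha0 != 0 (as required at a regular singular point).
    Returns the two roots as complex numbers, "+" root first.
    """
    # Expand to a*r**2 + b*r + c = 0.
    a, b, c = alpha0, beta0 - alpha0, gamma0
    disc = cmath.sqrt(b * b - 4 * a * c)
    return (-b + disc) / (2 * a), (-b - disc) / (2 * a)

# Bessel's equation of order 1 at x_0 = 0:
# alpha(0) = 1, beta(0) = 1, gamma(0) = 0**2 - 1 = -1.
r1, r2 = indicial_roots(1, 1, -1)
print(r1, r2)
```

Using `cmath` rather than `math` means the same helper also returns the complex exponents $\lambda \pm i\omega$ that arise in exercise 33.3, at the cost of real roots being reported with a zero imaginary part.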

Derivatives
To analyze the behavior of the derivative $y'(x)$ as $x \to x_0$, you simply differentiate the modified
power series for $y(x)$, and then apply the ideas described above. Doing this is left for you (exercise
33.4 at the end of the chapter).

33.4 Local Behavior: Analyticity and Singularities in Solutions
It is certainly easier to analyze how $y(x)$ or any of its derivatives behave as $x \to x_0$ when $y(x)$ is
given by a basic power series about $x_0$

\[ y(x) = \sum_{k=0}^{\infty} c_k (x - x_0)^k . \]

Then, in fact, $y$ is infinitely differentiable at $x_0$, and we know

\[ \lim_{x \to x_0} y(x) = y(x_0) = c_0 \]

and

\[ \lim_{x \to x_0} y^{(k)}(x) = y^{(k)}(x_0) = k! \, c_k \quad\text{for}\quad k = 1, 2, 3, \ldots . \]

On the other hand, if $y(x)$ is given by one of those modified power series described in the big
theorem, then, as we partly saw in the last section, computing the limits of $y(x)$ and its derivatives
as $x \to x_0$ is much more of a challenge. Indeed, unless that series is of the form

\[ (x - x_0)^r \sum_{k=0}^{\infty} \alpha_k (x - x_0)^k \]

with $r$ being zero or some positive integer, $\lim_{x \to x_0} y(x)$ may blow up or otherwise fail to
exist. And even if that limit does exist, you can easily show that the corresponding limit of the $n$th
derivative, $y^{(n)}(x)$, will either blow up or fail to exist if $n > r$.
To simplify our discussion, let’s slightly expand our “ordinary/singular point” terminology so
that it applies to any function y . Basically, we want to refer to a point x 0 as an ordinary point for
a function y if y is analytic at x 0 , and we want to refer to x 0 as a singular point for y if y is not
analytic at x 0 .
There are, however, small technical issues that must be taken into account. Typically, our
function $y$ is defined on some interval $(\alpha, \beta)$, possibly by some power series or modified power
series. So, initially at least, we must confine our definitions of ordinary and singular points for $y$ to
points in or at the ends of this interval. To be precise, for any such point $x_0$, we will say that $x_0$
is an ordinary point for $y$ if and only if there is a power series $\sum_{k=0}^{\infty} a_k (x - x_0)^k$ with a nonzero
radius of convergence $R$ such that

\[ y(x) = \sum_{k=0}^{\infty} a_k (x - x_0)^k \]

for all $x$ in $(\alpha, \beta)$ with $|x - x_0| < R$. If no such power series exists, then we’ll say $x_0$ is a singular
point for $y$.3
A rather general (and moderately advanced) treatment of ‘singular points’ for functions and
solutions to differential equations was given in section 31.9. One corollary of a result derived there
(lemma 31.28 on page 662) is the following unsurprising lemma:

Lemma 33.4
Let y be a solution on an interval (α, β) to some second-order homogeneous differential equation,
and let x 0 be either in (α, β) or be one of the endpoints. Then

x 0 is an ordinary point for the differential equation ⇒ x 0 is an ordinary point for y .

Equivalently,

x 0 is a singular point for y ⇒ x 0 is a singular point for the differential equation .

3 If $x_0$ is actually in $(\alpha, \beta)$, not an endpoint of $(\alpha, \beta)$, then

\[ x_0 \text{ is an ordinary point for } y \iff y \text{ is analytic at } x_0 \iff x_0 \text{ is a point of analyticity for } y . \]

This lemma does not say that $y$ must have a singularity at each singular point of the differential
equation. After all, if $x_0$ is a regular singular point, and the first solution $r_1$ of the indicial equation
is a nonnegative integer, say, $r_1 = 3$, then our big theorem assures us of a solution $y_1$ given by

\[ y_1(x) = (x - x_0)^3 \sum_{k=0}^{\infty} a_k (x - x_0)^k , \]

which is a true power series since

\[ (x - x_0)^3 \sum_{k=0}^{\infty} a_k (x - x_0)^k = \sum_{k=0}^{\infty} a_k (x - x_0)^{k+3} = \sum_{n=0}^{\infty} c_n (x - x_0)^n \]

where

\[ c_n = \begin{cases} 0 & \text{if } n < 3 \\ a_{n-3} & \text{if } n \geq 3 \end{cases} . \]
Still, there are cases where one can be sure that a solution $y$ has at least one singular point. In
particular, consider the situation in which $y$ is given by a power series

\[ y(x) = \sum_{k=0}^{\infty} c_k (x - x_0)^k \quad\text{for}\quad |x - x_0| < R \]

where $R$ is the radius of convergence and is finite. Replacing $x$ with the complex variable $z$,

\[ y(z) = \sum_{k=0}^{\infty} c_k (z - x_0)^k \]

yields a power series with radius of convergence $R$ for a complex-variable function $y(z)$ analytic
at least on the disk of radius $R$ about $x_0$. Now, it can be shown that there must then be a point $z_s$
on the edge of this disk at which $y(z)$ is not “well behaved”. Unsurprisingly, it can also be shown
that this $z_s$ is a singular point for the differential equation. And if all the singular points of this
differential equation happen to lie on the real line, then this singular point $z_s$ must be one of the two
points on the real line satisfying $|z_s - x_0| = R$, namely,

\[ z_s = x_0 - R \quad\text{or}\quad z_s = x_0 + R . \]
That gives us the following theorem (which will be useful in the following section and some exercises).

Theorem 33.5
Let $x_0$ be a point on the real line, and assume

\[ y(x) = \sum_{k=0}^{\infty} c_k (x - x_0)^k \quad\text{for}\quad |x - x_0| < R \]

is a power series solution to some first- or second-order homogeneous linear differential equation.
Suppose, further, that both of the following hold:
1. $R$ is the radius of convergence for the above power series and is finite.
2. All the singular points for the differential equation are on the real axis.
Then at least one of the two points $x_0 - R$ or $x_0 + R$ is a singular point for $y$.

Admittedly, our derivation of the above theorem was rather sketchy and involved claims that
“can be shown”. If that isn’t good enough for you, then turn back to section 31.9 for a more satisfying
development of a more general version of the above theorem (theorem 31.29 on page 662). You will
even ﬁnd pictures there.

33.5 Case Study: The Legendre Equations

As an application of what we have just developed, let us analyze the behavior of the solutions to the
set of Legendre equations. These equations often arise in physical problems in which some spherical
symmetry can be expected, and in many applications, only those solutions that are “bounded” on
(−1, 1) are of interest. Our analysis here will allow us to readily ﬁnd those solutions, and will save
us from a lot of needless work dealing with those solutions that, ultimately, will not be of interest.
Let me remind you that a Legendre equation is any differential equation that can be written as

\[ (1 - x^2) y'' - 2x y' + \lambda y = 0 \tag{33.11} \]

where λ , the equation’s parameter, is some real constant. You may recall these equations from
exercise 30.10 on page 629, and we will begin our analysis here by recalling what we learned in that
rather lengthy exercise.

What We Already Know

In that exercise 30.10 on page 629, you discovered the following:

1. The only singular points for each Legendre equation are x = −1 and x = 1 .

2. For each $\lambda$, a general solution on $(-1, 1)$ of Legendre equation (33.11) is

\[ y_\lambda(x) = a_0 \, y_{\lambda,E}(x) + a_1 \, y_{\lambda,O}(x) \]

where $y_{\lambda,E}(x)$ is a power series about 0 having just even-powered terms and whose first
term is $1$, and $y_{\lambda,O}(x)$ is a power series about 0 having just odd-powered terms and whose
first term is $x$.

3. If $\lambda = m(m + 1)$ for some nonnegative integer $m$, and $p_m$ is defined by

\[ p_m(x) = \begin{cases} y_{\lambda,E}(x) & \text{if } m \text{ is even} \\ y_{\lambda,O}(x) & \text{if } m \text{ is odd} \end{cases} , \]

then this $p_m(x)$ is a polynomial of degree $m$. In particular,

\begin{align*}
p_0(x) &= y_{0,E}(x) = 1 , \\
p_1(x) &= y_{2,O}(x) = x , \\
p_2(x) &= y_{6,E}(x) = 1 - 3x^2 , \\
p_3(x) &= y_{12,O}(x) = x - \frac{5}{3} x^3 , \\
p_4(x) &= y_{20,E}(x) = 1 - 10x^2 + \frac{35}{3} x^4
\end{align*}

and

\[ p_5(x) = y_{30,O}(x) = x - \frac{14}{3} x^3 + \frac{21}{5} x^5 . \]

Moreover, any other polynomial solution to Legendre’s equation with parameter $\lambda$ is a constant
multiple of $p_m(x)$.

4. The Legendre equation with parameter $\lambda$ has no polynomial solution on $(-1, 1)$ if
$\lambda \neq m(m + 1)$ for every nonnegative integer $m$.

5. If $y$ is a nonpolynomial solution to a Legendre equation on $(-1, 1)$, then it is given by a
power series about $x_0 = 0$ with a radius of convergence of exactly $1$.
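The polynomials listed in item 3 can be regenerated mechanically. Substituting $y = \sum c_k x^k$ into the Legendre equation gives the two-step recursion $c_{k+2} = \frac{k(k+1) - \lambda}{(k+2)(k+1)} c_k$; this recursion is not restated in this section (it belongs to exercise 30.10), so treat it here as our own derivation. The Python sketch below, with the hypothetical name `legendre_p`, uses exact rational arithmetic to build the coefficient list of $p_m$.

```python
from fractions import Fraction

def legendre_p(m):
    """Coefficients [c_0, c_1, ..., c_m] of p_m, where lambda = m(m + 1).

    Uses c_{k+2} = (k(k+1) - lambda) / ((k+2)(k+1)) * c_k, starting from
    c_0 = 1 when m is even or c_1 = 1 when m is odd.
    """
    lam = m * (m + 1)
    coeffs = [Fraction(0)] * (m + 1)
    start = m % 2                      # 0 for the even series, 1 for the odd one
    coeffs[start] = Fraction(1)
    for k in range(start, m - 1, 2):
        coeffs[k + 2] = Fraction(k * (k + 1) - lam, (k + 2) * (k + 1)) * coeffs[k]
    return coeffs

for m in range(6):
    print(m, [str(c) for c in legendre_p(m)])
```

The loop stops at degree $m$ because the factor $k(k+1) - \lambda$ vanishes when $k = m$, which is exactly why the series terminates in a polynomial when $\lambda = m(m+1)$.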

Let us now see what more we can determine about the solutions to the Legendre equations on
(−1, 1) using the material developed in the last few sections. For convenience, we will record the
noteworthy results as we derive them in a series of lemmas which, ultimately, will be summarized
in a major theorem on Legendre equations, theorem 33.11.

The Singular Points of the Solutions

First of all, we should note that, because polynomials are analytic everywhere, the polynomial
solutions to Legendre’s equation have no singular points.
Now, suppose $y$ is a solution to a Legendre equation but is not a polynomial. As noted above,
$y(x)$ can be given by a power series about $x_0 = 0$ with radius of convergence $1$. This, and the fact
that the only singular points for Legendre’s equation are the points $x = -1$ and $x = 1$ on the real
line, means that theorem 33.5 applies and immediately gives us the following:

Lemma 33.6
Let y be a nonpolynomial solution to a Legendre equation on (−1, 1) . Then y must have a
singularity at either x = −1 or at x = 1 (or both).

Solution Limits at x = 1
To determine the limits of the solutions at $x = 1$, we will first find the exponents of the Legendre
equation at $x = 1$ by solving the appropriate indicial equation. To do this, it is convenient to first
multiply the Legendre equation by $-1$ and factor the first coefficient, giving us

\[ (x - 1)(x + 1) y'' + 2x y' - \lambda y = 0 . \]

Multiplying this by $x - 1$ converts the equation into quasi-Euler form

\[ (x - 1)^2 \alpha(x) y'' + (x - 1) \beta(x) y' + \gamma(x) y = 0 \]

where

\[ \alpha(x) = x + 1 , \quad \beta(x) = 2x \quad\text{and}\quad \gamma(x) = -(x - 1)\lambda . \]

Thus, the corresponding indicial equation is

\[ r(r - 1)\alpha_0 + r\beta_0 + \gamma_0 = 0 \]

with

\[ \alpha_0 = \alpha(1) = 2 , \quad \beta_0 = \beta(1) = 2 \quad\text{and}\quad \gamma_0 = \gamma(1) = 0 . \]

That is, the exponents $r = r_1$ and $r = r_2$ must satisfy

\[ 2r(r - 1) + 2r = 0 , \]

which simplifies to $r^2 = 0$. So,

\[ r_1 = r_2 = 0 , \]

and our big theorem on the Frobenius method, theorem 33.2 on page 706, tells us that any solution
$y$ to a Legendre equation on $(-1, 1)$ can be written as

\[ y(x) = c_1^+ y_1^+(x) + c_2^+ y_2^+(x) \]

where

\[ y_1^+(x) = |x - 1|^0 \sum_{k=0}^{\infty} a_k^+ (x - 1)^k = \sum_{k=0}^{\infty} a_k^+ (x - 1)^k \quad\text{with}\quad a_0^+ = 1 , \]

and

\[ y_2^+(x) = y_1^+(x) \ln|x - 1| + (x - 1) \sum_{k=0}^{\infty} b_k^+ (x - 1)^k . \]

Now, let’s make some simple observations using these formulas:

1. Solution $y_1^+$ is analytic at $x = 1$ (i.e., $x = 1$ is an ordinary point for $y_1^+$). Moreover,

\[ \lim_{x \to 1} y_1^+(x) = a_0^+ = 1 . \]

2. On the other hand, because of the logarithmic factor in $y_2^+$,

\[ \lim_{x \to 1} y_2^+(x) = 1 \cdot \lim_{x \to 1} \ln|x - 1| + (1 - 1) b_0^+ = -\infty . \]

Hence, $x = 1$ is a singular point for solution $y_2^+$.

3. More generally, if $y(x) = c_1^+ y_1^+(x) + c_2^+ y_2^+(x)$ is any nontrivial solution to Legendre’s
equation on $(-1, 1)$, then the following must hold:
(a) If $c_2^+ \neq 0$, then $x = 1$ is a singular point for $y$, and

\[ \lim_{x \to 1} |y(x)| = \Bigl| c_1^+ \underbrace{\lim_{x \to 1} y_1^+(x)}_{1} + c_2^+ \underbrace{\lim_{x \to 1} y_2^+(x)}_{-\infty} \Bigr| = \infty . \]

(b) If $c_2^+ = 0$, then $c_1^+ \neq 0$, and $x = 1$ is an ordinary point for $y$. Moreover,

\[ \lim_{x \to 1} y(x) = c_1^+ \lim_{x \to 1} y_1^+(x) = c_1^+ \neq 0 . \]

(Hence, $x = 1$ is a singular point for $y$ if and only if $c_2^+ \neq 0$.)

All together, these observations give us:

Lemma 33.7
Let $y$ be a nontrivial solution on $(-1, 1)$ to Legendre’s equation. Then $x = 1$ is a singular point
for $y$ if and only if

\[ \lim_{x \to 1^-} |y(x)| = \infty . \]

Moreover, if $x = 1$ is not a singular point for $y$, then

\[ \lim_{x \to 1^-} y(x) \]

exists and is a finite nonzero value.

Solution Limits at x = −1
A very similar analysis using x = −1 instead of x = 1 leads to:

Lemma 33.8
Let $y$ be a nontrivial solution on $(-1, 1)$ to Legendre’s equation. Then $x = -1$ is a singular point
for $y$ if and only if

\[ \lim_{x \to -1^+} |y(x)| = \infty . \]

Moreover, if $x = -1$ is not a singular point for $y$, then

\[ \lim_{x \to -1^+} y(x) \]

exists and is a finite nonzero value.

?Exercise 33.1: Verify the above lemma by redoing the analysis done in the previous subsection
using x = −1 in place of x = 1 .

The Unboundedness of the Nonpolynomial Solutions

Recall that a function y is said to be bounded on an interval (a, b) if there is a ﬁnite number M
which “bounds” the absolute value of y(x) when x is in (a, b) ; that is,

|y(x)| ≤ M whenever a < x < b .

Naturally, if a function is not bounded on the interval of interest, we say it is unbounded.

Now, if y happens to be one of those nonpolynomial solutions to a Legendre equation on
(−1, 1) , then we know y has a singularity at either x = −1 or at x = 1 or at both (lemma 33.6).
Lemmas 33.7 and 33.8 then tell us that

\[ \lim_{x \to 1^-} |y(x)| = \infty \quad\text{or}\quad \lim_{x \to -1^+} |y(x)| = \infty , \]

clearly telling us that $y(x)$ is not bounded on $(-1, 1)$!

Lemma 33.9
Let y be a nonzero solution to a Legendre equation on (−1, 1) . If y is not a polynomial, then it is
not bounded on (−1, 1) .
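A concrete instance may make this lemma less abstract. For $\lambda = 0$ the even series is the polynomial $p_0(x) = 1$, while the odd series is not a polynomial; with the recursion $c_{k+2} = \frac{k(k+1)}{(k+2)(k+1)} c_k$ (our own derivation from the Legendre equation with $\lambda = 0$) its coefficients are $c_k = 1/k$ for odd $k$, which is the Maclaurin series of the inverse hyperbolic tangent. The Python sketch below, with the hypothetical name `y0_odd`, checks this numerically and shows the growth near $x = 1$.

```python
import math

def y0_odd(x, terms=200):
    """Partial sum of the odd-powered Legendre series for lambda = 0.

    Starts from c_1 = 1 and applies c_{k+2} = k/(k+2) * c_k, so c_k = 1/k.
    """
    total, coeff = 0.0, 1.0
    for k in range(1, 2 * terms, 2):
        total += coeff * x**k
        coeff *= k / (k + 2)
    return total

# Inside (-1, 1) the partial sums agree with atanh(x) ...
print(y0_odd(0.5), math.atanh(0.5))

# ... and atanh(x) grows without bound as x -> 1 from the left.
for x in (0.9, 0.999, 0.999999):
    print(x, math.atanh(x))
```

Since $\tanh^{-1}(x) = \frac{1}{2} \ln\frac{1+x}{1-x}$, the blow-up at $x = \pm 1$ is logarithmic, matching the $\ln|x - 1|$ factor in $y_2^+$.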

The Polynomial Solutions and Legendre Polynomials

Now assume y is a nonzero polynomial solution to a Legendre equation.
First of all, because each polynomial is a continuous function on the real line, we know from a
classic theorem of calculus that the polynomial has maximum and minimum values over any given
closed subinterval. Thus, in particular, our polynomial solution y has a maximum and a minimum
value over [−1, 1] , and, hence, is bounded on (−1, 1) . So

Lemma 33.10
Each polynomial solution to a Legendre equation is bounded on (−1, 1) .

But now remember that the Legendre equation with parameter $\lambda$ has polynomial solutions if
and only if $\lambda = m(m + 1)$ for some nonnegative integer $m$, and those solutions are all constant
multiples of

\[ p_m(x) = \begin{cases} y_{\lambda,E}(x) & \text{if } m \text{ is even} \\ y_{\lambda,O}(x) & \text{if } m \text{ is odd} \end{cases} . \]

Since $x = 1$ is not a singular point for $p_m$, lemma 33.7 tells us that

\[ p_m(1) = \lim_{x \to 1} p_m(x) \neq 0 . \]

This allows us to define the $m$th Legendre polynomial $P_m$ by

\[ P_m(x) = \frac{1}{p_m(1)} \, p_m(x) \quad\text{for}\quad m = 0, 1, 2, 3, \ldots . \]

Clearly, any constant multiple of $p_m$ is also a constant multiple of $P_m$. So we can use the $P_m$’s
instead of the $p_m$’s to describe all polynomial solutions to the Legendre equations.
In practice, it is more common to use the Legendre polynomials than the $p_m$’s. In part, this is
because

\[ P_m(1) = \frac{1}{p_m(1)} \, p_m(1) = 1 \quad\text{for}\quad m = 0, 1, 2, 3, \ldots . \]

With a little thought, you’ll realize that this means that, for each nonnegative integer $m$, $P_m$ is the
polynomial solution to the Legendre equation with parameter $\lambda = m(m + 1)$ that equals $1$ when
$x = 1$.
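To see the normalization $P_m = p_m / p_m(1)$ in action, we can apply it to the six polynomials $p_0, \ldots, p_5$ listed earlier in this section. The following Python sketch is illustrative only; the dictionary `p`, the helper `normalize` and the coefficient-list representation (lowest power first) are our own conventions.

```python
from fractions import Fraction as F

# Coefficient lists [c_0, c_1, ...] (lowest power first) for p_0, ..., p_5,
# copied from the list of polynomial solutions given earlier in this section.
p = {
    0: [F(1)],
    1: [F(0), F(1)],
    2: [F(1), F(0), F(-3)],
    3: [F(0), F(1), F(0), F(-5, 3)],
    4: [F(1), F(0), F(-10), F(0), F(35, 3)],
    5: [F(0), F(1), F(0), F(-14, 3), F(0), F(21, 5)],
}

def normalize(coeffs):
    """Divide a polynomial by its value at x = 1 (the sum of its coefficients)."""
    value_at_1 = sum(coeffs)
    return [c / value_at_1 for c in coeffs]

P = {m: normalize(coeffs) for m, coeffs in p.items()}
for m, coeffs in P.items():
    print(m, [str(c) for c in coeffs])
```

The division is safe precisely because $p_m(1) \neq 0$, which is what lemma 33.7 guarantees. For instance, $p_2(1) = -2$, so $P_2(x) = -\frac{1}{2} + \frac{3}{2}x^2$.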

Summary
This is a good time to skim the lemmas in this section, along with the discussion of the Legendre
polynomials, and verify for yourself that these results can be condensed into the following:

Theorem 33.11 (bounded solutions of the Legendre equations)

There are bounded, nontrivial solutions on (−1, 1) to the Legendre equation

\[ (1 - x^2) y'' - 2x y' + \lambda y = 0 \]

if and only if λ = m(m + 1) for some nonnegative integer m . Moreover, y is such a solution if
and only if y is a constant multiple of the m th Legendre polynomial.

We may ﬁnd a use for this theorem later (much later).

33.6 Finding Second Solutions Using Theorem 33.2

Let’s return to actually solving differential equations.
When the basic method of Frobenius fails to deliver a second solution, we can turn to the
appropriate formulas given in theorem 33.2 on page 706, namely, formula (33.3),

\[ y_2(x) = y_1(x) \ln|x - x_0| + |x - x_0|^{1 + r_1} \sum_{k=0}^{\infty} b_k (x - x_0)^k \]

or formula (33.4),

\[ y_2(x) = \mu y_1(x) \ln|x - x_0| + |x - x_0|^{r_2} \sum_{k=0}^{\infty} b_k (x - x_0)^k , \]

depending, respectively, on whether the exponents $r_1$ and $r_2$ are equal or differ by a nonzero integer
(the only cases for which the basic method might fail). The unknown constants in these formulas
can then be found by fairly straightforward variations of the methods we’ve already developed. Plug
formula (33.3) or (33.4) (as appropriate) into the differential equation, derive the recursion formula
for the $b_k$’s (and the value of $\mu$ if using formula (33.4)), and compute as many of the $b_k$’s as
desired.
Because the procedures are straightforward modiﬁcations of what we’ve already done many
times in the last few chapters, we won’t describe the steps in detail. Instead, we’ll illustrate the basic
ideas with an example, and then comment on those basic ideas. As you’ve come to expect in these
last few chapters, the computations are simple but lengthy. But do note how the linearity of the
equations is used to break the computations into more digestible pieces.

The Second Solution When r1 − r2 Is a Positive Integer

!Example 33.2: In example 32.9, starting on page 694, we attempted to find modified power
series solutions about $x_0 = 0$ to

\[ 2x y'' - 4y' - y = 0 , \]

which we rewrote as

\[ 2x^2 y'' - 4x y' - x y = 0 . \]

We found the exponents of this equation to be

\[ r_1 = 3 \quad\text{and}\quad r_2 = 0 , \]

and obtained

\[ y(x) = c \, x^3 \sum_{k=0}^{\infty} \frac{6}{2^k \, k! \, (k + 3)!} \, x^k \tag{33.12} \]

as the solutions to the differential equation corresponding to $r_1$. Unfortunately, we found that
there was not a similar solution corresponding to $r_2$.
To apply theorem 33.2, we first need the particular solution corresponding to $r_1$

\[ y_1(x) = x^3 \sum_{k=0}^{\infty} a_k x^k \quad\text{with}\quad a_0 = 1 . \]

So we use formula (33.12) with $c$ chosen so that

\[ a_0 = c \cdot \frac{6}{2^0 \, 0! \, (0 + 3)!} = 1 . \]

Simple computations show that $c = 1$ and, so,

\[ y_1(x) = x^3 \sum_{k=0}^{\infty} \frac{6}{2^k \, k! \, (k + 3)!} \, x^k = \sum_{k=0}^{\infty} \frac{6}{2^k \, k! \, (k + 3)!} \, x^{k+3} . \]
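Before building on $y_1$, it does no harm to confirm that its coefficients really satisfy the differential equation. Substituting any power series $\sum c_n x^n$ into $2x^2 y'' - 4x y' - x y$ produces the coefficient $2n(n - 3)c_n - c_{n-1}$ on $x^n$, so every one of these must vanish for $y_1$. The Python sketch below checks this with exact fractions; the helper names `a`, `c` and `residual` are ours.

```python
from fractions import Fraction
from math import factorial

def a(k):
    """a_k = 6 / (2^k k! (k+3)!), the coefficients in the series for y_1."""
    return Fraction(6, 2**k * factorial(k) * factorial(k + 3))

def c(n):
    """Coefficient of x^n in y_1(x) = sum_k a_k x^(k+3); zero for n < 3."""
    return a(n - 3) if n >= 3 else Fraction(0)

def residual(n):
    """Coefficient of x^n in 2x^2 y'' - 4x y' - x y when y = y_1."""
    return 2 * n * (n - 3) * c(n) - c(n - 1)

print(a(0), a(1), a(2))
print([residual(n) for n in range(10)])
```

Every entry of the second list should be zero if $y_1$ is a solution; note that at $n = 3$ the factor $n - 3$ vanishes, which is the same degeneracy that makes the second solution harder to find.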

According to theorem 33.2, the general solution to our differential equation is

\[ y(x) = c_1 y_1(x) + c_2 y_2(x) \]

where $y_1$ is as above, and (since $r_2 = 0$ and $K = r_1 - r_2 = 3$)

\[ y_2(x) = \mu y_1(x) \ln|x - x_0| + \sum_{k=0}^{\infty} b_k x^k , \]

where $b_0 = 1$, $b_3$ is arbitrary (we’ll take it to be zero), and $\mu$ is some constant. To simplify
matters, let us rewrite the last formula (remembering that $x_0 = 0$) as

\[ y_2(x) = \mu Y_1(x) + Y_2(x) \]

where

\[ Y_1(x) = y_1(x) \ln|x| \quad\text{and}\quad Y_2(x) = \sum_{k=0}^{\infty} b_k x^k . \]

Plugging this into the differential equation, we get

\begin{align*}
0 &= 2x^2 y_2'' - 4x y_2' - x y_2 \\
  &= 2x^2 \left[ \mu Y_1 + Y_2 \right]'' - 4x \left[ \mu Y_1 + Y_2 \right]' - x \left[ \mu Y_1 + Y_2 \right] .
\end{align*}

By the linearity of the derivatives, this can be rewritten as

\[ 0 = \mu \left\{ 2x^2 Y_1'' - 4x Y_1' - x Y_1 \right\} + \left\{ 2x^2 Y_2'' - 4x Y_2' - x Y_2 \right\} . \tag{33.13} \]

Now, because $y_1$ is a solution to our differential equation, you can easily verify that

\begin{align*}
2x^2 Y_1'' - 4x Y_1' - x Y_1
  &= 2x^2 \frac{d^2}{dx^2} \left[ y_1(x) \ln|x| \right] - 4x \frac{d}{dx} \left[ y_1(x) \ln|x| \right] - x \left[ y_1(x) \ln|x| \right] \\
  &= \cdots \\
  &= 4x y_1' - 6 y_1 .
\end{align*}

Replacing the $y_1$ in the last line with its series formula then gives us

\[ 2x^2 Y_1'' - 4x Y_1' - x Y_1 = 4x \frac{d}{dx} \sum_{k=0}^{\infty} \frac{6}{2^k \, k! \, (k + 3)!} \, x^{k+3} - 6 \sum_{k=0}^{\infty} \frac{6}{2^k \, k! \, (k + 3)!} \, x^{k+3} , \]

which, after suitable computation and reindexing (you do it), becomes

\[ 2x^2 Y_1'' - 4x Y_1' - x Y_1 = \sum_{n=3}^{\infty} \frac{12(2n - 3)}{2^{n-3} (n - 3)! \, n!} \, x^n . \tag{33.14} \]

Next, using the series formula for $Y_2$ we have

\[ 2x^2 Y_2'' - 4x Y_2' - x Y_2 = 2x^2 \frac{d^2}{dx^2} \sum_{k=0}^{\infty} b_k x^k - 4x \frac{d}{dx} \sum_{k=0}^{\infty} b_k x^k - x \sum_{k=0}^{\infty} b_k x^k , \]

which, after suitable computations and changes of indices (again, you do it!), reduces to

\[ 2x^2 Y_2'' - 4x Y_2' - x Y_2 = \sum_{n=1}^{\infty} \left\{ 2n(n - 3) b_n - b_{n-1} \right\} x^n . \tag{33.15} \]

Combining equations (33.13), (33.14) and (33.15):

\begin{align*}
0 &= \mu \left\{ 2x^2 Y_1'' - 4x Y_1' - x Y_1 \right\} + \left\{ 2x^2 Y_2'' - 4x Y_2' - x Y_2 \right\} \\
  &= \mu \sum_{n=3}^{\infty} \frac{12(2n - 3)}{2^{n-3} (n - 3)! \, n!} \, x^n + \sum_{n=1}^{\infty} \left\{ 2n(n - 3) b_n - b_{n-1} \right\} x^n \\
  &= \mu \sum_{n=3}^{\infty} \frac{12(2n - 3)}{2^{n-3} (n - 3)! \, n!} \, x^n + \left[ -4b_1 - b_0 \right] x^1 + \left[ -4b_2 - b_1 \right] x^2
     + \sum_{n=3}^{\infty} \left\{ 2n(n - 3) b_n - b_{n-1} \right\} x^n \\
  &= -\left[ 4b_1 + b_0 \right] x^1 - \left[ 4b_2 + b_1 \right] x^2
     + \sum_{n=3}^{\infty} \left[ 2n(n - 3) b_n - b_{n-1} + \mu \frac{12(2n - 3)}{2^{n-3} (n - 3)! \, n!} \right] x^n .
\end{align*}

Since each term in this last power series must be zero, we must have

\[ -\left[ 4b_1 + b_0 \right] = 0 , \quad -\left[ 4b_2 + b_1 \right] = 0 \]

and

\[ 2n(n - 3) b_n - b_{n-1} + \mu \frac{12(2n - 3)}{2^{n-3} (n - 3)! \, n!} = 0 \quad\text{for}\quad n = 3, 4, 5, \ldots . \]

This (and the fact that we’ve set $b_0 = 1$) means that

\[ b_1 = -\frac{1}{4} b_0 = -\frac{1}{4} \cdot 1 = -\frac{1}{4} , \quad b_2 = -\frac{1}{4} b_1 = -\frac{1}{4} \left( -\frac{1}{4} \right) = \frac{1}{16} \]

and

\[ 2n(n - 3) b_n = b_{n-1} - \mu \frac{12(2n - 3)}{2^{n-3} (n - 3)! \, n!} \quad\text{for}\quad n = 3, 4, 5, \ldots . \tag{33.16} \]

Because of the $n - 3$ factor in front of $b_n$, dividing the last equation by $2n(n - 3)$ to get a
recursion formula for the $b_n$’s would result in a recursion formula that “blows up” for $n = 3$.
So we need to treat that case separately.
With $n = 3$ in equation (33.16), we get

\begin{align*}
2 \cdot 3 (3 - 3) b_3 &= b_2 - \mu \frac{12(2 \cdot 3 - 3)}{2^{3-3} (3 - 3)! \, 3!} \\
\Longrightarrow \quad 0 &= \frac{1}{16} - 6\mu \\
\Longrightarrow \quad \mu &= \frac{1}{96} .
\end{align*}

Notice that we obtained the value for $\mu$ instead of $b_3$. As claimed in the theorem, $b_3$ is arbitrary.
Because of this, and because we only need one second solution, let us set

\[ b_3 = 0 . \]

Now we can divide equation (33.16) by $2n(n - 3)$ and use the value for $\mu$ just derived
(with a little arithmetic) to obtain the recursion formula

\[ b_n = \frac{1}{2n(n - 3)} \left[ b_{n-1} - \frac{(2n - 3)}{2^n (n - 3)! \, n!} \right] \quad\text{for}\quad n = 4, 5, 6, \ldots . \]

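So,

\begin{align*}
b_4 &= \frac{1}{2 \cdot 4 (4 - 3)} \left[ b_3 - \frac{(2 \cdot 4 - 3)}{2^4 (4 - 3)! \, 4!} \right]
     = \frac{1}{8} \left[ 0 - \frac{5}{384} \right] = -\frac{5}{3{,}072} , \\
b_5 &= \frac{1}{2 \cdot 5 (5 - 3)} \left[ b_4 - \frac{(2 \cdot 5 - 3)}{2^5 (5 - 3)! \, 5!} \right]
     = \frac{1}{20} \left[ -\frac{5}{3{,}072} - \frac{7}{7{,}680} \right] = -\frac{13}{102{,}400} , \\
    &\;\;\vdots
\end{align*}

We won’t attempt to find a general formula for the $b_n$’s here!

Thus, a second particular solution to our differential equation is

\begin{align*}
y_2(x) &= \mu y_1(x) \ln|x - x_0| + \sum_{k=0}^{\infty} b_k x^k \\
       &= \frac{1}{96} \, y_1(x) \ln|x| + \left\{ 1 - \frac{1}{4} x + \frac{1}{16} x^2 + 0x^3 - \frac{5}{3{,}072} x^4 - \frac{13}{102{,}400} x^5 + \cdots \right\}
\end{align*}

where $y_1$ is our first particular solution,

\[ y_1(x) = \sum_{k=0}^{\infty} \frac{6}{2^k \, k! \, (k + 3)!} \, x^{k+3} . \]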
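Computations like those in this example are simple but lengthy, which makes them a natural candidate for a machine check. The Python sketch below follows the steps of the example with exact fractions: it sets $b_0 = 1$, gets $b_1$ and $b_2$ from the $x^1$ and $x^2$ terms, reads $\mu$ off the degenerate $n = 3$ case of (33.16), takes the arbitrary $b_3$ to be zero by default, and then applies (33.16) for $n \geq 4$. The names `forcing` and `second_solution_coeffs` are hypothetical, chosen only for this illustration.

```python
from fractions import Fraction
from math import factorial

def forcing(n):
    """12(2n - 3) / (2^(n-3) (n-3)! n!), the x^n coefficient in (33.14), n >= 3."""
    return Fraction(12 * (2 * n - 3), 2 ** (n - 3) * factorial(n - 3) * factorial(n))

def second_solution_coeffs(n_max, b3=Fraction(0)):
    """Return (mu, [b_0, ..., b_{n_max}]) for the example; assumes n_max >= 3."""
    b = [Fraction(1)]                 # b_0 = 1
    b.append(-b[0] / 4)               # from -[4 b_1 + b_0] = 0
    b.append(-b[1] / 4)               # from -[4 b_2 + b_1] = 0
    mu = b[2] / forcing(3)            # n = 3 in (33.16): 0 = b_2 - mu * forcing(3)
    b.append(Fraction(b3))            # b_3 is arbitrary
    for n in range(4, n_max + 1):
        # (33.16) divided by 2n(n - 3)
        b.append((b[n - 1] - mu * forcing(n)) / (2 * n * (n - 3)))
    return mu, b

mu, b = second_solution_coeffs(5)
print("mu =", mu)
print("b  =", [str(x) for x in b])
```

With $b_3 = 0$ this gives $\mu = 1/96$ and $b_4 = -5/3{,}072$, and applying the recursion exactly as written (with the minus sign inside the brackets) gives $b_5 = -13/102{,}400$. Passing a different `b3` changes the later $b_n$’s but not $\mu$, reflecting the fact that a different choice of $b_3$ merely adds a multiple of $y_1$ to $y_2$.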

In general, when

\[ r_1 - r_2 = K \]

for some positive integer $K$, the computations illustrated in the above example will yield a second
particular solution. It will turn out that $b_1$, $b_2$, … and $b_{K-1}$ are all “easily computed” from $b_0$,
just as in the example. You will also obtain a recursion relation for $b_n$ in terms of lower-indexed
$b_k$’s and the coefficients from the series formula for $y_1$. This recursion formula (formula (33.16) in
our example) will hold for $n \geq K$, but be degenerate when $n = K$ (just as in our example, where
$K = 3$). From that degenerate case, the value of $\mu$ in formula (33.4) can be determined. The rest
of the $b_n$’s can then be computed using the recursion relation. Unfortunately, it is highly unlikely
that you will be able to find a general formula for these $b_n$’s in terms of just the index, $n$. So just
compute as many as seem reasonable.

The Second Solution When r1 = r2

The basic ideas illustrated in the last example also apply when the exponents of the differential
equation, r1 and r2 , are equal. Of course, instead of using the formula used in the example for y2 ,
use formula (33.3),

\[ y_2(x) = y_1(x) \ln|x - x_0| + |x - x_0|^{1 + r_1} \sum_{k=0}^{\infty} b_k (x - x_0)^k . \]

In this case, there is no “ μ ” to determine and none of the bk ’s will be arbitrary. In some ways,
that makes this a simpler case than considered in our example. You can work out the details in the
exercises.

Additional Exercises

33.2. For each differential equation and singular point $x_0$ given below, let $r_1$ and $r_2$ be the
corresponding exponents (with $r_1 \geq r_2$ if they are real), and let $y_1$ and $y_2$ be the two
modified power series solutions about the given $x_0$ described in the “big theorem on the
Frobenius method”, theorem 33.2 on page 706, and do the following:

i If not already in the form

\[ (x - x_0)^2 \alpha(x) y'' + (x - x_0) \beta(x) y' + \gamma(x) y = 0 \]

where $\alpha$, $\beta$ and $\gamma$ are all analytic at $x_0$ with $\alpha(x_0) \neq 0$, then rewrite the
differential equation in this form.

ii Determine the corresponding indicial equation, and find $r_1$ and $r_2$.

iii Write out the corresponding shifted Euler equation

\[ (x - x_0)^2 \alpha(x_0) y'' + (x - x_0) \beta(x_0) y' + \gamma(x_0) y = 0 , \]

and find the solutions $y_{\mathrm{Euler},1}$ and $y_{\mathrm{Euler},2}$ which approximate, respectively, $y_1(x)$
and $y_2(x)$ when $x \approx x_0$.

iv Determine the limits $\lim_{x \to x_0} |y_1(x)|$ and $\lim_{x \to x_0} |y_2(x)|$.

Do not attempt to find the modified power series formulas for $y_1$ and $y_2$.

a. $x^2 y'' - 2x y' + \left( 2 - x^2 \right) y = 0$ , $x_0 = 0$

b. $x^2 y'' - 2x^2 y' + \left( x^2 - 2 \right) y = 0$ , $x_0 = 0$

c. $y'' + \frac{1}{x} y' + y = 0$ , $x_0 = 0$ (Bessel’s equation of order 0)

d. $x^2 \left( 2 - x^2 \right) y'' + \left( 5x + 4x^2 \right) y' + (1 + x^2) y = 0$ , $x_0 = 0$

e. $x^2 y'' - \left( 5x + 2x^2 \right) y' + 9y = 0$ , $x_0 = 0$

f. $x^2 (1 + 2x) y'' + x y' + (4x^3 - 4) y = 0$ , $x_0 = 0$

g. $4x^2 y'' + 8x y' + (1 - 4x) y = 0$ , $x_0 = 0$

h. $x^2 y'' + x y' - (1 + 2x) y = 0$ , $x_0 = 0$

i. $x y'' + 4y' + \frac{12}{(x + 2)^2} y = 0$ , $x_0 = 0$

j. $x y'' + 4y' + \frac{12}{(x + 2)^2} y = 0$ , $x_0 = -2$

k. $(x - 3) y'' + (x - 3) y' + y = 0$ , $x_0 = 3$

l. $(1 - x^2) y'' - x y' + 3y = 0$ , $x_0 = 1$ (Chebyshev equation with parameter 3)

33.3. Suppose $x_0$ is a regular singular point for some second-order homogeneous linear differential
equation, and that the corresponding exponents are complex

\[ r_+ = \lambda + i\omega \quad\text{and}\quad r_- = \lambda - i\omega \]

(with $\omega \neq 0$). Let $y$ be any nontrivial solution to this differential equation on an interval
having $x_0$ as the left endpoint. Show that

\[ \lim_{x \to x_0^+} y(x) \]

is zero if $\lambda > 0$, and does not exist if $\lambda \leq 0$.

33.4. Assume $x_0$ is a regular singular point on the real line for

\[ (x - x_0)^2 \alpha(x) y'' + (x - x_0) \beta(x) y' + \gamma(x) y = 0 \]

where, as usual, $\alpha$, $\beta$ and $\gamma$ are all analytic at $x_0$ with $\alpha(x_0) \neq 0$. Assume the solutions
$r_1$ and $r_2$ to the corresponding indicial equation are real, with $r_1 \geq r_2$. Let $y_1(x)$ and
$y_2(x)$ be the corresponding solutions to the differential equation as described in the big
theorem on the Frobenius method, theorem 33.2.
a. Compute the derivatives of $y_1$ and $y_2$ and show that, for $i = 1$ and $i = 2$,

\[ \lim_{x \to x_0} \left| y_i'(x) \right| =
   \begin{cases} 0 & \text{if } 1 < r_i \\ \infty & \text{if } 0 < r_i < 1 \\ \infty & \text{if } r_i < 0 \end{cases} . \]

Be sure to consider all cases.

b. Compute $\lim_{x \to x_0} y_2'(x)$ when $r_1 = 1$ and when $r_1 = 0$.

c. What can be said about $\lim_{x \to x_0} y_2'(x)$ when $r_1 = 1$ and when $r_1 = 0$?

33.5. Recall that the Chebyshev equation with parameter $\lambda$ is

\[ (1 - x^2) y'' - x y' + \lambda y = 0 , \tag{33.17} \]

where $\lambda$ can be any constant. In exercise 30.9 on page 628 you discovered that:

1. The only singular points for each Chebyshev equation are $x = 1$ and $x = -1$.

2. For each $\lambda$, the general solution on $(-1, 1)$ to equation (33.17) is given by

\[ y_\lambda(x) = a_0 \, y_{\lambda,E}(x) + a_1 \, y_{\lambda,O}(x) \]

where $y_{\lambda,E}$ and $y_{\lambda,O}$ are, respectively, even- and odd-termed series

\[ y_{\lambda,E}(x) = \sum_{\substack{k=0 \\ k \text{ is even}}}^{\infty} c_k x^k \quad\text{and}\quad
   y_{\lambda,O}(x) = \sum_{\substack{k=0 \\ k \text{ is odd}}}^{\infty} c_k x^k \]

with $c_0 = 1$, $c_1 = 1$ and the other $c_k$’s determined from $c_0$ or $c_1$ via the recursion
formula.

3. Equation (33.17) has nontrivial polynomial solutions if and only if $\lambda = m^2$ for some
nonnegative integer $m$. Moreover, for each such $m$, all the polynomial solutions
are constant multiples of an $m$th degree polynomial $p_m$ given by

\[ p_m(x) = \begin{cases} y_{\lambda,E}(x) & \text{if } m \text{ is even} \\ y_{\lambda,O}(x) & \text{if } m \text{ is odd} \end{cases} . \]

In particular,

\[ p_0(x) = 1 , \quad p_1(x) = x , \quad p_2(x) = 1 - 2x^2 , \]

\[ p_3(x) = x - \frac{4}{3} x^3 , \quad p_4(x) = 1 - 8x^2 + 8x^4 \]

and

\[ p_5(x) = x - 4x^3 + \frac{16}{5} x^5 . \]

4. Each nonpolynomial solution to a Chebyshev equation on $(-1, 1)$ is given by a
power series about $x_0 = 0$ whose radius of convergence is exactly $1$.

In the following, you will continue the analysis of the solutions to the Chebyshev equations
in a manner analogous to our continuation of the analysis of the solutions to the Legendre
equations in section 33.5.
a. Verify that x = 1 and x = −1 are regular singular points for each Chebyshev equation.
b. Find the exponents $r_1$ and $r_2$ at x = 1 and x = −1 of each Chebyshev equation.
c. Let y be a nonpolynomial solution to a Chebyshev equation on (−1, 1) , and show that either

\[ \lim_{x \to 1^-} \left| y'(x) \right| = \infty \quad\text{or}\quad \lim_{x \to -1^+} \left| y'(x) \right| = \infty \]

(or both limits are infinite).
d. Verify that $p_m(1) \neq 0$ for each nonnegative integer m .
e. For each nonnegative integer m , the m-th Chebyshev polynomial $T_m(x)$ is the polynomial solution to the Chebyshev equation with parameter $\lambda = m^2$ satisfying $T_m(1) = 1$ . Find $T_m(x)$ for m = 0, 1, 2, . . . , 5 .
f. Finish verifying that the Chebyshev equation

\[ (1 - x^2) y'' - x y' + \lambda y = 0 \]

has nontrivial solutions with bounded first derivatives on (−1, 1) if and only if $\lambda = m^2$ for some nonnegative integer m . Moreover, y is such a solution if and only if y is a constant multiple of the m-th Chebyshev polynomial.
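For checking answers to parts d and e, the following is a minimal sketch in Python. It assumes the recursion formula $c_{k+2} = (k^2 - m^2)\,c_k / [(k+2)(k+1)]$, which is what substituting a power series into equation (33.17) with $\lambda = m^2$ gives; the function names are illustrative only and are not part of the text.

```python
from fractions import Fraction

def series_polynomial(m):
    """Coefficients [c_0, ..., c_m] of p_m for lambda = m^2.

    Assumes the recursion c_{k+2} = (k^2 - m^2) c_k / ((k+2)(k+1)),
    started from c_0 = 1 (m even) or c_1 = 1 (m odd).
    """
    c = [Fraction(0)] * (m + 1)
    c[m % 2] = Fraction(1)
    for k in range(m % 2, m - 1, 2):
        c[k + 2] = Fraction(k * k - m * m, (k + 2) * (k + 1)) * c[k]
    return c

def chebyshev_residual(c, lam):
    """Coefficients of (1 - x^2) y'' - x y' + lam y for y = sum c_k x^k."""
    out = []
    for k in range(len(c)):
        term = (lam - k * k) * c[k]               # -k(k-1)c_k - k c_k + lam c_k
        if k + 2 < len(c):
            term += (k + 2) * (k + 1) * c[k + 2]  # contribution of y''
        out.append(term)
    return out

def evaluate(c, x):
    """Value of the polynomial sum c_k x^k at x."""
    return sum(ck * x ** k for k, ck in enumerate(c))

def chebyshev_T(m):
    """Rescale p_m so that its value at x = 1 is 1."""
    p = series_polynomial(m)
    scale = evaluate(p, 1)
    return [ck / scale for ck in p]

for m in range(6):
    print(m, [str(ck) for ck in chebyshev_T(m)])
```

Exact rational arithmetic is used so that the residual of the differential equation can be compared with zero without rounding concerns.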

33.6. The following differential equations all have x 0 = 0 as a regular singular point. For each,
the corresponding exponents r1 and r2 are given, along with the solution y1 (x) on x > 0
corresponding to r1 , as described in theorem 33.2 on page 706. This solution can be found
by the basic method of Frobenius. The second solution, y2 , cannot be found by the basic
method, but, as stated in theorem 33.2, it is of the form
\[ y_2(x) = y_1(x) \ln|x| + |x|^{1+r_1} \sum_{k=0}^{\infty} b_k x^k \]

or

\[ y_2(x) = \mu\, y_1(x) \ln|x| + |x|^{r_2} \sum_{k=0}^{\infty} b_k x^k , \]

depending, respectively, on whether the exponents $r_1$ and $r_2$ are equal or differ by a nonzero integer. Do recall that, in the second formula, $b_0 = 1$ , and $b_K$ is arbitrary for $K = r_1 - r_2$ .
“Find y2 (x) for x > 0 ” for each of the following. More precisely, determine which
of the above two formulas hold, and ﬁnd at least the values of b0 , b1 , b2 , b3 and b4 ,
along with the value for μ if appropriate. You may set any arbitrary constant equal to 0 ,
and assume x > 0 .
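a. $4x^2 y'' + (1 - 4x) y = 0$ ; $r_1 = r_2 = \frac{1}{2}$ ,

\[ y_1(x) = \sqrt{x} \sum_{k=0}^{\infty} \frac{1}{(k!)^2} x^k \]

b. $y'' + \frac{1}{x} y' + y = 0$ (Bessel's equation of order 0) ; $r_1 = r_2 = 0$ ,

\[ y_1(x) = \sum_{m=0}^{\infty} \frac{(-1)^m}{(2^m m!)^2} x^{2m} \]

c. $x^2 y'' - \left(x + x^2\right) y' + 4x y = 0$ ; $r_1 = 2$ , $r_2 = 0$ ,

\[ y_1(x) = x^2 - \frac{2}{3} x^3 + \frac{1}{12} x^4 \]

d. $x^2 y'' + x y' + (4x - 4) y = 0$ ; $r_1 = 2$ , $r_2 = -2$ ,

\[ y_1(x) = x^2 \sum_{k=0}^{\infty} \frac{(-4)^k\, 4!}{k!\,(k+4)!} x^k \]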
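Before hunting for $y_2$, it can be reassuring to confirm that a given $y_1$ really is a solution. The sketch below does this for part c, reading the equation as $x^2 y'' - (x + x^2) y' + 4xy = 0$ with $y_1(x) = x^2 - \frac{2}{3}x^3 + \frac{1}{12}x^4$, and treating polynomials as coefficient lists. The helper names are hypothetical, chosen only for this illustration.

```python
from fractions import Fraction

def derivative(c):
    """Coefficient list of the derivative of sum c_k x^k."""
    return [k * ck for k, ck in enumerate(c)][1:]

def multiply(a, b):
    """Coefficient list of the product of two polynomials."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def add(*polys):
    """Coefficient list of the sum of several polynomials."""
    n = max(len(p) for p in polys)
    return [sum((p[k] for p in polys if k < len(p)), Fraction(0)) for k in range(n)]

# y1(x) = x^2 - (2/3) x^3 + (1/12) x^4 from part c
y1 = [Fraction(0), Fraction(0), Fraction(1), Fraction(-2, 3), Fraction(1, 12)]
dy1 = derivative(y1)
d2y1 = derivative(dy1)

# x^2 y'' - (x + x^2) y' + 4x y, assembled term by term
residual = add(
    multiply([0, 0, 1], d2y1),
    multiply([0, -1, -1], dy1),
    multiply([0, 4], y1),
)
print(residual)  # every entry should be zero if y1 solves the equation
```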


Basic Derivations
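First, let's see what we get from plugging the arbitrary modified power series

\[ y(x) = x^r \sum_{k=0}^{\infty} c_k x^k = \sum_{k=0}^{\infty} c_k x^{k+r} \]

into L (using the formula from theorem 29.12 on multiplying power series):

\begin{align*}
L[y] &= x^2 y'' + x P(x) y' + Q(x) y \\
&= x^2 \sum_{k=0}^{\infty} c_k (k+r)(k+r-1) x^{k+r-2} \\
&\qquad + x \left( \sum_{k=0}^{\infty} p_k x^k \right) \left( \sum_{k=0}^{\infty} c_k (k+r) x^{k+r-1} \right) + \left( \sum_{k=0}^{\infty} q_k x^k \right) \left( \sum_{k=0}^{\infty} c_k x^{k+r} \right) \\
&= \sum_{k=0}^{\infty} c_k (k+r)(k+r-1) x^{k+r} \\
&\qquad + \sum_{k=0}^{\infty} \sum_{j=0}^{k} c_j p_{k-j} (j+r) x^{k+r} + \sum_{k=0}^{\infty} \sum_{j=0}^{k} c_j q_{k-j} x^{k+r} \\
&= x^r \sum_{k=0}^{\infty} \left\{ c_k (k+r)(k+r-1) + \sum_{j=0}^{k} c_j \left[ p_{k-j}(j+r) + q_{k-j} \right] \right\} x^k .
\end{align*}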

That is,

\[ L\left[ x^r \sum_{k=0}^{\infty} c_k x^k \right] = x^r \sum_{k=0}^{\infty} L_k x^k \]

where

\[ L_k = c_k (k+r)(k+r-1) + \sum_{j=0}^{k} c_j \left[ p_{k-j}(j+r) + q_{k-j} \right] . \]

Let’s now look at the individual L k ’s .

For k = 0 ,

\begin{align*}
L_0 &= c_0 (0+r)(0+r-1) + \sum_{j=0}^{0} c_j \left[ p_{0-j}(j+r) + q_{0-j} \right] \\
&= c_0 r(r-1) + c_0 \left[ p_0 r + q_0 \right] \\
&= c_0 \left[ r(r-1) + p_0 r + q_0 \right] .
\end{align*}
The expression in the last bracket will arise several more times in our computations. For convenience,
we will let I be the corresponding polynomial function
I (ρ) = ρ(ρ − 1) + p0 ρ + q0 .
Then
L 0 = c0 I (r) .
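For k > 0 ,

\begin{align*}
L_k &= c_k (k+r)(k+r-1) + \sum_{j=0}^{k} c_j \left[ p_{k-j}(j+r) + q_{k-j} \right] \\
&= c_k (k+r)(k+r-1) + \sum_{j=0}^{k-1} c_j \left[ p_{k-j}(j+r) + q_{k-j} \right] + c_k \left[ p_{k-k}(k+r) + q_{k-k} \right] \\
&= c_k \underbrace{\left[ (k+r)(k+r-1) + p_0 (k+r) + q_0 \right]}_{I(k+r)} + \sum_{j=0}^{k-1} c_j \left[ p_{k-j}(j+r) + q_{k-j} \right] .
\end{align*}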

We’ll be repeating the above computations at least two more times in this chapter. To save time,
let’s summarize what we have.

Lemma 34.1
Let L be the differential operator

\[ L[y] = x^2 y'' + x P(x) y' + Q(x) y \]

where

\[ P(x) = \sum_{k=0}^{\infty} p_k x^k \quad\text{and}\quad Q(x) = \sum_{k=0}^{\infty} q_k x^k . \]

Then, for any modified power series $x^r \sum_{k=0}^{\infty} c_k x^k$ ,

\[ L\left[ x^r \sum_{k=0}^{\infty} c_k x^k \right] = x^r \left\{ c_0 I(r) + \sum_{k=1}^{\infty} \left[ c_k I(k+r) + \sum_{j=0}^{k-1} c_j \left[ p_{k-j}(j+r) + q_{k-j} \right] \right] x^k \right\} \]

where

\[ I(\rho) = \rho(\rho-1) + p_0 \rho + q_0 . \]
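The lemma translates directly into a few lines of code. The sketch below computes the coefficients $L_0, L_1, \ldots$ of $L[x^r \sum c_k x^k] = x^r \sum L_k x^k$ from truncated coefficient lists; the function names and the truncation to equal-length lists are assumptions made for this illustration, not part of the text. As a check, it is applied to the first few coefficients of the order-0 Bessel solution, for which P(x) = 1 and Q(x) = x².

```python
from fractions import Fraction

def indicial(rho, p0, q0):
    """I(rho) = rho(rho - 1) + p0*rho + q0."""
    return rho * (rho - 1) + p0 * rho + q0

def lemma_coefficients(p, q, c, r):
    """Return [L_0, ..., L_{n-1}] as given by Lemma 34.1.

    p, q, c are coefficient lists of equal length n (truncated series).
    """
    result = []
    for k in range(len(c)):
        # sum over j = 0, ..., k-1 (empty when k = 0)
        tail = sum(c[j] * (p[k - j] * (j + r) + q[k - j]) for j in range(k))
        result.append(c[k] * indicial(k + r, p[0], q[0]) + tail)
    return result

# Bessel's equation of order 0: x^2 y'' + x y' + x^2 y = 0, so P = 1 and Q = x^2
p = [1, 0, 0, 0, 0]
q = [0, 0, 1, 0, 0]
c = [Fraction(1), Fraction(0), Fraction(-1, 4), Fraction(0), Fraction(1, 64)]
print(lemma_coefficients(p, q, c, 0))
```

For a genuine solution every $L_k$ vanishes; feeding in coefficients that do not satisfy the recursion shows up as a nonzero entry.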

Our immediate interest is in finding a modified power series

\[ y(x) = x^r \sum_{k=0}^{\infty} c_k x^k \quad\text{with}\quad c_0 \neq 0 \]

that satisfies our differential equation,

\[ L[y] = 0 . \]

Applying the above lemma, we see that we must have

\[ x^r \left\{ c_0 I(r) + \sum_{k=1}^{\infty} \left[ c_k I(k+r) + \sum_{j=0}^{k-1} c_j \left[ p_{k-j}(j+r) + q_{k-j} \right] \right] x^k \right\} = 0 , \]

which means that each term in the above power series must be zero. That is,

\[ I(r) = 0 \tag{34.4a} \]

and

\[ c_k I(k+r) + \sum_{j=0}^{k-1} c_j \left[ p_{k-j}(j+r) + q_{k-j} \right] = 0 \quad\text{for}\quad k = 1, 2, 3, \ldots . \tag{34.4b} \]

The Indicial Equation

The Equation and Its Solutions
You probably already recognized equation (34.4a) as the indicial equation from the basic method of
Frobenius. In more explicit form, it’s the polynomial equation

r(r − 1) + p0 r + q0 = 0 . (34.5a)

Equivalently, we can write this equation as

r 2 + ( p0 − 1)r + q0 = 0 , (34.5b)
or even
(r − r1 )(r − r2 ) = 0 (34.5c)
or
r 2 − (r1 + r2 )r + r1r2 = 0 (34.5d)

where $r_1$ and $r_2$ are the solutions to the indicial equation,

\[ r_1 = \frac{1 - p_0 + \sqrt{(p_0 - 1)^2 - 4q_0}}{2} \quad\text{and}\quad r_2 = \frac{1 - p_0 - \sqrt{(p_0 - 1)^2 - 4q_0}}{2} . \]

This, of course, also means that we can write the formula for I in four different ways:

\[ I(\rho) = \rho(\rho - 1) + p_0 \rho + q_0 , \tag{34.6a} \]

\[ I(\rho) = \rho^2 + (p_0 - 1)\rho + q_0 , \tag{34.6b} \]

\[ I(\rho) = (\rho - r_1)(\rho - r_2) \tag{34.6c} \]

and

\[ I(\rho) = \rho^2 - (r_1 + r_2)\rho + r_1 r_2 . \tag{34.6d} \]

For the rest of this chapter, we will use whichever of the above formulas for I seems most convenient
at the time. Also, r1 and r2 will always denote the two values given above. Do note that if both
are real, then r1 ≥ r2 .
By the way, if you compare the second and last of the above formulas for I (ρ) , you’ll see that

p0 = 1 − (r1 + r2 ) and q0 = r1r2 .

Later, we may ﬁnd these observations useful.
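As a quick numerical illustration of these formulas, the sketch below computes $r_1$ and $r_2$ from $p_0$ and $q_0$ and confirms the two relations just noted. It uses `cmath.sqrt` so that complex exponents are handled as well; the function name is an assumption for this example.

```python
import cmath

def indicial_roots(p0, q0):
    """Return (r1, r2) solving r(r - 1) + p0*r + q0 = 0."""
    root = cmath.sqrt((p0 - 1) ** 2 - 4 * q0)
    return (1 - p0 + root) / 2, (1 - p0 - root) / 2

for p0, q0 in [(1, -1), (1, 0), (1, 1)]:
    r1, r2 = indicial_roots(p0, q0)
    # p0 = 1 - (r1 + r2) and q0 = r1 * r2, up to rounding
    print(p0, q0, r1, r2, 1 - (r1 + r2), r1 * r2)
```

Because the principal square root has a nonnegative real part, real exponents come out ordered with $r_1 \geq r_2$, matching the convention above.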

Proof of Theorem 33.1

Recall that p0 and q0 are related to the coefﬁcients in the equation we ﬁrst started with,

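\[ x^2 \alpha(x) y'' + x \beta(x) y' + \gamma(x) y = 0 , \]

via

\[ p_0 = P(0) = \frac{\beta_0}{\alpha_0} \quad\text{and}\quad q_0 = Q(0) = \frac{\gamma_0}{\alpha_0} \]

where

\[ \alpha_0 = \alpha(0) , \quad \beta_0 = \beta(0) \quad\text{and}\quad \gamma_0 = \gamma(0) . \]

Using these relations, we can rewrite the first version of the indicial equation (equation (34.5a)) as

\[ r(r-1) + \frac{\beta_0}{\alpha_0} r + \frac{\gamma_0}{\alpha_0} = 0 , \]

which, after multiplying through by $\alpha_0$ , is

\[ \alpha_0 r(r-1) + \beta_0 r + \gamma_0 = 0 . \]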

This, along with the formulas for r1 and r2 , completes the proof of theorem 33.1 on page 705.

Recursion Formulas
The Basic Recursion Formula
You probably also recognized that equation (34.4b) is, essentially, a recursion formula for any given
value of r . Let us ﬁrst rewrite it as

\[ c_k I(k+r) = - \sum_{j=0}^{k-1} c_j \left[ p_{k-j}(j+r) + q_{k-j} \right] \quad\text{for}\quad k = 1, 2, 3, \ldots . \tag{34.7} \]

If

\[ I(k+r) \neq 0 \quad\text{for}\quad k = 1, 2, 3, \ldots , \]

then we can solve the above for $c_k$ , obtaining the generic recursion formula

\[ c_k = \frac{-1}{I(k+r)} \sum_{j=0}^{k-1} c_j \left[ p_{k-j}(j+r) + q_{k-j} \right] \quad\text{for}\quad k = 1, 2, 3, \ldots . \tag{34.8} \]

It will be worth noting that p0 and q0 do not explicitly appear in this recursion formula except
in the formula for I (k + r) .
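Recursion formula (34.8) can be sketched in code as follows. The function name, the use of truncated coefficient lists, and the choice to raise an error when $I(k+r) = 0$ are assumptions made for this illustration. The example reproduces the first coefficients of the order-0 Bessel series, where P(x) = 1, Q(x) = x² and r = 0.

```python
from fractions import Fraction

def frobenius_coefficients(p, q, r, c0=1):
    """Generate c_0, ..., c_{n-1} from recursion formula (34.8).

    p and q are the (truncated, equal-length) coefficient lists of P and Q.
    """
    def I(rho):
        return rho * (rho - 1) + p[0] * rho + q[0]

    c = [Fraction(c0)]
    for k in range(1, len(p)):
        denominator = I(k + r)
        if denominator == 0:
            raise ZeroDivisionError(f"I(k + r) = 0 at k = {k}; formula (34.8) does not apply")
        total = sum(c[j] * (p[k - j] * (j + r) + q[k - j]) for j in range(k))
        c.append(-total / denominator)
    return c

# Bessel's equation of order 0: P(x) = 1, Q(x) = x^2, exponent r = 0
p = [1, 0, 0, 0, 0, 0, 0]
q = [0, 0, 1, 0, 0, 0, 0]
print(frobenius_coefficients(p, q, 0))
```

Note that the inner sum only uses $p_{k-j}$ and $q_{k-j}$ with $k - j \geq 1$, which is the observation made above about $p_0$ and $q_0$.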


More General Recursion Formulas and a Convergence Theorem

Later, we will have to deal with recursion formulas of the form

\[ c_k = \frac{1}{I(k+r)} \left[ f_k - \sum_{j=0}^{k-1} c_j \left[ p_{k-j}(j+r) + q_{k-j} \right] \right] \]

where the $f_k$'s are coefficients of some power series convergent on (−R, R) . (Note that this reduces to recursion formula (34.8) if each $f_k$ is 0 .) To deal with the convergence of any power series based on any such recursion formula, we have the following theorem:

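Theorem 34.2
Let R > 0 . Assume

\[ \sum_{k=0}^{\infty} p_k x^k , \quad \sum_{k=0}^{\infty} q_k x^k \quad\text{and}\quad \sum_{k=0}^{\infty} f_k x^k \]

are power series convergent for |x| < R , and

\[ \sum_{k=0}^{\infty} c_k x^k \]

is a power series such that, for some value ω and some integer $K_0$ ,

\[ c_k = \frac{1}{J(k)} \left[ f_k - \sum_{j=0}^{k-1} c_j \left[ p_{k-j}(j+\omega) + q_{k-j} \right] \right] \quad\text{for}\quad k \geq K_0 \]

where J is some second-degree polynomial function satisfying

\[ J(k) \neq 0 \quad\text{for}\quad k = K_0 , K_0 + 1 , K_0 + 2 , \ldots . \]

Then $\sum_{k=0}^{\infty} c_k x^k$ is also convergent for |x| < R .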

The proof of this convergence theorem will be given in section 34.6. It is very similar to the
convergence proofs developed in chapter 31 for power series solutions.

34.3 The Easily Obtained Series Solutions

Now let r j be either of the two solutions r1 and r2 to the indicial equation,

I (r) = 0 .

To use recursion formula (34.8) with $r = r_j$ , it suffices to have

\[ I(k + r_j) \neq 0 \quad\text{for}\quad k = 1, 2, 3, \ldots . \]

But $r_1$ and $r_2$ are the only solutions to I(r) = 0 , so the last line tells us that, to use recursion formula (34.8) with $r = r_j$ , it suffices to have

\[ k + r_j \neq r_1 \quad\text{and}\quad k + r_j \neq r_2 \quad\text{for}\quad k = 1, 2, 3, \ldots ; \]

that is, it suffices to have

\[ r_1 - r_j \neq k \quad\text{and}\quad r_2 - r_j \neq k \quad\text{for}\quad k = 1, 2, 3, \ldots . \]

As long as this holds, we can start with any nonzero constant $c_0$ and generate subsequent $c_k$'s via the basic recursion formula (34.8) to create a power series

\[ \sum_{k=0}^{\infty} c_k x^k . \]

Moreover, theorem 34.2 assures us that this series is convergent for |x| < R . Consequently,

\[ y(x) = x^{r_j} \sum_{k=0}^{\infty} c_k x^k \]

is a well-deﬁned function, at least on (0, R) (just what happens at x = 0 depends on the x r j factor
in this formula). Plugging this formula back into our differential equation and basically repeating
the computations leading to the indicial equation and the recursion formula would then conﬁrm that
this y is, indeed, a solution on (0, R) to our differential equation. Let’s record this:

Lemma 34.3
If $r_j$ is either of the two solutions $r_1$ and $r_2$ to the indicial equation for the problem considered in this chapter, and

\[ r_1 - r_j \neq k \quad\text{and}\quad r_2 - r_j \neq k \quad\text{for}\quad k = 1, 2, 3, \ldots , \tag{34.9} \]

then a solution on (0, R) to the original differential equation is given by

\[ y(x) = x^{r_j} \sum_{k=0}^{\infty} c_k x^k \]

where $c_0$ is any nonzero constant, and $c_1$ , $c_2$ , $c_3$ , . . . are given by recursion formula (34.8) with $r = r_j$ .

Now let us consider the r j = r1 and r j = r2 cases separately, adding the assumption that the
coefﬁcients of our original differential equation are all real-valued in some interval about x 0 = 0 .
This means that the coefﬁcients in the indicial equation are all real. Hence, we may assume that
either both r1 and r2 are real with r1 ≥ r2 , or that r1 and r2 are complex conjugates of each other.

Solutions Corresponding to r1
With $r_j = r_1$ , condition (34.9) in the above lemma becomes

\[ r_1 - r_1 \neq k \quad\text{and}\quad r_2 - r_1 \neq k \quad\text{for}\quad k = 1, 2, 3, \ldots . \]

Clearly, the only way this cannot be satisfied is if

\[ r_2 - r_1 = K \quad\text{for some positive integer } K . \]

But, using the formulas for $r_1$ and $r_2$ from page 732, you can easily verify that

\[ r_2 - r_1 = - \sqrt{(p_0 - 1)^2 - 4q_0} , \]

which cannot equal some positive integer (it is either a nonpositive real number or, when the exponents are complex, not real at all). Thus, the above lemma assures us of the following:

One solution on (0, R) to our differential equation is given by

\[ y(x) = x^{r_1} \sum_{k=0}^{\infty} c_k x^k \]

where $c_0$ is any nonzero constant, and $c_1$ , $c_2$ , $c_3$ , . . . are given by recursion formula (34.8) with $r = r_1$ .

This confirms statement 1 in theorem 33.2.
In particular, for the rest of our discussion, let us let $y_1$ be the solution

\[ y_1(x) = x^{r_1} \sum_{k=0}^{\infty} a_k x^k \tag{34.10} \]

where $a_0 = 1$ and

\[ a_k = \frac{-1}{I(k + r_1)} \sum_{j=0}^{k-1} a_j \left[ p_{k-j}(j + r_1) + q_{k-j} \right] \quad\text{for}\quad k = 1, 2, 3, \ldots . \]

On occasion, we may call this our “ﬁrst” solution.

“Unexceptional” Solutions Corresponding to r2

With $r_j = r_2$ , condition (34.9) in lemma 34.3 becomes

\[ r_1 - r_2 \neq k \quad\text{and}\quad r_2 - r_2 \neq k \quad\text{for}\quad k = 1, 2, 3, \ldots , \]

which, obviously, is the same as

\[ r_1 - r_2 \neq K \quad\text{for each positive integer } K . \]

Unfortunately, this requirement does not automatically hold. It is certainly possible that

\[ r_1 - r_2 = K \quad\text{for some positive integer } K . \]

This is an “exceptional” case which we will have to examine further. For now, the lemma above simply assures us that:

If $r_1 - r_2$ is not a positive integer, then a solution on (0, R) to our differential equation is given by

\[ y(x) = x^{r_2} \sum_{k=0}^{\infty} c_k x^k \]

where $c_0$ is any nonzero constant, and the other $c_k$'s are given by recursion formula (34.8) with $r = r_2$ .

Of course, the solutions just described corresponding to $r_2$ will be the same as those corresponding to $r_1$ if $r_2 = r_1$ (i.e., $r_1 - r_2 = 0$ ). This is another exceptional case that we will have to examine later.
For the rest of this chapter, let us say that, if $r_1$ and $r_2$ are not equal and do not differ by an integer, then the “second solution” to our differential equation on (0, R) is

\[ y_2(x) = x^{r_2} \sum_{k=0}^{\infty} b_k x^k \]

where $b_0 = 1$ and

\[ b_k = \frac{-1}{I(k + r_2)} \sum_{j=0}^{k-1} b_j \left[ p_{k-j}(j + r_2) + q_{k-j} \right] \quad\text{for}\quad k = 1, 2, 3, \ldots . \]

We should note that, if r1 and r2 are two different values not differing by an integer, then the
above y1 and y2 are clearly not constant multiples of each other (at least, it should be clear once
you realize that the ﬁrst terms of y1 (x) and y2 (x) are, respectively, x r1 and x r2 ). Consequently
{y1 , y2 } is a fundamental set of solutions to our differential equation on (0, R) , and
y(x) = c1 y1 (x) + c2 y2 (x)
is a general solution to our differential equation over (0, R) . That ﬁnishes the proof of theorem 33.2
up through statement 2.
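To see the unexceptional case in action, here is a small sketch that generates both coefficient sequences for an equation whose exponents do not differ by an integer. The example equation (Bessel's equation of order 1/3, with exponents 1/3 and −1/3) and the function name are choices made for this illustration, not taken from the text.

```python
from fractions import Fraction

def series_coefficients(p, q, r):
    """Coefficients c_0 = 1, c_1, ... from recursion formula (34.8) with exponent r."""
    def I(rho):
        return rho * (rho - 1) + p[0] * rho + q[0]

    c = [Fraction(1)]
    for k in range(1, len(p)):
        total = sum(c[j] * (p[k - j] * (j + r) + q[k - j]) for j in range(k))
        c.append(-total / I(k + r))   # I(k + r) is nonzero in this example
    return c

# Bessel's equation of order 1/3: x^2 y'' + x y' + (x^2 - 1/9) y = 0
p = [1, 0, 0, 0, 0]
q = [Fraction(-1, 9), 0, 1, 0, 0]
r1, r2 = Fraction(1, 3), Fraction(-1, 3)

a = series_coefficients(p, q, r1)   # y1(x) = x^(1/3)  * sum a_k x^k
b = series_coefficients(p, q, r2)   # y2(x) = x^(-1/3) * sum b_k x^k
print(a)
print(b)
```

Since the leading terms are $x^{1/3}$ and $x^{-1/3}$, the two resulting series solutions cannot be constant multiples of each other, in line with the remark above.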

Deriving the “Exceptional” Solutions

In the next two sections, we will derive formulas for the solutions corresponding to r = r2 when
r1 and r2 are equal or differ by a nonzero integer. In deriving these solutions, we could use the ﬁrst
solution, y1 , with the reduction of order method from chapter 12. Unfortunately, that gets somewhat
messy and does not directly lead to useful recursion formulas. So, instead, we will take somewhat
different approaches.
Since the approach we’ll take when r2 = r1 is a bit more elementary (but still tedious) and
somewhat less “clever” than the approach we’ll take when r1 − r2 is a positive integer, we will
consider the case where r2 = r1 ﬁrst.

34.4 Second Solutions When r2 = r1

Recall that, based on what we learned from studying Euler equations, we suspected that a second
solution to our differential equation when r2 = r1 will be of the form
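\[ y(x) = \ln|x| \, Y(x) \quad\text{with}\quad Y(x) = x^{r_1} \sum_{k=0}^{\infty} b_k x^k . \]

Unfortunately, this turns out not to be generally true. But since it seemed so reasonable at the time, let us still try using this, but with an added “error term” $\epsilon$ . That is, let's try something of the form

\[ y(x) = \ln|x| \, Y(x) + \epsilon(x) . \tag{34.11} \]

Plugging this into the differential equation:

\begin{align*}
0 &= L[y] \\
&= x^2 y'' + x P y' + Q y \\
&= x^2 \left[ \ln|x| \, Y'' + \frac{2}{x} Y' - \frac{1}{x^2} Y + \epsilon'' \right] + x P \left[ \ln|x| \, Y' + \frac{1}{x} Y + \epsilon' \right] \\
&\qquad + Q \left[ \ln|x| \, Y(x) + \epsilon(x) \right] \\
&= \ln|x| \underbrace{\left[ x^2 Y'' + x P Y' + Q Y \right]}_{L[Y]} + 2x Y' - Y + P Y + \underbrace{x^2 \epsilon'' + x P \epsilon' + Q \epsilon}_{L[\epsilon]} .
\end{align*}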

Choosing Y to be our first solution,

\[ Y(x) = y_1(x) = x^{r_1} \sum_{k=0}^{\infty} a_k x^k , \]

causes the natural log term to vanish, leaving us with

\[ 0 = 2x y_1' - y_1 + P y_1 + L[\epsilon] , \]

which we can rewrite as

\[ L[\epsilon] = F(x) \tag{34.12a} \]

with

\[ F(x) = y_1(x) - 2x y_1'(x) - P(x) y_1(x) . \tag{34.12b} \]
It turns out that we will be seeing both the above differential equation and the function F when
we deal with the case where r2 and r1 differ by a nonzero integer. So, for now, let’s expand F(x)
using the series formulas for y1 and P without assuming r2 = r1 :

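\begin{align*}
F(x) &= y_1(x) - 2x y_1'(x) - P(x) y_1(x) \\
&= x^{r_1} \sum_{n=0}^{\infty} a_n x^n - 2x \left( x^{r_1} \sum_{n=0}^{\infty} a_n (r_1 + n) x^{n-1} \right) - \left( \sum_{n=0}^{\infty} p_n x^n \right) \left( x^{r_1} \sum_{m=0}^{\infty} a_m x^m \right) \\
&= x^{r_1} \left[ \sum_{n=0}^{\infty} a_n x^n - \sum_{n=0}^{\infty} a_n (2r_1 + 2n) x^n - \left( \sum_{n=0}^{\infty} p_n x^n \right) \left( \sum_{m=0}^{\infty} a_m x^m \right) \right] \\
&= x^{r_1} \sum_{n=0}^{\infty} \left[ a_n \left[ 1 - 2r_1 - 2n \right] - \sum_{j=0}^{n} a_j p_{n-j} \right] x^n .
\end{align*}

Recalling that $a_0 = 1$ and that, in general, $p_0 = 1 - r_1 - r_2$ , we see that the first term in the series simplifies somewhat to

\[ a_0 \left[ 1 - 2r_1 - 2 \cdot 0 \right] - \sum_{j=0}^{0} a_j p_{0-j} = a_0 \left[ 1 - 2r_1 - p_0 \right] = r_2 - r_1 . \]

For the other terms, we have

\begin{align*}
a_n \left[ 1 - 2r_1 - 2n \right] - \sum_{j=0}^{n} a_j p_{n-j} &= a_n \left[ 1 - 2r_1 - 2n \right] - \sum_{j=0}^{n-1} a_j p_{n-j} - a_n p_0 \\
&= a_n \left[ 1 - 2r_1 - p_0 - 2n \right] - \sum_{j=0}^{n-1} a_j p_{n-j} \\
&= a_n \left[ r_2 - r_1 - 2n \right] - \sum_{j=0}^{n-1} a_j p_{n-j} .
\end{align*}

So, in general,

\[ F(x) = x^{r_1} \left\{ r_2 - r_1 + \sum_{n=1}^{\infty} \left[ a_n \left[ r_2 - r_1 - 2n \right] - \sum_{j=0}^{n-1} a_j p_{n-j} \right] x^n \right\} . \tag{34.13} \]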


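and, for k = 1, 2, 3, . . . ,

\[ \epsilon_k (k+1)^2 + \sum_{j=0}^{k-1} \epsilon_j \left[ p_{k-j}(j + r_1 + 1) + q_{k-j} \right] = f_k . \]

That is,

\[ \epsilon_0 = f_0 \tag{34.16a} \]

and, for k = 1, 2, 3, . . . ,

\[ \epsilon_k = \frac{1}{(k+1)^2} \left[ f_k - \sum_{j=0}^{k-1} \epsilon_j \left[ p_{k-j}(j + r_1 + 1) + q_{k-j} \right] \right] \tag{34.16b} \]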

where the f k ’s are given by formula (34.14b).

Now recall just what we are looking for, namely, a function $\epsilon(x)$ such that

\[ y_2(x) = y_1(x) \ln|x| + \epsilon(x) \]

is a solution to our original differential equation. We have obtained

\[ \epsilon(x) = x^{\rho} \sum_{k=0}^{\infty} \epsilon_k x^k \]

where $\rho = r_1 + 1$ and the $\epsilon_k$'s are given by formula set (34.16). Plugging this formula back into the original differential equation and repeating the computations used to derive the above will confirm that $y_2$ is, indeed, a solution over (0, R) , provided the series for $\epsilon$ converges. Fortunately, using theorem 34.2 on page 734, this convergence is easily confirmed.
Thus, the above $y_2$ is a solution to our original differential equation on the interval (0, R) . Moreover, $y_2$ is clearly not a constant multiple of $y_1$ . So $\{y_1 , y_2\}$ is a fundamental set of solutions,

\[ y(x) = c_1 y_1(x) + c_2 y_2(x) \]

is a general solution to our differential equation over (0, R) , and we have verified statement 3 in theorem 33.2 (with $b_k = \epsilon_k$ ).
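The equal-exponent construction can be sketched in code. Because formula (34.14b) for the $f_k$'s is not reproduced here, the sketch assumes the $f_k$'s obtained by setting $r_2 = r_1$ in formula (34.13) and reindexing with n = k + 1, namely $f_k = -2(k+1)a_{k+1} - \sum_{j=0}^{k} a_j p_{k+1-j}$; that formula, the function name, and the truncation scheme are assumptions of this illustration. The error-term coefficients of formula set (34.16) are called `eps` in the code. The test case is Bessel's equation of order 0 (P = 1, Q = x², $r_1 = 0$).

```python
from fractions import Fraction

def equal_exponent_solution(p, q, r1):
    """Return (a, eps): coefficients of y1 and of the error term when r2 = r1.

    y1(x) = x^r1 * sum a_k x^k and
    y2(x) = y1(x) ln|x| + x^(r1 + 1) * sum eps_k x^k.
    """
    n = len(p)

    def I(rho):
        return rho * (rho - 1) + p[0] * rho + q[0]

    # first solution from recursion formula (34.8)
    a = [Fraction(1)]
    for k in range(1, n):
        total = sum(a[j] * (p[k - j] * (j + r1) + q[k - j]) for j in range(k))
        a.append(-total / I(k + r1))

    # assumed f_k, read off from (34.13) with r2 = r1 and n = k + 1
    f = [-2 * (k + 1) * a[k + 1] - sum(a[j] * p[k + 1 - j] for j in range(k + 1))
         for k in range(n - 1)]

    # formula set (34.16)
    eps = [f[0]]
    for k in range(1, n - 1):
        total = sum(eps[j] * (p[k - j] * (j + r1 + 1) + q[k - j]) for j in range(k))
        eps.append((f[k] - total) / (k + 1) ** 2)
    return a, eps

p = [1, 0, 0, 0, 0, 0]
q = [0, 0, 1, 0, 0, 0]
a, eps = equal_exponent_solution(p, q, 0)
print(a)
print(eps)
```

Under these assumptions the error term starts as $\frac{1}{4}x^2 - \frac{3}{128}x^4$, which agrees with the familiar logarithmic second solution of the order-0 Bessel equation.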

34.5 Second Solutions When r1 − r2 = K

Preliminaries
Let's now assume $r_1$ and $r_2$ differ by some positive integer K , $r_1 - r_2 = K$ . Setting

\[ y(x) = x^{r_2} \sum_{k=0}^{\infty} b_k x^k \]

with $b_0 = 1$ and using recursion formula (34.8) with $r = r_2$ gives us

\[ b_k = \frac{-1}{I(r_2 + k)} \sum_{j=0}^{k-1} b_j \left[ p_{k-j}(j + r_2) + q_{k-j} \right] \quad\text{for}\quad k = 1, 2, 3, \ldots , K - 1 . \]

Unfortunately, $I(r_2 + K) = I(r_1) = 0$ , giving us a “division by zero” when we attempt to compute $b_K$ . This is the complication we will deal with for the rest of this section. It turns out that there are two subcases, depending on whether

\[ \Gamma_K = \sum_{j=0}^{K-1} b_j \left[ p_{K-j}(j + r_2) + q_{K-j} \right] \]

is zero or not. If it is zero, we get lucky.

The Case Where We Get Lucky

Recall that we actually derived our recursion formula from the requirement that, for

\[ y(x) = x^{r_2} \sum_{k=0}^{\infty} b_k x^k \]

to be a solution to our differential equation, it suffices to have

\[ b_k I(r_2 + k) = - \sum_{j=0}^{k-1} b_j \left[ p_{k-j}(j + r_2) + q_{k-j} \right] \quad\text{for}\quad k = 1, 2, 3, \ldots . \tag{34.17} \]

As noted above, we can use this to find $b_k$ for k < K . For k = K , we have $I(r_2 + K) = I(r_1) = 0$ , and the above equation becomes

\[ b_K \cdot 0 = -\Gamma_K . \]

If we are lucky, then $\Gamma_K = 0$ and the above equation is trivially true for any value of $b_K$ . So $b_K$ is arbitrary if $\Gamma_K = 0$ . Pick any value you wish (say, $b_K = 0$ ), and use equation (34.17) to compute the rest of the $b_k$'s for

\[ y_2(x) = x^{r_2} \sum_{k=0}^{\infty} b_k x^k . \]
Convergence theorem 34.2 on page 734 now applies and assures us that the above power series
converges for |x| < R . And then, again, the very computations leading to the indicial equation
and recursion formulas verify that this y(x) is a solution to our differential equation. Moreover,
the leading term is x r2 . Consequently, y1 (x) and y2 (x) are not constant multiples of each other.
Hence, {y1 , y2 } is a fundamental set of solutions, and

y(x) = c1 y1 (x) + c2 y2 (x)

is a general solution to our differential equation over (0, R) .

If you go back and check, you will see that the above y2 is the solution claimed to exist in
statement 4 of theorem 33.2 when μ = 0 . Hence we’ve conﬁrmed that part of the claim.

?Exercise 34.1: Show that, if we took $b_0 = 0$ and $b_K = 1$ in the above (instead of $b_0 = 1$ and $b_K = 0$ ), we would have obtained the first solution, $y_1(x)$ . (Thus, if $r_1 - r_2 = K$ and $\Gamma_K = 0$ , the Frobenius method will generate the complete general solution when using $r = r_2$ .)
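The lucky case is easy to try out numerically. The sketch below runs the recursion with $r = r_2$, checks that the sum $\Gamma_K$ vanishes at k = K, and then continues with a chosen value of $b_K$. The example (Bessel's equation of order 1/2, with $r_1 = 1/2$, $r_2 = -1/2$ and K = 1) and the function name are choices made for this illustration.

```python
from fractions import Fraction

def lucky_second_solution(p, q, r2, K, b_K=0):
    """Coefficients b_0 = 1, b_1, ... for y2 = x^r2 * sum b_k x^k when Gamma_K = 0."""
    def I(rho):
        return rho * (rho - 1) + p[0] * rho + q[0]

    b = [Fraction(1)]
    for k in range(1, len(p)):
        total = sum(b[j] * (p[k - j] * (j + r2) + q[k - j]) for j in range(k))
        if k == K:
            # here total is Gamma_K, and I(r2 + K) = 0
            if total != 0:
                raise ValueError("Gamma_K is nonzero, so this is not the lucky case")
            b.append(Fraction(b_K))
        else:
            b.append(-total / I(k + r2))
    return b

# Bessel's equation of order 1/2: x^2 y'' + x y' + (x^2 - 1/4) y = 0
p = [1, 0, 0, 0, 0]
q = [Fraction(-1, 4), 0, 1, 0, 0]
b = lucky_second_solution(p, q, Fraction(-1, 2), K=1)
print(b)
```

With $b_K = 0$ the coefficients are those of the cosine series, so this second solution is $x^{-1/2} \cos x$.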


The Other Case

Let us now assume

\[ \Gamma_K = \sum_{j=0}^{K-1} b_j \left[ p_{K-j}(j + r_2) + q_{K-j} \right] \neq 0 . \]

Ultimately, we want to conﬁrm that formula (33.4) in theorem 33.2 does describe a solution to our
differential equation. Before doing that, however, let us see how anyone could have come up with
formula (33.4) in the ﬁrst place.

Deriving a Solution as the Limit of Other Second Solutions

Suppose we have two differential equations that are very similar to each other. Does it not seem
reasonable to expect one solution of one of these equations to also be very similar to some solution
to the other differential equation? Of course it does, and this is what we will use to derive the second
solution to our differential equation,

\[ x^2 y'' + x P y' + Q y = 0 . \tag{34.18} \]

Remember, this has the corresponding indicial equation

\[ I(r) = 0 \quad\text{with}\quad I(\rho) = (\rho - r_2)(\rho - r_1) . \]

Also remember that we are assuming $r_1 = r_2 + K$ for some positive integer K , and that $\Gamma_K$ (as defined above) is nonzero.
Now let r be any real value close to $r_2$ (say, $|r - r_2| < 1$ ) and consider

\[ x^2 y'' + x P_r y' + Q_r y = 0 \]

where $P_r$ and $Q_r$ differ from P and Q only in having the first coefficients in their power series about 0 adjusted so that the corresponding indicial equation is

\[ I_r(\rho) = 0 \quad\text{with}\quad I_r(\rho) = (\rho - r)(\rho - r_1) . \]

If $r = r_2$ , this is our original equation. If $r \neq r_2$ , this is our “approximating differential equation”.

From our discussion in section 34.3 on the “easily obtained solutions”, we know that, when $r \neq r_2$ , a second solution to this equation is given by

\[ y(x, r) = x^r \sum_{k=0}^{\infty} b_k(r) x^k \]

with $b_0(r) = 1$ and

\[ b_k(r) = \frac{-1}{I_r(r + k)} \sum_{j=0}^{k-1} b_j(r) \left[ p_{k-j}(j + r) + q_{k-j} \right] \quad\text{for}\quad k = 1, 2, \ldots . \]

This, presumably, will approximate some second solution $y(x, r_2)$ to equation (34.18),

\[ y(x, r_2) \approx y(x, r) . \]

Presumably, also, this approximation improves as $r \to r_2$ . So, let us go further and seek the $y(x, r_2)$ given by

\[ y(x, r_2) = \lim_{r \to r_2} y(x, r) . \]

Before going further, let us observe that

\begin{align*}
I_r(r + k) &= (r + k - r)(r + k - r_1) \\
&= k(k + r - [r_2 + K]) = k(k - K + r - r_2) .
\end{align*}

Thus,

\[ b_k(r) = \frac{-1}{k(k - K + r - r_2)} \sum_{j=0}^{k-1} b_j(r) \left[ p_{k-j}(j + r) + q_{k-j} \right] \quad\text{for}\quad k = 1, 2, \ldots . \]

In particular,

\[ b_K(r) = \frac{-1}{K(r - r_2)} \sum_{j=0}^{K-1} b_j(r) \left[ p_{K-j}(j + r) + q_{K-j} \right] . \]

So, while we have

\[ b_k(r_2) = \lim_{r \to r_2} b_k(r) \quad\text{when}\quad k < K \]

being well-defined finite values, we also have

\[ \lim_{r \to r_2} |b_K(r)| = \infty , \]

suggesting that

\[ \lim_{r \to r_2} |b_k(r)| = \infty \quad\text{for}\quad k > K \]

since the recursion formulas for these $b_k(r)$'s all contain $b_K(r)$ .
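The blow-up of $b_K(r)$, and the fact that the product $(r - r_2)\,b_K(r)$ stays finite, can be observed directly. The sketch below evaluates the recursion for the approximating equation with exact fractions; the example (Bessel's equation of order 1, where $r_1 = 1$, $r_2 = -1$, K = 2) and the function name are assumptions of this illustration. Only $p_k$ and $q_k$ with $k \geq 1$ enter the sums, so the adjusted first coefficients of $P_r$ and $Q_r$ appear only through $I_r$.

```python
from fractions import Fraction

def approximating_coefficients(p, q, r, r1):
    """b_0(r) = 1, b_1(r), ... for the approximating equation with exponents r and r1."""
    def I_r(rho):
        return (rho - r) * (rho - r1)

    b = [Fraction(1)]
    for k in range(1, len(p)):
        total = sum(b[j] * (p[k - j] * (j + r) + q[k - j]) for j in range(k))
        b.append(-total / I_r(r + k))
    return b

# Bessel's equation of order 1: P(x) = 1, Q(x) = x^2 - 1, so r1 = 1, r2 = -1, K = 2
p = [1, 0, 0]
q = [-1, 0, 1]
r1, r2, K = 1, -1, 2

for denominator in (10, 1000, 100000):
    r = r2 + Fraction(1, denominator)
    b = approximating_coefficients(p, q, r, r1)
    print(r, b[K], (r - r2) * b[K])
```

In this example $b_2(r)$ grows without bound as r approaches −1, while $(r - r_2)\,b_2(r)$ equals −1/2 for every r tried, which is $-\Gamma_K / K$ with $\Gamma_K = 1$.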
It must be noted, however, that we are assuming $\lim_{r \to r_2} y(x, r)$ exists despite the fact that individual terms in $y(x, r)$ behave badly as $r \to r_2$ . Let's hold to this hope. Assuming this,

\begin{align*}
\lim_{r \to r_2} x^r \sum_{k=K}^{\infty} b_k(r) x^k &= \lim_{r \to r_2} \left[ x^r \sum_{k=0}^{\infty} b_k(r) x^k - x^r \sum_{k=0}^{K-1} b_k(r) x^k \right] \\
&= \lim_{r \to r_2} \left[ y(x, r) - x^r \sum_{k=0}^{K-1} b_k(r) x^k \right] = y(x, r_2) - x^{r_2} \sum_{k=0}^{K-1} b_k(r_2) x^k ,
\end{align*}

which is finite for each x in the interval of convergence. Consequently,

\[ \lim_{r \to r_2} (r - r_2) x^r \sum_{k=K}^{\infty} b_k(r) x^k = 0 , \]

which we will rewrite as

\[ \lim_{r \to r_2} x^r \sum_{k=K}^{\infty} \beta_k(r) x^k = 0 \tag{34.19} \]

by letting

\[ \beta_k(r) = (r - r_2) b_k(r) \quad\text{for}\quad r \neq r_2 . \]
Now, let's start computing $y(x, r_2)$ as a limit using a simple, cheap trick:

\begin{align*}
y(x, r_2) &= \lim_{r \to r_2} y(x, r) \\
&= \lim_{r \to r_2} x^r \sum_{k=0}^{\infty} b_k(r) x^k \\
&= \lim_{r \to r_2} x^r \sum_{k=0}^{K-1} b_k(r) x^k + \lim_{r \to r_2} x^r \sum_{k=K}^{\infty} b_k(r) x^k \\
&= x^{r_2} \sum_{k=0}^{K-1} b_k(r_2) x^k + \lim_{r \to r_2} \frac{r - r_2}{r - r_2}\, x^r \sum_{k=K}^{\infty} b_k(r) x^k \\
&= x^{r_2} \sum_{k=0}^{K-1} b_k(r_2) x^k + \lim_{r \to r_2} \frac{x^r \sum_{k=K}^{\infty} \beta_k(r) x^k}{r - r_2} .
\end{align*}

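Using L'Hôpital's rule, we see that

\begin{align*}
\lim_{r \to r_2} \frac{x^r \sum_{k=K}^{\infty} \beta_k(r) x^k}{r - r_2} &= \lim_{r \to r_2} \frac{\dfrac{\partial}{\partial r} \left[ x^r \sum_{k=K}^{\infty} \beta_k(r) x^k \right]}{\dfrac{\partial}{\partial r} \left[ r - r_2 \right]} \\
&= \lim_{r \to r_2} \frac{x^r \ln|x| \sum_{k=K}^{\infty} \beta_k(r) x^k + x^r \sum_{k=K}^{\infty} \beta_k'(r) x^k}{1} \\
&= x^{r_2} \ln|x| \sum_{k=K}^{\infty} \beta_k(r_2) x^k + x^{r_2} \sum_{k=K}^{\infty} \beta_k'(r_2) x^k .
\end{align*}

Combining the last two results gives

\[ y(x, r_2) = x^{r_2} \ln|x| \sum_{k=K}^{\infty} \beta_k(r_2) x^k + x^{r_2} \sum_{k=0}^{\infty} \left\{ \begin{array}{ll} b_k(r_2) & \text{if } k < K \\ \beta_k'(r_2) & \text{if } K \leq k \end{array} \right\} x^k . \]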

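This is not a very “pretty” expression. To simplify it, let

\[ \epsilon_k = \begin{cases} b_k(r_2) & \text{if } k < K \\ \beta_k'(r_2) & \text{if } K \leq k \end{cases} , \]

and observe that, letting $\alpha_k = \beta_{k+K}(r_2)$ ,

\[ x^{r_2} \sum_{k=K}^{\infty} \beta_k(r_2) x^k = x^{r_1 - K} \left[ \alpha_0 x^K + \alpha_1 x^{K+1} + \alpha_2 x^{K+2} + \cdots \right] = x^{r_1} \sum_{k=0}^{\infty} \alpha_k x^k . \]

Then

\[ y(x, r_2) = \ln|x| \, Y(x) + \epsilon(x) \tag{34.20a} \]

where

\[ Y(x) = x^{r_1} \sum_{k=0}^{\infty} \alpha_k x^k \quad\text{and}\quad \epsilon(x) = x^{r_2} \sum_{k=0}^{\infty} \epsilon_k x^k . \tag{34.20b} \]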

Admittedly, part of the derivation of formula (34.20) was based on “hope” and assumptions that seemed reasonable but were not rigorously justified. So we are not yet certain this formula does yield the desired solution. Moreover, the methods given in this derivation for computing the $\alpha_k$'s and $\epsilon_k$'s certainly appear to be rather difficult to carry out in practice. These are valid concerns that we will deal with by now ignoring just how we derived this formula. Instead, we will see about validating this formula and obtaining more usable recursion formulas for the $\alpha_k$'s and $\epsilon_k$'s via methods that, by now, should be familiar to the reader.

Verifying Our Solution

Notice how similar formula (34.20a) for $y(x, r_2)$ is to formula (34.11) on page 737 from which we derived the second solution y(x) when $r_1 - r_2 = 0$ in section 34.4. Let us be inspired by the work done in that section (and reuse as much of that work as possible) and try to find a solution of the form

\[ y(x) = \ln|x| \, Y(x) + \epsilon(x) \]

where

\[ \epsilon(x) = x^{r_2} \sum_{k=0}^{\infty} \epsilon_k x^k . \]

Glancing back at the work near the beginning of section 34.4, it should be clear that

\[ y(x) = \ln|x| \, Y(x) + \epsilon(x) \]

will satisfy our differential equation L[y] = 0 if

\[ Y(x) = y_1(x) \quad\text{and}\quad L[\epsilon] = F(x) \]
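where, taking into account the facts that $I(r_2) = 0$ and $r_1 - r_2 = K$ ,

\begin{align*}
L[\epsilon] &= x^{r_2} \left\{ \epsilon_0 I(r_2) + \sum_{k=1}^{\infty} \left[ \epsilon_k I(r_2 + k) + \sum_{j=0}^{k-1} \epsilon_j \left[ p_{k-j}(r_2 + j) + q_{k-j} \right] \right] x^k \right\} \\
&= x^{r_2} \sum_{k=1}^{\infty} \left[ \epsilon_k I(r_2 + k) + \sum_{j=0}^{k-1} \epsilon_j \left[ p_{k-j}(r_2 + j) + q_{k-j} \right] \right] x^k
\end{align*}

and

\begin{align*}
F(x) &= x^{r_1} \left\{ r_2 - r_1 + \sum_{n=1}^{\infty} \left[ a_n \left[ r_2 - r_1 - 2n \right] - \sum_{j=0}^{n-1} a_j p_{n-j} \right] x^n \right\} \\
&= x^{r_2 + K} \left\{ -K + \sum_{n=1}^{\infty} \left[ a_n \left[ -K - 2n \right] - \sum_{j=0}^{n-1} a_j p_{n-j} \right] x^n \right\} \\
&= x^{r_2} \left\{ -K x^K + \sum_{n=1}^{\infty} \left[ a_n \left[ -K - 2n \right] - \sum_{j=0}^{n-1} a_j p_{n-j} \right] x^{K+n} \right\} .
\end{align*}

Using k = n + K , we can rewrite our last formula as

\[ F(x) = x^{r_2} \left\{ -K x^K + \sum_{k=K+1}^{\infty} f_k x^k \right\} \]

with

\[ f_k = a_{k-K} \left[ K - 2k \right] - \sum_{j=0}^{k-K-1} a_j p_{k-K-j} . \]

As in the previous section, we know the power series in the formula for F(x) converges for |x| < R because of the way it was constructed from power series already known to be convergent for these values of x .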

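So the differential equation, $L[\epsilon] = F$ , expands to

\[ x^{r_2} \sum_{k=1}^{\infty} \left[ \epsilon_k I(r_2 + k) + \sum_{j=0}^{k-1} \epsilon_j \left[ p_{k-j}(r_2 + j) + q_{k-j} \right] \right] x^k = x^{r_2} \left\{ -K x^K + \sum_{k=K+1}^{\infty} f_k x^k \right\} , \]

which means that we are seeking $\epsilon_k$'s satisfying the system

\[ \epsilon_k I(r_2 + k) + \sum_{j=0}^{k-1} \epsilon_j \left[ p_{k-j}(r_2 + j) + q_{k-j} \right] = \begin{cases} 0 & \text{if } 1 \leq k < K \\ -K & \text{if } k = K \\ f_k & \text{if } k > K \end{cases} . \tag{34.21} \]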

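Solving for $\epsilon_k$ in the first few equations of this set yields

\[ \epsilon_k = \frac{-1}{I(r_2 + k)} \sum_{j=0}^{k-1} \epsilon_j \left[ p_{k-j}(r_2 + j) + q_{k-j} \right] \quad\text{for}\quad k = 1, 2, \ldots , K - 1 . \]

To simplify matters, let's recall that, at the start of this section, we had already obtained a set $\{b_0 , b_1 , \ldots , b_{K-1}\}$ satisfying $b_0 = 1$ and

\[ b_k = \frac{-1}{I(r_2 + k)} \sum_{j=0}^{k-1} b_j \left[ p_{k-j}(r_2 + j) + q_{k-j} \right] \quad\text{for}\quad k = 1, 2, \ldots , K - 1 . \]

It is then easily verified that, whatever value we have for $\epsilon_0$ ,

\[ \epsilon_k = \epsilon_0 b_k \quad\text{for}\quad k = 1, 2, \ldots , K - 1 . \]

Now, also recall that

\[ \Gamma_K = \sum_{j=0}^{K-1} b_j \left[ p_{K-j}(r_2 + j) + q_{K-j} \right] \neq 0 , \]

and take a look at the K-th equation in system (34.21):

\begin{align*}
& \epsilon_K I(r_2 + K) + \sum_{j=0}^{K-1} \epsilon_j \left[ p_{K-j}(r_2 + j) + q_{K-j} \right] = -K \\
\longrightarrow \quad & \epsilon_K I(r_1) + \epsilon_0 \sum_{j=0}^{K-1} b_j \left[ p_{K-j}(r_2 + j) + q_{K-j} \right] = -K \\
\longrightarrow \quad & \epsilon_K \cdot 0 + \epsilon_0 \Gamma_K = -K .
\end{align*}

So $\epsilon_K$ can be any value, while

\[ \epsilon_0 = -\frac{K}{\Gamma_K} , \]

and

\[ \epsilon_k = \epsilon_0 \cdot b_k = -\frac{K b_k}{\Gamma_K} \quad\text{for}\quad k = 1, 2, \ldots , K - 1 . \]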

For the remaining $\epsilon_k$'s , we simply solve each of the remaining equations in system (34.21) for $\epsilon_k$ (using whatever value of $\epsilon_K$ we choose), obtaining

\[ \epsilon_k = \frac{1}{I(r_2 + k)} \left[ f_k - \sum_{j=0}^{k-1} \epsilon_j \left[ p_{k-j}(r_2 + j) + q_{k-j} \right] \right] \quad\text{for}\quad k > K . \]

Theorem 34.2 tells us that the resulting $\sum_{k=0}^{\infty} \epsilon_k x^k$ converges for |x| < R , and that, along with all the computations above, tells us that

\[ y(x) = y_1(x) \ln|x| + x^{r_2} \sum_{k=0}^{\infty} \epsilon_k x^k \]

is a solution to our original differential equation on (0, R) . Clearly, it is not a constant multiple of $y_1$ , and so $\{y_1 , \mu y\}$ is a fundamental set of solutions for any nonzero constant μ . In particular, the solution mentioned in theorem 33.2 is the one with

\[ \mu = \frac{1}{\epsilon_0} = -\frac{\Gamma_K}{K} . \]

And that, except for verifying convergence theorem 34.2, confirms statement 4 of theorem 33.2, and completes the proof of theorem 33.2, itself.
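The whole procedure for the case $\Gamma_K \neq 0$ can be collected into one routine. The following is a sketch under the formulas of this section (system (34.21), the formula for $f_k$, and $\mu = -\Gamma_K / K$); the function name, the truncated coefficient lists, and the default choice of 0 for the arbitrary coefficient are assumptions of this illustration. The error-term coefficients are called `eps` in the code.

```python
from fractions import Fraction

def log_second_solution(p, q, r1, r2, eps_K=0):
    """Return (a, eps, mu) for integer K = r1 - r2 > 0 with Gamma_K nonzero.

    y1(x) = x^r1 * sum a_k x^k,
    y(x)  = y1(x) ln|x| + x^r2 * sum eps_k x^k,  mu = -Gamma_K / K.
    Requires len(p) == len(q) > K.
    """
    n, K = len(p), int(r1 - r2)

    def I(rho):
        return (rho - r1) * (rho - r2)

    def bracket(j, k, r):
        return p[k - j] * (j + r) + q[k - j]

    # first solution, recursion formula (34.8) with r = r1
    a = [Fraction(1)]
    for k in range(1, n):
        a.append(-sum(a[j] * bracket(j, k, r1) for j in range(k)) / I(k + r1))

    # b_0, ..., b_{K-1} and Gamma_K
    b = [Fraction(1)]
    for k in range(1, K):
        b.append(-sum(b[j] * bracket(j, k, r2) for j in range(k)) / I(k + r2))
    gamma = sum(b[j] * bracket(j, K, r2) for j in range(K))
    if gamma == 0:
        raise ValueError("Gamma_K = 0: use the simpler 'lucky' construction instead")

    # eps_k = -K b_k / Gamma_K for k < K, then the arbitrary eps_K
    eps = [-K * bk / gamma for bk in b]
    eps.append(Fraction(eps_K))
    for k in range(K + 1, n):
        m = k - K
        f_k = a[m] * (K - 2 * k) - sum(a[j] * p[m - j] for j in range(m))
        total = sum(eps[j] * bracket(j, k, r2) for j in range(k))
        eps.append((f_k - total) / I(k + r2))
    return a, eps, -gamma / K

# Bessel's equation of order 1: P(x) = 1, Q(x) = x^2 - 1, r1 = 1, r2 = -1
a, eps, mu = log_second_solution([1, 0, 0, 0, 0], [-1, 0, 1, 0, 0], 1, -1)
print(a, eps, mu)
```

For this example the sketch gives $\epsilon(x) = -2x^{-1} + \frac{3}{32}x^3 + \cdots$ and $\mu = -\frac{1}{2}$; a different choice of $\epsilon_K$ only adds a constant multiple of $y_1$.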

34.6 Convergence of the Solution Series

Finally, let’s verify theorem 34.2 on page 734 on the convergence of our series. As you will see, the
proof is very similar to (and a bit simpler than) the proofs of convergence in chapter 31.

Assumptions and Claim

We are assuming that ω is some constant,

\[ \sum_{k=0}^{\infty} f_k x^k , \quad \sum_{k=0}^{\infty} p_k x^k \quad\text{and}\quad \sum_{k=0}^{\infty} q_k x^k \]

are power series convergent for |x| < R , J is a second-degree polynomial function, and $K_0$ is some nonnegative integer such that

\[ J(k) \neq 0 \quad\text{for}\quad k = K_0 , K_0 + 1 , K_0 + 2 , \ldots . \]

We are also assuming that we have a power series

\[ \sum_{k=0}^{\infty} c_k x^k \]

whose coefficients satisfy

\[ c_k = \frac{1}{J(k)} \left[ f_k - \sum_{j=0}^{k-1} c_j \left[ p_{k-j}(j + \omega) + q_{k-j} \right] \right] \quad\text{for}\quad k \geq K_0 . \]

The claim of the theorem is that $\sum_{k=0}^{\infty} c_k x^k$ converges for all x satisfying |x| < R . This, of course, can be verified by showing $\sum_{k=0}^{\infty} |c_k| \, |x|^k$ converges for each x in (−R, R) .


The Proof
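We start by letting x be any single value in (−R, R) . We then can (and do) choose X to be some value with |x| < X < R . Also, by the convergence of the series, we can (and do) choose M to be a positive value such that, for k = 0, 1, 2, . . . ,

\[ \left| f_k X^k \right| < M , \quad \left| p_k X^k \right| < M \quad\text{and}\quad \left| q_k X^k \right| < M . \]

Now consider the power series $\sum_{k=0}^{\infty} C_k x^k$ with

\[ C_k = |c_k| \quad\text{for}\quad k < K_0 \]

and

\[ C_k = \frac{1}{|J(k)|} \left[ M X^{-k} + \sum_{j=0}^{k-1} C_j \left[ M X^{-[k-j]} (j + |\omega|) + M X^{-[k-j]} \right] \right] \quad\text{for}\quad k \geq K_0 . \]

Comparing the recursion formulas for $c_k$ and $C_k$ , it is obvious that

\[ |c_k| \, |x|^k \leq C_k |x|^k \quad\text{for}\quad k = 0, 1, 2, \ldots . \]

Consequently, the convergence of $\sum_{k=0}^{\infty} c_k x^k$ can be confirmed by showing $\sum_{k=0}^{\infty} C_k |x|^k$ converges, and, by the limit ratio test (theorem 29.11 on page 578), that can be shown by verifying that

\[ \lim_{k \to \infty} \left| \frac{C_{k+1} x^{k+1}}{C_k x^k} \right| < 1 . \]

Fortunately, for $k > K_0$ ,

\begin{align*}
C_{k+1} &= \frac{1}{|J(k+1)|} \left[ M X^{-(k+1)} + \sum_{j=0}^{k} C_j \left[ M X^{-[k+1-j]} (j + |\omega|) + M X^{-[k+1-j]} \right] \right] \\
&= \frac{X^{-1}}{|J(k+1)|} \left[ M X^{-k} + \sum_{j=0}^{k-1} C_j \left[ M X^{-[k-j]} (j + |\omega|) + M X^{-[k-j]} \right] \right. \\
&\qquad\qquad\qquad \left. {} + C_k \left[ M X^{-[k-k]} (k + |\omega|) + M X^{-[k-k]} \right] \right] \\
&= \frac{X^{-1}}{|J(k+1)|} \Big( |J(k)| \, C_k + C_k M \left[ k + |\omega| + 1 \right] \Big) \\
&= \frac{|J(k)| + M \left[ k + |\omega| + 1 \right]}{|J(k+1)|} \cdot \frac{C_k}{X} .
\end{align*}

Thus,

\[ \left| \frac{C_{k+1} x^{k+1}}{C_k x^k} \right| = \frac{C_{k+1}}{C_k} |x| = \frac{|J(k)| + M \left[ k + |\omega| + 1 \right]}{|J(k+1)|} \cdot \frac{|x|}{X} . \]

Since J is a second-degree polynomial, you can easily verify that

\[ \lim_{k \to \infty} \frac{|J(k)| + M \left[ k + |\omega| + 1 \right]}{|J(k+1)|} = 1 . \]

Hence, since |x| < X ,

\[ \lim_{k \to \infty} \left| \frac{C_{k+1} x^{k+1}}{C_k x^k} \right| = \lim_{k \to \infty} \frac{|J(k)| + M \left[ k + |\omega| + 1 \right]}{|J(k+1)|} \cdot \frac{|x|}{X} = 1 \cdot \frac{|x|}{X} < 1 . \]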


Part VI
Systems of Differential
Equations

(A Brief Introduction)


35
Systems of Differential Equations:
A Starting Point

Thus far, we have been dealing with individual differential equations. But there are many applications
that lead to sets of differential equations sharing common solutions. These sets are generally referred
to as “systems of differential equations”.
In this chapter we will begin a brief discussion of these systems. Unfortunately, a complete
discussion goes beyond the scope of this text and requires that the reader has had a decent course
on linear algebra. So our discussion will be somewhat limited in scope and in the types of systems
considered. The goal is for the reader to begin understanding the basics, including why these
systems are important, and how some of the methods for analyzing their solutions can be especially
illuminating, even when the system comes from a single differential equation. In fact, these methods
are especially important in analyzing those differential equations that are not linear.

35.1 Basic Terminology and Notions

In general, a k th -order system of M differential equations with N unknowns is simply a collection
of M differential equations involving N unknown functions with k being the highest order of the
derivatives explicitly appearing in the equations. For brevity, we may refer to such a system as a
“k th -order M × N system”.
Our primary interest, however, will be in first-order N×N systems of differential equations
that can be written as

\[
\begin{aligned}
x_1' &= f_1(t, x_1, x_2, \ldots, x_N) \\
x_2' &= f_2(t, x_1, x_2, \ldots, x_N) \\
&\;\;\vdots \\
x_N' &= f_N(t, x_1, x_2, \ldots, x_N)
\end{aligned}
\tag{35.1}
\]

where the $x_j$'s are (initially unknown) real-valued functions of t (hence $x' = dx/dt$), and the
f k (t, x 1 , x 2 , . . . , x N )’s — the component functions of the system — are known functions of N + 1
variables. Note that each equation is a ﬁrst-order differential equation in which only one of the
unknown functions is differentiated.1 We will refer to such systems of differential equations as
either standard ﬁrst-order systems or, when extreme brevity is desired, as standard systems.

1 Also note that we have gone back to using t as the main variable.


While we will be consistent in using t as the variable in the initially unknown functions, we will
feel free to use whatever symbols seem convenient for these functions of t . In fact, when N = 2
or N = 3 (which will be the case for almost all of our examples), we will abandon subscripts and
denote our functions of t by x , y and, if needed, z , and we will write our generic systems as

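\[
\begin{aligned}
x' &= f(t, x, y) \\
y' &= g(t, x, y)
\end{aligned}
\qquad\text{or}\qquad
\begin{aligned}
x' &= f(t, x, y, z) \\
y' &= g(t, x, y, z) \\
z' &= h(t, x, y, z)
\end{aligned}
\; , \tag{35.2}
\]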

as appropriate.

!Example 35.1: Here is a simple standard ﬁrst-order system:

\[
\begin{aligned}
x' &= x + 2y \\
y' &= 5x - 2y
\end{aligned}
\; .
\]

Now suppose we have a standard first-order N×N system with unknown functions x 1 , x 2 , . . .
and x N , along with some interval of interest (α, β) . A solution to this system over this interval is
any ordered set of N specific real-valued functions x̂ 1 , x̂ 2 , . . . and x̂ N such that all the equations
in the system are satisfied for all values of t in the interval (α, β) when we let2

x 1 (t) = x̂ 1 (t) , x 2 (t) = x̂ 2 (t) , ... and x N (t) = x̂ N (t) .

A general solution to our system of differential equations (over (α, β) ) is any ordered set of N for-
mulas describing all possible such solutions. Typically, these formulas include arbitrary constants.3

!Example 35.2: Consider the system

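\[
\begin{aligned}
x' &= x + 2y \\
y' &= 5x - 2y
\end{aligned}
\]

over the entire real line, (−∞, ∞) . If we let

\[ x(t) = e^{3t} + 2e^{-4t} \quad\text{and}\quad y(t) = e^{3t} - 5e^{-4t} , \]

and plug these formulas for x and y into the first differential equation in our system,

\[ x' = x + 2y , \]

we get

\[
\begin{aligned}
& \frac{d}{dt}\left[ e^{3t} + 2e^{-4t} \right] = \left[ e^{3t} + 2e^{-4t} \right] + 2\left[ e^{3t} - 5e^{-4t} \right] \\
\rightarrow\quad & 3e^{3t} - 2 \cdot 4e^{-4t} = [1 + 2]e^{3t} + [2 - 2 \cdot 5]e^{-4t} \\
\rightarrow\quad & 3e^{3t} - 8e^{-4t} = 3e^{3t} - 8e^{-4t} ,
\end{aligned}
\]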

which is an equation valid for all values of t . So these two functions, x and y , satisfy the ﬁrst
differential equation in the system over (−∞, ∞) .
2 We could allow the $x_k(t)$'s to be complex valued, but this will not gain us anything with the systems of interest to us and
would complicate the “graphing” techniques we’ll later develop and employ.
3 And it is also typical that the precise interval of interest, (α, β) , is not explicitly stated, and may not even be precisely
known.


Likewise, it is easily seen that these two functions also satisfy the second equation:

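\[
\begin{aligned}
& y' = 5x - 2y \\
\rightarrow\quad & \frac{d}{dt}\left[ e^{3t} - 5e^{-4t} \right] = 5\left[ e^{3t} + 2e^{-4t} \right] - 2\left[ e^{3t} - 5e^{-4t} \right] \\
\rightarrow\quad & 3e^{3t} - 5(-4)e^{-4t} = [5 - 2]e^{3t} + [5 \cdot 2 - 2(-5)]e^{-4t} \\
\rightarrow\quad & 3e^{3t} + 20e^{-4t} = 3e^{3t} + 20e^{-4t} .
\end{aligned}
\]

Thus, the pair

\[ x(t) = e^{3t} + 2e^{-4t} \quad\text{and}\quad y(t) = e^{3t} - 5e^{-4t} \]

is a solution to our system over (−∞, ∞) .

More generally, you can easily verify that, for any choice of constants c1 and c2 ,

\[ x(t) = c_1 e^{3t} + 2c_2 e^{-4t} \quad\text{and}\quad y(t) = c_1 e^{3t} - 5c_2 e^{-4t} \]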

satisﬁes the given system (see exercise 35.4). In fact, we will later verify that the above is a
general solution to the system. (Note that the two formulas in the above general solution share
arbitrary constants. This will be typical.)
On the other hand, plugging the pair

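\[ x(t) = e^{3t} + 2e^{4t} \quad\text{and}\quad y(t) = 2e^{3t} + e^{4t} \]

into the first equation of our system yields

\[
\begin{aligned}
& x' = x + 2y \\
\rightarrow\quad & \frac{d}{dt}\left[ e^{3t} + 2e^{4t} \right] = \left[ e^{3t} + 2e^{4t} \right] + 2\left[ 2e^{3t} + e^{4t} \right] \\
\rightarrow\quad & 3e^{3t} + 8e^{4t} = 5e^{3t} + 4e^{4t} \\
\rightarrow\quad & 4e^{4t} = 2e^{3t} ,
\end{aligned}
\]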

which is not true for every real value t . So this last pair of functions is not a solution to our
system.

If, in addition to our standard first-order system, we have the value of every $x_k$ specified at
some single point t0 , then we have an initial-value problem, a solution of which is any solution to
the system that also satisfies the given initial values. Unsurprisingly, we usually solve initial-value
problems by first finding the general solution to the system, and then applying the initial conditions
to the general solution to determine the values of the ‘arbitrary’ constants.

!Example 35.3: Consider the initial-value problem consisting of the system from the previous
example,
\[
\begin{aligned}
x' &= x + 2y \\
y' &= 5x - 2y
\end{aligned}
\; ,
\]


along with the initial conditions

x(0) = 0 and y(0) = 1 .

In the previous example, it was asserted that the pair

\[ x(t) = c_1 e^{3t} + 2c_2 e^{-4t} \quad\text{and}\quad y(t) = c_1 e^{3t} - 5c_2 e^{-4t} \tag{35.3} \]

is a solution to our system for any choice of constants c1 and c2 . Using these formulas with the
initial conditions, we get

\[ 0 = x(0) = c_1 e^{3 \cdot 0} + 2c_2 e^{-4 \cdot 0} = c_1 + 2c_2 \]

and

\[ 1 = y(0) = c_1 e^{3 \cdot 0} - 5c_2 e^{-4 \cdot 0} = c_1 - 5c_2 . \]

So, to find c1 and c2 , we solve the simple algebraic linear system

\[
\begin{aligned}
c_1 + 2c_2 &= 0 \\
c_1 - 5c_2 &= 1
\end{aligned}
\; .
\]

Doing so however you wish, you should easily discover that

\[ c_1 = \frac{2}{7} \quad\text{and}\quad c_2 = -\frac{1}{7} , \]

which, after plugging these values back into the general formulas for x(t) and y(t) given in
equation set (35.3), yields the solution to the given initial-value problem,

\[ x(t) = \frac{2}{7} e^{3t} - \frac{2}{7} e^{-4t} \quad\text{and}\quad y(t) = \frac{2}{7} e^{3t} + \frac{5}{7} e^{-4t} . \]
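The small algebraic system for c1 and c2 can be solved however you wish; the following is one minimal Python sketch using Cramer's rule with exact fractions. The function name `solve_2x2` is our own illustrative choice, not something defined in the text.

```python
from fractions import Fraction

def solve_2x2(a11, a12, a21, a22, b1, b2):
    """Solve a11*u + a12*v = b1, a21*u + a22*v = b2 by Cramer's rule."""
    det = Fraction(a11 * a22 - a12 * a21)
    if det == 0:
        raise ValueError("system has no unique solution")
    u = Fraction(b1 * a22 - a12 * b2) / det
    v = Fraction(a11 * b2 - b1 * a21) / det
    return u, v

# c1 + 2*c2 = x(0) = 0   and   c1 - 5*c2 = y(0) = 1
c1, c2 = solve_2x2(1, 2, 1, -5, 0, 1)
print(c1, c2)
```

Other initial values x(0) and y(0) only change the last two arguments.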

By the way, you will occasionally hear the term “coupling” in describing the extent to which
each equation of the system contains different unknown functions. A system is completely uncoupled
if each equation involves just one of the unknown functions, as in

\[
\begin{aligned}
x' &= 5x + \sin(x) \\
y' &= 4y
\end{aligned}
\; ,
\]

and is weakly coupled or only partially coupled if at least one of the equations involves only one
unknown function, as in

\[
\begin{aligned}
x' &= 5x + 2y \\
y' &= 4y
\end{aligned}
\; .
\]

Such systems can be solved in the obvious manner. First solve each equation involving a single
unknown function, and then plug those solutions into the other equations, and deal with them.

!Example 35.4: Consider the system

\[
\begin{aligned}
x' &= 5x + 2y \\
y' &= 4y
\end{aligned}
\; .
\]

The second equation, $y' = 4y$ , is a simple linear and separable equation whose general solution
you can readily find to be

\[ y(t) = c_1 e^{4t} . \]


With this, the first equation in the system becomes

\[ x' = 5x + 2c_1 e^{4t} , \]

another first-order linear equation that you should have little trouble solving (see chapter 5). Its
general solution is

\[ x(t) = -2c_1 e^{4t} + c_2 e^{5t} . \]

So, the general solution to our system is the pair

\[ x(t) = -2c_1 e^{4t} + c_2 e^{5t} \quad\text{and}\quad y(t) = c_1 e^{4t} . \]
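For readers who want the omitted step, here is one way to solve the first equation, sketched with the usual integrating-factor approach for first-order linear equations (the choice of integrating factor $e^{-5t}$ is ours):

```latex
\begin{aligned}
x' - 5x &= 2c_1 e^{4t} \\
\frac{d}{dt}\left[ e^{-5t} x \right] &= 2c_1 e^{-t} \\
e^{-5t} x &= -2c_1 e^{-t} + c_2 \\
x(t) &= -2c_1 e^{4t} + c_2 e^{5t} .
\end{aligned}
```

As a check, differentiating gives $x' = -8c_1 e^{4t} + 5c_2 e^{5t}$, which is the same as $5x + 2c_1 e^{4t}$.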

For the most part, our interest will be in systems that are neither uncoupled nor weakly coupled.

35.2 A Few Illustrative Applications

Let us look at a few applications that naturally lead to standard 2×2 ﬁrst-order systems.

A Falling Object
Way back in section 1.2, we considered an object of mass m plummeting towards the ground under
the inﬂuence of gravity. As we did there, let us set
t = time (in seconds) since the object was dropped ,

y(t) = vertical distance (in meters) between the object and the ground at time t ,
and
v(t) = vertical velocity (in meters/second) of the object at time t .
We can view y and v as two unknown functions related by

\[ \frac{dy}{dt} = v . \]

Now, in developing a “better model” describing the fall (see the discussion starting on page 11), we
took into account air resistance and obtained

\[ \frac{dv}{dt} = -9.8 - \kappa v \]

where κ is a positive constant describing how strongly air resistance acts on the falling object. This
gives us a system of two differential equations with two unknown functions,

\[
\begin{aligned}
y' &= v \\
v' &= -9.8 - \kappa v
\end{aligned}
\; .
\]

Fortunately, this is a very weakly coupled system whose second equation is a simple first-order
equation involving only the function v . We’ve already solved it (in example 4.7 on page 75),
obtaining

\[ v(t) = v_0 + c_1 e^{-\kappa t} \quad\text{where}\quad v_0 = -\frac{9.8}{\kappa} . \]


[Figure 35.1 (diagram omitted): Tank A (500 gal.) receives 6 gal./min. of mix with 50% alcohol; Tank B (1,000 gal.) receives 6 gal./min. with 0% alcohol; each tank drains 6 gal./min., and 2 gal./min. is pumped each way between the tanks.]

Figure 35.1: A simple system of two tanks containing water/alcohol mixtures.

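Plugging this back into the first equation of the system yields

\[ \frac{dy}{dt} = v_0 + c_1 e^{-\kappa t} , \]

which is easily integrated:

\[ y(t) = \int \frac{dy}{dt}\, dt = \int \left[ v_0 + c_1 e^{-\kappa t} \right] dt = v_0 t - \frac{c_1}{\kappa} e^{-\kappa t} + c_2 . \]

So, the general solution to this system is the pair

\[ y(t) = v_0 t - \frac{c_1}{\kappa} e^{-\kappa t} + c_2 \quad\text{and}\quad v(t) = v_0 + c_1 e^{-\kappa t} . \]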
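To see how the two arbitrary constants get pinned down by initial data, here is a small Python sketch. The function name `falling_object` and the sample values (an object dropped from rest at 1000 meters with κ = 0.5) are hypothetical, chosen only for illustration.

```python
import math

def falling_object(y_init, v_init, kappa, g=9.8):
    """Return (y, v) solving y' = v, v' = -g - kappa*v with
    y(0) = y_init and v(0) = v_init, using the general solution
    y = v0*t - (c1/kappa)*exp(-kappa*t) + c2,  v = v0 + c1*exp(-kappa*t).
    """
    v0 = -g / kappa              # the constant v0 = -9.8/kappa from the text
    c1 = v_init - v0             # from v(0) = v0 + c1
    c2 = y_init + c1 / kappa     # from y(0) = -c1/kappa + c2
    y = lambda t: v0 * t - (c1 / kappa) * math.exp(-kappa * t) + c2
    v = lambda t: v0 + c1 * math.exp(-kappa * t)
    return y, v

# Hypothetical data: dropped from rest at 1000 m, kappa = 0.5 per second.
y, v = falling_object(1000.0, 0.0, 0.5)
print(y(0.0), v(0.0), v(60.0))
```

For large t the exponential term dies out, so v(t) settles toward v0.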

Mixing Problems with Multiple Tanks

Let us expand, slightly, our discussion of “mixing” from section 10.6 on page 211 by considering
the situation illustrated in ﬁgure 35.1. Here we have two tanks, A and B. Each minute 6 gallons
of a water/alcohol mix consisting of 50% alcohol is added to tank A, and 6 gallons of the mix in
tank A is drained out of the tank. At the same time, 6 gallons of pure water is added to tank B and
6 gallons of the mix in tank B is drained out of the tank. Meanwhile, the mix from each tank is
pumped into the other tank at the rate of 2 gallons per minute.
Following standard conventions, we will let
t = number of minutes since we started the mixing process ,

x = x(t) = amount (in gallons) of alcohol in tank A at time t ,

and
y = y(t) = amount (in gallons) of alcohol in tank B at time t .
Let us assume that tank A initially contains 500 gallons of pure water, while tank B initially contains
1,000 gallons of an alcohol-water mix with 90 percent of that mix being alcohol. Note that the input
and output ﬂows for each tank cancel out, leaving the total amount of mix in each tank constant. So,
our initial conditions are

\[ x(0) = 0 \quad\text{and}\quad y(0) = \frac{90}{100} \times 1000 = 900 , \]

and the concentrations of the alcohol at time t in tanks A and B are, respectively,

\[ \frac{x}{500} \quad\text{and}\quad \frac{y}{1000} . \]


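In this system we have six “flows” affecting the rate at which the amount of alcohol varies in each tank over
time, each corresponding to one of the pipes in figure 35.1. In each case the rate at which alcohol is
flowing is simply the total flow rate of the mix in the pipe times the concentration of alcohol in that
mix. Thus,

\[
\begin{aligned}
x' = \frac{dx}{dt} &= \text{change in the amount of alcohol in tank A per minute} \\
&= \text{rate alcohol is pumped into tank A from the outside} \\
&\quad + \text{rate alcohol is pumped into tank A from tank B} \\
&\quad - \text{rate alcohol is pumped from tank A into tank B} \\
&\quad - \text{rate alcohol is drained from tank A} \\
&= \left( 6 \times \frac{50}{100} \right) + \left( 2 \times \frac{y}{1000} \right) - \left( 2 \times \frac{x}{500} \right) - \left( 6 \times \frac{x}{500} \right) \\
&= 3 + \frac{2}{1000}\, y - \frac{8}{500}\, x ,
\end{aligned}
\]

and

\[
\begin{aligned}
y' = \frac{dy}{dt} &= \text{change in the amount of alcohol in tank B per minute} \\
&= \text{rate alcohol is pumped into tank B from the outside} \\
&\quad + \text{rate alcohol is pumped into tank B from tank A} \\
&\quad - \text{rate alcohol is pumped from tank B into tank A} \\
&\quad - \text{rate alcohol is drained from tank B} \\
&= (6 \times 0) + \left( 2 \times \frac{x}{500} \right) - \left( 2 \times \frac{y}{1000} \right) - \left( 6 \times \frac{y}{1000} \right) \\
&= \frac{2}{500}\, x - \frac{8}{1000}\, y .
\end{aligned}
\]

Thus, we have the system

\[
\begin{aligned}
x' &= -\frac{8}{500}\, x + \frac{2}{1000}\, y + 3 \\
y' &= \frac{2}{500}\, x - \frac{8}{1000}\, y
\end{aligned}
\; . \tag{35.4}
\]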
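The bookkeeping above (flow rate times concentration, pipe by pipe) is easy to get wrong, so here is a short Python sketch that builds the rates both ways and compares them. The function names are our own; exact fractions are used so that the comparison is not blurred by rounding.

```python
from fractions import Fraction

VOL_A, VOL_B = 500, 1000          # gallons of mix in each tank (constant)

def rates_from_flows(x, y):
    """Build (x', y') pipe by pipe: flow rate times alcohol concentration."""
    conc_a, conc_b = Fraction(x, VOL_A), Fraction(y, VOL_B)
    dx = 6 * Fraction(50, 100) + 2 * conc_b - 2 * conc_a - 6 * conc_a
    dy = 6 * 0 + 2 * conc_a - 2 * conc_b - 6 * conc_b
    return dx, dy

def system_35_4(x, y):
    """Right-hand sides of system (35.4) after collecting terms."""
    dx = -Fraction(8, 500) * x + Fraction(2, 1000) * y + 3
    dy = Fraction(2, 500) * x - Fraction(8, 1000) * y
    return dx, dy

# Rates at the initial state x(0) = 0, y(0) = 900 (gallons of alcohol).
print(rates_from_flows(0, 900), system_35_4(0, 900))
```

Both functions should agree for every whole-number pair (x, y), since the second is just the first with like terms combined.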

Rabbits and Gerbils: A Competing Species Model

A Single Species Competing with Itself
Back in chapter 10, we developed two models for population growth. Let us brieﬂy review the
“better” model, still assuming our population is a bunch of rabbits in an enclosed ﬁeld. In that model

R(t) = number of rabbits in the ﬁeld after t months

and
\[ R' = \frac{dR}{dt} = \text{change in the number of rabbits per month} = \beta R(t) \]

where β is the “net birth rate” (that is, ‘the number of new rabbits normally born each month per
rabbit’ minus ‘the fraction of the population that dies each month’). Under ideal conditions, β is a
constant β0 , which can be determined from the natural reproductive rate for rabbits and the natural
lifetime of a rabbit (see section 10.2). But assuming β is constant led to a model that predicted an


\[ y'' = 3y' - 8\cos(y) . \]

Let us now introduce a second “unknown” function x related to y by setting

\[ y' = x . \]

Differentiating this and then applying the last two equations yields

\[ x' = y'' = 3y' - 8\cos(y) = 3x - 8\cos(y) . \]

This (with the middle cut out), along with the definition of x , then gives us the 2×2 standard
first-order system

\[
\begin{aligned}
x' &= 3x - 8\cos(y) \\
y' &= x
\end{aligned}
\; .
\]

Thus, we have converted the single second-order differential equation

\[ y'' - 3y' + 8\cos(y) = 0 \]

to the above 2×2 standard ﬁrst-order system. If we can solve this system, then we would also
have the solution to the original single second-order differential equation. And even if we cannot
truly solve the system, we may be able to use methods that we will later develop for systems to
gain useful information about the desired solution y .

In general, any second-order differential equation that can be written as

\[ y'' = F(t, y, y') \]

can be converted to a standard 2×2 first-order system by introducing a new unknown function x
related to y via

\[ y' = x , \]

and then observing that

\[ x' = y'' = F(t, y, y') = F(t, y, x) . \]

This gives us the system

\[
\begin{aligned}
x' &= F(t, y, x) \\
y' &= x
\end{aligned}
\; .
\]

Let us cleverly call this system the ﬁrst-order system corresponding to the original differential
equation. If we can solve this system, then we automatically have the solution to our original


differential equation. And even if we cannot easily solve the system, we will ﬁnd that some of the
tools we’ll later develop for ﬁrst-order systems will greatly aid us in analyzing the possible solutions,
especially when the original equation is not linear.
Do note that, using this procedure, any second-order set of initial values

\[ y(t_0) = a_1 \quad\text{and}\quad y'(t_0) = a_2 \]

is converted to initial values

\[ x(t_0) = a_2 \quad\text{and}\quad y(t_0) = a_1 . \]

!Example 35.6: Consider the second-order initial-value problem

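\[ y'' - \sin(y)\, y' = 0 \quad\text{with}\quad y(0) = 2 \quad\text{and}\quad y'(0) = 3 . \]

Rewriting the differential equation as

\[ y'' = \sin(y)\, y' \]

and letting $x = y'$ (so that $x' = y''$ ) lead to

\[ x' = y'' = \sin(y)\, y' = \sin(y)\, x = x \sin(y) , \]

giving us the standard first-order system

\[
\begin{aligned}
x' &= x \sin(y) \\
y' &= x
\end{aligned}
\]

with initial conditions

\[ x(0) = y'(0) = 3 \quad\text{and}\quad y(0) = 2 . \]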

Converting Higher-Order Differential Equations and Systems

What we just did with a second-order differential equation can easily be extended to convert a
third-order differential equation
\[ y''' = F(t, y, y', y'') \]

to a standard first-order system with three differential equations. All we do is introduce two new
functions x and z that are related to y and each other by

\[ x = y' \quad\text{and}\quad z = x' = y'' . \]

Then we have

\[ z' = y''' = F(t, y, y', y'') = F(t, y, x, z) , \]

giving us the standard system

\[
\begin{aligned}
x' &= z \\
y' &= x \\
z' &= F(t, y, x, z)
\end{aligned}
\; .
\]

!Example 35.7: Consider the third-order differential equation

\[ y''' - 3y'' + \sin(y)\, y' = 0 , \]


which we will rewrite as

\[ y''' = 3y'' - \sin(y)\, y' . \]

Introducing the functions x and z related to each other and to y by

\[ x = y' \quad\text{and}\quad z = x' = y'' \]

and observing that

\[ z' = y''' = 3y'' - \sin(y)\, y' = 3z - \sin(y)\, x , \]

we see that the first-order system of three equations corresponding to our original third-order
equation is

\[
\begin{aligned}
x' &= z \\
y' &= x \\
z' &= 3z - \sin(y)\, x
\end{aligned}
\; .
\]

Needless to say, the above can be extended to a simple process for converting any Nth-order
differential equation that can be written as

\[ y^{(N)} = F\left( t, y, y', y'', \ldots, y^{(N-1)} \right) \]

to an N ×N standard first-order system. The biggest difficulty is that, if N > 3 , you will probably
want to start using subscripted functions. We’ll see this later in section 35.5.
It should also be clear that this process can be applied to convert higher-order systems of
differential equations to standard first-order systems. The details are easy enough to figure out if
you ever need to do so.
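The conversion process is mechanical enough to be written as a few lines of code. The following Python sketch (the name `to_first_order` is hypothetical) takes the function F and returns the right-hand side of the corresponding first-order system, assuming the subscripted ordering x1 = y, x2 = y', ..., xN = y^(N-1) rather than the x, y, z naming used in the examples above.

```python
import math

def to_first_order(F):
    """Given F with y^(N) = F(t, y, y', ..., y^(N-1)), return the
    right-hand side of the corresponding standard first-order system.

    The state is the list [x1, ..., xN] = [y, y', ..., y^(N-1)].
    """
    def rhs(t, xs):
        # x1' = x2, ..., x_{N-1}' = xN, and xN' = F(t, x1, ..., xN)
        return list(xs[1:]) + [F(t, *xs)]
    return rhs

# Example 35.7 rewritten as y''' = 3y'' - sin(y) y'.
rhs = to_first_order(lambda t, y, dy, d2y: 3 * d2y - math.sin(y) * dy)
print(rhs(0.0, [0.0, 1.0, 2.0]))
```

The returned function has the form that numerical methods for first-order systems typically expect: it maps the current time and state to the list of derivatives.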

Why Do It?
It turns out that some of the useful methods we developed to deal with first-order differential
equations can be adapted to deal with standard first-order systems. These
methods include the use of slope fields (see chapter 8) and Euler’s numerical method (see chapter
9). Consequently, the above process for converting a single second-order differential equation to
a first-order system can be an important tool in analyzing solutions to second-order equations that
cannot be easily handled by the methods previously discussed in this text. In particular, this conver-
sion process is a particularly important element in the study of nonlinear second-order differential
equations.
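To give a feel for how little changes when Euler's method is applied to a system, here is a minimal Python sketch. It assumes the usual Euler update applied to every unknown function at once; the function name `euler_system`, the step size, and the number of steps are illustrative choices only. The system and initial values are those of example 35.3, so the result can be compared with the exact solution found there.

```python
import math

def euler_system(f, t0, state0, h, steps):
    """Euler's method for x' = f(t, x), with x a list of N values."""
    t, state = t0, list(state0)
    for _ in range(steps):
        slopes = f(t, state)
        state = [s + h * k for s, k in zip(state, slopes)]
        t += h
    return t, state

# The system x' = x + 2y, y' = 5x - 2y with x(0) = 0, y(0) = 1.
f = lambda t, s: [s[0] + 2 * s[1], 5 * s[0] - 2 * s[1]]
t_end, (x_num, y_num) = euler_system(f, 0.0, [0.0, 1.0], 1e-4, 10000)

# Exact values from example 35.3 for comparison.
x_exact = (2 * math.exp(3 * t_end) - 2 * math.exp(-4 * t_end)) / 7
y_exact = (2 * math.exp(3 * t_end) + 5 * math.exp(-4 * t_end)) / 7
print(x_num, x_exact, y_num, y_exact)
```

The numerical and exact values should agree to a few decimal places, with the gap shrinking as the step size h is reduced.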

Another Example: The Pendulum

There is one more system that we will want for future use. This is the system describing the motion
of the pendulum illustrated in ﬁgure 35.2. This pendulum consists of a small weight of mass m
attached at one end of a massless rigid rod, with the other end of the rod attached to a pivot so that
the weight can swing around in a circle of radius L in a vertical plane. The forces acting on this
pendulum are the downward force of gravity and, possibly, a frictional force from either friction at
the pivot point or air resistance.
Let us describe the motion of this pendulum using the angle θ from the vertical downward line
through the pivot to the rod, measured (in radians) in the counterclockwise direction. This means that
$d\theta/dt$ is positive when the pendulum is moving counterclockwise, and is negative when the pendulum
is moving clockwise.


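[Figure 35.2 (diagram omitted): the diagram labels the angle θ, the unit tangent vector T, the gravitational force mg, and its tangential component F_grav,tan.]

Figure 35.2: The pendulum system with a weight of mass m attached to a massless rod of length
L swinging about a pivot point under the influence of gravity.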

Since the motion is circular, and the ‘positive’ direction is counterclockwise, our interest is in
the components of velocity, acceleration and force in the direction of vector T illustrated in figure
35.2. This is the unit vector tangent to the circle of motion pointing in the counterclockwise direction
from the current location of the weight. From basic physics and geometry, we know these tangential
components of the weight’s velocity and acceleration are
\[ v_{\text{tan}} = L \frac{d\theta}{dt} = L\theta' \quad\text{and}\quad a_{\text{tan}} = L \frac{d^2\theta}{dt^2} = L\theta'' . \]

Using basic physics and trigonometry (and figure 35.2), we see that the corresponding component
of gravitational force is

\[ F_{\text{grav,tan}} = -mg \sin(\theta) . \]

For the frictional force, we'll use what we've used several times before,

\[ F_{\text{fric,tan}} = -\gamma v_{\text{tan}} = -\gamma L \theta' \]

where γ is some nonnegative constant (either zero if this is an ideal pendulum having no friction,
or a small to large positive value corresponding to a small to large frictional force acting on the
pendulum).
Writing out the classic “ ma = F ” equation, we have

\[ m L \theta'' = m a_{\text{tan}} = F_{\text{grav,tan}} + F_{\text{fric,tan}} = -mg \sin(\theta) - \gamma L \theta' . \]

Cutting out the middle and dividing through by m L then gives us the slightly simpler equation

\[ \theta'' = -\frac{g}{L} \sin(\theta) - \kappa \theta' \quad\text{where}\quad \kappa = \frac{\gamma}{m} . \tag{35.7} \]

Observe that, because of the sin(θ ) term, this second-order differential equation is nonlinear.
To convert this equation to a first-order system, let ω be the angular velocity, dθ/dt . So

\[ \theta' = \frac{d\theta}{dt} = \omega \]

and

\[ \omega' = \theta'' = -\frac{g}{L} \sin(\theta) - \kappa \theta' = -\frac{g}{L} \sin(\theta) - \kappa \omega , \]

giving us the system

\[
\begin{aligned}
\theta' &= \omega \\
\omega' &= -\frac{g}{L} \sin(\theta) - \kappa \omega
\end{aligned}
\; . \tag{35.8}
\]
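Since system (35.8) is nonlinear, one practical way to explore it is numerically. The Python sketch below codes the right-hand sides of (35.8) and steps the system forward in time. Note that it uses the classical fourth-order Runge-Kutta method, which is our substitution for accuracy and is not the Euler method of chapter 9; the parameter values (g = 9.8, rod length 1, and the two values of κ) are hypothetical.

```python
import math

def pendulum_rhs(theta, omega, g=9.8, length=1.0, kappa=0.0):
    """Right-hand sides of system (35.8): returns (theta', omega')."""
    return omega, -(g / length) * math.sin(theta) - kappa * omega

def rk4_step(theta, omega, h, **params):
    """One classical fourth-order Runge-Kutta step for system (35.8)."""
    k1 = pendulum_rhs(theta, omega, **params)
    k2 = pendulum_rhs(theta + h / 2 * k1[0], omega + h / 2 * k1[1], **params)
    k3 = pendulum_rhs(theta + h / 2 * k2[0], omega + h / 2 * k2[1], **params)
    k4 = pendulum_rhs(theta + h * k3[0], omega + h * k3[1], **params)
    theta += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
    omega += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return theta, omega

def simulate(theta0, omega0, h=0.001, steps=5000, **params):
    """Advance (theta, omega) from t = 0 to t = h*steps."""
    theta, omega = theta0, omega0
    for _ in range(steps):
        theta, omega = rk4_step(theta, omega, h, **params)
    return theta, omega

print(simulate(1.0, 0.0))               # ideal pendulum, kappa = 0
print(simulate(1.0, 0.0, kappa=0.5))    # with friction
```

With κ > 0 the swings should gradually shrink, while with κ = 0 the pendulum keeps returning to its starting amplitude.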


35.4 Using Laplace Transforms to Solve Systems

If this were a slightly more advanced treatment of systems (and if we had more space), we would
discuss solving certain simple systems using “eigenvalues” and “eigenvectors”. But that takes us
beyond the scope of this text (in which an understanding of such terms is not assumed). So, as an
alternative, let us observe that, if a system can be written as

\[
\begin{aligned}
x' &= ax + by + f(t) \\
y' &= cx + dy + g(t)
\end{aligned}
\]

where a , b , c and d are constants, and f and g are ‘reasonable’ functions on (0, ∞) , then we
can ﬁnd the solutions to this system by taking the Laplace transform of each equation, solving the
resulting algebraic system for

X = L[x] and Y = L[y]

and then taking the inverse Laplace transform of each. If initial conditions x(0) and y(0) are
known, then use them when taking the transforms of the derivatives. Otherwise, just view x(0) and
y(0) as arbitrary constants, possibly renaming them x 0 and y0 for convenience.
One example should sufﬁce.

!Example 35.8: Consider solving the system

\[
\begin{aligned}
x' &= x + 2y + 10e^{4t} \\
y' &= 2x + y
\end{aligned}
\]

with initial conditions

x(0) = 0 and y(0) = 2 .
Taking the Laplace transform of the ﬁrst equation:

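\[
\begin{aligned}
& \mathcal{L}[x']\big|_s = \mathcal{L}[x]\big|_s + 2\mathcal{L}[y]\big|_s + 10\,\mathcal{L}\!\left[ e^{4t} \right]\big|_s \\
\rightarrow\quad & sX(s) - x(0) = X(s) + 2Y(s) + \frac{10}{s-4} \\
\rightarrow\quad & sX(s) - 0 = X(s) + 2Y(s) + \frac{10}{s-4} \\
\rightarrow\quad & [s-1]X(s) - 2Y(s) = \frac{10}{s-4} .
\end{aligned}
\]

Doing the same with the second equation:

\[
\begin{aligned}
& \mathcal{L}[y']\big|_s = 2\mathcal{L}[x]\big|_s + \mathcal{L}[y]\big|_s \\
\rightarrow\quad & sY(s) - y(0) = 2X(s) + Y(s) \\
\rightarrow\quad & sY(s) - 2 = 2X(s) + Y(s) \\
\rightarrow\quad & -2X(s) + [s-1]Y(s) = 2 .
\end{aligned}
\]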


So the transforms X = L[x] and Y = L[y] must satisfy the algebraic system
\[
\begin{aligned}
{}[s-1]X(s) - 2Y(s) &= \frac{10}{s-4} \\
-2X(s) + [s-1]Y(s) &= 2
\end{aligned}
\; . \tag{35.9}
\]

It is relatively easy to solve the above system. To find X (s) we can first add s − 1 times
the first equation to 2 times the second, and then apply a bit more algebra, to obtain

\[
\begin{aligned}
& [s-1]^2 X(s) - 4X(s) = \frac{10(s-1)}{s-4} + 4 \\
\rightarrow\quad & \left[ s^2 - 2s - 3 \right] X(s) = \frac{10(s-1) + 4(s-4)}{s-4} \\
\rightarrow\quad & (s+1)(s-3) X(s) = \frac{14s - 26}{s-4} \\
\rightarrow\quad & X(s) = \frac{14s - 26}{(s+1)(s-3)(s-4)} .
\end{aligned}
\tag{35.10}
\]

Similarly, to find Y (s) , we can add 2 times the first equation in system (35.9) to s − 1
times the second equation, obtaining

\[ -4Y(s) + [s-1]^2 Y(s) = \frac{20}{s-4} + 2(s-1) . \]

Solving this for Y (s) then, eventually, yields

\[ Y(s) = \frac{2s^2 - 10s + 28}{(s+1)(s-3)(s-4)} . \tag{35.11} \]

Finding the formulas for X and Y was easy. Now we need to compute the formulas for
x = L−1 [X] and y = L−1 [Y ] from formulas (35.10) and (35.11) using the theory, techniques
and tricks for ﬁnding inverse Laplace transforms developed in Part IV of this text:

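\[ x(t) = \mathcal{L}^{-1}[X(s)]\big|_t = \mathcal{L}^{-1}\!\left[ \frac{14s - 26}{(s+1)(s-3)(s-4)} \right]\bigg|_t = \cdots = 6e^{4t} - 2e^{-t} - 4e^{3t} \]

and

\[ y(t) = \mathcal{L}^{-1}[Y(s)]\big|_t = \mathcal{L}^{-1}\!\left[ \frac{2s^2 - 10s + 28}{(s+1)(s-3)(s-4)} \right]\bigg|_t = \cdots = 4e^{4t} + 2e^{-t} - 4e^{3t} . \]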

This is less easy, and, as you may have noted, the details are left to the reader “as a review of
Laplace transforms” (and to save space here).
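If you would like to check your partial-fraction work on formulas (35.10) and (35.11), the following Python sketch computes the coefficients for a rational function whose denominator is a product of distinct linear factors. It uses the standard "cover-up" evaluation at each pole, and assumes the numerator has lower degree than the denominator; the function name is our own.

```python
from fractions import Fraction

def simple_pole_coefficients(numerator, poles):
    """Partial-fraction coefficients of N(s) / prod(s - p) for distinct poles.

    `numerator` lists the coefficients of N(s), highest power first.
    Returns {p: A_p}, so the inverse transform is the sum of A_p * exp(p*t).
    """
    def N(s):
        value = Fraction(0)
        for c in numerator:          # Horner's rule
            value = value * s + c
        return value

    coeffs = {}
    for p in poles:
        denom = Fraction(1)
        for q in poles:
            if q != p:
                denom *= (p - q)
        coeffs[p] = N(Fraction(p)) / denom
    return coeffs

poles = [-1, 3, 4]
print(simple_pole_coefficients([14, -26], poles))       # X(s), formula (35.10)
print(simple_pole_coefficients([2, -10, 28], poles))    # Y(s), formula (35.11)
```

Each coefficient multiplies the exponential with the matching pole, which is how the formulas for x(t) and y(t) above can be read off.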


35.5 Existence, Uniqueness and General Solutions for Systems
Existence and Uniqueness of Solutions
In section 3.3, two theorems were given — theorems 3.1 and 3.2 — describing conditions ensuring
the existence and uniqueness of solutions to ﬁrst-order differential equations. Here are the “systems
versions” of those theorems:

Theorem 35.1 (existence and uniqueness for general systems)

Consider an N × N standard ﬁrst-order initial-value problem
\[
\begin{aligned}
x_1' &= f_1(t, x_1, x_2, \ldots, x_N) \\
x_2' &= f_2(t, x_1, x_2, \ldots, x_N) \\
&\;\;\vdots \\
x_N' &= f_N(t, x_1, x_2, \ldots, x_N)
\end{aligned}
\]
with
( x 1 (t0 ) , x 2 (t0 ) , . . . , x N (t0 ) ) = ( a1 , a2 , . . . , a N ) .
Suppose each f j and ∂ f j/∂ xk are continuous functions on an open region of the T X 1 X 2 · · · X N–space
containing the point (t0 , a1 , a2 , . . . , a N ) . This initial-value problem then has exactly one solution
( x 1 , x 2 , . . . , x N ) = ( x 1 (t) , x 2 (t) , . . . , x N (t) )
over some open interval (α, β) containing t0 . Moreover, each x k and its derivative are continuous
over that interval.

Theorem 35.2
Consider an N × N standard ﬁrst-order initial-value problem
\[
\begin{aligned}
x_1' &= f_1(t, x_1, x_2, \ldots, x_N) \\
x_2' &= f_2(t, x_1, x_2, \ldots, x_N) \\
&\;\;\vdots \\
x_N' &= f_N(t, x_1, x_2, \ldots, x_N)
\end{aligned}
\]
with
( x 1 (t0 ) , x 2 (t0 ) , . . . , x N (t0 ) ) = ( a1 , a2 , . . . , a N )
over an interval (α, β) containing t0 , and with each f j being a continuous function on the inﬁnite
strip
R = { (t, x 1 , x 2 , . . . , x N ) : α < t < β and − ∞ < x k < ∞ for k = 1, 2, . . . , N } .
Further suppose that, on R , each $\partial f_j/\partial x_k$ is continuous and is a function of t only. Then the
initial-value problem has exactly one solution
( x 1 , x 2 , . . . , x N ) = ( x 1 (t) , x 2 (t) , . . . , x N (t) )
over (α, β) . Moreover, each x k and its derivative are continuous on that interval.

The proofs of the above two theorems are simply multidimensional versions of the proofs
already discussed in sections 3.4, 3.5 and 3.6 for theorems 3.1 and 3.2.


General Solutions
In exercise 35.4, you either have veriﬁed or will verify that, for any choice of constants c1 and c2 ,
the pair
\[ x(t) = c_1 e^{3t} + 2c_2 e^{-4t} \quad\text{and}\quad y(t) = c_1 e^{3t} - 5c_2 e^{-4t} \]

is a solution to the system

\[
\begin{aligned}
x' &= x + 2y \\
y' &= 5x - 2y
\end{aligned}
\; .
\]

But is the above a general solution, or could there be other solutions not described by the above?
Well, let us consider a more general situation; namely, let us consider “ﬁnding” the general
solution to a 2×2 standard system
\[
\begin{aligned}
x' &= f(t, x, y) \\
y' &= g(t, x, y)
\end{aligned}
\]

over some interval (α, β) containing a point t0 . Let us further assume that the component functions
are “reasonable”. To be speciﬁc, we’ll assume that f , g and their partial derivatives with respect
to x and y are continuous on the entire XY –plane when α < t < β . (This will allow us to apply
theorem 35.1 without getting bogged down in technicalities. Besides, all the 2×2 systems we’ll see
satisfy these conditions.)
Now suppose (as in the case we ﬁrst started with) that we have found two formulas of t , c1
and c2
x̂(t, c1 , c2 ) and ŷ(t, c1 , c2 )
such that both of the following hold:

1. For each choice of real constants c1 and c2 , the pair

x(t) = x̂(t, c1 , c2 ) and y(t) = ŷ(t, c1 , c2 )

is a solution to our standard system over (α, β) .

2. For each ordered pair of real numbers A and B , there is a corresponding pair of real constants
C 1 and C 2 such that

x̂(t0 , C 1 , C 2 ) = A and ŷ(t0 , C 1 , C 2 ) = B .

Now, let $x_1(t)$ and $y_1(t)$ be any particular solution pair to our system on (α, β) , and let
$A = x_1(t_0)$ and $B = y_1(t_0)$ . Then, of course, $x_1$ and $y_1$ , together, satisfy the initial-value problem

\[
\begin{aligned}
x' &= f(t, x, y) \\
y' &= g(t, x, y)
\end{aligned}
\]

with
(x(t0 ), y(t0 )) = (A, B) .
In addition, by our assumptions, we can also ﬁnd constants C 1 and C 2 such that

x(t) = x̂(t, C 1 , C 2 ) and y(t) = ŷ(t, C 1 , C 2 )

is a solution to our standard system over (α, β) with

x(t0 ) = x̂(t0 , C 1 , C 2 ) = A and y(t0 ) = ŷ(t0 , C 1 , C 2 ) = B .


So, we have two solution pairs for the above initial-value problem, the pair

x 1 (t) and y1 (t)

and the pair
x(t) = x̂(t, C 1 , C 2 ) and y(t) = ŷ(t, C 1 , C 2 ) .

But theorem 35.1 tells us that there can only be one solution to the above initial-value problem. So
our two solutions must be the same; that is, we must have

x 1 (t) = x̂(t, C 1 , C 2 ) and y1 (t) = ŷ(t, C 1 , C 2 ) for α<t <β .

This means that every solution to our 2×2 standard system is given by

x(t) = x̂(t, c1 , c2 ) and y(t) = ŷ(t, c1 , c2 )

for some choice of constants c1 and c2 . In other words, we have the following theorem:

Theorem 35.3
Let (α, β) be an interval containing a point t0 , and consider a 2×2 standard ﬁrst-order initial-value
problem
\[
\begin{aligned}
x' &= f(t, x, y) \\
y' &= g(t, x, y)
\end{aligned}
\]

in which f , g and their partial derivatives with respect to x and y are continuous functions of all
variables when t is in (α, β) . Assume, further, that

x̂(t, c1 , c2 ) and ŷ(t, c1 , c2 )

are formulas of three variables satisfying both of the following:

1. For each choice of real constants c1 and c2 , the pair

x(t) = x̂(t, c1 , c2 ) and y(t) = ŷ(t, c1 , c2 )

satisﬁes the above ﬁrst-order system over (α, β) .

2. For each ordered pair of real numbers (A, B) , there is a corresponding ordered pair of real
numbers (C 1 , C 2 ) such that

x̂(t0 , C 1 , C 2 ) = A and ŷ(t0 , C 1 , C 2 ) = B .

Then, treating c1 and c2 as arbitrary constants, the pair

x̂(t, c1 , c2 ) and ŷ(t, c1 , c2 )

is a general solution to the above system over (α, β) .

!Example 35.9: As already noted, the pair

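\[ x(t) = \underbrace{c_1 e^{3t} + 2c_2 e^{-4t}}_{\hat{x}(t, c_1, c_2)} \quad\text{and}\quad y(t) = \underbrace{c_1 e^{3t} - 5c_2 e^{-4t}}_{\hat{y}(t, c_1, c_2)} \]

satisfies

\[
\begin{aligned}
x' &= x + 2y \\
y' &= 5x - 2y
\end{aligned}
\]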


over the entire real line, (−∞, ∞) for any choice of constants c1 and c2 . Moreover, it is
easily seen that this ﬁrst-order system satisﬁes the conditions required in the last theorem with
(α, β) = (−∞, ∞) . Hence, this theorem will assure us that the formula pair (x̂, ŷ) is a general
solution if, for each ordered pair of real numbers (A, B) , there is a corresponding ordered pair
of real numbers (C 1 , C 2 ) such that

x̂(t0 , C 1 , C 2 ) = A and ŷ(t0 , C 1 , C 2 ) = B (35.12)

where t0 is some ﬁxed point, say, t0 = 0 . That is easily conﬁrmed by simply solving this system
for C 1 and C 2 in terms of A and B . Using our formulas for x̂ and ŷ with t0 = 0 , we see
that system (35.12) reduces to the simple algebraic system

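\[
\begin{aligned}
C_1 + 2C_2 &= A \\
C_1 - 5C_2 &= B
\end{aligned}
\; .
\]

Subtracting the second equation from the first yields

\[ 7C_2 = A - B \quad\Longrightarrow\quad C_2 = \frac{A - B}{7} . \]

Plugging this back into the first equation gives

\[ C_1 + 2\, \frac{A - B}{7} = A \quad\Longrightarrow\quad C_1 = \frac{5A + 2B}{7} . \]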

Thus, for any pair A and B , we can clearly ﬁnd C 1 and C 2 such that system (35.12) holds
when t0 = 0 . Theorem 35.3 then assures us that the pair

\[ x(t) = c_1 e^{3t} + 2c_2 e^{-4t} \quad\text{and}\quad y(t) = c_1 e^{3t} - 5c_2 e^{-4t} \]

is, in fact, a general solution. That is, every solution is given by the above for some choice of c1
and c2 .

The derivation leading to theorem 35.3 can be easily redone starting with an N × N standard
system, and using K arbitrary constants (where N and K are positive integers), giving us:

Theorem 35.4
Let (α, β) be an interval containing a point t0 , and consider an N × N standard ﬁrst-order initial-
value problem
\[
\begin{aligned}
x_1' &= f_1(t, x_1, x_2, \ldots, x_N) \\
x_2' &= f_2(t, x_1, x_2, \ldots, x_N) \\
&\;\;\vdots \\
x_N' &= f_N(t, x_1, x_2, \ldots, x_N)
\end{aligned}
\]

in which each $f_j$ and $\partial f_j/\partial x_k$ is a continuous function of all variables when t is in (α, β) . Assume,
further, that

x̂ 1 (t, c1 , c2 , . . . , c K ) , x̂ 2 (t, c1 , c2 , . . . , c K ) , ... and x̂ N (t, c1 , c2 , . . . , c K )

are N formulas of K + 1 variables satisfying both of the following:


1. For each choice of real constants c1 , c2 , . . . and c K , the ordered set of N functions x 1 ,
x 2 , . . . and x N given by

x n (t) = x̂ n (t, c1 , c2 , . . . , c K ) for n = 1, 2, . . . , N

satisﬁes the above ﬁrst-order system over (α, β) .

2. For each ordered N-tuple of real numbers (A1 , A2 , . . . , A N ) , there is a corresponding K-tuple
of real numbers (C 1 , C 2 , . . . , C K ) such that

\[ \hat{x}_n(t_0, C_1, C_2, \ldots, C_K) = A_n \quad\text{for } n = 1, 2, \ldots, N . \]

Then, treating the ck ’s as arbitrary constants,

x̂ 1 (t, c1 , c2 , . . . , c K ) , x̂ 2 (t, c1 , c2 , . . . , c K ) , ... and x̂ N (t, c1 , c2 , . . . , c K )

forms a general solution to the above system over (α, β) .

It should be noted that, using linear algebra, it can easily be shown that we must have K ≥ N
for condition 2 in the last theorem to hold (with K = N in most cases of interest).

35.6 Single N th -Order Differential Equations

Several theorems regarding the existence and uniqueness of solutions to a single second- or higher-
order differential equation were given near the end of chapter 11. It is worth noting that all of these
theorems can be derived from the results in the last section. To see this, let us consider a single
N th -order initial-value problem

\[
y^{(N)} = F\left(t, y, y', \ldots, y^{(N-1)}\right) \tag{35.13a}
\]
with
\[
y(t_0) = a_1 , \quad y'(t_0) = a_2 , \quad \ldots \quad\text{and}\quad y^{(N-1)}(t_0) = a_N . \tag{35.13b}
\]

In this:

1. We are using t as the basic variable (so $y = y(t)$ and $y^{(k)} = d^k y/dt^k$).

2. $t_0, a_1, a_2, \ldots$ and $a_N$ are fixed real values.

3. $F(t, x_1, x_2, \ldots, x_N)$ is a function of N + 1 variables on some open region R of
(N + 1)-dimensional space containing the point $(t_0, a_1, a_2, \ldots, a_N)$.

Our goal will be to derive theorem 11.3 on page 240. So assume, as in the theorem, that F and each
$\partial F/\partial x_k$ is continuous on the region R.
Let us now convert this single differential equation to a standard ﬁrst-order system by introducing
N new unknown functions x 1 , x 2 , . . . and x N related to y via

\[
x_1 = y , \quad x_2 = y' , \quad x_3 = y'' , \quad \ldots \quad\text{and}\quad x_N = y^{(N-1)} .
\]

Then we have
\[
\begin{aligned}
x_1' &= y' = x_2 , \\
x_2' &= (y')' = y'' = x_3 , \\
&\;\;\vdots \\
x_{N-1}' &= \left(y^{(N-2)}\right)' = y^{(N-1)} = x_N ,
\end{aligned}
\]
and, finally,
\[
x_N' = \left(y^{(N-1)}\right)' = y^{(N)} = F\left(t, y, y', \ldots, y^{(N-1)}\right) = F(t, x_1, x_2, x_3, \ldots, x_N) .
\]

Thus, our original initial-value problem can be rewritten as

\[
\begin{aligned}
x_1'(t) &= f_1(t, x_1, x_2, \ldots, x_N) \\
x_2'(t) &= f_2(t, x_1, x_2, \ldots, x_N) \\
&\;\;\vdots \\
x_N'(t) &= f_N(t, x_1, x_2, \ldots, x_N)
\end{aligned}
\]
with
\[
( x_1(t_0) , x_2(t_0) , \ldots , x_N(t_0) ) = ( a_1 , a_2 , \ldots , a_N ) ,
\]
where
\[
\begin{aligned}
f_1(t, x_1, x_2, \ldots, x_N) &= x_2 , \\
f_2(t, x_1, x_2, \ldots, x_N) &= x_3 , \\
&\;\;\vdots
\end{aligned}
\]
and
\[
f_{N-1}(t, x_1, x_2, \ldots, x_N) = x_N ,
\]
while
\[
f_N(t, x_1, x_2, \ldots, x_N) = F(t, x_1, x_2, x_3, \ldots, x_N) .
\]

It is almost trivial to verify that each $f_j$ and each $\partial f_j/\partial x_k$ with $j \neq N$ is a continuous function
on the region R. Moreover, the assumptions made on F ensure that $f_N$ and the $\partial f_N/\partial x_k$'s are also
continuous functions on R. This means theorem 35.1 on page 765 applies, telling us that

1. the above N ×N initial-value problem has exactly one solution (x 1 , x 2 , . . . , x N ) over some
open interval (α, β) containing t0 ,
and, moreover,
2. each x k and its derivative are continuous over the interval (α, β) .

But, since
\[
\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_N \end{bmatrix}
= \begin{bmatrix} y \\ y' \\ \vdots \\ y^{(N-1)} \end{bmatrix}
\quad\text{and}\quad
\begin{bmatrix} x_1' \\ x_2' \\ \vdots \\ x_N' \end{bmatrix}
= \begin{bmatrix} y' \\ y'' \\ \vdots \\ y^{(N)} \end{bmatrix} ,
\]

the above results concerning the x k ’s tell us that there is exactly one y and that it and its derivatives
up to order N are continuous over the interval (α, β) . In other words, we’ve veriﬁed the following
theorem:

Theorem 35.5 (existence and uniqueness for N th -order initial-value problems)

Consider an N th -order initial-value problem

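\[
y^{(N)} = F\left(t, y, y', \ldots, y^{(N-1)}\right)
\]
with
\[
y(t_0) = a_1 , \quad y'(t_0) = a_2 , \quad \ldots \quad\text{and}\quad y^{(N-1)}(t_0) = a_N
\]
in which $F = F(t, x_1, x_2, \ldots, x_N)$ and the corresponding partial derivatives
\[
\frac{\partial F}{\partial x_1} , \quad \frac{\partial F}{\partial x_2} , \quad \ldots \quad\text{and}\quad \frac{\partial F}{\partial x_N}
\]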

are all continuous functions in some open region containing the point (t0 , a1 , a2 , . . . , a N ) . This
initial-value problem then has exactly one solution y over some open interval (α, β) containing t0 .
Moreover, this solution and its derivatives up to order N are continuous over that interval.

If you now go back and compare the above theorem with theorem 11.3 on page 240, you will
ﬁnd that (except for cosmetic differences in notation) the two theorems are the same.
While we are at it, we should note that the above certainly applies if our differential equation is
a homogeneous N th -order linear differential equation

\[
a_0 y^{(N)} + a_1 y^{(N-1)} + \cdots + a_{N-2} y'' + a_{N-1} y' + a_N y = 0
\]
on an interval (α, β) on which the $a_k$'s are all continuous functions with $a_0$ never being zero. In
fact, it's easily verified that theorem 35.2 applies after rewriting the differential equation as
\[
y^{(N)} = -\frac{1}{a_0}\left[ a_1 y^{(N-1)} + \cdots + a_{N-2} y'' + a_{N-1} y' + a_N y \right] .
\]

Moreover, if {y1 , y2 , . . . , y K } is a linearly independent set of solutions to this differential equation

on (α, β) , then we can use theorem 35.4 and the principle of superposition to show that

y(t) = c1 y1 (t) + c2 y2 (t) + · · · + c K y K (t)

is a general solution if and only if, for each N -tuple of real numbers (A1 , A2 , . . . , A N ) , there is a
corresponding K -tuple of real numbers (C 1 , C 2 , . . . , C K ) such that, for n = 1, 2, . . . , N ,

\[
C_1 y_1^{(n-1)}(t_0) + C_2 y_2^{(n-1)}(t_0) + \cdots + C_K y_K^{(n-1)}(t_0) = A_n
\]

where t0 is some ﬁxed point in (α, β) . Then, using theorem 35.4 and basic results from linear alge-
bra, you can, eventually, verify all the results regarding general solutions to N th -order homogeneous
equations described in theorem 13.5 on page 272 and all the results on Wronskians in theorem 13.6
on page 274.
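
The reduction used throughout this section, from a single N th-order equation to a standard first-order system, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `as_first_order_system` and `rk4_solve` are hypothetical, the classical Runge-Kutta method is a numerical scheme swapped in for demonstration (it is not discussed in the text), and the test equation y'' = −y with y(0) = 0 and y'(0) = 1 is chosen because its solution, sin(t), is known.

```python
import math

def as_first_order_system(F, N):
    """Build f(t, x) for x1' = x2, ..., x_{N-1}' = x_N, x_N' = F(t, x1, ..., xN)."""
    def f(t, x):
        # The first N-1 components just shift the unknowns; the last one applies F.
        return [x[k + 1] for k in range(N - 1)] + [F(t, *x)]
    return f

def rk4_solve(f, t0, x0, t_end, steps):
    """Classical fourth-order Runge-Kutta; returns the approximate state at t_end."""
    h = (t_end - t0) / steps
    t, x = t0, list(x0)
    for _ in range(steps):
        k1 = f(t, x)
        k2 = f(t + h / 2, [xi + h / 2 * ki for xi, ki in zip(x, k1)])
        k3 = f(t + h / 2, [xi + h / 2 * ki for xi, ki in zip(x, k2)])
        k4 = f(t + h, [xi + h * ki for xi, ki in zip(x, k3)])
        x = [xi + h / 6 * (a + 2 * b + 2 * c + d)
             for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]
        t += h
    return x

# Illustrative equation (an assumption, not from the text): y'' = -y,
# with y(0) = 0 and y'(0) = 1, so that x1 = y and x2 = y'.
f = as_first_order_system(lambda t, y, yp: -y, N=2)
y1, yp1 = rk4_solve(f, 0.0, [0.0, 1.0], 1.0, 1000)
```

The returned pair approximates (y(1), y'(1)), which mirrors the observation above that the components of the system's solution are y and its derivatives up to order N − 1.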


Additional Exercises

35.1. Determine which of the following function pairs are solutions to the first-order system
\[
\begin{aligned}
x' &= 2y \\
y' &= 1 - 2x .
\end{aligned}
\]
a. $x(t) = \sin(2t) + \frac{1}{2}$ and $y(t) = \cos(2t)$
b. $x(t) = e^{2t} - 1$ and $y(t) = e^{2t}$
c. $x(t) = 3\cos(2t) + \frac{1}{2}$ and $y(t) = -3\sin(2t)$

35.2. Determine which of the following function pairs are solutions to the first-order system
\[
\begin{aligned}
x' &= 4x - 3y \\
y' &= 6x - 7y .
\end{aligned}
\]
a. $x(t) = 6e^{3t}$ and $y(t) = 2e^{3t}$
b. $x(t) = 3e^{2t} - e^{-5t}$ and $y(t) = 2e^{2t} - 3e^{-5t}$
c. $x(t) = 3e^{2t} + e^{-5t}$ and $y(t) = 2e^{2t} - 3e^{-5t}$

35.3. Determine which of the following function pairs are solutions on (0, ∞) to the first-order
system
\[
\begin{aligned}
t x' + 2x &= 15y \\
t y' &= x .
\end{aligned}
\]
a. $x(t) = 3t^3$ and $y(t) = -t^3$
b. $x(t) = 3t^3$ and $y(t) = t^3$
c. $x(t) = -5t^{-5}$ and $y(t) = t^{-5}$

35.4 a. Verify that, for any choice of constants $c_1$ and $c_2$,
\[
x(t) = c_1 e^{3t} + 2c_2 e^{-4t} \quad\text{and}\quad y(t) = c_1 e^{3t} - 5c_2 e^{-4t}
\]
satisfies the system
\[
\begin{aligned}
x' &= x + 2y \\
y' &= 5x - 2y .
\end{aligned}
\]
b. Then find the solution to this system that satisfies
\[
x(0) = 7 \quad\text{and}\quad y(0) = -7 .
\]

35.5 a. Verify that, for any choice of constants $c_1$ and $c_2$,
\[
x(t) = c_1 e^{9t} - c_2 e^{-3t} \quad\text{and}\quad y(t) = c_1 e^{9t} + 2c_2 e^{-3t}
\]
satisfies the system
\[
\begin{aligned}
x' &= 5x + 4y \\
y' &= 8x + y .
\end{aligned}
\]
b. Then find the solution to this system that satisfies
\[
x(0) = 0 \quad\text{and}\quad y(0) = 9 .
\]

Figure 35.3: A simple system of two tanks containing water/alcohol mixtures for exercise 35.8.
(Labels shown in the figure: $I_A$ gal./min. ($C_A$ % alcohol), $I_B$ gal./min. ($C_B$ % alcohol),
$\gamma_1$ gal./min., $\gamma_2$ gal./min., $O_A$ gal./min., $O_B$ gal./min., Tank A, Tank B, (1200 gal.), (600 gal.).)

35.6 a. Verify that, for any choice of constants $c_1$ and $c_2$,
\[
x(t) = c_1 e^{-2t} + 2c_2 e^{5t} \quad\text{and}\quad y(t) = -3c_1 e^{-2t} + c_2 e^{5t}
\]
satisfies the system
\[
\begin{aligned}
x' &= 4x + 2y \\
y' &= 3x - y .
\end{aligned}
\]
b. Then find the solution to this system that satisfies
\[
x(0) = 0 \quad\text{and}\quad y(0) = -21 .
\]

35.7. Solve each of the following weakly coupled systems:

a. $x' + x = 0$ , $y' = x$
b. $x' = 2yx$ , $t y' = y$
c. $x' + 2x = 10z$ , $z y' + 5zy = 15x$ , $z' - 3z = 0$

35.8. Consider the tank system illustrated in figure 35.3. Let x and y be, respectively, the amount
of alcohol (measured in gallons) in tanks A and B at time t (measured in minutes), and
find the standard first-order system describing how x and y vary over time when:
a. IA = 5 , C A = 0 , OA = 5 , IB = 3 , C B = 100 ,
OB = 3 , γ1 = 1 and γ2 = 1
b. IA = 8 , C A = 25 , OA = 6 , IB = 2 , C B = 50 ,
OB = 4 , γ1 = 1 and γ2 = 3

35.9. Rewrite each of the following differential equations as a standard ﬁrst-order system.
a. y + 4y + 2y = 0 b. y − 8t 2 y − 32y = sin(t)
c. y = 4 − y 2 d. t 2 y − 5t y + 8y = 0

e. t 2 y − t y + 10y = 0 f. y = 4t 2 − sin y y

g. y + 2y − 3y − 4y = 0 h. y + y t 2 + y 2 = 0


35.10. Solve each of the following initial-value problems using the Laplace transform.
a. $x' = x + 2y$ , $y' = 5x - 2y$ with $x(0) = 1$ , $y(0) = 15$
b. $x' = 2y$ , $y' = 2x$ with $x(0) = x_0$ , $y(0) = y_0$
c. $x' = 2y$ , $y' = -2x$ with $x(0) = x_0$ , $y(0) = y_0$
d. $x' = -2y$ , $y' = 8x$ with $x(0) = x_0$ , $y(0) = y_0$
e. $x' = 4x - 13y$ , $y' = x$ with $x(0) = 2$ , $y(0) = 1$
f. $x' = 3x + 2y$ , $y' = -2x + 3y$ with $x(0) = x_0 = a_1$ , $y(0) = y_0 = a_2$
g. $x' = 8x + 2y - 17$ , $y' = 4x + y - 13$ with $x(0) = 0$ , $y(0) = 0$
h. $x' = 8x + 2y + 7e^{2t}$ , $y' = 4x + y - 7e^{2t}$ with $x(0) = -1$ , $y(0) = 1$
i. $x' = 4x + 3y - 6e^{3t}$ , $y' = x + 6y + 2e^{3t}$ with $x(0) = 4$ , $y(0) = 0$
j. $x' = -y$ , $y' = 4x + 24t$ with $x(0) = 0$ , $y(0) = 0$
k. $x' = 4x - 13y$ , $y' = x + 19\cos(4t) - 13\sin(4t)$ with $x(0) = 13$ , $y(0) = 0$
l. $x' = 4x + 3y + 5\operatorname{step}_2(t)$ , $y' = x + 6y + 17\operatorname{step}_2(t)$ with $x(0) = 0$ , $y(0) = 0$
35.11. Using theorem 35.3 and results from exercise 35.5, verify that
\[
x(t) = c_1 e^{9t} - c_2 e^{-3t} \quad\text{and}\quad y(t) = c_1 e^{9t} + 2c_2 e^{-3t}
\]
is a general solution to
\[
\begin{aligned}
x' &= 5x + 4y \\
y' &= 8x + y .
\end{aligned}
\]

35.12. Using theorem 35.3 and results from exercise 35.6, verify that
\[
x(t) = c_1 e^{-2t} + 2c_2 e^{5t} \quad\text{and}\quad y(t) = -3c_1 e^{-2t} + c_2 e^{5t}
\]
is a general solution to
\[
\begin{aligned}
x' &= 4x + 2y \\
y' &= 3x - y .
\end{aligned}
\]


36
Critical Points, Direction Fields and
Trajectories

In this chapter, we will restrict our attention to certain standard ﬁrst-order systems consisting of only
two differential equations. Admittedly, just about everything we will discuss can be extended to
larger systems, but, by staying with the smaller systems, we will be able to keep the notation and
discussion simpler and develop the main concepts more clearly. This will largely be because
it is much easier to draw pictures with solutions consisting of just two functions, x = x(t) and
y = y(t) . And that will be important. Our goal is to ﬁnd a way to “graphically represent” solutions
in a useful manner. What we will develop, called “phase plane analysis”, turns out to be a fundamental
tool in the study of systems, especially for those many systems we cannot explicitly solve.

36.1 The Systems of Interest and Some Basic Notation

The Systems of Interest
As just noted above, the focus of this chapter concerns the standard 2×2 ﬁrst-order system

\[
\begin{aligned}
x' &= f(t, x, y) \\
y' &= g(t, x, y)
\end{aligned}
\tag{36.1}
\]

(though we may replace the symbols x and y with other symbols in some of the applications). In
addition, we will often also require that our system be “regular” and/or “autonomous”. So let us
define these terms:
• System (36.1) is “regular” if the component functions, f and g , are “reasonably nice”. More
precisely, the system is regular if f , g and all their ﬁrst partial derivatives exist and are
continuous for all real values of their variables. Most of the systems we have seen are regular.
• System (36.1) is autonomous if and only if no component function explicitly depends on t .
In this case, we should leave out any reference to t and rewrite system (36.1) as

\[
\begin{aligned}
x' &= f(x, y) \\
y' &= g(x, y)
\end{aligned}
\; . \tag{36.2}
\]

Autonomous systems naturally arise in applications. If you check, all the ﬁrst-order systems
derived in the previous chapter from applications are autonomous.


Some Convenient Notation

Remember that a solution over an interval (α, β) to a system

\[
\begin{aligned}
x' &= f(t, x, y) \\
y' &= g(t, x, y)
\end{aligned}
\tag{36.3}
\]

consists of a pair of functions x = x(t) and y = y(t) that, together, satisfy the system for all t in
the interval (α, β) . In the last chapter, we occasionally wrote this pair explicitly as an ordered pair,

(x, y) = (x(t), y(t)) .

We will ﬁnd it even more convenient to do so in this chapter, especially when discussing generic
systems. Let us also observe that we can write the above system, as well as the initial conditions

x(t0 ) = x 0 and y(t0 ) = y0 ,

in “vector form”,
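\[
\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} f(t, x, y) \\ g(t, x, y) \end{bmatrix}
\quad\text{with}\quad
\begin{bmatrix} x(t_0) \\ y(t_0) \end{bmatrix} = \begin{bmatrix} x_0 \\ y_0 \end{bmatrix} .
\]

When convenient, we will further reduce the above line to
\[
\mathbf{x}' = \mathbf{F}(t, \mathbf{x}) \quad\text{with}\quad \mathbf{x}(t_0) = \mathbf{x}_0
\]
with the understanding that
\[
\mathbf{x} = \mathbf{x}(t) = \begin{bmatrix} x(t) \\ y(t) \end{bmatrix} , \quad
\mathbf{x}' = \begin{bmatrix} x' \\ y' \end{bmatrix} , \quad
\mathbf{x}_0 = \begin{bmatrix} x_0 \\ y_0 \end{bmatrix}
\]
and
\[
\mathbf{F}(t, \mathbf{x}) = \begin{bmatrix} f(t, x, y) \\ g(t, x, y) \end{bmatrix} .
\]

Of course, if the system is autonomous, we will simply denote the system by $\mathbf{x}' = \mathbf{F}(\mathbf{x})$ with the
understanding that
\[
\mathbf{F}(\mathbf{x}) = \begin{bmatrix} f(x, y) \\ g(x, y) \end{bmatrix} .
\]
Along these same lines, we will occasionally let $\mathbf{0}$ denote the “zero vector”
\[
\mathbf{0} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} .
\]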

!Example 36.1: Consider the system

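\[
\begin{aligned}
x' &= x + 2y \\
y' &= 5x - 2y
\end{aligned}
\]
along with the initial conditions
\[
x(0) = 0 \quad\text{and}\quad y(0) = 1 .
\]
In example 35.3 we saw that a solution to this initial-value problem is (x, y) where
\[
x = x(t) = \frac{2}{7} e^{3t} - \frac{2}{7} e^{-4t}
\quad\text{and}\quad
y = y(t) = \frac{2}{7} e^{3t} + \frac{5}{7} e^{-4t} .
\]
In matrix/vector form, this initial-value problem can be written as either
\[
\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} x + 2y \\ 5x - 2y \end{bmatrix}
\quad\text{with}\quad
\begin{bmatrix} x(0) \\ y(0) \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \end{bmatrix}
\]
or even as
\[
\mathbf{x}' = \mathbf{F}(\mathbf{x}) \quad\text{with}\quad \mathbf{x}(0) = \mathbf{x}_0
\]
where
\[
\mathbf{x} = \begin{bmatrix} x \\ y \end{bmatrix} , \quad
\mathbf{F}(\mathbf{x}) = \begin{bmatrix} x + 2y \\ 5x - 2y \end{bmatrix}
\quad\text{and}\quad
\mathbf{x}_0 = \begin{bmatrix} 0 \\ 1 \end{bmatrix} .
\]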

In the future, we will use, or not use, vector notation as is convenient.

36.2 Constant/Equilibrium Solutions

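A solution (x, y) to a system $\mathbf{x}' = \mathbf{F}(t, \mathbf{x})$ is a constant solution if both x and y are constant
functions; that is, for some pair of real numbers $x_0$ and $y_0$,
\[
(x(t), y(t)) = (x_0, y_0) \quad\text{for all } t .
\]
But remember,
\[
(x(t), y(t)) = (x_0, y_0) \ \text{for all } t
\quad\Longleftrightarrow\quad
\left( x'(t), y'(t) \right) = (0, 0) \ \text{for all } t .
\]
Thus, (x, y) is a constant solution if and only if
\[
\mathbf{x}'(t) = \begin{bmatrix} x'(t) \\ y'(t) \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} = \mathbf{0} \quad\text{for all } t .
\]
Combining this with $\mathbf{x}' = \mathbf{F}(t, \mathbf{x})$, we get
\[
\mathbf{0} = \mathbf{x}' = \mathbf{F}(t, \mathbf{x}) .
\]
Clearly then, for any pair of constants $(x_0, y_0)$, we have that $(x(t), y(t)) = (x_0, y_0)$ is a constant
solution of our system if and only if
\[
\mathbf{0} = \mathbf{F}(t, \mathbf{x}_0) \quad\text{for all } t \quad\text{where}\quad \mathbf{x}_0 = \begin{bmatrix} x_0 \\ y_0 \end{bmatrix} .
\]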

From this, you can ﬁnd the constant solutions for a given system. (Do remember that we are insisting
our solutions be real valued. So the constants must be real numbers.)
Constant solutions are especially important for autonomous systems. And, when the system is
autonomous, it is traditional to call any constant solution an equilibrium solution. We will follow
tradition.

!Example 36.2: Let’s try to ﬁnd every equilibrium solution for the autonomous system

\[
\begin{aligned}
x' &= x(y^2 - 9) \\
y' &= (x - 1)(y^2 + 1) .
\end{aligned}
\]

The constant/equilibrium solutions are all obtained by setting $x'$ and $y'$ both equal to 0, and
then solving the resulting algebraic system,
\[
\begin{aligned}
0 &= x(y^2 - 9) \\
0 &= (x - 1)(y^2 + 1) .
\end{aligned}
\tag{36.4}
\]

Consider the first equation, first:
\[
\begin{aligned}
0 = x(y^2 - 9)
\quad&\longrightarrow\quad x = 0 \ \text{ or } \ y^2 = 9 \\
&\longrightarrow\quad x = 0 \ \text{ or } \ y = 3 \ \text{ or } \ y = -3 .
\end{aligned}
\]

If x = 0 , then the second equation in system (36.4) reduces to

0 = (0 − 1)(y 2 + 1) ,

which means that y 2 = −1 , and, hence, y = ±i . But these are not real numbers, as required.
So we do not have an equilibrium solution

(x(t), y(t)) = (x 0 , y0 ) with x0 = 0 .

On the other hand, if the ﬁrst equation in system (36.4) is satisﬁed because y = 3 , then the
second equation in that system reduces to

\[
0 = (x - 1)\left(3^2 + 1\right) = (x - 1) \cdot 10 ,
\]

telling us that x = 1 . Thus, one equilibrium solution for our system of differential equations is

(x(t), y(t)) = (1, 3) .

Finally, if the ﬁrst equation in system (36.4) holds because y = −3 , then the second equation in
that system becomes
0 = (x − 1) · 10 .
Hence, again, x = 1 , and the corresponding equilibrium solution is

(x(t), y(t)) = (1, −3) for all t .

In summary, then, our system of differential equations has two equilibrium solutions,

(x(t), y(t)) = (1, 3) and (x(t), y(t)) = (1, −3) .
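
For systems whose algebraic equations do not factor so conveniently, the equilibrium points can instead be approximated numerically. The following is a minimal sketch using Newton's method for two equations in two unknowns, a standard root-finding technique that is not the method of this example; the function names `F`, `jacobian` and `newton`, the starting points and the iteration count are all assumptions made for illustration. It is applied to the system of example 36.2 so that the results can be compared with the two equilibrium solutions found above.

```python
def F(x, y):
    """Right-hand sides of x' = x(y^2 - 9), y' = (x - 1)(y^2 + 1)."""
    return (x * (y**2 - 9), (x - 1) * (y**2 + 1))

def jacobian(x, y):
    """Matrix of first partial derivatives of F, written out by hand."""
    return ((y**2 - 9, 2 * x * y),
            (y**2 + 1, 2 * y * (x - 1)))

def newton(x, y, iterations=20):
    """Newton's method: repeatedly solve J * (dx, dy) = -F and update (x, y)."""
    for _ in range(iterations):
        f1, f2 = F(x, y)
        (a, b), (c, d) = jacobian(x, y)
        det = a * d - b * c
        if det == 0:
            raise ZeroDivisionError("singular Jacobian; try another starting point")
        # Cramer's rule for the 2 x 2 linear system J * (dx, dy) = (-f1, -f2).
        dx = (-f1 * d + b * f2) / det
        dy = (-a * f2 + c * f1) / det
        x, y = x + dx, y + dy
    return x, y

upper = newton(1.2, 2.8)    # starts near (1, 3)
lower = newton(1.2, -2.8)   # starts near (1, -3)
```

Newton's method only finds a root near its starting point, so it does not by itself prove that no other equilibrium solutions exist; the case analysis in the example is what rules out $x_0 = 0$.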

Figure 36.1: Two graphical representations of the solution from example 36.3 with
−1/2 ≤ t ≤ 1/2 : (a) The actual graph, and (b) The curve traced out by (x(t), y(t))
in the XY–plane.

36.3 “Graphing” Standard Systems

True Graphs and Trajectories
Let us brieﬂy discuss two ways of graphically representing solutions to standard ﬁrst-order systems,
starting with a simple example.

!Example 36.3: Consider “graphically representing” the solution to the initial-value problem

\[
\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} x + 2y \\ 5x - 2y \end{bmatrix}
\quad\text{with}\quad
\begin{bmatrix} x(0) \\ y(0) \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \end{bmatrix} .
\]

From example 35.3 we know that a solution to this initial-value problem is (x, y) with
\[
x = x(t) = \frac{2}{7} e^{3t} - \frac{2}{7} e^{-4t}
\quad\text{and}\quad
y = y(t) = \frac{2}{7} e^{3t} + \frac{5}{7} e^{-4t} .
\]
To construct the actual graph of this solution, we need to plot each point (t, x(t), y(t)) in
TXY–space, using the above formulas for x(t) and y(t). This results in a curve in TXY–space.
Part of this curve has been sketched in figure 36.1a.
As an alternative to constructing the graph, we can sketch the curve in the XY–plane that is
traced out by (x(t), y(t)) as t varies. That is what was sketched in figure 36.1b.
Take a look at the two figures. Both were easily done using a computer math package.

In general, the graph of a solution to a standard 2×2 first-order system requires a coordinate
system with 3 axes. Sketching such a graph is doable, as in the above example, especially if you
have a decent computer math package. Then you can even rotate the image to see the graph from
different views. Unfortunately, when you are limited to just one view, as in figure 36.1a, it may be
somewhat difficult to properly interpret the figure. Because of this, we will rarely, if ever again,
actually attempt to sketch true graphs of our solutions.
On the other hand, we will find the approach illustrated by figure 36.1b quite useful. Moreover,
the sketches that we will generate for 2×2 systems can give us insight as to the behavior of solutions
to larger systems.
Observe that the curve in figure 36.1b has a natural “direction of travel” corresponding to the
way the curve is traced out as t increases. If you start at the point where t = −1/2 and travel the
curve “in the positive direction” (that is, in the direction in which (x(t), y(t)) travels along the curve
as t increases), then you would pass through the point where t = 0 and then through the point
where t = +1/2 . That makes this an oriented curve. We will call this oriented curve a trajectory for
our system.
Just to be a bit more complete, suppose we have a solution

(x, y) = (x(t), y(t))

to a standard 2×2 first-order system $\mathbf{x}' = \mathbf{F}(t, \mathbf{x})$. For each t, we will view (x(t), y(t)) as a point
on the XY–plane, and we will refer to the oriented curve traced out by this point as t increases
as this solution’s trajectory, with the orientation being the direction of travel along the curve as t
increases. We will also refer to this oriented curve as a trajectory for the system $\mathbf{x}' = \mathbf{F}(t, \mathbf{x})$.

Velocity Vectors and the Direction of Travel

Do recall that
\[
\mathbf{v}(t) = \mathbf{x}'(t) = \begin{bmatrix} x'(t) \\ y'(t) \end{bmatrix}
\]
is the velocity vector $\mathbf{v}$ at time t of an object whose position at time t is (x(t), y(t)).¹ As you
should recall from elementary multivariable calculus, this vector $\mathbf{v}(t)$ is an ‘arrow’ pointing in the
direction the object is traveling at time t. So, if we pick some value $t_0$ for t and draw $\mathbf{v}_0 = \mathbf{v}(t_0)$
at position $(x(t_0), y(t_0))$, then $\mathbf{v}_0$ will be tangent to and pointing in the direction of travel of the
trajectory of the object at $(x(t_0), y(t_0))$, thus giving us some idea of what that trajectory looks like
near that point. And if (x(t), y(t)) is known to be a solution to
\[
\mathbf{x}'(t) = \mathbf{F}(t, \mathbf{x}) ,
\]
then we can actually compute $\mathbf{v}_0 = \mathbf{x}'(t_0)$ for each choice of $t_0$ and $\mathbf{x}_0$ without even solving the
system for $\mathbf{x}(t)$. All we need to do is to compute $\mathbf{F}(t_0, \mathbf{x}_0)$.

!Example 36.4: Consider the trajectories of two objects both of whose positions (x, y) at time
t satisfy the nonautonomous system

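\[
\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} t(x + y) \\ y - tx \end{bmatrix} .
\]

Assume the first object passes through the point (x, y) = (1, 2) at time t = 0. At that time, its
velocity is
\[
\mathbf{v} = \begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} 0(1 + 2) \\ 2 - 0 \cdot 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 2 \end{bmatrix} .
\]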

1 This assumes we are using a Cartesian coordinate system. If the coordinate system is, say, polar, then the velocity is not
so simply computed.


However, if the other object also passes through the point (x, y) = (1, 2), but at time t = 2,
then its velocity at that time is
\[
\mathbf{v} = \begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} 2(1 + 2) \\ 2 - 2 \cdot 1 \end{bmatrix} = \begin{bmatrix} 6 \\ 0 \end{bmatrix} .
\]

Note the directions of travel for each of these objects as they pass through the point (x, y) =
(1, 2): The first is moving parallel to the Y–axis, while the second is moving parallel to the
X–axis.

As the last example illustrates, the direction of travel for a solution’s trajectory through a given
point may depend on “when” it passes through the point. However, this is only for nonautonomous
systems. If our first-order system $\mathbf{x}' = \mathbf{F}$ is autonomous, then $\mathbf{F}$ does not depend on t, only on the
components of $\mathbf{x}$. Consequently, the “velocities” of
\[
\mathbf{x}' = \mathbf{F}(\mathbf{x})
\]

depend only on position, not t . We will use this fact for the rest of this chapter.
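
The contrast between the two kinds of systems is easy to see by simply evaluating the right-hand sides. The sketch below (function names are hypothetical, chosen for illustration) recomputes the two velocities from example 36.4 and, for comparison, evaluates the autonomous system x' = x + 2y, y' = 5x − 2y at the same point, where no time argument is needed at all.

```python
def F_nonautonomous(t, x, y):
    """Velocity for the system x' = t(x + y), y' = y - t x of example 36.4."""
    return (t * (x + y), y - t * x)

def F_autonomous(x, y):
    """Velocity for the autonomous system x' = x + 2y, y' = 5x - 2y."""
    return (x + 2 * y, 5 * x - 2 * y)

# Same position (1, 2), two different times: the velocity changes with t.
v_first = F_nonautonomous(0, 1, 2)    # first object, at (1, 2) when t = 0
v_second = F_nonautonomous(2, 1, 2)   # second object, at (1, 2) when t = 2

# For the autonomous system the velocity at (1, 2) is a single vector,
# whatever the time at which a solution passes through that point.
v_auto = F_autonomous(1, 2)
```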

36.4 Sketching Trajectories for Autonomous Systems

Direction Fields
Suppose we are given a regular 2×2 autonomous ﬁrst-order system of differential equations

\[
\begin{aligned}
x' &= f(x, y) \\
y' &= g(x, y) .
\end{aligned}
\]

Now, pick a point $(x_0, y_0)$ on the XY–plane, and, using the given system, compute the ‘velocity’
at that point for the trajectory through that point,
\[
\mathbf{v} = \begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} f(x_0, y_0) \\ g(x_0, y_0) \end{bmatrix} .
\]

This gives us a vector tangent at the point $(x_0, y_0)$ to any trajectory passing through this point, and
pointing in the direction of travel along the trajectory as t increases. So, if we draw a short arrow
at $(x_0, y_0)$ in the same direction as this velocity vector, we then have a short arrow tangent to the
trajectory through this point and pointing in the “direction of travel” along this curve. We will call
this short arrow a direction arrow. (This assumes $\mathbf{x}'$ is nonzero at the point. If it is zero, we have
a “critical point”. We’ll discuss critical points in just a little bit.) In figure 36.2a, we’ve sketched a
few of these direction arrows at points along one particular trajectory.
If we sketch a corresponding direction arrow at every point (x j , yk ) in a grid of points, then we
have a direction field, as illustrated (along with a few trajectories) in figure 36.2b. This direction field
tells us the general behavior of the system’s trajectories. To sketch a trajectory using a direction field,
simply start at any chosen point in the plane, and sketch a curve following the directions indicated
by the nearby direction arrows. “Go with the flow”, and do not attempt to “connect the dots”. The
goal is to draw, as well as practical, a curve whose tangent at each point on the curve is lined up with
the direction arrow that would be sketched at that point. That curve (if perfectly drawn), oriented
in the direction given by the tangent direction arrows, is a trajectory of the system. (In practice, of
course, it’s as good an approximation to a trajectory as our drafting skills allow.)
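
The construction of a direction field described above can be sketched in code. This is only an illustrative sketch: the names `velocity`, `direction_arrow` and `direction_field`, the arrow length of 0.2 and the integer grid are all assumptions, and the system used is x' = x + 2y, y' = 5x − 2y. Each grid point is mapped to a short arrow in the direction of the velocity, with `None` recorded where the velocity is the zero vector and no arrow can be drawn.

```python
import math

def velocity(x, y):
    """Right-hand sides of the system x' = x + 2y, y' = 5x - 2y."""
    return (x + 2 * y, 5 * x - 2 * y)

def direction_arrow(x, y, length=0.2):
    """Short arrow (dx, dy) of the given length pointing along the velocity,
    or None where the velocity is the zero vector (no direction is defined)."""
    vx, vy = velocity(x, y)
    speed = math.hypot(vx, vy)
    if speed == 0:
        return None
    return (length * vx / speed, length * vy / speed)

def direction_field(xs, ys):
    """Direction arrows at every point (x, y) of the grid xs x ys."""
    return {(x, y): direction_arrow(x, y) for x in xs for y in ys}

# Integer grid covering -1 <= x <= 3 and 0 <= y <= 2.
field = direction_field(range(-1, 4), range(0, 3))
```

Only the direction of each arrow carries information here; all arrows are scaled to the same length so that large velocities do not clutter the picture.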

Figure 36.2: Direction arrows for trajectories for the system in example 36.5, with (a) being a
few direction arrows tangent to the trajectory passing through (0, 1) (drawn
oversized for clarity), and (b) being a more complete direction field, along with a
few more trajectories.

The construction and use of a direction field for a 2 × 2 first-order autonomous system of
differential equations is analogous to the construction and use of a slope field for a first-order
differential equation (see chapter 8). It’s not exactly the same — we are now sketching trajectories
instead of true graphs of solutions (which would require a T –axis), but the mechanics are very much
the same. And, just as with slope lines for slope fields, it is good for your understanding to practice
sketching the direction arrows for a few small direction fields, and, for the sake of your sanity, it is
a good idea to learn how to construct direction fields (and trajectories) using a good computer math
package.
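
When no math package is at hand, a trajectory can still be approximated by stepping along the velocity field. The sketch below uses Heun's method (the improved Euler method), a numerical scheme chosen here for brevity rather than anything prescribed by the text; the function names and step counts are assumptions. It is applied to the system x' = x + 2y, y' = 5x − 2y starting from (0, 1), tracing forward to t = 1/2 and backward to t = −1/2.

```python
import math

def velocity(x, y):
    """Right-hand sides of the system x' = x + 2y, y' = 5x - 2y."""
    return (x + 2 * y, 5 * x - 2 * y)

def trace_trajectory(x, y, t_end, steps):
    """Heun's method from t = 0 to t = t_end; returns the list of points visited.
    A negative t_end traces the trajectory backward in time."""
    h = t_end / steps
    points = [(x, y)]
    for _ in range(steps):
        vx, vy = velocity(x, y)
        px, py = x + h * vx, y + h * vy            # Euler predictor
        wx, wy = velocity(px, py)                  # velocity at the predicted point
        x, y = x + h / 2 * (vx + wx), y + h / 2 * (vy + wy)
        points.append((x, y))
    return points

forward = trace_trajectory(0.0, 1.0, 0.5, 500)     # from t = 0 to t = 1/2
backward = trace_trajectory(0.0, 1.0, -0.5, 500)   # from t = 0 to t = -1/2
```

Plotting the points of `backward` (reversed) followed by those of `forward` gives an approximation to the portion of the trajectory through (0, 1) with −1/2 ≤ t ≤ 1/2.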

Critical Points
When constructing a direction field, it is important to note each point $(x_0, y_0)$ for which
\[
\begin{bmatrix} f(x_0, y_0) \\ g(x_0, y_0) \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} .
\]

Any such point $(x_0, y_0)$ is said to be a critical point for the system. Since the zero vector has no
well-defined direction, we cannot sketch a direction arrow at a critical point. Instead, plot a clearly
visible dot at this point. After all, if $(x_0, y_0)$ is a critical point for our system, then
\[
\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} f(x_0, y_0) \\ g(x_0, y_0) \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} ,
\]
which, in turn, means we have the constant/equilibrium solution
\[
(x(t), y(t)) = (x_0, y_0) \quad\text{for all } t ,
\]

and, if you think about it for a moment, you’ll realize that the point (x 0 , y0 ) is the entire trajectory
of this solution.
In fact, this gives us an alternate deﬁnition for critical point, namely, that a critical point for
an autonomous system of differential equations is the trajectory of an equilibrium solution for that
system.


Finding critical points and determining the behavior of the trajectories in regions around them
can be rather important. The mechanics of ﬁnding critical points is identical to the mechanics of
ﬁnding equilibrium solutions (as illustrated in example 36.2 on page 777). Issues regarding the
behavior of near-by trajectories will be discussed in the next section.

!Example 36.5: Consider, once again, the system

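\[
\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} x + 2y \\ 5x - 2y \end{bmatrix} .
\]

Plugging in (x, y) = (0, 1), we get
\[
\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} 0 + 2 \cdot 1 \\ 5 \cdot 0 - 2 \cdot 1 \end{bmatrix} = \begin{bmatrix} 2 \\ -2 \end{bmatrix} .
\]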

Thus, the direction arrow sketched at position (x, y) = (0, 1) should be a short arrow (which
we center at the point) parallel to and pointing in the same direction as the vector from the origin
(0, 0) to position (2, −2) . That is what was sketched at point (0, 1) in figure 36.2a, along with
the trajectory through that point.
A more detailed direction field, along with three other trajectories, is illustrated in figure
36.2b. Note the dot at (0, 0) . This is the critical point for the system and is the trajectory of the
one equilibrium solution
(x(t), y(t)) = (0, 0) for all t .

Trajectories and Solutions

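Keep in mind that a trajectory through a point (a, b) for a regular autonomous system
\[
\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} f(x, y) \\ g(x, y) \end{bmatrix}
\]
is not a solution to the system; it is the path traced out by a solution to the initial-value problem
\[
\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} f(x, y) \\ g(x, y) \end{bmatrix}
\quad\text{with}\quad
\begin{bmatrix} x(t_0) \\ y(t_0) \end{bmatrix} = \begin{bmatrix} a \\ b \end{bmatrix}
\]
for any choice of $t_0$ in (−∞, ∞). And since there are infinitely many possible values of $t_0$, we
should expect infinitely many solutions tracing out that one trajectory.
However, all the different solutions corresponding to a single trajectory are simply ‘shifted’
versions of each other. To see this, let $(\hat{x}(t), \hat{y}(t))$ be a solution to
\[
\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} f(x, y) \\ g(x, y) \end{bmatrix}
\quad\text{with}\quad
\begin{bmatrix} x(0) \\ y(0) \end{bmatrix} = \begin{bmatrix} a \\ b \end{bmatrix} .
\]

You can then easily verify for yourself (see exercise 36.11) that, for any real value $t_0$,
\[
(x(t), y(t)) = \left( \hat{x}(t - t_0), \hat{y}(t - t_0) \right)
\]
is a solution to
\[
\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} f(x, y) \\ g(x, y) \end{bmatrix}
\quad\text{with}\quad
\begin{bmatrix} x(t_0) \\ y(t_0) \end{bmatrix} = \begin{bmatrix} a \\ b \end{bmatrix} .
\]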


The uniqueness theorems brieﬂy discussed in the last chapter then assure us that there are no other
solutions to this initial-value problem.
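
The shifting property can be checked numerically for a concrete case. As a sketch under stated assumptions, take the system x' = x + 2y, y' = 5x − 2y with (a, b) = (0, 1), for which the solution with x(0) = 0 and y(0) = 1 is x̂(t) = (2/7)e^{3t} − (2/7)e^{−4t}, ŷ(t) = (2/7)e^{3t} + (5/7)e^{−4t}. The names `base_solution`, `shifted_solution` and `residual`, and the choice t0 = 2, are hypothetical. The derivative is approximated by a central difference, so the residual is only expected to be small, not exactly zero.

```python
import math

def velocity(x, y):
    """Right-hand sides of the system x' = x + 2y, y' = 5x - 2y."""
    return (x + 2 * y, 5 * x - 2 * y)

def base_solution(t):
    """Solution with (x(0), y(0)) = (0, 1)."""
    e3, e4 = math.exp(3 * t), math.exp(-4 * t)
    return (2 / 7 * e3 - 2 / 7 * e4, 2 / 7 * e3 + 5 / 7 * e4)

def shifted_solution(t0):
    """The shifted pair (x_hat(t - t0), y_hat(t - t0))."""
    def solution(t):
        return base_solution(t - t0)
    return solution

def residual(solution, t, h=1e-6):
    """Largest component of |solution'(t) - velocity(solution(t))|,
    with the derivative estimated by a central difference."""
    plus, minus = solution(t + h), solution(t - h)
    derivative = [(p - m) / (2 * h) for p, m in zip(plus, minus)]
    return max(abs(d - v) for d, v in zip(derivative, velocity(*solution(t))))

shifted = shifted_solution(2.0)   # should satisfy x(2) = 0, y(2) = 1
```

Both solutions trace out the same curve in the XY–plane; the shifted one simply reaches each point two time units later.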
This, by the way, leads to a minor technical point that should be discussed brieﬂy. When
specifying a solution (x(t), y(t)) , we should also specify the domain of this solution, that is, the
interval (α, β) over which t varies. In practice, we often don’t mention that interval, simply
assuming it to be “as large as possible” (which often means (α, β) = (−∞, ∞) ). On the few
occasions when it is particularly relevant, we will refer to a trajectory as being complete if it is the
trajectory of a solution
\[
\mathbf{x}(t) = \begin{bmatrix} x(t) \\ y(t) \end{bmatrix} \quad\text{with}\quad \alpha < t < \beta
\]
where the interval (α, β) is “as large as possible”. (That was the implicit assumption in the previous
paragraph.)
Of course, while we may be most interested in the complete trajectories of our systems, in
practice, the trajectories we sketch are often merely portions of complete trajectories simply because
the complete trajectories often extend beyond the region over which we are making our drawings.
One obvious exception, of course, is that a critical point (x 0 , y0 ) is a complete trajectory since it is
the trajectory of the equilibrium solution

(x(t), y(t)) = (x 0 , y0 ) for −∞<t <∞ ,

and there certainly is no larger interval than (−∞, ∞) .

Properties of Trajectories
It should be noted that the ‘existence and uniqueness of solutions’ implies a corresponding ‘existence
and uniqueness of (complete) trajectories’, allowing us to say that “through each point there is one and
only one (complete) trajectory”. We’ll say more about this in section 36.7. Also in that section, we will
verify the following facts concerning trajectories of any regular autonomous standard system:
1. If a trajectory contains a critical point, then that critical point is the entire trajectory. Con-
versely, if a trajectory has nonzero length, then it does not contain a critical point of the system.
(Hence, in sketching trajectories other than critical points, make sure your trajectories do not
pass through any critical points.)

2. If a complete trajectory has an endpoint, then that endpoint must be a critical point for the
system.
3. Any oriented curve not containing a critical point that “follows” the system’s direction ﬁeld
(i.e., any oriented curve whose tangent vector at each point is in the same direction as the
system’s direction arrow at that point) is the trajectory of some solution to the system. (This
is what we intuitively expect when “sketching trajectories” on a direction ﬁeld.)

4. Any trajectory that is not a critical point is “smooth” in that no trajectory can have a “kink”
or “corner” (as in ﬁgure 36.3a).

5. No trajectory can “split” into two or more trajectories (as in figure 36.3b), nor can two or
more trajectories “merge” into one trajectory (as in figure 36.3b with the arrows reversed).
Keeping the above in mind will help in both sketching trajectories from a direction field and in
interpreting your results.

Figure 36.3: Two impossibilities for a trajectory of a regular autonomous system: (a) A sharp
“kink” at some point, and (b) a “splitting” into two different trajectories at some
point.

Phase Portraits and Planes

A little more terminology: When dealing with direction fields and trajectories for a standard 2×2
autonomous system, we refer to the plane on which we sketch the direction field and/or trajectories
as the phase plane (as opposed to the “XY–plane” or “$X_1 X_2$–plane” or …). If we sketch an
‘enlightening’ representative sample of trajectories on the phase plane, then this sketch is said to be
a phase portrait of the system. At this point, we are using direction fields to sketch phase portraits,
so we are getting phase portraits superimposed on direction fields. If a phase portrait does not have
an accompanying direction field to indicate direction of travel along the trajectories, then you should
have little arrows on the trajectories to indicate the direction of travel for each trajectory.

Higher-Order Cases
The fundamental ideas just discussed and developed for any 2 × 2 regular autonomous standard
system extend naturally to analogous ideas for any N × N regular autonomous standard system
\[
\mathbf{x}' = \begin{bmatrix} x_1' \\ x_2' \\ \vdots \\ x_N' \end{bmatrix}
= \begin{bmatrix} f_1(x_1, x_2, \ldots, x_N) \\ f_2(x_1, x_2, \ldots, x_N) \\ \vdots \\ f_N(x_1, x_2, \ldots, x_N) \end{bmatrix}
= \mathbf{F}(\mathbf{x}) .
\]

As before, we define a critical point to be any point $\left( x_1^0, x_2^0, \ldots, x_N^0 \right)$ in N-dimensional space for
which
\[
\begin{bmatrix}
f_1\left( x_1^0, x_2^0, \ldots, x_N^0 \right) \\
f_2\left( x_1^0, x_2^0, \ldots, x_N^0 \right) \\
\vdots \\
f_N\left( x_1^0, x_2^0, \ldots, x_N^0 \right)
\end{bmatrix}
= \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}
\]
and, as before, any such point is the complete trajectory of the corresponding equilibrium solution
\[
( x_1(t) , x_2(t) , \ldots , x_N(t) ) = \left( x_1^0, x_2^0, \ldots, x_N^0 \right) \quad\text{for all } t
\]
for our system $\mathbf{x}' = \mathbf{F}(\mathbf{x})$.

Likewise, at any point other than a critical point, we can, in theory, find a short vector pointing
in the direction of travel of any trajectory through that point by just taking any short vector pointing
in the same direction as $\mathbf{F}$ computed at that point. This gives a direction arrow at that point. Plotting
these direction arrows on a suitable grid of points in N-dimensional space then gives us a direction
field for the system.
Admittedly, few of us can actually sketch and use a direction field when N > 2 (especially if
N > 3 !). Still, we can find the critical points, and it turns out that much of what we will learn about
the behavior of trajectories near critical points for 2×2 systems can apply when our systems are
larger.
By the way, it is traditional to refer to the N -dimensional space in which the trajectories would,
in theory, be drawn as the phase space (phase plane if N = 2 ), with any enlightening representative
sample of trajectories in this space being called a phase portrait. Of course, visualizing a phase
portrait when N > 2 requires a certain imagination (especially if N > 3 !).
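
Although we cannot easily draw a direction field when N > 2 , the direction arrows themselves are easy to compute. The following is a minimal sketch, not a definitive implementation: the names `direction_arrow` and `F_example`, the arrow length 0.1, and the 3 × 3 system used are all assumptions made purely for illustration.

```python
import math

def direction_arrow(F, point, length=0.1):
    """Return a vector of the given length pointing in the direction of F(point),
    or None if the point is a critical point (F(point) = 0)."""
    v = F(point)
    norm = math.sqrt(sum(c * c for c in v))
    if norm == 0.0:
        return None  # critical point: there is no direction of travel
    return tuple(length * c / norm for c in v)

# Hypothetical 3x3 regular autonomous system, chosen only for illustration.
def F_example(x):
    x1, x2, x3 = x
    return (-x1, -x2, -x3)

print(direction_arrow(F_example, (3.0, 0.0, 4.0)))  # short arrow pointing toward the origin
print(direction_arrow(F_example, (0.0, 0.0, 0.0)))  # None, since (0, 0, 0) is a critical point
```

Evaluating `direction_arrow` on a grid of points would give the data for a direction field in N -dimensional space, even if we have no good way to plot it.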

36.5 Critical Points, Stability and Long-Term Behavior

A useful feature of a direction ﬁeld for an autonomous system of differential equations is that it can
give us some notion of the long-term behavior of the solutions to that system. All we need to do is
to follow the sketched trajectories.

Critical Points and Stability

Stability of Equilibrium Solutions
Of particular interest will be the long-term behavior of solutions whose trajectories pass close to a
critical point (x 0 , y0 ) of the system, and we will use this behavior to classify the ‘stability’ of that
critical point and the corresponding equilibrium solution
(x(t), y(t)) = (x 0 , y0 ) for all t .
Loosely speaking we will classify this critical point and the corresponding equilibrium solution as
being
• stable if and only if each trajectory that gets close to (x 0 , y0 ) stays close to (x 0 , y0 ) af-
terwards. That is, this critical point and equilibrium solution are stable if and only if each
solution (x, y) to the system satisfying (x(t0 ), y(t0 )) ≈ (x 0 , y0 ) for some t0 also satisfies
(x(t), y(t)) ≈ (x 0 , y0 ) for all t > t0 ,
as illustrated in figures 36.4a and 36.4b.
• asymptotically stable if and only if each trajectory that gets close to (x 0 , y0 ) doesn’t just
stay close but converges to (x 0 , y0 ) as t → +∞ . That is, this critical point and equilibrium
solution are asymptotically stable if and only if each solution (x, y) to the system satisfying
(x(t0 ), y(t0 )) ≈ (x 0 , y0 ) for some t0 also satisfies
\[
\lim_{t\to+\infty} (x(t), y(t)) = (x_0, y_0) ,
\]
as illustrated in figure 36.4b. (Obviously, an asymptotically stable critical point is automatically
stable.)
• unstable if and only if the equilibrium solution is not stable. Examples of unstable equilibrium
solutions are illustrated in ﬁgure 36.4c (in which the nearby trajectories all diverge away from
the critical point) and in ﬁgure 36.2b (in which trajectories approach the critical point (0, 0)
and then diverge away).


Figure 36.4: Two-dimensional direction ﬁelds and trajectories about critical points corresponding
to equilibrium solutions that are (a) stable, but not asymptotically stable, (b)
asymptotically stable and (c) unstable.

Be warned that the stability of an equilibrium solution is not always clear from just the direction
ﬁeld. The ﬁeld may suggest that the nearby trajectories are loops circling the critical point (indi-
cating that the equilibrium solution is stable) when, in fact, the nearby trajectories are either slowly
spiraling into or away from the critical point (in which case the equilibrium solution is actually either
asymptotically stable or is unstable).

Precise Deﬁnitions
Of course, precise mathematics requires precise deﬁnitions. So, to be precise, we classify our
equilibrium solution
(x(t), y(t)) = (x 0 , y0 ) for all t
and the corresponding critical point as being
• stable if and only if, for each ε > 0 , there is a corresponding δ > 0 such that, if (x, y) is
any solution to the system satisfying
\[
\sqrt{(x(t_0) - x_0)^2 + (y(t_0) - y_0)^2} < \delta \quad\text{for some } t_0 ,
\]
then
\[
\sqrt{(x(t) - x_0)^2 + (y(t) - y_0)^2} < \varepsilon \quad\text{for all } t > t_0 .
\]
• asymptotically stable if and only if there is a δ > 0 such that, if (x, y) is any solution to the
system satisfying
\[
\sqrt{(x(t_0) - x_0)^2 + (y(t_0) - y_0)^2} < \delta \quad\text{for some } t_0 ,
\]
then
\[
\lim_{t\to+\infty} \sqrt{(x(t) - x_0)^2 + (y(t) - y_0)^2} = 0 .
\]
• unstable if the equilibrium solution is not stable.
While we are at it, we should give precise meanings to the ‘convergence/divergence’ of a
trajectory to/from a critical point (x 0 , y0 ) . So assume we have a trajectory and any solution (x, y)
to the system that generates that trajectory. We’ll say the trajectory
• converges to critical point (x 0 , y0 ) if and only if limt→+∞ (x(t), y(t)) = (x 0 , y0 ) ,
and
• diverges from critical point (x 0 , y0 ) if and only if limt→−∞ (x(t), y(t)) = (x 0 , y0 ) .
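
To get a feel for the ε–δ definition of stability, we can track numerically how far a solution strays from a critical point. The sketch below is only an illustration under stated assumptions: the system x′ = y , y′ = −x (with critical point (0, 0) ) is a hypothetical example not taken from this chapter, the helper names `rk4_step` and `max_distance` are made up, and the classical Runge–Kutta method is just one convenient way to approximate the solution. A numerical experiment of this sort can suggest stability, but it proves nothing.

```python
import math

def rk4_step(F, state, h):
    """One classical Runge-Kutta step for the autonomous system x' = F(x)."""
    k1 = F(state)
    k2 = F(tuple(s + 0.5 * h * k for s, k in zip(state, k1)))
    k3 = F(tuple(s + 0.5 * h * k for s, k in zip(state, k2)))
    k4 = F(tuple(s + h * k for s, k in zip(state, k3)))
    return tuple(s + h * (a + 2 * b + 2 * c + d) / 6
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def max_distance(F, start, critical_point, t_end, h=0.01):
    """Largest distance from critical_point along the computed trajectory on [0, t_end]."""
    state = start
    worst = math.dist(state, critical_point)
    for _ in range(int(round(t_end / h))):
        state = rk4_step(F, state, h)
        worst = max(worst, math.dist(state, critical_point))
    return worst

# Hypothetical system x' = y, y' = -x, whose trajectories circle the origin.
def F_center(p):
    x, y = p
    return (y, -x)

epsilon = 0.1   # how close we want the solution to stay
delta = 0.05    # how close we start (a guess to be tested)
worst = max_distance(F_center, (0.9 * delta, 0.0), (0.0, 0.0), t_end=50.0)
print(worst < epsilon)  # True if this one computed solution stayed within epsilon
```

Because the computed trajectory keeps circling at roughly its starting distance, it stays close without converging, which is the behavior of an equilibrium that is stable but not asymptotically stable.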

Figure 36.5: A simple system of two tanks containing water/alcohol mixtures.

Long-Term Behavior
A direction field may also give us some idea of the long-term behavior of the solutions to the given
system at points far away from the critical points. Of course, this supposes that any patterns that
appear to be evident in the direction field actually continue outside the region on which the direction
field is drawn.

Example 36.6: Consider the direction field and sample trajectories sketched in figure 36.2b on
page 781. In particular, look at the trajectory passing through the point (1, 0) , and follow it in the
direction indicated by the direction field. The last part of this curve seems to be straightening out
to a straight line proceeding further into the first quadrant at, very roughly, a 45 degree angle to
both the positive X –axis and positive Y –axis. This suggests that, if (x(t), y(t)) is any solution
to the direction field’s system satisfying (x(t0 ), y(t0 )) = (1, 0) for some t0 , then

\[
\lim_{t\to\infty} (x(t), y(t)) = (\infty, \infty)
\]
with
\[
y(t) \approx x(t) \quad\text{when } t \text{ is “large”} .
\]
On the other hand, if you follow the trajectory passing through position (−1, 0) , then you
probably get the impression that, as t increases, the trajectory is heading deeper into the third
quadrant of the X Y –plane, suggesting that if (x(t), y(t)) is any solution to the direction ﬁeld’s
system satisfying (x(t0 ), y(t0 )) = (−1, 0) for some t0 , then

\[
\lim_{t\to\infty} (x(t), y(t)) = (-\infty, -\infty) .
\]
You may even suspect that, for such a solution,
\[
y(t) \approx x(t) \quad\text{when } t \text{ is “large”} ,
\]

though there is hardly enough of the trajectory sketched to be too conﬁdent of this suspicion.


36.6 Applications
Let us try to apply the above to three applications from the previous chapter; namely, the applications
involving two-tank mixing, competing species, and a swinging pendulum. In each case, we will ﬁnd
the critical points, attempt to use a (computer-generated) direction ﬁeld to determine the stability of
these points, and see what conclusions we can then derive.

A Mixing Problem
Our mixing problem is illustrated in ﬁgure 36.5. In it, we have two tanks A and B containing, re-
spectively, 500 and 1,000 gallons of a water/alcohol mix. Each minute 6 gallons of a water/alcohol
mix containing 50% alcohol is added to tank A, while 6 gallons of the mix is drained from that
tank. At the same time, 6 gallons of pure water is added to tank B, and 6 gallons of the mix in tank
B is drained out. In addition, the two tanks are connected by two pipes, with one pumping liquid
from tank A to tank B at a rate of 2 gallons per minute, and with the other pumping liquid in the
opposite direction, from tank B to tank A, at a rate of 2 gallons per minute.
In the previous chapter, we found that the system describing this mixing process is
\[
\begin{aligned}
x' &= -\frac{8}{500}\,x + \frac{2}{1000}\,y + 3 \\
y' &= \frac{2}{500}\,x - \frac{8}{1000}\,y
\end{aligned}
\tag{36.5}
\]
where
t = number of minutes since we started the mixing process ,

x = x(t) = amount (in gallons) of alcohol in tank A at time t ,

and
y = y(t) = amount (in gallons) of alcohol in tank B at time t .
To find any critical points of the system we first replace each derivative in system (36.5) with
0 , obtaining
\[
\begin{aligned}
0 &= -\frac{8}{500}\,x + \frac{2}{1000}\,y + 3 \\
0 &= \frac{2}{500}\,x - \frac{8}{1000}\,y
\end{aligned}
\]
Then we solve this algebraic system. That is easily done. From the last equation we have
\[
x = \frac{500}{2} \cdot \frac{8}{1000}\,y = 2y .
\]
Using this with the first equation, we get
\[
0 = -\frac{8}{500} \cdot 2y + \frac{2}{1000}\,y + 3 = 3 - \frac{15}{500}\,y
\quad\Longrightarrow\quad
y = 3 \cdot \frac{500}{15} = 100 \quad\text{and}\quad x = 2y = 2 \cdot 100 = 200 .
\]
So the one critical point for this system is
\[
(x_0, y_0) = (200, 100) .
\]
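
The elimination just carried out is easy to double-check with exact rational arithmetic. The following is a small sketch of that check; the variable names (`a`, `b`, `c`, `d`, `e`, `ratio`) are introduced here only for illustration.

```python
from fractions import Fraction as Fr

# Coefficients of system (36.5):  x' = a*x + b*y + c ,  y' = d*x + e*y
a, b, c = Fr(-8, 500), Fr(2, 1000), Fr(3)
d, e = Fr(2, 500), Fr(-8, 1000)

# The second equation, 0 = d*x + e*y, gives x = ratio*y
ratio = -e / d

# Substituting into the first equation: 0 = (a*ratio + b)*y + c
y0 = -c / (a * ratio + b)
x0 = ratio * y0

print(x0, y0)  # 200 100
```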

Figure 36.6: Direction field for the system of two tanks from figure 36.5 (horizontal axis X , vertical axis Y ).

Considering the physical process being modeled, it should seem reasonable for this critical point to
describe the long-term equilibrium of the system. That is, we should expect (200, 100) to be an
asymptotically stable equilibrium, and that
\[
\lim_{t\to\infty} (x(t), y(t)) = (200, 100) .
\]
Checking the computer-generated direction field in figure 36.6, we see that this is clearly the case.
Around critical point (200, 100) , the direction arrows are all pointing toward this point. Thus, as
t → ∞ the concentrations of alcohol in tanks A and B, respectively, approach
\[
\frac{200}{500} \ \text{(i.e., 40\%)} \quad\text{and}\quad \frac{100}{1000} \ \text{(i.e., 10\%)} .
\]
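
We can also watch this convergence numerically. The sketch below approximates solutions of system (36.5) with Euler's method, a deliberately crude choice made here for brevity. The two starting points (no alcohol in either tank, and 400 and 200 gallons of alcohol), the step size of one minute, and the 3,000-minute run time are assumptions picked only for illustration.

```python
def F(x, y):
    """Right side of system (36.5)."""
    return (-8 / 500 * x + 2 / 1000 * y + 3,
            2 / 500 * x - 8 / 1000 * y)

def euler(x, y, t_end, h=1.0):
    """Crude Euler approximation of (x(t_end), y(t_end)) starting from (x, y) at t = 0."""
    for _ in range(int(round(t_end / h))):
        dx, dy = F(x, y)
        x, y = x + h * dx, y + h * dy
    return x, y

# Two hypothetical initial conditions on opposite sides of the critical point.
for start in [(0.0, 0.0), (400.0, 200.0)]:
    x, y = euler(*start, t_end=3000.0)
    print(start, "->", round(x, 3), round(y, 3))
```

Both runs should end up very near (200, 100) , in agreement with what the direction field suggests.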

Competing Species
The Model and Analysis
In our competing species example, we assumed that we have a large field containing rabbits and
gerbils that are competing with each other for the resources in the field. Letting
R = R(t) = number of rabbits in the field at time t
and
G = G(t) = number of gerbils in the field at time t ,
we derived the following system describing how the two populations vary over time:
\[
\begin{aligned}
R' &= (\beta_1 - \gamma_1 R - \alpha_1 G)\,R \\
G' &= (\beta_2 - \gamma_2 G - \alpha_2 R)\,G
\end{aligned}
\tag{36.6}
\]
In this, β1 and β2 are the net birth rates per creature under ideal conditions for rabbits and gerbils,
respectively, and the γk ’s and αk ’s are positive constants that would probably have to be determined
by experiment and measurement. In particular, the values we chose yielded the system
\[
\begin{aligned}
R' &= \left( \frac{5}{4} - \frac{1}{160}\,R - \frac{3}{1000}\,G \right) R \\
G' &= \left( 3 - \frac{3}{500}\,G - \frac{3}{160}\,R \right) G
\end{aligned}
\tag{36.7}
\]


Setting R′ = G′ = 0 in equation set (36.7) gives us the algebraic system we need to solve to
find the critical points:
\[
\begin{aligned}
0 &= \left( \frac{5}{4} - \frac{1}{160}\,R - \frac{3}{1000}\,G \right) R \\
0 &= \left( 3 - \frac{3}{500}\,G - \frac{3}{160}\,R \right) G
\end{aligned}
\tag{36.8}
\]
The first equation in this algebraic system tells us that either
\[
R = 0 \quad\text{or}\quad \frac{1}{160}\,R + \frac{3}{1000}\,G = \frac{5}{4} .
\]
If R = 0 , the second equation reduces to
\[
0 = \left( 3 - \frac{3}{500}\,G \right) G ,
\]
which means that either
\[
G = 0 \quad\text{or}\quad G = 500 .
\]
So two critical points are (R, G) = (0, 0) and (R, G) = (0, 500) .
If, on the other hand, the first equation in algebraic system (36.8) holds because
\[
\frac{1}{160}\,R + \frac{3}{1000}\,G = \frac{5}{4} ,
\]
then the system’s second equation can only hold if either
\[
G = 0 \quad\text{or}\quad \frac{3}{500}\,G + \frac{3}{160}\,R = 3 .
\]
If G = 0 , then we can solve the first equation in the system, obtaining
\[
R = \frac{5}{4} \cdot 160 = 200 .
\]
So (R, G) = (200, 0) is one critical point. Looking at what remains, we see that there is one more
critical point, and it satisfies the simple algebraic linear system
\[
\begin{aligned}
\frac{1}{160}\,R + \frac{3}{1000}\,G &= \frac{5}{4} \\
\frac{3}{160}\,R + \frac{3}{500}\,G &= 3
\end{aligned}
\]
You can easily verify that the solution to this is (R, G) = (80, 250) .
So the critical points for our system are (R, G) equaling

(0, 0) , (0, 500) , (200, 0) and (80, 250) .

The ﬁrst tells us that, if we start with no rabbits and no gerbils, then the populations remain at 0
rabbits and 0 gerbils — no big surprise. The next tells us that our populations can remain constant if
either we have no rabbits with 500 gerbils, or we have 200 rabbits and no gerbils. The last critical
point says that the two populations can coexist at 80 rabbits and 250 gerbils.
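
As a quick check on the algebra, we can plug each of the four points back into the right side of system (36.7) using exact fractions. This is only a verification sketch; the names `F` and `candidates` are introduced here for illustration.

```python
from fractions import Fraction as Fr

def F(R, G):
    """Right side of system (36.7), evaluated exactly."""
    dR = (Fr(5, 4) - Fr(1, 160) * R - Fr(3, 1000) * G) * R
    dG = (3 - Fr(3, 500) * G - Fr(3, 160) * R) * G
    return dR, dG

candidates = [(0, 0), (0, 500), (200, 0), (80, 250)]
for point in candidates:
    # A point is a critical point exactly when both components vanish.
    print(point, F(*point) == (0, 0))
```

Each line should report `True`, confirming that both derivatives vanish at all four points.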

Figure 36.7: (a) A direction field and some trajectories for the competing species example with
system (36.7) (horizontal axis R , vertical axis G ), and detailed direction fields over very small
regions about critical points (b) (0, 500) and (c) (80, 250) .

But look at the direction field in figure 36.7a for this system. From that, it is clear that critical
point (0, 0) is unstable. If we have at least a few rabbits and/or gerbils, then those populations do
not, together, die out. We will always have at least some rabbits or gerbils.
On the other hand, critical point (200, 0) certainly appears to be an asymptotically stable critical
point. In fact, from the direction field it appears that, if we have, say, R(0) > 150 and G(0) < 250 ,
then the direction arrows “near” (200, 0) indicate that
\[
\lim_{t\to\infty} (R(t), G(t)) = (200, 0) .
\]

In other words, the gerbils die out and the number of rabbits stabilizes at 200 .
Likewise, critical point (0, 500) also appears to be asymptotically stable. Admittedly, a somewhat
more detailed direction field about (0, 500) , such as in figure 36.7b, may be desired to clarify
this. Thus, if we start with enough gerbils and too few rabbits (say, G(0) > 250 and R(0) < 80 ),
then
\[
\lim_{t\to\infty} (R(t), G(t)) = (0, 500) .
\]
In other words, the rabbits die out and the number of gerbils approaches 500 .
Finally, what about critical point (80, 250) ? In the region about this critical point, we can see
that a few direction arrows point towards this critical point, while others seem to lead the trajectories
past the critical point. That suggests that (80, 250) is an unstable critical point. Again, a more
detailed direction field in a small area about critical point (80, 250) , such as in figure 36.7c, is called
for. This direction field shows more clearly that critical point (80, 250) is unstable. Thus, while it is
possible for the populations to stabilize at 80 rabbits and 250 gerbils, it is also extremely unlikely.
To summarize: It is possible for the two competing species to coexist, but, in the long run, it
is much more likely that one or the other dies out, leaving us with either a field of 200 rabbits or a
field of 500 gerbils, depending on the initial number of each.
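
The dependence on the initial populations can be explored numerically. The following sketch approximates two solutions of system (36.7) with the classical Runge–Kutta method. The two starting populations, the step size, and the final time are hypothetical choices made only for illustration, and the helper name `rk4` is our own.

```python
def F(state):
    """Right side of system (36.7)."""
    R, G = state
    return ((5 / 4 - R / 160 - 3 * G / 1000) * R,
            (3 - 3 * G / 500 - 3 * R / 160) * G)

def rk4(F, state, t_end, h=0.01):
    """Approximate the solution of x' = F(x) at t = t_end by classical Runge-Kutta steps."""
    for _ in range(int(round(t_end / h))):
        k1 = F(state)
        k2 = F(tuple(s + 0.5 * h * k for s, k in zip(state, k1)))
        k3 = F(tuple(s + 0.5 * h * k for s, k in zip(state, k2)))
        k4 = F(tuple(s + h * k for s, k in zip(state, k3)))
        state = tuple(s + h * (a + 2 * b + 2 * c + d) / 6
                      for s, a, b, c, d in zip(state, k1, k2, k3, k4))
    return state

# Hypothetical starting populations (R(0), G(0)): many rabbits, then many gerbils.
for start in [(180.0, 50.0), (20.0, 400.0)]:
    R, G = rk4(F, start, t_end=100.0)
    print(start, "->", round(R, 3), round(G, 3))
```

The first run should settle near (200, 0) and the second near (0, 500) , matching the two outcomes read from the direction field.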

Figure 36.8: The pendulum system with a weight of mass m attached to a massless rod of length
L swinging about a pivot point under the influence of gravity.

Some Notes
It turns out that different choices for the parameters in system (36.6) can lead to very different
outcomes. For example, the system

\[
\begin{aligned}
R' &= (120 - 2R - 2G)\,R \\
G' &= (320 - 8G - 3R)\,G
\end{aligned}
\]
also has four critical points; namely,
\[
(0, 0) , \quad (60, 0) , \quad (0, 40) \quad\text{and}\quad (32, 28) .
\]
In this case, however, the first three are unstable, and the last is asymptotically stable. Thus, if we
start out with at least a few rabbits and a few gerbils, then
\[
\lim_{t\to\infty} (R(t), G(t)) = (32, 28) .
\]

So neither population dies out. The rabbits and gerbils in this field are able to coexist, and there will
eventually be approximately 32 rabbits and 28 gerbils.
It should also be noted that, sometimes, it can be difficult to determine the stability of a critical
point from a given direction field. When this happens, a more detailed direction field about the
critical point may be tried. A better approach involves a method using the eigenvalues of a certain
matrix associated with the system and critical point. Unfortunately, this approach goes beyond the
scope of this text (but not very much beyond). The reader is strongly encouraged to explore this
method in the future.
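
For the curious, here is a rough sketch of the kind of computation being alluded to, applied to system (36.7). It rests on an assumption that the text does not spell out: that the "certain matrix" is the matrix of partial derivatives (the Jacobian) of the right side of the system, evaluated at the critical point. The rule of thumb used is that a critical point is asymptotically stable if every eigenvalue has negative real part, and unstable if some eigenvalue has positive real part. The function names are our own.

```python
import cmath

def jacobian(R, G):
    """Matrix of partial derivatives of the right side of system (36.7) at (R, G)."""
    return ((5 / 4 - 2 * R / 160 - 3 * G / 1000, -3 * R / 1000),
            (-3 * G / 160, 3 - 6 * G / 500 - 3 * R / 160))

def eigenvalues(m):
    """Eigenvalues of a 2x2 matrix, computed from its trace and determinant."""
    (a, b), (c, d) = m
    tr, det = a + d, a * d - b * c
    root = cmath.sqrt(tr * tr - 4 * det)
    return (tr + root) / 2, (tr - root) / 2

def classify(R, G):
    """Classify a critical point by the signs of the real parts of the eigenvalues."""
    reals = [ev.real for ev in eigenvalues(jacobian(R, G))]
    if all(r < 0 for r in reals):
        return "asymptotically stable"
    if any(r > 0 for r in reals):
        return "unstable"
    return "inconclusive"

for point in [(0, 0), (0, 500), (200, 0), (80, 250)]:
    print(point, classify(*point))
```

For these four points the classifications agree with what we read from the direction fields in figure 36.7.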

The (Damped) Pendulum

In the last chapter, we derived the system

\[
\begin{aligned}
\theta' &= \omega \\
\omega' &= -\gamma \sin(\theta) - \kappa\omega
\end{aligned}
\tag{36.9}
\]
to describe the angular motion of the pendulum in figure 36.8. Here

θ(t) = the angular position of the pendulum at time t measured counterclockwise
from the vertical line “below” the pivot point

and
ω(t) = θ′(t) = the angular velocity of the pendulum at time t .

Figure 36.9: A phase portrait (with direction field) for pendulum system (36.10) (horizontal axis θ ).

In addition, γ is the positive constant given by γ = g/L where L is the length of the pendulum and
g is the acceleration of gravity, and κ is the “drag coefﬁcient”, a nonnegative constant describing the
effect friction has on the motion of the pendulum. The greater the effect of friction on the system,
the larger the value of κ , with κ = 0 when there is no friction slowing down the pendulum. We
will restrict our attention to the damped pendulum in which friction plays a role (i.e., κ > 0 ).2 For
simplicity, we will assume
γ = 8 and κ = 2 ,
though the precise values for γ and κ will not play much of a role in our analysis. This gives us
the system
\[
\begin{aligned}
\theta' &= \omega \\
\omega' &= -8 \sin(\theta) - 2\omega
\end{aligned}
\tag{36.10}
\]

Before going any further, observe that the right side of our system is periodic with period 2π
with respect to θ . This means that, on the θ ω–plane, the pattern of the trajectories in any vertical
strip of width 2π will be repeated in the next vertical strip of width 2π .
Setting θ′ = 0 and ω′ = 0 in system (36.10) yields the algebraic system
\[
\begin{aligned}
0 &= \omega \\
0 &= -8 \sin(\theta) - 2\omega
\end{aligned}
\]
for the critical points. From this we get that there are infinitely many critical points, and they are
given by
\[
(\theta, \omega) = (n\pi, 0) \quad\text{with}\quad n = 0, \pm 1, \pm 2, \ldots .
\]
A direction field with a few trajectories for this system is given in figure 36.9. From it, we can
see that the behavior of the trajectories near a critical point (θ, ω) = (nπ, 0) depends strongly on
whether n is an even integer or an odd integer.
2 It turns out that the undamped pendulum, in which κ = 0 , is more difficult to analyze.

If n is an even integer, then the nearby trajectories are clearly spirals “spiraling” in towards the
critical point (θ, ω) = (nπ, 0) . Hence, every critical point (nπ, 0) with n even is asymptotically
stable. That is, if n is an even integer and
\[
(\theta(t_0), \omega(t_0)) \approx (n\pi, 0)
\]
for some t0 , then
\[
(\theta(t), \omega(t)) \to (n\pi, 0) \quad\text{as}\quad t \to \infty .
\]
This makes sense. After all, if n is even, then (θ, ω) = (nπ, 0) describes a pendulum hanging
straight down and not moving — certainly what most of us would call a ‘stable equilibrium’ position
for the pendulum, and certainly the position we would expect a real-world pendulum (in which there
is inevitably some friction slowing the pendulum) to eventually assume.
Now consider any critical point (θ, ω) = (nπ, 0) when n is an odd integer. From figure 36.9
it is apparent that these critical points are unstable. This also makes sense. With n being an odd
integer, (θ, ω) = (nπ, 0) describes a stationary pendulum balanced straight up from its pivot point,
which is a physically unstable equilibrium. It may be possible to start the pendulum moving in such
a manner that it approaches this configuration. But if the initial conditions are not just right, then
the motion will be given by a trajectory that approaches and then goes away from that critical point.
In particular, the trajectories near this critical point that pass through the horizontal axis (where
the angular velocity ω is zero) are describing the pendulum slowing to a stop before reaching the
upright position and then falling back down, while the trajectories near this critical point that pass
over or below this point describe the pendulum traveling fast enough to reach and continue through
the upright position.
From figure 36.9, it is apparent that every trajectory eventually converges to one of the critical
points, with most spiraling into one of the stable critical points. This tells us that, while the pendulum
may initially have enough energy to spin in complete circles about the pivot point, it eventually stops
spinning about the pivot and begins rocking back and forth in smaller and smaller arcs about its
stable downward vertical position. Eventually, the arcs are so small that, for all practical purposes,
the pendulum is motionless and hanging straight down.
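
This long-term behavior can be observed numerically. The sketch below approximates one solution of system (36.10) with the classical Runge–Kutta method. The initial state (θ, ω) = (0, 10) (hanging straight down but given a hard push), the step size, and the final time are hypothetical choices for illustration only, and the helper name `rk4` is our own.

```python
import math

GAMMA, KAPPA = 8.0, 2.0  # the values used in system (36.10)

def F(state):
    """Right side of pendulum system (36.10)."""
    theta, omega = state
    return (omega, -GAMMA * math.sin(theta) - KAPPA * omega)

def rk4(F, state, t_end, h=0.01):
    """Approximate the solution of x' = F(x) at t = t_end by classical Runge-Kutta steps."""
    for _ in range(int(round(t_end / h))):
        k1 = F(state)
        k2 = F(tuple(s + 0.5 * h * k for s, k in zip(state, k1)))
        k3 = F(tuple(s + 0.5 * h * k for s, k in zip(state, k2)))
        k4 = F(tuple(s + h * k for s, k in zip(state, k3)))
        state = tuple(s + h * (a + 2 * b + 2 * c + d) / 6
                      for s, a, b, c, d in zip(state, k1, k2, k3, k4))
    return state

theta, omega = rk4(F, (0.0, 10.0), t_end=40.0)
n = round(theta / math.pi)  # index of the nearest critical point (n*pi, 0)
print("nearest critical point index n =", n)
print("distance from it:", abs(theta - n * math.pi), abs(omega))
```

If the run behaves as the phase portrait suggests, n will be even and both reported distances will be tiny, meaning the pendulum has effectively come to rest hanging straight down.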
By the way, it’s fairly easy to redo the above using fairly arbitrary positive choices of γ and
κ in pendulum system (36.9). As long as the friction is not too strong (i.e., as long as κ is not too
large compared to γ ), the resulting phase portrait will be quite similar to what we just obtained.
However, if κ is too large compared to γ (to be precise, if κ ≥ 2√γ ), then, while the critical points
and their stability remain the same as above, the trajectories no longer spiral about the stable critical
points but approach them more directly. The interested reader is encouraged to redo the above with
a relatively large κ to see for themselves.

36.7 Existence and Uniqueness of Trajectories

The existence and uniqueness of solutions to standard systems of differential equations were dis-
cussed in the last chapter. It is worth noting that the assumption of a system being regular immediately
implies that the component functions all satisfy the conditions required in theorem 35.1. So we au-
tomatically have:

Corollary 36.1
Suppose x′ = F(t, x) is a regular 2 × 2 system. Then any initial-value problem involving this
system,
\[
\mathbf{x}' = \mathbf{F}(t, \mathbf{x}) \quad\text{with}\quad \mathbf{x}(t_0) = \mathbf{a} ,
\]
has exactly one solution on some open interval (α, β) containing t0 . Moreover, the component
functions of this solution and their first derivatives are continuous over that interval.

As almost immediate corollaries of corollary 36.1 (along with a few observations made in section
36.4), we have the following facts concerning trajectories of any given 2 × 2 standard ﬁrst-order
system that is both regular and autonomous:
1. Through each point of the plane there is exactly one complete trajectory.

2. If a trajectory contains a critical point for that system, then that entire trajectory is that single
critical point. Conversely, if a trajectory for a regular autonomous system has nonzero length,
then that trajectory does not contain a critical point.

3. Any trajectory that is not a critical point is “smooth” in that no trajectory can have a “kink”
or “corner”.

The other properties of trajectories that were claimed earlier all follow from the following
theorem and its corollaries.

Theorem 36.2
Assume x′ = F(x) is a 2×2 standard first-order system that is both regular and autonomous, and
let C be any oriented curve of nonzero length such that all the following hold at each point (x, y)
in C :
1. The point (x, y) is not a critical point for the system.
2. The curve C has a unique tangent line at (x, y) , and that line is parallel to the vector F(x) .
3. The direction of travel of C through (x, y) is in the same direction as given by the vector
F(x) .
Then C (excluding any endpoints) is the trajectory for some solution to the system x′ = F(x) .

This theorem assures us that, in theory, the curves drawn “following” a direction field will be
trajectories of our system (in practice, of course, the curves we actually draw will be approximations).
Combining this theorem with the existence and uniqueness results of corollary 36.1 leads to the next
two corollaries regarding complete trajectories.

Corollary 36.3
Two different complete trajectories of a regular autonomous system cannot intersect each other.

Corollary 36.4
If a complete trajectory of a regular autonomous system has an endpoint, that endpoint must be a
critical point.

We’ll discuss the proof of the above theorem in the next section for those who are interested.
Verifying the two corollaries will be left as exercises (see exercise 36.12).


36.8 Proving Theorem 36.2

The Assumptions
In all the following, let us assume we have some regular autonomous 2 × 2 standard system of
differential equations
\[
\mathbf{x}' = \mathbf{F}(\mathbf{x}) ,
\]
along with an oriented curve C of nonzero length such that all the following hold at each point
(x, y) in C :
1. The point (x, y) is not a critical point for the system.
2. The curve C has a unique tangent line at (x, y) , and that line is parallel to the vector F(x) .
3. The direction of travel of C through (x, y) is in the same direction as given by the vector
F(x) .
Note that the requirement that C has a tangent line at each point in C means that we are excluding
any endpoints of this curve.

Preliminaries
To verify our theorem, we will need some material concerning “parametric curves” that you should
recall from your calculus course.

Norms and Normalizations

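The norm (or length) of a column vector or vector-valued function
\[
\mathbf{v} = \begin{bmatrix} v_1 \\ v_2 \end{bmatrix}
\]
is
\[
\|\mathbf{v}\| = \sqrt{[v_1]^2 + [v_2]^2} .
\]
If v is a nonzero vector, then we can normalize it by dividing it by its norm, obtaining a vector
\[
\mathbf{n} = \frac{\mathbf{v}}{\|\mathbf{v}\|} = \frac{1}{\sqrt{v_1^2 + v_2^2}} \begin{bmatrix} v_1 \\ v_2 \end{bmatrix}
\]
of unit length (i.e., ‖n‖ = 1 ) and pointing in the same direction as v .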

Oriented Curves and Unit Tangents

If (x, y) is any point on any oriented curve at which the curve has a well-deﬁned tangent line, then
this curve has a unit tangent vector at (x, y) , denoted by T(x, y) , which is simply a unit vector
tangent to the curve at that point, and pointing in the direction of travel along the curve at that point.
For our oriented curve, C , that tangent line is parallel to F(x) , and the direction of travel is given
by F(x) . So the unit tangent at (x, y) must be the normalization of F(x) . That is,
\[
\mathbf{T}(x, y) = \frac{\mathbf{F}(\mathbf{x})}{\|\mathbf{F}(\mathbf{x})\|} \quad\text{for each } (x, y) \text{ in } C . \tag{36.11}
\]


Curve Parameterizations
A parametrization of an oriented curve C is an ordered pair of functions on some interval
\[
(x(t), y(t)) \quad\text{for}\quad t_S < t < t_E
\]
that traces out the curve in the direction of travel along C as t varies from t_S to t_E . Given any
such parametrization, we will automatically let
\[
\mathbf{x} = \mathbf{x}(t) = \begin{bmatrix} x(t) \\ y(t) \end{bmatrix}
\quad\text{and}\quad
\mathbf{x}' = \mathbf{x}'(t) = \begin{bmatrix} x'(t) \\ y'(t) \end{bmatrix} .
\]
If we view our parametrization (x(t), y(t)) as giving the position at time t of some object
traveling along C , then, provided the functions are suitably differentiable,
\[
\mathbf{x}'(t) = \begin{bmatrix} x'(t) \\ y'(t) \end{bmatrix}
\]
is the corresponding “velocity” of the object at time t . This is a vector pointing in the direction of
travel of the object at time t , and whose length,
\[
\|\mathbf{x}'(t)\| = \sqrt{[x'(t)]^2 + [y'(t)]^2} ,
\]
is the speed of the object at time t (i.e., as it goes through position (x(t), y(t)) ). Recall that the
integral of this speed from t = t1 to t = t2 ,
\[
\int_{t_1}^{t_2} \|\mathbf{x}'(t)\| \, dt , \tag{36.12}
\]
gives the signed distance one would travel along the curve in going from position (x(t1 ), y(t1 )) to
position (x(t2 ), y(t2 )) . This value is positive if t1 < t2 and negative if t1 > t2 . Recall, also, that
this distance (the arclength) is traditionally denoted by s .
The most fundamental parametrizations are the arclength parametrizations. To define one for
our oriented curve C , first pick some point (x 0 , y0 ) on C . Then let s S and s E be, respectively,
the negative and positive values such that s E is the “maximal distance” that can be traveled in the
positive direction along C from (x 0 , y0 ) , and |s S | is the “maximal distance” that can be traveled
in the negative direction along C from (x 0 , y0 ) . These distances may be infinite.3 Finally, define
the arclength parametrization
(x̃(s), ỹ(s)) for sS < s < s E

as follows (and as indicated in ﬁgure 36.10):

1. For 0 ≤ s < s E set (x̃(s), ỹ(s)) equal to the point on C arrived at by traveling in the
positive direction along C by a distance of s from (x 0 , y0 ) .

2. For s S < s ≤ 0 set (x̃(s), ỹ(s)) equal to the point on C arrived at by traveling in the
negative direction along C by a distance of |s| from (x 0 , y0 ) .
We should note that if the curve intersects itself, then the same point (x̃(s), ỹ(s)) may be given
by more than one value of s . In particular, if C is a loop of length L , then (x̃(s), ỹ(s)) will be
periodic with (x̃(s + L), ỹ(s + L)) = (x̃(s), ỹ(s)) for every real value s .

3 Better definitions for s_S and s_E are discussed in the ‘technical note’ at the end of this subsection.

Figure 36.10: Two points given by an arclength parameterization x̃(s) of an oriented curve.

It should also be noted that, from arclength integral (36.12) and the fact that, by definition, s
is the signed distance one would travel along the curve to go from (x̃(0), ỹ(0)) to (x̃(s), ỹ(s)) , we
automatically have
\[
\int_0^s \|\tilde{\mathbf{x}}'(\sigma)\| \, d\sigma = s .
\]
Differentiating this yields
\[
\|\tilde{\mathbf{x}}'(s)\| = \frac{d}{ds} \int_0^s \|\tilde{\mathbf{x}}'(\sigma)\| \, d\sigma = \frac{ds}{ds} = 1 .
\]
Hence, each x̃′(s) is a unit vector pointing in the direction of travel on C at x̃(s) ; that is, x̃′(s)
is the unit tangent vector for C at (x̃(s), ỹ(s)) . Combining this with equation (36.11) yields
\[
\tilde{\mathbf{x}}'(s) = \mathbf{T}(\tilde{x}(s), \tilde{y}(s)) = \frac{\mathbf{F}(\tilde{\mathbf{x}}(s))}{\|\mathbf{F}(\tilde{\mathbf{x}}(s))\|} \quad\text{for}\quad s_S < s < s_E . \tag{36.13}
\]

Technical Note on “Maximal Distances”

We set s E equal to “the ‘maximal distance’ that can be traveled in the positive direction along C
from (x 0 , y0 ) ”. Technically, this “maximal distance” may not exist because, technically, an endpoint
of a trajectory need not actually be part of that trajectory.
To be more precise, let us define a subset S of the positive real numbers by specifying that

s is in S

if and only if
there is a point on C arrived at by traveling a distance of s in the positive direction
along C from (x0 , y0 ) .
With a little thought, it should be clear that S must be a subinterval of (0, ∞) (assuming some
‘obvious facts’ about the nature of curves). One end point of S must clearly be 0 . The other
endpoint gives us the value s E . In particular, letting C + be that part of C containing all the points
arrived at by traveling in the positive direction along C from (x 0 , y0 ) :
1. If C + is a closed loop, then s E = ∞ (because we keep going around the loop as s increases).
2. If C + is a curve that does not intersect itself, then s E is the length of C + (which may be
inﬁnite).
Obviously, similar comments can be made regarding the deﬁnition of s S .


Finishing the Proof of Theorem 36.2

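Let us now use the arclength parameterization (x̃, ỹ) to define a function t̃ of s by
\[
\tilde{t}(s) = \int_0^s \frac{1}{\|\mathbf{F}(\tilde{\mathbf{x}}(\sigma))\|} \, d\sigma \quad\text{for}\quad s_S < s < s_E .
\]
Since C contains no critical points, the integrand is always finite and positive, and the above function
is a differentiable steadily increasing function with
\[
\tilde{t}'(s) = \frac{1}{\|\mathbf{F}(\tilde{\mathbf{x}}(s))\|} \quad\text{for}\quad s_S < s < s_E .
\]
Consequently, for each s in (s_S , s_E ) , there is exactly one corresponding t with t = t̃(s) . Thus,
we can invert this relationship, defining a function s̃ by
\[
\tilde{s}(t) = s \quad\Longleftrightarrow\quad t = \tilde{t}(s) .
\]
The function s̃ is defined on the interval (t_S , t_E ) where
\[
t_S = \lim_{s \to s_S^+} \tilde{t}(s) \quad\text{and}\quad t_E = \lim_{s \to s_E^-} \tilde{t}(s) .
\]
By definition,
\[
s = \tilde{s}\big(\tilde{t}(s)\big) \quad\text{for}\quad s_S < s < s_E .
\]
From this, the chain rule and the above formula for t̃′ , we get
\[
1 = \frac{ds}{ds} = \frac{d}{ds}\Big[ \tilde{s}\big(\tilde{t}(s)\big) \Big] = \tilde{s}'\big(\tilde{t}(s)\big)\, \tilde{t}'(s) = \tilde{s}'\big(\tilde{t}(s)\big) \cdot \frac{1}{\|\mathbf{F}(\tilde{\mathbf{x}}(s))\|} .
\]
Hence,
\[
\tilde{s}'\big(\tilde{t}(s)\big) = \|\mathbf{F}(\tilde{\mathbf{x}}(s))\| . \tag{36.14}
\]
Now let
\[
(x(t), y(t)) = \big( \tilde{x}(\tilde{s}(t)) , \tilde{y}(\tilde{s}(t)) \big) \quad\text{for}\quad t_S < t < t_E .
\]
Observe that (x(t), y(t)) will trace out C as t varies from t_S to t_E , and that
\[
\mathbf{F}(\mathbf{x}(t)) = \mathbf{F}\big(\tilde{\mathbf{x}}(\tilde{s}(t))\big)
\]
where
\[
\mathbf{x}(t) = \begin{bmatrix} x(t) \\ y(t) \end{bmatrix} = \begin{bmatrix} \tilde{x}(\tilde{s}(t)) \\ \tilde{y}(\tilde{s}(t)) \end{bmatrix} = \tilde{\mathbf{x}}(\tilde{s}(t)) .
\]
The differentiation of this (using the chain rule applied to the components), along with equations
(36.13) and (36.14), then yields
\[
\mathbf{x}'(t) = \frac{d}{dt}\Big[ \tilde{\mathbf{x}}(\tilde{s}(t)) \Big] = \tilde{\mathbf{x}}'(\tilde{s}(t)) \cdot \tilde{s}'(t) = \frac{\mathbf{F}(\mathbf{x}(t))}{\|\mathbf{F}(\mathbf{x}(t))\|} \cdot \|\mathbf{F}(\mathbf{x}(t))\| = \mathbf{F}(\mathbf{x}(t)) ,
\]
completing our proof of theorem 36.2.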
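
It may help to see the construction carried out in one simple case. The system and curve below are hypothetical choices made only for illustration: take x′ = −y , y′ = x , so that F(x) has components −y and x , and let C be the circle of radius r > 0 about the origin, oriented counterclockwise. The only critical point is (0, 0) , which is not on C , and F is tangent to C in the direction of travel, so the assumptions of the theorem hold. Measuring arclength from (x0 , y0 ) = (r, 0) , the steps of the proof become

\[
\begin{aligned}
\tilde{\mathbf{x}}(s) &= \begin{bmatrix} r\cos(s/r) \\ r\sin(s/r) \end{bmatrix} , \qquad \|\mathbf{F}(\tilde{\mathbf{x}}(s))\| = \sqrt{r^2\sin^2(s/r) + r^2\cos^2(s/r)} = r , \\
\tilde{t}(s) &= \int_0^s \frac{1}{r} \, d\sigma = \frac{s}{r} , \qquad \tilde{s}(t) = rt , \\
\mathbf{x}(t) &= \tilde{\mathbf{x}}(\tilde{s}(t)) = \begin{bmatrix} r\cos(t) \\ r\sin(t) \end{bmatrix} , \qquad
\mathbf{x}'(t) = \begin{bmatrix} -r\sin(t) \\ r\cos(t) \end{bmatrix} = \begin{bmatrix} -y(t) \\ x(t) \end{bmatrix} = \mathbf{F}(\mathbf{x}(t)) .
\end{aligned}
\]

So, in this case, reparameterizing the circle by t = s/r does turn the arclength parameterization into a solution of the system, just as the proof says it should. Since C is a closed loop, here s_S = −∞ and s_E = ∞ , and the solution is defined for all t .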


Additional Exercises

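36.1. Find all the constant/equilibrium solutions to each of the following systems:

a. \( \begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} 2x - 5y \\ 3x - 7y \end{bmatrix} \)

b. \( \begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} 2x - 5y + 4 \\ 3x - 7y + 5 \end{bmatrix} \)

c. \( \begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} 3x + y \\ 6x + 2y \end{bmatrix} \)

d. \( \begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} xy - 6y \\ x - y - 5 \end{bmatrix} \)

e. \( \begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} x^2 - y^2 \\ x^2 - 6x + 8 \end{bmatrix} \)

f. \( \begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} x \sin(y) \\ x^2 - 6x + 9 \end{bmatrix} \)

g. \( \begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} 4x - xy \\ x^2 y + y^3 - x^2 - y^2 \end{bmatrix} \)

h. \( \begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} x^2 + y^2 + 4 \\ 2x - 6y \end{bmatrix} \)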
36.2. Sketch the direction field for the system
\[
\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} -x + 2y \\ 2x - y \end{bmatrix}
\]
a. on the 2×2 grid with (x, y) = (0, 0) , (2, 0) , (0, 2) and (2, 2) .
b. on the 3×3 grid with x = −1 , 0 and 1 , and with y = −1 , 0 and 1 .
36.3. Sketch the direction field for the system
\[
\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} (1 - 2x)(y + 1) \\ x - y \end{bmatrix}
\]
on the 3×3 grid with x = 0 , 1/2 and 1 , and with y = 0 , 1/2 and 1 .
36.4. A direction field for
\[
\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} -x + 2y \\ 2x - y \end{bmatrix}
\]
has been sketched to the right. Using this system and direction field:
[Figure: direction field on the XY–plane.]
a. Find and plot all the critical points.
b. Sketch the trajectories that go through points (1, 0) and (0, 1) .
c. Sketch a phase portrait for this system.
d. Suppose (x(t), y(t)) is the solution to this system satisfying (x(0), y(0)) = (1, 0) . What
apparently happens to x(t) and y(t) as t gets large?
e. As well as you can, decide whether the critical point found above is asymptotically stable,
stable but not asymptotically stable, or unstable.


802 Critical Points, Direction Fields and Trajectories

36.5. A direction field for

\[ \begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} -x + 2y \\ -2x + 2 \end{bmatrix} \]

has been sketched to the right. Using this system and direction field:

[Figure: direction field for this system, with axes X and Y.]

a. Find and plot all the critical points.
b. Sketch the trajectories that go through points (−1, 0) and (0, 2) .
c. Sketch a phase portrait for this system.
d. Suppose (x(t), y(t)) is the solution to this system satisfying (x(0), y(0)) = (−1, 0) . What apparently happens to x(t) and y(t) as t gets large?
e. As well as you can, decide whether the critical point found in part a is asymptotically stable, stable but not asymptotically stable, or unstable.

36.6. A direction field for

\[ \begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} y \\ -\sin(2x) \end{bmatrix} \]

has been sketched to the right. Using this system and direction field:

[Figure: direction field for this system, with axes X and Y.]

a. Find and plot all the critical points.
b. All the critical points of this system are either stable (but not asymptotically stable) or unstable. Using this direction field, determine which are stable and which are unstable.
c. Sketch the trajectories that go through points (1, 0) and (0, 2) .
d. Sketch a phase portrait for this system.
e. Suppose (x(t), y(t)) is the solution to this system satisfying (x(0), y(0)) = (1, 0) . What apparently happens to x(t) and y(t) as t gets large?


36.7. The system

\[ \begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} x - 2y - 1 \\ 2x - y - 2 \end{bmatrix} \]

has one critical point and that point is stable, but not asymptotically stable. A direction field for this system has been sketched to the right. Using this information:

[Figure: direction field for this system, with axes X and Y.]

a. Find and plot the critical point.
b. Sketch the trajectories that go through points (0, 0) and (0, 1) .
c. Sketch a phase portrait for this system.
d. Suppose (x(t), y(t)) is the solution to this system satisfying (x(0), y(0)) = (0, 0) . What apparently happens to x(t) and y(t) as t gets large?

36.8. A direction field for

\[ \begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} -2x + y + 1 \\ -x - 4y + 5 \end{bmatrix} \]

has been sketched to the right. Using this system and direction field:

[Figure: direction field for this system, with axes X and Y.]

a. Find and plot the one critical point of this system.
b. Decide whether this critical point is asymptotically stable, stable but not asymptotically stable, or unstable.
c. Sketch the trajectories that go through points (2, 0) and (0, 2) .
d. Sketch a phase portrait for this system.
e. Suppose (x(t), y(t)) is the solution to this system satisfying (x(0), y(0)) = (1, 0) . What apparently happens to x(t) and y(t) as t gets large?


36.9. A direction field for

\[ \begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} x + 4y^2 - 1 \\ 2x - y - 2 \end{bmatrix} \]

has been sketched to the right. Using this system and direction field:

[Figure: direction field for this system, with axes X and Y.]

a. Find and plot all the critical points.
b. Sketch the trajectories that go through the points (0, 0) and (1, 1) .
c. Sketch a phase portrait for this system.

36.10. Look up the commands for generating direction ﬁelds for systems of differential equations
in your favorite computer math package. Then, use these commands to do the following
for each problem below:
i. Sketch the indicated direction ﬁeld for the given system.

ii. Use the resulting direction ﬁeld to sketch (by hand) a phase portrait for the system.

a. The system is

\[ \begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} -x + 2y \\ 2x - y \end{bmatrix} . \]

Use a 25×25 grid on the region −1 ≤ x ≤ 3 and −1 ≤ y ≤ 3 . (Compare the resulting
direction field to the direction arrows computed in exercise 36.2.)

b. The system is

\[ \begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} (2x - 1)(y + 1) \\ y - x \end{bmatrix} . \]

Use a 25×25 grid on the region −1 ≤ x ≤ 3 and −1 ≤ y ≤ 3 . (Compare the resulting
direction field to the direction field found in exercise 36.3.)

c. The system is

\[ \begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} x + 4y^2 - 1 \\ 2x - y - 2 \end{bmatrix} . \]

Use a 25×25 grid on the region 0 ≤ x ≤ 2 and −1 ≤ y ≤ 1 . (This gives a ‘close up’
view of the critical points of the system in exercise 36.9.)

d. The system is

\[ \begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} x + 4y^2 - 1 \\ 2x - y - 2 \end{bmatrix} . \]

Use a 25×25 grid on the region 3/4 ≤ x ≤ 5/4 and −1/4 ≤ y ≤ 1/4 . (This gives an even
‘closer up’ view of the critical points of the system in exercise 36.9.)


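36.11. Consider the initial-value problem

\[ \begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} f(x, y) \\ g(x, y) \end{bmatrix} \quad\text{with}\quad \begin{bmatrix} x(0) \\ y(0) \end{bmatrix} = \begin{bmatrix} a \\ b \end{bmatrix} . \]

Assume the system is regular and autonomous, and that $(\tilde{x}, \tilde{y})$ is a solution to the above on
an interval $(\alpha, \beta)$ containing 0 . Now let $t_0$ be any other point on the real line, and show
that a solution to

\[ \begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} f(x, y) \\ g(x, y) \end{bmatrix} \quad\text{with}\quad \begin{bmatrix} x(t_0) \\ y(t_0) \end{bmatrix} = \begin{bmatrix} a \\ b \end{bmatrix} \]

is then given by

\[ (x(t), y(t)) = \bigl(\tilde{x}(t - t_0), \tilde{y}(t - t_0)\bigr) \quad\text{for}\quad \alpha + t_0 < t < \beta + t_0 . \]

(Hint: Showing that the first-order system is satisfied is a simple chain rule problem.)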

36.12. Using corollary 36.1 on page 795 on the existence and uniqueness of solutions to regular
systems and theorem 36.2 on page 796 on curves being trajectories, along with (possibly)
the results of exercise 36.11, verify the following:
a. Corollary 36.3 on page 796. (Hint: Start by assuming the two trajectories do intersect.)
b. Corollary 36.4 on page 796. (Hint: Start by assuming an endpoint of a given maximal
trajectory is not a critical point.)
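
Several of the exercises above ask for direction arrows on a small grid. As a rough aid, the following sketch computes unit direction vectors for the system of exercise 36.2 on the 3×3 grid with x, y in {−1, 0, 1}. It is only one possible way to organize the computation: the function names are hypothetical, no plotting package is assumed, and a point where both right-hand sides vanish is reported as a critical point instead of being given an arrow.

```python
import math

def field(x, y):
    # Right-hand sides of the system in exercise 36.2:
    #   x' = -x + 2y ,  y' = 2x - y
    return (-x + 2 * y, 2 * x - y)

def direction(x, y):
    # Unit vector in the direction of (x', y'), or None at a critical point
    fx, fy = field(x, y)
    norm = math.hypot(fx, fy)
    if norm == 0:
        return None
    return (fx / norm, fy / norm)

def direction_grid(xs, ys):
    # Map each grid point to its direction arrow
    return {(x, y): direction(x, y) for x in xs for y in ys}

grid = direction_grid([-1, 0, 1], [-1, 0, 1])
for point, arrow in sorted(grid.items()):
    if arrow is None:
        print(point, "critical point")
    else:
        print(point, f"({arrow[0]:+.3f}, {arrow[1]:+.3f})")
```

Swapping in a different `field` function adapts the same sketch to the other systems in this exercise set. The arrows would still need to be drawn by hand or passed to whatever plotting command your computer math package provides.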

Appendix
Author’s Guide to Using This Text

The following comments are mainly directed to the instructors of introductory courses in dif-
ferential equations. That said, the students in these courses and independent readers may also proﬁt
from this discussion.

This text was written to be as complete and rigorous as is practical for an introductory text on
differential equations. Unfortunately, it is rarely practical for an instructor of an introductory course
on differential equations to be as complete and rigorous as they would like. Constraints due
to the time available, the mathematical background of the students, the imposed requirements on the
topics to be covered, and, sometimes, meddling administrators all force us to focus our instruction,
abbreviate the development of some topics and even sacriﬁce interesting material. To a great extent
that is why this text contains much more material than you, the instructor, can realistically cover —
so that the interested student can, on their own time, go back and pick up some of the material that
they will later ﬁnd useful and/or interesting.
So let me, the author, tell you, the instructor, my opinion of what material in this text should be
covered, what is optional, and what should probably be skipped in your course.

A.1 Overview
This text is divided into six parts:
I. The Basics
II. First-Order Equations
III. Second- and Higher-Order Equations
IV. The Laplace Transform
V. Power Series and Modiﬁed Power Series Solutions
VI. Systems of Differential Equations (A Brief Introduction)
The ﬁrst three parts should be viewed as the core of your course and should be covered as completely
as practical (subject to the further advice given in the chapter-by-chapter commentary that follows).
The material in these three parts is needed for the next three parts. However, the next three parts are
independent of each other (excluding one optional section in part VI on the use of Laplace transforms


3. Some Basics about First-Order Equations
Sections 3.1 and 3.2 are fundamental and should not be skipped.
Section 3.3 (on existence and uniqueness) should only be brieﬂy discussed, and that discussion
should probably be limited to theorem 3.1. Most instructors will want to skip the rest of chapter 3.
(Still, you might tell the more mathematically inquisitive that the Picard iteration method developed
in section 3.4 is pretty cool, and that the discussion is fairly easy to follow since the boring parts
have been removed and stuck in sections 3.5 and 3.6.)

4. Separable First-Order Equations

Cover sections 4.1, 4.2, 4.3 and 4.4.
You can skip sections 4.5 (Existence, Uniqueness and False Solutions), 4.6 (On the Nature of
Solutions to DEs) and 4.7 (Using and Graphing Implicit Solutions). In an ideal world, the material
in these sections would be recognized as important for understanding “solutions”. But this is not the
ideal world, and there isn’t enough time to cover everything. Tell your students that they can read it
on their own, and that understanding this material will help lead them to enlightenment.
You can also skip section 4.8. It’s on using deﬁnite integrals.

5. Linear First-Order Equations

Cover sections 5.1 and 5.2. Sections 5.3 (on using deﬁnite integrals) and 5.4 (a bit of theory) can be
ignored by most.

6. Simplifying Through Substitution

Cover the whole thing. Students should be able to recognize and use the “obvious” substitutions in
sections 6.2 and 6.3. They should also be able to use any other reasonable substitution suggested to
them. On the other hand, I see little value in memorizing the substitution for the Bernoulli equations
— being able to derive it (exercise 6.8), however, is of great value.


7. Exact Equations and General Integrating Factors

Whether or not this chapter is covered depends on the background of your students. If they have
had a course covering calculus of several variables — in particular, if they are acquainted with the
multidimensional chain rule — then all of this chapter should be covered. After all, the methods for
dealing with ﬁrst-order differential equations discussed in chapters 4 and 5 are all just special cases
of the general methods discussed in this chapter.
On the other hand, many introductory courses in differential equations do not have multivariable
calculus as a prerequisite, and the students in those courses cannot be expected to know the multidi-
mensional chain rule on which almost everything in this chapter is based. That makes covering this
material problematic. For that reason, this chapter and the rest of the text were written so that this
chapter could be omitted. (But students in these courses should realize that they will want to read
this chapter on their own after learning about the multidimensional chain rule.)

8. Slope Fields: Graphing Solutions Without the Solutions

Cover sections 8.1 and 8.2. These sections describe what slope ﬁelds are, why they can be useful,
and how to construct and use them. I’ve also tried to make the exercises more meaningful than just
“sketch a bunch of solution curves”. If you want a computer assignment (highly recommended),
require exercise 8.8.
Sections 8.3, 8.4 and 8.5 cover useful stuff regarding slope ﬁelds that is rarely discussed in intro-
ductory differential equation courses. If time permits, a brief discussion of stability as approached
in section 8.3 may be enlightening, as might a discussion of “problem points”, as done in section 8.4,
just to illustrate how solutions can go bad.

9. Euler’s Numerical Method

This development of Euler’s numerical method for first-order differential equations follows naturally
from the discussion of slope fields in the previous chapter. Cover sections 9.1 and 9.2, and then briefly
comment on the material in sections 9.3 and 9.4 regarding the errors that can arise in using Euler’s
method. The detailed error analysis in section 9.5 is only for the most dedicated.
(By the way, you can go straight from chapter 8 to chapter 10, and cover chapter 9 later. If
a chapter must be sacrificed because of time limitations, this would probably be one of the better
candidates for the sacrifice.)

10. The Art and Science of Modeling with First-Order Equations

Cover sections 10.1 through 10.6. Consider section 10.7 on thermodynamics (Newton’s law of
heating and cooling) as optional. Section 10.8 is extremely optional; it covers technical issues that
only mathematicians worry about.
Add a few applications of your own if you want.
By the way, my approach to “applications” is a bit different from that found in many other texts.
I prefer working on the reader’s ability to derive and use the differential equations modeling any
given situation, rather than just flashing a big list of applications.

Part III: Second- and Higher-Order Equations

11. Higher-Order Equations: Extending First-Order Concepts
Cover section 11.1. That material is needed for future development. I’ll leave it to you to decide
whether the solution method in section 11.2 is worth covering; it can be safely skipped.
Cover section 11.3 on initial-value problems.


Chapter-by-Chapter Guide 811

Section 11.4 is on the existence and uniqueness of solutions to higher-order differential equa-
tions, and should only be brieﬂy discussed, if at all. At most, mention that the existence and
uniqueness results for ﬁrst-order equations have higher-order analogs.

12. Higher-Order Linear Equations and the Reduction of Order Method

Cover sections 12.1, 12.2 and 12.3. All of this is needed for later development.
Extensions of the reduction of order method are discussed in sections 12.4 and 12.5, along
with explanations as to why these extensions are rarely used in practice. Consider these sections as
very optional, but, to hint at future developments, you might mention that the variation of parameters
method for solving nonhomogeneous equations, which will be developed later, is just an improvement
on the reduction of order method for nonhomogeneous equations discussed in section 12.4.

13. General Solutions to Homogeneous Linear Differential Equations

Cover sections 13.1 and 13.2. This material is fundamental. However, if you want to concentrate on
second-order equations at ﬁrst, you can cover section 13.1 now (which just concerns second-order
equations), and then go on to chapters 15 and 16 (and maybe the ﬁrst two sections of chapter 18 on
Euler equations). You can then go back and cover section 13.2 just before discussing the higher-order
equations in chapter 17.
Section 13.3 concerns Wronskians. Consider it optional. This section was written assuming
the students had not yet taken a course in linear algebra. Under this assumption, I do not feel that
Wronskians are worth any more than a brief mention here. It’s later that Wronskians become truly
of interest.

14. Verifying the Big Theorems and an Introduction to Differential Operators

You can, and probably should, completely skip this chapter.
The ﬁrst two sections are there just to assure readers that I didn’t make up the big theorems in
the previous chapter. Of course, you can read it for your own personal enjoyment and so that you
can tell me of all the typos in this chapter.
Section 14.3 can help the students better understand “differential operators” in the context of
differential equations, but the material in this section is only used later in the text to prove a few
theorems. Since you are not likely to be discussing these particular proofs, you can safely skip this
section. Suggest this section as enrichment for your more inquisitive students.

15. Second-Order Homogeneous Linear Equations with Constant Coefficients

This may be the most important chapter for many of your students. By the end of the term, they
should be able to solve these equations in their sleep. I also consider the introduction of the complex
exponential as a practical tool for doing trigonometric function computations to be important.
Cover everything in this chapter except, possibly, example 15.5 on page 312 (that example
concerns using the complex exponential to derive trigonometric identities — a really nice thing,
but you’ll probably be a little pressed for time and will want to concentrate on solving differential
equations.)


16. Springs: Part I

This is a fairly standard discussion of unforced springs (maybe with more emphasis than usual on
the modeling process). Cover all of it quickly.

17. Arbitrary Homogeneous Linear Equations with Constant Coefficients

Cover all of sections 17.1, 17.2 and 17.3.
Don’t cover sections 17.4 and 17.5. They contain rigorous proofs of theorems naively justiﬁed
earlier in the chapter. Besides, you will need the material from section 14.3 (which you probably
skipped).

18. Euler Equations

Cover sections 18.1 and 18.2. Section 18.3 is optional. Don’t bother covering section 18.4, though
you may want to brieﬂy comment on that material (which relates Euler equations to equations with
constant coefﬁcients via a substitution).
By the way, the approach to Euler equations taken here is taken to help reinforce the students’
grasp of the theory developed for general linear equations. It will also help prepare them for the
series methods (especially the method of Frobenius) that some of them will later see. That is why I
downplay the substitution method commonly used by others.

19. Nonhomogeneous Equations in General


30. Power Series Solutions I: Basic Computational Methods

Cover all of sections 30.1, 30.2, 30.3, 30.4 and 30.5. These cover the finding of basic power series
solutions (and their radii of convergence) for first- and second-order linear equations with rational
coefficients.
Section 30.6, on using Taylor series, can be enlightening. It can also be considered optional.
Section 30.7 is an appendix on using induction when deriving the formulas for the coefficients
in a power series solutions. Theoretically, it should be covered. As a practical matter, it’s probably
best to simply tell your more mathematically mature students that this is a section they may want to
read on their own.

31. Power Series Solutions II: Generalizations and Theory

In this chapter:

1. The ideas and solution methods discussed in the previous chapter are extended to corresponding
ideas and solution methods for first- and second-order linear equations with analytic
coefficients.

2. The validity of these methods and the claims of the relation between singular points and the
radii of convergence for the power series solutions are rigorously veriﬁed.

It may be a good idea to quickly go through sections 31.1 and 31.2 since some of the basic notions
developed here will apply later in the next two chapters. However, the extension of the solution methods
and the rigorous verification of the validity of the methods, no matter how skillfully done, may be
something you would not want to deal with in an introductory course. I certainly wouldn’t. So feel
free to skip the rest of the chapter, telling your students that they can return to it if the need ever
arises.

32. Modified Power Series Solutions and the Basic Method of Frobenius

Cover sections 32.1, 32.2, 32.3, 32.4 and 32.5. This covers the development of the basic method
of Frobenius as a natural extension of the power series method from chapter 30 and the approach to
solving Euler equations given in chapter 18.
Section 32.6 discusses the case in which the exponents are complex. Skip it. It’s included for
the sake of completeness, but this case never arises in practice.
Section 32.7 is an appendix containing a proof and a generalization. It can be safely skipped.

33. The Big Theorem on the Frobenius Method, with Applications

Definitely cover section 33.1. It contains the “Big Theorem” describing when the basic Frobenius
method works, and describes the sort of modified power series solutions that arise in the ‘exceptional
cases’ in which the basic method only works partially.
Sections 33.2, 33.3 and 33.4 contain a discussion of the behavior of modified power series
solutions near singular points. While this seems to be rarely discussed, I consider knowing this
behavior to be very important in many applications of the Frobenius method. For some solutions,
this behavior near singularities is all you need to know. So I suggest covering these sections. If
time allows, it may also be worthwhile to illustrate this material with the application in section 33.5
concerning the Legendre equation and Legendre polynomials.
Adapting the basic Frobenius method to handle the ‘exceptional cases’ is discussed in section
33.6. With any luck, your students will never need this. Let them read it if ever the need arises.


34. Validating the Method of Frobenius

You can, and probably should, completely skip this chapter.
Pity. I spent a lot of time and effort on this chapter, and I am rather proud of the result. It’s
straightforward, rigorous, and understandable by students (and instructors). As far as I am aware,
this chapter contains the only intelligible explanation of why you get the modiﬁed power series
solutions that you do get for the ‘exceptional cases.’ But the chapter is mainly on rigorously proving
the “Big Theorem” in the previous chapter and is intended to serve as a reference for those few who
really would like to see a good proof of that theorem.

Part VI: Systems of Differential Equations (A Brief Introduction)

35. Systems of Differential Equations: A Starting Point
Cover sections 35.1, 35.2 and 35.3. In these sections, the student is introduced to systems of
differential equations and some interesting applications, one of which involves a single second-order
nonlinear differential equation.
Section 35.4 discusses the use of Laplace transforms to solve certain systems. It’s hardly the
best way to solve these systems but is pretty well the only way that could be discussed within the
parameters of this text. I would consider this section to be optional.
Section 35.5 on the existence and uniqueness of solutions and on identifying general solutions
should also be considered optional.
In section 35.6, some existence and uniqueness results for single Nth-order equations are
discussed in the context of the results given in section 35.5 — very optional for most classes.

36. Critical Points, Direction Fields and Trajectories

Cover sections 36.1, 36.2, 36.3, 36.4 and 36.5. The rudiments of phase plane analysis are developed
in these sections.
Also cover section 36.6, in which phase plane analysis is used to extract useful information
regarding three applications (mixing in multiple tanks, two competing species, and the motion of
a pendulum). This would be a nice place to end the course, so consider the rest of the chapter (on
verifying some properties of trajectories) as very optional.


−1/2 −1
5a. y(x) = ± 1 + ce6x and y = 0 5b. y(x) = 2x 3 C − x 2 and y = 0
3
c
5c. y(x) = sin(x) + and y = 0 6. y(x) = 11x 2 − 2x
sin(x)
y
7a. u = ; y(x) = x (3 ln |x| + c) /3
1
x 2
1 1 2 4 1
7b. u = 2x + 3y + 4 ; y(x) = x + c − x − and y(x) = − (2x + 4)
3 2 3 3 3
1/
c
2
7c. u = y 2 ; y(x) = x + and y = 0 7d. u = 4x − y ; y(x) = 4x − arccos(x + c)
x
y
7e. u = y − x ; y(x) = 1 + x − Ae−y 7f. u = ; y ln |y| − cy = x and y = 0
y x
7g. u = ; y(x) = −x ± x ln |x| + c
x −1/2
7h. u = y −2 ; y(x) = ± cx 2 − 2x 3 and y = 0
7i. u = 2x + y − 3 ; y(x) = (x + c)2 −2x + 3 and y(x) =
3 − 2x

7j. u = 2x + y − 3 ; 2x + y − 3 − ln 1 + 2x + y − 3 = x + c
y
2
1
7k. u = ; y(x) = x ln |x| + c − x and y = −x
x 2
1/4
7l. u = y 4 ; y(x) = ± 8e2x + ce−12x
−1
7m. u = x − y + 3 ; y(x) = 3 + x + Ae2x − 1 Ae2x + 1 and y = x + 4
7n. u = y + x 2 ; y(x) = c2 + 2cx

and y−x= −x
2

7o. u = sin(y) ; y(x) = arcsin (c + x)e 7p. u = yx −2 ; y(x) = x 2 tan(c + ln |x|)

8. $\dfrac{du}{dx} + (1 - n)\,p(x)\,u = (1 - n)\,f(x)$

Chapter 7
dy c
dy
1a. 3y + 3x = 0 , y(x) = 1b. −6x 2 y + 2y − 2x 3 = 0 , y(x) = x 3 ± c + x 6
d x x dx
3 2 2 dy 2 3
1c. 2x y − y + x − 3x y = 0 , x y − xy = c
x dy
dx c !
c
1d. Arctan(y) + = 0 , y(x) = tan 2b. y(x) = ± x+
1 + y2 d x x x
x 1 √ 2
4a. φ(x, y) = x 2 y + x y 2 + c1 , y(x) = − ± x +C
2 2x

1/
4b. φ(x, y) = x 2 y 3 + x 4 + c1 , y(x) = cx −2 − x 2 3

1/
4c. φ(x, y) = 2x − x 2 + y 3 + c1 , y(x) = c + x 3 − 2x 3
(
c−x
4d. φ(x, y) = x 3 y 2 + x + 3y 2 + c1 , y(x) = ± 3
x +3
1 5 1 5 1
4e. φ(x, y) = x y − 4
y + c1 , =c
x4 y − 4f. φ(x, y) = x ln |x y| + c1 , y(x) = ec/x
y
5 5 x
y c − x y
4g. φ(x, y) = x + xe + c1 , y(x) = ln 4h. φ(x, y) = xe + y + c1 , xe + y = c
y
x

1/
5a. μ = x 3 , y(x) = ± C x −4 − 1 4 5b. μ = y −4 , x y −3 + y = c
−2 2 −2
5c. μ=x y , y −x y =
4 3
!c 5d. μ = cos(y) , x cos(y) + sin(y) = c
√
5f. μ = e−x , y(x) = Ce x − 1
1 1
1 + C x − /2
3 2 2
5e. μ = x , y(x) = − ±
2 2
√
5h. μ = y , y /2 + x 2 y /2 = c
5 3
5g. μ = x −3 , y 4 − x −2 y 3 = c
1/ 1/ 7/
5i. μ = x y 3, x2 y 3 + x4y 3 =c


Answers to Selected Exercises 821

Chapter 8

1b. y(0) should be within 1/ of 2. 2a.

[Figures: sketches for answers 2a, 2b, 2c, 2d, 2e and 2f.]
3a. [Figure: sketch, with axes X and Y.]   3b. y(8) ≈ 2

4a. [Figure: sketch, with axes X and Y.]   4b i. y(4) ≈ 3½   4b ii. y(4) = 4   4b iii. y(4) ≈ 6⅓


5a i. & 5b i. [Figure: sketch, with axes X and Y.]
5a ii. The max. is approximately 6½ and is at x ≈ 3 .
5a iii. y(8) ≈ 3½   5b ii. y(8) ≈ 1

6a i. & 6b i. [Figure: sketch, with axes X and Y.]
6a ii. y(3) ≈ 3½   6b ii. y(0) ≈ −1

7a i. & 7b i. [Figure: sketch, with axes X and Y.]
7a ii. The max. is approximately 5 and is at x ≈ 2 .
7b ii. The max. is approximately 3 and is at x ≈ 6½ .
9a. y = 2 appears to be an asymptotically stable constant solution.
9b. y = 2 appears to be an unstable constant solution.
9c. y = 2 appears to be a stable (but maybe not asymptotically stable) constant solution.
9d. y = 3 appears to be an unstable constant solution.
9e. y = 2 appears to be an unstable constant solution, and y = −2 appears to be an
asymptotically stable constant solution.
9f. y = 1 appears to be a stable (but not asymptotically stable) constant solution.


Chapter 9
1a.
  k    x_k    y_k
  0    1      −1
  1    4/3    −4/3
  2    5/3    −5/3
  3    2      −2

1b.
  k    x_k    y_k
  0    0      10
  1    1/4    10
  2    1/2    5
  3    3/4    0
  4    1      0

1c.
  k    x_k    y_k
  0    0      2
  1    1/2    4
  2    1      11
  3    3/2    139/2
  4    2      19853/8
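
Tables like these come from repeating the Euler step y_{k+1} = y_k + Δx · f(x_k, y_k). The sketch below shows that iteration with exact rational arithmetic. The right-hand side f(x, y) = y² − 4x is an assumption made purely for illustration (the exercise statement is not reproduced in this answer key); it was chosen because, with Δx = 1/2 and starting point (0, 2), it happens to reproduce the values listed for 1c above.

```python
from fractions import Fraction

def euler(f, x0, y0, dx, steps):
    """Return rows (k, x_k, y_k) with y_{k+1} = y_k + dx * f(x_k, y_k)."""
    rows = [(0, x0, y0)]
    x, y = x0, y0
    for k in range(1, steps + 1):
        y = y + dx * f(x, y)   # Euler update uses the old (x_k, y_k)
        x = x + dx
        rows.append((k, x, y))
    return rows

def f(x, y):
    # Hypothetical right-hand side (an assumption, not from the text)
    return y * y - 4 * x

rows = euler(f, Fraction(0), Fraction(2), Fraction(1, 2), 4)
for k, x, y in rows:
    print(k, x, y)
```

Using `Fraction` keeps entries such as 139/2 exact; replacing the fractions with floats gives the usual decimal version of the method.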

e−x/2

7b. y(x) = e x/2 cos(2π x) 0

X
1 2 3 4
−e−x/2

−1
8a. $y(x) = c_1 e^{3x} + c_2 e^{-3x}$
8b. $y(x) = c_1 \cos(3x) + c_2 \sin(3x)$
8c. $y(x) = c_1 e^{-3x} + c_2 x e^{-3x}$
8d. $y(x) = c_1 e^{(-3+3\sqrt{2})x} + c_2 e^{(-3-3\sqrt{2})x}$
8e. $y(x) = c_1 e^{x/3} + c_2 x e^{x/3}$
8f. $y(x) = c_1 e^{-3x}\cos(x) + c_2 e^{-3x}\sin(x)$
8g. $y(x) = c_1 e^{2x}\cos(6x) + c_2 e^{2x}\sin(6x)$
8h. $y(x) = c_1 e^{2x} + c_2 e^{x/2}$
8i. $y(x) = c_1 e^{-5x} + c_2 x e^{-5x}$
8j. $y(x) = c_1 e^{x/3} + c_2 e^{-x/3}$
8k. $y(x) = c_1 \cos\left(\frac{x}{3}\right) + c_2 \sin\left(\frac{x}{3}\right)$
8l. $y(x) = c_1 + c_2 e^{-x/9}$
8m. $y(x) = c_1 e^{-2x}\cos\left(\sqrt{3}\,x\right) + c_2 e^{-2x}\sin\left(\sqrt{3}\,x\right)$
8n. $y(x) = c_1 e^{-2x}\cos(x) + c_2 e^{-2x}\sin(x)$
8o. $y(x) = c_1 e^{-2x} + c_2 x e^{-2x}$
8p. $y(x) = c_1 e^{-3x} + c_2 e^{5x}$
8q. $y(x) = c_1 + c_2 e^{4x}$
8r. $y(x) = c_1 e^{-4x} + c_2 x e^{-4x}$
√
3 3
8s. y(x) = c1 cos x + c2 sin x 8t. y(x) = c1 e x/2 cos(x) + c2 e x/2 sin(x)
2 2

Chapter 16
(sec−1 ) , ν0 = (sec−1 ) , p0 = 4π (sec)
1 1
1a. κ = 4 (kg/sec2 ) 1b. ω0 =
2 4π
π 3π
1c i. A = 2 , φ = 0 1c ii. A = 4 , φ = 1c iii. A = 4 , φ =
2 2
π
1c iv. A = 4 , φ = 2a. κ = 288 (kg/sec2 )
3
π
2b. ω0 = 12 (sec−1 ) , ν0 = (sec−1 ) , p0 =
6 1
(sec) 2c i. A = 1 2c ii. A =
√ π 6 12
17 4π 2 8π 2 2π 2
2c iii. A = 4a. κ = 4b. κ = 4c. κ =
1√
4 9 9 9
1 2π 3 1
6b. α = ,ω=3, p= ,ν = 6c i. A = 37 6c ii. A = 4 6c iii. A =
2√
2 3 2π 6 3
6c iv. A = 10
3
7b. ν decreases from the natural frequency of the undamped system, ν0 , down to zero.
p increases from the natural period of the undamped system, p0 , up to ∞ .
Y
2

8b. y(t) = [2 + 4t]e−2t 1

0
0 1 2 3 T


3π

1
13a. (iii) 13b. (i) y(t + π ) − y(t) = cos 2π t −
2 4 6 8 T 3π 2 m 2

1 n
13b. (ii) y(t) = sin(π τ ) [1 − cos(π τ )] − sin(2π t)
3π 2 m 3π 2 m
Y

13b. (iii) 13c. (i) y(t + π ) − y(t) = 0 (No resonance!)

1 2 3 4 5 T

1 T
13c. (ii) y(t) = [2 sin(2π t) − sin(4π t)] 13c. (iii)
12π 2 m 1 2 3 4 5

Chapter 28
1a. 6.525 kg·meter/sec 1b. 13.63 kg·meter/sec 2a i. 20 meter/sec
2a ii. 40 meter/sec 2a iii. 0 meter/sec 2b i. 18 kg·meter/sec 2b ii. 28 kg·meter/sec
2b iii. 8 kg·meter/sec 2c i. 0.5 kg 2c ii. 2 kg 3a. 16 3b. 0 3c. 1
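3d. $\frac{1}{2}$   3e. 9   3f. 0
6a. $y(t) = 3\,\text{step}_2(t)$   6b. $y(t) = \text{rect}_{(2,4)}(t)$
6c. $y(t) = (t - 3)\,\text{step}_3(t)$
6d. $y(t) = \begin{cases} 0 & \text{if } t < 1 \\ t - 1 & \text{if } 1 < t < 4 \\ 3 & \text{if } 4 < t \end{cases}$
6e. $y(t) = 4e^{-2(t-1)}\,\text{step}(t - 1)$
6f. $y(t) = \begin{cases} \sin(t) & \text{if } t < \pi \\ 0 & \text{if } \pi < t \end{cases}$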

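6g. $y(t) = 2\cos(t)\,\text{step}\left(t - \frac{\pi}{2}\right)$
7a. $y(t) = 2e^{-3t} + e^{-3(t-2)}\,\text{step}(t - 2)$
7b. $y(t) = \frac{1}{3}\left[1 - e^{-3t}\right]$
7c. $y(t) = \frac{2}{3}\left[1 - e^{-3t}\right] + \frac{1}{3}\left[1 - e^{-3(t-1)}\right]\text{step}_1(t)$
7d. $y(t) = \frac{1}{4}\sin(4(t - 2))\,\text{step}_2(t)$
7e. $y(t) = \frac{1}{8}\left[e^{4(t-10)} - e^{-4(t-10)}\right]\text{step}_{10}(t)$
7f. $y(t) = 0$
7g. $y(t) = \frac{1}{8}\left[e^{2t} - e^{-6t}\right]$
7h. $y(t) = \frac{1}{8}\left[e^{2(t-3)} - e^{-6(t-3)}\right]\text{step}_3(t)$
7i. $y(t) = (t - 4)e^{-3(t-4)}\,\text{step}_4(t)$
7j. $y(t) = \frac{1}{3}e^{6t}\sin(3t)$
7k. $y(t) = \frac{1}{9}\left[1 - \cos(3(t - 1))\right]\text{step}_1(t)$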
9
e2(t−1) − e−2(t−1) + sin(2(t − 1)) step1 (t)
1 3
7l. y(t) =
4 2


Chapter 29
2a. $\frac{121}{81}$   2b. $\frac{9{,}841}{6{,}561}$   2c. $\frac{3}{2}$   2d. $\frac{1}{162}$   2e. $\frac{3}{5}$   2f. $\frac{665}{32}$   2g. Diverges
∞
∞

15 −1 (−1)k
2h. 14 2i. −5 2j. 4a. 1 + xk 4b. x 2 + xk
4 k(k + 1) k
k=1 k=4
∞
∞
∞

4c. 3(n − 3)2 (x − 5)n 4d. (n + 1)nx n 4e. −6 + [−2(n + 1)(n + 3)]x n
n=4 n=1 n=1
∞
∞
∞

n n
4f. 6 + [2(n + 1)(2n + 3)]x 4g. 2
(n − 3) an−3 x 4h. an−2 (x − 1)n
n=1 n=4 n=2
∞
∞

4i. a0 x + 2a1 x 2 + n an−1 − an+1 x n 4j. (n + 1)an+1 + 5an x n
n=3 n=0
∞

4k. −4a0 − 4a1 x + k − k − 4 ak x k
2

k=2
∞
∞

4l. 2a2 + 6a3 x + (n + 2)(n + 1)an+2 − 3an−2 x n 5a. (k + 1)x k for |x| < 1
n=2 k=0
∞
∞
∞

1 k 1 k (−1)k 2k
5b. (k + 2)(k + 1)x for |x| < 1 6a. x 6b. x
2 k! (2k)!
k=0 k=0 k=0
∞ ∞

(−1)k (−1)k−1 1 1 1 3 5 4
6c. x 2k+1 6d. (x − 1)k 7a. 1 + x − x 2 + x − x
(2k + 1)! k 2 8 16 128
k=0 k=1
1 2 1 4
1 6 5 8 1 3 5
7b. 1 − x − − x −
x x 7c. 1 + x 2 + x 4 + x 6
2 8
16 128 2 8 16
4 4 4 8 7 2 8 4
8b i. 1 + 2x + 2x + x 4 + x 5 + x 6 +
2
x + x + x9
3 15 45 315 315 2835
1 5 61 6 277 8 50521 10
8b ii. 1 + x 2 + x 4 + x + x + x
2 24 720 8064 3628800
4 1 2 4 3 31
8b iii. 3 − (x − 2) + (x − 2) − (x − 2) + (x − 2)4
3 27 243 4374
58 139 238
− (x − 2)5 + (x − 2)6 − (x − 2)7
19683 118098 531441
∞ ∞ ∞
k k 1 k 2k 1 k
9a. 2 x with R = 9b. (−1) x with R = 1 9c. x with R = 2
2 2k
k=0 k=0 k=0
∞ ∞
k+1 k (−1)k 2k
9d. k+1
x with R = 2 9e. x with R = ∞
2 k!
k=0 k=0
∞
(−1)k 4k+2
9f. x with R = ∞ 10c. Taylor series = 0
(2k + 1)!
k=0
10d. Because f (x) does not equal its Taylor series about 0 except right at x = 0 .

Chapter 30
2. Sk is the statement: ak = F(k) for a given function F and a given sequence A0 , A1 , A2 , . . . .
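3a. $a_k = \frac{2}{k}a_{k-1}$ for $k \ge 1$ , $y(x) = a_0 \sum_{k=0}^{\infty} \frac{2^k}{k!}x^k = a_0 \sum_{k=0}^{\infty} \frac{1}{k!}(2x)^k$
3b. $a_k = \frac{2}{k}a_{k-2}$ for $k \ge 2$ (with $a_1 = 0$) , $y(x) = a_0 \sum_{m=0}^{\infty} \frac{1}{m!}x^{2m} = a_0 e^{x^2}$
3c. $a_k = 2a_{k-1}$ for $k \ge 1$ , $y(x) = a_0 \sum_{k=0}^{\infty} 2^k x^k = \frac{a_0}{1 - 2x}$
3d. $a_k = \frac{k-3}{3k}a_{k-1}$ for $k \ge 1$ , $y(x) = a_0\left[1 - \frac{2}{3}x + \frac{1}{9}x^2\right]$
3e. $a_k = \frac{4-k}{k}a_{k-2}$ for $k \ge 2$ (with $a_1 = 0$) , $y(x) = a_0\left[1 + x^2\right]$
3f. $a_k = a_{k-1}$ for $k \ge 1$ , $y(x) = a_0 \sum_{k=0}^{\infty} x^k = \frac{a_0}{1 - x}$
3g. $a_k = -\frac{1}{2}a_{k-1}$ for $k \ge 1$ , $y(x) = a_0 \sum_{k=0}^{\infty} \left(\frac{-1}{2}\right)^k (x - 3)^k$
3h. $a_k = -\frac{1+k}{4k}a_{k-1}$ for $k \ge 1$ , $y(x) = a_0 \sum_{k=0}^{\infty} (k + 1)\left(\frac{-1}{4}\right)^k (x - 5)^k$
3i. $a_k = \frac{1}{2}a_{k-3}$ for $k \ge 3$ (with $a_1 = a_2 = 0$) , $y(x) = a_0 \sum_{m=0}^{\infty} \frac{1}{2^m}x^{3m}$
3j. $a_k = \frac{k-6}{2k}a_{k-3}$ for $k \ge 3$ (with $a_1 = a_2 = 0$) , $y(x) = a_0\left[1 - \frac{1}{2}x^3\right]$
3k. $a_k = \frac{1}{k}a_{k-2} - \frac{k-1}{k}a_{k-1}$ for $k \ge 2$ (with $a_1 = 0$) ,
    $y(x) = a_0\left[1 + \frac{1}{2}x^2 - \frac{1}{3}x^3 + \frac{3}{8}x^4 - \frac{11}{30}x^5 + \cdots\right]$
3l. $a_k = \frac{1}{k}a_{k-2} - a_{k-1}$ for $k \ge 2$ (with $a_1 = -a_0$) ,
    $y(x) = a_0\left[1 - x + \frac{3}{2}x^2 - \frac{11}{6}x^3 + \frac{53}{24}x^4 + \cdots\right]$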
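4a. No singular points, $R = \infty$, $I = (-\infty, \infty)$
4b. No singular points, $R = \infty$, $I = (-\infty, \infty)$
4c. $z_s = \frac{1}{2}$, $R = \frac{1}{2}$, $I = \left(-\frac{1}{2}, \frac{1}{2}\right)$
4d. $z_s = 3$, $R = 3$, $I = (-3, 3)$
4e. $z_s = \pm i$, $R = 1$, $I = (-1, 1)$
4f. $z_s = 1$, $R = 1$, $I = (-1, 1)$
4g. $z_s = 1$, $R = 2$, $I = (1, 5)$
4h. $z_s = 1$, $R = 4$, $I = (1, 9)$
4i. $z_s = \sqrt[3]{2},\ 2^{-2/3}\left(-1 \pm i\sqrt{3}\right)$ ; $R = \sqrt[3]{2}$, $I = \left(-\sqrt[3]{2}, \sqrt[3]{2}\right)$
4j. $z_s = \sqrt[3]{2},\ 2^{-2/3}\left(-1 \pm i\sqrt{3}\right)$ ; $R = \sqrt[3]{2}$, $I = \left(-\sqrt[3]{2}, \sqrt[3]{2}\right)$
4k. $z_s = -1$, $R = 1$, $I = (-1, 1)$
4l. $z_s = -1$, $R = 1$, $I = (-1, 1)$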

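5a. $a_k = \frac{4-k}{k}a_{k-2}$ for $k \ge 2$ , $y(x) = a_0 y_1(x) + a_1 y_2(x)$ where $y_1(x) = 1 + x^2$ and
    $y_2(x) = x + \frac{1}{3}x^3 - \frac{1}{5 \cdot 3}x^5 + \frac{1}{7 \cdot 5}x^7 - \frac{1}{9 \cdot 7}x^9 + \cdots = \sum_{m=0}^{\infty} \frac{(-1)^{m+1}}{(1 + 2m)(2m - 1)}x^{2m+1}$
5b. $a_k = -\frac{1}{k}a_{k-2}$ for $k \ge 2$ , $y(x) = a_0 y_1(x) + a_1 y_2(x)$ where
    $y_1(x) = 1 - \frac{1}{2}x^2 + \frac{1}{2^2 \cdot 2}x^4 - \frac{1}{2^3 (3!)}x^6 + \cdots = \sum_{m=0}^{\infty} (-1)^m \frac{1}{2^m m!}x^{2m}$ and
    $y_2(x) = x - \frac{1}{3}x^3 + \frac{1}{5 \cdot 3}x^5 - \frac{1}{7 \cdot 5 \cdot 3}x^7 + \cdots = \sum_{m=0}^{\infty} (-1)^m \frac{2^m m!}{(2m + 1)!}x^{2m+1}$
5c. $a_k = -\frac{k-2}{4k}a_{k-2}$ for $k \ge 2$ , $y(x) = a_0 y_1(x) + a_1 y_2(x)$ where $y_1(x) = 1$ and
    $y_2(x) = x - \frac{1}{4 \cdot 3}x^3 + \frac{1}{4^2 \cdot 5}x^5 - \frac{1}{4^3 \cdot 7}x^7 + \cdots = \sum_{m=0}^{\infty} (-1)^m \frac{1}{4^m (2m + 1)}x^{2m+1}$
5d. $a_k = \frac{3}{k(k - 1)}a_{k-4}$ for $k \ge 4$ (with $a_2 = a_3 = 0$) , $y(x) = a_0 y_1(x) + a_1 y_2(x)$ where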


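6b. $y_2(x) = y_1(x) \ln|x| + x \left[ 0 + \frac{1}{4}x + 0x^2 - \frac{3}{128}x^3 + 0x^4 + \cdots \right]$
6c. $y_2(x) = -6 y_1(x) \ln|x| + \left[ 1 + 4x + 0x^2 - \frac{22}{3}x^3 + \frac{43}{24}x^4 + \cdots \right]$
6d. $y_2(x) = -\frac{16}{9} y_1(x) \ln|x| + \frac{1}{x^2} \left[ 1 + \frac{4}{3}x + \frac{4}{3}x^2 + \frac{16}{9}x^3 + 0x^4 + \cdots \right]$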

Chapter 35
1a. Yes, it is 1b. No, it is not 1c. Yes, it is 2a. No, it is not 2b. Yes, it is
2c. No, it is not 3a. No, it is not 3b. Yes, it is 3c. Yes, it is
4b. $x(t) = 3e^{3t} + 4e^{-4t}$ and $y(t) = 3e^{3t} - 10e^{-4t}$
5b. $x(t) = 3e^{9t} - 3e^{-3t}$ and $y(t) = 3e^{9t} + 6e^{-3t}$
6b. $x(t) = 6e^{-2t} - 6e^{5t}$ and $y(t) = -18e^{-2t} - 3e^{5t}$
7a. $x(t) = c_1 \cos(t) + c_2 \sin(t)$ and $y(t) = c_1 \sin(t) - c_2 \cos(t) + c_3$
2
7b. x(t) = Aec1 t and y(t) = c1 t
x(t) = 2 Ae3t + Be−2t , y(t) = 6 + t + C e−5t and z(t) = Ae3t
15B
7c.
A
8a. $x' = -\frac{6}{1200}x + \frac{1}{600}y$, $y' = 3 + \frac{1}{1200}x - \frac{4}{600}y$
8b. $x' = 2 - \frac{9}{1200}x + \frac{1}{600}y$, $y' = 1 + \frac{3}{1200}x - \frac{5}{600}y$
9a. $x' = -2y - 4x$, $y' = x$
9b.
x = 32y + 8t x + sin(t)
2 9c. x = 4−y 2 9d. x = 5t −1 x − 8t −2 y
y = x y = x y = x
9e. x = t −1 x − 10t −2 y 9f.
x = 4t 2 − sin(x) y 9g. x = z
y = x y = x y = x
z = 3x + 4y − 2z
9h. $x' = z$, $y' = x$, $z' = -x t^2 + y^2$
10a. $x(t) = -4e^{-4t} + 5e^{3t}$ and $y(t) = 10e^{-4t} + 5e^{3t}$
10b. $x(t) = Ae^{2t} + Be^{-2t}$ and $y(t) = Ae^{2t} - Be^{-2t}$ where $A = \frac{1}{2}(x_0 + y_0)$ and $B = \frac{1}{2}(x_0 - y_0)$
10c. $x(t) = x_0 \cos(2t) + y_0 \sin(2t)$ and $y(t) = y_0 \cos(2t) - x_0 \sin(2t)$
10d. $x(t) = x_0 \cos(4t) - \frac{y_0}{2} \sin(4t)$ and $y(t) = y_0 \cos(4t) + 2x_0 \sin(4t)$
10e. $x(t) = e^{2t}[2\cos(3t) - 3\sin(3t)]$ and $y(t) = e^{2t}\cos(3t)$
10f. $x(t) = e^{3t}[x_0 \cos(2t) + y_0 \sin(2t)]$ and $y(t) = e^{3t}[y_0 \cos(2t) - x_0 \sin(2t)]$
10g. $x(t) = -2e^{9t} + t + 2$ and $y(t) = -e^{9t} - 4t + 1$
10h. $x(t) = \frac{1}{2}e^{2t} - 3$ and $y(t) = -5e^{2t} + 6$
10i. $x(t) = e^{7t} + 3e^{3t} - 6te^{3t}$ and $y(t) = e^{7t} - e^{3t} + 2te^{3t}$
10j. $x(t) = 3\sin(2t) - 6t$ and $y(t) = -6\cos(2t) + 6$
10k. $x(t) = 13[\cos(4t) + \sin(4t)]$ and $y(t) = 8\sin(4t)$
10l. $x(t) = \left[ 2e^{7(t-2)} - 3e^{3(t-2)} + 1 \right] \operatorname{step}_2(t)$ and $y(t) = \left[ 2e^{7(t-2)} + e^{3(t-2)} - 3 \right] \operatorname{step}_2(t)$
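
Answers of this kind can be sanity-checked numerically. Differentiating the pair given for 10c shows that it satisfies $x' = 2y$, $y' = -2x$ with $x(0) = x_0$ and $y(0) = y_0$. That system is inferred from the answer itself, not quoted from the exercise, which is not reproduced here. The sketch below estimates the derivatives by central differences and reports the residuals; the function names and the step size `h` are illustrative choices.

```python
import math


def x(t, x0, y0):
    """x(t) from answer 10c."""
    return x0 * math.cos(2 * t) + y0 * math.sin(2 * t)


def y(t, x0, y0):
    """y(t) from answer 10c."""
    return y0 * math.cos(2 * t) - x0 * math.sin(2 * t)


def derivative(f, t, *args, h=1e-6):
    """Central-difference estimate of f'(t)."""
    return (f(t + h, *args) - f(t - h, *args)) / (2 * h)


def residuals(t, x0, y0):
    """Residuals of the inferred system x' = 2y, y' = -2x at time t."""
    rx = derivative(x, t, x0, y0) - 2 * y(t, x0, y0)
    ry = derivative(y, t, x0, y0) + 2 * x(t, x0, y0)
    return rx, ry


# Both residuals should be close to zero for any t and any initial values.
print(residuals(0.7, 1.5, -2.0))
```

The same pattern applies to the other parts of Exercise 10 once the corresponding system is known: substitute the claimed solution and confirm that the residuals are small.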

Chapter 36
1a. (0, 0)    1b. (3, 2)    1c. Every $(x_0, y_0)$ with $y_0 = -3x_0$    1d. (6, 1) and (5, 0)
1e. (2, 2), (2, −2), (4, 4) and (4, −4)    1f. (3, nπ) for n = 0, ±1, ±2, ±3, ...


1g. (0, 0) and (0, 1)    1h. No constant solutions.
2a. [figure: plot in the XY-plane]    2b. [figure: plot in the XY-plane]
3. [figure: plot in the XY-plane]
4a. (0, 0)    4b. [figure: plot in the XY-plane]    4c. [figure: plot in the XY-plane]
4d. They become large and nearly equal.    4e. Unstable
5a. (1, 1/2)    5b. [figure: plot in the XY-plane]    5c. [figure: plot in the XY-plane]
5d. (x, y) → (1, 1/2)    5e. Asymptotically stable

6a. (nπ/2, 0) for n = 0, ±1, ±2, ...
6b. Stable: (nπ, 0) for n = 0, ±1, ±2, ... ; unstable: (nπ/2, 0) for n = ±1, ±3, ±5, ...


6c. [figure: plot in the XY-plane]    6d. [figure: plot in the XY-plane]
6e. (x, y) “orbits” about (0, 0) clockwise.
7a. (1, 0)    7b. [figure: plot in the XY-plane]    7c. [figure: plot in the XY-plane]
7d. (x, y) “orbits” about (1, 0) counterclockwise.
8a. (1, 1)    8b. Asymptotically stable
8c. [figure: plot in the XY-plane]    8d. [figure: plot in the XY-plane]
8e. (x, y) → (1, 1)
9a. (1, 0) and (15/16, −1/8)


9b. [figure: plot in the XY-plane]    9b. [figure: plot in the XY-plane]
10a. [figure: plot in the XY-plane]    10b. [figure: plot in the XY-plane]
10c. [figure: plot in the XY-plane]    10d. [figure: plot in the XY-plane]


Ordinary Differential Equations: An Introduction to the Fundamentals is a rigorous yet remarkably accessible textbook ideal for an introductory course in ordinary differential equations. Providing a useful resource both in and out of the classroom, the text:
• Employs a unique expository style that explains the how and why of each topic covered
• Allows for a flexible presentation based on instructor preference and student ability
• Supports all claims with clear and solid proofs
• Includes material rarely found in introductory texts
Ordinary Differential Equations: An Introduction to the Fundamentals also includes access to an author-maintained website featuring detailed solutions and a wealth of bonus material. Use of a math software package that can do symbolic calculations, graphing, and so forth, such as Maple™ or Mathematica®, is highly recommended, but not required.

K26326

www.crcpress.com
