Mathematicians, physicists, engineers, biologists, and other scientists who study related
fields frequently use differential equations, linear algebra, calculus of variations, and integral
equations. The purpose of Applied Mathematics for Scientists and Engineers is to provide a con-
cise and well-organized study of the theoretical foundations for the development of mathemat-
ics and problem-solving methods. A wide range of solution strategies are shown for real-world
challenges. The author’s main objective is to provide as many examples as possible to help make
the theory reflected in the theorems more understandable. The book’s five chapters can be used
to create a one-semester course as well as for self-study. The only prerequisites are a basic un-
derstanding of calculus and differential equations.
Abstract Algebra
A First Course, Second Edition
Stephen Lovett
Classical Analysis
An Approach through Problems
Hongwei Chen
Probability and Statistics for Engineering and the Sciences with Modeling using R
William P. Fox and Rodney X. Sturdivant
https://www.routledge.com/Textbooks-in-Mathematics/book-series/CANDHTEXBOOMTH
Applied Mathematics
for Scientists and
Engineers
Youssef N. Raffoul
First edition published 2024
by CRC Press
6000 Broken Sound Parkway NW, Suite 300, Boca Raton, FL 33487-2742
Reasonable efforts have been made to publish reliable data and information, but the author and pub-
lisher cannot assume responsibility for the validity of all materials or the consequences of their use.
The authors and publishers have attempted to trace the copyright holders of all material reproduced
in this publication and apologize to copyright holders if permission to publish in this form has not
been obtained. If any copyright material has not been acknowledged please write and let us know so
we may rectify in any future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced,
transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or
hereafter invented, including photocopying, microfilming, and recording, or in any information stor-
age or retrieval system, without written permission from the publishers.
For permission to photocopy or use material electronically from this work, access www.copyright.
com or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA
01923, 978-750-8400. For works that are not available on CCC please contact mpkbookspermis-
[email protected]
Trademark notice: Product or corporate names may be trademarks or registered trademarks and are
used only for identification and explanation without intent to infringe.
DOI: 10.1201/9781003449881
Publisher’s note: This book has been prepared from camera-ready copy provided by the authors.
Dedication
To my beautiful and adorable granddaughter
Aurora Jane Palmore
Contents
Preface
Author
Appendices
Bibliography
Index
Preface
The author is very excited to share his book with you, and hopes you will find it
beneficial in broadening your education and advancing your career. The main objec-
tive of this book is to give the reader a thorough understanding of the basic ideas
and techniques of applied mathematics as they are employed in various engineering
fields. Topics such as differential equations, linear algebra, the calculus of variations,
and integral equations are fundamental to scientists, physicists, and engineers. The
book emphasizes both the theory and its applications. It incorporates engineering
applications throughout, and in line with that idea, derivations of the mathemat-
ical models of numerous physical systems are presented to familiarize the reader
with the foundational ideas of applied mathematics and its applications to real-world
problems.
For the last twenty-four years, the author has been teaching a graduate course in
applied mathematics for graduate students majoring in mathematics, physics, and
engineering at the University of Dayton. The course covered various topics in differ-
ential equations, linear algebra, calculus of variations, and integral equations. As a
result, the author’s lecture notes eventually became the basis for this book.
The book is self-contained, and no knowledge beyond an undergraduate course on
ordinary differential equations is required. A couple of sections of Chapters 4 and 5
require knowledge of Fourier series. To make up for this deficiency, an appendix was
added. The book should serve as a one-semester graduate textbook exploring the the-
ory and applications of topics in applied mathematics. Educators have the flexibility
to design their own three-chapter, one-semester course from Chapters 2–5. The first
chapter is intended as a review on the subject of ordinary differential equations, and
we refer to particular sections of it in later chapters when dealing with calculus of
variations and integral equations. While writing the book, the author made every effort to balance
rigor with presenting even the most difficult subjects in elementary language, in order to make
the book accessible to a wide variety of readers.
The author’s main objective was to provide as many examples as possible to help
make the theory reflected in the theorems more understandable. The purpose of the
book is to provide a concise and well-organized study of the theoretical foundations
for the development of mathematics and problem-solving methods. This book’s text
is organized in a way that is both very readable and mathematically sound. A wide
range of solution strategies are shown for a number of real-world challenges.
The author’s manner of presentation and style strongly shape how this book
develops mathematically and pedagogically. Some of the concepts from the extensive
and well-established literature on many applied mathematics topics found their way
into this book. Whenever possible, the author tried to deal with concepts in a more
conversational way, copiously illustrated by 165 completely worked-out examples.
Where appropriate, concepts and theories are depicted in 83 figures.
Exercises are a crucial learning component of the course and are included
at the end of each section. They range from simple computations to more challenging
problems. Before starting the exercises, students must read the
mathematics in the pertinent section. The book is divided into five chapters and an
appendix.
Chapter 1 is a review of ordinary differential equations and is not intended to be for-
mally covered by the instructor. It is recommended that students become acquainted
with it before proceeding to the following chapters. The main reason for including
Chapter 1 is that by the time students take a graduate course in applied mathematics,
they have already forgotten most techniques for solving ordinary differential equa-
tions. In addition, it will save class time by not formally reviewing such topics but
rather asking the students to read them beforehand. The chapter covers first-order and
higher-order differential equations. It also includes a section on the Cauchy-Euler
equation, which plays a significant role in Chapters 4 and 5.
The second chapter is devoted to the study of partial differential equations, with
the majority of the content aimed toward graduate students pursuing engineering
degrees. The chapter begins with linear equations with constant and variable coef-
ficients and then moves on to quasi-linear equations. Burgers’ equation occupies an
important role in the chapter, as do second-order partial differential equations and
homogeneous and nonhomogeneous wave equations.
The third chapter discusses matrices and systems of linear equations. Gauss elimina-
tion, matrix algebra, vector spaces, and eigenvalues and eigenvectors are all covered.
The chapter concludes with an examination of inner product spaces, diagonalization,
quadratic forms, and functions of symmetric matrices.
Chapter 4 delves deeply into fundamental themes in the calculus of variations in a
functional analytic environment. The calculus of variations is concerned with the op-
timization of functionals over a set of competing objects. We begin by deriving the
Euler-Lagrange necessary condition and generalizing the concept to functionals with
higher derivatives or with multiple variables. We provide a nice discussion on the the-
ory behind sufficient conditions. Some of the topics are generalized to isoperimetric
problems and functionals with constraints. Toward the end of the chapter, we closely
examine the connection between the Sturm-Liouville problem and the calculus of
variations. We end the chapter with the Rayleigh-Ritz method and the development
of Euler-Lagrange to allow variational computation of multiple integrals.
Chapter 5 is solely devoted to the study of Fredholm and Volterra integral equations.
The chapter begins by introducing integral equations and the connections between
them and ordinary differential equations. The development of Green’s function occu-
pies an important role in the chapter. It is used to classify kernels, which in turn leads
us to the appropriate approach for finding solutions. This includes integral equations
with symmetric kernels or degenerate kernels. Toward the end of the chapter, we
develop iterative methods and the Neumann series. We briefly discuss ways of ap-
proximating non-degenerate kernels and the use of the Laplace transform in solving
integral equations of convolution types. Since not all integral equations can be re-
duced to differential equations, one should expect odd behavior from solutions. For
such reasons, we devote the last section of the chapter to the qualitative analysis of
solutions using fixed point theory and the Liapunov direct method.
Appendix A covers the basic topics of Fourier series. We briefly discuss Fourier se-
ries expansion, including sine and cosine, and the corresponding relations to periodic
odd extension and periodic even extension. We provide applications to the heat prob-
lem in a finite slab by utilizing the concept of separation of variables. We transform
the Laplacian equation in different dimensions to polar, cylindrical, and spherical
coordinates. We end this appendix by studying the Laplacian equation in circular do-
mains, such as the annulus. Materials in this section will be useful in several places
in the book, especially Chapters 2, 4, and 5.
The author owes a debt of gratitude to Drs. Sam Brensinger and George Todd for
reading the first and third chapters, respectively, and for their insightful remarks and
recommendations. The author would also like to express his gratitude to the hundreds of graduate
students at the University of Dayton who, over the course of the last 22 years, helped polish and
refine the lecture notes so that a significant portion of them made it into this book.
This book would not exist without the encouragement and support of my wife, my
children Hannah, Paul, Joseph, and Daniel, and my brother Melhem.
Youssef N. Raffoul
University of Dayton
Dayton, Ohio
July, 2023
Author
1
Ordinary Differential Equations
In this chapter, we briefly go over elementary topics from ordinary differential equa-
tions that we will need in later chapters. The chapter provides the foundations to
assist students in learning not only how to read and understand differential equa-
tions but also how to read technical material in more advanced texts as they
progress through their studies. We discuss basic topics in first-order differential equa-
tions, including separable and exact equations and the variation of parameters for-
mula. We provide an application to the spread of infections. At the end of the chapter, we study
higher-order differential equations and some of their theoretical aspects. The chapter
is not intended to be taught as a part of a graduate course but rather as a reference for
the students for later chapters.
1.1 Preliminaries
Let I be an interval of the real numbers R and consider the function f : I → R. For
x0 ∈ I, the derivative of f at x0 is
f'(x_0) = \lim_{h \to 0} \frac{f(x_0 + h) - f(x_0)}{h} \qquad (1.1)
provided the limit exists. When the limit exists, we say that f is differentiable at
x0 . The term f ′ (x0 ) is the instantaneous rate of change of the function f at x0 . If x0
is one of the endpoints of the interval I, then the above definition of the derivative
becomes a one-sided derivative. If f ′ (x0 ) exists at every point x0 ∈ I, then we say
f is differentiable on I and write f ′ (x). The derivative of a function f is again a
function f ′ ; its domain, which is a subset of the domain of f , is the set of all points
x0 for which f is differentiable. Other notations for the derivative are Dx f , d f /dx,
and dy/dx, where y = f (x). The function f ′ may in turn have a derivative, denoted
by f ′′ , which is defined at all points where f ′ is differentiable. f ′′ is called the second
derivative of f . For higher-order derivatives, we use the notations
f'''(x),\ f^{(4)}(x),\ \ldots,\ f^{(n)}(x), \quad \text{or} \quad \frac{d^n f}{dx^n} \quad \text{for } n = 1, 2, 3, \ldots.
Example 1.1 For x ∈ R, we set f(x) = x|x|. Then we have that f(x) = x^2, for x > 0, and f(x) = −x^2, for x < 0. Next, we compute f'(x_0). For x_0 > 0, we may choose |h| small enough so that x_0 + h > 0. Then by (1.1), we see that
f'(x_0) = \lim_{h \to 0} \frac{(x_0 + h)^2 - x_0^2}{h} = 2x_0.
y^{(n)} = f\big(x, y, y', \ldots, y^{(n-1)}\big) \qquad (1.3)
for all x ∈ I. If we require, for some initial point x_0 ∈ I, a solution y(x) to satisfy the
initial conditions
y(x_0) = a_0,\ y'(x_0) = a_1,\ \ldots,\ y^{(n-1)}(x_0) = a_{n-1}, \qquad (1.4)
for constants a_i, i = 0, 1, 2, \ldots, n-1, then (1.3) along with (1.4) is called an initial
value problem (IVP).
Before we state the next theorem, we define partial derivatives.
Given a function of several variables f (x, y), the partial derivative of f with respect
to x is the rate of change of f as x varies, keeping y constant, and it is given by
\frac{\partial f}{\partial x} = \lim_{h \to 0} \frac{f(x + h, y) - f(x, y)}{h}.
Similarly, the partial derivative of f with respect to y is the rate of change of f as y
varies, keeping x constant, and it is given by
\frac{\partial f}{\partial y} = \lim_{h \to 0} \frac{f(x, y + h) - f(x, y)}{h}.
More often we write f_x, f_y to denote \frac{\partial f}{\partial x} and \frac{\partial f}{\partial y}, respectively.
For the (IVP) (1.3) and (1.4), the following existence and uniqueness result is true.
For more discussion on the topic and on the proof of the next theorem, we refer to
[3], [4], or [5].
Theorem 1.1 Consider the (IVP) defined by (1.3) and (1.4), where f is continuous
on the (n + 1)-dimensional rectangle D of the form
D = {(x, y0 , y1 , . . . , yn−1 ) : bk−1 < yk−1 < dk−1 and bn < x < dn , k = 1, 2, . . . , n}.
If the initial conditions are chosen so that the point (x0 , a0 , a1 , . . . , an−1 ) is in D, then
the (IVP) has at least one solution satisfying the initial conditions. If, in addition, f
has continuous partial derivatives
\frac{\partial f}{\partial y},\ \frac{\partial f}{\partial y'},\ \ldots,\ \frac{\partial f}{\partial y^{(n-1)}},
in D, then the solution is unique.
Suppose f satisfies the hypothesis of Theorem 1.1 in D. Then a general solution of
the (IVP) in D is given by the formula
y = ϕ(x, c1 , c2 , . . . , cn )
if y solves (1.3) and if for any initial condition (x0 , a0 , a1 , . . . , an−1 ) in D one can
choose values c1 , c2 , . . . , cn so that the solution y satisfies these initial conditions.
We have the following corollary concerning existence and uniqueness of solutions
of first-order initial value problems, which is an immediate consequence of Theorem
1.1.
Corollary 1 Let D ⊂ R × R, and denote the set of all real continuous functions on
D by C(D, R). Let f ∈ C(D, R) and suppose \frac{\partial f}{\partial y} is continuous on D. Then for any
(x_0, y_0) ∈ D, the (IVP)
y' = f(x, y), \quad y(x_0) = y_0,
has a unique solution on an interval containing x_0 in its domain.
Example 1.2 As an example, consider
y'(x) = x\,y^{1/2}.
Then
f(x, y) = x\,y^{1/2} \quad \text{and} \quad \frac{\partial f}{\partial y} = \frac{x}{2y^{1/2}}
are continuous in the upper half-plane defined by y > 0. We conclude from Corollary
1 that for any point (x_0, y_0), y_0 > 0, there is some interval around x_0 on which the
given differential equation has a unique solution. □
Example 1.3 Consider
y'(x) = y^2, \quad y(0) = 1.
Here, we have that
f(x, y) = y^2 \quad \text{and} \quad \frac{\partial f}{\partial y} = 2y
are continuous everywhere in the plane, and in particular on any rectangle containing the initial point (0, 1).
Since the initial point (0, 1) lies inside such a rectangle, Corollary 1 guarantees a unique
solution of the (IVP). □
In the next example, we illustrate the existence of more than one solution.
Example 1.4 Consider the differential equation
y'(x) = \frac{3}{2}\,y^{1/3}, \quad y(0) = 0, \quad x ∈ R.
Here, we have
f(x, y) = \frac{3}{2}\,y^{1/3} \quad \text{and} \quad \frac{\partial f}{\partial y} = \frac{1}{2}\,y^{-2/3}.
FIGURE 1.1
This example displays three solutions, with x_1 = 0, 1, 2.
In some cases, we may want to solve (1.5) and, at different times, we may want to
solve it subject to the initial condition
y(x0 ) = y0 , (1.6)
where the point (x0 , y0 ) is specified and in D. In some instances, the first-order dif-
ferential equation (1.5) can be rearranged and put in the form,
g(y)\,\frac{dy}{dx} + h(x) = 0 \qquad (1.7)
where h, g : R → R are continuous on some subsets of R. Here, the functions
h(x) and g(y) are functions of x alone and y alone, respectively. When (1.5) can be put in
the form (1.7), we say it separates, or that the differential equation is separable. Now
we discuss the method of solution when (1.5) is separable. In this case, we consider
(1.7) and assume that the functions H(x) and G(y) are antiderivatives of h(x)
and g(y), respectively. Then (1.7) can be written in the form
H ′ (x)dx = −G′ (y)dy.
By integrating the left-side with respect to x and the right-side with respect to y, we
get the general solution
H(x) + G(y) = c, (1.8)
where c is an arbitrary constant. If (1.8) permits us to solve for y in terms of x and c,
then we obtain a general solution of (1.7) of the form y = ϕ(x, c). If we cannot solve
for y in terms of x and c, then (1.8) represents an implicit solution of (1.7).
The technique used to obtain (1.8) is simple and easy to understand and apply. How-
ever, if we try to be precise and justify the procedure, some care is required.
To see this, we take the initial point (x_0, y_0) in D and suppose that
h(x0 ) ̸= 0 and g(y0 ) ̸= 0.
Let y = ϕ(x) be a solution of (1.7) on an interval I = {x : |x −x0 | < d}, which satisfies
the initial condition ϕ(x0 ) = y0 . Then for all x in I we see that
h(x) = -g(\phi(x))\,\frac{d\phi(x)}{dx}.
Integrating both sides from x0 to x, we arrive at
\int_{x_0}^{x} h(s)\,ds = -\int_{x_0}^{x} g(\phi(s))\,\phi'(s)\,ds.
Since \phi'(x_0) = -\frac{h(x_0)}{g(\phi(x_0))} = -\frac{h(x_0)}{g(y_0)} \neq 0, in a neighborhood of x_0 the solution
\phi(x) is either increasing or decreasing. In either case, we are able to use the change
of variable u = \phi(x) in the right-hand integral and obtain
\int_{x_0}^{x} h(s)\,ds = -\int_{\phi(x_0)}^{\phi(x)} g(u)\,du = -\int_{y_0}^{y} g(u)\,du, \qquad (1.9)
Example 1.5 Solve y' = 5/2 − y/2, y(0) = 2. The differential equation can be written as
\frac{1}{5 - y}\,\frac{dy}{dx} = \frac{1}{2}.
Using (1.9), we obtain
\int_{0}^{x} \frac{1}{2}\,ds = \int_{2}^{y} \frac{1}{5 - u}\,du.
For finite x, we must have u < 5; otherwise, the integral on the right diverges. Therefore, 5 − u > 0 and integration yields
\frac{x}{2} = -\ln(5 - y) + \ln(3),
or
\frac{x}{2} = \ln\Big(\frac{3}{5 - y}\Big).
Taking the exponential of both sides, we arrive at the solution
y(x) = 5 - 3e^{-x/2}.
□
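As a quick check of Example 1.5 (an illustration added here, not part of the original text, and assuming the third-party SymPy library is available), the computation can be reproduced symbolically:

import sympy as sp

x = sp.symbols('x')
y = sp.Function('y')

# Solve y' = 5/2 - y/2 with y(0) = 2 and compare with the closed form above.
ode = sp.Eq(y(x).diff(x), sp.Rational(5, 2) - y(x) / 2)
sol = sp.dsolve(ode, y(x), ics={y(0): 2})
print(sol)                                              # Eq(y(x), 5 - 3*exp(-x/2))
print(sp.simplify(sol.rhs - (5 - 3 * sp.exp(-x / 2))))  # 0, so the two forms agree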
Example 1.6 (Logistic Equation) The logistic equation is a simple model of population dynamics. Suppose we have a population of size y, with initial population size y_0 > 0, that has a birth rate αy and a death rate βy. With this model, we obtain
\frac{dy}{dt} = (\alpha - \beta)y, \quad \text{which has the solution} \quad y = y_0\,e^{(\alpha - \beta)t}.
This predicts unbounded exponential growth when α > β. To account for competition over limited resources, we subtract a term proportional to y^2 and obtain
\frac{dy}{dt} = (\alpha - \beta)y - \gamma y^2,
or
\frac{dy}{dt} = r\,y\Big(1 - \frac{y}{d}\Big),
where r = α − β and d = r/γ. This is the logistic differential equation. Note that it
is separable and can be solved explicitly. □
Example 1.7 The law of mass action is a useful concept that describes the behavior
of a system that consists of many interacting parts, such as molecules, that react with
each other, or viruses that are passed along from a population of infected individuals
to non-immune ones. The law of mass action was derived first for chemical systems
but subsequently found wide use in epidemiology and ecology. To describe the law
of mass action, we assume m substances s1 , s2 , . . . , sm together form a product with
concentration p. Then the law of mass action states that \frac{dp}{dt} is proportional to the
product of the m concentrations s_i, i = 1, \ldots, m. That is,
\frac{dp}{dt} = k\,s_1 s_2 \cdots s_m.
Suppose we have a homogeneous population of fixed size, divided into two groups.
Those who have the disease are called infective, and those who do not have the dis-
ease are called susceptible. Let S = S(t) be the susceptible portion of the population
and I = I(t) be the infective portion. Then by assumption, we may normalize the
population and have S + I = 1. We further assume that the dynamics of this epidemic
satisfy the law of mass action. Hence, for some positive constant λ we have the
nonlinear differential equation
I ′ (t) = λ SI (1.10)
Let I(0) = I0 , 0 < I(0) < 1 be a given initial condition. It follows that by substituting
S = 1 − I into (1.10),
I ′ (t) = λ I(1 − I), I(0) = I0 . (1.11)
If we can solve (1.11) for I(t), then S(t) can be found from the relation I + S = 1. We
separate the variables in (1.11) and obtain
\frac{dI}{I(1 - I)} = \lambda\,dt.
Using partial fractions on the left side of the equation and then integrating both sides
yields
ln(|I|) − ln(|1 − I|) = λt + c,
or, for some positive constant c_1, we have
I(t) = \frac{c_1 e^{\lambda t}}{1 + c_1 e^{\lambda t}}.
Applying I(0) = I_0 gives the solution
I(t) = \frac{I_0\,e^{\lambda t}}{1 - I_0 + I_0\,e^{\lambda t}}. \qquad (1.12)
Now for 0 < I(0) < 1, the solution given by (1.12) is increasing with time as ex-
pected. Moreover, using L’Hospital’s rule, we have
\lim_{t \to \infty} I(t) = \lim_{t \to \infty} \frac{I_0\,e^{\lambda t}}{1 - I_0 + I_0\,e^{\lambda t}} = 1.
Hence, the infection will grow, and everyone in the population will get infected even-
tually. □
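The claim that the infection saturates at the whole population can be confirmed symbolically. The following sketch is an added illustration (assuming SymPy is available); it checks that (1.12) satisfies (1.11) and that I(t) → 1:

import sympy as sp

t, lam, I0 = sp.symbols('t lambda I_0', positive=True)
I_t = I0 * sp.exp(lam * t) / (1 - I0 + I0 * sp.exp(lam * t))   # solution (1.12)

# Residual of (1.11): I'(t) - lambda*I*(1 - I) should vanish identically.
print(sp.simplify(sp.diff(I_t, t) - lam * I_t * (1 - I_t)))    # 0

# Long-time behavior: the whole population eventually becomes infected.
print(sp.limit(I_t, t, sp.oo))                                 # 1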
1.2.1 Exercises
In Exercise 1.1, verify that the given expression is a solution of the equation. Where
appropriate, c_1, c_2 denote constants.
Exercise 1.1 (a) y′′ + y′ − 2y = 0; y = c1 ex + c2 e−2x .
(b) y' = 25 + y^2; \quad y = 5\tan(5x).
(c) y' = \sqrt{\dfrac{y}{x}}; \quad y = (\sqrt{x} + c_1)^2, \quad x > 0, \ c_1 > 0.
(d) 3x2 ydx + (x3 y + 2y)dy = 0; x3 y + y2 = c1 .
(e) \frac{dy}{dx} = y(2 - 3y); \quad y = \dfrac{2c_1 e^{2x}}{1 + 3c_1 e^{2x}}.
(f) x2 y′′ − xy′ + 2y = 0; y = x cos(ln x), x > 0.
In each of Exercises 1.2–1.7, decide whether the existence and uniqueness theorem
or corollary of this section does or does not guarantee the existence of a solution of
each of the initial value problems. In the case a solution exists, determine whether
uniqueness is guaranteed or not and determine the region of existence and unique-
ness.
Exercise 1.2
y′ = 3x2 y, y(0) = 3.
Exercise 1.3
y′ = y2 , y(1) = 5.
Exercise 1.4
y' = \sqrt{x - y}, \quad y(2) = 1.
Exercise 1.5
y′ = ln(1 + y2 ), y(2) = 2.
Exercise 1.6
y′ = x ln(y), y(1) = 0
Exercise 1.7
y' = \sqrt{1 - y^2}, \quad y(0) = 0.
Exercise 1.8 Show that the (IVP)
y' = -\sqrt{4 - y^2}, \quad y(0) = 2
has the two solutions y1 (x) = 2 and y2 (x) = 2 cos(x) on the interval [0, 2π]. Why
doesn’t this contradict Corollary 1?
In Exercises 1.9–1.16, solve the given differential equation by separation of variables.
Exercise 1.9
y′ + xy = y, y(1) = 3.
Exercise 1.10
(y^2 - 1)\,\frac{dy}{dx} = x\,e^{x}, \quad y(0) = 5.
Exercise 1.11
\frac{dy}{dx} - 1 = \sqrt{x - y}.
Exercise 1.12
x\,\frac{dy}{dx} = y^2 - y, \quad y(0) = 1.
Exercise 1.13
y′ tan(x) = y, y(π/2) = π/2.
Exercise 1.14
\frac{dy}{dx} + 2 = \sin(2x + y + 1).
Exercise 1.15
y′ = 2x3 y2 + 3x2 y2 , y(0) = 2.
Exercise 1.16
y' = \frac{x^3 y - y}{y^4 - y^3 + 1}, \quad y(0) = 1.
d f = fx dx + fy dy.
Definition 1.1 (Exact equation) Q(x, y)\,\frac{dy}{dx} + P(x, y) = 0 is an exact equation if and
only if the differential form Q(x, y)\,dy + P(x, y)\,dx is exact, i.e., there exists a function
f(x, y) for which
d f = P(x, y)\,dx + Q(x, y)\,dy.
As an example, consider the equation
\big(y\cos(xy) + 1\big)\,dx + \big(x\cos(xy) + e^{y}\big)\,dy = 0.
We have
P = y\cos(xy) + 1, \qquad Q = x\cos(xy) + e^{y}.
Then \frac{\partial P}{\partial y} = \frac{\partial Q}{\partial x} = \cos(xy) - xy\sin(xy). Hence, we are dealing with an exact equation.
To find the function f we integrate either
\frac{\partial f}{\partial x} = y\cos(xy) + 1, \quad \text{or} \quad \frac{\partial f}{\partial y} = x\cos(xy) + e^{y}.
Integrating the first equation gives
f (x, y) = sin(xy) + x + g(y).
Note that since it was a partial derivative with respect to x holding y constant, the
“constant” term can be any function of y. Differentiating the derived f with respect
to y, we have
\frac{\partial f}{\partial y} = x\cos(xy) + g'(y) = x\cos(xy) + e^{y}.
Thus g′ (y) = ey and g(y) = ey . The constant of integration need not be included in the
preceding line since the solution is f (x, y) = c. This is due to the fact that d f (x, y) = 0
implies that f (x, y) = constant. Thus, the final solution is
sin(xy) + x + ey = c.
If we are given an initial condition, say y(π/2) = 1, then c = 1 + π/2 + e. □
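The exactness test and the recovered function f can be double-checked in a few lines; this is an added sketch (assuming SymPy is available), not part of the original text:

import sympy as sp

x, y = sp.symbols('x y')
P = y * sp.cos(x * y) + 1           # coefficient of dx
Q = x * sp.cos(x * y) + sp.exp(y)   # coefficient of dy

# Exactness test: P_y must equal Q_x.
print(sp.simplify(sp.diff(P, y) - sp.diff(Q, x)))        # 0

# Recover f from f_x = P, then fix the "constant" g(y) using f_y = Q.
f = sp.integrate(P, x)                                   # sin(x*y) + x
g = sp.integrate(sp.simplify(Q - sp.diff(f, y)), y)      # exp(y)
print(f + g)                                             # sin(x*y) + x + exp(y)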
1.3.2 Exercises
In Exercises 1.17–1.22 show the differential equation is exact and then find its solu-
tion.
Exercise 1.17
Exercise 1.18
(ex sin(y) + 3y)dx + (3x + ex cos(y))dy = 0.
Exercise 1.19
Exercise 1.20
\Big(x^3 + \frac{y}{x}\Big)dx + \big(y^2 + \ln(x)\big)dy = 0.
Exercise 1.21
\big(x + \arctan(y)\big)dx + \frac{x + y}{1 + y^2}\,dy = 0.
Exercise 1.22
Exercise 1.25 Solve the given differential equation by finding an appropriate inte-
grating factor.
(a) (xy + y2 + y)dx + (x + 2y)dy = 0.
(b) (2y2 + 3x)dx + 2xydy = 0.
(c) (y ln(y) + yex )dx + (x + y cos(y))dy = 0.
(d) (4xy2 + y)dx + (6y3 − x)dy = 0.
Exercise 1.26 For appropriate integers r and q show that µ(x, y) = xr yq is an inte-
grating factor for the differential equation and then solve it,
Consider the first-order differential equation
y'(x) + a(x)\,y(x) = g(x, y(x)), \qquad x \ge x_0, \qquad (1.17)
where g ∈ C(R × R, R) and a ∈ C(R, R). Note that (1.17) can be nonlinear or linear,
which depends on the function g. If g(x, y) = g(x), a function of x alone, then (1.17)
is said to be linear (linear in y). To obtain a formula for the solution, we multiply
both sides of (1.17) by the integrating factor e^{\int_{x_0}^{x} a(u)\,du}. Observing that
\frac{d}{dx}\Big[y(x)\,e^{\int_{x_0}^{x} a(u)\,du}\Big] = y'(x)\,e^{\int_{x_0}^{x} a(u)\,du} + a(x)\,y(x)\,e^{\int_{x_0}^{x} a(u)\,du},
we arrive at
\frac{d}{dx}\Big[y(x)\,e^{\int_{x_0}^{x} a(u)\,du}\Big] = g(x, y(x))\,e^{\int_{x_0}^{x} a(u)\,du}.
An integration of the above expression from x_0 to x and using y(x_0) = y_0
yields
y(x)\,e^{\int_{x_0}^{x} a(u)\,du} = y_0 + \int_{x_0}^{x} g(s, y(s))\,e^{\int_{x_0}^{s} a(u)\,du}\,ds. \qquad (1.18)
It can be easily shown that if y(x) satisfies (1.18), then it satisfies (1.17). Expression
(1.18) is known as the variation of parameters formula. We note that (1.18) is a
functional equation in y since the integrand is a function of y. If we replace the
function g with a function h(x), where h ∈ C(R, R), then (1.18) takes the special
form
y(x) = y_0\,e^{-\int_{x_0}^{x} a(u)\,du} + \int_{x_0}^{x} h(s)\,e^{-\int_{s}^{x} a(u)\,du}\,ds, \qquad x \ge x_0. \qquad (1.19)
Another special form of (1.18) is that, if the function a(x) = a is constant for all x ≥ x_0
and g is replaced with h(x) as before, then we have from (1.19) that
y(x) = y_0\,e^{-a(x - x_0)} + \int_{x_0}^{x} e^{-a(x - s)}\,h(s)\,ds, \qquad x \ge x_0. \qquad (1.20)
Remark 1 If no initial conditions are assigned, then (1.18) takes the form
y(x) = e^{-\int a(x)\,dx}\Big(C + \int g(x, y(x))\,e^{\int a(x)\,dx}\,dx\Big).
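To see formula (1.19) in action, here is an added SymPy sketch with the illustrative (not from the book) choices a(x) = 2x, h(x) = x, x_0 = 0, and y_0 = 1; it evaluates the right-hand side of (1.19) and compares it against a direct solve of y' + 2xy = x, y(0) = 1:

import sympy as sp

x, s, u = sp.symbols('x s u')
a = 2 * u          # a(u) = 2u  -- an illustrative choice, not from the book
h = s              # h(s) = s   -- likewise illustrative
x0, y0 = 0, 1

# Right-hand side of formula (1.19).
y_vp = y0 * sp.exp(-sp.integrate(a, (u, x0, x))) \
     + sp.integrate(h * sp.exp(-sp.integrate(a, (u, s, x))), (s, x0, x))
y_vp = sp.simplify(y_vp)
print(y_vp)                                    # 1/2 + exp(-x**2)/2

# Compare with a direct solve of y' + 2*x*y = x, y(0) = 1.
yfun = sp.Function('y')
sol = sp.dsolve(sp.Eq(yfun(x).diff(x) + 2 * x * yfun(x), x), yfun(x), ics={yfun(0): 1})
print(sp.simplify(sol.rhs - y_vp))             # 0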
1.4.1 Exercises
Exercise 1.27 Solve each of the given differential equations.
(a) x\,\frac{dy}{dx} + 2y = 5xy, \quad y(1) = 0.
(b) \frac{dy}{dx} + 2y = e^{x}, \quad y(1) = 0.
(c) (x + 1)\,\frac{dy}{dx} + (x + 2)\,y = 2x\,e^{-x}.
(d) \frac{dy}{dx} + \frac{e^{x}}{1 + x^2} = y, \quad y(1) = 0.
(e) x\,\frac{dy}{dx} = y\ln(x), \quad y(1) = 2.
Exercise 1.28 Find a continuous solution satisfying
where u and v are continuous functions. Here (a) refers to the substitution y = ux and
(b) to the substitution x = vy. The choice of either (a) or (b) depends on the number
of terms that multiply P and Q in (1.21). For example, if P is multiplied by fewer terms,
then go with (b). Similarly, if Q is multiplied by fewer terms, then go with (a).
You may use either (a) or (b) if they are multiplied by the same number of terms.
Say we go with y = ux. Then compute \frac{dy}{dx} = x\,\frac{du}{dx} + u. Multiplying both sides by dx,
we get
dy = x\,du + u\,dx. \qquad (1.23)
Next substitute y = ux and (1.23) into (1.21) and the resulting differential equation is
separable in x and u and can be easily solved. If we go with (b), then use
dx = y dv + v dy (1.24)
and then substitute back into (1.21) to obtain a separable equation in terms of v and
y.
Example 1.11 Consider
x\,\frac{dy}{dx} + x - 3y = 0. \qquad (1.25)
Then the equation (1.25) takes the form
xdy + (x − 3y)dx = 0,
which is homogeneous of degree 1. Since dy is multiplied by one term only, we use
y = ux. Using the above procedure, the differential equation reduces to
x(udx + xdu) + (x − 3xu)dx = 0.
Dividing by x and then regrouping, we arrive at the separable equation
\frac{dx}{x} = \frac{du}{2u - 1}.
Integrating both sides and then substituting u = y/x we arrive at the solution of the
original problem
\ln|x| = \frac{1}{2}\ln\Big|\frac{2y}{x} - 1\Big| + c,
for some constant c. □
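A brief added check of Example 1.11 (assuming SymPy is available): we solve (1.25) directly and compare with the implicit relation obtained above.

import sympy as sp

x = sp.symbols('x', positive=True)
y = sp.Function('y')

# Equation (1.25): x*y' + x - 3*y = 0.
sol = sp.dsolve(sp.Eq(x * y(x).diff(x) + x - 3 * y(x), 0), y(x))
print(sol)   # e.g. Eq(y(x), C1*x**3 + x/2); the exact form may vary

# The implicit solution ln|x| = (1/2) ln|2y/x - 1| + c gives 2y/x - 1 = C*x**2,
# i.e. y = x/2 + (C/2)*x**3, which agrees with dsolve up to renaming the constant.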
Notice that (1.25) can be written as
\frac{dy}{dx} = -1 + 3\,\frac{y}{x} = f\Big(1, \frac{y}{x}\Big),
and this hints at another way of defining homogeneous differential equations. Thus
we make another alternate definition.
Definition 1.6 A differential equation
\frac{dy}{dx} = g(x, y) \qquad (1.26)
is called a homogeneous differential equation if g is a homogeneous function of de-
gree 0, that is
g(λ x, λ y) = λ 0 g(x, y) = g(x, y).
If we let \lambda = \frac{1}{x}, then
g(\lambda x, \lambda y) = \lambda^{0}\,g(x, y) = g(x, y) = g\Big(1, \frac{y}{x}\Big).
Hence, if we let F\big(\frac{y}{x}\big) = g\big(1, \frac{y}{x}\big), then (1.26) can be put in the form
\frac{dy}{dx} = F\Big(\frac{y}{x}\Big). \qquad (1.27)
This suggests making the substitution u = \frac{y}{x}. In this case the differential equation
given by (1.27) is reduced to
\frac{du}{dx} = \frac{F(u) - u}{x},
which is separable.
Another type of first-order differential equation that requires transformations in both
the dependent and independent variables is of the form
\frac{dy}{dx} = \frac{ax + by + c}{dx + ey + g}, \qquad (1.28)
with ae ̸= bd. To solve (1.28) we propose the substitutions
x = t − p, y = w − k,
where the constants p and k must be carefully chosen so that the resulting equation
is homogeneous. Using the chain rule, we see that
\frac{dy}{dx} = \frac{d}{dx}(w - k) = \frac{dw}{dx} = \frac{dw}{dt}\,\frac{dt}{dx} = \frac{dw}{dt}(1) = \frac{dw}{dt}.
Moreover,
ax + by + c = at + bw + (c − ap − bk).
Similarly,
dx + ey + g = dt + ew + (g − d p − ek).
We now choose p and k so that
c - ap - bk = 0, \qquad g - dp - ek = 0,
which has a unique solution p and k since ae ̸= bd. This results in the new differential
equation
\frac{dw}{dt} = \frac{at + bw}{dt + ew}, \qquad (1.29)
which is homogeneous in w and t.
As an illustration, consider the equation
\frac{dy}{dx} = \frac{x + 2y - 5}{-2x - y + 4}.
We use the substitutions
x = t - p, \qquad y = w - k,
and require
-5 - p - 2k = 0, \qquad 4 + 2p + k = 0,
and obtain p = −1, k = −2. Using (1.29) we arrive at the homogeneous differential equation
equation
(2t + w)dw + (t + 2w)dt = 0.
This equation can be easily solved by letting w = ut, and obtaining the separable
differential equation
\frac{dt}{t} = -\frac{2 + u}{u^2 + 4u + 1}\,du.
An integration of both sides gives
\ln|t| = -\frac{1}{2}\ln\big|u^2 + 4u + 1\big| + C.
Or,
\ln|x - 1| = -\frac{1}{2}\ln\Big|\frac{(y - 2)^2}{(x - 1)^2} + 4\,\frac{y - 2}{x - 1} + 1\Big| + C.
□
1.5.1 Exercises
In Exercises 1.30–1.34, show that the given differential equations are homogeneous and
solve them.
Exercise 1.30
xdy + (2y + 5x)dx = 0, y(1) = 2.
Exercise 1.31
x\,\frac{dy}{dx} - 3y = \frac{y^2}{x}, \quad y(1) = 5.
Exercise 1.32
x^2\,\frac{dy}{dx} - xy = x^2 + y^2, \quad y(1) = 2.
Exercise 1.33
2xy\,\frac{dy}{dx} = 4x^2 + 3y^2.
Exercise 1.34
x\,\frac{dy}{dx} = y + \sqrt{x^2 - y^2}, \quad y(1) = 0.
A Bernoulli equation is a first-order equation of the form
y' + p(x)\,y = q(x)\,y^{n}. \qquad (1.30)
Note that if n = 0, then (1.30) is a linear differential equation and it can be solved by
the method of Section 1.4. Similarly, if n = 1, then (1.30) is a separable differential
equation that can be solved by the method of Section 1.2. Thus, we consider (1.30)
only for
n \neq 0, 1.
We make the substitution
W = y^{1-n},
so that
y' = \frac{1}{1 - n}\,W^{\frac{n}{1-n}}\,W'.
For example, consider the Bernoulli equation
y' - \frac{1}{x}\,y = \frac{e^{x}}{x}\,y^{3},
for which n = 3 and W = y^{-2}. Substituting into (1.30), we arrive at the linear differential equation in W and x
W' + \frac{2}{x}\,W = -\frac{2e^{x}}{x},
which has the solution
W = c\,x^{-2} - \frac{2e^{x}}{x} + \frac{2e^{x}}{x^{2}}.
Since y = W^{-1/2}, the solution of the Bernoulli equation is given by
y = \Big(c\,x^{-2} - \frac{2e^{x}}{x} + \frac{2e^{x}}{x^{2}}\Big)^{-1/2}.
□
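The linear equation for W derived above can be verified independently; this added sketch (assuming SymPy is available) solves it and compares with the displayed W:

import sympy as sp

x = sp.symbols('x', positive=True)
W = sp.Function('W')

# W' + (2/x)*W = -2*exp(x)/x, as obtained for the Bernoulli example above.
sol = sp.dsolve(sp.Eq(W(x).diff(x) + 2 * W(x) / x, -2 * sp.exp(x) / x), W(x))
print(sp.expand(sol.rhs))
# C1/x**2 - 2*exp(x)/x + 2*exp(x)/x**2, matching W = c*x**(-2) - 2e^x/x + 2e^x/x^2.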
1.6.1 Exercises
Exercise 1.39 Find a general solution of each Bernoulli equation.
(a) 3(1 + x^2)\,\frac{dy}{dx} = 2xy(y^3 - 1).
(b) y^3\,\frac{dy}{dx} + \frac{y^4}{x} = \frac{\cos(x)}{x^4}.
(c) (x^2 + 1)\,\frac{dy}{dx} + 3x^3 y = 6x\,e^{-3/x^2}.
(d) x\,\frac{dy}{dx} + 2y + \sin(x)\,\sqrt{y} = 0.
(e) \frac{dy}{dx} + 2y = x^3 y^2 \sin(x).
a_n(x)\,\frac{d^n y}{dx^n} + a_{n-1}(x)\,\frac{d^{n-1} y}{dx^{n-1}} + \cdots + a_1(x)\,\frac{dy}{dx} + a_0(x)\,y = F(x). \qquad (1.32)
Unless otherwise noted, we always assume that the coefficients ai (x), i = 1, 2, . . . , n
and the function F(x) are continuous on some open interval I. The interval I may be
unbounded. If the function F(x) vanishes for all x ∈ I, then we call (1.32) a homo-
geneous linear equation; otherwise, it is nonhomogeneous. Thus the homogeneous
a_n(x)\,\frac{d^n y}{dx^n} + a_{n-1}(x)\,\frac{d^{n-1} y}{dx^{n-1}} + \cdots + a_1(x)\,\frac{dy}{dx} + a_0(x)\,y = 0. \qquad (1.33)
Definition 1.7 (Linear dependence). A set of functions f_1, f_2, \ldots, f_n is said to be linearly dependent on an interval I if there exist constants c_1, c_2, \ldots, c_n, not all zero, such that
c_1 f_1 + c_2 f_2 + \ldots + c_n f_n = 0
for every x ∈ I. If the set of functions is not linearly dependent on the interval I, it is
said to be linearly independent.
Definition 1.8 (Fundamental set of solutions). A set of n solutions of the linear dif-
ferential system (1.33) all defined on the same open interval I, is called a fundamental
set of solutions on I if the solutions are linearly independent functions on I.
We have the following corollary.
Corollary 2 Let the coefficients ai (x), i = 1, 2, . . . , n be continuous on an interval
I. If {φ1 (x), φ2 (x), . . . , φn (x)} form a fundamental set of solutions on I, then the
general solution of (1.33) is given by
y(x) = c_1\varphi_1(x) + c_2\varphi_2(x) + \cdots + c_n\varphi_n(x),
for constants c_i, i = 1, 2, \ldots, n.
Example 1.14 The second-order differential equation
y'' + y' - 2y = 0
has the two solutions \phi_1(x) = e^{x} and \phi_2(x) = e^{-2x}, and they are linearly independent
on I = (−∞, ∞); hence they form a fundamental set of solutions. Indeed, their Wronskian satisfies
W(\phi_1, \phi_2) = \begin{vmatrix} e^{x} & e^{-2x} \\ e^{x} & -2e^{-2x} \end{vmatrix} = -2e^{-x} - e^{-x} = -3e^{-x} \neq 0, \quad \text{for all } x ∈ (−∞, ∞).
This is an example of a linearly independent pair of functions. Note that the Wron-
skian is everywhere nonzero. On the other hand, if the functions f and g are linearly
dependent, with g = k f for a nonzero constant k, then
W(f, g) = \begin{vmatrix} f & kf \\ f' & kf' \end{vmatrix} = k f f' - k f f' = 0.
Thus the Wronskian of two linearly dependent functions is zero. This will be made
formal in Theorem 1.5. The above Wronskian discussion can be easily extended to
the set of functions f1 , f2 , . . . , fn , where
···
f1 f2 fn
′
f2′ ··· fn′
f1
W ( f1 , f2 , . . . , fn ) = .. .. ..
..
. . . .
(n−1) (n−1) (n−1)
1
f
2 f ··· f n
for all x ∈ I. Now (1.37) can only be true for some c1 , and c2 not both zero if and
only if W (y1 (x), y2 (x)) = 0, for all x ∈ I. If (1.37) holds for some point x0 ∈ I, then
the function y = c_1 y_1 + c_2 y_2 is a solution of (1.36) and satisfies the initial conditions
y(x_0) = y'(x_0) = 0. On the other hand, the zero function y = 0 is also a solution and
satisfies the same initial conditions. This violates the uniqueness of the solution unless y
and the zero solution y = 0 are the same. Now y = 0 implies (1.37) is true for all x ∈ I.
This shows that W (y1 (x), y2 (x)) = 0, for all x ∈ I if and only if W (y1 (x), y2 (x)) =
0, for at least one x0 ∈ I. This completes the proof.
Clearly, each of the functions ϕ1 (x) = e−x , and ϕ2 (x) = e−2x is a solution of the
homogeneous equation y′′ + 3y′ + 2y = 0. Also, they are linearly independent since
W(\phi_1, \phi_2) = \begin{vmatrix} e^{-x} & e^{-2x} \\ -e^{-x} & -2e^{-2x} \end{vmatrix} = -2e^{-3x} + e^{-3x} = -e^{-3x} \neq 0, \quad \text{for all } x ∈ (−∞, ∞).
y′′p + 3y′p + 2y p = 6.
1.7.1 Exercises
Exercise 1.40 Use Definition 1.7 to show that for any nonzero constant r the set
is linearly independent.
Exercise 1.41 Decide whether or not the solutions given determine a fundamental
set of solutions for the equation.
(a) y′′′ − 3y′′ − y′ + 3y = 0, y1 = e3x + ex , y2 = ex − e−x , y3 = e3x + e−x .
(b) y′′′ − 2y′′ − y′ + 2y = 0, y1 = ex , y2 = e−x , y3 = e2x .
(c) x2 y′′′ + xy′′ − y′ = 0, y1 = 1 + x2 , y2 = 2 + x2 , y3 = ln(x).
(d) x^3 y''' + 2x^2 y'' + 3x y' - 3y = 0, \quad y_1 = x,
y_2 = \cos(\sqrt{3}\ln(x)), \ y_3 = \sin(\sqrt{3}\ln(x)), \quad x > 0.
an y(n) (x) + an−1 y(n−1) (x) + . . . + a2 y′′ (x) + a1 y′ (x) + a0 y(x) = 0 (1.39)
and try to find its solution. In the previous section we noticed that solutions to some
of the differential equations that were considered, were exponential functions. Thus,
we search for solutions of (1.39) of the form
y = e^{rx},
for a constant r to be determined. Observing that
\frac{d^k}{dx^k}\,e^{rx} = r^k e^{rx} \quad \text{for } k = 0, 1, 2, \ldots, \qquad (1.40)
A substitution of (1.40) into (1.39) leads to
e^{rx}\big(a_n r^n + a_{n-1} r^{n-1} + \cdots + a_2 r^2 + a_1 r + a_0\big) = 0.
Since e^{rx} is never zero, r must be a root of the characteristic equation
a_n r^n + a_{n-1} r^{n-1} + \cdots + a_2 r^2 + a_1 r + a_0 = 0. \qquad (1.41)
Distinct Roots
Now suppose the roots of (1.41) can be found. Then we can always write the funda-
mental solution or general solution of (1.39). The easiest case is when all the roots
ri , i = 1, 2, . . . , n are real and distinct. That is, no two roots are the same, or
ri ̸= r j , i, j = 1, 2, . . . , n.
We have the following theorem.
Theorem 1.7 (Distinct Real Roots) Suppose all the roots of (1.41) ri , i = 1, 2, . . . , n
are real and distinct. Then the general solution of (1.39) is given by
y = \sum_{k=1}^{n} c_k\,e^{r_k x}, \qquad (1.42)
for constants ck , k = 1, 2, . . . , n.
Proof Since the roots are real and distinct, the set {erk x , k = 1, 2, . . . , n} is linearly
independent. Moreover, each function in the set is a solution of (1.39) and hence they
form a fundamental set of solutions. Then, by Theorem 1.6, the solution is given by
(1.42).
Example 1.16 Consider the third order differential equation
y′′′ + 2y′′ − y′ − 2y = 0.
Its characteristic equation is found to be r3 + 2r2 − r − 2 = 0, which factors into
(r2 − 1)(r + 2) = 0. Thus the three roots are −2, −1, and 1 and they are real and
distinct and hence by Theorem 1.7 the general solution is
y = c1 e−2x + c2 e−x + c3 ex .
□
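An added illustration of Theorem 1.7 on Example 1.16 (assuming SymPy is available): compute the roots of the characteristic equation and the general solution.

import sympy as sp

x, r = sp.symbols('x r')
y = sp.Function('y')

# Characteristic equation r**3 + 2*r**2 - r - 2 = 0 of y''' + 2y'' - y' - 2y = 0.
print(sp.solve(r**3 + 2 * r**2 - r - 2, r))     # [-2, -1, 1] (possibly in another order)

sol = sp.dsolve(y(x).diff(x, 3) + 2 * y(x).diff(x, 2) - y(x).diff(x) - 2 * y(x), y(x))
print(sol)   # C1*exp(-2x) + C2*exp(-x) + C3*exp(x), up to the ordering of constants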
Repeated Roots
Now we turn our attention to the case when the characteristic equation (1.41) has
some of its roots repeated. In such cases, we are not able to produce n linearly in-
dependent solutions using Theorem 1.7. For example, if the characteristic equation
of a given differential equation has the roots −1, 1, 2, and 2, then we can only pro-
duce the three linearly independent functions e−x , ex , and e2x . The problem is then
to find a way to obtain the linearly independent solutions. To that end we introduce
the symbol L to represent a linear operator in the sense that for functions y1 and y2
in an appropriate space
L (c1 y1 + c2 y2 ) = c1 L y1 + c2 L y2 , for constants c1 , c2 .
Thus, in terms of the operator L , equation (1.39) can take the form L y = 0,
where
L = a_n\,\frac{d^n}{dx^n} + a_{n-1}\,\frac{d^{n-1}}{dx^{n-1}} + \cdots + a_1\,\frac{d}{dx} + a_0. \qquad (1.43)
In addition, we introduce the operator D = \frac{d}{dx} and hence the notations
Dy = y', \quad D^2 y = y'', \quad \ldots, \quad D^n y = y^{(n)}.
For a constant d,
(D - d)y = Dy - dy = y' - dy.
In terms of D, the operator L factors according to the roots r_1, r_2, \ldots, r_n of the characteristic equation (1.41) as
L = a_n (D - r_1)(D - r_2)\cdots(D - r_n). \qquad (1.44)
Now suppose that (1.41) has a simple root r0 and another root r1 with multiplicity k,
where k is an integer such that k > 1. Then by Exercise 1.42, equation (1.44) reduces
to
L = (D − r1 )k (D − r0 ) = (D − r0 )(D − r1 )k . (1.45)
Then setting (1.45) equal to zero, which corresponds to the differential equation L y = 0,
we arrive at the two solutions y_0 = e^{r_0 x} and y_1 = e^{r_1 x}. Remember, we need to find
k + 1 linearly independent solutions for the construction of the general solution. Thus,
there are k − 1 missing linearly independent solutions. Applying the operator in
(1.45) to y yields
L y = (D - r_0)(D - r_1)^k y = (D - r_0)\big((D - r_1)^k y\big).
By setting L y = 0 we arrive at
(D - r_1)^k y = 0. \qquad (1.46)
Every solution of the kth order differential equation in (1.46) will also be a solution
of the original differential equation L y = 0. Since er1 x is already a known solution,
we search for other solutions of the form
y = u(x)\,e^{r_1 x},
for some function u(x). Since
(D - r_1)\big(u(x)e^{r_1 x}\big) = (Du(x))e^{r_1 x} + r_1 u(x)e^{r_1 x} - r_1 u(x)e^{r_1 x},
we have
(D - r_1)\big(u(x)e^{r_1 x}\big) = (Du(x))e^{r_1 x}.
Thus y = u(x)e^{r_1 x} is a solution of (1.46) if and only if (D^k u(x))e^{r_1 x} = 0. But this
holds if and only if D^k u(x) = 0. Since D^k u(x) = u^{(k)}(x) = 0, the solution is
u(x) = c_0 + c_1 x + c_2 x^2 + \ldots + c_{k-1} x^{k-1},
□
Complex Roots
We discuss the situation when one of the roots is complex. That is, if (1.41) has a
simple complex root, then it occurs together with its conjugate as the pair α ± iβ, where α
and β are real and i = \sqrt{-1}. Recall
e^{t} = \sum_{n=0}^{\infty} \frac{t^n}{n!} = 1 + t + \frac{t^2}{2!} + \frac{t^3}{3!} + \cdots + \frac{t^n}{n!} + \cdots.
e^{ix} = \sum_{n=0}^{\infty} \frac{(ix)^n}{n!} = 1 + ix - \frac{x^2}{2!} - \frac{ix^3}{3!} + \frac{x^4}{4!} + \frac{ix^5}{5!} - \cdots
= \Big(1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \cdots\Big) + i\Big(x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots\Big)
= \cos(x) + i\sin(x).
De^{rx} = r\,e^{rx}.
For emphasis, e^{rx} will be a solution of the differential equation given by (1.39) if and
only if r is a root of its characteristic equation given by (1.41). Thus, if the conjugate
complex pair of roots r_1 = α + iβ and r_2 = α − iβ are simple (nonrepeated), then
the corresponding part of the general solution is
K_1 e^{(\alpha + i\beta)x} + K_2 e^{(\alpha - i\beta)x} = K_1 e^{\alpha x}\big(\cos(\beta x) + i\sin(\beta x)\big) + K_2 e^{\alpha x}\big(\cos(\beta x) - i\sin(\beta x)\big)
= e^{\alpha x}\big(c_1\cos(\beta x) + c_2\sin(\beta x)\big),
where c1 = K1 + K2 , and c2 = (K1 − K2 )i. It is easy to verify that eαx cos(β x),
and eαx sin(β x), are linearly independent. As a consequence we have the following
theorem.
Theorem 1.9 (Complex Simple Roots) Suppose the characteristic equation (1.41)
has a nonrepeated pair of complex conjugate roots α ± iβ . Then the corresponding
part of the general solution of (1.39) is
e^{\alpha x}\big(c_1\cos(\beta x) + c_2\sin(\beta x)\big).
□
1.8.1 Exercises
Exercise 1.42 Show that for constants a, and b and a function y(x) that is differen-
tiable,
(D − a)(D − b)y = (D − b)(D − a)y.
Exercise 1.43 Use an induction argument to show that
(D - r_1)^k\big(u(x)e^{r_1 x}\big) = (D^k u(x))\,e^{r_1 x}.
De^{rx} = r\,e^{rx}.
an y(n) (x) + an−1 y(n−1) (x) + . . . + a2 y′′ (x) + a1 y′ (x) + a0 y(x) = f (x), (1.48)
an y(n) (x) + an−1 y(n−1) (x) + . . . + a2 y′′ (x) + a1 y′ (x) + a0 y(x) = 0, (1.49)
L y = f, L y = 0,
L (y p − z) = L y p − L z = f − f = 0.
The method of this section applies only to functions f(x) that are polynomials in
x, combinations of sines or cosines, exponentials in x, or combinations of the afore-
mentioned forms of f(x). We illustrate the idea by displaying a few examples.
Example 1.19 The differential equation
y′′ − 3y′ + 2y = 4
equation we arrive at the relation 2Ae3x = 2e3x . This gives A = 1, and hence the
general solution is
□
Example 1.21 The equation
1.9.1 Exercises
In Exercises 1.52–1.53 the characteristic equation and the forcing function of a cer-
tain differential equation are given. Write down the particular solution without solv-
ing for the coefficients.
Exercise 1.52
Exercise 1.53
f(x)                                    y_p
---------------------------------------------------------------------------------------
Constant C                              A
e^{ax}                                  A e^{ax}
C x^n,  n = 0, 1, 2, ...                A_0 + A_1 x + A_2 x^2 + ... + A_n x^n
cos(bx) or sin(bx)                      A_1 cos(bx) + A_2 sin(bx)
x^n cos(bx) or x^n sin(bx)              (A_0 + A_1 x + ... + A_n x^n) cos(bx) + (B_0 + B_1 x + ... + B_n x^n) sin(bx)
x^n e^{ax}                              (A_0 + A_1 x + ... + A_n x^n) e^{ax}
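As an added illustration of how the table is used, the following SymPy sketch applies the constant ansatz from the first row to the visible Example 1.19, y'' − 3y' + 2y = 4, and solves for the coefficient:

import sympy as sp

x, A = sp.symbols('x A')

# Ansatz from the table: a constant particular solution y_p = A for f(x) = 4.
y_p = A
residual = sp.diff(y_p, x, 2) - 3 * sp.diff(y_p, x) + 2 * y_p - 4
print(sp.solve(residual, A))      # [2], so y_p = 2

# The full general solution, for comparison.
y = sp.Function('y')
print(sp.dsolve(y(x).diff(x, 2) - 3 * y(x).diff(x) + 2 * y(x) - 4, y(x)))
# Eq(y(x), C1*exp(x) + C2*exp(2*x) + 2), up to the naming of the constants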
where the functions u1 , and u2 are to be found and continuous on the interval I. For
the rest of this section we suppress the independent variable x in (1.53). Differentiating
(1.53) with respect to x and imposing the auxiliary condition
u_1'\,y_1 + u_2'\,y_2 = 0, \qquad (1.54)
we obtain
y_p' = u_1\,y_1' + u_2\,y_2'.
Substituting y p and y′p into (1.51) and making use of the fact that y1 (x), and y2 (x) are
known solutions of the corresponding homogeneous equation (1.52), we arrive at the
relation
u′1 y′1 + u′2 y′2 = f (x). (1.55)
Solving (1.54) and (1.55) by using the process of elimination we get
u_1' = \frac{f(x)\,y_2}{y_2 y_1' - y_1 y_2'} \quad \text{and} \quad u_2' = \frac{f(x)\,y_1}{y_1 y_2' - y_2 y_1'}. \qquad (1.56)
Using Wronskian notations, and hence the name of this section, we arrive at the easy
formulae to remember
u_1' = \frac{W_1}{W} \quad \text{and} \quad u_2' = \frac{W_2}{W}, \qquad (1.57)
where
W = \begin{vmatrix} y_1 & y_2 \\ y_1' & y_2' \end{vmatrix}, \quad W_1 = \begin{vmatrix} 0 & y_2 \\ f(x) & y_2' \end{vmatrix}, \quad \text{and} \quad W_2 = \begin{vmatrix} y_1 & 0 \\ y_1' & f(x) \end{vmatrix}.
Before we go for an example, we briefly discuss how the method can be extended to
nonhomogenous nth order differential equations of the form
y(n) (x) + Pn−1 (x)y(n−1) (x) + . . . + P2 (x)y′′ (x) + P1 (x)y′ (x) + P0 (x)y(x) = f (x).
(1.58)
If its corresponding homogeneous equation has the homogeneous solution yh (x) =
c1 y1 + c2 y2 + . . . + cn yn , then the particular solution y p is of the form
y_p(x) = u_1 y_1 + u_2 y_2 + \ldots + u_n y_n,
where
u_i' = \frac{W_i}{W}, \quad i = 1, 2, \ldots, n.
Here,
W(y_1, y_2, \ldots, y_n) = \begin{vmatrix} y_1 & y_2 & \cdots & y_n \\ y_1' & y_2' & \cdots & y_n' \\ \vdots & \vdots & \ddots & \vdots \\ y_1^{(n-1)} & y_2^{(n-1)} & \cdots & y_n^{(n-1)} \end{vmatrix},
and Wi is the determinant obtained by replacing the ith column of the Wronskian
by
\begin{pmatrix} 0 \\ 0 \\ \vdots \\ f(x) \end{pmatrix}.
We provide the following example.
Example 1.22 We consider (1.50); that is, y'' + 4y = \sec(2x). First, the homogeneous solution is
y_h = c_1\sin(2x) + c_2\cos(2x),
so that y_1 = \sin(2x), y_2 = \cos(2x), and f(x) = \sec(2x). We compute
W = \begin{vmatrix} \sin(2x) & \cos(2x) \\ 2\cos(2x) & -2\sin(2x) \end{vmatrix} = -2, \qquad W_1 = \begin{vmatrix} 0 & \cos(2x) \\ \sec(2x) & -2\sin(2x) \end{vmatrix} = -1,
and
W_2 = \begin{vmatrix} \sin(2x) & 0 \\ 2\cos(2x) & \sec(2x) \end{vmatrix} = \tan(2x).
Thus,
u_1' = \frac{1}{2}, \qquad u_2' = -\frac{1}{2}\tan(2x).
An integration gives
u_1 = \frac{x}{2}, \qquad u_2 = \frac{1}{4}\ln\big|\cos(2x)\big|.
Hence
y_p = \frac{x}{2}\sin(2x) + \frac{1}{4}\ln\big|\cos(2x)\big|\cos(2x).
Finally, the general solution is y = yh + y p . □
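An added verification of Example 1.22 (assuming, as the computations above indicate, that (1.50) is y'' + 4y = sec(2x), and that SymPy is available):

import sympy as sp

x = sp.symbols('x')

# Particular solution from Example 1.22.
y_p = (x / 2) * sp.sin(2 * x) + sp.Rational(1, 4) * sp.log(sp.cos(2 * x)) * sp.cos(2 * x)

# Residual of y'' + 4y = sec(2x); it should reduce to zero.
residual = sp.diff(y_p, x, 2) + 4 * y_p - sp.sec(2 * x)
print(sp.simplify(residual))    # 0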
1.10.1 Exercises
In Exercises 1.55- 1.60 solve the given differential equation.
Exercise 1.55
y′′ + 9y = csc(3x).
Exercise 1.56
y'' + 2y' + y = \frac{\ln(x)}{e^{x}}, \quad x > 0.
Exercise 1.57
y′′ + y′ + 2y = e−x sin(2x).
Exercise 1.58
y'' - y = \frac{2e^{x}}{e^{x} + e^{-x}}.
Exercise 1.59
y'' - 6y' + 9y = \frac{1}{x}, \quad x > 0.
Exercise 1.60
e^{4x}\big(y'' + 8y' + 16y\big) = \frac{1}{x^{2}}, \quad x > 0.
an xn y(n) (x) + an−1 xn−1 y(n−1) (x) + . . . + a2 x2 y′′ (x) + a1 xy′ (x) + a0 y(x) = 0, x>0
where a_j, j = 0, 1, \ldots, n, are constants. Euler equations are important since they arise
in many applications and in partial differential equations. In addition, they appear
again in Chapter 4. We concentrate on the second-order Cauchy-Euler homogeneous equation
ax2 y′′ + dxy′ + ky = 0,
and write it in the form
y'' + \frac{b}{x}\,y' + \frac{c}{x^{2}}\,y = 0, \quad x > 0. \qquad (1.59)
We note that the coefficients of (1.59) are continuous everywhere except at x = 0.
However, we shall consider the equation over the interval (0, ∞). We solve the
Cauchy-Euler equation (1.59) by making the substitution x = et , or equivalently t =
ln(x). Once we rewrite (1.59) in terms of the new variables y, and t, then it is possible
to use the method of Section 1.8 to find its general solution. Let
x = et , or equivalently t = ln(x).
Then,
\frac{dy}{dx} = \frac{dy}{dt}\,\frac{dt}{dx} = \frac{dy}{dt}\,\frac{1}{x} = e^{-t}\,\frac{dy}{dt}.
Moreover,
y'' = \frac{d^2 y}{dx^2} = \frac{d}{dx}\Big(\frac{dy}{dx}\Big) = e^{-t}\,\frac{d}{dt}\Big(e^{-t}\,\frac{dy}{dt}\Big)
= e^{-t}\Big(-e^{-t}\,\frac{dy}{dt} + e^{-t}\,\frac{d^2 y}{dt^2}\Big)
= -e^{-2t}\,\frac{dy}{dt} + e^{-2t}\,\frac{d^2 y}{dt^2}.
Substituting into (1.59) and noting that x2 = e2t , we arrive at the second-order differ-
ential equation
\frac{d^2 y}{dt^2} + (b - 1)\,\frac{dy}{dt} + c\,y = 0. \qquad (1.60)
We remark that on the interval (−∞, 0) we make the substitution |x| = et , or equiva-
lently, x = −et , which will again reduce (1.59) to (1.60). Thus, once (1.60) is solved,
we use the inverse substitution t = ln |x| to obtain solutions of the original equation
(1.59). Recall the three cases that we discussed in Section 1.8.
(a) (Distinct roots) Let r_1 and r_2 be the two distinct roots of the auxiliary equation
of (1.60). Then, the general solution is given by
y(t) = c_1 e^{r_1 t} + c_2 e^{r_2 t}, \quad \text{or, with } t = \ln|x|, \quad y(x) = c_1 |x|^{r_1} + c_2 |x|^{r_2}.
(b) (Repeated roots) Let r be a repeated root of the auxiliary equation of (1.60).
Then, the general solution is given by
y(t) = (c_1 + c_2 t)\,e^{r t}, \quad \text{or} \quad y(x) = |x|^{r}\big(c_1 + c_2 \ln|x|\big).
(c) (Complex roots) Let α ± iβ be a pair of complex conjugate roots of the auxiliary equation of (1.60). Then, the general solution is
y(t) = e^{\alpha t}\big(c_1\cos(\beta t) + c_2\sin(\beta t)\big),
for constants c_1 and c_2. Letting t = \ln|x|, we obtain the general solution of our
original equation to be
y(x) = |x|^{\alpha}\big(c_1\cos(\beta\ln|x|) + c_2\sin(\beta\ln|x|)\big).
As an example, consider
x^2 y'' - x y' + y = \ln(x), \quad x > 0.
We need to put the equation in the standard form by dividing by x^2, and arrive at
y'' - \frac{1}{x}\,y' + \frac{1}{x^2}\,y = \frac{\ln(x)}{x^2}, \quad x > 0.
Next we find y_h of
y'' - \frac{1}{x}\,y' + \frac{1}{x^2}\,y = 0, \quad x > 0.
Here b = −1 and c = 1, so the auxiliary equation of (1.60) is
r^2 - 2r + 1 = 0,
which has the repeated root r = 1. Hence, by case (b),
y_h = c_1 x + c_2 x\ln(x).
To find y_p we use the method of Section 1.10. Let y_1 = x and y_2 = x\ln(x), and set
f(x) = \frac{\ln(x)}{x^2}. Then W = x, so that u_2' = \frac{\ln(x)}{x^2}. Using the substitution u = \ln(x), we obtain
u_2 = -u e^{-u} - e^{-u} = -\frac{\ln(x)}{x} - \frac{1}{x}.
Then, after some calculations we arrive at
y_p = u_1 y_1 + u_2 y_2 = 2 + \ln(x).
Finally,
y = yh + y p = c1 x + c2 x ln(x) + 2 + ln(x),
is the general solution. □
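A short added check (assuming SymPy is available) that the general solution found above satisfies the Cauchy-Euler equation of the worked example:

import sympy as sp

x, c1, c2 = sp.symbols('x c1 c2', positive=True)

# General solution found above: y = c1*x + c2*x*ln(x) + 2 + ln(x).
y = c1 * x + c2 * x * sp.log(x) + 2 + sp.log(x)

# Residual of x**2*y'' - x*y' + y = ln(x), the equation in the worked example.
residual = x**2 * sp.diff(y, x, 2) - x * sp.diff(y, x) + y - sp.log(x)
print(sp.simplify(residual))   # 0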
1.11.1 Exercises
Exercise 1.61 Solve each of the given differential equations.
(a) x2 y′′ − 4xy′ + 6y = 0, y(−2) = 8, y′ (−2) = 0.
(b) x2 y′′ − xy′ + y = 4x ln(x), x > 0.
(c) x2 y′′ + xy′ + y = sec(ln(x)), x > 0.
(d) x^2 y'' + 3x y' + y = \sqrt{x}, \quad x > 0.
(e) x2 y′′ − 2xy′ + 2y = 2x3 , x > 0.
(f) x2 y′′ + 2xy′ + y = ln(x), y(1) = y′ (1) = 0.
Exercise 1.62 Let α be a constant. Use the substitution x − α = e^{t} to reduce the equation
(x - \alpha)^2 y'' + b(x - \alpha)\,y' + c\,y = 0
to the equation
\frac{d^2 y}{dt^2} + (b - 1)\,\frac{dy}{dt} + c\,y = 0.
In Exercises 1.63-1.64 use the results of Exercise 1.62 to solve each of the given
differential equation.
Exercise 1.63
Exercise 1.64
(x − 2)2 y′′ + y = 0, y(1) = 3, y′ (1) = 1.
2
Partial Differential Equations
2.1 Introduction
A partial differential equation, PDE for short, is an equation that contains an unknown
function u and its partial derivatives. Recall from Chapter 1, Section 1.1 that given a
function of two variables, f (x, y), the partial derivative of f with respect to x is the
rate of change of f as x varies, keeping y constant and it is given by
\frac{\partial f}{\partial x} = \lim_{h \to 0} \frac{f(x + h, y) - f(x, y)}{h}.
Similarly, the partial derivative of f with respect to y is the rate of change of f as y
varies, keeping x constant and it is given by
\frac{\partial f}{\partial y} = \lim_{h \to 0} \frac{f(x, y + h) - f(x, y)}{h}.
More often we write f_x, f_y to denote \frac{\partial f}{\partial x} and \frac{\partial f}{\partial y}, respectively. Similar notations will
be used to denote higher partial derivatives and mixed partial derivatives. Let D be a
subset of R2 and u = u(x, y) such that u : D → R. Then we may denote the general
first order PDE in u(x, y) by
F(x, y, u(x, y), ux (x, y), uy (x, y)) = 0,
or
F(x, y, u, ux , uy ) = 0, (2.1)
for some function F. In this Chapter, we limit our discussion to PDEs in two inde-
pendent variables. Below we list some important PDEs.
ut + cux = 0, c ∈ R (Transport equation).
□
Order and linearity are two of the main properties of PDEs.
Definition 2.1 The order of a partial differential equation is the highest order
derivative in the given PDE.
Definition 2.2 A given partial differential equation is said to be linear if the un-
known function and all of its derivatives enter linearly.
For example, all the equations listed above are linear except Burgers’ equation. Now
to better understand linearity we utilize the operator concept L on an appropriate
space, where L is a differential operator. Recall from Chapter 1, that an operator
is really just a function that takes a function as an argument, instead of the numbers
we are used to dealing with in ordinary functions. For example, L assigns to u a new function
L u. As another example, if we take
L = \frac{\partial^2}{\partial t^2} - \frac{\partial^2}{\partial x^2},
then
then
L u = utt − uxx .
The next definition gives a precise and convenient way to test for linearity.
Definition 2.3 An operator L is said to be linear if it satisfies
(a)
L (u1 + u2 ) = L u1 + L u2 ,
(b)
L (cu1 ) = cL u1 ,
L u = utt − uxx .
L u = uut − ux .
L(u + v) = (u + v)(u + v)_t - (u + v)_x = [u u_t - u_x] + [v v_t - v_x] + u v_t + v u_t \neq L u + L v.
For instance, we show that for each n = 1, 2, \ldots, N, the function u_n(x,t) = e^{-n^2\pi^2 t}\sin(n\pi x)
is a solution of the heat equation u_{xx} - u_t = 0, t > 0, for the temperature u = u(x,t)
in a rod, considered as a function of the distance x measured along the rod and of the
time t. In addition, we show for constants ci , i = 1, 2, . . . , N that
N
u(x,t) = ∑ cn un (x,t)
n=1
is also a solution.
To do so, we let L be the differential operator given by L = \frac{\partial^2}{\partial x^2} - \frac{\partial}{\partial t}. It is clear that
L is linear. Now for any fixed n, n = 1, 2, \ldots, N, we have
L u_n = \frac{\partial^2}{\partial x^2}\Big(e^{-n^2\pi^2 t}\sin(n\pi x)\Big) - \frac{\partial}{\partial t}\Big(e^{-n^2\pi^2 t}\sin(n\pi x)\Big)
= \frac{\partial}{\partial x}\Big(n\pi\,e^{-n^2\pi^2 t}\cos(n\pi x)\Big) + n^2\pi^2 e^{-n^2\pi^2 t}\sin(n\pi x)
= -n^2\pi^2 e^{-n^2\pi^2 t}\sin(n\pi x) + n^2\pi^2 e^{-n^2\pi^2 t}\sin(n\pi x)
= 0.
Since L is linear, it follows that L\Big(\sum_{n=1}^{N} c_n u_n\Big) = \sum_{n=1}^{N} c_n\,L u_n = 0, so u(x,t) = \sum_{n=1}^{N} c_n u_n(x,t) is also a solution.
□
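The computation above can be reproduced symbolically; the following added sketch (assuming SymPy is available) checks L u_n = 0 and that a finite linear combination of the u_n is again a solution:

import sympy as sp

x, t = sp.symbols('x t')
n = sp.symbols('n', positive=True, integer=True)

u_n = sp.exp(-n**2 * sp.pi**2 * t) * sp.sin(n * sp.pi * x)

# L u = u_xx - u_t for the heat equation.
L = lambda u: sp.diff(u, x, 2) - sp.diff(u, t)
print(sp.simplify(L(u_n)))                         # 0

# Linearity: a finite sum of the u_n with constant coefficients is also a solution.
c1, c2, c3 = sp.symbols('c1 c2 c3')
combo = sum(c * u_n.subs(n, k) for c, k in zip((c1, c2, c3), (1, 2, 3)))
print(sp.simplify(L(combo)))                       # 0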
Let L be a differential operator. Then the partial differential equation
L u = f \qquad (2.2)
is called nonhomogeneous, and its corresponding homogeneous equation is
L u = 0. \qquad (2.3)
If we assume L is linear, then the construction of the general solution of the non-
homogeneous PDE given by (2.2) is similar to its counterpart in ordinary differen-
tial equations. Let u p be a particular solution of (2.2) and uh be the homogeneous
solution of (2.3). Then, due to the linearity of the differential operator L we see
that
L (uh + u p ) = L uh + L u p = 0 + f = f .
Thus, it suffices to find u p of (2.2) and add it to the homogeneous solution uh of (2.3)
to get the general solution
u = uh + u p .
When the PDE is linear and involves only simple derivatives of only one variable, it
is more likely that it can be solved along the lines of an ordinary differential equation,
as the next example shows.
Example 2.5 In this example we display various forms of the solution u = u(x, y)
for the following PDEs
(a) uy = 0,
(b) uyy + u = 0,
(c) uyyx = 0.
We begin with (a). The given PDE, uy = 0 has no partial derivatives with respect
to x, which indicates that the solution u is a function of x only. Thus, the solution is
u(x, y) = g(x), for some function g. Suppose we impose the initial condition u(x, a) =
e2x , then
e2x = u(x, a) = g(x),
which uniquely determines the function g.
(b) Now the PDE, uyy + u = 0 can be thought of as the second-order ODE
z′′ +z = 0, which has the solution z = c1 cos(y)+c2 sin(y). Or, u(x, y) = g(x) cos(y)+
h(x) sin(y), since the constants c1 and c2 may depend on the other variable x, where
the functions g and h are differentiable.
2.1.1 Exercises
Exercise 2.1 Determine the order of each equation and use the operator L to decide the
linearity or nonlinearity of the given equations.
Consider the first-order linear partial differential equation with constant coefficients
A u_x + B u_y + C u = 0, \qquad A^2 + B^2 \neq 0, \qquad (2.5)
where we have suppressed the independent variables x and y. Note that the requirement
A^2 + B^2 \neq 0 implies that A and B are not both zero at the same time; otherwise,
we would not have a differential equation to solve. Our aim is to use a linear transfor-
mation and transform (2.5) into an ordinary differential equation in a single indepen-
dent variable, say, x, and the dependent variable u. We begin by letting ξ = ξ (x, y)
and η(x, y) = η, where
ξ = c11 x + c12 y (2.6)
and
η = c21 x + c22 y (2.7)
where the constants c_{11}, c_{12}, c_{21}, and c_{22} are to be appropriately chosen in order to
reduce the PDE to an ODE. Using the chain rule, we have
u_x = u_\xi\,\xi_x + u_\eta\,\eta_x = c_{11}\,u_\xi + c_{21}\,u_\eta \qquad (2.8)
and
u_y = u_\xi\,\xi_y + u_\eta\,\eta_y = c_{12}\,u_\xi + c_{22}\,u_\eta. \qquad (2.9)
Now substituting (2.8) and (2.9) into (2.5) and rearranging the terms we arrive
at
\big(Ac_{11} + Bc_{12}\big)u_\xi + \big(Ac_{21} + Bc_{22}\big)u_\eta + Cu = 0. \qquad (2.10)
Assume A ≠ 0 and choose c_{11} = 1, c_{12} = 0, c_{21} = B, and c_{22} = −A. Then
\xi = x \quad \text{and} \quad \eta = Bx - Ay,
and (2.10) reduces to A u_\xi + C u = 0, whose solution gives
u(x, y) = f(Bx - Ay)\,e^{-\frac{C}{A}x}, \qquad (2.11)
for an arbitrary differentiable function f. If instead B ≠ 0, an analogous choice leads to
u(x, y) = f(Bx - Ay)\,e^{-\frac{C}{B}y}. \qquad (2.12)
Remark 2 In the case that both A and B are not zero, you may use either (2.11) or
(2.12).
Example 2.7 Solve
ux + uy + u = x + y, (2.13)
subject to the initial condition
u(0, y) = y2 .
First we find u_h, and since neither A nor B is zero, we have the luxury of using either
(2.11) or (2.12). Using (2.12) with A = B = C = 1, we obtain
u_h = e^{-y} f(x - y).
For a particular solution we try
u_p = a + bx + cy.
Substituting u_p into (2.13) gives
a + b + c + bx + cy = x + y.
It follows from the above expression that a = −2, b = c = 1. Then the general solu-
tion is
u(x, y) = e−y f (x − y) + x + y − 2.
Our next task is to use the initial data to uniquely determine the function f . The
initial data implies that when x = 0 and y = y we have u = y2 . Thus, it follows from
the general solution that
y2 = e−y f (−y) + y − 2.
This gives f (−y) = ey (y2 − y + 2). Let k = −y. Then f (k) = e−k (k2 + k + 2), from
which we get
f (x − y) = ey−x [(x − y)2 + (x − y) + 2].
It follows from
u(x, y) = e−y f (x − y) + x + y − 2
that
u(x, y) = e−x [(x − y)2 + (x − y) + 2] + x + y − 2.
□
Consider the transport equation u_t + c\,u_x = 0, subject to
u(x, 0) = f(x), \qquad (2.15)
where f(x) = 1 for x ≤ 0, f(x) = 1 − x for 0 < x ≤ 1, and f(x) = 0 for x > 1.
Using the above remark, we immediately obtain the solution to be
u(x, t) = f(x - ct).
The easiest way is to replace x in f with x − ct and rearrange the domains. Thus,
u(x, t) = f(x - ct) = \begin{cases} 1, & x - ct \le 0 \\ 1 - x + ct, & 0 < x - ct \le 1 \\ 0, & x - ct > 1. \end{cases}
It follows that
u(x, t) = \begin{cases} 1, & x \le ct \\ 1 - x + ct, & ct < x \le 1 + ct \\ 0, & x > 1 + ct. \end{cases}
□
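Since the solution is just the initial profile translated to the right by ct, it is easy to tabulate or sketch; the following added Python sketch (with c = 1 chosen purely for illustration) evaluates u(x, t) at t = 1, 2:

def f(x):
    """Initial profile from the example: 1 for x <= 0, 1 - x on (0, 1], 0 for x > 1."""
    if x <= 0:
        return 1.0
    if x <= 1:
        return 1.0 - x
    return 0.0

def u(x, t, c=1.0):
    """Traveling-wave solution u(x, t) = f(x - c*t) of the transport equation."""
    return f(x - c * t)

for t in (1, 2):
    print(t, [round(u(x, t), 2) for x in (0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0)])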
Next, we give a geometrical interpretation of the characteristic lines and solutions.
Consider the simpler form of (2.5) with C = 0. That is,
A u_x + B u_y = 0, \qquad (2.16)
where A^2 + B^2 \neq 0; that is, A and B cannot both be zero. Let · denote the inner
product in R^2. Then (2.16) is equivalent to
\langle A, B\rangle \cdot \langle u_x, u_y\rangle = 0.
If we let \vec{v} = \langle A, B\rangle, then the left side of the equation is the directional derivative
of u in the direction of the vector \vec{v}. That is, the solution u of (2.16) must be constant
in the direction of the vector \vec{v} = A\mathbf{i} + B\mathbf{j}. The lines parallel to the vector \vec{v} have
the equation, for an arbitrary constant K,
Bx − Ay = K, (2.17)
FIGURE 2.1
Characteristic lines; change of coordinates: \xi = \langle x, y\rangle \cdot \langle A, B\rangle and \eta = \langle x, y\rangle \cdot \langle B, -A\rangle.
since the vector \langle B, -A\rangle is orthogonal to \vec{v}, and as such is a normal vector to all
lines that are parallel to \vec{v}. Thus, (2.17) provides a family of lines, and each one of
them is uniquely determined by the specific value of K. The family of lines given
by (2.17) are called characteristic lines for the equation (2.16). We conclude that the
solution u(x, y) is constant in the direction of \vec{v}, and hence along each member of the family
of lines given by (2.17). Any such line is determined by K = Bx − Ay, and hence u
will depend only on Bx − Ay. It follows that
u(x, y) = f(Bx - Ay),
for an arbitrary differentiable function f.
2.2.2 Exercises
Exercise 2.7 Give all the details in arriving at Equation (2.12).
Exercise 2.8 Redo Example 2.7 using (2.11).
Exercise 2.9 For constants A and B use the transformation
ξ = Ax + By, η = Bx − Ay
subject to
u(0, y) = 2 + y.
Exercise 2.11 Solve each of the given PDEs.
(a) ux + uy + u = sin(x).
(b) ux + u = x, u(0, y) = y2 .
(c) 2ux + uy = x + ey , u(0, y) = y2 .
Exercise 2.12 Solve
(a) ux − 3uy = sin(x) + cos(y), u(x, 0) = x.
(b) 2ux + uy = x + ey , u(0, y) = y2 .
Exercise 2.13 Solve
1
(a) ux − 5uy = 0, u(x, 0) = .
1 + x2
(b) ux + ut = −3, u(x, 0) = e3x .
Exercise 2.14 Solve for u(x,t) and sketch u(x,t) at t = 1, 2.
subject to
u(x, 0) = f (x), (2.18)
where
1, x≤0
f (x) = 1 − x, 0<x≤1
0, x > 1.
A(x, y)ux (x, y) + B(x, y)uy (x, y) +C(x, y)u(x, y) = G(x, y), (2.19)
where A and B are continuously differentiable functions of the variables (x, y) in the open
set D ⊂ R^2, and A and B do not simultaneously vanish identically in D. In addition,
the functions C and G are continuous in D. We begin by examining the homogeneous equation
geneous equation
Aux + Buy +Cu = 0, (2.20)
ux = uξ ξx + uη ηx ,
and
uy = uξ ξy + uη ηy .
Example 2.9 Consider
x2 ux − xyuy + yu = 0, x ̸= 0, (2.26)
subject to
u(1, y) = e2y .
Based on the above discussion, we have A = x2 , B = −xy, and C = y. The character-
istic equation
dy/dx = −xy/x2
has the characteristic lines xy = k as its solution. Thus, we define η(x, y) = xy. Let ξ(x, y) = x. Then the Jacobian
J = det[ ξx  ηx ; ξy  ηy ] = x ≠ 0.
ux = uξ + yuη , uy = xuη .
Substituting into (2.26), the equation reduces to
x2 uξ + y u = 0.
With x = ξ and y = η/ξ this becomes ξ3 uξ + η u = 0, whose solution is
u(ξ, η) = f (η) e^{η/(2ξ2)},
for an arbitrary differentiable function f.
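Because the reduction above involves several substitutions, it is worth checking the general solution symbolically. The sketch below (SymPy, our own illustration) writes the solution back in the variables x and y, with f an arbitrary function, and verifies the PDE (2.26).

import sympy as sp

x, y = sp.symbols('x y', positive=True)
f = sp.Function('f')
u = f(x*y)*sp.exp(x*y/(2*x**2))          # f(eta)*exp(eta/(2*xi^2)) with xi = x, eta = x*y
pde = x**2*sp.diff(u, x) - x*y*sp.diff(u, y) + y*u
print(sp.simplify(pde))                  # -> 0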
2.2.4 Exercises
Exercise 2.15 Solve
(a) y ux + x uy = 0, u(0, y) = e^{−y2}.
(b) x ux − uy = 0, u(x, 0) = x.
Exercise 2.16 Solve
(a) 3ux − 2uy + u = x, u(x, 0) = 1/(1 + x2).
(b) ux + ut = −3, u(x, 0) = e3x.
Exercise 2.17 Use Remark 4 to solve
(a) xux + uy = 1, u(x, 0) = cos(3x).
(b) xux − yuy + y2 u = y2 , x, y ̸= 0.
FIGURE 2.2
Surface z = u(x, y), with normal vector <ux, uy, −1>.
Hence, the normal vector < ux , uy , −1 > to the surface at a given point is orthog-
onal to the vector < A, B, C > at that point. It follows that the vector < A, B, C >
must be tangent to the surface u(x, y)−z = 0 and therefore integral surfaces must be
formed from the integral curves of the vector field < A, B, C > . Thus, the integral
curves are given as solutions to the system of ODEs
dx/dt = A(x, y, z), dy/dt = B(x, y, z), dz/dt = C(x, y, z). (2.28)
Note that the choice of parameter t in (2.28) is artificial and it can be suppressed to write (2.28) in the form
dx/A(x, y, z) = dy/B(x, y, z) = dz/C(x, y, z). (2.29)
Either of the systems (2.28) or (2.29) is called the characteristic system associated with (2.27). Characteristic curves are solutions of either system (2.28) or (2.29).
If a surface S : z = u(x, y) is a union of characteristic curves, then S is an integral
surface. We have the following theorem.
Theorem 2.2 Assume the point P = (x0 , y0 , z0 ) is a point on the integral surface
S : z = u(x, y). Let γ be the characteristic curve through P. Then γ lies entirely on
S.
Then W (t0 ) = z(t0 ) − u(x0 , y0 ) = 0, since the point P lies on the surface S. It follows
from the chain rule and (2.28) that
dW/dt = dz/dt − ux(x, y) dx/dt − uy(x, y) dy/dt
      = C(x, y, z) − ux(x, y)A(x, y, z) − uy(x, y)B(x, y, z),
Next, we discuss the Cauchy Problem for quasi-linear equation (2.27), which says,
find the integral surface z = u(x, y) of (2.27) containing an initial curve Γ.
Method of solution:
Let the given initial curve Γ be non-characteristic and contained in the surface u(x, y) − z = 0; that is, the tangent to Γ is nowhere parallel to the characteristic vector <A, B, C> along Γ. Parametrize Γ by
Γ : x = x0(s), y = y0(s), z = z0(s).
1) Solve the characteristic system and obtain two independent integral curves u1(x, y, z) = c1 and u2(x, y, z) = c2; apply the parametrized initial data to express c1 and c2 in terms of the parameter s.
2) Eliminate s from the two equations in 1) and obtain the functional relation F(c1, c2) = 0 between c1 and c2.
3) Then the solution to the Cauchy problem is
F(u1(x, y, z), u2(x, y, z)) = 0.
4) If we are given an initial curve, then utilize it to compute the arbitrary function
in 3).
Remark 5 (1) You may obtain different looking solutions, but this depends on
whether you use (2.28) or (2.29). However, once you apply the initial data given
by the initial curve Γ, then solutions should match.
(2) The method can be easily extended to PDEs of multiple variables by adapting the
relations (2.28) and (2.29). We shall explain this in one of the examples below.
Example 2.10 Find u(x, y) satisfying
xux + yuy = u + 1,
and
u(x, 1) = 3x.
Let z = u(x, y). First we parametrize the initial curve Γ given by the data u(x, 1) = 3x.
Let x = s at t = 0. Then
Γ : x = s, y = 1, z = 3s, for t = 0.
The characteristic system (2.28) gives
dx/dt = x, dy/dt = y, dz/dt = z + 1,
with corresponding solutions
x = c1 et , y = c2 et , z = c3 et − 1.
Applying the initial data at t = 0 gives c1 = s, c2 = 1, and c3 = 3s + 1. Since x/y = s and e^t = y, we obtain
z = (3(x/y) + 1)y − 1 = 3x + y − 1.
So the solution is
u(x, y) = 3x + y − 1.
Fig. 2.3 shows the characteristic lines y = x/s intersecting the initial curve y = 1 at exactly one point (they are not in the same direction); hence, as the curves cover the whole plane, the solution is defined everywhere. □
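A short symbolic check of Example 2.10 (a SymPy sketch, our own illustration) confirms the answer against the PDE and the initial data.

import sympy as sp

x, y = sp.symbols('x y')
u = 3*x + y - 1
print(sp.simplify(x*sp.diff(u, x) + y*sp.diff(u, y) - (u + 1)))   # -> 0
print(u.subs(y, 1))                                               # -> 3*x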
In the next example, we revisit the transport equation with incompatible data.
FIGURE 2.3
Characteristic lines y = x/s intersecting the initial curve y = 1.
Example 2.11 Consider the transport equation
ux + c uy = 0,
subject to
u(x, cx) = f (x). (2.32)
To parametrize the initial curve Γ we let x = s at t = 0. Then
Γ : x = s, y = cs, z = f (s), for t = 0.
From (2.28),
dx/dt = 1, dy/dt = c, dz/dt = 0,
and corresponding solutions along Γ are
x = t + c1, y = ct + c2, z = c3.
Applying the initial data at t = 0 gives c1 = s, c2 = cs, and c3 = f (s), so that
x = s + t, y = ct + cs, z = f (s).
We see that it is not feasible to eliminate s and write the solution u in x and y. Notice
that the characteristic line given by y = c(s + t) = cx, for a fixed value of c is the
same as the equation of the initial curve y = cx. In other words, the characteristic lines point in the same direction as the initial curve. □
ux + uy + u − 1 = 0,
subject to
u(x, x + x2 ) = sin(x), x > 0.
Let z = u(x, y). Parametrizing the initial curve Γ by x = s, y = s + s2, z = sin(s) at t = 0, and solving the characteristic system dx/dt = 1, dy/dt = 1, dz/dt = 1 − z, we obtain
x = s + t, y = s + s2 + t, z = 1 − (1 − sin(s)) e^{−t},
Eliminating the parameters (s = √(y − x), t = x − √(y − x)) gives
u(x, y) = 1 − (1 − sin(√(y − x))) e^{√(y−x) − x},
which is defined for y > x. In Fig. 2.4, the traces of the characteristic curves y = x + c, where c = s2, are plotted against the initial curve y = x + x2; the two intersect in the region y > x, where the solution exists. □
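Since the closed form above was assembled from the characteristic equations, a symbolic verification is reassuring. The SymPy sketch below (our own illustration, valid on the region y > x) checks the PDE and the initial curve.

import sympy as sp

x, y = sp.symbols('x y', positive=True)
s = sp.sqrt(y - x)
u = 1 - (1 - sp.sin(s))*sp.exp(s - x)
print(sp.simplify(sp.diff(u, x) + sp.diff(u, y) + u - 1))   # -> 0
print(sp.simplify(u.subs(y, x + x**2)))                     # -> sin(x), for x > 0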
Example 2.13 Consider the transport equation
subject to
u(x, 0) = f (x). (2.34)
To parametrize the initial curve Γ we let x = s at t = 0. Then
Γ : x = s, y = 0, z = f (s), for t = 0.
FIGURE 2.4
The solution exists in the shaded region y > x (bounded by the curves y = x, y = x + 1, and y = x + x2).
Since the solution is constant on the characteristic curves y = (x2 − s2)/2, and the curves intersect the line y = 0 (the initial curve), the solution exists in the region where the traces of these characteristic curves meet the x-axis. Moreover, we must make sure that s2 = x2 − 2y ≥ 0. In conclusion, solutions exist in the region
y ≤ x2/2,
as depicted by Fig. 2.5.
The next example will show that based on the nature of the given equation, you will
need to utilize either (2.28) or (2.29). But first, we make the following remark.
Remark 6 A useful technique for integrating a system of first-order equations is that of multipliers. Recall from algebra that if a/b = c/d, then
a/b = c/d = (λa + µc)/(λb + µd)
for arbitrary values of the multipliers λ, µ. This can be generalized, and one would have
a/b = c/d = e/f = (λa + µc + νe)/(λb + µd + νf) (2.35)
FIGURE 2.5
Feasible region y ≤ x2/2 for the solution.
for arbitrary multipliers λ , µ, and ν. Hopefully, with the right choices of the param-
eters λ , µ, and ν, expression (2.35) leads to simpler systems of ODEs that can be
easily solved. In particular, if λ , µ, and ν, are chosen such that
λ A + µB + νC = 0,
then λ dx + µdy + νdz = 0. Now if there is a function u such that
du = λ dx + µdy + νdz = 0,
then u(x, y, z) = c1 is an integral curve.
Example 2.14 Solve
xux + (x + u)uy = y − x, (2.36)
containing the curve
u(x, 2) = 1 + 2x.
A parametrization of the initial curve Γ gives
Γ : x = s, y = 2, z = 1 + 2s, for t = 0.
Here you cannot use system (2.28), since the resulting ODEs will not be separable. Hence we resort to using (2.29), in combination with (2.35). From (2.29), we see that
dx/x = dy/(x + z) = dz/(y − x).
s = 3/(c1 − 2). (2.39)
Similarly, if we set λ = −1, µ = 1, and ν = 0 in (2.37) we obtain
(dy − dx)/z = dz/(y − x),
or
(y − x) d(y − x) = z dz.
An integration gives the second integral curve
(y − x)2 − z2 = c2.
Applying the initial data we get the relation
(2 − s)2 − (1 + 2s)2 = c2 . (2.40)
Substituting the value of s given by (2.39) into (2.40) produces the expression
(2 − 3/(c1 − 2))2 − (1 + 2·3/(c1 − 2))2 = c2. (2.41)
Finally, to obtain a functional relation of the solution substitute c1 and c2 where c1 is
given by (2.38) and c2 = (y − x)2 − z2 into (2.41). Don’t forget to replace z by u for
the final answer.
□
2.3.1 Exercises
Exercise 2.18 Find u(x, y) satisfying
ux + uy = 1 − u,
subject to
u(x, x + x2 ) = ex , x > 0.
Exercise 2.19 Find u(x, y) satisfying
xuux − uy = 0,
subject to
u(x, 0) = x.
Exercise 2.20 Find u(x, y) satisfying
subject to
u(x, 1/x) = 5.
Exercise 2.21 Find u(x, y) satisfying
uux + uy = 3,
subject to
u(2x2 , 2x) = 0.
Exercise 2.22 Find u(x, y) satisfying
and
u = s, when x = 1/s, y = 2s.
Exercise 2.23 Find the general solution u(x, y) of
(x + u)ux + (y + u)uy = 0,
subject to
u(x, x2 ) = 0.
Exercise 2.25 Find u(x, y) satisfying
xux + yuy = 1 + y2 ,
subject to
u(x, 1) = x + 1.
Answer: u(x, y) = ln(y) + y2/2 + 1/2 + x/y.
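The stated answer to Exercise 2.25 can be checked directly; the SymPy sketch below (our own illustration) verifies the PDE and the data on y = 1.

import sympy as sp

x, y = sp.symbols('x y', positive=True)
u = sp.log(y) + y**2/2 + sp.Rational(1, 2) + x/y
print(sp.simplify(x*sp.diff(u, x) + y*sp.diff(u, y) - (1 + y**2)))   # -> 0
print(sp.simplify(u.subs(y, 1)))                                     # -> x + 1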
FIGURE 2.6
Characteristic lines y = (x − s)/c, with slope 1/c, do not run into each other.
Γ : x = s, y = 0, z = h(s), at t = 0.
If we take the initial function h(x) = e^{−x2}, then the solution becomes
u(x, y) = e^{−(x−cy)2},
which is graphed in Fig. 2.8 for wave speed c = 1 and different values of y.
Now we turn our attention to equation (2.42) subject to the initial data given by
(2.43). One of the characteristic equations is
dx/dy = a(u).
FIGURE 2.7
Wave propagation of the solutions u(x + cy, y) = h(s − cy), considering y as time.
FIGURE 2.8
Wave propagation of the solutions u(x, y) = e^{−(x−cy)2} for wave speed c = 1, shown at y = 0, 0.5, 1, 2.
du(x, y)/dy = uy + ux dx/dy = uy + a(u)ux = 0.
This shows that the solution u is constant along the characteristics. Moreover, the characteristics are straight lines, as is obvious from the calculation below:
d2x/dy2 = d/dy (dx/dy) = a′(u) du(x, y)/dy = 0.
Let
Γ : x = s, y = 0, z = h(s), at t = 0.
Using system (2.28), we have
dx/dt = a(z), dy/dt = 1, dz/dt = 0,
FIGURE 2.9
Characteristic lines running into each other in the nonlinear case, unlike Fig. 2.6.
or
ux = h′(s) / (1 + a′(u)h′(s)y). (2.47)
Similarly, taking the partial derivative with respect to y we get
uy = h′(x − a(u)y)(−a′(u)uy y − a(u)),
which gives
uy = −a(u)h′(s) / (1 + a′(u)h′(s)y). (2.48)
Thus, along the characteristic x = s + a(h(s))y, we have ux and uy given by (2.47) and (2.48), respectively. Moreover, ux and uy become infinite at the positive time
y = −1/(a′(u)h′(s)), provided that a′(u)h′(s) < 0. (2.49)
If a′(u) > 0, then in order for solutions to exist for all positive times, expression (2.49) implies that h′(s) > 0. In other words, h(s) must be an increasing function. Otherwise, the solutions will experience a “blow-up.” For example, if a(u) = u, then condition (2.49) takes the form
y = min_s { −1/h′(s) },
and solutions will experience blow-up at and beyond the time y = −1/h′(s0), where h′(s0) is the minimum (most negative) value of h′(s), attained at some s0 with h′(s0) < 0; that is, h is decreasing there.
Example 2.15 Solve
uy + u ux = 0,
subject to the piecewise initial data sketched in Fig. 2.10, namely u(x, 0) = 1 for x ≤ 0, u(x, 0) = 1 − x for 0 < x ≤ 1, and u(x, 0) = 0 for x > 1. Along the characteristics,
x = zy + s, or s = x − zy. (2.52)
We try to piece the solution together since our initial data is given piecewise.
1. For s ≤ 0, u = z = 1. Moreover, we have from (2.52) that s = x − y. Since s ≤ 0, it follows that x − y ≤ 0. We conclude that
u = 1, for x ≤ y.
FIGURE 2.10
Initial wave profile.
2. For 0 < s ≤ 1, u = z = 1 − s and, from (2.52), s = x − (1 − s)y, so that s = (x − y)/(1 − y) and u = 1 − s = (1 − x)/(1 − y). As for the domain, we have 0 < s ≤ 1, which implies that 0 < (x − y)/(1 − y) ≤ 1. Rearranging the terms we arrive at y < x ≤ 1. We conclude that
u = (1 − x)/(1 − y), for y < x ≤ 1.
3. For s > 1, u = z = 0 and s = x > 1.
Finally, the solution is
u(x, y) =
  1,                 x ≤ y
  (1 − x)/(1 − y),   y < x ≤ 1        (2.53)
  0,                 x > 1.
□
The obtained solution in (2.53) is valid for 0 ≤ y < 1 and is discontinuous at y = 1. The characteristics run into each other in the wedged region where y > 1 (see Fig. 2.11). Next, our goal is to extend the solution to y ≥ 1. To do so, we introduce a curve starting at the discontinuity point (1, 1) and construct such a curve (the shock path) as shown in the next section.
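Before moving on, note that the middle branch of (2.53) can be checked directly against the equation of Example 2.15. The SymPy sketch below (our own illustration) does so.

import sympy as sp

x, y = sp.symbols('x y')
u = (1 - x)/(1 - y)
print(sp.simplify(sp.diff(u, y) + u*sp.diff(u, x)))   # -> 0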
Example 2.16 Find the blow-up time for
FIGURE 2.11
Characteristic lines running into each other in the nonlinear case, unlike Fig. 2.6; u = 1 to the left of (1, 1), u = (1 − x)/(1 − y) below, and u = 0 to the right.
According to (2.49), the blow-up time is y = min{ −1/h′(ξ) }, with h(ξ) = 1/(1 + ξ2). Wherefore, we have h′(ξ) = −2ξ/(1 + ξ2)2. We need to find the minimum of the function g, where
g(ξ) = −1/h′(ξ) = (1 + ξ2)2/(2ξ).
After some calculations, we find that
g′(ξ) = (3ξ4 + 2ξ2 − 1)/(2ξ2).
Setting g′(ξ) = 0, it follows that the only feasible solution (positive and real time) is ξ = 1/√3, which minimizes the function g. Thus, the blow-up time is
y = g(1/√3) = 8√3/9.
□
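The blow-up time of Example 2.16 can also be cross-checked numerically. The sketch below (NumPy, our own illustration) simply minimizes g over a fine grid rather than solving g′(ξ) = 0 exactly.

import numpy as np

xi = np.linspace(1e-3, 5, 200000)
g = (1 + xi**2)**2 / (2*xi)            # g(xi) = -1/h'(xi)
print(xi[np.argmin(g)], g.min())       # approx 0.5774 = 1/sqrt(3) and 1.5396
print(8*np.sqrt(3)/9)                  # 1.5396..., in agreement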
FIGURE 2.12
Shock path x = γ(t).
But,
∫_a^b (F(u))_x dx = F(u(b,t)) − F(u(a,t)),
and so we have
(d/dt) ∫_a^b u(x,t) dx = F(u(a,t)) − F(u(b,t)). (2.55)
If u is the amount of a quantity per unit length, then the left of (2.55) is the time rate
of change of the total amount of the quantity inside the interval [a, b]. If F u(x,t) is
the flux through x, that is, the amount of the quantity per unit time positively flowing
across x, then (2.55) implies that the rate of the quantity in [a, b] equals the flux in at
x = a minus the flux out through x = b.
As depicted in Fig. 2.12, let x = γ(t) be a smooth curve across which u is discontin-
uous. Assume u is smooth on each side of the curve γ. Let u0 and u1 denote the right
and left limits of u at γ(t), respectively. That is,
As
a → γ−(t) and b → γ+(t),
we have
∫_a^{γ−(t)} ut(x,t) dx → 0 and ∫_{γ+(t)}^{b} ut(x,t) dx → 0.
As a consequence, expression (2.56) reduces to
F(u1) − F(u0) = (u1 − u0) dγ/dt. (2.57)
Adopting the notation
[F(u)] = F(u1) − F(u0) and [u] = u1 − u0,
equation (2.57) takes the form
dγ/dt = [F(u)]/[u]. (2.58)
Using (2.58), the weak solution that evolves will be a piecewise smooth function
with a discontinuity or shock wave, that propagates with shock speed.
Example 2.17 Consider the problem of Example 2.15. By setting t = y, then ut +
uux = 0 is equivalent to
ut + (u2/2)_x = 0, x ∈ R, t > 0.
It follows that
F(u) = u2/2.
We know from Fig. 2.13 that the shock occurs at and beyond (1, 1). Also from Fig.
2.13, to the left of the shock we have u1 = 1, and to the right of the shock, u0 = 0.
So, [u] = u1 − u0 = 1 and
[F(u)] = F(u1) − F(u0) = (1/2)u1^2 − (1/2)u0^2 = 1/2.
Thus the path of the shock, γ, has the slope
dγ/dt = 1/2.
Therefore, γ is of the form 2x = t + c. Since this path passes through (1, 1), we see that c = 1. It follows that the shock line is given by
x(t) = t/2 + 1/2.
Therefore, for t ≥ 1, the solution is given by
u(x,t) =
  1,  x < t/2 + 1/2        (2.59)
  0,  x > t/2 + 1/2.
□
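The shock speed found above follows from one line of arithmetic; the tiny sketch below (plain Python, our own illustration) evaluates the Rankine–Hugoniot quotient (2.58) for this example.

F = lambda u: 0.5*u**2          # flux F(u) = u^2/2
u1, u0 = 1.0, 0.0               # left and right states across the shock
shock_speed = (F(u1) - F(u0)) / (u1 - u0)
print(shock_speed)              # 0.5, so the shock path is x(t) = t/2 + 1/2 through (1, 1)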
FIGURE 2.13
Characteristic lines intersecting the shock line t = 2x − 1; u = 1 to the left of the shock, u = (1 − x)/(1 − y) below, and u = 0 to the right.
2.4.2 Exercises
Exercise 2.31 Find the breaking time for
ut + uux = 0, u(x, 0) = e^{−2x2}, x ∈ R, t ≥ 0.
Exercise 2.32 Consider the PDE
ut + u2 ux = 0, u(x, 0) = 1/(x4 + 1), x ∈ R, t ≥ 0.
(a) Find and graph the characteristics.
(b) Determine the breaking time and find the shock line.
(c) Find the solution before the breaking time.
Exercise 2.33 Find the breaking time for
ut + u3 ux = 0, u(x, 0) = x1/3 , x ∈ R, t ≥ 0
and then find the solution.
Exercise 2.34 Solve and find the shock line of the traffic flow problem and explain
the physical meaning of the solution
ut + (1 − 2u)ux = 0,
subject to
1/2, x < 0
u(x, 0) =
1, x > 0.
Exercise 2.35 Solve and find the shock line
0, x≤0
ut + uux = 0, u(x, 0) = x, 0 ≤ x ≤ 2
2, x > 2.
A(x, y)uxx +2B(x, y)uxy +C(x, y)uyy +D(x, y)ux + E(x, y)uy + F(x, y)u=G(x, y),
(2.60)
where the function u and the coefficients are twice continuously differentiable in
some domain Ω ⊂ R2 . We shall consider (2.60) along with the Cauchy conditions
imposed on some curve Γ that is defined by y = f (x). Imposing Cauchy conditions
implies that ux and uy are known on Γ. That is,
ux(x, y(x)) = f (x), uy(x, y(x)) = g(x),
where f and g are known functions. A differentiation of these relations with respect to x gives
uxx(x, y(x)) + uxy(x, y(x)) dy/dx = fx(x) (2.61)
uxy(x, y(x)) + uyy(x, y(x)) dy/dx = gx(x). (2.62)
In addition, along the curve Γ, equation (2.60) takes the form
Auxx (x, y(x)) + 2Buxy (x, y(x)) +Cuyy (x, y(x)) = H, (2.63)
where H is a known function in x. Equations (2.61)–(2.63) determine uxx , uyy , and
uxy uniquely unless
△ = det[ A  2B  C ;  0  1  dy/dx ;  1  dy/dx  0 ] = −A(dy/dx)2 + 2B(dy/dx) − C = 0.
Or,
A(dy/dx)2 − 2B(dy/dx) + C = 0. (2.64)
The above equation is quadratic in dy/dx, with solutions
dy/dx = (B ± √(B2 − AC))/A. (2.65)
When B2 − AC > 0, there exist two families of curves on which no solution can be found when Cauchy conditions are imposed on them. These families of curves are known as the characteristics. On the other hand, there are no real characteristics when B2 − AC < 0, and one family of characteristics exists when B2 − AC = 0. We call the initial curve Γ characteristic with respect to (2.60) and the Cauchy conditions if △ = 0 along Γ, and noncharacteristic if △ ≠ 0 along Γ. When Γ is noncharacteristic, the Cauchy data uniquely determine the solution. However, in the case of a characteristic initial curve Γ, equations (2.61)–(2.63) are inconsistent unless more data is offered. Thus, when the Cauchy data are prescribed along a characteristic initial curve Γ, the PDE (2.60) in general has no solution.
Definition 2.4 The PDE in (2.60) has the following classifications: it is hyperbolic
if B2 − AC > 0, parabolic if B2 − AC = 0, and elliptic if B2 − AC < 0.
We conclude from Definition 2.4, that the classification of the PDE (2.60) depends
on the highest order terms. Our next task is to use transformations that will reduce
the complicated PDE (2.60) to a simpler one that we can easily solve using the
knowledge of the previous sections of this chapter. We introduce the transforma-
tions
ξ = ξ (x, y), η = η(x, y), (2.66)
where ξ and η are twice continuously differentiable and the Jacobian
J = det[ ξx  ξy ; ηx  ηy ] ≠ 0 (2.67)
in the region of interest. Then, x and y are uniquely determined from the system
(2.66). With this in mind, using the chain rule, we obtain
ux = uξ ξx + uη ηx
uy = uξ ξy + uη ηy,
and similarly for the second derivatives. Substituting into (2.60) yields an equation of the same form,
Â uξξ + 2B̂ uξη + Ĉ uηη + (lower-order terms) = Ĝ, (2.68)
where the new coefficients are known; we list the highest-order ones. That is,
Â = Aξx2 + 2Bξxξy + Cξy2,
B̂ = Aξxηx + B(ξxηy + ξyηx) + Cξyηy,
and
Ĉ = Aηx2 + 2Bηxηy + Cηy2.
Equation (2.68) is called the canonical form of (2.60). It can be easily shown that
B̂2 − ÂĈ = J2(B2 − AC),
which shows that the classification of the PDE (2.60) is preserved under the transformation (2.66). In the next discussion we explain how to find the transformations ξ and η.
Suppose none of A, B, C is zero. Assume that under the transformations (2.66) Â and
Ĉ vanish. Let’s consider  = 0. Then it follows that
dξ = ξx dx + ξy dy = 0,
or
dy/dx = −ξx/ξy. (2.70)
y = x + c1 , y = −3x + c2 .
ux = 3uξ − uη
uy = uξ + uη
uxx = 9uξ ξ − 6uηξ + uηη
uxy = 3uξ ξ + 2uξ η − uηη
uyy = uξ ξ + 2uηξ + uηη .
Thus,
uxx − 2uxy − 3uyy = 9uξξ − 6uξη + uηη − 2(3uξξ + 2uξη − uηη) − 3(uξξ + 2uξη + uηη)
                  = −16uξη = 0.
Thus, under the transformation ξ = y+3x, η = y−x, the original PDE is transformed
to the canonical form
uξ η = 0,
which has the solution
u(ξ , η) = F(ξ ) + G(η),
for some functions F and G. In terms of x and y the general solution is
u(x, y) = F(y + 3x) + G(y − x).
Applying the Cauchy data, we obtain
1/3 = F ′ (3x) + G′ (−x).
Integrate both sides and then multiply the resulting equation with 3, to get
F(3x) = (3x2 + x)/4, G(−x) = (x2 − x)/4.
Let z = 3x; then F(z) = z2/12 + z/12, and we conclude that
F(y + 3x) = (y + 3x)2/12 + (y + 3x)/12.
Similarly, if we set w = −x, then G(w) = w2/4 + w/4 and, as a consequence,
G(y − x) = (y − x)2/4 + (y − x)/4.
Finally, the solution is
u(x, y) = (y + 3x)2/12 + (y + 3x)/12 + (y − x)2/4 + (y − x)/4.
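The assembled solution can be checked symbolically. In the SymPy sketch below (our own illustration) the Cauchy data u(x, 0) = x2 and uy(x, 0) = 1/3 are the conditions inferred from the worked steps above.

import sympy as sp

x, y = sp.symbols('x y')
u = (y + 3*x)**2/12 + (y + 3*x)/12 + (y - x)**2/4 + (y - x)/4
pde = sp.diff(u, x, 2) - 2*sp.diff(u, x, y) - 3*sp.diff(u, y, 2)
print(sp.simplify(pde))                        # -> 0
print(sp.expand(u.subs(y, 0)))                 # -> x**2
print(sp.simplify(sp.diff(u, y).subs(y, 0)))   # -> 1/3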
2.5.1 Exercises
Exercise 2.40 Find the characteristics and reduce to canonical form and then solve.
Exercise 2.41 Find the characteristics and reduce to canonical form and then solve
Exercise 2.42 Show that in the hyperbolic case when B2 − AC > 0, we have  = Ĉ =
0, and B̂ ̸= 0.
Exercise 2.43 Show that in the elliptic case when B2 − AC < 0, we have  ̸= 0, Ĉ ̸=
0, and B̂ = 0.
Exercise 2.44 Solve
uxx − 4uxy + 4uyy = 0,
subject to the Cauchy conditions
Exercise 2.46 Find the characteristics and reduce to canonical form and then find
the general solution
4uxx + 5uxy + uyy + ux + uy = 3.
Exercise 2.47 Find the characteristics and reduce to canonical form and then find
the general solution
Exercise 2.48 Find the characteristics and reduce to canonical form and then solve
v = ue−(aξ +bη)
u(x, 1) = x, uy (x, 1) = 6.
u(x, 0) = ex , uy (x, 0) = 5.
string is positioned in such a way that its left endpoint coincides with the origin of
the xu coordinate system. Consider the motion of a small portion of the string sitting
atop the interval [a, b]. Then the corresponding mass is ρ(b − a), and acceleration
utt . Using Newton’s second law of motion, we have
ρ(b − a) utt = Total force. (2.73)
Since the mass of the string is negligible, we may discard the effect of gravity on the
string. In addition, we may as well ignore air resistance, and other external forces.
Thus, the only force that is acting on the string is the tension force T(x,t). Assuming
that the string is perfectly flexible, the tension force will have the direction of the
tangent vector along the string. At a fixed time t the position of the string is given by
the parametric equations, x = x, u = u(x,t), where x is a parameter. Then, the tangent
vector is <1, ux>, with corresponding unit vector <1/√(1 + ux2), ux/√(1 + ux2)>. Under this setup the tension force takes the form
T(x,t) = < T(x,t)/√(1 + ux2), T(x,t)ux/√(1 + ux2) >, (2.74)
where T (x,t) is the magnitude of the tension force. Due to the assumption of a small
vibration, it is safe to assume that ux is small, and thus, via Taylor’s expansion we
have
√(1 + ux2) = 1 + (1/2)ux2 + o(ux4) ≈ 1.
2
Substituting this approximation into (2.74), we arrive at an equivalent form of the
tension force
T(x,t) =< T (x,t), T (x,t)ux > .
Since there is no longitudinal displacement, we arrive at the following identities for
the balances of forces (2.73) in the x, respectively u directions
0 = T (b,t) − T (a,t)
ρ(b − a)utt = T (b,t)ux (b,t) − T (a,t)ux (a,t).
Simply stated, the first equation shows that the tensions from the two edges of the
little portion of the string balance each other out in the x direction (no longitudinal
motion). From this, we can also infer that the position of the string has no impact on
the tension force. Hence, the second equation might be rewritten as
ρ utt = T (ux(b,t) − ux(a,t))/(b − a).
Taking the limit in the above equation we arrive at the wave equation
ρ utt = lim_{b→a} T (ux(b,t) − ux(a,t))/(b − a) = T uxx,
or
utt − c2 uxx = 0,
with c2 = T/ρ.
reflects air resistance as a force proportional to the speed ut . On the other hand, the
wave equation
utt − c2 uxx + ku = 0, k > 0,
incorporates transverse elastic force that is proportional to the displacement u. Fi-
nally, the wave equation
c1 = x − ct, c2 = x + ct.
Let
ξ = x + ct, η = x − ct.
Then,
uxx = uξ ξ + 2uξ η + uηη ,
and
utt = c2 (uξ ξ − 2uξ η + uηη ).
Substituting into (2.75) we arrive at the canonical form −4c2 uξ η = 0. Since c ̸= 0,
we must have
uξ η = 0,
where the functions F and G are arbitrary and required to be twice differentiable. In terms of t and x, the solution takes the form
u(x,t) = F(x + ct) + G(x − ct). (2.78)
Applying the initial conditions u(x, 0) = f (x) and ut(x, 0) = g(x) gives
f (x) = F(x) + G(x), (2.79)
and
g(x) = cF′(x) − cG′(x). (2.80)
Integrating (2.80) from x0 to x we arrive at
F(x) − G(x) = (1/c) ∫_{x0}^{x} g(s) ds + K, (2.81)
where x0 ∈ R and K are constants. Solving for F and G from (2.79) and (2.81) yields
F(x) = (1/2)[ f (x) + (1/c) ∫_{x0}^{x} g(s) ds + K ],
and
G(x) = (1/2)[ f (x) − (1/c) ∫_{x0}^{x} g(s) ds − K ].
Then by (2.78) the general solution takes the form
u(x,t) = (1/2)[ f (x + ct) + (1/c) ∫_{x0}^{x+ct} g(s) ds + K ] + (1/2)[ f (x − ct) − (1/c) ∫_{x0}^{x−ct} g(s) ds − K ]
       = (1/2)[ f (x + ct) + f (x − ct) ] + (1/(2c)) [ ∫_{x0}^{x+ct} g(s) ds − ∫_{x0}^{x−ct} g(s) ds ]
       = (1/2)[ f (x + ct) + f (x − ct) ] + (1/(2c)) ∫_{x−ct}^{x+ct} g(s) ds. (2.82)
It is simple to verify u(x,t) given by (2.82) is a solution of the wave equation (2.75).
Moreover, by a direct substitution into the solution (2.82), it is evident that the initial
FIGURE 2.14
Nonhomogeneous wave equation; the point (x0, t0) and its domain of dependence.
conditions uniquely determine (2.82). According to (2.82), the value u(x0, t0) depends on the initial data f and g in the interval [x0 − ct0, x0 + ct0], which is cut out of the initial line by the two characteristic lines with slopes ±1/c passing through the point (x0, t0). The interval [x0 − ct0, x0 + ct0] on the line t = 0 is called the domain of dependence, as indicated in Fig. 2.14.
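D'Alembert's formula (2.82) is easy to evaluate numerically. The following minimal sketch (Python with SciPy; the test data f = sin, g = 0 and the chosen c are our own illustrative choices) implements the formula and compares it with the known solution sin(x)cos(ct).

import numpy as np
from scipy.integrate import quad

def dalembert(f, g, c, x, t):
    """u(x,t) = [f(x+ct)+f(x-ct)]/2 + (1/(2c)) * integral of g over [x-ct, x+ct]."""
    avg = 0.5*(f(x + c*t) + f(x - c*t))
    integral, _ = quad(g, x - c*t, x + c*t)
    return avg + integral/(2*c)

print(dalembert(np.sin, lambda s: 0.0, 2.0, x=1.0, t=0.7))
print(np.sin(1.0)*np.cos(2.0*0.7))     # the two values agree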
The next theorem is about stability; it says that a small change in the initial data only produces a small change in the solution.
Theorem 2.3 Let u*(x,t) be another solution of (2.75)–(2.77) with initial data f* and g*. Define |h| = max_{−∞<x<∞} |h(x)| for continuous h : R → R. Similarly, for u = u(x,t), we define
|u|_T = max_{−∞<x<∞, |t|≤T} |u(x,t)|.
Assume there is a small change in the initial data over a finite time T; that is, for small and positive ε,
|f − f*| < ε/2, |g − g*| < ε/(2T).
Then,
|u(x,t) − u*(x,t)| < ε.
FIGURE 2.15
Domain of influence of the point (x0, 0).
+ (1/(2c)) | ∫_{x−ct}^{x+ct} (g(s) − g*(s)) ds |
≤ (1/2)(ε/2 + ε/2) + (1/(2c)) ∫_{x−ct}^{x+ct} ε/(2T) ds
≤ ε/2 + (1/(2c))(2cT)(ε/(2T)) = ε.
Thus, for |t| ≤ T, we have shown that
|u − u∗ |T < ε.
□
Now that we displayed an example, let us take a closer look at the geometrical in-
terpretation of (2.78). The term on the right-hand side of (2.78) is called the pro-
gressive wave. If we let x∗ = ct, then the transformation ξ = x + ct = x + x∗ is a
translation of the coordinate system to the left by x∗ . Thus, F(x + ct) is a wave that
moves in the negative x direction with speed c without change in its shape. For ex-
ample, u(x,t) = cos(x + ct) represents a cosine wave which moves in the negative
x-direction with speed c without changing its shape. Similarly, F(x − ct) is a wave
which moves in the positive x-direction with speed c without change in its shape.
Consequently, the solution
u(x,t) = F(x + ct) + G(x − ct)
is the sum of two waves traveling in opposite directions, and the shape of u(x,t) will
change with time.
Example 2.20 Consider the wave problem with zero initial velocity
utt − c2 uxx = 0,
subject to
ut (x, 0) = 0,
h, |x| ≤ a
u(x, 0) =
0, |x| > a.
This initial data corresponds to an initial disturbance of the string centered at x = 0
of height h. The solution is given by (2.82) with g(x) = 0. In other words,
1
u(x,t) = [ f (x + ct) + f (x − ct)].
2
We need to piece together the solution. Notice that
h, |x + ct| ≤ a
f (x + ct) =
0, |x + ct| > a
and
h, |x − ct| ≤ a
f (x − ct) =
0, |x − ct| > a.
As a consequence the solution is defined piecewise over four different regions. We
will only consider all regions for t ≥ 0. It is clear from the definitions of f (x + ct)
and f (x − ct) that the four regions are:
I = {|x + ct| ≤ a, |x − ct| ≤ a},
II = {|x + ct| ≤ a, |x − ct| > a},
III = {|x + ct| > a, |x − ct| ≤ a},
IV = {|x + ct| > a, |x − ct| > a},
with
uI(x,t) = h, uII(x,t) = h/2, uIII(x,t) = h/2, uIV(x,t) = 0.
The notation uI (x,t) stands for the value of u in region I, and so on. See Fig. 2.16. □
FIGURE 2.16
Different values of u in the regions I–IV of the xt-plane: u = h in I, u = h/2 in II and III, u = 0 in IV; the initial disturbance occupies −a ≤ x ≤ a.
Example 2.21 Consider the wave problem with zero initial displacement
utt − c2 uxx = 0,
subject to
u(x, 0) = 0,
g0 , |x| ≤ a
ut (x, 0) =
0, |x| > a.
This is similar to the previous example but we will have to adjust the interval of
integrations. However, here we have six different regions that we list
FIGURE 2.17
Nonhomogeneous wave equation; the characteristic triangle △(x,t) with vertices (x − ct, 0), (x + ct, 0), and (x, t), bounded by τ = x + c(t − s) and τ = x − c(t − s).
with
u(x,t) =
  0,                                                              in I
  (1/(2c)) ∫_{−a}^{x+ct} g0 dx = (g0/(2c))(x + ct + a),           in II
  (1/(2c)) ∫_{−a}^{a} g0 dx = g0 a/c,                             in III
  (1/(2c)) ∫_{x−ct}^{x+ct} g0 dx = g0 t,                          in IV
  (1/(2c)) ∫_{x−ct}^{a} g0 dx = (g0/(2c))(−x + ct + a),           in V
  0,                                                              in VI.
□
Next we consider the nonhomogeneous wave equation
utt − c2 uxx = h(x,t) (2.83)
subject to
u(x, 0) = f (x), ut (x, 0) = g(x), (2.84)
where the function h is assumed to be continuous with respect to both arguments. We
show that the solution of (2.83) along with the initial data (2.84) is given by
u(x,t) = (1/2)[ f (x + ct) + f (x − ct) ] + (1/(2c)) ∫_{x−ct}^{x+ct} g(s) ds
       + (1/(2c)) ∫∫_{△(x,t)} h(τ, s) dτ ds, (2.85)
where △(x,t) is shown in Fig. 2.17. We will do this by piecing together the solution
of the homogeneous problem (2.75)–(2.77), which has the solution given by (2.82),
and the solution u p of
u^p_tt − c2 u^p_xx = h(x,t) (2.86)
subject to
u p (x, 0) = 0, utp (x, 0) = 0. (2.87)
We already have the transformation
ξ = x + ct, η = x − ct.
so that
x = (ξ + η)/2, t = (ξ − η)/(2c). (2.88)
Under the same transformation, we saw that the left side of (2.86) becomes −4c2 u^p_ξη, so that
u^p_ξη = −(1/(4c2)) h(ξ, η). (2.89)
On the line t = 0 we have ξ = η, and the first condition of (2.87) gives u^p(ξ, ξ) = 0; differentiating with respect to ξ yields
u^p_ξ(ξ, ξ) + u^p_η(ξ, ξ) = 0.
Similarly, utp = uξp ξt + uηp ηt = cuξp − cuηp . Thus, the second boundary condition of
(2.87) reduces to
cuξp (ξ , ξ ) − cuηp (ξ , ξ ) = 0.
From the last two equations above, it is immediate that uξp (ξ , ξ ) = uηp (ξ , ξ ) = 0. Fix a
point (x0 ,t0 ). Then the corresponding point in the characteristic variables is (ξ0 , η0 ).
In order to find the value of the solution at this point we begin by integrating (2.89)
in term of η from ξ to η0 and obtain
∫_ξ^{η0} u^p_ξη dη = −(1/(4c2)) ∫_ξ^{η0} h(ξ, η) dη.
However,
∫_ξ^{η0} u^p_ξη dη = u^p_ξ(ξ, η0) − u^p_ξ(ξ, ξ) = u^p_ξ(ξ, η0).
As a result, we have
u^p_ξ(ξ, η0) = −(1/(4c2)) ∫_ξ^{η0} h(ξ, η) dη = (1/(4c2)) ∫_{η0}^{ξ} h(ξ, η) dη. (2.90)
Integrating (2.90) with respect to ξ from η0 to ξ0 and then using (2.91), we arrive at
u^p(ξ0, η0) = (1/(4c2)) ∫_{η0}^{ξ0} ∫_{η0}^{ξ} h(ξ, η) dη dξ = (1/(4c2)) ∫∫_△ h(ξ, η) dξ dη, (2.92)
where the double integral is taken over the triangle of dependence of the point (x0, t0), as shown in Fig. 2.14. It is left to transform the double integral in (2.92) into a double integral in terms of the variables (x,t). For ξ = x + ct, η = x − ct, we have
J = det[ ξx  ξt ; ηx  ηt ] = det[ 1  c ; 1  −c ] = −2c ≠ 0.
Thus,
u^p(ξ, η) = (1/(4c2)) ∫∫_{△(x,t)} h(τ, s)|J| dτ ds = (1/(2c)) ∫∫_{△(x,t)} h(τ, s) dτ ds, (2.93)
where △(x,t) is shown in Fig. 2.17. Finally, adding (2.93) to (2.82), we obtain (2.85).
For illustrational purpose, we provide the following two examples.
Example 2.22 Consider
4utt − 9uxx = 4xt
subject to u(x, 0) = x2, ut(x, 0) = sin(x). Here u(x,t) is given by D'Alembert's solution (2.85) with f = x2, g = sin(x), h(x,t) = xt, and c2 = 9/4. With the aid of Example 2.19 we have
u(x,t) = x2 + (9/4)t2 + (2/3) sin(x) sin((3/2)t) + (1/(2c)) ∫_0^t ∫_{x−(3/2)(t−s)}^{x+(3/2)(t−s)} τ s dτ ds
       = x2 + (9/4)t2 + (2/3) sin(x) sin((3/2)t) + x ∫_0^t s(t − s) ds
       = x2 + (9/4)t2 + (2/3) sin(x) sin((3/2)t) + x t3/6.
□
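Since the forcing term in Example 2.22 requires a double integral, a symbolic check of the closed form is worthwhile. The SymPy sketch below (our own illustration) verifies the corrected answer against the PDE and both initial conditions.

import sympy as sp

x, t = sp.symbols('x t')
u = x**2 + sp.Rational(9, 4)*t**2 + sp.Rational(2, 3)*sp.sin(x)*sp.sin(sp.Rational(3, 2)*t) + x*t**3/6

print(sp.simplify(4*sp.diff(u, t, 2) - 9*sp.diff(u, x, 2) - 4*x*t))   # -> 0
print(u.subs(t, 0))                                                   # -> x**2
print(sp.diff(u, t).subs(t, 0))                                       # -> sin(x)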
Example 2.23 Consider
utt − 9uxx = 12/(t2 + 1)
subject to u(x, 0) = x, ut (x, 0) = e−x .
FIGURE 2.18
△(x,t): the characteristic triangle with vertices (x − (3/2)t, 0), (x + (3/2)t, 0), and (x, t), bounded by τ = x + (3/2)(t − s) and τ = x − (3/2)(t − s).
2.6.1 Exercises
Exercise 2.53 Solve
utt − 4uxx = 0, u(x, 0) = sin(x), ut (x, 0) = cos(x).
FIGURE 2.19
Region for existence and uniqueness, bounded by the characteristic lines x = −3t + 4 and x = 3t + 4 through (4, 0), with vertices (2, 2/3) and (2, −2/3).
Exercise 2.54 At points in space where no sources are present the spherical wave
equation satisfies
utt = c2 ( urr + (2/r) ur ). (2.94)
Equation (2.94) is obtained by writing the homogeneous wave equation in spher-
ical coordinates r, θ , φ and neglecting the angular dependence. Assume the initial
functions
u(r, 0) = f (r), ut (r, 0) = g(r).
Make the change of variables v = ru to transform (2.94) into the equation in v:
vtt = c2 vrr .
Solve for v and then find the general solution of (2.94) subject to the initial data.
Exercise 2.55 Solve
subject to
u(x, 0) = f (x), ut (x, 0) = g(x), x ≥ 0,
u(0,t) = 0, t ≥ 0.
Show that its solution is given by
u(x,t) = (1/2)[ f (x + ct) + f (x − ct) ] + (1/(2c)) ∫_{x−ct}^{x+ct} g(s) ds, for x > ct,
u(x,t) = (1/2)[ f (x + ct) − f (ct − x) ] + (1/(2c)) ∫_{ct−x}^{x+ct} g(s) ds, for x < ct.
Exercise 2.58 Find the solution for
subject to
u(x, 0) = | sin(x)|, x>0
ut (x, 0) = 0, x ≥ 0,
u(0,t) = 0, t ≥ 0.
Exercise 2.59 Construct the solution for the wave problem with zero initial velocity
ξ = x + ct, η = x − ct.
where the functions F and G are arbitrary and differentiable. Equation (2.96) is valid over the domain
0 ≤ x + ct ≤ l and 0 ≤ x − ct ≤ l.
Moreover, the solution is uniquely determined by the initial data in the region
t ≤ x/c, t ≤ (l − x)/c, t ≥ 0.
For the fixed end u(0,t) = 0, t ≥ 0, we have
0 = u(0,t) = F(ct) + G(−ct), so that, setting ζ = −ct, G(ζ) = −F(−ζ). (2.98)
do the same for F. If we apply the initial data to (2.96), then it was obtained from
Section 2.6 that
F(ξ) = (1/2)[ f (ξ) + (1/c) ∫_0^ξ g(s) ds + K ], 0 ≤ ξ = x + ct ≤ l, (2.99)
G(η) = (1/2)[ f (η) − (1/c) ∫_0^η g(s) ds − K ], 0 ≤ η = x − ct ≤ l. (2.100)
Using (2.98) in combination with (2.99) and (2.100), that is, setting G(ζ) = −F(−ζ), we arrive at
(1/2)[ f (ζ) − (1/c) ∫_0^ζ g(s) ds ] = −(1/2)[ f (−ζ) + (1/c) ∫_0^{−ζ} g(s) ds ]
                                     = −(1/2)[ f (−ζ) − (1/c) ∫_0^{ζ} g(−s) ds ].
By comparing both sides of the above expression, we immediately see that (2.98) is
satisfied when
f (ζ ) = − f (−ζ ), and g(ζ ) = −g(−ζ ).
In other words, we must extend the functions f and g to be odd functions with respect
to x = 0.
Now we turn our attention to the boundary condition 0 = u(l,t). As before, we
have
0 = u(l,t) = F(l + ct) + G(l − ct).
Letting ζ = l + ct in the above equation we get
This equation extends the range of F to positive values l ≤ ζ ≤ 2l. As before, setting
F(ζ ) = −G(2l − ζ ), we arrive at
(1/2)[ f (ζ) + (1/c) ∫_0^ζ g(s) ds ] = −(1/2)[ f (2l − ζ) − (1/c) ∫_0^{2l−ζ} g(s) ds ]
                                     = −(1/2)[ f (2l − ζ) + (1/c) ∫_{2l}^{ζ} g(2l − τ) dτ ].
By comparing both sides of the above expression, we immediately arrive at
f (2l − ζ) = −f (ζ), and ∫_0^ζ g(s) ds = −∫_{2l}^{ζ} g(2l − τ) dτ.
Differentiating the last relation with respect to ζ gives
g(2l − ζ) = −g(ζ).
A similar situation occurs for the function g. Thus, f and g need to be periodic odd extensions of the original functions with period 2l. Now we try to piece the solution together.
Let f_p and g_p denote the 2l-periodic odd extensions of f and g, respectively. Then, on (−l, l),
f_p(x) = f (x) for 0 < x < l and f_p(x) = −f (−x) for −l < x < 0;
g_p(x) = g(x) for 0 < x < l and g_p(x) = −g(−x) for −l < x < 0.
Consider the wave problem on the whole real line with the extended initial data
vtt − c2 vxx = 0, −∞ < x < ∞, t > 0
v(x, 0) = f_p(x), −∞ < x < ∞
vt(x, 0) = g_p(x), −∞ < x < ∞.
With this setup we automatically have v(0,t) = v(l,t) = 0, and the restriction
u(x,t) = v(x,t), 0 ≤ x ≤ l,
is the solution of the original problem.
Next, we discuss periodic odd extension, and for more on the subject we refer to Ap-
pendix A. Suppose we have a function f that is piecewise continuous on the interval
(0, l). We define the Fourier sine series of f by
f (x) = ∑_{n=1}^{∞} bn sin(nπx/l), 0 < x < l, (2.105)
2.6.3 Exercises
Exercise 2.64 Consider
utt = uxx , 0 < x < π, t > 0
u(x, 0) = x3 , 0≤x≤π
ut (x, 0) = 0, 0≤x≤π
u(0,t) = 0 = u(π,t), t ≥ 0.
The above expression is true since a and b are constants and u is continuous. Recall
that the thin rod is insulated. The change in heat must be balanced by the heat flux
across the cross-section of the cylindrical piece around the interval [a, b], as the heat
cannot be gained or lost in the absence of an external heat source. Fourier's law states that the heat flux across the boundary is proportional to the negative of the temperature derivative in the direction of the boundary's outward normal, in this instance the x-derivative. The second way to compute the time rate of change of D is to notice that, in the absence of heat sources within the rod, the quantity of heat can change only through the flow of heat across the boundaries of the segment at x = a and x = b. The rate at which heat flows across a cross-section of the rod is called the heat flux through that section.
Let κ denote the thermal conductivity of the rod. Recall that the thermal conductivity
of a material is a measure of its ability to conduct heat. Then the heat flux into u at
x = a and x = b is
−κux (a,t), and κux (b,t),
respectively. Thus, the total time rate of change of D is the sum of the rates at the two
ends. Using the fundamental theorem of calculus, we may write
(d/dt) D(t) = κ [ ux(b,t) − ux(a,t) ] = κ ∫_a^b uxx dx, (2.109)
or
∫_a^b ( cρ ut − κ uxx ) dx = 0.
Since the above integral must vanish for every interval [a, b] with a < b, we must have
cρ ut − κ uxx = 0
throughout the material. Dividing by cρ and setting k = κ/(cρ), we arrive at the one-dimensional heat equation
ut − kuxx = 0, (2.110)
where k is called the thermal diffusivity of the material.
If we consider the heat equation in (2.110) on an interval I ⊂ R, then we have the heat problem, with initial and boundary conditions,
ut = kuxx, x ∈ I, t > 0
u(x, 0) = f (x), x ∈ I (2.111)
u satisfies certain BCs.
Here the rod is insulated along its lateral surface and lies parallel to the x-axis; the initial condition u(x, 0) = f (x) means that its initial temperature is f (x) for x ∈ I.
In practice, the most common boundary conditions are the following:
• u(0,t) = 0 = u(l,t) : I = (0, l), Dirichlet . It is the case when both faces of the
rod are kept at temperature zero.
• ux (0,t) = 0 = ux (l,t) : I = (0, l), Neumann . It is the case when both faces of
the rod are insulated.
• ux (0,t) − a0 u(0,t) = 0 and ux (l,t) + al u(l,t) = 0 : I = (0, l), Robin .
• u(−l,t) = u(l,t) = 0 and ux (−l,t) = ux (l,t) = 0 : I = (−l, l), Periodic .
Before we attempt to find the solution to the heat equation, we prove the uniqueness of the solution of the nonhomogeneous heat equation. We do so by defining an energy function V and showing that, along the solutions of the heat equation, the energy function is
nonnegative and its derivative is less than or equal to zero. We begin by considering the
nonhomogeneous heat equation with initial and boundary conditions
ut − kuxx = f (x,t), 0 ≤ x ≤ l, t > 0
u(x, 0) = φ (x), 0 ≤ x ≤ l, (2.112)
u(0,t) = g(t), u(l,t) = h(t),
Define the energy function
V[w](t) = (1/2) ∫_0^l [w(x,t)]2 dx;
then V[w](t) is positive for t > 0. To obtain any meaningful information from the energy function, we must show it is decreasing in time along the solutions of (2.113). Thus, using the first equation in (2.113) we arrive at
(d/dt) V[w](t) = ∫_0^l w(x,t) wt(x,t) dx = k ∫_0^l w(x,t) wxx(x,t) dx.
0 ≤ V[w](t) ≤ V[w](0) = 0.
Hence,
V[w](t) = (1/2) ∫_0^l [w(x,t)]2 dx = 0, for all t ≥ 0,
which implies w ≡ 0 for all x ∈ [0, l], t > 0. Wherefore, u − v = 0 for all x ∈ [0, l], t > 0. This shows
u = v for all x ∈ [0, l], t > 0.
This completes the proof.
This completes the proof.
Next we extend Theorem 2.4 to show uniqueness of solution for the heat equation on
R. Consider the heat equation
ut − kuxx = f (x,t),
−∞ ≤ x ≤ ∞, t > 0
u(x, 0) = φ (x), −∞ ≤ x ≤ ∞, (2.115)
lim u = 0, lim ux = 0,
t > 0,
x→±∞ x→±∞
Proof Assume (2.115) has two solutions u and v. Set w = u − v. Then by similar
arguments as in the proof of Theorem 2.4 w is a solution to the homogeneous heat
equation
wt − kwxx = 0, −∞ < x < ∞, t > 0
w(x, 0) = 0, −∞ < x < ∞, (2.116)
lim_{x→±∞} w = 0, lim_{x→±∞} wx = 0, t > 0.
0 ≤ V[w](t) ≤ V[w](0) = 0.
Hence,
V[w](t) = (1/2) ∫_{−∞}^{∞} [w(x,t)]2 dx = 0, for all t ≥ 0,
which implies w ≡ 0 for all x ∈ (−∞, ∞), t > 0. Consequently, u − v = 0 for all x ∈ (−∞, ∞), t > 0. This shows u = v for all x ∈ (−∞, ∞), t > 0, and the proof is complete.
The next theorem is about stability; it says that a small change in the initial data only produces a small change in the solution.
Theorem 2.6 Let u*(x,t) be another solution of (2.115) with initial data g*. Define the L2 norm of a function h as
||h||2 = ( ∫_{−∞}^{∞} h2(x) dx )^{1/2}.
Assume there is a small change in the initial data; that is, for small and positive ε we have ||g − g*||2 < ε. Then,
||u − u*||2 ≤ ε, for all t ≥ 0.
Proof The proof depends on the energy function. For simpler notation, we let w = u − u*. Define the energy function V for (2.118) by (2.117). Then, by Theorem 2.5, V is decreasing along the solutions of (2.118) with V[w](t) ≥ 0 for t ≥ 0. Thus
V[w](t) ≤ V[w](0) = (1/2) ∫_{−∞}^{∞} [w(x, 0)]2 dx.
Accordingly, we have
(1/2)||w||2^2 = V[w] ≤ (1/2)||w(x, 0)||2^2,
or
(1/2)||u − u*||2^2 ≤ (1/2)||g − g*||2^2,
which implies that
||u − u*||2 ≤ ε, t ≥ 0.
We have established stability for all t ≥ 0 in terms of the square error. This completes the proof.
t = c,
for some constant c. Remember, we should be able to trace any point in the xt plane
along the characteristic lines, which is not the case here since they are parallel to the
x-axis.
Our aim then is to find another approach to establishing a bounded solution of the
heat equation on an unbounded domain. We consider the heat problem with initial
condition
ut − kuxx = 0, −∞ < x < ∞, t > 0
(2.120)
u(x, 0) = φ (x), −∞ < x < ∞.
To arrive at the solution of (2.120), we begin by considering simple form of the initial
condition. In particular, we first derive the solution of the heat problem with initial
condition of the form
Vt − kVxx = 0, −∞ < x < ∞, t > 0
(2.121)
V (x, 0) = H(x),
Lemma 1 (The Gaussian integral)
∫_0^∞ e^{−x2} dx = √π/2.
Proof Let
I = ∫_0^∞ e^{−x2} dx.
Then, for y ∈ R,
I2 = I · I = ( ∫_0^∞ e^{−x2} dx )( ∫_0^∞ e^{−y2} dy ) = ∫_0^∞ ∫_0^∞ e^{−(x2+y2)} dx dy.
Changing to polar coordinates x = r cos(θ), y = r sin(θ), we obtain
I2 = ∫_0^{π/2} ∫_0^∞ e^{−r2} r dr dθ
   = ∫_0^{π/2} [ −(1/2) e^{−r2} ]_{r=0}^{r=∞} dθ
   = ∫_0^{π/2} (1/2) dθ = π/4.
Taking the square root on both sides we arrive at
I = ∫_0^∞ e^{−x2} dx = √π/2.
This completes the proof.
In the next lemma we explore the invariance properties of the heat equation.
Lemma 2 [Invariance properties of the heat equation] The heat equation (2.119) is
invariant under these transformations.
(a) If u(x,t) is a solution of (2.119), then so is u(x − z,t) for any fixed z. (Spatial
translation)
(b) If u(x,t) is a solution of (2.119), then so are ux , ut , uxx , and so on. (Differentia-
tion)
(c) If u1 , u2 , . . . , un are solutions of (2.119), then so is ∑ni=1 ci ui for any constants
c1 , c2 , . . . , cn . (Linear combinations)
(d) If S(x,t) solves (2.119), then so does
∫_{−∞}^{∞} S(x − y,t) g(y) dy
for any g for which the integral converges. (Convolution)
(e) If u(x,t) is a solution of (2.119), then so is v(x,t) = u(√a x, at) for any constant a > 0. (Dilation)
Proof The proof of parts (a)–(c) is straightforward and we refer to Exercise 2.71. To prove (d), we assume a finite interval [−b, b] partitioned by points {yi}_{i=1}^{n} such that −b = y1 < y2 < · · · < yn = b with equal spacing ∆y. Then, using (c) combined with representing the integral by its Riemann sum, we may write
∫_{−∞}^{∞} S(x − y,t)g(y) dy = lim_{b→∞} ∫_{−b}^{b} S(x − y,t)g(y) dy = lim_{b→∞} lim_{n→∞} ∑_{i=1}^{n} S(x − yi,t)g(yi) ∆y.
As for the proof of (e), we make use of the chain rule. Let v(x,t) = u(√a x, at). Then vt = a ut(√a x, at), vx = √a ux(√a x, at), and vxx = a uxx(√a x, at). Substituting into (2.119) we arrive at
a ut(√a x, at) − k a uxx(√a x, at) = 0,
or
ut(√a x, at) − k uxx(√a x, at) = 0.
This completes the proof.
which we will need to compute the constants in (2.123). Since (2.123) is only valid for t > 0, to check the initial condition in (2.121) we take the limit as t → 0+. Additionally, we observe that
lim_{t→0+} x/√t = ∞ for x > 0, and −∞ for x < 0.
Thus, for x > 0, we have that
1 = lim_{t→0+} V(x,t) = C ∫_0^∞ e^{−s2/(4k)} ds + D = C√(kπ) + D,
while for x < 0,
0 = lim_{t→0+} V(x,t) = −C√(kπ) + D.
Solving the two equations, we obtain
C = 1/√(4kπ), D = 1/2.
Plugging the constants into (2.123), we arrive at
V(x,t) = (1/√(4kπ)) ∫_0^{x/√t} e^{−s2/(4k)} ds + 1/2.
We try to put V in terms of the error function that we define below.
Definition 2.5 The error function is the following improper integral considered as a
real function er f : R → R, such that
erf(x) = (2/√π) ∫_0^x e^{−z2} dz,
Let z = s/√(4k). Then
(1/√(4kπ)) ∫_0^{x/√t} e^{−s2/(4k)} ds = (1/√π) ∫_0^{x/√(4kt)} e^{−z2} dz = (1/2) erf( x/√(4kt) ).
Hence the unique particular solution of (2.121) is given by
V(x,t) = (1/√π) ∫_0^{x/√(4kt)} e^{−z2} dz + 1/2, (2.124)
S(x,t) = (∂V/∂x)(x,t). (2.126)
By the invariance property (b), S(x,t) solves (2.120). Therefore, by the invariance property (d),
u(x,t) = ∫_{−∞}^{∞} S(x − y,t)φ(y) dy, for t > 0, (2.127)
solves (2.120). We must show u(x,t) given by (2.127) is the unique solution of
(2.120). This can be accomplished by showing it satisfies the initial condition. Uti-
lizing (2.126), we can write u as follows
u(x,t) = ∫_{−∞}^{∞} (∂V/∂x)(x − y,t)φ(y) dy = −∫_{−∞}^{∞} (∂/∂y)[V(x − y,t)]φ(y) dy.
We note that S(x − y,t) decays exponentially as y − x grows larger. For now, we
assume
φ (±∞) = 0,
so we may perform an integration by parts on the above integral. That is,
u(x,t) = −V(x − y,t)φ(y)|_{y=−∞}^{y=∞} + ∫_{−∞}^{∞} V(x − y,t)φ′(y) dy = ∫_{−∞}^{∞} V(x − y,t)φ′(y) dy.
Letting t → 0+ and using V(x − y, 0) = H(x − y), this becomes ∫_{−∞}^{x} φ′(y) dy = φ(x). Thus, u(x,t) does satisfy the initial condition of (2.120). It is left to compute S(x,t),
which can be easily done from (2.126) with V given by (2.124). That is, using (2.124)
we obtain
S(x,t) = (∂V/∂x)(x,t) = (1/√(4πkt)) e^{−x2/(4kt)}. (2.128)
Finally, substituting S given by (2.128) into (2.127), we have the explicit form of the solution of (2.120):
u(x,t) = (1/√(4πkt)) ∫_{−∞}^{∞} e^{−(x−y)2/(4kt)} φ(y) dy, for t > 0. (2.129)
The function S(x,t) is known as the heat kernel. Other resources may refer to it as
fundamental solution, source function, Green’s function, or propagator of the heat
equation.
We present the following example.
Example 2.27 Consider the heat problem
vt − 4vxx = 0, −∞ < x < ∞, t > 0
v(x, 0) = φ(x), −∞ < x < ∞,
where φ(x) = 1 + x for −1 < x ≤ 0, φ(x) = 1 − x for 0 < x < 1, and φ(x) = 0 otherwise. Here k = 4, so 4kt = 16t in (2.129). The integrals whose integrands contain e^{−s2} not multiplied by a term involving s can be written in terms of the error function. The rest can be integrated out and, at the end, we end up with the solution
v(x,t) = (1/2)(1 + x)[ erf( (1 + x)/√(16t) ) − erf( x/√(16t) ) ]
       + (1/2)(1 − x)[ erf( x/√(16t) ) − erf( (x − 1)/√(16t) ) ]
       + (2√t/√π)[ e^{−(x+1)2/(16t)} − e^{−x2/(16t)} ] + (2√t/√π)[ e^{−(x−1)2/(16t)} − e^{−x2/(16t)} ]. (2.130)
□
We end this section by looking into the solution of the nonhomogenous heat problem
with initial condition,
ut − kuxx = f (x,t), −∞ < x < ∞, t > 0
(2.131)
u(x, 0) = φ (x), −∞ < x < ∞,
for given functions f and φ. The derivation of the solution of (2.131) depends on Duhamel's principle, and we ask the reader to consult [1]. The next theorem is stated without proof.
Theorem 2.7 The heat equation given by (2.131) has the solution
u(x,t) = ∫_{−∞}^{∞} S(x − y,t)φ(y) dy + ∫_0^t ∫_{−∞}^{∞} S(x − y,t − τ) f (y, τ) dy dτ, for t > 0, (2.132)
where S(x,t) is given by (2.128).
Our aim is to find a solution of (2.133) as we did for the heat problem over the entire
real line. There will be no need to start from scratch, but instead we will reintroduce
the problem over the entire real line by extending the initial data to the whole line.
Whatever method we use to extend to the negative half-line, we should make sure
that the boundary condition is automatically satisfied by the solution of the problem
on the whole line that arises from the extended data. For heat problems with Dirichlet
condition, one would choose the odd extension of the initial data φ (x). If ψ(x) is odd,
then ψ(x) = −ψ(−x), from which we get 2ψ(0) = 0, or ψ(0) = 0. This is true for
any odd function. We make it formal in the next lemma.
Lemma 3 Let f : (−∞, ∞) → R be an odd function ( f (x) = − f (−x)), that is con-
tinuous at x = 0 then f (0) = 0.
Indeed, f (0) = lim_{x→0+} f (x) = −lim_{x→0+} f (−x) = −lim_{x→0−} f (x) = −f (0), so that 2 f (0) = 0 and f (0) = 0.
The next lemma assures us that if the initial data is odd, then the solution of the heat
equation over the real line is also odd.
Lemma 4 Let u(x,t) be the solution of the heat equation on −∞ < x < ∞. If the
initial data φ (x) = u(x, 0) is odd, then for all t ≥ 0, u(x,t) is an odd function of x.
Thus, by Lemmas 3 and 4, we see that if the initial data φ(x) is odd, then u(x,t) is odd in x for each t. In particular,
u(0,t) + u(−0,t) = 2u(0,t) = 0,
and hence u(0,t) = 0 for any t > 0, which is exactly the boundary condition on v in (2.133).
the initial data to an odd function on the whole real line, then the solution with the ex-
tended initial data automatically satisfies the Dirichlet boundary condition of (2.133).
We have the following definition.
Definition 2.6 The odd extension of a function f (x) denoted by fo (x) is defined as
f (x), x>0
fo (x) = − f (−x), x < 0 (2.134)
0, x = 0.
The odd extension fo is defined for negative x by reflecting the f (x) with respect to the
vertical axis, and then with respect to the horizontal axis. This procedure produces a
function whose graph is symmetric with respect to the origin, and thus it is odd.
For example, if f (x) = x, x > 0, then fo(x) = x for −∞ < x < ∞. In light of the above
discussion we recast the heat problem in (2.133) with extended data as
ut − kuxx = 0, −∞ < x < ∞, t > 0
(2.135)
u(x, 0) = φo (x).
u(x,t) = ∫_{−∞}^{∞} S(x − y,t)φo(y) dy, for t > 0. (2.136)
Since v is a solution for x ≥ 0, we have v(x,t) = u(x,t) for x ≥ 0. Notice that, for x > 0,
v(x, 0) = u(x, 0) = φo(x) = φ(x),
and v(0,t) = u(0,t) = 0, since u(x,t) is an odd function of x. Thus, v(x,t) satisfies the
boundary condition in (2.135). Substituting φo (x) into the solution given by (2.136),
we obtain by splitting the integral over two regions
u(x,t) = ∫_0^∞ S(x − y,t)φo(y) dy + ∫_{−∞}^0 S(x − y,t)φo(y) dy
       = ∫_0^∞ S(x − y,t)φ(y) dy − ∫_{−∞}^0 S(x − y,t)φ(−y) dy.
Using (2.128),
S(x,t) = (1/√(4πkt)) e^{−x2/(4kt)},
and v(x,t) in the above expression, we may write the solution formula for (2.133) as follows:
v(x,t) = (1/√(4πkt)) ∫_0^∞ [ e^{−(x−y)2/(4kt)} − e^{−(x+y)2/(4kt)} ] φ(y) dy, for t > 0. (2.137)
Example 1 Consider the heat equation in (2.133) with φ (x) = u0 for constant u0 .
Substituting φ (y) = u0 into the solution v(x,t) in (2.137), we obtain
v(x,t) = (u0/√(4πkt)) ∫_0^∞ [ e^{−(x−y)2/(4kt)} − e^{−(x+y)2/(4kt)} ] dy, for t > 0.
Making the change of variable s = (x − y)/√(4kt), we arrive at
(u0/√(4πkt)) ∫_0^∞ e^{−(x−y)2/(4kt)} dy = (u0/√π) ∫_{−∞}^{x/√(4kt)} e^{−s2} ds.
Similarly, by letting s = (x + y)/√(4kt), we arrive at
(u0/√(4πkt)) ∫_0^∞ e^{−(x+y)2/(4kt)} dy = (u0/√π) ∫_{x/√(4kt)}^{∞} e^{−s2} ds.
Thus,
v(x,t) = (u0/√π) [ ∫_{−∞}^{x/√(4kt)} e^{−s2} ds − ∫_{x/√(4kt)}^{∞} e^{−s2} ds ]
       = (u0/√π) [ ∫_{−∞}^{0} e^{−s2} ds + ∫_0^{x/√(4kt)} e^{−s2} ds − ∫_0^{∞} e^{−s2} ds + ∫_0^{x/√(4kt)} e^{−s2} ds ]
       = u0 (2/√π) ∫_0^{x/√(4kt)} e^{−s2} ds   (since e^{−s2} is even)
       = u0 erf( x/√(4kt) ).
□
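The closed form of Example 1 can be verified symbolically as well. The SymPy sketch below (our own illustration) checks the heat equation, the boundary condition at x = 0, and the limit as t → 0+ for x > 0.

import sympy as sp

x, t, k, u0 = sp.symbols('x t k u_0', positive=True)
v = u0*sp.erf(x/sp.sqrt(4*k*t))

print(sp.simplify(sp.diff(v, t) - k*sp.diff(v, x, 2)))   # -> 0
print(v.subs(x, 0))                                      # -> 0
print(sp.limit(v, t, 0, dir='+'))                        # -> u_0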
In the heat problem (2.133), we considered the Dirichlet boundary condition v(0,t) =
0, and derived the solution given by (2.137). Currently, we are interested in finding
the solution for (2.133), but with a nonzero boundary condition. That is, we consider
the heat problem with a nonhomogenous boundary condition
vt − kvxx = 0, 0 < x < ∞, t > 0
v(x, 0) = 0, x>0 (2.138)
v(0,t) = p(t), t > 0,
Then
φo(x) = −p(0) for x > 0, p(0) for x < 0, and 0 for x = 0,
and
fo(x,t) = −p′(t) for x > 0, p′(t) for x < 0, and 0 for x = 0.
Then, by (2.132), the solution may be written in terms of the heat kernel:
u(x,t) = ∫_{−∞}^{∞} S(x − y,t)φo(y) dy + ∫_0^t ∫_{−∞}^{∞} S(x − y,t − τ) fo(y, τ) dy dτ, for t > 0. (2.140)
By substituting φo and fo into (2.140), we obtain the explicit solution
u(x,t) = −(1/√(4πkt)) ∫_0^∞ [ e^{−(x−y)2/(4kt)} − e^{−(x+y)2/(4kt)} ] p(0) dy
       − ∫_0^t (1/√(4πk(t−τ))) ∫_0^∞ [ e^{−(x−y)2/(4k(t−τ))} − e^{−(x+y)2/(4k(t−τ))} ] p′(τ) dy dτ. (2.141)
Our aim is to find a solution to (2.142) as we did for the heat problem over the entire
real line. There will be no need to start from scratch, but instead we will try to recast
the problem over the entire real line by extending the initial data to the whole line.
For heat problems with Neumann conditions, one would choose the even extension
of the initial data φ (x). Note that if ψ(x) is even, then ψ(x) = ψ(−x), from which
we get ψ ′ (x) = −ψ ′ (−x), and hence 2ψ ′ (0) = 0, or ψ ′ (0) = 0. This is true for any
even function. We make it formal in the next lemma.
Lemma 5 Let f : (−∞, ∞) → R be an even function ( f (x) = f (−x)), that is differ-
entiable at x = 0 then f ′ (0) = 0.
In light of the above discussion, we recast the heat problem in (2.142) with extended
data as
ut − kuxx = 0, −∞ < x < ∞, t > 0
u(x, 0) = φe (x) (2.144)
ux (0,t) = 0, t > 0.
and vx (0,t) = ux (0,t) = 0, since u(x,t) is an even function of x. Thus, v(x,t) satis-
fies the Neumann condition in (2.144). Substituting φe (x) into the solution given by
(2.145), we obtain by splitting the integral over two regions
u(x,t) = ∫_0^∞ S(x − y,t)φe(y) dy + ∫_{−∞}^0 S(x − y,t)φe(y) dy
       = ∫_0^∞ S(x − y,t)φ(y) dy + ∫_{−∞}^0 S(x − y,t)φ(−y) dy.
Using (2.128),
S(x,t) = (1/√(4πkt)) e^{−x2/(4kt)},
and v(x,t) in the above expression, we may write the solution formula for (2.142) as follows:
v(x,t) = (1/√(4πkt)) ∫_0^∞ [ e^{−(x−y)2/(4kt)} + e^{−(x+y)2/(4kt)} ] φ(y) dy, for t > 0. (2.146)
We have the following example.
Example 2.28 Consider the heat equation given by (2.142) with φ (x) = u0 for con-
stant u0. Substituting φ(y) = u0 into the solution v(x,t) in (2.146) we obtain
v(x,t) = (u0/√(4πkt)) ∫_0^∞ [ e^{−(x−y)2/(4kt)} + e^{−(x+y)2/(4kt)} ] dy, for t > 0.
Making the change of variable s = (x − y)/√(4kt) in the first integral and s = (x + y)/√(4kt) in the second integral, we arrive at
v(x,t) = (u0/√π) [ ∫_{−∞}^{x/√(4kt)} e^{−s2} ds + ∫_{x/√(4kt)}^{∞} e^{−s2} ds ]
       = (u0/√π) ∫_{−∞}^{∞} e^{−s2} ds
       = u0 (2/√π) ∫_0^{∞} e^{−s2} ds   (since e^{−s2} is even)
       = u0 (2/√π)(√π/2)   (by Lemma 1)
       = u0.
□
2.7.4 Exercises
Exercise 2.71 Prove parts (a)–(c) of Lemma 2.
Exercise 2.72 For constant u0 , write the solution in terms of the error function for
the heat problem
ut − kuxx = 0, −∞ < x < ∞, t > 0
u(x, 0) = φ (x), −∞ < x < ∞
where
u0 , |x| < l
φ (x) =
0, |x| > l
Exercise 2.73 Consider the heat problem
vt − kvxx = 0, 0 < x < ∞, t > 0
v(x, 0) = φ (x), x>0
v(0,t) = 0, t >0
v(x,t) = (1/2) erf( (l + x)/√(4kt) ) − (1/2) erf( (l − x)/√(4kt) ).
Exercise 2.74 Solve the heat problem
ut − kuxx = 0, −∞ < x < ∞, t > 0
u(x, 0) = e−2|x| , −∞ < x < ∞.
(a) φ(x) = 1 − x for 0 < x < 1, and φ(x) = 0 for x ≥ 1.
(b) φ(x) = x e^{−x2} for 0 < x < 1, and φ(x) = 0 for x ≥ 1.
Exercise 2.76 Provide all the details in obtaining (2.141).
Exercise 2.77 Consider the nonlinear heat equation
subject to an initial condition u(x, 0) = f (x). This form of PDE makes its presence
in stochastic optimal control theory. In this question you will derive a representation
formula for the solution u(x,t).
(a) Define the Cole-Hopf transformation w(x,t) = e^{−(b/k) u(x,t)}. Show that w is a solution of the linear heat equation wt − kwxx = 0.
(b) Use the fundamental solution of the heat equation to solve for w(x,t).
(c) Invert the Cole-Hopf transformation to find a formula for u.
Exercise 2.78 Solve the heat equation ut −uxx = 0 on the entire real line −∞ < x < ∞
with initial condition u(x, 0) = cos(x) without using the fundamental solution of the
heat equation. Use your answer to deduce the value of the integral
∫_{−∞}^{∞} cos(x) e^{−x2/(4t)} dx.
(a) p(t) = 1 for 0 < t < 1, and p(t) = 0 for t ≥ 1.
(b)
p(t) = 1, t > 0.
Exercise 2.80 Provide all the details in obtaining (2.130).
Exercise 2.81 For positive constant c, consider the heat equation with convection
term
ut + cux − kuxx = 0, (2.147)
(a) Determine the values of α and β so that the transformation
u(x,t) = v(x,t)eαx+βt
(a) φ(x) = 1 for 0 < x < 1, and φ(x) = 0 for x ≥ 1; (b) φ(x) = x e^{−ax};
(c) φ(x) = 1 − x2 for 0 < x < 1, and φ(x) = 0 for x ≥ 1.
Exercise 2.84 Consider the heat equation over R along with its solution given by
(2.129). Show that if the initial function φ (x) is uniformly bounded on R, then the
solution u(x,t) satisfies
|u(x,t)| ≤ max_{−∞<x<∞} |φ(x)|.
Hint: Make use of the substitution s = (x − y)/√(4kt).
Using the same concept as we did for the heat equation with Dirichlet condition,
we use the odd extensions fo (x) and go (x) of f (x) and g(x), respectively and solve
the wave equation on the whole real line with initial conditions u(x, 0) = fo (x) and
ut (x, 0) = go (x). In other words, we consider
utt − c2 uxx = 0, −∞ < x < ∞, t > 0
u(x, 0) = fo(x), ut(x, 0) = go(x), −∞ < x < ∞. (2.149)
By Exercise 2.89, u(x,t) given in (2.150) is odd and so u(0,t) = 0. Now we try to
make some sense out of the solution in (2.150).
Remember that x > 0. We do this in two cases.
In addition, on [x − ct, x + ct] we have go (s) = g(s). Thus if x > ct, we have from
(2.150) that
u(x,t) = (1/2)[ f (x + ct) + f (x − ct) ] + (1/(2c)) ∫_{x−ct}^{x+ct} g(s) ds.
In summary, the solution of the half-line wave equation with Dirichlet boundary
condition is
u(x,t) =
  (1/2)[ f (x + ct) + f (x − ct) ] + (1/(2c)) ∫_{x−ct}^{x+ct} g(s) ds,   x ≥ ct        (2.151)
  (1/2)[ f (x + ct) − f (ct − x) ] + (1/(2c)) ∫_{ct−x}^{ct+x} g(s) ds,   0 < x < ct.
Fig. 2.20 shows the two regions of the existence of the solution.
Now we turn our attention to establishing the solution of the wave equation on semi
unbounded domain with Neumann condition. We begin by considering the wave
equation of semi-infinite string with a Neumann condition
utt − c2 uxx = 0, 0 < x < ∞, t > 0
u(x, 0) = f (x), x>0
(2.152)
ut (x, 0) = g(x), x>0
ux (0,t) = 0, t > 0.
Using the same concept as we did for the wave equation with Dirichlet condition,
we use the even extensions fe (x) and ge (x) of f (x) and g(x), respectively, and solve
FIGURE 2.20
Regions of existence of the solution, x ≥ ct and 0 < x < ct, separated by the line x = ct.
the wave equation on the whole real line with initial conditions u(x, 0) = fe (x) and
ut (x, 0) = ge (x). In other words, we consider
utt − c2 uxx = 0, −∞ < x < ∞, t > 0
u(x, 0) = fe(x), ut(x, 0) = ge(x), −∞ < x < ∞. (2.153)
D'Alembert's formula gives
u(x,t) = (1/2)[ fe(x + ct) + fe(x − ct) ] + (1/(2c)) ∫_{x−ct}^{x+ct} ge(s) ds. (2.154)
By Exercise 2.88, u(x,t) given by (2.154) is even and since the derivative of an even
function is odd, and so ux will be odd in x, and hence ux (0,t) = 0. As before we can
simplify (2.150). We do this in two cases and recall that x > 0.
(a) First, suppose that x > ct. Then x + ct ≥ 0 and x − ct ≥ 0, and so
fe (x + ct) = f (x + ct) and fe (x − ct) = f (x − ct).
Moreover, on [x − ct, x + ct] we have ge (s) = g(s). Thus if x > ct, we have from
(2.154) that
u(x,t) = (1/2)[ f(x + ct) + f(x − ct) ] + (1/(2c)) ∫_{x−ct}^{x+ct} g(s) ds.
In summary, the solution of the half-line wave equation with Neumann boundary condition is
u(x,t) = (1/2)[ f(x + ct) + f(x − ct) ] + (1/(2c)) ∫_{x−ct}^{x+ct} g(s) ds,   x ≥ ct,
u(x,t) = (1/2)[ f(x + ct) + f(ct − x) ] + (1/(2c)) [ ∫_0^{ct−x} g(s) ds + ∫_0^{x+ct} g(s) ds ],   0 < x < ct.      (2.155)
Example 2.29 Consider the wave equation (2.148) with c = 1 and with initial data
f (x) = sin(x), g(x) = 0, and the Dirichlet condition u(0,t) = 0. Then using (2.151),
we have for x > t that
u(x,t) = (1/2)[ sin(x + t) + sin(x − t) ] = sin(x) cos(t).
On the other hand, for 0 < x < t, we have
u(x,t) = (1/2)[ sin(x + t) − sin(t − x) ] = sin(x) cos(t).
Thus,
u(x,t) = sin(x) cos(t),  x > 0.
□
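The piecewise formula (2.151) is easy to check numerically. The following sketch (in Python, assuming the numpy and scipy libraries are available; the function name is ours, not from the text) evaluates (2.151) for the data of Example 2.29 and compares the result with sin(x)cos(t).

```python
import numpy as np
from scipy.integrate import quad

def dalembert_dirichlet(f, g, c, x, t):
    """Evaluate the half-line Dirichlet solution (2.151) at a point (x, t), x > 0."""
    if x >= c * t:
        avg = 0.5 * (f(x + c * t) + f(x - c * t))
        integral, _ = quad(g, x - c * t, x + c * t)
    else:
        avg = 0.5 * (f(x + c * t) - f(c * t - x))
        integral, _ = quad(g, c * t - x, c * t + x)
    return avg + integral / (2 * c)

# Data of Example 2.29: c = 1, f(x) = sin(x), g(x) = 0, Dirichlet condition u(0,t) = 0.
f = np.sin
g = lambda s: 0.0
for (x, t) in [(2.0, 0.5), (0.3, 1.0)]:   # one point with x > t, one with 0 < x < t
    u = dalembert_dirichlet(f, g, 1.0, x, t)
    print(x, t, u, np.sin(x) * np.cos(t))  # the last two columns agree
```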
Example 2.30 Consider the wave equation (2.148) with c = 1 and with initial data
f(x) = sin(x), g(x) = 0, and the Neumann condition ux(0,t) = 0. Then using (2.155),
one can verify that
u(x,t) = sin(x) cos(t) for x ≥ t,  and  u(x,t) = sin(t) cos(x) for 0 < x < t.
□
2.8.1 Exercises
Exercise 2.85 Solve
utt − c2 uxx = 0, 0 < x < ∞, t > 0
u(x, 0) = 0, x>0
ut (x, 0) = cos(x), x>0
u(0,t) = 0, t > 0.
Exercise 2.88 Let u(x,t) be the d'Alembert solution on −∞ < x < ∞ of the wave
equation given by (2.82). Show that if the initial data f(x) = u(x, 0) and g(x) = ut(x, 0)
are even, then for all t ≥ 0, u(x,t) is an even function of x.
Exercise 2.89 Let u(x,t) be the d'Alembert solution on −∞ < x < ∞ of the wave
equation given by (2.82). Show that if the initial data f(x) = u(x, 0) and g(x) = ut(x, 0)
are odd, then for all t ≥ 0, u(x,t) is an odd function of x.
Exercise 2.90 Solve
utt − c2 uxx = 0 0 < x < ∞, t > 0
u(x, 0) = 0, x>0
ut (x, 0) = sin(x), x>0
ux(0,t) = 0,  t > 0.
Exercise 2.94 Consider the wave problem with nonhomogeneous Dirichlet boundary condition
vtt − c² vxx = 0,  0 < x < ∞, t > 0,
v(x, 0) = f(x),  x > 0,
vt(x, 0) = g(x),  x > 0,
v(0,t) = p(t),  t > 0,
and reduce the wave problem to a problem with homogeneous boundary condition. Then use the d'Alembert solution given by (2.85) to find the solution u, and then find the solution v of the original wave problem.
3
Matrices and Systems of Linear Equations
x1 + x2 − x3 + 4x4 = 10
−x1 − x2 + 2x3 + x4 = 0
10x1 + 3x2 + x4 = 29
The set of all solutions is called the solution set. Under the assumption that the system
(3.1) has a solution, we use the Gaussian elimination method to find the solution set.
We now describe the method in few steps.
Step 1. In this step, we try to eliminate x1 from the second, third, . . . , mth equations.
Suppose that a11 ≠ 0. If not, renumber the equations or variables so that
this is the case. We may achieve the elimination of x1 by multiplying the
first equation by a21/a11 and then subtracting the resulting equation from the
second equation, by multiplying the first equation by a31/a11 and then
subtracting the resulting equation from the third equation, and so forth. This
results in the new system of equations
Since any solution of (3.1) is a solution of (3.2) and conversely, because steps
are reversible, we may obtain (3.1) from (3.2).
Step 2. In this step we try to eliminate x2 from the third, . . . , mth equations in (3.2).
Suppose that l22 ≠ 0. (Otherwise, renumber the equations or variables so
that this is so.) We do this by multiplying the second equation by l32/l22 and
then subtracting the resulting equation from the third equation. Similarly,
we multiply the second equation by l42/l22 and then subtract the resulting
equation from the fourth equation, and so forth. The further steps are now
obvious. For example, in the third step we eliminate x3, and in the fourth
step we eliminate x4, etc. The process stops only when no equations are
left or when the coefficients of all the unknowns in the remaining equations
are all zero. This leads to the system of equations
If the system has a solution, we may obtain it by assigning arbitrary values to
the unknowns x_{r+1}, . . . , x_n, solving the last equation in (3.3) for x_r, the next
to last for x_{r−1}, and so on up the system. When m = n = r, the system
(3.3) has triangular form and there is one, and only one, solution. We will
illustrate the method in a series of examples.
Example 3.1 Consider the system
In the first step, we eliminate x1 from the last two equations. This is done by mul-
tiplying the first equation by 2 and then subtracting the resulting equation from the
second equation. Similarly, we subtract the third equation from the first equation.
This leads us to the new system of equations
Note that the last two equations are identical. In the second step we eliminate x2 from
the third equation by subtracting the second equation from the third equation. This
results in the new system of equations
The third equation is satisfied for any value for x2 , x3 , and x4 . Thus the third equation
puts no constraint on the solution. However, the first and second equations represent
four unknowns with two constraints and hence there are two arbitrary unknowns
(4 − 2 = 2). We may choose x3 and x4 arbitrarily. Thus, we let x3 = s and x4 = t,
where s and t are arbitrary. Then the second equation gives
x2 = x3 − (5/3)x4 + 13/3 = s − (5/3)t + 13/3.
Using the first equation, we solve for x1 and obtain
x1 = −2x2 + 2x3 − 4x4 + 11 = −(2/3)t + 7/3.
□
Example 3.2 Consider the system
x1 + x2 − 3x3 = 4
2x1 + x2 − x3 = 2
3x1 + 2x2 − 4x3 = 7.
We start by using the first equation. We perform two steps simultaneously. Multiply
the first equation by −2 and add it to the second equation. Then multiply the first
equation by −3 and add it to the third equation to obtain
x1 + x2 − 3x3 = 4
−x2 + 5x3 = −6
−x2 + 5x3 = −5.
Next, adding −1 times the second equation to the third equation produces
x1 + x2 − 3x3 = 4
−x2 + 5x3 = −6
0x1 + 0x2 + 0x3 = 1.
The third equation cannot be satisfied for any values of x1, x2, and x3, and hence the
process stops and the system has no solution. □
Example 3.3 Consider the system
2x1 − 2x2 = −6
x1 − x2 + x3 = 1
3x2 − 2x3 = −5.
Dividing the first equation by 2 yields
x1 − x2 = −3
x1 − x2 + x3 = 1
3x2 − 2x3 = −5.
Now, adding −1 times the first equation to the second equation yields
x1 − x2 = −3
x3 = 4
3x2 − 2x3 = −5.
From the second equation we immediately have x3 = 4. Substituting this value into
the third equation gives 3x2 − 2(4) = −5, and hence x2 = 1. Similarly, substituting
into the first equation gives x1 = −2. Thus the system has the solution (x1 , x2 , x3 ) =
(−2, 1, 4). □
Example 3.4 Consider the system
x1 + x2 − 3x3 = 4
2x1 + x2 − x3 = 2
3x1 + 2x2 − 4x3 = 6.
We will perform two steps at once. First, multiply the first equation by −2 and add
it to the second equation; second, multiply the first equation by −3 and add it to the
third equation. This yields the following equivalent system
x1 + x2 − 3x3 = 4
−x2 + 5x3 = −6
−x2 + 5x3 = −6.
Now, adding −1 times the second equation to the third equation yields
x1 + x2 − 3x3 = 4
−x2 + 5x3 = −6
0x1 + 0x2 + 0x3 = 0.
The third equation is satisfied for any value for x1 , x2 and x3 . Thus the third equation
puts no constraint on the solution. However, the first and second equations represent
three unknowns with two constraints and hence there is one arbitrary unknown (3 −
2 = 1.) To simplify calculation, it is more convenient to let x3 = s where s is arbitrary.
Then the second equation gives x2 = 5s + 6. Similarly, the first equation yields x1 =
−2 − 2s. Since s is arbitrary, the system has infinitely many solutions. □
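The elimination procedure of Steps 1 and 2 can be summarized in a short routine. The sketch below (plain Python; the function name and the tolerance are our own choices, not the text's) forward-eliminates the augmented matrix of the system of Example 3.4, swapping rows when a pivot vanishes — the "renumbering of equations" mentioned above.

```python
def gaussian_eliminate(A, b):
    """Forward-eliminate the augmented system [A | b] as in Steps 1 and 2,
    swapping rows when a pivot is zero (the 'renumbering' of equations)."""
    m, n = len(A), len(A[0])
    M = [row[:] + [bi] for row, bi in zip(A, b)]   # augmented matrix
    row = 0
    for col in range(n):
        # find a usable pivot in this column at or below 'row'
        piv = next((r for r in range(row, m) if abs(M[r][col]) > 1e-12), None)
        if piv is None:
            continue
        M[row], M[piv] = M[piv], M[row]
        for r in range(row + 1, m):
            factor = M[r][col] / M[row][col]
            M[r] = [a - factor * p for a, p in zip(M[r], M[row])]
        row += 1
    return M

# The system of Example 3.4
A = [[1, 1, -3], [2, 1, -1], [3, 2, -4]]
b = [4, 2, 6]
for r in gaussian_eliminate(A, b):
    print(r)    # the last row reduces to [0, 0, 0, 0]: one free unknown
```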
Proof The proof is based on the Gauss elimination method. We may assume the
coefficient a11 of x1 is not zero. This is a fair assumption since if all the coefficients
of x1 are zero; that is a11 = a21 = . . . = am1 = 0, then, x1 = 1, x2 = x3 = . . . = xn = 0
is a nontrivial solution. Thus, we may assume a11 ̸= 0. Divide the first equation in
(3.4) by a11 to obtain the equation
Multiply this equation successively by a21, a31, . . . , am1, and subtract the respective resulting
equations from the second, third, . . . , mth equations of (3.4), to reduce (3.4) to the
form
x1 + b12 x2 + b13 x3 + . . . + b1n xn = 0
      b22 x2 + b23 x3 + . . . + b2n xn = 0
      . . .
      bm2 x2 + bm3 x3 + . . . + bmn xn = 0.
Now we repeat the same process, assuming the coefficient b22 of x2 is not zero.
Applying the Gaussian procedure again produces the third system
x1 + c13 x3 + . . . + c1n xn = 0
x2 + c23 x3 + . . . + c2n xn = 0
      c33 x3 + . . . + c3n xn = 0
      . . .
      cm3 x3 + . . . + cmn xn = 0.
By continuing this process, in particular at the rth stage, and by using the fact that
the number of equations is less than the number of variables, we ultimately arrive at
a system of the form
x1 + d1r xr + . . . + d1n xn = 0
x2 + d2r xr + . . . + d2n xn = 0
. . .
x_{r−1} + d_{r−1,r} xr + . . . + d_{r−1,n} xn = 0
0 = 0.
If we let xr = 1, xr+1 = · · · = xn = 0, and x1 = −d1r , x2 = −d2r , . . . , xr−1 = −dr−1,r ,
we obtain a nontrivial solution. The proof is done since the systems are equiva-
lent.
Remark 8 In fact, the homogeneous system (3.4) has infinitely many solutions since
the choice of xr is arbitrary.
3.2.1 Exercises
Exercise 3.1 Solve the given system
x1 − 2x2 − x3 + 3x4 = 1
2x1 − 4x2 + x3 = 5
x1 − 2x2 + 2x3 − 3x4 = 4.
3.3 Matrices
In this section we look at matrix algebra and related issues. We begin with the defi-
nition of a matrix. A matrix A is a rectangular array of real or complex numbers of
the form
A =
| a11  a12  · · ·  a1n |
| a21  a22  · · ·  a2n |
|  ..   ..          ..  |
| am1  am2  · · ·  amn |      (3.5)
The element in the ith row and jth column of the matrix A is denoted by ai j . Some-
times we use the more compact notation
A = (ai j ), i = 1, 2, . . . , m, j = 1, 2, . . . n.
The matrix A has m rows and n columns and we say it is an m × n matrix and we may
write it as Am×n . When m = n, then A is said to be a square matrix. Two matrices are
said to be equal when and only when they have the same size (that is same numbers
of rows and columns) and have the same entry in each position. In other words, if
B p×q is another matrix with B = (bi j ), i = 1, 2, . . . , p, j = 1, 2, . . . q, then A = B if
and only if m = p and n = q, and ai j = bi j for all i and j. As for addition, if A and B
are two matrices with the same size, then
A + B = ( a_ij + b_ij )_{m×n}.
The product of two matrices A_{m×n} and B_{n×p} is another matrix C_{m×p}, where
C = ( ∑_{j=1}^{n} a_ij b_jk )_{m×p}.
To be more explicit, the product of the two matrices A and B has the general formula
AB = (a_ij)_{m×n} (b_ij)_{n×p}
   = | a11 ··· a1n ; a21 ··· a2n ; ··· ; am1 ··· amn | | b11 ··· b1p ; b21 ··· b2p ; ··· ; bn1 ··· bnp |
   = | ∑_{k=1}^{n} a_{1k} b_{k1}  ···  ∑_{k=1}^{n} a_{1k} b_{kp} ; ··· ; ∑_{k=1}^{n} a_{mk} b_{k1}  ···  ∑_{k=1}^{n} a_{mk} b_{kp} |.
Notice that the resulting matrix from the product AB has the same number of rows
as A and the same number of columns as B. Thus, if AB is defined, BA may not be
defined. Moreover, the multiplication of two matrices is, in general, not commutative.
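As a quick illustration of the product formula and of non-commutativity, consider the following sketch (Python; numpy is assumed only for the cross-check). The function computes C[i][k] = ∑_j a_{ij} b_{jk} directly from the definition.

```python
import numpy as np

def matmul(A, B):
    """Entry-by-entry product: C[i][k] = sum_j A[i][j] * B[j][k]."""
    m, n, p = len(A), len(B), len(B[0])
    return [[sum(A[i][j] * B[j][k] for j in range(n)) for k in range(p)] for i in range(m)]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
print(matmul(A, B))   # [[2, 1], [4, 3]]
print(matmul(B, A))   # [[3, 4], [1, 2]]  -- AB != BA in general
print(np.allclose(matmul(A, B), np.array(A) @ np.array(B)))  # True
```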
Associative law If A is an m × n matrix, B is an n × p matrix, and C is a p × q matrix,
then
(AB)C = A(BC). (3.6)
Moreover, if α is a constant, then clearly
αA = (αai j )m×n .
AI = IA = A.
IX = X.
I = (δi j )n×n .
AT = −A.
As a consequence of the above definition, we know that any square matrix A may
be written as the sum of a symmetric matrix R and a skew-symmetric matrix S,
where
R = (1/2)(A + A^T)  and  S = (1/2)(A − A^T).      (3.8)
Definition 3.3 Let A be an n × n matrix.
a) If all the elements above the principal diagonal (or below the principal diagonal)
are zero, then the matrix A is called a triangular matrix
b) If all the elements above and below the principal diagonal are zero, then the
matrix A is called a diagonal matrix.
c) A matrix whose entries are all zero is called a zero matrix or null matrix.
3.3.1 Exercises
Exercise 3.10 Prove (3.6).
Exercise 3.11 a) Prove Associative law for matrix addition:
(A + B) +C = A + (B +C).
(B +C)A = BA +CA.
AB = BA.
Exercise 3.15 Give an example of two matrices A and B such that AB = 0, but nei-
ther A = 0 nor B = 0.
Exercise 3.16 Show that (AB)T = BT AT .
Exercise 3.17 Let A be a square matrix given by A = (a_ij), i, j = 1, 2, . . . , n. Suppose
A is a skew-symmetric matrix. Show that all the entries on its principal diagonal are zero.
Exercise 3.18 Give an example of a 3 × 3 matrix that is skew-symmetric.
2 3
Exercise 3.19 Write the matrix A = as the sum of R and S as given in
5 −1
(3.8).
Exercise 3.20 Show that the transpose of a triangular matrix is triangular.
Exercise 3.21 Write A = | 1 2 6 ; 3 4 7 ; 5 8 9 | as a sum of two triangular matrices. Is
this sum unique?
ax + by = k1
cx + dy = k2 .
Then, we have
C11 = M11 = | a22 a23 ; a32 a33 |,   C12 = −M12 = −| a21 a23 ; a31 a33 |,
C32 = −M32 = −| a11 a13 ; a21 a23 |,   etc.
□
or
|A| = a1kC1k + a2kC2k + . . . + ankCnk , k = 1, 2, . . . , n.
If all the entries of the matrix A are real constants, then the value of the determinant
is a real constant.
Remark 9 The determinant of an n × n matrix is the same regardless of which row
or column is chosen.
Example 3.6 Find |A| for
A =
| 1  2  −1 |
| 3  6   0 |
| 0  4   2 |
We make use of the first row:
|A| = 1 | 6 0 ; 4 2 | − 2 | 3 0 ; 0 2 | − 1 | 3 6 ; 0 4 | = 12 − 12 − 12 = −12.
□
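The cofactor expansion used in Example 3.6 translates directly into a recursive routine. The following sketch (plain Python; the function name is ours) reproduces |A| = −12.

```python
def det(A):
    """Determinant by cofactor expansion along the first row."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in A[1:]]   # delete row 0 and column j
        total += (-1) ** j * A[0][j] * det(minor)
    return total

A = [[1, 2, -1], [3, 6, 0], [0, 4, 2]]
print(det(A))   # -12, as in Example 3.6
```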
Below we state a theorem that contains certain facts concerning determinants. We
leave the proofs to you.
Theorem 3.2 1) If all elements of one row or one column of an n × n matrix are
multiplied by a constant k, then the determinant is k times the determinant of the
original matrix.
2) If all entries of a row or a column of an n × n matrix are zero, then the determi-
nant is zero.
3) If any two rows or columns of an n × n matrix are interchanged, then the deter-
minant is −1 times the original determinant.
4) If any two rows (or two columns) of an n × n matrix are constant multiples of
each other, then the determinant is zero.
5) If the entries of any row (or column) of an n × n matrix are altered by adding to
them any constant multiple of the corresponding elements in any other row (or
column) then the determinant does not change.
Theorem 3.3 Let A and B be two n × n matrices. Then,
det(A) = det(AT ).
Now we transition to the concept of the inverse of a matrix. We have the following
definition.
Definition 3.5 Let A be an n × n matrix. If there exists an n × n matrix B such that
AB = BA = I,
Proof Suppose the matrix A has two inverses B and C. That is, AB = BA = I, and
AC = CA = I. Then,
B = BI = B(AC) = (BA)C = IC = C.
In the next example we show how to find the inverse of a 2 × 2 matrix by solving
systems of equations.
Example 2 Find the inverse of A = | 1 2 ; 3 4 |. Suppose the matrix B = | a b ; c d |
is the inverse matrix of A. Then it must satisfy AB = BA = I. In other words,
| 1 2 ; 3 4 | | a b ; c d | = | 1 0 ; 0 1 |.
This gives
a + 2c = 1,  b + 2d = 0,  3a + 4c = 0,  and  3b + 4d = 1.
Solving for a in the third equation and substituting into the first equation gives a = −2
and c = 3/2. Similarly, solving for b in the second equation and substituting it into
the fourth equation gives b = 1 and d = −1/2. Hence
B = | −2  1 ; 3/2  −1/2 |.
We can easily check that AB = BA = I. We conclude that the matrix B is the inverse matrix of A.
□
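A quick numerical confirmation of Example 2 (a sketch in Python, assuming numpy is available):

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[-2.0, 1.0], [1.5, -0.5]])      # the inverse found above

print(np.allclose(A @ B, np.eye(2)))           # True
print(np.allclose(B @ A, np.eye(2)))           # True
print(np.linalg.inv(A))                        # the same matrix, computed directly
```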
Theorem 3.6 Let A be an n × n matrix. Then A is invertible if and only if
det(A) ̸= 0.
Proof Suppose A is invertible, so that AA^{−1} = I. Taking determinants of both sides gives
det(A) det(A^{−1}) = det(I) = 1,
from which it follows that det(A) ≠ 0. The second part of the proof is left as an
exercise. This completes the proof.
Now we are ready to give a formula for the inverse of a square matrix, but first we
make the following definition.
Definition 3.6 Let A be an n × n matrix. The adjoint of A, Adj A, is
Adj A = (C_ik)^T,
where C_ik is the cofactor of the entry a_ik. If det(A) ≠ 0, then
A^{−1} = (1/det(A)) Adj A.      (3.11)
Let A = | a b ; c d |. Then from (3.11) we have that
A^{−1} = (1/det(A)) | d  −b ; −c  a |.      (3.12)
For a diagonal matrix A with nonzero diagonal entries a11, a22, . . . , ann, we have that
A^{−1} = | 1/a11  0  ···  0 ; 0  1/a22  ···  0 ; ··· ; 0  0  ···  1/ann |.
AA−1 = I.
CA−1 = I.
(A−1 )−1 = A.
This shows that the inverse of the inverse of an invertible matrix A is the matrix
A.
Theorem 3.7 Let A and B be two n × n invertible matrices. Then (AB)^{−1} = B^{−1} A^{−1}.
Proof Since
AB(AB)^{−1} = I,
multiplying both sides of the preceding expression from the left by A^{−1} we arrive at
B(AB)^{−1} = A^{−1}.
By multiplying both sides from the left by B^{−1} the result follows. This completes the
proof.
Of course, Theorem 3.7 can easily be generalized to products of more than two
matrices. Hence, by induction,
(ABC . . . PQ)^{−1} = Q^{−1} P^{−1} . . . B^{−1} A^{−1}.      (3.13)
□
For an application, we consider the nonhomogeneous system of n equations in n
unknowns, AX = b, with det(A) ≠ 0, whose solution is
X = A^{−1} b.      (3.16)
Clearly, (3.16) is a solution of (3.15). To see this, substitute X into (3.15) and
get
A(A−1 b) = (AA−1 )b = Ib = b.
As for uniqueness, suppose there is another solution Y such that
AY = b.
X = A−1 b, Y = A−1 b,
from which we conclude that X = Y. Now that we have established the system has
a unique solution, we try to give an explicit formula for such solution. Using (3.11)
along with (3.16) and Definition 3.6 we have
1
X= (Adj A)b.
det(A)
Or,
( x1, x2, . . . , xn )^T = (1/det(A)) | C11 C21 ··· Cn1 ; C12 C22 ··· Cn2 ; ··· ; C1n C2n ··· Cnn | ( b1, b2, . . . , bn )^T
= (1/det(A)) ( b1C11 + b2C21 + ··· + bnCn1,  b1C12 + b2C22 + ··· + bnCn2,  . . . ,  b1C1n + b2C2n + ··· + bnCnn )^T.
It follows from the above calculation that the components of the solutions are given
by
b1C1i + b2C2i + · · · + bnCni
xi = , i = 1, 2, . . . , n (3.17)
det(A)
We summarize the results in the following theorem.
Theorem 3.8 Consider the nonhomogeneous system (3.14) of n linear equations
with n unknowns. If its coefficient matrix A has det(A) ̸= 0, then the system has a
unique solution x1 , x2 , . . . , xn given by (3.17).
As a direct consequence of (3.17) we have the following corollary.
Corollary 3 A homogeneous system of n linear equations with n unknowns and a
coefficients matrix A with det(A) ̸= 0 has just the trivial solution.
Proof Formula (3.17) is valid since det(A) ̸= 0. The results follow since each bi =
0, i = 1, 2, . . . , n.
x1 + 2x2 = 5
−x1 + x2 + x3 = 4
x1 + 2x2 + 3x3 = 14.
Notice that the matrix A is the same as the one in Example 3.7. Thus, the Cij are
readily available and, using (3.17), we obtain the solution x1 = 1, x2 = 2, x3 = 3.
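Formula (3.17) amounts to replacing the ith column of A by b and dividing determinants (Cramer's rule). The sketch below (Python with numpy; the helper name is ours) recovers the solution x1 = 1, x2 = 2, x3 = 3 of the system above.

```python
import numpy as np

def cramer_solve(A, b):
    """Solve Ax = b via x_i = det(A_i)/det(A), where A_i is A with column i replaced by b."""
    A, b = np.asarray(A, float), np.asarray(b, float)
    d = np.linalg.det(A)
    x = []
    for i in range(A.shape[1]):
        Ai = A.copy()
        Ai[:, i] = b
        x.append(np.linalg.det(Ai) / d)
    return x

A = [[1, 2, 0], [-1, 1, 1], [1, 2, 3]]
b = [5, 4, 14]
print(cramer_solve(A, b))   # [1.0, 2.0, 3.0]
```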
X = c1 X^(1) + c2 X^(2).
We note that the theorem does not hold for nonhomogeneous systems.
Definition 3.8 Any matrix obtained by omitting some rows or columns from a given
Am×n matrix is said to be a submatrix of A. We note a submatrix includes the matrix
A itself.
Example 3.9 The matrix
a11 a12 a13
A=
a21 a22 a23
□
Definition 3.9 The rank of a matrix A is the order of the largest square submatrix
with a nonzero determinant.
To expand on the above definition, a matrix A is said to be of rank r if it contains at
least one r-rowed square submatrix with nonvanishing determinant, while the deter-
minant of any square submatrix having r + 1 or more rows, possibly contained in A,
is zero.
Example 3.10 Let
A =
| −3   3   0 |
|  1  −2  −1 |
|  2   2   4 |
Then det(A) = 0. Now the 2×2 submatrix | 1 −2 ; 2 2 | has nonzero determinant,
and we conclude the rank of A is 2. □
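A computational check of Example 3.10 (Python with numpy assumed):

```python
import numpy as np

A = np.array([[-3, 3, 0], [1, -2, -1], [2, 2, 4]], dtype=float)

print(np.isclose(np.linalg.det(A), 0.0))           # True: the 3x3 determinant vanishes
print(np.linalg.det(np.array([[1., -2.], [2., 2.]])))  # 6.0, a nonzero 2x2 minor
print(np.linalg.matrix_rank(A))                     # 2
```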
Proof To simplify notation, we suppose that the submatrix R of order r in the upper
left corner of the matrix A has a nonvanishing determinant, and consider the
submatrix of A
M =
| a11  a12  · · ·  a1r  a1s |
| a21  a22  · · ·  a2r  a2s |
|  ..   ..          ..   ..  |
| ar1  ar2  · · ·  arr  ars |
| aq1  aq2  · · ·  aqr  aqs |
where s > r and q > r. Since A is of rank r, |M| = 0 for all such q and s. Now the
system,
where
a′qs = α1 a1s + α2 a2s + . . . + αr ars ,
and
|M| = ±( aqs − a′qs ) |R| = 0.
Hence the last row of M is a linear combination of the first r rows. Since this is true
for any q and s, the result follows.
Fitting a straight line through a group of points is a problem that can be solved using the
linear least squares fitting technique, which is the most straightforward and widely used
type of linear regression.
We begin with the simple problem of trying to find the straight line y = ax + b
that best fits N observations (xn, yn), n = 1, 2, . . . , N. From the linear
equation, it is intuitive to define the error by
E(a, b) = ∑_{n=1}^{N} ( yn − (a xn + b) )².      (3.18)
For best fitting, we must minimize the error given by (3.18). That is, we must find
the values of (a, b) such that
∂E/∂a = 0,  ∂E/∂b = 0.      (3.19)
Computing the partial derivatives, we find
∂E/∂a = 2 ∑_{n=1}^{N} ( yn − (a xn + b) )(−xn),
∂E/∂b = 2 ∑_{n=1}^{N} ( yn − (a xn + b) )(−1).      (3.20)
Setting ∂E/∂a = ∂E/∂b = 0, we arrive at the system of equations with two unknowns
( ∑_{n=1}^{N} xn² ) a + ( ∑_{n=1}^{N} xn ) b = ∑_{n=1}^{N} xn yn,
( ∑_{n=1}^{N} xn ) a + N b = ∑_{n=1}^{N} yn.      (3.21)
In matrix form,
| ∑xn²  ∑xn ; ∑xn  N | (a, b)^T = ( ∑xn yn, ∑yn )^T.
Consequently,
(a, b)^T = 1/( N ∑xn² − (∑xn)² ) | N  −∑xn ; −∑xn  ∑xn² | ( ∑xn yn, ∑yn )^T,
where all sums run from n = 1 to N.
The above concept can be easily generalized to functions that are not straight lines.
Given functions f1 , . . . , fk , find the values of coefficients a1 , . . . , ak , such that the
linear combination
y = a1 f1 (x) + . . . + ak fk (x)
is the best approximation to the data. Staying with the same set up, we define the
error by
E(a1, . . . , ak) = ∑_{n=1}^{N} ( yn − (a1 f1(xn) + . . . + ak fk(xn)) )².      (3.22)
To find the values of (a1 , . . . , ak ) we set
∂E/∂a1 = 0, . . . , ∂E/∂ak = 0.      (3.23)
To be more specific, we consider fitting the parabola y = a + bx + cx2 . Then
E(a, b, c) = ∑_{n=1}^{N} ( yn − (a + b xn + c xn²) )².
Then
∂E/∂a = ∂E/∂b = ∂E/∂c = 0,
implies that
2 ∑_{n=1}^{N} ( yn − (a + b xn + c xn²) )(−1) = 0,
2 ∑_{n=1}^{N} ( yn − (a + b xn + c xn²) )(−xn) = 0,
and
2 ∑_{n=1}^{N} ( yn − (a + b xn + c xn²) )(−xn²) = 0.      (3.24)
The system (3.24) reduces to
∑_{n=1}^{N} yn = N a + b ∑_{n=1}^{N} xn + c ∑_{n=1}^{N} xn²,
∑_{n=1}^{N} xn yn = a ∑_{n=1}^{N} xn + b ∑_{n=1}^{N} xn² + c ∑_{n=1}^{N} xn³,
and
∑_{n=1}^{N} xn² yn = a ∑_{n=1}^{N} xn² + b ∑_{n=1}^{N} xn³ + c ∑_{n=1}^{N} xn⁴.      (3.25)
Example 3.12 Suppose we want to find the values a, b, and c so that y = a + bx + cx2
is the best approximation to the data
(0, 1), (1, 1.8), (2, 1.3), (3, 2.5), (4, 6.3).
Then, we have N = 5. Using the given data, one can easily compute
∑xn = 10,  ∑yn = 12.9,  ∑xn² = 30,  ∑xn³ = 100,  ∑xn⁴ = 354,
∑xn yn = 37.1,  ∑xn² yn = 130.3,
where all sums run from n = 1 to 5.
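Solving the normal equations (3.25) with these sums gives the best-fitting parabola. A short sketch follows (Python with numpy assumed; np.polyfit is used only as an independent cross-check); it returns a ≈ 1.42, b ≈ −1.07, c ≈ 0.55.

```python
import numpy as np

# Data of Example 3.12
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 1.8, 1.3, 2.5, 6.3])

# Normal equations (3.25) for y = a + b x + c x^2
M = np.array([[len(x),       x.sum(),      (x**2).sum()],
              [x.sum(),      (x**2).sum(), (x**3).sum()],
              [(x**2).sum(), (x**3).sum(), (x**4).sum()]])
rhs = np.array([y.sum(), (x * y).sum(), (x**2 * y).sum()])
a, b, c = np.linalg.solve(M, rhs)
print(a, b, c)                    # approximately 1.42, -1.07, 0.55

# the same coefficients from a generic polynomial least-squares routine
print(np.polyfit(x, y, 2)[::-1])
```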
3.4.2 Exercises
Exercise 3.22 Use the method of Example 2 to find the inverse matrix of each of the
following matrices.
(a) A = | 3 5 ; 1 2 |,  (b) A = | 5 2 ; −7 −3 |,  (c) A = | 1 3 3 ; 1 3 4 ; 1 5 3 |.
Exercise 3.23 Prove Theorem 3.3 .
Exercise 3.24 Prove Theorem 3.4.
Exercise 3.25 Let A be an n × n matrix with det(A) ̸= 0. Show that
1
det(A−1 ) = .
det(A)
2x1 + 2x2 − x3 = 4
3x1 + x2 + 4x3 = −9
x1 + 2x2 + x3 = 1.
x1 − x2 − x3 + x4 = −2
2x1 + x2 + x3 + x4 = 3
−x1 + x2 + x3 = 1
2x1 + x3 − 3x4 = 7.
(1, 2), (2, 5), (3, 3), (4, 8), (5, 7).
Exercise 3.38 Generalize the method of least squares to find the function
y = am xm + am−1 xm−1 + . . . + a0 ,
Commonly, the real numbers or complex numbers are the field in the above defini-
tion.
Example 3.13 The set of real numbers R is a vector space over the field F = R under
the usual addition and multiplication. □
Example 3.14 The set Rn = {x = (x1 , x2 , . . . , xn )T } is a vector space over the field
F = R under the usual addition and multiplication. That is for x, y ∈ Rn and α ∈ F
the addition and multiplication are defined as
x + y = (x1 + y1 , x2 + y2 , . . . , xn + yn )T
and
αx = (αx1 , αx2 , . . . , αxn )T
is a vector space. □
Example 3.15 Let an ̸= 0, and define the set V = {p(x) = an xn + an−1 xn−1 + . . . +
a1 x + a0 : ai ∈ R, i = 0, 1, . . . , n}. Then V is a vector space over the field F = R.
We define the addition of two polynomial and multiplication as follows: if p, q ∈ V,
q(x) = bn xn + bn−1 xn−1 + . . . + b1 x + b0 : bi ∈ R, i = 0, 1, . . . , n then
(p+q)(x) = (an +bn )xn +(an−1 +bn−1 )xn−1 +. . .+(a1 +b1 )x+a0 +b0 = p(x)+q(x),
and
(α p)(x) = αan xn + αan−1 xn−1 + . . . + αa1 x + αa0 = α p(x),
for α ∈ R, we have V is a vector space. For example, if p(x) = 3x2 + 4x + 6, q(x) =
x3 − 2x2 + 2x + 5, then (p + q)(x) = x3 + (3 − 2)x2 + (2 + 4)x + 5 + 6, and (3p)(x) =
9x2 + 12x + 18. The additive inverse of p(x) is −p(x) = −an xn − an−1 xn−1 − . . . −
a1 x − a0 . □
Example 3.16 Let the set C(D) be the set of all continuous functions f : D → R. For
f , g ∈ C(D), we define addition and multiplication pointwise as follows:
( f + g)(x) = f (x) + g(x), for all x ∈ D,
and
(c f )(x) = c f (x), c ∈ R
then C(D) is a vector space. □
Definition 3.11 The vectors v1, v2, . . . , vn in V are said to be linearly dependent
if there exist constants c1, c2, . . . , cn, not all zero, such that
c1 v1 + c2 v2 + . . . + cn vn = 0.
If the vectors are not linearly dependent, then they are called linearly independent.
then,
(0, . . . , 0) = (c1 , c2 , . . . , cn ).
This has the only solution c1 = c2 = . . . = cn = 0. □
Definition 3.12 Let V be a vector space over R. Let v1 , v2 , . . . , vn in V. A vector v ∈ V
is a linear combination of {v1 , v2 , . . . , vn } if there exists scalars b1 , b2 , . . . , bn ∈ R such
that
v = b1 v1 + b2 v2 + . . . + bn vn .
Definition 3.13 (Span) The span of {v1, v2, . . . , vn} is defined as
span(v1, v2, . . . , vn) := { b1 v1 + b2 v2 + . . . + bn vn : b1, b2, . . . , bn ∈ R }.
Example 3.18 Consider the vectors v1 = (3, 2, 1), v2 = (2, −3, 2), and v3 = (−12, 5, −8). Then,
a1 v1 + a2 v2 + a3 v3 = (0, 0, 0)
implies that
(3a1 + 2a2 − 12a3 , 2a1 − 3a2 + 5a3 , a1 + 2a2 − 8a3 ) = (0, 0, 0).
Using Gauss-elimination method we see the system has infinitely many solutions;
namely,
a1 = 2a3 , a2 = 3a3 .
Setting a3 = 1, we obtain a solution (2, 3, 1). In general, the set of all solutions (or
solution space) is given by
S = { (a1, a2, a3) : a1 = 2a3, a2 = 3a3 } = span{ (2, 3, 1) }.
□
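A quick computational check of Example 3.18 (Python with numpy assumed; the vectors v1 = (3,2,1), v2 = (2,−3,2), v3 = (−12,5,−8) are the ones read off from the linear combination displayed above):

```python
import numpy as np

v1, v2, v3 = np.array([3, 2, 1]), np.array([2, -3, 2]), np.array([-12, 5, -8])

# the dependence found above: a1 = 2a3, a2 = 3a3; take a3 = 1
print(2 * v1 + 3 * v2 + 1 * v3)                              # [0 0 0]
print(np.linalg.matrix_rank(np.column_stack([v1, v2, v3])))  # 2 < 3: the vectors are dependent
```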
v = b1 v1 + b2 v2 + . . . + b p v p
and
v = c1 v1 + c2 v2 + . . . + c p v p .
By subtracting the two equations we arrive at
(b1 − c1) v1 + (b2 − c2) v2 + . . . + (bp − cp) vp = 0.      (3.26)
Since the set of vectors {v1 , v2 , . . . , v p } is linearly independent, the only solution to
equation (3.26) is
b1 − c1 = 0, b2 − c2 = 0, ..., b p − c p = 0.
Thus,
b1 = c1 , b2 = c2 , . . . , b p = c p .
This proves the necessary part of the lemma. For the proof of the sufficient condition,
for every v ∈ span{v1, v2, . . . , vp}, there are unique bi, i = 1, 2, . . . , p such that v =
b1 v1 + b2 v2 + . . . + bp vp. This implies that the zero vector v = 0 can be written as a
linear combination of v1 , v2 , . . . , v p , only when
b1 = b2 = . . . = b p = 0.
This shows the set of vectors {v1 , v2 , . . . , v p } is linearly independent. This completes
the proof.
Definition 3.14 (subspace) Let V be a vector space over F. Then U is a subspace of
V if and only if the following properties are satisfied:
1. 0 ∈ U; additive identity
2. If u1 , u2 ∈ U, then u1 + u2 ∈ U; (closure under addition)
3. For scalar a ∈ F, u ∈ U, then au ∈ U; (closure under scalar multiplication).
Example 3.19 The set
U = {(a, 0) | a ∈ R}
is a subspace of R2 . □
The set
U = {(a, b, c) ∈ R³ | b + 4c = 0}
is a subspace of R³. To see this, we make sure the requirements of Definition 3.14 are met.
As for 1., we easily see that (0, 0, 0) ∈ U, since b + 4c = 0 is satisfied. To verify 2.,
we let u = (u1, u2, u3) and v = (v1, v2, v3) be in U, so that u2 + 4u3 = 0 and v2 + 4v3 = 0. Then
(u2 + v2) + 4(u3 + v3) = (u2 + 4u3) + (v2 + 4v3) = 0,
which shows that u + v = (u1 + v1, u2 + v2, u3 + v3) ∈ U. It remains to show that 3. holds. Let α ∈ R
and u = (u1, u2, u3) ∈ U. Then αu = (αu1, αu2, αu3) satisfies αu2 +
4αu3 = α(u2 + 4u3) = 0, and so αu ∈ U. □
Lemma 8 Let v1 , v2 , . . . , vn be vectors in the vector space V. Then
1. v j ∈ span(v1 , v2 , . . . , vn ),
2. span(v1 , v2 , . . . , vn ) is a subspace of V.
3. If v1 , v2 , . . . , vn are vectors in the vector space V and U ⊂ V is a subspace such
that v1 , v2 , . . . , vn ∈ U, then span(v1 , v2 , . . . , vn ) ⊂ U.
implies that
(x1 , x2 , x3 ) = (2a1 + 2a2 , 2a1 − 2a2 , 0).
Clearly, a1 = (x1 + x2)/4 and a2 = (x1 − x2)/4 form a solution for any x1, x2 ∈ R and x3 = 0. □
Definition 3.15 If span(v1 , v2 , . . . , vn ) = V, then we say that (v1 , v2 , . . . , vn ) spans V.
In this case the vector space V is finite-dimensional . A vector space that is not finite-
dimensional is called infinite-dimensional .
u = u1 e1 + u2 e2 + . . . + un en .
Vn [x] ⊂ V [x].
This is the case since the zero polynomial is in Vn [x]. Moreover, Vn [x] is closed under
vector addition and scalar multiplication. Since
Vn [x] = span(1, x, x2 , . . . , xn ),
the subspace Vn [x] is of finite dimension. On the other hand, we assert that V [x] is
infinite-dimensional. Assume the contrary; that is
xn+1 ∈
/ span(p1 (x), p2 (x), . . . , pk (x)).
Observe that the set {1, x, x2 } is a basis for the vector space of polynomials in x
with real coefficients having degree at most 2. Note that V2[x] contains infinitely many
polynomials of degree at most 2, yet we managed to have a description of all of them
using the set {1, x, x²}.
Recall that a vector space V is called finite-dimensional if V has a basis consisting of
a finite numbers of vectors; otherwise, V is infinite-dimensional.
Remark 11 The dimension of a vector space is the number of vectors in a basis.
It can be shown that in an n-dimensional vector space, any set of n + 1 vectors is
linearly dependent. Thus, the dimension of a vector space could be defined as the
number of vectors in a maximal linearly independent set.
Remark 12 By Lemma 7, If {v1 , v2 , . . . , vn } forms a basis of V, then every vector
v ∈ V can be uniquely written as a linear combination of v1 , v2 , . . . , vn .
To see the difference between basis and span, we consider the vectors
3.5.1 Exercises
Exercise 3.39 Show the set
U = {(a, b, c) ∈ R3 | a + b + 4c = 0}
is a vector space under the usual operations vector addition and scalar multiplica-
tion on R3 .
Exercise 3.40 Show the set
U = {(a, b, c) ∈ R3 | a + 2b = 0}
is a subspace under the usual operations vector addition and scalar multiplication
on R3 .
Exercise 3.41 Show the set
U = {(a, 0) ∈ R2 | a ∈ R}
Exercise 3.43 Redo Example 3.18 for the following set of vectors.
(a) v1 = (1, 1, 1), v2 = (1, 2, 0), and v3 = (0, −1, 1).
(b) v1 = (1, 1, 1), v2 = (1, 2, 0), and v3 = (0, −1, 2).
Exercise 3.44 Explain why the set of vectors given by
is linearly dependent.
Exercise 3.45 Prove 3. of Lemma 8.
Exercise 3.46 Either show the set is a vector space or explain why it is not. All
functions are assumed to be continuous.
(a) U = {(a, 2) ∈ R2 |a ∈ R} under the usual operations of addition and multiplica-
tion on R2 .
(b) U = {(a, b) ∈ R2 | a, b ≥ 0} under the usual operations of addition and multipli-
cation on R2 .
d
(c) U = { f : R → R | dx f exists} under the usual operations of addition and multi-
plication on functions.
(d) U = { f : R → R | f (x) ̸= 0 for any x ∈ R} under the usual operations of addition
and multiplication on functions.
(e) The solution set to a linear nonhomogeneous equations.
(f) U = {A2×2 | det(A) = 0} under the usual operations of addition and multiplica-
tion for matrices.
(g) U = { f : [−1, 1] → [−1, ∞)} under the usual operations of addition and multi-
plication on functions.
(h) U = { f : R → R | f (0) = 0} under the usual operations of addition and multi-
plication on functions.
(i) U = { f : R → R | f (x) ≤ 0, for all x ∈ R} under the usual operations of addition
and multiplication on functions.
Exercise 3.47 Show that any set of vectors {v1 , v2 , . . . , vn }, which spans a vector
space V contains a linearly independent subset which also spans V.
Exercise 3.48 Show the vectors v1 = (1, 1), v2 = (1, 2), and v3 = (1, 0) span
R2 .
Exercise 3.49 Give a basis of
M2×2 = { | a b ; c d | : a, b, c, and d ∈ R }.
3.6 Eigenvalues-Eigenvectors
For motivational purposes, we begin with the following example.
Example 3.27 (Lotka–Volterra Predator–Prey Model) We consider the Lotka–
Volterra Predator–Prey model. Let x = x(t) and y = y(t) be the number of preys
and predators at time t, respectively. To keep the model simple, we will make the
following assumptions:
• the predator species is dependent on a single prey species as its only food supply,
• the prey species has an unlimited food supply, and
• there is no threat to the prey other than the specific predator.
We observe that, in the absence of predation, the prey population would grow at a
natural rate
dx/dt = ax,  a > 0.
On the other hand, in the absence of prey, the predator population would decline at a
natural rate
dy/dt = −cy,  c > 0.
The effects of predators eating prey is an interaction rate of decline (−bxy, b > 0)
in the prey population x, and an interaction rate of growth (dxy, d > 0) of predator
population y. Hence, one obtains the predator-prey model
dx/dt = ax − bxy,
dy/dt = −cy + dxy.      (3.27)
The Lotka-Volterra model consists of a system of linked differential equations that
cannot be separated from each other and that cannot be solved in closed form. Since,
(0, 0) is a solution of the system, we linearize around it and rewrite the system as
X′ = AX + g,      (3.28)
where
X = (x, y)^T,   A = | a 0 ; 0 −c |,   g = (−bxy, dxy)^T.
Since the function g is continuously differentiable in both variables near the origin,
the stability of the nonlinear system (3.28) is heavily influenced by the stability of
linear system
X ′ = AX. (3.29)
We search for solutions to (3.29) of the form
X = z e^{λt},
where z = (z1, z2)^T, for a parameter λ. Substituting into (3.29) we arrive at the relation
Az = λz.      (3.30)
It is evident that the zero vector z = (0, 0)^T is a solution of (3.30) for any value of λ.
We are interested in the values of λ for which (3.30) has a nonzero solution. Such
values are called eigenvalues and the corresponding vector solutions given by z are
called eigenvectors. We have this important definition below. □
Definition 3.17 Let A be an n × n constant matrix, in short “matrix.” A number λ is
said to be an eigenvalue of A if there exists a nonzero vector v such that
Av = λ v. (3.31)
is a solution of
X ′ = AX. (3.32)
Ax = λ x, (3.33)
where
A = | a11 a12 ··· a1n ; a21 a22 ··· a2n ; ··· ; an1 an2 ··· ann |,   x = (x1, x2, . . . , xn)^T.
By transferring the terms on the right-hand side to the left-hand side, we arrive
at
(A − λI)x = 0.
By Corollary 4 this homogeneous system has a nontrivial solution if and only if the
corresponding determinant of the coefficients is zero. That is,
D(λ) = det(A − λI) =
| a11 − λ    a12       ···    a1n     |
| a21        a22 − λ   ···    a2n     |
|   ..         ..       ..     ..     |
| an1        an2       ···   ann − λ  |  = 0.      (3.34)
Equation (3.34) is called the characteristic equation corresponding to the matrix A.
By expanding D(λ ) we obtain a polynomial of nth degree in λ . This is called the
characteristic polynomial corresponding to the matrix A. Thus, we have proved the
following theorem.
Theorem 3.13 The eigenvalues of an n × n matrix A are the roots of its correspond-
ing characteristic equation (3.34).
In general, if D(λ) is an nth degree polynomial, then it can be factored into linear
factors over C. For example, if
D(λ) = λ⁵ − 5λ⁴ + 8λ³ − 4λ²,
then
D(λ) = λ²(λ − 1)(λ − 2)².
The solution of a given linear system of differential equations is the focus of the
following theorem.
for constants ci , i = 1, 2, . . . , n.
Example 3.28 Consider a linear homogeneous system of differential equations x′ = Ax whose characteristic equation is
(5 − λ)(8 − λ)(4 − λ) = 0,
so that the eigenvalues are λ1 = 5, λ2 = 8, and λ3 = 4.
To compute the corresponding eigenvectors, we let K1 = (k1, k2, k3)^T. Then using (3.31)
we have (A − λI)K1 = 0, which is system (3.35).
From the third and second equations it is obvious that, with λ = 5, k3 = k2 = 0.
The first equation then reduces to 0k1 = 0, from which we conclude that k1
is arbitrary. So, if we set k1 = 1, then the corresponding eigenvector is
K1 = (1, 0, 0)^T. Similarly, if we substitute λ = 8 in (3.35), we arrive at the corresponding
eigenvector K2 = (2, 3, 0)^T. Finally, the third eigenvector, corresponding to λ = 4, is
K3 = (6, 3, −4)^T. Using Theorem 3.15, we arrive at the solution
x(t) = c1 (1, 0, 0)^T e^{5t} + c2 (2, 3, 0)^T e^{8t} + c3 (6, 3, −4)^T e^{4t}.
□
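The coefficient matrix of Example 3.28 is not reproduced above. The sketch below (Python with numpy assumed) uses a matrix reconstructed to be consistent with the stated eigenpairs — an assumption, not the text's own display — and verifies both the eigenpairs and that x(t) satisfies x′ = Ax.

```python
import numpy as np

# A matrix consistent with eigenvalues 5, 8, 4 and eigenvectors
# K1 = (1,0,0), K2 = (2,3,0), K3 = (6,3,-4) (a reconstruction, not shown in the text).
A = np.array([[5.0, 2.0, 3.0],
              [0.0, 8.0, 3.0],
              [0.0, 0.0, 4.0]])

for lam, K in [(5, [1, 0, 0]), (8, [2, 3, 0]), (4, [6, 3, -4])]:
    K = np.array(K, dtype=float)
    print(lam, np.allclose(A @ K, lam * K))   # True for each eigenpair

# x(t) = c1 K1 e^{5t} + c2 K2 e^{8t} + c3 K3 e^{4t}; check x' = A x at a sample time
c, t = np.array([1.0, -2.0, 0.5]), 0.3
K = np.array([[1, 2, 6], [0, 3, 3], [0, 0, -4]], dtype=float)   # eigenvectors as columns
lam = np.array([5.0, 8.0, 4.0])
x  = K @ (c * np.exp(lam * t))
xp = K @ (c * lam * np.exp(lam * t))
print(np.allclose(xp, A @ x))   # True
```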
In some cases a repeated eigenvalue gives one independent eigenvector and the others
must be found using the following method as the next example demonstrates.
We consider the system
x′ = | 3  −18 ; 2  −9 | x.      (3.36)
Then the coefficient matrix has the repeated eigenvalue λ1 = λ2 = −3. If K1 = (k1, k2)^T
is the corresponding eigenvector, then we have the two equations 6k1 − 18k2 =
0, 2k1 − 6k2 = 0, which are both equivalent to k1 = 3k2. By setting k2 = 1, we
obtain the single eigenvector K1 = (3, 1)^T, and it follows that the corresponding solution
is given by
φ1 = (3, 1)^T e^{−3t}.
But since we are interested in finding the general solution, we need to examine the
question of finding another solution.
In general, if m is a positive integer and (λ − λ1 )m is a factor of the characteristic
equation det(A − λ I) = 0, while (λ − λ1 )m+1 is not a factor, then λ1 is said to be an
eigenvalue of multiplicity m. Below, we discuss two such scenarios:
(a) For some n × n matrices A it may be possible to find m linearly independent eigen-
vectors K1, K2, . . . , Km corresponding to an eigenvalue λ1 of multiplicity m ≤ n.
In this case the general solution of the system contains the linear combination
φ1 = K11 e^{λ1 t}
φ2 = K21 t e^{λ1 t} + K22 e^{λ1 t}
. . .
φm = K_{m1} t^{m−1}/(m−1)! e^{λ1 t} + K_{m2} t^{m−2}/(m−2)! e^{λ1 t} + . . . + K_{mm} e^{λ1 t},
where Ki j are columns vectors that can always be found, and they are known as gen-
eralized eigenvectors. For an illustration of case (b), we suppose λ1 is an eigenvalue
of multiplicity two with only one corresponding eigenvector K1 . To find the second
eigenvector, we assume a second solution of
X ′ = AX
of the form
φ2(t) = K1 t e^{λ1 t} + P e^{λ1 t},      (3.37)
where
P = (p1, p2, . . . , pn)^T   and   K1 = (k1, k2, . . . , kn)^T
are to be found. Differentiate φ2 (t) and substitute back into x′ = Ax to get
Since the above equation must hold for all t, it follows that
(A − λ1 I)K1 = 0, (3.38)
and
(A − λ1 I)P = K1 . (3.39)
Equation (3.38) reaffirms that K1 is the eigenvector of A associated with the eigenvalue
λ1. Thus, we obtain one solution φ1(t) = K1 e^{λ1 t}. To find the second solution given
by (3.37), we must solve for the vector P in (3.39). To find a second solution for (3.36),
we let P = (p1, p2)^T. Then from equation (3.39), we have (A + 3I)P = K1, which implies
that 6p1 − 18p2 = 3, or 2p1 − 6p2 = 1. Since these two equations are equivalent,
we may choose p1 = 1 and find p2 = 1/6. However, for simplicity, we shall choose
p1 = 1/2 so that p2 = 0. Using (3.37) we find that
φ2(t) = (3, 1)^T t e^{−3t} + (1/2, 0)^T e^{−3t}.
Finally, the general solution is
X = c1 φ1 (t) + c2 φ2 (t).
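A numerical check of the repeated-eigenvalue construction (Python with numpy assumed; the matrix is the one of (3.36)):

```python
import numpy as np

A = np.array([[3.0, -18.0], [2.0, -9.0]])   # the matrix of (3.36)
K1 = np.array([3.0, 1.0])
P  = np.array([0.5, 0.0])

print(np.allclose((A + 3 * np.eye(2)) @ K1, 0))   # (3.38): K1 is an eigenvector for lambda = -3
print(np.allclose((A + 3 * np.eye(2)) @ P, K1))   # (3.39): P is a generalized eigenvector

def phi2(t):
    return (K1 * t + P) * np.exp(-3 * t)

# numerical check that phi2 solves X' = AX
t, h = 0.7, 1e-6
deriv = (phi2(t + h) - phi2(t - h)) / (2 * h)
print(np.allclose(deriv, A @ phi2(t), atol=1e-4))  # True
```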
3.6.1 Exercises
Exercise 3.51 Find the eigenvalues and the corresponding eigenvectors.
(a) | 2 5 ; 3 8 |,  (b) | 3 −1 0 ; 4 0 0 ; 2 5 −3 |,  (c) | 13 −3 5 ; 0 4 0 ; −15 9 −7 |.
Exercise 3.52 Show that if A is an n × n matrix with det(A) ̸= 0, then all of its eigen-
values are different from zero.
Exercise 3.53 Show that if A is an n × n matrix with eigenvalues λi , i = 1, 2, . . . , n,
then the eigenvalues of A2 are λi2 , i = 1, 2, . . . , n.
Exercise 3.54 Solve the following systems of differential equations.
(a) x′ = | 5 −1 0 ; 0 −5 9 ; 5 −1 0 | x,  (b) x′ = | 1 2 ; 4 3 | x,
(c) x′ = | 3 −1 −1 ; 1 1 −1 ; 1 −1 1 | x,  (d) x′ = | −4 2 ; −5/2 2 | x.
Exercise 3.55 Let A and B be two square matrices with AB = BA. Let λ be an eigen-
value of A with corresponding eigenvector k. If Bk ̸= 0, show that Bk is an eigenvector
of A, with eigenvalue λ .
Exercise 3.56 Let A be an n × n matrix. Show that if the sum of all entries of each
column is r, then r is an eigenvalue of A.
Exercise 3.57 Let A be a non-zero n × n matrix. Show that if AT = λ A, then λ = ±1.
Exercise 3.58 Solve
(a) x′ = | 1 −2 2 ; −2 1 −2 ; 2 −2 1 | x,
(b) x′ = | 3 −18 ; 2 −9 | x.
Definition 3.18 (Inner product) If for any vectors u, v, and w in a vector space V and
a scalar a ∈ R we can define an inner (or scalar) product (u, v) such that
1. (u, v) = (v, u),
2. (u, v + w) = (u, v) + (u, w),
3. (au, v) = a(u, v),
4. (u, u) ≥ 0, and (u, u) = 0 if and only if u = 0,
Clearly, (3.40) satisfies 1. − 4. For the purpose of illustration, we quickly go over the
verifications. Now
n n
(u, v) = ∑ ui vi = ∑ vi ui = (v, u).
i=1 i=1
This verifies 1. As for 2. we let w = (w1 , w2 , . . . , wn ) ∈ V. Then
(u, v + w) = (u1 , u2 , . . . , un ) · (v1 + w1 , v2 + w2 , . . . , vn + wn )
= u1 (v1 + w1 ) + u2 (v2 + w2 ) + . . . + un (vn + wn )
= u1 v1 + u2 v2 + . . . un vn + u1 w1 + u2 w2 + . . . + un wn
= (u, v) + (u, w).
On the other hand,
(au, v) = (au1 , au2 , . . . , aun ) · (v1 , v2 , . . . , vn )
= au1 v1 + au2 v2 + . . . + aun vn
= a(u1 v1 + u2 v2 + . . . + un vn )
= a(u, v).
This verifies 3. For verifying 4. we see that
(u, u) = u21 + u22 + . . . + u2n = 0
if and only if u1 = u2 = . . . = un = 0. Thus, (u, u) > 0, if and only if, u ̸= 0. We
conclude that (3.40) defines an inner product. We note that if u and v are two vectors
in Rn , then
(u, v) = uvT = vuT = u1 v1 + u2 v2 + . . . + un vn .
□
For the next example we define the space C0 [a, b] to be the set of all continuous
functions f : [a, b] → R.
Example 3.30 Consider the vector space C0 [a, b]. Let f , g ∈ C0 [a, b]. If we define
b
( f , g) = f (x)g(x)dx, (3.41)
a
0 ≤ ||y + λ z||2 = (y + λ z, y + λ z)
= (y, y) + 2λ (y, z) + λ 2 (z, z)
= ||y||2 + 2(y, z)λ + ||z||2 λ 2 ,
By remarking that (y, z)2 = ||(y, z)||2 , the above inequality gives
∥y + z∥ ≤ ∥y∥ + ∥z∥.
AT A = I, or A−1 = AT .
iii) A set of vectors S = {vi }ni=1 is called orthonormal if every vector in S has mag-
nitude 1 and the set of vectors are mutually orthogonal.
For example, the matrix
A = (1/3) | 2 −2 1 ; 1 2 2 ; 2 1 −2 |
is an orthogonal matrix.
The real numbers a and b are called the real and imaginary parts of z, respectively.
The set of all complex numbers is denoted by C. The number z̄ = a − ib is called the
conjugate of z.
By placing a on the x-axis and b on the y-axis, we can interpret z = a + ib as a vector
from the origin terminating at (a, b). The length of the vector is called the modulus
or magnitude of z and is denoted by
||z|| = √(a² + b²).
Notice that
||z||2 = a2 + b2 = zz̄.
Next, we state two of the most important characteristics of symmetric matri-
ces.
Theorem 3.16 Suppose A is an n × n real symmetric matrix.
a) If λ1 and λ2 are two distinct eigenvalues of A, then their corresponding eigen-
vectors y1 and y2 are orthogonal. That is
(y1 , y2 ) = 0.
Multiplying the transpose of the first equation in (3.43) from the right by y2 we
get
(Ay1 )T y2 = λ1 yT1 y2 ,
or
yT1 AT y2 = λ1 yT1 y2 .
Multiplying the second equation in (3.43) from the left by y1^T, we obtain
y1^T A y2 = λ2 y1^T y2. Since A^T = A, subtracting the two relations gives (λ1 − λ2) y1^T y2 = 0, and since λ1 ≠ λ2, it follows that
y1^T y2 = (y1, y2) = 0.
This proves the first part. As for the second part, suppose λ is a complex eigenvalue
of the symmetric matrix A with a (possibly complex) eigenvector v such that
Av = λv. Taking the complex conjugate on both sides gives Av̄ = λ̄v̄. Using A^T = A, we have the following
manipulation:
v̄^T A v = v̄^T (Av) = v̄^T (λv) = λ (v̄, v).
Similarly,
v̄^T A v = (A v̄)^T v = (λ̄ v̄)^T v = λ̄ (v̄, v).
Subtracting the above two expressions, we obtain (λ − λ̄)(v̄, v) = 0. Since v ≠ 0, we have (v̄, v) > 0, and hence λ = λ̄; that is, λ is real.
Next, we define the Gram-Schmidt process which is a procedure that converts a set
of linearly independent vectors into a set of orthonormal vectors that spans the same
space as the original set.
Theorem 3.17 (Gram-Schmidt Process) Suppose the vectors u1 , u2 , . . . , un form a
basis for a vector space V. Then, from the vectors ui , i = 1, 2, . . . , n we can form an
orthonornal basis x1 , x2 , . . . , xn for V.
v3 = u3 − c1 x1 − c2 x2.
Then we want
(x1, v3) = (x1, u3) − c1 (x1, x1) − c2 (x1, x2) = 0.
Since (x1, x2) = 0, the above expression implies that c1 = (x1, u3). Also, we want
(x2, v3) = (x2, u3) − c1 (x2, x1) − c2 (x2, x2) = 0,
which gives c2 = (x2, u3). So we take
x3 = v3 / ||v3||.
Continuing in this process, we obtain a general formula for all vectors, given by
x_j = v_j / ||v_j||,   where   v_j = u_j − ∑_{i=1}^{j−1} (x_i, u_j) x_i.      (3.44)
According to (3.44), we have
x1 = u1 / ||u1|| = (1/2)(1, 1, 1, 1)^T
and x2 = v2 / ||v2||, where v2 = u2 − (x1, u2)x1. Now,
v2 = (1, 1, −1, −1)^T − (1/4)(0)(1, 1, 1, 1)^T = (1, 1, −1, −1)^T.
Thus, x2 = v2 / ||v2|| = (1/2)(1, 1, −1, −1)^T. Similarly, x3 = v3 / ||v3||, where v3 = u3 − (x1, u3)x1 − (x2, u3)x2.
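The recursion (3.44) is straightforward to implement. The following sketch (Python with numpy assumed; the function name is ours) orthonormalizes the two vectors of Exercise 3.71(c) as an illustration.

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthonormalize a list of linearly independent vectors using (3.44)."""
    basis = []
    for u in vectors:
        v = np.array(u, dtype=float)
        for x in basis:
            v = v - np.dot(x, v) * x          # subtract the projection on each earlier x_i
        basis.append(v / np.linalg.norm(v))
    return basis

# the vectors of Exercise 3.71(c)
u1, u2 = [1, 1, 0], [2, 1, 1]
x1, x2 = gram_schmidt([u1, u2])
print(x1, x2)
print(np.dot(x1, x2), np.linalg.norm(x1), np.linalg.norm(x2))   # 0.0, 1.0, 1.0
```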
Proof We only prove part (c). For parts (a) and (b), see Exercise 3.74. Since A and
B are similar, there exists a nonsingular matrix P such that A = PBP−1 . Using I =
PP^{−1}, we have
det(A − λI) = det(PBP^{−1} − λPP^{−1}) = det(P(B − λI)P^{−1}) = det(P) det(B − λI) det(P^{−1}) = det(B − λI),
so that A and B share the same characteristic polynomial. This concludes the proof.
3.7.1 Exercises
Exercise 3.59 Verify C0 [a, b] of Example 3.30 is an inner product space.
Exercise 3.60 Let V = Rn and for y ∈ V show that
defines a norm.
Exercise 3.61 For f ∈ C0 ([a, b]), we define
and b
|| f ||1 = | f (x)|dx.
a
Show that || f ||M and || f ||1 , define norms on C0 ([a, b]).
Exercise 3.62 Show that every finite-dimensional inner product space has an or-
thonormal basis.
Exercise 3.63 Every orthonormal list of vectors in V can be extended to an or-
thonormal basis of V.
Exercise 3.64 Use the inner product defined by (3.41) to find all values of a so that
the two functions
f (x) = ax, g(x) = x2 − ax + 2
are orthogonal on [0, 1].
Exercise 3.65 Show the two vectors
1 2
u1 = −2 , u2 = 3 ,
1 4
are orthogonal but not orthonormal. Use u1 , u2 to form two vectors v1 , and v2 that
are orthonormal and span the same space.
Exercise 3.66 Suppose the vectors u1 , u2 , . . . , un are orthogonal to the vector y. Then
show that any vector in the span(u1 , u2 , . . . , un ) is orthogonal to y.
Exercise 3.67 For any two vectors u and v in a vector space V, show that
(1/2)( ||u + v||² + ||u − v||² ) = ||u||² + ||v||².
Exercise 3.68 Let w : R → (0, ∞) be continuous. Show that for any two polynomials
f and g of degree n,
b
( f , g) = f (x)g(x)w(x)dx
a
defines an inner product. Actually this is called “weighted inner product.”
Exercise 3.69 Consider the vector space C over R. Show that if z, w ∈ C, then
1
(z, w) = (zw̄ + wz̄),
2
is an inner product.
Exercise 3.70 Use the inner product defined by (3.41) to show that the set of func-
tions
{ fn(x) }_{n=1}^{∞} = { sin(nπ ln(x)) / √x },   x ∈ [1, e],
is orthogonal.
Exercise 3.71 Apply the Gram-Schmidt process to the following vectors
(a) u1 = (5, −2, 4)^T,  u2 = (3, −1, 7)^T,  u3 = (3, −3, 6)^T.
(b) u1 = (1, 1, 0, 1)^T,  u2 = (1, −2, 0, 0)^T,  u3 = (1, 0, −1, 2)^T.
(c) u1 = (1, 1, 0)^T,  u2 = (2, 1, 1)^T.
Exercise 3.72 Consider the three n × n matrices A, B, and C. Show that
(a) A is similar to A. (Reflexive)
(b) If A is similar to B, then B is similar to A. (Symmetric)
(c) If A is similar to B, and B is similar to C, then A is similar to C. (Transitive)
Exercise 3.73 Find the matrix A that is similar to the matrix B given that
−13 −8 −4 1 1 2
B = 12 7 4 , P = −2 −1 −3 .
24 16 7 1 −2 0
3.8 Diagonalization
In this section, we look at the concept of matrix diagonalization, which is the process
of transformation on a matrix in order to recover a similar matrix that is diagonal.
Once a matrix is diagonalized, it becomes very easy to raise it to integer powers. We
begin with the following definition.
Definition 3.24 Let A be an n × n matrix. We say that A is diagonalizable if there
exists an invertible matrix P such that
D = P−1 AP
yields
Ap1 = k1 p1, Ap2 = k2 p2, . . . , Apn = kn pn,
where Api = ki pi, i = 1, 2, . . . , n, are the successive columns of AP. Since P is in-
vertible, each of its column vectors is nonzero. Thus the above relation implies that
k1, k2, . . . , kn are eigenvalues of A with corresponding eigenvectors p1, p2, . . . , pn.
Since P is invertible, it follows from Corollary 5 that p1, p2, . . . , pn are linearly
independent eigenvectors. As for (b) implying (a), we assume p1, p2, . . . , pn are
linearly independent eigenvectors with corresponding eigenvalues k1, k2, . . . , kn.
Let the matrix P be given as in the proof of part (a). Then the product AP has the columns Api, i = 1, 2, . . . , n. But Api = ki pi, i = 1, 2, . . . , n, and this translates into
AP =
| k1 p11  k2 p12  ···  kn p1n |
| k1 p21  k2 p22  ···  kn p2n |
|   ..      ..           ..   |
| k1 pn1  k2 pn2  ···  kn pnn |
 =
| p11  p12  ···  p1n |   | k1  0   ···  0  |
| p21  p22  ···  p2n |   | 0   k2  ···  0  |
|  ..   ..        ..  |   | ..       ..  .. |
| pn1  pn2  ···  pnn |   | 0   ···  0   kn |
 = PD,
where D is the diagonal matrix having as its diagonal entries the eigenvalues k1, k2, . . . , kn. The matrix P is invertible, since its column vectors are linearly independent. Thus, the relation AP = PD
implies that D = P^{−1}AP. This completes the proof.
Note that not every matrix is diagonalizable. To see this we consider the matrix
A = | 0 1 ; 0 0 |. Then 0 is the only eigenvalue; if A were diagonalizable we would have D = 0 and hence A = PDP^{−1} = 0, but A is not the zero matrix.
In summary, to diagonalize a matrix, one should perform the following steps:
(1) Compute the eigenvalues of A and the corresponding n linearly independent
eigenvectors.
(2) Form the matrix P by taking its columns to be the eigenvectors found in step (1).
(3) The diagonalization is done and given by D = P−1 AP.
Another advantage is that once a matrix is diagonalized, it is easy to find its in-
verse if it has one. To see this, let PDP^{−1} = A. Then A^{−1} = (PDP^{−1})^{−1} = PD^{−1}P^{−1},
where
D^{−1} =
| 1/d11    0     ···    0    |
|   0    1/d22   ···    0    |
|   ..     ..     ..     ..  |
|   0     ···     0    1/dnn |
Hence,
P = | 1  1 ; −1  2 |,   and   P^{−1} = (1/3) | 2  −1 ; 1  1 |.
Finally, for a positive integer k, we have
A^k = P D^k P^{−1} = | 1  1 ; −1  2 | | (−1)^k  0 ; 0  5^k | (1/3) | 2  −1 ; 1  1 |
    = (1/3) | 2(−1)^k + 5^k   −(−1)^k + 5^k ; −2(−1)^k + 2·5^k   (−1)^k + 2·5^k |.
□
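The original matrix of this example is not displayed above. Assuming A = [[1, 2], [4, 3]], which is consistent with the P and D shown (eigenvalues −1 and 5), the three steps can be checked as follows (Python with numpy assumed).

```python
import numpy as np

# Assumed matrix, consistent with D = diag(-1, 5) and P = [[1, 1], [-1, 2]] above.
A = np.array([[1.0, 2.0], [4.0, 3.0]])
vals, P = np.linalg.eig(A)
D = np.diag(vals)

print(vals)                                            # eigenvalues -1 and 5 (order may vary)
print(np.allclose(np.linalg.inv(P) @ A @ P, D))        # True: D = P^{-1} A P

k = 5
Ak = P @ np.diag(vals**k) @ np.linalg.inv(P)
print(np.allclose(Ak, np.linalg.matrix_power(A, k)))   # True: A^k = P D^k P^{-1}
```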
Definition 3.25 Let A be an n × n matrix. We say that A is orthogonally diagonaliz-
able if there exists an orthogonal matrix P such that
D = P−1 AP
The proof follows along the lines of Theorem 3.20. The only change is to apply the
Gram-Schmidt process to obtain an orthonormal basis for each eigenspace. Then form
P whose columns are the orthonormal basis vectors. This matrix orthogonally diagonalizes
A. Note that the proof of (c) implies (a) is a bit demanding and we refer to [8]. As
for the proof of (a) implies (c), we have that D = P^{−1}AP, or A = PDP^{−1}. Since P is
orthogonal, we have A = PDP^T. Therefore, A^T = (PDP^T)^T = PD^T P^T = PDP^T = A, so that A is symmetric.
Expanding the determinant along the first row we obtain the cubic equation
λ (λ − 9)2 = 0,
Thus,
P =
|  1/3   −2/√5   2/(3√5) |
|  2/3    1/√5   4/(3√5) |
| −2/3     0     5/(3√5) |
□
3.8.1 Exercises
Exercise 3.75 Diagonalize each of the following matrices and find A^100.
(a) A = | 1 4 ; 4 3 |,  (b) A = | 5 −3 ; 6 −4 |,  (c) A = | 5 −3 ; −6 2 |.
Exercise 3.76 Diagonalize each of the following matrices.
(a) A = | 2 0 0 ; 0 2 2 ; 0 0 4 |,  (b) A = | 5 0 0 ; 2 6 0 ; 3 2 1 |.
Exercise 3.77 Explain why the matrix A = | 2 4 6 ; 0 2 2 ; 0 0 4 | is not diagonalizable.
Exercise 3.78 Show that if B is diagonalizable and invertible, then so is B−1 .
Exercise 3.79 Let
p11 p12
P= .
p21 p22
Show that:
(a) P is diagonalizable if (p11 − p22 )2 + 4p12 p21 > 0.
(b) P is not diagonalizable if (p11 − p22 )2 + 4p12 p21 < 0.
Exercise 3.80 Show that if A and B are orthogonal matrices, then AB is also orthog-
onal.
Exercise 3.81 Find the matrix P that orthogonally diagonalizes each of the follow-
ing matrices.
2 1 −1 3 2 6
(a). A = 0 1 1 , (b). A = −6 3 2 .
1 −1 1 2 6 −3
Exercise 3.82 Show that if P is orthogonal, then aP is orthogonal if and only if a = 1
or a = −1.
where
x = (x1, x2, . . . , xn)^T   and   A = (a_ij)_{n×n}.
The matrix is
A =
| −1   −1   7/2 |
| −1    3    2  |
| 7/2   2    1  |
It is clear that A^T = A and x^T A x = Q. □
Definition 3.27 Let x ∈ Rn and suppose A is an n × n constant symmetric matrix.
Then the quadratic form
Q(x) = xT Ax
is
(a) positive definite if Q(x) > 0, for all x ̸= 0,
(b) negative definite if Q(x) < 0, for all x ̸= 0,
Then
x^T A x = (x1, x2, . . . , xn) | k1 0 ··· 0 ; 0 k2 ··· 0 ; ··· ; 0 ··· 0 kn | (x1, x2, . . . , xn)^T
        = k1 x1² + k2 x2² + . . . + kn xn².
Thus, for any n×n diagonal matrix A, Q(x) is positive definite provided that ki > 0 for each i = 1, 2, . . . , n. □
Theorem 3.22 Let A = | a b ; b c | and consider the quadratic form
Q(x, y) = (x, y) A (x, y)^T = a x² + 2b xy + c y².
Completing the square (for a ≠ 0) gives
Q(x, y) = a ( x + (b/a) y )² + ( (ac − b²)/a ) y².
Next we turn our attention to the characterization of eigenvalues of matrices that are
symmetric.
Then,
(a) λn ≤ xT Ax ≤ λ1 for all x ∈ S n−1 .
(b) Let y1 , y2 ∈ S n−1 . Then if y1 ∈ Rn is the corresponding eigenvector for λ1 , then
yT1 Ay1 = λ1 . Similarly, If y2 ∈ Rn is the corresponding eigenvector for λn , then
yT2 Ay2 = λn .
Note that, S n−1 denotes the unit (n − 1)-dimensional sphere in Rn . Moreover, since
the set S n−1 is closed and bounded, continuous functions on S n−1 attain their max-
imum and minimum values. Thus, if x ∈ S n−1 , then the maximum and minimum
of the quadratic form Q = xT Ax can be easily computed using Theorem 3.23, as the
next example shows.
Example 3.39 Consider the quadratic form
Q(x1, x2) = (x1, x2) | 0  1/2 ; 1/2  0 | (x1, x2)^T = x1 x2.
The eigenvalues of the matrix A = | 0 1/2 ; 1/2 0 | are λ1 = 1/2 and λ2 = −1/2. The
eigenpairs are
λ1 = 1/2,  v1 = (1/√2, 1/√2)^T;   λ2 = −1/2,  v2 = (−1/√2, 1/√2)^T.
Thus, the maximum λ1 = 1/2 of Q occurs at ±v1, and the minimum λ2 = −1/2 of Q
occurs at ±v2. In fact, one may use Lagrange multipliers to extremize the function
f(x1, x2) = x1 x2 subject to the constraint function g(x1, x2) = x1² + x2² − 1. □
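A short numerical illustration of Theorem 3.23 for Example 3.39 (Python with numpy assumed): the sampled values of Q on the unit circle stay between the smallest and largest eigenvalues, and the extremes are attained at the eigenvectors.

```python
import numpy as np

A = np.array([[0.0, 0.5], [0.5, 0.0]])
vals, vecs = np.linalg.eigh(A)            # eigenvalues in increasing order: -1/2, 1/2
print(vals)

# sample Q(x) = x^T A x on the unit circle
theta = np.linspace(0.0, 2.0 * np.pi, 2001)
X = np.vstack([np.cos(theta), np.sin(theta)])   # unit vectors as columns
Q = np.sum((X.T @ A) * X.T, axis=1)             # Q at each sample point
print(Q.min(), Q.max())                         # approximately -0.5 and 0.5
print(vecs[:, 0], vecs[:, 1])                   # minimizer and maximizer directions (up to sign)
```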
Theorem 3.24 If A is a real symmetric matrix, then there exists an orthogonal matrix
T such that the transformation x = Tx̄ will reduce the quadratic form (3.46) to the
canonical or diagonal form
Proof The proof is a direct consequence of Theorems 3.19 and 3.20. Let T be the
orthogonal matrix P in Theorem 3.20 and assume λ1, λ2, . . . , λn are the eigenvalues
of the symmetric matrix A. Let the columns of T be the obtained orthonormal vectors
yi/||yi||, i = 1, 2, . . . , n. Then we have
T = [ y1/||y1||  y2/||y2||  · · ·  yn/||yn|| ].
As a consequence,
AT = [ A y1/||y1||  A y2/||y2||  · · ·  A yn/||yn|| ] = [ λ1 y1/||y1||  λ2 y2/||y2||  · · ·  λn yn/||yn|| ].
This yields
T^T A T = [ y1/||y1||  y2/||y2||  · · ·  yn/||yn|| ]^T [ λ1 y1/||y1||  λ2 y2/||y2||  · · ·  λn yn/||yn|| ]
        = diag(λ1, λ2, . . . , λn) = D.
x = Tx̄.
Then,
Q = x^T A x = (Tx̄)^T A (Tx̄) = x̄^T T^T A T x̄ = x̄^T D x̄ = λ1 x̄1² + λ2 x̄2² + . . . + λn x̄n².
Now Q is equivalent to
Q = x^T A x,
where
A = | 3 0 1 ; 0 2 0 ; 1 0 3 |,   x = (x1, x2, x3)^T.
It is clear that A is symmetric. The eigenvalues of A satisfy
(2 − λ)²(λ − 4) = 0.
λ1 = λ2 = 2, and λ3 = 4.
Let K = (k1, k2, k3)^T. Then using (3.31) we have
(3 − λ)k1 + k3 = 0
(2 − λ)k2 = 0      (3.48)
k1 + (3 − λ)k3 = 0.
x2 = x̄2,
x3 = (1/√2) x̄1 + (1/√2) x̄3.
Substituting x1 , x2 , and x3 back into Q(x1 , x2 , x3 ) confirms that
□
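A computational check of this reduction (Python with numpy assumed; np.linalg.eigh returns an orthogonal matrix of eigenvectors, playing the role of T here):

```python
import numpy as np

A = np.array([[3.0, 0.0, 1.0],
              [0.0, 2.0, 0.0],
              [1.0, 0.0, 3.0]])

vals, T = np.linalg.eigh(A)                      # columns of T are orthonormal eigenvectors
print(vals)                                      # [2. 2. 4.]
print(np.allclose(T.T @ A @ T, np.diag(vals)))   # True: T^T A T = D

# spot-check that Q(x) = x^T A x becomes 2 xb1^2 + 2 xb2^2 + 4 xb3^2 under x = T xb
xb = np.array([1.0, -2.0, 0.5])
x = T @ xb
print(np.isclose(x @ A @ x, np.sum(vals * xb**2)))   # True
```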
The above results extend to cover quadratic forms of the form Q = c, where c is
constant. Here is an example.
Example 3.41 Consider the quadratic form
Now Q is equivalent to
Q = x^T A x − 3,
where
A = | 4 2 2 ; 2 4 2 ; 2 2 4 |,   x = (x1, x2, x3)^T.
The eigenvalues are
λ1 = λ2 = 2, and λ3 = 8.
The corresponding normalized eigenvectors are
y1 = (−1/√2, 1/√2, 0)^T,   y2 = (−1/√6, −1/√6, 2/√6)^T,   y3 = (1/√3, 1/√3, 1/√3)^T.
Hence
T =
| −1/√2  −1/√6  1/√3 |
|  1/√2  −1/√6  1/√3 |
|   0     2/√6  1/√3 |
which is orthogonal. Let x̄ = (x̄1, x̄2, x̄3)^T. Then,
x = Tx̄,
□
Theorem 3.25 A quadratic form Q = xT Ax is positive definite, if and only if, all the
eigenvalues of A are positive.
Example 3.43 Consider the matrix A in Example 3.42. Then the leading principal submatrices
of A are
B = (4)   and   C = | 4 2 ; 2 4 |.
Since
det(B) = 4   and   det(C) = 12
are all positive, the quadratic form Q = x^T A x is positive definite. □
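Example 3.42 is not reproduced above; assuming its matrix is the symmetric matrix A of Example 3.41, the leading principal minors and the eigenvalues can be checked as follows (Python with numpy assumed).

```python
import numpy as np

A = np.array([[4.0, 2.0, 2.0],
              [2.0, 4.0, 2.0],
              [2.0, 2.0, 4.0]])

# leading principal minors
minors = [np.linalg.det(A[:k, :k]) for k in range(1, 4)]
print(minors)                   # [4.0, 12.0, 32.0] -- all positive
print(np.linalg.eigvalsh(A))    # [2. 2. 8.] -- all positive, so Q = x^T A x is positive definite
```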
The next theorem is about reducing two quadratic forms simultaneously to canonical
forms when one of them is positive definite.
Theorem 3.27 If at least one of the quadratic forms
Q1 = xT Ax, Q2 = xT Bx (3.49)
Proof Suppose Q2 is positive definite. Then by Theorem 3.24, there exists T such
that
x = Ty (3.50)
that reduces Q2 to the form
Q2 = µ1 y1² + . . . + µn yn².
Setting ηi = √µi yi, i = 1, . . . , n, we may write x = Ty = T′η, where T′ is the matrix obtained from T by dividing each element of the ith column by √µi. Hence we may write Q1 as
Q1 = η^T G η,   where G = T′^T A T′.      (3.56)
Q2 = η T η = (Sα)T (Sα) = α T ST Sα = α T α
= α12 + α22 + . . . + αn2 (since S is orthogonal). (3.58)
x = Ty = T′ η = T′ Sα, (3.59)
(2 − µ1 )k1 + 0k3 = 0
0k1 + (2 − µ1 )k2 = 0.
Or,
Q2 = η1² + η2²,   where ηi = √µi yi = √2 yi,  i = 1, 2.
Thus,
T′ = [ 1/√2   0
       0      1/√2 ],
and Q1 = η^T (T′^T A T′) η, where A = [ 3  −1
                                        −1  3 ]. In particular,
Q1 = η^T ( T′^T A T′ ) η = η^T [  3/2  −1/2
                                 −1/2   3/2 ] η := η^T Gη.
Thus,
S = [ 1/√2   1/√2
      1/√2  −1/√2 ].
Setting
η = Sα,   where α = (α1, α2)^T,
gives
η1 = (1/√2)(α1 + α2),   η2 = (1/√2)(α1 − α2).
3.9.1 Exercises
Exercise 3.83 Write the quadratic forms in matrix forms with symmetric matrices.
(a) Q(x1 , x2 ) = 3x12 + 3x22 − x1 x2 .
(b) Q(x1 , x2 , x3 ) = x12 + x22 + x32 − 8x1 x2 + 4x2 x3 + 10x1 x3 .
Exercise 3.84 For each of the given matrices, write down the corresponding
quadratic form and then find a symmetric matrix which determines the same
quadratic form.
(a) A = [ 2  1
          3  4 ],   (b) B = [ 5  −1  2
                              3   4  1
                              1   6  2 ],   (c) C = [ 1  2  0
                                                      3  4  5
                                                      0  7  6 ].
Exercise 3.85 Let A be an n×n matrix. We say A = (ai j ) is positive definite if xT Ax >
0 for nonzero n × 1 vector x. Show that if A is positive definite, then aii > 0, i =
1, 2, . . . , n.
Exercise 3.86 Give an example of a quadratic form in 2 variables Q(x1 , x2 ), which
is
(a) positive definite,
(b) negative definite,
(b)
f (x1 , x2 , x3 ) = 4x12 + 4x22 + 4x32 + 4x1 x2 + 4x1 x3 + 4x2 x3 .
Exercise 3.89 Use Theorem 3.26 to show the quadratic forms in Example 3.89 are
positive definite.
Exercise 3.90 Show the matrix
A = [  1  −1   2   0
      −1   4  −1   1
       2  −1   6  −2
       0   1  −2   4 ]
is positive definite.
Exercise 3.91 Find all values of x so that the matrix
A = [  2  −1   x
      −1   2  −1
       x  −1   2 ]
is
(a) positive semidefinite,
(b) positive definite.
Exercise 3.92 Let A and B be symmetric matrices and consider the two quadratic
forms
Q1 = xT Ax and Q2 = xT Bx.
Show that if there is a matrix P that simultaneously diagonalizes Q1 and Q2 then
A−1 B is diagonalizable.
Exercise 3.93 Use Exercise 3.92 to show the two quadratic forms
Q1 = x12 + x1 x2 − x22 and Q2 = x12 − 2x1 x2
can not be simultaneously diagonalized.
Exercise 3.94 Find the real transformation that will simultaneously reduce the
quadratic forms
Q1 = x1 x2 and Q2 = 3x12 − 2x1 x2 + 2x22 .
to canonical forms.
Exercise 3.95 Find the real transformation that will simultaneously reduce the
quadratic forms
Q1 = 4x12 + 4x22 + 4x32 + 4x1 x2 + 4x1 x3 + 4x2 x3 ,
and
Q2 = 3x12 + 3x32 + 4x1 x2 + 8x1 x3 + 4x2 x3 ,
to canonical forms.
The next theorem plays an important role in the proof of the Cayley-Hamilton The-
orem.
Theorem 3.29 Let A be an n × n symmetric matrix. For constants αi, i = 0, 1, . . . , n,
let
P(A) = αn A^n + αn−1 A^{n−1} + . . . + α1 A + α0 I
be a polynomial in the matrix A. Then all eigenvectors of A are eigenvectors of
P(A), and if the eigenvalues of A are λ1, . . . , λn, then those of P(A) are
P(λ1), P(λ2), . . . , P(λn).
The next theorem, known as the Cayley-Hamilton Theorem, sheds light on an inter-
esting relationship between a matrix and its characteristic polynomial.
Theorem 3.30 (Cayley-Hamilton Theorem) Let A be an n × n symmetric matrix. If
P(λ ) = |A − λ I| = 0,
then A satisfies P(A) = 0 (zero matrix).
Proof We know
P(λ ) = |A − λ I| = (−1)n [λ n + βn−1 λ n−1 + . . . + β1 λ + (−1)n β0 ].
By definition, we have that P(λi ) = 0, i = 1, 2, . . . , n. By Theorem 3.29, we
have
P(A)ui = P(λi )ui ,  i = 1, 2, . . . , n,
where λi is an eigenvalue of A and ui is its corresponding eigenvector. Let B =
P(A). Then Bx = 0 possesses the n linearly independent solutions x = u1 , u2 , . . . , un .
But since B is a square matrix of order n, B must be of rank n − n = 0. Hence B =
P(A) must be a zero matrix and it follows that B = P(A) = 0. This completes the
proof.
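The Cayley-Hamilton theorem is easy to test numerically. The matrix used in the example that follows is not reproduced here, so the sketch below (added for illustration) uses a hypothetical symmetric matrix instead:

import numpy as np

# Hypothetical symmetric matrix (not the one in the text's example).
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])

# np.poly(A) returns the coefficients of the characteristic polynomial, leading term first.
coeffs = np.poly(A)
n = A.shape[0]

# Evaluate the polynomial at A itself; Cayley-Hamilton asserts P(A) = 0.
P_of_A = sum(c * np.linalg.matrix_power(A, n - k) for k, c in enumerate(coeffs))
print(np.round(P_of_A, 10))     # zero matrix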
P(λ ) = λ 3 − 3λ 2 − 9λ + 3.
A3 − 3A2 − 9A + 3I = 0. (3.61)
Thus,
A3 = 3A2 + 9A − 3I.
Multiplying (3.61) by A we arrive at
of all Ci except Ck contains the factor λk − λk , and hence vanishes. Thus, it follows after
some calculations that
P(A)uk = Ck (λk − λ1 ) · · · (λk − λk−1 )(λk − λk+1 ) · · · (λk − λn ) uk ,            (3.63)
so that
Ck = P(λk ) / ∏_{r≠k} (λk − λr ),   k = 1, 2, . . . , n,                               (3.64)
where the notation ∏_{r≠k} denotes the product of those factors for which r takes on
the values 1 through n, excluding k. Substituting (3.63) and (3.64) into (3.62)
we obtain
P(A) = ∑_{k=1}^{n} P(λk ) Zk (A),                                                      (3.65)
where
Zk (A) = ∏_{r≠k} (A − λr I) / ∏_{r≠k} (λk − λr ).                                      (3.66)
Z1 (A) = (A − λ2 I) / (λ1 − λ2 ) = (1/2)(A − I).
Similarly,
Z2 (A) = (A − λ1 I) / (λ2 − λ1 ) = −(1/2)(A − 3I).  Thus
P(A) = A^m = ∑_{k=1}^{2} P(λk ) Zk (A)
           = P(λ1 )Z1 (A) + P(λ2 )Z2 (A)
           = P(3)Z1 (A) + P(1)Z2 (A)
           = 3^m · (1/2)(A − I) + (1)^m · ( −(1/2)(A − 3I) )
           = (3^m / 2)(A − I) − (1/2)(A − 3I).
Hence, A^100 = (3^100 / 2)(A − I) − (1/2)(A − 3I). □
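Since the 2 × 2 matrix of this example is not reproduced here, the following Python sketch (added for illustration) checks Sylvester's formula on a hypothetical symmetric matrix that is assumed to have the same eigenvalues, 3 and 1:

import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])      # hypothetical matrix with eigenvalues 3 and 1
I = np.eye(2)
m = 100

# Sylvester's formula:  A^m = 3^m (A - I)/2 - (A - 3I)/2.
sylvester = (3.0**m) * (A - I) / 2 - (A - 3 * I) / 2
direct = np.linalg.matrix_power(A, m)
print(np.allclose(sylvester, direct))   # True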
Next, we extend the application of Sylvester’s formula to linear systems of ordinary
differential equations. Recall that e^x = ∑_{n=0}^{∞} x^n/n! converges for all x. If A is a matrix
of order n, the sum e^A = ∑_{n=0}^{∞} A^n/n! can, by the Cayley-Hamilton Theorem, be expressed
as a polynomial of degree n − 1 in A. So if A has distinct eigenvalues, then we can use
Sylvester’s formula to calculate e^A. For simplicity, suppose A is of order two with distinct
eigenvalues λ1 and λ2. Then,
Z1 (A) = (A − λ2 I) / (λ1 − λ2 ),   Z2 (A) = (A − λ1 I) / (λ2 − λ1 ).
for all t ∈ R.
Example 3.48 Solve for t ≥ 0,
x′ = [ 2  3
       3  2 ] x,    x(0) = (2, −3)^T.
The matrix A = [ 2, 3; 3, 2 ] has the eigenvalues λ1 = 5, λ2 = −1. The solution of the
system is x(t) = e^{At} x0. By (3.67) we have
e^{At} = (1/6) [ (e^{5t} − e^{−t})A − (−e^{5t} − 5e^{−t})I ]
       = (1/6) [ 3e^{5t} + 3e^{−t}    3e^{5t} − 3e^{−t}
                 3e^{5t} − 3e^{−t}    3e^{5t} + 3e^{−t} ].
3.10.1 Exercises
Exercise 3.96 Find the eigenvalues of A and A^5 where
A = [  3  −12   4
      −1    0  −2
      −1    5  −1 ].
Calculus of Variations
This chapter is devoted to the study of the calculus of variations. The calculus of
variations is a wide field in mathematics concerned with minimizing or maximizing
functionals. It has widespread applications in physics, engineering, and applied
mathematics, and it naturally makes its presence felt in the field of partial differential
equations. In this chapter, we will consider many applications, such as the distance
between two points, the brachistochrone problem, surfaces of revolution, navigation,
the catenary, and others.
The chapter covers a wide range of classical topics on the subject of the calculus
of variations. Our aim is to cover the topics in a way that strikes a balance between
the development of theory and applications. The chapter is suitable for advanced un-
dergraduate and graduate students. In most sections, we limit ourselves to smooth
solutions of the Euler-Lagrange equations and finding explicit solutions to classical
problems. We will generalize the concept to systems and functionals that contain
higher derivatives of the unknown functions. The chapter contains a long but inter-
esting section on the sufficient conditions for the existence of an extremal.
4.1 Introduction
Let f : R → R be a real valued function that is continuous. Then we know from
calculus that if f has a local minimum or maximum value at an interior point c, and
if f ′ (c) exists, then
f ′ (c) = 0. (4.1)
Condition (4.1) is a necessary condition for maximizing or minimizing the function
f . Let f (x) = x3 . Then, f ′ (0) = 0. However, the function has neither a maximum nor
minimum at c = 0, as the graph in Fig. 4.1 shows. This shows that condition (4.1) is
not sufficient.
Before we commence on formal definitions, we must be precise when talking about
maximum or minimum in the sense of distances. This brings us to the notion of a
norm.
FIGURE 4.1
f ′ (0) = 0, but f has neither a maximum nor a minimum.
Definition 4.1 (Normed spaces) Let V denote a linear space over the field R. A func-
tional ∥x∥, which is defined on V, is called the norm of x ∈ V if it has the following
properties:
1. ∥x∥ > 0 for all x ̸= 0, x ∈ V.
2. ∥x∥ = 0 if x = 0.
3. ∥αx∥ = |α|∥x∥ for all x ∈ V, α ∈ R.
4. ∥x + y∥ ≤ ∥x∥ + ∥y∥ (triangle inequality)
Example 4.1 The space (Rn , +, ·) over the field R is a vector space (with the usual
vector addition, + and scalar multiplication, ·) and there are many suitable norms for
it. For example, if x = (x1 , x2 , . . . , xn ) then
1. ∥x∥ = max_{1≤i≤n} |xi |,
2. ∥x∥ = ( ∑_{i=1}^{n} xi² )^{1/2}, or
3. ∥x∥ = ∑_{i=1}^{n} |xi |,
4. ∥x∥_p = ( ∑_{i=1}^{n} |xi |^p )^{1/p},   p ≥ 1
are all suitable norms. Norm 2. is the Euclidean norm: the norm of a vector is its
Euclidean distance to the zero vector and the metric defined from this norm is the
usual Euclidean metric. Norm 3. generates the “taxi-cab” metric on R2 and Norm 4.
is the l p norm. □
Let D ⊂ Rn and define a function f : D → Rn . Let c be a point in the interior of D.
We define a neighborhood of c by
FIGURE 4.2
Shortest path between two points.
Assume the function is scalar. That is f : R → R. Then the Taylor series expansion
of f at c is
f (x) = f (c) + (x − c) f ′ (c) + (1/2)(x − c)² f ′′ (c) + O((x − c)³ ).
By making the change of variables x = c + ε, the above expression takes the
form
f (c + ε) = f (c) + ε f ′ (c) + (1/2) ε² f ′′ (c) + O(ε³ ).                            (4.2)
The proofs of the next two theorems are based on (4.2) and we urge the interested
readers to consult any calculus textbook.
Theorem 4.1 A necessary condition for a function f to have a relative minimum at
a point c in its domain is (i) f ′ (c) = 0 and (ii) f ′′ (c) ≥ 0.
Theorem 4.2 A sufficient condition for a function f to have a strict relative minimum
at a point c in its domain is (i) f ′ (c) = 0 and (ii) f ′′ (c) > 0.
Our main purpose is to extend the above discussion to the calculus of variations.
Suppose we have two points P(a, A) and Q(b, B) in the xy-plane and we are interested
in finding the shortest path between them, see Fig. 4.2. Let f (x) be a candidate for
being the shortest path between the two points. We know from calculus that if f ′
is continuous on [a, b], then the length of the curve y = f (x), a ≤ x ≤ b, is given
by
L = ∫_a^b √(1 + ( f ′ (x))² ) dx = ∫_a^b √(1 + (y′ )² ) dx.                            (4.3)
Note that the integral in (4.3) is a functional since the integrand depends on the
unknown function y. Since the right hand side of (4.3) depends on the unknown
function y we write
L(y) = ∫_a^b √(1 + ( f ′ (x))² ) dx = ∫_a^b √(1 + (y′ )² ) dx,                         (4.4)
where y′ (x) = dy/dx. More generally, we consider functionals of the form
L(y) = ∫_a^b F(x, y, y′ ) dx.                                                           (4.5)
We are interested in finding a particular function y(x) that maximizes or minimizes
(4.5) subject to the boundary conditions y(a) = A and y(b) = B. Such a function will
be called an extremal of L(y).
Definition 4.2 Let S be a vector space (space that has algebraic structures under
multiplication and addition). Our main problem in calculus of variations is to find
y = y0 (x) ∈ S[a, b] for which the functional L(y) takes an extremal value (maximum
or minimum) with respect to all y(x) ∈ S[a, b].
The set Ck [a, b] denotes the set of functions that are continuous on [a, b] with their k-
th derivatives also being continuous on [a, b]. The vector space S[a, b] can be thought
of as the space of competing functions. To be precise, let Σ be the set of all competing
functions for the variational problem (4.5); then
Σ = { y ∈ C²([a, b]) : y(a) = A, y(b) = B }.
Note that this space is not linear because if y, w ∈ Σ, then y(a)+w(a) = 2A ̸= A unless
A = 0. The same is true for the boundary condition at b. Next we define relative
minimum and relative maximum for a functional.
Definition 4.3 A competing function y0 ∈ Σ is said to yield relative minimum (maxi-
mum) for L(y) in Σ if
L(y) − L(y0 ) ≥ 0 (≤ 0)
for all
y ∈ N(y0 , ε) := {y ∈ Σ : ||y − y0 || < ε}, for some ε > 0,
where N(y0 , ε) is neighborhood of y0 .
Below, we build upon the notion of competing functions to define the so-called space
of admissible functions.
FIGURE 4.3
The function η(x) with a, b > 0.
Lemma 10 Let f be continuous on [a, b] and suppose that
∫_a^b f (x)η(x) dx = 0                                                                  (4.6)
for every function η ∈ C²([a, b]) with η(a) = η(b) = 0. Then f (x) = 0 for all x ∈ [a, b].
Proof Suppose the contrary. That is, f (x) is not zero over its entire domain [a, b].
Then, without loss of generality (w.l.o.g), let us assume it is positive for some interval
[x1 , x2 ] that is contained in [a, b]. Define
η(x) = (x − x1 )³ (x2 − x)³  for x ∈ [x1 , x2 ],  and  η(x) = 0  otherwise;
see Fig. 4.3. Then, the term (x − x1 )³ (x2 − x)³ > 0, for x ∈ (x1 , x2 ). We must make
sure that η ∈ C2 ([a, b]).
Moreover,
lim_{x→x1⁻} [η(x) − η(x1 )] / (x − x1 ) = lim_{x→x1⁻} (0 − 0)/(x − x1 ) = 0.
In addition,
lim_{x→x1⁻} [η ′ (x) − η ′ (x1 )] / (x − x1 ) = lim_{x→x1⁻} (0 − 0)/(x − x1 ) = 0.
Hence, η ′′ (x1 ) = 0. It follows along the lines of the previous work that η ′′ (x2 ) = 0.
Thus, the second derivative of η exists and is given by
η ′′ (x) = 6(x − x1 )(x2 − x){ (x − x1 )² − 3(x − x1 )(x2 − x) + (x2 − x)² },  x1 < x < x2 ,
and η ′′ (x) = 0 otherwise.
It is evident that
lim η ′′ (x) = η ′′ (x1 ) = 0,
x→x1
and
lim η ′′ (x) = η ′′ (x2 ) = 0.
x→x2
This shows that η ∈ C2 ([a, b]). To get a contradiction, we integrate f (x)η(x) from
x = a, to x = b.
∫_a^b f (x)η(x)dx = ∫_a^{x1} f (x)η(x)dx + ∫_{x1}^{x2} f (x)η(x)dx + ∫_{x2}^b f (x)η(x)dx
                  = 0 + ∫_{x1}^{x2} f (x)η(x)dx + 0
                  = ∫_{x1}^{x2} f (x)(x − x1 )³ (x2 − x)³ dx > 0,
which contradicts (4.6). Thus, f (x) can not be non-zero anywhere in its domain [a, b].
We conclude that f (x) is zero on its entire domain [a, b]. The proof of taking f < 0 is
similar, so we omit it. This completes the proof.
Our aim is to find the path y(x) that minimizes or maximizes the functional. We will
consider all possible functions by adding a function η(x) ∈ C .
Theorem 4.3 [Euler-Lagrange equation] Assume F in (4.5) is twice differentiable
with respect to its arguments. Let y ∈ C²[a, b] be such that y(a) = A and y(b) = B; that
is, y ∈ Σ. If y is an extremal of (4.5), then y satisfies the Euler-Lagrange equation
FIGURE 4.4
Possible extremal.
d/dx (∂F/∂y′ ) − ∂F/∂y = 0.                                                             (4.7)
Proof Let η = η(x) ∈ C . Then
y(x) + εη(x) ∈ Σ,
where y is an extremal function for the functional L(y) given by (4.5). See Fig. 4.4.
In the functional L(y) we replace y by y + εη and obtain
L(ε) = ∫_a^b F(x, y + εη, y′ + εη ′ ) dx.                                                (4.8)
Once y and η are assigned, L(ε) has an extremum at ε = 0. But this is possible
only when
dL(ε)/dε = 0 when ε = 0.
Suppress the arguments in F and compute dL(ε)/dε :
dL(ε)/dε = d/dε ∫_a^b F(x, y + εη, y′ + εη ′ ) dx
         = ∫_a^b [ (∂F/∂x)(dx/dε) + (∂F/∂y) d(y + εη)/dε + (∂F/∂y′ ) d(y′ + εη ′ )/dε ] dx
         = ∫_a^b [ (∂F/∂y) η + (∂F/∂y′ ) η ′ ] dx,
since dx/dε = 0. Setting
dL(ε)/dε |_{ε=0} = 0
we arrive at
∫_a^b [ (∂F/∂y)(x, y + εη, y′ + εη ′ ) η + (∂F/∂y′ )(x, y + εη, y′ + εη ′ ) η ′ ] dx |_{ε=0} = 0.   (4.9)
We perform an integration by parts on the second term in the integrand of (4.9). Let
dv = η ′ (x)dx and u = ∂F/∂y′ . Then
v = η(x),  and  du = d/dx (∂F/∂y′ ) dx.
It follows that
∫_a^b (∂F/∂y′ ) η ′ dx = [ (∂F/∂y′ ) η(x) ]_a^b − ∫_a^b d/dx (∂F/∂y′ ) η(x) dx
                      = (∂F/∂y′ )|_{x=b} η(b) − (∂F/∂y′ )|_{x=a} η(a) − ∫_a^b d/dx (∂F/∂y′ ) η(x) dx
                      = − ∫_a^b d/dx (∂F/∂y′ ) η(x) dx,
since η(a) = η(b) = 0.
Substituting this into (4.9) gives ∫_a^b [ ∂F/∂y − d/dx (∂F/∂y′ ) ] η(x) dx = 0 for every
η ∈ C , and hence, by Lemma 10,
d/dx (∂F/∂y′ ) − ∂F/∂y = 0,                                                              (4.10)
which is the Euler-Lagrange equation (4.7). This completes the proof.
Example 4.2 Find the extremal function for
L(y) = ∫_0^1 [ (y′ )² + xy + y² ] dx,   y(0) = 1,  y(1) = 2.
Here,
F(x, y, y′ ) = (y′ )² + xy + y², with
Fy′ = 2y′ ,   Fy = x + 2y,   and   d/dx Fy′ = 2y′′ .
It follows that
d/dx Fy′ − Fy = 2y′′ − x − 2y = 0,
which is the second-order ODE
2y′′ − 2y = x,
and can be solved using the method of Section 1.9. Thus the solution is
y(x) = c1 e^x + c2 e^{−x} − (1/2)x.
Using the given boundary conditions we end up with the system
c1 + c2 = 1,   c1 e + c2 e^{−1} = 5/2,   with
c1 = (2e^{−1} − 5) / (2(e^{−1} − e)),   c2 = (5 − 2e) / (2(e^{−1} − e)).
Finally, the extremal function is given by
y(x) = (2e^{−1} − 5) / (2(e^{−1} − e)) · e^x + (5 − 2e) / (2(e^{−1} − e)) · e^{−x} − (1/2)x.
□
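The extremal can also be obtained symbolically. The SymPy sketch below (added for illustration) solves the Euler-Lagrange equation of Example 4.2 with the boundary values implied by the system for c1 and c2 above:

import sympy as sp

x = sp.symbols('x')
y = sp.Function('y')

# Euler-Lagrange equation 2*y'' - 2*y = x with y(0) = 1, y(1) = 2.
sol = sp.dsolve(sp.Eq(2*y(x).diff(x, 2) - 2*y(x), x), y(x),
                ics={y(0): 1, y(1): 2})
print(sp.simplify(sol.rhs))   # c1*exp(x) + c2*exp(-x) - x/2 with the constants above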
Example 4.3 Find the extremal function for
L(y) = ∫_0^{π/4} [ (y′ )²/2 − 2y² ] dx,   y(0) = 1,  y(π/4) = 2.
Here,
F(x, y, y′ ) = (y′ )²/2 − 2y², with
Fy′ = y′ ,   Fy = −4y,   and   d/dx Fy′ = y′′ .
It follows that
d/dx Fy′ − Fy = y′′ + 4y = 0,
which can be solved using the method of Section 1.8. It follows that the solution is
y(x) = cos(2x) + 2 sin(2x).
For fun, we evaluate the functional L at the extremal function. After some calcula-
tions we arrive at
(y′ )²/2 − 2y² = 6 cos(4x) − 8 sin(4x).
As a result, we see that
L(cos(2x) + 2 sin(2x)) = ∫_0^{π/4} [ 6 cos(4x) − 8 sin(4x) ] dx = −4.
□
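The value −4 is easy to confirm by quadrature; the following short Python sketch (illustrative only) evaluates the functional along the extremal found above:

import numpy as np
from scipy.integrate import quad

y  = lambda x: np.cos(2*x) + 2*np.sin(2*x)
yp = lambda x: -2*np.sin(2*x) + 4*np.cos(2*x)
F  = lambda x: yp(x)**2 / 2 - 2*y(x)**2

value, _ = quad(F, 0, np.pi/4)
print(value)    # -4.0 up to quadrature error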
Example 4.4 Find the extremal function for
L(y) = ∫_1^2 [ x²(y′ )²/2 + y² ] dx,   y(1) = 1,  y(2) = −1.
It follows that
d/dx Fy′ − Fy = x² y′′ + 2xy′ − 2y = 0,
which is a Cauchy-Euler equation. Using Section 1.11, we arrive at the solution
y(x) = c1 x + c2 / x².
Corollary 6 If F = F(x, y′ ), that is, F does not explicitly depend on the variable y, then
the Euler-Lagrange equation (4.7) is reduced to
Fy′ = C,                                                                                 (4.11)
where C is a constant.
Proof From (4.7), the term Fy is zero since F is independent of y. Hence we are left
with
d
F ′ = 0.
dx y
An integration with respect to x gives the result.
Corollary 7 If F = F(y, y′ ), that is F does not explicitly depend on the variable x,
then the Euler-Lagrange equation (4.7) is reduced to
F − y′ Fy′ = C, (4.12)
where C is a constant
An integration with respect to x gives the result. This completes the proof.
Now we are in a good place to find the shortest distance or path between two points
in the plane.
Example 4.5 Consider the functional
L(y) = ∫_a^b √(1 + (y′ )² ) dx,   y(a) = A,  y(b) = B.
Since F = √(1 + (y′ )² ) does not depend on y, Corollary 6 gives
Fy′ = C,   that is,   y′ / √(1 + (y′ )² ) = C.
It follows that
y′ = constant = K,
either by solving for y′ or by noticing that the left-hand side of y′/√(1 + (y′ )² ) = C
can be constant only if y′ = K, where K is some function of C (another constant). Hence,
y(x) = Kx + D.
Using the boundary conditions,
K = (B − A)/(b − a),   D = (Ab − Ba)/(b − a).
Of course the shortest path is a straight line, as we have expected. □
We make the following definition regarding smoothness of a function.
Definition 4.5 Let Ω ⊂ Rn . Then the function f : Ω → Rn is said to be smooth on Ω
if f (x) ∈ Cn (Ω), in the sense that f (x) has n derivatives in the entire domain Ω and
the nth derivative of f (x) is continuous.
For example, the function
0, x ≤ 0
f (x) =
x2 , x > 0
is in C1 (R) but not in C2 (R). Recall, Theorem 4.3 asks for y ∈ C2 ([a, b]), which may
not be the case in some situations. To better illustrate the requirement, we look at the
next example.
Example 4.6 Consider the variational
L(y) = ∫_1^3 (y − 2)² (x − y′ )² dx,   y(1) = 2,  y(3) = 9/2.
The integrand is nonnegative and hence the variational is minimized when its value is
zero at the extremal. This is achieved for
y(x) = 2 for 1 ≤ x ≤ 2,   and   y(x) = x²/2 for 2 < x ≤ 3,
d/dx (∂F/∂y′ ) − ∂F/∂y = 0.
Then y(x) has continuous second derivatives at all points (x, y) at which
Fy′ y′ (x, y(x), y′ (x)) ̸= 0.
We will further discuss Fy′ y′ in the next two sections. Now we try to connect the
concept of extremum of functionals with functions that we discussed in Section 4.1.
Consider the variational problem (4.8). Expanding the right-hand side in a Maclaurin
series in ε gives
L(ε) = L(y) + ε ∫_a^b [ Fy η + Fy′ η ′ ] dx
          + (ε²/2!) ∫_a^b [ Fyy η² + 2Fyy′ ηη ′ + Fy′ y′ (η ′ )² ] dx + O(ε³ )
     := L(y) + ε δ L(y) + (ε²/2!) δ² L(y) + O(ε³ ).
Let
L(0) = L(y),   L′ (0) = δ L(y),   and   L′′ (0) = δ² L(y).
Then, we may write L(ε) in the form
L(ε) = L(0) + εL′ (0) + (ε²/2!) L′′ (0) + O(ε³ ).
The terms δ L(y) and δ 2 L(y) are called the first variation and second variation, re-
spectively, and they will be discussed in detail in Section 4.4.
Example 4.7 Consider the variational in Example 4.5 with y(0) = 0, and y(1) = 3.
Then,
y(x) = 3x.
Moreover,
Fy = 0,   Fy′ = y′ / √(1 + (y′ )² ),   Fyy′ = Fy′ y = 0,   and   Fy′ y′ = (1 + (y′ )² )^{−3/2} .
Thus,
δ L(y) = ∫_0^1 [ Fy η + Fy′ η ′ ] dx = ∫_0^1 (3/√10) η ′ (x) dx
       = (3/√10)(η(1) − η(0)) = 0,   and
δ² L(y) = ∫_0^1 [ Fyy η² + 2Fyy′ ηη ′ + Fy′ y′ (η ′ )² ] dx
        = ∫_0^1 (10)^{−3/2} (η ′ (x))² dx ≥ 0.
□
4.2.1 Exercises
Exercise 4.1 Assume f (x) is continuously differentiable in [a, b] such that
b
f (x)η ′ (x)dx = 0
a
for every continuous function η ∈ C2 ([a, b]) such that η(a) = η(b) = 0. Show then
f (x) = constant for all x ∈ [a, b].
Exercise 4.2 Assume f (x) is C2 ([a, b]) such that
b
f (x)η ′′ (x)dx = 0
a
for every continuous function η ∈ C3 ([a, b]) such that η(a) = η(b) = 0. Show then
f (x) = c1 + c2 x for all x ∈ [a, b], where c1 and c2 are constants.
Exercise 4.3 Assume f (x) and g(x) are continuous in [a, b] and
b
[ f (x)η(x) + g(x)η ′ (x)]dx = 0
a
for every function η ∈ C1 ([a, b]) such that η(a) = η(b) = 0. Show then g(x) is dif-
ferentiable and g′ (x) − f (x) = 0 for all x ∈ [a, b].
Exercise 4.4 Let
(x − α)(β − x), α < x < β
η(x) =
0, otherwise.
Show that η(x) ∈ C(R).
is
(x − B)2 + y2 = R2 ,
for appropriate constants B and R.
Exercise 4.16 Find the extremal for
2 q
1
L(y) = 1 + (y′ )2 dx, y(1) = 0, y(2) = 1.
1 x
(b) Let 1
ϕ(ε) = F(x, y0 + εη, y′0 + εη ′ ).
0
Use the y0 from part (a) and η(x) = x(1 − x) to show
dϕ(ε)
ϕ ′ (0) = = 0.
dε ε=0
Exercise 4.19 Let p and q be known constants. Find the extremal y = y(x) that min-
imizes or maximizes the functional
b
L(y) = (y2 + pyy′ + q(y′ )2 )dx, y(a) = A, y(b) = B,
a
where
−1, 0 ≤ x < 12
f (x) = 1
1, 2 <x≤1
Exercise 4.24 Display a function y(x) that minimizes the functional
1
L(y) = y2 (2x − y′ )2 dx, y(−1) = 0, y(1) = 1
−1
dφ/dx = φx + φy dy/dx + φy′ dy′/dx,
and the corresponding Euler-Lagrange equation becomes
d/dx Fy′ − Fy = φx + φy dy/dx + φy′ dy′/dx − Fy = 0,
which is a second-order differential equation. On the other hand, if y′ enters linearly
in F, then Fy′ is a function of x and y only. Then φ = φ (x, y) and dφ/dx = φx + φy dy/dx,
which implies that
d/dx Fy′ − Fy = φx + φy dy/dx − Fy = 0.
The last expression is a first-order differential equation, and its solution contains only
one constant that must be determined from two boundary conditions. In most cases, such
a solution will not exist. To illustrate this point, we consider
L(y) = ∫_1^2 x² yy′ dx,   y(1) = 1,  y(2) = −1.                                          (4.13)
Consider the functional with fixed end points
L(y) = ∫_a^b [ N(x, y)y′ + M(x, y) ] dx,   y(a) = A,  y(b) = B,                           (4.14)
so that F(x, y, y′ ) = N(x, y)y′ + M(x, y). Then,
d/dx Fy′ − Fy = d/dx N(x, y) − ∂/∂y [ N(x, y)y′ + M(x, y) ]
             = Nx + Ny y′ − Ny y′ − My
             = Nx − My .  Thus,
d/dx Fy′ − Fy = 0   implies
Nx − My = 0.                                                                              (4.15)
Relation (4.15) is not even a differential equation, but rather a relation that, in most
cases, can not satisfy both boundary conditions. For example, if F(x, y, y′ ) = 2xy′ + y²,
then the corresponding Euler-Lagrange equation d/dx Fy′ − Fy = 2 − 2y = 0 holds only
when y(x) = 1, which may not satisfy either of the given two boundary conditions.
However, there is a useful result in the case Nx = My for all x and y, which we state and
prove in the following theorem.
Theorem 4.5 [Path independence] Let y(x) ∈ C¹ ([a, b]) be an extremal function for
the functional (4.14). If (4.15), that is Nx = My , holds for all x and y, then the value of L
is path independent. That is, there is a function f (x, y) such that
L(y) = f (b, y(b)) − f (a, y(a)).
Proof Suppose (4.15) holds for all x and y. Then there is a function f (x, y) such that
fy = N and fx = M. As a consequence, we have
F = N(x, y)y′ + M(x, y) = fy dy/dx + fx .
This is saying that F = d f /dx. Thus
F dx = d f ,
and integrating from a to b gives L(y) = f (b, y(b)) − f (a, y(a)).
This shows the value of L is independent of the extremal y = y(x), and so L is path
independent. This completes the proof.
Example 4.8 Consider
L(y) = ∫_1^2 [ (x³ + y² )y′ + 3x² y ] dx,   y(1) = 1,  y(2) = −1.
Here N = x³ + y² and M = 3x² y, so
Nx = 3x² = My
and condition (4.15) is satisfied for all x and y. So there exists a function f (x, y)
such that N = x³ + y² = fy and M = 3x² y = fx . This gives f (x, y) = ∫ 3x² y dx = x³ y +
g(y), for some function g. In addition, fy = x³ + g′ (y) = N = x³ + y², which implies
that g′ (y) = y². An integration yields g(y) = y³/3 + c. Hence, f (x, y) = yx³ + y³/3 + c.
Finally, according to Theorem 4.5 (the constant c cancels) we have
L(y) = f (2, y(2)) − f (1, y(1)) = f (2, −1) − f (1, 1) = −29/3.
□
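Path independence is easy to see numerically: the value of L is the same along any admissible curve joining the end points. The Python sketch below (illustrative; the two test curves are arbitrary choices, not from the text) evaluates L of Example 4.8 along two different paths from (1, 1) to (2, −1):

from scipy.integrate import quad

def L(y, yp):
    F = lambda x: (x**3 + y(x)**2) * yp(x) + 3 * x**2 * y(x)
    return quad(F, 1, 2)[0]

# A straight line and a quadratic, both with y(1) = 1 and y(2) = -1.
print(L(lambda x: 3 - 2*x,           lambda x: -2.0))
print(L(lambda x: 1 - 2*(x - 1)**2,  lambda x: -4*(x - 1)))
# Both values equal -29/3, as Theorem 4.5 predicts.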
4.3.1 Exercises
Exercise 4.27 Show each of the functionals is path independent and evaluate L.
2
(3y + 7)y′ + (2x − 1) dx,
(a) L(y) = y(0) = 1, y(2) = 0,
0
2
(2yx2 + 7)y′ + (2y2 x − 3) dx,
(b) L(y) = y(1) = 1, y(2) = −1,
1
2
3xy2 y′ + (x3 + y3 ) dx,
(c) L(y) = y(1) = 1, y(2) = 0,
1
π/2
(x cos(xy) + ey )y′ + (y cos(xy) + 1) dx,
(d) L(y) = y(0) = 0, y(π/2) = 1.
0
Exercise 4.28 Determine M(x, y) so that the functional
b
1
(xexy + 2xy + )y′ + M(x, y) dx,
L(y) = a>b>0
a x
with fixed end points, is path independent.
Exercise 4.29 Develop a parallel theory for the variational with fixed end points
b
N(x, y) + y′ M(x, y) dx.
L(y) =
a
In a similar way, we may obtain the first and second variations of the functional L.
Let
△L = L(y + εη) − L(y),
which is the change in L. An expansion of Maclaurin series of the first term on the
right hand side about ε gives
L(y + εη) = L(y) + ε ∫_a^b [ Fy η + Fy′ η ′ ] dx
              + (ε²/2!) ∫_a^b [ Fyy η² + 2Fyy′ ηη ′ + Fy′ y′ (η ′ )² ] dx + O(ε³ ).
So,
△L(y) = ε δ L(y) + (ε²/2!) δ² L(y) + O(ε³ ),                                              (4.20)
where O(ε³ ) can be written as
∫_a^b (ε1 η² + ε2 ηη ′ + ε3 η ′² ) dx.                                                    (4.21)
Due to the continuity of Fyy , Fyy′ , and Fy′ y′ , it follows that ε1 , ε2 , ε3 → 0 as ||η||1 → 0,
where
||η||1 = max |η(x)| + max |η ′ (x)|.
a≤x≤b a≤x≤b
δ L(y) = 0.
d ∂F
v = η(x) and du = dx.
dx ∂ y′
It follows that
bh
d ∂ F i
δ L(y) = Fy − η(x)dx = 0,
a dx ∂ y′
by Euler-Lagrange equation. This completes the proof.
It takes some ingenuity to apply Theorem 4.6 as the next example shows.
Example 4.9 For a fixed b > 0 consider the functional
b
(y′ )2 − y2 dx,
L(y) = y(0) = y(b) = 0.
0
As a consequence we obtain
(1/2) δ² L(y) ≥ ( 1 − b²/π² ) ∫_0^b (η ′ (x))² dx.
It is evident from the above inequality that δ² L(y) ≥ 0, for all such functions η and
extremal y, if b ≤ π. This implies that y is a candidate for minimizing L.
As for the case b > π, we carefully choose η by
ηk (x) = sin(kπx/b),   k = 1, 2, . . . .
It is evident that ηk (0) = ηk (b) = 0. A direct substitution of η and η ′ into δ² L(y)
yields
(1/2) δ² L(y) = ∫_0^b [ (k²π²/b² ) cos² (kπx/b) − sin² (kπx/b) ] dx.
Evaluating the definite integral (each of cos² and sin² integrates to b/2) gives
(1/2) δ² L(y) = (k²π² − b² ) / (2b).
Since b > π, we have δ² L(y) < 0 for k = 1, and δ² L(y) > 0 for k² > b²/π². This shows
that δ² L(y) changes sign and therefore in this case (b > π) the considered functional
L can not have either a relative minimum or a relative maximum. □
Now one can choose η so that the sign of (η ′ )² Fy′ y′ dominates the sign of the in-
tegrand of δ² L(y) that is given by (4.24). In particular, in order to have δ² L(y) ≥ 0
for all η, it is necessary that Fy′ y′ ≥ 0. As a consequence we have the following theo-
rem.
Theorem 4.7 [Legendre necessary condition]
1. If y = y(x) is a local minimum of L in Σ, then Fy′ y′ (x, y(x), y′ (x)) ≥ 0 for all x ∈ [a, b].
2. If y = y(x) is a local maximum of L in Σ, then Fy′ y′ (x, y(x), y′ (x)) ≤ 0 for all x ∈ [a, b].
Proof We will only prove 1., since the proof of 2. follows along the same lines. In addition,
our argument here is inspired by the one given in [12] or [21]. The idea of the proof
is to display a function η with η(a) = η(b) = 0, so that |η| is uniformly bounded and
at the same time |η ′ | can be made as large as we want. One logical choice of
such an η is in terms of sine functions. We accomplish our proof by contradiction. That
is, assume there is a point x1 ∈ (a, b) such that
Fy′ y′ (x1 , y(x1 ), y′ (x1 )) < 0.
By the continuity of Fy′ y′ , we can find a number ζ > 0 such that [x1 −ζ , x1 +ζ ] ⊂ [a, b]
with Fy′ y′ < Fy′ y′ (x1 )/2 for all x ∈ (x1 − ζ , x1 + ζ ).
The idea is to choose η so that the term η ′² Fy′ y′ dominates the other terms in the inte-
grand of δ² L(y). In other words, it is imperative that Fy′ y′ ≥ 0 in order for δ² L(y) ≥ 0.
Let k > 2 be an integer and set
η(x) = sin^{2k} ( π(x − x1 )/ζ )  for x ∈ [x1 − ζ , x1 + ζ ],   and   η(x) = 0  otherwise.
Then
η ′ (x) = (2kπ/ζ ) sin^{2k−1} ( π(x − x1 )/ζ ) cos( π(x − x1 )/ζ )  for x ∈ [x1 − ζ , x1 + ζ ],
and η ′ (x) = 0 otherwise.
This is a contradiction to the fact that δ 2 L(y) ≥ 0. This completes the proof.
Warning:
Be aware that we only know that if δ² L(y) ≥ 0 for all functions η, then Fy′ y′ ≥ 0
(the converse is not true). Also, so far we only know that if y = y(x) is a relative
minimum, then δ² L(y) ≥ 0 (a necessary condition).
Our ultimate goal is to have results that assure our solution is indeed the relative
minimum. That is, δ 2 L(y) ≥ 0 for all functions η implies that y = y(x) is a relative
minimum of L. This will be established after the next examples.
It follows that
x
Fy′ y′ = ,
2((y′ )2 /2 + 1)3/2
which changes signs for x ∈ [−1, 1]. So by Legendre’s Theorem, this functional has
neither a local minimum nor a local maximum. One can easily verify that y(x) = 1 is
the only extremal. □
Example 3 Consider the functional
L(y) = ∫_0^1 [ (y′ )²/2 + y ] dx,   y(0) = 0,  y(1) = 1.
Here,
F(x, y, y′ ) = (y′ )²/2 + y.
Since Fyy = Fy′ y = 0, and Fy′ y′ = 1, it follows that
δ² L(y) = ∫_0^1 (η ′ (x))² dx > 0  for η ̸= 0.
Thus, the necessary condition for a relative minimum is met. In particular, L can not
have a local maximum, which would require δ² L(y) ≤ 0. The extremal y = x²/2 + x/2 can
be computed to identify a potential minimizer, and we will show later that it does, in
fact, minimize the functional. □
Sufficient Conditions
Our next task is to obtain conditions that are sufficient for a function y to be a relative
minimum or a relative maximum for the functional L. Let y(x) be an extremal of the
functional (4.16). We have established that if δ 2 L(y) ≥ 0 for all functions η then
Fy′ y′ ≥ 0. We will be in a great shape if we can show that
δ 2 L(y) ≥ 0
The next lemma plays a crucial role in proving our results regarding sufficient con-
ditions.
Lemma 13 If α(x) > 0, and the ordinary differential equation
z′ + β (x) − z²/α(x) = 0  for x ∈ [a, b],                                                 (4.25)
has a solution z = z(x) defined on all of [a, b], then δ² L(y) ≥ 0 for all η ∈ C , where
α(x) = Fy′ y′ (x, y(x), y′ (x))                                                           (4.26)
and
β (x) = Fyy (x, y(x), y′ (x)) − d/dx Fy′ y (x, y(x), y′ (x)).                             (4.27)
Proof Let y ∈ Σ and η ∈ C . Remember our functional is given by (4.16). Using the
terms α and β , δ 2 L(y) can be put in the simplified form
b
2
δ L(y) = α(x)η ′2 + β (x)η 2 dx. (4.28)
a
Jacobi brilliantly recognized that for any continuously differentiable function z = z(x), one
has
∫_a^b (zη² )′ dx = 0,  for all functions η ∈ C .
He also observed that
(zη² )′ = 2zηη ′ + z′ η².
With these two observations in mind, δ² L(y) given in (4.28) takes the form
δ² L(y) = ∫_a^b [ αη ′² + 2zηη ′ + (z′ + β )η² ] dx.                                      (4.29)
Completing the square in the integrand, we have
αη ′² + 2zηη ′ + (z′ + β )η² = α [ η ′² + 2(z/α)ηη ′ + (z²/α² )η² ] + ( z′ + β − z²/α ) η²
                             = α ( η ′ + (z/α)η )² + ( z′ + β − z²/α ) η².
Thus, if
z′ + β (x) − z²/α(x) = 0
has a solution z, then (4.29) reduces to
δ² L(y) = ∫_a^b α(x) ( η ′ + (z/α)η )² dx ≥ 0,
The million-dollar question is, when does the differential equation given by (4.25)
have a solution? We adopt the following terminology:
Definition 4.8 The second variation δ 2 L(y) of the functional L(y) is said to be pos-
itive definite if
δ 2 L(y) > 0 for all η ∈ C and η ̸= 0.
The results of Lemma 13 depend on the existence of a solution of the Riccati non-
linear first-order differential equation given by (4.25). We introduce a new function
h = h(x) and use the transformation
α(x)h′ (x)
z(x) = − . (4.30)
h(x)
Then (4.25) is transformed to the Jacobi differential equation
′
α(x)h′ − β (x)h = 0 for x ∈ [a, b]. (4.31)
We already know from Chapter 1 that (4.31) has a solution defined on the whole
interval [a, b] as long as α(x) > 0 and β (x) is continuous. However, our next headache
stems from the fact of inverting the transformation to go back from z(x) to h(x). In
other words, we can not have the solution h(x) of (4.31) to vanish or have zeros in
[a, b]. The next definition regarding conjugacy plays an important role in deciding
whether or not the Jacobi equation (4.31) vanishes in [a, b] or not.
Definition 4.9 Two points x = ξ1 and x = ξ2 , ξ1 ̸= ξ2 , are said to be conjugate points
for the Jacobi differential equation (4.31) if it has solution h such that h ̸= 0 between
ξ1 and ξ2 , and h(ξ1 ) = h(ξ2 ) = 0.
Notice that (4.31) has the general solution of the form
Thus, if the interval [a, b] contains no conjugate points, then the Jacobi equation
(4.31) admits a solution h that does not vanish at any points in [a, b]. We have the
following theorem.
Theorem 4.8 The Jacobi equation (4.31) has a nonzero solution for all x ∈ [a, b] if
α(x) > 0 and there are no conjugate points to a in (a, b].
The implication of Lemma 13 and Theorem 4.8 is that (4.31) will have a nonzero
solution, which is a necessary condition for δ 2 L(y) to be positive definite. Thus we
have the next theorem.
Theorem 4.9 Let y ∈ C1 ([a, b]) be an extremal for the functional (4.16). Suppose
that α(x) > 0 for all x ∈ [a, b]. If there are no conjugate points to a in (a, b], then the
second variation δ 2 L(y) is positive definite.
Example 4.11 Consider the functional
L(y) = ∫_0^1 [ (y′ )² + y² − yy′ ] dx,   y(0) = 0,  y(1) = 1.
(2). If δ 2 L(y) ≥ 0 for all η ∈ C , then there is no conjugate points to a in (a, b).
Note that the statement δ 2 L(y) ≥ 0 for all η ∈ C , permits the possibility that
δ 2 L(y) = 0 for some η ̸= 0 ∈ C .
Proof We begin by proving (1). by first showing x = b can not be a conjugate point
to a. We do this by contradiction. Assume b is a conjugate point to a. Then there is
a function h∗ depending on x such that h∗ (a) = h∗ (b) = 0 and satisfying the Jacobi
equation (4.31). That is,
( α(x)h′∗ )′ − β (x)h∗ = 0.                                                               (4.33)
This implies there is a nontrivial η ∈ C such that the second variation vanishes,
contradicting the fact that δ 2 L(y) is positive definite. Hence b can not be conjugate
to a. Left to show that there is no conjugate points to a in (a, b). We follow the proof
given by Gelfand and Fomin ([12], p. 109). The plan is to build a family of positive
definite functionals K(µ), which depend on the parameter µ ∈ [0, 1], such that K(1)
is the second variation and K(0) is unconstrained by conjugate points to a. This
means that any solution to the Jacobi equation for K will be a continuous function
of µ. This continuity is then used by to demonstrate that the absence of a conjugate
point for K(0) implies that for K(µ), and in particular K(1). Let K represent the
functional as defined by
K(µ) = µ δ² L(y) + (1 − µ) ∫_a^b η ′² (x) dx.
It can be easily shown that ∫_a^b η ′² (x) dx has no conjugate points in (a, b]. Moreover,
K(µ) is positive definite for all µ ∈ [0, 1]. The Jacobi equation associated with K(µ)
is
(J)µ := [ ( µα(x) + (1 − µ) ) u′ ]′ − µβ (x)u = 0.                                        (4.36)
Every solution u(x; µ) to (4.36), however, is continuous with regard to µ ∈ [0, 1]. As
a result, we may state that u(x, µ), has a continuous derivative with respect to µ
for all µ in an open interval including [0, 1] because µα(x) + (1 − µ) > 0 for all
µ ∈ [0, 1]. Therefore, the solution u(x; µ) with u(a; µ) = 0 and u′ (a; µ) = 1 depends
on µ continuously for x ∈ (a, b]. Let’s begin by the value µ = 0. Then (J)0 of (4.36)
gives u′′ = 0, with the solution
u(x; 0) = x − a
which has no conjugate points in (a, b). Next we deal with µ = 1, and assume the
contrary. That is there is a conjugate point c∗ ∈ (a, b], that is, u(c∗ ; 1) = 0. Then
u(a; µ0 ) = u(b; µ0 ) = 0,
which is equivalent to
µ0 δ² L(y) + (1 − µ0 ) ∫_a^b η ′² (x) dx = 0
with η(x) = u(x; µ0 ) ̸= 0 and η ∈ C . This is a contradiction to the fact that δ² L(y) > 0
and ∫_a^b η ′² (x) dx > 0 for all η ̸= 0 ∈ C .
The proof of (2). follows along the same lines beginning with the statement “Left
to show that there is no conjugate points to a in (a, b).” This completes the
proof.
Proof We follow the proof of Sagan [21]. Let α be given by (4.26). Assume α(x) > 0
and that the interval [a, b] does not contain any conjugate points to a. Then, due to
the continuity of solutions of the Jacobi equation (4.31), a bigger interval [a, b + ε] exists
that still has no conjugate points to a and is such that α(x) > 0 in [a, b + ε]. For a nonzero
constant ζ , consider the variational
∫_a^b [ α(x)η ′² + β (x)η² ] dx − ζ² ∫_a^b η ′² dx.                                       (4.37)
Thus, by Theorem 4.8, these two conditions imply that the quadratic functional (4.37)
is positive definite for all sufficiently small ζ . That is, there exists a positive constant
d such that
∫_a^b [ α(x)η ′² + β (x)η² ] dx > d ∫_a^b η ′² dx.                                        (4.39)
As a consequence of (4.39), the functional or variational L(y) has a minimum. In
other words, if y = y(x) is the extremal and y = y(x) + η(x) is a sufficiently close
neighboring curve, then from the notation of Definition 4.3, and equations (4.20)
and (4.21) we have that
L(y + η) − L(y) = ∫_a^b [ α(x)η ′² + β (x)η² ] dx + ∫_a^b (ε1 η² + ε2 η ′² ) dx,          (4.40)
This yields
∫_a^b (ε1 η² + ε2 η ′² ) dx ≤ ε ( 1 + (b − a)²/2 ) ∫_a^b (η ′ (x))² dx,                   (4.41)
when |ε1 (x)| ≤ ε, |ε2 (x)| ≤ ε. Since we can choose ε > 0 arbitrarily small, it follows
from (4.39) and (4.41) that
L(y + η) − L(y) = ∫_a^b [ α(x)η ′² + β (x)η² ] dx + ∫_a^b (ε1 η² + ε2 η ′² ) dx > 0,
a a
for sufficiently small ||η||1 . Therefore, we conclude that the extremal y = y(x) is a
relative minimum of the functional (4.16). This completes the proof of 1. The proof of
2. is not trivial, and it follows along the lines of the proof of 1.
Example 4.12 Consider the functional
π/2
L(y) = ((y′ )2 − y2 )dx, y(0) = 1, y(π/2) = 0. (4.42)
0
has the nontrivial solution h(x) = sin(x). Clearly, h(0) = 0, and there are no other
points a∗ ∈ (0, π/2] such that h(a∗ ) = 0. Therefore, the interval [0, π/2] admits no
conjugate points. Moreover, the Legendre condition
4.4.1 Exercises
Exercise 4.30 Find the extremal function for
1
(y′ )2 + y2 + 2yex dx,
L(y) = y(0) = 0, y(1) = 1
0
and show it minimizes the functional L.
Exercise 4.31 Find the extremal function for
π/4
(y′ )2 /2 − 4y dx,
L(y) = y(0) = 0, y(π/4) = 1
0
and show it minimizes the functional L.
Exercise 4.32 Find the extremal function for
2
L(y) = x2 (y′ )2 + y′ )dx, y(1) = 1, y(2) = 3
1
and show it minimizes the functional L.
Exercise 4.35 Let g(x) be continuous and positive on the interval [a, b] with a > b >
0. Show that if y = y(x) is an extremal for the functional
b
L(y) = g(x)(y′ )2 dx,
a
4.5 Applications
This section is devoted to applications of the calculus of variations. We will look into
familiar problems in physics such as the minimal surface, geodesics on a sphere, and the
brachistochrone problem.
Minimal surface area
Suppose we have a curve y given by y = f (x) that is continuous on [a, b]. For sim-
plicity, we assume f (x) > 0 on [a, b]. The goal is to find the curve passing thorough
the points P(a, A) and Q(b, B) which when rotated about the x-axis gives a minimum
surface area. This is depicted in Fig. 4.5.
Let ds be the arc length element of PQ. Then at any point on the curve, ds rotates through a
distance 2πy around the x-axis. Hence the sectional area is 2πy ds = 2πy (ds/dx) dx. There-
fore, the total surface area is ∫_a^b 2πy (ds/dx) dx = ∫_a^b 2πy √(1 + y′² ) dx. We must minimize
the functional
L(y) = ∫_a^b 2πy √(1 + y′² ) dx,   y(a) = A,  y(b) = B.
FIGURE 4.5
Surface of revolution; minimal surface area.
Since F = y√(1 + y′² ) is independent of x (the constant factor 2π may be dropped), we use
F − y′ Fy′ = C.
This gives y/√(1 + y′² ) = C, and separating the variables leads to
∫ C dy / √(y² − C² ) = x + K.
Let y = C cosh(t). Using the identity cosh² (u) − sinh² (u) = 1, and dy/dt = C sinh(t), we
have
∫ C dy / √(y² − C² ) = ∫ C · ( C sinh(t) / (C sinh(t)) ) dt = Ct.
But y/C = cosh(t), which implies that t = cosh^{−1} (y/C). Thus after integrating both sides
we end up with
C cosh^{−1} (y/C) = x + K,
or,
cosh^{−1} (y/C) = x/C + K.
Taking the hyperbolic cosine of both sides leads to
y = C cosh( x/C + K ),
FIGURE 4.6
Brachistochrone curve.
where the constants C and K can be found using the boundary conditions. The graph
of the solution is a catenary.
In engineering, catenaries are frequently used in designing bridges, roofs, and
arches.
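In practice the constants C and K must usually be found numerically from the two boundary conditions. The following Python sketch (added for illustration; the boundary data (0, 1) and (1, 1.5) are hypothetical, not from the text) does this with a root finder:

import numpy as np
from scipy.optimize import fsolve

a, A, b, B = 0.0, 1.0, 1.0, 1.5        # hypothetical end points (a, A) and (b, B)

def equations(p):
    C, K = p
    return [C * np.cosh(a / C + K) - A,
            C * np.cosh(b / C + K) - B]

C, K = fsolve(equations, x0=[1.0, 0.0])
print(C, K)
print(C * np.cosh(a / C + K), C * np.cosh(b / C + K))   # reproduces A and B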
Brachistochrone curve
A brachistochrone curve, also known as a curve of fastest descent in physics and
mathematics, is the curve on a plane between a point A and a lower point B, where B
is not directly below A, on which a bead slides frictionlessly under the influence of a
uniform gravitational field to a given end point in the shortest amount of time. Johann
Bernoulli posed the issue in 1696, asking: “Given two points A and B in a vertical
plane, what is the curve sketched out by a point acting only under the influence of
gravity, which starts at A and reaches B in the shortest time?” For the mathematical
set up, we assume a mass m with initial velocity zero slides with no friction under the
force of gravity g from a point A(x1 , y1 ) to a point B(x2 , y2 ) along a wire defined by
a curve y = f (x) in the xy-plane (x1 < x2 , y1 > y2 ). Which curve leads to the fastest
time of descent? See Fig. 4.6.
A variational problem can be formulated by computing the time of descent t for a
fixed curve connecting the points A and B. Let s denotes the distance traveled and
ds ds
v = v(t) represents the velocity. Then v = , which implies that dt = . The arc
p dt v
length ds of AB is ds = 1 + y′2 . To obtain an expression for v we use the fact that
energy is conserved through the motion; that is
(kinetic energy at t > 0) + (potential energy at t > 0) = (kinetic energy at t = 0) +
(potential energy at t = 0). This translate into
1 2
mv + mgy = 0 + mgy1 . (4.43)
2
Solving for v we get p
v= 2g(y1 − y(x)).
Using the obtained values of ds and v in dt = ds/v gives
dt = √(1 + y′² ) / √( 2g(y1 − y(x)) ) dx,
which reduces to
(dy/dx)² = ( 1 − C²(y1 − y) ) / ( C²(y1 − y) ).
Solving for dy/dx and separating the variables, it follows that
dx = − √( (y1 − y) / (C1 − (y1 − y)) ) dy,   C1 = C^{−2} .
The negative sign is due to the fact that dy/dx < 0. Integrating both sides and using the
transformation y1 − y = C1 sin² (ϕ/2), we obtain x = (C1 /2)( ϕ − sin(ϕ) ) + C2 . The
solution is then
y1 − y = C1 sin² (ϕ/2),   x = (C1 /2)( ϕ − sin(ϕ) ) + C2 ,
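The cycloid really is faster than the straight chute. The Python sketch below (illustrative only; it assumes C1 = 1, C2 = 0, a drop from (0, 1) to (π/2, 0), and g = 9.81) compares the descent times:

import numpy as np
from scipy.integrate import quad

g, y1, C1 = 9.81, 1.0, 1.0
x2 = C1 / 2 * np.pi                       # end point of the cycloid for phi in [0, pi]

# Along the cycloid, dt = ds/v simplifies to sqrt(C1/(2g)) dphi, so
t_cycloid = np.sqrt(C1 / (2 * g)) * np.pi

# Straight line from (0, 1) to (x2, 0); the x^{-1/2} singularity at 0 is integrable.
slope = -y1 / x2
integrand = lambda x: np.sqrt(1 + slope**2) / np.sqrt(2 * g * (-slope) * x)
t_line, _ = quad(integrand, 0, x2)

print(t_cycloid, t_line)                  # roughly 0.71 s versus 0.84 s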
the shortest distance between two points in a plane. Next, we formulate the problem
into a variational equation and find its solution. Let a > 0 and consider the sphere
centered at the origin with radius a,
x² + y² + z² = a².
In spherical coordinates we write
x = a sin(θ ) cos(φ ),   y = a sin(θ ) sin(φ ),   z = a cos(θ ),
where θ is the angle from the positive z-axis, φ is the angle from the positive x-axis,
and r = a is constant.
∂x ∂x
dx = dθ + dφ ,
∂θ ∂φ
∂y ∂y
dy = dθ + dφ ,
∂θ ∂φ
and
∂z ∂z
dz = dθ + dφ .
∂θ ∂φ
As a consequence, we arrive at
y′ = c csc² (x) / √( 1 − c² (1 + cot² (x)) ).
Separating the variables and then integrating both sides yields
y = ∫ c csc² (x) / √( 1 − c² (1 + cot² (x)) ) dx + constant.
Let u = c cot(x). Then du = −c csc² (x)dx and the above integral reduces to
y = − ∫ du / √( 1 − c² − u² ) = cos^{−1} ( u / √(1 − c² ) ) + d,
for some constant d. This implies that
u
√ = cos(y − d),
1 − c2
or p
c cot(x) = 1 − c2 cos(y − d).
Finally, replacing x by θ and y by φ , leads to the solution
p
c cot(θ ) = 1 − c2 cos(φ − d).
Next we try to make some sense out of this solution. Multiply both sides by
a sin(θ ), where a is the radius of the sphere and at the same time use cos(u − v) =
cos(u) cos(v) + sin(u) sin(v) to get
p
ca cos(θ ) = 1 − c2 a cos(d) sin(θ ) cos(φ ) + a sin(d) sin(θ ) sin(φ ) .
Recall that, we are in spherical coordinates, and so the above equation takes the form
in rectangular coordinates
p
cz = 1 − c2 cos(d)x + sin(d)y , c2 ∈ (0, 1)
which represents an equation of the plane that intersects the sphere. Since the plane
passes through the centre of the sphere, which is the origin, the section of the sphere
by the plane is the great circle, or geodesic. All sections of other planes are small
circles. This great circle has two arcs between P and Q; the major arc, and the minor
has the minimum length. This is the geodesic on the surface of a sphere. Recall, a
geodesic on a given surface is a curve lying on that surface along which distance
between two points is as small as possible.
4.5.1 Exercises
Exercise 4.37 Show that the shortest path between two points on a circular cylinder
is along the circular helix joining them. Assume the two points are not on a generator.
Hint: use cylindrical coordinates to parametrize the circular cylinder x2 + y2 = a2 .
Let P(a, θ1 , z1 ) and Q(a, θ2 , z2 ). Compute ds and then integrate to obtain the varia-
tional that needs to be minimized.
p
Hint: Let x = a cos(θ ), y = a sin(θ ), z = z(θ ). Show ds = a2 + [z′ (θ )]2 dθ .
Exercise 4.38 Find the geodesics on a right circular cone. Use spherical coordi-
nates
x = u sin(α) cos(v),   y = u sin(α) sin(v),   z = u cos(α),
q q
to show ds = 1 + u2 sin2 (α)(v′ )2 du and minimize 1 + u2 sin2 (α)(v′ )2 du,
where α is the apex angle. If you replace u with x and v with y, then you are to
minimize q
L(y) = 1 + x2 sin2 (α)(y′ )2 dx.
Exercise 4.39 [Hanging chain] Let y = y(x) be the curve configuration of a uniform
inextensible heavy chain hanging from two fixed points P(a, A) and Q(b, B) at rest in
a constant gravitational field. For mathematical convenience assume the rope density
and gravity are both one. Show that the shape of the curve y is a catenary.
Answer: y(x) = c cosh x−d
c , for constants c and d.
Exercise 4.40 [Minimal surface] Consider the solution of the Minimal surface prob-
lem
x
y = C cosh( + K).
C
Show that under the boundary conditions
L L
y(− ) = y( ) = 1,
2 2
the constant K = 0.
Hint: Make use of the identities cosh(x + y) and cosh(x − y).
y(x) + εη(x),
where y is an extremal function for the functional L(y) given by (4.45). In the func-
tional L(y) replace y by y + εη to arrive at
L(ε) = ∫_a^b F(x, y + εη, y′ + εη ′ , y′′ + εη ′′ ) dx.
Once y and η are assigned, L(ε) has an extremum at ε = 0. But this is possible
only when
dL(ε)/dε = 0 when ε = 0.
Suppressing the arguments in F, computing dL(ε)/dε, and noticing that dx/dε = 0, we obtain
dL(ε)/dε |_{ε=0} = ∂/∂ε ∫_a^b F(x, y + εη, y′ + εη ′ , y′′ + εη ′′ ) dx |_{ε=0}
                 = ∫_a^b [ Fy η + Fy′ η ′ + Fy′′ η ′′ ] dx.
a
We perform an integration by parts on the second and third terms in the integrand.
Let dv = η ′ (x)dx, and u = ∂∂ yF′ . Then
d ∂F
v = η(x) and du = dx.
dx ∂ y′
It follows that
∫_a^b Fy′ η ′ dx = − ∫_a^b d/dx Fy′ η(x) dx,
since η(a) = η(b) = 0. Performing integration by parts twice on the third term
gives
∫_a^b Fy′′ η ′′ dx = ∫_a^b d²/dx² Fy′′ η dx.
Consequently, we have
∫_a^b [ Fy − d/dx Fy′ + d²/dx² Fy′′ ] η(x) dx = 0.
The aforementioned findings are easily generalized to functionals with nth order
derivatives. Let y = y(x) ∈ Cn [a, b] and consider the variational with nth order deriva-
tives b
L(y) = F(x, y, y′ , y′′ , y′′′ , . . . , y(n−1) , y(n) )dx
a
and boundary conditions
subject to
y(0) = 0, y′ (0) = 1, y(1) = −1, y′ (1) = 2.
The corresponding necessary Euler-Lagrange condition is
y(4) − y′′ = 0.
y(x) = c1 + c2 x + c3 ex + c4 e−x .
3 − 2e−1 e + 2(e−1 − 1)
c3 = , c4 = .
e − e−1 1 − e−2
□
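The fourth-order boundary value problem quoted above can also be handled symbolically. The SymPy sketch below (illustrative only; the functional itself is not reproduced here, only its Euler-Lagrange equation and boundary conditions) determines the constants c1, . . . , c4:

import sympy as sp

x = sp.symbols('x')
y = sp.Function('y')

ode = sp.Eq(y(x).diff(x, 4) - y(x).diff(x, 2), 0)
sol = sp.dsolve(ode, y(x),
                ics={y(0): 0, y(x).diff(x).subs(x, 0): 1,
                     y(1): -1, y(x).diff(x).subs(x, 1): 2})
print(sp.simplify(sol.rhs))   # of the form c1 + c2*x + c3*exp(x) + c4*exp(-x)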
Generalizations to variational involving several variables.
Let y, z ∈ C2 [a, b], and consider the variational with two variables y and z
b
L(y, z) = F(x, y, y′ , z, z′ )dx, (4.49)
a
Let η1 = η1 (x) ∈ C([a, b]) and η2 = η2 (x) ∈ C([a, b]) be such that
η1 (a) = η1 (b) = 0,   η2 (a) = η2 (b) = 0,
and replace y by y + εη1 and z by z + εη2 in (4.49). Proceeding as before (note that
dx/dε = 0), setting dL(ε)/dε |_{ε=0} = 0 and integrating by parts the terms that involve
η1′ and η2′ , we arrive at
∫_a^b [ Fy − d/dx Fy′ ] η1 (x) dx + ∫_a^b [ Fz − d/dx Fz′ ] η2 (x) dx = 0,
which must hold for all η1 (x), η2 (x). So without loss of generality, we assume it holds
for η2 (x) = 0. Then, we have
∫_a^b [ Fy − d/dx Fy′ ] η1 (x) dx = 0,
and by Lemma 10,
Fy − d/dx Fy′ = 0.
Substituting this back into the above integral gives
b
d
Fz − F ′ η2 (x)dx = 0,
a dx z
d
and by Lemma 10, we see that Fz − dx Fz′ = 0. As a consequence, we state the fol-
lowing theorem.
Theorem 4.14 [Euler-Lagrange equation] If the functions y = y(x), z = z(x) are ex-
tremal of the variational problem in (4.49), then y(x), z(x) must satisfy the pair
of Euler-Lagrange equations
Fy − d/dx Fy′ = 0,   Fz − d/dx Fz′ = 0.
Again, the above discussion can be generalized to a variational with n variable func-
tions. To see this, we assume each of yi = yi (x) ∈ C([a, b]), i = 1, 2, . . . n is an extremal
for the variational
b
L(y1 , y2 , . . . , yn ) = F(x, y1 , y2 , . . . , yn , y′1 , y′2 , . . . , y′n )dx,
a
with
yi (a) = Ai , yi (b) = Bi , i = 1, 2, . . . , n.
Then each of yi = yi (x), i = 1, 2, . . . n must satisfy the necessary Euler-Lagrange
equation
d
Fyi − Fy′i = 0, i = 1, 2, . . . n.
dx
4.6.1 Exercises
Exercise 4.41 Find the extremal y(x) for the variational
1
L(y) = (1 + y′′2 )dx, y(0) = 0, y′ (0) = 1, y(1) = 1, y′ (1) = 1.
0
Exercise 4.42 Find the extremals y = y(x), z = z(x) for the variational
π/4
L(y, z) = (4y2 + z2 − y′2 − z′2 )dx
0
subject to
y(0) = 1, y(π/4) = 0, z(0) = 0, z(π/4) = 1.
Exercise 4.43 Find the extremals y = y(x), z = z(x) for the variational with boundary
conditions π/4
L(y, z) = (4y2 + z2 + y′ z′ )dx,
0
y(0) = 1, y(π/4) = 0, z(0) = 0, z(π/4) = 1.
Hint: Solving for the constants will be messy.
Exercise 4.44 Prove parts (a) and (b) of Remark 19.
Exercise 4.45 Use Exercise 4.44 to show that for constants c1 and c2 the Euler-
Lagrange equation of the variational
b
(1 + y′2 )2
L(y) = dx
a y′′
is
c1 y′ + c2
y′′ =1
(1 + y′2 )2
and solve the differential equation.
Hint: Use the transformation y′ = tan(u) to solve the differential equation.
Exercise 4.46 Find the extremals y = y(x), z = z(x) (no need to solve for the con-
stants) for the variational
1
z′2 + (y′2 − 1)2 + z2 + yz dx
L(y, z) =
0
Exercise 4.47 An elastic beam has vertical displacement y(x), x ∈ [0, l]. (The x-axis
is horizontal and the y-axis is vertical and directed upwards.) Let ρ be the load per
unit length on the beam. The ends of the beam are supported, that is, y(0) = y(l) = 0.
Then the displacement y minimizes the energy functional
l
1 2
D y′′ (x) + ρgy(x) dx,
L(y) =
0 2
where D, ρ and g are positive constants. Write down the differential equation and the
rest of the boundary conditions that y(x) must satisfy and then show that the solution
is
ρg
y(x) = − x(l − x)[l 2 + x(l − x)].
24D
Exercise 4.48 Find the extremal y = y(x), z = z(x), for the variational with bound-
ary conditions
π/2
L(y, z) = (y′2 + z′2 + 2yz)dx
0
y(0) = 1, y(π/2) = 1, z(0) = 0, z(π/2) = −1.
Answer: y(x) = sin(x), z(x) = − sin(x).
Exercise 4.49 Find the extremal y = y(x), z = z(x) for fixed end points of the varia-
tional b
L(y, z) = (2yz − 2y2 − (y′ )2 + (z′ )2 )dx.
a
Exercise 4.50 Find the extremal y = y(x), z = z(x) for fixed end points of the varia-
tional b
L(y, z) = (y′ z′ + y2 + z2 )dx.
a
Exercise 4.51 Find the extremal y = y(x), z = z(x) for the variational with boundary
conditions 1
L(y, z) = (2y + (y′ )2 + (z′ )2 )dx,
0
3
y(0) = 1, y(1) = , z(0) = 1, z(1) = 1.
2
Answer: y(x) = 1 + x2 /2, z(x) = 1.
Such a problem is called free endpoints problem. Note that y(b) takes values at the
vertical line x = b, as illustrated in Fig. 4.7. It seems that if y is an extremal, additional
condition(s) must be imposed at the second boundary point x = b. Most of the next
derivations are similar to those in Theorem 4.3.
Let η = η(x) ∈ C2 ([a, b]) with η(a) = 0. In the functional L(y) replace y by y + εη.
Setting
dL(ε)
dε ε=0
we arrive at
bh
∂F ∂F i
(x, y + εη, y′ + εη ′ )η + ′ (x, y + εη, y′ + εη ′ )η ′ dx .
a ∂y ∂y ε=0
FIGURE 4.7
Free boundary condition at x = b.
We perform an integration by parts on the second term in the integrand of the above
integral.
∫_a^b (∂F/∂y′ ) η ′ dx = [ (∂F/∂y′ ) η(x) ]_a^b − ∫_a^b d/dx (∂F/∂y′ ) η(x) dx
   = (∂F(b, y(b), y′ (b))/∂y′ ) η(b) − (∂F/∂y′ )|_{x=a} η(a) − ∫_a^b d/dx (∂F/∂y′ ) η(x) dx
   = (∂F(b, y(b), y′ (b))/∂y′ ) η(b) − ∫_a^b d/dx (∂F/∂y′ ) η(x) dx,
Since (4.51) holds for all values of η, it must hold for η also satisfying the condition
η(b) = 0. Hence
bh
∂F d ∂ F i
− η(x)dx = 0,
a ∂ y dx ∂ y′
and by Lemma 10 , it follows that
d ∂F ∂F
− = 0, (4.52)
dx ∂ y′ ∂y
Similar results can be easily obtained for cases when y(a) is unspecified or both y(a)
and y(b) are unspecified. We summarize the results in the next theorem but first we
state
Fy′ (a, y(a), y′ (a)) := Fy′ x=a = 0.
(4.54)
Then Fy′ = 2y′ , and hence Fy′ |_{x=1} = 2y′ (1) = 0. Moreover, the corresponding Euler-
Lagrange equation is y′′ − y = 0. Thus, we are left with solving the second-order
differential equation
FIGURE 4.8
Boat route.
the banks with speed v(x). The boat’s constant speed is c such that c2 > v2 . Assume
(0, 0) is the departure point. We are interested in finding the route that the boat should
take to reach the opposite bank in the shortest possible time.
To do so, we assume the boat moves along a path y = y(x). Let α be the angle at
which the boat is steered. Then the velocity of the boat in the river is
dy dy/dt v + c sin(α) v
y′ = = = = sec(α) + tan(α).
dx dx/dt c cos(α) c
On the other hand, the time T required to cross the river is
T = ∫_0^b t′ (x) dx = ∫_0^b (dt/dx) dx = ∫_0^b 1/(dx/dt) dx = ∫_0^b (1/c) sec(α) dx.
Or
(cy′ − v sec(α))2 = c2 tan2 (α) = c2 (sec2 (α) − 1).
After rearranging the terms we arrive at the quadratic equation in sec(α),
that we need to solve. Since sec(α) > 0 in the first quadrant we have that
FIGURE 4.9
Brachistochrone free end point.
dx
since = 0. We perform an integration by parts on the second and third terms in
dε
the integrand. After some work we end up with
bh i x=b x=b d x=b
Fy η + Fy′ η ′ + Fy′′ η ′′ dx = Fy′ η + Fy′′ η ′ − Fy′′ η
a x=a x=a dx x=a
b 2
d d
+ F ′′ − Fy′ + Fy ηdx
2 y
a dx dx
d x=b
= Fy′′ η ′ − Fy′′ − Fy′ η
dx x=a
b 2
d d
+ F ′′ − Fy′ + Fy ηdx.
a dx2 y dx
Then a combination of the following natural boundary conditions are needed when
one or more boundary condition is unprescribed or unspecified. To be specific, we
may deduce from (4.60) the following:
Recall that in order for y(x) to be an extremal of (4.58) it must satisfy the Euler-
Lagrange equation given by
d2 d
2
Fy′′ − Fy′ + Fy = 0,
dx dx
that readily follows from (4.59). We have the following example.
258 Calculus of Variations
with solution
c1 = 3.35786, c2 = 0.80197, c3 = 4.55589, c4 = −3.15983.
□
Next, we provide an application for reducing a cantilever beam’s potential energy. A
more general case of the study of beam will be considered in Chapter 5. As a result of
an underlying force that pulls a body toward its source, a system has a propensity to
reduce potential energy. Or shoving a body away if the force is repellent. As a result,
the distance is reduced, which reduces potential energy. Hence, potential energy is a
measure of potential movement; potential energy is a measure of potential motion.
Clearly, if the two attractive bodies are already together, there is no movement and
no potential energy. Now, this justification holds true for both elastic and electric
potential energy. There is an underlying force that moves the material in each of these
instances. While these forces can produce movement, if their nature is attracting, the
corresponding potential energy increases with distance. In conclusion, the support
situation, profile (form of the cross-section), geometry, equilibrium situation, and
material of a beam are its defining characteristics.
FIGURE 4.10
Clamped Beam at both end points.
Example 4.20 Suppose we have a beam of length L with small transverse displace-
ment y(x) under transverse load q(x). The beam is subject to infinitesimal deflections
only. According to the force and moment balance approach, the displacement is gov-
erned by the fourth-order differential equation
eI d⁴y/dx⁴ = q(x),                                                                        (4.65)
where e is the modulus of elasticity of the beam’s material and I(x) is the moment of
inertia of the beam’s cross-sectional area about a point x. We are interested in min-
imizing the potential energy. It is thought that applying the minimal total potential
energy approach will make future extensions of the beam equation into large deflec-
tions, nonlinear materials, and accurate modeling of shear forces between the cable
elements simpler than using force- and moment balances. The potential energy is a
combination of the strain energy,
(1/2) eI ( d²y/dx² )²,
or the deformed energy stored in the elastic member, plus the work potential. The work po-
tential is the negative of the work done by external forces, which is −qy. Thus, the total
potential energy is given by the variational
L(y) = ∫_0^L [ (1/2) eI ( d²y/dx² )² − q(x)y(x) ] dx,                                     (4.66)
where e, q, and I are known quantities. Note that the Euler-Lagrange equation of
(4.66) is (4.65). In what follows, we will consider different cases of conditions
corresponding to support systems for the beam, and we assume that e and I are con-
stants.
(I). The beam is clamped at each end, as Fig. 4.10 shows. In this case, we have the
four boundary conditions y(0) = y(L) = 0 and y′ (0) = y′ (L) = 0.
FIGURE 4.11
Clamped Beam at x = 0.
(II). The beam is only clamped at x = 0, as Fig. 4.11 shows. A beam that is fixed at
one end and free at the other end is known as a cantilever beam. This type of
beam is capable of carrying loads with both bending moment and shear stress
beam is capable of carrying loads with both bending moment and sheer stress
and is typically used when building bridge trusses or similar structures. The end
that is fixed is typically attached to a column or wall. The tension zone of a
cantilever beam, is found at the top of the beam with the compression zone at the
bottom of the beam. In such a case, we are considering a cantilever beam, which
is a rigid structure supported at one end and free at the other. We are assuming
small deflection of the beam since the end point x = L is unclamped. In this case
we need the natural boundary conditions (4.62) and (4.64). Conditions (4.62)
and(4.64) yields
eIy′′′ (L) = 0, and eIy′′ (L) = 0.
The condition y′′′ (L) = 0 means that the reaction force at x = L is zero. Similarly,
the condition y′′ (L) = 0 means that the reaction moment force at x = L is zero.
(III). We assume the beam is simply supported at the end points as depicted in Fig.
4.12. Simply supported beams are those that have supports at both ends of the
beam. These are most frequently utilized in general construction and are very
versatile in terms of the types of structures that they can be used with. A simply
supported beam has no moment resistance at the support area and is placed in a
way that allows for free rotation at the ends on columns or walls. In other words,
the beam is pinned at both ends, and no restrictions are imposed on y′ at x = 0
and x = L. The relevant natural boundary conditions in this instance are (4.61)
and (4.62) and as a consequence, we obtain y′′ (0) = 0 and y′′ (L) = 0.
(IV). Double overhanging: This is a simple beam with both ends extending beyond its
supports on both ends. Then all four natural boundary conditions (4.61)–(4.64)
are in play. Consequently, they yield
Physically, this means that the reaction force and moment at each end of the
beam must be zero under these circumstances.
□
Impact of y′′ on Euler-Lagrange Equation 261
q(x)
L
y(0) = 0 x y(L) = 0
y(x)
FIGURE 4.12
Simply supported beam.
Assume N and M are continuous with continuous partial derivatives on some subset
of R2 . Let
F(x, y) = N(x, y)y′′ + M(x, y).
Then, Fy′′ = N, and therefore
d d
F ′′ = N = Nx + Ny y′ .
dx y dx
Moreover,
d2
F ′′ = Nxx + Nxy y′ + Ny y′′ + Nyx y′ + Nyy y′2 .
dx2 y
In addition, Fy′ = 0, and Fy = Ny y′′ + My . Thus, the Euler-Lagrange equation
d2 d
2
Fy′′ − Fy′ + Fy = 2Ny y′′ + Nyy y′2 + Nxy y′ + My = 0,
dx dx
which is a second-order differential equation, and hence not all four boundary con-
ditions can be satisfied in most cases.
4.8.1 Exercises
Exercise 4.52 Let y = y(x) be an extremal of the variational
1
L(y) = (y′2 + y2 )dx.
0
262 Calculus of Variations
is
(x − B)2 + y2 = R2 ,
for appropriate constants B and R.
Exercise 4.61 Find the extremals y = y(x), z = z(x) for the variational
π
L(y, z) = (4y2 + z2 − y′2 − z′2 )dx;
0
dL(ε)
Set dε |ε=0 = 0 and integrate by parts to obtain,
c− h bh
∂F ∂F i ∂F ∂F i
η + ′ η ′ dx + η + ′ η ′ dx
a ∂y ∂y c+ ∂y ∂y
c− b
d d
= Fy − Fy′ ηdx + Fy − Fy′ ηdx
a dx + dx
c−
+ Fy′ c , y(c ), y (c ) η(c ) − Fy′ a, y(a), y′ (a) η(a)
− − ′ −
3) If y(a) is specified (y(a) = A) and y(b) is unspecified then the necessary condi-
tions (4.69), (4.71), (4.73), and (4.74) are needed.
4) If neither y(a) nor y(b) is specified then the necessary conditions (4.69), (4.71)–
(4.74) are needed.
Example 4 Consider the functional
1
L(y) = ( f (x)y′2 + y)dx, y(−1) = y(1) = 0
−1
with
1, −1 ≤ x < 0
f (x) =
2, 0 < x ≤ 1.
Obviously, we have discontinuity at c = 0. Regardless of the discontinuity, the Euler-
Lagrange equation is
d
(2 f (x)y′ ) − 1 = 0. (4.76)
dx
For −1 ≤ x < 0, we have 2y′′ − 1 = 0, with the general solution
x2
y(x) = + c1 x + c2 . (4.77)
4
Similarly, for 0 < x ≤ 1, we have 4y′′ − 1 = 0, with the general solution
x2
y(x) = + d1 x + d2 . (4.78)
8
An application of 0 = y(−1) to (4.77) gives
1
c2 − c1 = − .
4
Next apply 0 = y(1) to (4.77) and get
1
d1 + d2 = − .
8
An application of (4.69)
lim y(x) = lim y(x),
x→0− x→0+
yields
c2 = d2 .
Finally, condition (4.74) yields
c1 = 2d1 .
Next we substitute d1 = c1 /2, and d2 = c2 into d1 +d2 = − 18 to obtain c1 +2c2 = − 14 .
Finally, solving
1 1
c2 − c1 = − ; c1 + 2c2 = − ,
4 4
266 Calculus of Variations
one obtains
1 1
c1 = , c2 = − .
12 6
Also, It follows that
1 1
, d2 = − .
d1 =
24 6
In conclusion, the solution over the whole interval is given piecewise
2
1
x4 + 12
x − 16 , −1 ≤ x < 0
y(x) =
x2 + 1 x − 1 , 0 < x ≤ 1.
8 24 6
which is continuous at x = 0. □
4.9.1 Exercises
Exercise 4.65 Find the extrema y = y(x) for the functional
1
f (x)y′2 + 8y2 dx,
L(y) = y(−1) = 0, y(1) = 1,
−1
with
1
2, −1 ≤ x < 0
f (x) =
2, 0 < x ≤ 1.
Exercise 4.66 Find the extrema y = y(x) for the functional
π/2
L(y) = ( f (x)y′2 − 2y2 )dx, y(0) = 0, y(π/2) = 1,
0
with
2, 0 ≤ x < π/4
f (x) = 1
2, π/4 < x ≤ π/2.
−1, 0 ≤ x < 1/4
Exercise 4.67 Let f (x) = and consider the functional
1, 1/4 < x ≤ 1
1
L(y) = (y′2 f (x))dx.
0
ϕ(b, y(b))
ϕ(x, y(x))
y(a)
(0, 0) x
a b b + εξ
FIGURE 4.13
Transversality condition.
for ξ > 0.
Since the same point lies on the curve ϕ(x, y) = 0 we have that
ϕ b + εξ , y(b) + ε(ξ y′ (b) + η(b)) = 0. (4.80)
ξ ϕx + ϕy y′ (b) + η(b)ϕy = 0.
(4.81)
dL(ε)
Now, we are ready to compute dε . Let
b+εξ
L(y + εη) = F(x, y + εη, y′ + εη ′ )dx.
a
we see that
b+εξ
dL(ε) d ′ ′
= F(x, y + εη, y + εη )dx
dε dε a ε=0
′ ′
= F b + εξ , y(b + εξ ) + εη(b + εξ ), y (b + εξ ) + εη (b + εξ ) ξ
ε=0
bh i
∂F ∂F
+ (x, y + εη, y′ + εη ′ )η + ′ (x, y + εη, y′ + εη ′ )η ′ dx .
a ∂y ∂y ε=0
dL(ε)
Setting ε = 0, dε = 0 and integrating by parts, the above expression yields,
b
d
F b, y(b), y′ (b) ξ + Fy′ b, y(b), y′ (b) η(b) +
Fy − Fy′ η(x)dx = 0. (4.82)
a dx
Solving for ξ in (4.81) yields
η(b)ϕy
ξ =− .
ϕx + ϕy y′ (b)
The above relation (4.83) holds for all η(x), a ≤ x ≤ b and in particular it must hold
when η(b) = 0. Thus (4.83) implies
b
d
Fy − Fy′ η(x)dx = 0,
a dx
and so by Lemma 10 we arrive at
d
Fy − F ′ = 0. (4.84)
dx y
Transversality Condition 269
Reminder:
F = F(b, y(b), y′ (b)), and Fy′ = Fy′ (b, y(b), y′ (b)).
x=b x=b
Along the lines of the preceding discussion, if y(b) is fixed and the left end point
y(a) varies along a curve y = h(x), then the corresponding transversality condition
is h i
F + h′ (x) − y′ (a) Fy′ = 0. (4.88)
x=a
Thus, we proved the following theorem.
Theorem 4.17 Let y = y(x) ∈ C2 [a, b] be an extremal for the variational (4.79) with
boundary conditions specified or unspecified at x = a and x = b.
1) If y(a) moves along the curve y = h(x) and y(b) is specified (y(b) = B), then the
necessary conditions for y(x) to be an extremal of (4.79) are the Euler-Lagrange
equation given by (4.84) and (4.88).
2) If y(a) is specified (y(a) = A) and y(b) moves along the curve y = g(x), then the
necessary conditions for y(x) to be an extremal of (4.79) are the Euler-Lagrange
equation given by (4.84) and (4.87).
270 Calculus of Variations
3) If both endpoints are allowed to move freely along the curves h and g, then the
necessary conditions for y(x) to be an extremal of (4.79) are the Euler-Lagrange
equation given by (4.84), plus (4.87) and (4.88).
Natural boundary conditions can be easily derived from this discussion. For example,
if y(a) is fixed and y(b) varies along the line x = b, then ϕ(x, y) = x − b. This im-
plies that ϕx = 1, and ϕy = 0. Substituting into (4.85), we obtain Fy′ (b, y(b), y′ (b)) =
0.
Example 4.21 Find the shortest distance from the point (0, 0) to the nearest point on
the curve xy = 1, x, y > 0. Basically, by Example 4.5 we are to minimize
bq
L(y) = 1 + (y′ )2 dx, y(0) = 0
0
(b, y(b))
• (1, 1) 1
y= x
(0, 0) x
FIGURE 4.14
Shortest distance to a parabola.
Without loss of generality, we assume y(a) is fixed and y(b) slides or lies on a curve
defined by the equation ϕ(x, y) = 0. The set up is very identical to the one in Section
4.10. Thus, following the same derivations, we have, with slight modification due to
the presence of the function h that
L(y + εη) = h b + εξ , y(b + εξ + εη(b + εξ ))
b+εξ
+ F(x, y + εη, y′ + εη ′ )dx.
a
dL(ε) d
= h b + εξ , y(b + εξ ) + εη(b + εξ )
dε dε
b+εξ
d
+ F(x, y + εη, y′ + εη ′ )dx
dε a ε=0
′
= hx b, y(b) ξ + hy b, y(b) y (b)ξ + η(b)
bh i
+ F b, y(b), y′ (b) ξ + Fy (x, y, y′ )η + Fy′ (x, y, y′ )η ′ dx.
a
272 Calculus of Variations
After integrating by parts and rearranging the terms, the above expression simplifies
to
b
′ ′ d
hx b, y(b) + hy b, y(b) y (b) + F(b, y(b), y (b)) ξ + Fy − Fy′ η(x)dx
a dx
+ hy b, y(b)) + Fy′ (b, y(b), y′ (b) η(b).
(4.90)
The value of ξ is not affected by the presence of the function h and hence, using the
results of the previous section we see that
η(b)ϕy
ξ =− .
ϕx + ϕy y′ (b)
Suppose we can solve for y in terms of x in ϕ(x, y) = 0. If so, then we set y = g(x).
Now
d
ϕ(x, y) = ϕx + ϕy y′ = 0.
dx
This implies that
ϕx
y′ = − = g′ (x).
ϕy
We may solve for ϕx and obtain ϕx = −g′ (x)ϕy . As a consequence, we will
have
η(b)ϕy η(b)
ξ =− = .
ϕx + ϕy y′ (b) g′ (b) − y′ (b)
Substituting into (4.90) and factoring η(b) give
h h + h y′ + F b
x y
i d
+ hy + Fy′ η + Fy − Fy′ η(x)dx = 0. (4.91)
g′ − y′ x=b a dx
d
Fy − F′ =0 (4.92)
dx y
and the transversality condition
hx + hy y′ + F
+ hy + Fy′ ,
g′ − y′
which simplifies to
h i
hx + F + g′ hy + (g′ − y′ )Fy′ = 0. (4.93)
x=b
Note that the term y in (4.93) is the solution of the Euler-Lagrange equation given
by (4.92). Along the lines of the preceding discussion, if y(b) is fixed and the left
Transversality Condition 273
end point y(a) varies along a curve y = l(x), then the corresponding transversality
condition is
h i
hx + l ′ hy − F − (l ′ − y′ )Fy′ =0 (4.94)
x=a
4.10.2 Exercises
Exercise 4.68 Find the shortest distance from the point (a, A) to the nearest point
(b, y(b)) on the line with slope m, y = mx + c.
Exercise 4.69 Find the extremal y = y(x) for the functional
bp
1 + y′2
J(y) = dx, y(0) = 0
0 y
and y(b) varies along the circle
(x − 9)2 + y2 = 9.
274 Calculus of Variations
and y(b) varies along the curve y = g(x). Show that at the point x = b,
y = 0, or y = ±2x + B.
Easy to see that neither solution satisfy both boundary conditions. Thus, we suspect
̸ 0. So we assume c1 ̸= 0 and obtain
at least for now that c1 =
y2 − c1
y′2 = .
y2
After separating the variables we arrive at
y
dx = ± p dy.
y2 − c1
(x − c2 )2 = y2 − c1 ,
(−2 − c2 )2 = −c1
276 Calculus of Variations
y
• (2, 2)
(−2, 0)
• x
FIGURE 4.15
There is no smooth path that connects boundary conditions.
(2 − c2 )2 = 4 − c1 .
Solving for c1 in the first equation and substituting it into the second equation yields
(2 − c2 )2 = 4 + (2 + c2 )2 .
•
y1 y=
y= y2
•
(a, A) • (b, B)
x
x∗
FIGURE 4.16
Broken path with one corner point.
• (x̃∗ , ỹ∗ )
•
y = y(x)
•
(a, A) • (b, B)
x
x∗ x̃∗
FIGURE 4.17
Perturbing corner point x∗ with x̃∗ .
where
y1 (x), a ≤ x ≤ x∗
y(x) =
y2 (x), x∗ ≤ x ≤ b.
See Fig. 4.16
Now we perturb the corner point along with the broken extremal. See Fig. 4.17. Let
ξ1 , ξ2 be functions of x and positive. Then the perturbed point (x̃∗ , ỹ∗ ) must satisfy,
for the purpose of compatibility, the relations
x̃∗ = x∗ + εξ1 ,
We follow the same procedure as in Section 4.10. We write our variational as the sum
of two variations in the sense that
Similarly,
εη(x∗ + εξ1 ) = εη(x∗ ) + O(ε 2 ).
Substituting the two expressions into the right-side of the third equation of (4.96) and
then using the second equation of (4.96) yield
By doing similar work we obtain from L2 (y2 ), equation (4.98) and the additional
condition n o
− ξ1 [F − y′2 Fy′ ] − ξ2 Fy′ = 0. (4.101)
2 2 x=x∗
Combining conditions (4.100) and (4.101) we arrive at
n h i o
ξ1 F x, y1 , y′1 − y′1 Fy′ − F x, y2 , y′2 − y′2 Fy′ + ξ2 Fy′ − Fy′
= 0.
1 2 1 2 x=x∗
In light of the fact that the point of discontinuity is free to change, we can indepen-
dently change both ξ1 and ξ2 or set them both to zero. We can therefore divide the
condition into two conditions.
n o
F(x, y1 , y′1 ) − y′1 Fy′ − F(x, y2 , y′2 ) − y′2 Fy′
= 0,
1 2 x=x∗
Fy′ − Fy′ = 0.
1 2 x=x∗
The above corner conditions can be expressed in terms of limits from the left and
right rather than dividing y into y1 and y2 . That is
must hold at very corner point. The corners conditions given by (4.102) and (4.103)
are called Weirstrass-Erdmann corner conditions. We proved the following theo-
rem.
Theorem 4.18 For the functional (4.95) with one corner point x∗ ∈ (a, b) conditions
(4.102) and (4.103) must hold.
We note that (4.102) and (4.103) hold everywhere in (a, b) since if we are not at a
corner point y′ (x) is continuous as is Fy′ .
Back to Example 4.23. We saw that for c1 ̸= 0, then there is no smooth extremal that
connect both endpoints. Thus, we must look for an extremal that is piecewise defined
or has a corner. We are left with the choice of c1 = 0. In this case the Euler-Lagrange
equation has the two solutions
y = 0, or y = 2x + B.
280 Calculus of Variations
The branch of the solution y = 0, satisfies the first boundary condition y(−2) = 0. In
addition, the second part of the solution y = 2x + B satisfies y(2) = 2, for B = −2.
The corner conditions (4.102) and (4.103) are satisfied independently of the location
of the corner point in (−2, 2) since
Corollary 8 If Fy′ y′ ̸= 0, then an extremal for the functional (4.95) must be smooth.
That is it can not have corners.
Proof Let y0 be an extremal of (4.95) with a corner point at x∗ ∈ (a, b). Then from
the corner condition (4.103), we must have the continuity condition
That is
Fy′ x∗− , y(x∗− ), y′ (x∗− ) − Fy′ x∗+ , y(x∗+ ), y′ (x∗+ ) = 0.
(4.104)
Let p = y′ (x∗− ) and q = y′ (x∗+ ). Then by the Mean value theorem, there exists an
α ∈ (0, 1) such that
Fy′ x∗ , y(x∗ ), p) − Fy′ x∗ , y(x∗ ), q) = (p − q)Fy′ y′ x∗ , y(x∗ ), q + α(p − q) .
Or,
Fy′ y′ x∗ , y(x∗ ), q + α(p − q) = 0,
which is a contradiction to the fact that Fy′ y′ ̸= 0. This completes the proof.
Example 4.24 According to Corollary 8 the extremal of the variational
b
αy′2 + ϕ(y) + φ (x) dx, y(a) = A, y(b) = B
L(y) =
a
where ϕ and φ are continuous functions of y and x, respectively, has no corner points
when α ̸= 0, since Fy′ y′ = 2α. □
The next example shows that Fy′ y′ ̸= 0, is only a necessary condition.
Corners and Broken Extremal 281
was considered in Example 4.8, and it was shown that the functional was path inde-
pendent. In addition, the corresponding Euler-Lagrange equation is −2yy′ = 0, from
which we obtain either y(x) = 0 or y(x) = constant. Hence, neither one satisfies both
boundary conditions. Notice that Fy′ y′ = 0. Clearly, the path y0 (x) = 2x − 1 connects
both endpoints and it can be easily computed and verified that L(y0 (x)) = − 29 3 . (See
Example 4.8). Next, we construct another path with a corner point that will piecewise
connect both endpoints. Note that since the functional is path independent we may
assume a corner point anywhere in (1, 2). Thus we may take the corner point to be
at (3/2, 2), and we wish to construct a piecewise continuous and linear path in the
form of
A1 x + B1 , 1 ≤ x ≤ 3/2
y(x) =
A2 x + B2 , 3/2 ≤ x ≤ 2.
Applying y(1) = 1, y(2) = −1, we obtain B1 = 1−A1 and B2 = −1−2A2 . Therefore,
A1 x + 1 − A1 , 1 ≤ x ≤ 3/2
y(x) = (4.105)
A2 x − 1 − 2A2 , 3/2 ≤ x ≤ 2.
Applying the corner condition
lim Fy′ = lim Fy′
x→x∗− x→x∗+
at x∗ = 3/2 we arrive at
lim y(x) = lim y(x),
x→(3/2)− x→(3/2)+
or
3 3
A1 + 1 − A1 = A2 − 1 − 2A2 .
2 2
This results into
A1 + A2 = −4. (4.106)
Making use of the other corner condition
lim F − y′ Fy′ = F − y′ Fy′ ,
lim
x→(3/2)− x→(3/2)+
yields to
lim (3x2 y) = lim (3x2 y).
x→(3/2)− x→(3/2)+
Simplifying 3x2 from both sides, we arrive at the same expression (4.106). Due to
the continuity requirement at 3/2, we must have the solution match at 3/2. That is,
y(3/2) = 2. Applying this to the first branch of the solution we obtain
3
A1 + 1 − A1 = 2,
2
282 Calculus of Variations
4.11.1 Exercises
Exercise 4.76 In the spirit of Example 4.23 discuss the variational
1
L(y) = y2 (1 − y′ )2 dx, y(−1) = 0, y(1) = 1.
−1
Exercise 4.81 Redo Example 4.24, with corner point at (4/3, 5).
g(x, y) = k f (x, y) = 7
• f (x, y) = 6
f (x, y) = 5
f (x, y) = 4
(0, 0) x
FIGURE 4.18
Level curves and Lagrange multiplier.
multiplier. The same concept will be used to deal with variational problems with
constraints. First, we begin with a short review of Lagrange multipliers for finite-
dimensional optimization problems.
Suppose we want to find the extreme value of the function f (x, y) subject to the
constraint g(x, y) = k, for a fixed constant k. In other words, if f (x.y) has an extrema
at (x∗ , y∗ ), then (x∗ , y∗ ) must lie on the level g(x, y) = k. To maximize f (x, y) subject
to g(x, y), = k is to find the largest value of c such that the level curve f (x, y) = c
intersects g(x, y) = k. In Fig 4.18, c = 4, 5, 6, 7. Also, it appears from Fig. 4.18 that
this happens when the curves touch each other; that is, when they have a common
tangent line. (Otherwise, the values of c could be increased further.) This can only
mean that the normal lines at (x0 , y0 ) where they touch are identical. This implies
that the gradient vectors are parallel. In other words,
∇ f (x0 , y0 ) = λ ∇g(x0 , y0 ),
for some scalar λ . The number λ is called a Lagrange multiplier. The next theorem
can be found in any advanced calculus textbook.
Theorem 4.19 (Lagrange Multiplier Rule) Let f and g be differentiable functions
with gx (x0 , y0 ) and gy (x0 , y0 ) not both zero. If (x0 , y0 ) provides an extreme value to
f (x, y) = 0 subject to the constraint g(x, y) = k, then there exists a constant λ such
that
fx∗ (x0 , y0 ) = 0, fy∗ (x0 , y0 ) = 0,
and g∗ (x0 , y0 ) = k, where f ∗ = f + λ g.
The above theorem is valid for functions in Rn . Recall that fort x ∈ Rn and f : Rn → R
is a smooth function, then the gradient of f , denoted by ∇ f is the vector
∂f ∂f ∂f
∇ f =< , ,..., >.
∂ x1 ∂ x2 ∂ xn
284 Calculus of Variations
∇ f (x∗ ) = λ ∇g(x∗ ).
Let y, z ∈ C2 [a, b], and consider the variational with two variables y and z
b
L(y, z) = F(x, y, y′ , z, z′ )dx, (4.107)
a
dϕ
δ ϕ(x, y, z) = (y + εη1 , z + εη2 )ε=0
dε
= ϕy (y + εη1 , z + εη1 )η1 + ϕz (y + εη2 , z + εη2 )η2
ε=0
= ϕy (y, z)η1 + ϕz (y, z)η2 .
Multiply the above expression with Lagrange multiplier λ and then integrate the
resulting equation from a to b to obtain
b
λ ϕy (y, z)η1 + ϕz (y, z)η2 dx = 0. (4.110)
a
d
Fy − F ′ − λ ϕy = 0 (4.112)
dx y
and
d
Fz − F ′ − λ ϕz = 0. (4.113)
dx z
Theorem 4.21 Let y, z ∈ C2 [a, b] be extremals for the variational (4.107) with bound-
ary conditions specified at x = a and x = b, subject to the constraint function (4.108).
Then y(x) and z(x) must satisfy the Euler-Lagrange equations given by (4.112) and
(4.113).
The next theorem easily generalizes Theorem 4.21 to n constraints functions and its
proof is Exercise 4.82.
Theorem 4.22 Let y, z ∈ C2 [a, b] be extremals for the variational (4.107) with bound-
ary conditions specified at x = a and x = b, subject to the n constraints
ϕi (y, z) = 0, i = 1, 2, . . . n.
Example 4.26 Find the extremals y and z that minimize the functional
π/2
1 + y′2 + z′2 dx, y(0) = z(0) = y(π/2) = z(π/2) = 0,
L(y, z) =
0
y′′ + λ y = 0, z′′ + λ z = 0.
Remember λ ∈ R and so, special care must be applied. We will do this in three
separate cases.
case 1 λ = 0. In this case the general solution for the first differential equation is
y(x) = c1 x + c2 .
Applying the boundary condition, we get c1 = c2 = 0. Again, this results in the trivial
solution y(x) = 0, which has to be rejected since it does not satisfy the constraint.
case 3. λ > 0. Say λ = α 2 , where α > 0. Then the general solution is
Next we evaluate L at the obtained y and z to see if they minimize L since the inte-
grand of L is positive for all functions y and z.
π/2
1 + 4n2 (c2 + d 2 ) cos2 (2nx) dx
L c sin(2nx), d sin(2nx) =
0
π/2
π
= + 4n2 (c2 + d 2 ) cos2 (2nx)dx
2 0
π
= + πn2 (c2 + d 2 ). (4.114)
2
Note that expression (4.114) is increasing in n and therefore its minimum is achieved
when n = 1. That is y and z minimize L for n = 1. Therefore, the extremals are
where
c2 sin2 (2x) + d 2 sin2 (2x) = 5, 0 < x < π/2.
□
4.12.1 Exercises
Exercise 4.82 Prove Theorem 4.22.
Exercise 4.83 Show that if y, z, w ∈ C2 [a, b] are extremals for the variational
b
L(y, z) = F(x, y, y′ , z, z′ )dx,
a
Isoperimetric Problems 287
d
Fy − F ′ − λ ϕy = 0,
dx y
d
Fz −
F ′ − λ ϕz = 0,
dx z
d
Fw − Fw′ − λ ϕw = 0.
dx
Exercise 4.84 Use Exercise 4.83 to find the extremals y, z and w that minimizes the
functional
b
1 ′2 ′2
y + z + w′2 dx,
L(y, z, w) =
0 2
with boundary conditions
where d is a fixed constant. The fixed functions F and G are assumed to be twice
continuously differentiable. The subsidiary condition (4.116) is called isoperimetric
constraint. Before, we assumed a local extremal y(x) in a family of admissible func-
tions with respect to which we carry out the extremization. A one parameter family
y(x) + εη(x) is not , however, a suitable choice since those curves may not maintain
the consistency of W. Therefore, we introduce a two parameters family
where η1 , η2 ∈ C2 ([a, b]) such that η1 (a) = η1 (b) = η2 (a) = η2 (b) = 0, and ε1 and
ε2 are real parameters ranging over the intervals containing the origin. We make the
assumption that y is not an extremal of W. Therefore, for any choice of η1 and η2
there will be values ε1 and ε2 in the neighborhood of (0, 0), for which W (z) = d.
Let b
S1 (ε1 , ε2 ) = F(x, z, z′ )dx,
a
and b
S2 (ε1 , ε2 ) = G(x, z, z′ )dx = C.
a
Since y is a local extremal of (4.115), subject to the constraint (4.116), the point
(ε1 , ε2 ) = (0, 0) must be a local extremal for S1 (ε1 , ε2 ) subject to the constraint
S2 (ε1 , ε2 ) = C. This is just a differential calculus problem and so the Lagrange mul-
tiplier rule might be applied. That is, there must be a constant λ such that
∂ S∗ ∂ S∗
= = 0, at (ε1 , ε2 ) = (0, 0), (4.117)
∂ ε1 ∂ ε2
where b
∗
S = S1 + λ S2 = F ∗ (x, z, z′ )dx,
a
with
F ∗ = F + λ G.
Substituting z = y(x) + ε1 η1 (x) + ε2 η2 (x) into S∗ and then calculating partial deriva-
tives with respect to ε1 and ε2 we arrive at
∂ S∗ b
Fy∗ (x, y, y′ )ηi (x) + Fy∗′ (x, y, y′ )ηi′ (x) dx,
(ε1 , ε2 ) = i = 1, 2.
εi a
Setting
∂ S∗
(ε1 , ε2 ) = 0,
εi (ε1 ,ε2 )=(0,0)
followed by an integration by parts on the term that involves η ′ and then applying
Lemma 10, we arrive at the Euler-Lagrange equation
d ∗
Fy∗ (x, y, y′ ) −
F ′ (x, y, y′ ) = 0, (4.118)
dx y
which is a necessary condition for an extremal. We proved the following theo-
rem.
Theorem 4.23 Let y ∈ C2 [a, b]. If is y not an extremal of (4.116) but an extremal
for the variational (4.115) with boundary conditions specified at x = a and x = b,
subject to the isoperimetric constraint (4.116), then y(x) satisfies the Euler-Lagrange
equation (4.118), or
∂ dh ∂ i
F +λG − ′
F + λ G = 0.
∂y dx ∂ y
We furnish an example.
Isoperimetric Problems 289
Example 4.27 In this example, we show that the sphere is the solid figure of revo-
lution that, for a given surface area l, has the maximum volume. Consider a curve
y(x) ≥ 0 with y(0) = 0, and y(a) = 0, a > 0. Revolve y(x) along the x-axis. Then,
any short circular strip with a radius y and a height ds has a surface area of 2πy ds.
Consequently, the total surface area of revolution is
a a p
l= 2πy ds = 2πy 1 + y′2 dx.
0 0
a
On the other hand, the volume of the solid of revolution is 0 πy2 dx. Thus, the
problem can be formulated as a variational problem with constraints. In other words,
we want to maximize a
L(y) = πy2 dx,
0
subject to the constraint
a p
2πy 1 + y′2 dx = l (constant).
0
Set p
F ∗ = πy2 + λ 2πy 1 + y′2 .
Since x does not enter in F ∗ , we will use the Euler-Lagrange equation
F ∗ − y′ Fy∗′ = c.
Or,
p 2πλ yy′
πy2 + 2πλ y 1 + y′2 − y′ p = c,
1 + y′2
which simplifies to
2πλ y
πy2 + p = c.
1 + y′2
Now y = 0, at x = 0 and at x = a, which can be true if c = 0, and wherefore we have
2λ
y = −p . (4.119)
1 + y′2
By squaring both sides and rearranging the terms we obtain the solution
(x ± 2λ )2 + y2 = 4λ 2 .
Hence the obtained curve is a circle centered at (±2λ , 0) and radius 2λ . This shows
p the solid of revolution is a sphere. To find λ , we make use of (4.119) and obtain
that
y 1 + y′2 = −2λ . Substituting this into the integral constraint we arrive at
a p a
l= ′2
2πy 1 + y dx = 2π(−2λ ) dx = −4πλ a.
0 0
−l
This gives λ = 4πa .
□
The next theorem easily generalizes Theorem 4.23 and its proof is left as an exer-
cise.
Theorem 4.24 Let y, z ∈ C2 [a, b] be extremals for the variational
b
L(y, z) = F(x, y, z, y′ , z′ )dx,
a
where d is a fixed constant. If y and z are not extremals to (4.121), then they must
satisfy the Euler-Lagrange equations
∂ dh ∂ i
F +λG − F + λ G = 0, (4.122)
∂y dx ∂ y′
and dh ∂
∂ i
F +λG − ′
F + λ G = 0. (4.123)
∂z dx ∂ z
(x(t), y(t))
•
FIGURE 4.19
Dido’s area.
Example 4.28 [Dido’s problem] Most traditions identify Dido as the Phoenician
city-state of Tyre’s queen, who fled oppression to create her own city in northwest
Africa. Tyre is now in Lebanon. The legendary Dido requested a plot of land to farm
when she landed in Carthage (Tunisia) in 814 BC. Her request was accepted with
the provision that an n oxhide should encircle the area. She divided the oxhide into
incredibly tiny pieces and arranged them to completely enclose the available land.
View Figure 4.19. The problem comes down to finding the closed curve with a fixed
perimeter that encloses the maximum area. Let’s describe the curve by the parametric
equations (x(t), y(t)) with velocity (x′ (t), y′ (t)) and x(0) = x(1) and y(0) = y(1).
Then the length of the curve, or its perimeter is
q
x′2 (t) + y′2 (t)dt = d,
where d is the allowed perimeter. To find a formula for the enclosed area, we make
use of Green’s Theorem, which states that over a region D in the plane with boundary
∂ D we have
∂g ∂ f
f dx + gdy = − dxdy.
∂D D ∂x ∂y
If we set f = − 2y and g = 2x , we get
1
xdy − ydx = dxdy.
2 D D
y = x−1
Suspended chain
x
x=b
FIGURE 4.20
Catenary; transversality and natural conditions.
The next theorem addresses natural boundary conditions and transversality condi-
tions of variational that are subject to given constraints.
Theorem 4.26 Let y = y(x) ∈ C2 [a, b] be an extremal for the variational (4.79) with
boundary conditions specified or unspecified at x = a and x = b, and subject to the
constraint b
W (y) = G(x, y, y′ )dx = d.
a
For λ ∈ R, set
F ∗ = F(x, y, y′ ) + λ G(x, y, y′ ). (4.124)
Then 1)-3) of Theorem 4.17 hold when F is replaced with F ∗.
Example 4.29 (Catenary revisited) In Exercise 4.39 we presented the problem of
the hanging of heavy chain from two fixed points. Now, we are considering the length
of the chain to be l > 1. In addition, unlike the situation in Exercise 4.39 we let the
chain slides freely along the vertical line x = b. The left end of the cable, that is at
x = a the chain slides along a tilted pole or skewed line. Again, for mathematical
convenience we assume the chain density and gravity are both one. See Fig. 4.20.
Let y − x + 1 = 0 be the tilted pole. Then, we must have 1 ≤ a < b. Then the problem
is to minimize to potential energy
b p
L(y) = y 1 + y′2 dx,
a
294 Calculus of Variations
with y(b) being unspecified and y(a) moves along the curve y = h(x) = x − 1, subject
to the constraint bp
W (y) = 1 + y′2 dx = l.
a
Equivalently, we are to find the path y(x) that minimizes L, where
p p
F ∗ = y 1 + y′2 + λ 1 + y′2 .
Since F ∗ does not explicitly depend on the variable x, by Corollary 7 the Euler-
Lagrange equation that y must satisfy is
y′ Fy∗′ − F ∗ = D.
Or,
y′2 (y + λ ) p
p − (y + λ ) 1 + y′2 = D,
1+y ′2
which simplifies to
y+λ
p = D.
1 + y′2
Solving for y′ we arrive at
1
q
y′ = (y + λ )2 − D2 .
D
By letting
y + λ = D cosh(t),
and then imitating the work of Section 4.5 on minimal surface we arrive at the solu-
tion
x + c2
y + λ = c1 cosh( ), (4.125)
c1
where the constants c1 and c2 are to be found. We have a natural boundary condition
at x = b which implies that
y′ (y + λ )
Fy∗′ x=b = p
.
1 + y′2 x=b
This yields that y′ (b) = 0, or y(b) = −λ . Now, if y(b) = −λ , then (4.125) implies
that cosh( b+c2
c1 ) = 0, which can not be. Therefore, we must take
y′ (b) = 0.
yields
(y + λ )
p (1 + y′ ) = 0,
1 + y′2
or y′ (a) = −1. Combining this with y′ (x) = sinh( x−b
c1 ) we arrive at
b−a
sinh( ) = 1.
c1
Using the isoperimetric constraint we get
br
x−b b−a
1 + sinh2 ( )dx = c1 0 + sinh( ) = l.
a c1 c1
Combining
b−a b−a
sinh( ) = 1 and c1 sinh( ) = l,
c1 c1
−1
gives c1 = l. Thus, sinh( b−a
l ) = 1, from which we obtain b − a = l sinh (1). Using
(4.125) we obtain the solution
x−b
y(x) = −λ + l cosh( ).
l
Left to determine λ . Since y(a) lies on the line h(x) = x − 1, we have y(a) = a − 1.
In addition,
b−a
y(a) = −λ + l cosh( )
l
1
= −λ + l cosh sinh−1 (1)l
√ l
= −λ + l 2.
√
Setting y(a) = y(a), yields λ = 1 − a + l 2. Thus, the solution is
√ x−b
y(x) = a − 1 − l 2 + l cosh( ).
l
□
4.13.1 Exercises
Exercise 4.85 Prove Theorem 4.24.
296 Calculus of Variations
(x − 9)2 + y2 = 9.
subject to y2 + 2z = 2.
1
Exercise 4.95 Find an extremal corresponding to J(y) = −1 y dx when subject to
1 2 ′2
y(−1) = y(1) = 0 and −1 (y + y )dx = 1.
e
Exercise 4.96 Find an extremal
e
corresponding to J(y) = 1 x2 y′2 dx when subject
to y(1) = y(e) = 0 and 1 y2 dx = 1.
Exercise 4.97 Find an extremal corresponding to
π
J(y) = y′2 dx, y′ (0) = y′ (π) = 0,
0
π
when subject to 0 y2 dx = 1.
Exercise 4.98 Find an extremal corresponding to
1
L(y) = (y′′2 + x2 )dx, y(0) = y(1) = y′ (0) = y′ (1) = 0
0
and 1
W (y) = (y2 + 1)dx = 2.
0
Exercise 4.99 Find the curve of fixed length πa joining the two points (−a, 0) and
(a, 0) and situated above the x-axis such that the area below it and above the x-axis
is maximum.
Exercise 4.100 Consider Example 4.29, but this time the chain in freely sliding on
the line x = 0 (y-axis). Also, the right end of the chain is left free to slide on a tilted
pole, given by the equation cx + dy = cd, where c, d > 0. Find the equation of the
chain that minimizes the potential energy.
298 Calculus of Variations
where P, Q, R are continuous, and R is positive on [a, b]. Multiply both sides of the
above equation with
r(x) = e P(x)dx ,
and then by observing that
′
r(x)y′ = r′ (x)y′ + r(x)y′′ ,
where q(x) = r(x)Q(x), and p(x) = r(x)R(x) > 0 for all x ∈ [a, b]. For constants
α1 , α2 , β1 , β2 , we impose the boundary conditions
with
α12 + β12 ̸= 0; α22 + β22 ̸= 0.
The differential equation given by (4.126) along with (4.127) is called Sturm-
Liouville problem (SLP). There is a habitual relation between variational with
isoperimetric constraint and Sturm-Liouville problem. To see this, let y = y(x) ∈
C2 ([a, b]) be an extremal for the variational
b
L(y) = r(x)y′2 + q(x)y2 dx,
a
subject to
b
W (y) = p(x)y2 dx.
a
Then
F ∗ = r(x)y′2 + q(x)y2 + λ p(x)y2 ,
and
d ∗
Fy∗ − F ′ = 0,
dx y
implies that
r(x)y′′ + r′ (x)y′ − q(x)y − λ p(x)y = 0,
Sturm-Liouville Problem 299
which is equivalent to
′
r(x)y′ − q(x)y − λ p(x)y = 0.
Notice that the nontrivial y does not satisfy the Euler-Lagrange equation for the con-
straint W (y), since
−2p(x)y(x) = 0,
is not possible due to the fact that p(x) > 0 for all x ∈ [a, b]. Thus, we have
shown that the (SLP) can be recasted as variational problem with isoperimetric con-
straint.
The (SLP) has a wide range of applications. The boundary conditions make it con-
veniently suitable for standing wave. In addition, (SLP) models the one dimensional
time dependent Schrödinger equation
′ 2m
− ψ ′ (x) + 2 V (x)Ψ(x) − λ Ψ(x) = 0.
h
then we say f and g are orthogonal on [0, l] with respect to the weight function
p(x) > 0.
The integral on the left side of (4.128) is called the inner product of f and g and is
denoted by ( f , g). Thus,
l
( f , g) = f (x)g(x)p(x)dx.
0
It is then clear that an orthogonal set of functions can be made into an orthonormal set
by dividing each function in the set by its norm. The next theorem characterizes the
eigenvalues and eigenfunctions solutions of (SLP). Its proof can be found in various
places such as [1] of Chapter 2.
Theorem 4.27 For the Sturm-Liouville propblem given by (4.126) and (4.127) the
following statements hold.
i) The eigenvalues are real and to each eigenvalue there corresponds a single
eigenfunction up to a constant multiple.
ii) The eigenvalues form an infinite sequence −λ1 , −λ2 , . . . , −λn , . . . , and can be
ordered in a manner that 0 < −λ1 < −λ2 < −λ3 < . . . with
lim (−λn ) = ∞.
n→∞
iii) If −λm and −λn are two distinct eigenvalues, then the corresponding eigenfunc-
tions ym (x) and yn (x) are orthogonal on the interval [0, l].
Example 4.30 Consider the variational
e
L(y) = x2 y′2 dx, y(1) = y(e) = 0,
1
subject to e
W (y) = y2 dx = 1.
1
Then the corresponding Euler-Lagrange equation is
′
x2 y′ − λ y = 0,
or
x2 y′′ + 2xy′ − λ y = 0, (4.129)
which is a (SLP) with r(x) = x2 , q(x) = 0, and p(x) = 1. Using the method of Section
1.11, we arrive at
y′′ + y′ − λ y = 0,
where the independent variable x has been changed to the independent variable t
under the transformation x = et . If we assume solutions of the form y = emt , then we
have √
1 1 + 4λ
m=− ± .
2 2
Sturm-Liouville Problem 301
For λ = 0, we get the solution y(t) = c1 + c2 e−t , or y(x) = c1 + cx2 . Applying the
given boundary conditions we obtain c1 = c2 = 0, which corresponds to the trivial
solution (y = 0).
If 1 + 4λ > 0, then the solution is given by
Thus,
∞
∑ b2n = 2. (4.133)
n=1
Let yn (x) be given by (4.130) with c2 = 1. We have shown z(x) = ∑∞ n=1 bn yn (x),
with bn satisfying (4.133) is an extremal of the variational problem given in Example
4.30. However, if we want that same extremal to minimize the variational problem,
then we need to look deeper into the sequence of the eigenvalues λn , n = 1, 2, . . . .
Let’s evaluate the variational L at z. In the coming calculations we make use of the
following:
Let
sin(nπ ln(x)) cos(nπ ln(x))
fn (x) = √ , gn (x) = √ , x ∈ [1, e].
x x
Then,
( fn (x), fm (x)) = (gn (x), gm (x)) = 0, for all n ̸= m; n, m = 1, 2, . . . , (4.134)
and
( fn (x), gm (x)) = 0, for all n, m = 1, 2, . . . , (4.135)
The next argument is similar to the preceding one, so we skip some of the de-
tails.
e
L(z) = x2 z′2 dx
1
∞ e
sin2 (nπ ln(x))
e
cos2 (nπ ln(x))
∑ b2n
= + dx by (4.134) and (4.135)
n=1 1 4x 1 x
∞
1 n2 π 2
= ∑ b2n + . (4.136)
n=1 8 2
Since, the variational L is positive for non trivial solution, and the right side of (4.136)
is increasing in n, then it is likely that the minimum of L is achieved at n = 1, or at the
first eigenvalue −λ1 = π 2 + 14 . Recall, −λn = n2 π 2 + 14 , n = 1, 2, . . . . The extremal
eigenfunction that corresponds to the first eigenvalue is
√ 1
y1 (x) = ± 2x− 2 sin(π ln(x)).
Of course at this eigenfunction, W (y) = 1. Note that the number 1 in W (y) = 1, is
“symbolic”. As a matter of fact, the above discussion should hold for any number
l > 0, such that W (y) = l. Additionaly, (4.136) implies for n = 1 that
1
L(y1 (x)) = + π 2 = −λ1 ,
4
304 Calculus of Variations
where we have used b21 = 2 that was obtained from (4.133). Next, we make it clear
that y1 (x) minimizes L. Suppose there is another function f that minimizes L such
that f is different from y1 . Due to the completeness property of Fourier series, (see
Appendix A) f must be of the form ∑∞ n=1 bn yn (x). Since y1 and f differ, there is an
integer k ≥ 2 such that bK = ̸ 0. Thus, from (4.136) we get
∞
1 n2 π 2
L( f ) = ∑ b2n 8
+
2
n=1
1 ∞
π2 ∞ 2 2
= ∑ b2n + ∑ n bn
8 n=1 2 n=1
1 ∞ 2 π 2 K−1 2 2 ∞
= ∑ bn + ∑ n bn + K 2 b2K + ∑ n2 b2n
8 n=1 2 n=1 n=K+1
∞ 2 K−1 ∞
1 π
≥ ∑ b2n + K 2 b2K + ∑ b2n + ∑ n2 b2n
8 n=1 2 n=1 n=K+1
1 ∞
π 2 ∞
> ∑ b2n + K 2 b2K − b2K + ∑ b2n
8 n=1 2 n=1
π 2 1 π 2 ∞
= (K 2 − 1)b2K + + ∑ b2
2 8 2 n=1 n
1 π2 ∞
>
8
+
2 n=1∑ b2n
1 1 ∞
= + π2 ∑ b2n
4 2 n=1
1
= + π 2 , (by(4.132)).
4
This shows
1
+ π 2 = L(y1 ).
L( f ) >
4
This proves that y1 minimizes L subject to the constraint W.
The next theorem asserts that, in general, the corresponding eigenfunction to the
first eigenvalue of (SLP), does indeed minimize the variational subject to its isoperi-
metric constraint. For a quick reference we restate the (SLP). Consider the varia-
tional b
L(y) = r(x)y′2 + q(x)y2 dx, y(a) = y(b) = 0, (4.137)
a
and subject to
b
W (y) = p(x)y2 dx = 1. (4.138)
a
Note that, (4.138) holds when the corresponding eigenfunctions are normalized with
respect to the weight function p.
Sturm-Liouville Problem 305
Theorem 4.28 Suppose −λ1 is the first eigenvalue of (4.137) and (4.138) with cor-
responding normalized eigenfunction y1 (x) ∈ C2 ([a, b]). Then among all admissible
normalized eigenfunctions y ∈ C2 ([a, b]), the function y = y1 (x) minimizes L, subject
to (4.138). Moreover, L(y1 ) = −λ1 .
Proof We mention that the presence of the number −λ1 and not λ1 , depends solely
on the way we decided to consider (4.126). We begin by multiplying (4.126) with y
and then integrating by parts the first term in the resulting equation from x = a, to
x = b. bh
′ i
y r(x)y′ − q(x)y2 − λ p(x)y2 dx = 0.
a
′
Letting u = y, dv = r(x)y′ (x) dx and making use of y(a) = y(b) = 0, we arrive
at
b b b
′ b
y(x) r(x)y (x) dx = r(x)y(x)y′ (x)x=a −
′ ′2
r(x)y (x)dx = − r(x)y′2 (x)dx.
a a a
Since y is nontrivial, the number −λ is an eigenvalue. By ii) of Theorem 5.4, the first
eigenvalue is −λ1 and hence it has the corresponding normalized eigenfunction y1 .
This shows L is minimized at −λ1 ; that is
It is crucial that we examine the ratio of L(y) and W (y) in more detail. Expression
(4.139) gives
L(y)
L(y) = −λ1W (y), or = −λ1 .
W (y)
We define the Rayleigh quotient
L(y)
R(y) = . (4.140)
W (y)
It is evident from (4.140) that for any nontrivial solution φn (x), that corresponds to
eigenvalues −λn , we have that
It is important to remark that (4.141) holds for all nontrivial eigenfunctions y whether
they are normalized or not, since the same excess factor will appear in the numerator
and denominator, and hence it cancels out.
Also, (4.141) is handy when the eigenvalues can be computed, which is not the case
in some situations. The Rayleigh quotient can be easily generalized to the (SLP) with
general boundary conditions. Let y = y(x) ∈ C2 ([a, b]) be an extremal for
b
L(y) = r(x)y′2 + q(x)y2 dx,
a
Then b b
−r(x)y(x)y′ (x)x=a + a r(x)y′2 + q(x)y2 dx
−λ = b . (4.142)
a p(x)y2 dx
The verification of (4.142) comes from integrating by parts the first term in
b
′
y r(x)y′ − q(x)y2 − λ p(x)y2 dx = 0,
a
The next theorem says that the Rayleigh quotient yields an upper bound to the true
value of the lowest eigenvalue −λ1 .
Theorem 4.29 Suppose −λ1 is the first eigenvalue of (4.137) and (4.138). Let
Note that Theorem 4.29 does not requires the function u to be an extremal of (4.137)
subject to the constraint (4.138). You may think of the set σ as the set of “trial
functions”.
Proof
For y ∈ σ , we let ŷ = y + εη, with η(a) = η(b) = 0. Set
min R(y) = M.
y∈σ
Thus, b
L(ŷ) = L(y) + 2ε η − (r(x)y′ )′ + q(x)y dx + O(ε 2 ).
a
Similarly, but without the integration by parts,
b
W (ŷ) = W (y) + 2ε η p(x)ydx + O(ε 2 ).
a
Therefore,
min R(y) = −λ1 .
y∈σ
4.14.2 Exercises
Exercise 4.101 Put the second-order differential equation in the form of (4.126),
that were found in Example 4.30 are orthogonal and normalize the eigenfunctions.
Exercise 4.104 Prove (4.134) and (4.135).
Exercise 4.105 Redo Example 4.30 for the variational problem
π
L(y) = y′2 dx, y(0) = y(π) = 0,
0
π
subject to W (y) = 0 y2 dx = 3. Then evaluate L at the eigenfunctions yn (x) =
2
∑∞ ∞
n=1 an sin(nx), to find a formula for ∑n=1 an . Finally, argue or refer to other state-
ments, that the eigenfunction corresponding to the first eigenvalue minimizes L.
Sturm-Liouville Problem 309
λ
(xy′ )′ − = 0, y(1) = y(b) = 0
x
in the form of (4.137) and (4.138). Find the eigenvalues and normalized eigenfunc-
tions and show that the normalized eigenfunction corresponding to the first eigen-
value minimizes L.
Exercise 4.107 Consider the variational problem
l
L(y) = y′2 dx, y(0) = y(l) = 0,
0
l
subject to W (y) = 0 y2 dx. Find the eigenvalues −λn , and corresponding eigen-
functions yn (x), n = 1, 2, . . .(No need to normalize them). We already know that
R(y1 ) = −λ1 .
For the next parts, take l = 1 and use(4.140)
(a) Compute R(yT ) at the trial function
x, 0 ≤ x ≤ 1/2
yT =
1 − x, 1/2 ≤ x ≤ 1
(b)
yT = x(π − x).
310 Calculus of Variations
in the form
N
y(x) ≈ φ0 (x) + c1 φ1 (x) + c2 φ1 (x) + . . . + cN φN (x) = φ0 (x) + ∑ cn φn (x), (4.145)
n=1
For example if we have a boundary value problem in which the solutions are of the
form y = c + dx, and boundary conditions y(0) = 0, y(1) = 1, then we may take
φ0 (x) = x, and φ1 (x) = x(x − 1), and hence
Note that φ0 (0) = 0, φ0 (1) = 1, and φ1 (0) = φ1 (1) = 0. Here we only decided to
select φ0 and φ1 . If we were to write down all of them, then we would set
Next we substitute (4.145) into the variational in (4.144) and suppose we want
to
b N N
′ ′
Minimize L(y) = F x, φ0 (x) + ∑ cn φn (x), φ0 (x) + ∑ cn φn (x) dx.
a n=1 n=1
Rayleigh Ritz Method 311
The independent variable x will integrate out and we are left with a function of the
unknown constants say, L(c1 , c2 , . . . , cN ). The problem reduces to
min L(y) = min L(c1 , c2 , . . . , cN ).
c1 ,c2 ,...,cN
to be
2e−1 − 5 x 5 − 2e −x 1
y(x) = e + e − x.
2(e−1 − e) 2(e−1 − e) 2
Next we apply the Rayleigh-Ritz Method. Set φ0 (x) = 1 + x. then φ0 (0) = 1, and
φ0 (1) = 2. Thus φ0 (x) satisfies the boundary conditions as required by the method.
Choose, φ1 (x) = x(1 − x). Clearly, φ1 (x) vanishes at the boundaries and it has no
zeroes in (0, 1). Set
y1 (x) = φ0 (x) + c1 φ1 (x) = 1 + x + c1 x(1 − x).
Next we substitute y1 into the variational and obtain
1
L(y1 ) = (y′1 )2 − xy1 − y21 dx
0
312 Calculus of Variations
1 n 2 2 o
1 + c1 (1 − 2x) − x − x2 − c1 x2 (1 − x) − 1 + x + c1 x(1 − x)
= dx
0
1n o
= (−3x − 2x2 ) + c1 (2 − 6x − x2 + 3x3 ) + c21 (1 − 4x + 3x2 + 2x3 − x4 ) dx
0
13 7 3
=− − c1 + c21 .
6 12 10
∂L 35
Solving = 0 yields c1 = . Thus
∂ c1 4
35
y1 (x) = 1 + x + x(1 − x),
4
d2 L 35
as the first approximate solution. We remark that, since dc21
> 0 at c1 = 4 , y1 (x) is
a minimizer candidate. The relation
4.15.1 Exercises
Exercise 4.109 Compute y2 (x) = 1 + x + x(1 − x)[c1 + c2 x] in Example 4.31.
Exercise 4.110 Compute the second-order approximation y2 (x) for the variational
1
(y′ )2 − 2xy − y2 dx,
L(y) = y(0) = 1, y(1) = 2,
0
and compute the true extremal. Graph both functions; that is, the true solution and
y2 (x) on the same graph.
Exercise 4.111 Redo Exercise 4.110 when the boundary conditions are
y(0) = 0, y(1) = 0.
into a variational form and use Rayleigh Ritz method to obtain an approximation in
the form
y2 (x) ≈ x + x(1 − x)[c1 + c2 x].
Exercise 4.113 Compute the second-order approximation y2 (x) for the variational
1
(y′ )2 − 2xy − 2y dx,
L(y) = y(0) = 2, y(1) = 1,
0
and compute the true extremal. Graph both functions; that is, the true solution and
y2 (x) on the same graph.
Exercise 4.114 Compute the second-order approximation y2 (x) for the variational
2
1 2 ′ 2
L(y) = (x y ) + 6xy dx, y(1) = y(2) = 0,
1 2
and compute the true extremal. Graph both functions; that is, the true solution and
y2 (x) on the same graph.
where f (x, y) is a given function defined over C and whose values trace out a fixed
curve Γ, which forms the boundary of the surface u. See Fig. 4.21.
The variational problem is to minimize
J(u) = F x, y, u(x, y), ux (x, y), uy (x, y) dx dy, (4.146)
R
u = u(x, y)
R u = f (x, y)
x C
FIGURE 4.21
Surface u = u(x, y) minimizing J(u).
d
= F x, y, u + εη, ux + εηx , uy + εηy dx dy
dε R
= Fu η + Fux ηx + Fuy ηy dx dy
R
∂ ∂
= Fu − Fux − Fuy ηdx dy
∂x ∂y
R
∂ ∂
+ (ηFux ) + (ηFuy ) dx dy.
R ∂x ∂y
The second integral can be transformed to a line integral over C using Green’s theo-
rem, which states that if P and Q are functions in C1 (R), then
∂Q ∂P
− dx dy = Pdx + Qdy.
R ∂x ∂y C
Consequently,
∂ ∂
δ J(u, η) = Fu − Fux − Fuy ηdx dy
R ∂x ∂y
+ ηFux dy − ηFuy dx.
C
Since η(x, y) = 0 for (x, y) on C , the line integral vanishes and we have
∂ ∂
δ J(u, η) = Fu − Fux − Fuy ηdx dy. (4.147)
R ∂x ∂y
Multiple Integrals 315
Since u is a local minimum if follows that δ J(u, η) = 0 for every η ∈ C2 (R) with
η(x, y) = 0 on C . Next, we extend the Fundamental Lemma of Calculus of Variations
to two variables.
Lemma 14 Suppose g(x, y) is continuous over the region Ω ⊂ R2 . If
g(x, y)η(x, y)dxdy = 0,
Ω
where R is the region enclosed by the curve C which is the projection of Γ onto the
xy-plane. The function u should satisfy u(x, y) = h(x, y), (x, y) ∈ C , where h is the
1
function defining Γ. We have F = (1 + u2x + u2y ) 2 , with
ux uy
Fu = 0; Fux = q ; Fuy = q ,
1 + u2x + u2y 1 + u2x + u2y
u=0
∇2 u = 0
u=0
(0, 0) x
u=0 a
FIGURE 4.22
Dirichlet problem in rectangular coordinates.
which is the equation for the minimized surface. This is nonlinear partial differential
equation and almost impossible to solve in its current form for a given boundary
curve Γ. □
In the next example we consider the steady temperature in rectangular coordinates,
called the Dirichlet problem. For more on this, we refer to Appendix A.
Example 4.33 Consider the variational problem
u2x + u2y dx dy,
J(u) =
R
where R = {(x, y) : 0 < x < a, 0 < y < b}. Suppose on the boundary of R, we have
Equation (4.150) along with the boundary conditions represent the steady tempera-
tures u(x, y) in a plates whose faces are insulated. The function u(x, y) represents the
electrostatic potential in a space formed by the planes x = 0, x = a, y = 0, and y = b
when the space is free of charges and planar surfaces are kept at potentials given by
the boundary conditions. We are seeking non trivial solution and hence if we assume
the solution u(x, y) is the product of two functions one in x and the other in y, such
that
u(x, y) = X(x)Y (y),
Multiple Integrals 317
we obtain
X ′′ (x)Y (y) + X(x)Y ′′ (y) = 0.
Since X(x) ̸= 0, and Y (y) ̸= 0, we may divide by the term X(x)Y (y), to separate the
variables. That is,
X ′′ (x) Y ′′ (y)
=− .
X(x) Y (y)
Since the left-hand side is a function of x alone, it does not vary with y. However, it
is equal to a function of y alone, and so it can not vary with x. Hence the two sides
must have some constant value −λ in common. That is,
X ′′ (x) Y ′′ (y)
=− = −λ .
X(x) Y (y)
This gives the Sturm-Liouville problems
and
Y ′′ (y) − λY (y) = 0, Y (0) = 0. (4.152)
Using arguments of Section 4.14, Equations (4.151) and (4.152) have the respective
eigenfunctions
nπx nπy
Xn (x) = sin( ), Yn (y) = sinh( ), n = 1, 2, . . .
a a
where the eigenvalues are given by λn = ( nπ 2
a ) . Thus, the general solution of the
Dirichlet problem
∞
nπy nπx
u(x, y) = ∑ bn sinh( ) sin( ).
n=1 a a
For detail on computing the coefficients bn we refer to Appendix A. Thus, bn are
given by a
2 nπx
bn = nπb
f (x) sin( )dx, n = 1, 2, . . . .
a sinh( a ) 0 a
□
Next we look at a parametrized three dimensional surface. Suppose we have a surface
S specified or parametrized with
r = r(u, v) = x(u, v), y(u, v), z(u, v) . (4.153)
where E, K, and G are called the coefficient of the first fundamental form
E = ru · ru , K = ru · rv , G = rv · rv ,
where · means the dot product. Equation (4.154) is a variational with several func-
tions and by (4.149) we have the relevant Euler-Lagrange equations
d d
Fu − Fu′ = 0, and Fv − Fv′ = 0.
dt dt
Be aware that the coefficients of the first fundamental form E, G and K depend on u
and v. After some calculations the corresponding Euler-Lagrange equations are given
by
Eu u′2 + 2Ku u′ v′ + Gu v′2 d Eu′ + Kv′
√ − √ = 0, (4.155)
2 Eu′2 + 2Ku′ v′ + Gv′2 dt Eu′2 + 2Ku′ v′ + Gv′2
and
Ev u′2 + 2Kv u′ v′ + Gv v′2 d Ku′ + Gv′
√ − √ = 0. (4.156)
′2
2 Eu + 2Ku v + Gv′ ′ ′2 dt Eu + 2Ku′ v′ + Gv′2
′2
In the next example we use equations (4.155) and (4.156) to find the geodesics on a
circular cylinder.
Example 4.34 In this example we want to find the geodesics on a circular cylinder.
Note that the circular cylinder has the parametrization
r = (a cos(u), a sin(u), v)
d a2 u′ d v′
− √ = 0, − √ = 0.
dt a2 u′2 + v′2 dt a2 u′2 + v′2
Moreover, the corresponding solutions are given by
a2 u′ a2 v′
√ = c1 , √ = c2 ,
a2 u′2 + v′2 a2 u′2 + v′2
for constants c1 and c2 . Taking the ratio we obtain
√ v′
a2 u′2 +v′2 c2
= .
√ a2 u′ c1
a2 u′2 +v′2
Multiple Integrals 319
FIGURE 4.23
Geodesics on a right cylinder.
It follows that
v′ c2
2 ′
= = k,
a u c1
for another constant k. This implies that
v′
= a2 k,
u′
and by rewriting the derivatives we see that
dv
dt
du
= a2 k.
dt
Separating the variables yields the first-order ODE dv = a2 kdu, which has the solu-
tion
v(t) = a2 ku(t) + c3 , for constant c3 ,
which is a two parameter family of helical lines on the cylinder, where the constants
can be determined from the location of the two points A and B as depicted in Fig.
4.23. □
4.16.1 Exercises
Exercise 4.115 Determine the natural boundary condition for
J(u) = F x, y, u(x, y), ux (x, y), uy (x, y) dx dy,
R
x2 y2
z2 = + , 0 ≤ z ≤ 3.
4 9
Hint: Use the following parametrization for the cone
Exercise 4.122 Determine (4.155) and (4.156) for the surface parametrized by
Integral equations are used in a wide variety of contexts, including science and engi-
neering. Integral equations such as those derived from Volterra or Fredholm can be
utilized to find solutions to a wide variety of initial and boundary value problems.
Integral equations can take on a number of different forms, but in most cases they
are used to model scientific procedures in which the current value of a quantity (or
set of values) or its rate of change is dependent on its historical performance. This is
in contrast to differential equations, which assume that the value of a quantity at any
given time is the only factor that may affect the rate at which it changes. In the same
way that differential equations need to be “solved,” integral equations also need to be
“solved” in order to describe and predict how a physical quantity will behave over a
period of time. One strong argument in favor of using integral equations rather than
differential equations is the fact that all of the conditions defining the initial value
problems or boundary value problems for a differential equation can frequently be
condensed into a single integral equation. This is one of the many reasons why in-
tegral equations are preferred over differential equations. The study of a variety of
integral equations, including Fredholm first- and second-kind integral equations as
well as Volterra integral equations, symmetric and separable kernels, iterative meth-
ods, the approximation of non-degenerate kernels, and the application of the Laplace
transform to the solution of convoluted integral equations, will be the focus of our
work. The chapter comes to a close with a discussion on integral equations that ex-
hibit strange behavior.
Definition 5.1 An integral equation in the unknown function y(x) is a relation of the
form
y(x) = f (x) + K(x, ξ )y(ξ )dξ (5.1)
in which y(x) appears in the integrand, where K(x, ξ ) is a function of two variables
x and ξ and referred to as the kernel of the integral equation.
Note that we purposefully omitted the limits of integration from the formulation
above because, in most circumstances, they determine the sort of integral equation
we have. In (5.1) the functions f and K are given and satisfy continuity conditions
and perhaps others. The following are examples of integral equations.
x
y(x) = sin(2x) + (x3 + ξ x + 1)y(ξ )dξ ,
0
and 1
ex y(x) = sin(xξ )y(ξ )dξ .
0
In this chapter we discuss the Fredholm equation of the first kind
b
α(x)y(x) = K(x, ξ )y(ξ )dξ , (5.2)
a
Fredholm integral equations given by (5.2) and (5.3) have the unique property of
having finite limits of integration ξ = a, and ξ = b. In addition to discussing Fredhom
equations, we will discuss Volterra equations of first kind and second kind given by
x
y(x) = K(x, ξ )y(ξ )dξ , (5.4)
a
and x
y(x) = f (x) + λ K(x, ξ )y(ξ )dξ , (5.5)
a
respectively. In later sections we will develop particular methods to solve integral
equations with specific characteristics. Without worrying about technicality, we try
to define a sequence of functions {yn } successively for (5.5), with λ = 1 by set-
ting
y0 (x) = f (x)
x
y1 (x) = f (x) + K(x, ξ )y0 (ξ )dξ
0 x
y2 (x) = f (x) + K(x, ξ )y1 (ξ )dξ (5.6)
0
Introduction and Classifications 323
..
. x
yn (x) = f (x) + K(x, ξ )yn−1 (ξ )dξ , n = 1, 2, . . . (5.7)
0
For n = 1 we have
x
x2
y1 (x) = 1 − (x − ξ )(1)dξ = 1 − .
0 2
ξ2
For n = 2, with y1 (ξ ) = 1 − 2 ,
x
ξ2 x2 x4
y2 (x) = 1 − (x − ξ )(1 − )dξ = 1 − + .
0 2 2 24
2 4
Similarly, for n = 3 with y2 (ξ ) = 1 − ξ2 + ξ24 , we have
x
ξ2 ξ4 x2 x4 x6
y3 (x) = 1 − (x − ξ )( + )dξ = 1 − + − .
0 2 24 2 24 720
A continuation of this method leads to the sequence of functions
n
x2 x4 x6 (−1)k x2k
yn (x) = 1 − + − +··· = ∑ .
2 24 720 k=0 (2k)!
Note that
n
(−1)k x2k
lim yn (x) = lim ∑ = cos(x).
k=0 (2k)!
n→∞ n→∞
We leave it to the students to verify, using either, the Laplace transform or by direct
substitution that y(x) = cos(x) is indeed a solution of the integral equation. □
5.1.1 Exercises
3
Exercise 5.1 By direct substitution, show that y(x) = (1 + x2 )− 2 is a solution of the
Volterra integral equation
x
1 ξ
y(x) = − y(ξ )dξ .
1 + x2 0 1 + x2
324 Integral Equations
Exercise 5.2 By direct substitution, show that y(x) = cos(x) is a solution of the
Volterra integral equation
x
y(x) = 1 − (x − ξ )y(ξ )dξ .
0
Exercise 5.3 By direct substitution, show that y(x) = (x + 1)2 is a solution of the
Volterra integral equation
x
y(x) = e−x + 2x + eξ −x y(ξ )dξ .
0
Exercise 5.4 Use the method of successive approximation and show the solution of
the Volterra integral equation
x
y(x) = 1 + (x − ξ )y(ξ )dξ
0
is y(x) = cosh(x).
Exercise 5.5 Use the method of successive approximation and show the solution of
the Fredholm integral equation
1
1
y(x) = x + (ξ − x)y(ξ )dξ
2 −1
is y(x) = 34 x + 14 .
∂f
f (t, x1 ) − f (t, x2 ) = (t, c)(x1 − x2 ),
∂x
form which it follows that
∂ f
| f (t, x1 ) − f (t, x2 )| ≤ (t, c)|x1 − x2 |
∂x
≤ K |x1 − x2 |. (5.12)
∂f
We have shown that if f and ∂y are continuous on R, then f satisfies a global Lips-
chitz condition on R.
We state the following definition regarding solutions of (IVP) and integral equa-
tions.
Definition 5.4 We say x is a solution of (5.8) on an interval I, provided that x : I → R
is differentiable, (t, x(t)) ∈ D, for t ∈ I, x′ (t) = f (t, x(t)) for t ∈ I, and x(t0 ) = x0 for
(t0 , x0 ) ∈ D.
In preparation for the next theorem, we observe that the (IVP) (5.8) is related
to t
x(t) = x0 + f (s, x(s))ds. (5.13)
t0
and set
b
}.
h = min{a, (5.15)
M
Then the (IVP) (5.8) has a unique solution denoted by x(t,t0 , x0 ) on the interval
|t − t0 | ≤ h and passing through (t0 , x0 ). Furthermore,
|x(t) − x0 | ≤ b, for |t − t0 | ≤ h.
Proof The proof involves changing the limits of integration. From Fig. 5.1, we define
the shaded region by
D = {(t, ξ ) : t ≤ ξ ≤ x, a ≤ t ≤ x}.
Then
x ξ
F(t)dtdξ = F(t)dξ dt
a a D
Connection between Ordinary Differential Equations and Integral Equations 327
ξ
t
=
ξ
x
a x t
FIGURE 5.1
Shaded region of integrations.
x x
= F(t)dξ dt
a x t
x
= F(t) dξ dt
a x t
x
= F(t) dξ dt
a x t
= (x − t)F(t)dt.
a
Or, x
y′ (x) − y′ (0) + Ay(x) − Ay(0) + B y(ξ ) dξ = 0.
0
We are given y(0) = 0 and so
x
′ ′
y (x) − y (0) + Ay(x) + B y(ξ ) dξ = 0. (5.19)
0
and so,
x x
′
y(x) − y(0) − xy (0) + A y(ξ ) dξ + B (x − ξ )y(ξ ) dξ = 0.
0 0
As y(0) = 0, then
x x
y(x) − xy′ (0) + A y(ξ ) dξ + B (x − ξ )y(ξ ) dξ = 0. (5.20)
0 0
Now, by letting x = 1 and making use of y(1) = 0 in the above equation we obtain
1 1
y′ (0) = A y(ξ ) dξ + B (1 − ξ )y(ξ ) dξ ,
0 0
or 1
′
y (0) = [A + B − Bξ ]y(ξ ) dξ . (5.21)
0
Finally, substitute (5.21) into (5.20) to get
1 x x
y(x) − x [A + B − Bξ ]y(ξ ) dξ + A y(ξ ) dξ + B (x − ξ )y(ξ ) dξ = 0,
0 0 0
(5.22)
or
1 x
y(x) = [Ax + Bx − Bxξ ]y(ξ ) dξ − [A + Bx − Bξ ]y(ξ ) dξ .
0 0
Since, 0 ≤ x ≤ 1, we may use
1 x 1
= +
0 0 x
Connection between Ordinary Differential Equations and Integral Equations 329
and rewrite
x 1
y(x) = [Ax + Bx − Bxξ ]y(ξ ) dξ + [Ax + Bx − Bxξ ]y(ξ ) dξ
0 x x
− [A + Bx − Bξ ]y(ξ ) dξ
0
x 1
= [Ax − A − Bxξ + Bξ ]y(ξ ) dξ + [Ax + Bx − Bxξ ]y(ξ ) dξ
0 x x
= [Bξ (1 − x) + Ax − A]y(ξ ) dξ
0
1
+ [Ax + Bx(1 − ξ )]y(ξ ) dξ . (5.23)
x
By observing that the first term and second term in (5.23) are valid over 0 < ξ < x
and x < ξ < 1, respectively, we conclude that (5.23) can be written as
1
y(x) = K(x, ξ )y(ξ ) dξ ,
0
□
The next lemma is essential when differentiating an integral equation. It is referred
to as Leibnitz formula
Lemma 16 Suppose α(x), β (x) are continuous such that ∂∂αx and ∂∂βx exist. If F is
continuous in both variables and its first partial derivatives exist, then
β (x) β (x)
d ∂F
F(x, ξ )dξ = (x, ξ )dξ
dx α(x) α(x) ∂x
+ F(x, β (x))β ′ (x) − F(x, α(x))α ′ (x) (5.24)
with
∂f
= F(x, ξ ).
∂y
Then
φ (α, β , x) = f (x, β (x)) − f (x, α(x)).
330 Integral Equations
dφ ∂φ ∂φ ∂β ∂φ ∂α
= + + . (5.25)
dx ∂x ∂β ∂x ∂α ∂x
Moreover,
β (x) β (x)
∂φ ∂ ∂F
= F(x, ξ )dξ = (x, ξ )dξ ,
∂x ∂x α(x) α(x) ∂x
since computing ∂∂φx means that α and β are kept constants. On the other hand, using
φ (α, β , x) = f (x, β (x)) − f (x, α(x)), we get
∂φ ∂ f (x, β ) ∂ f (x, α)
= − = 0 − F(x, α).
∂α ∂α ∂α
Similarly,
∂φ ∂ f (x, β ) ∂ f (x, α)
= − = F(x, β ) − 0.
∂β ∂β ∂β
Substituting the last three expressions into (5.25), we arrive at
β (x)
dφ ∂β ∂α
= F(x, ξ )dξ + F(x, β ) − F(x, α) ,
dx α(x) ∂x ∂x
It is clear that the kernel K is symmetric and from (5.26) we have that u(0) = u(1) =
0. Moreover, using Lemma 16, we have
x
u′ (x) = λ (1 − x)xu(x) − λ ξ u(ξ )dξ
0
1
− λ (1 − x)xu(x) + λ (1 − ξ )u(ξ )dξ
x
Connection between Ordinary Differential Equations and Integral Equations 331
x 1
= −λ ξ u(ξ )dξ + λ (1 − ξ )u(ξ )dξ .
0 x
Thus, the integral equation satisfies the second-order boundary value problem
u(0) = 0, u(1) = 0.
The boundary value problem is a Sturm-Liouville problem and we refer you to Chap-
ter 4, Section 4.14. Note that for λ ≤ 0, the problem only has the trivial solution. For
λ > 0, we let λ = α 2 , α ̸= 0. Then the problem has the solution
un (x) = sin(nπx), n = 1, 2, . . .
5.2.1 Exercises
Exercise 5.6 Show that f (t, x) = x2/3 does not satisfy Lipschitz condition in the rect-
angle R = {(t, x) : |t| ≤ 1, |x| ≤ 1}.
Exercise 5.7 Use integration by parts to prove (5.16) of Lemma 15.
Exercise 5.8 (a) If y′′ (x) = F(x), and y satisfies the initial condition y(0) = y0 and
y′ (0) = y′0 , show that
x
y(x) = (x − ξ )F(ξ ) dξ + y′0 x + y0 .
0
(b) Verify that this expression satisfies the prescribed differential equation and initial
conditions.
Exercise 5.9 (a) If y′′ (x) = F(x), and y satisfies the end conditions y(0) = 0 and
y(1) = 0, show that
1
y(x) = K(x, ξ )F(ξ ) dξ
0
332 Integral Equations
(b) Verify directly that the expression obtained satisfies the prescribed differential
equation and end conditions.
Exercise 5.10 Verify the integral equation
t
8
y(t) = 1 + t − (ξ − t)3 y(ξ )dξ ,
3 0
to a differential equation.
Exercise 5.12 Write the second-order nonhomogenous differential equation y′′ (x) =
λ y(x) + g(x), x > 0 that satisfies the initial conditions y(0) = y′ (0) = 0 into an inte-
gral equation.
Exercise 5.13 Show that the second-order boundary value problem
where A and B are differentiable functions on (a, b) and g is continuous, leads to the
integral equation a
y(x) = f (x) + K(x, ξ )y(ξ ) dξ ,
b
Connection between Ordinary Differential Equations and Integral Equations 333
where
x b
x−a
f (x) = a1 + (x − ξ )g(ξ )dξ + b1 − a1 − (b − ξ )g(ξ )dξ ,
a b−a a
and
A(ξ )−(a−ξ )(A′ (ξ )−B(ξ )
(x−b)
b−a when ξ ≤ x ≤ b,
K(x, ξ ) =
(x−a) A(ξ )−(b−ξ )(A′ (ξ )−B(ξ )
b−a when a ≤ x ≤ ξ .
Exercise 5.15 Find the solution of the Volterra integral equation
x
y(x) = 1 − x − 4 sin(x) + [3 − 2(x − ξ )]y(ξ )dξ ,
0
Show that if x
y(x) = h(x − ξ )g(ξ ) dξ ,
0
then y(x) solves the nonhomogeneous second-order differential equation
Exercise 5.18 Find all continuous functions y = y(x) that satisfy the relation
x x
t y(t)dt = (t + 1) ty(t)dt.
0 0
exact.
We have the following lemma.
Lemma 17 The differential operator L given by (5.27) is self-adjoint.
This implies
d
p′ [wz′ − zw′ ] = {p(x)[wz′ − zw′ ]} − p[wz′′ − zw′′ ].
dx
By substituting the above term into (5.29) we obtain
d
wLλ − zLλ = {p(x)[wz′ − zw′ ]}.
dx
Or,
wLλ z − zLλ w dx = d{p(x)[wz′ − zw′ ]} := dg.
(5.30)
This completes the proof.
L y + Φ(x) = 0, (5.31)
d d d2 dp d
L := p(x) + q(x) = p 2 + + q, (5.32)
dx dx dx dx dx
together with homogeneous boundary conditions of the form
dy dy
αy(a) + β (a) = 0, αy(b) + β (b) = 0, (5.33)
dx dx
for some constants α and β . It is assumed that the function p(x) is continuous and
that p(x) ̸= 0 for all x ∈ (a, b). Also p′ (x) and q(x) are continuous on (a, b). The
function Φ(x) may depend on x and y(x); that is
Note that the differential operator defined by (5.32) is the same as the one defined by
(5.27) when λ = 0. We attempt to find a Green function, denoted with G(x, ξ ) and
given by (
G1 (x, ξ ) when x < ξ
G(x, ξ ) = (5.34)
G2 (x, ξ ) when x > ξ ,
and satisfies the following four properties:
(i) The functions G1 and G2 satisfy the equation L G = 0; that is L G1 = 0 when
x < ξ and L G2 = 0 when x > ξ .
(ii) The function G satisfies the homogeneous conditions prescribed at the end points
x = a, and x = b; that is G1 satisfies the condition prescribed at x = a, and G2
satisfies the condition prescribed at x = b.
336 Integral Equations
Note that if Φ is constant or a function of x but not y(x) then (5.36) can be solved to
obtain the solution. However, if Φ has y(x) then (5.36) is an integral equation of the
form b
y(x) = G(x, ξ )ϕ(ξ , y(ξ ))dξ ,
a
where y needs to be determined. We begin by determining the Green’s function G.
Let y = u(x) be a nontrivial solution of the homogeneous equation L y = 0 along with
dy
αy(a) + β dx (a) = 0. Similarly, we let y = v(x) be a nontrivial solution of the homo-
dy
geneous equation L y = 0 and αy(b) + β dx (b) = 0. Then (i) and (ii) are satisfied if
we write (
c1 u(x) when x < ξ
G(x, ξ ) = (5.37)
c2 v(x) when x > ξ ,
where c1 and c2 are constants. Condition (iii) yields
By multiplying the second equation by u and the first equation by v, and subtracting
the results, there follows
u(pv′ )′ − v(pu′ )′ = 0.
The Green’s Function 337
A
u(ξ )v′ (ξ ) − v(ξ )u′ (ξ ) = . (5.41)
p(ξ )
It turns out that the Green’s function of a self-adjoint operator is symmetric. Finally
substituting (5.42) into (5.36) the solution can be explicitly found to be
b
y(x) = G(x, ξ )Φ(ξ )dξ
a
x b
u(ξ ) v(ξ )
= − v(x)Φ(ξ )dξ + − u(x)Φ(ξ )dξ
a A x A
x b
1
= − u(ξ )v(x)Φ(ξ )dξ + v(ξ )u(x)Φ(ξ )dξ . (5.43)
A a x
Remark 21 The Green’s function for (5.31) and (5.33) is independent of the function
Φ(x). For example if (5.31) is replaced with
L y = f (x), (5.44)
338 Integral Equations
Example 5.5 In this example, we attempt to find the Green’s function for the second-
order boundary value problem with homogeneous boundary conditions
2+ξ ξ
c1 = and c2 = .
2 2
Thus, (
1
G(x, ξ ) = 2 (2 + ξ )x when x < ξ
1
2 ξ (2 + x) when x > ξ ,
Note that G(x, ξ ) = G(ξ , x). □
5.3.1 Exercises
Exercise 5.19 Consider the second-order boundary value problem
a−b
≤ G(x, ξ ) ≤ 0.
4
where G′ = d
dx G.
Exercise 5.20 Use the Green’s function to solve the second-order boundary value
problem
y′′ (x) + x2 = 0, 0 < x < 1, y(0) = y(1) = 0.
Exercise 5.21 Use the Green’s function to solve the second-order boundary value
problem
e2x y′′ (x) + 2e2x y′ (x) = e3x , 0 < x < ln(2), y(0) = y(ln(2)) = 0.
Exercise 5.22 Use the Green’s function to solve the second-order boundary value
problem
y′′ (x) + ex = 0, a < x < b, y(a) = 0, y′ (b) = 0.
Exercise 5.23 (a) Show the Green’s function for the second-order boundary value
problem
is given by
( sin[α(1−ξ )] sin(αx)
α sin(α) when 0 ≤ x ≤ ξ ≤ 1
G(x, ξ ) = sin[α(1−x)] sin(αξ )
α sin(α) when a ≤ ξ ≤ x ≤ 1.
Exercise 5.26 Make use of (5.43) to show that y given by (5.36) satisfies the bound-
ary value problem (5.31) and the boundary conditions given by (5.33).
Fredholm Integral Equations and Green’s Function 341
then by the result of Section 5.3, the solution to the boundary value problem (5.47)
subject to the boundary conditions given by (5.33) is given by
b b
y(x) = λ G(x, ξ )ρ(ξ )y(ξ )dξ − G(x, ξ ) f (ξ )dξ , (5.49)
a a
Next we try to put the Fredholm integral equation given by (5.51) in symmetric form
provided
p ρ(x) > 0 for x ∈ (a, b). Assume so and multiply both sides of (5.51) by
ρ(x) and arrive at
p p bp p
ρ(x)y(x) = ρ(x)F(x) + λ ρ(x)ρ(ξ )G(x, ξ ) ρ(ξ )y(ξ )dξ .
a
p p
Letting z(x) = ρ(x)y(x) and g(x) = ρ(x)F(x), the preceding integral equation
reduces to the symmetric Fredhom integral equation
342 Integral Equations
bp
z(x) = g(x) + λ ρ(x)ρ(ξ )G(x, ξ )z(ξ )dξ
a
b
= g(x) + λ K(x, ξ )z(ξ )dξ . (5.52)
a
Note that p
K(x, ξ ) = ρ(x)ρ(ξ )G(x, ξ ) = K(ξ , x)
since G is symmetric.
Example 5.6 Let Ly be given by Example 5.4 and we want to reduce the boundary
value problem
L y + λ y = x, y(0) = y(l) = 0,
to a Fredholm integral equation. From the above discussion we have ρ(x) = 1, and
hence l
g(x) = − G(x, ξ )ξ dξ ,
0
where from Example 5.4 we have
(
x
G(x, ξ ) = l (l − ξ ) when x < ξ
ξ
l (l − x) when x > ξ .
Thus,
l
g(x) = − G(x, ξ )ξ dξ
0
h ξ x l
x i
= − (l − x)ξ dξ + (l − ξ )ξ dξ
0 l x l
x 2 2
= x −l .
6
Hence, by (5.52) the solution is given by the Fredholm integral equation
l
x 2 2
y(x) = x −l +λ G(x, ξ )y(ξ )dξ ,
6 0
since ρ(x) = 1. □
5.4.1 Exercises
Exercise 5.27 Reduce the boundary value problem
y′′ (x) + λ y(x) = x, 0 < x < 1, y(0) − 2y′ (0) = 0, 2y(1) − y′ (1) = 0
L y = 0, y(0) = 0, y(1) = 0
is (
x
2ξ
(1 − ξ 2 ) when x < ξ
G(x, ξ ) = ξ 2
2x (1 − x ) when x > ξ .
L y + λ xy = 0, y(0) = 0, y(1) = 0.
w(x,t)
L
u(0,t) = 0 x
ux (0,t) = 0 u(x,t)
FIGURE 5.2
A thin beam undergoing a small deflections u(x,t) from equilibrium.
moment of inertia of the beam’s cross-sectional area about a point x. If M(x,t) is the
total bending moment produced by all the forces acting on the beam at point x, then
the differential equation of the elastic curve of a beam is found to be
∂ 2 u(x,t)
eI(x) = M(x,t). (5.53)
∂ x2
The bending moment is related to the applied load by the second-order partial differ-
ential equation
∂ 2 M(x,t)
= −w(x,t). (5.54)
∂ x2
Let’s decompose the applied load w(x,t) into an external applied component F(x,t)
2
and an internal inertia component ρ ∂ ∂u(x,t)
x2
, where ρ = ρ(x) is the linear mass density
of the beam at the point x. Thus, if we let
∂ 2 u(x,t)
w(x,t) = ρ + F(x,t),
∂ x2
then (5.53) and (5.54) yield
∂2 ∂ 2 u(x,t) ∂ 2 u(x,t)
2
eI(x) 2
+ ρ(x) = −F(x,t). (5.55)
∂x ∂x ∂ x2
Since the beam is fixed at the end point x = 0, while at the end at x = L is free, we
have the appropriate initial and boundary conditions
∂ 2 u(L,t)
u(0,t) = 0, = 0,
∂ x2
∂ u(0,t) ∂ ∂ 2 u
= 0, eI 2 x=L = 0, (5.56)
∂x ∂x ∂x
∂ u(x, 0)
u(x, 0) = g(x), = h(x).
∂t
Fredholm Integral Equations and Green’s Function 345
then (5.55) can be easily reduced to the fourth order ordinary differential equa-
tion ′′
eIy′′ − ρω 2 y = − f . (5.57)
Consequently, the first four initial and boundary conditions given by (5.56) are re-
duced to
Next, we briefly discuss how to compute the Green’s function for the fourth-order
ordinary differential operator. Consider the fourth-order differential equation
L y + Φ(x) = 0, (5.60)
d2 d2y
(L y)(x) := p(x) + q(x)y(x) = 0, x ∈ [a, b] (5.61)
dx2 dx2
together with homogeneous boundary conditions of the form
G1 (ξ ) = G2 (ξ ),
d d
G1 (ξ ) = G2 (ξ ),
dx dx
d2 d2
G1 (ξ ) = G2 (ξ ).
dx2 dx2
1
(iv) The third derivative of G has a discontinuity of magnitude − p(ξ )
at the point
x = ξ ; that is
d3 d3 1
3
G2 (ξ ) − 3
G1 (ξ ) = − . (5.64)
dx dx p(ξ )
Once we determine the Green’s function of (5.61) and (5.62), then the problem can
be transformed to the relation
b
y(x) = G(x, ξ )Φ(ξ )dξ . (5.65)
a
Our interest is to use Green’s function to solve the beam problem when the inertia
I(x) is constant subject to the initial and boundary conditions given by (5.58). Thus,
we consider the boundary value problem given by (5.59) and the initial and boundary
conditions given by (5.58). Using (5.63) we obtain from
′′
eIy′′ = 0
that
(
1 A1 + A2 x + A3 x2 + A4 x3 when x < ξ
G(x, ξ ) = (5.66)
eI B1 + B2 (x − L) + B3 (x − L)2 + B4 (x − L)3 when x > ξ .
1 1
0− 6A4 = − ,
eI eI
Fredholm Integral Equations and Green’s Function 347
which implies that A4 = 16 . Next we apply the continuity condition given by (iii) and
obtain
A3 ξ 2 + A4 ξ 3 = B1 + B2 (ξ − L),
2A3 ξ + 3A4 ξ 2 = B2 ,
2A3 + 3A4 ξ = 0.
From the third equation we obtain A3 = − 12 ξ . On the other hand, the second equation
yields B2 = − 12 ξ 2 . Thereupon, from the first equation we arrive at
ξ2
B1 = (ξ − L).
2
The second part of the Green’s function B1 + B2 (x − L) reduces to
ξ2 ξ ξ2 ξ2 ξ
B1 + B2 (x − L) = ( − L) − (x − L) = ( − x).
2 3 2 2 3
Finally, the Green’s function takes the form
( 2
1 x2 3x − ξ
when x < ξ
G(x, ξ ) = 2 (5.68)
eI ξ ξ − x
2 3 when x > ξ .
Furthermore, the solution of the beam problem for constant inertia I takes the
form
L
G(x, ξ ) − ω 2 ρ(ξ )y(ξ ) + f (ξ ) dξ
y(x) =
0
x
1 ξ2 ξ
− x − ω 2 ρ(ξ )y(ξ ) + f (ξ ) dξ
=
eI 0 2 3
L 3
1 x x2 ξ
− ω 2 ρ(ξ )y(ξ ) + f (ξ ) dξ .
+ −
eI x 6 2
5.4.3 Exercises
Exercise 5.31 Find the Green’s function for
′′
eIy′′ = 0,
y(0) = 0, y(L) = 0,
y′ (0) = 0, y′ (L) = 0.
348 Integral Equations
(b)
y(0) = 0, y′ (L) = 0,
y′ (0) = 0, ′′′
y (L) = 0.
(c)
y(0) = 0, y(L) = 0,
y′′ (0) = 0, y′′ (L) = 0.
Throughout this section we assume K is separable. For example, the kernel K(x, ξ ) =
3 + 2xξ is separable since it can be written in the form K(x, ξ ) = ∑2i=1 αi (x)βi (ξ ),
where α1 (x) = 3, β1 (ξ ) = 1, α2 (x) = 2x, and β2 (ξ ) = ξ . Note that αi (x) and βi (ξ ) are
not unique. If we substitute (5.70) into (5.69) we arrive at the new expression
n b
y(x) = f (x) + λ ∑ {βi (ξ )y(ξ )dξ }αi (x). (5.71)
i=1 a
Letting
b
ci = βi (ξ )y(ξ )dξ , (5.72)
a
equation (5.71) simplifies to
n
y(x) = f (x) + λ ∑ ci αi (x). (5.73)
i=1
Fredholm Integral Equations with Separable Kernels 349
Note that the ci given by (5.72) are unknown constants. Once they are determined the
solution is given by (5.73). Multiplying (5.73) by β j (x) and integrating the resulting
expression with respect to x from a to b gives
b b n b
β j (x)y(x)dx = β j (x) f (x)dx + λ ∑ ci β j (x)αi (x)dx, j = 1, 2, . . . n.
a a i=1 a
(5.74)
Interchanging i with j expression (5.74) can be written as
n
ci = fi + λ ∑ c j ai j , i = 1, 2, . . . n (5.75)
j=1
(I − λ A)c = f (5.77)
A = (ai j ), c = (c1 , c2 , . . . , cn )T , f = ( f1 , f2 , . . . , fn )T ,
where T denotes the transpose. Thus, (5.77) represents a system of n linear algebraic
equations for c. Before we embark on few examples, we recall some basic facts from
Chapter 3 about linear systems. Consider the linear system
Bx = b (5.78)
We provide two examples to illustrate the method of Fredholm equations with sepa-
rable kernels.
Example 5.7 Consider the homogeneous Fredholm equation
1
y(x) = λ (4xξ − 5x2 ξ 2 )y(ξ )dξ . (5.79)
0
So we have
2
K(x, ξ ) = 4xξ − 5x2 ξ 2 = ∑ αi (x)β j (ξ ).
i=1
We may choose
Next we use 1
ai j = βi (x)α j (x)dx, i, j = 1, 2
0
to compute the matrix A = (ai j ).
1 1
4
a11 = β1 (x)α1 (x)dx = 4x2 dx = ,
0 0 3
1 1
a12 = β1 (x)α2 (x)dx = 4x3 dx = 1,
0 0
1 1
5
a21 = β2 (x)α1 (x)dx = − 5x3 dx = − ,
0 0 4
1 1
a22 = β2 (x)α2 (x)dx = − 5x4 dx = −1.
0 0
Hence we have matrix
4
3 1
A=
− 54 −1
and
1 − 4 λ
−λ 4 5
det(I − λ A) = 5 3 = (1 − λ )(1 + λ ) + λ 2 .
4λ 1+λ 3 4
Fredholm Integral Equations with Separable Kernels 351
λ 2 + 4λ − 12 = 0
4
(1 − λ )c1 − λ c2 = 0
3
5
λ c1 + (1 + λ )c2 = 0. (5.80)
4
Setting λ = −6 in (5.80) we arrive at 3c1 + 2c2 = 0, from either equation. Setting
c1 = a, implies that c2 = − 32 a, for nonzero constant a. Thus, using (5.73) with f = 0
we arrive at the infinitely many solutions
2
y(x) = 0 + λ ∑ ci αi (x) = −6 c1 α1 (x) + c2 α2 (x)
i=1
3 3
= −6 ax − ax2 = −6a(x − x2 ).
2 2
In a similar manner if we substitute λ = 2 into (5.80) we arrive at 5c1 + 6c2 = 0,
from either equation. Setting c1 = b, implies that c2 = − 56 b, for nonzero constant b.
Thus, using (5.73) with f = 0 we arrive at the infinitely many solutions
2
y(x) = 0 + λ ∑ ci αi (x) = 2 c1 α1 (x) + c2 α2 (x)
i=1
5 5
= 2 bx − bx2 = 2b(x − x2 ).
6 6
□
The next example illustrates the techniques for dealing with nonhomogeneous Fred-
holm integral equations.
352 Integral Equations
Notice that the kernel and hence the matrix A and the values of λ are the same as in
Example 5.7. We begin by addressing (i) of Theorem 5.3.
1
• If fi = 0 βi (x) f (x)dx, i = 1, 2 are not all zero; that is
1 1
f1 = 4x f (x)dx ̸= 0, or f2 = − 5x2 f (x)dx ̸= 0,
0 0
• If
1 1
f1 = 4x f (x)dx ̸= 0, or f2 = − 5x2 f (x)dx ̸= 0, (5.82)
0 0
and λ = −6, then using (5.77) we arrive at the system
1
9c1 + 6c2 = 4x f (x)dx
0
1
15
c1 − 5c2 = − 5x2 f (x)dx.
2 0
Multiplying the second equation by 2 and then simplifying the resulting system we
arrive at the new system of equations
1
1
3c1 + 2c2 = 4x f (x)dx
3 0
1
−3c1 − 2c2 = −2 x2 f (x)dx.
0
In such case to determine the solutions we let 3c1 = a and obtain 2c2 = −a +
1
2 0 x2 f (x)dx. This gives the solutions
2
y(x) = f (x) − 6 ∑ ci αi (x)
i=1
1
a −a
x2 f (x)dx x2
= f (x) − 6 x+ +
3 2 0
In such case to determine the solutions we let 5c1 = a and obtain 6c2 = −a −
1
10 0 x2 f (x)dx. This gives the solutions
2
y(x) = f (x) + 2 ∑ ci αi (x)
i=1
−a 5 1 2
a 2
= f (x) + 2 x + − x f (x)dx x .
5 6 3 0
y(x) = f (x).
5.5.1 Exercises
Exercise 5.32 Solve the Fredholm equation
1
y(x) = x2 + λ xξ y(ξ )dξ , for λ = −1.
0
Fredholm Integral Equations with Separable Kernels 355
(a) Find the matrix A and show the roots of the equation
det(I − λ A) = 0
are √ √
4 15 4 15
λ=√ , √ .
15 − 4 15 + 4
(b) Use (a) and discuss the solutions of the Fredholm equation.
(c) Find the solution for nonhomogeneous Fredholm equation
1
y(x) = x + λ (xξ 2 + x2 ξ )y(ξ )dξ .
0
Exercise 5.34 Find all values of λ so that the nonhomogeneous Fredholm equation
1
x
y(x) = e + λ xξ y(ξ )dξ
0
when F(x) = x and when F(x) = 1, under the assumption that λ ̸= ±1/π.
(c) Prove that the equation
2π
1
y(x) = sin(x + ξ ) y(ξ ) dξ + F(x)
π 0
possesses no solution when F(x) = x, but that it possesses infinitely many solu-
tions when F(x) = 1. Determine all such solutions.
356 Integral Equations
(d) Determine the most general form of the prescribed F(x), for which the integral
equation
2π
sin(x + ξ ) y(ξ ) dξ = F(x),
0
of the first kind, possesses a solution.
Exercise 5.37 In light of Example 5.8, discuss the solutions of the nonhomogeneous
Fredholm equation
1
y(x) = F(x) + λ (1 − 3xξ )y(ξ )dξ .
0
Exercise 5.38 In light of Example 5.8, discuss the solutions of the nonhomogeneous
Fredholm equation
1
y(x) = F(x) + λ (x + ξ )y(ξ )dξ .
−1
In addition, find an example of F(x) that satisfies all the relevant condition(s) that
you obtain in studying the solutions.
Exercise 5.39 Solve
1
y(x) = 1 + λ (x + 3x2 ξ )y(ξ )dξ .
0
where all functions are continuous on their respective domains. We assume the kernel
K in (5.84) is symmetric. That is
Throughout this section it is assumed that the kernel K is symmetric. As in the case
of nonhomogeneous differential equations, first we learn how to find the solution of
the homogeneous integral equation
b
y(x) = λ K(x, ξ )y(ξ )dξ , (5.85)
a
and then utilize it to find the general solution of (5.84).
Recall that, If λ and y(x) satisfy (5.85) we say λ is an eigenvalue and y(x) is the
corresponding eigenfunction. It should cause no confusion between λn being all the
eigenvalues of (5.85) and the value of λ for (5.84). In most cases, we will require
λ ̸= λn . We have the following theorem regarding eigenvalues and corresponding
eigenfunctions of (5.85).
Theorem 5.4 Assume the kernel of the homogeneous integral equation (5.85) is sym-
metric. Then the following statements hold.
(i) If λm and λn are two distinct eigenvalues, then the corresponding eigenfunctions
ym (x) and yn (x) are orthogonal on the interval [a, b]. That is
b
ym (x)yn (x)dx = 0, for m ̸= n.
a
Since
y(x)ȳ(x) = u2 (x) + v2 (x) > 0,
the above expression takes the form
b
λ̄
1− (u2 (x) + v2 (x))dx > 0,
λ a
which is not identically zero. Thus, our assumption that β ̸= 0, has led to a contra-
diction. Therefore, we must conclude that β = 0, and hence λ is real. This completes
the proof.
Example 5.9 See Example 5.3. □
Next, we develop the solution for the nonhomogeneous integral equation
(5.84).
We begin with Hilbert-Schmidt theorem.
Theorem 5.5 (Hilbert-Schmidt Theorem) Assume that there is a continuous func-
tion g for which
b
F(x) = K(x, ξ )g(ξ )dξ .
a
Then F(x) can be expressed as
∞
F(x) = ∑ cn yn (x),
n=1
Symmetric Kernel 359
As a result of Theorem 5.5, we may say the function F is generated by the continuous
function g.
Theorem 5.6 Let y(x) be a solution to (5.84) where λ is not an eigenvalue of (5.85).
Then
∞
fn
y(x) = f (x) + λ ∑ yn (x), (5.88)
n=1 λn − λ
where b
fn = f (x)yn (x)dx, (5.89)
a
and the λn and yn are the eigenvalues and normalized eigenfunctions of (5.85).
with b
fn = f (x)yn (x)dx.
a
Next we multiply (5.84) by yn (x) and integrate from a to b
b b b
y(x)yn (x)dx = fn + λ ( K(x, ξ )y(ξ )dξ )yn (x)dx
a a a
b b
= fn + λ ( K(ξ , x)yn (x)dx)y(ξ )dξ (since K(x, ξ ) = K(ξ , x))
a a
b
λ
= fn + yn (ξ )y(ξ )dξ .
λn a
360 Integral Equations
b
After replacing ξ with x in the right side and solving for a y(x)yn (x)dx we arrive
at b
fn λn f n
y(x)yn (x)dx = = .
n −λ
λ
a 1 − λn
λ
Utilizing (5.91) we obtan
λn f n λ fn
cn = − fn = .
λn − λ λn − λ
Finally, using (5.90) we arrive at the solution
∞
fn
y(x) = f (x) + λ ∑ λn − λ yn (x).
n=1
y(0) = 0, y(1) = 0.
See Example 5.3. This boundary value problem is a Sturm-Liouville problem. From
Example 5.3 we have the eigenvalues λn = n2 π 2 , n = 1, 2, . . . with corresponding
eigenfunctions
yn (x) = sin(nπx), n = 1, 2, . . .
Then the normalized eigenfunctions are
√
yn (x) = 2 sin(nπx), n = 1, 2, . . . .
Moreover,
√
1 1 √ (−1)n+1 2
fn = f (x)yn (x)dx = x 2 sin(nπx)dx = , n = 1, 2, . . . .
0 0 nπ
Symmetric Kernel 361
2λ ∞
(−1)n+1 sin(nπx)
y(x) = x + ∑ , λ ̸= n2 π 2 , n = 1, 2, . . . .
π n=1 n(n2 π 2 − λ )
□
Example 5.11 Consider the Fredholm integral equation
1
y(x) = (x + 1)2 + λ (xξ + x2 ξ 2 )y(ξ )dξ . (5.93)
−1
It is clear that the kernel K(x, ξ ) = xξ + x2 ξ 2 = K(ξ , x). To apply Theorem 5.6, we
first need to find the eigenvalues and corresponding normalized eigenfunctions of
1
y(x) = λ (xξ + x2 ξ 2 )y(ξ )dξ . (5.94)
−1
where
1 1
C1 = ξ y(ξ )dξ , C2 = ξ 2 y(ξ )dξ .
−1 −1
From (5.95), we see that
y(ξ ) = λ ξC1 + λ ξ 2C2 . (5.96)
Substituting y(ξ ) given by (5.96) into C1 and C2 gives
1
2
ξ λ ξC1 + λ ξ 2C2 dξ = λC1 + 0C2 ,
C1 =
−1 3
and 1
2
ξ 2 λ ξC1 + λ ξ 2C2 dξ = 0C1 + λC2 .
C2 =
−1 5
Thus, we have the system of equations
2
(1 − λ )C1 + 0C2 = 0
3
2
0C1 + (1 − λ )C2 = 0.
5
For nontrivial values of C1 and C2 we must have
1 − 2 λ
3 0
= 0.
0 1 − 25 λ
362 Integral Equations
y1 (x) y2 (x)
φ1 (x) = q , φ2 (x) = q .
1 2 1 2
y
−1 1 (x)dx y
−1 2 (x)dx
and √
x2 x2 10
φ2 (x) = q = .
1 4 2
−1 x dx
and √ √
1 1
x2 10 8 10
f2 = f (x)φ2 (x)dx = (x + 1)2 dx = .
−1 −1 2 15
3 5
Thus, for λ ̸= λ1 = 2 and λ ̸= λ2 = 2, the solution is
2
fn
y(x) = (x + 1)2 + λ ∑ λn − λ φn (x)
n=1
2 f1 f2
= (x + 1) + λ φ1 (x) + φ2 (x)
λ1 − λ λ2 − λ
√
2 6 √ √
8 10 2 √
2 3 x 6 15 x 10
= (x + 1) + λ 3 +5 .
2 −λ
2 2 −λ
2
Symmetric Kernel 363
5.6.1 Exercises
Exercise 5.42 (a) Find the eigenvalues and corresponding normalized eigenfunc-
tions for the homogeneous integral equation
1
y(x) = λ K(x, ξ )y(ξ )dξ
0
have a solution?
Exercise 5.43 (a) Determine the eigenvalues and the corresponding normalized
eigenfunctions for
π
y(x) = λ cos(x + ξ ) y(ξ ) dξ .
0
(b) Solve π
y(x) = F(x) + λ cos(x + ξ ) y(ξ ) dξ
0
when λ is not characteristic and F(x) = 1.
364 Integral Equations
(c) Obtain the general solution (when it exists) if F(x) = sin(x), considering all
possible cases.
Exercise 5.44 (a) Determine the eigenvalues and the corresponding normalized
eigenfunctions for
1
y(x) = λ K(x, ξ )y(ξ ) dξ
0
where K is the Green’s function that was obtained in Example 5.5 for the bound-
ary value problem
Exercise 5.45 (a) Determine the eigenvalues and the corresponding normalized
eigenfunctions for
1
y(x) = λ (xξ + 1)y(ξ )dξ .
−1
(b) Solve 1
y(x) = x + λ (xξ + 1)y(ξ )dξ .
−1
1
y(x) = λ (x3 ξ 3 + x2 ξ 2 )y(ξ )dξ .
−1
(b) Solve 1
y(x) = x + λ (x3 ξ 3 + x2 ξ 2 )y(ξ )dξ .
−1
where f and K are continuous on their respective domains. We emphasize that the
kernel K in (5.97) does not need to be symmetric or separable. We define a successive
approximation by
y0 (x) = f (x)
x
y1 (x) = f (x) + λ K(x, ξ )y0 (ξ )dξ
a x
y2 (x) = f (x) + λ K(x, ξ )y1 (ξ )dξ
a
..
. x
yn (x) = f (x) + λ K(x, ξ )yn−1 (ξ )dξ , n = 1, 2, . . . (5.98)
a
Our aim here is to give a concise method on how to define a sequence of functions
{yn } successively for (5.97) and obtain an infinite series that represents the solu-
tion. Let y0 be an initial approximation. Then replacing y in the integrand by y0
gives x
y1 (x) = f (x) + λ K(x, ξ )y0 (ξ )dξ .
a
Substituting this approximation for y in the integrand gives the approximation
x ξ
y2 (x) = f (x) + λ K(x, ξ ) f (ξ ) + λ K(ξ , ξ1 )y0 (ξ1 )dξ1 dξ
a x a
y = f + λ Ly. (5.100)
In addition, y1 , y2 , and y3 may also be rewritten so that
y1 = f + λ Ly0 , y2 = f + λ L f + λ 2 L2 y0
and
y3 = f + λ L f + λ 2 L2 f + λ 3 L3 y0 .
Continuing in this fashion we obtain the successive approximation
n−1
yn (x) = f (x) + ∑ λ i Li f (x) + λ n Ln y0 (x), (5.101)
i=1
Proof Let
M = max |K(x, ξ )| and C = max |y0 (x)|.
a≤x,ξ ≤b a≤x≤b
Then
x
Ly0 (x) = K(x, ξ )y0 (ξ )dξ
a
x
≤ K(x, ξ )y0 (ξ )dξ
a
≤ (x − a)MC, a ≤ x ≤ b.
Similarly,
x
2
L y0 (x) = K(x, ξ )Ly0 (ξ )dξ
a
Iterative Methods and Neumann Series 367
x
≤ K(x, ξ )Ly0 (ξ )dξ
a x
≤ M(ξ − a)MCdξ
a
(x − a)2 2
≤ M C, a ≤ x ≤ b.
2
Continuing this way, we arrive at
n n
L y0 (x) ≤ (x − a) M nC ≤ (b − a) M nC.
n
(5.103)
n! n!
To complete the induction argument, we assume (5.103) holds for n and show it holds
for n + 1. Using (5.103), we arrive at
x
n+1
K(x, ξ )Ln y0 (ξ )dξ
L y0 (x) =
a
x
K(x, ξ )Ln y0 (ξ )dξ
≤
a
x
(ξ − a)n n
≤ M
M C dξ
a n!
(x − a)n+1 n+1
≤ M C
(n + 1)!
(b − a)n+1 n+1
≤ M C.
(n + 1)!
This completes the induction argument. Now, it is clear from (5.103) that
(b − a)n n
lim |λ n ||Ln y0 (x) = lim |λ n |
M C=0
n→∞ n→∞ n!
uniformly for all a ≤ x ≤ b and for all values of λ . This shows the infinite series
(5.101) converges for any finite λ .
Lemma 20 The solution of (5.97) is given by (5.102).
Proof By Lemma 19 we have the sequence {yn (x)} converges uniformly on [a, b],
say to a function y(x). Consider the successive iterations
x
yn (x) = f (x) + λ K(x, ξ )yn−1 (ξ )dξ , n = 1, 2, . . .
a
Then,
In the next lemma we show that if y satisfies (5.104), then it is unique. In addition,
the author assume the reader is familiar with Banach spaces. For more on Banach
spaces we refer to [19] of Chapter 4.
Lemma 21 If y satisfies (5.104), then it is unique provided that
1
λ< , (5.105)
M(b − a)
Thus, the operator P is a contraction and according to Banach fixed point theorem,
it has a unique fixed point.
Moreover, if
1
|λ | < , (5.108)
M(b − a)
then the solution of (5.106) is unique and it is given by (5.107). The representation
(5.107) is called the Neumann series.
y = f + λ Ly
where n
L y0 (x) ≤ (b − a)n M nC.
(5.113)
Then,
|λ n |Ln y0 (x) ≤ |λ |n (b − a)n M nC.
Furthermore,
provided that
1
|λ | < . (5.114)
M(b − a)
370 Integral Equations
Notice that condition (5.114) is a sufficient condition and hence without it the infinite
series may or may not converge.
Observe that the convergence for the infinite series in the case of Volterra integral
equation of the second kind was irrespective of the values of λ .
Example 5.12 Use both methods, iterations and Neumann series to solve the
Volterra integral equation
x
y(x) = x + λ (x − ξ )y(ξ )dξ .
0
x3 x5 x7 ∞
x2n+1
yn (x) = x + λ +λ2 +λ3 +··· = x+ ∑ λn ,
3! 5! 7! n=1 (2n + 1)!
which converges for all values of λ and x. Next we compute the Neumann series. Let
x
Ly(x) = K(x, ξ )y(ξ )dξ ,
0
then x
1 1 x3
L f (x) = L x = (x − ξ )ξ dξ = .
0 3!
Iterative Methods and Neumann Series 371
□
In what to follow, we twist the Neumann series and define the resolvent for the
Volterra integral equation given by (5.97). As before, define the operator
x
(L f )(x) = K(x, ξ ) f (ξ )dξ .
a
Then,
then we have x
(L2 f )(x) = K2 (x, ξ1 ) f (ξ1 )dξ1 .
a
Following in the same steps we arrive at
x
3
(L f )(x) = K3 (x, ξ1 ) f (ξ1 )dξ1 ,
a
372 Integral Equations
where x
K3 (x, ξ1 ) = K(x, ξ )K2 (ξ , ξ1 )dξ ,
ξ1
and in general, x
(Ln f )(x) = Kn (x, ξ1 ) f (ξ1 )dξ1 ,
a
where x
Kn (x, ξ1 ) = K(x, ξ )Kn−1 (ξ , ξ1 )dξ .
ξ1
The kernels K1 = K, K2 , K3 , . . . are called the iterated kernels. Consequently, the
Neumann series (5.107) can be written as
∞ x
i−1
y(x) = f (x) + λ ∑ λ Ki (x, ξ ) f (ξ )dξ
i=1 a
x ∞
i−1
= f (x) + λ ∑ λ Ki (x, ξ ) f (ξ )dξ
a i=1
x
= f (x) + λ Γ(x, ξ ; λ ) f (ξ )dξ , (5.116)
a
where
∞
Γ(x, ξ ; λ ) = ∑ λ i−1 Ki (x, ξ )dξ , (5.117)
i=1
is the resolvent kernel. We arrived at the following theorem.
Theorem 5.8 Let f and K be continuous. Then the Volterra equation
x
y(x) = f (x) + λ K(x, ξ )y(ξ )dξ , a≤x≤b (5.118)
a
Then
x
K2 (x, ξ1 ) = K(x, ξ )K(ξ , ξ1 )dξ
ξ1
Iterative Methods and Neumann Series 373
x
2 −ξ 2 2 −ξ 2
= ex eξ 1 dξ
ξ1
x
2 −ξ 2 2 −ξ 2
= ex 1 dξ = ex 1 (x − ξ1 ).
ξ1
Similarly
x
K3 (x, ξ1 ) = K(x, ξ )K2 (ξ , ξ1 )dξ
ξ
1x
2 −ξ 2 2 −ξ 2
= ex eξ 1 (ξ − ξ1 )dξ
ξ1
x
2 −ξ 2 2 −ξ 2 (x − ξ1 )2
= ex 1 (ξ − ξ1 )dξ = ex 1 .
ξ1 2
Additionally
x
K4 (x, ξ1 ) = K(x, ξ )K3 (ξ , ξ1 )dξ
ξ1
x
2 −ξ 2 2 −ξ 2 (ξ − ξ1 )2
= ex eξ 1 dξ
ξ1 2
2 −ξ 2 (x − ξ1 )3
= ex 1 .
3!
Inductively, we arrive at the formula
2 −ξ 2 (x − ξ1 )n−1
Kn (x, ξ1 ) = ex 1 , n = 1, 2, . . . .
(n − 1)!
Thus, the resolvent kernel is
∞
Γ(x, ξ ; λ ) = ∑ λ n−1 Kn (x, ξ )dξ
n=1
∞
2 2 (x − ξ1 )n−1
= ∑ ex −ξ1 (n − 1)!
n=1
2 −ξ 2
∞
(x − ξ1 )n−1
= ex 1
∑
n=1 (n − 1)!
2 −ξ 2
= ex 1 ex−ξ1 .
□
The next example displays another approach of finding the resolvent kernel.
Example 5.14 Consider the Volterra equation
x
ϕ(x) = f (x) + λ ξ ϕ(ξ )dξ ,
a
d −λ x2 x2
e 2 ϕ(x) = e−λ 2 f ′ (x).
dx
Integrating both sides from a to x leads to
x
2 2 ξ2
−λ x2 −λ a2
e ϕ(x) − e ϕ(a) = e−λ 2 f ′ (ξ )dξ .
a
The term on the right hand side can be integrated by parts to obtain
x
x2 a2 a2 x2 ξ2
e−λ 2 ϕ(x) − e−λ 2 ϕ(a) = −e−λ 2 f (a) + f (x)e−λ 2 +λ ξ e−λ 2 f ξ )dξ .
a
x2
Multiplying both sides with eλ 2 yields
x
λ 2 −ξ 2 )
ϕ(x) = f (x) + λ ξ e 2 (x f ξ )dξ . (5.121)
a
equation x′ (t) = f (t, x(t)) for t ∈ I, and x(t0 ) = x0 is given by the nonlinear integral
equation t
x(t) = x0 + f ξ , x(ξ ) dξ .
t0
To be consistent with our notations, we consider the nonlinear Volterra integral equa-
tion x
y(x) = h(x) + f ξ , y(ξ ) dξ .
t0
where the functions h and f are continuous on their respective domains. As we have
done before, we define a successive approximation or Picard’s iteration by
y0 (x) = h(x)
x
y1 (x) = h(x) + f (ξ , y0 (ξ ))dξ
0 x
y2 (x) = h(x) + f (ξ , y1 (ξ ))dξ
0
..
. x
yn (x) = h(x) + f (ξ , yn−1 (ξ ))dξ , n = 1, 2, . . . (5.122)
0
5.7.1 Exercises
Exercise 5.47 Provide all the detail for (5.113).
Exercise 5.48 Find the Neumann series for the Volterra integral equation
x
y(x) = 1 − (x − ξ )y(ξ )dξ ,
0
using:
(a) iterations,
(b) Neumann series. Answer: y(x) = ex + 1.
Exercise 5.50 Consider the Fredholm integral equation
1
y(x) = 1 + λ (x − ξ )y(ξ )dξ .
0
Find:
(a) y1 (x), y2 (x) and y3 (x),
(b) the first three terms of the Neumann series.
Exercise 5.51 Find the Neumann series for the following integral equations.
(a) x
y(x) = 1 − 2 ξ y(ξ )dξ .
0
(b) 1
1
y(x) = x + (x + ξ )y(ξ )dξ .
2 −1
Exercise 5.52 Solve the Volterra integral equation using the Neumann series
(a) x
y(x) = 1 + x2 − 2 (x − ξ )y(ξ )dξ .
0
Answer: y(x) = 1.
Iterative Methods and Neumann Series 377
(b) x
y(x) = x cos(x) + ξ y(ξ )dξ .
0
Very hard to simplify.
Answer: y(x) = sin(x).
Exercise 5.53 Consider the integral equation
1
23 1
y(x) = x+ xξ y(ξ )dξ . (5.123)
6 8 0
xξ
(a) Show that the iterative kernel Kn (x, ξ ) = 3n−1
.
1 24
(b) Show that the resolvent kernel simplifies to Γ(x, ξ ; λ ) = xξ 1−1/24 = 23 xξ .
(x−ξ )n−1
(a) Show that the iterative kernel Kn (x, ξ ) = (n−1)! .
and show the obtained {xn (t)} of each of the of iterate converges to the true solution
of each of the (IVP) (True solution is the solution found by solving the (IVP)).
Exercise 5.60 Consider the coupled system of differential equations
dy dz 1
= z(x), = x3 (y(x) + z(x)); y(0) = 1 and z(0) = .
dx dx 2
Convert the system to integral equations and find the iterates
{y1 (x), y2 (x), y3 (x), z1 (x), z2 (x), z3 (x)}.
Let D(x, ξ ) be the approximate and degenerta kernel of K. Then, the approximate
Fredholm integral equation of the second kind of (5.126) may take the form
b
e(x) = f (x) + λ D(x, ξ )e(ξ )dξ , (5.127)
a
where the kernel D is degenerate. We may use Section 5.5 to obtain the solution e(x)
of (5.127), which is the approximate solution of (5.126). Such approximation will
involve an error which we denote by
ε = |y(x) − e(x)|
for small and positive ε. For illustrational purpose we propose the following exam-
ple.
Approximating Non-Degenerate Kernels 379
x4 ξ 3
D(x, ξ ) = x2 ξ − ,
3!
which is degenerate. The approximate Fredholm integral equation is then
1
x4 ξ 3
e(x) = cos(x) + λ x2 ξ − e(ξ )dξ . (5.129)
0 3!
with
ξ3
α1 (x) = x2 , α2 (x) = −x4 , β1 (ξ ) = ξ , β2 (ξ ) = .
3!
Next we use 1
ai j = βi (x)α j (x)dx, i, j = 1, 2
0
to compute the matrix A = (ai j ).
1 1
1
a11 = β1 (x)α1 (x)dx = x3 dx = ,
0 0 4
1 1
1
a12 = β1 (x)α2 (x)dx = − x5 dx = − ,
0 0 6
1 1 5
x 1
a21 = β2 (x)α1 (x)dx = dx = ,
0 0 6 36
1 1 7
x 1
a22 = β2 (x)α2 (x)dx = − dx = − .
0 0 6 48
So we have
1
− 16
A= 4 .
1 1
36 − 48
380 Integral Equations
The values of e(x) are compared to the values of the actual solution of (5.128) which
can be easily proved to be y(x) = 1, at various values of x ∈ [0, 1], in the table below.
□
x 0 0.25 0.5 0.75 1
y(x) 1 1 1 1 1
e(x) 1 0.99998 0.99926 0.99963 0.998437
5.8.1 Exercises
Exercise 5.61 Verify that y(x) = 1 is a solution of the Fredholm integral equation
given by (5.128).
Exercise 5.62 Redo Example 5.16 by taking
x4 ξ 3 x6 ξ 5
D(x, ξ ) = x2 ξ − + .
3! 5!
Laplace Transform and Integral Equations 381
by considering the first three terms of the Maclaurin series of the kernel.
Example 5.17 In this example we develop the Laplace transform of basic functions.
We do so by considering different values of f (t).
(a) For f (t) = 1, then
1 −st ∞ 1
∞
−st
Ł[1] = e dt = − e = , s > 0.
0 s 0 s
382 Integral Equations
(e)
Ł[cos at + i sin at] = Ł[eiat ] by DeMoivre.
1 s ia
= 2 Ł[eiat ] =
2
+ 2 .
s − ia s + a s + a2
Hence, equating real and imaginary parts and using linearity
s
Ł[cos at] =
s2 + a2
a
Ł[sin at] = .
s2 + a2
We can apply the convolution property from the table to find
f (s)
Ł−1 .
s
1
Ł−1 [ f (s)] = f (t), and Ł−1 [ ] = 1 = g(t),
s
so t
f (s)
Ł−1 = f (θ ) dθ .
s 0
n!
Ł[t n ] = , n = 0, 1, 2, . . . .
sn+1
□
Laplace Transform and Integral Equations 383
You can access the Laplace Transforms of all the functions you are likely to meet
online thanks to computer algebra tools like Mathematica, Matlab, and Maple. The
packages also provide an inversion technique to find a function f from a given F(s).
For example
1 π 1/2
Ł[t 1/2 ] = ,
2 s3
and π 1/2
Ł[t −1/2 ] = .
s
Ł−1 [F(s)] = f .
Example 5.18 We use Laplace transform to solve the initial value problem
dy
2 − y = sint, y(0) = 1.
dt
We begin by taking the Laplace transform on both sides and obtain
1
2(sY (s) − 1) −Y (s) = .
s2 + 1
Solving for Y (s) gives
2s2 + 3
Y (s) = .
(2s − 1)(s2 + 1)
Taking the Laplace inverse we arrive at
2s2 + 3
y(t) = Ł−1 [ ].
(2s − 1)(s2 + 1)
Next we use partial fractions. That is
2s2 + 3 A Bs +C
= + ,
(2s − 1)(s2 + 1) 2s − 1 s2 + 1
384 Integral Equations
h = f ∗ g.
Theorem 5.10
f ∗g = g∗ f.
Let F(s) and G(s) be the Laplace transform of the functions f , and g, respec-
tively. We are interested in computing Ł−1 [F(s)G(s)]. We have the following the-
orem
Theorem 5.11 (Convolution Theorem) Let F(s) and G(s) be the Laplace trans-
form of the functions f , and g, respectively. Then
t
Ł[ f ∗ g] = Ł[ f (t − τ)g(τ) dτ] = F(s)G(s).
0
Then
∞ ∞
−st
H(s) = ( e f (t)dt)( e−sτ g(τ)dτ)
0 0
Laplace Transform and Integral Equations 385
∞ ∞
e−s(t+τ) f (t)dt g(τ)dτ.
=
0 0
Replacing the dummy variable of integration u with t and then compare the result
with (5.133), we clearly see that
∞ t
e−st h(t)dt = f (t − τ)g(τ) dτ
0 0
∞ t
f (t − τ)g(τ) dτ e−st dt.
=
0 0
1 1
Ł−1 [F(s)] = Ł−1 [G(s)] = Ł−1 [ ]= sin(2t).
s2 + 4 2
Thus, t
1 1 1
h(t) = sin(2t) ∗ sin(2t) = sin 2(t − τ) sin(2τ)dτ.
2 2 4 0
□
Before we consider the next example, we define the error function.
Definition 5.11 The error function is the following improper integral considered as
a real function er f : R → R, such that
x
2 2
er f (x) = √ e−z dz,
π 0
Next we state the gamma function, which is needed in future work. We denote the
Gamma function by Γ and it is defined by
∞
Γ(x) = ux−1 e−u du, x > 0.
0
Γ(x + 1) = xΓ(x),
Γ(n) = (n − 1)!.
We will also need the following formula. For positive integer n, we have
√
1 (n − 2)!! π
Γ( n) = , (5.134)
2 2(n−1)/2
where n!! is a double factorial. For example,
√ √ √
Γ(1/2) = π, Γ(3/2) = π/2, Γ(5/2) = (3 π)/4, etc,
1 (2n − 1)!! √
Γ( + n) = π,
2 2n
and
1 (−1)n 2n √
Γ( − n) = π.
2 (2n − 1)!!
Let Ł[r(t)] = R(s) and take Laplace transform on both sides of (5.136).
This gives √
1 Γ(1 − 1/2) 1 π
R(s) = − R(s) = − R(s).
s s1−1/2 s s1/2
Laplace Transform and Integral Equations 387
1
R(s) = √ .
s1/2 (s1/2 + π)
Or,
1 −1 1 1 −1 1
r(t) = √ Ł −√ Ł √ . (5.138)
π s1/2 π s1/2 + π
h i 2 √
By our provided table, we see that Ł−1 √s+a1
= √π1√t − a ea t erf(a t), then
√ πt √ √
1 1 1 1
r(t) = √ √ √ − √ √ √ − π e erf( π t) .
π π t π π t
This simplifies to
1 1 √ √
r(t) = √ − √ + eπt erf( π t).
π t π t
Finally, √ √
r(t) = eπt erf( π t).
□
t −1/2 ( πs )1/2
388 Integral Equations
eat 1/(s − a)
sin ωt ω/(s2 + ω 2 )
cosωt s/(s2 + ω 2 )
t sin ωt 2ωs/(s2 + ω 2 )2
t cos ωt (s2 − ω 2 )/(s2 + ω 2 )2
eat t n n!/(s − a)n+1
eat sin ωt ω/ (s − a)2 + ω 2
sinh ωt ω/(s2 − ω 2 )
cosh ωt s/(s2 − ω 2 )
Shift of g: eat g(t) t
G(s − a)
Convolution: f (t) ∗ g(t) = 0 f (t − τ)g(τ) dτ G(s)F(s)
t 1
Integration: 1 ∗ g(t) = 0 g(τ) dτ s G(s)
Derivative: y′ sY (s) − y(0)
y′′ √ s2Y (s) − √sy(0) − y′ (0)
(1 + 2at)/
√ πt (s +
√a)/s s
e−at / πt √ √ s + a√
1/
(ebt − e−at )/2t √ πt √s − a − √s − b
(e−bt√− e−at√)/2t πt s +√a + s + b
er f ( at)/√ a√ 1/(s√ s√+ a)
eat er f ( at)/ a 1/( s s − a)
2 √ √
√1 − beb t er f (b t)] 1/( s + b)
πt
f (ct) 1/(cF(1/c)), c > 0
f (n) (t) sn F(s) − sn−1 f (0) − . . . − f (n−1) (0)
(−t)n f (t) F (n) (s)
u(t − a) f (t − a) e−as F(s)
u(t − a) e−as /s
Γ(v+1)
t v , (v > −1) sv+1
.
5.9.2 Exercises
Exercise 5.65 Solve the initial value problem using Laplace transform
(a) y′′ + 9y = u(t − 3), y(0) = 1, y′ (0) = 2.
(b) 2 dy
dt − y = sin(t), y(0) = 1.
Exercise 5.66 Express h in the form f ∗ g, when
1
(a) H(s) = s3 −3s
.
1
(b) H(s) = (s2 +4)(s2 +9)
.
1
(c) H(s) = 3 .
s 2 (s2 +4)
Laplace Transform and Integral Equations 389
Exercise 5.67 Use Laplace transform and write down the solution of the integral
equation t
y(t) = f (t) + λ e(t−τ) y(τ) dτ.
0
t
Answer: y(t) = f (t) + λ 0 e(λ +1)(t−τ) f (τ) dτ.
Exercise 5.68 Use Exercise 5.67 to solve the integral equation
t
y(t) = cos(t) − e(t−τ) y(τ) dτ.
0
and t
x(t) = f (t) + a(t − s)x(s)ds,
0
then t
x(t) = f (t) − r(t − s) f (s)ds.
0
Exercise 5.73 Solve the Abel equation
t
1
√ y(τ)dτ = f (t),
0 t −τ
where f (t) is a given function with f (0) = 0 and f ′ admits a Laplace transform.
390 Integral Equations
The next example is concerned with the existence of multiple solutions on an integral
equation.
Example 5.22 Consider the integral equation
x
y(ξ )
y(x) = p dξ .
0 x2 − ξ 2
It is clear that y(x) = 0, is a solution. Additionally, y(x) = x is another solution since
x x
y(ξ ) ξ
p dξ = p dξ
0 x2 − ξ 2 0 x2 − ξ 2
x2
1 1
= √ dξ = x,
2 0 u
where we have used the transformation u = x2 − ξ 2 . Note that the kernel K(x, ξ ) =
√ 1 , is well behaved under integration. That is for any T > 0 we see that
x2 −ξ 2
T x
|K(x, ξ )|dξ dx < ∞,
0 0
and moreover, the function g(x) = x is certainly Lipschitz continuous. However, the
kernel is singular, in the sense that
K(x, ξ ) → ∞, as x → ξ .
□
The next theorem provide necessary conditions for the existence of unique solutions
of integral equations of the form
t
x(t) = f (t) + g(t, s, x(s))ds (5.139)
0
where for Ψ ∈ X the norm || · || is taken to be ∥Ψ∥ = maxt∈[0,T ] {|Ψi (t)|}. Let φ ∈ X
and define an operator D : X → X, by
t
D(φ )(t) = f (t) + g(t, s, φ (s))ds.
0
Next we state and prove Gronwall’s inequality, which plays an important role in the
next results.
Theorem 5.13 (Gronwall’s inequality) Let C be a nonnegative constant and let u, v
be nonnegative continuous functions on [a, b] such that
t
v(t) ≤ C + v(s)u(s)ds, a ≤ t ≤ b, (5.140)
a
Odd Behavior 393
then t
v(t) ≤ Ce a u(s)ds , a ≤ t ≤ b. (5.141)
In particular, if C = 0, then v = 0.
t
Proof Assume C > 0 and let h(t) = C + v(s)u(s)ds. Then
a
h′ (t) − h(t)u(t) ≤ 0.
t
Multiply both sides of the above expression by the integrating factor e− a u(s)ds , to
get t ′
h(t)e− a u(s)ds ≤ 0.
Finally, t
v(t) ≤ h(t) ≤ Ce a u(s)ds , C = h(a).
If C = 0 then form (5.140) it follows that
t t
1
v(t) ≤ v(s)u(s)ds ≤ + v(s)u(s)ds a ≤ t ≤ b,
a m a
|x(t)| ≤ Ae−βt .
394 Integral Equations
Or,
|x(t)| ≤ Ae(λ −α)t = Ae−βt .
This completes the proof.
The next theorem shows that if the signs of the function g are right, then the growth of
g has nothing to do with continuation of solutions. Before we embark on the details,
the following is needed. Let x : R → R be continuous. Observing that
√ 1
|x| = x2 = (x2 ) 2 ,
Suppose f and f ′ are continuous. In addition, we assume ∂ K(t,s)∂s and K(t, s) are
continuous for 0 ≤ s ≤ t < ∞. If for y ̸= 0, yg(y) > 0 and for each T > 0 we have
T ∂ K(u,t)
K(t,t) + du ≤ 0,
t ∂u
then each solution y(t) of (5.142) can be continued for all future times.
Odd Behavior 395
y(t)g(y(t)) |y(t)||g(y(t))|
= = |g(y(t))|.
|y(t)| |y(t)|
Since H > 0 and H is decreasing along the solutions, we see that H is bounded by
some constant, and hence |y(t)| is bounded on [0, η). As a matter of fact, we have
from the definition of H that
This yields
|y(t)| ≤ DeMt ≤ DeMα .
This completes the proof.
Then, for any η > 0 we have | f ′ (t)| = et ≤ eη := M. It readily follows that yg(y) =
−1/2 −3/2
y6 > 0 when y ̸= 0. Let K(t, s) = − t − s + 1 . Then Ku (u,t) = 12 u −t + 1 .
Moreover, for any T > 0 we have
T T
Ku (u,t)du = −1 + 1 −3/2
K(t,t) + u−t +1 du
t t 2
−1/2
= −1 − T − t + 1 + 1 ≤ 0.
Hence, by Theorem 5.15 solutions can be continued, or continuable, for all future
times. □
5.10.1 Exercises
Exercise 5.76 Construct an example that satisfies the hypothesis of Theorem 5.14.
Exercise 5.77 Use Theorem 5.15 to show that solutions of the integral equation
t
t y3 (s)
y(t) = e − ds
0 (t − s + 1)2
Show that if y(t) is a solution of the above integral equation on some interval [0, α),
then it is bounded, and, hence, it can be continued for all future times.
α
Hint: Convince yourself of the fact that | f (t)| + 0 M(s, α)ds ≤ Q, for some positive
constant Q, and then apply Gronawall’s inequality.
Appendices
A
Fourier Series
This appendix covers the basic main topics of Fourier series. We briefly discuss
Fourier series expansion, including sine and cosine. We provide applications to the
heat problem in a finite slab by utilizing the concept of separation of variables. We
end this appendix by studying the Laplacian equation in circular domains.
A.1 Preliminaries
We start with some basic definitions.
Definition A.1 A function f (x) is said to be periodic with period p if f (x+ p) = f (x)
for all x in the domain of f . This means that the function will repeat itself every p
units.The main period is the smallest positive period of a function.
For example, the trig functions sin x and cos x are periodic with period 2π, as well
as with period 4π, 6π, 8π, etc. The function sin nx is periodic, with main period 2π
n ,
though it also has period 2π. If two functions are period with the same period, then
any linear combination of those functions is periodic with the same period.This is
important fact since the infinite sum
∞
a0
+ ∑ (an cos nx + bn sin nx), (A.1)
2 n=1
has period 2π. Expression (A.1) is known as the Fourier series, where an , bn are
called Fourier coefficients. Given a function f (x) that is periodic with period 2π,
then we write
∞
a0
f (x) = + ∑ (an cos nx + bn sin nx), (A.2)
2 n=1
where the Fourier coefficients of f (x) are given by the Euler formulas
π
1
a0 = f (x)dx, (A.3)
π −π
π
1
an = f (x) cos(nx)dx, n = 1, 2 . . . (A.4)
π −π
and
π
1
bn = f (x) sin(nx)dx, n = 1, 2 . . . (A.5)
π −π
This is an alternative way of expressing a function in an infinite series in terms of sine
and cosine. The above extension of f can be easily extended to periodic function with
period 2L. In such a case the above formulae takes the form
∞
a0 nπx nπx
+ ∑ (an cos + bn sin , (A.6)
2 n=1 L L
has period 2L. Given a function f (x) that is periodic with period 2L, then we
write
∞
a0 nπx nπx
f (x) = + ∑ (an cos + bn sin , (A.7)
2 n=1 L L
where the Fourier coefficients of f (x) are given by the Euler formulas
L
1
a0 = f (x)dx, (A.8)
L −L
L
1 nπx
an = f (x) cos dx, n = 1, 2 . . . (A.9)
L −L L
and
L
1 nπx
bn = f (x) sin dx, n = 1, 2 . . . (A.10)
L −L L
We have the following definition.
Definition A.2 Let x0 be a point in the domain of a function f . Then,
(a) the right-hand limit of f at x0 , denoted by f (x0+ ) is defined by
f (x) − f (x0+ )
f ′ (x0+ ) = x→x
lim ,
x>x0
0 x − x0
Finding the Fourier Coefficients 401
f (x) − f (x0− )
f ′ (x0− ) = x→x
lim .
0
x<x0
x − x0
Remark 23 If (a) and (b) of Definition A.2 are satisfied for ever x ∈ (a∗ , b∗ ), then we
say f is piecewise continuous on (a∗ , b∗ ), and we write f ∈ C p (a∗ , b∗ ). In addition
to (a) and (b), if (c) and (d) of Definition A.2 are satisfied for ever x ∈ (a∗ , b∗ ), then
we say f is piecewise smooth on (a∗ , b∗ ), and we write f ∈ C′p (a∗ , b∗ ).
We furnish the following example.
Example A.1 Consider
−x, x<0
f (x) =
x + 1, x > 0
Then
lim f (x) = f (0+ ) = 1, and lim f (x) = f (0− ) = 0.
x→0 x→0
x>0 x<0
Moreover,
f (x) − 1 (x + 1) − 1
f ′ (0+ ) = lim = lim = 1,
x→0 x x→0 x
x>0 x>0
and
f (x) − 1 −x
f ′ (0− ) = lim = lim = −1.
x→0 x x→0 x
x<0 x<0
m)x − cos(n + m)x , and sin(nx) cos(mx) = 12 sin(n + m)x + sin(n − m)x . Thus,
402 Fourier Series
1 1
cos(nx) cos(nx) = (cos(2nx) + 1) and sin(nx) sin(nx) = (1 − cos(2nx)),
2 2
so we can compute
π
1 π
cos(nx) cos(nx)dx = (cos(2nx) + 1)dx
−π 2 −π
1 sin(2nx) π
= (x + ) dx
2 2n + x −π
= π,
π
1 π
sin nx sin nxdx = (1 − cos(2nx))dx
−π 2 −π
1 sin(2nx) π
= (x − ) −π dx
2 2n
= π.
Now, if we multiply both sides of
∞
a0
f (x) = + ∑ an cos(nx) + bn sin(nx)
2 n=1
by cos(mx), and then integrate term by term, we have by using the orthogonality
concept that
π π
a0
f (x) cos(mx)dx = cos(mx)dx
−π −π 2
∞ π π
+ ∑ (an cos(nx) cos(mx)dx + bn sin(nx) cos(mx)dx)
n=1 −π −π
= 0 + am π.
Hence π
1
am = f (x) cos(mx)dx.
π −π
The other coefficients are derived similarly. Now we work out some exam-
ples.
Example A.2 Let (
2, 0<x<π
f (x) = .
−1, −π < x < 0
Finding the Fourier Coefficients 403
π
1
an = f (x) cos(nx)dx
π −π
0 π
1
= − cos(nx)dx + 2 cos(nx)dx
π −π 0
1 sin nx 0 sin nx π
= − +2
π n −π n 0
= 0, n = 1, 2, . . . .
Finally,
π
1
bn = f (x) sin(nx)dx
π −π
0 π
1
= − sin(nx)dx + 2 sin(nx)dx
π −π 0
1 cos nx 0 cos nx π
= −2
π n −π n 0
1 1 cos nπ cos nπ 2
= ( − −2 + )
π n n n n
3
= (1 − cos nπ).
nπ
Note that when n is even then cos nπ = 1 and when n is odd cos nπ = −1. Hence
6
bn = nπ if n is odd and bn = 0 if n is even. This means that can replace n by 2n − 1
in the sum and obtain
(
∞
2, 0<x<π 1 6
f (x) = = +∑ sin((2n − 1)x).
−1, −π < x < 0 2 n=1 (2n − 1)π
According to Theorem A.1, the infinite sum given by the above expression converges
to the function (
1
at x = 0, ±π
g(x) = 2
f (x) otherwise
□
404 Fourier Series
This function is 1 for −2 < x < 2, 6 < x < 10, etc. It is a regular pulse which is on
for 4 units of time, and then off for four units of time. Since the period is not 2π, but
instead 2L = 8, we have L = 4. The Fourier coefficients are
4 2
1 1
a0 = f (x)dx = 1dx = 1,
4 −4 4 −2
1 4 nπx 1 2 nπx
an = f (x) cos dx = cos dx
L −4 4 4 −2 4
1 4 nπx 2 1 nπx 2
= sin −2
= sin
4 nπ 4 nπ 4 −2
1 2nπ −2nπ 1 nπ
= sin − sin = (2 sin ),
nπ 4 4 nπ 2
4
1 nπx 1 2 nπx
bn = f (x) sin
dx = sin dx
L−4 4 4 −2 4
1 4 nπx 2 1 nπx 2
= − cos −2
= − cos
4 nπ 4 nπ 4 −2
1 nπ −nπ 1
= − cos − cos = (0) = 0.
nπ 2 2 nπ
If n is even, then an = 0 as sine is 0 at integer values. Thus, an contribute nonzero
values for odd n, and so we may replace n by 2n − 1. With this in mind, the Fourier
series can be written as
0 −4 < x < −2 1 2 ∞ sin( (2n−1)π )
(2n − 1)πx
2
f (x) = 1 −2 < x < 2 = + ∑ cos( ).
2 π n=1 2n − 1 4
0 2<x<4
are symmetric about the y-axis and odd functions are symmetric about the origin.
For example, f (x) = cos(x) is an even function since cos(−x) = cos(x) and f (x) =
sin(x) is an odd function since sin(−x) = − sin(x). Using the concept of odd and even
functions, one can easily show, using (A.9) and (A.10) that the Fourier coefficients
of an even function are simply
L
2 nπx
an = f (x) cos dx, n = 0, 1, . . .
L 0 L
and
bn = 0, n = 1, 2, . . .
and the corresponding Fourier series is called a Fourier cosine series. Similarly, for
an odd function the coefficients are
an = 0, n = 0, 1, . . .
and
2 L nπx
bn = f (x) sin dx, n = 1, 2, . . .
L 0 L
and the corresponding Fourier series is called a Fourier sine series. This comes be-
cause the product of two even functions is even, the product of two odd functions is
even, and the product of an even and an odd function is odd. In addition, integration
from −L to L of an odd function is zero, while integration from −L to L of an even
function is twice the integral of 0 to L.
Consider the sawtooth wave, which is given by the function f (x) = x + π for −π <
x < π, and f (x + 2π) = f (x). It can be written as the sum of an even function f1 (x) =
π and an odd function f2 (x) = x. The corresponding Fourier cosine and sine series
are f1 = π and f2 = 2 sin x − 12 sin 2x + 13 sin 3x − 14 sin 4x + · · · . Addition of series
fe (x)
−2L −L 0 L 2L
FIGURE A.1
Periodic even extension.
where L
2 nπx
bn = f (x) sin( )dx, n = 1, 2, . . . . (A.12)
L 0 L
Similarly, the periodic even extension of a function f that is piecewise continuous on
the interval (0, L), is the Fourier cosine series of f given by
∞
a0 nπx
f (x) = + ∑ an cos( ), 0 < x < L, (A.13)
2 n=1 L
where L
2
a0 = f (x)dx,
L 0
and L
2 nπx
an = f (x) cos( )dx, n = 1, 2, . . . .
L 0 L
In Fig. A.1, we display the cosine Fourier series of
x
f (x) = + 1, 0 < x < L.
L
Every term in the Fourier cosine series is 2L-periodic. Note that the periodic even
extension does not introduce new jumps. Similarly, if we consider
x
f (x) = + 1, 0 < x < L,
L
Every term in the Fourier sine series is 2L-periodic. Note that the periodic odd exten-
sion does not introduce new jumps if and only if f (0) = f (L) = 0. The two figures
below illustrate both cases.
We provide the following examples.
Even and Odd Extensions 407
fo (x)
2L −L 0 L 2L
FIGURE A.2
Discontinuous periodic odd extension.
fo (x)
2L −L 0 L 2L
FIGURE A.3
Continuous periodic odd extension when we require f (0) = f (L) = 0.
Example A.4 Let us find the Fourier cosine series of f (x) = x, 0 < x < π. It is
easy to see that
2 π
a0 = xdx = π.
π 0
Using integration by parts, we find that
2 π
2 (−1)n − 1
an = x cos(nx)dx = , n = 1, 2, . . . .
π 0 π n2
By noticing that an = 0 for n is even and an = −2 for n is odd, we may replace
(−1)n − 1 with −2 and use 2n − 1 for n in the summation. Thus,
π 4 ∞ cos(2n − 1)x
x= − ∑ , 0 < x < π,
2 π n=1 (2n − 1)2
408 Fourier Series
We are interested in finding a non trivial solution u(x,t) of (A.14) that satisfies the
boundary and the initial conditions. We seek separated solutions of functions of
the
u(x, y) = X(x)T (t), (A.15)
where X is a function of x alone and T is a function of t alone. Note, too, that X
and T must be nontrivial. That is X ̸= 0, and T = ̸ 0. By differentiating (A.15) with
respect to t and x and substituting into (A.14) we obtain the relation
X(x)T ′ (t) = kX ′′ (x)T (t).
Since X(x) ≠ 0, and T (t) ̸= 0, we may divide by the term X(x)T (t) to separate the
variables. That is,
X ′′ (x) T ′ (t)
= .
X(x) kT (t)
Applications of Fourier Series 409
Since the left-hand side is a function of x alone, and the right-hand side is a function
of t alone, the two sides must have a common constant value −λ . That is,
X ′′ (x) T ′ (t)
= = −λ .
X(x) kT (t)
Now we check the Neumann boundary conditions. 0 = ux (0,t) = X ′ (0)T (t), implies
that X ′ (0) = 0. Similarly, 0 = ux (c,t) = X ′ (c)T (t), implies that X ′ (c) = 0. Thus, we
arrive at the Sturm-Liouville problem
One can easily argue as in Section 4.14, and determine that (A.16) has the trivial
solution for λ < 0. For λ = 0, we have from (A.16) that X ′′ (x) = 0, which has
the solution X(x) = Ax + B. Applying the boundary conditions we get B = 0 and
A is arbitrary, and so we set it equal to one. Thus, for λ0 = 0, the corresponding
eigenfunction is X0 (x) = 1. Now for λ > 0, we assume λ = α 2 for positive α. Then
the general solution of (A.16) is
and hence
X ′ (x) = −Aα sin(αx) + Bα cos(αx).
Applying X ′ (0) = 0, we automatically get B = 0. Applying X ′ (c) = 0, with B = 0
already, we arrive at
−Aα sin(αx) = 0.
To obtain a nontrivial solution we set sin(αc) = 0. This gives αc = nπ, n = 1, 2, . . . .
and obtain α = nπ nπ 2
c . Thus, for λn = ( c ) , the corresponding eigenfunctions are given
by
nπx
Xn (x) = cos( ), n = 1, 2, . . . ,
c
where we set A = 1. Turning to (A.17), we need to solve it based on the already
determined eigenvalues λ0 and λn , n = 1, 2, . . . . For λ0 = 0, equation (A.17) has the
solution constant multiple of T0 (t) = 1. Similarly, for λn we have the corresponding
eigenfunctions
2 2
− n π2 k t
Tn (t) = e c , n = 1, 2, . . . .
Thus, we may write the solution as
2 2
− n π2 k t nπx
u(x,t) = u0 (x,t) + un (x,t) = 1 + e c cos( ).
c
410 Fourier Series
Note that u satisfies both of Neumann conditions. Now by the superposition principle
the general solution of (A.14) maybe written as
∞ 2 2
a0 −n π kt nπx
u(x,t) = + ∑ an e c2 cos( ). (A.18)
2 n=1 c
By applying the initial condition u(x, 0) = f (x) to (A.18) we obtain the Fourier cosine
series
∞
a0 nπx
f (x) = + ∑ an cos( ),
2 n=1 c
where c
2
a0 = f (x)dx,
c 0
and c
2 nπx
an = f (x) cos( )dx, n = 1, 2, . . . .
c 0 c
∂ 2 u(x, y) ∂ 2 u(x, y)
∇2 u = +
∂ x2 ∂ y2
while in three dimensions we write
∂ 2 u(x, y, z) ∂ 2 u(x, y, z) ∂ 2 u(x, y, z)
∇2 u = + + .
∂ x2 ∂ y2 ∂ z2
Either equation may be written
∇2 u = 0
and we will have to learn how to express the Laplacian
∂2 ∂2 ∂2
∇2 = 2
+ 2+ 2
∂x ∂y ∂z
in different ways such as in polar, cylindrical or spherical coordinates.
Laplacian in Polar, Cylindrical and Spherical Coordinates 411
x = ρ cos φ
y = ρ sin φ
z=z
and it can also be shown that Laplace’s equation in cylindrical coordinates takes the
form
∂ 2 u(ρ, φ , z) 1 ∂ u(ρ, φ , z) 1 ∂ 2 u(ρ, φ , z) ∂ 2 u(ρ, φ , z)
∇2 u(ρ, φ , z) = 2
+ + 2 + .
∂ρ ρ ∂ρ ρ ∂φ2 ∂ z2
(A.20)
Note that expression (A.19) is a special case of (A.20) by simply holding z constant.
Finally, in spherical coordinates
x = ρ sin θ cos φ
y = ρ sin θ sin φ
z = ρ cos θ
we have
1 ∂ 2 ∂ u(rθ , φ )
∇2 u(r, θ , z) = [ (r )
r2 ∂ r ∂r
1 ∂ ∂ u(r, θ , φ ) 1 ∂ 2 u(r, θ , φ )
+ (sin θ )+ 2 ]. (A.21)
sin θ ∂ θ ∂θ sin θ ∂φ2
Next, we give a brief derivation of (A.20). We already know that
p y
ρ = x2 + y2 , and φ = arctan .
x
Using the chain rule we have
∂u ∂u ∂ρ ∂u ∂φ ∂u ∂z
= + + .
∂x ∂ρ ∂x ∂φ ∂x ∂z ∂x
However,
∂ρ x x ρ cos φ
=p = = = cos φ .
∂x x 2 + y2 ρ ρ
412 Fourier Series
∂z ∂ 2u ∂u
since = 0. To obtain 2
, we replace the function u in (A.22) by . That
∂x ∂x ∂x
is,
∂ 2u ∂ ∂ u sin φ ∂ ∂ u
= cos φ −
∂ x2 ∂ρ ∂x ρ ∂φ ∂x
∂ ∂ u sin φ ∂ u sin φ ∂ ∂ u sin φ ∂ u
= cos φ cos φ − − cos φ −
∂ρ ∂ρ ρ ∂φ ρ ∂φ ∂ρ ρ ∂φ
2
∂ u sin φ ∂ u sin φ ∂ u 2
= cos φ cos φ 2 + 2 −
∂ρ ρ ∂φ ρ ∂ ρ∂ φ
sin φ ∂u 2
∂ u cos φ ∂ u sin φ ∂ 2 u
− − sin φ + cos φ − − .
ρ ∂ρ ∂φ∂ρ ρ ∂φ ρ ∂φ2
Using the fact that
∂ 2u ∂ 2u
= ,
∂ ρ∂ φ ∂φ∂ρ
the above expression simplifies to
Θ′′ − λ Θ = 0,
Θ(0) = 0
and
1 ′ λ
R′′ (ρ) + R (ρ) + R(ρ) = 0, ρ > 0,
ρ ρ
R(1) = R(2) = 0.
2 u=0
∇2 u = 0
u=0
x
u = u0 u=0
−2 −1 0 1 2
FIGURE A.4
Laplacian on an annulus.
Θ − λ Θ = 0, Θ(0) = 0,
Θn (φ ) = c1 eαn φ + c2 e−αn φ .
Recall that the set of functions given by ζn are normalized with respect to the weight
function p = ρ1 and therefore,
2
1, m = n
ζn (ρ)ζm (ρ)dρ =
1 0, m = n.
416 Fourier Series
Hence, by multiplying both sided of (A.28) with ζm (ρ) and then integrate with re-
spect to ρ from 1 to 2 we arrive at
2
u0 ζn (ρ)dρ = dn sinh(αn π),
1
or,
2
s
1 2
u0 sin(αn ln(ρ))dρ = dn sinh(αn π).
1 ρ ln(2)
From which we obtain after integrating,
p
1 u
0 2 ln(2) 1 − (−1)n
dn = .
sinh(αn π) π n
where
nπ
αn = , n = 1, 2, . . . .
ln(2)
Bibliography
Chapter 1
1. Bellman, R., Stability Theory of Differential Equations, McGraw-Hill Book Com-
pany, New York, London, 1953.
2. Berezansky, L., and Braverman, E., Exponential stability of difference equations
with several delays: Recursive approach, Adv. Difference. Equ. Vol. 2009, Article
ID 104310, 13.
3. Driver, R. D., Introduction to Ordinary Differential Equations, Harper & Row,
Publishers, New York, 1978.
4. Hartman, P., Ordinary Differential Equations, John Wiley & Sons, Inc., New York,
1964.
5. Kelley, W., and Peterson, A., The Theory of Differential Equations, Classical and
Qualitative, Pearson Prentice Hall, 2004.
6. Miller, R. K., Nonlinear Volterra Integral Equations, Benjamin, New York,
1971.
7. Miller, R. K., Introduction to Differential Equations, Prentice Hall 1987.
8. Raffoul, Y. N., Class Notes on Ordinary Differential Equations, University of Day-
ton, 2022.
9. Raffoul, Y. N., Advanced Differential Equations, Elsevier/Academic Press, N Y,
2022.
Chapter 2
1. Brown, J. W., Fourier Series and Boundary Value Problems, 8th edition, Mc-
Grawhill, 2012.
2. Jeffrey, A., Applied Partial Differential Equations: An Introduction, Academic
Press, 2003.
3. Logan, D. J., Applied Partial Differential Equations, Springer, 1998.
4. Myint-U, T. and L. Debnath, Linear Partial Differential Equations for Scientists
and Engineers, 4th edition, Birkhauser, 2006.
5. Olver, P. J., Introduction to Partial Differential Equations, Springer, 2014.
417
418 Bibliography
2. Arfken, G., Mathematical Methods for Physicists, 2nd edition, Academic Press,
1970.
3. Arnold, V. I., Mathematical Methods of Classical Mechanics, Springer-Verlag,
1978.
4. Bliss, G. A., Lectures on the Calculus of Variations, University of Chicago Press,
1946.
5. Bolza, O., Lectures on the Calculus of Variations, G.E. Stechert and Co.,
1931.
6. Brechtken-Manderscheid, U., Introduction to the Calculus of Variations, Chapman
& Hall, 1991.
7. Carathéodory, C., Calculus of Variations and Partial Differential Equations of the
First Order, Chelsea, 1982.
8. Ewing, G. M., Calculus of Variations with Applications, Dover, 1985.
9. Forsyth, A. R., Calculus of Variations, Cambridge University Press, 1927.
10. Fox, C., An Introduction to the Calculus of Variations, Dover, 1987.
11. Fulks, W., Advanced Calculus, 3rd edition, John Wiley, 1978.
12. Gelfand, I. M. and Fomin, S. V., Calculus of Variations, Prentice-Hall, 1963.
13. Giaquinta, M. and Hildebrandt, S., Calculus of Variations I: The Lagrangian For-
malism, Springer-Verlag, 1996.
14. Giaquinta, M. and Hildebrandt, S., Calculus of Variations II: The Hamiltonian
Formalism, Springer-Verlag, 1996.
15. Hildebrand, F. B., Methods of Applied Mathematics, Prentice-Hall, 1965.
16. Morse, M., The Calculus of Variations in the Large, American Math. Soc. Collo-
quium Pub., Vol. 18, 1932.
17. Pars, L. A., A Treatise on Analytical Dynamics, Heinemann, 1965.
18. Postnikov, M. M., The Variational Theory of Geodesics, Dover, 1983.
19. Raffoul, Y. N., Advanced Differential Equations, Elsevier/Academic Press, N Y,
2022.
20. Raffoul, Y. N., Class Notes on Calculus of Variations, University of Dayton,
2022.
21. Sagan, H., Introduction to the Calculus of Variations, Dover, 1992.
22. Wan, F. W., Introduction to the Calculus of Variations and its Applications, Chap-
man & Hall, 1995.
420 Bibliography
Chapter 5
1. Constanda., C., Integral methods in science and engineering, CRC, Press,
2000.
2. Hackbusch, W., Integral Equations: Theory and Numerical Treatment, Birkhäuser,
1995.
3. Hochstadt, H., Integral Equations, Wiley, 1973.
4. Colton, D., and Kress, R., Integral equation methods in scattering theory: Classics
In Applied mathematics, SIAM, 2013.
5. Lovitt, W. V., Linear Integral Equations, Dover Publications Inc.: New York,
1950.
6. Mikhlin, S. G., Linear Integral Equations, Dover Publications, 2020.
7. Porter, D, Stirling, D. G., and et al. Integral Equations: A Practical Treatment,
from Spectral Theory to Applications, Cambridge University Press, 1991.
8. Raffoul, Y. N., Class Notes on Integral Equations, University of Dayton,
2022.
9. Rahman, M., Mathematical Methods with Applications, WIT Press: Southampton,
2000.
10. Sharma, D. C., and Goyal, M. C., Integral equations, PHI Learning, Delhi,
2017.
11. Tricomi, F. G., Integral Equations, Dover, 1985.
12. Wazwaz, A. M., A First Course in Integral Equations, World Scientific: Singa-
pore, 2015.
13. Yosida, K., Lectures on differential and integral equations, Dover Publications,
1991.
14. Zabreyko, P. P., Integral equations: A Reference Text, Springer 1976.
Index
421
422 Index